sagemaker.train.agent_rft_job
AgentRFTJob — wrapper around sagemaker-core Job for AgentRFT job category.
Classes
| AgentRFTJob(job) | Wrapper around sagemaker-core Job for AgentRFT job category. |
- class sagemaker.train.agent_rft_job.AgentRFTJob(job: Job)
Bases: object
Wrapper around sagemaker-core Job for AgentRFT job category.
Delegates lifecycle methods to the underlying Job and adds typed convenience properties by parsing the JobConfigDocument JSON string.
- Parameters:
job – The sagemaker-core Job instance to wrap.
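The delegate-and-parse pattern described above can be sketched roughly as follows. This is an illustration only, not the library's implementation: `StubJob`, its `job_config_document` attribute, `ConfigWrapper`, and the sample JSON (including the `MaxTurns` key) are hypothetical stand-ins. Only the section names `AgentConfig` and `TrainingConfig` come from this page.

```python
import json
from typing import Optional


class StubJob:
    """Hypothetical stand-in for a sagemaker-core Job (not the real class)."""

    def __init__(self, job_name: str, job_config_document: Optional[str]):
        self.job_name = job_name
        # Assumed attribute name: the configuration arrives as a JSON string.
        self.job_config_document = job_config_document


class ConfigWrapper:
    """Illustrative wrapper: delegates to the job and parses its config."""

    def __init__(self, job: StubJob):
        self._job = job

    @property
    def job_name(self) -> str:
        # Plain delegation to the wrapped object.
        return self._job.job_name

    def _section(self, key: str) -> Optional[dict]:
        # Parse the JSON string on access; return None when it is absent.
        raw = self._job.job_config_document
        if not raw:
            return None
        return json.loads(raw).get(key)

    @property
    def agent_config(self) -> Optional[dict]:
        return self._section("AgentConfig")

    @property
    def training_config(self) -> Optional[dict]:
        return self._section("TrainingConfig")


# "MaxTurns" is a made-up key used only to give the sample some content.
doc = json.dumps({"AgentConfig": {"MaxTurns": 4}, "TrainingConfig": {}})
wrapped = ConfigWrapper(StubJob("demo-job", doc))
print(wrapped.job_name, wrapped.agent_config)
```

Parsing on access keeps the wrapper stateless, at the cost of re-parsing the document on every property read. Whether the real class caches the parsed result is not stated here.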
- JOB_CATEGORY = 'AgentRFT'
- property agent_config: dict | None
Full AgentConfig section from JobConfigDocument.
- property billable_token_usage: dict | None
Billable token usage from ServiceOutput.
Returns dict with keys: TrainTokenCount, PrefillTokenCount, SampleTokenCount.
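As a small sketch of consuming this dict, the helper below sums the three documented counts and tolerates a `None` result. The helper name `total_billable_tokens` and the sample numbers are hypothetical. It also assumes the values are numeric and that a combined total is meaningful, which this page does not state.

```python
from typing import Optional

# Key names as documented for billable_token_usage.
TOKEN_KEYS = ("TrainTokenCount", "PrefillTokenCount", "SampleTokenCount")


def total_billable_tokens(usage: Optional[dict]) -> int:
    """Sum the documented token counts; treat missing data as zero."""
    if not usage:
        return 0
    return sum(int(usage.get(key) or 0) for key in TOKEN_KEYS)


# Made-up sample values, shaped like the documented return dict.
sample = {"TrainTokenCount": 10, "PrefillTokenCount": 5, "SampleTokenCount": 2}
print(total_billable_tokens(sample))
```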
- property creation_time
- property end_time
- property failure_reason: str | None
- classmethod from_job(job: Job) -> AgentRFTJob
Create an AgentRFTJob from a sagemaker-core Job instance.
- classmethod get(job_name: str, session=None) -> AgentRFTJob
Attach to an existing AgentRFT job by name.
- Parameters:
job_name – The name of the job.
session – Optional boto3 session.
- Returns:
AgentRFTJob wrapping the existing job.
- classmethod get_all(session=None, **kwargs)
List all AgentRFT jobs.
Delegates to Job.get_all with job_category pre-filled. Additional keyword arguments (e.g. creation_time_after, name_contains, sort_by, sort_order, status_equals) are forwarded.
- Parameters:
session – Optional boto3 session.
**kwargs – Additional filter arguments forwarded to Job.get_all.
- Yields:
AgentRFTJob instances.
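Because `get_all` yields instances rather than returning a list, a caller can stop early without consuming the whole listing. The sketch below shows one way to take only the first few results. `first_jobs` and `StubJobClass` are hypothetical: the stub stands in for `AgentRFTJob` so the example runs without AWS access, and it ignores the filters that the real method would forward to `Job.get_all`.

```python
from itertools import islice


def first_jobs(job_cls, limit, **filters):
    """Return at most `limit` items from job_cls.get_all(**filters)."""
    # islice stops pulling from the generator once `limit` items are read.
    return list(islice(job_cls.get_all(**filters), limit))


class StubJobClass:
    """Hypothetical stand-in exposing the same get_all call shape."""

    @classmethod
    def get_all(cls, session=None, **kwargs):
        # The stub ignores filters and yields placeholder names.
        for index in range(100):
            yield f"job-{index}"


print(first_jobs(StubJobClass, 3, name_contains="demo"))
```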
- get_mlflow_url() -> str | None
Generate a fresh presigned MLflow URL for this job’s experiment/run.
In Jupyter notebooks, also renders a clickable link.
- Returns:
Presigned URL string, or None if MLflow is not configured.
- get_training_metrics() -> list[dict]
Fetch per-step MTRL training metrics from MLflow.
Retrieves rollout/reward/mean, rollout/turns/mean, training/total_tokens, and training/num_trajectories for each training step and prints a summary table.
- Returns:
List of dicts, one per step, with keys step, rollout/reward/mean, rollout/turns/mean, training/total_tokens, and training/num_trajectories.
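Given the list-of-dicts shape described above, a caller might pick out the step with the highest mean reward as sketched below. The helper name `best_reward_step` and the sample rows are made up for illustration. The sketch also assumes a metric may be missing or `None` for some steps, which this page does not confirm.

```python
from typing import Optional


def best_reward_step(metrics: list) -> Optional[dict]:
    """Return the row with the highest rollout/reward/mean, if any."""
    # Skip rows where the reward metric is absent (an assumed possibility).
    scored = [row for row in metrics if row.get("rollout/reward/mean") is not None]
    if not scored:
        return None
    return max(scored, key=lambda row: row["rollout/reward/mean"])


# Made-up rows using the documented keys.
rows = [
    {"step": 1, "rollout/reward/mean": 0.2, "rollout/turns/mean": 3.0,
     "training/total_tokens": 1000, "training/num_trajectories": 8},
    {"step": 2, "rollout/reward/mean": 0.6, "rollout/turns/mean": 2.5,
     "training/total_tokens": 1100, "training/num_trajectories": 8},
    {"step": 3, "rollout/reward/mean": 0.5, "rollout/turns/mean": 2.8,
     "training/total_tokens": 1050, "training/num_trajectories": 8},
]
best = best_reward_step(rows)
print(best["step"])
```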
- property job_arn: str
- property job_name: str
- property job_status: str
- property last_modified_time
- property mlflow_details: dict | None
MLflow experiment/run details from ServiceOutput.
Returns dict with keys: ExperimentName, RunName, ExperimentId, RunId.
- property output_model_package_arn: str | None
ARN of the output model package from ServiceOutput, or None.
- property progress_info: dict | None
Training progress from ServiceOutput.
Supports two formats:
- Epoch-based: dict with MaxEpoch, StepsPerEpoch, CurrentEpoch, CurrentStep.
- Step-only: dict with MaxSteps, CurrentStep.
Returns None if not available.
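Since the property can return either of two dict shapes or `None`, calling code generally needs to branch on which keys are present. The following is a minimal sketch with a hypothetical helper name, `describe_progress`, and made-up sample values. It only formats the fields and does not compute a completion percentage, because this page does not say whether `CurrentStep` counts within an epoch or across the whole run.

```python
from typing import Optional


def describe_progress(info: Optional[dict]) -> str:
    """Format a progress_info-style dict as a short status string."""
    if not info:
        return "progress unavailable"
    if "MaxEpoch" in info:
        # Epoch-based format.
        return (
            f"epoch {info['CurrentEpoch']}/{info['MaxEpoch']}, "
            f"step {info['CurrentStep']} "
            f"({info['StepsPerEpoch']} steps per epoch)"
        )
    if "MaxSteps" in info:
        # Step-only format.
        return f"step {info['CurrentStep']}/{info['MaxSteps']}"
    return "unrecognized progress format"


epoch_based = {"MaxEpoch": 3, "StepsPerEpoch": 100, "CurrentEpoch": 1, "CurrentStep": 40}
step_only = {"MaxSteps": 200, "CurrentStep": 50}
print(describe_progress(epoch_based))
print(describe_progress(step_only))
```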
- property s3_output_path: str | None
S3 output path from OutputDataConfig.
- property secondary_status: str
- property secondary_status_transitions: list
- property training_config: dict | None
Full TrainingConfig section from JobConfigDocument.