CPT Training on HyperPod
This notebook demonstrates Continued Pre-Training (CPT) on HyperPod using CPTTrainer.
CPT operates on a raw corpus rather than instruction pairs, extending the model’s knowledge in a specific domain.
Note: CPTTrainer is supported for Nova models only.
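To make the difference between a raw corpus and instruction pairs concrete, the sketch below writes a tiny corpus as JSON Lines. The one-object-per-line layout with a single `text` field is an assumption for illustration, as are the file name and the helper `write_corpus`; check the Nova data-preparation documentation for the exact schema CPTTrainer expects.

```python
import json
from pathlib import Path


def write_corpus(documents, path):
    """Write one JSON object per line, each holding raw text only."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for doc in documents:
            # Assumed schema: a single "text" field, with no prompt/response pair.
            f.write(json.dumps({"text": doc}, ensure_ascii=False) + "\n")
    return path


# Hypothetical domain documents; a real corpus would be far larger.
docs = [
    "The mitral valve separates the left atrium from the left ventricle.",
    "Continued pre-training exposes a model to unlabeled domain text.",
]
corpus_path = write_corpus(docs, "cpt-corpus.jsonl")
```

Each line carries unlabeled text only, which is what distinguishes this input from the prompt and response records used for supervised fine-tuning.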
What you will learn
Create a CPTTrainer with HyperPod compute
Submit a CPT training job
1. Setup
# === Fill in your AWS resources ===
S3_BUCKET = "<your-s3-bucket>" # e.g. "sagemaker-us-east-1-123456789012"
TRAINING_DATASET = f"s3://{S3_BUCKET}/cpt-data/cpt-corpus.jsonl"
S3_OUTPUT_PATH = f"s3://{S3_BUCKET}/cpt-hyperpod/output/"
CLUSTER_NAME = "<your-cluster-name>" # e.g. "my-cluster"
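Because the values above start out as placeholders, it can help to check them before submitting a job. The following is a minimal sketch, not part of the SageMaker SDK: `check_s3_uri` is a hypothetical helper that uses only the standard library to confirm a string is a well-formed `s3://` URI with no unfilled `<...>` placeholder. It does not verify that the object actually exists in S3.

```python
from urllib.parse import urlparse


def check_s3_uri(uri):
    """Return (bucket, key) for an s3:// URI, or raise ValueError."""
    parsed = urlparse(uri)
    bucket, key = parsed.netloc, parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not bucket:
        raise ValueError(f"Not a valid S3 URI: {uri!r}")
    if "<" in uri or ">" in uri:
        # Angle brackets mean a template value was never replaced.
        raise ValueError(f"Placeholder not filled in: {uri!r}")
    return bucket, key


# Example value standing in for TRAINING_DATASET from the setup cell.
example_dataset = "s3://sagemaker-us-east-1-123456789012/cpt-data/cpt-corpus.jsonl"
bucket, key = check_s3_uri(example_dataset)
print(bucket, key)
```

In the notebook, the same check could be applied to `TRAINING_DATASET` and `S3_OUTPUT_PATH` before constructing the trainer.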
2. Create CPTTrainer with HyperPod Compute
Use CPTTrainer with HyperPodCompute for distributed pre-training on a managed cluster.
from sagemaker.train import CPTTrainer
from sagemaker.core.training.configs import HyperPodCompute

compute = HyperPodCompute(
    cluster_name=CLUSTER_NAME,
    instance_type="ml.p5.48xlarge",
    node_count=2,
)

cpt_trainer = CPTTrainer(
    model="nova-textgeneration-micro",
    compute=compute,
    training_dataset=TRAINING_DATASET,
    s3_output_path=S3_OUTPUT_PATH,
)
3. Submit Training Job
job_name = cpt_trainer.train(wait=False)
print(f"HyperPod CPT job submitted: {job_name}")