sagemaker.train.evaluate.llmaj_inference_benchmark

This module generates the InspectAI benchmark Python file and its supporting configuration, which run inside the InspectAI container to produce inference responses for LLM-as-Judge (LLMAJ) evaluation. It also converts datasets from LLMAJ format to InspectAI format.

Functions

`convert_dataset_to_inspectai_format`(...)	Convert LLMAJ dataset format to InspectAI format.
`generate_benchmark_files`()	Generate the benchmark directory file contents.

sagemaker.train.evaluate.llmaj_inference_benchmark.convert_dataset_to_inspectai_format(dataset_content: str) → str

Convert LLMAJ dataset format to InspectAI format.

Transform each JSONL line from {"prompt": "..."} or {"query": "..."} to {"input": "...", "target": ""} as expected by InspectAI’s json_dataset loader.

Parameters: dataset_content (str) – Raw JSONL content from the customer’s dataset. Each non-empty line must be a JSON object containing either a "prompt" or "query" field.
Returns: Converted JSONL string in InspectAI format with one {"input": ..., "target": ""} object per line.
Return type: str
Raises: ValueError – If a line contains neither "prompt" nor "query" field.
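
The transformation described above can be illustrated with a minimal sketch. This is not the library's source code: the function name `convert_jsonl_sketch` is hypothetical, and the choice to prefer "prompt" when a line has both fields is an assumption, since the documentation does not say which field wins.

```python
import json


def convert_jsonl_sketch(dataset_content: str) -> str:
    """Illustrative sketch of the documented conversion (not the library source)."""
    converted = []
    for line_number, line in enumerate(dataset_content.splitlines(), start=1):
        if not line.strip():
            continue  # only non-empty lines are expected to hold JSON objects
        record = json.loads(line)
        # Assumption: "prompt" takes precedence if both fields are present.
        if "prompt" in record:
            text = record["prompt"]
        elif "query" in record:
            text = record["query"]
        else:
            raise ValueError(
                f"Line {line_number} contains neither 'prompt' nor 'query'"
            )
        # InspectAI-style record: the text becomes "input", "target" stays empty.
        converted.append(json.dumps({"input": text, "target": ""}))
    return "\n".join(converted)
```

With the real module installed, the equivalent call would presumably be `convert_dataset_to_inspectai_format(dataset_content)` with the same JSONL string as input.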

sagemaker.train.evaluate.llmaj_inference_benchmark.generate_benchmark_files() → dict[str, str]

Generate the benchmark directory file contents.

Returns: Dict mapping filename to file content string.
Return type: dict[str, str]
