evals.spec_adapter
Convert agentspec eval configuration into pydantic-evals Datasets.
EvalCaseSpec Objects
@dataclass
class EvalCaseSpec()
A single evaluation case derived from the agentspec.
EvalSuiteSpec Objects
@dataclass
class EvalSuiteSpec()
A suite of eval cases for one named evaluation.
parse_eval_spec
def parse_eval_spec(eval_spec: list[dict[str, Any]]) -> list[EvalSuiteSpec]
Parse the evals list from the agentspec YAML into strongly typed suite specs.
Parameters
eval_spec : list[dict]
The evals section from the agentspec, e.g.::
[
{"name": "KPI Accuracy", "category": "coding", "task_count": 400},
{"name": "Variance Quality", "category": "reasoning", "task_count": 200},
]
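The mapping from raw dicts to typed specs can be sketched as below. This is a minimal illustration, not the module's implementation: `SuiteSketch` and `parse_sketch` are hypothetical stand-ins, and the field names are assumed from the keys in the example above, since the real `EvalSuiteSpec` fields are not documented here.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SuiteSketch:
    """Hypothetical stand-in for EvalSuiteSpec (real fields are not documented)."""

    name: str
    category: str
    task_count: int


def parse_sketch(eval_spec: list[dict[str, Any]]) -> list[SuiteSketch]:
    """Turn each raw dict into a typed record, coercing values defensively."""
    suites = []
    for entry in eval_spec:
        suites.append(
            SuiteSketch(
                name=str(entry["name"]),  # assumed to be required
                category=str(entry.get("category", "")),  # assumed optional
                task_count=int(entry.get("task_count", 0)),  # assumed optional
            )
        )
    return suites


suites = parse_sketch(
    [
        {"name": "KPI Accuracy", "category": "coding", "task_count": 400},
        {"name": "Variance Quality", "category": "reasoning", "task_count": 200},
    ]
)
```

Whether the real parser applies defaults or rejects entries with missing keys is not stated in this reference; the defaults here are an assumption for illustration only.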
build_dataset_from_spec
async def build_dataset_from_spec(
eval_spec: list[dict[str, Any]],
agent_system_prompt: str | None = None,
tool_schemas: list[dict[str, Any]] | None = None) -> Any
Convert agentspec eval config into a pydantic-evals Dataset.
If pydantic_evals can be imported, returns a Dataset instance.
Otherwise, returns a list of EvalSuiteSpec for manual processing.
Parameters
eval_spec : list[dict]
The evals config from the agentspec.
agent_system_prompt : str | None
System prompt of the agent (used for synthetic case generation).
tool_schemas : list[dict] | None
JSON schemas of the agent's tools (for grounding case generation).
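Because the return type is `Any`, callers may want to branch on what came back. The sketch below shows one way to do that. `build_dataset_stub` is a hypothetical stand-in that always takes the fallback path, so the example runs without pydantic_evals installed; in real use it would be replaced by `build_dataset_from_spec` from `evals.spec_adapter`.

```python
import asyncio
from typing import Any


async def build_dataset_stub(eval_spec: list[dict[str, Any]]) -> Any:
    """Hypothetical stand-in for build_dataset_from_spec (fallback path only)."""
    return [dict(entry) for entry in eval_spec]


def describe_result(result: Any) -> str:
    """Report which of the two documented return shapes was received."""
    # A list signals the fallback path; anything else is treated as a Dataset.
    if isinstance(result, list):
        return f"fallback: {len(result)} suite specs"
    return f"dataset: {type(result).__name__}"


spec = [{"name": "KPI Accuracy", "category": "coding", "task_count": 400}]

# The function is a coroutine, so it must be awaited or run via asyncio.run.
result = asyncio.run(build_dataset_stub(spec))
summary = describe_result(result)
print(summary)
```

Checking for a list first avoids importing pydantic_evals at the call site, which keeps the caller usable in environments where the package is missing.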