evals.spec_adapter

Convert agentspec eval configuration into pydantic-evals Datasets.

EvalCaseSpec Objects

@dataclass
class EvalCaseSpec()

A single evaluation case derived from the agentspec.

EvalSuiteSpec Objects

@dataclass
class EvalSuiteSpec()

A suite of eval cases for one named evaluation.

parse_eval_spec

def parse_eval_spec(eval_spec: list[dict[str, Any]]) -> list[EvalSuiteSpec]

Parse the agentspec evals YAML list into strongly-typed suite specs.

Parameters

eval_spec : list[dict]
    The evals section from the agentspec, e.g.::

        [
            {"name": "KPI Accuracy", "category": "coding", "task_count": 400},
            {"name": "Variance Quality", "category": "reasoning", "task_count": 200},
        ]
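The parsing step can be illustrated with a minimal sketch. The fields of EvalSuiteSpec are not documented here, so `SuiteSketch` and `parse_sketch` below are hypothetical stand-ins that assume one field per key in the example above. They are not the module's actual implementation.

```python
from dataclasses import dataclass
from typing import Any

# Keys taken from the example entries above; assumed to be required.
REQUIRED_KEYS = {"name", "category", "task_count"}


@dataclass
class SuiteSketch:
    """Hypothetical stand-in for EvalSuiteSpec (real fields are not documented)."""
    name: str
    category: str
    task_count: int


def parse_sketch(eval_spec: list[dict[str, Any]]) -> list[SuiteSketch]:
    """Turn raw dicts into typed records, failing early on missing keys."""
    suites = []
    for index, entry in enumerate(eval_spec):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"evals[{index}] is missing keys: {sorted(missing)}")
        suites.append(
            SuiteSketch(
                name=str(entry["name"]),
                category=str(entry["category"]),
                task_count=int(entry["task_count"]),
            )
        )
    return suites


suites = parse_sketch([
    {"name": "KPI Accuracy", "category": "coding", "task_count": 400},
    {"name": "Variance Quality", "category": "reasoning", "task_count": 200},
])
```

Validating keys up front keeps a malformed agentspec entry from surfacing later as an opaque attribute error.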

build_dataset_from_spec

async def build_dataset_from_spec(
        eval_spec: list[dict[str, Any]],
        agent_system_prompt: str | None = None,
        tool_schemas: list[dict[str, Any]] | None = None) -> Any

Convert agentspec eval config into a pydantic-evals Dataset.

If pydantic_evals is available, returns a Dataset instance. Otherwise returns a list of EvalSuiteSpec for manual processing.

Parameters

eval_spec : list[dict]
    The evals config from the agentspec.
agent_system_prompt : str | None
    System prompt of the agent (used for synthetic case generation).
tool_schemas : list[dict] | None
    JSON schemas of the agent's tools (for grounding case generation).
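Because the function is a coroutine and its return type depends on whether pydantic_evals is installed, callers may want to branch on the result. The sketch below is one way to do that. `build_stub`, `describe`, and `module_available` are hypothetical names introduced for illustration, and the stub always takes the fallback branch rather than reproducing the real function.

```python
import asyncio
import importlib.util
from typing import Any


def module_available(name: str) -> bool:
    """Return True if a top-level module `name` can be found for import."""
    return importlib.util.find_spec(name) is not None


async def build_stub(eval_spec: list[dict[str, Any]]) -> Any:
    """Hypothetical stand-in for build_dataset_from_spec.

    Always returns the fallback shape (a list), since how the real
    function constructs a Dataset is not documented here.
    """
    await asyncio.sleep(0)  # yield control once, as a real coroutine might
    return list(eval_spec)


def describe(result: Any) -> str:
    """Summarise whichever shape the builder returned."""
    if isinstance(result, list):
        return f"fallback: {len(result)} suite spec(s) to process manually"
    return f"dataset: {type(result).__name__}"


result = asyncio.run(
    build_stub([{"name": "KPI Accuracy", "category": "coding", "task_count": 400}])
)
summary = describe(result)
```

Calling `module_available("pydantic_evals")` before the build tells the caller in advance which branch to expect.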
