evals.report
Evaluation report formatting and persistence.
CaseResult Objects
@dataclass
class CaseResult()
Result of a single evaluation case.
ReportSummary Objects
@dataclass
class ReportSummary()
Aggregated summary of an eval report.
EvalReportData Objects
@dataclass
class EvalReportData()
Full evaluation report data.
format_report
def format_report(eval_id: str,
                  agent_id: str,
                  case_results: list[CaseResult],
                  started_at: datetime | None = None,
                  metadata: dict[str, Any] | None = None) -> EvalReportData
Build a structured eval report from individual case results.
Parameters
eval_id : str
    Unique identifier for this eval run.
agent_id : str
    The agent that was evaluated.
case_results : list[CaseResult]
    Results for each evaluation case.
started_at : datetime | None
    When the eval run started.
metadata : dict | None
    Additional metadata (spec version, model, etc.).
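The fields of CaseResult, ReportSummary, and EvalReportData are not listed in this reference, so the following is only a minimal sketch of how a report builder of this shape might aggregate case results. The dataclass fields (`case_id`, `passed`, `total`, `failed`, `summary`) and the name `format_report_sketch` are assumptions made for illustration, not the module's actual definitions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional


# Hypothetical stand-ins: the real field names in evals.report are not shown
# in this reference, so these are assumptions for illustration only.
@dataclass
class CaseResult:
    case_id: str
    passed: bool


@dataclass
class ReportSummary:
    total: int
    passed: int
    failed: int


@dataclass
class EvalReportData:
    eval_id: str
    agent_id: str
    summary: ReportSummary
    case_results: list
    started_at: Optional[datetime] = None
    metadata: dict = field(default_factory=dict)


def format_report_sketch(
    eval_id: str,
    agent_id: str,
    case_results: list,
    started_at: Optional[datetime] = None,
    metadata: Optional[dict] = None,
) -> EvalReportData:
    """Aggregate per-case results into one report object (illustrative only)."""
    passed = sum(1 for result in case_results if result.passed)
    summary = ReportSummary(
        total=len(case_results),
        passed=passed,
        failed=len(case_results) - passed,
    )
    return EvalReportData(
        eval_id=eval_id,
        agent_id=agent_id,
        summary=summary,
        case_results=list(case_results),
        started_at=started_at,
        # Treat a missing metadata argument as an empty dict.
        metadata=dict(metadata or {}),
    )


report = format_report_sketch(
    "eval-1",
    "agent-a",
    [CaseResult("c1", True), CaseResult("c2", False)],
    metadata={"model": "example"},
)
```

In this sketch the optional arguments default to `None` and are normalized inside the function, which mirrors the `None` defaults in the signature above.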
save_report_json
def save_report_json(report: EvalReportData,
                     output_dir: str = _DEFAULT_EVALS_DIR) -> str
Persist an eval report as JSON. Returns the file path.
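One plausible way to persist a dataclass-based report as JSON and return its path is sketched below. The `MiniReport` stand-in, the `save_report_json_sketch` name, and the `<eval_id>.json` file naming are all assumptions; the actual file naming and the value of `_DEFAULT_EVALS_DIR` are not documented here.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class MiniReport:
    # Hypothetical stand-in for EvalReportData; the real fields are not shown.
    eval_id: str
    agent_id: str
    passed: int
    total: int


def save_report_json_sketch(report: MiniReport, output_dir: str) -> str:
    """Write the report as JSON under output_dir and return the file path."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: one file per eval run, named after its eval_id.
    path = out_dir / f"{report.eval_id}.json"
    with path.open("w", encoding="utf-8") as fh:
        # default=str lets non-JSON values such as datetime serialize as text.
        json.dump(asdict(report), fh, indent=2, default=str)
    return str(path)


# Usage: write into a temporary directory and read the file back.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_report_json_sketch(MiniReport("eval-1", "agent-a", 1, 2), tmp)
    with open(saved, encoding="utf-8") as fh:
        loaded = json.load(fh)
```

Returning the path as a string matches the `-> str` return type in the signature above.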