maple.utils.eval.BatchResults

class maple.utils.eval.BatchResults(batch_id: str, policy_id: str, env_id: str, tasks: List[str] = <factory>, seeds: List[int] = <factory>, max_steps: int = 300, results: List[EvalResult] = <factory>, started_at: float = 0.0, finished_at: float = 0.0, total_episodes: int = 0, successful_episodes: int = 0, failed_episodes: int = 0, error_episodes: int = 0, success_rate: float = 0.0, avg_reward: float = 0.0, avg_steps: float = 0.0, avg_duration: float = 0.0, task_stats: Dict[str, Dict[str, Any]] = <factory>)

Aggregated results from a batch evaluation.

Container for multiple episode results with automatic statistics computation. Provides per-task breakdowns, success rates, and timing information. Supports multiple serialization formats for reporting and analysis.
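The aggregate fields in the signature (counts, rates, averages, and task_stats) could be derived from the episode list roughly as in the sketch below. This is an illustration under stated assumptions, not maple's implementation: EpisodeSketch and BatchSketch are hypothetical stand-ins, the per-episode field names (task, success, reward, steps, duration, error) are assumed because EvalResult is not documented on this page, and the keys inside task_stats are likewise made up for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class EpisodeSketch:
    """Hypothetical stand-in for EvalResult; all field names are assumptions."""
    task: str
    success: bool
    reward: float
    steps: int
    duration: float
    error: Optional[str] = None


@dataclass
class BatchSketch:
    """Hypothetical stand-in mirroring a subset of the BatchResults fields."""
    batch_id: str
    results: List[EpisodeSketch] = field(default_factory=list)
    total_episodes: int = 0
    successful_episodes: int = 0
    failed_episodes: int = 0
    error_episodes: int = 0
    success_rate: float = 0.0
    avg_reward: float = 0.0
    avg_steps: float = 0.0
    avg_duration: float = 0.0
    task_stats: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def compute_stats(self) -> None:
        """Recompute every aggregate field from self.results."""
        n = len(self.results)
        self.total_episodes = n
        # Assumption: an episode with an error is counted separately from
        # ordinary failures and never counts as a success.
        self.error_episodes = sum(1 for r in self.results if r.error is not None)
        self.successful_episodes = sum(
            1 for r in self.results if r.success and r.error is None
        )
        self.failed_episodes = n - self.successful_episodes - self.error_episodes
        self.task_stats = {}
        if n == 0:
            # Avoid division by zero on an empty batch.
            self.success_rate = self.avg_reward = 0.0
            self.avg_steps = self.avg_duration = 0.0
            return
        self.success_rate = self.successful_episodes / n
        self.avg_reward = sum(r.reward for r in self.results) / n
        self.avg_steps = sum(r.steps for r in self.results) / n
        self.avg_duration = sum(r.duration for r in self.results) / n
        # Per-task breakdown: episode count, success count, success rate.
        for r in self.results:
            stats = self.task_stats.setdefault(r.task, {"episodes": 0, "successes": 0})
            stats["episodes"] += 1
            stats["successes"] += int(r.success and r.error is None)
        for stats in self.task_stats.values():
            stats["success_rate"] = stats["successes"] / stats["episodes"]


batch = BatchSketch(
    batch_id="demo",
    results=[
        EpisodeSketch("reach", True, 1.0, 10, 0.5),
        EpisodeSketch("reach", False, 0.0, 30, 1.5),
        EpisodeSketch("push", True, 1.0, 20, 1.0),
        EpisodeSketch("push", False, 0.0, 0, 0.0, error="simulated crash"),
    ],
)
batch.compute_stats()
print(batch.success_rate, batch.task_stats)
```

Whether the real class counts error episodes in the success-rate denominator is not stated here; the sketch includes them, and excluding them would be an equally plausible choice.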

Methods

__init__(batch_id, policy_id, env_id, tasks, ...)

compute_stats()

Compute aggregate statistics from episode results.

load(path)

Load batch results from JSON file.

save(path)

Save batch results to JSON file.

summary()

Generate a human-readable summary string.

to_dict()

Convert batch results to dictionary representation.

to_json([indent])

Convert batch results to JSON string.
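The serialization methods listed above suggest a dictionary and JSON round trip. The sketch below shows one common way to build that on a dataclass with the standard library. It is an assumption-laden illustration rather than the library's code: BatchSketch is a hypothetical stand-in carrying only a few of the documented fields, and details such as load being a classmethod and the default indent of 2 are guesses.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class BatchSketch:
    """Hypothetical stand-in carrying a few of the BatchResults fields."""
    batch_id: str
    policy_id: str
    env_id: str
    tasks: List[str] = field(default_factory=list)
    seeds: List[int] = field(default_factory=list)
    max_steps: int = 300

    def to_dict(self) -> Dict[str, Any]:
        # asdict recursively converts the dataclass into plain dicts and lists.
        return asdict(self)

    def to_json(self, indent: int = 2) -> str:
        # Assumption: indent is optional and defaults to 2.
        return json.dumps(self.to_dict(), indent=indent)

    def save(self, path: str) -> None:
        with open(path, "w", encoding="utf-8") as f:
            f.write(self.to_json())

    @classmethod
    def load(cls, path: str) -> "BatchSketch":
        # Assumption: load is a constructor-style classmethod.
        with open(path, "r", encoding="utf-8") as f:
            return cls(**json.load(f))
```

A real implementation would also need to rebuild the nested EvalResult objects in results when loading, which this sketch sidesteps by omitting that field.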

Attributes

avg_duration

avg_reward

avg_steps

error_episodes

failed_episodes

finished_at

max_steps

started_at

success_rate

successful_episodes

total_episodes

batch_id

policy_id

env_id

tasks

seeds

results

task_stats