PerformanceEvaluator measures execution latency and memory footprint across multiple iterations. It provides statistical analysis (average, median, min, max, standard deviation) to help you understand the performance characteristics of your AI workflows.
How It Works
- Warmup — Runs the entity a configurable number of times to reach steady state.
- Measurement — Executes the entity for `num_iterations` runs, capturing high-precision latency and memory metrics per run.
- Aggregation — Calculates statistics across all measurement runs.
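The sketch below illustrates this warmup/measure/aggregate flow. It is not the library's actual internals: the `run_entity` callable is a hypothetical stand-in for executing the agent, graph, or team on its task(s), and the use of `time.perf_counter` and `tracemalloc` is an assumption about how the latency and memory numbers could be obtained.

```python
import statistics
import time
import tracemalloc


def profile(run_entity, num_iterations: int = 10, warmup_runs: int = 2) -> dict:
    """Illustrative warmup/measure/aggregate loop (not the real implementation)."""
    # Warmup: run until caches, connections, and lazy initialization reach steady state.
    for _ in range(warmup_runs):
        run_entity()

    latencies, increases, peaks = [], [], []
    for _ in range(num_iterations):
        tracemalloc.start()                 # track allocations for this run only
        start = time.perf_counter()         # high-resolution wall clock
        run_entity()
        latencies.append(time.perf_counter() - start)
        current, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        increases.append(current)           # net memory increase during the run (bytes)
        peaks.append(peak)                  # peak memory relative to run start (bytes)

    def stats(values):
        return {
            "average": statistics.mean(values),
            "median": statistics.median(values),
            "min": min(values),
            "max": max(values),
            "std_dev": statistics.stdev(values) if len(values) > 1 else 0.0,
        }

    return {
        "latency_stats": stats(latencies),
        "memory_increase_stats": stats(increases),
        "memory_peak_stats": stats(peaks),
    }
```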
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `agent_under_test` | Agent \| Graph \| Team | Yes | Entity to profile |
| `task` | Task \| List[Task] | Yes | Task(s) to execute each iteration |
| `num_iterations` | int | No | Number of measurement runs (default: 10) |
| `warmup_runs` | int | No | Warmup runs before measurement (default: 2) |
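A minimal usage sketch follows. The import path, the `my_agent` and `my_task` objects, and the `run()` method name are assumptions for illustration; the constructor parameters match the table above.

```python
# Hypothetical import path; adjust to your package layout.
from myframework.eval import PerformanceEvaluator

evaluator = PerformanceEvaluator(
    agent_under_test=my_agent,   # an Agent, Graph, or Team instance
    task=my_task,                # a Task, or a list of Tasks, executed each iteration
    num_iterations=20,           # measurement runs (default: 10)
    warmup_runs=3,               # warmup runs before measurement (default: 2)
)

result = evaluator.run()         # assumed entry point; returns a PerformanceEvaluationResult
```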
Result Structure
`PerformanceEvaluationResult` contains:
- `all_runs` — List of `PerformanceRunResult` objects (one per iteration)
- `num_iterations` / `warmup_runs` — Configuration values
- `latency_stats` — `{ average, median, min, max, std_dev }` in seconds
- `memory_increase_stats` — Net memory increase statistics in bytes
- `memory_peak_stats` — Peak memory usage statistics in bytes
`PerformanceRunResult` includes:
- `latency_seconds` — Wall-clock time for the run
- `memory_increase_bytes` — Net memory increase during the run
- `memory_peak_bytes` — Peak memory relative to run start
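As a sketch of how these results might be consumed (field names taken from the structure above; attribute-style access on the result and dict-style access on the stats are assumptions):

```python
# Aggregate statistics across all measurement runs.
print(f"median latency: {result.latency_stats['median']:.3f}s")
print(f"peak memory (max): {result.memory_peak_stats['max'] / 1024:.1f} KiB")

# Per-iteration details, e.g. to spot outliers.
slowest = max(result.all_runs, key=lambda r: r.latency_seconds)
print(f"slowest run: {slowest.latency_seconds:.3f}s, "
      f"net memory increase: {slowest.memory_increase_bytes} bytes")
```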

