Upsonic provides a built-in evaluation framework to systematically test and benchmark your AI agents, teams, and graphs. Evaluations help you ensure that your AI workflows meet quality, performance, and reliability standards before deploying to production.
Evaluation Types
Accuracy
LLM-as-a-judge evaluation that scores agent output quality against expected answers on a 1–10 scale.
Performance
Latency and memory profiling with statistical analysis across multiple iterations.
Reliability
Tool-call verification that asserts expected tools were invoked during execution.
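To make the performance and reliability ideas above concrete, here is a minimal, framework-agnostic sketch using only the Python standard library. It is not Upsonic's API: `profile_callable`, `missing_tools`, and `fake_agent_run` are hypothetical names introduced for illustration, and the real evaluators may measure and report these values differently.

```python
import statistics
import time
import tracemalloc
from typing import Callable, Dict, Iterable, List


def profile_callable(fn: Callable[[], object], iterations: int = 5) -> Dict[str, float]:
    """Run fn repeatedly and summarize latency (seconds) and peak memory (bytes).

    Hypothetical helper, not part of Upsonic. Note that tracemalloc adds
    overhead, so the measured latencies are higher than an untraced run.
    """
    latencies: List[float] = []
    peaks: List[int] = []
    for _ in range(iterations):
        tracemalloc.start()
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        peaks.append(peak)
    return {
        "latency_mean": statistics.mean(latencies),
        "latency_median": statistics.median(latencies),
        # stdev needs at least two samples
        "latency_stdev": statistics.stdev(latencies) if len(latencies) > 1 else 0.0,
        "peak_memory_max": float(max(peaks)),
    }


def missing_tools(expected: Iterable[str], called: Iterable[str]) -> List[str]:
    """Return the expected tool names that never appear in the recorded calls."""
    called_set = set(called)
    return [name for name in expected if name not in called_set]


def fake_agent_run() -> str:
    # Hypothetical stand-in for an agent, team, or graph invocation.
    return "".join(str(i) for i in range(1000))
```

A run could then be summarized with `profile_callable(fake_agent_run, iterations=10)`, and a reliability check passes when `missing_tools(expected, called)` returns an empty list. Repeating the run several times is what makes the mean, median, and standard deviation meaningful; a single measurement says little about variance.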
Quick Start
Install the required dependencies and run your first evaluation in minutes.

Supported Entities
Every evaluator works with all three core entities:

| Entity | Description |
|---|---|
| Agent | Single agent executing a task |
| Team | Multi-agent team in sequential, coordinate, or route mode |
| Graph | DAG-based workflow with chained task nodes |

