What is Archal?
We build pre-deployment testing for AI agents. Just as you wouldn’t hire someone without interviewing them or force-push code to a deployment branch, you shouldn’t let an AI system interact with volatile systems without testing.
How it works
- Write a scenario in markdown with setup state, expected behavior, and success criteria.
- Archal provisions digital twins preloaded with scenario state.
- Your agent runs against those twins through MCP-compatible interfaces.
- The evaluator scores each run against deterministic and probabilistic criteria.
- You review the satisfaction score, per-criterion results, and traces to decide what to improve.
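To make the scoring step concrete, here is a minimal sketch of how per-criterion results might roll up into a single satisfaction score. It is illustrative only and not Archal's implementation: the `CriterionResult` type, the `run_score` and `satisfaction` functions, the unweighted-mean rule, and the averaging across repeated runs are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class CriterionResult:
    """One criterion's outcome for one run (hypothetical shape)."""
    name: str
    kind: str     # "deterministic" or "probabilistic"
    score: float  # 1.0 or 0.0 for deterministic checks, any value in [0, 1] otherwise


def run_score(results: list[CriterionResult]) -> float:
    """Score one run as the unweighted mean of its criterion scores (assumed rule)."""
    return mean(r.score for r in results) if results else 0.0


def satisfaction(runs: list[list[CriterionResult]]) -> float:
    """Average the per-run scores and express the result as a percentage."""
    return 100.0 * mean(run_score(r) for r in runs) if runs else 0.0


# Two runs of the same made-up scenario: the deterministic check passes once
# and fails once, while the judged criterion scores 0.5 both times.
runs = [
    [CriterionResult("issue is closed", "deterministic", 1.0),
     CriterionResult("reply explains the fix", "probabilistic", 0.5)],
    [CriterionResult("issue is closed", "deterministic", 0.0),
     CriterionResult("reply explains the fix", "probabilistic", 0.5)],
]
print(satisfaction(runs))
```

Keeping the per-criterion results alongside the aggregate matters in a design like this: the single number tells you whether the agent is improving, while the individual criteria tell you what to fix.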
Key concepts
| Concept | What it means |
|---|---|
| Digital twin | A stateful behavioral clone of a real service (not a mock or stub) |
| Scenario | A markdown file describing an agent test: setup, behavior, criteria |
| Satisfaction | A probabilistic score of how well an agent meets scenario criteria |
| Seed | Predefined state used to initialize a twin |
| Trace | Complete record of an agent’s tool calls during a run |
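The difference between a stateful twin and a stub is easiest to see in code. The toy class below is a sketch under stated assumptions, not Archal's API: `ToyIssueTwin`, its tool names (`create_issue`, `close_issue`, `list_open_issues`), and the seed layout are all hypothetical. It shows a seed initializing state, later calls observing the effects of earlier ones, and every tool call being appended to a trace.

```python
import copy
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyIssueTwin:
    """A toy stateful stand-in for an issue tracker (hypothetical, for illustration)."""
    state: dict[str, Any]
    trace: list[dict[str, Any]] = field(default_factory=list)

    @classmethod
    def from_seed(cls, seed: dict[str, Any]) -> "ToyIssueTwin":
        # Deep-copy the seed so repeated runs start from identical state.
        return cls(state=copy.deepcopy(seed))

    def call(self, tool: str, **args: Any) -> Any:
        """Apply a tool call to the twin's state and record it in the trace."""
        issues = self.state["issues"]
        if tool == "create_issue":
            issue = {"id": len(issues) + 1, "title": args["title"], "open": True}
            issues.append(issue)
            result: Any = {"id": issue["id"]}
        elif tool == "close_issue":
            match = [i for i in issues if i["id"] == args["id"]]
            for issue in match:
                issue["open"] = False
            result = {"ok": bool(match)}
        elif tool == "list_open_issues":
            result = [i["title"] for i in issues if i["open"]]
        else:
            raise ValueError(f"unknown tool: {tool}")
        self.trace.append({"tool": tool, "args": args, "result": result})
        return result


seed = {"issues": [{"id": 1, "title": "Login fails", "open": True}]}
twin = ToyIssueTwin.from_seed(seed)
twin.call("create_issue", title="Typo in docs")
twin.call("close_issue", id=1)
open_titles = twin.call("list_open_issues")
```

A stub would return a canned list from `list_open_issues` regardless of what came before. Here the result depends on the earlier `close_issue` call, which is the behavior an agent would encounter against the real service.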
Next steps
Quickstart
Get up and running