Evaluator
Evaluator is the release confidence layer.
It tracks important user journeys and shows whether the project still satisfies the flows that matter before work is considered ready to ship.
Why It Matters
Task completion is not the same as product readiness.
An agent can finish a local task while a critical journey remains broken. Evaluator exists to keep delivery focused on outcomes:
- required journeys are visible;
- failed journeys block confidence;
- disabled journeys are explicit rather than forgotten;
- evidence and review gates are attached to the journey;
- humans can see where release risk still lives.
This turns "tests passed" into a stronger question: did the user journey pass?
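The rules above can be sketched as a small readiness check. This is a minimal illustration, not Evaluator's actual implementation: the `Journey` fields, the `RunStatus` values, and the `release_report` function are all hypothetical names, and treating a required journey that has not run as a blocker is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class RunStatus(Enum):
    # Hypothetical status values, for illustration only.
    PASSED = "passed"
    FAILED = "failed"
    NOT_RUN = "not_run"


@dataclass
class Journey:
    name: str
    required: bool = True
    enabled: bool = True
    status: RunStatus = RunStatus.NOT_RUN


def release_report(journeys):
    """Summarize which journeys block confidence and which are disabled."""
    # Assumption: a required, enabled journey blocks unless it has passed.
    blocking = [
        j.name
        for j in journeys
        if j.enabled and j.required and j.status is not RunStatus.PASSED
    ]
    # Disabled journeys are listed explicitly so they are not forgotten.
    disabled = [j.name for j in journeys if not j.enabled]
    return {"ready": not blocking, "blocking": blocking, "disabled": disabled}
```

In this sketch a disabled journey never blocks the release, but it always appears in the report, which keeps the decision to disable it visible to reviewers.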
What Evaluator Tracks
Evaluator is designed around journeys, not files.
Examples:
- onboarding works end to end;
- invitation flow can add a teammate;
- report download remains available;
- agent resume does not lose project context;
- dashboard actions still map to the right workspace.
Each journey can carry checkpoints, the status of its latest run, evidence, and a human review state.
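One way to picture a journey record is as a small data structure. The sketch below is an assumption about shape only: `Checkpoint`, `JourneyRecord`, the `review_state` values, and the rule for deriving the latest run status from checkpoints are hypothetical, not Evaluator's real schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Checkpoint:
    name: str
    passed: Optional[bool] = None  # None means the checkpoint has not run yet


@dataclass
class JourneyRecord:
    name: str
    checkpoints: List[Checkpoint] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)  # e.g. links to logs or screenshots
    review_state: str = "pending"  # hypothetical values: pending, approved, rejected

    def latest_run_status(self) -> str:
        """Derive a run status from checkpoints (an assumed rule)."""
        if any(c.passed is False for c in self.checkpoints):
            return "failed"
        if not self.checkpoints or any(c.passed is None for c in self.checkpoints):
            return "incomplete"
        return "passed"
```

Keeping evidence and review state on the journey itself, rather than on individual tasks, mirrors the idea that Evaluator is organized around journeys, not files.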
How It Fits With Tasks
Evaluator should not replace the Board. It complements it:
- Board answers "what should agents work on next?"
- Evaluator answers "is the product still safe to ship?"
- Activity explains "what happened and why?"
- Plans explain "what are we building and under which constraints?"
Together they make agentic development observable from plan to release.
Related Entry Points
| Need | Entry point |
|---|---|
| Inspect delivery health | sinaris hub |
| Review blocked journeys | Evaluator view |
| Connect evaluation to implementation | Board + Activity |
| Understand release confidence | Plan + Evaluator |