Models / Evaluation Studio

Score a model candidate against explicit, versioned criteria.

Evaluate candidates with the dataset, rubric, run, failed examples, human judgment, and release impact connected.

Built for evaluation, model, safety, and governance teams comparing model behavior. Each program is configured around the workflow, review policy, and evidence your team needs.


Available now: Scoped around a named evaluation and candidate set.

Input, work, and output

What you provide

A model candidate, evaluation set, rubric, and release policy.
Rubric dimensions, thresholds, regressions, and reviewer notes.

What happens

Run, inspect traces, review failures, compare, and approve.
Responsible review: Evaluation owner, domain reviewer, and release approver.

What you receive

A versioned scorecard with blockers and source evidence.
Decision: Pass, hold, compare again, or return for improvement.

Decision flow

Move from intake to review and handoff with clear owners at every step.

01 Scope
A model candidate, evaluation set, rubric, and release policy.
Versioned rubric

02 Operate
Run, inspect traces, review failures, compare, and approve.
Trace review

03 Review
Rubric dimensions, thresholds, regressions, and reviewer notes.
Responsible review: Evaluation owner, domain reviewer, and release approver.
Named review and decision history

04 Release
A versioned scorecard with blockers and source evidence.
Scorecard record

What you receive

Evaluation Studio deliverables

Input: A model candidate, evaluation set, rubric, and release policy. (Named source and version)
Criteria: Rubric dimensions, thresholds, regressions, and reviewer notes. (Versioned rubric)
Reviewer: Evaluation owner, domain reviewer, and release approver. (Trace review)
Decision: Pass, hold, compare again, or return for improvement. (Actor, role, time, and rationale)
Output: A versioned scorecard with blockers and source evidence. (Scorecard record)
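
As a rough illustration of how rubric dimensions, thresholds, and regressions could turn into scorecard blockers, here is a minimal Python sketch. It is not Evaluation Studio's API: the `Dimension` class, the `find_blockers` function, the 0-to-1 score scale, and the rule that any drop against a baseline counts as a regression are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dimension:
    """One rubric dimension (hypothetical shape, not a product schema)."""
    name: str
    threshold: float  # minimum acceptable score, assumed to be on a 0..1 scale


def find_blockers(
    rubric: list[Dimension],
    candidate_scores: dict[str, float],
    baseline_scores: Optional[dict[str, float]] = None,
) -> list[str]:
    """Return human-readable blockers for a candidate.

    Assumed rules: a missing score, a score below the dimension threshold,
    or a score lower than the baseline's score each produce a blocker.
    """
    blockers = []
    for dim in rubric:
        score = candidate_scores.get(dim.name)
        if score is None:
            blockers.append(f"{dim.name}: no score recorded")
            continue
        if score < dim.threshold:
            blockers.append(
                f"{dim.name}: {score:.2f} below threshold {dim.threshold:.2f}"
            )
        # A dimension absent from the baseline is never treated as a regression.
        if baseline_scores and score < baseline_scores.get(dim.name, score):
            blockers.append(
                f"{dim.name}: regression from baseline {baseline_scores[dim.name]:.2f}"
            )
    return blockers


rubric = [Dimension("accuracy", 0.80), Dimension("safety", 0.90)]
candidate = {"accuracy": 0.85, "safety": 0.88}
baseline = {"accuracy": 0.90, "safety": 0.85}
for blocker in find_blockers(rubric, candidate, baseline):
    print(blocker)
```

In this example the candidate clears the accuracy threshold but regresses against the baseline, and it misses the safety threshold, so both dimensions would appear as blockers on the scorecard.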

What the workflow supports

Outcome	Work	What you receive	Program fit
Advance	The required Evaluation Studio checks and review are complete.	Versioned rubric, trace review, and scorecard record.	Program fit depends on the source, version, criteria, reviewers, and operating scope.
Hold	A required check, reviewer, approval, or source record is missing or unresolved.	Blocking requirement, responsible owner, source record, and required recovery.	The responsible owner and recovery step stay visible until the issue is resolved.
Return for work	The source object or workflow requires correction and another review.	Returned items, responsible owner, expected evidence, and review route.	The updated work returns through the same review path before a new decision.
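
The three outcomes in the table can be read as a small decision rule. The Python sketch below shows one way to express it. The names `Outcome`, `ReviewState`, and `decide` are hypothetical, and the precedence (corrections are checked before missing requirements) is an assumption, since the page does not say which outcome wins when both apply.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    ADVANCE = "Advance"
    HOLD = "Hold"
    RETURN_FOR_WORK = "Return for work"


@dataclass
class ReviewState:
    """Hypothetical snapshot of an evaluation at decision time."""
    # Checks, reviewers, approvals, or source records still missing or unresolved.
    missing_requirements: list[str] = field(default_factory=list)
    # Items in the source object or workflow that need correction and re-review.
    corrections_needed: list[str] = field(default_factory=list)


def decide(state: ReviewState) -> Outcome:
    """Map a review state to one of the three outcomes in the table.

    Assumed precedence: needed corrections send the work back first;
    otherwise any unresolved requirement holds it; otherwise it advances.
    """
    if state.corrections_needed:
        return Outcome.RETURN_FOR_WORK
    if state.missing_requirements:
        return Outcome.HOLD
    return Outcome.ADVANCE


print(decide(ReviewState()).value)
print(decide(ReviewState(missing_requirements=["release approver sign-off"])).value)
print(decide(ReviewState(corrections_needed=["rubric threshold for safety"])).value)
```

Keeping the unresolved items on the state object mirrors the table's point that the blocking requirement and returned items stay visible until they are resolved.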

Inspect the workflow

See how the workflow moves from input to decision.

Evaluation Studio walkthrough: Explore the inputs, review steps, decisions, and deliverables.
Model Launch Check: Compare candidates, replay regressions, and assemble a release decision.
