Human Data OS · Training demo

Turn reviewed work into a signed training corpus.

Most teams cannot prove where their training data came from. Walk through one loop end to end: accepted reviewer decisions become a corpus you can defend in an audit and weights you keep. This demo is read-only. In a pilot, you run it on your own work.

When the EU AI Act's data governance obligations for high-risk systems begin to apply in August 2026, expect to be asked who created each datapoint, who reviewed it, and under what rights. This is the answer, attached to the data itself.

Corpus pipeline

clinical-note-v18

Illustrative seed data. In a pilot, these steps read from your live data.

Status: export ready

1. Reviewed work · Workforce + Annotation · 42k accepted labels
2. Policy filter · consent, PHI, retention · 1,128 exclusions
3. De-dupe · semantic near-match · 3.8% removed
4. Slice · 8 domains · clinical-note-v18
5. Signed export · checksums attached · jsonl + parquet
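
To make the policy filter and de-dupe steps concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Record` fields, the function names, and the 0.8 threshold are hypothetical, and token-set Jaccard similarity stands in for whatever semantic near-match method the product actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical record shape; field names are illustrative only.
    record_id: str
    text: str
    has_consent: bool
    contains_phi: bool
    within_retention: bool


def policy_filter(records):
    """Split records into (kept, excluded) on consent, PHI, and retention."""
    kept, excluded = [], []
    for r in records:
        ok = r.has_consent and not r.contains_phi and r.within_retention
        (kept if ok else excluded).append(r)
    return kept, excluded


def jaccard(a, b):
    """Token-set Jaccard similarity, a simple stand-in for semantic matching."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def dedupe(records, threshold=0.8):
    """Keep the first record of each near-duplicate group (quadratic sketch)."""
    kept = []
    for r in records:
        if all(jaccard(r.text, k.text) < threshold for k in kept):
            kept.append(r)
    return kept


records = [
    Record("r1", "Patient reports mild headache after dose", True, False, True),
    Record("r2", "patient reports mild headache after dose", True, False, True),
    Record("r3", "Follow-up scheduled in two weeks", True, False, True),
    Record("r4", "Discharge summary pending", False, False, True),
]
kept, excluded = policy_filter(records)
unique = dedupe(kept)
print([r.record_id for r in excluded], [r.record_id for r in unique])
```

The pairwise comparison is fine for a toy set but would not scale to 42k labels. A production de-dupe step would more plausibly use embeddings with an approximate nearest-neighbour index.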

Failure feedback loop

Evaluation Studio (source) · Hallucinated dosage · rewrite pair · 219
Regression Bank (source) · Old failure resurfaced · hard negative · 37
AuraQC (source) · Reviewer conflict · adjudicated preference · 82
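
The feedback loop above pairs each failure type with a kind of training example. A minimal sketch of that routing follows, assuming a hypothetical dict shape for failure records. The mapping mirrors the panel, but the function name, field names, and payload strings are illustrative.

```python
from collections import Counter

# Failure type -> training example kind, as listed in the panel above.
EXAMPLE_KIND = {
    "Hallucinated dosage": "rewrite pair",
    "Old failure resurfaced": "hard negative",
    "Reviewer conflict": "adjudicated preference",
}


def to_training_example(failure):
    """Turn one failure record (hypothetical shape) into a corpus entry."""
    kind = EXAMPLE_KIND.get(failure["failure_type"])
    if kind is None:
        # Unknown failure types are rejected rather than silently dropped.
        raise ValueError(f"unmapped failure type: {failure['failure_type']}")
    return {"source": failure["source"], "kind": kind, "payload": failure["payload"]}


failures = [
    {"source": "Evaluation Studio", "failure_type": "Hallucinated dosage",
     "payload": "illustrative model output"},
    {"source": "Regression Bank", "failure_type": "Old failure resurfaced",
     "payload": "illustrative regression case"},
    {"source": "AuraQC", "failure_type": "Reviewer conflict",
     "payload": "illustrative disputed label"},
]
examples = [to_training_example(f) for f in failures]
counts = Counter(e["kind"] for e in examples)
print(dict(counts))
```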

Manifest drilldown

run: train_2026_04_clinical_v18
reviewer coverage: 94% · 19 certs
excluded records: 1,128 · consent/PHI
checksums: dataset sha256:a91f · weights sha256:c03d
formats: jsonl · parquet · evidence packet
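
As a rough illustration of how checksums like these can be produced and re-checked, here is a sketch using Python's standard `hashlib`. The manifest keys, function names, and sample bytes are assumptions, not the product's actual schema. Note that a checksum only detects changes; a signed export would also need a signature made with a private key, which this sketch does not cover.

```python
import hashlib
import json


def sha256_digest(data: bytes) -> str:
    """Return a 'sha256:<hex>' digest string for the given bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def build_manifest(run_id, dataset_bytes, weights_bytes, excluded_count):
    """Assemble a manifest dict (hypothetical schema) with both checksums."""
    return {
        "run": run_id,
        "excluded_records": excluded_count,
        "checksums": {
            "dataset": sha256_digest(dataset_bytes),
            "weights": sha256_digest(weights_bytes),
        },
        "formats": ["jsonl", "parquet"],
    }


def verify(manifest, dataset_bytes, weights_bytes):
    """True only if both artifacts still match the recorded checksums."""
    sums = manifest["checksums"]
    return (sums["dataset"] == sha256_digest(dataset_bytes)
            and sums["weights"] == sha256_digest(weights_bytes))


# Placeholder bytes stand in for the real dataset and weights files.
dataset = b'{"id": "r1", "label": "accepted"}\n'
weights = b"placeholder weights bytes"
manifest = build_manifest("train_2026_04_clinical_v18", dataset, weights, 1128)
print(json.dumps(manifest, indent=2))
```

An auditor holding the manifest can recompute both digests from the delivered files. Any edit to the dataset or the weights makes `verify` return `False`.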

Dataset governance

Every slice carries source, exclusion, and reviewer coverage state.

Rights attached to the data

Export holds until consent, retention, and exclusions pass. Each datapoint carries who created it, who reviewed it, and under what rights.
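
One way to picture the export hold is as a gate that lists every reason the export cannot proceed. The sketch below is an assumption-laden illustration: the `Datapoint` fields, the reason strings, and the reading of "exclusions pass" as "no record flagged for exclusion remains in the candidate set" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datapoint:
    # Hypothetical provenance fields carried by every datapoint.
    datapoint_id: str
    created_by: str
    reviewed_by: str
    rights: str  # for example, a consent or license reference
    consent_ok: bool
    retention_ok: bool
    flagged_for_exclusion: bool


def export_blockers(datapoints):
    """Return (datapoint_id, reason) pairs; export may proceed only if empty."""
    blockers = []
    for d in datapoints:
        if not d.consent_ok:
            blockers.append((d.datapoint_id, "consent"))
        if not d.retention_ok:
            blockers.append((d.datapoint_id, "retention"))
        if d.flagged_for_exclusion:
            blockers.append((d.datapoint_id, "exclusion"))
        if not (d.created_by and d.reviewed_by and d.rights):
            blockers.append((d.datapoint_id, "provenance"))
    return blockers


def can_export(datapoints):
    """The export holds until there are no blockers left."""
    return not export_blockers(datapoints)


candidates = [
    Datapoint("d1", "annotator-07", "reviewer-02", "consent-form-A", True, True, False),
    Datapoint("d2", "annotator-11", "reviewer-02", "consent-form-A", True, False, False),
    Datapoint("d3", "annotator-03", "", "consent-form-B", True, True, False),
]
blockers = export_blockers(candidates)
print(blockers, can_export(candidates))
```

Returning the full list of blockers, rather than a bare yes or no, gives a reviewer something to act on for each held record.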

Weights you keep

Weights and the evidence packet leave together. The tuned model is yours, and the evidence packet lets it stand up to an audit.

Run this loop on your own work.

What you walked through here is read-only. In a pilot, the corpus, the exclusions, and the export proof read from your live data, and the weights are yours to keep.

Training Workflow Demo | AuraOne