Research-grade data for AI in open-ended domains
Multimodal expert reasoning traces for post-training and evaluation, starting with finance.
We capture tacit knowledge and reasoning for open-ended tasks that rely on human judgement.
Building AI for open-ended domains requires process-level training data
Capability gains in these domains require new data collection formats: ones that capture expert reasoning and intent at the process level and codify elements such as judgement, deliberation, backtracking, and hypothesis construction. Formats like these yield clear reward signals, which make the data usable for model training.
- Code
- Maths
- Structured extraction
- Cybersec
- Biology
- Medicine
- Finance
- Law
- Synthetic data cannot generate the out-of-distribution signal needed to adjust and improve existing capabilities.
- Curation raises the signal of existing datasets, but fails to capture the process-level insight underlying outcome-level data.
- Data-labelling shops that aim to capture process-level data annotate the process retrospectively (not live, in-context), which leaves it prone to memory reconstruction, bias, and imprecision. Quality goes unmonitored because contractors with misaligned incentives optimise for throughput, with no systematic buy-in.
Reasoning traces captured in real time, structured for model training.
Engram captures expert reasoning live and in environment, where existing methods only annotate after the fact. The result is process-level supervision data, formatted for direct use in post-training.
Live expert reasoning and on-screen behavior are captured to generate post-training data with rich training signals.
A recorded expert session is decomposed into structured cognitive events. Each event is timestamped and cross-referenced against screen behavior and behavioral cues such as pauses and revisions.
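As a rough illustration of what one such event record could look like, here is a minimal Python sketch. The field names (`kind`, `t_start`, `t_end`, `transcript`, `screen_ref`) and the example event kinds are hypothetical, chosen for illustration only; they are not Engram's actual schema.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class CognitiveEvent:
    """One structured event from a recorded session (hypothetical schema)."""
    kind: str        # illustrative labels, e.g. "hypothesis", "revision", "pause"
    t_start: float   # seconds from session start
    t_end: float     # seconds from session start
    transcript: str  # what the expert said during the event
    screen_ref: str  # identifier of the screen state at that moment

    @property
    def duration(self) -> float:
        """Length of the event in seconds."""
        return self.t_end - self.t_start


def to_jsonl(events):
    """Serialize events in time order, one JSON object per line."""
    ordered = sorted(events, key=lambda e: e.t_start)
    return "\n".join(json.dumps(asdict(e)) for e in ordered)
```

Sorting by `t_start` before serializing keeps the trace in the order the expert actually worked, whatever order the events were extracted in.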
Codifying wisdom into reasoning trace packages provides three compounding advantages.
Every session ships as a set of turn-key bundles for the three dominant post-training methodologies: supervised fine-tuning (SFT), preference learning, and process reward modelling (PRM). SFT bundles preserve screen context and reasoning structure. Preference pairs come from expert revisions observed in-context during long-horizon reasoning, which makes them more ecologically valid than synthetic or retrospective pairs. PRM signal is derived per cognitive event, giving step-level supervision.
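To make the three bundle types concrete, the sketch below shows one way a single session's steps might fan out into SFT, preference, and PRM records. Everything here is an assumption for illustration: the `Step` structure, the record keys, and the idea that a step-level `label` is already available are hypothetical, not a description of Engram's pipeline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    """A single reasoning step from a session (hypothetical structure)."""
    context: str                        # description of the screen state
    text: str                           # the expert's reasoning at this step
    revised_from: Optional[str] = None  # earlier wording the expert replaced
    label: float = 1.0                  # step-level score, assumed to be given


def build_bundles(steps):
    """Fan one session out into SFT, preference, and PRM records."""
    # SFT: screen context as the prompt, expert reasoning as the completion.
    sft = [{"prompt": s.context, "completion": s.text} for s in steps]
    # Preference: a revision yields a natural pair, with the revised
    # text preferred over the wording it replaced.
    prefs = [
        {"prompt": s.context, "chosen": s.text, "rejected": s.revised_from}
        for s in steps
        if s.revised_from is not None
    ]
    # PRM: one reward label per step, with all preceding steps as the prefix.
    prm = [
        {"prefix": [p.text for p in steps[:i]], "step": s.text, "reward": s.label}
        for i, s in enumerate(steps)
    ]
    return {"sft": sft, "preference": prefs, "prm": prm}
```

In this sketch only steps that carry a revision produce preference pairs, so the number of pairs depends on how often the expert changed course during the session.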
Each cognitive event is paired with the screen state at the moment of the event. Voice and screen provide redundant signal, reducing the need for external annotation. The model learns reasoning grounded in the specific data on screen.
Every cognitive event is timestamped, every pause is measured, every revision is anchored to the screen state that triggered it. Temporal structure surfaces signal that traditional formats discard: pause length marks uncertainty, revision events mark refinement, dwell time marks attention. These mechanism-level markers are invisible to outcome-only data formats.
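As a simple illustration of how such temporal markers could be read off a timestamped trace, the sketch below computes pauses between consecutive events and total dwell time per screen state. The tuple layout and the 2-second pause threshold are arbitrary assumptions for the example, not values taken from Engram.

```python
def temporal_markers(events, pause_threshold=2.0):
    """Derive pause and dwell markers from a timestamped trace.

    events: list of (t_start, t_end, screen_ref) tuples, sorted by t_start.
    pause_threshold: minimum gap in seconds to count as a pause (assumed value).
    Returns (pauses, dwell), where pauses is a list of (pause_start, length)
    and dwell maps each screen_ref to total seconds of activity on it.
    """
    pauses = []
    dwell = {}
    for i, (start, end, screen) in enumerate(events):
        # Dwell time: accumulate event durations per screen state.
        dwell[screen] = dwell.get(screen, 0.0) + (end - start)
        if i > 0:
            prev_end = events[i - 1][1]
            gap = start - prev_end
            # A long gap between events is recorded as a pause.
            if gap >= pause_threshold:
                pauses.append((prev_end, gap))
    return pauses, dwell
```

For example, calling `temporal_markers([(0.0, 4.0, "a"), (4.5, 6.0, "a"), (9.0, 10.0, "b")])` records one pause of 3.0 seconds starting at 6.0, with 5.5 seconds of dwell on `"a"` and 1.0 second on `"b"`.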
Cement your legacy into an intellectual endowment
Existing compensation architecture for AI training data decouples expert contribution from downstream value creation. In the classic gig work model, an expert's judgment compounds indefinitely inside the models it trains, while the expert captures none of that compounding value.
The heuristics refined across years of practice are an intellectual legacy. We're aiming to cement and preserve this.
By participating in Engram, the industry's most capable practitioners have their knowledge preserved in structured form and returned to them with every model that learns from it.
No residual claim.
For customers, pricing is straightforward. The royalty structure is how we compensate experts on our side and doesn't create downstream obligations for your team.
Interested in contributing?
Become an Expert
The data layer powering AI in finance.
Built for any team whose competitive advantage depends on the quality of financial reasoning their models can produce.
Interested in Engram?
We're working with a select group of design partners. Schedule a conversation to learn more.