Research-grade data
for AI in open-ended domains

Multimodal expert reasoning traces for post-training and evaluation, starting with finance.



HOW IT WORKS

We capture tacit knowledge and reasoning for open-ended tasks that rely on human judgement.

1. Expert records a session

A senior analyst works through a real financial problem, reasoning aloud while their screen is captured.

2. Engram decomposes the reasoning

Cognitive events are detected, timestamped, and cross-referenced against screen behavior.

3. Your models learn how domain experts reason

Training-ready JSONL with temporal signals, screen context, and multi-scale reasoning chains.
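As a rough illustration of what one line of such a JSONL file could contain, the sketch below builds two cognitive-event records and serialises them. Every field name here (`event_type`, `t_start`, `pause_before_s`, `screen_ref`, `parent_id`) is an assumption made for this example, not a published Engram schema, and the sample text is invented.

```python
import json

# Hypothetical cognitive-event record. All field names are illustrative
# assumptions, not a documented Engram schema.
def make_event(event_id, event_type, t_start, t_end, text, screen_ref,
               pause_before_s=0.0, parent_id=None):
    return {
        "event_id": event_id,
        "event_type": event_type,          # e.g. "hypothesis", "revision"
        "t_start": t_start,                # seconds from session start
        "t_end": t_end,
        "pause_before_s": pause_before_s,  # temporal signal
        "text": text,                      # transcribed reasoning
        "screen_ref": screen_ref,          # pointer to the captured frame
        "parent_id": parent_id,            # links steps into longer chains
    }

def to_jsonl(events):
    """Serialise one event per line."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in events)

def from_jsonl(blob):
    """Parse a JSONL string back into a list of event dicts."""
    return [json.loads(line) for line in blob.splitlines() if line.strip()]

events = [
    make_event("e1", "hypothesis", 12.0, 19.5,
               "Margins look inflated by a one-off.", "frame_0012.png"),
    make_event("e2", "revision", 24.0, 31.0,
               "The charge is recurring; lower the margin.", "frame_0024.png",
               pause_before_s=4.5, parent_id="e1"),
]
blob = to_jsonl(events)
```

The `parent_id` link is one possible way to express reasoning chains at more than one scale: a consumer could follow it to rebuild a chain from individual steps.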

The problem

Building AI for open-ended domains requires process-level training data

Existing model training methods, such as reinforcement learning with verifiable rewards (RLVR), are bottlenecked in domains without a clear reward signal.

Reinforcement learning continues to drive progress in domains where correctness can be automatically verified: coding, maths, structured extraction. But non-verifiable, open-ended domains, where the bulk of economically valuable work sits (finance, law, medicine, biology, cybersecurity), lack an automatic reward signal, leaving existing data unsuitable for direct model training.

Capability gains in these domains require new data collection formats: ones that capture expert reasoning and intent at the process level, codifying elements like judgement, deliberation, backtracking, and hypothesis construction. Data in this form can be used to construct clear reward signals, which makes it usable for model training.

The existing supply chain is not optimized for signal detection in open-ended domains.

Synthetic data cannot generate the out-of-distribution signal needed to adjust and improve existing capabilities.
Curation raises the signal of existing datasets, but fails to capture the process-level insight underlying outcome-level data.
In data-labelling shops that aim to capture process-level data, the process is annotated retrospectively rather than captured live and in context, so it is prone to memory reconstruction, bias, and imprecision. Quality goes unmonitored because contractors with misaligned incentives optimise for throughput, with no systematic buy-in.

WHAT ENGRAM PROVIDES

Reasoning traces captured in real time, structured for model training.

Engram captures expert reasoning live and in the working environment, whereas existing methods only annotate after the fact. The result is process-level supervision data, formatted for direct use in post-training.

Existing data formats

prompt: Evaluate this CIM and recommend a deal range.

chosen: Recommend 6.0x to 6.5x EBITDA, contingent on margin verification.

rejected: Pass; multiple looks too high relative to comparable transactions in the sector. Hold off until the margin trajectory has clearer visibility from the next two quarters of operating data.

rationale: I lowered the margin assumption after note 14 of the prior 10-K showed the restructuring charge is recurring, not one-time. The rejected response held the higher margin and missed the deferred operating leverage implicit in the distributor-channel mix shift.

What Engram captures

THE TECHNOLOGY

Live expert reasoning and on-screen behavior are captured to generate post-training data with rich training signals

A recorded expert session is decomposed into structured cognitive events, each timestamped and cross-referenced against screen behavior and behavioral cues such as pauses and revisions.

Deterministic · Multimodal · Reproducible

Expert Session: voice + screen captured concurrently

Event detection, frame extraction, temporal enrichment

Cognitive Event Mapping: structured event data with temporal signals

SFT Bundles: supervised fine-tuning
Preference Pairs: DPO / RLHF
Process Labels: PRM training
Rubric Kits: LLM-judge calibration

Codifying wisdom into reasoning trace packages provides three compounding advantages.

Three post-training-ready outputs

Every session ships as turn-key bundles for the three dominant post-training methodologies: supervised fine-tuning, preference learning, and process reward modelling. SFT bundles preserve screen context and reasoning structure. Preference pairs come from expert revisions observed in-context during long-horizon reasoning, more ecologically valid than synthetic or retrospective pairs. PRM signal is derived per cognitive event, giving step-level supervision.
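To make the idea of revision-derived preference pairs and per-event process labels concrete, here is a minimal sketch. It assumes a hypothetical event format in which a revision carries a `revises` field naming the event it replaces, and it uses a deliberately simple labelling rule (a step that was later revised gets label 0, any other step gets 1). Both the format and the rule are assumptions for illustration, not Engram's actual method.

```python
def preference_pairs(events, prompt):
    """Build (chosen, rejected) pairs from revision events.

    The revised statement is treated as 'chosen' and the statement it
    replaced as 'rejected'. The 'revises' field is a hypothetical name.
    """
    by_id = {e["event_id"]: e for e in events}
    pairs = []
    for e in events:
        if e["event_type"] != "revision":
            continue
        original = by_id.get(e.get("revises"))
        if original is None:
            continue  # revision with no recorded target: skip it
        pairs.append({
            "prompt": prompt,
            "chosen": e["text"],
            "rejected": original["text"],
        })
    return pairs

def step_labels(events):
    """One label per event: 0 if the step was later revised, else 1."""
    revised = {e["revises"] for e in events if e.get("revises")}
    return [
        {"event_id": e["event_id"],
         "label": 0 if e["event_id"] in revised else 1}
        for e in events
    ]

# Invented sample session for illustration only.
session = [
    {"event_id": "e1", "event_type": "assumption",
     "text": "Hold the margin at the higher level."},
    {"event_id": "e2", "event_type": "revision", "revises": "e1",
     "text": "Lower the margin: the restructuring charge is recurring."},
    {"event_id": "e3", "event_type": "conclusion",
     "text": "Recommend 6.0x to 6.5x EBITDA."},
]
pairs = preference_pairs(session, "Evaluate this CIM and recommend a deal range.")
labels = step_labels(session)
```

A real pipeline would likely need a richer labelling scheme than a binary revised-or-not flag; the point is only that both outputs can be read off the same event stream.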

Multimodal grounding

Each cognitive event is paired with the screen state at the moment of the event. Voice and screen provide redundant signal, reducing the need for external annotation. The model learns reasoning grounded in the specific data on screen.
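One simple way to pair an event with the screen state at that moment is to look up the latest frame captured at or before the event's start time. The sketch below assumes frame capture times are available as a sorted list of seconds; the function names and the `t_start` and `frame_index` fields are hypothetical.

```python
from bisect import bisect_right

def frame_at(frame_times, t):
    """Index of the latest frame captured at or before time t.

    frame_times must be sorted ascending. If t precedes every frame,
    the first frame (index 0) is returned.
    """
    i = bisect_right(frame_times, t) - 1
    return max(i, 0)

def attach_frames(events, frame_times):
    """Return copies of the events with a 'frame_index' field added."""
    return [
        dict(e, frame_index=frame_at(frame_times, e["t_start"]))
        for e in events
    ]

# Invented sample data: one frame every five seconds.
frame_times = [0.0, 5.0, 10.0, 15.0]
events = [
    {"event_id": "e1", "t_start": 12.0},
    {"event_id": "e2", "t_start": 5.0},
]
grounded = attach_frames(events, frame_times)
```

Binary search keeps the lookup logarithmic in the number of frames, which matters for long sessions with dense capture.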

Temporal precision

Every cognitive event is timestamped, every pause is measured, every revision is anchored to the screen state that triggered it. Temporal structure surfaces signal that traditional formats discard: pause length marks uncertainty, revision events mark refinement, dwell time marks attention. These mechanism-level markers are invisible to outcome-only data formats.
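The temporal signals described above can be sketched with two small helpers: one measures the pause before each event, the other totals dwell time per screen region. The event fields (`t_start`, `t_end`) and the `(region, start, end)` interval format are assumptions for this example, and the sample values are invented.

```python
def pauses(events):
    """Silence before each event, in seconds.

    Events are sorted by start time; the pause is the gap between the
    previous event's end and this event's start, floored at zero.
    """
    ordered = sorted(events, key=lambda e: e["t_start"])
    return [
        (nxt["event_id"], max(0.0, nxt["t_start"] - prev["t_end"]))
        for prev, nxt in zip(ordered, ordered[1:])
    ]

def dwell_time(screen_intervals):
    """Total seconds spent on each region, from (region, start, end) tuples."""
    totals = {}
    for region, start, end in screen_intervals:
        totals[region] = totals.get(region, 0.0) + (end - start)
    return totals

# Invented sample data for illustration only.
events = [
    {"event_id": "e1", "t_start": 12.0, "t_end": 19.5},
    {"event_id": "e2", "t_start": 24.0, "t_end": 31.0},
]
intervals = [
    ("10-K note 14", 0, 8),
    ("model", 8, 20),
    ("10-K note 14", 20, 25),
]
pause_list = pauses(events)
dwell = dwell_time(intervals)
```

How a pause or a long dwell should be interpreted (uncertainty, attention) is a modelling decision that sits on top of these raw measurements.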

HOW EXPERTS EARN

Cement your legacy into an intellectual endowment

Existing compensation architecture for AI training data decouples expert contribution from downstream value creation. In the classic gig work model, an expert's judgment compounds indefinitely inside the models it trains, while the expert captures none of that compounding value.

The heuristics refined across years of practice are an intellectual legacy. We're aiming to cement and preserve this.

Through Engram, the knowledge of the industry's most capable practitioners is preserved in structured form and returned to them with every model that learns from it.

Incumbent model

$150/hr

One-time extraction.
No residual claim.

Compensation is indexed to time, not to the epistemic value of the contribution.

Engram model

Royalty

Income compounds with demand.

Compensation is indexed to downstream usage: receive a royalty each time your reasoning is licensed. The income stream scales with demand, aligning financial incentives with the long-term value of your intellectual contribution.

For customers, pricing is straightforward. The royalty structure is how we compensate experts on our side and doesn't create downstream obligations for your team.

Interested in contributing?

Become an Expert

WHO IT'S FOR

The data layer powering AI in finance.

Built for any team whose competitive advantage depends on the quality of financial reasoning their models can produce.

Financial intelligence platforms

Your agents retrieve documents. Engram teaches them how to reason through what they find: navigating ambiguity, weighing conflicting signals, and building conviction across multi-step workflows.

PE firms and hedge funds

Your best investors' judgment compounds over decades, but your models train on documents, not on the reasoning that produced them. Engram captures how your senior deal team actually evaluates a CIM, stress-tests a thesis, or revises a model, and structures it as proprietary training data.

Strategy and advisory firms

When a senior partner leaves, their pattern recognition leaves with them. Engram captures that reasoning in structured form while it's still accessible, turning institutional expertise into a durable, trainable asset.

TEAM

Our team and advisors come from

GET IN TOUCH

Interested in Engram?

We're working with a select group of design partners. Schedule a conversation to learn more.

