Research protocols

Protocol, instrumentation,
and evidence.

The workspace is treated as a research instrument. Each study begins with a question, a defined state model, and a small set of signals that can be reviewed across sessions. Cohorts in the Free Compute Program run under these protocols.

Protocol

A four-stage loop
for evaluating change.

01

Instrument

Baseline mastery, misconception state, curriculum context, and learner permissions are captured before any intervention.

02

Intervene

Learners move through the wrapper while tutoring mode, routing, and support patterns stay legible on every turn.

03

Observe

The system records user-scoped histories, evaluation traces, and notebook signals tied to the hypothesis.

04

Revise

The protocol changes only after evidence review, not because a flow merely feels smoother.

Evaluation

Decisions depend on
pre-declared signals.

The same matrix is reviewed across programs so results stay interpretable from one cycle to the next.

Tutor behavior

Signal: Correction quality, retention, transfer
Cadence: Per turn · weekly review
Threshold: Sustained gain across cohorts

Routing quality

Signal: Tier choice vs. task difficulty and energy
Cadence: Per request
Threshold: Smallest capable model selected
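
The routing threshold above can be illustrated with a minimal sketch. It assumes each model tier has a known capability ceiling and a relative energy cost. The tier names, the numbers, and the functions `route` and `is_smallest_capable` are hypothetical and are not taken from the system itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    capability: int     # highest task difficulty this tier is assumed to handle
    energy_cost: float  # relative energy per request, in illustrative units


# Hypothetical tiers; real names, scores, and costs would come from the deployment.
TIERS = [
    Tier("small", capability=3, energy_cost=1.0),
    Tier("medium", capability=6, energy_cost=4.0),
    Tier("large", capability=10, energy_cost=15.0),
]


def route(difficulty: int, tiers=TIERS) -> Tier:
    """Return the lowest-energy tier whose capability covers the task."""
    capable = [t for t in tiers if t.capability >= difficulty]
    if not capable:
        raise ValueError(f"no tier can handle difficulty {difficulty}")
    return min(capable, key=lambda t: t.energy_cost)


def is_smallest_capable(chosen: Tier, difficulty: int, tiers=TIERS) -> bool:
    """Per-request check: did the router pick the smallest capable tier?"""
    return chosen == route(difficulty, tiers)
```

Under these assumptions, a per-request review would call `is_smallest_capable` with the tier that was actually used and count the requests where it returns `False`.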

Curriculum stability

Signal: Graph and lesson consistency across return sessions
Cadence: On generation and replay
Threshold: No drift for unchanged inputs
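
One way to check "no drift for unchanged inputs" is to fingerprint the inputs and the generated curriculum, then compare fingerprints on replay. The sketch below assumes both are JSON-serializable. The helper names `fingerprint` and `has_drifted`, and the in-memory `baseline` dictionary, are hypothetical.

```python
import hashlib
import json


def fingerprint(obj) -> str:
    """Stable SHA-256 digest of a JSON-serializable object."""
    # Sorting keys makes the digest independent of dictionary ordering.
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_drifted(inputs, generated, baseline: dict) -> bool:
    """Record the first output seen per input; flag any later mismatch."""
    key = fingerprint(inputs)
    digest = fingerprint(generated)
    # setdefault stores the digest on first sight and returns the stored one after.
    previous = baseline.setdefault(key, digest)
    return previous != digest
```

On first generation the output is stored as the baseline. A replay with the same inputs should return `False`, and any `True` marks drift for review.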

Pedagogical fit

Signal: Clarity · correctness · pedagogical-fit scores
Cadence: Per evaluation trace
Threshold: Above review threshold

Before a study runs, it declares

01

Hypothesis

One question the study is built to answer, written before any data is collected.

02

State model

The mastery, misconceptions, and permissions captured for each learner up front.

03

Signals

The small set of measurements that will count as evidence, and nothing else.

04

Threshold

The result that would actually change the protocol.

05

Escalation

The point where a low-confidence turn leaves the model and goes to a human.
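
The five declarations above could be captured as a single record that is checked before a study starts. This is a minimal sketch, not the system's actual schema: the class `StudyDeclaration`, its field names, and the methods `missing_fields` and `admits` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudyDeclaration:
    hypothesis: str              # the one question the study answers
    state_model: tuple[str, ...] # learner state captured up front
    signals: tuple[str, ...]     # the only measurements that count as evidence
    threshold: str               # the result that would change the protocol
    escalation: str              # when a turn goes to a human

    def missing_fields(self) -> list[str]:
        """Names of declarations left empty; a study should not run with any."""
        return [name for name, value in vars(self).items() if not value]

    def admits(self, signal: str) -> bool:
        """True only for signals declared before the study ran."""
        return signal in self.signals
```

Freezing the record reflects the idea that declarations are written before any data is collected and are not edited afterwards. `admits` gives a simple way to reject measurements that were never declared.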

Principles

A quiet system, so researchers
can read what matters.

01

State is explicit, not incidental.

02

Every measurement is tied to a hypothesis.

03

Access is constrained until the surface is trustworthy.

04

Every public claim maps to an observable workflow.

05

Public demos are labelled simulated, preview, or experimental.

06

Low-confidence turns escalate to a human, and guardrails gate the sensitive ones.
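
The last principle can be sketched as a small dispatch rule. The confidence floor of 0.6, the `Turn` fields, and the choice to apply the guardrail before the confidence check are all assumptions for illustration; the source does not specify them.

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.6  # assumed value; a real study would declare its own


@dataclass(frozen=True)
class Turn:
    text: str
    confidence: float       # model's confidence in its reply, from 0.0 to 1.0
    sensitive: bool = False  # set by some upstream classifier (hypothetical)


def dispose(turn: Turn, floor: float = CONFIDENCE_FLOOR) -> str:
    """Decide where a turn goes: 'guardrail', 'human', or 'model'."""
    if turn.sensitive:
        # Sensitive turns are gated regardless of confidence (assumed ordering).
        return "guardrail"
    if turn.confidence < floor:
        # Low-confidence turns leave the model and go to a human.
        return "human"
    return "model"
```

Returning a label instead of acting directly keeps the decision easy to log on every turn, which fits the principle that state is explicit.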