Technical

How Donna cites its work

Donna

The Donna team

17 June 2026 · 8 min read

The most important interface decision in Donna is also the least visible: nothing Donna says about your documents arrives without a way to check it in one click. This is how that promise is engineered, and why we treat citations as a systems problem rather than a formatting one.

The trust gap is the real product gap

Every survey of legal AI adoption finds the same blocker, and it is not capability. It is the gap between how confident a model sounds and how correct it is. A lawyer who has to re-derive an answer from scratch to trust it has saved nothing. So the question we engineered around was never “how good can the answer be” but “how cheap can checking it be.”

Seconds, not minutes
Checking a claim means clicking it and looking at a highlighted region. Not re-reading a document.

Four stages between a sentence and a highlight

When Donna answers a question about a matter, each claim in that answer runs a pipeline of its own before you ever see a citation marker.

Claima sentence in the answerGroundmatch to source passageLocateregion on the real pageVerifysecond model must agreeShown as a citationclick opens the highlighted regionfails: fall back to the page, or go back around
The citation pipeline
A claim is grounded to its source passage, located as pixel coordinates on the original page, and verified by a second pass before it is shown. Anything that fails degrades honestly.

Ground. During ingestion every document is split into passages that remember where they came from: the page, and the position of every token on it. When a claim is made, we match it back to the passages that support it. No support, no claim: the answer is rewritten rather than decorated with a hopeful footnote.

Locate. A citation that opens page one of a ninety-page contract is technically true and practically useless. We resolve the supporting passage to a rectangle on the original page image using the token geometry captured at ingestion. Scanned documents, where the text layer is reconstructed rather than native, go through the same resolution against the recognised layout.

Verify. Locations are proposed by one pass and checked by another: a separate model call is shown the claim and the candidate region, and must independently agree that the region supports the claim. A fixed confidence score is worthless here, so the verifying pass earns its number instead of asserting it. Regions that fail are re-resolved, often because the same clause appears on multiple pages and the first candidate was the wrong occurrence.

Degrade honestly. Sometimes a precise region cannot be established: a low-quality scan, a table the layout model mangled. Donna then falls back to a page-level frame rather than pretending precision it does not have. A citation that says “this page” truthfully beats one that highlights the wrong sentence confidently.

Citations survive the conversation

A citation in Donna is a durable reference, not a chat artifact. Click one from a conversation last month and it opens the same document at the same region, with the document viewer scrolled and the highlight drawn. References survive document versioning because they bind to the version they cited: when a contract is superseded, the citation still opens what was actually relied on, clearly marked as a prior version.

What this buys the people who share the matter

  • For lawyers: review at the speed of reading. The claim and its evidence sit one click apart, so verifying AI output stops being a parallel research project.
  • For clients: answers that show their working. A client who clicks a citation and lands on the clause builds the kind of trust no disclaimer paragraph ever earned.
  • For the firm: an audit trail. Every answer is reconstructable evidence of what was relied on, when.

None of this makes a model infallible. It makes a model checkable, which is the property legal work actually requires. The rest of our reliability story, the verification loop inside agent runs, builds on exactly this foundation: an agent that must cite its work is an agent that can be caught, and an agent that can be caught is one you can delegate to.

Reading about it is one thing. Working in it is another.

60 seconds. No credit card.