How This Works

The self-description document. This is what Dot consults when a user asks what she is, what the module is doing, or how any specific part of it functions. It is also the page you are reading. The same text serves both purposes — which is itself a small lesson about the module. Transparency, not translation.


What The Document Workbench Is

The Document Workbench is a supervised sandbox. You pick a document, or upload one of your own, and then you ask questions about it. Dot answers. While she answers, the machinery is visible: the specific passages she was given, the number of tokens sent to the model, and the approximate cost of the call.

It is not a tool that does your work for you. It is a tool that shows you how a certain kind of AI tool works, using documents you can verify against.

Most chat products hide what we show. You upload a 200-page PDF, ask a question, and a confident answer appears. The model did not read 200 pages. It received a handful of passages selected by machinery you never saw. Whether the answer is correct depends as much on the machinery as on the model.

The Document Workbench makes that machinery visible.

What Dot Is

Dot is a research-assistant persona built on top of Claude. The underlying model is Anthropic's Claude Sonnet for most of what she says, and Anthropic's Claude Haiku for a few specific classification tasks that do not need the bigger model.

Dot is not a separate AI. She is an instructed instance of Claude, given a system prompt that tells her what to do, how to observe, when to narrate, and when to decline. If you ask her a question, the answer you get is Claude's answer, shaped by that prompt and by the passages retrieval has provided.

When Dot says something about how AI systems work in general, she is instructed to consult a small curated set of authoritative sources first — either passages from this document and the glossary, or links to vendor documentation. If no authoritative source supports her claim, she is instructed to flag it as "based on general training" rather than present it with the same confidence as a cited fact. This is not a guarantee she will never get it wrong. It is a design choice to make her uncertainty visible when it exists.

What Happens When You Pick A Document

Before you ask anything, some work has already happened. The first time any document appears on this workbench — whether it is one we prepared or one you just uploaded — the system pre-translates it.

The document gets split into passages of roughly one page each. Every passage is sent to a translation service that converts it into a standardized summary of its meaning. Those summaries are stored alongside the passages and are what the computer actually searches against later.

For the documents we prepared in advance, this happens once, at the time we add the document to the site. You never wait for it. For a document you upload, the pre-translation happens while the upload progress bar is finishing — typically a few seconds, sometimes longer for very large files.

The specifics of what "translation" and "standardized summary" mean are the subject of the next section. For now: the computer cannot read your document the way you do, so it keeps a version it can read. That version is what it searches when you start asking questions.

What Happens When You Ask A Question

The short version, for people who do not want to read the next six paragraphs: your question gets translated into the same standardized form as the document's passages, so the computer can compare them to each other. The closest matches get sent to Claude along with your question. Claude answers. You see both the answer and the passages it was given.

That is the whole mechanism. The rest of this section explains the translation part, because that is where most people's intuitions about AI get things wrong.

The translation

A computer cannot directly compare the meaning of two sentences. It can compare numbers very fast, but "what the case held about minimum contacts" and "the central holding on personal jurisdiction" are not numbers — they are phrases that mean something similar without looking much alike.

So we translate. Your question gets sent to a service called Voyage AI, which runs a model called voyage-law-2 that does one thing: it takes a piece of text and returns a list of 1,024 numbers.

Think of a nutritional label. Every label has the same fields in the same order: calories, fat, carbohydrates, protein, sodium, and so on. Any two products can be compared directly, field by field, because the label is standardized. The label is not the food, but it's a structured summary that makes any two foods comparable to each other using simple arithmetic.

voyage-law-2 writes standardized labels for text. Every label has exactly 1,024 fields, always in the same order, each measuring some aspect of the text's meaning. We do not know what each of those 1,024 fields is — the model figured that out on its own during training, and the fields are abstract. But because every label measures the same things in the same order, we can compare any two labels to each other by doing simple math.

When you type a question, we generate a label for it. We already generated labels for every passage in the document during pre-translation. We compare your question's label to every passage's label, and pull back the ten closest matches.

Why this matters

The comparison is based on meaning, not on keywords. If you ask about "what this court said about jurisdiction" and a passage uses the phrase "holding on minimum contacts," those will match even though they share almost no words, because voyage-law-2 has learned that those ideas live near each other in lawyer-speak. This is the technique's main advantage over plain keyword search.

It also has limits. The translation is imperfect. Sometimes the labels disagree with human intuition about what's relevant. This is why the right-hand panel of the workbench shows you the passages retrieval actually selected — so you can see, in any given case, whether the machine got it right.

And then

The matching passages, a short summary of the document, the last several turns of your conversation, and Dot's system prompt all get assembled into a single request. That request goes to Claude Sonnet at Anthropic. Claude's answer comes back. Citations are matched to specific passages. If Dot wants to step back and observe something — leading question, synthesis overreach, retrieval gap — a separate callout is prepared.

The answer, the observer callout if any, the passages Claude was given, and the approximate cost of the whole round-trip all appear in the interface.

If Dot's answer looks wrong, the first place to check is the passages panel. Often the retrieval was the problem, not the model.

The Machinery, Named

For readers who want the specifics. If you skipped the last section, skip this one too.

  • Embedding model: voyage-law-2. This is the "label writer" from the section above. Each label has 1,024 fields. We chose it because it was trained specifically on legal text and measurably outperforms general-purpose alternatives on legal retrieval, while still performing competitively on non-legal material. The module's corpus is mixed, so a legal-tuned model that handles non-legal text well is the right default.
  • Chunk size: documents are split into passages of roughly 564 tokens mean (NIST AI RMF 1.0 baseline) with some overlap between adjacent passages. Overlap exists because a thought that starts at the end of one passage and finishes at the start of the next would otherwise be split across two recipe cards, and neither would capture the whole thought.
  • How many passages Dot sees: five, drawn from a top-ten retrieval. The top five become Dot's context for synthesis; ranks six through ten are visible in the right panel as retrieved-but-cut, so you can see what was retrieved but didn't make it into Dot's context. This split is the workbench's commitment to making the retrieval seam visible: Dot reads five, you see all ten, and the gap between them is itself information.
  • The model that actually writes answers: Claude Sonnet 4.6 by default. Claude Haiku 4.5 is used for two specialized tasks — matching a question to a concept in FOLIO, the legal taxonomy this module uses, and sorting documents by category — and for synthesizing the foundations panel's curated meta-knowledge response when one fires. Haiku is cheaper and faster than Sonnet, appropriate for tasks that benefit from a deterministic model and do not need Sonnet's full capability.
  • How creative the model is allowed to be: on classification tasks, not at all — the setting is called "temperature" and we set it to zero, which makes the model pick the most probable answer every time. On ordinary question-answering, a small amount of variation is appropriate and we leave temperature at a non-zero default. Zero-temperature on classification is specifically because the lessons that use classification ask users to change one variable at a time; a model that randomized its own output would make that impossible to learn from.

Cost

Every call costs money. Not much per call — a typical question on a medium-sized document costs roughly one and a half cents. A question asked in full-context mode on a large document can cost fifteen cents or more. The module displays the cost of each call as it happens.

The module has a daily spending cap. When the cap is reached, the AI features pause until the next day. The static reading material on the module, and on the rest of Context Hitch, remains available. This is a deliberate design choice, not a bug. The module teaches that AI has a real price tag; it also lives that principle by disclosing its own. The cap is not a surprise — it is the honest answer to "what happens when this gets popular."

What Is And Is Not Logged

Logged: anonymous session tokens, query counts per session, approximate per-session cost, which documents were selected, whether the observer voice fired, whether Dot declined to answer, analytics events useful for understanding how the module is being used.

Not logged: the content of your questions, the text of any document you upload, the content of Dot's answers, anything that could identify you personally beyond your IP address at the transport layer.

When you upload a document, the document and its vector embeddings live in the system's database for the duration of your session and up to 24 hours after your last activity, whichever is shorter. After that, an automated job deletes them. There is no "recover my document" button because there is no stored document to recover.

Do not upload confidential information. Not because we are looking at it — we are not — but because this is a demonstration tool running on shared infrastructure, and the disclaimers we put in front of every upload page exist for a reason. If you need to do serious work with an AI on a confidential document, use an API endpoint with the appropriate contractual protections, or a tool your employer has evaluated and approved.

Sessions Are Ephemeral By Design

Every visit starts a new session. No accounts. No login. No persistent preferences. No history of previous visits.

This includes Dot's observer mode. On every new session, Dot begins in "new learner" mode and narrates heavily at first, trailing off as the session goes on. If you return tomorrow, she starts over. You will not experience this as Dot remembering you, because she does not. This is intentional: we teach that AI does not have persistent memory across sessions unless something is specifically storing it, and this module lives by that rule.

If this feels a little repetitive on your second visit, that is the repetition teaching the point.

When Dot Declines To Answer

Sometimes retrieval does not find anything useful. The question might be outside what the document covers, or phrased in a way the embedding model did not connect to the right passages. In those cases, Dot is instructed to say so — "the retrieved passages don't support an answer to this; try rephrasing, or switch to full-context mode" — rather than produce a plausible answer from training memory.

A production AI tool that silently makes things up when retrieval fails is failing in a specific way this module is designed to expose. When Dot declines, that is the product working correctly.

If Something Is Wrong

If Dot says something that contradicts what you know to be true about a document on this workbench, you are probably right and she is probably wrong. The curated documents on this site were chosen in part because users are likely to bring prior knowledge to them — that is how the audit-your-expertise lesson works. Check the passages she was given (right panel), check the source document (left panel), and trust your own reading before you trust hers.

If something on the module is genuinely broken — pages not loading, responses timing out, cost counters displaying nonsense numbers — use the Contact Us form. Include the approximate time and what you were trying to do. Ford will investigate.


Dot, stepping back: Most tools do not tell you this much about themselves. There is a reason for that, and it is not usually "we wanted to keep it simple." If an AI tool you are paying for cannot tell you what embedding model it uses, how its retrieval works, what is and isn't logged, and what happens when it can't find an answer — that is the information, not the omission.