Grounded Simulation: faithful synthetic users

LLM “synthetic users” produce fluent interviews that teams find easy to distrust. Grounded Simulation is a first-principles architecture for keeping them faithful and auditable.

The fluency trap

Ask a large language model to role-play a user and it will happily talk for hours. The transcripts read well. That is exactly the problem. Fluent does not mean faithful, and a synthetic interview that sounds right is more dangerous than one that obviously fails, because someone will ship a decision on the back of it.

Three failure modes show up again and again. The simulation is self-fulfilling: prime the model with a hypothesis and it obligingly confirms it. It is one-dimensional: a flattened caricature of a person rather than a contradictory, situated one. And it is undefendable: when a stakeholder asks “how do you know?”, there is no chain of evidence to point at. So teams either distrust the output or, worse, trust it anyway.

The thesis: ground the simulation

Grounded Simulation is the architecture I have been building and writing up to address this. The premise is simple to state and hard to do well: a synthetic study should be grounded in behavioral science and cognitive models, not in the vibes of a clever prompt. The model is one component inside a method that constrains it, never the method by itself.

The goal isn’t a model that sounds like a user. It’s a study you can audit in thirty minutes.

Four mechanisms that keep it honest

In practice (and this is the architecture behind the research platform I work on at Articos), faithfulness comes from structure, not from a better persona prompt:

Hypothesis-blind persona generation. Personas are generated without exposure to the question under test, which removes the most common path to a self-fulfilling answer.
Protocols grounded in the literature. Study design draws on cognitive models and a corpus of peer-reviewed work rather than ad-hoc instructions, so the simulated behavior has a reason to resemble the real thing.
Evidence-chained themes. Every theme links back to the quotes, questions, and hypotheses that produced it, including how many times it was refuted, not just supported.
Confidence, scored not asserted. Findings carry a confidence signal so a reader can weight them, and an audit layer makes the whole derivation inspectable.
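
As a rough illustration of the last two mechanisms, here is a minimal Python sketch of a theme that keeps its evidence chain and derives a confidence score from it. This is not the Articos implementation: the names Evidence, Theme, confidence and audit_trail are hypothetical, the sample quotes are invented for the example, and the smoothed support ratio is only one assumed way to turn support and refutation counts into a score.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Evidence:
    """One quote from a simulated interview (hypothetical structure)."""
    quote: str
    question_id: str
    hypothesis_id: str
    supports: bool  # False means the quote refutes the hypothesis


@dataclass
class Theme:
    """A theme that keeps the chain back to every quote behind it."""
    name: str
    evidence: List[Evidence] = field(default_factory=list)

    @property
    def supported(self) -> int:
        return sum(1 for e in self.evidence if e.supports)

    @property
    def refuted(self) -> int:
        return sum(1 for e in self.evidence if not e.supports)

    def confidence(self) -> float:
        # Assumed scoring rule: Laplace-smoothed share of supporting evidence.
        # A theme with no evidence scores 0.5, not 0 or 1.
        return (self.supported + 1) / (len(self.evidence) + 2)

    def audit_trail(self) -> List[str]:
        # One line per quote, so a reviewer can trace the theme to its sources.
        return [
            f"[{'supports' if e.supports else 'refutes'}] "
            f"{e.hypothesis_id}/{e.question_id}: {e.quote}"
            for e in self.evidence
        ]


# Invented sample data, for illustration only.
theme = Theme("Onboarding friction")
theme.evidence.append(Evidence("I gave up at the import step.", "Q3", "H1", True))
theme.evidence.append(Evidence("Setup took me most of an afternoon.", "Q3", "H1", True))
theme.evidence.append(Evidence("I needed a colleague to finish it.", "Q5", "H1", True))
theme.evidence.append(Evidence("Honestly, setup was fine for me.", "Q3", "H1", False))

print(theme.name, theme.supported, theme.refuted, round(theme.confidence(), 2))
for line in theme.audit_trail():
    print(line)
```

The design choice worth noting is that the refuting quote stays attached to the theme and lowers its score, so the count of refutations is visible to anyone reading the audit trail.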

Why auditability is the point

Most debates about synthetic users argue about accuracy in the abstract. I think that is the wrong frame. The useful question is whether a team can defend a decision made with the research: to a skeptical PM, to a leadership review, to themselves in six months. Auditability is what makes that possible. It is also what separates a research instrument from a confident-sounding chatbot.

This connects directly to how I think the rest of AI product work should go: the system produces a strong, inspectable starting point, and a human stays in the loop to judge it. Magical, but accountable.

Status & reading

The formal write-up, Grounded Simulation: A First-Principles Architecture for LLM-Based Synthetic UX Research (SSRN, abstract ID 6503241), is in the process of being published. I’ll link the live paper here once it is indexed; until then, treat this essay as the plain-language version of the argument. The paper sits alongside my other work on my research page, and you can see Grounded Simulation applied in the Articos case study.
