AI as a Design Material

AI is a material with a grain. It is probabilistic, generative, sometimes wrong. You design with it the way you design with any material: by understanding its properties and building the right harnesses around it.

A material, not a feature

Most teams treat AI as a feature to bolt on: a button labelled “generate”, wired to a model, shipped behind a sparkle icon. That framing is where the trouble starts. A feature is supposed to be deterministic and predictable. AI is neither. It is a material, and like any material it has a grain.

The grain of this one is that it is probabilistic and generative, and it is sometimes confidently wrong. You do not fix that by wishing it away. You design with it the way a furniture-maker designs with wood or a structural engineer designs with steel. You understand the properties of the thing in your hands and build the right structure around them. For AI, that structure is a set of harnesses: guardrails, evaluation, model routing, and a human in the loop.

Magical, but accountable

The principle I keep returning to across every AI product I work on is short enough to put on a wall: the AI generates a strong starting point, and the human always approves. The result should feel magical to the person using it and stay accountable to the person responsible for it. Those two goals are usually framed as a trade-off. Designed well, they are the same goal.

The model is allowed to be wrong. The product is not. The harness is what holds that line.

Magical without accountable is a demo: impressive once, untrustworthy at scale. Accountable without magical is a form with extra steps. The work is holding both at the same time: a generative starting point that removes effort, and a checkpoint that keeps a human in command of the outcome.

What it looks like in shipped work

This is not theory I admire from a distance. At AutoLeap, a B2B platform for auto-repair shops, two pieces of work show the pattern. The first is the AI Receptionist (AIR), which handles inbound calls a shop would otherwise miss. The second is Magic RO, AI-assisted repair-order generation that removes the blank page. It turns a conversation into a structured estimate in seconds.

The point of Magic RO is not that the model writes the estimate. The point is what surrounds the model. The service advisor always reviews and approves the output before anything is committed. The AI does the heavy lifting from cold start to draft; the human applies judgment, price, and accountability. That is the human-in-the-loop principle in production, not on a slide.
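The review-then-commit gate described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not AutoLeap's implementation: `RepairOrderDraft`, `generate_draft`, `approve`, and `commit` are hypothetical names, and the model call is replaced by a stub. The property it shows is that the commit path refuses anything a human has not signed off.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    DRAFT = "draft"
    APPROVED = "approved"


@dataclass
class LineItem:
    description: str
    price: float


@dataclass
class RepairOrderDraft:
    # Hypothetical shape for illustration; not a real data model.
    items: List[LineItem]
    status: Status = Status.DRAFT
    approved_by: Optional[str] = None


def generate_draft(transcript: str) -> RepairOrderDraft:
    """Stand-in for the model call: it only ever returns a draft."""
    # A real system would call a model here; this stub returns a fixed item.
    return RepairOrderDraft(items=[LineItem("Brake pad replacement", 0.0)])


def approve(draft: RepairOrderDraft, advisor: str,
            prices: Dict[str, float]) -> RepairOrderDraft:
    """The human applies judgment and price, then signs off."""
    for item in draft.items:
        item.price = prices.get(item.description, item.price)
    draft.status = Status.APPROVED
    draft.approved_by = advisor
    return draft


def commit(draft: RepairOrderDraft, ledger: List[RepairOrderDraft]) -> None:
    """Refuse to commit anything a human has not approved."""
    if draft.status is not Status.APPROVED or draft.approved_by is None:
        raise PermissionError("draft must be approved before commit")
    ledger.append(draft)
```

The design choice worth noticing is that approval is enforced at the commit boundary rather than in the interface, so a draft cannot slip through even if a screen forgets to ask.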

At Articos, an AI user-research platform, the same instinct is built deeper into the architecture. The research engine runs on cross-model routing: the right model for each step rather than one model for everything. That routing is wrapped in guardrails that constrain what the engine can do, an eval harness that measures whether it is doing it well, and audit journals that record how each output was derived. Those four pieces are the harness made literal.
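To make the four pieces concrete, here is a small Python sketch of how routing, a guardrail, an audit journal, and an eval loop might fit together. Everything in it is an assumption for illustration: the two "models" are stub functions, the banned-phrase guardrail is deliberately simplistic, and none of the names reflect how Articos is actually built.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for real model clients.
def fast_model(prompt: str) -> str:
    return f"summary: {prompt[:40]}"


def careful_model(prompt: str) -> str:
    return f"analysis: {prompt[:40]}"


# Routing: each step is assigned the model suited to it.
ROUTES: Dict[str, Callable[[str], str]] = {
    "summarise": fast_model,
    "analyse": careful_model,
}

# Guardrail: a toy rule that rejects overclaiming language.
BANNED_PHRASES = ("guaranteed", "100% of users")


def guardrail(output: str) -> bool:
    lowered = output.lower()
    return not any(phrase in lowered for phrase in BANNED_PHRASES)


@dataclass
class JournalEntry:
    step: str
    model: str
    prompt: str
    output: str
    passed_guardrail: bool


def run_step(step: str, prompt: str, journal: List[JournalEntry]) -> str:
    """Route the step, check the output, and record how it was derived."""
    model = ROUTES[step]
    output = model(prompt)
    ok = guardrail(output)
    # The journal entry is written whether or not the guardrail passes.
    journal.append(JournalEntry(step, model.__name__, prompt, output, ok))
    if not ok:
        raise ValueError(f"guardrail rejected output for step {step!r}")
    return output


def evaluate(cases: List[Tuple[str, str, str]]) -> float:
    """Eval harness: fraction of (step, prompt, expected substring) cases passed."""
    journal: List[JournalEntry] = []
    passed = 0
    for step, prompt, expected in cases:
        try:
            if expected in run_step(step, prompt, journal):
                passed += 1
        except ValueError:
            pass  # A guardrail rejection counts as a failed case.
    return passed / len(cases)
```

Writing the journal entry before raising means a rejected output still leaves a record, which is the property that makes the result auditable and not merely filtered.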

Why the loop should be one person

There is an operational reason this approach holds together, and it is worth stating plainly. On this work I author the research, design the experience, and ship the production code in React, TypeScript, and Python, with agentic coding and MCP in the loop. That means the distance between an idea and the shipped behaviour is short.

When the same person who decides what to test also decides how it should feel and then writes what actually runs, there are fewer handoffs and fewer translation losses. A guardrail is not a ticket filed against another team; it is a constraint I can encode the same afternoon I notice it is missing. An eval result does not get summarised through three documents before it changes the design; it changes the design directly. That tightness is the practical value. It is not a personal achievement, just a shorter loop between noticing and fixing.

The throughline

Treating AI as a material is what unifies the research, the design, and the engineering for me. Research tells you what the material should be doing and how to know whether it is. Design decides where the human stays in command and where the system is allowed to run ahead. Engineering builds the harness that makes the first two real. None of those is the interesting part on its own; the interesting part is the seam between them.

If you want the version of this argument turned all the way up for a single domain, I wrote about grounded simulation, a first-principles architecture for synthetic UX research, and about auditable AI research, the case for research a team can defend rather than just generate. Both are the same idea seen from different angles: the material is powerful, and the harness is what makes it trustworthy.