Design Tokens Are Sound in Theory. Most Implementations Are a Mess.

There are 847 tokens in the repository. The file is called tokens.json. Nobody on the team can explain which token controls the disabled button state. The designer changed the primary color in Figma three weeks ago. It hasn't propagated to the codebase yet. Two engineers independently hardcoded the new value in their components because the sync process "wasn't worth waiting for."
This is not a tooling problem. This is an architecture problem that tooling cannot fix.
Design tokens are a genuine innovation. The core idea — encode design intent as named variables that live in one place and propagate everywhere — solves a real problem: the gap between what a designer specifies and what an engineer implements, compounded by the gap between what was implemented last year and what the designer is specifying today. Token systems, done correctly, eliminate those gaps. They create a contract between design and engineering that reduces drift over time.
Done incorrectly, they produce exactly the scenario above: a false sense of system discipline, a growing file no one understands, and implementation drift that continues regardless because the token system wasn't actually governing anything.
Why The Architecture Matters
A well-designed token system has three layers. Most implementations have one.
Primitive tokens are the raw values. #2563EB. 16px. 400. These are facts about the visual palette — they don't carry any semantic meaning about where or how they should be used. blue-600 is a primitive. It tells you what color this is. It tells you nothing about what it means.
Semantic tokens are where intent lives. color-action-primary. spacing-component-gap. text-label-default. These tokens don't carry raw values — they reference primitives. color-action-primary: blue-600. The semantic layer is the contract. It says: whatever the interactive primary state of an action element should look like, it should look like this. When your brand blue changes from #2563EB to #1D4ED8, you change one primitive. Every semantic token that references it updates automatically. Every component that references the semantic token updates through it.
Component tokens are optional but sometimes useful — scoped to specific components when they genuinely need local customization that shouldn't propagate globally. button-background-primary might reference color-action-primary and add a slight opacity modifier. This layer should be thin. When it's thick, it's a sign the semantic layer wasn't doing its job.
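The three layers can be sketched as plain data, with each layer referencing the one below it by name. All token names and values here are illustrative, not a prescribed vocabulary:

```typescript
// Layer 1: primitives. Raw values, no usage semantics.
const primitives: Record<string, string> = {
  "blue-600": "#2563EB",
  "blue-700": "#1D4ED8",
  "gray-400": "#9CA3AF",
};

// Layer 2: semantics. Intent, expressed as references to primitives.
const semantics: Record<string, string> = {
  "color-action-primary": "blue-600",
  "color-action-disabled": "gray-400",
};

// Layer 3: component tokens. Thin, and only for genuinely local decisions.
const componentTokens: Record<string, string> = {
  "button-background-primary": "color-action-primary",
};

// Resolve a token down through the layers to a raw value.
function resolve(token: string): string {
  const semantic = componentTokens[token] ?? token;
  const primitive = semantics[semantic] ?? semantic;
  return primitives[primitive] ?? primitive;
}

console.log(resolve("button-background-primary")); // prints "#2563EB"

// A rebrand touches one primitive; everything above it follows.
primitives["blue-600"] = "#1D4ED8";
console.log(resolve("button-background-primary")); // prints "#1D4ED8"
```

The point of the sketch is the direction of the references: components know semantics, semantics know primitives, and nothing reaches downward past its own layer.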
The semantic layer is the layer most teams skip. Not because they don't know it exists — because it requires doing the hard design work of defining what things mean before defining what they look like. What does "primary action" mean in your system? What makes something a "label" versus a "body" versus a "heading"? These questions require actual design decisions. Generating primitives from an export is fast. Designing a semantic vocabulary for your system is slow and requires people to have opinions.
Most teams take the fast path. They export primitives from Figma, dump them in tokens.json, tell the engineering team "the tokens are ready," and call it done.
The Governance Vacuum
Even teams that implement the three-layer architecture correctly run into the same second problem: they ship without a governance process, and the system starts degrading on the first sprint after launch.
Token governance means: who approves new tokens? How does a designer request one? How does a deprecated token get removed? How does a component-scoped override get escalated to a semantic token when the same override appears in multiple components?
Without answers to these questions, the system evolves by accumulation. Engineers add tokens for specific cases. Designers add tokens in Figma that never land in code. The tokens.json file grows. Nobody removes tokens that are no longer used because nobody knows which ones are used. After eighteen months, the file has 800 tokens and the team has the same sync problems they had before the token system, plus a new layer of abstraction to debug through.
Francesco Improta's writing on design token governance is precise about this: without a formal process for token requests, reviews, and deprecation, systems become unmaintainable within months. Design and code drift apart after the first few sprints not because the tools failed, but because there was no process to prevent the drift.
The Figma team itself has documented this. Their research on design system adoption shows that most systems that fail, fail at the maintenance layer — not the build layer. The initial system is usually fine. The governance to keep it fine doesn't exist.
The Figma-to-Code Reality Gap
The tooling situation has improved considerably. W3C's Design Tokens Community Group is standardizing a format (the 2025.10 draft has support from Figma, Penpot, Sketch, Framer, Supernova, and several design system platforms). Tokens Studio, Theo, and Style Dictionary have made the export-and-transform pipeline faster. The toolchain is genuinely better than it was three years ago.
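For reference, the transform step of that pipeline is often just a small config. A minimal Style Dictionary configuration that reads token JSON and emits CSS custom properties might look like this (the shape follows Style Dictionary's documented options; verify against the version you actually run):

```typescript
// config sketch: token JSON in, CSS custom properties out.
export default {
  source: ["tokens/**/*.json"],
  platforms: {
    css: {
      transformGroup: "css",
      buildPath: "build/css/",
      files: [
        {
          destination: "variables.css",
          format: "css/variables",
        },
      ],
    },
  },
};
```

That is the whole mechanical half of the problem, which is exactly why it isn't the half that fails.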
What the toolchain cannot fix is intent. When a designer creates a semantic token in Figma, they're making a decision about meaning. When that token exports to JSON and gets consumed by a code transformer, the meaning can only survive if the naming was precise enough to carry it. A token called text-label-active in Figma might be consumed by an engineer who isn't sure whether "label" means a form label element or any label-sized text. The code gets the value right. The usage is wrong. The system starts drifting.
This is why teams that skip the semantic layer suffer more from AI-generated code: when engineers prompt a model to build a component, the model reaches for the most obvious interpretation of token names. Ambiguous names produce ambiguous implementations. The design intent that was never encoded precisely enough in the token name doesn't survive the translation.
The toolchain is a pipe. You still have to decide what you're piping.
What a Token System That Actually Works Looks Like
There are three requirements — not recommendations — for a token system that holds up over time.
The semantic layer is mandatory. Going straight from primitives to components with no semantic layer is just CSS variables with extra steps. Semantic tokens should be the only tokens components reference; primitives exist to feed semantics, and they should never appear directly in component code.
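This requirement is enforceable mechanically. A lint-style check can flag raw values and primitive token names inside component source, allowing only semantic references. A minimal sketch, assuming an illustrative primitive list (this is not a real ESLint rule, just the core check):

```typescript
// Primitive names would come from the primitive layer; these are examples.
const primitiveNames = ["blue-600", "blue-700", "gray-400"];

// Return a list of violations found in a component's source text.
function findViolations(componentSource: string): string[] {
  const violations: string[] = [];
  // Raw hex colors should never appear in component code.
  for (const hex of componentSource.match(/#[0-9a-fA-F]{3,8}\b/g) ?? []) {
    violations.push(`raw value ${hex}`);
  }
  // Neither should direct references to primitive tokens.
  for (const name of primitiveNames) {
    if (componentSource.includes(name)) {
      violations.push(`primitive token ${name}`);
    }
  }
  return violations;
}

// A component that uses var(--color-action-primary) passes;
// one that hardcodes #2563EB or var(--blue-600) does not.
```

Run in CI, a check like this turns "primitives never appear in components" from a style guideline into a gate.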
Governance exists before the first sprint after launch. The governance process — who requests tokens, who approves them, how deprecated tokens get removed — should be defined and agreed on before the system ships to engineering. Without this, the system will degrade on sprint two.
The token inventory gets audited quarterly. Any token that hasn't been referenced in code in the past 90 days is a candidate for deprecation. This audit should be automated if possible (most token platforms can generate usage reports) and reviewed by the design system team on a fixed schedule. Token debt compounds like any other technical debt. Addressing it quarterly is manageable. Addressing it at year two is a rewrite.
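The core of that audit fits in a few lines. A minimal sketch, assuming a flat tokens.json keyed by token name (the file shape, paths, and matching strategy are all illustrative):

```typescript
import { readFileSync } from "node:fs";

// Tokens that appear in the inventory but nowhere in the source text
// are deprecation candidates. Note the naive substring match: a token
// whose name is a prefix of another token's name will be reported as
// "used", so a real audit should match whole identifiers.
function unusedTokens(tokenNames: string[], sourceText: string): string[] {
  return tokenNames.filter((name) => !sourceText.includes(name));
}

// Wiring sketch (hypothetical paths): read a flat tokens.json and a
// concatenated dump of the component source tree.
function auditRepo(tokensPath: string, sourceDumpPath: string): string[] {
  const tokens = JSON.parse(readFileSync(tokensPath, "utf8"));
  return unusedTokens(Object.keys(tokens), readFileSync(sourceDumpPath, "utf8"));
}
```

The output is a candidate list, not a kill list: the design system team still reviews it on the fixed schedule, because some tokens are referenced only in Figma or in documentation.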
The design system isn't the problem. The token system isn't the problem. The problem is treating both as artifacts to ship rather than as systems to govern — and discovering, usually eighteen months later, that a system without governance is just organized chaos with better filenames.
Related: Your Design System Failed Before Anyone Used It · The Design System Isn't Your Problem. Knowing When to Break It Is.