Why AI-Generated CSS Needs Design Constraints

By Jeff Shumate
5 min read

LLMs are good at generating CSS. They're terrible at following design systems. Give Claude or Cursor a component to build and it will happily use #FF5734 instead of your brand red, 16px instead of your spacing tokens, and font-weight: 600 instead of your typography scale.

The problem isn't the structure of the generated code. It's that AI tools treat CSS as just another stream of tokens they were trained on. If the output is correctly structured and satisfies the request, the model considers the job done. If the user complains, there's always a handy !important to throw at the problem. Every time the agent touches your code, your CSS bloats a little more.

The Arbitrary Value Problem

When you tell an LLM to "use our brand colors," it picks whatever hex codes feel right and calls it done. You end up with subtly different shades of blue, or the same shade of blue re-defined 187 times as inline CSS.

This creates maintenance hell. Every AI-generated component becomes a special case that doesn't follow your design system. Change your brand colors? Good luck finding all the places where the AI hardcoded #0265dc instead of using var(--color-primary).

A Three-Layer Architecture

I built tokenctl around a brand → semantic → component architecture that constrains AI-generated CSS. Instead of hoping the AI remembers your design system, you create structured constraints using W3C-compliant design tokens.

Brand tokens define what's available in your system. Your color palette, spacing scale, typography families. These are the raw materials:

{
  "color": {
    "$type": "color",
    "primary": { "$value": "oklch(49.12% 0.309 275.75)" },
    "secondary": { "$value": "#8b5cf6" }
  }
}

Semantic tokens define how those brand tokens should be used contextually. Instead of "primary," you get "accent" or "surface" with semantic meaning.
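As an illustration, a semantic layer might look like this. The token names here are examples of mine, but the {dotted.path} reference syntax is the same W3C format used throughout:

```json
{
  "color": {
    "$type": "color",
    "accent": { "$value": "{color.primary}" },
    "surface": { "$value": "{color.secondary}" }
  }
}
```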

Component tokens map semantic tokens to specific UI parts. Your button doesn't use "color-accent" directly; it uses "button-primary-background."
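A component layer for that button might look like this (again, the specific token names are illustrative):

```json
{
  "button": {
    "$type": "color",
    "primary-background": { "$value": "{color.accent}" }
  }
}
```

The point is that the component token references the semantic layer, never the brand layer.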

The --strict-layers flag enforces this hierarchy and prevents components from referencing brand tokens directly. This stops AI tools from bypassing your semantic layer and hardcoding brand values.

Reference Resolution and Computed Values

One thing that works well is automatic reference resolution with cycle detection. When you change your primary color, everything that references it updates automatically:

{
  "color": {
    "primary": { "$value": "oklch(49.12% 0.309 275.75)" },
    "primary-content": { "$value": "contrast({color.primary})" }
  }
}

The contrast() function generates accessible text colors automatically. The AI doesn't need to understand color theory; it just needs to use the tokens.
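Reference resolution with cycle detection is a small graph walk. Here's a minimal sketch of the idea, not tokenctl's implementation. It substitutes {dotted.path} references in a flat token map and raises on cycles; computed functions like contrast() would be a separate pass on top:

```python
import re

REF = re.compile(r"\{([^}]+)\}")

def resolve(tokens: dict) -> dict:
    """Resolve {dotted.path} references in a flat token map, raising on cycles."""
    resolved: dict = {}

    def lookup(name: str, stack: tuple) -> str:
        if name in stack:
            raise ValueError(f"reference cycle: {' -> '.join(stack + (name,))}")
        if name in resolved:
            return resolved[name]
        # Replace every {ref} with the resolved value of the token it names.
        value = REF.sub(lambda m: lookup(m.group(1), stack + (name,)), tokens[name])
        resolved[name] = value
        return value

    for name in tokens:
        lookup(name, ())
    return resolved
```

With this, changing "color.primary" automatically flows into every token that references it on the next build.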

LLM Manifests for Context Efficiency

The key insight for AI workflows is the manifest generation. Instead of dumping your entire design system into the AI's context window, you can generate category-scoped manifests:

tokenctl build --format=manifest:color --output=./dist

This creates focused JSON that contains just the color tokens relevant to the current task. The AI gets structured constraints without context bloat.
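The underlying idea is easy to sketch: filter the token tree down to one top-level category before handing it to the model. This is a toy illustration of category scoping, not tokenctl's actual manifest shape:

```python
import json

def category_manifest(tokens: dict, category: str) -> str:
    """Return a JSON manifest containing only one top-level token category."""
    subset = {category: tokens[category]}
    return json.dumps(subset, indent=2)

tokens = {
    "color": {"primary": {"$value": "#0265dc"}},
    "spacing": {"md": {"$value": "16px"}},
}
manifest = category_manifest(tokens, "color")  # color tokens only, no spacing
```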

Multi-Directory Merge for Real Projects

The multi-directory approach handles real-world complexity. You can have base component tokens that define your core system, then project-specific extensions:

tokenctl build ./base-components ./dashboard-ext --output=./dist

Directories merge left-to-right, so later directories extend or override earlier ones. This solves the problem of maintaining consistency across projects without forcing everything into a monolithic system.
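The merge semantics can be sketched as a recursive dict merge where later sources win. This illustrates the left-to-right rule; it is not tokenctl's code:

```python
def merge(base: dict, override: dict) -> dict:
    """Deep-merge token dicts; keys in `override` win over `base`."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)  # recurse into nested token groups
        else:
            out[key] = value  # later directory overrides the earlier one
    return out

base = {"color": {"primary": {"$value": "#0265dc"}, "accent": {"$value": "#8b5cf6"}}}
ext = {"color": {"primary": {"$value": "#1473e6"}}}
merged = merge(base, ext)  # primary overridden, accent survives from base
```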

Scale Expansion and Theme Inheritance

Scale expansion generates size variants automatically. Define a base spacing value and get xs, sm, md, lg, xl variants without manual work. Theme inheritance with $extends lets you create variations that build on parent themes rather than rewriting everything.

The build output shows how this works in practice:

@theme {
  --color-primary: oklch(49.12% 0.309 275.75);
  --color-primary-content: oklch(100% 0 0);
}

@layer base {
  [data-theme="dark"] {
    --color-primary: oklch(65% 0.2 275);
  }
}

Validation That Actually Works

The validation layer enforces the three-layer architecture and catches violations before they make it into your codebase. When an AI generates background-color: #78bbfa, the system can flag it and suggest background-color: var(--color-accent) instead.
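A validator of this kind is easy to picture: scan declarations for raw color values and map known brand hex codes back to the tokens they should have been. A toy sketch, not tokenctl's validator:

```python
import re

# Known brand values and the tokens they should be written as.
TOKEN_FOR = {"#78bbfa": "var(--color-accent)", "#0265dc": "var(--color-primary)"}

HEX = re.compile(r"#[0-9a-fA-F]{3,8}\b")

def lint_css(css: str) -> list[str]:
    """Flag hardcoded hex colors and suggest the matching token."""
    warnings = []
    for match in HEX.finditer(css):
        raw = match.group(0).lower()
        suggestion = TOKEN_FOR.get(raw, "a design token")
        warnings.append(f"hardcoded {raw}: use {suggestion} instead")
    return warnings

lint_css("background-color: #78bbfa;")
# ['hardcoded #78bbfa: use var(--color-accent) instead']
```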

This isn't about slowing down development. It's about avoiding the kind of design debt that forces you to throw away months of AI-generated components because they're impossible to maintain.

Standards-Based Portability

Using the W3C Design Token Format means your constraints work with any toolchain. Generate Tailwind 4 configs, pure CSS, or JSON manifests from the same source. The semantic structure travels with your code, not with your build tools.

This matters because design systems outlive the tools that generate them. The constraints you build today need to work with whatever AI tools emerge next year.

Practical Guardrails

AI tools will keep getting better at generating code, but they'll always need guardrails. Structured constraints don't fix their fundamental limitations; they work around them by providing a vocabulary that makes violations obvious and corrections straightforward.

It's not about teaching the AI your design system. It's about giving it structured data that prevents the most common mistakes while preserving the productivity benefits of automated code generation.


tokenctl is open source and available at github.com/dmoose/tokenctl. It generates W3C-compliant design tokens for Tailwind, CSS, or JSON output.

Stay in the loop

I write about software architecture, distributed systems, and building things that matter. Occasional updates, no spam.
