Agent Skills Need Exit Criteria, Not More Prompt Lore

The interesting part of Addy Osmani's agent-skills repo is not that it gives AI coding agents more markdown to read. The interesting part is that it treats senior engineering judgment as a reusable artifact.

That is why the repo moved fast through the AI developer crowd. It packages production concerns like testing, accessibility, performance, code review, debugging, and migration work into skill files that can be dropped into tools such as Claude Code, Cursor, and Antigravity. The repo description is blunt: "Production-grade engineering skills for AI coding agents."

That framing matters because the next phase of AI coding is not "write a better prompt." It is "make the agent inherit the team's definition of done."

The take

Skills are only useful when they contain exit criteria.

A weak skill says:

Write better React components.

A useful skill says:

Before finishing, run the local checks, verify the responsive states, preserve existing user edits, avoid new dependencies unless justified, and report what was not verified.

That second version is closer to a production checklist than a prompt. It gives the agent a way to stop, inspect its own work, and produce a handoff that a human can review.

That is the same reason Claude Code skills are becoming a real workflow layer, and why skills beat prompts for coding agents. The durable part is not the prose. It is the repeated operating procedure.

Why developers are paying attention

The repo is useful because it meets agents at the exact place they fail: judgment transfer.

Most AI coding failures are not syntax failures anymore. They are failures of taste, scope, verification, and integration. The agent can write the component, but it may not know the local design system. It can add tests, but it may test the wrong behavior. It can refactor the module, but it may erase an edge case the team learned the hard way.

A skill can encode those constraints in a way that survives across sessions.

That is different from a one-off instruction. A one-off prompt is a sticky note. A skill is closer to a small operating manual.

The opposing view

The fair criticism is that skills can become another pile of stale docs.

If every team ships a 4,000-line skill pack, agents will skim, misapply, or ignore the important bits. Worse, bloated skills can make the agent sound more confident without making it more correct.

That is the trap. Skills should not become a second codebase of aspirational process.

Good skills are short, specific, and tied to observable behavior:

Which files or commands matter
What the agent must check before finishing
What it should never change casually
What evidence it should return
When it should stop and ask
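Those five items can be treated as required sections of a skill. Here is a minimal sketch in Python, assuming a hypothetical `Skill` record whose field names are invented for illustration (this is not the file format used by agent-skills or Claude Code). It flags a skill that leaves any section empty:

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Skill:
    """Hypothetical in-memory shape for a skill; all field names are illustrative."""
    name: str
    relevant_paths_and_commands: List[str] = field(default_factory=list)
    checks_before_finishing: List[str] = field(default_factory=list)
    never_change_casually: List[str] = field(default_factory=list)
    evidence_to_return: List[str] = field(default_factory=list)
    stop_and_ask_when: List[str] = field(default_factory=list)


def missing_sections(skill: Skill) -> List[str]:
    """Return the names of the list-valued sections that are still empty."""
    return [
        f.name
        for f in fields(skill)
        if f.name != "name" and not getattr(skill, f.name)
    ]


# A skill that only has a title, like "Write better React components."
weak = Skill(name="react-components")

# The same skill with exit criteria filled in (paths and commands are made up).
useful = Skill(
    name="react-components",
    relevant_paths_and_commands=["src/components/", "npm test"],
    checks_before_finishing=["run the local checks", "verify the responsive states"],
    never_change_casually=["existing user edits", "the dependency list"],
    evidence_to_return=["what was verified", "what was not verified"],
    stop_and_ask_when=["a new dependency seems necessary"],
)

print(missing_sections(weak))    # all five section names
print(missing_sections(useful))  # []
```

A check like this says nothing about whether the criteria are good, only whether they exist. That is still enough to catch the skill that is a slogan with a filename.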

That is also why long-running agents need harnesses, not hope. The skill is the instruction layer. The harness is the runtime layer. You need both if the work matters.

What to copy from the repo

The repo is best treated as a menu, not a template.

Do not copy every skill into your project. Start with the recurring failures you already see:

Agents change too much.
Agents forget verification.
Agents ignore design constraints.
Agents lose context between sessions.
Agents produce vague final reports.

Then write one skill per repeated failure.

For example, a frontend repo does not need a generic "build nice UI" skill. It needs a design-system skill that says which tokens, components, breakpoints, and visual checks count as done. That pairs well with a project-level design contract like DESIGN.md, which gives agents a persistent way to understand a visual identity.

For backend work, the useful skill is usually not "write APIs." It is "when changing this endpoint, update the schema, migration, tests, docs, and client types in the same change."
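A rule like that is mechanical enough to check. Here is a rough sketch, assuming a hypothetical repo layout: the path patterns and the `COMPANIONS` mapping below are invented for illustration, and the list of changed files is supplied by the caller rather than read from version control.

```python
from fnmatch import fnmatchcase

# Hypothetical layout: every pattern here is an assumption, not a real convention.
ENDPOINT_PATTERN = "api/endpoints/*.py"
COMPANIONS = {
    "schema": "api/schemas/*",
    "migration": "migrations/*",
    "tests": "tests/api/*",
    "docs": "docs/api/*",
    "client types": "client/types/*",
}


def missing_companions(changed_files):
    """If an endpoint file changed, list the companion areas with no change."""
    touches_endpoint = any(
        fnmatchcase(path, ENDPOINT_PATTERN) for path in changed_files
    )
    if not touches_endpoint:
        return []
    return [
        label
        for label, pattern in COMPANIONS.items()
        if not any(fnmatchcase(path, pattern) for path in changed_files)
    ]


changed = [
    "api/endpoints/users.py",
    "api/schemas/user.json",
    "tests/api/test_users.py",
]
print(missing_companions(changed))  # ['migration', 'docs', 'client types']
```

Not every endpoint change needs a migration, so a result like this is a prompt for the agent to explain the gap in its report, not a hard failure.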

How I would use it

I would start with three production skills:

Review receipt skill. Every agent change must report files changed, commands run, commands not run, and risks left open. This is the human review surface.

Scope discipline skill. The agent must preserve unrelated local changes, avoid broad refactors, and explain why any new abstraction exists.

Verification ladder skill. The agent starts with cheap checks, escalates to build or browser QA when the change touches user-facing behavior, and reports the exact result.
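The verification ladder and the review receipt fit together naturally, so here is one sketch covering both. It assumes a hypothetical four-rung ladder and a made-up rule for what counts as user-facing (the path prefixes are placeholders). It does not run any commands; the caller says which rungs actually ran.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical rungs, cheapest first.
LADDER = ["lint", "unit tests", "build", "browser QA"]

# Assumption: anything under these prefixes touches user-facing behavior.
USER_FACING_PREFIXES = ("src/components/", "src/pages/")


def rungs_for(changed_files: List[str]) -> List[str]:
    """Always plan the cheap checks; escalate only for user-facing changes."""
    user_facing = any(
        path.startswith(USER_FACING_PREFIXES) for path in changed_files
    )
    return LADDER if user_facing else LADDER[:2]


@dataclass
class Receipt:
    """What the agent hands back for human review."""
    files_changed: List[str]
    commands_run: List[str] = field(default_factory=list)
    commands_not_run: List[str] = field(default_factory=list)
    open_risks: List[str] = field(default_factory=list)

    def render(self) -> str:
        sections = [
            ("Files changed", self.files_changed),
            ("Commands run", self.commands_run),
            ("Commands not run", self.commands_not_run),
            ("Risks left open", self.open_risks),
        ]
        lines = []
        for title, items in sections:
            lines.append(f"{title}:")
            if items:
                lines.extend(f"  - {item}" for item in items)
            else:
                lines.append("  (none)")
        return "\n".join(lines)


def make_receipt(changed_files, ran, open_risks) -> Receipt:
    """Record what ran; every ladder rung that did not run is listed explicitly."""
    not_run = [rung for rung in LADDER if rung not in ran]
    return Receipt(list(changed_files), list(ran), not_run, list(open_risks))


changed = ["src/components/Button.tsx"]
print(rungs_for(changed))
print(make_receipt(changed, ran=["lint", "unit tests"],
                   open_risks=["responsive states not checked"]).render())
```

The design choice that matters is `commands_not_run`. Listing the skipped rungs by name is what turns the final report from a summary into something a reviewer can act on.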

Those three skills solve more real problems than a giant library of framework-specific tips.

They also compose with Claude Code subagents, multi-agent coordination, and agent replays. When multiple agents are working at once, the skill is how you make their handoffs consistent.

The practical bottom line

Agent skills are becoming the new team playbook.

The best ones do not teach the model to code. The model already knows enough about code. They teach the model how your team decides a change is finished.

That is the shift Addy's repo makes visible. The winning teams will not have the longest prompts. They will have the clearest operating rules, the smallest reusable skills, and the strongest verification habits.

Sources: addyosmani/agent-skills, google-labs-code/design.md, Claude Code skills docs.
