Karpathy Skills Show Why CLAUDE.md Is Product Surface Now

Last updated: June 24, 2026

The Useful Signal Is Not The Star Count

The old version of this post treated andrej-karpathy-skills as a GitHub-trending curiosity: a small instruction repo with a giant star count and a simple promise that Claude Code would behave better if you installed the right rules.

That was the wrong center of gravity.

The useful signal is not that a CLAUDE.md file went viral. The useful signal is that developers now recognize agent instructions as production infrastructure. A file that tells Claude Code how to reason, when to ask questions, how much code to touch, and what counts as done can change every future diff in a repository.

As of this refresh, the original forrestchang/andrej-karpathy-skills URL redirects to multica-ai/andrej-karpathy-skills. The public GitHub API shows a small repository with a very large audience, no declared license in the repo metadata, a last push on April 20, 2026, and enough forks to show that many teams copied or adapted the idea. The repository has also moved beyond a loose instruction file: it now includes .claude-plugin/plugin.json and skills/karpathy-guidelines/SKILL.md. Both package the rules in Claude Code's skill/plugin shape, and both declare an MIT license at the component level even though the repo metadata declares none.

Those rules are useful. They are also incomplete unless a team turns them into local operating constraints.

If you want the broader strategic version, read Karpathy CLAUDE.md Skills: Use the Viral Rules as a Menu, Not a Template. This refresh is the practical 2026 version: how to treat a viral instruction file as something that belongs in review, versioning, and governance.

What The File Actually Teaches

The CLAUDE.md is short because it targets four common coding-agent failures.

First: think before coding. The file tells the agent not to hide confusion, not to silently choose between interpretations, and to surface assumptions before implementation.

Second: simplicity first. It pushes against speculative abstractions, one-off flexibility, and code volume that grows past the actual requirement.

Third: surgical changes. It tells the agent not to refactor adjacent code, reformat unrelated files, or clean up pre-existing dead code just because it noticed it.

Fourth: goal-driven execution. It asks the agent to convert work into success criteria, verification steps, and a loop that stops only after the check passes.

The reason this resonates is simple: most frustrating agent failures are not exotic. They are ordinary engineering mistakes at machine speed. A vague task becomes an overbuilt solution. A small bug fix becomes a style cleanup. A missing acceptance criterion becomes a confident final answer. A rule file that names those behaviors is immediately useful.

This is the same pattern behind prompt engineering for coding agents. Good prompts are not magic words. They are task specs, constraints, examples, and verification gates.

The Move From Prompt To Product Surface

The Claude Code docs now make the extension surface explicit. There are project instructions, the .claude directory, skills, subagents, hooks, plugins, settings, and debug tooling. The docs index describes CLAUDE.md as part of memory and project instructions, describes plugins as bundles that can include skills, agents, hooks, and MCP servers, and points to /context or config debugging as ways to see what actually loaded.

That means a team should stop treating agent instructions as personal dotfile taste.

A repo-level CLAUDE.md is closer to a build script than a sticky note. It changes the behavior of future contributors, including automated ones. A skill is closer to a reusable runbook than a prompt snippet. A plugin is closer to a dependency bundle. A hook is closer to a policy enforcement point.

That is why agent config files are executable supply chain. Not every instruction file literally runs shell commands, but instructions can still change what an agent trusts, edits, ignores, or reports. Once an agent can run tools, the boundary between "instruction" and "execution" gets practical, not philosophical.

The Karpathy rules are harmless-looking because they are cautious. But the lesson applies both ways. If a trusted repo instruction can make an agent ask better questions, a bad one can make it skip tests, hide uncertainty, or treat a risky command as normal setup.

What To Keep

Keep the four categories. They are good defaults.

Think before coding is useful when the task is ambiguous, cross-cutting, or product-facing. It prevents the worst kind of waste: a long agent run based on an assumption nobody approved.

Simplicity first is useful when agents are editing mature code. It reduces review time by keeping changes near the request instead of letting the model invent a future architecture.

Surgical changes is the strongest rule for team adoption. Reviewers forgive imperfect code more easily than surprise diffs. A small diff with one clear reason can be reviewed, reverted, and learned from.

Goal-driven execution is what turns a chatty assistant into a useful worker. A final message should say what changed, what was checked, and what could not be checked. That connects directly to agent evals needing baseline receipts and long-running agents needing harnesses.
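
One way to make that final-message expectation concrete is a small report structure. This is a minimal sketch, not anything Claude Code or the upstream repo defines: the `FinalReport` class, its field names, and the rendering format are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FinalReport:
    """Hypothetical shape for an agent's end-of-task message."""

    changed: list[str] = field(default_factory=list)    # what changed
    checked: list[str] = field(default_factory=list)    # checks that ran and passed
    unchecked: list[str] = field(default_factory=list)  # checks that could not be run

    def is_complete(self) -> bool:
        # A report that lists changes but accounts for no verification at all
        # (nothing checked, nothing flagged as unchecked) is treated as incomplete.
        return bool(self.changed) and bool(self.checked or self.unchecked)

    def render(self) -> str:
        sections = [
            ("Changed", self.changed),
            ("Checked", self.checked),
            ("Not checked", self.unchecked),
        ]
        lines = []
        for title, items in sections:
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in (items or ["(none)"]))
        return "\n".join(lines)
```

A reviewer or wrapper script could reject any run where `is_complete()` is false, which keeps "what could not be checked" from silently disappearing.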

Also keep the tradeoff note. The upstream file says the guidelines bias toward caution over speed. That matters. A rule that improves a risky migration can slow down a trivial typo fix. Good agent instructions should leave room for judgment instead of turning every task into ceremony.

What To Change Before Copying

Do not copy the viral file unchanged into a serious repository and call it a day.

Make each rule local.

Instead of "minimum code that solves the problem," define when this repo accepts a new abstraction. For example: only add an abstraction when it removes real duplication, matches an existing project pattern, or is required by a public API boundary.

Instead of "touch only what you must," name the files or ownership boundaries that matter. For example: route-only UI changes should not edit shared data loaders unless a failing test or type error proves shared code must change.

Instead of "verify your work," list the actual checks. For a Next.js site, that may be pnpm check:public-style, pnpm typecheck, a route smoke test, and a mobile screenshot. For a library, it may be unit tests plus a minimal usage fixture.

Instead of "ask if uncertain," define uncertainty thresholds. Agents should not ask when the repo has a clear local pattern. They should ask when a decision changes data shape, public copy, pricing, permissions, or migration behavior.

This is the difference between a slogan and an operating contract. The more concrete the rule, the less the model has to infer.
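
To show how concrete a local rule can get, here is a small sketch that turns two of the rewrites above into checks: which paths a route-only UI change may touch, and which kinds of decisions require a question. The glob patterns, trigger names, and function names are placeholders invented for illustration; substitute your repo's real boundaries.

```python
from fnmatch import fnmatch

# Hypothetical local contract for one class of work ("route-only UI change").
# These globs and trigger names are illustrative, not taken from any real repo.
ALLOWED_GLOBS = ["app/routes/*", "app/components/*"]
ASK_TRIGGERS = {"data-shape", "public-copy", "pricing", "permissions", "migration"}


def out_of_scope(changed_paths: list[str]) -> list[str]:
    """Return the changed paths that fall outside the allowed globs."""
    return [
        path
        for path in changed_paths
        if not any(fnmatch(path, glob) for glob in ALLOWED_GLOBS)
    ]


def should_ask(decision_tags: set[str]) -> bool:
    """Ask a human only when a decision touches one of the named triggers."""
    return bool(decision_tags & ASK_TRIGGERS)
```

For example, `out_of_scope(["app/routes/settings/page.tsx", "lib/loaders/user.ts"])` flags only the shared loader, which is the surprise edit a reviewer would want explained.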

Skills, Plugins, And Hooks Should Split The Load

The right long-term shape is not one giant CLAUDE.md.

Use CLAUDE.md for durable repo-wide behavior: autonomy, review expectations, style constraints, security rules, and verification requirements.

Use skills for repeatable procedures. A content skill can know frontmatter, images, source links, and FAQ rules. A migration skill can know schema commands, rollback checks, and data validation. A QA skill can know screenshot requirements. This is why "why skills beat prompts for coding agents" keeps compounding as a theme.

Use plugins when a team needs to distribute a bundle of skills, commands, hooks, subagents, or MCP wiring across projects. The official Claude Code plugin docs now separate discovery and creation, which is the right mental model: installable capability needs provenance and versioning. The Karpathy repo's current .claude-plugin manifest is a useful example of that shift from copy-paste file to packaged capability.

Use hooks for mechanical enforcement. If the rule says "format after edits" or "block a dangerous command," do not rely on prose. Put it in a hook or policy layer. "Claude Code hooks explained" is still the practical bridge between instruction and enforcement.
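
As a rough sketch of what "put it in a hook" can look like, the script below checks a proposed shell command against a deny-list. Treat the contract as an assumption to verify against the current hooks documentation: it assumes the hook payload is JSON with the command under `tool_input.command`, and that exit code 2 signals a block. The patterns and function names are illustrative only.

```python
import json
import re
import sys
from typing import Optional

# Hypothetical deny-list; tune these patterns to your own repo's risks.
BLOCKED_PATTERNS = [
    r"\brm\s+-rf\s+/",              # recursive delete starting at an absolute path
    r"\bgit\s+push\b.*--force\b",   # force pushes (also matches --force-with-lease)
    r"\bcurl\b.*\|\s*(sh|bash)\b",  # piping a download straight into a shell
]


def blocked_reason(command: str) -> Optional[str]:
    """Return the first matching pattern, or None if the command is allowed."""
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, command):
            return pattern
    return None


def exit_code_for(raw_payload: str) -> int:
    """Map a raw JSON hook payload to an exit code."""
    # Assumption: the shell command lives at payload["tool_input"]["command"].
    payload = json.loads(raw_payload)
    command = payload.get("tool_input", {}).get("command", "")
    reason = blocked_reason(command)
    if reason is not None:
        print(f"Blocked by policy: matched {reason!r}", file=sys.stderr)
        return 2  # assumed "block this call" exit code
    return 0


# To use this as a script entry point, uncomment:
# if __name__ == "__main__":
#     sys.exit(exit_code_for(sys.stdin.read()))
```

The point of the design is that the decision lives in code a reviewer can test, not in a sentence the model may or may not follow.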

Use subagents when context should be smaller, not bigger. The main agent should know when to delegate, while the specialist carries the detailed procedure. That is the same direction as Claude Code sub-agents.

And if plugins become the distribution path, review the supply chain. Claude Code plugin URL supply-chain risk is the companion concern: a useful rule file becomes a different kind of risk once it is packaged, versioned, and installed across machines.

The Opposing View

The skeptical read is fair: a viral instruction repo can become cargo culting.

Star counts do not prove better diffs. A famous name does not make a rule correct for every codebase. A concise file can still be too generic. And many teams will paste the file into a repo without measuring whether it reduced review burden, bug rate, or unnecessary changes.

There is also a hidden cost to caution. "Ask when uncertain" can become interruption spam. "Simplicity first" can prevent useful abstraction. "Surgical changes" can discourage a needed cleanup when the surrounding code is the actual source of the bug.

The fix is measurement. Track whether the instruction file produces smaller diffs, fewer reverted changes, clearer final reports, and better test discipline. If it does not, revise it.
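
The "smaller diffs" signal is the easiest one to measure. The sketch below summarizes the text produced by `git diff --numstat`, which prints one tab-separated line per file (added lines, deleted lines, path) and uses `-` for binary files. The function name and the returned keys are hypothetical.

```python
def summarize_numstat(numstat: str) -> dict[str, int]:
    """Summarize `git diff --numstat` output: files touched and lines changed."""
    files = added = deleted = 0
    for line in numstat.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        add, rem, _path = parts
        files += 1
        # Binary files report "-" for both counts: count the file, not the lines.
        if add.isdigit() and rem.isdigit():
            added += int(add)
            deleted += int(rem)
    return {"files": files, "added": added, "deleted": deleted}
```

Comparing these numbers for the same kind of task before and after adopting the instruction file gives a rough read on diff size. Reverted changes and report quality need separate tracking.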

This is exactly the governance problem in the agent skills production checklist: every reusable agent rule should have an owner, a scope, a version, and a removal path.
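
Those four attributes are simple enough to enforce mechanically. A minimal sketch, assuming a team keeps one record per rule; the `AgentRule` class and `missing_fields` helper are hypothetical names, not part of any tool.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentRule:
    """Hypothetical governance record for one reusable agent rule."""

    name: str
    owner: str
    scope: str
    version: str
    removal_path: str


def missing_fields(rule: AgentRule) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [name for name, value in asdict(rule).items() if not value.strip()]
```

A CI check could fail whenever `missing_fields` returns anything, so a rule without an owner or a removal path never ships.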

A Better Rollout Plan

Start with one repo and one class of work.

Add a short instruction file with the four categories, but rewrite each category into local language. Keep it under one screen if possible.

Run the agent on three representative tasks. One bug fix, one small feature, and one refactor are enough to reveal whether the rules help or just add friction.

Review the diffs for three signals:

Did the agent ask useful questions before doing irreversible work?
Did it avoid unrelated cleanup?
Did it report verification clearly?

Then move repeated procedures out of CLAUDE.md and into skills. Move enforceable checks into hooks. Add plugin packaging only when multiple repos need the same bundle.

That is how a viral rule file becomes infrastructure instead of decoration.

The Take

The Karpathy-style skills repo is valuable because it made agent behavior legible. It gave developers four names for failures they were already seeing: hidden assumptions, overengineering, surprise edits, and weak verification.

But the winning move is not to worship the file. The winning move is to own the layer.

Treat CLAUDE.md like a product surface. Review it like code. Keep it small. Make it local. Split repeatable workflows into skills. Put mechanical checks into hooks. Track whether the result actually improves review.

That is how instruction files stop being prompt folklore and start becoming engineering infrastructure.

FAQ

What is andrej-karpathy-skills?

It is a small public repository centered on a CLAUDE.md file with behavioral guidelines for reducing common coding-agent mistakes.

Is the repo still under forrestchang?

The old GitHub URL currently redirects to multica-ai/andrej-karpathy-skills, so current links should use the multica-ai repository path.

Should I copy the CLAUDE.md into every project?

Use it as a starting point, not a template. Rewrite the rules around your repo's actual files, commands, review boundaries, and verification checks.

Are Claude Code skills the same thing as CLAUDE.md?

No. CLAUDE.md is a project instruction file. Skills are reusable capabilities or procedures that Claude can load for specific tasks. They work best together when the global file stays small.

Why do plugins matter here?

Plugins are the distribution layer. If a team wants to share skills, hooks, subagents, commands, or MCP configuration across projects, plugin packaging introduces the need for provenance, versioning, and install policy.

How do I know if agent rules are working?

Look for smaller diffs, fewer unrelated edits, clearer assumptions, better test discipline, and final reports that explain what was verified. If those do not improve, rewrite the rules.
