Andrej Karpathy's CLAUDE.md Rules: The File That Fixes Claude Code

Karpathy identified four Claude Code failure modes and built a single CLAUDE.md file that fixes all of them. The four principles, how to install them in 2 minutes, and the four community-identified gaps the original rules don't cover in May 2026.

Jason ZhouUpdated 6 min read

Join AI Builder Club — courses, community, weekly workshops.

30-day money-back guarantee. $37/mo.

See Plans →

What Karpathy Said Was Broken

Andrej Karpathy stopped typing code manually in December 2024. He went from 80% manual coding to 80% agent-driven in a matter of weeks - and called it "the biggest change to my basic coding workflow in two decades."

But he also named exactly what was broken.

In a now-viral post, Karpathy listed the failure modes he kept hitting with Claude Code:

  • Models make wrong assumptions and run with them silently
  • They don't surface confusion, inconsistencies, or tradeoffs
  • They overcomplicate code - bloating abstractions, skipping cleanup
  • They touch code they shouldn't, even when it's orthogonal to the task

His diagnosis was precise. But he didn't just complain - he also gave the fix:

"LLMs are exceptionally good at looping until they meet specific goals. Don't tell it what to do, give it success criteria and watch it go."

That insight became the basis for andrej-karpathy-skills - a single CLAUDE.md file distilling Karpathy's observations into four engineering principles that you can drop into any project.

What Is CLAUDE.md?

CLAUDE.md is a project-level instruction file that Claude Code reads at the start of every session. Think of it as a persistent system prompt for your codebase - it tells Claude how to behave before you've typed a single word.

Most developers either leave it empty or fill it with vague instructions like "write clean code." Karpathy's version is different. It targets specific, documented failure modes with specific countermeasures.

The Four Principles

1. Think Before Coding

Addresses: wrong assumptions, hidden confusion, missing tradeoffs

The single most common failure mode Karpathy identified: Claude picks an interpretation and runs with it, never surfacing the assumption.

This principle forces explicit reasoning before any implementation:

  • State assumptions explicitly - if uncertain, ask rather than guess
  • Present multiple interpretations when ambiguity exists
  • Push back when a simpler approach exists
  • Stop and name what's unclear rather than guessing through it

In practice: instead of getting 200 lines of code that solves the wrong problem, you get a brief planning exchange that catches the misread upfront.

2. Simplicity First

Addresses: overcomplication, bloated abstractions, speculative features

Claude's default instinct is to over-engineer. It adds error handling for scenarios that can't happen, builds abstractions for single-use code, and layers in "flexibility" nobody asked for.

Karpathy's test: would a senior engineer look at this and say it's overcomplicated? If yes, simplify.

The principle enforces:

  • No features beyond what was explicitly asked
  • No abstractions for single-use code
  • No configurable options that weren't requested
  • If 200 lines could be 50, rewrite it

3. Surgical Changes

Addresses: orthogonal edits, touching unrelated code, style drift

Claude has a bad habit of "improving" adjacent code while completing a task. It reformats comments, refactors things that weren't broken, and removes dead code it doesn't fully understand.

This principle draws a hard boundary:

  • Touch only what the user's request requires
  • Match existing style even if you'd do it differently
  • If you notice unrelated dead code, mention it - don't delete it
  • Every changed line should trace directly to the request

The result: diffs that are clean and reviewable instead of sprawling across the codebase.

Free AI Builder Newsletter

Weekly guides on AI tools & builder strategies.

4. Goal-Driven Execution

Addresses: imperative task drift, weak success criteria, constant clarification loops

This is Karpathy's key insight operationalized. Instead of telling Claude what to do, you give it success criteria and let it loop.

The transformation:

  • Instead of "Add validation" - write "Write tests for invalid inputs, then make them pass"
  • Instead of "Fix the bug" - write "Write a test that reproduces it, then make it pass"
  • Instead of "Refactor X" - write "Ensure all tests pass before and after"

For multi-step tasks, the principle asks Claude to state a brief plan with verification steps before touching any code. Strong success criteria let the model loop independently. Weak criteria require constant interruption.

How to Install It

You have two options:

Option A: Claude Code Plugin (recommended - works across all projects)

From within Claude Code:

/plugin marketplace add forrestchang/andrej-karpathy-skills
/plugin install andrej-karpathy-skills@karpathy-skills

Option B: Per-project CLAUDE.md

For a new project:

curl -o CLAUDE.md https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md

To append to an existing CLAUDE.md:

echo "" >> CLAUDE.md
curl https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md >> CLAUDE.md

It also works with Cursor via .cursor/rules/karpathy-guidelines.mdc.

How to Know It's Working

You'll see the difference in your diffs:

  • Fewer unnecessary changes - only requested edits appear
  • Claude asks clarifying questions before implementing, not after mistakes
  • Code is simple the first time, not after you push back
  • PRs are clean and minimal with no drive-by refactoring

One caveat from the project itself: these guidelines bias toward caution over speed. For trivial changes (a typo fix, an obvious one-liner), you don't need the full rigor. The value is on non-trivial, multi-file work where silent wrong assumptions compound fast.

What the 4 Rules Don't Cover

Karpathy's rules were written in January 2026, when the dominant Claude Code workflow was single-session, one task at a time. The ecosystem has shifted. If you are running multi-step agents, working across multiple files in one session, or using hooks and sub-agents, you will hit failure modes the original 4 rules don't address.

The community has identified four gaps that matter most in May 2026:

Gap 1: No token budget rule

Karpathy's rules are silent on session length. Without a budget boundary, a debugging loop can run 90 minutes, gradually losing track of fixes it already tried. By the time you notice, the model is re-suggesting changes you rejected 40 messages earlier. A per-task budget (roughly 4,000 tokens) and a per-session cap (roughly 30,000 tokens) prevent this. If a task is approaching the limit, summarize and start fresh rather than pushing through.

Gap 2: No checkpoint rule for multi-step tasks

Goal-Driven Execution tells Claude to define success criteria and loop. It doesn't tell Claude to stop and report progress between steps. On a 6-step refactor, if step 4 goes wrong and there is no checkpoint, Claude will complete steps 5 and 6 on top of a broken state. Untangling takes longer than redoing the whole thing. Adding a rule to summarize what was done, what is verified, and what remains after each significant step catches this early.

Gap 3: No "read before you write" rule

"Surgical Changes" tells Claude not to touch adjacent code. It doesn't tell Claude to understand adjacent code before writing new code next to it. The result: Claude adds a function next to an existing identical function it never read, and the new one silently takes precedence because of import order. Before writing in a file, Claude should read exports, immediate callers, and obvious shared utilities first.

Gap 4: No "fail loud" rule

The most expensive Claude failures look like success. "Migration completed" when 14% of records were silently skipped. "Tests pass" when three tests were quietly excluded. "Feature works" when the edge case you specifically asked about was never verified. Adding an explicit rule to surface uncertainty - rather than defaulting to confident completions - catches silent failures before they reach production.

These four gaps are additive, not replacements. Karpathy's 4 rules remain the foundation. The community additions cover the agent-orchestration failure modes that didn't exist when the original template was written.

Why This Matters

The andrej-karpathy-skills repo has 161k GitHub stars and 16.5k forks - numbers that reflect genuine word-of-mouth adoption, not hype.

Most builders have already hit these exact failure modes and have been patching them manually, one frustrated debugging session at a time. A single CLAUDE.md file won't make Claude Code perfect, but it systematically targets the four behaviors that waste the most time.

Karpathy's framing is the right one: AI coding is now about giving success criteria and watching the model loop, not about writing the code yourself. The builders who internalize that mental model first are the ones who get the leverage.

Want to go deeper on Claude Code workflows? Join AI Builder Club - weekly workshops on tools like this, plus a community of 1,200+ builders shipping AI products.

Frequently Asked Questions

What is CLAUDE.md and why does it matter?

CLAUDE.md is a project-level instruction file that Claude Code reads at the start of every session. It acts as a persistent system prompt for your codebase — telling Claude how to behave before you type a single word. A well-written CLAUDE.md prevents the most common failure modes: wrong assumptions, overcomplicated code, touching unrelated files, and weak execution criteria.

What are Karpathy's four CLAUDE.md principles?

The four principles are: (1) Think Before Coding — state assumptions and surface confusion before implementing, (2) Simplicity First — no features, abstractions, or configs beyond what was asked, (3) Surgical Changes — touch only what the request requires, match existing style, (4) Goal-Driven Execution — give success criteria instead of step-by-step instructions so the model can loop independently.

How do I install the Karpathy CLAUDE.md rules?

Two options: (1) As a Claude Code plugin across all projects — run /plugin marketplace add forrestchang/andrej-karpathy-skills then /plugin install andrej-karpathy-skills@karpathy-skills, or (2) Per-project — run curl -o CLAUDE.md https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md in your project root. It also works with Cursor via .cursor/rules/karpathy-guidelines.mdc.

Does CLAUDE.md work with Cursor or just Claude Code?

The principles work with both. Claude Code reads CLAUDE.md natively from your project root. For Cursor, save the same content as .cursor/rules/karpathy-guidelines.mdc and Cursor will load it as a rule file for every session.

Continue Learning

Get the free AI Builder Newsletter

Weekly deep-dives on AI tools, automation workflows, and builder strategies. Join 5,000+ readers.

No spam. Unsubscribe anytime.

Go deeper with AI Builder Club

Join 1,000+ ambitious professionals and builders learning to use AI at work.

  • Expert-led courses on Cursor, MCP, AI agents, and more
  • Weekly live workshops with industry builders
  • Private community for feedback, collaboration, and accountability