How to Write a CLAUDE.md (or Cursor Rules) That Actually Works

7 min read

Part 1 of a 6-part series on configuring AI coding assistants for large codebases


Last month, I asked my AI assistant to add a new REST endpoint to our Dropwizard service. It generated a Spring Boot controller. Wrong framework. Wrong annotations. Wrong everything.

The code compiled in isolation. It would have blown up at runtime. And the AI had no idea, because nobody told it we use Dropwizard.

That’s the problem this post is about.

What instruction files do

Every major AI coding assistant supports an instruction file. It’s a markdown file in your repo that tells the AI how to behave. Claude Code calls it CLAUDE.md. GitHub Copilot uses .github/copilot-instructions.md. Cursor uses .cursor/rules/. Different names, same idea.

The instruction file loads into every AI session before any code gets touched. It’s like onboarding a new hire. When a senior engineer joins your team, someone walks them through the architecture. They explain why Jersey was chosen over Spring. They warn about the legacy Hibernate entities that shouldn’t be touched. They point out which modules own which concerns.

Your instruction file is that walkthrough. Except the new hire has perfect recall but zero institutional memory (it remembers everything you tell it this session, but nothing from yesterday), and a hard limit on how much it can absorb in one sitting.

That last part is where most teams get it wrong.

How many instructions can AI actually follow?

Your instruction file shares the context window with everything else. The code the AI reads. The conversation you’re having. The tool outputs it generates. Every line in your instruction file is a line that can’t be used for code context.

IFScale benchmarks measured this directly: frontier models maintain 95%+ accuracy on up to about 100 instructions, but performance degrades steeply beyond that, dropping to 73–85% at 250 instructions and below 70% at 500. The tool’s own system prompt already consumes 30–50 instruction slots. You’re left with maybe 100–150 effective slots for your rules. Each discrete rule or constraint counts as one instruction, regardless of how many lines it takes to express.

Here’s the counterintuitive part. Adding more instructions makes the AI worse at following any of them. A 500-line file doesn’t produce 5x better output than a 100-line file. It produces worse output. The math is brutal: if a model follows each individual instruction with 98% reliability, its chance of following all 100 simultaneously drops to ~13%. Every rule you add reduces compliance with every other rule.
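The compounding is easy to check yourself; a quick sketch of the arithmetic (the 98% per-instruction figure is the illustrative number from above, not a measured constant):

```python
# Probability of following ALL n instructions when each is followed
# independently with per-instruction reliability p.
def all_followed(p: float, n: int) -> float:
    return p ** n

print(f"{all_followed(0.98, 100):.3f}")  # ~0.133, i.e. roughly 13%
print(f"{all_followed(0.98, 30):.3f}")   # a 30-rule file fares much better
```

Independence is a simplifying assumption, but the direction of the effect holds: every added rule taxes compliance with all the others.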

The practical ceiling is about 200 lines for a root-level instruction file. Some of the best ones in open source are under 60.

What to include

The litmus test: would a senior engineer need to know this on day one? And is it something they couldn’t figure out from the code alone?

Here’s what a good instruction file looks like for a Dropwizard service:

```markdown
# PaymentService
Java 17 / Dropwizard 2.1 / Jersey / JDBI3 / PostgreSQL.
NOT Spring Boot. Do not use Spring annotations.

## Commands
- Build: `mvn clean package -DskipTests`
- Test: `mvn test`
- Run locally: `java -jar target/payment-service-1.0.jar server config.yml`
- Lint: `mvn checkstyle:check`

## Architecture
- /resources: Jersey REST resources (NOT controllers)
- /service: Business logic layer
- /db: JDBI3 DAOs and mappers
- /model: Request/response DTOs and domain objects
- /health: Dropwizard health checks

## Conventions
- Use `@Path`, `@GET`, `@POST` from javax.ws.rs, NEVER Spring MVC
- Constructor injection via HK2, not field injection
- All DB access through JDBI DAOs, never raw JDBC
- Return `Response` objects from resources, not raw entities
- Config via Dropwizard YAML config classes, not @Value or application.properties

## Boundaries
- /resources never calls DB directly. Always goes through /service
- /service never returns Jersey Response objects. Returns domain objects
- /db never contains business logic. Only queries and mappings
```

That’s about 30 lines. It prevents the five most common mistakes an AI will make in a Dropwizard codebase; the Spring confusion alone probably saves hours per week. (One version note: Dropwizard 4.x moved from javax.ws.rs to jakarta.ws.rs. Adjust the imports to match your version.)

What to leave out

Anything your tooling already catches. If Checkstyle enforces import ordering, don’t waste a slot telling the AI about it. That’s like printing a “wash your hands” sign in a bathroom with automatic soap dispensers.

Also dangerous: aspirational rules that don’t match reality. If your file says “use JDBI3 for all database access” but half your codebase still uses raw Hibernate, you’ve created a contradiction. The AI sees both the instruction and the Hibernate code. It picks one at random. Describe the codebase you have, not the one you wish you had. (Post 5 covers how to handle migrations in progress.)

Bad vs good instructions: examples

Abstract rules waste your budget. Concrete rules save it.

Bad, vague and unactionable:

```markdown
Handle errors properly in all resources.
```

The AI has no idea what “properly” means in your codebase.

Good, specific, shows the exact pattern:

```markdown
## Error handling in resources
Use ExceptionMapper classes. Never catch and swallow exceptions in resources.

// Correct:
throw new WebApplicationException("User not found", Response.Status.NOT_FOUND);

// Wrong:
try { ... } catch (Exception e) { return Response.ok().build(); }
```

Bad, duplicates what tooling enforces:

```markdown
Use 4-space indentation. No wildcard imports. Max line length 120.
```

Checkstyle already handles this. Three wasted instruction slots.

Good, tells the AI what it can’t infer from code:

```markdown
PaymentResource and RefundResource share no code.
They look similar but serve different compliance flows.
Never extract a "common" base class between them.
```

An AI looking at two similar resources will absolutely try to DRY them up. Only a human who knows the compliance context understands why they should stay separate.

Three layers of scoping

A single instruction file stops working as the codebase grows. You need layers.

Universal rules go in the root file. Tech stack. Build commands. Error handling philosophy. Stuff that’s true regardless of which directory you’re working in.

Scoped rules go in subdirectory files. Your payment service has JDBI conventions. Your notification service uses async messaging patterns. Your shared client library has strict API surface rules. These load only when the AI works in that directory.

Here’s what that looks like in a monorepo:

```
CLAUDE.md                          <- Universal (always loaded)
services/payment/CLAUDE.md         <- JDBI, compliance rules
services/notification/CLAUDE.md    <- Kafka, async patterns
libs/common-client/CLAUDE.md       <- Strict export rules
```

When the AI works in services/payment/, it sees the root file plus the payment file. It never sees the notification rules. This matters. The notification service uses async Kafka patterns that would be wrong in the synchronous payment flow. Keeping them separate prevents cross-contamination.
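A scoped file can stay tiny because the universal rules are already loaded. Here’s what a services/payment/CLAUDE.md might look like (illustrative content, built from the conventions earlier in this post):

```markdown
# services/payment
Scoped rules. Assume the root CLAUDE.md is already in context.

## Data access
- JDBI3 DAOs only. SQL lives in the DAO interfaces, not in /service.

## Compliance
- PaymentResource and RefundResource stay separate. Different compliance flows.
- Never log card numbers or tokens, even at DEBUG.
```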

Personal rules go in gitignored files. Some engineers want verbose explanations. Others want terse output. These preferences shouldn’t affect the team.
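With Claude Code, the conventional spot for these is a CLAUDE.local.md next to the root file, added to .gitignore so it never reaches teammates. The preferences below are just examples:

```markdown
# CLAUDE.local.md (gitignored, personal)
- Keep output terse. Skip summaries of what you changed.
- Explain your plan before any multi-file change.
```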

This layering is how you stay under the instruction budget in a large codebase. A 200K-line monorepo might need 500 lines of total instructions. But any given task should only see 80–100 of them.

Treat it like production config

Most teams create an instruction file once. They dump everything in. They never touch it again.

This is a mistake. A bad instruction file degrades every AI interaction for every developer on the team. It’s production configuration with a blast radius bigger than most config changes you’ll make.

Version-control it. Review changes in PRs. Assign CODEOWNERS. This isn’t overkill. It’s proportional to the impact.

The feedback loop

The best maintenance pattern I’ve seen works like this. The AI makes a mistake. Someone figures out why. The fix gets added to the instruction file.

For example:

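An illustrative entry, based on the Spring mix-up from the intro and the legacy-Hibernate warning earlier (the directory name is hypothetical):

```markdown
## Gotchas (each rule traces to a real incident)
- NOT Spring Boot. A generated Spring controller once compiled fine and
  blew up at runtime.
- Legacy Hibernate entities in /db/legacy are frozen. Do not refactor them.
```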
Over time, the file becomes a curated list of your codebase’s specific gotchas. Every rule traces back to a real failure. Nothing speculative. Nothing aspirational. Just hard-won lessons.

The mental model shift

Stop thinking of instruction files as “AI prompts.” Think of them as your team’s engineering standards. Compressed for a reader with perfect recall, zero institutional memory, and a strict page limit.

That framing changes how you write them. You stop being verbose. You stop duplicating what Checkstyle and SpotBugs already catch. You start being precise about the things that actually trip people up. Like “this is Dropwizard, not Spring.”

A tight 80-line instruction file will outperform a 500-line one every time. Every line should earn its place. If it doesn’t make the AI measurably better at working in your codebase, cut it.


Next up: Post 2: Why Your AI Coding Assistant Ignores Half Your Instructions: how to structure instruction files so the AI only sees what’s relevant to the current task.
