You’re adding a new REST endpoint to a Dropwizard service. Your AI assistant gives you a Spring Boot controller. Wrong framework. Wrong annotations. Wrong everything.
The code might even compile in isolation. It still would not fit your service, and the AI would have no idea unless you told it what kind of codebase it was working in.
That’s the problem this post is about.
What instruction files do
Every major AI coding assistant supports an instruction file. It’s a markdown file in your repo that tells the AI how to behave. Claude Code calls it CLAUDE.md. GitHub Copilot uses .github/copilot-instructions.md. Cursor uses .cursor/rules/. Different names, same idea.
The instruction file loads into every AI session before any code gets touched. It’s like onboarding a new hire. When a senior engineer joins your team, someone walks them through the architecture. They explain why Jersey was chosen over Spring. They warn about the legacy Hibernate entities that shouldn’t be touched. They point out which modules own which concerns.
Your instruction file is that walkthrough. The difference is that this new hire remembers what you tell it only within the current session and starts every new one with no context.
That constraint is where most teams get it wrong.
How many instructions can AI actually follow?
Your instruction file shares the context window with everything else. The code the AI reads. The conversation you’re having. The tool outputs it generates. Every line in your instruction file is a line that can’t be used for code context.
IFScale benchmarks measured this directly: frontier models maintain 95%+ accuracy on up to about 100 instructions, but performance degrades steeply beyond that, dropping to 73–85% at 250 instructions and below 70% at 500. The tool’s own system prompt already consumes 30–50 instruction slots. You’re left with maybe 100–150 effective slots for your rules. Each discrete rule or constraint counts as one instruction, regardless of how many lines it takes to express.
Here’s the counterintuitive part. Adding more instructions makes the AI worse at following any of them. A 500-line file doesn’t produce 5x better output than a 100-line file. It usually produces worse output. This paper explains why: even if a model follows each individual instruction with 98% reliability, the chance of following all 100 simultaneously drops fast.
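The compounding effect is easy to check with a few lines of arithmetic. The sketch below assumes each instruction is followed independently of the others, which is a simplification (real failures are likely correlated); the class and method names are illustrative only.

```java
final class ComplianceOdds {
    // Chance that every one of `count` instructions is followed, assuming each
    // is followed independently with probability `perInstruction`.
    static double allFollowed(double perInstruction, int count) {
        return Math.pow(perInstruction, count);
    }

    public static void main(String[] args) {
        // Print how the joint probability shrinks as the rule count grows.
        for (int count : new int[] {10, 50, 100, 250}) {
            System.out.printf("%d instructions: %.1f%% chance all are followed%n",
                    count, 100 * allFollowed(0.98, count));
        }
    }
}
```

Under that independence assumption, 98% per-instruction reliability leaves roughly a 13% chance of following all 100 instructions at once.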
In practice, a root-level instruction file usually works best when it stays well under 200 lines. That’s not a universal hard limit. It’s a rule of thumb based on benchmark degradation and the point where these files usually start carrying too much.
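One way to keep an eye on that rule of thumb is a small check you could run locally or in CI. This is a rough sketch, not a real tool: the 200-line ceiling comes from the paragraph above, counting markdown bullets as "rules" is only a crude proxy for discrete instructions, and the class name and default file name are assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class InstructionBudget {
    // Rule-of-thumb ceiling from the text; an assumption, not a hard limit.
    static final int MAX_LINES = 200;

    static long nonBlankLines(List<String> lines) {
        return lines.stream().filter(line -> !line.isBlank()).count();
    }

    // Crude proxy for discrete rules: each markdown bullet counts as one rule.
    static long bulletRules(List<String> lines) {
        return lines.stream()
                .map(String::strip)
                .filter(line -> line.startsWith("- ") || line.startsWith("* "))
                .count();
    }

    static boolean overBudget(List<String> lines) {
        return nonBlankLines(lines) > MAX_LINES;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to CLAUDE.md in the current directory; pass another path as args[0].
        Path file = Path.of(args.length > 0 ? args[0] : "CLAUDE.md");
        List<String> lines = Files.readAllLines(file);
        System.out.printf("%s: %d non-blank lines, about %d bullet rules%n",
                file, nonBlankLines(lines), bulletRules(lines));
        if (overBudget(lines)) {
            System.err.println("Over the " + MAX_LINES
                    + "-line rule of thumb; consider trimming or scoping.");
            System.exit(1);
        }
    }
}
```

A failing check is a prompt to review the file, not proof that it is too long; a dense 150-line file can carry more rules than a sparse 250-line one.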
What to include
The litmus test: would a senior engineer need to know this on day one, and is it something they could not figure out from the code alone?
That is the real job of an instruction file. It should capture things a senior engineer on the team knows, but the code cannot teach by itself.
Here’s what a good instruction file looks like for a Dropwizard service:
# PaymentService

Java 17 / Dropwizard 2.1 / Jersey / JDBI3 / PostgreSQL.
NOT Spring Boot. Do not use Spring annotations.

## Commands
- Build: `mvn clean package -DskipTests`
- Test: `mvn test`
- Run locally: `java -jar target/payment-service-1.0.jar server config.yml`
- Lint: `mvn checkstyle:check`

## Architecture
- /resources: Jersey REST resources (NOT controllers)
- /service: Business logic layer
- /db: JDBI3 DAOs and mappers
- /model: Request/response DTOs and domain objects
- /health: Dropwizard health checks

## Conventions
- Use `@Path`, `@GET`, `@POST` from javax.ws.rs, NEVER Spring MVC
- Constructor injection via HK2, not field injection
- All DB access through JDBI DAOs, never raw JDBC
- Return `Response` objects from resources, not raw entities
- Config via Dropwizard YAML config classes, not @Value or application.properties

## Boundaries
- /resources never calls DB directly. Always goes through /service
- /service never returns Jersey Response objects. Returns domain objects
- /db never contains business logic. Only queries and mappings

That’s about 30 lines. It prevents the top mistakes an AI will make in a Dropwizard codebase.
If you’re on Dropwizard 4.x, swap javax.ws.rs for jakarta.ws.rs.
What to leave out
Anything your tooling already catches. If Checkstyle enforces import ordering, don’t spend instruction space telling the AI about it.
Also dangerous: aspirational rules that don’t match reality. If your file says “use JDBI3 for all database access” but half your codebase still uses raw Hibernate, you’ve created a contradiction. The AI sees both the instruction and the Hibernate code. It has to choose. Describe the codebase you have, not the one you wish you had. (Post 5 covers how to handle migrations in progress.)
Bad vs good instructions: examples
Abstract rules waste your budget. Concrete rules save it.
Bad, vague and unactionable:
Handle errors properly in all resources.

The AI has no idea what “properly” means in your codebase.

Good, specific, shows the exact pattern:

## Error handling in resources
Use ExceptionMapper classes. Never catch and swallow exceptions in resources.

// Correct:
throw new WebApplicationException("User not found", Response.Status.NOT_FOUND);

// Wrong:
try { ... } catch (Exception e) { return Response.ok().build(); }

Bad, duplicates what tooling enforces:

Use 4-space indentation. No wildcard imports. Max line length 120.

Checkstyle already handles this. Three wasted instruction slots.

Good, tells the AI what it can’t infer from code:

PaymentResource and RefundResource share no code.
They look similar but serve different compliance flows.
Never extract a "common" base class between them.

An AI looking at two similar resources will often try to DRY them up. Only someone who knows the compliance context understands why they should stay separate.
Three layers of scoping
A single instruction file stops working as the codebase grows. You need layers.
Universal rules go in the root file. Tech stack. Build commands. Error handling philosophy. Stuff that’s true regardless of which directory you’re working in.
Scoped rules go in subdirectory files. Your payment service has JDBI conventions. Your notification service uses async messaging patterns. Your shared client library has strict API surface rules. These load only when the AI works in that directory.
Here’s what that looks like in a monorepo:
CLAUDE.md                          <- Universal (always loaded)
services/payment/CLAUDE.md         <- JDBI, compliance rules
services/notification/CLAUDE.md    <- Kafka, async patterns
libs/common-client/CLAUDE.md       <- Strict export rules

When the AI works in services/payment/, it sees the root file plus the payment file. It never sees the notification rules. This matters. The notification service uses @Async patterns that would be wrong in the synchronous payment flow. Keeping them separate prevents cross-contamination.
Personal rules go in gitignored files. Some engineers want verbose explanations. Others want terse output. These preferences shouldn’t affect the team.
This layering is how you stay under the instruction budget in a large codebase. A 200K-line monorepo might need 500 lines of total instructions. But any given task should only see 80–100 of them.
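The root-plus-subdirectory layering can be modelled as a walk from the repo root down to the directory being worked in. The sketch below only illustrates that idea; it is not how any particular assistant actually discovers its files, and the class name, method name, and default paths are assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class InstructionLayers {
    // Lists every location where an instruction file could apply to workingDir,
    // ordered from the repo root down to workingDir itself.
    static List<Path> candidatePaths(Path repoRoot, Path workingDir, String fileName) {
        Path root = repoRoot.normalize();
        Path dir = workingDir.normalize();
        if (!dir.startsWith(root)) {
            throw new IllegalArgumentException(dir + " is not inside " + root);
        }
        List<Path> candidates = new ArrayList<>();
        Path current = root;
        candidates.add(current.resolve(fileName));
        if (!dir.equals(root)) {
            // Descend one directory at a time, adding a candidate at each level.
            for (Path segment : root.relativize(dir)) {
                current = current.resolve(segment);
                candidates.add(current.resolve(fileName));
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // Treats the current directory as the repo root (an assumption for the demo).
        Path root = Path.of("").toAbsolutePath();
        Path dir = root.resolve(args.length > 0 ? args[0] : "services/payment");
        for (Path candidate : candidatePaths(root, dir, "CLAUDE.md")) {
            if (Files.isRegularFile(candidate)) {
                System.out.println("would load: " + root.relativize(candidate));
            }
        }
    }
}
```

For services/payment, the candidates are the root file, an optional services/CLAUDE.md, and the payment file. Sibling directories such as services/notification never appear in the list, which is the isolation the layering relies on.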
Treat it like production config
Once you start relying on these files, they stop being a side note. They become part of how engineering work gets done.
That means they deserve the same habits you apply to other high-leverage configuration. Version-control them. Review changes in PRs. Assign CODEOWNERS.
Why? Because a casual change here can quietly distort every AI-assisted change that follows.
Say someone adds “prefer abstract base classes for shared resource behavior” after seeing two similar endpoints. It sounds reasonable. Nobody reviews it closely. A week later, the AI starts introducing inheritance into places where your team had intentionally kept flows separate for compliance and auditability. Nothing is broken at compile time, but the assistant is now pushing the codebase in the wrong direction on every related task.
That is the real blast radius. A bad instruction file doesn’t fail once. It keeps repeating the same mistake until someone notices and fixes the source.
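On GitHub, the ownership habit mentioned above might look something like the CODEOWNERS fragment below. The team handles are hypothetical placeholders, and the file paths assume the monorepo layout used earlier in this post.

```
# Fallback owner for any CLAUDE.md in the repo (hypothetical team handle)
CLAUDE.md                          @example-org/platform-leads

# More specific entries go last, because the last matching pattern wins
/services/payment/CLAUDE.md        @example-org/payments-team
/.github/copilot-instructions.md   @example-org/platform-leads
```

With required reviews enabled, a change to any of these files then needs sign-off from the listed owners before it can merge.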
The feedback loop
The best maintenance pattern I’ve seen works like this. The AI makes a mistake. Someone figures out why. The fix gets added to the instruction file.
For example:
- AI generates a @Controller class → add “NOT Spring Boot. Use Jersey @Path resources.”
- AI uses EntityManager → add “Use JDBI3 DAOs, never JPA/Hibernate.”
- AI puts business logic in a DAO → add “DAOs contain only queries. Business logic lives in /service.”
Over time, the file becomes a curated list of your codebase’s specific gotchas. Every rule traces back to a real failure. Nothing speculative. Nothing aspirational. Just hard-won lessons.
A simple test for every line
By the end, the goal is not to have a comprehensive file. It is to have a useful one.
Before you add a line, ask three questions:
- Would a senior engineer on this team say this to a new hire on day one?
- Is this something the AI cannot learn reliably from the code or tooling without being told?
- If the AI ignores this line, is the resulting mistake expensive enough to matter?
If the answer is not clearly yes, the line probably does not belong in the file.
That’s the mindset shift. These are not generic prompts. They are compressed operating instructions for your codebase.
A tight 80-line instruction file will usually outperform a 500-line one. Every line should earn its place.
Next up: Post 2, Why Your AI Coding Assistant Ignores Half Your Instructions, on how to structure instruction files so the AI only sees what’s relevant to the current task.