Part 2 of a 6-part series on configuring AI coding assistants for large codebases
Imagine you have a Dropwizard service that’s been around for three years. It started as a simple order management system. Then user authentication got bolted on. Then inventory tracking. Then billing and invoicing. Now it’s 150K lines of Java across 40+ packages, and the instruction file at the root is 400 lines long because it covers every convention for every module.
You ask the AI to write a new JDBI DAO in the orders package. It generates the DAO with audit logging decorators. That’s a billing module pattern. Order DAOs don’t do that. But the AI read the billing rules and applied them anyway, because they were sitting right there in the same file.
The fix isn’t better instructions. It’s fewer instructions, loaded at the right time.
Why extra context hurts
This is easy to overlook. You’d think extra instructions are harmless. Worst case, the AI ignores the ones that don’t apply. Right?
Wrong. LLMs don’t have an “ignore irrelevant” switch. Every instruction competes for attention. When your order DAO conventions sit next to your billing audit conventions, the model has to figure out which ones apply. Sometimes it picks wrong.
Think of it like a restaurant menu. A focused 20-item menu helps you decide fast. A 200-item menu with Italian, Thai, Mexican, and sushi all mixed together? You spend more time filtering than choosing. And you’re more likely to order something weird.
Same thing happens with AI context. A focused instruction set for the current task produces better output than a comprehensive instruction set for the whole project.
Root-level instructions
The root instruction file should contain only what’s true everywhere. No exceptions.
For a large Dropwizard service, that looks like this:
```markdown
# OrderPlatform

Java 17 / Dropwizard 2.1 / Jersey / JDBI3 / PostgreSQL. NOT Spring Boot. Do not use Spring annotations.

## Commands
- Build: `mvn clean package -DskipTests`
- Test: `mvn test`
- Run: `java -jar target/order-platform-1.0.jar server config.yml`
- Lint: `mvn checkstyle:check`

## Top-level package map
- c.c.platform.resources: Jersey REST endpoints
- c.c.platform.orders: Order lifecycle and fulfillment
- c.c.platform.users: Authentication and user profiles
- c.c.platform.inventory: Stock tracking and warehouse sync
- c.c.platform.billing: Invoicing, audit trails, revenue reporting
- c.c.platform.notifications: Email, SMS, push via async workers
- c.c.platform.common: Shared utilities, base classes, constants

## Universal rules
- Jersey annotations (javax.ws.rs), never Spring MVC
- Constructor injection via HK2. No field injection
- All configs extend io.dropwizard.Configuration
- Logging via SLF4J. Never System.out or java.util.logging
```

That's about 25 lines. It tells the AI what this project is, how to build it, and what's true in every package. The package map is short but critical. It prevents the AI from confusing which package owns which concern.
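To make the constructor-injection rule concrete, here's a minimal sketch in plain Java. The class names are hypothetical, and real Dropwizard code would add HK2's @Inject annotation (from javax.inject) on the constructor:

```java
// Hypothetical sketch of the "constructor injection, no field injection" rule.
// Real HK2 wiring would annotate the constructor with @Inject.
import java.util.Objects;

class OrderService {
    String findStatus(String orderId) {
        return "CREATED"; // stand-in for a real lookup
    }
}

class OrderResource {
    private final OrderService orderService; // final field, set once in the constructor

    OrderResource(OrderService orderService) {
        // The dependency arrives fully constructed; nothing is injected into fields later
        this.orderService = Objects.requireNonNull(orderService);
    }

    String getStatus(String orderId) {
        return orderService.findStatus(orderId);
    }
}
```

The final field makes the dependency explicit and the class trivially testable with a hand-built collaborator, which is exactly why the rule bans field injection.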
Everything else goes in scoped files.
Directory-scoped instructions
Each major package gets its own instruction file. It loads only when the AI operates in that subtree. This is the key mechanism.
Here’s src/main/java/com/company/platform/orders/CLAUDE.md:
```markdown
# Orders Module

Manages order lifecycle: creation, payment capture, fulfillment, cancellation.

## Layer structure
- /resource: Jersey endpoints. Thin. Validate input, call service, return Response
- /service: Business logic. Transaction boundaries live here
- /dao: JDBI3 DAOs. SQL queries only, no business logic
- /mapper: JDBI RowMapper implementations
- /model: Order DTOs and domain objects

## Conventions
- DAOs return Optional<T> for single-row lookups. Never return null
- All money amounts use BigDecimal. Never double or float
- Order state transitions go through OrderStateMachine. Never set status directly
- @Transaction belongs on service methods, not on DAOs

## Boundaries
- Orders module never sends notifications directly
- Order events go to Kafka topic `order.events`. Notifications module consumes them
- Orders never reads from inventory tables. Calls InventoryService instead
```

And here's src/main/java/com/company/platform/billing/CLAUDE.md:
```markdown
# Billing Module

Invoicing, audit trails, and revenue reporting. Every financial mutation gets an audit record.

## Conventions
- All audit events go through AuditService.record(). Never write to audit tables directly
- Audit entries are append-only. Never update or delete audit records
- Every auditable action needs AuditContext: actor, action, entity, timestamp, before/after state
- Revenue reports use read replicas. Never query the primary database for reporting

## Boundaries
- Other modules call AuditService. This module never calls other modules
- Audit decorators exist for billing resources only. Do not add them to order or inventory resources
```

And here's src/main/java/com/company/platform/notifications/CLAUDE.md:
```markdown
# Notifications Module

Async email, SMS, and push notifications. Consumes events from Kafka. Stateless.

## Architecture
- /consumer: Kafka consumer classes. One consumer per topic
- /handler: Message handling logic. Must be idempotent. Messages can arrive more than once
- /client: HTTP clients for external providers (SendGrid, Twilio, Firebase)

## Conventions
- Consumers use manual offset commit. Never auto-commit
- HTTP clients use Dropwizard's Jersey client. Not OkHttp. Not Apache HttpClient
- All external calls must have circuit breakers via Resilience4j
- Never block the consumer thread on slow external calls. Use a bounded executor pool

## Boundaries
- This module has NO direct database access. Do not create DAOs here
- Consumes from Kafka topics only. Never imports from orders, billing, or inventory directly
```

See what happens? The orders file talks about JDBI, state machines, and BigDecimal. The billing file talks about audit trails and read replicas. The notifications file talks about Kafka, idempotency, and circuit breakers. They share almost nothing.
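One of those conventions is worth making concrete: the notifications rule that handlers must be idempotent. A minimal sketch, with a hypothetical EmailHandler and an in-memory dedupe set (real code would persist processed IDs somewhere durable):

```java
// Hypothetical sketch of an idempotent message handler: Kafka can redeliver
// a message, so processing the same message ID twice must be a no-op.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class EmailHandler {
    private final Set<String> processedIds = new HashSet<>(); // illustrative; real code persists this
    private final List<String> sent = new ArrayList<>();

    void handle(String messageId, String body) {
        if (!processedIds.add(messageId)) {
            return; // duplicate delivery: already handled, do nothing
        }
        sent.add(body); // stand-in for the external provider call
    }

    List<String> sent() {
        return sent;
    }
}
```

Making redelivery a no-op is what lets consumers safely use manual offset commit: a crash between processing and commit just means one harmless replay.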
If all three loaded at once, the AI would see “use audit decorators” next to “do not add audit decorators to order resources.” That’s a contradiction that only makes sense if you know which package you’re in. Scoping removes the ambiguity entirely.
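To see why a scoped rule like "state transitions go through OrderStateMachine" is worth writing down, here's a rough sketch of what such a class might look like. The states and allowed transitions are hypothetical:

```java
// Hypothetical sketch of an OrderStateMachine: every status change is
// validated here, so callers never set an order's status directly.
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

enum OrderState { CREATED, PAID, FULFILLED, CANCELLED }

class OrderStateMachine {
    private static final Map<OrderState, Set<OrderState>> ALLOWED = new EnumMap<>(OrderState.class);
    static {
        ALLOWED.put(OrderState.CREATED, EnumSet.of(OrderState.PAID, OrderState.CANCELLED));
        ALLOWED.put(OrderState.PAID, EnumSet.of(OrderState.FULFILLED, OrderState.CANCELLED));
        ALLOWED.put(OrderState.FULFILLED, EnumSet.noneOf(OrderState.class)); // terminal
        ALLOWED.put(OrderState.CANCELLED, EnumSet.noneOf(OrderState.class)); // terminal
    }

    OrderState transition(OrderState from, OrderState to) {
        if (!ALLOWED.get(from).contains(to)) {
            throw new IllegalStateException(from + " -> " + to + " is not a valid transition");
        }
        return to;
    }
}
```

An AI that has read the orders file routes status changes through this one method; an AI that has also read unrelated billing rules is more likely to improvise.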
Pointers, not content
Even scoped instruction files have a budget. You can’t dump your entire invoicing spec into the billing module file.
The trick: point to documentation instead of embedding it.
```markdown
## Key references
- Database schema: see db/migrations/latest.sql
- API contracts: see resources/openapi.yaml
- Invoice generation rules: see docs/invoicing-rules.md
- Order state machine diagram: see docs/order-states.md
```

This is like the difference between carrying a reference manual and carrying an index card that says which shelf to check. The AI reads docs/invoicing-rules.md only when it's working on billing. The rest of the time, that doc consumes zero context budget.
Four pointer lines replace 200 lines of inline documentation.
Glob-based activation
Sometimes rules apply to a file type, not a directory.
All three major tools support some form of glob-based activation. The examples below use Cursor’s .mdc format. Claude Code uses glob patterns in CLAUDE.md frontmatter. GitHub Copilot supports file-matching in .github/copilot-instructions.md. This is useful for cross-cutting concerns that span packages.
DAO files across the entire project:
```
---
description: "JDBI DAO conventions"
globs: "**/dao/**/*.java"
alwaysApply: false
---

All DAOs must:
- Extend BaseDaoImpl<T>
- Use @RegisterRowMapper with a dedicated mapper class
- Return Optional<T> for single-row lookups
- Use @SqlQuery and @SqlUpdate, never raw Handle operations
```

Test files:

```
---
description: "Test file conventions"
globs: "**/*Test.java, **/*IT.java"
alwaysApply: false
---

- Unit tests (*Test.java): JUnit 5 + Mockito. No Spring context. No database
- Integration tests (*IT.java): DropwizardAppRule + H2 in-memory database
- Assertions use AssertJ. assertThat(result).isEqualTo(expected)
- One assert concept per test method
```

These activate only when the AI edits a matching file. When it's editing a resource or a service class, the test rules stay out of the context window.
Keep glob patterns simple. `**/*Test.java` is clear. `**/src/{main,test}/**/!(Abstract)*.java` is a nightmare to maintain.
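One way to sanity-check a pattern before committing it is the JDK's built-in glob matcher. Its `**` semantics are close to, but not identical with, the glob libraries editor tools use (for example, the JDK's `**/x` requires at least one directory level), which is one more reason to keep patterns simple. Paths below are illustrative:

```java
// Quick glob sanity check using the JDK's PathMatcher. Note: editor tools'
// glob libraries may differ in edge cases (e.g. whether ** matches zero
// segments), so treat this as an approximation.
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

class GlobCheck {
    static boolean matches(String glob, String path) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Path.of(path));
    }
}
```

Feed it a few real paths from your repo and a few near-misses; if you can't predict the results, the pattern is too clever.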
Multi-repo scoping
Not every team has one big repo. Maybe your order service, notification service, and billing service live in separate repositories. The scoping story is simpler here. Each repo gets its own root instruction file. No cross-contamination by default.
The challenge shifts to shared context. If all three services talk to each other over REST or Kafka, each repo’s instruction file should document its external contracts.
```markdown
## External dependencies
- Calls inventory service at https://inventory.internal/api/v2 (see docs/inventory-api-contract.md)
- Publishes to Kafka topic order.events (schema: docs/order-event-schema.avsc)
- Consumed by: notification-service, billing-service
```

When the AI generates code that calls the inventory service, it knows the contract. It doesn't hallucinate endpoints or invent request schemas.
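On the producing side, mirroring the documented event schema as a typed value helps keep generated code honest about the contract. A sketch with hypothetical field names (the real field list would come from the referenced .avsc file):

```java
// Hypothetical typed mirror of the order.events payload. Field names are
// illustrative; the authoritative schema is the documented Avro file.
import java.math.BigDecimal;
import java.time.Instant;

record OrderEvent(String orderId, String eventType, Instant occurredAt, BigDecimal totalAmount) {
    OrderEvent {
        // Validate the contract at construction time, not at publish time
        if (orderId == null || orderId.isBlank()) {
            throw new IllegalArgumentException("orderId is required");
        }
        if (eventType == null || eventType.isBlank()) {
            throw new IllegalArgumentException("eventType is required");
        }
    }
}
```

A typed payload gives the AI (and the compiler) something concrete to check generated producer code against, instead of free-form maps.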
Personal overrides
Every engineer has preferences that don’t need standardizing.
Some want the AI to explain its reasoning. Others want code only. Some want test suggestions alongside every change. Others write tests themselves.
These go in gitignored or user-local files. In Claude Code, that's CLAUDE.local.md. In Cursor, it's User Rules, stored per-user in the editor rather than the repo. In Copilot, personal preferences live in VS Code user settings; the committed .github/copilot-instructions.md belongs to the team, not to individuals.
```markdown
# CLAUDE.local.md (gitignored)
- When writing new code, always suggest corresponding test cases
- Prefer verbose variable names over short ones
- Explain what changed and why when modifying existing code
```

One rule: personal overrides never contradict team standards. If the team says "use AssertJ," a personal file shouldn't say "use Hamcrest."
Testing your hierarchy
Most teams set up a hierarchy and never verify it works. Quick way to check: open your AI assistant in a specific package and ask “what conventions should I follow when writing code here?”
If it mentions audit decorators when you’re in the orders package, your scoping is broken.
In Claude Code, run /memory to see which CLAUDE.md files are loaded. In Cursor, check the active rules panel. In Copilot, the instructions panel shows what’s applied.
Takes two minutes. Do it once per major package after setup.
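For intuition about what "loaded" means here, a purely illustrative model of directory-scoped resolution: every instruction file from the repo root down to the working directory applies. This is not any tool's actual implementation, just a sketch of the stacking behavior:

```java
// Illustrative only: collects every CLAUDE.md on the path from the repo root
// down to the working directory, root-most first. Real tools have their own
// resolution logic; this just models the directory-scoping idea.
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class InstructionResolver {
    static List<Path> applicableFiles(Path repoRoot, Path workingDir) {
        Path root = repoRoot.toAbsolutePath().normalize();
        Path current = workingDir.toAbsolutePath().normalize();
        List<Path> chain = new ArrayList<>();
        // Walk upward from the working directory to the repo root
        while (current != null && current.startsWith(root)) {
            chain.add(0, current); // prepend so the root comes first
            if (current.equals(root)) break;
            current = current.getParent();
        }
        List<Path> result = new ArrayList<>();
        for (Path dir : chain) {
            Path candidate = dir.resolve("CLAUDE.md");
            if (Files.exists(candidate)) result.add(candidate);
        }
        return result;
    }
}
```

Running a mental version of this walk for each major package is exactly the two-minute check: you should be able to predict which files stack up for any directory you work in.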
Relevance over volume
Here’s what the full hierarchy looks like for a large Dropwizard service:
```
CLAUDE.md                                   <- Universal: tech stack, build, package map (25 lines)
CLAUDE.local.md                             <- Personal: gitignored

src/.../platform/orders/CLAUDE.md           <- State machine, JDBI, BigDecimal rules
src/.../platform/billing/CLAUDE.md          <- Audit trails, read replicas, append-only
src/.../platform/inventory/CLAUDE.md        <- Warehouse sync, stock locking conventions
src/.../platform/notifications/CLAUDE.md    <- Kafka consumers, circuit breakers, idempotency
src/.../platform/users/CLAUDE.md            <- Auth flows, password hashing, session handling

.cursor/rules/dao-conventions.mdc           <- Activates for **/dao/**/*.java
.cursor/rules/test-conventions.mdc          <- Activates for **/*Test.java
.cursor/rules/migration-conventions.mdc     <- Activates for **/migrations/**/*.sql
```

When you work on an order DAO, the AI sees: root rules + orders rules + DAO conventions. Maybe 80 lines total. No audit decorator rules. No Kafka consumer patterns. No test conventions.
One practitioner reported reducing their instruction content from 47,000 words to 9,000 words using this approach. An 80% reduction. AI output quality went up, not down.
Less context. More relevant context. Better results.
Next up: Post 3: AI Writes Code That Compiles but Breaks Your Architecture, on documenting module boundaries, dependency directions, and design decisions so the AI stops guessing.