The Four Layers of Claude Code: A Mental Model for Engineering Teams
Claude Code is not just an AI chat window in your terminal. It has a four-layer architecture that determines how it behaves. Understanding it changes how you set it up and what you get out of it.
Most engineers who use Claude Code for the first time treat it as a smarter autocomplete or an AI chat window connected to their files. That is enough to get initial value. It is not enough to understand what the tool is actually capable of or how to configure it to work consistently at a team level.
Claude Code has a four-layer architecture. Each layer does something distinct. Each layer builds on the one below it. The teams that get the best results from Claude Code understand all four layers and have invested in each one deliberately. The teams that get mediocre or inconsistent results are usually operating with only the bottom layer, wondering why the tool does not behave more reliably.
The four layers are CLAUDE.md, Skills, Hooks, and Agents. Here is what each one does and why the sequence matters.
Layer 1: CLAUDE.md, Persistent Context and Rules
The foundation of Claude Code's architecture is the CLAUDE.md file. It is the first thing Claude reads when it starts a session in your repository, and it shapes everything that happens afterward.
CLAUDE.md is persistent context: information about your system that you want Claude to have available in every session, without having to repeat it in every prompt. It is also rules: constraints and conventions that should govern every action Claude takes in your codebase.
Without a CLAUDE.md, Claude Code starts each session with no knowledge of your specific system. It has to infer from what it can see locally: the files in the current directory, the patterns in the code near the task. In a small, well-structured repository with consistent patterns throughout, that inference works reasonably well. In a larger, older, or more complex codebase, it produces output that is generically plausible but specifically wrong.
A CLAUDE.md that describes your architecture, your conventions, your key constraints, and your workflow expectations is the difference between Claude treating your codebase as an unknown system and treating it as familiar territory. That difference shows up in every single session.
The CLAUDE.md file hierarchy allows for scoped context. A CLAUDE.md in the repository root applies globally. A CLAUDE.md in a subdirectory applies to work in that subdirectory and overrides or extends the root context. This is the right pattern for large repositories where different modules have different conventions or different levels of technical debt.
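To make this concrete, here is a sketch of what a root CLAUDE.md might contain for a hypothetical monorepo. The directory names, tools, and rules are illustrative, not a prescribed format:

```markdown
# CLAUDE.md

## Architecture
- Monorepo: `api/` (Rails), `web/` (React/TypeScript), `jobs/` (background workers)
- Cross-service calls go through clients in `api/app/clients/`, never direct HTTP

## Conventions
- Ruby style: `.rubocop.yml` is the source of truth
- TypeScript: strict mode; no `any` without an inline justification comment

## Constraints
- Never modify migrations under `db/migrate/` that have already shipped
- Run `bin/test <path>` on any file you change before declaring the task done
```

A subdirectory CLAUDE.md would follow the same shape but describe only what differs in that module.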
Layer 2: Skills, Auto-Invoked Knowledge Packs
The second layer is Skills. A Skill is a markdown file in .claude/skills/<name>/SKILL.md that teaches Claude how to do a specific type of work in your system.
If CLAUDE.md is the foundation that describes what your system is, Skills are the specialised knowledge about how to do specific things within it. A testing Skill describes your testing patterns, your factory setup, your coverage expectations, and your convention for test file organisation. A code review Skill describes what your team cares about in reviews. A deployment Skill walks through your deployment checklist.
The critical mechanism is auto-invocation. Skills have a description field that Claude reads to decide whether a Skill is relevant to the current task. Write the description precisely and the Skill fires automatically at the right moment: when a developer asks Claude to write tests, the testing Skill activates. When they ask for a code review, the review Skill activates. The developer does not need to remember to invoke them.
This is why Skills are the second layer, not an optional add-on. CLAUDE.md gives Claude context about the system. Skills give Claude context about how to act in specific situations. Both are necessary for consistent behaviour. A team running Claude Code with a good CLAUDE.md but no Skills is getting consistent background context but inconsistent task execution. The inconsistency shows up in tests that use different patterns, reviews that emphasise different things, commits that follow different message formats.
Team Skills live in the repository and are version-controlled. They improve over time as the team discovers where Claude's defaults diverge from their conventions. After a few months of investment, a Skills library becomes one of the most valuable engineering assets a team has: accumulated, machine-readable knowledge about how the team works.
Layer 3: Hooks, Safety Gates and Automation
The third layer is Hooks. A Hook is a shell command that runs before or after an agent action, deterministically, outside of Claude's decision-making process.
Where CLAUDE.md and Skills shape how Claude thinks and acts, Hooks enforce constraints regardless of what Claude thinks. They are the difference between "Claude should not touch production environment files" and "Claude cannot touch production environment files." That shift from should to cannot is more than a change of wording. A PreToolUse Hook that blocks any write operation to a protected path enforces that constraint in every session, for every developer, without relying on Claude interpreting instructions correctly.
Hooks are where engineering discipline meets AI tooling. Every mature software system has invariants that must hold regardless of what any individual contributor intends: files that must not be modified by automated processes, commands that must be logged before execution, output that must pass validation before being accepted. Hooks make those invariants machine-enforceable for AI tools, using the same exit code pattern that shell scripts have used for decades.
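Hooks are registered in `.claude/settings.json`. A sketch of a PreToolUse registration, assuming a `hooks/protect_paths.py` script of your own (the script name is hypothetical; the matcher/command structure follows Claude Code's documented hooks configuration):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          { "type": "command", "command": "python3 hooks/protect_paths.py" }
        ]
      }
    ]
  }
}
```

The matcher restricts the Hook to file-writing tools, so read-only operations are not slowed down by it.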
The three Hook types map to distinct needs:
PreToolUse blocks actions before they happen. This is where safety enforcement lives: protecting files, blocking patterns that should never run in automated context, requiring authentication before sensitive operations.
PostToolUse validates results after actions complete. This is where quality enforcement lives: checking that modified files still pass linting, that tests still run, that specific invariants hold.
Notification logs events and triggers downstream processes. This is where audit and observability live: recording what the agent did, when, and in what context.
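A PreToolUse safety gate of the kind described above can be sketched as a small script. The protected prefixes are hypothetical; the contract it relies on, tool-call JSON on stdin and exit code 2 to block, follows Claude Code's hook interface:

```python
import json
import sys

# Hypothetical protected prefixes -- adjust to your repository.
PROTECTED_PREFIXES = ("deploy/production/", "secrets/", ".env")

def check_write(event: dict) -> int:
    """Return the hook exit code for a PreToolUse event: 0 allows, 2 blocks."""
    path = event.get("tool_input", {}).get("file_path", "")
    if any(path.startswith(prefix) for prefix in PROTECTED_PREFIXES):
        # Anything written to stderr is fed back to Claude as the block reason.
        print(f"Blocked: {path} is a protected path", file=sys.stderr)
        return 2
    return 0

# As a hook, Claude Code pipes the tool-call JSON to the command's stdin:
#   sys.exit(check_write(json.load(sys.stdin)))
```

Because the check is a plain exit-code script, it is trivial to unit test and runs identically for every developer and every agent session.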
A team running Claude Code without Hooks is relying on probabilistic compliance: Claude will almost always do the right thing, and when it does not, they will catch it in review. A team with a Hooks layer has converted the most critical constraints from probabilistic to deterministic. That conversion is what makes it reasonable to deploy agents in workflows where the stakes are high enough that "almost always" is not acceptable.
Layer 4: Agents, Subagents with Their Own Context
The fourth layer is Agents: Claude Code instances that operate as subagents within a larger workflow, each with their own context, tools, and scope.
The first three layers change how a single Claude Code session behaves. The fourth layer changes the architecture of what you can build. A well-defined agent workflow can break a complex task into scoped subtasks, delegate each to a subagent with appropriate context, and integrate the results. This is how teams are automating workflows that previously required human coordination: code review pipelines that run automatically when a PR is opened, onboarding workflows that set up new developer environments, testing cycles that run in parallel across multiple subsystems.
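Subagents are defined as markdown files under `.claude/agents/`. A sketch of a review subagent, with the name, tool list, and instructions as illustrative examples rather than a prescribed setup:

```markdown
---
name: code-reviewer
description: Reviews code changes for convention violations and missing
  tests. Use proactively after significant changes.
tools: Read, Grep, Glob
---

You are a code reviewer for this repository. Check changes against the
conventions in CLAUDE.md, flag missing or weakened tests, and report
findings as a prioritised list. Do not modify any files.
```

Restricting the tool list to read-only tools is itself a scoping decision: this subagent can inspect and report, but never edit.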
Agents amplify everything in the layers below them. A subagent operating in a directory with a good CLAUDE.md and the right Skills will perform its task consistently and accurately. A subagent operating without that context will produce variable results. The agent layer is where the work of the first three layers pays off at scale: a well-configured system produces consistent, reliable automated workflows. A poorly configured one produces automated chaos.
The scope of a subagent matters significantly. Agents given well-defined, bounded tasks perform reliably. Agents given open-ended tasks with unclear boundaries accumulate errors. The discipline of scoping tasks precisely, which is valuable for interactive Claude Code sessions, is essential for agent workflows. The cost of a poorly scoped task in an automated workflow is substantially higher than the cost of the same mistake in an interactive session, because there is no human in the loop to catch the drift before it compounds.
Why the Sequence Is Not Arbitrary
The four layers are not a list of features. They are an architecture, and the sequence reflects a dependency structure.
Agents depend on Hooks. An agent operating without the safety layer can take actions that are difficult or impossible to reverse. The deterministic enforcement of Hooks is what makes it responsible to run automated agents in contexts that matter.
Hooks are more useful with Skills. A Hook that blocks a specific type of operation is a constraint. A Skill that teaches Claude the right approach to that type of task is the positive complement: it tells Claude what to do instead of just what not to do.
Skills build on CLAUDE.md. A Skill that describes your testing patterns assumes Claude knows what kind of system it is writing tests for. The CLAUDE.md provides that baseline context. Skills build on it rather than repeating it.
The practical implication is that building the layers in order is more effective than building them in parallel or in arbitrary sequence. Start with CLAUDE.md. Add a Skill for your most common task type. Add the Hooks that protect your most critical constraints. Then consider whether agent workflows are appropriate for specific repeated tasks.
Teams that skip directly to agent workflows without the foundation layers in place consistently report poor results. The agents behave inconsistently, produce architecturally wrong output, and occasionally do things they should not. The problem is not the agents. It is the missing foundation.
Where Most Teams Are and What to Do Next
The majority of teams using Claude Code in 2026 have Layer 1 in some form: either a hand-written CLAUDE.md or at least the minimal file generated by running /init. A smaller but growing number have invested in Layer 2 by building a Skills library. A minority have configured Layer 3 with meaningful Hooks. Very few are running Layer 4 agent workflows in production.
That distribution reflects the natural adoption sequence, not relative difficulty. Each layer is straightforward to build. The reason most teams stop at Layer 1 is not that the subsequent layers are hard. It is that they do not know the layers exist, or they have not yet hit the pain points those layers solve.
If your team is using Claude Code primarily as an interactive AI pair programmer and finding the results inconsistent, the problem is almost certainly Layer 2: you have context about what the system is but Claude does not have consistent guidance about how to work within it. One well-built Skill for your most common task type will produce a visible improvement in consistency within a sprint.
If your team wants to use Claude Code in automated workflows and is concerned about reliability, the problem is Layer 3: you need the safety layer in place before running agents in production contexts. Start with two or three Hooks that enforce your hardest constraints, prove to yourself that they work reliably, and then extend from there.
The architecture is not complicated. Building it well requires understanding what each layer does and why it matters. That understanding is the real prerequisite.
I help engineering teams close the gap between "we use AI tools" and "AI actually changed how we deliver." Book a 20-minute call and I'll tell you where the leverage is.