<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ claude-code - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ claude-code - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Fri, 08 May 2026 22:31:59 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/claude-code/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ How to Keep Human Experts Visible in Your AI-Assisted Codebase ]]>
                </title>
                <description>
<![CDATA[ Six months ago, Stack Overflow processed 108,563 questions in a single month. By December 2025, that number had fallen to 3,862. A collapse of more than 96%. The explanation everyone reaches for is th ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-keep-human-experts-visible-in-your-ai-assisted-codebase/</link>
                <guid isPermaLink="false">69dd18d4217f5dfcbd13e964</guid>
                
                    <category>
                        <![CDATA[ claude.ai ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Productivity ]]>
                    </category>
                
                    <category>
                        <![CDATA[ claude ]]>
                    </category>
                
                    <category>
                        <![CDATA[ claude-code ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Daniel Nwaneri ]]>
                </dc:creator>
                <pubDate>Mon, 13 Apr 2026 16:24:52 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/21d160a8-af66-4048-9fda-1d83b2e26148.png" medium="image" />
                <content:encoded>
<![CDATA[ <p>Six months ago, Stack Overflow processed 108,563 questions in a single month. By December 2025, that number had fallen to 3,862. A collapse of more than 96%.</p>
<p>The explanation everyone reaches for is that AI replaced it. That's partly true. But it misses the structural problem underneath: every time a developer asks Claude or ChatGPT to write code, the knowledge that shaped the answer disappears.</p>
<p>The GitHub discussion where someone spent two hours documenting why cursor-based pagination beats offset for live-updating datasets. The Stack Overflow answer from 2019 where one engineer, after a week of debugging, documented exactly why that approach fails under concurrent writes.</p>
<p>The AI consumed all of it. The humans who produced it got nothing — no citation in the codebase, no signal that their work mattered.</p>
<p>Over time, those people stopped contributing. Stack Overflow isn't dying because it's bad. It's dying because AI extracted its value and the feedback loop that kept humans contributing broke down.</p>
<p>This tutorial builds a tool that puts that loop back together. <strong>proof-of-contribution</strong> is a Claude Code skill that links every AI-generated artifact back to the human knowledge that inspired it — and surfaces exactly where the AI made choices with no human source at all.</p>
<p>I'll show you how to install proof-of-contribution, how to record your first provenance entry, how to use the spec-writer integration that makes Knowledge Gaps deterministic, and how to run <code>poc.py verify</code> — a static analyser that detects gaps without a single API call.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><p><a href="#heading-what-you-will-build">What You Will Build</a></p>
</li>
<li><p><a href="#heading-prerequisites">Prerequisites</a></p>
</li>
<li><p><a href="#heading-quickstart-in-5-minutes">Quickstart in 5 Minutes</a></p>
</li>
<li><p><a href="#heading-how-the-tool-works">How the Tool Works</a></p>
</li>
<li><p><a href="#heading-how-to-install-proof-of-contribution">How to Install proof-of-contribution</a></p>
</li>
<li><p><a href="#heading-how-to-scaffold-your-project">How to Scaffold Your Project</a></p>
</li>
<li><p><a href="#heading-how-to-record-your-first-provenance-entry">How to Record Your First Provenance Entry</a></p>
</li>
<li><p><a href="#heading-how-to-use-import-spec-to-seed-knowledge-gaps">How to Use import-spec to Seed Knowledge Gaps</a></p>
</li>
<li><p><a href="#heading-how-to-trace-human-attribution">How to Trace Human Attribution</a></p>
</li>
<li><p><a href="#heading-how-to-verify-with-static-analysis">How to Verify with Static Analysis</a></p>
</li>
<li><p><a href="#heading-how-to-enable-pr-enforcement">How to Enable PR Enforcement</a></p>
</li>
<li><p><a href="#heading-where-to-go-next">Where to Go Next</a></p>
</li>
</ol>
<h2 id="heading-what-you-will-build">What You Will Build</h2>
<p>proof-of-contribution is a Claude Code skill with a local CLI. Together they give you:</p>
<ul>
<li><p><strong>Provenance Blocks</strong>: Claude appends a structured attribution block to every generated artifact, listing the human sources that inspired it and flagging what it synthesized without any traceable source.</p>
</li>
<li><p><strong>Knowledge Gaps</strong>: the parts of AI-generated code that have no human citation, surfaced before they become production incidents</p>
</li>
<li><p><code>poc.py trace</code>: a CLI command that shows the full human attribution chain for any file in thirty seconds</p>
</li>
<li><p><code>poc.py import-spec</code>: bridges proof-of-contribution with spec-writer, seeding knowledge gaps from your spec's assumptions list before the agent builds anything</p>
</li>
<li><p><code>poc.py verify</code>: a static analyser that cross-checks your file's structure against seeded claims using Python's AST. Zero API calls. Exit code 0 means clean, exit code 1 means gaps found — wires directly into CI</p>
</li>
<li><p><strong>A GitHub Action</strong>: optional PR enforcement that fails PRs missing attribution, for teams that want a standard</p>
</li>
</ul>
<p>The complete source is at <a href="https://github.com/dannwaneri/proof-of-contribution">github.com/dannwaneri/proof-of-contribution</a>.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<p>This is a beginner-to-intermediate tutorial. You should be comfortable with:</p>
<ul>
<li><p><strong>Command line basics</strong>: navigating directories, running scripts</p>
</li>
<li><p><strong>Git</strong>: basic commits and PRs</p>
</li>
<li><p><strong>Python 3.8 or higher</strong>: the CLI is pure Python with no dependencies</p>
</li>
</ul>
<p>You will need:</p>
<ul>
<li><p><strong>Python installed</strong>: check with <code>python --version</code> or <code>python3 --version</code></p>
</li>
<li><p><strong>Git installed</strong>: check with <code>git --version</code></p>
</li>
<li><p><strong>Claude Code</strong> (or any agent that supports the Agent Skills standard — Cursor and Gemini CLI also work)</p>
</li>
</ul>
<p>There's no database to install. No API keys. No paid services. The default storage is SQLite, which Python includes out of the box.</p>
<h2 id="heading-quickstart-in-5-minutes">Quickstart in 5 Minutes</h2>
<p>If you want to try the tool before reading the full tutorial, here are the five commands that take you from zero to your first gap detection:</p>
<p><strong>Mac and Linux:</strong></p>
<pre><code class="language-bash"># 1. Install
mkdir -p ~/.claude/skills
git clone https://github.com/dannwaneri/proof-of-contribution.git \
  ~/.claude/skills/proof-of-contribution

# 2. Scaffold your project (run in your repo root)
python ~/.claude/skills/proof-of-contribution/assets/scripts/poc_init.py

# 3. Record attribution for an AI-generated file
python poc.py add src/utils/parser.py

# 4. Detect gaps via static analysis
python poc.py verify src/utils/parser.py

# 5. See the full provenance chain
python poc.py trace src/utils/parser.py
</code></pre>
<p><strong>Windows PowerShell:</strong></p>
<pre><code class="language-powershell"># 1. Install
New-Item -ItemType Directory -Force -Path "$HOME\.claude\skills"
git clone https://github.com/dannwaneri/proof-of-contribution.git `
  "$HOME\.claude\skills\proof-of-contribution"

# 2. Scaffold your project
python "$HOME\.claude\skills\proof-of-contribution\assets\scripts\poc_init.py"

# 3. Record attribution
python poc.py add src\utils\parser.py

# 4. Detect gaps
python poc.py verify src\utils\parser.py

# 5. See the full provenance chain
python poc.py trace src\utils\parser.py
</code></pre>
<p>That's the whole tool. The sections below walk through each step in detail with real terminal output at every stage.</p>
<h2 id="heading-how-the-tool-works">How the Tool Works</h2>
<p>Before you install anything, you need a clear mental model of what proof-of-contribution actually does — because the most important part isn't obvious.</p>
<h3 id="heading-the-archaeology-problem">The Archaeology Problem</h3>
<p>Here's a scenario that happens on every team using AI-assisted development.</p>
<p>A developer joins. They go through six months of AI-generated codebase. They hit a bug in the pagination logic — cursor-based, unusual implementation, nobody remembers why it was built that way. The original developer has left.</p>
<p>Old answer: two days of archaeology. <code>git blame</code> points to a commit message that says "fix pagination." The commit before that says "implement pagination." Dead end.</p>
<p>With <code>poc.py trace src/utils/paginator.py</code>, that same developer sees this in thirty seconds:</p>
<pre><code class="language-plaintext">Provenance trace: src/utils/paginator.py
────────────────────────────────────────────────────────────
  [HIGH]  @tannerlinsley on github
          Cursor pagination discussion
          https://github.com/TanStack/query/discussions/123
          Insight: cursor beats offset for live-updating datasets

Knowledge gaps (AI-synthesized, no human source):
  • Error retry strategy — no human source cited
  • Concurrent write handling — AI chose this arbitrarily
</code></pre>
<p>They now know where the pattern came from and — critically — which parts have no traceable human source. The concurrent write handling is where the bug lives. The AI made a choice nobody reviewed.</p>
<p>That's what this tool does. Not enforcement first. Archaeology first.</p>
<h3 id="heading-how-knowledge-gaps-are-detected">How Knowledge Gaps Are Detected</h3>
<p>The obvious assumption is that Claude introspects and reports what it doesn't know. That assumption is wrong. LLMs hallucinate confidently. An AI that could reliably detect its own knowledge gaps wouldn't produce them.</p>
<p>The detection mechanism is a comparison, not introspection.</p>
<p>When you use <a href="https://github.com/dannwaneri/spec-writer">spec-writer</a> before building, it generates a spec with an explicit <code>## Assumptions to review</code> section — every decision the AI is making that you didn't specify, each one impact-rated. That list is the contract.</p>
<p>When you run <code>poc.py import-spec spec.md --artifact src/utils/paginator.py</code>, those assumptions get seeded into the database as unresolved knowledge gaps. After the agent builds, <code>poc.py trace</code> shows which assumptions made it into code with no human source ever cited.</p>
<p>The AI isn't grading its own exam. The spec is the answer key.</p>
<p><code>poc.py verify</code> takes this further. After the agent builds, it parses the file's actual structure using Python's built-in <code>ast</code> module — extracting every function definition, conditional branch, and return path. It cross-checks each one against the seeded claims. Any structural unit with no resolved claim surfaces as a deterministic Knowledge Gap, regardless of how confident the model was when it wrote the code.</p>
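<p>That extraction step is worth seeing concretely. Here's a simplified sketch of the kind of AST walk involved, not the tool's actual implementation: it collects function definitions and <code>if</code> branches along with their line ranges.</p>
<pre><code class="language-python">import ast

def structural_units(source: str):
    """Collect function definitions and conditional branches with
    their line ranges. A simplified sketch of the extraction step,
    not poc.py's actual code."""
    units = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            units.append(("function", node.name, node.lineno, node.end_lineno))
        elif isinstance(node, ast.If):
            units.append(("branch", "if", node.lineno, node.end_lineno))
    return units

sample = """def parse_query(text):
    if not text:
        return []
    return text.split()
"""
for kind, name, start, end in structural_units(sample):
    print(f"{kind}: {name} (lines {start}-{end})")
</code></pre>
<p>Every unit that comes back with no resolved claim attached is a candidate gap. That's why the check is deterministic: the same file always yields the same units.</p>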
<h2 id="heading-how-to-install-proof-of-contribution">How to Install proof-of-contribution</h2>
<h3 id="heading-mac-and-linux">Mac and Linux</h3>
<pre><code class="language-bash">mkdir -p ~/.claude/skills
git clone https://github.com/dannwaneri/proof-of-contribution.git \
  ~/.claude/skills/proof-of-contribution
</code></pre>
<h3 id="heading-windows-powershell">Windows PowerShell</h3>
<pre><code class="language-powershell">New-Item -ItemType Directory -Force -Path "$HOME\.claude\skills"
git clone https://github.com/dannwaneri/proof-of-contribution.git `
  "$HOME\.claude\skills\proof-of-contribution"
</code></pre>
<p>That's the entire installation. No package to install, no configuration file to edit. The skill is a markdown file the agent reads. The CLI is a Python script that runs locally.</p>
<h3 id="heading-verify-the-install">Verify the Install</h3>
<pre><code class="language-bash">ls ~/.claude/skills/proof-of-contribution/
</code></pre>
<p>You should see <code>SKILL.md</code>, <code>poc.py</code>, <code>assets/</code>, and <code>references/</code>. If the directory is empty, the clone failed — check your internet connection and try again.</p>
<h2 id="heading-how-to-scaffold-your-project">How to Scaffold Your Project</h2>
<p>The scaffold script creates the database, config, CLI, and GitHub integration in your project root. Run it once per project.</p>
<h3 id="heading-mac-and-linux">Mac and Linux</h3>
<pre><code class="language-bash">cd /path/to/your/project
python ~/.claude/skills/proof-of-contribution/assets/scripts/poc_init.py
</code></pre>
<h3 id="heading-windows-powershell">Windows PowerShell</h3>
<pre><code class="language-powershell">cd C:\path\to\your\project
python "$HOME\.claude\skills\proof-of-contribution\assets\scripts\poc_init.py"
</code></pre>
<p>You should see output like this:</p>
<pre><code class="language-plaintext">🔗 Proof of Contribution — init

  →  Project root: /path/to/your/project
  ✔  Created .poc/config.json
  ✔  Created .poc/.gitignore  (db excluded from git, config tracked)
  ✔  Created .poc/provenance.db  (SQLite — no extra infra needed)
  ✔  Created .github/PULL_REQUEST_TEMPLATE.md
  ✔  Created .github/workflows/poc-check.yml
  ✔  Created poc.py  (local CLI — includes import-spec command)
  ✔  Created .gitignore

✔ Proof of Contribution initialised for 'your-project'
</code></pre>
<p>This creates four things in your project:</p>
<pre><code class="language-plaintext">your-project/
├── .poc/
│   ├── config.json      ← project settings (commit this)
│   ├── provenance.db    ← SQLite database (local only, gitignored)
│   └── .gitignore
├── .github/
│   ├── PULL_REQUEST_TEMPLATE.md
│   └── workflows/
│       └── poc-check.yml
└── poc.py               ← your local CLI
</code></pre>
<ul>
<li><p><code>.poc/</code> — the tool's local data directory. <code>config.json</code> stores project settings and is committed to git. <code>provenance.db</code> is the SQLite database where attribution records and knowledge gaps are stored — local only, gitignored.</p>
</li>
<li><p><code>poc.py</code> — your local CLI, copied into the project root. Run <code>python poc.py trace</code>, <code>python poc.py verify</code>, and every other command directly without a global install.</p>
</li>
<li><p><code>.github/PULL_REQUEST_TEMPLATE.md</code> — a PR template with the <code>## 🤖 AI Provenance</code> section pre-filled. Developers fill it in when submitting PRs that contain AI-generated code.</p>
</li>
<li><p><code>.github/workflows/poc-check.yml</code> — the optional GitHub Action for PR enforcement. Installed but dormant until you push the workflow file and enable it in your repo settings.</p>
</li>
</ul>
<p><strong>Windows note:</strong> if the scaffold fails with a <code>UnicodeEncodeError</code>, the emoji in the PR template can't be represented in Windows' default locale encoding. Open <code>assets/scripts/poc_init.py</code> in a text editor and find every line ending with <code>.write_text(...)</code>. Change each one to <code>.write_text(..., encoding="utf-8")</code>. Save and re-run.</p>
<h3 id="heading-verify-the-scaffold-worked">Verify the Scaffold Worked</h3>
<pre><code class="language-bash">python poc.py report
</code></pre>
<p>Expected output:</p>
<pre><code class="language-plaintext">Proof of Contribution Report
────────────────────────────────────────
  Artifacts tracked    : 0
  With provenance      : 0  (0%)
  Unresolved gaps      : 0
  Resolved claims      : 0
  Human experts        : 0
</code></pre>
<p>Empty database, clean state. You're ready.</p>
<h2 id="heading-how-to-record-your-first-provenance-entry">How to Record Your First Provenance Entry</h2>
<p>Before we dive in, let me clear something up. Earlier, I described <code>poc.py verify</code> as detecting Knowledge Gaps automatically — and it does. But the static analyser can only tell you <em>that</em> a function has no human citation. It can't tell you <em>which</em> human source inspired it. That knowledge lives in your head, not in the code.</p>
<p><code>poc.py add</code> is where you supply that context. After the agent builds a file, you record the human sources you actually drew on: the GitHub discussion you read before prompting, the Stack Overflow answer that shaped the approach. Those records become the attribution chain <code>poc.py trace</code> surfaces — and what closes the gaps <code>poc.py verify</code> flags.</p>
<p><code>verify</code> finds the gaps. <code>add</code> fills them.</p>
<p><code>poc.py add</code> records attribution for a file interactively. You can run it on any AI-generated file in your project.</p>
<pre><code class="language-bash">python poc.py add src/utils/parser.py
</code></pre>
<p>You'll see a prompt:</p>
<pre><code class="language-plaintext">Recording provenance for: src/utils/parser.py
(Press Ctrl+C to cancel)

  Human source URL (or Enter to finish):
</code></pre>
<p>Enter the URL of the human-authored source that inspired the code. This could be a GitHub discussion, a Stack Overflow answer, a documentation page, a blog post, or an RFC.</p>
<pre><code class="language-plaintext">  Human source URL (or Enter to finish): https://github.com/TanStack/query/discussions/123
  Author handle: tannerlinsley
  Platform (github/stackoverflow/docs/other): github
  Source title: Cursor pagination discussion
  What specific insight came from this? cursor beats offset for live-updating datasets
  Confidence HIGH/MEDIUM/LOW [MEDIUM]: HIGH
  ✔ Recorded.
</code></pre>
<p>Add as many sources as apply. Press Enter on a blank URL when you're done.</p>
<pre><code class="language-plaintext">  Human source URL (or Enter to finish): 
✔ Provenance saved. Run: python poc.py trace src/utils/parser.py
</code></pre>
<h3 id="heading-check-what-you-recorded">Check What You Recorded</h3>
<pre><code class="language-bash">python poc.py trace src/utils/parser.py
</code></pre>
<pre><code class="language-plaintext">Provenance trace: src/utils/parser.py
────────────────────────────────────────────────────────────
  [HIGH]  @tannerlinsley on github
          Cursor pagination discussion
          https://github.com/TanStack/query/discussions/123
          Insight: cursor beats offset for live-updating datasets
</code></pre>
<p>No knowledge gaps — because you recorded a source. If the file had parts with no human source, they would appear below as gaps.</p>
<h3 id="heading-see-all-experts-in-your-graph">See All Experts in Your Graph</h3>
<p>Every <code>poc.py add</code> call stores not just the URL but the author — their handle, platform, and the specific insight they contributed. Run it across enough files, and those authors accumulate into a <strong>knowledge graph</strong>: a local record of which human experts your codebase drew from, which files their knowledge shaped, and how many artifacts trace back to their work.</p>
<p><code>poc.py experts</code> surfaces the top contributors. On a new project, it'll be one or two entries. On a mature codebase, it becomes a map of whose knowledge is load-bearing — the people you'd want to consult if that part of the code ever needed to change.</p>
<pre><code class="language-bash">python poc.py experts
</code></pre>
<pre><code class="language-plaintext">Top Human Experts in Knowledge Graph
──────────────────────────────────────────────────────
  @tannerlinsley            github          1 artifact(s)
</code></pre>
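<p>Under the hood, this view is an aggregation over the provenance database. The schema below is hypothetical (the real <code>provenance.db</code> layout may differ), but it shows the shape of the query that produces an experts list:</p>
<pre><code class="language-python">import sqlite3

# Hypothetical schema; the real provenance.db layout may differ.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sources (artifact TEXT, author TEXT, platform TEXT, insight TEXT)")
db.executemany(
    "INSERT INTO sources VALUES (?, ?, ?, ?)",
    [("src/utils/parser.py", "tannerlinsley", "github",
      "cursor beats offset for live-updating datasets"),
     ("src/utils/paginator.py", "tannerlinsley", "github",
      "cursor pagination discussion"),
     ("src/api/routes.py", "juliandeangelis", "github",
      "separate functional from technical spec")])

# Top contributors: whose knowledge shaped the most artifacts?
rows = db.execute(
    "SELECT author, platform, COUNT(DISTINCT artifact) AS n "
    "FROM sources GROUP BY author, platform ORDER BY n DESC"
).fetchall()
for author, platform, n in rows:
    print(f"@{author:&lt;24}{platform:&lt;12}{n} artifact(s)")
</code></pre>
<p>Nothing here needs infrastructure: it's one SQLite file and a <code>GROUP BY</code>.</p>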
<h2 id="heading-how-to-use-import-spec-to-seed-knowledge-gaps">How to Use import-spec to Seed Knowledge Gaps</h2>
<p>This is the most important command in the tool. It connects proof-of-contribution with spec-writer and makes Knowledge Gaps deterministic.</p>
<p>When you use spec-writer before building a feature, it generates an <code>## Assumptions to review</code> section — every implicit decision is impact-rated HIGH, MEDIUM, or LOW. The <code>import-spec</code> command reads that section and seeds those assumptions into the database as unresolved gaps before the agent writes a line of code.</p>
<p>After the agent builds, any assumption that made it into the implementation without a cited human source surfaces automatically in <code>poc.py trace</code>. You don't need to know which parts of the code are uncertain. The spec already told you.</p>
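<p>The assumptions format is regular enough to parse mechanically. Here's a simplified sketch of how that section could be extracted, assuming the numbered "Impact:" layout shown in the next step; the tool's real parser may be more forgiving:</p>
<pre><code class="language-python">import re

def parse_assumptions(spec_text: str):
    """Pull numbered assumptions and their impact ratings out of a
    spec's '## Assumptions to review' section. Simplified sketch of
    what import-spec does, not its actual implementation."""
    parts = spec_text.split("## Assumptions to review", 1)
    if len(parts) &lt; 2:
        return []
    pattern = re.compile(
        r"^\s*\d+\.\s+(.+?)\s*[-\u2014]\s*Impact:\s*(HIGH|MEDIUM|LOW)",
        re.MULTILINE)
    return [(impact, claim) for claim, impact in pattern.findall(parts[1])]

spec = """## Assumptions to review

1. SQLite is sufficient for single-developer use - Impact: HIGH
2. Filepath is the artifact identifier - Impact: MEDIUM
"""
for impact, claim in parse_assumptions(spec):
    print(f"[{impact}] {claim}")
</code></pre>
<p>Each extracted pair becomes an unresolved gap bound to the artifact you name.</p>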
<h3 id="heading-step-1-create-a-test-spec">Step 1 — Create a Test Spec</h3>
<p>If you don't have a spec-writer output yet, create one manually to see how the import works.</p>
<p><strong>Mac and Linux:</strong></p>
<pre><code class="language-bash">cat &gt; test-spec.md &lt;&lt; 'EOF'
## Assumptions to review

1. SQLite is sufficient for single-developer use — Impact: HIGH
   Correct this if: you need team-shared provenance

2. Filepath is the artifact identifier — Impact: MEDIUM
   Correct this if: you use content hashing instead

3. REST pattern for any future API — Impact: LOW
   Correct this if: you prefer GraphQL
EOF
</code></pre>
<p><strong>Windows PowerShell:</strong></p>
<pre><code class="language-powershell">python -c "
content = '''## Assumptions to review

1. SQLite is sufficient for single-developer use - Impact: HIGH
   Correct this if: you need team-shared provenance

2. Filepath is the artifact identifier - Impact: MEDIUM
   Correct this if: you use content hashing instead

3. REST pattern for any future API - Impact: LOW
   Correct this if: you prefer GraphQL'''
open('test-spec.md', 'w', encoding='utf-8').write(content)
print('test-spec.md created')
"
</code></pre>
<p><strong>Windows note:</strong> don't use PowerShell's <code>echo</code> with redirection to create spec files. Windows PowerShell saves redirected output as UTF-16, which causes a <code>UnicodeDecodeError</code> when <code>import-spec</code> reads the file. The <code>python -c</code> approach above writes UTF-8 correctly.</p>
<h3 id="heading-step-2-import-the-assumptions">Step 2 — Import the Assumptions</h3>
<pre><code class="language-bash">python poc.py import-spec test-spec.md --artifact src/utils/parser.py
</code></pre>
<pre><code class="language-plaintext">Spec assumptions imported — 3 Knowledge Gap(s) seeded
───────────────────────────────────────────────────────
  1. [HIGH] SQLite is sufficient for single-developer use
       Correct if: you need team-shared provenance
  2. [MEDIUM] Filepath is the artifact identifier
       Correct if: you use content hashing instead
  3. [LOW] REST pattern for any future API
       Correct if: you prefer GraphQL

  →  Bound to: src/utils/parser.py
  After the agent builds, run:
  python poc.py trace src/utils/parser.py
  python poc.py add src/utils/parser.py
</code></pre>
<h3 id="heading-step-3-trace-the-gaps">Step 3 — Trace the Gaps</h3>
<pre><code class="language-bash">python poc.py trace src/utils/parser.py
</code></pre>
<pre><code class="language-plaintext">Knowledge gaps (AI-synthesized, no human source):
  • REST pattern for any future API [Correct if: you prefer GraphQL]
  • SQLite is sufficient for single-developer use [Correct if: you need team-shared provenance]
  • Filepath is the artifact identifier [Correct if: you use content hashing instead]

  Resolve gaps: python poc.py add src/utils/parser.py
</code></pre>
<p>Three gaps, colour-coded by urgency. The HIGH-impact assumption — SQLite for single-developer use — appears in red. The LOW-impact one appears in green. When you run <code>poc.py add</code> and record a human source with an insight that overlaps the gap text, the gap auto-closes.</p>
<h3 id="heading-preview-without-writing">Preview Without Writing</h3>
<pre><code class="language-bash">python poc.py import-spec test-spec.md --dry-run
</code></pre>
<p>This parses the spec and prints what would be seeded without touching the database, which is useful before committing to an import.</p>
<h3 id="heading-check-the-overall-health">Check the Overall Health</h3>
<pre><code class="language-bash">python poc.py report
</code></pre>
<pre><code class="language-plaintext">Proof of Contribution Report
────────────────────────────────────────
  Artifacts tracked    : 1
  With provenance      : 0  (0%)
  Unresolved gaps      : 3
  Resolved claims      : 0
  Human experts        : 1
  ⚠ Less than 50% of artifacts have provenance records.
  ⚠ 3 unresolved Knowledge Gap(s).
    Run `poc.py trace &lt;filepath&gt;` to locate them.
</code></pre>
<h2 id="heading-how-to-trace-human-attribution">How to Trace Human Attribution</h2>
<p><code>poc.py trace</code> is the command you'll use most. It shows the full human attribution chain for any file and lists any knowledge gaps — parts of the code with no traceable human source.</p>
<pre><code class="language-bash">python poc.py trace src/utils/parser.py
</code></pre>
<p>A file with both attribution and gaps looks like this:</p>
<pre><code class="language-plaintext">Provenance trace: src/utils/parser.py
────────────────────────────────────────────────────────────
  [HIGH]  @juliandeangelis on github
          Spec Driven Development methodology
          https://github.com/dannwaneri/spec-writer
          Insight: separate functional from technical spec

  [MEDIUM] @tannerlinsley on github
           Cursor pagination discussion
           https://github.com/TanStack/query/discussions/123
           Insight: cursor beats offset for live-updating datasets

Knowledge gaps (AI-synthesized, no human source):
  • Error retry strategy — no human source cited
  • CSV column ordering — AI chose this arbitrarily

  Resolve gaps: python poc.py add src/utils/parser.py
</code></pre>
<p>The human attribution section shows every cited source, colour-coded by confidence. The knowledge gaps section shows every assumption that shipped without a human citation — either seeded from a spec via <code>import-spec</code>, or flagged by Claude in the Provenance Block.</p>
<h3 id="heading-resolving-gaps">Resolving Gaps</h3>
<p>Run <code>poc.py add</code> on any file with open gaps:</p>
<pre><code class="language-bash">python poc.py add src/utils/parser.py
</code></pre>
<p>When you enter an insight that shares words with an open gap claim, the gap auto-closes. Run <code>poc.py trace</code> again to confirm it's resolved.</p>
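<p>The matching is deliberately approximate. Here's a sketch of the idea, assuming a simple word-overlap heuristic; the actual threshold and tokenisation in <code>poc.py</code> may differ:</p>
<pre><code class="language-python">def significant_words(text: str) -&gt; set:
    return {w.lower().strip(".,") for w in text.split() if len(w) &gt; 2}

def insight_closes_gap(insight: str, gap_claim: str,
                       threshold: float = 0.5) -&gt; bool:
    """Treat a gap as resolved when enough of its significant words
    appear in the recorded insight. Hypothetical heuristic, not
    poc.py's actual matcher."""
    gap = significant_words(gap_claim)
    if not gap:
        return False
    return len(gap &amp; significant_words(insight)) / len(gap) &gt;= threshold

print(insight_closes_gap(
    "cursor beats offset for live-updating datasets",
    "cursor beats offset"))
</code></pre>
<p>If your insight doesn't share enough words with the gap text, the gap stays open and <code>trace</code> keeps showing it, which is the safe failure mode.</p>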
<h2 id="heading-how-to-verify-with-static-analysis">How to Verify with Static Analysis</h2>
<p><code>poc.py verify</code> is the command that closes the epistemic trust gap completely. It detects Knowledge Gaps by analysing the file's actual code structure — not by asking the AI what it doesn't know.</p>
<p>Run it after the agent builds, once you've seeded gaps with <code>import-spec</code>:</p>
<pre><code class="language-bash">python poc.py verify src/utils/parser.py
</code></pre>
<p>Expected output:</p>
<pre><code class="language-plaintext">Verify: src/utils/parser.py
────────────────────────────────────────────────────────────
  Structural units detected : 11
  Seeded claims             : 3
  Covered by cited source   : 2
  Deterministic gaps        : 1

Deterministic Knowledge Gaps (no human source):
  • function: handle_concurrent_writes (lines 47–61)
      Seeded assumption: concurrent write handling — AI chose this arbitrarily

  Resolve: python poc.py add src/utils/parser.py
</code></pre>
<p>The gap shown is not something Claude admitted. It's something the analyser found by comparing the file's function list against your seeded claims. The function <code>handle_concurrent_writes</code> exists in the code but has no resolved human citation in the database. That's the gap.</p>
<h3 id="heading-what-the-exit-codes-mean">What the Exit Codes Mean</h3>
<pre><code class="language-bash">python poc.py verify src/utils/parser.py
echo $?   # Mac/Linux

python poc.py verify src/utils/parser.py
echo $LASTEXITCODE   # Windows PowerShell
</code></pre>
<ul>
<li><p><strong>Exit code 0</strong> — no gaps, all detected units have cited sources</p>
</li>
<li><p><strong>Exit code 1</strong> — gaps found, resolve with <code>poc.py add</code></p>
</li>
<li><p><strong>Exit code 2</strong> — file not found or unsupported language</p>
</li>
</ul>
<p>Exit code 1 integrates directly into CI pipelines. Add <code>poc.py verify</code> to your GitHub Action or pre-commit hook and gaps block the build before they reach production.</p>
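<p>Here's a minimal sketch of that wiring as a Python pre-commit gate, assuming <code>poc.py</code> sits in your repo root; it simply propagates <code>verify</code>'s worst exit code:</p>
<pre><code class="language-python">import subprocess
import sys

def gate(files, cmd=(sys.executable, "poc.py", "verify")):
    """Run verify on each file and return the worst exit code, so
    one gapped file fails the whole check. A sketch to adapt, not
    a drop-in hook."""
    worst = 0
    for path in files:
        worst = max(worst, subprocess.run([*cmd, path]).returncode)
    return worst

# In a real pre-commit hook you would call sys.exit(gate(staged_files)).
print(gate([]))  # no files staged: nothing to check, exit code 0
</code></pre>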
<h3 id="heading-run-it-without-a-seeded-spec">Run it Without a Seeded Spec</h3>
<p>If you haven't run <code>import-spec</code> first, <code>verify</code> still works — it falls back to structural analysis and surfaces every uncited function and branch as a gap:</p>
<pre><code class="language-bash">python poc.py verify src/utils/parser.py
</code></pre>
<pre><code class="language-plaintext">⚠ No spec imported — showing all uncited structural units.
  Run: python poc.py import-spec spec.md --artifact src/utils/parser.py
  for deterministic gap detection.

Deterministic Knowledge Gaps (no human source):
  • function: parse_query (lines 1–7)
  • branch: if not text (lines 2–3)
  • function: fetch_results (lines 9–12)
  ...
</code></pre>
<p>It's less precise than the spec-writer path — every structural unit shows up rather than only the ones tied to named assumptions — but it's useful as a baseline on any file, new or old.</p>
<h3 id="heading-the-strict-flag">The <code>--strict</code> Flag</h3>
<pre><code class="language-bash">python poc.py verify src/utils/parser.py --strict
</code></pre>
<p>Strict mode flags every uncited structural unit as a gap even when claims are seeded. You can use it when you want zero tolerance: any function or branch without a resolved human source fails the check.</p>
<h2 id="heading-how-to-enable-pr-enforcement">How to Enable PR Enforcement</h2>
<p>Once <code>poc.py trace</code> has saved you real hours — not before — enable the GitHub Action. The distinction matters. Turning it on day one frames the tool as overhead. Turning it on after the team already finds value frames it as a standard.</p>
<pre><code class="language-bash">git add .github/ .poc/config.json poc.py
git commit -m "chore: add proof-of-contribution"
git push
</code></pre>
<p>After that, every PR is checked for an <code>## 🤖 AI Provenance</code> section. The scaffold already created the PR template with that section included. Developers fill it in naturally once they're already running <code>poc.py trace</code> locally — the template just asks them to record what they already know.</p>
<p>Developers who write fully human code opt out by adding <code>100% human-written</code> anywhere in the PR body. The action skips the check automatically.</p>
<h3 id="heading-what-the-action-checks">What the Action Checks</h3>
<p>The action reads the PR description and looks for:</p>
<ol>
<li><p>The <code>## 🤖 AI Provenance</code> heading</p>
</li>
<li><p>At least one populated row in the attribution table</p>
</li>
</ol>
<p>If the section is missing or the table is empty, the action fails and posts a comment explaining what to add. The comment includes a link to <code>poc.py trace &lt;filepath&gt;</code> so the developer knows exactly where to look.</p>
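<p>As a rough illustration of that logic — this is a hypothetical sketch, not the actual action code from the repo — the check can be approximated in a few lines of Python. The <code>has_provenance</code> and <code>provenance_check</code> names are invented for this example:</p>
<pre><code class="language-python">PROVENANCE_HEADING = "## 🤖 AI Provenance"

def has_provenance(pr_body):
    """True when the provenance heading exists and its table
    has at least one populated data row after the separator."""
    if PROVENANCE_HEADING not in pr_body:
        return False
    section = pr_body.split(PROVENANCE_HEADING, 1)[1]
    seen_separator = False
    for line in section.splitlines():
        row = line.strip()
        if not row.startswith("|"):
            continue
        if all(ch in "|-: " for ch in row):
            seen_separator = True   # the |---|---| row under the header
        elif seen_separator:
            return True             # first real row after the separator
    return False

def provenance_check(pr_body):
    """Mirror of the opt-out rule: '100% human-written' skips the check."""
    if "100% human-written" in pr_body:
        return "skipped"
    return "pass" if has_provenance(pr_body) else "fail"
</code></pre>
<p>A PR body with only the heading and an empty table fails; one containing the opt-out phrase is skipped entirely.</p>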
<h2 id="heading-where-to-go-next">Where to Go Next</h2>
<h3 id="heading-use-it-with-spec-writer-on-a-real-feature">Use it with spec-writer on a Real Feature</h3>
<p>The real value of <code>import-spec</code> is on actual features, not test specs. If you use <a href="https://github.com/dannwaneri/spec-writer">spec-writer</a>, the workflow is:</p>
<pre><code class="language-plaintext">/spec-writer "your feature description"
</code></pre>
<p>Save the output to <code>spec.md</code>. Then:</p>
<pre><code class="language-bash">python poc.py import-spec spec.md --artifact src/path/to/output.py
</code></pre>
<p>Build the feature with your agent. Then run <code>poc.py trace</code> to see which assumptions made it into code with no human source. Resolve the HIGH-impact gaps first — those are the ones that will cause production incidents.</p>
<h3 id="heading-activate-the-claude-code-skill">Activate the Claude Code Skill</h3>
<p>The SKILL.md file makes Claude automatically append a Provenance Block to every generated artifact when the skill is active. The block lists human sources Claude drew from and flags what it synthesized without any traceable source.</p>
<p>To activate it in Claude Code, the skill is already installed at <code>~/.claude/skills/proof-of-contribution/</code>. Claude Code loads it automatically when you are in a project that has <code>.poc/config.json</code>.</p>
<p>A generated Provenance Block looks like this:</p>
<pre><code class="language-plaintext">## PROOF OF CONTRIBUTION
Generated artifact: fetch_github_discussions()
Confidence: MEDIUM

## HUMAN SOURCES THAT INSPIRED THIS

[1] GitHub GraphQL API Documentation Team
    Source type: Official Docs
    URL: docs.github.com/en/graphql
    Contribution: cursor-based pagination pattern

[2] GitHub Community (multiple contributors)
    Source type: GitHub Discussions
    URL: github.com/community/community
    Contribution: "ghost" fallback for deleted accounts
                  surfaced in bug reports

## KNOWLEDGE GAPS (AI synthesized, no human cited)
- Error handling / retry logic
- Rate limit strategy

## RECOMMENDED HUMAN EXPERTS TO CONSULT
- github.com/octokit community for pagination
</code></pre>
<p>The Knowledge Gaps section is the part no other tool produces. It's where AI admits what it synthesized without a traceable human source — before that gap becomes a production incident.</p>
<h3 id="heading-upgrade-when-you-outgrow-sqlite">Upgrade When You Outgrow SQLite</h3>
<p>The default database is SQLite — local only, no infra required. When you need team sharing or graph queries, the <code>references/</code> directory in the repo has migration guides:</p>
<table>
<thead>
<tr>
<th>Need</th>
<th>File</th>
</tr>
</thead>
<tbody><tr>
<td>Team sharing a provenance DB</td>
<td><code>references/relational-schema.md</code></td>
</tr>
<tr>
<td>Graph traversal queries</td>
<td><code>references/neo4j-implementation.md</code></td>
</tr>
<tr>
<td>Semantic web / interoperability</td>
<td><code>references/jsonld-schema.md</code></td>
</tr>
</tbody></table>
<h2 id="heading-manual-tracking-vs-proof-of-contribution">Manual Tracking vs. proof-of-contribution</h2>
<table>
<thead>
<tr>
<th></th>
<th>Manual tracking</th>
<th>proof-of-contribution</th>
</tr>
</thead>
<tbody><tr>
<td><strong>Finding who wrote the code</strong></td>
<td>Search Slack, ask the team, dig through commits</td>
<td><code>poc.py trace &lt;file&gt;</code> — thirty seconds</td>
</tr>
<tr>
<td><strong>Knowing which parts the AI guessed</strong></td>
<td>You don't, until it breaks in production</td>
<td>Knowledge Gaps section — surfaced before the code ships</td>
</tr>
<tr>
<td><strong>Detecting gaps after the build</strong></td>
<td>Code review, if someone notices</td>
<td><code>poc.py verify</code> — static analysis, zero API calls</td>
</tr>
<tr>
<td><strong>Enforcing attribution on PRs</strong></td>
<td>Honor system</td>
<td>GitHub Action fails the PR if attribution is missing</td>
</tr>
<tr>
<td><strong>Connecting to your spec</strong></td>
<td>Copy-paste assumptions into comments manually</td>
<td><code>poc.py import-spec</code> seeds them as tracked claims automatically</td>
</tr>
<tr>
<td><strong>Infrastructure required</strong></td>
<td>None (usually a spreadsheet or nothing)</td>
<td>None — SQLite, pure Python, no paid services</td>
</tr>
</tbody></table>
<p>The tool doesn't replace code review. It gives code review the context it needs to catch the right things.</p>
<p>The archaeology scenario — two days tracing a bug through dead-end commit messages — takes thirty seconds with <code>poc.py trace</code>. The code still has gaps, and it always will. But now you know where they are.</p>
<p><em>Built by</em> <a href="https://dev.to/dannwaneri"><em>Daniel Nwaneri</em></a><em>. The spec-writer skill that feeds</em> <code>import-spec</code> <em>is at</em> <a href="https://github.com/dannwaneri/spec-writer"><em>github.com/dannwaneri/spec-writer</em></a><em>. The full proof-of-contribution repo is at</em> <a href="https://github.com/dannwaneri/proof-of-contribution"><em>github.com/dannwaneri/proof-of-contribution</em></a><em>.</em></p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to Build a Cost-Efficient AI Agent with Tiered Model Routing ]]>
                </title>
                <description>
                    <![CDATA[ Most AI agent tutorials make the same mistake: they route every task to the most expensive model available. A character count doesn't need GPT-4. A presence check doesn't need Sonnet. A regex doesn't  ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-build-a-cost-efficient-ai-agent-with-tiered-model-routing/</link>
                <guid isPermaLink="false">69d6ddbd707c1ce7688e7ea0</guid>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ claude.ai ]]>
                    </category>
                
                    <category>
                        <![CDATA[ claude-code ]]>
                    </category>
                
                    <category>
                        <![CDATA[ AI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ ai agents ]]>
                    </category>
                
                    <category>
                        <![CDATA[ webdev ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Daniel Nwaneri ]]>
                </dc:creator>
                <pubDate>Wed, 08 Apr 2026 22:59:09 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/3a60436b-cbd7-4005-8e52-36291d815eea.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Most AI agent tutorials make the same mistake: they route every task to the most expensive model available.</p>
<p>A character count doesn't need GPT-4. A presence check doesn't need Sonnet. A regex doesn't need anything except Python.</p>
<p>The mistake isn't using AI — it's not knowing when to stop using it.</p>
<p>This tutorial shows you how to build a tiered routing system that sends tasks to the cheapest model that can solve them. The pattern is called the cost curve. It came out of a comment thread on a DEV.to article, was implemented by three developers over a weekend, and cut the per-URL cost of a real SEO audit agent from $0.006 to effectively $0 for most pages.</p>
<p>By the end, you'll have a working <code>cost_curve.py</code> module you can drop into any agent project.</p>
<h2 id="heading-what-youll-build">What You'll Build</h2>
<p>A three-tier routing function that:</p>
<ul>
<li><p>Runs deterministic Python checks first — zero API cost</p>
</li>
<li><p>Escalates to Claude Haiku only for genuinely ambiguous cases — ~$0.0001 per call</p>
</li>
<li><p>Escalates to Claude Sonnet only when semantic judgment is required — ~$0.006 per call</p>
</li>
<li><p>Falls back gracefully when any tier fails</p>
</li>
<li><p>Returns a consistent result schema regardless of which tier handled the request</p>
</li>
</ul>
<p>The full implementation is part of <a href="https://github.com/dannwaneri/seo-agent">dannwaneri/seo-agent</a>, an open-core SEO audit agent. The cost curve module is the premium routing layer, and the principle applies to any agent with mixed-complexity tasks.</p>
<h2 id="heading-prerequisites">Prerequisites</h2>
<ul>
<li><p>Python 3.11 or higher</p>
</li>
<li><p>An Anthropic API key</p>
</li>
<li><p>Basic familiarity with Python and the Claude API</p>
</li>
</ul>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><p><a href="#heading-the-problem-with-calling-claude-on-everything">The Problem with Calling Claude on Everything</a></p>
</li>
<li><p><a href="#heading-the-cost-curve-explained">The Cost Curve Explained</a></p>
</li>
<li><p><a href="#heading-project-setup">Project Setup</a></p>
</li>
<li><p><a href="#heading-tier-1-deterministic-python">Tier 1: Deterministic Python</a></p>
</li>
<li><p><a href="#heading-tier-2-claude-haiku-for-ambiguous-cases">Tier 2: Claude Haiku for Ambiguous Cases</a></p>
</li>
<li><p><a href="#heading-tier-3-claude-sonnet-for-semantic-judgment">Tier 3: Claude Sonnet for Semantic Judgment</a></p>
</li>
<li><p><a href="#heading-the-router-audit_url">The Router: audit_url()</a></p>
</li>
<li><p><a href="#heading-graceful-fallback">Graceful Fallback</a></p>
</li>
<li><p><a href="#heading-testing-the-cost-curve">Testing the Cost Curve</a></p>
</li>
<li><p><a href="#heading-applying-this-pattern-to-your-agent">Applying This Pattern to Your Agent</a></p>
</li>
</ol>
<h2 id="heading-the-problem-with-calling-claude-on-everything">The Problem with Calling Claude on Everything</h2>
<p>Here's what most agent code looks like:</p>
<pre><code class="language-python">def audit_url(snapshot: dict) -&gt; dict:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        messages=[{"role": "user", "content": build_prompt(snapshot)}]
    )
    return parse_response(response)
</code></pre>
<p>This works. It also calls Sonnet for every URL in the list — including the ones where the title is 142 characters long and the answer is obviously FAIL without any model involvement.</p>
<p>Claude Sonnet 4 is priced at $3 per million input tokens and $15 per million output tokens. A typical page snapshot is around 500 input tokens. That's $0.0015 per URL just for input — before output tokens. Across a 20-URL weekly audit, the total is around $0.12. Not expensive. But most of those pages have mechanical SEO issues: missing descriptions, titles over 60 characters, no canonical tag. A character count catches all of that. You don't need a model.</p>
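<p>The arithmetic is easy to sanity-check in a few lines. The prices are the Sonnet 4 rates quoted above; the 500-token snapshot size is this article's estimate, not a measured constant:</p>
<pre><code class="language-python">SONNET_INPUT_PER_MTOK = 3.00    # dollars per million input tokens
SNAPSHOT_TOKENS = 500           # rough size of one page snapshot
URLS_PER_AUDIT = 20
COST_PER_URL = 0.006            # input + output, all-in estimate

input_cost = SNAPSHOT_TOKENS / 1_000_000 * SONNET_INPUT_PER_MTOK
weekly_cost = URLS_PER_AUDIT * COST_PER_URL

print(f"input-only cost per URL:  ${input_cost:.4f}")   # $0.0015
print(f"weekly 20-URL audit cost: ${weekly_cost:.2f}")  # $0.12
</code></pre>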
<p>The cost curve fixes this by routing based on what the task actually requires, not on what the model is capable of.</p>
<h2 id="heading-the-cost-curve-explained">The Cost Curve Explained</h2>
<p>In the cost curve, we have three tiers, three tools, and three price points:</p>
<p><strong>Tier 1 — Deterministic Python. Cost: $0.</strong> Check title length, description length, H1 count, canonical presence. These are not judgment calls. They're string operations. If title length &gt; 60, FAIL. No model needed.</p>
<p><strong>Tier 2 — Claude Haiku. Cost: ~$0.0001 per call.</strong> Title present but only 4 characters long. Description present but only 30 characters. Status code is a redirect. These pass the mechanical audit but something is off. Haiku is fast and cheap enough that escalating ambiguous cases costs less than the debugging time you'd spend on false positives.</p>
<p><strong>Tier 3 — Claude Sonnet. Cost: ~$0.006 per call.</strong> Pages Haiku flags as needing semantic judgment. "This title passes length but reads like a navigation label." "This description duplicates the title verbatim." Sonnet earns its cost on genuinely hard cases — not on every URL in the list.</p>
<p>The routing decision happens before any API call. The result schema is identical regardless of which tier handled the request.</p>
<h2 id="heading-project-setup">Project Setup</h2>
<pre><code class="language-bash">mkdir cost-curve-demo &amp;&amp; cd cost-curve-demo
pip install anthropic
</code></pre>
<p>Set your API key:</p>
<pre><code class="language-bash"># macOS/Linux
export ANTHROPIC_API_KEY="sk-ant-..."

# Windows PowerShell
$env:ANTHROPIC_API_KEY = "sk-ant-..."
</code></pre>
<p>Create <code>cost_curve.py</code> — you'll build this module step by step.</p>
<h2 id="heading-tier-1-deterministic-python">Tier 1: Deterministic Python</h2>
<p>Tier 1 runs first on every URL. It checks four fields using only Python string operations. There's no API call, no latency, and no cost.</p>
<pre><code class="language-python">import json
import logging
import os
import re
from datetime import datetime, timezone

import anthropic

logger = logging.getLogger(__name__)

REDIRECT_CODES = {301, 302, 307, 308}

# Fields that trigger Tier 2 escalation
# Title or description present but suspiciously short
AMBIGUOUS_TITLE_MAX = 10   # chars — present but too short to be real
AMBIGUOUS_DESC_MAX = 50    # chars — present but too short to be useful


def _now_iso() -&gt; str:
    return datetime.now(timezone.utc).isoformat()


def _build_result(snapshot: dict, method: str) -&gt; dict:
    """Base result skeleton — same schema regardless of tier."""
    return {
        "url": snapshot.get("final_url", ""),
        "final_url": snapshot.get("final_url", ""),
        "status_code": snapshot.get("status_code"),
        "title": {"value": None, "length": 0, "status": "PASS"},
        "description": {"value": None, "length": 0, "status": "PASS"},
        "h1": {"count": 0, "value": None, "status": "PASS"},
        "canonical": {"value": None, "status": "PASS"},
        "flags": [],
        "human_review": False,
        "audited_at": _now_iso(),
        "method": method,
        "needs_tier3": False,
    }


def tier1_check(snapshot: dict) -&gt; dict:
    """
    Pure Python SEO checks. Zero API calls.

    Returns a result dict with method="deterministic".
    Sets needs_tier3=False always — Tier 1 never escalates to Tier 3 directly.
    Escalation to Tier 2 is decided by the router, not here.
    """
    result = _build_result(snapshot, "deterministic")

    title = snapshot.get("title") or ""
    description = snapshot.get("meta_description") or ""
    h1s = snapshot.get("h1s") or []
    canonical = snapshot.get("canonical") or ""

    # Title check
    result["title"]["value"] = title or None
    result["title"]["length"] = len(title)
    if not title or len(title) &gt; 60:
        result["title"]["status"] = "FAIL"
        msg = "Title is missing" if not title else f"Title is {len(title)} characters (max 60)"
        result["flags"].append(msg)

    # Description check
    result["description"]["value"] = description or None
    result["description"]["length"] = len(description)
    if not description or len(description) &gt; 160:
        result["description"]["status"] = "FAIL"
        msg = "Meta description is missing" if not description else f"Meta description is {len(description)} characters (max 160)"
        result["flags"].append(msg)

    # H1 check
    result["h1"]["count"] = len(h1s)
    result["h1"]["value"] = h1s[0] if h1s else None
    if len(h1s) == 0:
        result["h1"]["status"] = "FAIL"
        result["flags"].append("H1 tag is missing")
    elif len(h1s) &gt; 1:
        result["h1"]["status"] = "FAIL"
        result["flags"].append(f"Multiple H1 tags found ({len(h1s)})")

    # Canonical check
    result["canonical"]["value"] = canonical or None
    if not canonical:
        result["canonical"]["status"] = "FAIL"
        result["flags"].append("Canonical tag is missing")

    return result
</code></pre>
<p>The key design decision: <code>tier1_check()</code> never decides whether to escalate. It just runs the checks and returns. The router decides escalation based on the result.</p>
<h2 id="heading-tier-2-claude-haiku-for-ambiguous-cases">Tier 2: Claude Haiku for Ambiguous Cases</h2>
<p>Tier 2 runs when Tier 1 detects something mechanical but the result might need a second look. A 4-character title present but clearly wrong. A 30-character description that's technically there but useless. A redirect status that needs a human-readable explanation.</p>
<p>Haiku is the right model here. It's fast, cheap ($1 input / $5 output per million tokens), and sufficient for triage-level judgment. The prompt asks a narrow question: is this ambiguous enough to need Sonnet?</p>
<pre><code class="language-python">def tier2_check(snapshot: dict) -&gt; dict:
    """
    Claude Haiku call for ambiguous cases.

    Returns result with method="haiku".
    Sets needs_tier3=True if Haiku determines the case needs semantic judgment.
    Falls back to Tier 1 result on API error.
    """
    api_key = os.environ.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise OSError("ANTHROPIC_API_KEY is not set.")

    client = anthropic.Anthropic(api_key=api_key)

    title = snapshot.get("title") or ""
    description = snapshot.get("meta_description") or ""
    status_code = snapshot.get("status_code")

    prompt = f"""You are an SEO auditor doing a quick triage check.

Page data:
- Title: {repr(title)} ({len(title)} chars)
- Meta description: {repr(description)} ({len(description)} chars)
- Status code: {status_code}

Answer these two questions with only "yes" or "no":
1. Does this page need semantic judgment beyond simple length/presence checks? 
   (e.g. title is present but clearly wrong, description is present but meaningless)
2. Is the status code a redirect that needs investigation?

Respond in this exact JSON format and nothing else:
{{"needs_tier3": true_or_false, "reason": "one sentence explanation"}}"""

    try:
        response = client.messages.create(
            model="claude-haiku-4-5-20251001",
            max_tokens=150,
            messages=[{"role": "user", "content": prompt}],
        )
        raw = response.content[0].text.strip()
        # Strip markdown fences if present
        if raw.startswith("```"):
            lines = raw.splitlines()
            raw = "\n".join(lines[1:-1] if lines[-1].strip() == "```" else lines[1:])
        parsed = json.loads(raw)

        result = _build_result(snapshot, "haiku")
        # Copy Tier 1 field checks — Haiku doesn't redo those
        t1 = tier1_check(snapshot)
        result["title"] = t1["title"]
        result["description"] = t1["description"]
        result["h1"] = t1["h1"]
        result["canonical"] = t1["canonical"]
        result["flags"] = t1["flags"]
        result["needs_tier3"] = parsed.get("needs_tier3", False)
        if result["needs_tier3"]:
            result["flags"].append(f"Escalated to Tier 3: {parsed.get('reason', '')}")

        return result

    except Exception as exc:
        logger.warning("[tier2] Haiku API error: %s — falling back to Tier 1 result", exc)
        fallback = tier1_check(snapshot)
        fallback["method"] = "haiku-fallback"
        return fallback
</code></pre>
<p>The fallback is the critical piece. If Haiku fails — rate limit, network error, malformed response — the function returns the Tier 1 result rather than crashing. The audit continues. The URL gets flagged with <code>method="haiku-fallback"</code> so you can identify it later.</p>
<h2 id="heading-tier-3-claude-sonnet-for-semantic-judgment">Tier 3: Claude Sonnet for Semantic Judgment</h2>
<p>Tier 3 is where the full extraction prompt runs. This is the same call you'd make in a naïve implementation — the difference is that only a small fraction of URLs reach this tier.</p>
<pre><code class="language-python">def tier3_check(snapshot: dict) -&gt; dict:
    """
    Claude Sonnet call for semantic judgment.

    Returns result with method="sonnet".
    This is the full extraction prompt — same as calling the model directly.
    """
    api_key = os.environ.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise OSError("ANTHROPIC_API_KEY is not set.")

    client = anthropic.Anthropic(api_key=api_key)

    prompt = f"""You are an SEO auditor. Analyze this page snapshot and return ONLY a JSON object.
No prose. No explanation. No markdown fences. Raw JSON only.

Page data:
- URL: {snapshot.get('final_url')}
- Status code: {snapshot.get('status_code')}
- Title: {snapshot.get('title')}
- Meta description: {snapshot.get('meta_description')}
- H1 tags: {snapshot.get('h1s')}
- Canonical: {snapshot.get('canonical')}

Return this exact schema:
{{
  "url": "string",
  "final_url": "string",
  "status_code": number,
  "title": {{"value": "string or null", "length": number, "status": "PASS or FAIL"}},
  "description": {{"value": "string or null", "length": number, "status": "PASS or FAIL"}},
  "h1": {{"count": number, "value": "string or null", "status": "PASS or FAIL"}},
  "canonical": {{"value": "string or null", "status": "PASS or FAIL"}},
  "flags": ["array of strings describing specific issues"],
  "human_review": false,
  "audited_at": "ISO timestamp"
}}

PASS/FAIL rules:
- title: FAIL if null or length &gt; 60 characters, or if present but clearly not a real title
- description: FAIL if null or length &gt; 160 characters, or if present but meaningless
- h1: FAIL if count is 0 or count &gt; 1
- canonical: FAIL if null
- audited_at: use current UTC time"""

    try:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1000,
            messages=[{"role": "user", "content": prompt}],
        )
        raw = response.content[0].text.strip()
        if raw.startswith("```"):
            lines = raw.splitlines()
            raw = "\n".join(lines[1:-1] if lines[-1].strip() == "```" else lines[1:])

        result = json.loads(raw)
        result["method"] = "sonnet"
        result["needs_tier3"] = False
        return result

    except Exception as exc:
        logger.warning("[tier3] Sonnet API error: %s — falling back to Tier 1 result", exc)
        fallback = tier1_check(snapshot)
        fallback["method"] = "sonnet-fallback"
        return fallback
</code></pre>
<p>Note the prompt addition in Tier 3 that isn't in Tier 1: <code>"or if present but clearly not a real title"</code> and <code>"or if present but meaningless"</code>. That's the semantic judgment Haiku identified as needed. Tier 3 acts on it.</p>
<h2 id="heading-the-router-auditurl">The Router: audit_url()</h2>
<p>The router is the public interface. Everything else is an implementation detail.</p>
<pre><code class="language-python">def audit_url(snapshot: dict, tiered: bool = False) -&gt; dict:
    """
    Route a page snapshot through the appropriate audit tier.

    Args:
        snapshot: Page data from browser.py — must contain final_url,
                  status_code, title, meta_description, h1s, canonical.
        tiered: If False, delegates directly to Tier 3 (Sonnet).
                If True, routes through the cost curve.

    Returns:
        Audit result dict with method field indicating which tier ran.
    """
    if not tiered:
        # Non-tiered mode: call Sonnet directly, same as v1 behavior
        return tier3_check(snapshot)

    # Tier 1: always runs first
    t1_result = tier1_check(snapshot)

    # Check if escalation to Tier 2 is warranted
    title = snapshot.get("title") or ""
    description = snapshot.get("meta_description") or ""
    status_code = snapshot.get("status_code")

    needs_tier2 = (
        # Title present but suspiciously short
        (title and len(title) &lt; AMBIGUOUS_TITLE_MAX) or
        # Description present but suspiciously short
        (description and len(description) &lt; AMBIGUOUS_DESC_MAX) or
        # Redirect status — may need explanation
        (status_code in REDIRECT_CODES)
    )

    if not needs_tier2:
        # Tier 1 result is definitive — return without any API call
        return t1_result

    # Tier 2: Haiku triage
    t2_result = tier2_check(snapshot)

    if not t2_result.get("needs_tier3", False):
        # Haiku determined no semantic judgment needed
        return t2_result

    # Tier 3: Sonnet for semantic judgment
    return tier3_check(snapshot)
</code></pre>
<p>The router logic is explicit and readable. Each decision point is a named condition. When <code>tiered=False</code>, behavior is identical to the v1 naive implementation — this is the backward compatibility guarantee that lets you add the cost curve incrementally without breaking existing audits.</p>
<h2 id="heading-graceful-fallback">Graceful Fallback</h2>
<p>The fallback pattern appears in both Tier 2 and Tier 3. It's worth making explicit:</p>
<pre><code class="language-python"># Pattern used in both tier2_check() and tier3_check()
except Exception as exc:
    logger.warning("[tierN] API error: %s — falling back to Tier 1 result", exc)
    fallback = tier1_check(snapshot)
    fallback["method"] = "tierN-fallback"
    return fallback
</code></pre>
<p>Three things this does:</p>
<ol>
<li><p>Logs the error with enough context to debug later</p>
</li>
<li><p>Returns a valid result — the Tier 1 deterministic check always runs regardless</p>
</li>
<li><p>Tags the result with the fallback method so you can filter these in your report</p>
</li>
</ol>
<p>An agent that crashes on API errors is not production-ready. An agent that degrades gracefully and continues is.</p>
<h2 id="heading-testing-the-cost-curve">Testing the Cost Curve</h2>
<p>Create <code>test_cost_curve.py</code> to verify routing behavior without live API calls:</p>
<pre><code class="language-python">import json
from unittest import mock

from cost_curve import audit_url, tier1_check


def make_snapshot(title="Normal Title Under 60 Chars",
                  description="A normal meta description that is under 160 characters and describes the page content well.",
                  h1s=["Single H1"],
                  canonical="https://example.com/page",
                  status_code=200,
                  final_url="https://example.com/page"):
    return {
        "title": title,
        "meta_description": description,
        "h1s": h1s,
        "canonical": canonical,
        "status_code": status_code,
        "final_url": final_url,
    }


def test_clean_page_returns_tier1_no_api_calls():
    """Clean page: all checks pass deterministically — no API call."""
    snapshot = make_snapshot()
    with mock.patch("anthropic.Anthropic") as mock_client:
        result = audit_url(snapshot, tiered=True)
        assert result["method"] == "deterministic"
        mock_client.assert_not_called()
    print("PASS: clean page → Tier 1, zero API calls")


def test_long_title_returns_tier1_fail_no_api_call():
    """Title &gt;60 chars: FAIL from Tier 1, no API call."""
    snapshot = make_snapshot(title="A" * 70)
    with mock.patch("anthropic.Anthropic") as mock_client:
        result = audit_url(snapshot, tiered=True)
        assert result["method"] == "deterministic"
        assert result["title"]["status"] == "FAIL"
        mock_client.assert_not_called()
    print("PASS: title &gt;60 → Tier 1 FAIL, zero API calls")


def test_suspiciously_short_title_escalates_to_tier2():
    """Title present but 4 chars: escalates to Tier 2."""
    snapshot = make_snapshot(title="SEO")  # 3 chars — under AMBIGUOUS_TITLE_MAX
    mock_response = mock.MagicMock()
    mock_response.content = [mock.MagicMock(
        text='{"needs_tier3": false, "reason": "title is short but not ambiguous"}'
    )]
    with mock.patch("anthropic.Anthropic") as mock_client:
        mock_client.return_value.messages.create.return_value = mock_response
        result = audit_url(snapshot, tiered=True)
        assert result["method"] == "haiku"
        assert mock_client.return_value.messages.create.call_count == 1
    print("PASS: short title → Tier 2 (Haiku called once)")


def test_tiered_false_calls_sonnet_directly():
    """tiered=False: Sonnet called regardless of snapshot content."""
    snapshot = make_snapshot()  # clean page, would be Tier 1 in tiered mode
    mock_response = mock.MagicMock()
    mock_response.content = [mock.MagicMock(text=json.dumps({
        "url": "https://example.com/page",
        "final_url": "https://example.com/page",
        "status_code": 200,
        "title": {"value": "Normal Title Under 60 Chars", "length": 27, "status": "PASS"},
        "description": {"value": "desc", "length": 4, "status": "PASS"},
        "h1": {"count": 1, "value": "Single H1", "status": "PASS"},
        "canonical": {"value": "https://example.com/page", "status": "PASS"},
        "flags": [],
        "human_review": False,
        "audited_at": "2026-04-01T00:00:00+00:00",
    }))]
    with mock.patch("anthropic.Anthropic") as mock_client:
        mock_client.return_value.messages.create.return_value = mock_response
        result = audit_url(snapshot, tiered=False)
        assert result["method"] == "sonnet"
        assert mock_client.return_value.messages.create.call_count == 1
    print("PASS: tiered=False → Sonnet called directly")


def test_haiku_api_failure_falls_back_to_tier1():
    """Haiku failure: falls back to Tier 1 result, no crash."""
    snapshot = make_snapshot(title="SEO")  # triggers Tier 2
    with mock.patch("anthropic.Anthropic") as mock_client:
        mock_client.return_value.messages.create.side_effect = Exception("rate limit")
        result = audit_url(snapshot, tiered=True)
        assert result["method"] == "haiku-fallback"
    print("PASS: Haiku failure → fallback to Tier 1, no crash")


if __name__ == "__main__":
    test_clean_page_returns_tier1_no_api_calls()
    test_long_title_returns_tier1_fail_no_api_call()
    test_suspiciously_short_title_escalates_to_tier2()
    test_tiered_false_calls_sonnet_directly()
    test_haiku_api_failure_falls_back_to_tier1()
    print("\nAll tests passed.")
</code></pre>
<p>Run it:</p>
<pre><code class="language-bash">python test_cost_curve.py
</code></pre>
<p>Expected output:</p>
<pre><code class="language-plaintext">PASS: clean page → Tier 1, zero API calls
PASS: title &gt;60 → Tier 1 FAIL, zero API calls
PASS: short title → Tier 2 (Haiku called once)
PASS: tiered=False → Sonnet called directly
PASS: Haiku failure → fallback to Tier 1, no crash
</code></pre>
<h2 id="heading-applying-this-pattern-to-your-agent">Applying This Pattern to Your Agent</h2>
<p>The cost curve is not SEO-specific. Any agent with mixed-complexity tasks can use it.</p>
<p>The principle: classify tasks by what they actually require before deciding which model to invoke.</p>
<p><strong>Customer support agent:</strong></p>
<ul>
<li><p>Tier 1: keyword matching for known FAQ topics — no model</p>
</li>
<li><p>Tier 2: Haiku for intent classification on ambiguous queries</p>
</li>
<li><p>Tier 3: Sonnet for complex complaints requiring judgment</p>
</li>
</ul>
<p><strong>Code review agent:</strong></p>
<ul>
<li><p>Tier 1: lint rules, syntax checks — no model</p>
</li>
<li><p>Tier 2: Haiku for common pattern detection</p>
</li>
<li><p>Tier 3: Sonnet for architectural review</p>
</li>
</ul>
<p><strong>Content moderation agent:</strong></p>
<ul>
<li><p>Tier 1: blocklist matching — no model</p>
</li>
<li><p>Tier 2: Haiku for borderline cases</p>
</li>
<li><p>Tier 3: Sonnet for context-dependent judgment</p>
</li>
</ul>
<p>The implementation pattern is the same in all three cases. The <code>audit_url()</code> router becomes <code>route_task()</code>. The tier functions change their prompts and escalation conditions. The fallback logic stays identical.</p>
<p>The key question to ask before writing any agent code: what fraction of my inputs are mechanically solvable? That fraction goes to Tier 1; the cost curve escalates and routes the rest.</p>
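<p>The generic router described above can be sketched in a few lines. This is a minimal sketch, not code from the repository: the names <code>route_task()</code>, <code>tier1_rules()</code>, <code>tier2_light_model()</code>, and <code>tier3_full_model()</code> are hypothetical stand-ins, and each tier signals escalation by returning <code>None</code>.</p>
<pre><code class="language-python"># Hypothetical sketch of the generic tiered router described above.
# Tier bodies are stand-ins: swap in keyword rules, a Haiku call, a Sonnet call.

def tier1_rules(task):
    """Mechanical checks. Returns a result dict, or None to escalate."""
    if task.get("kind") == "faq":
        return {"method": "tier1", "answer": "canned response"}
    return None  # ambiguous input: escalate

def tier2_light_model(task):
    """Cheap model for ambiguous cases. Returns None to escalate again."""
    if task.get("complex"):
        return None  # requires judgment: escalate to Tier 3
    return {"method": "tier2", "answer": "classified intent"}

def tier3_full_model(task):
    """Most capable model, reserved for tasks that need judgment."""
    return {"method": "tier3", "answer": "full reasoning"}

def route_task(task):
    """Cheapest tier first. A model-tier failure falls back to the
    Tier 1 result instead of crashing, mirroring the Haiku-failure test."""
    result = tier1_rules(task)
    if result is not None:
        return result
    try:
        result = tier2_light_model(task)
        if result is not None:
            return result
        return tier3_full_model(task)
    except Exception:
        return {"method": "tier1-fallback", "answer": "best effort"}
</code></pre>
<p>Only the tier bodies change between the support, code-review, and moderation agents; the routing and fallback logic is shared.</p>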
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>The full implementation — including the SEO audit agent that uses this module in production — is at <a href="https://github.com/dannwaneri/seo-agent">dannwaneri/seo-agent</a>. The <code>core/</code> directory is MIT licensed. The tiered routing lives in <code>premium/cost_curve.py</code>.</p>
<p><em>This tutorial is the companion piece to</em> <a href="https://dev.to/dannwaneri/i-was-paying-0006-per-url-for-seo-audits-until-i-realized-most-needed-0-132j">I Was Paying $0.006 Per URL for SEO Audits Until I Realized Most Needed $0</a> <em>on DEV.to, which covers the architecture decisions behind the cost curve.</em></p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Claude Code Essentials ]]>
                </title>
                <description>
<![CDATA[ Claude Code can supercharge your software development. We just posted a full course on the freeCodeCamp.org YouTube channel that will teach you how to use Claude Code to build real-world agentic codin ]]>
                </description>
                <link>https://www.freecodecamp.org/news/claude-code-essentials-exampro/</link>
                <guid isPermaLink="false">69c5490d10e664c5dae3f95e</guid>
                
                    <category>
                        <![CDATA[ claude-code ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Thu, 26 Mar 2026 14:56:13 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5f68e7df6dfc523d0a894e7c/3530a6a9-65d5-4c5c-9a89-0ac85cc2053a.jpg" medium="image" />
                <content:encoded>
<![CDATA[ <p>Claude Code can supercharge your software development.</p>
<p>We just posted a full course on the <a href="http://freeCodeCamp.org">freeCodeCamp.org</a> YouTube channel that will teach you how to use Claude Code to build real-world agentic coding workflows. Andrew Brown from ExamPro developed this course.</p>
<p>You'll learn the core concepts of agentic coding and the agent loop. You'll also learn how to use the Claude Code CLI, sessions, and workflows. And finally, you'll learn about tool usage, context handling, and real-world development patterns.</p>
<p>Here are the sections in the course:</p>
<p><strong>Introduction</strong></p>
<ul>
<li><p>Intro &amp; Meet your Instructor</p>
</li>
<li><p>Claude Code Essentials &amp; Guest Instructor</p>
</li>
</ul>
<p><strong>Core Concepts and Foundations</strong></p>
<ul>
<li><p>Agentic Coding Tools &amp; Comparisons</p>
</li>
<li><p>What is a Code Harness &amp; Claude Code?</p>
</li>
<li><p>The Agentic Loop: Tool Calls &amp; Models</p>
</li>
<li><p>Claude Modes &amp; Additional Resources</p>
</li>
<li><p>Subscription, Usage &amp; Auth Errors</p>
</li>
<li><p>System Requirements &amp; Doctor CLI</p>
</li>
<li><p>Install CLI: PowerShell, CMD, Linux &amp; VSC</p>
</li>
<li><p>Interactive Mode &amp; Ctrl+C</p>
</li>
<li><p>Auth Tokens, Stats &amp; Usage Limits</p>
</li>
<li><p>Sessions: Resume, Fork &amp; Context Commands</p>
</li>
<li><p>Compact, Clear, Rename &amp; Rewind</p>
</li>
<li><p>Status, Logout &amp; Usage Commands</p>
</li>
<li><p>ccusage &amp; API Key Setup</p>
</li>
<li><p>Cost Command &amp; Third-Party Providers</p>
</li>
<li><p>API Keys: Bedrock, Foundry &amp; Vertex AI</p>
</li>
<li><p>btw, Voice &amp; Export Commands</p>
</li>
<li><p>Claude Code Projects &amp; Scope</p>
</li>
<li><p>Status Line &amp; Session Data</p>
</li>
<li><p>Settings &amp; Permission Rules (Bash, MCP, WebFetch)</p>
</li>
<li><p>Permission Modes &amp; CLI/GUI Editing</p>
</li>
<li><p>Sandboxing &amp; Dangerous Scenarios</p>
</li>
<li><p>VIM Mode &amp; Model Configuration</p>
</li>
<li><p>Fast Mode &amp; Image Reasoning</p>
</li>
<li><p>Effort Command &amp; Headless Tasks</p>
</li>
<li><p>Escaping Logic &amp; Debug Mode</p>
</li>
<li><p>Dev Containers with Claude Code</p>
</li>
<li><p>Common Workflow Follow Along</p>
</li>
<li><p>Notification Hooks &amp; Security Hooks</p>
</li>
</ul>
<p><strong>Surfaces</strong></p>
<ul>
<li><p>Claude Code Surfaces: Desktop &amp; Web</p>
</li>
<li><p>Claude Chrome &amp; Browser Extensions</p>
</li>
<li><p>VS Code &amp; JetBrains IDE Integration</p>
</li>
<li><p>GitHub Actions &amp; Remote Control</p>
</li>
<li><p>Computing Follow Along</p>
</li>
</ul>
<p><strong>Advanced</strong></p>
<ul>
<li><p>Security Review &amp; Output Styles</p>
</li>
<li><p>Simplify Command &amp; Scheduling</p>
</li>
<li><p>Code Review &amp; Agent SDK</p>
</li>
<li><p>Persistent Context: Claude.md &amp; Rules</p>
</li>
<li><p>Claude Auto Memory &amp; Troubleshooting</p>
</li>
<li><p>Agent Skills: Activation, Scripts &amp; Dynamic Content</p>
</li>
<li><p>MCP with Claude Code &amp; GG</p>
</li>
<li><p>MCP Doom &amp; Playwright Integration</p>
</li>
<li><p>Sub-agents &amp; Multi-Agent Teams</p>
</li>
</ul>
<p>Watch the full course on the <a href="http://freeCodeCamp.org">freeCodeCamp.org</a> YouTube channel (12-hour watch).</p>
<div class="embed-wrapper"><iframe width="560" height="315" src="https://www.youtube.com/embed/brLhhkUqcn4" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ The Claude Code Handbook: A Professional Introduction to Building with AI-Assisted Development ]]>
                </title>
                <description>
                    <![CDATA[ "I have never enjoyed coding as much as I do today — because I no longer have to deal with the minutia." — Boris Cherny, Head of Claude Code, Anthropic This handbook is a complete, professional intro ]]>
                </description>
                <link>https://www.freecodecamp.org/news/claude-code-handbook/</link>
                <guid isPermaLink="false">69c44e2e10e664c5daf048d3</guid>
                
                    <category>
                        <![CDATA[ AI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ claude-code ]]>
                    </category>
                
                    <category>
                        <![CDATA[ handbook ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Vahe Aslanyan ]]>
                </dc:creator>
                <pubDate>Wed, 25 Mar 2026 21:05:50 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/f9a4a36f-875f-4937-b39b-13f41eab0a75.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <blockquote>
<p><em>"I have never enjoyed coding as much as I do today — because I no longer have to deal with the minutia."</em> — <strong>Boris Cherny</strong>, Head of Claude Code, Anthropic</p>
</blockquote>
<p>This handbook is a complete, professional introduction to Claude Code, Anthropic's AI-powered software development agent – and to the practice of building software with it.</p>
<p>Claude Code isn't a smarter autocomplete. It's an agent: a system that reads your codebase, reasons about what needs to be done, writes and edits files, runs commands, and works through a task from start to finish – with you directing it, verifying its output, and making the decisions that require judgment. It represents a meaningful shift in how software gets built, not an incremental improvement on what came before.</p>
<p>This handbook covers everything from installation and first sessions to parallel agent workflows, MCP integrations, and autonomous loops. It's organized to build competency progressively, as each chapter assumes you've read the previous ones. But it's also written to be a reference you return to as your practice develops. The goal is not to make you familiar with Claude Code. It's to make you capable with it.</p>
<h3 id="heading-what-you-will-learn">What You Will Learn</h3>
<p>By the end of this handbook, you'll be able to do things that previously required either years of engineering experience or a team:</p>
<ul>
<li><p><strong>Build real applications from scratch</strong> – not toy projects or tutorial reproductions, but working software you intend to deploy and use</p>
</li>
<li><p><strong>Stop waiting on developers</strong> – take ideas from concept to working prototype yourself, without depending on someone else's availability or budget</p>
</li>
<li><p><strong>Ship features in hours instead of weeks</strong> – structure sessions with Plan Mode, feature-by-feature building, and prompt discipline so Claude produces what you actually intended</p>
</li>
<li><p><strong>Keep projects alive across dozens of sessions</strong> – manage context windows and maintain continuity so you never lose progress or have to reconstruct context from scratch</p>
</li>
<li><p><strong>Connect your tools</strong> – link Claude Code to GitHub, Notion, Slack, Google Workspace, and other services via the Model Context Protocol (MCP) so you execute entire workflows from a single instruction</p>
</li>
<li><p><strong>Work like a team of one</strong> – run parallel agent workflows that produce the output of three or four engineers running simultaneously, without coordination overhead</p>
</li>
<li><p><strong>Avoid the mistakes that waste days</strong> – understand when and how to use autonomous loops safely, so you don't return to a session to find Claude has gone in the wrong direction for two hours</p>
</li>
<li><p><strong>Produce code you can stand behind</strong> – review and verify AI-generated output to a professional standard, so nothing ships that you do not actually understand and own</p>
</li>
</ul>
<p>If you have wanted to build software and kept hitting the same wall, this handbook is designed to remove it.</p>
<h3 id="heading-who-this-is-for">Who This Is For</h3>
<p>This handbook was written for anyone who intends to work with Claude Code seriously. The audience is broader than it might initially appear – the access barrier to software development is changing, and so is the definition of who should be able to build.</p>
<h4 id="heading-1-developers-who-want-to-operate-at-a-different-scale">1. Developers who want to operate at a different scale</h4>
<p>If you've been writing software for years, Claude Code is not a replacement for your skills – it's a multiplier. The developers getting the most from it aren't using it as an autocomplete tool. They're using it to run parallel sessions, delegate entire feature workstreams, maintain codebases that would have required a team, and ship at a rate that was previously impossible without additional headcount.</p>
<p>This handbook covers the practices – Plan Mode, context management, autonomous loops, Git worktrees – that separate professional-level Claude Code use from basic use.</p>
<h4 id="heading-2-founders-product-people-and-domain-experts-who-want-to-remove-one-specific-blocker">2. Founders, product people, and domain experts who want to remove one specific blocker</h4>
<p>Something important to understand before you read further: writing the code is a small fraction of what actually has to go right for a product to succeed. A working codebase doesn't produce users, doesn't validate a business model, doesn't guarantee product-market fit, and doesn't substitute for the judgment, distribution, and domain knowledge that determine whether software is actually useful to anyone.</p>
<p>Claude Code removes the coding barrier. That barrier was previously significant: for non-technical people, it was often the thing that made building impossible to begin. Removing it matters. But it's not the same as removing every other obstacle between you and a product that works in the world.</p>
<p>What does remain yours, fully, includes: understanding the problem you're solving, determining whether the solution is actually the right one, deciding what to build and in what order, talking to users, understanding why people would or would not use what you're building, and everything involved in getting it in front of the people it's meant for.</p>
<p>This handbook will get you from concept to working software. The rest – the part that actually makes that software valuable – is not a technical problem.</p>
<h4 id="heading-3-anyone-who-has-an-idea-they-have-been-unable-to-act-on">3. Anyone who has an idea they have been unable to act on</h4>
<p>If the thing that has blocked you is specifically the code – not the idea, not the market, not the distribution, but the code itself – then this handbook is for you. It removes that specific obstacle.</p>
<p>It won't remove the others. The expectation going in should be accurate: this gives you a working application faster than was previously possible, not a shortcut to a successful product.</p>
<p>No prior programming experience is required to begin – but prior experience will make the journey faster. What is required in either case is the willingness to engage seriously with what Claude Code produces: to read it, question it, verify it, and direct it toward what you actually need.</p>
<h3 id="heading-why-this-skill-matters">Why This Skill Matters</h3>
<p>People are drawn to Claude Code and AI-assisted development because it unlocks opportunities that were previously unavailable. The ability to build software – to take an idea from concept to working product – has historically required years of training, a funded team, or both. That barrier is changing rapidly.</p>
<p>This is not a tool that will make human judgment irrelevant. AI will automate a great deal, but not everything. The world is too interconnected, too complex, and too dependent on human creativity, domain expertise, and judgment for complete automation to be possible.</p>
<p>There are things beyond writing code that keep the world functioning, and that will remain true. But for the act of building software – bringing your own ideas to life as working applications – Claude Code is genuinely transformative.</p>
<p>As of early 2026, Claude Code authors 4% of all global GitHub commits. Engineers at Spotify have not written code manually since December. Anthropic's own team ships 10–30 pull requests per day per engineer, every one generated by Claude. This is the current state – not a projection.</p>
<p>The skill of directing, evaluating, and building with AI tools is already one of the most valuable capabilities a professional can hold. That value will increase, not decrease, as AI becomes more capable. The developers, builders, and thinkers who build fluency with tools like Claude Code now will carry a structural advantage as the field advances.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><p><a href="#heading-chapter-1-the-context-that-made-claude-code-necessary">Chapter 1: The Context That Made Claude Code Necessary</a></p>
</li>
<li><p><a href="#heading-chapter-2-anthropic--background-and-purpose">Chapter 2: Anthropic — Background and Purpose</a></p>
</li>
<li><p><a href="#heading-chapter-3-the-claude-model-family">Chapter 3: The Claude Model Family</a></p>
</li>
<li><p><a href="#heading-chapter-4-what-claude-code-is">Chapter 4: What Claude Code Is</a></p>
</li>
<li><p><a href="#heading-chapter-5-why-the-development-community-required-this">Chapter 5: Why the Development Community Required This</a></p>
</li>
<li><p><a href="#heading-chapter-6-installation-and-initial-setup">Chapter 6: Installation and Initial Setup</a></p>
</li>
<li><p><a href="#heading-chapter-7-vs-code-and-the-claude-code-extension">Chapter 7: VS Code and the Claude Code Extension</a></p>
</li>
<li><p><a href="#heading-chapter-8-subscriptions-token-costs-and-usage">Chapter 8: Subscriptions, Token Costs, and Usage</a></p>
</li>
<li><p><a href="#heading-chapter-9-working-in-your-first-session">Chapter 9: Working in Your First Session</a></p>
</li>
<li><p><a href="#heading-chapter-10-prompt-discipline--inputs-determine-outputs">Chapter 10: Prompt Discipline — Inputs Determine Outputs</a></p>
</li>
<li><p><a href="#heading-chapter-11-planning-as-a-core-practice">Chapter 11: Planning as a Core Practice</a></p>
</li>
<li><p><a href="#heading-chapter-12-building-feature-by-feature">Chapter 12: Building Feature by Feature</a></p>
</li>
<li><p><a href="#heading-chapter-13-how-claude-code-actually-works">Chapter 13: How Claude Code Actually Works</a></p>
</li>
<li><p><a href="#heading-chapter-14-architecting-applications-well-with-claude-code">Chapter 14: Architecting Applications Well with Claude Code</a></p>
</li>
<li><p><a href="#heading-chapter-15-plan-mode-edit-mode-and-operational-modes">Chapter 15: Plan Mode, Edit Mode, and Operational Modes</a></p>
</li>
<li><p><a href="#heading-chapter-16-context-windows-and-session-management">Chapter 16: Context Windows and Session Management</a></p>
</li>
<li><p><a href="#heading-chapter-17-mcp-servers-and-external-integrations">Chapter 17: MCP Servers and External Integrations</a></p>
</li>
<li><p><a href="#heading-chapter-18-agents-sub-agents-and-parallel-workflows">Chapter 18: Agents, Sub-Agents, and Parallel Workflows</a></p>
</li>
<li><p><a href="#heading-chapter-19-skills-rules-and-persistent-instructions">Chapter 19: Skills, Rules, and Persistent Instructions</a></p>
</li>
<li><p><a href="#heading-chapter-20-autonomous-loops--conditions-for-use">Chapter 20: Autonomous Loops — Conditions for Use</a></p>
</li>
<li><p><a href="#heading-chapter-21-code-review-security-and-verification">Chapter 21: Code Review, Security, and Verification</a></p>
</li>
<li><p><a href="#heading-chapter-22-starter-project-blueprints">Chapter 22: Starter Project Blueprints</a></p>
</li>
<li><p><a href="#heading-chapter-23-the-current-frontier-of-claude-code">Chapter 23: The Current Frontier of Claude Code</a></p>
</li>
<li><p><a href="#heading-chapter-24-software-engineering-as-a-discipline">Chapter 24: Software Engineering as a Discipline</a></p>
</li>
<li><p><a href="#heading-chapter-25-a-structured-path-forward">Chapter 25: A Structured Path Forward</a></p>
</li>
</ol>
<h2 id="heading-chapter-1-the-context-that-made-claude-code-necessary">Chapter 1: The Context That Made Claude Code Necessary</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter1/1200/400" alt="Chapter 1 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Software development has historically been access-restricted. Building a working application like a web service, a data tool, or a user-facing product required either years of technical training or a funded team of engineers. The knowledge barrier was steep, the required time investment was significant, and the population of people who could build was correspondingly small.</p>
<p>This constraint began to erode with the emergence of large language models capable of generating functional code from natural-language descriptions. What started as an augmentation of individual developers has, within the span of a few years, become a structural transformation of how software is built.</p>
<p>As of early 2026, the scale of this shift is measurable and substantial, and its significance extends beyond productivity metrics. The ability to build software is becoming accessible to a broader range of people – not because the underlying complexity has been eliminated, but because much of the mechanical translation between intent and implementation can now be delegated to an AI agent.</p>
<p>What this unlocks, in human terms, is closer to what the printing press unlocked for written communication: a dramatic expansion of who can participate.</p>
<p>Boris Cherny frames it precisely: "I imagine a world where everyone is able to program. Anyone can just build software anytime." He draws the parallel to the printing press explicitly – a technology that transferred a capability previously held by a small, specialized group to the general population, and that preceded an explosion of human creative and intellectual output.</p>
<p>This handbook exists to help you become a capable participant in that transition.</p>
<h2 id="heading-chapter-2-anthropic-background-and-purpose">Chapter 2: Anthropic — Background and Purpose</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter2/1200/400" alt="Chapter 2 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Claude Code is a product of Anthropic. To understand the product fully, it's useful to understand the organization that built it.</p>
<h3 id="heading-founding-and-mission">Founding and Mission</h3>
<p>Anthropic was founded in 2021 by Dario Amodei, Daniela Amodei, and several colleagues who had previously worked at OpenAI. The founding motivation was not competitive positioning. It was a principled disagreement about how AI development should proceed.</p>
<p>The founders believed – and continue to believe – that building powerful AI systems without a rigorous, primary commitment to safety constitutes one of the most consequential risks humanity has introduced. Their response was to establish an organization whose central purpose is to develop AI capability and AI safety in parallel, treating the latter not as a constraint on the former but as an equal and inseparable objective.</p>
<p>Three of Anthropic's co-founders co-authored the <a href="https://arxiv.org/abs/2001.08361">original scaling laws paper</a>, one of the foundational documents of modern AI research. It describes mathematically how model capability scales with size and compute. These are people who understood the trajectory of AI capability before most of the industry had internalized it. Their choice to build an organization focused on safety reflects informed conviction, not just caution.</p>
<h3 id="heading-what-safety-means-in-practice">What Safety Means in Practice</h3>
<p>At Anthropic, safety research manifests across multiple layers. The deepest is <strong>mechanistic interpretability</strong>: the scientific effort to understand what is actually happening inside a model at the level of individual computational components.</p>
<p>This is not an abstract exercise. As Boris Cherny describes it:</p>
<blockquote>
<p>"We can identify a neuron related to deception. We are starting to get to the point where we can monitor it and understand that it's activating."</p>
</blockquote>
<p>This work informs how models are trained, how they're evaluated, and how they're deployed. It also shapes Claude Code directly. Before public release, Claude Code ran internally at Anthropic for four to five months, with behavior studied carefully before any external release. This was not a formality. It reflected genuine uncertainty about how an agentic AI system would behave in conditions that training-time evaluations cannot fully anticipate.</p>
<h3 id="heading-scale-and-influence">Scale and Influence</h3>
<p>By early 2026, Anthropic reached a valuation of over $350 billion. Claude Code is reported to generate over $2 billion in annual revenue and continues to accelerate: daily active users doubled in the month prior to this writing.</p>
<p>The company's models, particularly Claude Sonnet 4.6 and Claude Opus 4.6, are the current standard for serious AI-assisted software development across organizations from early-stage startups to the largest technology companies in the world.</p>
<h2 id="heading-chapter-3-the-claude-model-family">Chapter 3: The Claude Model Family</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter3/1200/400" alt="Chapter 3 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Claude Code is powered by Anthropic's Claude models. The models are the intelligence underlying the system. Claude Code provides the environment – the tools, the interface, the scaffolding – but the models determine the quality of reasoning, planning, and execution.</p>
<h3 id="heading-claude-sonnet-46">Claude Sonnet 4.6</h3>
<p>Claude Sonnet 4.6 is Anthropic's mid-tier model. It delivers strong performance across coding tasks – planning, implementation, debugging, documentation – at a meaningfully lower cost per token than Opus.</p>
<p>Sonnet 4.6 represents the inflection point at which Claude Code became broadly useful. Prior to this generation, models were capable but insufficiently reliable for production workflows. Sonnet 4.6 changed that, providing the reasoning depth required for real engineering work at a price accessible to individual developers and small teams.</p>
<p>For most development tasks of moderate complexity, Sonnet 4.6 is adequate. It handles single-feature implementations, debugging sessions, and documentation generation well. Where it reaches its limits (extended autonomous sessions, deep architectural decisions, complex multi-step reasoning), Opus 4.6 becomes the appropriate choice.</p>
<h3 id="heading-claude-opus-46">Claude Opus 4.6</h3>
<p>Claude Opus 4.6 is Anthropic's most capable model. Research measuring its performance on real software engineering tasks found that it achieves a time horizon of approximately 14.5 hours at 50% task completion rate – meaning it can handle unattended work that would occupy a skilled engineer for most of a working day.</p>
<p>Boris Cherny uses Opus 4.6 exclusively, with maximum effort enabled, and never reduces capability to save tokens. His reasoning is precise:</p>
<blockquote>
<p>"Because a less capable model is less intelligent, it requires more tokens to do the same task. It is not obvious that using a cheaper model is actually cheaper. Often, the most capable model is cheaper and less token-intensive because it completes the task faster with less correction."</p>
</blockquote>
<p>Opus 4.6 is Anthropic's first ASL-3 class model – a designation in their safety classification framework applied to models of sufficient power that the most rigorous safety protocols are warranted before and after release.</p>
<h3 id="heading-claude-haiku">Claude Haiku</h3>
<p>Claude Haiku is Anthropic's lightest model – fast, inexpensive, and suited to simple tasks: summarization, brief lookups, lightweight generation. For Claude Code work, Haiku is rarely the right choice. It lacks the reasoning depth required for meaningful software development.</p>
<h3 id="heading-model-selection-guide">Model Selection Guide</h3>
<table>
<thead>
<tr>
<th>Task Characteristics</th>
<th>Recommended Model</th>
</tr>
</thead>
<tbody><tr>
<td>Initial exploration, learning Claude Code</td>
<td>Sonnet 4.6</td>
</tr>
<tr>
<td>Moderate complexity development work</td>
<td>Sonnet 4.6</td>
</tr>
<tr>
<td>Complex architectural decisions</td>
<td>Opus 4.6</td>
</tr>
<tr>
<td>Extended autonomous sessions</td>
<td>Opus 4.6</td>
</tr>
<tr>
<td>Multi-agent parallel workflows</td>
<td>Opus 4.6</td>
</tr>
<tr>
<td>Simple lookups, trivial queries</td>
<td>Haiku</td>
</tr>
</tbody></table>
<p>The practical guidance is straightforward: where possible, don't select a model primarily on cost. A less capable model that requires more correction cycles, more context clarification, and more tokens to reach an acceptable output frequently costs more in total than a more capable model that completes the task in fewer passes.</p>
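<p>The table above reduces to a small routing default. A minimal sketch, assuming placeholder model identifiers – the ID strings below are illustrative, not verified; substitute the current ones from Anthropic's documentation:</p>
<pre><code class="language-python"># Placeholder model identifiers -- substitute the current model ID
# strings from Anthropic's official documentation before use.
SIMPLE, MODERATE, COMPLEX = "simple", "moderate", "complex"

MODEL_FOR = {
    SIMPLE: "claude-haiku",         # trivial lookups, simple queries
    MODERATE: "claude-sonnet-4-6",  # most day-to-day development work
    COMPLEX: "claude-opus-4-6",     # architecture, long autonomous runs
}

def pick_model(complexity):
    """Unknown or hard-to-classify tasks default to the most capable
    model: fewer correction cycles often cost fewer total tokens."""
    return MODEL_FOR.get(complexity, MODEL_FOR[COMPLEX])
</code></pre>
<p>Defaulting to the most capable model when unsure encodes the guidance above: a stronger model that finishes in one pass is often the cheaper choice overall.</p>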
<h2 id="heading-chapter-4-what-claude-code-is">Chapter 4: What Claude Code Is</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter4/1200/400" alt="Chapter 4 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Claude Code is an AI agent for software development. That definition requires unpacking.</p>
<h3 id="heading-the-distinction-from-conversational-ai">The Distinction from Conversational AI</h3>
<p>A conversational AI system – ChatGPT, Claude.ai in basic form, most early AI products – produces text in response to text. It can explain, summarize, translate, draft. It generates output that a human then acts upon. The AI does not itself act in the world.</p>
<p>Claude Code is categorically different. It is an <strong>agent</strong>: an AI system equipped with tools that allow it to act. In a Claude Code session, the model doesn't merely generate a code snippet and return it to you. It reads the files in your project, writes and modifies those files, executes terminal commands, installs packages, runs tests, searches the web, commits to version control, opens pull requests – the list goes on.</p>
<p>The distinction matters. When you engage Claude Code on a development task, you aren't prompting a text generator. You're directing an autonomous agent that can execute sequences of actions, make decisions about which tools to employ, and produce material results in your codebase.</p>
<h3 id="heading-the-agent-architecture">The Agent Architecture</h3>
<p>Most AI-powered development tools are built by constraining the model: defining rigid workflows, controlling what the model can see, specifying precisely which tools it can use in which sequence. This creates predictability at the cost of flexibility and capability.</p>
<p>Claude Code's architecture inverts this. As Boris Cherny describes it: "The product is the model." The approach is to expose the model as directly as possible, with a minimal set of tools and minimal scaffolding, and allow the model to determine the best approach for a given task. The model decides which tools to use, in what order, and how to combine them.</p>
<p>This approach trusts the model's judgment. With Claude Opus 4.6, that trust is warranted. The model can assess a complex problem, formulate a strategy, execute it using available tools, and adapt when it encounters unexpected conditions — without constant human intervention.</p>
<h3 id="heading-where-claude-code-runs">Where Claude Code Runs</h3>
<p>Claude Code is available across multiple surfaces:</p>
<ul>
<li><p>Terminal (Mac, Windows, Linux)</p>
</li>
<li><p>VS Code extension</p>
</li>
<li><p>Claude desktop application (Code tab)</p>
</li>
<li><p>Claude iOS and Android applications</p>
</li>
<li><p>GitHub integration (automated code review and PR management)</p>
</li>
<li><p>Slack integration</p>
</li>
</ul>
<p>The underlying agent is identical across all surfaces. The interface differs, but the capability does not.</p>
<p>This breadth is a deliberate expression of product philosophy: bring the tool to wherever people already work, rather than requiring people to adapt to a new environment. Boris describes this as "latent demand" – a principle that shaped both Claude Code's original terminal deployment and its subsequent expansion.</p>
<h3 id="heading-current-impact">Current Impact</h3>
<ul>
<li><p>4% of all global GitHub commits are authored by Claude Code (early 2026)</p>
</li>
<li><p>Projected to reach 20% of all commits by end of 2026</p>
</li>
<li><p>Anthropic's daily active users doubled in the preceding month</p>
</li>
<li><p>Engineering productivity at Anthropic has increased 200% since Claude Code adoption</p>
</li>
<li><p>100% of pull requests at Anthropic are reviewed by Claude before human review</p>
</li>
</ul>
<p>Boris describes the growth trajectory as still accelerating: "It's not just going up – it's going up faster and faster."</p>
<h2 id="heading-chapter-5-why-the-development-community-required-this">Chapter 5: Why the Development Community Required This</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter5/1200/400" alt="Chapter 5 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Claude Code did not emerge from a product strategy. It emerged from a genuine need in how software is built, and from an observation about where friction in that process actually lives.</p>
<h3 id="heading-the-mechanical-burden">The Mechanical Burden</h3>
<p>Professional software development involves significant mechanical work that has nothing to do with the intellectual challenge of building good systems. A large portion of a developer's day, in a typical codebase, involves:</p>
<ul>
<li><p>Looking up documentation for APIs whose syntax changes between versions</p>
</li>
<li><p>Writing authentication and authorization boilerplate for the hundredth time</p>
</li>
<li><p>Setting up database connections, environment configurations, REST endpoints</p>
</li>
<li><p>Decoding opaque error messages</p>
</li>
<li><p>Searching for solutions to problems that are, with near certainty, already solved somewhere</p>
</li>
<li><p>Writing tests for behavior whose correctness is already known</p>
</li>
<li><p>Managing dependency conflicts and breaking changes</p>
</li>
</ul>
<p>None of this is where engineering expertise is exercised. It is the overhead of translation: converting understanding into correct syntax, in the right files. Boris describes it as "the minutia" and "the tedious parts" – things that consume time without demanding the faculties that matter.</p>
<p>Claude Code eliminates this overhead. The mechanical translation is handled by the agent. What remains for the engineer is the part that requires judgment: what to build, how to architect it, whether it is correct, whether it serves its purpose.</p>
<h3 id="heading-the-access-barrier">The Access Barrier</h3>
<p>Beyond the tedium experienced by professional developers, there is the structural barrier facing everyone else. Building functional software has required years of technical training. The ideas exist broadly. The capacity to execute them has been concentrated in a small technical population.</p>
<p>Claude Code redistributes that capacity. It doesn't eliminate the need for understanding and judgment (a point this handbook will return to repeatedly), but it dramatically lowers the threshold at which someone can produce working software. A product manager, a scientist, a business owner, a domain expert with a clear idea can now begin building in a way that was not previously available to them.</p>
<h3 id="heading-the-feedback-loop-problem">The Feedback Loop Problem</h3>
<p>Senior engineers frequently know exactly what they want but find it difficult to convey that precisely enough to produce it consistently. The distance between a specification and its implementation – what engineers call the specification gap – is one of the chronic sources of friction in engineering teams.</p>
<p>With Claude Code, that gap narrows substantially. The developer remains in the loop throughout, reviewing plans before they are executed, inspecting output as it is produced, redirecting when necessary. The feedback cycle collapses from days to minutes. Misalignments are caught early, when they are cheap to fix.</p>
<h2 id="heading-chapter-6-installation-and-initial-setup">Chapter 6: Installation and Initial Setup</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter6/1200/400" alt="Chapter 6 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>This chapter covers everything required to get Claude Code running on your machine, starting from zero prior configuration.</p>
<h3 id="heading-step-1-set-up-a-claude-account">Step 1: Set Up a Claude Account</h3>
<p>Before installing Claude Code, you will need an account on Claude.ai.</p>
<ol>
<li><p>Navigate to <strong>claude.ai</strong> in a browser</p>
</li>
<li><p>Select "Sign Up" and create an account using your email address</p>
</li>
<li><p>Confirm your email address</p>
</li>
<li><p>Your account is now active</p>
</li>
</ol>
<p>This account authenticates you across all Anthropic products, including Claude Code.</p>
<h3 id="heading-step-2-select-a-subscription-plan">Step 2: Select a Subscription Plan</h3>
<p>Claude Code requires a paid subscription. The available tiers as of 2026:</p>
<h4 id="heading-claude-pro-20-per-month">Claude Pro: $20 per month</h4>
<p>The Pro plan provides access to Claude's models with a daily usage limit. When you reach that limit, you wait until the following day to continue.</p>
<p>For someone beginning with Claude Code building small projects, learning the system, running sessions of moderate intensity, the Pro plan is sufficient. It isn't designed for sustained professional use, but it's an appropriate entry point.</p>
<h4 id="heading-claude-max-100-or-200-per-month">Claude Max: $100 or $200 per month</h4>
<p>The Max plan substantially increases or effectively removes usage limits. At the $200 tier, Anthropic's own team describes never encountering the usage ceiling under normal working conditions.</p>
<p>If Claude Code becomes a primary instrument in your workflow – which it will, if you use it consistently – the Pro plan will constrain you. Upgrade to Max when you begin hitting limits regularly.</p>
<p><strong>On the cost question</strong>: Claude Code at the $20 tier represents a low threshold for access to something that materially changes what one person can build. The question is not whether the tool is worth the cost. The question is whether you will use it consistently enough to benefit from it. Begin at $20. The answer will be evident within a few weeks.</p>
<h3 id="heading-step-3-access-the-installation-documentation">Step 3: Access the Installation Documentation</h3>
<p>Navigate to <strong>code.claude.ai</strong>. This is Anthropic's official installation hub for Claude Code. It provides:</p>
<ul>
<li><p>Installation commands for Mac and Linux (via npm)</p>
</li>
<li><p>Installation instructions for Windows (via PowerShell)</p>
</li>
<li><p>Links to IDE extensions</p>
</li>
</ul>
<p>If you are comfortable in a terminal, the single-line npm installation command is sufficient. If not, proceed to the VS Code method described in the following chapter.</p>
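<p>For the terminal route, the npm command documented by Anthropic at the time of writing is the following – check <strong>code.claude.ai</strong> for the current version before running it:</p>
<pre><code class="language-plaintext">npm install -g @anthropic-ai/claude-code
</code></pre>
<p>Once installed, run <code>claude</code> from inside a project directory to start a session.</p>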
<h2 id="heading-chapter-7-vs-code-and-the-claude-code-extension">Chapter 7: VS Code and the Claude Code Extension</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter7/1200/400" alt="Chapter 7 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>For those without a prior terminal workflow, Visual Studio Code provides the most accessible entry point into Claude Code. This chapter covers that setup completely.</p>
<h3 id="heading-why-vs-code">Why VS Code</h3>
<p>Visual Studio Code is a free, open-source code editor distributed by Microsoft. It holds over 70% market share among developers and is the environment in which the majority of professional software development occurs today – not because it's technically superior to all alternatives, but because it is well-designed, extensible, and broadly supported.</p>
<p>The Claude Code extension for VS Code provides a graphical interface to the same underlying agent you would access through a terminal. Within this interface:</p>
<ul>
<li><p>Your project files are visible in a persistent sidebar</p>
</li>
<li><p>Claude's file edits appear in a diff view (additions in green, removals in red) before they are applied</p>
</li>
<li><p>You can review, approve, or reject individual changes</p>
</li>
<li><p>You can open any file Claude references directly from the Claude panel</p>
</li>
<li><p>The full terminal, if you need it, remains accessible</p>
</li>
</ul>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450372/nveiz5viekwzy0syue81.png" alt="Diff View of Proposed Edit" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>This environment is appropriate for any skill level. Experienced developers benefit from the tight integration. Those new to development benefit from the visibility, as it's always clear what Claude is doing and why.</p>
<h3 id="heading-installing-vs-code">Installing VS Code</h3>
<ol>
<li><p>Navigate to <strong>code.visualstudio.com</strong></p>
</li>
<li><p>Download the installer for your operating system</p>
</li>
<li><p>Run the installer and accept the default configuration at each step</p>
</li>
<li><p>Launch VS Code at completion</p>
</li>
</ol>
<h3 id="heading-installing-the-claude-code-extension">Installing the Claude Code Extension</h3>
<ol>
<li><p>In VS Code, locate the Extensions panel in the left sidebar. The icon resembles four squares, three arranged in an L with one offset</p>
</li>
<li><p>In the search field, enter <strong>Claude Code</strong></p>
</li>
<li><p>Select the official extension published by Anthropic – the legitimate listing shows several million downloads</p>
</li>
<li><p>Click Install</p>
</li>
</ol>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450398/yfyweplqjhb8kp3j1ohp.jpg" alt="VS Code Extensions Search" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>When installation completes, a Claude Code icon will appear in the VS Code toolbar. This opens the Claude Code panel.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450364/zmtqp79kuujulnf5nohq.png" alt="Claude Code Icon in Activity Bar" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h3 id="heading-opening-a-project">Opening a Project</h3>
<p>Claude Code operates within a directory – a folder containing your project files. Before beginning, create a folder for your project and open it in VS Code:</p>
<ol>
<li><p>Create a new folder anywhere on your system</p>
</li>
<li><p>In VS Code: <strong>File → Open Folder</strong></p>
</li>
<li><p>Select and open the folder</p>
</li>
</ol>
<p>VS Code will display the (currently empty) folder in its sidebar. Claude Code will read from and write to this directory throughout your session.</p>
<p>Click the Claude Code icon. The Claude Code panel opens. You're ready to begin.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450369/lm22je4xc4dbzfw32z6o.png" alt="Empty Claude Code Side Panel" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h3 id="heading-initial-configuration">Initial Configuration</h3>
<p>The first time you open Claude Code, type <code>/model</code> to select which model you want to use. For initial sessions, Sonnet 4.6 provides a reasonable starting point.</p>
<p>Next, review permissions by typing <code>/permissions</code>. The default setting requires Claude to ask before modifying any file. This is appropriate until you are comfortable with how Claude operates.</p>
<p>You can type <code>/help</code> to see all available commands.</p>
<h2 id="heading-chapter-8-subscriptions-token-costs-and-usage">Chapter 8: Subscriptions, Token Costs, and Usage</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter8/1200/400" alt="Chapter 8 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Having a working understanding of token economics will help you make better decisions about how you use Claude Code.</p>
<h3 id="heading-what-tokens-are">What Tokens Are</h3>
<p>Every interaction with a Claude model consumes tokens. A token corresponds roughly to three-quarters of a word – so a prompt of 100 words and a response of 300 words represents approximately 530 tokens total.</p>
<p>Token consumption accumulates across a session. Each message you send, each response Claude generates, each file Claude reads – all of this is tokenized and counted. On subscription plans, this accumulated usage is measured against your plan's daily allowance.</p>
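<p>The word-to-token ratio above can be turned into a rough back-of-envelope estimator. This is a heuristic sketch, not Anthropic's actual tokenizer – real token counts vary with vocabulary, punctuation, and formatting:</p>
<pre><code class="language-javascript">// Rough token estimator: ~1 token per 0.75 words (a common heuristic).
// This is NOT the real tokenizer; actual counts differ by content.
function estimateTokens(text) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.ceil(words / 0.75);
}

const prompt = "word ".repeat(100);   // stand-in for a 100-word prompt
const response = "word ".repeat(300); // stand-in for a 300-word response
const total = estimateTokens(prompt) + estimateTokens(response);
console.log(total); // 534 – close to the ~530 figure above
</code></pre>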
<h3 id="heading-where-token-consumption-accumulates">Where Token Consumption Accumulates</h3>
<p>Brief queries consume trivially few tokens. But the kind of work Claude Code is built for (reading through an existing codebase, planning a feature, implementing it, correcting course, running verification) can consume tens of thousands of tokens in a single session.</p>
<p>Advanced users running multiple parallel agents consume far more. Boris Cherny notes that some engineers at Anthropic spend hundreds of thousands of dollars monthly in token costs. For those engineers, Claude Code has replaced what would otherwise require entire engineering teams.</p>
<p>For someone beginning, usage will be modest. Token costs at the Pro tier are not a practical concern during the learning phase.</p>
<h3 id="heading-the-cost-of-capability-reduction">The Cost of Capability Reduction</h3>
<p>As I mentioned above, Boris Cherny's advice on model selection addresses a counterintuitive point: using a less capable model to reduce token costs often increases total token consumption. A model with weaker reasoning requires more correction passes, generates more context clarification, and takes longer to converge on an acceptable result. The less capable model costs less per token and often costs more per task.</p>
<p>His recommendation: use the most capable model available. Currently, that is Opus 4.6. Reduce model capability only when performance requirements are demonstrably lower and the trade-off is understood.</p>
<p>The broader principle he articulates: "Don't try to optimize too early. Give engineers as many tokens as they need. At the point where something is proven and scaling, then optimize. Not before."</p>
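<p>The trade-off can be made concrete with invented numbers. The per-token prices and pass counts below are illustrative assumptions only, not Anthropic's actual pricing – the point is structural:</p>
<pre><code class="language-javascript">// Illustrative only: hypothetical per-million-token prices and pass counts.
// A model that is cheaper per token but needs more correction passes
// can cost more per completed task.
function taskCost(pricePerMTok, tokensPerPass, passes) {
  return (tokensPerPass * passes * pricePerMTok) / 1e6;
}

// Hypothetical capable model: pricier per token, converges in one pass.
const capable = taskCost(15, 40000, 1); // $0.60 per task
// Hypothetical cheap model: a third of the price, needs four passes.
const cheap = taskCost(5, 40000, 4);    // $0.80 per task

console.log(capable.toFixed(2), cheap.toFixed(2));
</code></pre>
<p>Under these assumed numbers, the "cheap" model costs a third as much per token and a third more per task.</p>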
<h3 id="heading-practical-guidance">Practical Guidance</h3>
<table>
<thead>
<tr>
<th>Profile</th>
<th>Plan</th>
</tr>
</thead>
<tbody><tr>
<td>Learning, initial projects</td>
<td>Pro ($20)</td>
</tr>
<tr>
<td>Regular personal development</td>
<td>Max ($100)</td>
</tr>
<tr>
<td>Professional, sustained use</td>
<td>Max ($200)</td>
</tr>
<tr>
<td>Team or API-level usage</td>
<td>Anthropic API (usage-based)</td>
</tr>
</tbody></table>
<p>The rule is this: begin at the plan that doesn't actively frustrate your usage. If you reach limits and want to continue, that's information. It means you have integrated the tool into your work. Upgrade at that signal.</p>
<h2 id="heading-chapter-9-working-in-your-first-session">Chapter 9: Working in Your First Session</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter9/1200/400" alt="Chapter 9 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>With Claude Code installed and a project directory open, the first session is an exercise in learning how the system communicates and responds.</p>
<h3 id="heading-beginning-simply">Beginning Simply</h3>
<p>The appropriate starting point is a project with immediately visible output. If you're new to software development, web projects are optimal: you write files, open a browser, and see the result directly. The feedback loop is fast and unambiguous.</p>
<p>A minimal first task might look like this:</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393590/nhhwhgzaykgi1q2nfb2p.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Build a personal homepage. Include a name, a short biographical description,
and links to LinkedIn and Twitter. The design should be clean and dark-themed.
Use HTML and CSS only — no JavaScript frameworks.
</code></pre>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450366/t344t6vua9ttf6sfn3sq.png" alt="Claude Code with First Prompt" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Claude Code will:</p>
<ol>
<li><p>Assess the request and formulate a plan</p>
</li>
<li><p>Request permission to create the specified files (if operating in default mode)</p>
</li>
<li><p>Write the HTML and CSS files</p>
</li>
<li><p>Confirm what was produced</p>
</li>
</ol>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450385/qhsdyrglgtex4ffd9sic.png" alt="Claude Code Plan Mode Output" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Open the resulting <code>index.html</code> file in a browser. Assess whether it meets your intent. Where it does not, state precisely what needs to change. This cycle – prompt, output, assessment, refinement – is the fundamental working method.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450376/repz9nmalciyupzxw4p2.gif" alt="Generated index.html File visible" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h3 id="heading-learning-through-iteration">Learning Through Iteration</h3>
<p>Proficiency with Claude Code develops through use, not through reading. Each cycle of prompt, output, and refinement builds judgment about how to specify intent clearly, how to evaluate Claude's output, and how to correct course efficiently.</p>
<p>This is not unique to AI tools. It's how expertise in any instrument develops. Begin with modest scope, observe closely, and adjust. The speed at which fluency develops is proportional to how actively you engage with each result rather than accepting or rejecting it without analysis.</p>
<h2 id="heading-chapter-10-prompt-discipline-inputs-determine-outputs">Chapter 10: Prompt Discipline — Inputs Determine Outputs</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter10/1200/400" alt="Chapter 10 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>The relationship between prompt quality and output quality is direct and consistent. This is not a limitation of Claude Code. Rather, it's a property of any system that must translate intent into action. The more precisely intent is expressed, the more accurately it can be executed.</p>
<h3 id="heading-the-core-principle">The Core Principle</h3>
<p>Poor results from Claude Code are almost never attributable to model incapability. They are attributable to underspecified prompts. When Claude produces something that doesn't match what you wanted, the correct diagnostic question is: <em>was my specification sufficient to produce what I wanted?</em></p>
<p>In most cases, it was not.</p>
<p>Claude operates on what it receives. If you provide an abstract description of a product, Claude fills the gaps with its own reasonable assumptions. Where those assumptions deviate from your expectations – and they will – the output disappoints. The gap is not between what Claude can do and what you need. It is between what Claude knew and what it needed to know.</p>
<h3 id="heading-what-specification-requires">What Specification Requires</h3>
<p>Effective prompts are specific, contextual, and feature-oriented. Consider the difference:</p>
<p><strong>Underspecified:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393596/srkxpcu3yws6bzdrw6vs.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Build a task management app.
</code></pre>
<p><strong>Adequately specified:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393602/y6w7wajcdonsziuxl9d4.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Build a task management application for a three-person team. Requirements:

1. Tasks have a title, optional description, due date, and priority level (low, medium, high)
2. Each task can be assigned to one of three hardcoded users: Alice, Bob, or Carol
3. Tasks can be marked complete, with the completion timestamp recorded
4. The task list can be filtered by assignee or by priority
5. All data persists in localStorage — no backend required
6. Interface: clean, light theme; no external CSS frameworks

Technology: HTML, CSS, vanilla JavaScript only.
</code></pre>
<p>The second version closes all the significant decision points Claude would otherwise resolve by assumption. It produces a result substantially closer to intent on the first pass.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450389/zxcbnirpxy8mezetgqbt.png" alt="Underspecified vs Well-Specified Prompt" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
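<p>To make requirements 3 and 4 above concrete, here is a minimal sketch of the core task logic in vanilla JavaScript. The function and field names are illustrative choices, not what Claude would necessarily generate, and the localStorage call is guarded so the logic also runs outside a browser:</p>
<pre><code class="language-javascript">// Minimal sketch of the task model from the prompt above (names illustrative).
function createTask(title, assignee, priority) {
  return { title: title, assignee: assignee, priority: priority,
           completedAt: null };
}

// Requirement 3: mark complete, recording the completion timestamp.
function completeTask(task) {
  task.completedAt = new Date().toISOString();
  return task;
}

// Requirement 4: filter by assignee or by priority.
function filterTasks(tasks, field, value) {
  return tasks.filter(function (t) { return t[field] === value; });
}

// Requirement 5: persist in localStorage (guarded so the sketch
// also runs outside a browser environment).
function saveTasks(tasks) {
  if (typeof localStorage !== "undefined") {
    localStorage.setItem("tasks", JSON.stringify(tasks));
  }
}

const tasks = [
  createTask("Write report", "Alice", "high"),
  createTask("Review PR", "Bob", "low"),
];
completeTask(tasks[0]);
saveTasks(tasks);
console.log(filterTasks(tasks, "assignee", "Alice").length); // 1
</code></pre>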
<h3 id="heading-explicit-over-implicit">Explicit Over Implicit</h3>
<p>State what you want Claude to do even when it seems obvious. If you want Claude to examine documentation before writing code, say so. If you want specific library versions, specify them. If you want no changes to existing files, make that explicit. If you want a particular file structure, describe it.</p>
<p>Implicit expectations are expectations that are frequently not met. Explicit instructions are instructions that are consistently followed.</p>
<h3 id="heading-the-specificity-standard">The Specificity Standard</h3>
<p>A practical test: if a competent engineer read your prompt with no other context, could they build exactly what you have in mind? If not, the prompt is not yet specific enough.</p>
<p>This standard is useful because it surfaces the actual gaps: the decisions you haven't yet made, the constraints you haven't yet articulated, the behavior you haven't yet defined. Filling those gaps before Claude begins building is far more efficient than discovering them in the output.</p>
<h2 id="heading-chapter-11-planning-as-a-core-practice">Chapter 11: Planning as a Core Practice</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter11/1200/400" alt="Chapter 11 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Planning is the highest-leverage activity in Claude Code development. It's also the most commonly underinvested.</p>
<h3 id="heading-why-planning-determines-outcomes">Why Planning Determines Outcomes</h3>
<p>A well-specified plan, reviewed and approved before any code is written, produces three effects:</p>
<ol>
<li><p>Claude invests its reasoning in the right problem</p>
</li>
<li><p>Misalignments between intent and approach are caught before they are embedded in code</p>
</li>
<li><p>The resulting implementation is coherent, because it follows a coherent design</p>
</li>
</ol>
<p>Without adequate planning, Claude makes architectural and implementation decisions autonomously, based on what it infers from limited input. Some of those decisions will be correct. Some will not. Discovering which ones were wrong after the codebase has been built around them is expensive. It requires reading, understanding, and correcting code that was generated at significant token cost.</p>
<p>The ratio of planning time to development time that produces optimal outcomes is higher than most people initially expect. Thirty minutes of structured planning frequently reduces a ten-hour build to three hours. The mathematics are not subtle.</p>
<h3 id="heading-plan-mode">Plan Mode</h3>
<p>Claude Code includes a dedicated <strong>Plan Mode</strong>. In this mode, Claude reasons through the task and produces a structured plan – which files will be affected, what the implementation sequence will be, how data will flow, what edge cases need to be handled – without writing a single line of code.</p>
<p>You review the plan. You can question it, modify it, reject portions of it, or add constraints. Only when the plan reflects your actual intent do you release Claude to begin implementation.</p>
<p>Boris Cherny uses Plan Mode for approximately 80% of his sessions. The mechanism itself is disarmingly simple – a single sentence injected into the model's context: <em>"Please do not write any code yet."</em> That single instruction changes Claude's behavior from execution to structured reasoning.</p>
<p>To activate Plan Mode:</p>
<ul>
<li><p><strong>In the terminal</strong>: press Shift+Tab twice</p>
</li>
<li><p><strong>In VS Code or the desktop app</strong>: click the Plan Mode button in the interface</p>
</li>
</ul>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450384/jnx8iaptfvqh5gzp3hjz.png" alt="Plan Mode Button in VS Code" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>The discipline here is important: actually read the plan Claude produces. Don't approve it reflexively. The plan is the point at which you can intervene at minimum cost.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450374/sjhn2dk2tz2tdqouclte.png" alt="Example Plan Mode Detail" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h3 id="heading-the-product-requirements-document">The Product Requirements Document</h3>
<p>The formal output of a planning session is a <strong>Product Requirements Document</strong> (PRD). For individual projects, this need not be elaborate. It should contain:</p>
<ol>
<li><p>A clear statement of what is being built and for whom</p>
</li>
<li><p>A specific list of features, each described in behavioral terms</p>
</li>
<li><p>Technical requirements: technology stack, database, external APIs, browsers to support</p>
</li>
<li><p>Interface parameters: visual style, layout decisions, interaction model</p>
</li>
<li><p>Explicit criteria for what constitutes a feature being complete</p>
</li>
</ol>
<p>A <code>PRD.md</code> file at the project root, readable by Claude Code at the start of each session, provides consistent context that persists across sessions. The quality of this document directly determines the quality of every subsequent build session.</p>
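<p>As a sketch, a <code>PRD.md</code> for the task manager specified in the previous chapter might begin like this – the exact structure is a suggestion, not a required format:</p>
<pre><code class="language-plaintext"># PRD: Team Task Manager

## What and for whom
A task tracker for a three-person team (Alice, Bob, Carol).

## Features
- Create task: title, optional description, due date, priority (low/medium/high)
- Assign each task to one of the three users
- Mark complete; record the completion timestamp
- Filter the list by assignee or by priority

## Technical requirements
- HTML, CSS, vanilla JavaScript; data persists in localStorage
- No backend, no external CSS frameworks

## Done means
- All features behave as specified in current Chrome and Firefox
- Data survives a page reload
</code></pre>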
<h3 id="heading-the-interview-approach">The Interview Approach</h3>
<p>A useful technique for generating a complete PRD is to instruct Claude to interview you about the project before planning anything:</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393607/bxif8vtof8hna3cppgmh.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">I want to build [project description]. Before writing any plan or code,
please use the Ask User Question tool to interview me systematically —
covering technical requirements, feature specifications, UI decisions,
data model, and any trade-offs I should consider.
Do not proceed to planning until the interview is complete.
</code></pre>
<p>This surfaces decisions you may not have consciously formulated. Some questions will have obvious answers. Others will reveal gaps in your own thinking – gaps you would prefer to close before they manifest as incorrect implementation decisions.</p>
<h2 id="heading-chapter-12-building-feature-by-feature">Chapter 12: Building Feature by Feature</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter12/1200/400" alt="Chapter 12 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>A planned project is built incrementally. You should resist the instinct to attempt complete implementation in a single pass. Feature-by-feature development isn't slower – it's more reliable, more verifiable, and ultimately faster.</p>
<h3 id="heading-the-case-for-incremental-development">The Case for Incremental Development</h3>
<p>Each feature implemented is a unit of behavior that can be verified independently. If Feature 2 is built on top of Feature 1 without verifying Feature 1, defects compound. A flaw in the foundation propagates upward, embedding itself in everything built above it. Discovering that flaw late multiplies the cost of correcting it.</p>
<p>Each verified feature is also a stable platform from which the next feature can be built with confidence. The accumulation of verified, working behavior is what a production system is.</p>
<p>There is also the matter of understanding. A developer who has watched – and reviewed – Claude build each feature, one at a time, understands how the system works. That understanding is necessary for directing effective corrections, making informed architectural decisions, and explaining the system to others.</p>
<h3 id="heading-the-build-cycle">The Build Cycle</h3>
<p>For each feature in the PRD:</p>
<ol>
<li><p><strong>Specify the feature in detail.</strong> Provide the feature description and any supplemental context: files that will be involved, constraints on implementation, acceptance criteria.</p>
</li>
<li><p><strong>Enter Plan Mode and review the plan.</strong> Read Claude's proposed approach. Is it consistent with your design intent? Will it affect files it shouldn't touch? Is the data flow correct? Revise the plan if necessary.</p>
</li>
<li><p><strong>Approve and release to implementation.</strong> Once the plan is sound, release Claude to implement. Review the diff view for each file change.</p>
</li>
<li><p><strong>Test the feature.</strong> Manually exercise the feature's intended behavior. Deliberately test boundary conditions. Does it behave correctly when data is missing? When values are at their limits? When the user does something unexpected?</p>
</li>
<li><p><strong>Address any discrepancies.</strong> State precisely what is incorrect and ask Claude to correct it. Specificity in correction is as important as specificity in the original specification.</p>
</li>
<li><p><strong>Confirm and advance.</strong> When the feature works correctly, move to the next one.</p>
</li>
</ol>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450373/j2xawotofpz70wsumrvv.png" alt="Diff View of Single Feature Change" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h3 id="heading-appropriate-starting-projects">Appropriate Starting Projects</h3>
<p>If you're new to software development, the best initial projects are small, web-based, and produce immediately visible output. Examples:</p>
<table>
<thead>
<tr>
<th>Project</th>
<th>Rationale</th>
</tr>
</thead>
<tbody><tr>
<td>Personal homepage</td>
<td>Immediate visual feedback, single HTML/CSS file</td>
</tr>
<tr>
<td>Simple calculator</td>
<td>Concrete logic, verifiable output</td>
</tr>
<tr>
<td>Task list application</td>
<td>Multi-feature structure, foundational CRUD pattern</td>
</tr>
<tr>
<td>Static portfolio</td>
<td>Practical result, shareable immediately</td>
</tr>
<tr>
<td>Data display dashboard</td>
<td>Introduction to structured data and layout</td>
</tr>
</tbody></table>
<p>Web projects require no server configuration, no deployment setup, and no build pipeline. Open a browser, load a file, observe the result. The feedback cycle is as short as possible, which makes the learning cycle as short as possible.</p>
<h2 id="heading-chapter-13-how-claude-code-actually-works">Chapter 13: How Claude Code Actually Works</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter13/1200/400" alt="Chapter 13 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Understanding the mechanics of Claude Code at a technical level changes how you use it. When you know what's happening inside a session, you write better prompts, diagnose problems faster, and make better decisions about when to intervene and when to let the agent proceed.</p>
<h3 id="heading-the-agent-loop">The Agent Loop</h3>
<p>At its core, Claude Code operates as a <strong>reasoning loop</strong>. Every session, at every step, follows the same underlying cycle:</p>
<ol>
<li><p><strong>Receive input</strong>: a message from you, or the result of a previous tool call</p>
</li>
<li><p><strong>Reason</strong>: determine what action to take next</p>
</li>
<li><p><strong>Select a tool</strong>: choose which capability to invoke</p>
</li>
<li><p><strong>Execute the tool</strong>: take the action</p>
</li>
<li><p><strong>Observe the result</strong>: receive what the tool returned</p>
</li>
<li><p><strong>Reason again</strong>: determine the next action given the new information</p>
</li>
<li><p><strong>Repeat</strong>: until the task is complete or input is required</p>
</li>
</ol>
<p>This cycle is not visible in the interface. You see messages and file edits. Internally, Claude is running through this loop many times per task: reading a file, thinking about what it implies, reading another file, forming a plan, writing a change, running a command to verify it, reading the output, and deciding whether to adjust.</p>
<p>The model is the reasoning engine. The tools provide the hands. The loop provides the structure.</p>
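<p>The loop above can be sketched in a few lines of JavaScript. This is an illustrative toy, not Claude Code's actual implementation (which is not public): the stub "model" and the <code>read_file</code> tool below exist only to demonstrate the shape of the cycle.</p>
<pre><code class="language-javascript">// Minimal sketch of an agent reasoning loop (illustrative only).
function runAgentLoop(model, tools, userMessage, maxSteps = 10) {
  const history = [{ role: "user", content: userMessage }];
  for (let step = 0; step !== maxSteps; step++) {
    // Reason: the model looks at everything so far and picks the next action.
    const action = model(history);
    if (action.type === "done") return action.answer; // task complete
    // Select and execute a tool, then observe: the result feeds the next turn.
    const result = tools[action.tool](action.args);
    history.push({ role: "tool", tool: action.tool, content: result });
  }
  throw new Error("step limit reached before the task completed");
}

// A stub model that reads one file, then answers. A real model is a
// language-model call; this stub only demonstrates the loop's structure.
const tools = {
  read_file: ({ path }) => (path === "notes.txt" ? "hello" : ""),
};
const stubModel = (history) => {
  const last = history[history.length - 1];
  if (last.role === "user") {
    return { type: "tool", tool: "read_file", args: { path: "notes.txt" } };
  }
  return { type: "done", answer: `file said: ${last.content}` };
};

runAgentLoop(stubModel, tools, "what does notes.txt say?");
// -> "file said: hello"
</code></pre>
<p>Every behavior described in this chapter is some variation of this loop: the difference between a trivial task and a complex one is only how many iterations it takes.</p>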
<h3 id="heading-how-tool-calls-work">How Tool Calls Work</h3>
<p>Claude Code has access to a specific set of tools. From the model's perspective, these are callable functions, each with a name, a set of parameters, and a return value.</p>
<p>When Claude decides to read a file, it doesn't "see" the file. It generates a structured tool call: <code>read_file(path="src/auth.js")</code>. The tool executes and returns the file's contents. Claude receives that content and continues reasoning.</p>
<p>The tools available to Claude Code include, in simplified form:</p>
<table>
<thead>
<tr>
<th>Tool</th>
<th>What It Does</th>
</tr>
</thead>
<tbody><tr>
<td><code>read_file</code></td>
<td>Returns the contents of a file as text</td>
</tr>
<tr>
<td><code>write_file</code></td>
<td>Writes content to a file, creating it if it does not exist</td>
</tr>
<tr>
<td><code>edit_file</code></td>
<td>Applies a specific change to an existing file</td>
</tr>
<tr>
<td><code>run_command</code></td>
<td>Executes a shell command and returns stdout and stderr</td>
</tr>
<tr>
<td><code>list_directory</code></td>
<td>Returns the file and directory structure of a path</td>
</tr>
<tr>
<td><code>search_files</code></td>
<td>Searches file contents for a pattern</td>
</tr>
<tr>
<td><code>web_search</code></td>
<td>Queries the web and returns results</td>
</tr>
<tr>
<td><code>ask_user</code></td>
<td>Pauses and requests input from you</td>
</tr>
<tr>
<td><code>browser</code></td>
<td>Opens a browser and interacts with a web page</td>
</tr>
</tbody></table>
<p>When Claude Code runs a test suite, it issues <code>run_command("npm test")</code>, receives the output, reads which tests failed, then uses <code>edit_file</code> to apply corrections. Then it runs the command again. When it is exploring an unfamiliar codebase, it issues <code>list_directory</code> to understand the structure before opening specific files.</p>
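<p>That test-and-fix cycle can be sketched as a small driver loop. The helper names below (<code>runCommand</code>, <code>editFile</code>, <code>proposeFix</code>) are hypothetical stand-ins for the agent's tool calls and reasoning, not a real API:</p>
<pre><code class="language-javascript">// Sketch of the cycle: run the suite, read failures, apply a targeted
// edit, rerun. Bounded attempts so the loop cannot run indefinitely.
function fixUntilGreen(runCommand, editFile, proposeFix, maxAttempts = 3) {
  for (let attempt = 0; attempt !== maxAttempts; attempt++) {
    const { exitCode, output } = runCommand("npm test");
    if (exitCode === 0) return { green: true, attempts: attempt };
    const fix = proposeFix(output); // model reads failures, proposes an edit
    editFile(fix.path, fix.change);
  }
  return { green: false, attempts: maxAttempts };
}

// Simulated run: the first invocation fails, the "edit" fixes the code.
let fixed = false;
const runCommand = () =>
  fixed
    ? { exitCode: 0, output: "all tests passed" }
    : { exitCode: 1, output: "FAIL: add() returns the wrong sum" };
const editFile = () => { fixed = true; };
const proposeFix = () => ({ path: "src/math.js", change: "correct add()" });

const result = fixUntilGreen(runCommand, editFile, proposeFix);
// -> { green: true, attempts: 1 }
</code></pre>
<p>The bounded attempt count matters: an agent needs a stopping rule, which is also why you will sometimes see Claude pause and ask for direction rather than retry forever.</p>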

<p>This is why Plan Mode is so powerful: it lets Claude perform the reasoning portion of the loop – deciding which tools it would use, in what sequence, for what purpose – without actually executing those tools. You see the proposed sequence before anything happens.</p>
<h3 id="heading-how-claude-reads-a-codebase">How Claude Reads a Codebase</h3>
<p>When you open a project in Claude Code, Claude does not automatically read all your files. It reads selectively, on demand, as the loop requires.</p>
<p>A typical codebase exploration sequence:</p>
<ol>
<li><p>Claude issues <code>list_directory</code> on the root and sees the top-level folder structure</p>
</li>
<li><p>It identifies meaningful directories (<code>src/</code>, <code>app/</code>, <code>components/</code>, etc.) and reads those</p>
</li>
<li><p>For the specific task at hand, it reads the files most likely to be relevant: the component being modified, the service being called, the configuration being adjusted</p>
</li>
<li><p>If it encounters an import or reference to something it has not read yet, it reads that file too</p>
</li>
</ol>
<p>This selective, on-demand reading is efficient but has implications you should understand:</p>
<p>First, Claude's view of your codebase is always partial. At any given point in a session, Claude has read some of your files and not others. If something relevant exists in a file Claude has not read, Claude doesn't know about it.</p>
<p>This is why being explicit in your prompts matters: if a constraint or convention lives in a file Claude has not been asked to read, it will not apply that constraint unless you tell it to or point it to the file.</p>
<p>Second, the CLAUDE.md file is read first, always. Because of this, your <code>CLAUDE.md</code> is the most reliable place to encode conventions. It is the one piece of context Claude has before it reads anything else.</p>
<p>And finally, you can direct Claude's attention. In your prompts, you can name specific files: "Before implementing this feature, read <code>src/auth/middleware.js</code> and <code>src/db/schema.js</code>." Claude will read those files first, incorporate their content into its understanding, and apply that understanding to the task.</p>
<h3 id="heading-how-claude-assembles-context-for-each-step">How Claude Assembles Context for Each Step</h3>
<p>Each step of the loop constructs what is called a <strong>context</strong> – the full set of information the model receives before generating its next response. This context includes:</p>
<ul>
<li><p>The conversation history so far (your messages, Claude's responses)</p>
</li>
<li><p>The results of all tool calls in this session</p>
</li>
<li><p>The contents of any files Claude has read</p>
</li>
<li><p>The <code>CLAUDE.md</code> file (if present)</p>
</li>
<li><p>Any system-level instructions from Anthropic</p>
</li>
</ul>
<p>This entire assembled context is what the model "sees." There is no memory outside it. Nothing persists from one session to the next except what exists in files on disk.</p>
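<p>In code terms, you can picture the assembly as concatenating those sources into one ordered list of messages. The field names and ordering below are illustrative assumptions, not Claude Code's actual internal format:</p>
<pre><code class="language-javascript">// Sketch of context assembly: everything the model "sees" before one step.
function assembleContext(session) {
  return [
    { role: "system", content: session.systemInstructions },
    { role: "system", content: session.claudeMd ?? "" }, // CLAUDE.md, if present
    ...session.messages,    // the conversation so far
    ...session.toolResults, // tool outputs, including file contents read
  ];
}

const context = assembleContext({
  systemInstructions: "You are a coding agent.",
  claudeMd: "## Conventions: ES modules only",
  messages: [{ role: "user", content: "Add email validation to signup." }],
  toolResults: [{ role: "tool", tool: "read_file", content: "...file text..." }],
});
// context.length -> 4; nothing outside this array persists between sessions
</code></pre>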
<p>This is why context management matters. A session with a long conversation history, multiple large file reads, and extensive tool output can reach 40–50% of the window limit sooner than you might expect. The model reasons over everything in context simultaneously: as the window fills, it begins to weight recent inputs more heavily than earlier ones, and coherence with decisions made early in the session degrades.</p>
<h3 id="heading-why-the-models-judgment-matters">Why the Model's Judgment Matters</h3>
<p>Because Claude Code's architecture exposes the model directly – without tight scripted workflows constraining what it can decide – the model must exercise genuine judgment at each step of the loop. It decides:</p>
<ul>
<li><p>Which files are worth reading given the task</p>
</li>
<li><p>Whether a plan needs adjustment based on what it discovered in a file</p>
</li>
<li><p>Whether a test failure is a real bug or a test that needs updating</p>
</li>
<li><p>Whether to ask you a question or make a reasonable assumption and proceed</p>
</li>
<li><p>When a task is genuinely complete vs. when further verification is warranted</p>
</li>
</ul>
<p>This is the practical meaning of "the product is the model." Claude Code's quality is the model's quality. The tools are consistent. The loop is consistent. What varies from one model generation to the next is the quality of reasoning at each decision point. And that is everything.</p>
<h2 id="heading-chapter-14-architecting-applications-well-with-claude-code">Chapter 14: Architecting Applications Well with Claude Code</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter14/1200/400" alt="Chapter 14 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Writing a prompt that produces a feature is one skill. Designing an application that Claude Code can build reliably, maintain coherently, and extend over time is another, deeper skill.</p>
<p>This chapter covers the structural principles that make Claude Code development efficient and the codebases it produces maintainable.</p>
<h3 id="heading-the-connection-between-structure-and-prompt-quality">The Connection Between Structure and Prompt Quality</h3>
<p>There is a direct relationship between how a codebase is organized and how well Claude Code can work within it. A well-structured codebase is one in which:</p>
<ul>
<li><p>Each file has a single, identifiable responsibility</p>
</li>
<li><p>File and directory names accurately reflect their contents</p>
</li>
<li><p>Dependencies between components flow in one direction</p>
</li>
<li><p>Configuration is separated from logic</p>
</li>
<li><p>External integrations are isolated in defined boundaries</p>
</li>
</ul>
<p>When a codebase has this kind of structure, Claude can read a small number of files to understand a given component, make targeted changes, and produce output that fits coherently with what already exists.</p>
<p>When a codebase is disorganized – logic mixed with presentation, files responsible for multiple unrelated things, dependencies tangled across layers – every change Claude makes requires reading more context to understand the system, and the probability of an unintended side effect increases. The prompts required become longer and more prescriptive. The review burden increases.</p>
<p>Good structure is not organization for its own sake. It is a productivity investment. A codebase that Claude Code can navigate efficiently is one that produces higher-quality output in fewer tokens.</p>
<h3 id="heading-the-single-responsibility-principle-applied">The Single Responsibility Principle, Applied</h3>
<p>The most important structural principle for Claude Code projects is also one of the most important in software engineering generally: each module should do one thing.</p>
<p>In practice, for a web application:</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393613/nrgid48zdrowab0jqpmd.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">src/
  components/      — UI components: each component renders one thing
  services/        — Business logic: each service owns one domain
  api/             — HTTP handlers: each handler manages one route group
  db/              — Data access: each module owns one entity
  utils/           — Pure utility functions: no side effects, no state
  config/          — All configuration: no configuration scattered elsewhere
</code></pre>
<p>When you ask Claude to "add email validation to the signup form," Claude reads <code>components/SignupForm.jsx</code> (the component), <code>services/auth.js</code> (the auth logic), and possibly <code>utils/validation.js</code> (shared validators). Three files. Focused change. Clean result.</p>
<p>If signup logic were embedded in a single large <code>app.js</code> file with all other logic, Claude would need to read everything, reason about what to touch and what not to, and work in a context where a single misplaced edit can affect unrelated behavior. The change is the same – the cost of making it is not.</p>
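<p>To make the example concrete, the shared validator such a change might touch could look like the sketch below. The file placement follows the structure above; the regex is a deliberately simplified illustration, not production-grade email validation:</p>
<pre><code class="language-javascript">// utils/validation.js -- pure functions: no side effects, no state.
function isValidEmail(value) {
  if (typeof value !== "string") return false;
  // a single "@", no whitespace, and at least one "." in the domain part
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value.trim());
}

isValidEmail("ada@example.com"); // -> true
isValidEmail("not-an-email");    // -> false
</code></pre>
<p>Because the function is pure and lives where the naming says it should, both Claude and a human reviewer can find it, reuse it, and test it in isolation.</p>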
<h3 id="heading-layer-separation-as-a-claude-code-multiplier">Layer Separation as a Claude Code Multiplier</h3>
<p>The pattern that consistently produces the best outcomes with Claude Code is a strict separation between layers of an application:</p>
<ol>
<li><p><strong>Presentation layer</strong>: what the user sees and interacts with. Components, pages, templates. No business logic. No data fetching beyond what the component directly needs.</p>
</li>
<li><p><strong>Business logic layer</strong>: what the application does. Services, use cases. No knowledge of the database interface. No UI-specific concerns.</p>
</li>
<li><p><strong>Data access layer</strong>: how data is stored and retrieved. Repository pattern or service layer specific to database interaction. No business logic. No presentation concerns.</p>
</li>
<li><p><strong>Integration layer</strong>: connections to external services (APIs, email providers, payment processors). Strictly isolated so that the rest of the system does not depend on the implementation details of external services.</p>
</li>
</ol>
<p>When these layers are enforced, both in the file structure and in the <code>CLAUDE.md</code> conventions, Claude Code respects them automatically. It knows that a feature touching the UI does not require changes to the data access layer. It knows that an email integration lives in one place and is called through a defined interface. Changes are targeted. Side effects are minimal. Reviews are coherent.</p>
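<p>A minimal sketch of that separation, with hypothetical names and an in-memory array standing in for the real database:</p>
<pre><code class="language-javascript">// Data access layer: owns persistence, knows nothing about business rules.
const taskRepo = {
  rows: [],
  insert(task) { this.rows.push(task); return task; },
  findByOwner(owner) { return this.rows.filter((t) => t.owner === owner); },
};

// Business logic layer: rules live here. It talks only to the repository
// interface, never to the database or the UI directly.
const taskService = {
  createTask(owner, title) {
    if (!title || !title.trim()) throw new Error("title is required");
    return taskRepo.insert({ owner, title: title.trim(), done: false });
  },
  listTasks(owner) { return taskRepo.findByOwner(owner); },
};

taskService.createTask("ana", "  Ship the release  ");
taskService.listTasks("ana"); // one task, title "Ship the release"
</code></pre>
<p>Because the service depends only on the repository's interface, swapping the in-memory array for a real database changes one layer and leaves the business rules untouched, exactly the kind of targeted change Claude Code handles well.</p>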
<h3 id="heading-how-to-structure-a-new-project-for-claude-code">How to Structure a New Project for Claude Code</h3>
<p>When beginning a new project, spend time on the directory structure before asking Claude to write any code. Establish the structure, write the <code>CLAUDE.md</code>, and then begin feature implementation.</p>
<p>A practical sequence:</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393619/pmst7xkmdo0cqcbboluu.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">1. Create the directory structure manually (or ask Claude to create it from a spec)
2. Write CLAUDE.md with: stack, conventions, layer rules, library choices
3. Create a PRD.md with the full feature list
4. Ask Claude to implement a skeleton — empty files in the right places,
   with the right imports and class/function signatures, but no real logic yet
5. Review the skeleton — is the structure right? — before building on it
6. Implement features one at a time into this established structure
</code></pre>
<p>Step 4 is particularly valuable: a skeleton gives you a complete view of the application's shape before any logic exists. Structural mistakes – like a component in the wrong layer, a misplaced dependency, a missing interface – are visible and cheap to fix. Once logic is written into the wrong structure, correcting the structure requires rewriting the logic too.</p>
<h3 id="heading-what-claude-codes-file-edits-tell-you-about-your-design">What Claude Code's File Edits Tell You About Your Design</h3>
<p>Pay attention to which files Claude touches when implementing a feature. Specifically:</p>
<p>If Claude is modifying more than 3–5 files for a single feature, the feature may not be coherently scoped, or the application structure may have too many dependencies between components.</p>
<p>If Claude is modifying the same files repeatedly across different features, those files are taking on too much responsibility.</p>
<p>If Claude asks clarifying questions before beginning, the specification was not complete enough or the existing structure is ambiguous about where the feature should live.</p>
<p>Each of these is a signal. Read it as information about the design, not as criticism of the instruction. Well-designed systems produce focused changes. Tangled systems produce sprawling changes.</p>
<h3 id="heading-naming-and-the-navigation-problem">Naming and the Navigation Problem</h3>
<p>Claude Code navigates your codebase by reading file and directory names, examining import statements, and searching for patterns. Names are therefore not cosmetic. They are structural.</p>
<p>A file named <code>utils.js</code> tells Claude almost nothing about what is inside it. A file named <code>validation.js</code>, <code>dateFormatters.js</code>, or <code>currencyUtils.js</code> tells Claude exactly where to look when it needs that functionality.</p>
<p>Enforce these naming standards in your <code>CLAUDE.md</code>:</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393625/phsjtcs30y4eqgbfcdxq.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-markdown">## Naming Conventions

- Components: PascalCase, descriptive noun (UserProfileCard, TaskListItem)
- Services: camelCase, domain noun (authService, notificationService)
- Utilities: camelCase, descriptive of the concern (dateFormatters, currencyUtils)
- API handlers: camelCase, resource-oriented (userHandlers, taskHandlers)
- Test files: same name as the file being tested, with `.test.js` suffix
</code></pre>
<p>With these rules in <code>CLAUDE.md</code>, Claude will follow them consistently. Your code will be navigable, both by Claude in a session, and by developers (including yourself) returning to the project later.</p>
<h3 id="heading-defining-done-in-your-claudemd">Defining Done in Your CLAUDE.md</h3>
<p>One of the highest-value additions to any <code>CLAUDE.md</code> is a clear definition of what "done" means for a feature. Developers use different standards. Claude Code should use yours:</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393631/nzg0zgbr3klbc5f70akz.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-markdown">## Definition of Done

A feature is complete when:

1. The specified behavior works correctly across all described scenarios
2. Edge cases identified in the specification are handled
3. All new functions have JSDoc comments
4. A corresponding unit test exists and passes
5. No linting errors are introduced
6. The feature works correctly at desktop and mobile viewport widths
</code></pre>
<p>With this definition present, Claude will not declare a feature finished when the happy path works but the edge cases have not been tested. It will complete all steps in the definition before telling you the task is done.</p>
<h2 id="heading-chapter-15-plan-mode-edit-mode-and-operational-modes">Chapter 15: Plan Mode, Edit Mode, and Operational Modes</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter15/1200/400" alt="Chapter 15 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Claude Code's operational modes govern how it behaves when executing a task. Understanding these modes prevents common errors and allows you to calibrate the level of oversight appropriate to your context.</p>
<h3 id="heading-plan-mode">Plan Mode</h3>
<p>We talked about this in detail in Chapter 11. Claude reasons – it doesn't write. Use it to begin any task of non-trivial complexity. The cost of a few minutes in Plan Mode is invariably lower than the cost of correcting a misaligned implementation.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450384/jnx8iaptfvqh5gzp3hjz.png" alt="Plan Mode Button Location" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h3 id="heading-ask-before-edits">Ask Before Edits</h3>
<p>This is the default mode. Before modifying an existing file or creating a new one, Claude presents the proposed change and requests explicit approval.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450382/vr7dzinokvwf2devzxkq.png" alt="Permissions Mode Selector Toggle" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>The presentation takes the form of a diff view: proposed additions displayed in green, proposed deletions in red. You can approve or reject each change individually.</p>
<p>This mode is appropriate whenever:</p>
<ul>
<li><p>You're working in an unfamiliar codebase</p>
</li>
<li><p>You're building something for the first time with Claude Code</p>
</li>
<li><p>The consequences of an incorrect edit are significant</p>
</li>
</ul>
<p>The overhead of reviewing and approving each change is not waste. It is learning. You understand what is being built because you have reviewed every element of it.</p>
<h3 id="heading-automatic-edit-mode">Automatic Edit Mode</h3>
<p>In this mode, Claude writes files without asking for approval between each change. This is appropriate only after you've used Plan Mode and you've reviewed and approved the plan. Once you've confirmed that Claude's approach is correct, there's no additional value in approving each individual file write – the strategy has already been established.</p>
<p>Boris describes the transition:</p>
<blockquote>
<p>"Once the plan looks good, I just let the model execute. I auto-accept edits after that. With Opus 4.6, it oneshots it correctly almost every time."</p>
</blockquote>
<p>Don't default to this mode. Earn it by developing confidence in your own plan review. Automatic edits without plan review is the condition in which errors most readily compound.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450388/l6gv5xs5nvty5znotidf.png" alt="Shift+Tab Shortcut for Plan Mode" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h2 id="heading-chapter-16-context-windows-and-session-management">Chapter 16: Context Windows and Session Management</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter16/1200/400" alt="Chapter 16 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>Every Claude session operates within a context window, which is the total volume of information the model can hold simultaneously. When that window fills, older information is compressed or lost. Understanding this constraint is necessary for managing long sessions and multi-session projects effectively.</p>
<h3 id="heading-the-context-window">The Context Window</h3>
<p>For Claude Opus 4.6, the context limit is 200,000 tokens – equivalent to roughly 150,000 words, or approximately 200–300 pages of text.</p>
<p>In a long development session, this fills faster than it appears to. Reading a large codebase consumes tokens. A lengthy conversation accumulates them. Plans, implementations, test outputs, corrections – all tokenized, all counting toward the window.</p>
<p>The symptom of context saturation is drift: Claude begins making decisions inconsistent with constraints it established early in the session. It forgets architectural decisions. It reverts to default assumptions. If you notice this, the session has likely consumed most of its available context.</p>
<h3 id="heading-the-50-practice">The 50% Practice</h3>
<p>A working convention among experienced Claude Code users: when context usage reaches 40–50%, begin a new session. This is conservative enough to avoid the degradation zone and preserves clarity throughout the active session.</p>
<p>Claude Code displays context usage as a percentage. Monitor it during long sessions.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450370/v9gfdbzwt4pmkfgycecf.png" alt="Context Usage Percentage Indicator" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h3 id="heading-cross-session-continuity">Cross-Session Continuity</h3>
<p>Beginning a new session does not mean beginning from zero. Every file you have written, every architectural decision encoded in your codebase, and every convention documented in plain text is still on disk. Claude can read all of it fresh at the start of a new session. What it cannot do is recall the conversation that produced it. That context lives only in the session that created it.</p>
<p>The solution is to make your project self-documenting. When a project is properly documented, a new session can reach productive context in under a minute, without reconstruction from memory.</p>
<h4 id="heading-the-four-continuity-documents">The Four Continuity Documents</h4>
<p>The most effective approach uses four files, each serving a distinct purpose:</p>
<p><code>CLAUDE.md</code> <strong>— Project conventions and architecture context</strong></p>
<p>This file lives at the project root. Claude Code reads it automatically at the start of every session, before reading anything else. It is the most reliable channel for encoding project-level context that must always be present.</p>
<p>A well-maintained <code>CLAUDE.md</code> contains:</p>
<ul>
<li><p>Technology stack and why specific choices were made</p>
</li>
<li><p>Directory structure and what each major directory contains</p>
</li>
<li><p>Coding conventions (naming patterns, async style, error handling approach)</p>
</li>
<li><p>Library choices: what is used and what is explicitly prohibited</p>
</li>
<li><p>Layer rules (for example, "all database access goes through <code>db/repos/</code> – no SQL in route files")</p>
</li>
<li><p>Security requirements specific to this project</p>
</li>
<li><p>Definition of done for a completed feature</p>
</li>
</ul>
<p><code>PRD.md</code> <strong>— What is being built</strong></p>
<p>The Product Requirements Document defines the complete feature set in behavioral terms. It is the authoritative answer to "what should this system do." Claude reads the PRD at the start of any feature session to establish alignment between your intent and its implementation plan.</p>
<p>A PRD entry for a feature should include:</p>
<ul>
<li><p>Feature title and one-line description</p>
</li>
<li><p>Behavioral specification (what happens when X, what happens when Y)</p>
</li>
<li><p>Acceptance criteria: what must be true for this feature to be considered complete</p>
</li>
<li><p>Edge cases that must be handled</p>
</li>
</ul>
<p><code>README.md</code> <strong>— Current implementation status</strong></p>
<p>The README serves two purposes: it describes the project for anyone encountering it for the first time, and – for Claude Code purposes – it maintains a running record of what has been built and what remains. Update it as features are completed. A README with an accurate implementation status section allows a new session to pick up mid-project without needing a reconstruction conversation.</p>
<p>A practical status section looks like:</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393637/cdmpj9cj5d4gvgkqxklc.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-markdown">## Implementation Status

### Completed

- [x] User authentication (JWT, register, login, logout)
- [x] Task creation and persistence (SQLite, full CRUD)
- [x] Task filtering by status and priority

### In Progress

- [ ] Email notification on task assignment (backend logic done; email provider integration pending)

### Remaining

- [ ] Team management (invite members, manage roles)
- [ ] Export to CSV
</code></pre>
<p><code>progress.md</code> <strong>— Session-level notes and decisions</strong></p>
<p>This is optional but valuable on longer projects. It serves as a journal: decisions made during development, approaches that were tried and rejected, open questions that require resolution, and notes on anything that will affect future implementation decisions.</p>
<p>Unlike the PRD (which is specification) and the README (which is completion status), <code>progress.md</code> captures the reasoning behind decisions — the things that would otherwise exist only in the chat history of a session that has since ended.</p>
<h4 id="heading-where-these-files-live">Where These Files Live</h4>
<p>All four files live at the project root: the top-level directory that Claude Code is working within. This is where Claude looks first when it reads a project.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393643/mavsc0sqzvg4dv79onf3.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">my-project/
  CLAUDE.md        ← read automatically every session
  PRD.md           ← read at the start of feature sessions
  README.md        ← updated as features complete
  progress.md      ← optional; captures decisions and open questions
  src/
    ...
</code></pre>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450386/p2inh8dukuomcei47llp.png" alt="Session Files in VS Code Sidebar" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h4 id="heading-how-claude-accesses-them">How Claude Accesses Them</h4>
<p>Claude Code reads <code>CLAUDE.md</code> automatically. For the other files, you provide an explicit instruction at the start of the session:</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393649/deng8wquqaiqmvovwlft.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">New session. Read the following files in order:
1. CLAUDE.md
2. PRD.md
3. README.md (specifically the Implementation Status section)
4. progress.md

Confirm your understanding of the project state and tell me where we left off.
</code></pre>
<p>This instruction takes thirty seconds to write. Claude reads all four documents, synthesizes their content, and returns a summary of the project's current state. You correct any misunderstanding before proceeding. Then you begin the session from a position of shared, accurate context.</p>
<p>The discipline of maintaining these files – updating them as features complete, adding to <code>progress.md</code> when significant decisions are made – is the single most valuable habit you can build for multi-session projects. It costs minutes per session. It saves hours of reconstruction.</p>
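<p>The <code>progress.md</code> habit can even be scripted. A minimal sketch, assuming a shell session at the project root (the directory name and entry text here are hypothetical examples):</p>

```shell
# Append a dated decision entry to progress.md at the end of a session.
# Directory name and entry text are hypothetical examples.
mkdir -p my-project && cd my-project
printf '%s\n' "## $(date +%F)" \
  "- Decision: keep all four context files at the project root" \
  "- Open question: commit progress.md, or gitignore it?" >> progress.md
```

<p>Appending rather than overwriting preserves the decision history, so the next session's "read progress.md" instruction recovers every prior checkpoint.</p>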
<h2 id="heading-chapter-17-mcp-servers-and-external-integrations">Chapter 17: MCP Servers and External Integrations</h2>
<p>MCP (the <strong>Model Context Protocol</strong>) is Anthropic's open protocol for connecting AI agents to external tools, services, and systems. It was developed by the same team that built Claude Code and released as an open standard.</p>
<h3 id="heading-what-mcp-enables">What MCP Enables</h3>
<p>Out of the box, Claude Code can read and write files, execute terminal commands, and search the web. MCP extends this to include virtually any application or service that exposes a compatible interface.</p>
<p>When an MCP server is installed and connected, Claude Code gains the ability to act on that service directly – reading data from it, modifying data within it, triggering actions in it – without requiring any manual data transfer between Claude and the external system.</p>
<p>This is the difference between Claude handing you a LinkedIn post to publish manually and Claude publishing it directly, including scheduling and image attachment.</p>
<h3 id="heading-representative-mcp-applications">Representative MCP Applications</h3>
<p><strong>GitHub (included by default):</strong></p>
<ul>
<li><p>Opens pull requests directly from the Claude session</p>
</li>
<li><p>Reviews incoming PRs and adds inline comments</p>
</li>
<li><p>Monitors CI/CD failures and proposes corrections</p>
</li>
<li><p>No browser navigation required</p>
</li>
</ul>
<p><strong>Notion:</strong></p>
<ul>
<li><p>Reads project notes and specifications</p>
</li>
<li><p>Updates task status as work is completed</p>
</li>
<li><p>Creates documentation pages</p>
</li>
</ul>
<p><strong>Google Workspace:</strong></p>
<ul>
<li><p>Reads from and writes to Google Docs and Sheets</p>
</li>
<li><p>Composes and sends email via Gmail</p>
</li>
</ul>
<p><strong>Playwright (browser automation):</strong></p>
<ul>
<li><p>Opens a real browser session</p>
</li>
<li><p>Navigates to URLs and completes form interactions</p>
</li>
<li><p>Extracts structured data from web pages</p>
</li>
</ul>
<p><strong>Airtable / database integrations:</strong></p>
<ul>
<li><p>Reads from structured data sources</p>
</li>
<li><p>Writes results and updates records</p>
</li>
</ul>
<p>Boris Cherny, the creator of Claude Code, uses Claude to pay parking fines, cancel subscriptions, send Slack messages, maintain project tracking spreadsheets, and send reminders to team members – all via plain English instructions to an agent that executes these tasks through connected MCP servers.</p>
<h3 id="heading-installing-an-mcp-server-step-by-step">Installing an MCP Server — Step by Step</h3>
<p>Installing an MCP server connects Claude Code to an external service. Here is a complete walkthrough using the Filesystem MCP server, one of the most broadly useful and a good one to install first.</p>
<h4 id="heading-step-1-find-the-mcp-server">Step 1: Find the MCP server</h4>
<p>Anthropic maintains an official registry of MCP servers at <a href="https://github.com/modelcontextprotocol/servers">github.com/modelcontextprotocol/servers</a>. Each server has an installation command and a configuration format. For third-party services, the service's own documentation will provide the MCP configuration.</p>
<h4 id="heading-step-2-install-the-server-via-claude-code">Step 2: Install the server via Claude Code</h4>
<p>Open a Claude Code session and type:</p>
<pre><code class="language-plaintext">Add the following MCP server at user scope so it is available across all my projects:

npx -y @modelcontextprotocol/server-filesystem /path/to/your/allowed/directory
</code></pre>
<p>Specifying <code>user scope</code> makes the server available in all of your projects, not just the current one. Use <code>project scope</code> if you want the server active only for the current project directory.</p>
<h4 id="heading-step-3-provide-configuration-if-required">Step 3: Provide configuration if required</h4>
<p>Some MCP servers require API keys or configuration. For example, connecting to Notion:</p>
<pre><code class="language-plaintext">Add the Notion MCP server at user scope with the following configuration:
- Server package: @notionhq/notion-mcp-server
- API key: [your Notion integration token]
</code></pre>
<p>Claude Code writes the configuration to <code>~/.claude/mcp_settings.json</code> automatically. You do not need to edit this file manually.</p>
<h4 id="heading-step-4-restart-the-session">Step 4: Restart the session</h4>
<p>After installation, type <code>/restart</code> or close and reopen your Claude Code session. The MCP server initializes on startup. You can verify connected servers by typing:</p>
<pre><code class="language-plaintext">/mcp
</code></pre>
<p>This lists all active MCP servers and their available tools. If the server appears in this list, it is connected and available for Claude to use.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450378/gozp1kjn99oyqjz1r030.jpg" alt="MCP Command List Output" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h4 id="heading-step-5-use-it-in-a-session">Step 5: Use it in a session</h4>
<p>Once connected, you interact with the MCP-enabled service using plain English. Claude selects the appropriate MCP tool automatically:</p>
<pre><code class="language-plaintext">Create a new Notion page in the "Projects" database titled "Sprint 14 Planning"
and add sections for Goals, Tasks, and Blockers.
</code></pre>
<p>Claude issues the appropriate Notion API call through the MCP server, creates the page, and returns the result – without you opening a browser.</p>
<h3 id="heading-practical-mcp-integrations">Practical MCP Integrations</h3>
<p><strong>GitHub: the most immediately valuable</strong></p>
<p>GitHub MCP is included in Claude Code by default and is worth using from your first day. Common usage:</p>
<ul>
<li><p>"Open a pull request for the feature branch <code>feature/auth-redesign</code> into <code>main</code>. Title: 'Auth: JWT token refresh'. Description: summarize the changes you made in this session."</p>
</li>
<li><p>"Review the open pull requests in this repository and tell me which ones have unresolved review comments."</p>
</li>
<li><p>"Check the CI pipeline status for the last commit on <code>main</code>."</p>
</li>
<li><p>"Create a GitHub Issue for the validation bug I just described and assign it to me."</p>
</li>
</ul>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450380/oqnjdsya3z969sahcvea.png" alt="GitHub MCP Action Executed" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p><strong>Playwright — for anything web-based</strong></p>
<p>Playwright gives Claude a real browser. Useful for:</p>
<ul>
<li><p>Extracting data from sites without public APIs</p>
</li>
<li><p>Testing your own deployed application by interacting with it as a user would</p>
</li>
<li><p>Completing multi-step web workflows (form submission, navigation, login)</p>
</li>
</ul>
<p>A practical example: "Go to our staging environment at staging.myapp.com, log in with the test credentials in <code>.env</code>, create a new task, and verify it appears in the task list."</p>
<p><strong>Slack</strong></p>
<ul>
<li><p>"Send a message to the #engineering channel: 'Auth service deployment is live. Monitor for errors over the next 30 minutes.'"</p>
</li>
<li><p>"Check the #bug-reports channel for any messages from the last 24 hours and summarize the issues reported."</p>
</li>
</ul>
<p><strong>Google Workspace</strong></p>
<ul>
<li><p>"Read the latest version of the Product Requirements in the 'Q2 Roadmap' Google Doc and create a Markdown version of it in PRD.md."</p>
</li>
<li><p>"Add a row to the 'Sprint Tracker' spreadsheet for today's date with columns: feature completed, time spent, issues encountered."</p>
</li>
</ul>
<p><strong>Filesystem Server</strong></p>
<p>The filesystem MCP extends Claude's file access beyond the current project directory. Useful for:</p>
<ul>
<li><p>Reading a reference implementation in another project directory</p>
</li>
<li><p>Accessing template files stored elsewhere on your machine</p>
</li>
<li><p>Writing output to a shared directory</p>
</li>
</ul>
<h3 id="heading-mcp-security-considerations">MCP Security Considerations</h3>
<p>MCP servers extend Claude Code's reach. This means they extend the potential impact of mistakes – or of malicious instructions – proportionally. A connected server can take real actions in real services. Treat that capability with corresponding care.</p>
<h4 id="heading-1-grant-only-the-access-required">1. Grant only the access required</h4>
<p>When configuring a filesystem MCP server, specify the directories it can access rather than granting root-level access. When configuring a Notion or Google Workspace server, use an integration token scoped to the specific pages or folders Claude needs – not a token that grants full account access.</p>
<h4 id="heading-2-review-before-executing">2. Review before executing</h4>
<p>When Claude proposes an action through an MCP server – particularly one that modifies data, sends messages, or affects production systems – read the proposed action before approving. A misunderstood instruction that creates twenty Notion pages or sends a Slack message to the wrong channel is recoverable. An action that deletes records from a production database may not be.</p>
<h4 id="heading-3-be-specific-in-your-instructions">3. Be specific in your instructions</h4>
<p>Vague instructions are more likely to produce unexpected actions through MCP. "Update the project tracker" is ambiguous. "Add a row to the Sprint Tracker sheet with today's date and the values X, Y, Z in columns B, C, D" is not. Precise instructions produce predictable actions.</p>
<h4 id="heading-4-do-not-store-credentials-in-prompts">4. Do not store credentials in prompts</h4>
<p>If Claude asks for an API key or authentication credential in order to configure an MCP server, provide it once during setup. Do not include live credentials in recurring prompts. Store them in the MCP configuration file, which Claude Code writes to your local machine's config directory.</p>
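<p>One simple way to keep a token out of prompts is to export it as an environment variable in the shell that launches Claude Code, and refer to it by name from then on. A minimal sketch (the variable name and value are hypothetical examples):</p>

```shell
# Set the credential once in the launching shell, never in a prompt.
# NOTION_API_KEY is a hypothetical variable name for this sketch.
export NOTION_API_KEY="paste-your-integration-token-here"

# Configuration files or launch commands can reference the variable
# by name instead of embedding a literal token.
echo "Key is set: ${NOTION_API_KEY:+yes}"
```

<p>The <code>${VAR:+yes}</code> expansion confirms the variable is set without ever printing the secret itself.</p>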
<h3 id="heading-starting-mcp-integrations">Starting MCP Integrations</h3>
<p>Connect MCP servers that correspond to tools you already use. There is no value in connecting services you don't work with. The value of MCP is in reducing friction in existing workflows, not in creating new ones.</p>
<p>Begin with GitHub. It is pre-configured, immediately useful, and demonstrates the core value of MCP within the first session you use it.</p>
<h2 id="heading-chapter-18-agents-sub-agents-and-parallel-workflows">Chapter 18: Agents, Sub-Agents, and Parallel Workflows</h2>
<p>As your competency with Claude Code develops, the natural extension is parallel operation: multiple Claude sessions running simultaneously, each handling a distinct workstream.</p>
<h3 id="heading-parallel-session-structure">Parallel Session Structure</h3>
<p>A software project typically involves multiple independent workstreams that don't require sequential execution. Frontend implementation, backend logic, test coverage, documentation – these can often proceed in parallel without introducing conflicts, provided each session works on different files.</p>
<p>Multiple Claude Code sessions, each assigned a specific scope, replicate the parallel capacity of a small team. One session builds the authentication system. Another builds the data visualization layer. A third writes tests for features already completed. Each operates independently; all write to the same disk.</p>
<p>Boris Cherny operates with five or more parallel sessions routinely. His description of the workflow: "I kick off one task, then something else, then something else, and go get a coffee while they run."</p>
<h3 id="heading-running-parallel-sessions-a-concrete-example">Running Parallel Sessions — A Concrete Example</h3>
<p>Suppose you are building the notes application from Blueprint 4 in Chapter 25. You have completed the backend REST API and now want to build the frontend and write API tests simultaneously. Here is exactly how that works.</p>
<h4 id="heading-window-1-frontend-session">Window 1: Frontend session</h4>
<p>Open VS Code. Open the Claude Code panel. Type:</p>
<pre><code class="language-plaintext">We are building the frontend for the notes application. The backend REST API is
already running on port 3001. Your scope for this session is the client/ directory only.

Read CLAUDE.md and PRD.md first. Then implement the three-panel frontend layout
described in the PRD:
- Sidebar with the note list (read from GET /api/notes)
- Editor panel for the active note (with auto-save on keystroke debounce)
- Empty state when no note is selected

Do not touch any files in the server/ directory.

Enter Plan Mode first. Show me your implementation plan before writing anything.
</code></pre>
<h4 id="heading-window-2-test-suite-session">Window 2: Test suite session</h4>
<p>Open a second VS Code window (or a second terminal). Open a new Claude Code session. Type:</p>
<pre><code class="language-plaintext">We are writing integration tests for the notes application REST API. Your scope
for this session is the server/tests/ directory (create it if it doesn't exist).

Read CLAUDE.md and the server/routes/notes.js file. Write integration tests that
cover:
1. GET /api/notes — returns array, returns empty array when no notes exist
2. POST /api/notes — creates note, returns it with an id, rejects missing title
3. PUT /api/notes/:id — updates note, returns 404 for nonexistent id
4. DELETE /api/notes/:id — deletes note, returns 204, returns 404 if not found

Use the jest + supertest stack. Create server/tests/notes.test.js.

Do not modify any files outside server/tests/.
</code></pre>
<p>Both sessions now run simultaneously. Session 1 builds the frontend. Session 2 writes the tests. Neither touches the other's files. You review both when they complete.</p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774450381/vcbjireiujfwhbwzuzcx.png" alt="Parallel Claude Code Sessions in VS Code" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<h4 id="heading-the-discipline-that-makes-this-safe">The discipline that makes this safe:</h4>
<p>Each session receives an explicit scope boundary ("your scope is the client/ directory only") and an explicit prohibition ("do not touch any files in the server/ directory"). These constraints prevent the primary failure mode of parallel sessions: two sessions modifying the same file in incompatible ways.</p>
<h3 id="heading-sub-agents">Sub-Agents</h3>
<p>Claude Code can spawn sub-agents internally – additional Claude instances dedicated to specific components of a larger task. This happens automatically when Claude determines that parallel execution of sub-tasks is appropriate.</p>
<p>For example: given the instruction "audit the entire codebase for security vulnerabilities," Claude may spawn sub-agents for different modules, each producing a findings report, which Claude then aggregates into a single document. You submit one instruction and receive one coherent result.</p>
<p>Sub-agent orchestration scales with model capability. Opus 4.6 runs autonomous sessions of 10 to 30 minutes reliably. Extended sessions of hours or more are reported in advanced deployments.</p>
<h3 id="heading-git-worktrees-for-safe-parallel-development">Git Worktrees for Safe Parallel Development</h3>
<h4 id="heading-what-a-git-worktree-is">What a Git worktree is</h4>
<p>Normally, a Git repository has one working directory: the folder where your files live and where you make changes. A Git worktree is an additional working directory linked to the same repository. Each worktree checks out a different branch, and each has its own set of files on disk.</p>
<p>The result is that you can have multiple branches of the same repository active simultaneously, each in its own folder. A Claude Code session pointed at one folder sees only that branch's files. It cannot accidentally modify another branch's files because those files are in a different directory.</p>
<p>This is the cleanest available mechanism for running parallel agent sessions on a shared codebase.</p>
<h4 id="heading-setting-up-worktrees-for-parallel-agents">Setting up worktrees for parallel agents</h4>
<p>Say your main branch is <code>main</code> and you want two agents working in parallel – one on the authentication feature and one on the notification system.</p>
<p>Step 1: create a worktree for each feature branch:</p>
<pre><code class="language-bash"># From the repository root on main
git worktree add ../my-project-auth feature/authentication
git worktree add ../my-project-notifications feature/notifications
</code></pre>
<p>This creates two new directories at the same level as your project root:</p>
<ul>
<li><p><code>../my-project-auth/</code> – a full copy of the repository, checked out to <code>feature/authentication</code></p>
</li>
<li><p><code>../my-project-notifications/</code> – a full copy, checked out to <code>feature/notifications</code></p>
</li>
</ul>
<p>Step 2: open each worktree in its own Claude Code session:</p>
<p>Open VS Code. Open <code>../my-project-auth</code> as the project folder. Start a Claude Code session scoped to the authentication feature.</p>
<p>Open a second VS Code window. Open <code>../my-project-notifications</code>. Start a Claude Code session scoped to notifications.</p>
<p>Both sessions run against the same repository but on isolated branches. Concurrent file conflicts are impossible, because each session's changes live in a separate directory; any overlapping edits surface only at merge time, where Git resolves them through its normal conflict handling. When work is complete, you merge:</p>
<pre><code class="language-bash">cd my-project
git merge feature/authentication
git merge feature/notifications
</code></pre>
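<p>The commands above assume the repository and feature branches already exist. To rehearse the full worktree lifecycle safely, you can run it end to end in a throwaway repository (all names here are hypothetical):</p>

```shell
# Throwaway repository to rehearse the worktree workflow end to end.
# All names (demo-project, feature/authentication) are hypothetical.
mkdir demo-project && cd demo-project
git init -q -b main
git config user.email "demo@example.com"
git config user.name "Demo"
git commit -q --allow-empty -m "init"

# -b creates the branch and the worktree directory in one step
git worktree add -b feature/authentication ../demo-project-auth

# Work happens in the worktree's own directory, on its own branch
echo "jwt refresh logic" > ../demo-project-auth/auth.js
git -C ../demo-project-auth add auth.js
git -C ../demo-project-auth commit -qm "auth: add token refresh"

# Back on main: merge, then clean up the finished worktree
git merge -q feature/authentication
git worktree remove ../demo-project-auth
git worktree list
```

<p><code>git worktree remove</code> and <code>git worktree list</code> keep stale worktree directories from accumulating after each feature branch is merged.</p>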
<h4 id="heading-when-to-use-worktrees">When to use worktrees</h4>
<p>Worktrees are appropriate when:</p>
<ul>
<li><p>You are working on a production codebase where conflicts are costly</p>
</li>
<li><p>Multiple sessions will be touching overlapping parts of the directory structure</p>
</li>
<li><p>You need clean version history with each feature isolated on its own branch</p>
</li>
</ul>
<p>They aren't necessary for simpler parallel sessions with clearly bounded scopes. A single working directory, divided among sessions by explicit file-scope instructions, is sufficient for most work. When a branch's work is complete and verified, it's merged into the main branch through the standard review process.</p>
<p>The worktree approach requires familiarity with the Git workflow. It isn't necessary at the beginning, but it becomes important as the scale and complexity of parallel workstreams increase.</p>
<h2 id="heading-chapter-19-skills-rules-and-persistent-instructions">Chapter 19: Skills, Rules, and Persistent Instructions</h2>
<p>Claude Code can be given project-specific instructions that persist across sessions – applied consistently without requiring re-specification in each new conversation. These are called <strong>Skills</strong> (in Anthropic's terminology) or, more simply, <strong>rules</strong>.</p>
<h3 id="heading-what-persistent-instructions-accomplish">What Persistent Instructions Accomplish</h3>
<p>Every project has standards: how files are named, what libraries are used and which are prohibited, what the testing coverage requirements are, how authentication must be implemented, what the database access patterns are. These standards exist to ensure that the codebase remains coherent across contributions, sessions, and time.</p>
<p>Without persistent instructions, you re-specify these standards in each session. With them, Claude knows the project's conventions from the moment a session begins. The quality of output aligns with your standards without requiring constant specification.</p>
<h3 id="heading-the-claudemd-file">The CLAUDE.md File</h3>
<p>The primary mechanism for persistent instructions is a file named <code>CLAUDE.md</code> at the project root. Claude Code reads this file at the start of every session.</p>
<p>A well-written <code>CLAUDE.md</code> contains:</p>
<pre><code class="language-markdown"># Project: [Name] — Claude Context

## Architecture

[Technology stack, infrastructure, auth mechanism, external services]

## Conventions

- [Naming conventions]
- [File organization]
- [Async patterns — e.g., always async/await]
- [Data access patterns — e.g., all DB calls through service layer]

## Libraries

[What is used, what is prohibited]

## Testing

[Framework, coverage requirements, testing patterns]

## Security

[Specific security requirements — credential handling, input sanitization specifics]

## Definition of Done

[What constitutes a completed feature before it can be marked complete]
</code></pre>
<p>With this file in place, every Claude session for this project begins with full knowledge of the codebase's standards. You don't need to repeat yourself. The codebase doesn't drift from its own conventions.</p>
<h3 id="heading-ecosystem-level-skills">Ecosystem-Level Skills</h3>
<p>Organizations and platforms are beginning to publish standardized Skills. These are pre-written instruction sets that encode best practices for building on their platforms. Vercel has launched such an initiative for its hosting and deployment platform, enabling Claude Code to make correct deployment decisions without explicit guidance.</p>
<p>This represents a direction in which the ecosystem will develop further: a library of verified, platform-specific instruction sets that any developer can include in their project, encoding decades of accumulated engineering judgment into Claude's available context.</p>
<h2 id="heading-chapter-20-autonomous-loops-conditions-for-use">Chapter 20: Autonomous Loops — Conditions for Use</h2>
<p>Autonomous loop operation – where Claude Code executes a task sequence without human approval between steps – is frequently discussed and frequently misapplied. This chapter describes the conditions under which it is appropriate and the conditions under which it is not.</p>
<h3 id="heading-the-structure-of-a-loop">The Structure of a Loop</h3>
<p>In autonomous operation, Claude Code receives a task list and works through it sequentially without pausing for approval. It makes decisions, encounters obstacles, adapts, and continues. The human observer reviews the aggregate output, not individual steps.</p>
<p>The efficiency gain is real. A well-constructed loop running against a well-specified task list can accomplish hours of repetitive work – documentation, test generation, systematic refactoring – without human supervision.</p>
<h3 id="heading-the-compounding-risk">The Compounding Risk</h3>
<p>A loop amplifies both the quality of its instructions and the defects within them. If a misunderstanding exists in the task specification, the loop will execute all subsequent tasks consistently with that misunderstanding. The error doesn't self-correct. It accumulates.</p>
<p>This is why loops are inappropriate for ambiguous, underspecified, or creatively demanding tasks. The absence of human review between steps removes the checkpoints that catch drift early.</p>
<p>The practical advice from practitioners who have learned this: <strong>build without loops first</strong>. Develop a repertoire of projects in which you have reviewed each plan, approved each edit, and understood each output. Build the judgment needed to distinguish a well-specified task from an underspecified one. Only then introduce autonomous loops, and only for the class of tasks – well-bounded, repetitive, clearly defined – that they serve well.</p>
<h3 id="heading-tasks-where-loops-are-appropriate">Tasks Where Loops Are Appropriate</h3>
<ul>
<li><p>Adding documentation to a systematically defined set of functions</p>
</li>
<li><p>Generating tests for components with clear, specified behavior</p>
</li>
<li><p>Applying a defined refactor pattern across a consistent set of files</p>
</li>
<li><p>Running a specified analysis against a set of inputs and recording results</p>
</li>
</ul>
<h3 id="heading-tasks-where-loops-are-not-appropriate">Tasks Where Loops Are Not Appropriate</h3>
<ul>
<li><p>Building new features where scope or behavior is not fully defined</p>
</li>
<li><p>Architecture or design work requiring creative decision-making</p>
</li>
<li><p>Any task where the acceptability of the result is not specifiable in advance</p>
</li>
<li><p>Work in unfamiliar codebases where unexpected conditions are likely</p>
</li>
</ul>
<p>The loop is a tool for known problems. It is not a substitute for understanding or oversight.</p>
<h3 id="heading-what-an-autonomous-loop-looks-like-a-concrete-example">What an Autonomous Loop Looks Like — A Concrete Example</h3>
<p>The best way to understand the difference between a supervised workflow and an autonomous loop is to see the same task done both ways.</p>
<p><strong>The task:</strong> Add JSDoc documentation comments to every function in a codebase's <code>utils/</code> directory. There are twelve utility files, each containing between three and ten functions.</p>
<h4 id="heading-without-a-loop-supervised-step-by-step">Without a loop — supervised, step-by-step:</h4>
<p>You open a session and say:</p>
<pre><code class="language-plaintext">Document all functions in utils/dateFormatters.js with JSDoc comments.
For each function, include: @param (name, type, description) for each parameter,
@returns (type, description), and a one-line @description.
Show me the diff before applying changes.
</code></pre>
<p>Claude produces the documentation, you review the diff, approve it. Then you repeat the process for <code>utils/currencyUtils.js</code>, and so on across all twelve files. You review every change as it happens.</p>
<p>This approach is appropriate when you are unfamiliar with the codebase, when the functions have unclear behavior that requires interpretation, or when you want to catch any misunderstanding early.</p>
<p>The cost: twelve back-and-forth cycles, each requiring your attention.</p>
<h4 id="heading-with-a-loop-autonomous-batch-execution">With a loop — autonomous, batch execution:</h4>
<p>You have already supervised several files and confirmed that Claude is interpreting the functions correctly and producing clean JSDoc. The pattern is clear, the behavior is consistent, and the same operation applies uniformly across all files. Now you introduce the loop:</p>
<pre><code class="language-plaintext">You are going to add JSDoc documentation to every function in the utils/ directory.

Rules:
- For each function, add: @description (one line), @param (name, type, description)
  for each parameter, and @returns (type, description)
- Do not change any logic — only add documentation comments
- Do not modify any files outside of utils/
- Process one file at a time, in alphabetical order
- After completing all files, output a summary: which files were modified,
  and how many functions were documented in total

Begin with utils/analyticsHelpers.js and proceed through all files in utils/
without stopping for approval between files. Apply changes to each file before
moving to the next.
</code></pre>
<p>Claude works through all twelve files autonomously. You review the aggregate output when it finishes: a summary of changes made, which you can verify with a <code>git diff</code>.</p>
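<p>That final <code>git diff</code> review can be as simple as two commands. A minimal sketch in a throwaway repository that simulates the loop's additive change (the file name follows the chapter's example; everything else is hypothetical):</p>

```shell
# Sketch: inspect what a completed loop changed, assuming the work
# happened in a git repository with utils/ tracked.
mkdir -p loop-demo/utils && cd loop-demo
git init -q -b main
git config user.email "demo@example.com"
git config user.name "Demo"
echo "function formatDate() {}" > utils/dateFormatters.js
git add . && git commit -qm "before loop"

# Simulate the loop's additive change: a JSDoc comment, no logic change
printf '%s\n' "/** @description Format a date */" \
  "function formatDate() {}" > utils/dateFormatters.js

git diff --stat               # one line per modified file, with change counts
git diff utils/ | grep '^+[^+]'   # show only added lines
```

<p><code>git diff --stat</code> confirms that only the expected files changed; filtering for added lines confirms the loop was additive and touched no existing logic.</p>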
<h4 id="heading-what-made-the-loop-safe-here">What made the loop safe here:</h4>
<ul>
<li><p>The task was repetitive and uniform – the same operation applied to each file</p>
</li>
<li><p>You had already verified Claude's judgment on a representative sample</p>
</li>
<li><p>The specification was complete – there was no ambiguity about what "done" meant for each function</p>
</li>
<li><p>The scope was bounded – "do not modify any files outside of utils/"</p>
</li>
<li><p>The operation was additive only – "do not change any logic"</p>
</li>
</ul>
<p>Change any of these conditions and the loop becomes riskier. An unfamiliar codebase, an ambiguous definition of done, or a task that requires creative judgment means supervised execution is the right approach.</p>
<h2 id="heading-chapter-21-code-review-security-and-verification">Chapter 21: Code Review, Security, and Verification</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter21/1200/400" alt="Chapter 21 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>The capabilities of Claude Code don't eliminate the requirement for verification. AI-generated code is held to the same standards as any other code. And those standards require review.</p>
<h3 id="heading-known-categories-of-failure">Known Categories of Failure</h3>
<p>Claude Code generates code based on patterns learned during training. Across a wide range of tasks, this produces correct, functional, secure output. Under a known set of conditions, it does not.</p>
<ol>
<li><p><strong>API hallucination</strong>: Claude may reference functions, parameters, or library versions that don't exist or have changed since its training data was collected. This is most common in libraries that evolve rapidly.</p>
</li>
<li><p><strong>Edge case omission</strong>: Claude generates implementations that handle the primary flow correctly but may not fully address boundary conditions – empty inputs, null values, network failures, malformed data.</p>
</li>
<li><p><strong>Security vulnerability introduction</strong>: Common vulnerability classes – SQL injection, inadequate input sanitization, insecure random number generation, improper credential handling – can be present in generated code that passes visual inspection. These require deliberate security review to detect.</p>
</li>
<li><p><strong>Confident incorrectness</strong>: Claude presents output with consistent confidence regardless of its correctness. The tone of a response is not a reliable indicator of its accuracy.</p>
</li>
</ol>
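<p>The edge-case category is the easiest to probe concretely. A first-pass implementation often looks like the naive function below; the hardened version guards the boundaries the prompt never mentioned. Both functions are illustrative sketches, not drawn from any real codebase:</p>

```javascript
// Happy path only: returns NaN on an empty array, throws on null input.
function averageNaive(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Hardened: explicit handling for missing, empty, and non-numeric input.
function average(values) {
  if (!Array.isArray(values) || values.length === 0) return null;
  const nums = values.filter((v) => typeof v === "number" && Number.isFinite(v));
  if (nums.length === 0) return null;
  return nums.reduce((sum, v) => sum + v, 0) / nums.length;
}
```

<p>Exercising exactly these inputs – empty array, null, mixed types – is the manual testing step the next section describes.</p>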
<h3 id="heading-the-verification-standard">The Verification Standard</h3>
<p>Here are some important ways you can review Claude's output to make sure it meets your standards and security requirements:</p>
<ol>
<li><p><strong>Read the code.</strong> Not exhaustively, but substantively. Understand what each significant section does. Could you explain it? Is the logic consistent with your stated requirements?</p>
</li>
<li><p><strong>Test the behavior.</strong> Manually exercise the functionality. Test the primary flow. Test the edges. Does it behave correctly when inputs are missing? When values are at their extremes? When dependencies are unavailable?</p>
</li>
<li><p><strong>Use automated verification.</strong> Request that Claude generate tests for the code it writes. Ask for coverage that includes edge cases explicitly. Automated tests are not a substitute for code review, but they catch regressions systematically.</p>
</li>
<li><p><strong>Apply heightened scrutiny to sensitive domains.</strong> Authentication, authorization, payment processing, medical data handling, privacy-related data storage – these areas require security expertise and careful review beyond what automated checks provide.</p>
</li>
</ol>
<h3 id="heading-claude-code-security">Claude Code Security</h3>
<p>Anthropic has released <strong>Claude Code Security</strong>, a capability in preview as of 2026 that scans codebases for known vulnerability patterns and generates proposed corrections.</p>
<p>This represents the direction of security tooling: integrated, automated, and AI-assisted. For production systems, treat it as an additional layer, not a replacement for expert review.</p>
<h3 id="heading-the-continued-role-of-writing-code">The Continued Role of Writing Code</h3>
<p>Experience across the Claude Code community consistently confirms: developers who write some code themselves, rather than delegating all implementation, maintain significantly better understanding of their systems.</p>
<p>This understanding is not incidental. It's what allows correct review of Claude's output. It's what surfaces subtle errors that are invisible to anyone without domain knowledge. And it's what produces systems that remain maintainable when the context of their creation is no longer fresh.</p>
<p>Use Claude Code to eliminate mechanical overhead: the boilerplate, the repetitive patterns, the documentation that takes time but requires no judgment. Don't use it to replace engagement with the system you are building. That engagement is where your expertise lives, and where the quality of the system is ultimately determined.</p>
<h2 id="heading-chapter-22-starter-project-blueprints">Chapter 22: Starter Project Blueprints</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter22/1200/400" alt="Chapter 22 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>The following six project blueprints provide complete specifications – technology choices, directory structure, feature list, and implementation sequence – ready to hand directly to Claude Code. Each is designed for a specific stage of development competency and a specific class of use case.</p>
<p>These are not toy examples. They are real projects that produce genuinely useful software, chosen because they introduce important patterns in a controlled scope.</p>
<h3 id="heading-blueprint-1-personal-homepage">Blueprint 1: Personal Homepage</h3>
<p><strong>Appropriate for:</strong> Absolute beginners. First session with Claude Code.</p>
<p><strong>What it teaches:</strong> HTML/CSS file structure, dark-themed UI, responsive layout, link components.</p>
<p><strong>Technology:</strong> HTML5, CSS3, no JavaScript required.</p>
<p><strong>Directory structure:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393720/wq8ropa2ea8eclynuxym.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">my-homepage/
  index.html
  style.css
  assets/
    avatar.jpg       (add your own photo)
</code></pre>
<p><strong>Prompt to give Claude Code:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393726/djqbkd7bmtnj5kvekyo5.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Build a personal homepage. Specifications:

1. Single-page HTML/CSS site — no JavaScript, no frameworks
2. Sections: hero with name and one-line description, short bio (2–3 sentences),
   links section with icons for LinkedIn, GitHub, and Twitter/X, footer with year
3. Design: dark background (#0f0f13), light body text (#e2e2e2),
   accent color (#6366f1 — indigo), sans-serif typography (Inter via Google Fonts)
4. Responsive: readable and clean on both desktop and mobile
5. File structure: index.html and style.css only

Placeholder text is fine for bio — I will replace it. Use placeholder links (#)
for the social links — I will update them.

Enter Plan Mode first. Show me the plan before writing any files.
</code></pre>
<p><strong>What to verify after it builds:</strong></p>
<ul>
<li><p>Open in browser – does it look correct?</p>
</li>
<li><p>Resize the window to mobile width – does the layout adapt?</p>
</li>
<li><p>Check that all links exist (even as placeholders)</p>
</li>
<li><p>View the HTML source – can you understand the structure?</p>
</li>
</ul>
<h3 id="heading-blueprint-2-task-manager-with-localstorage">Blueprint 2: Task Manager with localStorage</h3>
<p><strong>Appropriate for:</strong> Early intermediate. First application with state and interactivity.</p>
<p><strong>What it teaches:</strong> JavaScript DOM manipulation, localStorage persistence, CRUD patterns, event handling, filtering.</p>
<p><strong>Technology:</strong> HTML5, CSS3, vanilla JavaScript. No dependencies. No build step.</p>
<p><strong>Directory structure:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393732/tqywok5nqxlyq0cvkl9j.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">task-manager/
  index.html
  style.css
  app.js
  components/
    TaskList.js
    TaskForm.js
    TaskFilter.js
  utils/
    storage.js      (localStorage read/write)
    dateUtils.js    (formatting helpers)
</code></pre>
<p><strong>Prompt to give Claude Code:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393739/gyjzmmlmvssxehfjbtxh.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Build a task manager application. Full specification:

FEATURES:
1. Create task: title (required), description (optional), due date,
   priority (Low / Medium / High)
2. Display tasks as cards in a list, sorted by due date ascending
3. Mark task complete — completed tasks display with strikethrough and 50% opacity
4. Delete task with confirmation
5. Filter tasks: by status (All / Active / Completed), by priority (All / Low / Medium / High)
6. All data persists in localStorage — survives page refresh

TECHNICAL REQUIREMENTS:
- Vanilla JavaScript, ES6 modules
- File structure as specified: index.html, style.css, app.js,
  components folder with TaskList.js / TaskForm.js / TaskFilter.js,
  utils folder with storage.js and dateUtils.js
- No external libraries, no frameworks, no build step
- storage.js must abstract all localStorage access — no other file reads/writes localStorage directly

DESIGN:
- Clean light theme, comfortable whitespace
- Cards with subtle shadow and hover state
- Priority levels: low = blue, medium = amber, high = red (use colored left border on card)
- Responsive for mobile and desktop

CLAUDE.md content to follow: all localStorage access through storage.js only.
No inline styles — all styling through style.css.

Enter Plan Mode. Show me the complete file tree and implementation plan
before writing anything.
</code></pre>
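<p>A minimal sketch of the <code>storage.js</code> abstraction the prompt calls for. The injectable <code>backend</code> parameter is my addition (so the module can be exercised outside a browser); the spec only requires that no other file touch localStorage directly:</p>

```javascript
// utils/storage.js — the only module that reads or writes persisted tasks.
const STORAGE_KEY = "tasks";

function createStorage(backend = globalThis.localStorage) {
  return {
    // Returns the saved task list; absent or corrupt data yields [].
    loadTasks() {
      try {
        return JSON.parse(backend.getItem(STORAGE_KEY)) ?? [];
      } catch {
        return [];
      }
    },
    // Persists the entire task list as one JSON document.
    saveTasks(tasks) {
      backend.setItem(STORAGE_KEY, JSON.stringify(tasks));
    },
  };
}
```

<p>Components call <code>loadTasks()</code> and <code>saveTasks()</code> without ever seeing the storage key or the serialization format, so swapping the backend later means changing one file.</p>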
<p><strong>What to verify after it builds:</strong></p>
<ul>
<li><p>Create several tasks with different priorities and due dates</p>
</li>
<li><p>Verify they sort by due date correctly</p>
</li>
<li><p>Mark some complete – verify visual state</p>
</li>
<li><p>Refresh the page – verify data is preserved</p>
</li>
<li><p>Test filters – each combination should produce correct results</p>
</li>
<li><p>Delete a task – verify confirmation step works</p>
</li>
</ul>
<h3 id="heading-blueprint-3-api-connected-data-dashboard">Blueprint 3: API-Connected Data Dashboard</h3>
<p><strong>Appropriate for:</strong> Intermediate. First project involving an external API and dynamic data display.</p>
<p><strong>What it teaches:</strong> Fetch API, async/await, loading states, error handling, structured data display.</p>
<p><strong>Technology:</strong> HTML5, CSS3, vanilla JavaScript. Uses a free public API with no authentication required.</p>
<p><strong>The API used:</strong> Open-Meteo weather API (free, no key required).</p>
<p><strong>Directory structure:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393745/jzsddps0lpge4ajgqa92.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">weather-dashboard/
  index.html
  style.css
  app.js
  services/
    weatherApi.js   (all API calls isolated here)
  components/
    CurrentWeather.js
    ForecastCard.js
    LocationSearch.js
  utils/
    formatters.js   (unit conversion, date formatting)
</code></pre>
<p><strong>Prompt to give Claude Code:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393752/zaufqz8wlevyxx5w3jn6.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Build a weather dashboard using the Open-Meteo API (https://open-meteo.com).
No API key required. Full specification:

FEATURES:
1. Location search: user types a city name, app geocodes it using
   Open-Meteo's geocoding API and retrieves weather data
2. Current conditions display: temperature (Celsius), feels-like, humidity,
   wind speed, weather description with an icon (use Unicode weather emoji)
3. 7-day forecast: one card per day showing high/low temps and condition
4. Loading state: visible spinner while data is fetching
5. Error state: clear message if location not found or API fails
6. Last searched location persists in localStorage on refresh

TECHNICAL REQUIREMENTS:
- All API calls must go through services/weatherApi.js — no fetch() calls elsewhere
- Async/await throughout — no .then() chaining
- formatters.js handles all unit formatting and date display
- Graceful error handling: network failures and invalid locations display
  user-readable messages (never raw error objects)

API REFERENCE:
- Geocoding: https://geocoding-api.open-meteo.com/v1/search?name={city}&amp;count=1
- Weather: https://api.open-meteo.com/v1/forecast?latitude={lat}&amp;longitude={lon}
  &amp;current=temperature_2m,relative_humidity_2m,wind_speed_10m,weather_code
  &amp;daily=temperature_2m_max,temperature_2m_min,weather_code
  &amp;timezone=auto&amp;forecast_days=7

DESIGN: clean card-based layout, dark theme, readable type hierarchy.

Enter Plan Mode. Describe the complete data flow before writing code:
how a user search triggers the API chain and populates each component.
</code></pre>
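<p>A sketch of the isolation layer <code>services/weatherApi.js</code> describes. The geocoding URL matches the API reference above; the function name, the injectable <code>fetchFn</code> parameter, and the exact error messages are my own choices, not prescribed by the spec:</p>

```javascript
// services/weatherApi.js — the only file allowed to call fetch().
const GEO_URL = "https://geocoding-api.open-meteo.com/v1/search";

async function geocodeCity(cityName, fetchFn = globalThis.fetch) {
  let response;
  try {
    response = await fetchFn(`${GEO_URL}?name=${encodeURIComponent(cityName)}&count=1`);
  } catch {
    // Network failure: surface a user-readable message, never the raw error.
    throw new Error("Could not reach the weather service. Check your connection.");
  }
  if (!response.ok) throw new Error("The weather service returned an error.");
  const data = await response.json();
  const result = data.results?.[0];
  if (!result) throw new Error(`Location "${cityName}" not found.`);
  return { name: result.name, latitude: result.latitude, longitude: result.longitude };
}
```

<p>Because every failure path throws an <code>Error</code> with a human-readable message, the UI layer can render <code>err.message</code> directly without ever inspecting raw error objects.</p>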
<p><strong>What to verify after it builds:</strong></p>
<ul>
<li><p>Search for a known city – does it return weather data?</p>
</li>
<li><p>Search for a nonexistent place – does it show a clean error?</p>
</li>
<li><p>Disconnect your network and search – does it handle the failure gracefully?</p>
</li>
<li><p>Refresh the page – does the last location reload?</p>
</li>
</ul>
<h3 id="heading-blueprint-4-full-stack-notes-application">Blueprint 4: Full-Stack Notes Application</h3>
<p><strong>Appropriate for:</strong> Intermediate-advanced. First project with a real backend and database.</p>
<p><strong>What it teaches:</strong> Node.js/Express server, SQLite database, REST API design, client-server separation, CRUD at every layer.</p>
<p><strong>Technology:</strong> Node.js, Express, better-sqlite3 (synchronous SQLite binding), HTML/CSS/vanilla JS frontend served statically.</p>
<p><strong>Directory structure:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393758/ps1torwcxalytyefkn60.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">notes-app/
  server/
    index.js          (Express app setup)
    db/
      database.js     (SQLite connection and migrations)
      notesRepo.js    (all database queries for notes)
    routes/
      notes.js        (REST routes for /api/notes)
    middleware/
      errorHandler.js
  client/
    index.html
    style.css
    app.js
    services/
      notesApi.js     (all fetch calls to the backend)
    components/
      NoteEditor.js
      NoteList.js
  package.json
  .env
</code></pre>
<p><strong>Prompt to give Claude Code:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393765/beaxsh7tgrpbdxwitesg.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Build a full-stack notes application. Complete specification:

BACKEND (Node.js + Express + SQLite):
1. Express server running on port 3001
2. SQLite database via better-sqlite3 package
3. Notes table: id (integer primary key autoincrement), title (text),
   content (text), created_at (datetime), updated_at (datetime)
4. REST API:
   - GET /api/notes — return all notes, ordered by updated_at descending
   - GET /api/notes/:id — return single note
   - POST /api/notes — create note, return created note
   - PUT /api/notes/:id — update note, return updated note
   - DELETE /api/notes/:id — delete note, return 204
5. Database initialization: create table if not exists on server start
6. Error handling middleware: catch all unhandled errors, return JSON error response

FRONTEND (HTML/CSS/vanilla JS):
1. Three-panel layout: sidebar (note list), editor (active note), empty state
2. Click a note in the sidebar to open it in the editor
3. New Note button creates an empty note and opens it immediately
4. Auto-save: debounce saves to the API 1 second after the user stops typing
5. Delete button on active note with confirmation
6. Note list shows title and first line of content as preview, plus updated date

CONVENTIONS (enforce in CLAUDE.md):
- All database access through notesRepo.js — no SQL in route files
- All API calls through client/services/notesApi.js — no fetch() elsewhere in frontend
- All routes return JSON — no HTML from the API

Enter Plan Mode. Show me the complete architecture: how data flows
from the database through the API to the UI and back on save.
</code></pre>
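<p>The auto-save requirement (step 4 of the frontend spec) is typically built on a debounce helper. A minimal sketch – the <code>scheduler</code> parameter is my addition so the helper can be tested without real timers; the prompt only requires the one-second delay behavior:</p>

```javascript
// Returns a wrapped fn that runs only after `wait` ms pass with no further calls.
function debounce(fn, wait, scheduler = {
  set: (cb, ms) => setTimeout(cb, ms),
  clear: (id) => clearTimeout(id),
}) {
  let timer = null;
  return (...args) => {
    if (timer !== null) scheduler.clear(timer); // a new keystroke cancels the pending save
    timer = scheduler.set(() => {
      timer = null;
      fn(...args);
    }, wait);
  };
}

// In the editor: fires one PUT request 1 second after the user stops typing.
// const autoSave = debounce((note) => notesApi.update(note), 1000);
```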
<p><strong>What to verify after it builds:</strong></p>
<ul>
<li><p>Start the server: <code>node server/index.js</code></p>
</li>
<li><p>Create a note – does it appear in the sidebar?</p>
</li>
<li><p>Edit it – does it save automatically?</p>
</li>
<li><p>Restart the server – is the note still there?</p>
</li>
<li><p>Delete a note – is it removed from the list?</p>
</li>
<li><p>Test with the network tab open – are the API calls correct?</p>
</li>
</ul>
<h3 id="heading-blueprint-5-cli-automation-tool">Blueprint 5: CLI Automation Tool</h3>
<p><strong>Appropriate for:</strong> Developers with command-line comfort. Introduction to scriptable tools.</p>
<p><strong>What it teaches:</strong> Command-line argument parsing, file system automation, structured output, practical tooling.</p>
<p><strong>The tool built:</strong> A project scaffolding tool – given a project type argument, it generates a directory structure with starter files.</p>
<p><strong>Technology:</strong> Node.js, commander (CLI argument library), fs-extra.</p>
<p><strong>Directory structure:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393771/gxuynzrqdph12mptbd0u.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">scaffold-tool/
  src/
    index.js          (entry point, argument definitions)
    commands/
      create.js       (scaffold a new project)
      list.js         (list available templates)
    templates/
      web-basic/      (template directory structure)
      node-api/
      react-app/
    utils/
      fileSystem.js   (file/directory operations)
      logger.js       (colored console output)
  package.json
  README.md
</code></pre>
<p><strong>Prompt to give Claude Code:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393777/nhxfmsj2siaclpnpodk0.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Build a Node.js command-line scaffolding tool. Full specification:

PURPOSE: Running `scaffold create &lt;template-name&gt; &lt;project-name&gt;` generates
a new project directory with a starter file structure.

COMMANDS:
1. scaffold create &lt;template&gt; &lt;name&gt;
   - Creates a new directory named &lt;name&gt; in the current working directory
   - Copies the corresponding template into it
   - Replaces the placeholder {{PROJECT_NAME}} in all template files with &lt;name&gt;
   - Prints a success message with next steps
2. scaffold list
   - Lists all available templates with a one-line description of each

TEMPLATES TO INCLUDE (create each as an actual template directory with starter files):
- web-basic: index.html, style.css, app.js, README.md
- node-api: server.js, routes/index.js, package.json with express, README.md
- react-app: package.json with react/vite, src/App.jsx, src/main.jsx, index.html

TECHNICAL REQUIREMENTS:
- commander package for argument parsing
- fs-extra for file system operations (not native fs)
- All file/directory operations through utils/fileSystem.js
- Colored console output (green = success, red = error, blue = info)
- If the target directory already exists, exit with a clear error — do not overwrite
- Executable via `npx scaffold` (set up package.json bin entry)

Enter Plan Mode. Show me how the create command flow works
end-to-end before writing any files.
</code></pre>
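<p>The <code>{{PROJECT_NAME}}</code> substitution in the create command reduces to a plain string replacement run over each copied file's contents. A sketch of the helper (the function name is mine; in the real tool it would be applied to every file fs-extra copies into the new directory):</p>

```javascript
// Replaces every {{PROJECT_NAME}} placeholder in one template file's contents.
function fillTemplate(contents, projectName) {
  return contents.replaceAll("{{PROJECT_NAME}}", projectName);
}
```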
<p><strong>What to verify after it builds:</strong></p>
<ul>
<li><p>Run <code>node src/index.js list</code> – are templates listed?</p>
</li>
<li><p>Run <code>node src/index.js create web-basic my-test-project</code> – does the directory appear?</p>
</li>
<li><p>Check that <code>{{PROJECT_NAME}}</code> was replaced throughout the generated files</p>
</li>
<li><p>Run the same command again – does it reject the duplicate?</p>
</li>
</ul>
<h3 id="heading-blueprint-6-internal-team-tool">Blueprint 6: Internal Team Tool</h3>
<p><strong>Appropriate for:</strong> Advanced intermediate. First project meant for real use by other people.</p>
<p><strong>What it teaches:</strong> User-facing product thinking, data validation, shared state, production-readiness concerns.</p>
<p><strong>The tool built:</strong> A team standup tracker – team members log daily standups (what they did, what they are doing, any blockers), and the tool displays a historical view per person.</p>
<p><strong>Technology:</strong> Node.js, Express, SQLite, vanilla JS frontend.</p>
<p><strong>Directory structure:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393784/hj4wwrqqms7wbfqnqioj.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">standup-tracker/
  server/
    index.js
    db/
      schema.sql      (initial table definitions)
      database.js
      repos/
        standupsRepo.js
        usersRepo.js
    routes/
      standups.js
      users.js
    middleware/
      validate.js     (input validation)
      errorHandler.js
  client/
    index.html
    style.css
    app.js
    services/
      api.js
    components/
      StandupForm.js
      TeamView.js
      PersonHistory.js
  package.json
</code></pre>
<p><strong>Prompt to give Claude Code:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393790/hwg1og2omaaocrnryafz.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Build a team standup tracker. Complete specification:

CONTEXT: A small team (3–10 people) uses this tool to log and view
daily standups. Each standup has three fields: yesterday, today, blockers.

DATABASE SCHEMA:
- users: id, name, email (unique), created_at
- standups: id, user_id (FK), date (date type, one per user per day),
  yesterday (text), today (text), blockers (text, nullable), created_at

BACKEND API:
- GET /api/users — list all users
- POST /api/users — create user (name + email, validate uniqueness)
- GET /api/standups?date=YYYY-MM-DD — all standups for a given date, with user data joined
- GET /api/standups/user/:userId — last 14 days of standups for one user
- POST /api/standups — submit standup (userId, date, yesterday, today, blockers)
  — if standup for that user+date already exists, update it (upsert)

FRONTEND:
- Default view: today's standups for the whole team, one card per person
- Each card: user name, their standup fields, time submitted (or "Not submitted" if absent)
- Date navigation: prev/next day buttons to browse historical dates
- Submit standup: user selects their name from a dropdown, fills three fields, submits
- If the user already submitted today, the form pre-fills their existing standup for editing

VALIDATION (enforce server-side via validate.js middleware):
- yesterday and today: required, non-empty, max 1000 characters
- blockers: optional, max 1000 characters
- date: must be a valid date, not in the future
- userId: must reference an existing user

CONVENTIONS (write into CLAUDE.md before starting):
- All DB access through repos/ — no SQL in route files
- All validation through validate.js middleware — no validation logic in route handlers
- Routes return consistent JSON: { data: ... } on success, { error: ... } on failure

Enter Plan Mode. This is a multi-file, multi-layer project. Show me
the complete plan — architecture, data flow, file list, implementation sequence —
before writing a single file.
</code></pre>
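<p>The rules in the VALIDATION block can be captured as one pure function that the <code>validate.js</code> middleware calls before any route handler runs. A sketch under my own naming; the <code>userExists</code> lookup is an assumed dependency injected from the users repo:</p>

```javascript
// Validates a standup submission; returns a list of error messages (empty = valid).
function validateStandup({ userId, date, yesterday, today, blockers }, userExists) {
  const errors = [];
  if (!yesterday?.trim()) errors.push("yesterday is required");
  else if (yesterday.length > 1000) errors.push("yesterday exceeds 1000 characters");
  if (!today?.trim()) errors.push("today is required");
  else if (today.length > 1000) errors.push("today exceeds 1000 characters");
  if (blockers != null && blockers.length > 1000) errors.push("blockers exceeds 1000 characters");
  const parsed = new Date(date);
  if (Number.isNaN(parsed.getTime())) errors.push("date is invalid");
  else if (parsed.getTime() > Date.now()) errors.push("date cannot be in the future");
  if (!userExists(userId)) errors.push("userId does not reference an existing user");
  return errors;
}
```

<p>Keeping the rules in a pure function honors the convention above: the middleware's only job is to call it and return <code>{ error: ... }</code> when the list is non-empty, and no validation logic leaks into route handlers.</p>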
<p><strong>What to verify after it builds:</strong></p>
<ul>
<li><p>Create two or three users via the API (use a REST client or curl)</p>
</li>
<li><p>Submit standups for each – do they appear on the team view?</p>
</li>
<li><p>Navigate to yesterday – does it show an empty state correctly?</p>
</li>
<li><p>Submit a second standup for the same user and date – does it update rather than duplicate?</p>
</li>
<li><p>Submit with a missing "today" field – does the server reject it with a clear error?</p>
</li>
</ul>
<h3 id="heading-progression-through-the-blueprints">Progression Through the Blueprints</h3>
<p>These six projects exist on a deliberate progression:</p>
<table>
<thead>
<tr>
<th>Blueprint</th>
<th>Key Pattern Introduced</th>
</tr>
</thead>
<tbody><tr>
<td>1. Homepage</td>
<td>File structure, static HTML/CSS</td>
</tr>
<tr>
<td>2. Task Manager</td>
<td>JavaScript state, localStorage, component separation</td>
</tr>
<tr>
<td>3. API Dashboard</td>
<td>External API, async/await, loading and error states</td>
</tr>
<tr>
<td>4. Notes App</td>
<td>Backend + frontend, REST, database CRUD</td>
</tr>
<tr>
<td>5. CLI Tool</td>
<td>Command-line interfaces, templating, file system automation</td>
</tr>
<tr>
<td>6. Team Tool</td>
<td>Multi-user, validation layers, production-readiness</td>
</tr>
</tbody></table>
<p>Each blueprint uses the patterns from the previous ones and adds a new dimension. Building them in sequence produces a developer who has encountered and solved the fundamental problems in each tier of application architecture – which is the foundation from which Claude Code can be directed most effectively.</p>
<h2 id="heading-chapter-23-the-current-frontier-of-claude-code">Chapter 23: The Current Frontier of Claude Code</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter23/1200/400" alt="Chapter 23 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>The pace at which Claude Code's capabilities are evolving makes any description of "current" features provisional. What follows is an account of the frontier as of early 2026.</p>
<h3 id="heading-proactive-task-generation">Proactive Task Generation</h3>
<p>Claude Code is beginning to generate actionable suggestions based on observed project signals – not merely executing tasks specified by the developer, but identifying tasks that warrant attention.</p>
<p>Boris describes the current state: Claude reads Slack feedback threads, examines GitHub issue trackers, reviews telemetry data, and surfaces suggestions with associated pull requests: "Here are a few things I can do. I've put up a couple PRs. Want to take a look?"</p>
<p>The developer reviews, approves, and directs. But the initiative no longer flows exclusively in one direction. The agent is beginning to participate in the question of what should be built, not only in the execution of what has been decided.</p>
<h3 id="heading-beyond-software-development">Beyond Software Development</h3>
<p>Boris's assessment of the current state: "Coding is largely solved – at least the kind of coding I do." The frontier, accordingly, is expanding into adjacent domains.</p>
<p>The co-work product extends Claude Code's agentic capabilities – acting on tools, executing multi-step tasks, operating in browser environments – to general knowledge work. The target population is not only developers but anyone who works with digital tools: analysts, product managers, administrators, researchers.</p>
<p>For the development community, this expands the scope of what Claude Code can assist with beyond implementation into the full lifecycle of building a product: user research synthesis, product specification, market analysis, customer communication, project coordination.</p>
<h3 id="heading-build-for-the-model-six-months-ahead">Build for the Model Six Months Ahead</h3>
<p>Boris's most actionable strategic guidance for those building on Claude Code: "Build for the model six months from now, not for the model of today."</p>
<p>Tool use capability, session duration, autonomous operation reliability – these dimensions improve on a predictable trajectory. A workflow designed for today's capability limits will be underpowered six months from now. A workflow designed at the edge of near-future capability will be effective exactly when it matters.</p>
<p>This requires accepting that the current product experience may be slightly ahead of the current model's reliable range. That gap closes. The developers and organizations that have already built the workflows when the model catches up will have a structural advantage.</p>
<h2 id="heading-chapter-24-software-engineering-as-a-discipline">Chapter 24: Software Engineering as a Discipline</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter24/1200/400" alt="Chapter 24 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>This chapter addresses the question that underlies every chapter in this handbook: what does it mean to practice software engineering as an AI-assisted developer?</p>
<h3 id="heading-three-levels-of-engagement-with-software">Three Levels of Engagement with Software</h3>
<p>There are three main ways that software engineers engage with their projects:</p>
<ol>
<li><p><strong>Programming</strong> is the translation of logic into code. It requires knowledge of syntax, libraries, patterns. It is the mechanical layer.</p>
</li>
<li><p><strong>Software engineering</strong> is the design and construction of reliable, maintainable systems. It requires judgment about architecture, testing strategy, operational concerns, and the long-term consequences of current decisions. Code is the medium, and the system is the product.</p>
</li>
<li><p><strong>Software architecture</strong> is the discipline of making structural decisions that determine what a system can become. It requires the ability to translate complex requirements into design, and to anticipate how a system will need to evolve.</p>
</li>
</ol>
<p>Claude Code handles a significant portion of the programming layer. The engineering and architecture layers remain human responsibilities – and their importance, if anything, increases when the mechanical translation is delegated to an agent.</p>
<p>With Claude generating implementation rapidly, the quality of what gets built is almost entirely a function of the quality of the design guiding it.</p>
<h3 id="heading-what-remains-essential-to-learn">What Remains Essential to Learn</h3>
<p>So as a developer, what do you need to know these days? What should you focus on learning and improving?</p>
<h4 id="heading-how-systems-work">How systems work</h4>
<p>Understanding why a query index matters, what a foreign key constraint enforces, how a session token is validated, what a memory leak actually is – this knowledge is what allows you to direct Claude Code correctly and evaluate its output accurately. You don't need to implement these things by hand. You need to understand them.</p>
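<p>To make the first of those concrete, here is a minimal sketch (the table, column, and index names are invented for the example) that uses SQLite's own query planner to show why an index matters:</p>
<pre><code class="language-python">import sqlite3

# A self-contained illustration: the same query, planned with and
# without an index, using SQLite's EXPLAIN QUERY PLAN.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(1000)])

def plan(query):
    """Ask SQLite how it would execute the query, without running it."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    return " ".join(row[-1] for row in rows)

query = "SELECT id FROM users WHERE email = 'user500@example.com'"
plan_before = plan(query)   # without an index: a full table SCAN

conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query)    # with the index: a SEARCH using idx_users_email

print(plan_before)
print(plan_after)
</code></pre>
<p>The point is not the syntax but the difference in the two plans: a scan touches every row, while the index lookup touches only the rows that match. That is the kind of understanding that lets you evaluate what Claude generates.</p>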
<h4 id="heading-why-design-decisions-exist">Why design decisions exist</h4>
<p>Separation of concerns, dependency inversion, input validation, layered access control aren't just bureaucratic conventions. They're solutions to recurring engineering problems. When you understand why a pattern exists, you can direct Claude to implement it correctly and catch deviations.</p>
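<p>As one illustration, here is a minimal dependency-inversion sketch (all class and method names are invented for the example): the high-level service depends on an abstraction, so the delivery mechanism can change without touching it.</p>
<pre><code class="language-python">from typing import Protocol

class Notifier(Protocol):
    """The abstraction: high-level code depends on this, not on a vendor SDK."""
    def send(self, message): ...

class InMemoryNotifier:
    """A concrete implementation; a real one might call an email or SMS API."""
    def __init__(self):
        self.sent = []

    def send(self, message):
        self.sent.append(message)

class OrderService:
    """High-level policy: knows nothing about how notifications are delivered."""
    def __init__(self, notifier):
        self.notifier = notifier

    def place_order(self, item):
        self.notifier.send(f"Order placed: {item}")

notifier = InMemoryNotifier()
OrderService(notifier).place_order("book")
print(notifier.sent)  # ['Order placed: book']
</code></pre>
<p>Knowing why this shape exists is what lets you tell Claude "inject the notifier rather than constructing it inside the service" and recognize when it hasn't.</p>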
<h4 id="heading-what-quality-looks-like">What quality looks like</h4>
<p>Readable code. Consistent behavior. Graceful error handling. Clear interfaces. Adequate test coverage. These evaluations require judgment that is developed through experience, and that judgment is what separates software that works from software that lasts.</p>
<h3 id="heading-taste-and-standard">Taste and Standard</h3>
<p>Boris speaks of "audacity and taste" as properties he looks for in the products his team builds – software that stops users in place, that solves a problem with an elegance that makes the solution feel inevitable. This standard cannot be delegated.</p>
<p>Taste is developed through sustained engagement with excellent work. Reading good software. Using well-designed products with a critical eye. Understanding what produces the feeling of quality, and being able to articulate that understanding precisely enough to direct Claude toward it.</p>
<p>The developers who produce remarkable software with Claude Code are not using it as a replacement for their own judgment. They are using it as an execution instrument for a vision that is entirely theirs.</p>
<h3 id="heading-originality-over-imitation">Originality Over Imitation</h3>
<p>Reproducing existing products – building another version of something that already exists – is a reasonable learning exercise. But it's a poor use of what Claude Code actually makes possible.</p>
<p>The barrier to building novel software has changed. A working prototype of a genuinely original idea can be produced in a weekend. The cost of exploring an unusual approach is dramatically lower than it was. The ability to try something that has never been tried, and to have a working version of it in hours, is new.</p>
<p>This creates an obligation to think more ambitiously about what to build, not to settle for the comfort of known ground. The question worth asking is not "what already exists that I could rebuild?" but "what should exist that does not?"</p>
<h2 id="heading-chapter-25-a-structured-path-forward">Chapter 25: A Structured Path Forward</h2>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://picsum.photos/seed/chapter25/1200/400" alt="Chapter 25 Header" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<p>This final chapter provides a concrete sequence for applying everything in this handbook.</p>
<h3 id="heading-beginning-this-week">Beginning This Week</h3>
<p>The learning begins with a real project: not an exercise, not a tutorial reproduction, but something you intend to exist and potentially to use.</p>
<p>The size should be minimal. A personal homepage. A simple tool for a specific personal workflow. A landing page for an idea you have been carrying. The constraint is that it be real – that its completion produces something with actual value to you.</p>
<p>A structured first week:</p>
<p><strong>Day 1:</strong> Install VS Code and the Claude Code extension. Configure your account. Ask Claude Code what it can do. Understand the interface.</p>
<p><strong>Day 2:</strong> Define your first project. Write down five specific features. Make them as concrete as you can. Don't begin building yet.</p>
<p><strong>Day 3:</strong> Use Plan Mode to discuss the project with Claude. Refine the plan until it reflects your intent accurately. Implement Feature 1.</p>
<p><strong>Day 4:</strong> Test Feature 1 until you are confident it works correctly. Implement Feature 2.</p>
<p><strong>Day 5:</strong> Refine the project. Ask Claude to review it against the original specification. Close any gaps.</p>
<p><strong>Day 6:</strong> Share the project with someone who can give you honest feedback. Note the gap between what you built and what would have been better.</p>
<p><strong>Day 7:</strong> Assess the week. What was harder than expected? Where did Claude's output surprise you? What would you do differently? What do you want to build next?</p>
<p>This cycle – one week, one small project, one complete iteration – produces more learning than any amount of reading about Claude Code.</p>
<h3 id="heading-the-development-sequence">The Development Sequence</h3>
<p>Competency with Claude Code develops in stages. The sequence I've found most effective is:</p>
<p><strong>Stage 1 — Orientation (Weeks 1–4)</strong>: Web-based projects. Single-feature scope. Emphasis on learning to communicate with Claude Code precisely and to evaluate its output critically.</p>
<p><strong>Stage 2 — Construction (Months 1–3)</strong>: Multi-feature projects. Introduction to databases, APIs, and multi-file architecture. Emphasis on planning discipline and feature-by-feature verification.</p>
<p><strong>Stage 3 — Professional Practice (Months 3–6)</strong>: Full-stack applications. Deployment and production considerations. Multi-session workflows. MCP integrations. Emphasis on reliability, security, and maintainability.</p>
<p><strong>Stage 4 — Advanced Operation (Months 6 and beyond)</strong>: Parallel agent workflows. Autonomous loops for defined tasks. Custom Skills and project-level configuration. Open-source contribution and team-level Claude Code integration.</p>
<p>Each stage depends on the previous one. The temptation to skip ahead produces gaps that become expensive later.</p>
<h3 id="heading-principles-for-continued-growth">Principles for Continued Growth</h3>
<p>Start by analyzing every unexpected output. When Claude produces something different from what you expected – better or worse – understand why. This is how your model of Claude's behavior becomes accurate and your prompting becomes precise.</p>
<p>Make sure you read and understand the code Claude writes. Not line by line, but substantively. Passive acceptance of output that you do not understand produces a codebase you cannot maintain, direct, or explain.</p>
<p>You should also study software that you consider excellent. The standard against which you direct Claude is the standard you can articulate. Develop that standard through sustained exposure to well-built systems.</p>
<p>And don't neglect to write some code yourself. Maintain active engagement with implementation. The judgment that comes from building directly is what makes your direction of Claude effective.</p>
<h3 id="heading-on-the-professional-implications">On the Professional Implications</h3>
<p>Boris Cherny's assessment of the professional landscape is direct: the role boundaries between software engineers, product managers, and designers are becoming less distinct.</p>
<p>The engineers, product managers, and designers on his team all write code. The value contributed by each is shifting from specialized technical execution toward the cross-disciplinary judgment that only comes from understanding the full system – user needs, technical constraints, business context, design quality.</p>
<p>His recommendation: cultivate breadth. The ability to reason across domains – to hold technical, product, and design considerations simultaneously – is what produces the most coherent decisions. AI handles the mechanical execution. The human provides the understanding.</p>
<p>The accumulation that matters is not the code you have written. It is the judgment you have developed through building real things, studying what others have built, and understanding the standards that separate work that endures from work that does not. That accumulation is not replicable by an AI agent. It is yours.</p>
<h2 id="heading-appendix-a-claude-code-command-reference">Appendix A: Claude Code Command Reference</h2>
<table>
<thead>
<tr>
<th>Command</th>
<th>Function</th>
</tr>
</thead>
<tbody><tr>
<td><code>/help</code></td>
<td>Display all available commands</td>
</tr>
<tr>
<td><code>/model</code></td>
<td>Select active model (Sonnet, Opus, Haiku)</td>
</tr>
<tr>
<td><code>/permissions</code></td>
<td>Review and modify Claude Code's permissions</td>
</tr>
<tr>
<td><code>/clear</code></td>
<td>Reset the current session context</td>
</tr>
<tr>
<td><code>Shift+Tab (×2)</code></td>
<td>Activate Plan Mode (terminal)</td>
</tr>
<tr>
<td><code>Ctrl+C</code></td>
<td>Interrupt the current operation</td>
</tr>
<tr>
<td><code>@filename</code></td>
<td>Reference a specific file in a prompt</td>
</tr>
<tr>
<td><code>/mcp</code></td>
<td>Manage connected MCP servers</td>
</tr>
</tbody></table>
<h2 id="heading-appendix-b-standard-prompt-templates">Appendix B: Standard Prompt Templates</h2>
<p><strong>Beginning a new feature:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393797/dhcsxet25nhqszhkveh9.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">I want to implement the following feature:

[Feature title]
[Behavioral specification from PRD]

Enter Plan Mode and show me your proposed approach before writing any code.
Include: files to be affected, implementation sequence, data flow, edge cases.
</code></pre>
<p><strong>Diagnosing a defect:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393802/dyprngdfillzhhp56gij.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">A defect exists with the following characteristics:

Expected behavior: [description]
Observed behavior: [description]
Steps to reproduce: [sequence]
Relevant files: [if known]

Diagnose the cause and propose a correction. Do not implement until I have
reviewed your diagnosis.
</code></pre>
<p><strong>Code review:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393808/zu6j4kuzge5bij6ihnge.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">Review the implementation you produced for [feature] against the following
criteria:

1. Are there edge cases not handled by the current implementation?
2. Are there security considerations that require attention?
3. Is the code readable and maintainable for someone encountering it without context?
4. Is error handling adequate for the failure modes this code may encounter?

Provide your assessment before I approve this feature as complete.
</code></pre>
<p><strong>Context loading for a new session:</strong></p>
<p><a href="https://www.lunartech.ai/download/the-ai-engineering-handbook"><img src="https://res.cloudinary.com/dc6qud7v9/image/upload/v1774393814/vsult5zqo2pgzj0now63.png" alt="Code Snippet" style="display:block;margin:0 auto" width="600" height="400" loading="lazy"></a></p>
<pre><code class="language-plaintext">New session. Begin by reading the following files in order:
1. PRD.md
2. CLAUDE.md
3. README.md

Confirm your understanding of the project's current state and identify
where we left off, so we can proceed without reconstructing context manually.
</code></pre>
<h2 id="heading-appendix-c-reference-material">Appendix C: Reference Material</h2>
<p><strong>Anthropic Documentation</strong></p>
<ul>
<li><p><a href="https://claude.ai">claude.ai</a> — Account management and Claude.ai access</p>
</li>
<li><p><a href="https://code.claude.ai">code.claude.ai</a> — Claude Code installation and official documentation</p>
</li>
</ul>
<p><strong>Foundational Reading</strong></p>
<ul>
<li><p><a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">The Bitter Lesson</a> — Rich Sutton. Essential reading for understanding why general models outperform specialized systems.</p>
</li>
<li><p><a href="https://www.goodreads.com/book/show/39996759-a-philosophy-of-software-design">A Philosophy of Software Design</a> — John Ousterhout. The clearest account of what makes software maintainable.</p>
</li>
<li><p><a href="https://fundamentalsofsoftwarearchitecture.com/">Fundamentals of Software Architecture</a> — Mark Richards and Neal Ford. Practical treatment of architectural decision-making.</p>
</li>
</ul>
<p><strong>For Those New to Web Development</strong></p>
<ul>
<li><p><a href="https://www.theodinproject.com/">The Odin Project</a> — free, project-based curriculum</p>
</li>
<li><p><a href="https://www.freecodecamp.org/">freeCodeCamp</a> — free, structured curriculum</p>
</li>
</ul>
<p>Both provide sufficient foundation to understand what Claude Code is building and to evaluate its output critically.</p>
<h2 id="heading-closing">Closing</h2>
<p>This handbook was written in February 2026. The specifics of Claude Code – its features, its model capabilities, its interface – will continue to evolve. Some of what is written here will require updating within months.</p>
<p>The underlying principles will not.</p>
<p>The quality of what you build with Claude Code is determined by the quality of your planning, the precision of your communication, the rigor of your verification, and the clarity of your standards. These are properties of practice, not of tooling. They compound. They do not expire.</p>
<h3 id="heading-the-lunartech-fellowship-bridging-academia-and-industry">The LUNARTECH Fellowship: Bridging Academia and Industry</h3>
<p>The LUNARTECH Fellowship was created to bridge the growing disconnect between academic theory and the practical demands of the tech industry. Far too often, aspiring engineers are caught in the “no experience, no job” loop, graduating with theoretical knowledge but unprepared for the messy reality of production systems.</p>
<p>To combat this systemic issue and halt the resulting brain drain, the Fellowship invests heavily in promising people, offering a transformative environment that prioritizes hands-on experience, mentorship, and real-world engineering over traditional degrees.</p>
<p>This 6-month, remote-first apprenticeship serves as an immersive odyssey from aspiring talent to AI trailblazer. Rather than paying to learn in isolation, Fellows work on live, high-stakes AI and data products alongside experienced senior engineers and founders. By tackling actual engineering challenges and building a concrete portfolio of production-ready work, participants acquire the job-ready skills needed to thrive in today’s competitive landscape.</p>
<p>If you're ready to break the loop and accelerate your career, you can explore these opportunities and start your journey here: <a href="https://www.lunartech.ai/our-careers">https://www.lunartech.ai/our-careers</a>.</p>
<h3 id="heading-master-your-career-the-ai-engineering-handbook">Master Your Career: The AI Engineering Handbook</h3>
<p>For those ready to transition from theory to practice, we have developed <a href="https://www.lunartech.ai/download/the-ai-engineering-handbook">The AI Engineering Handbook: How to Start a Career and Excel as an AI Engineer</a>. This comprehensive guide provides a step-by-step roadmap for mastering the skills necessary to thrive in the transformative world of AI in 2025. Whether you are a developer looking to break into a competitive field or a professional seeking to future-proof your career, this handbook offers proven strategies and actionable insights that have already empowered countless individuals to secure high-impact roles.</p>
<p>Inside, you will explore real-world industry workflows, advanced architecting methods, and expert perspectives from leaders at companies like NVIDIA, Microsoft, and OpenAI. From discovering the technology behind ChatGPT to learning how to architect systems that transform research into world-changing products, this eBook is your ultimate companion for career acceleration. You can download your free copy above and start mastering the future of AI.</p>
<h3 id="heading-about-lunartech-lab">About LunarTech Lab</h3>
<p><em>“Real AI. Real ROI. Delivered by Engineers — Not Slide Decks.”</em></p>
<p><a href="https://labs.lunartech.ai"><strong>LunarTech Lab</strong></a> is a deep-tech innovation partner specializing in AI, data science, and digital transformation – across software products, data platforms, and AI-driven systems.</p>
<p>We build real systems, not PowerPoint strategies. Our teams combine product, data, and engineering expertise to design AI that is measurable, maintainable, and production-ready. We are vendor-neutral, globally distributed, and grounded in real engineering – not hype. Our model blends Western European and North American leadership with high-performance technical teams, offering world-class delivery at 70% of the Big Four's cost.</p>
<h3 id="heading-how-we-work-from-scratch-in-four-phases">How We Work — From Scratch, in Four Phases</h3>
<p><strong>1. Discovery Sprint (2–4 Weeks):</strong> We start with data and ROI – not assumptions – to define what’s worth building, what’s not, and how much it will cost you.</p>
<p><strong>2. Pilot / Proof of Concept (8–12 Weeks):</strong> We prototype the core idea – fast, focused, and measurable.<br>This phase tests models, integrations, and real-world ROI before scaling.</p>
<p><strong>3. Full Implementation (6–12 Months):</strong> We industrialize the solution — secure data pipelines, production-grade models, full compliance, and knowledge transfer to your team.</p>
<p><strong>4. Managed Services (Ongoing):</strong> We maintain, retrain, and evolve the AI models for lasting ROI. Quarterly reviews ensure that performance improves with time, not decays. As we own <a href="https://academy.lunartech.ai/courses">LunarTech Academy</a>, we also build customised training so that clients’ tech teams can continue working without us.</p>
<p>Every project is designed <strong>from scratch</strong>, integrating product knowledge, data engineering, and applied AI research.</p>
<h3 id="heading-why-lunartech-lab">Why LunarTech Lab?</h3>
<p>LunarTech Lab bridges the gap between strategy and real engineering, where most competitors fall short. Traditional consultancies, including the Big Four, sell frameworks, not systems – expensive slide decks with little execution.</p>
<p>We offer the same strategic clarity, but it’s delivered by engineers and data scientists who build what they design, at about 70% of the cost. Cloud vendors push their own stacks and lock clients in. LunarTech is vendor-neutral: we choose what’s best for your goals, ensuring freedom and long-term flexibility.</p>
<p>Outsourcing firms execute without innovation. LunarTech works like an R&amp;D partner, building from first principles, co-creating IP, and delivering measurable ROI.</p>
<p>From discovery to deployment, we combine strategy, science, and engineering, with one promise: We don’t sell slides. We deliver intelligence that works.</p>
<h3 id="heading-stay-connected-with-lunartech">Stay Connected with LunarTech</h3>
<p>Follow LunarTech Lab on the <a href="https://substack.com/@lunartech">LunarTech Newsletter</a> and <a href="https://www.linkedin.com/in/tatev-karen-aslanyan/"><strong>LinkedIn</strong></a>, where innovation meets real engineering. You’ll get insights, project stories, and industry breakthroughs from the front lines of applied AI and software development.</p>
<h3 id="heading-lunartech-academy-build-the-future">LunarTech Academy – Build the Future</h3>
<p>If you are inspired by what Claude Code and AI-assisted development make possible and want to build the skills to operate at the frontier, consider joining <a href="http://academy.lunartech.ai">academy.lunartech.ai</a>. Our programs cover AI engineering, machine learning, data science, and applied development, equipping you with the practical, industry-ready expertise needed to build production systems, direct AI agents effectively, and ship software that actually works.</p>
<p>Whether you are a developer looking to level up, a founder who wants to build without a full engineering team, or a domain expert ready to turn your knowledge into working software – the LunarTech Academy is built for where you are going, not where you have been.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Claude Code Essentials ]]>
                </title>
                <description>
                    <![CDATA[ We just published a massive new course on the freeCodeCamp.org YouTube channel that will change the way you think about programming. Instead of just chatting with an AI, you can now learn how to use C ]]>
                </description>
                <link>https://www.freecodecamp.org/news/claude-code-essentials/</link>
                <guid isPermaLink="false">69c1874230a9b81e3a8f2e6d</guid>
                
                    <category>
                        <![CDATA[ claude-code ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Mon, 23 Mar 2026 18:32:34 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5f68e7df6dfc523d0a894e7c/4ec19e44-4bda-4b9b-bfd3-d8ccec1c7ba0.jpg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>We just published a massive new course on the freeCodeCamp.org YouTube channel that will change the way you think about programming. Instead of just chatting with an AI, you can now learn how to use Claude Code to build real-world agentic workflows. This comprehensive guide was developed by Andrew Brown to take you from a total beginner to a pro who knows how to collaborate with an autonomous coding agent directly in the terminal.</p>
<p>Claude Code isn't your average chatbot. It can read your files, execute terminal commands, and even reason through its own errors to fix bugs without you having to copy and paste code back and forth. This course explains the "Agentic Loop," which is the core logic behind how the AI observes a problem, thinks of a solution, and then uses tools to execute that plan.</p>
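<p>As a rough mental model of that loop, here is a deliberately simplified toy sketch. This is not Claude Code's actual implementation: the "model" is a hard-coded stub that knows one fix, and the "tests" are a string check – it only shows the observe, think, act control flow.</p>
<pre><code class="language-python"># Toy sketch of the agentic loop: observe, think, act, repeat until done.
# The model and test runner here are stubs invented for illustration.

def run_tests(source):
    """Observe: pretend to run the test suite against the current file."""
    return "pass" if "return a + b" in source else "error: add() returns the wrong value"

def propose_fix(source, observation):
    """Think: a stand-in for the LLM turning an observation into a patch."""
    return source.replace("return a - b", "return a + b")

source = "def add(a, b):\n    return a - b\n"
observation = run_tests(source)
for _ in range(5):                  # bounded: real agents also need a stop condition
    if observation == "pass":
        break
    source = propose_fix(source, observation)   # act: apply the edit
    observation = run_tests(source)             # observe again

print(observation)
</code></pre>
<p>The real agent's tools are file reads, edits, and terminal commands, but the shape of the loop is the same: act, observe the result, and feed that observation back into the next decision.</p>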
<p>Throughout the six-hour deep dive, you will learn everything from basic installation on Windows, Linux, and Mac to advanced concepts like session management. Andrew covers how to resume or fork your coding sessions so you can experiment with different ideas safely. You will also get a handle on the financial side by learning how to monitor your API usage and costs across different providers like AWS Bedrock and Google Vertex AI.</p>
<p>You can watch the full course right now on <a href="https://youtu.be/brLhhkUqcn4">the freeCodeCamp YouTube channel</a> (12-hour watch).</p>
<div class="embed-wrapper"><iframe width="560" height="315" src="https://www.youtube.com/embed/brLhhkUqcn4" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Stop Staring at a Blank Deck: How I Use Claude Code + Marp to Think Through Presentations ]]>
                </title>
                <description>
                    <![CDATA[ The hard part of building a presentation is figuring out the story. What are you trying to say? What’s the structure? Which sections build on which? Where does the data go, table or bullets? Before th ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-use-claude-code-and-marp-to-think-through-presentations/</link>
                <guid isPermaLink="false">69bc3429b238fd45a3206764</guid>
                
                    <category>
                        <![CDATA[ writing ]]>
                    </category>
                
                    <category>
                        <![CDATA[ AI ]]>
                    </category>
                
                    <category>
                        <![CDATA[ research ]]>
                    </category>
                
                    <category>
                        <![CDATA[ claude-code ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Omer Rosenbaum ]]>
                </dc:creator>
                <pubDate>Thu, 19 Mar 2026 17:36:41 +0000</pubDate>
                <media:content url="https://cdn.hashnode.com/uploads/covers/5e1e335a7a1d3fcc59028c64/fcbd044d-0add-467c-a9b0-d068584a8197.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>The hard part of building a presentation is figuring out the <em>story</em>. What are you trying to say? What’s the structure? Which sections build on which? Where does the data go, table or bullets? Before the comparison or after?</p>
<p>What <em>would</em> help is having something to <strong>react to</strong>. Starting from zero is hard. Reacting to a draft is fast. “Move this before that” is way easier than “what should I say?”</p>
<p>That’s the workflow I want to show you. I use Claude Code + Marp to think through presentations. Claude helps me brainstorm the story, gives me a first draft to react to, and then I iterate, either through “conversation” or by editing the Markdown directly. The whole thing is a text file. 🎉</p>
<p>(I used a deck to think through this post. You can find it <a href="https://omerr.github.io/claude-skills/presentations/claude-code-marp/">here</a>.)</p>
<h3 id="heading-well-cover">We'll cover:</h3>
<ol>
<li><p><a href="#heading-the-workflow">The Workflow</a></p>
<ul>
<li><p><a href="#heading-brainstorm">Brainstorm</a></p>
</li>
<li><p><a href="#heading-react">React</a></p>
</li>
<li><p><a href="#heading-iterate">Iterate</a></p>
</li>
<li><p><a href="#heading-export">Export</a></p>
</li>
<li><p><a href="#heading-editable-pptx">Editable PPTX</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-get-started-in-5-minutes">Get Started in 5 Minutes</a></p>
<ul>
<li><p><a href="#heading-1-install-marp-cli">1. Install Marp CLI</a></p>
</li>
<li><p><a href="#heading-2-install-the-skill-via-skillssh">2. Install the skill (via skills.sh)</a></p>
</li>
<li><p><a href="#heading-run-it">Run it</a></p>
</li>
<li><p><a href="#heading-iterate">Iterate</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-use-case-this-very-article">Use Case: This Very Article</a></p>
</li>
<li><p><a href="#heading-under-the-hood">Under the Hood</a></p>
<ul>
<li><p><a href="#heading-marp-markdown-to-slides">Marp: Markdown to Slides</a></p>
</li>
<li><p><a href="#heading-the-skill-file">The Skill File</a></p>
</li>
<li><p><a href="#heading-section-dividers">Section Dividers</a></p>
</li>
<li><p><a href="#heading-breadcrumb-navigation">Breadcrumb Navigation</a></p>
</li>
</ul>
</li>
<li><p><a href="#heading-wrapping-up">Wrapping Up</a></p>
</li>
</ol>
<h2 id="heading-the-workflow">The Workflow</h2>
<p>The process has four stages: <strong>Brainstorm</strong>, <strong>React</strong>, <strong>Iterate</strong>, <strong>Export</strong>. Let me walk through each one.</p>
<h3 id="heading-brainstorm">Brainstorm</h3>
<p>You kick things off with a slash command (I will provide you with this command later):</p>
<pre><code class="language-plaintext">/create-marp-deck API rate limiting
</code></pre>
<p>Claude starts by interviewing you: it asks about the goal, audience, key points, and any data you want to include. This forces you to articulate the story before a single slide exists.</p>
<img src="https://cdn-images-1.medium.com/max/1200/0*HHy7-RCRF_oS8WeU.png" alt="Claude Code interviewing the user about their presentation — asking about the goal, audience, and key&nbsp;sections" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<pre><code class="language-plaintext">~/projects/my-app $ claude
&gt; /create-marp-deck API rate limiting
─────────────────────────────────────────────

Claude: What's this presentation about? What should the audience
        walk away knowing?

User:   How our rate limiting works, which algorithm we chose and
        why, and what they need to know to configure it for their
        services.

Claude: Who's the audience?

User:   Platform engineering team — they know our stack but haven't
        seen the rate limiting internals.

Claude: What are the key points or sections you want to cover?

User:   1. Why we need rate limiting (the incident last month)
        2. Token bucket vs sliding window — our decision
        3. How to configure it per-service
        4. Monitoring dashboard walkthrough
</code></pre>
<p>Think of it as a lightweight brainstorm: you talk through what you’re trying to say, and Claude helps you structure it.</p>
<h3 id="heading-react">React</h3>
<p>Once you’ve aligned on the structure, Claude generates the full Marp Markdown file and exports it. You get a solid first draft you can <em>react to</em> and reshape.</p>
<img src="https://cdn-images-1.medium.com/max/1200/0*pFj4D2py1ATg816C.png" alt="Title slide with dark gradient background showing “API Rate Limiting — A Technical Deep&nbsp;Dive”" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<p>That title slide came from this Markdown:</p>
<pre><code class="language-plaintext">&lt;!-- _class: lead title-slide --&gt;
</code></pre>
<pre><code class="language-plaintext"># API Rate Limiting
## A Technical Deep Dive
</code></pre>
<pre><code class="language-plaintext">**Team**: Platform Engineering
**Date**: February 2026
</code></pre>
<p>Is it perfect? Probably not. But now you have something concrete, with sections, structure, and a story, that you can push around. That’s so much faster than starting from a blank canvas.</p>
<p>When you go through the slides, you <em>feel</em> if the story is coherent and clear.</p>
<h3 id="heading-iterate">Iterate</h3>
<p>Reviewing the draft will inevitably spark ideas: “oh, I should add a comparison table here,” “this section is too dense, maybe split it into two,” “move this summary up to the top.”</p>
<p>One way to make such edits is to ask Claude Code to do that:</p>
<pre><code class="language-plaintext">"Slide 6 is too dense. Split the algorithm comparison into
two slides, one for token bucket, one for sliding window."
</code></pre>
<img src="https://cdn-images-1.medium.com/max/1200/0*dVWLJ1hecPviej4D.png" alt="Claude Code splitting a slide and adding a callout, with file&nbsp;diffs" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<pre><code class="language-plaintext">&gt; Slide 6 is too dense. Split the algorithm comparison into two
  slides — one for token bucket, one for sliding window.

Claude: I'll split slide 6 into two separate slides, one per algorithm.

  Edit presentations/api-rate-limiting.md
  ───────────────────────────────────────
  - # Algorithm Comparison
  - | Feature | Token Bucket | Sliding Window |
  + # Token Bucket
  + Tokens refill at a steady rate...
  + ---
  + # Sliding Window
  + Track exact timestamp of every request...

&gt; Add a "Why we chose token bucket" callout to that first slide

Claude: Added a blockquote callout explaining the decision.

  Edit presentations/api-rate-limiting.md
  ───────────────────────────────────────
  + &gt; We chose token bucket because it handles bursty traffic
  + &gt; from our mobile clients without penalizing steady callers
</code></pre>
<p>You can also edit in <strong>VS Code</strong> with the Marp extension for live preview. Open the&nbsp;<code>.md</code> file, hit <code>Ctrl+Shift+V</code>, and you get the source on the left with rendered slides on the right. Claude Code edits the file, VS Code detects the change, and the preview updates automatically. (I keep both open side by side and it just works.)</p>
<img src="https://cdn-images-1.medium.com/max/1200/0*88zY1J4xzeo1vUWS.png" alt="Me editing the deck that I created to help me think through this&nbsp;article" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<h3 id="heading-export">Export</h3>
<p>When you’re done, you get three files:</p>
<ul>
<li><p><code>.md</code> – the source (version-controlled, diffable)</p>
</li>
<li><p><code>.html</code> – open in any browser, share via Slack</p>
</li>
<li><p><code>.pptx</code> – open in PowerPoint, present anywhere</p>
</li>
</ul>
<pre><code class="language-bash">$ marp --no-stdin deck.md -o deck.html
[  INFO ] Converting 1 markdown...
[  INFO ] deck.md =&gt; deck.html

$ marp --no-stdin --pptx deck.md -o deck.pptx
[  INFO ] Converting 1 markdown...
[  INFO ] deck.md =&gt; deck.pptx

$ ls presentations/
api-rate-limiting.md
api-rate-limiting.html   ✓ open in browser, share via Slack
api-rate-limiting.pptx   ✓ open in PowerPoint, present anywhere
</code></pre>
<img src="https://cdn-images-1.medium.com/max/1200/0*XsnHZELJ9w3vovOz.png" alt="marp CLI exporting to HTML and&nbsp;PPTX" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<p>The skill runs the export commands automatically after generating the deck. A 15-slide deck converts in about 2 seconds.</p>
<h4 id="heading-editable-pptx">Editable PPTX</h4>
<p>The standard PPTX export renders each slide as an image: pixel-perfect, but you can’t edit the text in PowerPoint or Google Slides. If you need editable text, Marp has a <code>--pptx-editable</code> flag that uses LibreOffice under the hood to produce real text boxes.</p>
<p>The catch: LibreOffice creates text boxes that are too narrow, so text wraps and overlaps. The skill includes a python-pptx post-processing script that automatically widens the text boxes to fix this. Just ask for “editable PPTX” and the skill handles the rest: the LibreOffice conversion, the text box fix, everything.</p>
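<p>The skill’s actual script isn’t reproduced here, but the core of such a fix might look like this sketch. It assumes <code>python-pptx</code> is installed, and the right-margin value is an illustrative assumption, not the skill’s real number:</p>
<pre><code class="language-python"># Hypothetical sketch: widen the text boxes LibreOffice left too narrow.
EMU_PER_INCH = 914400              # PowerPoint measures in English Metric Units
RIGHT_MARGIN = EMU_PER_INCH // 2   # assumed 0.5" right margin

def widened(left, slide_width, margin=RIGHT_MARGIN):
    """Stretch a box from its left edge to the slide's right margin."""
    return slide_width - left - margin

def fix_deck(path):
    # Imported here so the geometry helper above stays dependency-free.
    from pptx import Presentation
    prs = Presentation(path)
    for slide in prs.slides:
        for shape in slide.shapes:
            if shape.has_text_frame:
                shape.width = widened(shape.left, prs.slide_width)
    prs.save(path)
</code></pre>
<p>The idea is simply to recompute each text frame’s width from its left edge rather than trust the width LibreOffice emitted.</p>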
<h2 id="heading-get-started-in-5-minutes">Get Started in 5&nbsp;Minutes</h2>
<p>OK, are you ready? Here’s everything you need:</p>
<h3 id="heading-1-install-marp-cli">1. Install Marp CLI:</h3>
<ul>
<li><code>npm install -g @marp-team/marp-cli</code></li>
</ul>
<h3 id="heading-2-install-the-skill-via-skillssh">2. Install the skill (via <a href="http://skills.sh">skills.sh</a>):</h3>
<ul>
<li><code>npx skills add Omerr/claude-skills</code></li>
</ul>
<p>This works with Claude Code, Cursor, GitHub Copilot, and other AI agents. You can also install manually (see the <a href="https://github.com/Omerr/claude-skills">repo</a> for details).</p>
<h3 id="heading-3-run-it">3. Run it:</h3>
<ul>
<li><code>/create-marp-deck your topic here</code></li>
</ul>
<h3 id="heading-4-iterate">4. Iterate:</h3>
<p>React to the draft, refine through conversation or VS Code, and export.</p>
<p>That’s it. Four steps. Fork the repo and customize the conventions to match your style.</p>
<h2 id="heading-use-case-this-very-article">Use Case: This Very&nbsp;Article</h2>
<p>Want to see this workflow in practice? You’re looking at it.</p>
<p>I wrote this article by first creating a slide deck using exactly the process I described above. I ran <code>/create-marp-deck</code>, answered the interview questions, got a first draft, and iterated until the story felt right. You can <a href="https://omerr.github.io/claude-skills/presentations/claude-code-marp/">see the deck here</a>.</p>
<p>Why start with slides? Because a deck forces you to be concise and to work through the <em>story</em>. If the story doesn’t flow across 15 slides, it won’t flow across 1,500 words either. The deck became my outline, and once I had a coherent structure there, writing the article was much easier.</p>
<p>So if you’re ever staring at a blank doc thinking “I should write a blog post about X,” try making a deck first. You might be surprised how much faster the writing goes when the story is already figured out. 😎</p>
<h2 id="heading-under-the-hood">Under the&nbsp;Hood</h2>
<p>If you’re curious about what makes this work, read on. If not, you’re all set. 🙌🏻</p>
<h3 id="heading-marp-markdown-to-slides">Marp: Markdown to&nbsp;Slides</h3>
<p><a href="https://marp.app/"><strong>Marp</strong></a> (Markdown Presentation Ecosystem) converts&nbsp;<code>.md</code> files into slides. Your deck starts with frontmatter:</p>
<pre><code class="language-plaintext">---
marp: true
theme: default
paginate: true
size: 16:9
---
</code></pre>
<p>Four lines and you have widescreen, paginated slides. Slide breaks are just <code>---</code> in the Markdown. Your presentation is a text file: version-controlled, diffable, and AI-editable.</p>
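<p>Putting the two together, a complete two-slide deck is just a handful of lines:</p>
<pre><code class="language-plaintext">---
marp: true
theme: default
paginate: true
size: 16:9
---

# First Slide

Some content.

---

# Second Slide

More content.
</code></pre>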
<h3 id="heading-the-skill-file">The Skill&nbsp;File</h3>
<p>You <em>could</em> just ask Claude Code to “make me a Marp presentation” every time. But you’d spend half the conversation explaining your preferred format, color palette, and slide structure.</p>
<p>Instead, I created a <strong>Claude Code skill</strong> (see it <a href="https://github.com/Omerr/claude-skills.git">here</a>), a reusable set of instructions that Claude follows whenever you invoke it. It has two parts:</p>
<ol>
<li><p>An <strong>interview phase</strong> that gathers context before generating anything (the 5 questions from the brainstorm step)</p>
</li>
<li><p>A <strong>generation phase</strong> with the full Marp conventions: CSS palette, slide structure, breadcrumb pattern, formatting rules, and export commands</p>
</li>
</ol>
<p>The full skill is about 200 lines. That sounds like a lot, but you write it once and then every deck you create follows the same polished conventions automatically.</p>
<h3 id="heading-section-dividers">Section Dividers</h3>
<p>Each section of the deck gets its own gradient background. So when you’re presenting, the audience intuitively knows when you’ve moved to a new topic:</p>
<img src="https://cdn-images-1.medium.com/max/1200/0*SDRNCJnSw3BDwXUG.png" alt="Section divider slide with blue gradient showing “Part 1: The&nbsp;Problem”" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<p>Applied via CSS classes in the skill:</p>
<pre><code class="language-plaintext">&lt;!-- _class: lead part-problem --&gt;
# Part 1: The Problem
</code></pre>
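<p>The gradient itself is defined in the skill’s CSS. The exact palette belongs to the skill, but a class like <code>part-problem</code> might be declared along these lines (colors here are illustrative, not the skill’s real values):</p>
<pre><code class="language-plaintext">section.part-problem {
  background: linear-gradient(135deg, #1e3a8a, #3b82f6);
  color: #ffffff;
}
</code></pre>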
<h3 id="heading-breadcrumb-navigation">Breadcrumb Navigation</h3>
<p>This is my favorite part of the whole setup.</p>
<p>Every content slide has a breadcrumb header at the top that shows where you are in the deck:</p>
<img src="https://cdn-images-1.medium.com/max/1200/0*PaPBdx60ZYJn9G3K.png" alt="Content slide showing breadcrumb “The Problem > Algorithms > Implementation” with the current section highlighted in&nbsp;blue" style="display:block;margin:0 auto" width="600" height="400" loading="lazy">

<p>See that header? “The Problem &gt; <strong>Algorithms</strong> &gt; Implementation”, with “Algorithms” highlighted in blue.</p>
<p>In Marp, this is done with a simple HTML comment:</p>
<pre><code class="language-plaintext">&lt;!-- header: "The Problem &gt; **Algorithms** &gt; Implementation" --&gt;
</code></pre>
<p>The <code>**bold**</code> text renders in blue (via CSS <code>header strong { color: #2563eb; }</code>), while the rest stays gray. You set it once per section and it persists until you change it.</p>
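<p>If you want to experiment without the skill, Marp lets you embed that CSS directly in the deck’s frontmatter via the <code>style</code> key. A minimal sketch (the gray tone for non-bold text is an assumption):</p>
<pre><code class="language-plaintext">---
marp: true
style: |
  header { color: #6b7280; }
  header strong { color: #2563eb; }
---

&lt;!-- header: "The Problem &gt; **Algorithms** &gt; Implementation" --&gt;
# Token Bucket
</code></pre>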
<p>How often have you sat through a presentation wondering “wait, where are we?” 🤔</p>
<h2 id="heading-wrapping-up">Wrapping Up</h2>
<p>The hard part of presentations is telling a coherent story. Get yourself a first draft to react to, iterate until it flows, and export. That’s it.</p>
<p>If you want to try it: <code>npm install -g @marp-team/marp-cli</code>, run <code>npx skills add Omerr/claude-skills</code>, and then <code>/create-marp-deck</code>. You'll have a deck in minutes and a workflow you can reuse for every presentation after that.</p>
<h3 id="heading-about-the-author">About the&nbsp;Author</h3>
<p><a href="https://www.linkedin.com/in/omer-rosenbaum-034a08b9/">Omer Rosenbaum</a> is the author of the <a href="https://youtube.com/@BriefVid">Brief</a> <a href="https://youtube.com/@BriefVid">YouTube Channel</a>. He’s also a cyber training expert and founder of Checkpoint Security Academy. He’s the author of <a href="https://www.freecodecamp.org/news/product-led-research-a-practical-guide-for-randd-leaders-full-book/">Product-Led Research</a>, <a href="https://www.freecodecamp.org/news/gitting-things-done-book/">Gitting Things Done</a> (in English) and <a href="https://data.cyber.org.il/networks/networks.pdf">Computer Networks</a> (in Hebrew). You can find him on <a href="https://twitter.com/Omer_Ros">Twitter</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
