From Random Conversations to Reproducible Production: A Claude Code Advanced Workflow Guide
Transform chaotic AI-assisted development into a structured, document-first, diff-driven workflow that delivers predictable results through systematic planning and execution.
A comprehensive guide that transforms "multi-round casual code editing" into a "document-first, diff-driven, one-shot execution" methodology.
Executive Summary
This guide presents a battle-tested approach to making AI-assisted development predictable and professional. Instead of endless back-and-forth conversations in the editor, we establish an executable PLAN.md that clearly defines objectives, boundaries, steps, and acceptance criteria, then have Claude Code execute according to the documentation. When failures occur, we cold-start a new round rather than "patching up" within the same conversation thread.
When we implemented this methodology across our team, three immediate changes occurred:
- Rollbacks became painless because everything is delivered as unified diffs
- Code reviews became effortless because we only need to compare patches against the PLAN and Definition of Done (DoD)
- Model uncertainty was contained within file structures and processes, rather than scattered across chat histories
⸻
A Real Task: Walking Through the Complete Pipeline
Let's demonstrate the workflow with an actual task: adding an authentication middleware layer to API v2. Unlike the old habit of asking the model "how to do it," we first lay the tracks:
- Repository root contains a `CLAUDE_CODE_LOG/` directory
- Each attempt round gets a unique `timestamp__task-name` folder
- Each round contains six essential files: `PLAN.md`, `PROMPT_USED.md`, `0001.patch`, `RUN_LOG.md`, `COMMIT_SHA.txt`, and `EVAL_REPORT.md`
- Claude Code is only allowed to output unified diffs and is constrained by file whitelists
- After failures, we archive and cold-start the next round (new timestamp directory) rather than continuing the same conversation
On the first run, Claude provides a patch; we `git apply --3way`, run the tests, and the DoD fails, so we archive the actual prompt, patch, logs, and commit fingerprint, document the failure reason, and start round two. After a few repetitions, both success rate and speed improve: the model performs on "narrow rails," maintaining freedom without losing control.
⸻
Core Methodology: The Four Guiding Principles
1. Document-First
Write a PLAN.md for any task before having the model execute it.
2. Diff-Driven
Only accept unified diffs; prohibit full-file rewrites or "repository scanning."
3. Scope Convergence
Implement file whitelists + directory blacklists + maximum change line limits.
4. Verifiable
Use Definition of Done (DoD) with accompanying tests/scripts to determine success/failure, with full traceability.
⸻
1. Repository Structure and Naming Conventions
```text
project-root/
  .claude/
    templates/PLAN.template.md
    rules/            # Code standards/error codes/security guidelines
    repo_map.md       # Key directories/entry points/data flow
  CLAUDE_CODE_LOG/
    20250827_083000__add-auth-mw/   # timestamp__task-kebab
      PLAN.md
      PROMPT_USED.md
      0001.patch
      RUN_LOG.md
      COMMIT_SHA.txt
      EVAL_REPORT.md
```
Naming Conventions:
- Timestamp: `YYYYMMDD_HHMMSS`
- Task name: kebab-case, no spaces
- Branch: `feature/<task>-<timestamp>`
- Failure tags: `cc-fail-<task>-<timestamp>`
- Important: never move source code into the archive. Only archive documentation, patches, logs, and fingerprints, to avoid breaking paths and build chains.
⸻
2. Consolidating the "7-Layer Prompt Stack" into PLAN.md
Tool, Language, Project, Persona, Component, Task, Query: with all seven layers documented, Claude gains stable context simply by reading the file.
```markdown
# Task Title
Auth middleware for API v2

## 0. Metadata
- Timestamp: 2025-08-27 08:30
- Branch: feature/auth-mw-20250827-0830
- Issue: #123

## 1) Tool Conventions
- Using Claude Code / Cursor (Claude)
- **Output only unified diff (patch)**
- **Only allow modification of**: server/middleware/auth.ts, server/routes/*.ts, tests/auth.test.ts

## 2) Language Conventions
- Node 20 + TypeScript strict
- ESLint/Prettier enforced, explicit types

## 3) Project Context
- Directory/data flow: see .claude/repo_map.md
- Forbidden to change: infra/, migrations/, .env*
- Dependencies: jsonwebtoken@^9

## 4) Persona
- Senior backend + test engineer: MVP first, then add tests and documentation

## 5) Component Scope
- API v2 middleware layer; contract: Authorization: Bearer <JWT>

## 6) Task & DoD
- Validate JWT, inject req.user; distinguish 401/403
- **DoD**:
  - `npm test` all green
  - `GET /v2/ping` returns 200
  - Change lines < 300 and only in whitelisted files

## 7) Query / Action
- If uncertain, first ask ≤3 clarification questions
- Then output **unified diff**
- Finally attach ≤120-character change summary

## Risks & Rollback
- Risk: route ordering causing the middleware not to take effect
- Rollback: revert current patch; interface unchanged

## Evaluation
- Run: `npm i && npm test`
- Record: output written to `RUN_LOG.md`

## Debrief (fill after execution)
- Success/failure; reason classification; next-round hypothesis and corrections
```
⸻
3. Minimal Prompt for Execution
```text
Read ./CLAUDE_CODE_LOG/20250827_083000__add-auth-mw/PLAN.md
Execute strictly according to "whitelist/DoD/steps":
1) If uncertain, first provide ≤3 clarification questions
2) Then output only a unified diff patch
3) Attach a ≤120-character change summary
```
⸻
4. "Read-and-Do" 10-Minute Workflow Script
```bash
# 1) Start round
TS=$(date +"%Y%m%d_%H%M%S"); TASK=add-auth-mw
ROUND="CLAUDE_CODE_LOG/${TS}__${TASK}"
mkdir -p "$ROUND" && git checkout -b "feature/${TASK}-${TS}"

# 2) Generate PLAN draft
cp .claude/templates/PLAN.template.md "$ROUND/PLAN.md"
printf '\n\nBranch: feature/%s-%s\nTimestamp: %s\n' "$TASK" "$TS" "$TS" >> "$ROUND/PLAN.md"

# 3) Have Claude output a patch → paste and save as:
cat > "$ROUND/0001.patch"

# 4) Record the base commit, then apply and verify with traceability
git rev-parse HEAD > "$ROUND/COMMIT_SHA.txt"
git apply --3way "$ROUND/0001.patch"
{ npm i && npm test; } 2>&1 | tee "$ROUND/RUN_LOG.md"

# 5) Commit (on success) or archive and cold-start (on failure)
git add -A && git commit -m "cc: ${TASK} @ ${TS}"
```
⸻
5. Four Common Task Types: Prompt Recipes
Implementation Tasks

```text
Read PLAN.md
If uncertain, propose ≤3 clarification points
Implement minimally, within whitelist files only; output a unified diff
```

Refactoring Tasks

```text
Goal: don't change external behavior, only improve internal structure
Preserve all exported APIs and keep tests unchanged
Provide the diff in steps (small first, then larger); each step must pass tests independently
```

Bug Fix Tasks

```text
First provide a minimal "reproduction unit test" patch (red)
Then provide the fix patch (green)
Send the two diffs separately
```

Documentation/Script Tasks

```text
Only make limited changes to README/scripts
Provide local verification commands; write output to RUN_LOG.md
```
⸻
6. Change Scope and Safety Guardrails
Essential Protections:
- Whitelist: Only list files/directories allowed for modification
- Blacklist: infra/, migrations/, .env*, deployment scripts, and other sensitive areas
- Change limits: e.g., single round ≤300 lines; must stop and ask when hitting ceiling
- Style consistency: ESLint/Prettier enforced; TS explicit types
- Patch-only: Prohibit full-file rewrites and cross-repository refactoring
- Dependency locking: Preserve lockfile; new/upgraded dependencies must be explicitly declared in PLAN
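These guardrails can be enforced mechanically before `git apply`. A minimal sketch, assuming the whitelist from the example PLAN and the 300-line budget; `check_patch` is a hypothetical helper, not part of any official tooling:

```bash
check_patch() {
  local patch="$1" max_lines="${2:-300}"
  # Example whitelist from the PLAN above; replace with your round's list
  local whitelist='^(server/middleware/auth\.ts|server/routes/[^/]+\.ts|tests/auth\.test\.ts)$'

  # Every file the patch touches ("+++ b/<path>" headers) must be whitelisted
  grep -E '^\+\+\+ ' "$patch" | sed -E 's#^\+\+\+ (b/)?##' | grep -v '^/dev/null' |
  while IFS= read -r f; do
    echo "$f" | grep -Eq "$whitelist" || { echo "OUT OF SCOPE: $f"; exit 1; }
  done || return 1

  # Count added/removed lines, excluding the +++/--- file headers
  local changed
  changed=$(grep -E '^[+-]' "$patch" | grep -Evc '^(\+\+\+|---)' || true)
  [ "$changed" -le "$max_lines" ] || { echo "TOO LARGE: $changed > $max_lines lines"; return 1; }
  echo "OK: $changed changed lines, all files whitelisted"
}
```

Run it as `check_patch "$ROUND/0001.patch"` right before `git apply --3way`; a non-zero exit means the round violated its own PLAN and should be archived, not applied.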
⸻
7. Cold-Start Iteration vs. Long-Chat "Patching"
Failure Determination = DoD not achieved (not "feels wrong")
After Failure:
- Archive the full six-file set (PLAN, prompt, patch, logs, commit fingerprint, eval report)
- Create new timestamp directory for round two PLAN
- Write "failure reason → correction hypothesis → change points" into notes
- Whitelist/blacklist and DoD principles remain unchanged (changes require explicit reasoning)
- Run short prompt again: clarify first, then provide diff
- Continue until "one-shot success"
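The failure path above can be sketched as a small helper. Everything here is an assumption for illustration: `cold_start` is a hypothetical function, and the `Result:`/`Reason:` field names are placeholders for whatever your `EVAL_REPORT.md` template uses:

```bash
cold_start() {
  local round="$1" task="$2" reason="$3"
  # Record the debrief in the failed round's archive
  {
    echo "Result: FAIL"
    echo "Reason: $reason"
  } >> "$round/EVAL_REPORT.md"
  # Tag the failed attempt per the naming convention (no-op outside a git repo)
  git tag "cc-fail-${task}-$(basename "$round" | cut -d_ -f1,2)" 2>/dev/null || true
  # Cold start: fresh timestamp directory; carry the PLAN forward for revision
  local ts2 round2
  ts2=$(date +"%Y%m%d_%H%M%S")
  round2="CLAUDE_CODE_LOG/${ts2}__${task}"
  mkdir -p "$round2"
  cp "$round/PLAN.md" "$round2/PLAN.md"
  echo "$round2"
}
```

Usage might look like `NEXT=$(cold_start "$ROUND" "$TASK" "missing context")`, after which you edit `$NEXT/PLAN.md` with the correction hypothesis before re-running the short prompt.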
⸻
8. Quality Metrics and Dashboard
Record in each round's `EVAL_REPORT.md`:
- First-pass rate (round-1 pass %)
- Change line count (median/distribution)
- Clarification question count (correlation with failure rate)
- Rollback rate (revert count)
- Failure reason classification: unclear requirements/missing context/interface conflicts/improper test assertions/out-of-scope changes/dependency issues...
Use these metrics to refine your templates and rules: Do you need finer whitelists? A stricter DoD? Is `repo_map.md` missing key entry points?
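Collecting the simplest of these metrics is a one-function job. A sketch, assuming each `EVAL_REPORT.md` contains a `Result: PASS` or `Result: FAIL` line (the field name is an assumption); this computes the per-round pass rate, a rough proxy for the true first-pass rate, which additionally requires grouping rounds by task:

```bash
pass_rate() {
  local total=0 passed=0 f
  for f in CLAUDE_CODE_LOG/*/EVAL_REPORT.md; do
    [ -e "$f" ] || continue          # glob matched nothing
    total=$((total + 1))
    if grep -q '^Result: PASS' "$f"; then passed=$((passed + 1)); fi
  done
  if [ "$total" -gt 0 ]; then
    echo "pass rate: ${passed}/${total}"
  else
    echo "no rounds recorded yet"
  fi
}
```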
⸻
9. Team-Level Asset Development (Playbook)
Create `CLAUDE_CODE_PLAYBOOK/` in the root directory:
- `PLAN.template.md`: unified skeleton mapping the seven-layer prompt stack
- `PROMPT.recipes.md`: the four patterns for implementation/refactoring/bugs/documentation
- `BOUNDS.checklist.md`: whitelist/blacklist examples and line-count limits
- `EVAL.metrics.md`: metric definitions and collection methods
- `RISK.cases.md`: real pitfalls and rollback strategies
For the next similar task, only three things change: the objective, the whitelist, and the DoD. Everything else is reusable.
⸻
10. CI/Review Integration
CI Mandatory Validation:
```bash
git apply --check CLAUDE_CODE_LOG/**/0001.patch
npm test   # or: pytest -q / go test ./...
```
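The patch check can be generalized into a small CI gate. A sketch; `ci_check_patches`, the loop, and the paths are assumptions, not a standard tool:

```bash
ci_check_patches() {
  local p
  for p in CLAUDE_CODE_LOG/*/0001.patch; do
    [ -e "$p" ] || continue          # glob matched nothing
    # Each archived patch must still apply cleanly to the current tree
    git apply --check "$p" || { echo "stale patch: $p"; return 1; }
  done
  echo "all archived patches still apply"
}
```

Note that `git apply --check` verifies applicability without touching files, so the gate is safe to run on every PR.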
PR template attaches the round's `PLAN.md` and `EVAL_REPORT.md`, and asks:
- Are task objectives/DoD clear?
- Is diff within whitelist/under limits?
- Do tests and scripts cover critical paths?
⸻
11. Anti-Patterns Warning
Red Flags (immediate stop):
- Back-and-forth modifications in same long conversation: context drift, instruction dilution, rising hallucinations
- Having model "scan entire repository for entry points": high risk + low certainty
- One-time massive changes: difficult rollback, difficult troubleshooting
- No DoD: inconsistent acceptance criteria, non-reproducible
- Moving source code into archives: breaks paths/build chains, creates lasting problems
⸻
Template: Ready-to-Use PLAN.template.md
```markdown
# Task Title (≤60 characters)

## 1. Task Objective / DoD (must be verifiable)
- Acceptance commands:
  - `npm i && npm test`
  - `[ "$(curl -s -o /dev/null -w '%{http_code}' http://localhost:3000/v2/ping)" = 200 ]`
- Change boundaries:
  - Only allow: <list files or globs>
  - Forbidden: infra/, migrations/, .env*
  - Single round changes ≤300 lines (stop and ask if exceeding)

## 2. Execution Steps (MVP)
1) <minimal implementation point>
2) <integration/registration location>
3) <add unit tests or scripts>
4) Run local tests and record logs

## 3. Notes
- **Output only unified diff (patch)**; new files use `--- /dev/null`
- Keep exported APIs unchanged (explicit notice if changing)
- ESLint/Prettier; TS explicit types

## 4. Additional Information
- Dependencies: <name@ver>
- Contract: <protocol/headers/error codes>

## 5. Risks & Rollback
- Risk: <potential failure points>
- Rollback: revert current patch, interface unchanged

## 6. Execution Agreement (for Claude)
- If uncertain, first provide ≤3 clarification questions
- Then output patch + ≤120-character summary

## 7. Round Summary (fill after execution)
- Success/failure:
- Failure reason classification:
- Next round correction plan:
```
⸻
Conclusion: Making Models "Rule-Following Pair Programmers"
This article deliberately combines narrative with numbered checklists: the first half explains why this path is more stable, while the second half spells out every step to an immediately executable level. Experience posts gave us direction; processes, templates, scripts, and metrics turn that direction into reproducible productivity.
Starting now, try fitting your next small requirement into this pipeline. Let Claude Code perform within narrow but clear rails—clarify first, then deliver, only patches, strict acceptance. You'll quickly feel: the codebase remains under your command, while the model finally becomes that disciplined, reliable partner.
Key Takeaways:
- Transform AI conversations from chaotic to systematic
- Use documentation to drive development instead of improvisation
- Implement verifiable success criteria with full traceability
- Contain model uncertainty within structured processes
- Build team-wide reproducible workflows for AI-assisted development