Production Architecture · Claude Code

Claude Agent OS

A file-based orchestration layer that turns Claude Code into a persistent, context-aware, skill-driven operator. No vector stores. No custom models. No external databases.

The context window is the runtime. The filesystem is the database. Skills are the API layer. The decision log is the audit trail.

7 Components

3 Skill Patterns

2 Memory Tiers

0 Infra Required

∞ Skills Possible

Executive Summary

The Claude Agent OS uses no external databases, no vector stores, and no custom model fine-tuning. Persistence, context injection, behavioral configuration, and skill routing are implemented via structured markdown files, a YAML-fronted skill registry, and Claude Code's native MCP integration layer.

The entire OS runs on a local filesystem with git for version control. All intelligence emerges from prompt construction — context files, behavioral rules, and skill instructions combine at session load time to produce a highly specialized agent without any model modifications.

Design Philosophy

Intentionally low-infrastructure. No ops. No pipelines. No infra bill. The complexity budget is spent on content quality and workflow design, not tooling.

System Components

01

CLAUDE.md

Master context loader. Entry point for every session.

02

Context Files

Identity and state layer. Slow-changing facts about the operator.

03

Behavioral Rules

Runtime configuration. Operator-specific failure mode overrides.

04

Skill Layer

Workflow automation via trigger-activated markdown documents.

05

Memory Architecture

Two-tier persistence: version-controlled + private session learnings.

06

Decision Log

Append-only audit trail of architectural decisions.

07

Scheduling Layer

Time-based skill invocation without an active session.

Architecture

System diagrams and data flows for the Claude Agent OS. All diagrams reflect the production implementation.

System Component Map

How the major subsystems relate at runtime. Five environment groupings: Runtime, Triggers, Skill Execution, External Integrations, and Persistence.

graph TD
  subgraph RUNTIME["RUNTIME - Claude Code Session"]
    CM[CLAUDE.md\nMaster context loader]
    CF[Context Files\nme · work · priorities · goals]
    BR[Behavioral Rules\n.claude/rules/*.md]
    SK[Skill Router\n.claude/skills/*/SKILL.md]
    MEM[Memory Layer\n~/.claude/projects/.../memory/]
  end
  subgraph TRIGGERS["TRIGGERS"]
    UI[User Input]
    RT[Remote Triggers\nScheduled routines]
    MCP_IN[MCP Tool Calls\nCalendar · Drive · Notion · Browser]
  end
  subgraph SKILLS["SKILL EXECUTION"]
    ORCH[Orchestrating Skill\nMain thread]
    SA[Subagents\nAgent tool - parallel]
    TOOL[Tool Calls\nBash · Read · Write · Edit]
  end
  subgraph EXTERNAL["EXTERNAL INTEGRATIONS"]
    MCPS[MCP Servers\nGoogle · Notion · Desktop Commander]
    SCRIPTS[Automation Scripts\nPython · Bash in tools/]
    APIS[External APIs\nYouTube Data · Stripe]
  end
  subgraph PERSISTENCE["PERSISTENCE"]
    DL[decisions/log.md\nAppend-only ledger]
    GIT[Git Repository\nVersion-controlled OS state]
    MEMF[Memory Files\nPrivate · Not version-controlled]
    ARC[archives/\nRetired artifacts]
  end
  UI --> CM
  RT --> SK
  MCP_IN --> RUNTIME
  CM --> CF & BR & SK
  MEM --> CF
  SK --> ORCH
  ORCH --> SA & TOOL
  SA --> TOOL
  TOOL --> SCRIPTS
  SCRIPTS --> APIS
  TOOL --> MCPS
  ORCH --> DL
  ORCH --> MEMF
  TOOL --> GIT

Context Injection Architecture

How state reaches Claude's context window at session start. No RAG. No embeddings. Two-layer guaranteed injection: CLAUDE.md (always loaded) + hooks (deterministic lifecycle commands).

sequenceDiagram
  participant HK as .claude/settings.json (hooks)
  participant CC as Claude Code
  participant CM as CLAUDE.md
  participant CF as context/ files
  participant MEM as memory/MEMORY.md
  participant SK as .claude/skills/
  HK->>CC: SessionStart hook fires - inject critical context (guaranteed)
  CC->>CM: Load session (always)
  CM->>CF: @-reference context files
  CF-->>CC: Inject: identity · work · priorities · goals
  CC->>MEM: Load memory index (always, if configured)
  MEM-->>CC: Inject: cross-session learnings
  Note over CC: Session context window now contains full OS state
  CC->>SK: On /skill trigger, load matching SKILL.md
  SK-->>CC: Inject: workflow instructions
  Note over HK,CC: PreCompact hook fires before any compaction - re-injects critical context

Key Design Decision

Flat file injection over vector retrieval. For a single-operator system with a bounded context surface, full injection is more reliable than retrieval. Retrieval introduces precision/recall trade-offs; injection is deterministic. The trade-off is context window consumption — mitigated by keeping context files concise and MEMORY.md as a pointer index rather than full content store.

CLAUDE.md loads on every session, but its effect is probabilistic rather than guaranteed: in very long sessions or under compaction, critical context can drift. Hooks are the guarantee layer: shell commands wired to Claude Code lifecycle events in .claude/settings.json. SessionStart fires at every session open and injects your current priorities unconditionally. PreCompact fires before any compaction event and re-injects critical context so it survives the summary. CLAUDE.md is a hint. Hooks are a guarantee.

Lifecycle Automation Hooks

Shell commands wired to Claude Code lifecycle events in .claude/settings.json. Hooks run outside the model's context window — deterministic OS-level automation, not AI instructions.

Hook event	Fires when	Production use
`SessionStart`	Session opens	Inject `current-priorities.md` unconditionally — guaranteed context every session regardless of CLAUDE.md drift
`PreCompact`	Before context compaction	Re-inject priorities; prompt operator to run `/close-session` before the compaction summary is written
`PostToolUse(Skill)`	After every Skill tool call	Auto-append timestamped stub to `skills/run-log.md` — closes the dropped-Learn gap for programmatic skill invocations
`UserPromptSubmit`	Every user message sent	Parse for leading `/command` → append run-log stub for typed slash skills (`PostToolUse` fires on Skill-tool calls only, not typed `/commands`)
`SessionEnd`	Session closes	If uncommitted or unpushed work exists, auto-commit a WIP snapshot and push — backstop behind `/close-session`. No-ops on clean state.

.claude/settings.json — hooks

{
  "hooks": {
    "SessionStart": [
      { "hooks": [{ "type": "command", "command": "python3 .claude/hooks/session-start.py", "timeout": 10 }] }
    ],
    "PreCompact": [
      { "hooks": [{ "type": "command", "command": "python3 .claude/hooks/pre-compact.py", "timeout": 15 }] }
    ],
    "SessionEnd": [
      { "hooks": [{ "type": "command", "command": "bash .claude/hooks/session-end-snapshot.sh", "timeout": 60 }] }
    ],
    "PostToolUse": [
      { "matcher": "Skill", "hooks": [{ "type": "command", "command": "python3 .claude/hooks/log-skill-run.py", "timeout": 10 }] }
    ],
    "UserPromptSubmit": [
      { "hooks": [{ "type": "command", "command": "python3 .claude/hooks/log-slash-skill.py", "timeout": 10 }] }
    ]
  }
}
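A minimal sketch of what `.claude/hooks/session-start.py` might look like. It assumes that Claude Code adds a SessionStart hook's standard output to the session context and that priorities live at `context/current-priorities.md`. The function name `build_injection` and the header text are illustrative, not taken from the production script.

```python
#!/usr/bin/env python3
"""Hypothetical SessionStart hook: print current priorities to stdout."""
from pathlib import Path

# Assumed location, taken from the context file inventory; adjust as needed.
PRIORITIES = Path("context/current-priorities.md")


def build_injection(path: Path) -> str:
    """Return the text to inject, or an empty string if the file is missing."""
    if not path.is_file():
        return ""
    body = path.read_text(encoding="utf-8").strip()
    return f"## Current priorities (injected by SessionStart hook)\n{body}\n"


if __name__ == "__main__":
    # Assumption: whatever this script prints is added to the session context.
    print(build_injection(PRIORITIES), end="")
```

Returning an empty string for a missing file keeps the hook from failing a session just because the priorities file has been moved.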

Why two hooks for run-log?

PostToolUse(Skill) fires when Claude Code dispatches a skill as an internal tool call. UserPromptSubmit fires on every user message — including /commands that are inline-expanded rather than dispatched as tool calls. The two hooks are disjoint: together they guarantee every skill invocation reaches the log regardless of invocation path.
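The typed-command path can be sketched as follows. This assumes the UserPromptSubmit hook receives a JSON payload on stdin with a `prompt` field; the stub format, the regex, and the helper names are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical UserPromptSubmit hook: log typed /commands to the run log."""
import json
import re
import sys
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

RUN_LOG = Path("skills/run-log.md")  # path named in the hook table
# A leading /name followed by whitespace or end of input; "/usr/bin" is ignored.
SLASH = re.compile(r"^\s*/([a-z0-9][a-z0-9-]*)(?=\s|$)")


def extract_slash_command(prompt: str) -> Optional[str]:
    """Return the leading /command name, or None for ordinary messages."""
    match = SLASH.match(prompt)
    return match.group(1) if match else None


def format_stub(skill: str, when: datetime) -> str:
    """One run-log line; the exact stub format is an assumption."""
    return f"- {when.strftime('%Y-%m-%dT%H:%M:%SZ')} /{skill} (typed)\n"


def main() -> None:
    raw = sys.stdin.read()
    try:
        payload = json.loads(raw) if raw.strip() else {}
    except json.JSONDecodeError:
        payload = {}
    # Assumption: the payload is an object with a "prompt" string.
    prompt = str(payload.get("prompt", "")) if isinstance(payload, dict) else ""
    skill = extract_slash_command(prompt)
    if skill:
        RUN_LOG.parent.mkdir(parents=True, exist_ok=True)
        with RUN_LOG.open("a", encoding="utf-8") as log:
            log.write(format_stub(skill, datetime.now(timezone.utc)))

# Call main() under the usual __main__ guard when installing this as a hook.
```

Because the hook fires on every message, the parser returns None for anything that does not start with a slash command, so ordinary prompts leave the log untouched.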

Full Production Session

End-to-end data flow for a content creation session using the /youtube-pipeline skill.

Session Start → Load CLAUDE.md → Inject context files → Load MEMORY.md index

User sends command → Load SKILL.md into context → Parse mode from input

Dispatch 3 parallel subagents: Research + SEO · Script draft · Metadata

Aggregate outputs → Approval gate (↩ Redirect → Parse mode; Approved → continue)

Phase 2: Asset plan → Approval gate → Approved

Write outputs → Run filler remover → Run assembler → Upload to API

Log decisions → Update memory → Session End

Weekly OS Audit (Scheduled)

Fires every Friday at 4am CT via RemoteTrigger. No human input required.

Remote Trigger (Friday 4am CT) → Session starts → Load os-verify skill

Module 1: Skill audit → Deprecated skills found?
Yes → Flag issue → skip to Module 3
No → Module 2

Module 2: Context drift → Check Last updated dates → Context file stale?
Yes → Flag issue → skip to Module 3
No → Read last 10 decisions → Platform reference scan → Module 3

Module 3: Memory hygiene → Module 4: Folder cleanup → Module 5: Write health report → Session ends

Components

Detailed specification for each of the seven core components of the Agent OS. Each component is independent and composable.

Component 01 · CLAUDE.md · Always loaded

Role

The entry point for every session. Sets operational scope, loads referenced files via @ syntax, declares the active project inventory, and establishes system rules. Every line in CLAUDE.md is injected into every session — treat it as precious real estate.

Size constraint

Keep CLAUDE.md under 200 lines. Move detailed instructions to skill files or project-specific projects/[name]/CLAUDE.md files. Every line costs context window on every session.

Implementation Pattern

CLAUDE.md

# CLAUDE.md

## Context Files
- @context/me.md
- @context/work.md
- @context/current-priorities.md

## Active Projects
| Project | Description |
|---------|-------------|
| `projects/[name]/` | [Status and description] |

**Hygiene rule:** Projects enter the table when they have a folder + active work.
Projects exit only when moved to archives/ AND logged in decisions/log.md.

The @ Reference Mechanism

The @ reference syntax in Claude Code causes the referenced file's full content to be injected into the session context at load time. It is a literal include, not a link.

File size matters

Each context file should be under ~500 lines. Above that, consider splitting or summarizing. The entire referenced file is injected, unconditionally, on every session load.
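The two size limits above can be checked mechanically. The sketch below is one possible approach, assuming references appear as `@path/to/file.md` in CLAUDE.md; the regex and the function names are illustrative and not part of Claude Code.

```python
import re
from pathlib import Path

CLAUDE_MD_LIMIT = 200  # lines, from the CLAUDE.md size constraint
CONTEXT_LIMIT = 500    # lines per referenced file (approximate guideline)
# An @ that is not part of an e-mail address, followed by a .md path.
REF = re.compile(r"(?<![\w@])@([\w./-]+\.md)\b")


def find_references(claude_md: str) -> list[str]:
    """Return paths referenced with @ syntax, in order of appearance."""
    return REF.findall(claude_md)


def over_budget(root: Path) -> list[str]:
    """List warnings for CLAUDE.md and any referenced file over its limit."""
    text = (root / "CLAUDE.md").read_text(encoding="utf-8")
    warnings = []
    if len(text.splitlines()) > CLAUDE_MD_LIMIT:
        warnings.append(f"CLAUDE.md exceeds {CLAUDE_MD_LIMIT} lines")
    for ref in find_references(text):
        target = root / ref
        if not target.is_file():
            warnings.append(f"{ref} is referenced but missing")
        elif len(target.read_text(encoding="utf-8").splitlines()) > CONTEXT_LIMIT:
            warnings.append(f"{ref} exceeds {CONTEXT_LIMIT} lines")
    return warnings
```

Running `over_budget(Path("."))` from the workspace root returns an empty list when everything fits. Line counts are only a proxy for context window cost, so treat the limits as guidelines.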

Active Projects Table as a Service Registry

The Active Projects table functions as a service registry — it tells Claude what workstreams exist, where they live, and what their current status is. Without it, Claude has to infer project existence from file exploration, which is slower and less reliable.

Component 02 · Context Files · Slow-changing

Role

Static facts about the operator, the business, and current priorities. These are the "long-term memory" of the system, version-controlled in git. They load on every session via @ references in CLAUDE.md.

File Inventory

File	Content	Update Frequency
`context/me.md`	Identity, working style, constraints, failure modes	Rarely — only when circumstances change
`context/work.md`	Business model, products, platforms, revenue	Monthly or when major changes occur
`context/current-priorities.md`	Active sprint, ordered tasks, backlog	Every session or weekly
`context/goals.md`	Quarterly objectives and milestones	Quarterly
`context/brand-story.md`	Core narrative, avatar, positioning	Rarely — locked early

Status Taxonomy in current-priorities.md

Status symbols

✅  Completed — done, ship it, move on
- [ ]  Pending — active, in queue
⏸  Parked — deliberately deferred, not forgotten

Staleness Detection

The os-verify skill includes an age check — it greps Last updated: from each context file, compares to the current date, and flags any file older than 28 days. Implemented as a plain bash command inside the skill, not a scheduled job.
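The production check is a bash command inside the skill. For illustration only, the same age test can be sketched in Python, assuming the stamp is written as `Last updated: YYYY-MM-DD` (the date format is an assumption; the helper names are hypothetical).

```python
import re
from datetime import date
from typing import Optional

MAX_AGE_DAYS = 28
STAMP = re.compile(r"Last updated:\s*(\d{4}-\d{2}-\d{2})")


def age_in_days(text: str, today: date) -> Optional[int]:
    """Days since the file's 'Last updated:' stamp, or None if unreadable."""
    match = STAMP.search(text)
    if match is None:
        return None
    try:
        stamped = date.fromisoformat(match.group(1))
    except ValueError:  # e.g. month 13
        return None
    return (today - stamped).days


def is_stale(text: str, today: date) -> bool:
    """A file with no usable stamp is treated as stale so it gets reviewed."""
    age = age_in_days(text, today)
    return age is None or age > MAX_AGE_DAYS
```

Treating a missing stamp as stale is a design choice in this sketch: a context file that nobody has dated is at least as suspicious as one that is a month old.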

Component 03 · Behavioral Rules · Auto-loaded

Role

.claude/rules/*.md files encode operator-specific behavioral constraints. These load automatically as part of the Claude Code workspace configuration. Rules are written in the second person, addressed to Claude, and describe both what to do and why.

The "Why" Matters

Including the reason allows Claude to apply the rule to edge cases rather than following it mechanically. Generic rules ("be helpful," "be concise") add no value. Effective rules encode operator-specific failure modes.

Pattern

.claude/rules/execution-mode.md

# execution-mode.md

[Operator's documented failure mode — e.g., "Default pattern:
research > refine > optimize > never ship."]

## Rules of Engagement

1. **Bias toward shipping.** "Good enough" beats "perfect but
   unpublished" every time.
2. **One review pass.** When reviewing content — do ONE thorough
   pass. Don't suggest a second round unless something is
   genuinely broken.
3. **Don't feed the research spiral.** If the operator asks to
   "look into" something tangential to top priorities, flag it
   and suggest parking it.
...

Design Principle

The more specific the rule, the more leverage it provides. Rules should encode operator-specific failure modes and preferences that deviate from Claude's defaults — not restate behaviors Claude already exhibits.

Component 04 · Skill Layer · Trigger-activated

Role

Skills are reusable, trigger-activated workflow specifications. Each skill is a structured markdown document that instructs Claude how to execute a specific multi-step process. No code. No execution environment beyond Claude Code.

Skill File Schema

.claude/skills/[name]/SKILL.md

---
name: skill-name              # Machine-readable identifier
description: >                # Used by Claude to determine when to invoke
  [Detailed trigger conditions and purpose]
trigger: /skill-name          # The /command that activates this skill
status: active                # active | deprecated | experimental
---

# Skill Title

## What It Does
[Plain English summary]

## Mode Selection (optional)
[Parse user input to determine execution path]

## Steps
### Phase 1 — [Name]
[Specific instructions — what to read, generate, where to write]

**Gate:** Present output to user. Wait for approval before proceeding.

### Phase 2 — [Name]
...

## Output
[Where output lands and in what format]

Execution Patterns

A

Orchestrator + Subagents

For content-heavy workflows requiring parallel generation

Parallel

The skill's main thread collects input and dispatches parallel subagents via the Agent tool for content generation. The main thread aggregates results, presents for approval, then proceeds.

sequenceDiagram
  participant User
  participant Main as Main Thread (Skill)
  participant SA1 as Subagent: Research
  participant SA2 as Subagent: Draft
  participant SA3 as Subagent: Metadata
  User->>Main: /skill-name "topic"
  Main->>SA1: Research topic + demand signals
  Main->>SA2: Draft content (parallel)
  Main->>SA3: Generate metadata (parallel)
  SA1-->>Main: Research output
  SA2-->>Main: Draft output
  SA3-->>Main: Metadata output
  Main->>User: Phase 1 - aggregated output for review
  User->>Main: Approved / redirect
  Main->>Main: Phase 2 - refine and finalize
  Main->>User: Final output + write files

B

Sequential Tool Execution

For automation workflows — no subagent dispatch

Sequential

The skill drives a sequence of tool calls — file reads, bash commands, API calls — with no subagent dispatch. Used for mechanical workflows like video processing pipelines.

sequenceDiagram
  participant User
  participant Skill
  participant Bash
  participant API
  User->>Skill: /publish-video "path/to/video.mp4"
  Skill->>Bash: Run filler-remover.py on SRT
  Bash-->>Skill: Clean SRT output
  Skill->>Bash: Run davinci-assembler.py
  Bash-->>Skill: Timeline config written
  Skill->>User: Review metadata - approve?
  User->>Skill: Approved
  Skill->>API: youtube-uploader.py (OAuth, resumable upload)
  API-->>Skill: Video ID + URL
  Skill->>Skill: Write to decisions/log.md
  Skill-->>User: Published - [URL]

C

Decision Support + Accountability

For high-stakes choices requiring multi-perspective analysis with behavioral guardrails

Multi-agent

Spawns 5 parallel subagents — each a distinct thinking style, not a persona. They analyze independently, peer-review each other anonymously, then a chairman synthesizes the verdict. Three behavioral rules prevent the council from becoming an enabler:

No Rescue — If an idea is off-priority or avoidance-driven, the council shuts it down instead of finding ways to make it work
Pattern Recognition — Scans memory for prior behavioral patterns (shiny object syndrome, analysis paralysis, scope creep) and alerts all advisors
Accountability Turn — At least one advisor turns a question back on the user: "Why are you asking this instead of finishing [priority]?"

Calling conventions control scope: Team (all 5 + peer review), Consultants (3 advisors, no peer review), The Guys (2 advisors), Quick Council (all 5, skip peer review), or any individual name. Used for the /llm-council skill.

graph LR
  Input[Decision Input] --> PS{Pattern Scan\nMemory + Decisions}
  PS --> A1[Contrarian]
  PS --> A2[First Principles]
  PS --> A3[Expansionist]
  PS --> A4[Outsider]
  PS --> A5[Executor]
  A1 & A2 & A3 & A4 & A5 --> PR[Anonymous Peer Review]
  PR --> SYN[Chairman Synthesis]
  SYN --> NR{No Rescue\nCheck}
  NR -->|Idea is valid| Output[Verdict + Next Step]
  NR -->|Avoidance detected| Kill[Hard NO\n+ Pattern Named]

Lifecycle Management

File paths

Active skill:      .claude/skills/[name]/SKILL.md   # status: active
Deprecated skill:  archives/[YYYY-MM-DD]-skill-[name]-deprecated/

Deprecation Rule

Move deprecated skills to archives/ immediately. A deprecated skill left in .claude/skills/ will eventually be invoked accidentally. The os-verify skill checks for skills with status: deprecated still in the active directory.
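A rough sketch of the check described above, scanning an active skills directory for `status: deprecated`. It reads only the `status:` line of the frontmatter, and the function names are hypothetical. The actual os-verify skill is a markdown workflow and may do this differently.

```python
import re
from pathlib import Path

STATUS = re.compile(r"^status:\s*([A-Za-z]+)", re.MULTILINE)


def frontmatter(text: str) -> str:
    """Return the text between the leading '---' fences, or '' if none."""
    parts = text.split("---", 2)
    return parts[1] if text.startswith("---") and len(parts) == 3 else ""


def skill_status(text: str) -> str:
    """Value of the status field, or 'unknown' when it cannot be found."""
    match = STATUS.search(frontmatter(text))
    return match.group(1) if match else "unknown"


def deprecated_in_active_dir(skills_root: Path) -> list[str]:
    """Names of skill folders whose SKILL.md still says status: deprecated."""
    return sorted(
        path.parent.name
        for path in skills_root.glob("*/SKILL.md")
        if skill_status(path.read_text(encoding="utf-8")) == "deprecated"
    )
```

Calling `deprecated_in_active_dir(Path(".claude/skills"))` should return an empty list in a healthy workspace. Any name it returns is a skill that still needs to be moved to archives/.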

Component 05 · Memory Architecture · Two-tier

Two-Tier Persistence Model

Tier 1 · context/*.md

Version-Controlled Facts

Tracked in git
Stable, slow-changing facts
Shared identity — what's true about the business
Products, strategy, brand, current sprint

Tier 2 · ~/.claude/projects/[path]/memory/

Private Session Learnings

NOT version-controlled (intentional)
Dynamic, session-accumulated
Private to the operator
Platform quirks, anti-patterns, key people, incidents

Memory File Schema

memory/[name].md

---
name: [memory name]
description: [one-line hook — used to determine relevance at load time]
type: user | feedback | project | reference
---

[Memory content]

**Why:** [The reason this matters]
**How to apply:** [When this kicks in]

MEMORY.md — The Pointer Index

memory/MEMORY.md is loaded into every session context. It contains only one-line pointers to individual memory files — not the memory content itself. This keeps the index small while allowing full memory to be read on demand.

memory/MEMORY.md

# Memory Index

## Feedback
- [Anti-pattern: never batch publish](feedback_no_batch_publish.md) — violated 2026-04-08, burned
- [Platform: YouTube API scope requirements](feedback_youtube_api.md) — force-ssl required for comments

## Projects
- [Project: Run AI product](project_run_ai.md) — $147, live on Systeme.io 2026-05-04
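Because the index is plain markdown links, tooling can read it without loading the memory files themselves. A minimal sketch, assuming each pointer is a `- [title](file.md)` bullet optionally followed by a separator and a one-line note; the `Pointer` type and `parse_index` are illustrative names.

```python
import re
from typing import NamedTuple

# "- [title](file)" plus an optional "<separator> note" tail.
POINTER = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<file>[^)]+)\)(?:\s+\S+\s+(?P<hook>.+))?$"
)


class Pointer(NamedTuple):
    section: str
    title: str
    file: str
    hook: str


def parse_index(text: str) -> list[Pointer]:
    """Collect pointers, tagging each with the '## ' section it sits under."""
    pointers, section = [], ""
    for line in text.splitlines():
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = POINTER.match(line.strip())
        if match:
            pointers.append(
                Pointer(section, match["title"], match["file"], match["hook"] or "")
            )
    return pointers
```

An audit step could use the `file` values to confirm that every pointer resolves to a file on disk, which keeps the index honest without inflating it.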

Multi-Machine Access

On macOS, ~/.claude/ can be synced via iCloud Drive (symlink ~/.claude to ~/Library/Mobile Documents/com~apple~CloudDocs/.claude) to make memory available across machines without making it public.

Privacy Rationale

Memory files often contain sensitive operational details — API quirks, day-job context, financial specifics, interpersonal notes. Keeping them in ~/.claude/projects/ ensures they never get pushed to a remote, even accidentally. Trade-off: no memory backup by default. Mitigation: copy to a private encrypted backup on a schedule.

Component 06 · Decision Log · Append-only

Role

An immutable audit trail of architectural decisions. Every meaningful choice — platform selection, product decisions, process changes, brand decisions — is recorded here. Prevents decision re-litigation.

Format

decisions/log.md

[YYYY-MM-DD] DECISION: [What was decided]
            | REASONING: [Why]
            | CONTEXT: [Where it affects the system]

Implementation Constraints

1

Append-only

No edits to past entries, ever. History is immutable by convention.

2

Tracked in git

Full history via git log -p decisions/log.md.

3

Not pushed to public remotes

May contain financial specifics, product decisions, competitive strategy.

4

Read by os-verify

The audit skill greps the last N decisions and cross-checks against context files for contradictions.
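Reading the last N decisions can be sketched as a small parser over the format above. This is an illustration rather than the skill's actual implementation; it assumes wrapped lines belong to the field that precedes them, and the function names are hypothetical.

```python
import re

ENTRY_START = re.compile(r"^\[(\d{4}-\d{2}-\d{2})\] DECISION:\s*(.*)$")
FIELD = re.compile(r"^\|\s*(REASONING|CONTEXT):\s*(.*)$")


def parse_log(text: str) -> list[dict]:
    """Parse decisions/log.md text into dicts, oldest entry first."""
    entries, current, field = [], None, "decision"
    for raw in text.splitlines():
        line = raw.strip()
        start = ENTRY_START.match(line)
        if start:
            current = {"date": start.group(1), "decision": start.group(2),
                       "reasoning": "", "context": ""}
            entries.append(current)
            field = "decision"
            continue
        if current is None or not line:
            continue  # skip text before the first dated entry
        named = FIELD.match(line)
        if named:
            field = named.group(1).lower()
            current[field] = named.group(2)
        else:
            # Continuation of a wrapped line: append to the current field.
            current[field] = f"{current[field]} {line}".strip()
    return entries


def last_decisions(text: str, n: int = 10) -> list[dict]:
    """Return the n most recent entries (the log is append-only, so the tail)."""
    return parse_log(text)[-n:] if n > 0 else []
```

Since the log is append-only, "most recent" is simply the tail of the file, so no sorting by date is needed.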

Value

In a single-operator setup, the decision log replaces the "why did we do it this way?" conversation that would otherwise happen with a team. When an AI agent questions a past choice, the log provides full reasoning in context.

Component 07 · Scheduling Layer · Autonomous

Role

Time-based skill invocation without requiring an active session. Claude Code's RemoteTrigger fires a skill on a schedule, spins up a Claude Code session, executes the skill, and terminates.

Production Scheduled Triggers

Trigger	Schedule	Skill	Purpose
Weekly OS audit	Friday 4am CT	`/os-verify`	Catch context drift, stale skills, deprecated platform references

os-verify Audit Modules

1

Skill Audit

List all skills in .claude/skills/. Check status: field — flag deprecated skills still in active directory. Check trigger references in CLAUDE.md for orphaned entries.

2

Context Drift Audit

grep Last updated: from all context/*.md. Flag files >28 days. Read last 10 decisions. Cross-check against context for contradictions. Platform reference scan: grep for retired platforms.

3

Memory Hygiene

Read MEMORY.md index. Flag memory files referencing archived/retired projects. Flag memories >90 days old with no recent corroboration.

4

Folder Cleanup

Check projects/ for folders not in CLAUDE.md Active Projects table. Check archives/ for items missing a corresponding decision log entry.

5

Write Health Report

Output: projects/[strategist]/workspace/os-health-[date].md. System score (0-100), flags by category, no-action-needed summary.
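As one concrete instance of these modules, the first folder cleanup check (Module 4) reduces to a set difference. The sketch below assumes projects are registered in CLAUDE.md as backticked `projects/[name]/` table cells, as in the Active Projects pattern; the function names are illustrative.

```python
import re
from pathlib import Path

# Matches `projects/<name>/` inside backticks, as in the Active Projects table.
REGISTERED = re.compile(r"`projects/([^/`\s]+)/?`")


def registered_projects(claude_md: str) -> set[str]:
    """Project names listed in the Active Projects table."""
    return set(REGISTERED.findall(claude_md))


def unregistered_folders(root: Path) -> list[str]:
    """Folders under projects/ that CLAUDE.md does not mention."""
    projects = root / "projects"
    on_disk = (
        {p.name for p in projects.iterdir() if p.is_dir()}
        if projects.is_dir()
        else set()
    )
    text = (root / "CLAUDE.md").read_text(encoding="utf-8")
    return sorted(on_disk - registered_projects(text))
```

The reverse difference (registered but missing on disk) is an equally useful flag and follows the same pattern.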

API Reference

Complete schema definitions, format specifications, and configuration reference for all Agent OS components.

Skill File Schema

YAML frontmatter required fields. File location: .claude/skills/[name]/SKILL.md

name · string · required

Machine-readable identifier. Used for internal routing and logging. Kebab-case recommended.

description · string · required

Multi-line description of trigger conditions and purpose. Claude reads this to determine when to auto-invoke the skill. The more specific, the more reliable the routing.

trigger · string · required

The /command that activates this skill. Must start with /. Unique across all skills in the workspace.

status · enum · required

Lifecycle state of the skill. One of: active | deprecated | experimental. Deprecated skills must be moved to archives/ immediately — do not leave them in .claude/skills/.

Memory File Schema

YAML frontmatter required fields. Files live in ~/.claude/projects/[path]/memory/

name · string · required

Human-readable name for the memory. Used for display in the MEMORY.md index.

description · string · required

One-line hook used to determine relevance when loading context. This is the key used for retrieval decisions. Make it specific and actionable.

type · enum · required

Memory category. One of: user (operator preferences, role, style) | feedback (corrections, validated approaches) | project (active workstream context) | reference (pointers to external systems).

MCP Server Configuration

File: .mcp.json in workspace root. Loaded by Claude Code on session start.

.mcp.json

{
  "mcpServers": {
    "google-calendar": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-google-calendar"],
      "env": { "GOOGLE_OAUTH_TOKEN": "..." }
    },
    "notion": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-notion"],
      "env": { "NOTION_TOKEN": "..." }
    }
  }
}

Credential Warning

Never embed live credentials directly in .mcp.json if the file is version-controlled. Use a secrets manager or environment variables for team deployments. For solo use: ensure .mcp.json is in .gitignore.

MCP Selection Hierarchy

Follow in order. Never fall through silently from a higher tier to a lower tier when a tool errors.

Priority	Tool	Use When
1	Dedicated MCP for the target app	App has its own MCP server — fastest and most precise
2	Browser automation MCP (Claude in Chrome)	Web app without a dedicated MCP
3	Desktop Commander	Native desktop apps and cross-app workflows
4	Computer use	Last resort for anything else
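The "never fall through silently" rule can be expressed as: pick the highest applicable tier once, and surface any failure instead of retrying on a lower tier. The sketch below models that in plain Python. The tier labels mirror the table, while `select_tier`, `run_with_tier`, and `ToolTierError` are hypothetical names, not Claude Code APIs.

```python
from typing import Callable, Sequence

# Priority order from the table above, highest first.
TIERS = (
    "dedicated MCP",
    "browser automation MCP",
    "Desktop Commander",
    "computer use",
)


class ToolTierError(RuntimeError):
    """Raised when the selected tier fails; the caller decides what is next."""


def select_tier(available: Sequence[str]) -> str:
    """Pick the highest-priority tier that applies to the target app."""
    for tier in TIERS:
        if tier in available:
            return tier
    raise LookupError("no tool tier is available for this target")


def run_with_tier(available: Sequence[str], action: Callable[[str], str]) -> str:
    """Run the action on the selected tier, reporting errors explicitly."""
    tier = select_tier(available)
    try:
        return action(tier)
    except Exception as exc:
        # Surface the failure instead of quietly retrying on a lower tier.
        raise ToolTierError(f"{tier} failed: {exc}") from exc
```

Dropping to a lower tier then becomes an explicit decision by the operator or the skill, not a side effect of an error.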

Decision Log Format

decisions/log.md

[YYYY-MM-DD] DECISION: [What was decided]
            | REASONING: [Why — include the alternatives considered]
            | CONTEXT: [Which files or components this affects]

# Examples:
[2026-05-01] DECISION: Use file-based context over vector RAG
            | REASONING: Deterministic retrieval, no infrastructure,
              debuggable — for bounded single-operator context, full
              injection is more reliable than retrieval
            | CONTEXT: context/*.md, CLAUDE.md, Memory architecture
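Writing entries programmatically might look like the sketch below, which formats one entry in the layout above and opens the file in append mode only. The 12-space indent mirrors the example, and the function names are illustrative.

```python
from datetime import date
from pathlib import Path

INDENT = " " * 12  # aligns the | fields under the entry, as in the example


def format_entry(when: date, decision: str, reasoning: str, context: str) -> str:
    """Render one decision in the three-line log format."""
    return (
        f"[{when.isoformat()}] DECISION: {decision}\n"
        f"{INDENT}| REASONING: {reasoning}\n"
        f"{INDENT}| CONTEXT: {context}\n"
    )


def append_entry(log_path: Path, entry: str) -> None:
    """Open in append mode only, so earlier entries are never rewritten."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as log:
        log.write(entry)
```

Append mode enforces the append-only convention at the write path. It does not stop manual edits, which is why the git history remains the real record.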

Credential Storage Pattern

OAuth credential layout (YouTube example)

~/.config/fiti-youtube/
├── client_secrets.json   # OAuth app credentials — NOT in repo
└── token.json            # Cached access + refresh token (auto-refreshes)

# Git credential helper (never embed tokens in remote URLs)
git config --local credential.helper osxkeychain
git remote set-url origin https://github.com/[user]/[repo].git

.gitignore Minimums

.gitignore

*.env
client_secrets.json
token.json
# Ignore .mcp.json if it contains inline credentials
.mcp.json
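A quick way to confirm the minimums are present is a literal-match check like the sketch below. It does not evaluate gitignore semantics such as negation or directory scoping, and the function names are hypothetical. Git treats only lines that begin with `#` as comments, so a trailing `# note` on a pattern line becomes part of the pattern.

```python
REQUIRED_PATTERNS = ("*.env", "client_secrets.json", "token.json")


def active_patterns(gitignore: str) -> set[str]:
    """Non-comment, non-blank lines, with trailing whitespace removed."""
    patterns = set()
    for line in gitignore.splitlines():
        entry = line.rstrip()
        if entry and not entry.startswith("#"):
            patterns.add(entry)
    return patterns


def missing_patterns(gitignore: str) -> list[str]:
    """Required patterns that do not appear verbatim in the file."""
    present = active_patterns(gitignore)
    return [pattern for pattern in REQUIRED_PATTERNS if pattern not in present]
```

For a definitive answer about a specific file, `git check-ignore -v <path>` reports which rule, if any, applies.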

Guides

Runbooks and how-to guides for common operations. Each guide is a self-contained procedure.

Session Startup Sequence

1

Claude Code loads CLAUDE.md

Injects all @context files — identity, work, priorities, goals.

2

MEMORY.md index loaded

Cross-session context and learnings become available.

3

Operator issues command or /skill-name

For skill invocations, the matching SKILL.md is loaded into context.

4

SKILL.md loaded → workflow begins

The skill's instructions guide execution from this point.

Adding a New Skill

1

Create the skill file

Create .claude/skills/[name]/SKILL.md with the full schema. Set status: active in frontmatter.

2

Register in CLAUDE.md

Add the skill to the Active Skills table with its trigger and purpose.

3

Dry run test

Test with a dry run before relying on the skill in production. Verify outputs land in the correct locations.

4

Log the addition

Append to decisions/log.md — what the skill does and why it was added.

When to Build a Skill

Skills should emerge from repeated workflows. The threshold: if you've done it manually three times and found yourself wishing it were automated, build the skill. A skill built for a hypothetical workflow becomes a maintenance burden.

Adding a New MCP Server

1

Add server config to .mcp.json

Follow the MCP server configuration format in the API Reference. Store credentials outside the file or use env vars.

2

Restart Claude Code

MCP connections are loaded at session start. A restart is required to reload .mcp.json.

3

Test tool availability

Verify with a simple call to one of the server's tools before building workflows that depend on it.

4

Update CLAUDE.md

Add to the Connected Tools section if the MCP materially changes Claude's behavior or capabilities.

Project Lifecycle

Adding a Project

1

Create folder

Create projects/[name]/ with a README.md describing the project.

2

Register

Add to CLAUDE.md Active Projects table.

3

Log

Append entry to decisions/log.md.

Retiring a Project

1

Move to archives

Move projects/[name]/ to archives/[YYYY-MM-DD]-[name]/. Never delete.

2

Remove from registry

Remove from CLAUDE.md Active Projects table.

3

Log

Append retirement decision to decisions/log.md.

4

Update memory

Update any memory files that reference the project.
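The move in step 1 can be sketched as a small helper that builds the dated archive path and refuses to overwrite anything. The registry, log, and memory steps remain manual here, and the function names are illustrative.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_path(root: Path, name: str, when: date) -> Path:
    """archives/[YYYY-MM-DD]-[name] under the workspace root."""
    return root / "archives" / f"{when.isoformat()}-{name}"


def retire_project(root: Path, name: str, when: date) -> Path:
    """Move projects/<name> into archives/; never delete, never overwrite."""
    source = root / "projects" / name
    target = archive_path(root, name, when)
    if not source.is_dir():
        raise FileNotFoundError(f"no such project folder: {source}")
    if target.exists():
        raise FileExistsError(f"archive already exists: {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

Failing when the target exists keeps two retirements on the same day from silently merging into one folder.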

Recovering from Stale Context

1

Run /os-verify

Or equivalent audit skill. This generates a health report identifying all flags.

2

Review health report flags

Treat the health report as a production incident list — each flag is a potential source of bad outputs.

3

Update stale context files

Update each context file identified as stale. Update the Last updated: timestamp.

4

Cross-check decision log

Read recent decisions/log.md entries. Verify each is reflected in the relevant context files.

5

Log the recovery

Append a recovery entry to decisions/log.md noting what was stale and what was fixed.

Scaling to a Team

The single-operator model scales to small teams with these modifications:

Component	Single Operator	Team
Memory	`~/.claude/projects/[path]/memory/` per machine	Separate memory paths per team member. No sharing.
Decision log	One file, one contributor	Same pattern — git blame provides attribution
Context files	Single identity layer	Shared identity (work.md, team.md) + per-member context
Skills	Shared skill library	Shared `.claude/skills/` in repo; member-specific skills in personal forks
CLAUDE.md	Single master file	Core CLAUDE.md + project-specific CLAUDE.md in `projects/[name]/`

Replacing File Storage with a Knowledge Base

For teams needing search across a larger document corpus:

1

Keep core identity files as direct injection

me.md, work.md — these are small and always needed.

2

Move large reference documents to a searchable store

Notion, Confluence, or a vector DB for large document sets.

3

Connect via MCP server

The MCP server provides the query interface.

4

Update CLAUDE.md instructions

Replace @context/large-doc.md with an instruction to query the MCP on demand.

Trade-off

Loses deterministic loading; gains scalability for large document sets. Introduces precision/recall trade-offs that the file-based system avoids entirely.

FAQ & Trade-offs

Design decisions, anti-patterns, security considerations, and observability model for the Agent OS.

Design Decisions

Decision	Alternative Considered	Rationale
File-based context over vector RAG	Pinecone/Weaviate + embeddings	Deterministic retrieval, no infrastructure, debuggable — for bounded single-operator context, full injection is more reliable than retrieval
Markdown over structured config	JSON skill definitions	Claude reads and generates markdown natively; markdown skills are self-documenting and editable without tooling
Append-only decision log	Database with CRUD	Immutability is a feature — decisions are permanent records; git history provides full diff
Memory outside version control	Memory in repo	Prevents accidental exposure of sensitive operational context to public remotes
Monorepo for all projects	Separate repos per project	Single context load covers entire operation; Claude doesn't need to switch repos for cross-project dependencies
Skills as text documents	Code-defined workflows (LangGraph, etc.)	No framework dependency; skills are readable, editable, and debuggable by anyone who can read markdown
Phase-gate approval pattern	Fully autonomous execution	For content + business decisions, human approval at phase boundaries prevents compounding errors
Hook-driven automation over manual logging	Relying on /close-session to capture run-log	PostToolUse + UserPromptSubmit hooks fire unconditionally — operator cannot forget to log. SessionEnd snapshot is a backstop for dropped close-session calls. Automation beats discipline for high-frequency operations.

Anti-Patterns

1. Storing frequently-changing state in context files

Context files are for slow-changing facts. If something changes every session, it should be in current-priorities.md at most — and if it changes multiple times per session, it should be written by Claude to a project-specific file during the skill, not loaded at session start.

2. Building skills speculatively

Skills should emerge from repeated workflows. A skill built for a hypothetical workflow becomes a maintenance burden. The threshold: if you've done it manually three times and found yourself wishing it were automated, build the skill.

3. Making CLAUDE.md too long

Every line in CLAUDE.md is injected into every session. Keep it under 200 lines. Move detailed instructions to skill files or project-specific CLAUDE.md files in projects/[name]/CLAUDE.md.

4. Ignoring the os-verify report

The audit exists to catch drift before it causes bad outputs. A skill flagged for stale references, or a context file contradicting a logged decision, will produce incorrect behavior until fixed. Treat the health report as a production incident list.

5. Sharing memory files across operators

Memory files encode individual behavioral patterns — including failure modes, preferences, and sensitive operational context. They are deliberately not version-controlled. Sharing them across operators produces a contaminated context that reflects no one accurately.

6. Embedding credentials in any tracked file

.mcp.json, .env, client_secrets.json — if it contains a token, it does not go in the repo. The operational cost of a credential rotation after an accidental commit is always higher than the inconvenience of using a credential helper from the start.
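One way to catch this before it reaches a remote is to scan the tracked file list for the names above. This is a sketch under stated assumptions: the name list comes from this section, the helper names are hypothetical, and matching on file names alone will not catch tokens pasted into other files.

```python
import subprocess
from pathlib import PurePosixPath

# File names called out in the text above; extend for your own setup.
SENSITIVE_NAMES = {".mcp.json", ".env", "client_secrets.json"}


def find_sensitive(paths: list[str]) -> list[str]:
    """Return the paths whose file name is on the sensitive list."""
    return [p for p in paths if PurePosixPath(p).name in SENSITIVE_NAMES]


def tracked_files() -> list[str]:
    """List files tracked by git in the current repository.

    Uses NUL-separated output so unusual file names are not quoted.
    """
    out = subprocess.run(
        ["git", "ls-files", "-z"], capture_output=True, text=True, check=True
    ).stdout
    return [p for p in out.split("\0") if p]
```

Calling `find_sensitive(tracked_files())` from a pre-commit hook and failing the commit on any hit would turn the rule into a guard rather than a convention.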

Security Considerations

Credential Management

API tokens belong in ~/.config/[service]/, never in the repo. Git credentials go in a credential helper (osxkeychain on macOS), never embedded in remote URLs; the common mistake is https://user:token@github.com/... stored in .git/config. For MCP env vars in team deployments, use a secrets manager.

Decisions Log Sensitivity

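decisions/log.md is version-controlled but should not be pushed to public remotes: it may contain financial specifics, product decisions, and competitive strategy. If the repo is public, add the file to .gitignore, and run git rm --cached decisions/log.md if it is already tracked, because .gitignore has no effect on tracked files. If the log needs to stay tracked, use a private repo.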

Memory Privacy

~/.claude/projects/[path]/memory/ is not version-controlled by design. For team deployments: each operator must have a separate memory path. Never share memory files across operators.

Observability

No metrics dashboard by default. Observability is achieved through structured file outputs and the weekly audit.

Signal	Source	How to Read It
System health	`os-verify` report	`projects/[strategist]/workspace/os-health-[date].md` — generated weekly
Decision history	`decisions/log.md`	`git log -p decisions/log.md` for full diff history
Skill usage	`skills/run-log.md` (auto)	Appended automatically by `PostToolUse(Skill)` + `UserPromptSubmit` hooks on every invocation
WIP backup	`SessionEnd` hook	Auto-commits + pushes uncommitted work on session close; inspectable via `git log --oneline`
Memory drift	`os-verify` Module 3	Flags stale or contradicted memory entries
Context staleness	`os-verify` Module 2	Flags context files >28 days since last update
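The staleness rule in the last row can be sketched as a simple date comparison. How os-verify actually derives a file's last-update date is not specified here, so the example takes the dates as input (they might come from git history or file modification times); the function name is hypothetical.

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=28)  # threshold from the table above


def stale_files(last_updated: dict[str, date], today: date) -> list[str]:
    """Return names of files last updated more than 28 days before `today`."""
    return sorted(
        name
        for name, updated in last_updated.items()
        if today - updated > STALE_AFTER
    )
```

A file updated exactly 28 days ago is not flagged under this reading of ">28 days"; only strictly older files are.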

Team Observability Upgrade

Add a structured output step to os-verify that writes a JSON summary alongside the markdown report — parseable by a dashboard or alerting system without modifying the core OS.
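As an illustration of what that JSON summary might look like, here is a minimal sketch. The schema (date, finding_count, findings) and the `Finding` fields are assumptions made for the example, not a format defined by os-verify.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    module: str   # which audit module raised it, e.g. "Module 2"
    target: str   # file the finding refers to
    message: str  # human-readable description


def build_summary(report_date: str, findings: list[Finding]) -> str:
    """Serialize audit findings to a JSON string (hypothetical schema)."""
    summary = {
        "date": report_date,
        "finding_count": len(findings),
        "findings": [asdict(f) for f in findings],
    }
    return json.dumps(summary, indent=2, sort_keys=True)
```

Writing this string next to the markdown report keeps the human-readable output unchanged while giving a dashboard or alerting job a stable file to parse.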

Common Questions

Why not use a vector database for context retrieval?

For a single-operator system with a bounded context surface (a handful of context files, a skill registry, a memory index), full injection is more reliable than retrieval. Vector retrieval introduces precision/recall trade-offs — you might retrieve the wrong chunks, miss relevant ones, or get stale embeddings. File injection is deterministic: what's in the file is what Claude sees, every time. The trade-off is context window consumption, which is mitigated by keeping files concise.

Why are memory files excluded from version control?

Memory files often contain sensitive operational details — API quirks, day-job context, financial specifics, interpersonal notes about key people. Keeping them in ~/.claude/projects/ rather than the repo ensures they never get pushed to a remote, even accidentally. This is a deliberate design decision with one trade-off: no automatic backup. Mitigation: copy the memory directory to a private encrypted backup on a schedule.
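The archive step of that mitigation could look like the sketch below. It only produces a timestamped tarball; encryption and scheduling are left to whatever tooling you already use (for example an encrypted volume and cron), and the function name and archive naming scheme are assumptions.

```python
import tarfile
from pathlib import Path


def backup_memory(src: Path, dest_dir: Path, stamp: str) -> Path:
    """Archive the memory directory `src` into dest_dir/<name>-<stamp>.tar.gz.

    The archive is NOT encrypted; store it somewhere that is.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{src.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the directory name
        tar.add(src, arcname=src.name)
    return archive
```

Passing the date as `stamp` (for instance `date.today().strftime("%Y%m%d")`) gives one archive per day and makes it easy to prune old copies.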

Why skills as markdown instead of code (LangGraph, AutoGen, etc.)?

No framework dependency. Skills are readable, editable, and debuggable by anyone who can read markdown — no Python environment, no framework version pinning, no execution environment beyond Claude Code. When a skill needs updating, you open a text file and edit it. When a skill breaks, you read it and see immediately what the instructions say. Framework-based workflows require understanding the framework before you can debug the logic. Markdown skills keep the complexity in the content, not the infrastructure.

How does this scale to a larger team?

The core architecture scales with three modifications: (1) Each team member gets a separate memory path — never share memory files across operators. (2) Context files split into shared identity (work.md, team.md) and per-member context. (3) CLAUDE.md splits into a shared core file plus project-specific CLAUDE.md files. Skills and the decision log work unchanged — git blame provides attribution on the log, and the shared skill library in the repo benefits everyone. See the Scaling to a Team guide for the full modification table.

What's the difference between context files and memory files?

Context files (context/*.md) contain slow-changing facts about the business — products, strategy, current sprint, brand positioning. They're version-controlled in git, shared, and appropriate for information that defines the operation. Memory files (~/.claude/projects/.../memory/) contain dynamic, session-accumulated learnings — behavioral patterns, platform quirks, anti-patterns discovered through experience, notes about key people. They're private, not version-controlled, and appropriate for information that reflects individual operational history. If it changes quarterly, it's context. If it changes through experience, it's memory.