Architecture Reference

Design Patterns for
Autonomous AI Agents

A practitioner's guide to agent architecture. State machines, tool orchestration, memory systems, and human-in-the-loop escalation — from a real production agent.

The Agent Loop

Every autonomous agent follows the same fundamental loop. The differences are in what happens inside each phase and how the agent handles failure.

OBSERVE → ORIENT → DECIDE → ACT → MEASURE
   ↑                                  |
   └──────────────────────────────────┘

OBSERVE: Read state, check environment, gather data
ORIENT:  Interpret observations against goals and constraints
DECIDE:  Select action from available tools/strategies
ACT:     Execute the chosen action
MEASURE: Check result, update state, log metrics

This is an OODA loop adapted for AI agents. The key insight is that the filesystem is memory — state persists between invocations by writing to disk. The agent does not rely on in-context memory alone.
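As a sketch, the loop above might look like the following in Python. The `state.json` path and the toy counter goal are placeholders; a real agent would observe its actual environment, and the point is only that state round-trips through disk, not context:

```python
import json
from pathlib import Path

STATE = Path("state.json")  # hypothetical state file; all memory persists here

def observe() -> dict:
    # Read persisted state from disk; start fresh if none exists.
    return json.loads(STATE.read_text()) if STATE.exists() else {"iteration": 0}

def orient(state: dict) -> dict:
    # Interpret observations against the goal (here: a toy target of 3 iterations).
    state["gap"] = 3 - state["iteration"]
    return state

def decide(state: dict) -> str:
    # Select an action based on the oriented state.
    return "work" if state["gap"] > 0 else "stop"

def act(state: dict, action: str) -> dict:
    if action == "work":
        state["iteration"] += 1
    return state

def measure(state: dict) -> None:
    # Write state back so the next invocation picks up where this one left off.
    STATE.write_text(json.dumps(state))

def run_once() -> str:
    state = orient(observe())
    action = decide(state)
    measure(act(state, action))
    return action

for _ in range(4):
    last = run_once()
```

Each call to `run_once` is a full invocation: it rehydrates from disk, acts, and persists, so the loop survives process restarts.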

Core Design Patterns

Pattern 1: State Machine with Phase Transitions

Define discrete phases the agent moves through, with clear entry/exit criteria. This prevents the agent from thrashing between tasks.

## Phase Transitions
RECON → RESEARCH → STRATEGY → BUILD → DEPLOY → MEASURE → EVOLVE
  ↑                                                          |
  └──────────────────────────────────────────────────────────┘

Entry criteria for each phase:
- RECON: Agent starts here. Scan environment, discover capabilities.
- RESEARCH: RECON complete. Market analysis, opportunity identification.
- STRATEGY: Research complete. Rank opportunities, allocate resources.
- BUILD: Strategy chosen. Execute highest-priority task.
- DEPLOY: Build complete. Ship to production.
- MEASURE: Deployed. Check metrics, revenue, user signals.
- EVOLVE: Measurement complete. Update strategy, re-rank, self-modify.
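One way to encode these phases is an `Enum` plus an explicit transition map, sketched below in Python (the names mirror the diagram; the `exit_criteria_met` flag stands in for whatever real checks each phase defines):

```python
from enum import Enum, auto

class Phase(Enum):
    RECON = auto()
    RESEARCH = auto()
    STRATEGY = auto()
    BUILD = auto()
    DEPLOY = auto()
    MEASURE = auto()
    EVOLVE = auto()

# Legal transitions: the linear pipeline plus the EVOLVE → RECON loop-back.
TRANSITIONS = {
    Phase.RECON: Phase.RESEARCH,
    Phase.RESEARCH: Phase.STRATEGY,
    Phase.STRATEGY: Phase.BUILD,
    Phase.BUILD: Phase.DEPLOY,
    Phase.DEPLOY: Phase.MEASURE,
    Phase.MEASURE: Phase.EVOLVE,
    Phase.EVOLVE: Phase.RECON,
}

def advance(current: Phase, exit_criteria_met: bool) -> Phase:
    # Only move on when the current phase's exit criteria are satisfied;
    # otherwise stay put rather than thrash between tasks.
    return TRANSITIONS[current] if exit_criteria_met else current
```

Because every phase has exactly one successor, the agent can never skip ahead or wander sideways; the only decision is "are my exit criteria met?"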

Pattern 2: Filesystem-as-Memory

Use the filesystem as the agent's persistent memory. Each file serves a specific memory function. The agent reads its own state files at the start of each invocation.

state/
├── log.md              # Append-only event log (what happened)
├── strategy.md         # Current strategy and ranked priorities
├── metrics.md          # Quantitative measurements
├── roadmap.md          # Ordered task queue
├── blockers.md         # What's preventing progress
└── self_assessment.md  # Agent's evaluation of its own performance
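A minimal Python sketch of working with this layout, assuming the `state/` tree above (`log_event` and `load_memory` are illustrative helpers, not an API from the original agent):

```python
from datetime import datetime, timezone
from pathlib import Path

STATE_DIR = Path("state")  # directory layout from the tree above

def log_event(message: str) -> None:
    # Append-only: never rewrite history, only add to it.
    STATE_DIR.mkdir(exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with (STATE_DIR / "log.md").open("a") as f:
        f.write(f"- {stamp} {message}\n")

def load_memory() -> dict:
    # Read every state file at the start of an invocation; a missing file
    # simply means the agent has no memory of that kind yet.
    return {p.stem: p.read_text() for p in STATE_DIR.glob("*.md")}

log_event("deployed landing page v2")
memory = load_memory()
```

Loading everything up front keeps the invocation stateless: the files, not the process, are the source of truth.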

Pattern 3: Tool Orchestration with Fallbacks

Define a tool hierarchy where the agent tries the best tool first, falls back to alternatives, and logs tool failures for future optimization.

Tool Selection Algorithm:
1. Identify the task type (search, create, deploy, measure)
2. Select primary tool for task type
3. If primary fails → try fallback tool
4. If all tools fail → log failure, add to blockers, escalate if critical
5. Record tool success/failure rates in metrics

Example tool chain for "deploy website":
  Primary:   Vercel CLI (vercel deploy)
  Fallback:  GitHub Pages via git push
  Fallback:  Netlify CLI (netlify deploy)
  Escalate:  Write to human_actions_needed.md
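The selection algorithm above can be sketched as a fallback chain in Python. The `vercel` and `gh_pages` functions here are stand-ins that simulate tool outcomes, not real CLI wrappers:

```python
def run_with_fallbacks(chain, blockers):
    # Try each (name, tool) pair in order; record failures so future runs
    # can re-rank tools by observed success rate.
    failures = []
    for name, tool in chain:
        try:
            return tool(), failures
        except Exception as exc:
            failures.append((name, str(exc)))
    # All tools failed: escalate by adding to the blockers queue.
    blockers.append(f"task blocked after {len(failures)} tool failures")
    return None, failures

def vercel():
    raise RuntimeError("auth token expired")  # simulated primary-tool failure

def gh_pages():
    return "deployed via GitHub Pages"

blockers = []
result, failures = run_with_fallbacks(
    [("vercel", vercel), ("gh-pages", gh_pages)], blockers
)
```

The failure list doubles as the raw data for step 5 of the algorithm: over time, a tool that keeps failing can be demoted in the chain.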

Pattern 4: Human-in-the-Loop (HITL) Escalation

Define a queue for actions that require human intervention. Each item specifies what, why, time estimate, and priority. The agent continues with other tasks while waiting.

## HITL Queue Protocol

Each item in the queue must specify:
- WHAT:     Exact steps (copy-pasteable commands)
- WHY:      What it unblocks
- TIME:     Estimated minutes
- PRIORITY: BLOCKING (cannot proceed) or NICE-TO-HAVE

Rules:
- HITL minutes per week MUST converge to zero over time
- Agent must be able to make progress on OTHER tasks while blocked
- Never queue something the agent can do itself
- Track cumulative HITL minutes at the top of the file
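One possible Python sketch of the queue protocol. The `HITLItem` fields mirror the WHAT/WHY/TIME/PRIORITY spec above; the exact rendering format is an assumption:

```python
from dataclasses import dataclass

@dataclass
class HITLItem:
    what: str       # exact, copy-pasteable steps
    why: str        # what it unblocks
    minutes: int    # estimated human time
    blocking: bool  # BLOCKING vs NICE-TO-HAVE

def render_queue(items):
    # Track cumulative minutes at the top of the file, per the protocol,
    # and list BLOCKING items before NICE-TO-HAVE ones.
    total = sum(i.minutes for i in items)
    lines = [f"Cumulative HITL minutes queued: {total}", ""]
    for i in sorted(items, key=lambda i: not i.blocking):
        tag = "BLOCKING" if i.blocking else "NICE-TO-HAVE"
        lines.append(f"[{tag}] {i.what} ({i.minutes} min) -- unblocks: {i.why}")
    return "\n".join(lines)

queue = [
    HITLItem("Approve the custom domain", "branding", 5, False),
    HITLItem("Run `vercel login` and paste the token", "deploys", 3, True),
]
text = render_queue(queue)
```

Keeping the cumulative total in the rendered output makes the "HITL minutes must converge to zero" rule directly measurable week over week.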

Pattern 5: Self-Evolution

The agent modifies its own instructions based on what it learns. Every N iterations, it re-reads all state, evaluates what worked, and updates its strategy document.

Self-Evolution Protocol (every ~5 iterations):
1. Read all state files
2. Compute metrics delta since last evolution
3. Identify: what worked? what failed? what's stale?
4. Update strategy.md with new rankings
5. Update CLAUDE.md (the agent's own governance file) if needed
6. Prune completed items from roadmap
7. Log the evolution event
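A minimal sketch of the trigger and the metrics-delta step in Python. The re-ranking heuristic shown (sort opportunities by metric delta) is a simplification of steps 3 and 4, not the original agent's logic:

```python
EVOLVE_EVERY = 5  # "every ~5 iterations"

def maybe_evolve(iteration, metrics, last_metrics):
    # Only evolve on the schedule; otherwise keep executing the current strategy.
    if iteration % EVOLVE_EVERY != 0:
        return None
    # Step 2: compute the metrics delta since the last evolution.
    delta = {k: metrics[k] - last_metrics.get(k, 0) for k in metrics}
    # Steps 3-4, simplified: whatever moved the metrics most gets priority.
    ranking = sorted(delta, key=delta.get, reverse=True)
    return {"delta": delta, "new_ranking": ranking}

report = maybe_evolve(
    10,
    {"revenue": 120, "visits": 90},
    {"revenue": 100, "visits": 95},
)
```

Gating on `iteration % EVOLVE_EVERY` keeps evolution cheap: most invocations skip it entirely and just execute.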

Pattern 6: Portfolio Diversification

Don't put all compute budget into one strategy. Maintain 2-3 concurrent workstreams at different risk levels.

Portfolio Structure:
- ANCHOR:   High-probability, low-reward. The floor.
- GROWTH:   Medium-probability, medium-reward. Compounds over time.
- MOONSHOT: Low-probability, high-reward. Worth a small allocation.

Allocation Rule:
  60% of compute → ANCHOR (survival)
  30% of compute → GROWTH (compounding)
  10% of compute → MOONSHOT (optionality)
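The allocation rule reduces to a few lines of Python. The rounding policy (give any remainder to ANCHOR) is an assumption, chosen to be consistent with "survival" coming first:

```python
ALLOCATION = {"ANCHOR": 0.60, "GROWTH": 0.30, "MOONSHOT": 0.10}

def allocate(total_iterations):
    # Split the compute budget per the 60/30/10 rule; integer truncation
    # leaves a remainder, which goes to ANCHOR (survival first).
    budget = {k: int(total_iterations * v) for k, v in ALLOCATION.items()}
    budget["ANCHOR"] += total_iterations - sum(budget.values())
    return budget

plan = allocate(21)
```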

Architecture Comparison

Approach             Complexity   Autonomy         Best For
Single-shot prompt   Low          None             One-off tasks, code generation
ReAct loop           Medium       Single session   Research, debugging, multi-step tasks
State machine agent  High         Multi-session    Ongoing projects, autonomous operations
Multi-agent swarm    Very high    Multi-session    Complex systems, parallel workstreams


Want production-grade agent architecture?

The Autonomous Agent Architect prompt generates complete agent systems with state machines, tool chains, memory schemas, error recovery, and self-evolution — based on a real production autonomous agent.

Get Agent Architect — $9.99