Core Architecture
Claude Code's architecture consists of three primary components that work together to create an effective AI-powered CLI:
Terminal UI (React + Ink)
The UI layer uses React with Ink, a React renderer for the terminal, to deliver rich interactions beyond standard CLI capabilities:
- Interactive permission prompts for secure tool execution
- Syntax-highlighted code snippets for better readability
- Real-time status updates during tool operations
- Markdown rendering directly within the terminal environment
React hooks provide clean state management, enabling complex interactive experiences while maintaining a terminal-based interface.
Intelligence Layer (Claude API)
The intelligence layer connects with Claude through a streaming API interface:
- Parses responses to identify intended tool executions
- Extracts parameters from natural language instructions
- Validates input using Zod schemas to ensure correctness
- Handles errors gracefully when Claude provides invalid instructions
Communication flows in both directions: Claude triggers tool execution, and structured results stream back into the conversation context.
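The parse-then-validate step described above can be sketched without any dependencies. This is an illustration only: the block shapes, `validateGrepInput`, and `planToolCalls` are hypothetical names, and the hand-written validator stands in for a Zod schema so that the example runs on its own.

```typescript
// Hypothetical content-block shapes for an assistant message.
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ContentBlock = TextBlock | ToolUseBlock;

// A validator returns either the parsed input or an error message.
type Validation<T> = { ok: true; value: T } | { ok: false; error: string };
type GrepInput = { pattern: string };

// Stand-in for a Zod schema: accepts only objects with a string `pattern`.
function validateGrepInput(input: unknown): Validation<GrepInput> {
  if (typeof input === "object" && input !== null && "pattern" in input) {
    const pattern = (input as { pattern: unknown }).pattern;
    if (typeof pattern === "string") return { ok: true, value: { pattern } };
  }
  return { ok: false, error: "expected { pattern: string }" };
}

// Pick the tool_use blocks out of a message's content.
function extractToolUses(content: ContentBlock[]): ToolUseBlock[] {
  return content.filter((b): b is ToolUseBlock => b.type === "tool_use");
}

// Invalid input becomes a rejection entry instead of an exception.
function planToolCalls(content: ContentBlock[]): string[] {
  return extractToolUses(content).map((use) => {
    const checked = validateGrepInput(use.input);
    return checked.ok
      ? `${use.id}: run ${use.name} with pattern "${checked.value.pattern}"`
      : `${use.id}: rejected (${checked.error})`;
  });
}

const plan = planToolCalls([
  { type: "text", text: "Searching..." },
  { type: "tool_use", id: "t1", name: "Grep", input: { pattern: "TODO" } },
  { type: "tool_use", id: "t2", name: "Grep", input: { pattern: 42 } },
]);
```

Here the second request is rejected rather than executed, which is the kind of graceful failure the list above refers to. A real system would report the rejection back to the model as a tool result.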
Tools Layer
Each tool in the system follows a consistent pattern:
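import { z } from "zod";
// Tool is the shared interface every tool implements
const ExampleTool = {
  name: "example",
  description: "Does something useful",
  // Zod schema describing the expected input
  schema: z.object({ param: z.string() }),
  isReadOnly: () => true,
  needsPermissions: (input) => true,
  async *call(input) {
    // Execute and yield results
  }
} satisfies Tool;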
This approach creates a plugin architecture where developers can add new capabilities by implementing the Tool interface. Available tools are dynamically loaded and presented to Claude, establishing an extensible capability framework.
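As a rough sketch of how such a plugin registry might look, the following uses a simplified, hypothetical `Tool` interface in which a `parse` method stands in for the Zod schema. `ToolRegistry`, `EchoTool`, and `describe` are illustrative names, not taken from the Claude Code source.

```typescript
// Hypothetical, simplified Tool interface; `parse` stands in for a Zod schema.
interface Tool<I = unknown> {
  name: string;
  description: string;
  parse(input: unknown): I; // throws on invalid input
  isReadOnly(): boolean;
  call(input: I): AsyncGenerator<string>;
}

class ToolRegistry {
  private tools = new Map<string, Tool>();

  // Adding a capability is just registering another Tool implementation.
  register(tool: Tool): void {
    if (this.tools.has(tool.name)) throw new Error(`Duplicate tool: ${tool.name}`);
    this.tools.set(tool.name, tool);
  }

  find(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  // The summary that would be presented to the model.
  describe(): { name: string; description: string }[] {
    return Array.from(this.tools.values()).map(({ name, description }) => ({
      name,
      description,
    }));
  }
}

const EchoTool: Tool<{ text: string }> = {
  name: "Echo",
  description: "Yields the given text back",
  parse(input) {
    const text = (input as { text?: unknown } | null)?.text;
    if (typeof text !== "string") throw new Error("expected { text: string }");
    return { text };
  },
  isReadOnly: () => true,
  async *call(input) {
    yield input.text;
  },
};

const registry = new ToolRegistry();
registry.register(EchoTool);
```

The registry never needs to know what a tool does. It only relies on the shared interface, which is what makes the set of capabilities extensible.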
Reactive Command Loop
At its core, Claude Code operates through a reactive command loop: it processes user input through Claude, executes the resulting actions, and streams outcomes to the display in real time.
The fundamental pattern powering this flow uses generators:
// Core pattern enabling streaming UI
async function* query(input: string): AsyncGenerator<Message> {
  // Show user's message immediately
  yield createUserMessage(input);
  // Stream AI response as it arrives
  for await (const chunk of aiStream) {
    yield chunk;
    // Process tool use requests
    if (detectToolUse(chunk)) {
      // Execute tools and yield results
      for await (const result of executeTool(chunk)) {
        yield result;
      }
      // Continue conversation with tool results
      yield* continueWithToolResults(chunk);
    }
  }
}
This recursive generator approach keeps Claude Code responsive during complex operations. Rather than freezing while waiting for operations to complete, the UI updates continuously with real-time progress.
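To make the control flow concrete, here is a small runnable toy version of the loop. Everything in it is a stand-in: `fakeModel` and `fakeTool` are hypothetical replacements for the streaming API and a real tool, and the `Message` shape is simplified for illustration.

```typescript
type Message =
  | { role: "user"; text: string }
  | { role: "assistant"; text: string; toolUse?: string }
  | { role: "tool"; text: string };

// Fake model: requests a tool first, then answers once it sees a tool result.
async function* fakeModel(history: Message[]): AsyncGenerator<Message> {
  const sawToolResult = history.some((m) => m.role === "tool");
  if (sawToolResult) {
    yield { role: "assistant", text: "Found 2 files." };
  } else {
    yield { role: "assistant", text: "Let me look.", toolUse: "ls" };
  }
}

// Fake tool: yields a single canned result.
async function* fakeTool(name: string): AsyncGenerator<Message> {
  yield { role: "tool", text: `${name}: a.ts b.ts` };
}

async function* query(history: Message[]): AsyncGenerator<Message> {
  for await (const chunk of fakeModel(history)) {
    yield chunk; // the UI can render this immediately
    if (chunk.role === "assistant" && chunk.toolUse) {
      const results: Message[] = [];
      for await (const result of fakeTool(chunk.toolUse)) {
        results.push(result);
        yield result;
      }
      // Recurse with the extended history so the model sees the tool output
      yield* query([...history, chunk, ...results]);
    }
  }
}

// Collect one full turn as printable lines.
async function runTurn(input: string): Promise<string[]> {
  const seen = [`user: ${input}`];
  for await (const m of query([{ role: "user", text: input }])) {
    seen.push(`${m.role}: ${m.text}`);
  }
  return seen;
}
```

Calling `runTurn("list files")` produces the user line, the assistant's tool request, the tool result, and the follow-up answer in that order. Each item is available to the consumer as soon as it is yielded.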
Query Implementation Details
The complete query function handles all aspects of the conversation flow:
async function* query(
  input: string | null,
  context: QueryContext
): AsyncGenerator<Message> {
  // Conversation so far; recursive calls pass it along in context.messages
  const messages = [...context.messages];
  // Process user input (null on recursive continuation calls)
  if (input !== null) {
    const userMessage = createUserMessage(input);
    messages.push(userMessage);
    yield userMessage;
  }
  // Get streaming AI response
  const aiResponseGenerator = querySonnet(
    normalizeMessagesForAPI(messages),
    systemPrompt,
    context.maxTokens,
    context.tools,
    context.abortSignal,
    { dangerouslySkipPermissions: false }
  );
  // Stream response chunks
  for await (const chunk of aiResponseGenerator) {
    yield chunk;
    // Handle tool use requests
    if (chunk.message.content.some(c => c.type === 'tool_use')) {
      const toolUses = extractToolUses(chunk.message.content);
      // Execute tools (potentially in parallel)
      const toolResults = await executeTools(toolUses, context);
      // Yield tool results
      for (const result of toolResults) {
        yield result;
      }
      // Continue conversation recursively with the updated history
      yield* query(null, {
        ...context,
        messages: [...messages, chunk, ...toolResults]
      });
    }
  }
}
Key benefits of this implementation include:
- Immediate feedback: Results appear as they become available through generator streaming.
- Seamless tool execution: When Claude invokes tools, the function recursively calls itself with updated context, maintaining conversation flow.
- Responsive cancellation: Abort signals propagate throughout the system for fast, clean cancellation.
- Comprehensive state management: Each step preserves context, ensuring continuity between operations.
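The cancellation point can be illustrated with the standard `AbortController` API. The sketch below is an assumption about how a signal might be threaded through nested generators, not the actual implementation. `slowStream`, `cancellableQuery`, and `consume` are hypothetical names.

```typescript
// Hypothetical slow stream that checks the abort signal between chunks.
async function* slowStream(signal: AbortSignal): AsyncGenerator<number> {
  for (let i = 0; i < 5; i++) {
    if (signal.aborted) throw new Error("aborted");
    await new Promise((resolve) => setTimeout(resolve, 1));
    yield i;
  }
}

// The outer generator simply delegates; the signal travels with the call.
async function* cancellableQuery(signal: AbortSignal): AsyncGenerator<number> {
  yield* slowStream(signal);
}

// Consume the stream and cancel after the second chunk arrives.
async function consume(): Promise<{ received: number[]; cancelled: boolean }> {
  const controller = new AbortController();
  const received: number[] = [];
  try {
    for await (const n of cancellableQuery(controller.signal)) {
      received.push(n);
      if (n === 1) controller.abort();
    }
    return { received, cancelled: false };
  } catch {
    return { received, cancelled: true };
  }
}
```

Because the inner generator checks the signal before producing each chunk, aborting after the second chunk stops the stream before a third one is produced. The error surfaces at the consumer through `yield*`.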
Parallel Execution Engine
A distinctive feature of Claude Code is its parallel tool execution system. This matters most on large codebases: tasks that might take minutes when executed sequentially often complete in seconds when run in parallel.
Concurrent Generator Approach
Claude Code implements an elegant solution using async generators to process multiple operations in parallel while streaming results as they become available.
The core implementation breaks down into several manageable concepts:
1. Generator State Tracking
// Each generator has a state object tracking its progress
type GeneratorState<T> = {
  generator: AsyncGenerator<T>           // The generator itself
  lastYield: Promise<IteratorResult<T>>  // Its next pending result
  done: boolean                          // Whether it's finished
}
// We track all active generators in a map
const generatorStates = new Map<number, GeneratorState<T>>()
// We also track which generators are still running
const remaining = new Set(generators.map((_, i) => i))
2. Concurrency Management
// Control how many generators run simultaneously
const { signal, maxConcurrency = MAX_CONCURRENCY } = options
// Start only a limited batch initially
const initialBatchSize = Math.min(generators.length, maxConcurrency)
for (let i = 0; i < initialBatchSize; i++) {
  if (generators[i]) {
    // Initialize each generator and start its first operation
    generatorStates.set(i, {
      generator: generators[i],
      lastYield: generators[i].next(),
      done: false,
    })
  }
}
3. Non-blocking Result Collection
// Race to get results from whichever generator finishes first
const entries = Array.from(generatorStates.entries())
const nextResults = await Promise.race(
  entries.map(async ([index, state]) => {
    const result = await state.lastYield
    return { index, result }
  })
)
// Process whichever result came back first
const { index, result } = nextResults
// Immediately yield that result with tracking info
if (!result.done) {
  yield { ...result.value, generatorIndex: index }
  // Queue the next value from this generator without waiting
  const state = generatorStates.get(index)!
  state.lastYield = state.generator.next()
}
4. Dynamic Generator Replacement
// When a generator finishes, remove it
if (result.done) {
  remaining.delete(index)
  generatorStates.delete(index)
  // Pick the lowest-indexed generator that has not been started yet:
  // it is still in `remaining` but has no entry in `generatorStates`.
  // (Deriving the index from Math.max over the active keys fails when no
  // generators are active, because Math.max() with no arguments is -Infinity.)
  const nextGeneratorIndex = Array.from(remaining).find(
    i => !generatorStates.has(i)
  )
  // If there's another generator waiting, start it
  if (nextGeneratorIndex !== undefined) {
    generatorStates.set(nextGeneratorIndex, {
      generator: generators[nextGeneratorIndex],
      lastYield: generators[nextGeneratorIndex].next(),
      done: false,
    })
  }
}
5. Cancellation Support
// Check for cancellation on every iteration
if (signal?.aborted) {
  throw new AbortError()
}
The Complete Picture
These pieces work together to create a system that:
- Runs a controlled number of operations concurrently
- Returns results immediately as they become available from any operation
- Dynamically starts new operations as others complete
- Tracks which generator produced each result
- Supports clean cancellation at any point
This approach maximizes throughput while still recording which generator produced each result, enabling Claude Code to process large codebases efficiently.
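Putting the five pieces together, a compact self-contained version might look like the sketch below. It is a simplified reconstruction under assumptions, not the original source: the default of 10 for `MAX_CONCURRENCY` is assumed, results are wrapped as `{ value, generatorIndex }` instead of being spread so that primitive values work, and a plain counter (`nextToStart`) decides which generator to start next. The `countdown` helper exists only for the demonstration.

```typescript
const MAX_CONCURRENCY = 10; // assumed default

type Tagged<T> = { value: T; generatorIndex: number };
type Settled<T> = { index: number; result: IteratorResult<T> };

async function* all<T>(
  generators: AsyncGenerator<T>[],
  options: { signal?: AbortSignal; maxConcurrency?: number } = {}
): AsyncGenerator<Tagged<T>> {
  const { signal, maxConcurrency = MAX_CONCURRENCY } = options;
  // One pending next() promise per running generator, tagged with its index
  const pending = new Map<number, Promise<Settled<T>>>();
  let nextToStart = 0;

  // Request the next value from a generator without waiting for it
  const advance = (index: number): void => {
    pending.set(
      index,
      generators[index].next().then((result) => ({ index, result }))
    );
  };
  // Start the next generator that has not run yet, if any
  const startNext = (): void => {
    if (nextToStart < generators.length) advance(nextToStart++);
  };

  // Start only a limited batch initially
  const initialBatch = Math.min(maxConcurrency, generators.length);
  for (let i = 0; i < initialBatch; i++) startNext();

  while (pending.size > 0) {
    // Check for cancellation on every iteration
    if (signal?.aborted) throw new Error("Aborted");
    // Wait for whichever running generator produces a result first
    const { index, result } = await Promise.race(Array.from(pending.values()));
    if (result.done) {
      pending.delete(index);
      startNext(); // a slot is free, so start a waiting generator
    } else {
      yield { value: result.value, generatorIndex: index };
      advance(index);
    }
  }
}

// Demo helper: yields label+n down to label+1 with a short delay between items.
async function* countdown(label: string, n: number): AsyncGenerator<string> {
  for (let i = n; i > 0; i--) {
    await new Promise((resolve) => setTimeout(resolve, 1));
    yield `${label}${i}`;
  }
}
```

With `maxConcurrency: 1` the generators run one after another, so `all([countdown("a", 2), countdown("b", 1)], { maxConcurrency: 1 })` yields `a2`, `a1`, then `b1`. With a higher limit the interleaving depends on timing, which is why each result carries its `generatorIndex`.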
Tool Execution Strategy
When Claude requests multiple tools, the system must decide how to execute them efficiently. A key insight drives this decision: read operations can run in parallel, but write operations need careful coordination.
Smart Execution Paths
The tool executor makes an important distinction:
async function executeTools(toolUses: ToolUseRequest[], context: QueryContext) {
  // First, check if all requested tools are read-only
  const allReadOnly = toolUses.every(toolUse => {
    const tool = findToolByName(toolUse.name);
    return tool && tool.isReadOnly();
  });
  let results: ToolResult[] = [];
  // Choose execution strategy based on tool types
  if (allReadOnly) {
    // Safe to run in parallel when all tools just read
    results = await runToolsConcurrently(toolUses, context);
  } else {
    // Run one at a time when any tool might modify state
    results = await runToolsSerially(toolUses, context);
  }
  // Ensure results match the original request order
  return sortToolResultsByRequestOrder(results, toolUses);
}
Performance Optimizations
This seemingly simple approach contains several sophisticated optimizations:
Read vs. Write Classification
Each tool declares whether it's read-only through an isReadOnly() method:
// Example tools showing classification
const ViewFileTool = {
  name: "View",
  // Marked as read-only - can run in parallel
  isReadOnly: () => true,
  // Implementation...
}
const EditFileTool = {
  name: "Edit",
  // Marked as write - must run sequentially
  isReadOnly: () => false,
  // Implementation...
}
Smart Concurrency Control
The execution strategy balances resource usage with execution safety:
- Parallel for read operations:
  - File reads, glob searches, and grep operations run simultaneously
  - Concurrency is typically limited to about 10 operations at once
  - Uses the parallel execution engine discussed earlier
- Sequential for write operations:
  - Applies to any operation that might change state (file edits, bash commands)
  - Runs one at a time in the requested order
  - Prevents potential conflicts or race conditions
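The difference between the two paths is easy to observe in a toy setup. The sketch below is illustrative only: the `readOnlyByName` table, the `runTool` stub, and the `log` array are hypothetical, and `Promise.all` stands in for the generator-based engine to keep the example short.

```typescript
type ToolUse = { id: string; name: string };
type ToolResult = { id: string; output: string };

// Hypothetical tool table: only the read-only flag matters for scheduling.
const readOnlyByName: Record<string, boolean> = {
  View: true,
  Grep: true,
  Edit: false,
};

// Records when each tool starts and ends so the scheduling is visible.
const log: string[] = [];

// Stub tool run: logs, waits briefly, logs again.
async function runTool(use: ToolUse): Promise<ToolResult> {
  log.push(`start ${use.id}`);
  await new Promise((resolve) => setTimeout(resolve, 1));
  log.push(`end ${use.id}`);
  return { id: use.id, output: `${use.name} ok` };
}

async function executeTools(uses: ToolUse[]): Promise<ToolResult[]> {
  // Unknown tools are treated as writes, which is the safe default
  const allReadOnly = uses.every((u) => readOnlyByName[u.name] === true);
  if (allReadOnly) {
    // All reads: start everything at once
    return Promise.all(uses.map((u) => runTool(u)));
  }
  // At least one write: run strictly one after another, in request order
  const results: ToolResult[] = [];
  for (const use of uses) {
    results.push(await runTool(use));
  }
  return results;
}
```

For two read-only requests the log shows both starts before either end. As soon as one request is a write, each tool finishes before the next one starts.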
Ordering Preservation
Despite parallel execution, results maintain a predictable order:
function sortToolResultsByRequestOrder(
  results: ToolResult[],
  originalRequests: ToolUseRequest[]
): ToolResult[] {
  // Create mapping of tool IDs to their original position
  // (`as const` keeps each entry typed as a [key, value] tuple)
  const orderMap = new Map(
    originalRequests.map((req, index) => [req.id, index] as const)
  );
  // Sort a copy of the results to match original request order
  return [...results].sort((a, b) => {
    return orderMap.get(a.id)! - orderMap.get(b.id)!;
  });
}
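A short usage sketch shows the effect on results that arrive out of order. The types are minimal stand-ins for `ToolUseRequest` and `ToolResult`. Treating unknown IDs as sorting last is an assumption added here for robustness, not something the original specifies.

```typescript
type ToolUseRequest = { id: string; name: string };
type ToolResult = { id: string; output: string };

function sortToolResultsByRequestOrder(
  results: ToolResult[],
  originalRequests: ToolUseRequest[]
): ToolResult[] {
  const orderMap = new Map(
    originalRequests.map((req, index): [string, number] => [req.id, index])
  );
  // Assumption: results with unknown IDs sort last instead of yielding NaN
  const position = (id: string): number =>
    orderMap.get(id) ?? Number.MAX_SAFE_INTEGER;
  return [...results].sort((a, b) => position(a.id) - position(b.id));
}

const requests: ToolUseRequest[] = [
  { id: "t1", name: "View" },
  { id: "t2", name: "Grep" },
  { id: "t3", name: "Glob" },
];

// Completion order differs from request order under parallel execution
const completionOrder: ToolResult[] = [
  { id: "t3", output: "glob done" },
  { id: "t1", output: "view done" },
  { id: "t2", output: "grep done" },
];

const ordered = sortToolResultsByRequestOrder(completionOrder, requests);
const orderedIds = ordered.map((r) => r.id);
```

`orderedIds` comes back as `t1`, `t2`, `t3`, and the input array is left untouched because the sort runs on a copy.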
Real-World Impact
The parallel execution strategy lets independent read operations overlap instead of queuing behind one another, which makes Claude Code noticeably more responsive when a request touches multiple files or runs several searches.
Key Components and Design Patterns
The Claude Code architecture relies on several foundational components that work together:
Core Files
- utils/generators.ts: Contains the parallel execution engine and generator utilities
- query.ts: Implements the reactive command loop and tool execution logic
- Tool.ts: Defines the interface all tools must implement
- tools.ts: Manages tool registration and discovery
- permissions.ts: Handles the security layer for tool execution
UI Components
- screens/REPL.tsx: Renders the main conversation interface
- PromptInput.tsx: Manages user input and command history
- services/claude.ts: Handles API communication with Claude
- utils/messages.tsx: Processes message formatting and rendering
Architectural Patterns
The codebase employs several consistent patterns:
- Async Generators: Enable streaming data throughout the system
- Recursive Functions: Power multi-turn conversations and tool usage
- Plugin Architecture: Allows extending the system with new tools
- State Isolation: Keeps tool executions from interfering with each other
- Dynamic Concurrency: Adjusts parallelism based on operation types