Why this matters
This page explains the runtime loop that keeps a single Claude Code turn alive.
Big picture first
The query subsystem has two layers. QueryEngine.ts owns long-lived session
state across turns, while query.ts runs the step-by-step loop for one turn.
The parent chapter described this subsystem as taking prepared messages,
config, and tool context. In code, those inputs arrive as one QueryParams
bundle, and queryLoop() turns that bundle into a changing State object.
Data structures you need first
QueryParams
This is the one-time input bundle for a single turn.

    export type QueryParams = {
      messages: Message[]
      systemPrompt: SystemPrompt
      userContext: { [k: string]: string }
      systemContext: { [k: string]: string }
      canUseTool: CanUseToolFn
      toolUseContext: ToolUseContext
      fallbackModel?: string
      querySource: QuerySource
      maxOutputTokensOverride?: number
      maxTurns?: number
      skipCacheWrite?: boolean
      taskBudget?: { total: number }
      deps?: QueryDeps
    }

| Field | Meaning | First-pass priority |
|---|---|---|
| messages | The prepared conversation history for this turn. | Very important |
| systemPrompt | The runtime instructions that frame the model call. | Very important |
| toolUseContext | The shared tool wiring and session context for execution. | Very important |
| querySource | Why this turn is happening, such as normal user input or some internal path. | Helpful context |
| maxOutputTokensOverride | An optional per-turn cap that recovery code can raise or reset. | Safe to skim on first read |
| taskBudget | An optional whole-turn output budget tracked across loop iterations. | Safe to skim on first read |
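The taskBudget row is easier to picture with a small sketch. This is only an illustration, not the real implementation: it assumes the budget counts output tokens and that each loop iteration reports how many it used. Only the { total: number } shape comes from QueryParams; BudgetTracker, createBudgetTracker, recordUsage, and isExhausted are hypothetical names.

```typescript
// Hypothetical sketch: tracking a whole-turn budget across loop iterations.
// Assumption: the budget is measured in output tokens. Only the
// { total: number } shape is taken from QueryParams.
type TaskBudget = { total: number }

type BudgetTracker = {
  used: number
  remaining: number | undefined // undefined means no budget was supplied
}

function createBudgetTracker(budget?: TaskBudget): BudgetTracker {
  return { used: 0, remaining: budget?.total }
}

// Returns a new tracker instead of mutating the old one.
function recordUsage(tracker: BudgetTracker, outputTokens: number): BudgetTracker {
  const used = tracker.used + outputTokens
  const remaining =
    tracker.remaining === undefined
      ? undefined
      : Math.max(0, tracker.remaining - outputTokens)
  return { used, remaining }
}

// A turn without a budget is never exhausted.
function isExhausted(tracker: BudgetTracker): boolean {
  return tracker.remaining === 0
}
```

Because the field is optional, the sketch treats a missing budget as "no limit" rather than as zero.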
Query turn state
This is the mutable bundle that changes as one turn progresses.

    type State = {
      messages: Message[]
      toolUseContext: ToolUseContext
      autoCompactTracking: AutoCompactTrackingState | undefined
      maxOutputTokensRecoveryCount: number
      hasAttemptedReactiveCompact: boolean
      maxOutputTokensOverride: number | undefined
      pendingToolUseSummary: Promise<ToolUseSummaryMessage | null> | undefined
      stopHookActive: boolean | undefined
      turnCount: number
      transition: Continue | undefined
    }

| Field | Meaning | First-pass priority |
|---|---|---|
| messages | The conversation history for this active turn. | Very important |
| toolUseContext | Shared runtime context for tool execution. | Very important |
| autoCompactTracking | Tracks whether the turn is drifting toward automatic compaction. | Safe to skim on first read |
| maxOutputTokensRecoveryCount | Counts how many times the loop has tried to recover from max-output-token failures. | Important once recovery starts |
| hasAttemptedReactiveCompact | Remembers whether the loop already tried the reactive compact path. | Important once recovery starts |
| maxOutputTokensOverride | Carries the current output-token cap between iterations. | Helpful once budgets matter |
| pendingToolUseSummary | Stores a promised summary message so the loop can emit it at the right time. | Safe to skim on first read |
| stopHookActive | Marks whether stop-hook logic is currently in control of the turn. | Safe to skim on first read |
| turnCount | How many loop iterations have happened inside this turn. | Important |
| transition | Why the previous iteration continued. | Helpful once recovery paths start |
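The two recovery fields are guards: one is a counter, the other is a one-shot flag. A minimal sketch of how such guards can stop a turn from retrying forever is shown here. It is not the real recovery code. RecoveryState is a reduced stand-in for State, the limit of 3 is an assumed value, the reason strings are made up, and onMaxOutputTokens and onReactiveCompactNeeded are hypothetical helper names.

```typescript
// Hypothetical sketch: recovery fields used as guards against endless retries.
// RecoveryState is a reduced stand-in for State, not the real type.
type RecoveryState = {
  maxOutputTokensRecoveryCount: number
  hasAttemptedReactiveCompact: boolean
}

// The reason on a 'continue' decision plays the role of State.transition.
type Decision =
  | { kind: 'continue'; reason: string; next: RecoveryState }
  | { kind: 'stop'; reason: string }

const MAX_RECOVERY_ATTEMPTS = 3 // assumed value, for illustration only

// Counter guard: retry until the assumed limit is reached.
function onMaxOutputTokens(state: RecoveryState): Decision {
  if (state.maxOutputTokensRecoveryCount >= MAX_RECOVERY_ATTEMPTS) {
    return { kind: 'stop', reason: 'recovery_limit_reached' }
  }
  return {
    kind: 'continue',
    reason: 'max_output_tokens_recovery',
    next: {
      ...state,
      maxOutputTokensRecoveryCount: state.maxOutputTokensRecoveryCount + 1,
    },
  }
}

// One-shot guard: the reactive compact path is tried at most once.
function onReactiveCompactNeeded(state: RecoveryState): Decision {
  if (state.hasAttemptedReactiveCompact) {
    return { kind: 'stop', reason: 'compact_already_attempted' }
  }
  return {
    kind: 'continue',
    reason: 'reactive_compact',
    next: { ...state, hasAttemptedReactiveCompact: true },
  }
}
```

Both helpers return a fresh state object instead of mutating the old one, which matches the way the loop replaces its bundle between iterations.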
Iteration phases in plain English
Each pass through queryLoop() follows the same rhythm:
- Read the current state bundle into local names for this iteration.
- Build the request and start streaming the model response.
- Watch that response for text, tool calls, stop-hook signals, or recovery conditions.
- Run the needed tool or recovery work, then write a new state object if the turn should continue.
- Return a terminal result if the turn is done.
The important state-machine idea is that state is initialized once before the
loop starts. After that, each iteration destructures the current bundle, does
work, and either returns or replaces state for the next pass.
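The "initialize once, then return or replace" rhythm can be shown in a stripped-down, synchronous form. This is a sketch under simplifying assumptions, not the real loop: LoopState, StepResult, and runLoop are illustrative names, messages are plain strings, and the model call is replaced by a caller-supplied step function.

```typescript
// Hypothetical sketch of the "initialize once, then return or replace" pattern.
// LoopState, StepResult, and runLoop are illustrative names, not the real API.
type LoopState = { messages: string[]; turnCount: number }

type StepResult =
  | { done: true; reason: string }
  | { done: false; next: LoopState }

function runLoop(
  initialMessages: string[],
  step: (messages: string[], turnCount: number) => StepResult,
): { reason: string; turnCount: number } {
  // State is built once, before the loop starts.
  let state: LoopState = { messages: initialMessages, turnCount: 1 }

  while (true) {
    // Each pass reads the current bundle into local names.
    const { messages, turnCount } = state

    const result = step(messages, turnCount)
    if (result.done) {
      return { reason: result.reason, turnCount }
    }
    // Continuing means swapping in a whole new bundle for the next pass.
    state = result.next
  }
}

// Usage: a step that appends one message per pass and stops on the third pass.
const outcome = runLoop(['user: hi'], (messages, turnCount) =>
  turnCount >= 3
    ? { done: true, reason: 'completed' }
    : {
        done: false,
        next: { messages: [...messages, `pass ${turnCount}`], turnCount: turnCount + 1 },
      },
)
```

Here outcome ends up as { reason: 'completed', turnCount: 3 }. The only way out of the loop is a return, and the only way to keep going is to assign a new state.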
Code walk
The public query() generator is a thin wrapper. The real state machine lives
inside queryLoop():

    async function* queryLoop(
      params: QueryParams,
      consumedCommandUuids: string[],
    ): AsyncGenerator<..., Terminal> {
      // Inputs that stay fixed for the whole turn.
      const {
        systemPrompt,
        userContext,
        systemContext,
        canUseTool,
        fallbackModel,
        querySource,
        maxTurns,
        skipCacheWrite,
      } = params

      // Mutable turn state, initialized once before the loop starts.
      let state: State = {
        messages: params.messages,
        toolUseContext: params.toolUseContext,
        maxOutputTokensOverride: params.maxOutputTokensOverride,
        autoCompactTracking: undefined,
        stopHookActive: undefined,
        maxOutputTokensRecoveryCount: 0,
        hasAttemptedReactiveCompact: false,
        turnCount: 1,
        pendingToolUseSummary: undefined,
        transition: undefined,
      }

      while (true) {
        // Each pass reads the current bundle into local names.
        let { toolUseContext } = state
        const {
          messages,
          autoCompactTracking,
          maxOutputTokensRecoveryCount,
          hasAttemptedReactiveCompact,
          maxOutputTokensOverride,
          pendingToolUseSummary,
          stopHookActive,
          turnCount,
        } = state
        // ... the rest of the iteration is not shown in this excerpt
      }
    }
That pattern is the mental model to keep in your head. params mostly stays
fixed for the turn. state is the thing that evolves. The loop does not start
fresh each time; it carries forward history, tool context, counters, and
recovery flags until one iteration finally returns a terminal outcome.
The loop is also a generator, which is why it can stream partial events while still behaving like a state machine underneath.
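To make the generator point concrete, here is a small sketch of an async generator that yields intermediate events and then returns a terminal value. It is an illustration only: StreamEvent, TerminalResult, miniLoop, and drain are hypothetical names, and the event shape is invented for the example.

```typescript
// Hypothetical sketch: an async generator that streams events and also
// returns a terminal value. Names here are illustrative, not the real API.
type StreamEvent = { type: 'text'; text: string }
type TerminalResult = { reason: string; passes: number }

async function* miniLoop(
  chunks: string[],
): AsyncGenerator<StreamEvent, TerminalResult> {
  let passes = 0
  for (const text of chunks) {
    passes += 1
    // Intermediate event: the caller sees it before the loop finishes.
    yield { type: 'text', text }
  }
  // Terminal outcome: delivered as the final value when done is true.
  return { reason: 'completed', passes }
}

// for await...of discards the return value, so this helper calls next() by hand.
async function drain(
  gen: AsyncGenerator<StreamEvent, TerminalResult>,
): Promise<{ events: StreamEvent[]; terminal: TerminalResult }> {
  const events: StreamEvent[] = []
  while (true) {
    const step = await gen.next()
    if (step.done) {
      return { events, terminal: step.value }
    }
    events.push(step.value)
  }
}
```

Calling drain(miniLoop(['a', 'b'])) resolves with two text events plus a terminal result of { reason: 'completed', passes: 2 }. Yielded values carry the stream, and the return value carries the state machine's final answer.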
Takeaways
- QueryEngine.ts owns cross-turn session state, while query.ts owns the single-turn loop.
- The loop keeps mutable turn state in one bundle instead of scattering flags everywhere.
- Generators are central because the runtime streams intermediate events, not only final text.