Why this matters
Tool execution is where a model decision becomes action.
Big picture first
The query loop does not run all tool calls the same way. It first partitions them into batches by concurrency safety, and only then decides how each batch should run.
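Before reading the partitioning code, it helps to have the data shapes in mind. This is a minimal sketch, assuming a `Batch` pairs a safety flag with the tool_use blocks that share it; the field names follow the code below, but the exact shapes are an assumption:

```typescript
// Hypothetical sketch of the shapes the partitioning code relies on.
// ToolUseBlock mirrors a model "tool_use" content block; Batch groups
// adjacent calls that received the same concurrency-safety verdict.
interface ToolUseBlock {
  type: 'tool_use'
  id: string
  name: string
  input: unknown
}

interface Batch {
  isConcurrencySafe: boolean
  blocks: ToolUseBlock[]
}
```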
Code walk
function partitionToolCalls(
  toolUseMessages: ToolUseBlock[],
  toolUseContext: ToolUseContext,
): Batch[] {
  return toolUseMessages.reduce((acc: Batch[], toolUse) => {
    const tool = findToolByName(toolUseContext.options.tools, toolUse.name)
    const parsedInput = tool?.inputSchema.safeParse(toolUse.input)
    const isConcurrencySafe = parsedInput?.success
      ? (() => {
          try {
            return Boolean(tool?.isConcurrencySafe(parsedInput.data))
          } catch {
            return false
          }
        })()
      : false
    if (isConcurrencySafe && acc[acc.length - 1]?.isConcurrencySafe) {
      acc[acc.length - 1]!.blocks.push(toolUse)
    } else {
      acc.push({ isConcurrencySafe, blocks: [toolUse] })
    }
    return acc
  }, [])
}
This is the first decision point. The runtime looks up each tool, validates its input against the tool's schema, and treats the call as concurrency-safe only if the tool explicitly says so. Everything else defaults to unsafe: an unknown tool name, a failed parse, or an isConcurrencySafe check that throws all land in the `false` branch.
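To see the grouping behavior concretely, here is a stripped-down re-implementation of the same reduce. The tool names and the `safety` lookup table are illustrative stand-ins for the real registry and schema validation:

```typescript
// Simplified model: a tool is just a name plus a safety flag. The real
// code also validates input with a schema before asking the tool itself.
const safety: Record<string, boolean> = {
  Read: true,   // read-only → concurrency-safe
  Write: false, // mutates state → must run alone
}

interface MiniBatch {
  isConcurrencySafe: boolean
  blocks: string[]
}

function partition(calls: string[]): MiniBatch[] {
  return calls.reduce((acc: MiniBatch[], name) => {
    const isConcurrencySafe = safety[name] ?? false // unknown → unsafe
    const last = acc[acc.length - 1]
    if (isConcurrencySafe && last?.isConcurrencySafe) {
      last.blocks.push(name) // extend the current safe run
    } else {
      acc.push({ isConcurrencySafe, blocks: [name] })
    }
    return acc
  }, [])
}

// [Read, Read, Write, Read] → [[Read, Read], [Write], [Read]]
console.log(partition(['Read', 'Read', 'Write', 'Read']))
```

Note the key property: batches preserve call order, so an unsafe call splits a run of safe calls in two. The trailing Read starts a fresh batch rather than rejoining the first one.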
async function* runToolsConcurrently(
  toolUseMessages: ToolUseBlock[],
  assistantMessages: AssistantMessage[],
  canUseTool: CanUseToolFn,
  toolUseContext: ToolUseContext,
): AsyncGenerator<MessageUpdateLazy, void> {
  yield* all(
    toolUseMessages.map(async function* (toolUse) {
      toolUseContext.setInProgressToolUseIDs(prev =>
        new Set(prev).add(toolUse.id),
      )
      yield* runToolUse(
        toolUse,
        assistantMessages.find(_ =>
          _.message.content.some(
            _ => _.type === 'tool_use' && _.id === toolUse.id,
          ),
        )!,
        canUseTool,
        toolUseContext,
      )
      markToolUseAsComplete(toolUseContext, toolUse.id)
    }),
    getMaxToolUseConcurrency(),
  )
}
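The `all(generators, limit)` helper merges the per-tool generators with a concurrency cap. Its internals are not shown here, but the scheduling idea can be modeled with plain promises, a hypothetical simplification that drops the streaming aspect:

```typescript
// Simplified model of bounded concurrency: run promise-returning tasks
// with at most `limit` in flight at once. The real `all` does this for
// async generators, interleaving their yielded updates as they arrive.
async function runBounded<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length)
  let next = 0
  // Each worker pulls the next unclaimed task index; since JS is
  // single-threaded, `next++` between awaits is race-free.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++
      results[i] = await tasks[i]()
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, tasks.length) }, worker),
  )
  return results
}

// Example: three trivial tasks, at most two in flight at once.
runBounded(['a', 'b', 'c'].map(x => async () => x.toUpperCase()), 2)
  .then(r => console.log(r)) // ['A', 'B', 'C']
```

Results land at their original indices, so callers see outputs in call order even though completion order may differ.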
Read-only batches can flow through runToolsConcurrently, while stateful or dangerous batches fall back to serial execution.
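The serial path is not shown above, but its essence is easy to model: drain one tool's update stream to completion before starting the next, so side effects can never interleave. A self-contained miniature, with an invented `fakeTool` standing in for a real tool run:

```typescript
// fakeTool is a stand-in for a real tool's async update stream.
async function* fakeTool(name: string): AsyncGenerator<string> {
  yield `${name}:start`
  yield `${name}:done`
}

// Serial execution in miniature: each generator is fully drained
// before the next loop iteration begins.
async function* runSerially(names: string[]): AsyncGenerator<string> {
  for (const name of names) {
    yield* fakeTool(name)
  }
}

;(async () => {
  const out: string[] = []
  for await (const update of runSerially(['Write', 'Bash'])) out.push(update)
  console.log(out) // ['Write:start', 'Write:done', 'Bash:start', 'Bash:done']
})()
```

Each tool's start/done pair stays contiguous, which is exactly the guarantee the concurrent path gives up.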
Takeaways
- The orchestration layer batches before it executes.
- Concurrency is opt-in, decided per call from the tool's own judgment of its validated input.
- Serial execution is the conservative fallback whenever safety is unclear.