Why this matters

Long conversations eventually grow past what a single request can carry. Without a recovery path, the runtime would stall at that point.

Big picture first

Compaction takes older history, asks a summarizing model pass to compress it, and rebuilds the conversation around a compact boundary plus summary messages.
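The flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the `Msg` shape, the `compactConversation` name, the `keepRecent` parameter, and the idea of keeping a recent tail verbatim are all assumptions made for the example. The summarizer is synchronous here only to keep the sketch short; a real model pass would be asynchronous.

```typescript
// Hypothetical, simplified message shape; the real Message type is richer.
type Msg = {
  role: 'user' | 'assistant' | 'system'
  text: string
  kind?: 'boundary' | 'summary'
}

// Hypothetical stand-in for the summarizing model pass.
type Summarize = (older: Msg[]) => string

// Rebuild the conversation as: boundary marker, summary, then the recent tail.
function compactConversation(
  history: Msg[],
  keepRecent: number, // assumption: the newest messages are kept verbatim
  summarize: Summarize,
): Msg[] {
  // Nothing is old enough to compact, so return the history unchanged.
  if (history.length <= keepRecent) return history

  const cut = history.length - keepRecent
  const older = history.slice(0, cut)
  const recent = history.slice(cut)

  return [
    { role: 'system', text: 'compact boundary', kind: 'boundary' },
    { role: 'user', text: summarize(older), kind: 'summary' },
    ...recent,
  ]
}
```

The boundary entry gives later code a fixed point to find where compacted history ends and live history begins.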

Code walk

export function truncateHeadForPTLRetry(
  messages: Message[],
  ptlResponse: AssistantMessage,
): Message[] | null {
  const input =
    messages[0]?.type === 'user' &&
    messages[0].isMeta &&
    messages[0].message.content === PTL_RETRY_MARKER
      ? messages.slice(1)
      : messages

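  // Group by API round so that trimming operates on whole rounds.
  const groups = groupMessagesByApiRound(input)
  // With fewer than two groups, nothing can be dropped while leaving content behind.
  if (groups.length < 2) return null

  // NOTE: dropCount (the number of oldest groups to discard) is not defined in
  // this excerpt; its computation is elided here.
  const sliced = groups.slice(dropCount).flat()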
  if (sliced[0]?.type === 'assistant') {
    return [
      createUserMessage({ content: PTL_RETRY_MARKER, isMeta: true }),
      ...sliced,
    ]
  }
  return sliced
}

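This helper exists because the compaction request itself can trigger a prompt-too-long failure. When that happens, the helper drops the oldest API-round groups from the head of the message list so the request can be retried with less input. It returns null when there are fewer than two groups, because nothing can be dropped without emptying the list.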
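A caller might drive that trim-and-retry cycle along the following lines. This is a hedged sketch: `summarizeWithHeadTrimRetry`, the `Attempt` type, the callback signatures, and the `maxRetries` budget are hypothetical names introduced for illustration, and the source does not show how the real caller is structured.

```typescript
// Hypothetical result type: either a summary or a prompt-too-long signal.
type Attempt =
  | { ok: true; summary: string }
  | { ok: false; reason: 'prompt_too_long' }

function summarizeWithHeadTrimRetry<M>(
  messages: M[],
  trySummarize: (input: M[]) => Attempt,
  truncateHead: (input: M[]) => M[] | null, // null means nothing left to drop
  maxRetries = 3, // assumption: a small fixed retry budget
): string | null {
  let input = messages
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const result = trySummarize(input)
    if (result.ok) return result.summary

    // The request was too long: peel the oldest context away and try again.
    const trimmed = truncateHead(input)
    if (trimmed === null) return null
    input = trimmed
  }
  return null // retry budget exhausted
}
```

Bounding the loop matters because each retry is another model request; giving up with null lets the caller surface the failure instead of looping.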

The assistant-first branch is the safety guard. If trimming drops the original user preamble, the helper prepends a synthetic user marker so the rebuilt message list still satisfies the API’s required user-first shape.
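The guard and its counterpart at the top of the helper can be isolated into two small functions. The sketch below uses a simplified, hypothetical `Turn` shape and a made-up marker string in place of `PTL_RETRY_MARKER`, whose real value is not shown here. Stripping the marker on entry appears to be what keeps markers from stacking up across repeated retries, though that reading is an inference from the excerpt.

```typescript
type Role = 'user' | 'assistant'
type Turn = { role: Role; content: string; isMeta?: boolean }

// Hypothetical marker text; the real PTL_RETRY_MARKER value is not shown.
const RETRY_MARKER = '[earlier conversation truncated]'

// Prepend a synthetic user turn when trimming left an assistant turn first.
function ensureUserFirst(turns: Turn[]): Turn[] {
  if (turns[0]?.role !== 'assistant') return turns
  return [{ role: 'user', content: RETRY_MARKER, isMeta: true }, ...turns]
}

// Remove a marker left by a previous retry so it is not counted as content.
function stripRetryMarker(turns: Turn[]): Turn[] {
  const first = turns[0]
  return first?.role === 'user' && first.isMeta && first.content === RETRY_MARKER
    ? turns.slice(1)
    : turns
}
```

Applying `stripRetryMarker` to the output of `ensureUserFirst` returns the original turns, so the pair is safe to run once per retry.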

Compaction safety rule

services/compact/compact.ts

Tool use is not allowed during compaction
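One way such a rule could be enforced is sketched below. The source names only the file and the message, so everything else is assumed: `PermissionDecision`, `denyToolsDuringCompaction`, `Block`, and `extractSummaryText` are hypothetical names, and the real enforcement mechanism may differ.

```typescript
// Hypothetical permission result shape.
type PermissionDecision =
  | { behavior: 'allow' }
  | { behavior: 'deny'; message: string }

// Deny every tool request while a compaction summary is being generated.
function denyToolsDuringCompaction(_toolName: string): PermissionDecision {
  return {
    behavior: 'deny',
    message: 'Tool use is not allowed during compaction',
  }
}

// Hypothetical, simplified content blocks in a model response.
type Block =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; name: string }

// Keep only text blocks, so the summary is plain text even if a tool call appears.
function extractSummaryText(blocks: Block[]): string {
  return blocks
    .filter((b): b is Extract<Block, { type: 'text' }> => b.type === 'text')
    .map(b => b.text)
    .join('\n')
}
```

Denying at the permission layer and filtering at the output layer are independent checks; either alone would keep tool results out of the summary in this sketch.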

Takeaways

Compaction is a recovery subsystem, not a cosmetic feature.
Prompt-too-long can happen inside the recovery path itself.
Compaction forbids tool use because it wants a pure text summary.
