Why this matters

Claude Code keeps a reusable memory file so it does not have to rediscover the same session notes over and over. When the conversation gets long enough, the runtime can summarize useful details into that file in the background, then keep the main turn moving.

The important part is not just that memory exists. It is that the write path is small, reusable, and guarded. The background worker cannot treat memory like a free scratchpad, and the manual summary path goes through the same guard rather than a looser one. A separate service, auto-memory, shows the same guarding style, but it writes to a different directory and solves a different problem.

Session memory stays in the background

Startup registers the hook once, later hook runs check the threshold, and only then can the forked agent rewrite the session-memory file through a strict guard.

Why The Memory File Exists

The reusable memory file is a shortcut for future turns. Instead of making the main agent carry every long-running detail in its active context, Claude Code can write a smaller markdown note that survives after the current turn ends. That keeps the session lighter and gives later turns a stable place to read the important background from.

Before looking at the code, remember the data shape the threshold code sees: messages is the full transcript array for the session. The extraction logic does not need to know every field on every message. It only needs the ordered conversation so it can estimate tokens, count tool calls since the last update, and check whether the last assistant turn used tools.
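As a rough illustration of the kind of question the threshold logic asks of the transcript, here is a minimal sketch. The `SketchMessage` shape and both helper functions are hypothetical simplifications, not the real Message type or the real helpers. The behavior when the UUID is missing or not found is also an assumption.

```typescript
// Hypothetical, heavily simplified message shape. The real Message type has
// more fields and represents tool calls differently.
type SketchMessage = {
  uuid: string
  role: 'user' | 'assistant'
  toolCalls: number // number of tool calls issued by this message
}

// Did the most recent assistant message issue any tool calls?
function lastAssistantTurnUsedTools(messages: SketchMessage[]): boolean {
  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i]
    if (message.role === 'assistant') return message.toolCalls > 0
  }
  return false
}

// Count tool calls in messages that come after the message with `sinceUuid`.
// Assumption: with no UUID, every message is counted; with an unknown UUID,
// nothing is counted.
function countToolCallsAfter(
  messages: SketchMessage[],
  sinceUuid?: string,
): number {
  let counting = sinceUuid === undefined
  let total = 0
  for (const message of messages) {
    if (counting) total += message.toolCalls
    else if (message.uuid === sinceUuid) counting = true
  }
  return total
}

const transcript: SketchMessage[] = [
  { uuid: 'u1', role: 'user', toolCalls: 0 },
  { uuid: 'a1', role: 'assistant', toolCalls: 2 },
  { uuid: 'u2', role: 'user', toolCalls: 0 },
  { uuid: 'a2', role: 'assistant', toolCalls: 0 },
]
```

With this transcript, the last assistant turn (`a2`) used no tools, so it would count as a natural break, while two tool calls have happened since `u1`.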

Initialization Registers The Hook First

initSessionMemory() runs during setup and makes the feature available. It does not wait for the transcript threshold first. Its job is narrower: if the runtime is local and auto-compact is enabled, register a post-sampling hook and leave the real decision for later.

The other important guard is that initialization respects the auto-compact setting. Session memory is used for compaction, so if auto-compact is off, session memory stays off too. That keeps the feature aligned with the rest of the recovery stack.

export function initSessionMemory(): void {
  if (getIsRemoteMode()) return
  // Session memory is used for compaction, so respect auto-compact settings
  const autoCompactEnabled = isAutoCompactEnabled()

  // Log initialization state (ant-only to avoid noise in external logs)
  if (process.env.USER_TYPE === 'ant') {
    logEvent('tengu_session_memory_init', {
      auto_compact_enabled: autoCompactEnabled,
    })
  }

  if (!autoCompactEnabled) {
    return
  }

  // Register hook unconditionally - gate check happens lazily when hook runs
  registerPostSamplingHook(extractSessionMemory)
}

Initialization is therefore a registration step, not a background job. extractSessionMemory is what actually does the work later, after the message threshold and the runtime gate both agree that extraction is worth trying.
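To make the split between registering and running concrete, here is a minimal sketch of a hook registry. The names `HookContext`, `registerHook`, and `runHooks` are hypothetical stand-ins rather than the runtime's real API, and the hooks are synchronous here for brevity, whereas the real hook is async. Only the idea that registration is cheap and the gates are checked when the hook fires comes from the text above.

```typescript
// Hypothetical, simplified hook registry (not the real runtime API).
type HookContext = { querySource: string; messageCount: number }
type Hook = (context: HookContext) => void

const hooks: Hook[] = []

// Registration only stores the callback. It does no work itself.
function registerHook(hook: Hook): void {
  hooks.push(hook)
}

// Called after each sampling step. Every registered hook decides for itself
// whether it has anything to do.
function runHooks(context: HookContext): void {
  for (const hook of hooks) {
    hook(context)
  }
}

// Example hook that gates lazily, at run time.
const extractionLog: number[] = []
registerHook(context => {
  if (context.querySource !== 'repl_main_thread') return
  if (context.messageCount < 3) return // stand-in for the real threshold
  extractionLog.push(context.messageCount)
})

runHooks({ querySource: 'subagent', messageCount: 10 }) // skipped: wrong source
runHooks({ querySource: 'repl_main_thread', messageCount: 1 }) // skipped: too short
runHooks({ querySource: 'repl_main_thread', messageCount: 5 }) // extraction runs
// extractionLog is now [5]
```

The hook is registered once, but only the third call gets past both checks.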

The Hook Adds Runtime Gates Before The Threshold

Registering the hook does not mean every later turn will extract memory. extractSessionMemory() still checks where the event came from and whether the feature gate is on, then performs the one-time, memoized config load, all before it ever asks about tokens.

const extractSessionMemory = sequential(async function (
  context: REPLHookContext,
): Promise<void> {
  const { messages, toolUseContext, querySource } = context

  // Only run session memory on main REPL thread
  if (querySource !== 'repl_main_thread') {
    // Don't log this - it's expected for subagents, teammates, etc.
    return
  }

  // Check gate lazily when hook runs (cached, non-blocking)
  if (!isSessionMemoryGateEnabled()) {
    // Log gate failure once per session (ant-only)
    if (process.env.USER_TYPE === 'ant' && !hasLoggedGateFailure) {
      hasLoggedGateFailure = true
      logEvent('tengu_session_memory_gate_disabled', {})
    }
    return
  }

  // Initialize config from remote (lazy, only once)
  initSessionMemoryConfigIfNeeded()

  if (!shouldExtractMemory(messages)) {
    return
  }

  // ... the extraction itself follows (not shown in this excerpt)
})

That is the full start condition in plain English:

the event must come from the main REPL thread
the session-memory feature gate must be enabled
the service does its one-time cached-config load
only then does the threshold logic get a vote

The Threshold Runs When The Hook Fires

Only after the hook passes those runtime gates does shouldExtractMemory() ask whether the conversation is large enough and quiet enough to justify a background summary. In other words, startup wires the trigger once, the hook checks its environment, and then the threshold decides whether this turn is a good extraction point.

export function shouldExtractMemory(messages: Message[]): boolean {
  // Check if we've met the initialization threshold
  // Uses total context window tokens (same as autocompact) for consistent behavior
  const currentTokenCount = tokenCountWithEstimation(messages)
  if (!isSessionMemoryInitialized()) {
    if (!hasMetInitializationThreshold(currentTokenCount)) {
      return false
    }
    markSessionMemoryInitialized()
  }

  // Check if we've met the minimum tokens between updates threshold
  // Uses context window growth since last extraction (same metric as init threshold)
  const hasMetTokenThreshold = hasMetUpdateThreshold(currentTokenCount)

  // Check if we've met the tool calls threshold
  const toolCallsSinceLastUpdate = countToolCallsSince(
    messages,
    lastMemoryMessageUuid,
  )
  const hasMetToolCallThreshold =
    toolCallsSinceLastUpdate >= getToolCallsBetweenUpdates()

  // Check if the last assistant turn has no tool calls (safe to extract)
  const hasToolCallsInLastTurn = hasToolCallsInLastAssistantTurn(messages)

  // Trigger extraction when:
  // 1. Both thresholds are met (tokens AND tool calls), OR
  // 2. No tool calls in last turn AND token threshold is met
  //    (to ensure we extract at natural conversation breaks)
  //
  // IMPORTANT: The token threshold (minimumTokensBetweenUpdate) is ALWAYS required.
  // Even if the tool call threshold is met, extraction won't happen until the
  // token threshold is also satisfied. This prevents excessive extractions.
  const shouldExtract =
    (hasMetTokenThreshold && hasMetToolCallThreshold) ||
    (hasMetTokenThreshold && !hasToolCallsInLastTurn)

  if (shouldExtract) {
    const lastMessage = messages[messages.length - 1]
    if (lastMessage?.uuid) {
      lastMemoryMessageUuid = lastMessage.uuid
    }
    return true
  }

  return false
}

The shape of the decision is simple:

first, ask whether the session has crossed the initialization threshold
then, require enough token growth since the last update (this condition is always required)
then, look for either the tool-call threshold or a natural break in the last assistant turn

The important detail is that startup does not do this work. The threshold keeps protecting the session later, when the registered hook runs.
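The final boolean in shouldExtractMemory() can be read as a small truth table. The sketch below isolates just that expression. The `ExtractionSignals` type and `decideExtraction` function are illustrative names, not part of the excerpt, and the initialization threshold is left out.

```typescript
// The three booleans that the final decision combines.
type ExtractionSignals = {
  hasMetTokenThreshold: boolean
  hasMetToolCallThreshold: boolean
  hasToolCallsInLastTurn: boolean
}

// Logically equivalent to the excerpt's
//   (token && toolCall) || (token && !toolCallsInLastTurn)
// but factored so the token threshold is visibly mandatory.
function decideExtraction(signals: ExtractionSignals): boolean {
  return (
    signals.hasMetTokenThreshold &&
    (signals.hasMetToolCallThreshold || !signals.hasToolCallsInLastTurn)
  )
}

// Natural break: enough tokens, last assistant turn used no tools.
const atNaturalBreak = decideExtraction({
  hasMetTokenThreshold: true,
  hasMetToolCallThreshold: false,
  hasToolCallsInLastTurn: false,
}) // true

// Mid-work: enough tokens, but tools are still running and too few calls so far.
const midWork = decideExtraction({
  hasMetTokenThreshold: true,
  hasMetToolCallThreshold: false,
  hasToolCallsInLastTurn: true,
}) // false

// Many tool calls but little token growth: the token threshold blocks it.
const tooFewTokens = decideExtraction({
  hasMetTokenThreshold: false,
  hasMetToolCallThreshold: true,
  hasToolCallsInLastTurn: false,
}) // false
```

Factoring out the token check makes the comment in the excerpt easy to verify: no combination of the other two signals can trigger extraction on its own.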

Manual Extraction Needs A Result Shape First

The manual summary command needs a small, explicit result object so the caller can tell whether the summary worked and, if it did, where the memory file lives. That is why ManualExtractionResult is introduced here, before the manual extraction path itself.

export type ManualExtractionResult = {
  success: boolean
  memoryPath?: string
  error?: string
}

The fields mean:

success says whether the manual extraction finished cleanly
memoryPath gives the caller the exact memory file path when a summary was written
error carries a human-readable failure reason when something goes wrong

When manuallyExtractSessionMemory(messages, toolUseContext) runs, it uses this same result shape so the caller can see whether the write succeeded without guessing from side effects.
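A caller might consume this result along the following lines. This is a sketch only: `describeManualExtraction` is a hypothetical helper, the message wording is invented for illustration, and the example path is made up.

```typescript
type ManualExtractionResult = {
  success: boolean
  memoryPath?: string
  error?: string
}

// Hypothetical caller-side helper: turn the result into a status line
// without inspecting the file system for side effects.
function describeManualExtraction(result: ManualExtractionResult): string {
  if (!result.success) {
    return `Session memory was not updated: ${result.error ?? 'unknown error'}`
  }
  return result.memoryPath
    ? `Session memory written to ${result.memoryPath}`
    : 'Session memory extraction succeeded'
}

const okLine = describeManualExtraction({
  success: true,
  memoryPath: '/tmp/session-memory.md', // made-up path
})
const failLine = describeManualExtraction({
  success: false,
  error: 'disk full',
})
```

Because memoryPath and error are both optional, the helper has to handle the cases where either is missing. The success flag remains the one field the caller can always rely on.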

The Manual Path Reads And Rewrites The Same File

Manual extraction does not get a looser permission model. It still uses the same guarded file path as the background worker, just with a different trigger. The manual path is allowed to bypass the threshold, but it still has to write through the same narrow tool lock.

export function createMemoryFileCanUseTool(memoryPath: string): CanUseToolFn {
  return async (tool: Tool, input: unknown) => {
    if (
      tool.name === FILE_EDIT_TOOL_NAME &&
      typeof input === 'object' &&
      input !== null &&
      'file_path' in input
    ) {
      const filePath = input.file_path
      if (typeof filePath === 'string' && filePath === memoryPath) {
        return { behavior: 'allow' as const, updatedInput: input }
      }
    }
    return {
      behavior: 'deny' as const,
      message: `only ${FILE_EDIT_TOOL_NAME} on ${memoryPath} is allowed`,
      decisionReason: {
        type: 'other' as const,
        reason: `only ${FILE_EDIT_TOOL_NAME} on ${memoryPath} is allowed`,
      },
    }
  }
}

The shape is intentionally strict. The worker may use the edit tool on exactly one path, the memory file, and the path must match as an exact string. Every other tool and every other path is denied.
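To see the guard's behavior in isolation, here is a self-contained sketch with simplified stand-in types. `Tool`, `Decision`, the synchronous signature, the value `'Edit'` for FILE_EDIT_TOOL_NAME, and the example paths are all assumptions for illustration. The real callback is async and returns a richer deny object.

```typescript
// Simplified stand-ins for the real Tool and CanUseToolFn types (assumptions).
type Tool = { name: string }
type Decision =
  | { behavior: 'allow'; updatedInput: unknown }
  | { behavior: 'deny'; message: string }

const FILE_EDIT_TOOL_NAME = 'Edit' // assumed value, for illustration only

function createMemoryFileGuard(memoryPath: string) {
  return (tool: Tool, input: unknown): Decision => {
    if (
      tool.name === FILE_EDIT_TOOL_NAME &&
      typeof input === 'object' &&
      input !== null &&
      'file_path' in input &&
      (input as { file_path: unknown }).file_path === memoryPath
    ) {
      return { behavior: 'allow', updatedInput: input }
    }
    return {
      behavior: 'deny',
      message: `only ${FILE_EDIT_TOOL_NAME} on ${memoryPath} is allowed`,
    }
  }
}

const guard = createMemoryFileGuard('/notes/session-memory.md')

// Right tool, right file: allowed.
const allowed = guard({ name: 'Edit' }, { file_path: '/notes/session-memory.md' })
// Right tool, wrong file: denied.
const wrongFile = guard({ name: 'Edit' }, { file_path: '/notes/other.md' })
// Wrong tool, right file: denied.
const wrongTool = guard({ name: 'Write' }, { file_path: '/notes/session-memory.md' })
// Same file spelled differently: denied, because the check is string equality.
const respelled = guard({ name: 'Edit' }, { file_path: '/notes/./session-memory.md' })
```

The last case shows the cost of exact matching: a path that resolves to the same file but is spelled differently is rejected. For a guard like this, erring toward denial is the safer failure mode.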

Auto-Memory Uses A Similar Guard In A Different Service

createAutoMemCanUseTool() is not the session-memory guard above. It lives in services/extractMemories, a different service that writes durable notes into the auto-memory directory. It belongs in this chapter only because it shows the same design instinct: background writers still get a deliberately constrained permission callback instead of broad file access.

That separate service is still not a free write path. The forked agent gets a permission callback with a small set of effective operations:

REPL as a wrapper when REPL mode hides primitive tools
read-only Read, Grep, and Glob
read-only Bash commands
Edit and Write only when the target lives inside the memory directory

That is the security story: the path is permission-gated and deliberately constrained. The auto-memory worker can inspect the session and write its own durable notes, but only through the gate that keeps it from modifying the rest of the project. The broader-looking REPL entry is still safe because the inner primitive calls are re-checked by the same guard.

export function createAutoMemCanUseTool(memoryDir: string): CanUseToolFn {
  return async (tool: Tool, input: Record<string, unknown>) => {
    // Allow REPL — when REPL mode is enabled (ant-default), primitive tools
    // are hidden from the tool list so the forked agent calls REPL instead.
    // REPL's VM context re-invokes this canUseTool for each inner primitive
    // (toolWrappers.ts createToolWrapper), so the Read/Bash/Edit/Write checks
    // below still gate the actual file and shell operations. Giving the fork a
    // different tool list would break prompt cache sharing (tools are part of
    // the cache key — see CacheSafeParams in forkedAgent.ts).
    if (tool.name === REPL_TOOL_NAME) {
      return { behavior: 'allow' as const, updatedInput: input }
    }

    // Allow Read/Grep/Glob unrestricted — all inherently read-only
    if (
      tool.name === FILE_READ_TOOL_NAME ||
      tool.name === GREP_TOOL_NAME ||
      tool.name === GLOB_TOOL_NAME
    ) {
      return { behavior: 'allow' as const, updatedInput: input }
    }

    // Allow Bash only for commands that pass BashTool.isReadOnly.
    // `tool` IS BashTool here — no static import needed.
    if (tool.name === BASH_TOOL_NAME) {
      const parsed = tool.inputSchema.safeParse(input)
      if (parsed.success && tool.isReadOnly(parsed.data)) {
        return { behavior: 'allow' as const, updatedInput: input }
      }
      return denyAutoMemTool(
        tool,
        'Only read-only shell commands are permitted in this context (ls, find, grep, cat, stat, wc, head, tail, and similar)',
      )
    }

    if (
      (tool.name === FILE_EDIT_TOOL_NAME ||
        tool.name === FILE_WRITE_TOOL_NAME) &&
      'file_path' in input
    ) {
      const filePath = input.file_path
      if (typeof filePath === 'string' && isAutoMemPath(filePath)) {
        return { behavior: 'allow' as const, updatedInput: input }
      }
    }

    return denyAutoMemTool(
      tool,
      `only ${FILE_READ_TOOL_NAME}, ${GREP_TOOL_NAME}, ${GLOB_TOOL_NAME}, read-only ${BASH_TOOL_NAME}, and ${FILE_EDIT_TOOL_NAME}/${FILE_WRITE_TOOL_NAME} within ${memoryDir} are allowed`,
    )
  }
}

The tool names matter less than the pattern. Auto-memory is a background path with real permissions and tight bounds, not a hidden free-write feature.
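The REPL re-check is the least obvious part of this guard, so here is a toy model of it. Everything in the sketch is hypothetical: `ToolCall`, `canUse`, `runRepl`, the literal tool names, and the example paths are stand-ins. `isInsideMemoryDir` is a simplified substitute for isAutoMemPath, whose real implementation is not shown in this chapter. The Bash branch is omitted for brevity.

```typescript
// Simplified, hypothetical model of the auto-memory guard and REPL wrapper.
type ToolCall = { tool: string; filePath?: string }

// Stand-in for isAutoMemPath: a plain prefix check that also rejects any
// ".." segment. A real check would normalize the path first.
function isInsideMemoryDir(memoryDir: string, filePath: string): boolean {
  return (
    filePath.startsWith(memoryDir + '/') &&
    !filePath.split('/').some(segment => segment === '..')
  )
}

function canUse(memoryDir: string, call: ToolCall): boolean {
  // The REPL wrapper itself is allowed; its inner calls are checked below.
  if (call.tool === 'REPL') return true
  // Read-only tools are allowed anywhere.
  if (call.tool === 'Read' || call.tool === 'Grep' || call.tool === 'Glob') {
    return true
  }
  // Writes are allowed only inside the memory directory.
  if (call.tool === 'Edit' || call.tool === 'Write') {
    return (
      call.filePath !== undefined &&
      isInsideMemoryDir(memoryDir, call.filePath)
    )
  }
  return false
}

// A REPL invocation is modeled as a batch of inner primitive calls. Each one
// goes back through the same guard before it may run.
function runRepl(memoryDir: string, innerCalls: ToolCall[]): ToolCall[] {
  if (!canUse(memoryDir, { tool: 'REPL' })) return []
  return innerCalls.filter(call => canUse(memoryDir, call))
}

const executed = runRepl('/home/u/memory', [
  { tool: 'Read', filePath: '/project/src/index.ts' },
  { tool: 'Write', filePath: '/home/u/memory/notes.md' },
  { tool: 'Write', filePath: '/project/src/index.ts' },
  { tool: 'Write', filePath: '/home/u/memory/../.bashrc' },
])
// executed keeps only the Read and the Write inside the memory directory.
```

Allowing REPL therefore grants nothing by itself. Each inner call is checked against the same rules that apply to a direct call.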

Three Similar Paths, Three Different Jobs

This subtree has three related flows:

startup registers the session-memory hook
background session memory waits for the runtime gates and the threshold
manual extraction bypasses the threshold but keeps the same exact file guard

The auto-memory service is the useful contrast case. It also launches a background writer with tight permissions, but it stores durable notes in a different directory and does not reuse the session-memory file.

Takeaways

Session memory exists to preserve the useful background of a long conversation without bloating the main turn. Startup registers the background hook. Later, a hook run reaches the threshold check only if it comes from the main REPL thread and the feature gate is on, after the one-time cached-config load. Only when the threshold passes can a forked agent rewrite the single session-memory file through a strict tool guard.

Fun Facts

The background worker and the manual summary path both end up writing the same reusable session-memory file, but they enter through different gates.
Auto-memory still uses permissions, not trust, but it is a separate service writing to an auto-memory directory rather than the session-memory file.
