Why this matters
Claude Code keeps a reusable memory file so it does not have to rediscover the same session notes over and over. When the conversation gets long enough, the runtime can summarize useful details into that file in the background, then keep the main turn moving.
The important part is not just that memory exists. It is that the write path is
small, reusable, and guarded. The background worker cannot treat memory like a
free scratchpad, and the manual summary path goes through the same guard
instead of a looser one. A separate service called auto-memory follows the same
guarding style, but it writes to a different directory and solves a different problem.
Session memory stays in the background
Startup registers the hook once, later hook runs check the threshold, and only then can the forked agent rewrite the session-memory file through a strict guard.
Why The Memory File Exists
The reusable memory file is a shortcut for future turns. Instead of making the main agent carry every long-running detail in its active context, Claude Code can write a smaller markdown note that survives after the current turn ends. That keeps the session lighter and gives later turns a stable place to read the important background from.
Before looking at the code, remember the data shape the threshold code sees:
messages is the full transcript array for the session. The extraction logic
does not need to know every field on every message here. It only needs the
ordered conversation so it can count tokens and check whether the current turn
already used tools.
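A minimal sketch of that shape, under loud assumptions: `SketchMessage`, its `role` and `toolCalls` fields, and both helper names are hypothetical stand-ins, not the real `Message` type or the real counting helpers. Only the `uuid` field is taken from the code shown later in this chapter.

```typescript
// Hypothetical, simplified message shape. The real Message type has more fields.
type SketchMessage = {
  uuid: string
  role: 'user' | 'assistant'
  toolCalls: number // assumed: number of tool calls carried by this message
}

// Count tool calls in messages that come after the message with `sinceUuid`.
// If `sinceUuid` is undefined or not found, the whole transcript is counted.
function countToolCallsAfter(
  messages: SketchMessage[],
  sinceUuid: string | undefined,
): number {
  const index =
    sinceUuid === undefined
      ? -1
      : messages.findIndex(m => m.uuid === sinceUuid)
  return messages
    .slice(index + 1)
    .reduce((total, m) => total + m.toolCalls, 0)
}

// True when the most recent assistant message contains at least one tool call.
function lastAssistantTurnUsedTools(messages: SketchMessage[]): boolean {
  for (const message of [...messages].reverse()) {
    if (message.role === 'assistant') return message.toolCalls > 0
  }
  return false
}
```

The point of the sketch is that both questions can be answered from the ordered array alone, which is why the threshold logic does not need any other session state passed in.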
Initialization Registers The Hook First
initSessionMemory() runs during setup and makes the feature available. It does
not wait for the transcript threshold first. Its job is narrower: if the
runtime is local and auto-compact is enabled, register a post-sampling hook and
leave the real decision for later.
The other important guard is that initialization follows the auto-compact setting. Session memory is used for compaction, so if auto-compact is off, session memory stays off too. That keeps the feature aligned with the rest of the recovery stack.
export function initSessionMemory(): void {
  if (getIsRemoteMode()) return
  // Session memory is used for compaction, so respect auto-compact settings
  const autoCompactEnabled = isAutoCompactEnabled()
  // Log initialization state (ant-only to avoid noise in external logs)
  if (process.env.USER_TYPE === 'ant') {
    logEvent('tengu_session_memory_init', {
      auto_compact_enabled: autoCompactEnabled,
    })
  }
  if (!autoCompactEnabled) {
    return
  }
  // Register hook unconditionally - gate check happens lazily when hook runs
  registerPostSamplingHook(extractSessionMemory)
}
Initialization is therefore a registration step, not a background job.
extractSessionMemory is what actually does the work later, after the message
threshold and the runtime gate both agree that extraction is worth trying.
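The split between registering and running can be sketched with a toy registry. Everything here is an assumption made for illustration: `registerHook`, `runPostSamplingHooks`, and `initSketch` are hypothetical names, the hooks are synchronous to keep the example short (the real hook is async), and `initSketch` returns a boolean only so the outcome is easy to observe.

```typescript
// Hypothetical hook registry. Registration only stores the callback.
type HookContext = { querySource: string }
type PostSamplingHook = (context: HookContext) => string

const registeredHooks: PostSamplingHook[] = []

function registerHook(hook: PostSamplingHook): void {
  registeredHooks.push(hook)
}

// Stand-in for the runtime: call every registered hook after a model response
// and collect what each one reports.
function runPostSamplingHooks(context: HookContext): string[] {
  return registeredHooks.map(hook => hook(context))
}

// Mirrors the shape of the setup step: decide once whether to register, then
// return. Returns true when a hook was registered (for observability only).
function initSketch(isRemote: boolean, autoCompactEnabled: boolean): boolean {
  if (isRemote) return false
  if (!autoCompactEnabled) return false
  registerHook(context => `hook ran for ${context.querySource}`)
  return true
}
```

In this sketch, calling `initSketch(false, true)` does no extraction work at all. The stored callback only runs when `runPostSamplingHooks` is called later, which is the same division of labor the chapter describes.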
The Hook Adds Runtime Gates Before The Threshold
Registering the hook does not mean every later turn will extract memory.
extractSessionMemory() still checks where the event came from and whether the
feature gate is on, then performs its one-time lazy config load, all before it
ever asks about tokens.
const extractSessionMemory = sequential(async function (
  context: REPLHookContext,
): Promise<void> {
  const { messages, toolUseContext, querySource } = context
  // Only run session memory on main REPL thread
  if (querySource !== 'repl_main_thread') {
    // Don't log this - it's expected for subagents, teammates, etc.
    return
  }
  // Check gate lazily when hook runs (cached, non-blocking)
  if (!isSessionMemoryGateEnabled()) {
    // Log gate failure once per session (ant-only)
    if (process.env.USER_TYPE === 'ant' && !hasLoggedGateFailure) {
      hasLoggedGateFailure = true
      logEvent('tengu_session_memory_gate_disabled', {})
    }
    return
  }
  // Initialize config from remote (lazy, only once)
  initSessionMemoryConfigIfNeeded()
  if (!shouldExtractMemory(messages)) {
    return
  }
  // ... (the excerpt stops here; the rest of the function is not shown)
That is the full start condition in plain English:
- the event must come from the main REPL thread
- the session-memory feature gate must be enabled
- the service does its one-time cached-config load
- only then does the threshold logic get a vote
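The ordering in that list can be sketched as a small function that reports which gate stopped the run. This is an illustration under assumptions, not the real hook: `evaluateGates`, `GateOutcome`, `loadConfigOnce`, and `configLoadCount` are hypothetical names, and only the `'repl_main_thread'` string comes from the excerpt above.

```typescript
type GateOutcome =
  | 'skipped_not_main_thread'
  | 'skipped_gate_disabled'
  | 'skipped_below_threshold'
  | 'extract'

type GateInputs = {
  querySource: string
  gateEnabled: boolean
  thresholdMet: () => boolean // evaluated lazily, only if earlier gates pass
}

let configLoads = 0
let configLoaded = false

// Hypothetical one-time load: later calls are no-ops.
function loadConfigOnce(): void {
  if (configLoaded) return
  configLoaded = true
  configLoads += 1
}

function configLoadCount(): number {
  return configLoads
}

function evaluateGates(inputs: GateInputs): GateOutcome {
  if (inputs.querySource !== 'repl_main_thread') return 'skipped_not_main_thread'
  if (!inputs.gateEnabled) return 'skipped_gate_disabled'
  loadConfigOnce()
  return inputs.thresholdMet() ? 'extract' : 'skipped_below_threshold'
}
```

Because `thresholdMet` is a callback, the sketch never pays for token counting on a run that an earlier gate has already rejected, and the config load happens at most once no matter how many runs reach it.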
The Threshold Runs When The Hook Fires
Only after the hook passes those runtime gates does shouldExtractMemory() ask
whether the conversation is large enough and quiet enough to justify a
background summary. In other words, startup wires the trigger once, the hook
checks its environment, and then the threshold decides whether this turn is a
good extraction point.
export function shouldExtractMemory(messages: Message[]): boolean {
  // Check if we've met the initialization threshold
  // Uses total context window tokens (same as autocompact) for consistent behavior
  const currentTokenCount = tokenCountWithEstimation(messages)
  if (!isSessionMemoryInitialized()) {
    if (!hasMetInitializationThreshold(currentTokenCount)) {
      return false
    }
    markSessionMemoryInitialized()
  }
  // Check if we've met the minimum tokens between updates threshold
  // Uses context window growth since last extraction (same metric as init threshold)
  const hasMetTokenThreshold = hasMetUpdateThreshold(currentTokenCount)
  // Check if we've met the tool calls threshold
  const toolCallsSinceLastUpdate = countToolCallsSince(
    messages,
    lastMemoryMessageUuid,
  )
  const hasMetToolCallThreshold =
    toolCallsSinceLastUpdate >= getToolCallsBetweenUpdates()
  // Check if the last assistant turn has no tool calls (safe to extract)
  const hasToolCallsInLastTurn = hasToolCallsInLastAssistantTurn(messages)
  // Trigger extraction when:
  // 1. Both thresholds are met (tokens AND tool calls), OR
  // 2. No tool calls in last turn AND token threshold is met
  //    (to ensure we extract at natural conversation breaks)
  //
  // IMPORTANT: The token threshold (minimumTokensBetweenUpdate) is ALWAYS required.
  // Even if the tool call threshold is met, extraction won't happen until the
  // token threshold is also satisfied. This prevents excessive extractions.
  const shouldExtract =
    (hasMetTokenThreshold && hasMetToolCallThreshold) ||
    (hasMetTokenThreshold && !hasToolCallsInLastTurn)
  if (shouldExtract) {
    const lastMessage = messages[messages.length - 1]
    if (lastMessage?.uuid) {
      lastMemoryMessageUuid = lastMessage.uuid
    }
    return true
  }
  return false
}
The shape of the decision is simple:
- first, ask whether the session has crossed the initialization threshold
- then, require enough token growth since the last update
- then, look for either the tool-call threshold or a natural break in the last assistant turn
The important detail is that startup does not do this work. The threshold keeps protecting the session later, when the registered hook runs.
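The final boolean in shouldExtractMemory() can be isolated as a pure function to make the rule easier to check. This is a sketch: `decideExtraction` and `ThresholdSignals` are hypothetical names, and the three inputs stand in for the values the real function computes from the transcript.

```typescript
type ThresholdSignals = {
  tokenThresholdMet: boolean
  toolCallThresholdMet: boolean
  toolCallsInLastTurn: boolean
}

// Same expression as the excerpt: the token threshold appears in both branches,
// so it is always required.
function decideExtraction(signals: ThresholdSignals): boolean {
  return (
    (signals.tokenThresholdMet && signals.toolCallThresholdMet) ||
    (signals.tokenThresholdMet && !signals.toolCallsInLastTurn)
  )
}

// Equivalent factored form, shown only to make the "always required" rule obvious.
function decideExtractionFactored(signals: ThresholdSignals): boolean {
  return (
    signals.tokenThresholdMet &&
    (signals.toolCallThresholdMet || !signals.toolCallsInLastTurn)
  )
}
```

Read through the factored form: without enough token growth nothing happens, and with enough token growth extraction proceeds either when enough tool calls have accumulated or when the last assistant turn was free of tool calls.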
Manual Extraction Needs A Result Shape First
The manual summary command needs a small, explicit result object so the caller
can tell whether the summary worked and, if it did, where the memory file lives.
That is why ManualExtractionResult comes before the manual extraction path in
the code story.
export type ManualExtractionResult = {
  success: boolean
  memoryPath?: string
  error?: string
}
The fields mean:
- success says whether the manual extraction finished cleanly
- memoryPath gives the caller the exact memory file path when a summary was written
- error carries a human-readable failure reason when something goes wrong
When manuallyExtractSessionMemory(messages, toolUseContext) runs, it uses this
same result shape so the caller can see whether the write succeeded without
guessing from side effects.
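A caller might consume that result roughly as follows. This is a sketch: `describeResult` is a hypothetical helper and the message wording is invented for illustration. The type is repeated so the block stands alone.

```typescript
type ManualExtractionResult = {
  success: boolean
  memoryPath?: string
  error?: string
}

// Hypothetical caller-side helper: turn the result into a user-facing line
// without inspecting the file system.
function describeResult(result: ManualExtractionResult): string {
  if (result.success) {
    return result.memoryPath
      ? `Session memory written to ${result.memoryPath}`
      : 'Session memory updated'
  }
  return `Session memory extraction failed: ${result.error ?? 'unknown error'}`
}
```

Both optional fields are handled defensively because the type alone does not promise that `memoryPath` accompanies every success or that `error` accompanies every failure.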
The Manual Path Reads And Rewrites The Same File
Manual extraction does not get a looser permission model. It still uses the same guarded file path as the background worker, just with a different trigger. The manual path is allowed to bypass the threshold, but it still has to write through the same narrow tool lock.
export function createMemoryFileCanUseTool(memoryPath: string): CanUseToolFn {
  return async (tool: Tool, input: unknown) => {
    if (
      tool.name === FILE_EDIT_TOOL_NAME &&
      typeof input === 'object' &&
      input !== null &&
      'file_path' in input
    ) {
      const filePath = input.file_path
      if (typeof filePath === 'string' && filePath === memoryPath) {
        return { behavior: 'allow' as const, updatedInput: input }
      }
    }
    return {
      behavior: 'deny' as const,
      message: `only ${FILE_EDIT_TOOL_NAME} on ${memoryPath} is allowed`,
      decisionReason: {
        type: 'other' as const,
        reason: `only ${FILE_EDIT_TOOL_NAME} on ${memoryPath} is allowed`,
      },
    }
  }
}
The shape is intentionally strict. The worker may edit exactly one memory file, and only that file. Everything else is denied.
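To see how strict that is, here is a simplified, synchronous stand-in for the guard with a few sample calls in mind. The assumptions are loud ones: `createSingleFileGuard`, `Decision`, and the tool-name value `'Edit'` are hypothetical, and the real guard is async and takes a `Tool` object rather than a name.

```typescript
const EDIT_TOOL = 'Edit' // assumed value, for illustration only

type Decision =
  | { behavior: 'allow' }
  | { behavior: 'deny'; message: string }

function createSingleFileGuard(memoryPath: string) {
  return (toolName: string, input: unknown): Decision => {
    if (
      toolName === EDIT_TOOL &&
      typeof input === 'object' &&
      input !== null &&
      'file_path' in input &&
      (input as { file_path?: unknown }).file_path === memoryPath
    ) {
      return { behavior: 'allow' }
    }
    // Anything else falls through: other tools, other paths, malformed input.
    return {
      behavior: 'deny',
      message: `only ${EDIT_TOOL} on ${memoryPath} is allowed`,
    }
  }
}
```

Because the comparison is exact string equality, as in the excerpt above, even a differently spelled path to the same file (for example one containing a `..` segment) is denied in this sketch. The guard errs toward refusing rather than normalizing.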
Auto-Memory Uses A Similar Guard In A Different Service
createAutoMemCanUseTool() is not the session-memory guard above. It lives in
services/extractMemories, a different service that writes durable notes into
the auto-memory directory. It belongs in this chapter only because it shows the
same design instinct: background writers still get a deliberately constrained
permission callback instead of broad file access.
That separate service is still not a free write path. The forked agent gets a permission callback with a small set of effective operations:
- REPL as a wrapper when REPL mode hides primitive tools
- read-only Read, Grep, and Glob
- read-only Bash commands
- Edit and Write only when the target lives inside the memory directory
That is the security story. The auto-memory worker can read freely, because
Read, Grep, Glob, and read-only Bash are allowed without a path restriction,
and it can write its own durable notes, but every write has to pass the gate
that confines it to the memory directory. The broader-looking REPL entry is
still safe because the inner primitive calls are re-checked by the same guard.
export function createAutoMemCanUseTool(memoryDir: string): CanUseToolFn {
  return async (tool: Tool, input: Record<string, unknown>) => {
    // Allow REPL: when REPL mode is enabled (ant-default), primitive tools
    // are hidden from the tool list so the forked agent calls REPL instead.
    // REPL's VM context re-invokes this canUseTool for each inner primitive
    // (toolWrappers.ts createToolWrapper), so the Read/Bash/Edit/Write checks
    // below still gate the actual file and shell operations. Giving the fork a
    // different tool list would break prompt cache sharing (tools are part of
    // the cache key; see CacheSafeParams in forkedAgent.ts).
    if (tool.name === REPL_TOOL_NAME) {
      return { behavior: 'allow' as const, updatedInput: input }
    }
    // Allow Read/Grep/Glob unrestricted: all inherently read-only
    if (
      tool.name === FILE_READ_TOOL_NAME ||
      tool.name === GREP_TOOL_NAME ||
      tool.name === GLOB_TOOL_NAME
    ) {
      return { behavior: 'allow' as const, updatedInput: input }
    }
    // Allow Bash only for commands that pass BashTool.isReadOnly.
    // `tool` IS BashTool here, so no static import is needed.
    if (tool.name === BASH_TOOL_NAME) {
      const parsed = tool.inputSchema.safeParse(input)
      if (parsed.success && tool.isReadOnly(parsed.data)) {
        return { behavior: 'allow' as const, updatedInput: input }
      }
      return denyAutoMemTool(
        tool,
        'Only read-only shell commands are permitted in this context (ls, find, grep, cat, stat, wc, head, tail, and similar)',
      )
    }
    if (
      (tool.name === FILE_EDIT_TOOL_NAME ||
        tool.name === FILE_WRITE_TOOL_NAME) &&
      'file_path' in input
    ) {
      const filePath = input.file_path
      if (typeof filePath === 'string' && isAutoMemPath(filePath)) {
        return { behavior: 'allow' as const, updatedInput: input }
      }
    }
    return denyAutoMemTool(
      tool,
      `only ${FILE_READ_TOOL_NAME}, ${GREP_TOOL_NAME}, ${GLOB_TOOL_NAME}, read-only ${BASH_TOOL_NAME}, and ${FILE_EDIT_TOOL_NAME}/${FILE_WRITE_TOOL_NAME} within ${memoryDir} are allowed`,
    )
  }
}
The important thing is not the labels themselves. The important idea is that auto-memory is not a hidden free-write feature either. It is a background path with real permissions and tight bounds.
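The write branch above depends on isAutoMemPath(), whose implementation is not shown in this chapter. One way a directory-containment check of that kind could work is sketched below. This is an assumption, not the real helper: `isInsideDirectory` and `normalizeAbsolute` are hypothetical names, the sketch handles only absolute POSIX-style paths, and it does not resolve symlinks.

```typescript
// Resolve "." and ".." segments in an absolute POSIX-style path.
function normalizeAbsolute(path: string): string {
  const segments: string[] = []
  for (const part of path.split('/')) {
    if (part === '' || part === '.') continue
    if (part === '..') {
      segments.pop()
      continue
    }
    segments.push(part)
  }
  return '/' + segments.join('/')
}

// True when `filePath` names something strictly inside `directory`.
function isInsideDirectory(filePath: string, directory: string): boolean {
  if (!filePath.startsWith('/')) return false // sketch: absolute paths only
  const dir = normalizeAbsolute(directory)
  const file = normalizeAbsolute(filePath)
  // The trailing slash stops "/mem" from matching "/memory-other/...".
  const prefix = dir === '/' ? '/' : dir + '/'
  return file.startsWith(prefix)
}
```

Two details matter in a check like this: normalizing before comparing, so a `..` segment cannot escape the directory, and comparing against the directory plus a trailing separator, so a sibling directory with a similar name does not pass.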
Three Similar Paths, Three Different Jobs
This subtree has three related flows:
- startup registers the session-memory hook
- background session memory waits for the runtime gates and the threshold
- manual extraction bypasses the threshold but keeps the same exact file guard
The auto-memory service is the useful contrast case. It also launches a background writer with tight permissions, but it stores durable notes in a different directory and does not reuse the session-memory file.
Takeaways
Session memory exists to preserve the useful background of a long conversation without bloating the main turn. Startup registers the background hook. Later, only hook runs that come from the main thread, pass the feature gate, and complete the one-time config load can reach the threshold check. Only when that check passes can a forked agent rewrite the single session-memory file through a strict tool guard.
Fun Facts
- The background worker and the manual summary path both end up writing the same reusable session-memory file, but they enter through different gates.
- Auto-memory still uses permissions, not trust, but it is a separate service writing to an auto-memory directory rather than the session-memory file.