Why This Matters
These three service jobs look similar from a distance, but they do different work. Analytics records internal telemetry. GrowthBook stores cached feature and config values that steer runtime behavior. Rate-limit messages are the human-facing warnings the service layer shows after it interprets raw quota and utilization data.
Three Service Jobs, In Plain English
The simplest way to think about this chapter is:
- Internal telemetry stays inside the service layer because metadata shapes and sink boundaries guard what can leave the process, and queued delivery keeps startup decoupled from the backend.
- Cached feature values let the runtime read a useful answer immediately, even before the next network refresh.
- Rate-limit messaging turns structured quota state into words a person can understand.
That order matters because the code is not one story. It is a set of small service stories that only meet at the runtime boundary.
Safe Telemetry Starts With Data Boundaries
“Safe telemetry” here does not just mean buffered telemetry. It means the service code narrows what may be logged and strips special _PROTO_* fields before data reaches the general-purpose sinks.
export type AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS = never
/**
 * Marker type for values routed to PII-tagged proto columns via `_PROTO_*`
 * payload keys. The destination BQ column has privileged access controls,
 * so unredacted values are acceptable — unlike general-access backends.
 *
 * sink.ts strips `_PROTO_*` keys before Datadog fanout; only the 1P
 * exporter (firstPartyEventLoggingExporter) sees them and hoists them to the
 * top-level proto field. A single stripProtoFields call guards all non-1P
 * sinks — no per-sink filtering to forget.
 *
 * Usage: `rawName as AnalyticsMetadata_I_VERIFIED_THIS_IS_PII_TAGGED`
 */
export type AnalyticsMetadata_I_VERIFIED_THIS_IS_PII_TAGGED = never
/**
 * Strip `_PROTO_*` keys from a payload destined for general-access storage.
 * Used by:
 * - sink.ts: before Datadog fanout (never sees PII-tagged values)
 * - firstPartyEventLoggingExporter: defensive strip of additional_metadata
 *   after hoisting known _PROTO_* keys to proto fields — prevents a future
 *   unrecognized _PROTO_foo from silently landing in the BQ JSON blob.
 *
 * Returns the input unchanged (same reference) when no _PROTO_ keys present.
 */
export function stripProtoFields<V>(
  metadata: Record<string, V>,
): Record<string, V> {
  let result: Record<string, V> | undefined
  for (const key in metadata) {
    if (key.startsWith('_PROTO_')) {
      if (result === undefined) {
        result = { ...metadata }
      }
      delete result[key]
    }
  }
  return result ?? metadata
}
The plain-English model is:
- ordinary analytics calls should not smuggle code snippets or file paths by accident
- specially tagged proto fields are allowed only on the narrow path that is meant for them
- general sinks see the cleaned payload, not the privileged one
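The copy-on-write behavior of the strip is easy to exercise directly. This sketch re-declares stripProtoFields exactly as shown above and checks both paths; the payload keys are made up for the example:

```typescript
// Same copy-on-write strip as above: allocate a copy only when a
// _PROTO_* key is actually present.
function stripProtoFields<V>(metadata: Record<string, V>): Record<string, V> {
  let result: Record<string, V> | undefined
  for (const key in metadata) {
    if (key.startsWith('_PROTO_')) {
      if (result === undefined) {
        result = { ...metadata }
      }
      delete result[key]
    }
  }
  return result ?? metadata
}

const clean = { event: 'save', count: 2 }
const tagged = { event: 'save', _PROTO_file_path: '/tmp/x' }

// No _PROTO_ keys: the same reference comes back, no copy is made.
console.log(stripProtoFields(clean) === clean) // true

// Tagged payload: the privileged key is gone, and the input is untouched.
const stripped = stripProtoFields(tagged)
console.log('_PROTO_file_path' in stripped) // false
console.log('_PROTO_file_path' in tagged) // true
```

The same-reference fast path matters on a hot logging path: most payloads carry no privileged keys, so most calls allocate nothing.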
Analytics Waits For A Sink
Telemetry is not user-visible output. It is internal bookkeeping that gets queued first and delivered later, once startup has attached the sink that knows where events should go.
export type AnalyticsSink = {
  logEvent: (eventName: string, metadata: LogEventMetadata) => void
  logEventAsync: (
    eventName: string,
    metadata: LogEventMetadata,
  ) => Promise<void>
}
// Event queue for events logged before sink is attached
const eventQueue: QueuedEvent[] = []
// Sink - initialized during app startup
let sink: AnalyticsSink | null = null
This is the important part: the service is willing to accept events before the backend is ready, because it can hold them safely in memory for a moment.
export function attachAnalyticsSink(newSink: AnalyticsSink): void {
  if (sink !== null) {
    return
  }
  sink = newSink
  // Drain the queue asynchronously to avoid blocking startup
  if (eventQueue.length > 0) {
    const queuedEvents = [...eventQueue]
    eventQueue.length = 0
    // Log queue size for ants to help debug analytics initialization timing
    if (process.env.USER_TYPE === 'ant') {
      sink.logEvent('analytics_sink_attached', {
        queued_event_count: queuedEvents.length,
      })
    }
    queueMicrotask(() => {
      for (const event of queuedEvents) {
        if (event.async) {
          void sink!.logEventAsync(event.eventName, event.metadata)
        } else {
          sink!.logEvent(event.eventName, event.metadata)
        }
      }
    })
  }
}
export function logEvent(
  eventName: string,
  // intentionally no strings unless AnalyticsMetadata_I_VERIFIED_THIS_IS_NOT_CODE_OR_FILEPATHS,
  // to avoid accidentally logging code/filepaths
  metadata: LogEventMetadata,
): void {
  if (sink === null) {
    eventQueue.push({ eventName, metadata, async: false })
    return
  }
  sink.logEvent(eventName, metadata)
}
logEventAsync() follows the same shape. If the sink is not attached yet, it joins the queue instead of failing the call. That is how the service keeps startup from being coupled to analytics readiness.
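The queue-then-drain handshake can be sketched end to end. This is a simplified model of the pattern, not the real module: the metadata type is trimmed, logEventAsync is omitted, and it drains synchronously inside attach where the real code defers the drain to a microtask so startup is never blocked.

```typescript
// Minimal sketch of the queue-until-attached pattern.
type Metadata = Record<string, string | number>
type QueuedEvent = { eventName: string; metadata: Metadata }

type AnalyticsSink = {
  logEvent: (eventName: string, metadata: Metadata) => void
}

const eventQueue: QueuedEvent[] = []
let sink: AnalyticsSink | null = null

function logEvent(eventName: string, metadata: Metadata): void {
  if (sink === null) {
    // Backend not ready yet: hold the event instead of dropping it.
    eventQueue.push({ eventName, metadata })
    return
  }
  sink.logEvent(eventName, metadata)
}

function attachAnalyticsSink(newSink: AnalyticsSink): void {
  if (sink !== null) return // first sink wins; later attaches are no-ops
  sink = newSink
  // Deliver everything that arrived before the sink existed, in order.
  const queued = eventQueue.splice(0, eventQueue.length)
  for (const event of queued) {
    sink.logEvent(event.eventName, event.metadata)
  }
}

// Events logged before attach are not lost; they arrive once, in order.
const delivered: string[] = []
logEvent('startup_begin', { step: 1 })
logEvent('startup_config_loaded', { step: 2 })
attachAnalyticsSink({ logEvent: name => void delivered.push(name) })
console.log(delivered) // ['startup_begin', 'startup_config_loaded']
```

The splice-before-drain step is the important detail: the queue is emptied before delivery begins, so an event logged from inside a sink callback goes straight to the sink rather than back into the queue.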
GrowthBook Reads Cached Values
GrowthBook is the other half of the “quiet service” story. It does not just fetch values on demand; it also serves cached answers so the runtime can move forward before the next refresh lands.
export function onGrowthBookRefresh(
  listener: GrowthBookRefreshListener,
): () => void {
  let subscribed = true
  const unsubscribe = refreshed.subscribe(() => callSafe(listener))
  if (remoteEvalFeatureValues.size > 0) {
    queueMicrotask(() => {
      // Re-check: listener may have been removed, or resetGrowthBook may have
      // cleared the Map, between registration and this microtask running.
      if (subscribed && remoteEvalFeatureValues.size > 0) {
        callSafe(listener)
      }
    })
  }
  return () => {
    subscribed = false
    unsubscribe()
  }
}
That listener exists for long-lived objects that bake feature values into their own state. When the GrowthBook cache refreshes, those objects need to re-read the values and rebuild themselves if anything changed.
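A sketch of what such a long-lived object looks like, with a stand-in pub/sub and a hypothetical Uploader class standing in for the real GrowthBook module (the subscribe and refresh mechanics here are simplified for illustration):

```typescript
// Stand-ins for the real module: a listener set, a cached value, and a
// refresh trigger. The names mirror the source API shape.
type Listener = () => void

const listeners = new Set<Listener>()
let cachedBatchSize = 10

function getFeatureValue_CACHED_MAY_BE_STALE(feature: string, def: number): number {
  return feature === 'batch_size' ? cachedBatchSize : def
}

function onGrowthBookRefresh(listener: Listener): () => void {
  listeners.add(listener)
  return () => void listeners.delete(listener)
}

function simulateRefresh(newValue: number): void {
  cachedBatchSize = newValue
  for (const l of listeners) l()
}

// Hypothetical long-lived object that bakes a feature value into its state.
class Uploader {
  batchSize = getFeatureValue_CACHED_MAY_BE_STALE('batch_size', 10)
  private unsubscribe = onGrowthBookRefresh(() => {
    // Re-read on every refresh; rebuild only if the value changed.
    const next = getFeatureValue_CACHED_MAY_BE_STALE('batch_size', 10)
    if (next !== this.batchSize) this.batchSize = next
  })
  dispose(): void {
    this.unsubscribe() // long-lived subscribers must detach or they leak
  }
}

const uploader = new Uploader()
simulateRefresh(25)
console.log(uploader.batchSize) // 25
uploader.dispose()
```

The returned unsubscribe function is what makes this pattern safe for objects with shorter lifetimes than the process: dispose once, and stale callbacks stop firing.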
export function getFeatureValue_CACHED_MAY_BE_STALE<T>(
  feature: string,
  defaultValue: T,
): T {
  // Check env var overrides first (for eval harnesses)
  const overrides = getEnvOverrides()
  if (overrides && feature in overrides) {
    return overrides[feature] as T
  }
  const configOverrides = getConfigOverrides()
  if (configOverrides && feature in configOverrides) {
    return configOverrides[feature] as T
  }
  if (!isGrowthBookEnabled()) {
    return defaultValue
  }
  // Log experiment exposure if data is available, otherwise defer until after init
  if (experimentDataByFeature.has(feature)) {
    logExposureForFeature(feature)
  } else {
    pendingExposures.add(feature)
  }
  // In-memory payload is authoritative once processRemoteEvalPayload has run.
  // Disk is also fresh by then (syncRemoteEvalToDisk runs synchronously inside
  // init), so this is correctness-equivalent to the disk read below — but it
  // skips the config JSON parse and is what onGrowthBookRefresh subscribers
  // depend on to read fresh values the instant they're notified.
  if (remoteEvalFeatureValues.has(feature)) {
    return remoteEvalFeatureValues.get(feature) as T
  }
  // Fall back to disk cache (survives across process restarts)
  try {
    const cached = getGlobalConfig().cachedGrowthBookFeatures?.[feature]
    return cached !== undefined ? (cached as T) : defaultValue
  } catch {
    return defaultValue
  }
}
The “cached may be stale” part is not a bug. It is the point. The service is choosing a fast, good-enough answer over a slower blocking one when the caller does not need the freshest possible network read.
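Structurally, the lookup above is a “first defined answer wins” chain. A generic sketch of that shape, with illustrative layer data standing in for the real override, in-memory, and disk sources:

```typescript
// Walk the layers in priority order; the first one that produces a
// defined value wins, and the compiled-in default is the last resort.
function firstDefined<T>(layers: Array<() => T | undefined>, defaultValue: T): T {
  for (const layer of layers) {
    const value = layer()
    if (value !== undefined) return value
  }
  return defaultValue
}

// Assumed layer data for the example.
const envOverrides: Record<string, number> = {}
const memoryCache = new Map<string, number>([['batch_size', 32]])
const diskCache: Record<string, number> = { batch_size: 16 }

const batchSize = firstDefined(
  [
    () => envOverrides['batch_size'], // eval-harness override, highest priority
    () => memoryCache.get('batch_size'), // authoritative after a refresh
    () => diskCache['batch_size'], // survives restarts, may be stale
  ],
  8, // compiled-in default
)
console.log(batchSize) // 32
```

Each layer is a closure rather than a value so the cheaper, higher-priority sources can short-circuit the more expensive reads below them, which is the same reason the real function checks overrides before touching disk.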
ClaudeAILimits Turns Headers Into State
The quota side starts with raw response headers, but the runtime does not keep those headers as its working model. It turns them into a structured snapshot named ClaudeAILimits.
export type ClaudeAILimits = {
  status: QuotaStatus
  // unifiedRateLimitFallbackAvailable is currently used to warn users that set
  // their model to Opus whenever they are about to run out of quota. It does
  // not change the actual model that is used.
  unifiedRateLimitFallbackAvailable: boolean
  resetsAt?: number
  rateLimitType?: RateLimitType
  utilization?: number
  overageStatus?: QuotaStatus
  overageResetsAt?: number
  overageDisabledReason?: OverageDisabledReason
  isUsingOverage?: boolean
  surpassedThreshold?: number
}
That type matters because the warning code does not want to reason about raw header names. It wants one state object that says whether the user is allowed, warned, or rejected, plus the timing and utilization data needed for a useful message.
The key fields mean:
- `status` says whether the current quota state is fine, approaching a warning, or fully rejected
- `rateLimitType` names which limit window the message is about
- `utilization` says how full that window is, which is what the warning path checks before it shows a caution message
- `isUsingOverage` and `overageStatus` tell the warning code whether Claude is already spending extra usage and whether that extra pool is also nearing its limit
- `resetsAt` and `overageResetsAt` give the later text-formatting helpers the times they need to tell the user when usage opens again
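The header-to-snapshot step can be sketched as a small translation function. The header names below are invented for illustration (the chapter does not show the real mapping inside claudeAiLimits.ts); the point is only the shape of the translation, raw strings in, typed snapshot out:

```typescript
type QuotaStatus = 'allowed' | 'allowed_warning' | 'rejected'

// Trimmed version of the snapshot type for the example.
type ClaudeAILimits = {
  status: QuotaStatus
  unifiedRateLimitFallbackAvailable: boolean
  resetsAt?: number
  utilization?: number
}

// Hypothetical translation: every header name here is made up.
function parseLimits(headers: Map<string, string>): ClaudeAILimits {
  const status = headers.get('x-quota-status') // hypothetical header name
  const utilization = headers.get('x-quota-utilization') // hypothetical header name
  const resetsAt = headers.get('x-quota-resets-at') // hypothetical header name
  return {
    status: status !== undefined ? (status as QuotaStatus) : 'allowed',
    unifiedRateLimitFallbackAvailable:
      headers.get('x-fallback-available') === 'true',
    utilization: utilization !== undefined ? Number(utilization) : undefined,
    resetsAt: resetsAt !== undefined ? Number(resetsAt) : undefined,
  }
}

const limits = parseLimits(
  new Map([
    ['x-quota-status', 'allowed_warning'],
    ['x-quota-utilization', '0.82'],
  ]),
)
console.log(limits.status, limits.utilization) // allowed_warning 0.82
```

Doing the string parsing once, at the boundary, is what lets every downstream consumer work with numbers and enums instead of re-parsing header text.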
The Warning Layer Translates Raw Data
claudeAiLimits.ts reads the headers and keeps the current snapshot updated. rateLimitMessages.ts then converts that snapshot into readable text.
export function getRateLimitMessage(
  limits: ClaudeAILimits,
  model: string,
): RateLimitMessage | null {
  // Check overage scenarios first (when subscription is rejected but overage is available)
  // getUsingOverageText is rendered separately from warning.
  if (limits.isUsingOverage) {
    // Show warning if approaching overage spending limit
    if (limits.overageStatus === 'allowed_warning') {
      return {
        message: "You're close to your extra usage spending limit",
        severity: 'warning',
      }
    }
    return null
  }
  // ERROR STATES - when limits are rejected
  if (limits.status === 'rejected') {
    return { message: getLimitReachedText(limits, model), severity: 'error' }
  }
  // WARNING STATES - when approaching limits with early warning
  if (limits.status === 'allowed_warning') {
    // Only show warnings when utilization is above threshold (70%)
    // This prevents false warnings after week reset when API may send
    // allowed_warning with stale data at low usage levels
    const WARNING_THRESHOLD = 0.7
    if (
      limits.utilization !== undefined &&
      limits.utilization < WARNING_THRESHOLD
    ) {
      return null
    }
    // Don't warn non-billing Team/Enterprise users about approaching plan limits
    // if overages are enabled - they'll seamlessly roll into overage
    const subscriptionType = getSubscriptionType()
    const isTeamOrEnterprise =
      subscriptionType === 'team' || subscriptionType === 'enterprise'
    const hasExtraUsageEnabled =
      getOauthAccountInfo()?.hasExtraUsageEnabled === true
    if (
      isTeamOrEnterprise &&
      hasExtraUsageEnabled &&
      !hasClaudeAiBillingAccess()
    ) {
      return null
    }
    const text = getEarlyWarningText(limits)
    if (text) {
      return { message: text, severity: 'warning' }
    }
  }
  // No message needed
  return null
}
That function is the translation layer. It takes raw quota and utilization state and decides whether the user should see nothing, a warning, or an error. The wrapper helpers then route each kind of message to the surface where it belongs.
export function getRateLimitWarning(
  limits: ClaudeAILimits,
  model: string,
): string | null {
  const message = getRateLimitMessage(limits, model)
  // Only return warnings for the footer - errors are shown in AssistantTextMessages
  if (message && message.severity === 'warning') {
    return message.message
  }
  // Don't show errors in the footer
  return null
}
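The threshold decision can be exercised in isolation. This trims getRateLimitMessage down to just the status and utilization branches (the overage and subscription checks are omitted) so the 70% guard is visible on its own:

```typescript
type QuotaStatus = 'allowed' | 'allowed_warning' | 'rejected'
type Limits = { status: QuotaStatus; utilization?: number }
type Message = { message: string; severity: 'warning' | 'error' }

const WARNING_THRESHOLD = 0.7

// Trimmed re-statement of the decision above, for the example only.
function getMessage(limits: Limits): Message | null {
  if (limits.status === 'rejected') {
    return { message: 'Limit reached', severity: 'error' }
  }
  if (limits.status === 'allowed_warning') {
    // Stale allowed_warning at low usage: stay silent rather than cry wolf.
    if (
      limits.utilization !== undefined &&
      limits.utilization < WARNING_THRESHOLD
    ) {
      return null
    }
    return { message: 'Approaching usage limit', severity: 'warning' }
  }
  return null
}

console.log(getMessage({ status: 'allowed_warning', utilization: 0.3 })) // null
console.log(getMessage({ status: 'allowed_warning', utilization: 0.9 })?.severity) // warning
console.log(getMessage({ status: 'rejected' })?.severity) // error
```

Note that a missing utilization falls through to the warning: the guard only suppresses the message when the API positively reports low usage, not when the data is absent.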
Takeaway
The pattern is the same across all three services: keep the raw machinery inside the service layer, cache what needs to be fast, and translate the final state into something the rest of the app can safely consume.