Protect incoming KV prefix during live miss #448

Open
JordiPosthumus wants to merge 1 commit into antirez:main from JordiPosthumus:fix/protect-incoming-kv-prefix-cache-only

JordiPosthumus commented Jun 23, 2026

Summary

  • Protect disk KV checkpoints that are valid prefixes of the incoming prompt while storing old live state after a live-cache miss.
  • Add a regression test for the live-miss eviction case.

Why

On a live-cache miss, ds4-server stores the old live checkpoint before trying to load a disk fallback. Under disk pressure, that store can evict a checkpoint that prefixes the incoming prompt. When that happens, ds4 has to cold-prefill even though a valid disk fallback existed.

This change gives those incoming-prompt prefix checkpoints a very high eviction score during the pre-fallback store, so unrelated cache entries are evicted first.
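
The protection described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `kv_checkpoint` struct, its fields, and the functions `checkpoint_prefixes_prompt`, `protected_score`, and `pick_eviction_victim` are hypothetical names, and using `DBL_MAX` as the "very high" score is an assumption. The only behavior taken from the description is that the entry with the lowest score is evicted first and that prefixes of the incoming prompt are scored high enough to survive.

```c
#include <float.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checkpoint record; not ds4's actual struct. */
typedef struct {
    const int *tokens;  /* token ids the checkpoint covers */
    size_t ntokens;     /* number of tokens in the checkpoint */
    double base_score;  /* ordinary eviction score; lowest is evicted first */
} kv_checkpoint;

/* True when the checkpoint's tokens are a non-empty prefix of the prompt. */
static bool checkpoint_prefixes_prompt(const kv_checkpoint *c,
                                       const int *prompt, size_t nprompt) {
    if (prompt == NULL || c->ntokens == 0 || c->ntokens > nprompt) return false;
    for (size_t i = 0; i < c->ntokens; i++) {
        if (c->tokens[i] != prompt[i]) return false;
    }
    return true;
}

/* Assumed policy: prefixes of the incoming prompt get the maximum score,
 * every other entry keeps its ordinary score. */
static double protected_score(const kv_checkpoint *c,
                              const int *prompt, size_t nprompt) {
    return checkpoint_prefixes_prompt(c, prompt, nprompt) ? DBL_MAX
                                                          : c->base_score;
}

/* Returns the index of the entry to evict (lowest score).
 * The caller must pass n > 0. Passing prompt == NULL disables protection. */
static size_t pick_eviction_victim(const kv_checkpoint *entries, size_t n,
                                   const int *prompt, size_t nprompt) {
    size_t victim = 0;
    double lowest = protected_score(&entries[0], prompt, nprompt);
    for (size_t i = 1; i < n; i++) {
        double s = protected_score(&entries[i], prompt, nprompt);
        if (s < lowest) {
            lowest = s;
            victim = i;
        }
    }
    return victim;
}
```

In this sketch a checkpoint that prefixes the incoming prompt is only evicted when every remaining entry is also protected, which matches the intent that unrelated entries go first.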

Test

  • make ds4-server
  • make test

fry69 (Contributor) commented Jun 24, 2026

Can you please fix your PR text? It's garbled.

There are still humans reading PRs... (maybe not for long, though)

JordiPosthumus (Author)

OK, I spoke to my people! Better? :-D

srinathh added a commit to srinathh/ds4 that referenced this pull request Jul 4, 2026
Reduce --kv-disk-space-mb 409600 -> 4096. 2 GiB was too tight (a single
1.9 GiB heartbeat-context checkpoint filled the budget, leaving no room for
antirez#448/recency to protect prefixes). 4 GiB holds the heartbeat checkpoint plus
a working set of ~35k conversation prefix chains, so eviction pressure comes
from accumulation while leaving headroom for the fixes. Revert before prod/ tag.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
srinathh added a commit to srinathh/ds4 that referenced this pull request Jul 4, 2026
The eviction score (hits+1)*tokens/file_size has no recency reward, so a
just-written prefix waypoint scores ~the same as a stale one and loses to
larger files on density. Under budget pressure the small, recent waypoints the
next turn will reuse are evicted first, and a later live miss cold-reprefills
even though a valid disk prefix existed moments earlier (antirez#444).

Add a recency gate in ds4_kvstore_entry_eviction_score, reusing the existing
used_at (last_used or created_at) and hit half-life, no new persistent state:
 - a decaying recency boost (1 + K*2^(-age/halflife)) so recent waypoints
   outscore stale entries of the same density;
 - gate the ec6a82a superseded-continued devaluation on !recent, so a recent
   waypoint the next turn may still need is not devalued mid-conversation.

Complements PR antirez#448: antirez#448 protects the incoming prompt's prefix during the
pre-fallback store; the recency gate protects recent prefixes during the
prefill continued-stores too. Adds test_kv_cache_eviction_keeps_recent_over_stale.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
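
For readers following the scoring discussion, the two expressions quoted in the commit message above can be written together as below. This is a sketch under one assumption: that the recency boost multiplies the base density score, which the commit message does not state explicitly. Here K is the boost constant named in the message, age is the time since used_at, and h is the hit half-life.

```latex
\[
\text{score} \;=\; \frac{(\text{hits}+1)\cdot \text{tokens}}{\text{file\_size}}
\cdot \left(1 + K \cdot 2^{-\text{age}/h}\right)
\]
% Boost factor at a few ages:
%   age = 0        ->  1 + K
%   age = h        ->  1 + K/2
%   age -> infinity ->  1   (the score falls back to the plain density term)
```

Under that assumption, two entries of equal density differ only by the boost factor, so the more recently used one always scores higher and is evicted later.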
srinathh added a commit to srinathh/ds4 that referenced this pull request Jul 4, 2026
Down from 400 GB. The recency gate (previous commit) keeps recent shared
prefixes alive under eviction pressure, so a leaner budget is viable. Ships
the recency gate to production; upstream PR antirez#448 intentionally NOT included.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>