
fix prefix caching #4700

Merged
lvhan028 merged 1 commit into InternLM:main from grimoire:fix-prefix-caching on Jun 23, 2026

@grimoire

Collaborator

Fix SSM prefix-cache KV eviction so it does not release checkpoints that are still pinned by an in-flight save/restore.

Previously, BlockTrie.evict() selected leaves only by KV block refcount. For SSM nodes, a leaf could have an idle KV block (ref_count == 1) while its state checkpoint was still protected by state_ref_count > 0. Evicting that leaf called release_state_checkpoint() and hit the invariant error:

Cannot release a pinned SSM prefix-cache checkpoint.

The fix makes KV eviction skip leaves whose SSM checkpoint is pinned, for the current eviction attempt only. A skipped node remains in self.leaves, so once the save/restore pin is released, a later eviction can reclaim it normally. The fix also avoids freeing an empty eviction list when every candidate is skipped.

Added a regression test covering the pinned-checkpoint case: eviction returns 0 while the checkpoint is pinned, keeps the node cached, then successfully evicts it after the producer pin is released.
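As a rough illustration of the behavior described above, here is a minimal, self-contained sketch. It is not the code from lmdeploy/pytorch/paging/block_trie.py: the classes ToyNode, ToyAllocator and ToyTrie, their fields, and the evict(max_num_blocks) signature are all hypothetical stand-ins, assumed only for this example. It shows a leaf with an idle KV block being skipped while its checkpoint is pinned, no allocator call being made when nothing is evicted, and the same leaf being reclaimed once the pin is dropped.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyNode:
    """Hypothetical trie leaf; not lmdeploy's real node class."""
    block_id: int
    kv_ref_count: int = 1     # 1 is assumed to mean "held only by the cache"
    state_ref_count: int = 0  # > 0 is assumed to mean "checkpoint pinned by a save/restore"


@dataclass
class ToyAllocator:
    """Hypothetical allocator that only records what it was asked to free."""
    freed: List[int] = field(default_factory=list)
    free_calls: int = 0

    def free(self, block_ids: List[int]) -> None:
        self.free_calls += 1
        self.freed.extend(block_ids)


class ToyTrie:
    """Hypothetical stand-in for the eviction part of a block trie."""

    def __init__(self, allocator: ToyAllocator) -> None:
        self.allocator = allocator
        self.leaves: List[ToyNode] = []

    def evict(self, max_num_blocks: int) -> int:
        evicted: List[int] = []
        for node in list(self.leaves):
            if len(evicted) >= max_num_blocks:
                break
            if node.kv_ref_count != 1:
                continue  # KV block is still referenced elsewhere
            if node.state_ref_count > 0:
                continue  # pinned checkpoint: skip now, leave the node in self.leaves
            self.leaves.remove(node)
            evicted.append(node.block_id)
        if not evicted:
            return 0  # every candidate was skipped: do not touch the allocator
        self.allocator.free(evicted)
        return len(evicted)


# Scenario mirroring the regression test described above.
alloc = ToyAllocator()
trie = ToyTrie(alloc)
leaf = ToyNode(block_id=7, state_ref_count=1)  # idle KV block, pinned checkpoint
trie.leaves.append(leaf)

first = trie.evict(1)                  # pinned, so nothing is evicted
calls_while_pinned = alloc.free_calls  # allocator was not called
still_cached = leaf in trie.leaves     # node is kept for a later attempt

leaf.state_ref_count = 0               # the producer pin is released
second = trie.evict(1)                 # the leaf can now be reclaimed
```

In this sketch, skipping rather than removing the pinned leaf is what lets the second evict() call succeed without any extra bookkeeping.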

Copilot AI review requested due to automatic review settings June 23, 2026 03:26

Copilot AI left a comment


Pull request overview

Fixes SSM prefix-cache KV eviction to avoid evicting trie leaves whose SSM state checkpoints are still pinned (e.g., by an in-flight save/restore), preventing release_state_checkpoint() from raising on pinned checkpoints and allowing eviction to succeed once pins are released.

Changes:

  • Update BlockTrie.evict() to skip KV eviction for leaves with pinned SSM checkpoints (state_ref_count > 0).
  • Avoid calling the allocator free-path when no blocks were actually evicted (e.g., all candidates were skipped).
  • Add a regression test ensuring eviction returns 0 while pinned, preserves the node, then evicts after releasing the save pin.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| lmdeploy/pytorch/paging/block_trie.py | Skips pinned SSM checkpoint leaves during KV eviction and returns early when nothing is evicted. |
| tests/pytorch/paging/test_block_trie.py | Adds regression coverage for KV eviction behavior when an SSM checkpoint is pinned by a producer save ref. |


@lvhan028 lvhan028 merged commit 2a061aa into InternLM:main Jun 23, 2026
5 checks passed