
Optimize BaseResponseParser streaming and add parser benchmark#4697

Open
lvhan028 wants to merge 1 commit into
InternLM:mainfrom
lvhan028:improve-reasoning-parser
@lvhan028 (Collaborator)

No description provided.

Copilot AI review requested due to automatic review settings June 22, 2026 13:27

Copilot AI (Contributor) left a comment

Pull request overview

This PR optimizes streaming parsing in BaseResponseParser (reducing string concatenation overhead and avoiding repeated partial JSON parsing for tool calls) and adds a standalone benchmark script to measure streaming vs complete parsing performance.

Changes:

  • Avoid repeated partial_json_parser.loads() calls for growing tool-call payloads during streaming.
  • Replace _accumulated_text string concatenation with chunk accumulation in BaseResponseParser.
  • Add benchmark/benchmark_parser.py to benchmark streaming and complete parsing scenarios.
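The chunk-accumulation change listed above can be pictured with a minimal sketch. `ChunkAccumulator` and its `append`/`text` members are hypothetical names chosen for illustration; the PR's actual fields inside BaseResponseParser may look different. The idea is that appending to a list is cheap per chunk, while repeatedly extending a string attribute can copy the whole buffer on each step.

```python
class ChunkAccumulator:
    """Hypothetical helper (not lmdeploy code): list-based text accumulation."""

    def __init__(self):
        self._chunks = []    # streamed pieces, in arrival order
        self._cached = None  # joined text, rebuilt only when requested

    def append(self, chunk):
        # Ignore empty deltas so the cache stays valid when nothing changed.
        if chunk:
            self._chunks.append(chunk)
            self._cached = None

    @property
    def text(self):
        # Join lazily, then collapse the list to one element so a later
        # join does not walk over many small pieces again.
        if self._cached is None:
            self._cached = "".join(self._chunks)
            self._chunks = [self._cached] if self._cached else []
        return self._cached
```

A caller would invoke `append()` once per streamed delta and read `text` only when the full string is actually needed, for example at a tag boundary or at the end of the stream.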

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lmdeploy/serve/parsers/tool_parser/tool_parser.py | Skip redundant partial JSON parsing on non-final chunks after tool name is emitted. |
| lmdeploy/serve/parsers/response_parser.py | Accumulate streamed text via chunk list and reuse precomputed open-tag list in complete parsing. |
| benchmark/benchmark_parser.py | New benchmark script for parser streaming and complete parsing throughput. |
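To illustrate the tool_parser.py row, here is a rough sketch of skipping parse work on non-final chunks once the tool name has been emitted. Everything in it is an assumption made for illustration: `ToolCallStreamSketch`, `feed`, and the regex-based `_try_extract_name` are hypothetical stand-ins, and the regex replaces the real `partial_json_parser.loads()` call so the example needs only the standard library.

```python
import json
import re


class ToolCallStreamSketch:
    """Hypothetical stand-in; not lmdeploy's actual tool parser."""

    def __init__(self):
        self._buffer = []           # chunks of the tool-call payload
        self._name_emitted = False
        self.parse_calls = 0        # counts parse attempts, for comparison

    def _try_extract_name(self, text):
        # Stand-in for a partial JSON parse: find a completed "name" field.
        self.parse_calls += 1
        match = re.search(r'"name"\s*:\s*"([^"]*)"', text)
        return match.group(1) if match else None

    def feed(self, chunk, final=False):
        self._buffer.append(chunk)
        # Once the name is out, intermediate chunks need no parsing at all.
        if self._name_emitted and not final:
            return []
        text = "".join(self._buffer)
        if final:
            # The payload is complete, so one full parse is enough.
            self.parse_calls += 1
            call = json.loads(text)
            events = [("arguments", call["arguments"])]
            if not self._name_emitted:
                events.insert(0, ("name", call["name"]))
                self._name_emitted = True
            return events
        name = self._try_extract_name(text)
        if name is not None:
            self._name_emitted = True
            return [("name", name)]
        return []
```

With this shape, the number of parse attempts depends on how many chunks arrive before the name is complete, plus one final parse, rather than growing with the total number of chunks in the payload.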


if self._name_emitted and not final:
    return []

flags = Allow.ALL if final else Allow.ALL & ~Allow.STR
rcls = ReasoningParserManager.get(cfg.reasoning_parser)
reasoning_open = rcls.get_reasoning_open_tag()
reasoning_close = rcls.get_reasoning_close_tag()
rparser = rcls(enable_thinking=cfg.enable_thinking if cfg.enable_thinking else None)
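For readers unfamiliar with this kind of benchmark, a minimal timing harness for comparing streaming and complete parsing might look like the following. It is a sketch under stated assumptions and not the contents of benchmark/benchmark_parser.py: `split_chunks`, `bench`, `parse_complete`, and `parse_streaming` are hypothetical helpers, and the two parse functions are toy stand-ins for the real parser classes.

```python
import time


def split_chunks(text, size):
    """Cut text into fixed-size pieces to imitate streamed deltas."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def bench(fn, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def parse_complete(text):
    # Toy stand-in for a complete parse: split reasoning from the answer.
    head, _, tail = text.partition("</think>")
    return head.replace("<think>", ""), tail


def parse_streaming(chunks):
    # Toy stand-in for a streaming parse: accumulate chunks, split once.
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return parse_complete("".join(parts))
```

Taking the minimum over several runs reduces noise from other processes; reporting characters or chunks per second on top of the raw time is a natural next step.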