[feat] support opd rl by hjh0119 · Pull Request #9641 · modelscope/ms-swift

hjh0119 · 2026-06-25T02:56:53Z

No description provided.

gemini-code-assist

Code Review

This pull request implements Megatron On-Policy Distillation as RL (OPD-RL) by integrating teacher KL as a GRPO advantage across local and Ray-based GKD and GRPO trainers. It also introduces OpenEnvScheduler and OpenEnvWrapper to support multi-turn rollouts in OpenEnv environments. The review feedback highlights several critical issues, including missing imports and potential AttributeErrors in gkd_helpers.py, a rank-guarding mismatch in teacher_mixin.py that could cause runtime failures, and synchronous blocking calls in OpenEnvScheduler that should be run in separate threads to avoid blocking the asyncio event loop. Additionally, defensive checks are recommended to prevent potential IndexError, StopIteration, and type promotion issues.
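One of the review points above is that synchronous calls inside an async scheduler should run in separate threads so they do not block the asyncio event loop. A minimal sketch of that pattern follows, using the standard-library `asyncio.to_thread` (Python 3.9+). `BlockingEnv`, `step_without_blocking`, and `rollout` are hypothetical names for illustration only; they are not taken from this PR and do not reflect the real `OpenEnvScheduler` or `OpenEnvWrapper` interfaces.

```python
import asyncio
import time


class BlockingEnv:
    """Hypothetical stand-in for a synchronous environment client (not the PR's API)."""

    def step(self, action: str) -> str:
        time.sleep(0.05)  # simulates a blocking network or simulator call
        return f"obs:{action}"


async def step_without_blocking(env: BlockingEnv, action: str) -> str:
    # asyncio.to_thread runs the blocking call in a worker thread,
    # so the event loop stays free to schedule other coroutines.
    return await asyncio.to_thread(env.step, action)


async def rollout(actions: list[str]) -> list[str]:
    env = BlockingEnv()
    # The blocking steps overlap in worker threads instead of running back to back;
    # asyncio.gather returns results in the order the awaitables were passed.
    return await asyncio.gather(*(step_without_blocking(env, a) for a in actions))


def run_demo() -> list[str]:
    return asyncio.run(rollout(["a", "b", "c"]))
```

Calling `run_demo()` drives three steps concurrently. Whether a real environment client is safe to call from several threads at once is a separate question that this sketch does not address.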

hjh0119 added 2 commits June 23, 2026 21:53

init

ead42fc

support num_generation=1

7ad3148

gemini-code-assist Bot reviewed Jun 25, 2026

hjh0119 added 13 commits June 25, 2026 10:59

clean

b94f148

Merge branch 'main' into opd-rl

a8f112c

remove teacher mixin

15d7963

update

cf4528f

Merge branch 'main' into opd-rl

5f849bf

fix sp

540ed7c

clean

e949949

update args

12f7774

update doc& fix opsd+opd

ebea726

fix opd-opsd

57b611d

fix script

960d020

move script

d31c0d9

update en doc

fa55072

Jintao-Huang approved these changes Jun 29, 2026

hjh0119 added 2 commits June 29, 2026 20:45

fix doc

d1e7e12

fix dynamic opsd+opd-rl

958346e

hjh0119 merged commit 46f03e9 into modelscope:main Jun 29, 2026
3 checks passed

hjh0119 deleted the opd-rl branch June 29, 2026 15:01

hjh0119 mentioned this pull request Jun 30, 2026

When will GKD with RL be supported? #8069

Closed

1 task

hjh0119 merged 17 commits into modelscope:main from hjh0119:opd-rl
