This repository contains the implementation of ReNIO.
conda env create -f environment.yml
conda activate opsd
pip install flash-attn==2.8.3 --no-build-isolation
For the math task, the data can be downloaded from here. For the coding task, the data can be downloaded from here; we sample 30k code-domain examples from it.
Please put the training data in data/.
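The 30k coding subset can be produced in more than one way, and the exact sampling procedure is not stated here. As a minimal sketch, assuming the downloaded coding data is a JSONL file with one example per line and that GNU coreutils `shuf` is available, a uniform random subset could be drawn like this. The helper name `sample_jsonl` is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# sample_jsonl SRC DST N   (hypothetical helper, not part of this repository)
# Writes N lines drawn uniformly at random, without replacement, from the
# JSONL file SRC to DST. Assumes one example per line and GNU `shuf`.
# If SRC has fewer than N lines, all of its lines are written (shuffled).
sample_jsonl() {
  local src="$1" dst="$2" n="$3"
  mkdir -p "$(dirname "$dst")"   # create data/... directories if missing
  shuf -n "$n" "$src" > "$dst"
}
```

For example, `sample_jsonl path/to/downloaded_coding.jsonl data/openthoughts/openthoughts_coding_30k.jsonl 30000`, where the input path is a placeholder for wherever the downloaded file was saved.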
We provide training scripts in scripts/. Before running them, set model_name_or_path in each script to the path of your local model.
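To find where the model path is set, a small sketch using standard `grep` may help. It assumes the scripts contain the literal string model_name_or_path; the helper name `list_model_paths` is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# list_model_paths [DIR]   (hypothetical helper, not part of this repository)
# Prints file name, line number, and content of every line in DIR/*.sh
# that mentions model_name_or_path. DIR defaults to scripts.
# grep exits non-zero when nothing matches, so the function fails in that case.
list_model_paths() {
  local dir="${1:-scripts}"
  grep -Hn "model_name_or_path" "$dir"/*.sh
}
```

Running `list_model_paths` from the repository root lists the lines to edit.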
See scripts/run_grpo.sh.
See scripts/run_opsd_1b.sh, scripts/run_opsd_4b.sh, and scripts/run_opsd_8b.sh.
To enable ReNIO, run
CLIP=2.5 \
IMP=0.8 \
RENIO=True \
bash scripts/run_opsd_1b.sh
for OPSD training on the math task with Qwen3-1.7B. For the coding task, use
DATA="data/openthoughts/openthoughts_coding_30k.jsonl" \
TASK="coding" \
CLIP=2.5 \
IMP=0.8 \
RENIO=True \
bash scripts/run_opsd_1b.sh
for coding task training.
Here, RENIO=True enables ReNIO for training, while CLIP and IMP control the student-teacher log-ratio clip range and the threshold for key-token selection, respectively.
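Since the commands above pass these settings as environment variables, they can be wrapped in a small function when trying several values. This is only a sketch: the helper name `run_renio` is hypothetical, and it assumes the training script reads CLIP, IMP, and RENIO from the environment, as the commands above suggest.

```shell
#!/usr/bin/env bash
set -euo pipefail

# run_renio CLIP IMP [SCRIPT]   (hypothetical helper, not part of this repository)
# Launches SCRIPT with ReNIO enabled and the given CLIP and IMP values.
# SCRIPT defaults to scripts/run_opsd_1b.sh.
run_renio() {
  local clip="$1" imp="$2" script="${3:-scripts/run_opsd_1b.sh}"
  # Variables set before the command are exported to that command only.
  CLIP="$clip" IMP="$imp" RENIO=True bash "$script"
}
```

For example, `run_renio 2.5 0.8` reproduces the math-task command above. Any other values passed in would be experiments of your own, not settings recommended here.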
See eval/run_eval.sh.
Our implementation builds on OPSD.
