MMLU-Pro incorrect prompt format

## Describe the bug
As far as I can tell, the MMLU-Pro task prompts models incorrectly.
MMLU-Pro questions have up to 10 options (A through J), but the prompt reads ([link](https://github.com/huggingface/lighteval/blob/78dbee223ba56ce8b7a7a1ac4bbe841f7e10708f/src/lighteval/tasks/tasks/mmlu_pro.py#L39)):
> Answer the following multiple choice question. The last line of your response should be of the following format: 'Answer: $LETTER' (without quotes) where LETTER is one of ABCD. Think step by step before answering.

NOTE: It would also be worthwhile to check that the format of the few-shot examples is consistent with [the official harness](https://github.com/TIGER-AI-Lab/MMLU-Pro/tree/main). The pipeline is a bit messy to trace, but it looks to me like this also differs from how the official harness does it.
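
To illustrate the mismatch in the prompt quoted above, here is a minimal sketch of deriving the allowed letters from the number of options instead of hard-coding `ABCD`. The names `TEMPLATE` and `build_prompt` and the `A) ...` choice layout are hypothetical and are not LightEval's actual code or the official harness's format; only the instruction sentence is taken from the quoted prompt.

```python
import string

# Hypothetical template: the instruction text from the quoted prompt, with the
# hard-coded "ABCD" replaced by a {letters} placeholder. The question/choices
# layout below is an assumption for illustration only.
TEMPLATE = (
    "Answer the following multiple choice question. The last line of your "
    "response should be of the following format: 'Answer: $LETTER' (without "
    "quotes) where LETTER is one of {letters}. Think step by step before "
    "answering.\n\n{question}\n\n{choices}"
)


def build_prompt(question: str, options: list[str]) -> str:
    """Build a prompt whose allowed letters match the number of options."""
    # One letter per option: 4 options -> "ABCD", 10 options -> "ABCDEFGHIJ".
    letters = string.ascii_uppercase[: len(options)]
    choices = "\n".join(
        f"{letter}) {option}" for letter, option in zip(letters, options)
    )
    return TEMPLATE.format(letters=letters, question=question, choices=choices)
```

With ten options, the instruction then tells the model that any of `ABCDEFGHIJ` is a valid answer, so the gold letter is always among the letters the prompt permits.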

## To Reproduce
I don't have a nice MWE for this. I was benchmarking DeepSeek-V4-Flash via vLLM using both the official TIGER-AI-Lab harness and LightEval, and noticed a 15% drop in performance in the LightEval run when reasoning was disabled. Interestingly, enabling reasoning seems to let the model cope with the quirks of the LightEval implementation.
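
In lieu of a full MWE, a rough sanity check along these lines could flag the questions affected by the prompt. It is a sketch under assumptions: `PROMPT` is the instruction text quoted in this issue, the helper names are hypothetical, and each row is assumed to carry its gold letter in an `answer` field.

```python
import re

# Instruction text as quoted in this issue.
PROMPT = (
    "Answer the following multiple choice question. The last line of your "
    "response should be of the following format: 'Answer: $LETTER' (without "
    "quotes) where LETTER is one of ABCD. Think step by step before answering."
)


def allowed_letters(prompt: str) -> set[str]:
    """Extract the letters the prompt tells the model it may answer with."""
    match = re.search(r"LETTER is one of ([A-Z]+)", prompt)
    return set(match.group(1)) if match else set()


def unreachable_rows(rows: list[dict], prompt: str) -> list[dict]:
    """Return rows whose gold answer is not among the prompt's allowed letters."""
    allowed = allowed_letters(prompt)
    # Assumption: each row stores its gold letter under the "answer" key.
    return [row for row in rows if row["answer"] not in allowed]
```

Running `unreachable_rows` over the evaluation split would show how many questions have a gold answer that the prompt explicitly rules out.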

## Expected behavior
The model should be given the same prompts in LightEval as in the original TIGER-AI-Lab harness. As it stands, the task is more difficult in LightEval.

## Version info
I used LightEval v0.13.


MMLU-Pro incorrect prompt format #1265
