
Question about length bias mitigation in pairwise evaluation #20

@liamjxu

Hi, thanks for open-sourcing the code, it looks great.

I have a question about the length bias mitigation part in the WB reward metric.

It seems that the code truncates both the reference output and the model output to a fixed word count.

I'm curious how the following rule from the paper is implemented:

> converting outcomes of “slightly better/worse” to “tie” if the winner’s response exceeds the loser’s by more than K characters.

Or am I missing something here?
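
To make the question concrete, here is a minimal sketch of how I would expect that rule to look. This is only my reading of the sentence in the paper, not code from the repository: the function name `mitigate_length_bias`, the outcome labels, and the parameter `k` are all hypothetical.

```python
# Hypothetical outcome labels, expressed from the model's point of view.
SLIGHT_OUTCOMES = {"slightly_better", "slightly_worse"}


def mitigate_length_bias(outcome: str, model_output: str,
                         reference_output: str, k: int) -> str:
    """Turn a 'slightly better/worse' outcome into 'tie' when the winner's
    response is more than k characters longer than the loser's."""
    if outcome not in SLIGHT_OUTCOMES:
        # "much better/worse" and "tie" are left untouched.
        return outcome

    # Work out which side won so the length gap is measured winner minus loser.
    if outcome == "slightly_better":
        winner, loser = model_output, reference_output
    else:
        winner, loser = reference_output, model_output

    if len(winner) - len(loser) > k:
        return "tie"
    return outcome
```

With `k = 500`, for example, a "slightly better" verdict for a model response that is 550 characters longer than the reference would become a tie, while a "much better" verdict would stay as it is regardless of length.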

Thank you in advance!
