LLMTest

Evaluate LLM performance with metrics including F1 score, ROUGE-L, and perplexity.
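For readers unfamiliar with these metrics, the following is a minimal sketch of how each is commonly defined. It is not taken from LLMTest's source: the function names (`token_f1`, `rouge_l_f1`, `perplexity`) and the lowercase whitespace tokenization are assumptions made for illustration, and the library's actual scoring may differ.

```python
import math
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over shared tokens."""
    pred = prediction.lower().split()  # assumption: whitespace tokenization
    ref = reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def lcs_length(a, b) -> int:
    """Length of the longest common subsequence, via row-by-row dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction: str, reference: str) -> float:
    """ROUGE-L as an F1 score built on the longest common subsequence of tokens."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return float(pred == ref)
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def perplexity(token_log_probs) -> float:
    """Perplexity: exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

Higher F1 and ROUGE-L values indicate closer agreement with the reference, while lower perplexity indicates that the model assigns higher probability to the text.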

All commands below use uv.

Installation (quick, recommended)

uv pip install git+ssh://git@github.com/SJTU-DDST/LLMTest.git

Installation (editable, for modifying the source)

Download the repository into the 3rd folder (here, as a git submodule)

git submodule add git@github.com:SJTU-DDST/LLMTest.git 3rd/llmtest
# git submodule update --init --recursive

Install

# uv venv / uv sync
uv pip install -e 3rd/llmtest

Usage

Create test.py with the following content

from LLMTest import LLMTest

# from LLMTest import change_log_level
# change_log_level("DEBUG")

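# Stand-in model: returns the same answer for every prompt.
# Replace the body with a call to your own LLM.
def LLM(prompts):
    return ["The Answer is C"] * len(prompts)

# Load the high_school_biology subset of the cais/mmlu dataset.
tester = LLMTest("cais/mmlu", "high_school_biology")
# Fetch a batch of prompts, answer them, then score the answers.
batch_id, prompts = tester.get()
answers = LLM(prompts)
score = tester.score(batch_id, answers)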

print(score)

uv run test.py
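The usage example above passes a list of prompts to `LLM` and gets back one answer per prompt. If your model backend only accepts a limited number of prompts per call, a small adapter can keep that contract. The sketch below is an illustration, not part of LLMTest: `make_llm`, `generate_batch`, and `batch_size` are hypothetical names, and the one-answer-per-prompt requirement is inferred from the stub in the example rather than from documentation.

```python
from typing import Callable, List, Sequence


def make_llm(
    generate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 8,
) -> Callable[[Sequence[str]], List[str]]:
    """Wrap a size-limited backend so it answers any number of prompts in order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    def llm(prompts: Sequence[str]) -> List[str]:
        answers: List[str] = []
        # Send the prompts in consecutive chunks of at most batch_size.
        for start in range(0, len(prompts), batch_size):
            chunk = list(prompts[start:start + batch_size])
            out = generate_batch(chunk)
            # Guard against a backend that drops or duplicates answers.
            if len(out) != len(chunk):
                raise ValueError(
                    f"expected {len(chunk)} answers, got {len(out)}"
                )
            answers.extend(out)
        return answers

    return llm
```

With such a wrapper, `LLM = make_llm(my_backend, batch_size=4)` could stand in for the stub in test.py, where `my_backend` is whatever function calls your model.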

Development

uv pip install -e .
uv run tests/test.py
