Popular repositories
- BizFinBench Public
A Business-Driven Real-World Financial Benchmark for Evaluating LLMs
- MME-Finance Public
[MM 2025] A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning
- BizFinBench.v2 Public
BizFinBench.v2: A Unified Offline–Online Bilingual Benchmark for Expert-Level Financial Capability Evaluation of LLMs
- PuzzleClone Public
PuzzleClone: An SMT-Powered Framework for Synthesizing Verified Mathematical Reasoning Data
Python 5
Repositories (HiThink-Research)
- GAGE Public
General AI evaluation and Gauge Engine. A unified evaluation engine for LLMs, MLLMs, audio, and diffusion models.
- CCPO Public
Compress2Focus: Efficient Coordinate Compression for Policy Optimization in Multi-Turn GUI Agents
- FinMTM Public
FinMTM: A Multi-Turn Multimodal Benchmark for Financial Reasoning and Agent Evaluation
- BizFinBench.v2 Public
BizFinBench.v2: A Unified Offline–Online Bilingual Benchmark for Expert-Level Financial Capability Evaluation of LLMs
- PuzzleClone Public
PuzzleClone: An SMT-Powered Framework for Synthesizing Verified Mathematical Reasoning Data
- MME-Finance Public
[MM 2025] A Multimodal Finance Benchmark for Expert-level Understanding and Reasoning
- NEXUS-O Public
[MM 2025] NEXUS-O: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision
- Published_Papers Public