fix(aa-index): scrape App Router RSC payload instead of __NEXT_DATA__ by xack20 · Pull Request #97 · Andyyyy64/whichllm

xack20 · 2026-06-09T09:46:52Z

PR: fix(aa-index): scrape App Router RSC payload instead of `__NEXT_DATA__`

Base: Andyyyy64/whichllm:main · Head: fix/aa-index-rsc-scraper

Summary

The Artificial Analysis Intelligence Index source has been silently failing on
every run. artificialanalysis.ai migrated its leaderboard to the Next.js
App Router, which no longer embeds a <script id="__NEXT_DATA__"> blob — model
data now streams via self.__next_f.push([n, "…"]) (RSC) chunks. The scraper's
_NEXT_DATA_RE regex never matches, so fetch_aa_index_scores raises
ExtractionFailed("__NEXT_DATA__ payload not found"), and build_scores()
logs:

AA Index fetch failed, will use fallback: __NEXT_DATA__ payload not found

then falls back to the frozen AA_INDEX_FALLBACK_2026_05_14 snapshot. The CLI
still prints its output because the failure is caught, but the "current" AA
tier is stale on every invocation.

Changes

All in src/whichllm/models/benchmark_sources/aa_index.py:

- Parse the App Router RSC stream. _decode_rsc_blob() concatenates and
  unescapes the self.__next_f.push([n, "…"]) chunks; _extract_aa_pairs_from_html()
  pulls every {"name", …, "intelligenceIndex"} record out with a bounded
  regex (the payload is a flat RSC stream, not one parseable JSON document, so
  the middle of the record regex forbids a second "name":" to avoid leaking
  across records).
- Canonicalize variant-suffixed names. AA now labels models like
  "Qwen3 14B (Reasoning)", "GLM-5 (Non-reasoning)", "gpt-oss-20B (high)".
  _canonical_name() strips parentheticals and normalizes separators/case so
  they map back onto the existing AA_NAME_TO_HF_IDS table. This lifts live
  name→HF coverage from 8 → ~46 models without enlarging the table.
- Overlay live over the curated fallback. A successful live fetch now
  merges on top of get_aa_curated_fallback() (live wins where both exist),
  so a fetch can only add coverage; it can never shrink the AA tier below
  the snapshot. Previously, replacing the ~72-entry snapshot with ~8 exact
  matches would have regressed rankings.
- Keep __NEXT_DATA__ as a secondary fallback in case the site format
  changes again.
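
The first bullet can be illustrated with a minimal sketch, assuming each chunk is a JSON-compatible string literal inside `self.__next_f.push([n, "…"])`. The `PUSH_RE` pattern and the `decode_rsc_blob` name are illustrative stand-ins, not the PR's actual `_decode_rsc_blob()`, and the sample HTML and score are made up.

```python
import json
import re

# Hypothetical pattern: matches self.__next_f.push([n, "..."]) and captures the
# quoted string literal, including its surrounding quotes, as group "s".
PUSH_RE = re.compile(
    r'self\.__next_f\.push\(\[\s*\d+\s*,\s*(?P<s>"(?:[^"\\]|\\.)*")\s*\]\)'
)


def decode_rsc_blob(html: str) -> str:
    """Concatenate and unescape every push() chunk found in the page."""
    parts = []
    for m in PUSH_RE.finditer(html):
        try:
            # json.loads turns the quoted literal into the unescaped text.
            parts.append(json.loads(m.group("s")))
        except ValueError:
            continue  # skip chunks that are not valid JSON string literals
    return "".join(parts)


# Made-up page fragment: one record split across two chunks.
sample_html = (
    '<script>self.__next_f.push([1,"{\\"name\\":\\"Qwen3 14B (Reasoning)\\","])</script>'
    '<script>self.__next_f.push([1,"\\"intelligenceIndex\\":36}"])</script>'
)
blob = decode_rsc_blob(sample_html)
print(blob)
```

Because a record can straddle two chunks, the chunks have to be joined before any record-level extraction runs.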

Verification

Against the live page (2026-06-09): live fetch returns 78 merged scores
(≥72 fallback baseline), with 12 models refreshed by live data and 6 new
ones not in the snapshot (GLM-4.7, MiniMax-M2, MiMo-V2-Flash, …). The
AA Index fetch failed warning no longer appears.

Tests

Adds tests/test_aa_index.py — fully offline (httpx.MockTransport), covering:

- name canonicalization (variant + separator stripping),
- RSC chunk decoding and record extraction (incl. the no-leak boundary),
- canonical-name → HF mapping through fetch_aa_index_scores,
- the merge-over-fallback coverage guarantee,
- the ExtractionFailed path when no records are found.

uv run pytest          # 298 passed

🤖 Generated with Claude Code

artificialanalysis.ai migrated to the Next.js App Router, which no longer embeds a `<script id="__NEXT_DATA__">` blob. The AA Intelligence Index fetcher's regex never matched, so every run logged `AA Index fetch failed ... __NEXT_DATA__ payload not found` and silently fell back to the frozen 2026-05-14 snapshot; live scores stopped flowing into the rankings.

Changes (src/whichllm/models/benchmark_sources/aa_index.py):

- Parse the App Router RSC stream: concatenate + unescape the `self.__next_f.push([n, "..."])` chunks and pull every `{"name", ..., "intelligenceIndex"}` record with a bounded regex.
- Canonicalize AA's variant-suffixed display names (`(Reasoning)`, `(Non-reasoning)`, `(high)`, effort/date tags) before mapping to HuggingFace ids; lifts live name->HF coverage from 8 to ~46 models.
- Overlay live scores on top of the curated fallback so a successful live fetch can only add coverage, never shrink it below the snapshot.
- Keep the legacy `__NEXT_DATA__` extraction as a secondary fallback.

Adds tests/test_aa_index.py (offline, httpx.MockTransport) covering name canonicalization, RSC decoding/extraction, canonical-name mapping, the merge-over-fallback guarantee, and the no-records error path. Full suite: 298 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Updates the Artificial Analysis Intelligence Index integration to work with the site’s newer Next.js App Router (RSC) payload format, restoring live fetching while retaining a curated fallback.

Changes:

- Add RSC scraping via self.__next_f.push(...) chunk decoding + bounded record extraction.
- Add canonicalization for AA display names to improve mapping to HF IDs.
- Merge live scores over a curated snapshot fallback, and add offline tests for the new behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `src/whichllm/models/benchmark_sources/aa_index.py` | Implements RSC scraper, canonical-name matching, and live+fallback merge logic. |
| `tests/test_aa_index.py` | Adds offline unit tests for RSC decoding/extraction, name canonicalization, and merging behavior. |
| `CHANGELOG.md` | Documents the fix restoring live AA Index fetching and the new RSC parsing approach. |


+    # Overlay live scores on top of the curated snapshot so a successful live
+    # fetch can only ADD coverage, never shrink it below the fallback. Live
+    # numbers win wherever both exist; the snapshot fills the long tail of
+    # models AA labels in a way we can't map (or no longer tracks).
+    scores = get_aa_curated_fallback()
+    for hf_id, normalized in live.items():
+        if normalized > scores.get(hf_id, 0.0):
+            scores[hf_id] = normalized
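
The comparison in this hunk keeps whichever value is larger, so a live score lower than the snapshot value would not replace it, while the comment says live numbers win wherever both exist. The PR does not state which policy is intended. The sketch below contrasts the two on made-up ids and scores; both function names are hypothetical.

```python
def overlay_max(fallback: dict[str, float], live: dict[str, float]) -> dict[str, float]:
    """Mirror of the hunk's comparison: keep the larger of snapshot and live."""
    scores = dict(fallback)
    for hf_id, value in live.items():
        if value > scores.get(hf_id, 0.0):
            scores[hf_id] = value
    return scores


def overlay_live_wins(fallback: dict[str, float], live: dict[str, float]) -> dict[str, float]:
    """Strict overlay: a live value always replaces the snapshot value."""
    return {**fallback, **live}


# Made-up data: model-a has a LOWER live score than its snapshot score.
fallback = {"org/model-a": 40.0, "org/model-b": 30.0}
live = {"org/model-a": 35.0, "org/model-c": 50.0}

max_merged = overlay_max(fallback, live)
live_merged = overlay_live_wins(fallback, live)
print(max_merged["org/model-a"], live_merged["org/model-a"])
```

Both variants keep every snapshot key, so the coverage guarantee holds either way; they differ only in which score survives when a live value has dropped.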


+_AA_RECORD_RE = re.compile(
+    r'"name":"(?P<name>(?:[^"\\]|\\.)*)"'
+    r'(?:(?!"name":").)*?'
+    r'"intelligenceIndex":(?P<idx>-?\d+(?:\.\d+)?)',
+    re.DOTALL,
+)
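
To see what the `(?:(?!"name":").)*?` guard does, the pattern can be run on a small made-up blob in which the first record has no `intelligenceIndex`. The naive pattern used for comparison is hypothetical and not part of the PR.

```python
import re

# Same pattern as in the hunk above.
AA_RECORD_RE = re.compile(
    r'"name":"(?P<name>(?:[^"\\]|\\.)*)"'
    r'(?:(?!"name":").)*?'
    r'"intelligenceIndex":(?P<idx>-?\d+(?:\.\d+)?)',
    re.DOTALL,
)

# Hypothetical unguarded variant, for comparison only.
NAIVE_RE = re.compile(
    r'"name":"(?P<name>[^"]*)".*?"intelligenceIndex":(?P<idx>-?\d+(?:\.\d+)?)',
    re.DOTALL,
)

# Made-up stream: "Model A" carries no score, "Model B" does.
blob = (
    '{"name":"Model A","slug":"model-a"}'
    '{"name":"Model B","intelligenceIndex":41.5}'
)

guarded = [(m.group("name"), float(m.group("idx"))) for m in AA_RECORD_RE.finditer(blob)]
naive = [(m.group("name"), float(m.group("idx"))) for m in NAIVE_RE.finditer(blob)]
print(guarded)  # [('Model B', 41.5)]
print(naive)    # [('Model A', 41.5)]
```

The unguarded pattern attaches Model B's score to Model A; the lookahead makes the match fail for Model A and restart at the next `"name":"`.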


+# Canonical-name -> HF ids, derived once from AA_NAME_TO_HF_IDS. Several display
+# names can collapse to one canonical key; we union their HF ids.
+_AA_CANON_TO_HF_IDS: dict[str, list[str]] = {}
+for _disp, _ids in AA_NAME_TO_HF_IDS.items():
+    _AA_CANON_TO_HF_IDS.setdefault(_canonical_name(_disp), []).extend(_ids)
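
A minimal sketch of how such a canonical table could be built, assuming the canonicalizer strips parentheticals, lowercases, and collapses separator runs to a single hyphen. The `canonical_name` body is a guess at the behavior described in the PR summary, not the actual `_canonical_name()`, and the name table and HF ids are placeholders.

```python
import re


def canonical_name(display: str) -> str:
    """Hypothetical canonicalizer: drop "(...)" tags, lowercase, unify separators."""
    no_parens = re.sub(r"\s*\([^)]*\)", "", display)
    return re.sub(r"[^a-z0-9]+", "-", no_parens.lower()).strip("-")


# Placeholder table; the real AA_NAME_TO_HF_IDS entries are not shown in the PR page.
NAME_TO_HF_IDS = {
    "Qwen3 14B": ["example-org/qwen3-14b"],
    "qwen3-14b": ["example-org/qwen3-14b-alt"],
}

canon_to_hf_ids: dict[str, list[str]] = {}
for disp, ids in NAME_TO_HF_IDS.items():
    # Display names that collapse to the same key have their id lists unioned.
    canon_to_hf_ids.setdefault(canonical_name(disp), []).extend(ids)

print(canonical_name("Qwen3 14B (Reasoning)"))  # qwen3-14b
print(canon_to_hf_ids)
```

Note that `extend` would leave duplicate ids if two display names shared one; whether that matters depends on how the caller consumes the list.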


+        try:
+            parts.append(json.loads(m.group("s")))
+        except (ValueError, json.JSONDecodeError):
+            continue


Copilot AI review requested due to automatic review settings June 9, 2026 09:46

Copilot AI reviewed Jun 9, 2026


samarthpatel24 mentioned this pull request Jun 9, 2026

fix: restore AA live fetch after RSC migration (#87) #96

Open

xack20 wants to merge 1 commit into Andyyyy64:main from xack20:fix/aa-index-rsc-scraper


2 participants
