fix(validator): pick issue-bonus tier from live mirror maintainers, not stale per-issue snapshot#1485
Conversation
…nt, not stale per-issue author_association The mirror snapshots each linked issue's `author_association` at ingest time and never refreshes it. If an issue's author becomes a maintainer (or stops being one) after the issue was filed, every PR that links to that issue keeps using the stored snapshot's tier — even though the mirror's `/repos/:repo/maintainers` endpoint already knows the current role. Concrete miss: matthewevans and plind-junior on phase-rs/phase. Every one of their stored PR/issue rows has `author_association = CONTRIBUTOR` (they were private members at ingest), so any solver PR linking to their issues silently gets the standard tier (1.33×) instead of the maintainer tier (1.66×). Only opening a NEW PR/issue after the role flip would have been seen as MEMBER — every historical row stays wrong forever. This routes `_calculate_issue_multiplier`'s maintainer check through the mirror's `/repos/:repo/maintainers` endpoint instead — the same source the `maintainer_cut` carve-out already trusts in `build_maintainer_uids_by_repo`. Identity treatment is now unified across both bonus channels. Changes: - `gittensor/utils/mirror/client.py`: new `MirrorClient.get_maintainer_github_ids(repo)` returning a `frozenset[str]`, cached per-client instance (client lifetime == one scoring round). On `MirrorRequestError` returns an empty `frozenset` — conservative: caller treats the repo as having no identifiable maintainers, so the tier defaults to standard. - `gittensor/validator/oss_contributions/mirror/scoring.py`: thread the maintainer set through `score_pr` → `_calculate_pr_multipliers` → `_calculate_issue_multiplier`. The function now checks `li.author_github_id in maintainer_github_ids` instead of `li.author_association in MAINTAINER_ASSOCIATIONS`. - The maintainer-author drop in `_should_skip_merged_mirror_pr` at line 270 still uses `MAINTAINER_ASSOCIATIONS` on the PR itself — that's a different gate and intentionally unchanged. Tests: - New `TestGetMaintainerGithubIds` in test_mirror_client.py: cache hit on second call, separate repos each fetch once, mirror failure → empty frozenset (no raise), failure is cached too, drops entries with empty github_id. - Existing issue-multiplier tests updated to pass the set explicitly. - Two new tests in test_scoring.py prove the fix: a stale `CONTRIBUTOR` snapshot still gets the maintainer tier when the author is currently in the set, and a stale `OWNER` snapshot drops to standard when the author has left. Gates: ruff/format ✓, pyright 0/0, vulture clean, 945 pytest pass.
|
Closing — this is now fixed at the mirror layer instead. Your diagnosis is correct: das-github-mirror#192 (deployed) resolves the linked-issue author's association at serve time against a live Closing. |
Summary
_calculate_issue_multiplierdecides whether a linked issue grants the maintainer tier (maintainer_issue_multiplier, default 1.66) or the standard tier (standard_issue_multiplier, default 1.33) by readingli.author_associationoff the mirror's stored copy of the linked issue.That field is snapshotted at mirror-ingest time and never refreshed. If an issue's author becomes a maintainer (or stops being one) after the issue was filed, every PR that links to that issue keeps using the stored snapshot's role — even though the mirror's
/repos/:repo/maintainersendpoint already knows the current role and themaintainer_cutcarve-out already uses it.Concrete miss on
phase-rs/phase: matthewevans (id 1388610) and plind-junior (id 59729252) are both currently in the mirror's maintainer list (MEMBER/COLLABORATOR). But all of their stored historical PR/issue rows haveauthor_association = CONTRIBUTOR— they were private collaborators when the rows were ingested. So any solver PR linking to one of their issues gets the standard 1.33× tier instead of the maintainer 1.66×, indefinitely. Only a brand-new issue ingested after the role flip would have the right tier — and even then only for new linkers.Fix
Route
_calculate_issue_multiplier's tier check through the mirror's/repos/:repo/maintainersendpoint — the same sourcebuild_maintainer_uids_by_repoalready trusts for the carve-out. Identity treatment is now unified across both bonus channels.maintainer_github_idsis built once per scoring round by a newMirrorClient.get_maintainer_github_ids(repo) -> frozenset[str]that wrapsget_repo_maintainerswith a per-instance cache. First PR for a repo in a round hits the mirror; subsequent PRs in the same repo are O(1) dict lookups. OnMirrorRequestErrorthe method returns an emptyfrozenset(conservative — the tier silently defaults to standard rather than upgrading on stale data).Related Issues
N/A. The bug surfaces in production scoring; see the description above for the matthewevans/plind-junior reproduction on
phase-rs/phase. No prior issue or PR targets this specific stale-snapshot path.Type of Change
What this does NOT change
author_associationfield is still stored on linked issues and still written to the DB. Other consumers (analytics, audit) are unaffected._should_skip_merged_mirror_prstill usesMAINTAINER_ASSOCIATIONSon the PR itself (not on a linked issue), and is intentionally left as-is.entrius/das-github-mirror. This PR makes the validator resilient to that staleness for the issue-bonus tier, the same way the maintainer carve-out is already resilient.Testing
uv run pytest— 945 passed.uv run pyright— 0 errors, 0 warnings.uv run ruff check/ruff format --check— clean.uv run vulture— clean.New tests:
TestGetMaintainerGithubIds(6 tests): cache hits, per-repo isolation, mirror-failure fallback to empty frozenset, failure result also cached, drops entries with empty github_id.test_stale_contributor_association_still_gets_maintainer_tier: a storedCONTRIBUTORsnapshot whose author is currently in the maintainer set now correctly returns the maintainer multiplier.test_stale_owner_association_drops_to_standard_when_not_currently_maintainer: a storedOWNERsnapshot whose author has since departed correctly drops to the standard multiplier.Existing
_calculate_issue_multiplier/_calculate_pr_multiplierstests were updated to pass the maintainer set explicitly; the legacy preference-when-multiple-valid test (legacy parity for #673) still asserts the same behavior, just sourcing maintainer identity from the set rather than the stored field.Checklist