
feat(scoring): crown-dominant quality-weighted-volume blend with a single credibility term#449

Open
anderdc wants to merge 14 commits into test from feat/crown-depth-quality-bonus


@anderdc anderdc commented Jun 4, 2026

Collaborator

What & why

Reshapes the reward from a crown-time base with bolted-on correction factors into a crown-dominant blend of availability and realized service, and collapses the two reliability terms into one. Builds on the original depth-quality work (the self-referential market reference, retained below).

reward = pool · credibility · (λ·qvol_share + (1−λ)·crown_share·cap·quality)
  • qvol_share — the miner's share of the network's quality-weighted swap volume (per direction): each completed swap's TAO leg scaled by quality_factor(clearing_rate). This is the service half — real TAO moved, at the rate it cleared. Volume now earns (a non-crown server collects the λ slice) instead of merely gating crown.
  • crown_share·cap·quality — the availability half: best-rate hold time, scaled by capacity and the per-block depth quality (the original 449 weighting, retained).
  • λ = QVOL_REWARD_WEIGHT = 0.3 (crown-DOMINANT on purpose). Realized volume is sybil/wash-inflatable until swap counterparties are verifiable, so the volume share is kept small. Raising λ toward volume is gated on wash filtering (documented at the constant). Going volume-dominant before wash is filtered would entrench wash farms: winner-take-all crown protects honest rate competition better than volume-share does against fakeable volume.
  • Adaptive λ — with no realized volume (bootstrap) crown takes the full pool; with volume but no crown holder, volume takes it. An empty component never silently recycles its share.
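A minimal sketch of the blend and the adaptive λ described in the bullets above. The function names and argument shapes are hypothetical; only the formula and `QVOL_REWARD_WEIGHT = 0.3` come from this PR, and the real `calculate_miner_rewards` may normalize shares differently.

```python
QVOL_REWARD_WEIGHT = 0.3  # λ, value taken from this PR


def effective_lambda(total_qvol, total_crown, lam=QVOL_REWARD_WEIGHT):
    """Adaptive λ: an empty component hands its slice to the other one."""
    if total_qvol <= 0:
        return 0.0  # bootstrap: no realized volume, crown takes the full pool
    if total_crown <= 0:
        return 1.0  # volume but no crown holder: volume takes the full pool
    return lam


def miner_reward(pool, credibility, qvol_share, crown_term, total_qvol, total_crown):
    """reward = pool * credibility * (λ*qvol_share + (1-λ)*crown_term).

    crown_term stands for crown_share * cap * quality, already multiplied out.
    """
    lam = effective_lambda(total_qvol, total_crown)
    return pool * credibility * (lam * qvol_share + (1.0 - lam) * crown_term)
```

With both components present, a miner holding half the quality-weighted volume and the full crown term collects `0.3 * 0.5 + 0.7 * 1.0` of the pool before credibility is applied.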

The depth reference (from the original feature, unchanged)

The quality_factor is anchored to a self-referential per-direction market reference: a trimmed, volume- and recency-weighted (EMA half-life) average of the subnet's own completed-swap clearing rates. No oracle, no contract change. Recomputed each round from the stored clearing_rate column — no persisted EMA state, so every validator derives a byte-identical value (stable total-order sort + math.fsum, now_block = window_end). Trim + volume-weight + per-miner cap defend it against single-actor manipulation. Disabled below QUALITY_N_MIN (factor 1.0) so it's a guaranteed no-op until real swap history accrues.
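As a rough illustration of how such a reference can be made order-independent, here is a sketch assuming a symmetric trim by rate and an exponential half-life decay. The function name, the tuple layout, and every parameter are hypothetical, and the per-miner cap is omitted for brevity; this is not the PR's `compute_quality_reference`.

```python
import math


def quality_reference(swaps, now_block, half_life_blocks, trim_fraction, n_min):
    """swaps: iterable of (block, miner, tao_volume, clearing_rate) tuples."""
    # Total-order sort on (rate, block, miner, volume): input order cannot matter.
    rows = sorted(swaps, key=lambda s: (s[3], s[0], s[1], s[2]))
    if len(rows) < n_min:
        return None  # below the minimum sample size: reference disabled
    k = int(len(rows) * trim_fraction)
    kept = rows[k:len(rows) - k]  # drop the k lowest and k highest rates
    # Weight = volume, halved for every half_life_blocks of age.
    weights = [vol * 0.5 ** ((now_block - block) / half_life_blocks)
               for block, _miner, vol, _rate in kept]
    total = math.fsum(weights)
    if total <= 0.0:
        return None
    # math.fsum is exactly rounded, so the sum does not depend on term order.
    return math.fsum(w * row[3] for w, row in zip(weights, kept)) / total
```

Because nothing is read from a wall clock or from persisted state, two callers given the same rows and the same `now_block` get the same float.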

quality_factor ∈ [QUALITY_FLOOR, 1.0]: floor at/below market, full at QUALITY_ANCHOR (5%) deeper, linear between; never below the floor.
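The piecewise shape can be sketched as follows. `QUALITY_FLOOR = 0.5` is assumed from the original commit message ("floored at 0.5") and `QUALITY_ANCHOR = 0.05` is the placeholder named in this PR; expressing the input as a relative `depth` is a simplification for illustration, since the real helper takes a clearing rate.

```python
QUALITY_FLOOR = 0.5    # assumed from the original commit message
QUALITY_ANCHOR = 0.05  # placeholder value named in this PR


def quality_factor(depth):
    """depth: relative amount by which a rate beats the reference (0.05 == 5% deeper)."""
    if depth <= 0.0:
        return QUALITY_FLOOR  # at or below market: floor
    if depth >= QUALITY_ANCHOR:
        return 1.0  # at or beyond the anchor: full credit
    # Linear interpolation between the floor and 1.0.
    return QUALITY_FLOOR + (1.0 - QUALITY_FLOOR) * (depth / QUALITY_ANCHOR)
```

Under these values a rate 2.5% deeper than the reference sits halfway between floor and full credit, about 0.75.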

Reliability: one term, not two

Merges success_rate³ × credibility_ramp into a single credibility (the two double-counted timeouts). Drops SUCCESS_EXPONENT and the success_rate helper. The >2-timeout cliff is preserved.

⚠️ Known follow-up: the timeout cliff is an absolute count, so it's volume-unaware — a high-volume miner is killed by 3 timeouts (~0.3% fail) while a 3-swap/2-timeout miner is fully credible. Fine while crown dominates; should be volume-scaled before λ grows. Flagged in the credibility docstring.
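The asymmetry is easy to see with numbers. This sketch models only the cliff, not the full `credibility` term, and the 1000-swap figure is an illustrative assumption consistent with the ~0.3% quoted above.

```python
MAX_TIMEOUTS = 2  # the ">2-timeout cliff" from this PR


def past_cliff(timeouts):
    """Absolute-count cliff: volume plays no part in the decision."""
    return timeouts > MAX_TIMEOUTS


def failure_rate(swaps, timeouts):
    return timeouts / swaps


# Hypothetical high-volume miner: 1000 swaps, 3 timeouts.
high_volume = (past_cliff(3), failure_rate(1000, 3))
# Low-volume miner from the note above: 3 swaps, 2 timeouts.
low_volume = (past_cliff(2), failure_rate(3, 2))
```

The high-volume miner fails far less often yet is the one past the cliff, which is why a volume-scaled threshold is flagged as the follow-up.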

Changes

  • scoring.py: quality_weighted_volume helper; per-direction qvol + adaptive-λ blend in calculate_miner_rewards; credibility replaces success_rate/SUCCESS_EXPONENT/volume_factor/VOLUME_WEIGHT_ALPHA.
  • constants.py: QVOL_REWARD_WEIGHT = 0.3 (with the wash-gating note); removed SUCCESS_EXPONENT, VOLUME_WEIGHT_ALPHA.
  • scoring_trace.py: trace surfaces crown_share, qvol_share, credibility, quality; success_rate → credibility plumbing.
  • state_store.py / event_watcher.py: the depth feature's clearing_rate column (+ idempotent migration), per-direction query, and SwapCompleted population (unchanged from the original feature).

Scope / not in this PR

No wash filtering. swap_outcomes records no counterparty, so self-dealing volume is unfilterable from the event stream — that needs a contract change to emit the depositor. The crown-dominant λ is the interim guard. Wash-resistance + a volume-dominant λ are the v2 direction.

⚠️ Calibration

QUALITY_ANCHOR (0.05) is a placeholder — calibrate against live rate dispersion before trusting it. QVOL_REWARD_WEIGHT (0.3) is a deliberately conservative starting point.

Tests

Reference math (incl. shuffle-invariance), quality_factor/quality_weighted_volume/credibility unit tests, and e2e blend/credibility/capacity/quality weighting via calculate_miner_rewards. Full suite green; ruff clean.

anderdc and others added 14 commits June 3, 2026 20:50
… factor

The crown is winner-take-all per direction: the best-rate miner earns the
full pool whether their rate is 0.5% or 30% below market, so there's no
marginal incentive to quote deeper. Scale crown reward by how far the rate
beats a per-direction "market" reference, floored at 0.5 (forgiving, like
volume_factor); the unearned remainder recycles.

The reference is self-referential: a trimmed, volume-weighted, recency-decayed
average of the subnet's own completed-swap clearing rates — no oracle, no
contract change. Trim + volume weight + per-miner cap defend it against wash
manipulation.

- storage: clearing_rate column on swap_outcomes (CREATE + idempotent ALTER),
  populated at SwapCompleted from the swap's snapshotted rate;
  get_clearing_rates_by_direction_since excludes 0-rate (legacy/timed-out) rows
- scoring: compute_quality_reference / quality_factor helpers; a
  quality_weighted_blocks accumulator parallel to cap_weighted_blocks, folded
  into reward as base * vol_factor * quality (independent multiply)
- determinism: stable total-order sort + math.fsum, now_block=window_end, no
  wall-clock — every validator must compute a byte-identical reference
- bootstrap: reference disabled below QUALITY_N_MIN observations (factor 1.0),
  so the feature is a no-op at deploy until real swap history accrues

Scope: depth only. No breadth/rank-spillage. No sigma-gate edge-sitter
exclusion — that belongs to the reservation-bid-window work, which makes
edge-sitters arbitrageable so the market polices them.

QUALITY_ANCHOR (0.05) is a placeholder — calibrate against live rate
dispersion before trusting it.

Tests: 23 new (reference math incl. shuffle-invariance, factor shape, e2e
weighting incl. bootstrap/direction/stacking, storage round-trip, watcher
population). Full suite 685 pass.
…divergence) (#451)

* Preserve the synchronous reserve-time pin instead of overwriting it

The event watcher re-read the miner's commitment at the reservation's
on-chain inclusion block and overwrote the pin the reserve handler had
already written. When a miner moves its rate between the handler's
quote-validation read and the inclusion of vote_reserve, the re-read
captures a different rate than the one the on-chain to_amount was reserved
against. That settlement rate then diverges from the reserved amount and
the user is short-changed at confirm (swap 2405: reserved at 370, pinned to
280, settled ~24% low).

The synchronous pin is written at the instant the user's quote was
validated, so it is the authoritative reserve-time rate. Make the watcher's
read a pure backfill: only write the settlement pin when none exists.

A NOTE/TODO documents that this relies on the single-validator invariant
(one authoritative synchronous pin) and that the multi-validator fix is to
bind (reserve_block, rate) into the reservation at quorum and verify it
against CommitmentOf(block) within slippage, which requires a contract
iteration.

Scoring-overlay pin events are intentionally left reading the canonical
block (separate scoring workstream).

* Harden the backfill guard against stale prior-reservation pins

The reserve-time backfill guard skipped writing whenever ANY pin existed
for the miner, keyed on miner_hotkey alone. A pin left over from a prior
reservation — one abandoned without a swap and not yet swept by
purge_expired_reservation_pins — would then be inherited by the next
reservation. A failed synchronous write, or an event replay on a fresh DB
that sees the abandoned MinerReserved before the live one, would settle
the new swap against the stale reservation's rate AND addresses.

Key the guard on reserved_until, which is distinct per reservation: only
preserve a pin that belongs to THIS reservation; backfill over one whose
TTL differs. Keeps the swap-2405 fix (same-reservation re-reads still
don't clobber the synchronous pin).
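The guard can be pictured as a small predicate. The `Pin` dataclass and `should_backfill` are hypothetical names for illustration, not the watcher's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pin:
    miner_hotkey: str
    rate: float
    reserved_until: int  # distinct per reservation, so it identifies the pin's owner


def should_backfill(existing: Optional[Pin], reserved_until: int) -> bool:
    """Write the watcher's pin only if no pin exists for THIS reservation."""
    if existing is None:
        return True  # the synchronous write failed or never happened
    # A pin with a different TTL is a leftover from a prior reservation.
    return existing.reserved_until != reserved_until
```

Keying on `miner_hotkey` alone would return False in the stale case and let the new swap inherit the old reservation's rate and addresses.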

New test_stale_prior_reservation_pin_is_backfilled covers the overwrite;
test_existing_synchronous_pin_is_not_overwritten still passes.

* Trim verbose pin-guard comments (reasoning lives in PR #451)

* Tighten pin-guard comment to 3 lines

---------

Co-authored-by: anderdc <me@alexanderdc.com>
…#452)

* fix(miner): retain unmarked send cache, with bounded deadline cleanup

cleanup_stale_sends deleted a destination-send cache entry whenever the
swap left the poller's active set, even with mark_fulfilled not yet
landed. A transient get_swap() read gap (3 misses -> drop) followed by
rediscovery then made process_swap broadcast destination funds a second
time. Retain unmarked entries so a reappearing swap retries mark_fulfilled
instead of resending; discard them only once the chain is provably past
the swap's last-known (extended) deadline, so genuinely-resolved swaps
don't leak in the on-disk cache forever.
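A sketch of the retention rule, with a hypothetical function name and flat arguments; the real `cleanup_stale_sends` works on cache entries and its exact comparison may differ.

```python
def should_discard_send(marked_fulfilled, swap_active, current_block,
                        last_known_deadline, margin_blocks):
    """Decide whether a destination-send cache entry may be dropped."""
    if swap_active:
        return False  # still tracked by the poller: always keep
    if marked_fulfilled:
        return True  # mark_fulfilled landed: nothing left to retry
    # Unmarked and out of the active set: keep until the chain is provably
    # past the (possibly extended) deadline, so a rediscovered swap retries
    # mark_fulfilled instead of sending funds again.
    return current_block > last_known_deadline + margin_blocks
```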

Fixes #353

* chore: gitignore .mcp.json, *.md, and .vouch/

* chore: trim SENT_CACHE_DISCARD_MARGIN_BLOCKS comment to 2 lines

---------

Co-authored-by: anderdc <me@alexanderdc.com>
* Accept Taproot BTC addresses via embit in address validation

is_valid_address validated with the bech32 package, which implements only
BIP-173 bech32 (witness v0), so every Taproot address (bc1p…, witness v1 /
bech32m) failed the checksum. That rejected all TAO->BTC swaps paying out to a
Taproot wallet — at the CLI pre-check and validator confirm — with a misleading
"Invalid destination address format" (fixes #448).

Route is_valid_address and to_mainnet_address through embit (already a
dependency, already used for the send path). embit validates/encodes all
address types offline — no RPC — so the hand-rolled bech32/base58 logic is
replaced and the direct bech32 dependency is dropped (it stays installed
transitively via bitcoin-message-tool). to_mainnet_address output is verified
byte-identical to the prior behavior for every type that reaches it.
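To show why a BIP-173-only library rejects Taproot, here is a self-contained sketch of the checksum step from BIP-173 and BIP-350. The two encodings differ only in the constant XORed into the checksum. This checks the checksum alone (not witness-program length or version rules) and is not how embit is invoked in this PR.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32_CONST = 1            # BIP-173, witness v0
BECH32M_CONST = 0x2BC830A3  # BIP-350, witness v1+ (Taproot)
GEN = (0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3)


def polymod(values):
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GEN[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def encode(hrp, data, const):
    """Append a 6-symbol checksum computed with the given constant."""
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ const
    checksum = [(pm >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)


def checksum_kind(addr):
    """Return 'bech32', 'bech32m', or None if neither checksum matches."""
    hrp, _, body = addr.lower().rpartition("1")
    if not hrp or len(body) < 6 or any(c not in CHARSET for c in body):
        return None
    data = [CHARSET.index(c) for c in body]
    const = polymod(hrp_expand(hrp) + data)
    return {BECH32_CONST: "bech32", BECH32M_CONST: "bech32m"}.get(const)
```

A validator that only accepts the `bech32` result fails every `bc1p…` string, which matches the symptom described above.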

Diagnosis and the BIP-350-library fix direction came from the issue reporter.

* Hoist embit imports to module top

---------

Co-authored-by: anderdc <me@alexanderdc.com>
… can't drift (#454)

* Extract shared crown-eligibility predicates into make_crown_predicates

The scoring replay (replay_crown_time_window) and the live-crown snapshot
(snapshot_current_crown_holders) each hand-copied the executable_check and
can_fund eligibility predicates. The two paths must return identical verdicts
or the dashboard's live crown holder diverges from who the ledger actually
rewards, but that invariant was enforced only by copy-paste.

Consolidate both into a single make_crown_predicates factory so the invariant
is structural. Calling the factory once per direction also removes the
loop-variable default-arg binding the snapshot copy needed.

Pure refactor, no behavior change. Adds a parity test locking the factory's
semantics to is_executable_rate / min_executable_tao_leg.
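The factory pattern might look roughly like this. All signatures here are hypothetical stand-ins (the real predicates wrap `is_executable_rate` and `min_executable_tao_leg`, whose arguments are not shown in this PR); the point is that both callers share one construction path and `functools.partial` binds per-direction values at call time.

```python
from functools import partial
from typing import Callable, NamedTuple


class CrownPredicates(NamedTuple):
    executable_check: Callable[[float], bool]
    can_fund: Callable[[float], bool]


def rate_is_executable(min_rate, rate):
    # Hypothetical stand-in for the is_executable_rate check.
    return rate >= min_rate


def crown_can_fund(min_leg, balance):
    # Hypothetical signature for the module-level funding gate.
    return balance >= min_leg


def make_crown_predicates(min_rate, min_leg):
    """Build both predicates once per direction; replay and snapshot share them."""
    return CrownPredicates(
        executable_check=partial(rate_is_executable, min_rate),
        can_fund=partial(crown_can_fund, min_leg),
    )


# One call per direction: no closure over a loop variable, so no default-arg trick.
limits = {"tao_to_btc": (100.0, 1.0), "btc_to_tao": (200.0, 2.0)}
predicates = {d: make_crown_predicates(*v) for d, v in limits.items()}
```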

Fixes #450

* make_crown_predicates: use module-level crown_can_fund + partial, no nested defs

* test: assert live snapshot and scoring ledger agree on crown holder

End-to-end guard for the #450 invariant: feed both replay_crown_time_window
and snapshot_current_crown_holders identical state and assert they resolve
the crown to the same holder (squatter dropped by both via the funding gate).
Catches a future one-sided divergence even if it bypassed the factory.

---------

Co-authored-by: anderdc <me@alexanderdc.com>
)

handle_swap_reserve called provider.get_balance outside axon_lock with a
comment claiming the source-chain RPC is a separate connection. That holds
for a BTC source (Esplora/Maestro HTTP) but not for a TAO source: the
subtensor provider's get_balance runs on the shared axon_subtensor websocket
that axon_lock exists to serialize. Every TAO->BTC reserve raced the
lock-protected readers, causing recurring 'cannot call recv while another
thread is already running recv' errors.

Mark substrate-backed providers with uses_substrate and gate the balance
check on it: serialize the TAO read under axon_lock, keep BTC's HTTP read
lock-free so a slow Esplora call doesn't stall the forward loop.
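A minimal sketch of the gating, assuming a plain `threading.Lock` and two demo providers that simply report whether the lock was held during the read. `read_balance` and the provider classes are hypothetical; only `axon_lock`, `uses_substrate`, and `get_balance` are names from the commit.

```python
import threading
from contextlib import nullcontext

axon_lock = threading.Lock()


class DemoSubstrateProvider:
    uses_substrate = True  # shares the serialized websocket

    def get_balance(self, address):
        return axon_lock.locked()  # demo only: report whether the lock is held


class DemoHttpProvider:
    uses_substrate = False  # separate HTTP connection

    def get_balance(self, address):
        return axon_lock.locked()


def read_balance(provider, address):
    """Serialize substrate reads under axon_lock; leave HTTP reads lock-free."""
    guard = axon_lock if getattr(provider, "uses_substrate", False) else nullcontext()
    with guard:
        return provider.get_balance(address)
```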

Co-authored-by: anderdc <me@alexanderdc.com>
The contract blocks vote_reserve during a halt (SystemHalted revert), but the
validator currently still runs the full handler and submits the doomed
extrinsic. During a halt that turns into a retry storm: miners re-request,
each failed reserve burns a round-trip, and the writes contend for the
hotkey's nonce/write path — starving confirm/timeout votes for in-flight swaps.

Check bounds_cache.halted() at the top of handle_swap_reserve and reject
immediately, before any provider/substrate work or extrinsic submission.
halted() fails open, so an RPC blip falls through to the contract's own
rejection rather than refusing a valid reserve.

Co-authored-by: anderdc <me@alexanderdc.com>
…swap (#467)

* fix(validator): close crown pin when a reservation expires without a swap

A reservation pin freezes a miner's crown rate so it keeps earning crown at
its committed rate while reserved, even if it bumps its live quote (the
bump-after-pin loophole closure). The pin is closed only by a pin-end event,
emitted on SwapInitiated/SwapCompleted/SwapTimedOut or a fresh MinerReserved.

When a reservation simply expires (its reserved_until TTL lapses with no swap)
no event fires: the contract emits nothing on natural expiry, and
purge_expired_reservation_pins() deletes the reservation_pins row but never
touches reservation_pin_events, the table the crown replay overlays. Pruning
deliberately preserves the latest pin event per (hotkey, direction) as an
anchor, so a dangling 'start' persists indefinitely. The miner keeps earning
crown at the pinned rate with no live reservation until it next reserves or
swaps. Observed in production: reservations expiring without a swap left
miners pinned for up to ~88 minutes with no live reservation.

Add ValidatorStateStore.get_expired_reservation_pins() and
ContractEventWatcher.expire_stale_reservation_pins(), called from the forward
loop in place of the bare purge. For each expired pin it emits a pin-end at
reserved_until + 1 (crediting crown through the reservation's last live block,
then stopping) before purging the row, reusing the existing RESERVED_END
replay path. Idempotent: _emit_reservation_pin_ends only closes directions
whose latest event is a 'start', and the row is purged afterward.
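A sketch of the expiry pass over in-memory dicts, assuming a pin counts as expired once `reserved_until` is below the current block. The function and its data shapes are hypothetical; the real code goes through `get_expired_reservation_pins()` and `_emit_reservation_pin_ends`.

```python
def expire_stale_pins(pins, latest_event, current_block):
    """pins: {(hotkey, direction): reserved_until}
    latest_event: {(hotkey, direction): "start" | "end"}
    Returns the pin-end events emitted, as ((hotkey, direction), block) pairs.
    """
    emitted = []
    for key in sorted(pins):  # sorted() copies the keys, so deleting below is safe
        reserved_until = pins[key]
        if reserved_until >= current_block:
            continue  # reservation still live
        if latest_event.get(key) == "start":
            # Credit crown through the last live block, then stop.
            emitted.append((key, reserved_until + 1))
            latest_event[key] = "end"
        del pins[key]  # purge the row after closing the pin
    return emitted
```

Running it twice over the same state emits nothing the second time, which is the idempotence property described above.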

* docs: trim expire_stale_reservation_pins docstring
…e rejections (#466)

The BTC source-balance check is an uncached external Esplora/gomaestro
HTTP call, and it ran before the commitment/slippage/already-reserved/
cooldown rejections — so every reserve request that was going to be
rejected anyway still forced one external API call. Under a reserve
spam burst this is a per-request amplification vector (one upstream
call each, even for doomed requests) that burns provider quota and
backs up the axon threadpool.

Move the balance lookup to the last gate before vote_reserve, after all
the cheap in-process/substrate rejections. Spam destined for those now
rejects without ever touching Esplora. The TAO (substrate) balance read
still serialises under axon_lock and the BTC read stays lock-free, both
unchanged — just relocated.

Tradeoff: the balance call now sits between the already-reserved check
and the vote, so a concurrent request could reserve the miner first.
That race costs at most one doomed vote_reserve, which the contract
rejects atomically, so the early-reject guarantee is unchanged.
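The reordering amounts to running every cheap rejection before the one expensive call. A sketch with hypothetical names and string results; the real handler's gates and return types differ.

```python
def handle_reserve(request, cheap_checks, fetch_balance, required_balance):
    """Run in-process rejections first; touch the external API last."""
    for check in cheap_checks:
        reason = check(request)
        if reason is not None:
            return reason  # rejected without any upstream call
    if fetch_balance(request) < required_balance:
        return "rejected: insufficient source balance"
    return "vote_reserve"


balance_calls = []


def fake_balance(request):
    balance_calls.append(request)  # stands in for the Esplora HTTP call
    return 10


def in_cooldown(request):
    return "rejected: cooldown" if request == "spam" else None
```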
…#461) + retry cushion gating (#462) (#468)

* fix(miner): size send-cache discard margin for two timeout extensions

SENT_CACHE_DISCARD_MARGIN_BLOCKS was MAX_EXTENSION_BLOCKS +
DEFAULT_FULFILLMENT_TIMEOUT_BLOCKS (300), covering a single extension. The
contract permits MAX_EXTENSIONS_PER_SWAP (2) extensions, each pushing
timeout_block forward by up to MAX_EXTENSION_BLOCKS relative to its own
propose block with no cumulative cap, so a fully-extended live deadline can
reach D0 + 2 * MAX_EXTENSION_BLOCKS. When a get_swap gap drops a swap from
the active set across both extensions, the cached deadline is never
refreshed and cleanup_stale_sends discards the unmarked entry while the swap
is still active on-chain, re-sending destination funds on rediscovery (#461)
— the duplicate-send #452 set out to prevent.

Size the margin to MAX_EXTENSIONS_PER_SWAP * MAX_EXTENSION_BLOCKS +
DEFAULT_FULFILLMENT_TIMEOUT_BLOCKS (550) so it tracks the contract's caps,
and add a two-extension regression test.
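The arithmetic can be checked directly. The two component values below are inferred by reading both parenthesized numbers above as totals and solving the pair of sums; they are not read from the codebase and should be treated as an assumption.

```python
MAX_EXTENSIONS_PER_SWAP = 2               # stated in the commit message
MAX_EXTENSION_BLOCKS = 250                # inferred: 550 - 300
DEFAULT_FULFILLMENT_TIMEOUT_BLOCKS = 50   # inferred: 300 - 250

# Old margin: covers a single extension.
old_margin = MAX_EXTENSION_BLOCKS + DEFAULT_FULFILLMENT_TIMEOUT_BLOCKS

# New margin: tracks the contract's per-swap extension cap.
new_margin = (MAX_EXTENSIONS_PER_SWAP * MAX_EXTENSION_BLOCKS
              + DEFAULT_FULFILLMENT_TIMEOUT_BLOCKS)

# Blocks by which a fully extended deadline could outlive the old margin.
uncovered = new_margin - old_margin
```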

Fixes #461

* fix(miner): don't apply the timeout cushion to the post-send mark_fulfilled retry

verify_swap_safety enforces MINER_TIMEOUT_CUSHION_BLOCKS, and process_swap
ran it on every pass — including the mark_fulfilled retry after dest funds
were already sent. Once the chain reached timeout_block - cushion, the gate
returned None and aborted the retry, so a transient mark_fulfilled failure
in the final ~18-block window left the swap Active at its deadline and the
miner was slashed for the full tao_amount despite having already paid the
user (#462). The cushion is scoped to STARTING a fulfill (#356); applying it
to the retry re-introduced the loss it was added to prevent.

Gate verify_swap_safety / verify_user_sent_funds under the first-send branch
only; on the retry recompute user_receives_amount from the snapshotted rate
and go straight to mark_fulfilled. Add a regression test that runs the real
cushion on the retry path inside the cushion window.
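The control-flow change can be sketched as follows. The callables are injected and the return strings are hypothetical; the real `process_swap` also recomputes `user_receives_amount` from the snapshotted rate on the retry.

```python
def process_swap(swap, already_sent, verify_swap_safety, send_dest_funds, mark_fulfilled):
    """Apply the safety gate (including the timeout cushion) only before the first send."""
    if not already_sent:
        if not verify_swap_safety(swap):
            return "aborted"  # too close to the deadline to START a fulfill
        send_dest_funds(swap)
    # Funds are out: always attempt mark_fulfilled, even inside the cushion window.
    return "fulfilled" if mark_fulfilled(swap) else "retry-later"
```

With the gate outside the branch, the retry case would return "aborted" and leave a paid swap unmarked at its deadline.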

Fixes #462

* docs: trim verbose comments and drop stale step numbering
…into feat/qvol-quality-reliability-scoring

# Conflicts:
#	tests/test_scoring_v1.py
…gle credibility term

Reshape the reward from a crown base with bolted-on corrections into a
crown-dominant blend of availability and realized service:

    reward = pool · credibility · (λ·qvol_share + (1−λ)·crown_share·cap·quality)

- λ = QVOL_REWARD_WEIGHT = 0.3 (crown-DOMINANT on purpose). qvol_share is the
  miner's share of the network's quality-weighted swap volume — each completed
  swap's TAO leg scaled by quality_factor(clearing_rate). Volume now *earns*
  (a non-crown server collects the λ slice) instead of only gating crown.
  Raising λ toward volume is gated on wash filtering, since unfiltered volume is
  sybil/wash-inflatable — documented at the constant.
- Adaptive λ: with no realized volume (bootstrap) crown takes the full pool;
  with volume but no crown holder, volume takes it — an empty component never
  silently recycles its share.
- Merge success_rate³ × credibility_ramp into one `credibility` term (the two
  double-counted timeouts). Drops SUCCESS_EXPONENT and the success_rate helper.
  Note: the >2-timeout cliff is an absolute count, so it's volume-unaware —
  flagged in the docstring as a follow-up for when volume's share grows.
- 449's per-block depth quality is retained on the crown term; quality is also
  applied per-swap inside qvol. vol_factor is removed (volume is now a reward
  base, not a penalty).

Builds on #449 (depth reference + clearing_rate storage), merged in.
@anderdc anderdc changed the title from "feat(scoring): reward crown depth via a floored, EMA-anchored quality factor" to "feat(scoring): crown-dominant quality-weighted-volume blend with a single credibility term" Jun 9, 2026
@JSONbored

Contributor

Reading the "no wash filtering — needs a contract change to emit the depositor" note — totally fair for self-dealt volume. But I had a thought on one slice that might not need the counterparty at all, since the settlement addresses are already in every commitment.

If I'm reading the reward loop right, crown_share and qvol_share accrue per hotkey with no grouping by settlement identity, and the per-miner cap only bounds weight in the quality reference, not the reward — so hotkeys sharing one BTC+TAO settlement address would still split the pool as N independent miners off one pool? λ=0.3 damps the volume half; I just wasn't sure anything caps the crown half (possible I'm missing it).

Why it might matter in practice: on SN7 right now, reading the Commitments pallet, one operator is running ~6 coldkeys that all post to just 3 BTC settlement addresses (one address backs the original coldkey plus a newer one; another backs three), holding the BTC→TAO crown above fair via self-reservation. So coldkey-level identity is already being churned — the settlement tuple is the only stable handle. (One of those addresses is also the one from #444's self-send write-up — A→A that's grown into A-across-N-coldkeys.)

Idea, fwiw: collapse active hotkeys sharing a canonical settlement tuple into one identity before the crown/qvol split — deterministic, no counterparty or contract change, in the spirit of #444 and robust to the coldkey-churn above. Could be a cheap interim complement to λ, or maybe that's already where v2 is headed. Happy to share the coldkeys/addresses + a failing test if useful, or I'm also happy to draft up an issue for this.
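For concreteness, the grouping step of that idea might look like this. It is a sketch only: the function name and the `(btc_address, tao_address)` tuple layout are hypothetical, and how the collapsed identity would then enter the crown/qvol split is left open.

```python
from collections import defaultdict


def collapse_by_settlement(settlement):
    """settlement: {hotkey: (btc_address, tao_address)}
    Returns {settlement_tuple: [hotkeys...]} with deterministic ordering.
    """
    groups = defaultdict(list)
    for hotkey in sorted(settlement):  # sorted so every validator builds the same lists
        groups[settlement[hotkey]].append(hotkey)
    return dict(groups)


example = collapse_by_settlement({
    "hk1": ("btcA", "taoA"),
    "hk2": ("btcA", "taoA"),
    "hk3": ("btcB", "taoB"),
})
```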


2 participants