Context
Follow-up from PR #346 and issue #281.
PR #346 adds the basic per-SP piece_cleanup job to bound storage growth by deleting old dealbot-owned pieces. During review, we identified two lifecycle policy questions that should be handled separately so the cleanup PR stays focused.
Questions to resolve
1. Blocked providers
Provider blocklists currently stop scheduled data-storage/retrieval/dataset-creation checks for providers we do not want to test.
Cleanup may need different behavior: if an SP is blocked but still has old dealbot data, cleanup/reconciliation may still be useful so dealbot reduces storage load and keeps local DB state accurate.
Decide whether piece_cleanup should:
- ignore provider blocklists and continue cleanup,
- respect provider blocklists and skip cleanup,
- or use a separate cleanup-specific allow/block policy.
Initial recommendation: cleanup should probably still run for blocked providers, because blocking new checks should not prevent cleanup or DB reconciliation.
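To make the three options concrete, here is a minimal sketch of how the choice could be expressed as a single policy switch. Every name in it (`CleanupBlocklistPolicy`, `should_run_cleanup`, `check_blocklist`, `cleanup_blocklist`) is hypothetical and does not come from the dealbot codebase. It only illustrates the decision, not an implementation.

```python
from enum import Enum
from typing import AbstractSet


class CleanupBlocklistPolicy(Enum):
    """Hypothetical policy switch mirroring the three options listed above."""
    IGNORE_BLOCKLIST = "ignore_blocklist"    # always clean up, even if blocked
    RESPECT_BLOCKLIST = "respect_blocklist"  # skip cleanup for blocked providers
    CLEANUP_SPECIFIC = "cleanup_specific"    # consult a separate cleanup-only list


def should_run_cleanup(
    provider: str,
    policy: CleanupBlocklistPolicy,
    check_blocklist: AbstractSet[str],
    cleanup_blocklist: AbstractSet[str] = frozenset(),
) -> bool:
    """Return True if piece_cleanup should run for this provider.

    check_blocklist is assumed to be the existing list that stops scheduled
    checks; cleanup_blocklist is an assumed, separate list that only exists
    under the CLEANUP_SPECIFIC option.
    """
    if policy is CleanupBlocklistPolicy.IGNORE_BLOCKLIST:
        return True
    if policy is CleanupBlocklistPolicy.RESPECT_BLOCKLIST:
        return provider not in check_blocklist
    # CLEANUP_SPECIFIC: the check blocklist is irrelevant to cleanup.
    return provider not in cleanup_blocklist
```

Under the initial recommendation, the default would be `IGNORE_BLOCKLIST`, so blocking a provider from new checks never strands its old pieces.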
2. Terminated or non-live datasets
Dataset/service termination does not necessarily mean there is nothing to clean up. Depending on FWSS/filecoin-pay lifecycle state, a dataset may be unavailable for new additions while piece removal, settlement, or local DB reconciliation still matters.
Decide how dealbot should handle cleanup candidates from:
- live datasets where pieces can be actively removed,
- terminated/non-live datasets where removals may still be allowed,
- fully finalized/deleted datasets where local DB rows may need to be marked cleaned up without attempting provider deletion.
Initial recommendation: do not blindly filter non-live datasets out of quota/reconciliation until we define the lifecycle states and expected behavior.
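One way to frame the decision is as an explicit mapping from dataset lifecycle state to cleanup action. The sketch below assumes three states matching the bullets above. The state names, action names, and `plan_cleanup` function are hypothetical, and the real FWSS/filecoin-pay lifecycle may have more states or different removal rules.

```python
from enum import Enum


class DatasetState(Enum):
    """Hypothetical lifecycle states; the real set is still to be defined."""
    LIVE = "live"
    TERMINATED = "terminated"  # non-live, removals may still be allowed
    FINALIZED = "finalized"    # fully finalized/deleted


class CleanupAction(Enum):
    """Hypothetical actions a cleanup job could take for a candidate piece."""
    REMOVE_FROM_PROVIDER = "remove_from_provider"
    TRY_REMOVE_THEN_RECONCILE = "try_remove_then_reconcile"
    MARK_CLEANED_LOCALLY = "mark_cleaned_locally"


# Assumed mapping for illustration; choosing it is what this issue must decide.
_ACTION_BY_STATE = {
    DatasetState.LIVE: CleanupAction.REMOVE_FROM_PROVIDER,
    DatasetState.TERMINATED: CleanupAction.TRY_REMOVE_THEN_RECONCILE,
    DatasetState.FINALIZED: CleanupAction.MARK_CLEANED_LOCALLY,
}


def plan_cleanup(state: DatasetState) -> CleanupAction:
    """Pick an action per state instead of filtering non-live datasets out."""
    try:
        return _ACTION_BY_STATE[state]
    except KeyError:
        # Fail loudly on an unmapped state rather than silently skipping it.
        raise ValueError(f"no cleanup policy defined for state {state!r}")
```

Keeping the mapping in one table means an undefined state surfaces as an error instead of quietly dropping rows from quota or reconciliation.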
Done criteria
- Document the expected piece_cleanup behavior for blocked providers.
- Document how cleanup/reconciliation should treat live, terminated, and fully finalized/deleted datasets.
- Add implementation changes if needed.
- Add tests covering the selected policies.
- Link the resulting behavior from the jobs/runbook docs if operator action is needed.