Summary
When a Keep* rule (e.g. KeepLatestNDockerImages) shares a policy with a filtering
Delete* rule (e.g. DeleteDockerImagesNotUsed), the keep rule does not protect the
latest N items across all versions — only the latest N among the items that already
survived the delete rule's AQL filter.
The common retention intent ("delete images not used for D days, but always keep the
latest N versions") is therefore not expressible in a single policy, and on many real
datasets the policy deletes nothing (or the wrong set).
Root cause
All rules contribute their aql_add_filter into a single AQL find combined with $and
(artifactory_cleanup/rules/base.py, _get_aql_find_filters, lines 238–249):
    def _get_aql_find_filters(self) -> Dict:
        filters = []
        for rule in self.rules:
            filters = rule.aql_add_filter(filters)
        return {"$and": filters}  # every rule's filter AND-ed together
DeleteDockerImagesNotUsed.aql_add_filter (docker.py:125–165) adds a date constraint, so
only stale/unused images are fetched — recently-used versions never come back.
Keep* rules add nothing to the AQL; they only implement in-memory filter(), which runs on
the already-narrowed set (CleanupPolicy.filter, base.py:280–287):
    def filter(self, artifacts: ArtifactsList) -> ArtifactsList:
        for rule in self.rules:
            artifacts = rule.filter(artifacts)  # `artifacts` = only the stale items from AQL
        return artifacts
KeepLatestNDockerImages.filter (docker.py:214–239) groups the received artifacts by path
and keeps the N newest digests of that subset — it has no knowledge of the truly newest
versions, which were filtered out earlier as "recently used".
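The two stages can be modelled in a few lines. This is a toy sketch, not the library's code: `Image`, `aql_stage`, and `keep_latest_n` are hypothetical names, versions are plain integers, and the "not used for N days" check is reduced to a single `days_since_download` field.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Image:
    version: int              # higher = newer (hypothetical ordering key)
    days_since_download: int  # simplified stand-in for the AQL date constraint


def aql_stage(images: List[Image], days: int) -> List[Image]:
    """Stage 1: stands in for the AND-ed AQL find; only stale images come back."""
    return [img for img in images if img.days_since_download >= days]


def keep_latest_n(candidates: List[Image], count: int) -> List[Image]:
    """Stage 2: in-memory keep rule; drops the `count` newest from the candidates."""
    ordered = sorted(candidates, key=lambda img: img.version)
    return ordered[:-count] if count > 0 else ordered


# Versions 1-4 are stale; versions 5-10 were downloaded recently.
images = [Image(v, 200 if v <= 4 else 3) for v in range(1, 11)]

seen_by_keep_rule = aql_stage(images, days=120)
to_delete = keep_latest_n(seen_by_keep_rule, count=5)

print([img.version for img in seen_by_keep_rule])  # [1, 2, 3, 4]
print([img.version for img in to_delete])          # []
```

The keep stage never sees versions 5-10, so it treats the four stale versions as the "latest" ones and protects all of them.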
Reproduction
    - name: Delete docker images not used for 120 days, keep latest 5
      rules:
        - rule: DeleteDockerImagesNotUsed
          days: 120
        - rule: KeepLatestNDockerImages
          count: 5
        - rule: IncludeDockerImages
          masks: ["*"]
Take an image whose recent versions are still being downloaded and whose older versions are
not. The date filter drops the recently-used versions, so AQL returns only the stale ones.
KeepLatestNDockerImages then keeps the N newest of that stale subset. If the number of
stale versions is ≤ N, the policy deletes nothing — even though plenty of old, unused versions
exist beyond the latest N overall.
Expected vs actual
- Expected: keep the latest N versions overall, and delete the ones that are both older
than the threshold and not in that latest-N set.
- Actual: "keep latest N" is computed only over the stale subset, so the newest stale
versions are protected indefinitely and the intended deletions never happen.
The same applies to non-docker rules, e.g. DeleteOlderThan + KeepLatestNFiles in one policy.
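The gap between the two behaviours can be made concrete with a small comparison. This is an illustrative sketch only: `actual_deletions` and `expected_deletions` are hypothetical helpers, and the data is a made-up mapping from version number to days since last download.

```python
from typing import Dict, Set


def actual_deletions(days_idle: Dict[int, int], days: int, count: int) -> Set[int]:
    """Current behaviour: latest-N is taken from the stale subset only."""
    stale = sorted(v for v, idle in days_idle.items() if idle >= days)
    protected = set(stale[-count:]) if count > 0 else set()
    return set(stale) - protected


def expected_deletions(days_idle: Dict[int, int], days: int, count: int) -> Set[int]:
    """Intended behaviour: latest-N is taken from all versions, then subtracted."""
    stale = {v for v, idle in days_idle.items() if idle >= days}
    newest_overall = set(sorted(days_idle)[-count:]) if count > 0 else set()
    return stale - newest_overall


# version -> days since last download; versions 1-8 are stale, 9-12 are in use
days_idle = {v: (300 if v <= 8 else 2) for v in range(1, 13)}

print(sorted(actual_deletions(days_idle, days=120, count=5)))    # [1, 2, 3]
print(sorted(expected_deletions(days_idle, days=120, count=5)))  # [1, 2, 3, 4, 5, 6, 7]
```

In this example versions 4-7 are stale and outside the overall newest five, yet the current behaviour keeps them.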
Why config alone can't work around it
Splitting into two policies makes it worse: policies are independent and their deletions are
unioned, and a Keep* rule in one policy does not protect artifacts targeted by another.
So "delete (unused) AND NOT (in latest N of all)" is inexpressible.
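A toy model of the two-policy split shows why the union is the problem. The functions below are hypothetical stand-ins for two separate policies, under the assumption that a policy containing only a keep-latest rule deletes everything except the newest `count` versions.

```python
from typing import Dict, List, Set

# version -> days since last download (made-up data):
# versions 1 and 3-8 are stale; version 2 and versions 9-12 are still in use
days_idle: Dict[int, int] = {v: 300 for v in range(1, 9)}
days_idle.update({2: 2, 9: 2, 10: 2, 11: 2, 12: 2})


def policy_delete_unused(days: int) -> Set[int]:
    """Policy A: only the filtering delete rule."""
    return {v for v, idle in days_idle.items() if idle >= days}


def policy_keep_latest(count: int) -> Set[int]:
    """Policy B: keep-latest-N on its own; deletes all but the newest `count`."""
    ordered: List[int] = sorted(days_idle)
    return set(ordered[:-count]) if count > 0 else set(ordered)


# Policies run independently, so the overall effect is the union of their deletions.
deleted = policy_delete_unused(days=120) | policy_keep_latest(count=5)

# What the retention intent asks for: stale AND NOT in the newest five overall.
intended = policy_delete_unused(days=120) - set(sorted(days_idle)[-5:])

print(sorted(deleted))             # [1, 2, 3, 4, 5, 6, 7, 8]
print(sorted(deleted - intended))  # [2, 8]
```

Version 8 is among the newest five overall but is removed by policy A, and version 2 is still in use but is removed by policy B.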
Proposed fix
Make "keep latest N" evaluate over all versions, not just the post-filter subset:
- (Preferred, backward-compatible) Add a new keep rule that runs its own AQL query for
all tags of each image (independent of the policy's delete filter), computes the N newest,
and removes them from the deletion set. There's precedent for a rule issuing its own AQL
request (_collect_docker_size). Existing behavior stays intact.
- Otherwise, at least document this interaction, since changing the existing rule in place
would break policies that rely on "keep newest among the stale subset".
Happy to submit a PR for the first (preferred) option if maintainers agree on the rule name/approach.
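A rough sketch of the logic such a rule could implement is below. It deliberately avoids the library's classes: `keep_latest_n_overall`, `Candidate`, and the `fetch_all_versions` hook are hypothetical names, the hook stands in for the rule's own AQL request, and integer versions stand in for whatever ordering (for example creation time) the real rule would use.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# (image path, version) pairs; a hypothetical stand-in for artifact records
Candidate = Tuple[str, int]


def keep_latest_n_overall(
    candidates: Iterable[Candidate],
    fetch_all_versions: Callable[[str], List[int]],
    count: int,
) -> List[Candidate]:
    """Remove the `count` newest versions of each image, measured over ALL versions.

    `fetch_all_versions` is a hypothetical hook for the rule's own query, which
    must not inherit the policy's delete filter.
    """
    candidates = list(candidates)
    protected: Dict[str, Set[int]] = {}
    for image in {path for path, _ in candidates}:
        newest = sorted(fetch_all_versions(image), reverse=True)[:count]
        protected[image] = set(newest)
    return [(path, v) for path, v in candidates if v not in protected[path]]


# Fake repository contents used in place of a real query.
repo = {"app": list(range(1, 13)), "tool": [1, 2, 3]}
# Deletion candidates as they would arrive from the policy's delete filter.
stale = [("app", v) for v in range(1, 9)] + [("tool", 1)]

result = keep_latest_n_overall(stale, lambda image: repo[image], count=5)
print(result)  # app versions 1-7 remain deletable; app 8 and tool 1 are protected
```

Because the lookup is separate from the policy's combined filter, the existing keep rules would be left unchanged, which matches the backward-compatibility goal of the preferred option.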
Environment
- artifactory-cleanup: master (commit 4fff979)
- Rules: DeleteDockerImagesNotUsed + KeepLatestNDockerImages (in general: any filtering
Delete* rule + any Keep* rule in the same policy)