Encryption keys in metadata are not cleaned up #16352

@Hugo-WB

Apache Iceberg version

1.10.1 (latest release)

Query engine

Spark

Please describe the bug 🐞

We have an encrypted Iceberg table with a high write rate, which produces many snapshots. We run expire_snapshots aggressively, but over time the metadata file keeps growing, which degrades table performance.

More than 90% of our metadata file consists of unused encryption keys: over 40MB of them 😢.

The bug seems to be that the encryption-keys entries in the metadata file are never cleaned up. This affects both expiration paths:

  • expire_snapshots, which calls into removeSnapshotsInternal
  • expireSnapshots with cleanExpiredMetadata set to true, which calls into cleanExpiredMetadata

Neither path cleans up encryption keys, and both should be updated to do so.
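
To illustrate the cleanup being requested, here is a minimal sketch in Python that works on the metadata JSON as a plain dict. It is not Iceberg's implementation. The field names (`encryption-keys`, `key-id`, `encrypted-by-id`, and a `key-id` on each snapshot) are assumptions based on my reading of the v3 table spec, and `prune_unused_keys` is a hypothetical helper name. The idea is to keep only the keys reachable from the snapshots that survive expiration, including any keys that wrap them.

```python
from typing import Any


def prune_unused_keys(metadata: dict[str, Any]) -> list[dict[str, Any]]:
    """Return only the encryption-keys entries still reachable from live snapshots.

    Hypothetical helper: field names are assumed, not taken from Iceberg's code.
    """
    all_keys = metadata.get("encryption-keys", [])
    keys_by_id = {k["key-id"]: k for k in all_keys}

    # Start from the key each remaining snapshot references directly.
    pending = [s["key-id"] for s in metadata.get("snapshots", []) if s.get("key-id")]
    reachable = set()

    # Follow encrypted-by-id links so that wrapping keys are kept too.
    # Ids that are not in the table's key list (e.g. external KMS keys) are skipped.
    while pending:
        key_id = pending.pop()
        if key_id in reachable or key_id not in keys_by_id:
            continue
        reachable.add(key_id)
        parent = keys_by_id[key_id].get("encrypted-by-id")
        if parent:
            pending.append(parent)

    # Preserve the original ordering of the surviving entries.
    return [k for k in all_keys if k["key-id"] in reachable]


# Only snapshot 3 survived expiration, so k1 and k2 are no longer needed.
example = {
    "snapshots": [{"snapshot-id": 3, "key-id": "k3"}],
    "encryption-keys": [
        {"key-id": "kek", "encrypted-key-metadata": "..."},
        {"key-id": "k1", "encrypted-key-metadata": "...", "encrypted-by-id": "kek"},
        {"key-id": "k2", "encrypted-key-metadata": "...", "encrypted-by-id": "kek"},
        {"key-id": "k3", "encrypted-key-metadata": "...", "encrypted-by-id": "kek"},
    ],
}
print([k["key-id"] for k in prune_unused_keys(example)])  # ['kek', 'k3']
```

A real fix would presumably apply the same reachability check inside the two expiration paths above, after the set of retained snapshots is known.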

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time

Labels: bug