Apache Iceberg version
1.10.1 (latest release)
Query engine
Spark
Please describe the bug 🐞
We have an encrypted Iceberg table with a high write rate, which produces many snapshots. We run expire_snapshots aggressively, but over time the metadata file keeps growing, which degrades table performance.
Our metadata file is now more than 90% unused encryption keys: over 40 MB of encryption keys 😢.
The bug seems to be that encryption-keys within the metadata file are not cleaned up.
This happens both in expire_snapshots, which calls into removeSnapshotsInternal, and in expireSnapshots with cleanExpiredMetadata set to true, which calls into cleanExpiredMetadata.
Neither path cleans up encryption keys; both should be updated to do so.
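To illustrate the cleanup we would expect, here is a minimal sketch in Python. It is not Iceberg's actual code. It assumes the metadata JSON keeps keys in an encryption-keys list whose entries have a key-id and an optional encrypted-by-id, and that each snapshot carries a key-id. The helper name prune_encryption_keys is hypothetical.

```python
def prune_encryption_keys(metadata: dict) -> dict:
    """Return a copy of the metadata keeping only keys still reachable from snapshots.

    Assumed layout (not verified against the Iceberg source):
      metadata["snapshots"]        -> list of dicts, each with an optional "key-id"
      metadata["encryption-keys"]  -> list of dicts with "key-id" and an optional
                                      "encrypted-by-id" pointing at a wrapping key
    """
    keys = metadata.get("encryption-keys", [])
    by_id = {key["key-id"]: key for key in keys}

    # Start from the keys referenced by the snapshots that survived expiration.
    pending = [
        snap["key-id"]
        for snap in metadata.get("snapshots", [])
        if snap.get("key-id") is not None
    ]

    # Follow encrypted-by-id links so that wrapping keys are retained too.
    live = set()
    while pending:
        key_id = pending.pop()
        if key_id in live or key_id not in by_id:
            continue
        live.add(key_id)
        parent = by_id[key_id].get("encrypted-by-id")
        if parent is not None:
            pending.append(parent)

    pruned = dict(metadata)
    pruned["encryption-keys"] = [key for key in keys if key["key-id"] in live]
    return pruned


if __name__ == "__main__":
    # Hypothetical metadata: one live snapshot, two stale keys left behind.
    example = {
        "snapshots": [{"snapshot-id": 2, "key-id": "k2"}],
        "encryption-keys": [
            {"key-id": "kek0"},
            {"key-id": "kek1"},
            {"key-id": "k1", "encrypted-by-id": "kek0"},
            {"key-id": "k2", "encrypted-by-id": "kek1"},
        ],
    }
    remaining = prune_encryption_keys(example)["encryption-keys"]
    print([key["key-id"] for key in remaining])
```

Following encrypted-by-id matters because a key that no snapshot references directly may still be needed to unwrap a key that a live snapshot does reference.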
Willingness to contribute