Skip to content

HDDS-8655. Optimize listKeys by eliminating redundant seek.#9542

Closed
armstrong0704 wants to merge 1 commit intoapache:masterfrom
armstrong0704:HDDS-8655-optimize-listKeys
Closed

HDDS-8655. Optimize listKeys by eliminating redundant seek.#9542
armstrong0704 wants to merge 1 commit intoapache:masterfrom
armstrong0704:HDDS-8655-optimize-listKeys

Conversation

@armstrong0704
Copy link
Copy Markdown

What changes were proposed in this pull request?

Fixes HDDS-8655 - Improve listKey performance for OBS buckets

Description

This change optimizes the listKeys operation in Ozone Manager by removing a redundant seek operation during iterator initialization. Previously, the listKeys functionality would create an iterator which implicitly sought to the beginning of the table (seekToFirst), and then immediately sought to the specific start key. This caused a double-seek, which is inefficient for RocksDB, particularly when the initial seek triggers unnecessary block loads.

Changes

  • Framework: Extended the Table interface to support passing a seek key directly when creating an iterator.
  • Implementation: Updated RDBTable and TypedTable to handle this new parameter.
  • RocksDB: Modified RDBStoreByteArrayIterator to check for a seek key in its constructor. If one is provided, it seeks directly to that position, skipping the default behavior of seeking to the first key.
  • Ozone Manager: Updated the listKeys implementation to use this new iterator approach.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8655

How was this patch tested?

I added a new regression test, testIteratorWithSeek, to TestRDBStore to confirm that the iterator starts correctly at the designated key. I also verified thread safety with a new concurrency test, testConcurrentIteratorWithWrites, which runs a writer and a reader in parallel to ensure stability. Existing tests in TestOmMetadataManager also pass.

@armstrong0704 armstrong0704 force-pushed the HDDS-8655-optimize-listKeys branch 4 times, most recently from 3ed3c4a to 2277672 Compare December 22, 2025 00:50
@ivandika3
Copy link
Copy Markdown
Contributor

ivandika3 commented Dec 22, 2025

Thanks @armstrong0704 for the patch. Please enable the build-branch workflow in your fork.

@armstrong0704 armstrong0704 force-pushed the HDDS-8655-optimize-listKeys branch 3 times, most recently from ab850ad to 2a6cdd0 Compare December 22, 2025 01:56
@armstrong0704 armstrong0704 marked this pull request as draft December 22, 2025 02:00
@armstrong0704 armstrong0704 force-pushed the HDDS-8655-optimize-listKeys branch from 2a6cdd0 to 2d244f6 Compare December 22, 2025 04:28
@armstrong0704 armstrong0704 marked this pull request as ready for review December 22, 2025 04:29
@armstrong0704
Copy link
Copy Markdown
Author

@ivandika3 thanks for the input. Linters are making additional formatting. Any Guidance there?

@armstrong0704 armstrong0704 force-pushed the HDDS-8655-optimize-listKeys branch from 2d244f6 to 29cb2c7 Compare December 22, 2025 05:47
@ivandika3
Copy link
Copy Markdown
Contributor

ivandika3 commented Dec 22, 2025

@armstrong0704 Could you disable the auto format in the linter? Afterwards, if you are using IntelliJ IDEA, you can go to "Code Style" and import the https://github.com/apache/ozone/blob/master/hadoop-hdds/dev-support/checkstyle/checkstyle.xml

@sodonnel
Copy link
Copy Markdown
Contributor

Please revert all the whitespace and comment formatting changes. Its making this diff a lot bigger than it needs to be.

@armstrong0704 armstrong0704 force-pushed the HDDS-8655-optimize-listKeys branch from 29cb2c7 to 09129df Compare December 22, 2025 22:43
@armstrong0704
Copy link
Copy Markdown
Author

@sodonnel Done.

@ivandika3 ivandika3 requested a review from kerneltime December 27, 2025 03:04
@swamirishi swamirishi self-requested a review January 5, 2026 17:55
@swamirishi
Copy link
Copy Markdown
Contributor

@vyalamar do you wanna review this patch?

Copy link
Copy Markdown
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @armstrong0704 for the patch.


RDBStoreByteArrayIterator(ManagedRocksIterator iterator,
RDBTable table, byte[] prefix, IteratorType type) {
RDBTable table, byte[] prefix, IteratorType type) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RDBTable table, byte[] prefix, IteratorType type) {
RDBTable table, byte[] prefix, IteratorType type) {

nit: Please disable Align when multiline in settings (if using IDEA).

Comment on lines +36 to +37
RDBStoreByteArrayIterator(ManagedRocksIterator iterator,
RDBTable table, byte[] prefix, IteratorType type, byte[] seekKey) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please do not format method signature like this. Whenever visibility / return type / method name / other modifiers are changed, we would have to reindent all parameters.

*
* @param beginKey start metadata key
* @param endKey end metadata key
* @param endKey end metadata key
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param endKey end metadata key
* @param endKey end metadata key

Comment on lines +590 to +592
/**
* Covert the given key to the {@link RAW} type.
*/
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's not change this unnecessarily

Suggested change
/**
* Covert the given key to the {@link RAW} type.
*/
/** Covert the given key to the {@link RAW} type. */

keyIter = getKeyTable(getBucketLayout()).iterator(null, seekKey)) {
readFromRDbStartNs = Time.monotonicNowNanos();
KeyValue< String, OmKeyInfo > kv;
keyIter.seek(seekKey);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this seek now unnecessary?

Comment on lines +426 to +429
assertTrue(iter.hasNext());
assertArrayEquals(getBytesUtf16("a3"), iter.next().getKey());
assertTrue(iter.hasNext());
assertArrayEquals(getBytesUtf16("a5"), iter.next().getKey());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to suggest adding a helper method to verify the list of keys iterated. Something like:

  private static void assertIterator(String prefix, String seek, Table<byte[], ?> table, List<String> expected)
      throws RocksDatabaseException, CodecException {
    final byte[] prefixBytes = prefix != null ? getBytesUtf16(prefix) : null;
    final byte[] seekBytes = seek != null ? getBytesUtf16(seek) : null;
    try (TableIterator<byte[], ? extends Table.KeyValue<byte[], ?>> iter = table.iterator(prefixBytes, seekBytes)) {
      for (String e : expected) {
        assertTrue(iter.hasNext());
        assertArrayEquals(getBytesUtf16(e), iter.next().getKey());
      }
      assertFalse(iter.hasNext());
    }
  }

Then we can both simplify cases and make sure the iterator stops (like already done in two of the five).

    // Case 1: Seek to existing key, no prefix
    assertIterator(null, "a3", table, asList("a3", "a5", "b2", "b4"));

    // Case 2: Seek to non-existent key (should land on next greater), no prefix
    assertIterator(null, "a2", table, asList("a3", "a5", "b2", "b4"));

    // Case 3: Seek past all keys, no prefix
    assertIterator(null, "z9", table, emptyList());

    // Case 4: Seek with prefix
    assertIterator("b", "b3", table, singletonList("b4"));

    // Case 5: Seek with prefix to start of prefix
    assertIterator("b", "b2", table, asList("b2", "b4"));

It can be reused in testIteratorSeekEdgeCases, too.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Feb 6, 2026

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

@github-actions github-actions Bot added the stale label Feb 6, 2026
@ivandika3 ivandika3 removed the stale label Feb 6, 2026
@github-actions
Copy link
Copy Markdown

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

@github-actions github-actions Bot added the stale label Feb 28, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 7, 2026

Thank you for your contribution. This PR is being closed due to inactivity. Please contact a maintainer if you would like to reopen it.

@github-actions github-actions Bot closed this Mar 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants