cudax: align HLL with cuco implementation#8855
cudax: align HLL with cuco implementation#8855sleeepyjack wants to merge 7 commits intoNVIDIA:mainfrom
Conversation
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
📝 WalkthroughSummary by CodeRabbit
WalkthroughTwo HyperLogLog++ algorithm files are updated to align cudax with cuCollections: finalizer.cuh simplifies binary-search anchor selection by returning the insertion point directly, and hyperloglog_impl.cuh reworks bit slicing in the ChangesHyperLogLog++ Algorithm Synchronization with cuCollections
Assessment against linked issues
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4799ccf8-d277-4288-8c34-d459437ba23d
📒 Files selected for processing (2)
cudax/include/cuda/experimental/__cuco/__hyperloglog/finalizer.cuhcudax/include/cuda/experimental/__cuco/__hyperloglog/hyperloglog_impl.cuh
| using __fp_type = double; ///< Floating point type used for reduction | ||
| using __hash_value_type = decltype(::cuda::std::declval<_Hash>()(::cuda::std::declval<_Tp>())); ///< Hash value type | ||
|
|
||
| static_assert(::cuda::std::is_unsigned_v<__hash_value_type>, "HyperLogLog requires an unsigned hash value type"); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify hash return widths used with HLL and whether narrow unsigned types are instantiated.
# Expected: HLL hash functors should resolve to widths safely larger than runtime precision.
set -euo pipefail
echo "== HLL type declarations/usages =="
rg -nP --type=cpp --type=cu --type=h --type=hpp --type=cuh '\b(__hyperloglog_impl|hyperloglog)\b'
echo
echo "== Potential hash functor operator() return types (look for uint8_t/uint16_t etc.) =="
rg -nP --type=cpp --type=cu --type=h --type=hpp --type=cuh 'operator\s*\(\s*[^)]*\)\s*(const)?\s*(noexcept)?\s*(->\s*[^ {;]+)?'
echo
echo "== Explicit narrow unsigned type mentions near hash code =="
rg -nP --type=cpp --type=cu --type=h --type=hpp --type=cuh -C2 '\b(uint8_t|uint16_t|unsigned char|unsigned short)\b'Repository: NVIDIA/cccl
Length of output: 135
🏁 Script executed:
#!/bin/bash
# Corrected: use only recognized file types; check HLL instantiations and narrow types
set -euo pipefail
echo "=== Examining hyperloglog_impl.cuh (lines 70-180) ==="
head -n 180 cudax/include/cuda/experimental/__cuco/__hyperloglog/hyperloglog_impl.cuh | tail -n +70
echo
echo "=== HLL type declarations/usages (all files) ==="
rg -nP --type-list | grep -i cpp || echo "Available file types:"
rg --type-list | head -20
echo
echo "=== Search for hyperloglog instantiations ==="
rg -n 'hyperloglog' cudax/include/cuda/experimental/__cuco/ --type-list
echo
echo "=== Search in .cuh, .hpp, .h files for narrow unsigned types near hash ==="
find cudax/include/cuda/experimental/__cuco/ -type f \( -name "*.cuh" -o -name "*.hpp" -o -name "*.h" \) | head -20
echo
echo "=== Check __cuco directory structure ==="
ls -la cudax/include/cuda/experimental/__cuco/__hyperloglog/ 2>/dev/null || echo "Directory not accessible"Repository: NVIDIA/cccl
Length of output: 6113
🏁 Script executed:
#!/bin/bash
# Search for HLL instantiations, tests, and usage of narrow unsigned types with HLL
set -euo pipefail
echo "=== Find test files for hyperloglog ==="
find cudax -name "*hyperloglog*" -type f | head -20
echo
echo "=== Find __cuco test files ==="
find cudax -path "*__cuco*" -name "*.cu" -o -name "*.cpp" 2>/dev/null | grep -i test | head -20
echo
echo "=== Search for HLL constructor/instantiation calls ==="
rg -n '__hyperloglog_impl|hyperloglog' cudax --type=cpp --max-count=30
echo
echo "=== Check what hash types are used (search for declval in __hyperloglog context) ==="
rg -A5 -B5 '__hyperloglog_impl' cudax/include/cuda/experimental/__cuco/__hyperloglog/ --type=cpp
echo
echo "=== Look for example/test usage patterns ==="
find . -path "*/test*" -name "*.cu" -o -name "*.cpp" | xargs grep -l "hyperloglog" 2>/dev/null | head -5Repository: NVIDIA/cccl
Length of output: 564
🏁 Script executed:
#!/bin/bash
# Examine test and public API files to understand HLL instantiations
echo "=== Test file: test_hyperloglog.cu (first 200 lines) ==="
head -n 200 cudax/test/cuco/hyperloglog/test_hyperloglog.cu
echo
echo "=== Public API wrapper: hyperloglog.cuh ==="
cat cudax/include/cuda/experimental/__cuco/hyperloglog.cuh
echo
echo "=== Search for any hash functor returning uint8_t, uint16_t in test or examples ==="
rg -n 'uint8_t|uint16_t' cudax/test/cuco/hyperloglog/ 2>/dev/null || echo "No matches in test dir"Repository: NVIDIA/cccl
Length of output: 27560
important: Line 77's static_assert enforces unsignedness but not minimum width. Lines 173 and 175 invoke undefined behavior if __precision >= numeric_limits<__hash_value_type>::digits. Although no production instantiation uses narrow hash types (all default to 64-bit), add a compile-time precondition to constrain the hash type width relative to the runtime precision range [4, 18], or document the unsupported instantiation cases.
There was a problem hiding this comment.
This will be addressed in a PR that follows shortly and adds a Policy abstraction to allow for various HLL flavors. The hash type check will be tied to the specific policy.
There was a problem hiding this comment.
@sleeepyjack Understood. Would you like me to open a GitHub issue to track the follow-up (adding a compile-time width check ensuring numeric_limits<__hash_value_type>::digits > 18 or equivalent)?
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| int __left = 0; | ||
| int __right = static_cast<int>(__n) - 1; | ||
| int __mid = -1; | ||
| int __candidate_index = 0; // Index of the closest element found |
There was a problem hiding this comment.
not sure how we missed this in the first place, could we use sized integer instead? e.g. int32_t as opposed to int?
🥳 CI Workflow Results🟩 Finished in 44m 37s: Pass: 100%/55 | Total: 15h 22m | Max: 44m 35s | Hits: 34%/110407See results here. |
Description
Closes #8854
Port NVIDIA/cuCollections#792 into the cudax HLL: bit slicing in
__add, finalizer interpolation-anchor selection, and removal ofthe now-dead
__register_maskhelper.Checklist