Add sm120 tunings for DevicePartition::Flagged 1/2 #8846
Draft
gonidelis wants to merge 5 commits into NVIDIA:main from
Verification on RTX PRO 6000 Blackwell Server Edition (sm_120, 188 SMs) showed +1% to +9% regressions for all encoded I32/I64/F32/F64 entries, while I8/I16 entries remained -3% to -33% FAST. The Workstation-EVO winners do not transfer to Server for size-4+ kernels.
Drops:
  (size=4, off=4, distinct=F)
  (size=4, off=4, distinct=T)
  (size=4, off=8, distinct=F)
  (size=4, off=8, distinct=T)
  (size=8, off=4, distinct=F)
  (size=8, off=8, distinct=F)
Remaining encodings (7): all on input_size 1 and 2.
Adds 6 sm120_tuning specializations for the partition.flagged variant (flagged::yes, keep_rejects::yes), covering the CT-axis combos where Workstation EVO produced clean winners:

- I8 / I32 / distinct_partitions::no
- I8 / I32 / distinct_partitions::yes
- I8 / I64 / distinct_partitions::no (2nd-best; 1st EVO winner was absurd)
- I8 / I64 / distinct_partitions::yes
- I16 / I32 / distinct_partitions::no
- I16 / I32 / distinct_partitions::yes

Routes through the same Policy1200 chain added by the partition.if PR (via the shared sm120_tuning template). Verification on the Server SKU is pending; size-4+ entries are intentionally absent for now (per prior analysis those were Workstation-only winners).

Stacked on the partition_tuning branch (PR for partition.if).
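The coverage described in this PR can be pictured as a lookup keyed on the compile-time axes. The following is only a toy Python model, not CUB code: in CUB the selection happens at compile time through sm120_tuning template specializations, and the fallback to a default policy for uncovered combos is an assumption made here. The names `TUNED_COMBOS` and `policy_for` are hypothetical.

```python
# Toy model of which (T, OffsetT, distinct_partitions) combos this PR tunes.
# Hypothetical names; the real selection is done at compile time in C++.
TUNED_COMBOS = {
    ("I8", "I32", False),
    ("I8", "I32", True),
    ("I8", "I64", False),   # 2nd-best EVO winner, per the PR description
    ("I8", "I64", True),
    ("I16", "I32", False),
    ("I16", "I32", True),
}


def policy_for(value_type: str, offset_type: str, distinct: bool) -> str:
    """Return a label for the policy a combo would get in this toy model."""
    if (value_type, offset_type, distinct) in TUNED_COMBOS:
        return "sm120_tuning"
    # Assumption: combos without a specialization use the default policy.
    return "default"


if __name__ == "__main__":
    print(policy_for("I8", "I64", True))    # one of the six combos listed in the PR
    print(policy_for("I32", "I32", False))  # size-4+ entry, intentionally absent
```

Keeping the set explicit makes it easy to see that every tuned combo has a 1- or 2-byte value type, which is the boundary the PR description draws.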
Verification on RTX PRO 6000 Blackwell Server Edition (sm_120) showed that
4 of the 6 originally encoded partition.flagged entries regress or are
roughly flat. Percentages are the mean %Diff at 2^28 across the three
entropy settings:
  I8  / I32 / distinct=false : +2.20% (7 SLOW across sizes)
  I8  / I64 / distinct=false : -1.22% but 5 SLOW at 2^16 and 2^20 ("2nd-best after absurd")
  I16 / I32 / distinct=false : +1.38%
  I16 / I32 / distinct=true  : +0.58%
Only the distinct=true variants for I8 transferred cleanly:
  I8 / I32 / distinct=true : -3.10% (5 FAST / 0 SLOW / 7 SAME)
  I8 / I64 / distinct=true : -8.48% (6 FAST / 2 SLOW / 4 SAME, SLOWs at 2^20 and 2^24)
Same Workstation->Server transfer pattern as partition.if: distinct=false
landscapes diverge meaningfully across SKUs. Remaining CT combos are pending
a fresh EVO sweep on Server.
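
The per-combo summaries in this message can be reproduced from the comparison rows. Below is a minimal sketch using the twelve I8 / I32 / distinct=true rows from the comparison table in this thread. The helper name `summarize` is hypothetical, and averaging %Diff at 2^28 over the entropy settings is inferred from the numbers rather than taken from the author's tooling.

```python
from collections import Counter

# (Elements, Entropy, %Diff, Status) for I8 / I32 / distinct=true,
# copied from the comparison table.
ROWS = [
    ("2^16", 1.0, -1.64, "SAME"),
    ("2^20", 1.0, 0.13, "SAME"),
    ("2^24", 1.0, -0.46, "SAME"),
    ("2^28", 1.0, -3.85, "FAST"),
    ("2^16", 0.544, -20.00, "FAST"),
    ("2^20", 0.544, -0.21, "FAST"),
    ("2^24", 0.544, -0.53, "SAME"),
    ("2^28", 0.544, -0.56, "SAME"),
    ("2^16", 0.0, -19.51, "FAST"),
    ("2^20", 0.0, -0.42, "SAME"),
    ("2^24", 0.0, 0.11, "SAME"),
    ("2^28", 0.0, -4.90, "FAST"),
]


def summarize(rows, size="2^28"):
    """Tally statuses over all rows and average %Diff at one problem size."""
    tally = Counter(status for _, _, _, status in rows)
    at_size = [pct for elements, _, pct, _ in rows if elements == size]
    mean_pct = sum(at_size) / len(at_size)
    return tally, mean_pct


if __name__ == "__main__":
    tally, mean_pct = summarize(ROWS)
    # A Counter returns 0 for statuses that never occur (SLOW here).
    print(f"{tally['FAST']} FAST / {tally['SLOW']} SLOW / {tally['SAME']} SAME")
    print(f"mean %Diff at 2^28: {mean_pct:+.2f}%")
```

For these rows the tally is 5 FAST / 0 SLOW / 7 SAME and the mean at 2^28 is -3.10%, matching the figures quoted for this combo.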
elaborate results ['/home/ggonidelis/partition_jsons/flagged_main.json', '/home/ggonidelis/partition_jsons/flagged_tuning.json']
# base
## [0] NVIDIA RTX PRO 6000 Blackwell Server Edition
| T{ct} | OffsetT{ct} | DistinctPartitions{ct} | Elements{io} | Entropy | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---------|---------------|--------------------------|----------------|-----------|------------|-------------|------------|-------------|------------|---------|----------|
| I8 | I32 | false | 2^16 | 1 | 8.262 us | 3.11% | 6.341 us | 8.33% | -1.921 us | -23.25% | FAST |
| I8 | I32 | false | 2^20 | 1 | 12.064 us | 5.70% | 12.120 us | 3.40% | 0.056 us | 0.46% | SAME |
| I8 | I32 | false | 2^24 | 1 | 49.818 us | 2.12% | 50.996 us | 1.06% | 1.178 us | 2.36% | SLOW |
| I8 | I32 | false | 2^28 | 1 | 588.090 us | 0.25% | 600.931 us | 0.29% | 12.841 us | 2.18% | SLOW |
| I8 | I32 | false | 2^16 | 0.544 | 10.142 us | 4.59% | 10.240 us | 0.00% | 0.098 us | 0.97% | SLOW |
| I8 | I32 | false | 2^20 | 0.544 | 11.682 us | 7.41% | 11.834 us | 6.72% | 0.152 us | 1.30% | SAME |
| I8 | I32 | false | 2^24 | 0.544 | 50.558 us | 1.86% | 51.496 us | 1.79% | 0.938 us | 1.85% | SLOW |
| I8 | I32 | false | 2^28 | 0.544 | 590.177 us | 0.24% | 604.271 us | 0.29% | 14.094 us | 2.39% | SLOW |
| I8 | I32 | false | 2^16 | 0 | 10.240 us | 0.00% | 10.192 us | 2.58% | -0.048 us | -0.47% | FAST |
| I8 | I32 | false | 2^20 | 0 | 12.288 us | 0.00% | 12.197 us | 2.61% | -0.092 us | -0.74% | ???? |
| I8 | I32 | false | 2^24 | 0 | 49.433 us | 1.59% | 50.812 us | 1.64% | 1.379 us | 2.79% | SLOW |
| I8 | I32 | false | 2^28 | 0 | 587.800 us | 0.23% | 599.793 us | 0.26% | 11.993 us | 2.04% | SLOW |
| I8 | I32 | true | 2^16 | 1 | 10.066 us | 4.29% | 9.901 us | 6.85% | -0.165 us | -1.64% | SAME |
| I8 | I32 | true | 2^20 | 1 | 10.148 us | 4.17% | 10.160 us | 5.43% | 0.013 us | 0.13% | SAME |
| I8 | I32 | true | 2^24 | 1 | 50.134 us | 2.17% | 49.902 us | 2.29% | -0.231 us | -0.46% | SAME |
| I8 | I32 | true | 2^28 | 1 | 639.126 us | 2.36% | 614.528 us | 0.61% | -24.598 us | -3.85% | FAST |
| I8 | I32 | true | 2^16 | 0.544 | 10.240 us | 0.00% | 8.192 us | 0.00% | -2.048 us | -20.00% | FAST |
| I8 | I32 | true | 2^20 | 0.544 | 10.240 us | 0.00% | 10.218 us | 2.44% | -0.022 us | -0.21% | FAST |
| I8 | I32 | true | 2^24 | 0.544 | 51.154 us | 1.19% | 50.881 us | 1.34% | -0.273 us | -0.53% | SAME |
| I8 | I32 | true | 2^28 | 0.544 | 623.371 us | 1.61% | 619.877 us | 0.70% | -3.494 us | -0.56% | SAME |
| I8 | I32 | true | 2^16 | 0 | 10.107 us | 4.72% | 8.135 us | 3.27% | -1.972 us | -19.51% | FAST |
| I8 | I32 | true | 2^20 | 0 | 10.290 us | 4.22% | 10.247 us | 2.32% | -0.043 us | -0.42% | SAME |
| I8 | I32 | true | 2^24 | 0 | 51.083 us | 1.31% | 51.140 us | 0.78% | 0.057 us | 0.11% | SAME |
| I8 | I32 | true | 2^28 | 0 | 638.557 us | 2.55% | 607.286 us | 0.64% | -31.271 us | -4.90% | FAST |
| I8 | I64 | false | 2^16 | 1 | 8.941 us | 11.41% | 8.836 us | 10.97% | -0.105 us | -1.17% | SAME |
| I8 | I64 | false | 2^20 | 1 | 12.337 us | 3.16% | 12.998 us | 7.51% | 0.661 us | 5.36% | SLOW |
| I8 | I64 | false | 2^24 | 1 | 49.093 us | 0.92% | 49.084 us | 0.73% | -0.008 us | -0.02% | SAME |
| I8 | I64 | false | 2^28 | 1 | 626.749 us | 0.84% | 620.654 us | 0.84% | -6.096 us | -0.97% | FAST |
| I8 | I64 | false | 2^16 | 0.544 | 10.046 us | 4.34% | 10.240 us | 0.00% | 0.194 us | 1.93% | SLOW |
| I8 | I64 | false | 2^20 | 0.544 | 10.453 us | 8.76% | 12.180 us | 3.40% | 1.727 us | 16.53% | SLOW |
| I8 | I64 | false | 2^24 | 0.544 | 50.093 us | 2.25% | 49.939 us | 2.04% | -0.153 us | -0.31% | SAME |
| I8 | I64 | false | 2^28 | 0.544 | 638.767 us | 0.89% | 631.036 us | 0.83% | -7.731 us | -1.21% | FAST |
| I8 | I64 | false | 2^16 | 0 | 8.192 us | 0.00% | 8.734 us | 10.64% | 0.542 us | 6.62% | SLOW |
| I8 | I64 | false | 2^20 | 0 | 10.240 us | 0.00% | 11.453 us | 9.11% | 1.213 us | 11.85% | SLOW |
| I8 | I64 | false | 2^24 | 0 | 49.405 us | 1.47% | 49.294 us | 1.97% | -0.111 us | -0.23% | SAME |
| I8 | I64 | false | 2^28 | 0 | 634.132 us | 0.83% | 624.679 us | 0.74% | -9.453 us | -1.49% | FAST |
| I8 | I64 | true | 2^16 | 1 | 10.240 us | 0.00% | 8.177 us | 1.44% | -2.063 us | -20.15% | FAST |
| I8 | I64 | true | 2^20 | 1 | 12.396 us | 4.23% | 12.298 us | 1.46% | -0.098 us | -0.79% | SAME |
| I8 | I64 | true | 2^24 | 1 | 50.023 us | 2.10% | 51.012 us | 1.37% | 0.988 us | 1.98% | SLOW |
| I8 | I64 | true | 2^28 | 1 | 657.379 us | 2.34% | 594.536 us | 0.49% | -62.843 us | -9.56% | FAST |
| I8 | I64 | true | 2^16 | 0.544 | 10.240 us | 0.00% | 10.074 us | 5.34% | -0.166 us | -1.62% | FAST |
| I8 | I64 | true | 2^20 | 0.544 | 10.319 us | 5.18% | 10.893 us | 8.94% | 0.574 us | 5.56% | SLOW |
| I8 | I64 | true | 2^24 | 0.544 | 50.827 us | 1.70% | 51.202 us | 1.11% | 0.375 us | 0.74% | SAME |
| I8 | I64 | true | 2^28 | 0.544 | 641.288 us | 2.09% | 596.355 us | 0.54% | -44.933 us | -7.01% | FAST |
| I8 | I64 | true | 2^16 | 0 | 9.998 us | 5.18% | 7.966 us | 5.77% | -2.032 us | -20.32% | FAST |
| I8 | I64 | true | 2^20 | 0 | 10.144 us | 3.57% | 10.257 us | 3.17% | 0.113 us | 1.11% | SAME |
| I8 | I64 | true | 2^24 | 0 | 49.952 us | 2.32% | 50.751 us | 1.73% | 0.799 us | 1.60% | SAME |
| I8 | I64 | true | 2^28 | 0 | 649.675 us | 2.36% | 591.964 us | 0.44% | -57.711 us | -8.88% | FAST |
| I16 | I32 | false | 2^16 | 1 | 10.240 us | 0.00% | 8.183 us | 1.31% | -2.057 us | -20.09% | FAST |
| I16 | I32 | false | 2^20 | 1 | 16.384 us | 0.00% | 14.208 us | 3.12% | -2.176 us | -13.28% | FAST |
| I16 | I32 | false | 2^24 | 1 | 69.444 us | 1.30% | 66.280 us | 1.73% | -3.164 us | -4.56% | FAST |
| I16 | I32 | false | 2^28 | 1 | 943.515 us | 0.53% | 954.071 us | 0.30% | 10.556 us | 1.12% | SLOW |
| I16 | I32 | false | 2^16 | 0.544 | 10.836 us | 8.92% | 8.286 us | 6.16% | -2.549 us | -23.53% | FAST |
| I16 | I32 | false | 2^20 | 0.544 | 14.336 us | 0.00% | 13.144 us | 7.61% | -1.192 us | -8.31% | ???? |
| I16 | I32 | false | 2^24 | 0.544 | 70.639 us | 1.74% | 67.822 us | 1.41% | -2.817 us | -3.99% | FAST |
| I16 | I32 | false | 2^28 | 0.544 | 942.381 us | 0.37% | 957.515 us | 0.33% | 15.134 us | 1.61% | SLOW |
| I16 | I32 | false | 2^16 | 0 | 10.211 us | 2.14% | 8.167 us | 2.07% | -2.044 us | -20.02% | FAST |
| I16 | I32 | false | 2^20 | 0 | 14.078 us | 4.74% | 12.675 us | 6.36% | -1.403 us | -9.97% | FAST |
| I16 | I32 | false | 2^24 | 0 | 69.833 us | 1.10% | 67.170 us | 1.32% | -2.663 us | -3.81% | FAST |
| I16 | I32 | false | 2^28 | 0 | 943.602 us | 0.56% | 957.019 us | 0.35% | 13.418 us | 1.42% | SLOW |
| I16 | I32 | true | 2^16 | 1 | 9.917 us | 6.32% | 8.162 us | 2.26% | -1.755 us | -17.70% | FAST |
| I16 | I32 | true | 2^20 | 1 | 14.250 us | 2.15% | 12.262 us | 2.21% | -1.988 us | -13.95% | FAST |
| I16 | I32 | true | 2^24 | 1 | 68.562 us | 1.59% | 66.283 us | 1.52% | -2.279 us | -3.32% | FAST |
| I16 | I32 | true | 2^28 | 1 | 944.636 us | 0.51% | 948.660 us | 0.25% | 4.025 us | 0.43% | SLOW |
| I16 | I32 | true | 2^16 | 0.544 | 10.027 us | 4.53% | 8.166 us | 2.22% | -1.861 us | -18.56% | FAST |
| I16 | I32 | true | 2^20 | 0.544 | 16.365 us | 0.84% | 14.299 us | 3.23% | -2.066 us | -12.62% | FAST |
| I16 | I32 | true | 2^24 | 0.544 | 70.920 us | 1.68% | 67.529 us | 0.69% | -3.392 us | -4.78% | FAST |
| I16 | I32 | true | 2^28 | 0.544 | 942.194 us | 0.38% | 950.090 us | 0.26% | 7.896 us | 0.84% | SLOW |
| I16 | I32 | true | 2^16 | 0 | 10.240 us | 0.00% | 9.255 us | 11.22% | -0.985 us | -9.62% | FAST |
| I16 | I32 | true | 2^20 | 0 | 14.319 us | 0.91% | 12.288 us | 0.00% | -2.031 us | -14.19% | ???? |
| I16 | I32 | true | 2^24 | 0 | 69.344 us | 1.34% | 65.993 us | 1.61% | -3.351 us | -4.83% | FAST |
| I16 | I32 | true | 2^28 | 0 | 944.111 us | 0.57% | 948.515 us | 0.24% | 4.404 us | 0.47% | SLOW |
| I16 | I64 | false | 2^16 | 1 | 8.196 us | 3.02% | 9.415 us | 10.63% | 1.219 us | 14.88% | SLOW |
| I16 | I64 | false | 2^20 | 1 | 12.259 us | 1.52% | 12.288 us | 0.00% | 0.029 us | 0.23% | ???? |
| I16 | I64 | false | 2^24 | 1 | 67.399 us | 0.95% | 67.455 us | 1.01% | 0.056 us | 0.08% | SAME |
| I16 | I64 | false | 2^28 | 1 | 956.073 us | 0.49% | 957.963 us | 0.51% | 1.891 us | 0.20% | SAME |
| I16 | I64 | false | 2^16 | 0.544 | 8.302 us | 7.38% | 9.907 us | 6.67% | 1.606 us | 19.34% | SLOW |
| I16 | I64 | false | 2^20 | 0.544 | 12.365 us | 4.02% | 12.567 us | 5.81% | 0.202 us | 1.64% | SAME |
| I16 | I64 | false | 2^24 | 0.544 | 67.735 us | 0.84% | 68.083 us | 1.33% | 0.348 us | 0.51% | SAME |
| I16 | I64 | false | 2^28 | 0.544 | 968.845 us | 0.68% | 971.328 us | 0.70% | 2.483 us | 0.26% | SAME |
| I16 | I64 | false | 2^16 | 0 | 10.003 us | 5.27% | 10.240 us | 0.00% | 0.237 us | 2.37% | SLOW |
| I16 | I64 | false | 2^20 | 0 | 12.108 us | 3.83% | 12.480 us | 5.15% | 0.372 us | 3.07% | SAME |
| I16 | I64 | false | 2^24 | 0 | 67.707 us | 0.85% | 68.052 us | 1.26% | 0.345 us | 0.51% | SAME |
| I16 | I64 | false | 2^28 | 0 | 957.897 us | 0.49% | 958.511 us | 0.53% | 0.614 us | 0.06% | SAME |
| I16 | I64 | true | 2^16 | 1 | 10.007 us | 6.46% | 10.081 us | 5.11% | 0.074 us | 0.74% | SAME |
| I16 | I64 | true | 2^20 | 1 | 12.288 us | 0.00% | 12.324 us | 5.85% | 0.036 us | 0.30% | ???? |
| I16 | I64 | true | 2^24 | 1 | 66.065 us | 1.41% | 66.485 us | 1.59% | 0.419 us | 0.63% | SAME |
| I16 | I64 | true | 2^28 | 1 | 944.940 us | 0.31% | 946.382 us | 0.34% | 1.443 us | 0.15% | SAME |
| I16 | I64 | true | 2^16 | 0.544 | 10.174 us | 3.30% | 10.214 us | 1.71% | 0.040 us | 0.39% | SAME |
| I16 | I64 | true | 2^20 | 0.544 | 12.347 us | 3.41% | 12.652 us | 6.35% | 0.304 us | 2.46% | SAME |
| I16 | I64 | true | 2^24 | 0.544 | 68.684 us | 1.59% | 69.065 us | 1.43% | 0.381 us | 0.55% | SAME |
| I16 | I64 | true | 2^28 | 0.544 | 950.885 us | 0.34% | 952.064 us | 0.37% | 1.179 us | 0.12% | SAME |
| I16 | I64 | true | 2^16 | 0 | 10.205 us | 1.98% | 10.051 us | 4.27% | -0.153 us | -1.50% | SAME |
| I16 | I64 | true | 2^20 | 0 | 14.284 us | 1.73% | 14.312 us | 0.53% | 0.027 us | 0.19% | SAME |
| I16 | I64 | true | 2^24 | 0 | 66.597 us | 1.55% | 66.926 us | 1.51% | 0.329 us | 0.49% | SAME |
| I16 | I64 | true | 2^28 | 0 | 942.230 us | 0.29% | 943.505 us | 0.33% | 1.276 us | 0.14% | SAME |
| I32 | I32 | false | 2^16 | 1 | 10.056 us | 4.19% | 10.240 us | 0.00% | 0.184 us | 1.83% | SLOW |
| I32 | I32 | false | 2^20 | 1 | 18.328 us | 2.54% | 18.426 us | 1.26% | 0.098 us | 0.53% | SAME |
| I32 | I32 | false | 2^24 | 1 | 111.366 us | 0.96% | 111.606 us | 1.08% | 0.239 us | 0.21% | SAME |
| I32 | I32 | false | 2^28 | 1 | 1.667 ms | 0.16% | 1.668 ms | 0.18% | 0.759 us | 0.05% | SAME |
| I32 | I32 | false | 2^16 | 0.544 | 10.215 us | 1.64% | 10.223 us | 1.41% | 0.008 us | 0.08% | SAME |
| I32 | I32 | false | 2^20 | 0.544 | 18.418 us | 1.67% | 18.269 us | 2.52% | -0.149 us | -0.81% | SAME |
| I32 | I32 | false | 2^24 | 0.544 | 112.865 us | 0.88% | 113.067 us | 0.87% | 0.203 us | 0.18% | SAME |
| I32 | I32 | false | 2^28 | 0.544 | 1.669 ms | 0.16% | 1.670 ms | 0.17% | 1.136 us | 0.07% | SAME |
| I32 | I32 | false | 2^16 | 0 | 10.239 us | 0.05% | 10.221 us | 1.44% | -0.018 us | -0.17% | FAST |
| I32 | I32 | false | 2^20 | 0 | 16.384 us | 0.00% | 17.533 us | 6.22% | 1.149 us | 7.02% | SLOW |
| I32 | I32 | false | 2^24 | 0 | 112.769 us | 1.22% | 112.061 us | 1.02% | -0.708 us | -0.63% | SAME |
| I32 | I32 | false | 2^28 | 0 | 1.669 ms | 0.17% | 1.669 ms | 0.17% | -0.189 us | -0.01% | SAME |
| I32 | I32 | true | 2^16 | 1 | 7.861 us | 11.90% | 9.091 us | 12.19% | 1.230 us | 15.65% | SLOW |
| I32 | I32 | true | 2^20 | 1 | 19.720 us | 5.17% | 18.947 us | 4.69% | -0.773 us | -3.92% | SAME |
| I32 | I32 | true | 2^24 | 1 | 117.952 us | 1.05% | 116.528 us | 1.08% | -1.424 us | -1.21% | FAST |
| I32 | I32 | true | 2^28 | 1 | 1.663 ms | 0.11% | 1.663 ms | 0.11% | 0.118 us | 0.01% | SAME |
| I32 | I32 | true | 2^16 | 0.544 | 10.240 us | 0.00% | 10.240 us | 0.00% | 0.000 us | 0.00% | SAME |
| I32 | I32 | true | 2^20 | 0.544 | 19.774 us | 5.08% | 20.144 us | 3.83% | 0.370 us | 1.87% | SAME |
| I32 | I32 | true | 2^24 | 0.544 | 117.789 us | 1.03% | 117.913 us | 1.02% | 0.124 us | 0.11% | SAME |
| I32 | I32 | true | 2^28 | 0.544 | 1.664 ms | 0.11% | 1.664 ms | 0.11% | 0.318 us | 0.02% | SAME |
| I32 | I32 | true | 2^16 | 0 | 10.102 us | 3.73% | 10.204 us | 1.95% | 0.103 us | 1.02% | SAME |
| I32 | I32 | true | 2^20 | 0 | 16.634 us | 4.27% | 19.576 us | 5.16% | 2.942 us | 17.68% | SLOW |
| I32 | I32 | true | 2^24 | 0 | 115.513 us | 1.08% | 115.822 us | 0.99% | 0.308 us | 0.27% | SAME |
| I32 | I32 | true | 2^28 | 0 | 1.663 ms | 0.10% | 1.664 ms | 0.10% | 0.333 us | 0.02% | SAME |
| I32 | I64 | false | 2^16 | 1 | 8.217 us | 2.95% | 8.256 us | 4.32% | 0.039 us | 0.48% | SAME |
| I32 | I64 | false | 2^20 | 1 | 16.350 us | 3.21% | 18.500 us | 3.53% | 2.150 us | 13.15% | SLOW |
| I32 | I64 | false | 2^24 | 1 | 112.764 us | 0.96% | 113.054 us | 0.92% | 0.290 us | 0.26% | SAME |
| I32 | I64 | false | 2^28 | 1 | 1.656 ms | 0.10% | 1.656 ms | 0.10% | -0.034 us | -0.00% | SAME |
| I32 | I64 | false | 2^16 | 0.544 | 10.240 us | 0.00% | 10.131 us | 3.39% | -0.109 us | -1.07% | FAST |
| I32 | I64 | false | 2^20 | 0.544 | 18.591 us | 3.22% | 18.598 us | 3.07% | 0.006 us | 0.03% | SAME |
| I32 | I64 | false | 2^24 | 0.544 | 114.585 us | 1.00% | 114.527 us | 1.09% | -0.058 us | -0.05% | SAME |
| I32 | I64 | false | 2^28 | 0.544 | 1.659 ms | 0.11% | 1.659 ms | 0.11% | -0.015 us | -0.00% | SAME |
| I32 | I64 | false | 2^16 | 0 | 10.240 us | 0.00% | 10.238 us | 0.08% | -0.002 us | -0.02% | FAST |
| I32 | I64 | false | 2^20 | 0 | 16.298 us | 3.40% | 18.481 us | 2.27% | 2.184 us | 13.40% | SLOW |
| I32 | I64 | false | 2^24 | 0 | 112.870 us | 1.01% | 112.717 us | 1.02% | -0.153 us | -0.14% | SAME |
| I32 | I64 | false | 2^28 | 0 | 1.657 ms | 0.10% | 1.657 ms | 0.10% | -0.129 us | -0.01% | SAME |
| I32 | I64 | true | 2^16 | 1 | 13.884 us | 5.36% | 14.336 us | 0.00% | 0.452 us | 3.26% | ???? |
| I32 | I64 | true | 2^20 | 1 | 22.195 us | 3.37% | 22.828 us | 3.77% | 0.634 us | 2.86% | SAME |
| I32 | I64 | true | 2^24 | 1 | 122.598 us | 0.89% | 122.534 us | 0.90% | -0.064 us | -0.05% | SAME |
| I32 | I64 | true | 2^28 | 1 | 1.689 ms | 0.09% | 1.689 ms | 0.09% | 0.001 us | 0.00% | SAME |
| I32 | I64 | true | 2^16 | 0.544 | 14.231 us | 3.27% | 14.251 us | 2.60% | 0.020 us | 0.14% | SAME |
| I32 | I64 | true | 2^20 | 0.544 | 22.528 us | 0.00% | 23.570 us | 4.55% | 1.042 us | 4.63% | SLOW |
| I32 | I64 | true | 2^24 | 0.544 | 124.279 us | 0.94% | 124.341 us | 0.91% | 0.062 us | 0.05% | SAME |
| I32 | I64 | true | 2^28 | 0.544 | 1.689 ms | 0.09% | 1.689 ms | 0.10% | 0.020 us | 0.00% | SAME |
| I32 | I64 | true | 2^16 | 0 | 14.312 us | 1.14% | 14.147 us | 3.02% | -0.165 us | -1.15% | FAST |
| I32 | I64 | true | 2^20 | 0 | 23.071 us | 4.15% | 23.025 us | 3.86% | -0.045 us | -0.20% | SAME |
| I32 | I64 | true | 2^24 | 0 | 122.461 us | 0.98% | 122.209 us | 0.91% | -0.252 us | -0.21% | SAME |
| I32 | I64 | true | 2^28 | 0 | 1.688 ms | 0.09% | 1.688 ms | 0.09% | -0.070 us | -0.00% | SAME |
| I64 | I32 | false | 2^16 | 1 | 12.288 us | 0.00% | 12.265 us | 1.31% | -0.023 us | -0.19% | ???? |
| I64 | I32 | false | 2^20 | 1 | 25.647 us | 4.40% | 25.265 us | 3.88% | -0.382 us | -1.49% | SAME |
| I64 | I32 | false | 2^24 | 1 | 212.249 us | 0.96% | 212.368 us | 0.94% | 0.119 us | 0.06% | SAME |
| I64 | I32 | false | 2^28 | 1 | 3.119 ms | 0.09% | 3.119 ms | 0.08% | 0.005 us | 0.00% | SAME |
| I64 | I32 | false | 2^16 | 0.544 | 11.761 us | 6.77% | 12.279 us | 0.77% | 0.518 us | 4.40% | SLOW |
| I64 | I32 | false | 2^20 | 0.544 | 25.888 us | 3.88% | 25.911 us | 3.77% | 0.023 us | 0.09% | SAME |
| I64 | I32 | false | 2^24 | 0.544 | 212.393 us | 1.00% | 214.354 us | 0.86% | 1.961 us | 0.92% | SLOW |
| I64 | I32 | false | 2^28 | 0.544 | 3.121 ms | 0.15% | 3.121 ms | 0.14% | -0.020 us | -0.00% | SAME |
| I64 | I32 | false | 2^16 | 0 | 12.286 us | 0.06% | 12.262 us | 1.36% | -0.024 us | -0.19% | FAST |
| I64 | I32 | false | 2^20 | 0 | 25.489 us | 3.99% | 25.041 us | 4.07% | -0.448 us | -1.76% | SAME |
| I64 | I32 | false | 2^24 | 0 | 212.038 us | 1.03% | 212.705 us | 0.90% | 0.667 us | 0.31% | SAME |
| I64 | I32 | false | 2^28 | 0 | 3.119 ms | 0.08% | 3.119 ms | 0.09% | 0.159 us | 0.01% | SAME |
| I64 | I32 | true | 2^16 | 1 | 12.274 us | 0.93% | 12.288 us | 0.00% | 0.014 us | 0.12% | ???? |
| I64 | I32 | true | 2^20 | 1 | 24.292 us | 3.08% | 24.343 us | 2.84% | 0.051 us | 0.21% | SAME |
| I64 | I32 | true | 2^24 | 1 | 212.773 us | 1.04% | 212.029 us | 1.01% | -0.744 us | -0.35% | SAME |
| I64 | I32 | true | 2^28 | 1 | 3.120 ms | 0.09% | 3.120 ms | 0.09% | -0.082 us | -0.00% | SAME |
| I64 | I32 | true | 2^16 | 0.544 | 12.288 us | 0.00% | 12.083 us | 3.71% | -0.205 us | -1.67% | ???? |
| I64 | I32 | true | 2^20 | 0.544 | 24.671 us | 2.67% | 24.612 us | 2.37% | -0.058 us | -0.24% | SAME |
| I64 | I32 | true | 2^24 | 0.544 | 214.192 us | 1.02% | 213.957 us | 0.96% | -0.234 us | -0.11% | SAME |
| I64 | I32 | true | 2^28 | 0.544 | 3.121 ms | 0.11% | 3.122 ms | 0.12% | 0.157 us | 0.01% | SAME |
| I64 | I32 | true | 2^16 | 0 | 11.846 us | 6.34% | 10.240 us | 0.00% | -1.606 us | -13.55% | FAST |
| I64 | I32 | true | 2^20 | 0 | 24.435 us | 2.39% | 24.521 us | 2.47% | 0.087 us | 0.36% | SAME |
| I64 | I32 | true | 2^24 | 0 | 212.638 us | 0.93% | 212.487 us | 0.98% | -0.151 us | -0.07% | SAME |
| I64 | I32 | true | 2^28 | 0 | 3.120 ms | 0.10% | 3.120 ms | 0.10% | 0.287 us | 0.01% | SAME |
| I64 | I64 | false | 2^16 | 1 | 9.616 us | 9.01% | 8.920 us | 10.89% | -0.696 us | -7.24% | SAME |
| I64 | I64 | false | 2^20 | 1 | 23.192 us | 6.19% | 22.990 us | 6.04% | -0.202 us | -0.87% | SAME |
| I64 | I64 | false | 2^24 | 1 | 206.067 us | 1.48% | 206.199 us | 1.33% | 0.132 us | 0.06% | SAME |
| I64 | I64 | false | 2^28 | 1 | 3.120 ms | 0.15% | 3.120 ms | 0.15% | -0.054 us | -0.00% | SAME |
| I64 | I64 | false | 2^16 | 0.544 | 10.240 us | 0.00% | 9.164 us | 11.08% | -1.076 us | -10.50% | FAST |
| I64 | I64 | false | 2^20 | 0.544 | 23.286 us | 5.31% | 23.326 us | 5.71% | 0.040 us | 0.17% | SAME |
| I64 | I64 | false | 2^24 | 0.544 | 207.917 us | 1.56% | 207.800 us | 1.61% | -0.117 us | -0.06% | SAME |
| I64 | I64 | false | 2^28 | 0.544 | 3.119 ms | 0.12% | 3.119 ms | 0.14% | -0.043 us | -0.00% | SAME |
| I64 | I64 | false | 2^16 | 0 | 10.214 us | 1.72% | 8.896 us | 10.16% | -1.318 us | -12.90% | FAST |
| I64 | I64 | false | 2^20 | 0 | 23.192 us | 5.92% | 22.914 us | 5.12% | -0.278 us | -1.20% | SAME |
| I64 | I64 | false | 2^24 | 0 | 205.257 us | 1.43% | 205.607 us | 1.59% | 0.350 us | 0.17% | SAME |
| I64 | I64 | false | 2^28 | 0 | 3.121 ms | 0.13% | 3.121 ms | 0.15% | -0.177 us | -0.01% | SAME |
| I64 | I64 | true | 2^16 | 1 | 12.142 us | 3.46% | 10.286 us | 2.88% | -1.856 us | -15.29% | FAST |
| I64 | I64 | true | 2^20 | 1 | 23.737 us | 4.28% | 23.438 us | 4.43% | -0.299 us | -1.26% | SAME |
| I64 | I64 | true | 2^24 | 1 | 210.503 us | 0.86% | 210.316 us | 0.94% | -0.187 us | -0.09% | SAME |
| I64 | I64 | true | 2^28 | 1 | 3.114 ms | 0.10% | 3.114 ms | 0.12% | 0.376 us | 0.01% | SAME |
| I64 | I64 | true | 2^16 | 0.544 | 12.112 us | 3.49% | 10.757 us | 8.14% | -1.355 us | -11.18% | FAST |
| I64 | I64 | true | 2^20 | 0.544 | 23.644 us | 4.60% | 23.560 us | 4.56% | -0.084 us | -0.35% | SAME |
| I64 | I64 | true | 2^24 | 0.544 | 211.391 us | 0.93% | 211.482 us | 0.92% | 0.091 us | 0.04% | SAME |
| I64 | I64 | true | 2^28 | 0.544 | 3.117 ms | 0.14% | 3.117 ms | 0.12% | -0.490 us | -0.02% | SAME |
| I64 | I64 | true | 2^16 | 0 | 12.286 us | 0.06% | 12.288 us | 0.00% | 0.002 us | 0.01% | ???? |
| I64 | I64 | true | 2^20 | 0 | 23.470 us | 4.38% | 23.572 us | 4.38% | 0.102 us | 0.44% | SAME |
| I64 | I64 | true | 2^24 | 0 | 211.073 us | 0.96% | 210.888 us | 0.99% | -0.185 us | -0.09% | SAME |
| I64 | I64 | true | 2^28 | 0 | 3.114 ms | 0.10% | 3.113 ms | 0.09% | -0.476 us | -0.02% | SAME |
| I128 | I32 | false | 2^16 | 1 | 12.288 us | 0.00% | 10.359 us | 3.02% | -1.929 us | -15.69% | ???? |
| I128 | I32 | false | 2^20 | 1 | 34.565 us | 2.79% | 34.587 us | 3.23% | 0.021 us | 0.06% | SAME |
| I128 | I32 | false | 2^24 | 1 | 389.955 us | 0.53% | 389.878 us | 0.54% | -0.077 us | -0.02% | SAME |
| I128 | I32 | false | 2^28 | 1 | 6.059 ms | 0.26% | 6.059 ms | 0.26% | 0.098 us | 0.00% | SAME |
| I128 | I32 | false | 2^16 | 0.544 | 12.128 us | 4.27% | 11.647 us | 8.39% | -0.481 us | -3.96% | SAME |
| I128 | I32 | false | 2^20 | 0.544 | 34.817 us | 2.54% | 34.928 us | 2.97% | 0.111 us | 0.32% | SAME |
| I128 | I32 | false | 2^24 | 0.544 | 390.148 us | 0.45% | 390.033 us | 0.43% | -0.115 us | -0.03% | SAME |
| I128 | I32 | false | 2^28 | 0.544 | 6.062 ms | 0.06% | 6.061 ms | 0.06% | -0.873 us | -0.01% | SAME |
| I128 | I32 | false | 2^16 | 0 | 12.084 us | 4.01% | 12.313 us | 3.77% | 0.229 us | 1.89% | SAME |
| I128 | I32 | false | 2^20 | 0 | 34.532 us | 2.76% | 34.510 us | 3.16% | -0.021 us | -0.06% | SAME |
| I128 | I32 | false | 2^24 | 0 | 389.730 us | 0.57% | 390.116 us | 0.57% | 0.386 us | 0.10% | SAME |
| I128 | I32 | false | 2^28 | 0 | 6.061 ms | 0.25% | 6.061 ms | 0.24% | -0.122 us | -0.00% | SAME |
| I128 | I32 | true | 2^16 | 1 | 12.398 us | 4.37% | 12.327 us | 4.58% | -0.071 us | -0.57% | SAME |
| I128 | I32 | true | 2^20 | 1 | 33.914 us | 3.48% | 33.987 us | 3.30% | 0.073 us | 0.21% | SAME |
| I128 | I32 | true | 2^24 | 1 | 389.791 us | 0.55% | 389.700 us | 0.53% | -0.091 us | -0.02% | SAME |
| I128 | I32 | true | 2^28 | 1 | 6.052 ms | 0.31% | 6.051 ms | 0.31% | -1.953 us | -0.03% | SAME |
| I128 | I32 | true | 2^16 | 0.544 | 12.162 us | 3.92% | 11.883 us | 6.06% | -0.280 us | -2.30% | SAME |
| I128 | I32 | true | 2^20 | 0.544 | 34.126 us | 3.16% | 34.084 us | 3.35% | -0.042 us | -0.12% | SAME |
| I128 | I32 | true | 2^24 | 0.544 | 389.981 us | 0.45% | 389.927 us | 0.40% | -0.054 us | -0.01% | SAME |
| I128 | I32 | true | 2^28 | 0.544 | 6.061 ms | 0.05% | 6.061 ms | 0.05% | 0.017 us | 0.00% | SAME |
| I128 | I32 | true | 2^16 | 0 | 12.310 us | 4.27% | 12.321 us | 3.44% | 0.011 us | 0.09% | SAME |
| I128 | I32 | true | 2^20 | 0 | 34.002 us | 3.18% | 33.924 us | 3.30% | -0.078 us | -0.23% | SAME |
| I128 | I32 | true | 2^24 | 0 | 389.649 us | 0.53% | 389.507 us | 0.51% | -0.142 us | -0.04% | SAME |
| I128 | I32 | true | 2^28 | 0 | 6.056 ms | 0.29% | 6.056 ms | 0.28% | 0.150 us | 0.00% | SAME |
| I128 | I64 | false | 2^16 | 1 | 12.155 us | 4.28% | 12.290 us | 4.05% | 0.136 us | 1.12% | SAME |
| I128 | I64 | false | 2^20 | 1 | 34.061 us | 3.45% | 34.170 us | 3.12% | 0.109 us | 0.32% | SAME |
| I128 | I64 | false | 2^24 | 1 | 391.476 us | 0.57% | 391.433 us | 0.54% | -0.043 us | -0.01% | SAME |
| I128 | I64 | false | 2^28 | 1 | 6.039 ms | 0.25% | 6.039 ms | 0.25% | 0.309 us | 0.01% | SAME |
| I128 | I64 | false | 2^16 | 0.544 | 12.305 us | 3.47% | 12.296 us | 4.49% | -0.009 us | -0.07% | SAME |
| I128 | I64 | false | 2^20 | 0.544 | 35.100 us | 2.71% | 35.159 us | 2.62% | 0.059 us | 0.17% | SAME |
| I128 | I64 | false | 2^24 | 0.544 | 391.858 us | 0.49% | 392.015 us | 0.52% | 0.157 us | 0.04% | SAME |
| I128 | I64 | false | 2^28 | 0.544 | 6.065 ms | 0.08% | 6.064 ms | 0.07% | -0.158 us | -0.00% | SAME |
| I128 | I64 | false | 2^16 | 0 | 12.329 us | 4.31% | 12.102 us | 4.57% | -0.227 us | -1.84% | SAME |
| I128 | I64 | false | 2^20 | 0 | 33.992 us | 3.24% | 33.992 us | 3.31% | 0.000 us | 0.00% | SAME |
| I128 | I64 | false | 2^24 | 0 | 392.210 us | 0.54% | 392.368 us | 0.50% | 0.158 us | 0.04% | SAME |
| I128 | I64 | false | 2^28 | 0 | 6.048 ms | 0.30% | 6.047 ms | 0.29% | -1.509 us | -0.02% | SAME |
| I128 | I64 | true | 2^16 | 1 | 12.244 us | 2.77% | 12.403 us | 4.58% | 0.159 us | 1.30% | SAME |
| I128 | I64 | true | 2^20 | 1 | 34.327 us | 3.11% | 34.262 us | 3.09% | -0.065 us | -0.19% | SAME |
| I128 | I64 | true | 2^24 | 1 | 389.152 us | 0.51% | 389.363 us | 0.51% | 0.211 us | 0.05% | SAME |
| I128 | I64 | true | 2^28 | 1 | 6.064 ms | 0.14% | 6.064 ms | 0.14% | 0.061 us | 0.00% | SAME |
| I128 | I64 | true | 2^16 | 0.544 | 12.093 us | 4.60% | 12.230 us | 2.29% | 0.137 us | 1.13% | SAME |
| I128 | I64 | true | 2^20 | 0.544 | 34.141 us | 3.19% | 34.060 us | 3.23% | -0.081 us | -0.24% | SAME |
| I128 | I64 | true | 2^24 | 0.544 | 389.622 us | 0.43% | 389.492 us | 0.45% | -0.130 us | -0.03% | SAME |
| I128 | I64 | true | 2^28 | 0.544 | 6.062 ms | 0.05% | 6.062 ms | 0.05% | -0.032 us | -0.00% | SAME |
| I128 | I64 | true | 2^16 | 0 | 12.301 us | 2.55% | 12.243 us | 3.84% | -0.057 us | -0.47% | SAME |
| I128 | I64 | true | 2^20 | 0 | 34.707 us | 2.83% | 34.984 us | 2.81% | 0.277 us | 0.80% | SAME |
| I128 | I64 | true | 2^24 | 0 | 389.060 us | 0.50% | 389.276 us | 0.50% | 0.216 us | 0.06% | SAME |
| I128 | I64 | true | 2^28 | 0 | 6.063 ms | 0.16% | 6.063 ms | 0.13% | 0.476 us | 0.01% | SAME |
| F32 | I32 | false | 2^16 | 1 | 10.165 us | 2.78% | 9.993 us | 4.95% | -0.172 us | -1.69% | SAME |
| F32 | I32 | false | 2^20 | 1 | 18.352 us | 2.43% | 18.356 us | 2.63% | 0.004 us | 0.02% | SAME |
| F32 | I32 | false | 2^24 | 1 | 111.766 us | 0.97% | 111.706 us | 1.00% | -0.059 us | -0.05% | SAME |
| F32 | I32 | false | 2^28 | 1 | 1.669 ms | 0.16% | 1.669 ms | 0.15% | -0.058 us | -0.00% | SAME |
| F32 | I32 | false | 2^16 | 0.544 | 10.218 us | 1.54% | 10.240 us | 0.00% | 0.022 us | 0.22% | SLOW |
| F32 | I32 | false | 2^20 | 0.544 | 18.413 us | 1.51% | 18.312 us | 2.14% | -0.101 us | -0.55% | SAME |
| F32 | I32 | false | 2^24 | 0.544 | 113.536 us | 1.03% | 113.424 us | 1.00% | -0.112 us | -0.10% | SAME |
| F32 | I32 | false | 2^28 | 0.544 | 1.671 ms | 0.16% | 1.671 ms | 0.15% | 0.202 us | 0.01% | SAME |
| F32 | I32 | false | 2^16 | 0 | 10.020 us | 4.61% | 10.188 us | 2.41% | 0.168 us | 1.68% | SAME |
| F32 | I32 | false | 2^20 | 0 | 18.314 us | 2.86% | 18.222 us | 3.40% | -0.092 us | -0.50% | SAME |
| F32 | I32 | false | 2^24 | 0 | 111.373 us | 0.95% | 111.285 us | 1.04% | -0.088 us | -0.08% | SAME |
| F32 | I32 | false | 2^28 | 0 | 1.669 ms | 0.17% | 1.669 ms | 0.17% | -0.145 us | -0.01% | SAME |
| F32 | I32 | true | 2^16 | 1 | 10.225 us | 1.23% | 10.199 us | 2.14% | -0.026 us | -0.25% | SAME |
| F32 | I32 | true | 2^20 | 1 | 19.669 us | 5.47% | 19.848 us | 5.23% | 0.179 us | 0.91% | SAME |
| F32 | I32 | true | 2^24 | 1 | 115.291 us | 1.02% | 115.206 us | 1.02% | -0.085 us | -0.07% | SAME |
| F32 | I32 | true | 2^28 | 1 | 1.663 ms | 0.12% | 1.663 ms | 0.11% | 0.091 us | 0.01% | SAME |
| F32 | I32 | true | 2^16 | 0.544 | 10.240 us | 0.00% | 10.065 us | 4.20% | -0.175 us | -1.71% | FAST |
| F32 | I32 | true | 2^20 | 0.544 | 20.126 us | 3.97% | 20.219 us | 3.69% | 0.093 us | 0.46% | SAME |
| F32 | I32 | true | 2^24 | 0.544 | 118.191 us | 1.06% | 118.295 us | 1.09% | 0.104 us | 0.09% | SAME |
| F32 | I32 | true | 2^28 | 0.544 | 1.664 ms | 0.11% | 1.664 ms | 0.10% | -0.185 us | -0.01% | SAME |
| F32 | I32 | true | 2^16 | 0 | 10.240 us | 0.00% | 10.068 us | 4.10% | -0.172 us | -1.68% | FAST |
| F32 | I32 | true | 2^20 | 0 | 19.839 us | 4.85% | 19.834 us | 4.90% | -0.005 us | -0.03% | SAME |
| F32 | I32 | true | 2^24 | 0 | 116.937 us | 0.99% | 116.990 us | 0.96% | 0.054 us | 0.05% | SAME |
| F32 | I32 | true | 2^28 | 0 | 1.664 ms | 0.10% | 1.663 ms | 0.10% | -0.080 us | -0.00% | SAME |
| F32 | I64 | false | 2^16 | 1 | 10.005 us | 4.97% | 10.240 us | 0.00% | 0.235 us | 2.35% | SLOW |
| F32 | I64 | false | 2^20 | 1 | 18.432 us | 0.00% | 18.451 us | 1.99% | 0.019 us | 0.11% | SLOW |
| F32 | I64 | false | 2^24 | 1 | 112.636 us | 0.86% | 112.585 us | 0.90% | -0.052 us | -0.05% | SAME |
| F32 | I64 | false | 2^28 | 1 | 1.656 ms | 0.10% | 1.656 ms | 0.10% | -0.164 us | -0.01% | SAME |
| F32 | I64 | false | 2^16 | 0.544 | 10.240 us | 0.00% | 10.240 us | 0.00% | 0.000 us | 0.00% | SAME |
| F32 | I64 | false | 2^20 | 0.544 | 18.582 us | 3.27% | 18.487 us | 2.70% | -0.095 us | -0.51% | SAME |
| F32 | I64 | false | 2^24 | 0.544 | 115.196 us | 0.87% | 115.056 us | 0.76% | -0.140 us | -0.12% | SAME |
| F32 | I64 | false | 2^28 | 0.544 | 1.659 ms | 0.10% | 1.659 ms | 0.10% | -0.146 us | -0.01% | SAME |
| F32 | I64 | false | 2^16 | 0 | 10.168 us | 2.65% | 10.255 us | 3.45% | 0.087 us | 0.86% | SAME |
| F32 | I64 | false | 2^20 | 0 | 18.378 us | 3.68% | 18.413 us | 3.81% | 0.035 us | 0.19% | SAME |
| F32 | I64 | false | 2^24 | 0 | 113.554 us | 1.09% | 113.164 us | 1.00% | -0.389 us | -0.34% | SAME |
| F32 | I64 | false | 2^28 | 0 | 1.657 ms | 0.10% | 1.657 ms | 0.10% | 0.160 us | 0.01% | SAME |
| F32 | I64 | true | 2^16 | 1 | 14.317 us | 0.91% | 14.142 us | 3.10% | -0.176 us | -1.23% | FAST |
| F32 | I64 | true | 2^20 | 1 | 23.693 us | 4.28% | 23.517 us | 4.32% | -0.176 us | -0.74% | SAME |
| F32 | I64 | true | 2^24 | 1 | 121.949 us | 1.07% | 122.151 us | 0.95% | 0.202 us | 0.17% | SAME |
| F32 | I64 | true | 2^28 | 1 | 1.688 ms | 0.09% | 1.688 ms | 0.09% | 0.020 us | 0.00% | SAME |
| F32 | I64 | true | 2^16 | 0.544 | 14.376 us | 5.53% | 14.595 us | 4.85% | 0.219 us | 1.52% | SAME |
| F32 | I64 | true | 2^20 | 0.544 | 24.306 us | 2.83% | 24.227 us | 3.18% | -0.079 us | -0.32% | SAME |
| F32 | I64 | true | 2^24 | 0.544 | 124.522 us | 0.95% | 124.587 us | 0.95% | 0.065 us | 0.05% | SAME |
| F32 | I64 | true | 2^28 | 0.544 | 1.688 ms | 0.09% | 1.689 ms | 0.09% | 0.118 us | 0.01% | SAME |
| F32 | I64 | true | 2^16 | 0 | 14.336 us | 0.00% | 14.336 us | 0.00% | 0.000 us | 0.00% | ???? |
| F32 | I64 | true | 2^20 | 0 | 22.950 us | 3.81% | 22.867 us | 3.50% | -0.084 us | -0.36% | SAME |
| F32 | I64 | true | 2^24 | 0 | 121.651 us | 0.91% | 121.669 us | 0.90% | 0.018 us | 0.01% | SAME |
| F32 | I64 | true | 2^28 | 0 | 1.689 ms | 0.09% | 1.688 ms | 0.09% | -0.191 us | -0.01% | SAME |
| F64 | I32 | false | 2^16 | 1 | 12.047 us | 5.39% | 12.288 us | 0.00% | 0.241 us | 2.00% | ???? |
| F64 | I32 | false | 2^20 | 1 | 25.024 us | 4.07% | 25.012 us | 3.78% | -0.012 us | -0.05% | SAME |
| F64 | I32 | false | 2^24 | 1 | 212.870 us | 0.88% | 212.741 us | 0.84% | -0.129 us | -0.06% | SAME |
| F64 | I32 | false | 2^28 | 1 | 3.119 ms | 0.09% | 3.119 ms | 0.09% | 0.191 us | 0.01% | SAME |
| F64 | I32 | false | 2^16 | 0.544 | 12.255 us | 1.52% | 12.068 us | 3.96% | -0.187 us | -1.53% | FAST |
| F64 | I32 | false | 2^20 | 0.544 | 25.841 us | 3.91% | 25.739 us | 4.05% | -0.102 us | -0.40% | SAME |
| F64 | I32 | false | 2^24 | 0.544 | 213.178 us | 0.94% | 213.023 us | 0.92% | -0.155 us | -0.07% | SAME |
| F64 | I32 | false | 2^28 | 0.544 | 3.121 ms | 0.15% | 3.121 ms | 0.15% | 0.031 us | 0.00% | SAME |
| F64 | I32 | false | 2^16 | 0 | 12.074 us | 3.92% | 12.050 us | 3.94% | -0.024 us | -0.20% | SAME |
| F64 | I32 | false | 2^20 | 0 | 25.312 us | 3.94% | 25.352 us | 4.00% | 0.039 us | 0.15% | SAME |
| F64 | I32 | false | 2^24 | 0 | 212.873 us | 0.89% | 212.851 us | 0.86% | -0.022 us | -0.01% | SAME |
| F64 | I32 | false | 2^28 | 0 | 3.119 ms | 0.10% | 3.119 ms | 0.09% | 0.021 us | 0.00% | SAME |
| F64 | I32 | true | 2^16 | 1 | 12.267 us | 1.25% | 12.281 us | 0.56% | 0.014 us | 0.11% | SAME |
| F64 | I32 | true | 2^20 | 1 | 24.407 us | 2.49% | 24.387 us | 2.92% | -0.020 us | -0.08% | SAME |
| F64 | I32 | true | 2^24 | 1 | 212.616 us | 0.91% | 212.632 us | 0.98% | 0.015 us | 0.01% | SAME |
| F64 | I32 | true | 2^28 | 1 | 3.121 ms | 0.09% | 3.120 ms | 0.09% | -0.220 us | -0.01% | SAME |
| F64 | I32 | true | 2^16 | 0.544 | 12.288 us | 0.00% | 12.263 us | 1.31% | -0.025 us | -0.20% | ???? |
| F64 | I32 | true | 2^20 | 0.544 | 24.576 us | 0.00% | 24.612 us | 2.29% | 0.036 us | 0.15% | ???? |
| F64 | I32 | true | 2^24 | 0.544 | 212.624 us | 0.94% | 212.688 us | 0.94% | 0.063 us | 0.03% | SAME |
| F64 | I32 | true | 2^28 | 0.544 | 3.122 ms | 0.12% | 3.122 ms | 0.11% | -0.102 us | -0.00% | SAME |
| F64 | I32 | true | 2^16 | 0 | 10.240 us | 0.00% | 10.346 us | 2.57% | 0.106 us | 1.03% | SLOW |
| F64 | I32 | true | 2^20 | 0 | 24.319 us | 2.71% | 24.455 us | 2.70% | 0.136 us | 0.56% | SAME |
| F64 | I32 | true | 2^24 | 0 | 212.273 us | 0.92% | 211.859 us | 0.94% | -0.414 us | -0.19% | SAME |
| F64 | I32 | true | 2^28 | 0 | 3.120 ms | 0.11% | 3.120 ms | 0.10% | -0.235 us | -0.01% | SAME |
| F64 | I64 | false | 2^16 | 1 | 10.003 us | 4.70% | 10.218 us | 1.51% | 0.215 us | 2.15% | SLOW |
| F64 | I64 | false | 2^20 | 1 | 22.839 us | 5.38% | 22.594 us | 5.57% | -0.245 us | -1.07% | SAME |
| F64 | I64 | false | 2^24 | 1 | 206.448 us | 1.49% | 206.560 us | 1.37% | 0.112 us | 0.05% | SAME |
| F64 | I64 | false | 2^28 | 1 | 3.119 ms | 0.17% | 3.119 ms | 0.16% | 0.401 us | 0.01% | SAME |
| F64 | I64 | false | 2^16 | 0.544 | 9.813 us | 8.42% | 9.747 us | 8.00% | -0.066 us | -0.68% | SAME |
| F64 | I64 | false | 2^20 | 0.544 | 23.043 us | 4.45% | 23.098 us | 4.34% | 0.056 us | 0.24% | SAME |
| F64 | I64 | false | 2^24 | 0.544 | 207.424 us | 1.42% | 207.256 us | 1.45% | -0.168 us | -0.08% | SAME |
| F64 | I64 | false | 2^28 | 0.544 | 3.119 ms | 0.13% | 3.119 ms | 0.13% | 0.158 us | 0.01% | SAME |
| F64 | I64 | false | 2^16 | 0 | 10.211 us | 1.79% | 10.240 us | 0.00% | 0.029 us | 0.29% | SLOW |
| F64 | I64 | false | 2^20 | 0 | 22.855 us | 4.50% | 22.740 us | 5.27% | -0.114 us | -0.50% | SAME |
| F64 | I64 | false | 2^24 | 0 | 206.420 us | 1.35% | 206.672 us | 1.55% | 0.252 us | 0.12% | SAME |
| F64 | I64 | false | 2^28 | 0 | 3.119 ms | 0.17% | 3.119 ms | 0.16% | 0.280 us | 0.01% | SAME |
| F64 | I64 | true | 2^16 | 1 | 10.275 us | 3.80% | 10.343 us | 4.02% | 0.069 us | 0.67% | SAME |
| F64 | I64 | true | 2^20 | 1 | 23.133 us | 4.55% | 23.265 us | 4.29% | 0.132 us | 0.57% | SAME |
| F64 | I64 | true | 2^24 | 1 | 211.000 us | 0.97% | 210.932 us | 0.91% | -0.069 us | -0.03% | SAME |
| F64 | I64 | true | 2^28 | 1 | 3.114 ms | 0.11% | 3.114 ms | 0.10% | -0.015 us | -0.00% | SAME |
| F64 | I64 | true | 2^16 | 0.544 | 12.077 us | 3.73% | 12.228 us | 2.29% | 0.151 us | 1.25% | SAME |
| F64 | I64 | true | 2^20 | 0.544 | 23.268 us | 4.24% | 23.185 us | 4.46% | -0.083 us | -0.36% | SAME |
| F64 | I64 | true | 2^24 | 0.544 | 211.445 us | 0.97% | 211.663 us | 0.85% | 0.218 us | 0.10% | SAME |
| F64 | I64 | true | 2^28 | 0.544 | 3.116 ms | 0.15% | 3.116 ms | 0.14% | -0.144 us | -0.00% | SAME |
| F64 | I64 | true | 2^16 | 0 | 11.980 us | 6.08% | 11.832 us | 6.55% | -0.148 us | -1.24% | SAME |
| F64 | I64 | true | 2^20 | 0 | 23.205 us | 4.41% | 23.164 us | 4.10% | -0.041 us | -0.18% | SAME |
| F64 | I64 | true | 2^24 | 0 | 210.645 us | 0.87% | 210.782 us | 0.89% | 0.137 us | 0.06% | SAME |
| F64 | I64 | true | 2^28 | 0 | 3.114 ms | 0.12% | 3.113 ms | 0.11% | -0.126 us | -0.00% | SAME |
# Summary
- Total Matches: 336
- Pass (diff <= min_noise): 236
- Unknown (infinite noise): 15
- Failure (diff > min_noise): 85
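The status column and the summary buckets above can be reproduced with a small sketch. It is an illustration of the rule as stated in the summary (pass when the absolute relative diff is at most the smaller of the two noise values), not the comparison script's actual code. The function name `classify`, the use of fractions rather than percentages, and the choice to represent unmeasurable ("infinite") noise as `None` are all assumptions. How the real tool decides that a row is `????` is not visible in the table, so the sketch only covers the missing-noise case.

```python
from typing import Optional


def classify(ref_noise: Optional[float],
             cmp_noise: Optional[float],
             frac_diff: float) -> str:
    """Hypothetical helper: map one table row to a status string.

    ref_noise / cmp_noise: relative noise as fractions (0.0091 for 0.91%),
                           or None when noise could not be measured.
    frac_diff:             (cmp_time - ref_time) / ref_time.
    """
    # Assumption: any missing noise value makes the row "Unknown".
    if ref_noise is None or cmp_noise is None:
        return "????"

    # The summary's threshold: the smaller of the two noise values.
    min_noise = min(ref_noise, cmp_noise)

    # Pass: the difference is within the noise floor.
    if abs(frac_diff) <= min_noise:
        return "SAME"

    # Failure: the difference exceeds the noise floor; the sign gives direction.
    return "FAST" if frac_diff < 0 else "SLOW"


if __name__ == "__main__":
    # Rows taken from the table above (percentages converted to fractions).
    print(classify(0.0091, 0.0310, -0.0123))  # F32/I64/true,  2^16, entropy 1
    print(classify(0.0470, 0.0151, 0.0215))   # F64/I64/false, 2^16, entropy 1
    print(classify(0.0373, 0.0229, 0.0125))   # F64/I64/true,  2^16, entropy 0.544
```

Under this rule a `FAST` or `SLOW` label at 2^16 can reflect a sub-microsecond difference that just clears a small noise floor, which is worth keeping in mind when reading the 85 failures.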
This builds on top of #7796 (I should probably be using stacked PRs for this?).
It adds tunings from old DBs I scraped. I will be running more for the larger CT workloads and will append them in a second PR.
Posting verification results shortly...
Only I8/I32 and I8/I64 make sense. Keeping these two, dropping everything else and re-tuning.