@giacomomagni
Collaborator

Address #397 using the FHMRUVV tables.

@giacomomagni giacomomagni added the benchmarks Benchmark (or infrastructure) related label Jan 13, 2026
@giacomomagni
Collaborator Author

do you want to add it also to CI by default?

@giacomomagni giacomomagni linked an issue Jan 13, 2026 that may be closed by this pull request
@felixhekhorn
Contributor

> do you want to add it also to CI by default?

I'd say yes - let's see how long it takes

@felixhekhorn
Contributor

> I'd say yes - let's see how long it takes

okay, so we have the Rust run and also the Python run and we can make a number of observations:

  1. The benchmark should add 4 new EKOs (2 FFNS x 2 SV). The Rust time increases from ~7min to ~11min (~1min/EKO) and Python from ~45min to ~2h (~18min/EKO); I would keep both nevertheless just so we know if something is going wrong. The job is only running now and then ...
  2. I assume the FFNS to be the latter half: why are we off by up to 5% in the singlet sector? this sounds fishy ... in the non-singlet sector we agree better than 1e-3% (sic!). Actually, only in the non-singlet sector we can see that Rust and Python do not agree bit-by-bit but only up to ~1e-7 - instead the singlet error is so big that all displayed digits are identical
  3. for VFNS we need to set matching_order = (2,0), right?
    • on the Rust side this is done implicitly, since the N3LO OMEs are not even translated. Still, we are off by up to 600% in the non-singlet sector and up to 5% in the singlet sector. Surprisingly here we are clearly better in the singlet side, and, e.g., V and T15 are quite off in the small x region
    • on the Python side the misconfiguration matters and we are worse in some places - however, not everywhere and in some cases the bug even seems to work in our favour
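
The per-EKO cost quoted in item 1 is just the added runtime divided by the number of new operators. A minimal sketch of that arithmetic, taking the approximate timings above at face value (the function name is purely illustrative):

```python
def minutes_per_eko(before_min: float, after_min: float, n_new: int) -> float:
    """Extra runtime per newly added EKO, in minutes."""
    return (after_min - before_min) / n_new


# 4 new EKOs, counted as "2 FFNS x 2 SV" in item 1
n_new = 2 * 2

rust = minutes_per_eko(7, 11, n_new)      # ~7 min -> ~11 min
python = minutes_per_eko(45, 120, n_new)  # ~45 min -> ~2 h

print(f"Rust:   ~{rust:.1f} min/EKO")
print(f"Python: ~{python:.1f} min/EKO")
```

Since all inputs are rounded, the Python figure only agrees with the quoted ~18 min/EKO to within about a minute.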

@giacomomagni
Collaborator Author

giacomomagni commented Jan 15, 2026

> 1. The benchmark should add 4 new EKOs (2 FFNS x 2 SV). The Rust time increases from ~7min to ~11min (~1min/EKO) and Python from ~45min to ~2h (~18min/EKO); I would keep both nevertheless just so we know if something is going wrong. The job is only running now and then ...

okay looks good.

> 2. I assume the FFNS to be the latter half: why are we off by up to 5% in the singlet sector? this sounds fishy ... in the non-singlet sector we agree better than 1e-3% (sic!). Actually, only in the non-singlet sector we can see that Rust and Python do not agree bit-by-bit but only up to ~1e-7 - instead the singlet error is so big that all displayed digits are identical

I think this is related to the reasons given in #484 (comment)

> 3. for VFNS we need to set matching_order = (2,0), right?
>
>     • on the Rust side this is done implicitly, since the N3LO OMEs are not even translated. Still, we are off by up to 600% in the non-singlet sector and up to 5% in the singlet sector. Surprisingly here we are clearly better in the singlet side, and, e.g., V and T15 are quite off in the small x region

This looks more like a bug: for Rust the disagreement should be of the same order as for FFNS. Maybe I miscopied the table, let me check.

>     • on the Python side the misconfiguration matters and we are worse in some places - however, not everywhere and in some cases the bug even seems to work in our favour

yes I should set matching_order = (2,0) explicitly.

EDIT:
The tables seem to be fine. The value that looks odd is the 600% difference for the valence in VFNS.

@felixhekhorn
Contributor

We need NNPDF/banana#79 to be merged (+tagged+released)

@felixhekhorn
Contributor

Okay, the benchmarks seem to be running again. The mystery of why they don't match remains ... However, the agreement also seems to be worse with SV than without, at least for some distributions:

$ poetry poe lha -m "n3lo and not sv and vfns"
[...]
 ─── 
  V  
 ─── 
               x       Q2       eko     eko_error       LHA  percent_error
0   1.000000e-07  10000.0  0.000085  6.211928e-10  0.000151     -43.569049
1   1.000000e-06  10000.0  0.000809  4.122221e-09  0.000910     -11.072806
2   1.000000e-05  10000.0  0.004748  2.396675e-08  0.004734       0.285170
3   1.000000e-04  10000.0  0.022717  4.002072e-08  0.022189       2.378366
4   1.000000e-03  10000.0  0.096559  2.837288e-07  0.095632       0.968525
[...]
 ───── 
  T15  
 ───── 
               x       Q2       eko     eko_error       LHA  percent_error
0   1.000000e-07  10000.0  6.902032  1.431165e-05  6.901966   9.586976e-04
1   1.000000e-06  10000.0  5.174521  1.647792e-05  5.174487   6.659105e-04
2   1.000000e-05  10000.0  3.808481  2.365504e-05  3.808474   1.934408e-04
3   1.000000e-04  10000.0  2.741352  5.529015e-06  2.741350   4.408902e-05
[...]
 ───── 
  T24  
 ───── 
               x       Q2        eko     eko_error        LHA  percent_error
0   1.000000e-07  10000.0  59.510470  2.773462e-04  57.608373       3.301772
1   1.000000e-06  10000.0  31.573497  6.014980e-04  30.965502       1.963459
2   1.000000e-05  10000.0  16.751943  1.344763e-04  16.606857       0.873650

Stupid question: are we comparing the right things? I.e., can T24|eko be computed from the table, and is the rotation the right one?
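
One way to sanity-check the rotation is to rebuild the evolution-basis combinations by hand from flavor-basis values. The sketch below uses the standard definitions V = sum over q of (q - qbar), T15 = u+ + d+ + s+ - 3 c+ and T24 = u+ + d+ + s+ + c+ - 4 b+, with q+ = q + qbar, restricted to five flavors. The dictionary layout, the helper names and the numbers are made up for illustration; they are not the LHA table columns and not the ekomark implementation.

```python
QUARKS = ("u", "d", "s", "c", "b")  # assumption: five active flavors


def q_plus(pdf, q):
    """q+ = q + qbar"""
    return pdf[q] + pdf[q + "bar"]


def q_minus(pdf, q):
    """q- = q - qbar"""
    return pdf[q] - pdf[q + "bar"]


def to_evolution(pdf):
    """Rotate flavor-basis values at one (x, Q2) point to V, T15, T24."""
    plus = {q: q_plus(pdf, q) for q in QUARKS}
    return {
        "V": sum(q_minus(pdf, q) for q in QUARKS),
        "T15": plus["u"] + plus["d"] + plus["s"] - 3.0 * plus["c"],
        "T24": plus["u"] + plus["d"] + plus["s"] + plus["c"] - 4.0 * plus["b"],
    }


# Invented toy values, only meant to exercise the rotation
example = {
    "u": 0.5, "ubar": 0.1,
    "d": 0.3, "dbar": 0.1,
    "s": 0.05, "sbar": 0.05,
    "c": 0.02, "cbar": 0.02,
    "b": 0.01, "bbar": 0.01,
}
evol = to_evolution(example)
print(evol)
```

Feeding the flavor-basis entries of the reference table through such a function and comparing with the LHA column shown by the benchmark would tell us whether the two sides use the same rotation.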

@giacomomagni
Collaborator Author

I think T24 looks okay in your screenshot. At low x it is affected by the splitting-function updates that came after the benchmark tables. The two surprising points for me are in V: one might argue that the absolute error is small there, but I still don't have a good explanation for why they do not match.
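
To put the "absolute error is small" argument in numbers, the sketch below recomputes relative and absolute differences from the eko and LHA columns printed above. It assumes percent_error is defined as 100 * (eko - LHA) / LHA; that definition is inferred from the printed T24 rows, not read from the ekomark source.

```python
def percent_error(eko, lha):
    """Assumed definition: relative difference to the reference, in percent."""
    return 100.0 * (eko - lha) / lha


# (x, eko, LHA) copied from the printed benchmark output
rows = {
    "V": [(1e-7, 0.000085, 0.000151), (1e-6, 0.000809, 0.000910)],
    "T24": [(1e-7, 59.510470, 57.608373), (1e-6, 31.573497, 30.965502)],
}

for name, points in rows.items():
    for x, eko, lha in points:
        print(
            f"{name:>3} x={x:.0e}  abs diff={eko - lha:+.3e}  "
            f"percent={percent_error(eko, lha):+.3f}"
        )

# V is printed with only two or three significant digits, so the recomputed
# percentages for V differ slightly from the percent_error column.
```

For V at x = 1e-7 the absolute difference is of order 1e-4 while the relative one is above 40%, whereas for T24 the assumed formula reproduces the printed percent_error to the displayed precision.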

Labels

benchmarks Benchmark (or infrastructure) related

Development

Successfully merging this pull request may close these issues.

Add LHA aN3LO to ekomark

3 participants