
Upstream FastJet algorithm from MuonCollider fork #51

Open
aloeliger wants to merge 22 commits into key4hep:main from aloeliger:k4h_muoncollider_fastjet_rebase


@aloeliger aloeliger commented Mar 18, 2026

This is the first of (likely) 4 smaller PRs designed to split up PR #50 into more manageable pieces. This PR includes the FastJet algorithm, made compatible with Gaudi and EDM4HEP as originally written by @samf25. I have added testing to the test/CMakeLists.txt, using pieces largely taken from k4GaudiPandora. The stated/desired inputs to FastJet are based on k4GaudiPandora so this has introduced a dependence in the tests on k4GaudiPandora (aside: I'm afraid this dependence may strictly be circular, as k4GaudiPandora lists k4Reco as a dependency).

The test for FastJet runs for me on lxplus and depends on existing CLD infrastructure, despite being added by the MuonCollider project. Coming from CMS and scram, I am somewhat new to CMake and am not sure I have it entirely correct, so any suggestions are appreciated.

As before, this is largely being opened for discussion, to start the process of reconciling these two forks. If the ultimate PR requires only a selected subset of these changes, that can be discussed.

@tmadlener @madbaron @samf25, FYI

BEGINRELEASENOTES

  • Upstream the FastJet Gaudi algorithm from the MuonCollider fork (with quite some refactoring and some testing against its Marlin counterpart).

ENDRELEASENOTES

@aloeliger aloeliger changed the title K4h muoncollider fastjet rebase Muon Collider Fork: FastJet Mar 18, 2026
Member

@tmadlener tmadlener left a comment


Thanks a lot for breaking this out from #50. I think in general this is in decent shape already. I am not sure I fully get why e.g. k4GaudiPandora needs to be a dependency with this (see comments below).

I have also left a few comments on the c++ part. I know most of this is pre-existing for you, but since we are here, I think we can just as well try to improve some of the things.

Comment thread k4Reco/CMakeLists.txt Outdated
Comment thread test/CMakeLists.txt Outdated
Comment thread CMakeLists.txt Outdated
Comment thread CMakeLists.txt Outdated
Comment thread k4Reco/FastJet/include/FastJetAlg.hxx Outdated
Comment thread k4Reco/FastJet/src/FastJetAlg.cpp Outdated
Comment thread k4Reco/FastJet/include/EClusterMode.h Outdated
Comment thread k4Reco/FastJet/include/EClusterMode.h Outdated
Comment thread k4Reco/FastJet/include/EClusterMode.h Outdated
Comment thread k4Reco/FastJet/include/FastJetAlg.hxx Outdated
Comment thread cmake/FindFastJet.cmake
@aloeliger
Author

@tmadlener & @andresailer I have made edits for most of the existing review comments. The result builds, and I have attempted to test it, however, I will note that the current nightly build seems to have some existing errors. Upwards of 4-5 tests fail, in places this PR does not touch, and fail with memory related errors.

run_GaudiLumiCalClusterer fails with:

ApplicationMgr        INFO Application Manager Stopped successfully
ApplicationMgr        INFO Application Manager Finalized successfully
ApplicationMgr        INFO Application Manager Terminated successfully with a user requested ScheduledStop
malloc_consolidate(): unaligned fastbin chunk detected

run_ClonesAndSplitTracksFinder fails similarly, except with malloc(): unsorted double linked list corrupted

run_TruthTrackFinder and run_PandoraPlutFastJet_ttbar fail with the same error as run_GaudiLumiCalClusterer

I don't know whether this means that any testing of this PR is somewhat inconclusive, so I am making a note of it here.

Member

@tmadlener tmadlener left a comment


Thanks a lot for all the effort so far. This is definitely an improvement over what we started with. Given that this will probably be one of the more commonly used algorithms, I think there are still a few things that can be done to make it as easy to maintain and understand as possible. The comments below are a mix of small fixes, nit-picks, and somewhat higher-level design choices.

There are also a few warnings that make the CI fail (due to -Werror) that I didn't explicitly point out, but you should be able to see them from the workflow logs.


Given that the tests are working on the main branch it's probably something that was introduced in this PR that makes things fail. malloc issues usually mean that we delete things that we shouldn't delete. That might be the case here due to my suggestion of making some things a smart pointer that shouldn't be one because fastjet does some internal cleanup. I wasn't aware of that, so maybe changing those back will already make things work properly again.

Comment thread k4Reco/FastJet/include/FastJetAlg.hxx Outdated
Comment thread k4Reco/FastJet/include/FastJetAlg.hxx Outdated
Comment thread k4Reco/CMakeLists.txt Outdated
Comment thread k4Reco/FastJet/include/FastJetAlg.hxx Outdated
Comment thread k4Reco/FastJet/src/FastJetAlg.cpp Outdated
Comment thread k4Reco/FastJet/src/FastJetAlg.cpp Outdated
Comment thread k4Reco/FastJet/src/FastJetAlg.cpp Outdated
Comment thread k4Reco/FastJet/src/FastJetAlg.cpp Outdated
Comment thread k4Reco/FastJet/src/FastJetAlg.cpp Outdated
Comment thread k4Reco/FastJet/src/FastJetAlg.cpp
@aloeliger
Author

@tmadlener I have made the additional requested changes.

I don't think the failures are caused by anything I have done here, because the exact same error appears in tests that shouldn't touch FastJet. I don't know where fastbin may be present elsewhere in Key4HEP, but that is the consistent failure point.

@aloeliger aloeliger requested a review from tmadlener April 6, 2026 16:29
Member

@tmadlener tmadlener left a comment


I think you will have to run clang-format to make pre-commit happy (or alternatively make your editor call it when saving files).

The CLI option is along the lines of

clang-format --style=file -i k4Reco/FastJet/include/* k4Reco/FastJet/src/*

Comment thread k4Reco/CMakeLists.txt Outdated
Comment thread k4Reco/CMakeLists.txt
Comment thread k4Reco/FastJet/src/FastJetAlg.cpp Outdated
Comment thread k4Reco/FastJet/src/FastJetAlg.cpp Outdated
Comment thread k4Reco/FastJet/include/FastJetAlg.hxx Outdated
@aloeliger
Author

clang-format has also been applied over emacs preferred formatting.

@tmadlener
Member

> clang-format has also been applied over emacs preferred formatting.

Just to point this out in case you don't know (yet). You can make emacs call clang-format on save. There are various options / packages out there that automate this. Together with projectile it should automatically pick up the correct .clang-format config file for the project and format accordingly.

Member

@tmadlener tmadlener left a comment


I have some (almost certainly) final minor comments.

pre-commit is complaining about the fact that the new files do not have the correct license header. It's probably easiest to simply copy them over from other files.

There are also a few warnings that are still flagged by the CI. From a quick look it's mostly unused parameters in functions (most easily fixed by simply removing the variable name from the function definition) and a few sign-compares.

Comment thread k4Reco/FastJet/include/FastJetAlg.hxx Outdated
Comment thread k4Reco/FastJet/include/FastJetAlg.hxx Outdated
Comment thread k4Reco/FastJet/options/pandoraPlusFastJet.py Outdated
Comment thread k4Reco/FastJet/include/EClusterMode.h Outdated
@aloeliger
Author

I've touched up the few places with type or parameter warnings as well. My build on lxplus now has no warnings.

@aloeliger
Author

I still don't have the licensing thing though, I'll include that.

@tmadlener
Member

One license header is still missing in the pandoraPlusFastJet.py script. It looks like the CI failures are unrelated. I will check that and see if it needs fixing elsewhere.

Comment thread k4Reco/FastJet/src/FastJetAlg.cpp Outdated
@jmcarcell
Member

Is this porting equivalent to the Marlin processor? Can we add a test that compares the output from Marlin and here? This is already done in the tests in this repository in https://github.com/key4hep/k4Reco/blob/main/test/CMakeLists.txt (but note this will have some minor changes in #54). Code for comparing (bit by bit) edm4hep::ReconstructedParticle exists in https://github.com/key4hep/k4GaudiPandora/blob/main/test/scripts/compare-pfos.py#L199. Within the same stack fastjet is the same so if the same configuration and inputs are used we should get the same back.
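
As a rough illustration of what such a bit-by-bit comparison involves, here is a minimal Python sketch. It does not read EDM4hep files: `Jet` is a hypothetical flat stand-in for an `edm4hep::ReconstructedParticle`, and `compare_jets` is an illustrative helper, not the interface of `compare-pfos.py`. It assumes both jet collections have already been extracted into lists in the same order.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Jet:
    """Hypothetical flat stand-in for one jet (an edm4hep::ReconstructedParticle)."""
    energy: float
    px: float
    py: float
    pz: float
    n_constituents: int


def _bits(value: float) -> bytes:
    """Return the IEEE 754 double-precision bit pattern of a float."""
    return struct.pack("<d", value)


def compare_jets(reference, candidate):
    """Return a list of differences; an empty list means bit-identical jets."""
    if reference is candidate:
        # Comparing a collection with itself always passes and tests nothing.
        raise ValueError("reference and candidate are the same collection")
    diffs = []
    if len(reference) != len(candidate):
        diffs.append(f"jet count: {len(reference)} != {len(candidate)}")
    for i, (ref, cand) in enumerate(zip(reference, candidate)):
        for name in ("energy", "px", "py", "pz"):
            a, b = getattr(ref, name), getattr(cand, name)
            # Compare raw bit patterns, not values within a tolerance.
            if _bits(a) != _bits(b):
                diffs.append(f"jet {i} {name}: {a!r} != {b!r}")
        if ref.n_constituents != cand.n_constituents:
            diffs.append(
                f"jet {i} n_constituents: "
                f"{ref.n_constituents} != {cand.n_constituents}"
            )
    return diffs


# Made-up numbers standing in for the Marlin and Gaudi outputs.
marlin_jets = [Jet(45.0, 1.0, 2.0, 3.0, 7)]
gaudi_jets = [Jet(45.0, 1.0, 2.0, 3.0, 7)]
print(compare_jets(marlin_jets, gaudi_jets))
```

Comparing bit patterns instead of using a tolerance reflects the expectation stated above that identical inputs and configuration within one stack should give identical output. The identity check is a cheap guard against passing the same collection as both reference and candidate.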

Comment thread k4Reco/CMakeLists.txt Outdated
@aloeliger
Author

> Is this porting equivalent to the Marlin processor? Can we add a test that compares the output from Marlin and here? This is already done in the tests in this repository in https://github.com/key4hep/k4Reco/blob/main/test/CMakeLists.txt (but note this will have some minor changes in #54). Code for comparing (bit by bit) edm4hep::ReconstructedParticle exists in https://github.com/key4hep/k4GaudiPandora/blob/main/test/scripts/compare-pfos.py#L199. Within the same stack fastjet is the same so if the same configuration and inputs are used we should get the same back.

I'm content to write this test; however, I am unfamiliar with any Marlin version of the FastJet processor, as I've just taken over this Gaudi version for the Muon Collider project. Is there an example of that somewhere I can use to create the test?

@samf25

samf25 commented Apr 9, 2026

I believe this is the repo with the Marlin algorithm this is based on: https://github.com/iLCSoft/MarlinFastJet/tree/master

It should have a steering file in the test directory.

@aloeliger
Author

> I believe this is the repo with the Marlin algorithm this is based on: https://github.com/iLCSoft/MarlinFastJet/tree/master
>
> It should have a steering file in the test directory.

The only thing in the test directory is an XML file that I am not quite sure how to turn into a test or steering file. The repository is also iLCSoft. Do we have a Marlin version of FastJet in Key4HEP?

@samf25

samf25 commented Apr 10, 2026

Marlin steering files are XML files. So if you run Marlin steering_file.xml it will run.
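
For anyone unfamiliar with the format, the following Python sketch reads the processor parameters out of a Marlin-style steering file, which can help when copying a configuration into a Gaudi options file. The embedded XML is a made-up example rather than the file from MarlinFastJet, the processor and parameter names in it are illustrative, and `processor_parameters` is a hypothetical helper. It assumes the usual layout of an `<execute>` block listing active processors plus one `<processor>` block per configuration.

```python
import xml.etree.ElementTree as ET

# Illustrative steering fragment; names and values are assumptions.
STEERING = """<marlin>
  <execute>
    <processor name="MyFastJetProcessor"/>
  </execute>
  <processor name="MyFastJetProcessor" type="FastJetProcessor">
    <parameter name="algorithm" type="StringVec">kt_algorithm 1.0</parameter>
    <parameter name="clusteringMode" type="StringVec">Inclusive 5</parameter>
    <parameter name="recParticleIn" type="string">PandoraPFOs</parameter>
  </processor>
  <processor name="Unused" type="FastJetProcessor">
    <parameter name="algorithm" type="StringVec">antikt_algorithm 0.4</parameter>
  </processor>
</marlin>"""


def processor_parameters(xml_text):
    """Map each active processor name to its type and parameter values."""
    root = ET.fromstring(xml_text)
    # Only processors listed in <execute> are actually run.
    active = {p.get("name") for p in root.findall("./execute/processor")}
    result = {}
    for proc in root.findall("./processor"):
        name = proc.get("name")
        if name not in active:
            continue
        params = {}
        for par in proc.findall("parameter"):
            # Values may be given as element text or as a value attribute.
            raw = par.get("value")
            if raw is None:
                raw = par.text or ""
            params[par.get("name")] = raw.split()
        result[name] = {"type": proc.get("type"), "parameters": params}
    return result


for name, info in processor_parameters(STEERING).items():
    print(name, info["type"])
    for key, values in info["parameters"].items():
        print("  ", key, "=", values)
```

Each parameter is kept as a list of whitespace-separated tokens, since Marlin vector parameters are written that way; mapping them onto Gaudi properties still has to be done by hand.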

@jmcarcell
Member

jmcarcell commented Apr 13, 2026

You can look up how it has been done for other ported algorithms, or an LLM may do well at converting from Marlin to Gaudi. DDPlanarDigi is a very simple example: Gaudi vs Marlin. Once the steering file is there, the process is to copy a setup similar to what we have in other tests in https://github.com/key4hep/k4Reco/blob/main/test/CMakeLists.txt: one test that runs the steering file, and another test that compares. There is already a test that runs the CLDReconstruction through Marlin, which will have the PFOsFromJets collections: https://github.com/key4hep/CLDConfig/blob/main/CLDConfig/HighLevelReco/JetClusteringOrRenaming.py#L43

@aloeliger
Author

Okay, there should be a functioning Marlin-to-Gaudi comparison in there. It manages to pass on lxplus.

On another note, the malloc-related errors seem to have gone away in the latest nightly, so these are no longer present in any test, including the FastJet ones.

samf25 and others added 6 commits April 16, 2026 10:27
Adds requirement on FastJet's desired inputs, Pandora outputs
Large changes include changes to validation responsibilities (moved to separate functions, with correct parameters put into a map), and changes to jet definition creation (moved to factory classes and mapped to generic functions)
@tmadlener tmadlener force-pushed the k4h_muoncollider_fastjet_rebase branch from 94e14b4 to 2b4fdcd Compare April 16, 2026 08:27
Member

@tmadlener tmadlener left a comment


Thanks a lot. pre-commit complains again about missing license headers and a bit of formatting, which should be straightforward to fix.

In the meantime we have tried to stabilize the test dependencies a bit by switching to fixtures (see FIXTURES_SETUP and FIXTURES_REQUIRED for documentation). That should make the failure that is currently visible in CI go away if also applied to the tests you have added. (I have put some examples of how to do it for the new tests into comments below.)

Comment thread test/CMakeLists.txt Outdated
Comment thread test/CMakeLists.txt Outdated
COMMAND k4run CLDReconstruction.py --inputFiles sim_ttbar.edm4hep.root --outputBasename output_ttbar
)
set_test_env("run_wrapper_ttbar")
set_tests_properties("run_wrapper_ttbar" PROPERTIES DEPENDS "clone_CLDConfig;run_ddsim_ttbar")
Member


Suggested change
set_tests_properties("run_wrapper_ttbar" PROPERTIES DEPENDS "clone_CLDConfig;run_ddsim_ttbar")
set_tests_properties("run_wrapper_ttbar" PROPERTIES
FIXTURES_REQUIRED "CLDConfig;SimOutputTTbar"
FIXTURES_SETUP WrapperTTbar
)

and then similar for other tests

Comment thread test/CMakeLists.txt Outdated
Comment thread test/CMakeLists.txt Outdated
Comment thread k4Reco/FastJet/options/pandoraPlusFastJet.py Outdated
@aloeliger
Author

I've made the requested changes; however, I will note that I can no longer build or test this after whatever change was made that required me to rebase the work. The build error is in nothing I've touched, so I am somewhat confused:

GaudiLumiCalClusterer.cpp:55:37: error: 'getCellIDEncoding' is not a member of 'k4FWCore'
   55 |   const auto initString = k4FWCore::getCellIDEncoding(inputLocations(0)[0], this).value();
      |       

@tmadlener
Member

I have fixed a few minor things I discovered while fixing the pre-commit things

  • The comparJets script compared the Gaudi Jets with the Gaudi Jets, instead of using the Marlin Jets for comparison
  • Removed a (presumably) debug print(os.getcwd())

The build error you see in the release based CI workflow can be ignored, it is here since we merged #52 (which only works on the nightlies so far).

Pending a successful CI run, I am happy with this.

@tmadlener tmadlener changed the title Muon Collider Fork: FastJet Upstream FastJet algorithm from MuonCollider fork Apr 17, 2026
5 participants