
Amoradzadeh/cp implicit filter#15199

Closed

moradza wants to merge 11 commits into NVIDIA-NeMo:main from moradza:amoradzadeh/cp_implicit_filter

Conversation

moradza (Contributor) commented Dec 16, 2025

Important

The Update branch button must only be pressed on very rare occasions.
An outdated branch never blocks the merge of a PR.
Please reach out to the automation team before pressing that button.

What does this PR do ?

Reduce memory consumption of mixer kernels in context parallel setup.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

moradza and others added 7 commits December 16, 2025 13:10
Signed-off-by: amoradzadeh <amoradzadeh@nvidia.com>
Signed-off-by: moradza <moradza@users.noreply.github.com>
Signed-off-by: amoradzadeh <amoradzadeh@nvidia.com>
Signed-off-by: amoradzadeh <amoradzadeh@nvidia.com>
Signed-off-by: amoradzadeh <amoradzadeh@nvidia.com>
Signed-off-by: moradza <moradza@users.noreply.github.com>
"""Compute the log poles for the implicit modal filter."""
logp = -torch.exp(self.p.to(torch.float32))
glogp = logp * torch.exp(self.gamma.to(torch.float32))
if context_parallel_group is not None:
Collaborator
Should this be `and context_parallel_group.size() > 1:`?

Contributor Author
I see your point: if `_hyena_use_cp` is set to False, this can be None. The code is slightly confusing, because in that case it sets cp_group to None.
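To make the discussion above concrete, here is a minimal, self-contained sketch. It recomputes the log poles from the quoted snippet using plain Python floats in place of torch tensors, and sketches the guard the reviewer suggested. The scalar parameters `p` and `gamma`, the `FakeGroup` class, and the helper names are hypothetical stand-ins, not the real NeMo code.

```python
import math

def compute_glogp(p, gamma):
    # Mirrors the quoted snippet with scalars instead of tensors:
    # logp = -exp(p) keeps the poles strictly negative (stable filter),
    # and gamma rescales the pole magnitudes.
    logp = -math.exp(p)
    glogp = logp * math.exp(gamma)
    return logp, glogp

class FakeGroup:
    """Hypothetical stand-in for a process group exposing .size()."""
    def __init__(self, n):
        self._n = n
    def size(self):
        return self._n

def needs_cp_handling(group):
    # The guard suggested in the review: take the context-parallel branch
    # only when a group exists AND it actually spans more than one rank.
    return group is not None and group.size() > 1
```

With this guard, a `cp_group` of `None` (the `_hyena_use_cp = False` case) and a trivial single-rank group both fall through to the non-CP path.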

Signed-off-by: amoradzadeh <amoradzadeh@nvidia.com>
jstjohn added a commit to NVIDIA/bionemo-framework that referenced this pull request Dec 16, 2025
Co-authored-by: amoradzadeh <amoradzadeh@nvidia.com>
Signed-off-by: John St John <jstjohn@nvidia.com>
jstjohn
jstjohn previously approved these changes Dec 17, 2025
moradza and others added 2 commits December 17, 2025 11:06
Signed-off-by: amoradzadeh <amoradzadeh@nvidia.com>
Signed-off-by: moradza <moradza@users.noreply.github.com>

moradza commented Dec 17, 2025

@jstjohn You can test CP == 1 and CP == 2 locally with this command:

```shell
torchrun --nproc_per_node=2 tests/collections/llm/gpt/model/test_hyena_mixer_cp.py --operator_type hyena [--use_subquadratic_ops]
```
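The core property a CP-consistency test like this checks can be sketched without GPUs: running an op on the full sequence (CP == 1) and on sequence shards (CP == 2) must give the same result once the shard outputs are gathered. The `mixer` below is a placeholder elementwise op standing in for the real Hyena mixer kernel, and `run_with_cp` is a hypothetical single-process simulation of sequence sharding, not NeMo's actual CP machinery.

```python
def mixer(xs):
    # Placeholder elementwise op standing in for the real mixer kernel.
    return [x * 2.0 + 1.0 for x in xs]

def run_with_cp(xs, cp_size):
    # Simulate context parallelism in one process: split the sequence into
    # cp_size contiguous shards, run the op on each shard independently,
    # then gather the outputs. Exact only for elementwise ops.
    n = len(xs)
    shard = (n + cp_size - 1) // cp_size
    out = []
    for rank in range(cp_size):
        out.extend(mixer(xs[rank * shard:(rank + 1) * shard]))
    return out

seq = [0.5 * i for i in range(8)]
assert run_with_cp(seq, 1) == run_with_cp(seq, 2)  # CP-invariance check
```

The real test exercises the same invariance for the actual mixer, where the kernel must communicate across ranks instead of being trivially shardable.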

@jstjohn jstjohn enabled auto-merge (squash) December 17, 2025 21:39
@pzelasko
Collaborator

Closing this PR as "won't merge".
The following collections have been moved to separate repos in https://github.com/NVIDIA-NeMo organization: avlm, llm, multimodal, multimodal-autoregressive, vlm, speechlm, diffusion.
If you still wish to proceed with this contribution, please re-open it in the relevant repo.

@pzelasko pzelasko closed this Mar 16, 2026
auto-merge was automatically disabled March 16, 2026 16:50

Pull request was closed


3 participants