
Conversation


@conraddelgado conraddelgado commented Nov 13, 2025

User description

Description

Please include a summary of the changes and the related issue(s) if they exist.
Please also include relevant motivation and context.

Fixes #(issue) [optional]
  • Overarching goal: investigate the unclosed terms that arise in the volume-filtered Euler-Lagrange framework
  • Implements explicit computation of the unclosed terms in the volume-filtered momentum equation of the Euler-Lagrange framework for particle-laden flows
  • Explicit immersed-boundary drag force computation
  • General field statistics computation (1st-4th order statistical moments)
  • Periodic spherical immersed boundaries
  • Bulk-flow periodic forcing

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Something else

Scope

  • This PR comprises a set of related changes with a common goal

If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.
Provide instructions so we can reproduce.
Please also list any relevant details for your test configuration

  • Test A
  • Test B

Test Configuration:

  • What computers and compilers did you use to test this:
    - macOS
    - Linux (Ubuntu)
    - NCSA Delta (NVIDIA A100 GPUs)

Checklist

  • I have added comments for the new code
  • I added Doxygen docstrings to the new code
  • I have made corresponding changes to the documentation (docs/)
  • I have added regression tests to the test suite so that people can verify in the future that the feature is behaving as expected
  • I have added example cases in examples/ that demonstrate my new feature performing as expected.
    They run to completion and demonstrate "interesting physics"
  • I ran ./mfc.sh format before committing my code
  • New and existing tests pass locally with my changes, including with GPU capability enabled (both NVIDIA hardware with NVHPC compilers and AMD hardware with CRAY compilers) and disabled
  • This PR does not introduce any repeated code (it follows the DRY principle)
  • I cannot think of a way to condense this code and reduce any introduced additional line count

If your code changes any code source files (anything in src/simulation)

To make sure the code is performing as expected on GPU devices, I have:

  • Checked that the code compiles using NVHPC compilers
  • Checked that the code compiles using CRAY compilers
  • Ran the code on either V100, A100, or H100 GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
  • Ran the code on MI200+ GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
  • Enclosed the new feature via nvtx ranges so that they can be identified in profiles
  • Ran a Nsight Systems profile using ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
  • Ran a Rocprof Systems profile using ./mfc.sh run XXXX --gpu -t simulation --rsys --hip-trace, and have attached the output file and plain text results to this PR.
  • Ran my code using various numbers of different GPUs (1, 2, and 8, for example) in parallel and made sure that the results scale similarly to what happens if you run without the new code/feature

PR Type

Enhancement, Tests


Description

  • Implements comprehensive volume filtering framework for particle-laden flows with explicit computation of unclosed terms in the Euler-Lagrange framework

  • Adds FFT-based filtering infrastructure supporting both CPU (FFTW) and GPU (cuFFT) backends with MPI parallelization (a short sketch of the filtering step follows this list)

  • Computes key quantities including Reynolds stress tensors, effective viscosity, and interphase momentum exchange terms

  • Implements periodic spherical immersed boundaries with on-the-fly levelset computation when storage is disabled

  • Adds periodic bulk flow forcing based on space-time averaged quantities with fluid indicator function weighting

  • Extends MPI communication infrastructure with scalar field send/receive buffers and global domain bounds computation

  • Implements statistical moment computation (1st-4th order) using Welford's online algorithm with GPU parallelization

  • Adds support for writing volume-filtered conservative variables, pressure, Reynolds stress, effective viscosity, and interphase momentum exchange

  • Includes 5 new test cases demonstrating volume filtering, periodic forcing, and particle drag computation

  • Increases maximum immersed boundary patches from 10 to 1000 to support larger particle-laden systems

  • Adds comprehensive case validation for volume filtering and periodic forcing requirements

  • Includes utility scripts for generating 3D Voronoi-based sphere distributions for uniform particle packing
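
Regarding the FFT-based filtering bullet above, the following is a minimal single-rank NumPy sketch of applying a Gaussian transfer function in Fourier space on a periodic box. It is illustrative only: the function name, the use of the filter width as the Gaussian standard deviation, and the uniform grid spacing are assumptions, not the interface of m_volume_filtering.fpp (which uses FFTW/cuFFT with MPI slab transposes).

import numpy as np

def gaussian_filter_periodic(field, dx, filter_width):
    # Filter a 3D periodic field by multiplying its spectrum with a Gaussian kernel.
    nx, ny, nz = field.shape
    kx = 2.0*np.pi*np.fft.fftfreq(nx, d=dx)
    ky = 2.0*np.pi*np.fft.fftfreq(ny, d=dx)
    kz = 2.0*np.pi*np.fft.fftfreq(nz, d=dx)
    kx, ky, kz = np.meshgrid(kx, ky, kz, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    transfer = np.exp(-0.5*k2*filter_width**2)  # Gaussian kernel in frequency space
    return np.fft.ifftn(np.fft.fftn(field)*transfer).real

A Favre-filtered velocity would then follow, for example, as gaussian_filter_periodic(rho*u, dx, w)/gaussian_filter_periodic(rho, dx, w).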


Diagram Walkthrough

flowchart LR
  A["Particle-laden Flow Simulation"] --> B["Volume Filtering Module"]
  A --> C["Periodic Forcing"]
  A --> D["Periodic Immersed Boundaries"]
  B --> E["FFT-based Filtering<br/>CPU/GPU"]
  B --> F["Statistical Moments<br/>1st-4th Order"]
  E --> G["Reynolds Stress<br/>Effective Viscosity"]
  F --> G
  C --> H["Space-time Averaging<br/>with Fluid Indicator"]
  D --> I["On-the-fly Levelset<br/>Computation"]
  G --> J["Filtered Data Output"]
  H --> J
  I --> J
  K["MPI Communication"] --> B
  K --> C
  K --> D

File Walkthrough

Relevant files
Enhancement
26 files
m_volume_filtering.fpp
Volume filtering module for Euler-Lagrange particle-laden flows

src/simulation/m_volume_filtering.fpp

  • Implements a comprehensive volume filtering module for particle-laden
    flows with explicit computation of unclosed terms in the
    Euler-Lagrange framework
  • Provides FFT-based filtering infrastructure supporting both CPU (FFTW)
    and GPU (cuFFT) backends with MPI parallelization
  • Computes key quantities including Reynolds stress tensors, effective
    viscosity, and interphase momentum exchange terms
  • Includes particle force computation and fluid indicator function
    management for immersed boundary methods
+2007/-0
m_data_output.fpp
Add volume-filtered data output capabilities to I/O module

src/simulation/m_data_output.fpp

  • Extends s_write_data_files and s_write_parallel_data_files subroutines
    with optional parameters for filtered quantities
  • Adds support for writing volume-filtered conservative variables,
    pressure, Reynolds stress, effective viscosity, and interphase
    momentum exchange
  • Implements conditional logic to handle filtered data output at initial
    and final time steps when q_filtered_wrt flag is enabled
  • Integrates s_initialize_mpi_data_filtered for MPI-based initialization
    of filtered field data structures
+45/-3   
m_time_steppers.fpp
Add volume filtering module dependencies to time steppers

src/simulation/m_time_steppers.fpp

  • Adds module dependencies for m_volume_filtering and
    m_compute_statistics
  • Enables integration of volume filtering and statistical computation
    capabilities into time-stepping routines
+4/-0     
m_mpi_common.fpp
MPI scalar field communication and global domain bounds   

src/common/m_mpi_common.fpp

  • Added send/receive buffers for scalar field communication with GPU
    memory management
  • Introduced domain_glb array to store global domain bounds across all
    processors
  • Implemented s_mpi_sendrecv_variables_buffers_scalarfield subroutine
    for scalar field MPI communication with pack/unpack operations
  • Added s_mpi_global_domain_bounds subroutine to compute and broadcast
    global domain boundaries
  • Added slab domain decomposition option for z-direction processor
    distribution
+393/-6 
m_start_up.fpp
Volume filtering and statistics computation integration   

src/simulation/m_start_up.fpp

  • Added module imports for volume filtering, statistics computation, and
    additional forcing
  • Conditionally read levelset data only when store_levelset flag is
    enabled
  • Added domain boundary initialization for x, y, z directions
  • Integrated volume filtering momentum equation computation and
    statistics calculation in main time-stepping loop
  • Added periodic forcing computation and particle drag force calculation
  • Implemented filtered variable output and MPI initialization for
    filtered data
  • Added initialization and finalization of volume filtering and
    statistics modules
+170/-81
m_compute_statistics.fpp
Statistical moments computation for volume filtered variables

src/simulation/m_compute_statistics.fpp

  • New module implementing statistical moment computation (1st-4th order)
    for unclosed terms
  • Allocates and manages arrays for Reynolds stress, effective viscosity,
    interphase momentum exchange, and filtered conservative variables
  • Implements Welford's online algorithm for computing statistical
    moments with GPU parallelization
  • Provides initialization and finalization routines for statistics
    module
+340/-0 
m_ibm.fpp
Periodic spherical immersed boundaries and on-the-fly levelset

src/simulation/m_ibm.fpp

  • Added periodic boundary population for immersed boundary markers in
    all three directions
  • Implemented on-the-fly levelset and levelset normal computation for
    spherical boundaries when store_levelset is disabled
  • Added periodic image point projection logic for spheres near domain
    boundaries
  • Modified ghost point detection to use directional subsections instead
    of full 3D subsection
  • Fixed bound checking in image point location search
+186/-18
m_data_input.f90
Post-processing support for filtered variables and statistics

src/post_process/m_data_input.f90

  • Added public declarations for filtered fluid indicator function and
    statistical moment arrays
  • Implemented s_allocate_filtered_arrays subroutine to allocate filtered
    variable arrays
  • Added s_read_parallel_filtered_data subroutine to read volume filtered
    data from restart files
  • Integrated filtered data reading into parallel data input pipeline
  • Added deallocation logic for filtered variable arrays in finalization
    routine
+193/-1 
m_additional_forcing.fpp
Periodic bulk flow forcing for particle-laden flows           

src/simulation/m_additional_forcing.fpp

  • New module implementing periodic bulk flow forcing based on space-time
    averaged quantities
  • Computes spatial averages of density, momentum, and energy with fluid
    indicator function weighting
  • Implements time-windowed averaging for phase-averaged quantities
  • Applies forcing terms to continuity, momentum, and energy equations
  • Includes optional output of forcing statistics to binary file
+204/-0 
m_boundary_common.fpp
Scalar field boundary condition population                             

src/common/m_boundary_common.fpp

  • Added s_populate_scalarfield_buffers subroutine for populating scalar
    field boundary buffers
  • Implemented s_periodic_scalarfield pure subroutine for periodic
    boundary conditions on scalar fields
  • Supports MPI communication and local periodic boundary application for
    scalar fields
+165/-0 
m_ib_patches.fpp
Periodic sphere initialization with domain wrapping           

src/common/m_ib_patches.fpp

  • Modified sphere initialization to support periodic boundary wrapping
    of sphere centroids
  • Added logic to check all periodic projections of sphere center and
    mark grid cells accordingly
  • Changed center storage from 1D to 2D array to accommodate periodic
    projections
  • Pre-computed radius squared for efficiency in distance calculations
+79/-24 
m_compute_levelset.fpp
Periodic levelset computation for spherical boundaries     

src/common/m_compute_levelset.fpp

  • Extended levelset computation to handle periodic boundary conditions
    for spheres
  • Added logic to compute periodic projections of sphere centers near
    domain boundaries
  • Implemented distance calculation to all periodic projections to find
    nearest surface
  • Conditional branching for periodic vs non-periodic sphere levelset
    computation
+91/-22 
m_mpi_proxy.fpp
MPI parameter broadcasting for new features                           

src/simulation/m_mpi_proxy.fpp

  • Added MPI broadcast for new integer parameters: t_step_start_stats,
    mom_f_idx, forcing_window
  • Added MPI broadcast for new logical parameters: periodic_ibs,
    compute_particle_drag, periodic_forcing, forcing_wrt,
    volume_filtering_momentum_eqn, store_levelset,
    slab_domain_decomposition, q_filtered_wrt
  • Added MPI broadcast for new real parameters: u_inf_ref, rho_inf_ref,
    P_inf_ref, filter_width, forcing_dt, fluid_volume_fraction
+7/-3     
m_start_up.fpp
Pre-processing domain bounds initialization                           

src/pre_process/m_start_up.fpp

  • Added call to s_mpi_global_domain_bounds to compute global domain
    bounds during preprocessing
  • Added module variable declarations for periodic immersed boundaries
    and levelset storage flags
+5/-2     
m_global_parameters.fpp
Global parameters for volume filtering and periodic forcing

src/simulation/m_global_parameters.fpp

  • Added 20 new global parameters for volume filtering, periodic forcing,
    and immersed boundary features
  • Initialized new parameters with default values in
    s_assign_default_values_to_user_inputs
  • Extended MPI I/O data allocation to handle 109 additional variables
    for volume filtering
  • Added GPU memory declarations for key filtering and forcing parameters
+50/-0   
m_start_up.fpp
Post-processing output for filtered quantities and statistics

src/post_process/m_start_up.fpp

  • Added filtered quantities output when q_filtered_wrt is enabled
  • Writes filtered fluid indicator function and statistical moments
    (Reynolds stress, effective viscosity, interfacial momentum exchange,
    filtered conserved variables, filtered pressure)
  • Computes 4th-order statistical moments for each quantity
+61/-1   
m_data_output.fpp
Conditional levelset storage in restart files                       

src/pre_process/m_data_output.fpp

  • Made levelset output conditional based on store_levelset flag
  • Fixed MPI I/O data type specifications for levelset and levelset norm
    writing
  • Wrapped levelset file operations in conditional block to skip when
    store_levelset is false
+34/-32 
m_global_parameters.fpp
Post-process global parameters for filtering and IB features

src/post_process/m_global_parameters.fpp

  • Added 4 new logical parameters for periodic immersed boundaries,
    levelset storage, domain decomposition, and filtered output
  • Initialized new parameters with appropriate default values
  • Extended MPI I/O data allocation for filtered quantities when
    q_filtered_wrt is enabled
+33/-9   
m_mpi_proxy.fpp
MPI broadcast for new immersed boundary parameters             

src/pre_process/m_mpi_proxy.fpp

  • Added 3 new logical parameters to MPI broadcast list: periodic_ibs,
    store_levelset, slab_domain_decomposition
  • Reordered parameter list for consistency
+2/-1     
m_mpi_proxy.fpp
MPI broadcast for post-process filtering parameters           

src/post_process/m_mpi_proxy.fpp

  • Added 4 new logical parameters to MPI broadcast: periodic_ibs,
    store_levelset, slab_domain_decomposition, q_filtered_wrt
+2/-1     
m_rhs.fpp
Periodic forcing integration into RHS computation               

src/simulation/m_rhs.fpp

  • Added module import for m_additional_forcing
  • Integrated periodic forcing computation into RHS calculation with NVTX
    profiling ranges
  • Calls s_compute_periodic_forcing when periodic_forcing is enabled
+8/-0     
m_global_parameters.fpp
Pre-process global parameters for periodic IB support       

src/pre_process/m_global_parameters.fpp

  • Added 3 new logical parameters for periodic immersed boundaries and
    domain decomposition
  • Initialized parameters with default values in
    s_assign_default_values_to_user_inputs
+7/-0     
p_main.fpp
Volume filtering initialization in main simulation             

src/simulation/p_main.fpp

  • Added module import for m_volume_filtering
  • Initialized fluid indicator function for volume filtering and periodic
    forcing
  • Initialized filtering kernel and filtered fluid indicator function
    when volume filtering is enabled
  • Initialized fluid indicator gradient computation
+9/-0     
m_constants.fpp
Increased maximum patches limit for IB support                     

src/common/m_constants.fpp

  • Increased num_patches_max from 10 to 1000 to support larger numbers of
    immersed boundaries
+1/-1     
case_validator.py
Case validation for volume filtering and periodic forcing

toolchain/mfc/case_validator.py

  • Increased num_ibs validation limit from 10 to 1000
  • Added validation for periodic immersed boundaries requiring all
    periodic BCs
  • Added check_periodic_forcing method validating required parameters for
    periodic forcing
  • Added check_volume_filtering method validating filter width, domain
    decomposition, and grid requirements
  • Integrated new validation checks into validate_common method
+77/-1   
case_dicts.py
Case dictionary parameter definitions for new features     

toolchain/mfc/run/case_dicts.py

  • Added 3 new logical parameters to COMMON dictionary for periodic IB
    support
  • Increased IB patch range from 10 to 1000 in both PRE_PROCESS and
    SIMULATION
  • Added 15 new parameters to SIMULATION dictionary for volume filtering
    and periodic forcing
  • Added q_filtered_wrt to POST_PROCESS dictionary
+20/-2   
Configuration changes
1 files
delta.mako
GPU memory allocation increase for Delta system                   

toolchain/templates/delta.mako

  • Increased memory allocation from 208G to 240G for GPU-enabled jobs on
    NCSA Delta
+1/-1     
Tests
6 files
case.py
Test case for volume filtered particle-laden flow               

runs/phi01/case.py

  • New test case for particle-laden flow with volume filtering
  • Configures 190 spheres in periodic domain with Mach 0.8 flow
  • Enables periodic forcing, volume filtering, and particle drag
    computation
  • Sets up filtering kernel width and domain decomposition parameters
+171/-0 
case.py
Single sphere volume filtering test case                                 

runs/3d_1sphere_filtering/case.py

  • New test case for single sphere with volume filtering
  • Configures 63³ grid with Mach 1.2 flow at Re=1500
  • Enables periodic forcing, volume filtering, and filtered output
    writing
  • Sets up slab domain decomposition for filtering operations
+154/-0 
case.py
Periodic immersed boundary test case                                         

runs/3d_1sphere_periodic/case.py

  • New test case for periodic immersed boundary with single sphere
  • Configures 63³ grid with Mach 1.2 flow
  • Enables periodic forcing and periodic immersed boundaries
  • Uses slab domain decomposition
+153/-0 
case.py
Off-center periodic IB test configuration                               

runs/3d_periodic_ibs_test/off-center/case.py

  • Test case for off-center periodic immersed boundary
  • Sphere positioned at (15D, 15D, 15D) in periodic domain
  • Enables periodic forcing and periodic immersed boundaries
  • Uses slab domain decomposition
+146/-0 
case.py
Centered periodic IB test configuration                                   

runs/3d_periodic_ibs_test/centered/case.py

  • Test case for centered periodic immersed boundary
  • Sphere positioned at domain center (0, 0, 0)
  • Enables periodic forcing and periodic immersed boundaries
  • Uses slab domain decomposition
+146/-0 
case.py
Drag force computation test case                                                 

runs/3d_drag_test/case.py

  • Test case for explicit drag force computation on single sphere
  • Configures 99³ grid with Mach 1.2 flow at Re=1500
  • Enables particle drag computation with reference flow parameters
  • Uses ghost cell extrapolation boundary conditions
+144/-0 
Miscellaneous
3 files
gen_voronoi_3D.py
3D Voronoi sphere distribution generator                                 

voronoi/gen_voronoi_3D.py

  • New utility script for generating 3D Voronoi-based sphere
    distributions
  • Implements Lloyd relaxation algorithm for uniform sphere packing
  • Generates sphere locations for particle-laden flow test cases
  • Validates non-overlapping sphere placement
+98/-0   
gen_voronoi_2D.py
2D Voronoi circle distribution generator                                 

voronoi/gen_voronoi_2D.py

  • New utility script for generating 2D Voronoi-based circle
    distributions
  • Implements Lloyd relaxation algorithm for uniform circle packing
  • Visualizes initial and relaxed point distributions
+99/-0   
sphere_array_locations.txt
Sphere location data for phi=0.1 test case                             

runs/phi01/sphere_array_locations.txt

  • Data file containing 190 sphere center coordinates for particle-laden
    flow test case
  • Coordinates generated using Voronoi-based Lloyd relaxation for uniform
    distribution
+190/-0 

Summary by CodeRabbit

  • New Features

    • Periodic immersed boundaries with automatic image wrapping.
    • FFT-based volume filtering for momentum with MPI-enabled workflow.
    • Periodic forcing with configurable averaging windows and timesteps.
    • Particle-drag computation and extensive statistics (Reynolds stress, effective viscosity, interaction moments).
    • Voronoi 2D/3D relaxation tools and several ready-to-run 3D IB case templates.
  • Improvements

    • Increased max patches (10→1000) and optional levelset storage to save memory.
    • Enhanced scalar-field boundary buffering, slab domain decomposition, and expanded MPI I/O for filtered data.



CodeAnt-AI Description

Volume filtered Euler–Lagrange solver: explicit computation of unclosed momentum terms and particle forces

What Changed

  • Adds a volume-filtering module that computes explicit unclosed terms in the filtered momentum equation (pseudo-turbulent Reynolds stress, effective viscosity, interphase momentum exchange) and produces Favre-filtered primitive and conservative fields.
  • Computes and writes per-particle forces by integrating the divergence of pressure+viscous stress over immersed boundaries; writes binary particle_force.bin on runs that enable particle drag computation.
  • Implements an FFT-based Gaussian filtering kernel (CPU/GPU paths) and parallel slab transposes to support the filter, plus routines to initialize and finalize related buffers and plans.
  • Exposes filtered fields and their statistics to the data-output path so filtered conservative variables, filtered pressure, Reynolds stresses, effective viscosity, and interphase exchange can be included in parallel output; startup now gathers global domain bounds needed for periodic immersed boundaries.
  • Adds support for periodic immersed boundaries and fixes related MPI/GPU data-transfer issues observed when volume filtering and periodic forcing run on GPUs.
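
As a rough illustration of the periodic immersed-boundary handling above, the following is a minimal sketch of a signed-distance (levelset) evaluation for a sphere in a fully periodic box using the minimum-image convention; the function name, arguments, and sign convention are hypothetical and do not mirror m_compute_levelset.fpp, which instead enumerates explicit periodic projections of each sphere center.

import numpy as np

def periodic_sphere_levelset(x, center, radius, box_length):
    # Signed distance to the nearest periodic image of the sphere surface
    # (assumed positive outside the sphere, negative inside).
    d = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    d -= box_length*np.round(d/box_length)  # minimum-image fold into [-L/2, L/2)
    return float(np.linalg.norm(d)) - radius

For an orthorhombic periodic box, the per-component fold selects the nearest image, so the result matches a search over all periodic projections of the center.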

Impact

✅ Clearer momentum closure terms in filtered simulations
✅ More accurate particle drag reports (written per-step when enabled)
✅ Ability to output filtered fields and full tensor statistics for analysis


Contributor

coderabbitai bot commented Nov 15, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Adds periodic immersed-boundary support, FFT-based volume filtering, periodic forcing, and statistics modules; expands patch capacity (10→1000); introduces scalar-field MPI buffers and levelset periodic wrapping; adds many new case scripts, Voronoi utilities, and toolchain/validation updates. Outputs and I/O paths updated to optionally include filtered/statistics data.

Changes

Cohort / File(s) Summary
New case configs & data
runs/.../*.py, examples/.../case.py, runs/phi01/sphere_array_locations.txt, examples/3D_random_sphere_array/sphere_array_locations.txt
Added multiple Python case builders and two sphere-location data files that emit JSON configs for IB simulations; a few example cases had small deletions (removed mu/eps definitions).
IB patch capacity & core IB logic
src/common/m_constants.fpp, src/common/m_ib_patches.fpp, src/simulation/m_ibm.fpp
Increased num_patches_max 10→1000; added periodic-image centroid handling and image-point/ghost-point logic; conditional deallocation when levelsets not stored.
Levelset & scalar-field BCs
src/common/m_compute_levelset.fpp, src/common/m_boundary_common.fpp
Levelset distance calculations extended to consider periodic images; new routines to populate scalar-field buffers and apply periodic scalar BCs.
MPI & global-domain
src/common/m_mpi_common.fpp, src/pre_process/m_mpi_proxy.fpp, src/simulation/m_mpi_proxy.fpp
New scalar-field send/recv buffers, domain_glb, global-domain bounds broadcast/reduction routines; extended MPI broadcast lists.
Volume filtering subsystem
src/simulation/m_volume_filtering.fpp, src/simulation/m_time_steppers.fpp, src/simulation/p_main.fpp
New FFT-based volume-filtering module (many public routines), MPI slab transposes, batched FFTs, filter kernel setup and outputs; time-stepper and main init integrate filter.
Periodic forcing module & RHS integration
src/simulation/m_additional_forcing.fpp, src/simulation/m_rhs.fpp
New periodic forcing module (init, compute, finalize) performing spatial/phase averaging and RHS augmentation; m_rhs calls forcing when enabled.
Statistics & filtered-data plumbing
src/simulation/m_compute_statistics.fpp, src/simulation/m_data_output.fpp, src/post_process/m_data_input.f90, src/post_process/m_global_parameters.fpp, src/post_process/m_start_up.fpp
New statistics module for unclosed momentum terms and moments; extended I/O and data-input to allocate/read/write filtered indicators and statistics; new flags to control storage.
Simulation global parameters & startup
src/simulation/m_global_parameters.fpp, src/simulation/m_start_up.fpp, src/pre_process/m_global_parameters.fpp, src/pre_process/m_start_up.fpp
Added many public flags/params (periodic_ibs, periodic_forcing, forcing_*, volume_filtering_momentum_eqn, store_levelset, slab_domain_decomposition, q_filtered_wrt, etc.); startup reads namelist entries and initializes new subsystems.
Toolchain & validation
toolchain/mfc/case_validator.py, toolchain/mfc/run/case_dicts.py, toolchain/templates/delta.mako
Added validators for periodic_forcing and volume filtering constraints; expanded case_dict schema (patch range → 1..1000) and new keys; increased SBATCH GPU memory.
Voronoi utilities
voronoi/gen_voronoi_2D.py, voronoi/gen_voronoi_3D.py
New Lloyd-relaxation utilities for 2D/3D using freud, with centroids/volume helpers, visualization and file output.
Minor examples/tests
examples/..., tests/331C1DFC/golden-metadata.txt
Small example removals and new test metadata file added.

Sequence Diagram(s)

sequenceDiagram
    participant Main as Simulation Loop
    participant Volume as m_volume_filtering
    participant FFT as MPI+FFTW
    participant Stress as Stress Compute
    participant Particle as Particle Forces

    Main->>Volume: s_volume_filter_momentum_eqn(q_cons, q_prim, dyn_visc, bc_type)
    Volume->>Stress: s_generate_viscous_stress_tensor(q_prim, dyn_visc)
    Stress->>Stress: compute viscous & pressure components
    Volume->>Stress: s_compute_divergence_stress_tensor()
    Volume->>FFT: s_filter_batch(reynolds_stress, pressure, ...)
    FFT->>FFT: s_mpi_FFT_fwd() / s_mpi_transpose_slab*
    FFT->>FFT: apply Gaussian filter (freq space)
    FFT->>FFT: s_mpi_FFT_bwd()
    Volume->>Particle: s_compute_particle_forces()
    Particle->>Main: reduced particle forces returned
sequenceDiagram
    participant RHS as RHS assembly
    participant Forcing as m_additional_forcing
    participant Avg as Spatial Averager
    participant Phase as Phase Averager

    RHS->>Forcing: s_compute_periodic_forcing(rhs_vf, q_cons_vf, q_prim_vf, t_step)
    Forcing->>Avg: compute domain integrals (rho,u,e)
    Avg->>Avg: MPI reductions -> global averages
    Forcing->>Phase: update forcing_window, compute phase averages
    Forcing->>RHS: update RHS fields (continuity, momentum, energy) conditioned on indicator

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

size:XXL

Suggested reviewers

  • sbryngelson
  • wilfonba

Poem

🐇 I hopped through domains both near and far,
Patches many—now a thousand star,
Filters hum and forces sway,
Voronoi points find their way,
Statistics dance: hooray, hooray!

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 53.85%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Title check (✅ Passed): The title clearly summarizes the main addition: explicit computation of unclosed terms in volume-filtered Euler-Lagrange frameworks for particle-laden flows.
  • Description check (✅ Passed): The PR description is comprehensive and detailed with clear objectives, scope, testing, and checklist completion, but several critical items remain unchecked (Doxygen docstrings, documentation updates, regression tests).



@sbryngelson
Member

240 commits behind master...

@danieljvickers
Member

@sbryngelson that is because there was a commit that was put on main that was not squashed. In terms of squashed commits, it isn't as bad.

@sbryngelson
Member

good! glad to hear

@conraddelgado conraddelgado marked this pull request as ready for review January 21, 2026 00:59
@qodo-code-review
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Issue

The online statistics update/normalization appears numerically unsafe for early samples: the update uses ns in a division and the moment formulas divide by (ns-1), (ns-2), (ns-3) and by powers of M2. This can lead to division by zero/NaNs when the statistics window is small (including the first step after start) or when variance is near zero. Also, the computation is gated on a final-step condition using t_step_stop, which is not passed in and must be validated as correctly scoped and consistent with when outputs are written.

subroutine s_compute_statistics_momentum_unclosed_terms(t_step, t_step_start_stats, reynolds_stress, eff_visc, int_mom_exch, q_cons_filtered, filtered_pressure)
    type(scalar_field), dimension(num_dims, num_dims), intent(in) :: reynolds_stress
    type(scalar_field), dimension(num_dims, num_dims), intent(in) :: eff_visc
    type(scalar_field), dimension(num_dims), intent(in) :: int_mom_exch
    type(scalar_field), dimension(sys_size), intent(in) :: q_cons_filtered
    type(scalar_field), intent(in) :: filtered_pressure
    integer, intent(in) :: t_step
    integer, intent(in) :: t_step_start_stats

    real(wp) :: ns
    integer :: i, j

    ns = real(t_step - t_step_start_stats, wp)

    ! update M1, M2, M3, M4
    do i = 1, num_dims
        do j = 1, num_dims
            call s_update_statistics(ns, reynolds_stress(i, j), Msn_reynolds_stress(i, j)%vf)
            call s_update_statistics(ns, eff_visc(i, j), Msn_eff_visc(i, j)%vf)
        end do
        call s_update_statistics(ns, int_mom_exch(i), Msn_int_mom_exch(i)%vf)
    end do
    do i = 1, sys_size - 1
        call s_update_statistics(ns, q_cons_filtered(i), Msn_q_cons_filtered(i)%vf)
    end do
    call s_update_statistics(ns, filtered_pressure, Msn_filtered_pressure)

    ! compute 1st, 2nd, 3rd, 4th order statistical moments
    if (t_step == t_step_stop - 1) then ! only compute at final time
        do i = 1, num_dims
            do j = 1, num_dims
                call s_compute_statistical_moments(ns, Msn_reynolds_stress(i, j)%vf, stat_reynolds_stress(i, j)%vf)
                call s_compute_statistical_moments(ns, Msn_eff_visc(i, j)%vf, stat_eff_visc(i, j)%vf)
            end do
            call s_compute_statistical_moments(ns, Msn_int_mom_exch(i)%vf, stat_int_mom_exch(i)%vf)
        end do
        do i = 1, sys_size - 1
            call s_compute_statistical_moments(ns, Msn_q_cons_filtered(i)%vf, stat_q_cons_filtered(i)%vf)
        end do
        call s_compute_statistical_moments(ns, Msn_filtered_pressure, stat_filtered_pressure)
    end if

end subroutine s_compute_statistics_momentum_unclosed_terms

subroutine s_update_statistics(ns, q_temp, Msn)
    type(scalar_field), intent(in) :: q_temp
    type(scalar_field), dimension(4), intent(inout) :: Msn

    real(wp), intent(in) :: ns
    real(wp) :: delta, delta_n, delta_n2, delta_f
    integer :: i, j, k

    $:GPU_PARALLEL_LOOP(collapse=3, copyin='[ns]', private='[delta, delta_n, delta_n2, delta_f]')
    do i = 0, m
        do j = 0, n
            do k = 0, p
                delta = q_temp%sf(i, j, k) - Msn(1)%sf(i, j, k)
                delta_n = delta/ns
                delta_n2 = delta_n**2
                delta_f = delta*delta_n*(ns - 1._wp)

                Msn(1)%sf(i, j, k) = Msn(1)%sf(i, j, k) + delta_n
                Msn(4)%sf(i, j, k) = Msn(4)%sf(i, j, k) + delta_f*delta_n2*(ns**2 - 3._wp*ns + 3._wp) + 6._wp*delta_n2*Msn(2)%sf(i, j, k) - 4._wp*delta_n*Msn(3)%sf(i, j, k)
                Msn(3)%sf(i, j, k) = Msn(3)%sf(i, j, k) + delta_f*delta_n*(ns - 2._wp) - 3._wp*delta_n*Msn(2)%sf(i, j, k)
                Msn(2)%sf(i, j, k) = Msn(2)%sf(i, j, k) + delta_f
            end do
        end do
    end do

end subroutine s_update_statistics

subroutine s_compute_statistical_moments(ns, Msn, q_stat)
    type(scalar_field), dimension(4), intent(in) :: Msn
    type(scalar_field), dimension(4), intent(inout) :: q_stat

    real(wp), intent(in) :: ns
    integer :: i, j, k

    $:GPU_PARALLEL_LOOP(collapse=3, copyin='[ns]')
    do i = 0, m
        do j = 0, n
            do k = 0, p
                q_stat(1)%sf(i, j, k) = Msn(1)%sf(i, j, k)
                q_stat(2)%sf(i, j, k) = Msn(2)%sf(i, j, k)/(ns - 1._wp)
                q_stat(3)%sf(i, j, k) = sqrt(ns - 1._wp)/(ns - 2._wp)*ns*Msn(3)%sf(i, j, k)/(Msn(2)%sf(i, j, k)**1.5)
                q_stat(4)%sf(i, j, k) = (ns - 1._wp)/((ns - 2._wp)*(ns - 3._wp))*((ns + 1._wp)*(ns*Msn(4)%sf(i, j, k)/(Msn(2)%sf(i, j, k)**2) - 3._wp) + 6._wp)
            end do
        end do
    end do

end subroutine s_compute_statistical_moments
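
For reference alongside the concern above, here is a minimal NumPy sketch of the same single-pass (Welford/Pebay-style) update of the M1-M4 accumulators with a guarded finalization; it is illustrative only, uses the same bias-corrected skewness/kurtosis forms as the quoted Fortran, and adds the small-sample and zero-variance guards the comment asks about.

import numpy as np

class OnlineMoments:
    def __init__(self, shape):
        self.n = 0
        self.M1 = np.zeros(shape); self.M2 = np.zeros(shape)
        self.M3 = np.zeros(shape); self.M4 = np.zeros(shape)

    def update(self, x):
        # One-sample update of the central-moment accumulators (n counts this sample).
        self.n += 1
        n = self.n
        delta = x - self.M1
        delta_n = delta/n
        delta_n2 = delta_n**2
        term1 = delta*delta_n*(n - 1)
        self.M1 += delta_n
        self.M4 += term1*delta_n2*(n*n - 3*n + 3) + 6*delta_n2*self.M2 - 4*delta_n*self.M3
        self.M3 += term1*delta_n*(n - 2) - 3*delta_n*self.M2
        self.M2 += term1

    def finalize(self):
        n = self.n
        mean = self.M1.copy()
        var = self.M2/(n - 1) if n > 1 else np.zeros_like(self.M2)
        skew = np.zeros_like(self.M2)
        kurt = np.zeros_like(self.M2)
        if n > 3:
            ok = self.M2 > 0  # skip cells whose variance accumulator is (near) zero
            skew[ok] = n*np.sqrt(n - 1.0)/(n - 2)*self.M3[ok]/self.M2[ok]**1.5
            kurt[ok] = (n - 1.0)/((n - 2)*(n - 3))*((n + 1)*(n*self.M4[ok]/self.M2[ok]**2 - 3.0) + 6.0)
        return mean, var, skew, kurt
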
Possible Issue

The new MPI-IO setup for filtered variables hard-codes an alt_sys size and populates MPI_IO_DATA views/vars using offsets relative to sys_size. This indexing is easy to get wrong and may write/read beyond allocated MPI_IO_DATA%var/view or mismatch the number of variables actually written (especially since multiple sites compute alt_sys). The view creation loop bounds should be double-checked to match exactly the indices populated and the file layout expected by post-process.

    !! @param filtered_fluid_indicator_function volume filtered fluid indicator function
    !! @param stat_q_cons_filtered 1-4 order statistical moments of volume filtered conservative variables
    !! @param stat_filtered_pressure 1-4 order statistical moments of volume filtered pressure
    !! @param stat_reynolds_stress 1-4 order statistics of reynolds stress tensor
    !! @param stat_eff_visc 1-4 order statistics of unclosed effective viscosity tensor
    !! @param stat_int_mom_exch 1-4 order statistics of interphase momentum exchange vector
    impure subroutine s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
                                                     stat_q_cons_filtered, stat_filtered_pressure, &
                                                     stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)

        type(scalar_field), intent(in) :: filtered_fluid_indicator_function
        type(vector_field), dimension(sys_size - 1), intent(in) :: stat_q_cons_filtered
        type(scalar_field), dimension(4), intent(in) :: stat_filtered_pressure
        type(vector_field), dimension(num_dims, num_dims), intent(in) :: stat_reynolds_stress
        type(vector_field), dimension(num_dims, num_dims), intent(in) :: stat_eff_visc
        type(vector_field), dimension(num_dims), intent(in) :: stat_int_mom_exch

        integer, dimension(num_dims) :: sizes_glb, sizes_loc

#ifdef MFC_MPI

        ! Generic loop iterator
        integer :: i, j, k
        integer :: ierr !< Generic flag used to identify and report MPI errors

        !total system size
        integer :: alt_sys

        alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4 ! 109

        MPI_IO_DATA%var(sys_size + 1)%sf => filtered_fluid_indicator_function%sf(0:m, 0:n, 0:p)
        do i = 1, num_dims
            do j = 1, num_dims
                do k = 1, 4
                    MPI_IO_DATA%var(sys_size + 1 + (i - 1)*4*num_dims + (j - 1)*4 + k)%sf => stat_reynolds_stress(i, j)%vf(k)%sf(0:m, 0:n, 0:p)
                end do
            end do
        end do
        do i = 1, num_dims
            do j = 1, num_dims
                do k = 1, 4
                    MPI_IO_DATA%var(sys_size + 37 + (i - 1)*4*num_dims + (j - 1)*4 + k)%sf => stat_eff_visc(i, j)%vf(k)%sf(0:m, 0:n, 0:p)
                end do
            end do
        end do
        do i = 1, num_dims
            do j = 1, 4
                MPI_IO_DATA%var(sys_size + 73 + (i - 1)*4 + j)%sf => stat_int_mom_exch(i)%vf(j)%sf(0:m, 0:n, 0:p)
            end do
        end do
        do i = 1, sys_size - 1
            do j = 1, 4
                MPI_IO_DATA%var(sys_size + 85 + (i - 1)*4 + j)%sf => stat_q_cons_filtered(i)%vf(j)%sf(0:m, 0:n, 0:p)
            end do
        end do
        do i = 1, 4
            MPI_IO_DATA%var(sys_size + 105 + i)%sf => stat_filtered_pressure(i)%sf(0:m, 0:n, 0:p)
        end do

        ! Define global(g) and local(l) sizes for flow variables
        sizes_glb(1) = m_glb + 1; sizes_loc(1) = m + 1
        if (n > 0) then
            sizes_glb(2) = n_glb + 1; sizes_loc(2) = n + 1
            if (p > 0) then
                sizes_glb(3) = p_glb + 1; sizes_loc(3) = p + 1
            end if
        end if

        ! Define the view for each variable
        do i = sys_size, alt_sys
            call MPI_TYPE_CREATE_SUBARRAY(num_dims, sizes_glb, sizes_loc, start_idx, &
                                          MPI_ORDER_FORTRAN, mpi_p, MPI_IO_DATA%view(i), ierr)
            call MPI_TYPE_COMMIT(MPI_IO_DATA%view(i), ierr)
        end do

#endif

    end subroutine s_initialize_mpi_data_filtered
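
To illustrate the cross-check recommended above, here is a small sketch that derives the appended-variable count and the first MPI_IO_DATA%var index of each block from num_dims and sys_size instead of hard-coded constants; the block labels are descriptive only, not identifiers from the PR.

def filtered_variable_layout(sys_size, num_dims, n_moments=4):
    # Blocks appended after the sys_size flow variables, in write order.
    blocks = [
        ("fluid_indicator",   1),
        ("reynolds_stress",   n_moments*num_dims*num_dims),
        ("eff_visc",          n_moments*num_dims*num_dims),
        ("int_mom_exch",      n_moments*num_dims),
        ("q_cons_filtered",   n_moments*(sys_size - 1)),
        ("filtered_pressure", n_moments),
    ]
    offsets, last = {}, sys_size
    for name, count in blocks:
        offsets[name] = last + 1  # first var index used by this block
        last += count
    return last, offsets          # 'last' is the total variable count (alt_sys)

For sys_size = 7 and num_dims = 3 this gives 7 + 1 + 36 + 36 + 12 + 24 + 4 = 120 variables in total; checking the derived block starts against the hard-coded +37/+73/+85/+105 offsets and the alt_sys expression is the kind of audit the comment above recommends.
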
Duplicate Code

The filtered-statistics allocation/deallocation repeats the same nested allocation patterns used elsewhere (e.g., for stats fields in simulation). This duplication increases the chance of shape/index mismatches (num_dims/sys_size/4) between writer and reader. Consider consolidating the filtered-variable count/packing and allocation into shared helpers/constants to avoid drift, and ensure the reader’s MPI-IO types and counts match what the writer produces.

!> Helper subroutine to allocate filtered variable arrays for given dimensionality
!!  @param start_idx Starting index for allocation
!!  @param end_x, end_y, end_z End indices for each dimension
impure subroutine s_allocate_filtered_arrays(local_start_idx, end_x, end_y, end_z)

    integer, intent(in) :: local_start_idx, end_x, end_y, end_z
    integer :: i, j, k

    allocate (filtered_fluid_indicator_function%sf(local_start_idx:end_x, local_start_idx:end_y, local_start_idx:end_z))

    do i = 1, sys_size - 1
        allocate (stat_q_cons_filtered(i)%vf(1:4))
    end do
    do i = 1, sys_size - 1
        do j = 1, 4
            allocate (stat_q_cons_filtered(i)%vf(j)%sf(local_start_idx:end_x, local_start_idx:end_y, local_start_idx:end_z))
        end do
    end do

    do i = 1, 4
        allocate (stat_filtered_pressure(i)%sf(local_start_idx:end_x, local_start_idx:end_y, local_start_idx:end_z))
    end do

    do i = 1, num_dims
        do j = 1, num_dims
            allocate (stat_reynolds_stress(i, j)%vf(1:4))
        end do
    end do
    do i = 1, num_dims
        do j = 1, num_dims
            do k = 1, 4
                allocate (stat_reynolds_stress(i, j)%vf(k)%sf(local_start_idx:end_x, local_start_idx:end_y, local_start_idx:end_z))
            end do
        end do
    end do

    do i = 1, num_dims
        do j = 1, num_dims
            allocate (stat_eff_visc(i, j)%vf(1:4))
        end do
    end do
    do i = 1, num_dims
        do j = 1, num_dims
            do k = 1, 4
                allocate (stat_eff_visc(i, j)%vf(k)%sf(local_start_idx:end_x, local_start_idx:end_y, local_start_idx:end_z))
            end do
        end do
    end do

    do i = 1, num_dims
        allocate (stat_int_mom_exch(i)%vf(1:4))
    end do
    do i = 1, num_dims
        do j = 1, 4
            allocate (stat_int_mom_exch(i)%vf(j)%sf(local_start_idx:end_x, local_start_idx:end_y, local_start_idx:end_z))
        end do
    end do

end subroutine s_allocate_filtered_arrays

Comment on lines +577 to +586
$:GPU_PARALLEL_LOOP(collapse=3, copyin='[fd_coeffs]')
do i = 0, m
do j = 0, n
do k = 0, p
$:GPU_LOOP(parallelism='[seq]')
do l = -4, 4
grad_fluid_indicator(1)%sf(i, j, k) = grad_fluid_indicator(1)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i + l, j, k)/dx(i)
grad_fluid_indicator(2)%sf(i, j, k) = grad_fluid_indicator(2)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j + l, k)/dy(j)
grad_fluid_indicator(3)%sf(i, j, k) = grad_fluid_indicator(3)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j, k + l)/dz(k)
end do

Suggestion: To prevent out-of-bounds memory access in the finite-difference stencil, restrict the loop ranges for i, j, and k to 4:m-4, 4:n-4, and 4:p-4 respectively. [possible issue, importance: 8]

Suggested change
$:GPU_PARALLEL_LOOP(collapse=3, copyin='[fd_coeffs]')
do i = 0, m
do j = 0, n
do k = 0, p
$:GPU_LOOP(parallelism='[seq]')
do l = -4, 4
grad_fluid_indicator(1)%sf(i, j, k) = grad_fluid_indicator(1)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i + l, j, k)/dx(i)
grad_fluid_indicator(2)%sf(i, j, k) = grad_fluid_indicator(2)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j + l, k)/dy(j)
grad_fluid_indicator(3)%sf(i, j, k) = grad_fluid_indicator(3)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j, k + l)/dz(k)
end do
$:GPU_PARALLEL_LOOP(collapse=3, copyin='[fd_coeffs]')
do i = 4, m-4
do j = 4, n-4
do k = 4, p-4
$:GPU_LOOP(parallelism='[seq]')
do l = -4, 4
grad_fluid_indicator(1)%sf(i, j, k) = grad_fluid_indicator(1)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i + l, j, k)/dx(i)
grad_fluid_indicator(2)%sf(i, j, k) = grad_fluid_indicator(2)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j + l, k)/dy(j)
grad_fluid_indicator(3)%sf(i, j, k) = grad_fluid_indicator(3)%sf(i, j, k) + fd_coeffs(l)*fluid_indicator_function%sf(i, j, k + l)/dz(k)
end do

Comment on lines +215 to +217
!< local sizes on each processor
Nyloc = Ny/num_procs
Nzloc = p + 1

Suggestion: Add checks to ensure Ny is divisible by num_procs to prevent errors in the slab decomposition for MPI parallelization. [possible issue, importance: 7]

Suggested change
!< local sizes on each processor
Nyloc = Ny/num_procs
Nzloc = p + 1
!< local sizes on each processor
if (mod(Ny, num_procs) /= 0) then
write(*,*) 'Error: Ny not divisible by num_procs'
stop
end if
Nyloc = Ny/num_procs
if (mod(Nz, num_procs) /= 0) then
write(*,*) 'Error: Nz not divisible by num_procs'
stop
end if
Nzloc = Nz/num_procs

Comment on lines +239 to +249
$:GPU_PARALLEL_LOOP(collapse=3, copyin='[ns]')
do i = 0, m
do j = 0, n
do k = 0, p
q_stat(1)%sf(i, j, k) = Msn(1)%sf(i, j, k)
q_stat(2)%sf(i, j, k) = Msn(2)%sf(i, j, k)/(ns - 1._wp)
q_stat(3)%sf(i, j, k) = sqrt(ns - 1._wp)/(ns - 2._wp)*ns*Msn(3)%sf(i, j, k)/(Msn(2)%sf(i, j, k)**1.5)
q_stat(4)%sf(i, j, k) = (ns - 1._wp)/((ns - 2._wp)*(ns - 3._wp))*((ns + 1._wp)*(ns*Msn(4)%sf(i, j, k)/(Msn(2)%sf(i, j, k)**2) - 3._wp) + 6._wp)
end do
end do
end do

Suggestion: Add checks in the statistical moments calculation to prevent division by zero when the number of samples ns is small or when the second moment Msn(2) is zero. Initialize higher-order moments to a safe value in these cases. [possible issue, importance: 8]

Suggested change
$:GPU_PARALLEL_LOOP(collapse=3, copyin='[ns]')
do i = 0, m
do j = 0, n
do k = 0, p
q_stat(1)%sf(i, j, k) = Msn(1)%sf(i, j, k)
q_stat(2)%sf(i, j, k) = Msn(2)%sf(i, j, k)/(ns - 1._wp)
q_stat(3)%sf(i, j, k) = sqrt(ns - 1._wp)/(ns - 2._wp)*ns*Msn(3)%sf(i, j, k)/(Msn(2)%sf(i, j, k)**1.5)
q_stat(4)%sf(i, j, k) = (ns - 1._wp)/((ns - 2._wp)*(ns - 3._wp))*((ns + 1._wp)*(ns*Msn(4)%sf(i, j, k)/(Msn(2)%sf(i, j, k)**2) - 3._wp) + 6._wp)
end do
end do
end do
$:GPU_PARALLEL_LOOP(collapse=3, copyin='[ns]')
do i = 0, m
do j = 0, n
do k = 0, p
q_stat(1)%sf(i, j, k) = Msn(1)%sf(i, j, k)
if (ns > 1._wp) then
q_stat(2)%sf(i, j, k) = Msn(2)%sf(i, j, k)/(ns - 1._wp)
else
q_stat(2)%sf(i, j, k) = 0._wp
end if
if (ns > 2._wp .and. Msn(2)%sf(i, j, k) > 0._wp) then
q_stat(3)%sf(i, j, k) = sqrt(ns - 1._wp)/(ns - 2._wp)*ns*Msn(3)%sf(i, j, k)/(Msn(2)%sf(i, j, k)**1.5)
else
q_stat(3)%sf(i, j, k) = 0._wp
end if
if (ns > 3._wp .and. Msn(2)%sf(i, j, k) > 0._wp) then
q_stat(4)%sf(i, j, k) = (ns - 1._wp)/((ns - 2._wp)*(ns - 3._wp))*((ns + 1._wp)*(ns*Msn(4)%sf(i, j, k)/(Msn(2)%sf(i, j, k)**2) - 3._wp) + 6._wp)
else
q_stat(4)%sf(i, j, k) = 0._wp
end if
end do
end do
end do

Comment on lines +162 to +164
! f_E
q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt &
+ q_cons_vf(contxe + mom_f_idx)%sf(i, j, k)*q_periodic_force(2)%sf(i, j, k)/q_cons_vf(1)%sf(i, j, k)

Suggestion: Add a check to prevent division by zero in the energy forcing term calculation. Before dividing by density (q_cons_vf(1)%sf), ensure it is above a small positive threshold. [possible issue, importance: 8]

Suggested change
! f_E
q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt &
+ q_cons_vf(contxe + mom_f_idx)%sf(i, j, k)*q_periodic_force(2)%sf(i, j, k)/q_cons_vf(1)%sf(i, j, k)
! f_E
if (q_cons_vf(1)%sf(i, j, k) > 1.e-12_wp) then
q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt &
+ q_cons_vf(contxe + mom_f_idx)%sf(i, j, k)*q_periodic_force(2)%sf(i, j, k)/q_cons_vf(1)%sf(i, j, k)
else
q_periodic_force(3)%sf(i, j, k) = (P_inf_ref*gammas(1) - phase_eps)*forcing_dt
end if

Comment on lines +1604 to 1612
else if (slab_domain_decomposition) then
num_procs_x = 1
num_procs_y = 1
num_procs_z = num_procs
ierr = -1
if (mod((p + 1), num_procs_z) == 0) then
ierr = 0
end if
else

Suggestion: Add a check to validate the slab decomposition. If the number of grid points in the z-direction is not divisible by the number of processors, abort the execution with an error message. [possible issue, importance: 7]

Suggested change
else if (slab_domain_decomposition) then
num_procs_x = 1
num_procs_y = 1
num_procs_z = num_procs
ierr = -1
if (mod((p + 1), num_procs_z) == 0) then
ierr = 0
end if
else
else if (slab_domain_decomposition) then
num_procs_x = 1
num_procs_y = 1
num_procs_z = num_procs
ierr = -1
if (mod((p + 1), num_procs_z) == 0) then
ierr = 0
end if
if (ierr /= 0) then
call s_mpi_abort('Slab decomposition error: (p+1) not divisible by num_procs_z')
end if
else

Comment on lines 848 to 854
if (present(beta)) then
alt_sys = sys_size + 1
else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4
else
alt_sys = sys_size
end if

Suggestion: Correct the alt_sys calculation to include the size of stat_filtered_pressure to prevent buffer overflow during filtered data output. [possible issue, importance: 9]

Suggested change
if (present(beta)) then
alt_sys = sys_size + 1
else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4
else
alt_sys = sys_size
end if
if (present(beta)) then
alt_sys = sys_size + 1
else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4 + 4
else
alt_sys = sys_size
end if

Comment on lines 117 to 118
if (.not. parallel_io) then
call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)

Suggestion: Update the serial I/O path to pass the new optional filtered arguments to s_write_serial_data_files when q_filtered_wrt is enabled. [possible issue, importance: 7]

Suggested change
if (.not. parallel_io) then
call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
if (.not. parallel_io) then
if (q_filtered_wrt .and. present(filtered_fluid_indicator_function)) then
call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta, &
filtered_fluid_indicator_function, stat_q_cons_filtered, stat_filtered_pressure, &
stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
else
call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
end if

Comment on lines +1018 to +1029
else if (volume_filtering_momentum_eqn) then
do i = 1, alt_sys
var_MOK = int(i, MPI_OFFSET_KIND)

! Initial displacement to skip at beginning of file
disp = m_MOK*max(MOK, n_MOK)*max(MOK, p_MOK)*WP_MOK*(var_MOK - 1)

call MPI_FILE_SET_VIEW(ifile, disp, mpi_p, MPI_IO_DATA%view(i), &
'native', mpi_info_int, ierr)
call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf, data_size*mpi_io_type, &
mpi_io_p, status, ierr)
end do

Suggestion: Change the condition for writing filtered variables from volume_filtering_momentum_eqn to q_filtered_wrt combined with a time step check to ensure data is written only when requested. [possible issue, importance: 8]

Suggested change
else if (volume_filtering_momentum_eqn) then
do i = 1, alt_sys
var_MOK = int(i, MPI_OFFSET_KIND)
! Initial displacement to skip at beginning of file
disp = m_MOK*max(MOK, n_MOK)*max(MOK, p_MOK)*WP_MOK*(var_MOK - 1)
call MPI_FILE_SET_VIEW(ifile, disp, mpi_p, MPI_IO_DATA%view(i), &
'native', mpi_info_int, ierr)
call MPI_FILE_WRITE_ALL(ifile, MPI_IO_DATA%var(i)%sf, data_size*mpi_io_type, &
mpi_io_p, status, ierr)
end do
else if (q_filtered_wrt .and. (t_step == 0 .or. t_step == t_step_stop)) then
do i = 1, alt_sys
var_MOK = int(i, MPI_OFFSET_KIND)
...
end do

@cubic-dev-ai cubic-dev-ai bot left a comment

12 issues found across 36 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="voronoi/gen_voronoi_2D.py">

<violation number="1" location="voronoi/gen_voronoi_2D.py:35">
P2: The Lloyd iteration recomputes Voronoi cells using initial_points, so every iteration uses the original tessellation and ignores updated point positions. Use the current points array so the relaxation actually updates.</violation>
</file>

<file name="src/simulation/m_compute_statistics.fpp">

<violation number="1" location="src/simulation/m_compute_statistics.fpp:92">
P2: The OpenACC setup is applied to the whole `Msn_q_cons_filtered` array inside a per-element loop. This likely skips proper setup for each element. Use the indexed element in the setup call.</violation>

<violation number="2" location="src/simulation/m_compute_statistics.fpp:150">
P2: The OpenACC setup is applied to the whole `stat_q_cons_filtered` array inside a per-element loop. Use the indexed element to ensure each field is initialized for GPU use.</violation>
</file>

<file name="src/post_process/m_data_input.f90">

<violation number="1" location="src/post_process/m_data_input.f90:679">
P2: alt_sys is hardcoded for a 3D/sys_size=7 layout, but the allocated filtered arrays are sized using num_dims and sys_size. In 1D/2D or other sys_size values this miscounts variables and can break MPI IO (wrong NVARS_MOK / loop bounds). Compute alt_sys from num_dims and sys_size instead of fixed constants.</violation>
</file>

<file name="src/pre_process/m_data_output.fpp">

<violation number="1" location="src/pre_process/m_data_output.fpp:840">
P2: Levelset MPI-IO should use mpi_io_p and scale the element count by mpi_io_type; using mpi_p/data_size*num_ibs breaks mixed-precision (half) output sizes.</violation>

<violation number="2" location="src/pre_process/m_data_output.fpp:860">
P2: Levelset_norm MPI-IO should use mpi_io_p and multiply by mpi_io_type; using mpi_p with raw element counts breaks mixed-precision output sizing.</violation>
</file>

<file name="voronoi/gen_voronoi_3D.py">

<violation number="1" location="voronoi/gen_voronoi_3D.py:21">
P2: `lloyd_relaxation_3d` mutates `initial_points` because `points` is just a reference. This makes the later “initial points” plot incorrect. Copy the array before modifying it.</violation>

<violation number="2" location="voronoi/gen_voronoi_3D.py:62">
P2: `os.mkdir` will fail if `../runs` doesn’t already exist. Use `os.makedirs(..., exist_ok=True)` to create parent directories safely.</violation>
</file>

<file name="src/simulation/m_ibm.fpp">

<violation number="1" location="src/simulation/m_ibm.fpp:620">
P1: The bound limit now allows `index` to reach the last valid coordinate index, but the loop still accesses `s_cc(index + 1)`. This can read past the end of `x_cc/y_cc/z_cc` when `index == bound`. Keep the bound at `m/n/p + buff_size - 1` so `index+1` stays in range.</violation>
</file>

<file name="runs/phi01/case.py">

<violation number="1" location="runs/phi01/case.py:35">
P2: `np.loadtxt` returns a 1D array for a single sphere, but the loop indexes `sphere_loc[i, 0]`, which raises an index error. Normalize to 2D so single-sphere files work.</violation>

<violation number="2" location="runs/phi01/case.py:158">
P2: `periodic_forcing` with `ib` requires `fluid_volume_fraction`, but the case dict sets `particle_vf` instead, so validation will fail and the forcing code won’t have the required volume fraction. Rename or replace this entry with `fluid_volume_fraction`.</violation>
</file>

<file name="src/common/m_mpi_common.fpp">

<violation number="1" location="src/common/m_mpi_common.fpp:2276">
P2: Deallocate the scalarfield buffers under the same condition used for allocation. Otherwise runs with periodic_forcing enabled (but volume_filtering_momentum_eqn disabled) will leak these allocations.</violation>
</file>


Comment on lines +620 to 627
bound = m + buff_size
elseif (dim == 2) then
s_cc => y_cc
bound = n + buff_size - 1
bound = n + buff_size
else
s_cc => z_cc
bound = p + buff_size - 1
bound = p + buff_size
end if

@cubic-dev-ai cubic-dev-ai bot Jan 21, 2026


P1: The bound limit now allows index to reach the last valid coordinate index, but the loop still accesses s_cc(index + 1). This can read past the end of x_cc/y_cc/z_cc when index == bound. Keep the bound at m/n/p + buff_size - 1 so index+1 stays in range.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/simulation/m_ibm.fpp, line 620:

<comment>The bound limit now allows `index` to reach the last valid coordinate index, but the loop still accesses `s_cc(index + 1)`. This can read past the end of `x_cc/y_cc/z_cc` when `index == bound`. Keep the bound at `m/n/p + buff_size - 1` so `index+1` stays in range.</comment>

<file context>
@@ -476,13 +617,13 @@ contains
                 if (dim == 1) then
                     s_cc => x_cc
-                    bound = m + buff_size - 1
+                    bound = m + buff_size
                 elseif (dim == 2) then
                     s_cc => y_cc
</file context>
Suggested change
bound = m + buff_size
elseif (dim == 2) then
s_cc => y_cc
bound = n + buff_size - 1
bound = n + buff_size
else
s_cc => z_cc
bound = p + buff_size - 1
bound = p + buff_size
end if
bound = m + buff_size - 1
elseif (dim == 2) then
s_cc => y_cc
bound = n + buff_size - 1
else
s_cc => z_cc
bound = p + buff_size - 1
end if


for _ in range(iterations):
voro = freud.locality.Voronoi()
voro_data = voro.compute((box, initial_points))

@cubic-dev-ai cubic-dev-ai bot Jan 21, 2026


P2: The Lloyd iteration recomputes Voronoi cells using initial_points, so every iteration uses the original tessellation and ignores updated point positions. Use the current points array so the relaxation actually updates.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At voronoi/gen_voronoi_2D.py, line 35:

<comment>The Lloyd iteration recomputes Voronoi cells using initial_points, so every iteration uses the original tessellation and ignores updated point positions. Use the current points array so the relaxation actually updates.</comment>

<file context>
@@ -0,0 +1,99 @@
+
+    for _ in range(iterations):
+        voro = freud.locality.Voronoi()
+        voro_data = voro.compute((box, initial_points))
+        vertices = voro_data.polytopes
+
</file context>

integer :: alt_sys
integer :: i

alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4 ! 109, filtered indicator, stats of: R_u, R_mu, F_imet, q_cons_filtered

@cubic-dev-ai cubic-dev-ai bot Jan 21, 2026


P2: alt_sys is hardcoded for a 3D/sys_size=7 layout, but the allocated filtered arrays are sized using num_dims and sys_size. In 1D/2D or other sys_size values this miscounts variables and can break MPI IO (wrong NVARS_MOK / loop bounds). Compute alt_sys from num_dims and sys_size instead of fixed constants.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/post_process/m_data_input.f90, line 679:

<comment>alt_sys is hardcoded for a 3D/sys_size=7 layout, but the allocated filtered arrays are sized using num_dims and sys_size. In 1D/2D or other sys_size values this miscounts variables and can break MPI IO (wrong NVARS_MOK / loop bounds). Compute alt_sys from num_dims and sys_size instead of fixed constants.</comment>

<file context>
@@ -585,6 +656,72 @@ impure subroutine s_read_parallel_conservative_data(t_step, m_MOK, n_MOK, p_MOK,
+        integer :: alt_sys
+        integer :: i
+
+        alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4 ! 109, filtered indicator, stats of: R_u, R_mu, F_imet, q_cons_filtered
+
+        ! Open the file to read volume filtered variables
</file context>
Suggested change
alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4 ! 109, filtered indicator, stats of: R_u, R_mu, F_imet, q_cons_filtered
alt_sys = sys_size + 1 + 4 + (num_dims*num_dims)*4 + (num_dims*num_dims)*4 + (num_dims*4) + (sys_size - 1)*4


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 18

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/simulation/m_data_output.fpp (1)

85-123: Add guard for filtered output in serial mode and refactor argument list

The serial I/O path ignores filtered fields even when q_filtered_wrt is enabled, leading to silent omission from output. Additionally, s_write_data_files has 12 parameters, exceeding the ≤6 argument guideline. Bundle the filtered statistics (filtered_fluid_indicator_function, stat_q_cons_filtered, stat_filtered_pressure, stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch) into a derived-type struct, then add a guard to abort if filtered output is requested without parallel I/O support.

Guard to prevent silent filtered data loss
         if (.not. parallel_io) then
+            if (q_filtered_wrt) call s_mpi_abort('q_filtered_wrt requires parallel_io or serial filtered I/O support')
             call s_write_serial_data_files(q_cons_vf, q_T_sf, q_prim_vf, t_step, bc_type, beta)
         else
             call s_write_parallel_data_files(q_cons_vf, t_step, bc_type, beta, &
                                              filtered_fluid_indicator_function, &
                                              stat_q_cons_filtered, stat_filtered_pressure, &
                                              stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
         end if
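For illustration, a minimal sketch of the suggested bundling, using a stand-in for MFC's field container (the type and member names below are placeholders, not the existing API):

module m_filtered_stats_bundle
    implicit none
    integer, parameter :: wp = selected_real_kind(15)

    ! stand-in for the field container used throughout the codebase (assumption)
    type :: scalar_field
        real(wp), allocatable :: sf(:, :, :)
    end type scalar_field

    ! one derived type replaces the six filtered-statistics arguments
    type :: t_filtered_stats
        type(scalar_field) :: fluid_indicator
        type(scalar_field), allocatable :: q_cons(:)
        type(scalar_field), allocatable :: pressure(:)
        type(scalar_field), allocatable :: reynolds_stress(:, :)
        type(scalar_field), allocatable :: eff_visc(:, :)
        type(scalar_field), allocatable :: int_mom_exch(:)
    end type t_filtered_stats
end module m_filtered_stats_bundle

With such a bundle, s_write_data_files would fall back under the six-argument guideline by taking one filtered-stats argument instead of six.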
🤖 Fix all issues with AI agents
In `@src/post_process/m_data_input.f90`:
- Around line 679-709: Replace the hard-coded 3D-only expression that sets
alt_sys with a computation based on num_dims and sys_size: compute the extra
variables (filtered indicator, stats per conserved variable, pressure stats,
Reynolds stress components, effective viscosity, interfacial momentum exchange)
by deriving their counts from num_dims (e.g., number of velocity components =
num_dims, number of independent Reynolds stress components =
num_dims*(num_dims+1)/2, and any per-component multiplicative factors currently
hard-coded as 4) and then set alt_sys = sys_size + derived_extra_count; update
the assignment that sets alt_sys (and any comments) so all reads use the new
computed value instead of the 3D-specific literal.

In `@src/simulation/m_additional_forcing.fpp`:
- Around line 192-201: The finalizer s_finalize_additional_forcing_module is
missing deallocation of the window arrays allocated during init; add safe
deallocation calls for rho_window, u_window, and eps_window inside
s_finalize_additional_forcing_module (similar to the existing
DEALLOCATE(q_periodic_force) pattern) ensuring you check for ALLOCATED (or
ASSOCIATED for pointers) before DEALLOCATE and also release any device/copy
resources if these arrays were offloaded (match whatever device deallocation
pattern your codebase uses).
- Around line 84-88: The subroutine s_compute_periodic_forcing currently
declares rhs_vf as intent(in) but mutates it when adding forcing terms; change
the parameter intent to intent(inout) in the subroutine signature for rhs_vf so
the declaration matches its usage, and update any matching interface or module
declaration/prototype for s_compute_periodic_forcing to the same intent(inout)
to keep signatures consistent across the codebase.
- Around line 105-113: The kinetic-energy term is hardcoded to q_cons_vf(2:5)
which breaks non-3D or varying-sys_size layouts; replace the explicit indices
with layout-agnostic helpers by summing momentum components using the mom_idx
int_bounds_info (loop from mom_idx%lower to mom_idx%upper, accumulating
q_cons_vf(contxe + current_mom_idx)%sf(i,j,k)**2) and use E_idx
(q_cons_vf(E_idx)%sf(i,j,k)) for the energy term; update the spatial_eps
accumulation to compute (q_cons_vf(E_idx)%sf(...) - 0.5_wp * sum_mom_sq /
q_cons_vf(1)%sf(...)) * dVol so the code works for 1D/2D/multi-fluid layouts.

In `@src/simulation/m_compute_statistics.fpp`:
- Around line 84-93: ACC_SETUP_VFs is being invoked on the entire arrays
Msn_q_cons_filtered and stat_q_cons_filtered instead of the current element,
which can break macro expansion; update the calls to target the element by
replacing @:ACC_SETUP_VFs(Msn_q_cons_filtered) with
@:ACC_SETUP_VFs(Msn_q_cons_filtered(i)) (and similarly use
stat_q_cons_filtered(i)) inside the loop where you allocate vf and sf, and apply
the same change for the other occurrence around lines 142-151 so the macro runs
on each element instance.
- Around line 239-247: The formulas in the GPU loop of m_compute_statistics.fpp
can divide by zero when ns <= 3 or when the variance Msn(2)%sf(i,j,k) is zero;
modify the loop (the block computing q_stat(1..4)%sf(i,j,k) from Msn) to first
guard using the local ns and the local variance value: always assign q_stat(1) =
Msn(1), then if ns <= 1 set q_stat(2..4) to a safe default (e.g., 0 or a
sentinel), if ns <= 2 set q_stat(3) default, if ns <= 3 set q_stat(4) default,
and additionally check if Msn(2)%sf(i,j,k) is <= eps (small positive threshold)
and in that case avoid dividing by it by returning defaults for q_stat(2..4) or
using a stabilized expression; ensure these checks are done per element inside
the same collapsed GPU loop so no invalid divides occur for any (i,j,k).

In `@src/simulation/m_data_output.fpp`:
- Around line 848-852: The code uses a hard-coded constant (9*4 + 9*4 + 3*4 +
6*4) when setting alt_sys, which assumes num_dims==3 and a fixed sys_size
layout; change this to compute the extra variable count from num_dims and
sys_size and use that computed value in the alt_sys assignment. Specifically,
replace the magic constant with a computed extra_count (e.g., sum the actual
per-variable-array sizes derived from num_dims and sys_size for the arrays
represented by those terms) and set alt_sys = sys_size + 1 + extra_count in the
same conditional branch that checks beta, q_filtered_wrt, t_step, and
t_step_stop so the offset is correct for 1D/2D/3D and different equation sets.
- Around line 957-963: The code only calls s_initialize_mpi_data_filtered inside
the ib branch, leaving filtered buffers uninitialized when q_filtered_wrt is
true but ib is false or when file_per_process=true; update the logic to either
(A) add a validation that if q_filtered_wrt is true and parallel I/O is used
(file_per_process true) then ib must be true and raise an error/abort early, or
(B) move the call to s_initialize_mpi_data_filtered out of the ib-only block so
it runs whenever q_filtered_wrt is true (guarded by the same
file_per_process/alt_sys checks currently used to allocate alt_sys), ensuring
s_initialize_mpi_data_filtered is invoked for all code paths that set up
filtered MPI buffers; reference q_filtered_wrt, s_initialize_mpi_data_filtered,
ib, file_per_process and alt_sys when making the change.

In `@src/simulation/m_ibm.fpp`:
- Around line 1313-1317: The deallocation block conditionally skips freeing
levelset%sf and levelset_norm%sf when store_levelset is false, causing a leak
because these arrays are allocated unconditionally in s_initialize_ibm_module;
update the deallocation sequence to always DEALLOCATE levelset%sf and
levelset_norm%sf (remove them from the store_levelset conditional) or match the
allocation logic in s_initialize_ibm_module by only allocating them when
store_levelset is true—modify the code around the DEALLOCATE calls for
ib_markers%sf, levelset%sf, and levelset_norm%sf to ensure deallocation mirrors
allocation (refer to s_initialize_ibm_module, levelset%sf, levelset_norm%sf, and
store_levelset).
- Around line 644-648: Replace the Fortran intrinsic error stop with the
MPI-safe abort by calling s_mpi_abort(...) instead of "error stop";
specifically, in the block that currently prints the diagnostic (using
proc_rank, temp_loc, index, patch_id, dim, dir, i, j, k) and then calls error
stop "Increase buff_size", remove or keep the prints as desired and replace the
error stop with a call to s_mpi_abort that passes a concise diagnostic string
(e.g. include proc_rank, index, buff_size, bound and patch_id) so all ranks are
terminated cleanly; ensure you call the existing s_mpi_abort routine (same
spelling) and pass the message rather than using error stop.
- Around line 554-610: The code path that computes levelset on-the-fly assumes
spherical IBs when store_levelset is false; add a guard before the "else !
compute levelset..." block to detect non-spherical bodies and abort with
s_mpi_abort (or alternatively set store_levelset = .true. and fall back to
stored levelsets). Concretely, check a unique identifier for the IB shape (e.g.,
patch_ib(patch_id)%shape, or if only spheres have patch_ib(patch_id)%radius
available, test that) right before the on-the-fly computation and call
s_mpi_abort("store_levelset=F requires spherical IBs for patch_id=X") if the IB
is not a sphere so we do not silently compute incorrect image points.

In `@src/simulation/m_start_up.fpp`:
- Around line 1669-1671: The statistics module initialized when
volume_filtering_momentum_eqn is enabled (m_compute_statistics) is not being
finalized, causing host/device allocation leaks; add a finalize call alongside
the existing finalize for the FFTW filter by invoking the statistics finalizer
(call s_finalize_statistics_module()) inside the conditional for
volume_filtering_momentum_eqn so that when volume_filtering_momentum_eqn is true
both s_finalize_fftw_explicit_filter_module and s_finalize_statistics_module are
called.

In `@src/simulation/m_volume_filtering.fpp`:
- Around line 776-779: The division by q_cons_filtered(1)%sf(i,j,k) in the
reynolds_stress update can hit zero or be tiny inside particles; fix by guarding
the denominator—use a small floor (e.g. eps_rho) or max(...) against a filtered
fluid indicator before dividing so you never divide by values below eps_rho;
update the line in the loop that updates reynolds_stress(l,q)%sf(i,j,k) to
divide by a safe_denominator (max(q_cons_filtered(1)%sf(i,j,k), eps_rho) or
conditioned on filtered_fluid_indicator_function) to ensure numerical stability.
- Around line 1545-1552: Several places in the subroutine compute
q_cons_filtered(l)%sf(i,j,k) (and similar assignments later) by dividing by
filtered_fluid_indicator_function%sf(i,j,k) without checking for zero; replicate
the protection used in the Reynolds stress computation: compute a safe
denominator = max(filtered_fluid_indicator_function%sf(i,j,k), small_eps) (or
test if > small_eps) and use that safe denominator (or conditionally set the
result to zero) when assigning q_cons_filtered, and apply the same pattern for
the other divisions that reference filtered_fluid_indicator_function%sf to avoid
NaNs; locate the divisions via the symbols q_cons_filtered, data_real_3D_slabz
and filtered_fluid_indicator_function%sf and replace them with the guarded
version.
- Around line 571-590: The grad_fluid_indicator arrays are allocated but never
zeroed before the accumulation loop, so initialize each
grad_fluid_indicator(i)%sf to zero after allocation (or allocate with a zero
source) to avoid reading/adding uninitialized memory; specifically, after the
@:ALLOCATE and @:ACC_SETUP_SFs calls for grad_fluid_indicator in the block that
allocates 1:3 and grad_fluid_indicator(i)%sf, set grad_fluid_indicator(i)%sf =
0.0 (or an equivalent memset/fill) for i = 1,num_dims so the GPU_PARALLEL_LOOP
accumulation over fd_coeffs and fluid_indicator_function uses a defined starting
value.
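In code this amounts to a one-line fill per component right after the ACC setup, for example:

do i = 1, num_dims
    grad_fluid_indicator(i)%sf = 0._wp   ! defined starting value for the accumulation
end do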

In `@src/simulation/p_main.fpp`:
- Around line 59-64: Add corresponding finalizer routines for each initializer
called under volume_filtering_momentum_eqn: implement
s_finalize_filtering_kernel(), s_finalize_filtered_fluid_indicator_function(),
and s_finalize_fluid_indicator_gradient() that free/clean resources allocated by
s_initialize_filtering_kernel(),
s_initialize_filtered_fluid_indicator_function(), and
s_initialize_fluid_indicator_gradient(). Then update s_finalize_modules() to
call these new finalizers when volume_filtering_momentum_eqn is enabled; also
move the existing call to s_finalize_fftw_explicit_filter_module() out of the
special-case in m_start_up.fpp and invoke it from s_finalize_modules() under the
same volume_filtering_momentum_eqn guard so all four modules are finalized via
the standard cleanup path.

In `@voronoi/gen_voronoi_2D.py`:
- Around line 33-36: The loop recreates the Voronoi tessellation with the
immutable initial_points instead of the evolving points array, preventing Lloyd
relaxation from progressing; change the call in the iterations loop so
freud.locality.Voronoi().compute(...) uses the current points variable (not
initial_points) — update the voro.compute((box, initial_points)) invocation to
voro.compute((box, points)) so subsequent centroid moves operate on the newly
tessellated cells in functions/variables around voro, voro_data, vertices and
the centroid-update block.
- Around line 21-26: compute_simplex_centroid and compute_simplex_area currently
slice vertices as simplex_vertices[:, :, 0] which pulls coordinate components
instead of per-vertex vectors; change the indexing to extract each vertex vector
as simplex_vertices[:, 0, :], simplex_vertices[:, 1, :], simplex_vertices[:, 2,
:] in both compute_simplex_centroid and compute_simplex_area so centroids are
computed as the mean of the three vertex vectors and area is computed from
cross(v2 - v1, v3 - v1) with axis=1 for the norm.
🟡 Minor comments (7)
voronoi/gen_voronoi_3D.py-81-87 (1)

81-87: Fix spelling to unblock spell-check.

The pipeline flags “overlaping” as a typo in both the comment and log message.

✅ Proposed fix
-    # check no spheres are overlaping
+    # check no spheres are overlapping
@@
-                    print(f'spheres overlaping, dist={dist}, spheres #: {i}, {j}')
+                    print(f'spheres overlapping, dist={dist}, spheres #: {i}, {j}')
src/common/m_mpi_common.fpp-2275-2284 (1)

2275-2284: Deallocate scalarfield buffers when periodic_forcing is enabled.
Buffers are allocated for periodic_forcing OR volume_filtering_momentum_eqn, but finalized only for volume_filtering_momentum_eqn today.

🐛 Proposed fix
-        if (volume_filtering_momentum_eqn) then
+        if (periodic_forcing .or. volume_filtering_momentum_eqn) then
             @:DEALLOCATE(buff_send_scalarfield)
             @:DEALLOCATE(buff_recv_scalarfield)
         end if
src/common/m_boundary_common.fpp-289-396 (1)

289-396: Fix deallocation condition to match allocation.

The buffers buff_send_scalarfield and buff_recv_scalarfield are allocated when (periodic_forcing .or. volume_filtering_momentum_eqn) (m_mpi_common.fpp:106), but deallocated only when volume_filtering_momentum_eqn (m_mpi_common.fpp:2276). This creates a memory leak if periodic_forcing is enabled without volume filtering. The deallocation condition at line 2276 should be updated to match the allocation condition.

src/simulation/m_time_steppers.fpp-49-52 (1)

49-52: Remove unused module import m_compute_statistics.

The import of m_volume_filtering is correct and is used at line 610 (s_apply_fourier_filter). However, m_compute_statistics is imported but none of its exported routines are called in this file. Remove this unused dependency.

toolchain/mfc/case_validator.py-360-366 (1)

360-366: Resolve lint warnings (superfluous parens, trailing whitespace).

CI reports superfluous parentheses in the if (periodic_ibs): line and trailing whitespace in the new checks. Please clean these to keep the lint job green.

✏️ Suggested fix
-        if (periodic_ibs):
+        if periodic_ibs:

Also applies to: 1104-1169

toolchain/mfc/case_validator.py-356-357 (1)

356-357: Update num_ibs error message to match the new limit.

The constraint now allows up to 1000, but the message still mentions 10.

✏️ Suggested fix
-        self.prohibit(ib and (num_ibs <= 0 or num_ibs > 1000),
-                     "num_ibs must be between 1 and num_patches_max (10)")
+        self.prohibit(ib and (num_ibs <= 0 or num_ibs > 1000),
+                     "num_ibs must be between 1 and num_patches_max (1000)")
src/post_process/m_data_input.f90-243-299 (1)

243-299: Wrap long allocate lines to satisfy the linter

Static analysis flags multiple lines >100 chars in the filtered-array allocations. Please wrap these with continuation lines to avoid CI failures.

✂️ Example wrapping
-        allocate (stat_reynolds_stress(i, j)%vf(k)%sf(local_start_idx:end_x, local_start_idx:end_y, local_start_idx:end_z))
+        allocate (stat_reynolds_stress(i, j)%vf(k)%sf( &
+            local_start_idx:end_x, local_start_idx:end_y, local_start_idx:end_z))
🧹 Nitpick comments (18)
voronoi/gen_voronoi_3D.py (1)

61-63: Prefer os.makedirs(..., exist_ok=True) for output dir creation.

This avoids the == False comparison (Ruff E712) and is more robust.

♻️ Proposed refactor
-    if os.path.exists(output_dir) == False:
-        os.mkdir(output_dir)
+    os.makedirs(output_dir, exist_ok=True)

Please confirm this aligns with your linting/style expectations.

runs/phi01/case.py (2)

1-3: Unused import: math

The math module is imported but never used in this script. Consider removing it.

 import json
-import math
 import numpy as np

34-36: Relative path dependency for sphere locations.

The script loads sphere positions from a relative path 'sphere_array_locations.txt'. This assumes the script is executed from its containing directory (runs/phi01/). Consider using __file__ to make the path resolution more robust.

+import os
+
+# load initial sphere locations
+script_dir = os.path.dirname(os.path.abspath(__file__))
+sphere_loc = np.loadtxt(os.path.join(script_dir, 'sphere_array_locations.txt'))
-sphere_loc = np.loadtxt('sphere_array_locations.txt')
runs/3d_drag_test/case.py (1)

1-3: Unused import: math

The math module is imported but never used.

 import json
-import math
 import numpy as np
src/common/m_compute_levelset.fpp (2)

518-519: Unused variable: ierr

The variable ierr is declared but never used in this subroutine. Remove it to avoid compiler warnings about unused variables.

-        integer :: i, j, k, ix, iy, iz !< Loop index variables
-        integer :: ierr
+        integer :: i, j, k, ix, iy, iz !< Loop index variables

526-548: Periodic image loop always iterates over 8 combinations.

When the sphere center is not near any domain boundary, center(d, 2) = center(d, 1) for all dimensions. The inner 2×2×2 loop (lines 562-575) then redundantly computes the same distance 8 times. While correctness is preserved, this is inefficient.

Consider either:

  1. Skipping the inner loop when center(:,1) == center(:,2) for all dimensions, or
  2. Tracking which dimensions actually have distinct periodic images and only iterating over those.

This is a minor performance concern for a per-timestep levelset computation but could matter with many IBs.

Also applies to: 562-575
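A minimal sketch of option 1, assuming center(:, 2) is an exact copy of center(:, 1) whenever no periodic image exists, so an exact comparison is safe:

if (all(center(1:3, 1) == center(1:3, 2))) then
    ! no distinct periodic image: evaluate the distance to center(:, 1) once
else
    ! fall through to the existing 2x2x2 image loop
end if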

runs/3d_periodic_ibs_test/centered/case.py (1)

1-3: Unused import: math

The math module is imported but never used.

 import json
-import math
 import numpy as np
runs/3d_periodic_ibs_test/off-center/case.py (1)

1-3: Unused import: math

The math module is imported but never used.

 import json
-import math
 import numpy as np
runs/3d_1sphere_filtering/case.py (1)

1-3: Unused import: math

The math module is imported but never used.

 import json
-import math
 import numpy as np
src/simulation/m_global_parameters.fpp (1)

1208-1210: Avoid hard-coded “109” sizing for filtered stats.
The extra MPI_IO_DATA slots are fixed at 109 in both allocation and teardown. Consider deriving this count from num_dims and sys_size (or centralizing it as a named constant) to keep it synchronized with s_initialize_mpi_data_filtered and to prevent config drift.

Also applies to: 1233-1236, 1433-1436
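One option is to derive the count once from num_dims and sys_size, following the breakdown in the existing inline comment (filtered indicator plus four statistical moments each of R_u, R_mu, F_imet, and q_cons_filtered); for num_dims = 3 and sys_size = 7 this reproduces 109. The helper name is hypothetical and assumes it lives where num_dims and sys_size are in scope:

integer function f_n_filtered_io_vars() result(cnt)
    cnt = 1 &                           ! filtered fluid indicator
          + 4*(num_dims*num_dims) &     ! 4 moments of R_u
          + 4*(num_dims*num_dims) &     ! 4 moments of R_mu
          + 4*num_dims &                ! 4 moments of F_imet
          + 4*(sys_size - 1)            ! 4 moments of q_cons_filtered
end function f_n_filtered_io_vars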

src/common/m_mpi_common.fpp (1)

340-372: Confirm filtered-stat indexing assumptions.
The offsets (37/73/85/105) and alt_sys = sys_size + 109 imply a fixed layout (likely 3D single‑fluid). If other configurations are possible, derive offsets from num_dims/sys_size or assert supported cases to avoid out‑of‑bounds indexing.

src/common/m_ib_patches.fpp (1)

617-640: Add ix/iy/iz to the GPU private list.

These inner-loop indices are loop-local; include them in private to align with GPU loop guidelines.

♻️ Suggested fix
-            $:GPU_PARALLEL_LOOP(collapse=3, private='[cart_y, cart_z]', copy='[ib_markers_sf]', copyin='[patch_id, center, r2]')
+            $:GPU_PARALLEL_LOOP(collapse=3, private='[cart_y, cart_z, ix, iy, iz]', copy='[ib_markers_sf]', copyin='[patch_id, center, r2]')

As per coding guidelines, declare loop-local variables with private='[...]' in GPU parallel loop macros.

src/simulation/m_ibm.fpp (1)

154-228: Declare loop indices as private in GPU parallel loops.

These loops introduce j/k/l but don’t specify private, which the GPU macros expect.

♻️ Suggested fix (apply pattern to all periodic buffer loops)
-                $:GPU_PARALLEL_LOOP(collapse=3)
+                $:GPU_PARALLEL_LOOP(collapse=3, private='[j,k,l]')

As per coding guidelines, declare loop-local variables with private='[...]' in GPU parallel loop macros.

src/simulation/m_volume_filtering.fpp (5)

333-337: Consider using newunit for file I/O.

Hard-coded unit number 100 can conflict with other I/O operations. Modern Fortran provides newunit for automatic unit assignment.

♻️ Suggested improvement
integer :: particle_force_unit  ! Add as module variable

! Then in the open statement:
open(newunit=particle_force_unit, file='particle_force.bin', status='replace', &
     form='unformatted', access='stream', action='write')

1029-1038: Subroutine exceeds size and argument limits.

s_filter_batch is approximately 728 lines (limit: 500) and has 8 arguments (limit: 6). Per coding guidelines, consider:

  1. Grouping related arguments into a derived type
  2. Extracting repetitive FFT forward/backward blocks into helper subroutines

The X-Y FFT forward sequence (lines 1041-1088, 1091-1136, 1139-1188, etc.) and the Z FFT + convolution + inverse sequence are repeated for each field type. Extracting these into helpers like s_fft_xy_forward and s_fft_z_convolve would significantly reduce code size and improve maintainability.
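A stub-level sketch of that extraction; the module name, routine names, and argument lists here are assumptions rather than existing MFC interfaces:

module m_filter_fft_helpers
    implicit none
    integer, parameter :: wp = selected_real_kind(15)
contains
    subroutine s_fft_xy_forward(field, slab)
        real(wp), intent(in) :: field(:, :, :)
        complex(wp), intent(out) :: slab(:, :, :)
        ! body: forward x-y FFTs of one field into its z-slab spectral layout
    end subroutine s_fft_xy_forward

    subroutine s_fft_z_convolve(slab, kernel_hat)
        complex(wp), intent(inout) :: slab(:, :, :)
        complex(wp), intent(in) :: kernel_hat(:)
        ! body: z FFT, multiply by the filter kernel, inverse z FFT
    end subroutine s_fft_z_convolve
end module m_filter_fft_helpers

Each field type handled by s_filter_batch would then call the same two helpers, removing the repeated forward/backward blocks.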


713-721: Subroutine has 8 arguments (limit: 6).

Consider grouping the tensor arguments (reynolds_stress, visc_stress, pres_visc_stress, div_pres_visc_stress) into a derived type to reduce the argument count and improve readability.


863-863: Remove commented-out debug code.

This commented-out print statement appears to be a debug artifact and should be removed before merging.

♻️ Proposed fix
         write (100) force_glb
         flush (100)
-        !print *, force_glb(1, 1) / (0.5_wp * rho_inf_ref * u_inf_ref**2 * pi * patch_ib(1)%radius**2)
     end if

1-2007: File exceeds 1000-line limit.

At ~2007 lines, this module significantly exceeds the 1000-line guideline. Consider splitting into smaller, focused modules:

  • m_volume_filtering_types — type definitions and public interfaces
  • m_volume_filtering_fft — FFT initialization, forward/backward transforms, transpose routines
  • m_volume_filtering_compute — stress tensor, Reynolds stress, effective viscosity computations
  • m_volume_filtering_particles — particle force calculations

Comment on lines +679 to +709
alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4 ! 109, filtered indicator, stats of: R_u, R_mu, F_imet, q_cons_filtered

! Open the file to read volume filtered variables
write (file_loc, '(I0,A)') t_step, '.dat'
file_loc = trim(case_dir)//'/restart_data'//trim(mpiiofs)//trim(file_loc)
inquire (FILE=trim(file_loc), EXIST=file_exist)

if (file_exist) then
call MPI_FILE_OPEN(MPI_COMM_WORLD, file_loc, MPI_MODE_RDONLY, mpi_info_int, ifile, ierr)

call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
stat_q_cons_filtered, stat_filtered_pressure, &
stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)

data_size = (m + 1)*(n + 1)*(p + 1)

m_MOK = int(m_glb + 1, MPI_OFFSET_KIND)
n_MOK = int(n_glb + 1, MPI_OFFSET_KIND)
p_MOK = int(p_glb + 1, MPI_OFFSET_KIND)
WP_MOK = int(8._wp, MPI_OFFSET_KIND)
MOK = int(1._wp, MPI_OFFSET_KIND)
str_MOK = int(name_len, MPI_OFFSET_KIND)
NVARS_MOK = int(alt_sys, MPI_OFFSET_KIND)

! Read the data for each variable
do i = sys_size + 1, alt_sys
var_MOK = int(i, MPI_OFFSET_KIND)

! Initial displacement to skip at beginning of file
disp = m_MOK*max(MOK, n_MOK)*max(MOK, p_MOK)*WP_MOK*(var_MOK - 1)


⚠️ Potential issue | 🟠 Major

Compute alt_sys from num_dims/sys_size instead of hard-coded 3D counts

The constant assumes 3D and a fixed sys_size-1, so read offsets will be wrong for other configurations. Derive the extra count from num_dims and sys_size.

🐛 Suggested sizing formula
-        alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4 ! 109, filtered indicator, stats of: R_u, R_mu, F_imet, q_cons_filtered
+        alt_sys = sys_size + 1 &
+                  + (num_dims*num_dims)*4*2 &   ! reynolds_stress + eff_visc
+                  + num_dims*4 &                ! int_mom_exch
+                  + (sys_size - 1)*4 &          ! q_cons_filtered
+                  + 4                           ! filtered_pressure
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
alt_sys = sys_size + 1 + 9*4 + 9*4 + 3*4 + 6*4 ! 109, filtered indicator, stats of: R_u, R_mu, F_imet, q_cons_filtered
! Open the file to read volume filtered variables
write (file_loc, '(I0,A)') t_step, '.dat'
file_loc = trim(case_dir)//'/restart_data'//trim(mpiiofs)//trim(file_loc)
inquire (FILE=trim(file_loc), EXIST=file_exist)
if (file_exist) then
call MPI_FILE_OPEN(MPI_COMM_WORLD, file_loc, MPI_MODE_RDONLY, mpi_info_int, ifile, ierr)
call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
stat_q_cons_filtered, stat_filtered_pressure, &
stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
data_size = (m + 1)*(n + 1)*(p + 1)
m_MOK = int(m_glb + 1, MPI_OFFSET_KIND)
n_MOK = int(n_glb + 1, MPI_OFFSET_KIND)
p_MOK = int(p_glb + 1, MPI_OFFSET_KIND)
WP_MOK = int(8._wp, MPI_OFFSET_KIND)
MOK = int(1._wp, MPI_OFFSET_KIND)
str_MOK = int(name_len, MPI_OFFSET_KIND)
NVARS_MOK = int(alt_sys, MPI_OFFSET_KIND)
! Read the data for each variable
do i = sys_size + 1, alt_sys
var_MOK = int(i, MPI_OFFSET_KIND)
! Initial displacement to skip at beginning of file
disp = m_MOK*max(MOK, n_MOK)*max(MOK, p_MOK)*WP_MOK*(var_MOK - 1)
alt_sys = sys_size + 1 &
          + (num_dims*num_dims)*4*2 &   ! reynolds_stress + eff_visc
          + num_dims*4 &                ! int_mom_exch
          + (sys_size - 1)*4 &          ! q_cons_filtered
          + 4                           ! filtered_pressure
! Open the file to read volume filtered variables
write (file_loc, '(I0,A)') t_step, '.dat'
file_loc = trim(case_dir)//'/restart_data'//trim(mpiiofs)//trim(file_loc)
inquire (FILE=trim(file_loc), EXIST=file_exist)
if (file_exist) then
call MPI_FILE_OPEN(MPI_COMM_WORLD, file_loc, MPI_MODE_RDONLY, mpi_info_int, ifile, ierr)
call s_initialize_mpi_data_filtered(filtered_fluid_indicator_function, &
stat_q_cons_filtered, stat_filtered_pressure, &
stat_reynolds_stress, stat_eff_visc, stat_int_mom_exch)
data_size = (m + 1)*(n + 1)*(p + 1)
m_MOK = int(m_glb + 1, MPI_OFFSET_KIND)
n_MOK = int(n_glb + 1, MPI_OFFSET_KIND)
p_MOK = int(p_glb + 1, MPI_OFFSET_KIND)
WP_MOK = int(8._wp, MPI_OFFSET_KIND)
MOK = int(1._wp, MPI_OFFSET_KIND)
str_MOK = int(name_len, MPI_OFFSET_KIND)
NVARS_MOK = int(alt_sys, MPI_OFFSET_KIND)
! Read the data for each variable
do i = sys_size + 1, alt_sys
var_MOK = int(i, MPI_OFFSET_KIND)
! Initial displacement to skip at beginning of file
disp = m_MOK*max(MOK, n_MOK)*max(MOK, p_MOK)*WP_MOK*(var_MOK - 1)
🧰 Tools
🪛 Fortitude (0.7.5)

[error] 691-691: line length of 103, exceeds maximum 100

(S001)

🤖 Prompt for AI Agents
In `@src/post_process/m_data_input.f90` around lines 679 - 709, Replace the
hard-coded 3D-only expression that sets alt_sys with a computation based on
num_dims and sys_size: compute the extra variables (filtered indicator, stats
per conserved variable, pressure stats, Reynolds stress components, effective
viscosity, interfacial momentum exchange) by deriving their counts from num_dims
(e.g., number of velocity components = num_dims, number of independent Reynolds
stress components = num_dims*(num_dims+1)/2, and any per-component
multiplicative factors currently hard-coded as 4) and then set alt_sys =
sys_size + derived_extra_count; update the assignment that sets alt_sys (and any
comments) so all reads use the new computed value instead of the 3D-specific
literal.

Comment on lines +105 to +113
dVol = dx(i)*dy(j)*dz(k)*fluid_indicator_function%sf(i, j, k)
spatial_rho = spatial_rho + (q_cons_vf(1)%sf(i, j, k)*dVol) ! rho
spatial_u = spatial_u + (q_cons_vf(contxe + mom_f_idx)%sf(i, j, k)*dVol) ! rho*u
spatial_eps = spatial_eps + ((q_cons_vf(5)%sf(i, j, k) - 0.5_wp*( &
q_cons_vf(2)%sf(i, j, k)**2 + &
q_cons_vf(3)%sf(i, j, k)**2 + &
q_cons_vf(4)%sf(i, j, k)**2)/ &
q_cons_vf(1)%sf(i, j, k))*dVol) ! rho*e
end do

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the file structure and the specific lines mentioned
head -120 src/simulation/m_additional_forcing.fpp | tail -30

Repository: MFlowCode/MFC

Length of output: 170


🏁 Script executed:

# Also check lines 163-165
sed -n '160,170p' src/simulation/m_additional_forcing.fpp

Repository: MFlowCode/MFC

Length of output: 573


🏁 Script executed:

# Search for where mom_idx, E_idx, num_dims are defined
rg "mom_idx|E_idx|num_dims" src/simulation/m_additional_forcing.fpp -A 2 -B 2

Repository: MFlowCode/MFC

Length of output: 39


🏁 Script executed:

# Look for similar patterns in other files to understand the naming convention
rg "q_cons_vf\([2-5]\)" src/ --type fpp -B 2 -A 2 | head -50

Repository: MFlowCode/MFC

Length of output: 84


🏁 Script executed:

# Read the file directly without type restrictions
cat -n src/simulation/m_additional_forcing.fpp | head -120

Repository: MFlowCode/MFC

Length of output: 5169


🏁 Script executed:

# Check for index helper definitions
rg "mom_idx|E_idx" src/simulation/ -A 2 -B 2

Repository: MFlowCode/MFC

Length of output: 50370


🏁 Script executed:

# Look for similar energy/momentum calculations in the codebase
rg "q_cons_vf\(1\)" src/simulation/ | head -20

Repository: MFlowCode/MFC

Length of output: 1120


🏁 Script executed:

# Check module imports and variable declarations at top of the file
head -80 src/simulation/m_additional_forcing.fpp

Repository: MFlowCode/MFC

Length of output: 2364


🏁 Script executed:

# Search for E_idx and mom_idx definitions
rg "integer.*E_idx|parameter.*E_idx" src/simulation/ src/common/ -A 2

Repository: MFlowCode/MFC

Length of output: 436


🏁 Script executed:

# Look for mom_idx definition
rg "mom_idx|contxe|mom_f_idx" src/common/ -B 2 -A 2 | head -100

Repository: MFlowCode/MFC

Length of output: 7565


🏁 Script executed:

# Check lines 160-170 for the second mentioned issue
sed -n '155,175p' src/simulation/m_additional_forcing.fpp

Repository: MFlowCode/MFC

Length of output: 1009


🏁 Script executed:

# Look at how kinetic energy is computed elsewhere in the codebase
rg "q_cons_vf.*\*\*2.*mom_idx|f_compute_pressure|kinetic" src/simulation/ | head -20

Repository: MFlowCode/MFC

Length of output: 39


🏁 Script executed:

# Search for mom_idx definition in global parameters
rg "type.*int_bounds_info|integer.*mom_idx|mom_idx.*=" src/simulation/m_global_parameters.fpp src/common/ -B 2 -A 2

Repository: MFlowCode/MFC

Length of output: 11462


🏁 Script executed:

# Look for mom_idx usage pattern and definition
rg "mom_idx" src/ --max-count=5 | head -30

Repository: MFlowCode/MFC

Length of output: 3002


🏁 Script executed:

# Search for num_dims variable since the review mentions it
rg "num_dims" src/simulation/m_global_parameters.fpp src/common/ -B 1 -A 1

Repository: MFlowCode/MFC

Length of output: 28089


🏁 Script executed:

# Verify the actual kinetic energy computation is indeed the hard-coded one at 108-112
sed -n '100,120p' src/simulation/m_additional_forcing.fpp

Repository: MFlowCode/MFC

Length of output: 1191


🏁 Script executed:

# Verify syntax for accessing q_cons_vf with mom_idx%beg in loops
rg "do.*mom_idx%beg.*mom_idx%end|q_cons_vf\(mom_idx" src/ | head -10

Repository: MFlowCode/MFC

Length of output: 830


Refactor kinetic energy calculation to use layout-agnostic index helpers

Lines 108–112 hard-code momentum/energy indices (q_cons_vf(2:5)) assuming a fixed 5-equation single-fluid 3D layout. This breaks for 1D/2D or when sys_size/num_fluids differ. Replace with a loop using mom_idx (type int_bounds_info) and E_idx:

Refactor example
-                    spatial_eps = spatial_eps + ((q_cons_vf(5)%sf(i, j, k) - 0.5_wp*( &
-                                                  q_cons_vf(2)%sf(i, j, k)**2 + &
-                                                  q_cons_vf(3)%sf(i, j, k)**2 + &
-                                                  q_cons_vf(4)%sf(i, j, k)**2)/ &
-                                                  q_cons_vf(1)%sf(i, j, k))*dVol) ! rho*e
+                    spatial_eps = spatial_eps + ( &
+                        (q_cons_vf(E_idx)%sf(i, j, k) - 0.5_wp*( &
+                            sum((q_cons_vf(mom_idx%beg:mom_idx%end)%sf(i, j, k))**2)/ &
+                            q_cons_vf(1)%sf(i, j, k)) * dVol) ! rho*e
🤖 Prompt for AI Agents
In `@src/simulation/m_additional_forcing.fpp` around lines 105 - 113, The
kinetic-energy term is hardcoded to q_cons_vf(2:5) which breaks non-3D or
varying-sys_size layouts; replace the explicit indices with layout-agnostic
helpers by summing momentum components using the mom_idx int_bounds_info (loop
from mom_idx%lower to mom_idx%upper, accumulating q_cons_vf(contxe +
current_mom_idx)%sf(i,j,k)**2) and use E_idx (q_cons_vf(E_idx)%sf(i,j,k)) for
the energy term; update the spatial_eps accumulation to compute
(q_cons_vf(E_idx)%sf(...) - 0.5_wp * sum_mom_sq / q_cons_vf(1)%sf(...)) * dVol
so the code works for 1D/2D/multi-fluid layouts.
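Note that the array-section reference q_cons_vf(mom_idx%beg:mom_idx%end)%sf(i, j, k) in the refactor example above is not standard-conforming when sf is a pointer or allocatable component, so a plain loop is the safer sketch (sum_mom_sq and l are assumed local variables; the other names follow the surrounding discussion):

sum_mom_sq = 0._wp
do l = mom_idx%beg, mom_idx%end
    sum_mom_sq = sum_mom_sq + q_cons_vf(l)%sf(i, j, k)**2
end do
spatial_eps = spatial_eps &
              + (q_cons_vf(E_idx)%sf(i, j, k) &
                 - 0.5_wp*sum_mom_sq/q_cons_vf(1)%sf(i, j, k))*dVol   ! rho*e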

Comment on lines +776 to +779
do q = 1, num_dims
reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
- (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k)/q_cons_filtered(1)%sf(i, j, k))
end do

⚠️ Potential issue | 🟠 Major

Potential division by zero on filtered density.

The Reynolds stress computation divides by q_cons_filtered(1)%sf(i, j, k) (filtered density). In regions where the fluid indicator function approaches zero (inside particles), the filtered density could be very small, causing numerical instability or division by zero.

Consider adding a small epsilon floor or checking against filtered_fluid_indicator_function:

real(wp), parameter :: eps_rho = 1.0e-12_wp
...
reynolds_stress(l, q)%sf(i, j, k) = reynolds_stress(l, q)%sf(i, j, k) &
    - (q_cons_filtered(momxb - 1 + l)%sf(i, j, k)*q_cons_filtered(momxb - 1 + q)%sf(i, j, k) &
       / max(q_cons_filtered(1)%sf(i, j, k), eps_rho))
🤖 Prompt for AI Agents
In `@src/simulation/m_volume_filtering.fpp` around lines 776 - 779, The division
by q_cons_filtered(1)%sf(i,j,k) in the reynolds_stress update can hit zero or be
tiny inside particles; fix by guarding the denominator—use a small floor (e.g.
eps_rho) or max(...) against a filtered fluid indicator before dividing so you
never divide by values below eps_rho; update the line in the loop that updates
reynolds_stress(l,q)%sf(i,j,k) to divide by a safe_denominator
(max(q_cons_filtered(1)%sf(i,j,k), eps_rho) or conditioned on
filtered_fluid_indicator_function) to ensure numerical stability.

Comment on lines +1545 to +1552
do i = 0, m
do j = 0, n
do k = 0, p
q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1)/(real(Nx*Ny*Nz, wp)*filtered_fluid_indicator_function%sf(i, j, k))
end do
end do
end do
$:END_GPU_PARALLEL_LOOP()

⚠️ Potential issue | 🟠 Major

Division by filtered_fluid_indicator_function without zero-check.

Multiple locations in this subroutine divide by filtered_fluid_indicator_function%sf(i, j, k) (lines 1548, 1597, 1647, 1699). Same concern as with filtered density—this could cause NaN when the fluid indicator approaches zero.

Apply the same protection pattern as suggested for the Reynolds stress computation.
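Applied to the loop quoted above, the guarded form would look like this (small_eps is a placeholder threshold, mirroring the eps_rho suggestion for the Reynolds stress):

q_cons_filtered(l)%sf(i, j, k) = data_real_3D_slabz(i + 1, j + 1, k + 1) &
    /(real(Nx*Ny*Nz, wp)*max(filtered_fluid_indicator_function%sf(i, j, k), small_eps))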

🤖 Prompt for AI Agents
In `@src/simulation/m_volume_filtering.fpp` around lines 1545 - 1552, Several
places in the subroutine compute q_cons_filtered(l)%sf(i,j,k) (and similar
assignments later) by dividing by filtered_fluid_indicator_function%sf(i,j,k)
without checking for zero; replicate the protection used in the Reynolds stress
computation: compute a safe denominator =
max(filtered_fluid_indicator_function%sf(i,j,k), small_eps) (or test if >
small_eps) and use that safe denominator (or conditionally set the result to
zero) when assigning q_cons_filtered, and apply the same pattern for the other
divisions that reference filtered_fluid_indicator_function%sf to avoid NaNs;
locate the divisions via the symbols q_cons_filtered, data_real_3D_slabz and
filtered_fluid_indicator_function%sf and replace them with the guarded version.

Comment on lines +59 to +64
if (volume_filtering_momentum_eqn .or. periodic_forcing) call s_initialize_fluid_indicator_function(bc_type)
if (volume_filtering_momentum_eqn) then
call s_initialize_filtering_kernel()
call s_initialize_filtered_fluid_indicator_function()
call s_initialize_fluid_indicator_gradient()
end if

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for finalization routines in m_volume_filtering
rg -n 's_finalize|deallocate|@:DEALLOCATE' src/simulation/m_volume_filtering.fpp | head -50

Repository: MFlowCode/MFC

Length of output: 170


🏁 Script executed:

#!/bin/bash
# Check if m_volume_filtering finalization is called in startup module
rg -n 'volume_filtering' src/simulation/m_start_up.fpp

Repository: MFlowCode/MFC

Length of output: 478


🏁 Script executed:

#!/bin/bash
# Also check for finalization in the main program or shutdown routines
rg -n 's_finalize_modules|s_finalize|finalize' src/simulation/p_main.fpp

Repository: MFlowCode/MFC

Length of output: 90


🏁 Script executed:

#!/bin/bash
# Check module structure and public interface
rg -A 10 'module m_volume_filtering' src/simulation/m_volume_filtering.fpp | head -30

Repository: MFlowCode/MFC

Length of output: 387


🏁 Script executed:

#!/bin/bash
# Search for finalization routines matching the initialization calls
rg -n 's_finalize_fluid_indicator|s_finalize_filtering_kernel|s_finalize_filtered_fluid_indicator|s_finalize_fluid_indicator_gradient' src/simulation/

Repository: MFlowCode/MFC

Length of output: 39


🏁 Script executed:

#!/bin/bash
# Search for all finalize routines in volume filtering module
rg -n 'subroutine.*s_finalize' src/simulation/m_volume_filtering.fpp

Repository: MFlowCode/MFC

Length of output: 178


🏁 Script executed:

#!/bin/bash
# Check what s_finalize_modules does - look for its definition
rg -A 30 'subroutine s_finalize_modules' src/simulation/

Repository: MFlowCode/MFC

Length of output: 2422


🏁 Script executed:

#!/bin/bash
# List all subroutine definitions in m_volume_filtering
rg -n 'subroutine s_' src/simulation/m_volume_filtering.fpp

Repository: MFlowCode/MFC

Length of output: 2686


Memory leak detected: three initialization routines lack corresponding finalization.

The four initialization calls in lines 59–64 allocate resources for the volume filtering module, but three of them have no corresponding finalization routines:

  • s_initialize_filtering_kernel() — no finalization
  • s_initialize_filtered_fluid_indicator_function() — no finalization
  • s_initialize_fluid_indicator_gradient() — no finalization

Only s_initialize_fftw_explicit_filter_module() has a matching finalization (s_finalize_fftw_explicit_filter_module()), which is conditionally called in m_start_up.fpp line 1670, not through the standard s_finalize_modules() cleanup path. Add finalization routines for all three missing initializations and ensure they are called in s_finalize_modules() when volume_filtering_momentum_eqn is enabled.
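A sketch of the guarded cleanup in s_finalize_modules(); the three new s_finalize_* names simply mirror the missing routines listed above and do not exist yet:

if (volume_filtering_momentum_eqn) then
    call s_finalize_filtering_kernel()                    ! to be added
    call s_finalize_filtered_fluid_indicator_function()   ! to be added
    call s_finalize_fluid_indicator_gradient()            ! to be added
    call s_finalize_fftw_explicit_filter_module()         ! moved out of m_start_up.fpp
end if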

🤖 Prompt for AI Agents
In `@src/simulation/p_main.fpp` around lines 59 - 64, Add corresponding finalizer
routines for each initializer called under volume_filtering_momentum_eqn:
implement s_finalize_filtering_kernel(),
s_finalize_filtered_fluid_indicator_function(), and
s_finalize_fluid_indicator_gradient() that free/clean resources allocated by
s_initialize_filtering_kernel(),
s_initialize_filtered_fluid_indicator_function(), and
s_initialize_fluid_indicator_gradient(). Then update s_finalize_modules() to
call these new finalizers when volume_filtering_momentum_eqn is enabled; also
move the existing call to s_finalize_fftw_explicit_filter_module() out of the
special-case in m_start_up.fpp and invoke it from s_finalize_modules() under the
same volume_filtering_momentum_eqn guard so all four modules are finalized via
the standard cleanup path.

@codeant-ai

codeant-ai bot commented Jan 21, 2026

CodeAnt AI is reviewing your PR.



@codeant-ai codeant-ai bot added the size:XXL This PR changes 1000+ lines, ignoring generated files label Jan 21, 2026
real(wp) :: r2 ! radius squared
real(wp), dimension(3, 2) :: center
!! Variables to initialize the pressure field that corresponds to the

Suggestion: Thread-safety / shared-state bug: cart_y and cart_z are module-level variables and currently written/read inside the parallel region; this risks threads interfering with each other. Make cart_y/cart_z (and a small temporary pair of arrays for cylindrical->cartesian conversion) local to the subroutine by adding local declarations so the parallel region operates on thread-local storage. [race condition]

Severity Level: Critical 🚨
- ❌ s_ib_sphere cylindrical coordinate conversions corrupted.
- ⚠️ s_ib_cuboid and s_ib_cylinder may be affected.
- ⚠️ GPU runs produce non-repeatable markers.
Suggested change
!! Variables to initialize the pressure field that corresponds to the
! Thread-local cartesian coordinates to avoid shared-module writes in parallel regions
real(wp) :: cart_y, cart_z
! Temporary arrays for cylindrical->cartesian conversion (thread-local)
real(wp), dimension(1:3) :: tmp_cyl, tmp_cart
Steps of Reproduction ✅
1. Inspect src/common/m_ib_patches.fpp: the module declares module-level cart_x, cart_y,
cart_z near the top and s_ib_sphere declares loop indices at lines 565-568 (new hunk).

2. Run a case that triggers s_ib_sphere with grid_geometry == 3 (so cylindrical->cartesian
conversion is used) and enable parallel/GPU execution for $:GPU_PARALLEL_LOOP. The loop
bodies assign cart_y/cart_z for each grid cell.

3. Inside the parallel region the code currently writes the module-level cart_y/cart_z
(shared state). The conversion helper s_convert_cylindrical_to_cartesian_coord writes
those module variables (subroutine present in the same file), so multiple threads can
overwrite the shared cart_y/cart_z concurrently during the loop.

4. Observe incorrect cart coordinates passed into the sphere-in-volume condition (the
distance check at src/common/m_ib_patches.fpp:631 and surrounding lines), causing wrong
ib_markers_sf values. Declaring cart_y/cart_z (and temporaries) local to s_ib_sphere
removes shared writes and prevents corruption.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/common/m_ib_patches.fpp
**Line:** 570:570
**Comment:**
	*Race Condition: Thread-safety / shared-state bug: `cart_y` and `cart_z` are module-level variables and currently written/read inside the parallel region; this risks threads interfering with each other. Make `cart_y`/`cart_z` (and a small temporary pair of arrays for cylindrical->cartesian conversion) local to the subroutine by adding local declarations so the parallel region operates on thread-local storage.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

do j = 0, n
do i = 0, m
if (grid_geometry == 3) then
call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))

Suggestion: Non-thread-safe helper call: s_convert_cylindrical_to_cartesian_coord writes module-level cart_y/cart_z and is invoked inside a parallel region; relying on that subroutine is unsafe. Replace the call with the pure function f_convert_cyl_to_cart and assign the returned components to the local thread-safe temporaries (declared at subroutine scope) to avoid shared-state updates inside the parallel loop. [race condition]

Severity Level: Critical 🚨
- ❌ s_ib_sphere coordinate conversion nondeterministic.
- ⚠️ Incorrect IB marking in cylindrical-grid cases.
- ⚠️ GPU-accelerated test failures intermittent.
Suggested change
call s_convert_cylindrical_to_cartesian_coord(y_cc(j), z_cc(k))
tmp_cyl = (/ 0._wp, y_cc(j), z_cc(k) /)
tmp_cart = f_convert_cyl_to_cart(tmp_cyl)
cart_y = tmp_cart(2)
cart_z = tmp_cart(3)
Steps of Reproduction ✅
1. Locate the code in src/common/m_ib_patches.fpp inside subroutine s_ib_sphere around
lines 622-627 (new hunk): the code calls s_convert_cylindrical_to_cartesian_coord() inside
the GPU loop.

2. s_convert_cylindrical_to_cartesian_coord writes module-level cart_y/cart_z (see its
definition later in the file). When $:GPU_PARALLEL_LOOP executes across threads, each
thread calling that subroutine updates the shared module variables.

3. Execute a parallel/GPU-enabled run with grid_geometry==3 and periodic_ibs either true
or false to hit both periodic and non-periodic branches. The shared writes cause
inter-thread interference: cart_y/cart_z values read by subsequent distance checks (e.g.,
the sphere inclusion condition at ~line 631) can be from other threads.

4. Replace the call with the pure function f_convert_cyl_to_cart and thread-local
temporaries (tmp_cyl/tmp_cart) so each thread uses its own cart_y/cart_z values,
eliminating the data-race observed in step 3.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/common/m_ib_patches.fpp
**Line:** 623:623
**Comment:**
	*Race Condition: Non-thread-safe helper call: `s_convert_cylindrical_to_cartesian_coord` writes module-level `cart_y`/`cart_z` and is invoked inside a parallel region; relying on that subroutine is unsafe. Replace the call with the pure function `f_convert_cyl_to_cart` and assign the returned components to the local thread-safe temporaries (declared at subroutine scope) to avoid shared-state updates inside the parallel loop.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

end if
else if (volume_filtering_momentum_eqn) then
do i = 1, alt_sys
var_MOK = int(i, MPI_OFFSET_KIND)

Suggestion: Logic error: the new MPI write loop for volume-filtered variables writes the full range 1..alt_sys unconditionally, which will include the beta slot (index sys_size+1) when alt_sys > sys_size. Later the existing code unconditionally writes the beta correction (sys_size+1) again, causing a duplicate/overlap write that can corrupt the restart file. Avoid writing the sys_size+1 index inside this new loop when beta will be written later. [logic error]

Severity Level: Critical 🚨
- ❌ MPI restart files can be corrupted or inconsistent.
- ❌ Restart/load path for parallel simulations affected.
- ⚠️ Volume-filtered momentum restart data inaccurate.
- ⚠️ Debugging I/O issues if restarts silently wrong.
Suggested change
var_MOK = int(i, MPI_OFFSET_KIND)
! Skip the explicit beta correction index here to avoid
! writing it twice (it is written later in the "present(beta)"
! correction block). This prevents overlapping writes/corruption.
if (present(beta) .and. i == sys_size + 1) cycle
Steps of Reproduction ✅
1. Run the MPI multi-process restart path so parallel I/O is used. The dispatcher
s_write_data_files (src/simulation/m_data_output.fpp: subroutine s_write_data_files
starting at ~line 82 in the PR hunk) calls s_write_parallel_data_files when parallel_io is
true.

2. Enter s_write_parallel_data_files with volume filtering enabled
(volume_filtering_momentum_eqn = .true.) and with beta present (the optional beta actual
is provided to s_write_parallel_data_files). The volume-filter write loop at
src/simulation/m_data_output.fpp: lines 1018-1029 (else if (volume_filtering_momentum_eqn)
then ... do i = 1, alt_sys) executes and writes indices 1..alt_sys — this includes
sys_size+1 when alt_sys > sys_size.

3. Later in the same subroutine the code unconditionally performs the lagrangian
correction write for beta when present (the "Correction for the lagrangian subgrid bubble
model" block writes MPI_IO_DATA%var(sys_size + 1)%sf). That causes the same sys_size+1
slot to be written twice (once inside the 1..alt_sys loop and again in the present(beta)
block).

4. Observe incorrect restart file contents or file corruption on disk when restarting: the
duplicated writes overlap file offsets and may produce corrupted or inconsistent beta data
on restart. The overlapping writes happen deterministically when (a) file_per_process is
false (the MPI file-per-world branch is used), (b) volume_filtering_momentum_eqn is true,
and (c) present(beta) is true. If this pattern is intentional, include a comment
explaining coordinated offsets; otherwise the suggested cycle/skip is required.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/simulation/m_data_output.fpp
**Line:** 1020:1020
**Comment:**
	*Logic Error: Logic error: the new MPI write loop for volume-filtered variables writes the full range 1..`alt_sys` unconditionally, which will include the `beta` slot (index `sys_size+1`) when `alt_sys` > `sys_size`. Later the existing code unconditionally writes the beta correction (`sys_size+1`) again, causing a duplicate/overlap write that can corrupt the restart file. Avoid writing the `sys_size+1` index inside this new loop when `beta` will be written later.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

Comment on lines +707 to +711
subsection_x = ib_markers%sf(i - gp_layers:i + gp_layers, j, k)
subsection_y = ib_markers%sf(i, j - gp_layers:j + gp_layers, k)
subsection_z = ib_markers%sf(i, j, k - gp_layers:k + gp_layers)
if (any(subsection_x == 0) .or. &
any(subsection_y == 0) .or. &
Contributor

Suggestion: Logic error: the new code checks three 1D cross-sections (subsection_x, subsection_y, subsection_z) instead of the full 3D neighborhood; this can miss zero entries located off those cross-sections and misclassify ghost vs inner points. Restore the original 3D neighborhood test using subsection_3D = ib_markers%sf(i - gp_layers:i + gp_layers, j - gp_layers:j + gp_layers, k - gp_layers:k + gp_layers) and any(subsection_3D == 0). [logic error]

Severity Level: Critical 🚨
- ❌ Ghost-point counting may be incorrect.
- ⚠️ s_ibm_setup ghost allocations affected.
- ❌ Downstream interpolation uses wrong point sets.
Suggested change
subsection_x = ib_markers%sf(i - gp_layers:i + gp_layers, j, k)
subsection_y = ib_markers%sf(i, j - gp_layers:j + gp_layers, k)
subsection_z = ib_markers%sf(i, j, k - gp_layers:k + gp_layers)
if (any(subsection_x == 0) .or. &
any(subsection_y == 0) .or. &
subsection_3D = ib_markers%sf( &
i - gp_layers:i + gp_layers, &
j - gp_layers:j + gp_layers, &
k - gp_layers:k + gp_layers)
if (any(subsection_3D == 0)) then
Steps of Reproduction ✅
1. Trigger s_ibm_setup() which calls s_find_num_ghost_points(num_gps, num_inner_gps).
s_ibm_setup is the IBM initialization path used during module setup (call present in
s_ibm_setup in src/simulation/m_ibm.fpp).

2. Within s_find_num_ghost_points (src/simulation/m_ibm.fpp), iterate a 3D immersed
boundary region where gp_layers > 0 and the IB geometry is not axis-aligned.

3. Execution reaches the block starting at line 707 where the code samples three 1D slices
(subsection_x/y/z) instead of the full 3D block.

4. If a zero-valued marker exists in the 3D neighborhood but not on any of the three
central cross-sections, the current condition (lines 707–715) will not detect it and will
misclassify a true ghost point as an inner point. Reproduce by constructing an IB with
zeros off the central planes and observing that num_gps_out undercounts the ghost points,
as in the sketch below.
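As a quick illustration of step 4, here is a minimal NumPy analogue (illustrative array,
sizes, and indices, not MFC data): a zero marker that sits off all three central
cross-sections escapes the 1D-slice tests but is caught by the full 3D block.

import numpy as np

gp = 1
markers = np.ones((3, 3, 3), dtype=int)   # 1 = inside the immersed body
markers[0, 0, 1] = 0                      # a fluid cell off all three central planes
i = j = k = gp                            # centre of the (2*gp+1)^3 neighborhood

slice_x = markers[i - gp:i + gp + 1, j, k]
slice_y = markers[i, j - gp:j + gp + 1, k]
slice_z = markers[i, j, k - gp:k + gp + 1]
block_3d = markers[i - gp:i + gp + 1, j - gp:j + gp + 1, k - gp:k + gp + 1]

print((slice_x == 0).any() or (slice_y == 0).any() or (slice_z == 0).any())  # False: missed
print((block_3d == 0).any())                                                 # True: detected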
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/simulation/m_ibm.fpp
**Line:** 707:711
**Comment:**
	*Logic Error: Logic error: the new code checks three 1D cross-sections (`subsection_x`, `subsection_y`, `subsection_z`) instead of the full 3D neighborhood; this can miss zero entries located off those cross-sections and misclassify ghost vs inner points. Restore the original 3D neighborhood test using `subsection_3D = ib_markers%sf(i - gp_layers:i + gp_layers, j - gp_layers:j + gp_layers, k - gp_layers:k + gp_layers)` and `any(subsection_3D == 0)`.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

Author

There are issues if an immersed boundary is periodically wrapped around the domain and near a domain/processor corner. Since the MPI buffer routines only populate buffers in the 3 cardinal directions, buffer regions located off a diagonal are not populated, which can result in cells being marked as outer ghost points when they are not.
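A minimal 2D NumPy analogue of this situation (assumed halo layout and sizes, not MFC's
actual buffer routines): after a cardinal-direction-only exchange, the corner halo cells
are never filled, so a full-block neighborhood read next to a processor corner sees stale
values, while the cardinal 1D slices stay on populated strips.

import numpy as np

gp = 2                                       # halo width (analogue of gp_layers)
n = 6                                        # interior cells per direction
field = np.full((n + 2*gp, n + 2*gp), -1)    # -1 marks "never filled"
field[gp:-gp, gp:-gp] = 1                    # interior data

# cardinal-direction exchange fills the edge strips but not the corners
field[:gp, gp:-gp] = 1; field[-gp:, gp:-gp] = 1
field[gp:-gp, :gp] = 1; field[gp:-gp, -gp:] = 1

i = j = gp                                   # interior point adjacent to a corner
block = field[i - gp:i + gp + 1, j - gp:j + gp + 1]   # full neighborhood
slice_i = field[i - gp:i + gp + 1, j]                 # cardinal slices only
slice_j = field[i, j - gp:j + gp + 1]

print((block == -1).any())                             # True: reads stale corner halo
print((slice_i == -1).any() or (slice_j == -1).any())  # False: slices avoid the corners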

Contributor

Question:

There are issues if an immersed boundary is periodically wrapped around the domain and near a domain/processor corner. Since the MPI buffer routines only populate buffers in the 3 cardinal directions, buffer regions located off a diagonal are not populated, which can result in cells being marked as outer ghost points when they are not.

Answer:

This looks like a context/assumption mismatch rather than something you can “fix” purely with Qodo Merge settings: the bot’s suggestion focuses on the local 3D neighborhood test (1D slices vs full 3D), while your comment points to a distributed-memory halo/buffer-population limitation (missing diagonal buffers near periodic + corner), which can also cause ghost/inner misclassification even if the neighborhood test is correct.

What you can do via configuration is steer /improve to explicitly consider these HPC/MPI corner cases and avoid over-confident “restore old logic” recommendations unless it can rule out the buffer/halo issue.

Add targeted extra_instructions (and optionally bump suggestion depth) like this:

[pr_code_suggestions]
extra_instructions = """\
When reviewing ghost/inner point classification in IBM/MPI code:
- Consider periodic wrapping and processor-corner cases.
- Explicitly check whether halo/buffer exchange populates diagonal/corner neighbors; if only 3 cardinal directions are exchanged, call out that the neighborhood data may be incomplete.
- Do NOT recommend switching to a full 3D neighborhood `any(subsection_3D==0)` unless you can confirm the 3D block is actually populated on-rank (including corner halos).
- Prefer suggestions that add/adjust MPI exchange (or document the limitation) and/or add tests that reproduce periodic+corner scenarios.
- If the correctness depends on runtime decomposition/halo policy, state the needed assumptions instead of asserting a single “correct” fix.
"""
suggestions_depth = "exhaustive"  # optional, for more edge-case coverage

If you maintain repo best-practices, you can also encode this as a project rule so it’s consistently checked:

  • Add a best_practices.md entry describing “MPI halo exchange must populate corner/diagonal regions (or code must not assume it)” and “periodic wraparound + corner decomposition must be tested”.

If you share which platform you use (GitHub/GitLab/Bitbucket) and whether you run /improve automatically or manually, I can tailor where to put this (wiki .pr_agent.toml vs local .pr_agent.toml) and how to scope it to only the IBM/MPI files.

Comment on lines +799 to +804
subsection_x = ib_markers%sf(i - gp_layers:i + gp_layers, j, k)
subsection_y = ib_markers%sf(i, j - gp_layers:j + gp_layers, k)
subsection_z = ib_markers%sf(i, j, k - gp_layers:k + gp_layers)
if (any(subsection_x == 0) .or. &
any(subsection_y == 0) .or. &
any(subsection_z == 0)) then
Contributor

Suggestion: Logic error (same pattern as above) in the ghost-point discovery routine: using three 1D slices to test for neighboring zeros (subsection_x, subsection_y, subsection_z) is not equivalent to checking the full 3D block and will miss zeros off the central cross-sections; restore the full 3D block test with subsection_3D and any(subsection_3D == 0). [logic error]

Severity Level: Critical 🚨
- ❌ Ghost-point detection fails in 3D cases.
- ⚠️ Interpolation coefficients become invalid.
- ❌ Image-point calculations use wrong points.
Suggested change
subsection_x = ib_markers%sf(i - gp_layers:i + gp_layers, j, k)
subsection_y = ib_markers%sf(i, j - gp_layers:j + gp_layers, k)
subsection_z = ib_markers%sf(i, j, k - gp_layers:k + gp_layers)
if (any(subsection_x == 0) .or. &
any(subsection_y == 0) .or. &
any(subsection_z == 0)) then
subsection_3D = ib_markers%sf( &
i - gp_layers:i + gp_layers, &
j - gp_layers:j + gp_layers, &
k - gp_layers:k + gp_layers)
if (any(subsection_3D == 0)) then
Steps of Reproduction ✅
1. Call s_find_ghost_points(ghost_points, inner_points) during IBM setup/update (called
from s_ibm_setup and s_update_mib in src/simulation/m_ibm.fpp).

2. During the 3D loop (do k = 0, p) the code reaches the section starting at line 799 that
constructs subsection_x/y/z instead of the full 3D block.

3. If an IB neighborhood contains ib_markers%sf == 0 at an off-center location (not on the
three central cross-sections), the if condition at lines 799–804 will evaluate false and a
true ghost point will be missed.

4. Reproduce by preparing a 3D IB configuration with a zero in the 3D neighborhood away
from the central planes, run s_find_ghost_points, and observe that ghost_points is missing
that location (misclassification).
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/simulation/m_ibm.fpp
**Line:** 799:804
**Comment:**
	*Logic Error: Logic error (same pattern as above) in the ghost-point discovery routine: using three 1D slices to test for neighboring zeros (`subsection_x`, `subsection_y`, `subsection_z`) is not equivalent to checking the full 3D block and will miss zeros off the central cross-sections; restore the full 3D block test with `subsection_3D` and `any(subsection_3D == 0)`.

Validate the correctness of the flagged issue. If correct, how can I resolve this? If you propose a fix, implement it and please make it concise.

@codeant-ai
Contributor

codeant-ai bot commented Jan 21, 2026

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Division by zero risk
    Several places divide by filtered_fluid_indicator_function%sf(...) (and by q_cons_filtered(1)%sf(...) for density). If the indicator (or density) is zero or very small where there is no fluid, these divisions will produce NaNs/Infs or propagate invalid values. The code needs explicit guarding or thresholds before performing those divides.

  • Density divide guard
    In the pseudo-turbulent Reynolds stress subtraction you divide by q_cons_filtered(1)%sf(...) (the filtered density). If that value is zero (or nearly zero) in any cell this will produce invalid values. Validate or clamp density before division.

  • Repeated normalization / performance
    The code repeatedly computes real(Nx*Ny*Nz, wp) inside inner loops and divides by that expression directly. Recomputing the conversion and dividing inside inner loops is wasteful. Precompute the inverse normalization once and reuse it to reduce overhead and improve numerical consistency.

  • Uninitialized data transfer
    The GPU loop copyin uses the whole center array (dimension 3x2) even when periodic_ibs is false and only center(:,1) is initialized. This can lead to copying uninitialized memory to the device and nondeterministic behavior. Either initialize center(:,2) by default or reduce the copyin to only the used slice.

  • GPU private clause
    Several new GPU-parallel loops in s_populate_scalarfield_buffers use the $:GPU_PARALLEL_LOOP(collapse=2) pragma without specifying private variables (other similar loops in the file include private='[l,k]'). Missing private lists can lead to incorrect behavior or race conditions on the device or at least inconsistent code-generation across backends. Verify and align the pragmas with existing patterns.

  • Division by zero
    The energy forcing adds a term that divides by q_cons_vf(1)%sf(i,j,k) (cell density). If a cell density can be zero or extremely small (e.g., inside solids / masked cells), this will produce invalid values or blow up numerically. Add a safe-guard or conditional to avoid division by zero / tiny numbers.

  • Intent mismatch
    The subroutine declares rhs_vf with intent(in) but later updates its contents (adds forcing into rhs_vf). This is an interface/logic mismatch and can lead to incorrect compiler assumptions or unexpected behavior. Verify the intended intent (should likely be intent(inout)) and update the declaration accordingly.

  • File I/O robustness
    The module opens a binary stream on unit 102 at initialization and closes at finalization. Ensure the unit is not already in use, check for successful open, and consider error handling if the file cannot be created (particularly in parallel runs where only rank 0 opens the file).

  • Tetrahedralization correctness
    The routine that builds tetrahedra for each Voronoi cell uses a simple "fan" of consecutive vertex triplets (j, j+1, j+2). For arbitrary 3D Voronoi polytopes this may produce degenerate or incorrect tetrahedra (ordering/convexity assumptions). Validate the approach or use a robust 3D polyhedron tetrahedralization method.

  • Wrong Voronoi input
    Inside the Lloyd iterations, the Voronoi diagram is recomputed using initial_points rather than the current, updated points. This prevents the relaxation from converging correctly because the tessellation doesn't reflect the latest point positions. A minimal loop sketch addressing this (together with the centroid-normalization guard below) follows after this list.

  • Division by zero risk
    The centroid normalization divides by np.sum(simplex_areas) without guarding against a zero sum (degenerate or tiny/collinear triangles). This can raise an error or produce NaNs; a safe check or fallback is needed.

  • Timestamp / timestep mismatch
    The code builds the time axis for the MFC induction-time search from sorted(case.get_timestamps()) but builds the corresponding Y/rho arrays by iterating sorted(case.get_timesteps()). If timestamps and timesteps are different sequences (or have different ordering/length), zipping them in find_induction_time may misalign times vs. concentrations and produce incorrect induction times.

  • Time stepping overshoot risk
    generate_ct_saves advances the ReactorNet to ct_time + time_save while looping with ct_time < Tend, then increments ct_time by time_save. If Tend is not an integer multiple of time_save this may advance beyond Tend in the final iteration. Consider clamping the next advance time to Tend to avoid stepping past the intended final time.
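For the two Voronoi relaxation items above (Wrong Voronoi input and the centroid
normalization guard), a minimal sketch of the intended loop shape; compute_voronoi_cells
and cell_weighted_centroid are hypothetical placeholders, not functions from
gen_voronoi_3D.py.

import numpy as np

def lloyd_relax(initial_points, n_iter, compute_voronoi_cells, cell_weighted_centroid):
    points = np.array(initial_points, copy=True)      # do not mutate the caller's array
    for _ in range(n_iter):
        cells = compute_voronoi_cells(points)         # tessellate the *current* points
        for idx, cell in enumerate(cells):
            centroid_sum, weight = cell_weighted_centroid(cell)
            if weight > 1e-12:                        # skip degenerate cells rather than divide by ~0
                points[idx] = centroid_sum / weight
    return points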

@codeant-ai
Contributor

codeant-ai bot commented Jan 21, 2026

CodeAnt AI finished reviewing your PR.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@voronoi/gen_voronoi_3D.py`:
- Around line 28-37: The current loop over points builds tetrahedra by taking
consecutive vertex triples (in the block that constructs tetrahedra then
computes centroids via compute_tetrahedron_centroid and volumes via
compute_tetrahedron_volume), which can produce invalid tetrahedra; replace that
tetrahedralization with a proper Delaunay decomposition of the vertex set (use
scipy.spatial.Delaunay on vertices[i] to get tetrahedra indices and then compute
centroids/volumes from those tetrahedra), or simpler and preferred, use freud's
precomputed values by replacing the per-cell centroid/volume computation with
voro_data.centroid[i] and voro_data.volumes[i] to get correct weighted centroids
and volumes. Ensure you remove the consecutive-triple logic and adapt downstream
code to consume the Delaunay-based tetrahedra or voro_data fields.
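A sketch of the Delaunay route, under the assumption that cell_vertices holds the vertex
coordinates of one convex Voronoi cell (what the comment calls vertices[i]); the function
name here is illustrative, not part of the script.

import numpy as np
from scipy.spatial import Delaunay

def cell_centroid_and_volume(cell_vertices):
    # Volume-weighted centroid and total volume of a convex 3D cell.
    cell_vertices = np.asarray(cell_vertices, dtype=float)
    tri = Delaunay(cell_vertices)                 # decomposes the cell into tetrahedra
    total_volume = 0.0
    weighted_centroid = np.zeros(3)
    for simplex in tri.simplices:                 # each simplex indexes 4 cell vertices
        a, b, c, d = cell_vertices[simplex]
        vol = abs(np.linalg.det(np.column_stack((b - a, c - a, d - a)))) / 6.0
        weighted_centroid += vol * (a + b + c + d) / 4.0
        total_volume += vol
    # a degenerate (near-zero volume) cell would still need a guard here
    return weighted_centroid / total_volume, total_volume

Unlike the consecutive-triple fan, the Delaunay decomposition makes no assumption about
vertex ordering, so it stays valid for arbitrary convex cells.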
♻️ Duplicate comments (2)
voronoi/gen_voronoi_3D.py (2)

20-22: Avoid mutating the caller’s input array.

points = initial_points aliases the input, so the relaxation mutates initial_points and the “initial points” plot becomes incorrect. Copy the array before iterating.

🛠️ Suggested fix
-    points = initial_points
+    points = np.array(initial_points, copy=True)

61-63: Create the output directory recursively.

os.mkdir fails if ../runs doesn’t exist. Use os.makedirs(..., exist_ok=True) instead.

🛠️ Suggested fix
-    if os.path.exists(output_dir) == False:
-        os.mkdir(output_dir)
+    os.makedirs(output_dir, exist_ok=True)
🧹 Nitpick comments (2)
examples/3D_random_sphere_array/case.py (2)

1-3: Unused import: math

The math module is imported but never used. All mathematical operations use numpy instead.

Suggested fix
 import json
-import math
 import numpy as np

36-38: Consider adding basic validation for the sphere location file.

If sphere_array_locations.txt is missing or has fewer than 3 columns, the script will fail with an unclear error when accessing sphere_loc[i, 0:3] in the loop below. A brief check could improve usability:

Suggested validation
 # load initial sphere locations
-sphere_loc = np.loadtxt("sphere_array_locations.txt")
+sphere_loc = np.loadtxt("sphere_array_locations.txt", ndmin=2)
+if sphere_loc.ndim != 2 or sphere_loc.shape[1] < 3:
+    raise ValueError("sphere_array_locations.txt must contain at least 3 columns (x, y, z)")
 N_sphere = len(sphere_loc)


Labels

Review effort 4/5 size:XXL This PR changes 1000+ lines, ignoring generated files
