MPI FFTW #997
Codecov Report
Coverage diff, master vs. #997:
Metric | master | #997 | +/- |
---|---|---|---|
Coverage | 40.91% | 41.19% | +0.28% |
Files | 70 | 69 | -1 |
Lines | 20270 | 20379 | +109 |
Branches | 2520 | 2540 | +20 |
Hits | 8293 | 8396 | +103 |
Misses | 10439 | 10408 | -31 |
Partials | 1538 | 1575 | +37 |
/improve |
src/post_process/m_start_up.fpp (outdated)
end subroutine s_mpi_FFT_fwd
subroutine s_mpi_transpose_x2y
All these MPI routines seem like they should go somewhere else. A new module?
Pull Request Overview
This PR implements MPI-based 3D Fast Fourier Transform (FFT) functionality for energy cascade analysis in the post-processing module. The implementation uses pencil decomposition with 2D Cartesian sub-communicators instead of the previous 1D slab approach, improving computational efficiency for multi-rank FFT operations.
Key changes:
- Add MPI-based 3D FFT implementation with FFTW integration for energy spectrum calculations
- Implement pencil decomposition using cartesian sub-communicators for better parallel efficiency
- Add comprehensive input validation for FFT parameter constraints
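The slab-to-pencil change in the key changes above can be illustrated with a small sketch. It is not the PR's Fortran code: the function names `pencil_grid`, `local_extent`, and `x_pencil` are hypothetical, and the near-square factorization and row-major rank layout are assumptions chosen for illustration. The point it shows is that a pencil layout keeps one axis whole on every rank and splits the other two over a 2D process grid, so an N^3 grid can use up to N^2 ranks rather than the N ranks a slab layout allows.

```python
import math


def pencil_grid(num_ranks):
    """Factor num_ranks into (p_rows, p_cols), as close to square as possible."""
    p_rows = math.isqrt(num_ranks)
    while num_ranks % p_rows != 0:
        p_rows -= 1
    return p_rows, num_ranks // p_rows


def local_extent(n, parts, index):
    """Block-split n points over `parts` ranks; return (start, count) for `index`."""
    base, rem = divmod(n, parts)
    count = base + (1 if index < rem else 0)
    start = index * base + min(index, rem)
    return start, count


def x_pencil(shape, num_ranks, rank):
    """Return ((x0, nx), (y0, ly), (z0, lz)) owned by `rank` in an x-pencil layout.

    Assumption: ranks are laid out row-major on the 2D process grid.
    """
    nx, ny, nz = shape
    p_rows, p_cols = pencil_grid(num_ranks)
    row, col = divmod(rank, p_cols)
    y0, ly = local_extent(ny, p_rows, row)  # y is split across grid rows
    z0, lz = local_extent(nz, p_cols, col)  # z is split across grid columns
    return (0, nx), (y0, ly), (z0, lz)      # x stays whole, so a 1D FFT in x is local
```

For example, `x_pencil((64, 64, 64), 12, 5)` places rank 5 on a 3 x 4 process grid with the full x range and a 21 x 16 patch in y and z.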
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
File | Description |
---|---|
toolchain/mfc/run/case_dicts.py | Add fft_wrt parameter to post-processing configuration |
src/post_process/m_start_up.fpp | Implement complete 3D FFT system with FFTW, MPI transpose operations, and energy cascade calculations |
src/post_process/m_mpi_proxy.fpp | Include fft_wrt parameter in MPI broadcast list |
src/post_process/m_global_parameters.fpp | Add fft_wrt logical parameter declaration and initialization |
src/post_process/m_checker.fpp | Add FFT input validation constraints and requirements |
src/common/m_mpi_common.fpp | Add conditional processor topology optimization for FFT operations using 2D pencil decomposition |
Further changes: more changes went in today that use pencil decomposition throughout (including pre_process and simulation) whenever fft_wrt is enabled. This is done so that file_per_process can be enabled. The performance hit in going from block to pencil decomposition is far smaller than the I/O performance hit when not using file_per_process, especially considering time averaging for the spectrum. I've also added the Lax-Friedrichs Riemann solver.
Please add tests for the LF Riemann solver.
/improve |
Pull Request Overview
Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (2)
src/post_process/m_start_up.fpp:1
- The s_lf_riemann_solver subroutine is 831 lines long, which significantly exceeds the coding guideline limit of ≤500 lines for subroutines. This subroutine should be refactored into smaller, more manageable helper subroutines.
#:include 'macros.fpp'
User description
Description
Fast Fourier transform for energy cascade, implemented on multiple ranks. Improves upon Conrad's implementation by using pencil decomposition (2D) instead of slabs (1D) in post_process. This required the use of Cartesian sub-communicators.
Tested for correctness on Taylor-Green Vortex problem.
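Why pencils suit a 3D FFT can be seen from the separability of the transform: a 3D DFT is a 1D DFT applied along x, then y, then z. The sketch below is a serial, pure-Python illustration of that property only. It uses a naive O(n^2) DFT rather than FFTW, and the names `dft_1d`, `dft_along_axis`, and `dft_3d` are hypothetical. In a pencil-decomposed code, each axis pass would be local to a rank, with a transpose between passes.

```python
import cmath


def dft_1d(values):
    """Naive unnormalized forward DFT of a 1D sequence."""
    n = len(values)
    return [
        sum(values[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def dft_along_axis(field, axis):
    """Apply dft_1d to every line of a nested-list 3D field along `axis`."""
    dims = (len(field), len(field[0]), len(field[0][0]))
    out = [[[0j] * dims[2] for _ in range(dims[1])] for _ in range(dims[0])]
    others = [a for a in range(3) if a != axis]
    for p in range(dims[others[0]]):
        for q in range(dims[others[1]]):
            def at(m):
                # Index of the m-th point on the current line.
                idx = [0, 0, 0]
                idx[axis], idx[others[0]], idx[others[1]] = m, p, q
                return idx
            line = dft_1d([field[i][j][k] for i, j, k in map(at, range(dims[axis]))])
            for m, value in enumerate(line):
                i, j, k = at(m)
                out[i][j][k] = value
    return out


def dft_3d(field):
    """3D DFT as three successive 1D passes (x, then y, then z)."""
    for axis in (0, 1, 2):
        field = dft_along_axis(field, axis)
    return field
```

A constant field is a quick sanity check: all of its energy should land in the zero-wavenumber coefficient.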
Also adds a Lax-Friedrichs Riemann solver as an option (riemann_solver = 5).
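For readers unfamiliar with the scheme, here is a minimal scalar sketch of a Lax-Friedrichs-type interface flux. It is not the PR's `s_lf_riemann_solver`, which handles the full flow model in Fortran. The function name `lax_friedrichs_flux` and the caller-supplied dissipation coefficient `alpha` are assumptions for illustration: taking `alpha = dx/dt` gives the classic form, and taking the largest local wave speed gives the local (Rusanov) form. Which variant the PR uses is not stated in this conversation.

```python
def lax_friedrichs_flux(u_left, u_right, flux, alpha):
    """Central average of the physical fluxes minus a dissipation term.

    alpha is a dissipation coefficient chosen by the caller (an assumption
    of this sketch, not a parameter taken from the PR).
    """
    return 0.5 * (flux(u_left) + flux(u_right)) - 0.5 * alpha * (u_right - u_left)


def burgers_flux(u):
    """Physical flux of the inviscid Burgers equation, f(u) = u^2 / 2."""
    return 0.5 * u * u
```

When both states are equal, the dissipation term vanishes and the numerical flux reduces to the physical flux, which is the consistency property any test for such a solver would likely check first.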
PR Type
Enhancement
Description
Implement MPI-based 3D FFT for energy cascade analysis
Add pencil decomposition using Cartesian sub-communicators
Integrate FFTW library with MPI transpose operations
Add input validation for FFT requirements
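The energy cascade analysis named above typically means binning spectral energy into spherical wavenumber shells. The following is a hedged sketch of that binning for one velocity component, not the PR's implementation: the names `signed_wavenumber` and `shell_energy_spectrum` are hypothetical, the input is assumed to come from an unnormalized forward DFT stored as nested lists, and shells are assumed to have integer radius `round(|k|)`. A full kinetic energy spectrum would sum the result over the three velocity components.

```python
import math


def signed_wavenumber(index, n):
    """Map a DFT index 0..n-1 to its signed integer wavenumber."""
    return index if index <= n // 2 else index - n


def shell_energy_spectrum(u_hat):
    """Sum 0.5 * |u_hat|^2 into shells of integer radius round(|k|)."""
    nx, ny, nz = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    norm = (nx * ny * nz) ** 2  # assumes an unnormalized forward DFT
    k_max = int(math.sqrt(3) * (max(nx, ny, nz) // 2)) + 1
    spectrum = [0.0] * (k_max + 1)
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                kx = signed_wavenumber(i, nx)
                ky = signed_wavenumber(j, ny)
                kz = signed_wavenumber(k, nz)
                shell = int(round(math.sqrt(kx * kx + ky * ky + kz * kz)))
                spectrum[shell] += 0.5 * abs(u_hat[i][j][k]) ** 2 / norm
    return spectrum
```

With this normalization the shells sum to the mean of 0.5 u^2 over the grid (Parseval's relation), so a single cosine mode of unit amplitude puts 0.25 into shell 1.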
File Walkthrough
m_mpi_common.fpp (src/common/m_mpi_common.fpp): Add FFT-specific processor topology optimization
m_checker.fpp (src/post_process/m_checker.fpp): Add FFT input validation constraints
- s_check_inputs_fft subroutine for FFT parameter validation
- file_per_process when FFT is enabled
m_global_parameters.fpp (src/post_process/m_global_parameters.fpp): Add FFT write parameter
- fft_wrt logical parameter for FFT output control
- fft_wrt to false in default settings
m_mpi_proxy.fpp (src/post_process/m_mpi_proxy.fpp): Include FFT parameter in MPI broadcasts
- fft_wrt to MPI broadcast variable list
m_start_up.fpp (src/post_process/m_start_up.fpp): Implement complete MPI-based 3D FFT system
case_dicts.py (toolchain/mfc/run/case_dicts.py): Add FFT parameter to toolchain configuration
- fft_wrt parameter to post-processing parameter dictionary