Phase-2 completion: post-process context types, m_riemann_solvers split, cleanup#1556
Phase-2 completion: post-process context types, m_riemann_solvers split, cleanup#1556sbryngelson wants to merge 44 commits into
Conversation
Sweeps bubble_model, avg_state, wave_speeds, recon_type, muscl_order, muscl_lim, int_comp, format, and precision (99 sites). Includes renaming legacy WENO_TYPE/MUSCL_TYPE comparisons to recon_type_weno/recon_type_muscl (same values) and two select-case labels in m_qbmm; WENO_TYPE/MUSCL_TYPE retirement is a follow-up.
Fypp resolves #:include at parse time, so generated_case_opt_decls.fpp is now emitted for every target (header-only stub outside simulation) and included unconditionally. Restores the dropped shear GPU_DECLARE (consumed in device kernels), restores the original nmom guard conditions in pre/post, and drops unused imports. The post-process beta_idx ordering change from the hoist is retained deliberately: it aligns post with simulation's writer layout (the parent ordering was inconsistent for bubbles_lagrange combined with mhd/elasticity/etc.).
Simulation declares nb in its own case-optimization block (excluded from generated decls), so the hoisted routine cannot reference it directly; pass it as an argument like nmom.
Drive s_mpi_bcast_user_inputs in all three m_mpi_proxy.fpp files from generate_bcast_fpp(target) in fortran_gen.py. The generator emits case_dir, class-(a) scalars (INT/LOG/REAL), FORTRAN_ARRAY_DIMS arrays, and the fluid_pp / bub_pp / lag_params / chem_params struct-array loops. Manual residues (bc_x/y/z members, domain bounds, m/n/p_glb, patch loops, etc.) stay in each file. get_generated_files() grows from 12 to 15 entries. Latent bug fixed: chem_wrt_Y (post, FORTRAN_ARRAY_DIMS dim=num_species) was namelist-bound but never broadcast; consumed by s_save_data on all ranks. Registry-driven generation closes it by construction. Tuple-set equivalence: pre +n_start_old (dead param, harmless), post +chem_wrt_Y only, sim exact identity.
bc_x/y/z are per-target declarations (the documented multi-variable-line exclusion), so their default assignments cannot live in m_global_parameters_common; the hoist had moved them and broke compilation. Restores the three BC default blocks to each executable after the s_assign_common_defaults call.
Verbatim motion only — no renames, reflows, or logic changes. Section map (original line ranges → files): lines 129-331 compiler flags / GPU logic → cmake/GPU.cmake lines 107-461 FYPP_EXE discovery + HANDLE_SOURCES → cmake/Fypp.cmake lines 464-541 params codegen stamp + gen-file lists + custom command/target → cmake/ParamsCodegen.cmake lines 549-821 MFC_SETUP_TARGET function → cmake/MFCTargets.cmake include() order in root CMakeLists.txt: 1. cmake/GPU.cmake (sets FYPP_GCOV_OPTS, NVHPC_USE_TWO_PASS_IPO, MFC_CUDA_CC) 2. cmake/Fypp.cmake (finds FYPP_EXE, defines HANDLE_SOURCES — needs FYPP_GCOV_OPTS) 3. cmake/ParamsCodegen.cmake (sets _mfc_gen_files_* lists — must precede HANDLE_SOURCES calls) 4. HANDLE_SOURCES calls (in root — consume _mfc_gen_files_* and FYPP_EXE) 5. cmake/MFCTargets.cmake (defines MFC_SETUP_TARGET — needs NVHPC_USE_TWO_PASS_IPO) 6. MFC_SETUP_TARGET calls (in root — instantiate targets) 7. docs section (in root — unchanged) Equivalence gate: cmake configure of -DMFC_PRE_PROCESS=ON -DMFC_MPI=OFF -DCMAKE_BUILD_TYPE=Release BEFORE and AFTER, path-normalized diff of CMakeCache.txt, Makefile, CMakeFiles/pre_process.dir/flags.make, and CMakeFiles/pre_process.dir/build.make — all empty (identical). A clean 3-target build rides the next scheduled gate.
…ludes Delete the configure-time execute_process + stamp block from cmake/ParamsCodegen.cmake (29 lines removed). The build-time add_custom_command is now the sole mechanism that writes the 15 generated_*.fpp includes. Cold-start proof (scratch build dir, no pre-existing includes): - cmake configure succeeds with include/ absent - make -n shows cmake_gen.py scheduled before all fypp steps - make mfc_params_gen produces all 15 files across 3 target dirs Incremental proof: touch toolchain/mfc/params/definitions.py -> make -n schedules regeneration without reconfigure. cmake_gen.py already calls path.parent.mkdir(parents=True,exist_ok=True) so no file(MAKE_DIRECTORY) guards were added. find_package(Python3) is kept in ParamsCodegen.cmake (the CMakeLists.txt one is in a docs block that runs later). Full build+test rides the next gate.
The hardcoded fluid_pp emitter predated the MFlowCode#1545 merge and silently dropped K/nn/tau0/hb_m/mu_min/mu_max/mu_bulk and non_newtonian from the generated lists - a multi-rank regression for non-Newtonian runs relative to the manual lists it replaced. Emitted sets now verified member-identical to master's manual lists per target. Caught by automated PR review.
The named-constant retirement deleted the WENO/MUSCL block between two section headings and merged them into one nonsensical comment. Flagged by automated PR review on MFlowCode#1552.
…h dedup docs NIB read num_ib_patches_max (2050000) but the patch_ib namelist array is dimensioned num_ib_patches_max_namelist (54000), so validation accepted indices that overflow the array. Also refreshes three doc passages made stale by the dedup. Both flagged by review.
Cray ftn rejects declare-target on use-associated names (ftn-1448 on Frontier gpu-omp): the dedup hoisted these declarations into the common module but left their GPU_DECLARE lines in simulation. Declares move to the declaring module; mixed lines split so locally-declared variables keep their declares in place. CPU-preprocessed output verified identical; declare-target scoping verified clean for both files under OpenMP emission.
fluid_pp%{mul0,ss,pv,gamma_v,M_v,mu_v,k_v,cp_v,D_v} and
lag_params%{T0,Thost,c0,rho0,x0} were removed from the Fortran derived
types by upstream MFlowCode#1085/MFlowCode#1093 but remained registered in the toolchain.
Setting any of them causes namelist read to abort with a misleading
'datatype mismatch' error.
Verified against src/common/m_derived_types.fpp:
- physical_parameters has: gamma, pi_inf, Re, cv, qv, qvp, G
- bubbles_lagrange_parameters has: solver_approach, cluster_type,
pressure_corrector, smooth_type, heatTransfer_model, massTransfer_model,
write_bubbles, write_bubbles_stats, nBubs_glb, epsilonb, charwidth, valmaxvoid
Also remove the now-dead PATTERNS entries in descriptions.py and update
the stale comments in fortran_gen.py (_emit_fluid_pp and _emit_lag_params).
Both emitters now walk the registry rather than hardcoding member lists.
bc_x/y/z%{vb1,vb2,vb3,ve1,ve2,ve3} are read from the namelist on rank 0
and consumed on all ranks by s_slip_wall/s_no_slip_wall in
src/common/m_boundary_common.fpp (~833-1155). These routines are compiled
into all three targets and are reached from pre_process (m_data_output,
m_perturbation) and post_process (m_start_up) code paths.
Without the broadcast, non-root ranks use uninitialised values for the
wall-velocity components, producing rank-dependent ghost cells for any
multi-rank wall-BC pre-process or post-process run. The sim residue has
always broadcast these; pre and post were missing them.
Add the matching broadcasts to both pre and post using the same nested
Fypp loop idiom as the simulation residue.
muscl_eps was excluded from broadcast generation via _BCAST_EXCLUDE on the incorrect assumption that it is derived post-broadcast. The derivation (in m_weno or m_muscl) only fires under f_is_default(muscl_eps), and default values are assigned on rank 0 only. Every multi-rank MUSCL run therefore had rank-divergent muscl_eps on non-root ranks. Remove it from the exclusion set. Tuple-set delta (var, mpi_type, count) vs. HEAD~: sim: +1 entry: (muscl_eps, mpi_p, 1) pre: no change post: no change _emit_fluid_pp and _emit_lag_params now walk the registry instead of maintaining hardcoded member lists. After Commit 1 deregistered the dead members (mul0/ss/pv/gamma_v/M_v/mu_v/k_v/cp_v/D_v for fluid_pp; T0/Thost/c0/rho0/x0 for lag_params), the registry now matches the Fortran types exactly. Re(1) count=2 remains sim-only via an explicit target check with a comment. G is walked as a regular REAL member. Tests: 3 new tests added — muscl_eps now broadcast in sim, fluid_pp and lag_params registry walks produce exactly the registered members minus documented exclusions, dead members absent.
…idue broadcasts pre_process sent the integer bc_x/y/z%beg/%end with mpi_p (an 8-byte real transfer over 4-byte integers - undefined behavior that happened to work by adjacency); simulation and post_process already used MPI_INTEGER. Also adds a test pinning the hand-written vb/ve and BC-code residue in pre/post so a merge conflict cannot silently drop them. Both from review.
…ops, stale import
The allocation guards in s_initialize_derived_variables_module omitted qm_wrt while the s_compute_finite_difference_coefficients call guards in m_start_up include it, so a case writing only the Q-criterion accessed unallocated arrays. Pre-existing; found during fd_context review. Guards now mirror the call sites exactly.
…n_state NFC, pure motion. The four solver bodies call s_initialize/finalize_riemann_solver, s_populate_riemann_states_variables_buffers, and s_compute_viscous_source_flux, so keeping those in the dispatching core would create a use-cycle once the solvers move out. They go to the new lowest layer together with the 14 GPU_DECLARE'd state items they consume (declares move with declarations; lowest-consumer rule, bc_buffers precedent). Module-level allocation, GPU updates, and deallocation stay in core's s_initialize/finalize_riemann_solvers_module via use-association. Statement-multiset union vs the parent differs only by module boilerplate; GPU-directive census md5 unchanged; declare-target scoping verified for both files.
NFC, pure motion (calibration commit for the solver extractions). The solver body accesses flux_rsx_vf, flux_src_rsx_vf, is1/2/3 by host association and calls the per-sweep lifecycle helpers, both now use-associated from m_riemann_state. Core re-exports s_hlld_riemann_solver through its existing public list. Statement-multiset union vs the pre-split file differs only by module boilerplate; GPU-directive census md5 unchanged; declare-target scoping verified for all files.
NFC, pure motion. State access stays host-associated via m_riemann_state; the inline_riemann include moves with the solver since its macros expand here (roe_avg pulls in get_species_enthalpies_rt, gas_constant, molecular_weights, hence the m_thermochem only-list). Core re-exports the entry point through its existing public list. Statement-multiset union vs the pre-split file differs only by module boilerplate; GPU-directive census md5 unchanged; declare-target scoping verified for all files.
NFC, pure motion. State access stays host-associated via m_riemann_state; inline_riemann macros expand here (low-Mach correction), and the solver references molecular_weights directly, hence the m_thermochem only-list. Core re-exports the entry point through its existing public list. Statement-multiset union vs the pre-split file differs only by module boilerplate; GPU-directive census md5 unchanged; declare-target scoping verified for all files.
NFC, pure motion; completes the m_riemann_solvers split. State access stays host-associated via m_riemann_state; hllc additionally uses m_bubbles (f_cpbw_KM), m_bubbles_EE (rs/vs/ps), and m_surface_tension (s_compute_capillary_source_flux). With the last solver body gone, core drops the use-lists whose final consumers moved (m_variables_conversion, m_bubbles, m_bubbles_EE, m_surface_tension, m_chemistry, m_thermochem, the m_constants only-list) and the inline_riemann include; core keeps the dispatcher, module init/finalize, and the public re-exports. Statement-multiset union vs the pre-split file differs only by module boilerplate; GPU-directive census md5 unchanged; declare-target scoping verified for all six files.
Review follow-up: m_mpi_proxy and m_helper_basic are unreferenced in the slimmed core under all configs.
There was a problem hiding this comment.
Pull request overview
Completes Phase 2 of the refactoring roadmap by (1) moving remaining post-process module state into context derived types, (2) splitting the large simulation Riemann solver module into smaller units while preserving the public surface, and (3) finishing the parameter-registry driven generation of Fortran declarations, case-opt declarations, and per-target MPI broadcast fragments (plus associated cleanup and a couple of latent bug fixes).
Changes:
- Add registry-driven generation for additional Fortran include fragments (
generated_case_opt_decls.fpp,generated_bcast.fpp) and extend typed derived-type namelist declarations viaTYPED_DECLS. - Introduce new shared Fortran module
m_global_parameters_commonand update build system/CMake to regenerate includes at build time. - Replace remaining literal enums with generated named constants across several simulation modules; add post-process
fd_context/output_contextand update post-process code to use them.
Reviewed changes
Copilot reviewed 53 out of 56 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| toolchain/mfc/params/generators/fortran_gen.py | Extend Fortran codegen to emit typed decls, case-opt decl blocks, and per-target MPI broadcast include fragments. |
| toolchain/mfc/params/generators/cmake_gen.py | Update generator script header comment to match build-time invocation. |
| toolchain/mfc/params/descriptions.py | Remove descriptions for deregistered derived-type members. |
| toolchain/mfc/params/definitions.py | Rename IB patch namelist bound constant; document/avoid dead registrations; add TYPED_DECLS. |
| toolchain/mfc/params_tests/test_fortran_gen.py | Add/adjust tests for typed decl emission, case-opt decl generation, and generated broadcast output. |
| toolchain/mfc/lint_docs.py | Derive allowed doc tokens from analytic intrinsics and registry constraints instead of a hardcoded skiplist. |
| toolchain/mfc/case_validator.py | Make recon_type validation message registry-driven (choices/names from CONSTRAINTS). |
| toolchain/mfc/build.py | Change build slug to a human-readable prefix plus existing 10-hex hash. |
| src/simulation/m_viscous.fpp | Replace reconstruction-type literals with generated named constants. |
| src/simulation/m_thinc.fpp | Replace interface-compression literal with named constant. |
| src/simulation/m_start_up.fpp | Use named recon-type constants; make Fypp loop literal order deterministic. |
| src/simulation/m_riemann_solver_hlld.fpp | New module for the HLLD solver extracted from the former monolith. |
| src/simulation/m_rhs.fpp | Use named constants for interface compression and reconstruction types. |
| src/simulation/m_qbmm.fpp | Use named constants for bubble-model choices. |
| src/simulation/m_muscl.fpp | Use named constants for MUSCL order/limiter choices. |
| src/simulation/m_mpi_proxy.fpp | Switch large portions of MPI broadcast logic to generated_bcast.fpp include; keep manual residue. |
| src/simulation/m_data_output.fpp | Use named constant for precision selection. |
| src/simulation/m_checker.fpp | Use named constants for recon type and MUSCL order checks. |
| src/simulation/m_cbc.fpp | Use named constants for recon type; make Fypp loop literal order deterministic. |
| src/simulation/m_bubbles.fpp | Use named constants for bubble-model choices. |
| src/simulation/m_bubbles_EL.fpp | Use named constant for precision selection in formatted output. |
| src/simulation/include/inline_riemann.fpp | Use named constants for average-state selection. |
| src/pre_process/m_start_up.fpp | Update IC state references to use the new ic_context bundle. |
| src/pre_process/m_mpi_proxy.fpp | Switch large portions of MPI broadcast logic to generated_bcast.fpp include; keep manual residue. |
| src/pre_process/m_initial_condition.fpp | Replace module-level IC state with type(ic_context) :: ic. |
| src/pre_process/m_data_output.fpp | Use named constant for precision selection. |
| src/pre_process/m_boundary_conditions.fpp | Remove stale/unused import. |
| src/post_process/m_start_up.fpp | Route FD coeff computation and output workspace through new contexts. |
| src/post_process/m_mpi_proxy.fpp | Switch large portions of MPI broadcast logic to generated_bcast.fpp include; keep manual residue; use named format constant. |
| src/post_process/m_derived_variables.fpp | Replace module-level FD work arrays with type(fd_context) :: fd and update allocation guards. |
| src/common/m_variables_conversion.fpp | Use named constant for average-state selection. |
| src/common/m_mpi_common.fpp | Use named constants for recon type and post-process format handling. |
| src/common/m_helper_basic.fpp | Use named constants for recon type when computing buffer size. |
| src/common/m_global_parameters_common.fpp | New shared global-parameters core and shared defaults/equation-index setup. |
| src/common/m_derived_types.fpp | Add ic_context, fd_context, and output_context derived types. |
| src/common/m_constants.fpp | Remove legacy recon-type constants and retain/clarify compile-time bounds constants. |
| docs/module_categories.json | Register new split modules in documentation categories. |
| docs/documentation/contributing.md | Update contributor guidance for build-time codegen and shared global parameters. |
| cmake/ParamsCodegen.cmake | New build-time codegen custom command/targets for generated .fpp includes. |
| cmake/MFCTargets.cmake | New target setup function and related configuration logic. |
| cmake/GPU.cmake | New consolidated GPU/compiler flags configuration. |
| cmake/Fypp.cmake | New consolidated Fypp preprocessing/HANDLE_SOURCES logic with generated-include dependencies. |
| .claude/rules/common-pitfalls.md | Update internal contributor/agent notes to reflect new shared global-params and codegen pipeline. |
| auto-generated at build time (ninja-tracked custom command) from the `TYPED_DECLS` and `FORTRAN_ARRAY_DIMS` | ||
| tables in `toolchain/mfc/params/definitions.py`. For a plain scalar registered with | ||
| `_r()` / `_nv()` above, no manual Fortran edit is needed — reconfigure (`./mfc.sh build`) | ||
| and the generated include in `m_global_parameters_common.fpp` (compiled per target) is updated | ||
| automatically. |
| After editing any generator or table, force regen by reconfiguring (`./mfc.sh build`) — | ||
| cached builds compile stale includes. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #1556 +/- ##
==========================================
- Coverage 61.17% 60.94% -0.24%
==========================================
Files 74 82 +8
Lines 20313 19922 -391
Branches 2961 2924 -37
==========================================
- Hits 12427 12141 -286
+ Misses 5870 5805 -65
+ Partials 2016 1976 -40 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Summary
Context types (post_process). The remaining module-level state bundles follow the pilot pattern:
fd_context(finite-difference workspace:gm_rho_sf,fd_coeff_x/y/z) andoutput_context(the 18-member output workspace:q_sffamily, Silo extents/offsets, paths and handles). Strictly NFC, verified by emitted-Fortran equivalence after stripping the qualifier.m_riemann_solvers.fppsplit (4,706 lines → 6 modules). The four solvers extract tom_riemann_solver_{hll,hllc,hlld,lf}.fpp; the shared device state (14 GPU-declared arrays/bounds) and the per-sweep lifecycle + viscous helpers the solvers call land inm_riemann_state.fpp(the lowest layer — keeping them in the dispatching core would create a module cycle); the original module shrinks to a 116-line dispatcher that re-exports the public surface unchanged (zero external callers modified). Pure motion: statement-multiset equivalence under CPU/OpenACC/OpenMP emission; GPU-directive census identical (747 directives); everydeclare targetverified to live in its declaring module (the Cray ftn-1448 rule).Two latent bug fixes (found by review during the above):
qm_wrtis set, while the coefficient computation runs for it — unallocated access for Q-criterion-only cases. The allocation guards now mirror the call sites.patch_ibcounts above the namelist array bound — carried in Share the global-parameters core across executables; generate MPI broadcasts; build hygiene #1552's follow-ups; noted here because the registry-driven message change rides this stack.Cleanup batch: the
recon_typevalidator message now derives from the parameter registry (the last hardcoded vocabulary site); fypp#:forset-literals become list-literals (emission order was Python-hash-dependent across processes — builds are now order-reproducible); one stale import removed; the slimmed Riemann core drops two unused imports.Verification
Notes for reviewers
Gs_rs/Res_gsallocate without a finalize-side deallocate (base-identical), and the single-precisionq_sf_striple is never deallocated.m_riemann_stateholds the per-sweep helpers as well as state — the call graph forces it; the module doc enumerates the contents.