Fix/289 improve sw lw radiation omp scaling#363
jirudaya wants to merge 15 commits into MetOffice:main
An upgrade macro is required to set the default values of sw_segment_limit and lw_segment_limit - I presume setting these to 0 would replicate current trunk behaviour, although if we think it's more optimal to set them to 32 from the outset then that's also fine
James Manners (mo-jmanners) left a comment
I think any future updates to the argument lists in lw_kernel_mod.F90 and sw_kernel_mod.F90 might now require updates to the "ignore_dependencies_for" list in lw_kernel_mod.py and sw_kernel_mod.py. If this is the case, could you add comments in lw_kernel_mod.F90 and sw_kernel_mod.F90 so that developers know what would need to be changed.
2cedd7d to 260203a
James Manners (mo-jmanners) left a comment
Thanks, looks good.
Hi - I'm not a fan of this change - I'd prefer it be left how it was, which was consistent with the other segment size handling for other schemes, i.e. the namelist should determine the segment size, with 0 defaulting to the entire MPI rank. Please can you revert this - thanks! (Edit - not sure if it's obvious, but this comment refers to commit 9536736)
Hi - this should be done via an upgrade macro, rather than by editing the app configurations by hand. Please can you revert this and add an appropriate upgrade macro. Thanks (Edit - not sure if it's obvious, but this comment refers to commit cce8025)
Also, perfectly happy for the upgrade macro to set the values to 32, if that's the current best number!
Sorry, I think I'm confusing you, but also getting confused myself! I initially made a mistake when suggesting changes to the upgrade macro tags, mixing up the PR (363) and the issue (289). I fixed this, but I don't think I did so before you accepted the changes; the number in the tags should be 363 rather than 289. What I was trying to correct was that the macro tags should show that this is a macro applied to vn3.1, moving it up to vn3.1_t363. However, I can now see that the branch itself isn't at vn3.1, but is instead at an older version of the code. It may be best if you merge stable (vn3.1) into your branch (see instructions here https://github.com/MetOffice/simulation-systems/wiki/2026.03.1) - then you should see that the versions.py file is empty, apart from your single change. This will make it clearer what the upgrade macro is actually going to do!
I made the mistake of creating the branch from main instead of stable, hence versions.py is not empty as expected. I will rebase the branch to fix this.
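For readers unfamiliar with the shape of such a macro, the following is a minimal sketch of what an entry in versions.py might look like for this PR, using the vn3.1 and vn3.1_t363 tags discussed above. It is not the macro from the branch. The section name "namelist:physics" is a hypothetical placeholder, and the small MacroUpgrade class is a stand-in written only so the sketch runs without Rose installed; a real versions.py would import the base class from metomi.rose.upgrade instead.

```python
class MacroUpgrade:
    """Stand-in for metomi.rose.upgrade.MacroUpgrade (assumption: it mimics
    only the members used below, with a plain dict as the config)."""

    def __init__(self):
        self.reports = []

    def add_setting(self, config, keys, value):
        # Add the option only if it is not already present.
        section, option = keys
        config.setdefault(section, {}).setdefault(option, value)


# Hypothetical section name; the real namelist is not given in this thread.
SECTION = "namelist:physics"


class vn31_t363(MacroUpgrade):
    """Sketch of an upgrade macro for PR #363."""

    BEFORE_TAG = "vn3.1"
    AFTER_TAG = "vn3.1_t363"

    def upgrade(self, config, meta_config=None):
        # Set both radiation segment limits to 32, the value suggested above.
        for option in ("sw_segment_limit", "lw_segment_limit"):
            self.add_setting(config, [SECTION, option], "32")
        return config, self.reports
```

With a macro of this form, the app configurations do not need to be edited by hand; the new settings are added when the upgrade from BEFORE_TAG to AFTER_TAG is applied.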
Add ignore_dependencies_for to sw_kernel_mod.py to prevent TransformationErrors and ensure OpenMP pragmas are re-added during PSyclone transmutation. Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Add ignore_dependencies_for to lw_kernel_mod.py to prevent TransformationErrors and ensure OpenMP pragmas are re-added during PSyclone transmutation. Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Limit the number of columns per SW segment using sw_seg_limit to improve OpenMP load balancing and prevent oversized blocks. Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Limit the number of columns per LW segment using lw_seg_limit to improve OpenMP load balancing and prevent oversized blocks. Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
- add parameters to tuning_segement_mod and initialize in um_sizes_init_mod Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Add a note to specify changes may be needed in the transmute script in the future Co-authored-by: James Manners <james.manners@metoffice.gov.uk>
Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
This reverts commit 9536736. Signed-off-by: jirudaya <158197464+jirudaya@users.noreply.github.com>
Fix macro class version
f2e3392 to 19853e3
I tested using a local copy of the branch brought up to date, but I think this one will need a linked JEDI PR. I will link it here once that's ready.
DanStoneMO left a comment
I did a re-test following a KGO change yesterday, and JEDI now works with this as-is. No linked PR will be needed. A surprise to be sure, but a welcome one.
Benjamin Went (MetBenjaminWent) left a comment
Hi Jaffrey, initial changes broadly look good, however I did have some comments, thanks!
gw_segment=32
ls_ppn_segment=32
ussp_segment=4
sw_segment_limit=32
lw_segment_limit=32
I think for sw_segment_limit and lw_segment_limit, the macros should sort this out when the ticket is committed.
These should be handled by the macros, and removed from this file, thanks!
! The maximum segment size is limited by lw_seg_limit_size to prevent
! overly large blocks. This ensures better load balancing across threads.
ncols_per_thread = (n_profile_list + max_threads - 1) / max_threads
nblocks = ((ncols_per_thread + lw_seg_limit_size - 1) / lw_seg_limit_size) &
Could we explicitly round these, depending on our intent?
L592 and 593, where we are dividing?
And L595?
ncols_per_thread = (n_profile_list + max_threads - 1) / max_threads
nblocks = ((ncols_per_thread + sw_seg_limit_size - 1) / sw_seg_limit_size) &
          * max_threads
soc_sw_block = (n_profile_list + nblocks - 1) / nblocks
Same as lw_kernel_mod comment
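On the rounding question: for positive integers, the (a + b - 1) / b pattern in the quoted Fortran is ceiling division, because Fortran integer division truncates. A minimal Python transcription of the three SW lines is sketched below to make that intent explicit. The helper names ceil_div and segment_size are hypothetical, and the sketch assumes all three inputs are positive (it does not cover a limit of 0).

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers: the (a + b - 1) / b idiom."""
    return (a + b - 1) // b


def segment_size(n_profile_list, max_threads, seg_limit):
    """Return (nblocks, block_size) following the quoted SW arithmetic.

    Assumes n_profile_list, max_threads and seg_limit are all positive.
    """
    # Columns each thread would get if the work were split evenly (rounded up).
    ncols_per_thread = ceil_div(n_profile_list, max_threads)
    # Blocks per thread needed to respect the limit, times the thread count,
    # so nblocks is always a multiple of max_threads.
    nblocks = ceil_div(ncols_per_thread, seg_limit) * max_threads
    # Final block size (rounded up); this never exceeds seg_limit.
    return nblocks, ceil_div(n_profile_list, nblocks)


nblocks, block = segment_size(1000, 8, 32)
print(nblocks, block)  # 32 32
```

In this example 1000 columns on 8 threads with a limit of 32 give 32 blocks of at most 32 columns, i.e. 4 blocks per thread. All three divisions round up, which is one way to state the intent the reviewer asks about.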
How come we are making these changes with this PR? I don't see any content in the main ticket details.
PR Summary
Sci/Tech Reviewer: Benjamin Went (@MetBenjaminWent)
Code Reviewer: Lottie Turner (@mo-lottieturner)
sw_kernel_mod and lw_kernel_mod. The manually added OMP pragmas are now re-added without transformation errors.
rose-stem locally with cylc run on ARCHER2.
Code Quality Checklist
Testing
trac.log
Security Considerations
Performance Impact
AI Assistance and Attribution
Documentation
PSyclone Approval
Sci/Tech Review
(Please alert the code reviewer via a tag when you have approved the SR)
Code Review