TSMP-Intel on Stages/2024 #252

Draft: wants to merge 11 commits into base: master


@jjokella (Contributor) commented Nov 7, 2024

loadenvs.Intel: module environment for Stages/2024

@DCaviedesV (Contributor)

Excellent! Have runtime tests been successful on JUWELS and JURECA?

@jjokella (Contributor Author) commented Nov 8, 2024

I was a little too enthusiastic yesterday evening: the module environment was easy to transfer and it loaded without problems, but sadly all my test compilations on JUWELS (clm-pdaf, clm5-pdaf, clm-pfl-pdaf) failed, each for a different reason.

CLM-PFL-PDAF

OASIS error, from util/make_dir/COMP.err:

./TSMP/oasis3-mct_JUWELS_clm-pfl-pdaf/lib/psmile/src/mod_oasis_method.F90(720): error #5623: **Internal compiler error: internal abort** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.
   tag=ICHAR(TRIM(compnm))+ICHAR(TRIM(cdnam))
-------^
compilation aborted for ./TSMP/oasis3-mct_JUWELS_clm-pfl-pdaf/lib/psmile/src/mod_oasis_method.F90 (code 3)
make[2]: *** [Makefile:30: mod_oasis_method.o] Error 3

CLM-PDAF

  cat: Srcfiles: No such file or directory                                                                                                                               
./TSMP/clm3_5_JUWELS_clm-pdaf/src/utils/mct/mpeu/get_zeits.c:65:6: warning:a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
void get_zeits_(zts)
     ^
./TSMP/clm3_5_JUWELS_clm-pdaf/src/utils/mct/mpeu/get_zeits.c:89:6: warning:a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
void get_ztick_(tic)
     ^
2 warnings generated.
./TSMP/clm3_5_JUWELS_clm-pdaf/src/utils/timing/gptl.c:182:28: error: incompatible pointer to integer conversion assigning to 'unsigned int' from 'void *' [-Wint-conversion]                                                                      
    current_depth[t].depth = NULL;
                           ^ ~~~~
./TSMP/clm3_5_JUWELS_clm-pdaf/src/utils/timing/gptl.c:183:22: error: incompatible pointer to integer conversion assigning to 'int' from 'void *' [-Wint-conversion]                                                                               
    max_depth[t]     = NULL;
                     ^ ~~~~
./TSMP/clm3_5_JUWELS_clm-pdaf/src/utils/timing/gptl.c:184:22: error: incompatible pointer to integer conversion assigning to 'int' from 'void *' [-Wint-conversion]                                                                               
    max_name_len[t]  = NULL;
                     ^ ~~~~
3 errors generated. 

CLM5-PDAF

WARNING: No cesm Model version found.                                                                                                                                  
./TSMP/clm5_0_JUWELS_clm5-pdaf/clmoas/env_mach_specific.xml already exists, delete to replace
ERROR: Command ./TSMP/clm5_0_JUWELS_clm5-pdaf/bld/build-namelist failed rc=2
out=
err=Can't locate XML/LibXML.pm in @INC (you may need to install the XML::LibXML module) (@INC contains: ./TSMP_JUWELS_2024_11_07_dev-stages-2024_clm5-pdaf_Intel_CPU/clm5_0_JUWELS_clm5-pdaf/bld ./TSMP/clm5_0_JUWELS_clm5-pdaf/bld ./TSMP/clm5_0_JUWELS_clm5-pdaf/cime/scripts/Tools/../../utils/perl5lib ./TSMP/clm5_0_JUWELS_clm5-pdaf/bld /p/software/juwels/stages/2024/software/Perl/5.36.1-GCCcore-12.3.0/lib/perl5/site_perl/5.36.1/x86_64-linux-thread-multi /p/software/juwels/stages/2024/software/Perl/5.36.1-GCCcore-12.3.0/lib/perl5/site_perl/5.36.1 /p/software/juwels/stages/2024/software/Perl/5.36.1-GCCcore-12.3.0/lib/perl5/5.36.1/x86_64-linux-thread-multi /p/software/juwels/stages/2024/software/Perl/5.36.1-GCCcore-12.3.0/lib/perl5/5.36.1) at ./TSMP/clm5_0_JUWELS_clm5-pdaf/cime/scripts/Tools/../../utils/perl5lib/Config/SetupTools.pm line 5.
BEGIN failed--compilation aborted at ./TSMP/clm5_0_JUWELS_clm5-pdaf/cime/scripts/Tools/../../utils/perl5lib/Config/SetupTools.pm line 5.
Compilation failed in require at ./TSMP/clm5_0_JUWELS_clm5-pdaf/bld/CLMBuildNamelist.pm line 414.

@@ -67,7 +67,7 @@ getDefaults(){
setDefaults(){
#load the default values
platform=$def_platform
-if [[ $platform == "" ]] then ; platform="JUWELS" ; fi #We need a hard default here
+if [[ $platform == "" ]] then ; platform="JEDI" ; fi #We need a hard default here
@jjokella (Contributor Author) commented Dec 6, 2024

I think JEDI should not be the default in a PR to master. As I understand it, JEDI is a testing platform. I think we should keep JUWELS until JUPITER can become the default.

Contributor

ok
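
The hard default discussed in this thread can be written as a small fallback. The sketch below keeps JUWELS as the fallback value, as suggested in the review comment. The function name `resolve_platform` is hypothetical, and the snippet is not taken from the TSMP scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the requested platform, or JUWELS when the
# request is empty or unset.
resolve_platform() {
  local requested="$1"
  echo "${requested:-JUWELS}"
}

def_platform=""                        # nothing configured by the user
platform=$(resolve_platform "$def_platform")
echo "platform=$platform"              # prints "platform=JUWELS"
```

Keeping the fallback in one place would mean that a later switch to another default only touches a single line.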

@@ -260,6 +260,10 @@ setCombination(){
compileClm(){
route "${cyellow}> c_compileClm${cnormal}"
comment " source clm interface script"
if echo "$compiler" | grep -qE 'Gnu'; then
Contributor Author

Is this needed here? I guess it should be moved to common_build_interface, as is done for OASIS below.
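
One way to centralise the compiler check is a small predicate in a shared script. The sketch below is an assumption about how that could look: the name `is_gnu_compiler` is hypothetical, and what the Gnu branch actually does is not visible in the hunk, so it is only represented by a message here.

```shell
#!/usr/bin/env bash
# Hypothetical helper for a shared script such as common_build_interface.
# Matches any compiler string containing "Gnu", like the grep in the hunk.
is_gnu_compiler() {
  case "$1" in
    *Gnu*) return 0 ;;
    *)     return 1 ;;
  esac
}

compiler="Intel"
if is_gnu_compiler "$compiler"; then
  echo "Gnu-specific setup"
else
  echo "no Gnu-specific setup"
fi
```

Each component build function could then call the same predicate instead of repeating the `echo | grep` test.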

Contributor Author

Should this really be deleted? Or could it be a copy of loadenvs.Gnu.2023 or loadenvs.Gnu.2024, depending on which is currently the default? This is how I have understood loadenvs.Gnu and loadenvs.Intel so far: as copies of the default stage. There are also some code parts where loadenvs.$compiler is used, which may cause problems once loadenvs.Gnu and loadenvs.Intel are gone.
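
As an alternative to keeping a copy, code that uses loadenvs.$compiler could resolve the file at run time and fall back to a stage-specific file. The sketch below illustrates that idea only. The function `resolve_loadenvs`, its arguments, and the temporary demo directory are hypothetical and do not reflect how TSMP currently selects the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the environment file to source for a compiler.
# Prefers loadenvs.<compiler>; falls back to loadenvs.<compiler>.<stage>.
resolve_loadenvs() {
  local env_dir="$1" compiler="$2" default_stage="$3"
  if [ -f "$env_dir/loadenvs.$compiler" ]; then
    echo "$env_dir/loadenvs.$compiler"
  elif [ -f "$env_dir/loadenvs.$compiler.$default_stage" ]; then
    echo "$env_dir/loadenvs.$compiler.$default_stage"
  else
    return 1    # neither file exists
  fi
}

# Demo in a throwaway directory that only holds a stage-specific file.
demo_dir=$(mktemp -d)
touch "$demo_dir/loadenvs.Intel.2024"
resolve_loadenvs "$demo_dir" Intel 2024
rm -rf "$demo_dir"
```

With such a fallback, removing loadenvs.Gnu and loadenvs.Intel would not break callers, at the cost of having to maintain the default stage in one more place.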

With GCC version >= 12, rmm needs a patch.
Muhammad's repo already includes this patch for rmm.

-DCMAKE_EXE_LINKER_FLAGS=\"-lcurand -lcusparse -lcublas\" is not read correctly by the "ksh" shell.
I added the -DCMAKE_EXE_LINKER_FLAGS flags that were not being read in build_interface_parflow.ksh directly to the cmake call.
Update dev-stages-2024 for GPU-rmm on Juwels and Jedi
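
The commit note on -DCMAKE_EXE_LINKER_FLAGS points to a quoting problem. One plausible cause, not confirmed by the PR, is that escaped quotes stored in a string variable become literal characters, so the value is still split at spaces when the variable is expanded unquoted. The sketch below demonstrates that effect without calling cmake. The helper `count_args` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

# The inner quotes are literal characters here, not shell quoting.
flags='-DCMAKE_EXE_LINKER_FLAGS="-lcurand -lcusparse -lcublas"'

# Unquoted expansion: split at spaces into three separate arguments.
count_args $flags       # prints 3

# Quoted expansion of a value without literal quotes: a single argument.
clean_flags='-DCMAKE_EXE_LINKER_FLAGS=-lcurand -lcusparse -lcublas'
count_args "$clean_flags"   # prints 1
```

Passing the option as one quoted word directly on the cmake command line, as the commit does, avoids the split.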