Errors and workaround suggestions for SuperLU, Silo, Mathpresso on IBM/NVidia #41

Open
herve-gross opened this issue Jun 24, 2019 · 11 comments
@herve-gross
Contributor

Platform: IBM ppc64le
OS: Red Hat Enterprise Linux Server 7.4 (Maipo)
Kernel: 3.10.0-693.el7.ppc64le
CPU: POWER8NVL (raw) 4023.000000MHz
Compiler: gcc/7.3.0
OpenMPI: openmpi/2.1.5
CMake: cmake/3.11.4
Python: anaconda/python-2.7.13

GEOSX commit: 2f4bee2 2019-06-14 11:29:20
thirdPartyLibs commit: e2ad476 2019-06-13 15:57:16

Issue 1: In built target superlu_dist

[ 75%] Linking CXX executable pzdrive1_ABglobal
mpic++  -fopenmp   -o pddrive4_ABglobal ... ... -Wl,-rpath -Wl,/data/gpfs/Users/l0538797/src/geosx/thirdPartyLibs/install-anag-gcc-release/lapack_suite/lib64 -lblas ... -llapack ... ...
/usr/bin/ld: cannot find -lblas
/usr/bin/ld: cannot find -llapack
collect2: error: ld returned 1 exit status
make[5]: *** [EXAMPLE/pddrive_ABglobal] Error 1
make[4]: *** [EXAMPLE/CMakeFiles/pddrive_ABglobal.dir/all] Error 2

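Workaround: In CMakeLists.txt, change the flags "-Wl,-rpath -Wl," to "-L". The rpath flags only set the run-time library search path, while "-L" adds the directory to the link-time search path that "-lblas" and "-llapack" need.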
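
As a rough sketch of what this workaround amounts to in CMake terms, the fragment below passes the library directory with both -L and an rpath entry. The variable name LAPACK_SUITE_LIB_DIR and the path are placeholders for illustration, not names taken from the thirdPartyLibs build.

```cmake
# Hypothetical variable: stands in for the lapack_suite install
# directory that appears in the link line above.
set(LAPACK_SUITE_LIB_DIR "/path/to/thirdPartyLibs/install/lapack_suite/lib64")

# -L adds a link-time search directory, so -lblas and -llapack can be found.
# -Wl,-rpath,<dir> only records a run-time search path in the executable.
set(CMAKE_EXE_LINKER_FLAGS
    "${CMAKE_EXE_LINKER_FLAGS} -L${LAPACK_SUITE_LIB_DIR} -Wl,-rpath,${LAPACK_SUITE_LIB_DIR}")
```

Keeping the rpath entry next to -L means the executables should still locate the shared libraries at run time without LD_LIBRARY_PATH.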

Issue 2: In building 'silo'

[ 46%] Performing configure step for 'silo'
... ...
hostinfo               =
/bin/universe          =
/usr/bin/arch -k       =
/bin/arch              = ppc64le
/usr/bin/oslevel       =
/usr/convex/getsysinfo =

UNAME_MACHINE = ppc64le
UNAME_RELEASE = 3.10.0-693.el7.ppc64le
UNAME_SYSTEM  = Linux
UNAME_VERSION = #1 SMP Thu Jul 6 19:59:44 EDT 2017
configure: error: cannot guess build type; you must specify one
make[2]: *** [silo/src/silo-stamp/silo-configure] Error 1
make[1]: *** [CMakeFiles/silo.dir/all] Error 2
make: *** [all] Error 2

Workaround: Add the following statement to CMakeLists.txt:
set(SILO_BUILD_TYPE "ibm")

Issue 3: In building 'mathpresso'

[ 70%] Performing build step for 'mathpresso'
... ...
[ 42%] Building CXX object CMakeFiles/mathpresso.dir/data/gpfs/Users/l0538797/src/geosx/thirdPartyLibs/build-anag-gcc-release/asmjit-master/src/asmjit/base/regalloc.cpp.o
In file included from /data/gpfs/Users/l0538797/src/geosx/thirdPartyLibs/build-anag-gcc-release/asmjit-master/src/asmjit/base/func.cpp:12:0:
/data/gpfs/Users/l0538797/src/geosx/thirdPartyLibs/build-anag-gcc-release/asmjit-master/src/asmjit/base/../base/func.h:185:3: error: #error "[asmjit] Couldn't determine the target's calling convention."
# error "[asmjit] Couldn't determine the target's calling convention."

... Lots of error messages ...

In file included from /data/gpfs/Users/l0538797/src/geosx/thirdPartyLibs/build-anag-gcc-release/asmjit-master/src/asmjit/base/func.cpp:12:0:
/data/gpfs/Users/l0538797/src/geosx/thirdPartyLibs/build-anag-gcc-release/asmjit-master/src/asmjit/base/../base/func.h:477:58: error: ‘kIdHost’ is not a member of ‘asmjit::CallConv’
  ASMJIT_INLINE FuncSignature0(uint32_t ccId = CallConv::kIdHost) noexcept {

... Lots of error messages  ...

/data/gpfs/Users/l0538797/src/geosx/thirdPartyLibs/build-anag-gcc-release/mathpresso/src/mathpresso/src/mathpresso/mpcompiler.cpp:116:27: error: ‘X86Mem’ does not name a type
MATHPRESSO_INLINE const X86Mem& getMem() const { return *static_cast<const X86Mem*>(&op); }

... Lots of error messages  ...

Workaround:

  1. In asmjit-master/src/asmjit/base/func.h, comment out the entire "#if defined(ASMJIT_DOCGEN) ... #endif" conditional structure, keeping only the code inside the "#elif ASMJIT_ARCH_X64" branch.
  2. In mathpresso/src/mathpresso/src/mathpresso/mpcompiler.cpp, add #include "asmjit/x86.h" at the beginning.
@rrsettgast
Member

  1. Take a look at the host configs for LC systems. If you have BLAS/LAPACK pre-installed, it would be good to specify the CMake variables:
BLAS_DIR
BLAS_LIBRARY_NAMES
LAPACK_DIR
LAPACK_LIBRARY_NAMES

If you need to build BLAS/LAPACK, set ENABLE_LAPACK_SUITE to ON.

  1. @markcmiller86 does this look familiar?

  2. don't build mathpresso. I don't think it works for ibm. @corbett5 confirm?
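
For illustration, a host-config fragment that sets the variables named above might look like the sketch below. The directory and library-name values are assumptions for a generic Linux system, not values taken from an actual GEOSX host config.

```cmake
# Option A: use a pre-installed BLAS/LAPACK.
# All paths and library names below are placeholder assumptions.
set(BLAS_DIR "/usr" CACHE PATH "Prefix of the pre-installed BLAS")
set(BLAS_LIBRARY_NAMES "blas" CACHE STRING "BLAS library name(s)")
set(LAPACK_DIR "/usr" CACHE PATH "Prefix of the pre-installed LAPACK")
set(LAPACK_LIBRARY_NAMES "lapack" CACHE STRING "LAPACK library name(s)")

# Option B: build BLAS/LAPACK as part of the third-party libraries instead.
# set(ENABLE_LAPACK_SUITE ON CACHE BOOL "Build the bundled LAPACK suite")
```

Only one of the two options should be active at a time.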

@corbett5
Contributor

corbett5 commented Jun 24, 2019

@corbett5
Contributor

Note the inclusion of the cxx-utilities host config:
https://github.com/GEOSX/cxx-utilities/blob/develop/host-configs/ray_blueos_3_ppc64le_ib-clang%40upstream.cmake

@markcmiller86

  1. @markcmiller86 does this look familiar?

I agree with @corbett5 ... you need to explicitly specify the build type with the --build argument, and maybe --host too. This is because Silo's configure is woefully out of date. Apologies.
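
One way this could be wired into a CMake-driven build is sketched below with ExternalProject_Add. The triplet value and the SILO_SOURCE_DIR variable are assumptions for illustration; the actual thirdPartyLibs CMakeLists.txt may pass these arguments differently.

```cmake
include(ExternalProject)

# Assumption: GNU build triplet for the ppc64le machine in this report.
set(SILO_BUILD_TRIPLET "powerpc64le-unknown-linux-gnu")

ExternalProject_Add(silo
  # Hypothetical variable pointing at an unpacked Silo source tree.
  SOURCE_DIR "${SILO_SOURCE_DIR}"
  # Pass --build explicitly so configure does not have to guess the build type.
  CONFIGURE_COMMAND <SOURCE_DIR>/configure
                    --prefix=<INSTALL_DIR>
                    --build=${SILO_BUILD_TRIPLET}
  BUILD_COMMAND make
  INSTALL_COMMAND make install
)
```

If configure still complains, adding --host=${SILO_BUILD_TRIPLET} to the same command would be the next thing to try.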

@MichaelSekachev

Thank you @corbett5, thank you @markcmiller86 .
I pulled the code and will report back with the results.

@sheltongeosx

  1. take a look at the host configs for lc systems. If you have pre-installed BLAS/LAPACK it would be good to specify the cmake variables:
BLAS_DIR
BLAS_LIBRARY_NAMES
LAPACK_DIR
LAPACK_LIBRARY_NAMES

If you need to build BLAS/LAPACK you need to set the ENABLE_LAPACK_SUITE to ON.

  1. @markcmiller86 does this look familiar?
  2. don't build mathpresso. I don't think it works for ibm. @corbett5 confirm?

Hi @rrsettgast --
I have some questions about the above item 2:

  1. If mathpresso is not built, what kind of features will be missing in GEOSX?
  2. My GEOSX is built with mathpresso turned off. When running the following test, it complains that GEOSX was not built with mathpresso. Any idea? Here are the errors:
    $ geosx -i /data/gpfs/Users/l0538797/src/geosx/repo/GEOSX/src/coreComponents/physicsSolvers/integratedTests/SimpleSolvers/10x10x10_LaplaceFEM.xml
    real64 is alias of double
    localIndex is alias of long
    globalIndex is alias of long long
    GEOS must be configured to use Python to use parameters, symbolic math, etc. in input files
    Adding Solver of type LaplaceFEM, named laplace
    Adding Mesh: InternalMesh, mesh1
    Adding Event: PeriodicEvent, solverApplications
    Adding Event: PeriodicEvent, outputs
    Adding Event: PeriodicEvent, restarts
    TableFunction: timeFunction
    SymbolicFunction: spaceFunction
    Adding Output: Silo, siloOutput
    Adding Output: Restart, sidreRestart
    Adding Geometric Object: Box, source
    Adding Geometric Object: Box, sink
    Adding Object ElementRegion named Region1 from ObjectManager::Catalog.
    ================
    [ERROR in line 80 of file /data/gpfs/Users/l0538797/src/geosx/repo/GEOSX/src/coreComponents/managers/Functions/SymbolicFunction.cpp]
    GEOSX was not built with mathpresso!
    ** StackTrace of 10 frames **
    Frame 1: axom::slic::logErrorMessage(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)
    Frame 2: geosx::SymbolicFunction::InitializeFunction()
    Frame 3: geosx::FunctionBase::PostProcessInput()
    Frame 4: geosx::dataRepository::ManagedGroup::PostProcessInputRecursive()
    Frame 5: geosx::dataRepository::ManagedGroup::PostProcessInputRecursive()
    Frame 6: geosx::dataRepository::ManagedGroup::PostProcessInputRecursive()
    Frame 7: geosx::ProblemManager::ProblemSetup()
    Frame 8: main
    Frame 9:
    Frame 10: __libc_start_main
    =====
    Rank 0
    Tue Jul 16 12:21:44 2019

@corbett5
Contributor

When building without mathpresso, unit tests and integrated tests that use mathpresso will fail at runtime with a similar message.

Mathpresso enables JIT compilation of symbolic functions defined as strings in input XML files.

@sheltongeosx

Hi Ben @corbett5

Thank you very much for your quick reply!
I have compiled thirdPartyLibs and GEOSX with mathpresso turned on. When running the test for 10x10x10_LaplaceFEM.xml, it indeed failed at JIT compilation of the expression:
" sqrt(pow(x,2)+pow(y,2)+pow(z,2))".

The location of error:
Source file: GEOSX/src/coreComponents/managers/Functions/SymbolicFunction.cpp
Function: void SymbolicFunction::InitializeFunction()
Failed at the call:
mathpresso::Error err = parserExpression.compile(parserContext, expression.c_str(), mathpresso::kNoOptions);

So, is there any workaround or fix for this mathpresso issue on IBM ppc64le?

Thanks
Shelton

@corbett5
Contributor

Nothing we can do about it unless we were to add PowerPC support to asmjit.
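
Until then, a build could guard against enabling mathpresso on non-x86 machines. The sketch below assumes a hypothetical option named ENABLE_MATHPRESSO; the option name is illustrative and not confirmed by this thread.

```cmake
# ENABLE_MATHPRESSO is a hypothetical option name used for illustration.
option(ENABLE_MATHPRESSO "Build mathpresso (requires asmjit, x86 only)" ON)

# CMAKE_SYSTEM_PROCESSOR is set once project() has run; it is "ppc64le"
# on the machine described in this issue.
if(ENABLE_MATHPRESSO AND NOT CMAKE_SYSTEM_PROCESSOR MATCHES "^(x86_64|AMD64|i[3-6]86)$")
  message(WARNING "asmjit has no backend for ${CMAKE_SYSTEM_PROCESSOR}; disabling mathpresso")
  set(ENABLE_MATHPRESSO OFF CACHE BOOL "Build mathpresso (requires asmjit, x86 only)" FORCE)
endif()
```

Inputs that define symbolic functions would still fail at run time in such a build, as described earlier in the thread.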

@sheltongeosx

Hi Ben,
Compiling GEOSX without mathpresso gives me the following error (with gcc/7.3.0 on PowerPC):

....../GEOSX/src/coreComponents/managers/Functions/CompositeFunction.cpp: In member function ‘virtual geosx::real64 geosx::CompositeFunction::Evaluate(const real64*) const’:
....../GEOSX/src/coreComponents/managers/Functions/CompositeFunction.cpp:133:10: error: variable ‘functionResults’ set but not used [-Werror=unused-but-set-variable]
real64 functionResults[m_maxNumSubFunctions];

Currently I silence the error by manually editing the cpp file. Is there a way to set a flag in the config file to suppress this error? To me it is not really an error.

Thanks,
Shelton
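
One possible approach, sketched below, is to keep the warning but stop GCC from promoting it to an error using -Wno-error=unused-but-set-variable. Where this belongs in the GEOSX build system is an assumption; the fragment only shows the generic CMake mechanism.

```cmake
# Keep -Wunused-but-set-variable as a warning, but do not let -Werror
# turn it into a build failure. This must run before the affected
# targets are defined in the directory.
if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
  add_compile_options(-Wno-error=unused-but-set-variable)
endif()
```

In a cache-initialization file such as a host config, appending the same flag to CMAKE_CXX_FLAGS might be an alternative, provided the build honors that variable.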
