python run_benchmark.py -b cholesky2 -f dace_cpu does not work on Macbook #25

Open
pratyai opened this issue Oct 1, 2024 · 3 comments

Comments

@pratyai

pratyai commented Oct 1, 2024

I have been trying to run the benchmarks on my 2019 (Intel) MacBook, and this particular benchmark + framework combination seems broken on that machine. I get the following error (truncated):

[ 25%] Linking CXX shared library libfusion.dylib
Undefined symbols for architecture x86_64:
  "_LAPACKE_dpotrf", referenced from:
      __program_fusion_internal(fusion_state_t*, double*, long long) in fusion.cpp.o
ld: symbol(s) not found for architecture x86_64
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [libfusion.dylib] Error 1
make[1]: *** [CMakeFiles/fusion.dir/all] Error 2

On the other hand, on a Linux machine, everything seems to work fine. I have not tried the newer (M1) Macs, so perhaps the problem only affects the older Intel MacBooks.


Side note: the dependency installation does not work exactly as stated in README.md for me. I now have the following changes to pin versions and explicitly include additional dependencies:

diff --git a/requirements.txt b/requirements.txt
index 5c2e18f..83cbdf9 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,9 @@
-matplotlib
-numpy
-pandas
-pygount
+matplotlib~=3.9.2
+numpy~=1.26.4
+pandas~=2.2.2
+pygount~=1.8.0
 scipy
+dace~=0.16.1
+numba~=0.60.0
+sympy~=1.13.2
+npbench~=0.1
@tbennun
Collaborator

tbennun commented Oct 2, 2024

This looks like it is related to macOS and DaCe. Have you tried the macOS instructions in https://spcldace.readthedocs.io/en/latest/setup/installation.html#common-issues-with-the-dace-python-module ?

@pratyai
Author

pratyai commented Oct 2, 2024

I am using ~/.dace.conf to 1) use the OpenBLAS library (installed through Homebrew), and 2) select the g++ or clang++ compiler (both installed through Homebrew). I am not sure whether anything else on the common-issues page is significant for macOS usage.

I show the errors for both compiler options below. With GCC, all the benchmark programs break the same way, whereas with Clang, only a small number of benchmarks like cholesky2 break.

Clang++

~ $ clang++ --version
clang version 18.1.8
Target: x86_64-apple-darwin24.0.0
Thread model: posix
InstalledDir: /opt/local/libexec/llvm-18/bin
~ $ cat ~/.dace.conf
compiler:
  cpu:
    executable: clang++
    args: -I/opt/local/include/openblas -L/opt/local/lib
(brandnewvenv) ~/g/npbench (main|✚2) $ rm -rf .dacecache
(brandnewvenv) ~/g/npbench (main|✚2) $ python run_benchmark.py -b cholesky2 -f dace_cpu 2>&1
***** Testing DaCe CPU with cholesky2 on the S dataset *****
NumPy - default - validation: 25ms
Failed to compile DaCe cpu fusion implementation.
Compiler failure:
[ 25%] Building CXX object CMakeFiles/fusion.dir/Users/pmz/gitspace/npbench/.dacecache/fusion/src/cpu/fusion.cpp.o
clang++: warning: argument unused during compilation: '-L/opt/local/lib' [-Wunused-command-line-argument]
[ 50%] Linking CXX shared library libfusion.dylib
Undefined symbols for architecture x86_64:
  "_LAPACKE_dpotrf", referenced from:
      __program_fusion_internal(fusion_state_t*, double*, long long) in fusion.cpp.o
ld: symbol(s) not found for architecture x86_64
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [libfusion.dylib] Error 1
make[1]: *** [CMakeFiles/fusion.dir/all] Error 2
make: *** [all] Error 2

Traceback (most recent call last):
  File "/Users/pmz/gitspace/npbench/brandnewvenv/lib/python3.12/site-packages/dace/codegen/compiler.py", line 232, in configure_and_compile
    _run_liveoutput("cmake --build . --config %s" % (Config.get('compiler', 'build_type')),
  File "/Users/pmz/gitspace/npbench/brandnewvenv/lib/python3.12/site-packages/dace/codegen/compiler.py", line 416, in _run_liveoutput
    raise subprocess.CalledProcessError(process.returncode, command, output.getvalue())
subprocess.CalledProcessError: Command 'cmake --build . --config RelWithDebInfo' returned non-zero exit status 2.

During handling of the above exception, another exception occurred:

< ...several more similar errors skipped... >

G++

~ $ g++-14 --version
g++-14 (Homebrew GCC 14.2.0) 14.2.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

~ $ cat ~/.dace.conf
compiler:
  cpu:
    executable: g++-14
    args: -I/opt/local/include/openblas -L/opt/local/lib
(brandnewvenv) ~/g/npbench (main|✚2) $ rm -rf .dacecache
(brandnewvenv) ~/g/npbench (main|✚2) [0|1] $ python run_benchmark.py -b cholesky2 -f dace_cpu 2>&1 | head -n 20
***** Testing DaCe CPU with cholesky2 on the S dataset *****
NumPy - default - validation: 25ms
Failed to compile DaCe cpu fusion implementation.
Compiler failure:
[ 25%] Building CXX object CMakeFiles/fusion.dir/Users/pmz/gitspace/npbench/.dacecache/fusion/src/cpu/fusion.cpp.o
In file included from /usr/local/Cellar/gcc/14.2.0/include/c++/14/cstdio:42,
                 from /Users/pmz/gitspace/npbench/brandnewvenv/lib/python3.12/site-packages/dace/codegen/../runtime/include/dace/dace.h:6,
                 from /Users/pmz/gitspace/npbench/.dacecache/fusion/src/cpu/fusion.cpp:2:
/usr/local/Cellar/gcc/14.2.0/lib/gcc/current/gcc/x86_64-apple-darwin23/14/include-fixed/stdio.h:83:8: error: 'FILE' does not name a type
   83 | extern FILE *__stdinp;
      |        ^~~~
/usr/local/Cellar/gcc/14.2.0/lib/gcc/current/gcc/x86_64-apple-darwin23/14/include-fixed/stdio.h:81:1: note: 'FILE' is defined in header '<cstdio>'; this is probably fixable by adding '#include <cstdio>'
   80 | #include <sys/_types/_seek_set.h>
  +++ |+#include <cstdio>
   81 |

< ...many, many more similar errors skipped... >

@alexnick83
Contributor

Keep in mind that not all DaCe configurations necessarily work for all benchmarks, although there is an effort to address such issues. The Clang error seems related to a problem linking to the BLAS/LAPACK library. Specifically:

Undefined symbols for architecture x86_64:
"_LAPACKE_dpotrf", referenced from:

POTRF is the LAPACK routine for Cholesky decomposition, which the cholesky2 benchmark uses. I do not own a Mac, so I am unable to test it myself. I see that you are using OpenBLAS. Maybe you also need to add -lopenblas to the CPU args?
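
One way to test this hypothesis outside of DaCe is to check whether the installed OpenBLAS build exports the missing symbol at all. The sketch below is only an illustration. The helper name `exports_symbol` is hypothetical, and the library path is an assumption guessed from the `-L/opt/local/lib` flag shown earlier in the thread. Note that `ctypes` looks up the C name without the leading underscore that the macOS linker prints.

```python
import ctypes
from typing import Optional


def exports_symbol(library: Optional[str], symbol: str) -> Optional[bool]:
    """Report whether a shared library exports `symbol`.

    Returns True or False when the library loads, and None when it
    cannot be loaded (wrong path, wrong architecture, and so on).
    Passing None as `library` inspects the running process itself.
    """
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return None
    # Attribute access on a CDLL performs the symbol lookup and raises
    # AttributeError when the symbol is absent, so hasattr is sufficient.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    # Assumed location; adjust to wherever libopenblas actually lives.
    print(exports_symbol("/opt/local/lib/libopenblas.dylib", "LAPACKE_dpotrf"))
```

If this prints True, the library contains the routine and a missing `-lopenblas` at link time becomes the more likely cause. False would point to an OpenBLAS build without LAPACKE, and None means the library could not be loaded from that path.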

Unfortunately, I do not have any insight into the g++ error. There looks to be some issue with the stdio.h file, but this is unrelated to DaCe.
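
To confirm that the g++ failure is independent of DaCe, one could compile a minimal file that includes `<cstdio>` with the same compiler. The following is a sketch and not part of npbench or DaCe. The helper name `compiles_cstdio` is hypothetical, and `g++-14` is the executable name taken from the configuration posted above.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional

# Smallest program that pulls in the header the g++ error complains about.
MINIMAL_SOURCE = '#include <cstdio>\nint main() { std::puts("ok"); return 0; }\n'


def compiles_cstdio(compiler: str) -> Optional[bool]:
    """Check whether `compiler` accepts a file that includes <cstdio>.

    Returns None when the compiler is not on PATH, otherwise True or
    False depending on the compiler's exit status.
    """
    if shutil.which(compiler) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "check.cpp"
        source.write_text(MINIMAL_SOURCE)
        # -fsyntax-only parses the file without producing an object file.
        result = subprocess.run(
            [compiler, "-fsyntax-only", str(source)],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            print(result.stderr)
        return result.returncode == 0


if __name__ == "__main__":
    print(compiles_cstdio("g++-14"))
```

If this returns False and prints the same `'FILE' does not name a type` error, the problem would lie in the compiler and header setup rather than in the code DaCe generates.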
