Describe the bug
There seems to be a stability issue with the `eigen` function on `ComplexF32` matrices. Occasionally the GPU `eigen` call returns `NaN`, which is inconsistent with the CPU decomposition.
To reproduce
The Minimal Working Example (MWE) for this bug:
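using CUDA, HDF5, LinearAlgebra
fid = h5open("broken_eigen.h5", "r")
m = read(fid, "matrix")     # load the problematic matrix
m = Hermitian(m)            # CPU copy
cm = Hermitian(cu(m))       # GPU copy
D, V = eigen(m)             # CPU decomposition (finite values)
cuD, cuV = eigen(cm)        # GPU decomposition (occasionally returns NaN)
close(fid)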
broken_eigen.h5.txt
** Please note that this is a .h5 file. I saved it with a .txt extension because I could not attach a .h5 file here; just remove the .txt extension before using it.
Manifest.toml
Status `~/.julia/environments/v1.9/Project.toml`
[052768ef] CUDA v5.1.1
[34da2185] Compat v4.10.0
[f67ccb44] HDF5 v0.17.1
[33e6dc65] MKL v0.6.1
Expected behavior
I would expect `cuD` and `cuV` to be the eigenvalues and eigenvectors of the CuMatrix `cm`, which has values in the range [-4.6161222f-8, 0.8686561f0], with a minimum absolute value of 1.3966348f-25.
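One way to make this expectation checkable is to look for `NaN` in the result and compute the residual of the decomposition. The following is a minimal CPU-only sketch, not part of CUDA.jl: `check_eigen` is a hypothetical helper name, and the small matrix `B` is stand-in data, not the matrix from broken_eigen.h5.

```julia
using LinearAlgebra

# Hypothetical helper (not a CUDA.jl or LinearAlgebra function):
# report whether (D, V) contains NaN and how well it diagonalizes A.
function check_eigen(A::AbstractMatrix, D::AbstractVector, V::AbstractMatrix)
    has_nan = any(isnan, D) || any(isnan, V)
    # Relative residual ||A*V - V*Diagonal(D)|| / ||A||.
    # This is itself NaN when D or V contains NaN.
    residual = norm(A * V - V * Diagonal(D)) / norm(A)
    return (has_nan = has_nan, residual = residual)
end

# Stand-in data (an assumption): a small Hermitian ComplexF32 matrix.
B = ComplexF32[1 2im 0; 0 3 1; 1 0 2]
A = Hermitian(B + B')

D, V = eigen(A)              # CPU decomposition
report = check_eigen(A, D, V)
println(report)
```

With the matrix from the MWE, the same check could be applied to the GPU result after copying it back to the host, for example `check_eigen(Matrix(m), Array(cuD), Array(cuV))`, and compared against `check_eigen(Matrix(m), D, V)`.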
Version info
Details on Julia:
Julia Version 1.9.4
Commit 8e5136fa297 (2023-11-14 08:46 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 32 × Intel(R) Xeon(R) Gold 6244 CPU @ 3.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, cascadelake)
Threads: 1 on 32 virtual cores
Details on CUDA:
CUDA runtime 12.3, artifact installation
CUDA driver 12.3
NVIDIA driver 535.113.1, originally for CUDA 12.2
CUDA libraries:
- CUBLAS: 12.3.2
- CURAND: 10.3.4
- CUFFT: 11.0.11
- CUSOLVER: 11.5.3
- CUSPARSE: 12.1.3
- CUPTI: 21.0.0
- NVML: 12.0.0+535.113.1
Julia packages:
- CUDA: 5.1.0
- CUDA_Driver_jll: 0.7.0+0
- CUDA_Runtime_jll: 0.10.0+1
Toolchain:
- Julia: 1.9.4
- LLVM: 14.0.6
1 device:
0: NVIDIA RTX A6000 (sm_86, 45.964 GiB / 47.988 GiB available)
I can reproduce this, but I'm not familiar with the eigen/heevd code, so I'm pinging a couple of people who were involved with it and may be able to say something useful: @albertomercurio @GVigne. It's possible that this is a bug in NVIDIA's libraries, but I want to make sure we're not doing anything wrong before filing an issue.