Cannot load model #51

Open
chuanxinyu opened this issue Aug 9, 2024 · 2 comments

Comments

@chuanxinyu

Hi, I'm having the following problem when running with singularity:

INFO: Converting SIF file to temporary sandbox...
WARNING: underlay of /etc/localtime required more than 50 (77) bind mounts
WARNING: underlay of /usr/bin/nvidia-smi required more than 50 (308) bind mounts
thread '' panicked at /herro/src/inference.rs:197:70:
Cannot load model.: Torch("open file failed because of errno 2 on fopen: , file path: ../herro_models/model_R9_v0.1/model_R9_v0.1.pt\nException raised from RAIIFile at ../caffe2/serialize/file_adapter.cc:27 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator >) + 0x6b (0x7f40e285a6bb in /libs/libtorch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0xbf (0x7f40e28555ef in /libs/libtorch/lib/libc10.so)\nframe #2: caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x134 (0x7f40e6552f84 in /libs/libtorch/lib/libtorch_cpu.so)\nframe #3: caffe2::serialize::FileAdapter::FileAdapter(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x41 (0x7f40e65535f1 in /libs/libtorch/lib/libtorch_cpu.so)\nframe #4: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x7f (0x7f40e6550a6f in /libs/libtorch/lib/libtorch_cpu.so)\nframe #5: torch::jit::import_ir_module(std::shared_ptrtorch::jit::CompilationUnit, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, c10::optionalc10::Device, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > >&, bool, bool) + 0x28d (0x7f40e770f5ad in /libs/libtorch/lib/libtorch_cpu.so)\nframe #6: 
torch::jit::import_ir_module(std::shared_ptrtorch::jit::CompilationUnit, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, c10::optionalc10::Device, bool) + 0x92 (0x7f40e770fa42 in /libs/libtorch/lib/libtorch_cpu.so)\nframe #7: torch::jit::load(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, c10::optionalc10::Device, bool) + 0xd1 (0x7f40e770fb71 in /libs/libtorch/lib/libtorch_cpu.so)\nframe #8: + 0x1de74e (0x55620c9d674e in herro)\nframe #9: + 0xf3e9c (0x55620c8ebe9c in herro)\nframe #10: + 0xd2758 (0x55620c8ca758 in herro)\nframe #11: + 0xd9d0c (0x55620c8d1d0c in herro)\nframe #12: + 0xf4a96 (0x55620c8eca96 in herro)\nframe #13: + 0x145375 (0x55620c93d375 in herro)\nframe #14: + 0x94ac3 (0x7f40e266bac3 in /lib/x86_64-linux-gnu/libc.so.6)\nframe #15: clone + 0x44 (0x7f40e26fca04 in /lib/x86_64-linux-gnu/libc.so.6)\n")
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Aborted (core dumped)
INFO: Cleaning up image...

HD_DIR='pwd'
singularity run --nv --bind $HD_DIR:$HD_DIR herro.sif herro inference --read-alns ./040919_Agr_pod -m ../herro_models/model_R9_v0.1/model_R9_v0.1.pt -t 20 -b 6 ./040919_Agr_pod.prefix.fastq.gz ./040919_Agr_pod_corrected.fasta

@colindaven

It can't find your model. Try the full absolute path to the model instead of the relative one. Did the model download correctly, and is its size greater than 0?

For example, with an absolute path (adjust to wherever the model actually is):

-m /mnt/data/herro_models/model_R9_v0.1/model_R9_v0.1.pt
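Both checks suggested above (the file exists, and its size is greater than 0) can be scripted before launching the container. The following is a minimal sketch, not part of herro: `check_model` is a hypothetical helper name, and it assumes `realpath` from coreutils is available on the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of herro): resolve a model path to an
# absolute path and confirm it points at a non-empty regular file.
check_model() {
    local path="$1"
    local abs
    # realpath resolves relative components against the current directory;
    # it fails when a parent directory does not exist.
    abs="$(realpath "$path" 2>/dev/null)" || abs=""
    if [ -z "$abs" ] || [ ! -f "$abs" ]; then
        echo "missing: $path" >&2
        return 1
    fi
    if [ ! -s "$abs" ]; then
        echo "empty: $abs" >&2
        return 2
    fi
    # Print the absolute path so it can be passed to -m.
    echo "$abs"
}
```

A possible use is `MODEL="$(check_model ../herro_models/model_R9_v0.1/model_R9_v0.1.pt)"` followed by `-m "$MODEL"`. The resolved path must also be visible inside the container, so its parent directory may need its own `--bind` entry.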

@chuanxinyu
Author

I gave the absolute path of the model as you said. It appears to run, but there is no output.

INFO: Converting SIF file to temporary sandbox...
WARNING: underlay of /etc/localtime required more than 50 (77) bind mounts
WARNING: underlay of /usr/bin/nvidia-smi required more than 50 (308) bind mounts
[00:00:08] Processed 0 reads.
INFO: Cleaning up image...

My previous steps were:
scripts/preprocess.sh 040919_Agr_pod.fastq 040919_Agr_pod.prefix 16 1
seqkit seq -ni 040919_Agr_pod.fastq > 040919_Agr_pod.ids
scripts/create_batched_alignments.sh 040919_Agr_pod.prefix.fastq.gz 040919_Agr_pod.ids 16 040919_Agr_pod
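
One way to narrow down a "Processed 0 reads" result is to confirm that the outputs of the steps above are present and non-empty before running inference. This is only a diagnostic sketch: `check_inputs` is a hypothetical helper, and it assumes (from the commands in this thread) that `040919_Agr_pod` is the alignment directory passed to `--read-alns` and that `040919_Agr_pod.prefix.fastq.gz` is the preprocessed read file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of herro): report whether each argument is
# a non-empty file or a directory with at least one entry.
check_inputs() {
    local status=0 p
    for p in "$@"; do
        if [ -d "$p" ]; then
            # A directory counts as usable only if it contains something.
            if [ -n "$(ls -A "$p" 2>/dev/null)" ]; then
                echo "ok dir: $p"
            else
                echo "empty dir: $p"
                status=1
            fi
        elif [ -s "$p" ]; then
            echo "ok file: $p"
        else
            echo "missing or empty: $p"
            status=1
        fi
    done
    return "$status"
}
```

For example: `check_inputs ./040919_Agr_pod ./040919_Agr_pod.prefix.fastq.gz`. A gzip file with no reads still has a non-zero size, so an "ok file" result does not prove that the FASTQ contains reads; it only rules out a missing or zero-byte file.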
