Cannot load model #51

chuanxinyu · 2024-08-09T14:00:22Z

Hi, I'm having the following problem when running with singularity:

INFO: Converting SIF file to temporary sandbox...
WARNING: underlay of /etc/localtime required more than 50 (77) bind mounts
WARNING: underlay of /usr/bin/nvidia-smi required more than 50 (308) bind mounts
thread '<unnamed>' panicked at /herro/src/inference.rs:197:70:
Cannot load model.: Torch("open file failed because of errno 2 on fopen: , file path: ../herro_models/model_R9_v0.1/model_R9_v0.1.pt\nException raised from RAIIFile at ../caffe2/serialize/file_adapter.cc:27 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator >) + 0x6b (0x7f40e285a6bb in /libs/libtorch/lib/libc10.so)\nframe #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0xbf (0x7f40e28555ef in /libs/libtorch/lib/libc10.so)\nframe #2: caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x134 (0x7f40e6552f84 in /libs/libtorch/lib/libtorch_cpu.so)\nframe #3: caffe2::serialize::FileAdapter::FileAdapter(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x41 (0x7f40e65535f1 in /libs/libtorch/lib/libtorch_cpu.so)\nframe #4: caffe2::serialize::PyTorchStreamReader::PyTorchStreamReader(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x7f (0x7f40e6550a6f in /libs/libtorch/lib/libtorch_cpu.so)\nframe #5: torch::jit::import_ir_module(std::shared_ptrtorch::jit::CompilationUnit, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, c10::optionalc10::Device, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > >&, bool, bool) + 0x28d (0x7f40e770f5ad in /libs/libtorch/lib/libtorch_cpu.so)\nframe #6: 
torch::jit::import_ir_module(std::shared_ptrtorch::jit::CompilationUnit, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, c10::optionalc10::Device, bool) + 0x92 (0x7f40e770fa42 in /libs/libtorch/lib/libtorch_cpu.so)\nframe #7: torch::jit::load(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, c10::optionalc10::Device, bool) + 0xd1 (0x7f40e770fb71 in /libs/libtorch/lib/libtorch_cpu.so)\nframe #8: + 0x1de74e (0x55620c9d674e in herro)\nframe #9: + 0xf3e9c (0x55620c8ebe9c in herro)\nframe #10: + 0xd2758 (0x55620c8ca758 in herro)\nframe #11: + 0xd9d0c (0x55620c8d1d0c in herro)\nframe #12: + 0xf4a96 (0x55620c8eca96 in herro)\nframe #13: + 0x145375 (0x55620c93d375 in herro)\nframe #14: + 0x94ac3 (0x7f40e266bac3 in /lib/x86_64-linux-gnu/libc.so.6)\nframe #15: clone + 0x44 (0x7f40e26fca04 in /lib/x86_64-linux-gnu/libc.so.6)\n")
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Aborted (core dumped)
INFO: Cleaning up image...

HD_DIR=$(pwd)
singularity run --nv --bind $HD_DIR:$HD_DIR herro.sif herro inference --read-alns ./040919_Agr_pod -m ../herro_models/model_R9_v0.1/model_R9_v0.1.pt -t 20 -b 6 ./040919_Agr_pod.prefix.fastq.gz ./040919_Agr_pod_corrected.fasta


colindaven · 2024-08-12T07:02:58Z

It can't find your model. Try the full absolute path to your model rather than the relative one. Did the model download correctly, and is its size greater than 0?

For example, an absolute path (adjust to wherever your model actually is):

-m /mnt/data/herro_models/model_R9_v0.1/model_R9_v0.1.pt
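Both checks suggested above (resolve the absolute path, confirm the file is non-empty) can be done before launching the container. This is a minimal sketch, not part of herro: `check_model` is a hypothetical helper name, and it assumes a `realpath` command is available on the host.

```shell
# Hypothetical helper (not part of herro): print the absolute path of a model
# file, or fail if the path cannot be resolved or the file is missing or empty.
check_model() {
    local model_path
    # Resolve the (possibly relative) path against the current directory.
    model_path=$(realpath "$1" 2>/dev/null) || {
        echo "cannot resolve path: $1" >&2
        return 1
    }
    # -s is true only if the file exists and has a size greater than zero.
    if [ ! -s "$model_path" ]; then
        echo "model file missing or empty: $model_path" >&2
        return 1
    fi
    echo "$model_path"
}
```

For example, `MODEL=$(check_model ../herro_models/model_R9_v0.1/model_R9_v0.1.pt)` would store the absolute path to pass to `-m "$MODEL"`. The model in the original command sits in a sibling of the working directory, so its directory may also need its own `--bind` entry to be visible inside the container.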

chuanxinyu · 2024-08-12T13:16:35Z

I gave the absolute path of the model as you said. It appears to run, but there is no output.

INFO: Converting SIF file to temporary sandbox...
WARNING: underlay of /etc/localtime required more than 50 (77) bind mounts
WARNING: underlay of /usr/bin/nvidia-smi required more than 50 (308) bind mounts
[00:00:08] Processed 0 reads.
INFO: Cleaning up image...

My previous steps were:
scripts/preprocess.sh 040919_Agr_pod.fastq 040919_Agr_pod.prefix 16 1
seqkit seq -ni 040919_Agr_pod.fastq > 040919_Agr_pod.ids
scripts/create_batched_alignments.sh 040919_Agr_pod.prefix.fastq.gz 040919_Agr_pod.ids 16 040919_Agr_pod
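When the log reports "Processed 0 reads", one thing worth ruling out is that the inputs produced by the steps above are empty. This is a minimal sketch under that assumption only, and it does not diagnose herro itself. `count_fastq_reads` is a hypothetical helper name, and it assumes the standard four-lines-per-record FASTQ layout.

```shell
# Hypothetical helper (not part of herro): count the records in a gzipped
# FASTQ file, assuming exactly four lines per record.
count_fastq_reads() {
    local lines
    # Decompress to stdout and count lines without writing a temporary file.
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}
```

Running `count_fastq_reads ./040919_Agr_pod.prefix.fastq.gz` shows whether the preprocessed reads file contains any records, and `ls -lh ./040919_Agr_pod` shows whether the directory passed to `--read-alns` contains any files.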
