Temporary file, output sequence, indexerror #30

Closed
mherold1 opened this issue Jan 7, 2020 · 2 comments
mherold1 commented Jan 7, 2020

Hi Sam,

Three small issues that I encountered in testing:

  1. When I tried to run gretel on a full genome (1.6 Mbp, 60k SNPs), I noticed that an intermediate file is written to /tmp. It filled up my entire hard drive, leading to the OSError below:
[...]
  File "/home/epi_mher/miniconda2/envs/py3/lib/python3.5/multiprocessing/heap.py", line 231, in malloc
    (arena, start, stop) = self._malloc(size)
  File "/home/epi_mher/miniconda2/envs/py3/lib/python3.5/multiprocessing/heap.py", line 129, in _malloc
    arena = Arena(length)
  File "/home/epi_mher/miniconda2/envs/py3/lib/python3.5/multiprocessing/heap.py", line 81, in __init__
    assert f.tell() == size
OSError: [Errno 28] No space left on device

It took me a while to find the cause of this. Even after I pointed TMPDIR at a location with more space, I had to stop the run once the file grew beyond 150 GB.
I realize that gretel is not meant for this task, so I am not sure whether this needs to be addressed.

  2. I noticed that the out.fasta files do not take strand information into account, so the sequences for genes on the negative strand are not output correctly. This can easily be fixed afterwards, but it has to be taken into account.

  3. For some samples I got the following error that I could not explain:

Process Process-1:
Traceback (most recent call last):
  File "/home/epi_mher/miniconda2/envs/py3/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/epi_mher/miniconda2/envs/py3/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/epi_mher/miniconda2/envs/py3/lib/python3.5/site-packages/gretel/util.py", line 280, in bam_worker
    hansel.add_observation(snp_a, snp_b, i+rank+1, j+rank+1)
  File "/home/epi_mher/miniconda2/envs/py3/lib/python3.5/site-packages/hansel/hansel.py", line 209, in add_observation
    self[self.__symbol_num(symbol_from), self.__symbol_num(symbol_to), pos_from, pos_to] += value
IndexError: index 152 is out of bounds for axis 3 with size 152
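
For readers following the IndexError above: an axis of size 152 only accepts indices 0 to 151, so a position computed as `j + rank + 1` overflows as soon as `j + rank` reaches the last valid index. The sketch below is only an illustration of that arithmetic in plain Python, not a diagnosis of the actual bug. The helper names and the values of `j` and `rank` are hypothetical; only the axis size of 152 comes from the traceback.

```python
def checked_position(pos, n_positions):
    """Return pos if it is a valid 0-based index for an axis of length n_positions."""
    if not 0 <= pos < n_positions:
        raise IndexError(
            "position %d is outside the valid range 0..%d" % (pos, n_positions - 1)
        )
    return pos


def describe_offset(j, rank, n_positions):
    """Report whether the offset j + rank + 1 fits on an axis of length n_positions."""
    pos_to = j + rank + 1
    try:
        checked_position(pos_to, n_positions)
    except IndexError as err:
        return "out of bounds: %s" % err
    return "ok: %d" % pos_to


# Hypothetical values: the first pair gives j + rank + 1 == 152, the second 151.
print(describe_offset(j=100, rank=51, n_positions=152))
print(describe_offset(j=100, rank=50, n_positions=152))
```

A check of this kind, placed before the call that adds an observation, would at least report which pair of positions is at fault instead of failing inside the matrix indexing.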

@SamStudio8 (Owner)

@mherold1 Thanks for the report! I think I'll close this in favour of opening two new issues on your behalf. Your second suggestion is already duplicated by #26. As you've raised it however, I'll add the suggestion to my tracking bug #27.
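
Until strand handling is tracked elsewhere, the post-hoc fix mentioned in the report can be sketched as a reverse complement applied to sequences of genes on the negative strand. This is a minimal sketch, not part of Gretel: the function names are hypothetical, the strand value is assumed to come from the user's own annotation, and any character outside `ACGTN` (such as a gap) is left unchanged.

```python
# Translation table for complementing nucleotides; other characters pass through.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_sequence(seq, strand):
    """Return seq as-is for '+' and its reverse complement for '-'.

    The strand argument is an assumption: it has to be looked up in an
    annotation, since the FASTA output does not carry it.
    """
    if strand == "-":
        return reverse_complement(seq)
    if strand == "+":
        return seq
    raise ValueError("strand must be '+' or '-', got %r" % (strand,))


print(orient_sequence("ATGC", "-"))
```

Applying `orient_sequence` to each record after the run leaves positive-strand genes untouched and flips only the negative-strand ones.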

@SamStudio8 (Owner)

@mherold1 I've opened these for you. I was hoping you might be able to provide some more information in #32 to diagnose the IndexError. Regarding the temporary file issue, I'm not entirely sure why this is happening but it is not the intended use-case for Gretel so my investigation of that report will be low priority.
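
One possible explanation for the size of the temporary file, offered as a back-of-envelope estimate rather than a confirmed cause: the first traceback shows the allocation happening in `multiprocessing`'s shared-memory heap, which appears to back its arena with a file in the temporary directory, and the second traceback indexes a four-dimensional matrix by two symbols and two positions. If that matrix is dense, its size grows with the square of the number of SNPs. The symbol count of 6 and the 4-byte element size below are assumptions for illustration, not values taken from Gretel or Hansel.

```python
def dense_matrix_bytes(n_symbols, n_positions, itemsize):
    """Bytes needed for a dense array of shape
    (n_symbols, n_symbols, n_positions, n_positions)."""
    return n_symbols * n_symbols * n_positions * n_positions * itemsize


# Assumed: 6 symbols and 4-byte elements. The position counts come from the
# report (about 60k SNPs) and from the IndexError (axis of size 152).
for n_positions in (152, 60000):
    size = dense_matrix_bytes(n_symbols=6, n_positions=n_positions, itemsize=4)
    print("%6d positions: %.1f GB" % (n_positions, size / 1e9))
```

Under these assumptions a gene-sized region needs a few megabytes while 60k SNPs needs hundreds of gigabytes, which would be consistent with a file that was still growing past 150 GB.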
