FileNotFoundError in PDF generation #72

Closed
mwhamgenomics opened this issue Nov 18, 2024 · 7 comments · Fixed by #73

@mwhamgenomics

Upon running a PDF report in the GUI, the generation process fails with a stack trace:

2024-11-18 14:50:08,284 - nicegui - ERROR - [Errno 2] No such file or directory: '/data/robin2/robin2-res/NA12878_05_NB4_06_22Rv1_07/nanoDX_scores.csv'
Traceback (most recent call last):
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/nicegui/events.py", line 417, in wait_for_result
    await result
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/robin/brain_class.py", line 900, in download_report
    myfile = await run.io_bound(
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/nicegui/run.py", line 74, in io_bound
    return await _run(thread_pool, callback, *args, **kwargs)
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/nicegui/run.py", line 52, in _run
    return await loop.run_in_executor(executor, partial(callback, *args, **kwargs))
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/robin/reporting/report.py", line 154, in create_pdf
    df_store = pd.read_csv(file_path)
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 620, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1880, in _make_engine
    self.handles = get_handle(
  File "/data/miniforge3/envs/robin2-env/lib/python3.9/site-packages/pandas/io/common.py", line 873, in get_handle
    handle = open(
FileNotFoundError: [Errno 2] No such file or directory: '/data/robin2/robin2-res/NA12878_05_NB4_06_22Rv1_07/nanoDX_scores.csv'

Looking at the results folder, we find the files NanoDX_scores.csv and PanNanoDX_scores.csv. When the PDF generation process runs in src/reporting/report.py:

  • at line 142, name will be 'NanoDX' and df_name will be 'nanoDX_scores.csv'
  • at line 147, files_in_directory will be ['NanoDX_scores.csv', 'PanNanoDX_scores.csv', ...]
  • line 150 casts both of the above to lower case, so the 'in' check passes and the 'if' block runs
  • but at line 151, file_path will end up as 'path/to/output/nanoDX_scores.csv', which still doesn't exist on a case-sensitive filesystem, causing pd.read_csv at line 154 to raise FileNotFoundError
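
The steps above can be reproduced in isolation. This is a minimal sketch, not the code in report.py: the function names resolve_buggy and resolve_case_insensitive are hypothetical, and the logic only assumes what the bullets describe (a lower-cased membership test followed by a path built from the expected name). The second function shows one possible way to return the name as it actually appears on disk.

```python
import os
import tempfile


def resolve_buggy(output_dir, df_name):
    """Mimic the described logic: lower-cased check, path built from df_name."""
    files_in_directory = os.listdir(output_dir)
    if df_name.lower() in [f.lower() for f in files_in_directory]:
        # The check passed, but df_name may differ in case from the real file.
        return os.path.join(output_dir, df_name)
    return None


def resolve_case_insensitive(output_dir, df_name):
    """Return the path using the file name exactly as it is stored on disk."""
    for actual_name in os.listdir(output_dir):
        if actual_name.lower() == df_name.lower():
            return os.path.join(output_dir, actual_name)
    return None


with tempfile.TemporaryDirectory() as output_dir:
    # Create a results file with the capitalisation found in the results folder.
    with open(os.path.join(output_dir, "NanoDX_scores.csv"), "w") as handle:
        handle.write("col1,col2\n1,2\n")

    buggy_path = resolve_buggy(output_dir, "nanoDX_scores.csv")
    fixed_path = resolve_case_insensitive(output_dir, "nanoDX_scores.csv")

    buggy_name = os.path.basename(buggy_path)
    fixed_name = os.path.basename(fixed_path)
    fixed_exists = os.path.exists(fixed_path)
```

Here buggy_name keeps the requested spelling, while fixed_name carries the on-disk spelling, so opening fixed_path does not depend on how the filesystem treats case.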

If we try fixing lines 142 and 143 to match the file names present, the report runs successfully:

("NanoDX", "NanoDX_scores.csv"),
("PanNanoDX", "PanNanoDX_scores.csv"),

I'm not sure how this didn't get picked up when I tested the fix for #70 - will try another test run from scratch.

@mattloose
Contributor

Hmm - OK - I will test this again here.

Could you make sure that, before you run it again, you manually delete the folder /data/robin2/robin2-res/NA12878_05_NB4_06_22Rv1_07?

Thanks

@mwhamgenomics
Author

I can confirm that we did remove the output folder when we re-tried running Robin and the report today. An update on my local tests, though - those were run on macOS, where the filesystem doesn't seem to care about upper/lower case:

% echo col1,col2 > test.csv
% echo 1,2 >> test.csv
% cat test.csv
% python
>>> import pandas
>>> pandas.read_csv('test.csv')
   col1  col2
0     1     2
>>> pandas.read_csv('TEST.CSV')
   col1  col2
0     1     2

Our tests earlier today were done on a PromethION, which runs Linux - so I think that explains why my local tests ran successfully before.
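
To check which behaviour a given machine has, one option is to create a file and look it up under a different case. This is a minimal sketch; the helper name is_case_sensitive is hypothetical, and the result only describes the filesystem that holds the directory passed in.

```python
import os
import tempfile


def is_case_sensitive(directory):
    """Create a lower-case file and test whether its upper-case name resolves."""
    # The fixed prefix guarantees the name contains lower-case letters.
    with tempfile.NamedTemporaryFile(prefix="casetest_", dir=directory) as handle:
        file_name = os.path.basename(handle.name)
        upper_path = os.path.join(directory, file_name.upper())
        # On a case-insensitive filesystem the upper-case path finds the same file.
        return not os.path.exists(upper_path)


with tempfile.TemporaryDirectory() as scratch_dir:
    result = is_case_sensitive(scratch_dir)
```

Passing the results folder itself rather than a scratch directory would test the filesystem the report actually reads from.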

@mattloose
Contributor

OK.

I'll have a look at this.

@mattloose
Contributor

Though the new code was supposed to be completely case insensitive!

@mattloose
Contributor

Please could you update to the latest version and test to see if the report now works.

@mattloose
Contributor

Please re-open this issue if it isn't fixed - sorry it's taken a while.

@mwhamgenomics
Author

Just realised I never replied here - I can confirm that all is well. We've been running this on Linux with a case-sensitive filesystem and have had no problems with this since this PR was merged. Thanks!
