cellScan error #96

Open
tkcaccia opened this issue Dec 16, 2024 · 9 comments

@tkcaccia

I ran the ancestry pipeline with no problems.
I have an issue with the somatic mutation pipeline.

I get an error in the cellScan module:

(base) user@user-System-Product-Name:~$ python  ${path}/src/Monopogen.py  somatic \
    -i  /home/user/Documents/Data/single-cell/20240910-B164/output/G0157 \
    -a   ${path}/apps  -r  /home/user/Documents/Data/BeeNetOutputs/region.lst  \
    -l  /home/user/Documents/Data/single-cell/20240910-B164/output/G0157.csv   -s featureInfo     \
    -g   /home/user/Documents/Data/GRCh38UCSF/hg38.fa 
   
[2024-12-16 18:41:05,869] INFO     Monopogen.py Get feature information from sequencing data...
[2024-12-16 18:43:46,445] INFO     Monopogen.py Success! See instructions above.


(base) user@user-System-Product-Name:~$ python  ${path}/src/Monopogen.py  somatic  \
    -a   ${path}/apps \
     -r  /home/user/Documents/Data/BeeNetOutputs/region.lst \
    -i  /home/user/Documents/Data/single-cell/20240910-B164/output/G0157 \
     -l  /home/user/Documents/Data/single-cell/20240910-B164/output/G0157.csv    -s cellScan     \
    -g   /home/user/Documents/Data/GRCh38UCSF/hg38.fa 
[2024-12-16 18:48:55,418] INFO     Monopogen.py Collect single cell level information from sequencing data...
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
0:0:0:0:0:0
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/user/miniconda3/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "/home/user/Monopogen/src/somatic.py", line 323, in bam2mat
    mat = pd.read_csv(mat_infile, sep="\t", header=None)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 620, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
    self._engine = self._make_engine(f, self.engine)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1898, in _make_engine
    return mapping[engine](f, **self.options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
    self._reader = parsers.TextReader(src, **kwds)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "parsers.pyx", line 581, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/Monopogen/src/Monopogen.py", line 340, in <module>
    main()
  File "/home/user/Monopogen/src/Monopogen.py", line 333, in main
    args.func(args)
  File "/home/user/Monopogen/src/Monopogen.py", line 172, in somatic
    result = pool.map(bam2mat, joblst)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/multiprocessing/pool.py", line 774, in get
    raise self._value
pandas.errors.EmptyDataError: No columns to parse from file

@jinzhuangdou
Collaborator

Hello @tkcaccia,

Could you share the output from the somatic folder in G0157? Also, is the issue still present if you run only a single chromosome, such as chr1?

@tkcaccia
Author

The contents of the somatic folder are listed below. I will now try running chr1 only...

total 1112440
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr10.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 18:58 chr10.cell_snv.mat.gz
-rw-rw-r-- 1 user user  2342854 Dec 16 18:48 chr10.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 38162730 Dec 16 18:41 chr10.gl.vcf.DP4
-rw-rw-r-- 1 user user  7594094 Dec 16 18:41 chr10.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  2278178 Dec 16 18:41 chr10.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1419442 Dec 16 18:41 chr10.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr11.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 19:04 chr11.cell_snv.mat.gz
-rw-rw-r-- 1 user user  3103248 Dec 16 18:48 chr11.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 43030589 Dec 16 18:41 chr11.gl.vcf.DP4
-rw-rw-r-- 1 user user 10041633 Dec 16 18:41 chr11.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  2975200 Dec 16 18:41 chr11.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1857617 Dec 16 18:41 chr11.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr12.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 19:01 chr12.cell_snv.mat.gz
-rw-rw-r-- 1 user user  3374361 Dec 16 18:48 chr12.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 47436461 Dec 16 18:41 chr12.gl.vcf.DP4
-rw-rw-r-- 1 user user 10851180 Dec 16 18:41 chr12.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  3233926 Dec 16 18:41 chr12.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  2017355 Dec 16 18:41 chr12.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr13.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 18:53 chr13.cell_snv.mat.gz
-rw-rw-r-- 1 user user   959253 Dec 16 18:48 chr13.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 18530695 Dec 16 18:41 chr13.gl.vcf.DP4
-rw-rw-r-- 1 user user  3134825 Dec 16 18:41 chr13.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user   936842 Dec 16 18:41 chr13.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user   584251 Dec 16 18:41 chr13.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr14.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 18:57 chr14.cell_snv.mat.gz
-rw-rw-r-- 1 user user  2090836 Dec 16 18:48 chr14.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 30348900 Dec 16 18:41 chr14.gl.vcf.DP4
-rw-rw-r-- 1 user user  6796121 Dec 16 18:41 chr14.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  2029956 Dec 16 18:41 chr14.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1265814 Dec 16 18:41 chr14.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr15.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 18:58 chr15.cell_snv.mat.gz
-rw-rw-r-- 1 user user  1750870 Dec 16 18:48 chr15.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 27598731 Dec 16 18:41 chr15.gl.vcf.DP4
-rw-rw-r-- 1 user user  5713045 Dec 16 18:41 chr15.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  1692444 Dec 16 18:41 chr15.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1057548 Dec 16 18:41 chr15.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr16.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 18:56 chr16.cell_snv.mat.gz
-rw-rw-r-- 1 user user  1917046 Dec 16 18:48 chr16.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 29187692 Dec 16 18:41 chr16.gl.vcf.DP4
-rw-rw-r-- 1 user user  6287212 Dec 16 18:41 chr16.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  1836530 Dec 16 18:41 chr16.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1151314 Dec 16 18:41 chr16.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr17.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 19:01 chr17.cell_snv.mat.gz
-rw-rw-r-- 1 user user  3063684 Dec 16 18:48 chr17.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 43043437 Dec 16 18:41 chr17.gl.vcf.DP4
-rw-rw-r-- 1 user user  9953985 Dec 16 18:41 chr17.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  2905928 Dec 16 18:41 chr17.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1821925 Dec 16 18:41 chr17.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr18.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 18:52 chr18.cell_snv.mat.gz
-rw-rw-r-- 1 user user   795791 Dec 16 18:48 chr18.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 14374360 Dec 16 18:41 chr18.gl.vcf.DP4
-rw-rw-r-- 1 user user  2640863 Dec 16 18:41 chr18.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user   762278 Dec 16 18:41 chr18.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user   479389 Dec 16 18:41 chr18.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr19.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 19:02 chr19.cell_snv.mat.gz
-rw-rw-r-- 1 user user  2944356 Dec 16 18:48 chr19.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 38839619 Dec 16 18:41 chr19.gl.vcf.DP4
-rw-rw-r-- 1 user user  9638481 Dec 16 18:41 chr19.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  2788148 Dec 16 18:41 chr19.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1750084 Dec 16 18:41 chr19.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 19:28 chr1.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       43 Dec 16 19:41 chr1.cell_snv.mat.gz
-rw-rw-r-- 1 user user  6278415 Dec 16 19:28 chr1.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 85332094 Dec 16 18:42 chr1.gl.vcf.DP4
-rw-rw-r-- 1 user user 20221064 Dec 16 18:42 chr1.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  6026704 Dec 16 18:42 chr1.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  3638567 Dec 16 18:42 chr1.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr20.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 18:54 chr20.cell_snv.mat.gz
-rw-rw-r-- 1 user user  1475996 Dec 16 18:48 chr20.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 22461680 Dec 16 18:41 chr20.gl.vcf.DP4
-rw-rw-r-- 1 user user  4855421 Dec 16 18:41 chr20.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  1415436 Dec 16 18:41 chr20.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user   887574 Dec 16 18:41 chr20.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr21.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 18:57 chr21.cell_snv.mat.gz
-rw-rw-r-- 1 user user   576800 Dec 16 18:48 chr21.cell_snv.snvID.csv
-rw-rw-r-- 1 user user  9551820 Dec 16 18:41 chr21.gl.vcf.DP4
-rw-rw-r-- 1 user user  1904069 Dec 16 18:41 chr21.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user   564094 Dec 16 18:41 chr21.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user   352628 Dec 16 18:41 chr21.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr22.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       44 Dec 16 18:53 chr22.cell_snv.mat.gz
-rw-rw-r-- 1 user user  1055702 Dec 16 18:48 chr22.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 16329685 Dec 16 18:41 chr22.gl.vcf.DP4
-rw-rw-r-- 1 user user  3460443 Dec 16 18:41 chr22.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  1024128 Dec 16 18:41 chr22.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user   640080 Dec 16 18:41 chr22.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr2.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       43 Dec 16 19:04 chr2.cell_snv.mat.gz
-rw-rw-r-- 1 user user  4310484 Dec 16 18:48 chr2.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 66028157 Dec 16 18:42 chr2.gl.vcf.DP4
-rw-rw-r-- 1 user user 13948508 Dec 16 18:42 chr2.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  4173460 Dec 16 18:42 chr2.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  2519110 Dec 16 18:42 chr2.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr3.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       43 Dec 16 19:05 chr3.cell_snv.mat.gz
-rw-rw-r-- 1 user user  3043097 Dec 16 18:48 chr3.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 52733005 Dec 16 18:41 chr3.gl.vcf.DP4
-rw-rw-r-- 1 user user  9950485 Dec 16 18:41 chr3.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  2968976 Dec 16 18:41 chr3.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1793693 Dec 16 18:41 chr3.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr4.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       43 Dec 16 18:57 chr4.cell_snv.mat.gz
-rw-rw-r-- 1 user user  1919696 Dec 16 18:48 chr4.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 36865869 Dec 16 18:41 chr4.gl.vcf.DP4
-rw-rw-r-- 1 user user  6374169 Dec 16 18:41 chr4.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  1874713 Dec 16 18:41 chr4.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1136044 Dec 16 18:41 chr4.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr5.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       43 Dec 16 19:00 chr5.cell_snv.mat.gz
-rw-rw-r-- 1 user user  2925422 Dec 16 18:48 chr5.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 46264593 Dec 16 18:41 chr5.gl.vcf.DP4
-rw-rw-r-- 1 user user  9586170 Dec 16 18:41 chr5.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  2862470 Dec 16 18:41 chr5.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1728510 Dec 16 18:41 chr5.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr6.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       43 Dec 16 19:02 chr6.cell_snv.mat.gz
-rw-rw-r-- 1 user user  3104334 Dec 16 18:48 chr6.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 49372527 Dec 16 18:41 chr6.gl.vcf.DP4
-rw-rw-r-- 1 user user 10242308 Dec 16 18:41 chr6.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  2994007 Dec 16 18:41 chr6.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1815231 Dec 16 18:41 chr6.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 19:41 chr7.cell_snv.cellID.csv
-rw-rw-r-- 1 user user        0 Dec 16 19:41 chr7.cell_snv.mat.gz
-rw-rw-r-- 1 user user  2451437 Dec 16 19:41 chr7.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 43476206 Dec 16 18:41 chr7.gl.vcf.DP4
-rw-rw-r-- 1 user user  8113905 Dec 16 18:41 chr7.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  2398691 Dec 16 18:41 chr7.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1451788 Dec 16 18:41 chr7.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr8.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       43 Dec 16 18:57 chr8.cell_snv.mat.gz
-rw-rw-r-- 1 user user  1840022 Dec 16 18:48 chr8.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 33830514 Dec 16 18:41 chr8.gl.vcf.DP4
-rw-rw-r-- 1 user user  6126229 Dec 16 18:41 chr8.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  1797565 Dec 16 18:41 chr8.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1089250 Dec 16 18:41 chr8.gl.vcf.filter.hc.pos
-rw-rw-r-- 1 user user   153712 Dec 16 18:48 chr9.cell_snv.cellID.csv
-rw-rw-r-- 1 user user       43 Dec 16 18:57 chr9.cell_snv.mat.gz
-rw-rw-r-- 1 user user  1944056 Dec 16 18:48 chr9.cell_snv.snvID.csv
-rw-rw-r-- 1 user user 32394644 Dec 16 18:41 chr9.gl.vcf.DP4
-rw-rw-r-- 1 user user  6451658 Dec 16 18:41 chr9.gl.vcf.filter.DP4
-rw-rw-r-- 1 user user  1906383 Dec 16 18:41 chr9.gl.vcf.filter.hc.bed
-rw-rw-r-- 1 user user  1153584 Dec 16 18:41 chr9.gl.vcf.filter.hc.pos
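
In the listing above, every `*.cell_snv.mat.gz` file is 43 or 44 bytes (and `chr7` is 0 bytes), which fits the `EmptyDataError` raised by `pd.read_csv` in `bam2mat`. A minimal sketch for confirming which matrices contain no data, assuming only the file naming shown in the listing; the function name `find_empty_matrices` is hypothetical and not part of Monopogen:

```python
import gzip
import zlib
from pathlib import Path


def find_empty_matrices(somatic_dir):
    """Return names of *.cell_snv.mat.gz files that hold no data lines.

    Hypothetical helper for diagnosis only; not part of Monopogen.
    """
    empty = []
    for path in sorted(Path(somatic_dir).glob("*.cell_snv.mat.gz")):
        if path.stat().st_size == 0:
            # A zero-byte file is not even a valid gzip stream.
            empty.append(path.name)
            continue
        try:
            with gzip.open(path, "rt") as handle:
                # True as soon as one non-blank line is found.
                has_data = any(line.strip() for line in handle)
        except (OSError, EOFError, zlib.error):
            # Corrupt or truncated gzip stream: treat as unusable.
            has_data = False
        if not has_data:
            empty.append(path.name)
    return empty
```

Calling `find_empty_matrices(".../output/G0157/somatic")` (path elided as in the thread) would list the chromosomes for which the cellScan step wrote no rows, which can help separate a barcode-matching problem from a problem with a single chromosome.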

@tkcaccia
Author

While the program is running: I am not sure that the csv file is correct.
Could the values in the id column be too low?

cell	id
AAAAAAAAGCTT	477
AAAAAATTCTTT	603
AAAAACTGTAAA	3274
AAAAAGAGGGTT	2402
AAAAATACCATA	1235
AAAAATCGTTTT	454
AAAAATGTCTTG	708
AAAACAGTGACA	2439
AAAACCGCATCT	3025
AAAACGCACGCG	4141
AAAACGCGCACA	1867
AAAACGTAATTG	474
AAAACGTTCTGG	3494
AAAACTTTCATC	466
AAAAGATTCACG	1419

@tkcaccia
Author

The program fails like this:

python  ${path}/src/Monopogen.py  somatic  \
    -a   ${path}/apps \
     -r  /media/user/Data/single-cell/20240910-B164/input/region1.lst \
    -i  /media/user/Data/single-cell/20240910-B164/output/G0157 \
     -l  /media/user/Data/single-cell/20240910-B164/input/G0157.csv    -s cellScan     \
    -g   /media/user/Data/Resources/GRCh38UCSF/hg38.fa  


[2024-12-17 17:21:58,736] INFO     Monopogen.py Collect single cell level information from sequencing data...
0:0:0:0:0:0
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/user/miniconda3/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "/home/user/Monopogen/src/somatic.py", line 323, in bam2mat
    mat = pd.read_csv(mat_infile, sep="\t", header=None)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 620, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
    self._engine = self._make_engine(f, self.engine)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/readers.py", line 1898, in _make_engine
    return mapping[engine](f, **self.options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
    self._reader = parsers.TextReader(src, **kwds)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "parsers.pyx", line 581, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/Monopogen/src/Monopogen.py", line 340, in <module>
    main()
  File "/home/user/Monopogen/src/Monopogen.py", line 333, in main
    args.func(args)
  File "/home/user/Monopogen/src/Monopogen.py", line 172, in somatic
    result = pool.map(bam2mat, joblst)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/miniconda3/lib/python3.12/multiprocessing/pool.py", line 774, in get
    raise self._value
pandas.errors.EmptyDataError: No columns to parse from file

@tkcaccia
Author

Could it be related to the program used to index the reference genome?

@Arsenalwins

Maybe you should check whether the cell barcode format in your CSV is the same as in your BAM.

@tkcaccia
Author

My barcodes in the BAM file look very different:

 samtools view sorted.bam | less -S

HGGFCDRX5:1:2215:20546:18004        16      1       10033   0       72M     *       0       0       ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACACTA        FFFFFFF:FFFFFFFFFFFFFF>
A01240:1004:HGGFCDRX5:1:2223:16812:8187 16      1       10033   0       72M     *       0       0       ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACACTA        FFFF:FFFFFFFFFFFFFFFFFFFFFFFFF>
A01240:1004:HGGFCDRX5:1:2251:16920:12978        16      1       10033   0       72M     *       0       0       ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACACTA        F,,FFF,FFFF:,F:FFFFFFF>
A01240:1004:

How can I generate the *.csv file from the BAM file?
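
One possible approach, sketched under loud assumptions: if the aligner stored each read's cell barcode in an optional SAM tag (the SAM tag specification reserves `CB` for the cell identifier), the barcodes can be tallied from the text output of `samtools view`. The excerpt above is cut off before the optional tag columns, so it does not show whether such a tag exists in this BAM. The function names, the `min_reads` parameter, the comma delimiter, and the choice to write the per-barcode read count into the `id` column are all assumptions for illustration, not Monopogen's documented behaviour.

```python
import csv
from collections import Counter


def count_barcodes(sam_lines, tag="CB"):
    """Count reads per barcode from SAM text lines.

    Assumes the barcode is stored as a string tag, e.g. CB:Z:ACGT...
    Reads without that tag are ignored.
    """
    prefix = tag + ":Z:"
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        # The first 11 columns are mandatory; optional tags follow.
        for field in fields[11:]:
            if field.startswith(prefix):
                counts[field[len(prefix):]] += 1
                break
    return counts


def write_barcode_csv(counts, out_path, min_reads=1, delimiter=","):
    """Write a two-column file with header `cell`, `id`.

    Assumption: `id` holds the read count per barcode; adjust the
    column content and delimiter to whatever the pipeline expects.
    """
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter=delimiter)
        writer.writerow(["cell", "id"])
        for barcode, n_reads in sorted(counts.items()):
            if n_reads >= min_reads:
                writer.writerow([barcode, n_reads])
```

The lines could come from a pipe such as `samtools view sorted.bam`, read through `sys.stdin`. If `count_barcodes` returns an empty counter, the BAM carries no such tag and the barcodes must be recovered some other way (for example from the upstream demultiplexing output), in which case this sketch does not apply.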

@tkcaccia
Author

I did not do my experiments with Chromium, so I cannot use the Cell Ranger pipeline to extract my barcodes.

@Arsenalwins

Wow, that's a difficult one. Maybe you should consult a professional about it.
