valid read pairs is too few compared with other tools #340

shiyi-pan · 2021-01-08T15:01:49Z

Hi, I want to use TADbit to detect TADs, the TADbit detect 25,521,413 valid read pairs only ,but other tools such as HiCExplorer could detect 58949201 read pairs using the same data, I think maybe my code has some problem, could you give me some advise? I ues GEM as mapper and here is my code, thank you very much:

from pytadbit.mapping.full_mapper import full_mapping
species = 'soybean'

for the first side of the reads

full_mapping(mapper_index_path='NN1138-2.v1.0.genome.gem',
out_map_dir='results/fragment/{0}/01_mapping/mapped_{0}_r1/'.format(species),
fastq_path='soybean_R1.fastq.gz',
r_enz='MboI', frag_map=True, clean=True, nthreads=8,
temp_dir='results/fragment/{0}/01_mapping/mapped_r1_tmp/'.format(species))

david-castillo · 2021-01-10T15:33:15Z

Hi,

The number of reads that you get with any pipeline depends on several factors but mainly on two: the mismatch you allow in the mapping of the reads and the filters you applied to the mapped reads.
By default we use GEM, but even different versions of GEM have different default mismatches. Version 2 is about 4% or 6% if I remember correctly, other mappers allow higher mismatches like 15%. We also support bowtie2 and hisat2 and those will also produce a different number of reads depending on the parameters you use. By default we use the --very-sensitive option.
Then you have the filters to discard what you consider non-informative or errors. Each pipeline has his own (although conceptually similar) filters.

Regards

David

shiyi-pan · 2021-01-11T13:37:25Z

Thank you for your reply, David. I will try other mappers like bowtie2 and hisat2 . I don't know clearly how to use bowtie2 into the TADbit pipeline . Is it just mapping Hi-C reads with bowtie2 and parse the bam file with parse_map function or not ?

david-castillo · 2021-01-12T09:30:14Z

You can use the same code that you already have but specify another mapper in the full_mapper function (like mapper='bowtie2'). Be sure you input the right mapper_index_path because for bowtie2 is not a single file, you have to specify the path and the prefix to the index.

http://3dgenomes.github.io/TADbit/reference/reference_mapping.html#mapping

You can also customize bowtie2 parameters with the mapper_params parameter if you need.

After that the parse is exactly the same.

Regards

David

shiyi-pan · 2021-01-12T14:33:40Z

Thanks a lot , David.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

valid read pairs is too few compared with other tools #340

valid read pairs is too few compared with other tools #340

shiyi-pan commented Jan 8, 2021

david-castillo commented Jan 10, 2021 •

edited

Loading

shiyi-pan commented Jan 11, 2021

david-castillo commented Jan 12, 2021

shiyi-pan commented Jan 12, 2021

valid read pairs is too few compared with other tools #340

valid read pairs is too few compared with other tools #340

Comments

shiyi-pan commented Jan 8, 2021

for the first side of the reads

david-castillo commented Jan 10, 2021 • edited Loading

shiyi-pan commented Jan 11, 2021

david-castillo commented Jan 12, 2021

shiyi-pan commented Jan 12, 2021

david-castillo commented Jan 10, 2021 •

edited

Loading