manage long N intergenic regions #12

Open
jwollbrett opened this issue Dec 10, 2018 · 0 comments
@jwollbrett

Some reference intergenic regions are filled with N bases.
For instance, in human, more than 80 reference intergenic regions are a sequence of 20,000 N.
As we provide reference intergenic sequences on our FTP for the BgeeCall package, we should remove these sequences.
We should perhaps also remove all long runs of N within intergenic regions.
One solution could be to remove all N runs longer than or equal to the default k-mer size of kallisto (31 bp).
A single initial 20,000 bp reference intergenic region could then result in more than one reference intergenic region.
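
A minimal sketch of the splitting idea, in Python for illustration only (this is not BgeeCall code). The function name `split_on_long_n_runs` and the constant `KMER_SIZE` are hypothetical. It assumes sequences are plain strings and that a run of N at or above the threshold is removed entirely, leaving the flanking pieces as separate regions:

```python
import re

# Default k-mer size of kallisto, as stated in the issue text.
KMER_SIZE = 31


def split_on_long_n_runs(sequence, min_run=KMER_SIZE):
    """Split a sequence at every run of N that is at least min_run long.

    Runs shorter than min_run are left in place. Empty pieces (for example
    when the whole sequence is N) are dropped.
    """
    # The quantifier is greedy, so each match covers a whole run of N.
    pattern = re.compile(r"[Nn]{%d,}" % min_run)
    return [piece for piece in pattern.split(sequence) if piece]


if __name__ == "__main__":
    region = "ACGT" + "N" * 40 + "TTGA" + "N" * 5 + "CC"
    for piece in split_on_long_n_runs(region):
        print(len(piece), piece)
```

With this approach a region made only of N yields no output region at all, and a region containing one long N run yields two shorter regions. How to name the resulting pieces and whether very short pieces should be kept are left open here.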
