Filter 16S rRNA sequence from GenBank file #10

Open
fconstancias opened this issue Aug 26, 2020 · 0 comments


Dear biofiles developers / users,

I am trying to extract the 16S operon from a collection of GenBank flat files using R.

I am able to load the file successfully:

gbk <- biofiles::gbRecord("~/Projects/ETH/benoit/Bacteroides/200824/GCF_903181445.1_NZ_Bacteroides_fragilis_8E3_BL_hyb_genomic.gbff", 
                          progress = TRUE)

But when I try to use the biofiles::filter function, I can't make it work:
biofiles::filter(gbk, Feature = "rRNA", product = "16S ribosomal RNA") -> rRNA

biofiles::summary(rRNA)

[[NZ_CAEUHN010000001]]
1857785 bp: Bacteroides fragilis isolate NZ_Bacteroides_fragilis_8E3_BL_hyb, whole genome shotgun sequence.
Error in if (is.atomic(x) || len == 1L && length(nm) <= 1L) { :
missing value where TRUE/FALSE needed

biofiles::getSequence(rRNA)

A DNAStringSet instance of length 24
width seq names
[1] 1857785 AAGTTCTGATAGAACTTAGAAGAGAATGCTCTTTTTACTATTGATTTTAATACTTTTCTCT...AATGACCGTCAATAAATTTTCGACATCCTGAACAGAGCTAATATTGTCCCTTATTGGGAT NZ_CAEUHN010000001
[2] 47339 GTCCGTTTTACCACTATAAATAGTTTCGGAAATACTTACGGTTTGAATGAGAAAAGATGTC...GAACGGATAATAAATTGGATATATTCATTTGTTTTCCAAATAGTTACTACTAAAATGCCT NZ_CAEUHN010000002
[3] 44645 AGAAACATTGATTATCAATGTTCTACAGGATAAACACCCAACTTTACGTCCAAACTGTAAA...GATACCTATTATTGGGCTACAAACATATACGTTATGTAGAATTTATAGAAAAAATAGGGG NZ_CAEUHN010000003
[4] 43518 ACATTCGTTCGTTCTCAACTTCTAAAAATGTTTCGTAAAATTTGACGGTTTGAAAGAGAAA...CAAAGCTTTTGAAGATAGAAATCATACTTTTTAAAGGTATTGATTTTCAGAAGGTTTATC NZ_CAEUHN010000004
[5] 5021 AATTATTGGATACAATTTCCAGAAAGAATAATTAGTTTGCTATTGGAAGATAAATATAAAA...GTACCGACCTCGACAACACTTGCTCCGATGACTGTTTCACCGGTAGCATCCGTAACCACA NZ_CAEUHN010000005
... ... ...
[20] 128200 TAATATTAAAGTGATATTAAAGACTGACTCTAAGGCACTTGATGAAGTGGTAGTAGTAGCT...TGTGTCAAAACGTTGGCACAACCTCCTTTTATATTTTACAGTTCTGCTATATTTTCTTTT NZ_CAEUHN010000020
[21] 109732 AAGTTCTGATAGGACTTGGCATTTCTGCCGGCCTACTCTCTCCGAACCATGTGTTCGCTAC...GCTGTTACTACGACTTCGTCAAGTGCCTTAGCATCAGTCTTTAATATCACTTTGATTATC NZ_CAEUHN010000021
[22] 88286 TCGCCTACCGTCCCGATAGAACTTAGTAAACAGTTTTAAAAACACATATAAACATCTTTAT...GGGAGAATCTTGAAGTGTAAGGATCTTGTTATTAGTTATTTATCTTAAGATATAGGTGTC NZ_CAEUHN010000022
[23] 80205 GTTTGTGTAACTATTGTATCCAACAGTAGCTGCTACCGTAAAGTCACCCTTTTTAGGAGTA...ATGATTGTTTGTCAGAGAAGGACTGCCAAAATGACTGATGACATTGGAAAAACAGGCGCT NZ_CAEUHN010000023
[24] 59327 AGATTCATTTATTTTATCTATATTTGCAAATGAGTATTTATATAAACGTTTAAAGCAAATA...CCTCTTACAGAGGCGTTATACGATAACGTAATAATAAAGATTAGGAAAAGACTTTTCTCA NZ_CAEUHN010000024

The full contigs are exported instead of only the 16S rRNA features.
Any idea what I am doing wrong?

Thanks a ton
