Adapting FIMO file to the correct format for centipede_data #17

Open
Rosmaninho opened this issue Nov 12, 2020 · 0 comments

FIMO from the MEME suite website outputs data in the following format:

motif_id motif_alt_id sequence_name start stop strand score p-value q-value matched_sequence
ZNF528 MA1597.1 Peak_31367#chr12#10213230#10213429 54 70 - 27.9633 1.45e-10 2.88e-05 CCCAGGGAAGCCATCTC
ZNF528 MA1597.1 Peak_31367#chr12#10213177#10213376 107 123 - 27.9633 1.45e-10 2.88e-05 CCCAGGGAAGCCATCTC
SP4 MA0685.1 Peak_73465#chr19#45001886#45002085 50 66 - 25.5488 3.14e-10 3.97e-05 CAGGCCACGCCCCCTTC
SP4 MA0685.1 Peak_73465#chr19#45001835#45002034 101 117 - 25.5488 3.14e-10 3.97e-05 CAGGCCACGCCCCCTTC
SP4 MA0685.1 Peak_73465#chr19#45001828#45002027 108 124 - 25.5488 3.14e-10 3.97e-05 CAGGCCACGCCCCCTTC
THAP11 MA1573.1 Peak_110384#chr3#141370283#141370482 140 158 - 27.4944 3.59e-10 6.36e-05 AGGACTACATTTCCCAGAC
CTCF MA0139.1 Peak_71057#chr19#2474615#2474814 96 114 + 25.2247 4.23e-10 0.000166 CGGCCACCAGATGGCGCCA
ZNF16 MA1654.1 Peak_181996#chr9#129485761#129485960 1 23 + 27.5244 5.42e-10 0.000109 AATGGGGAGCCATCGAAGGCCTT
ZNF16 MA1654.1 Peak_181996#chr9#129485656#129485855 106 128 + 27.5244 5.42e-10 0.000109 AATGGGGAGCCATCGAAGGCCTT

Your tutorial, however, seems to expect FIMO output in the following layout:

    sequence.name   start    stop X.pattern.name strand score  p.value q.value matched.sequence
307          chr1  753016  753228              1      + 13.53 1.14e-05      NA    TTTCCCAGAAGGA
315          chr1  876197  876409              1      - 12.07 3.73e-05      NA    CTTCCCCGAAGGG
29           chr1 1365483 1365695              1      - 11.88 4.24e-05      NA    TTTCCAAGAAAGT
30           chr1 1365877 1366089              1      - 12.72 2.24e-05      NA    CTTCCCAGGAGAG
31           chr1 1406705 1406917              1      - 11.20 6.73e-05      NA    CTTCACAGAATTA
64           chr1 1566358 1566570              1      + 13.99 7.75e-06      NA    TTTCCAAGAACCG

I am getting the following error:

-- Column specification ------------------------------------------------------------------
cols(
  sequence_name = col_character(),
  chr = col_character(),
  start = col_double(),
  stop = col_double(),
  strand = col_character(),
  score = col_double(),
  `p-value` = col_double(),
  `q-value` = col_double(),
  matched_sequence = col_character(),
  motif_id = col_character(),
  motif_alt_id = col_character()
)

Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'which' in selecting a method for function 'ScanBamParam': In range 4685: at least two out of 'start', 'end', and 'width', must be supplied.

How should I adapt my FIMO output so that centipede_data accepts it?
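
To make the question concrete, here is a rough sketch of the column mapping between the two layouts shown above, written as a standalone Python preprocessing step. It rests on several assumptions that are not confirmed by the tutorial: that `sequence_name` always has the form `name#chrom#region_start#region_end`, that `region_start` is 1-based, and that FIMO's `start`/`stop` are 1-based offsets into that region. The function names `convert_fimo_row`, `convert_fimo`, and `to_tsv` are hypothetical helpers, not part of any package, and whether `centipede_data` accepts the result is untested.

```python
import csv
import io

# Column order copied from the tutorial's example table in this issue.
TARGET_COLUMNS = [
    "sequence.name", "start", "stop", "X.pattern.name",
    "strand", "score", "p.value", "q.value", "matched.sequence",
]


def convert_fimo_row(row):
    """Map one FIMO row (dict keyed by FIMO header names) to the target layout.

    Assumption: sequence_name is 'name#chrom#region_start#region_end' and
    FIMO start/stop are 1-based offsets into that region.
    """
    _name, chrom, region_start, _region_end = row["sequence_name"].split("#")
    offset = int(region_start) - 1  # assumes region_start itself is 1-based
    return {
        "sequence.name": chrom,
        "start": offset + int(row["start"]),
        "stop": offset + int(row["stop"]),
        "X.pattern.name": row["motif_id"],
        "strand": row["strand"],
        "score": row["score"],
        "p.value": row["p-value"],
        "q.value": row["q-value"],
        "matched.sequence": row["matched_sequence"],
    }


def convert_fimo(lines):
    """Convert tab-separated FIMO lines into a list of target-layout rows."""
    # Skip blank lines and '#'-prefixed comment lines, if the file has any.
    data = [ln for ln in lines if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(data, delimiter="\t")
    return [convert_fimo_row(row) for row in reader]


def to_tsv(rows):
    """Serialize converted rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=TARGET_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# First data row from the FIMO output quoted in this issue.
example_tsv = (
    "motif_id\tmotif_alt_id\tsequence_name\tstart\tstop\tstrand\tscore\t"
    "p-value\tq-value\tmatched_sequence\n"
    "ZNF528\tMA1597.1\tPeak_31367#chr12#10213230#10213429\t54\t70\t-\t"
    "27.9633\t1.45e-10\t2.88e-05\tCCCAGGGAAGCCATCTC\n"
)

rows = convert_fimo(example_tsv.splitlines())
print(to_tsv(rows))
```

If the peak coordinates in `sequence_name` are 0-based (BED-style), the `- 1` in `offset` would need to be dropped. A row whose `sequence_name` does not split into exactly four parts raises a `ValueError` here instead of passing through with missing coordinates.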
