Each directory should include a module file that contains the name of the module used to create that directory. This information could be added to the SAMPLE_ID.versions file. STATS can then use the file containing the module information to determine which module is used to generate the stats. In this case, STATS would be provided with a list of directories rather than a list of modules.
This will allow us to decouple the directory name from the module name and allow for more flexibility in running modules repeatedly. For example, STAR could be run twice on two different genome versions, or HTSEQ could be run repeatedly on different transcriptomes. This will also allow for meta-modules and more overall granularity in modules. For example, we could run HTSEQ on exons, then on introns, and use another (meta-)module to combine the exon and intron counts.
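A minimal sketch of the idea in Python, not the pipeline's actual implementation. The marker file name `MODULE`, the function names, and the `handlers` mapping are all hypothetical; the proposal only says that each directory records the module that created it and that STATS receives a list of directories.

```python
from pathlib import Path

# Hypothetical name for the per-directory module file (an assumption;
# the proposal also suggests SAMPLE_ID.versions as a possible home).
MODULE_FILE = "MODULE"


def write_module_file(directory, module_name):
    """Record which module created `directory`."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / MODULE_FILE).write_text(module_name + "\n")


def read_module_name(directory):
    """Return the name of the module that created `directory`."""
    return (Path(directory) / MODULE_FILE).read_text().strip()


def collect_stats(directories, handlers):
    """Look up each directory's module and call the matching stats handler.

    `handlers` maps a module name to a function that takes a directory
    path and returns that directory's stats.
    """
    results = {}
    for directory in directories:
        module_name = read_module_name(directory)
        results[str(directory)] = handlers[module_name](Path(directory))
    return results
```

Because the handler is chosen from the file contents rather than the directory name, two directories such as `star_genomeA` and `star_genomeB` (illustrative names) can both be handled as STAR output.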
To implement this, every module should take an inputDirectory and an outputDirectory. These should not have default values; rather, PIPELINE should explicitly state the directories.
It should be fine for modules to require or assume file names. For example, STAR should expect to find the file "unaligned_1.fq" in the inputDirectory.
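As a rough illustration, a module entry point with no default directories might look like the Python sketch below. The function name `prepare_star_run` is hypothetical, and the sketch only validates its inputs; it does not invoke STAR. The only details taken from the discussion are the two directory parameters and the expected file name "unaligned_1.fq".

```python
from pathlib import Path

# File name the module assumes, as described above.
EXPECTED_READS = "unaligned_1.fq"


def prepare_star_run(*, input_directory, output_directory):
    """Check the inputs for a STAR run and create the output directory.

    Both directories are keyword-only and have no defaults, so the
    caller (PIPELINE) must state them explicitly.
    """
    reads = Path(input_directory) / EXPECTED_READS
    if not reads.is_file():
        raise FileNotFoundError(f"expected {EXPECTED_READS} in {input_directory}")
    output_directory = Path(output_directory)
    output_directory.mkdir(parents=True, exist_ok=True)
    return reads, output_directory
```

Calling `prepare_star_run()` without both arguments raises a `TypeError`, which makes a missing directory an immediate error instead of a silent fallback to a default location.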