Decouple directories and module names #31

safisher · 2014-10-27T19:32:12Z

Each directory should include a module file that contains the name of the module used to create that directory. This information could be added to the SAMPLE_ID.versions file. STATS can then use the file containing the module information to determine which module to use to generate the stats. In that case, STATS would be given a list of directories rather than a list of modules.
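A minimal sketch of the idea, not the actual implementation: each output directory carries a small file naming the module that produced it, and STATS resolves modules from a list of directories. The file name `module.txt` and the function names are assumptions made for illustration; the issue does not specify them.

```python
from pathlib import Path

# Hypothetical file name; the issue only says "a module file".
MODULE_FILE = "module.txt"


def write_module_file(directory, module_name):
    """Record which module created `directory` (creates it if needed)."""
    d = Path(directory)
    d.mkdir(parents=True, exist_ok=True)
    (d / MODULE_FILE).write_text(module_name + "\n")


def read_module_name(directory):
    """Return the name of the module recorded in `directory`."""
    return (Path(directory) / MODULE_FILE).read_text().strip()


def modules_for_stats(directories):
    """Map each directory to its module, as STATS might do when it is
    given a list of directories instead of a list of modules."""
    return {str(d): read_module_name(d) for d in directories}
```

With this layout, two directories with different names can both report STAR as their module, so the directory name no longer has to match the module name.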

This will allow us to decouple the directory name from the module name and give us more flexibility in running modules repeatedly. For example, STAR could be run twice on two different genome versions, or HTSEQ could be run repeatedly on different transcriptomes. This will also allow for meta-modules and finer overall granularity in modules. For example, we could run HTSEQ on exons, then on introns, and use another (meta-)module to combine the exon and intron counts.

safisher · 2014-11-06T20:33:47Z

To implement this, every module should take an inputDirectory and an outputDirectory. These should not have default values; instead, PIPELINE should state the directories explicitly.

It should be fine for modules to require/assume file names. For example, STAR should expect to find the file "unaligned_1.fq" in the inputDirectory.
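A rough sketch of what such a module interface could look like, under stated assumptions: the function name `run_star` is hypothetical, the alignment step itself is left out, and only the `unaligned_1.fq` file name comes from the comment above. Both directories are required arguments with no defaults, so the caller (PIPELINE) has to name them explicitly.

```python
from pathlib import Path

# File name the module assumes, taken from the comment above.
EXPECTED_INPUT = "unaligned_1.fq"


def run_star(input_directory, output_directory):
    """Hypothetical STAR wrapper.

    Both directories are required positional arguments with no defaults,
    so the caller must state them explicitly.
    """
    reads = Path(input_directory) / EXPECTED_INPUT
    if not reads.is_file():
        raise FileNotFoundError(f"expected {EXPECTED_INPUT} in {input_directory}")

    out = Path(output_directory)
    out.mkdir(parents=True, exist_ok=True)

    # The actual alignment command would run here; it is omitted in this sketch.
    return reads, out
```

Because nothing about the directory names is baked into the module, the same wrapper can be called several times with different output directories.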

safisher added the enhancement label Oct 27, 2014
