Evan Staton edited this page Aug 14, 2013 · 29 revisions

Transposome is intended to be simple and fun to use, and there are no complicated options or arguments to learn. The only requirement is to edit the configuration file in the Transposome/config directory and pass it to the transposome program. Here is an example file:

blast_input:
  - sequence_file:     sunflower_500k_interleaved.fasta
  - sequence_num:      25_000
  - cpu:               2
  - thread:            12
  - output_directory:  sunflower_500k_transposome_PID90_COV55
clustering_options:
  - in_memory:         1
  - percent_identity:  90
  - fraction_coverage: 0.55
  - merge_threshold:   100
annotation_input:
  - repeat_database:  RepBase1801_sunflower_repeats.fasta
annotation_options:
  - cluster_size:     500
  - blast_evalue:     10
output:
  - report_file:      sunflower_500k_transposome_report.txt

If we save this as transposome_config.yml, then we would run transposome as follows:

transposome --config transposome_config.yml

All of the results will be in the output directory that is specified in the configuration file.

ON NAMING RESULTS

Try to give the results and output directory descriptive names that make it easy to tell runs apart. In the example above, the output directory name carries some minimal information: the species (sunflower), the number of sequence reads (500k), and the parameters used for calculating pairwise matches (PID90 for a percent identity of 90, COV55 for a fraction coverage of 0.55).
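
When many runs are planned, names following this pattern can be generated rather than typed by hand. The following is a minimal Python sketch and is not part of Transposome; the function names format_read_count and run_label are hypothetical, and the layout simply mirrors the example output directory shown above.

```python
def format_read_count(n):
    """Abbreviate a read count, e.g. 500000 -> '500k' (hypothetical helper)."""
    if n >= 1_000_000 and n % 1_000_000 == 0:
        return f"{n // 1_000_000}M"
    if n >= 1_000 and n % 1_000 == 0:
        return f"{n // 1_000}k"
    # Counts that are not round thousands are written out in full.
    return str(n)


def run_label(species, read_count, percent_identity, fraction_coverage):
    """Build a descriptive name from the species, read count, and match parameters.

    The pattern is an assumption modeled on the example configuration:
    <species>_<reads>_transposome_PID<identity>_COV<coverage as a percentage>
    """
    pid = int(round(percent_identity))
    cov = int(round(fraction_coverage * 100))
    reads = format_read_count(read_count)
    return f"{species}_{reads}_transposome_PID{pid}_COV{cov}"


if __name__ == "__main__":
    # Values taken from the example configuration file.
    print(run_label("sunflower", 500_000, 90, 0.55))
```

The resulting string can be pasted into the output_directory field of the configuration file, so that the directory name always reflects the percent_identity and fraction_coverage values used for that run.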