From de990f2bdd9706b597e1b4c8683f53fd1f9cfcf2 Mon Sep 17 00:00:00 2001
From: jeanrjc
Date: Fri, 11 Mar 2016 15:47:39 +0100
Subject: [PATCH] improve doc

---
 doc/source/introduction.rst |  2 +-
 doc/source/tutorial.rst     | 31 +++++++++++++++++--------------
 integron_finder             |  2 +-
 3 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/doc/source/introduction.rst b/doc/source/introduction.rst
index 5a3687b..80aa4f1 100644
--- a/doc/source/introduction.rst
+++ b/doc/source/introduction.rst
@@ -50,7 +50,7 @@ matching.
 
 **Does it work ?**
 
-Yes! The estimated sensitivity is 61% on average with the default option and goes up to 88% with the `--local_max` option. The missing *attC* sites are usually at the end of the array. The False positive rate with the `--local_max` option is estimated between 0.03 False Positive per Megabases (FP/Mb) to 0.72 FP/Mb. This leads to a probability of finding 2 consecutive *attC* sites within 4kb between 4.10^-6 and 7.10^-9. Finally, this parameters do not depend on the G+C percent of the given replicon.
+Yes! The estimated sensitivity is 61% on average with the default option and goes up to 88% with the ``--local_max`` option. The missing *attC* sites are usually at the end of the array. The False positive rate with the ``--local_max`` option is estimated between 0.03 False Positive per Megabases (FP/Mb) to 0.72 FP/Mb. This leads to a probability of finding 2 consecutive *attC* sites within 4kb between 4.10^-6 and 7.10^-9. Finally, this parameters do not depend on the G+C percent of the given replicon.
 
 |benchmark|
 
diff --git a/doc/source/tutorial.rst b/doc/source/tutorial.rst
index 9c9bc63..47f30d2 100644
--- a/doc/source/tutorial.rst
+++ b/doc/source/tutorial.rst
@@ -93,9 +93,21 @@ INFERNAL::
 
 Default is 1.
 
+Circularity
+-----------
+
+By default, IntegronFinder assumes your replicon to be circular. However, if they aren't, or if it's PCR fragments or contigs, you can specify that it's a linear fragment::
+
+    integron_finder mylinearsequence.fst --linear
+
+However, if ``--linear`` is not used and the replicon is smaller than ``4 x dt``
+(where ``dt`` is the distance threshold, so 4kb by default), the replicon is
+considered linear to avoid clustering problem
+
+
 .. _advance:
 
-Advanced use
+Advanced options
 ============
 
 .. _distance_threshold:
@@ -118,6 +130,10 @@ or, equivalently::
 
 This sets the threshold for clustering to 10 kb.
 
+.. note::
+    The option ``--outdir`` allows you to chose the location of the Results folder (``Results_Integron_Finder_mysequence``). If this folder already exists, IntegronFinder will not re-run analyses already done, except functional annotation. It allows you to re-run rapidly IntegronFinder with a different ``--distance_threshold`` value. Functional annotation needs to re-run each time because depending on the aggregation parameters, the proteins associated with an integron might change.
+
+
 *attC* evalue
 -------------
 
@@ -129,19 +145,6 @@ to the cost of a much higher false positive rate.
 
     integron_finder mysequence.fst --evalue_attc 5
 
-Circularity
------------
-
-By default, IntegronFinder assumes replicon to be circular. However, if they
-aren't, or if it's PCR fragments or contigs, you can specify that it's a linear
-fragment::
-
-    integron_finder mylinearsequence.fst --linear
-
-However, if ``--linear`` is not used and the replicon is smaller than ``4 x dt``
-(where ``dt`` is the distance threshold, so 4kb by default), the replicon is
-considered linear to avoid clustering problem
-
 Palindromes
 -----------
 
diff --git a/integron_finder b/integron_finder
index 47df067..f18da34 100755
--- a/integron_finder
+++ b/integron_finder
@@ -1386,7 +1386,7 @@ Python {1}
  - citation:
 
  Automatic and accurate identification of integrons and cassette arrays in bacterial genomes reveals unexpected patterns
- Jean Cury, Thomas Jové, Marie Touchon, Bertrand Néron, Eduardo PC Rocha
+ Jean Cury, Thomas Jové, Marie Touchon, Bertrand Néron, Eduardo PC Rocha
  bioRxiv doi: http://dx.doi.org/10.1101/030866
  """.format(version, sys.version)
     return version_text