15 Jan 15:51

JeanMainguy

2f76ba2

PPanGGOLiN 2.0.0

New commands

projection: to annotate external genomes using an existing pre-computed reference pangenome (#119, see doc).
rgp_cluster: to cluster RGP based on their gene family content (#117, see doc).
metadata: add metadata linked to various pangenome elements using simple TSV files (#111, see doc).
the write command is split into two commands (#140):
- write_pangenome: write outputs at the pangenome level (see doc)
- write_genomes: write genome outputs with pangenome annotation. Three formats are available for outputs: table, GFF and JSON Proksee (#139, see doc).
utils: a small side command to generate a default configuration file for any command (#112, see doc).

New features

A new, improved documentation hosted on Read the Docs replaces the GitHub wiki.
GFF export of genomes with pangenome annotation (#139, see doc).
JSON Map for Proksee to interactively visualize each genome and its pangenome annotation (#139, see doc).
Configuration file can now be used to set all or some parameters of PPanGGOLiN commands (#112, see doc).

Major change

BREAKING: New structure of the pangenome file to make it much lighter and faster to read (#110). ⚠️ Breaks compatibility with PPanGGOLiN v1 pangenome HDF5 files.

Minor change

Prodigal is replaced by pyrodigal for the annotation command (#138).
The context command has a window parameter to define the number of neighboring genes that are considered on each side of a gene of interest when searching for contexts (#137, see doc).
For draw --spots, the all option keyword is replaced by the synteny option keyword to draw spots with different RGP syntenies. The all keyword now draws all pangenome spots (#129).
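
The window parameter of the context command, mentioned above, can be pictured with a minimal sketch. It assumes a contig is a plain ordered list of gene identifiers and treats the contig as linear. The helper name neighbors_in_window is hypothetical and is not part of PPanGGOLiN.

```python
def neighbors_in_window(contig, gene, window):
    """Return up to `window` genes on each side of `gene` in a linear contig.

    Hypothetical helper for illustration only; not a PPanGGOLiN function.
    """
    i = contig.index(gene)
    left = contig[max(0, i - window):i]    # genes before the gene of interest
    right = contig[i + 1:i + 1 + window]   # genes after the gene of interest
    return left, right


contig = ["g1", "g2", "g3", "g4", "g5", "g6"]
left, right = neighbors_in_window(contig, "g3", window=2)
print(left, right)
```

Near a contig edge, fewer than window genes are returned on that side.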

Bug Fixes

Writing out only the RGP and spot of the gene with --projection (#130). Please note that, in version 2, the --projection parameter in the write command has been renamed to --table and now belongs to the write_genomes command (check the documentation of the write_genomes command for more details).
Make clustering deterministic (#116)

19 Feb 17:31

axbazin

1.2.105

161035e

PPanGGOLiN 1.2.105

A lot of the code was rewritten, but that should be relatively transparent for users

Bugfixes

Shell subpartitions are properly saved in the HDF5 file and used in the different figures
--meta option for annotate to annotate genomes using the metagenome mode of prodigal
--single_copy for msa to compute MSA using 'mostly single copy' persistent genes only
Cope with drawing more than 2000 identical spots in the same figure

28 Feb 15:25

jpjarnoux

1.2.74

7622e33

PPanGGOLiN 1.2.74

New commands

metrics to compute a list of metrics about the pangenome.

New features

The projection option of the write subcommand now also gives information about RGPs, spots and modules.
The new metrics command makes it possible to compute the genomic fluidity of the pangenome.
The new metrics command also gives more information about modules, computed and shown as statistics on the families and partitions of modules.
All metrics are saved in the pangenome file and can be printed with the info subcommand.
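
As a rough illustration of the genomic fluidity mentioned above, here is a minimal sketch of the commonly used pairwise definition: the average, over all genome pairs, of the number of gene families found in only one genome of the pair divided by the sum of the family counts of both genomes. It assumes each genome is a non-empty set of family identifiers. It is not taken from the metrics command, whose implementation may differ, and genomic_fluidity is a hypothetical name.

```python
from itertools import combinations


def genomic_fluidity(genomes):
    """Pairwise genomic fluidity for a dict {genome name: set of gene families}.

    Illustrative sketch only; assumes every genome has at least one family.
    """
    pairs = list(combinations(genomes.values(), 2))
    if not pairs:
        raise ValueError("at least two genomes are required")
    total = 0.0
    for fam_a, fam_b in pairs:
        unique = len(fam_a ^ fam_b)  # families present in only one genome of the pair
        total += unique / (len(fam_a) + len(fam_b))
    return total / len(pairs)


genomes = {
    "A": {"f1", "f2", "f3"},
    "B": {"f1", "f2", "f4"},
    "C": {"f1", "f5"},
}
print(round(genomic_fluidity(genomes), 3))
```

A value of 0 means all genomes share exactly the same families, while values closer to 1 indicate that genome pairs share few families.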

Bug fixes

fix crazy cluster assignment in clustering step
fix align when using a pangenome constructed from user-provided annotations with prokka-like identifiers
fix the option checks of a few subcommands to give more helpful information to the user

22 Dec 10:28

axbazin

1.2.63

ed270c6

PPanGGOLiN 1.2.63

identifiers used in provided annotation files (gff or gbff, through --anno) will be used by default, unless they are not unique within the pangenome
additional column in context output indicating family partition
Always save the gene sequences when building a pangenome (which gives more flexibility when doing additional analysis with ppanggolin)

Bugfixes

fix a bug preventing you from doing a new clustering if partitions were not computed
fix genome sizes drawn with draw --spots

30 Nov 10:11

axbazin

1.2.46

e789a4d

PPanGGOLiN 1.2.46

New commands

module to predict conserved modules in variable parts of a pangenome
context to find which gene families are conserved in the same genomic context as sequences of interest
all to run all possible analyses with PPanGGOLiN.
panmodule to run the panModule workflow

Bug fixes

improved pseudogene reading and gff/gbff parsing
fixed gff parser to cope with bakta gff files (reported in #66)
fixed gexf formatting in the rare case of having '&' in the 'product' field of gene annotations (reported in #61)
fixed rare crash happening when a partition has only 1 gene family (see #64)
fixed compilation issue with gcc 10.* and above (reported in #69)

Other:

Allow computing K=2 if forced by the user in partition or rarefaction (by default, K is still picked between 3 and 20) (see #65).
removed R, rpy2 and genoPlot-R dependencies (#47 shall never be a problem anymore)
added a new bokeh dependency
removed spot --draw_hotspots and related options. To achieve the same result, use draw --spots once the spots have been computed.
added a --spots option to draw to have interactive figures for spots of interest, replacing the former figures drawn with R.
align can compare a set of sequences of interest to a pangenome, and draw related elements, but cannot compare a genome to a pangenome anymore

24 Feb 09:36

axbazin

1.1.136

2d8ee46

PPanGGOLiN 1.1.136

Bug fixes:

cope with gff3 files without '##sequence-region' (such as Anvi'o's and JGI's) (reported both in #48 and #56)
Do not create a new organism when reading fasta files for sequences when the organism was not in the previously read annotations (reported in #48)
For the 'msa' command, the --phylo option now actually works

15 Feb 08:50

axbazin

1.1.131

e387654

PPanGGOLiN 1.1.131

New 'fasta' command to write more easily fasta files of parts (or all) of the pangenome (making #38, along with other demands, possible)
New 'msa' command to compute MSA from specific families (such as the core genome for additional phylogenetic analysis) (first suggested by #28). Will do so using only genes that are present in only one copy in each organism for each family.
Switching from lz4 to zstd for better compression for .h5 files (about 30% less disk space used with equivalent i/o speed)
More unit tests (Thanks to @sletort )
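
The single-copy rule described for the 'msa' command above can be sketched as follows. This is one possible reading of the rule, assuming a family is a dict mapping each organism to the list of its genes in that family. single_copy_genes is a hypothetical helper and does not reflect the actual PPanGGOLiN code.

```python
def single_copy_genes(family):
    """Keep the gene of every organism that carries exactly one copy of the family.

    Hypothetical helper; `family` maps organism name -> list of gene identifiers.
    """
    return {org: genes[0] for org, genes in family.items() if len(genes) == 1}


family = {
    "org1": ["org1_g10"],
    "org2": ["org2_g4", "org2_g57"],  # two copies: excluded from the alignment
    "org3": ["org3_g8"],
}
print(single_copy_genes(family))
```

Organisms with several copies of the family (or none) contribute no sequence to the alignment of that family under this reading.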

Bug Fixes

Cope with RAST-style gene identifiers (RAST gene identifiers were not used previously) (noticed in #44 )
Compute spots properly when the contig is circular (spots were not computed in circular contigs previously) (maybe fixes #43?)
Proper 'softcore' filter behavior when writing the gexf files (filter was too restrictive and did not include all softcore families)

04 Sep 14:18

axbazin

1.1.96

3c089bd

PPanGGOLiN 1.1.96

can customize the clustering mode of mmseqs2
add the possibility to read pseudogenes in the 'align' submodule
defrag with connected component clustering is now the default clustering strategy
- due to this, there is a new option --no-defrag to use the previous clustering strategy
improved option checking

Bug fixes:

cope with contigs having identical identifiers in different genomes (see #34)
can change the duplication margin in 'rgp' without crashing
other minor technical bugs in 'align'

14 May 12:01

axbazin

1.1.85

a8fbc98

PPanGGOLiN 1.1.85

allows the extraction of spot borders
The gene_families.tsv now includes a third column with an 'F' if the gene is considered to be a fragment of the gene family (--families_tsv option in the 'write' module)

Bug fixes :

fragment information is actually saved in the HDF5 file now (when using --defrag option with 'cluster', 'workflow' or 'panrgp')
cope with features not following the 'tag=value' format when reading a gff3 file. They will now just be ignored. An error will be raised if a gff3 object does not have the required attributes. (see #29)
In case of an error due to file formatting when reading a cluster file (--clusters with 'cluster', 'workflow' or 'panrgp'), the error will now include the line number that caused the error (see #30)
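
The gff3 behavior described above (ignore attribute entries that are not 'tag=value', fail when a required attribute is missing) can be illustrated with a minimal sketch. The function names are hypothetical, and the choice of "ID" as the required attribute is an assumption made for the example, not a statement about what PPanGGOLiN requires.

```python
def parse_gff3_attributes(column9):
    """Parse the attributes column of a gff3 line into a dict.

    Entries that do not follow 'tag=value' are silently ignored.
    """
    attributes = {}
    for field in column9.strip().split(";"):
        if "=" not in field:
            continue  # malformed or empty entry: skipped rather than raising
        tag, value = field.split("=", 1)
        attributes[tag.strip()] = value.strip()
    return attributes


def check_required(attributes, required=("ID",)):
    """Raise if a required attribute is absent ("ID" is an assumed example)."""
    missing = [tag for tag in required if tag not in attributes]
    if missing:
        raise ValueError(f"missing required gff3 attribute(s): {', '.join(missing)}")


attrs = parse_gff3_attributes("ID=gene1;Name=dnaA;orphan_entry;note=a=b")
check_required(attrs)
print(attrs)
```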

25 Mar 10:24

axbazin

1.1.72

8db3411

PPanGGOLiN release 1.1.72

added the 'rgp' subcommand to predict regions of genomic plasticity
added the 'spot' subcommand to predict spots of insertion in the pangenome
added a few new output files in 'write' related to the two previous commands
the 'write' subcommand has been improved
added the 'panrgp' workflow. It will probably replace the main 'workflow' command in the future but for the next few versions, they will live side by side.
improved logging and added timing at the end of the 'panrgp' workflow.

pangenomes that were computed with previous versions should be compatible with the new commands.

Releases: labgem/PPanGGOLiN
