kurimu

Kurīmu (meaning “cream” in Japanese) is a highly curated pangenome data collection, with metrics.

Kurīmu クリーム

About the Database

クリーム

Kurīmu (meaning "cream" in Japanese) is a publicly accessible, highly curated pangenome data collection. It provides consistent, validated information that documents published reports for a wide variety of pangenomes at various taxonomic levels. This information was extracted from hundreds of research papers published in peer-reviewed scientific journals and incorporated into a database to support consistency and reproducibility of the reported and/or adapted results.

Information Provided

The Kurīmu data collection offers organized, ready-to-use information for hundreds of pangenomes, corresponding to over 20,000 individual genome sequences. The collection can serve as an entry point for looking up pangenome properties by taxon, pangenome size, NCBI taxonomy identifier, and literature citation. Information has been harvested for pangenomes at all taxonomic levels. The fields are as follows:

-Pangenome: The official scientific name of the taxon; please note that in some cases there are phenotypic (polyphyletic) groups, such as photosynthetic prokaryotes

-Unique_ID: A unique identifier for the entry, constructed as an MD5 hash of the Pangenome (name), Effective (size) and Reference (first author name and year of publication) fields

-NCBI_txid: The Taxonomy identifier provided by the NCBI database linking to other data resources

-Effective: The ‘true’ number of genomes used for the calculation of pangenome parameters

-Level: The taxonomic level to which the taxon belongs

-Pan: The pangenome size of the entry

-Core: The number of core genes of the entry

-Peripheral: The number of peripheral genes of the entry

-Unique: The number of unique genes of the entry

-Core_pan: The percentage of the pangenome belonging to the core set (an index of coherence: the higher, the tighter the pangenome)

-Shell_eff: The ratio of unique genes per genome (an index of ‘uniqueness’/dispersion: the lower, the tighter the pangenome)

-Reference: First author surname and year of publication for the published report

-DOI: The digital object identifier of the corresponding publication

-Gene_cluster: Indicates whether the pangenome partitioning refers to traditional protein family clusters (C, red in figure above) or to the more recent use of the term for gene-level variation (G, green in figure above)
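
The derived fields above (Unique_ID, Core_pan, Shell_eff) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `PangenomeEntry` class, the function names, the `|` separator and field order used for the MD5 input, and the reading of Shell_eff as Unique divided by Effective (taken from the field description). The recipe actually used to build Kurīmu may differ, and the example entry uses made-up numbers.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class PangenomeEntry:
    """Hypothetical container holding a few Kurīmu fields."""
    pangenome: str   # Pangenome: taxon name
    effective: int   # Effective: number of genomes used
    pan: int         # Pan: pangenome size
    core: int        # Core: number of core genes
    unique: int      # Unique: number of unique genes
    reference: str   # Reference: first author surname and year


def core_pan(entry: PangenomeEntry) -> float:
    """Percentage of the pangenome that belongs to the core set."""
    return 100.0 * entry.core / entry.pan


def shell_eff(entry: PangenomeEntry) -> float:
    """Unique genes per genome (formula assumed from the field description)."""
    return entry.unique / entry.effective


def unique_id(entry: PangenomeEntry, sep: str = "|") -> str:
    """MD5 hex digest over name, effective size and reference.

    The separator and field order are assumptions, not the documented recipe.
    """
    key = sep.join([entry.pangenome, str(entry.effective), entry.reference])
    return hashlib.md5(key.encode("utf-8")).hexdigest()


# Made-up entry, for illustration only.
example = PangenomeEntry("Examplea demo", 10, 5000, 2000, 500, "Doe 2020")
print(core_pan(example))        # 40.0
print(shell_eff(example))       # 50.0
print(len(unique_id(example)))  # 32 (an MD5 hex digest is 32 characters)
```

Because the identifier depends only on the name, effective size and reference, two entries for the same taxon from different publications would receive different identifiers under this sketch.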

The boolean fields DS1-DS4 correspond to subsets (see Table 3, original publication), in lieu of Data Supplements: DS1 for all pangenomes; DS2 for gene-level pangenomes and missing values for family clusters; DS3 when C+P+U=T (see Box 2, original publication); DS4 for duplicate entries with variable counts for pangenome sets.
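
The DS3 criterion, C+P+U=T, can be checked per entry with a short Python sketch. It assumes that C, P and U are the Core, Peripheral and Unique counts, that T is the Pan value, and that an entry with any missing count fails the check; these are readings of the field list, not definitions taken from the original publication. The function name and the records are hypothetical.

```python
from typing import Optional


def partitions_sum_to_total(core: Optional[int], peripheral: Optional[int],
                            unique: Optional[int], pan: Optional[int]) -> bool:
    """True when core + peripheral + unique equals the pangenome size.

    A missing count (None) makes the check fail rather than guess a value.
    """
    counts = (core, peripheral, unique, pan)
    if any(c is None for c in counts):
        return False
    return core + peripheral + unique == pan


# Made-up records, for illustration only.
records = [
    {"Pangenome": "Taxon A", "Core": 2000, "Peripheral": 2500, "Unique": 500, "Pan": 5000},
    {"Pangenome": "Taxon B", "Core": 1500, "Peripheral": None, "Unique": 300, "Pan": 4000},
    {"Pangenome": "Taxon C", "Core": 1000, "Peripheral": 1000, "Unique": 100, "Pan": 2500},
]

consistent = [
    r["Pangenome"] for r in records
    if partitions_sum_to_total(r["Core"], r["Peripheral"], r["Unique"], r["Pan"])
]
print(consistent)  # ['Taxon A']
```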

How to access

Follow the link to start browsing Kurīmu.

List of pangenome analysis methods

A	K	panX
AGAPE	KinFin	PGAdb-builder
...	M	PanTools
B	MCL	Phandango
BPGA	Mugsy-A	PanWeb †
Bloom FT	MetaRef †	PanViz
BGDMdocker	micropan	Panaconda
...	MSPminer	PanGeT
C	...	PanACEA
CAMBer	N	PanGeneHome
...	NGSPanPipe	Piggy
E	...	PanVC
EDGAR	P	PGAP-X
eCAMBer	PanCGH †	...
EUPAN	progMauve	R
...	Panseq †	Roary
G	PGAT	RPAN
get_phylomarkers	PanOCT	...
get_homologues	PGAP	S
...	PanCake	SOP (pg) †
H	PanFunPro	SplitMEM
Hierarchicalsets	Pannotator †	seqana
Harvest	PanGP †	Scoary
...	PanTetris †	seq-seq-pan
I	PanFP	...
ITEP	Prokka	V
...	PanCoreGen	‘VarDetPGI’
