diff --git a/EDAM_dev.owl b/EDAM_dev.owl index 7bf6cb4..0df9092 100644 --- a/EDAM_dev.owl +++ b/EDAM_dev.owl @@ -1,46 +1,52 @@ - + xmlns:oboLegacy="http://purl.obolibrary.org/obo/" + xmlns:edamontology="http://edamontology.org#"> - 4020 - 16.08.2021 13:44 UTC - EDAM http://edamontology.org/ "EDAM relations and concept properties" - EDAM_data http://edamontology.org/data_ "EDAM types of data" - EDAM_format http://edamontology.org/format_ "EDAM data formats" - EDAM_operation http://edamontology.org/operation_ "EDAM operations" - EDAM_topic http://edamontology.org/topic_ "EDAM topics" - EDAM editors: Jon Ison, Matúš Kalaš, Hervé Ménager, and Veit Schwämmle. Contributors: see http://edamontologydocs.readthedocs.io/en/latest/contributors.html. License: see http://edamontologydocs.readthedocs.io/en/latest/license.html. - EDAM is an ontology of well established, familiar concepts that are prevalent within bioinformatics, including types of data and data identifiers, data formats, operations and topics. EDAM is a simple ontology - essentially a set of terms with synonyms and definitions - organised into an intuitive hierarchy for convenient use by curators, software developers and end-users. EDAM is suitable for large-scale semantic annotations and categorisation of diverse bioinformatics resources. EDAM is also suitable for diverse application including for example within workbenches and workflow-management systems, software distributions, and resource registries. 
- - Veit Schwämmle + + 4039 + + 18.03.2023 00:33 UTC + EDAM http://edamontology.org/ "EDAM relations, concept properties, and subsets" + EDAM_data http://edamontology.org/data_ "EDAM types of data" + EDAM_format http://edamontology.org/format_ "EDAM data formats" + EDAM_operation http://edamontology.org/operation_ "EDAM operations" + EDAM_topic http://edamontology.org/topic_ "EDAM topics" + EDAM is a community project and its development can be followed and contributed to at https://github.com/edamontology/edamontology. + EDAM is particularly suitable for semantic annotations and categorisation of diverse resources related to data analysis and management: e.g. tools, workflows, learning materials, or standards. EDAM is also useful in data management itself, for recording provenance metadata of processed data. + https://github.com/edamontology/edamontology/graphs/contributors and many more! Hervé Ménager Jon Ison Matúš Kalaš + EDAM is a domain ontology of data analysis and data management in bio- and other sciences, and science-based applications. It comprises concepts related to analysis, modelling, optimisation, and data life-cycle. Targetting usability by diverse users, the structure of EDAM is relatively simple, divided into 4 main sections: Topic, Operation, Data (incl. Identifier), and Format. 
application/rdf+xml - Bioinformatics operations, data types, formats, identifiers and topics + EDAM - The ontology of data analysis and management + + 1.26_dev - concept_properties "EDAM concept properties" - data "EDAM types of data" - edam "EDAM" - formats "EDAM data formats" - identifiers "EDAM types of identifiers" - operations "EDAM operations" - relations "EDAM relations" - topics "EDAM topics" - Jon Ison, Matúš Kalaš, Hervé Ménager + + + + + + + + + + + Matúš Kalaš @@ -59,15 +65,29 @@ + + + + + + + + + + + + + + - 1.13 - true + 1.13 + true Publication reference 'Citation' concept property ('citation' metadata tag) contains a dereferenceable URI, preferrably including a DOI, pointing to a citeable publication of the given data format. Publication - concept_properties + Citation @@ -82,20 +102,28 @@ - true + true Version in which a concept was created. - concept_properties + Created in + + + + + + + + - true + true A comment explaining why the comment should be or was deprecated, including name of person commenting (jison, mkalas etc.) - concept_properties + deprecation_comment @@ -104,11 +132,20 @@ - true + true 'Documentation' trailing modifier (qualifier, 'documentation') of 'xref' links of 'Format' concepts. When 'true', the link is pointing to a page with explanation, description, documentation, or specification of the given data format. Specification - concept_properties + Documentation + + + + + + + + + @@ -116,9 +153,9 @@ - true + true 'Example' concept property ('example' metadata tag) lists examples of valid values of types of identifiers (accessions). Applicable to some other types of data, too. - concept_properties + Separated by bar ('|'). For more complex data and data formats, it can be a link to a website with examples, instead. Example @@ -128,12 +165,29 @@ - true + true 'File extension' concept property ('file_extension' metadata tag) lists examples of usual file extensions of formats. 
- concept_properties + N.B.: File extensions that are not correspondigly defined at http://filext.com are recorded in EDAM only if not in conflict with http://filext.com, and/or unique and usual within life-science computing. Separated by bar ('|'), without a dot ('.') prefix, preferrably not all capital characters. File extension + + + + + + + + + + + + + + + + + @@ -147,16 +201,25 @@ + + + + + + + + - true + true 'Information standard' trailing modifier (qualifier, 'information_standard') of 'xref' links of 'Format' concepts. When 'true', the link is pointing to an information standard supported by the given data format. Minimum information checklist Minimum information standard - concept_properties + "Supported by the given data format" here means, that the given format enables representation of data that satisfies the information standard. Information standard + @@ -164,9 +227,9 @@ - true + true When 'true', the concept has been proposed to be deprecated. - concept_properties + deprecation_candidate @@ -175,9 +238,9 @@ - true + true When 'true', the concept has been proposed to be refactored. - concept_properties + refactor_candidate @@ -186,9 +249,9 @@ - true + true When 'true', the concept has been proposed or is supported within Debian as a tag. - concept_properties + isdebtag @@ -197,11 +260,12 @@ - true + true 'Media type' trailing modifier (qualifier, 'media_type') of 'xref' links of 'Format' concepts. When 'true', the link is pointing to a page specifying a media type of the given data format. MIME type - concept_properties + Media type + @@ -215,20 +279,28 @@ - true + true Whether terms associated with this concept are recommended for use in annotation. - concept_properties + notRecommendedForAnnotation + + + + + + + + - true + true Version in which a concept was made obsolete. - concept_properties + Obsolete since @@ -237,9 +309,9 @@ - true + true EDAM concept URI of the erstwhile "parent" of a now deprecated concept. 
- concept_properties + Old parent @@ -248,9 +320,9 @@ - true + true EDAM concept URI of an erstwhile related concept (by has_input, has_output, has_topic, is_format_of, etc.) of a now deprecated concept. - concept_properties + Old related @@ -259,22 +331,31 @@ - true + true 'Ontology used' concept property ('ontology_used' metadata tag) of format concepts links to a domain ontology that is used inside the given data format, or contains a note about ontology use within the format. - concept_properties + Ontology used + + + + + + + + - true + true 'Organisation' trailing modifier (qualifier, 'organisation') of 'xref' links of 'Format' concepts. When 'true', the link is pointing to an organisation that developed, standardised, and maintains the given data format. - Organization - concept_properties + Organisation + Organization + @@ -282,9 +363,9 @@ - true + true A comment explaining the proposed refactoring, including name of person commenting (jison, mkalas etc.) - concept_properties + refactor_comment @@ -293,9 +374,9 @@ - true + true 'Regular expression' concept property ('regex' metadata tag) specifies the allowed values of types of identifiers (accessions). Applicable to some other types of data, too. - concept_properties + Regular expression @@ -305,7 +386,7 @@ 'Related term' concept property ('related_term'; supposedly a synonym modifier in OBO format) states a related term - not necessarily closely semantically related - that users (also non-specialists) may use when searching. - concept_properties + Related term @@ -315,11 +396,11 @@ - true + true 'Repository' trailing modifier (qualifier, 'repository') of 'xref' links of 'Format' concepts. When 'true', the link is pointing to the public source-code repository where the given data format is developed or maintained. 
Public repository Source-code repository - concept_properties + Repository @@ -328,14 +409,22 @@ - true + true Name of thematic editor (http://biotools.readthedocs.io/en/latest/governance.html#registry-editors) responsible for this concept and its children. - concept_properties + thematic_editor + + + + + + + + @@ -390,105 +479,45 @@ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + - - - + - + - + - + - - - + - + - + - + - + - + - + - + - + @@ -612,9 +641,61 @@ - + + + + + + + + + + + + + + + + + + + + + + + - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + @@ -647,13 +728,12 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A has_format B' defines for the subject A, that it has the object B as its data format. - edam - relations + false Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is (or is in a role of) 'Data', or an input, output, input or output argument of an 'Operation'. Object B can either be a concept that is a 'Format', or in unexpected cases an entity outside of an ontology that is a 'Format' or is in the role of a 'Format'. In EDAM, 'has_format' is not explicitly defined between EDAM concepts, only the inverse 'is_format_of'. has format @@ -668,14 +748,13 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A has_function B' defines for the subject A, that it has the object B as its function. OBO_REL:bearer_of - edam - relations + true Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is (or is in a role of) a function, or an entity outside of an ontology that is (or is in a role of) a function specification. 
In the scope of EDAM, 'has_function' serves only for relating annotated entities outside of EDAM with 'Operation' concepts. has function @@ -704,13 +783,12 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A has_identifier B' defines for the subject A, that it has the object B as its identifier. - edam - relations + false Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is an 'Identifier', or an entity outside of an ontology that is an 'Identifier' or is in the role of an 'Identifier'. In EDAM, 'has_identifier' is not explicitly defined between EDAM concepts, only the inverse 'is_identifier_of'. has identifier @@ -724,14 +802,13 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A has_input B' defines for the subject A, that it has the object B as a necessary or actual input or input argument. OBO_REL:has_participant - edam - relations + true Subject A can either be concept that is or has an 'Operation' function, or an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that has an 'Operation' function or is an 'Operation'. Object B can be any concept or entity. In EDAM, only 'has_input' is explicitly defined between EDAM concepts ('Operation' 'has_input' 'Data'). The inverse, 'is_input_of', is not explicitly defined. has input @@ -759,14 +836,13 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A has_output B' defines for the subject A, that it has the object B as a necessary or actual output or output argument. 
OBO_REL:has_participant - edam - relations + true Subject A can either be concept that is or has an 'Operation' function, or an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that has an 'Operation' function or is an 'Operation'. Object B can be any concept or entity. In EDAM, only 'has_output' is explicitly defined between EDAM concepts ('Operation' 'has_output' 'Data'). The inverse, 'is_output_of', is not explicitly defined. has output @@ -801,13 +877,12 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A has_topic B' defines for the subject A, that it has the object B as its topic (A is in the scope of a topic B). - edam - relations + true Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is a 'Topic', or in unexpected cases an entity outside of an ontology that is a 'Topic' or is in the role of a 'Topic'. In EDAM, only 'has_topic' is explicitly defined between EDAM concepts ('Operation' or 'Data' 'has_topic' 'Topic'). The inverse, 'is_topic_of', is not explicitly defined. has topic @@ -830,14 +905,13 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A is_format_of B' defines for the subject A, that it is a data format of the object B. OBO_REL:quality_of - edam - relations + false Subject A can either be a concept that is a 'Format', or in unexpected cases an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is a 'Format' or is in the role of a 'Format'. Object B can be any concept or entity outside of an ontology that is (or is in a role of) 'Data', or an input, output, input or output argument of an 'Operation'. In EDAM, only 'is_format_of' is explicitly defined between EDAM concepts ('Format' 'is_format_of' 'Data'). 
The inverse, 'has_format', is not explicitly defined. is format of @@ -856,15 +930,14 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A is_function_of B' defines for the subject A, that it is a function of the object B. OBO_REL:function_of OBO_REL:inheres_in - edam - relations + true Subject A can either be concept that is (or is in a role of) a function, or an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is (or is in a role of) a function specification. Object B can be any concept or entity. Within EDAM itself, 'is_function_of' is not used. is function of @@ -897,13 +970,12 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A is_identifier_of B' defines for the subject A, that it is an identifier of the object B. - edam - relations + false Subject A can either be a concept that is an 'Identifier', or an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is an 'Identifier' or is in the role of an 'Identifier'. Object B can be any concept or entity outside of an ontology. In EDAM, only 'is_identifier_of' is explicitly defined between EDAM concepts (only 'Identifier' 'is_identifier_of' 'Data'). The inverse, 'has_identifier', is not explicitly defined. is identifier of @@ -916,14 +988,13 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A is_input_of B' defines for the subject A, that it as a necessary or actual input or input argument of the object B. OBO_REL:participates_in - edam - relations + true Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is or has an 'Operation' function, or an entity outside of an ontology that has an 'Operation' function or is an 'Operation'. 
In EDAM, 'is_input_of' is not explicitly defined between EDAM concepts, only the inverse 'has_input'. is input of @@ -950,14 +1021,13 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A is_output_of B' defines for the subject A, that it as a necessary or actual output or output argument of the object B. OBO_REL:participates_in - edam - relations + true Subject A can be any concept or entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated). Object B can either be a concept that is or has an 'Operation' function, or an entity outside of an ontology that has an 'Operation' function or is an 'Operation'. In EDAM, 'is_output_of' is not explicitly defined between EDAM concepts, only the inverse 'has_output'. is output of @@ -991,14 +1061,13 @@ - false - false - false - OBO_REL:is_a + false + false + false + OBO_REL:is_a 'A is_topic_of B' defines for the subject A, that it is a topic of the object B (a topic A is the scope of B). OBO_REL:quality_of - edam - relations + true Subject A can either be a concept that is a 'Topic', or in unexpected cases an entity outside of an ontology (or an ontology concept in a role of an entity being semantically annotated) that is a 'Topic' or is in the role of a 'Topic'. Object B can be any concept or entity outside of an ontology. In EDAM, 'is_topic_of' is not explicitly defined between EDAM concepts, only the inverse 'has_topic'. 
is topic of @@ -1019,16 +1088,35 @@ - + - - Matúš Kalaš - 2022-09-22T18:56:11.224295Z - skos:related + + + + + + + + + + + Magnus Palmblad + 2023-03-30T12:42:17.159554Z + hasBroadSynonym + + + + - - - - - - application/gpx+xml - melibleq - 2021-08-26T04:29:11.969493Z - GPS Exchange Format - GPX - - - - - - - - - - - melibleq - 2021-08-26T04:33:21.310981Z - OpenStreetMap File Formats - OSM Formats - TODO recommended to use binary PBF format instead - OSM XML - - - - - - - - - - - melibleq - 2021-08-26T04:36:50.785363Z - Digital Elevation Model - DEM - - - - - - - @@ -25300,8 +25343,8 @@ - Air pressure level - Topographic profile + Air pressure level + Topographic profile Altitude Bathymetry Depth @@ -25318,7 +25361,7 @@ - Polygon mesh + Polygon mesh Matúš Kalaš 2021-08-26T15:56:31.557993Z Geospatial grid @@ -25366,7 +25409,7 @@ - Spatial index + Spatial index Spatial coordinate reference system Geospatial reference system Geospatial coordinate system @@ -25424,7 +25467,7 @@ - Geospatial geometry + Geospatial geometry Geographical area specification @@ -25558,8 +25601,8 @@ - Triangle mesh - Triangular facets + Triangle mesh + Triangular facets 2022-05-06T09:06:57.567907Z TIN Triangular irregular network @@ -25785,20 +25828,111 @@ + + + + + + + + + + + bianchini + 2023-03-06T15:25:28.463042Z + A document describing the operations that you will be performing to acquire, manage, document, store, share, and preserve your data. This includes budgeting for these operations when required. + DMP + Data management plan + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + bianchini + 2023-03-06T15:32:37.480761Z + A standard for DMPs developed by the Research Data Alliance + maDMP + Machine-actionable DMP + + + + + + + + + + + + + + + + + A related search term with a different scope + Matúš Kalaš + 2023-02-24T09:45:21.41427Z + Slightly broader meaning + A TEMPLATE for Data concepts in EDAM. 
+ The same thing (TSG) + Slightly narrower meaning + Mostly overlapping, but not exact, narrower, or broader. + Mandatory when released: rdfs:label, hasDefinition. +Mandatory but can be semi-automated: Created in, subsets, ... +Optional: rdfs:comment(s), synonyms and related terms; has topic (if fits); rdfs:seeAlso to a Wikipedia article and a match link to a WikiData item (if these exist) +Removed for release: created_by, creation_date, skos:editorialNote(s) + Optional, zero or more. A comment adds important information to the definition, synonyms, external links. May also be "not to be confused with". + {Data TEMPLATE} + + SKOS 'editorial note' comment is just an editorial comment that will not be released. E.g. TODO - Improve this TEMPLATE! + + + + + - beta12orEarlier - - + beta12orEarlier + + Chemical structure specified in Simplified Molecular Input Line Entry System (SMILES) line notation. - - + + SMILES - + @@ -25808,10 +25942,10 @@ - beta12orEarlier + beta12orEarlier Chemical structure specified in IUPAC International Chemical Identifier (InChI) line notation. - - + + InChI @@ -25822,10 +25956,10 @@ - beta12orEarlier + beta12orEarlier Chemical structure specified by Molecular Formula (MF), including a count of each element in a compound. - - + + The general MF query format consists of a series of valid atomic symbols, with an optional number or range. mf @@ -25837,10 +25971,10 @@ - beta12orEarlier + beta12orEarlier The InChIKey (hashed InChI) is a fixed length (25 character) condensed digital representation of an InChI chemical structure specification. It uniquely identifies a chemical compound. - - + + An InChIKey identifier is not human- nor machine-readable but is more suitable for web searches than an InChI chemical structure specification. InChIKey @@ -25851,10 +25985,10 @@ - beta12orEarlier + beta12orEarlier SMILES ARbitrary Target Specification (SMARTS) format for chemical structure specification, which is a subset of the SMILES line notation. 
- - + + smarts @@ -25865,10 +25999,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a molecular sequence with possible unknown positions but without ambiguity or non-sequence characters. - - + + unambiguous pure @@ -25879,10 +26013,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a nucleotide sequence with possible ambiguity, unknown positions and non-sequence characters. - - + + Non-sequence characters may be used for example for gaps. nucleotide http://onto.eva.mpg.de/ontologies/gfo-bio.owl#Nucleotide_sequence @@ -25895,10 +26029,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a protein sequence with possible ambiguity, unknown positions and non-sequence characters. - - + + Non-sequence characters may be used for gaps and translation stop. protein http://onto.eva.mpg.de/ontologies/gfo-bio.owl#Amino_acid_sequence @@ -25911,10 +26045,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for the consensus of two or more molecular sequences. - - + + consensus @@ -25925,10 +26059,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a nucleotide sequence with possible ambiguity and unknown positions but without non-sequence characters. - - + + pure nucleotide @@ -25939,10 +26073,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a nucleotide sequence (characters ACGTU only) with possible unknown positions but without ambiguity or non-sequence characters . - - + + unambiguous pure nucleotide @@ -25952,10 +26086,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a DNA sequence with possible ambiguity, unknown positions and non-sequence characters. - - + + dna http://onto.eva.mpg.de/ontologies/gfo-bio.owl#DNA_sequence @@ -25966,10 +26100,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for an RNA sequence with possible ambiguity, unknown positions and non-sequence characters. 
- - + + rna http://onto.eva.mpg.de/ontologies/gfo-bio.owl#RNA_sequence @@ -25981,10 +26115,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a DNA sequence (characters ACGT only) with possible unknown positions but without ambiguity or non-sequence characters. - - + + unambiguous pure dna @@ -25995,10 +26129,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a DNA sequence with possible ambiguity and unknown positions but without non-sequence characters. - - + + pure dna @@ -26009,10 +26143,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for an RNA sequence (characters ACGU only) with possible unknown positions but without ambiguity or non-sequence characters. - - + + unambiguous pure rna sequence @@ -26023,10 +26157,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for an RNA sequence with possible ambiguity and unknown positions but without non-sequence characters. - - + + pure rna @@ -26037,10 +26171,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for any protein sequence with possible unknown positions but without ambiguity or non-sequence characters. - - + + unambiguous pure protein @@ -26051,10 +26185,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for any protein sequence with possible ambiguity and unknown positions but without non-sequence characters. - - + + pure protein @@ -26064,12 +26198,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from UniGene. - + A UniGene entry includes a set of transcript sequences assigned to the same transcription locus (gene or expressed pseudogene), with information on protein similarities, gene expression, cDNA clone reagents, and genomic location. UniGene entry format true @@ -26081,12 +26215,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the COG database of clusters of (related) protein sequences. 
- + COG sequence cluster format true @@ -26098,11 +26232,11 @@ - beta12orEarlier + beta12orEarlier Format for sequence positions (feature location) as used in DDBJ/EMBL/GenBank database. Feature location - - + + EMBL feature location @@ -26113,10 +26247,10 @@ - beta12orEarlier + beta12orEarlier Report format for tandem repeats in a nucleotide sequence (format generated by the Sanger Centre quicktandem program). - - + + quicktandem @@ -26127,10 +26261,10 @@ - beta12orEarlier + beta12orEarlier Report format for inverted repeats in a nucleotide sequence (format generated by the Sanger Centre inverted program). - - + + Sanger inverted repeats @@ -26141,10 +26275,10 @@ - beta12orEarlier + beta12orEarlier Report format for tandem repeats in a sequence (an EMBOSS report format). - - + + EMBOSS repeat @@ -26155,10 +26289,10 @@ - beta12orEarlier + beta12orEarlier Format of a report on exon-intron structure generated by EMBOSS est2genome. - - + + est2genome format @@ -26169,10 +26303,10 @@ - beta12orEarlier + beta12orEarlier Report format for restriction enzyme recognition sites used by EMBOSS restrict program. - - + + restrict format @@ -26183,10 +26317,10 @@ - beta12orEarlier + beta12orEarlier Report format for restriction enzyme recognition sites used by EMBOSS restover program. - - + + restover format @@ -26197,10 +26331,10 @@ - beta12orEarlier + beta12orEarlier Report format for restriction enzyme recognition sites used by REBASE database. - - + + REBASE restriction sites @@ -26211,10 +26345,10 @@ - beta12orEarlier + beta12orEarlier Format of results of a sequence database search using FASTA. - - + + This includes (typically) score data, alignment data and a histogram (of observed and expected distribution of E values.) FASTA search results format @@ -26226,10 +26360,10 @@ - beta12orEarlier + beta12orEarlier Format of results of a sequence database search using some variant of BLAST. - - + + This includes score data, alignment data and summary table. 
BLAST results @@ -26241,10 +26375,10 @@ - beta12orEarlier + beta12orEarlier Format of results of a sequence database search using some variant of MSPCrunch. - - + + mspcrunch @@ -26255,10 +26389,10 @@ - beta12orEarlier + beta12orEarlier Format of results of a sequence database search using some variant of Smith Waterman. - - + + Smith-Waterman format @@ -26269,10 +26403,10 @@ - beta12orEarlier + beta12orEarlier Format of EMBASSY domain hits file (DHF) of hits (sequences) with domain classification information. - - + + The hits are relatives to a SCOP or CATH family and are found from a search of a sequence database. dhf @@ -26284,10 +26418,10 @@ - beta12orEarlier + beta12orEarlier Format of EMBASSY ligand hits file (LHF) of database hits (sequences) with ligand classification information. - - + + The hits are putative ligand-binding sequences and are found from a search of a sequence database. lhf @@ -26299,10 +26433,10 @@ - beta12orEarlier + beta12orEarlier Results format for searches of the InterPro database. - - + + InterPro hits format @@ -26312,10 +26446,10 @@ - beta12orEarlier + beta12orEarlier Format of results of a search of the InterPro database showing matches of query protein sequence(s) to InterPro entries. - - + + The report includes a classification of regions in a query protein sequence which are assigned to a known InterPro protein family or group. InterPro protein view report format @@ -26326,10 +26460,10 @@ - beta12orEarlier + beta12orEarlier Format of results of a search of the InterPro database showing matches between protein sequence(s) and signatures for an InterPro entry. - - + + The table presents matches between query proteins (rows) and signature methods (columns) for this entry. Alternatively the sequence(s) might be from from the InterPro entry itself. The match position in the protein sequence and match status (true positive, false positive etc) are indicated. 
InterPro match table format @@ -26341,10 +26475,10 @@ - beta12orEarlier + beta12orEarlier Dirichlet distribution HMMER format. - - + + HMMER Dirichlet prior @@ -26355,10 +26489,10 @@ - beta12orEarlier + beta12orEarlier Dirichlet distribution MEME format. - - + + MEME Dirichlet prior @@ -26369,10 +26503,10 @@ - beta12orEarlier + beta12orEarlier Format of a report from the HMMER package on the emission and transition counts of a hidden Markov model. - - + + HMMER emission and transition @@ -26383,10 +26517,10 @@ - beta12orEarlier + beta12orEarlier Format of a regular expression pattern from the Prosite database. - - + + prosite-pattern @@ -26397,10 +26531,10 @@ - beta12orEarlier + beta12orEarlier Format of an EMBOSS sequence pattern. - - + + EMBOSS sequence pattern @@ -26411,10 +26545,10 @@ - beta12orEarlier + beta12orEarlier A motif in the format generated by the MEME program. - - + + meme-motif @@ -26425,10 +26559,10 @@ - beta12orEarlier + beta12orEarlier Sequence profile (sequence classifier) format used in the PROSITE database. - - + + prosite-profile @@ -26439,10 +26573,10 @@ - beta12orEarlier + beta12orEarlier A profile (sequence classifier) in the format used in the JASPAR database. - - + + JASPAR format @@ -26453,10 +26587,10 @@ - beta12orEarlier + beta12orEarlier Format of the model of random sequences used by MEME. - - + + MEME background Markov model @@ -26467,10 +26601,10 @@ - beta12orEarlier + beta12orEarlier Format of a hidden Markov model representation used by the HMMER package. - - + + HMMER format @@ -26482,10 +26616,10 @@ - beta12orEarlier + beta12orEarlier FASTA-style format for multiple sequences aligned by HMMER package to an HMM. - - + + HMMER-aln @@ -26496,10 +26630,10 @@ - beta12orEarlier + beta12orEarlier Format of multiple sequences aligned by DIALIGN package. 
- - + + DIALIGN format @@ -26510,10 +26644,10 @@ - beta12orEarlier + beta12orEarlier EMBASSY 'domain alignment file' (DAF) format, containing a sequence alignment of protein domains belonging to the same SCOP or CATH family. - - + + The format is clustal-like and includes annotation of domain family classification information. daf @@ -26525,10 +26659,10 @@ - beta12orEarlier + beta12orEarlier Format for alignment of molecular sequences to MEME profiles (position-dependent scoring matrices) as generated by the MAST tool from the MEME package. - - + + Sequence-MEME profile alignment @@ -26539,10 +26673,10 @@ - beta12orEarlier + beta12orEarlier Format used by the HMMER package for an alignment of a sequence against a hidden Markov model database. - - + + HMMER profile alignment (sequences versus HMMs) @@ -26553,10 +26687,10 @@ - beta12orEarlier + beta12orEarlier Format used by the HMMER package for of an alignment of a hidden Markov model against a sequence database. - - + + HMMER profile alignment (HMM versus sequences) @@ -26567,10 +26701,10 @@ - beta12orEarlier + beta12orEarlier Format of PHYLIP phylogenetic distance matrix data. - - + + Data Type must include the distance matrix, probably as pairs of sequence identifiers with a distance (integer or float). Phylip distance matrix @@ -26582,10 +26716,10 @@ - beta12orEarlier + beta12orEarlier Dendrogram (tree file) format generated by ClustalW. - - + + ClustalW dendrogram @@ -26596,10 +26730,10 @@ - beta12orEarlier + beta12orEarlier Raw data file format used by Phylip from which a phylogenetic tree is directly generated or plotted. - - + + Phylip tree raw @@ -26610,10 +26744,10 @@ - beta12orEarlier + beta12orEarlier PHYLIP file format for continuous quantitative character data. - - + + Phylip continuous quantitative characters @@ -26623,12 +26757,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of phylogenetic property data. 
- + Phylogenetic property values format true @@ -26640,10 +26774,10 @@ - beta12orEarlier + beta12orEarlier PHYLIP file format for phylogenetics character frequency data. - - + + Phylip character frequencies format @@ -26654,10 +26788,10 @@ - beta12orEarlier + beta12orEarlier Format of PHYLIP discrete states data. - - + + Phylip discrete states format @@ -26668,10 +26802,10 @@ - beta12orEarlier + beta12orEarlier Format of PHYLIP cliques data. - - + + Phylip cliques format @@ -26682,10 +26816,10 @@ - beta12orEarlier + beta12orEarlier Phylogenetic tree data format used by the PHYLIP program. - - + + Phylip tree format @@ -26696,10 +26830,10 @@ - beta12orEarlier + beta12orEarlier The format of an entry from the TreeBASE database of phylogenetic data. - - + + TreeBASE format @@ -26710,10 +26844,10 @@ - beta12orEarlier + beta12orEarlier The format of an entry from the TreeFam database of phylogenetic data. - - + + TreeFam format @@ -26724,10 +26858,10 @@ - beta12orEarlier + beta12orEarlier Format for distances, such as Branch Score distance, between two or more phylogenetic trees as used by the Phylip package. - - + + Phylip tree distance format @@ -26738,10 +26872,10 @@ - beta12orEarlier + beta12orEarlier Format of an entry from the DSSP database (Dictionary of Secondary Structure in Proteins). - - + + The DSSP database is built using the DSSP application which defines secondary structure, geometrical features and solvent exposure of proteins, given atomic coordinates in PDB format. dssp @@ -26753,10 +26887,10 @@ - beta12orEarlier + beta12orEarlier Entry format of the HSSP database (Homology-derived Secondary Structure in Proteins). - - + + hssp @@ -26767,12 +26901,12 @@ - beta12orEarlier + beta12orEarlier Format of RNA secondary structure in dot-bracket notation, originally generated by the Vienna RNA package/server. 
Vienna RNA format Vienna RNA secondary structure format - - + + Dot-bracket format @@ -26783,10 +26917,10 @@ - beta12orEarlier + beta12orEarlier Format of local RNA secondary structure components with free energy values, generated by the Vienna RNA package/server. - - + + Vienna local RNA secondary structure format @@ -26808,11 +26942,11 @@ - beta12orEarlier + beta12orEarlier Format of an entry (or part of an entry) from the PDB database. PDB entry format - - + + PDB database entry format @@ -26823,11 +26957,11 @@ - beta12orEarlier + beta12orEarlier Entry format of PDB database in PDB format. PDB format - - + + PDB @@ -26838,10 +26972,10 @@ - beta12orEarlier + beta12orEarlier Entry format of PDB database in mmCIF format. - - + + mmCIF @@ -26852,10 +26986,10 @@ - beta12orEarlier + beta12orEarlier Entry format of PDB database in PDBML (XML) format. - - + + PDBML @@ -26865,11 +26999,11 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of a matrix of 3D-1D scores used by the EMBOSS Domainatrix applications. - + Domainatrix 3D-1D scoring matrix format true @@ -26882,10 +27016,10 @@ - beta12orEarlier + beta12orEarlier Amino acid index format used by the AAindex database. - - + + aaindex @@ -26895,12 +27029,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from IntEnz (The Integrated Relational Enzyme Database). - + IntEnz is the master copy of the Enzyme Nomenclature, the recommendations of the NC-IUBMB on the Nomenclature and Classification of Enzyme-Catalysed Reactions. IntEnz enzyme report format true @@ -26912,12 +27046,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the BRENDA enzyme database. - + BRENDA enzyme report format true @@ -26928,12 +27062,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the KEGG REACTION database of biochemical reactions. 
- + KEGG REACTION enzyme report format true @@ -26944,12 +27078,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the KEGG ENZYME database. - + KEGG ENZYME enzyme report format true @@ -26960,12 +27094,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the proto section of the REBASE enzyme database. - + REBASE proto enzyme report format true @@ -26976,12 +27110,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the withrefm section of the REBASE enzyme database. - + REBASE withrefm enzyme report format true @@ -26993,10 +27127,10 @@ - beta12orEarlier + beta12orEarlier Format of output of the Pcons Model Quality Assessment Program (MQAP). - - + + Pcons ranks protein models by assessing their quality based on the occurrence of recurring common three-dimensional structural patterns. Pcons returns a score reflecting the overall global quality and a score for each individual residue in the protein reflecting the local residue quality. Pcons report format @@ -27008,10 +27142,10 @@ - beta12orEarlier + beta12orEarlier Format of output of the ProQ protein model quality predictor. - - + + ProQ is a neural network-based predictor that predicts the quality of a protein model based on the number of structural features. ProQ report format @@ -27022,12 +27156,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of SMART domain assignment data. - + The SMART output file includes data on genetically mobile domains / analysis of domain architectures, including phyletic distributions, functional class, tertiary structures and functionally important residues. SMART domain assignment report format true @@ -27039,12 +27173,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the BIND database of protein interaction. 
- + BIND entry format true @@ -27055,12 +27189,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the IntAct database of protein interaction. - + IntAct entry format true @@ -27071,12 +27205,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the InterPro database of protein signatures (sequence classifiers) and classified sequences. - + This includes signature metadata, sequence references and a reference to the signature itself. There is normally a header (entry accession numbers and name), abstract, taxonomy information, example proteins etc. Each entry also includes a match list which give a number of different views of the signature matches for the sequences in each InterPro entry. InterPro entry format true @@ -27088,12 +27222,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the textual abstract of signatures in an InterPro entry and its protein matches. - + References are included and a functional inference is made where possible. InterPro entry abstract format true @@ -27105,12 +27239,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the Gene3D protein secondary database. - + Gene3D entry format true @@ -27121,12 +27255,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the PIRSF protein secondary database. - + PIRSF entry format true @@ -27137,12 +27271,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the PRINTS protein secondary database. - + PRINTS entry format true @@ -27153,12 +27287,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the Panther library of protein families and subfamilies. 
- + Panther Families and HMMs entry format true @@ -27169,12 +27303,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the Pfam protein secondary database. - + Pfam entry format true @@ -27185,12 +27319,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the SMART protein secondary database. - + SMART entry format true @@ -27201,12 +27335,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the Superfamily protein secondary database. - + Superfamily entry format true @@ -27217,12 +27351,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the TIGRFam protein secondary database. - + TIGRFam entry format true @@ -27233,12 +27367,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the ProDom protein domain classification database. - + ProDom entry format true @@ -27249,12 +27383,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the FSSP database. - + FSSP entry format true @@ -27266,10 +27400,10 @@ - beta12orEarlier + beta12orEarlier A report format for the kinetics of enzyme-catalysed reaction(s) in a format generated by EMBOSS findkm. This includes Michaelis Menten plot, Hanes Woolf plot, Michaelis Menten constant (Km) and maximum velocity (Vmax). - - + + findkm @@ -27279,12 +27413,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of Ensembl genome database. - + Ensembl gene report format true @@ -27295,12 +27429,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of DictyBase genome database. - + DictyBase gene report format true @@ -27311,12 +27445,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of Candida Genome database. 
- + CGD gene report format true @@ -27327,12 +27461,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of DragonDB genome database. - + DragonDB gene report format true @@ -27343,12 +27477,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of EcoCyc genome database. - + EcoCyc gene report format true @@ -27359,12 +27493,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of FlyBase genome database. - + FlyBase gene report format true @@ -27375,12 +27509,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of Gramene genome database. - + Gramene gene report format true @@ -27391,12 +27525,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of KEGG GENES genome database. - + KEGG GENES gene report format true @@ -27407,12 +27541,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of the Maize genetics and genomics database (MaizeGDB). - + MaizeGDB gene report format true @@ -27423,12 +27557,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of the Mouse Genome Database (MGD). - + MGD gene report format true @@ -27439,12 +27573,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of the Rat Genome Database (RGD). - + RGD gene report format true @@ -27455,12 +27589,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of the Saccharomyces Genome Database (SGD). - + SGD gene report format true @@ -27471,12 +27605,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of the Sanger GeneDB genome database. 
- + GeneDB gene report format true @@ -27487,12 +27621,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of The Arabidopsis Information Resource (TAIR) genome database. - + TAIR gene report format true @@ -27503,12 +27637,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of the WormBase genomes database. - + WormBase gene report format true @@ -27519,12 +27653,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of the Zebrafish Information Network (ZFIN) genome database. - + ZFIN gene report format true @@ -27535,12 +27669,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format of the TIGR genome database. - + TIGR gene report format true @@ -27551,12 +27685,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the dbSNP database. - + dbSNP polymorphism report format true @@ -27567,12 +27701,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the OMIM database of genotypes and phenotypes. - + OMIM entry format true @@ -27583,12 +27717,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of a record from the HGVbase database of genotypes and phenotypes. - + HGVbase entry format true @@ -27599,12 +27733,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of a record from the HIVDB database of genotypes and phenotypes. - + HIVDB entry format true @@ -27615,12 +27749,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the KEGG DISEASE database. - + KEGG DISEASE entry format true @@ -27632,10 +27766,10 @@ - beta12orEarlier + beta12orEarlier Report format on PCR primers and hybridisation oligos as generated by Whitehead primer3 program. 
- - + + Primer3 primer @@ -27646,10 +27780,10 @@ - beta12orEarlier + beta12orEarlier A format of raw sequence read data from an Applied Biosystems sequencing machine. - - + + ABI @@ -27660,10 +27794,10 @@ - beta12orEarlier + beta12orEarlier Format of MIRA sequence trace information file. - - + + mira @@ -27674,13 +27808,13 @@ - beta12orEarlier - - caf + beta12orEarlier + + caf Common Assembly Format (CAF). A sequence assembly format including contigs, base-call qualities, and other metadata. - - + + CAF @@ -27691,13 +27825,13 @@ - beta12orEarlier - + beta12orEarlier + Sequence assembly project file EXP format. Affymetrix EXP format EXP - - + + EXP @@ -27708,12 +27842,12 @@ - beta12orEarlier - + beta12orEarlier + Staden Chromatogram Files format (SCF) of base-called sequence reads, qualities, and other metadata. - - + + SCF @@ -27724,12 +27858,12 @@ - beta12orEarlier - + beta12orEarlier + PHD sequence trace format to store serialised chromatogram data (reads). - - + + PHD @@ -27746,11 +27880,11 @@ - beta12orEarlier + beta12orEarlier Format of Affymetrix data file of raw image data. Affymetrix image data file format - - + + dat @@ -27767,11 +27901,11 @@ - beta12orEarlier + beta12orEarlier Format of Affymetrix data file of information about (raw) expression levels of the individual probes. Affymetrix probe raw data format - - + + cel @@ -27782,10 +27916,10 @@ - beta12orEarlier + beta12orEarlier Format of affymetrix gene cluster files (hc-genes.txt, hc-chips.txt) from hierarchical clustering. - - + + affymetrix @@ -27795,12 +27929,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the ArrayExpress microarrays database. - + ArrayExpress entry format true @@ -27812,11 +27946,11 @@ - beta12orEarlier + beta12orEarlier Affymetrix data file format for information about experimental conditions and protocols. 
Affymetrix experimental conditions data file format - - + + affymetrix-exp @@ -27833,13 +27967,13 @@ - beta12orEarlier - - chp + beta12orEarlier + + chp Format of Affymetrix data file of information about (normalised) expression levels of the individual probes. Affymetrix probe normalised data format - - + + CHP @@ -27849,12 +27983,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the Electron Microscopy DataBase (EMDB). - + EMDB entry format true @@ -27865,12 +27999,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the KEGG PATHWAY database of pathway maps for molecular interactions and reaction networks. - + KEGG PATHWAY entry format true @@ -27881,12 +28015,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the MetaCyc metabolic pathways database. - + MetaCyc entry format true @@ -27897,12 +28031,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of a report from the HumanCyc metabolic pathways database. - + HumanCyc entry format true @@ -27913,12 +28047,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the INOH signal transduction pathways database. - + INOH entry format true @@ -27929,12 +28063,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the PATIKA biological pathways database. - + PATIKA entry format true @@ -27945,12 +28079,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the reactome biological pathways database. - + Reactome entry format true @@ -27961,12 +28095,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the aMAZE biological pathways and molecular interactions database. 
- + aMAZE entry format true @@ -27977,12 +28111,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the CPDB database. - + CPDB entry format true @@ -27993,12 +28127,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the Panther Pathways database. - + Panther Pathways entry format true @@ -28010,10 +28144,10 @@ - beta12orEarlier + beta12orEarlier Format of Taverna workflows. - - + + Taverna workflow format @@ -28023,12 +28157,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of mathematical models from the BioModel database. - + Models are annotated and linked to relevant data resources, such as publications, databases of compounds and pathways, controlled vocabularies, etc. BioModel mathematical model format true @@ -28040,12 +28174,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the KEGG LIGAND chemical database. - + KEGG LIGAND entry format true @@ -28056,12 +28190,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the KEGG COMPOUND database. - + KEGG COMPOUND entry format true @@ -28072,12 +28206,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the KEGG PLANT database. - + KEGG PLANT entry format true @@ -28088,12 +28222,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the KEGG GLYCAN database. - + KEGG GLYCAN entry format true @@ -28104,12 +28238,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from PubChem. 
- + PubChem entry format true @@ -28120,12 +28254,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from a database of chemical structures and property predictions. - + ChemSpider entry format true @@ -28136,12 +28270,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from Chemical Entities of Biological Interest (ChEBI). - + ChEBI includes an ontological classification defining relations between entities or classes of entities. ChEBI entry format true @@ -28153,12 +28287,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the MSDchem ligand dictionary. - + MSDchem ligand dictionary entry format true @@ -28170,10 +28304,10 @@ - beta12orEarlier + beta12orEarlier The format of an entry from the HET group dictionary (HET groups from PDB files). - - + + HET group dictionary entry format @@ -28183,12 +28317,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the KEGG DRUG database. - + KEGG DRUG entry format true @@ -28200,10 +28334,10 @@ - beta12orEarlier + beta12orEarlier Format of bibliographic reference as used by the PubMed database. - - + + PubMed citation @@ -28214,10 +28348,10 @@ - beta12orEarlier + beta12orEarlier Format for abstracts of scientific articles from the Medline database. - - + + Bibliographic reference information including citation information is included Medline Display Format @@ -28229,10 +28363,10 @@ - beta12orEarlier + beta12orEarlier CiteXplore 'core' citation format including title, journal, authors and abstract. - - + + CiteXplore-core @@ -28243,10 +28377,10 @@ - beta12orEarlier + beta12orEarlier CiteXplore 'all' citation format includes all known details such as Mesh terms and cross-references. 
- - + + CiteXplore-all @@ -28257,10 +28391,10 @@ - beta12orEarlier + beta12orEarlier Article format of the PubMed Central database. - - + + pmc @@ -28272,10 +28406,10 @@ - beta12orEarlier + beta12orEarlier The format of iHOP (Information Hyperlinked over Proteins) text-mining result. - - + + iHOP format @@ -28288,11 +28422,11 @@ - - beta12orEarlier + + beta12orEarlier OSCAR format of annotated chemical text. - - + + OSCAR (Open-Source Chemistry Analysis Routines) software performs chemistry-specific parsing of chemical documents. It attempts to identify chemical names, ontology concepts, and chemical data from a document. OSCAR format @@ -28303,12 +28437,12 @@ - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Format of an ATOM record (describing data for an individual atom) from a PDB file. - + PDB atom record format true @@ -28319,12 +28453,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of CATH domain classification information for a polypeptide chain. - + The report (for example http://www.cathdb.info/chain/1cukA) includes chain identifiers, domain identifiers and CATH codes for domains in a given protein chain. CATH chain report format true @@ -28336,12 +28470,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of CATH domain classification information for a protein PDB file. - + The report (for example http://www.cathdb.info/pdb/1cuk) includes chain identifiers, domain identifiers and CATH codes for domains in a given PDB file. CATH PDB report format true @@ -28353,12 +28487,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry (gene) format of the NCBI database. 
- + NCBI gene report format true @@ -28369,13 +28503,13 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Moby:GI_Gene Report format for biological functions associated with a gene name and its alternative names (synonyms, homonyms), as generated by the GeneIlluminator service. - + This includes a gene name and abbreviation of the name which may be in a name space indicating the gene status and relevant organisation. GeneIlluminator gene report format true @@ -28387,13 +28521,13 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Moby:BacMapGeneCard Format of a report on the DNA and protein sequences for a given gene label from a bacterial chromosome maps from the BacMap database. - + BacMap gene card format true @@ -28404,12 +28538,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of a report on Escherichia coli genes, proteins and molecules from the CyberCell Database (CCDB). - + ColiCard report format true @@ -28421,10 +28555,10 @@ - beta12orEarlier + beta12orEarlier Map of a plasmid (circular DNA) in PlasMapper TextMap format. - - + + PlasMapper TextMap @@ -28435,11 +28569,11 @@ - beta12orEarlier + beta12orEarlier Phylogenetic tree Newick (text) format. nh - - + + newick @@ -28450,10 +28584,10 @@ - beta12orEarlier + beta12orEarlier Phylogenetic tree TreeCon (text) format. - - + + TreeCon format @@ -28464,10 +28598,10 @@ - beta12orEarlier + beta12orEarlier Phylogenetic tree Nexus (text) format. - - + + Nexus format @@ -28479,19 +28613,19 @@ - beta12orEarlier - true + beta12orEarlier + true A defined way or layout of representing and structuring data in a computer file, blob, string, message, or elsewhere. Data format Data model Exchange format File format - - + + The main focus in EDAM lies on formats as means of structuring data exchanged between different tools or resources. 
The serialisation, compression, or encoding of concrete data formats/models is not in scope of EDAM. Format 'is format of' Data. Format - - + + "http://purl.obolibrary.org/obo/IAO_0000098" "http://purl.org/dc/elements/1.1/format" http://purl.org/biotop/biotop.owl#MachineLanguage @@ -28524,12 +28658,12 @@ - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Data format for an individual atom. - + Atomic data format true @@ -28546,11 +28680,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for a molecular sequence record. - - + + Sequence record format @@ -28566,11 +28700,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for molecular sequence feature information. - - + + Sequence feature annotation format @@ -28586,11 +28720,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for molecular sequence alignment information. - - + + Alignment format @@ -28601,10 +28735,10 @@ - beta12orEarlier + beta12orEarlier ACEDB sequence format. - - + + acedb @@ -28614,12 +28748,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Clustalw output format. - + clustal sequence format true @@ -28631,10 +28765,10 @@ - beta12orEarlier + beta12orEarlier Codata entry format. - - + + codata @@ -28644,10 +28778,10 @@ - beta12orEarlier + beta12orEarlier Fasta format variant with database name before ID. - - + + dbid @@ -28658,12 +28792,12 @@ - beta12orEarlier + beta12orEarlier EMBL entry format. EMBL EMBL sequence format - - + + EMBL format @@ -28674,10 +28808,10 @@ - beta12orEarlier + beta12orEarlier Staden experiment file format. - - + + Staden experiment format @@ -28688,12 +28822,12 @@ - beta12orEarlier + beta12orEarlier FASTA format including NCBI-style IDs. FASTA format FASTA sequence format - - + + FASTA @@ -28703,14 +28837,14 @@ - beta12orEarlier - fastq - fq + beta12orEarlier + fastq + fq FASTQ short read format ignoring quality scores. 
FASTAQ fq - - + + FASTQ @@ -28720,10 +28854,10 @@ - beta12orEarlier + beta12orEarlier FASTQ Illumina 1.3 short read format. - - + + FASTQ-illumina @@ -28733,10 +28867,10 @@ - beta12orEarlier + beta12orEarlier FASTQ short read format with phred quality. - - + + FASTQ-sanger @@ -28746,10 +28880,10 @@ - beta12orEarlier + beta12orEarlier FASTQ Solexa/Illumina 1.0 short read format. - - + + FASTQ-solexa @@ -28760,10 +28894,10 @@ - beta12orEarlier + beta12orEarlier Fitch program format. - - + + fitch program @@ -28774,11 +28908,11 @@ - beta12orEarlier + beta12orEarlier GCG sequence file format. GCG SSF - - + + GCG SSF (single sequence file) file format. GCG @@ -28790,11 +28924,11 @@ - beta12orEarlier + beta12orEarlier Genbank entry format. GenBank - - + + GenBank format @@ -28804,10 +28938,10 @@ - beta12orEarlier + beta12orEarlier Genpept protein entry format. - - + + Currently identical to refseqp format genpept @@ -28819,10 +28953,10 @@ - beta12orEarlier + beta12orEarlier GFF feature file format with sequence in the header. - - + + GFF2-seq @@ -28833,10 +28967,10 @@ - beta12orEarlier + beta12orEarlier GFF3 feature file format with sequence. - - + + GFF3-seq @@ -28846,10 +28980,10 @@ - beta12orEarlier + beta12orEarlier FASTA sequence format including NCBI-style GIs. - - + + giFASTA format @@ -28860,10 +28994,10 @@ - beta12orEarlier + beta12orEarlier Hennig86 output sequence format. - - + + hennig86 @@ -28874,10 +29008,10 @@ - beta12orEarlier + beta12orEarlier Intelligenetics sequence format. - - + + ig @@ -28888,10 +29022,10 @@ - beta12orEarlier + beta12orEarlier Intelligenetics sequence format (strict version). - - + + igstrict @@ -28902,10 +29036,10 @@ - beta12orEarlier + beta12orEarlier Jackknifer interleaved and non-interleaved sequence format. - - + + jackknifer @@ -28916,10 +29050,10 @@ - beta12orEarlier + beta12orEarlier Mase program sequence format. 
- - + + mase format @@ -28930,10 +29064,10 @@ - beta12orEarlier + beta12orEarlier Mega interleaved and non-interleaved sequence format. - - + + mega-seq @@ -28943,10 +29077,10 @@ - beta12orEarlier + beta12orEarlier GCG MSF (multiple sequence file) file format. - - + + GCG MSF @@ -28956,13 +29090,13 @@ - beta12orEarlier - pir + beta12orEarlier + pir NBRF/PIR entry sequence format. nbrf pir - - + + nbrf/pir @@ -28974,10 +29108,10 @@ - beta12orEarlier + beta12orEarlier Nexus/paup interleaved sequence format. - - + + nexus-seq @@ -28989,10 +29123,10 @@ - beta12orEarlier + beta12orEarlier PDB sequence format (ATOM lines). - - + + pdb format in EMBOSS. pdbatom @@ -29005,10 +29139,10 @@ - beta12orEarlier + beta12orEarlier PDB nucleotide sequence format (ATOM lines). - - + + pdbnuc format in EMBOSS. pdbatomnuc @@ -29021,10 +29155,10 @@ - beta12orEarlier + beta12orEarlier PDB nucleotide sequence format (SEQRES lines). - - + + pdbnucseq format in EMBOSS. pdbseqresnuc @@ -29037,10 +29171,10 @@ - beta12orEarlier + beta12orEarlier PDB sequence format (SEQRES lines). - - + + pdbseq format in EMBOSS. pdbseqres @@ -29051,10 +29185,10 @@ - beta12orEarlier + beta12orEarlier Plain old FASTA sequence format (unspecified format for IDs). - - + + Pearson format @@ -29064,12 +29198,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Phylip interleaved sequence format. - + phylip sequence format true @@ -29080,12 +29214,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + PHYLIP non-interleaved sequence format. - + phylipnon sequence format true @@ -29097,10 +29231,10 @@ - beta12orEarlier + beta12orEarlier Raw sequence format with no non-sequence characters. - - + + raw @@ -29111,10 +29245,10 @@ - beta12orEarlier + beta12orEarlier Refseq protein entry sequence format. 
- - + + Currently identical to genpept format refseqp @@ -29125,12 +29259,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Selex sequence format. - + selex sequence format true @@ -29142,14 +29276,14 @@ - beta12orEarlier - - + beta12orEarlier + + Staden suite sequence format. - - + + Staden format @@ -29160,13 +29294,13 @@ - beta12orEarlier - + beta12orEarlier + Stockholm multiple sequence alignment format (used by Pfam and Rfam). - - + + Stockholm format - + @@ -29176,10 +29310,10 @@ - beta12orEarlier + beta12orEarlier DNA strider output sequence format. - - + + strider format @@ -29189,12 +29323,12 @@ - beta12orEarlier + beta12orEarlier UniProtKB entry sequence format. SwissProt format UniProt format - - + + UniProtKB format @@ -29204,11 +29338,11 @@ - beta12orEarlier - txt + beta12orEarlier + txt Plain text sequence format (essentially unformatted). - - + + plain text format (unformatted) @@ -29218,12 +29352,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Treecon output sequence format. - + treecon sequence format true @@ -29235,12 +29369,12 @@ - beta12orEarlier - + beta12orEarlier + For sequences, alignments (Seqalign used by BLAST), literature, also BLAST XML & XML2. Can be serialised in text, binary, XML, JSON NCBI ASN.1-based sequence format. - - + + NCBI ASN.1 format (textual) @@ -29251,11 +29385,11 @@ - beta12orEarlier + beta12orEarlier DAS sequence (XML) format (any type). das sequence format - - + + DAS format @@ -29266,10 +29400,10 @@ - beta12orEarlier + beta12orEarlier DAS sequence (XML) format (nucleotide-only). - - + + The use of this format is deprecated. dasdna @@ -29281,10 +29415,10 @@ - beta12orEarlier + beta12orEarlier EMBOSS debugging trace sequence format of full internal data content. - - + + debug-seq @@ -29295,10 +29429,10 @@ - beta12orEarlier + beta12orEarlier Jackknifer output sequence non-interleaved format. 
- - + + jackknifernon @@ -29308,12 +29442,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Mega non-interleaved output sequence format. - + meganon sequence format true @@ -29324,10 +29458,10 @@ - beta12orEarlier + beta12orEarlier NCBI FASTA sequence format with NCBI-style IDs. - - + + There are several variants of this. NCBI format @@ -29340,10 +29474,10 @@ - beta12orEarlier + beta12orEarlier Nexus/paup non-interleaved sequence format. - - + + nexusnon @@ -29353,12 +29487,12 @@ - beta12orEarlier - + beta12orEarlier + General Feature Format (GFF) of sequence features. - - + + GFF2 @@ -29368,13 +29502,13 @@ - beta12orEarlier - - + beta12orEarlier + + Generic Feature Format version 3 (GFF3) of sequence features. - - + + GFF3 @@ -29384,11 +29518,11 @@ - beta12orEarlier - 1.7 - + beta12orEarlier + 1.7 + PIR feature format. - + pir true @@ -29400,12 +29534,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Swiss-Prot feature format. - + swiss feature true @@ -29417,12 +29551,12 @@ - beta12orEarlier + beta12orEarlier DAS GFF (XML) feature format. DASGFF feature das feature - - + + DASGFF @@ -29433,10 +29567,10 @@ - beta12orEarlier + beta12orEarlier EMBOSS debugging trace feature format of full internal data content. - - + + debug-feat @@ -29446,12 +29580,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + EMBL feature format. - + EMBL feature true @@ -29462,12 +29596,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Genbank feature format. - + GenBank feature true @@ -29479,11 +29613,11 @@ - beta12orEarlier + beta12orEarlier ClustalW format for (aligned) sequences. clustal - - + + ClustalW format @@ -29494,10 +29628,10 @@ - beta12orEarlier + beta12orEarlier EMBOSS alignment format for debugging trace of full internal data content. - - + + debug @@ -29508,10 +29642,10 @@ - beta12orEarlier + beta12orEarlier Fasta format for (aligned) sequences. 
- - + + FASTA-aln @@ -29521,10 +29655,10 @@ - beta12orEarlier + beta12orEarlier Pearson MARKX0 alignment format. - - + + markx0 @@ -29534,10 +29668,10 @@ - beta12orEarlier + beta12orEarlier Pearson MARKX1 alignment format. - - + + markx1 @@ -29547,10 +29681,10 @@ - beta12orEarlier + beta12orEarlier Pearson MARKX10 alignment format. - - + + markx10 @@ -29560,10 +29694,10 @@ - beta12orEarlier + beta12orEarlier Pearson MARKX2 alignment format. - - + + markx2 @@ -29573,10 +29707,10 @@ - beta12orEarlier + beta12orEarlier Pearson MARKX3 alignment format. - - + + markx3 @@ -29587,10 +29721,10 @@ - beta12orEarlier + beta12orEarlier Alignment format for start and end of matches between sequence pairs. - - + + match @@ -29600,10 +29734,10 @@ - beta12orEarlier + beta12orEarlier Mega format for (typically aligned) sequences. - - + + mega @@ -29613,10 +29747,10 @@ - beta12orEarlier + beta12orEarlier Mega non-interleaved format for (typically aligned) sequences. - - + + meganon @@ -29626,12 +29760,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + MSF format for (aligned) sequences. - + msf alignment format true @@ -29642,12 +29776,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Nexus/paup format for (aligned) sequences. - + nexus alignment format true @@ -29658,12 +29792,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Nexus/paup non-interleaved format for (aligned) sequences. - + nexusnon alignment format true @@ -29674,10 +29808,10 @@ - beta12orEarlier + beta12orEarlier EMBOSS simple sequence pair alignment format. - - + + pair @@ -29687,15 +29821,15 @@ - beta12orEarlier - http://www.bioperl.org/wiki/PHYLIP_multiple_alignment_format + beta12orEarlier + http://www.bioperl.org/wiki/PHYLIP_multiple_alignment_format Phylip format for (aligned) sequences. 
PHYLIP PHYLIP interleaved format ph phy - - + + PHYLIP format @@ -29705,13 +29839,13 @@ - beta12orEarlier - http://www.bioperl.org/wiki/PHYLIP_multiple_alignment_format + beta12orEarlier + http://www.bioperl.org/wiki/PHYLIP_multiple_alignment_format Phylip non-interleaved format for (aligned) sequences. PHYLIP sequential format phylipnon - - + + PHYLIP sequential @@ -29722,10 +29856,10 @@ - beta12orEarlier + beta12orEarlier Alignment format for score values for pairs of sequences. - - + + scores format @@ -29737,10 +29871,10 @@ - beta12orEarlier + beta12orEarlier SELEX format for (aligned) sequences. - - + + selex @@ -29751,10 +29885,10 @@ - beta12orEarlier + beta12orEarlier EMBOSS simple multiple alignment format. - - + + EMBOSS simple format @@ -29765,10 +29899,10 @@ - beta12orEarlier + beta12orEarlier Simple multiple sequence (alignment) format for SRS. - - + + srs format @@ -29779,10 +29913,10 @@ - beta12orEarlier + beta12orEarlier Simple sequence pair (alignment) format for SRS. - - + + srspair @@ -29793,10 +29927,10 @@ - beta12orEarlier + beta12orEarlier T-Coffee program alignment format. - - + + T-Coffee format @@ -29808,10 +29942,10 @@ - beta12orEarlier + beta12orEarlier Treecon format for (aligned) sequences. - - + + TreeCon-seq @@ -29827,11 +29961,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for a phylogenetic tree. - - + + Phylogenetic tree format @@ -29847,11 +29981,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for a biological pathway or network. - - + + Biological pathway or network format @@ -29867,11 +30001,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for a sequence-profile alignment. - - + + Sequence-profile alignment format @@ -29881,12 +30015,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Data format for a sequence-HMM profile alignment. 
- + Sequence-profile alignment (HMM) format true @@ -29903,10 +30037,10 @@ - beta12orEarlier + beta12orEarlier Data format for an amino acid index. - - + + Amino acid index format @@ -29922,12 +30056,12 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for a full-text scientific article. Literature format - - + + Article format @@ -29943,11 +30077,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format of a report from text mining. - - + + Text mining report format @@ -29963,11 +30097,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for reports on enzyme kinetics. - - + + Enzyme kinetics report format @@ -29983,15 +30117,15 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a report on a chemical compound. Chemical compound annotation format Chemical structure format Small molecule report format Small molecule structure format - - + + Chemical data format @@ -30007,12 +30141,12 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a report on a particular locus, gene, gene system or groups of genes. Gene features format - - + + Gene annotation format @@ -30022,13 +30156,13 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a workflow. Programming language Script format - - + + Workflow format @@ -30038,11 +30172,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for a molecular tertiary structure. - - + + Tertiary structure format @@ -30052,12 +30186,12 @@ - beta12orEarlier - 1.2 - + beta12orEarlier + 1.2 + Data format for a biological model. - + Biological model format true @@ -30074,11 +30208,11 @@ - beta12orEarlier - true + beta12orEarlier + true Text format of a chemical formula. - - + + Chemical formula format @@ -30094,11 +30228,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of raw (unplotted) phylogenetic data. 
- - + + Phylogenetic character data format @@ -30114,10 +30248,10 @@ - beta12orEarlier + beta12orEarlier Format of phylogenetic continuous quantitative character data. - - + + Phylogenetic continuous quantitative character format @@ -30133,10 +30267,10 @@ - beta12orEarlier + beta12orEarlier Format of phylogenetic discrete states data. - - + + Phylogenetic discrete states format @@ -30152,10 +30286,10 @@ - beta12orEarlier + beta12orEarlier Format of phylogenetic cliques data. - - + + Phylogenetic tree report (cliques) format @@ -30171,10 +30305,10 @@ - beta12orEarlier + beta12orEarlier Format of phylogenetic invariants data. - - + + Phylogenetic tree report (invariants) format @@ -30184,12 +30318,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Annotation format for electron microscopy models. - + Electron microscopy model format true @@ -30206,11 +30340,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format for phylogenetic tree distance data. - - + + Phylogenetic tree report (tree distances) format @@ -30220,12 +30354,12 @@ - beta12orEarlier - 1.0 - + beta12orEarlier + 1.0 + Format for sequence polymorphism data. - + Polymorphism report format true @@ -30242,11 +30376,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format for reports on a protein family. - - + + Protein family report format @@ -30262,12 +30396,12 @@ - beta12orEarlier - true + beta12orEarlier + true Format for molecular interaction data. Molecular interaction format - - + + Protein interaction format @@ -30283,11 +30417,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format for sequence assembly data. - - + + Sequence assembly format @@ -30297,10 +30431,10 @@ - beta12orEarlier + beta12orEarlier Format for information about a microarray experimental per se (not the data generated from that experiment). - - + + Microarray experiment data format @@ -30316,10 +30450,10 @@ - beta12orEarlier + beta12orEarlier Format for sequence trace data (i.e. 
including base call information). - - + + Sequence trace format @@ -30335,12 +30469,12 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a file of gene expression data, e.g. a gene expression matrix or profile. Gene expression data format - - + + Gene expression report format @@ -30350,12 +30484,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of a report on genotype / phenotype information. - + Genotype and phenotype annotation format true @@ -30372,11 +30506,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a map of (typically one) molecular sequence annotated with features. - - + + Map format @@ -30386,11 +30520,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a report on PCR primers or hybridisation oligos in a nucleic acid sequence. - - + + Nucleic acid features (primers) format @@ -30406,11 +30540,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a report of general information about a specific protein. - - + + Protein report format @@ -30420,12 +30554,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of a report of general information about a specific enzyme. - + Protein report (enzyme) format true @@ -30442,10 +30576,10 @@ - beta12orEarlier + beta12orEarlier Format of a matrix of 3D-1D scores (amino acid environment probabilities). - - + + 3D-1D scoring matrix format @@ -30461,11 +30595,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a report on the quality of a protein three-dimensional model. - - + + Protein structure report (quality evaluation) format @@ -30481,11 +30615,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a report on sequence hits and associated data from searching a sequence database. - - + + Database hits (sequence) format @@ -30501,11 +30635,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a matrix of genetic distances between molecular sequences. 
- - + + Sequence distance matrix format @@ -30521,11 +30655,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a sequence motif. - - + + Sequence motif format @@ -30541,11 +30675,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a sequence profile. - - + + Sequence profile format @@ -30561,10 +30695,10 @@ - beta12orEarlier + beta12orEarlier Format of a hidden Markov model. - - + + Hidden Markov model format @@ -30580,11 +30714,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format of a dirichlet distribution. - - + + Dirichlet distribution format @@ -30606,11 +30740,11 @@ - beta12orEarlier - true + beta12orEarlier + true Data format for the emission and transition counts of a hidden Markov model. - - + + HMM emission and transition counts format @@ -30626,11 +30760,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format for secondary structure (predicted or real) of an RNA molecule. - - + + RNA secondary structure format @@ -30640,11 +30774,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format for secondary structure (predicted or real) of a protein molecule. - - + + Protein secondary structure format @@ -30660,11 +30794,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format used to specify range(s) of sequence positions. - - + + Sequence range format @@ -30675,10 +30809,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for molecular sequence with possible unknown positions but without non-sequence characters. - - + + pure @@ -30689,10 +30823,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a molecular sequence with possible unknown positions but possibly with non-sequence characters. - - + + unpure @@ -30703,10 +30837,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a molecular sequence with possible unknown positions but without ambiguity characters. 
- - + + unambiguous sequence @@ -30717,10 +30851,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a molecular sequence with possible unknown positions and possible ambiguity characters. - - + + ambiguous @@ -30730,11 +30864,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format used for map of repeats in molecular (typically nucleotide) sequences. - - + + Sequence features (repeats) format @@ -30744,11 +30878,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format used for report on restriction enzyme recognition sites in nucleotide sequences. - - + + Nucleic acid features (restriction sites) format @@ -30758,11 +30892,11 @@ - beta12orEarlier - 1.10 - + beta12orEarlier + 1.10 + Format used for report on coding regions in nucleotide sequences. - + Gene features (coding region) format true @@ -30780,11 +30914,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format used for clusters of molecular sequences. - - + + Sequence cluster format @@ -30794,10 +30928,10 @@ - beta12orEarlier + beta12orEarlier Format used for clusters of protein sequences. - - + + Sequence cluster format (protein) @@ -30807,10 +30941,10 @@ - beta12orEarlier + beta12orEarlier Format used for clusters of nucleotide sequences. - - + + Sequence cluster format (nucleic acid) @@ -30820,12 +30954,12 @@ - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Format used for clusters of genes. - + Gene cluster format true @@ -30837,10 +30971,10 @@ - beta12orEarlier + beta12orEarlier A text format resembling EMBL entry format. - - + + This concept may be used for the many non-standard EMBL-like text formats. EMBL-like (text) @@ -30852,10 +30986,10 @@ - beta12orEarlier + beta12orEarlier A text format resembling FASTQ short read format. - - + + This concept may be used for non-standard FASTQ short read-like formats. FASTQ-like format (text) @@ -30866,10 +31000,10 @@ - beta12orEarlier + beta12orEarlier XML format for EMBL entries. 
- - + + EMBLXML @@ -30879,10 +31013,10 @@ - beta12orEarlier + beta12orEarlier XML format for EMBL entries. - - + + cdsxml @@ -30892,10 +31026,10 @@ - beta12orEarlier + beta12orEarlier XML format for EMBL entries. - - + + insdxml @@ -30905,10 +31039,10 @@ - beta12orEarlier + beta12orEarlier Geneseq sequence format. - - + + geneseq @@ -30919,10 +31053,10 @@ - beta12orEarlier + beta12orEarlier A text sequence format resembling uniprotkb entry format. - - + + UniProt-like (text) @@ -30932,11 +31066,11 @@ - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + UniProt entry sequence format. - + UniProt format true @@ -30948,12 +31082,12 @@ - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + ipi sequence format. - + ipi true @@ -30965,10 +31099,10 @@ - beta12orEarlier + beta12orEarlier Abstract format used by MedLine database. - - + + medline @@ -30984,11 +31118,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format used for ontologies. - - + + Ontology format @@ -30998,10 +31132,10 @@ - beta12orEarlier + beta12orEarlier A serialisation format conforming to the Open Biomedical Ontologies (OBO) model. - - + + OBO format @@ -31012,10 +31146,10 @@ - beta12orEarlier + beta12orEarlier A serialisation format conforming to the Web Ontology Language (OWL) model. - - + + OWL format @@ -31026,10 +31160,10 @@ - beta12orEarlier + beta12orEarlier A text format resembling FASTA format. - - + + This concept may also be used for the many non-standard FASTA-like formats. FASTA-like (text) http://filext.com/file-extension/FASTA @@ -31041,11 +31175,11 @@ - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Data format for a molecular sequence record, typically corresponding to a full entry from a molecular sequence database. 
- + Sequence record full format true @@ -31057,11 +31191,11 @@ - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Data format for a molecular sequence record 'lite', typically molecular sequence and minimal metadata, such as an identifier of the sequence and/or a comment. - + Sequence record lite format true @@ -31073,10 +31207,10 @@ - beta12orEarlier + beta12orEarlier An XML format for EMBL entries. - - + + This is a placeholder for other more specific concepts. It should not normally be used for annotation. EMBL format (XML) @@ -31088,10 +31222,10 @@ - beta12orEarlier + beta12orEarlier A text format resembling GenBank entry (plain text) format. - - + + This concept may be used for the non-standard GenBank-like text formats. GenBank-like format (text) @@ -31102,10 +31236,10 @@ - beta12orEarlier + beta12orEarlier Text format for a sequence feature table. - - + + Sequence feature table format (text) @@ -31115,12 +31249,12 @@ - beta12orEarlier - 1.0 - + beta12orEarlier + 1.0 + Format of a report on organism strain data / cell line. - + Strain data format true @@ -31131,12 +31265,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format for a report of strain data as used for CIP database entries. - + CIP strain data format true @@ -31147,12 +31281,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + PHYLIP file format for phylogenetic property data. - + phylip property values true @@ -31163,12 +31297,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format (HTML) for the STRING database of protein interaction. - + STRING entry format (HTML) true @@ -31180,10 +31314,10 @@ - beta12orEarlier + beta12orEarlier Entry format (XML) for the STRING database of protein interaction. - - + + STRING entry format (XML) @@ -31194,10 +31328,10 @@ - beta12orEarlier + beta12orEarlier GFF feature format (of indeterminate version). 
- - + + GFF @@ -31207,13 +31341,13 @@ - beta12orEarlier - + beta12orEarlier + Gene Transfer Format (GTF), a restricted version of GFF. - - + + GTF @@ -31224,10 +31358,10 @@ - beta12orEarlier + beta12orEarlier FASTA format wrapped in HTML elements. - - + + FASTA-HTML @@ -31238,10 +31372,10 @@ - beta12orEarlier + beta12orEarlier EMBL entry format wrapped in HTML elements. - - + + EMBL-HTML @@ -31251,12 +31385,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the BioCyc enzyme database. - + BioCyc enzyme report format true @@ -31267,12 +31401,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of an entry from the Enzyme nomenclature database (ENZYME). - + ENZYME enzyme report format true @@ -31283,12 +31417,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of a report on a gene from the PseudoCAP database. - + PseudoCAP gene report format true @@ -31299,12 +31433,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of a report on a gene from the GeneCards database. - + GeneCards gene report format true @@ -31316,18 +31450,17 @@ - beta12orEarlier + beta12orEarlier + txt + text/plain (not a standard IANA entry, see instead https://www.iana.org/go/rfc3676) Textual format. Plain text format - txt - - - geo + + + Data in text format can be compressed into binary format, or can be a value of an XML element or attribute. Markup formats are not considered textual (or more precisely, not plain-textual). Textual format - http://filext.com/file-extension/TXT - http://www.iana.org/assignments/media-types/media-types.xhtml#text - http://www.iana.org/assignments/media-types/text/plain + @@ -31343,11 +31476,11 @@ - beta12orEarlier + beta12orEarlier HTML format. 
Hypertext Markup Language - - + + HTML http://filext.com/file-extension/HTML @@ -31359,16 +31492,16 @@ - beta12orEarlier - xml - + beta12orEarlier + xml + eXtensible Markup Language (XML) format. eXtensible Markup Language - - - geo + + + Data in XML format can be serialised into text, or binary format. XML @@ -31379,12 +31512,12 @@ - beta12orEarlier - true + beta12orEarlier + true Binary format. - - - geo + + + Only specific native binary formats are listed under 'Binary format' in EDAM. Generic binary formats - such as any data being zipped, or any XML data being serialised into the Efficient XML Interchange (EXI) format - are not modelled in EDAM. Refer to http://wsio.org/compression_004. Binary format @@ -31395,12 +31528,12 @@ - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Typical textual representation of a URI. - + URI format true @@ -31411,12 +31544,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The format of an entry from the NCI-Nature pathways database. - + NCI-Nature pathway entry format true @@ -31427,12 +31560,12 @@ - beta12orEarlier - true + beta12orEarlier + true A placeholder concept for visual navigation by dividing data formats by the content of the data that is represented. Format (typed) - - + + This concept exists only to assist EDAM maintenance and navigation in graphical browsers. It does not add semantic information. The concept branch under 'Format (typed)' provides an alternative organisation of the concepts nested under the other top-level branches ('Binary', 'HTML', 'RDF', 'Text' and 'XML'. All concepts under here are already included under those branches. Format (by type of data) @@ -31477,18 +31610,18 @@ - - - - beta12orEarlier - - - - - - - Any ontology allowed, none mandatory. Preferrably with URIs but URIs are not mandatory. Non-ontology terms are also allowed as the last resort in case of a lack of suitable ontology. 
- + + + + beta12orEarlier + + + + + + + Any ontology allowed, none mandatory. Preferrably with URIs but URIs are not mandatory. Non-ontology terms are also allowed as the last resort in case of a lack of suitable ontology. + BioXSD-schema-based XML format of sequence-based data and some other common data - sequence records, alignments, feature records, references to resources, and more - optimised for integrative bioinformatics, Web services, and object-oriented programming. @@ -31504,8 +31637,8 @@ BioXSD/GTrack BioXSD|GTrack BioYAML - - + + 'BioXSD' belongs to the 'BioXSD|GTrack' ecosystem of generic formats. 'BioXSD in XML' is the XML format based on the common, unified 'BioXSD data model', a.k.a. 'BioXSD|BioJSON|BioYAML'. BioXSD (XML) @@ -31518,13 +31651,13 @@ - beta12orEarlier + beta12orEarlier A serialisation format conforming to the Resource Description Framework (RDF) model. Resource Description Framework format RDF Resource Description Framework - - + + RDF format @@ -31536,10 +31669,10 @@ - beta12orEarlier + beta12orEarlier Genbank entry format wrapped in HTML elements. - - + + GenBank-HTML @@ -31549,12 +31682,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Format of a report on protein features (domain composition). - + Protein features (domains) format true @@ -31565,10 +31698,10 @@ - beta12orEarlier + beta12orEarlier A format resembling EMBL entry (plain text) format. - - + + This concept may be used for the many non-standard EMBL-like formats. EMBL-like format @@ -31579,10 +31712,10 @@ - beta12orEarlier + beta12orEarlier A format resembling FASTQ short read format. - - + + This concept may be used for non-standard FASTQ short read-like formats. FASTQ-like format @@ -31593,10 +31726,10 @@ - beta12orEarlier + beta12orEarlier A format resembling FASTA format. - - + + This concept may be used for the many non-standard FASTA-like formats. 
FASTA-like @@ -31608,10 +31741,10 @@ - beta12orEarlier + beta12orEarlier A sequence format resembling uniprotkb entry format. - - + + uniprotkb-like format @@ -31627,10 +31760,10 @@ - beta12orEarlier + beta12orEarlier Format for a sequence feature table. - - + + Sequence feature table format @@ -31641,10 +31774,10 @@ - beta12orEarlier + beta12orEarlier OBO ontology text format. - - + + OBO @@ -31655,10 +31788,10 @@ - beta12orEarlier + beta12orEarlier OBO ontology XML format. - - + + OBO-XML @@ -31668,10 +31801,10 @@ - beta12orEarlier + beta12orEarlier Data format for a molecular sequence record. - - + + Sequence record format (text) @@ -31681,10 +31814,10 @@ - beta12orEarlier + beta12orEarlier Data format for a molecular sequence record. - - + + Sequence record format (XML) @@ -31694,10 +31827,10 @@ - beta12orEarlier + beta12orEarlier XML format for a sequence feature table. - - + + Sequence feature table format (XML) @@ -31707,10 +31840,10 @@ - beta12orEarlier + beta12orEarlier Text format for molecular sequence alignment information. - - + + Alignment format (text) @@ -31720,10 +31853,10 @@ - beta12orEarlier + beta12orEarlier XML format for molecular sequence alignment information. - - + + Alignment format (XML) @@ -31733,10 +31866,10 @@ - beta12orEarlier + beta12orEarlier Text format for a phylogenetic tree. - - + + Phylogenetic tree format (text) @@ -31746,10 +31879,10 @@ - beta12orEarlier + beta12orEarlier XML format for a phylogenetic tree. - - + + Phylogenetic tree format (XML) @@ -31760,10 +31893,10 @@ - beta12orEarlier + beta12orEarlier An XML format resembling EMBL entry format. - - + + This concept may be used for the any non-standard EMBL-like XML formats. EMBL-like (XML) @@ -31774,10 +31907,10 @@ - beta12orEarlier + beta12orEarlier A format resembling GenBank entry (plain text) format. - - + + This concept may be used for the non-standard GenBank-like formats. 
GenBank-like format @@ -31788,12 +31921,12 @@ - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Entry format for the STRING database of protein interaction. - + STRING entry format true @@ -31804,10 +31937,10 @@ - beta12orEarlier + beta12orEarlier Text format for sequence assembly data. - - + + Sequence assembly format (text) @@ -31817,12 +31950,12 @@ - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Text format (representation) of amino acid residues. - + Amino acid identifier format true @@ -31834,10 +31967,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a molecular sequence without any unknown positions or ambiguity characters. - - + + completely unambiguous @@ -31848,10 +31981,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a molecular sequence without unknown positions, ambiguity or non-sequence characters. - - + + completely unambiguous pure @@ -31862,10 +31995,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a nucleotide sequence (characters ACGTU only) without unknown positions, ambiguity or non-sequence characters . - - + + completely unambiguous pure nucleotide @@ -31876,10 +32009,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for a DNA sequence (characters ACGT only) without unknown positions, ambiguity or non-sequence characters. - - + + completely unambiguous pure dna @@ -31890,10 +32023,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for an RNA sequence (characters ACGU only) without unknown positions, ambiguity or non-sequence characters. - - + + completely unambiguous pure rna sequence @@ -31909,11 +32042,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a raw molecular sequence (i.e. the alphabet used). - - + + Raw sequence format http://www.onto-med.de/ontologies/gfo.owl#Symbol_sequence @@ -31926,12 +32059,12 @@ - beta12orEarlier - + beta12orEarlier + BAM format, the binary, BGZF-formatted compressed version of SAM format for alignment of nucleotide sequences (e.g. 
sequencing reads) to (a) reference sequence(s). May contain base-call and alignment qualities and other data. - - + + BAM @@ -31943,12 +32076,12 @@ - beta12orEarlier - + beta12orEarlier + Sequence Alignment/Map (SAM) format for alignment of nucleotide sequences (e.g. sequencing reads) to (a) reference sequence(s). May contain base-call and alignment qualities and other data. - - + + The format supports short and long reads (up to 128Mbp) produced by different sequencing platforms and is used to hold mapped data within the GATK and across the Broad Institute, the Sanger Centre, and throughout the 1000 Genomes project. SAM @@ -31960,12 +32093,12 @@ - beta12orEarlier - + beta12orEarlier + Systems Biology Markup Language (SBML), the standard XML format for models of biological processes such as for example metabolism, cell signaling, and gene regulation. - - + + SBML @@ -31976,10 +32109,10 @@ - beta12orEarlier + beta12orEarlier Alphabet for any protein sequence without unknown positions, ambiguity or non-sequence characters. - - + + completely unambiguous pure protein @@ -32001,11 +32134,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of a bibliographic reference. - - + + Bibliographic reference format @@ -32021,10 +32154,10 @@ - beta12orEarlier + beta12orEarlier Format of a sequence annotation track. - - + + Sequence annotation track format @@ -32040,10 +32173,10 @@ - beta12orEarlier + beta12orEarlier Data format for molecular sequence alignment information that can hold sequence alignment(s) of only 2 sequences. - - + + Alignment format (pair only) @@ -32059,11 +32192,11 @@ - beta12orEarlier - true + beta12orEarlier + true Format of sequence variation annotation. - - + + Sequence variation annotation format @@ -32074,10 +32207,10 @@ - beta12orEarlier + beta12orEarlier Some variant of Pearson MARKX alignment format. 
- - + + markx0 variant @@ -32089,10 +32222,10 @@ - beta12orEarlier + beta12orEarlier Some variant of Mega format for (typically aligned) sequences. - - + + mega variant @@ -32104,10 +32237,10 @@ - beta12orEarlier + beta12orEarlier Some variant of Phylip format for (aligned) sequences. - - + + Phylip format variant @@ -32118,10 +32251,10 @@ - beta12orEarlier + beta12orEarlier AB1 binary format of raw DNA sequence reads (output of Applied Biosystems' sequencing analysis software). Contains an electropherogram and the DNA base sequence. - - + + AB1 uses the generic binary Applied Biosystems, Inc. Format (ABIF). AB1 @@ -32133,12 +32266,12 @@ - beta12orEarlier - + beta12orEarlier + ACE sequence assembly format including contigs, base-call qualities, and other metadata (version Aug 1998 and onwards). - - + + ACE @@ -32149,12 +32282,12 @@ - beta12orEarlier - + beta12orEarlier + Browser Extensible Data (BED) format of sequence annotation track, typically to be displayed in a genome browser. - - + + BED detail format includes 2 additional columns (http://genome.ucsc.edu/FAQ/FAQformat#format1.7) and BED 15 includes 3 additional columns for experiment scores (http://genomewiki.ucsc.edu/index.php/Microarray_track). BED @@ -32166,12 +32299,18 @@ - beta12orEarlier - + + + + + + + beta12orEarlier + bigBed format for large sequence annotation tracks, similar to textual BED format. - - + + bigBed @@ -32182,13 +32321,13 @@ - beta12orEarlier - - wig + beta12orEarlier + + wig Wiggle format (WIG) of a sequence annotation track that consists of a value for each sequence position. Typically to be displayed in a genome browser. - - + + WIG @@ -32199,12 +32338,18 @@ - beta12orEarlier - + + + + + + + beta12orEarlier + bigWig format for large sequence annotation tracks that consist of a value for each sequence position. Similar to textual WIG format. 
- - + + bigWig @@ -32216,12 +32361,12 @@ - beta12orEarlier - + beta12orEarlier + PSL format of alignments, typically generated by BLAT or psLayout. Can be displayed in a genome browser like a sequence annotation track. - - + + PSL @@ -32233,12 +32378,12 @@ - beta12orEarlier - + beta12orEarlier + Multiple Alignment Format (MAF) supporting alignments of whole genomes with rearrangements, directions, multiple pieces to the alignment, and so forth. - - + + Typically generated by Multiz and TBA aligners; can be displayed in a genome browser like a sequence annotation track. This should not be confused with MIRA Assembly Format or Mutation Annotation Format. MAF @@ -32250,13 +32395,13 @@ - beta12orEarlier - + beta12orEarlier + 2bit binary format of nucleotide sequences using 2 bits per nucleotide. In addition encodes unknown nucleotides and lower-case 'masking'. - - + + 2bit @@ -32267,12 +32412,12 @@ - beta12orEarlier - + beta12orEarlier + .nib (nibble) binary format of a nucleotide sequence using 4 bits per nucleotide (including unknown) and its lower-case 'masking'. - - + + .nib @@ -32283,13 +32428,13 @@ - beta12orEarlier - - gp + beta12orEarlier + + gp genePred table format for gene prediction tracks. - - + + genePred format has 3 main variations (http://genome.ucsc.edu/FAQ/FAQformat#format9 http://www.broadinstitute.org/software/igv/genePred). They reflect UCSC Browser DB tables. genePred @@ -32301,12 +32446,12 @@ - beta12orEarlier - + beta12orEarlier + Personal Genome SNP (pgSnp) format for sequence variation tracks (indels and polymorphisms), supported by the UCSC Genome Browser. - - + + pgSnp @@ -32317,12 +32462,12 @@ - beta12orEarlier - + beta12orEarlier + axt format of alignments, typically produced from BLASTZ. - - + + axt @@ -32333,13 +32478,13 @@ - beta12orEarlier - - lav + beta12orEarlier + + lav LAV format of alignments generated by BLASTZ and LASTZ. 
- - + + LAV @@ -32350,12 +32495,12 @@ - beta12orEarlier - + beta12orEarlier + Pileup format of alignment of sequences (e.g. sequencing reads) to (a) reference sequence(s). Contains aligned bases per base of the reference sequence(s). - - + + Pileup @@ -32366,13 +32511,13 @@ - beta12orEarlier - - vcf - vcf.gz + beta12orEarlier + + vcf + vcf.gz Variant Call Format (VCF) is tabular format for storing genomic sequence variations. - - + + 1000 Genomes Project has its own specification for encoding structural variations in VCF (https://www.internationalgenome.org/wiki/Analysis/Variant%20Call%20Format/VCF%20(Variant%20Call%20Format)%20version%204.0/encoding-structural-variants). This is based on VCF version 4.0 and not directly compatible with VCF version 4.3. VCF @@ -32385,12 +32530,12 @@ - beta12orEarlier - + beta12orEarlier + Sequence Read Format (SRF) of sequence trace data. Supports submission to the NCBI Short Read Archive. - - + + SRF @@ -32401,12 +32546,12 @@ - beta12orEarlier - + beta12orEarlier + ZTR format for storing chromatogram data from DNA sequencing instruments. - - + + ZTR @@ -32417,12 +32562,12 @@ - beta12orEarlier - + beta12orEarlier + Genome Variation Format (GVF). A GFF3-compatible format with defined header and attribute tags for sequence variation. - - + + GVF @@ -32433,11 +32578,11 @@ - beta12orEarlier + beta12orEarlier BCF, the binary version of Variant Call Format (VCF) for sequence variation (indels, polymorphisms, structural variation). - - + + BCF @@ -32453,11 +32598,11 @@ - beta13 - true + beta13 + true Format of a matrix (array) of numerical values. - - + + Matrix format @@ -32473,11 +32618,11 @@ - beta13 - true + beta13 + true Format of data concerning the classification of the sequences and/or structures of protein structural domain(s). - - + + Protein domain classification format @@ -32487,10 +32632,10 @@ - beta13 + beta13 Format of raw SCOP domain classification data files. - - + + These are the parsable data files provided by SCOP. 
Raw SCOP domain classification format @@ -32501,10 +32646,10 @@ - beta13 + beta13 Format of raw CATH domain classification data files. - - + + These are the parsable data files provided by CATH. Raw CATH domain classification format @@ -32515,10 +32660,10 @@ - beta13 + beta13 Format of summary of domain classification information for a CATH domain. - - + + The report (for example http://www.cathdb.info/domain/1cukA01) includes CATH codes for levels in the hierarchy for the domain, level descriptions and relevant data and links. CATH domain report format @@ -32530,12 +32675,12 @@ - 1.0 - + 1.0 + Systems Biology Result Markup Language (SBRML), the standard XML format for simulated or calculated results (e.g. trajectories) of systems biology models. - - + + SBRML @@ -32545,12 +32690,12 @@ - 1.0 - + 1.0 + BioPAX is an exchange format for pathway data, with its data model defined in OWL. - - + + BioPAX @@ -32562,12 +32707,12 @@ - 1.0 - + 1.0 + EBI Application Result XML is a format returned by sequence similarity search Web services at EBI. - - + + EBI Application Result XML @@ -32578,13 +32723,13 @@ - 1.0 - + 1.0 + XML Molecular Interaction Format (MIF), standardised by HUPO PSI MI. MIF - - + + PSI MI XML (MIF) @@ -32595,12 +32740,12 @@ - 1.0 - + 1.0 + phyloXML is a standardised XML format for phylogenetic trees, networks, and associated data. - - + + phyloXML @@ -32611,12 +32756,12 @@ - 1.0 - + 1.0 + NeXML is a standardised XML format for rich phyloinformatic data. - - + + NeXML @@ -32633,12 +32778,12 @@ - 1.0 - + 1.0 + MAGE-ML XML format for microarray expression data, standardised by MGED (now FGED). - - + + MAGE-ML @@ -32655,12 +32800,12 @@ - 1.0 - + 1.0 + MAGE-TAB textual format for microarray expression data, standardised by MGED (now FGED). - - + + MAGE-TAB @@ -32671,12 +32816,12 @@ - 1.0 - + 1.0 + GCDML XML format for genome and metagenome metadata according to MIGS/MIMS/MIMARKS information standards, standardised by the Genomic Standards Consortium (GSC). 
- - + + GCDML @@ -32686,16 +32831,22 @@ - - - 1.0 - - - - - - + + + + + + + + + 1.0 + + + + + + @@ -32706,8 +32857,8 @@ GTrack format GTrack|BTrack|GSuite GTrack GTrack|GSuite|BTrack GTrack - - + + 'GTrack' belongs to the 'BioXSD|GTrack' ecosystem of generic formats, and particular to its subset, the 'GTrack ecosystem' (GTrack, GSuite, BTrack). 'GTrack' is the tabular format for representing features of sequences and genomes. GTrack @@ -32724,11 +32875,11 @@ - 1.0 - true + 1.0 + true Data format for a report of information derived from a biological pathway or network. - - + + Biological pathway or network report format @@ -32744,11 +32895,11 @@ - 1.0 - true + 1.0 + true Data format for annotation on a laboratory experiment. - - + + Experiment annotation format @@ -32765,12 +32916,12 @@ - 1.2 - + 1.2 + Cytoband format for chromosome cytobands. - - + + Reflects a UCSC Browser DB table. Cytoband format @@ -32783,12 +32934,12 @@ - 1.2 - + 1.2 + CopasiML, the native format of COPASI. - - + + CopasiML @@ -32799,14 +32950,14 @@ - 1.2 - - + 1.2 + + CellML, the format for mathematical models of biological and other networks. - - + + CellML @@ -32817,15 +32968,15 @@ - 1.2 - - - - + 1.2 + + + + Tabular Molecular Interaction format (MITAB), standardised by HUPO PSI MI. - - + + PSI MI TAB (MITAB) @@ -32835,12 +32986,12 @@ - 1.2 - + 1.2 + Protein affinity format (PSI-PAR), standardised by HUPO PSI MI. It is compatible with PSI MI XML (MIF) and uses the same XML Schema. - - + + PSI-PAR @@ -32851,12 +33002,12 @@ - 1.2 - + 1.2 + mzML format for raw spectrometer output data, standardised by HUPO PSI MSS. - - + + mzML is the successor and unifier of the mzData format developed by PSI and mzXML developed at the Seattle Proteome Center. mzML @@ -32873,11 +33024,11 @@ - 1.2 - true + 1.2 + true Format for mass pectra and derived data, include peptide sequences etc. 
- - + + Mass spectrometry data format @@ -32888,12 +33039,12 @@ - 1.2 - + 1.2 + TraML (Transition Markup Language) is the format for mass spectrometry transitions, standardised by HUPO PSI MSS. - - + + TraML @@ -32904,12 +33055,12 @@ - 1.2 - + 1.2 + mzIdentML is the exchange format for peptides and proteins identified from mass spectra, standardised by HUPO PSI PI. It can be used for outputs of proteomics search engines. - - + + mzIdentML @@ -32920,12 +33071,12 @@ - 1.2 - + 1.2 + mzQuantML is the format for quantitation values associated with peptides, proteins and small molecules from mass spectra, standardised by HUPO PSI PI. It can be used for outputs of quantitation software for proteomics. - - + + mzQuantML @@ -32936,12 +33087,12 @@ - 1.2 - + 1.2 + GelML is the format for describing the process of gel electrophoresis, standardised by HUPO PSI PS. - - + + GelML @@ -32952,12 +33103,12 @@ - 1.2 - + 1.2 + spML is the format for describing proteomics sample processing, other than using gels, prior to mass spectrometric protein identification, standardised by HUPO PSI PS. It may also be applicable for metabolomics. - - + + spML @@ -32968,10 +33119,10 @@ - 1.2 + 1.2 A human-readable encoding for the Web Ontology Language (OWL). - - + + OWL Functional Syntax @@ -32982,10 +33133,10 @@ - 1.2 + 1.2 A syntax for writing OWL class expressions. - - + + This format was influenced by the OWL Abstract Syntax and the DL style syntax. Manchester OWL Syntax @@ -32997,10 +33148,10 @@ - 1.2 + 1.2 A superset of the "Description-Logic Knowledge Representation System Specification from the KRSS Group of the ARPA Knowledge Sharing Effort". - - + + This format is used in Protege 4. KRSS2 Syntax @@ -33012,10 +33163,12 @@ - 1.2 + 1.2 + ttl + The Terse RDF Triple Language (Turtle) is a human-friendly serialisation format for RDF (Resource Description Framework) graphs. - - + + The SPARQL Query Language incorporates a very similar syntax. 
Turtle @@ -33027,11 +33180,11 @@ - 1.2 - nt + 1.2 + nt A plain text serialisation format for RDF (Resource Description Framework) graphs, and a subset of the Turtle (Terse RDF Triple Language) format. - - + + N-Triples should not be confused with Notation 3 which is a superset of Turtle. N-Triples @@ -33043,12 +33196,12 @@ - 1.2 - n3 + 1.2 + n3 A shorthand non-XML serialisation of Resource Description Framework model, designed with human-readability in mind. N3 - - + + Notation3 @@ -33059,11 +33212,12 @@ - 1.2 - rdf + 1.2 + rdf + Resource Description Framework (RDF) XML format. - - + + RDF/XML is a serialisation syntax for OWL DL, but not for OWL Full. RDF/XML http://www.ebi.ac.uk/SWO/data/SWO_3000006 @@ -33076,11 +33230,11 @@ - 1.2 + 1.2 OWL ontology XML serialisation format. OWL - - + + OWL/XML @@ -33091,12 +33245,12 @@ - 1.3 - + 1.3 + The A2M format is used as the primary format for multiple alignments of protein or nucleic-acid sequences in the SAM suite of tools. It is a small modification of FASTA format for sequences and is compatible with most tools that read FASTA. - - + + A2M @@ -33107,13 +33261,13 @@ - 1.3 - + 1.3 + Standard flowgram format (SFF) is a binary file format used to encode results of pyrosequencing from the 454 Life Sciences platform for high-throughput sequencing. Standard flowgram format - - + + SFF @@ -33123,12 +33277,12 @@ - 1.3 - + 1.3 + The MAP file describes SNPs and is used by the Plink package. Plink MAP - - + + MAP @@ -33138,12 +33292,12 @@ - 1.3 - + 1.3 + The PED file describes individuals and genetic data and is used by the Plink package. Plink PED - - + + PED @@ -33153,11 +33307,11 @@ - 1.3 - true + 1.3 + true Data format for a metadata on an individual and their genetic data. - - + + Individual genetic data format @@ -33168,12 +33322,12 @@ - 1.3 - + 1.3 + The PED/MAP file describes data used by the Plink package. 
Plink PED/MAP - - + + PED/MAP @@ -33184,14 +33338,14 @@ - 1.3 - - + 1.3 + + File format of a CT (Connectivity Table) file from the RNAstructure package. Connect format Connectivity Table file format - - + + CT @@ -33202,11 +33356,11 @@ - 1.3 - + 1.3 + XRNA old input style format. - - + + SS @@ -33218,11 +33372,11 @@ - 1.3 - + 1.3 + RNA Markup Language. - - + + RNAML @@ -33233,11 +33387,11 @@ - 1.3 - + 1.3 + Format for the Genetic Data Environment (GDE). - - + + GDE @@ -33247,12 +33401,12 @@ - 1.3 - + 1.3 + A multiple alignment in vertical format, as used in the AMPS (Alignment of Multiple Protein Sequences) pacakge. Block file format - - + + BLC @@ -33268,11 +33422,11 @@ - 1.3 - true + 1.3 + true Format of a data index of some type. - - + + Data index format @@ -33289,11 +33443,11 @@ - 1.3 - + 1.3 + BAM indexing format - - + + BAI @@ -33303,11 +33457,11 @@ - 1.3 - + 1.3 + HMMER profile HMM file for HMMER versions 2.x - - + + HMMER2 @@ -33317,11 +33471,11 @@ - 1.3 - + 1.3 + HMMER profile HMM file for HMMER versions 3.x - - + + HMMER3 @@ -33331,11 +33485,11 @@ - 1.3 - + 1.3 + EMBOSS simple sequence pair alignment format. - - + + PO @@ -33347,11 +33501,11 @@ - 1.3 - + 1.3 + XML format as produced by the NCBI Blast package - - + + BLAST XML results format @@ -33362,12 +33516,12 @@ - 1.7 - http://www.ebi.ac.uk/ena/software/cram-usage#format_specification http://samtools.github.io/hts-specs/CRAMv2.1.pdf + 1.7 + http://www.ebi.ac.uk/ena/software/cram-usage#format_specification http://samtools.github.io/hts-specs/CRAMv2.1.pdf http://www.ebi.ac.uk/ena/software/cram-usage#format_specification http://samtools.github.io/hts-specs/CRAMv2.1.pdf Reference-based compression of alignment format - - + + CRAM @@ -33378,16 +33532,14 @@ - 1.7 - json - - - + 1.7 + json + JavaScript Object Notation format; a lightweight, text-based format to represent tree-structured data using key-value pairs. 
JavaScript Object Notation - - - geo + + + JSON @@ -33398,10 +33550,10 @@ - 1.7 + 1.7 Encapsulated PostScript format - - + + EPS @@ -33411,10 +33563,10 @@ - 1.7 + 1.7 Graphics Interchange Format. - - + + GIF @@ -33425,11 +33577,11 @@ - 1.7 + 1.7 Microsoft Excel spreadsheet format. Microsoft Excel format - - + + xls @@ -33439,18 +33591,18 @@ - 1.7 - tab - tsv - + 1.7 + tab + tsv + Tabular data represented as tab-separated values in a text file. Tab-delimited Tab-separated values tab - - + + TSV @@ -33460,11 +33612,11 @@ - 1.7 - 1.10 - + 1.7 + 1.10 + Format of a file of gene expression data, e.g. a gene expression matrix or profile. - + Gene expression data format true @@ -33477,10 +33629,10 @@ - 1.7 + 1.7 Format of the cytoscape input file of gene expression ratios or values are specified over one or more experiments. - - + + Cytoscape input file format @@ -33497,12 +33649,12 @@ - 1.7 - https://github.com/BenLangmead/bowtie/blob/master/MANUAL + 1.7 + https://github.com/BenLangmead/bowtie/blob/master/MANUAL Bowtie format for indexed reference genome for "small" genomes. Bowtie index format - - + + ebwt @@ -33512,12 +33664,12 @@ - 1.7 - http://www.molbiol.ox.ac.uk/tutorials/Seqlab_GCG.pdf + 1.7 + http://www.molbiol.ox.ac.uk/tutorials/Seqlab_GCG.pdf Rich sequence format. GCG RSF - - + + RSF-format files contain one or more sequences that may or may not be related. In addition to the sequence data, each sequence can be annotated with descriptive sequence information (from the GCG manual). RSF @@ -33530,10 +33682,10 @@ - 1.7 + 1.7 Some format based on the GCG format. - - + + GCG format variant @@ -33544,11 +33696,11 @@ - 1.7 - http://rothlab.ucdavis.edu/genhelp/chapter_2_using_sequences.html#_Creating_and_Editing_Single_Sequenc + 1.7 + http://rothlab.ucdavis.edu/genhelp/chapter_2_using_sequences.html#_Creating_and_Editing_Single_Sequenc Bioinformatics Sequence Markup Language format. 
- - + + BSML @@ -33565,12 +33717,12 @@ - 1.7 - https://github.com/BenLangmead/bowtie/blob/master/MANUAL + 1.7 + https://github.com/BenLangmead/bowtie/blob/master/MANUAL Bowtie format for indexed reference genome for "large" genomes. Bowtie long index format - - + + ebwtl @@ -33581,11 +33733,11 @@ - 1.8 + 1.8 Ensembl standard format for variation data. - - + + Ensembl variation file format @@ -33596,12 +33748,12 @@ - 1.8 + 1.8 Microsoft Word format. Microsoft Word format doc - - + + docx @@ -33611,12 +33763,14 @@ - 1.8 - true + 1.8 + true Format of documents including word processor, spreadsheet and presentation. - - + + Document format + TODO: fix all these, also format types and data types, and a generic OOXML format. See https://en.wikipedia.org/wiki/Office_Open_XML, https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats. (IANA e.g. https://www.iana.org/assignments/media-types/application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) +TODO Check also all where applicable from https://en.wikipedia.org/wiki/Media_type#Common_examples @@ -33626,10 +33780,10 @@ - 1.8 + 1.8 Portable Document Format - - + + PDF @@ -33645,11 +33799,11 @@ - 1.9 - true + 1.9 + true Format used for images and image metadata. - - + + Image format @@ -33660,11 +33814,11 @@ - 1.9 - + 1.9 + Medical image format corresponding to the Digital Imaging and Communications in Medicine (DICOM) standard. - - + + DICOM format @@ -33675,14 +33829,14 @@ - 1.9 - - nii + 1.9 + + nii An open file format from the Neuroimaging Informatics Technology Initiative (NIfTI) commonly used to store brain imaging data obtained using Magnetic Resonance Imaging (MRI) methods. NIFTI format NIfTI-1 format - - + + nii @@ -33693,12 +33847,12 @@ - 1.9 - + 1.9 + Text-based tagged file format for medical images generated using the MetaImage software package. 
Metalmage format - - + + mhd @@ -33709,11 +33863,11 @@ - 1.9 - + 1.9 + Nearly Raw Rasta Data format designed to support scientific visualisation and image processing involving N-dimensional raster data. - - + + nrrd @@ -33723,10 +33877,10 @@ - 1.9 + 1.9 File format used for scripts written in the R programming language for execution within the R software environment, typically for statistical computation and graphics. - - + + R file format @@ -33736,10 +33890,10 @@ - 1.9 + 1.9 File format used for scripts for the Statistical Package for the Social Sciences. - - + + SPSS @@ -33749,12 +33903,12 @@ - 1.9 - - eml - mht - mhtml - + 1.9 + + eml + mht + mhtml + MIME HTML format for Web pages, which can include external resources, including images, Flash animations and so on. @@ -33770,8 +33924,8 @@ MIME multipart format MIME multipart message MIME multipart message format - - + + MHTML is not strictly an HTML format, it is encoded as an HTML email message (although with multipart/related instead of multipart/alternative). It, however, contains the main HTML block as its core, and thus it is for practical reasons incuded in EDAM as a specialisation of 'HTML'. MHTML @@ -33790,10 +33944,10 @@ - 1.10 + 1.10 Proprietary file format for (raw) BeadArray data used by genomewide profiling platforms from Illumina Inc. This format is output directly from the scanner and stores summary intensities for each probe-type on an array. - - + + IDAT @@ -33804,13 +33958,13 @@ - 1.10 - + 1.10 + Joint Picture Group file format for lossy graphics file. JPEG jpeg - - + + Sequence of segments with markers. Begins with byte of 0xFF and follows by marker type. JPG @@ -33822,10 +33976,10 @@ - 1.10 + 1.10 Reporter Code Count-A data file (.csv) output by the Nanostring nCounter Digital Analyzer, which contains gene sample information, probe information and probe counts. 
- - + + rcc @@ -33835,12 +33989,12 @@ - 1.11 - + 1.11 + ARFF (Attribute-Relation File Format) is an ASCII text file format that describes a list of instances sharing a set of attributes. - - + + This file format is for machine learning. arff @@ -33852,12 +34006,12 @@ - 1.11 - + 1.11 + AFG is a single text-based file assembly format that holds read and consensus information together - - + + afg @@ -33868,12 +34022,12 @@ - 1.11 - + 1.11 + The bedGraph format allows display of continuous-valued data in track format. This display type is useful for probability scores and transcriptome data - - + + Holds a tab-delimited chromosome /start /end / datavalue dataset. bedgraph @@ -33884,12 +34038,12 @@ - 1.11 - + 1.11 + Browser Extensible Data (BED) format of sequence annotation track that strictly does not contain non-standard fields beyond the first 3 columns. - - + + Galaxy allows BED files to contain non-standard fields beyond the first 3 columns, some other implementations do not. bedstrict @@ -33900,12 +34054,12 @@ - 1.11 - + 1.11 + BED file format where each feature is described by chromosome, start, end, name, score, and strand. - - + + Tab delimited data in strict BED format - no non-standard columns allowed; column count forced to 6 bed6 @@ -33916,12 +34070,12 @@ - 1.11 - + 1.11 + A BED file where each feature is described by all twelve columns. - - + + Tab delimited data in strict BED format - no non-standard columns allowed; column count forced to 12 bed12 @@ -33933,12 +34087,12 @@ - 1.11 - + 1.11 + Tabular format of chromosome names and sizes used by Galaxy. - - + + Galaxy allows BED files to contain non-standard fields beyond the first 3 columns, some other implementations do not. chrominfo @@ -33950,12 +34104,12 @@ - 1.11 - + 1.11 + Custom Sequence annotation track format used by Galaxy. - - + + Used for tracks/track views within galaxy. customtrack @@ -33967,12 +34121,12 @@ - 1.11 - + 1.11 + Color space FASTA format sequence variant. 
- - + + FASTA format extended for color space information. csfasta @@ -33984,13 +34138,13 @@ - 1.11 - + 1.11 + HDF5 is a data model, library, and file format for storing and managing data, based on Hierarchical Data Format (HDF). h5 - - + + An HDF5 file appears to the user as a directed graph. The nodes of this graph are the higher-level HDF5 objects that are exposed by the HDF5 APIs: Groups, Datasets, Named datatypes. Currently supported by the Python MDTraj package. HDF5 is the new version, according to the HDF group, a completely different technology (https://support.hdfgroup.org/products/hdf4/ compared to HDF. HDF5 @@ -34003,12 +34157,12 @@ - 1.11 - + 1.11 + A versatile bitmap format. tiff - - + + The TIFF format is perhaps the most versatile and diverse bitmap format in existence. Its extensible nature and support for numerous data compression schemes allow developers to customize the TIFF format to fit any peculiar data storage needs. TIFF @@ -34020,12 +34174,12 @@ - 1.11 - + 1.11 + Standard bitmap storage format in the Microsoft Windows environment. bmp - - + + Although it is based on Windows internal bitmap data structures, it is supported by many non-Windows and non-PC applications. BMP @@ -34037,11 +34191,11 @@ - 1.11 - + 1.11 + IM is a format used by LabEye and other applications based on the IFUNC image processing library. - - + + IFUNC library reads and writes most uncompressed interchange versions of this format. im @@ -34053,12 +34207,12 @@ - 1.11 - - pcd + 1.11 + + pcd Photo CD format, which is the highest resolution format for images on a CD. - - + + PCD was developed by Kodak. A PCD file contains five different resolution (ranging from low to high) of a slide or film negative. Due to it PCD is often used by many photographers and graphics professionals for high-end printed applications. pcd @@ -34070,11 +34224,11 @@ - 1.11 - + 1.11 + PCX is an image file format that uses a simple form of run-length encoding. It is lossless. 
- - + + pcx @@ -34085,11 +34239,11 @@ - 1.11 - + 1.11 + The PPM format is a lowest common denominator color image file format. - - + + ppm @@ -34100,11 +34254,11 @@ - 1.11 - + 1.11 + PSD (Photoshop Document) is a proprietary file that allows the user to work with the images' individual layers even after the file has been saved. - - + + psd @@ -34115,11 +34269,11 @@ - 1.11 - + 1.11 + X BitMap is a plain text binary image format used by the X Window System used for storing cursor and icon bitmaps used in the X GUI. - - + + The XBM format was replaced by XPM for X11 in 1989. xbm @@ -34131,11 +34285,11 @@ - 1.11 - + 1.11 + X PixMap (XPM) is an image file format used by the X Window System, it is intended primarily for creating icon pixmaps, and supports transparent pixels. - - + + Sequence of segments with markers. Begins with byte of 0xFF and follows by marker type. xpm @@ -34147,11 +34301,11 @@ - 1.11 - + 1.11 + RGB file format is the native raster graphics file format for Silicon Graphics workstations. - - + + rgb @@ -34162,11 +34316,11 @@ - 1.11 - + 1.11 + The PBM format is a lowest common denominator monochrome file format. It serves as the common language of a large family of bitmap image conversion filters. - - + + pbm @@ -34177,11 +34331,11 @@ - 1.11 - + 1.11 + The PGM format is a lowest common denominator grayscale file format. - - + + It is designed to be extremely easy to learn and write programs for. pgm @@ -34193,12 +34347,12 @@ - 1.11 - - png + 1.11 + + png PNG is a file format for image compression. - - + + It iis expected to replace the Graphics Interchange Format (GIF). PNG @@ -34210,12 +34364,12 @@ - 1.11 - + 1.11 + Scalable Vector Graphics (SVG) is an XML-based vector image format for two-dimensional graphics with support for interactivity and animation. Scalable Vector Graphics - - + + The SVG specification is an open standard developed by the World Wide Web Consortium (W3C) since 1999. 
SVG @@ -34227,11 +34381,11 @@ - 1.11 - + 1.11 + Sun Raster is a raster graphics file format used on SunOS by Sun Microsystems - - + + The SVG specification is an open standard developed by the World Wide Web Consortium (W3C) since 1999. rast @@ -34248,11 +34402,11 @@ - 1.11 - true + 1.11 + true Textual report format for sequence quality for reports from sequencing machines. - - + + Sequence quality report format (text) @@ -34264,11 +34418,11 @@ - 1.11 - http://en.wikipedia.org/wiki/Phred_quality_score + 1.11 + http://en.wikipedia.org/wiki/Phred_quality_score FASTQ format subset for Phred sequencing quality score data only (no sequences). - - + + Phred quality scores are defined as a property which is logarithmically related to the base-calling error probabilities. qual @@ -34280,10 +34434,10 @@ - 1.11 + 1.11 FASTQ format subset for Phred sequencing quality score data only (no sequences) for Solexa/Illumina 1.0 format. - - + + Solexa/Illumina 1.0 format can encode a Solexa/Illumina quality score from -5 to 62 using ASCII 59 to 126 (although in raw read data Solexa scores from -5 to 40 only are expected) qualsolexa @@ -34295,11 +34449,11 @@ - 1.11 - http://en.wikipedia.org/wiki/Phred_quality_score + 1.11 + http://en.wikipedia.org/wiki/Phred_quality_score FASTQ format subset for Phred sequencing quality score data only (no sequences) from Illumina 1.5 and before Illumina 1.8. - - + + Starting in Illumina 1.5 and before Illumina 1.8, the Phred scores 0 to 2 have a slightly different meaning. The values 0 and 1 are no longer used and the value 2, encoded by ASCII 66 "B", is used also at the end of reads as a Read Segment Quality Control Indicator. qualillumina @@ -34310,11 +34464,11 @@ - 1.11 - http://en.wikipedia.org/wiki/Phred_quality_score + 1.11 + http://en.wikipedia.org/wiki/Phred_quality_score FASTQ format subset for Phred sequencing quality score data only (no sequences) for SOLiD data. 
- - + + For SOLiD data, the sequence is in color space, except the first position. The quality values are those of the Sanger format. qualsolid @@ -34325,11 +34479,11 @@ - 1.11 - http://en.wikipedia.org/wiki/Phred_quality_score + 1.11 + http://en.wikipedia.org/wiki/Phred_quality_score FASTQ format subset for Phred sequencing quality score data only (no sequences) from 454 sequencers. - - + + qual454 @@ -34339,12 +34493,12 @@ - 1.11 - + 1.11 + Human ENCODE peak format. - - + + Format that covers both the broad peak format and narrow peak format from ENCODE. ENCODE peak format @@ -34355,12 +34509,12 @@ - 1.11 - + 1.11 + Human ENCODE narrow peak format. - - + + Format that covers both the broad peak format and narrow peak format from ENCODE. ENCODE narrow peak format @@ -34371,12 +34525,12 @@ - 1.11 - + 1.11 + Human ENCODE broad peak format. - - + + ENCODE broad peak format @@ -34387,12 +34541,12 @@ - 1.11 - - bgz + 1.11 + + bgz Blocked GNU Zip format. - - + + BAM files are compressed using a variant of GZIP (GNU ZIP), into a format called BGZF (Blocked GNU Zip Format). bgzip @@ -34404,12 +34558,12 @@ - 1.11 - + 1.11 + TAB-delimited genome position file index format. - - + + tabix @@ -34419,11 +34573,11 @@ - 1.11 - true + 1.11 + true Data format for graph data. - - + + Graph format @@ -34433,11 +34587,11 @@ - 1.11 - + 1.11 + XML-based format used to store graph descriptions within Galaxy. - - + + xgmml @@ -34447,11 +34601,11 @@ - 1.11 - + 1.11 + SIF (simple interaction file) Format - a network/pathway format used for instance in cytoscape. - - + + sif @@ -34462,10 +34616,11 @@ - 1.11 + 1.11 + MS Excel spreadsheet format consisting of a set of XML documents stored in a ZIP-compressed file. - - + + xlsx @@ -34475,11 +34630,11 @@ - 1.11 - + 1.11 + Data format used by the SQLite database. - - + + SQLite format @@ -34490,11 +34645,11 @@ - 1.11 - + 1.11 + Data format used by the SQLite database conformant to the Gemini schema. 
- - + + Gemini SQLite format @@ -34504,13 +34659,13 @@ - 1.11 - Duplicate of http://edamontology.org/format_3326 - 1.20 - - + 1.11 + Duplicate of http://edamontology.org/format_3326 + 1.20 + + Format of a data index of some type. - + Index format true @@ -34523,10 +34678,10 @@ - 1.11 + 1.11 An index of a genome database, indexed for use by the snpeff tool. - - + + snpeffdb @@ -34542,14 +34697,14 @@ - 1.12 - + 1.12 + Binary format used by MATLAB files to store workspace variables. .mat file format MAT file format MATLAB file format - - + + MAT @@ -34561,14 +34716,15 @@ - 1.12 - + 1.12 + Format used by netCDF software library for writing and reading chromatography-MS data files. Also used to store trajectory atom coordinates information, such as the ones obtained by Molecular Dynamics simulations. ANDI-MS - - + + Network Common Data Form (NetCDF) library is supported by AMBER MD package from version 9. - netCDF + NetCDF + @@ -34577,11 +34733,11 @@ - 1.12 - mgf + 1.12 + mgf Mascot Generic Format. Encodes multiple MS/MS spectra in a single file. - - + + Files includes *m*/*z*, intensity pairs separated by headers; headers can contain a bit more information, including search engine instructions. MGF @@ -34592,10 +34748,10 @@ - 1.12 + 1.12 Spectral data format file where each spectrum is written to a separate file. - - + + Each file contains one header line for the known or assumed charge and the mass of the precursor peptide ion, calculated from the measured *m*/*z* and the charge. This one line was then followed by all the *m*/*z*, intensity pairs that represent the spectrum. dta @@ -34606,10 +34762,10 @@ - 1.12 + 1.12 Spectral data file similar to dta. - - + + Differ from .dta only in subtleties of the header line format and content and support the added feature of being able to. 
pkl @@ -34620,11 +34776,11 @@ - 1.12 - https://dx.doi.org/10.1038%2Fnbt1031 + 1.12 + https://dx.doi.org/10.1038%2Fnbt1031 Common file format for proteomics mass spectrometric data developed at the Seattle Proteome Center/Institute for Systems Biology. - - + + mzXML @@ -34635,11 +34791,11 @@ - 1.12 - http://sashimi.sourceforge.net/schema_revision/pepXML/pepXML_v118.xsd + 1.12 + http://sashimi.sourceforge.net/schema_revision/pepXML/pepXML_v118.xsd Open data format for the storage, exchange, and processing of peptide sequence assignments of MS/MS scans, intended to provide a common data output format for many different MS/MS search engines and subsequent peptide-level analyses. - - + + pepXML @@ -34650,11 +34806,11 @@ - 1.12 - + 1.12 + Graphical Pathway Markup Language (GPML) is an XML format used for exchanging biological pathways. - - + + GPML @@ -34665,10 +34821,10 @@ - 1.12 - - oxlicg - + 1.12 + + oxlicg + A list of k-mers and their occurences in a dataset. Can also be used as an implicit De Bruijn graph. @@ -34682,13 +34838,13 @@ - - 1.13 - + + 1.13 + mzTab is a tab-delimited format for mass spectrometry-based proteomics and metabolomics results. - - + + mzTab @@ -34701,18 +34857,18 @@ - + - - 1.13 - - imzml + + 1.13 + + imzml imzML metadata is a data format for mass spectrometry imaging metadata. - - + + imzML data are recorded in 2 files: '.imzXML' is a metadata XML file based on mzML by HUPO-PSI, and '.ibd' is a binary file containing the mass spectra. This entry is for the metadata XML file imzML metadata file @@ -34725,12 +34881,12 @@ - 1.13 - + 1.13 + qcML is an XML format for quality-related data of mass spectrometry and other high-throughput measurements. - - + + The focus of qcML is towards mass spectrometry based proteomics, but the format is suitable for metabolomics and sequencing as well. 
qcML @@ -34743,12 +34899,12 @@ - 1.13 - + 1.13 + PRIDE XML is an XML format for mass spectra, peptide and protein identifications, and metadata about a corresponding measurement, sample, experiment. - - + + PRIDE XML @@ -34759,14 +34915,14 @@ - - - 1.13 - + + + 1.13 + Simulation Experiment Description Markup Language (SED-ML) is an XML format for encoding simulation setups, according to the MIASE (Minimum Information About a Simulation Experiment) requirements. - - + + SED-ML @@ -34778,13 +34934,13 @@ - - 1.13 - + + 1.13 + Open Modeling EXchange format (OMEX) is a ZIPped format for encapsulating all information necessary for a modeling and simulation project in systems biology. - - + + An OMEX file is a ZIP container that includes a manifest file, listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. OMEX is one of the standardised formats within COMBINE (Computational Modeling in Biology Network). COMBINE OMEX @@ -34797,13 +34953,13 @@ - 1.13 - + 1.13 + The Investigation / Study / Assay (ISA) tab-delimited (TAB) format incorporates metadata from experiments employing a combination of technologies. - - + + ISA-TAB is based on MAGE-TAB. Other than tabular, the ISA model can also be represented in RDF, and in JSON (compliable with a set of defined JSON Schemata). ISA-TAB @@ -34815,13 +34971,13 @@ experiments employing a combination of technologies. - - 1.13 - + + 1.13 + SBtab is a tabular format for biochemical network models. - - + + SBtab @@ -34832,12 +34988,12 @@ experiments employing a combination of technologies. - 1.13 - + 1.13 + Biological Connection Markup Language (BCML) is an XML format for biological pathways. - - + + BCML @@ -34847,13 +35003,13 @@ experiments employing a combination of technologies. - - 1.13 - + + 1.13 + Biological Dynamics Markup Language (BDML) is an XML format for quantitative data describing biological dynamics. 
- - + + BDML @@ -34863,12 +35019,12 @@ experiments employing a combination of technologies. - 1.13 - + 1.13 + Biological Expression Language (BEL) is a textual format for representing scientific findings in life sciences in a computable form. - - + + BEL @@ -34879,12 +35035,12 @@ experiments employing a combination of technologies. - 1.13 - + 1.13 + SBGN-ML is an XML format for Systems Biology Graphical Notation (SBGN) diagrams of biological pathways or networks. - - + + SBGN-ML @@ -34895,13 +35051,13 @@ experiments employing a combination of technologies. - 1.13 - - agp + 1.13 + + agp AGP is a tabular format for a sequence assembly (a contig, a scaffold/supercontig, or a chromosome). - - + + AGP @@ -34911,11 +35067,11 @@ experiments employing a combination of technologies. - 1.13 + 1.13 PostScript format PostScript - - + + PS @@ -34925,14 +35081,14 @@ experiments employing a combination of technologies. - 1.13 - - sra + 1.13 + + sra SRA archive format (SRA) is the archive format used for input to the NCBI Sequence Read Archive. SRA SRA archive format - - + + SRA format @@ -34942,12 +35098,12 @@ experiments employing a combination of technologies. - 1.13 - + 1.13 + VDB ('vertical database') is the native format used for export from the NCBI Sequence Read Archive. SRA native format - - + + VDB @@ -34964,11 +35120,11 @@ experiments employing a combination of technologies. - 1.13 - + 1.13 + Index file format used by the samtools package to index TAB-delimited genome position files. - - + + Tabix index file format @@ -34978,10 +35134,10 @@ experiments employing a combination of technologies. - 1.13 + 1.13 A five-column, tab-delimited table of feature locations and qualifiers for importing annotation into an existing Sequin submission (an NCBI tool for submitting and updating GenBank entries). - - + + Sequin format @@ -34991,11 +35147,11 @@ experiments employing a combination of technologies. 
- 1.14 + 1.14 Proprietary mass-spectrometry format of Thermo Scientific's ProteomeDiscoverer software. Magellan storage file format - - + + This format corresponds to an SQLite database, and you can look into the files with e.g. SQLiteStudio3. There are also some readers (http://doi.org/10.1021/pr2005154) and converters (http://doi.org/10.1016/j.jprot.2015.06.015) for this format available, which re-engineered the database schema, but there is no official DB schema specification of Thermo Scientific for the format. MSF @@ -35012,11 +35168,11 @@ experiments employing a combination of technologies. - 1.14 - true + 1.14 + true Data format for biodiversity data. - - + + Biodiversity data format @@ -35032,12 +35188,12 @@ experiments employing a combination of technologies. - 1.14 - + 1.14 + Exchange format of the Access to Biological Collections Data (ABCD) Schema; a standard for the access to and exchange of data about specimens and observations (primary biodiversity data). ABCD - - + + ABCD format @@ -35048,12 +35204,12 @@ experiments employing a combination of technologies. - 1.14 + 1.14 Tab-delimited text files of GenePattern that contain a column for each sample, a row for each gene, and an expression value for each gene in each sample. GCT format Res format - - + + GCT/Res format @@ -35064,12 +35220,12 @@ experiments employing a combination of technologies. - 1.14 - wiff + 1.14 + wiff Mass spectrum file format from QSTAR and QTRAP instruments (ABI/Sciex). wiff - - + + WIFF format @@ -35080,11 +35236,11 @@ experiments employing a combination of technologies. - 1.14 - + 1.14 + Output format used by X! series search engines that is based on the XML language BIOML. - - + + X!Tandem XML @@ -35095,10 +35251,10 @@ experiments employing a combination of technologies. - 1.14 + 1.14 Proprietary file format for mass spectrometry data from Thermo Scientific. - - + + Proprietary format for which documentation is not available. 
Thermo RAW @@ -35110,11 +35266,11 @@ experiments employing a combination of technologies. - 1.14 - + 1.14 + "Raw" result file from Mascot database search. - - + + Mascot .dat file @@ -35125,12 +35281,12 @@ experiments employing a combination of technologies. - 1.14 - + 1.14 + Format of peak list files from Andromeda search engine (MaxQuant) that consist of arbitrarily many spectra. MaxQuant APL - - + + MaxQuant APL peaklist format @@ -35140,11 +35296,11 @@ experiments employing a combination of technologies. - 1.14 - + 1.14 + Synthetic Biology Open Language (SBOL) is an XML format for the specification and exchange of biological design information in synthetic biology. - - + + SBOL introduces a standardised format for the electronic exchange of information on the structural and functional aspects of biological designs. SBOL @@ -35155,11 +35311,11 @@ experiments employing a combination of technologies. - 1.14 - + 1.14 + PMML uses XML to represent mining models. The structure of the models is described by an XML Schema. - - + + One or more mining models can be contained in a PMML document. PMML @@ -35171,11 +35327,11 @@ experiments employing a combination of technologies. - 1.14 - + 1.14 + Image file format used by the Open Microscopy Environment (OME). - - + + An OME-TIFF dataset consists of one or more files in standard TIFF or BigTIFF format, with the file extension .ome.tif or .ome.tiff, and an identical (or in the case of multiple files, nearly identical) string of OME-XML metadata embedded in the ImageDescription tag of each file's first IFD (Image File Directory). BigTIFF file extensions are also permitted, with the file extension .ome.tf2, .ome.tf8 or .ome.btf, but note these file extensions are an addition to the original specification, and software using an older version of the specification may not be able to handle these file extensions. 
OME develops open-source software and data format standards for the storage and manipulation of biological microscopy data. It is a joint project between universities, research establishments, industry and the software development community. OME-TIFF @@ -35187,11 +35343,11 @@ experiments employing a combination of technologies. - 1.14 - + 1.14 + The LocARNA PP format combines sequence or alignment information and (respectively, single or consensus) ensemble probabilities into an PP 2.0 record. - - + + Format for multiple aligned or single sequences together with the probabilistic description of the (consensus) RNA secondary structure ensemble by probabilities of base pairs, base pair stackings, and base pairs and unpaired bases in the loop of base pairs. LocARNA PP @@ -35202,11 +35358,11 @@ experiments employing a combination of technologies. - 1.14 - + 1.14 + Input format used by the Database of Genotypes and Phenotypes (dbGaP). - - + + The Database of Genotypes and Phenotypes (dbGaP) is a National Institutes of Health (NIH) sponsored repository charged to archive, curate and distribute information produced by studies investigating the interaction of genotype and phenotype. dbGaP format @@ -35218,15 +35374,15 @@ experiments employing a combination of technologies. - - 1.15 - - biom + + 1.15 + + biom The BIological Observation Matrix (BIOM) is a format for representing biological sample by observation contingency tables in broad areas of comparative omics. The primary use of this format is to represent OTU tables and metagenome tables. BIological Observation Matrix format biom - - + + BIOM is a recognised standard for the Earth Microbiome Project, and is a project supported by Genomics Standards Consortium. Supported in QIIME, Mothur, MEGAN, etc. BIOM format @@ -35238,12 +35394,12 @@ experiments employing a combination of technologies. 
- 1.15 - + 1.15 + A format for storage, exchange, and processing of protein identifications created from ms/ms-derived peptide sequence data. - - + + No human-consumable information about this format is available (see http://tools.proteomecenter.org/wiki/index.php?title=Formats:protXML). protXML http://doi.org/10.1038/msb4100024 @@ -35258,18 +35414,18 @@ experiments employing a combination of technologies. - + - - - 1.15 - true + + + 1.15 + true A linked data format enables publishing structured data as linked data (Linked Data), so that the data can be interlinked and become more useful through semantic queries. Semantic Web format - - + + Linked data format @@ -35282,16 +35438,15 @@ experiments employing a combination of technologies. - 1.15 - - jsonld - - + 1.15 + + jsonld + JSON-LD, or JavaScript Object Notation for Linked Data, is a method of encoding Linked Data using JSON. JavaScript Object Notation for Linked Data jsonld - - + + JSON-LD @@ -35302,16 +35457,16 @@ experiments employing a combination of technologies. - 1.15 - - yaml - yml + 1.15 + + yaml + yml YAML (YAML Ain't Markup Language) is a human-readable tree-structured data serialisation language. YAML Ain't Markup Language yml - - + + Data in YAML format can be serialised into text, or binary format. YAML version 1.2 is a superset of JSON; prior versions were "not strictly compatible". YAML @@ -35323,13 +35478,13 @@ experiments employing a combination of technologies. - - 1.16 + + 1.16 Tabular data represented as values in a text file delimited by some character. Delimiter-separated values Tabular format - - + + DSV @@ -35340,15 +35495,15 @@ experiments employing a combination of technologies. - 1.16 - csv - + 1.16 + csv + Tabular data represented as comma-separated values in a text file. Comma-separated values - - + + CSV @@ -35359,11 +35514,11 @@ experiments employing a combination of technologies. - 1.16 - out + 1.16 + out "Raw" result file from SEQUEST database search. 
- - + + SEQUEST .out file @@ -35374,12 +35529,12 @@ experiments employing a combination of technologies. - 1.16 - http://ftp.mi.fu-berlin.de/pub/OpenMS/release1.9-documentation/html/classOpenMS_1_1IdXMLFile.html - http://open-ms.sourceforge.net/schemas/ + 1.16 + http://ftp.mi.fu-berlin.de/pub/OpenMS/release1.9-documentation/html/classOpenMS_1_1IdXMLFile.html + http://open-ms.sourceforge.net/schemas/ XML file format for files containing information about peptide identifications from mass spectrometry data analysis carried out with OpenMS. - - + + idXML @@ -35389,10 +35544,10 @@ experiments employing a combination of technologies. - 1.16 + 1.16 Data table formatted such that it can be passed/streamed within the KNIME platform. - - + + KNIME datatable format @@ -35404,14 +35559,14 @@ experiments employing a combination of technologies. - 1.16 - + 1.16 + UniProtKB XML sequence features format is an XML format available for downloading UniProt entries. UniProt XML UniProt XML format UniProtKB XML format - - + + UniProtKB XML @@ -35422,8 +35577,8 @@ experiments employing a combination of technologies. - 1.16 - + 1.16 + UniProtKB RDF sequence features format is an RDF format available for downloading UniProt entries (in RDF/XML). UniProt RDF UniProt RDF format @@ -35432,8 +35587,8 @@ experiments employing a combination of technologies. UniProtKB RDF format UniProtKB RDF/XML UniProtKB RDF/XML format - - + + UniProtKB RDF @@ -35477,12 +35632,12 @@ experiments employing a combination of technologies. - - 1.16 - - - - + + 1.16 + + + + BioJSON is a BioXSD-schema-based JSON format of sequence-based data and some other common data - sequence records, alignments, feature records, references to resources, and more - optimised for integrative bioinformatics, web applications and APIs, and object-oriented programming. BioJSON (BioXSD data model) @@ -35497,8 +35652,8 @@ experiments employing a combination of technologies. 
BioXSD/GTrack BioJSON BioXSD|BioJSON|BioYAML BioJSON BioXSD|GTrack BioJSON - - + + Work in progress. 'BioXSD' belongs to the 'BioXSD|GTrack' ecosystem of generic formats. 'BioJSON' is the JSON format based on the common, unified 'BioXSD data model', a.k.a. 'BioXSD|BioJSON|BioYAML'. BioJSON (BioXSD) @@ -35543,12 +35698,12 @@ experiments employing a combination of technologies. - - 1.16 - - - - + + 1.16 + + + + BioYAML is a BioXSD-schema-based YAML format of sequence-based data and some other common data - sequence records, alignments, feature records, references to resources, and more - optimised for integrative bioinformatics, web APIs, human readability and editting, and object-oriented programming. BioXSD BioYAML @@ -35565,8 +35720,8 @@ experiments employing a combination of technologies. BioYAML (BioXSD) BioYAML format BioYAML format (BioXSD) - - + + Work in progress. 'BioXSD' belongs to the 'BioXSD|GTrack' ecosystem of generic formats. 'BioYAML' is the YAML format based on the common, unified 'BioXSD data model', a.k.a. 'BioXSD|BioJSON|BioYAML'. BioYAML @@ -35591,8 +35746,8 @@ experiments employing a combination of technologies. - 1.16 - + 1.16 + BioJSON is a JSON format of single multiple sequence alignments, with their annotations, features, and custom visualisation and application settings for the Jalview workbench. BioJSON format (Jalview) @@ -35602,8 +35757,8 @@ experiments employing a combination of technologies. Jalview BioJSON format Jalview JSON Jalview JSON format - - + + BioJSON (Jalview) @@ -35614,13 +35769,13 @@ experiments employing a combination of technologies. - - - 1.16 - - - - + + + 1.16 + + + + GSuite is a tabular format for collections of genome or sequence feature tracks, suitable for integrative multi-track analysis. GSuite contains links to genome/sequence tracks, with additional metadata. @@ -35630,8 +35785,8 @@ experiments employing a combination of technologies. 
GSuite format GTrack|BTrack|GSuite GSuite GTrack|GSuite|BTrack GSuite - - + + 'GSuite' belongs to the 'BioXSD|GTrack' ecosystem of generic formats, and particular to its subset, the 'GTrack ecosystem' (GTrack, GSuite, BTrack). 'GSuite' is the tabular format for an annotated collection of individual GTrack files. GSuite @@ -35646,17 +35801,21 @@ experiments employing a combination of technologies. - 1.16 - + + + + + + + 1.16 + BTrack is an HDF5-based binary format for genome or sequence feature tracks and their collections, suitable for integrative multi-track analysis. BTrack is a binary, compressed alternative to the GTrack and GSuite formats. BTrack (GTrack ecosystem of formats) BTrack format - BioXSD/GTrack BTrack - BioXSD|GTrack BTrack GTrack|BTrack|GSuite BTrack GTrack|GSuite|BTrack BTrack - - + + 'BTrack' belongs to the 'BioXSD|GTrack' ecosystem of generic formats, and particular to its subset, the 'GTrack ecosystem' (GTrack, GSuite, BTrack). 'BTrack' is the binary, optionally compressed HDF5-based version of the GTrack and GSuite formats. BTrack @@ -35686,13 +35845,13 @@ experiments employing a combination of technologies. - 1.16 - - - - - - + 1.16 + + + + + + @@ -35706,8 +35865,8 @@ experiments employing a combination of technologies. MCPD format Multi-Crop Passport Descriptors Multi-Crop Passport Descriptors format - - + + Multi-Crop Passport Descriptors is a format available in 2 successive versions, V.1 (FAO/IPGRI 2001) and V.2 (FAO/Bioversity 2012). MCPD @@ -35725,11 +35884,11 @@ experiments employing a combination of technologies. - 1.16 - true + 1.16 + true Data format of an annotated text, e.g. with recognised entities, concepts, and relations. - - + + Annotated text format @@ -35740,13 +35899,13 @@ experiments employing a combination of technologies. - - 1.16 - + + 1.16 + JSON format of annotated scientific text used by PubAnnotations and other tools. 
- - + + PubAnnotation format @@ -35757,13 +35916,13 @@ experiments employing a combination of technologies. - - 1.16 - + + 1.16 + BioC is a standardised XML format for sharing and integrating text data and annotations. - - + + BioC @@ -35774,14 +35933,14 @@ experiments employing a combination of technologies. - - - 1.16 - + + + 1.16 + Native textual export format of annotated scientific text from PubTator. - - + + PubTator format @@ -35793,12 +35952,12 @@ experiments employing a combination of technologies. - 1.16 - + 1.16 + A format of text annotation using the linked-data Open Annotation Data Model, serialised typically in RDF or JSON-LD. - - + + Open Annotation format @@ -35810,14 +35969,14 @@ experiments employing a combination of technologies. - 1.16 - - - - - - - + 1.16 + + + + + + + @@ -35826,8 +35985,8 @@ experiments employing a combination of technologies. A family of similar formats of text annotation, used by BRAT and other tools, known as BioNLP Shared Task format (BioNLP 2009 Shared Task on Event Extraction, BioNLP Shared Task 2011, BioNLP Shared Task 2013), BRAT format, BRAT standoff format, and similar. BRAT format BRAT standoff format - - + + BioNLP Shared Task format @@ -35843,12 +36002,12 @@ experiments employing a combination of technologies. - 1.16 - true + 1.16 + true A query language (format) for structured database queries. Query format - - + + Query language @@ -35858,15 +36017,15 @@ experiments employing a combination of technologies. - 1.16 - sql - + 1.16 + sql + SQL (Structured Query Language) is the de-facto standard query language (format of queries) for querying and manipulating data in relational databases. Structured Query Language - - + + SQL @@ -35877,19 +36036,19 @@ experiments employing a combination of technologies. 
- 1.16 - - xq - xquery - xqy + 1.16 + + xq + xquery + xqy XQuery (XML Query) is a query language (format of queries) for querying and manipulating structured and unstructured data, usually in the form of XML, text, and with vendor-specific extensions for other data formats (JSON, binary, etc.). XML Query xq xquery xqy - - + + XQuery @@ -35900,13 +36059,13 @@ experiments employing a combination of technologies. - 1.16 - + 1.16 + SPARQL (SPARQL Protocol and RDF Query Language) is a semantic query language for querying and manipulating data stored in Resource Description Framework (RDF) format. SPARQL Protocol and RDF Query Language - - + + SPARQL @@ -35917,10 +36076,10 @@ experiments employing a combination of technologies. - 1.17 + 1.17 XML format for XML Schema. - - + + xsd @@ -35931,13 +36090,13 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + The A2M format is used as the primary format for multiple alignments of protein or nucleic-acid sequences in the SAM suite of tools. It is a small modification of FASTA format for sequences and is compatible with most tools that read FASTA. alignment format eXtended Multi-FastA format - - + + XMFA @@ -35948,12 +36107,12 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + The GEN file format contains genetic data and describes SNPs. Genotype file format - - + + GEN @@ -35963,11 +36122,11 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + The SAMPLE file format contains information about each individual i.e. individual IDs, covariates, phenotypes and missing data proportions, from a GWAS study. - - + + SAMPLE file format @@ -35978,11 +36137,11 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + SDF is one of a family of chemical-data file formats developed by MDL Information Systems; it is intended especially for structural information. - - + + SDF @@ -35993,11 +36152,11 @@ experiments employing a combination of technologies. 
- 1.20 - + 1.20 + An MDL Molfile is a file format for holding information about the atoms, bonds, connectivity and coordinates of a molecule. - - + + Molfile @@ -36008,11 +36167,11 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + Complete, portable representation of a SYBYL molecule. ASCII file which contains all the information needed to reconstruct a SYBYL molecule. - - + + Mol2 @@ -36023,12 +36182,12 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + format for the LaTeX document preparation system LaTeX format - - + + uses the TeX typesetting program format latex @@ -36040,13 +36199,13 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + Tab-delimited text file format used by Eland - the read-mapping program distributed by Illumina with its sequencing analysis pipeline - which maps short Solexa sequence reads to the human reference genome. ELAND eland - - + + ELAND format @@ -36056,13 +36215,13 @@ experiments employing a combination of technologies. - 1.20 - - + 1.20 + + Phylip multiple alignment sequence format, less stringent than PHYLIP format. PHYLIP Interleaved format - - + + It differs from Phylip Format (format_1997) on length of the ID sequence. There no length restrictions on the ID, but whitespaces aren't allowed in the sequence ID/Name because one space separates the longest ID and the beginning of the sequence. Sequences IDs must be padded to the longest ID length. Relaxed PHYLIP Interleaved @@ -36073,15 +36232,15 @@ experiments employing a combination of technologies. - 1.20 - - + 1.20 + + Phylip multiple alignment sequence format, less stringent than PHYLIP sequential format (format_1998). Relaxed PHYLIP non-interleaved Relaxed PHYLIP non-interleaved format Relaxed PHYLIP sequential format - - + + It differs from Phylip sequential format (format_1997) on length of the ID sequence. 
There no length restrictions on the ID, but whitespaces aren't allowed in the sequence ID/Name because one space separates the longest ID and the beginning of the sequence. Sequences IDs must be padded to the longest ID length. Relaxed PHYLIP Sequential @@ -36093,13 +36252,13 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + Default XML format of VisANT, containing all the network information. VisANT xml VisANT xml format - - + + VisML @@ -36112,14 +36271,14 @@ experiments employing a combination of technologies. - 1.20 - - gml - + 1.20 + + gml + Graph Modelling Language (GML) is a structured, hierarchical textual format for defining graphs with labelled nodes and edges, optionally directed. TODO: is it extensible by any attributes? Or just comments? What are the extra fields, and is there any specification? Graph Meta Language - - + + Graph Modelling Language can be used as a format for diverse graph-like data, such as for example biological networks (including pathways), or semantic knowledge graphs (including linked data). Not to be confused with the Geography Markup Language (also GML, with the same file extension), or with GraphML. GML (Graph Modeling Language) @@ -36134,13 +36293,13 @@ experiments employing a combination of technologies. - 1.20 - - + 1.20 + + FASTG is a format for faithfully representing genome assemblies in the face of allelic polymorphism and assembly uncertainty. FASTG assembly graph format - - + + It is called FASTG, like FASTA, but the G stands for "graph". FASTG @@ -36157,8 +36316,8 @@ experiments employing a combination of technologies. - 1.20 - true + 1.20 + true Data format for raw data from a nuclear magnetic resonance (NMR) spectroscopy experiment. NMR peak assignment data format NMR processed data format @@ -36166,8 +36325,8 @@ experiments employing a combination of technologies. 
Nuclear magnetic resonance spectroscopy data format Processed NMR data format Raw NMR data format - - + + NMR data format @@ -36178,12 +36337,12 @@ experiments employing a combination of technologies. - 1.20 - - + 1.20 + + nmrML is an MSI supported XML-based open access format for metabolomics NMR raw and processed spectral data. It is accompanies by an nmrCV (controlled vocabulary) to allow ontology-based annotations. - - + + nmrML @@ -36195,11 +36354,11 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + . proBAM is an adaptation of BAM (format_2572), which was extended to meet specific requirements entailed by proteomics data. - - + + proBAM @@ -36210,11 +36369,11 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + . proBED is an adaptation of BED (format_3003), which was extended to meet specific requirements entailed by proteomics data. - - + + proBED @@ -36230,12 +36389,12 @@ experiments employing a combination of technologies. - 1.20 - true + 1.20 + true Data format for raw microarray data. Microarray data format - - + + Raw microarray data format @@ -36246,11 +36405,11 @@ experiments employing a combination of technologies. - 1.20 - + 1.20 + GenePix Results (GPR) text file format developed by Axon Instruments that is used to save GenePix Results data. - - + + GPR @@ -36261,11 +36420,11 @@ experiments employing a combination of technologies. - 1.20 + 1.20 Binary format used by the ARB software suite ARB binary format - - + + ARB @@ -36276,11 +36435,11 @@ experiments employing a combination of technologies. - 1.20 - http://ftp.mi.fu-berlin.de/pub/OpenMS/release1.9-documentation/html/classOpenMS_1_1ConsensusXMLFile.html + 1.20 + http://ftp.mi.fu-berlin.de/pub/OpenMS/release1.9-documentation/html/classOpenMS_1_1ConsensusXMLFile.html OpenMS format for grouping features in one map or across several maps. - - + + consensusXML @@ -36291,11 +36450,11 @@ experiments employing a combination of technologies. 
- 1.20 - http://ftp.mi.fu-berlin.de/pub/OpenMS/release1.9-documentation/html/classOpenMS_1_1FeatureXMLFile.html + 1.20 + http://ftp.mi.fu-berlin.de/pub/OpenMS/release1.9-documentation/html/classOpenMS_1_1FeatureXMLFile.html OpenMS format for quantitation results (LC/MS features). - - + + featureXML @@ -36306,11 +36465,11 @@ experiments employing a combination of technologies. - 1.20 - http://www.psidev.info/mzdata-1_0_5-docs + 1.20 + http://www.psidev.info/mzdata-1_0_5-docs Now deprecated data format of the HUPO Proteomics Standards Initiative. Replaced by mzML (format_3244). - - + + mzData @@ -36321,11 +36480,11 @@ experiments employing a combination of technologies. - 1.20 - http://cruxtoolkit.sourceforge.net/tide-search.html + 1.20 + http://cruxtoolkit.sourceforge.net/tide-search.html Format supported by the Tide tool for identifying peptides from tandem mass spectra. - - + + TIDE TXT @@ -36337,11 +36496,11 @@ experiments employing a combination of technologies. - 1.20 - http://www.ncbi.nlm.nih.gov/data_specs/schema/NCBI_BlastOutput2.mod.xsd + 1.20 + http://www.ncbi.nlm.nih.gov/data_specs/schema/NCBI_BlastOutput2.mod.xsd XML format as produced by the NCBI Blast package v2. - - + + BLAST XML v2 results format @@ -36352,12 +36511,12 @@ experiments employing a combination of technologies. - 1.20 - - + 1.20 + + Microsoft Powerpoint format. - - + + pptx @@ -36370,18 +36529,18 @@ experiments employing a combination of technologies. - + - - 1.20 - - ibd + + 1.20 + + ibd ibd is a data format for mass spectrometry imaging data. - - + + imzML data is recorded in 2 files: '.imzXML' is a metadata XML file based on mzML by HUPO-PSI, and '.ibd' is a binary file containing the mass spectra. ibd @@ -36392,11 +36551,11 @@ experiments employing a combination of technologies. - 1.21 + 1.21 Data format used in Natural Language Processing. Natural Language Processing format - - + + NLP format @@ -36407,11 +36566,11 @@ experiments employing a combination of technologies. 
- 1.21 - + 1.21 + XML input file format for BEAST Software (Bayesian Evolutionary Analysis Sampling Trees). - - + + BEAST @@ -36422,11 +36581,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + Chado-XML format is a direct mapping of the Chado relational schema into XML. - - + + Chado-XML @@ -36437,11 +36596,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + An alignment format generated by PRANK/PRANKSTER consisting of four elements: newick, nodes, selection and model. - - + + HSAML @@ -36452,11 +36611,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + Output xml file from the InterProScan sequence analysis application. - - + + InterProScan XML @@ -36467,12 +36626,12 @@ experiments employing a combination of technologies. - 1.21 + 1.21 The KEGG Markup Language (KGML) is an exchange format of the KEGG pathway maps, which is converted from internally used KGML+ (KGML+SVG) format. KEGG Markup Language - - + + KGML @@ -36483,11 +36642,11 @@ experiments employing a combination of technologies. - 1.21 + 1.21 XML format for collected entries from biobliographic databases MEDLINE and PubMed. MEDLINE XML - - + + PubMed XML @@ -36498,11 +36657,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + A set of XML compliant markup components for describing multiple sequence alignments. - - + + MSAML @@ -36514,11 +36673,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + OrthoXML is designed broadly to allow the storage and comparison of orthology data from any ortholog database. It establishes a structure for describing orthology relationships while still allowing flexibility for database-specific information to be encapsulated in the same format. - - + + OrthoXML @@ -36529,11 +36688,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + Tree structure of Protein Sequence Database Markup Language generated using Matra software. 
- - + + PSDML @@ -36544,11 +36703,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + SeqXML is an XML Schema to describe biological sequences, developed by the Stockholm Bioinformatics Centre. - - + + SeqXML @@ -36559,11 +36718,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + XML format for the UniParc database. - - + + UniParc XML @@ -36574,11 +36733,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + XML format for the UniRef reference clusters. - - + + UniRef XML @@ -36589,21 +36748,21 @@ experiments employing a combination of technologies. - - 1.21 - - - - - cwl - - - + + 1.21 + + + + + cwl + + + Common Workflow Language (CWL) format for description of command-line tools and workflows. Common Workflow Language CommonWL - - + + CWL @@ -36614,10 +36773,10 @@ experiments employing a combination of technologies. - 1.21 + 1.21 Proprietary file format for mass spectrometry data from Waters. - - + + Proprietary format for which documentation is not available, but used by multiple tools. Waters RAW @@ -36629,11 +36788,11 @@ experiments employing a combination of technologies. - 1.21 - + 1.21 + A standardized file format for data exchange in mass spectrometry, initially developed for infrared spectrometry. - - + + JCAMP-DX is an ASCII based format and therefore not very compact even though it includes standards for file compression. JCAMP-DX @@ -36645,10 +36804,10 @@ experiments employing a combination of technologies. - 1.21 + 1.21 An NLP format used for annotated textual documents. - - + + NLP annotation format @@ -36658,10 +36817,10 @@ experiments employing a combination of technologies. - 1.21 + 1.21 NLP format used by a specific type of corpus (collection of texts). - - + + NLP corpus format @@ -36672,15 +36831,15 @@ experiments employing a combination of technologies. - - 1.21 - - - + + 1.21 + + + mirGFF3 is a common format for microRNA data resulting from small-RNA RNA-Seq workflows. 
miRTop format - - + + mirGFF3 is a specialisation of GFF3; produced by small-RNA-Seq analysis workflows, usable and convertible with the miRTop API (https://mirtop.readthedocs.io/en/latest/), and consumable by tools for downstream analysis. mirGFF3 @@ -36691,13 +36850,13 @@ experiments employing a combination of technologies. - 1.21 + 1.21 A "placeholder" concept for formats of annotated RNA data, including e.g. microRNA and RNA-Seq data. RNA data format miRNA data format microRNA data format - - + + RNA annotation format @@ -36713,15 +36872,15 @@ experiments employing a combination of technologies. - 1.22 - true + 1.22 + true File format to store trajectory information for a 3D structure . CG trajectory formats MD trajectory formats NA trajectory formats Protein trajectory formats - - + + Formats differ on what they are able to store (coordinates, velocities, topologies) and how they are storing it (raw, compressed, textual, binary). Trajectory format @@ -36732,11 +36891,11 @@ experiments employing a combination of technologies. - 1.22 - true + 1.22 + true Binary file format to store trajectory information for a 3D structure . - - + + Trajectory format (binary) @@ -36746,11 +36905,11 @@ experiments employing a combination of technologies. - 1.22 - true + 1.22 + true Textual file format to store trajectory information for a 3D structure . - - + + Trajectory format (text) @@ -36761,10 +36920,10 @@ experiments employing a combination of technologies. - 1.22 + 1.22 HDF is the name of a set of file formats and libraries designed to store and organize large amounts of numerical data, originally developed at the National Center for Supercomputing Applications at the University of Illinois. - - + + HDF is currently supported by many commercial and non-commercial software platforms such as Java, MATLAB/Scilab, Octave, Python and R. HDF @@ -36776,10 +36935,10 @@ experiments employing a combination of technologies. 
- 1.22 + 1.22 PCAZip format is a binary compressed file to store atom coordinates based on Essential Dynamics (ED) and Principal Component Analysis (PCA). - - + + The compression is made projecting the Cartesian snapshots collected along the trajectory into an orthogonal space defined by the most relevant eigenvectors obtained by diagonalization of the covariance matrix (PCA). In the compression/decompression process, part of the original information is lost, depending on the final number of eigenvectors chosen. However, with a reasonable choice of the set of eigenvectors the compression typically reduces the trajectory file to less than one tenth of their original size with very acceptable loss of information. Compression with PCAZip can only be applied to unsolvated structures. PCAzip @@ -36791,10 +36950,10 @@ experiments employing a combination of technologies. - 1.22 + 1.22 Portable binary format for trajectories produced by GROMACS package. - - + + XTC uses the External Data Representation (xdr) routines for writing and reading data which were created for the Unix Network File System (NFS). XTC files use a reduced precision (lossy) algorithm which works multiplying the coordinates by a scaling factor (typically 1000), so converting them to pm (GROMACS standard distance unit is nm). This allows an integer rounding of the values. Several other tricks are performed, such as making use of atom proximity information: atoms close in sequence are usually close in space (e.g. water molecules). That makes XTC format the most efficient in terms of disk usage, in most cases reducing by a factor of 2 the size of any other binary trajectory format. XTC @@ -36806,11 +36965,11 @@ experiments employing a combination of technologies. - 1.22 + 1.22 Trajectory Next Generation (TNG) is a format for storage of molecular simulation data. It is designed and implemented by the GROMACS development group, and it is called to be the substitute of the XTC format. 
Trajectory Next Generation format - - + + Fully architecture-independent format, regarding both endianness and the ability to mix single/double precision trajectories and I/O libraries. Self-sufficient, it should not require any other files for reading, and all the data should be contained in a single file for easy transport. Temporal compression of data, improving the compression rate of the previous XTC format. Possibility to store meta-data with information about the simulation. Direct access to a particular frame. Efficient parallel I/O. TNG @@ -36823,10 +36982,10 @@ experiments employing a combination of technologies. - 1.22 + 1.22 The XYZ chemical file format is widely supported by many programs, although many slightly different XYZ file formats coexist (Tinker XYZ, UniChem XYZ, etc.). Basic information stored for each atom in the system are x, y and z coordinates and atom element/atomic number. - - + + XYZ files are structured in this way: First line contains the number of atoms in the file. Second line contains a title, comment, or filename. Remaining lines contain atom information. Each line starts with the element symbol, followed by x, y and z coordinates in angstroms separated by whitespace. Multiple molecules or frames can be contained within one file, so it supports trajectory storage. XYZ files can be directly represented by a molecular viewer, as they contain all the basic information needed to build the 3D model. XYZ @@ -36839,13 +36998,13 @@ experiments employing a combination of technologies. - 1.22 - + 1.22 + AMBER trajectory (also called mdcrd), with 10 coordinates per line and format F8.3 (fixed point notation with field width 8 and 3 decimal places). AMBER trajectory format inpcrd - - + + mdcrd @@ -36861,15 +37020,15 @@ experiments employing a combination of technologies. - 1.22 - true + 1.22 + true Format of topology files; containing the static information of a structure molecular system that is needed for a molecular simulation. 
CG topology format MD topology format NA topology format Protein topology format - - + + Many different file formats exist describing structural molecular topology. Tipically, each MD package or simulation software works with their own implementation (e.g. GROMACS top, CHARMM psf, AMBER prmtop). Topology format @@ -36882,11 +37041,11 @@ experiments employing a combination of technologies. - 1.22 - + 1.22 + GROMACS MD package top textual files define an entire structure system topology, either directly, or by including itp files. - - + + There is currently no tool available for conversion between GROMACS topology format and other formats, due to the internal differences in both approaches. There is, however, a method to convert small molecules parameterized with AMBER force-field into GROMACS format, allowing simulations of these systems with GROMACS MD package. GROMACS top @@ -36899,16 +37058,16 @@ experiments employing a combination of technologies. - 1.22 - + 1.22 + AMBER Prmtop file (version 7) is a structure topology text file divided in several sections designed to be parsed easily using simple Fortran code. Each section contains particular topology information, such as atom name, charge, mass, angles, dihedrals, etc. AMBER Parm AMBER Parm7 Parm7 Prmtop Prmtop7 - - + + It can be modified manually, but as the size of the system increases, the hand-editing becomes increasingly complex. AMBER Parameter-Topology file format is used extensively by the AMBER software suite and is referred to as the Prmtop file for short. version 7 is written to distinguish it from old versions of AMBER Prmtop. Similarly to HDF5, it is a completely different format, according to AMBER group: a drastic change to the file format occurred with the 2004 release of Amber 7 (http://ambermd.org/prmtop.pdf) AMBER top @@ -36922,11 +37081,11 @@ experiments employing a combination of technologies. 
- 1.22 - + 1.22 + X-Plor Protein Structure Files (PSF) are structure topology files used by NAMD and CHARMM molecular simulations programs. PSF files contain six main sections of interest: atoms, bonds, angles, dihedrals, improper dihedrals (force terms used to maintain planarity) and cross-terms. - - + + The high similarity in the functional form of the two potential energy functions used by AMBER and CHARMM force-fields gives rise to the possible use of one force-field within the other MD engine. Therefore, the conversion of PSF files to AMBER Prmtop format is possible with the use of AMBER chamber (CHARMM - AMBER) program. PSF @@ -36940,11 +37099,11 @@ experiments employing a combination of technologies. - 1.22 - + 1.22 + GROMACS itp files (include topology) contain structure topology information, and are tipically included in GROMACS topology files (GROMACS top). Itp files are used to define individual (or multiple) components of a topology as a separate file. This is particularly useful if there is a molecule that is used frequently, and also reduces the size of the system topology file, splitting it in different parts. - - + + GROMACS itp files are used also to define position restrictions on the molecule, or to define the force field parameters for a particular ligand. GROMACS itp @@ -36961,7 +37120,7 @@ experiments employing a combination of technologies. - 1.22 + 1.22 Format of force field parameter files, which store the set of parameters (charges, masses, radii, bond lengths, bond dihedrals, etc.) that are essential for the proper description and simulation of a molecular system. Many different file formats exist describing force field parameters. Tipically, each MD package or simulation software works with their own implementation (e.g. GROMACS itp, CHARMM rtf, AMBER off / frcmod). FF parameter format @@ -36974,11 +37133,11 @@ experiments employing a combination of technologies. 
- 1.22 + 1.22 Scripps Research Institute BinPos format is a binary formatted file to store atom coordinates. Scripps Research Institute BinPos - - + + It is basically a translation of the ASCII atom coordinate format to binary code. The only additional information stored is a magic number that identifies the BinPos format and the number of atoms per snapshot. The remainder is the chain of coordinates binary encoded. A drawback of this format is its architecture dependency. Integers and floats codification depends on the architecture, thus it needs to be converted if working in different platforms (little endian, big endian). BinPos @@ -36991,13 +37150,13 @@ experiments employing a combination of technologies. - 1.22 - + 1.22 + AMBER coordinate/restart file with 6 coordinates per line and decimal format F12.7 (fixed point notation with field width 12 and 7 decimal places) restrt rst7 - - + + RST @@ -37009,10 +37168,10 @@ experiments employing a combination of technologies. - 1.22 + 1.22 Format of CHARMM Residue Topology Files (RTF), which define groups by including the atoms, the properties of the group, and bond and charge information. - - + + There is currently no tool available for conversion between GROMACS topology format and other formats, due to the internal differences in both approaches. There is, however, a method to convert small molecules parameterized with AMBER force-field into GROMACS format, allowing simulations of these systems with GROMACS MD package. CHARMM rtf @@ -37024,11 +37183,11 @@ experiments employing a combination of technologies. - 1.22 - + 1.22 + AMBER frcmod (Force field Modification) is a file format to store any modification to the standard force field needed for a particular molecule to be properly represented in the simulation. - - + + AMBER frcmod @@ -37039,8 +37198,8 @@ experiments employing a combination of technologies. 
- 1.22 - + 1.22 + AMBER Object File Format library files (OFF library files) store residue libraries (forcefield residue parameters). AMBER Object File Format AMBER lib @@ -37054,10 +37213,10 @@ experiments employing a combination of technologies. - 1.22 + 1.22 MReData is a text based data standard for processed NMR data. It is relying on SDF molecule data and allows to store assignments of NMR peaks to molecule features. The NMR-extracted data (or "NMReDATA") includes: Chemical shift,scalar coupling, 2D correlation, assignment, etc. - - + + NMReData is a text based data standard for processed NMR data. It is relying on SDF molecule data and allows to store assignments of NMR peaks to molecule features. The NMR-extracted data (or "NMReDATA") includes: Chemical shift,scalar coupling, 2D correlation, assignment, etc. Find more in the paper at D. Jeannerat, Magn. Reson. in Chem., 2017, 55, 7-14. NMReDATA @@ -37077,14 +37236,14 @@ experiments employing a combination of technologies. - 1.22 - - - - + 1.22 + + + + BpForms is a string format for concretely representing the primary structures of biopolymers, including DNA, RNA, and proteins that include non-canonical nucleic and amino acids. See https://www.bpforms.org for more information. - - + + BpForms @@ -37096,8 +37255,8 @@ experiments employing a combination of technologies. - 1.22 - + 1.22 + Format of trr files that contain the trajectory of a simulation experiment used by GROMACS. The first 4 bytes of any trr file containing 1993. See https://github.com/galaxyproject/galaxy/pull/6597/files#diff-409951594551183dbf886e24de6cb129R760 trr @@ -37116,21 +37275,21 @@ experiments employing a combination of technologies. - 1.22 - - - - - - msh - - - + 1.22 + + + + + + msh + + + Mash sketch is a format for sequence / sequence checksum information. To make a sketch, each k-mer in a sequence is hashed, which creates a pseudo-random identifier. 
By sorting these hashes, a small subset from the top of the sorted list can represent the entire sequence. Mash sketch min-hash sketch - - + + msh @@ -37153,11 +37312,11 @@ experiments employing a combination of technologies. - 1.23 - - - - loom + 1.23 + + + + loom The Loom file format is based on HDF5, a standard for storing large numerical datasets. The Loom format is designed to efficiently hold large omics datasets. Typically, such data takes the form of a large matrix of numbers, along with metadata for the rows and columns. Loom @@ -37172,6 +37331,7 @@ experiments employing a combination of technologies. + @@ -37184,13 +37344,13 @@ experiments employing a combination of technologies. - 1.23 - - - - - zarray - zgroup + 1.23 + + + + + zarray + zgroup The Zarr format is an implementation of chunked, compressed, N-dimensional arrays for storing data. This generic format is used in several different fields, including Genomics and Geosciences. Zarr @@ -37216,11 +37376,11 @@ experiments employing a combination of technologies. - 1.23 - - - mtx - + 1.23 + + + mtx + The Matrix Market matrix (MTX) format stores numerical or pattern matrices in a dense (array format) or sparse (coordinate format) representation. MTX @@ -37233,15 +37393,14 @@ experiments employing a combination of technologies. - 1.24 - - - - - - text/plain - - + 1.24 + + + + + + + BcForms is a format for abstractly describing the molecular structure (atoms and bonds) of macromolecular complexes as a collection of subunits and crosslinks. Each subunit can be described with BpForms (http://edamontology.org/format_3909) or SMILES (http://edamontology.org/data_2301). BcForms uses an ontology of crosslinks to abstract the chemical details of crosslinks from the descriptions of complexes (see https://bpforms.org/crosslink.html). BcForms is related to http://edamontology.org/format_3909. (BcForms uses BpForms to describe subunits which are DNA, RNA, or protein polymers.) 
However, that format isn't the parent of BcForms. BcForms is similarly related to SMILES (http://edamontology.org/data_2301). BcForms @@ -37254,12 +37413,12 @@ experiments employing a combination of technologies. - 1.24 - - nq + 1.24 + + nq N-Quads is a line-based, plain text format for encoding an RDF dataset. It includes information about the graph each triple belongs to. - - + + N-Quads should not be confused with N-Triples which does not contain graph information. N-Quads @@ -37271,17 +37430,16 @@ experiments employing a combination of technologies. - 1.25 - - - - - json - application/json - + 1.25 + + + + + json + Vega is a visualization grammar, a declarative language for creating, saving, and sharing interactive visualization designs. With Vega, you can describe the visual appearance and interactive behavior of a visualization in a JSON format, and generate web-based views using Canvas or SVG. - - + + Vega @@ -37292,17 +37450,16 @@ experiments employing a combination of technologies. - 1.25 - - - - - json - application/json - + 1.25 + + + + + json + Vega-Lite is a high-level grammar of interactive graphics. It provides a concise JSON syntax for rapidly generating visualizations to support analysis. Vega-Lite specifications can be compiled to Vega specifications. - - + + Vega-lite @@ -37319,16 +37476,15 @@ experiments employing a combination of technologies. - 1.25 - - - - - application/xml - + 1.25 + + + + + A model description language for computational neuroscience. - - + + NeuroML @@ -37345,19 +37501,17 @@ experiments employing a combination of technologies. - 1.25 - - - - - bngl - application/xml - plain/text - + 1.25 + + + + + bngl + BioNetGen is a format for the specification and simulation of rule-based models of biochemical systems, including signal transduction, metabolic, and genetic regulatory networks. BioNetGen Language - - + + BNGL @@ -37367,13 +37521,13 @@ experiments employing a combination of technologies. 
- 1.25 - - - + 1.25 + + + A Docker image is a file, comprised of multiple layers, that is used to execute code in a Docker container. An image is essentially built from the instructions for a complete and executable version of an application, which relies on the host OS kernel. - - + + Docker image @@ -37385,14 +37539,14 @@ experiments employing a combination of technologies. - 1.25 - - - gfa - + 1.25 + + + gfa + Graphical Fragment Assembly captures sequence graphs as the product of an assembly, a representation of variation in genomes, splice graphs in genes, or even overlap between reads from long-read sequencing technology. - - + + GFA 1 @@ -37403,14 +37557,14 @@ experiments employing a combination of technologies. - 1.25 - - - gfa - + 1.25 + + + gfa + Graphical Fragment Assembly captures sequence graphs as the product of an assembly, a representation of variation in genomes, splice graphs in genes, or even overlap between reads from long-read sequencing technology. - - + + GFA 2 @@ -37426,15 +37580,14 @@ experiments employing a combination of technologies. - 1.25 - - - xlsx - application/vnd.openxmlformats-officedocument.spreadsheetml.sheet - + 1.25 + + + xlsx + ObjTables is a toolkit for creating re-usable datasets that are both human and machine-readable, combining the ease of spreadsheets (e.g., Excel workbooks) with the rigor of schemas (classes, their attributes, the type of each attribute, and the possible relationships between instances of classes). ObjTables consists of a format for describing schemas for spreadsheets, numerous data types for science, a syntax for indicating the class and attribute represented by each table and column in a workbook, and software for using schemas to rigorously validate, merge, split, compare, and revision datasets. - - + + ObjTables @@ -37445,11 +37598,11 @@ experiments employing a combination of technologies. - 1.25 - contig + 1.25 + contig The CONTIG format used for output of the SOAPdenovo alignment program. 
It contains contig sequences generated without using mate pair information. - - + + CONTIG @@ -37460,11 +37613,11 @@ experiments employing a combination of technologies. - 1.25 - wego + 1.25 + wego WEGO native format used by the Web Gene Ontology Annotation Plot application. Tab-delimited format with gene names and others GO IDs (columns) with one annotation record per line. - - + + WEGO @@ -37475,12 +37628,12 @@ experiments employing a combination of technologies. - 1.25 - rpkm + 1.25 + rpkm Tab-delimited format for gene expression levels table, calculated as Reads Per Kilobase per Million (RPKM) mapped reads. Gene expression levels table format - - + + For example a 1kb transcript with 1000 alignments in a sample of 10 million reads (out of which 8 million reads can be mapped) will have RPKM = 1000/(1 * 8) = 125 RPKM @@ -37497,14 +37650,14 @@ experiments employing a combination of technologies. - 1.25 - tar + 1.25 + tar TAR archive file format generated by the Unix-based utility tar. TAR Tarball tar - - + + For example a 1kb transcript with 1000 alignments in a sample of 10 million reads (out of which 8 million reads can be mapped) will have RPKM = 1000/(1 * 8) = 125 TAR format @@ -37517,11 +37670,11 @@ experiments employing a combination of technologies. - 1.25 - chain + 1.25 + chain The CHAIN format describes a pairwise alignment that allow gaps in both sequences simultaneously and is used by the UCSC Genome Browser. - - + + CHAIN https://genome.ucsc.edu/goldenPath/help/chain.html @@ -37533,11 +37686,11 @@ experiments employing a combination of technologies. - 1.25 - net + 1.25 + net The NET file format is used to describe the data that underlie the net alignment annotations in the UCSC Genome Browser. - - + + NET https://genome.ucsc.edu/goldenPath/help/net.html @@ -37549,11 +37702,11 @@ experiments employing a combination of technologies. - 1.25 - qmap + 1.25 + qmap Format of QMAP files generated for methylation data from an internal BGI pipeline. 
- - + + QMAP @@ -37564,14 +37717,14 @@ experiments employing a combination of technologies. - 1.25 - ga + 1.25 + ga An emerging format for high-level Galaxy workflow description. Galaxy workflow format GalaxyWF ga - - + + gxformat2 https://github.com/galaxyproject/gxformat2 @@ -37583,13 +37736,13 @@ experiments employing a combination of technologies. - 1.25 - wmv + 1.25 + wmv The proprietary native video format of various Microsoft programs such as Windows Media Player. Windows Media Video format Windows movie file format - - + + WMV @@ -37606,12 +37759,12 @@ experiments employing a combination of technologies. - 1.25 - zip + 1.25 + zip ZIP is an archive file format that supports lossless data compression. ZIP - - + + A ZIP file may contain one or more files or directories that may have been compressed. ZIP format @@ -37624,11 +37777,11 @@ experiments employing a combination of technologies. - 1.25 - lsm + 1.25 + lsm Zeiss' proprietary image format based on TIFF. - - + + LSM files are the default data export for the Zeiss LSM series confocal microscopes (e.g. LSM 510, LSM 710). In addition to the image data, LSM files contain most imaging settings. LSM @@ -37645,15 +37798,15 @@ experiments employing a combination of technologies. - 1.25 - gz - gzip + 1.25 + gz + gzip GNU zip compressed file format common to Unix-based operating systems. GNU Zip gz gzip - - + + GZIP format @@ -37665,12 +37818,12 @@ experiments employing a combination of technologies. - 1.25 - avi + 1.25 + avi Audio Video Interleaved (AVI) format is a multimedia container format for AVI files, that allows synchronous audio-with-video playback. Audio Video Interleaved - - + + AVI @@ -37682,11 +37835,11 @@ experiments employing a combination of technologies. - 1.25 - trackdb + 1.25 + trackdb A declaration file format for UCSC browsers track dataset display charateristics. - - + + TrackDB @@ -37697,12 +37850,12 @@ experiments employing a combination of technologies. 
- 1.25 - cigar + 1.25 + cigar Compact Idiosyncratic Gapped Alignment Report format is a compressed (run-length encoded) pairwise alignment format. It is useful for representing long (e.g. genomic) pairwise alignments. CIGAR - - + + CIGAR format http://wiki.bits.vib.be/index.php/CIGAR/ @@ -37714,12 +37867,12 @@ experiments employing a combination of technologies. - 1.25 - stl + 1.25 + stl STL is a file format native to the stereolithography CAD software created by 3D Systems. The format is used to save and share surface-rendered 3D images and also for 3D printing. stl - - + + Stereolithography format @@ -37730,13 +37883,13 @@ experiments employing a combination of technologies. - 1.25 - u3d + 1.25 + u3d U3D (Universal 3D) is a compressed file format and data structure for 3D computer graphics. It contains 3D model information such as triangle meshes, lighting, shading, motion data, lines and points with color and structure. Universal 3D Universal 3D format - - + + U3D @@ -37747,11 +37900,11 @@ experiments employing a combination of technologies. - 1.25 - tex + 1.25 + tex Bitmap image format used for storing textures. - - + + Texture files can create the appearance of different surfaces and can be applied to both 2D and 3D objects. Note the file extension .tex is also used for LaTex documents which are a completely different format and they are NOT interchangable. Texture file format @@ -37763,14 +37916,14 @@ experiments employing a combination of technologies. - 1.25 - py + 1.25 + py Format for scripts writtenin Python - a widely used high-level programming language for general-purpose programming. Python Python program py - - + + Python script @@ -37781,12 +37934,12 @@ experiments employing a combination of technologies. - 1.25 - mp4 + 1.25 + mp4 A digital multimedia container format most commonly used to store video and audio. MP4 - - + + MPEG-4 @@ -37798,14 +37951,14 @@ experiments employing a combination of technologies. 
- 1.25 - pl + 1.25 + pl Format for scripts written in Perl - a family of high-level, general-purpose, interpreted, dynamic programming languages. Perl Perl program pl - - + + Perl script @@ -37816,13 +37969,13 @@ experiments employing a combination of technologies. - 1.25 - r + 1.25 + r Format for scripts written in the R language - an open source programming language and software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing. R R program - - + + R script @@ -37833,11 +37986,11 @@ experiments employing a combination of technologies. - 1.25 - rmd + 1.25 + rmd A file format for making dynamic documents (R Markdown scripts) with the R language. - - + + R markdown https://rmarkdown.rstudio.com/articles_intro.html @@ -37848,12 +38001,12 @@ experiments employing a combination of technologies. - 1.25 - This duplicates an existing concept (http://edamontology.org/format_3549). - 1.26 - + 1.25 + This duplicates an existing concept (http://edamontology.org/format_3549). + 1.26 + An open file format from the Neuroimaging Informatics Technology Initiative (NIfTI) commonly used to store brain imaging data obtained using Magnetic Resonance Imaging (MRI) methods. - + NIFTI format true @@ -37871,11 +38024,11 @@ experiments employing a combination of technologies. - 1.25 - pickle + 1.25 + pickle Format used by Python pickle module for serializing and de-serializing a Python object structure. - - + + pickle https://docs.python.org/2/library/pickle.html @@ -37892,13 +38045,13 @@ experiments employing a combination of technologies. - 1.25 - npy + 1.25 + npy The standard binary file format used by NumPy - a fundamental package for scientific computing with Python - for persisting a single arbitrary NumPy array on disk. The format stores all of the shape and dtype information necessary to reconstruct the array correctly. 
NumPy npy - - + + NumPy format @@ -37914,11 +38067,11 @@ experiments employing a combination of technologies. - 1.25 - repz + 1.25 + repz Format of repertoire (archive) files that can be read by SimToolbox (a MATLAB toolbox for structured illumination fluorescence microscopy) or alternatively extracted with zip file archiver software. - - + + SimTools repertoire file format https://pdfs.semanticscholar.org/5f25/f1cc6cdf2225fe22dc6fd4fc0296d486a85c.pdf @@ -37935,11 +38088,11 @@ experiments employing a combination of technologies. - 1.25 - cfg + 1.25 + cfg A configuration file used by various programs to store settings that are specific to their respective software. - - + + Configuration file format @@ -37955,14 +38108,14 @@ experiments employing a combination of technologies. - 1.25 - zst + 1.25 + zst Format used by the Zstandard real-time compression algorithm. Zstandard compression format Zstandard-compressed file format zst - - + + Zstandard format https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md @@ -37974,13 +38127,13 @@ experiments employing a combination of technologies. - 1.25 - m + 1.25 + m The file format for MATLAB scripts or functions. MATLAB m - - + + MATLAB script @@ -37996,14 +38149,14 @@ experiments employing a combination of technologies. - - 1.26 - - - + + 1.26 + + + A data format for specifying parameter estimation problems in systems biology. - - + + PEtab @@ -38014,17 +38167,17 @@ experiments employing a combination of technologies. - 1.26 - - - g.vcf - g.vcf.gz + 1.26 + + + g.vcf + g.vcf.gz Genomic Variant Call Format (gVCF) is a version of VCF that includes not only the positions that are variant when compared to a reference genome, but also the non-variant positions as ranges, including metrics of confidence that the positions in the range are actually non-variant e.g. minimum read-depth and genotype quality. GVCF g.vcf g.vcf.gz - - + + gVCF @@ -38055,7 +38208,7 @@ experiments employing a combination of technologies. 
Grid format - geo + Raster format (geographical data) @@ -38066,7 +38219,7 @@ experiments employing a combination of technologies. - geo + Vector format (geographical data) @@ -38078,11 +38231,11 @@ experiments employing a combination of technologies. - - - - geojson - + + + + geojson + Validator https://geojsonlint.com/ GeoJSON @@ -38096,12 +38249,13 @@ experiments employing a combination of technologies. - - - dbf - shp - shx - x-gis/x-shapefile + + + + dbf + shp + shx + x-gis/x-shapefile (not in IANA) Shapefile is a composite format including 3 mandatory files: binary shape/geometry file .shp, shape/geometry index file .shx, and a textual attribute format .dbf). It may contain additional optional files, mostly binary (x-gis/x-shapefile), but some textual or XML. Shapefile @@ -38116,9 +38270,9 @@ experiments employing a combination of technologies. - - - + + + Keyhole Markup Language KML was developed for Google Earth (former Keyhole Earth Viewer), and standardised by the Open Geospatial Consortium (OGC, OpenGIS, https://en.wikipedia.org/wiki/OpenGIS). KML @@ -38134,8 +38288,8 @@ experiments employing a combination of technologies. - - Climate and forecast (CF) metadata conventions + + Climate and forecast (CF) metadata conventions From version-4 NetCDF is based on HDF5 NetCDF CF conventions NetCDF following the CF metadata conventions @@ -38151,8 +38305,8 @@ experiments employing a combination of technologies. - gml - + gml + Matúš Kalaš 2021-08-26T11:59:54.522374Z Not to be confused with the Graph Modelling Language (also GML, with the same file extension). @@ -38180,7 +38334,7 @@ experiments employing a combination of technologies. - osm.pbf + osm.pbf Matúš Kalaš 2021-08-26T13:25:05.495509Z OSM PBF @@ -38196,14 +38350,14 @@ experiments employing a combination of technologies. 
- - - gb - grb - grib - grib1 - grib2 - + + + gb + grb + grib + grib1 + grib2 + Matúš Kalaš 2021-08-26T13:42:00.666692Z Standard format for archiving and distributing gridded data, especially weather data. Since it is a binary format, that data is packed to increase storage efficiency. It is a collection of self-contained records of 2D data. @@ -38222,7 +38376,7 @@ experiments employing a combination of technologies. - + melibleq 2021-08-29T15:54:37.290778Z The GeoTIFF file format allows a TIFF file to be enriched with georeferencing metadata, such as map projections, coordinate systems, ellipsoids, and datums. With GeoTIFF, the raw pixels of an image can be stored and organized in particular ways. @@ -38231,6 +38385,7 @@ experiments employing a combination of technologies. + TODO: Is there a version-independent link to the documentation? @@ -38240,7 +38395,7 @@ experiments employing a combination of technologies. - vrt + vrt melibleq 2021-08-29T16:08:38.496836Z GDAL Virtual Dataset. XML format that maps its attributes and geometries to that of an underlying data source of any GDAL-supported raster format. It transforms features read from other drivers based on criteria specified in an XML control file. It is primarily used to derive spatial layers from flat tables with spatial information in attribute columns. It can also be used to associate coordinate system information with a datasource, merge layers from different datasources into a single data source, or even just to provide an anchor file for access to non-file oriented datasources. @@ -38255,7 +38410,7 @@ experiments employing a combination of technologies. - + melibleq 2021-08-29T16:21:48.629321Z Regular GeoTIFF file that can be hosted on a HTTP file server. The use of HTTP GET range requests lets clients ask for just the necessary portions of a GeoTIFF file. @@ -38289,7 +38444,12 @@ experiments employing a combination of technologies. 
- + + + + + + melibleq 2021-09-01T02:55:45.473554Z OpenGIS standard used to specify the digital storage of geographical data (point, line, polygon, multi-point, multi-line, etc) with both spatial and non-spatial attributes. @@ -38321,8 +38481,8 @@ experiments employing a combination of technologies. - - + + Darwin Core Archive (DwC-A) is a biodiversity informatics data standard that makes use of the Darwin Core terms to produce a single, self contained dataset for sharing species-level (taxonomic), species-occurrence, sampling-event, and material sample data. DwC-A The Darwin Core Archive format consists of a set CSV files and a metadata XML file. @@ -38515,19 +38675,276 @@ experiments employing a combination of technologies. Matúš Kalaš 2022-12-07T16:23:45.962537Z - OME-Zarr? + OME-Zarr + + + + + + + application/gpx+xml (not in IANA) + melibleq + 2021-08-26T04:29:11.969493Z + GPS Exchange Format + GPX + + + + + + + + + + + melibleq + 2021-08-26T04:33:21.310981Z + OpenStreetMap File Formats + OSM Formats + TODO recommended to use binary PBF format instead + OSM XML + + + + + + + + + + + melibleq + 2021-08-26T04:36:50.785363Z + Digital Elevation Model + DEM + + + + + + + + + + + + Matúš Kalaš + 2023-10-30T21:56:20.738411Z + Columnar format + Table format? + Tabular format + TODO, WONDERING: Should this also include binary and composite/hierarchical tabular/columnar formats??? (and maybe xml & json/yaml that happen to be tables??) +And Matrix format(s)?? Are then any 2D & more-dimensional raster formats that wouldn't be tabular then??? + + + + + + + + + + Matúš Kalaš + 2023-02-24T09:45:21.41427Z + {Format TEMPLATE} + + + + + + + + + + + Matúš Kalaš + 2023-11-17T17:51:31.543842Z + Social justice FIX ID + Is this a separate topic, or the same as DEI altogether? + TODO: Perhaps not also under Social sciences, or is it? 
+ + + + + + + + + Ableism (kind-of antonym) + Matúš Kalaš + 2023-11-17T17:54:31.281549Z + Access and functional needs + Ableism is discrimination in favour of able-bodied people. + Disability justice FIX ID + + TODO: Is 'access and funcional needs' a broad synonym (not only disability but also other limitations incl. temporary)? Or is it a separate, more "technical" topic? Or specific promotions=workarounds to space/movement/emergency preparedness / disability-inclusive disaster response / various circumstances? + TODO: Perhaps we need a specific attribute for antonyms or quasi-antonyms like here. + + + + + + + + + + Matúš Kalaš + 2023-11-17T18:32:11.770914Z + Emergency preparedness + + + + + + + + + + Matúš Kalaš + 2023-12-13T22:32:43.63192Z + GeoZarr + TODO: Is this the spec https://github.com/zarr-developers/geozarr-spec/blob/main/geozarr-spec.md ??? It is a fork of another repo!!! + + + + + + + + + Matúš Kalaš + 2024-03-20T12:37:57.134274Z + Glycoinformatics + + + + + + + + + + + Glycans + Glycome + Matúš Kalaš + 2024-03-20T12:40:46.398616Z + Glycomics + + How to connect with Glycoinformatics? How to connect with Chemistry et al.? + + + + + + + + + + + + + + + + + + Matúš Kalaš + 2024-03-27T16:01:10.709219Z + Archeobiology concerns more recent times than paleontology. + Archaeobiology + + + + + + + + + + + + + + + + + Matúš Kalaš + 2024-03-27T16:04:44.568345Z + Paleontology + + + + + + + + + + Matúš Kalaš + 2024-03-27T16:10:59.827169Z + Paleobotany + + + + + + + + + + Matúš Kalaš + 2024-03-27T16:11:28.577256Z + Plant ecology? + + + + + + + + + Matúš Kalaš + 2024-03-27T16:13:27.307243Z + Urban ecology? + + + + + + + + + Matúš Kalaš + 2024-03-27T16:13:37.093824Z + Marine ecology? + + + + + + + + + Matúš Kalaš + 2024-03-27T16:14:33.690832Z + Paleo-/Archaeoecology? + + + + - beta12orEarlier - true + beta12orEarlier + true Function A function that processes a set of inputs and results in a set of outputs, or associates arguments (inputs) with values (outputs). 
Computational method @@ -38541,13 +38958,13 @@ experiments employing a combination of technologies. Computational tool Process sumo:Function - - + + Special cases are: a) An operation that consumes no input (has no input arguments). Such operation is either a constant function, or an operation depending only on the underlying state. b) An operation that may modify the underlying state but has no output. c) The singular-case operation with no input or output, that still may modify the underlying state. Operation - - - + + + http://onto.eva.mpg.de/ontologies/gfo-bio.owl#Method http://purl.org/biotop/biotop.owl#Function http://semanticscience.org/resource/SIO_000017 @@ -38595,12 +39012,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Search or query a data resource and retrieve entries and / or annotation. Database retrieval Query - - + + Query and retrieval @@ -38610,12 +39027,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Search database to retrieve all relevant references to a particular entity or entry. - + Data retrieval (database cross-reference) true @@ -38638,11 +39055,11 @@ experiments employing a combination of technologies. - beta12orEarlier - true + beta12orEarlier + true Annotate an entity (typically a biological or biomedical database entity) with terms from a controlled vocabulary. - - + + This is a broad concept and is used a placeholder for other, more specific concepts. Annotation @@ -38659,13 +39076,13 @@ experiments employing a combination of technologies. - beta12orEarlier - true + beta12orEarlier + true Generate an index of (typically a file of) biological data. Data indexing Database indexing - - + + Indexing @@ -38675,12 +39092,12 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Analyse an index of biological data. 
- + Data index analysis true @@ -38691,12 +39108,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Retrieve basic information about a molecular sequence. - + Annotation retrieval (sequence) true @@ -38708,12 +39125,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Generate a molecular sequence by some means. Sequence generation (nucleic acid) Sequence generation (protein) - - + + Sequence generation @@ -38724,10 +39141,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Edit or change a molecular sequence, either randomly or specifically. - - + + Sequence editing @@ -38737,15 +39154,15 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Merge two or more (typically overlapping) molecular sequences. Sequence splicing Paired-end merging Paired-end stitching Read merging Read stitching - - + + Sequence merging @@ -38756,10 +39173,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Convert a molecular sequence from one type to another. - - + + Sequence conversion @@ -38781,10 +39198,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Calculate sequence complexity, for example to find low-complexity regions in sequences. - - + + Sequence complexity calculation @@ -38806,10 +39223,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Calculate sequence ambiguity, for example identity regions in protein or nucleotide sequences with many ambiguity codes. - - + + Sequence ambiguity calculation @@ -38832,10 +39249,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Calculate character or word composition or frequency of a molecular sequence. 
- - + + Sequence composition calculation @@ -38851,10 +39268,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Find and/or analyse repeat sequences in (typically nucleotide) sequences. - - + + Repeat sequences include tandem repeats, inverted or palindromic repeats, DNA microsatellites (Simple Sequence Repeats or SSRs), interspersed repeats, maximal duplications and reverse, complemented and reverse complemented repeats etc. Repeat units can be exact or imperfect, in tandem or dispersed, of specified or unspecified length. Repeat sequence analysis @@ -38878,11 +39295,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Discover new motifs or conserved patterns in sequences or sequence alignments (de-novo discovery). Motif discovery - - + + Motifs and patterns might be conserved or over-represented (occur with improbable frequency). Sequence motif discovery @@ -38906,7 +39323,7 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Find (scan for) known motifs, patterns and regular expressions in molecular sequence(s). Motif scanning Sequence signature detection @@ -38917,8 +39334,8 @@ experiments employing a combination of technologies. Sequence motif detection Sequence motif search Sequence profile search - - + + Sequence motif recognition @@ -38941,10 +39358,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Find motifs shared by molecular sequences. - - + + Sequence motif comparison @@ -38954,12 +39371,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Analyse the sequence, conformational or physicochemical properties of transcription regulatory elements in DNA sequences. - + For example transcription factor binding sites (TFBS) analysis to predict accessibility of DNA to binding factors. 
Transcription regulatory sequence analysis true @@ -38971,11 +39388,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Identify common, conserved (homologous) or synonymous transcriptional regulatory motifs (transcription factor binding sites). - + Conserved transcription regulatory sequence identification true @@ -38987,12 +39404,12 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.18 - - + beta12orEarlier + 1.18 + + Extract, calculate or predict non-positional (physical or chemical) properties of a protein from processing a protein (3D) structure. - + Protein property calculation (from structure) true @@ -39004,7 +39421,7 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Analyse flexibility and motion in protein structure. CG analysis MD analysis @@ -39014,8 +39431,8 @@ experiments employing a combination of technologies. Protein flexibility and motion analysis Protein flexibility prediction Protein motion prediction - - + + Use this concept for analysis of flexible and rigid residues, local chain deformability, regions undergoing conformational change, molecular vibrations or fluctuational dynamics, domain motions or other large-scale structural transitions in a protein structure. Simulation analysis @@ -39033,12 +39450,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Identify or screen for 3D structural motifs in protein structure(s). Protein structural feature identification Protein structural motif recognition - - + + This includes conserved substructures and conserved geometry, such as spatial arrangement of secondary structure or protein backbone. Methods might use structure alignment, structural templates, searches for similar electrostatic potential and molecular surface shape, surface-mapping of phylogenetic information etc. 
Structural motif discovery @@ -39056,10 +39473,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Identify structural domains in a protein structure from first principles (for example calculations on structural compactness). - - + + Protein domain recognition @@ -39069,10 +39486,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Analyse the architecture (spatial arrangement of secondary structure) of protein structure(s). - - + + Protein architecture analysis @@ -39088,7 +39505,7 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier WHATIF: SymShellFiveXML WHATIF: SymShellOneXML WHATIF: SymShellTenXML @@ -39098,8 +39515,8 @@ experiments employing a combination of technologies. WHATIF:ListSideChainContactsNormal WHATIF:ListSideChainContactsRelaxed Calculate or extract inter-atomic, inter-residue or residue-atom contacts, distances and interactions in protein structure(s). - - + + Residue interaction calculation @@ -39115,7 +39532,7 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier WHATIF:CysteineTorsions WHATIF:ResidueTorsions WHATIF:ResidueTorsionsBB @@ -39125,8 +39542,8 @@ experiments employing a combination of technologies. Cysteine torsion angle calculation Tau angle calculation Torsion angle calculation - - + + Protein geometry calculation @@ -39137,15 +39554,15 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Extract, calculate or predict non-positional (physical or chemical) properties of a protein, including any non-positional properties of the molecular sequence, from processing a protein sequence or 3D structure. 
Protein property rendering Protein property calculation (from sequence) Protein property calculation (from structure) Protein structural property calculation Structural property calculation - - + + This includes methods to render and visualise the properties of a protein sequence, and a residue-level search for properties such as solvent accessibility, hydropathy, secondary structure, ligand-binding etc. Protein property calculation @@ -39169,7 +39586,7 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Immunogen design Predict antigenicity, allergenicity / immunogenicity, allergic cross-reactivity etc of peptides and proteins. Antigenicity prediction @@ -39177,8 +39594,8 @@ experiments employing a combination of technologies. B cell peptide immunogenicity prediction Hopp and Woods plotting MHC peptide immunogenicity prediction - - + + Immunological system are cellular or humoral. In vaccine design to induces a cellular immune response, methods must search for antigens that can be recognized by the major histocompatibility complex (MHC) molecules present in T lymphocytes. If a humoral response is required, antigens for B cells must be identified. This includes methods that generate a graphical rendering of antigenicity of a protein, such as a Hopp and Woods plot. This is usually done in the development of peptide-specific antibodies or multi-epitope vaccines. Methods might use sequence data (for example motifs) and / or structural data. @@ -39204,14 +39621,14 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Predict, recognise and identify positional features in molecular sequences such as key functional sites or regions. 
Sequence feature prediction Sequence feature recognition Motif database search SO:0000110 - - + + Look at "Protein feature detection" (http://edamontology.org/operation_3092) and "Nucleic acid feature detection" (http://edamontology.org/operation_0415) in case more specific terms are needed. Sequence feature detection @@ -39222,12 +39639,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Extract a sequence feature table from a sequence database entry. - + Data retrieval (feature table) true @@ -39238,12 +39655,12 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Query the features (in a feature table) of molecular sequence(s). - + Feature table query true @@ -39273,12 +39690,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Compare the feature tables of two or more molecular sequences. Feature comparison Feature table comparison - - + + Sequence feature comparison @@ -39288,12 +39705,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Display basic information about a sequence alignment. - + Data retrieval (sequence alignment) true @@ -39310,10 +39727,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Analyse a molecular sequence alignment. - - + + Sequence alignment analysis @@ -39324,10 +39741,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Compare (typically by aligning) two molecular sequence alignments. - - + + See also 'Sequence profile alignment'. Sequence alignment comparison @@ -39339,10 +39756,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Convert a molecular sequence alignment from one type to another (for example amino acid to coding nucleotide sequence). 
- - + + Sequence alignment conversion @@ -39352,12 +39769,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Process (read and / or write) physicochemical property data of nucleic acids. - + Nucleic acid property processing true @@ -39374,10 +39791,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Calculate or predict physical or chemical properties of nucleic acid molecules, including any non-positional properties of the molecular sequence. - - + + Nucleic acid property calculation @@ -39393,14 +39810,14 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Predict splicing alternatives or transcript isoforms from analysis of sequence data. Alternative splicing analysis Alternative splicing detection Differential splicing analysis Splice transcript prediction - - + + Alternative splicing prediction @@ -39417,11 +39834,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Detect frameshifts in DNA sequences, including frameshift sites and signals, and frameshift errors from sequencing projects. Frameshift error detection - - + + Methods include sequence alignment (if related sequences are available) and word-based sequence comparison. Frameshift detection @@ -39432,10 +39849,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Detect vector sequences in nucleotide sequence, typically by comparison to a set of known vector sequences. - - + + Vector sequence detection @@ -39448,11 +39865,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Predict secondary structure of protein sequences. 
Secondary structure prediction (protein) - - + + Methods might use amino acid composition, local sequence information, multiple sequence alignments, physicochemical features, estimated energy content, statistical algorithms, hidden Markov models, support vector machines, kernel machines, neural networks etc. Protein secondary structure prediction @@ -39469,10 +39886,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Predict super-secondary structure of protein sequence(s). - - + + Super-secondary structures include leucine zippers, coiled coils, Helix-Turn-Helix etc. Protein super-secondary structure prediction @@ -39484,10 +39901,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Predict and/or classify transmembrane proteins or transmembrane (helical) domains or regions in protein sequences. - - + + Transmembrane protein prediction @@ -39503,10 +39920,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Analyse transmembrane protein(s), typically by processing sequence and / or structural data, and write an informative report for example about the protein and its transmembrane domains / regions. - - + + Use this (or child) concept for analysis of transmembrane domains (buried and exposed faces), transmembrane helices, helix topology, orientation, inter-helical contacts, membrane dipping (re-entrant) loops and other secondary structure etc. Methods might use pattern discovery, hidden Markov models, sequence alignment, structural profiles, amino acid property analysis, comparison to known domains or some combination (hybrid methods). Transmembrane protein analysis @@ -39517,15 +39934,15 @@ experiments employing a combination of technologies. - beta12orEarlier - This is a "organisational class" not very useful for annotation per se. - 1.19 - - + beta12orEarlier + This is a "organisational class" not very useful for annotation per se. 
+ 1.19 + + Predict tertiary structure of a molecular (biopolymer) sequence. - + Structure prediction true @@ -39543,13 +39960,13 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Predict contacts, non-covalent interactions and distance (constraints) between amino acids in protein sequences. Residue interaction prediction Contact map prediction Protein contact map prediction - - + + Methods usually involve multiple sequence alignment analysis. Residue contact prediction @@ -39560,11 +39977,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Analyse experimental protein-protein interaction data from for example yeast two-hybrid analysis, protein microarrays, immunoaffinity chromatography followed by mass spectrometry, phage display etc. - + Protein interaction raw data analysis true @@ -39576,11 +39993,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Identify or predict protein-protein interactions, interfaces, binding sites etc in protein sequences. - + Protein-protein interaction prediction (from protein sequence) true @@ -39592,11 +40009,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Identify or predict protein-protein interactions, interfaces, binding sites etc in protein structures. - + Protein-protein interaction prediction (from protein structure) true @@ -39621,11 +40038,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Analyse a network of protein interactions. Protein interaction network comparison - - + + Protein interaction network analysis @@ -39635,14 +40052,14 @@ experiments employing a combination of technologies. - beta12orEarlier - Notions of pathway and network were mixed up, EDAM 1.24 disentangles them. 
- 1.24 - + beta12orEarlier + Notions of pathway and network were mixed up, EDAM 1.24 disentangles them. + 1.24 + Compare two or more biological pathways or networks. - + Pathway or network comparison true @@ -39661,11 +40078,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Predict RNA secondary structure (for example knots, pseudoknots, alternative structures etc). RNA shape prediction - - + + Methods might use RNA motifs, predicted intermolecular contacts, or RNA sequence-structure compatibility (inverse RNA folding). RNA secondary structure prediction @@ -39684,14 +40101,14 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Analyse some aspect of RNA/DNA folding, typically by processing sequence and/or structural data. For example, compute folding energies such as minimum folding energies for DNA or RNA sequences or energy landscape of RNA mutants. Nucleic acid folding Nucleic acid folding modelling Nucleic acid folding prediction Nucleic acid folding energy calculation - - + + Nucleic acid folding analysis @@ -39701,12 +40118,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Retrieve information on restriction enzymes or restriction enzyme sites. - + Data retrieval (restriction enzyme annotation) true @@ -39717,12 +40134,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Identify genetic markers in DNA sequences. - + A genetic marker is any DNA sequence of known chromosomal location that is associated with and specific to a particular gene or trait. This includes short sequences surrounding a SNP, Sequence-Tagged Sites (STS) which are well suited for PCR amplification, a longer minisatellites sequence etc. Genetic marker identification true @@ -39741,7 +40158,7 @@ experiments employing a combination of technologies. 
- beta12orEarlier + beta12orEarlier Generate a genetic (linkage) map of a DNA sequence (typically a chromosome) showing the relative positions of genetic markers based on estimation of non-physical distances. Functional mapping Genetic cartography @@ -39749,8 +40166,8 @@ experiments employing a combination of technologies. Genetic map generation Linkage mapping QTL mapping - - + + Mapping involves ordering genetic loci along a chromosome and estimating the physical distance between loci. A genetic map shows the relative (not physical) position of known genes and genetic markers. This includes mapping of the genetic architecture of dynamic complex traits (functional mapping), e.g. by characterisation of the underlying quantitative trait loci (QTLs) or nucleotides (QTNs). Genetic mapping @@ -39775,10 +40192,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Analyse genetic linkage. - - + + For example, estimate how close two genes are on a chromosome by calculating how often they are transmitted together to an offspring, ascertain whether two genes are linked and parental linkage, calculate linkage map distance etc. Linkage analysis @@ -39796,11 +40213,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Calculate codon usage statistics and create a codon usage table. Codon usage table construction - - + + Codon usage table generation @@ -39811,10 +40228,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Compare two or more codon usage tables. - - + + Codon usage table comparison @@ -39854,12 +40271,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Analyse codon usage in molecular sequences or process codon usage data (e.g. a codon usage table). 
Codon usage data analysis Codon usage table analysis - - + + Codon usage analysis @@ -39882,10 +40299,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Identify and plot third base position variability in a nucleotide sequence. - - + + Base position variability plotting @@ -39895,10 +40312,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Find exact character or word matches between molecular sequences without full sequence alignment. - - + + Sequence word comparison @@ -39921,13 +40338,13 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Calculate a sequence distance matrix or otherwise estimate genetic distances between molecular sequences. Phylogenetic distance matrix generation Sequence distance calculation Sequence distance matrix construction - - + + Sequence distance matrix generation @@ -39943,10 +40360,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Compare two or more molecular sequences, identify and remove redundant sequences based on some criteria. - - + + Sequence redundancy removal @@ -39964,12 +40381,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Build clusters of similar sequences, typically using scores from pair-wise alignment or other comparison of the sequences. Sequence cluster construction Sequence cluster generation - - + + The clusters may be output or used internally for some other purpose. Sequence clustering @@ -39989,7 +40406,7 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Align (identify equivalent sites within) molecular sequences. Sequence alignment construction Sequence alignment generation @@ -39997,8 +40414,8 @@ experiments employing a combination of technologies. 
Constrained sequence alignment Multiple sequence alignment (constrained) Sequence alignment (constrained) - - + + Includes methods that align sequence profiles (representing sequence alignments): ethods might perform one-to-one, one-to-many or many-to-many comparisons. See also 'Sequence alignment comparison'. See also "Read mapping" Sequence alignment @@ -40011,12 +40428,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Align two or more molecular sequences of different types (for example genomic DNA to EST, cDNA or mRNA). - + Hybrid sequence alignment construction true @@ -40027,11 +40444,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Align molecular sequences using sequence and structural information. Sequence alignment (structure-based) - - + + Structure-based sequence alignment @@ -40049,14 +40466,14 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Align (superimpose) molecular tertiary structures. Structural alignment 3D profile alignment 3D profile-to-3D profile alignment Structural profile alignment - - + + Includes methods that align structural (3D) profiles or templates (representing structures or structure alignments) - including methods that perform one-to-one, one-to-many or many-to-many comparisons. Structure alignment @@ -40087,11 +40504,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Generate some type of sequence profile (for example a hidden Markov model) from a sequence alignment. Sequence profile construction - - + + Sequence profile generation @@ -40120,12 +40537,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Generate some type of structural (3D) profile or template from a structure or structure alignment. 
Structural profile construction Structural profile generation - - + + 3D profile generation @@ -40135,11 +40552,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Align sequence profiles (representing sequence alignments). - + Profile-profile alignment true @@ -40151,11 +40568,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Align structural (3D) profiles or templates (representing structures or structure alignments). - + 3D profile-to-3D profile alignment true @@ -40179,14 +40596,14 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Align molecular sequence(s) to sequence profile(s), or profiles to other profiles. A profile typically represents a sequence alignment. Profile-profile alignment Profile-to-profile alignment Sequence-profile alignment Sequence-to-profile alignment - - + + A sequence profile typically represents a sequence alignment. Methods might perform one-to-one, one-to-many or many-to-many comparisons. Sequence profile alignment @@ -40197,11 +40614,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Align molecular sequence(s) to structural (3D) profile(s) or template(s) (representing a structure or structure alignment). - + Sequence-to-3D-profile alignment true @@ -40225,13 +40642,13 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Align molecular sequence to structure in 3D space (threading). Sequence-structure alignment Sequence-3D profile alignment Sequence-to-3D-profile alignment - - + + This includes sequence-to-3D-profile alignment methods, which align molecular sequence(s) to structural (3D) profile(s) or template(s) (representing a structure or structure alignment) - methods might perform one-to-one, one-to-many or many-to-many comparisons. 
Use this concept for methods that evaluate sequence-structure compatibility by assessing residue interactions in 3D. Methods might perform one-to-one, one-to-many or many-to-many comparisons. Protein threading @@ -40248,15 +40665,15 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Recognize (predict and identify) known protein structural domains or folds in protein sequence(s) which (typically) are not accompanied by any significant sequence similarity to know structures. Domain prediction Fold prediction Protein domain prediction Protein fold prediction Protein fold recognition - - + + Methods use some type of mapping between sequence and fold, for example secondary structure prediction and alignment, profile comparison, sequence properties, homologous sequence search, kernel machines etc. Domains and folds might be taken from SCOP or CATH. Fold recognition @@ -40267,12 +40684,12 @@ experiments employing a combination of technologies. - beta12orEarlier - (jison)Too fine-grained, the operation (Data retrieval) hasn't changed, just what is retrieved. - 1.17 - + beta12orEarlier + (jison)Too fine-grained, the operation (Data retrieval) hasn't changed, just what is retrieved. + 1.17 + Search for and retrieve data concerning or describing some core data, as distinct from the primary data that is being described. - + This includes documentation, general information and other metadata on entities such as databases, database entries and tools. Metadata retrieval @@ -40291,10 +40708,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Query scientific literature, in search for articles, article data, concepts, named entities, or for statistics. - - + + Literature search @@ -40323,7 +40740,7 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Text analysis Process and analyse text (typically scientific literature) to extract information from it. 
Literature mining @@ -40331,8 +40748,8 @@ experiments employing a combination of technologies. Text data mining Article analysis Literature analysis - - + + Text mining @@ -40343,10 +40760,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Perform in-silico (virtual) PCR. - - + + Virtual PCR @@ -40374,7 +40791,7 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Design or predict oligonucleotide primers for PCR and DNA amplification etc. PCR primer prediction Primer design @@ -40385,8 +40802,8 @@ experiments employing a combination of technologies. PCR primer design (for large scale sequencing) PCR primer design (for methylation PCRs) Primer quality estimation - - + + Primer design involves predicting or selecting primers that are specific to a provided PCR template. Primers can be designed with certain properties such as size of product desired, primer size etc. The output might be a minimal or overlapping primer set. This includes predicting primers based on gene structure, promoters, exon-exon junctions, predicting primers that are conserved across multiple genomes or species, primers for for gene transcription profiling, for genotyping polymorphisms, for example single nucleotide polymorphisms (SNPs), for large scale sequencing, or for methylation PCRs. PCR primer design @@ -40424,11 +40841,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Predict and/or optimize oligonucleotide probes for DNA microarrays, for example for transcription profiling of genes, or for genomes and gene families. Microarray probe prediction - - + + Microarray probe design @@ -40450,12 +40867,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Combine (align and merge) overlapping fragments of a DNA sequence to reconstruct the original sequence. 
Metagenomic assembly Sequence assembly editing - - + + For example, assemble overlapping reads from paired-end sequencers into contigs (a contiguous sequence corresponding to read overlaps). Or assemble contigs, for example ESTs and genomic DNA fragments, depending on the detected fragment overlaps. Sequence assembly @@ -40467,11 +40884,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.16 - + beta12orEarlier + 1.16 + Standardize or normalize microarray data. - + Microarray data standardisation and normalisation true @@ -40483,12 +40900,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Process (read and / or write) SAGE, MPSS or SBS experimental data. - + Sequencing-based expression profile data processing true @@ -40506,12 +40923,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Perform cluster analysis of expression data to identify groups with similar expression profiles, for example by clustering. Gene expression clustering Gene expression profile clustering - - + + Expression profile clustering @@ -40521,7 +40938,7 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier The measurement of the activity (expression) of multiple genes in a cell, tissue, sample etc., in order to get an impression of biological function. Feature expression analysis Functional profiling @@ -40533,8 +40950,8 @@ experiments employing a combination of technologies. Protein profiling RNA profiling mRNA profiling - - + + Gene expression profiling generates some sort of gene expression profile, for example from microarray data. Gene expression profiling @@ -40546,12 +40963,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Comparison of expression profiles. 
Gene expression comparison Gene expression profile comparison - - + + Expression profile comparison @@ -40561,11 +40978,11 @@ experiments employing a combination of technologies. - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Interpret (in functional terms) and annotate gene expression data. - + Functional profiling true @@ -40577,12 +40994,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Analyse EST or cDNA sequences. - + For example, identify full-length cDNAs from EST sequences or detect potential EST antisense transcripts. EST and cDNA sequence analysis true @@ -40594,12 +41011,12 @@ experiments employing a combination of technologies. - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Identify and select targets for protein structural determination. - + Methods will typically navigate a graph of protein families of known structure. Structural genomics target selection true @@ -40623,10 +41040,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Assign secondary structure from protein coordinate or experimental data. - - + + Includes secondary structure assignment from circular dichroism (CD) spectroscopic data, and from protein coordinate data. Protein secondary structure assignment @@ -40650,12 +41067,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Assign a protein tertiary structure (3D coordinates), or other aspects of protein structure, from raw experimental data. NOE assignment Structure calculation - - + + Protein structure assignment @@ -40678,15 +41095,15 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier WHATIF: CorrectedPDBasXML WHATIF: UseFileDB WHATIF: UseResidueDB Evaluate the quality or correctness a protein three-dimensional model. 
Protein model validation Residue validation - - + + Model validation might involve checks for atomic packing, steric clashes (bumps), volume irregularities, agreement with electron density maps, number of amino acid residues, percentage of residues with missing or bad atoms, irregular Ramachandran Z-scores, irregular Chi-1 / Chi-2 normality scores, RMS-Z score on bonds and angles etc. The PDB file format has had difficulties, inconsistencies and errors. Corrections can include identifying a meaningful sequence, removal of alternate atoms, correction of nomenclature problems, removal of incomplete residues and spurious waters, addition or removal of water, modelling of missing side chains, optimisation of cysteine bonds, regularisation of bond lengths, bond angles and planarities etc. This includes methods that calculate poor quality residues. The scoring function to identify poor quality residues may consider residues with bad atoms or atoms with high B-factor, residues in the N- or C-terminal position, adjacent to an unstructured residue, non-canonical residues, glycine and proline (or adjacent to these such residues). @@ -40701,12 +41118,12 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier WHATIF: CorrectedPDBasXML Refine (after evaluation) a model of a molecular structure (typically a protein structure) to reduce steric clashes, volume irregularities etc. Protein model refinement - - + + Molecular model refinement @@ -40729,13 +41146,13 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Construct a phylogenetic tree. Phlyogenetic tree construction Phylogenetic reconstruction Phylogenetic tree generation - - + + Phylogenetic trees are usually constructed from a set of sequences from which an alignment (or data matrix) is calculated. Phylogenetic inference @@ -40753,12 +41170,12 @@ experiments employing a combination of technologies. 
- beta12orEarlier + beta12orEarlier Analyse an existing phylogenetic tree or trees, typically to detect features or make predictions. Phylogenetic tree analysis Phylogenetic modelling - - + + Phylgenetic modelling is the modelling of trait evolution and prediction of trait values using phylogeny as a basis. Phylogenetic analysis @@ -40770,10 +41187,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Compare two or more phylogenetic trees. - - + + For example, to produce a consensus tree, subtrees, supertrees, calculate distances between trees or test topological similarity between trees (e.g. a congruence index) etc. Phylogenetic tree comparison @@ -40797,10 +41214,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Edit a phylogenetic tree. - - + + Phylogenetic tree editing @@ -40816,11 +41233,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Comparison of a DNA sequence to orthologous sequences in different species and inference of a phylogenetic tree, in order to identify regulatory elements such as transcription factor binding sites (TFBS). Phylogenetic shadowing - - + + Phylogenetic shadowing is a type of footprinting where many closely related species are used. A phylogenetic 'shadow' represents the additive differences between individual sequences. By masking or 'shadowing' variable positions a conserved sequence is produced with few or none of the variations, which is then compared to the sequences of interest to identify significant regions of conservation. Phylogenetic footprinting @@ -40832,11 +41249,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.20 - + beta12orEarlier + 1.20 + Simulate the folding of a protein. - + Protein folding simulation true @@ -40848,11 +41265,11 @@ experiments employing a combination of technologies. 
- beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Predict the folding pathway(s) or non-native structural intermediates of a protein. - + Protein folding pathway prediction true @@ -40864,11 +41281,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Map and model the effects of single nucleotide polymorphisms (SNPs) on protein structure(s). - + Protein SNP mapping true @@ -40892,14 +41309,14 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Predict the effect of point mutation on a protein structure, in terms of strucural effects and protein folding, stability and function. Variant functional prediction Protein SNP mapping Protein mutation modelling Protein stability change prediction - - + + Protein SNP mapping maps and modesl the effects of single nucleotide polymorphisms (SNPs) on protein structure(s). Methods might predict silent or pathological mutations. Variant effect prediction @@ -40910,11 +41327,11 @@ experiments employing a combination of technologies. - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Design molecules that elicit an immune response (immunogens). - + Immunogen design true @@ -40926,11 +41343,11 @@ experiments employing a combination of technologies. - beta12orEarlier - 1.18 - + beta12orEarlier + 1.18 + Predict and optimise zinc finger protein domains for DNA/RNA binding (for example for transcription factors and nucleases). - + Zinc finger prediction true @@ -40954,10 +41371,10 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Calculate Km, Vmax and derived data for an enzyme reaction. - - + + Enzyme kinetics calculation @@ -40967,16 +41384,15 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Reformat a file of data (or equivalent entity in memory). 
File format conversion File formatting File reformatting Format conversion - Reformatting - - - Formatting + + + Data formatting @@ -40985,11 +41401,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Test and validate the format and content of a data file. File format validation - - + + Format validation @@ -41017,15 +41433,15 @@ experiments employing a combination of technologies. - beta12orEarlier - true + beta12orEarlier + true Visualise, plot or render (graphically) biomolecular data such as molecular sequences or structures. Data visualisation Rendering Molecular visualisation Plotting - - + + This includes methods to render and visualise molecules. Data? visualisation @@ -41045,11 +41461,11 @@ experiments employing a combination of technologies. - beta12orEarlier + beta12orEarlier Search a sequence database by sequence comparison and retrieve similar sequences. sequences matching a given sequence motif or pattern, such as a Prosite pattern or regular expression. - - + + This excludes direct retrieval methods (e.g. the dbfetch program). Sequence database search @@ -41066,10 +41482,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Search a tertiary structure database, typically by sequence and/or structure comparison, or some other means, and retrieve structures and associated data. - - + + Structure database search @@ -41079,11 +41495,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Search a secondary protein database (of classification information) to assign a protein sequence(s) to a known protein family or group. 
- + Protein secondary database search true @@ -41095,12 +41511,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Screen a sequence against a motif or pattern database. - + Motif database search true @@ -41111,12 +41527,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.4 - + beta12orEarlier + 1.4 + Search a database of sequence profiles with a query sequence. - + Sequence profile database search true @@ -41127,12 +41543,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Search a database of transmembrane proteins, for example for sequence or structural similarities. - + Transmembrane protein database search true @@ -41143,12 +41559,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Query a database and retrieve sequences with a given entry code or accession number. - + Sequence retrieval (by code) true @@ -41159,12 +41575,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Query a database and retrieve sequences containing a given keyword. - + Sequence retrieval (by keyword) true @@ -41177,12 +41593,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Search a sequence database and retrieve sequences that are similar to a query sequence. 
Sequence database search (by sequence) Structure database search (by sequence) - - + + Sequence similarity search @@ -41192,11 +41608,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Search a sequence database and retrieve sequences matching a given sequence motif or pattern, such as a Prosite pattern or regular expression. - + Sequence database search (by motif or pattern) true @@ -41208,12 +41624,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search a sequence database and retrieve sequences of a given amino acid composition. - + Sequence database search (by amino acid composition) true @@ -41224,10 +41640,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Search a sequence database and retrieve sequences with a specified property, typically a physicochemical or compositional property. - - + + Sequence database search (by property) @@ -41237,12 +41653,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search a sequence database and retrieve sequences that are similar to a query sequence using a word-based method. - + Word-based methods (for example BLAST, gapped BLAST, MEGABLAST, WU-BLAST etc.) are usually quicker than alignment-based methods. They may or may not handle gaps. Sequence database search (by sequence using word-based methods) true @@ -41254,12 +41670,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search a sequence database and retrieve sequences that are similar to a query sequence using a sequence profile-based method, or with a supplied profile as query. - + This includes tools based on PSI-BLAST. 
Sequence database search (by sequence using profile-based methods) true @@ -41271,12 +41687,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search a sequence database for sequences that are similar to a query sequence using a local alignment-based method. - + This includes tools based on the Smith-Waterman algorithm or FASTA. Sequence database search (by sequence using local alignment-based methods) true @@ -41288,12 +41704,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search sequence(s) or a sequence database for sequences that are similar to a query sequence using a global alignment-based method. - + This includes tools based on the Needleman and Wunsch algorithm. Sequence database search (by sequence using global alignment-based methods) true @@ -41305,12 +41721,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search a DNA database (for example a database of conserved sequence tags) for matches to Sequence-Tagged Site (STS) primer sequences. - + STSs are genetic markers that are easily detected by the polymerase chain reaction (PCR) using specific primers. Sequence database search (by sequence for primer sequences) true @@ -41322,11 +41738,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search sequence(s) or a sequence database for sequences which match a set of peptide masses, for example a peptide mass fingerprint from mass spectrometry. 
- + Sequence database search (by molecular weight) true @@ -41338,12 +41754,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search sequence(s) or a sequence database for sequences of a given isoelectric point. - + Sequence database search (by isoelectric point) true @@ -41354,12 +41770,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Query a tertiary structure database and retrieve entries with a given entry code or accession number. - + Structure retrieval (by code) true @@ -41370,12 +41786,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Query a tertiary structure database and retrieve entries containing a given keyword. - + Structure retrieval (by keyword) true @@ -41386,11 +41802,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Search a tertiary structure database and retrieve structures with a sequence similar to a query sequence. - + Structure database search (by sequence) true @@ -41403,12 +41819,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Search a database of molecular structure and retrieve structures that are similar to a query structure. Structure database search (by structure) Structure retrieval by structure - - + + Structural similarity search @@ -41430,10 +41846,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Annotate a molecular sequence record with terms from a controlled vocabulary. 
- - + + Sequence annotation @@ -41444,13 +41860,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Annotate a genome sequence with terms from a controlled vocabulary. Functional genome annotation Metagenome annotation Structural genome annotation - - + + Genome annotation @@ -41460,13 +41876,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate the reverse and / or complement of a nucleotide sequence. Nucleic acid sequence reverse and complement Reverse / complement Reverse and complement - - + + Reverse complement @@ -41477,10 +41893,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate a random sequence, for example, with a specific character composition. - - + + Random sequence generation @@ -41497,11 +41913,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate digest fragments for a nucleotide sequence containing restriction sites. Nucleic acid restriction digest - - + + Restriction digest @@ -41524,10 +41940,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Cleave a protein sequence into peptide fragments (corresponding to enzymatic or chemical cleavage). - - + + This is often followed by calculation of protein fragment masses (http://edamontology.org/operation_0398). Protein sequence cleavage @@ -41538,10 +41954,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Mutate a molecular sequence a specified amount or shuffle it to produce a randomised sequence with the same overall composition. 
- - + + Sequence mutation and randomisation @@ -41551,10 +41967,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Mask characters in a molecular sequence (replacing those characters with a mask character). - - + + For example, SNPs or repeats in a DNA sequence might be masked. Sequence masking @@ -41565,10 +41981,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Cut (remove) characters or a region from a molecular sequence. - - + + Sequence cutting @@ -41578,10 +41994,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Create (or remove) restriction sites in sequences, for example using silent mutations. - - + + Restriction site creation @@ -41597,10 +42013,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Translate a DNA sequence into protein. - - + + DNA translation @@ -41616,10 +42032,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Transcribe a nucleotide sequence into mRNA sequence(s). - - + + DNA transcription @@ -41629,11 +42045,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Calculate base frequency or word composition of a nucleotide sequence. - + Sequence composition calculation (nucleic acid) true @@ -41645,11 +42061,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Calculate amino acid frequency or word composition of a protein sequence. 
- + Sequence composition calculation (protein) true @@ -41662,10 +42078,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Find (and possibly render) short repetitive subsequences (repeat sequences) in (typically nucleotide) sequences. - - + + Repeat sequence detection @@ -41676,10 +42092,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse repeat sequence organisation such as periodicity. - - + + Repeat sequence organisation analysis @@ -41689,11 +42105,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Analyse the hydrophobic, hydrophilic or charge properties of a protein structure. - + Protein hydropathy calculation (from structure) true @@ -41711,13 +42127,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:AtomAccessibilitySolvent WHATIF:AtomAccessibilitySolventPlus Calculate solvent accessible or buried surface areas in protein or other molecular structures. Protein solvent accessibility calculation - - + + Solvent accessibility might be calculated for the backbone, sidechain and total (backbone plus sidechain). Accessible surface calculation @@ -41728,11 +42144,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Identify clusters of hydrophobic or charged residues in a protein structure. - + Protein hydropathy cluster calculation true @@ -41750,10 +42166,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate whether a protein structure has an unusually large net charge (dipole moment). 
- - + + Protein dipole moment calculation @@ -41763,7 +42179,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:AtomAccessibilityMolecular WHATIF:AtomAccessibilityMolecularPlus WHATIF:ResidueAccessibilityMolecular @@ -41777,8 +42193,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Protein residue surface calculation Protein surface and interior calculation Protein surface calculation - - + + Molecular surface calculation @@ -41788,11 +42204,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Identify or predict catalytic residues, active sites or other ligand-binding sites in protein structures. - + Protein binding site prediction (from structure) true @@ -41810,13 +42226,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse the interaction of protein with nucleic acids, e.g. RNA or DNA-binding sites, interfaces etc. Protein-nucleic acid binding site analysis Protein-DNA interaction analysis Protein-RNA interaction analysis - - + + Protein-nucleic acid interaction analysis @@ -41826,10 +42242,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Decompose a structure into compact or globular fragments (protein peeling). - - + + Protein peeling @@ -41845,10 +42261,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate a matrix of distance between residues (for example the C-alpha atoms) in a protein structure. 
- - + + Protein distance matrix calculation @@ -41864,11 +42280,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate a residue contact map (typically all-versus-all inter-residue contacts) for a protein structure. Protein contact map calculation - - + + Contact map calculation @@ -41884,10 +42300,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate clusters of contacting residues in protein structures. - - + + This includes for example clusters of hydrophobic or charged residues, or clusters of contacting residues which have a key structural or functional role. Residue cluster calculation @@ -41904,13 +42320,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:HasHydrogenBonds WHATIF:ShowHydrogenBonds WHATIF:ShowHydrogenBondsM Identify potential hydrogen bonds between amino acids and other groups. - - + + The output might include the atoms involved in the bond, bond geometric parameters and bond enthalpy. Hydrogen bond calculation @@ -41921,12 +42337,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate non-canonical atomic interactions in protein structures. - + Residue non-canonical interaction detection true @@ -41943,10 +42359,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate a Ramachandran plot of a protein structure. - - + + Ramachandran plot calculation @@ -41957,11 +42373,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.22 - + beta12orEarlier + 1.22 + Validate a Ramachandran plot of a protein structure. 
- + Ramachandran plot validation true @@ -41985,11 +42401,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate the molecular weight of a protein sequence or fragments. Peptide mass calculation - - + + Protein molecular weight calculation @@ -42005,10 +42421,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict extinction coefficients or optical density of a protein sequence. - - + + Protein extinction coefficient calculation @@ -42030,11 +42446,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate pH-dependent properties from pKa calculations of a protein sequence. Protein pH-dependent property calculation - - + + Protein pKa calculation @@ -42045,11 +42461,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Hydropathy calculation on a protein sequence. - + Protein hydropathy calculation (from sequence) true @@ -42068,10 +42484,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Plot a protein titration curve. - - + + Protein titration curve plotting @@ -42088,10 +42504,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate isoelectric point of a protein sequence. - - + + Protein isoelectric point calculation @@ -42107,10 +42523,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Estimate hydrogen exchange rate of a protein sequence. 
- - + + Protein hydrogen exchange rate calculation @@ -42120,10 +42536,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate hydrophobic or hydrophilic / charged regions of a protein sequence. - - + + Protein hydrophobic region calculation @@ -42139,10 +42555,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate aliphatic index (relative volume occupied by aliphatic side chains) of a protein. - - + + Protein aliphatic index calculation @@ -42159,10 +42575,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate the hydrophobic moment of a peptide sequence and recognize amphiphilicity. - - + + Hydrophobic moment is a peptides hydrophobicity measured for different angles of rotation. Protein hydrophobic moment plotting @@ -42179,10 +42595,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict the stability or globularity of a protein sequence, whether it is intrinsically unfolded etc. - - + + Protein globularity prediction @@ -42198,10 +42614,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict the solubility or atomic solvation energy of a protein sequence. - - + + Protein solubility prediction @@ -42217,10 +42633,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict crystallizability of a protein sequence. - - + + Protein crystallizability prediction @@ -42230,12 +42646,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - (jison)Too fine-grained. - 1.17 - + beta12orEarlier + (jison)Too fine-grained. 
+ 1.17 + Detect or predict signal peptides (and typically predict subcellular localisation) of eukaryotic proteins. - + Protein signal peptide detection (eukaryotes) true @@ -42247,12 +42663,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - (jison)Too fine-grained. - 1.17 - + beta12orEarlier + (jison)Too fine-grained. + 1.17 + Detect or predict signal peptides (and typically predict subcellular localisation) of bacterial proteins. - + Protein signal peptide detection (bacteria) true @@ -42264,11 +42680,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Predict MHC class I or class II binding peptides, promiscuous binding peptides, immunogenicity etc. - + MHC peptide immunogenicity prediction true @@ -42280,12 +42696,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Predict, recognise and identify positional features in protein sequences such as functional sites or regions and secondary structure. - + Methods typically involve scanning for known motifs, patterns and regular expressions. Protein feature prediction (from sequence) true @@ -42310,7 +42726,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict, recognise and identify features in nucleotide sequences such as functional sites or regions, typically by scanning for known motifs, patterns and regular expressions. Sequence feature detection (nucleic acid) Nucleic acid feature prediction @@ -42318,8 +42734,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Nucleic acid site detection Nucleic acid site prediction Nucleic acid site recognition - - + + Methods typically involve scanning for known motifs, patterns and regular expressions. 
This is placeholder but does not comprehensively include all child concepts - please inspect other concepts under "Nucleic acid sequence analysis" for example "Gene prediction", for other feature detection operations. Nucleic acid feature detection @@ -42338,7 +42754,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict antigenic determinant sites (epitopes) in protein sequences. Antibody epitope prediction Epitope prediction @@ -42348,8 +42764,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Epitope mapping (MHC Class II) T cell epitope mapping T cell epitope prediction - - + + Epitope mapping is commonly done during vaccine design. Epitope mapping @@ -42366,7 +42782,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict post-translation modification sites in protein sequences. PTM analysis PTM prediction @@ -42420,8 +42836,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Tyrosine nitration site prediction Ubiquitination prediction Ubiquitination site prediction - - + + Methods might predict sites of methylation, N-terminal myristoylation, N-terminal acetylation, sumoylation, palmitoylation, phosphorylation, sulfation, glycosylation, glycosylphosphatidylinositol (GPI) modification sites (GPI lipid anchor signals) etc. Post-translational modification site prediction @@ -42439,10 +42855,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Detect or predict signal peptides and signal peptide cleavage sites in protein sequences. - - + + Methods might use sequence motifs and features, amino acid composition, profiles, machine-learned classifiers, etc. 
Protein signal peptide detection @@ -42453,11 +42869,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Predict catalytic residues, active sites or other ligand-binding sites in protein sequences. - + Protein binding site prediction (from sequence) true @@ -42469,15 +42885,15 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict or detect RNA and DNA-binding binding sites in protein sequences. Protein-nucleic acid binding detection Protein-nucleic acid binding prediction Protein-nucleic acid binding site detection Protein-nucleic acid binding site prediction Zinc finger prediction - - + + This includes methods that predict and optimise zinc finger protein domains for DNA/RNA binding (for example for transcription factors and nucleases). Nucleic acids-binding site prediction @@ -42488,11 +42904,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.20 - + beta12orEarlier + 1.20 + Predict protein sites that are key to protein folding, such as possible sites of nucleation or stabilisation. - + Protein folding site prediction true @@ -42510,10 +42926,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Detect or predict cleavage sites (enzymatic or chemical) in protein sequences. - - + + Protein cleavage site prediction @@ -42523,11 +42939,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Predict epitopes that bind to MHC class I molecules. - + Epitope mapping (MHC Class I) true @@ -42539,11 +42955,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Predict epitopes that bind to MHC class II molecules. 
- + Epitope mapping (MHC Class II) true @@ -42555,11 +42971,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Detect, predict and identify whole gene structure in DNA sequences. This includes protein coding regions, exon-intron structure, regulatory regions etc. - + Whole gene prediction true @@ -42571,11 +42987,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Detect, predict and identify genetic elements such as promoters, coding regions, splice sites, etc in DNA sequences. - + Methods for gene prediction might be ab initio, based on phylogenetic comparisons, use motifs, sequence features, support vector machine, alignment etc. Gene component prediction @@ -42588,10 +43004,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Detect or predict transposons, retrotransposons / retrotransposition signatures etc. - - + + Transposon prediction @@ -42601,15 +43017,15 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Detect polyA signals in nucleotide sequences. PolyA detection PolyA prediction PolyA signal prediction Polyadenylation signal detection Polyadenylation signal prediction - - + + PolyA signal detection @@ -42625,11 +43041,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Detect quadruplex-forming motifs in nucleotide sequences. Quadruplex structure prediction - - + + Quadruplex (4-stranded) structures are formed by guanine-rich regions and are implicated in various important biological processes and as therapeutic targets. 
Quadruplex formation site detection @@ -42646,12 +43062,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Find CpG rich regions in a nucleotide sequence or isochores in genome sequences. CpG island and isochores detection CpG island and isochores rendering - - + + An isochore is long region (> 3 KB) of DNA with very uniform GC content, in contrast to the rest of the genome. Isochores tend tends to have more genes, higher local melting or denaturation temperatures, and different flexibility. Methods might calculate fractional GC content or variation of GC content, predict methylation status of CpG islands etc. This includes methods that visualise CpG rich regions in a nucleotide sequence, for example plot isochores in a genome sequence. CpG island and isochore detection @@ -42668,12 +43084,17 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + + + + + + + beta12orEarlier Find and identify restriction enzyme cleavage sites (restriction sites) in (typically) DNA sequences, for example to generate a restriction map. - - + + Restriction site recognition - @@ -42682,12 +43103,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict nucleosome exclusion sequences (nucleosome free regions) in DNA. Nucleosome exclusion sequence prediction Nucleosome formation sequence prediction - - + + Nucleosome position prediction @@ -42703,11 +43124,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify, predict or analyse splice sites in nucleotide sequences. Splice prediction - - + + Methods might require a pre-mRNA or genomic DNA sequence. 
Splice site prediction @@ -42718,11 +43139,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Predict whole gene structure using a combination of multiple methods to achieve better predictions. - + Integrated gene prediction true @@ -42734,10 +43155,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Find operons (operators, promoters and genes) in bacteria genes. - - + + Operon prediction @@ -42747,12 +43168,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict protein-coding regions (CDS or exon) or open reading frames in nucleotide sequences. ORF finding ORF prediction - - + + Coding region prediction @@ -42768,11 +43189,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict selenocysteine insertion sequence (SECIS) in a DNA sequence. Selenocysteine insertion sequence (SECIS) prediction - - + + SECIS elements are around 60 nucleotides in length with a stem-loop structure directs the cell to translate UGA codons as selenocysteines. SECIS element prediction @@ -42789,14 +43210,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict transcriptional regulatory motifs, patterns, elements or regions in DNA sequences. Regulatory element prediction Transcription regulatory element prediction Conserved transcription regulatory sequence identification Translational regulatory element prediction - - + + This includes comparative genomics approaches that identify common, conserved (homologous) or synonymous transcriptional regulatory elements. For example cross-species comparison of transcription factor binding sites (TFBS). 
Methods might analyse co-regulated or co-expressed genes, or sets of oppositely expressed genes. This includes promoters, enhancers, silencers and boundary elements / insulators, regulatory protein or transcription factor binding sites etc. Methods might be specific to a particular genome and use motifs, word-based / grammatical methods, position-specific frequency matrices, discriminative pattern analysis etc. Transcriptional regulatory element prediction @@ -42814,10 +43235,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict translation initiation sites, possibly by searching a database of sites. - - + + Translation initiation site prediction @@ -42827,10 +43248,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict whole promoters or promoter elements (transcription start sites, RNA polymerase binding site, transcription factor binding sites, promoter enhancers etc) in DNA sequences. - - + + Methods might recognize CG content, CpG islands, splice sites, polyA signals etc. Promoter prediction @@ -42841,12 +43262,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify, predict or analyse cis-regulatory elements in DNA sequences (TATA box, Pribnow box, SOS box, CAAT box, CCAAT box, operator etc.) or in RNA sequences (e.g. riboswitches). Transcriptional regulatory element prediction (DNA-cis) Transcriptional regulatory element prediction (RNA-cis) - - + + Cis-regulatory elements (cis-elements) regulate the expression of genes located on the same strand from which the element was transcribed. Cis-elements are found in the 5' promoter region of the gene, in an intron, or in the 3' untranslated region. Cis-elements are often binding sites of one or more trans-acting factors. They also occur in RNA sequences, e.g. 
a riboswitch is a region of an mRNA molecule that bind a small target molecule that regulates the gene's activity. cis-regulatory element prediction @@ -42857,11 +43278,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Identify, predict or analyse cis-regulatory elements (for example riboswitches) in RNA sequences. - + Transcriptional regulatory element prediction (RNA-cis) true @@ -42879,12 +43300,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict functional RNA sequences with a gene regulatory role (trans-regulatory elements) or targets. Functional RNA identification Transcriptional regulatory element prediction (trans) - - + + Trans-regulatory elements regulate genes distant from the gene from which they were transcribed. trans-regulatory element prediction @@ -42895,12 +43316,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify matrix/scaffold attachment regions (MARs/SARs) in DNA sequences. MAR/SAR prediction Matrix/scaffold attachment site prediction - - + + MAR/SAR sites often flank a gene or gene cluster and are found nearby cis-regulatory sequences. They might contribute to transcription regulation. S/MAR prediction @@ -42911,10 +43332,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict transcription factor binding sites in DNA sequences. - - + + Transcription factor binding site prediction @@ -42930,10 +43351,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict exonic splicing enhancers (ESE) in exons. 
- - + + An exonic splicing enhancer (ESE) is 6-base DNA sequence motif in an exon that enhances or directs splicing of pre-mRNA or hetero-nuclear RNA (hnRNA) into mRNA. Exonic splicing enhancer prediction @@ -42945,11 +43366,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Evaluate molecular sequence alignment accuracy. Sequence alignment quality evaluation - - + + Evaluation might be purely sequence-based or use structural information. Sequence alignment validation @@ -42960,11 +43381,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse character conservation in a molecular sequence alignment, for example to derive a consensus sequence. Residue conservation analysis - - + + Use this concept for methods that calculate substitution rates, estimate relative site variability, identify sites with biased properties, derive a consensus sequence, or identify highly conserved or very poorly conserved sites, regions, blocks etc. Sequence alignment analysis (conservation) @@ -42976,10 +43397,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse correlations between sites in a molecular sequence alignment. - - + + This is typically done to identify possible covarying positions and predict contacts or structural constraints in protein structures. Sequence alignment analysis (site correlation) @@ -42990,11 +43411,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Detects chimeric sequences (chimeras) from a sequence alignment. Chimeric sequence detection - - + + A chimera includes regions from two or more phylogenetically distinct sequences. 
They are usually artifacts of PCR and are thought to occur when a prematurely terminated amplicon reanneals to another DNA strand and is subsequently copied to completion in later PCR cycles. Chimera detection @@ -43005,11 +43426,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Detect recombination (hotspots and coldspots) and identify recombination breakpoints in a sequence alignment. Sequence alignment analysis (recombination detection) - - + + Tools might use a genetic algorithm, quartet-mapping, bootscanning, graphical methods, random forest model and so on. Recombination detection @@ -43020,12 +43441,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify insertion, deletion and duplication events from a sequence alignment. Indel discovery Sequence alignment analysis (indel detection) - - + + Tools might use a genetic algorithm, quartet-mapping, bootscanning, graphical methods, random forest model and so on. Indel detection @@ -43036,12 +43457,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Predict nucleosome formation potential of DNA sequences. - + Nucleosome formation potential prediction true @@ -43058,10 +43479,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate a thermodynamic property of DNA or DNA/RNA, such as melting temperature, enthalpy and entropy. - - + + Nucleic acid thermodynamic property calculation @@ -43078,10 +43499,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate and plot a DNA or DNA/RNA melting profile. - - + + A melting profile is used to visualise and analyse partly melted DNA conformations. 
Nucleic acid melting profile plotting @@ -43092,10 +43513,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate and plot a DNA or DNA/RNA stitch profile. - - + + A stitch profile represents the alternative conformations that partly melted DNA can adopt in a temperature range. Nucleic acid stitch profile plotting @@ -43106,10 +43527,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate and plot a DNA or DNA/RNA melting curve. - - + + Nucleic acid melting curve plotting @@ -43119,10 +43540,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate and plot a DNA or DNA/RNA probability profile. - - + + Nucleic acid probability profile plotting @@ -43132,10 +43553,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate and plot a DNA or DNA/RNA temperature profile. - - + + Nucleic acid temperature profile plotting @@ -43151,10 +43572,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate curvature and flexibility / stiffness of a nucleotide sequence. - - + + This includes properties such as. Nucleic acid curvature calculation @@ -43165,13 +43586,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict microRNA sequences (miRNA) and precursors or microRNA targets / binding sites in a DNA sequence. miRNA prediction microRNA detection microRNA target detection - - + + miRNA target prediction @@ -43187,10 +43608,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict tRNA genes in genomic sequences (tRNA). 
- - + + tRNA gene prediction @@ -43206,10 +43627,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Assess binding specificity of putative siRNA sequence(s), for example for a functional assay, typically with respect to designing specific siRNA sequences. - - + + siRNA binding specificity prediction @@ -43219,11 +43640,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - + beta12orEarlier + 1.18 + Predict secondary structure of protein sequence(s) using multiple methods to achieve better predictions. - + Protein secondary structure prediction (integrated) true @@ -43235,10 +43656,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict helical secondary structure of protein sequences. - - + + Protein secondary structure prediction (helices) @@ -43248,10 +43669,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict turn structure (for example beta hairpin turns) of protein sequences. - - + + Protein secondary structure prediction (turns) @@ -43261,10 +43682,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict open coils, non-regular secondary structure and intrinsically disordered / unstructured regions of protein sequences. - - + + Protein secondary structure prediction (coils) @@ -43274,10 +43695,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict cysteine bonding state and disulfide bond partners in protein sequences. - - + + Disulfide bond prediction @@ -43287,12 +43708,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - Not sustainable to have protein type-specific concepts. 
- 1.19 - + beta12orEarlier + Not sustainable to have protein type-specific concepts. + 1.19 + Predict G protein-coupled receptors (GPCR). - + GPCR prediction true @@ -43304,12 +43725,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - Not sustainable to have protein type-specific concepts. - 1.19 - + beta12orEarlier + Not sustainable to have protein type-specific concepts. + 1.19 + Analyse G-protein coupled receptor proteins (GPCRs). - + GPCR analysis true @@ -43329,11 +43750,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict tertiary structure (backbone and side-chain conformation) of protein sequences. Protein folding pathway prediction - - + + This includes methods that predict the folding pathway(s) or non-native structural intermediates of a protein. Protein structure prediction @@ -43352,10 +43773,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict structure of DNA or RNA. - - + + Methods might identify thermodynamically stable or evolutionarily conserved structures. Nucleic acid structure prediction @@ -43367,11 +43788,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict tertiary structure of protein sequence(s) without homologs of known structure. de novo structure prediction - - + + Ab initio structure prediction @@ -43389,14 +43810,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Build a three-dimensional protein model based on known (for example homologs) structures. Comparative modelling Homology modelling Homology structure modelling Protein structure comparative modelling - - + + The model might be of a whole, part or aspect of protein structure. 
Molecular modelling methods might use sequence-structure alignment, structural templates, molecular dynamics, energy minimisation etc. Protein modelling @@ -43421,12 +43842,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Model the structure of a protein in complex with a small molecule or another macromolecule. Docking simulation Macromolecular docking - - + + This includes protein-protein interactions, protein-nucleic acid, protein-ligand binding etc. Methods might predict whether the molecules are likely to bind in vivo, their conformation when bound, the strength of the interaction, possible mutations to achieve bonding and so on. Molecular docking @@ -43439,15 +43860,15 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Model protein backbone conformation. Protein modelling (backbone) Design optimization Epitope grafting Scaffold search Scaffold selection - - + + Methods might require a preliminary C(alpha) trace. Scaffold selection, scaffold search, epitope grafting and design optimization are stages of backbone modelling done during rational vaccine design. Backbone modelling @@ -43459,15 +43880,15 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Model, analyse or edit amino acid side chain conformation in protein structure, optimize side-chain packing, hydrogen bonding etc. Protein modelling (side chains) Antibody optimisation Antigen optimisation Antigen resurfacing Rotamer likelihood prediction - - + + Antibody optimisation is to optimize the antibody-interacting surface of the antigen (epitope). Antigen optimisation is to optimize the antigen-interacting surface of the antibody (paratope). Antigen resurfacing is to resurface the antigen by varying the sequence of non-epitope regions. Methods might use a residue rotamer library. 
This includes rotamer likelihood prediction: the prediction of rotamer likelihoods for all 20 amino acid types at each position in a protein structure, where output typically includes, for each residue position, the likelihoods for the 20 amino acid types with estimated reliability of the 20 likelihoods. @@ -43480,12 +43901,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Model loop conformation in protein structures. Protein loop modelling Protein modelling (loops) - - + + Loop modelling @@ -43508,12 +43929,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Model protein-ligand (for example protein-peptide) binding using comparative modelling or other techniques. Ligand-binding simulation Protein-peptide docking - - + + Methods aim to predict the position and orientation of a ligand bound to a protein receptor or enzyme. Virtual screening is used in drug discovery to search libraries of small molecules in order to identify those molecules which are most likely to bind to a drug target (typically a protein receptor or enzyme). Protein-ligand docking @@ -43532,12 +43953,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict or optimise RNA sequences (sequence pools) with likely secondary and tertiary structure for in vitro selection. Nucleic acid folding family identification Structured RNA prediction and optimisation - - + + RNA inverse folding @@ -43547,13 +43968,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Find single nucleotide polymorphisms (SNPs) - single nucleotide change in base positions - between sequences. 
Typically done for sequences from a high-throughput sequencing experiment that differ from a reference genome and which might, especially by reference to population frequency or functional data, indicate a polymorphism. SNP calling SNP discovery Single nucleotide polymorphism detection - - + + This includes functional SNPs for large-scale genotyping purposes, disease-associated non-synonymous SNPs etc. SNP detection @@ -43570,10 +43991,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate a physical (radiation hybrid) map of genetic markers in a DNA sequence using provided radiation hybrid (RH) scores for one or more markers. - - + + Radiation Hybrid Mapping @@ -43583,12 +44004,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Map the genetic architecture of dynamic complex traits. - + This can involve characterisation of the underlying quantitative trait loci (QTLs) or nucleotides (QTNs). Functional mapping true @@ -43606,13 +44027,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Infer haplotypes, either alleles at multiple loci that are transmitted together on the same chromosome, or a set of single nucleotide polymorphisms (SNPs) on a single chromatid that are statistically associated. Haplotype inference Haplotype map generation Haplotype reconstruction - - + + Haplotype inference can help in population genetic studies and the identification of complex disease genes, , and is typically based on aligned single nucleotide polymorphism (SNP) fragments. Haplotype comparison is a useful way to characterize the genetic variation between individuals. An individual's haplotype describes which nucleotide base occurs at each position for a set of common SNPs. 
Tools might use combinatorial functions (for example parsimony) or a likelihood function or model with optimisation such as minimum error correction (MEC) model, expectation-maximisation algorithm (EM), genetic algorithm or Markov chain Monte Carlo (MCMC). Haplotype mapping @@ -43629,10 +44050,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate linkage disequilibrium; the non-random association of alleles or polymorphisms at two or more loci (not necessarily on the same chromosome). - - + + Linkage disequilibrium is identified where a combination of alleles (or genetic markers) occurs more or less frequently in a population than expected by chance formation of haplotypes. Linkage disequilibrium calculation @@ -43650,10 +44071,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict genetic code from analysis of codon usage data. - - + + Genetic code prediction @@ -43670,12 +44091,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Render a representation of a distribution that consists of group of data points plotted on a simple scale. Categorical plot plotting Dotplot plotting - - + + Dot plots are useful when having not too many (e.g. 20) data points for each category. Example: draw a dotplot of sequence similarities identified from word-matching or character comparison. Dot plot plotting @@ -43693,11 +44114,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Align exactly two molecular sequences. Pairwise alignment - - + + Methods might perform one-to-one, one-to-many or many-to-many comparisons. 
Pairwise sequence alignment @@ -43709,11 +44130,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Align more than two molecular sequences. Multiple alignment - - + + This includes methods that use an existing alignment, for example to incorporate sequences into an alignment, or combine several multiple alignments into a single, improved alignment. Multiple sequence alignment @@ -43725,13 +44146,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Locally align exactly two molecular sequences. - + Local alignment methods identify regions of local similarity. Pairwise sequence alignment generation (local) true @@ -43743,13 +44164,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Globally align exactly two molecular sequences. - + Global alignment methods identify similarity across the entire length of the sequences. Pairwise sequence alignment generation (global) true @@ -43761,13 +44182,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Locally align two or more molecular sequences. Local sequence alignment Sequence alignment (local) Smith-Waterman - - + + Local alignment methods identify regions of local similarity. Local alignment @@ -43779,12 +44200,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Globally align two or more molecular sequences. Global sequence alignment Sequence alignment (global) - - + + Global alignment methods identify similarity across the entire length of the sequences. 
Global alignment @@ -43796,11 +44217,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Align two or more molecular sequences with user-defined constraints. - + Constrained sequence alignment true @@ -43812,11 +44233,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.16 - + beta12orEarlier + 1.16 + Align two or more molecular sequences using multiple methods to achieve higher quality. - + Consensus-based sequence alignment true @@ -43834,15 +44255,15 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Align multiple sequences using relative gap costs calculated from neighbors in a supplied phylogenetic tree. Multiple sequence alignment (phylogenetic tree-based) Multiple sequence alignment construction (phylogenetic tree-based) Phylogenetic tree-based multiple sequence alignment construction Sequence alignment (phylogenetic tree-based) Sequence alignment generation (phylogenetic tree-based) - - + + This is supposed to give a more biologically meaningful alignment than standard alignments. Tree-based sequence alignment @@ -43853,12 +44274,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Align molecular secondary structure (represented as a 1D string). - + Secondary structure alignment generation true @@ -43869,11 +44290,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - + beta12orEarlier + 1.18 + Align protein secondary structures. - + Protein secondary structure alignment generation true @@ -43898,13 +44319,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Align RNA secondary structures. 
RNA secondary structure alignment construction RNA secondary structure alignment generation Secondary structure alignment (RNA) - - + + RNA secondary structure alignment @@ -43914,12 +44335,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Align (superimpose) exactly two molecular tertiary structures. Structure alignment (pairwise) Pairwise protein structure alignment - - + + Pairwise structure alignment @@ -43929,12 +44350,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Align (superimpose) more than two molecular tertiary structures. Structure alignment (multiple) Multiple protein structure alignment - - + + This includes methods that use an existing alignment. Multiple structure alignment @@ -43945,12 +44366,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Align protein tertiary structures. - + Structure alignment (protein) true @@ -43961,12 +44382,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Align RNA tertiary structures. - + Structure alignment (RNA) true @@ -43977,13 +44398,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Locally align (superimpose) exactly two molecular tertiary structures. - + Local alignment methods identify regions of local similarity, common substructures etc. Pairwise structure alignment generation (local) true @@ -43995,13 +44416,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Globally align (superimpose) exactly two molecular tertiary structures. - + Global alignment methods identify similarity across the entire structures. 
Pairwise structure alignment generation (global) true @@ -44013,12 +44434,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Locally align (superimpose) two or more molecular tertiary structures. Structure alignment (local) Local protein structure alignment - - + + Local alignment methods identify regions of local similarity, common substructures etc. Local structure alignment @@ -44029,12 +44450,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Globally align (superimpose) two or more molecular tertiary structures. Structure alignment (global) Global protein structure alignment - - + + Global alignment methods identify similarity across the entire structures. Global structure alignment @@ -44045,13 +44466,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.16 - + beta12orEarlier + 1.16 + Align exactly two molecular profiles. - + Methods might perform one-to-one, one-to-many or many-to-many comparisons. Profile-profile alignment (pairwise) true @@ -44063,13 +44484,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Align two or more molecular profiles. - + Sequence alignment generation (multiple profile) true @@ -44080,14 +44501,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.16 - + beta12orEarlier + 1.16 + Align exactly two molecular Structural (3D) profiles. - + 3D profile-to-3D profile alignment (pairwise) true @@ -44098,14 +44519,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Align two or more molecular 3D profiles. 
- + Structural profile alignment generation (multiple) true @@ -44116,12 +44537,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search and retrieve names of or documentation on bioinformatics tools, for example by keyword or which perform a particular function. - + Data retrieval (tool metadata) true @@ -44132,12 +44553,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Search and retrieve names of or documentation on bioinformatics databases or query terms, for example by keyword. - + Data retrieval (database metadata) true @@ -44148,11 +44569,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Predict primers for large scale sequencing. - + PCR primer design (for large scale sequencing) true @@ -44164,11 +44585,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Predict primers for genotyping polymorphisms, for example single nucleotide polymorphisms (SNPs). - + PCR primer design (for genotyping polymorphisms) true @@ -44180,11 +44601,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Predict primers for gene transcription profiling. - + PCR primer design (for gene transcription profiling) true @@ -44196,11 +44617,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Predict primers that are conserved across multiple genomes or species. 
- + PCR primer design (for conserved primers) true @@ -44212,11 +44633,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Predict primers based on gene structure. - + PCR primer design (based on gene structure) true @@ -44228,11 +44649,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Predict primers for methylation PCRs. - + PCR primer design (for methylation PCRs) true @@ -44244,11 +44665,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Sequence assembly by combining fragments using an existing backbone sequence, typically a reference genome. Sequence assembly (mapping assembly) - - + + The final sequence will resemble the backbone sequence. Mapping assemblers are usually much faster and less memory intensive than de-novo assemblers. Mapping assembly @@ -44260,12 +44681,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Sequence assembly by combining fragments without the aid of a reference sequence or genome. De Bruijn graph Sequence assembly (de-novo assembly) - - + + De-novo assemblers are much slower and more memory intensive than mapping assemblers. De-novo assembly @@ -44278,13 +44699,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier The process of assembling many short DNA sequences together such thay they represent the original chromosomes from which the DNA originated. Genomic assembly Sequence assembly (genome assembly) Breakend assembly - - + + Genome assembly @@ -44295,11 +44716,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Sequence assembly for EST sequences (transcribed mRNA). 
Sequence assembly (EST assembly) - - + + Assemblers must handle (or be complicated by) alternative splicing, trans-splicing, single-nucleotide polymorphism (SNP), recoding, and post-transcriptional modification. EST assembly @@ -44312,11 +44733,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Make sequence tag to gene assignments (tag mapping) of SAGE, MPSS and SBS data. Tag to gene assignment - - + + Sequence tag mapping assigns experimentally obtained sequence tags to known transcripts or annotate potential virtual sequence tags in a genome. Sequence tag mapping @@ -44327,12 +44748,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Process (read and / or write) serial analysis of gene expression (SAGE) data. - + SAGE data processing true @@ -44343,12 +44764,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Process (read and / or write) massively parallel signature sequencing (MPSS) data. - + MPSS data processing true @@ -44359,12 +44780,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Process (read and / or write) sequencing by synthesis (SBS) data. - + SBS data processing true @@ -44382,12 +44803,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate a heat map of expression data from e.g. microarray data. Heat map construction Heatmap generation - - + + The heat map usually uses a coloring scheme to represent expression values. They can show how quantitative measurements were influenced by experimental conditions. 
Heat map generation @@ -44399,12 +44820,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Analyse one or more gene expression profiles, typically to interpret them in functional terms. - + Gene expression profile analysis true @@ -44423,14 +44844,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Map an expression profile to known biological pathways, for example, to identify or reconstruct a pathway. Pathway mapping Gene expression profile pathway mapping Gene to pathway mapping Gene-to-pathway mapping - - + + Expression profile pathway mapping @@ -44440,11 +44861,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - + beta12orEarlier + 1.18 + Assign secondary structure from protein coordinate data. - + Protein secondary structure assignment (from coordinate data) true @@ -44456,11 +44877,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - + beta12orEarlier + 1.18 + Assign secondary structure from circular dichroism (CD) spectroscopic data. - + Protein secondary structure assignment (from CD data) true @@ -44472,11 +44893,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.7 - + beta12orEarlier + 1.7 + Assign a protein tertiary structure (3D coordinates) from raw X-ray crystallography data. - + Protein structure assignment (from X-ray crystallographic data) true @@ -44488,11 +44909,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.7 - + beta12orEarlier + 1.7 + Assign a protein tertiary structure (3D coordinates) from raw NMR spectroscopy data. 
- + Protein structure assignment (from NMR data) true @@ -44504,13 +44925,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Construct a phylogenetic tree from a specific type of data. Phylogenetic tree construction (data centric) Phylogenetic tree generation (data centric) - - + + Subconcepts of this concept reflect different types of data used to generate a tree, and provide an alternate axis for curation. Phylogenetic inference (data centric) @@ -44521,13 +44942,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Construct a phylogenetic tree using a specific method. Phylogenetic tree construction (method centric) Phylogenetic tree generation (method centric) - - + + Subconcepts of this concept reflect different computational methods used to generate a tree, and provide an alternate axis for curation. Phylogenetic inference (method centric) @@ -44539,12 +44960,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Phylogenetic tree construction from molecular sequences. Phylogenetic tree construction (from molecular sequences) Phylogenetic tree generation (from molecular sequences) - - + + Methods typically compare multiple molecular sequence and estimate evolutionary distances and relationships to infer gene families or make functional predictions. Phylogenetic inference (from molecular sequences) @@ -44561,12 +44982,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Phylogenetic tree construction from continuous quantitative character data. 
Phylogenetic tree construction (from continuous quantitative characters) Phylogenetic tree generation (from continuous quantitative characters) - - + + Phylogenetic inference (from continuous quantitative characters) @@ -44588,12 +45009,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Phylogenetic tree construction from gene frequency data. Phylogenetic tree construction (from gene frequencies) Phylogenetic tree generation (from gene frequencies) - - + + Phylogenetic inference (from gene frequencies) @@ -44609,12 +45030,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Phylogenetic tree construction from polymorphism data including microsatellites, RFLP (restriction fragment length polymorphisms), RAPD (random-amplified polymorphic DNA) and AFLP (amplified fragment length polymorphisms) data. Phylogenetic tree construction (from polymorphism data) Phylogenetic tree generation (from polymorphism data) - - + + Phylogenetic inference (from polymorphism data) @@ -44624,12 +45045,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Construct a phylogenetic species tree, for example, from a genome-wide sequence comparison. Phylogenetic species tree construction Phylogenetic species tree generation - - + + Species tree construction @@ -44639,12 +45060,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Construct a phylogenetic tree by computing a sequence alignment and searching for the tree with the fewest number of character-state changes from the alignment. Phylogenetic tree construction (parsimony methods) Phylogenetic tree generation (parsimony methods) - - + + This includes evolutionary parsimony (invariants) methods. 
Phylogenetic inference (parsimony methods) @@ -44655,12 +45076,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Construct a phylogenetic tree by computing (or using precomputed) distances between sequences and searching for the tree with minimal discrepancies between pairwise distances. Phylogenetic tree construction (minimum distance methods) Phylogenetic tree generation (minimum distance methods) - - + + This includes neighbor joining (NJ) clustering method. Phylogenetic inference (minimum distance methods) @@ -44671,12 +45092,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Construct a phylogenetic tree by relating sequence data to a hypothetical tree topology using a model of sequence evolution. Phylogenetic tree construction (maximum likelihood and Bayesian methods) Phylogenetic tree generation (maximum likelihood and Bayesian methods) - - + + Maximum likelihood methods search for a tree that maximizes a likelihood function, i.e. that is most likely given the data and model. Bayesian analysis estimate the probability of tree for branch lengths and topology, typically using a Monte Carlo algorithm. Phylogenetic inference (maximum likelihood and Bayesian methods) @@ -44687,12 +45108,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Construct a phylogenetic tree by computing four-taxon trees (4-trees) and searching for the phylogeny that matches most closely. Phylogenetic tree construction (quartet methods) Phylogenetic tree generation (quartet methods) - - + + Phylogenetic inference (quartet methods) @@ -44702,12 +45123,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Construct a phylogenetic tree by using artificial-intelligence methods, for example genetic algorithms. 
Phylogenetic tree construction (AI methods) Phylogenetic tree generation (AI methods) - - + + Phylogenetic inference (AI methods) @@ -44730,11 +45151,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify a plausible model of DNA substitution that explains a molecular (DNA or protein) sequence alignment. Nucleotide substitution modelling - - + + DNA substitution modelling @@ -44744,11 +45165,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse the shape (topology) of a phylogenetic tree. Phylogenetic tree analysis (shape) - - + + Phylogenetic tree topology analysis @@ -44759,10 +45180,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Apply bootstrapping or other measures to estimate confidence of a phylogenetic tree. - - + + Phylogenetic tree bootstrapping @@ -44784,11 +45205,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Construct a "gene tree" which represents the evolutionary history of the genes included in the study. This can be used to predict families of genes and gene function based on their position in a phylogenetic tree. Phylogenetic tree analysis (gene family prediction) - - + + Gene trees can provide evidence for gene duplication events, as well as speciation events. Where sequences from different homologs are included in a gene tree, subsequent clustering of the orthologs can demonstrate evolutionary history of the orthologs. 
Gene tree construction @@ -44799,11 +45220,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse a phylogenetic tree to identify allele frequency distribution and change that is subject to evolutionary pressures (natural selection, genetic drift, mutation and gene flow). Identify type of natural selection (such as stabilizing, balancing or disruptive). Phylogenetic tree analysis (natural selection) - - + + Stabilizing/purifying (directional) selection favors a single phenotype and tends to decrease genetic diversity as a population stabilizes on a particular trait, selecting out trait extremes or deleterious mutations. In contrast, balancing selection maintain genetic polymorphisms (or multiple alleles), whereas disruptive (or diversifying) selection favors individuals at both extremes of a trait. Allele frequency distribution analysis @@ -44815,12 +45236,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare two or more phylogenetic trees to produce a consensus tree. Phylogenetic tree construction (consensus) Phylogenetic tree generation (consensus) - - + + Methods typically test for topological similarity between trees using for example a congruence index. Consensus tree construction @@ -44831,13 +45252,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare two or more phylogenetic trees to detect subtrees or supertrees. Phylogenetic sub/super tree detection Subtree construction Supertree construction - - + + Phylogenetic sub/super tree construction @@ -44853,10 +45274,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare two or more phylogenetic trees to calculate distances between trees. 
- - + + Phylogenetic tree distances calculation @@ -44866,10 +45287,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Annotate a phylogenetic tree with terms from a controlled vocabulary. - - + + Phylogenetic tree annotation http://www.evolutionaryontology.org/cdao.owl#CDAOAnnotation @@ -44880,11 +45301,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Predict and optimise peptide ligands that elicit an immunological response. - + Immunogenicity prediction true @@ -44902,10 +45323,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict or optimise DNA to elicit (via DNA vaccination) an immunological response. - - + + DNA vaccine design @@ -44915,11 +45336,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Reformat (a file or other report of) molecular sequence(s). - + Sequence formatting true @@ -44931,11 +45352,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Reformat (a file or other report of) molecular sequence alignment(s). - + Sequence alignment formatting true @@ -44947,11 +45368,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Reformat a codon usage table. - + Codon usage table formatting true @@ -44976,12 +45397,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualise, format or render a molecular sequence or sequences such as a sequence alignment, possibly with sequence features or properties shown. 
Sequence rendering Sequence alignment visualisation - - + + Sequence visualisation @@ -44991,11 +45412,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.15 - + beta12orEarlier + 1.15 + Visualise, format or print a molecular sequence alignment. - + Sequence alignment visualisation true @@ -45013,11 +45434,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualise, format or render sequence clusters. Sequence cluster rendering - - + + Sequence cluster visualisation @@ -45034,11 +45455,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Render or visualise a phylogenetic tree. Phylogenetic tree rendering - - + + Phylogenetic tree visualisation @@ -45048,11 +45469,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.15 - + beta12orEarlier + 1.15 + Visualise RNA secondary structure, knots, pseudoknots etc. - + RNA secondary structure visualisation true @@ -45064,11 +45485,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.15 - + beta12orEarlier + 1.15 + Render and visualise protein secondary structure. - + Protein secondary structure visualisation true @@ -45093,13 +45514,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualise or render molecular 3D structure, for example a high-quality static picture or animation. Structure rendering Protein secondary structure visualisation RNA secondary structure visualisation - - + + This includes visualisation of protein secondary structure such as knots, pseudoknots etc. as well as tertiary and quaternary structure. 
Structure visualisation @@ -45117,13 +45538,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualise microarray or other expression data. Expression data rendering Gene expression data visualisation Microarray data rendering - - + + Expression data visualisation @@ -45133,11 +45554,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Identify and analyse networks of protein interactions. - + Protein interaction network visualisation true @@ -45155,12 +45576,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Draw or visualise a DNA map. DNA map drawing Map rendering - - + + Map drawing @@ -45170,12 +45591,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Render a sequence with motifs. - + Sequence motif rendering true @@ -45192,12 +45613,17 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + + + + + + + beta12orEarlier Draw or visualise restriction maps in DNA sequences. - - + + Restriction map drawing - @@ -45206,12 +45632,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Draw a linear maps of DNA. - + DNA linear map rendering true @@ -45222,11 +45648,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier DNA circular map rendering Draw a circular maps of DNA, for example a plasmid map. - - + + Plasmid map drawing @@ -45242,11 +45668,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualise operon structure etc. 
Operon rendering - - + + Operon drawing @@ -45256,12 +45682,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Identify folding families of related RNAs. - + Nucleic acid folding family identification true @@ -45272,11 +45698,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.20 - + beta12orEarlier + 1.20 + Compute energies of nucleic acid folding, e.g. minimum folding energies for DNA or RNA sequences or energy landscape of RNA mutants. - + Nucleic acid folding energy calculation true @@ -45288,12 +45714,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Retrieve existing annotation (or documentation), typically annotation on a database entity. - + Use this concepts for tools which retrieve pre-existing annotations, not for example prediction methods that might make annotations. Annotation retrieval true @@ -45312,12 +45738,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict the biological or biochemical role of a protein, or other aspects of a protein function. Protein function analysis Protein functional analysis - - + + For functional properties that can be mapped to a sequence, use 'Sequence feature detection (protein)' instead. Protein function prediction @@ -45336,10 +45762,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare the functional properties of two or more proteins. - - + + Protein function comparison @@ -45349,12 +45775,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Submit a molecular sequence to a database. 
- + Sequence submission true @@ -45372,14 +45798,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse a known network of gene regulation. Gene regulatory network comparison Gene regulatory network modelling Regulatory network comparison Regulatory network modelling - - + + Gene regulatory network analysis @@ -45395,14 +45821,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:UploadPDB Parse, prepare or load a user-specified data file so that it is available for use. Data loading Loading - - - Parsing + + + Data parsing @@ -45411,12 +45837,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Query a sequence data resource (typically a database) and retrieve sequences and / or annotation. - + This includes direct retrieval methods (e.g. the dbfetch program) but not those that perform calculations on the sequence. Sequence retrieval true @@ -45428,14 +45854,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + WHATIF:DownloadPDB WHATIF:EchoPDB Query a tertiary structure data resource (typically a database) and retrieve structures, structure-related data and annotation. - + This includes direct retrieval methods but not those that perform calculations on the sequence or structure. Structure retrieval true @@ -45448,11 +45874,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:GetSurfaceDots Calculate the positions of dots that are homogeneously distributed over the surface of a molecule. - - + + A dot has three coordinates (x,y,z) and (typically) a color. 
Surface rendering @@ -45463,11 +45889,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility ('accessible surface') for each atom in a structure. - + Waters are not considered. Protein atom surface calculation (accessible) @@ -45480,11 +45906,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility ('accessible molecular surface') for each atom in a structure. - + Waters are not considered. Protein atom surface calculation (accessible molecular) @@ -45497,11 +45923,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility ('accessible surface') for each residue in a structure. - + Solvent accessibility might be calculated for the backbone, sidechain and total (backbone plus sidechain). Protein residue surface calculation (accessible) @@ -45514,11 +45940,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility ('vacuum accessible surface') for each residue in a structure. This is the accessibility of the residue when taken out of the protein together with the backbone atoms of any residue it is covalently bound to. - + Solvent accessibility might be calculated for the backbone, sidechain and total (backbone plus sidechain). Protein residue surface calculation (vacuum accessible) @@ -45531,11 +45957,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility ('accessible molecular surface') for each residue in a structure. 
- + Solvent accessibility might be calculated for the backbone, sidechain and total (backbone plus sidechain). Protein residue surface calculation (accessible molecular) @@ -45548,11 +45974,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility ('vacuum molecular surface') for each residue in a structure. This is the accessibility of the residue when taken out of the protein together with the backbone atoms of any residue it is covalently bound to. - + Solvent accessibility might be calculated for the backbone, sidechain and total (backbone plus sidechain). Protein residue surface calculation (vacuum molecular) @@ -45565,11 +45991,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility ('accessible molecular surface') for a structure as a whole. - + Protein surface calculation (accessible molecular) true @@ -45581,11 +46007,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility ('accessible surface') for a structure as a whole. - + Protein surface calculation (accessible) true @@ -45597,11 +46023,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate for each residue in a protein structure all its backbone torsion angles. - + Backbone torsion angle calculation true @@ -45613,11 +46039,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate for each residue in a protein structure all its torsion angles. 
- + Full torsion angle calculation true @@ -45629,11 +46055,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate for each cysteine (bridge) all its torsion angles. - + Cysteine torsion angle calculation true @@ -45645,11 +46071,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + For each amino acid in a protein structure calculate the backbone angle tau. - + Tau is the backbone angle N-Calpha-C (angle over the C-alpha). Tau angle calculation @@ -45662,11 +46088,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:ShowCysteineBridge Detect cysteine bridges (from coordinate data) in a protein structure. - - + + Cysteine bridge detection @@ -45676,11 +46102,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:ShowCysteineFree Detect free cysteines in a protein structure. - - + + A free cysteine is neither involved in a cysteine bridge, nor functions as a ligand to a metal. Free cysteine detection @@ -45692,11 +46118,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:ShowCysteineMetal Detect cysteines that are bound to metal in a protein structure. - - + + Metal-bound cysteine detection @@ -45706,11 +46132,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate protein residue contacts with nucleic acids in a structure. 
- + Residue contact calculation (residue-nucleic acid) true @@ -45722,11 +46148,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate protein residue contacts with metal in a structure. Residue-metal contact calculation - - + + Protein-metal contact calculation @@ -45736,11 +46162,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate ion contacts in a structure (all ions for all side chain atoms). - + Residue contact calculation (residue-negative ion) true @@ -45752,11 +46178,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:ShowBumps Detect 'bumps' between residues in a structure, i.e. those with pairs of atoms whose Van der Waals' radii interpenetrate more than a defined distance. - - + + Residue bump detection @@ -45766,12 +46192,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + WHATIF:SymmetryContact Calculate the number of symmetry contacts made by residues in a protein structure. - + A symmetry contact is a contact between two atoms in different asymmetric unit. Residue symmetry contact calculation @@ -45784,11 +46210,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate contacts between residues and ligands in a protein structure. - + Residue contact calculation (residue-ligand) true @@ -45800,14 +46226,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF:HasSaltBridge WHATIF:HasSaltBridgePlus WHATIF:ShowSaltBridges WHATIF:ShowSaltBridgesH Calculate (and possibly score) salt bridges in a protein structure. 
- - + + Salt bridges are interactions between oppositely charged atoms in different residues. The output might include the inter-atomic distance. Salt bridge calculation @@ -45818,9 +46244,9 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + WHATIF:ShowLikelyRotamers WHATIF:ShowLikelyRotamers100 WHATIF:ShowLikelyRotamers200 @@ -45832,7 +46258,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern WHATIF:ShowLikelyRotamers800 WHATIF:ShowLikelyRotamers900 Predict rotamer likelihoods for all 20 amino acid types at each position in a protein structure. - + Output typically includes, for each residue position, the likelihoods for the 20 amino acid types with estimated reliability of the 20 likelihoods. Rotamer likelihood prediction @@ -45845,12 +46271,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + WHATIF:ProlineMutationValue Calculate for each position in a protein structure the chance that a proline, when introduced at this position, would increase the stability of the whole protein. - + Proline mutation value calculation true @@ -45862,11 +46288,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF: PackingQuality Identify poorly packed residues in protein structures. - - + + Residue packing validation @@ -45876,13 +46302,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF: ImproperQualityMax WHATIF: ImproperQualitySum Validate protein geometry, for example bond lengths, bond angles, torsion angles, chiralities, planaraties etc. An example is validation of a Ramachandran plot of a protein structure. 
Ramachandran plot validation - - + + Protein geometry validation @@ -45893,12 +46319,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + WHATIF: PDB_sequence Extract a molecular sequence from a PDB file. - + PDB file sequence retrieval true @@ -45910,11 +46336,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Identify HET groups in PDB files. - + A HET group usually corresponds to ligands, lipids, but might also (not consistently) include groups that are attached to amino acids. Each HET group is supposed to have a unique three letter code and a unique name which might be given in the output. HET group detection @@ -45927,12 +46353,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Determine for residue the DSSP determined secondary structure in three-state (HSC). - + DSSP secondary structure assignment true @@ -45943,12 +46369,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + WHATIF: PDBasXML Reformat (a file or other report of) tertiary structure data. - + Structure formatting true @@ -45966,10 +46392,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Assign cysteine bonding state and disulfide bond partners in protein structures. - - + + Protein cysteine and disulfide bond assignment @@ -45979,11 +46405,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Identify poor quality amino acid positions in protein structures. 
- + Residue validation true @@ -45995,13 +46421,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + WHATIF:MovedWaterPDB Query a tertiary structure database and retrieve water molecules. - + Structure retrieval (water) true @@ -46018,10 +46444,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict siRNA duplexes in RNA. - - + + siRNA duplex prediction @@ -46032,10 +46458,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Refine an existing sequence alignment. - - + + Sequence alignment refinement @@ -46045,12 +46471,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process an EMBOSS listfile (list of EMBOSS Uniform Sequence Addresses). - + Listfile processing true @@ -46062,10 +46488,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Perform basic (non-analytical) operations on a report or file of sequences (which might include features), such as file concatenation, removal or ordering of sequences, creation of subset or a new file of sequences. - - + + Sequence file editing @@ -46075,12 +46501,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Perform basic (non-analytical) operations on a sequence alignment file, such as copying or removal and ordering of sequences. - + Sequence alignment file processing true @@ -46091,12 +46517,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Process (read and / or write) physicochemical property data for small molecules. 
- + Small molecule data processing true @@ -46107,12 +46533,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Search and retrieve documentation on a bioinformatics ontology. - + Data retrieval (ontology annotation) true @@ -46123,12 +46549,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Query an ontology and retrieve concepts or relations. - + Data retrieval (ontology concept) true @@ -46139,10 +46565,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify a representative sequence from a set of sequences, typically using scores from pair-wise alignment or other comparison of the sequences. - - + + Representative sequence identification @@ -46152,12 +46578,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Perform basic (non-analytical) operations on a file of molecular tertiary structural data. - + Structure file processing true @@ -46168,12 +46594,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Query a profile data resource and retrieve one or more profile(s) and / or associated annotation. - + This includes direct retrieval methods that retrieve a profile by, e.g. the profile name. Data retrieval (sequence profile) true @@ -46185,7 +46611,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Perform a statistical data operation of some type, e.g. calibration or validation. 
Significance testing Statistical analysis @@ -46195,8 +46621,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Gibbs sampling Hypothesis testing Omnibus test - - + + Statistical calculation @@ -46219,11 +46645,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate a 3D-1D scoring matrix from analysis of protein sequence and structural data. 3D-1D scoring matrix construction - - + + A 3D-1D scoring matrix scores the probability of amino acids occurring in different structural environments. 3D-1D scoring matrix generation @@ -46241,11 +46667,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualise transmembrane proteins, typically the transmembrane regions within a sequence. Transmembrane protein rendering - - + + Transmembrane protein visualisation @@ -46255,12 +46681,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + An operation performing purely illustrative (pedagogical) purposes. - + Demonstration true @@ -46271,12 +46697,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Query a biological pathways database and retrieve annotation on one or more pathways. - + Data retrieval (pathway or network) true @@ -46287,12 +46713,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Query a database and retrieve one or more data identifiers. - + Data retrieval (identifier) true @@ -46304,10 +46730,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate a density plot (of base composition) for a nucleotide sequence. 
- - + + Nucleic acid density plotting @@ -46323,11 +46749,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse one or more known molecular sequences. Sequence analysis (general) - - + + Sequence analysis @@ -46338,11 +46764,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse molecular sequence motifs. Sequence motif processing - - + + Sequence motif analysis @@ -46352,12 +46778,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) protein interaction data. - + Protein interaction data processing true @@ -46380,11 +46806,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse protein structural data. Structure analysis (protein) - - + + Protein structure analysis @@ -46394,12 +46820,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Process (read and / or write) annotation of some type, typically annotation on an entry from a biological or biomedical database entity. - + Annotation processing true @@ -46410,12 +46836,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Analyse features in molecular sequences. - + Sequence feature analysis true @@ -46432,16 +46858,16 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Basic (non-analytical) operations of some data, either a file or equivalent entity in memory, such that the same basic type of data is consumed as input and generated as output. 
File handling File processing Report handling Utility operation Processing - - + + Data handling @@ -46451,12 +46877,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Analyse gene expression and regulation data. - + Gene expression analysis true @@ -46467,12 +46893,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) one or more structural (3D) profile(s) or template(s) of some type. - + Structural profile processing true @@ -46483,12 +46909,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) an index of (typically a file of) biological data. - + Data index processing true @@ -46499,12 +46925,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) some type of sequence profile. - + Sequence profile processing true @@ -46515,11 +46941,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.22 - + beta12orEarlier + 1.22 + Analyse protein function, typically by processing protein sequence and/or structural data, and generate an informative report. - + Protein function analysis true @@ -46544,13 +46970,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse, simulate or predict protein folding, typically by processing sequence and / or structural data. For example, predict sites of nucleation or stabilisation key to protein folding. 
Protein folding modelling Protein folding simulation Protein folding site prediction - - + + Protein folding analysis @@ -46572,11 +46998,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse protein secondary structure data. Secondary structure analysis (protein) - - + + Protein secondary structure analysis @@ -46586,12 +47012,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Process (read and / or write) data on the physicochemical property of a molecule. - + Physicochemical property data processing true @@ -46609,11 +47035,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict oligonucleotide primers or probes. Primer and probe prediction - - + + Primer and probe design @@ -46623,11 +47049,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Process (read and / or write) data of a specific type, for example applying analytical methods. - + Operation (typed) true @@ -46645,11 +47071,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Search a database (or other data resource) with a supplied query and retrieve entries (or parts of entries) that are similar to the query. Search - - + + Typically the query is compared to each entry and high scoring matches (hits) are returned. For example, a BLAST search of a sequence database. Database search @@ -46667,15 +47093,29 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - Retrieve an entry (or part of an entry) from a data resource that matches a supplied query. This might include some primary data and annotation. The query is a data identifier or other indexed term. 
For example, retrieve a sequence record with the specified accession number, or matching supplied keywords. + beta12orEarlier + Text mining + Data access + Information extraction + Information retrieval + Retrieve an entry or part of an entry from a data resource that matches a supplied query. This might include some primary data and annotation. The query is a data identifier or other indexed term. For example: retrieve a sequence record with the specified accession number or matching supplied keywords. Data extraction Retrieval Data retrieval (metadata) Metadata retrieval - - + + Data retrieval + + + + + + + + + + @@ -46684,14 +47124,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Predict, recognise, detect or identify some properties of a biomolecule. Detection Prediction Recognition - - + + Prediction and recognition @@ -46701,11 +47141,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Compare two or more things to identify similarities. - - + + Comparison @@ -46715,11 +47155,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Refine or optimise some data model. - - + + Optimisation and refinement @@ -46732,16 +47172,16 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - + - beta12orEarlier - true + beta12orEarlier + true Model or simulate some biological entity or system, typically using mathematical techniques including dynamical systems, statistical models, differential equations, and game theoretic models. Mathematical modelling - - - Modelling and simulation + + + Modelling(?) 
and simulation @@ -46750,11 +47190,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Perform basic operations on some data or a database. - + Data handling true @@ -46766,12 +47206,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Validate some data. Quality control - - + + Validation @@ -46781,12 +47221,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Map properties to positions on an biological entity (typically a molecular sequence or structure), or assemble such an entity from constituent parts. Cartography - - + + Mapping @@ -46796,11 +47236,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Design a biological entity (typically a molecular sequence or structure) with specific properties. - - + + Design @@ -46810,12 +47250,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Process (read and / or write) microarray data. - + Microarray data processing true @@ -46826,13 +47266,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - + beta12orEarlier + 1.18 + Process (read and / or write) a codon usage table. - + Codon usage table processing true @@ -46843,12 +47283,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Retrieve a codon usage table and / or associated annotation. 
- + Data retrieval (codon usage table) true @@ -46859,12 +47299,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a gene expression profile. - + Gene expression profile processing true @@ -46888,7 +47328,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Gene set testing Identify classes of genes or proteins that are over or under-represented in a large set of genes or proteins. For example analysis of a set of genes corresponding to a gene expression profile, annotated with Gene Ontology (GO) concepts, where eventual over-/under-representation of certain GO concept within the studied set of genes is revealed. Functional enrichment analysis @@ -46898,8 +47338,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern GO-term enrichment Gene Ontology concept enrichment Gene Ontology term enrichment - - + + "Gene set analysis" (often used interchangeably or in an overlapping sense with "gene-set enrichment analysis") refers to the functional analysis (term enrichment) of a differentially expressed set of genes, rather than all genes analysed. Analyse gene expression patterns to identify sets of genes that are associated with a specific trait, condition, clinical outcome etc. Gene sets can be defined beforehand by biological function, chromosome locations and so on. @@ -46917,10 +47357,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict a network of gene regulation. - - + + Gene regulatory network prediction @@ -46930,13 +47370,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Generate, analyse or handle a biological pathway or network. 
- + Pathway or network processing true @@ -46953,10 +47393,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Process (read and / or write) RNA secondary structure data. - - + + RNA secondary structure analysis @@ -46966,11 +47406,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Process (read and / or write) RNA tertiary structure data. - + Structure processing (RNA) true @@ -46988,10 +47428,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict RNA tertiary structure. - - + + RNA structure prediction @@ -47007,10 +47447,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict DNA tertiary structure. - - + + DNA structure prediction @@ -47020,11 +47460,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Generate, process or analyse phylogenetic tree or trees. - + Phylogenetic tree processing true @@ -47036,12 +47476,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) protein secondary structure data. - + Protein secondary structure processing true @@ -47052,12 +47492,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a network of protein interactions. - + Protein interaction network processing true @@ -47068,12 +47508,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) one or more molecular sequences and associated annotation. 
- + Sequence processing true @@ -47084,11 +47524,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a protein sequence and associated annotation. - + Sequence processing (protein) true @@ -47100,12 +47540,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a nucleotide sequence and associated annotation. - + Sequence processing (nucleic acid) true @@ -47129,10 +47569,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare two or more molecular sequences. - - + + Sequence comparison @@ -47142,12 +47582,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a sequence cluster. - + Sequence cluster processing true @@ -47158,12 +47598,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a sequence feature table. - + Feature table processing true @@ -47186,13 +47626,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Detect, predict and identify genes or components of genes in DNA sequences, including promoters, coding regions, splice sites, etc. Gene calling Gene finding Whole gene prediction - - + + Includes methods that predict whole gene structure using a combination of multiple methods to achieve better predictions. Methods for gene prediction might be ab initio, based on phylogenetic comparisons, use motifs, sequence features, support vector machine, alignment etc. 
Gene prediction @@ -47205,11 +47645,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.16 - + beta12orEarlier + 1.16 + Classify G-protein coupled receptors (GPCRs) into families and subfamilies. - + GPCR classification true @@ -47221,13 +47661,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - Not sustainable to have protein type-specific concepts. - 1.19 - + beta12orEarlier + Not sustainable to have protein type-specific concepts. + 1.19 + Predict G-protein coupled receptor (GPCR) coupling selectivity. - + GPCR coupling selectivity prediction true @@ -47238,11 +47678,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a protein tertiary structure. - + Structure processing (protein) true @@ -47254,11 +47694,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility for each atom in a structure. - + Waters are not considered. Protein atom surface calculation @@ -47271,11 +47711,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility for each residue in a structure. - + Protein residue surface calculation true @@ -47287,11 +47727,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate the solvent accessibility of a structure as a whole. - + Protein surface calculation true @@ -47303,12 +47743,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a molecular sequence alignment. 
- + Sequence alignment processing true @@ -47331,11 +47771,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict protein-protein binding sites. Protein-protein binding site detection - - + + Protein-protein binding site prediction @@ -47345,12 +47785,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a molecular tertiary structure. - + Structure processing true @@ -47361,12 +47801,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Annotate a DNA map of some type with terms from a controlled vocabulary. - + Map annotation true @@ -47377,12 +47817,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Retrieve information on a protein. - + Data retrieval (protein annotation) true @@ -47393,12 +47833,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Retrieve a phylogenetic tree from a data resource. - + Data retrieval (phylogenetic tree) true @@ -47409,12 +47849,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Retrieve information on a protein interaction. - + Data retrieval (protein interaction annotation) true @@ -47425,12 +47865,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Retrieve information on a protein family. 
- + Data retrieval (protein family annotation) true @@ -47441,12 +47881,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Retrieve information on an RNA family. - + Data retrieval (RNA family annotation) true @@ -47457,12 +47897,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Retrieve information on a specific gene. - + Data retrieval (gene annotation) true @@ -47473,12 +47913,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Retrieve information on a specific genotype or phenotype. - + Data retrieval (genotype and phenotype annotation) true @@ -47490,10 +47930,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare the architecture of two or more protein structures. - - + + Protein architecture comparison @@ -47505,10 +47945,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify the architecture of a protein structure. - - + + Includes methods that try to suggest the most likely biological unit for a given protein X-ray crystal structure based on crystal symmetry and scoring of putative protein-protein interfaces. Protein architecture recognition @@ -47539,12 +47979,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier The simulation of molecular (typically protein) conformation using a computational model of physical forces and computer simulation. 
Molecular dynamics simulation Protein dynamics - - + + Molecular dynamics @@ -47567,13 +48007,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse a nucleic acid sequence (using methods that are only applicable to nucleic acid sequences). Sequence analysis (nucleic acid) Nucleic acid sequence alignment analysis Sequence alignment analysis (nucleic acid) - - + + Nucleic acid sequence analysis @@ -47595,13 +48035,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse a protein sequence (using methods that are only applicable to protein sequences). Sequence analysis (protein) Protein sequence alignment analysis Sequence alignment analysis (protein) - - + + Protein sequence analysis @@ -47617,10 +48057,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse known molecular tertiary structures. - - + + Structure analysis @@ -47642,10 +48082,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse nucleic acid tertiary structural data. - - + + Nucleic acid structure analysis @@ -47655,12 +48095,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a molecular secondary structure. - + Secondary structure processing true @@ -47678,10 +48118,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare two or more molecular tertiary structures. - - + + Structure comparison @@ -47697,11 +48137,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Render a helical wheel representation of protein secondary structure. 
Helical wheel rendering - - + + Helical wheel drawing @@ -47718,11 +48158,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Render a topology diagram of protein secondary structure. Topology diagram rendering - - + + Topology diagram drawing @@ -47741,11 +48181,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare protein tertiary structures. Structure comparison (protein) - - + + Methods might identify structural neighbors, find structural similarities or define a structural core. Protein structure comparison @@ -47758,13 +48198,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare protein secondary structures. Protein secondary structure Secondary structure comparison (protein) Protein secondary structure alignment - - + + Protein secondary structure comparison @@ -47780,13 +48220,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict the subcellular localisation of a protein sequence. Protein cellular localization prediction Protein subcellular localisation prediction Protein targeting prediction - - + + The prediction might include subcellular localisation (nuclear, cytoplasmic, mitochondrial, chloroplast, plastid, membrane etc) or export (extracellular proteins) of a protein. Subcellular localisation prediction @@ -47798,11 +48238,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Calculate contacts between residues in a protein structure. 
- + Residue contact calculation (residue-residue) true @@ -47814,11 +48254,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Identify potential hydrogen bonds between amino acid residues. - + Hydrogen bond calculation (inter-residue) true @@ -47842,13 +48282,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict the interactions of proteins with other proteins. Protein-protein interaction detection Protein-protein binding prediction Protein-protein interaction prediction - - + + Protein interaction prediction @@ -47858,12 +48298,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Process (read and / or write) codon usage data. - + Codon usage data processing true @@ -47880,7 +48320,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Process (read and/or write) expression data from experiments measuring molecules (e.g. omics data), including analysis of one or more expression profiles, typically to interpret them in functional terms. Expression data analysis Gene expression analysis @@ -47889,8 +48329,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Metagenomic inference Microarray data analysis Protein expression analysis - - + + Metagenomic inference is the profiling of phylogenetic marker genes in order to predict metagenome function. Expression analysis @@ -47903,11 +48343,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a network of gene regulation. 
- + Gene regulatory network processing true @@ -47919,14 +48359,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - Notions of pathway and network were mixed up, EDAM 1.24 disentangles them. - 1.24 - + beta12orEarlier + Notions of pathway and network were mixed up, EDAM 1.24 disentangles them. + 1.24 + Generate, process or analyse a biological pathway or network. - + Pathway or network analysis true @@ -47937,12 +48377,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Analyse SAGE, MPSS or SBS experimental data, typically to identify or quantify mRNA transcripts. - + Sequencing-based expression profile data analysis true @@ -47961,11 +48401,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Predict, analyse, characterize or model splice sites, splicing events and so on, typically by comparing multiple nucleic acid sequences. Splicing model analysis - - + + Splicing analysis @@ -47975,12 +48415,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Analyse raw microarray data. - + Microarray raw data analysis true @@ -47991,13 +48431,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Process (read and / or write) nucleic acid sequence or structural data. - + Nucleic acid analysis true @@ -48008,13 +48448,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Process (read and / or write) protein sequence or structural data. 
- + Protein analysis true @@ -48025,11 +48465,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Process (read and / or write) molecular sequence data. - + Sequence data processing true @@ -48041,12 +48481,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Process (read and / or write) molecular structural data. - + Structural data processing true @@ -48057,13 +48497,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) text. - + Text processing true @@ -48074,13 +48514,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - - - + beta12orEarlier + 1.18 + + + Analyse a protein sequence alignment, typically to detect features or make predictions. - + Protein sequence alignment analysis true @@ -48092,13 +48532,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - - - + beta12orEarlier + 1.18 + + + Analyse a protein sequence alignment, typically to detect features or make predictions. - + Nucleic acid sequence alignment analysis true @@ -48110,13 +48550,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - - - + beta12orEarlier + 1.18 + + + Compare two or more nucleic acid sequences. - + Nucleic acid sequence comparison true @@ -48128,11 +48568,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - + beta12orEarlier + 1.18 + Compare two or more protein sequences. 
- + Protein sequence comparison true @@ -48150,10 +48590,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Back-translate a protein sequence into DNA. - - + + DNA back-translation @@ -48163,11 +48603,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Edit or change a nucleic acid sequence, either randomly or specifically. - + Sequence editing (nucleic acid) true @@ -48179,11 +48619,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Edit or change a protein sequence, either randomly or specifically. - + Sequence editing (protein) true @@ -48195,11 +48635,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.22 - + beta12orEarlier + 1.22 + Generate a nucleic acid sequence by some means. - + Sequence generation (nucleic acid) true @@ -48211,11 +48651,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.22 - + beta12orEarlier + 1.22 + Generate a protein sequence by some means. - + Sequence generation (protein) true @@ -48227,11 +48667,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Visualise, format or render a nucleic acid sequence. - + Various nucleic acid sequence analysis methods might generate a sequence rendering but are not (for brevity) listed under here. Nucleic acid sequence visualisation @@ -48244,11 +48684,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Visualise, format or render a protein sequence. - + Various protein sequence analysis methods might generate a sequence rendering but are not (for brevity) listed under here. 
Protein sequence visualisation @@ -48263,11 +48703,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare nucleic acid tertiary structures. Structure comparison (nucleic acid) - - + + Nucleic acid structure comparison @@ -48277,11 +48717,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) nucleic acid tertiary structure data. - + Structure processing (nucleic acid) @@ -48312,10 +48752,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate a map of a DNA sequence annotated with positional or non-positional features of some type. - - + + DNA mapping @@ -48326,12 +48766,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a DNA map of some type. - + Map data processing true @@ -48354,10 +48794,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse the hydrophobic, hydrophilic or charge properties of a protein (from analysis of sequence or structural information). - - + + Protein hydropathy calculation @@ -48374,12 +48814,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Identify or predict catalytic residues, active sites or other ligand-binding sites in protein sequences or structures. Protein binding site detection Protein binding site prediction - - + + Binding site prediction @@ -48390,11 +48830,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Build clusters of similar structures, typically using scores from structural alignment methods. 
Structural clustering - - + + Structure clustering @@ -48410,11 +48850,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate a physical DNA map (sequence map) from analysis of sequence tagged sites (STS). Sequence mapping - - + + An STS is a short subsequence of known sequence and location that occurs only once in the chromosome or genome that is being mapped. Sources of STSs include 1. expressed sequence tags (ESTs), simple sequence length polymorphisms (SSLPs), and random genomic sequences from cloned genomic DNA or database sequences. Sequence tagged site (STS) mapping @@ -48426,13 +48866,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Compare two or more entities, typically the sequence or structure (or derivatives) of macromolecules, to identify equivalent subunits. Alignment construction Alignment generation - - + + Alignment @@ -48444,13 +48884,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate the molecular weight of a protein (or fragments) and compare it to another protein or reference data. Generally used for protein identification. PMF Peptide mass fingerprinting Protein fingerprinting - - + + Protein fragment weight comparison @@ -48466,10 +48906,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare the physicochemical properties of two or more proteins (or reference data). - - + + Protein property comparison @@ -48479,13 +48919,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - + beta12orEarlier + 1.18 + Compare two or more molecular secondary structures. 
- + Secondary structure comparison true @@ -48496,11 +48936,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + Generate a Hopp and Woods plot of antigenicity of a protein. - + Hopp and Woods plotting true @@ -48512,11 +48952,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Generate a view of clustered quantitative data, annotated with textual information. - + Cluster textual view generation true @@ -48528,7 +48968,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualise clustered quantitative data as set of different profiles, where each profile is plotted versus different entities or samples on the X-axis. Clustered quantitative data plotting Clustered quantitative data rendering @@ -48536,8 +48976,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Microarray cluster temporal graph rendering Microarray wave graph plotting Microarray wave graph rendering - - + + In the case of microarray data, visualise clustered gene expression data as a set of profiles, where each profile shows the gene expression values of a cluster across samples on the X-axis. Clustering profile plotting @@ -48548,11 +48988,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Generate a dendrograph of raw, preprocessed or clustered expression (e.g. microarray) data. - + Dendrograph plotting true @@ -48564,7 +49004,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate a plot of distances (distance or correlation matrix) between expression values. 
Distance map rendering Distance matrix plotting @@ -48575,8 +49015,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Microarray distance map rendering Microarray proximity map plotting Microarray proximity map rendering - - + + Proximity map plotting @@ -48586,7 +49026,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualise clustered expression data using a tree diagram. Dendrogram plotting Dendrograph plotting @@ -48597,8 +49037,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Microarray checks view rendering Microarray matrix tree plot rendering Microarray tree or dendrogram rendering - - + + Dendrogram visualisation @@ -48609,7 +49049,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualize the results of a principal component analysis (orthogonal data transformation). For example, visualization of the principal components (essential subspace) coming from a Principal Component Analysis (PCA) on the trajectory atomistic coordinates of a molecular structure. PCA plotting Principal component plotting @@ -48619,8 +49059,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Microarray principal component rendering PCA visualization Principal modes visualization - - + + Examples for visualization are the distribution of variance over the components, loading and score plots. The use of Principal Component Analysis (PCA), a multivariate statistical analysis to obtain collective variables on the atomic positional fluctuations, helps to separate the configurational space in two subspaces: an essential subspace containing relevant motions, and another one containing irrelevant local fluctuations. 
Principal component visualisation @@ -48632,13 +49072,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Render a graph in which the values of two variables are plotted along two axes; the pattern of the points reveals any correlation. Scatter chart plotting Microarray scatter plot plotting Microarray scatter plot rendering - - + + Comparison of two sets of quantitative data such as two samples of gene expression values. Scatter plot plotting @@ -48649,11 +49089,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.18 - + beta12orEarlier + 1.18 + Visualise gene expression data where each band (or line graph) corresponds to a sample. - + Whole microarray graph plotting true @@ -48665,13 +49105,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Visualise gene expression data after hierarchical clustering for representing hierarchical relationships. Expression data tree-map rendering Treemapping Microarray tree-map rendering - - + + Treemap visualisation @@ -48682,12 +49122,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate a box plot, i.e. a depiction of groups of numerical data through their quartiles. Box plot plotting Microarray Box-Whisker plot plotting - - + + In the case of micorarray data, visualise raw and pre-processed gene expression data, via a plot showing over- and under-expression along with mean, upper and lower quartiles. 
Box-Whisker plot plotting @@ -48711,11 +49151,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Generate a physical (sequence) map of a DNA sequence showing the physical distance (base pairs) between features or landmarks such as restriction sites, cloned DNA fragments, genes and other genetic markers. Physical cartography - - + + Physical mapping @@ -48726,11 +49166,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Apply analytical methods to existing data of a specific type. - - + + This excludes non-analytical methods that read and write the same basic type of data (for that, see 'Data handling'). Data analysis @@ -48741,12 +49181,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Process or analyse an alignment of molecular sequences or structures. - + Alignment analysis true @@ -48757,13 +49197,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.16 - + beta12orEarlier + 1.16 + Analyse a body of scientific text (typically a full text article from a scientific journal.) - + Article analysis true @@ -48774,12 +49214,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Analyse the interactions of two or more molecules (or parts of molecules) that are known to interact. - + Molecular interaction analysis true @@ -48802,13 +49242,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Analyse the interactions of proteins with other proteins. 
Protein interaction analysis Protein interaction raw data analysis Protein interaction simulation - - + + Includes analysis of raw experimental protein-protein interaction data from for example yeast two-hybrid analysis, protein microarrays, immunoaffinity chromatography followed by mass spectrometry, phage display etc. Protein-protein interaction analysis @@ -48819,7 +49259,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier WHATIF: HETGroupNames WHATIF:HasMetalContacts WHATIF:HasMetalContactsPlus @@ -48837,8 +49277,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Residue contact calculation (residue-negative ion) Residue contact calculation (residue-nucleic acid) WHATIF:SymmetryContact - - + + This includes identifying HET groups, which usually correspond to ligands, lipids, but might also (not consistently) include groups that are attached to amino acids. Each HET group is supposed to have a unique three letter code and a unique name which might be given in the output. It can also include calculation of symmetry contacts, i.e. a contact between two atoms in different asymmetric unit. Residue distance calculation @@ -48849,13 +49289,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) an alignment of two or more molecular sequences, structures or derived data. - + Alignment processing true @@ -48866,12 +49306,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.6 - + beta12orEarlier + 1.6 + Process (read and / or write) a molecular tertiary (3D) structure alignment. - + Structure alignment processing true @@ -48888,11 +49328,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate codon usage bias, e.g. 
generate a codon usage bias plot. Codon usage bias plotting - - + + Codon usage bias calculation @@ -48902,11 +49342,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.22 - + beta12orEarlier + 1.22 + Generate a codon usage bias plot. - + Codon usage bias plotting true @@ -48924,10 +49364,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Calculate the differences in codon usage fractions between two sequences, sets of sequences, codon usage tables etc. - - + + Codon usage fraction calculation @@ -48937,11 +49377,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - true + beta12orEarlier + true Assign molecular sequences, structures or other biological data to a specific group or category according to qualities it shares with that group or category. - - + + Classification @@ -48951,12 +49391,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Process (read and / or write) molecular interaction data. - + Molecular interaction data processing true @@ -48968,10 +49408,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Assign molecular sequence(s) to a group or category. - - + + Sequence classification @@ -48982,10 +49422,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Assign molecular structure(s) to a group or category. - - + + Structure classification @@ -48995,10 +49435,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare two or more proteins (or some aspect) to identify similarities. 
- - + + Protein comparison @@ -49008,10 +49448,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier + beta12orEarlier Compare two or more nucleic acids to identify similarities. - - + + Nucleic acid comparison @@ -49021,11 +49461,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Predict, recognise, detect or identify some properties of proteins. - + Prediction and recognition (protein) true @@ -49037,11 +49477,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta12orEarlier - 1.19 - + beta12orEarlier + 1.19 + Predict, recognise, detect or identify some properties of nucleic acids. - + Prediction and recognition (nucleic acid) true @@ -49059,10 +49499,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 + beta13 Edit, convert or otherwise change a molecular tertiary structure, either randomly or specifically. - - + + Structure editing @@ -49072,10 +49512,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 + beta13 Edit, convert or otherwise change a molecular sequence alignment, either randomly or specifically. - - + + Sequence alignment editing @@ -49085,14 +49525,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 - Notions of pathway and network were mixed up, EDAM 1.24 disentangles them. - 1.24 - + beta13 + Notions of pathway and network were mixed up, EDAM 1.24 disentangles them. + 1.24 + Render (visualise) a biological pathway or network. - + Pathway or network visualisation true @@ -49103,12 +49543,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 - 1.6 - + beta13 + 1.6 + Predict general (non-positional) functional properties of a protein from analysing its sequence. 
- + For functional properties that are positional, use 'Protein site detection' instead. Protein function prediction (from sequence) true @@ -49120,14 +49560,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 - (jison)This is a distinction made on basis of input; all features exist can be mapped to a sequence so this isn't needed (consolidate with "Protein feature detection"). - 1.17 - - - + beta13 + (jison)This is a distinction made on basis of input; all features exist can be mapped to a sequence so this isn't needed (consolidate with "Protein feature detection"). + 1.17 + + + Predict, recognise and identify functional or other key sites within protein sequences, typically by scanning for known motifs, patterns and regular expressions. - + Protein sequence feature detection true @@ -49139,12 +49579,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 - 1.18 - - + beta13 + 1.18 + + Calculate (or predict) physical or chemical properties of a protein, including any non-positional properties of the molecular sequence, from processing a protein sequence. - + Protein property calculation (from sequence) true @@ -49156,12 +49596,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 - 1.6 - + beta13 + 1.6 + Predict, recognise and identify positional features in proteins from analysing protein structure. - + Protein feature prediction (from structure) true @@ -49191,7 +49631,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 + beta13 Predict, recognise and identify positional features in proteins from analysing protein sequences or structures. 
Protein feature prediction Protein feature recognition @@ -49201,8 +49641,8 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Protein site recognition Sequence feature detection (protein) Sequence profile database search - - + + Features includes functional sites or regions, secondary structure, structural domains and so on. Methods might use fingerprints, motifs, profiles, hidden Markov models, sequence alignment etc to provide a mapping of a query protein sequence to a discriminatory element. This includes methods that search a secondary protein database (Prosite, Blocks, ProDom, Prints, Pfam etc.) to assign a protein sequence(s) to a known protein family or group. Protein feature detection @@ -49213,12 +49653,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 - 1.6 - + beta13 + 1.6 + Screen a molecular sequence(s) against a database (of some type) to identify similarities between the sequence and database entries. - + Database search (by sequence) true @@ -49230,10 +49670,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 + beta13 Predict a network of protein interactions. - - + + Protein interaction network prediction @@ -49244,11 +49684,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 + beta13 Design (or predict) nucleic acid sequences with specific chemical or physical properties. Gene design - - + + Nucleic acid design @@ -49259,11 +49699,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - beta13 + beta13 Edit a data entity, either randomly or specifically. - - - Editing + + + Data editing @@ -49291,14 +49731,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 + 1.1 Evaluate a DNA sequence assembly, typically for purposes of quality control. 
Assembly QC Assembly quality evaluation Sequence assembly QC Sequence assembly quality evaluation - - + + Sequence assembly validation @@ -49309,12 +49749,12 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 + 1.1 Align two or more (typically huge) molecular sequences that represent genomes. Genome alignment construction Whole genome alignment - - + + Genome alignment @@ -49324,10 +49764,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 + 1.1 Reconstruction of a sequence assembly in a localised area. - - + + Localised reassembly @@ -49337,13 +49777,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 + 1.1 Render and visualise a DNA sequence assembly. Assembly rendering Assembly visualisation Sequence assembly rendering - - + + Sequence assembly visualisation @@ -49359,13 +49799,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 + 1.1 Identify base (nucleobase) sequence from a fluorescence 'trace' data generated by an automated DNA sequencer. Base calling Phred base calling Phred base-calling - - + + Base-calling @@ -49376,13 +49816,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 + 1.1 The mapping of methylation sites in a DNA (genome) sequence. Typically, the mapping of high-throughput bisulfite reads to the reference genome. Bisulfite read mapping Bisulfite sequence alignment Bisulfite sequence mapping - - + + Bisulfite mapping follows high-throughput sequencing of DNA which has undergone bisulfite treatment followed by PCR amplification; unmethylated cytosines are specifically converted to thymine, allowing the methylation status of cytosine in the DNA to be detected. 
Bisulfite mapping @@ -49393,10 +49833,10 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 + 1.1 Identify and filter a (typically large) sequence data set to remove sequences from contaminants in the sample that was sequenced. - - + + Sequence contamination filtering @@ -49406,11 +49846,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 - 1.12 - + 1.1 + 1.12 + Trim sequences (typically from an automated DNA sequencer) to remove misleading ends. - + For example trim polyA tails, introns and primer sequence flanking the sequence of amplified exons, or other unwanted sequence. Trim ends @@ -49423,11 +49863,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 - 1.12 - + 1.1 + 1.12 + Trim sequences (typically from an automated DNA sequencer) to remove sequence-specific end regions, typically contamination from vector sequences. - + Trim vector true @@ -49439,11 +49879,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 - 1.12 - + 1.1 + 1.12 + Trim sequences (typically from an automated DNA sequencer) to remove the sequence ends that extend beyond an assembled reference sequence. - + Trim to reference true @@ -49455,15 +49895,15 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - 1.1 + 1.1 Cut (remove) the end from a molecular sequence. Trimming Barcode sequence removal Trim ends Trim to reference Trim vector - - + + This includes end trimming @@ -49485,10 +49925,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Compare the features of two genome sequences. - - + + Genomic elements that might be compared include genes, indels, single nucleotide polymorphisms (SNPs), retrotransposons, tandem repeats and so on. 
Genome feature comparison @@ -49505,12 +49945,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Detect errors in DNA sequences generated from sequencing projects. Short read error correction Short-read error correction - - + + Sequencing error detection @@ -49520,10 +49960,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Analyse DNA sequence data to identify differences between the genetic composition (genotype) of an individual compared to other individuals or a reference sequence. - - + + Methods might consider cytogenetic analyses, copy number polymorphism (and calculate copy number calls for copy-number variation (CNV) regions), single nucleotide polymorphism (SNP), rare copy number variation (CNV) identification, loss of heterozygosity data and so on. Genotyping @@ -49534,14 +49974,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Analyse a genetic variation, for example to annotate its location, alleles, classification, and effects on individual transcripts predicted for a gene model. Genetic variation annotation Sequence variation analysis Variant analysis Transcript variant analysis - - + + Genetic variation annotation provides contextual interpretation of coding SNP consequences in transcripts. It allows comparisons to be made between variation data in different populations or strains for the same transcript. Genetic variation analysis @@ -49553,7 +49993,7 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Align short oligonucleotide sequences (reads) to a larger (genomic) sequence. 
Oligonucleotide alignment Oligonucleotide alignment construction @@ -49564,8 +50004,8 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp Short read alignment Short read mapping Short sequence read mapping - - + + The purpose of read mapping is to identify the location of sequenced fragments within a reference genome and assumes that there is, in fact, at least local similarity between the fragment and reference sequences. Read mapping @@ -49576,11 +50016,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 A variant of oligonucleotide mapping where a read is mapped to two separate locations because of possible structural variation. Split-read mapping - - + + Split read mapping @@ -49590,14 +50030,18 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 + Community profiling (see Metabarcoding) + Sample barcoding? Analyse DNA sequences in order to identify a DNA 'barcode'; marker genes or any short fragment(s) of DNA that are useful to diagnose the taxa of biological organisms. - Community profiling - Sample barcoding - - + + DNA barcoding + TODO: Add Differential taxonomic/community profiling! (bebatut) + TODO: Make into a DNA barcoding (& RNA?) topic, revise operations such as taxonomic/DNA profiling and identification, consider what should be topic(s) and what operation(s). + + @@ -49606,11 +50050,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 - 1.19 - + 1.1 + 1.19 + Identify single nucleotide change in base positions in sequencing data that differ from a reference genome and which might, especially by reference to population frequency or functional data, indicate a polymorphism. 
- + SNP calling true @@ -49622,13 +50066,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 - "Polymorphism detection" and "Variant calling" are essentially the same thing - keeping the later as a more prevalent term nowadays. - 1.24 - - + 1.1 + "Polymorphism detection" and "Variant calling" are essentially the same thing - keeping the later as a more prevalent term nowadays. + 1.24 + + Detect mutations in multiple DNA sequences, for example, from the alignment and comparison of the fluorescent traces produced by DNA sequencing hardware. - + Polymorphism detection true @@ -49640,11 +50084,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Visualise, format or render an image of a Chromatogram. Chromatogram viewing - - + + Chromatogram visualisation @@ -49654,11 +50098,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Analyse cytosine methylation states in nucleic acid sequences. Methylation profile analysis - - + + Methylation analysis @@ -49668,11 +50112,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 - 1.19 - + 1.1 + 1.19 + Determine cytosine methylation status of specific positions in a nucleic acid sequences. - + Methylation calling true @@ -49685,13 +50129,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Measure the overall level of methyl cytosines in a genome from analysis of experimental data, typically from chromatographic methods and methyl accepting capacity assay. Genome methylation analysis Global methylation analysis Methylation level analysis (global) - - + + Whole genome methylation analysis @@ -49701,12 +50145,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Analysing the DNA methylation of specific genes or regions of interest. 
Gene-specific methylation analysis Methylation level analysis (gene-specific) - - + + Gene methylation analysis @@ -49717,14 +50161,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Visualise, format or render a nucleic acid sequence that is part of (and in context of) a complete genome sequence. Genome browser Genome browsing Genome rendering Genome viewing - - + + Genome visualisation @@ -49735,11 +50179,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Compare the sequence or features of two or more genomes, for example, to find matching regions. Genomic region matching - - + + Genome comparison @@ -49756,14 +50200,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Generate an index of a genome sequence. Burrows-Wheeler Genome indexing (Burrows-Wheeler) Genome indexing (suffix arrays) Suffix arrays - - + + Many sequence alignment tasks involving many or very large sequences rely on a precomputed index of the sequence to accelerate the alignment. The Burrows-Wheeler Transform (BWT) is a permutation of the genome based on a suffix array algorithm. A suffix array consists of the lexicographically sorted list of suffixes of a genome. Genome indexing @@ -49774,11 +50218,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 - 1.12 - + 1.1 + 1.12 + Generate an index of a genome sequence using the Burrows-Wheeler algorithm. - + The Burrows-Wheeler Transform (BWT) is a permutation of the genome based on a suffix array algorithm. Genome indexing (Burrows-Wheeler) @@ -49791,11 +50235,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 - 1.12 - + 1.1 + 1.12 + Generate an index of a genome sequence using a suffix arrays algorithm. - + A suffix array consists of the lexicographically sorted list of suffixes of a genome. 
Genome indexing (suffix arrays) @@ -49826,12 +50270,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Analyse one or more spectra from mass spectrometry (or other) experiments. Mass spectrum analysis Spectrum analysis - - + + Spectral analysis @@ -49848,13 +50292,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Identify peaks in a spectrum from a mass spectrometry, NMR, or some other spectrum-generating experiment. Peak assignment Peak detection Peak finding - - + + Peak detection @@ -49871,12 +50315,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Link together a non-contiguous series of genomic sequences into a scaffold, consisting of sequences separated by gaps of known length. The sequences that are linked are typically contigs; contiguous sequences corresponding to read overlaps. Scaffold construction Scaffold generation - - + + Scaffold may be positioned along a chromosome physical map to create a "golden path". Scaffolding @@ -49887,10 +50331,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Fill the gaps in a sequence assembly (scaffold) by merging in additional sequences. - - + + Different techniques are used to generate gap sequences to connect contigs, depending on the size of the gap. For small (5-20kb) gaps, PCR amplification and sequencing is used. For large (>20kb) gaps, fragments are cloned (e.g. in BAC (Bacterial artificial chromosomes) vectors) and then sequenced. Scaffold gap completion @@ -49902,12 +50346,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Raw sequence data quality control. Sequencing QC Sequencing quality assessment - - + + Analyse raw sequence data from a sequencing pipeline and identify (and possibly fix) problems. 
Sequencing quality control @@ -49919,11 +50363,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Pre-process sequence reads to ensure (or improve) quality and reliability. Sequence read pre-processing - - + + For example process paired end reads to trim low quality ends, remove short sequences, identify sequence inserts, detect chimeric reads, or remove low quality sequences including vector, adaptor, low complexity and contaminant sequences. Sequences might come from genomic DNA library, EST libraries, SSH library and so on. Read pre-processing @@ -49940,10 +50384,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Estimate the frequencies of different species from analysis of the molecular sequences, typically of DNA recovered from environmental samples. - - + + Species frequency estimation @@ -49953,12 +50397,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Identify putative protein-binding regions in a genome sequence from analysis of Chip-sequencing data or ChIP-on-chip data. Protein binding peak detection Peak-pair calling - - + + Chip-sequencing combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to generate a set of reads, which are aligned to a genome sequence. The enriched areas contain the binding sites of DNA-associated proteins. For example, a transcription factor binding site. ChIP-on-chip in contrast combines chromatin immunoprecipitation ('ChIP') with microarray ('chip'). "Peak-pair calling" is similar to "Peak calling" in the context of ChIP-exo. Peak calling @@ -49969,14 +50413,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Identify from molecular sequence analysis (typically from analysis of microarray or RNA-seq data) genes whose expression levels are significantly different between two sample groups. 
Differential expression analysis Differential gene analysis Differential gene expression analysis Differentially expressed gene identification - - + + Differential gene expression analysis is used, for example, to identify which genes are up-regulated (increased expression) or down-regulated (decreased expression) between a group treated with a drug and a control group. Differential gene expression profiling @@ -49987,11 +50431,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 - 1.21 - + 1.1 + 1.21 + Analyse gene expression patterns (typically from DNA microarray datasets) to identify sets of genes that are associated with a specific trait, condition, clinical outcome etc. - + Gene set testing true @@ -50004,10 +50448,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Classify variants based on their potential effect on genes, especially functional effects on the expressed proteins. - - + + Variants are typically classified by their position (intronic, exonic, etc.) in a gene transcript and (for variants in coding exons) by their effect on the protein sequence (synonymous, non-synonymous, frameshifting, etc.) Variant classification @@ -50018,10 +50462,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Identify biologically interesting variants by prioritizing individual variants, for example, homozygous variants absent in control genomes. - - + + Variant prioritisation can be used for example to produce a list of variants responsible for 'knocking out' genes in specific genomes. Methods include amino acid substitution, aggregative approaches, probabilistic approach, inheritance and unified likelihood-frameworks. 
Variant prioritisation @@ -50033,7 +50477,7 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Detect, identify and map mutations, such as single nucleotide polymorphisms, short indels and structural variants, in multiple DNA sequences. Typically the alignment and comparison of the fluorescent traces produced by DNA sequencing hardware, to study genomic alterations. Variant mapping Allele calling @@ -50043,8 +50487,8 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp Mutation detection Somatic variant calling de novo mutation detection - - + + Methods often utilise a database of aligned reads. Somatic variant calling is the detection of variations established in somatic cells and hence not inherited as a germ line variant. Variant detection @@ -50057,11 +50501,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Detect large regions in a genome subject to copy-number variation, or other structural variations in genome(s). Structural variation discovery - - + + Methods might involve analysis of whole-genome array comparative genome hybridisation or single-nucleotide polymorphism arrays, paired-end mapping of sequencing data, or from analysis of short reads from new sequencing technologies. Structural variation detection @@ -50072,11 +50516,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Analyse sequencing data from experiments aiming to selectively sequence the coding regions of the genome. Exome sequence analysis - - + + Exome assembly @@ -50086,10 +50530,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Analyse mapping density (read depth) of (typically) short reads from sequencing platforms, for example, to detect deletions and duplications. 
- - + + Read depth analysis @@ -50099,13 +50543,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Combine classical quantitative trait loci (QTL) analysis with gene expression profiling, for example, to describe cis- and trans-controlling elements for the expression of phenotype associated genes. Gene expression QTL profiling Gene expression quantitative trait loci profiling eQTL profiling - - + + Gene expression QTL analysis @@ -50115,11 +50559,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.1 + 1.1 Estimate the number of copies of loci of particular gene(s) in DNA sequences typically from gene-expression profiling technology based on microarray hybridisation-based experiments. For example, estimate copy number (or marker dosage) of a dominant marker in samples from polyploid plant cells or tissues, or chromosomal gains and losses in tumors. Transcript copy number estimation - - + + Methods typically implement some statistical model for hypothesis testing, and methods estimate total copy number, i.e. do not distinguish the two inherited chromosomes quantities (specific copy number). Copy number estimation @@ -50130,11 +50574,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.2 + 1.2 Adapter removal Remove forward and/or reverse primers from nucleic acid sequences (typically PCR products). - - + + Primer removal @@ -50156,10 +50600,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.2 + 1.2 Infer a transcriptome sequence by analysis of short sequence reads. - - + + Transcriptome assembly @@ -50169,12 +50613,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.2 - 1.6 - + 1.2 + 1.6 + Infer a transcriptome sequence without the aid of a reference genome, i.e. by comparing short sequences (reads) to each other. 
- + Transcriptome assembly (de novo) true @@ -50185,12 +50629,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.2 - 1.6 - + 1.2 + 1.6 + Infer a transcriptome sequence by mapping short reads to a reference genome. - + Transcriptome assembly (mapping) true @@ -50213,10 +50657,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.3 + 1.3 Convert one set of sequence coordinates to another, e.g. convert coordinates of one assembly to another, cDNA to genomic, CDS to genomic, protein translation to genomic etc. - - + + Sequence coordinate conversion @@ -50226,10 +50670,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.3 + 1.3 Calculate similarity between 2 or more documents. - - + + Document similarity calculation @@ -50240,10 +50684,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.3 + 1.3 Cluster (group) documents on the basis of their calculated similarity. - - + + Document clustering @@ -50254,7 +50698,7 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.3 + 1.3 Recognise named entities, ontology concepts, tags, events, and dictionary terms within documents. Concept mining Entity chunking @@ -50263,8 +50707,8 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp Event extraction NER Named-entity recognition - - + + Named-entity and concept recognition @@ -50277,12 +50721,18 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.3 + + + + + + + 1.3 Map data identifiers to one another for example to establish a link between two biological databases for the purposes of data integration. Accession mapping Identifier mapping - - + + The mapping can be achieved by comparing identifier values or some other means, e.g. exact matches to a provided sequence. 
ID mapping @@ -50293,12 +50743,22 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.3 + + + + + + + 1.3 Process data in such a way that makes it hard to trace to the person which the data concerns. Data anonymisation - - + Data anonymization + + Anonymisation + Anonymization + + @@ -50307,12 +50767,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.3 - (jison)Too fine-grained, the operation (Data retrieval) hasn't changed, just what is retrieved. - 1.17 - + 1.3 + (jison)Too fine-grained, the operation (Data retrieval) hasn't changed, just what is retrieved. + 1.17 + Search for and retrieve a data identifier of some kind, e.g. a database entry accession. - + ID retrieval true @@ -50336,10 +50796,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.4 + 1.4 Generate a checksum of a molecular sequence. - - + + Sequence checksum generation @@ -50355,11 +50815,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.4 + 1.4 Construct a bibliography from the scientific literature. Bibliography construction - - + + Bibliography generation @@ -50369,10 +50829,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.4 + 1.4 Predict the structure of a multi-subunit protein and particularly how the subunits fit together. - - + + Protein quaternary structure prediction @@ -50394,10 +50854,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.4 + 1.4 Analyse the surface properties of proteins or other macromolecules, including surface accessible pockets, interior inaccessible cavities etc. - - + + Molecular surface analysis @@ -50407,10 +50867,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.4 + 1.4 Compare two or more ontologies, e.g. identify differences. 
- - + + Ontology comparison @@ -50420,11 +50880,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.4 - 1.9 - + 1.4 + 1.9 + Compare two or more ontologies, e.g. identify differences. - + Ontology comparison true @@ -50448,14 +50908,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - - 1.4 + + 1.4 Recognition of which format the given data is in. Format identification Format inference Format recognition - - + + 'Format recognition' is not a bioinformatics-specific operation, but of great relevance in bioinformatics. Should be removed from EDAM if/when captured satisfactorily in a suitable domain-generic ontology. Format detection @@ -50472,12 +50932,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.4 + 1.4 Split a file containing multiple data items into many files, each containing one item File splitting - - - Splitting + + + Data splitting @@ -50486,12 +50946,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - true + 1.6 + true Construct some data entity. Construction - - + + For non-analytical operations, see the 'Processing' branch. Generation @@ -50502,13 +50962,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - (jison)This is a distinction made on basis of input; all features exist can be mapped to a sequence so this isn't needed. - 1.17 - - + 1.6 + (jison)This is a distinction made on basis of input; all features exist can be mapped to a sequence so this isn't needed. + 1.17 + + Predict, recognise and identify functional or other key sites within nucleic acid sequences, typically by scanning for known motifs, patterns and regular expressions. 
- + Nucleic acid sequence feature detection true @@ -50520,17 +50980,25 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - Deposit some data in a database or some other type of repository or software system. - Data deposition + + + + + + + 1.6 + Data brokering + Data sharing + Data archival + Data preservation + Deposition of data in a database or other type of repository. Data submission Database deposition - Database submission - Submission - - + Data publication + + For non-analytical operations, see the 'Processing' branch. - Deposition + Data deposition @@ -50539,12 +51007,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - true + 1.6 + true Group together some data entities on the basis of similarities such that entities in the same group (cluster) are more similar to each other than to those in other groups (clusters). - - - Clustering + + + Clustering @@ -50553,11 +51021,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - 1.19 - + 1.6 + 1.19 + Construct some entity (typically a molecule sequence) from component pieces. - + Assembly true @@ -50569,11 +51037,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - true + 1.6 + true Convert a data set from one form to another. - - + + Conversion @@ -50583,12 +51051,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 + 1.6 Standardize or normalize data by some statistical method. Normalisation Standardisation - - + + In the simplest normalisation means adjusting values measured on different scales to a common scale (often between 0.0 and 1.0), but can refer to more sophisticated adjustment whereby entire probability distributions of adjusted values are brought into alignment. 
Standardisation typically refers to an operation whereby a range of values are standardised to measure how many standard deviations a value is from its mean. Standardisation and normalisation @@ -50599,11 +51067,15 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 + 1.6 Combine multiple files or data items into a single file or object. - - - Aggregation + Object aggregation + Data integration + + + Data aggregation + + @@ -50618,10 +51090,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 + 1.6 Compare two or more scientific articles. - - + + Article comparison @@ -50631,11 +51103,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - true + 1.6 + true Mathematical determination of the value of something, typically a properly of a molecule. - - + + Calculation @@ -50645,15 +51117,15 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - Notions of pathway and network were mixed up, EDAM 1.24 disentangles them. - 1.24 - + 1.6 + Notions of pathway and network were mixed up, EDAM 1.24 disentangles them. + 1.24 + Predict a molecular pathway or network. - + Pathway or network prediction true @@ -50664,11 +51136,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - 1.12 - + 1.6 + 1.12 + The process of assembling many short DNA sequences together such thay they represent the original chromosomes from which the DNA originated. - + Genome assembly true @@ -50680,11 +51152,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.6 - 1.19 - + 1.6 + 1.19 + Generate a graph, or other visual representation, of data, showing the relationship between two or more variables. 
- + Plotting true @@ -50702,11 +51174,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Image processing The analysis of a image (typically a digital image) of some type in order to extract information from it. - - + + Image analysis @@ -50717,10 +51189,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Analysis of data from a diffraction experiment. - - + + + imaging-revise Diffraction data analysis @@ -50736,10 +51209,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Analysis of cell migration images in order to study cell migration, typically in order to study the processes that play a role in the disease progression. - - + + Cell migration analysis @@ -50750,10 +51223,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Processing of diffraction data into a corrected, ordered, and simplified form. - - + + + imaging-revise Diffraction data reduction @@ -50769,10 +51243,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Measurement of neurites; projections (axons or dendrites) from the cell body of a neuron, from analysis of neuron images. - - + + Neurite measurement @@ -50782,12 +51256,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 The evaluation of diffraction intensities and integration of diffraction maxima from a diffraction experiment. Diffraction profile fitting Diffraction summation integration - - + + + imaging-revise Diffraction data integration @@ -50797,10 +51272,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Phase a macromolecular crystal structure, for example by using molecular replacement or experimental phasing methods. 
- - + + + imaging-revise Phasing @@ -50810,10 +51286,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 A technique used to construct an atomic model of an unknown structure from diffraction data, based upon an atomic model of a known structure, either a related protein or the same protein from a different crystal form. - - + + The technique solves the phase problem, i.e. retrieve information concern phases of the structure. Molecular replacement @@ -50824,10 +51300,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 A method used to refine a structure by moving the whole molecule or parts of it as a rigid unit, rather than moving individual atoms. - - + + Rigid body refinement usually follows molecular replacement in the assignment of a structure from diffraction data. Rigid body refinement @@ -50845,10 +51321,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 An image processing technique that combines and analyze multiple images of a particulate sample, in order to produce an image with clearer features that are more easily interpreted. - - + + + imaging-revise Single particle analysis is used to improve the information that can be obtained by relatively low resolution techniques, , e.g. an image of a protein or virus from transmission electron microscopy (TEM). Single particle analysis @@ -50861,12 +51338,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 - true - This is two related concepts. + 1.7 + true + This is two related concepts. Compare (align and classify) multiple particle images from a micrograph in order to produce a representative image of the particle. - - + + + imaging-revise A micrograph can include particles in multiple different orientations and/or conformations. Particles are compared and organised into sets based on their similarity. 
Typically iterations of classification and alignment and are performed to optimise the final 3D EM map. Single particle alignment and classification @@ -50883,11 +51361,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Clustering of molecular sequences on the basis of their function, typically using information from an ontology of gene function, or some other measure of functional phenotype. Functional sequence clustering - - + + Functional clustering @@ -50897,12 +51375,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Classifiication (typically of molecular sequences) by assignment to some taxonomic hierarchy. Taxonomy assignment Taxonomic profiling - - + + Taxonomic classification @@ -50919,11 +51397,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 The prediction of the degree of pathogenicity of a microorganism from analysis of molecular sequences. Pathogenicity prediction - - + + Virulence prediction @@ -50934,14 +51412,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Analyse the correlation patterns among features/molecules across across a variety of experiments, samples etc. Co-expression analysis Gene co-expression network analysis Gene expression correlation Gene expression correlation analysis - - + + Expression correlation analysis @@ -50957,11 +51435,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 - true + 1.7 + true Identify a correlation, i.e. a statistical relationship between two random variables or two sets of data. - - + + Correlation @@ -50978,10 +51456,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Compute the covariance model for (a family of) RNA secondary structures. 
- - + + RNA structure covariance model generation @@ -50991,11 +51469,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 - 1.18 - + 1.7 + 1.18 + Predict RNA secondary structure by analysis, e.g. probabilistic analysis, of the shape of RNA folds. - + RNA secondary structure prediction (shape-based) true @@ -51007,11 +51485,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 - 1.18 - + 1.7 + 1.18 + Prediction of nucleic-acid folding using sequence alignments as a source of data. - + Nucleic acid folding prediction (alignment-based) true @@ -51023,10 +51501,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Count k-mers (substrings of length k) in DNA sequence data. - - + + k-mer counting is used in genome and transcriptome assembly, metagenomic sequencing, and for error correction of sequence reads. k-mer counting @@ -51043,13 +51521,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Reconstructing the inner node labels of a phylogenetic tree from its leafes. Phylogenetic tree reconstruction Gene tree reconstruction Species tree reconstruction - - + + Note that this is somewhat different from simply analysing an existing tree or constructing a completely new one. Phylogenetic reconstruction @@ -51060,10 +51538,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Generate some data from a choosen probibalistic model, possibly to evaluate algorithms. - - + + Probabilistic data generation @@ -51074,10 +51552,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Generate sequences from some probabilistic model, e.g. a model that simulates evolution. 
- - + + Probabilistic sequence generation @@ -51094,10 +51572,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.7 + 1.7 Identify or predict causes for antibiotic resistance from molecular sequence analysis. - - + + Antimicrobial resistance prediction @@ -51113,13 +51591,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.8 + 1.8 Analysis of a set of objects, such as genes, annotated with given categories, where eventual over-/under-representation of certain categories within the studied set of objects is revealed. Enrichment Over-representation analysis Functional enrichment - - + + Categories from a relevant ontology can be used. The input is typically a set of genes or other biological objects, possibly represented by their identifiers, and the output of the analysis is typically a ranked list of categories, each associated with a statistical metric of over-/under-representation within the studied data. Enrichment analysis @@ -51136,11 +51614,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.8 + 1.8 Analyse a dataset with respect to concepts from an ontology of chemical structure, leveraging chemical similarity information. Chemical class enrichment - - + + Chemical similarity enrichment @@ -51150,10 +51628,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.8 + 1.8 Plot an incident curve such as a survival curve, death curve, mortality curve. - - + + Incident curve plotting @@ -51163,10 +51641,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.8 + 1.8 Identify and map patterns of genomic variations. - - + + Methods often utilise a database of aligned reads. 
Variant pattern analysis @@ -51177,11 +51655,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.8 - 1.12 - + 1.8 + 1.12 + Model some biological system using mathematical techniques including dynamical systems, statistical models, differential equations, and game theoretic models. - + Mathematical modelling true @@ -51199,10 +51677,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 Visualise images resulting from various types of microscopy. - - + + Microscope image visualisation @@ -51212,10 +51690,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 Annotate an image of some sort, typically with terms from a controlled vocabulary. - - + + Image annotation @@ -51225,11 +51703,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 Replace missing data with substituted values, usually by using some statistical or other mathematical approach. Data imputation - - + + Imputation @@ -51240,11 +51718,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 Visualise, format or render data from an ontology, typically a tree of terms. Ontology browsing - - + + Ontology visualisation @@ -51254,10 +51732,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 A method for making numerical assessments about the maximum percent of time that a conformer of a flexible macromolecule can exist and still be compatible with the experimental data. - - + + Maximum occurence analysis @@ -51268,12 +51746,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 Compare the models or schemas used by two or more databases, or any other general comparison of databases rather than a detailed comparison of the entries themselves. 
Data model comparison Schema comparison - - + + Database comparison @@ -51283,13 +51761,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 - 1.24 - + 1.9 + 1.24 + Simulate the bevaviour of a biological pathway or network. - + Notions of pathway and network were mixed up, EDAM 1.24 disentangles them. Network simulation true @@ -51301,10 +51779,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 Analyze read counts from RNA-seq experiments. - - + + RNA-seq read count analysis @@ -51314,10 +51792,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 Identify and remove redudancy from a set of small molecule structures. - - + + Chemical redundancy removal @@ -51327,10 +51805,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 Analyze time series data from an RNA-seq experiment. - - + + RNA-seq time series data analysis @@ -51340,10 +51818,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.9 + 1.9 Simulate gene expression data, e.g. for purposes of benchmarking. - - + + Simulated gene expression data generation @@ -51353,15 +51831,15 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Identify semantic relations among entities and concepts within a text, using text mining techniques. Relation discovery Relation inference Relationship discovery Relationship extraction Relationship inference - - + + Relation extraction @@ -51377,11 +51855,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Re-adjust the output of mass spectrometry experiments with shifted ppm values. 
Mass calibration - - + + Mass spectra calibration @@ -51397,12 +51875,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Align multiple data sets using information from chromatography, mass spectrometry and/or tandem mass spectrometry, from chromatography-mass spectrometry experiments. MBR Match between runs - - + + Chromatographic alignment @@ -51418,11 +51896,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 The removal of isotope peaks in a spectrum, to represent the fragment ion as one data point. Deconvolution - - + + Deisotoping is commonly done to reduce complexity, and done in conjunction with the charge state deconvolution. Deisotoping @@ -51440,11 +51918,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Technique for determining the amount of proteins in a sample. Protein quantitation - - + + Protein quantification @@ -51460,11 +51938,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Determination of peptide sequence from mass spectrum. Peptide-spectrum-matching - - + + Peptide identification @@ -51486,10 +51964,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Calculate the isotope distribution of a given chemical species. - - + + Isotopic distributions calculation @@ -51499,11 +51977,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Prediction of retention time in a mass spectrometry experiment based on compositional and structural properties of the separated species. Retention time calculation - - + + Retention time prediction @@ -51513,10 +51991,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Quantification without the use of chemical tags. 
- - + + Label-free quantification @@ -51526,10 +52004,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Quantification based on the use of chemical tags. - - + + Labeled quantification @@ -51539,10 +52017,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Quantification by Selected/multiple Reaction Monitoring workflow (XIC quantitation of precursor / fragment mass pair). - - + + MRM/SRM @@ -51552,10 +52030,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Calculate number of identified MS2 spectra as approximation of peptide / protein quantity. - - + + Spectral counting @@ -51565,10 +52043,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Quantification analysis using stable isotope labeling by amino acids in cell culture. - - + + SILAC @@ -51578,10 +52056,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Quantification analysis using the AB SCIEX iTRAQ isobaric labelling workflow, wherein 2-8 reporter ions are measured in MS2 spectra near 114 m/z. - - + + iTRAQ @@ -51591,10 +52069,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Quantification analysis using labeling based on 18O-enriched H2O. - - + + 18O labeling @@ -51604,10 +52082,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Quantification analysis using the Thermo Fisher tandem mass tag labelling workflow. 
- - + + TMT-tag @@ -51617,10 +52095,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Quantification analysis using chemical labeling by stable isotope dimethylation - - + + Dimethyl @@ -51630,10 +52108,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Peptide sequence tags are used as piece of information about a peptide obtained by tandem mass spectrometry. - - + + Tag-based peptide identification @@ -51644,10 +52122,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Analytical process that derives a peptide's amino acid sequence from its tandem mass spectrum (MS/MS) without the assistance of a sequence database. - - + + de Novo sequencing @@ -51657,10 +52135,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Identification of post-translational modifications (PTMs) of peptides/proteins in mass spectrum. - - + + PTM identification @@ -51671,10 +52149,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Determination of best matches between MS/MS spectrum and a database of protein or nucleic acid sequences. - - + + Peptide database search @@ -51684,12 +52162,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Peptide database search for identification of known and unknown PTMs looking for mass difference mismatches. Modification-tolerant peptide database search Unrestricted peptide database search - - + + Blind peptide database search @@ -51699,12 +52177,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 - 1.19 - - + 1.12 + 1.19 + + Statistical estimation of false discovery rate from score distribution for peptide-spectrum-matches, following a peptide database search. 
- + Validation of peptide-spectrum matches true @@ -51717,11 +52195,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Validation of peptide-spectrum matches Statistical estimation of false discovery rate from score distribution for peptide-spectrum-matches, following a peptide database search, and by comparison to search results with a database containing incorrect information. - - + + Target-Decoy @@ -51731,11 +52209,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Analyse data in order to deduce properties of an underlying distribution or population. Empirical Bayes - - + + Statistical inference @@ -51746,11 +52224,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + + 1.12 A statistical calculation to estimate the relationships among variables. Regression - - + + Regression analysis @@ -51768,16 +52247,16 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Model a metabolic network. This can include 1) reconstruction to break down a metabolic pathways into reactions, enzymes, and other relevant information, and compilation of this into a mathematical model and 2) simulations of metabolism based on the model. - - + Metabolic pathway modelling Metabolic network reconstruction Metabolic network simulation + Metabolic pathway reconstruction Metabolic pathway simulation Metabolic reconstruction - - + + The terms and synyonyms here reflect that for practical intents and purposes, "pathway" and "network" can be treated the same. Metabolic network modelling @@ -51789,10 +52268,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Predict the effect or function of an individual single nucleotide polymorphism (SNP). 
- - + + SNP annotation @@ -51802,11 +52281,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Prediction of genes or gene components from first principles, i.e. without reference to existing genes. Gene prediction (ab-initio) - - + + Ab-initio gene prediction @@ -51817,7 +52296,7 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Prediction of genes or gene components by reference to homologous genes. Empirical gene finding Empirical gene prediction @@ -51826,8 +52305,8 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp Similarity-based gene prediction Homology prediction Orthology prediction - - + + Homology-based gene prediction @@ -51838,10 +52317,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Construction of a statistical model, or a set of assumptions around some observed data, usually by describing a set of probability distributions which approximate the distribution of data. - - + + Statistical modelling @@ -51853,10 +52332,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Compare two or more molecular surfaces. - - + + Molecular surface comparison @@ -51866,11 +52345,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Annotate one or more sequences with functional information, such as cellular processes or metaobolic pathways, by reference to a controlled vocabulary - invariably the Gene Ontology (GO). Sequence functional annotation - - + + Gene functional annotation @@ -51880,10 +52359,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Variant filtering is used to eliminate false positive variants based for example on base calling quality, strand and position information, and mapping info. 
- - + + Variant filtering @@ -51893,10 +52372,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.12 + 1.12 Identify binding sites in nucleic acid sequences that are statistically significantly differentially bound between sample groups. - - + + Differential binding analysis @@ -51907,10 +52386,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.13 + 1.13 Analyze data from RNA-seq experiments. - - + + RNA-Seq analysis @@ -51920,10 +52399,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.13 + 1.13 Visualise, format or render a mass spectrum. - - + + Mass spectrum visualisation @@ -51933,13 +52412,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.13 + 1.13 Filter a set of files or data items according to some property. Sequence filtering rRNA filtering - - - Filtering + + + Data filtering @@ -51948,10 +52427,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.14 + 1.14 Identification of the best reference for mapping for a specific dataset from a list of potential references, when performing genetic variation analysis. - - + + Reference identification @@ -51961,11 +52440,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.14 + 1.14 Label-free quantification by integration of ion current (ion counting). Ion current integration - - + + Ion counting @@ -51975,11 +52454,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.14 + 1.14 Chemical tagging free amino groups of intact proteins with stable isotopes. ICPL - - + + Isotope-coded protein label @@ -51989,12 +52468,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.14 + 1.14 Labeling all proteins and (possibly) all amino acids using C-13 or N-15 enriched grown medium or feed. 
C-13 metabolic labeling N-15 metabolic labeling - - + + This includes N-15 metabolic labeling (labeling all proteins and (possibly) all amino acids using N-15 enriched grown medium or feed) and C-13 metabolic labeling (labeling all proteins and (possibly) all amino acids using C-13 enriched grown medium or feed). Metabolic labeling @@ -52005,11 +52484,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.15 + 1.15 Construction of a single sequence assembly of all reads from different samples, typically as part of a comparative metagenomic analysis. Sequence assembly (cross-assembly) - - + + Cross-assembly @@ -52019,10 +52498,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.15 + 1.15 The comparison of samples from a metagenomics study, for example, by comparison of metagenome shotgun reads or assembled contig sequences, by comparison of functional profiles, or some other method. - - + + Sample comparison @@ -52033,12 +52512,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.15 + 1.15 Differential protein analysis The analysis, using proteomics techniques, to identify proteins whose encoding genes are differentially expressed under a given experimental setup. Differential protein expression analysis - - + + Differential protein expression profiling @@ -52048,11 +52527,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.15 - 1.17 - + 1.15 + 1.17 + The analysis, using any of diverse techniques, to identify genes that are differentially expressed under a given experimental setup. - + Differential gene expression analysis true @@ -52064,10 +52543,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.15 + 1.15 Visualise, format or render data arising from an analysis of multiple samples from a metagenomics/community experiment. 
- - + + Multiple sample visualisation @@ -52077,13 +52556,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.15 + 1.15 The extrapolation of empirical characteristics of individuals or populations, backwards in time, to their common ancestors. Ancestral sequence reconstruction Character mapping Character optimisation - - + + Ancestral reconstruction is often used to recover possible ancestral character states of ancient, extinct organisms. Ancestral reconstruction @@ -52094,12 +52573,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.16 + 1.16 Site localisation of post-translational modifications in peptide or protein mass spectra. PTM scoring Site localisation - - + + PTM localisation @@ -52109,11 +52588,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.16 + 1.16 Operations concerning the handling and use of other tools. Endpoint management - - + + Service management @@ -52123,10 +52602,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.16 + 1.16 An operation supporting the browsing or discovery of other tools and services. - - + + Service discovery @@ -52136,10 +52615,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.16 + 1.16 An operation supporting the aggregation of other services (at least two) into a funtional unit, for the automation of some task. - - + + Service composition @@ -52149,10 +52628,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.16 + 1.16 An operation supporting the calling (invocation) of other tools and services. - - + + Service invocation @@ -52168,12 +52647,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.16 + 1.16 A data mining method typically used for studying biological networks based on pairwise correlations between variables. 
WGCNA Weighted gene co-expression network analysis - - + + Weighted correlation network analysis @@ -52190,11 +52669,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.16 + 1.16 Identification of protein, for example from one or more peptide identifications by tandem mass spectrometry. Protein inference - - + + Protein identification @@ -52222,12 +52701,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.16 + 1.16 Text annotation is the operation of adding notes, data and metadata, recognised entities and concepts, and their relations to a text (such as a scientific article). Article annotation Literature annotation - - + + Text annotation @@ -52238,10 +52717,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 A method whereby data on several variants are "collapsed" into a single covariate based on regions such as genes. - - + + Genome-wide association studies (GWAS) analyse a genome-wide set of genetic variants in different individuals to see if any variant is associated with a trait. Traditional association techniques can lack the power to detect the significance of rare variants individually, or measure their compound effect (rare variant burden). "Collapsing methods" were developed to overcome these problems. Collapsing methods @@ -52252,12 +52731,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 miRNA analysis The analysis of microRNAs (miRNAs) : short, highly conserved small noncoding RNA molecules that are naturally occurring plant and animal genomes. miRNA expression profiling - - + + miRNA expression analysis @@ -52267,10 +52746,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 Counting and summarising the number of short sequence reads that map to genomic features. 
- - + + Read summarisation @@ -52280,10 +52759,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 A technique whereby molecules with desired properties and function are isolated from libraries of random molecules, through iterative cycles of selection, amplification, and mutagenesis. - - + + In vitro selection @@ -52293,11 +52772,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 The calculation of species richness for a number of individual samples, based on plots of the number of species as a function of the number of samples (rarefaction curves). Species richness assessment - - + + Rarefaction @@ -52308,12 +52787,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 An operation which groups reads or contigs and assigns them to operational taxonomic units. Binning Binning shotgun reads - - + + Binning methods use one or a combination of compositional features or sequence similarity. Read binning @@ -52325,12 +52804,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 - true + 1.17 + true Counting and measuring experimentally determined observations into quantities. Quantitation - - + + Quantification @@ -52340,11 +52819,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 Quantification of data arising from RNA-Seq high-throughput sequencing, typically the quantification of transcript abundances durnig transcriptome analysis in a gene expression study. RNA-Seq quantitation - - + + RNA-Seq quantification @@ -52360,10 +52839,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 Match experimentally measured mass spectrum to a spectrum in a spectral library or database. 
- - + + Spectral library search @@ -52373,11 +52852,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 Sort a set of files or data items according to some property. - - - Sorting + + + Data sorting @@ -52386,14 +52865,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.17 + 1.17 Metabolite identification Mass spectra identification of compounds that are produced by living systems. Including polyketides, terpenoids, phenylpropanoids, alkaloids and antibiotics. De novo metabolite identification Fragmenation tree generation Metabolite identification - - + + Natural product identification @@ -52403,11 +52882,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.19 + 1.19 Identify and assess specific genes or regulatory regions of interest that are differentially methylated. Differentially-methylated region identification - - + + DMR identification @@ -52417,13 +52896,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.21 + 1.21 Genotyping of multiple loci, typically characterizing microbial species isolates using internal fragments of multiple housekeeping genes. MLST - - + + Multilocus sequence typing @@ -52441,11 +52920,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.21 + 1.21 Calculate a theoretical mass spectrometry spectra for given sequences. Spectrum prediction - - + + Spectrum calculation @@ -52461,10 +52940,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 3D visualization of a molecular trajectory. 
- - + + Trajectory visualization @@ -52475,13 +52954,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Compute Essential Dynamics (ED) on a simulation trajectory: an analysis of molecule dynamics using PCA (Principal Component Analysis) applied to the atomic positional fluctuations. ED PCA Principal modes - - + + Principal Component Analysis (PCA) is a multivariate statistical analysis to obtain collective variables and reduce the dimensionality of the system. Essential dynamics @@ -52504,12 +52983,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Obtain force field parameters (charge, bonds, dihedrals, etc.) from a molecule, to be used in molecular simulations Ligand parameterization Molecule parameterization - - + + Forcefield parameterisation @@ -52519,7 +52998,7 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Analyse DNA sequences in order to determine an individual's DNA characteristics, for example in criminal forensics, parentage testing and so on. DNA fingerprinting DNA profiling @@ -52532,11 +53011,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Predict or detect active sites in proteins; the region of an enzyme which binds a substrate bind and catalyses a reaction. Active site detection - - + + Active site prediction @@ -52547,12 +53026,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Predict or detect ligand-binding sites in proteins; a region of a protein which reversibly binds a ligand for some biochemical purpose, such as transport or regulation of protein function. 
Ligand-binding site detection Peptide-protein binding prediction - - + + Ligand-binding site prediction @@ -52563,12 +53042,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Predict or detect metal ion-binding sites in proteins. Metal-binding site detection Protein metal-binding site prediction - - + + Metal-binding site prediction @@ -52591,11 +53070,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Model or simulate protein-protein binding using comparative modelling or other techniques. Protein docking - - + + Protein-protein docking @@ -52617,13 +53096,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Predict DNA-binding proteins. DNA-binding protein detection DNA-protein interaction prediction Protein-DNA interaction prediction - - + + DNA-binding protein prediction @@ -52645,13 +53124,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Predict RNA-binding proteins. Protein-RNA interaction prediction RNA-binding protein detection RNA-protein interaction prediction - - + + RNA-binding protein prediction @@ -52661,13 +53140,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Predict or detect RNA-binding sites in protein sequences. Protein-RNA binding site detection Protein-RNA binding site prediction RNA binding site detection - - + + RNA binding site prediction @@ -52677,14 +53156,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 - + 1.22 Predict or detect DNA-binding sites in protein sequences. 
Protein-DNA binding site detection Protein-DNA binding site prediction DNA binding site detection - - + + DNA binding site prediction @@ -52701,11 +53179,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Identify or predict intrinsically disordered regions in proteins. - - + + Protein disorder prediction @@ -52716,11 +53194,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Extract structured information from unstructured ("free") or semi-structured textual documents. IE - - + + Information extraction @@ -52731,11 +53209,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.22 + 1.22 Retrieve resources from information systems matching a specific information need. - - + + Information retrieval + + @@ -52750,7 +53230,7 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.24 + 1.24 Detects chimeric sequences (chimeras) from a sequence alignment. Genome analysis @@ -52761,10 +53241,10 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.24 + 1.24 The determination of cytosine methylation status of specific positions in a nucleic acid sequences (usually reads from a bisulfite sequencing experiment). - - + + Methylation calling @@ -52780,11 +53260,11 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.24 + 1.24 The identification of changes in DNA sequence or chromosome structure, usually in the context of diagnostic tests for disease, or to study ancestry or phylogeny. Genetic testing - - + + This can include indirect methods which reveal the results of genetic changes, such as RNA analysis to indicate gene expression, or biochemical analysis to identify expressed proteins. 
DNA testing @@ -52796,41 +53276,15 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - 1.24 + 1.24 The processing of reads from high-throughput sequencing machines. - - + + Sequence read processing - - - - - 1.24 - Laboratory experiment to identify the differences between a specific genome (of an individual) and a reference genome (developed typically from many thousands of individuals). - -WGS re-sequencing is used as golden standard to detect variations compared to a given reference genome, including small variants (SNP and InDels) as well as larger genome re-organisations (CNVs, translocations, etc.). - -ows re-sequencing of complete genomes of any given organism with high resolution and high accuracy. - Resequencing - Whole_genome_sequencing - Amplicon panels - Amplicon sequencing - Amplicon-based sequencing - Highly targeted resequencing - WGR - WGRS - Whole genome resequencing - Amplicon sequencing is the ultra-deep sequencing of PCR products (amplicons), usually for the purpose of efficient genetic variant identification and characterisation in specific genomic regions. - Ultra-deep sequencing - Genome resequencing - - - - @@ -52842,7 +53296,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Render (visualise) a network - typically a biological network of some sort. Network rendering Protein interaction network rendering @@ -52863,11 +53317,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Render (visualise) a biological pathway. Pathway rendering - - + + Pathway visualisation @@ -52889,7 +53343,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Generate, process or analyse a biological network. 
Biological network analysis Biological network modelling @@ -52899,8 +53353,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Network prediction Network simulation Network topology simulation - - + + Network analysis @@ -52923,7 +53377,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Generate, process or analyse a biological pathway. Biological pathway analysis Biological pathway modelling @@ -52933,8 +53387,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Pathway modelling Pathway prediction Pathway simulation - - + + Pathway analysis @@ -52946,10 +53400,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Predict a metabolic pathway. - - + + Metabolic pathway prediction @@ -52959,11 +53413,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Assigning sequence reads to separate groups / files based on their index tag (sample origin). Sequence demultiplexing - - + + NGS sequence runs are often performed with multiple samples pooled together. In such cases, an index tag (or "barcode") - a unique sequence of between 6 and 12bp - is ligated to each sample's genetic material so that the sequence reads from different samples can be identified. The process of demultiplexing (dividing sequence reads into separate files for each index tag/sample) may be performed automatically by the sequencing hardware. Alternatively the reads may be lumped together in one file with barcodes still attached, requiring you to do the splitting using software. In such cases, a "mapping" file is used which indicates which barcodes correspond to which samples. 
Demultiplexing @@ -52974,6 +53428,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution + @@ -52986,11 +53441,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 A process used in statistics, machine learning, and information theory that reduces the number of random variables by obtaining a set of principal variables. Dimension reduction - - + + Dimensionality reduction @@ -53013,13 +53468,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 A dimensionality reduction process that selects a subset of relevant features (variables, predictors) for use in model construction. Attribute selection Variable selection Variable subset selection - - + + Feature selection @@ -53042,11 +53497,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 A dimensionality reduction process which builds (ideally) informative and non-redundant values (features) from an initial set of measured data, to aid subsequent generalization, learning or interpretation. Feature projection - - + + Feature extraction @@ -53069,15 +53524,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Virtual screening is used in drug discovery to identify potential drug compounds. It involves searching libraries of small molecules in order to identify those molecules which are most likely to bind to a drug target (typically a protein receptor or enzyme). Ligand-based screening Ligand-based virtual screening Structure-based screening Structured-based virtual screening Virtual ligand screening - - + + Virtual screening is widely used for lead identification, lead optimization, and scaffold hopping during drug design and discovery. 
Virtual screening @@ -53088,14 +53543,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 The application of phylogenetic and other methods to estimate paleogeographical events such as speciation. Biogeographic dating Speciation dating Species tree dating Tree-dating - - + + Tree dating @@ -53118,10 +53573,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 The development and use of mathematical models and systems analysis for the description of ecological processes, and applications such as the sustainable management of resources. - - + + Ecological modelling @@ -53131,11 +53586,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Mapping between gene tree nodes and species tree nodes or branches, to analyse and account for possible differences between gene histories and species histories, explaining this in terms of gene-scale events such as duplication, loss, transfer etc. Gene tree / species tree reconciliation - - + + Methods typically test for topological similarity between trees using for example a congruence index. Phylogenetic tree reconciliation @@ -53146,10 +53601,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 The detection of genetic selection, or (the end result of) the process by which certain traits become more prevalent in a species than other traits. - - + + Selection detection @@ -53159,10 +53614,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + + 1.25 A statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. 
- - + + Principal component analysis @@ -53173,11 +53629,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 Identify where sections of the genome are repeated and the number of repeats in the genome varies between individuals. CNV detection - - + + Copy number variation detection @@ -53187,10 +53643,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 Identify deletion events causing the number of repeats in the genome to vary between individuals. - - + + Deletion detection @@ -53200,10 +53656,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 Identify duplication events causing the number of repeats in the genome to vary between individuals. - - + + Duplication detection @@ -53213,10 +53669,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 Identify copy number variations which are complex, e.g. multi-allelic variations that have many structural alleles and have rearranged multiple times in the ancestral genomes. - - + + Complex CNV detection @@ -53226,10 +53682,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 Identify amplification events causing the number of repeats in the genome to vary between individuals. - - + + Amplification detection @@ -53246,10 +53702,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 Predict adhesins in protein sequences. - - + + An adhesin is a cell-surface component that facilitate the adherence of a microorganism to a cell or surface. They are important virulence factors during establishment of infection and thus are targetted during vaccine development approaches that seek to block adhesin function and prevent adherence to host cell. 
Adhesin prediction @@ -53260,13 +53716,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 + Protein engineering Design new protein molecules with specific structural or functional properties. Protein redesign Rational protein design de novo protein design - - + + Protein design @@ -53277,7 +53734,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 The design of small molecules with specific biological activity, such as inhibitors or modulators for proteins that are of therapeutic interest. This can involve the modification of individual atoms, the addition or removal of molecular fragments, and the use reaction-based design to explore tractable synthesis options for the small molecule. Drug design Ligand-based drug design @@ -53505,16 +53962,305 @@ ows re-sequencing of complete genomes of any given organism with high resolution + + + + + erin.calhoun + 2023-03-02T11:35:57.259868Z + Encryption + + + + + + + + + erin.calhoun + 2023-03-02T11:36:05.88338Z + Decryption + + + + + + + + + erin.calhoun + 2023-03-02T11:36:14.789196Z + Data compression + + + + + + + + + + + erin.calhoun + 2023-03-02T11:36:20.802645Z + Data decompression + + + + + + + + + erin.calhoun + 2023-03-02T11:36:47.384246Z + Data integration + + + + + + + + + + + + + + + bianchini + 2023-03-02T12:02:42.14696Z + Metadata recording + Data digitalisation + Data digitalization + + + + + + + + + bianchini + 2023-03-02T12:34:18.915934Z + Data storing + Data storage + + + + + + + + + + + + + + + + + + + + + Describe the operations that you will be performing to acquire, manage, document, store, share, and preserve your data. This includes budgeting for these operation when required. + Data management planning + + + + + + + + + erin.calhoun + 2023-02-27T10:14:48.086993Z + Data communication + Data migration + Move or transmit data between locations or devices. 
+ Data transmission + Transmission + Data synchronisation + Data synchronization + Data syncing + Data exchange + Data sharing + Data transfer + + + + + + + + + + + + + + + + + + + erin.calhoun + 2023-02-28T09:02:32.952438Z + Generalized linear model + Multiple regression + Multivariate regression + Predictive modeling + Regression analysis + Statistical learning + Supervised learning + Linear model + Linear regression analysis + Linear regression method + Simple linear regression + Polynomial regression + Regularized linear regression + Gradient descent + Lasso regression + Multiple linear regression + Multivariate linear regression + Multivariate statistics + Nonlinear regression + Ordinary least squares + Regularization + Ridge regression + Linear regression + + Need to sort out broad/narrow/related synonyms. + + + + + + + + + + erin.calhoun + 2023-02-28T09:02:44.379994Z + Regression analysis + Statistical learning + Logistic model + Logistic regression model + Regularized logistic regression + Binary classifier + Gradient descent + Linear classifier + Maximum-likelihood estimation + Perceptron (artificial neural network) + Regularization + Logistic regression + + + + + + + + + + + erin.calhoun + 2023-02-28T09:17:27.076601Z + Independent component analysis + + + + + + + + + erin.calhoun + 2023-02-28T09:17:41.251746Z + Non-negative matrix factorization + + + + + + + + + erin.calhoun + 2023-02-28T10:03:21.925425Z + Factor analysis + + + + + + + + + erin.calhoun + 2023-02-28T15:41:35.868216Z + Regression analysis + Non-linear model + Non-linear regression + Nonlinear model + Nonlinear regression + + + + + + + + + + + + + + + + + + + + + + + + A related search term with a different scope + Matúš Kalaš + 2023-02-24T09:45:21.41427Z + Slightly broader meaning + A TEMPLATE for Operation concepts in EDAM. + The same thing (TSG) + Slightly narrower meaning + Mostly overlapping, but not exact, narrower, or broader. + Mandatory when released: rdfs:label, hasDefinition. 
+Mandatory but can be semi-automated: Created in, subsets, ... +Optional: rdfs:comment(s), synonyms and related terms; has topic, has input/output (what fits); rdfs:seeAlso to a Wikipedia article and a match link to a WikiData item (if these exist) +Removed for release: created_by, creation_date, skos:editorialNote(s) + Optional, zero or more. A comment adds important information to the definition, synonyms, external links. May also be "not to be confused with". + {Operation TEMPLATE} + + SKOS 'editorial note' comment is just an editorial comment that will not be released. E.g. TODO - Improve this TEMPLATE! + + + + + - beta12orEarlier - true + beta12orEarlier + true A category denoting a rather broad domain or field of interest, of study, application, work, data, or technology. Topics have no clearly defined borders between each other. sumo:FieldOfStudy - - + + Topic http://bioontology.org/ontologies/ResearchArea.owl#Area_of_Research http://onto.eva.mpg.de/ontologies/gfo-bio.owl#Method @@ -53532,16 +54278,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The processing and analysis of nucleic acid sequence, structural and other data. Nucleic acid bioinformatics Nucleic acid informatics Nucleic_acids Nucleic acid physicochemistry Nucleic acid properties - - + + Nucleic acids http://purl.bioontology.org/ontology/MSH/D017422 @@ -53554,15 +54300,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Archival, processing and analysis of protein data, typically molecular sequence and structural data. 
Protein bioinformatics Protein informatics Proteins Protein databases - - + + Proteins http://purl.bioontology.org/ontology/MSH/D020539 @@ -53574,11 +54320,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + The structures of reactants or products of metabolism, for example small molecules such as including vitamins, polyols, nucleotides and amino acids. - + Metabolites true @@ -53590,16 +54336,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The archival, processing and analysis of molecular sequences (monomer composition of polymers) including molecular sequence data resources, sequence sites, alignments, motifs and profiles. Sequences Sequence_analysis Biological sequences Sequence databases - - - + + + Sequence analysis http://purl.bioontology.org/ontology/MSH/D017421 @@ -53611,8 +54357,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The curation, processing, analysis and prediction of data about the structure of biological molecules, typically proteins and nucleic acids and other macromolecules. Biomolecular structure Structural bioinformatics @@ -53622,9 +54368,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Structure data resources Structure databases Structures - - - + + + This includes related concepts such as structural properties, alignments and structural motifs. 
Structure analysis @@ -53637,8 +54383,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The prediction of molecular structure, including the prediction, modelling, recognition or design of protein secondary or tertiary structure or other structural features, and the folding of nucleic acid molecules and the prediction or design of nucleic acid (typically RNA) sequences with specific conformations. Structure_prediction DNA structure prediction @@ -53648,8 +54394,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Protein fold recognition Protein structure prediction RNA structure prediction - - + + This includes the recognition (prediction and assignment) of known protein structural domains or folds in protein sequence(s), for example by threading, or the alignment of molecular sequences to structures, structural (3D) profiles or templates (representing a structure or structure alignment). Structure prediction @@ -53662,13 +54408,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The alignment (equivalence between sites) of molecular sequences, structures or profiles (representing a sequence or structure alignment). - + Alignment true @@ -53682,18 +54428,18 @@ ows re-sequencing of complete genomes of any given organism with high resolution - + - + - beta12orEarlier - true + beta12orEarlier + true The study of evolutionary relationships amongst organisms. 
Phylogeny Phylogenetic clocks @@ -53701,9 +54447,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Phylogenetic simulation Phylogenetic stratigraphy Phylogeny reconstruction - - - + + + This includes diverse phylogenetic methods, including phylogenetic tree construction, typically from molecular sequence or morphological data, methods that simulate DNA sequence evolution, a phylogenetic tree or the underlying data, or which estimate or use molecular clock and stratigraphic (age) data, methods for studying gene evolution etc. Phylogeny @@ -53717,13 +54463,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The study of gene or protein functions and their interactions in totality in a given organism, tissue, cell etc. Functional_genomics - - - + + + Functional genomics @@ -53734,8 +54480,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The conceptualisation, categorisation and nomenclature (naming) of entities or phenomena within biology or bioinformatics. This includes formal ontologies, controlled vocabularies, structured glossary, symbols and terminology or other related resource. Ontology_and_terminology Applied ontology @@ -53744,9 +54490,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Ontology relations Terminology Upper ontology - - - + + + Ontology and terminology http://purl.bioontology.org/ontology/MSH/D002965 @@ -53758,13 +54504,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + The search and query of data sources (typically databases or ontologies) in order to retrieve entries or other information. 
- + Information retrieval true @@ -53776,30 +54522,30 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - beta12orEarlier - true + 1.3 + beta12orEarlier + true VT 1.5.12 Computational biology VT 1.5.19 Mathematical biology VT 1.5.26 Theoretical biology VT 1.5.6 Bioinformatics The archival, curation, processing and analysis of complex biological data. The development and application of theory, analytical methods, mathematical models and computational simulation of biological systems. + Computational biology Bioinformatics Computational_biology Biomathematics Mathematical biology Theoretical biology - - - + + + This includes data processing in general, including basic handling of files and databases, datatypes, workflows and annotation. This includes the modeling and treatment of biological processes and systems in mathematical terms (theoretical biology). Bioinformatics http://purl.bioontology.org/ontology/MSH/D016247 - Computational biology @@ -53808,15 +54554,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Computer graphics - Separate? Or broader CGraphics & visualisation? VT 1.2.5 Computer graphics Rendering (drawing on a computer screen) or visualisation of molecular sequences, structures or other biomolecular data. Data rendering - TODO operation? Data_visualisation - - + + Scientific? and technical? visualisation @@ -53828,12 +54574,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The study of the thermodynamic properties of a nucleic acid. 
- + Nucleic acid thermodynamics true @@ -53845,7 +54591,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier The archival, curation, processing and analysis of nucleic acid structural information, such as whole structures, structural features and alignments, and associated annotation. Nucleic acid structure Nucleic_acid_structure_analysis @@ -53856,8 +54602,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution RNA alignment RNA structure RNA structure alignment - - + + Includes secondary and tertiary nucleic acid structural data, nucleic acid thermodynamic, thermal and conformational properties including DNA or DNA/RNA denaturation (melting) etc. Nucleic acid structure analysis @@ -53869,12 +54615,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier RNA sequences and structures. RNA Small RNA - - + + RNA @@ -53885,12 +54631,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Topic for the study of restriction enzymes, their cleavage sites and the restriction of nucleic acids. - + Nucleic acid restriction true @@ -53901,16 +54647,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The mapping of complete (typically nucleotide) sequences. Mapping (in the sense of short read alignment, or more generally, just alignment) has application in RNA-Seq analysis (mapping of transcriptomics reads), variant discovery (e.g. mapping of exome capture), and re-sequencing (mapping of WGS reads). 
Mapping Genetic linkage Linkage Linkage mapping Synteny - - + + This includes resources that aim to identify, map or analyse genetic markers in DNA sequences, for example to produce a genetic (linkage) map of a chromosome or genome or to analyse genetic linkage and synteny. It also includes resources for physical (sequence) maps of a DNA sequence showing the physical distance (base pairs) between features or landmarks such as restriction sites, cloned DNA fragments, genes and other genetic markers. It also covers for example the alignment of sequences of (typically millions) of short reads to a reference genome. Mapping @@ -53922,12 +54668,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The study of codon usage in nucleotide sequence(s), genetic codes and so on. - + Genetic codes and codon usage true @@ -53938,13 +54684,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier The translation of mRNA into protein and subsequent protein processing in the cell. Protein_expression Translation - - - + + + Protein expression @@ -53955,12 +54701,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Methods that aims to identify, predict, model or analyse genes or gene structure in DNA sequences. - + This includes the study of promoters, coding regions, splice sites, etc. Methods for gene prediction might be ab initio, based on phylogenetic comparisons, use motifs, sequence features, support vector machine, alignment etc. Gene finding true @@ -53972,12 +54718,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The transcription of DNA into mRNA. 
- + Transcription true @@ -53988,12 +54734,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Promoters in DNA sequences (region of DNA that facilitates the transcription of a particular gene by binding RNA polymerase and transcription factor proteins). - + Promoters true @@ -54004,11 +54750,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The folding (in 3D space) of nucleic acid molecules. - + Nucleic acid folding true @@ -54020,14 +54766,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Gene structure, regions which make an RNA product and features such as promoters, coding regions, gene fusion, splice sites etc. Gene features Gene_structure Fusion genes - - + + This includes the study of promoters, coding regions etc. This incudes operons (operators, promoters and genes) from a bacterial genome. For example the operon leader and trailer gene, gene composition of the operon and associated information. Gene structure @@ -54040,8 +54786,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Protein and peptide identification, especially in the study of whole proteomes of organisms. Proteomics Bottom-up proteomics @@ -54054,9 +54800,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Quantitative proteomics Targeted proteomics Top-down proteomics - - - + + + Includes metaproteomics: proteomics analysis of an environmental sample. 
Proteomics includes any methods (especially high-throughput) that separate, characterize and identify expressed proteins such as mass spectrometry, two-dimensional gel electrophoresis and protein microarrays, as well as in-silico methods that perform proteolytic or mass calculations on a protein sequence and other analyses of protein production data, for example in different cells or tissues. Proteomics @@ -54071,13 +54817,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The elucidation of the three dimensional structure for all (available) proteins in a given organism. Structural_genomics - - - + + + Structural genomics @@ -54088,14 +54834,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The study of the physical and biochemical properties of peptides and proteins, for example the hydrophobic, hydrophilic and charge properties of a protein. Protein physicochemistry Protein_properties Protein hydropathy - - + + Protein properties @@ -54106,8 +54852,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Protein-protein, protein-DNA/RNA and protein-ligand interactions, including analysis of known interactions and prediction of putative interactions. Protein_interactions Protein interaction map @@ -54120,8 +54866,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Protein-ligand interactions Protein-nucleic acid interactions Protein-protein interactions - - + + This includes experimental (e.g. yeast two-hybrid) and computational analysis techniques. 
Protein interactions @@ -54133,8 +54879,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Protein stability, folding (in 3D space) and protein sequence-structure-function relationships. This includes for example study of inter-atomic or inter-residue interactions in protein (3D) structures, the effect of mutation, and the design of proteins with specific properties, typically by designing changes (via site-directed mutagenesis) to an existing protein. Protein_folding_stability_and_design Protein design @@ -54142,8 +54888,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Protein residue interactions Protein stability Rational protein design - - + + Protein folding, stability and design @@ -54155,12 +54901,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Two-dimensional gel electrophoresis image and related data. - + Two-dimensional gel electrophoresis true @@ -54171,7 +54917,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier An analytical chemistry technique that measures the mass-to-charge ratio and abundance of ions in the gas phase. TODO: See overlap with/distinction from CHMO, e.g. quadrupole time-of-flight mass spectrometry [CHMO:00027], or notably “laser ablation inductively coupled plasma time-of-flight mass spectrometry” or LA-ICP-TOFMS [CHMO:0000551] (Palmblad et al. 2022 says "could just as well be annotated by connecting the individual terms “laser ablation” (CHMO:0001132), “plasma ionisation” (CHMO:0001665) and "time-of-flight mass spectrometry" (CHMO:0000580")) Mass spectrometry @@ -54183,12 +54929,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Protein microarray data. 
- + Protein microarrays true @@ -54199,12 +54945,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The study of the hydrophobic, hydrophilic and charge properties of a protein. - + Protein hydropathy true @@ -54215,14 +54961,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier The study of how proteins are transported within and without the cell, including signal peptides, protein subcellular localisation and export. Protein_targeting_and_localisation Protein localisation Protein sorting Protein targeting - - + + Protein targeting and localisation @@ -54233,12 +54979,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Enzyme or chemical cleavage sites and proteolytic or mass calculations on a protein sequence. - + Protein cleavage sites and proteolysis true @@ -54249,11 +54995,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The comparison of two or more protein structures. - + Use this concept for methods that are exclusively for protein structure. Protein structure comparison @@ -54266,12 +55012,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The processing and analysis of inter-atomic or inter-residue interactions in protein (3D) structures. - + Protein residue interactions true @@ -54282,12 +55028,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Protein-protein interactions, individual interactions and networks, protein complexes, protein functional coupling etc. 
- + Protein-protein interactions true @@ -54298,12 +55044,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Protein-ligand (small molecule) interactions. - + Protein-ligand interactions true @@ -54314,12 +55060,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Protein-DNA/RNA interactions. - + Protein-nucleic acid interactions true @@ -54330,12 +55076,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The design of proteins with specific properties, typically by designing changes (via site-directed mutagenesis) to an existing protein. - + Protein design true @@ -54346,12 +55092,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + G-protein coupled receptors (GPCRs). - + G protein-coupled receptors (GPCR) true @@ -54362,13 +55108,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Carbohydrates, typically including structural information. Carbohydrates - - - Carbohydrates + + + Carbohydrates / Glycans? @@ -54378,15 +55124,21 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + + beta12orEarlier + true + Lipidome + Lipids Lipids and their structures. - Lipidomics Lipids - - - Lipids + + + Lipidomics + + + + @@ -54395,8 +55147,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Small molecules of biological significance, typically archival, curation, processing and analysis of structural information. 
Small_molecules Amino acids @@ -54412,8 +55164,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Toxins Toxins and targets CHEBI:23367 - - + + Small molecules include organic molecules, metal-organic compounds, small polypeptides, small polysaccharides and oligonucleotides. Structural data is usually included. This concept excludes macromolecules such as proteins and nucleic acids. This includes the structures of drugs, drug target, their interactions and binding affinities. Also the structures of reactants or products of metabolism, for example small molecules such as including vitamins, polyols, nucleotides and amino acids. Also the physicochemical, biochemical or structural properties of amino acids or peptides. Also structural and associated data for toxic chemical substances. @@ -54428,13 +55180,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Edit, convert or otherwise change a molecular sequence, either randomly or specifically. - + Sequence editing true @@ -54445,8 +55197,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The archival, processing and analysis of the basic character composition of molecular sequences, for example character or word frequency, ambiguity, complexity, particularly regions of low complexity, and repeats or the repetitive nature of molecular sequences. Sequence_composition_complexity_and_repeats Low complexity sequences @@ -54457,8 +55209,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Sequence complexity Sequence composition Sequence repeats - - + + This includes repetitive elements within a nucleic acid sequence, e.g. long terminal repeats (LTRs); sequences (typically retroviral) directly repeated at both ends of a sequence and other types of repeating unit. 
This includes short repetitive subsequences (repeat sequences) in a protein sequence. Sequence composition, complexity and repeats @@ -54470,12 +55222,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Conserved patterns (motifs) in molecular sequences, that (typically) describe functional or other key sites. - + Sequence motifs true @@ -54486,11 +55238,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + The comparison of two or more molecular sequences, for example sequence alignment and clustering. - + The comparison might be on the basis of sequence, physico-chemical or some other properties of the sequences. Sequence comparison @@ -54503,8 +55255,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The archival, detection, prediction and analysis of positional features such as functional and other key sites, in molecular sequences and the conserved patterns (motifs, profiles etc.) that may be used to describe them. Sequence_sites_features_and_motifs Functional sites @@ -54513,8 +55265,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Sequence motifs Sequence profiles Sequence sites - - + + Sequence sites, features and motifs @@ -54524,12 +55276,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Search and retrieve molecular sequences that are similar to a sequence-based query (typically a simple sequence). - + The query is a sequence-based entity such as another sequence, a motif or profile. 
Sequence database search true @@ -54541,11 +55293,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.7 - + beta12orEarlier + 1.7 + The comparison and grouping together of molecular sequences on the basis of their similarities. - + This includes systems that generate, process and analyse sequence clusters. Sequence clustering @@ -54558,8 +55310,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Structural features or common 3D motifs within protein structures, including the surface of a protein structure, such as biological interfaces with other molecules. Protein 3D motifs Protein_structural_motifs_and_surfaces @@ -54567,8 +55319,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Protein structural motifs Protein surfaces Structural motifs - - + + This includes conformation of conserved substructures, conserved geometry (spatial arrangement) of secondary structure or protein backbone, solvent-exposed surfaces, internal cavities, the analysis of shape, hydropathy, electrostatic patches, role and functions etc. Protein structural motifs and surfaces @@ -54579,12 +55331,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The processing, analysis or use of some type of structural (3D) profile or template; a computational entity (typically a numerical matrix) that is derived from and represents a structure or structure alignment. - + Structural (3D) profiles true @@ -54595,11 +55347,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + The prediction, modelling, recognition or design of protein secondary or tertiary structure or other structural features. 
- + Protein structure prediction true @@ -54611,11 +55363,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + The folding of nucleic acid molecules and the prediction or design of nucleic acid (typically RNA) sequences with specific conformations. - + Nucleic acid structure prediction true @@ -54627,11 +55379,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.7 - + beta12orEarlier + 1.7 + The prediction of three-dimensional structure of a (typically protein) sequence from first principles, using a physics-based or empirical scoring function and without using explicit structural templates. - + Ab initio structure prediction true @@ -54643,12 +55395,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.4 - + beta12orEarlier + 1.4 + The modelling of the three-dimensional structure of a protein using known sequence and structural data. - + Homology modelling true @@ -54660,15 +55412,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Molecular flexibility Molecular motions The study and simulation of molecular (typically protein) conformation using a computational model of physical forces and computer simulation. Molecular_dynamics Protein dynamics - - + + This includes methods such as Molecular Dynamics, Coarse-grained dynamics, metadynamics, Quantum Mechanics, QM/MM, Markov State Models, etc. This includes resources concerning flexibility and motion in protein and other molecular structures. Molecular dynamics @@ -54680,12 +55432,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true - 1.12 - + beta12orEarlier + true + 1.12 + The modelling the structure of proteins in complex with small molecules or other macromolecules. 
- + Molecular docking true @@ -54697,11 +55449,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The prediction of secondary or supersecondary structure of protein sequences. - + Protein secondary structure prediction true @@ -54713,11 +55465,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The prediction of tertiary structure of protein sequences. - + Protein tertiary structure prediction true @@ -54729,11 +55481,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + The recognition (prediction and assignment) of known protein structural domains or folds in protein sequence(s). - + Protein fold recognition true @@ -54745,11 +55497,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.7 - + beta12orEarlier + 1.7 + The alignment of molecular sequences or sequence profiles (representing sequence alignments). - + This includes the generation of alignments (the identification of equivalent sites), the analysis of alignments, editing, visualisation, alignment databases, the alignment (equivalence between sites) of sequence profiles (representing sequence alignments) and so on. Sequence alignment @@ -54762,11 +55514,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.7 - + beta12orEarlier + 1.7 + The superimposition of molecular tertiary structures or structural (3D) profiles (representing a structure or structure alignment). - + This includes the generation, storage, analysis, rendering etc. of structure alignments. 
Structure alignment @@ -54779,11 +55531,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The alignment of molecular sequences to structures, structural (3D) profiles or templates (representing a structure or structure alignment). - + Threading true @@ -54795,12 +55547,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Sequence profiles; typically a positional, numerical matrix representing a sequence alignment. - + Sequence profiles include position-specific scoring matrix (position weight matrix), hidden Markov models etc. Sequence profiles and HMMs true @@ -54812,12 +55564,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The reconstruction of a phylogeny (evolutionary relatedness amongst organisms), for example, by building a phylogenetic tree. - + Currently too specific for the topic sub-ontology (but might be unobsoleted). Phylogeny reconstruction true @@ -54832,17 +55584,17 @@ ows re-sequencing of complete genomes of any given organism with high resolution - + - beta12orEarlier - true + beta12orEarlier + true The integrated study of evolutionary relationships and whole genome data, for example, in the analysis of species trees, horizontal gene transfer and evolutionary reconstruction. Phylogenomics - - - + + + Phylogenomics @@ -54853,12 +55605,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Simulated polymerase chain reaction (PCR). - + Virtual PCR true @@ -54869,13 +55621,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The assembly of fragments of a DNA sequence to reconstruct the original sequence. 
Sequence_assembly Assembly - - + + Assembly has two broad types, de-novo and re-sequencing. Re-sequencing is a specialised case of assembly, where an assembled (typically de-novo assembled) reference genome is available and is about 95% identical to the re-sequenced genome. All other cases of assembly are 'de-novo'. Sequence assembly @@ -54888,8 +55640,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Stable, naturally occuring mutations in a nucleotide sequence including alleles, naturally occurring mutations such as single base nucleotide substitutions, deletions and insertions, RFLPs and other polymorphisms. DNA variation Genetic_variation @@ -54897,8 +55649,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Mutation Polymorphism Somatic mutations - - + + Genetic variation http://purl.bioontology.org/ontology/MSH/D014644 @@ -54910,12 +55662,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Microarrays, for example, to process microarray data or design probes and experiments. - + Microarrays http://purl.bioontology.org/ontology/MSH/D046228 true @@ -54927,16 +55679,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true VT 3.1.7 Pharmacology and pharmacy The study of drugs and their effects or responses in living systems. Pharmacology Computational pharmacology Pharmacoinformatics - - - + + + Pharmacology @@ -54947,8 +55699,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true http://edamontology.org/topic_0197 The analysis of levels and patterns of synthesis of gene products (proteins and functional RNA) including interpretation in functional terms of gene expression data. 
Expression @@ -54960,9 +55712,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Gene transcription Gene translation Transcription - - - + + + Gene expression levels are analysed by identifying, quantifying or comparing mRNA transcripts, for example using microarrays, RNA-seq, northern blots, gene-indexed expression profiles etc. This includes the study of codon usage in nucleotide sequence(s), genetic codes and so on. Gene expression @@ -54976,12 +55728,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The regulation of gene expression. Regulatory genomics - - + + Gene regulation @@ -54993,14 +55745,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The influence of genotype on drug response, for example by correlating gene expression or single-nucleotide polymorphisms with drug efficacy or toxicity. Pharmacogenomics Pharmacogenetics - - - + + + Pharmacogenomics @@ -55012,15 +55764,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true VT 3.1.4 Medicinal chemistry The design and chemical synthesis of bioactive molecules, for example drugs or potential drug compounds, for medicinal purposes. Drug design Medicinal_chemistry - - - + + + This includes methods that search compound collections, generate or analyse drug 3D conformations, identify drug targets with structural docking etc. Medicinal chemistry @@ -55032,12 +55784,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Information on a specific fish genome including molecular sequences, genes and annotation. 
- + Fish true @@ -55048,12 +55800,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Information on a specific fly genome including molecular sequences, genes and annotation. - + Flies true @@ -55064,13 +55816,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). - 1.17 - + beta12orEarlier + (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). + 1.17 + Information on a specific mouse or rat genome including molecular sequences, genes and annotation. - + The resource may be specific to a group of mice / rats or all mice / rats. Mice or rats true @@ -55082,12 +55834,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Information on a specific worm genome including molecular sequences, genes and annotation. - + Worms true @@ -55098,11 +55850,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The processing and analysis of the bioinformatics literature and bibliographic data, such as literature search and query. - + Literature analysis true @@ -55115,7 +55867,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier The processing and analysis of natural language, such as scientific literature in English, in order to extract data and information, or to enable human-computer interaction. 
NLP Natural_language_processing @@ -55124,9 +55876,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Text analytics Text data mining Text mining - - - + + + Natural language processing @@ -55139,15 +55891,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier Deposition and curation of database accessions, including annotation, typically with terms from a controlled vocabulary. Data_submission_annotation_and_curation Data curation Data provenance Database curation - - - + + + Data submission, annotation, and curation @@ -55157,11 +55909,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + The management and manipulation of digital documents, including database records, files and reports. - + Document, record and content management true @@ -55173,12 +55925,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Annotation of a molecular sequence. - + Sequence annotation true @@ -55189,14 +55941,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Annotation of a genome. - + Genome annotation true @@ -55208,7 +55960,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier Spectroscopy An analytical technique that exploits the magenetic properties of certain atomic nuclei to provide information on the structure, dynamics, reaction state and chemical environment of molecules. 
NMR spectroscopy @@ -55220,9 +55972,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution Nuclear Overhauser Effect Spectroscopy ROESY Rotational Frame Nuclear Overhauser Effect Spectroscopy - - - + + + + imaging-revise NMR @@ -55233,11 +55986,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.12 - + beta12orEarlier + 1.12 + The classification of molecular sequences based on some measure of their similarity. - + Methods including sequence motifs, profile and other diagnostic elements which (typically) represent conserved patterns (of residues or properties) in molecular sequences. Sequence classification @@ -55250,12 +56003,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + primarily the classification of proteins (from sequence or structural data) into clusters, groups, families etc. - + Protein classification true @@ -55266,12 +56019,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Sequence motifs, or sequence profiles derived from an alignment of molecular sequences of a particular type. - + This includes comparison, discovery, recognition etc. of sequence motifs. Sequence motif or profile true @@ -55283,8 +56036,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Protein chemical modifications, e.g. post-translational modifications. PTMs Post-translational modifications @@ -55295,8 +56048,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Protein post-translational modifications GO:0006464 MOD:00000 - - + + EDAM does not describe all possible protein modifications. 
For fine-grained annotation of protein modification use the Gene Ontology (children of concept GO:0006464) and/or the Protein Modifications ontology (children of concept MOD:00000) Protein modifications @@ -55308,8 +56061,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true http://edamontology.org/topic_3076 Molecular interactions, biological pathways, networks and other models. Molecular_interactions_pathways_and_networks @@ -55329,9 +56082,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Pathways Signal transduction pathways Signaling pathways - - - + + + Molecular interactions, pathways and networks @@ -55343,9 +56096,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - beta12orEarlier - true + 1.3 + beta12orEarlier + true VT 1.2 Computer sciences VT 1.2.99 Other VT 1.3 Information sciences @@ -55368,9 +56121,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Information management Knowledge management Scientific computing - - - + + + Informatics @@ -55382,11 +56135,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Data resources for the biological or biomedical literature, either a primary source of literature or some derivative. - + Literature data resources true @@ -55398,14 +56151,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Laboratory management and resources, for example, catalogues of biological resources for use in the lab including cell lines, viruses, plasmids, phages, DNA probes and primers and so on. 
Laboratory_Information_management Laboratory resources - - - + + + Laboratory information management @@ -55416,12 +56169,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + General cell culture or data on a specific cell lines. - + Cell and tissue culture true @@ -55433,21 +56186,22 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true + Ecosystem sciences VT 1.5.15 Ecology The ecological and environmental sciences and especially the application of information technology (ecoinformatics). TODO: OMDG! Ecology Computational ecology Ecoinformatics Ecological informatics - Ecosystem science - - - + + + Ecology http://purl.bioontology.org/ontology/MSH/D004777 + @@ -55457,7 +56211,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier Electron diffraction experiment The study of matter by studying the interference pattern from firing electrons at a sample, to analyse structures at resolutions higher than can be achieved using light. Electron_microscopy @@ -55467,9 +56221,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution Single particle electron microscopy TEM Transmission electron microscopy - - - + + + + imaging-revise Electron microscopy @@ -55480,12 +56235,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + The cell cycle including key genes and proteins. - + Cell cycle true @@ -55496,11 +56251,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + The physicochemical, biochemical or structural properties of amino acids or peptides. 
- + Peptides and amino acids true @@ -55512,12 +56267,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + A specific organelle, or organelles in general, typically the genes and proteins (or genome and proteome). - + Organelles true @@ -55528,12 +56283,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Ribosomes, typically of ribosome-related genes and proteins. - + Ribosomes true @@ -55544,12 +56299,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + A database about scents. - + Scents true @@ -55560,11 +56315,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + The structures of drugs, drug target, their interactions and binding affinities. - + Drugs and target structures true @@ -55576,14 +56331,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true A specific organism, or group of organisms, used to study a particular aspect of biology. Organisms Model_organisms - - - + + + This may include information on the genome (including molecular sequences and map, genes and annotation), proteome, as well as more general information about an organism. Model organisms @@ -55595,8 +56350,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Whole genomes of one or more organisms, or genomes in general, such as meta-information on genomes, genome projects, gene names etc. 
Genomics Exomes @@ -55606,9 +56361,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Synthetic genomics Viral genomics Whole genomes - - - + + + Genomics http://purl.bioontology.org/ontology/MSH/D023281 @@ -55621,8 +56376,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Particular gene(s), gene family or other gene group or system and their encoded proteins.Primarily the classification of proteins (from sequence or structural data) into clusters, groups, families etc., curation of a particular protein or protein family, or any other proteins that have been classified as members of a common group. Genes, gene family or system Gene_and protein_families @@ -55631,9 +56386,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Gene system Protein families Protein sequence classification - - - + + + A protein families database might include the classifier (e.g. a sequence profile) used to build the classification. Gene and protein families @@ -55646,11 +56401,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Study of chromosomes. - + Chromosomes true @@ -55662,8 +56417,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The study of genetic constitution of a living entity, such as an individual, and organism, a cell and so on, typically with respect to a particular observable phenotypic traits, or resources concerning such traits, which might be an aspect of biochemistry, physiology, morphology, anatomy, development and so on. 
Genotype and phenotype resources Genotype-phenotype @@ -55673,9 +56428,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Genotyping Phenotype Phenotyping - - - + + + Genotype and phenotype @@ -55686,12 +56441,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Gene expression e.g. microarray data, northern blots, gene-indexed expression profiles etc. - + Gene expression and microarray true @@ -55702,15 +56457,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Molecular probes (e.g. a peptide probe or DNA microarray probe) or PCR primers and hybridisation oligos in a nucleic acid sequence. Probes_and_primers Primer quality Primers Probes - - + + This includes the design of primers for PCR and DNA amplification or the design of molecular probes. Probes and primers http://purl.bioontology.org/ontology/MSH/D015335 @@ -55722,15 +56477,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true VT 3.1.6 Pathology Diseases, including diseases in general and the genes, gene variations and proteins involved in one or more specific diseases. Disease Pathology - - - + + + Pathology @@ -55741,12 +56496,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + A particular protein, protein family or other group of proteins. - + Specific protein resources true @@ -55757,13 +56512,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true VT 1.5.25 Taxonomy Organism classification, identification and naming. 
Taxonomy - - + + Taxonomy @@ -55774,11 +56529,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Archival, processing and analysis of protein sequences and sequence-based entities such as alignments, motifs and profiles. - + Protein sequence analysis true @@ -55790,11 +56545,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + The archival, processing and analysis of nucleotide sequences and and sequence-based entities such as alignments, motifs and profiles. - + Nucleic acid sequence analysis true @@ -55806,12 +56561,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The repetitive nature of molecular sequences. - + Repeat sequences true @@ -55822,12 +56577,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The (character) complexity of molecular sequences, particularly regions of low complexity. - + Low complexity sequences true @@ -55838,12 +56593,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + A specific proteome including protein sequences and annotation. - + Proteome true @@ -55854,14 +56609,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier DNA sequences and structure, including processes such as methylation and replication. DNA analysis DNA Ancient DNA Chromosomes - - + + The DNA sequences might be coding or non-coding sequences. 
DNA @@ -55873,11 +56628,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Protein-coding regions including coding sequences (CDS), exons, translation initiation sites and open reading frames - + Coding RNA true @@ -55890,8 +56645,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Non-coding or functional RNA sequences, including regulatory RNA sequences, ribosomal RNA (rRNA) and transfer RNA (tRNA). Functional_regulatory_and_non-coding_RNA Functional RNA @@ -55914,8 +56669,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution siRNA snRNA snoRNA - - + + Non-coding RNA includes piwi-interacting RNA (piRNA), small nuclear RNA (snRNA) and small nucleolar RNA (snoRNA). Regulatory RNA includes microRNA (miRNA) - short single stranded RNA molecules that regulate gene expression, and small interfering RNA (siRNA). Functional, regulatory and non-coding RNA @@ -55927,12 +56682,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + One or more ribosomal RNA (rRNA) sequences. - + rRNA true @@ -55943,12 +56698,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + One or more transfer RNA (tRNA) sequences. - + tRNA true @@ -55959,11 +56714,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Protein secondary structure or secondary structure alignments. - + This includes assignment, analysis, comparison, prediction, rendering etc. of secondary structure data. 
Protein secondary structure @@ -55976,12 +56731,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + RNA secondary or tertiary structure and alignments. - + RNA structure true @@ -55992,11 +56747,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.8 - + beta12orEarlier + 1.8 + Protein tertiary structures. - + Protein tertiary structure true @@ -56008,12 +56763,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Classification of nucleic acid sequences and structures. - + Nucleic acid classification true @@ -56024,11 +56779,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.14 - + beta12orEarlier + 1.14 + Primarily the classification of proteins (from sequence or structural data) into clusters, groups, families etc., curation of a particular protein or protein family, or any other proteins that have been classified as members of a common group. - + Protein families true @@ -56040,8 +56795,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Protein tertiary structural domains and folds in a protein or polypeptide chain. Protein_folds_and_structural_domains Intramembrane regions @@ -56052,8 +56807,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Protein topological domains Protein transmembrane regions Transmembrane regions - - + + This includes topological domains such as cytoplasmic regions in a protein. This includes trans- or intra-membrane regions of a protein, typically describing physicochemical properties of the secondary structure elements. 
For example, the location and size of the membrane spanning segments and intervening loop regions, transmembrane region IN/OUT orientation relative to the membrane, plus the following data for each amino acid: A Z-coordinate (the distance to the membrane center), the free energy of membrane insertion (calculated in a sliding window over the sequence) and a reliability score. The z-coordinate implies information about re-entrant helices, interfacial helices, the tilt of a transmembrane helix and loop lengths. Protein folds and structural domains @@ -56066,11 +56821,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Nucleotide sequence alignments. - + Nucleic acid sequence alignment true @@ -56082,12 +56837,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Protein sequence alignments. - + A sequence profile typically represents a sequence alignment. Protein sequence alignment true @@ -56099,13 +56854,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The archival, detection, prediction and analysis ofpositional features such as functional sites in nucleotide sequences. - + Nucleic acid sites and features true @@ -56116,13 +56871,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The detection, identification and analysis of positional features in proteins, such as functional sites. 
- + Protein sites and features true @@ -56135,7 +56890,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier Proteins that bind to DNA and control transcription of DNA to mRNA (transcription factors) and also transcriptional regulatory sites, elements and regions (such as promoters, enhancers, silencers and boundary elements / insulators) in nucleotide sequences. Transcription_factors_and_regulatory_sites -10 signals @@ -56155,8 +56910,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Transcription factor binding sites Transcription factors Transcriptional regulatory sites - - + + This includes CpG rich regions (isochores) in a nucleotide sequence. This includes promoters, CAAT signals, TATA signals, -35 signals, -10 signals, GC signals, primer binding sites for initiation of transcription or reverse transcription, enhancer, attenuator, terminators and ribosome binding sites. Transcription factor proteins either promote (as an activator) or block (as a repressor) the binding to DNA of RNA polymerase. Regulatory sites including transcription factor binding site as well as promoters, enhancers, silencers and boundary elements / insulators. @@ -56171,13 +56926,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.0 - + beta12orEarlier + 1.0 + Protein phosphorylation and phosphorylation sites in protein sequences. - + Phosphorylation sites true @@ -56188,11 +56943,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Metabolic pathways. - + Metabolic pathways true @@ -56204,11 +56959,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Signaling pathways. 
- + Signaling pathways true @@ -56220,12 +56975,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Protein and peptide identification - + Protein and peptide identification true @@ -56235,17 +56990,19 @@ ows re-sequencing of complete genomes of any given organism with high resolution + - beta12orEarlier - Biological or biomedical analytical workflows or pipelines. - Pipelines - Workflows - Software integration - Tool integration - Tool interoperability - - - Workflows + beta12orEarlier + Data workflows + Pipelines + Software integration + Tool integration + Tool interoperability + Biological or biomedical analytical workflows or pipelines. TODO + Data flow management + + + Workflow management @@ -56255,12 +57012,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.0 - + beta12orEarlier + 1.0 + Structuring data into basic types and (computational) objects. - + Data types and objects true @@ -56271,12 +57028,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Theoretical biology - + Theoretical biology true @@ -56287,12 +57044,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Mitochondria, typically of mitochondrial genes and proteins. - + Mitochondria true @@ -56303,7 +57060,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier Plant science VT 1.5.10 Botany VT 1.5.22 Plant science @@ -56318,11 +57075,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution Plant ecology Plant genetics Plant physiology - - + + The resource may be specific to a plant, a group of plants or all plants. - Plant biology + Plant biology? Plant science? (outside of Biology?) 
+ Is botany a synonym or narrower? ... than plant science or plant biology? @@ -56331,12 +57089,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier VT 1.5.28 Study of viruses, e.g. sequence and structural data, interactions of viral proteins, or a viral genome including molecular sequences, genes and annotation. Virology - - + + Virology @@ -56347,13 +57105,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). - 1.17 - + beta12orEarlier + (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). + 1.17 + Fungi and molds, e.g. information on a specific fungal genome including molecular sequences, genes and annotation. - + The resource may be specific to a fungus, a group of fungi or all fungi. Fungi true @@ -56365,13 +57123,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). Definition is wrong anyway. - 1.17 - + beta12orEarlier + (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). Definition is wrong anyway. + 1.17 + Pathogens, e.g. information on a specific vertebrate genome including molecular sequences, genes and annotation. - + TODO: Consider Model organism?!?!?!? Infectious disease rather? The resource may be specific to a pathogen, a group of pathogens or all pathogens. 
Pathogens @@ -56384,12 +57142,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Arabidopsis-specific data. - + Arabidopsis true @@ -56400,12 +57158,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Rice-specific data. - + Rice true @@ -56416,12 +57174,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Informatics resources that aim to identify, map or analyse genetic markers in DNA sequences, for example to produce a genetic (linkage) map of a chromosome or genome or to analyse genetic linkage and synteny. - + Genetic mapping and linkage true @@ -56432,13 +57190,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The study (typically comparison) of the sequence, structure or function of multiple genomes. Comparative_genomics - - - + + + Comparative genomics @@ -56449,12 +57207,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier Mobile genetic elements, such as transposons, Plasmids, Bacteriophage elements and Group II introns. Mobile_genetic_elements Transposons - - + + Mobile genetic elements @@ -56465,12 +57223,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Human diseases, typically describing the genes, mutations and proteins implicated in disease. 
- + Human disease true @@ -56481,14 +57239,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true VT 3.1.3 Immunology The application of information technology to immunology such as immunological processes, immunological genes, proteins and peptide ligands, antigens and so on. Immunology - - - + + + Immunology http://purl.bioontology.org/ontology/MSH/D007120 @@ -56501,15 +57259,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Lipoproteins (protein-lipid assemblies), and proteins or region of a protein that spans or are associated with a membrane. Membrane_and_lipoproteins Lipoproteins Membrane proteins Transmembrane proteins - - + + Membrane and lipoproteins @@ -56521,13 +57279,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true Proteins that catalyze chemical reaction, the kinetics of enzyme-catalysed reactions, enzyme nomenclature etc. Enzymology Enzymes - - + + Enzymes @@ -56538,11 +57296,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + PCR primers and hybridisation oligos in a nucleic acid sequence. - + Primers true @@ -56554,11 +57312,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Regions or sites in a eukaryotic and eukaryotic viral RNA sequence which directs endonuclease cleavage or polyadenylation of an RNA transcript. - + PolyA signal or sites true @@ -56570,11 +57328,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + CpG rich regions (isochores) in a nucleotide sequence. 
- + CpG island and isochores true @@ -56586,11 +57344,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Restriction enzyme recognition sites (restriction sites) in a nucleic acid sequence. - + Restriction sites true @@ -56602,13 +57360,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Splice sites in a nucleotide sequence or alternative RNA splicing events. - + Splice sites true @@ -56619,11 +57377,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Matrix/scaffold attachment regions (MARs/SARs) in a DNA sequence. - + Matrix/scaffold attachment sites true @@ -56635,11 +57393,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Operons (operators, promoters and genes) from a bacterial genome. - + Operon true @@ -56651,11 +57409,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Whole promoters or promoter elements (transcription start sites, RNA polymerase binding site, transcription factor binding sites, promoter enhancers etc) in a DNA sequence. - + Promoters true @@ -56667,17 +57425,17 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true VT 1.5.24 Structural biology The molecular structure of biological molecules, particularly macromolecules such as proteins and nucleic acids. 
Structural_biology Structural assignment Structural determination Structure determination - - - + + + This includes experimental methods for biomolecular structure determination, such as X-ray crystallography, nuclear magnetic resonance (NMR), circular dichroism (CD) spectroscopy, microscopy etc., including the assignment or modelling of molecular structure from such data. Structural biology @@ -56689,11 +57447,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Trans- or intra-membrane regions of a protein, typically describing physicochemical properties of the secondary structure elements. - + Protein membrane regions true @@ -56705,11 +57463,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + The comparison of two or more molecular structures, for example structure alignment and clustering. - + This might involve comparison of secondary or tertiary (3D) structural information. Structure comparison @@ -56722,16 +57480,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The study of gene and protein function including the prediction of functional properties of a protein. Functional analysis Function_analysis Protein function analysis Protein function prediction - - - + + + Function analysis @@ -56741,13 +57499,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). - 1.17 - + beta12orEarlier + (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). 
+ 1.17 + Specific bacteria or archaea, e.g. information on a specific prokaryote genome including molecular sequences, genes and annotation. - + The resource may be specific to a prokaryote, a group of prokaryotes or all prokaryotes. Prokaryotes and Archaea true @@ -56759,12 +57517,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Protein data resources. - + Protein databases true @@ -56775,12 +57533,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Experimental methods for biomolecular structure determination, such as X-ray crystallography, nuclear magnetic resonance (NMR), circular dichroism (CD) spectroscopy, microscopy etc., including the assignment or modelling of molecular structure from such data. - + Structure determination true @@ -56791,18 +57549,21 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true VT 1.5.11 Cell biology Cells, such as key genes and proteins involved in the cell cycle. + Cytology Cell_biology Cells Cellular processes Protein subcellular localization - - + + Cell biology + TODO. Fix definition and synonyms! + @@ -56811,12 +57572,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Topic focused on identifying, grouping, or naming things in a structured way according to some schema based on observable relationships. - + Classification true @@ -56827,12 +57588,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Lipoproteins (protein-lipid assemblies). 
- + Lipoproteins true @@ -56843,12 +57604,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Visualise a phylogeny, for example, render a phylogenetic tree. - + Phylogeny visualisation true @@ -56860,23 +57621,23 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - beta12orEarlier - true + 1.3 + beta12orEarlier + true VT 1.7.4 Computational chemistry The application of information technology to chemistry in biological research environment. Topic concerning the development and application of theory, analytical methods, mathematical models and computational simulation of chemical systems. Chemical informatics Chemoinformatics + Computational chemistry Cheminformatics Computational_chemistry - - - + + + Cheminformatics - Computational chemistry @@ -56885,19 +57646,19 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The holistic modelling and analysis of complex biological systems and the interactions therein. Systems_biology Biological modelling Biological system modelling Systems modelling - - - + + + This includes databases of models and methods to construct or analyse a model. Systems biology - + http://purl.bioontology.org/ontology/MSH/D049490 @@ -56907,7 +57668,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier The application of statistical methods to biological problems. 
Statistics_and_probability Bayesian methods @@ -56920,11 +57681,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution Probabilistic graphical model Probability Statistics - - - + + + Statistics and probability - + http://purl.bioontology.org/ontology/MSH/D056808 @@ -56936,12 +57697,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Search for and retrieve molecular structures that are similar to a structure-based query (typically another structure or part of a structure). - + The query is a structure-based entity such as another structure, a 3D (structural) motif, 3D profile or template. Structure database search true @@ -56953,8 +57714,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true The construction, analysis, evaluation, refinement etc. of models of a molecules properties or behaviour, including the modelling the structure of proteins in complex with small molecules or other macromolecules (docking). Molecular_modelling Comparative modelling @@ -56962,8 +57723,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Homology modeling Homology modelling Molecular docking - - + + Molecular modelling @@ -56974,12 +57735,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.2 - + beta12orEarlier + 1.2 + The prediction of functional properties of a protein. - + Protein function prediction true @@ -56990,11 +57751,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Single nucleotide polymorphisms (SNP) and associated data, for example, the discovery and annotation of SNPs. 
- + SNP true @@ -57006,13 +57767,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Predict transmembrane domains and topology in protein sequences. - + Transmembrane protein prediction true @@ -57023,13 +57784,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + The comparison two or more nucleic acid (typically RNA) secondary or tertiary structures. - + Use this concept for methods that are exclusively for nucleic acid structures. Nucleic acid structure comparison true @@ -57041,11 +57802,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Exons in a nucleotide sequences. - + Exons true @@ -57057,11 +57818,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Transcription of DNA into RNA including the regulation of transcription. - + Gene transcription true @@ -57074,11 +57835,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier DNA mutation. DNA_mutation - - + + DNA mutation @@ -57089,8 +57850,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true VT 3.2.16 Oncology The study of cancer, for example, genes and proteins implicated in cancer. Cancer biology @@ -57098,9 +57859,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Cancer Neoplasm Neoplasms - - - + + + Oncology @@ -57111,11 +57872,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Structural and associated data for toxic chemical substances. 
- + Toxins and targets true @@ -57127,11 +57888,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Introns in a nucleotide sequences. - + Introns true @@ -57143,11 +57904,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + A topic concerning primarily bioinformatics software tools, typically the broad function or purpose of a tool. - + Tool topic true @@ -57159,11 +57920,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + A general area of bioinformatics study, typically the broad scope or category of content of a bioinformatics journal or conference proceeding. - + Study topic true @@ -57175,12 +57936,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Biological nomenclature (naming), symbols and terminology. - + Nomenclature true @@ -57191,12 +57952,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + The genes, gene variations and proteins involved in one or more specific diseases. - + Disease genes and proteins true @@ -57208,16 +57969,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true http://edamontology.org/topic_3040 Protein secondary or tertiary structural data and/or associated annotation. 
Protein structure Protein_structure_analysis Protein tertiary structure - - - + + + Protein structure analysis @@ -57228,12 +57989,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier The study of human beings in general, including the human genome and proteome. Humans Human_biology - - + + Human biology @@ -57244,12 +58005,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Informatics resource (typically a database) primarily focussed on genes. - + Gene resources true @@ -57260,12 +58021,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Yeast, e.g. information on a specific yeast genome including molecular sequences, genes and annotation. - + Yeast true @@ -57276,13 +58037,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - (jison) Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). - 1.17 - + beta12orEarlier + (jison) Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). + 1.17 + Eukaryotes or data concerning eukaryotes, e.g. information on a specific eukaryote genome including molecular sequences, genes and annotation. - + The resource may be specific to a eukaryote, a group of eukaryotes or all eukaryotes. Eukaryotes true @@ -57294,13 +58055,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). 
- 1.17 - + beta12orEarlier + (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). + 1.17 + Invertebrates, e.g. information on a specific invertebrate genome including molecular sequences, genes and annotation. - + The resource may be specific to an invertebrate, a group of invertebrates or all invertebrates. Invertebrates true @@ -57312,13 +58073,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). - 1.17 - + beta12orEarlier + (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). + 1.17 + Vertebrates, e.g. information on a specific vertebrate genome including molecular sequences, genes and annotation. - + The resource may be specific to a vertebrate, a group of vertebrates or all vertebrates. Vertebrates true @@ -57330,13 +58091,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). - 1.17 - + beta12orEarlier + (jison)Out of EDAM scope. While very useful to have a basic set of IDs for organisms, should find a better way to provide this e.g. in bio.tools (NCBI taxon ID subset). + 1.17 + Unicellular eukaryotes, e.g. information on a unicellular eukaryote genome including molecular sequences, genes and annotation. - + The resource may be specific to a unicellular eukaryote, a group of unicellular eukaryotes or all unicellular eukaryotes. 
Unicellular eukaryotes true @@ -57348,12 +58109,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Protein secondary or tertiary structure alignments. - + Protein structure alignment true @@ -57365,15 +58126,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier The study of matter and their structure by means of the diffraction of X-rays, typically the diffraction pattern caused by the regularly spaced atoms of a crystalline sample. Crystallography X-ray_diffraction X-ray crystallography X-ray microscopy - - - + + + + imaging-revise X-ray diffraction @@ -57384,12 +58146,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Conceptualisation, categorisation and naming of entities or phenomena within biology or bioinformatics. - + Ontologies, nomenclature and classification http://purl.bioontology.org/ontology/MSH/D002965 true @@ -57402,16 +58164,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier + beta12orEarlier Immunity-related proteins and their ligands. Immunoproteins_and_antigens Antigens Immunopeptides Immunoproteins Therapeutic antibodies - - - + + + This includes T cell receptors (TR), major histocompatibility complex (MHC), immunoglobulin superfamily (IgSF) / antibodies, major histocompatibility complex superfamily (MhcSF), etc." Immunoproteins and antigens @@ -57424,13 +58186,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Specific molecules, including large molecules built from repeating subunits (macromolecules) and small molecules of biological significance. 
CHEBI:23367 - + Molecules true @@ -57443,16 +58205,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true VT 3.1.9 Toxicology Toxins and the adverse effects of these chemical substances on living organisms. Toxicology Computational toxicology Toxicoinformatics - - - + + + Toxicology @@ -57463,12 +58225,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta13 - + beta12orEarlier + beta13 + Parallelised sequencing processes that are capable of sequencing many thousands of sequences simultaneously. - + High-throughput sequencing true @@ -57479,11 +58241,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Gene regulatory networks. - + Gene regulatory networks true @@ -57495,12 +58257,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - beta12orEarlier - + beta12orEarlier + beta12orEarlier + Informatics resources dedicated to one or more specific diseases (not diseases in general). - + Disease (specific) true @@ -57511,11 +58273,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Variable number of tandem repeat (VNTR) polymorphism in a DNA sequence. - + VNTR true @@ -57527,12 +58289,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Microsatellite polymorphism in a DNA sequence. - + Microsatellites true @@ -57544,12 +58306,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.13 - + beta12orEarlier + 1.13 + Restriction fragment length polymorphisms (RFLP) in a DNA sequence. 
- + RFLP true @@ -57562,8 +58324,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - true + beta12orEarlier + true DNA polymorphism. DNA_polymorphism Microsatellites @@ -57573,8 +58335,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution VNTR Variable number of tandem repeat polymorphism snps - - + + Includes microsatellite polymorphism in a DNA sequence. A microsatellite polymorphism is a very short subsequence that is repeated a variable number of times between individuals. These repeats consist of the nucleotides cytosine and adenosine. Includes restriction fragment length polymorphisms (RFLP) in a DNA sequence. An RFLP is defined by the presence or absence of a specific restriction site of a bacterial restriction enzyme. Includes single nucleotide polymorphisms (SNP) and associated data, for example, the discovery and annotation of SNPs. A SNP is a DNA sequence variation where a single nucleotide differs between members of a species or paired chromosomes in an individual. @@ -57589,12 +58351,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta12orEarlier - 1.3 - + beta12orEarlier + 1.3 + Topic for the design of nucleic acid sequences with specific conformations. - + Nucleic acid design true @@ -57605,12 +58367,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + The design of primers for PCR and DNA amplification or the design of molecular probes. - + Primer or probe design true @@ -57621,12 +58383,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.2 - + beta13 + 1.2 + Molecular secondary or tertiary (3D) structural data resources, typically of proteins and nucleic acids. 
- + Structure databases true @@ -57637,12 +58399,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.2 - + beta13 + 1.2 + Nucleic acid (secondary or tertiary) structure, such as whole structures, structural features and associated annotation. - + Nucleic acid structure true @@ -57653,12 +58415,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + Molecular sequence data resources, including sequence sites, alignments, motifs and profiles. - + Sequence databases true @@ -57669,12 +58431,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + Nucleotide sequences and associated concepts such as sequence sites, alignments, motifs and profiles. - + Nucleic acid sequences true @@ -57685,11 +58447,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + Protein sequences and associated concepts such as sequence sites, alignments, motifs and profiles. - + Protein sequences true @@ -57701,12 +58463,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + Protein interaction networks - + Protein interaction networks true @@ -57717,15 +58479,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true VT 1.5.4 Biochemistry and molecular biology The molecular basis of biological activity, particularly the macromolecules (e.g. proteins and nucleic acids) that are essential to life. Molecular_biology Biological processes - - - + + + Molecular biology @@ -57736,12 +58498,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + Mammals, e.g. information on a specific mammal genome including molecular sequences, genes and annotation. 
- + Mammals true @@ -57753,15 +58515,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true - Biodiversity science + beta13 + true + Biodiversity science VT 1.5.5 Biodiversity conservation The degree of variation of life forms within a given ecosystem, biome or an entire planet. Biodiversity - - - + + + Biodiversity http://purl.bioontology.org/ontology/MSH/D044822 @@ -57773,12 +58535,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + The comparison, grouping together and classification of macromolecules on the basis of sequence similarity. - + This includes the results of sequence clustering, ortholog identification, assignment to families, annotation etc. Sequence clusters and classification true @@ -57790,15 +58552,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true The study of genes, genetic variation and heredity in living organisms. Genetics Genes Heredity - - - + + + Genetics http://purl.bioontology.org/ontology/MSH/D005823 @@ -57810,12 +58572,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true The genes and genetic mechanisms such as Mendelian inheritance that underly continuous phenotypic traits (such as height or weight). Quantitative_genetics - - + + Quantitative genetics @@ -57826,13 +58588,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true The distribution of allele frequencies in a population of organisms and its change subject to evolutionary processes including natural selection, genetic drift, mutation and gene flow. 
Population_genetics - - - + + + Population genetics @@ -57843,11 +58605,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + Regulatory RNA sequences including microRNA (miRNA) and small interfering RNA (siRNA). - + Regulatory RNA true @@ -57859,11 +58621,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.13 - + beta13 + 1.13 + The documentation of resources such as tools, services and databases and how to get help. - + Documentation and help true @@ -57875,12 +58637,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + The structural and functional organisation of genes and other genetic elements. - + Genetic organisation true @@ -57891,8 +58653,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true The application of information technology to health, disease and biomedicine. Biomedical informatics Clinical informatics @@ -57900,9 +58662,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Health informatics Healthcare informatics Medical_informatics - - - + + + Medical informatics @@ -57913,15 +58675,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true VT 1.5.14 Developmental biology How organisms grow and develop. Developmental_biology Development - - - + + + Developmental biology @@ -57932,13 +58694,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true The development of organisms between the one-cell stage (typically the zygote) and the end of the embryonic stage. 
Embryology - - - + + + Embryology @@ -57949,14 +58711,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true VT 3.1.1 Anatomy and morphology The form and function of the structures of living organisms. Anatomy - - - + + + Anatomy @@ -57967,8 +58729,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true The scientific literature, language processing, reference information, and documentation. Language Literature @@ -57978,9 +58740,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Documentation References Scientific literature - - - + + + This includes the documentation of resources such as tools, services and databases, user support, how to get help etc. Literature and language http://purl.bioontology.org/ontology/MSH/D011642 @@ -57992,8 +58754,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true Life science Life sciences VT 1.5 Biological sciences @@ -58013,9 +58775,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Chronobiology Cryobiology Reproductive biology - - - + + + Biology @@ -58026,19 +58788,24 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true - Data stewardship + beta13 + true + Data life cycle + Data stewardship VT 1.3.1 Data management Data management comprises the practices and principles of taking care of data, other than analysing them. This includes for example taking care of the associated metadata, formatting, storage, archiving, or access. 
Metadata management - - - + Research data management (RDM) + + + Data management - http://purl.bioontology.org/ontology/MSH/D000079803 + + + + @@ -58047,12 +58814,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + The detection of the positional features, such as functional and other key sites, in molecular sequences. - + Sequence feature detection http://purl.bioontology.org/ontology/MSH/D058977 true @@ -58064,12 +58831,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + The detection of positional features such as functional sites in nucleotide sequences. - + Nucleic acid feature detection true @@ -58080,12 +58847,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + The detection, identification and analysis of positional protein sequence features, such as functional sites. - + Protein feature detection true @@ -58096,12 +58863,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.2 - + beta13 + 1.2 + Topic for modelling biological systems in mathematical terms. - + Biological system modelling true @@ -58111,14 +58878,27 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - beta13 + + beta13 The acquisition of data, typically measurements of physical systems using any type of sampling system, or by another other means. + Data collecting Data collection - - + Data gathering + Experimental techniques + Experiments + Lab method + Lab techniques + Laboratory experiments + Laboratory method + Laboratory techniques + + + Data acquisition + + + @@ -58127,12 +58907,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.3 - + beta13 + 1.3 + Specific genes and/or their encoded proteins or a family or other grouping of related genes and proteins. 
- + Genes and proteins resources true @@ -58143,11 +58923,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.13 - + beta13 + 1.13 + Topological domains such as cytoplasmic regions in a protein. - + Protein topological domains true @@ -58159,13 +58939,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true Protein sequence variants produced e.g. from alternative splicing, alternative promoter usage, alternative initiation and ribosomal frameshifting. Protein_variants - - + + Protein variants @@ -58176,12 +58956,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.12 - + beta13 + 1.12 + Regions within a nucleic acid sequence containing a signal that alters a biological function. - + Expression signals true @@ -58193,7 +58973,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 + beta13 Nucleic acids binding to some other molecule. DNA_binding_sites @@ -58203,8 +58983,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Restriction sites Ribosome binding sites Scaffold-attachment region - - + + This includes ribosome binding sites (Shine-Dalgarno sequence in prokaryotes), restriction enzyme recognition sites (restriction sites) etc. This includes sites involved with DNA replication and recombination. This includes binding sites for initiation of replication (origin of replication), regions where transfer is initiated during the conjugation or mobilisation (origin of transfer), starting sites for DNA duplication (origin of replication) and regions which are eliminated through any of kind of recombination. Also nucleosome exclusion regions, i.e. specific patterns or regions which exclude nucleosomes (the basic structural units of eukaryotic chromatin which play a significant role in regulating gene expression). 
DNA binding sites @@ -58217,11 +58997,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.13 - + beta13 + 1.13 + Repetitive elements within a nucleic acid sequence. - + This includes long terminal repeats (LTRs); sequences (typically retroviral) directly repeated at both ends of a defined sequence and other types of repeating unit. Nucleic acid repeats @@ -58234,12 +59014,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - true + beta13 + true DNA replication or recombination. DNA_replication_and_recombination - - + + DNA replication and recombination @@ -58251,11 +59031,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.13 - + beta13 + 1.13 + Coding sequences for a signal or transit peptide. - + Signal or transit peptide true @@ -58267,11 +59047,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - beta13 - 1.13 - + beta13 + 1.13 + Sequence tagged sites (STS) in nucleic acid sequences. - + Sequence tagged sites true @@ -58282,9 +59062,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.1 - true + + 1.1 + true The determination of complete (typically nucleotide) sequences, including those of genomes (full genome sequencing, de novo sequencing and resequencing), amplicons and transcriptomes. 
DNA-Seq Sequencing @@ -58301,9 +59081,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Primer walking Sanger sequencing Targeted next-generation sequencing panels - - - + + + Sequencing http://purl.bioontology.org/ontology/MSH/D059014 @@ -58316,7 +59096,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 + 1.1 The analysis of protein-DNA interactions where chromatin immunoprecipitation (ChIP) is used in combination with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. ChIP-sequencing Chip Seq @@ -58324,8 +59104,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Chip-sequencing ChIP-seq ChIP-exo - - + + ChIP-seq @@ -58336,7 +59116,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 + 1.1 A topic concerning high-throughput sequencing of cDNA to measure the RNA content (transcriptome) of a sample, for example, to investigate how different alleles of a gene are expressed, detect post-transcriptional mutations or identify gene fusions. RNA sequencing RNA-Seq analysis @@ -58349,8 +59129,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution RNA-Seq MicroRNA sequencing miRNA-seq - - + + This includes small RNA profiling (small RNA-Seq), for example to find novel small RNAs, characterize mutations and analyze expression of small RNAs. RNA-Seq @@ -58363,11 +59143,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 - 1.3 - + 1.1 + 1.3 + DNA methylation including bisulfite sequencing, methylation sites and analysis, for example of patterns and profiles of DNA methylation in a population, tissue etc. 
- + DNA methylation true @@ -58379,8 +59159,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 - true + 1.1 + true The systematic study of metabolites, the chemical processes they are involved, and the chemical fingerprints of specific cellular processes in a whole cell, tissue, organ or organism. Metabolomics Exometabolomics @@ -58393,9 +59173,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Metabolome Metabonomics NMR-based metabolomics - - - + + + Metabolomics http://purl.bioontology.org/ontology/MSH/D055432 @@ -58408,13 +59188,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 - true + 1.1 + true The study of the epigenetic modifications of a whole cell, tissue, organism etc. Epigenomics - - - + + + Epigenetics concerns the heritable changes in gene expression owing to mechanisms other than DNA sequence variation. Epigenomics @@ -58428,8 +59208,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 - true + 1.1 + true Biome sequencing Community genomics Ecogenomics @@ -58440,9 +59220,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Metagenomics Biome sequencing Shotgun metagenomics - - - + + + Metagenomics http://purl.bioontology.org/ontology/MSH/D056186 @@ -58455,7 +59235,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 + 1.1 Variation in chromosome structure including microscopic and submicroscopic types of variation such as deletions, duplications, copy-number variants, insertions, inversions and translocations. 
DNA structural variation Genomic structural variation @@ -58465,8 +59245,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Insertion Inversion Translocation - - + + Structural variation @@ -58477,12 +59257,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 + 1.1 DNA-histone complexes (chromatin), organisation of chromatin into nucleosomes and packaging into higher-order structures. DNA_packaging Nucleosome positioning - - + + DNA packaging http://purl.bioontology.org/ontology/MSH/D042003 @@ -58494,12 +59274,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 - 1.3 - + 1.1 + 1.3 + A topic concerning high-throughput sequencing of randomly fragmented genomic DNA, for example, to investigate whole-genome sequencing and resequencing, SNP discovery, identification of copy number variations and chromosomal rearrangements. - + DNA-Seq true @@ -58510,12 +59290,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 - 1.3 - + 1.1 + 1.3 + The alignment of sequences of (typically millions) of short reads to a reference genome. This is a specialised topic within sequence alignment, especially because of complications arising from RNA splicing. - + RNA-Seq alignment true @@ -58526,13 +59306,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.1 + 1.1 Experimental techniques that combine chromatin immunoprecipitation ('ChIP') with microarray ('chip'). ChIP-on-chip is used for high-throughput study protein-DNA interactions. ChIP-chip ChIP-on-chip ChiP - - + + ChIP-on-chip @@ -58543,14 +59323,21 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - The protection of data, such as patient health data, from damage or unwanted access from unauthorised users. 
- Data privacy + 1.3 + Data sensitivity concerns data that should be protected from disclosure, damage or unauthorised access due to its sensitive nature. This category includes not only personal and commercial data, but also information about e.g. localisation of endangered species. Data_security - - - Data security + Data protection + Information privacy + Special category data + + + Information sensitivity + + + + + Fixed by Federico Bianchini @@ -58559,15 +59346,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 Biological samples and specimens. Specimen collections Sample_collections biosamples samples - - - + + + Sample collections @@ -58579,8 +59366,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 1.5.4 Biochemistry and molecular biology Chemical substances and physico-chemical processes and that occur within living organisms. Biological chemistry @@ -58588,9 +59375,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Glycomics Pathobiochemistry Phytochemistry - - - + + + Biochemistry @@ -58602,12 +59389,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true The study of evolutionary relationships amongst organisms from analysis of genetic information (typically gene or protein sequences). Phylogenetics - - + + Phylogenetics http://purl.bioontology.org/ontology/MSH/D010802 @@ -58619,16 +59406,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true Topic concerning the study of heritable changes, for example in gene expression or phenotype, caused by mechanisms other than changes in the DNA sequence. 
Epigenetics DNA methylation Histone modification Methylation profiles - - - + + + This includes sub-topics such as histone modification and DNA methylation (methylation sites and analysis, for example of patterns and profiles of DNA methylation in a population, tissue etc.) Epigenetics @@ -58642,14 +59429,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true The exploitation of biological process, structure and function for industrial purposes, for example the genetic manipulation of microorganisms for the antibody production. Biotechnology Applied microbiology - - - + + + Biotechnology @@ -58662,13 +59449,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true Phenomes, or the study of the change in phenotype (the physical and biochemical traits of organisms) in response to genetic and environmental factors. Phenomics - - - + + + Phenomics @@ -58679,17 +59466,18 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 1.5.16 Evolutionary biology The evolutionary processes, from the genetic to environmental scale, that produced life in all its diversity. Evolution Evolutionary_biology - - - + + + Evolutionary biology + @@ -58698,15 +59486,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 3.1.8 Physiology The functions of living organisms and their constituent parts. Physiology Electrophysiology - - - + + + Physiology @@ -58717,8 +59505,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 1.5.20 Microbiology The biology of microorganisms. 
Microbiology @@ -58730,9 +59518,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Microbiological surveillance Molecular infection biology Molecular microbiology - - - + + + Microbiology @@ -58743,13 +59531,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true The biology of parasites. Parasitology - - - + + + Parasitology @@ -58760,8 +59548,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 3.1 Basic medicine VT 3.2 Clinical medicine VT 3.2.9 General and internal medicine @@ -58772,9 +59560,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Medicine General medicine Internal medicine - - - + + + Human health and medicine @@ -58785,8 +59573,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true Neuroscience VT 3.1.5 Neuroscience The study of the nervous system and brain; its anatomy, physiology and function. @@ -58794,9 +59582,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Molecular neuroscience Neurophysiology Systemetic neuroscience - - - + + + Neurobiology @@ -58807,19 +59595,20 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 3.3.1 Epidemiology Topic concerning the the patterns, cause, and effect of disease within populations. Public_health_and_epidemiology Epidemiology Public health - - - + + + Public health and epidemiology + TODO: Split out Epidemiology details into a sub-concept @@ -58829,15 +59618,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 1.5.9 Biophysics The use of physics to study biological system. 
Biophysics Medical physics - - - + + + Biophysics @@ -58849,15 +59638,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true The analysis of transcriptomes, or a set of all the RNA molecules in a specific cell, tissue etc. Transcriptomics Comparative transcriptomics Transcriptome - - - + + + Gene expression a related term, or how? Transcriptomics @@ -58869,8 +59658,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - Molecules? + 1.3 + Molecules? Chemical science Polymer science VT 1.7.10 Polymer science @@ -58889,9 +59678,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Nuclear chemistry Organic chemistry Physical chemistry - - - + + + Chemistry @@ -58902,7 +59691,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 VT 1.1.99 Other VT:1.1 Mathematics The study of numbers (quantity) and other topics including structure, space, and change. @@ -58914,9 +59703,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Graph analytics Monte Carlo methods Multivariate analysis - - - + + + Mathematics @@ -58927,12 +59716,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 The study of matter, space and time, and related concepts such as energy and force. Physics - - - + + + Physics @@ -58944,14 +59733,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true RNA splicing; post-transcription RNA modification involving the removal of introns and joining of exons. Alternative splicing RNA_splicing Splice sites - - + + This includes the study of splice sites, splicing patterns, alternative splicing events and variants, isoforms, etc.. 
RNA splicing @@ -58964,13 +59753,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true The structure and function of genes at a molecular level. Molecular_genetics - - - + + + Molecular genetics @@ -58981,8 +59770,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 3.2.25 Respiratory systems The study of respiratory system. Pulmonary medicine @@ -58990,9 +59779,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Respiratory_medicine Pulmonary disorders Respiratory disease - - - + + + Respiratory medicine @@ -59003,12 +59792,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - 1.4 - + 1.3 + 1.4 + The study of metabolic diseases. - + Metabolic disease true @@ -59019,17 +59808,18 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - Pathogens + + 1.3 + Pathogens Pathogen-borne disease(s)? VT 3.3.4 Infectious diseases The branch of medicine that deals with the prevention, diagnosis and management of transmissable disease with clinically evident illness resulting from infection with pathogenic biological agents (viruses, bacteria, fungi, protozoa, parasites and prions). Communicable disease Transmissable disease Infectious_disease - - - + + + Infectious diseases @@ -59040,12 +59830,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 The study of rare diseases. Rare_diseases - - - + + + Rare diseases @@ -59056,14 +59846,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true The branch of medicine that deals with the anatomy, functions and disorders of the nervous system. 
Neurology Neurological disorders - - - + + + Neurology @@ -59074,8 +59864,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 3.2.22 Peripheral vascular disease VT 3.2.4 Cardiac and Cardiovascular systems The diseases and abnormalities of the heart and circulatory system. @@ -59083,9 +59873,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Cardiology Cardiovascular disease Heart disease - - - + + + Cardiology @@ -59097,13 +59887,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true The discovery and design of drugs or potential drug compounds. Drug_discovery - - - + + + This includes methods that search compound collections, generate or analyse drug 3D conformations, identify drug targets with structural docking etc. Drug discovery @@ -59115,15 +59905,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true Repositories of biological samples, typically human, for basic biological and clinical research. Tissue collection biobanking Biobank - - - + + + Biobank @@ -59134,13 +59924,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 Laboratory study of mice, for example, phenotyping, and mutagenesis of mouse cell lines. Laboratory mouse Mouse_clinic - - - + + + Mouse clinic @@ -59151,12 +59941,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 Collections of microbial cells including bacteria, yeasts and moulds. Microbial_collection - - - + + + Microbial collection @@ -59166,12 +59956,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 Collections of cells grown under laboratory conditions, specifically, cells from multi-cellular eukaryotes and especially animal cells. 
Cell_culture_collection - - - + + + Cell culture collection @@ -59182,12 +59972,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 Collections of DNA, including both collections of cloned molecules, and populations of micro-organisms that store and propagate cloned DNA. Clone_library - - - + + + Clone library @@ -59198,13 +59988,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true 'translating' the output of basic and biomedical research into better diagnostic tools, medicines, medical procedures, policies and advice. Translational_medicine - - - + + + Translational medicine @@ -59215,7 +60005,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 Collections of chemicals, typically for use in high-throughput screening experiments. Compound_libraries_and_screening Chemical library @@ -59224,9 +60014,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Small chemical compounds libraries Small compounds libraries Target identification and validation - - - + + + Compound libraries and screening @@ -59237,16 +60027,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - true + 1.3 + true VT 3.3 Health sciences Topic concerning biological science that is (typically) performed in the context of medicine. Biomedical sciences Health science Biomedical_science - - - + + + Biomedical science @@ -59257,12 +60047,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 + 1.3 Topic concerning the identity of biological entities, or reports on such entities, and the mapping of entities and records in different databases. 
Data_identity_and_mapping - - - + + + Data identity and mapping @@ -59272,11 +60062,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.3 - 1.12 - + 1.3 + 1.12 + The search and retrieval from a database on the basis of molecular sequence similarity. - + Sequence search true @@ -59288,13 +60078,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true Objective indicators of biological state often used to assess health, and determinate treatment. Diagnostic markers Biomarkers - - + + Biomarkers @@ -59304,21 +60094,27 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.4 + + 1.4 + This was merged into Data acquisition. + Laboratory_techniques + 1.26 + The procedures used to conduct an experiment. Experimental techniques Lab method Lab techniques Laboratory method - Laboratory_techniques Experiments Laboratory experiments - - - - Data aquisition + + + + + + Laboratory techniques + true @@ -59327,15 +60123,17 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The development of policies, models and standards that cover data acquisition, storage and integration, such that it can be put to use, typically through a process of systematically applying statistical and / or logical techniques to describe, illustrate, summarise or evaluate data. Data_architecture_analysis_and_design Data analysis Data architecture Data design - - - + Data organisation + Data organization + + + Data architecture, analysis and design @@ -59346,14 +60144,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The combination and integration of data from different sources, for example into a central repository or warehouse, to provide users with a unified view of these data. 
Data_integration_and_warehousing Data integration Data warehousing - - - + + + Data integration and warehousing @@ -59366,12 +60164,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 Any matter, surface or construct that interacts with a biological system. Biomaterials - - - + + + Biomaterials @@ -59383,13 +60181,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true The use of synthetic chemistry to study and manipulate biological systems. Chemical_biology - - - + + + Chemical biology @@ -59400,13 +60198,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 1.7.1 Analytical chemistry The study of the separation, identification, and quantification of the chemical components of natural and artificial materials. Analytical_chemistry - - - + + + Analytical chemistry @@ -59417,13 +60215,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The use of chemistry to create new compounds. Synthetic_chemistry Synthetic organic chemistry - - - + + + Synthetic chemistry @@ -59435,25 +60233,25 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 1.2.12 Programming languages Software engineering VT 1.2.1 Algorithms VT 1.2.14 Software engineering VT 1.2.7 Data structures The process that leads from an original formulation of a computing problem to executable programs. - Computer programming Software development Software_engineering Algorithms Data structures Programming languages - - - + + + Software engineering + @@ -59462,16 +60260,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true The process of bringing a new drug to market once a lead compounds has been identified through drug discovery. 
Drug development science Medicine development Medicines development Drug_development - - - + + + Drug development @@ -59482,15 +60280,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 Drug delivery Drug formulation Drug formulation and delivery The process of formulating and administering a pharmaceutical compound to achieve a therapeutic effect. Biotherapeutics - - - + + + Biotherapeutics @@ -59500,8 +60298,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true The study of how a drug interacts with the body. Drug_metabolism ADME @@ -59511,9 +60309,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Pharmacodynamics Pharmacokinetics Pharmacokinetics and pharmacodynamics - - - + + + Drug metabolism @@ -59524,15 +60322,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 Health care research Health care science The discovery, development and approval of medicines. Drug discovery and development Medicines_research_and_development - - - + + + Medicines research and development @@ -59546,18 +60344,18 @@ ows re-sequencing of complete genomes of any given organism with high resolution - + - 1.4 + 1.4 The safety (or lack) of drugs and other medical interventions. Patient safety Safety_sciences Drug safety - - - + + + Safety sciences @@ -59568,12 +60366,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The detection, assesment, understanding and prevention of adverse effects of medicines. Pharmacovigilence - - - + + + Pharmacovigilence concerns safety once a drug has gone to market. 
Pharmacovigilance @@ -59586,7 +60384,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The testing of new medicines, vaccines or procedures on animals (preclinical) and humans (clinical) prior to their approval by regulatory authorities. Preclinical_and_clinical_studies Clinical studies @@ -59595,9 +60393,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Drug trials Preclinical studies Preclinical study - - - + + + Preclinical and clinical studies @@ -59607,9 +60405,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.4 - true + + 1.4 + true The visual representation of an object. Imaging Diffraction experiment @@ -59618,9 +60416,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution Optical super resolution microscopy Photonic force microscopy Photonic microscopy - - - + + + + imaging-revise This includes diffraction experiments that are based upon the interference of waves, typically electromagnetic waves such as X-rays or visible light, by some object being studied, typical in order to produce an image of the object or determine its structure. Imaging @@ -59632,13 +60431,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + + 1.4 The use of imaging techniques to understand biology. 
Biological imaging Biological_imaging - - - + + + + imaging-revise Bioimaging @@ -59649,7 +60450,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.13 Medical imaging VT 3.2.14 Nuclear medicine VT 3.2.24 Radiology @@ -59658,9 +60459,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution Neuroimaging Nuclear medicine Radiology - - - + + + + imaging-revise Medical imaging @@ -59671,12 +60473,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The use of optical instruments to magnify the image of an object. Light_microscopy - - - + + + + imaging-revise Light microscopy @@ -59687,16 +60490,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The use of animals and alternatives in experimental research. Animal experimentation Animal research Animal testing In vivo testing Laboratory_animal_science - - - + + + Laboratory animal science @@ -59707,14 +60510,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true VT 1.5.18 Marine and Freshwater biology The study of organisms in the ocean or brackish waters. Marine_biology - - - + + + Marine biology @@ -59725,13 +60528,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true The identification of molecular and genetic causes of disease and the development of interventions to correct them. Molecular_medicine - - - + + + Molecular medicine @@ -59742,16 +60545,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.3.7 Nutrition and Dietetics The study of the effects of food components on the metabolism, health, performance and disease resistance of humans and animals. It also includes the study of human behaviours related to food choices. 
Nutrition Nutrition science Nutritional_science Dietetics - - - + + + Nutritional science @@ -59762,13 +60565,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true The collective characterisation and quantification of pools of biological molecules that translate into the structure, function, and dynamics of an organism or organisms. Omics - - - + + + Omics @@ -59779,16 +60582,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The processes that need to be in place to ensure the quality of products for human or animal use. Quality assurance Quality_affairs Good clinical practice Good laboratory practice Good manufacturing practice - - - + + + Quality affairs @@ -59799,13 +60602,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The protection of public health by controlling the safety and efficacy of products in areas including pharmaceuticals, veterinary medicine, medical devices, pesticides, agrochemicals, cosmetics, and complementary medicines. Healthcare RA Regulatory_affairs - - - + + + Regulatory affairs @@ -59816,14 +60619,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true Biomedical approaches to clinical interventions that involve the use of stem cells. Stem cell research Regenerative_medicine - - - + + + Regenerative medicine @@ -59835,13 +60638,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true - An interdisciplinary field of study that looks at the dynamic systems of the human body as part of an integrted whole, incoporating biochemical, physiological, and environmental interactions that sustain life. 
+ 1.4 + true + An interdisciplinary field of study that looks at the dynamic systems of the human body as part of an integrated whole, incoporating biochemical, physiological, and environmental interactions that sustain life. Systems_medicine - - - + + + Systems medicine @@ -59852,13 +60655,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 Topic concerning the branch of medicine that deals with the prevention, diagnosis, and treatment of disease, disorder and injury in animals. Veterinary_medicine Clinical veterinary medicine - - - + + + Veterinary medicine @@ -59869,13 +60672,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The application of biological concepts and methods to the analytical and synthetic methodologies of engineering. Biological engineering Bioengineering - - - + + + Bioengineering @@ -59886,8 +60689,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true Ageing Aging Gerontology @@ -59895,9 +60698,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution The branch of medicine dealing with the diagnosis, treatment and prevention of disease in older people, and the problems specific to aging. Geriatrics Geriatric_medicine - - - + + + Geriatric medicine @@ -59908,20 +60711,20 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true VT 3.2.1 Allergy - Health issues related to the immune system and their prevention, diagnosis and mangement. + Health issues related to the immune system and their prevention, diagnosis, and management. 
Allergy_clinical_immunology_and_immunotherapeutics Allergy Clinical immunology Immune disorders Immunomodulators Immunotherapeutics - - - - Allergy, clinical immunology and immunotherapeutics + + + + Allergy, clinical immunology, and immunotherapeutics @@ -59933,15 +60736,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true The prevention of pain and the evaluation, treatment and rehabilitation of persons in pain. Algiatry Pain management Pain_medicine - - - + + + Pain medicine @@ -59952,14 +60755,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.2 Anaesthesiology Anaesthesia and anaesthetics. Anaesthetics Anaesthesiology - - - + + + Anaesthesiology @@ -59970,16 +60773,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.5 Critical care/Emergency medicine The multidisciplinary that cares for patients with acute, life-threatening illness or injury. Acute medicine Emergency medicine Intensive care medicine Critical_care_medicine - - - + + + Critical care medicine @@ -59990,14 +60793,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.7 Dermatology and venereal diseases The branch of medicine that deals with prevention, diagnosis and treatment of disorders of the skin, scalp, hair and nails. Dermatology Dermatological disorders - - - + + + Dermatology @@ -60008,12 +60811,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The study, diagnosis, prevention and treatments of disorders of the oral cavity, maxillofacial area and adjacent structures. 
Dentistry - - - + + + Dentistry @@ -60024,7 +60827,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.20 Otorhinolaryngology The branch of medicine that deals with the prevention, diagnosis, and treatment of disorders of the ear, nose and throat. Audiovestibular medicine @@ -60032,9 +60835,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Otorhinolaryngology Ear_nose_and_throat_medicine Head and neck disorders - - - + + + Ear, nose and throat medicine @@ -60045,17 +60848,17 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true The branch of medicine dealing with diseases of endocrine organs, hormone systems, their target organs, and disorders of the pathways of glucose and lipid metabolism. Endocrinology_and_metabolism Endocrine disorders Endocrinology Metabolic disorders Metabolism - - - + + + Endocrinology and metabolism @@ -60066,16 +60869,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true VT 3.2.11 Hematology The branch of medicine that deals with the blood, blood-forming organs and blood diseases. Haematology Blood disorders Haematological disorders - - - + + + Haematology @@ -60086,15 +60889,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true VT 3.2.8 Gastroenterology and hepatology The branch of medicine that deals with disorders of the oesophagus, stomach, duodenum, jejenum, ileum, large intestine, sigmoid colon and rectum. 
Gastroenterology Gastrointestinal disorders - - - + + + Gastroenterology @@ -60107,13 +60910,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 Sex- and gender-sensitive biosciences The study of the biological and physiological differences between males and females and how they effect differences in disease presentation and management. Gender_medicine - - - + + + TODO: ALSO IN TOXICOLOGY!!!! And elsewhere! And other than sex & gender biases!! Sex and(/or) gender differences in health and medicine @@ -60125,17 +60928,17 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true VT 3.2.15 Obstetrics and gynaecology The branch of medicine that deals with the health of the female reproductive system, pregnancy and birth. Gynaecology_and_obstetrics Gynaecological disorders Gynaecology Obstetrics - - - + + + Gynaecology and obstetrics @@ -60147,15 +60950,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true The branch of medicine that deals with the liver, gallbladder, bile ducts and bile. Hepatology Hepatic_and_biliary_medicine Liver disorders - - - + + + Hepatic and biliary medicine Hepatobiliary medicine @@ -60167,12 +60970,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - 1.13 - + 1.4 + 1.13 + + + The branch of medicine that deals with the infectious diseases of the tropics. - - + Infectious tropical disease true @@ -60183,13 +60987,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 The branch of medicine that treats body wounds or shock produced by sudden physical injury, as from violence or accident. 
Traumatology Trauma_medicine - - - + + + Trauma medicine @@ -60202,13 +61006,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 - true + 1.4 + true The branch of medicine that deals with the diagnosis, management and prevention of poisoning and other adverse health effects caused by medications, occupational and environmental toxins, and biological agents. Medical_toxicology - - - + + + Medical toxicology @@ -60219,7 +61023,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.19 Orthopaedics VT 3.2.26 Rheumatology The branch of medicine that deals with the prevention, diagnosis, and treatment of disorders of the muscle, bone and connective tissue. It incorporates aspects of orthopaedics, rheumatology, rehabilitation medicine and pain medicine. @@ -60227,9 +61031,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Musculoskeletal disorders Orthopaedics Rheumatology - - - + + + Musculoskeletal medicine @@ -60242,16 +61046,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 Optometry VT 3.2.17 Ophthalmology VT 3.2.18 Optometry The branch of medicine that deals with disorders of the eye, including eyelid, optic nerve/visual pathways and occular muscles. Opthalmology Eye disoders - - - + + + Opthalmology @@ -60262,14 +61066,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.21 Paediatrics The branch of medicine that deals with the medical care of infants, children and adolescents. Child health Paediatrics - - - + + + Paediatrics @@ -60280,15 +61084,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 Mental health VT 3.2.23 Psychiatry The branch of medicine that deals with the mangement of mental illness, emotional disturbance and abnormal behaviour. 
Psychiatry Psychiatric disorders - - - + + + Psychiatry @@ -60299,7 +61103,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.3 Andrology The health of the reproductive processes, functions and systems at all stages of life. Reproductive_health @@ -60307,9 +61111,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Family planning Fertility medicine Reproductive disorders - - - + + + Reproductive health @@ -60320,14 +61124,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.28 Transplantation The use of operative, manual and instrumental techniques on a patient to investigate and/or treat a pathological condition or help improve bodily function or appearance. Surgery Transplantation - - - + + + Surgery @@ -60338,7 +61142,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 VT 3.2.29 Urology and nephrology The branches of medicine and physiology focussing on the function and disorders of the urinary system in males and females, the reproductive system in males, and the kidney. Urology_and_nephrology @@ -60346,9 +61150,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Nephrology Urological disorders Urology - - - + + + Urology and nephrology @@ -60360,16 +61164,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.4 + 1.4 Alternative medicine Holistic medicine Integrative medicine VT 3.2.12 Integrative and Complementary medicine Medical therapies that fall beyond the scope of conventional medicine but may be used alongside it in the treatment of disease and ill health. 
Complementary_medicine - - - + + + Complementary medicine @@ -60380,7 +61184,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.7 + 1.7 Techniques that uses magnetic fields and radiowaves to form images, typically to investigate the anatomy and physiology of the human body. MRT Magnetic resonance imaging @@ -60388,8 +61192,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution NMRI Nuclear magnetic resonance imaging MRI - - + + + imaging-revise MRI @@ -60401,14 +61206,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.7 + 1.7 The study of matter by studying the diffraction pattern from firing neutrons at a sample, typically to determine atomic and/or magnetic structure. Neutron diffraction experiment Neutron_diffraction Elastic neutron scattering Neutron microscopy - - + + + imaging-revise Neutron diffraction @@ -60419,7 +61225,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.7 + 1.7 Imaging in sections (sectioning), through the use of a wave-generating device (tomograph) that generates an image (a tomogram). CT Computed tomography @@ -60429,8 +61235,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution PET Positron emission tomography X-ray tomography - - + + + imaging-revise Tomography @@ -60441,17 +61248,17 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.7 - true + 1.7 + true KDD Knowledge discovery in databases VT 1.3.2 Data mining The discovery of patterns in large data sets and the extraction and trasnsformation of those patterns into a useful format. 
Data_mining Pattern recognition - - - Data mining + + + Data mining (See ML in EDAM Bioimaging) @@ -60461,13 +61268,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.7 + + 1.7 Artificial Intelligence VT 1.2.2 Artificial Intelligence (expert systems, machine learning, robotics) A topic concerning the application of artificial intelligence methods to algorithms, in order to create methods that can learn from data in order to generate an ouput, rather than relying on explicitly encoded information only. Machine_learning Active learning - Ensembl learning + Ensemble learning Kernel methods Knowledge representation Neural networks @@ -60475,9 +61283,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Reinforcement learning Supervised learning Unsupervised learning - - - Machine learning + + + Machine learning (See in EDAM Bioimaging) @@ -60487,9 +61295,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - Database administration - Information systems + 1.8 + Database administration + Information systems Databases The general handling of data stored in digital archives such as databases, databanks, web portals, and other data resources. Database_management @@ -60497,8 +61305,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Document management File management Record management - - + + This includes databases for the results of scientific experiments, the application of high-throughput technology, computational analysis and the scientific literature. It covers the management and manipulation of digital documents, including database records, files, and reports. Database management @@ -60510,7 +61318,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 + 1.8 VT 1.5.29 Zoology Animals, e.g. information on a specific animal genome including molecular sequences, genes and annotation. 
Animal @@ -60521,8 +61329,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Animal genetics Animal physiology Entomology - - + + The study of the animal kingdom. Zoology @@ -60535,13 +61343,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 + 1.8 The biology, archival, detection, prediction and analysis of positional features such as functional and other key sites, in protein sequences and the conserved patterns (motifs, profiles etc.) that may be used to describe them. Protein_sites_features_and_motifs Protein sequence features Signal peptide cleavage sites - - + + A signal peptide coding sequence encodes an N-terminal domain of a secreted protein, which is involved in attaching the polypeptide to a membrane leader sequence. A transit peptide coding sequence encodes an N-terminal domain of a nuclear-encoded organellar protein; which is involved in import of the protein into the organelle. Protein sites, features and motifs @@ -60553,15 +61361,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 + 1.8 The biology, archival, detection, prediction and analysis of positional features such as functional and other key sites, in nucleic acid sequences and the conserved patterns (motifs, profiles etc.) that may be used to describe them. Nucleic_acid_sites_features_and_motifs Nucleic acid functional sites Nucleic acid sequence features Primer binding sites Sequence tagged sites - - + + Sequence tagged sites are short DNA sequences that are unique within a genome and serve as a mapping landmark, detectable by PCR they allow a genome to be mapped via an ordering of STSs. 
Nucleic acid sites, features and motifs @@ -60573,7 +61381,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 + 1.8 Transcription of DNA into RNA and features of a messenger RNA (mRNA) molecules including precursor RNA, primary (unprocessed) transcript and fully processed molecules. mRNA features Gene_transcripts @@ -60589,8 +61397,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Transit peptide coding sequence cDNA mRNA - - + + This includes 5'untranslated region (5'UTR), coding sequences (CDS), exons, intervening sequences (intron) and 3'untranslated regions (3'UTR). This includes Introns, and protein-coding regions including coding sequences (CDS), exons, translation initiation sites and open reading frames. Also expressed sequence tag (EST) or complementary DNA (cDNA) sequences. This includes coding sequences for a signal or transit peptide. A signal peptide coding sequence encodes an N-terminal domain of a secreted protein, which is involved in attaching the polypeptide to a membrane leader sequence. A transit peptide coding sequence encodes an N-terminal domain of a nuclear-encoded organellar protein; which is involved in import of the protein into the organelle. @@ -60605,11 +61413,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Protein-ligand (small molecule) interaction(s). - + Protein-ligand interactions true @@ -60621,11 +61429,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Protein-drug interaction(s). - + Protein-drug interactions true @@ -60636,12 +61444,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.8 + + 1.8 Genotype experiment including case control, population, and family studies. These might use array based methods and re-sequencing methods. 
Genotyping_experiment - - + + Genotyping experiment @@ -60652,14 +61460,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 + 1.8 Genome-wide association study experiments. GWAS GWAS analysis Genome-wide association study GWAS_study - - + + GWAS study @@ -60669,8 +61477,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.8 + + 1.8 Microarray experiments including conditions, protocol, sample:data relationships etc. Microarrays Microarray_experiment @@ -60691,8 +61499,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution aCGH microarray mRNA microarray miRNA array - - + + This might specify which raw data file relates to which sample and information on hybridisations, e.g. which are technical and which are biological replicates. Microarray experiment @@ -60703,16 +61511,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.8 + + 1.8 PCR experiments, e.g. quantitative real-time PCR. Polymerase chain reaction PCR_experiment Quantitative PCR RT-qPCR Real Time Quantitative PCR - - + + PCR experiment @@ -60722,8 +61530,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.8 + + 1.8 Proteomics experiments. Proteomics_experiment 2D PAGE experiment @@ -60735,8 +61543,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Mass spectrometry experiments Northern blot experiment Spectrum demultiplexing - - + + This includes two-dimensional gel electrophoresis (2D PAGE) experiments, gels or spots in a gel. Also mass spectrometry - an analytical chemistry technique that measures the mass-to-charge ratio and abundance of ions in the gas phase. Also Northern blot experiments. 
Proteomics experiment @@ -60749,11 +61557,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Two-dimensional gel electrophoresis experiments, gels or spots in a gel. - + 2D PAGE experiment true @@ -60765,11 +61573,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Northern Blot experiments. - + Northern blot experiment true @@ -60780,12 +61588,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.8 + + 1.8 RNAi experiments. RNAi_experiment - - + + RNAi experiment @@ -60795,12 +61603,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.8 + + 1.8 Biological computational model experiments (simulation), for example the minimum information required in order to permit its correct interpretation and reproduction. Simulation_experiment - - + + Simulation experiment @@ -60810,11 +61618,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Protein-DNA/RNA interaction(s). - + Protein-nucleic acid interactions true @@ -60826,11 +61634,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Protein-protein interaction(s), including interactions between protein domains. - + Protein-protein interactions true @@ -60842,11 +61650,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Cellular process pathways. - + Cellular process pathways true @@ -60858,11 +61666,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Disease pathways, typically of human disease. 
- + Disease pathways true @@ -60874,11 +61682,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Environmental information processing pathways. - + Environmental information processing pathways true @@ -60890,11 +61698,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Genetic information processing pathways. - + Genetic information processing pathways true @@ -60906,11 +61714,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Super-secondary structure of protein sequence(s). - + Protein super-secondary structure true @@ -60922,11 +61730,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Catalytic residues (active site) of an enzyme. - + Protein active sites true @@ -60938,7 +61746,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 + 1.8 Binding sites in proteins, including cleavage sites (for a proteolytic enzyme or agent), key residues involved in protein folding, catalytic residues (active site) of an enzyme, ligand-binding (non-catalytic) residues of a protein, such as sites that bind metal, prosthetic groups or lipids, RNA and DNA-binding proteins and binding sites etc. Protein_binding_sites Enzyme active site @@ -60946,8 +61754,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Protein functional sites Protein key folding sites Protein-nucleic acid binding sites - - + + Protein binding sites @@ -60958,11 +61766,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + RNA and DNA-binding proteins and binding sites in protein sequences. 
- + Protein-nucleic acid binding sites true @@ -60974,11 +61782,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Cleavage sites (for a proteolytic enzyme or agent) in a protein sequence. - + Protein cleavage sites true @@ -60990,11 +61798,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Chemical modification of a protein. - + Protein chemical modifications true @@ -61006,12 +61814,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 + 1.8 Disordered structure in a protein. Protein features (disordered structure) Protein_disordered_structure - - + + Protein disordered structure @@ -61022,11 +61830,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Structural domains or 3D folds in a protein or polypeptide chain. - + Protein domains true @@ -61038,11 +61846,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Key residues involved in protein folding. - + Protein key folding sites true @@ -61054,11 +61862,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Post-translation modifications in a protein sequence, typically describing the specific sites involved. - + Protein post-translational modifications true @@ -61070,13 +61878,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 + 1.8 Secondary structure (predicted or real) of a protein, including super-secondary structure. Protein features (secondary structure) Protein_secondary_structure Protein super-secondary structure - - + + Super-secondary structures include leucine zippers, coiled coils, Helix-Turn-Helix etc. 
The location and size of the secondary structure elements and intervening loop regions is typically given. The report can include disulphide bonds and post-translationally formed peptide bonds (crosslinks). Protein secondary structure @@ -61089,11 +61897,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Short repetitive subsequences (repeat sequences) in a protein sequence. - + Protein sequence repeats true @@ -61105,11 +61913,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.8 - 1.13 - + 1.8 + 1.13 + Signal peptides or signal peptide cleavage sites in protein sequences. - + Protein signal peptides true @@ -61121,12 +61929,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.10 + 1.10 VT 1.1.1 Applied mathematics The application of mathematics to specific problems in science, typically by the formulation and analysis of mathematical models. Applied_mathematics - - + + Applied mathematics @@ -61137,13 +61945,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.10 + 1.10 VT 1.1.1 Pure mathematics The study of abstract mathematical concepts. Pure_mathematics Linear algebra - - + + Pure mathematics @@ -61154,15 +61962,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.10 + 1.10 The control of data entry and maintenance to ensure the data meets defined standards, qualities or constraints. Data_governance Data stewardship - - + + Data governance http://purl.bioontology.org/ontology/MSH/D030541 + @@ -61171,15 +61980,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.10 + 1.10 The quality, integrity, and cleaning up of data. 
Data_quality_management Data clean-up Data cleaning Data integrity Data quality - - + + Data quality management @@ -61190,14 +61999,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.10 + 1.10 Freshwater science VT 1.5.18 Marine and Freshwater biology The study of organisms in freshwater ecosystems. Freshwater_biology - - - + + + Freshwater biology @@ -61208,14 +62017,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.10 - true + 1.10 + true VT 3.1.2 Human genetics The study of inheritance in human beings. Human_genetics - - - + + + Human genetics @@ -61225,14 +62034,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.10 + + 1.10 + Tropical health VT 3.3.14 Tropical medicine Health problems that are prevalent in tropical and subtropical regions. Tropical_medicine - - - + + + Tropical medicine @@ -61243,8 +62053,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.10 - true + 1.10 + true VT 3.3.14 Tropical medicine VT 3.4 Medical biotechnology VT 3.4.1 Biomedical devices @@ -61252,9 +62062,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Biotechnology applied to the medical sciences and the development of medicines. Medical_biotechnology Pharmaceutical biotechnology - - - + + + Medical biotechnology @@ -61264,16 +62074,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.10 - true + 1.10 + true VT 3.4.5 Molecular diagnostics An approach to medicine whereby decisions, practices and are tailored to the individual patient based on their predicted response or risk of disease. TODO: improve, make more realistic with Precision medicine (incl. probiotics). BUT: is it then anyway just a buzzword for all medicine which is with time becoming more precise? 
Precision medicine Personalised_medicine Molecular diagnostics - - - + + + TODO: Update Personalised medicine @@ -61284,14 +62094,14 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.12 + + 1.12 Experimental techniques to purify a protein-DNA crosslinked complex. Usually sequencing follows e.g. in the techniques ChIP-chip, ChIP-seq and MeDIP-seq. Chromatin immunoprecipitation Immunoprecipitation_experiment - - + + Immunoprecipitation experiment @@ -61302,15 +62112,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.12 + 1.12 Laboratory technique to sequence the complete DNA sequence of an organism's genome at a single time. Genome sequencing WGS Whole_genome_sequencing De novo genome sequencing Whole genome resequencing - - + + Whole genome sequencing @@ -61321,8 +62131,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.12 - + 1.12 + Laboratory technique to sequence the methylated regions in DNA. MeDIP-chip MeDIP-seq @@ -61337,8 +62147,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Whole-genome bisulfite sequencing methy-seq methyl-seq - - + + Methylated DNA immunoprecipitation @@ -61349,7 +62159,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.12 + 1.12 Laboratory technique to sequence all the protein-coding regions in a genome, i.e., the exome. Exome Exome analysis @@ -61358,8 +62168,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution WES Whole exome sequencing Exome_sequencing - - + + Exome sequencing is considered a cheap alternative to whole genome sequencing. 
Exome sequencing @@ -61371,16 +62181,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.12 - - true + 1.12 + + true The design of an experiment intended to test a hypothesis, and describe or explain empirical data obtained under various experimental conditions. Design of experiments Experimental design Studies Experimental_design_and_studies - - + + Experimental design and studies @@ -61392,12 +62202,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.12 + 1.12 The design of an experiment involving non-human animals. Animal_study Challenge study - - + + Animal study @@ -61410,16 +62220,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.13 - true + 1.13 + true The ecology of microorganisms including their relationship with one another and their environment. Environmental microbiology Microbial_ecology Community analysis Microbiome Molecular community analysis - - + + Microbial ecology @@ -61430,7 +62240,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.17 + 1.17 An antibody-based technique used to map in vivo RNA-protein interactions. RIP RNA_immunoprecipitation @@ -61439,8 +62249,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution HITS-CLIP PAR-CLIP iCLIP - - + + RNA immunoprecipitation @@ -61451,12 +62261,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.17 + 1.17 Large-scale study (typically comparison) of DNA sequences of populations. Population_genomics - - - + + + Population genomics @@ -61468,13 +62278,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.20 + 1.20 Agriculture Agroecology Agronomy Multidisciplinary study, research and development within the field of agriculture. 
Agricultural_science - Agricultural biotechnology Agricultural economics Animal breeding @@ -61490,10 +62299,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution Plant nutrition Plant pathology Soil science - - + + Agriculture/Agricultural science + TODO: Break this all down into separate topics. Distinguish Ag, Horticulture, Aquaculture, ... . Add non-conventional & traditional types of Ag/Hort/etc., incl. Biodynamic, Permaculture, Closed-loop/Recircul ... and social dimensions (Urban, Educational?, Social?, ...), Compost stuff, ... @@ -61502,12 +62312,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.20 + 1.20 Approach which samples, in parallel, all genes in all organisms present in a given sample, e.g. to provide insight into biodiversity and function. Shotgun metagenomic sequencing Metagenomic_sequencing - - + + Metagenomic sequencing @@ -61518,12 +62328,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.21 - Environment + 1.21 + Environment Study of the environment, the interactions between its physical, chemical, and biological components and it's effect on life. Also how humans impact upon the environment, and how we can manage and utilise natural resources. Environmental_science - - + + Environmental sciences @@ -61534,10 +62344,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.22 + + 1.22 The study and simulation of molecular conformations using a computational model and computer simulations. - - + + This includes methods such as Molecular Dynamics, Coarse-grained dynamics, metadynamics, Quantum Mechanics, QM/MM, Markov State Models, etc. 
Biomolecular simulation @@ -61549,11 +62360,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.22 + 1.22 The application of multi-disciplinary science and technology for the construction of artificial biological systems for diverse applications. Biomimeic chemistry - - + + Synthetic biology @@ -61565,15 +62376,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.22 + 1.22 The application of biotechnology to directly manipulate an organism's genes. Genetic manipulation Genetic modification Genetic_engineering Genome editing Genome engineering - - + + Genetic engineering @@ -61584,31 +62395,52 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 A field of biological research focused on the discovery and identification of peptides, typically by comparing mass spectra against a protein database. Proteogenomics - - + + Proteogenomics + + + + + 1.24 + Amplicon panels + Resequencing + Laboratory experiment to identify the differences between a specific genome (of an individual) and a reference genome (developed typically from many thousands of individuals). WGS re-sequencing is used as golden standard to detect variations compared to a given reference genome, including small variants (SNP and InDels) as well as larger genome re-organisations (CNVs, translocations, etc.). + Highly targeted resequencing + Whole genome resequencing (WGR) + Whole-genome re-sequencing (WGSR) + Amplicon sequencing + Amplicon-based sequencing + Ultra-deep sequencing + Amplicon sequencing is the ultra-deep sequencing of PCR products (amplicons), usually for the purpose of efficient genetic variant identification and characterisation in specific genomic regions. + Genome resequencing + TODO FIX. 
See editorial note in Metabarcoding https://webprotege.stanford.edu/#projects/69591619-4eda-4f03-9e7f-65b213038fe1/edit/Classes?selection=Class(%3Chttp://edamontology.org/topic_4038%3E) + + + + - 1.24 + 1.24 A biomedical field that bridges immunology and genetics, to study the genetic basis of the immune system. Immune system genetics Immungenetics Immunology and genetics Immunogenetics Immunogenes - - + + This involves the study of often complex genetic traits underlying diseases involving defects in the immune system. For example, identifying target genes for therapeutic approaches, or genetic variations involved in immunological pathology. Immunogenetics @@ -61620,11 +62452,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Interdisciplinary science focused on extracting information from chemical systems by data analytical approaches, for example multivariate statistics, applied mathematics, and computer science. Chemometrics - - + + Chemometrics @@ -61634,15 +62466,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.24 + + 1.24 Cytometry is the measurement of the characteristics of cells. Cytometry Flow cytometry Image cytometry Mass cytometry - - + + NOTE: Sub-concepts are maintained in EDAM BIoimaging https://webprotege.stanford.edu/#projects/2ce704bf-83ed-4d2e-985f-84c4841fac71/edit/Classes?selection=Class(%3Chttp://edamontology.org/topic_______Cytometry_%2528FIX_ID%2529%3E) Cytometry @@ -61654,10 +62486,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Biotechnology approach that seeks to optimize cellular genetic and regulatory processes in order to increase the cells' production of a certain substance. 
- - + + Metabolic engineering @@ -61667,8 +62499,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.24 + + 1.24 Molecular biology methods used to analyze the spatial organization of chromatin in a cell. 3C technologies 3C-based methods @@ -61688,11 +62520,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 The study of microbe gene expression within natural environments (i.e. the metatranscriptome). Metatranscriptomics - - + + Metatranscriptomics methods can be used for whole gene expression profiling of complex microbial communities. Metatranscriptomics @@ -61703,8 +62535,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution + - 1.24 + 1.24 The reconstruction and analysis of genomic information in extinct species. Paleogenomics Ancestral genomes @@ -61720,12 +62553,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 The biological classification of organisms by categorizing them in groups ("clades") based on their most recent common ancestor. Cladistics Tree of life - - + + Cladistics @@ -61738,11 +62571,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 The study of the process and mechanism of change of biomolecules such as DNA, RNA, and proteins across generations. Molecular_evolution - - + + Molecular evolution @@ -61754,7 +62587,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Immunoinformatics is the field of computational biology that deals with the study of immunoloogical questions. Immunoinformatics is at the interface between immunology and computer science. It takes advantage of computational, statistical, mathematical approaches and enhances the understanding of immunological knowledge. 
Computational immunology Immunoinformatics @@ -61769,7 +62602,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 A diagnostic imaging technique based on the application of ultrasound. Standardized echography Ultrasound imaging @@ -61778,8 +62611,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution Medical ultrasound Standard echography Ultrasonography - - + + + imaging-revise Echography @@ -61790,7 +62624,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.24 + 1.24 Experimental approaches to determine the rates of metabolic reactions - the metabolic fluxes - within a biological entity. Fluxomics The "fluxome" is the complete set of metabolic fluxes in a cell, and is a dynamic aspect of phenotype. @@ -61803,16 +62637,16 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - 1.12 + + 1.12 An experiment for studying protein-protein interactions. Protein_interaction_experiment Co-immunoprecipitation Phage display Yeast one-hybrid Yeast two-hybrid - - + + This used to have the ID http://edamontology.org/topic_3557 but the numerical part (owing to an error) duplicated http://edamontology.org/operation_3557 ('Imputation'). ID of this concept set to http://edamontology.org/topic_3957 in EDAM 1.24. Protein interaction experiment @@ -61824,7 +62658,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 A DNA structural variation, specifically a duplication or deletion event, resulting in sections of the genome to be repeated, or the number of repeats in the genome to vary between individuals. 
Copy_number_variation CNV deletion @@ -61842,10 +62676,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 The branch of genetics concerned with the relationships between chromosomes and cellular behaviour, especially during mitosis and meiosis. - - + + Cytogenetics @@ -61856,7 +62690,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 The design of vaccines to protect against a particular pathogen, including antigens, delivery systems, and adjuvants to elicit a predictable immune response against specific epitopes. Vaccinology Rational vaccine design @@ -61864,8 +62698,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Structural vaccinology Structure-based immunogen design Vaccine design - - + + Vaccinology @@ -61876,10 +62710,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 + 1.25 The study of immune system as a whole, its regulation and response to pathogens using genome-wide approaches. - - + + Immunomics @@ -61891,17 +62725,18 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.25 - + 1.25 + The study of the epigenetic modifications of a whole cell, tissue, organism etc. Epistatic genetic interaction Epistatic interactions - - + + The study of the phenomena whereby the effects of one locus mask the allelic effects of another, such as how dominant alleles mask the effects of the recessive alleles at the same locus. Epistasis http://purl.bioontology.org/ontology/MSH/D057890 + TODO WikiData links and a broad synonym Gene interaction(s)! 
@@ -61910,12 +62745,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.26 + 1.26 Open and reliable science + Open research Open, reliable, transparent, and trustable science Open science encompasses the practices of making scientific research transparent and participatory, and its outputs publicly accessible. - - + + Open science @@ -61926,12 +62762,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.26 + 1.26 Data rescue denotes digitalisation, formatting, archival, and publication of data that were not available in accessible or usable form. Examples are data from private archives, data inside publications, or in paper records stored privately or publicly. - - + + Data rescue + @@ -61941,18 +62778,21 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.26 - FAIR data principles - FAIRification + 1.26 + FAIR data principles + FAIR principles + FAIRification FAIR data is data that meets the principles of being findable, accessible, interoperable, and reusable. Findable, accessible, interoperable, reusable data Open data - - + + A substantially overlapping term is 'open data', i.e. publicly available data that is free to use, distribute, and create derivative work from, without restrictions. Open data does not automatically have to be FAIR (e.g. findable or interoperable), while FAIR data does in some cases not have to be publicly available without restrictions (especially sensitive personal data). FAIR data + + @@ -61962,7 +62802,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.26 + 1.26 Microbial mechanisms for protecting microorganisms against antimicrobial agents. AMR Antibiotic resistance? 
@@ -61975,8 +62815,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution Multiresistance Pandrug resistance (PDR) Total drug resistance (TDR) - - + + Antimicrobial resistance @@ -61988,11 +62828,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.26 + 1.26 The monitoring method for measuring electrical activity in the brain. EEG - - + + + imaging-revise Electroencephalography @@ -62004,12 +62845,13 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.26 + 1.26 The monitoring method for measuring electrical activity in the heart. ECG EKG - - + + + imaging-revise Electrocardiography @@ -62020,11 +62862,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.26 + 1.26 A method for studying biomolecules and other structures at very low (cryogenic) temperature using electron microscopy. cryo-EM - - + + Cryogenic electron microscopy @@ -62035,34 +62877,206 @@ ows re-sequencing of complete genomes of any given organism with high resolution - 1.26 - Life + 1.26 + Life Biosciences, or life sciences, include fields of study related to life, living beings, and biomolecules. Life sciences - - + + Biosciences + + + + + + + + 1.26 + Biogeochemical cycle + The carbon cycle is the biogeochemical pathway of carbon moving through the different parts of the Earth (such as ocean, atmosphere, soil), or eventually another planet. + + + Note that the carbon-nitrogen-oxygen (CNO) cycle (https://en.wikipedia.org/wiki/CNO_cycle) is a completely different, thermonuclear reaction in stars. + Carbon cycle + + + + + + + + + + + + + + + + + 1.26 + Multiomics concerns integration of data from multiple omics (e.g. transcriptomics, proteomics, epigenomics). 
+ Integrative omics + Multi-omics + Pan-omics + Panomics + + + Multiomics + + + + + + + + + + 1.26 + With ribosome profiling, ribosome-protected mRNA fragments are analyzed with RNA-seq techniques leading to a genome-wide measurement of the translation landscape. + RIBO-seq + Ribo-Seq + RiboSeq + ribo-seq + ribosomal footprinting + translation footprinting + + + Ribosome profiling + + TODO: Revise synonyms and hierarchy + + + + + + + + + 1.26 + Combined with NGS (Next Generation Sequencing) technologies, single-cell sequencing allows the study of genetic information (DNA, RNA, epigenome...) at the single-cell level. It is often used for differential analysis and gene expression profiling. + Single-cell genomics + + + Single-cell sequencing + + + + + + + + + + + 1.26 + The study of mechanical waves in liquids, solids, and gases. + + + Acoustics + + + + + + + + + + + + 1.26 + Interdisciplinary study of the behavior, precise control, and manipulation of low (microlitre) volume fluids in constrained space. + Fluidics + + + Microfluidics + + + + + + + + + + 1.26 + Genomic imprinting is a gene regulation mechanism by which a subset of genes are expressed from one of the two parental chromosomes only. Imprinted genes are organized in clusters; silencing or activation of the imprinted loci involves epigenetic marks (DNA methylation, etc.) and so-called imprinting control regions (ICR). It has been described in mammals, but also plants and insects. + Gene imprinting + + + Genomic imprinting + + + + + + + + + + + + + + + + + 1.26 + Environmental DNA (eDNA) + Environmental RNA (eRNA) + Environmental sequencing + Taxonomic profiling + Metabarcoding is the barcoding of (environmental) DNA or RNA to identify multiple taxa from the same sample. 
+ DNA metabarcoding + Environmental metabarcoding + RNA metabarcoding + eDNA metabarcoding + eRNA metabarcoding + + + Typically, high-throughput sequencing is performed and the resulting sequence reads are matched to DNA barcodes in a reference database. + Metabarcoding + + + TODO also: + fix Metagenomics + create Environmental sequencing, eDNA & eRNA related terms + create Metaproteomics (has been unfinished in WebProtégé since 2022-07) + relate (or not) Meta[barcoding|genomics|proteomics|transcriptomics] with Microbial ecology, Environmental sequencing, Resequencing (now contradicting whole-genome and amplicon!), amplicon, shotgun, ... + fix DNA barcoding (now an operation), community/taxonomic profiling, amplicon sequencing, ... + + + + + + - Earth - Earth-like astronomical object? - Terrestrial planet? + Earth + Earth-like astronomical object? + Terrestrial planet? Planetary sciences - Earth sciences Having Geosciences also under Astronomy looks of course weird. However, how to connect Planetary sciences & co.? - Wikipedia in some articles suggests that Geogpraphy = Earth sciences. That's nonsense, or isn't it? Geosciences + TODO: In some contexts, the upper science is called "Earth and planetary sciences". We can consider that. Keep in mind: + - it's not only planets, also + - other astro objects + - also other spatial objects (especially geo-formats, e.g. for cell images) + - does this term sometimes exclude social geosciences? + - (broad) synonym Spatial studies/sciences? (Would even be a nice main label if picks up in the future) + @@ -62071,7 +63085,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - + NetCDF Zarr NetCDF-like Zarr data model NCZarr @@ -62136,6 +63150,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution Matúš Kalaš 2021-09-05T21:51:54.359923Z + (Future: Consider whether Astrophysics should stay separate,or be merged with Astronomy as a synonym. 
Depending on how much astronomy is "non-astrophysics", etc...) Astrophysics @@ -62158,12 +63173,12 @@ ows re-sequencing of complete genomes of any given organism with high resolution - Universe + Universe Astrosciences Space sciences - Space sciences: https://en.wikipedia.org/wiki/Outline_of_space_science adds mostly "Rocket science"/Space flight engineering🚀. However, https://en.wikipedia.org/wiki/Outline_of_academic_disciplines#Space_science has various topics as "siblings" of Astronomy (incl. Planetary science and Astrobiology). - Telescopy as a separate topic under Imaging??? (Or synonym of Astronomy if that is limited to observation, and separated from Astrophysics etc.) Astronomy + Space sciences: https://en.wikipedia.org/wiki/Outline_of_space_science adds mostly "Rocket science"/Space flight engineering🚀. However, https://en.wikipedia.org/wiki/Outline_of_academic_disciplines#Space_science has various topics as "siblings" of Astronomy (incl. Planetary science and Astrobiology). + Telescopy as a separate topic under Imaging??? (Or synonym of Astronomy if that is limited to observation, and separated from Astrophysics etc.) @@ -62183,13 +63198,15 @@ ows re-sequencing of complete genomes of any given organism with high resolution - - Telescope + Telescope Astronomical imaging? - Observational astronomy + Telescopy (Or synonym of Astronomy if that is limited to observation, and separated from Astrophysics etc.) - Telescopy + Observational astronomy + + + @@ -62198,7 +63215,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - + Matúš Kalaš 2021-09-05T23:24:40.903469Z Satellite imagery @@ -62224,7 +63241,6 @@ ows re-sequencing of complete genomes of any given organism with high resolution - Radio telescopy Radio astronomy @@ -62235,7 +63251,6 @@ ows re-sequencing of complete genomes of any given organism with high resolution - Neutrino telescopy? detection? 
Neutrino astronomy @@ -62270,11 +63285,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution - Justice - Law? - Rights - + ethics committees for research on humans, animal, ...? Accountability / bias / ethics of science itself? (also under Open science) Bias & ethics etc. in Machine learning(!) ... - Under Culture & humanities, or separate? + Justice + Law? + Rights Ethics @@ -62296,9 +63309,9 @@ ows re-sequencing of complete genomes of any given organism with high resolution - Could it be a topic of data analysis / tools / ...? Environmental ethics + @@ -62307,7 +63320,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - Registered report + Registered report Matúš Kalaš 2021-09-06T00:07:25.063142Z Pre-registration @@ -62365,10 +63378,11 @@ ows re-sequencing of complete genomes of any given organism with high resolution - Digital twin + Digital twin Matúš Kalaš 2022-05-04T13:32:08.604563Z - Modelling + Modeling and simulation + Modelling and simulation(?) @@ -62526,7 +63540,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - Geographic information systems (GIS) + Geographic information systems (GIS) Matúš Kalaš 2022-05-05T07:49:24.724238Z Geographic information science (GIScience) @@ -62535,6 +63549,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution + Spatial informatics and/or similar? @@ -62559,10 +63574,10 @@ ows re-sequencing of complete genomes of any given organism with high resolution Matúš Kalaš 2022-05-05T13:18:28.523052Z - Relation to Humanities & Anthropology? -- https://en.wikipedia.org/wiki/Humanities -- https://en.wikipedia.org/wiki/Anthropology Social sciences + Relation to Humanities & Anthropology? 
+- https://en.wikipedia.org/wiki/Humanities +- https://en.wikipedia.org/wiki/Anthropology @@ -62630,7 +63645,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - + jeani @@ -62644,6 +63659,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution + jeani 2022-05-06T12:54:03.110536Z Satellite remote sensing @@ -62678,6 +63694,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution Matúš Kalaš 2022-07-18T16:20:45.423453Z Epidemiology + TODO: Get details from the Public health topic @@ -62737,7 +63754,6 @@ ows re-sequencing of complete genomes of any given organism with high resolution Matúš Kalaš 2022-07-18T22:55:14.820834Z - TODO; this is a list of vaguely related keywords that we may or may not want to include somewhere: Decolonisation, Global South, Minorities (incl. Ethnical/Tribal/...), Small island/remote nations, Intersectionality, ... PLEASE ADD MORE 😊 Cultural diversity @@ -62785,7 +63801,7 @@ ows re-sequencing of complete genomes of any given organism with high resolution - + Matúš Kalaš 2022-09-10T00:37:01.346313Z Remote sensing @@ -62812,8 +63828,6 @@ ows re-sequencing of complete genomes of any given organism with high resolution Matúš Kalaš 2022-09-10T21:35:16.302722Z - - Culture and its products (both solutions & problems) versus Cultural studies (=Humanities?, incl.?=? Anthropology?, e.g. Archaeology, Archaeoastronomy): Should they be distinguished, or mixed together? And what with the various sub-disciplines of Anthropology? In contrast to the nicely broad definition on English Wikipedia, WikiData definition of https://www.wikidata.org/wiki/Q11042 is narrower in English and many other languages, and non-existent or vague in others. Therefore skos:closeMatch might be the best (narrowMatch unnecessarily explicit?). 
It might be unnecessary to model the concept of Art/Arts between Culture (art is narrower) and Visual art(s) (art is broader) @@ -62823,6 +63837,8 @@ ows re-sequencing of complete genomes of any given organism with high resolution General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or is more scientific than Humanities?? Culture and humanities + + @@ -62833,10 +63849,10 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Matúš Kalaš 2022-09-10T22:01:34.260233Z - Visual art Visual arts + @@ -62847,11 +63863,11 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Matúš Kalaš 2022-09-10T22:03:37.588183Z - - Scientific and technical illustration + + @@ -62863,15 +63879,15 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Matúš Kalaš 2022-09-10T22:05:52.943542Z - - - Botanical illustration Medical illustration Biological illustration + + + @@ -62883,9 +63899,9 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Matúš Kalaš 2022-09-10T22:37:14.612067Z - Archaeological illustration + @@ -62896,11 +63912,12 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Matúš Kalaš 2022-09-10T22:47:15.951859Z - - Archeology - Note: WikiData (not Wikipedia) has an interesting generic concept of "History, heritage, and archeology" https://www.wikidata.org/wiki/Q113077601 + Archeology Archaeology + Note: WikiData (not Wikipedia) has an interesting generic concept of "History, heritage, and archeology" https://www.wikidata.org/wiki/Q113077601 + + TODO wikidata archaeological find @@ -62936,11 +63953,11 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Matúš Kalaš 2022-09-11T22:45:55.356349Z - Landscape painting In most languages (not EN) and also to a major extent in the en.wikipedia, https://www.wikidata.org/wiki/Q191163 is limited to painting. 
Landscape art + @@ -62950,14 +63967,14 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i - Amateur astronomy (concept? then rather Citizen astronomy or Enthusiast or similiar) + Amateur astronomy (concept? then rather Citizen astronomy or Enthusiast or similiar) Matúš Kalaš 2022-09-11T23:06:51.390271Z - - Rather label Astronomical imaging (aka in en.wikipedia), if that isn't a synonym of Telescopy = Observational astronomy. + Rather label Astronomical imaging (aka in en.wikipedia), if that isn't a synonym of Telescopy = Observational astronomy. (Perhaps even distinguish "purer" photography from heavily computed imaging/visualisation??) Astrophotography + @@ -62969,12 +63986,12 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Matúš Kalaš 2022-09-12T12:45:48.016275Z - Computational humanities DH Humanities computing Digital humanities + @@ -62985,10 +64002,10 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Matúš Kalaš 2022-09-12T13:21:56.944735Z - Engineering + @@ -62998,14 +64015,13 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i - Do-it-yourself (DIY) culture?? - Do-it-yourself (DIY) hardware - Maker culture? - Maker movement? - Open design? + Do-it-yourself (DIY) culture?? + Do-it-yourself (DIY) hardware + Maker culture? + Maker movement? + Open design? Matúš Kalaš 2022-09-12T13:23:15.737553Z - Free and open-source hardware (FOSH) OSH Open hardware @@ -63016,6 +64032,7 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Open-source robotics (OSR) Open-source hardware + @@ -63026,12 +64043,12 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? 
Or i Matúš Kalaš 2022-09-12T20:41:43.908927Z - Ecological goods and services Ecosystem goods Nature's services Ecosystem services + @@ -63043,9 +64060,6 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i Matúš Kalaš 2022-09-12T21:20:08.819819Z - - - Atomic and molecular astrophysics? Chemical cosmology? Cosmochemistry? @@ -63054,6 +64068,9 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i + + + @@ -63126,7 +64143,7 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i - + @@ -63142,7 +64159,7 @@ General Anthropology as a narrow synonyms, or needs a separate sub-concept? Or i - Environmental pollution + Environmental pollution Matúš Kalaš 2022-09-26T13:16:05.814956Z Ecotoxicology? @@ -63173,10 +64190,16 @@ NOTE: E.g. light pollution is not toxic. - Environmental racism + Environmental injustice + Environmental racism Matúš Kalaš 2022-09-26T13:34:48.860002Z Environmental justice + + + + + @@ -63195,11 +64218,11 @@ NOTE: E.g. light pollution is not toxic. - + - + @@ -63229,7 +64252,7 @@ NOTE: E.g. light pollution is not toxic. - Land cover + Land cover Matúš Kalaš 2022-09-26T14:07:23.053777Z Land use @@ -63269,9 +64292,9 @@ NOTE: E.g. light pollution is not toxic. Matúš Kalaš 2022-09-29T13:24:38.200415Z - Genetic (non)discrimination? + @@ -63282,7 +64305,7 @@ NOTE: E.g. light pollution is not toxic. - + @@ -63324,8 +64347,9 @@ NOTE: E.g. light pollution is not toxic. Matúš Kalaš 2022-10-14T13:34:39.857504Z + Pedology Soil health - Soil science? + Soil science @@ -63334,6 +64358,7 @@ NOTE: E.g. light pollution is not toxic. + Matúš Kalaš 2022-10-14T13:36:59.127117Z EIA @@ -63356,7 +64381,7 @@ NOTE: E.g. light pollution is not toxic. - + Matúš Kalaš 2022-10-14T13:41:58.839583Z Spectrometry @@ -63404,8 +64429,8 @@ NOTE: E.g. light pollution is not toxic. 
- Pollen vector - Pollinator + Pollen vector + Pollinator Matúš Kalaš 2022-10-21T11:24:08.697676Z Biotic pollination @@ -63450,7 +64475,10 @@ NOTE: E.g. light pollution is not toxic. Matúš Kalaš 2022-10-26T21:30:50.382542Z + Eamiálbmotvuoigatvuođat Indigenous rights + + @@ -63459,6 +64487,7 @@ NOTE: E.g. light pollution is not toxic. + Matúš Kalaš 2022-10-26T21:40:14.829332Z @@ -63468,8 +64497,11 @@ NOTE: E.g. light pollution is not toxic. Indigenous knowledge (IK) Traditional cultural expressions (TCE) Local knowledge + Mātauranga taketake Traditional knowledge + Árbediehtu + @@ -63539,7 +64571,7 @@ NOTE: E.g. light pollution is not toxic. Matúš Kalaš 2022-11-07T13:32:55.235164Z Environmental surveillence? - Environmental montoring? + Environmental monitoring @@ -63548,9 +64580,10 @@ NOTE: E.g. light pollution is not toxic. + Matúš Kalaš 2022-11-07T13:33:53.364278Z - Wastewater surveillence? + Wastewater surveillence @@ -63574,9 +64607,9 @@ NOTE: E.g. light pollution is not toxic. Matúš Kalaš 2023-02-02T14:06:24.998675Z - Electrical engineering + @@ -63587,9 +64620,9 @@ NOTE: E.g. light pollution is not toxic. Matúš Kalaš 2023-02-02T14:15:09.276759Z - Mechanical engineering + @@ -63602,9 +64635,7 @@ NOTE: E.g. light pollution is not toxic. Matúš Kalaš 2023-02-02T14:16:19.216887Z - Energy systems engineering - Electrical energy engineering Electrical power engineering Power engineering @@ -63613,6 +64644,8 @@ NOTE: E.g. light pollution is not toxic. Energy engineering + + @@ -63624,10 +64657,1476 @@ NOTE: E.g. light pollution is not toxic. Matúš Kalaš 2023-02-02T14:25:12.552808Z - Energieinformatik Energy informatics + + + + + + + + + + Elementary particles + Matúš Kalaš + 2023-02-03T12:05:57.007011Z + Elementarteilchenphysik + High energy physics (HEP) + High-energy physics + Hochenergiephysik (HEP) + (HEP not only elementary particles, but also heavy ions (https://de.wikipedia.org/wiki/Schwerion) acc. to DE Wikipedia. Exact synonyms acc. 
to EN Wikipedia and Wikidata.) + Particle physics + Teilchenphysik + + + + + + + + + + + + + + + Matúš Kalaš + 2023-02-03T12:39:02.430724Z + Archaeoastronomy + + Directly under culture/humanities, or under archaeology? Or is it not archaeological? (as opposed to archaeobiology) + + + + + + + + + Matúš Kalaš + 2023-02-03T16:39:05.039713Z + Civil engineering + + + + + + + + + + Matúš Kalaš + 2023-02-03T20:43:40.935062Z + Computational linguistics + + + + + + + + + + Matúš Kalaš + 2023-02-07T14:13:47.515195Z + Global justice + + + + + + + + + + + Matúš Kalaš + 2023-02-09T11:13:21.41427Z + Version control + + + + + + + + + + + Data storage + erihje + 2023-02-24T10:59:23.548411Z + Registry + Repository + Data archive or library, often in the form of an online database where datasets are published (deposited) and archived (preserved) with a persistent identifier. + +Depositing datasets in public data repositories is an important step in making the data FAIR (Findable, Accessible, Interoperable and Reusable). + Data registry + Research data archiving + Data repository + + + + + + + + + + + Matúš Kalaš + 2023-03-08T10:14:53.294822Z + Women's rights + + + + + + + + + + + Matúš Kalaš + 2023-03-17T14:41:16.34877Z + Natural disaster management + + + + + + + + + Matúš Kalaš + 2023-03-30T13:50:11.974009Z + Protein design + Protein engineering + TODO: Probably obsolete Protein design operation (http://edamontology.org/operation_4008). And consider this one plus some operation? +And what about Small molecule design, and all the designs under http://edamontology.org/operation_2430 ? All topics? + + + + + + + + + Software stewardship + Matúš Kalaš + 2023-02-25T02:40:36.851802Z + Software management + + + + + + + + + + + Matúš Kalaš + 2023-07-26T09:10:57.797861Z + Single-cell transcriptomics + + + + + + + + + + erin.calhoun + 2023-02-27T13:06:10.9775Z + Molecules involved in biological processes. 
+ Biological molecule + Biomolecule + + + + + + + + + + + erin.calhoun + 2023-02-27T13:22:11.761026Z + The study of the structure, organization, function, development, and evolution of the nervous system, as well as mechanisms underlying perception, cognition, and behavior. + Neurobiology + Behavioural neuroscience + Cognitive neuroscience + Computational neuroscience + Developmental neuroscience + Molecular neuroscience + Neuroanatomy + Neurochemistry + Neuroethology + Neurophysiology + Systems neuroscience + Neuroscience + + + + + + + + + + + erin.calhoun + 2023-02-27T13:32:53.215999Z + The study of the structure and structural organization of the nervous system. + Neuroanatomy + + + + + + + + + + + + erin.calhoun + 2023-02-27T13:38:29.76265Z + Component of the nervous system of bilateral animals containing the brain and spinal cord. + CNS + Central nervous system + + + + + + + + + + + erin.calhoun + 2023-02-27T13:47:19.195761Z + Component of the nervous system comprising nerves and ganglia that form a communication network between the central nervous system and the rest of the body. + PNS + Peripheral nervous system + + + + + + + + + + + erin.calhoun + 2023-02-27T13:55:27.204618Z + The study of the functional organization of the nervous system and of the mechanisms underlying these functions. + Neurophysiology + + + + + + + + + + + erin.calhoun + 2023-02-27T13:58:56.40552Z + The study of neurobiological mechanisms underlying cognition and other higher-order brain functions and behaviors, based on methods derived from traditional neuroscience and psychology. + Behavioral neuroscience + Behavioural neuroscience + Cognitive neuropsychology + Cognitive psychology + Cognitive neuroscience + + + + + + + + + + + + + erin.calhoun + 2023-02-27T14:16:02.360707Z + Neurobiology + The study of neurobiological mechanisms underlying animal behaviour. 
+ Behavioral neuroscience + Biological psychology + Biopsychology + Psychobiology + Physiological psychology + Comparative psychology + Ethology + Evolutionary biology + Neuroethology + Behavioural neuroscience + + + + + + + + + + + erin.calhoun + 2023-02-27T14:27:31.189841Z + Developmental biology + The study of the development and maturation of the nervous system in animals. + Development of the nervous system + Neural development + Neurodevelopment + Developmental neuroscience + + + + + + + + + + + erin.calhoun + 2023-02-27T14:31:56.781298Z + Ethology + Neurobiology + The study of neurobiological mechanisms underlying natural animal behaviour (i.e. behaviours influenced by natural selection). + Neuroendocrinology + Behavioural neuroscience + Neuroethology + + + + + + + + + + + + erin.calhoun + 2023-02-27T14:43:02.583893Z + The study of neuroscience using mathematical models and computational simulations. + Mathematical neuroscience + Neuroinformatics + Theoretical neuroscience + Computational neuroscience + + + + + + + + + + + + erin.calhoun + 2023-02-27T15:08:19.469858Z + Biochemistry + Molecular biology + Neuropharmacology + Neurochemistry + + + + + + + + + + + erin.calhoun + 2023-02-27T15:10:45.752336Z + Molecular biology + The study of the molecular basis of neurobiological structures and functions. + Molecular neurobiology + Neurochemistry + Molecular neuroscience + + + + + + + + + + + erin.calhoun + 2023-02-27T15:16:47.782517Z + The study of the structure and function of neural pathways and networks. + Neuroendocrinology + Neuroimmunology + Sensory and motor neuroscience + Sensory neuroscience + Computational neuroscience + Developmental neuroscience + Neurophysiology + Systems neuroscience + + + + + + + + + + + erin.calhoun + 2023-02-27T15:32:03.591055Z + The study of molecular and cellular mechanisms of drug action. 
+ Neuropsychopharmacology + Neuropharmacology + + + + + + + + + + + erin.calhoun + 2023-02-27T15:36:42.064642Z + Neurochemistry + Neuropharmacology + Pharmacodynamics + The study of how drug interactions with neural circuits and neurotransmitter systems influence behaviour. + Neuropsychopharmacology + + + + + + + + + + + erin.calhoun + 2023-02-27T15:47:31.227128Z + Immunology + Systems neuroscience + The study of interactions between the nervous system and the immune system, especially in relation to disease pathology. + Neurology + Neuroimmunology + + + + + + + + + + + erin.calhoun + 2023-02-27T15:57:23.036565Z + Endocrinology + Neuroethology + Neurophysiology + Physiology + Systems neuroscience + The study of interactions between the nervous system and the endocrine system. + Behavioral endocrinology + Neuroendocrine integration + Neuroendocrinology + + + + + + + + + + + erin.calhoun + 2023-02-27T16:12:34.605995Z + The study of neuroscience using methods and approaches traditionally used in physics. + Neurobiophysics + Neurophysiology + Neurophysics + + + + + + + + + + + erin.calhoun + 2023-02-27T16:30:38.386392Z + Linguistics + Psycholinguistics + Neurolinguistics + + + + + + + + + + + erin.calhoun + 2023-02-28T08:37:04.481451Z + An area of machine learning where an algorithm uses labeled training examples (input-output pairs) to predict unseen data points. + Supervised machine learning + Supervised learning + + + + + + + + + + + erin.calhoun + 2023-02-28T08:37:15.074284Z + An area of machine learning that infers latent patterns and structure from unlabeled data to gain insights into the underlying data distribution. + Unsupervised machine learning + Unsupervised learning + + + + + + + + + + + erin.calhoun + 2023-02-28T08:37:27.489348Z + An area of machine learning where an algorithm interacts with an environment to search for suitable actions to maximize a reward. 
Unlike supervised learning algorithms, which use labeled data for training, reinforcement learning uses a process of trial and error to discover optimal outputs. This process focuses on balancing exploration (testing the effectiveness of new actions) with exploitation (using known actions for a high reward). + RL + Single-agent reinforcement learning + Deep reinforcement learning + MARL + Model-based reinforcement learning + Model-free reinforcement learning + Multi-agent reinforcement learning + Off-policy reinforcement learning + On-policy reinforcement learning + PPO + Policy gradient methods + Proximal policy optimization + Q-learning + SARSA + State-action-reward-state-action + TD learning + TRPO + Temporal difference learning + Trust region policy optimization + Reinforcement learning + + + + + + + + + + + + + + + + erin.calhoun + 2023-02-28T08:41:31.004816Z + Supervised learning + SVM + SVMs + Support vector machines + Support vector network + Support vector networks + Support vector machine + + + + + + + + + + + + Matúš Kalaš + 2023-07-26T09:13:01.376455Z + Single-cell epigenomics + + + + + + + + + + Matúš Kalaš + 2023-07-26T09:15:04.85232Z + Single-cell ATAQ-Seq + + Is this concept too detailed, or just right? 
+ + + + + + + + + erin.calhoun + 2023-02-28T09:03:07.637244Z + Decision tree + + + + + + + + + erin.calhoun + 2023-02-28T09:03:27.779266Z + RF + Random decision forest + Randomized trees + Random forest + + + + + + + + + + + erin.calhoun + 2023-02-28T09:03:54.268999Z + Gradient boosting machine + Gradient boosting + + + + + + + + + + + erin.calhoun + 2023-02-28T09:04:20.939385Z + A nonparametric supervised learning algorithm used in classification and regression tasks that finds the k closest training examples in a data set to a given example (where k is a hyperparameter determined by the user and "closest" is determined by a chosen similarity measure) and uses the labels of those examples to predict the label (in classification) or value (in regression) of the test example. + K-NN + K-nearest neighbors + KNN + Nearest neighbor search (NNS) + Proximity search + K-nearest neighbours + + + + + + + + + + + erin.calhoun + 2023-02-28T09:57:12.595479Z + Generative model + + + + + + + + + + + Activation function + Backpropagation + Bias + Hidden layer + Input layer + Neuron + Output layer + Weight + erin.calhoun + 2023-02-28T09:04:43.20096Z + Cognitive computing + Machine learning models that simulate a biological neural network by transforming an input signal to an output signal according to mathematical operations at a series of artificial neurons (nodes) interconnected by edges that may be weighted to adjust the signal strength. Models typically differ according to the structure of the nodes, network topology, and the learning algorithm chosen to find network weights. 
+ ANN + ANNs + Artificial neural networks + NN + NNs + Neural net + Neural nets + Neural network + Neural networks + Autoencoder + CNN + Convolutional neural network + Dense neural network + DenseNet + Densely connected convolutional network + FCNN + Feedforward neural network + Fully connected neural network + LSTM + Long short-term memory + MLP + Multi-layer perceptron + Multilayer perceptron + RBM + RNN + Recurrent neural network + Restricted Boltzmann machine + SNN + Spiking neural network + Deep learning + Artificial neural network + + + + + + + + + + + erin.calhoun + 2023-02-28T09:16:21.873956Z + K-means clustering + + + + + + + + + erin.calhoun + 2023-02-28T09:16:34.635695Z + Hierarchical clustering + + + + + + + + + erin.calhoun + 2023-02-28T09:17:06.205093Z + DBSCAN + + + + + + + + + + Matúš Kalaš + 2023-07-26T09:23:34.809729Z + Single-cell omics + This term is used, somewhat. But not on Wikipedia. Do we need it as a separate concept? + + + + + + + + + Matúš Kalaš + 2023-07-26T09:28:16.021181Z + Single-cell (biological) study + + + + + + + + + + erin.calhoun + 2023-02-28T09:20:39.749016Z + 2023-02-28T09:20:55.91588Z + Pattern recognition + Classification + Statistical classification + + + + + + + + + + + erin.calhoun + 2023-02-28T09:59:21.283915Z + Model-free reinforcement learning + Reinforcement learning + A model-free, off-policy reinforcement learning algorithm used to find an optimal policy, or state-action mapping, for an environment through iterative updates of the Q-value (state-action pair) used to estimate the expected cumulative reward. 
+ Q-learning + + + + + + + + + + + erin.calhoun + 2023-02-28T09:21:09.195382Z + Conditional model + Discriminative model + + + + + + + + + + + Matúš Kalaš + 2023-07-26T15:03:45.036645Z + Spatial transcriptomics + + + + + + + + + erin.calhoun + 2023-02-28T10:03:55.566353Z + Ensemble learning + + + + + + + + + + + erin.calhoun + 2023-02-28T10:09:05.697273Z + Cluster analysis + + + + + + + + + erin.calhoun + 2023-02-28T10:10:53.163563Z + Outlier + The detection of data points or patterns in data that appear inconsistent with, or deviate significantly from, other points in a data set. + Novelty detection + Outlier detection + Semi-supervised anomaly detection + Supervised anomaly detection + Unsupervised anomaly detection + Anomaly detection + + + + + + + + + + + erin.calhoun + 2023-02-28T10:12:00.79122Z + Deep learning + + + + + + + + + erin.calhoun + 2023-02-28T10:13:54.972292Z + Linear classifier + + + + + + + + + + + erin.calhoun + 2023-02-28T10:14:42.155192Z + Bayesian network + Conditional probability + Probabilistic classification + Naive Bayes + Bayes' theorem + Bayesian probability + Naive Bayes classifier + + + + + + + + + + + erin.calhoun + 2023-02-28T10:16:01.62953Z + Recurrent neural network + + + + + + + + + erin.calhoun + 2023-02-28T14:21:36.710964Z + An area of machine learning that improves model performance by training with both labeled and unlabeled data. The use of unlabeled data during training is common to both semi-supervised learning and weakly supervised learning. However, weak supervision techniques include training examples with incomplete, inexact, or inaccurate labels. + Semi-supervised machine learning + Active learning + Weak supervision + Weakly supervised learning + Semi-supervised learning + + + + + + + + + + + + Matúš Kalaš + 2023-09-13T12:23:37.271322Z + Evolutionary biodiversity? 
+ Evolutionary ecology + + + + + + + + + + + erin.calhoun + 2023-02-28T16:02:26.392455Z + Bootstrapping + Bagging + Bootstrap aggregating + + + + + + + + + + + erin.calhoun + 2023-02-28T16:03:35.703241Z + Linear regression + Boosting + + + + + + + + + + + Backpropagation + Gradient descent + Gradient method + erin.calhoun + 2023-03-01T08:11:22.535509Z + Artificial neural network + Feedforward neural network + A type of feedforward neural network with multiple layers of nodes (an input layer, at least one hidden layer, and an output layer) that can learn nonlinear decision boundaries. Multilayer perceptrons use backpropagation and gradient methods to learn input weights and update the network to minimize the error between predicted and actual outputs. + MLP + Multi-layer perceptron + Multilayer perceptron + + + + + + + + + + + erin.calhoun + 2023-03-01T08:11:41.823574Z + A type of artificial neural network that transmits information forward from the input layer through hidden layers (where applicable) to the output layer without cycling. 
+ FNN + Feed forward neural network + Feed-forward neural network + MLP + McCulloch-Pitt neural network + Multi-layer perceptron + Multiclass perceptron + Multilayer perceptron + Perceptron + Single-layer perceptron + Feedforward neural network + + + + + + + + + + + erin.calhoun + 2023-03-01T08:11:51.600356Z + Convolutional neural network + + + + + + + + + erin.calhoun + 2023-03-01T08:12:10.20786Z + Long short-term memory + + + + + + + + + Anomaly detection + Bottleneck + Decoder + Dimensionality reduction + Encoder + Encoder-decoder + Feature learning + Generative model + Image processing + Information retrieval + Latent space + Principal component analysis + Reconstruction error + erin.calhoun + 2023-03-01T08:12:26.223067Z + Artificial neural network + Deep learning + Machine learning + Unsupervised learning + A type of artificial neural network that uses unsupervised learning to map inputs to a compressed representation of data that captures the most important features, then map the data to a more efficiently coded reconstruction of the input. 
+ AE + Auto-encoder + Auto-encoders + Autoencoder neural network + Autoencoder neural networks + Autoencoders + Neural network autoencoder + Neural network autoencoders + CAE + Contractive autoencoder + DAE + Denoising autoencoder + K-sparse autoencoder + Overcomplete autoencoder + Regularised autoencoder + Regularized autoencoder + SAE + Sparse autoencoder + VAE + Variational autoencoder + Convolutional autoencoder + Deep autoencoder + Recurrent autoencoder + Stacked autoencoder + Autoencoder + + + + + + + + + + + erin.calhoun + 2023-03-01T08:12:44.895651Z + Restricted Boltzmann machine + + + + + + + + + erin.calhoun + 2023-03-01T08:13:06.752314Z + Spiking neural network + + + + + + + + + erin.calhoun + 2023-03-01T08:13:21.382365Z + Dense neural network + + + + + + + + + Neuron + erin.calhoun + 2023-03-01T08:23:46.599094Z + A mathematical model of a biological neuron used as a building block for artificial neural networks. + Artificial neurone + Artificial neurons + Formal neuron + Binary neuron + Linear threshold function + MCP neuron + McCulloch-Pitts neuron + Nv neuron + Perceptron (artificial neuron) + Semi-linear unit + Artificial neuron + + + + + + + + + + + erin.calhoun + 2023-03-01T09:17:00.408582Z + Binary classifier + Linear classifier + Supervised learning + The simplest supervised learning algorithm, as implemented by Frank Rosenblatt in 1957 for binary classification of linearly separable classes. In this context, perceptron refers to a linear binary classifier that uses a step function to classify input data into one of two classes. Weights associated with each input are iteratively updated to minimize error between the predicted and actual outputs. + +Note: the term perceptron is used other ways as a case of an artificial neuron, several types of neural network, and somewhat interchangeably with logistic regression (although it lacks the probabilistic assumptions of logistic regression). 
+ Rosenblatt perceptron + Rosenblatt perceptron model + Perceptron (algorithm) + + + + + + + + + + + erin.calhoun + 2023-03-01T10:42:24.843547Z + Artificial neuron + In this context, perceptron refers to a simplified case of an artificial neuron that produces a single output (binary or continuous) as the weighted sum of multiple inputs passed through a Heaviside step or sigmoid/logistic activation function. + +Note: the term perceptron is used other ways as a supervised learning algorithm, a type of neural network, and somewhat interchangeably with logistic regression (although it lacks the probabilistic assumptions of logistic regression). + Perceptrons + Perceptron (artificial neuron) + + + + + + + + + erin.calhoun + 2023-03-01T10:50:59.275367Z + A type of neural network with a single layer of perceptrons that receive identical inputs but are weighted differently. The output depends on the activation of each perceptron in the layer. In modern usage, perceptron or single-layer perceptron more broadly refers to any neural network based on linear threshold units and is not limited to networks using the perceptron learning algorithm for training. + +Note: the term perceptron is used other ways as a supervised learning algorithm, a case of an artificial neuron, and somewhat interchangeably with logistic regression (although it lacks the probabilistic assumptions of logistic regression). + Single-layer artificial neural network + Single-layer perceptron + Perceptron (artificial neural network) + + + + + + + + + + erin.calhoun + 2023-03-01T11:44:39.547454Z + Artificial neuron + An artificial neuron that performs binary classification by producing a binary output as the sum of binary inputs applied to a threshold value. 
+ M-P neuron + MCP neuron + MP neuron + McCulloch-Pitts model + McCulloch-Pitts neurone + Threshold logic unit + McCulloch-Pitts neuron + + + + + + + + + + + erin.calhoun + 2023-03-01T13:14:05.058007Z + U-Net + + + + + + + + + + + Matúš Kalaš + 2023-03-02T10:28:41.069112Z + Reproducible research + + + + TODO. Check also definitions and discussions linked from https://the-turing-way.netlify.app/reproducible-research/overview/overview-definitions.html + + + + + + + + + + A related search term with a different scope + Matúš Kalaš + 2023-02-24T09:45:21.41427Z + Slightly broader meaning + A TEMPLATE for Topic concepts in EDAM. + The same thing (TSG) + Slightly narrower meaning + Mostly overlapping, but not exact, narrower, or broader. + Mandatory when released: rdfs:label, hasDefinition. +Mandatory but can be semi-automated: Created in, subsets, ... +Desired: rdfs:seeAlso to a Wikipedia article(s), match link(s) to a WikiData item(s) +Optional: rdfs:comment(s), synonyms and related terms +Removed for release: created_by, creation_date, skos:editorialNote(s) + Optional, zero or more. A comment adds important information to the definition, synonyms, external links. May also be "not to be confused with". + {Topic TEMPLATE} + + SKOS 'editorial note' comment is just an editorial comment that will not be released. E.g. TODO - Improve this TEMPLATE! + + + + + + + + + + Matúš Kalaš + 2023-11-18T19:30:02.458075Z + Biomimicry + + + + + + + + + + Matúš Kalaš + 2023-11-18T19:30:15.558177Z + Nature-based solutions + + + + + + + + + Matúš Kalaš + 2023-11-18T19:30:53.712266Z + (Waste)Water treatment + + + + + + + + + + Matúš Kalaš + 2023-11-18T19:31:25.556167Z + Water management + + + + + + + + + + Matúš Kalaš + 2023-11-18T19:33:25.345082Z + Biodiversity observation? [synonym(what kind), or not?] + Biodiversity monitoring + + + + + + + + + + Matúš Kalaš + 2023-11-18T19:37:56.225232Z + Anthropology + TODO: Reckonably, not all subdomains of Anthropology should go under Biosciences. 
Does the same hold for Culture & humanities? + + + + + + + + + Matúš Kalaš + 2023-11-18T19:38:47.869929Z + Forensic anthropology + + + + + + + + + + Matúš Kalaš + 2023-11-18T19:39:02.819396Z + Social anthropology + + + + + + + + Matúš Kalaš + 2023-02-24T09:45:21.41427Z + {TEMPLATE concept, generic} + + + + + + + + + + Matúš Kalaš + 2023-02-24T09:45:21.41427Z + {Obsolete concept TEMPLATE} @@ -63636,8 +66135,8 @@ NOTE: E.g. light pollution is not toxic. - 1.2 - + 1.2 + An obsolete concept (redefined in EDAM). Needed for conversion to the OBO format.