Merge pull request #113 from nfdi4plants/datamap_sheet_extension

Extend datamap logic with alternative worksheet implementation
nfdi4plants · Jul 15, 2024 · 44bddf6
2 parents 3010ddc + b1e2f81
commit 44bddf6

Showing 2 changed files with 8 additions and 4 deletions.
diff --git a/ARC specification.md b/ARC specification.md
@@ -76,9 +76,9 @@ ARCs are based on a strict separation of data and metadata content into study ma
 
 Each ARC is a directory containing the following elements:
 
-- *Studies* are collections of material and resources used within the investigation. Study-level metadata is stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.study.xlsx` file, which MUST exist to specify the input material or data resources. Resources MAY include biological materials (e.g. plant samples, analytical standards) created during the current investigation. Resources MAY further include external data (e.g., knowledge files, results files) that need to be included and cannot be referenced due to external limitations. Resources described in a study file can be the input for one or multiple assays. Further details on `isa.study.xlsx` are specified [below](#study-and-resources). Resource (descriptor) files MUST be placed in a `resources` subdirectory. Further explications about data entities defined in the study are stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file, which SHOULD exist for studies containing data. Further details on `isa.datamap.xlsx` are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file).
+- *Studies* are collections of material and resources used within the investigation. Study-level metadata is stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.study.xlsx` file, which MUST exist to specify the input material or data resources. Resources MAY include biological materials (e.g. plant samples, analytical standards) created during the current investigation. Resources MAY further include external data (e.g., knowledge files, results files) that need to be included and cannot be referenced due to external limitations. Resources described in a study file can be the input for one or multiple assays. Further details on `isa.study.xlsx` are specified [below](#study-and-resources). Resource (descriptor) files MUST be placed in a `resources` subdirectory. Further explications about data entities defined in the study are stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file in the **study folder** or `isa_datamap` worksheet in the **isa.study.xlsx** file. Further details on this are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file).
 
-- *Assays* correspond to outcomes of experimental assays or analytical measurements (in the interpretation of the ISA model) and are treated as immutable data. Each assay is a collection of files, together with a corresponding metadata file, stored in a subdirectory of the top-level subdirectory `assays`. Assay-level metadata is stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.assay.xlsx` file, which MUST exist for each assay. Further details on `isa.assay.xlsx` are specified [below](#assay-data-and-metadata). Assay data files MUST be placed in a `dataset` subdirectory. Further explications about data entities defined in the assay are stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file, which SHOULD exist for each assay. Further details on `isa.datamap.xlsx` are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file).
+- *Assays* correspond to outcomes of experimental assays or analytical measurements (in the interpretation of the ISA model) and are treated as immutable data. Each assay is a collection of files, together with a corresponding metadata file, stored in a subdirectory of the top-level subdirectory `assays`. Assay-level metadata is stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.assay.xlsx` file, which MUST exist for each assay. Further details on `isa.assay.xlsx` are specified [below](#assay-data-and-metadata). Assay data files MUST be placed in a `dataset` subdirectory. Further explications about data entities defined in the assay are stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file in the **assay folder** or `isa_datamap` worksheet in the **isa.assay.xlsx** file. Further details on this are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file).
 
 - *Workflows* represent data analysis routines (in the sense of CWL tools and workflows) and are a collection of files, together with a corresponding CWL description, stored in a single directory under the top-level `workflows` subdirectory. A per-workflow executable CWL description is stored in `workflow.cwl`, which MUST exist for all ARC workflows. Further details on workflow descriptions are given [below](#workflow-description).
 

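Note on the change above: the datamap for a study or assay can now live in either of two places. A minimal sketch of how a tool might resolve that is shown here. The function name `locate_datamap`, its parameters, and the return shape are hypothetical, and the precedence (standalone file first, worksheet second) is an assumption, since the diff does not say which location wins when both exist. Sheet names are passed in by the caller so the sketch does not depend on any particular XLSX library.

```python
from pathlib import Path
from typing import Iterable, Optional, Tuple

# Names taken from the specification text in the diff above.
DATAMAP_FILE = "isa.datamap.xlsx"
DATAMAP_SHEET = "isa_datamap"


def locate_datamap(
    entity_dir: Path,
    metadata_name: str,
    sheet_names: Iterable[str],
) -> Optional[Tuple[str, Path]]:
    """Return where the datamap of a study or assay is stored, if anywhere.

    entity_dir    -- the study or assay folder
    metadata_name -- "isa.study.xlsx" or "isa.assay.xlsx"
    sheet_names   -- worksheet names found in that metadata file

    ASSUMPTION: a standalone isa.datamap.xlsx takes precedence over the
    worksheet variant; the specification does not define a precedence.
    """
    standalone = entity_dir / DATAMAP_FILE
    if standalone.is_file():
        return ("file", standalone)
    if DATAMAP_SHEET in set(sheet_names):
        return ("worksheet", entity_dir / metadata_name)
    # Neither variant is present: the entity has no datamap.
    return None
```

A caller would read the worksheet names of `isa.study.xlsx` or `isa.assay.xlsx` with whatever spreadsheet library it already uses and pass them in; a result of `None` means no datamap was found in either location.
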
diff --git a/ISA-XLSX.md b/ISA-XLSX.md
@@ -118,6 +118,8 @@ Additionally, the `Study File` SHOULD contain one or more [`Annotation Table she
 
 Therefore, the main entities of the `Study File` should be `Sources` and `Samples`.
 
+Any Study MAY contain datamap references as described in the [`Datamap Sheet`](#datamap-table-sheets) section.
+
 The `Study File` implements the [`Study`](https://isa-specs.readthedocs.io/en/latest/isamodel.html#study) graph from the ISA Abstract Model.
 
 # Assay File
@@ -133,6 +135,8 @@ Additionally, the `Assay File` SHOULD contain one or more [`Annotation Table she
 
 Therefore, the main entities of the `Assay File` should be `Samples` and `Data`.
 
+Any Assay MAY contain datamap references as described in the [`Datamap Sheet`](#datamap-table-sheets) section.
+
 The `Assay File` implements the [`Assay`](https://isa-specs.readthedocs.io/en/latest/isamodel.html#assay) graph from the ISA Abstract Model.
 
 # Datamap File
@@ -785,9 +789,9 @@ If we pool two sources into a single sample, we might represent this as:
 | source1       | sample collection | sample1       |
 | source2       | sample collection | sample1       |
 
-# Datamap Table sheets
+# Datamap table sheets
 
-`Datamap Table sheets` are used to describe the contents of data files. 
+`Datamap Table sheets` are used to describe the contents of data files.
 
 In the `Datamap Table sheets`, column headers MUST have the first letter of each word in upper case, with the exception of the referencing label (REF).
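
The header rule in the last context line could be checked roughly as follows. This is only a sketch under one reading of the rule: each whitespace-separated word must begin with an upper-case letter, and nothing is required of the remaining letters, so the all-caps `REF` label passes. The function name `header_is_valid` is hypothetical and not part of the specification.

```python
def header_is_valid(header: str) -> bool:
    """Check that every word in a column header starts with an upper-case letter.

    ASSUMPTION: only the first character of each word is constrained, which
    lets the all-caps referencing label "REF" pass without a special case.
    """
    words = header.split()
    if not words:
        # An empty or whitespace-only header is treated as invalid.
        return False
    return all(word[0].isupper() for word in words)
```

Under this reading, a header such as `Term Source REF` is accepted, while `data format` is rejected because neither word is capitalised.
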