Merge pull request #113 from nfdi4plants/datamap_sheet_extension
Extend datamap logic with alternative worksheet implementation
HLWeil authored Jul 15, 2024
2 parents 3010ddc + b1e2f81 commit 44bddf6
Showing 2 changed files with 8 additions and 4 deletions.
4 changes: 2 additions & 2 deletions ARC specification.md
@@ -76,9 +76,9 @@ ARCs are based on a strict separation of data and metadata content into study ma

Each ARC is a directory containing the following elements:

-- *Studies* are collections of material and resources used within the investigation. Study-level metadata is stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.study.xlsx` file, which MUST exist to specify the input material or data resources. Resources MAY include biological materials (e.g. plant samples, analytical standards) created during the current investigation. Resources MAY further include external data (e.g., knowledge files, results files) that need to be included and cannot be referenced due to external limitations. Resources described in a study file can be the input for one or multiple assays. Further details on `isa.study.xlsx` are specified [below](#study-and-resources). Resource (descriptor) files MUST be placed in a `resources` subdirectory. Further explications about data entities defined in the study are stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file, which SHOULD exist for studies containing data. Further details on `isa.datamap.xlsx` are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file).
+- *Studies* are collections of material and resources used within the investigation. Study-level metadata is stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.study.xlsx` file, which MUST exist to specify the input material or data resources. Resources MAY include biological materials (e.g. plant samples, analytical standards) created during the current investigation. Resources MAY further include external data (e.g., knowledge files, results files) that need to be included and cannot be referenced due to external limitations. Resources described in a study file can be the input for one or multiple assays. Further details on `isa.study.xlsx` are specified [below](#study-and-resources). Resource (descriptor) files MUST be placed in a `resources` subdirectory. Further explications about data entities defined in the study are stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file in the **study folder** or `isa_datamap` worksheet in the **isa.study.xlsx** file. Further details on this are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file).

-- *Assays* correspond to outcomes of experimental assays or analytical measurements (in the interpretation of the ISA model) and are treated as immutable data. Each assay is a collection of files, together with a corresponding metadata file, stored in a subdirectory of the top-level subdirectory `assays`. Assay-level metadata is stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.assay.xlsx` file, which MUST exist for each assay. Further details on `isa.assay.xlsx` are specified [below](#assay-data-and-metadata). Assay data files MUST be placed in a `dataset` subdirectory. Further explications about data entities defined in the assay are stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file, which SHOULD exist for each assay. Further details on `isa.datamap.xlsx` are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file).
+- *Assays* correspond to outcomes of experimental assays or analytical measurements (in the interpretation of the ISA model) and are treated as immutable data. Each assay is a collection of files, together with a corresponding metadata file, stored in a subdirectory of the top-level subdirectory `assays`. Assay-level metadata is stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.assay.xlsx` file, which MUST exist for each assay. Further details on `isa.assay.xlsx` are specified [below](#assay-data-and-metadata). Assay data files MUST be placed in a `dataset` subdirectory. Further explications about data entities defined in the assay are stored in [ISA-XLSX](#isa-xlsx-format) format in a `isa.datamap.xlsx` file in the **assay folder** or `isa_datamap` worksheet in the **isa.assay.xlsx** file. Further details on this are specified [in the isa-xlsx specification](ISA-XLSX.md#datamap-file).

- *Workflows* represent data analysis routines (in the sense of CWL tools and workflows) and are a collection of files, together with a corresponding CWL description, stored in a single directory under the top-level `workflows` subdirectory. A per-workflow executable CWL description is stored in `workflow.cwl`, which MUST exist for all ARC workflows. Further details on workflow descriptions are given [below](#workflow-description).
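
The two datamap locations named in the study and assay items of this hunk (a standalone `isa.datamap.xlsx` file in the folder, or an `isa_datamap` worksheet inside `isa.study.xlsx` / `isa.assay.xlsx`) could be detected roughly as sketched below. This is only an illustration using the Python standard library: the helper names `worksheet_names` and `find_datamaps` are hypothetical, and because the text shown here does not say which location takes precedence when both exist, the sketch simply reports every location it finds.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# XML namespace of the SpreadsheetML workbook part (xl/workbook.xml).
_NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def worksheet_names(xlsx_path):
    """Return the worksheet names of an .xlsx workbook (hypothetical helper)."""
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.findall("m:sheets/m:sheet", _NS)]


def find_datamaps(folder, isa_file):
    """List the datamap locations present in one study or assay folder.

    isa_file is "isa.study.xlsx" or "isa.assay.xlsx".
    Returns a list containing "file" and/or "worksheet" (assumed labels).
    """
    folder = Path(folder)
    found = []
    # Location 1: standalone datamap file in the study/assay folder.
    if (folder / "isa.datamap.xlsx").is_file():
        found.append("file")
    # Location 2: `isa_datamap` worksheet inside the study/assay workbook.
    workbook = folder / isa_file
    if workbook.is_file() and "isa_datamap" in worksheet_names(workbook):
        found.append("worksheet")
    return found
```

A caller would pass, for example, an assay folder and `"isa.assay.xlsx"`; an empty list means neither location holds a datamap.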

8 changes: 6 additions & 2 deletions ISA-XLSX.md
@@ -118,6 +118,8 @@ Additionally, the `Study File` SHOULD contain one or more [`Annotation Table she

Therefore, the main entities of the `Study File` should be `Sources` and `Samples`.

+Any Study MAY contain datamap references as described in the [`Datamap Sheet`](#datamap-table-sheets) section.
+
The `Study File` implements the [`Study`](https://isa-specs.readthedocs.io/en/latest/isamodel.html#study) graph from the ISA Abstract Model.

# Assay File
@@ -133,6 +135,8 @@ Additionally, the `Assay File` SHOULD contain one or more [`Annotation Table she

Therefore, the main entities of the `Assay File` should be `Samples` and `Data`.

+Any Assay MAY contain datamap references as described in the [`Datamap Sheet`](#datamap-table-sheets) section.
+
The `Assay File` implements the [`Assay`](https://isa-specs.readthedocs.io/en/latest/isamodel.html#assay) graph from the ISA Abstract Model.

# Datamap File
@@ -785,9 +789,9 @@ If we pool two sources into a single sample, we might represent this as:
| source1 | sample collection | sample1 |
| source2 | sample collection | sample1 |

-# Datamap Table sheets
+# Datamap table sheets

-`Datamap Table sheets` are used to describe the contents of data files.
+`Datamap Table sheets` are used to describe the contents of data files.

In the `Datamap Table sheets`, column headers MUST have the first letter of each word in upper case, with the exception of the referencing label (REF).
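
The header rule above can be checked mechanically. The sketch below takes one strict reading of it as an assumption: every whitespace-separated word must be capitalised (first letter upper case, remaining letters lower case), and the literal word `REF` is the only all-caps exception. The function name `header_is_well_formed` is hypothetical, and the sample headers are illustrative rather than taken from the specification.

```python
def header_is_well_formed(header):
    """Check a column header against an assumed strict reading of the rule.

    Each word must equal its capitalised form, except the referencing
    label "REF", which is allowed to be fully upper case.
    """
    words = header.split()
    if not words:
        return False
    return all(word == "REF" or word == word.capitalize() for word in words)


# Illustrative headers only (not quoted from the specification).
for candidate in ["Data Format", "Term Source REF", "data format"]:
    print(candidate, header_is_well_formed(candidate))
```

A looser reading that only inspects the first letter of each word would accept more headers; which reading applies is not settled by the sentence above.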
