roxmer edited this page Apr 27, 2017 · 3 revisions

Prepare the Standard for the mapping

Standard File

The standard file is a CSV file that contains three columns:

  • Entity: The name of the entity that the data refers to. The first entity listed is the fact table.
  • Attribute: Name of the attribute for a given entity
  • Description: Text describing the attribute

File name: Standard.csv

NOTE: The file name matters! The file must be named exactly Standard.csv.

In our biobank federation example the standard could be defined as follows:

Entity | Attribute | Description
Sample | Material type | The biospecimen type saved from a biological entity for testing, diagnostic, propagation, treatment or research purposes. Can be several values. MIABIS-2.0-14
Sample | Anatomical site | The anatomical position of the body where the solid sample was taken from. MIABIS-44
Sample | Sex | The sex of the study participants. Can be one or more of the following values: Male, Female, Unknown, Undifferentiated. MIABIS-2.0-09
Sample | Disease | ICD-10 code
Sample | Sample Collection ID | Sample collection to which the sample belongs
Sample Collection | ID | Sample Collection ID that links the sample to the sample collection
Sample Collection | Name | The name of the sample collection in English
Sample Collection | Contact Information ID | Contact information for the contact person of the sample collection. MIABIS-2.0-07 and MIABIS-2.0-23 of the PI
Contact Information | ID | The unique ID of the contact person
Contact Information | Email | Email address of the contact person
Contact Information Email Email address of the Contact person

The first Entity in the Standard is Sample, which is the fact table in the model.

NOTE: It is important to define the attributes that hold the primary key of another entity (for example, Sample Collection ID in Sample, which refers to ID in Sample Collection).
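As a rough illustration, the sketch below writes a shortened version of the example standard with Python's csv module and reads it back to find the fact table. It assumes a comma-delimited file with a header row, which this page does not state explicitly. The names STANDARD_ROWS, write_standard and fact_table are hypothetical and not part of the tool.

```python
import csv

# Rows follow the three columns described above: Entity, Attribute, Description.
# Shortened version of the biobank example; descriptions are abbreviated.
STANDARD_ROWS = [
    ("Sample", "Material type", "The biospecimen type saved from a biological entity"),
    ("Sample", "Sex", "The sex of the study participants"),
    ("Sample", "Sample Collection ID", "Sample collection to which the sample belongs"),
    ("Sample Collection", "ID", "Sample Collection ID that links the sample to the sample collection"),
    ("Sample Collection", "Name", "The name of the sample collection in English"),
]


def write_standard(path, rows):
    """Write the standard as CSV (assumed: comma delimiter, header row)."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Entity", "Attribute", "Description"])
        writer.writerows(rows)


def fact_table(path):
    """Return the entity of the first data row, which is the fact table."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        first = next(reader)
        return first["Entity"]
```

Calling write_standard("Standard.csv", STANDARD_ROWS) and then fact_table("Standard.csv") returns the entity of the first row, so the order of the rows decides which entity is treated as the fact table.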

List Values File

The list values file is a CSV file that contains two columns:

  • Attribute: Name of the attribute that has pre-defined values in any of the entities where the attribute is used
  • Value: Value of the attribute

File name: Standard_List_Values.csv

NOTE: The file name matters! The file must be named exactly Standard_List_Values.csv.

The list values file provides the pre-defined values of the attributes to be mapped. In our example, the Sample entity has two attributes with pre-defined values: Material type and Sex.

The list values can be defined as follows:

Attribute | Value
Sex | Male
Sex | Female
Sex | Unknown
Sex | Undifferentiated
Material type | Blood
Material type | DNA
Material type | Immortalized Cell Lines
Material type | Plasma
Material type | Saliva
Material type | Serum
Material type | Tissue (Frozen)
Material type | Tissue (FFPE)
Material type | RNA
Material type | Urine

This means that, in the datasets, the values in the columns mapped to Sex and to Material type need to be parsed and assigned to the corresponding pre-defined values.
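One way to picture this step is the sketch below, which loads a shortened list values file and matches a raw dataset value against the pre-defined values. It is not the tool's actual parsing logic: the comma delimiter, the header row, the case-insensitive matching and the names LIST_VALUES_CSV, load_list_values and map_value are all assumptions made for illustration.

```python
import csv
import io

# Contents as they might appear in Standard_List_Values.csv
# (assumed: comma delimiter, header row; list shortened).
LIST_VALUES_CSV = """Attribute,Value
Sex,Male
Sex,Female
Sex,Unknown
Sex,Undifferentiated
Material type,Blood
Material type,DNA
Material type,Tissue (Frozen)
"""


def load_list_values(text):
    """Group the pre-defined values by attribute name, keeping file order."""
    allowed = {}
    for row in csv.DictReader(io.StringIO(text)):
        allowed.setdefault(row["Attribute"], []).append(row["Value"])
    return allowed


def map_value(allowed, attribute, raw):
    """Return the pre-defined value matching raw, or None if there is none.

    Matching ignores case and surrounding whitespace (an assumption made
    for this sketch).
    """
    wanted = raw.strip().lower()
    for value in allowed.get(attribute, []):
        if value.lower() == wanted:
            return value
    return None
```

With this sketch, a raw value such as " female " maps to Female, while a hypothetical dataset code such as "F" matches nothing, returns None, and would need an explicit mapping.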

Once the standard files are created, place them in the root folder of the web application. The file names matter: the files must be named Standard.csv and Standard_List_Values.csv.
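A quick check that both files are in place could look like the sketch below. The function name missing_standard_files is hypothetical, and the root folder is passed in as a parameter because its location depends on how the web application is deployed.

```python
from pathlib import Path

# File names given on this page.
REQUIRED_FILES = ("Standard.csv", "Standard_List_Values.csv")


def missing_standard_files(root):
    """Return the required file names that are absent from the root folder."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means both files were found under the given folder.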