User Guide
- Introduction
- Get familiar with the standard
- Identify your datasets
- Map Attributes (columns)
- Map Attribute Values (column values)
- Check the mappings
- Download the mappings
Mapper is a generic tool for mapping different datasets to a pre-established standard. It can be used to implement data interoperability solutions. The Mapper maps columns and column values from files that contain data to be harmonised. Each file to be mapped contains data from one entity of the standard. The files are CSV files and can use practically any standard separator, not only the comma.
Find more in this example.
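As an illustration of what "practically any standard separator" means for a local file, the sketch below guesses the separator of a CSV export and lists its columns using Python's standard `csv.Sniffer`. It is not part of the Mapper; the function name `read_columns`, the candidate separators and the sample data are assumptions made for this example.

```python
import csv
import io

def read_columns(text, candidates=",;\t|"):
    """Return (delimiter, header columns) for CSV text with an unknown separator."""
    # Sniffer guesses the dialect from a sample; only the candidate separators are considered.
    dialect = csv.Sniffer().sniff(text[:4096], delimiters=candidates)
    reader = csv.reader(io.StringIO(text), dialect)
    header = next(reader)
    return dialect.delimiter, [name.strip() for name in header]

# Hypothetical local export that uses a semicolon as separator.
sample = "sample_id;material;collection\nS1;blood;C1\nS2;plasma;C2\n"
delimiter, columns = read_columns(sample)
print(delimiter, columns)
```

Checking the detected separator and column names this way before loading a file into the Mapper can help spot exports that use an unexpected delimiter.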
The application has three main functions:
- Map attributes of entities
- Map attribute values
- Save the map
The standard is expressed mainly with two files: one for the entities and the attributes of each entity, and another for the attribute values. You can find an example here.
The output of the Mapper is a zip file containing JSON files of the map and column lists from the mapped files.
To map your datasets, you need to know the semantics behind the standard. You can download the standard by clicking the "Download Standard" button, as in the figure below:
Each entity in the standard corresponds to a data file that contains data to be mapped. In the standard file, the fact entity is the first entity in the "Entity" column. In the web interface of the Mapper, it is the first entity in the entity list.
The fact entity, or central entity, is therefore the first entity in the dropdown list of entities. For instance, the figure below shows a standard for biobank data based on the MIABIS standard. The fact entity is Sample.
The remaining entities in the standard follow the fact entity in the list. In the example above, the remaining entities to be mapped are Sample Collection, Study and Contact Information.
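The rule that the fact entity comes first in the "Entity" column can be sketched as follows. Only the "Entity" column name and the entity names come from this guide; the "Attribute" column, the attribute names, the comma separator and the function `entities_in_order` are hypothetical and may not match the real standard file.

```python
import csv
import io

def entities_in_order(standard_text, delimiter=","):
    """Return the distinct values of the "Entity" column in order of first appearance."""
    reader = csv.DictReader(io.StringIO(standard_text), delimiter=delimiter)
    seen = []
    for row in reader:
        if row["Entity"] not in seen:
            seen.append(row["Entity"])
    return seen

# Made-up excerpt of a standard file; the attribute names are placeholders.
standard = (
    "Entity,Attribute\n"
    "Sample,Sample ID\n"
    "Sample,Material Type\n"
    "Sample Collection,Collection ID\n"
    "Study,Study ID\n"
    "Contact Information,Contact ID\n"
)
entities = entities_in_order(standard)
fact_entity, other_entities = entities[0], entities[1:]
print(fact_entity, other_entities)
```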
The attributes to be mapped are those from the standard defined in the web service. Nevertheless, the files to be mapped can contain any number of columns in any order.
The datasets are the CSV files containing the data to be mapped according to the standard defined by the Mapper web service that you are using. In our biobank federation example, the biobanks expose sample data for sharing. The biobanks export CSV files with data about samples, sample collections, studies and contact persons. Those files can differ a lot from one biobank to another. Each biobank maps its files so that they can be processed by a federation framework (MIABIS Federation) that allows querying all the data from all the biobanks in the federation.
To map attributes from one entity (columns in a file):
- Select the "Entity" tab
- Select the Entity to be mapped in the Entity list (figure above)
- Choose the file containing the data for the selected entity
- Map attributes by clicking first on the attribute in the "Standard Attributes" table and then on the column in the "Local Attributes" table. The standard name for the attribute will appear in the third column of the "Local Attributes" table.
Repeat the mapping step until all possible columns are mapped.
- Save the map for the selected entity
NOTE: Save the map before selecting a new Entity in the dropdown list!
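Conceptually, the attribute map built in the steps above pairs each local column with a standard attribute. The sketch below shows how such a map could be applied to a local file outside the Mapper. The dictionary layout, the column and attribute names, and the function `rename_columns` are assumptions for illustration and do not reflect the Mapper's actual JSON format.

```python
import csv
import io

# Hypothetical attribute map for one entity: local column name -> standard attribute name.
attribute_map = {"id_muestra": "Sample ID", "tipo": "Material Type"}

def rename_columns(text, mapping, delimiter=";"):
    """Read CSV text and return rows keyed by standard attribute names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = []
    for row in reader:
        # Columns without a mapping are dropped in this sketch.
        rows.append({mapping[col]: value for col, value in row.items() if col in mapping})
    return rows

local_file = "id_muestra;tipo;notas\nS1;sangre;ok\n"
harmonised = rename_columns(local_file, attribute_map)
print(harmonised)
```

Dropping unmapped columns is a design choice of this example; since the files to be mapped can contain any number of columns, a consumer could equally keep them under their local names.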
To map attribute values from one entity (column values):
- Select the "List Values" tab
- Select the list to be mapped in the dropdown list
- Choose the file containing the data for the selected entity. The list values can be spread across different files or be all together in the same file.
- Map the values by clicking first on the entry in the "Standard Attributes" table and then on the entry in the "Local Attributes" table. The standard name will appear in the third column of the "Local Attributes" table.
Repeat the previous two steps (choosing a file and mapping) until all possible values are mapped.
- Save the map for the selected entity
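A value map can be thought of as a lookup table per attribute that translates local values into the values allowed by the standard. The following sketch applies such a table to already loaded rows. The nested dictionary layout, the attribute and value names, and the function `translate_values` are hypothetical and are not taken from the Mapper's output format.

```python
# Hypothetical value map: standard attribute -> {local value: standard value}.
value_map = {"Material Type": {"sangre": "Blood", "plasma": "Plasma"}}

def translate_values(rows, value_maps):
    """Replace local values with standard values; unmapped values are left unchanged."""
    translated = []
    for row in rows:
        new_row = {}
        for attribute, value in row.items():
            new_row[attribute] = value_maps.get(attribute, {}).get(value, value)
        translated.append(new_row)
    return translated

rows = [
    {"Sample ID": "S1", "Material Type": "sangre"},
    {"Sample ID": "S2", "Material Type": "suero"},
]
result = translate_values(rows, value_map)
print(result)
```

In this sketch the second row keeps its local value because "suero" has no entry in the map, which makes unmapped values easy to spot afterwards.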
Once all the entities and values are mapped, you can check the results by selecting the "Map Result" tab. This tab provides a table for the mapped entities and a table for the mapped values. Both tables can be searched or sorted to make it easier to explore the mappings. If a map is wrong, you can double-click on the row and delete the map for that row. The map can be corrected by selecting the corresponding tab (Entities or List Values), correcting the mapping and saving again. The corrections will be reflected in the map results. Each table can be cleared by clicking "Clear Entity Map" or "Clear List Values Map" respectively.
Once you are satisfied with your mappings, save the results by clicking the "Save Map Results" button.
This action will download:
- JSON file with the entities
- JSON file with the attribute values
- Several files with the original columns of the mapped files
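To inspect the downloaded zip file programmatically, a script could load every JSON file it contains. The sketch below uses only the Python standard library. The member names and contents are made up, because the actual file names and JSON layout produced by the Mapper are not described here, and `load_json_members` is a hypothetical helper.

```python
import io
import json
import zipfile

def load_json_members(zip_bytes):
    """Return {member name: parsed JSON} for every .json file in the archive."""
    result = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".json"):
                result[name] = json.loads(archive.read(name).decode("utf-8"))
    return result

# Build a tiny stand-in archive in memory; names and contents are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("entities.json", json.dumps({"id_muestra": "Sample ID"}))
    archive.writestr("columns.txt", "id_muestra;tipo")
maps = load_json_members(buffer.getvalue())
print(maps)
```

For a real download, the bytes would come from the saved file, for example `open(path, "rb").read()` with `path` pointing to the zip.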