Site Config
This configuration file locates the study config files, configures servers that Datman may need to communicate with, defines the structure of Datman managed study folders, and details the expected scan tags and how to handle them.
NOTE: All keys are case sensitive. If Datman is complaining about KeyErrors for any of these settings check that your spelling and case are correct.
Projects Block
The Projects block maps a short-hand code for each study (the key) to the name of the file holding that study's configuration (the value). These study codes are usually, but not necessarily, the same as the study code that begins the Datman ID for sessions in that study, and each code must be unique even if the study field of the Datman ID is not.
New studies must be added to this list in order for Datman to manage them. While you can change the name of the file here, you cannot set the location of the study config file: all config files for a 'system' are expected to be found in the single directory pointed to by the CONFIG_DIR setting in SystemSettings.
Note the capital "P" on Projects, which is case sensitive. Study codes do not need to be all caps, but however you type them in the Projects block is how you must type them at the command line, since the Projects block is how Datman locates a study and all of its data.
Projects:
STUDY1: <study_1 filename here>
STUDY2: <study_2 filename here>
...
STUDYN: <study_n filename here>
And an example excerpt from our lab's tigrlab_config.yaml
Projects:
ANDT: ANDT_settings.yml
ASDD: ASDD_settings.yml
COGBDO: COGBDO_settings.yml
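Since the Projects block is a plain mapping, locating a study's config file amounts to a case-sensitive dictionary lookup joined onto CONFIG_DIR. A minimal Python sketch of that idea (the paths and lookup logic here are illustrative assumptions, not datman's actual code):

```python
import os

# Illustrative sketch: the study code, typed exactly as it appears in the
# Projects block, indexes the mapping, and the filename is joined onto
# CONFIG_DIR. The values are example paths from this page.
projects = {
    "ANDT": "ANDT_settings.yml",
    "ASDD": "ASDD_settings.yml",
}
config_dir = "/archive/code/config"    # CONFIG_DIR from SystemSettings

study = "ANDT"                         # "andt" would raise a KeyError
study_config_path = os.path.join(config_dir, projects[study])
print(study_config_path)               # /archive/code/config/ANDT_settings.yml
```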
SystemSettings Block
At least one system must be configured within this block. Multiple entries allow different users to keep their own separately managed Datman projects on one installation, or allow multiple systems with different folder structures to work with the same NFS mounted projects directory.
When running scripts, the system configuration to use must be specified by setting the environment variable DM_SYSTEM (i.e. for many shells this should work: export DM_SYSTEM=my_system_name_here).
- DATMAN_PROJECTSDIR: The full path to the folder where a set of Datman managed projects will be kept. For example, on our local system this is /archive/data/
- DATMAN_ASSETSDIR: The full path to Datman's assets folder on your system. For example, on our local system this is /archive/code/datman/assets/
- CONFIG_DIR: The full path to the folder of study config files to use for this system. For example, on our system this is /archive/code/config/
- QUEUE: The queue type that jobs will be submitted to if a queue is available. Currently this can be either 'sge' or 'pbs'.
Note the capitalization on 'SystemSettings', and that the needed/optional keys are all caps. The system name can be whatever case you prefer; however, you must match its spelling and capitalization exactly when you set DM_SYSTEM in your shell.
SystemSettings:
system_name:
DATMAN_PROJECTSDIR: <your projects path here>
DATMAN_ASSETSDIR: <full path to the assets folder here>
CONFIG_DIR: <full path to your study config file folder>
QUEUE: <'pbs' or 'sge' goes here>
And an example excerpt from our tigrlab_config.yaml, where we have two systems, 'kimel' and 'scc', configured:
SystemSettings:
kimel:
DATMAN_PROJECTSDIR: '/archive/data/'
DATMAN_ASSETSDIR: '/archive/code/datman/assets/'
CONFIG_DIR: '/archive/code/config/'
QUEUE: 'sge'
scc:
DATMAN_ASSETSDIR: '/KIMEL/quarantine/datman/latest/src/datman/assets/'
DATMAN_PROJECTSDIR: '/external/rprshnas01/tigrlab/archive/data/'
CONFIG_DIR: '/external/rprshnas01/tigrlab/archive/code/config/'
QUEUE: 'pbs'
Paths Block
This block determines the structure of each Datman managed study. Each time a new pipeline folder or other resource is added to your projects, a new entry needs to be added to the list. The keys determine how each path is referenced within the code, and the values are relative paths that get appended to the study folder (which is itself the path from DATMAN_PROJECTSDIR with the PROJECTDIR from the Study Config appended).
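Putting those pieces together, a configured path resolves like this (a sketch using example values taken from the excerpts on this page):

```python
import os

# Example values: DATMAN_PROJECTSDIR comes from SystemSettings, PROJECTDIR
# from the Study Config, and 'nii' is an entry from the Paths block.
projects_dir = "/archive/data"   # DATMAN_PROJECTSDIR
project_dir = "SPINS"            # PROJECTDIR
nii_path = "data/nii/"           # Paths entry for the 'nii' key

full_path = os.path.join(projects_dir, project_dir, nii_path)
print(full_path)                 # /archive/data/SPINS/data/nii/
```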
Below is a list of paths that must be configured for Datman to function correctly. Most of the core scripts read from or write to the directories configured here.
- meta: Points to the folder meant to hold metadata like scans.csv, blacklist.csv, checklist.csv, etc.
- data: Parent folder for the original dicom data and its other raw formats like nifti, mnc, etc.
- dcm: The folder that will hold raw dicom data. Only one dicom image per series is stored here.
- dicom: The folder that will hold the raw zip files of dicoms before the site naming convention is applied.
- zips: The folder that holds correctly named links pointing to the raw zip files in data/dicom.
- resources: The folder that holds all non-dicom data that was present in the raw zip files.
- nii, mnc, nrrd: Folders that hold the converted data in nifti, mnc and nrrd formats respectively.
- qc: Holds all the QC pipeline outputs.
- log: The folder that will store log output from various scripts and nightly pipelines.
The following paths must be configured if the scripts listed are in use, but may be omitted otherwise:
- std
  - Description: Points to the folder that will hold any defined 'standards' for use in the QC pipeline. Usually these are DICOMs or json files that contain expected DICOM header fields.
  - Used by:
    - dm_qc_report.py - Reads from this folder
- dtiprep
  - Description: Points to the destination folder for dtiprep pipeline outputs
  - Used by:
    - dm_proc_dtiprep.py - Generates the contents
    - dm_proc_tractmap.py - Reads from this folder
- freesurfer
  - Description: Points to the destination for freesurfer outputs
  - Used by:
    - dm_proc_freesurfer.py - Generates the contents
    - dm_proc_fs2hcp.py - Reads from this folder
- hcp
  - Description: Points to the folder holding ciftify's HCP format outputs. These are a subset of the HCP pipelines outputs, where temp files and folders have been deleted to save space. Unlike the original HCP pipelines code, these can be generated from legacy datasets.
  - Used by:
    - dm_proc_fs2hcp.py - Generates the contents
    - dm_proc_ea.py - Reads from this folder
    - dm_proc_fmri.py - Reads from this folder
- fmri
  - Description: Points to the folder that holds epitome pipeline fmri outputs (e.g. rest, imobs, ea)
  - Used by:
    - dm_proc_ea.py - Generates contents of the 'ea' subfolder
    - dm_proc_fmri.py - Generates contents of the 'fmri' subfolder
    - dm_proc_imob.py - Generates contents of the 'imob' subfolder
    - dm_proc_rest.py - Generates contents of the 'rest' subfolder
- hcp_fs
  - Description: Points to the folder that holds the HCP Pipelines full FreeSurfer pipeline outputs.
  - Used by:
    - dm_hcp_freesurfer.py - Generates the contents
- unring
  - Description: Points to the folder that holds the outputs of unring.
  - Used by:
    - dm_proc_unring.py - Generates the contents
The following example uses a subset of all available keys:
Paths:
meta: metadata/
std: metadata/standards/
data: data/
dcm: data/dcm/
nii: data/nii/
nrrd: data/nrrd/
qc: qc/
log: logs/
fmri: pipelines/fmri/
freesurfer: pipelines/freesurfer/
Assuming a configuration where the DATMAN_PROJECTSDIR is /archive/data (as it is in ours) and a PROJECTDIR of SPINS, the above settings would generate a project with the following folder structure:
/archive/data/SPINS/
│
└─── metadata
│ │
│ └─── standards
│
└─── data
│ │
│ └─── dcm
│ │
│ └─── nii
│ │
│ └─── nrrd
│
└─── qc
│
└─── logs
│
└─── pipelines
│
└─── fmri
│
└─── freesurfer
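Building such a tree from a Paths-style mapping is straightforward. A hedged sketch (datman's own setup tooling handles this in practice; the study directory below is a temporary placeholder):

```python
import os
import tempfile

# Sketch of creating a study's folder tree from a Paths-style mapping.
# The study directory stands in for DATMAN_PROJECTSDIR/PROJECTDIR.
paths = {
    "meta": "metadata/",
    "std": "metadata/standards/",
    "dcm": "data/dcm/",
    "qc": "qc/",
}
study_dir = tempfile.mkdtemp()

for relpath in paths.values():
    # exist_ok lets nested entries like metadata/standards coexist with metadata/
    os.makedirs(os.path.join(study_dir, relpath), exist_ok=True)
```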
ExportSettings Block
This block defines the expected tags. Each tag has its own dictionary of config values that defines which formats to convert to, which QC function to use for human data, and which QC function to use for phantoms with that tag.
NOTE: Any of the settings in this block can be overridden by the ExportInfo settings in the Study Config file.
The following settings must be used to define site wide defaults for each tag:
- formats
  - Description: A list of formats to convert any series matching this tag to.
  - Accepted values: 'nii', 'dcm', 'mnc', 'nrrd'
- qc_type
  - Description: The QC function that dm_qc_report.py uses to process human data with the matching tag.
  - Accepted values: 'anat', 'fmri', 'dti', 'ignore'
- qc_pha
  - Description: The QC function that dm_qc_report.py uses to process phantom data with the matching tag.
  - Accepted values: 'qa_dti' or 'default'. You can also omit 'qc_pha' entirely and the default will be chosen.
Site wide default values can optionally be set for any of the keys usually set in ExportInfo. To see the available keys and advice for setting them, see the ExportInfo section in 'Sites Block'.
The following is a small excerpt from our own ExportSettings:
ExportSettings:
T1: {formats: ['nii', 'dcm', 'mnc'], qc_type: anat, qc_pha: default}
T2: {formats: ['nii', 'dcm'], qc_type: anat}
RST: {formats: ['nii', 'dcm'], qc_type: fmri}
SPRL: {formats: ['nii'], qc_type: fmri}
DTI60-1000: {formats: ['nii', 'dcm', 'nrrd'], qc_type: dti, qc_pha: qa_dti}
FMAP: {formats: ['nii', 'dcm'], qc_type: ignore}
DTI-ABCD: { formats: ['nii', 'dcm'], qc_type: dti, Pattern: 'ABCD_dMRI$'}
In this example, all of the tags except DTI60-1000 will use the default phantom QC for their respective qc_type. The last tag (DTI-ABCD) provides an example of using one of the optional settings to set a site wide default series description pattern that will match this tag. Any study with the same tag in its ExportInfo can override this by including its own 'Pattern' setting.
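The 'Pattern' value is a regular expression matched against the series description. A quick sketch of what the '$' anchor in the DTI-ABCD pattern does (the example descriptions are invented, and the exact matching code inside datman may differ):

```python
import re

# 'Pattern' from the DTI-ABCD tag above; '$' anchors the match to the
# end of the series description. The descriptions are made-up examples.
pattern = "ABCD_dMRI$"

print(bool(re.search(pattern, "Original_ABCD_dMRI")))   # True
print(bool(re.search(pattern, "ABCD_dMRI_FieldMap")))   # False
```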
Miscellaneous Settings
There are a few site wide settings that are not part of any configuration block and that are only needed by a few Datman scripts. These are documented here, along with the names of the scripts that use them.
- FTPSERVER
  - Description: The fully qualified domain name of an sftp server that new scans will be pulled from.
  - Used by:
    - dm_sftp.py
- XNATSERVER
  - Description: The full URL of the XNAT server this site will use to archive its data.
  - Used by:
    - datman/xnat.py - Reads this value to find a server to read from / write to
    - dm_xnat_upload.py - Uploads new scans to this server
    - dm_xnat_extract.py - Downloads new scans from this server
    - dm_link_shared_ids.py - Adds shared ID / study alias info to this server
- XNATPORT
  - Description: Only used alongside XNATSERVER. Specifies which port to connect to on the server.
  - Used by:
    - Same scripts as XNATSERVER
- REDCAPAPI
  - Description: The full URL of the site's REDCap server where 'scan completed' forms are stored.
  - Used by:
    - dm_link_shared_ids.py - Reads shared IDs / session aliases from this server
- LOGSERVER
  - Description: The IP of the machine that will run the logging server.
  - Used by:
    - dm_log_server.py - This is the actual log server; it needs to know this IP to start up
    - The following scripts read this setting to get the destination for log output ONLY when the remote option is used:
      - dm_hcp_freesurfer.py
      - dm_proc_freesurfer.py
      - dm_qc_report.py
      - nii_to_bids.py
      - xnat_fetch_sessions.py
- SERVER_LOG_DIR
  - Description: The full path to the folder where all server logs will be stored. It should be accessible to the machine running the log server (i.e. a path local to that machine or a path NFS mounted to it). Only needed if LOGSERVER is set.
  - Used by:
    - dm_log_server.py
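For a sense of how remote logging to a LOGSERVER can work, Python's standard logging module can send records over a socket. This is only a sketch under assumptions: whether datman uses this exact handler or port is not stated here, and the IP is the same placeholder used elsewhere on this page.

```python
import logging
import logging.handlers

# Hypothetical sketch of shipping log records to a remote log server.
# The IP is a placeholder; the handler and port choice are assumptions.
logger = logging.getLogger("datman_example")
handler = logging.handlers.SocketHandler(
    "1.1.1.1", logging.handlers.DEFAULT_TCP_LOGGING_PORT
)
logger.addHandler(handler)
# No connection is made until a record is actually emitted.
```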
This template can be copied, modified and expanded when setting up a new site config file. Remember to save it as a '.yml' or '.yaml' file and to add the full path to it in the environment variable DM_CONFIG. Also, choose one of your configured systems as DM_SYSTEM.
# Misc settings can go here (or anywhere really)
FTPSERVER: some.sftpserver.ca
XNATSERVER: xnat.somedomain.ca
XNATPORT: 443
REDCAPAPI: somedomain.ca/redcap/api/
LOGSERVER: 1.1.1.1
SERVER_LOG_DIR: /some/path/on/1.1.1.1/logs
# Projects defined here
Projects:
ANDT: ANDT_settings.yml
ASDD: ASDD_settings.yml
COGBDO: COGBDO_settings.yml
COGBDY: COGBDY_settings.yml
DBDC: DBDC_settings.yml
# Add in Systems (be sure to set one as your current DM_SYSTEM when running datman scripts)
SystemSettings:
kimel:
DATMAN_PROJECTSDIR: '/archive/data/'
DATMAN_ASSETSDIR: '/archive/code/datman/assets/'
CONFIG_DIR: '/archive/code/config/'
QUEUE: 'sge'
# Set up the structure of your installation
Paths:
meta: metadata/
data: data/
dcm: data/dcm/
nii: data/nii/
mnc: data/mnc/
nrrd: data/nrrd/
dicom: data/dicom/
resources: data/RESOURCES/
qc: qc/
std: metadata/standards/
log: logs/
zips: data/zips/
dtiprep: pipelines/dtiprep/
fmri: pipelines/fmri/
hcp: pipelines/hcp/
hcp_fs: pipelines/hcp_freesurfer/
freesurfer: pipelines/freesurfer/
unring: pipelines/unring/
# Configure tags available at this site
ExportSettings:
T1: {formats: ['nii', 'dcm', 'mnc'], qc_type: anat, qc_pha: default}
T2: {formats: ['nii', 'dcm'], qc_type: anat}
RST: {formats: ['nii', 'dcm'], qc_type: fmri}
SPRL: {formats: ['nii'], qc_type: fmri}
DTI60-1000: {formats: ['nii', 'dcm', 'nrrd'], qc_type: dti, qc_pha: qa_dti}
FMAP: {formats: ['nii', 'dcm'], qc_type: ignore}
DTI-ABCD: { formats: ['nii', 'dcm'], qc_type: dti, Pattern: 'ABCD_dMRI$'}