Merge pull request #110 from Emory-HITI/dev
An install script for Niffler
pradeeban authored Mar 9, 2021
2 parents 9c780a5 + 8a8ce1b commit bd05e40
Showing 17 changed files with 303 additions and 229 deletions.
75 changes: 35 additions & 40 deletions README.md
@@ -5,68 +5,63 @@ Niffler is an efficient DICOM Framework for machine learning pipelines and proce
Niffler enables receiving DICOM images in real time as a data stream from a PACS, as well as retrieving specific DICOM data through a series of DICOM C-MOVE queries. The Niffler real-time DICOM receiver extracts the metadata free of PHI as the images arrive, stores the metadata in a Mongo database, and deletes the images nightly. The on-demand extractor reads a CSV file provided by the user (consisting of EMPIs, AccessionNumbers, or other properties) and performs a series of DICOM C-MOVE requests to retrieve the matching images from the PACS, without manual querying. Niffler also provides additional features, such as converting DICOM images into PNG images and computing scanner utilization or finding scanners with misconfigured clocks.


# Niffler Modules
# Configure Niffler

Niffler consists of a modular architecture that provides its features. Each module can run independently. Niffler core (cold-extraction, meta-extraction, and png-extraction) is built with Python-3.6. Niffler application layer (app-layer) is built with Java and Javascript.

## cold-extraction

Parses a CSV file consisting of EMPIs, AccessionNumbers, or Study/Accession Dates, and performs a series of DICOM C-MOVE queries (often each C-MOVE following a C-FIND query) to retrieve DICOM images retrospectively from the PACS.

## meta-extraction

Receives DICOM images as a stream from a PACS and extracts and stores the metadata in a metadata store (by default, MongoDB), deleting the received DICOM images nightly.

## png-extraction

Converts a set of DICOM images into PNG images and extracts metadata in a privacy-preserving manner. The extracted metadata is stored in a CSV file, along with the de-identified PNG images. The mapping of PNG files and their respective metadata is stored in a separate CSV file.

## app-layer

The app-layer (application layer) consists of specific algorithms. The app-layer/src/main/scripts folder contains Javascript scripts such as scanner clock calibration. The app-layer/src/main/java folder contains the scanner utilization computation algorithms developed in Java.


# Configuring Niffler
Niffler consists of 4 modules, inside the modules folder. Here we will look into the common configuration and installation steps of Niffler. An introduction to Niffler can be found [here](https://emory-hiti.github.io/Niffler/).

## Configure PACS

Make sure to configure the PACS to send data to Niffler's host, port, and AE_Title. Niffler won't receive data unless the PACS allows the requests from Niffler (host/port/AE_Title).
Both meta-extraction and cold-extraction modules require proper configuration of a PACS environment to allow data transfer and query retrieval to Niffler, respectively.

## Install Dependencies
* Make sure to configure the PACS to send data to Niffler meta-extraction module's host, port, and AE_Title.

To use Niffler, first, install the dependencies.
* Niffler cold-extraction won't receive data unless the PACS allows the requests from Niffler cold-extraction (host/port/AE_Title).

$ pip install -r requirements.txt

Also install DCM4CHE from https://github.com/dcm4che/dcm4che/releases
## Configure Niffler mdextractor service

For example,
The modules/meta-extraction/service folder contains mdextractor.sh, system.json, and mdextractor.service.

$ wget https://sourceforge.net/projects/dcm4che/files/dcm4che3/5.22.5/dcm4che-5.22.5-bin.zip/download -O dcm4che-5.22.5-bin.zip
mdextractor.sh writes its output to service/niffler-rt.out.

$ sudo apt install unzip
Make sure to provide the correct full path of your meta-extraction folder on the second line of mdextractor.sh, replacing the path below:

$ unzip dcm4che-5.22.5-bin.zip
```
cd /opt/localdrive/Niffler/modules/meta-extraction/
```

Make sure Java is available, as DCM4CHE and Niffler Application Layer require Java to run.
Provide the appropriate values for mdextractor.service.

You should first configure the operating system's [mail](https://www.javatpoint.com/linux-mail-command) client for the user that runs Niffler modules, if you have enabled mail sender for any of the modules through their respective config.json. This is a one-time configuration that is not specific to Niffler.
```
[Service]
Environment="MONGO_URI=USERNAME:PASSWORD@localhost:27017/"
Type=simple
ExecStart=/opt/localdrive/Niffler/modules/meta-extraction/service/mdextractor.sh
TimeoutStartSec=360
StandardOutput=file:/opt/localdrive/Niffler/modules/meta-extraction/service.log
StandardError=file:/opt/localdrive/Niffler/modules/meta-extraction/service-error.log
```
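
The `MONGO_URI` value above has the shape `USERNAME:PASSWORD@host:port/`. As a minimal sketch of how a Python process might consume it, assuming the service reads the variable from its environment and that a `mongodb://` scheme has to be prepended (both are assumptions; Niffler's own handling may differ), the hypothetical helper `parse_mongo_uri` below splits the value into its parts and fails early if the placeholders were never replaced.

```python
import os
from urllib.parse import urlsplit


def parse_mongo_uri(raw):
    """Split a 'USER:PASS@host:port/' string into its parts.

    Hypothetical helper for illustration; it is not part of Niffler.
    """
    parts = urlsplit("mongodb://" + raw)
    if parts.username in (None, "USERNAME") or parts.password in (None, "PASSWORD"):
        raise ValueError("MONGO_URI has missing or placeholder credentials")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 27017,  # fall back to MongoDB's default port
    }


if __name__ == "__main__":
    # systemd passes Environment= entries to the service process.
    raw = os.environ.get("MONGO_URI", "")
    if raw:
        print(parse_mongo_uri(raw))
```

Running a check like this before starting the service catches an unedited unit file without attempting a database connection.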

## Deploy Niffler

Then checkout Niffler source code.
## Install Niffler

To deploy Niffler, check out the Niffler source code and run the installation script.
```
$ git clone https://github.com/Emory-HITI/Niffler.git
$ cd Niffler

```
The master branch is stable whereas the dev branch has the bleeding edge.

The Java components of Niffler Application Layer are managed via Apache Maven 3.

$ mvn clean install
You might want to use the dev branch for the latest updates. For a more stable version, skip the step below:
```
$ git checkout dev
```
Finally, run the installation script.
```
$ sh install.sh
```

Please refer to each module's individual README for additional instructions on deploying and using Niffler for each of its components.
Please refer to each module's individual README for additional instructions on deploying and using Niffler for each of its modules.



24 changes: 22 additions & 2 deletions docs/index.md
@@ -1,11 +1,31 @@
## Niffler: A DICOM Framework for Machine Learning Pipelines against Real-Time Radiology Images
# Niffler: A DICOM Framework for Machine Learning Pipelines against Real-Time Radiology Images

Niffler is a research project for DICOM networking, supporting efficient DICOM retrievals and subsequent ML workflows on the images and metadata in a research environment.

It provides an efficient and quick approach to receiving DICOM images in real time and on demand from multiple PACS. It extracts DICOM metadata and stores it in a Mongo database. Additional workflows can be run on the images and metadata. One specific example, identifying scanner utilization, has been implemented as part of the Niffler Application Layer.


## Citing Niffler
# Niffler Modules

Niffler consists of a modular architecture that provides its features. Each module can run independently. Niffler core (cold-extraction, meta-extraction, and png-extraction) is built and tested with Python-3.6 to Python-3.8. Niffler application layer (app-layer) is built with Java and Javascript.

## cold-extraction

Parses a CSV file consisting of EMPIs, AccessionNumbers, or Study/Accession Dates, and performs a series of DICOM C-MOVE queries (often each C-MOVE following a C-FIND query) to retrieve DICOM images retrospectively from the PACS.
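
As a rough illustration of the CSV parsing step, the sketch below reads EMPI and AccessionNumber pairs from a CSV file. The column order, the header row, and the helper name `read_empi_accession_pairs` are assumptions made for this example and do not describe Niffler's documented input format.

```python
import csv
import io


def read_empi_accession_pairs(handle):
    """Yield (empi, accession) tuples from a CSV with a header row.

    Assumes the first two columns hold the EMPI and the AccessionNumber;
    this layout is illustrative only.
    """
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue  # ignore blank or incomplete lines
        yield row[0].strip(), row[1].strip()


sample = io.StringIO("EMPI,Accession\n1234,000056789\n\n5678,000012345\n")
pairs = list(read_empi_accession_pairs(sample))
print(pairs)
```

The values are kept as strings so that leading zeros in identifiers survive.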

## meta-extraction

Receives DICOM images as a stream from a PACS and extracts and stores the metadata in a metadata store (by default, MongoDB), deleting the received DICOM images nightly.

## png-extraction

Converts a set of DICOM images into PNG images and extracts metadata in a privacy-preserving manner. The extracted metadata is stored in a CSV file, along with the de-identified PNG images. The mapping of PNG files and their respective metadata is stored in a separate CSV file.

## app-layer

The app-layer (application layer) consists of specific algorithms. The app-layer/src/main/scripts folder contains Javascript scripts such as scanner clock calibration. The app-layer/src/main/java folder contains the scanner utilization computation algorithms developed in Java.

# Citing Niffler

If you use Niffler in your research, please cite the below paper:

1 change: 1 addition & 0 deletions init/dcm4che.out
@@ -0,0 +1 @@
false
9 changes: 9 additions & 0 deletions init/disable-thp.service
@@ -0,0 +1,9 @@
[Unit]
Description=Disable Transparent Huge Pages (THP)

[Service]
Type=simple
ExecStart=/bin/sh -c "echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled && echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag"

[Install]
WantedBy=multi-user.target
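
To confirm that the unit above took effect, the kernel's sysfs file can be read back: it lists every mode and marks the active one in square brackets, for example `always madvise [never]`. The sketch below is a minimal check under that assumption; the function names are hypothetical and not part of Niffler.

```python
def active_thp_mode(text):
    """Return the bracketed (active) mode from a THP sysfs file's contents."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active mode found in: %r" % text)


def thp_disabled(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Report whether THP is set to 'never' on this Linux host."""
    with open(path) as handle:
        return active_thp_mode(handle.read()) == "never"


print(active_thp_mode("always madvise [never]\n"))
```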
1 change: 1 addition & 0 deletions init/misc.out
@@ -0,0 +1 @@
false
1 change: 1 addition & 0 deletions init/mongo.out
@@ -0,0 +1 @@
false
11 changes: 11 additions & 0 deletions init/mongodb-org-4.2.repo
@@ -0,0 +1,11 @@
[mongodb-org-4.2]

name=MongoDB Repository

baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/4.2/x86_64/

gpgcheck=1

enabled=1

gpgkey=https://www.mongodb.org/static/pgp/server-4.2.asc
12 changes: 12 additions & 0 deletions init/mongoinit.js
@@ -0,0 +1,12 @@
conn = new Mongo();
db = conn.getDB("admin");
db.createUser(
{
user: "researchpacsroot",
pwd: passwordPrompt(), // Or "<cleartext password>"
roles: [{role:"root", db:"admin"}]
}
);

conn.close();
quit();
1 change: 1 addition & 0 deletions init/pip.out
@@ -0,0 +1 @@
false
62 changes: 62 additions & 0 deletions install.sh
@@ -0,0 +1,62 @@
#!/bin/sh
echo "Configuring Niffler"
sudo chmod -R 777 .

PIP=`head -n 1 init/pip.out`
if [ "$PIP" = false ] ; then
sudo yum install -y python3
echo "Installing pip"
sudo yum install python3-pip
pip install -r requirements.txt
wget https://repo.anaconda.com/archive/Anaconda3-2020.11-Linux-x86_64.sh
sh Anaconda3-2020.11-Linux-x86_64.sh -u
. ~/.bashrc
rm Anaconda3-2020.11-Linux-x86_64.sh
echo "true" > init/pip.out
fi

MISC=`head -n 1 init/misc.out`
if [ "$MISC" = false ] ; then
echo "Installing gdcm and mail"
conda install -c conda-forge -y gdcm
sudo yum install mailx -y
sudo yum install sendmail sendmail-cf
chmod +x modules/meta-extraction/service/mdextractor.sh
echo "Disable THP"
sudo cp init/disable-thp.service /etc/systemd/system/disable-thp.service
sudo systemctl daemon-reload
sudo systemctl start disable-thp
sudo systemctl enable disable-thp
echo "true" > init/misc.out
fi

DCM4CHE=`head -n 1 init/dcm4che.out`
if [ "$DCM4CHE" = false ] ; then
echo "Installing JDK"
sudo yum install java-1.8.0-openjdk-devel
echo "Installing Maven"
sudo dnf install maven
echo "Installing DCM4CHE"
cd ..
wget https://sourceforge.net/projects/dcm4che/files/dcm4che3/5.22.5/dcm4che-5.22.5-bin.zip/download -O dcm4che-5.22.5-bin.zip
unzip dcm4che-5.22.5-bin.zip
rm dcm4che-5.22.5-bin.zip
cd Niffler
echo "true" > init/dcm4che.out
fi

MONGO=`head -n 1 init/mongo.out`
if [ "$MONGO" = false ] ; then
echo "Installing mongo"
sudo cp init/mongodb-org-4.2.repo /etc/yum.repos.d/
sudo yum install mongodb-org
sudo systemctl start mongod
sudo systemctl enable mongod
mongo init/mongoinit.js
sudo cp modules/meta-extraction/service/mdextractor.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl start mdextractor.service
sudo systemctl enable mdextractor.service
echo "true" > init/mongo.out
fi
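
The script above makes each installation stage run at most once by keeping a flag file under `init/` whose first line is `false` until the stage has been executed. The same pattern can be sketched in Python as follows; `run_once` is a hypothetical name used only for illustration, and the temporary directory stands in for the `init/` folder.

```python
import tempfile
from pathlib import Path


def run_once(flag_file, step):
    """Run step() only if flag_file's first line is 'false', then mark it 'true'."""
    flag = Path(flag_file)
    lines = flag.read_text().splitlines()
    first_line = lines[0].strip() if lines else ""
    if first_line != "false":
        return False  # already done, or the flag is in an unexpected state
    step()
    flag.write_text("true\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    flag = Path(tmp) / "pip.out"
    flag.write_text("false\n")
    calls = []
    first = run_once(flag, lambda: calls.append("install"))
    second = run_once(flag, lambda: calls.append("install"))
    print(first, second, calls)
```

One difference from the shell version: here the flag is flipped only when `step()` returns without raising, whereas the shell script writes `true` regardless of whether the preceding commands succeeded.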

45 changes: 19 additions & 26 deletions modules/cold-extraction/README.md
@@ -2,15 +2,6 @@

The retrospective DICOM Extractor retrieves DICOM images on-demand, based on a CSV file provided by the user. Below we discuss the steps to run Niffler on-demand DICOM extraction queries.

First go to the src/cold-extraction directory in the Niffler source code in your server.

For example, assuming Niffler is checked out in the /opt folder,

$ cd /opt/Niffler/src/cold-extraction

Then proceed to the below steps.



# Configuring Niffler On-Demand Extractor

@@ -123,41 +114,42 @@ config.json entries are to be set *for each* Niffler on-demand DICOM extractions

## Running the Niffler Retrospective Data Retriever


```
$ nohup python3 ColdDataRetriever.py > UNIQUE-OUTPUT-FILE-FOR-YOUR-EXTRACTION.out &

```
Check that the extraction is going smoothly by running:

```
$ tail -f UNIQUE-OUTPUT-FILE-FOR-YOUR-EXTRACTION.out

```
You will see lots of logs.

If you see no log lines, the most likely cause is a failure due to an ongoing previous extraction. Check the Niffler logs.

```
$ tail -f niffler1.log

```
The log above might instead be niffler2.log. The log file name is niffler followed by the number set as NifflerID in system.json, where the default value is 1.

```
INFO:root:Number of running niffler processes: 2 and storescp processes: 1
ERROR:root:[EXTRACTION FAILURE] 2020-09-21 17:42:24.760598: Previous extraction still running. As such, your extraction attempt was not suuccessful this time. Please wait until that completes and re-run your query.

```
Try again later. Once there is no other process, then you can run your own extraction process.
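
Since the log file name depends on the NifflerID entry in system.json, a small helper can work out which file to tail. This is a minimal sketch: the helper name is hypothetical, and falling back to 1 when the key is absent is an assumption based on the default mentioned above.

```python
import json
import tempfile
from pathlib import Path


def niffler_log_name(system_json_path):
    """Return 'niffler<NifflerID>.log', defaulting the ID to 1."""
    with open(system_json_path) as handle:
        config = json.load(handle)
    return "niffler{}.log".format(config.get("NifflerID", 1))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "system.json"
    path.write_text(json.dumps({"NifflerID": 2}))
    name = niffler_log_name(path)
print(name)
```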



## Check the Progress

After some time (may take a few hours to a few days, depending on the length of the CSV file), check whether the extraction is complete.

```
$ tail -f niffler.log
INFO:root:[EXTRACTION COMPLETE] 2020-09-21 17:42:38.465501: Niffler Extraction to /opt/data/new-study Completes. Terminating the completed storescp process.

```
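
Instead of watching the log by hand, the markers shown in the log samples (`[EXTRACTION COMPLETE]` and `[EXTRACTION FAILURE]`) can be scanned programmatically. The sketch below assumes those two markers are the only ones of interest and that the most recent one reflects the current state; `extraction_status` is a hypothetical helper, not part of Niffler.

```python
def extraction_status(log_lines):
    """Return 'complete', 'failed', or 'running' based on the latest marker."""
    status = "running"
    for line in log_lines:
        if "[EXTRACTION COMPLETE]" in line:
            status = "complete"
        elif "[EXTRACTION FAILURE]" in line:
            status = "failed"
    return status


sample_log = [
    "INFO:root:Number of running niffler processes: 2 and storescp processes: 1",
    "ERROR:root:[EXTRACTION FAILURE] 2020-09-21 17:42:24.760598: Previous extraction still running.",
]
print(extraction_status(sample_log))
```

In practice the lines would come from the log file itself, for example `extraction_status(open("niffler1.log"))`.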
A pickle file tracks the progress. The pickle file is created by appending ".pickle" to the csv file name.

```
<8c>^X1234, 000056789<94>

```
For "empi_accession" extractions, each entry above is empi, accession.

For "empi_date" and "accession" extractions, each entry above will be empi, study. The reason is we have to _translate_ "empi_date" and "accession" into empi_study for C-MOVE queries.
@@ -170,28 +162,29 @@ If the process fails even when no one else's Niffler process is running, check y
If you find an error such as: "IndexError: list index out of range", that indicates your csv file and/or config.json are not correctly set.

Fix them and restart your Python process, by first finding and killing your python process and then starting Niffler as before.

```
$ ps -xa | grep python
1866 ? Ss 0:00 /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers
1936 ? Ssl 0:00 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
2926 pts/0 T 0:00 python3 ColdDataRetriever.py
3384 pts/0 S+ 0:00 grep --color=auto python
$ kill 2926

```
You might need to run the above command with sudo to find others' Niffler processes.

Make sure not to kill others' Niffler processes. So double-check and confirm that the running process is indeed the one that was started by you, and yet failed.


Rarely, a storescp process started by another user becomes a zombie and prevents Niffler from starting. If that happens, check for storescp processes and kill them as well. Please make sure you are killing only the on-demand Niffler storescp process. By default, it appears with QBNIFFLER:4243, as shown below.

```
$ sudo ps -xa | grep storescp
241720 pts/4 Sl 0:02 java -cp /opt/dcm4che-5.22.5/etc/storescp/:/opt/dcm4che-5.22.5/etc/certs/:/opt/dcm4che-5.22.5/lib/dcm4che-tool-storescp-5.22.5.jar:/opt/dcm4che-5.22.5/lib/dcm4che-core-5.22.5.jar:/opt/dcm4che-5.22.5/lib/dcm4che-net-5.22.5.jar:/opt/dcm4che-5.22.5/lib/dcm4che-tool-common-5.22.5.jar:/opt/dcm4che-5.22.5/lib/slf4j-api-1.7.30.jar:/opt/dcm4che-5.22.5/lib/slf4j-log4j12-1.7.30.jar:/opt/dcm4che-5.22.5/lib/log4j-1.2.17.jar:/opt/dcm4che-5.22.5/lib/commons-cli-1.4.jar org.dcm4che3.tool.storescp.StoreSCP --accept-unknown --directory /home/Data/Mammo/Kheiron/cohort_1/ --filepath {00100020}/{0020000D}/{0020000E}/{00080018}.dcm -b QBNIFFLER:4243
242185 pts/5 S+ 0:00 grep --color=auto storescp
$ sudo kill 241720
```