DNA methylation for tumor subclassifications by methylation array.

NYU-Molecular-Pathology/Methylation

📖 Methylation Pipeline Overview

💻 Essential Downloads

Download and install the following packages:
You can automatically install these requirements using

/Volumes/CBioinformatics/Methylation/install_requirements.sh

Use **ARM** (-arm64.pkg) package downloads for *M1/M2 Macs* and **Intel** (-x86_64.pkg) package downloads for older Macs without an Apple Silicon chip.
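
If you are unsure which package type applies to your Mac, a minimal sketch like the one below may help. It maps the output of `uname -m` to the two suffixes mentioned above. The helper name `pkg_suffix` is hypothetical and not part of this repo. A terminal running under Rosetta reports `x86_64` even on Apple Silicon, so treat the result as a hint only.

```shell
# Map a machine architecture string to the installer suffix named above.
pkg_suffix() {
    case "$1" in
        arm64|aarch64) echo "-arm64.pkg" ;;   # Apple Silicon (M1/M2)
        x86_64)        echo "-x86_64.pkg" ;;  # Intel Macs
        *)             echo "unknown" ;;      # anything else: check manually
    esac
}

# `uname -m` prints the hardware architecture of the current machine.
echo "Download packages ending in: $(pkg_suffix "$(uname -m)")"
```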

𓇲 Additional Steps for Apple Silicon Macs Only

  • Additional OpenGL:
brew install --from-source glfw3
brew install cmake && brew uninstall glfw
git clone https://github.com/glfw/glfw.git && cd glfw && \
cmake -DCMAKE_OSX_ARCHITECTURES=arm64 . && \
make && \
sudo make install

NOTE: You may need to unlock permissions before installing packages, in the Privacy & Security panel of the Mac's System Preferences:
https://github.com/NYU-Molecular-Pathology/Methylation/blob/main/Notes/SystemPermissions.md


❗First Time Running Classifier Pre-install packages

  • You can install all the dependencies above by executing the script on the CBioinformatics shared drive: /Volumes/CBioinformatics/Methylation/install_requirements.sh
  • After installing the system dependencies listed in Essential Downloads, you must install the R packages needed to install and run the classifiers.
  • Before running the classifier for the first time, run the Rscript all_installer.R (linked below) to install any R package dependencies. The script only needs to be run once, when first installing the classifier on a new system.
  • For better debugging, paste the raw code from the URL into RStudio:
    https://raw.githubusercontent.com/NYU-Molecular-Pathology/Methylation/main/R/all_installer.R

🌐 Network Drive Mount Paths

  • To install & run the pipeline, it is critical to mount the following network smb shared drives:
  • Open Finder, press ⌘ (CMD) + K, then paste each of the paths below. Log in with NYUMC\KerberosID as the user name and your Kerberos password.
smb://research-cifs.nyumc.org/Research/CBioinformatics/
smb://research-cifs.nyumc.org/Research/snudem01lab/snudem01labspace
smb://shares-cifs.nyumc.org/apps/acc_pathology/molecular
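
Before starting a run, it can help to confirm that the shares are mounted. The sketch below is only an illustration: `check_mounts` is a hypothetical helper, and the mount names are assumptions. /Volumes/CBioinformatics and /Volumes/molecular appear elsewhere in this README, while /Volumes/snudem01labspace is a guess based on the share name.

```shell
# Report which expected shares are mounted under a base directory (default /Volumes).
# Returns the number of missing mounts, so 0 means everything was found.
check_mounts() {
    local base="${1:-/Volumes}"
    local missing=0
    local name
    # Mount names are assumptions; adjust them to match your system.
    for name in CBioinformatics snudem01labspace molecular; do
        if [ -d "$base/$name" ]; then
            echo "mounted: $base/$name"
        else
            echo "MISSING: $base/$name"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

check_mounts /Volumes || echo "Mount the missing drives with CMD + K in Finder before running the pipeline."
```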

⚡️ Quickstart

1. Download the shell script to your home folder or another directory:

  • You can download runMeth.sh in this repo under Methylation/Meth_Scripts/ or use curl/wget:
curl -# -L https://raw.githubusercontent.com/NYU-Molecular-Pathology/Methylation/main/Meth_Scripts/runMeth.sh >$HOME/runMeth.sh

2. Open the shell script, paste your REDCap API token in the methAPI field on line 3, and save it.

You can use nano $HOME/runMeth.sh

methAPI="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"  #Paste your API Token here
  • Note: Your API token can be found in REDCap: open "All Samples DataBase" and select "API" in the left-side panel.
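
To confirm the token was actually pasted, a quick check along these lines may be useful. It is a sketch that assumes the `methAPI=` assignment starts at the beginning of a line and is quoted, as in the snippet above. `token_is_set` is a hypothetical helper, not part of runMeth.sh.

```shell
# Succeeds only if methAPI holds a non-empty value that is not all X characters.
token_is_set() {
    local script="$1"
    local value
    # Pull the text between the first pair of quotes on the methAPI line.
    value=$(sed -n "s/^methAPI=[\"']\([^\"']*\)[\"'].*/\1/p" "$script" | head -n 1)
    [ -n "$value" ] && ! printf '%s' "$value" | grep -Eq '^X+$'
}

if [ -f "$HOME/runMeth.sh" ]; then
    if token_is_set "$HOME/runMeth.sh"; then
        echo "methAPI looks set"
    else
        echo "methAPI is empty or still the placeholder"
    fi
fi
```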

3. Add permissions to the script to be executable:

chmod +rwx $HOME/runMeth.sh
  • If installation of any package fails, check the Troubleshooting section at the bottom of this page.

Input Paths

Input files, including the worksheet and .idat files, are located by run ID and year and copied to the working directory. For example:

  • Worksheets /Volumes/molecular/MOLECULAR LAB ONLY/NYU-METHYLATION/WORKSHEETS/2022/22-MGDM17.xlsm
  • .idat files /Volumes/molecular/Molecular/iScan/

Default Working Directory

  • Input files are copied and report files are generated on the CBioinformatics drive:
    /Volumes/CBioinformatics/Methylation/Clinical_Runs/22-MGDM17

Output Paths

  • HTML report files saved to the working directory are copied to the Z-drive.
    For example, report files for run 22-MGDM17 are written to the following directories:
    /Volumes/molecular/Molecular/MethylationClassifier/2022/22-MGDM17
    /Volumes/molecular/MOLECULAR LAB ONLY/NYU-METHYLATION/Results/2022/22-MGDM17
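
The example paths above suggest how a run ID maps to its directories. The sketch below reconstructs them under one assumption: the two digits before the first hyphen are the year in the 2000s (22 becomes 2022). `run_year` is a hypothetical helper. The pipeline itself may derive the year differently.

```shell
# Derive the four-digit year from a run ID such as 22-MGDM17 (assumes a 20YY prefix).
run_year() {
    echo "20${1%%-*}"   # strip everything from the first hyphen onward
}

run_id="22-MGDM17"
year=$(run_year "$run_id")
echo "Worksheet: /Volumes/molecular/MOLECULAR LAB ONLY/NYU-METHYLATION/WORKSHEETS/$year/$run_id.xlsm"
echo "Work dir:  /Volumes/CBioinformatics/Methylation/Clinical_Runs/$run_id"
echo "Reports:   /Volumes/molecular/Molecular/MethylationClassifier/$year/$run_id"
echo "Results:   /Volumes/molecular/MOLECULAR LAB ONLY/NYU-METHYLATION/Results/$year/$run_id"
```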

⚙️ Executing Methylation CLI

To run the Clinical or Research Methylation pipeline, simply use the locally stored Shell Script in:
/Volumes/CBioinformatics/Methylation/runMeth.sh

  • This shell script uses curl to download the files from this repo and takes up to five positional arguments (listed under runMeth.sh parameters) to execute methylExpress.R in the terminal.
  • The bash script stores your REDCap API token locally and only requires the methylation run ID to be entered.
  • You can copy runMeth.sh and create an alias or symlink to execute more easily. For example:
    alias runmeth='bash $HOME/runMeth.sh' or echo "alias runmeth='bash $HOME/runMeth.sh'" >> ~/.bashrc

🤖 runMeth.sh parameters

The shell script takes the following positional arguments:

methAPI='XXXXXXXX' # (hardcoded) Your REDCap API Token
methRun=${1-NULL}  # methylation run id e.g. 22-MGDM17
PRIORITY=${2-NULL} # string of prioritized RD-numbers
runPath=${3-NULL}  # any custom directory to copy/run the idat files
redcapUp=${4-NULL} # to upload to redcap or not if server down single char i.e. "T" or "F"
runLocal=${5-NULL} # If the run directory should be executed without shared drives locally i.e. "T" or "F"
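
The `${1-NULL}` form used above substitutes the string NULL only when the argument is omitted entirely. An empty string is kept as is, unlike with `${1:-NULL}`. The short demonstration below uses a hypothetical `show_args` function that mirrors the first three parameters. It is not taken from runMeth.sh.

```shell
# Mimic the defaults above: each positional argument falls back to NULL when omitted.
show_args() {
    local methRun="${1-NULL}"
    local PRIORITY="${2-NULL}"
    local runPath="${3-NULL}"
    echo "methRun=$methRun PRIORITY=$PRIORITY runPath=$runPath"
}

show_args 22-MGDM17   # prints: methRun=22-MGDM17 PRIORITY=NULL runPath=NULL
show_args ""          # prints: methRun= PRIORITY=NULL runPath=NULL
show_args 22-MGDM17 NULL "$HOME/Desktop/my_run"
```

Because the arguments are positional, skipping PRIORITY while setting runPath means passing NULL explicitly in the second slot, as in the last call.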

🧮 Passing Arguments to R

Four arguments are passed from runMeth.sh to the Rscript methylExpress.R:
arg[1] is the token for the API call ('#######################')
arg[2] is the RunID (e.g. 22-MGDM17); if NULL, the latest Clinical Worksheet is run
arg[3] is the selectRds parameter, which prioritizes the samples being run (NULL)
arg[4] is the baseFolder parameter, which is optional and lets you run/save output in a different directory (NULL)

Alternatively, instead of passing the RunID to runMeth.sh, you can download this repository and edit args in methylExpress.R locally to run it manually.

🧪 Run the Test Case

After installation, test the pipeline from your terminal by executing the test case:

$HOME/runMeth.sh 21-MGDM_TEST

or if you have not saved the runMeth.sh script locally:

/Volumes/CBioinformatics/Methylation/runMeth.sh 21-MGDM_TEST

You can then confirm that each html report was generated in the output directory:
/Volumes/CBioinformatics/Methylation/Clinical_Runs/21-MGDM_TEST
ls -lha "$HOME/Desktop/html_21-MGDM_TEST/21-MGDM_TEST/"

NOTE: When running the test case (21-MGDM_TEST), you may see errors in the upload log. This is normal: the REDCap database already contains the data and files for this run, so the html uploads fail.
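
To count the generated reports rather than reading a directory listing, something like the following could be used. It is a sketch that assumes reports sit directly inside the run directory and end in `.html`. `count_reports` is a hypothetical helper.

```shell
# Count .html files directly inside a run directory (no recursion).
count_reports() {
    find "$1" -maxdepth 1 -type f -name '*.html' | wc -l | tr -d ' '
}

run_dir="/Volumes/CBioinformatics/Methylation/Clinical_Runs/21-MGDM_TEST"
if [ -d "$run_dir" ]; then
    echo "Found $(count_reports "$run_dir") html report(s) in $run_dir"
else
    echo "Run directory not found: $run_dir"
fi
```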

To run the Sarcoma Classifier or re-Run Individual Samples

  • For Individual Cases: Execute the script directly with RD-numbers, for example:
    Rscript --verbose /Volumes/CBioinformatics/Methylation/Clinical_Runs/Sarcoma_runs/methylExpress_sarcoma.R RD-15-123 RD-16-1234 RD-17-321
  • For Several/Bulk Cases: Execute the script by passing the path to a csv file containing a list of RD-numbers in the first column, for example:
    Rscript --verbose /Volumes/CBioinformatics/Methylation/Clinical_Runs/Sarcoma_runs/methylExpress_sarcoma.R /Path/To/Desktop/MyListRDs.csv
    In the event the shared drive is not accessible, the script without the API token is available here
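
For the bulk case, the csv file only needs the RD-numbers in its first column. A minimal way to create one from the terminal is sketched below. The file location is arbitrary, and whether the script expects a header row is not stated here, so this example writes none.

```shell
# Write RD-numbers one per line, so each lands in the first csv column.
csv="${TMPDIR:-/tmp}/MyListRDs.csv"
printf '%s\n' RD-15-123 RD-16-1234 RD-17-321 > "$csv"
cat "$csv"

# Then pass the file path instead of the individual RD-numbers:
# Rscript --verbose /Volumes/CBioinformatics/Methylation/Clinical_Runs/Sarcoma_runs/methylExpress_sarcoma.R "$csv"
```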

⚠️ Troubleshooting

Pipeline Installation Issues
1. If you have issues with package installation or dependencies, make sure compilers are installed by opening Xcode.app or executing `sudo xcode-select --install`

2. Then execute the all_installer.R script by copying and pasting the raw contents of the script below into RStudio before running runMeth.sh again: https://raw.githubusercontent.com/NYU-Molecular-Pathology/Methylation/main/Research/all_installer.R

3. To resolve any problems during automation, you can open methylExpress.R in RStudio; runMeth.sh downloads it to your home directory.

4. Try running `sudo xcode-select -s /Library/Developer/CommandLineTools` and `brew install gdal proj`, then install the package **rgdal** in RStudio.

5. Download the libraries below from their sources:
(a) sqlite-autoconf-3330000.tar.gz from "https://www.sqlite.org/download.html"
(b) tiff-4.1.0.tar.gz from "https://download.osgeo.org/libtiff/"
(c) proj-7.2.0.tar.gz from "https://proj.org/download.html#current-release"
(d) libgeotiff-1.6.0.tar.gz from "https://download.osgeo.org/geotiff/libgeotiff/"
(e) geos-3.8.1.tar.bz2 from "https://trac.osgeo.org/geos"
(f) gdal-3.2.0.tar.gz from "https://gdal.org/download.html"

REDCap errors
1. Once your run completes, check your run directory for any *upload_log.tsv* or *redcaperrors.txt* file. If these files exist, they may note files or data that would have been overwritten in the database.
2. Check with the wet lab whether any RD-numbers were duplicated or previously used for the samples listed in the upload_log.tsv file.
3. Make sure your API token is not NULL and that REDCap is not down for maintenance: https://redcap.nyumc.org/apps/redcap/
4. Check whether any of the URLs in the notification or API calls have been broken by a new version of REDCap. For example, if the link https://redcap.nyumc.org/apps/redcap/redcap_v13.1.35/API/project_api.php?pid=24752 is broken, modify the URL to match the current REDCap version (e.g. /redcap/**redcap_v13.2.57**/).
Additional resources are here: https://redcap.nyumc.org/apps/redcap/index.php?action=help&newwin=1
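
The log check after a run can be scripted roughly as follows. This is a sketch with assumptions: the upload log may carry a prefix, so it is matched with `*upload_log.tsv`, and both files are expected directly in the run directory. `find_redcap_logs` is a hypothetical helper.

```shell
# List any upload logs or REDCap error files left directly in a run directory.
find_redcap_logs() {
    find "$1" -maxdepth 1 -type f \( -name '*upload_log.tsv' -o -name 'redcaperrors.txt' \)
}

run_dir="/Volumes/CBioinformatics/Methylation/Clinical_Runs/22-MGDM17"
if [ -d "$run_dir" ]; then
    logs=$(find_redcap_logs "$run_dir")
    if [ -n "$logs" ]; then
        echo "Review these files before re-uploading:"
        echo "$logs"
    else
        echo "No upload logs or REDCap error files found."
    fi
fi
```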

REDCap Email Notification issues
The automatic email notifications are located in the left-side panel under "Alerts & Notifications". To change an output path or the year in the email, click edit for Alert #1: Research Run Complete or Alert #2: Clinical Run Complete.
View the "Applications Overview" video here: https://redcap.nyumc.org/apps/redcap/index.php?action=training
A detailed guide for Alerts is available here: https://www.ctsi.ufl.edu/wordpress/files/2019/06/REDCap-Alerts-Notifications-User-Guide.pdf
Additional resources are here: https://redcap.nyumc.org/apps/redcap/index.php?action=help&newwin=1

How to upload manually to REDCap
1. Log in with your Kerberos ID at https://redcap.nyumc.org/
2. In the left-hand sidebar, scroll all the way down the Reports Bookmarks until you see the folders:
`>>>>CURRENT Runs~~~~~ and 3) >>>>>CLINICAL Current Run`
3. Here, you can click on the RD-number of choice and then select "Upload html file" under the methylation menu.
4. Optionally, you can select the "Add / Edit Records" menu in the left sidebar and find your RD-number using the "Search query" field.
5. To upload the sample classifier details, such as the values and scores, use the csv file named <run_id>_Redcap.csv, which is saved on the Desktop in a run folder named with the <run_id>. This file can be uploaded with the *Data Import Tool* in the REDCap sidebar. The folder also contains the <run_id>_samplesheet.csv file used in the run, which is derived from the .xlsm file.
Additionally, the <run_id>_Redcap.csv file is copied to: `/Volumes/CBioinformatics/Methylation/Clinical_Runs/csvRedcap/<run_id>/<run_id>_Redcap.csv`

Issues with installing or running packages
1. If you are getting compiler errors or all_installer.R fails, try installing additional system dependencies with brew, then restart your R session: https://raw.githubusercontent.com/NYU-Molecular-Pathology/Methylation/main/Development/brewFix.sh

2. If you still have errors compiling or installing a package, try removing your Makevars file:
`rm -rf $HOME/.R/Makevars`
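
A more cautious variant of the removal above is to move the file aside so it can be restored later. The sketch below assumes Makevars is a regular file at $HOME/.R/Makevars. `backup_makevars` is a hypothetical helper and is only defined here, not called.

```shell
# Move a Makevars file to Makevars.bak instead of deleting it.
backup_makevars() {
    local makevars="${1:-$HOME/.R/Makevars}"
    if [ -e "$makevars" ]; then
        mv "$makevars" "$makevars.bak" && echo "Moved $makevars to $makevars.bak"
    else
        echo "No Makevars found at $makevars"
    fi
}

# Usage (uncomment to run against your own file):
# backup_makevars
```

To undo the change, move Makevars.bak back to Makevars and restart your R session.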

Fix wet lab worksheet
1. In your run path /Volumes/CBioinformatics/Methylation/Clinical_Runs/22-MGDM##, open the RUNID.xlsm file.
2. On the Review ribbon, click unprotect worksheet and unprotect tab.
3. Right-click the "worksheet" tab at the bottom and select "Unhide..." to show the raw_labels tab.
4. If there are any "#ref" errors, either drag the formula down to correct them, or type "=", select the cell in the first tab ("worksheet"), and press return.
For example, "=worksheet!B25" references cell B25 in the tab named "worksheet".
