In this repository, you will find all the code necessary to:
- Clean and harmonize regional tau PET SUVR and Centiloid values for the ADNI and A4 cohorts.
- Run a graphical LASSO model to estimate the strongest conditional dependencies between tau accumulation in different brain regions and prune weaker, spurious correlations.
- Analyze graph metrics to determine differences in efficiency and organization of tau deposition at varying global amyloid burdens.
A. Researchers can request the data used in this project from the ADNI and A4 websites.
B. Data cleaning scripts are located in jad2024/data_paths_and_cleaning/data_cleaning_scripts:
- merging_cent_tau_csvs.ipynb: merges the raw Centiloid and tau SUVR csvs into a master csv used for analysis and applies a Centiloid cut-off of >=21, as established by Royse et al. (2021). The output is a new csv containing only amyloid-positive patients, with adni or a4 in its name according to the dataset the csv belongs to.
- adni_a4_data_harmonization.ipynb: creates new harmonized dataframes for ADNI and A4 with the tau SUVR values for 44 bilateral brain regions and saves them under a parent folder named a4 or adni, according to the cohort the data belongs to.
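
The merge and cut-off step described above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the column names (`subject_id`, `centiloid`, `region_a_suvr`), the helper `merge_and_filter`, and the toy values are all hypothetical, and it assumes pandas is used for the merge.

```python
import pandas as pd

CENTILOID_CUTOFF = 21  # amyloid-positivity cut-off stated above (Royse et al., 2021)


def merge_and_filter(cent_df, tau_df, id_col="subject_id", cent_col="centiloid"):
    """Merge Centiloid and tau SUVR tables, keeping amyloid-positive rows only.

    id_col and cent_col are hypothetical column names used for illustration.
    """
    # Inner join keeps only subjects present in both tables.
    merged = cent_df.merge(tau_df, on=id_col, how="inner")
    # Keep subjects at or above the Centiloid cut-off.
    positive = merged[merged[cent_col] >= CENTILOID_CUTOFF]
    return positive.reset_index(drop=True)


# Toy, made-up values purely to exercise the function.
cent = pd.DataFrame(
    {"subject_id": [1, 2, 3, 4], "centiloid": [5.0, 21.0, 60.5, 110.2]}
)
tau = pd.DataFrame(
    {"subject_id": [1, 2, 3, 5], "region_a_suvr": [1.05, 1.10, 1.32, 1.20]}
)

positive = merge_and_filter(cent, tau)
print(positive)
```

Subjects missing from either table are dropped by the inner join; whether the notebook handles missing scans the same way is an assumption.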
In jad2024/analyze_graphs, you will find scripts for hyperparameter selection and for running the graphical models on data that have been divided into three Centiloid quantile groups:
- hyperparameter_tuning/bic.ipynb: This script shows how different values of the L1 regularization strength (alpha) affect the sparsity of the precision and covariance matrices and the BIC of the graphical model. The BIC is used to determine the optimal alpha to apply.
- construct_and_analyze_graphs/streamlined_graphs_allinone.ipynb: This script creates 1000 bootstrap samples of the data and fits a probabilistic graphical model to each bootstrapped sample. It produces graph visualizations of the model's learned tau graph structure and calculates metrics such as the weighted clustering coefficient, average shortest path length, and weighted small-world coefficient to analyze how the efficiency of tau deposition increases at higher amyloid burdens.
- construct_and_analyze_graphs/sig_testing.ipynb: This script tests whether mean graph metrics differ significantly among amyloid groups. It performs an ANOVA test on the clustering coefficient and the average shortest path length, and a Kruskal-Wallis test on the small-world coefficient.
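
As an illustration of the alpha sweep described for hyperparameter_tuning/bic.ipynb, the sketch below fits a graphical LASSO at several alpha values and compares edge counts and BIC. It is not the notebook's code: it assumes scikit-learn's `GraphicalLasso`, uses synthetic data in place of the tau SUVR matrix, and uses one common Gaussian BIC formulation (the helper `gaussian_bic` is hypothetical, and the notebook may compute BIC differently).

```python
import numpy as np
from sklearn.covariance import GraphicalLasso, empirical_covariance


def gaussian_bic(precision, emp_cov, n_samples, tol=1e-8):
    """BIC for a Gaussian graphical model (one common formulation, assumed here)."""
    # Gaussian log-likelihood up to an additive constant.
    _, logdet = np.linalg.slogdet(precision)
    loglik = 0.5 * n_samples * (logdet - np.trace(emp_cov @ precision))
    # Free parameters: the diagonal plus each non-zero off-diagonal pair (edge).
    p = precision.shape[0]
    upper = precision[np.triu_indices(p, k=1)]
    n_edges = int(np.count_nonzero(np.abs(upper) > tol))
    bic = -2.0 * loglik + (p + n_edges) * np.log(n_samples)
    return bic, n_edges


# Synthetic stand-in for a subjects-by-regions matrix (not real data).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
X[:, 1] += 0.8 * X[:, 0]  # induce one strong dependency
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each column

emp_cov = empirical_covariance(X)
alphas = [0.01, 0.05, 0.1, 0.2, 0.4]  # illustrative grid, not the notebook's

results = []
for alpha in alphas:
    model = GraphicalLasso(alpha=alpha, max_iter=200).fit(X)
    bic, n_edges = gaussian_bic(model.precision_, emp_cov, X.shape[0])
    results.append((alpha, n_edges, bic))
    print(f"alpha={alpha:<5} edges={n_edges:<3} BIC={bic:.1f}")

# The alpha with the lowest BIC balances fit against sparsity.
best_alpha = min(results, key=lambda r: r[2])[0]
print("selected alpha:", best_alpha)
```

Larger alpha values shrink more off-diagonal precision entries to zero, so the edge count generally falls as alpha rises, while BIC penalizes both poor fit and excess edges.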