Code for “Single-Cell Chromatin Accessibility Analysis Reveals the Epigenetic Basis and Signature Transcription Factors for the Molecular Subtypes of Colorectal Cancers” paper.

liuzhenyu-yyy/CRC_Epi_scATAC


CRC_Epi_scATAC 🧬

🎯 This repository hosts the code for the CRC scATAC-seq project.

Fig1A

For more information about the project, please see our publication in Cancer Discovery.

📊 Data & metadata

🧬 Raw sequencing data

All scATAC-seq sequencing data generated in this study have been deposited in the Genome Sequence Archive for Human (GSA-Human) under accession number HRA000992.

📑 Processed data

Processed scATAC-seq fragment files have been deposited in the Open Archive for Miscellaneous Data (OMIX) under accession number OMIX005759.
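
As a quick sanity check after downloading, the fragment files can be summarized per cell barcode. The sketch below is only an illustration: it assumes the common five-column, tab-separated fragments layout (chromosome, start, end, cell barcode, duplicate count), which this README does not confirm for the OMIX files, and it runs on a made-up toy file (`toy.fragments.tsv`) rather than real data.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy fragments file standing in for a real download (hypothetical data).
# Assumed columns: chrom, start, end, cell barcode, duplicate count.
workdir="$(mktemp -d)"
frag="$workdir/toy.fragments.tsv"
printf 'chr1\t100\t250\tAAAC\t1\n'  > "$frag"
printf 'chr1\t300\t480\tAAAC\t2\n' >> "$frag"
printf 'chr2\t50\t190\tTTTG\t1\n'  >> "$frag"

# Per barcode: number of fragments and mean fragment length (end - start).
# Header lines starting with '#' are skipped.
summary="$(awk -F'\t' -v OFS='\t' '
  !/^#/ { n[$4]++; len[$4] += $3 - $2 }
  END   { for (b in n) print b, n[b], len[b] / n[b] }
' "$frag" | sort)"

echo "$summary"
```

For a gzip-compressed file, the same `awk` program can read from `gzip -dc file.tsv.gz` through a pipe instead of taking the file name as an argument.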

📝 Metadata

Metadata for each patient and single cell are available in the ./metadata directory of this repository.

🖥️ Scripts

📂 ./code directory

Downstream analysis of the CRC scATAC-seq data.

00.Requirements.R

Prerequisites script: imports libraries and functions.

01.All_Atlas.R

Basic analysis of the scATAC-seq atlas, related to Figure 1.

  • Dimensional reductions, clustering and cell typing
  • Single-cell CNV analysis
  • Marker peaks & TFs for each cell type

02.Epi_AD_Methylation.R

Chromatin dynamics of early adenomas, related to Figure 2.

  • Differential peaks & TFs in adenomas
  • Compare adenoma peaks with CRCs
  • Association with DNA methylation

03.Epi_Molecular_Subtype.R

Unsupervised subtyping of CRCs & chromatin features of iCMS subtypes, related to Figure 3 and Figure 4.

  • NMF of all malignant clusters
  • Differential analysis of each iCMS
  • Identify iCMS-specific TFs
  • Detailed analysis of TF activity and downstream targets

04.Epi_Intratumor.R

Analysis of intra-tumor heterogeneities, related to Figure 5.

  • Identify CNV-based intra-tumor subclones
  • Phylogenetic analysis of subclones
  • Differential analysis of each subclone

05.Epi_CIMP.R

Analysis of CIMP classifications, related to Figure 6.

  • Identify CIMP subtypes
  • Differential analysis of each iCMS
  • Identify CIMP-High specific TFs

06.Epi_TF_Module.R

Weighted correlation network of TF activities, related to Figure 7.

  • Construct correlation network on TF activities
  • Identify subtype-related TF modules
  • Association between TF module and gene expression

📂 ./pipeline directory

Pipelines for processing scATAC-seq data.

01.scATAC.process.one.sh

Process raw sequencing reads from scATAC-seq data.

02.Create_Arrow.sh

Create arrow files as input for ArchR.

CNV_from_Arrow.R

Single-cell CNV analysis. Modified from https://github.com/GreenleafLab/10x-scATAC-2019.

Run.Homer.Motif.sh

Perform motif enrichment in given peak set using Homer.
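
To give a sense of the input such a motif run expects, the sketch below converts a BED-style peak list into HOMER's five-column peak format (peak ID, chromosome, start, end, strand). It is an illustration under assumptions, not the content of Run.Homer.Motif.sh: the peak coordinates and file names are made up, and the `findMotifsGenome.pl` call in the final comment is a typical invocation rather than the one the script necessarily uses. HOMER also accepts BED files directly, so the conversion is optional.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy three-column peak set (hypothetical coordinates, for illustration only).
workdir="$(mktemp -d)"
bed="$workdir/peaks.bed"
printf 'chr1\t1000\t1500\n'  > "$bed"
printf 'chr2\t2000\t2500\n' >> "$bed"

# HOMER peak files use: peak ID, chromosome, start, end, strand.
# Peaks are unstranded here, so "+" is used as a placeholder strand.
homer_peaks="$workdir/peaks.homer.txt"
awk -F'\t' -v OFS='\t' '{ print "peak_" NR, $1, $2, $3, "+" }' "$bed" > "$homer_peaks"
cat "$homer_peaks"

# With HOMER installed and the hg38 genome configured, a typical call would be:
#   findMotifsGenome.pl "$homer_peaks" hg38 "$workdir/motifs" -size given
```

The `-size given` option tells HOMER to use the peak intervals as provided instead of resizing them around their centers.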

Run.MEDICC2.sh

Perform phylogenetic analysis of tumor subclones using MEDICC2.

📂 ./reanalysis directory

Re-analysis and integration of public datasets, including DNA methylation, scRNA-seq, and scATAC-seq.

Process_Methylation_Beadchip.R

Process DNA methylation array data of CRCs generated by Luo et al.

Data source: GSE48684

scATAC_NG_CRC_continuum.R

Re-analysis of scATAC-seq data of CRC continuum generated by Becker et al.

Data source: GSE201349

scRNA_10X_CRC_atlas.R

Re-analysis of scRNA-seq data of CRCs generated by Lee et al.

Data source: GSE132465

🛠️ Dependencies

  • R environment:
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22631)
  • R packages:
attached base packages:
 [1] parallel  stats4    grid      stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] ChIPseeker_1.30.3                        minfi_1.40.0                             bumphunter_1.36.0
 [4] locfit_1.5-9.7                           iterators_1.0.14                         foreach_1.5.2
 [7] edgeR_3.36.0                             limma_3.50.3                             readr_2.1.4
[10] igraph_1.4.2                             WGCNA_1.72-1                             fastcluster_1.2.3
[13] dynamicTreeCut_1.63-1                    ggpubr_0.6.0                             clusterProfiler_4.2.2
[16] NMF_0.26                                 cluster_2.1.4                            rngtools_1.5.2
[19] registry_0.5-1                           LOLA_1.19.1                              ggbeeswarm_0.7.2
[22] Vennerable_3.1.0.9000                    viridis_0.6.3                            viridisLite_0.4.2
[25] pheatmap_1.0.12                          patchwork_1.1.2                          org.Hs.eg.db_3.14.0
[28] genomation_1.26.0                        dplyr_1.1.2                              corrplot_0.92
[31] UpSetR_1.4.0                             TxDb.Hsapiens.UCSC.hg38.knownGene_3.14.0 GenomicFeatures_1.46.5
[34] AnnotationDbi_1.56.2                     SeuratObject_4.1.3                       Seurat_4.3.0
[37] RColorBrewer_1.1-3                       BSgenome.Hsapiens.UCSC.hg38_1.4.4        BSgenome_1.62.0
[40] rtracklayer_1.54.0                       Biostrings_2.62.0                        XVector_0.34.0
[43] rhdf5_2.38.1                             SummarizedExperiment_1.24.0              Biobase_2.54.0
[46] MatrixGenerics_1.6.0                     Rcpp_1.0.10                              Matrix_1.5-4
[49] GenomicRanges_1.46.1                     GenomeInfoDb_1.30.1                      IRanges_2.28.0
[52] S4Vectors_0.32.4                         BiocGenerics_0.40.0                      matrixStats_0.63.0
[55] data.table_1.14.8                        stringr_1.5.0                            plyr_1.8.8
[58] magrittr_2.0.3                           ggplot2_3.4.2                            gtable_0.3.3
[61] gtools_3.9.4                             gridExtra_2.3                            ArchR_1.0.2

🖇️ Citation

Please consider citing our paper:

Liu, Z., Hu, Y., Xie, H., Chen, K., Wen, L., Fu, W., Zhou, X., & Tang, F. (2024). Single-Cell Chromatin Accessibility Analysis Reveals the Epigenetic Basis and Signature Transcription Factors for the Molecular Subtypes of Colorectal Cancers. Cancer Discovery, 14(6), 1082–1105. https://doi.org/10.1158/2159-8290.CD-23-1445

✉️ Questions/Comments

For any comments or questions, please feel free to submit a GitHub issue or contact me via email at [email protected] ✨.
