This project aims to uncover relationships between genetic mutations and tumor microenvironment (TME) characteristics in colorectal cancer. We focus on three types of mutations:
- STRs (Short Tandem Repeats) – short DNA motifs, typically 1 to 6 base pairs long, repeated in tandem.
- SNPs (Single Nucleotide Polymorphisms) – single base-pair changes in the DNA.
- Indels (Insertions and Deletions) – small insertions or deletions of bases in the DNA.
By analyzing these mutation types, we will explore their association with mucin production and immune cell composition in tumors.
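To make the three categories concrete, here is a minimal sketch of how a single variant could be labeled from its reference and alternate alleles, and how tandem copies of a motif could be counted. The function names `classify_variant` and `tandem_repeat_count` are hypothetical helpers introduced for illustration, and the treatment of `-` as an empty allele is an assumption about the input format, not something this project prescribes.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Label a variant as SNP, insertion, deletion, or other from its alleles."""
    # Assumption: some tabular formats write an empty allele as "-".
    ref = ref.replace("-", "")
    alt = alt.replace("-", "")
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNP"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "other"  # e.g. multi-nucleotide substitutions or identical alleles


def tandem_repeat_count(sequence: str, motif: str, start: int = 0) -> int:
    """Count consecutive copies of `motif` in `sequence`, beginning at `start`."""
    count = 0
    pos = start
    # The `motif and` guard avoids an infinite loop on an empty motif.
    while motif and sequence.startswith(motif, pos):
        count += 1
        pos += len(motif)
    return count
```

An insertion or deletion that changes the value returned by `tandem_repeat_count` at a repeat locus would be a candidate STR mutation. Real STR calling normally relies on dedicated tools rather than a check this simple.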
- Literature overview. Review existing studies on genetic mutations and their impact on the tumor microenvironment, with a focus on mucin production and immune cell composition in colorectal cancer.
- Data preparation. Extract mutation data from The Cancer Genome Atlas (TCGA) and preprocess it for machine learning. Identify the best way to represent each mutation type for predictive modeling.
- Machine learning. Train models to predict mucin levels and immune cell composition based on mutation data. The objective is to determine which mutation types are most relevant for understanding tumor microenvironment characteristics.
- Interpretation. Analyze model outputs to identify the key genetic features influencing mucin and immune cell levels. Validate the findings by comparing them with associations reported in existing research.
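The data preparation and interpretation steps can be sketched in a few lines. This is a deliberately simplified illustration, not the project's method. It represents each sample by its mutation counts per type, which is only one possible representation. It then ranks the types by absolute Pearson correlation with a per-sample target, a univariate screen standing in for a trained model. The names `count_features`, `pearson`, and `rank_mutation_types` are hypothetical, and the sample records and mucin scores are invented purely to make the example runnable.

```python
from collections import Counter, defaultdict
from math import sqrt

MUTATION_TYPES = ("STR", "SNP", "indel")


def count_features(records):
    """Turn (sample_id, mutation_type) records into per-sample count vectors."""
    counts = defaultdict(Counter)
    for sample_id, mutation_type in records:
        counts[sample_id][mutation_type] += 1
    return {s: [c[t] for t in MUTATION_TYPES] for s, c in counts.items()}


def pearson(x, y):
    """Pearson correlation; returns 0.0 when it is undefined."""
    n = len(x)
    if n < 2:
        return 0.0
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature or target carries no signal
    return sxy / sqrt(sxx * syy)


def rank_mutation_types(features, target):
    """Rank mutation types by absolute correlation with the target."""
    samples = sorted(features.keys() & target.keys())
    y = [target[s] for s in samples]
    scores = {
        t: pearson([features[s][i] for s in samples], y)
        for i, t in enumerate(MUTATION_TYPES)
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Invented toy data, for illustration only (not TCGA values).
records = [
    ("s1", "SNP"), ("s1", "SNP"), ("s1", "STR"),
    ("s2", "SNP"), ("s2", "indel"), ("s2", "indel"),
    ("s3", "STR"), ("s3", "STR"), ("s3", "STR"), ("s3", "SNP"),
]
mucin_score = {"s1": 0.4, "s2": 0.1, "s3": 0.9}

features = count_features(records)
ranking = rank_mutation_types(features, mucin_score)
```

In a real analysis, the count vectors would feed a multivariate model evaluated on held-out samples, and the ranking would come from that model's feature importances rather than from pairwise correlations.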