ROSETTA-X-to-C-code-2023

Modified version of the 2021 ROSETTA code that looks only at mutations which change an amino acid to a cysteine (X to C). This directory should output the counts of each X-to-C mutation for each gene under each histology. The counts are written to "Genomics_Output_Processed_specific_Cys.txt", which is generated by the Genomics_Clinical_Mutations_Counter_cystine.ipynb notebook (when nested in the proper directories).
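As a rough illustration of what counting X-to-C mutations per gene and histology involves, here is a minimal standard-library sketch. It is not the notebook's code: the record layout (gene, histology, protein change), the example rows, and the helper names `is_x_to_c` and `count_x_to_c` are all assumptions made for this example, and the notebook may apply additional rules (for instance, per-sample deduplication).

```python
import re
from collections import Counter

# Matches a single-residue substitution such as "R132C" or "p.G12C".
# Group 1 is the original residue, group 2 the position, group 3 the new residue.
_SUBSTITUTION = re.compile(r"^(?:p\.)?([A-Z])(\d+)([A-Z])$")


def is_x_to_c(protein_change):
    """Return True if the change replaces a non-cysteine residue with cysteine."""
    match = _SUBSTITUTION.match(protein_change.strip())
    if match is None:
        # Not a simple substitution (e.g. nonsense or frameshift notation).
        return False
    original, _position, new = match.groups()
    return new == "C" and original != "C"


def count_x_to_c(records):
    """Count X-to-C mutations per (gene, histology) pair.

    `records` is an iterable of (gene, histology, protein_change) tuples.
    This layout is hypothetical and chosen only for the example.
    """
    counts = Counter()
    for gene, histology, protein_change in records:
        if is_x_to_c(protein_change):
            counts[(gene, histology)] += 1
    return counts


# Illustrative rows only, not data from the repository.
records = [
    ("KRAS", "Lung", "G12C"),
    ("KRAS", "Lung", "p.G12C"),
    ("KRAS", "Lung", "G12D"),
    ("TP53", "Colorectal", "R273C"),
    ("TP53", "Colorectal", "R273H"),
]
counts = count_x_to_c(records)
```

Keying the counter on the (gene, histology) pair mirrors the "each gene under each histology" layout described above; a pandas `groupby` on the same two columns would be an equivalent approach.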

Requires files and a directory structure similar to those of the original ROSETTA. Refer to that repo for installation instructions, and replace the appropriate files with the mutation-specific ones from this repo.

====File Descriptions===

Genomics_Clinical_Mutations_Counter_cystine.ipynb- base file which downloads all requisite raw data and builds the counts matrix from scratch

Genomics_Clinical_Mutations_Counter_cystine_only.ipynb- version of Genomics_Clinical_Mutations_Counter_cystine.ipynb which skips downloading of raw data, instead using the contents of df_mutfiles_specific.7z and df_mutfiles_specific_cysteine.7z as its data. Said contents are a freeze of data retrieved between February and March 2022.

Genes_ManyChromosomes.xlsx lists the chromosome number for every gene. It is used for reference during gene renaming as an additional cross-check: if a gene name is truly an alias, it must be on the same chromosome as the gene it is renamed to.
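The chromosome cross-check described above can be sketched as follows. This is an illustration under stated assumptions, not the notebook's implementation: the function name `check_alias`, the dictionary layout, and the placeholder gene names are hypothetical, and in practice the mapping would be loaded from the spreadsheet.

```python
def check_alias(alias, canonical, chromosome_of):
    """Return True only when both names are known and share a chromosome.

    `chromosome_of` maps a gene name to its chromosome label. Sharing a
    chromosome is a necessary condition for a true alias, not a sufficient
    one, so a True result means "not ruled out" rather than "confirmed".
    """
    alias_chrom = chromosome_of.get(alias)
    canonical_chrom = chromosome_of.get(canonical)
    if alias_chrom is None or canonical_chrom is None:
        # An unknown gene name cannot be cross-checked, so reject the rename.
        return False
    return alias_chrom == canonical_chrom


# Placeholder names and chromosomes, for illustration only.
chromosome_of = {
    "GENE_A": "12",
    "GENE_A_OLD": "12",
    "GENE_B": "7",
}

same_chromosome = check_alias("GENE_A_OLD", "GENE_A", chromosome_of)
different_chromosome = check_alias("GENE_B", "GENE_A", chromosome_of)
unknown_gene = check_alias("GENE_Z", "GENE_A", chromosome_of)
```

Rejecting unknown names by default keeps the check conservative: a rename goes through only when the spreadsheet has an entry for both names and the chromosomes agree.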

df_mutfiles.zip can be unzipped to create a ~180MB file named df_mutfiles.csv. This file contains the ROSETTA-coded sequenced samples, with sample ID, study ID, and mutated-gene information for every gene in every sample included in our genomic analysis.
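Because the unzipped CSV is fairly large, it can also be read directly from the archive one row at a time. The sketch below uses only the Python standard library. The helper name `iter_csv_rows`, the UTF-8 encoding, and the column names in the demo are assumptions for this example; the real column names are whatever the header row of df_mutfiles.csv contains.

```python
import csv
import io
import zipfile


def iter_csv_rows(zip_path, member="df_mutfiles.csv"):
    """Yield each row of a CSV stored inside a zip archive as a dict.

    `zip_path` may be a filesystem path or a binary file-like object.
    UTF-8 encoding is assumed here.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # DictReader uses the first row of the file as the column names.
            yield from csv.DictReader(text)


# Demo on a tiny in-memory archive with made-up columns and values.
demo_archive = io.BytesIO()
with zipfile.ZipFile(demo_archive, "w") as archive:
    archive.writestr(
        "df_mutfiles.csv",
        "sample_id,study_id,gene\nS1,STUDY_1,GENE_A\nS2,STUDY_1,GENE_B\n",
    )
demo_archive.seek(0)

rows = list(iter_csv_rows(demo_archive))
```

For the real file, passing the path to df_mutfiles.zip in place of the in-memory archive and iterating without `list(...)` avoids holding the whole table in memory.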

df_mutfiles_specific_cysteine- version of the above filtered for cysteine mutations only, counted by specific mutational variant

The following libraries were used in compiling the source code.

NumPy 1.19.2
pandas 1.1.3

DLiarakos/ROSETTA-X-to-C-code-2023
