a/b means #4

Open
Chensanyu opened this issue May 15, 2019 · 3 comments

Comments

@Chensanyu

  1. Hotspot regions in the result:
    1e96 HNSC 0:A:116;0:A:159;0:A:18;0:B:102;0:A:29;0:A:15
    What does 0 mean? What do A and B stand for?

  2. How do I find the correspondence between genes and hotspots? Can I upload mutations for different genes related to one cancer type in a MAF file?

@ctokheim
Collaborator

  1. The first number indicates which biological assembly was used for the protein structure. A PDB entry may have multiple biological assemblies, with the subunits of a complex in different orientations. A/B (and potentially other letters) denote the protein chain. Protein chains may originate either from the same gene's protein product or from different genes.

  2. There are two options. If you just want to check whether your mutations overlap with previously computed hotspots from TCGA, you can upload your mutations to MuPIT (https://mupit.icm.jhu.edu/MuPIT_Interactive/) or to CRAVAT (https://www.cravat.us/CRAVAT/), which provides links to MuPIT for viewing the protein structure. Alternatively, you could cluster mutations based on your own set of mutations. This requires following the "exome-scale" pipeline of HotMAPS, https://github.com/KarchinLab/HotMAPS/wiki/Tutorial-(Exome-scale).
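
As a rough illustration of the notation in point 1, here is a minimal Python sketch that splits one line of the structure-level output into its parts. It is not part of HotMAPS; the class and function names are hypothetical. It assumes the columns are whitespace-separated (structure ID, tumor type, region), and that the third field of each `assembly:chain:residue` token is an integer residue position, which is an assumption on my part.

```python
from collections import defaultdict
from typing import NamedTuple


class HotspotResidue(NamedTuple):
    assembly: str  # biological assembly identifier (first field)
    chain: str     # protein chain letter (second field)
    position: int  # ASSUMED: integer residue position (third field)


def parse_hotspot_line(line):
    """Split one line into (structure_id, tumor_type, list of residues)."""
    # ASSUMED column layout: structure ID, tumor type, ';'-separated region
    structure_id, tumor_type, region = line.split()
    residues = []
    for token in region.split(";"):
        assembly, chain, position = token.split(":")
        residues.append(HotspotResidue(assembly, chain, int(position)))
    return structure_id, tumor_type, residues


def group_by_chain(residues):
    """Collect sorted residue positions per (assembly, chain) pair."""
    grouped = defaultdict(list)
    for residue in residues:
        grouped[(residue.assembly, residue.chain)].append(residue.position)
    return {key: sorted(positions) for key, positions in grouped.items()}


# The example line quoted in the question above
line = "1e96 HNSC 0:A:116;0:A:159;0:A:18;0:B:102;0:A:29;0:A:15"
structure_id, tumor_type, residues = parse_hotspot_line(line)
print(structure_id, tumor_type)
print(group_by_chain(residues))
```

For the line quoted in the question, this groups five residues under assembly 0, chain A, and one residue under assembly 0, chain B. Residue identifiers with insertion codes would need the position kept as a string instead of an integer.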

@Chensanyu
Author

Chensanyu commented May 19, 2019

Thank you very much, your answer is very helpful to me.
But I am confused about the difference between the two result files, 'hotspot_regions_gene_.01.txt' and 'hotspot_regions_structure_.01.txt'.

  1. My guess is that the first file lists a gene with its mutations, and the second maps a gene to its structures. Is this guess correct? Are the mutations in 'hotspot_regions_gene_.01.txt' in one cluster or in multiple clusters? Are there any other connections or differences between the two files? Is 'hotspot_regions_gene_.01.txt' one of the final generated files?
  2. If I want to know how many clusters a gene contains and which mutations are in each cluster, should I focus on 'hotspot_regions_gene_.01.txt' or 'hotspot_regions_structure_.01.txt'?
    Sorry to bother you; I look forward to your reply.

@ctokheim
Collaborator

The answer depends on what you are looking for. I created the "gene" file because there may be many structures for a particular protein, and the clustering is not always the same across them. Basically, the "gene" file merges the clustering results for all structures into one consensus. If you are only interested in which mutations group together, and not in the underlying protein structure, then the gene file is probably the best fit.
