Z-score normalization to normal samples #53

Open
antass opened this issue Jun 5, 2023 · 3 comments

antass commented Jun 5, 2023

I'm exploring the TCGA Pan-Cancer dataset which includes 32 studies. What samples are used as background when I select expression with normalization to normal samples?

Are there adjacent normal samples available in all of the 32 studies? I'm struggling to find any detailed info on that. In one webinar, the z-score calculation with normals as reference is mentioned, but only in the MSK Prostate study.

Specifically, for my gene of interest, the graph in the Cancer Types Summary tab shows significant overexpression in cholangiocarcinoma: 94% of tumor samples have high mRNA status (z-score > 2). Does this mean overexpression in cholangio tumors as compared to

  • only cholangio normals?
  • only prostate normals?
  • all normals in all studies (in which case I'd like to know which studies were included)?

Thank you very much for your help!

@pieterlukasse
Member

Good question. When I look at the sample type in the pancancer TCGA studies, I see only the following:
[image: screenshot of the sample types in the pancancer TCGA studies]
So the reference normals don't seem to be loaded into the portal.
@ritikakundra any ideas where to find the normals used? Maybe it makes sense to add a link to that as part of the genomic profile description?

@rmadupuri
Collaborator

Hi @antass, sorry for the delayed response! In case you're still interested:

> What samples are used as background when I select expression with normalization to normal samples?

The profile shows tumor sample z-scores calculated with reference to expression in normal samples.

> Are there adjacent normal samples available in all of the 32 studies? I'm struggling to find any detailed info on that. In one webinar, the z-score calculation with normals as reference is mentioned, but only in the MSK Prostate study.

Normal sample data is available for 16 TCGA Pancancer studies. You can find more details in cBioPortal/datahub#1241

> Specifically, for my gene of interest, when I look at the Cancer Types Summary tab, the graph shows significant overexpression in cholangiocarcinoma (94% of tumor samples have high mRNA status (z-score > 2)). Does this mean overexpression of cholangio tumors as compared to
>
>   • only cholangio normals?
>   • only prostate normals?
>   • all normals in all studies (in which case I'd like to know which studies were included)?

This indicates overexpression in cholangio tumors compared to cholangio normals from the same study.
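To make the comparison concrete, here is a minimal sketch of a tumor-versus-normal z-score for a single gene within one study. It assumes the standard definition, z = (tumor value - mean of normals) / standard deviation of normals, and uses the sample standard deviation; this thread does not confirm which estimator, transformation, or filtering the portal applies. The function names and expression values below are hypothetical.

```python
from statistics import mean, stdev


def zscores_vs_normals(tumor_values, normal_values):
    """Z-score each tumor value against the normal samples of the same study.

    Assumption: sample standard deviation (n - 1 denominator). The portal's
    exact estimator is not stated in this thread.
    """
    mu = mean(normal_values)
    sigma = stdev(normal_values)  # needs at least two normal samples
    if sigma == 0:
        raise ValueError("normal samples have zero variance")
    return [(x - mu) / sigma for x in tumor_values]


def fraction_high(zscores, threshold=2.0):
    """Fraction of samples whose z-score exceeds the threshold."""
    return sum(z > threshold for z in zscores) / len(zscores)


# Hypothetical expression values for one gene in one study.
normals = [4.0, 5.0, 6.0]
tumors = [5.0, 8.0, 9.0, 10.0]

z = zscores_vs_normals(tumors, normals)
high = fraction_high(z)
print(z, high)
```

Under this reading, a figure such as "94% of tumor samples with z-score > 2" is the value `fraction_high` would return for the tumors of that study, with the normals of the same study as the only reference.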

@rmadupuri
Collaborator

rmadupuri commented Nov 8, 2024

@pieterlukasse, currently only the tumor sample profile, with z-scores calculated relative to expression in normal samples, is loaded into the portal. We have two additional profiles:

  1. mRNA expression of normal samples
  2. mRNA expression of normal samples, with z-scores relative to all normal samples

These profiles are not yet loaded into the portal, but you can access this data in the pan_can_atlas studies under the normals folder on Datahub. There was a prior discussion about supporting these profiles, and we should revisit this. CC’ing @jjgao and @ritikakundra - correct me if I’ve missed anything.

We will update the README to include details about this profile.
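As a rough illustration of the second profile (normal samples z-scored relative to all normal samples), the sketch below standardizes each gene across the normal samples themselves. It assumes "all normal samples" means the normals of the same study and that the sample standard deviation is used; neither is confirmed in this thread. The gene names, values, and function name are hypothetical.

```python
from statistics import mean, stdev


def normal_profile_zscores(expression_by_gene):
    """Z-score every normal sample against all normal samples, gene by gene.

    Assumption: the reference set is the full list of normals passed in.
    Genes with zero variance get NaN instead of raising.
    """
    result = {}
    for gene, values in expression_by_gene.items():
        mu = mean(values)
        sigma = stdev(values)  # needs at least two samples per gene
        result[gene] = [
            (v - mu) / sigma if sigma > 0 else float("nan") for v in values
        ]
    return result


# Hypothetical normal-sample expression, one list of values per gene.
normal_expression = {
    "GENE_A": [2.0, 4.0, 6.0],
    "GENE_B": [1.0, 1.0, 1.0],  # constant across samples
}

normal_z = normal_profile_zscores(normal_expression)
print(normal_z)
```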
