Merge data from SNV, Indel and CNV to create a dataframe with Allelic Count for each mutational position #1

Open
shashwatsahay opened this issue Aug 10, 2021 · 0 comments
@shashwatsahay
Collaborator

Input files required
A config file that specifies the path to the metadata file and the location of the genome file

The metadata file is a CSV file with the following columns:

  1. Patient: A unique identifier for the patient
  2. Sample: An identifier for the sample; the same sample identifier may be reused across multiple patients
  3. Bam file: BAM file for the patient-sample combination
  4. Genomic locations file in BED format: A combined set of the mutational locations of SNVs or indels found across all samples from the same patient
  5. CNV file: A CNV file containing the following columns: chr, startpos, endpos, CN
  6. Purity: Tumour cell content of the sample
  7. Ploidy: Ploidy of the sample
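
The metadata layout above could be loaded and checked roughly as follows. This is only a sketch: the issue names the columns but not their exact header spelling, so the header names in `REQUIRED_COLUMNS` and the function name `load_metadata` are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical header names; the issue lists the columns but not their spelling.
REQUIRED_COLUMNS = ["patient", "sample", "bam", "bed", "cnv", "purity", "ploidy"]


def load_metadata(handle):
    """Read the metadata CSV and return its rows grouped by patient."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"metadata file is missing columns: {missing}")

    by_patient = defaultdict(list)
    seen = set()
    for row in reader:
        # A sample name may repeat across patients, but each
        # patient-sample combination should appear only once.
        key = (row["patient"], row["sample"])
        if key in seen:
            raise ValueError(f"duplicate patient-sample combination: {key}")
        seen.add(key)
        row["purity"] = float(row["purity"])
        row["ploidy"] = float(row["ploidy"])
        by_patient[row["patient"]].append(row)
    return dict(by_patient)


# Small in-memory example with made-up file names.
example = io.StringIO(
    "patient,sample,bam,bed,cnv,purity,ploidy\n"
    "P1,tumor1,P1_t1.bam,P1.bed,P1_t1.cnv,0.8,2.1\n"
    "P2,tumor1,P2_t1.bam,P2.bed,P2_t1.cnv,0.6,3.0\n"
)
metadata = load_metadata(example)
```

Grouping by patient up front matches the requested output, which is one table per patient.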

Output:

One TSV file per patient ID (pid) containing the following columns:

  1. Sample ID: A unique identifier for the sample, fetched from the metadata file
  2. ID: A unique ID in the format chr:pos:ref:alt (any other format is acceptable as long as it is unique per sample ID)
  3. Allelic count: computed
  4. VAF: computed
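
A minimal sketch of how one patient's output rows might be assembled. The issue does not define how the allelic count or VAF are computed, so this assumes the allelic count is the number of reads supporting the alternate allele and that VAF is alt reads divided by ref plus alt reads. The function names, the input dictionary keys, and the output header names are all hypothetical.

```python
import csv
import io


def variant_id(chrom, pos, ref, alt):
    """Build the chr:pos:ref:alt identifier described in the issue."""
    return f"{chrom}:{pos}:{ref}:{alt}"


def vaf(ref_count, alt_count):
    """Assumed definition: alt reads / (ref reads + alt reads)."""
    depth = ref_count + alt_count
    return alt_count / depth if depth else None


def write_patient_table(rows, handle):
    """Write one patient's rows as TSV with the four requested columns."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample_id", "id", "allelic_count", "vaf"])
    for r in rows:
        value = vaf(r["ref_count"], r["alt_count"])
        writer.writerow([
            r["sample"],
            variant_id(r["chrom"], r["pos"], r["ref"], r["alt"]),
            r["alt_count"],
            "NA" if value is None else f"{value:.4f}",  # NA when there is no coverage
        ])


# Made-up counts for illustration only.
rows = [
    {"sample": "tumor1", "chrom": "chr1", "pos": 12345, "ref": "A", "alt": "T",
     "ref_count": 30, "alt_count": 10},
    {"sample": "tumor2", "chrom": "chr1", "pos": 12345, "ref": "A", "alt": "T",
     "ref_count": 0, "alt_count": 0},
]
buffer = io.StringIO()
write_patient_table(rows, buffer)
```

The second row shows a position with no coverage in one sample; because the BED file is shared across all samples of a patient, such positions are expected and are written as `NA` here rather than dropped.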