feat: calculate cds coverage #1514
base: master
let cds_coverage = translation
  .cdses()
  .map(|cds| {
    let ref_peptide_len = ref_translation.get_cds(&cds.name)?.seq.len();
    let num_aligned_aa = cds.alignment_ranges.iter().map(Range::len).sum::<usize>();
    let num_unknown_aa = unknown_aa_ranges
      .iter()
      .filter(|r| r.cds_name == cds.name)
      .map(|r| r.length)
      .sum();
    let total_covered_aa = num_aligned_aa.saturating_sub(num_unknown_aa);

    let coverage_aa = if ref_peptide_len == 0 {
      0.0
    } else {
      total_covered_aa as f64 / ref_peptide_len as f64
    };

    Ok((cds.name.clone(), coverage_aa))
  })
  .collect::<Result<BTreeMap<String, f64>, Report>>()?;
Relevant algo code
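For readers who want to try the arithmetic outside the codebase, here is a minimal standalone sketch of the same calculation. It is not the project's code: the `CdsTranslation` and `UnknownAaRange` structs, the `ref_peptide_lens` map, and the `String` error type are simplified stand-ins assumed for illustration. Only the formula (aligned amino acids minus unknown amino acids, divided by the reference peptide length, with a guard for zero length) follows the hunk above.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Hypothetical, simplified stand-in for a translated CDS of the query.
struct CdsTranslation {
    name: String,
    /// Aligned ranges in amino-acid coordinates (half-open).
    alignment_ranges: Vec<Range<usize>>,
}

/// Hypothetical stand-in for a run of unknown amino acids within a CDS.
struct UnknownAaRange {
    cds_name: String,
    length: usize,
}

/// Computes per-CDS coverage as (aligned aa - unknown aa) / reference peptide length.
/// `ref_peptide_lens` is an assumed lookup from CDS name to reference peptide length.
fn cds_coverage(
    cdses: &[CdsTranslation],
    ref_peptide_lens: &BTreeMap<String, usize>,
    unknown_aa_ranges: &[UnknownAaRange],
) -> Result<BTreeMap<String, f64>, String> {
    cdses
        .iter()
        .map(|cds| -> Result<(String, f64), String> {
            // A CDS missing from the reference is an error, mirroring the `?` in the hunk.
            let ref_len = *ref_peptide_lens
                .get(&cds.name)
                .ok_or_else(|| format!("CDS '{}' not found in reference", cds.name))?;

            let num_aligned: usize = cds.alignment_ranges.iter().map(|r| r.len()).sum();
            let num_unknown: usize = unknown_aa_ranges
                .iter()
                .filter(|r| r.cds_name == cds.name)
                .map(|r| r.length)
                .sum();

            // saturating_sub keeps the count at 0 instead of underflowing.
            let covered = num_aligned.saturating_sub(num_unknown);

            // Avoid dividing by zero for an empty reference peptide.
            let coverage = if ref_len == 0 {
                0.0
            } else {
                covered as f64 / ref_len as f64
            };

            Ok((cds.name.clone(), coverage))
        })
        .collect()
}
```

With a reference peptide of length 100, aligned ranges `0..40` and `50..90` (80 aa), and one unknown run of length 5, the sketch yields 75 / 100 for that CDS.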
Right now, missing genes are not shown in the CSV/TSV output or in the Web UI. This was the easiest option and aligns with how mutations etc. are shown. Numbers are not truncated in CSV/TSV; that output is not very human-readable anyway.
The algorithm makes sense to me and I think it works as intended. HIV is a good test case (frame shifts, multi-segment CDS, gaps, insertions, etc.). Currently the calculation is … notably, frameshifted regions or … If we weren't using "Coverage" for the corresponding number for nucleotides, I would probably argue for "Completeness". Regarding precision: I don't feel strongly. Digits past …
Resolves: #1513