feat(IPVC-2425): add migration to fix tx-alt-exon-pairs view #34
This PR addresses an existing issue with a view in the UTA database, tx_alt_exon_pairs_v. This view is used to find transcript and alternate accession exon pairs for the align-exons method. The goal is to find valid pairs that are missing the CIGAR strings provided by uta-align. The query finds transcript exons with alt_aln_method="transcript" and alignments with alt_aln_method != "transcript".
An issue arises when we deprecate an existing transcript. This happens when the CDS start/end or the exon structure has changed. UTA does not delete the old record, nor does it update the CDS or exon values. Instead, it updates alt_aln_method so that the record is essentially hidden. Here is an example of an exon set record being updated due to a change in the exon definition of the transcript.
Transcript: exon_set_id: 343991; tx_ac: NM_001173991.2; alt_aln_method: transcript
was updated to...
Transcript: exon_set_id: 343991; tx_ac: NM_001173991.2; alt_aln_method: transcript/70b44909
because the exon structure went from
0,306;306,408;408,501;501,702;702,1306
-> 0,306;306,408;408,501;501,703;703,1306
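To make the structural change in the example explicit, here is a minimal sketch that parses the two exon strings quoted above and reports the positions that differ. The helper name parse_exons and the 1-based position numbering are assumptions made for readability; they are not part of UTA.

```python
def parse_exons(structure):
    """Parse 'start,end;start,end;...' into a list of (start, end) tuples."""
    return [tuple(int(x) for x in pair.split(",")) for pair in structure.split(";")]


# Exon structures copied from the example above.
old = parse_exons("0,306;306,408;408,501;501,702;702,1306")
new = parse_exons("0,306;306,408;408,501;501,703;703,1306")

# Collect (position, old_exon, new_exon) for every exon whose coordinates changed.
changed = [(i + 1, o, n) for i, (o, n) in enumerate(zip(old, new)) if o != n]

for position, o, n in changed:
    print(f"exon {position}: {o} -> {n}")
```

Only the boundary between the fourth and fifth exons moved (702 to 703), which is enough for the old exon set to be deprecated.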
Using the following query, you will see that the updated (deprecated) transcript exon set shows up among the alt_aln_methods that will be passed to align_exons. There is no need to align transcript exons from a deprecated structure to the latest one. This issue can be addressed by adjusting the WHERE criteria of the view:
alt_aln_method != "transcript"
-> alt_aln_method !~ "transcript"
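The difference between the two predicates can be sketched outside the database. The snippet below is only an illustration: it emulates PostgreSQL's `!=` with string inequality and `!~` (case-sensitive regex non-match) with `re.search`. The function names are hypothetical, and "splign" is used here as an assumed example of a non-transcript alignment method.

```python
import re


def passes_old_filter(method):
    # Emulates: alt_aln_method != 'transcript' (exact string inequality)
    return method != "transcript"


def passes_new_filter(method):
    # Emulates: alt_aln_method !~ 'transcript' (no regex match anywhere in the value)
    return re.search("transcript", method) is None


# "transcript/70b44909" is the deprecated value from the example above;
# "splign" is an assumed alignment method used only for illustration.
methods = ["transcript", "transcript/70b44909", "splign"]

old_kept = [m for m in methods if passes_old_filter(m)]
new_kept = [m for m in methods if passes_new_filter(m)]

print("old filter keeps:", old_kept)
print("new filter keeps:", new_kept)
```

With the old predicate, the deprecated value transcript/70b44909 is treated as an alternate alignment; with the regex predicate it is excluded along with the current transcript row.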
To test this, I ran the following query before and after the Alembic migration.
BEFORE:
AFTER: