Insertions relative to human in the MSAs #2

Open
junhaobearxiong opened this issue Oct 12, 2024 · 2 comments

Comments

@junhaobearxiong

Hi!

Thank you for your exciting work, and especially for sharing the data with the community! I have a question about the omicsMSAs for human proteins (downloaded from here). In these MSAs, the query sequences have no gaps, but the hit sequences contain lower-case characters. After looking through the supplementary materials, I have a few questions:

  1. Is the query sequence always the human sequence? If so, are the insertions relative to human (typically represented by gaps in the human sequence and lower-case letters in the hit sequences) removed during post-processing of the MSAs? I'm very interested in modeling these insertions, and I'm wondering whether you could also share the intermediate alignments that contain the insertions relative to human.
  2. If it is correct that the MSAs don't contain insertions relative to human, what do the lower-case characters in the hit sequences represent?
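
To make the two readings concrete, here is a rough check, assuming the A3M-style convention in which lower-case letters mark insertions that do not occupy a query column. The function name `classify_case_convention` and the toy sequences are hypothetical and are not taken from the released files.

```python
def classify_case_convention(query: str, hit: str) -> str:
    """Guess how lower-case letters in `hit` relate to the query columns.

    Assumption: `query` is the ungapped first sequence of the MSA.
    """
    n_query = len(query)
    # Reading 1 (plain FASTA): every character, upper or lower, is one column.
    n_all = len(hit)
    # Reading 2 (A3M-style): lower-case letters are insertions and are not
    # counted as columns; only upper-case letters and gaps are.
    n_no_lower = sum(1 for c in hit if not c.islower())

    if n_all == n_query and n_no_lower == n_query:
        return "no-lowercase"
    if n_all == n_query:
        return "plain-fasta"
    if n_no_lower == n_query:
        return "a3m-insertions"
    return "inconsistent"


if __name__ == "__main__":
    # Toy example: the hit has the same total length as the query, so the
    # lower-case letters occupy real columns.
    print(classify_case_convention("MKTAYI", "MK-ayI"))
```

If every hit is classified as `plain-fasta`, the lower-case letters sit in columns that exist in the query, so they cannot be insertions relative to it.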

Thank you so much!

Best,
Bear

@CongLabCode
Owner

Sorry for the late reply. The file is a FASTA file, so lower-case and upper-case letters should be treated as the same. I am sorry that I forgot to convert them to upper case and caused this confusion. We only kept columns that exist in the human sequences, so no insertions were kept.
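
As a minimal sketch of the clean-up described in this reply, the snippet below upper-cases every sequence and checks that each one has the same number of columns as the first record. It assumes the first record is the human query; the helper names `read_fasta` and `normalize_msa` are hypothetical and are not part of this repository.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def normalize_msa(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Upper-case all sequences and verify they share the query's length."""
    records = [(h, s.upper()) for h, s in read_fasta(lines)]
    if not records:
        return records
    # Assumption: the first record is the (ungapped) human query.
    query_len = len(records[0][1])
    for header, seq in records:
        if len(seq) != query_len:
            raise ValueError(
                f"{header}: length {len(seq)} differs from query length {query_len}"
            )
    return records


if __name__ == "__main__":
    toy = [">query", "MKTA", "YI", ">hit1", "MK-ayI"]
    for header, seq in normalize_msa(toy):
        print(header, seq)
```

For a real file, pass an open file handle instead of the toy list, for example `normalize_msa(open(path))`, where `path` is a placeholder for one of the downloaded MSAs.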

@junhaobearxiong
Author

Thank you so much for the clarification! Do you by any chance still have the intermediate MSAs produced by the alignment software, with the insertions relative to human? Those would be a great help to us.
