Small fixes to process cif files and use them in training #141

Open. Wants to merge 1 commit into base: main.


@lbugnon (Contributor) commented Dec 30, 2024

Fixes needed to reproduce the processing:

  • a few differences in class usage
  • repeated paths in the parse verification led to "failed" messages (at least in my case)

Using a single PDB entry as a test, I converted it to the npz format and compared the result with the corresponding file in the provided training set.
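A comparison of this kind can be sketched with NumPy. This is a minimal illustration, not the code used in this PR: the function names `compare_npz_fields` and `differing_subfields` are hypothetical, and it assumes both archives load with `np.load` (pass `allow_pickle=True` only for files you trust, if they contain object arrays).

```python
import numpy as np


def compare_npz_fields(path_a, path_b, allow_pickle=False):
    """Compare two .npz archives by top-level field (hypothetical helper).

    Returns (only_in_a, only_in_b, differing): fields present in just one
    archive, and shared fields whose shape, dtype, or values differ.
    """
    with np.load(path_a, allow_pickle=allow_pickle) as a, \
            np.load(path_b, allow_pickle=allow_pickle) as b:
        keys_a, keys_b = set(a.files), set(b.files)
        differing = []
        for key in sorted(keys_a & keys_b):
            x, y = a[key], b[key]
            # Check shape and dtype first so values are only compared
            # between arrays that are actually comparable.
            if x.shape != y.shape or x.dtype != y.dtype or not np.array_equal(x, y):
                differing.append(key)
    return sorted(keys_a - keys_b), sorted(keys_b - keys_a), differing


def differing_subfields(x, y):
    """List the named fields that differ between two structured arrays.

    Assumes x and y share the same structured dtype; returns an empty
    list for arrays without named fields.
    """
    return [name for name in (x.dtype.names or ())
            if not np.array_equal(x[name], y[name])]
```

Calling `compare_npz_fields("provided.npz", "reprocessed.npz")` (placeholder paths) lists fields that exist in only one file, and `differing_subfields` narrows a mismatch in a structured array down to individual named fields.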

  • The "coords" and "ensemble" fields present in the provided npz file are missing from the file generated from the mmCIF file, although they do not appear to be used during training.
  • The "conformer" field in "atoms" also differs; I don't know whether that is expected.
  • The msa_id values are also missing. As a temporary fix, I added an option to ignore empty MSAs (msa_id=="") during training.
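The idea behind the temporary MSA fix can be sketched as follows. This is an illustration under stated assumptions, not the change in this PR: `ChainInfo`, `select_msa_ids`, and the `ignore_empty_msa` flag are hypothetical names standing in for whatever the training code actually uses.

```python
from dataclasses import dataclass


@dataclass
class ChainInfo:
    """Hypothetical stand-in for a per-chain record carrying an msa_id."""
    chain_id: int
    msa_id: str


def select_msa_ids(chains, ignore_empty_msa=True):
    """Map chain_id to msa_id, optionally skipping chains with an empty msa_id.

    With ignore_empty_msa=False, an empty msa_id raises ValueError instead
    of being silently skipped.
    """
    selected = {}
    for chain in chains:
        if chain.msa_id == "":
            if ignore_empty_msa:
                continue  # no MSA available for this chain; skip it
            raise ValueError(f"chain {chain.chain_id} has an empty msa_id")
        selected[chain.chain_id] = chain.msa_id
    return selected
```

Keeping the behaviour behind a flag means the strict check is still available once the msa_id values are populated during processing.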

Happy new year!
