-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add export to RDKIT to allow for chemdraw style visualization. #1137
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1137 +/- ##
==========================================
+ Coverage 87.13% 87.19% +0.06%
==========================================
Files 62 62
Lines 6457 6490 +33
==========================================
+ Hits 5626 5659 +33
Misses 831 831
☔ View full report in Codecov by Sentry. |
I was just thinking that since force overlap can create a bond, I need to update that function to allow bond order to be set when calling force overlap to create a bond. Again, mBuild won't necessarily do anything with bond order at this point, but I think adding the ability to set this consistently. Since this isn't very high priority, I think adding in a simple "infer_bond_order" could be useful; I tried to use the function in rdkit that supposedly does this, but ran into an error, so I still need to look more closely into that. I haven't yet tried to convert a non-atomistic particle to rdkit. I feel as this is doable, since we will explicitly set the bonds (i.e., these aren't a function of atomic species or something) and in the function I'm overwriting the names anyway to include an index. Basically just need to add some logic to know what to do if element isn't set and the name can't be parsed into an element. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Only some minor comments about type check and where the conversion method itself should stay.
mbuild/compound.py
Outdated
if bond_order is None: | ||
bond_order = "default" | ||
self.root.bond_graph.add_edge( | ||
particle_pair[0], particle_pair[1], bo=bond_order | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we need a type check here to make sure the bond_order provided is of the right type (not bogus string)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think hard checking these would be useful. The options should be:
"default", "single", "double", "triple", "aromatic", "unspecified". It seems the current options are:
"default", 1.0, 2.0, 3.0, 1.5, 0.0.
I think being cleaner with using all strings would be nice, since we wouldn't have to deal with odd behavior for floats vs ints.
Also, these options should be all listed in the doc strings to_rdkit(), regardless of what they are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a check for bond_order
using the types you suggested.
mbuild/compound.py
Outdated
3.0: Chem.BondType.TRIPLE, | ||
1.5: Chem.BondType.AROMATIC, | ||
0.0: Chem.BondType.UNSPECIFIED, | ||
"default": Chem.BondType.SINGLE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should raise a single warning here that the "default"
bond with be converted to single bond in rdkit
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, it's not clear to me what the use of having "default"
is with the changes discussed of how to specify bond order types. Do we need to have "default"
as an option here, or in add_bond
if the default is always a single bond?
I think we can just have the rdkit for atomistic system for now (initial check for particle element as you have now should be sufficient). The conversion to CG systems could follow later. I think the |
@@ -1749,6 +1757,59 @@ | |||
return pybelmol | |||
|
|||
|
|||
def to_rdkit(compound): | |||
"""Create an RDKit RWMol from an mBuild Compound.""" | |||
rdkit = import_("rdkit") |
Check notice
Code scanning / CodeQL
Unused local variable Note
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc strings can probably get added here as well. Should add that we're returning Chem.RWmol, and maybe some examples of useful things to do with the mol and references to rdkit.
Possible examples.
from rdkit.Chem import Draw
img = Draw.MolToImage(temp_mol) # draw a 2D structure
Chem.MolToSmiles(temp_mol) # get smiles string
Chem.RemoveHs(temp_mol) # remove Hydrogens
# list of possible rdkit writers
# link to other working with molecules info (https://www.rdkit.org/docs/GettingStartedInPython.html#working-with-molecules)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I only added the draw example and the link since mBuild has functionality to remove particles and convert a compound to smiles.
@chrisiacovella I'm taking over this PR :) I've addressed some of the comments from @daico007 and @CalCraven. I haven't looked closely yet at the unit tests, but I can work on adding a few more (and making sure the changes made aren't breaking the ones you added) |
for more information, see https://pre-commit.ci
Thanks @chrisjonesBSU I'll take a look at this. Not going to lie...I thought we merged this a while ago! |
Ok, I think this should be ready to go. I added a unit test for the missing lines in |
Thanks for finishing this off. Everything looks good. I think we are ready to merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Chris, this looks done to me as well.
PR Summary:
This adds in a function to export to an RDKIT RWMol (See issue #1138). This will allow a user to display a chemdraw style version of an atomistic compound. To enable structures to be drawn with correct bond order, we can now store the bond order in the bond graph as a data attribute. If you initialize a compound with RDKIt (i.e., from smiles) it will set the bond order into the bond list. the add_bond routine now accepts bond order as an argument, so one could manually define a molecule with correct bond order as well. If bond order is not defined, the data will be set to 'default' (which to_rdkit will assume means bond order 1). I didn't want to set this to 1 so as to make it clear to a user that this value was not actually set during construction.
I've added in tests to check the conversion and ensure that bond order is properly propagated when, e.g., we clone a compound. The default behavior is not changed (e.g., calling bonds() will just return the pair unless return_bond_order is set to true), so this should not break any existing functionality.
PR Checklist