-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix for primitive cells #57
Conversation
indices_dict = { | ||
v: k | ||
for k, v in structuretoolkit.common.helper.get_species_indices_dict( | ||
structure=self._structure | ||
).items() | ||
} | ||
new_structure.symbols = [indices_dict[i] for i in indices] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to run over all indices, and I don't think it will, and as far as I can tell that's because get_species_indices_dict
behaves in an undesirable way by overloading keys.
Note:
import structuretoolkit
from ase.build import bulk
structure = bulk("Al", cubic=True)
structure.symbols[0] = "Mg"
structure.symbols[2] = "Cu"
print(structuretoolkit.common.helper.get_species_indices_dict(structure))
>>> {'Mg': 0, 'Al': 1, 'Cu': 2}
# Yikes! atom 3 being Al is totally missed
# Modified from this PR:
indices_dict = {
v: k
for k, v in structuretoolkit.common.helper.get_species_indices_dict(
structure=structure
).items()
}
print(indices_dict)
>>> {0: 'Mg', 1: 'Al', 2: 'Cu'}
# The third atom doesn't get its species updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the feeling you confuse species and symbols. The function you are looking for is:
structuretoolkit.common.helper.get_structure_indices(structure)
Which returns array([0, 1, 2, 1])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not quite. My confusion was that the indices
variable is [0, 1, 2, 1]
and not the array indices [0, 1, 2, 3]
which I had initially assumed. So my concern was unfounded -- there is no mismatch in the span of integers in indices
and indices_dict
.
You did, however, point me nicely in the direction of the helper module. Is it not possible to replace these seven lines with structuretoolkit.common.helper.set_indices(new_structure, indices)
? I can see this is not exactly the same, because in the existing change the get_species_indices_dict
call gets self._structure
while the structure getting its symbols set is new_structure
, and in my proposal it's just new_structure
both times. However, new_structure = [: len(indices)]
and get_species_indices_dict
is pulling the indices of first occurrence for each species, so I don't see how these can differ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is in response to #56, in which case I find it uncomfortable that there is no corresponding modification to the tests showing how this change now fixes an error in the previous version.
indices_dict = { | ||
v: k | ||
for k, v in structuretoolkit.common.helper.get_species_indices_dict( | ||
structure=self._structure | ||
).items() | ||
} | ||
new_structure.symbols = [indices_dict[i] for i in indices] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, not quite. My confusion was that the indices
variable is [0, 1, 2, 1]
and not the array indices [0, 1, 2, 3]
which I had initially assumed. So my concern was unfounded -- there is no mismatch in the span of integers in indices
and indices_dict
.
You did, however, point me nicely in the direction of the helper module. Is it not possible to replace these seven lines with structuretoolkit.common.helper.set_indices(new_structure, indices)
? I can see this is not exactly the same, because in the existing change the get_species_indices_dict
call gets self._structure
while the structure getting its symbols set is new_structure
, and in my proposal it's just new_structure
both times. However, new_structure = [: len(indices)]
and get_species_indices_dict
is pulling the indices of first occurrence for each species, so I don't see how these can differ.
@liamhuber I find your statements a bit confusing, if we slightly modify the example:
This gives: So the different species (kinds of elements) are just sorted in a list, but there is no connection to the order or position where the elements appear in the original structure. |
@@ -357,9 +357,13 @@ def get_primitive_cell( | |||
new_structure = self._structure.copy() | |||
new_structure.cell = cell | |||
new_structure = new_structure[: len(indices)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The primary issue is this line, because the resulting new structure can have only a subset of the different kinds of elements the original structure had.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, in the case use_elements==False
. In the case use_elements==True
, I presume we do indeed want all the same elements in the original structure. I honestly don't understand what the impact of all this is, I just want to point out that (as far as I can tell) the default behaviour should be to have all the same elements.
I added the corresponding test. |
Ohh, yikes. I guess it just happened to be ordered in the test case I tried. I don't see anything intrinsically wrong with the order changing, but I find this whole corner of the code base super confusing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to see the extra test. Looking into it, I see that the regular/hex tests for finding the primitive cell use structuretoolkit.analyse.get_symmetry(...).get_primitive_cell(...)
/structuretoolkit.analyse.get_primitive_cell(...)
. The latter, new test is in the test_symmetry
file and should use the Symmetry
object directly. One of the main differences is that the access via analyse
doesn't expose the use_elements
argument.
Going down that rabbit hole, I see that none of the tests ever modify use_elements
, so we need some test expansion. I then got rather concerned because on the one hand, use_elements
is an init argument and (private) property of Symmetry
that defaults to True
, but on the other hand Symmetry.get_primitive_cell
takes the argument use_elements
, which the docstring says is a bool, and which defaults to None
(which is falsey) -- Symmetry.get_primitive_cell
then doesn't make any direct reference to Symmetry._use_elements
!
But, Symmetry.get_primitive_cell
does pass its use_elements
argument on to the private method _get_spglib_cell
, and finally there we see Symmetry._use_elements
in the default case of Symmetry.get_primitive_cell(..., use_elements=None)
. Thus, all the tests wind up running with the Symmetry
default of use_elements==True
.
All that is to say, I'm afraid I'm going to stop reviewing here. The raised issue is now tested and resolved, so I have no explicit objections and merging can proceed, but also the code is convoluted enough and the tests sparse enough that I am not confident the solution is robust, so I'm not comfortable being on the hook to repair this with an "approve" if it turns out to be broken in other ways.
I updated the test to use |
No description provided.