Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix for primitive cells #57

Merged
merged 5 commits into from
Sep 7, 2023
Merged

fix for primitive cells #57

merged 5 commits into from
Sep 7, 2023

Conversation

jan-janssen
Copy link
Member

No description provided.

Comment on lines +360 to +366
indices_dict = {
v: k
for k, v in structuretoolkit.common.helper.get_species_indices_dict(
structure=self._structure
).items()
}
new_structure.symbols = [indices_dict[i] for i in indices]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to run over all indices, and I don't think it will, and as far as I can tell that's because get_species_indices_dict behaves in an undesirable way by overloading keys.

Note:

import structuretoolkit
from ase.build import bulk

structure = bulk("Al", cubic=True)
structure.symbols[0] = "Mg"
structure.symbols[2] = "Cu"
print(structuretoolkit.common.helper.get_species_indices_dict(structure))
>>> {'Mg': 0, 'Al': 1, 'Cu': 2}
# Yikes! atom 3 being Al is totally missed

# Modified from this PR:
indices_dict = {
             v: k
             for k, v in structuretoolkit.common.helper.get_species_indices_dict(
                 structure=structure
             ).items()
         }
print(indices_dict)
>>> {0: 'Mg', 1: 'Al', 2: 'Cu'}
# The third atom doesn't get its species updated

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the feeling you confuse species and symbols. The function you are looking for is:

structuretoolkit.common.helper.get_structure_indices(structure)

Which returns array([0, 1, 2, 1])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not quite. My confusion was that the indices variable is [0, 1, 2, 1] and not the array indices [0, 1, 2, 3] which I had initially assumed. So my concern was unfounded -- there is no mismatch in the span of integers in indices and indices_dict.

You did, however, point me nicely in the direction of the helper module. Is it not possible to replace these seven lines with structuretoolkit.common.helper.set_indices(new_structure, indices)? I can see this is not exactly the same, because in the existing change the get_species_indices_dict call gets self._structure while the structure getting its symbols set is new_structure, and in my proposal it's just new_structure both times. However, new_structure = [: len(indices)] and get_species_indices_dict is pulling the indices of first occurrence for each species, so I don't see how these can differ.

Copy link
Member

@liamhuber liamhuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is in response to #56, in which case I find it uncomfortable that there is no corresponding modification to the tests showing how this change now fixes an error in the previous version.

Comment on lines +360 to +366
indices_dict = {
v: k
for k, v in structuretoolkit.common.helper.get_species_indices_dict(
structure=self._structure
).items()
}
new_structure.symbols = [indices_dict[i] for i in indices]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, not quite. My confusion was that the indices variable is [0, 1, 2, 1] and not the array indices [0, 1, 2, 3] which I had initially assumed. So my concern was unfounded -- there is no mismatch in the span of integers in indices and indices_dict.

You did, however, point me nicely in the direction of the helper module. Is it not possible to replace these seven lines with structuretoolkit.common.helper.set_indices(new_structure, indices)? I can see this is not exactly the same, because in the existing change the get_species_indices_dict call gets self._structure while the structure getting its symbols set is new_structure, and in my proposal it's just new_structure both times. However, new_structure = [: len(indices)] and get_species_indices_dict is pulling the indices of first occurrence for each species, so I don't see how these can differ.

@jan-janssen
Copy link
Member Author

@liamhuber I find your statements a bit confusing, if we slightly modify the example:

structure = bulk("Al", cubic=True)
structure.symbols[0] = "Mg"
structure.symbols[1] = "Mg"
structure.symbols[2] = "Cu"
print(structuretoolkit.common.helper.get_species_indices_dict(structure))

This gives: {'Mg': 0, 'Cu': 1, 'Al': 2}

So the different species (kinds of elements) are just sorted in a list, but there is no connection to the order or position where the elements appear in the original structure.

@@ -357,9 +357,13 @@ def get_primitive_cell(
new_structure = self._structure.copy()
new_structure.cell = cell
new_structure = new_structure[: len(indices)]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary issue is this line, because the resulting new structure can have only a subset of the different kinds of elements the original structure had.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, in the case use_elements==False. In the case use_elements==True, I presume we do indeed want all the same elements in the original structure. I honestly don't understand what the impact of all this is, I just want to point out that (as far as I can tell) the default behaviour should be to have all the same elements.

@jan-janssen
Copy link
Member Author

I guess this is in response to pyiron/pyiron_atomistics#1161, in which case I find it uncomfortable that there is no corresponding modification to the tests showing how this change now fixes an error in the previous version.

I added the corresponding test.

@liamhuber
Copy link
Member

@liamhuber I find your statements a bit confusing, if we slightly modify the example:

structure = bulk("Al", cubic=True)
structure.symbols[0] = "Mg"
structure.symbols[1] = "Mg"
structure.symbols[2] = "Cu"
print(structuretoolkit.common.helper.get_species_indices_dict(structure))

This gives: {'Mg': 0, 'Cu': 1, 'Al': 2}

So the different species (kinds of elements) are just sorted in a list, but there is no connection to the order or position where the elements appear in the original structure.

Ohh, yikes. I guess it just happened to be ordered in the test case I tried.

I don't see anything intrinsically wrong with the order changing, but I find this whole corner of the code base super confusing.

@liamhuber liamhuber self-requested a review September 6, 2023 16:26
Copy link
Member

@liamhuber liamhuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to see the extra test. Looking into it, I see that the regular/hex tests for finding the primitive cell use structuretoolkit.analyse.get_symmetry(...).get_primitive_cell(...)/structuretoolkit.analyse.get_primitive_cell(...). The latter, new test is in the test_symmetry file and should use the Symmetry object directly. One of the main differences is that the access via analyse doesn't expose the use_elements argument.

Going down that rabbit hole, I see that none of the tests ever modify use_elements, so we need some test expansion. I then got rather concerned because on the one hand, use_elements is an init argument and (private) property of Symmetry that defaults to True, but on the other hand Symmetry.get_primitive_cell takes the argument use_elements, which the docstring says is a bool, and which defaults to None (which is falsey) -- Symmetry.get_primitive_cell then doesn't make any direct reference to Symmetry._use_elements!

But, Symmetry.get_primitive_cell does pass its use_elements argument on to the private method _get_spglib_cell, and finally there we see Symmetry._use_elements in the default case of Symmetry.get_primitive_cell(..., use_elements=None). Thus, all the tests wind up running with the Symmetry default of use_elements==True.

All that is to say, I'm afraid I'm going to stop reviewing here. The raised issue is now tested and resolved, so I have no explicit objections and merging can proceed, but also the code is convoluted enough and the tests sparse enough that I am not confident the solution is robust, so I'm not comfortable being on the hook to repair this with an "approve" if it turns out to be broken in other ways.

@jan-janssen
Copy link
Member Author

I updated the test to use structuretoolkit.analyse.get_symmetry(...).get_primitive_cell(...) rather than structuretoolkit.analyse.get_primitive_cell(...) and I agree with the evaluation that the code is currently confusing. Still I am going to move ahead with merging it to fix pyiron/pyiron_atomistics#1161

@jan-janssen jan-janssen merged commit c808b4e into main Sep 7, 2023
10 checks passed
@jan-janssen jan-janssen deleted the primitive branch September 7, 2023 05:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants