Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TBStudio parser #154

Merged
merged 51 commits into from
Nov 20, 2023
Merged

TBStudio parser #154

merged 51 commits into from
Nov 20, 2023

Conversation

JosePizarro3
Copy link
Collaborator

No description provided.

@JosePizarro3 JosePizarro3 added the feature / enhancement New feature or request label Aug 22, 2023
@JosePizarro3 JosePizarro3 linked an issue Aug 22, 2023 that may be closed by this pull request
@mohammadnakhaee
Copy link
Collaborator

mohammadnakhaee commented Aug 25, 2023

Hi @JosePizarro3 I added the test for the parser. Please take a look if we can merge it.

Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, just some more detail work and the parser will be ready.

I've only made an initial review, focusing on the parsing side. Once you have the new changes under control, I will turn to the specific parsing and metainfo definitions (everything that is inside parse_system, parse_method, parse_scc + the metadata in nomad).

Let me know if something is unclear.

electronicparsers/tbstudio/parser.py Outdated Show resolved Hide resolved
electronicparsers/tbstudio/parser.py Outdated Show resolved Hide resolved
electronicparsers/tbstudio/parser.py Outdated Show resolved Hide resolved
electronicparsers/tbstudio/parser.py Outdated Show resolved Hide resolved
electronicparsers/tbstudio/parser.py Outdated Show resolved Hide resolved
electronicparsers/utils/utils.py Outdated Show resolved Hide resolved
electronicparsers/utils/utils.py Outdated Show resolved Hide resolved
electronicparsers/utils/utils.py Outdated Show resolved Hide resolved
electronicparsers/utils/utils.py Outdated Show resolved Hide resolved
electronicparsers/utils/utils.py Outdated Show resolved Hide resolved
@mohammadnakhaee
Copy link
Collaborator

Hi @JosePizarro3 Please take a look on the last changes.

Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

Here you have a new round of comments. Mainly coding recommendations, a few checks, and some stuff that would make your life easier if you need to maintain this parser 🙂

Feel free to ask back. When you go resolving comments, feel free to drop a message like 'Done', or some more extended explanation, and resolve it yourself.

Thanks a lot for the hard work, the parser looks very nice! 👍🏻 And after these changes, I can review the nomad-gitlab side and we could merge soon.

electronicparsers/tbstudio/parser.py Show resolved Hide resolved
electronicparsers/tbstudio/parser.py Show resolved Hide resolved
electronicparsers/tbstudio/parser.py Outdated Show resolved Hide resolved
electronicparsers/tbstudio/parser.py Show resolved Hide resolved
electronicparsers/tbstudio/parser.py Outdated Show resolved Hide resolved
# First Principles Calculation task
if self.archive.workflow2:
first_principles_task = TaskReference(task=first_principles_calculation_archive.workflow2)
first_principles_task.name = 'First-Principles Calculation'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason of naming a task with 'Calculation' might be confusing. Even thought we all know that calculation is what we do, in NOMAD we further define 'calculation' as containing the outputs of a given 'method' applied to a specific 'material'. Here we are talking about the global task (containing system, method, calculation), hence I think it makes more sense to simply call it 'First Principles'

electronicparsers/utils/utils.py Outdated Show resolved Hide resolved
electronicparsers/utils/utils.py Show resolved Hide resolved
tests/test_tbstudioparser.py Outdated Show resolved Hide resolved
tests/test_tbstudioparser.py Outdated Show resolved Hide resolved
@JosePizarro3
Copy link
Collaborator Author

@mohammadnakhaee let me know when you need another quick round of review. I'll like to merge soon the new refactoring of 'Projection' to 'TB', https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/merge_requests/1285.

Maybe, we should decouple both changes: one merge for the renaming, another merge for adding SlaterKoster and TBStudio.

@mohammadnakhaee
Copy link
Collaborator

@JosePizarro3 yes. Could you please take a look. I solved the problems and lint errors related to tbstudio but it is still failed. Maybe you know what is going wrong. The failed tests are not related to the tbstudio.

@JosePizarro3
Copy link
Collaborator Author

@JosePizarro3 yes. Could you please take a look. I solved the problems and lint errors related to tbstudio but it is still failed. Maybe you know what is going wrong. The failed tests are not related to the tbstudio.

Maybe rebasing would solve the issue?

Bear in mind that pytest will anyways fail. I'd say that if after rebasing pylint and mypy do not pass, I'll take a look 🙂

@mohammadnakhaee
Copy link
Collaborator

Yeah let me rebase it. Maybe it helps.

@mohammadnakhaee
Copy link
Collaborator

I rebased it. Now the problem is only these two

electronicparsers/utils/utils.py:30:0: E0611: No name 'TB' in module 'nomad.datamodel.metainfo.simulation.workflow' (no-name-in-module)
electronicparsers/utils/utils.py:30:0: E0611: No name 'TBMethod' in module 'nomad.datamodel.metainfo.simulation.workflow' (no-name-in-module)

which I guess it makes sense. Doesn't it? They are newly added in the gitlab branch which is not merged yet. Is it correct?

@JosePizarro3
Copy link
Collaborator Author

which I guess it makes sense. Doesn't it? They are newly added in the gitlab branch which is not merged yet. Is it correct?

Yes, indeed 🙂

Perfect, I will take a look and give you some final review. Though it looked good at first sight. Do you want me also to review now the gitlab side?

@mohammadnakhaee
Copy link
Collaborator

Yes please see the gitlab again. I will close all the threads since there are extensive changes to the previous version. Would be nice if you comment on this version.

Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, just recalling some suggestions and fixes from my previous review.

Furthermore, the data files are not present (they should be in tests/data/tbstudio).

Once these suggestions are implemented, this will be ready 🙂

"""
sec_run = self.archive.run[-1]

sec_run.program = Program(name='TBStudio', version=self.tbm['ReleaseVersion'])
Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original suggestion:

Error handling:

sec_run.program = Program(name='TBStudio', version=self.tbm.get('ReleaseVersion', ''))

The problem is that, if self.tbm['ReleaseVersion'] is not stored (for whatever reason, like for example a version change), it will return an error. It is safer to do self.tbm.get('ReleaseVersion') or self.tbm.get('ReleaseVersion', '') if you want to be more verbose with the type of the get.

"""Populates run.calculation with the output of the calculation.
"""
sec_run = self.archive.run[-1]
self.archive.run[-1].m_create(Calculation)
Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original suggestion:

You can:

sec_scc = sec_run.m_create(Calculation)

And delete the line here where you do sec_scc = sec_run.calculation[-1] (see next comment).

Also reopening this suggestion for making parse_scc more readable.

return

fermi_level_joules = self.tbm['variables']['ChemP'] * ureg.eV
sec_scc = sec_run.calculation[-1]
Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can delete this when doing what I said in the comment before.

Here according to the suggestion of initializing sec_scc = sec_run.m_create(Calculation) at the beginning directly.

if band_segments_points is None or len(tb_bands) < 1 or len(frac_k_points) < 1:
return

fermi_level_joules = self.tbm['variables']['ChemP'] * ureg.eV
Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error handling (and no need to name 'joules'):

fermi_level = self.tbm['variables'].get('ChemP')
sec_energy = sec_scc.m_create(Energy, Calculation.energy)
sec_energy.fermi = fermi_level * ureg.eV if fermi_level else None

And this error handling too

electronicparsers/tbstudio/parser.py Show resolved Hide resolved
electronicparsers/tbstudio/parser.py Show resolved Hide resolved
self.logger = logger if logger is not None else logging

sec_run = self.archive.m_create(Run)
sec_run.program = Program(name="TBStudio")
Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, you were written the Program information in parse_method. Please, move the program definition here (deleting the corresponding lines in parse_method) doing:

sec_run.program = Program(name='TBStudio', version=self.tbm.get('ReleaseVersion', ''))

This, there are two calls for program which are overwritten each other.

self.parse_method()
self.parse_scc()

self.parse_workflow()
Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to define a specific function here. Simply delete that method and do:

workflow = SinglePoint()
self.archive.workflow2 = workflow

This

pass
if first_principles_calculation_archive and self._child_archives:
tb_workflow_archive = self._child_archives.get('TB_workflow')
self.parse_tb_workflow(tb_archive, first_principles_calculation_archive, tb_workflow_archive)
Copy link
Collaborator Author

@JosePizarro3 JosePizarro3 Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to call tb_archive, but rather pass archive or self.archive

This

electronicparsers/utils/utils.py Show resolved Hide resolved
@JosePizarro3
Copy link
Collaborator Author

@mohammadnakhaee all good, come to my office (I have a final doubt) and we do the final merge together 🙂

@JosePizarro3 JosePizarro3 merged commit 4f54eef into develop Nov 20, 2023
@JosePizarro3 JosePizarro3 deleted the tbs_parser branch November 20, 2023 10:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature / enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TBS parser
2 participants