Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Incar keys case insensitive, fix init Incar from dict val processing for str/float/int #4122

Merged
merged 37 commits into from
Oct 21, 2024
Merged
Show file tree
Hide file tree
Changes from 31 commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
b9b3804
move ENCUT from int to float list
DanielYang59 Oct 18, 2024
a5ba6ca
remove str case sensitivity in check_params
DanielYang59 Oct 18, 2024
60e1a48
fix return type annotation
DanielYang59 Oct 18, 2024
945368b
fix unit test failure
DanielYang59 Oct 18, 2024
f0ebb34
use original key for warning msg
DanielYang59 Oct 18, 2024
37ac9f5
add duplicate check in check_params
DanielYang59 Oct 18, 2024
13b5a67
add test first, issue not fixed yet
DanielYang59 Oct 18, 2024
adcdba7
init from dict also use setter method and fix val filter logic
DanielYang59 Oct 18, 2024
bae2320
relocate test_from_file_and_from_dict
DanielYang59 Oct 18, 2024
44a06c3
Revert "init from dict also use setter method and fix val filter logic"
DanielYang59 Oct 18, 2024
6799044
remove seemingly unused monkeypatch
DanielYang59 Oct 18, 2024
966bb91
tweak type
DanielYang59 Oct 18, 2024
532cad0
make module level var all cap
DanielYang59 Oct 18, 2024
a24ac8e
casting to list doesn't seem necessary, remain iterator for lazy eval
DanielYang59 Oct 18, 2024
92bfc7e
add docstring to clarify parse list
DanielYang59 Oct 18, 2024
f1fb45b
Merge branch 'master' into fix-4119-incar-tag-check
shyuep Oct 18, 2024
fdc40b8
Merge branch 'master' into fix-4119-incar-tag-check
DanielYang59 Oct 18, 2024
552c006
Merge branch 'fix-4119-incar-tag-check' of https://github.com/DanielY…
DanielYang59 Oct 18, 2024
536eae5
remove duplicate check
DanielYang59 Oct 19, 2024
ba0f349
reduce indentation level
DanielYang59 Oct 19, 2024
198ce83
remove docstring of warn that doesn't exist
DanielYang59 Oct 19, 2024
a2da974
fix typo in incar tag ECUT -> ENCUT
DanielYang59 Oct 19, 2024
435320d
inherit from UserDict, and make more ops case insensitive
DanielYang59 Oct 19, 2024
51f5922
also override del and in methdos
DanielYang59 Oct 19, 2024
3b231be
issue warning for duplicate keys
DanielYang59 Oct 19, 2024
dbaaa98
enhance warning check
DanielYang59 Oct 19, 2024
f8b223c
tweak docstring
DanielYang59 Oct 19, 2024
f30e740
fix type of float/int casting
DanielYang59 Oct 19, 2024
259124d
enhance test for from_dict consistency check
DanielYang59 Oct 19, 2024
96ad945
relocate duplicate check to setter so that both from str and dict wou…
DanielYang59 Oct 19, 2024
fa58d0d
fix index error for vasprun
DanielYang59 Oct 19, 2024
8d5768e
move duplicate warning to init otherwise get false pos when update
DanielYang59 Oct 19, 2024
c350714
enhance unit test from type cast from dict
DanielYang59 Oct 19, 2024
48d2755
remove unnecessary get default
DanielYang59 Oct 19, 2024
b2e9a63
remove unnecessary type cast in check_params
DanielYang59 Oct 19, 2024
9dc98f9
tweak Incar docstring
DanielYang59 Oct 19, 2024
837a011
Merge branch 'master' into fix-4119-incar-tag-check
DanielYang59 Oct 21, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
109 changes: 74 additions & 35 deletions src/pymatgen/io/vasp/inputs.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
import re
import subprocess
import warnings
from collections import UserDict
from enum import Enum, unique
from glob import glob
from hashlib import sha256
Expand Down Expand Up @@ -709,43 +710,64 @@ class BadPoscarWarning(UserWarning):
"""Warning class for bad POSCAR entries."""


class Incar(dict, MSONable):
class Incar(UserDict, MSONable):
Copy link
Contributor Author

@DanielYang59 DanielYang59 Oct 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subclassing UserDict would make our life much easier, otherwise some methods like update/setdefault would not call the overridden setter method, and I believe that's the recommended way to create a custom dict:

UserDict: wrapper around dictionary objects for easier dict subclassing

Similar to our previous discussion on Lobsterin which is also a case-insensitive dict:

class Lobsterin(UserDict, MSONable):


Perhaps we should consider writing a reusable key case-insensitive dict at some point instead of writing this over and over again.

Or use (now that we do have requests as a dependency):

from requests.structures import CaseInsensitiveDict

"""
Read and write INCAR files.
Essentially a dictionary with some helper functions.
Essentially a dictionary with helper functions, where keys are
case-insensitive by being stored in uppercase. This ensures
case-insensitive access for set, get, del, update, and setdefault operations.
"""

def __init__(self, params: dict[str, Any] | None = None) -> None:
"""
Create an Incar object.
Clean up params and create an Incar object.

Args:
params (dict): Input parameters as a dictionary.
params (dict): INCAR parameters as a dictionary.
"""
super().__init__()
if params is not None:
# If INCAR contains vector-like MAGMOMS given as a list
# of floats, convert to a list of lists
if (params.get("MAGMOM") and isinstance(params["MAGMOM"][0], int | float)) and (
params.get("LSORBIT") or params.get("LNONCOLLINEAR")
):
val = []
for idx in range(len(params["MAGMOM"]) // 3):
val.append(params["MAGMOM"][idx * 3 : (idx + 1) * 3])
params["MAGMOM"] = val

self.update(params)
params = params or {}

# If INCAR contains vector-like MAGMOMS given as a list
# of floats, convert to a list of lists
if (params.get("MAGMOM") and isinstance(params["MAGMOM"][0], int | float)) and (
params.get("LSORBIT") or params.get("LNONCOLLINEAR")
):
val: list[list] = []
for idx in range(len(params["MAGMOM"]) // 3):
val.append(params["MAGMOM"][idx * 3 : (idx + 1) * 3])
params["MAGMOM"] = val

super().__init__(params)

def __setitem__(self, key: str, val: Any) -> None:
"""
Add parameter-val pair to Incar. Warn if parameter is not in list of
valid INCAR tags. Also clean the parameter and val by stripping
leading and trailing white spaces.
Add parameter-val pair to Incar.
- Clean the parameter and val by stripping leading
and trailing white spaces.
- Cast keys to upper case.

Warnings:
BadIncarWarning: If there are duplicate in keys (case insensitive).
"""
super().__setitem__(
key.strip().upper(),
type(self).proc_val(key.strip(), val.strip()) if isinstance(val, str) else val,
)
# Check for case-insensitive duplicate keys
if (key := key.strip().upper()) in self:
warnings.warn(f"Duplicate key found (case-insensitive): {key}", BadIncarWarning, stacklevel=2)

# Cast float/int to str such that proc_val would clean up their types
val = self.proc_val(key, str(val)) if isinstance(val, str | float | int) else val
Copy link
Contributor Author

@DanielYang59 DanielYang59 Oct 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change:

- val = self.proc_val(key, str(val)) if isinstance(val, str) else val
+ val = self.proc_val(key, str(val)) if isinstance(val, str | float | int) else val

should have a healthy dose of stupidity as I believe casting int/float to str is necessary such that the val would go through proc_val to sanitize its type, otherwise it would bypass the proc_val check, leading to inconsistent behaviour when init from dict vs from str/file.

super().__setitem__(key, val)

def __getitem__(self, key: str) -> Any:
"""
Get value using a case-insensitive key.
"""
return super().__getitem__(key.strip().upper())

def __delitem__(self, key: str) -> None:
super().__delitem__(key.strip().upper())

def __contains__(self, key: str) -> bool:
return super().__contains__(key.upper().strip())

def __str__(self) -> str:
return self.get_str(sort_keys=True, pretty=False)
Expand All @@ -762,6 +784,12 @@ def __add__(self, other: Self) -> Self:
params[key] = val
return type(self)(params)

def get(self, key: str, default: Any = None) -> Any:
"""
Get a value for a case-insensitive key, return default if not found.
"""
return super().get(key.strip().upper(), default)

def as_dict(self) -> dict:
"""MSONable dict."""
dct = dict(self)
Expand Down Expand Up @@ -854,24 +882,23 @@ def from_str(cls, string: str) -> Self:
Returns:
Incar object
"""
lines: list[str] = list(clean_lines(string.splitlines()))
params: dict[str, Any] = {}
for line in lines:
for line in clean_lines(string.splitlines()):
for sline in line.split(";"):
if match := re.match(r"(\w+)\s*=\s*(.*)", sline.strip()):
key: str = match[1].strip()
val: Any = match[2].strip()
val: str = match[2].strip()
params[key] = cls.proc_val(key, val)
return cls(params)

@staticmethod
def proc_val(key: str, val: Any) -> list | bool | float | int | str:
def proc_val(key: str, val: str) -> list | bool | float | int | str:
"""Helper method to convert INCAR parameters to proper types
like ints, floats, lists, etc.

Args:
key (str): INCAR parameter key
val (Any): Value of INCAR parameter.
key (str): INCAR parameter key.
val (str): Value of INCAR parameter.
"""
list_keys = (
"LDAUU",
Expand Down Expand Up @@ -906,6 +933,7 @@ def proc_val(key: str, val: Any) -> list | bool | float | int | str:
"AGGAC",
"PARAM1",
"PARAM2",
"ENCUT",
)
int_keys = (
"NSW",
Expand All @@ -921,7 +949,6 @@ def proc_val(key: str, val: Any) -> list | bool | float | int | str:
"NPAR",
"LDAUPRINT",
"LMAXMIX",
"ENCUT",
"NSIM",
"NKRED",
"NUPDOWN",
Expand All @@ -931,7 +958,7 @@ def proc_val(key: str, val: Any) -> list | bool | float | int | str:
)
lower_str_keys = ("ML_MODE",)

def smart_int_or_float(num_str: str) -> str | float:
def smart_int_or_float(num_str: str) -> float:
"""Determine whether a string represents an integer or a float."""
if "." in num_str or "e" in num_str.lower():
return float(num_str)
Expand Down Expand Up @@ -1032,7 +1059,7 @@ def check_params(self) -> None:
warnings.warn(f"Cannot find {tag} in the list of INCAR tags", BadIncarWarning, stacklevel=2)
continue

# Check value and its type
# Check value type
param_type: str = incar_params[tag].get("type")
allowed_values: list[Any] = incar_params[tag].get("values")

Expand All @@ -1041,8 +1068,19 @@ def check_params(self) -> None:

# Only check value when it's not None,
# meaning there is recording for corresponding value
if allowed_values is not None and val not in allowed_values:
warnings.warn(f"{tag}: Cannot find {val} in the list of values", BadIncarWarning, stacklevel=2)
if allowed_values is not None:
original_val = val

# Remove case sensitivitiy for string keywords
# Note: param_type could be a Union type, e.g. "str | bool"
if "str" in param_type:
val = val.lower() if isinstance(val, str) else val
allowed_values = [item.lower() if isinstance(item, str) else item for item in allowed_values]

if val not in allowed_values:
warnings.warn(
f"{tag}: Cannot find {original_val} in the list of values", BadIncarWarning, stacklevel=2
)


class BadIncarWarning(UserWarning):
Expand Down Expand Up @@ -1712,6 +1750,7 @@ def _parse_int(string: str) -> int:


def _parse_list(string: str) -> list[float]:
"""Parse a list of floats from a string."""
return [float(y) for y in re.split(r"\s+", string.strip()) if not y.isalpha()]


Expand Down
15 changes: 8 additions & 7 deletions src/pymatgen/io/vasp/outputs.py
Original file line number Diff line number Diff line change
Expand Up @@ -140,7 +140,7 @@ def _parse_from_incar(filename: PathLike, key: str) -> Any:
dirname = os.path.dirname(filename)
for fn in os.listdir(dirname):
if re.search("INCAR", fn):
warnings.warn(f"INCAR found. Using {key} from INCAR.")
warnings.warn(f"INCAR found. Using {key} from INCAR.", stacklevel=2)
incar = Incar.from_file(os.path.join(dirname, fn))
return incar.get(key, None)
return None
Expand Down Expand Up @@ -345,7 +345,7 @@ def __init__(
self.update_potcar_spec(parse_potcar_file)
self.update_charge_from_potcar(parse_potcar_file)

if self.incar.get("ALGO") not in {"CHI", "BSE"} and not self.converged and self.parameters.get("IBRION") != 0:
if self.incar.get("ALGO") not in {"Chi", "Bse"} and not self.converged and self.parameters.get("IBRION") != 0:
Copy link
Contributor Author

@DanielYang59 DanielYang59 Oct 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before migrating to subclassing UserDict, the setter method is not triggered through update method when init from a dict, therefore these values remain as is (BSE) when it should be capitalized (Bse).

def __init__(self, params: dict[str, Any] | None = None) -> None:
"""
Create an Incar object.
Args:
params (dict): Input parameters as a dictionary.
"""
super().__init__()
if params is not None:
# If INCAR contains vector-like MAGMOMS given as a list
# of floats, convert to a list of lists
if (params.get("MAGMOM") and isinstance(params["MAGMOM"][0], int | float)) and (
params.get("LSORBIT") or params.get("LNONCOLLINEAR")
):
val = []
for idx in range(len(params["MAGMOM"]) // 3):
val.append(params["MAGMOM"][idx * 3 : (idx + 1) * 3])
params["MAGMOM"] = val
self.update(params)

msg = f"{filename} is an unconverged VASP run.\n"
msg += f"Electronic convergence reached: {self.converged_electronic}.\n"
msg += f"Ionic convergence reached: {self.converged_ionic}."
Expand All @@ -364,10 +364,10 @@ def _parse(
self.projected_magnetisation: NDArray | None = None
self.dielectric_data: dict[str, tuple] = {}
self.other_dielectric: dict[str, tuple] = {}
self.incar: dict[str, Any] = {}
self.incar: Incar = {}
self.kpoints_opt_props: KpointOptProps | None = None

ionic_steps: list = []
ionic_steps: list[dict] = []

md_data: list[dict] = []
parsed_header: bool = False
Expand Down Expand Up @@ -1355,9 +1355,9 @@ def as_dict(self) -> dict:
dct["output"] = vout
return jsanitize(dct, strict=True)

def _parse_params(self, elem: XML_Element) -> dict:
"""Parse INCAR parameters."""
params: dict = {}
def _parse_params(self, elem: XML_Element) -> Incar[str, Any]:
"""Parse INCAR parameters and more."""
params: dict[str, Any] = {}
for c in elem:
# VASP 6.4.3 can add trailing whitespace
# for example, <i type="string" name="GGA ">PE</i>
Expand All @@ -1369,6 +1369,7 @@ def _parse_params(self, elem: XML_Element) -> dict:
# which overrides the values in the root params.
p = {k: v for k, v in p.items() if k not in params}
params |= p

else:
ptype = c.attrib.get("type", "")
val = c.text.strip() if c.text else ""
Expand Down
36 changes: 18 additions & 18 deletions src/pymatgen/util/io_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
from monty.io import zopen

if TYPE_CHECKING:
from collections.abc import Generator
from collections.abc import Iterator

__author__ = "Shyue Ping Ong, Rickard Armiento, Anubhav Jain, G Matteo, Ioannis Petousis"
__copyright__ = "Copyright 2011, The Materials Project"
Expand All @@ -21,31 +21,31 @@


def clean_lines(
string_list,
remove_empty_lines=True,
rstrip_only=False,
) -> Generator[str, None, None]:
string_list: list[str],
remove_empty_lines: bool = True,
rstrip_only: bool = False,
) -> Iterator[str]:
"""Strips whitespace, carriage returns and empty lines from a list of strings.

Args:
string_list: List of strings
remove_empty_lines: Set to True to skip lines which are empty after
string_list (list[str]): List of strings.
remove_empty_lines (bool): Set to True to skip lines which are empty after
stripping.
rstrip_only: Set to True to strip trailing whitespaces only (i.e.,
rstrip_only (bool): Set to True to strip trailing whitespaces only (i.e.,
to retain leading whitespaces). Defaults to False.

Yields:
list: clean strings with no whitespaces. If rstrip_only == True,
clean strings with no trailing whitespaces.
str: clean strings with no whitespaces.
"""
for s in string_list:
clean_s = s
if "#" in s:
ind = s.index("#")
clean_s = s[:ind]
clean_s = clean_s.rstrip() if rstrip_only else clean_s.strip()
if (not remove_empty_lines) or clean_s != "":
yield clean_s
for string in string_list:
clean_string = string
if "#" in string:
clean_string = string[: string.index("#")]

clean_string = clean_string.rstrip() if rstrip_only else clean_string.strip()

if (not remove_empty_lines) or clean_string != "":
yield clean_string


def micro_pyawk(filename, search, results=None, debug=None, postdebug=None):
Expand Down
Loading
Loading