-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make mypy more strict for prototype datasets #4513
Conversation
; untyped definitions and calls | ||
disallow_untyped_defs = True | ||
|
||
; None and Optional handling | ||
no_implicit_optional = True | ||
|
||
; warnings | ||
warn_unused_ignores = True | ||
warn_return_any = True | ||
warn_unreachable = True | ||
|
||
; miscellaneous strictness flags | ||
allow_redefinition = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these the default values for those options?
If they're not the default, do we have a strong reason to use them instead of the defaults? Is this going to be clearly beneficial to the code-base and to us as developers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these the default values for those options?
Nope.
If they're not the default, do we have a strong reason to use them instead of the defaults? Is this going to be clearly beneficial to the code-base and to us as developers?
Let's go through them one by one:
-
disallow_untyped_defs
: by defaultmypy
simply accepts untyped functions and usesAny
for the input and output annotations. If our ultimate goal is to declaretorchvision
typed, we should make sure that we don't miss some functions. This flag enforces that. -
no_implicit_optional
: By defaultmypy
allows this:def foo(bar: int = None) -> int: pass
With this option enabled, it has to be
def foo(bar: Optional[int] = None) -> int: pass
Given that
None
is a valid input, we should also explicitly mention it in the annotation. -
warn_unused_ignores
: Sometimes we use# type: ignore
directives on stuff that is actually wrong in other libraries. For example fix annotation for Demultiplexer pytorch#65998 will make some ignore directives obsolete that are needed now. Without this flag, we would never know. -
warn_return_any
: If a function does something with dynamic types,mypy
usually falls back to treating the output asAny
. This will warn us if something like this happened, but we specified a more concrete output type. -
warn_unreachable
: This is more a test functionality, asmypy
will now warn us if some code is unreachable. For example, with this flag set,mypy
will warn that theif
branch is unreachable.def foo(bar: str) -> str: if isinstance(bar, int): bar = str(bar) return bar
-
allow_redefinition
: See Set allow_redefinition = True for mypy #4531. If we have this globally, we can of course remove it here.
Apart from warn_return_any
and warn_unreachable
I think these flags are clearly beneficial. For the other two, they were beneficial for me in the past, but I can others object to them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pmeier !
Just some questions for my own understanding, but LGTM
for line in lines[1:-1] | ||
] | ||
return tuple(zip(*sorted(categories_and_labels, key=lambda category_and_label: int(category_and_label[1]))))[0] | ||
categories_and_labels = cast( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wondering why we need to cast anything here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pattern.match(line).groups()
returns a Tuple[Optional[str], ...]
. So we need to cast to tell it that this will be a tuple of length 2 and every group was actually matched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to cast because of the Optional
bit or because of the exact length of the tuple? Or both?
Would List[Tuple[str, ...]],
be enough?
Also can we remove the # type: ignore[union-attr]
below now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to cast because of the
Optional
bit or because of the exact length of the tuple? Or both?
WouldList[Tuple[str, ...]]
, be enough?
List[Tuple[str, ...]]
seems to work out. I assumed I needed a two element tuple due to the assignment in L177.
Also can we remove the
# type: ignore[union-attr]
below now?
Nope. re.match
returns Optional[Match]
and since we don't check for match is None
because we are sure that we will always match, mypy
complains that None
has no attribute groups
.
@@ -12,4 +13,4 @@ def raw(buffer: io.IOBase) -> torch.Tensor: | |||
|
|||
|
|||
def pil(buffer: io.IOBase, mode: str = "RGB") -> torch.Tensor: | |||
return pil_to_tensor(PIL.Image.open(buffer).convert(mode.upper())) | |||
return cast(torch.Tensor, pil_to_tensor(PIL.Image.open(buffer).convert(mode.upper()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to call cast
because pil_to_tensor
is not typed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. For untyped functions mypy
assumes Any
and then complains because we return the more specific torch.Tensor
here. I've added a warn_redundant_casts = True
option that will emit a warning that this cast can be removed as soon as pil_to_tensor
is typed.
@@ -8,7 +8,7 @@ | |||
|
|||
|
|||
# FIXME | |||
def compute_sha256(_) -> str: | |||
def compute_sha256(path: pathlib.Path) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol I'm afraid to ask
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file needs heavy refactoring as soon as the torchdata
download API is stable-ish. Adding the type was just faster than adding an ignore.
Summary: * make mypy more strict for prototype datasets * fix code format * apply strictness only to datasets * fix more mypy issues * cleanup * fix mnist annotations * refactor celeba * warn on redundant casts * remove redundant cast * simplify annotation * fix import Reviewed By: NicolasHug Differential Revision: D31916328 fbshipit-source-id: 55eac940a3ed5bc3197debeb8b7bdb20ea543578
* make mypy more strict for prototype datasets * fix code format * apply strictness only to datasets * fix more mypy issues * cleanup * fix mnist annotations * refactor celeba * warn on redundant casts * remove redundant cast * simplify annotation * fix import
Instead of retrofitting more strictness for
mypy
later, we can start off with strict settings.cc @pmeier @mthrok @bjuncek