read_csv: parameter usecols has wrong type hint #963
Comments
There are 2 different issues here:
It may be that the solution is to change this line: pandas-stubs/pandas-stubs/_typing.pyi Line 529 in 48ca4b0
to use |
sets are not hashable. |
typing.py
class Hashable(Protocol, metaclass=ABCMeta):
    # TODO: This is special, in that a subclass of a hashable class may not be hashable
    # (for example, list vs. object). It's not obvious how to represent this. This class
    # is currently mostly useless for static checking.
    @abstractmethod
    def __hash__(self) -> int: ...
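The runtime side of the `Hashable` point above can be checked with the standard library alone. This is a minimal sketch, not taken from the thread: `is_hashable` is a hypothetical helper name, and the snippet says nothing about how mypy or pyright treat `Hashable` statically.

```python
from collections.abc import Hashable


def is_hashable(value: object) -> bool:
    """Hypothetical helper: report whether hash(value) succeeds at runtime."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


# set and list set __hash__ to None, so the ABC check rejects them.
set_is_hashable_abc = isinstance({1, 2}, Hashable)
list_is_hashable_abc = isinstance([1], Hashable)

# frozenset and tuple define __hash__, so the ABC check accepts them.
frozenset_is_hashable_abc = isinstance(frozenset({1, 2}), Hashable)

# The ABC check only looks at the type: a tuple holding a list passes
# isinstance() but still fails when actually hashed.
nested = ([1],)
nested_passes_abc = isinstance(nested, Hashable)
nested_really_hashable = is_hashable(nested)
```

The last two values differ, which is one reason an `isinstance` check against `Hashable` is only a rough guide.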
from collections.abc import Sequence
from pandas import read_csv
cols1: Sequence[int] = [1]
read_csv("file_csv", usecols=cols1)
mypy: error: No overload variant of "read_csv" matches argument types "str", "Sequence[int]"
That seems to be a Having said that, |
In strict mode, pyright says Type of "read_csv" is partially unknown. |
We only test and support If I change the Not sure what to do when |
It would probably create another problem, as
def cols(x: str) -> bool:
    return x in ["A", "B", "C"]
read_csv("file_csv", usecols=cols)
should be acceptable (and
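For reference, the argument shapes discussed in this thread (a sequence of column names, a sequence of positions, or a callable over column names) can be sketched without pandas. `UsecolsArg` and `select_columns` are hypothetical names used only for illustration. They are not part of pandas or pandas-stubs, and the alias is one possible annotation, not necessarily the one the stubs ended up with.

```python
from collections.abc import Callable, Sequence
from typing import Union

# Hypothetical alias, for illustration only.
UsecolsArg = Union[Sequence[str], Sequence[int], Callable[[str], bool]]


def select_columns(names: Sequence[str], usecols: UsecolsArg) -> list[str]:
    """Hypothetical stand-in for column selection; not the pandas implementation."""
    if callable(usecols):
        # Callable form: keep the names for which the predicate returns True.
        return [name for name in names if usecols(name)]
    wanted = list(usecols)
    # Sequence form: strings match by name, integers match by position.
    return [name for i, name in enumerate(names) if name in wanted or i in wanted]


def cols(x: str) -> bool:
    return x in ["A", "B", "C"]


header = ["A", "B", "D"]
by_callable = select_columns(header, cols)
by_position = select_columns(header, [0, 2])
by_name = select_columns(header, ["D"])
```

Under these assumptions, a predicate such as `cols` and a plain `Sequence[int]` both fit the same parameter, which is the combination the comments above are trying to type.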
Good point. Seems like we need to change it to: |
I think that could be too strict (but might cover typical usage). It seems to work as expected on pyright playground when inlining the TypeAlias (I think the issue is that we don't set the generic type of the TypeAlias when using it in the read_csv definition) |
So something is going on that is different with what is in |
I think it might be sufficient to replace with |
I tried that out locally and it works. So a PR that changes |
I have a similar issue with usecols when passing a list of strings.
Pyright gives the following diagnostic message, but according to the pandas docs usecols can be a list of strings. I guess the type annotation of usecols is not accurate.
|
I cannot reproduce this. We explicitly test for lists of strings with |
@Dr-Irv Thank you for your reply. When I looked at the Pyright configuration docs, it is stated that for On the other hand, when I checked the
- def index(self, value: Any, /, start: int = 0, stop: int = ...) -> int: ...
+ def index(self, value: Any, start: int = ..., stop: int = ..., /) -> int: ...
So the issue seems to be the version I have: I still have the old version. However, what I don't understand is that I am using the latest version (2.2.2) of pandas, and I still don't have the change mentioned above. I downloaded the wheel file and inspected it, and it does indeed contain the old version of the stub file. This is quite surprising. I have checked some of the latest pandas packages from both PyPI and conda, and all of them still contain the old version of
You need to install |
To Reproduce
Expected behavior:
No error for usecols=cols1, but an error for usecols=cols2
Actual behavior:
cols1 is not accepted, even though Sequence[str] is "list like" (at least I think so; the term is nowhere defined) and its elements are "strings that correspond to column names"
cols2 is accepted, even though it is not a callable that can be "evaluated against the column names, returning names where the callable function evaluates to True"
Please complete the following information:
pandas-stubs: 2.2.2.240603