diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 0c90f8068cf81..eb91c9dbeb651 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -15,7 +15,7 @@ repos:
     hooks:
     -   id: codespell
         types_or: [python, rst, markdown]
-        files: ^pandas/
+        files: ^(pandas|doc)/
         exclude: ^pandas/tests/
 -   repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v3.4.0
diff --git a/doc/source/development/contributing.rst b/doc/source/development/contributing.rst
index f48c4ff5d97af..432584f0da746 100644
--- a/doc/source/development/contributing.rst
+++ b/doc/source/development/contributing.rst
@@ -714,7 +714,7 @@ to run its checks with::
 
 without needing to have done ``pre-commit install`` beforehand.
 
-If you want to run checks on all recently commited files on upstream/master you can use::
+If you want to run checks on all recently committed files on upstream/master you can use::
 
     pre-commit run --from-ref=upstream/master --to-ref=HEAD --all-files
 
diff --git a/doc/source/development/debugging_extensions.rst b/doc/source/development/debugging_extensions.rst
index 358c4036df961..894277d304020 100644
--- a/doc/source/development/debugging_extensions.rst
+++ b/doc/source/development/debugging_extensions.rst
@@ -61,7 +61,7 @@ Now go ahead and execute your script:
 
     run .py
 
-Code execution will halt at the breakpoint defined or at the occurance of any segfault. LLDB's `GDB to LLDB command map `_ provides a listing of debugger command that you can execute using either debugger.
+Code execution will halt at the breakpoint defined or at the occurrence of any segfault. LLDB's `GDB to LLDB command map `_ provides a listing of debugger command that you can execute using either debugger.
 
 Another option to execute the entire test suite under lldb would be to run the following:
 
diff --git a/doc/source/getting_started/comparison/comparison_with_spreadsheets.rst b/doc/source/getting_started/comparison/comparison_with_spreadsheets.rst
index 55f999c099e23..bdd0f7d8cfddf 100644
--- a/doc/source/getting_started/comparison/comparison_with_spreadsheets.rst
+++ b/doc/source/getting_started/comparison/comparison_with_spreadsheets.rst
@@ -461,4 +461,4 @@ pandas' :meth:`~DataFrame.replace` is comparable to Excel's ``Replace All``.
 
 .. ipython:: python
 
-    tips.replace("Thur", "Thu")
+    tips.replace("Thu", "Thursday")
diff --git a/doc/source/getting_started/comparison/comparison_with_sql.rst b/doc/source/getting_started/comparison/comparison_with_sql.rst
index fcfa03a8bce5f..49a21f87382b3 100644
--- a/doc/source/getting_started/comparison/comparison_with_sql.rst
+++ b/doc/source/getting_started/comparison/comparison_with_sql.rst
@@ -193,7 +193,7 @@ to your grouped DataFrame, indicating which functions to apply to specific colum
     Fri   2.734737   19
     Sat   2.993103   87
     Sun   3.255132   76
-    Thur  2.771452   62
+    Thu   2.771452   62
     */
 
 .. ipython:: python
@@ -213,11 +213,11 @@ Grouping by more than one column is done by passing a list of columns to the
     No     Fri       4  2.812500
            Sat      45  3.102889
            Sun      57  3.167895
-           Thur     45  2.673778
+           Thu      45  2.673778
     Yes    Fri      15  2.714000
            Sat      42  2.875476
            Sun      19  3.516842
-           Thur     17  3.030000
+           Thu      17  3.030000
     */
 
 .. ipython:: python
diff --git a/doc/source/user_guide/dsintro.rst b/doc/source/user_guide/dsintro.rst
index 7d65e0c6faff7..efcf1a8703d2b 100644
--- a/doc/source/user_guide/dsintro.rst
+++ b/doc/source/user_guide/dsintro.rst
@@ -869,5 +869,5 @@ completion mechanism so they can be tab-completed:
 
 .. code-block:: ipython
 
-    In [5]: df.fo  # noqa: E225, E999
+    In [5]: df.foo  # noqa: E225, E999
     df.foo1  df.foo2
diff --git a/doc/source/user_guide/io.rst b/doc/source/user_guide/io.rst
index 7e113c93baabe..cf153ddd2cbbd 100644
--- a/doc/source/user_guide/io.rst
+++ b/doc/source/user_guide/io.rst
@@ -2189,7 +2189,7 @@ into a flat table.
 
     data = [
         {"id": 1, "name": {"first": "Coleen", "last": "Volk"}},
-        {"name": {"given": "Mose", "family": "Regner"}},
+        {"name": {"given": "Mark", "family": "Regner"}},
         {"id": 2, "name": "Faye Raker"},
     ]
     pd.json_normalize(data)
@@ -2995,7 +2995,7 @@ For example, below XML contains a namespace with prefix, ``doc``, and URI at
 
 Similarly, an XML document can have a default namespace without prefix. Failing
 to assign a temporary prefix will return no nodes and raise a ``ValueError``.
-But assiging *any* temporary name to correct URI allows parsing by nodes.
+But assigning *any* temporary name to correct URI allows parsing by nodes.
 
 .. ipython:: python
 
diff --git a/doc/source/user_guide/window.rst b/doc/source/user_guide/window.rst
index 9db4a4bb873bd..ad710b865acea 100644
--- a/doc/source/user_guide/window.rst
+++ b/doc/source/user_guide/window.rst
@@ -79,7 +79,7 @@ which will first group the data by the specified keys and then perform a windowi
 .. versionadded:: 1.3
 
 Some windowing operations also support the ``method='table'`` option in the constructor which
-performs the windowing operaion over an entire :class:`DataFrame` instead of a single column or row at a time.
+performs the windowing operation over an entire :class:`DataFrame` instead of a single column or row at a time.
 This can provide a useful performance benefit for a :class:`DataFrame` with many columns or rows
 (with the corresponding ``axis`` argument) or the ability to utilize other columns during the windowing
 operation. The ``method='table'`` option can only be used if ``engine='numba'`` is specified
diff --git a/doc/source/whatsnew/v0.20.0.rst b/doc/source/whatsnew/v0.20.0.rst
index ad8a23882e1e8..733995cc718dd 100644
--- a/doc/source/whatsnew/v0.20.0.rst
+++ b/doc/source/whatsnew/v0.20.0.rst
@@ -1326,7 +1326,7 @@ Deprecations
 Deprecate ``.ix``
 ^^^^^^^^^^^^^^^^^
 
-The ``.ix`` indexer is deprecated, in favor of the more strict ``.iloc`` and ``.loc`` indexers. ``.ix`` offers a lot of magic on the inference of what the user wants to do. To wit, ``.ix`` can decide to index *positionally* OR via *labels*, depending on the data type of the index. This has caused quite a bit of user confusion over the years. The full indexing documentation is :ref:`here `. (:issue:`14218`)
+The ``.ix`` indexer is deprecated, in favor of the more strict ``.iloc`` and ``.loc`` indexers. ``.ix`` offers a lot of magic on the inference of what the user wants to do. More specifically, ``.ix`` can decide to index *positionally* OR via *labels*, depending on the data type of the index. This has caused quite a bit of user confusion over the years. The full indexing documentation is :ref:`here `. (:issue:`14218`)
 
 The recommended methods of indexing are:
 
diff --git a/doc/source/whatsnew/v1.2.1.rst b/doc/source/whatsnew/v1.2.1.rst
index 8bfe233ae50cc..bfe30d52e2aff 100644
--- a/doc/source/whatsnew/v1.2.1.rst
+++ b/doc/source/whatsnew/v1.2.1.rst
@@ -18,7 +18,7 @@ Fixed regressions
 - Fixed regression in :meth:`~DataFrame.to_csv` opening ``codecs.StreamReaderWriter`` in binary mode instead of in text mode (:issue:`39247`)
 - Fixed regression in :meth:`read_csv` and other read functions were the encoding error policy (``errors``) did not default to ``"replace"`` when no encoding was specified (:issue:`38989`)
 - Fixed regression in :func:`read_excel` with non-rawbyte file handles (:issue:`38788`)
-- Fixed regression in :meth:`DataFrame.to_stata` not removing the created file when an error occured (:issue:`39202`)
+- Fixed regression in :meth:`DataFrame.to_stata` not removing the created file when an error occurred (:issue:`39202`)
 - Fixed regression in ``DataFrame.__setitem__`` raising ``ValueError`` when expanding :class:`DataFrame` and new column is from type ``"0 - name"`` (:issue:`39010`)
 - Fixed regression in setting with :meth:`DataFrame.loc` raising ``ValueError`` when :class:`DataFrame` has unsorted :class:`MultiIndex` columns and indexer is a scalar (:issue:`38601`)
 - Fixed regression in setting with :meth:`DataFrame.loc` raising ``KeyError`` with :class:`MultiIndex` and list-like columns indexer enlarging :class:`DataFrame` (:issue:`39147`)
@@ -135,7 +135,7 @@ Other
 - Bumped minimum fastparquet version to 0.4.0 to avoid ``AttributeError`` from numba (:issue:`38344`)
 - Bumped minimum pymysql version to 0.8.1 to avoid test failures (:issue:`38344`)
 - Fixed build failure on MacOS 11 in Python 3.9.1 (:issue:`38766`)
-- Added reference to backwards incompatible ``check_freq`` arg of :func:`testing.assert_frame_equal` and :func:`testing.assert_series_equal` in :ref:`pandas 1.1.0 whats new ` (:issue:`34050`)
+- Added reference to backwards incompatible ``check_freq`` arg of :func:`testing.assert_frame_equal` and :func:`testing.assert_series_equal` in :ref:`pandas 1.1.0 what's new ` (:issue:`34050`)
 
 .. ---------------------------------------------------------------------------
 
diff --git a/doc/source/whatsnew/v1.3.0.rst b/doc/source/whatsnew/v1.3.0.rst
index fb8eecdaa275e..a06ec433d2d84 100644
--- a/doc/source/whatsnew/v1.3.0.rst
+++ b/doc/source/whatsnew/v1.3.0.rst
@@ -382,7 +382,7 @@ Performance improvements
 - Performance improvement in :meth:`core.window.rolling.Rolling.corr` and :meth:`core.window.rolling.Rolling.cov` (:issue:`39388`)
 - Performance improvement in :meth:`core.window.rolling.RollingGroupby.corr`, :meth:`core.window.expanding.ExpandingGroupby.corr`, :meth:`core.window.expanding.ExpandingGroupby.corr` and :meth:`core.window.expanding.ExpandingGroupby.cov` (:issue:`39591`)
 - Performance improvement in :func:`unique` for object data type (:issue:`37615`)
-- Performance improvement in :func:`pd.json_normalize` for basic cases (including seperators) (:issue:`40035` :issue:`15621`)
+- Performance improvement in :func:`pd.json_normalize` for basic cases (including separators) (:issue:`40035` :issue:`15621`)
 - Performance improvement in :class:`core.window.rolling.ExpandingGroupby` aggregation methods (:issue:`39664`)
 - Performance improvement in :class:`Styler` where render times are more than 50% reduced (:issue:`39972` :issue:`39952`)
 - Performance improvement in :meth:`core.window.ewm.ExponentialMovingWindow.mean` with ``times`` (:issue:`39784`)
@@ -464,7 +464,7 @@ Interval
 - Bug in :meth:`IntervalIndex.intersection` and :meth:`IntervalIndex.symmetric_difference` always returning object-dtype when operating with :class:`CategoricalIndex` (:issue:`38653`, :issue:`38741`)
 - Bug in :meth:`IntervalIndex.intersection` returning duplicates when at least one of both Indexes has duplicates which are present in the other (:issue:`38743`)
 - :meth:`IntervalIndex.union`, :meth:`IntervalIndex.intersection`, :meth:`IntervalIndex.difference`, and :meth:`IntervalIndex.symmetric_difference` now cast to the appropriate dtype instead of raising ``TypeError`` when operating with another :class:`IntervalIndex` with incompatible dtype (:issue:`39267`)
-- :meth:`PeriodIndex.union`, :meth:`PeriodIndex.intersection`, :meth:`PeriodIndex.symmetric_difference`, :meth:`PeriodIndex.difference` now cast to object dtype instead of raising ``IncompatibleFrequency`` when opearting with another :class:`PeriodIndex` with incompatible dtype (:issue:`??`)
+- :meth:`PeriodIndex.union`, :meth:`PeriodIndex.intersection`, :meth:`PeriodIndex.symmetric_difference`, :meth:`PeriodIndex.difference` now cast to object dtype instead of raising ``IncompatibleFrequency`` when operating with another :class:`PeriodIndex` with incompatible dtype (:issue:`??`)
 
 Indexing
 ^^^^^^^^
@@ -525,7 +525,7 @@ I/O
 - Bug in :func:`to_hdf` raising ``KeyError`` when trying to apply for subclasses of ``DataFrame`` or ``Series`` (:issue:`33748`)
 - Bug in :meth:`~HDFStore.put` raising a wrong ``TypeError`` when saving a DataFrame with non-string dtype (:issue:`34274`)
 - Bug in :func:`json_normalize` resulting in the first element of a generator object not being included in the returned ``DataFrame`` (:issue:`35923`)
-- Bug in :func:`read_csv` apllying thousands separator to date columns when column should be parsed for dates and ``usecols`` is specified for ``engine="python"`` (:issue:`39365`)
+- Bug in :func:`read_csv` applying thousands separator to date columns when column should be parsed for dates and ``usecols`` is specified for ``engine="python"`` (:issue:`39365`)
 - Bug in :func:`read_excel` forward filling :class:`MultiIndex` names with multiple header and index columns specified (:issue:`34673`)
 - :func:`read_excel` now respects :func:`set_option` (:issue:`34252`)
 - Bug in :func:`read_csv` not switching ``true_values`` and ``false_values`` for nullable ``boolean`` dtype (:issue:`34655`)
@@ -603,7 +603,7 @@ ExtensionArray
 
 Other
 ^^^^^
-- Bug in :class:`Index` constructor sometimes silently ignorning a specified ``dtype`` (:issue:`38879`)
+- Bug in :class:`Index` constructor sometimes silently ignoring a specified ``dtype`` (:issue:`38879`)
 - Bug in :func:`pandas.api.types.infer_dtype` not recognizing Series, Index or array with a period dtype (:issue:`23553`)
 - Bug in :func:`pandas.api.types.infer_dtype` raising an error for general :class:`.ExtensionArray` objects. It will now return ``"unknown-array"`` instead of raising (:issue:`37367`)
 - Bug in constructing a :class:`Series` from a list and a :class:`PandasDtype` (:issue:`39357`)
diff --git a/setup.cfg b/setup.cfg
index ca0673bd5fc34..fdc0fbdbd6b57 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -121,7 +121,7 @@ filterwarnings =
 junit_family = xunit2
 
 [codespell]
-ignore-words-list = ba,blocs,coo,hist,nd,ser
+ignore-words-list = ba,blocs,coo,hist,nd,sav,ser
 ignore-regex = https://(\w+\.)+
 
 [coverage:run]