BUG: .to_period with MonthBegin offset does not work with a Timestamp, but works with pd.DatetimeIndex #60671

Open
2 of 3 tasks
felipeangelimvieira opened this issue Jan 7, 2025 · 6 comments
Labels
Bug; Needs Triage (issue that has not been reviewed by a pandas team member); Period (Period data type); Timestamp (pd.Timestamp and associated methods)

felipeangelimvieira commented Jan 7, 2025

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd


dateiindex = pd.date_range("2020-01-01", periods=1)
dateiindex.to_period(pd.offsets.MonthBegin())
dateiindex[0].to_period(pd.offsets.MonthBegin()) # error here

Issue Description

Thank you very much for the library. I believe this is a bug and not expected behaviour, since it is somewhat counterintuitive.

Calling to_period with MonthBegin raises an error, but only when the object is a Timestamp, not a DatetimeIndex:

ValueError: <MonthBegin> is not supported as period frequency

Expected Behavior

(edited)

I was expecting to obtain a Period object. It does work with MonthEnd, so this may be related to, or the same issue as, #58974.

Installed Versions

INSTALLED VERSIONS

commit : 0691c5c
python : 3.11.11
python-bits : 64
OS : Darwin
OS-release : 24.1.0
Version : Darwin Kernel Version 24.1.0: Thu Oct 10 21:06:57 PDT 2024; root:xnu-11215.41.3~3/RELEASE_ARM64_T6041
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8

pandas : 2.2.3
numpy : 2.1.3
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 24.0
Cython : None
sphinx : None
IPython : 8.31.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : 8.3.4
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.15.0
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2024.2
qtpy : None
pyqt5 : None

@felipeangelimvieira added the Bug and Needs Triage labels on Jan 7, 2025
@felipeangelimvieira changed the title from "BUG: .to_period with MonthBegin offset do not work with a Timestamp, but works with pd.DatetimeIndex" to "BUG: .to_period with MonthBegin offset does not work with a Timestamp, but works with pd.DatetimeIndex" on Jan 7, 2025
@rhshadrach
Member

Thanks for the report!

It does work with MonthStart

What is MonthStart? That is not a pandas object.

@rhshadrach added the Period and Timestamp labels on Jan 9, 2025
@rhshadrach
Member

Also, dateiindex.to_period(pd.offsets.MonthBegin()) raises on main without a deprecation warning on 2.2.x. We should run a git-bisect to see where this was changed.

@MarcoGorelli
Member

MarcoGorelli commented Jan 30, 2025

It's been a while since I looked at this, but the docstring for to_period says

One of pandas’ period aliases or an Period object. Will be inferred by default.

and 'MS' isn't (and never has been) a Period alias.

The month period alias is 'M'

MonthEnd happens to work because the implementation couples together the MonthEnd offset and the Month period. It's...messy
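
For comparison, here is a minimal sketch of the call with the 'M' period alias mentioned above, which takes the same form on a Timestamp and on a DatetimeIndex. The variable names are illustrative only.

```python
import pandas as pd

ts = pd.Timestamp("2020-01-01")
idx = pd.date_range("2020-01-01", periods=1)

# "M" is the month period alias, so it is accepted by both objects.
period = ts.to_period("M")
period_index = idx.to_period("M")

print(period)                     # the month that contains 2020-01-01
print(period_index[0] == period)  # the index element is the same Period
```

The resulting Period covers the whole month, so whether the original offset anchored at month start or month end no longer matters.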

@MarcoGorelli
Member

Tried this out with pandas 2.0.3, and the error message was

AttributeError: 'pandas._libs.tslibs.offsets.MonthBegin' object has no attribute '_period_dtype_code'

So, at least the error message is a bit clearer now, but it should probably suggest "did you mean 'M' instead?". I think some logic to give better error messages there would be worthwhile
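
A rough sketch of what such a hint could look like, written as a user-side wrapper rather than as a change to pandas itself. The names `_SUGGESTED_ALIAS` and `to_period_with_hint` are hypothetical, and the mapping only covers MonthBegin; this is not how pandas implements its error handling.

```python
import pandas as pd

# Hypothetical mapping from offsets that are not period frequencies
# to the period alias the user most likely wanted.
_SUGGESTED_ALIAS = {pd.offsets.MonthBegin: "M"}


def to_period_with_hint(obj, freq):
    """Call obj.to_period(freq), failing early with a suggestion for known offsets."""
    for offset_type, alias in _SUGGESTED_ALIAS.items():
        if isinstance(freq, offset_type):
            raise ValueError(
                f"{freq!r} is not supported as period frequency; "
                f"did you mean {alias!r} instead?"
            )
    # Anything else is passed through to pandas unchanged.
    return obj.to_period(freq)


try:
    to_period_with_hint(pd.Timestamp("2020-01-01"), pd.offsets.MonthBegin())
except ValueError as err:
    print(err)
```

Checking the offset type up front keeps the wrapper independent of which exception a given pandas version raises internally.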

an Period object

This isn't exactly right: you can't pass a Period to to_period, only a period alias. Will update.

@felipeangelimvieira
Author

felipeangelimvieira commented Jan 31, 2025

Sorry for the delay!
Thank you very much for replying.

What is MonthStart? That is not a pandas object.

Sorry, I meant pd.offsets.MonthEnd. The following code works:

import pandas as pd

dateiindex = pd.date_range("2020-01-01", periods=1)
dateiindex[0].to_period(pd.offsets.MonthEnd()) # works here

So, I believe there are two behaviours that are somewhat confusing:

  1. If I call to_period on a DatetimeIndex with MonthBegin, it works. However, if I call it on an element of that index, it raises an error.
  2. If I use MonthEnd, it works in both situations.
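
The two observations can be checked side by side with a small script like the following. The helper `try_to_period` is a hypothetical name, and the outcome of each case depends on the installed pandas version (the thread reports different behaviour on 2.0.3, 2.2.x and main), so no expected output is shown.

```python
import pandas as pd


def try_to_period(obj, offset):
    """Describe what obj.to_period(offset) does without letting it raise."""
    try:
        return f"ok: {obj.to_period(offset)!r}"
    except Exception as exc:  # the exception type has varied across versions
        return f"{type(exc).__name__}: {exc}"


idx = pd.date_range("2020-01-01", periods=1)
cases = {"DatetimeIndex": idx, "Timestamp": idx[0]}

results = {}
for obj_name, obj in cases.items():
    for offset in (pd.offsets.MonthBegin(), pd.offsets.MonthEnd()):
        key = (obj_name, type(offset).__name__)
        results[key] = try_to_period(obj, offset)
        print(key, "->", results[key])
```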

@MarcoGorelli
Member

Yup, thanks @felipeangelimvieira !

I'm kind of torn between:

  • disallow MonthEnd as well, and only accept Period aliases such as 'M'
  • accept both MonthBegin and MonthEnd offsets, and document that they both map onto 'M'
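
To make the second option concrete, here is a sketch of the mapping done in user code. `month_aware_to_period` and `_MONTH_OFFSETS` are hypothetical names, and only offsets with a multiple of 1 are translated; this illustrates the idea and is not a proposed pandas implementation.

```python
import pandas as pd

# Hypothetical: both monthly offsets map onto the single month period alias.
_MONTH_OFFSETS = (pd.offsets.MonthBegin, pd.offsets.MonthEnd)


def month_aware_to_period(obj, freq):
    """Translate MonthBegin()/MonthEnd() to 'M' before calling to_period."""
    # Only the plain n == 1 offsets are translated; anything else is
    # passed through so pandas decides how to handle it.
    if isinstance(freq, _MONTH_OFFSETS) and freq.n == 1:
        freq = "M"
    return obj.to_period(freq)


idx = pd.date_range("2020-01-01", periods=1)
from_begin = month_aware_to_period(idx[0], pd.offsets.MonthBegin())
from_end = month_aware_to_period(idx[0], pd.offsets.MonthEnd())
print(from_begin == from_end)  # both offsets yield the same monthly Period
```

Under the first option, the same function would instead reject both offset types and accept only the alias.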

3 participants