BUG: remove single quotes in index names when printing #60251

Open
wants to merge 23 commits into
base: main
Changes from 11 commits
42203a9
remove single quotes in index names when printing
thedataninja1786 Nov 8, 2024
5d9edbf
Removed trailing whitespace
thedataninja1786 Nov 8, 2024
5da2343
Removed trailing whitespaces
thedataninja1786 Nov 8, 2024
9519268
Merge branch 'main' into bugfix--pprint-embedded-quotes
thedataninja1786 Nov 8, 2024
a13b8b1
branch 'upstream/main' into bugfix--pprint-embedded-quotes
thedataninja1786 Nov 9, 2024
f0a9008
Added relevant tests when removing single quotes
thedataninja1786 Nov 9, 2024
0f1c9e5
Merge branch 'bugfix--pprint-embedded-quotes' of https://github.com/t…
thedataninja1786 Nov 9, 2024
9552a43
Refactor test_formatted_index_names to adhere with line-legth constra…
thedataninja1786 Nov 9, 2024
597e6c5
Merge remote-tracking branch 'upstream/main' into bugfix--pprint-embe…
thedataninja1786 Nov 10, 2024
2fa7eeb
Escape single quotes
thedataninja1786 Nov 10, 2024
034ad05
Merge branch 'main' into bugfix--pprint-embedded-quotes
thedataninja1786 Nov 11, 2024
c3aa8f0
Merge remote-tracking branch 'upstream/main' into bugfix--pprint-embe…
thedataninja1786 Nov 12, 2024
bdcbca1
Apply formatting and import sorting from pre-commit hooks
thedataninja1786 Nov 12, 2024
a6cad8d
Apply formatting and import sorting from pre-commit hooks
thedataninja1786 Nov 12, 2024
9d02418
Apply formatting and import sorting from pre-commit hooks
thedataninja1786 Nov 12, 2024
7e59bda
Merge branch 'bugfix--pprint-embedded-quotes' of https://github.com/t…
thedataninja1786 Nov 12, 2024
67fd773
Apply formatting and import sorting from pre-commit hooks
thedataninja1786 Nov 12, 2024
67ecd35
Merge remote-tracking branch 'upstream/main' into bugfix--pprint-embe…
thedataninja1786 Nov 17, 2024
8c41d9d
Update whatsnew
thedataninja1786 Nov 17, 2024
7a4c1ee
Merge remote-tracking branch 'upstream/main' into bugfix--pprint-embe…
thedataninja1786 Nov 22, 2024
2c3625a
Update whatsnew in v3.0.0.
thedataninja1786 Nov 22, 2024
6391b76
Update whatsnew in v3.0.0.
thedataninja1786 Nov 22, 2024
10df8b4
Converted tests for string comparisons
thedataninja1786 Nov 23, 2024
6 changes: 5 additions & 1 deletion pandas/core/indexes/frozen.py
@@ -110,7 +110,11 @@ def _disabled(self, *args, **kwargs) -> NoReturn:
         raise TypeError(f"'{type(self).__name__}' does not support mutable operations.")

     def __str__(self) -> str:
-        return pprint_thing(self, quote_strings=True, escape_chars=("\t", "\r", "\n"))
+        return pprint_thing(
+            self,
+            quote_strings=True,
+            escape_chars=("\t", "\r", "\n", "'")
+        )

     def __repr__(self) -> str:
         return f"{type(self).__name__}({self!s})"
3 changes: 1 addition & 2 deletions pandas/io/formats/printing.py
@@ -199,11 +199,10 @@ def pprint_thing(
     -------
     str
     """
-
     def as_escaped_string(
         thing: Any, escape_chars: EscapeChars | None = escape_chars
     ) -> str:
-        translate = {"\t": r"\t", "\n": r"\n", "\r": r"\r"}
+        translate = {"\t": r"\t", "\n": r"\n", "\r": r"\r", "'": r"\'"}
         if isinstance(escape_chars, Mapping):
             if default_escapes:
                 translate.update(escape_chars)
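
The effect of adding the single quote to the translate table can be illustrated in isolation. The following is a minimal standalone sketch, not the pandas implementation: the function name escape_with_table is hypothetical, and the replacement loop is an assumption about how such a table is typically applied.

```python
def escape_with_table(text, escape_chars=("\t", "\r", "\n", "'")):
    """Replace each listed character with its backslash-escaped form.

    Hypothetical helper for illustration only; it mirrors the mapping
    shown in the diff but is not the pandas code itself.
    """
    # Same key/value pairs as the patched `translate` dict in the diff.
    translate = {"\t": r"\t", "\n": r"\n", "\r": r"\r", "'": r"\'"}
    result = text
    for ch in escape_chars:
        # Each control character or quote becomes a two-character sequence:
        # a literal backslash followed by a marker character.
        result = result.replace(ch, translate[ch])
    return result


escaped = escape_with_table("test's\n")
print(escaped)
```

With the quote included in escape_chars, an embedded apostrophe is emitted as a backslash followed by the quote, so it can no longer be confused with the quotes that delimit the printed name.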
21 changes: 19 additions & 2 deletions pandas/tests/io/formats/test_printing.py
@@ -2,12 +2,29 @@
# functions, not the general printing of pandas objects.
from collections.abc import Mapping
import string

import pytest
import pandas._config.config as cf

import pandas as pd
from pandas.io.formats import printing


@pytest.mark.parametrize("input_names, expected_names", [
    (["'a", "b"], ["\'a", "b"]), # Escape leading quote
    (["test's", "b"], ["test\'s", "b"]), # Escape apostrophe
    (["'test'", "b"], ["\'test\'", "b"]), # Escape surrounding quotes
    (["test","b'"], ["test","b\'"]), # Escape single quote
    (["'test\n'", "b"], ["\'test\n\'", "b"]) # Escape and preserve newline
])
def test_formatted_index_names(input_names, expected_names):
    # Create DataFrame with specified index names
Member

@rhshadrach rhshadrach Nov 11, 2024

Can you remove this comment - it repeats the code verbatim and so is not needed.

In addition, can you start the test with a comment referencing the issue:

# GH#60190

Author

I think we are there now?

    df = pd.DataFrame(
        {name: [1, 2, 3] for name in input_names}
    ).set_index(input_names)
    formatted_names = df.index.names

    assert formatted_names == expected_names
Member

This test passes on the main branch, so is not testing your changes. You can do str(df.index.names) and compare this to the expected value, e.g. ['\'a', 'b']

Author

@thedataninja1786 thedataninja1786 Nov 23, 2024

There is an issue that I hadn't noticed before, which is why the tests pass. When the tests look like this
[image]
the pre-commit checks (pre-commit run --from-ref=upstream/main --to-ref=HEAD --all-files) fail and convert the code to the format in the latest version of the PR (i.e. not escaping the single quotes).

Member

@rhshadrach rhshadrach Nov 23, 2024

When comparing to str(df.index.names), you want the expected to be a single string, not a list of strings. But indeed, you need to escape the backslash: ['\\'a', 'b'].

Member

Just to clarify my previous comment: you cannot compare individual entries, because the literal with a backslash (this PR) and the one without a backslash (main) evaluate as equal. If you convert the entire index.names to a string, however, they do not.

input_names = ["'a", "b"]
df = pd.DataFrame({name: [1, 2, 3] for name in input_names}).set_index(input_names)

print(str(df.index.names) == "['\\'a', 'b']")  # main
# False

print(str(df.index.names) == "['\\'a', 'b']")  # PR
# True
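
The point about literals can be checked without pandas. The sketch below is illustrative only: quote_names is a hypothetical stand-in for the formatting step, and the assumption is that each name is wrapped in single quotes, with or without escaping embedded quotes.

```python
# In Python source, "\'a" and "'a" are two spellings of the same
# two-character string, so an element-wise comparison cannot tell
# the escaped and unescaped forms apart.
with_backslash = "\'a"
without_backslash = "'a"
print(with_backslash == without_backslash)


def quote_names(names, escape_quote):
    """Hypothetical stand-in for the formatting step, not the pandas code."""
    parts = []
    for name in names:
        # Optionally turn each embedded quote into backslash + quote.
        body = name.replace("'", r"\'") if escape_quote else name
        parts.append(f"'{body}'")
    return "[" + ", ".join(parts) + "]"


main_like = quote_names(["'a", "b"], escape_quote=False)
pr_like = quote_names(["'a", "b"], escape_quote=True)

# Once the names are rendered to a single string, the backslash is a real
# character in the output and the two results differ.
print(main_like)
print(pr_like)
print(main_like == pr_like)
```

This is why the expected value in the test needs a doubled backslash in its source literal: the rendered string contains an actual backslash character.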

Author

Okay I think I'm with you :)



def test_adjoin():
    data = [["a", "b", "c"], ["dd", "ee", "ff"], ["ggg", "hhh", "iii"]]
    expected = "a dd ggg\nb ee hhh\nc ff iii"