Correcting the rename dialog? #9265

Open
rdstern opened this issue Nov 18, 2024 · 13 comments
@rdstern
Collaborator

rdstern commented Nov 18, 2024

t2_subset.xlsx

Here is a dataset where many variables start with X.

I would like to delete the X. from the names using the Rename dialog's 3rd option, by a function.

If I do it really simply, namely replace X. by empty, it gives an error in dplyr. I think that is because of the default using regex. So I tried replacing X\\. and it no longer gives an error, but also doesn't replace.

Help please and we should also write this up. Can @N-thony or @lilyclements help?

@lilyclements
Contributor

lilyclements commented Nov 18, 2024

Thanks for this @rdstern. The issue is that the code is looking for the wrong columns because you're using "Starts with" not "Matches".

tidyselect::starts_with() does not support regular expressions, so it is currently selecting columns which literally start with "^X\."
We need to call the ^X\\. for a different parameter in our R code, however.
To use regular expressions to match column names this way, we can use tidyselect::matches() instead, which supports regex.

So all you need to do is change from the "Starts with" option to "Matches":

image

(I've put the replacement as NEW as otherwise there's an issue that we already have a column called date)
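
To make the difference concrete, here is a minimal sketch in plain dplyr (outside R-Instat). The toy data frame and its column names are made up for illustration and are not taken from the attached file.

```r
library(dplyr)

# Toy data standing in for the attached spreadsheet (hypothetical column names)
df <- data.frame(X.rain = 1, X.tmax = 2, Xdate = 3, check.names = FALSE)

# starts_with() compares literally: no column name begins with the text X\.
literal <- names(select(df, starts_with("X\\.")))

# matches() treats its argument as a regular expression
regex <- names(select(df, matches("^X\\.")))

# Strip the prefix from the matched columns only
renamed <- rename_with(df, ~ stringr::str_replace(.x, "^X\\.", ""),
                       .cols = matches("^X\\."))
names(renamed)
```

Here `literal` is empty, while `regex` picks up the two columns that really start with `X.`, which is why switching the dialog to "Matches" makes the rename take effect.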

@rdstern
Collaborator Author

rdstern commented Nov 18, 2024

@lilyclements great. I also misunderstood the error that I got, because I assumed it was because of the dot and regex. It was because of the names being duplicated.

You see I wanted it to be simple. Just delete Xdot from the start of each name - So I was really happy that I can easily replace Xdot by NEW using starts with and no trouble!

I still need to write this up properly in the help, and possibly edit the dialog a bit if some options assume regex by default and others do not.
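
For the write-up, a small sketch of how the duplicate-name error can arise, using plain dplyr on made-up data (the column names are assumptions, chosen to mimic a sheet that already has a `date` column):

```r
library(dplyr)

# Hypothetical data: a "date" column plus an "X.date" column
df <- data.frame(date = 1, X.date = 2, X.rain = 3, check.names = FALSE)

strip_prefix <- function(nm) stringr::str_replace(nm, "^X\\.", "")

# Names the edit would produce, and which of them clash
new_names <- strip_prefix(names(df))
dup <- unique(new_names[duplicated(new_names)])

# rename_with() refuses to create duplicate names; capture its message
result <- tryCatch(
  rename_with(df, strip_prefix, .cols = matches("^X\\.")),
  error = function(e) conditionMessage(e)
)
```

The regex itself is fine here; the failure comes only from `X.date` becoming a second `date`.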

@N-thony
Collaborator

N-thony commented Nov 19, 2024

I also noticed this error: in the Edit option, if you click on Abbreviate, then go back to Edit and click Ok, you get the following error.
image

@lilyclements
Contributor

@N-thony thanks for reporting. That error is that the X\\. from the ucrInput is not being read into the tidyselect::starts_with() function. Should be an easy fix when it's taken up :)

@N-thony
Collaborator

N-thony commented Nov 19, 2024

> @N-thony thanks for reporting. That error is that the X\\. from the ucrInput is not being read into the tidyselect::starts_with() function. Should be an easy fix when it's taken up :)

Yes, that is correct, @derekagorhom or @Fidel365 can easily fix this.

rdstern changed the title from "Changing names with a dot in them, via the rename dialog?" to "Correcting the rename dialog?" on Nov 20, 2024
rdstern added the bug label on Nov 20, 2024
@rdstern
Collaborator Author

rdstern commented Nov 20, 2024

We are often hitting an error after using the Rename With button and then using the Single or the Multiple buttons afterwards. It still tries to process the Rename With option. It is always resolved by pressing Reset.
Could you improve the code, so it only processes the option for the button that is currently selected?

t1_100.zip

Here is an example to try:

After importing the file, I want to delete the superfluous dates (9 columns), or the Rename With fails, because the names are then not unique. Nasty message though!

Then changing Date to Day, followed by Rename With on Edit > Starts with X. to nothing, works fine.

But doing the X. edit first and then changing Date to Day afterwards gives an error.

@N-thony
Collaborator

N-thony commented Nov 21, 2024

@Fidel365 any plan on this?

@Fidel365
Contributor

@N-thony on it

@rachelkg
Contributor

@N-thony, @Fidel365 and @lilyclements

While we are talking about the Renaming Dialog I have a question.

Why is the edit option only available on the data frame tab? It seems like it would be useful on the Selected Variables tab too. Could it be added? All the other options seem to be the same.

2024-11-25_14-59-14

2024-11-25_14-59-02

@rdstern
Collaborator Author

rdstern commented Nov 26, 2024

@lilyclements a question for you. @Fidel365 has solved the bug and I hope we can merge.

I think we had @rachelkg's question earlier, in #8439, on why we can't have the edit option with just selected variables. And we can't, is that right? As I write this, I will also check whether (if that's the case) we could get the same thing by applying a select first! That would be nice.

Oh no, now I realise! If select is applied, it used to give a bug. Now it just takes off the select. And it could pile up problems later, maybe with non-unique names?

Let's confirm that first. Then I have another @lilyclements question, though maybe more an @N-thony question? It is easy to get the dplyr error with the example I use, given above. That's because the proposed Edit results in non-unique names. Now I don't want it to become clever and adjust the names. I'd like an error message, but clearer, namely that the edit is producing non-unique names in the data sheet.

Here is the error message:

image

I wonder if it can come with any of the options, but assume it is mainly with Edit? If we can't do better, then we could maybe add a line within the box (Check in Help if you get an error message.)
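
One possible shape for such a clearer message is sketched below. The helper `check_rename_is_unique()` is hypothetical (it is not part of R-Instat or dplyr); it only computes the proposed names up front and stops with a plain-language message if any would be duplicated, without adjusting them.

```r
# Hypothetical helper: not an existing R-Instat or dplyr function
check_rename_is_unique <- function(old_names, cols, .fn, ...) {
  new_names <- old_names
  target <- old_names %in% cols
  # Apply the edit only to the selected columns
  new_names[target] <- .fn(old_names[target], ...)
  dups <- unique(new_names[duplicated(new_names)])
  if (length(dups) > 0) {
    stop("The edit would produce non-unique column names: ",
         paste(dups, collapse = ", "), call. = FALSE)
  }
  new_names
}

old <- c("date", "X.date", "X.rain")

# Removing the prefix clashes with the existing "date" column
msg <- tryCatch(
  check_rename_is_unique(old, c("X.date", "X.rain"), stringr::str_replace,
                         pattern = "^X\\.", replacement = ""),
  error = function(e) conditionMessage(e)
)

# Replacing the prefix with NEW keeps every name unique
ok <- check_rename_is_unique(old, c("X.date", "X.rain"), stringr::str_replace,
                             pattern = "^X\\.", replacement = "NEW")
```

A check of this kind could run before the rename is attempted, so the user sees which names clash rather than the raw dplyr error.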

It is quite a nice example that with power comes risks!

By the way we can (of course) make the same error with the simple use of rename - so I did! Then I get this message:

image

I like that message! And I'm liking our forthcoming work in the help file. And I'm even wondering whether there's a practice document and video needed. Maybe we could combine this and have one called Renaming columns and getting error messages. @rachelkg and Beryl (why can't I find her to send a message to?) what do you think?

@N-thony
Collaborator

N-thony commented Nov 27, 2024

@rdstern I have been looking at the issue with the selection. I think the message you had before was "Unable to retrieve the data..." when trying to rename a selected column. The problem was with the selection condition: when refreshing the data after renaming a selected column, R was not able to find the columns in the selection condition, which is now out of date for the current data.
I have actually fixed that for the Single option, as you can see in the screenshots below: with a selection applied, I renamed yield to yield123, and it is still fine after removing the selection. I will work on the other options and, once done, I will open a PR. This will help to solve the issue in some dialogs when a select is applied, and I will see how to do the same thing with the filter.
image
image
image
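
As a rough illustration of the stale-selection problem described above, the base R sketch below keeps a stored selection in step with a rename. It is an assumption about the general idea only; `update_selection()` and the variable names are hypothetical and do not reflect R-Instat's actual internals or the fix in the PR.

```r
# Hypothetical sketch: a stored selection is just a vector of column names
selection <- c("yield", "variety")

# Renames expressed as old name = new name
rename_map <- c(yield = "yield123")

# Replace any renamed column in the stored selection with its new name
update_selection <- function(selection, rename_map) {
  hit <- selection %in% names(rename_map)
  selection[hit] <- unname(rename_map[selection[hit]])
  selection
}

updated <- update_selection(selection, rename_map)
```

After the update, the selection refers to `yield123` rather than the old `yield`, so a later refresh can still find every selected column.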

@rdstern
Collaborator Author

rdstern commented Nov 27, 2024

@lilyclements and @Fidel365 I wonder if there is an error in the contains option of the rename with dialog. I'm having trouble again with replacing dot. The contains option still seems to replace the first character, which seems to be regex stuff?

I also have an example below, where I want to change multiple dots to spaces. I'd like to check it does them all, and not just the first in the string.

Introductory e-siac Survey.xlsx

(Once I have changed the dots to spaces, then I copy the names into the variable labels.)
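
A short stringr sketch of why an unescaped dot seems to replace the first character, and of first-match versus all-matches replacement. The example name is made up.

```r
library(stringr)

nm <- "Q1.Age.group"  # hypothetical column name

# An unescaped "." is a regex wildcard, so the first character is replaced
wildcard <- str_replace(nm, ".", " ")

# Escaping the dot, or wrapping it in fixed(), targets the literal dot
escaped <- str_replace(nm, "\\.", " ")
literal <- str_replace(nm, fixed("."), " ")

# str_replace() changes only the first match; str_replace_all() changes all
every_dot <- str_replace_all(nm, fixed("."), " ")
```

So `wildcard` is `" 1.Age.group"`, `escaped` and `literal` are both `"Q1 Age.group"`, and `every_dot` is `"Q1 Age group"`.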

@lilyclements
Contributor

lilyclements commented Dec 5, 2024

1. @rdstern To the "Edit" question - Yes, we should be able to have edit with "Selected Variables". I would suggest that in this situation:

  • The drop-down has "Matches" and "Contains" as its options
  • .cols = *vars in multiple receiver*
  • .fn = stringr::str_replace
  • pattern = *thing in the replace input*
  • replacement = *thing in the by input*
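
In plain dplyr, the parameters listed above might map onto a call like the sketch below. The data frame and the `selected` vector (standing in for the variables in the multiple receiver) are made up for illustration.

```r
library(dplyr)

# Hypothetical data; `selected` stands in for the multiple receiver contents
df <- data.frame(xx_field = 1, xx_size = 2, xx_yield = 3)
selected <- c("xx_field", "xx_size")

# Only the selected columns are edited; xx_yield is left untouched
out <- rename_with(df, stringr::str_replace, .cols = all_of(selected),
                   pattern = "xx_", replacement = "")
names(out)
```

Using `all_of()` means a missing column name raises an error instead of being silently skipped.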

> I wonder if it can come with any of the options, but assume it is mainly with Edit? If we can't do better, then we could maybe add a line within the box (Check in Help if you get an error message.)

@rdstern how about if we had a "check" in a box (like on the "Keys" dialog) which checks the name. Because, we might have issues too if you start a name with something that is not a character (e.g., a number or a symbol).

I just looked at it, and amended the survey data to have variables named xx_field and xx1size, for field and size. Then I used the rename dialog to replace xx with nothing. This throws an error in R-Instat and I'm not sure why! It doesn't throw an error in R. I assume this is due to R-Instat usually adding an "X" to the start to make the name valid?

# Dialog: Rename Columns
data_book$rename_column_in_data(data_name="survey", type="multiple", new_column_names_df=data.frame(cols=c("xx_field","xx1size"), index=c(2,3)))

# Dialog: Rename Columns
data_book$rename_column_in_data(data_name="survey", type="rename_with",
                                .fn=stringr::str_replace, .cols=tidyselect::starts_with("xx"),
                                pattern="xx", replacement="",
                                new_column_names_df=data.frame(cols=c("xx_field","xx1size"),
                                                               index=c(2,3)))
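
A name check of the kind suggested could compare each proposed name with what base R's `make.names()` would turn it into. This is only a sketch of the idea; whether R-Instat does anything like this internally is the assumption raised above, not something confirmed here.

```r
proposed <- c("xx_field", "xx1size")

# Names that would result from replacing the leading "xx" with nothing
new_names <- sub("^xx", "", proposed)

# A name is syntactically valid if make.names() leaves it unchanged
is_valid <- new_names == make.names(new_names)

# What base R would use instead of the invalid names
repaired <- make.names(new_names)

# An ordinary name passes the same check
plain_ok <- "field" == make.names("field")
```

Both results fail the check, because one starts with an underscore and the other with a digit, so a "check" box could flag them before the rename runs.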

> I wonder if there is an error in the contains option of the rename with dialog. I'm having trouble again with replacing dot. The contains option still seems to replace the first character, which seems to be regex stuff?

a) We want to use matches here, and have \\. to remove the .:

data_book$rename_column_in_data(data_name="Responses", type="rename_with",
                                .fn=stringr::str_replace, .cols=tidyselect::matches("\\."),
                                pattern="\\.", replacement=" ")

b) To replace all then we would have fn = stringr::str_replace_all. Maybe we could have this as a checkbox, and if checked then we run fn = stringr::str_replace_all, otherwise, fn = stringr::str_replace?

data_book$rename_column_in_data(data_name="Responses", type="rename_with",
                                .fn=stringr::str_replace_all, .cols=tidyselect::matches("\\."),
                                pattern="\\.", replacement=" ")
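
The checkbox idea in (b) could look roughly like this in plain dplyr. The wrapper `edit_names()` and its `replace_all` argument are hypothetical, and the column names are invented to resemble survey questions.

```r
library(dplyr)

# Hypothetical wrapper; `replace_all` stands in for the suggested checkbox
edit_names <- function(data, pattern, replacement, replace_all = FALSE) {
  fn <- if (replace_all) stringr::str_replace_all else stringr::str_replace
  rename_with(data, fn, .cols = matches(pattern),
              pattern = pattern, replacement = replacement)
}

df <- data.frame(a = 1, b = 2, c = 3)
names(df) <- c("What.is.your.age", "Country", "Years.of.study")

# Unchecked: only the first dot in each name becomes a space
first_only <- names(edit_names(df, "\\.", " "))

# Checked: every dot becomes a space
all_dots <- names(edit_names(df, "\\.", " ", replace_all = TRUE))
```

With the box unchecked the first name becomes `"What is.your.age"`; with it checked it becomes `"What is your age"`.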

c) We seem to say in this dialog that whatever the column name starts with is the pattern we are looking for. I assume we are happy with this, but I just wanted to query it!
(i.e., we currently have that you might select all columns starting with x_, and replace that x_. Could it be that you want to replace a different part of a column name, but for all columns that start with x_?)
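
For point (c), a sketch of what separating the two inputs would allow: the columns are chosen by one pattern and a different part of the name is edited. The column names are invented for illustration.

```r
library(dplyr)

# Hypothetical data
df <- data.frame(x_plot_size = 1, x_field_size = 2, y_size = 3)

# Select columns starting with x_, but replace the trailing "size"
out <- rename_with(df, stringr::str_replace, .cols = starts_with("x_"),
                   pattern = "size$", replacement = "area")
names(out)
```

Only the two `x_` columns change, giving `x_plot_area` and `x_field_area`, while `y_size` keeps its name.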
