Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[24.2] Decode/encode FormDirectory paths to allow spaces (and other characters) #19841

Conversation

ahmedhamidawan
Copy link
Member

@ahmedhamidawan ahmedhamidawan commented Mar 18, 2025

Fixes #19812

fix_directory_form_element_encoding.mp4

Here, note that the problem reported in #19812 of directories with spaces not working as expected is fixed, for e.g.:

gxftp://Other Data/Files

works fine.

However, the case where a directory name is already "encoded", for e.g.:

gxftp://Other%20Data/Files

we instead end up creating a directory:

OtherX20Data\Files

rather than storing in the existing Other%20Data directory.

Maybe we ignore this case? Or flag such directory paths as invalid?

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@ahmedhamidawan ahmedhamidawan marked this pull request as ready for review March 18, 2025 20:12
@github-actions github-actions bot added this to the 25.0 milestone Mar 18, 2025
@ahmedhamidawan ahmedhamidawan force-pushed the fix_directory_form_element_encoding branch from feb53ed to 5c44711 Compare March 18, 2025 20:14
@bgruening
Copy link
Member

Oh thanks! I think we also have 1-2 reports about this in the last month.

@jmchilton
Copy link
Member

However, the case where a directory name is already "encoded", for e.g.:

Is this a backend issue - what file source is causing this? Everything should be JSON it seems and I wouldn't expect there to be any problems? I'd love an API test or backend unit test to verify the problem is not there.

@ahmedhamidawan
Copy link
Member Author

ahmedhamidawan commented Mar 19, 2025

Is this a backend issue - what file source is causing this? Everything should be JSON it seems and I wouldn't expect there to be any problems? I'd love an API test or backend unit test to verify the problem is not there.

My example is in the PR description, where for instance a directory name is Other%20Data, the front end keeps it intact, but yes: from there I can investigate if it is indeed at the backend where we change it to OtherX20Data instead...

The other case where the directory name has a space for example works out as shown.

This is the job info view for this:
image

@ksuderman
Copy link
Contributor

I tested this PR and it is working as expected for S3 buckets with space characters in the name. However, while the percent sign is in the list of characters to avoid Other%20Data is a valid S3 bucket name. This likely falls under the, "yeah, don't do that" category, but maybe the % should not be normalized.

@ahmedhamidawan
Copy link
Member Author

Argh, at some part of the process AFTER we have forked a process to run the job, we are somehow encoding cases like:
Other%20Data -> OtherX20Data
while
Other Data -> Other Data (is left untouched, good ✅)
by the time the python command from the tool xml is run:

python '/home/ahmedhamidawan/repos/galaxy2/tools/data_export/export_remote.py' --file-sources '/home/ahmedhamidawan/repos/galaxy2/database/jobs_directory/002/2925/configs/tmporssr9th' --directory-uri 'gxftp://Other Data/OtherX20Data/Files'

I can't seem to debug where this exactly happens... I can confirm in the backend the URI stays perfectly intact (e.g.: The URI is still correct at DefaultToolAction.execute()

@jmchilton

Everything should be JSON it seems and I wouldn't expect there to be any problems?

I can try changing the d_uri input to a JSON (unsure where exactly to make changes; what part of client if so, and I think this change would go in the xml/py for the tool?)

@ahmedhamidawan
Copy link
Member Author

Oh I found a way to make it work!!! But is it a valid fix here or can this create more potential problems?:

In export_remote.xml, we change:

<param type="directory_uri" name="d_uri" label="Directory URI">
<param type="directory_uri" name="d_uri" label="Directory URI">
    <sanitizer sanitize="false"/>
</param>

@jmchilton
Copy link
Member

That single change fixes it? If that is the case, I have a better solution if that is the only problem - if we dump the inputs to a json file and ingest them as such in the tool I think that we shouldn't need to do any encoding/decoding. If I make the change to the tool can you test it for me?

@ahmedhamidawan
Copy link
Member Author

ahmedhamidawan commented Mar 20, 2025

Yes please!

Well, this change fixes the case of the extra "encoding/sanitization?" that happens at the tool level, the frontend changes are still needed to retain the exact directory names (with spaces etc.)

@jmchilton
Copy link
Member

Was not as straight forward as I was hoping but this is the backend change I would like to see #19865 along with backend tests to ensure it is handling things as I would expect.

@ahmedhamidawan
Copy link
Member Author

@jmchilton Thank you!!
So as far as the client changes are concerned this is good to go then?

@mvdbeek
Copy link
Member

mvdbeek commented Mar 24, 2025

Can you add a couple or tests that cover the changes ?

@jmchilton
Copy link
Member

So as far as the client changes are concerned this is good to go then?

I'm still anxious that there is a bug in the file source or the concept behind file sources if we're getting these kinds of things in the UI - but I don't have the bandwidth to think through it or track down the plugin that causes this problem? I guess as long as the UI is sending back the to backend what the backend sent to the UI - I'm find with whatever happens in the UI.

@ahmedhamidawan ahmedhamidawan force-pushed the fix_directory_form_element_encoding branch from 5c44711 to c0762f7 Compare March 24, 2025 18:54
@mvdbeek mvdbeek merged commit 68be79f into galaxyproject:release_24.2 Mar 25, 2025
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants