Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't upload text/html blob with contents that contain "<svg" + any character #3151

Open
Saturn-VI opened this issue Dec 1, 2024 · 1 comment · May be fixed by #3249
Open

Can't upload text/html blob with contents that contain "<svg" + any character #3151

Saturn-VI opened this issue Dec 1, 2024 · 1 comment · May be fixed by #3249
Labels
bug Something isn't working

Comments

@Saturn-VI
Copy link

Describe the bug

com.atproto.repo.uploadBlob fails with a payload containing "<svg" followed by any other characters, with a mimeType of "text/html".

To Reproduce

Steps to reproduce the behavior:

  1. Create a file (test.txt) with contents:
<svgQ
  1. Use postman/atfile/any other tool to upload test.txt.
  2. This causes a 500: Internal Server Error.

Expected behavior

The blob successfully uploads.

Additional context

  • Any number of characters can precede the "<svg"
  • Removing the character after "<svg" causes a successful upload.
@Saturn-VI Saturn-VI added the bug Something isn't working label Dec 1, 2024
@db3000
Copy link

db3000 commented Dec 9, 2024

The issue is in this block of code

The maybeGetInfo function is called by the uploadBlob implementation in the PDS and tries to get image information (eg height, width, format) from the upload contents using the sharp library. There's a corner case here though: if the code can't recognize the contents as an image then this function returns null to indicate it isn't an image, but if the sharp library decides that the blob contains an image and fails while decoding it then an exception is thrown. This sharp library exception propagates up and causes a 500 error to be returned to the PDS client.

In the case in this ticket, the text "<svg" is enough for sharp to decide the image is an SVG file (internally it uses ImageMagick which defines the following patterns for SVG files), but then it fails to decode the rest of the image as it expects the contents to be valid XML. The actual exception thrown from sharp is:

Input buffer has corrupt header: glib: XML parse error: Error domain 1 code 73 on line 1 column 6 of data: Couldn't find end of Start Tag svgQ line 1

Given that maybeGetInfo is only used to detect image information, and the fact that it allows through some set of corrupt images (those that don't match any ImageMagick patterns), maybe its reasonable to just catch all exceptions from sharp and return null here too? Alternatively could turn this into an exception that causes a proper HTTP status code (e.g. 400) to be returned to the client?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants