JSON object not accepted as part of multipart payload #234

Open
dmjohnsson23 opened this issue Oct 2, 2024 · 6 comments

@dmjohnsson23

The validator does not appear to accept JSON objects as "properties" of a multipart payload. For example, in this trimmed-down schema:

openapi: 3.1.0
paths:
  /intake:
    post:
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/Intake'
          multipart/form-data:
            schema:
              type: object
              required: 
                - data
              properties:
                data:
                  $ref: '#/components/schemas/Intake'
              patternProperties:
                "^doc_.*$":
                  type: string
                  format: binary
                  description: |-
                    This is a document that will be attached to the intake request. You may 
                    provide additional metadata about this document in the `documents` property of 
                    the intake object.
        required: true

When sending an application/json request, the validator accepts the request:

curl -X POST https://dev-server/api/intake -H "Authorization: Bearer $token" -H "Content-Type: application/json" -d '{...data here...}'

However, if I wrap the exact same data in a multipart/form-data payload, it fails by throwing a \League\OpenAPIValidation\Schema\Exception\TypeMismatch exception on the data property with the message "Value expected to be 'object', but 'string' given."

curl -X POST https://dev-server/api/intake -H "Authorization: Bearer $token" -F 'data={...data here...};type=application/json'

According to this link, I should be able to have a JSON part in a multipart payload by specifying type: object. The 3.1 spec also states that JSON should be the default encoding for "object" properties in a multipart payload. The validator, however, seems to only interpret the content as a string.

@dmjohnsson23
Author

dmjohnsson23 commented Oct 2, 2024

I spent some time digging around in the source code for the validator. I did find this interesting discrepancy:

try {
    $body = $this->deserializeBody($this->parseMultipartData($addr, $document), $schema);
    $validator->validate($body, $schema);
} catch (SchemaMismatch $e) {
    throw InvalidBody::becauseBodyDoesNotMatchSchema($this->contentType, $addr, $e);
}

VS

try {
    $validator->validate($body, $schema);
} catch (SchemaMismatch $e) {
    throw InvalidBody::becauseBodyDoesNotMatchSchema($this->contentType, $addr, $e);
}

The former contains this line to parse the body before validating, whereas the latter does not:

$body = $this->deserializeBody($this->parseMultipartData($addr, $document), $schema);

I tried copying that line from the validatePlainBodyMultipart function to the validateServerRequestMultipart function. It didn't work, and I don't understand the validator's internal workings well enough to chase down exactly what it's doing and why, but I thought I'd at least post it in case it's a useful lead. I'm validating a ServerRequestInterface via the RoutedServerRequestValidator, so it seemed probable that this was related.

@dmjohnsson23
Author

It looks like after this block:

$body = (array) $message->getParsedBody();
$files = $this->normalizeFiles($message->getUploadedFiles());
$body = array_replace($body, $files);

$body contains an array of strings:

array (size=2)
  'data' => string '...truncated...' (length=5597)
  'doc_1' => string '~~~binary~~~' (length=12)

Something needs to happen to parse the JSON here before validation. If I can figure this out, I'll submit a PR, but I may have to defer to someone who knows the codebase if I can't. Advice would also be appreciated.
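As a rough illustration of the missing step (not the validator's actual code), the idea would be to walk the top-level properties of the multipart schema and JSON-decode any string part whose schema declares `type: object` or `type: array`. The helper name `decodeJsonParts` and the plain-array schema shape are assumptions made for this sketch; the validator works with its own schema objects, and a `$ref` such as `#/components/schemas/Intake` is assumed to be already resolved.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decode multipart fields that the schema declares as
 * objects or arrays. Not part of the validator's API.
 *
 * @param array<string, mixed> $body   parsed multipart fields (text parts arrive as strings)
 * @param array<string, mixed> $schema simplified schema as a plain array (an assumption)
 * @return array<string, mixed>
 */
function decodeJsonParts(array $body, array $schema): array
{
    foreach ($schema['properties'] ?? [] as $name => $propSchema) {
        // Only string values can still be undecoded JSON.
        if (!isset($body[$name]) || !is_string($body[$name])) {
            continue;
        }
        $type = $propSchema['type'] ?? null;
        if ($type !== 'object' && $type !== 'array') {
            continue;
        }
        // Decode to stdClass (not associative arrays) so that JSON objects stay
        // distinguishable from JSON arrays. Malformed JSON raises a JsonException.
        $body[$name] = json_decode($body[$name], false, 512, JSON_THROW_ON_ERROR);
    }

    return $body;
}

// Example input shaped like the dump above.
$body = decodeJsonParts(
    ['data' => '{"name":"example"}', 'doc_1' => '~~~binary~~~'],
    ['properties' => ['data' => ['type' => 'object']]]
);
```

After this call, `data` holds a decoded object while `doc_1` is left untouched, which is the shape the schema validation would need to see.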

@dmjohnsson23
Author

I've found that if I change this:

try {
    $validator->validate($body, $schema);
} catch (SchemaMismatch $e) {
    throw InvalidBody::becauseBodyDoesNotMatchSchema($this->contentType, $addr, $e);
}

To this:

        try {
            $body = $this->deserializeBody($body, $schema);
            $validator->validate($body, $schema);
        } catch (SchemaMismatch $e) {
            throw InvalidBody::becauseBodyDoesNotMatchSchema($this->contentType, $addr, $e);
        }

And also change this:

$param = new SerializedParameter($propSchema);

To this:

            $param           = new SerializedParameter($propSchema, 'application/json');

Then the request is validated properly (at least as long as the request contains only JSON data). The only remaining question is how to get the expected content type rather than hard-coding it. The existing detectEncodingContentTypes method seems to already implement the required logic, but it needs other data (namely, the part's Content-Type header) in order to be called. Frankly, I have no clue how to actually get that Content-Type header, as ServerRequestInterface doesn't seem to expose it.

@imefisto
Contributor

Interesting. I'm having the same issue while trying to upload a JSON file.

I have no clue how to actually get that Content-Type header, as ServerRequestInterface doesn't seem to expose it.

ServerRequestInterface extends RequestInterface, and RequestInterface extends MessageInterface, which has the methods to retrieve headers. Is that what you're talking about?

Link to PSR7

@dmjohnsson23
Author

The headers returned by that method give the content type of the full HTTP payload (e.g. multipart/form-data), not of the individual parts, so they aren't useful for what I was trying to do. In fact, I found that even vanilla PHP doesn't provide a way (that I could see, anyway) of accessing those subpart headers, short of reading the raw input and parsing the HTTP request yourself. You more or less have to assume the content type will be what you expect it to be and run with it.
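For reference, parsing the raw request by hand could look roughly like the sketch below. It is a simplified illustration, not a complete multipart parser: the function name `partContentTypes` is hypothetical, and it assumes CRLF line endings, quoted `name="..."` parameters, and no nested multipart parts. It also takes the raw body as a string argument, because PHP does not make `php://input` available for `multipart/form-data` POST requests by default, so the raw bytes would have to come from somewhere else (for example the server or framework layer).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: map each part's field name to its declared Content-Type,
 * or null when the part has no Content-Type header.
 *
 * @return array<string, string|null>
 */
function partContentTypes(string $rawBody, string $boundary): array
{
    $types = [];
    foreach (explode('--' . $boundary, $rawBody) as $part) {
        $part = ltrim($part, "\r\n");
        // Skip the (usually empty) preamble and the closing "--" marker.
        if ($part === '' || strncmp($part, '--', 2) === 0) {
            continue;
        }
        // The header block ends at the first blank line.
        [$headerBlock] = explode("\r\n\r\n", $part, 2);
        if (!preg_match('/^Content-Disposition:.*?\bname="([^"]*)"/mi', $headerBlock, $m)) {
            continue;
        }
        $types[$m[1]] = preg_match('/^Content-Type:\s*([^;\r\n]+)/mi', $headerBlock, $t)
            ? trim($t[1])
            : null;
    }

    return $types;
}

// A tiny hand-built payload with the same two fields as in this issue.
$raw = "--XYZ\r\n"
    . "Content-Disposition: form-data; name=\"data\"\r\n"
    . "Content-Type: application/json\r\n\r\n"
    . "{\"name\":\"example\"}\r\n"
    . "--XYZ\r\n"
    . "Content-Disposition: form-data; name=\"doc_1\"; filename=\"a.bin\"\r\n"
    . "Content-Type: application/octet-stream\r\n\r\n"
    . "~~~binary~~~\r\n"
    . "--XYZ--\r\n";

$types = partContentTypes($raw, 'XYZ');
```

The resulting map could then, in principle, be fed to whatever logic decides how each part should be deserialized, instead of hard-coding 'application/json'.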

I ultimately found that unfortunately this library did not suit my needs, and instead opted for hand-written validation code. Sometimes there is just no substitute for doing it yourself. I did find this library helpful in simplifying my hand-written validation code though.

@imefisto
Contributor

imefisto commented Jan 17, 2025

I'm wondering if $request->getUploadedFiles()['file']->getClientMediaType() could do the job. I haven't analyzed this properly yet; I'll take a look in the next few days.
