-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
.Net: fix: array query parameters with single entries #9771
Open
baywet
wants to merge
3
commits into
microsoft:main
Choose a base branch
from
baywet:fix/query-paremeters-parsing
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+8
−1
Open
Changes from all commits
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder which scenario requires the string -> JSONArray conversion. Is it when the LLM provides a string argument to a parameter of an array type? I'm not sure SK should perform this conversion, and I would argue that it's the caller's responsibility to provide a valid JSON array as an argument for the parameter of array type. If the LLM is returning the string, the error details should be sent back to it so it can make another call with the correct argument type - this is already implemented in SK.
My concern is that if we start doing the conversion because the LLM sometimes provides invalid JSON values, sooner or later we will find ourselves in a tricky situation when we are asked, for example, to create an array element for each word in the string - "v1 v2 v3," "v1, v2, v3," or "v1;v2;v3." How are we going to support that without introducing any behavioral change? We were creating an array with one element from the string, but now we would have an array with an element per word in the string. Will we add an extra option/execution parameter that allows us to tweak the conversion behavior? What if different string -> array conversion flavors are needed within the same operation or in different operations imported from the same document?
I think it's safer not to perform that conversion at the SK level and allow the LLM to self-correct based on function invocation errors. If an extra request to the LLM is not desirable, then consumers can easily decorate each operation with a delegate - How to transform a KernelFunction that would handle the conversion of arguments provided by the LLM based on the target type from the operation metadata and call original function with valid arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe this is a matter of the LLM generating "the wrong JSON" but rather how the response is being parsed, hence the fix.
For the following prompt
using only the subject, recap my last 5 emails
it eventually buildscreatedDateTime desc
as the string value here for the $orderBy query parameter, which is correct.However, I do agree this could be a slippery slope, especially for request and response bodies, so maybe we limit those graceful upgrades to query/paths parameters and headers?
What's probably causing the mis-alignment here is the style from and the transformation that already gets applied post parsing. We could alternatively accept the LLMs response as is since it's correct and remove the parsing all together, but this might impact reliability in other cases?
I don't think letting the LLM brute force through adding extra square brackets is a productive use of resources here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@baywet The example you show is serializing of URL parameters, as per 6570 rules. I don't think we can mix serializing of URL parameters and JSON. They follow very different rules. With 6570 whether the parameter value is a simple string, or an array of strings it would serialize correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which begs the question: why does the pipeline to generate query parameters value go through a JSON parsing in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who builds it, LLM or SK? If it's the LLM that provides "createdDate desc" string for a parameter of an array type, then it's incorrect because the "createdDate desc" string is not a valid JSON array (see https://datatracker.ietf.org/doc/html/rfc8259#section-5).
To consider the LLM response as correct, it should conform to the parameter specification that requires the parameter to be of array type; every response that deviates from this is incorrect, and accepting it only pushes the issue further in the invocation chain -> parameters serialization failing to serialize the parameter value in accordance with the specified format or the REST API service itself receiving a wrongly formatted query string if the serialization part is dropped or produces the wrong result.
Parsing is required because SK supports non-LLM scenarios as well, where parameter arguments can be provided as a JSON string or .NET type. For example, the argument for the
orderby
parameter can either be a JSON array - "["id desc", "createdDateTime asc"]" or an instance of a .NET List type, such asnew List<string>() { "id desc" }
.... If an extra request to the LLM is not desirable, then consumers can easily decorate each operation with a delegate - How to transform a KernelFunction that would handle the conversion of arguments provided by the LLM based on the target type from the operation metadata and call original function with valid arguments.
To simplify the parameters serialization functionality that would otherwise be cluttered with the parsing and validation logic required to format the parameters, as well as the if/else logic to determine whether the argument is JSON or an instance of a .NET type.
BTW: There's no need to specify "$" in the "$orderby" parameter name, SK will do it for you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the additional information.
As far as I understand kernel functions transformations, they are deeply specific to a given function, which would not be suitable in our case since we'll want a solution that works for any function, generated from any OpenAPI/Api Manifest/Copilot Plugin.
I'm not sure, whatever ends up calling this method. As far as I understand, SK calls the LLM to get the query parameters values generated, but maybe I understand this wrong?
In your previous answer, you seem to be eluding to the fact that we might be able to "force the hand" of the LLM and tell it "no, really, even if you come up with a single result, it needs to be returned as a valid JSON array with a single entry". Or maybe this is me extrapolating. Would you have pointers for that?