-
Notifications
You must be signed in to change notification settings - Fork 29.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add prompt snippets support #234220
base: main
Are you sure you want to change the base?
add prompt snippets support #234220
Conversation
6f5026f
to
a927e4e
Compare
src/vs/workbench/contrib/chat/browser/attachments/implicitContextAttachment.ts
Show resolved
Hide resolved
|
||
prompt.parts | ||
.forEach((part, i) => { | ||
public async resolveVariables( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly readability improvements in this file, the logic remains the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, but I'd prefer to keep PRs focused more narrowly, and the word "mostly" makes it a little harder to review :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah sorry, it was very hard to read even on 27" monitor, so I was formatting it on the go as I was parsing through the code, and then decided to keep it. The same applies for other cases where I split long lines over. Let me know if you think that reverting this would help with the review at this point 👂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also recommend scoping PRs down where possible, large PRs with many changes make it hard on the reviewers and ultimately delay landing features.
// to go first so that an replacement logic is simple | ||
resolvedVariables | ||
.sort((left, right) => { | ||
assertDefined( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These asserts are to omit TS hacks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the !
because we know that these are parts of a prompt and have ranges, but the assert is fine too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they always have ranges, we need to update the TS types then?
The asserts just add a more readable error messages here as opposed to a cryptic cannot read 'start' of 'undefined'
you'd get with the bang
) { | ||
super(); | ||
this._register(widget.inputEditor.onDidChangeModelContent(e => { | ||
e.changes.forEach(c => { | ||
// Don't mutate entries in _variables, since they will be returned from the getter | ||
this._variables = coalesce(this._variables.map(ref => { | ||
this._variables = coalesce(this._variables.map((ref) => { | ||
if (c.text === `#file:${ref.filenameWithReferences}`) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the only way I made it work and its not ideal, any better ideas? cc @roblourens @joyceerhl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this trying to do? Is this triggered when that text is inserted all at once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we add the (+N more)
label to a filename after nested references are resolved, this is run and as the result the variables are removed. The logic below infers that the variables text have changed and are hence the variables are no longer valid and need to be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we update the variable objects when we update the labels, this if
check works, - if the variable text is really the same as the variable should have, do nothing.
endLineNumber: ref.range.endLineNumber, | ||
endColumn: ref.range.endColumn + delta | ||
} | ||
ref.range = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@roblourens do you recall the main reason for the // Don't mutate entries in _variables..
above? Because the ref is now a class instance with event listeners now, I do need to mutate the variables here 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because those objects will have been previously returned, and other objects will be holding on to them, and not expecting them to change. Maybe there is somewhere that we diff the previous object with the new one, and mutating the object breaks that. There are other ways we can solve that. But I don't quite understand the problem for you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe there is somewhere that we diff the previous object with the new one
But that would break only if the diff logic relies on the object pointers, doing diffing by-attribute should be unaffected in either case?
There is no really a problem for me, just wanted to make sure that mutating the objects is fine here. We do need to mutate them now because the ref
variable is a class instance now rather than a simple object it was before. The class instance can have event listeners registered on it by other code, and it handles nested file references resolution which we don't want to again over and over 🤷
a927e4e
to
2fa00b9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not done reviewing but wanted to drop the first round of comments. Overall, when I'm working on a large feature, I usually try to start pushing small PRs to main as early as possible. That's the best way to avoid merge conflicts and get feedback early. Don't be afraid of pushing an incomplete feature behind a hidden setting, as long as it won't break anything. We do that often on this team. I also would rather avoid PRs that make multiple unrelated changes. It just gets a bit hard to review.
* assertDefined(null, new Error('Should throw this error.')) | ||
* ``` | ||
*/ | ||
export function assertDefined<T>(value: T, error: string | NonNullable<Error>): asserts value is NonNullable<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have this in types.ts, assertIsDefined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is a little bit different since it is an assert
and doesn't require variable reassignment. Let me see if I can use the existing one without need to create let
variable just for the assertion 🤔
@@ -186,7 +186,7 @@ export interface ITransformer<Original, Transformed> { | |||
error?: IErrorTransformer; | |||
} | |||
|
|||
export function newWriteableStream<T>(reducer: IReducer<T>, options?: WriteableStreamOptions): WriteableStream<T> { | |||
export function newWriteableStream<T>(reducer: IReducer<T> | null, options?: WriteableStreamOptions): WriteableStream<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional argument? We generally use undefined over null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used null
on purpose to make it explicit to prevent subtle silent errors. It is too easy to overlook a variable or a function return value being an undefined and pass it over, hence I wanted to make sure the folks that using this to be very specific.
I do appreciate the work that TS has done with void
and not treating the foo?: string
and foo: undefined | string
the same anymore, but there is still the cases when requiring the explicit null
saves some hair on your head 🤗 Do you think it would be fine with undefined
? You have more context so happy to revert if you think it won't be a pain in the future.
src/vs/workbench/contrib/chat/browser/attachments/implicitContextAttachment.ts
Outdated
Show resolved
Hide resolved
state: WorkingSetEntryState.Attached, | ||
kind: 'reference', | ||
}); | ||
|
||
seenEntries.add(child); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a correct fix (cc @joyceerhl) but is this method even related to your new feature? I prefer not grouping unrelated fixes into a PR that's already really big
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one relates to my feature only - the child references is the new logic. The existing logic is now on the line 1086
above now, and that one would still need the fix to be applied.
|
||
prompt.parts | ||
.forEach((part, i) => { | ||
public async resolveVariables( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, but I'd prefer to keep PRs focused more narrowly, and the word "mostly" makes it a little harder to review :)
@@ -193,7 +195,18 @@ export class ChatWidget extends Disposable implements IChatWidget { | |||
return { text: '', parts: [] }; | |||
} | |||
|
|||
this.parsedChatRequest = this.instantiationService.createInstance(ChatRequestParser).parseChatRequest(this.viewModel!.sessionId, this.getInput(), this.location, { selectedAgent: this._lastSelectedAgent }); | |||
assertDefined( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems unnecessary, this is right after a !this.viewModel
check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point 💥 I've tried to omit the !
in the this.viewModel!.sessionId
but seems like that was also unnecessary here anyway
@@ -1046,6 +1065,29 @@ export class ChatWidget extends Disposable implements IChatWidget { | |||
this.telemetryService.publicLog2<ChatEditingWorkingSetEvent, ChatEditingWorkingSetClassification>('chatEditing/workingSetSize', { originalSize: this.inputPart.attemptedWorkingSetEntriesCount, actualSize: uniqueWorkingSetEntries.size }); | |||
} | |||
|
|||
// factor in nested references of dynamic variables into the implicit attached context | |||
const variableModel = this.getContrib<ChatDynamicVariableModel>(ChatDynamicVariableModel.ID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the ChatWidget calls to a contrib directly, then it's no longer a "contrib". The point of that model is to have something that can implement a feature on top of an object using its public interface, but is totally decoupled from the object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So either this should no longer be a 'contrib' or something else needs to change, I think I need to understand the overall thing better to decide.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed to get to the variables
list to extend the attached context with the resolved nested file references below.
Maybe there is a better place of doing this, or a better place to add the nested file references to? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to ChatInputPart.getAttachedAndImplicitContext()
since we add the nested file references as an implicit context atm 🚀 Let me know if that works better.
/** | ||
* A file reference token inside a prompt. | ||
*/ | ||
export class FileReference extends BaseToken { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure workbench/common
is the right place for these files, it all seems specific to the chat feature, so should it stay under workbench/contrib/chat/...
?
* ); | ||
* ``` | ||
*/ | ||
export const randomInt = (max: number, min: number = 0): number => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be a good one for src/vs/base/common/numbers.ts
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pointer! I definitely need tips on the files structure 🤗
} | ||
|
||
if (typeof value === 'string') { | ||
return value.trim().toLowerCase() === 'true'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this also expecting a string vs boolean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The returned value is unknown
and we provide the JSON
view of the config, so users may add the "true"
(or even "True"
) value there and get confused? Or we already parse it to a correct boolean
value under the hood?
@@ -51,7 +51,9 @@ class InputEditorDecorations extends Disposable { | |||
|
|||
this.updateInputEditorDecorations(); | |||
this._register(this.widget.inputEditor.onDidChangeModelContent(() => this.updateInputEditorDecorations())); | |||
this._register(this.widget.onDidChangeParsedInput(() => this.updateInputEditorDecorations())); | |||
this._register(this.widget.onDidChangeParsedInput(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is necessarily an improvement, we use the single-line version a lot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a readability improvement in the sense it makes it more explicit, hence take less brain cycles to parse the syntax.
But the main reason for the change is not readability - it takes less keystrokes to change this code in the future, and you can set a breakpoint immediately during debugging (I think this change is actually a leftover of me debugging that callback 🤷).
Do you know what is the main reason for using this pattern in the codebase? Is it mostly less typing or some motives involved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I use the original style because it's more compact.
} | ||
|
||
// get all file references in the file contents | ||
const references = await this.codec.decode(fileStream.value).consumeAll(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering whether this needs to be a stream? Do you use it as a stream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's just nice to have the stream since you are working from a file stream, just wondering if there is a case where you would use it that way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the amount of data we need to parse might be substantial and because filesystem IO is involved, I thought it is important to keep it as a stream so we can do the parsing concurrently.
I have a follow-up task to make the logic here take leverage of that. Beside this line, the decoders are composed(e.g., PromptFileDecoder
uses SimpleTokenDecoder
) hence fully take advantage of the fact that an underlying decoder is a stream of messages 🚀
I can't get this to work- I set the setting, should I see intellisense for Also, it seems like this might break normal implicit context? |
…the first unit test
…ble test utilities
…or the `PromptFileReference` class
…ence-recursion-proof and add approptiate tests
…lying data source
…eference object only if `prompt-snippets` config value is set
…ile reference object only if `prompt-snippets` config value is set
356d78b
to
dc90f20
Compare
You mean autocompletion? Yeah it should work just fine, I don't recall it to broken at any point 🤔
It was broken for a bit a while back but should work just fine now, please try again! |
dom.$( | ||
'span.chat-implicit-hint', | ||
undefined, | ||
`Current file${this.getReferencesSuffix()}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be localized
this.widget.refreshParsedInput(); | ||
})); | ||
// make sure the variable is updated on filesystem changes | ||
variable.addFilesystemListeners(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit surprised by this, usually the object manages its listeners internally to keep the public interface minimal.
*/ | ||
private updateVariableTexts(): void { | ||
for (const variable of this._variables) { | ||
const text = `#file:${variable.filenameWithReferences}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked the idea of having a distinct file extension for these files, which would also enable implementing attaching prompt snippets as a separate variable type instead of overloading the existing #file:
variable and implicit context. I didn't see discussion of this route in the issue you posted but let me know if that was already evaluated as an option and dismissed. For context, we have #kb:
for GitHub knowledge bases when using @github
. I think that would result in a simpler and more contained implementation of this functionality, and also as a user I personally wouldn't expect to pick a prompt snippet through #file
.
@@ -51,7 +51,9 @@ class InputEditorDecorations extends Disposable { | |||
|
|||
this.updateInputEditorDecorations(); | |||
this._register(this.widget.inputEditor.onDidChangeModelContent(() => this.updateInputEditorDecorations())); | |||
this._register(this.widget.onDidChangeParsedInput(() => this.updateInputEditorDecorations())); | |||
this._register(this.widget.onDidChangeParsedInput(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I use the original style because it's more compact.
|
||
prompt.parts | ||
.forEach((part, i) => { | ||
public async resolveVariables( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also recommend scoping PRs down where possible, large PRs with many changes make it hard on the reviewers and ultimately delay landing features.
* Decoder for the common chatbot prompt message syntax. | ||
* For instance, the file references `#file:./path/file.md` are handled by this decoder. | ||
*/ | ||
export class ChatbotPromptDecoder extends BaseDecoder<TChatbotPromptToken, TSimpleToken> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nit, but we don't use chatbot
anywhere else in this codebase. For consistency with all our other chat features, I would prefer that we renamed chatbot
to just chat
in the contents of this file as well as in the directory and file names.
Related issue: https://github.com/microsoft/vscode-internalbacklog/issues/5220#issuecomment-2486738649