add prompt snippets support #234220

legomushroom · 2024-11-19T21:34:10Z

Related issue: https://github.com/microsoft/vscode-internalbacklog/issues/5220#issuecomment-2486738649

src/vs/base/common/stream.ts

src/vs/workbench/contrib/chat/browser/attachments/implicitContextAttachment.ts

src/vs/workbench/contrib/chat/browser/chatInputPart.ts

legomushroom · 2024-11-19T21:54:22Z

src/vs/workbench/contrib/chat/browser/chatVariables.ts

-
-		prompt.parts
-			.forEach((part, i) => {
+	public async resolveVariables(


Mostly readability improvements in this file, the logic remains the same.

Thanks, but I'd prefer to keep PRs focused more narrowly, and the word "mostly" makes it a little harder to review :)

Yeah, sorry, it was very hard to read even on a 27" monitor, so I was formatting it on the go as I was parsing through the code, and then decided to keep it. The same applies to the other cases where I split long lines. Let me know if you think that reverting this would help with the review at this point 👂

I also recommend scoping PRs down where possible, large PRs with many changes make it hard on the reviewers and ultimately delay landing features.

legomushroom · 2024-11-19T21:55:25Z

src/vs/workbench/contrib/chat/browser/chatVariables.ts

+		// to go first so that an replacement logic is simple
+		resolvedVariables
+			.sort((left, right) => {
+				assertDefined(


These asserts are to omit TS hacks ☺️

I'm fine with the ! because we know that these are parts of a prompt and have ranges, but the assert is fine too

If they always have ranges, we need to update the TS types then?
The asserts just add more readable error messages here, as opposed to the cryptic cannot read 'start' of 'undefined' you'd get with the bang ☺️

legomushroom · 2024-11-19T21:58:19Z

src/vs/workbench/contrib/chat/browser/contrib/chatDynamicVariables.ts

 	) {
 		super();
 		this._register(widget.inputEditor.onDidChangeModelContent(e => {
 			e.changes.forEach(c => {
 				// Don't mutate entries in _variables, since they will be returned from the getter
-				this._variables = coalesce(this._variables.map(ref => {
+				this._variables = coalesce(this._variables.map((ref) => {
+					if (c.text === `#file:${ref.filenameWithReferences}`) {


That's the only way I could make it work, and it's not ideal. Any better ideas? cc @roblourens @joyceerhl

What is this trying to do? Is this triggered when that text is inserted all at once?

When we add the (+N more) label to a filename after nested references are resolved, this handler runs and, as a result, the variables are removed. The logic below infers that the variable text has changed, hence the variables are no longer valid and need to be removed.

Because we update the variable objects when we update the labels, this if check works: if the inserted text is exactly the text the variable should have, do nothing.
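For readers following along, the check described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: only the `#file:` prefix and `filenameWithReferences` come from the diff, while the interfaces, the offset-based ranges, and `keepVariableSketch` are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface IVariableSketch {
	filenameWithReferences: string;
	start: number; // inclusive offset of the variable text
	end: number;   // exclusive offset of the variable text
}

interface IContentChangeSketch {
	text: string;  // text inserted by the edit
	start: number; // inclusive offset of the replaced range
	end: number;   // exclusive offset of the replaced range
}

/** Decides whether a variable survives a single content change. */
function keepVariableSketch(variable: IVariableSketch, change: IContentChangeSketch): boolean {
	// The edit is the variable's own label being rewritten (e.g. after
	// "(+N more)" was appended), so the variable is still valid.
	if (change.text === `#file:${variable.filenameWithReferences}`) {
		return true;
	}
	// Assumption: any other edit overlapping the variable text invalidates it.
	const overlaps = change.start < variable.end && change.end > variable.start;
	return !overlaps;
}
```

The early return is the part under discussion: without it, the label update itself would look like a user edit inside the variable and the variable would be dropped.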

legomushroom · 2024-11-19T22:02:17Z

src/vs/workbench/contrib/chat/browser/contrib/chatDynamicVariables.ts

-								endLineNumber: ref.range.endLineNumber,
-								endColumn: ref.range.endColumn + delta
-							}
+						ref.range = {


@roblourens do you recall the main reason for the // Don't mutate entries in _variables.. comment above? Because the ref is now a class instance with event listeners, I do need to mutate the variables here 🤔

Because those objects will have been previously returned, and other objects will be holding on to them, and not expecting them to change. Maybe there is somewhere that we diff the previous object with the new one, and mutating the object breaks that. There are other ways we can solve that. But I don't quite understand the problem for you

Maybe there is somewhere that we diff the previous object with the new one

But that would break only if the diff logic relies on the object pointers, doing diffing by-attribute should be unaffected in either case?

There is no real problem for me, I just wanted to make sure that mutating the objects is fine here. We do need to mutate them now because the ref variable is a class instance rather than the simple object it was before. The class instance can have event listeners registered on it by other code, and it handles nested file reference resolution, which we don't want to redo over and over 🤷
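To make the trade-off in this exchange concrete, here is a small sketch of how identity-based and attribute-based comparisons react to copy-on-write versus in-place updates. All names are hypothetical and nothing here is taken from the VS Code codebase. One observation, offered tentatively: an attribute comparison only notices an in-place mutation if the consumer kept a copied snapshot rather than the original object.

```typescript
// Hypothetical reference shape for illustration only.
interface IRefSketch {
	id: string;
	endColumn: number;
}

const sameByIdentity = (a: IRefSketch, b: IRefSketch): boolean => a === b;
const sameByAttribute = (a: IRefSketch, b: IRefSketch): boolean =>
	a.id === b.id && a.endColumn === b.endColumn;

const original: IRefSketch = { id: 'file-1', endColumn: 10 };
const heldByConsumer = original;              // consumer keeps the returned object
const snapshot: IRefSketch = { ...original }; // consumer keeps a copy of the values

// Copy-on-write update: a new object replaces the old one.
const replaced: IRefSketch = { ...original, endColumn: 12 };

// In-place update: the same object is mutated.
original.endColumn = 12;

// Identity comparison notices the replacement but not the mutation.
const identitySeesReplacement = !sameByIdentity(heldByConsumer, replaced);
const identitySeesMutation = !sameByIdentity(heldByConsumer, original);

// Attribute comparison notices the mutation only against a copied snapshot.
const attributeSeesMutationViaSnapshot = !sameByAttribute(snapshot, original);
const attributeSeesMutationViaHeldRef = !sameByAttribute(heldByConsumer, original);
```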

src/vs/workbench/contrib/chat/common/chatRequestParser.ts

src/vs/workbench/contrib/chat/common/chatParserTypes.ts

roblourens

Not done reviewing but wanted to drop the first round of comments. Overall, when I'm working on a large feature, I usually try to start pushing small PRs to main as early as possible. That's the best way to avoid merge conflicts and get feedback early. Don't be afraid of pushing an incomplete feature behind a hidden setting, as long as it won't break anything. We do that often on this team. I also would rather avoid PRs that make multiple unrelated changes. It just gets a bit hard to review.

roblourens · 2024-11-20T09:40:52Z

src/vs/base/common/assertDefined.ts

+ * assertDefined(null, new Error('Should throw this error.'))
+ * ```
+ */
+export function assertDefined<T>(value: T, error: string | NonNullable<Error>): asserts value is NonNullable<T> {


We have this in types.ts, assertIsDefined

This one is a little bit different since it is an assertion function and doesn't require variable reassignment. Let me see if I can use the existing one without needing to create a let variable just for the assertion 🤔
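The difference being discussed can be sketched like this. Both helpers below are hypothetical stand-ins written for illustration: the first mirrors the `asserts value is NonNullable<T>` signature shown in the diff, the second is a generic value-returning variant and is not claimed to match the existing helper in types.ts.

```typescript
// Hypothetical stand-ins for illustration; not the actual VS Code helpers.

/** Assertion-signature variant: narrows `value` in place or throws. */
function assertDefinedSketch<T>(value: T, error: string | Error): asserts value is NonNullable<T> {
	if (value === null || value === undefined) {
		throw typeof error === 'string' ? new Error(error) : error;
	}
}

/** Value-returning variant: the caller must capture the narrowed result. */
function requireDefinedSketch<T>(value: T | null | undefined): T {
	if (value === null || value === undefined) {
		throw new Error('Assertion failed: value is null or undefined.');
	}
	return value;
}

interface IRangeSketch { start: number; end: number }
interface IPartSketch { range?: IRangeSketch }

function startWithAssert(part: IPartSketch): number {
	assertDefinedSketch(part.range, 'Prompt part must have a range.');
	// `part.range` is narrowed here: no `!` and no extra variable needed.
	return part.range.start;
}

function startWithReturn(part: IPartSketch): number {
	// The narrowed value only exists through the returned variable.
	const range = requireDefinedSketch(part.range);
	return range.start;
}
```

With the assertion signature, a missing range fails with the message supplied at the call site rather than a generic property-access error.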

roblourens · 2024-11-20T09:41:22Z

src/vs/base/common/stream.ts

@@ -186,7 +186,7 @@ export interface ITransformer<Original, Transformed> {
 	error?: IErrorTransformer;
 }

-export function newWriteableStream<T>(reducer: IReducer<T>, options?: WriteableStreamOptions): WriteableStream<T> {
+export function newWriteableStream<T>(reducer: IReducer<T> | null, options?: WriteableStreamOptions): WriteableStream<T> {


Optional argument? We generally use undefined over null

I used null on purpose to make it explicit and prevent subtle silent errors. It is too easy to overlook a variable or a function return value being undefined and pass it along, hence I wanted to make sure the folks using this are very specific.

I do appreciate the work that TS has done with void and not treating foo?: string and foo: undefined | string the same anymore, but there are still cases when requiring an explicit null saves some hair on your head 🤗 Do you think it would be fine with undefined? You have more context, so I'm happy to revert if you think it won't be a pain in the future.
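A rough sketch of the argument, assuming `strictNullChecks` is enabled. The function names, the `ReducerSketch` type, and the returned labels are hypothetical and say nothing about what `newWriteableStream` does with a missing reducer.

```typescript
// Hypothetical types and functions for illustration only.
type ReducerSketch<T> = (chunks: T[]) => T;

/** Optional parameter: `undefined` is accepted silently. */
function withOptionalReducer<T>(reducer?: ReducerSketch<T>): string {
	return reducer ? 'has reducer' : 'no reducer';
}

/** Explicit `null`: `undefined` is rejected by the type checker. */
function withExplicitReducer<T>(reducer: ReducerSketch<T> | null): string {
	return reducer === null ? 'no reducer' : 'has reducer';
}

// A lookup that can quietly produce `undefined`, e.g. because of a wrong key.
const reducersSketch: Record<string, ReducerSketch<string> | undefined> = {
	join: chunks => chunks.join(''),
};
const missing = reducersSketch['concat'];

// Compiles and quietly takes the "no reducer" path.
const optionalResult = withOptionalReducer(missing);

// Passing `missing` directly would not compile here, so the caller has to
// decide explicitly what a missing reducer means.
const explicitResult = withExplicitReducer(missing ?? null);
```

Both calls end up on the same path at runtime; the difference is that the second one forces the decision to be visible at the call site.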

src/vs/workbench/contrib/chat/browser/attachments/implicitContextAttachment.ts

roblourens · 2024-11-20T09:51:47Z

src/vs/workbench/contrib/chat/browser/chatInputPart.ts

 					state: WorkingSetEntryState.Attached,
 					kind: 'reference',
 				});
+
+				seenEntries.add(child);


This seems like a correct fix (cc @joyceerhl) but is this method even related to your new feature? I prefer not grouping unrelated fixes into a PR that's already really big

This one relates to my feature only: the child references are the new logic. The existing logic is now on line 1086 above, and that one would still need the fix to be applied.

roblourens · 2024-11-20T09:53:27Z

src/vs/workbench/contrib/chat/browser/chatVariables.ts

-
-		prompt.parts
-			.forEach((part, i) => {
+	public async resolveVariables(


Thanks, but I'd prefer to keep PRs focused more narrowly, and the word "mostly" makes it a little harder to review :)

roblourens · 2024-11-20T09:59:27Z

src/vs/workbench/contrib/chat/browser/chatWidget.ts

@@ -193,7 +195,18 @@ export class ChatWidget extends Disposable implements IChatWidget {
 				return { text: '', parts: [] };
 			}

-			this.parsedChatRequest = this.instantiationService.createInstance(ChatRequestParser).parseChatRequest(this.viewModel!.sessionId, this.getInput(), this.location, { selectedAgent: this._lastSelectedAgent });
+			assertDefined(


Seems unnecessary, this is right after a !this.viewModel check

Good point 💥 I tried to omit the ! in this.viewModel!.sessionId, but it seems like that was also unnecessary here anyway ☺️

roblourens · 2024-11-20T10:01:10Z

src/vs/workbench/contrib/chat/browser/chatWidget.ts

@@ -1046,6 +1065,29 @@ export class ChatWidget extends Disposable implements IChatWidget {
 				this.telemetryService.publicLog2<ChatEditingWorkingSetEvent, ChatEditingWorkingSetClassification>('chatEditing/workingSetSize', { originalSize: this.inputPart.attemptedWorkingSetEntriesCount, actualSize: uniqueWorkingSetEntries.size });
 			}

+			// factor in nested references of dynamic variables into the implicit attached context
+			const variableModel = this.getContrib<ChatDynamicVariableModel>(ChatDynamicVariableModel.ID);


If the ChatWidget calls to a contrib directly, then it's no longer a "contrib". The point of that model is to have something that can implement a feature on top of an object using its public interface, but is totally decoupled from the object.

So either this should no longer be a 'contrib' or something else needs to change, I think I need to understand the overall thing better to decide.

I needed to get to the variables list to extend the attached context with the resolved nested file references below.
Maybe there is a better place to do this, or a better place to add the nested file references to? 🤔

Moved to ChatInputPart.getAttachedAndImplicitContext() since we add the nested file references as an implicit context atm 🚀 Let me know if that works better.

roblourens · 2024-11-20T10:04:04Z

src/vs/workbench/common/codecs/chatbotPromptCodec/tokens/fileReference.ts

+/**
+ * A file reference token inside a prompt.
+ */
+export class FileReference extends BaseToken {


I'm not sure workbench/common is the right place for these files, it all seems specific to the chat feature, so should it stay under workbench/contrib/chat/...?

src/vs/workbench/contrib/chat/browser/promptFileReference.ts

roblourens · 2024-11-20T10:11:42Z

src/vs/workbench/contrib/chat/test/common/utils/randomInt.ts

+ * );
+ * ```
+ */
+export const randomInt = (max: number, min: number = 0): number => {


Might be a good one for src/vs/base/common/numbers.ts?

Thanks for the pointer! I definitely need tips on the file structure 🤗

roblourens · 2024-11-20T18:17:55Z

src/vs/workbench/contrib/chat/browser/promptFileReference.ts

+		}
+
+		if (typeof value === 'string') {
+			return value.trim().toLowerCase() === 'true';


Why is this also expecting a string vs boolean?

The returned value is unknown and we provide the JSON view of the config, so users may add the "true" (or even "True") value there and get confused? Or do we already parse it to a correct boolean value under the hood?
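The lenient parsing in question might look roughly like the sketch below. The string branch mirrors the line in the diff; the function name and the fallback of `false` for every other type are assumptions, since the surrounding code is not shown.

```typescript
/**
 * Hypothetical helper: interprets an `unknown` configuration value as a boolean.
 * Only the string branch is taken from the diff; the rest is assumed.
 */
function asBooleanSketch(value: unknown): boolean {
	if (typeof value === 'boolean') {
		return value;
	}
	if (typeof value === 'string') {
		// Accept "true", "True", " TRUE " and similar hand-typed variants.
		return value.trim().toLowerCase() === 'true';
	}
	// Assumption: anything else (numbers, objects, undefined) counts as disabled.
	return false;
}
```

Whether the string branch is needed at all depends on whether the configuration service already validates the value against the setting's schema, which is the open question in this thread.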

roblourens · 2024-11-20T18:21:35Z

src/vs/workbench/contrib/chat/browser/contrib/chatInputEditorContrib.ts

@@ -51,7 +51,9 @@ class InputEditorDecorations extends Disposable {

 		this.updateInputEditorDecorations();
 		this._register(this.widget.inputEditor.onDidChangeModelContent(() => this.updateInputEditorDecorations()));
-		this._register(this.widget.onDidChangeParsedInput(() => this.updateInputEditorDecorations()));
+		this._register(this.widget.onDidChangeParsedInput(() => {


I don't think this is necessarily an improvement, we use the single-line version a lot

I think it is a readability improvement in the sense that it makes it more explicit, hence it takes fewer brain cycles to parse the syntax.

But the main reason for the change is not readability: it takes fewer keystrokes to change this code in the future, and you can set a breakpoint immediately during debugging (I think this change is actually a leftover of me debugging that callback 🤷).

Do you know what the main reason for using this pattern in the codebase is? Is it mostly less typing, or are other motives involved?

FWIW I use the original style because it's more compact.

roblourens · 2024-11-20T18:29:57Z

src/vs/workbench/contrib/chat/browser/promptFileReference.ts

+		}
+
+		// get all file references in the file contents
+		const references = await this.codec.decode(fileStream.value).consumeAll();


Just wondering whether this needs to be a stream? Do you use it as a stream?

Maybe it's just nice to have the stream since you are working from a file stream, just wondering if there is a case where you would use it that way

Because the amount of data we need to parse might be substantial and because filesystem IO is involved, I thought it is important to keep it as a stream so we can do the parsing concurrently.

I have a follow-up task to make the logic here take advantage of that. Besides this line, the decoders are composed (e.g., PromptFileDecoder uses SimpleTokenDecoder), hence they fully take advantage of the fact that an underlying decoder is a stream of messages 🚀
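As a loose illustration of composed streaming decoders, the sketch below chains two async generators so that each stage consumes the previous one incrementally. It does not use the PR's `BaseDecoder`, `SimpleTokenDecoder`, or `PromptFileDecoder` classes; every name here is hypothetical, and async generators are swapped in as a stand-in for the stream types in the codebase.

```typescript
// Hypothetical stand-ins: these async generators only mimic composed decoders.

/** Source stage: emits raw text chunks, as a file stream might. */
async function* chunksSketch(parts: string[]): AsyncGenerator<string> {
	for (const part of parts) {
		yield part;
	}
}

/** First decoder: splits chunks into words, carrying partial words across chunks. */
async function* wordsSketch(source: AsyncIterable<string>): AsyncGenerator<string> {
	let pending = '';
	for await (const chunk of source) {
		const pieces = (pending + chunk).split(/\s+/);
		// The last piece may be an incomplete word, so hold it back.
		pending = pieces.pop() ?? '';
		for (const piece of pieces) {
			if (piece.length > 0) {
				yield piece;
			}
		}
	}
	if (pending.length > 0) {
		yield pending;
	}
}

/** Second decoder: built on top of the first, keeps only `#file:` references. */
async function* fileReferencesSketch(words: AsyncIterable<string>): AsyncGenerator<string> {
	const prefix = '#file:';
	for await (const word of words) {
		if (word.startsWith(prefix) && word.length > prefix.length) {
			yield word.slice(prefix.length);
		}
	}
}

/** Collects a whole stream into an array, similar in spirit to `consumeAll()`. */
async function consumeAllSketch<T>(source: AsyncIterable<T>): Promise<T[]> {
	const result: T[] = [];
	for await (const item of source) {
		result.push(item);
	}
	return result;
}
```

A caller that wants everything at once can wrap the pipeline in `consumeAllSketch(...)`, while a caller that wants to react to each reference as it arrives can iterate `fileReferencesSketch(...)` directly with `for await`.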

roblourens · 2024-11-20T18:34:13Z

I can't get this to work- I set the setting, should I see intellisense for #file? If I manually type #file:./test.js, it doesn't include that file.

Also, it seems like this might break normal implicit context?

…the first unit test

…ble test utilities

…e stream

…or the `PromptFileReference` class

…ence` class

…ence-recursion-proof and add appropriate tests

…lying data source

…eference object only if `prompt-snippets` config value is set

…ile reference object only if `prompt-snippets` config value is set

legomushroom · 2024-11-22T01:31:02Z

@roblourens

I can't get this to work- I set the setting, should I see intellisense for #file? If I manually type #file:./test.js, it doesn't include that file.

You mean autocompletion? Yeah, it should work just fine, I don't recall it being broken at any point 🤔

Also, it seems like this might break normal implicit context?

It was broken for a bit a while back but should work just fine now, please try again!

joyceerhl · 2024-11-22T01:52:34Z

src/vs/workbench/contrib/chat/browser/attachments/implicitContextAttachment.ts

+			dom.$(
+				'span.chat-implicit-hint',
+				undefined,
+				`Current file${this.getReferencesSuffix()}`,


This should be localized

joyceerhl · 2024-11-22T02:01:28Z

src/vs/workbench/contrib/chat/browser/contrib/chatDynamicVariables.ts

+				this.widget.refreshParsedInput();
+			}));
+			// make sure the variable is updated on filesystem changes
+			variable.addFilesystemListeners();


I'm a bit surprised by this, usually the object manages its listeners internally to keep the public interface minimal.

joyceerhl · 2024-11-22T02:07:58Z

src/vs/workbench/contrib/chat/browser/contrib/chatDynamicVariables.ts

+	 */
+	private updateVariableTexts(): void {
+		for (const variable of this._variables) {
+			const text = `#file:${variable.filenameWithReferences}`;


I liked the idea of having a distinct file extension for these files, which would also enable implementing attaching prompt snippets as a separate variable type instead of overloading the existing #file: variable and implicit context. I didn't see discussion of this route in the issue you posted but let me know if that was already evaluated as an option and dismissed. For context, we have #kb: for GitHub knowledge bases when using @github. I think that would result in a simpler and more contained implementation of this functionality, and also as a user I personally wouldn't expect to pick a prompt snippet through #file.

joyceerhl · 2024-11-22T02:11:28Z

src/vs/workbench/contrib/chat/browser/contrib/chatInputEditorContrib.ts

@@ -51,7 +51,9 @@ class InputEditorDecorations extends Disposable {

 		this.updateInputEditorDecorations();
 		this._register(this.widget.inputEditor.onDidChangeModelContent(() => this.updateInputEditorDecorations()));
-		this._register(this.widget.onDidChangeParsedInput(() => this.updateInputEditorDecorations()));
+		this._register(this.widget.onDidChangeParsedInput(() => {


FWIW I use the original style because it's more compact.

joyceerhl · 2024-11-22T02:14:32Z

src/vs/workbench/contrib/chat/browser/chatVariables.ts

-
-		prompt.parts
-			.forEach((part, i) => {
+	public async resolveVariables(


I also recommend scoping PRs down where possible, large PRs with many changes make it hard on the reviewers and ultimately delay landing features.

joyceerhl · 2024-11-22T02:16:59Z

src/vs/workbench/contrib/chat/common/codecs/chatbotPromptCodec/chatbotPromptDecoder.ts

+ * Decoder for the common chatbot prompt message syntax.
+ * For instance, the file references `#file:./path/file.md` are handled by this decoder.
+ */
+export class ChatbotPromptDecoder extends BaseDecoder<TChatbotPromptToken, TSimpleToken> {


This is a nit, but we don't use chatbot anywhere else in this codebase. For consistency with all our other chat features, I would prefer that we renamed chatbot to just chat in the contents of this file as well as in the directory and file names.

legomushroom requested review from roblourens and joyceerhl November 19, 2024 21:34

legomushroom self-assigned this Nov 19, 2024

vs-code-engineering bot added this to the November 2024 milestone Nov 19, 2024