Provide richer information when backend asks about active editor #2132

jennybc · 2024-01-25T06:50:46Z

Part of #1312. Builds on the POC that was included in #2019. In particular, last_active_editor_context is expanded from providing just a filepath to essentially all of the data members of vscode.TextEditor (or, at least, all of the ones I think are conceivably useful to a backend). This means we get coordinates and text for all of the current selections, as well as the entire contents of the editor.

Companion PRs:
posit-dev/ark#213
https://github.com/posit-dev/positron-python/pull/329

positron/comms/ui-frontend-openrpc.json

jennybc · 2024-01-30T23:14:23Z

positron/comms/ui-frontend-openrpc.json

+	],
+	"components": {
+		"schemas": {
+			"uiTextDocument": {


I figure I have to invent a name for this schema. The rationale for uiTextDocument is that this schema is based on vscode.TextDocument and it exists for use in our "ui" comms.

All names in these contracts should be in snake_case

I have done this now.

I'm not sure we need a comm prefix

I've removed the 'ui' prefix as well. It means we have, for example, positron.TextDocument that is ... shall we say, inspired-by-but-not-the-same-as vscode.TextDocument. But I guess this is fine (?).

positron/comms/ui-frontend-openrpc.json

jennybc · 2024-01-25T06:57:20Z

src/positron-dts/positron.d.ts

+	// Related to https://github.com/posit-dev/positron/issues/12
+	type TEC = import('./ui-comm').TextEditorContext;
+	export type TextEditorContext = TEC;
+


I really don't want to manually copy/paste/tweak the contents of the automatically generated src/vs/workbench/services/languageRuntime/common/positronUiComm.ts here. Consider this an opening move towards a possible future where generate-comms.ts also places a file in src/positron-dts/ that is brought into the positron namespace kinda like I'm doing here. ~~I'm probably not doing this in an optimal way but~~ I was elated to just find a way to keep these in a separate file!

The current, working state of this experiment (and this PR) is that I copy/paste the necessary excerpt of src/vs/workbench/services/languageRuntime/common/positronUiComm.ts into src/positron-dts/ui-comm.d.ts. The two lines above are as close as I can come to inlining that file. This StackOverflow answer was extremely helpful (https://stackoverflow.com/a/51114250) in understanding what is and is not possible in a so-called ambient module declaration. AFAICT this is the only way to import types into a global module declaration.

Our long term hopes are to rationalize this further (#12, #265), but I accept that this is as far as we can come in this PR.

Interesting, thanks for that SO link!

jennybc · 2024-01-25T06:58:10Z

src/positron-dts/positron.d.ts

-	export interface TextEditorContext {
-		/** URI of the edited document */
-		readonly path: string;
-	}


This is the direct inlining that felt OK when the interface contains a single string (readonly path: string). But it feels very unsustainable when dealing with the hundreds of automatically generated lines needed to bring in the interfaces I'm adding here.

jennybc · 2024-01-25T06:59:57Z

src/tsconfig.json

@@ -32,6 +32,7 @@
 		// --- Start Positron ---
 		"./vs/**/*.tsx",
 		"positron-dts/positron.d.ts",
+		"positron-dts/ui-comm.d.ts",


Necessary because I keep the new interfaces in their own file.

jennybc · 2024-01-25T07:00:13Z

src/vs/workbench/api/common/positron/extHostMethods.ts

@@ -40,7 +40,7 @@ export class ExtHostMethods implements extHostProtocol.ExtHostMethodsShape {
 	async call(method: UiFrontendRequest, params: Record<string, any>): Promise<JsonRpcResponse> {
 		try {
 			if (!Object.values(UiFrontendRequest).includes(method)) {
-				return <JsonRpcError> {
+				return <JsonRpcError>{


Automatic formatting did this.

jennybc · 2024-01-25T07:00:45Z

src/vs/workbench/services/frontendMethods/common/editorContext.ts

-interface ITextEditorContext {
-	readonly path: string;
-}
+type ITextEditorContext = EditorContextResult;


So many copies! But at least this one is just an alias.

lionel- · 2024-01-25T07:22:40Z

src/positron-dts/positron.d.ts

+	// * Reduce the manual proliferation of these generated types.
+	// * Ideally a file is meant to edited by humans or by robots, but not both.
+	// Related to https://github.com/posit-dev/positron/issues/12
+	type TEC = import('./ui-comm').TextEditorContext;


I was initially assuming we'd add a fenced area in positron.d.ts that generate-comms.ts would update, to preserve the file self-contained just like vscode.d.ts is.

Do we risk making imports management a bit harder in extensions? Currently they are copying that interface file locally and will now have to import two of them and remember to keep both in sync.

That said, eventually these types will probably be exported in an npm package for extensions to use, e.g. like https://www.npmjs.com/package/@types/vscode. When I jump to definition from import * as vscode from 'vscode';, I end up in a node_modules folder where the vscode.d.ts file has become an index.d.ts file inside a vscode folder. So I guess in that configuration multiple files will work out transparently (I'm a web dev neophyte so I'm not sure).

These are all good points, so my main goal is to encourage us to discuss and figure this out. I think we do have lots of evidence from R package development that it's beneficial to draw a hard line between generated files and human-curated files, whenever possible. But if we can't do that here or choose not to, yeah we should fence off a special area in positron.d.ts.

I've still got this in a separate file for the moment.

assuming we'd add a fenced area in positron.d.ts

If I did that, I'd be adding >100 lines which is >10 of that file. Then there's the higher degree of difficulty of managing a file authored by both humans and computers. And I think this problem will just get worse. But I will put this directly in positron.d.ts if that's what folks want.

Do we risk making imports management a bit harder in extensions? Currently they are copying that interface file locally ...

So I see this in positron-python (https://github.com/posit-dev/positron-python/blob/98da4f51910f36c1d898a96461acdf19b662d19c/positron-dts/positron.d.ts), but not in positron-r. Why do we need to do that in positron-python but not positron-r? Is this also a manual copy paste into positron-python?

My question about why positron-python maintains its own copy of positron.d.ts is answered here: https://github.com/posit-dev/positron-python/pull/329/files#r1472174669

jennybc · 2024-01-30T23:14:23Z

positron/comms/ui-frontend-openrpc.json

+	],
+	"components": {
+		"schemas": {
+			"uiTextDocument": {


I figure I have to invent a name for this schema. The rationale for uiTextDocument is that this schema is based on vscode.TextDocument and it exists for use in our "ui" comms.

jennybc · 2024-01-30T23:15:51Z

positron/comms/ui-frontend-openrpc.json

+						"type": "string",
+						"description": "URI of the resource viewed in the editor"
+					},
+					"eol": {


I tried to make this an actual enum, but it created lots of problems with types. So eol is a simple string now and it will be either \n or \r\n.

Yeah, I think the only reasonable way to make this a "real" enum would be to use enum values like "windows" and "unix" and then replace them with the physical sequences after/before serialization. Definitely not a high priority change.

positron/comms/ui-frontend-openrpc.json

jennybc · 2024-01-30T23:28:38Z

src/positron-dts/positron.d.ts

+	// Related to https://github.com/posit-dev/positron/issues/12
+	type TEC = import('./ui-comm').TextEditorContext;
+	export type TextEditorContext = TEC;
+


The current, working state of this experiment (and this PR) is that I copy/paste the necessary excerpt of src/vs/workbench/services/languageRuntime/common/positronUiComm.ts into src/positron-dts/ui-comm.d.ts. The two lines above are as close as I can come to inlining that file. This StackOverflow answer was extremely helpful (https://stackoverflow.com/a/51114250) in understanding what is and is not possible in a so-called ambient module declaration. AFAICT this is the only way to import types into a global module declaration.

Our long term hopes are to rationalize this further (#12, #265), but I accept that this is as far as we can come in this PR.

src/positron-dts/positron.d.ts

jennybc · 2024-01-30T23:32:24Z

src/positron-dts/ui-comm.d.ts

+ *--------------------------------------------------------------------------------------------*/
+
+//
+// Copied from src/vs/workbench/services/languageRuntime/common/positronUiComm.ts; do not edit.


This is a verbatim copy and paste of the relevant interfaces from positronUiComm.ts. Maybe one day generate-comms.ts will eliminate the need for this (#265, #12).

The most recent renaming really confirms that it's no fun to manually maintain this copy.

jennybc · 2024-01-30T23:38:13Z

src/vs/workbench/api/common/positron/extHostMethods.ts

 import { JsonRpcErrorCode } from 'vs/workbench/services/languageRuntime/common/positronBaseComm';
-
+import { EndOfLine } from '../extHostTypeConverters';


The end-of-line enum is surprisingly tricky to handle! It is defined in vscode.d.ts but you can't import from vscode in the extension host. I learned that the hard way. You don't see any PROBLEMS but the extension host becomes DOA in dev positron.

I assume that this explains the existence of all the type conversion machinery in https://github.com/posit-dev/positron/blob/main/src/vs/workbench/api/common/extHostTypeConverters.ts, which is what I use below.

jennybc · 2024-01-30T23:59:59Z

src/vs/workbench/api/common/positron/extHostMethods.ts

+		// The gymnastics here are so that we return character positions with respect to
+		// Unicode code points. Otherwise, the native Position type provides offsets with respect to
+		// UTF-16 encoded text. That would be confusing for downstream consumers, who probably
+		// ultimately receive this text as UTF-8 and want to operate on this text in terms of
+		// as user-perceivable "characters". This only matters when the selection's neighborhood
+		// includes Unicode characters in the astral plane.


This produces position data that is (a) actually correct and (b) makes the most sense in R. If these are the active selections:

here's the data we expose via rstudioapi::getActiveDocumentContext():

> f <- function(sel) { + x <- list( + text = sel$text, + `nchar(text)` = nchar(sel$text), + `range` = glue::glue(' + [{sel$range$start["row"]}, {sel$range$start["column"]}] --\\ + [{sel$range$end["row"]}, {sel$range$end["column"]}]'), + `end column - start column` = + sel$range$end['column'] - sel$range$start['column'], + `strsplit(text, split = '')[[1]]` = strsplit(sel$text, split = '')[[1]], + `substr(text, 1, 1)` = substr(sel$text, 1, 1), + `substr(text, 2, 2)` = substr(sel$text, 2, 2) + ) + fr <- function(x) format(x, justify = "right") + fl <- function(x) format(x, justify = "left") + print(glue::glue("{fr(names(x))}: {fl(x)}")) + print(glue::glue("\n")) + invisible() + } > purrr::walk(x$selection, f) text: 🤖🌷 nchar(text): 2 range: [1, 1] --[1, 3] end column - start column: 2 strsplit(text, split = '')[[1]]: 🤖, 🌷 substr(text, 1, 1): 🤖 substr(text, 2, 2): 🌷 text: 😱 nchar(text): 1 range: [2, 1] --[2, 2] end column - start column: 1 strsplit(text, split = '')[[1]]: 😱 substr(text, 1, 1): 😱 substr(text, 2, 2): text: ?💅 nchar(text): 2 range: [3, 1] --[3, 3] end column - start column: 2 strsplit(text, split = '')[[1]]: ? , 💅 substr(text, 1, 1): ? substr(text, 2, 2): 💅 text: Ab nchar(text): 2 range: [4, 1] --[4, 3] end column - start column: 2 strsplit(text, split = '')[[1]]: A, b substr(text, 1, 1): A substr(text, 2, 2): b

I like the way the reported range is consistent with R's notion of the selection length and positions, as reported by nchar() and substr().

Note that things get a little tricky in the presence of an Emoji ZWJ Sequence. But the data we return is still consistent with what R does.

> x <- rstudioapi::getActiveDocumentContext() > purrr::walk(x$selection, f) text: 🤷‍♀️😭 nchar(text): 5 range: [1, 1] --[1, 6] end column - start column: 5 strsplit(text, split = '')[[1]]: 🤷, ‍ , ♀ , ️ , 😭 substr(text, 1, 1): 🤷 substr(text, 2, 2): ‍ text: 🤖🌷 nchar(text): 2 range: [2, 1] --[2, 3] end column - start column: 2 strsplit(text, split = '')[[1]]: 🤖, 🌷 substr(text, 1, 1): 🤖 substr(text, 2, 2): 🌷

Note that the "woman shrugging" emoji is is a ZWJ sequence combining 🤷 Person Shrugging, 🏻 Light Skin Tone, Zero Width Joiner and ♀️ Female Sign.(https://emojipedia.org/female-sign). Our range suggests the first line contains 5 characters which matches nchar(). To do any better, all the tools involved would need an awareness of Unicode normalization and of graphemes. I'm not pursuing that at this point.

That sounds good. IIUC, the Javascript string iterator and the R string utilities are aligned in the way they interpret unicode characters, which seems like a happy outcome.

If we want to do better in the future for other backends or specialised consumers of rstudioapi, we could add a parameter to the document request to require positions and ranges in normalised unicode and whatnots.

IIUC, the Javascript string iterator and the R string utilities are aligned in the way they interpret unicode characters, which seems like a happy outcome

Yes ☝️ that's a good summary.

jennybc · 2024-01-31T00:01:58Z

src/vs/workbench/api/common/positron/extHostMethods.ts

+		// The selections in this text editor. The primary selection is always at index 0.
+		//
+		// The gymnastics here are so that we return character positions with respect to
+		// Unicode code points. Otherwise, the native Position type provides offsets with respect to


I found this helpful: JavaScript has a Unicode problem.

Another good resource that @lionel- found that supports this approach: https://jupyter-client.readthedocs.io/en/latest/messaging.html#notes

Maybe link to the Jupyter notes in the comment?

I found this helpful: JavaScript has a Unicode problem.

oh my

🔗 to the Jupyter notes is now added to the comment

jennybc · 2024-01-31T00:04:51Z

src/vs/workbench/api/common/positron/extHostMethods.ts

+
+		// it's surprisingly fiddly to finesse this vscode.EndOfLife enum, which we don't have
+		// direct access to here
+		const eolSequenceEnum = EndOfLine.from(editor.document.eol);


This is where I use existing extension host machinery to get usable information out of a vscode.EndOfLine enum.

src/positron-dts/positron.d.ts

positron/comms/ui-frontend-openrpc.json

+	],
+	"components": {
+		"schemas": {
+			"uiTextDocument": {


positron/comms/ui-frontend-openrpc.json

lionel- · 2024-01-31T08:31:24Z

src/positron-dts/positron.d.ts

+	// Related to https://github.com/posit-dev/positron/issues/12
+	type TEC = import('./ui-comm').TextEditorContext;
+	export type TextEditorContext = TEC;
+


Interesting, thanks for that SO link!

lionel- · 2024-01-31T08:48:51Z

src/vs/workbench/api/common/positron/extHostMethods.ts

+		// The selections in this text editor. The primary selection is always at index 0.
+		//
+		// The gymnastics here are so that we return character positions with respect to
+		// Unicode code points. Otherwise, the native Position type provides offsets with respect to


Maybe link to the Jupyter notes in the comment?

I found this helpful: JavaScript has a Unicode problem.

oh my

lionel- · 2024-01-31T08:58:40Z

src/vs/workbench/api/common/positron/extHostMethods.ts

+		// The gymnastics here are so that we return character positions with respect to
+		// Unicode code points. Otherwise, the native Position type provides offsets with respect to
+		// UTF-16 encoded text. That would be confusing for downstream consumers, who probably
+		// ultimately receive this text as UTF-8 and want to operate on this text in terms of
+		// as user-perceivable "characters". This only matters when the selection's neighborhood
+		// includes Unicode characters in the astral plane.


That sounds good. IIUC, the Javascript string iterator and the R string utilities are aligned in the way they interpret unicode characters, which seems like a happy outcome.

If we want to do better in the future for other backends or specialised consumers of rstudioapi, we could add a parameter to the document request to require positions and ranges in normalised unicode and whatnots.

jmcphers · 2024-01-31T19:42:43Z

positron/comms/ui-frontend-openrpc.json

+						"type": "string",
+						"description": "URI of the resource viewed in the editor"
+					},
+					"eol": {


Yeah, I think the only reasonable way to make this a "real" enum would be to use enum values like "windows" and "unix" and then replace them with the physical sequences after/before serialization. Definitely not a high priority change.

jennybc commented Jan 25, 2024

View reviewed changes

jennybc mentioned this pull request Jan 25, 2024

More editor context posit-dev/ark#213

Merged

lionel- reviewed Jan 25, 2024

View reviewed changes

Provide richer information when backend asks about active editor

3b6ff3d

This comment was marked as resolved.

Sign in to view

Merge branch 'main' into more-editor-context

34ce140

jennybc force-pushed the more-editor-context branch from 304175e to 34ce140 Compare January 26, 2024 19:09

Report character position wrt to Unicode code points

5cb7f6c

This comment was marked as outdated.

Sign in to view

jennybc added 3 commits January 30, 2024 13:30

Merge branch 'main' into more-editor-context

2e1ffd2

Expose document contents and line endings

107ea2d

Reuse interface names(s), since it seems to be technically possible

622c9bd

jennybc commented Jan 31, 2024

View reviewed changes

jennybc marked this pull request as ready for review January 31, 2024 00:13

jennybc requested review from jmcphers and lionel- January 31, 2024 00:13

More name tidying

d4ee9d1

lionel- reviewed Jan 31, 2024

View reviewed changes

jennybc added 4 commits January 31, 2024 10:43

Drop the 'Ui' prefix on the schemas (and, therefore, the types)

31b64f2

Use snake_case in the schemas (and, therefore, in the types)

9412609

Add link to comment

17ce1a0

Drop '_result' from schema (and, therefore, 'Result' from types)

a9775d8

jennybc force-pushed the more-editor-context branch from c810596 to a9775d8 Compare January 31, 2024 19:34

jmcphers approved these changes Jan 31, 2024

View reviewed changes

jennybc added 3 commits January 31, 2024 14:09

Bump positron-python

7697fe7

Bump ark version

6cd8da2

Merge remote-tracking branch 'origin/main' into more-editor-context

08f4094

jennybc merged commit e187d12 into main Jan 31, 2024
1 check passed

jennybc deleted the more-editor-context branch January 31, 2024 22:14

jennybc mentioned this pull request Jan 31, 2024

Feature Request: Shims for {rstudioapi} #1312

Closed

jennybc mentioned this pull request Mar 28, 2024

UI Comm: Order of Position fields #2555

Open

juliasilge mentioned this pull request Mar 29, 2024

OpenRPC for cursors/ranges rstudioapi shims #2582

Merged

		import { JsonRpcErrorCode } from 'vs/workbench/services/languageRuntime/common/positronBaseComm';

		import { EndOfLine } from '../extHostTypeConverters';

Provide richer information when backend asks about active editor #2132

Provide richer information when backend asks about active editor #2132

Conversation

jennybc commented Jan 25, 2024 • edited Loading

This comment was marked as outdated.

Choose a reason for hiding this comment

This comment was marked as resolved.

Choose a reason for hiding this comment

jennybc Jan 25, 2024 • edited Loading

Choose a reason for hiding this comment

jennybc Jan 30, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jennybc Jan 25, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as outdated.

Choose a reason for hiding this comment

jennybc Jan 30, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jennybc Jan 30, 2024 • edited Loading

Choose a reason for hiding this comment

jennybc Jan 30, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jennybc Jan 30, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

This comment was marked as resolved.

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jennybc commented Jan 25, 2024 •

edited

Loading

jennybc Jan 25, 2024 •

edited

Loading

jennybc Jan 30, 2024 •

edited

Loading

jennybc Jan 25, 2024 •

edited

Loading

jennybc Jan 30, 2024 •

edited

Loading

jennybc Jan 30, 2024 •

edited

Loading

jennybc Jan 30, 2024 •

edited

Loading

jennybc Jan 30, 2024 •

edited

Loading