-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide richer information when backend asks about active editor #2132
Conversation
], | ||
"components": { | ||
"schemas": { | ||
"uiTextDocument": { |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figure I have to invent a name for this schema. The rationale for uiTextDocument
is that this schema is based on vscode.TextDocument
and it exists for use in our "ui" comms.
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All names in these contracts should be in snake_case
I have done this now.
I'm not sure we need a comm prefix
I've removed the 'ui' prefix as well. It means we have, for example, positron.TextDocument
that is ... shall we say, inspired-by-but-not-the-same-as vscode.TextDocument
. But I guess this is fine (?).
// Related to https://github.com/posit-dev/positron/issues/12 | ||
type TEC = import('./ui-comm').TextEditorContext; | ||
export type TextEditorContext = TEC; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't want to manually copy/paste/tweak the contents of the automatically generated src/vs/workbench/services/languageRuntime/common/positronUiComm.ts
here. Consider this an opening move towards a possible future where generate-comms.ts
also places a file in src/positron-dts/
that is brought into the positron namespace kinda like I'm doing here. I'm probably not doing this in an optimal way but I was elated to just find a way to keep these in a separate file!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current, working state of this experiment (and this PR) is that I copy/paste the necessary excerpt of src/vs/workbench/services/languageRuntime/common/positronUiComm.ts
into src/positron-dts/ui-comm.d.ts
. The two lines above are as close as I can come to inlining that file. This StackOverflow answer was extremely helpful (https://stackoverflow.com/a/51114250) in understanding what is and is not possible in a so-called ambient module declaration. AFAICT this is the only way to import types into a global module declaration.
Our long term hopes are to rationalize this further (#12, #265), but I accept that this is as far as we can come in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, thanks for that SO link!
export interface TextEditorContext { | ||
/** URI of the edited document */ | ||
readonly path: string; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the direct inlining that felt OK when the interface contains a single string (readonly path: string
). But it feels very unsustainable when dealing with the hundreds of automatically generated lines needed to bring in the interfaces I'm adding here.
@@ -32,6 +32,7 @@ | |||
// --- Start Positron --- | |||
"./vs/**/*.tsx", | |||
"positron-dts/positron.d.ts", | |||
"positron-dts/ui-comm.d.ts", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Necessary because I keep the new interfaces in their own file.
@@ -40,7 +40,7 @@ export class ExtHostMethods implements extHostProtocol.ExtHostMethodsShape { | |||
async call(method: UiFrontendRequest, params: Record<string, any>): Promise<JsonRpcResponse> { | |||
try { | |||
if (!Object.values(UiFrontendRequest).includes(method)) { | |||
return <JsonRpcError> { | |||
return <JsonRpcError>{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Automatic formatting did this.
interface ITextEditorContext { | ||
readonly path: string; | ||
} | ||
type ITextEditorContext = EditorContextResult; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So many copies! But at least this one is just an alias.
src/positron-dts/positron.d.ts
Outdated
// * Reduce the manual proliferation of these generated types. | ||
// * Ideally a file is meant to edited by humans or by robots, but not both. | ||
// Related to https://github.com/posit-dev/positron/issues/12 | ||
type TEC = import('./ui-comm').TextEditorContext; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially assuming we'd add a fenced area in positron.d.ts
that generate-comms.ts
would update, to preserve the file self-contained just like vscode.d.ts
is.
Do we risk making imports management a bit harder in extensions? Currently they are copying that interface file locally and will now have to import two of them and remember to keep both in sync.
That said, eventually these types will probably be exported in an npm package for extensions to use, e.g. like https://www.npmjs.com/package/@types/vscode. When I jump to definition from import * as vscode from 'vscode';
, I end up in a node_modules
folder where the vscode.d.ts
file has become an index.d.ts
file inside a vscode
folder. So I guess in that configuration multiple files will work out transparently (I'm a web dev neophyte so I'm not sure).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are all good points, so my main goal is to encourage us to discuss and figure this out. I think we do have lots of evidence from R package development that it's beneficial to draw a hard line between generated files and human-curated files, whenever possible. But if we can't do that here or choose not to, yeah we should fence off a special area in positron.d.ts
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've still got this in a separate file for the moment.
assuming we'd add a fenced area in
positron.d.ts
If I did that, I'd be adding >100 lines which is >10 of that file. Then there's the higher degree of difficulty of managing a file authored by both humans and computers. And I think this problem will just get worse. But I will put this directly in positron.d.ts
if that's what folks want.
Do we risk making imports management a bit harder in extensions? Currently they are copying that interface file locally ...
So I see this in positron-python (https://github.com/posit-dev/positron-python/blob/98da4f51910f36c1d898a96461acdf19b662d19c/positron-dts/positron.d.ts), but not in positron-r. Why do we need to do that in positron-python but not positron-r? Is this also a manual copy paste into positron-python?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question about why positron-python maintains its own copy of positron.d.ts
is answered here: https://github.com/posit-dev/positron-python/pull/329/files#r1472174669
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
304175e
to
34ce140
Compare
This comment was marked as outdated.
This comment was marked as outdated.
], | ||
"components": { | ||
"schemas": { | ||
"uiTextDocument": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figure I have to invent a name for this schema. The rationale for uiTextDocument
is that this schema is based on vscode.TextDocument
and it exists for use in our "ui" comms.
"type": "string", | ||
"description": "URI of the resource viewed in the editor" | ||
}, | ||
"eol": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to make this an actual enum, but it created lots of problems with types. So eol
is a simple string now and it will be either \n
or \r\n
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think the only reasonable way to make this a "real" enum would be to use enum values like "windows" and "unix" and then replace them with the physical sequences after/before serialization. Definitely not a high priority change.
// Related to https://github.com/posit-dev/positron/issues/12 | ||
type TEC = import('./ui-comm').TextEditorContext; | ||
export type TextEditorContext = TEC; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current, working state of this experiment (and this PR) is that I copy/paste the necessary excerpt of src/vs/workbench/services/languageRuntime/common/positronUiComm.ts
into src/positron-dts/ui-comm.d.ts
. The two lines above are as close as I can come to inlining that file. This StackOverflow answer was extremely helpful (https://stackoverflow.com/a/51114250) in understanding what is and is not possible in a so-called ambient module declaration. AFAICT this is the only way to import types into a global module declaration.
Our long term hopes are to rationalize this further (#12, #265), but I accept that this is as far as we can come in this PR.
*--------------------------------------------------------------------------------------------*/ | ||
|
||
// | ||
// Copied from src/vs/workbench/services/languageRuntime/common/positronUiComm.ts; do not edit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The most recent renaming really confirms that it's no fun to manually maintain this copy.
import { JsonRpcErrorCode } from 'vs/workbench/services/languageRuntime/common/positronBaseComm'; | ||
|
||
import { EndOfLine } from '../extHostTypeConverters'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The end-of-line enum is surprisingly tricky to handle! It is defined in vscode.d.ts
but you can't import from vscode
in the extension host. I learned that the hard way. You don't see any PROBLEMS but the extension host becomes DOA in dev positron.
I assume that this explains the existence of all the type conversion machinery in https://github.com/posit-dev/positron/blob/main/src/vs/workbench/api/common/extHostTypeConverters.ts, which is what I use below.
// The gymnastics here are so that we return character positions with respect to | ||
// Unicode code points. Otherwise, the native Position type provides offsets with respect to | ||
// UTF-16 encoded text. That would be confusing for downstream consumers, who probably | ||
// ultimately receive this text as UTF-8 and want to operate on this text in terms of | ||
// as user-perceivable "characters". This only matters when the selection's neighborhood | ||
// includes Unicode characters in the astral plane. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This produces position data that is (a) actually correct and (b) makes the most sense in R. If these are the active selections:
![Screenshot 2024-01-30 at 3 42 25 PM](https://private-user-images.githubusercontent.com/599454/300993225-51c6b290-ad84-40b5-a31b-fa87bfba0a6a.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk3NDczNTAsIm5iZiI6MTczOTc0NzA1MCwicGF0aCI6Ii81OTk0NTQvMzAwOTkzMjI1LTUxYzZiMjkwLWFkODQtNDBiNS1hMzFiLWZhODdiZmJhMGE2YS5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE2JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxNlQyMzA0MTBaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT05MDBkYjkxZWJiNDkwYTJiNTJmZTU3ZTg0ODM2ODcwYWUzNTY0MzVkMmM4MzVlOTFhNTY2NWU5MmJlNjQxNmI3JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.WHdp7WlW9Bn1jNGtoFDNBaORKJ-ukuFi8jxbWedTlJ0)
here's the data we expose via rstudioapi::getActiveDocumentContext()
:
> f <- function(sel) {
+ x <- list(
+ text = sel$text,
+ `nchar(text)` = nchar(sel$text),
+ `range` = glue::glue('
+ [{sel$range$start["row"]}, {sel$range$start["column"]}] --\\
+ [{sel$range$end["row"]}, {sel$range$end["column"]}]'),
+ `end column - start column` =
+ sel$range$end['column'] - sel$range$start['column'],
+ `strsplit(text, split = '')[[1]]` = strsplit(sel$text, split = '')[[1]],
+ `substr(text, 1, 1)` = substr(sel$text, 1, 1),
+ `substr(text, 2, 2)` = substr(sel$text, 2, 2)
+ )
+ fr <- function(x) format(x, justify = "right")
+ fl <- function(x) format(x, justify = "left")
+ print(glue::glue("{fr(names(x))}: {fl(x)}"))
+ print(glue::glue("\n"))
+ invisible()
+ }
> purrr::walk(x$selection, f)
text: 🤖🌷
nchar(text): 2
range: [1, 1] --[1, 3]
end column - start column: 2
strsplit(text, split = '')[[1]]: 🤖, 🌷
substr(text, 1, 1): 🤖
substr(text, 2, 2): 🌷
text: 😱
nchar(text): 1
range: [2, 1] --[2, 2]
end column - start column: 1
strsplit(text, split = '')[[1]]: 😱
substr(text, 1, 1): 😱
substr(text, 2, 2):
text: ?💅
nchar(text): 2
range: [3, 1] --[3, 3]
end column - start column: 2
strsplit(text, split = '')[[1]]: ? , 💅
substr(text, 1, 1): ?
substr(text, 2, 2): 💅
text: Ab
nchar(text): 2
range: [4, 1] --[4, 3]
end column - start column: 2
strsplit(text, split = '')[[1]]: A, b
substr(text, 1, 1): A
substr(text, 2, 2): b
I like the way the reported range is consistent with R's notion of the selection length and positions, as reported by nchar()
and substr()
.
Note that things get a little tricky in the presence of an Emoji ZWJ Sequence. But the data we return is still consistent with what R does.
![Screenshot 2024-01-30 at 3 51 34 PM](https://private-user-images.githubusercontent.com/599454/300994512-466dbe4f-6f9d-4676-ad3b-59a15f3efd6f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk3NDczNTAsIm5iZiI6MTczOTc0NzA1MCwicGF0aCI6Ii81OTk0NTQvMzAwOTk0NTEyLTQ2NmRiZTRmLTZmOWQtNDY3Ni1hZDNiLTU5YTE1ZjNlZmQ2Zi5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjE2JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxNlQyMzA0MTBaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT03MmE5MjQ1MDIwYjZmNGJiOGQ3NTFjOWU3MTdhMGVmNzNmMjk4YWE2NDQ4YTQ3ZDkwOWVlYzcwNTFmN2QwMzRiJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.Xzzd8KlRtGNNvGwAiRJNoRn9mA_fSsYR7cg-JgeDxG0)
> x <- rstudioapi::getActiveDocumentContext()
> purrr::walk(x$selection, f)
text: 🤷♀️😭
nchar(text): 5
range: [1, 1] --[1, 6]
end column - start column: 5
strsplit(text, split = '')[[1]]: 🤷, , ♀ , ️ , 😭
substr(text, 1, 1): 🤷
substr(text, 2, 2):
text: 🤖🌷
nchar(text): 2
range: [2, 1] --[2, 3]
end column - start column: 2
strsplit(text, split = '')[[1]]: 🤖, 🌷
substr(text, 1, 1): 🤖
substr(text, 2, 2): 🌷
Note that the "woman shrugging" emoji is is a ZWJ sequence combining 🤷 Person Shrugging, 🏻 Light Skin Tone, Zero Width Joiner and ♀️ Female Sign.(https://emojipedia.org/female-sign). Our range suggests the first line contains 5 characters which matches nchar()
. To do any better, all the tools involved would need an awareness of Unicode normalization and of graphemes. I'm not pursuing that at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good. IIUC, the Javascript string iterator and the R string utilities are aligned in the way they interpret unicode characters, which seems like a happy outcome.
If we want to do better in the future for other backends or specialised consumers of rstudioapi, we could add a parameter to the document request to require positions and ranges in normalised unicode and whatnots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, the Javascript string iterator and the R string utilities are aligned in the way they interpret unicode characters, which seems like a happy outcome
Yes ☝️ that's a good summary.
// The selections in this text editor. The primary selection is always at index 0. | ||
// | ||
// The gymnastics here are so that we return character positions with respect to | ||
// Unicode code points. Otherwise, the native Position type provides offsets with respect to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this helpful: JavaScript has a Unicode problem.
Another good resource that @lionel- found that supports this approach: https://jupyter-client.readthedocs.io/en/latest/messaging.html#notes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe link to the Jupyter notes in the comment?
I found this helpful: JavaScript has a Unicode problem.
oh my
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔗 to the Jupyter notes is now added to the comment
|
||
// it's surprisingly fiddly to finesse this vscode.EndOfLife enum, which we don't have | ||
// direct access to here | ||
const eolSequenceEnum = EndOfLine.from(editor.document.eol); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where I use existing extension host machinery to get usable information out of a vscode.EndOfLine
enum.
], | ||
"components": { | ||
"schemas": { | ||
"uiTextDocument": { |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
// Related to https://github.com/posit-dev/positron/issues/12 | ||
type TEC = import('./ui-comm').TextEditorContext; | ||
export type TextEditorContext = TEC; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, thanks for that SO link!
// The selections in this text editor. The primary selection is always at index 0. | ||
// | ||
// The gymnastics here are so that we return character positions with respect to | ||
// Unicode code points. Otherwise, the native Position type provides offsets with respect to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe link to the Jupyter notes in the comment?
I found this helpful: JavaScript has a Unicode problem.
oh my
// The gymnastics here are so that we return character positions with respect to | ||
// Unicode code points. Otherwise, the native Position type provides offsets with respect to | ||
// UTF-16 encoded text. That would be confusing for downstream consumers, who probably | ||
// ultimately receive this text as UTF-8 and want to operate on this text in terms of | ||
// as user-perceivable "characters". This only matters when the selection's neighborhood | ||
// includes Unicode characters in the astral plane. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good. IIUC, the Javascript string iterator and the R string utilities are aligned in the way they interpret unicode characters, which seems like a happy outcome.
If we want to do better in the future for other backends or specialised consumers of rstudioapi, we could add a parameter to the document request to require positions and ranges in normalised unicode and whatnots.
c810596
to
a9775d8
Compare
"type": "string", | ||
"description": "URI of the resource viewed in the editor" | ||
}, | ||
"eol": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think the only reasonable way to make this a "real" enum would be to use enum values like "windows" and "unix" and then replace them with the physical sequences after/before serialization. Definitely not a high priority change.
Part of #1312. Builds on the POC that was included in #2019. In particular,
last_active_editor_context
is expanded from providing just a filepath to essentially all of the data members ofvscode.TextEditor
(or, at least, all of the ones I think are conceivably useful to a backend). This means we get coordinates and text for all of the current selections, as well as the entire contents of the editor.Companion PRs:
posit-dev/ark#213
https://github.com/posit-dev/positron-python/pull/329