Getting the "word" under the cursor is really, really complicated. #4162
Comments
For integration of Slate into my product, I also had to write a ridiculously complicated function to get the current word, and I expect many other people have done so as well. I'll share mine too, in case anybody finds it helpful when they tackle this issue. The isEqual used below is from lodash.

import { Editor, Node, Range, Text } from "slate";
import { isEqual } from "lodash";

// Expand collapsed selection to range containing exactly the
// current word, even if selection potentially spans multiple
// text nodes. If cursor is not *inside* a word (being on edge
// is not inside) then returns undefined. Otherwise, returns
// the Range containing the current word.
function currentWord(editor: Editor): Range | undefined {
  const { selection } = editor;
  if (selection == null || !Range.isCollapsed(selection)) {
    return; // nothing to do -- no current word.
  }
  const { focus } = selection;
  const [node, path] = Editor.node(editor, focus);
  if (!Text.isText(node)) {
    // focus must be in a text node.
    return;
  }
  const { offset } = focus;
  const siblings: any[] = Node.parent(editor, path).children as any;

  // We move to the left from the cursor until leaving the current
  // word and to the right as well in order to find the
  // start and end of the current word.
  let start = { i: path[path.length - 1], offset };
  let end = { i: path[path.length - 1], offset };
  if (offset == siblings[start.i]?.text?.length) {
    // special case when starting at the right hand edge of text node.
    moveRight(start);
    moveRight(end);
  }
  const start0 = { ...start };
  const end0 = { ...end };

  function len(node: any): number {
    // being careful that there could be some non-text nodes in there, which
    // we just treat as length 0.
    return node?.text?.length ?? 0;
  }

  function charAt(pos: { i: number; offset: number }): string {
    const c = siblings[pos.i]?.text?.[pos.offset] ?? "";
    return c;
  }

  function moveLeft(pos: { i: number; offset: number }): boolean {
    // Always succeeds: stepping left from the first sibling makes pos.i
    // negative, where charAt returns "" and the scanning loop stops.
    if (pos.offset == 0) {
      pos.i -= 1;
      pos.offset = Math.max(0, len(siblings[pos.i]) - 1);
    } else {
      pos.offset -= 1;
    }
    return true;
  }

  function moveRight(pos: { i: number; offset: number }): boolean {
    if (pos.offset + 1 < len(siblings[pos.i])) {
      pos.offset += 1;
      return true;
    } else {
      if (pos.i + 1 < siblings.length) {
        pos.offset = 0;
        pos.i += 1;
        return true;
      } else {
        if (pos.offset < len(siblings[pos.i])) {
          pos.offset += 1; // end of the last block.
          return true;
        }
      }
    }
    return false;
  }

  while (charAt(start).match(/\w/) && moveLeft(start)) {}
  // move right 1.
  moveRight(start);
  while (charAt(end).match(/\w/) && moveRight(end)) {}
  if (isEqual(start, start0) || isEqual(end, end0)) {
    // if at least one endpoint doesn't change, cursor was not inside a word,
    // so we do not select.
    return;
  }

  const path0 = path.slice(0, path.length - 1);
  return {
    anchor: { path: path0.concat([start.i]), offset: start.offset },
    focus: { path: path0.concat([end.i]), offset: end.offset },
  };
}
|
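For readers who want to experiment with the multi-node case outside an editor, here is a minimal string-only sketch of a related technique: flatten the sibling texts into one string, scan outward from the cursor, then map the flat offsets back to (sibling index, offset) pairs. This is not the function above and not a Slate API; `wordAcrossSiblings`, `Pos`, and the plain `string[]` input are hypothetical stand-ins for the sibling text nodes, with non-text siblings assumed to be passed as empty strings.

```typescript
// Hypothetical helper, not part of Slate. `texts` stands in for the `text`
// fields of sibling leaf nodes; pass "" for any non-text sibling.
type Pos = { i: number; offset: number };

function wordAcrossSiblings(
  texts: string[],
  cursor: Pos,
  isWordChar: RegExp = /\w/ // must not carry the `g` flag (stateful .test)
): { start: Pos; end: Pos } | undefined {
  // Flatten the siblings and remember the flat offset where each one starts.
  const starts: number[] = [];
  let flat = "";
  for (const t of texts) {
    starts.push(flat.length);
    flat += t;
  }
  const at = starts[cursor.i] + cursor.offset;

  // Scan outward from the cursor over word characters.
  let lo = at;
  while (lo > 0 && isWordChar.test(flat[lo - 1])) lo--;
  let hi = at;
  while (hi < flat.length && isWordChar.test(flat[hi])) hi++;

  // Cursor on a word edge, or not touching a word at all: no current word.
  if (lo === at || hi === at) return undefined;

  // Map a flat offset back to (sibling index, offset within that sibling).
  // On a boundary between siblings, `preferEnd` picks the end of the earlier
  // sibling; otherwise the start of the later one is chosen.
  const toPos = (flatOffset: number, preferEnd: boolean): Pos => {
    let i = 0;
    while (
      i + 1 < texts.length &&
      (preferEnd ? starts[i + 1] < flatOffset : starts[i + 1] <= flatOffset)
    ) {
      i++;
    }
    return { i, offset: flatOffset - starts[i] };
  };

  return { start: toPos(lo, false), end: toPos(hi, true) };
}

// "A word or t|wo." split over three leaves, cursor at the start of leaf 1.
const example = wordAcrossSiblings(["A word or t", "wo", "."], { i: 1, offset: 0 });
// → { start: { i: 0, offset: 10 }, end: { i: 1, offset: 2 } }
```

In an editor, the `i` values would be appended to the parent path to build Slate points, as the function above does with `path0.concat(...)`.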
Any update on this? I would really need to be able to choose which characters to include in a "word".
|
First, I want to thank the maintainers of this library for providing the community with such a great piece of software. I've been working with Slate for some time now, and it is really good, covering 99% of my use-cases. Thank you for all your time and efforts! ❤️

Having become used to such a good experience, I'm surprised when I discover the remaining 1%. It seems strange to me that

@williamstein Thank you for posting your solution here.

if ((start.i === start0.i && start.offset === start0.offset) ||
    (end.i === end0.i && end.offset === end0.offset)) {

I also wrote some simple tests for this, using slate-test-utils:

/** @jsx jsx */
import { assertOutput, buildTestHarness, testRunner } from "slate-test-utils";
import { Transforms } from "slate";
// noinspection ES6UnusedImports
import { jsx } from "./utils/testUtils";
import { currentWordRange } from "./utils";
import { Editor } from "./components/Editor";

const testCases = () => {
  describe(currentWordRange.name, () => {
    it("Returns range of word at cursor", async () => {
      const input = (
        <editor>
          <hp>A word or t<cursor />wo.</hp>
        </editor>
      );
      const [editor] = await buildTestHarness(Editor)({ editor: input });
      Transforms.select(editor, currentWordRange(editor));
      assertOutput(
        editor,
        <editor>
          <hp>A word or <anchor />two<focus />.</hp>
        </editor>
      );
    });

    it("Returns undefined if cursor not at a word", async () => {
      const input = (
        <editor>
          <hp>Lorem ipsum <cursor /> dolar sit amet</hp>
        </editor>
      );
      const [editor] = await buildTestHarness(Editor)({ editor: input });
      const range = currentWordRange(editor);
      expect(range).toBeUndefined();
      Transforms.select(editor, range);
      assertOutput(editor, input);
    });
  });
};

testRunner(testCases);
|
We're happy to consider PRs to fix the 1%. |
I ended up writing my own stepper which goes character by character and includes options as to which characters to include. If anyone is interested, here it is. You may have to adjust typings.

import { Editor, Point, Range } from 'slate'

export function word(
  editor: CustomEditor,
  location: Range,
  options: {
    terminator?: string[]
    include?: boolean
    directions?: 'both' | 'left' | 'right'
  } = {},
): Range | undefined {
  const { terminator = [' '], include = false, directions = 'both' } = options

  const { selection } = editor
  if (!selection) return

  // Get start and end, modify it as we move along.
  let [start, end] = Range.edges(location)

  let point: Point = start

  function move(direction: 'right' | 'left'): boolean {
    const next =
      direction === 'right'
        ? Editor.after(editor, point, {
            unit: 'character',
          })
        : Editor.before(editor, point, { unit: 'character' })

    const wordNext =
      next &&
      Editor.string(
        editor,
        direction === 'right' ? { anchor: point, focus: next } : { anchor: next, focus: point },
      )

    const last = wordNext && wordNext[direction === 'right' ? 0 : wordNext.length - 1]
    if (next && last && !terminator.includes(last)) {
      point = next

      if (point.offset === 0) {
        // Means we've wrapped to beginning of another block
        return false
      }
    } else {
      return false
    }

    return true
  }

  // Move point and update start & end ranges

  // Move forwards
  if (directions !== 'left') {
    point = end
    while (move('right'));
    end = point
  }

  // Move backwards
  if (directions !== 'right') {
    point = start
    while (move('left'));
    start = point
  }

  if (include) {
    return {
      anchor: Editor.before(editor, start, { unit: 'offset' }) ?? start,
      focus: Editor.after(editor, end, { unit: 'offset' }) ?? end,
    }
  }

  return { anchor: start, focus: end }
}

The include option decides whether to include the terminator. I have two use cases for this: Emojis and Mentions. You can see how to use it here:

Mentions:

const range =
  beforeRange &&
  word(editor, beforeRange, {
    terminator: [' ', '@'],
    directions: 'left',
    include: true,
  })

Emojis:

const beforeWordRange =
  beforeRange &&
  word(editor, beforeRange, { terminator: [' ', ':'], include: true, directions: 'left' })
|
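To make the terminator and include behaviour easier to see in isolation, here is a minimal string-only sketch of the leftward scan. It is an illustration under simplifying assumptions, not the stepper above: `triggerBefore` is a hypothetical name, it works on a plain string and a numeric cursor offset instead of Slate points, and it only extends the range on the left side.

```typescript
// Hypothetical string-only analogue of a leftward scan with terminators.
// Not a Slate API: `text` is the block's text and `cursor` a plain offset.
function triggerBefore(
  text: string,
  cursor: number,
  terminators: string[] = [" "],
  include = false
): { start: number; end: number; value: string } | undefined {
  let start = cursor;
  // Walk left until the previous character is a terminator or the text begins.
  while (start > 0 && !terminators.includes(text[start - 1])) start--;
  // Optionally pull the character that stopped the scan into the range.
  if (include && start > 0) start--;
  // Nothing was consumed: there is no candidate before the cursor.
  if (start === cursor) return undefined;
  return { start, end: cursor, value: text.slice(start, cursor) };
}

// Mention detection: cursor at the end of "hi @joh".
const mention = triggerBefore("hi @joh", 7, [" ", "@"], true);
// → { start: 3, end: 7, value: "@joh" }
const isMention = mention !== undefined && mention.value.startsWith("@"); // true
```

Because the included character may be a space rather than the trigger, the caller still has to check what the value starts with, which is presumably why both use cases above list the trigger character alongside the space.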
I used The idea is to first define a regular expression (a.k.a. "regexp") for the word. Then use slate's

For example: "sunny da|y" (I use a pipe sign | to denote the cursor; in this case, the cursor is between a and y). The left portion of the word is "da" and the right portion of the word is "y", so the whole word is "day".

import { BasePoint, Editor, Range } from "slate";
import { ReactEditor } from "slate-react";
import { cloneDeep } from "lodash";

// define word character as all EN letters, numbers, and dash
// change this regexp if you want other characters to be considered a part of a word
const wordRegexp = /[0-9a-zA-Z-]/;

const getLeftChar = (editor: ReactEditor, point: BasePoint) => {
  const end = Range.end(editor.selection as Range);
  return Editor.string(editor, {
    anchor: {
      path: end.path,
      offset: point.offset - 1
    },
    focus: {
      path: end.path,
      offset: point.offset
    }
  });
};

const getRightChar = (editor: ReactEditor, point: BasePoint) => {
  const end = Range.end(editor.selection as Range);
  return Editor.string(editor, {
    anchor: {
      path: end.path,
      offset: point.offset
    },
    focus: {
      path: end.path,
      offset: point.offset + 1
    }
  });
};

export const getCurrentWord = (editor: ReactEditor) => {
  const { selection } = editor; // selection is Range type

  if (selection) {
    const end = Range.end(selection); // end is a Point
    let currentWord = "";
    const currentPosition = cloneDeep(end);
    let startOffset = end.offset;
    let endOffset = end.offset;

    // go left from cursor until it finds the non-word character
    // (stop at offset 0 so getLeftChar is never asked for offset -1)
    while (
      currentPosition.offset > 0 &&
      getLeftChar(editor, currentPosition).match(wordRegexp)
    ) {
      currentWord = getLeftChar(editor, currentPosition) + currentWord;
      startOffset = currentPosition.offset - 1;
      currentPosition.offset--;
    }

    // go right from cursor until it finds the non-word character
    currentPosition.offset = end.offset;
    while (
      currentWord.length &&
      getRightChar(editor, currentPosition).match(wordRegexp)
    ) {
      currentWord += getRightChar(editor, currentPosition);
      endOffset = currentPosition.offset + 1;
      currentPosition.offset++;
    }

    const currentRange: Range = {
      anchor: {
        path: end.path,
        offset: startOffset
      },
      focus: {
        path: end.path,
        offset: endOffset
      }
    };

    return {
      currentWord,
      currentRange
    };
  }

  return {};
};
|
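The "sunny da|y" example can be checked without an editor. The sketch below applies the same character-class idea to a plain string; `wordAt` is a hypothetical helper, not part of Slate or of the code above, and unlike `getCurrentWord` it also scans right when nothing lies to the left of the cursor. Swapping the regexp is how other characters, such as the dot from the original issue, would become part of a word.

```typescript
// Hypothetical string-only helper: expand a cursor offset to the surrounding
// run of characters matching `wordChar` (which must not carry the `g` flag).
function wordAt(
  text: string,
  cursor: number,
  wordChar: RegExp = /[0-9a-zA-Z-]/
): { word: string; start: number; end: number } {
  let start = cursor;
  // Extend left while the character before `start` is a word character.
  while (start > 0 && wordChar.test(text[start - 1])) start--;
  let end = cursor;
  // Extend right while the character at `end` is a word character.
  while (end < text.length && wordChar.test(text[end])) end++;
  return { word: text.slice(start, end), start, end };
}

// "sunny da|y": the cursor sits at offset 8, between "a" and "y".
const day = wordAt("sunny day", 8);
// → { word: "day", start: 6, end: 9 }

// Treat the dot as a word character too.
const dotted = wordAt("see foo-bar.baz now", 9, /[0-9a-zA-Z.-]/);
// → { word: "foo-bar.baz", start: 4, end: 15 }
```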
@tomliangg thank you very much, it helped me a lot. |
@aliak00 I just wanted to thank you for this great solution which is not overly complicated.
|
Thanks @tomliangg for providing the CodeSandbox link. I modified your version to: ✅ Make it work for getting the nth previous word (in your CodeSandbox, the function to get the nth previous word doesn't work properly). Codesandbox: Slate Get Word, Previous, After, Nth Previous and Nth After Under Cursor |
Problem
I want to be able to get the word under the cursor (collapsed) and the range of that word within a block element. The problem is that slate's Editor.blah functions don't seem sufficient to do it without some crazy logic.
For my use-case a "word" includes the dash and dot (- and .) characters.
I'll use '|' as cursor location.
If you have 'hello| world' and call Editor.after with the word unit, you'll get the point after world.
If you have 'hello world|' and you call Editor.after with the word unit, you'll get the first point in the next block.
The same applies to Editor.before.
So to actually get the word under the cursor, this is the logic I have:
And then I have my word and range:
Solution
A solution would be to not include "space" as part of word boundaries. Or some way for me to tell the Editor.before/after APIs to use the word unit but include specific characters and use other characters as terminations: e.g.
Or to allow { edge: 'end' } in the options so that it doesn't pass the end of the block?
Context
Here's a screen shot of a slack thread that has more details: