Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] TS-based spell-checking with hunspell #6343

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

divarvel
Copy link
Contributor

@divarvel divarvel commented Mar 17, 2023

This is the continuation of #3965

ToDo

  • apply changes to an up-to-date master commit
  • have hunspell-rs compile in the devenv setup
  • provide a way to apply fixes (maybe in the code actions popup?)

@David-Else
Copy link
Contributor

David-Else commented Mar 17, 2023

This might well be the best solution for intelligent spell checking for comments, but with Helix's new multiple language servers feature and https://github.com/valentjn/ltex-ls (soon to get a long needed update, the author has returned to the project) maybe that would be a better solution? In the readme it says:

Comment checking in many popular programming languages (optional, opt-in)

I have not checked out that feature yet as we needed the multiple language servers to make it workable. My settings:

[language-server.ltex-ls]
command = "ltex-ls"

[language-server.ltex-ls.config]
ltex.disabledRules = { "en-US" = [
  "PROFANITY",
], "en-GB" = [
  "PROFANITY",
] }
ltex.dictionary = { "en-US" = [
  "builtin",
], "en-GB" = [
  "builtin",
] } 

[[language]]
name = "markdown"
language-servers = [  "ltex-ls", "marksman" ]
text-width = 80
formatter = { command = 'prettier', args = [
  "--parser",
  "markdown",
  "--prose-wrap",
  "never",        # <always|never|preserve>
] }

@divarvel
Copy link
Contributor Author

divarvel commented Mar 17, 2023

This might well be the best solution for intelligent spell checking for comments, but with Helix's new multiple language servers feature and https://github.com/valentjn/ltex-ls (soon to get a long needed update, the author has returned to the project) maybe that would be a better solution? In the readme it says:

Comment checking in many popular programming languages (optional, opt-in)

I have not checked out that feature yet as we needed the multiple language servers to make it workable.

ltex-ls is quite heavyweight because it checks more than just vocabulary, so it's a huge dependency compared to hunspell. Also, it makes supporting more (programming) languages more complex because then we depend on ltex-ls to support them.

@pascalkuthe
Copy link
Member

More imoortantly ltex-ls can only check comments in some languages (and development is not active) and there is no checking for identifiers. With TS based spell checking you can spellchecker variable names etc.

@David-Else
Copy link
Contributor

Development is active again, check my link. I would be very happy to get this hunspell PR merged, I just thought I would draw attention to ltex-ls as I find it vastly superior to hunspell in every way. I would probably end up using both, hunspell just for code comments.

@divarvel
Copy link
Contributor Author

I think it's good to have both options indeed, hunspell is low-profile and easily available, with the possibility to switch to a more powerful tool through a language server.

If anyone manages to get hunspell-rs to compile within the environment provided by the nix flake, that could be a good way forward. The alternative would be to provide a pure rust impl of hunspell, but i don't have the cycles for that.

@pascalkuthe
Copy link
Member

Great t hear ltex-ls is active again! I definitely think having both options is great. I would like to look into porting nuspell (modern hunspell implementation with better suggestions) to rust.

@divarvel
Copy link
Contributor Author

divarvel commented Mar 17, 2023

Based on what we currently have, there are a few options forward:

  • go on with hunspell and find a way to make it compile (i could use help from someone familiar with nix-cargo-integration)

  • search for pure-rust spell-checking libs (a quick search only returned pointers to C++ libs with bindings in rust)

  • start working on a pure rust spell-checking library (i don't have the bandwidth available for that, personally)

  • continue working on the helix side of things, to add support for

    • suggestions (something close to code actions, but maybe with a separate command / binding, since code actions are completely handled by LSP code)
    • code identifiers support (ie handle camel case and snake case)

    while waiting for the spell-checker issue to be solved.

@pascalkuthe
Copy link
Member

I will look into porting the spell checker to rust. I think there is enough to do on the helix side for now and once.the pure rust check is ready it should be pretty easy to swap out as the API will.be mostly the same.

I don't have time to look at this PR in more detail this weekend but in general on the helix side:

  • spell check should be triggered in apply_impl and run asynchronously. There needs to be some debouncing the avoid running the spell checker too often (somewhat like diff gutters).
  • The result of the spellcheck would ideally be a special kind of diagnostic. That way we avoid writing duplicate code and instead can reuse the existing dignostic system and keybindings
  • it would be cool yo habe spell based auto complete (but probably better as a followup pr)
  • Sugestions should be shown in the code action menu

Like I said I didn't have the time to check this out in more detail so some of these points could already apply but this is sort of the endgoal for the helix side of things. Swapping out the spell checking library should then be fairly easy.

@divarvel
Copy link
Contributor Author

divarvel commented Mar 17, 2023

Alright, i'll go on with my stub then.

  • Spell checking is already applied asynchronously in write_impl for now. Running it in apply_impl would make more sense indeed, and debouncing would become necessary indeed.
  • Results are stored as diagnostics (in a separate list, since diagnostics are updated as a whole by lsp or the spell checker, this avoids having to filter them). The combined diagnostics are then exposed as a single list to leverage existing tooling.
  • Using the code action menu instead of a dedicated one will require adding a small layer between the editor and LSP and decouple code actions from LSP a bit (not a big issue, it will require a bit of thought in the UI / wording to avoid confusion.)

@divarvel
Copy link
Contributor Author

(i'm leaving this paused for the moment, waiting for dust to settle on the multiple LSP front)

@pascalkuthe
Copy link
Member

yeah that's probably wise. The multi LSP PR is a pretty large change (that is very time intensive to review)

@connorlay
Copy link
Contributor

Oh cool, someone picked this up 😄

@divarvel
Copy link
Contributor Author

Oh cool, someone picked this up 😄

I dropped it waiting for multiple LSPs to land ^^

@sigmaSd
Copy link
Contributor

sigmaSd commented Jul 4, 2023

multiple lsp have landed #2507

@archseer
Copy link
Member

(NOTE: There's some work ongoing internally now, @the-mikedavis is working on a new spellchecker in https://github.com/helix-editor/spellbook)

@71GA
Copy link

71GA commented Aug 31, 2023

(NOTE: There's some work ongoing internally now, @the-mikedavis is working on a new spellchecker in https://github.com/helix-editor/spellbook)

Reference link is not working.

@divarvel
Copy link
Contributor Author

I'll probably start from scratch when/if i resume working on this. idk what @the-mikedavis has in mind for this (just working on the spellchecker itself, or directly integrating it in helix)
There are a few design decisions to make wrt how to integrate this to the existing UI, especially code actions.

@gabydd
Copy link
Member

gabydd commented Aug 31, 2023

#3637 (comment) has more info about Mike's plans, also currently the repo is private which is why the link doesn't work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants