
Improve Markup Language Support #7

Open
hesstobi opened this issue Apr 4, 2017 · 17 comments · May be fixed by #23

Comments

@hesstobi
Contributor

hesstobi commented Apr 4, 2017

This linter is doing a great job. When writing a document in a markup language like LaTeX it could be improved, because it shows errors on every LaTeX command. As a quick and dirty solution I added the following lines:

  # Blank LaTeX commands with spaces so character offsets stay intact.
  # Commands whose brace arguments contain prose (sections, captions,
  # text..., mbox) keep the argument text; everything else becomes whitespace.
  editorContent = editorContent.replace /(\\\w+)((?:\{[^\}]*\})*)((?:\[[^\]]*\])*)((?:\{[^\}]*\})*)/g, (match, name, group1, group2, group3, index, input) ->
    if /\\(\w*section|\w*caption|text\w*|mbox)/.test(name)
      # keep the words inside the braces; blank the command name,
      # the optional [...] arguments, and the braces themselves
      output = Array(name.length + 1).join(" ") +
        group1.replace(/[\{\}]/g, " ") +
        Array(group2.length + 1).join(" ") +
        group3.replace(/[\{\}]/g, " ")
    else
      output = Array(match.length + 1).join(" ")
    return output

This replaces most of the LaTeX markup with spaces. I then disabled the WHITESPACE_RULE.
A more general approach would be to ignore grammar scopes and patterns with an API like the one linter-spell provides.
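For readers outside CoffeeScript, the same blanking idea can be sketched in TypeScript. This is a simplified sketch, not the plugin's actual code: the optional `[...]` argument groups are omitted, and only the whitelist regex is taken verbatim from the snippet above.

```typescript
// Commands whose brace arguments contain natural-language text.
const KEEP_ARGUMENT = /\\(\w*section|\w*caption|text\w*|mbox)/;

function blankLatex(source: string): string {
  return source.replace(
    /(\\\w+)((?:\{[^}]*\})*)/g,
    (match, name: string, args: string) => {
      if (KEEP_ARGUMENT.test(name)) {
        // keep the words inside the braces; blank only the command and braces
        return " ".repeat(name.length) + args.replace(/[{}]/g, " ");
      }
      // blank the whole command, argument and all
      return " ".repeat(match.length);
    }
  );
}

const input = "\\section{Intro} \\label{sec:intro} Some text.";
const output = blankLatex(input);
console.log(output.length === input.length); // prints true: offsets are preserved
```

Because every replacement has exactly the length of the text it replaces, every error offset LanguageTool reports still points at the right place in the original buffer.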

@wysiib wysiib added this to the 0.5.0 milestone Apr 5, 2017
@wysiib wysiib self-assigned this Apr 5, 2017
@wysiib
Owner

wysiib commented Apr 5, 2017

We should go for a proper solution following the linter-spell one. Will look at it in the coming days.

@wysiib
Owner

wysiib commented Jun 20, 2017

I am unsure whether the core plugin should include language-specific features. The same goes for issue #7. However, I am not sure about an API for connecting language definitions as separate packages either. Any suggestions?

@zoenglinghou

linter-spell-latex has actually compiled a list of excluded scopes for LaTeX. Might be helpful.

@wysiib
Owner

wysiib commented Mar 5, 2018

We have thought about porting the solution from the linter-spell package for quite some time. Currently I am switching jobs and thus do not have the time to implement things myself. But I will look into it, probably around the end of May.

@29antonioac

> (quoting @hesstobi's original workaround from above)

Hi! How could I use this workaround until a final solution is found?

Thanks!

@hesstobi
Contributor Author

hesstobi commented Nov 3, 2018

You can use my branch, which adds basic support for markup languages using the linter-spell API. I use this a lot for LaTeX. There are still a lot of things missing...
https://github.com/hesstobi/linter-languagetool/tree/linter-spell-api

@29antonioac

Thanks for your work! It works pretty well :).

Only one question: in my documents the \gls{} commands for handling acronyms are not correctly filtered. Is this a problem with your plugin or with linter-spell?

Thanks for all!
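One plausible way to close the \gls{} gap with the quick-and-dirty approach from the top of this thread is to blank such commands entirely, since their arguments are lookup keys rather than prose. A hypothetical TypeScript sketch (the command list here is invented for illustration; it is not part of any plugin):

```typescript
// Commands whose arguments are identifiers, not natural language.
const BLANK_WHOLE_COMMAND = /\\(gls\w*|ref|cite\w*|label)/;

function blankNonText(source: string): string {
  return source.replace(/(\\\w+)(\{[^}]*\})?/g, (match, name: string) =>
    // overwrite the whole command (and its argument) with spaces
    BLANK_WHOLE_COMMAND.test(name) ? " ".repeat(match.length) : match
  );
}

console.log(blankNonText("see \\gls{api} here")); // \gls{api} becomes spaces
```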

@73

73 commented Nov 20, 2018

I would like to give this thumbs up. Support for LaTeX would be so awesome!

@davidlday

I don't know if this helps or not, but the LanguageTool Server now has support for processing annotated text. Not sure when exactly they implemented it. You can see the data parameter of the API at SwaggerHub for an example. It takes a value like:

{"annotation":[
 {"text": "A "},
 {"markup": "<b>"},
 {"text": "test"},
 {"markup": "</b>"}
]}

Using the linter-spell approach, perhaps the different formats could be mapped to this annotated format? This would preserve offsets, I believe, and potentially be easier than trying to reduce to pure text.
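For illustration, a payload in that shape could be built from simple tagged text like this. This is a hedged sketch: the tag-splitting heuristic and the function name are invented; only the `{text}`/`{markup}` annotation shape comes from the LanguageTool API example above.

```typescript
interface AnnotationPart { text?: string; markup?: string }

// Split a string into LanguageTool annotated-text parts:
// anything that looks like <...> is markup, everything else is text.
function toAnnotatedText(input: string): { annotation: AnnotationPart[] } {
  const annotation: AnnotationPart[] = [];
  for (const piece of input.split(/(<[^>]+>)/).filter(p => p !== "")) {
    if (piece.startsWith("<")) {
      annotation.push({ markup: piece });
    } else {
      annotation.push({ text: piece });
    }
  }
  return { annotation };
}

const data = toAnnotatedText("A <b>test</b>");
console.log(JSON.stringify(data));
// {"annotation":[{"text":"A "},{"markup":"<b>"},{"text":"test"},{"markup":"</b>"}]}
```

The resulting JSON would then be sent in the `data` parameter of the `/v2/check` request instead of `text`, so LanguageTool itself keeps track of the markup offsets.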

@wysiib
Owner

wysiib commented Dec 23, 2018

That sounds like another nice way to proceed. I agree, reducing to pure text while keeping offsets intact might be quite a hassle. However, I haven't found a list of "all" the annotations in, say, LaTeX. Could this be derived from the language tokens Atom creates anyway? @hesstobi since this is somewhat related to what you are doing: any input?

@hesstobi
Contributor Author

Yes, I think this is a good way to go. But I currently do not have any time to work on it.

@davidlday

davidlday commented Dec 30, 2018

I created a few stand-alone packages that convert markup into LanguageTool's annotated text that might help:

My quick search for a LaTeX parser turned up a couple of packages, but also several SO posts on how challenging it is to create a parser. If you all know of a good parser, I can see about creating another package to handle it. Or you're free to leverage the above to create one as well. :)

@hesstobi
Contributor Author

Nice work! But I think this is more useful outside of Atom, because you would need a parser for every grammar. Atom already includes parsers for all major grammars. With the linter-spell API it is possible to choose which scopes should be checked by LanguageTool. This would enable LanguageTool to check comments in programming languages, and so on.

@davidlday

Thank you. I see where I misunderstood the parsing in Atom. Should have looked a little closer. :( Anyhow, I'll dig in a little deeper on the grammars & linter-spell as I have time and see if I can help out.

@davidlday

I've been watching and commenting on an issue on atom-wordcount that feels like a similar problem: basically trying to eliminate all non-natural-language text from a document's word count. Getting tokenized lines seems to be possible using Atom's public API:

editorGrammar = editor.getGrammar()
editorGrammar.tokenizeLines(editor.getText())

See the early snippet in the issue for an example of filtering out scopes using first-mate. This doesn't work for tree-sitter grammars, but a similar approach should be possible.
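Assuming the `{ value, scopes }` token shape that `tokenizeLines()` produces, scope filtering could look roughly like the following sketch. The ignore list and the concrete scope names are illustrative, not taken from any existing package.

```typescript
interface Token { value: string; scopes: string[] }

// Scope prefixes whose tokens should not be spell/grammar checked.
const IGNORED_SCOPES = ["support.function", "keyword.control", "punctuation"];

function stripIgnoredScopes(lines: Token[][]): string {
  return lines
    .map(tokens =>
      tokens
        .map(t =>
          // blank ignored tokens with spaces so character offsets survive
          t.scopes.some(s => IGNORED_SCOPES.some(ig => s.startsWith(ig)))
            ? " ".repeat(t.value.length)
            : t.value
        )
        .join("")
    )
    .join("\n");
}

// Mock token data in the shape tokenizeLines() would return for LaTeX.
const mock: Token[][] = [[
  { value: "\\textbf", scopes: ["text.tex.latex", "support.function.general.tex"] },
  { value: "{", scopes: ["text.tex.latex", "punctuation.definition.arguments.begin.latex"] },
  { value: "bold words", scopes: ["text.tex.latex"] },
  { value: "}", scopes: ["text.tex.latex", "punctuation.definition.arguments.end.latex"] },
]];
console.log(stripIgnoredScopes(mock)); // "bold words" padded with spaces, same length as the line
```

Because the grammar has already done the parsing, this works for any language Atom can tokenize, which is exactly the advantage over per-format parsers discussed above.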

@hesstobi
Contributor Author

This is the API we need! I added this to #23. But we should also find a way for tree-sitter.

@mbroedl

mbroedl commented Jan 17, 2019

@hesstobi Have a look at this commit where I try to use the editor.tokensForScreenRow() API. Note that this API is undocumented and thus subject to change! (See also the discussion in atom-wordcount again.)


7 participants