Use translated Prism AST to run RuboCop #1849

vinistock · 2024-03-27T20:54:53Z

Motivation

Use the Prism AST translator to reuse our existing AST to run RuboCop.

Implementation

RuboCop doesn't have an API where we can simply ask "format this AST", so implementing this requires patching a handful of places so that we can pass the AST around.

The idea is that we instantiate the ProcessedSource ourselves and prevent the Prism translator from re-parsing the file if we already have an AST in hand.

Questions

Is this the best way to pass the existing AST around? Could we be doing this in a more elegant way?
What do we need to do to prevent this from failing in older RuboCop versions that do not support Prism as a backend?

Automated Tests

Will add tests.

Manual Tests

Start the LSP on this branch
Verify that diagnostics and formatting is working properly

vinistock · 2024-03-27T20:56:03Z

Tagging @koic and @kddnewton. I'd love to hear your thoughts on the approach and if there's something that can be upstreamed to Prism or RuboCop to make this integration more seamless.

kddnewton

Looks good. In general I think we should provide the ability to pass through a parsed AST to Prism::Translation::Parser in prism itself instead of patching it here. One way to do this would be change it in prism to do:

def parse(source_buffer)
  ...
  Prism.parse(...)
  ...
end

instead to:

def initialize(parser: Prism)
  @parser = parser
end

def parse(source_buffer)
  ....
  @parser.parse(...)
  ...
end

and then pass in a custom object in ruby-lsp for the parser that would return the parse result.

lib/ruby_lsp/document.rb

sorbet/rbi/shims/rubocop.rbi

lib/ruby_lsp/requests/support/rubocop_runner.rb

vinistock · 2024-03-28T14:55:36Z

lib/ruby_lsp/requests/support/rubocop_runner.rb

+          processed_source = T.unsafe(::RuboCop::AST::ProcessedSource).new(
+            @options[:stdin],
+            Prism::Translation::Parser::VERSION_3_3,
+            file,
+            parser_engine: parser_engine,
+            prism_result: @parse_result,
+          )


We could avoid some patching if there was a way to instantiate the ProcessedSource with an existing AST and token list. @koic, would you be open to accepting that?

It would allow us to avoid some patching.

As I'm not familiar with Ruby LSP, I cannot say much at this point. However, I'd suggest opening a PR to RuboCop AST. There might be adjustments needed regarding the API, but I think they could be acceptable depending on the use case. cc @bbatsov @marcandre
https://github.com/rubocop/rubocop-ast

I'm curious about the context. In particular, why is @parse_result being passed around as an AST and not a ProcessedSource? I.e., would it be possible upstream in the code to generate a ProcessedSource instead of just an AST, and pass that information instead of just the AST?

All features in the Ruby LSP depend only the Prim's parse result, so we keep that in our representation of documents.

For RuboCop, we just invoke run with the file contents, which will let it re-parse the file and do everything it needs to do.

The Ruby LSP supports multiple formatters. We hook into RuboCop if it is available, but it's not a dependency. So we can't depend on RuboCop internals for our basic document representations, like using a ProcessedSource. All RuboCop things are limited to the runner here.

Now that RuboCop supports the Prism backend, we're trying to find what the easiest way of reusing the existing AST is, so that we can stop parsing documents twice.

The ideal API would be something like what Syntax Tree supports, which allows you to format a given AST. That would not only simplify the code here a lot, it would enable RuboCop to support range formatting for LSPs, which is currently impossible.

That said, the ideal API may require significant refactors, which is why I suggested something a bit simpler that will still minimize the number of patches. If we can instantiate the ProcessedSource in a way that allows us to reuse the existing AST and prevents the second parse, that'll already simplify the code quite a bit.

But a ProcessedDocument also holds the tokens and the comments

Sorry, I don't understand what you mean. Are there two different classes that would require changing? Or are you saying we could be using something else?

The use case this PR aims to address appears reasonable to me. When using a Ruby code with multiple tools (e.g., RuboCop, TypeProf, Steep...), parsing the same source code to an AST in each tool becomes a redundant process. In such cases, it makes sense to reuse the one AST across different tools.

In this PR, the only change to the interface for reusing the AST is the addition of the prism_result keyword argument. This means that backward compatibility is maintained, and there is no impact on the existing RuboCop (AST).

In summary, I think this proposal can be accepted by RuboCop and RuboCop AST. Considering this is not a specialized requirement but a backend tool for LSP, it would be more rational to provide a published interface rather than having users write monkey patches. Below, I share one concern and an idea related to it.

NOTE: Naming and the two ASTs (whitequark AST and Prism AST)

A point of caution is that the current AST RuboCop can analyze is only the whitequark/parser AST interface (Prism::Translation::Parser). It cannot process the vanilla Prism AST interface.

Therefore, I'm not sure if prism_result is the best name for the AST argument (since it's not the Prism AST). The keyword argument name could be ast, which is simpler, but naming is difficult and I'm not confident that it's better 😅

Here is a proposed design for the ast parameter, assuming future RuboCop can use Prism AST:

Current situation: When the ast: ast parameter is passed along with either parser_engine: :parser_whitequark or parser_engine: parser_prism, the AST is processed as a whitequark AST.

Future extension: When the ast: ast parameter and parser_engine: :prism are passed, the AST would be processed as a Prism AST (note that parser_engine: prism is not yet implemented).

This would potentially allow for a future interface where RuboCop can use a direct Prism AST without Prism::Translation::Parser, and handle any incompatibilities in ASTs.

kddnewton · 2024-05-22T12:40:22Z

lib/ruby_lsp/document.rb

@@ -31,12 +31,12 @@ def initialize(source:, version:, uri:, encoding: Encoding::UTF_8)
      @version = T.let(version, Integer)
      @uri = T.let(uri, URI::Generic)
      @needs_parsing = T.let(true, T::Boolean)
-      @parse_result = T.let(parse, Prism::ParseResult)
+      @parse_result = T.let(parse, Prism::ParseLexResult)


I think we should pull the ast out of this result here, so that tree can just be an attr_reader

github-actions · 2024-07-21T12:40:56Z

This pull request is being marked as stale because there was no activity in the last 2 months

vinistock added enhancement New feature or request server This pull request should be included in the server gem's release notes labels Mar 27, 2024

vinistock self-assigned this Mar 27, 2024

kddnewton suggested changes Mar 28, 2024

View reviewed changes

lib/ruby_lsp/document.rb Outdated Show resolved Hide resolved

sorbet/rbi/shims/rubocop.rbi Outdated Show resolved Hide resolved

lib/ruby_lsp/requests/support/rubocop_runner.rb Outdated Show resolved Hide resolved

vinistock commented Mar 28, 2024

View reviewed changes

vinistock added 2 commits May 3, 2024 15:53

Use parse_lex instead of parse

d5d4ea8

Translate existing AST into RuboCop AST

35ec8e9

vinistock force-pushed the vs/use_prism_translator branch from 3095014 to 35ec8e9 Compare May 3, 2024 20:18

kddnewton reviewed May 22, 2024

View reviewed changes

github-actions bot added the Stale label Jul 21, 2024

vinistock added pinned This issue or pull request is pinned and won't be marked as stale and removed Stale labels Jul 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use translated Prism AST to run RuboCop #1849

Use translated Prism AST to run RuboCop #1849

vinistock commented Mar 27, 2024

vinistock commented Mar 27, 2024

kddnewton left a comment •

edited

Loading

vinistock Mar 28, 2024

koic Apr 2, 2024

marcandre Apr 2, 2024

vinistock Apr 2, 2024

marcandre Apr 2, 2024 •

edited

Loading

vinistock Apr 3, 2024

koic May 22, 2024

kddnewton May 22, 2024

github-actions bot commented Jul 21, 2024

Use translated Prism AST to run RuboCop #1849

Are you sure you want to change the base?

Use translated Prism AST to run RuboCop #1849

Conversation

vinistock commented Mar 27, 2024

Motivation

Implementation

Questions

Automated Tests

Manual Tests

vinistock commented Mar 27, 2024

kddnewton left a comment • edited Loading

Choose a reason for hiding this comment

vinistock Mar 28, 2024

Choose a reason for hiding this comment

koic Apr 2, 2024

Choose a reason for hiding this comment

marcandre Apr 2, 2024

Choose a reason for hiding this comment

vinistock Apr 2, 2024

Choose a reason for hiding this comment

marcandre Apr 2, 2024 • edited Loading

Choose a reason for hiding this comment

vinistock Apr 3, 2024

Choose a reason for hiding this comment

koic May 22, 2024

Choose a reason for hiding this comment

NOTE: Naming and the two ASTs (whitequark AST and Prism AST)

kddnewton May 22, 2024

Choose a reason for hiding this comment

github-actions bot commented Jul 21, 2024

kddnewton left a comment •

edited

Loading

marcandre Apr 2, 2024 •

edited

Loading