Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EXPERIMENT: JDT using Javac instead of ECJ #1827

Open
mickaelistria opened this issue Jan 10, 2024 · 14 comments
Open

EXPERIMENT: JDT using Javac instead of ECJ #1827

mickaelistria opened this issue Jan 10, 2024 · 14 comments

Comments

@mickaelistria
Copy link
Contributor

I'd like to share here an ongoing experiment about making JDT use Javac (through API/code) instead of ECJ for some features.
For the moment, the experiment is ongoing on a branch of mine, documentation is available on the branch: https://github.com/mickaelistria/eclipse.jdt.core/blob/parse-with-javac/README_JAVAC_EXPERIMENT.md

For those who just want to get a sense of why we're doing that, let me quote the documentation
"""
This branch is a work in progress experiment to leverage all higher-level JDT IDE features (DOM, IJavaElement, refactorings...) relying on Javac as underlying compiler/parser instead of ECJ.

Why? Some background...

  • These days, with more frequent and more features Java releases, it's becoming hard for JDT to cope with new Java features on time and facilitate support for upcoming/preview features before Java is released so JDT can participate to consolidation of the spec. Over recent releases, JDT has failed at providing the features on time. This is mostly because of the difficulty of maintaining the Eclipse compiler: compilers are difficult bits of code to maintain and it takes a lot of time to implement things well in them. There is no clear sign the situation can improve here.
  • The Eclipse compiler has always suffered from occasional inconsistencies with Javac which end-users fail at understanding. Sometimes, ECJ is right, sometimes Javac is; but for end-users and for the ecosystem, Javac is the reference implementation and it's behavior is what they perceive as the actual specification
  • JDT has a very strong ecosystem (JDT-LS, plugins) a tons of nice features, so it seems profitable to keep relying higher-level JDT APIs, such as model or DOM to remain compatible with the ecosystem
    """

For more technical details, please see https://github.com/mickaelistria/eclipse.jdt.core/blob/parse-with-javac/README_JAVAC_EXPERIMENT.md . As you can grok from the document, many features are already working, confirming that Javac-based JDT seems like a viable opportunity.

Concretely, until we see a blocker with this approach, we (Red Hat) will continue focusing on it, with the hope that we can switch JDT-LS to use it; and of course, if/when quality and value are high enough for JDT as well, we would love to also see JDT adopting this strategy by default.
Note that it is not a goal to remove ECJ from JDT. The goal is just to enable Javac as an alternative at runtime; it's totally fine that ECJ stays. However, Red Hat will probably stop contributing to ECJ (and focus on both Javac-based JDT and higher-level JDT: DOM, manipulation and so on...)

If you're interested in joining this effort, I'd be glad to assist you in jumping in!

@jukzi
Copy link
Contributor

jukzi commented Jan 10, 2024

Interesting approach.
I had a quick look at a random commit - why do you think it's good to rely on internals like "com.sun.tools.javac"?

@mickaelistria
Copy link
Contributor Author

I didn't find a better alternative to using internals. As mentioned in the doc, using internals blocks compatibility to only 1 (usually latest) version of Java. It's in practice the only drawback of using them.

@laeubi
Copy link
Contributor

laeubi commented Jan 11, 2024

Not sure if this is maybe better suited as a discussion or draft PR, but anyways here is a corresponding effort to use javac for Tycho to compile:

@mickaelistria
Copy link
Contributor Author

Note that with this approach, some Java features that do not seem to require JDT DOM change can work out of the box. JEP-443 (unnamed variables and patterns) do work out of the box with the code pointed by this experiment.

Another good piece of news is that I've managed to separate things better, so:

  • the bits that create a JDT DOM from Javac are isolated in a fragment, thus JDT-Core doesn't require a depedency on Javac if the fragment is not used
  • the bits that do replace ECJ parsing by DOM polling are included in JDT-Core but do not have any dependency on Javac either (the DOM can equally come from ECJ or Javac without the code being any different)

So if there is interest from enough people to make it more "official", I can try to contribute some parts to JDT Core already and continue the experiment there (as it can be isolated enough to not break default behavior).

@akurtakov
Copy link
Contributor

IMO JDT can only benefit from all of this (even the fragment) being part of JDT project rather than external esp as it's gonna be no-op unless explicitly requested. Please start providing PRs to upstream it.

@jukzi
Copy link
Contributor

jukzi commented Jan 12, 2024

I suggest to get JDT leads permission first.
Personally i don't like to get any upstream commits until there is a decision that using ECJ is the way to go, as any commit can break the builds.

@iloveeclipse
Copy link
Member

I would expect any PR that is proposed out of this story per definition:

  1. should not break any existing functionality on master (neither performance nor functional regression)
  2. should have proper explanation of the change outlining why/how/alternatives etc
  3. and of course should pass a review, as this is something that isn't just a "usual" fix but rather design change

With that above followed (anything missed?) I assume no "special" decisions needed.

Note: assuming the changes here require Java 21 for building or testing (as it seem to require javac from 21+), a prerequisite are probably also some infra build changes? SDK build runs on Java 17 too.

@mickaelistria
Copy link
Contributor Author

Personally i don't like to get any upstream commits until there is a decision that using ECJ is the way to go, as any commit can break the builds.

We do not want to make JDT adopt Javac-based parser as "the way to go". The intent is more about providing a framework in JDT (basically some settings and some internal interfaces) to use an alternative compiler under the hood, and hosting support for the reference javac compiler as an alternative. There is no intent nor plan at this moment to make this the Javac/DOM first strategy the default in JDT deliveries. Of course, we hope and believe that the proposal is technically solid enough so that in a not-too-distant future it becomes mainstream, but we acknowledge we're not there yet.
So merging this would basically an expansion of JDT functionalities. As long as it passes the usual quality checks, I see very little risk and no good reason to object merging this. We will probably discuss it finer-grain in PRs.

From a JDT-first POV, our expectation is that by getting the experiment closer in JDT repo ASAP, it 1. gets some more contributors and 2. it becomes easier for adopters to also maintain JDT as we could more easily consider contributing support for new Java constructs while they're being developed in Javac at DOM-level given the translation from Javac is almost trivial to implement compared to implementing specs the ECJ compiler. So overall, we believe and hope that this addition is clearly in the best interest of JDT as it will allow to work on new Java features from multiple entry-points at once, making work more easily distributed across (more🤞) contributors.
But of course, it's all open to discussions and I would also love to gather feedback and support from all parties.

@akurtakov
Copy link
Contributor

Note: assuming the changes here require Java 21 for building or testing (as it seem to require javac from 21+), a prerequisite are probably also some infra build changes? SDK build runs on Java 17 too.

SDK builds runs on Java 17 but bundles with different than that BREE get them through toolchains and Java 21 is already in the toolchains AFAIK. That should allow compiling such Java 21 fragment too which will be ignored if Eclipse runs on older JVM. Still, I expect a number of patches to be needed before we get to that.

@mickaelistria
Copy link
Contributor Author

@testforstephen
Copy link
Contributor

Hi @mickaelistria,

I took a look at the proposal, the idea is promising. It uses a common DOM AST parser based on Javac as the foundation of all language features, which simplifies the implementation significantly.

However, I have some concerns about the design.

  • Migrating all features to DOM AST is not trivial, and some features like code completion might need to be rewritten. This will require handling many use cases. How can we define the bar and say the migration is complete and let user for testing?

  • Have you evaluated the performance impact of the new approach? The new idea will depend on DOM AST for all features, and will need to generate a full AST every time there is an operation to handle. This could be expensive when the file size increases. Also, resolving binding to get the type system can be a potential performance bottleneck when performing it in a Java project context.

@mickaelistria
Copy link
Contributor Author

Migrating all features to DOM AST is not trivial, and some features like code completion might need to be rewritten.

Yes.

How can we define the bar and say the migration is complete and let user for testing?

I don't have a definitive answer to that. I think it's really a matter of feeling while using it (dog fooding): try it and improve it as long as something is clearly felt as missing. When nothing seems to be missing in a regular code session, then it's a good time to promote it as testable by users.

Have you evaluated the performance impact of the new approach?

Not yet.

Have you evaluated the performance impact of the new approach?

That's the current implementation, but I'm confident we can improve it so only the DOM is generated when the document changes and all operations use this DOM as long as document hasn't changed.

Also, resolving binding to get the type system can be a potential performance bottleneck when performing it in a Java project context.

I think this is equally a bottleneck as with ECJ.

@testforstephen
Copy link
Contributor

As long as it's easy to switch between javac and ecj, dogfooding is a good way to test them.

For performance, I care more about features like code completion that need to be fast and responsive. The current code completion engine uses ECJ parser's dietParse to find the member at the cursor position, and then parseBlockStatements to parse only that member's body. This avoids parsing unnecessary code during completion and makes it to resolve faster on typing. I wonder how javac can do something similar. I'd like to explore this idea further.

@mickaelistria
Copy link
Contributor Author

I'd like to explore this idea further.

If you can come up with something like a flamegraph comparing execution with ECJ parser vs building a DOM with Javac (vs building a DOM with ECJ), that would be very welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants