Support Java JLS 13 and JLS14 preview in M3 models and Java ASTs; also make ASTs jump through the correctness specification in lang::analysis::AST #1936
Merged
Changes from 130 commits
138 commits
d9dcfac jurgenvinju: added switch expressions to Java M3 AST support
f42c97e jurgenvinju: starting to weed out all the different language levels
a30fa23 jurgenvinju: revamped JLS level selection parameter and added preview features sup…
98c5795 jurgenvinju: fixed switchCaseRule
7ebcb9c jurgenvinju: added method reference expression
bff9a3d jurgenvinju: added expression method reference
fd08a80 jurgenvinju: added intersectiontype
2302099 jurgenvinju: added missing @Override annotations to visit methods
575a5d3 jurgenvinju: added super.MethodReference to AST
ab8944b jurgenvinju: minor
c74ebb6 jurgenvinju: refactored IValueList usages to IListWriter and removed IValueList
8225b23 jurgenvinju: added initial implementation of the AST nodes for the Java module sy…
2c0f68b jurgenvinju: added exports directive
c20ad42 jurgenvinju: radically improved M3 documentation
a26aad6 jurgenvinju: typos
cdf7dd6 jurgenvinju: composeM3 is now _generic_ for all M3 models for all languages, as lo…
665051e jurgenvinju: added missing mergeKeywordParameters function for adding new keyword …
ce40c5f jurgenvinju: documented TypeSymbol better
6879945 jurgenvinju: using the Language concept from M3 to encode language name and version
35b5340 jurgenvinju: fixed initial problems in module spec parser, but some of the fields …
fb138b0 jurgenvinju: fixed all name references in ASTs of module declarations
14ca354 jurgenvinju: fixed names in method references
69c90da jurgenvinju: layout
61cd581 jurgenvinju: added module as a type for Java and fixed naming issues in the AST fo…
4a11698 jurgenvinju: fixed field access
92f104d jurgenvinju: rationalized all shortcuts of `str` to Expression for better resoluti…
06236ec jurgenvinju: removed illegal bool value from AST definition
e72f1b6 jurgenvinju: added all missing AST nodes for modifiers and annotations aback into …
0babe3e jurgenvinju: minor cleanup
f00d64a jurgenvinju: jar converter now records java version for each class file
3199f38 jurgenvinju: fixed missing physical location for extracting single class files
5fcdd67 jurgenvinju: fixed physical location of .class files in jars
1ba7a83 jurgenvinju: fixed silly bugs in implementation of language levels for class file …
a911906 jurgenvinju: M3 from source code extractors now also store the configured language…
13f2f16 jurgenvinju: added type parameters to classes and interfaces
f4a804d jurgenvinju: comments
81d8b46 jurgenvinju: wired typeParamaters back into classes and interfaces
31bc8c1 jurgenvinju: added type parameters back into method ASTs
3c30dd2 jurgenvinju: converted extra dimensions for parameter nodes like int a[][] back to…
0fab829 jurgenvinju: added documentation
eb1fe84 jurgenvinju: more dimensions cleanup
97fb84a jurgenvinju: renamed lowerbound and upperbound ndoes to super and extends to be cl…
b308e95 jurgenvinju: added generic type parameters to anonymous inner classes and started …
afcdfca jurgenvinju: removed a few more nulls passing around and added TODO to know here I…
b744b5e jurgenvinju: documentation
2e29e7b jurgenvinju: added more missing nodes for type arguments and type parameters here …
a06112f jurgenvinju: fixed more missing type parameter stuff and more null passing
b0b5219 jurgenvinju: added missing type parameters for the parameterizedType constructor
1650043 jurgenvinju: fixed all raw type warnings for iterators in ASTConverter
815b779 jurgenvinju: removed dead comments and adding documentation
1811f62 jurgenvinju: added some missing headers and removed the null filtering code that i…
a2375ce jurgenvinju: fixed bug
459981e jurgenvinju: running the new code and fixing every minor slipup
f960467 jurgenvinju: alignment between ASTConverter and AST.rsc improved
dc16cba jurgenvinju: fixing consequences of removing the NULL filters
b2d0701 jurgenvinju: IAnnotable does not appear to be a correctly functioning tag interfac…
a500f79 jurgenvinju: anonymous classes do not have modifiers
e172a6c jurgenvinju: fixed alignment between JDT and M3 wrt lambda expressions
2b72bd2 jurgenvinju: more alignment issues
eccf89a jurgenvinju: added missing node for creation method references like String[]::new
7f8b9b0 jurgenvinju: fixed name node for enums
7b409e1 jurgenvinju: check for JLS12 for break expression support
ed5e3e0 jurgenvinju: aligned annotation ASTs
1a2d1ee jurgenvinju: fixed lambda with typed parameters and aligned lambda structure bette…
4870bb6 jurgenvinju: identifiers are now called id instead of simpleName for r3eadability …
f018cd1 jurgenvinju: fixed problems with NPEs and lambdas
5039cc9 jurgenvinju: added progress bar for AST construction of many files
f696b23 jurgenvinju: replaced generic infix expression by the actual binary operator const…
70b227a jurgenvinju: prefix and postfix expressions are now modeled again as unitary const…
e957196 jurgenvinju: Type ASTs now also get a TypeSymbol typ parameter for easy reference
ecacd8a jurgenvinju: all AST nodes now have a .src field
85e8dc7 jurgenvinju: improved the AST spec for speed and diagnostics purposes
eb21c38 jurgenvinju: fixed bug in import statement AST, where accidentally and AST node sh…
ff553c0 jurgenvinju: forgot to add
aea438f jurgenvinju: added convenience utility for checking large numbers of ASTs instance…
6bf6be7 jurgenvinju: job functions can now return the value their block returns, for conve…
abb5898 jurgenvinju: fixed issue with monitor
785c70f jurgenvinju: removed annotations field from m3, has to be part of modifiers for al…
ab97479 jurgenvinju: made start in modeling the Java 9 module system in M3 relations
6ee6c8c jurgenvinju: cleanup, docs, and added @Override
2bd241f jurgenvinju: initial implementation of M3 construction for module system from sour…
57db654 jurgenvinju: fixed missing implementation of name-qualified field types and issues…
02ddddd jurgenvinju: fixed several issues in M3 extractor
3974b11 jurgenvinju: fixed bug in generic composeM3 function
a039bcc jurgenvinju: debugging the M3 model creator using jsoup as an example because it h…
01365e2 jurgenvinju: solved several detailed issues in JarConverter
b9fde2e jurgenvinju: added Java 9 module extraction to the Jar converter, also aligned wit…
c114650 jurgenvinju: commented out mergeKeywordParameters function for bootstrpping purposes
359e046 jurgenvinju: improved documentation of Java m3
c4714a3 jurgenvinju: Merge branch 'main' into m3-jls-13-and-higher
7e0eeae jurgenvinju: improved docs
90efee4 jurgenvinju: fixed badmerge
1133cb6 jurgenvinju: added weirdly missing semicolon
56bfa86 jurgenvinju: removed dead code
c92e930 jurgenvinju: parameter type changed
cb2ac66 jurgenvinju: fixed NPEs in module-info resolution from .class files
0d22d21 jurgenvinju: replaced reference ASTs with new shapes after manual confirmation
ed9ee6e jurgenvinju: added missing annotations field back in
5bb69f1 jurgenvinju: manually verified junit m3. only difference in typeDependency for mis…
3ce877d jurgenvinju: Revert "manually verified junit m3. only difference in typeDependency…
d654b64 jurgenvinju: report all different relations in order to not feel safe after fixing…
478204f jurgenvinju: added fully qualified names to names as well
bd0ecb4 jurgenvinju: added fully qualified names for simple names as well, for lookup and …
fbd649b jurgenvinju: do not print progress of modules without tests in them
f58dd77 jurgenvinju: made progress bars slightly more robust against output lines that do …
85b8db4 jurgenvinju: fixed duplicate file extension bug
03e3382 jurgenvinju: improved compareM3s reporting by shortening the output to the first 5…
932be2c jurgenvinju: rewrote the M3 and AST testing code for simplicity and easier manual …
8cfed03 jurgenvinju: promoted handy utility function to IO library module
de2d2a8 jurgenvinju: compared snakes M3 manually and all changes are ok
39d9e6f jurgenvinju: fixed bug
03c3357 jurgenvinju: manually checked all changes in M3 extractor. Many changes are cause…
b3b9c0c jurgenvinju: fixed hamcrest jar name for junit classpath
ec0b696 jurgenvinju: removed unused interface
2cbf100 jurgenvinju: whitespace and comments
532323d jurgenvinju: consistent messages in compareM3s
f49047c jurgenvinju: Merge branch 'main' into m3-jls-13-and-higher
acd7c76 jurgenvinju: meticulous manual checking of junit M3
11f546c jurgenvinju: regenerated ASTs after superfluous merge from main
860cd30 jurgenvinju: checked differences in snakes and ladders regression tests manually a…
6f9bac7 jurgenvinju: regenerated M3 binaries after minor update
3637eb5 jurgenvinju: Merge branch 'main' into m3-jls-13-and-higher
5f8b840 jurgenvinju: Merge branch 'main' into m3-jls-13-and-higher
0e3b120 jurgenvinju: added .length array field back to fieldAccess
a46a36e jurgenvinju: java M3 and AST extraction functions now check if path elements do ex…
3664287 jurgenvinju: JUnit ASTs now pass the full AST specification
bc7e238 jurgenvinju: Snakes and ladders ASTs satisfy the full AST spec
915c3be jurgenvinju: java root package bug fixed
8fa72c1 jurgenvinju: Java root package bug also fixed for junit4 reference model
f0c5d3e jurgenvinju: improved the M3 correctness specification
ba936da jurgenvinju: organize imports
0d32f78 jurgenvinju: jar extraction must be from a location that is the same on all build …
8361385 jurgenvinju: removed superfluous testResultlistener.start and refactored a bit
2fd49c8 jurgenvinju: fixed a typo
467b914 jurgenvinju: updated newObject matches and constructions with type arguments
3943a9d jurgenvinju: fixed a big number of downstream updates to the AST patterns in JavaT…
c6ee41c jurgenvinju: JavaToObjectFlow is correct again, statically, but not yet supporting…
efa6401 jurgenvinju: added new missing cases for calling constructors and methods from sup…
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,7 +16,9 @@ The concept of _declaration_ is also relevant. A `decl` annotation points from a | |
Finally, the concept of a _type_ is relevant for ASTs. In particular an `Expression` may have a `typ` annotation, or a variable declaration, etc. | ||
} | ||
@benefits{ | ||
* Symbolic abstract syntax trees can be analyzed and transformed easily using Rascal primitives such as patterns, comprehensions and visit. | ||
* Symbolic abstract syntax trees can be analyzed and transformed easily using Rascal primitives such as patterns, comprehensions and visit. | ||
* By re-using recognizable names for different programming languages, it's easier to switch between languages to analyze. | ||
* Some algorithms may be reusable on different programming languages, but please be aware of the _pitfalls_. | ||
} | ||
@pitfalls{ | ||
* Even though different languages may map to the same syntactic construct, this does not mean that the semantics is the same. Downstream | ||
|
@@ -26,43 +28,93 @@ module analysis::m3::AST | |
|
||
import Message; | ||
import Node; | ||
import IO; | ||
import Set; | ||
import util::Monitor; | ||
import analysis::m3::TypeSymbol; | ||
|
||
@synopsis{For metric purposes we can use a true AST declaration tree, a simple list of lines for generic metrics, or the reason why we do not have an AST.} | ||
data \AST(loc file = |unknown:///|) | ||
= declaration(Declaration declaration) | ||
| lines(list[str] contents) | ||
| noAST(Message msg) | ||
; | ||
|
||
loc unknownSource = |unknown:///|; | ||
loc unresolvedDecl = |unresolved:///|; | ||
loc unresolvedType = |unresolved:///|; | ||
|
||
@synopsis{Uniform name for everything that is declared in programming languages: variables, functions, classes, etc.} | ||
@description{ | ||
Instances of the Declaration type represent the _syntax_ of declarations in programming languages. | ||
|
||
| field name | description | | ||
| ---------- | ----------- | | ||
| `src` | the exact source location of the declaration in a source file | | ||
| `decl` | the resolved fully qualified name of the artefact that is being declared here | | ||
| `typ` | a symbolic representation of the static type of the declared artefact here (not the syntax of the type) | | ||
} | ||
👍 thanks for also writing all this documentation.
||
data Declaration( | ||
loc src = |unknown:///|, | ||
loc decl = |unresolved:///|, //unresolvedDecl | ||
TypeSymbol typ = \any(), | ||
list[Modifier] modifiers = [], | ||
list[Message] messages = [] | ||
loc decl = |unresolved:///|, | ||
TypeSymbol typ = unresolved() | ||
); | ||
|
||
@synopsis{Uniform name for everything that is typically a _statement_ in programming languages: assignment, loops, conditionals, jumps} | ||
@description{ | ||
Instances of the Statement type represent the _syntax_ of statements in programming languages. | ||
|
||
| field name | description | | ||
| ---------- | ----------- | | ||
| `src` | the exact source location of the statement in a source file | | ||
| `decl` | if the statement directly represents a usage of a declared artefact, then this points to the fully qualified name of the used artefact | | ||
} | ||
data Statement( | ||
loc src = |unknown:///|, | ||
loc decl = |unresolved:///| //unresolvedDecl | ||
loc decl = |unresolved:///| | ||
); | ||
|
||
@synopsis{Uniform name for everything that is an _expression_ in programming languages: arithmetic, comparisons, function invocations, ...} | ||
@description{ | ||
Instances of the Expression type represent the _syntax_ of expressions in programming languages. | ||
|
||
| field name | description | | ||
| ---------- | ----------- | | ||
| `src` | the exact source location of the expression in a source file | | ||
| `decl` | if this expression represents a usage, decl is the resolved fully qualified name of the artefact that is being used here | | ||
| `typ` | a symbolic representation of the static type of the _result_ of the expression | | ||
} | ||
data Expression( | ||
loc src = |unknown:///|, | ||
loc decl = |unresolved:///|, //unresolvedDecl, | ||
TypeSymbol typ = \any() | ||
loc decl = |unresolved:///|, | ||
TypeSymbol typ = \unresolved() | ||
); | ||
|
||
@synopsis{Uniform name for everything that is a _type_ in programming languages syntax: int, void, List<Expression>, ...} | ||
@description{ | ||
Instances of the Type type represent the _syntax_ of types in programming languages. | ||
|
||
| field name | description | | ||
| ---------- | ----------- | | ||
| `src` | the exact source location of the type in a source file | | ||
| `decl` | the fully qualified name of the type, if resolved and if well-defined | | ||
| `typ` | a symbolic representation of the static type that is the meaning of this type expression | | ||
} | ||
data Type( | ||
loc name = |unresolved:///|, //unresolvedType, | ||
TypeSymbol typ = \any() | ||
loc src = |unknown:///|, | ||
loc decl = |unresolved:///|, | ||
TypeSymbol typ = \unresolved() | ||
); | ||
|
||
data Modifier; | ||
@synopsis{Uniform name for everything that is a _modifier_ in programming languages syntax: public, static, final, etc.} | ||
@description{ | ||
Instances of the Modifier type represent the _syntax_ of modifiers in programming languages. | ||
|
||
| field name | description | | ||
| ---------- | ----------- | | ||
| `src` | the exact source location of the modifier in a source file | | ||
} | ||
data Modifier( | ||
loc src = |unknown:///| | ||
); | ||
|
||
data Bound; | ||
|
||
@synopsis{Test for the consistency characteristics of an M3 annotated abstract syntax tree} | ||
bool astNodeSpecification(node n, str language = "java", bool checkNameResolution=false, bool checkSourceLocation=true) { | ||
|
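The keyword fields declared in the hunk above (`src`, `decl`, `typ`) are common to every constructor of the respective data type, whichever language module adds that constructor. A minimal sketch of how that plays out, assuming a hypothetical module `KeywordFieldSketch` and a hypothetical constructor `intLiteral` (neither is part of this PR; real constructors come from language modules such as the Java AST definition):

```rascal
module KeywordFieldSketch

// Re-use the generic declarations, including the common keyword fields.
extend analysis::m3::AST;

// Hypothetical constructor, added only for this illustration.
data Expression = intLiteral(str val);

// The positional argument carries the syntax; `src` is one of the common
// keyword fields and is given here as offset 10, length 2.
Expression fortyTwo = intLiteral("42", src=|file:///tmp/Example.java|(10, 2));

// The explicitly provided keyword field can be read back.
test bool srcIsStored()
  = fortyTwo.src.offset == 10 && fortyTwo.src.length == 2;

// `decl` was not provided, so reading it yields the declared default.
test bool declDefaultsToUnresolved()
  = fortyTwo.decl == |unresolved:///|;
```

Because the fields are keyword parameters with defaults, a front-end that cannot resolve a name can simply leave `decl` out, and downstream code can still read the field and compare it against `|unresolved:///|`.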
@@ -75,35 +127,89 @@ bool astNodeSpecification(node n, str language = "java", bool checkNameResolutio | |
int end(loc l) = l.offset + l.length; | ||
bool leftToRight(loc l, loc r) = end(l) <= begin(r); | ||
bool leftToRight(node a, node b) = leftToRight(pos(a), pos(b)); | ||
bool included(node parent, node child) = begin(parent) <= begin(child) && end(child) <= end(parent); | ||
|
||
if (checkSourceLocation) { | ||
// all nodes have src annotations | ||
assert all(/node x := n, x.src?); | ||
|
||
// siblings are sorted in the input, even if some of them are lists | ||
assert all(/node x := n, [*_, node a, node b, *_] := getChildren(x), leftToRight(a,b)); | ||
assert all(/node x := n, [*_, node a, [node b, *_], *_] := getChildren(x), leftToRight(a,b)); | ||
assert all(/node x := n, [*_, [*_, node a], node b, *_] := getChildren(x), leftToRight(a,b)); | ||
assert all(/node x := n, [*_, [*_, node a], [node b, *_], *_] := getChildren(x), leftToRight(a,b)); | ||
assert all(/[*_, node a, node b, *_] := n, leftToRight(a,b)); | ||
|
||
// children positions are included in the parent input scope | ||
assert all(/node parent := n, /node child := parent, begin(parent) <= begin(child), end(child) <= end(parent)); | ||
// all AST nodes have src annotations | ||
for (/node x := n, TypeSymbol _ !:= x, Message _ !:= x, Bound _ !:= x) { | ||
if (!(x.src?)) { | ||
println("No .src annotation on: | ||
' <x>"); | ||
return false; | ||
} | ||
|
||
// Note that by removing all the (unannotated) empty lists here, we cover many more complex situations | ||
// below in detecting adjacent nodes in syntax trees. | ||
children = [ e | e <- getChildren(x), e != []]; | ||
|
||
// Here we collect all the possible ways nodes can be direct siblings in an abstract syntax tree: | ||
siblings = [ | ||
*[<a,b> | [*_, node a, node b, *_] := children], // adjacent nodes | ||
*[<a,b> | [*_, node a, [node b, *_], *_] := children], // node followed by non-empty list | ||
*[<a,b> | [*_, [*_, node a], node b, *_] := children], // non-empty list followed by node | ||
*[<a,b> | [*_, [*_, node a], [node b, *_], *_] := children], // adjacent non-empty lists | ||
*[<a,b> | [*_, [*_, node a, node b, *_], *_] := children] // nodes inside a list (elements can not be lists again) | ||
]; | ||
|
||
// Note that by induction: if all the pairwise adjacent siblings are in-order, then all siblings are in order | ||
|
||
// siblings are sorted in the input, even if some of them are lists | ||
for (<a,b> <- siblings) { | ||
if (!leftToRight(a, b)) { | ||
println("Siblings are out of order: | ||
'a : <a.src> is <a> | ||
'b : <b.src> is <b>"); | ||
return false; | ||
} | ||
if (ab <- [a,b], !included(n, ab)) { | ||
println("Child location is not covered by the parent location: | ||
' parent: <n.src> | ||
' child : <ab.src>, is <ab>"); | ||
return false; | ||
} | ||
} | ||
|
||
// if ([*_, [*_, [*_], *_], *_] := getChildren(x)) { | ||
// println("Node contains a directly nested list: | ||
// ' <n.src> : <n>"); | ||
// return false; | ||
// } | ||
|
||
// if ([_, *_, str _, *_] := children || [*_, str _, *_, _] := children) { | ||
// println("Literals and identifiers must be singletons: | ||
// ' <n>"); | ||
// return false; | ||
// } | ||
} | ||
} | ||
|
||
if (checkNameResolution) { | ||
// all resolved names have the language as schema prefix | ||
//TODO: for the benefit of the compiler, changed | ||
// assert all(/node m := n, m.decl?, /^<language>/ := decl(m).scheme); | ||
//to: | ||
for(/node m := n){ | ||
assert m.decl? && /^<language>/ := decl(m).scheme; | ||
for (/node m := n, m.decl?) { | ||
if (decl(m).scheme == "unresolved") { | ||
println("Use decl has remained unresolved at <m.src>."); | ||
} | ||
else if (/^<language>/ !:= decl(m).scheme) { | ||
println("<m.decl> has a strange loc scheme at <m.src>"); | ||
return false; | ||
} | ||
} | ||
} | ||
|
||
return true; | ||
} | ||
|
||
@synopsis{Check the AST node specification on a (large) set of ASTs and monitor the progress.} | ||
bool astNodeSpecification(set[node] toCheck, str language = "java", bool checkNameResolution=false, bool checkSourceLocation=true) | ||
= job("AST specification checker", bool (void (str, int) step) { | ||
for (node ast <- toCheck) { | ||
step(loc l := ast.src ? l.path : "AST without src location", 1); | ||
if (!astNodeSpecification(ast, language=language, checkNameResolution=checkNameResolution, checkSourceLocation=checkSourceLocation)) { | ||
return false; | ||
} | ||
} | ||
|
||
return true; | ||
}, totalWork=size(toCheck)); | ||
|
||
|
||
|
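The induction note in the new checker (if every pair of adjacent siblings is in order, then all siblings are in order) can be illustrated in isolation. The following is a sketch under assumptions, not the code from this PR: it works on plain `loc` values instead of AST nodes, the module name `SiblingOrderSketch` and the function names `endOffset` and `adjacentInOrder` are hypothetical, and `leftToRight` is rewritten here with `r.offset` because the `begin` helper is not visible in the diff.

```rascal
module SiblingOrderSketch

// End offset of a source location, mirroring the `end` helper in the diff.
int endOffset(loc l) = l.offset + l.length;

// True when l ends before, or exactly where, r begins.
bool leftToRight(loc l, loc r) = endOffset(l) <= r.offset;

// Only adjacent pairs are compared. Since leftToRight holds for each
// neighbouring pair, it follows by induction for any two siblings.
bool adjacentInOrder(list[loc] siblings) {
  for ([*_, loc a, loc b, *_] := siblings) {
    if (!leftToRight(a, b)) {
      return false;
    }
  }
  return true;
}

// Three consecutive, non-overlapping locations are accepted.
test bool orderedSiblings()
  = adjacentInOrder([|file:///A.java|(0, 3), |file:///A.java|(3, 2), |file:///A.java|(6, 1)]);

// The second location starts inside the first, so the check fails.
test bool overlappingSiblings()
  = !adjacentInOrder([|file:///A.java|(0, 5), |file:///A.java|(3, 2)]);
```

The checker in the diff applies the same idea to the `.src` fields of sibling nodes, after dropping empty lists from the children so that nodes separated only by an empty list still count as adjacent.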
should this also have a `testResultListener.done();` call?
definitely! good catch @DavyLandman. I removed the superfluous call to `testResultListener.start` instead. This will help a lot with the reporting in the IDE and GitHub Actions.