Commit

Merge pull request #29 from usethesource/add-php-analysis
added php-analysis project
jurgenvinju authored Sep 22, 2023
2 parents 8b2ef8f + fda86cc commit 5320b44
Showing 5 changed files with 90 additions and 51 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/website-builder.yml
@@ -50,8 +50,8 @@ jobs:
uses: algolia/[email protected]
id: algolia_crawler
with: # mandatory parameters
crawler-user-id: ${{ secrets.CRAWLER_USER_ID }}
crawler-user-id: ${{ secrets.CRAWLER_ID }}
crawler-api-key: ${{ secrets.CRAWLER_API_KEY }}
algolia-app-id: ${{ secrets.ALGOLIA_APP_ID }}
algolia-api-key: ${{ secrets.ALGOLIA_API_KEY }}
site-url: 'https://www.rascal-mpl.org'
site-url: 'https://www.rascal-mpl.org'
67 changes: 19 additions & 48 deletions courses/Rascal/Expressions/Values/Location/Location.md
@@ -57,62 +57,31 @@ URIs are explained in [Uniform Resource Identifier](http://en.wikipedia.org/wiki
The elements of a location value can be accessed and modified using the standard mechanism of field selection and field assignment. The corresponding field names are:

* `top`: the URI of the location without precise positioning information (offset, length, begin, end).

* `uri`: the URI of the location as a string. Also subfields of the URI can be accessed:

** `scheme`: the scheme (or protocol) to be used;

** `authority`: the domain where the data are located, as a `str`;

** `host`: the host where the URI is hosted (part of authority), as a `str`;

** `port`: port on host (part of authority), as an `int`;

** `path`: path name of file on host, as a `str`;

** `extension`: file name extension, as a `str`;

** `query`: query data, as a `str`;

** `fragment`: the fragment name following the path name and query data, as a `str`;

** `user`: user info (only present in schemes like mailto), as a `str`;

** `parent` : removes the last segment from the path component, if any, as a `loc`;

** `file` : the last segment of the path, as a `str`;

** `ls` : the contents of a directory, if the loc is a directory, as a `list[loc]`.

* `scheme`: the scheme (or protocol) to be used;
* `authority`: the domain where the data are located, as a `str`;
* `host`: the host where the URI is hosted (part of authority), as a `str`;
* `port`: port on host (part of authority), as an `int`;
* `path`: path name of file on host, as a `str`;
* `extension`: file name extension, as a `str`;
* `query`: query data, as a `str`;
* `fragment`: the fragment name following the path name and query data, as a `str`;
* `user`: user info (only present in schemes like mailto), as a `str`;
* `parent` : removes the last segment from the path component, if any, as a `loc`;
* `file` : the last segment of the path, as a `str`;
* `ls` : the contents of a directory, if the loc is a directory, as a `list[loc]`.
* `offset`: start of text area.

* `length`: length of text area.

* `begin.line`, `begin.column`: begin line and column of text area.

* `end.line`, `end.column`: end line and column of text area.
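
The field names above can be tried out with a small sketch like the following. The module name `LocationFields` and the function `showFields` are hypothetical and used only for illustration; the location value is the same one used in the example further on.

```rascal
module LocationFields

import IO;

// Hypothetical helper: selects a few of the fields listed above.
void showFields() {
  loc l = |file:///home/paulk/pico.trm|(0,1,<2,3>,<4,5>);

  println(l.scheme);       // file
  println(l.path);         // /home/paulk/pico.trm
  println(l.file);         // pico.trm
  println(l.extension);    // trm
  println(l.offset);       // 0
  println(l.length);       // 1
  println(l.begin.line);   // 2
  println(l.end.column);   // 5

  // Field assignment works on a variable that holds a location.
  loc m = |file:///home/paulk/pico.trm|;
  m.path = "/home/paulk/other.trm";
  println(m.path);         // /home/paulk/other.trm
}
```

Calling `showFields()` from the Rascal shell after importing the module prints one field value per line.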

These are the supported protocol schemes:

| Scheme name and pattern | Description |
| --- | --- |
| `file:///<path>` | for absolute file names in the OS filesystem |
| `project://<projectName>/<path>` | relative to an IDE's workspace; the authority part is a project name and `/` is the root of the source project. Other projects can be found this way only in an IDE context. When running standalone on the command line or using Maven, only the current project is resolved. |
| `target://<projectName>/<path>` | relative to an IDE's workspace; the authority part is a project name and `/` is the root of the binary target path. For example, Java's `.class` files end up there |
| `tmp:///<path>` | finds the OS's folder for temporary files |
| `home:///<path>` | finds the current user's home folder |
| `cwd:///<path>` | finds the OS's current working directory |
| `std:///<path>` | resolves to the (installed) location of the Rascal standard library |
| `zip+<scheme>://<authority>/<pathToZip>!/<pathInsideZip>` | is for reading and writing into zip archives |
| `jar+<scheme>://<authority>/<pathToJar>!/<pathInJar>` | is for reading and writing into jar archives |
| `plugin://<bundleName>/<path>` | resolves to an Eclipse plugin (extracted) resource location; it resolves via an OSGI `bundleresource://` |
| `bundleresource://<bundleId>/<path>` | resolves to an OSGI bundle (extracted) resource location. This often resolves to a `jar+file://<filePath>!/<pathInJar>`, but it could also resolve to a filesystem location, depending on the configuration options of the bundle. |

Locations with specific position information should always be generated automatically but for the curious here is an example:
All supported schemes are listed [here]((Locations)).

Locations with specific position information are normally generated automatically (e.g. by parsers), but for the curious here is an example:
```rascal-shell
|file:///home/paulk/pico.trm|(0,1,<2,3>,<4,5>)
```
Note that this is equivalent to using the `home` scheme:
Note that this example is equivalent to using the `home` scheme (assuming the current user's home folder is `/home/paulk`):
```rascal-shell
|home:///pico.trm|(0,1,<2,3>,<4,5>)
```
@@ -130,8 +99,10 @@ x = |tmp://myTempDirectory|;
x += "myTempFile.txt";
```


#### Benefits

* locations are values, but they can be interpreted as references.

#### Pitfalls

* if a location naming scheme is not _unique_ (read: inaccurate) then downstream analyses are similarly inaccurate.
53 changes: 53 additions & 0 deletions courses/Rascal/Locations/Locations.md
@@ -0,0 +1,53 @@
---
title: Locations
---

#### Synopsis

((Values-Location)) are a central mechanism in Rascal for referring to (parts of) files and for referring to language constructs that have "qualified names".

#### Description

The syntax of locations and the operations defined on them are described in ((Location)) and ((IO)). This document gives an overview of all provided URI schemes, with their syntax and semantics.

These are all schemes that map to locations of files in physical or logical file systems:

| scheme | description |
| -------- | ------- |
| `file:///<path>` | an absolute file-system path. The separator is always `/` |
| `cwd:///<path>` | file-system path relative to the current working directory of the JVM running Rascal |
| `tmp:///<path>` | file-system path relative to where this OS/JVM thinks the temp folder is |
| `home:///<path>` | file-system path relative to where the current user's home directory is |
| `std:///<path>` | opaque virtual file system that points to the root of the (deployed) Rascal standard library |
| `memory://<filesystem-name>/<path>` | fast in-memory file system whose contents do not survive between runs of the JVM. Guarantees `lastModified` is incremented after every write |
| `jar+<scheme>://<authority>/<jar-path>!/<in-jar-path>` | file-system for what is inside a jar file |
| `zip+<scheme>://<authority>/<zip-path>!/<in-zip-path>` | file-system for what is inside a zip file |
| `project://<project-name>/<path>` | opaque file-system that is relative to the root of an IDE project in the current workspace of an IDE. The project must be "open" and active for this to work. |
| `target://<project-name>/<path>` | opaque file-system that is relative to the (binary) target compilation folder of a project that is active and open in the workspace of the current IDE |
| `lib://<lib-project-name>/<path>` | opaque file-system that points to the deployed code of a Rascal library. The library must have a RASCAL.MF file with the right `Project-Name` in it. The scheme may wrap/hide a target folder or a deployed jar file, depending on the situation in the IDE. Opened and active projects are resolved to their target folders, while projects we depend on in `pom.xml` that are not opened typically resolve to their installed jar files in the user's `.m2` folder |
| `https://<host>/<path>?<query>#<fragment>` | Simply a page on a website. |
| `http://<host>/<path>?<query>#<fragment>` | Simply a page on a website. |
| `system:///<path>` | this is the root of the JVM class and resource path for the current JVM |
| `bundle://<bundle-name>/<path>` | this is an opaque file system into an OSGI bundle (which must be loaded and initialized). Works only in Eclipse. It defers to `bundleresource://` after the bundle instance id has been resolved. |
| `plugin://<plugin-name>/<path>` | this is an opaque file system into an Eclipse plugin (which must be loaded and initialized). Works only in Eclipse. It defers to `bundle://` after the bundle name is associated with the plugin name. |
| `bundleresource://<bundle-id>/<path>` | this is an opaque file system into an (unzipped) resource partition of an OSGI bundle (which must be loaded and initialized). Works only in Eclipse |

For all of these schemes it is natural to use offset/length and start/end line/column information as described [here]((Values-Location)).

To get access to a file in your current project you can use `project://<your-project>`; however, this only works in development mode in your IDE. For a more robust reference to a local file, which also works after deployment, use ((IO::findResources)).

Next to the above file-systems we also map the "mangled" or "qualified" naming schemes of programming languages and domain-specific languages to locations. These exact references can then be used instead of file locations to analyze source code and to generate (interactive) visualizations.

The general scheme is this: `<language-name>+<concept-name>://<name-resolution-authority>/<qualified>/<name>`:
* language-name would be `java`, `php`, `cpp`, etc.
* concept-name would be `class`, `method`, `interface`, `parameter`, `variable`, `trait`
* resolution-authority defines what the scope of name resolution is. It can be empty if you are doing an open-world analysis, or it could be the name of the top project you are currently analyzing. That project defines the search path for all of the other names and library projects used.
* qualified-name is typically represented by the names of nested scopes in the programming languages. Sometimes explicit names are generated for implicit concepts (such as anonymous classes and lambdas).
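
As a small illustration of the general scheme, the sketch below builds a qualified-name location and selects its parts. The module name `QualifiedNames`, the function `showQualifiedName` and the class name `org/myorg/MainClass` are hypothetical; the authority is left empty, as in an open-world analysis.

```rascal
module QualifiedNames

import IO;

// Hypothetical helper: takes a qualified-name location apart.
void showQualifiedName() {
  // <language-name>+<concept-name> is the scheme; the path holds the qualified name
  loc c = |java+class:///org/myorg/MainClass|;

  println(c.scheme);  // java+class
  println(c.path);    // /org/myorg/MainClass
  println(c.file);    // MainClass
}
```

Because such a location is an ordinary `loc` value, the same field selection that works for file locations also works here.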

In the [M3]((analysis::m3::Core)) model, qualified names are mapped to their declaration locations in the `.declarations` table. This table is also used by ((analysis::m3::Registry)) to provide interactive resolution. This is how Rascal jumps to the source code of a qualified name location when it is clicked.

The other direction in the [M3]((analysis::m3::Core)) core model is the `.uses` table. This maps the places where a declaration
is used to its qualified name. By composing `.uses` with `.declarations` you get a reference resolver.
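
A minimal sketch of such a resolver is shown here, assuming an M3 model has already been extracted for the project under analysis. The module name `ResolveUses` and the function name `resolveUses` are hypothetical; only the `uses` and `declarations` tables and the relation composition operator `o` are taken from Rascal itself.

```rascal
module ResolveUses

import analysis::m3::Core;

// Hypothetical helper: maps every source position where a name is used
// to the source position(s) where that name is declared.
// uses:         source position -> qualified name
// declarations: qualified name  -> source position
rel[loc, loc] resolveUses(M3 model) = model.uses o model.declarations;
```

The composition drops the qualified name in the middle, so the result relates file locations directly to file locations.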

It is also common to annotate Parse Trees and Abstract Syntax Trees with qualified names after name and type resolution. You can use a [keyword field]((AlgebraicDataType)) of an algebraic data-type for this. It is typically called `decl`, as in `data Expression = variable(str identifier, loc decl = |unknown:///|)`, and the last part of the qualified name in `decl` would be equal to the `identifier`: `java+variable:///org/myorg/MainClass/main/myVar`.

1 change: 1 addition & 0 deletions courses/Rascal/Rascal.md
@@ -6,6 +6,7 @@ details:
- Patterns
- Expressions
- Statements
- Locations
- Tests
- Errors
---
16 changes: 15 additions & 1 deletion pom.xml
@@ -35,6 +35,7 @@
<salix-contrib.version>0.2.2</salix-contrib.version>
<rascal-lsp.version>2.18.0</rascal-lsp.version>
<drambiguity.version>0.1.10</drambiguity.version>
<php-analysis.version>0.2.1-RC1</php-analysis.version>
</properties>

<build>
@@ -147,7 +148,15 @@
<outputDirectory>${project.basedir}</outputDirectory>
<includes>docs/**/*.*</includes>
<excludes>docs/index.value</excludes>
</artifactItem>
</artifactItem>
<artifactItem>
<groupId>org.rascalmpl</groupId>
<artifactId>php-analysis</artifactId>
<version>${php-analysis.version}</version>
<outputDirectory>${project.basedir}</outputDirectory>
<includes>docs/**/*.*</includes>
<excludes>docs/index.value</excludes>
</artifactItem>
</artifactItems>
</configuration>
</execution>
@@ -271,5 +280,10 @@
<artifactId>drambiguity</artifactId>
<version>${drambiguity.version}</version>
</dependency>
<dependency>
<groupId>org.rascalmpl</groupId>
<artifactId>php-analysis</artifactId>
<version>${php-analysis.version}</version>
</dependency>
</dependencies>
</project>
