Commit 15a93f9: doc: refactor and update documentation
Elodie Thiéblin committed Aug 17, 2023 (1 parent: 27abf1c)
Showing 8 changed files with 174 additions and 188 deletions.
192 changes: 64 additions & 128 deletions README.md
# SparTDD

A Python and SPARQL based Thing Description Directory API compliant to:
https://www.w3.org/TR/wot-discovery/

To learn more about the routes and functions of this server, see
the [API documentation](doc/api.md).

## Configuration

The TDD API can be configured in two ways: by editing the
`config.toml` file or by setting environment variables. The two
methods can be mixed, with environment variables taking priority. For
each variable, the TDD API first looks for an environment variable; if it is
not defined, it falls back to the `config.toml` value; and if the variable is
defined in neither place, the default value is used.

The configuration variables are the same for both methods, except that
the environment variables must be prefixed with `TDD__` to avoid conflicts.
The `config.toml` file can also be used to define Flask server configuration (cf.
[documentation](https://flask.palletsprojects.com/en/2.1.x/config/#builtin-configuration-values)).
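The lookup order above can be sketched as follows. This is an illustrative sketch, not the actual SparTDD code; the `resolve` function and `DEFAULTS` dict are hypothetical names:

```python
import os

# Sketch (not the actual SparTDD code) of the priority rule described above:
# environment variable (with TDD__ prefix) > config.toml value > built-in default.
DEFAULTS = {"SPARQLENDPOINT_URL": "http://localhost:3030/things"}

def resolve(name, toml_values):
    env_value = os.environ.get("TDD__" + name)   # 1. environment variable
    if env_value is not None:
        return env_value
    if name in toml_values:                      # 2. config.toml value
        return toml_values[name]
    return DEFAULTS.get(name)                    # 3. built-in default
```

For instance, with `TDD__SPARQLENDPOINT_URL` unset, `resolve("SPARQLENDPOINT_URL", {})` falls through to the default value.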

### Configuration variables

| Variable name             | Default value                             | Description                                                                                         |
| ------------------------- | ----------------------------------------- | --------------------------------------------------------------------------------------------------- |
| [TDD__]TD_REPO_URL        | http://localhost:5000                     | The URL to access the TDD API server                                                                 |
| [TDD__]SPARQLENDPOINT_URL | http://localhost:3030/things              | The SPARQL endpoint URL                                                                              |
| [TDD__]TD_JSONSCHEMA      | ./tdd/data/td-json-schema-validation.json | The path to the JSON Schema file used to validate the TDs                                            |
| [TDD__]CHECK_JSON_SCHEMA  | False                                     | Whether the TDD API checks the TDs against the `TD_JSONSCHEMA` schema and SHACL shapes               |
| [TDD__]MAX_TTL            | None                                      | Integer, maximum time-to-live (in seconds) that a TD will be kept on the server (unlimited if None)  |
| [TDD__]MANDATE_TTL        | False                                     | Boolean; if True, only TDs with a time-to-live (ttl) value are uploaded, and the server sends a 400 HTTP code if a TD does not contain one |
| [TDD__]LIMIT_BATCH_TDS    | 25                                        | Default number of TDs returned per batch                                                             |
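For example, a `config.toml` overriding some of these defaults might look like the following. This is an illustrative sketch that assumes top-level keys matching the table above; the values are examples only:

```toml
# Illustrative config.toml sketch; keys follow the variable names above.
TD_REPO_URL = "http://localhost:5000"
SPARQLENDPOINT_URL = "http://localhost:3030/things"
CHECK_JSON_SCHEMA = true
LIMIT_BATCH_TDS = 50
```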

## Deploy to develop on the API

Install the JavaScript dependencies (the project relies on jsonld.js for JSON-LD):

```bash
npm ci
```

### Deploy a SPARQL endpoint

SparTDD relies on a SPARQL endpoint as its database.
You need to set one up before you run the project.

The [SPARQL endpoint documentation](doc/sparql-endpoints/README.md) provides
guidelines on how to set up your SPARQL endpoint.
### Run the flask server

First, set up your configuration (see [configuration](#configuration))
if your SPARQL endpoint URL is not the default http://localhost:3030/things.

Then run the flask server at the root of this project, in your python virtual environment:

```bash
export TDD__SPARQLENDPOINT_URL=<sparql endpoint url>
export TDD__TD_JSONSCHEMA=tdd/data/td-json-schema-validation.json
flask run
```

You can edit the `config.toml` file to change the configuration instead of using
environment variables if you prefer.

## Import data using script

To import the TDs from a directory to your SPARQL endpoint using the proxy api, run:
```bash
python scripts/import_all_plugfest.py /path/to/TDs/directory <WOT API URL>/things
```

To skip the JSON-Schema validation, add `check-schema=false` to the route as
follows:

```bash
python scripts/import_all_plugfest.py /path/to/TDs/directory <WOT API URL>/things?check-schema=false
```

To import a snapshot bundle (discovery data), use the dedicated script as follows:

```bash
python scripts/import_snapshot.py /path/to/snapshots.json <WOT API URL>/things
```

The `check-schema` param also works on this route.

## Deploy locally with docker-compose

For a quick launch of the SPARQL endpoint and TDD API with docker-compose:

```bash
chmod a+rwx fuseki-docker/configuration
chmod a+rwx fuseki-docker/databases
docker-compose build # builds api and sparqlendpoint
docker-compose up # runs api and sparqlendpoint
```

If you want to deploy only the TDD API using docker-compose and use an
existing SPARQL endpoint, edit the `config.toml` file with the
appropriate `SPARQLENDPOINT_URL` value (see [configuration](#configuration)).
Then run only the api image.
If the api image is already built you do not have to rebuild; relaunching it
will use the new config.

```bash
docker-compose build api # builds the api image
docker-compose run api # runs the api
```

## Deploy production

To deploy in production without docker or docker-compose, use
the following commands:

```bash
pip install .[prod]
gunicorn -b 0.0.0.0:5000 app:app
```

You can change the `-b` parameter to restrict access to localhost,
allow public access, or change the deployment port.

In these examples the configuration is set using environment variables, but you
can edit the `config.toml` file instead if you prefer.

## Code quality

16 changes: 13 additions & 3 deletions doc/api.md
## Flask server configuration

The following environment variable is mandatory:

- **TDD\_\_SPARQLENDPOINT_URL** the URI of your SPARQL endpoint service (e.g., http://localhost:3030/things)
  (or edit the `config.toml` file)

## Testing script

We created a small script, `scripts/import_all_plugfest.py`, to import a TD or a folder of TDs.
This script takes two arguments:

- the path towards the TD file or TD folder
- the tdd-api import route (e.g., http://localhost:5000/things)

It implements the [WoT-Discovery Exploration Mechanisms](https://w3c.github.io/w

Its compliance has been tested in PlugFest/TestFest events.
The results are listed here:

- 2022.03: https://github.com/w3c/wot-testing/blob/main/events/2022.03.Online/Discovery/Results/logilabtdd.csv
- TODO
- TODO

## Routes and schemas

The list of routes and how they were implemented can be viewed in [routes-diagrams.odg](routes-diagrams.odg).

Some schemas describing how JSON and RDF are dealt with in SparTDD
can be found in [schemas.odg](schemas.odg).
Binary file modified doc/schemas.odg
42 changes: 42 additions & 0 deletions doc/sparql-endpoints/README.md
# Configuring SparTDD for different SPARQL endpoints

## General requirements

You can either use a remote SPARQL server or run a SPARQL server locally.

The SPARQL endpoint you configure must:

- Allow SPARQL UPDATE queries
- Allow named graphs
- Be configured so that the default graph is the union of the named graphs
- Allow CORS
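A quick way to probe the first requirement is to send a small SPARQL UPDATE over the SPARQL 1.1 protocol and check the status code. This is a hypothetical smoke test, not part of SparTDD; the graph and triple names are illustrative:

```python
import urllib.error
import urllib.parse
import urllib.request

def accepts_update(endpoint_url):
    """Return True if the endpoint accepts a SPARQL UPDATE request (sketch)."""
    update = 'INSERT DATA { GRAPH <urn:example:probe> { <urn:s> <urn:p> "probe" } }'
    body = urllib.parse.urlencode({"update": update}).encode()
    req = urllib.request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return 200 <= resp.status < 300
    except urllib.error.URLError:
        # Connection refused, or the server rejected the UPDATE.
        return False
```

Note that this writes a probe triple into a named graph, so only run it against a test dataset.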

## Using Apache Jena Fuseki

We propose to use Apache Jena Fuseki, which has a nice administration interface.
Download the Fuseki project (apache-jena-fuseki-X.Y.Z.zip) from
https://jena.apache.org/download/index.cgi

Then unzip the downloaded archive.
To launch the server, in the apache-jena-fuseki-X.Y.Z folder, run

```bash
./fuseki-server
```

The server will run on http://localhost:3030.
If you want to create the dataset with the right configuration, you can copy-paste
`fuseki-docker/configuration/things.ttl` into `apache-jena-fuseki-X.Y.Z/run/configuration`

```bash
cp fuseki-docker/configuration/things.ttl path/to/apache-jena-fuseki-X.Y.Z/run/configuration
```

More documentation on Fuseki in this project is available in [fuseki.md](fuseki.md)
(for further configuration or docker configuration).

## Using Virtuoso

More documentation on Virtuoso in this project is available in [virtuoso.md](virtuoso.md).

## Using GraphDB
20 changes: 11 additions & 9 deletions doc/fuseki.md → doc/sparql-endpoints/fuseki.md
The port of the image is 3030.
## Fuseki docker image environment variables

We have set the following variables for the fuseki docker image :

- **ENABLE_UPLOAD**: "true" -- to allow file upload (needed to import all triples)
- **ASSEMBLER**: "/fuseki-base/configuration/things.ttl" -- this file, which in
  this repository is under `fuseki-docker/configuration/things.ttl`, will create
  a `things` service with a TDB dataset in the fuseki endpoint at launch time.
- **ADMIN_PASSWORD**: _your desired password_

We have set three shared volumes on the image:

- **/fuseki-base/configuration** folder where the configurations of the services
  are read and stored by the fuseki endpoint
- **/fuseki-base/databases** folder where the TDB files (persistent RDF databases)
  are read and stored by the fuseki endpoint
- **/fuseki-base/config.ttl** the configuration file for the whole endpoint. This
  file will only be read by the fuseki server, as no modification of this file
  is possible at runtime.

## Fuseki Service Configuration

We propose a default configuration for a `/things` service on the fuseki sparql
endpoint. This configuration file is in `fuseki-docker/configuration/things.ttl`.

Two points are important in this configuration:

- The dataset must be persistent (TDB or TDB2) so that the data is not lost on restart
- The default graph must be the union of all graphs (`unionDefaultGraph` option)
  so that all named graphs can be queried without adding a GRAPH keyword everywhere.

For a TDB Dataset :

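The repository's snippet is collapsed in this view; as an illustration only, a Fuseki assembler fragment enabling both points above typically looks like the following (resource names and paths are examples; `fuseki-docker/configuration/things.ttl` is the authoritative version):

```turtle
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix tdb:    <http://jena.hpl.hp.com/2008/tdb#> .

# Illustrative sketch only; see fuseki-docker/configuration/things.ttl
# for the configuration actually shipped with this repository.
<#service> a fuseki:Service ;
    fuseki:name          "things" ;   # served at /things
    fuseki:serviceQuery  "sparql" ;
    fuseki:serviceUpdate "update" ;   # SPARQL UPDATE enabled
    fuseki:dataset       <#dataset> .

<#dataset> a tdb:DatasetTDB ;
    tdb:location          "/fuseki-base/databases/things" ;  # persistent TDB
    tdb:unionDefaultGraph true .   # default graph = union of named graphs
```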
