Skip to content

Commit

Permalink
update readme
Browse files Browse the repository at this point in the history
  • Loading branch information
drkane committed Oct 26, 2020
1 parent 39c8a03 commit 28f7d97
Showing 1 changed file with 46 additions and 2 deletions.
48 changes: 46 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,11 @@
[![Tests](https://github.com/drkane/datasette-reconcile/workflows/Test/badge.svg)](https://github.com/drkane/datasette-reconcile/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/drkane/datasette-reconcile/blob/main/LICENSE)

Adds a reconciliation API to Datasette.
Adds a reconciliation API endpoint to Datasette, based on the [Reconciliation Service API](https://reconciliation-api.github.io/specs/latest/) specification.

The reconciliation API is used to match a set of strings to their correct identifiers, to help with disambiguation and consistency in large datasets. For example, the strings "United Kingdom", "United Kingdom of Great Britain and Northern Ireland" and "UK" could all be used to identify the country which has the ISO country code `GB`. It is particularly implemented in [OpenRefine](https://openrefine.org/).

The plugin adds a `/reconcile` endpoint to a table served by datasette, which responds based on the Reconciliation Service API specification. In order to activate this endpoint you need to configure the reconciliation service, as dscribed in the [usage](#usage) section.

## Installation

Expand All @@ -15,7 +19,47 @@ Install this plugin in the same environment as Datasette.

## Usage

Usage instructions go here.
### Plugin configuration

The plugin should be configured using Datasette's [`metadata.json`](https://docs.datasette.io/en/stable/metadata.html) file. The configuration can be put at the root, database or table layer of `metadata.json`, for most use cases it will make most sense to configure at the table level.

Add a `datasette-reconcile` object under `plugins` in `metadata.json`. This should look something like:

```json
{
"databases": {
"sf-trees": {
"tables": {
"Street_Tree_List": {
"plugins": {
"datasette-reconcile": {
"id_field": "id",
"name_field": "name",
"type_field": "type",
"type_default": "Tree",
"max_limit": 5,
"service_name": "Tree reconciliation"
}
}
}
}
}
}
}
```

The only required item in the configuration is `name_field`. This refers to the field in the table which will be searched to match the query text.

The rest of the configuration items are optional, and are as follows:

- `id_field`: The field containing the identifier for this entity. If not provided, and there is a primary key set, then the primary key will be used. A primary key of more than one field will give an error.
- `type_field`: If provided, this field will be used to determine the type of the entity. If not provided, then the `type_default` setting will be used instead.
- `type_default`: If provided, this value will be used as the type of every entity returned. If not provided the default of `Object` will be used for every entity.
- `max_limit`: The maximum number of records that a query can request to return. This is 5 by default. A individual query can request fewer results than this, but it cannot request more.
- `service_name`: The name of the reconciliation service that will appear in the service manifest. If not provided it will take the form `<database name> <table name> reconciliation`.
- `identifierSpace`: [Identifier space](https://reconciliation-api.github.io/specs/latest/#identifier-and-schema-spaces) given in the service manifest. If not provided a default of `http://rdf.freebase.com/ns/type.object.id` is used.
- `schemaSpace`: [Schema space](https://reconciliation-api.github.io/specs/latest/#identifier-and-schema-spaces) given in the service manifest. If not provided a default of `http://rdf.freebase.com/ns/type.object.id` is used.


## Development

Expand Down

0 comments on commit 28f7d97

Please sign in to comment.