Storage
Naga abstracts storage behind a Clojure protocol.
Storage is expected to be a graph store with a basic set of primitive operations. If the storage requires declaration of properties before use, then this should be done before the rules engine is run against the store.
Storage modules should register themselves with a public keyword that will be used to find the storage. This is done with the register-storage! function.
Namespace: naga.store
Function:
(register-storage! registry-key construction-fn)
Registers a storage type with Naga. This should be called by the storage module when it is loaded.
registry-key
: Should be a keyword that will be used to address the module. The :memory module is always registered.
construction-fn
: A function that takes a single parameter, a map containing storage-specific parameters, and returns an implementation of the naga.store.Storage protocol.
e.g.
(naga.store/register-storage! :datomic create-datomic-store)
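To illustrate the registration mechanism, here is a minimal, self-contained sketch of a keyword-to-constructor registry. It is not Naga's implementation: the names registered-stores, register-storage!* and create-store*, and the :type key in the parameter map, are assumptions made for this example.

```clojure
;; Hypothetical stand-in for a storage registry; not Naga's actual code.
(def registered-stores (atom {}))

(defn register-storage!*
  "Associates registry-key with construction-fn in the registry."
  [registry-key construction-fn]
  (swap! registered-stores assoc registry-key construction-fn))

(defn create-store*
  "Looks up the constructor named by the (assumed) :type key of config
  and calls it with the whole config map."
  [{:keys [type] :as config}]
  (if-let [construct (get @registered-stores type)]
    (construct config)
    (throw (ex-info "Unknown storage type" {:type type}))))

;; A storage module would register itself when loaded. This constructor
;; returns a plain map rather than a real Storage implementation.
(register-storage!* :example (fn [config] {:kind :example :config config}))
```

With this in place, (create-store* {:type :example :uri "x"}) would call the registered constructor with the full parameter map.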
The entire contents of the storage can be retrieved with a call to a single function:
Namespace: naga.store
Function:
(retrieve-contents store)
Returns the contents of the store as a seq of vectors containing [entity attribute value] data.
The storage protocol contains a set of basic operations required for all storage types. Many of these are expected to be simple wrappers or stubs for most graph stores. Implementations are not expected to be stateful. Functions that change the state of storage all return a storage object, which the implementor may choose to be the same object as the original, with modified state.
Namespace: naga.store
Functions:
(start-tx store)
(commit-tx store)
(new-node store)
(node-type? store property node)
(data-property store data)
(container-property store data)
(resolve-pattern store pattern)
(count-pattern store pattern)
(query store output-patterns patterns)
(assert-data store data)
(query-insert store assertion-patterns patterns)
start-tx
Starts a transaction for the store, when transactions are supported. If transactions are unsupported, this just returns the original store.
commit-tx
Commits an outstanding transaction on the store, when transactions are supported and one is pending. If transactions are unsupported, this just returns the original store. If no transaction is pending, this may simply return the store (a no-op). An implementation may also choose to throw an exception when a commit fails.
new-node
Creates a new graph node for representing an entity. For some systems, this may be a simple unique identifier, but for others this may require a function call (such as the [datomic.api/tempid](http://docs.datomic.com/clojure/index.html#datomic.api/tempid) function in Datomic).
node-type?
Tests if node may be a graph node. Some systems may use the same data type in different contexts, so the property (or edge) that refers to node is also provided. Returns true when the property refers to a node that is a graph node.
data-property
Returns a property that can refer to the value for data. This must be a keyword in the naga namespace, with a name that starts with first. For instance, :naga/first.long refers to a long value. Storage that has untyped properties (like :memory) may just return :naga/first.
container-property
Returns a property that can refer to the value for data, and indicates membership in a container. This must be a keyword in the naga namespace, and should differ from the value returned by data-property. For instance, :naga/contains.long refers to a long value. Storage that has untyped properties (like :memory) may return a hardcoded value, such as :naga/contains.
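As a rough illustration of these two functions for a store with typed properties, the sketch below derives the property keyword from the type of the data. The starred function names, the type-suffix helper, and the "string" suffix are assumptions for this example. Only the long suffix and the untyped :naga/first and :naga/contains values come from the descriptions above.

```clojure
;; Hypothetical helper: maps a value to a type suffix, or nil when the
;; type is not recognised. The "string" suffix is an assumption.
(defn type-suffix [data]
  (cond
    (integer? data) "long"
    (string? data)  "string"
    :else           nil))

(defn data-property*
  "Returns a :naga/first.<type> keyword, or :naga/first for unknown types."
  [data]
  (if-let [suffix (type-suffix data)]
    (keyword "naga" (str "first." suffix))
    :naga/first))

(defn container-property*
  "Returns a :naga/contains.<type> keyword, or :naga/contains for unknown types."
  [data]
  (if-let [suffix (type-suffix data)]
    (keyword "naga" (str "contains." suffix))
    :naga/contains))
```

An untyped store could skip the dispatch entirely and return the two fixed keywords.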
resolve-pattern
Takes a single query pattern, and returns a set of bindings for it. These are the same patterns that appear in rule bodies, and have the form:
[entity attribute value]
Bindings are a sequence of vectors, where each vector contains the requested columns.
As an example, if a database contained the following data:
[:a :p :b]
[:a :p :c]
[:m :q :x]
[:m :q :y]
Then resolving the pattern [?u :p ?v] will match every element that contains the :p property. The result would be this sequence:
[[:a :b] [:a :c]]
Note that the results only contain the variable values (since the property was specified and known).
The variables bound in the results (?u and ?v) are stored as metadata on the results:
=> (def results (resolve-pattern store '[?u :p ?v]))
#'results
=> results
[[:a :b] [:a :c]]
=> (meta results)
{:cols [?u ?v]}
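The behaviour shown above can be approximated over a plain vector of triples. This is only a sketch under simplifying assumptions, not the code used by any Naga store: the names variable? and resolve-pattern* are made up for the example, the "store" is just a vector, and a variable repeated within one pattern is not checked for consistency.

```clojure
(defn variable?
  "True for symbols whose name starts with ?."
  [x]
  (and (symbol? x) (= \? (first (name x)))))

(defn resolve-pattern*
  "Returns the variable columns of every triple matching pattern,
  with the variable names attached as :cols metadata."
  [triples pattern]
  (let [var-idx  (keep-indexed (fn [i p] (when (variable? p) i)) pattern)
        matches? (fn [triple]
                   (every? identity
                           (map (fn [p t] (or (variable? p) (= p t)))
                                pattern triple)))]
    (with-meta
      (vec (for [t triples :when (matches? t)]
             (mapv #(nth t %) var-idx)))
      {:cols (mapv #(nth pattern %) var-idx)})))

(def sample-triples
  [[:a :p :b]
   [:a :p :c]
   [:m :q :x]
   [:m :q :y]])
```

Calling (resolve-pattern* sample-triples '[?u :p ?v]) yields the two-row result from the example, with ?u and ?v recorded under :cols.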
count-pattern
Similar to resolve-pattern, but returns the count of the results. This can be implemented with a simple wrapper:
;; inside storage definition
(count-pattern [store pattern]
  (count (resolve-pattern store pattern)))
Most databases have an operation to perform this counting directly, which will be more efficient than this approach.
query
Performs a full query against the storage. The API is loosely based on Datomic queries.
Queries are based on patterns, with the results projected to use only the variables in the output-patterns.
patterns is a seq in which each element is either a 3 element pattern or a list containing a filtering operation.
Filter: a list containing an operation that returns a truthy value. Results are filtered to include only those bindings for which the operation is truthy. For instance, '(> ?x 3) is true when ?x is greater than 3, and only those bindings will end up in the result.
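One way to picture a filter is as a predicate applied to each binding. The sketch below is an assumption-laden illustration rather than Naga's evaluation strategy: it represents a binding as a map from variable to value, supports only a small hypothetical table of operators (allowed-ops), and uses the made-up name passes-filter?.

```clojure
;; Hypothetical whitelist of operators; a real store may support others.
(def allowed-ops {'> > '< < '= = '>= >= '<= <=})

(defn passes-filter?
  "Substitutes bound values for variables in the filter's arguments and
  applies the operator. Constants such as 3 pass through unchanged."
  [binding [op & args]]
  (let [f (or (allowed-ops op)
              (throw (ex-info "Unsupported operator" {:op op})))]
    (boolean (apply f (map #(get binding % %) args)))))

(defn apply-filter
  "Keeps only the bindings for which the filter is truthy."
  [bindings filter-form]
  (filterv #(passes-filter? % filter-form) bindings))
```

For the bindings '{?x 1} and '{?x 4}, the filter '(> ?x 3) keeps only the second.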
Pattern: a seq of 3 elements containing values and variables. Variables are symbols that start with the ? character.
- The first element must be a valid "node" value or a variable.
- The second element must be a keyword property or a variable.
- The third element may be any kind of value supported by the store, or a variable.
The output is determined from the inner join of all the pattern resolutions, filtered by the lists. The resulting bindings are then "projected" through the output-patterns. This is a seq of patterns containing variables, and directs the format of the results. The output will be in the same structure as the output-patterns, with each variable replaced by the associated bound value.
If a query gives a raw set of bindings of:
columns: ?a ?b ?c ?d ?e
[:t :u :v :v 3]
[:t :x :y :z 3]
[:m :u :v :n 4]
Then for output-patterns of [[?a ?b] [?c ?e]]
The result would be:
[[:t :u] [:v 3]]
[[:t :x] [:y 3]]
[[:m :u] [:v 4]]
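The projection step in this example can be sketched in a few lines. This is an illustration only, assuming the raw bindings are available as a vector of column names plus a seq of rows. The function name project is hypothetical and not part of the storage protocol.

```clojure
(defn project
  "Replaces each variable in output-patterns with its bound value,
  once per row. Elements that are not bound columns are left as-is."
  [cols rows output-patterns]
  (for [row rows
        :let [binding (zipmap cols row)]]
    (mapv (fn [pattern] (mapv #(get binding % %) pattern))
          output-patterns)))

(def raw-cols '[?a ?b ?c ?d ?e])

(def raw-rows
  [[:t :u :v :v 3]
   [:t :x :y :z 3]
   [:m :u :v :n 4]])
```

Evaluating (project raw-cols raw-rows '[[?a ?b] [?c ?e]]) produces one nested vector per row, in the same shape as the output-patterns.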
assert-data
Inserts data into the store. The data parameter is a seq of 3 element seqs in [entity attribute value] form. They may not contain variables.
Entities must be a valid node type for the store. Attributes are properties that the store will accept (we expect this to be in keyword form). Values are any supported datatype in the store, that is compatible with the given attribute.
Data that duplicates statements in the store will be silently ignored.
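The duplicate-handling rule falls out naturally when statements are held in a set. The following sketch assumes a toy store represented as a map with a :triples set. Both that representation and the name assert-data* are assumptions for illustration, not how any particular Naga store works.

```clojure
;; Hypothetical toy store: a map holding a set of triples.
(def empty-store {:triples #{}})

(defn assert-data*
  "Returns a new store with each [entity attribute value] statement added.
  Statements already present are ignored because :triples is a set."
  [store data]
  (update store :triples into (map vec data)))
```

Asserting the same statement twice leaves the store unchanged after the first insertion, and the function returns a store value, in line with the protocol's convention for state-changing functions.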
query-insert
Much like the query function, except that the resulting data is inserted into the store, as with the assert-data function. For this reason, the assertion-patterns must all be 3 elements wide. These elements may contain fixed values as well as variables.
If any variables are left unbound in the assertion-patterns parameter, each one indicates a new node that must be created for each set of bindings (or "row" in a result). Using the same unbound variable in multiple places in the assertion-patterns parameter reuses that node within the same binding. This allows new elements to be created for each binding, and multiple attributes to be attached to them.
Using the sample results in the description of the query function above, these assertion patterns:
[[?new :first ?b] [?new :second ?e]]
Would assert the following data back in the database:
[:new_node1 :first :u]
[:new_node1 :second 3]
[:new_node2 :first :x]
[:new_node2 :second 3]
[:new_node3 :first :u]
[:new_node3 :second 4]
Note how 2 assertion patterns appear for each result, so the 3 result rows were converted into 6 assertions into the storage.
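The node-creation rule can be sketched as follows. This is a simplified illustration, not the implementation used by any Naga store: the names variable?, instantiate and counting-node-fn are hypothetical, bindings are passed in as column names plus rows, and new nodes are plain keywords produced by a counter.

```clojure
(defn variable?
  "True for symbols whose name starts with ?."
  [x]
  (and (symbol? x) (= \? (first (name x)))))

(defn instantiate
  "Builds the triples to assert. For each row, every variable in
  assertion-patterns that is not a bound column receives one fresh node
  from new-node, shared by all patterns within that row."
  [cols rows assertion-patterns new-node]
  (vec
   (mapcat
    (fn [row]
      (let [bound   (zipmap cols row)
            unbound (->> (apply concat assertion-patterns)
                         (filter variable?)
                         (remove #(contains? bound %))
                         distinct)
            ;; into is eager, so new-node is called exactly once per
            ;; unbound variable for this row
            fresh   (into {} (map (fn [v] [v (new-node)]) unbound))
            env     (merge bound fresh)]
        (map (fn [pattern] (mapv #(get env % %) pattern))
             assertion-patterns)))
    rows)))

(defn counting-node-fn
  "Returns a function that yields :new_node1, :new_node2, ... in turn."
  []
  (let [counter (atom 0)]
    (fn [] (keyword (str "new_node" (swap! counter inc))))))
```

Applied to the three sample rows with the patterns [[?new :first ?b] [?new :second ?e]], this creates three new nodes and six triples, two per row.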