Feature Proposal: Secondary Index Query Enhancements - Filter Expressions and Beyond #4
martinsumner started this conversation in Ideas
Replies: 1 comment
As an alternative to overloaded 2i queries, this appeals to me. Once functional, we'll need some detailed examples in the documentation and maybe a user tutorial video, but overall this sounds like a nice addition to existing search features.
Background
There have been multiple attempts to add a capability to query data in Riak:
Some performance improvements have been made to secondary indexing over the past few releases. Firstly, the efficiency of querying has improved greatly by using leveled rather than eleveldb, and then through improvements in leveled and OTP itself. There was also an order of magnitude improvement made in query planning.
The addition of the coverage participation option also addressed (with intervention) the primary functional issue of secondary index use.
These performance characteristics make secondary indexes powerful, especially when compared to alternative databases (in particular DynamoDB). However, to make good use of them and support non-trivial queries, it is necessary to use overloaded terms and regular expressions. The primary issue with this has been explaining their efficient use to developers. There is also a secondary issue with the performance of more complex back-tracking regular expressions within the PCRE engine used by OTP. The engine is set to be changed in OTP 28, but that may cause its own migration issues for Riak.
The aim of this proposal is therefore to make the power of secondary indexes more accessible: to try to make the process more intuitive. A secondary concern is to ensure that it remains backwards compatible with existing implementations, i.e. a service currently using 2i and regular expressions should be able to transition without re-indexing its data.
Proposal
It is proposed that indexing would remain the same. The definition of index entries is made by the developer, and then appended to the object - rather than defining a schema to read the object and generate index entries. The solution will retain some key flexibility advantages: an object can appear on an unrestricted number of indexes, have an unrestricted number of entries on each index, and only a sort key is mandatory (there is no need for a partition key).
The basic proposal is that a simple language will be provided to allow terms to be evaluated, breaking up overloaded terms into separate named projected attributes (also referred to as terms). Once an index entry has been evaluated, the map of projected attributes that is output can be processed using a simple filter expression containing matching rules for individual terms. The results can then be returned, but accumulation of results can also be restricted to specific terms, returning only counts if result values are not required.
Finally, each query can now contain multiple sub-queries, allowing the results of those sub-queries to be aggregated using basic set operations before being returned.
There should be a new Query API, where a Query can be provided as a JSON object, defined as below.
Query Request
query_request()
bucket
Bucket or {Type, Bucket} tuple
aggregation_expression (optional)
A query may contain one or more queries (as described within query_details). If there is a single query, there is no purpose for aggregation_expression; but if there are multiple queries, the aggregation_expression describes the set operations to be used to combine the results.
If not provided, or set to the word none, there should be a single query provided under query_details for the query to be valid. Otherwise this may be a set operation expression describing how the results should be produced by set operations on the results of each individual query in the query_details. Each tag referred to in the aggregation_expression should have a matching aggregation_tag within the query_details. The set operations supported will be UNION, INTERSECT and SUBTRACT, e.g. $1 INTERSECT ($2 UNION $3).
Note that aggregation will occur at a vnode level, and all the queries are run on a single vnode snapshot. However, all the queries are always run; aggregation occurs at the end, on each vnode, and there is no intelligence to determine if some queries are unnecessary based on the results of earlier queries (e.g. they are to be intersected with an empty set).
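As a sketch of the intended shape only (bucket names, index names and term values here are hypothetical, and the exact JSON field encoding, for example whether tags carry the $ prefix, will be settled by the implementation PRs), a request combining three tagged sub-queries might look like:

```json
{
  "bucket": "people",
  "aggregation_expression": "$1 INTERSECT ($2 UNION $3)",
  "query_details": [
    {"aggregation_tag": "$1", "index_name": "surname_bin",
     "start_term": "SMITH", "end_term": "SMITH~"},
    {"aggregation_tag": "$2", "index_name": "postcode_bin",
     "start_term": "LS1", "end_term": "LS2"},
    {"aggregation_tag": "$3", "index_name": "postcode_bin",
     "start_term": "BD1", "end_term": "BD2"}
  ]
}
```

Each tag referenced by the aggregation_expression has a matching aggregation_tag in query_details; all three range folds run against the same vnode snapshot before the set operations are applied on each vnode.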
accumulation_option
If an aggregation_expression other than none is provided, this can only be:
If the aggregation_expression is none, and there is a single query in query_details, the following additional options are supported:
A count can be based on matches (the fastest option, counting each match independently), or keys, which will only count unique keys that match. If the index has a 1:1 mapping between terms and keys, then these results will be consistent (and the match count should be used for improved efficiency). The term in a term-based accumulation_option can be a reference to any single attribute returned from an evaluation_expression, including $term (to refer to the whole term).
result_provision
The result_provision may be one of two options: all or reference. If the option all is selected, all results will be returned once all results are available. If reference is chosen, the reference may be used to return pages of results, as pages of results become available. The reference option is only available if an accumulation_option of keys or term_with_keys($TERM) is used.
The reference will have a 60s timeout. If no request to return results from the reference is made within 60s, then the results will be dropped, and no future requests can be made to fetch the results.
expression_substitutions
A map of substitutions to be made into the evaluation and filter expressions within the query details.
query_details
A list of query_detail objects
query_detail/aggregation_tag
If an aggregation_expression is used, the tag of this query which acts as a reference to this query within that expression.
query_detail/index_name
The name of the index used for the query within Riak. Only queries on _bin type indexes are supported via this API.
query_detail/start_term
The prefix of the term to be used as the start of the range to be covered by the query. All index entries >= this term will be candidates for the result set.
query_detail/end_term
The prefix of the term to be used as the end of the range to be covered by the query. All index entries < this term will be candidates for the result set.
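As an illustration (the index name and term values are hypothetical), a query_detail covering every entry in a date-of-birth index for 1988 could define the range as:

```json
{
  "index_name": "dob_bin",
  "start_term": "19880101",
  "end_term": "19890101"
}
```

All index entries >= the start_term and < the end_term then become candidates for evaluation and filtering.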
query_detail/evaluation_expression
An expression to take an index term ($term) and a key ($key) and return a map of Key/Value attributes to be used in a filter expression and/or an accumulation_option. This expression is a pipeline of functions to extract sub-attributes from existing attributes, or combine those attributes into combination terms.
The supported functions are:
delim - break the term using a delimiter
join - concatenate multiple terms together using a delimiter
split - as with delim, but the output is a list of items associated with a single term, not separate terms
slice - as with split, but the slicing of the parent term is based on fixed width
index - slice a single sub-term from a term using a start position and length
kvsplit - use dual delimiters (one to tokenise a pair, one to split a pair into key and value) to break up a sequence of Key/Value pairs into named attributes
regex - use a perl-compatible regular expression to extract named attributes
to_integer - convert a string attribute into an integer
to_string - convert an integer attribute into a string (for example to use in join)
add - perform addition on an integer attribute
subtract - perform subtraction on an integer attribute
map - categorise an attribute by comparison to a series of limits, i.e. to break a value into range-based buckets
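The concrete call syntax for these functions is defined in the leveled_filter work referenced below, so it is not reproduced here. Purely as an illustration of intent, and not as part of the API, the sketch below shows an overloaded index entry and the map of projected attributes a delim-based pipeline might produce (attribute names are hypothetical):

```json
{
  "term": "SMITH|19880216|LS1 4BT",
  "key": "P0001234",
  "projected_attributes": {
    "surname": "SMITH",
    "dob": "19880216",
    "postcode": "LS1 4BT"
  }
}
```

Here a delim step would break $term on the | delimiter into $surname, $dob and $postcode, which can then be referenced from a filter_expression or an accumulation_option.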
query_detail/filter_expression
A filter expression to be applied to attributes output from the evaluation expression. Expressions are supported in the form:
condition-expression ::=
operand comparator operand
| operand BETWEEN operand AND operand
| operand IN ( operand ',' operand …)
| function
| condition AND condition
| condition OR condition
| NOT condition
| ( condition )
comparator ::=
=
| <>
| <
| <=
| >
| >=
function ::=
attribute_exists (term)
| attribute_not_exists (term)
| attribute_empty (term)
| begins_with (term, prefix)
| ends_with (term, suffix)
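Pulling the pieces together, and reusing the hypothetical attributes from the illustration above, a single-query request might look like the following sketch. The evaluation_expression is deliberately elided (its syntax belongs to the leveled_filter implementation), and the quoting of string literals inside the filter_expression is illustrative only:

```json
{
  "bucket": "people",
  "query_details": [
    {
      "index_name": "personnel_bin",
      "start_term": "SMITH",
      "end_term": "SMITH~",
      "evaluation_expression": "...",
      "filter_expression": "($dob BETWEEN '19880101' AND '19881231') AND begins_with($postcode, 'LS')"
    }
  ],
  "accumulation_option": "keys",
  "result_provision": "all"
}
```

With no aggregation_expression (i.e. none), a single query_detail is valid; a keys accumulation with result_provision all would return the full set of matching keys in one response.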
Examples
Some examples of potentially non-trivial things that can be achieved with this query API:
The tests within the leveled PR contain many different examples, in particular the unit tests of leveled_filter.
The query capability will fall short of what would be possible in OpenSearch, potentially in terms of performance as well as functionality; however, it will be a significant advance on the capability of other NoSQL solutions such as DynamoDB. With DynamoDB, although the filter expression support is similar, the restrictions that require a partition key on every index, and allow an object to appear only once on any given index, severely hamper the functional scope. Performance is also severely constrained by limits on RCUs per round trip.
Design
Erlang contains the leex and yecc parse tools, and these are simple-to-use tools for building a query language. Once the language is defined, it is simple to produce a filter function which first evaluates an index term (to convert it into individual projected attributes) and then applies a filter expression. The job of replacing a regular expression in the index fold with such a function is then trivial.
Alternative Design Ideas
An attempt was made to simply improve the efficiency of regular expressions and the 2i implementation, in particular via the HTTP API. It was assumed that moving to pre-compiled regular expressions would make a big difference, but this will probably only be the case for very complex regular expressions. Likewise, moving to RE2 was attempted using a NIF, but RE2 expressions only appear to be faster when pre-compiled, and there was no easy way to pre-compile an RE2 expression and then distribute that between Erlang nodes.
The difference between regular expressions and filter expressions is not just a matter of developer familiarity. The key change is support for BETWEEN, such as for date ranges: this is extremely complicated and error-prone with regular expressions, but simple with BETWEEN support.
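For example, restricting the hypothetical $dob attribute used in the earlier sketches to a window of dates is a single comparison in a filter expression (quoting of literals illustrative only):

```json
{
  "filter_expression": "$dob BETWEEN '19880216' AND '19890105'"
}
```

Expressing the same window as a back-tracking regular expression would require enumerating the valid digit ranges position by position, which is exactly the kind of pattern that has proved slow and error-prone under PCRE.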
There had been previous attempts to make the separation of index sort keys and projected attributes a more formal part of the definition of the indexes. However, this would not be backwards compatible. The evaluation expression allows the user to define projected attributes in relatively simple ways (i.e. using delimiters, or kv pairs).
Investigating the use of schemas to define indexes had a couple of issues:
In parallel to this, there is a project which is using the open nature of the nextgenrepl API to do both real-time replication and full-sync reconciliation from Riak to OpenSearch. This may provide an alternative approach, but it is still considered desirable to have an integrated query solution within Riak.
Testing
Initial performance testing indicates that even at fairly low levels of complexity, the combination of Evaluation Expression and Filter Expression is faster than using a regular expression.
Caveats
WIP
Pull Requests
martinsumner/leveled#440
Planned release for inclusion
Riak 3.4.0