From c9cad4041fa03b57d0d0ae78192432f93a6d0f93 Mon Sep 17 00:00:00 2001 From: Damien de Lemeny Date: Wed, 13 Mar 2024 17:35:58 -0500 Subject: [PATCH 1/6] Add query language intro to the docs --- docs/get-started/query-language-intro.md | 112 +++++++++++++++++++++++ 1 file changed, 112 insertions(+) create mode 100644 docs/get-started/query-language-intro.md diff --git a/docs/get-started/query-language-intro.md b/docs/get-started/query-language-intro.md new file mode 100644 index 00000000000..79f3d2b9576 --- /dev/null +++ b/docs/get-started/query-language-intro.md @@ -0,0 +1,112 @@ +--- +title: Introduction to Quickwit's query language +sidebar_position: 3 +--- + +Quickwit allows you to search on your indexed documents using a simple query language. Here's a quick overview. + +## Clauses + +The main concept of this language is a clause, which represents a simple condition that can be tested against documents. + +### Querying fields + +A clause operates on fields of your document. It has the following syntax : +``` +field: condition +``` + +For example, when searching documents where the field `app_name` contains the token `tantivy`, you would write the following clause: +``` +app_name:tantivy +``` + +In many cases the field name can be omitted, quickwit will then use the `default_search_fields` configured for the index. + +### Adressing structured data + +Data stored deep inside nested data structures like `object` or `json` fields can be addressed using dots as separators in the field name. +For instance, the document `{"product": {"attributes": {color": "red"}}}` is matched by +``` +product.attributes.color:red +``` + +If the keys of your object contain dots, the above syntax has some ambiguity : by default `{"k8s.component.name": "quickwit"}` will be matched by +```k8s.component.name:quickwit``` + +It is possible to remove the ambiguity by setting expand_dots in the json field configuration. 
+In that case, it will be necessary to escape the `.` in the query to match this document like this : +``` +k8s\.component\.name:quickwit +``` + +### Clauses Cheat Sheet + +Quickwit support various types of clauses to express different kinds of conditions. Here's a quick overview of them: + +| type | syntax | examples | description| `default_search_field`| +|-------------|--------|----------|------------|-----------------------| +| term | `field: token` | `app_name: tantivy`
`process_id:1234`
`word` | A term clause tests the existence of a value in the field's tokens | yes | +| term prefix | `field: prefix*` | `app_name: tant*`
`quick*` | A term prefix clause tests the existence of a token starting with the provided value | yes | +| term set | `field: IN [token token ..]` |`severity: IN [error warn]` | A term set clause tests the existence of any of the provided values in the field's tokens| yes | +| phrase | `field: "sequence of tokens"` | `full_name: "john doe"` | A phrase clause tests the existence of the provided sequence of tokens | yes | +| phrase prefix | `field: "sequence of tokens"*` | `title: "how to m"*` | A phrase prefix clause tests the existence of a sequence of tokens, the last one used like in a prefix clause | yes | +| all | `*` | `*` | A match-all clause will match every document | no | +| exist | `field: *` | `error: *` | An exist clause tests the existence of any value for the field, it will match only if the field exists | no | +| range | `field: bounds` |`duration: [0 1000}`
`last_name: [banner miller]` | A range clause tests the existence of a token between the provided bounds | no | + +## Queries + +Clauses can be combined using operators to form more complex queries. + +### Combining queries + +Clauses can be combined using boolean operators `AND` and `OR` to create search exp +An `AND` query will match only if conditions on both sides of the operator are met +``` +type:rose AND color:red +``` + +An `OR` query will match if either or both conditions on each side of the operator are met +``` +weekday:6 OR weekday:7 +``` + +If no operator is provided, `AND` is implicitly assumed. + +``` +type:violet color:blue +``` + +### Grouping queries +You can build complex expressions by grouping clauses using parentheses. +``` +(type:rose AND color:red) OR (type:violet AND color:blue) +``` + +When no parentheses are used, `AND` takes precedence over `OR`, meaning that the following query is equivalent to the one above. + +``` +type:rose AND color:red OR type:violet AND color:blue +``` + +### Negating queries + +An expression can be negated either with the operator `NOT` or by prefixing the query with a dash `-`. + +`NOT` and `-` take precedence over everything, such that `-a AND b` means `(-a) AND b`, not `-(a AND b)`. 
+ +``` +NOT severity:debug +``` + +or + +``` +type:proposal -(status:rejected OR status:pending) +``` + + +## Dive deeper + +If you want to know more about the query language, head to the [Query Language Reference](/docs/reference/query-language-reference) \ No newline at end of file From 39d0f18f327f7425722889bdc67787eb43495cd2 Mon Sep 17 00:00:00 2001 From: Damien de Lemeny Date: Thu, 14 Mar 2024 12:22:03 -0500 Subject: [PATCH 2/6] Rewrite query language refdoc --- docs/get-started/query-language-intro.md | 17 -- docs/reference/query-language.md | 246 +++++++++++++++++------ 2 files changed, 187 insertions(+), 76 deletions(-) diff --git a/docs/get-started/query-language-intro.md b/docs/get-started/query-language-intro.md index 79f3d2b9576..c2cf2081c0b 100644 --- a/docs/get-started/query-language-intro.md +++ b/docs/get-started/query-language-intro.md @@ -23,23 +23,6 @@ app_name:tantivy In many cases the field name can be omitted, quickwit will then use the `default_search_fields` configured for the index. -### Adressing structured data - -Data stored deep inside nested data structures like `object` or `json` fields can be addressed using dots as separators in the field name. -For instance, the document `{"product": {"attributes": {color": "red"}}}` is matched by -``` -product.attributes.color:red -``` - -If the keys of your object contain dots, the above syntax has some ambiguity : by default `{"k8s.component.name": "quickwit"}` will be matched by -```k8s.component.name:quickwit``` - -It is possible to remove the ambiguity by setting expand_dots in the json field configuration. -In that case, it will be necessary to escape the `.` in the query to match this document like this : -``` -k8s\.component\.name:quickwit -``` - ### Clauses Cheat Sheet Quickwit support various types of clauses to express different kinds of conditions. 
Here's a quick overview of them: diff --git a/docs/reference/query-language.md b/docs/reference/query-language.md index 7d75362af23..b06b5b36461 100644 --- a/docs/reference/query-language.md +++ b/docs/reference/query-language.md @@ -1,108 +1,236 @@ --- -title: Query language +title: Query Language Reference sidebar_position: 40 --- -Quickwit uses a query mini-language which is used by providing a `query` parameter to the search endpoints. +## Pseudo-grammar + +``` +query = '(' query ')' + | query operator query + | unary_operator query + | clause + +operator = 'AND' | 'OR' -### Terms +unary_operator = 'NOT' | '-' + +clause = field_name ':' field_clause + | defaultable_clause + | '*' + +field_clause = term | term_prefix | term_set | phrase | phrase_prefix | range | '*' +defaultable_clause = term | term_prefix | term_set | phrase | phrase_prefix +``` +--- +## Writing Queries +### Escaping Special Characters -The `query` is parsed into a series of terms and operators. There are two types of terms: single terms such as “tantivy” and phrases which is a group of words surrounded by double quotes such as “hello world”. +Special reserved characters are: `+` , `^`, `` ` ``, `:`, `{`, `}`, `"`, `[`, `]`, `(`, `)`, `~`, `!`, `\\`, `*`, `SPACE`. Such characters can still appear in query terms, but they need to be escaped by an anti-slash `\` . -Multiple terms can be combined together with Boolean operators `AND, OR` to form a more complex query. By default, terms will be combined with the `AND` operator. + -IP addresses can be provided as IpV4 or IpV6. It is recommended to use the same format as in the indexed documents. +### Allowed characters in field names + -### Fields +### Addressing nested structures -You can specify fields to search in the query by following the syntax `field_name:term`. +Data stored deep inside nested data structures like `object` or `json` fields can be addressed using dots as separators in the field name. 
+For instance, the document `{"product": {"attributes": {color": "red"}}}` is matched by +``` +product.attributes.color:red +``` -For example, let's assume an index that contains two fields, `title`, and `body` with `body` the default field. To search for the phrase “Barack Obama” in the title AND “president” in the body, you can enter: +If the keys of your object contain dots, the above syntax has some ambiguity : by default `{"k8s.component.name": "quickwit"}` will be matched by +```k8s.component.name:quickwit``` +It is possible to remove the ambiguity by setting expand_dots in the json field configuration. +In that case, it will be necessary to escape the `.` in the query to match this document like this : ``` -title:"barack obama" AND president +k8s\.component\.name:quickwit ``` -Note that a query like `title:barack obama` will find only `barack` in the title and `obama` in the default fields. If no default field has been set on the index, this will result in an error. +--- + +## Structured data +### Datetime +Datetime values must be provided in rfc3339 format, such as `1970-01-01T00:00:00Z` -### Searching structures nested in documents. +### IP addresses +IP addresses can be provided as IPv4 or IPv6. It is recommended to search with the format used when indexing documents. +There is no support for searching for a range of IP using CIDR notation, but you can use normal range queries. -Quickwit is designed to index structured data. -If you search into some object nested into your document, whether it is an `object`, a `json` object, or whether it was caught through the `dynamic` mode, the query language is the same. You simply need to chain the different steps to reach your value from the root of the document. +--- -For instance, the document `{"product": {"attributes": {color": "red"}}}` is returned if you query `product.attributes.color:red`. +## Types of clauses -If a dot `.` exists in one of the key of your object, the above syntax has some ambiguity. 
-For instance, by default, `{"k8s.component.name": "quickwit"}` will be matched by `k8s.component.name:quickwit`. +### Term `field:term` +``` +term: term_char+ +``` -It is possible to remove the ambiguity by setting `expand_dots` in the json field configuration. -In that case, it will be necessary to escape the `.` in the query to match this document. +Matches documents if the targeted field contains a token equal to the provided term. -For instance, the above document will match the query `k8s\.component\.name:quickwit`. +`field:value` will match any document where the field 'field' has a token 'value'. -### Boolean Operators +### Term Prefix `field:prefix*` +``` +term_prefix: term '*' +``` -Quickwit supports `AND`, `+`, `OR`, `NOT` and `-` as Boolean operators (case sensitive). By default, the `AND` is chosen, this means that if you omit it in a query like `title:"barack obama" president` Quickwit will interpret the query as `title:"barack obama" AND president`. +Matches documents if the targeted field contains a token which starts with the provided value. -### Grouping boolean operators +`field:quick*` will match any document where the field 'field' has a token like `quickwit` or `quickstart`, but not `qui` or `abcd`. -Quickwit supports parenthesis to group multiple clauses: +### Term set `field: IN [a b c]` ``` -(color:red OR color:green) AND size:large +term_set = 'IN' '[' term_list ']' +term_list = term_list term + | term ``` +Matches if the document contains any of the tokens provided. + +###### Examples +`field: IN [ab cd]` will match 'ab' or 'cd', but nothing else. + +###### Perfomance Note +This is a lot like writing `field:ab OR field:cd`. When there are only a handful of terms to search for, using ORs is usually faster. +When there are many values to match, a term set query can become more efficient. -### Slop Operator + -Quickwit also supports phrase queries with a slop parameter using the slop operator `~` followed by the value of the slop. 
-The query will match phrases if its terms are separated by slop terms at most. +### Phrase `field:"sequence of words"` +``` +phrase = phrase_string + | phrase_string slop +phrase_string = '"' phrase_char '"' +slop = '~' [01-9]+ + +``` -The slop can be considered a budget between all terms. E.g. `"A B C"~1` matches `"A X B C"`, `"A B X C"`, but not `"A X B X C"`. +Matches if the field contains the sequence of token provided. `field:"looks good to me"` will match any document containing that sequence of tokens. +The field must have been configured with `record: position` when indexing. -Transposition costs 2, e.g. `"A B"~1` will not match `"B A"` but it would with `"A B"~2`. +###### Slop operator +Is is also possible to add a slop, which allow matching a sequence with some distance. For instance `"looks to me"~1` will match "looks good to me", but not "looks very good to me". +Transposition costs 2, e.g. `"A B"~1` will not match `"B A"` but it would with `"A B"~2`. Transposition is not a special case, in the example above A is moved 1 position and B is moved 1 position, so the slop is 2. -:::caution -Slop queries can only be used on field indexed with the [record option](./../configuration/index-config.md#text-type) set to `position` value. -::: +### Phrase Prefix `field:"finish this phr"*` +``` +phrase_prefix = phrase '*' +``` + +Matches if the field contains the sequence of token provided, where the last token in the query may be only a prefix of the token in the document. + +The field must have been configured with `record: position` when indexing. + +There is no slop for phrase prefix queries. -### Set Operator +###### Examples + `field:"thanks for your contrib"*` will match 'thanks for your contribution'. -Quickwit supports `IN [value1 value2 ...]` as a set membership operator. This is more cpu efficient than the equivalent `OR`ing of many terms, but may download more of the split than `OR`ing, especially when only a few terms are searched. 
You must specify a field being searched for Set queries. +###### Limitation -### Range queries +Quickwit may trim some results matched by this clause in some cases. If you search for `"thanks for your co"*`, it will enumerate the first 50 tokens which start with "co", and search for any documents where "thanks for your" is followed by any of these tokens. -Range queries can only be executed on fields with a fast field. Currently only fields of type `ip` are supported. +If there are many tokens starting with "co", "contribution" might not be one of the 50 selected tokens, and the query won't match a document containing "thanks for your contribution". Normal prefix queries don't suffer from this issue. + + + + +### Range `field: [low_bound high_bound}` +``` +range = explicit_range | comparison_half_range + +explicit_range = left_bound_char bounds right_bound_char +left_bound_char = '[' | '{' +right_bound_char = '}' | ']' +bounds = term term + | term '*' + | '*' term + +comparison_range = comparison_operator term +comparision_operator = '<' | '>' | '<=' | '>=' +``` + +Matches if the document contains a token between the provided bounds for that field. +For range queries, you must provide a field. Quickwit won't use `default_search_fields` automatically. + +###### Order +For text fields, the ranges are defined by lexicographic order. It means for a text field, 100 is between 1 and 2. +When using ranges on integers, it behaves naturally. + +###### Inclusive and exclusive bounds +Inclusive bounds are represented by square brackets `[]`. They will match tokens equal to the bound term. +Exclusive bounds are represented by curly brackets `{}`. They will not match tokens equal to the bound term. + +###### Half-Open bounds +You can make an half open range by using `*` as one of the bounds. `field:[b TO *]` will match 'bb' and 'zz', but not 'ab'. +You can also use a comparison based syntax:`field:b`, `field:<=b` or `field:>=b`. 
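The inclusive, exclusive and half-open bounds described above can be sketched in a few lines of Python. This is only an illustration of the matching rule on text tokens, not Quickwit's implementation: the helper `in_range` and its parameters are hypothetical, and `None` stands in for the `*` bound.

```python
from typing import Optional


def in_range(
    token: str,
    low: Optional[str],
    high: Optional[str],
    low_inclusive: bool = True,
    high_inclusive: bool = True,
) -> bool:
    """Return True if `token` lies between the bounds.

    `None` models the `*` (unbounded) side of a half-open range.
    Inclusive bounds model `[` / `]`, exclusive bounds model `{` / `}`.
    """
    if low is not None:
        if token < low or (token == low and not low_inclusive):
            return False
    if high is not None:
        if token > high or (token == high and not high_inclusive):
            return False
    return True


# Models field:[b TO *]
half_open = [t for t in ["ab", "bb", "zz"] if in_range(t, "b", None)]
print(half_open)  # ['bb', 'zz']

# Models field:{b TO c]: the exclusive lower bound rejects 'b' itself
print(in_range("b", "b", "c", low_inclusive=False))  # False
print(in_range("c", "b", "c", low_inclusive=False))  # True
```

The sketch compares plain Python strings, which is enough to show why `[b TO *]` accepts 'bb' and 'zz' but rejects 'ab'.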
+ + + +###### Examples - Inclusive Range: `ip:[127.0.0.1 TO 127.0.0.50]` - Exclusive Range: `ip:{127.0.0.1 TO 127.0.0.50}` - Unbounded Inclusive Range: `ip:[127.0.0.1 TO *] or ip:>=127.0.0.1` - Unbounded Exclusive Range: `ip:{127.0.0.1 TO *] or ip:>127.0.0.1` -#### Examples: +### Exists `field:*` -With the following corpus: -```json -[ - {"id": 1, "body": "a red bike"}, - {"id": 2, "body": "a small blue bike"}, - {"id": 3, "body": "a small, rusty, and yellow bike"}, - {"id": 4, "body": "fred's small bike"}, - {"id": 5, "body": "a tiny shelter"} -] -``` -The following queries will output: +Matches documents where the field is set. You have to specify a field for this query, Quickwit won't use `default_search_fields` automatically. -- `body:"small bird"~2`: no match [] -- `body:"red bike"~2`: matches [1] -- `body:"small blue bike"~3`: matches [2] -- `body:"small bike"`: matches [4] -- `body:"small bike"~1`: matches [2, 4] -- `body:"small bike"~2`: matches [2, 4] -- `body:"small bike"~3`: matches [2, 3, 4] -- `body: IN [small tiny]`: matches [2, 3, 4, 5] +### Match All `*` -### Escaping Special Characters +Matches every document. You can't put a field in front. It is simply written as `*`. + +--- + +## Building Queries +Most queries are composed of more than one clause. When doing so, you may add operators between clauses. + +Implicitly if no operator is provided, 'AND' is assumed. + +### Conjunction `AND` +An `AND` query will match only if both sides match. -Special reserved characters are: `+` , `^`, `` ` ``, `:`, `{`, `}`, `"`, `[`, `]`, `(`, `)`, `~`, `!`, `\\`, `*`, `SPACE`. Such characters can still appear in query terms, but they need to be escaped by an antislash `\` . + + +### Disjunction `OR` +An `OR` query will match if either (or both) sides match. + + + +### Negation `NOT` or `-` +A `NOT` query will match if the clause it is applied to does not match. +The `-` prefix is equivalent to the `NOT` operator. 
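One way to picture the three operators is as set operations over the ids of matching documents. The sketch below uses a made-up three-document corpus and a hypothetical `match` helper that performs exact field comparison; it illustrates the semantics only and says nothing about how Quickwit evaluates queries internally.

```python
# Hypothetical corpus, keyed by document id.
docs = {
    1: {"type": "rose", "color": "red"},
    2: {"type": "violet", "color": "blue"},
    3: {"type": "rose", "color": "white"},
}


def match(field: str, value: str) -> set:
    """Ids of documents whose `field` equals `value` (stand-in for a term clause)."""
    return {doc_id for doc_id, doc in docs.items() if doc.get(field) == value}


all_ids = set(docs)

# type:rose AND color:red  -> intersection
and_result = match("type", "rose") & match("color", "red")
# color:red OR color:blue  -> union
or_result = match("color", "red") | match("color", "blue")
# NOT type:rose            -> complement within the corpus
not_result = all_ids - match("type", "rose")

print(and_result)  # {1}
print(or_result)   # {1, 2}
print(not_result)  # {2}
```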
+ +### Grouping `()` +Parentheses are used to force the order of evaluation of operators. +For instance, if a query should match if 'field1' is 'one' or 'two', and 'field2' is 'three', you can use `(field1:one OR field1:two) AND field2:three`. + +### Operator Precedence +Without parentheses, `AND` takes precedence over `OR`. That is, `a AND b OR c` is interpreted as `(a AND b) or c`. + +`NOT` and `-` takes precedence over everything, such that `-a AND b` means `(-a) AND b`, not `-(a AND B)`. + + +--- + +## Other considerations + +### Default Search Fields +In many case it is possible to omit the field you search if it was configured in the `default_search_fields` array of the index configuration. + + + +### Tokenization +Note that the result of a query can depend on the tokenizer used for the field getting searched. Hence this document always speaks of tokens, which may be the exact value the document contain (in case of the raw tokenizer), or a subset of it (for instance any tokenizer cutting on spaces). + + +Quickwit uses a query mini-language which is used by providing a `query` parameter to the search endpoints. + From c571900c0d952e09fff8dadf1d84aa8a7088a767 Mon Sep 17 00:00:00 2001 From: Damien de Lemeny Date: Tue, 19 Mar 2024 10:00:25 -0500 Subject: [PATCH 3/6] Improve QL docs Co-Authored-By: trinity-1686a --- docs/get-started/query-language-intro.md | 16 ++++----- docs/reference/query-language.md | 41 ++++++++++-------------- 2 files changed, 25 insertions(+), 32 deletions(-) diff --git a/docs/get-started/query-language-intro.md b/docs/get-started/query-language-intro.md index c2cf2081c0b..93d1d09b1b1 100644 --- a/docs/get-started/query-language-intro.md +++ b/docs/get-started/query-language-intro.md @@ -13,7 +13,7 @@ The main concept of this language is a clause, which represents a simple conditi A clause operates on fields of your document. 
It has the following syntax : ``` -field: condition +field:condition ``` For example, when searching documents where the field `app_name` contains the token `tantivy`, you would write the following clause: @@ -29,14 +29,14 @@ Quickwit support various types of clauses to express different kinds of conditio | type | syntax | examples | description| `default_search_field`| |-------------|--------|----------|------------|-----------------------| -| term | `field: token` | `app_name: tantivy`
`process_id:1234`
`word` | A term clause tests the existence of a value in the field's tokens | yes | -| term prefix | `field: prefix*` | `app_name: tant*`
`quick*` | A term prefix clause tests the existence of a token starting with the provided value | yes | -| term set | `field: IN [token token ..]` |`severity: IN [error warn]` | A term set clause tests the existence of any of the provided values in the field's tokens| yes | -| phrase | `field: "sequence of tokens"` | `full_name: "john doe"` | A phrase clause tests the existence of the provided sequence of tokens | yes | -| phrase prefix | `field: "sequence of tokens"*` | `title: "how to m"*` | A phrase prefix clause tests the existence of a sequence of tokens, the last one used like in a prefix clause | yes | +| term | `field:token` | `app_name:tantivy`
`process_id:1234`
`word` | A term clause tests the existence of a value in the field's tokens | yes | +| term prefix | `field:prefix*` | `app_name:tant*`
`quick*` | A term prefix clause tests the existence of a token starting with the provided value | yes | +| term set | `field:IN [token token ..]` |`severity:IN [error warn]` | A term set clause tests the existence of any of the provided values in the field's tokens| yes | +| phrase | `field:"sequence of tokens"` | `full_name:"john doe"` | A phrase clause tests the existence of the provided sequence of tokens | yes | +| phrase prefix | `field:"sequence of tokens"*` | `title:"how to m"*` | A phrase prefix clause tests the existence of a sequence of tokens, the last one used like in a prefix clause | yes | | all | `*` | `*` | A match-all clause will match every document | no | -| exist | `field: *` | `error: *` | An exist clause tests the existence of any value for the field, it will match only if the field exists | no | -| range | `field: bounds` |`duration: [0 1000}`
`last_name: [banner miller]` | A range clause tests the existence of a token between the provided bounds | no | +| exist | `field:*` | `error:*` | An exist clause tests the existence of any value for the field, it will match only if the field exists | no | +| range | `field:bounds` |`duration:[0 TO 1000}`
`last_name: [banner TO miller]` | A range clause tests the existence of a token between the provided bounds | no | ## Queries diff --git a/docs/reference/query-language.md b/docs/reference/query-language.md index b06b5b36461..68fe7f7356e 100644 --- a/docs/reference/query-language.md +++ b/docs/reference/query-language.md @@ -9,6 +9,7 @@ sidebar_position: 40 query = '(' query ')' | query operator query | unary_operator query + | query query | clause operator = 'AND' | 'OR' @@ -26,12 +27,13 @@ defaultable_clause = term | term_prefix | term_set | phrase | phrase_prefix ## Writing Queries ### Escaping Special Characters -Special reserved characters are: `+` , `^`, `` ` ``, `:`, `{`, `}`, `"`, `[`, `]`, `(`, `)`, `~`, `!`, `\\`, `*`, `SPACE`. Such characters can still appear in query terms, but they need to be escaped by an anti-slash `\` . +Some characters need to be escaped in non-quoted terms because they are syntactically significant otherwise: special reserved characters are: `+` , `^`, `` ` ``, `:`, `{`, `}`, `"`, `[`, `]`, `(`, `)`, `~`, `!`, `\\`, `*`, `SPACE`. If such characters appear in query terms, they need to be escaped by prefixing them with an anti-slash `\`. - +In quoted terms, the quote character in use `'` or `"` needs to be escaped. -### Allowed characters in field names - +###### Allowed characters in field names + +See the [Field name validation rules](https://quickwit.io/docs/configuration/index-config#field-name-validation-rules) in the index config documentation. ### Addressing nested structures @@ -66,7 +68,7 @@ There is no support for searching for a range of IP using CIDR notation, but you ### Term `field:term` ``` -term: term_char+ +term = term_char+ ``` Matches documents if the targeted field contains a token equal to the provided term. 
@@ -75,15 +77,14 @@ Matches documents if the targeted field contains a token equal to the provided t ### Term Prefix `field:prefix*` ``` -term_prefix: term '*' +term_prefix = term '*' ``` Matches documents if the targeted field contains a token which starts with the provided value. `field:quick*` will match any document where the field 'field' has a token like `quickwit` or `quickstart`, but not `qui` or `abcd`. - -### Term set `field: IN [a b c]` +### Term set `field:IN [a b c]` ``` term_set = 'IN' '[' term_list ']' term_list = term_list term @@ -92,7 +93,7 @@ term_list = term_list term Matches if the document contains any of the tokens provided. ###### Examples -`field: IN [ab cd]` will match 'ab' or 'cd', but nothing else. +`field:IN [ab cd]` will match 'ab' or 'cd', but nothing else. ###### Perfomance Note This is a lot like writing `field:ab OR field:cd`. When there are only a handful of terms to search for, using ORs is usually faster. @@ -133,24 +134,20 @@ There is no slop for phrase prefix queries. ###### Limitation -Quickwit may trim some results matched by this clause in some cases. If you search for `"thanks for your co"*`, it will enumerate the first 50 tokens which start with "co", and search for any documents where "thanks for your" is followed by any of these tokens. +Quickwit may trim some results matched by this clause in some cases. If you search for `"thanks for your co"*`, it will enumerate the first 50 tokens which start with "co" (in their storage order), and search for any documents where "thanks for your" is followed by any of these tokens. If there are many tokens starting with "co", "contribution" might not be one of the 50 selected tokens, and the query won't match a document containing "thanks for your contribution". Normal prefix queries don't suffer from this issue. 
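The limitation can be reproduced with a toy term dictionary. The sketch below assumes, purely for illustration, that "storage order" is sorted order and that expansion stops after 50 tokens as stated above; `expand_prefix`, `MAX_EXPANSIONS` and the generated tokens are hypothetical names and data, not Quickwit internals.

```python
from bisect import bisect_left

MAX_EXPANSIONS = 50  # the limit quoted in the text above


def expand_prefix(sorted_terms, prefix, limit=MAX_EXPANSIONS):
    """Collect at most `limit` terms starting with `prefix`, in sorted order."""
    start = bisect_left(sorted_terms, prefix)
    selected = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix) or len(selected) == limit:
            break
        selected.append(term)
    return selected


# 60 synthetic tokens that sort before "contribution", plus the token we want.
terms = sorted([f"co{i:03d}" for i in range(60)] + ["contribution"])

selected = expand_prefix(terms, "co")
print(len(selected))               # 50
print("contribution" in selected)  # False: cut off by the limit

# Without the cap (here: a limit larger than the dictionary) the token is found.
uncapped = expand_prefix(terms, "co", limit=len(terms))
print("contribution" in uncapped)  # True
```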
- - - - -### Range `field: [low_bound high_bound}` +### Range `field:[low_bound TO high_bound}` ``` range = explicit_range | comparison_half_range explicit_range = left_bound_char bounds right_bound_char left_bound_char = '[' | '{' right_bound_char = '}' | ']' -bounds = term term - | term '*' - | '*' term +bounds = term TO term + | term TO '*' + | '*' TO term comparison_range = comparison_operator term comparision_operator = '<' | '>' | '<=' | '>=' @@ -171,7 +168,7 @@ Exclusive bounds are represented by curly brackets `{}`. They will not match tok You can make an half open range by using `*` as one of the bounds. `field:[b TO *]` will match 'bb' and 'zz', but not 'ab'. You can also use a comparison based syntax:`field:b`, `field:<=b` or `field:>=b`. - + ###### Examples - Inclusive Range: `ip:[127.0.0.1 TO 127.0.0.50]` @@ -224,13 +221,9 @@ Without parentheses, `AND` takes precedence over `OR`. That is, `a AND b OR c` i ## Other considerations ### Default Search Fields -In many case it is possible to omit the field you search if it was configured in the `default_search_fields` array of the index configuration. - - +In many case it is possible to omit the field you search if it was configured in the `default_search_fields` array of the index configuration. If more than one field is configured as default, the resulting implicit clauses are combined using a conjunction ('OR'). ### Tokenization Note that the result of a query can depend on the tokenizer used for the field getting searched. Hence this document always speaks of tokens, which may be the exact value the document contain (in case of the raw tokenizer), or a subset of it (for instance any tokenizer cutting on spaces). -Quickwit uses a query mini-language which is used by providing a `query` parameter to the search endpoints. 
- From 8c20987a41c6603f8e750add96eddc1bbafc93b5 Mon Sep 17 00:00:00 2001 From: Damien de Lemeny Date: Tue, 19 Mar 2024 11:43:58 -0500 Subject: [PATCH 4/6] Add precision about sort order for range clauses --- docs/reference/query-language.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/reference/query-language.md b/docs/reference/query-language.md index 68fe7f7356e..4449ba4f32b 100644 --- a/docs/reference/query-language.md +++ b/docs/reference/query-language.md @@ -157,7 +157,9 @@ Matches if the document contains a token between the provided bounds for that fi For range queries, you must provide a field. Quickwit won't use `default_search_fields` automatically. ###### Order -For text fields, the ranges are defined by lexicographic order. It means for a text field, 100 is between 1 and 2. +For text fields, the ranges are defined by lexicographic order on uft-8 encoded byte arrays. It means for a text field, 100 is between 1 and 2. + + When using ranges on integers, it behaves naturally. 
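The "100 is between 1 and 2" remark follows directly from comparing UTF-8 bytes instead of numeric values. A minimal sketch, with a hypothetical helper `in_text_range` that models an inclusive range on a text field:

```python
def in_text_range(token: str, low: str, high: str) -> bool:
    """Inclusive range check using byte-wise comparison of UTF-8 encodings."""
    t, lo, hi = (s.encode("utf-8") for s in (token, low, high))
    return lo <= t <= hi


# As text, "100" sorts after "1" (it extends it) and before "2".
print(in_text_range("100", "1", "2"))  # True

# As integers, 100 is of course not between 1 and 2.
print(1 <= 100 <= 2)  # False
```

This is why a numeric field type is the better fit when ranges are meant to follow numeric order.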
###### Inclusive and exclusive bounds From c20e19e8b1628338657c024f39ee3bf0672559da Mon Sep 17 00:00:00 2001 From: Damien de Lemeny Date: Tue, 19 Mar 2024 11:59:01 -0500 Subject: [PATCH 5/6] Trim unwanted space --- docs/get-started/query-language-intro.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/get-started/query-language-intro.md b/docs/get-started/query-language-intro.md index 93d1d09b1b1..34dc1d5d1d8 100644 --- a/docs/get-started/query-language-intro.md +++ b/docs/get-started/query-language-intro.md @@ -36,7 +36,7 @@ Quickwit support various types of clauses to express different kinds of conditio | phrase prefix | `field:"sequence of tokens"*` | `title:"how to m"*` | A phrase prefix clause tests the existence of a sequence of tokens, the last one used like in a prefix clause | yes | | all | `*` | `*` | A match-all clause will match every document | no | | exist | `field:*` | `error:*` | An exist clause tests the existence of any value for the field, it will match only if the field exists | no | -| range | `field:bounds` |`duration:[0 TO 1000}`
`last_name: [banner TO miller]` | A range clause tests the existence of a token between the provided bounds | no | +| range | `field:bounds` |`duration:[0 TO 1000}`
`last_name:[banner TO miller]` | A range clause tests the existence of a token between the provided bounds | no | ## Queries From c8820b9212c0b85f61473d458ba6e5355eb3b889 Mon Sep 17 00:00:00 2001 From: Damien de Lemeny Date: Tue, 19 Mar 2024 14:21:52 -0500 Subject: [PATCH 6/6] Minor change to QL intro --- docs/get-started/query-language-intro.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/docs/get-started/query-language-intro.md b/docs/get-started/query-language-intro.md index 34dc1d5d1d8..dfdcba653d1 100644 --- a/docs/get-started/query-language-intro.md +++ b/docs/get-started/query-language-intro.md @@ -40,11 +40,9 @@ Quickwit support various types of clauses to express different kinds of conditio ## Queries -Clauses can be combined using operators to form more complex queries. - ### Combining queries -Clauses can be combined using boolean operators `AND` and `OR` to create search exp +Clauses can be combined using boolean operators `AND` and `OR` to create more complex search expressions. An `AND` query will match only if conditions on both sides of the operator are met ``` type:rose AND color:red