Allow $ and @ characters in field names. #4413

Merged: 4 commits, Jan 22, 2024
7 changes: 4 additions & 3 deletions docs/configuration/index-config.md
@@ -491,12 +491,13 @@ src.port:53 AND query_params.ctk:e42bb897d
 ### Field name validation rules

 Currently Quickwit only accepts field name that matches the following regular expression:
-`[a-zA-Z][_\.\-a-zA-Z0-9]*$`
+`^[@$_\-a-zA-Z][@$_\.\-a-zA-Z0-9]{0,254}$`

 In plain language:
 - it needs to have at least one character.
-- it should only contain latin letter `[a-zA-Z]` digits `[0-9]` or (`.`, `-`, `_`).
-- the first character needs to be a letter.
+- it can only contain uppercase and lowercase ASCII letters `[a-zA-Z]`, digits `[0-9]`, `.`, hyphens `-`, underscores `_`, at `@` and dollar `$` signs.
+- it must not start with a dot or a digit.
+- it must be different from Quickwit's reserved field mapping names `_source`, `_dynamic`, `_field_presence`.
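
As a rough illustration of the rules listed above, here is a dependency-free Rust sketch that checks a name character by character rather than with the regular expression. The names `is_valid_field_name` and `RESERVED_FIELD_NAMES` are hypothetical and exist only for this example; they are not Quickwit's API, and Quickwit's own check in this PR is regex-based.

```rust
/// Reserved field mapping names, copied from the rules above.
/// Hypothetical constant, defined only for this sketch.
const RESERVED_FIELD_NAMES: [&str; 3] = ["_source", "_dynamic", "_field_presence"];

/// Returns `true` if `name` satisfies the documented rules.
/// Hypothetical helper, not part of Quickwit.
fn is_valid_field_name(name: &str) -> bool {
    let mut chars = name.chars();
    // Rule: the name needs at least one character.
    let first = match chars.next() {
        Some(c) => c,
        None => return false,
    };
    // Rule: the first character is an ASCII letter, `@`, `$`, `_` or `-`,
    // which excludes a leading dot or digit.
    let first_ok = first.is_ascii_alphabetic() || matches!(first, '@' | '$' | '_' | '-');
    // Rule: the remaining characters may additionally be digits or dots.
    let rest_ok =
        chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '@' | '$' | '_' | '-' | '.'));
    // Rule: at most 255 characters (1 leading character + `{0,254}` in the regex).
    let length_ok = name.chars().count() <= 255;
    // Rule: the name must not be one of the reserved names.
    let is_reserved = RESERVED_FIELD_NAMES.iter().any(|r| *r == name);
    first_ok && rest_ok && length_ok && !is_reserved
}
```

Under these rules, `$my_field@` and `my.field` are accepted, while `.my_field`, `1field`, and the reserved name `_source` are rejected.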

 :::caution
 For field names containing the `.` character, you will need to escape it when referencing them. Otherwise the `.` character will be interpreted as a JSON object property access. Because of this, it is recommended to avoid using field names containing the `.` character.
2 changes: 2 additions & 0 deletions docs/configuration/storage-config.md
@@ -146,3 +146,5 @@ storage:
     flavor: minio
     endpoint: http://127.0.0.1:9000
 ```
+
+Note: `default_index_root_uri` or index URIs do not include the endpoint, you should set it as a typical S3 path such as `s3://indexes`.
Review comment (contributor, PR author):

Just a minor addition to the docs, as I saw two users make this mistake, that is, setting the URI to `s3://127.0.0.1:9000/indexes`.
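
To make the note concrete, the following node configuration fragment is a minimal sketch. It assumes a local MinIO instance at `http://127.0.0.1:9000` and a bucket named `indexes`, both taken from the examples above; the `s3:` nesting under `storage` is an assumption to verify against the storage configuration docs for your Quickwit version.

```yaml
# Sketch only: the endpoint is configured under `storage`, not in the index URI.
default_index_root_uri: s3://indexes   # bucket (and optional prefix) only
# default_index_root_uri: s3://127.0.0.1:9000/indexes   # the mistake described above

storage:
  s3:
    flavor: minio
    endpoint: http://127.0.0.1:9000
```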

13 changes: 7 additions & 6 deletions quickwit/quickwit-doc-mapper/src/default_doc_mapper/mod.rs
@@ -46,17 +46,17 @@ pub(crate) use self::tokenizer_entry::{
 use crate::QW_RESERVED_FIELD_NAMES;

 /// Regular expression validating a field mapping name.
-pub const FIELD_MAPPING_NAME_PATTERN: &str = r"^[_\-a-zA-Z][_\.\-a-zA-Z0-9]{0,254}$";
+pub const FIELD_MAPPING_NAME_PATTERN: &str = r"^[@$_\-a-zA-Z][@$_\.\-a-zA-Z0-9]{0,254}$";

 /// Validates a field mapping name.
-/// Returns `Ok(())` if the name can be used for a field mapping. Does not check for reserved field
-/// mapping names such as `_source`.
+/// Returns `Ok(())` if the name can be used for a field mapping.
 ///
 /// A field mapping name:
-/// - may only contain uppercase and lowercase ASCII letters `[a-zA-Z]`, digits `[0-9]`, hyphens
-/// `-`, and underscores `_`;
+/// - can only contain uppercase and lowercase ASCII letters `[a-zA-Z]`, digits `[0-9]`, `.`,
+/// hyphens `-`, underscores `_`, at `@` and dollar `$` signs;
 /// - must not start with a dot or a digit;
-/// - must be different from Quickwit's reserved field mapping names `_source`, `_dynamic`;
+/// - must be different from Quickwit's reserved field mapping names `_source`, `_dynamic`,
+/// `_field_presence`;
 /// - must not be longer than 255 characters.
 pub fn validate_field_mapping_name(field_mapping_name: &str) -> anyhow::Result<()> {
     static FIELD_MAPPING_NAME_PTN: Lazy<Regex> =
@@ -146,6 +146,7 @@ mod tests {
         assert!(validate_field_mapping_name("my-field").is_ok());
         assert!(validate_field_mapping_name("my.field").is_ok());
         assert!(validate_field_mapping_name("my_field").is_ok());
+        assert!(validate_field_mapping_name("$my_field@").is_ok());
         assert!(validate_field_mapping_name(&"a".repeat(255)).is_ok());
     }
 }