Index update from index config (#5078)
* Add indexing_settings and use index config as input

* Doc fixes and cleanup deprecated parameters

* Doc improvements

* Fix outdated IndexMetadata content

* Fix typos in CLI output

* Add metastore tests for each updatable config field

* Address review comments

* Fix unintended privatisation of DocMapping

* Fix typo in tests

* Address review comments
rdettai authored Jun 12, 2024
1 parent 82a00a7 commit f27198f
Showing 26 changed files with 772 additions and 609 deletions.
2 changes: 0 additions & 2 deletions config/tutorials/hdfs-logs/index-config-partitioned.yaml
@@ -43,5 +43,3 @@ indexing_settings:
    merge_factor: 10
    max_merge_ops: 3
    maturation_period: 48 hours
  resources:
    max_merge_write_throughput: 100mb
36 changes: 5 additions & 31 deletions docs/reference/cli.md
@@ -182,49 +182,23 @@ quickwit index create --endpoint=http://127.0.0.1:7280 --index-config wikipedia_

### index update

Updates an index using an index config file.
`quickwit index update [args]`
#### index update search-settings

Updates default search settings.
`quickwit index update search-settings [args]`

*Synopsis*

```bash
quickwit index update search-settings
quickwit index update
    --index <index>
    --default-search-fields <default-search-fields>
```

*Options*

| Option | Description |
|-----------------|-------------|
| `--index` | ID of the target index |
| `--default-search-fields` | List of fields that Quickwit will search into if the user query does not explicitly target a field. Space-separated list, e.g. "field1 field2". If no value is provided, existing defaults are removed and queries without target field will fail. |
#### index update retention-policy

Configure or disable the retention policy.
`quickwit index update retention-policy [args]`

*Synopsis*

```bash
quickwit index update retention-policy
    --index <index>
    [--period <period>]
    [--schedule <schedule>]
    [--disable]
    --index-config <index-config>
```

*Options*

| Option | Description |
|-----------------|-------------|
| `--index` | ID of the target index |
| `--period` | Duration after which splits are dropped. Expressed in a human-readable way (`1 day`, `2 hours`, `1 week`, ...) |
| `--schedule` | Frequency at which the retention policy is evaluated and applied. Expressed as a cron expression (0 0 * * * *) or human-readable form (hourly, daily, weekly, ...). |
| `--disable` | Disable the retention policy. Old indexed data will not be cleaned up anymore. |
| `--index-config` | Location of the index config file. |
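
The help text this commit adds to the `update` subcommand says the command follows PUT semantics: every field is replaced by the value in the new config file or by its default, an omitted optional field such as `retention_policy` is deleted, and an invalid update is rejected as a whole. The sketch below illustrates that behaviour only. `IndexConfigSketch`, `UpdateRequest`, `put_update`, and the empty-period validation rule are hypothetical names and rules invented for this example; they are not Quickwit's types or its actual validation logic.

```rust
/// Hypothetical, simplified stand-in for an index config (not Quickwit's type).
#[derive(Debug, Clone, PartialEq)]
struct IndexConfigSketch {
    default_search_fields: Vec<String>,
    retention_period: Option<String>,
}

/// Hypothetical contents of the new config file: every field may be absent.
#[derive(Debug, Default)]
struct UpdateRequest {
    default_search_fields: Option<Vec<String>>,
    retention_period: Option<String>,
}

/// PUT-style update: the result is built from the request and the defaults
/// only. Nothing is carried over from `_current`.
fn put_update(
    _current: &IndexConfigSketch,
    request: UpdateRequest,
) -> Result<IndexConfigSketch, String> {
    let updated = IndexConfigSketch {
        // Absent field: fall back to the default (an empty list here).
        default_search_fields: request.default_search_fields.unwrap_or_default(),
        // Absent optional field: the previous value is dropped.
        retention_period: request.retention_period,
    };
    // Illustrative validation rule (an assumption): reject a blank period.
    // On error the caller keeps `_current` as is, so nothing is applied.
    if let Some(period) = &updated.retention_period {
        if period.trim().is_empty() {
            return Err("retention period must not be empty".to_string());
        }
    }
    Ok(updated)
}

fn main() {
    let current = IndexConfigSketch {
        default_search_fields: vec!["body".to_string()],
        retention_period: Some("7 days".to_string()),
    };

    // The new config sets search fields but omits the retention policy.
    let request = UpdateRequest {
        default_search_fields: Some(vec!["title".to_string()]),
        retention_period: None,
    };
    let updated = put_update(&current, request).expect("valid update");
    assert_eq!(updated.retention_period, None); // retention policy removed
    println!("updated config: {updated:?}");

    // An invalid update fails as a whole; `current` is left untouched.
    let invalid = UpdateRequest {
        default_search_fields: None,
        retention_period: Some("  ".to_string()),
    };
    assert!(put_update(&current, invalid).is_err());
    println!("current config still: {current:?}");
}
```

The practical consequence is that the config file passed to `--index-config` should describe the complete desired configuration, not just the fields that change.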
### index clear

Clears an index: deletes all splits and resets checkpoint.
@@ -381,7 +355,7 @@ quickwit index ingest
| `--batch-size-limit` | Size limit of each submitted document batch. |
| `--wait` | Wait for all documents to be commited and available for search before exiting |
| `--force` | Force a commit after the last document is sent, and wait for all documents to be committed and available for search before exiting |
| `--commit-timeout` | Timeout for ingest operations that require waiting for the final commit (`--wait` or `--force`). This is different from the `commit_timeout_secs` indexing setting which sets the maximum time before commiting splits after their creation. |
| `--commit-timeout` | Timeout for ingest operations that require waiting for the final commit (`--wait` or `--force`). This is different from the `commit_timeout_secs` indexing setting, which sets the maximum time before commiting splits after their creation. |

*Examples*

166 changes: 104 additions & 62 deletions docs/reference/rest-api.md

Large diffs are not rendered by default.

@@ -57,13 +57,10 @@ use tabled::{Table, Tabled};
use thousands::Separable;
use tracing::{debug, Level};

use self::update::{build_index_update_command, IndexUpdateCliCommand};
use crate::checklist::GREEN_COLOR;
use crate::stats::{mean, percentile, std_deviation};
use crate::{client_args, make_table, prompt_confirmation, ClientArgs, THROUGHPUT_WINDOW_SIZE};

pub mod update;

pub fn build_index_command() -> Command {
    Command::new("index")
        .about("Manages indexes: creates, updates, deletes, ingests, searches, describes...")
@@ -81,7 +78,18 @@ pub fn build_index_command() -> Command {
                ])
        )
        .subcommand(
            build_index_update_command().display_order(2)
            Command::new("update")
                .display_order(1)
                .about("Updates an index using an index config file.")
                .long_about("This command follows PUT semantics, which means that all the fields of the current configuration are replaced by the values specified in this request or the associated defaults. In particular, if the field is optional (e.g. `retention_policy`), omitting it will delete the associated configuration. If the new configuration file contains updates that cannot be applied, the request fails, and none of the updates are applied.")
                .args(&[
                    arg!(--index <INDEX> "ID of the target index")
                        .display_order(1)
                        .required(true),
                    arg!(--"index-config" <INDEX_CONFIG> "Location of the index config file.")
                        .display_order(2)
                        .required(true),
                ])
        )
        .subcommand(
            Command::new("clear")
@@ -213,6 +221,14 @@ pub struct CreateIndexArgs {
    pub assume_yes: bool,
}

#[derive(Debug, Eq, PartialEq)]
pub struct UpdateIndexArgs {
    pub client_args: ClientArgs,
    pub index_id: IndexId,
    pub index_config_uri: Uri,
    pub assume_yes: bool,
}

#[derive(Debug, Eq, PartialEq)]
pub struct DescribeIndexArgs {
    pub client_args: ClientArgs,
@@ -260,12 +276,12 @@ pub struct ListIndexesArgs {
pub enum IndexCliCommand {
    Clear(ClearIndexArgs),
    Create(CreateIndexArgs),
    Update(UpdateIndexArgs),
    Delete(DeleteIndexArgs),
    Describe(DescribeIndexArgs),
    Ingest(IngestDocsArgs),
    List(ListIndexesArgs),
    Search(SearchIndexArgs),
    Update(IndexUpdateCliCommand),
}

impl IndexCliCommand {
@@ -288,7 +304,7 @@ impl IndexCliCommand {
            "ingest" => Self::parse_ingest_args(submatches),
            "list" => Self::parse_list_args(submatches),
            "search" => Self::parse_search_args(submatches),
            "update" => Ok(Self::Update(IndexUpdateCliCommand::parse_args(submatches)?)),
            "update" => Self::parse_update_args(submatches),
            _ => bail!("unknown index subcommand `{subcommand}`"),
        }
    }
@@ -323,6 +339,25 @@ impl IndexCliCommand {
        }))
    }

    fn parse_update_args(mut matches: ArgMatches) -> anyhow::Result<Self> {
        let client_args = ClientArgs::parse(&mut matches)?;
        let index_id = matches
            .remove_one::<String>("index")
            .expect("`index` should be a required arg.");
        let index_config_uri = matches
            .remove_one::<String>("index-config")
            .map(|uri| Uri::from_str(&uri))
            .expect("`index-config` should be a required arg.")?;
        let assume_yes = matches.get_flag("yes");

        Ok(Self::Update(UpdateIndexArgs {
            index_id,
            client_args,
            index_config_uri,
            assume_yes,
        }))
    }

    fn parse_describe_args(mut matches: ArgMatches) -> anyhow::Result<Self> {
        let client_args = ClientArgs::parse(&mut matches)?;
        let index_id = matches
@@ -449,7 +484,7 @@ impl IndexCliCommand {
            Self::Ingest(args) => ingest_docs_cli(args).await,
            Self::List(args) => list_index_cli(args).await,
            Self::Search(args) => search_index_cli(args).await,
            Self::Update(args) => args.execute().await,
            Self::Update(args) => update_index_cli(args).await,
        }
    }
}
@@ -501,6 +536,35 @@ pub async fn create_index_cli(args: CreateIndexArgs) -> anyhow::Result<()> {
    Ok(())
}

pub async fn update_index_cli(args: UpdateIndexArgs) -> anyhow::Result<()> {
    debug!(args=?args, "update-index");
    println!("❯ Updating index...");
    let storage_resolver = StorageResolver::unconfigured();
    let file_content = load_file(&storage_resolver, &args.index_config_uri).await?;
    let index_config_str = std::str::from_utf8(&file_content)
        .with_context(|| {
            format!(
                "index config file `{}` contains some invalid UTF-8 characters",
                args.index_config_uri
            )
        })?
        .to_string();
    let config_format = ConfigFormat::sniff_from_uri(&args.index_config_uri)?;
    let qw_client = args.client_args.client();
    if !args.assume_yes {
        let prompt = "This operation will update the index configuration. Do you want to proceed?";
        if !prompt_confirmation(prompt, false) {
            return Ok(());
        }
    }
    qw_client
        .indexes()
        .update(&args.index_id, &index_config_str, config_format)
        .await?;
    println!("{} Index successfully updated.", "✔".color(GREEN_COLOR));
    Ok(())
}
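
Before sending the request, `update_index_cli` checks that the file content is valid UTF-8 and derives the config format from the URI. The standalone sketch below mirrors those two steps using only the standard library. `FormatSketch`, `sniff_format`, and `decode_config` are hypothetical helpers written for this example, and the extension-to-format mapping is an assumption; Quickwit's `ConfigFormat::sniff_from_uri` may behave differently.

```rust
use std::path::Path;

/// Hypothetical stand-in for a config format enum (not Quickwit's `ConfigFormat`).
#[derive(Debug, PartialEq)]
enum FormatSketch {
    Json,
    Toml,
    Yaml,
}

/// Guesses the format from the file extension (assumed mapping).
fn sniff_format(path: &str) -> Result<FormatSketch, String> {
    let extension = Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .ok_or_else(|| format!("`{path}` has no file extension"))?;
    match extension {
        "json" => Ok(FormatSketch::Json),
        "toml" => Ok(FormatSketch::Toml),
        "yaml" | "yml" => Ok(FormatSketch::Yaml),
        other => Err(format!("unsupported config extension `{other}`")),
    }
}

/// Validates the raw bytes as UTF-8, then sniffs the format.
/// Either failure aborts before any request would be sent.
fn decode_config(path: &str, bytes: &[u8]) -> Result<(String, FormatSketch), String> {
    let content = std::str::from_utf8(bytes).map_err(|_| {
        format!("index config file `{path}` contains some invalid UTF-8 characters")
    })?;
    let format = sniff_format(path)?;
    Ok((content.to_string(), format))
}

fn main() {
    // Valid UTF-8 with a recognised extension.
    let decoded = decode_config("index-config.yaml", b"version: 0.8\n");
    println!("{decoded:?}");

    // Invalid UTF-8 is rejected with an error that names the file.
    let invalid = decode_config("index-config.yaml", &[0xff, 0xfe]);
    println!("{invalid:?}");
}
```

Doing both checks on the client side lets a malformed file be reported with a local error message before the confirmation prompt and the network call.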

pub async fn list_index_cli(args: ListIndexesArgs) -> anyhow::Result<()> {
    debug!(args=?args, "list-index");
    let qw_client = args.client_args.client();