Add easier segment tracing / verbosity / transparency to `IndexWriter` #14182

mikemccand · 2025-01-29T23:21:52Z

Description

When trying to understand why a shard seems to not do a good job merging, it's surprisingly difficult to gain visibility / understanding. E.g. cases like #14163 and #13226.

At Amazon Product Search, we are also trying to understand how our service behaves under update storms (many sudden real-time catalog updates), and its impact on merging / NRT segment replication.

IndexWriter has an InfoStream which gives amazing verbosity on all that is happening, but it is too voluminous.

I'd think we could make a small improvement to InfoStream. Today, it writes under different components, e.g. SM for segment merging. I'd like to add a new component, ST (for "segment tracing"), which provides a smallish amount of output about each flush (start and end, size, deletes) and each merge (start and end, which segments, how many deletes at the start, how many carryover deletes, i.e. deletes that happened while the merge was running, when deletes are applied/written, and time to merge each index section: doc values, postings, knn, etc.).

IW/SM already writes much of this to InfoStream but it's too scattered / diffuse. I'm hoping a new ST can be lighter weight and have the important debugging details that can help us understand issues like the ones linked/described above. An application can set an InfoStream that captures just the ST messages ...
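To make the "captures just the ST messages" idea concrete, here is a minimal Python sketch that post-filters an InfoStream log by component. It assumes each log line begins with the component name followed by a space, which I have not checked against every Lucene version. `ST` is only the proposed component, and `filter_component` and the sample lines are made up for illustration.

```python
from typing import Iterable, Iterator


def filter_component(lines: Iterable[str], component: str) -> Iterator[str]:
    """Yield only the log lines written under the given component.

    Assumption: each line begins with the component name followed by a
    space. Continuation lines of multi-line messages would be dropped.
    """
    prefix = component + " "
    for line in lines:
        if line.startswith(prefix):
            yield line.rstrip("\n")


# Made-up sample lines, for illustration only; ST is the proposed component.
sample = [
    "IW 0 [ts; main]: hypothetical flush message\n",
    "ST 0 [ts; main]: hypothetical flush start\n",
    "SM 0 [ts; merge-thread]: hypothetical merge message\n",
    "ST 0 [ts; main]: hypothetical flush end\n",
]

st_only = list(filter_component(sample, "ST"))
```

Filtering after the fact like this still pays the cost of writing the full log; a real ST component would avoid that by emitting less in the first place.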

Once we have this, the 2nd part of this effort is a simple tool that can digest the output of ST InfoStream and visualize it, e.g. producing videos like this one and maybe a 2D interactive canvas/chart that lays out a graphical rendition of all segments and their lifetimes.


mikemccand · 2025-02-04T14:05:28Z

I have been tinkering with fun little Python tools in luceneutil to 1) parse a full InfoStream log into pickled classes representing all segments and their lifecycle during indexing, and 2) render the 2D segment "explanation" as a (slightly) interactive SVG HTML UI.

The resulting output is sort of a 2D rendering of the same-ish per-segment information from the merge visualization videos (from my long ago blog post about visualizing Lucene's segment merges).

Here is an example of indexing enwiki with many threads and no deletes, and this one is derived from near-real-time indexing and refreshing once per second (has deletions). The results are quite mesmerizing to look at / scroll through!

Example (from this run):

Some quick explanations:

Blue segments were created by merge and red segments were created by flush (newly indexed/written documents)
The height of the rectangle is proportional to its log(size_mb) -- thicker rectangles are bigger segments
The width of the rectangle is its lifetime. Notice how sometimes small segments live a long time, and some large segments live a short time. Surprising!
Each segment starts with a "dawn" (lighter shade), which is the duration while it is being written and not yet lit in the index
Some segments also end with a "dusk" (darker shade), which is the duration while it is being merged into another segment but not yet dropped from the index
When you mouse into a segment, it pops up a little text box with some details. It's hard to read. I want to make it multi-line but this is seemingly not simple in SVG/JS/CSS land, and I am most definitely not good at the latest web tech hah
When you mouse into a blue segment, it will highlight in gold/yellow the segments that were merged to produce this segment. (Sometimes they are not visible in your viewport.)
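The layout rules above can be sketched roughly as follows. This is not the luceneutil code: the `Segment` fields, the `rect` helper and the pixel scale factors are all hypothetical, and it uses `log1p(size_mb)` rather than a plain `log(size_mb)` so that segments smaller than 1 MB don't get a negative height. It also assumes the rectangle's width spans from the start of writing to the drop, so the dawn and dusk are portions of that width.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical per-segment record; field names are illustrative."""
    name: str
    source: str                  # "flush" or "merge"
    size_mb: float
    write_start: float           # seconds: segment starts being written
    lit: float                   # seconds: segment becomes live in the index
    dropped: float               # seconds: segment is dropped from the index
    merge_start: Optional[float] = None  # seconds: starts being merged away
    merged_from: List[str] = field(default_factory=list)


def rect(seg: Segment, px_per_sec: float = 10.0, px_per_log_mb: float = 4.0) -> dict:
    """Map a segment's lifecycle to rectangle geometry in pixels."""
    # Height grows with the log of the size; log1p keeps it non-negative.
    height = px_per_log_mb * math.log1p(seg.size_mb)
    # Width is the whole lifetime, from first write until dropped.
    width = px_per_sec * (seg.dropped - seg.write_start)
    # Dawn: being written, not yet lit. Dusk: being merged, not yet dropped.
    dawn = px_per_sec * (seg.lit - seg.write_start)
    dusk = 0.0 if seg.merge_start is None else px_per_sec * (seg.dropped - seg.merge_start)
    color = "blue" if seg.source == "merge" else "red"
    return {"x": px_per_sec * seg.write_start, "width": width, "height": height,
            "dawn": dawn, "dusk": dusk, "color": color}


# A flushed segment written for 2s, live until t=10s, merging away from t=7s.
example = rect(Segment("_a", "flush", size_mb=math.e - 1,
                       write_start=0.0, lit=2.0, dropped=10.0, merge_start=7.0))
```

The `merged_from` list is what a hover handler would need to highlight the source segments of a blue (merged) segment.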

This is still a work in progress! I suspect the above links only work on desktop browsers with big screens! Feedback welcome :)

I still want to reflect deletions better -- a segment accumulates more and more deletions with time, and the UI doesn't show that yet.

In doing this, it's clear we need access to a whole bunch of stuff from InfoStream, so... I now think this issue is a premature optimization (I will close it now). Let's instead just build these tools out on top of what IndexWriter's InfoStream already produces today, and maybe later we can optimize InfoStream writing to produce smaller output.

The overall goal of these tools is to give some badly needed transparency on an index's segments to help in debugging cases where merging is not doing what we'd expect ... (at Amazon Product Search we are also struggling with taming our TieredMergePolicy configuration).

Also, if anyone has some InfoStreams just lying around, or they are confused about how merges are happening in their shards, please turn on InfoStream and share the log and I'll try to use it as a test case for iterating on this, and maybe it uncovers something!

rmuir · 2025-02-04T17:33:30Z

@mikemccand rather than mess with InfoStream logging, could we consider adding some counters to IndexWriter to give visibility? E.g. if you have a flush count with a simple int getter, that's enough. Users can expose these counts via an HTTP API and scrape them regularly with a tool such as Prometheus, and see any storms in graphs. Exposing simple metric counters etc. seems like easy low-hanging fruit to allow people to have better eyes on this stuff.
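For illustration only, the counter idea might look something like the Python sketch below. The `Counters` class and the metric names are hypothetical and are not an existing IndexWriter API; the output is plain `name value` lines that an HTTP endpoint could serve for a scraper to poll.

```python
import threading


class Counters:
    """Hypothetical thread-safe event counters; not an IndexWriter API."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts = {}

    def increment(self, name: str, amount: int = 1) -> None:
        """Add to a named counter, creating it at zero if needed."""
        with self._lock:
            self._counts[name] = self._counts.get(name, 0) + amount

    def get(self, name: str) -> int:
        """Simple getter: current value, or 0 if never incremented."""
        with self._lock:
            return self._counts.get(name, 0)

    def render(self) -> str:
        """Render all counters as 'name value' lines, sorted by name."""
        with self._lock:
            items = sorted(self._counts.items())
        return "".join(f"{name} {value}\n" for name, value in items)


counters = Counters()
counters.increment("flush_count")   # made-up metric names
counters.increment("flush_count")
counters.increment("merge_count")
report = counters.render()
```

Because the values only ever go up, a graphing tool can derive rates from successive scrapes, which is what would make an update storm visible.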

mikemccand added the type:enhancement label Jan 29, 2025

mikemccand closed this as completed Feb 4, 2025
