shards: only trigger rescan on .zoekt files changing (#801)
Previously, any write to the index dir triggered a scan, so busy
instances were constantly rescanning, which over-represented watch in
CPU profiles. The events are normally writes to our temporary files. By
only considering events for .zoekt files (which is what scan reads),
plus .meta files, we can avoid the constant scan calls.
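
The filter described above can be sketched on its own. This is a minimal illustration, not the code in shards/watcher.go: the function name shouldRescan and the example file names are hypothetical, while the two suffixes are the ones checked in this commit's diff.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldRescan reports whether a filesystem event for the named file should
// trigger a rescan. The function name is hypothetical; the suffixes mirror
// the check added to watch() in this commit.
func shouldRescan(name string) bool {
	return strings.HasSuffix(name, ".zoekt") || strings.HasSuffix(name, ".meta")
}

func main() {
	// File names below are made up for illustration only.
	names := []string{
		"index/repo.zoekt",
		"index/repo.zoekt.meta",
		"index/repo.zoekt.tmp",
	}
	for _, name := range names {
		fmt.Printf("%-24s rescan=%v\n", name, shouldRescan(name))
	}
}
```

Keeping the check as a pure function of the event name makes it cheap to run on every event and easy to test without a real watcher.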

As a safeguard, we also introduce a rescan every minute in case we miss
an event. There is error handling around this, but I thought it more
reliable to simply call scan every once in a while.

Note: the rescanning doesn't represent significant CPU use, but it does
muddy the CPU profiler output, so this change makes it easier to
understand trends in our continuous CPU profiling.

Test Plan: CI
keegancsmith authored Aug 2, 2024
1 parent 764fe4f commit acacc5e
Showing 1 changed file with 23 additions and 4 deletions.
27 changes: 23 additions & 4 deletions shards/watcher.go
@@ -117,6 +117,8 @@ func versionFromPath(path string) (string, int) {
 }
 
 func (s *DirectoryWatcher) scan() error {
+	// NOTE: if you change which file extensions are read, please update the
+	// watch implementation.
 	fs, err := filepath.Glob(filepath.Join(s.dir, "*.zoekt"))
 	if err != nil {
 		return err
@@ -216,21 +218,38 @@ func (s *DirectoryWatcher) watch() error {
 	signal := make(chan struct{}, 1)
 
 	go func() {
+		notify := func() {
+			select {
+			case signal <- struct{}{}:
+			default:
+			}
+		}
+
+		ticker := time.NewTicker(time.Minute)
+
 		for {
 			select {
-			case <-watcher.Events:
-				select {
-				case signal <- struct{}{}:
-				default:
+			case event := <-watcher.Events:
+				// Only notify if a file we read in has changed. This is important to
+				// avoid all the events writing to temporary files.
+				if strings.HasSuffix(event.Name, ".zoekt") || strings.HasSuffix(event.Name, ".meta") {
+					notify()
 				}
+
+			case <-ticker.C:
+				// Periodically just double check the disk
+				notify()
+
 			case err := <-watcher.Errors:
 				// Ignore ErrEventOverflow since we rely on the presence of events so
 				// safe to ignore.
 				if err != nil && err != fsnotify.ErrEventOverflow {
 					log.Println("watcher error:", err)
 				}
+
 			case <-s.quit:
 				watcher.Close()
+				ticker.Stop()
 				close(signal)
 				return
 			}
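
The notify closure in the hunk above relies on a buffered channel of size 1 and a non-blocking send, so any number of notifications collapse into at most one pending signal. A minimal standalone sketch of that pattern, where newNotifier is a hypothetical helper and not part of the commit:

```go
package main

import "fmt"

// newNotifier returns a signal channel with room for one pending signal and
// a notify function that never blocks. newNotifier is a hypothetical helper
// used only for this illustration.
func newNotifier() (<-chan struct{}, func()) {
	signal := make(chan struct{}, 1)
	notify := func() {
		select {
		case signal <- struct{}{}:
		default:
			// A signal is already pending, so this one is dropped.
		}
	}
	return signal, notify
}

func main() {
	signal, notify := newNotifier()

	// A burst of notifications leaves exactly one signal queued.
	for i := 0; i < 5; i++ {
		notify()
	}
	fmt.Println("pending signals:", len(signal)) // prints "pending signals: 1"

	// Once the consumer drains it, the next notify queues a new signal.
	<-signal
	notify()
	fmt.Println("pending signals:", len(signal)) // prints "pending signals: 1"
}
```

Dropping the extra sends is safe here because a single scan picks up every change on disk, so the consumer only needs to know that at least one notification happened.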