Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow running inotify in another process #583

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

etiennebarrie
Copy link

We're using Listen through ActiveSupport::EventedFileUpdateChecker on a large application, and we're seeing a big discrepancy between running on Linux and macOS.
The main reason I found is that rb-fsevent uses a different process to do the file watching with which it communicates through a pipe.
On the other hand, rb-inotify does syscalls through ffi directly. In our case, recursing over all the subdirectories in Ruby takes a long time, about 1.3 seconds. By comparison, the same call to Listen::Listener#start takes less than 2 milliseconds on macOS.

I've read through the Performance and Tips and Techniques sections of the README, however our issue is that we just have a lot of reloadable code in a lot of directories. This is after reducing the watched paths to a minimum already.

In this PR, I've introduced a new adapter, ProcessLinux, which instead of listening directly, forks to a process with the directories as arguments, and listens to the pipe. I've added a way to use it by adding a backend selecting option prefer_fork.

It kinda works, however it doesn't fit in the current architecture very well. For example, the adapter doesn't go through the transitions properly. Listener#start returns before events can actually be processed.
Also it doesn't handle pausing currently.
More importantly, deletions don't work for some reason. The calling process receives deletion events but they don't turn into a call to the callback.
Also we're still spending time creating all the snapshots in the calling process, even though I'm not sure we need this. Is the snapshot feature only necessary for pausing/restarting?

Anyway I've created a small repro script to show the issue around the directory recursion.

Repro script
require "fileutils"
require "benchmark"
require "etc"

# Create 4096 directories (16*16*16)
FileUtils.rm_rf("tmp")
digits = (0..9).to_a + ('a'..'f').to_a
dir = nil
digits.product(digits, digits, digits) do |a,b,c|
  dir = "tmp/#{a}/#{b}/#{c}"
  FileUtils.mkdir_p(dir)
end

minor_before, major_before = GC.stat.values_at(:minor_gc_count, :major_gc_count)

fork = ARGV.delete("--fork")
GC.disable if ARGV.delete("--disable-gc")
profile = ARGV.delete("--profile")
allocations = ARGV.fetch(0) { 1_000 }.to_i

require "bundler/inline"
gemfile do
  gem "listen", path: "."
  gem "vernier" if profile
end

before_ts = Process.clock_gettime(Process::CLOCK_MONOTONIC)

if profile
  require "vernier"
  Vernier.start_profile(out: "#{Etc.uname[:sysname]}#{fork}.json")
end

queue = Queue.new
listener = Listen.to("tmp", { prefer_fork: fork }) do |m, a, r|
  queue.close
end
Thread.new do
  print "listener.start: "
  puts Benchmark.measure { listener.start }
  sleep
end

touches = nil
Thread.new do
  loop.with_index do |_, index|
    touches = index
    FileUtils.touch("#{dir}/g")
    allocations.times { +"" }
    sleep 0.1
  end
end
queue.pop
listener.stop

Vernier.stop_profile if profile

after_ts = Process.clock_gettime(Process::CLOCK_MONOTONIC)
puts "File updated #{touches} times before callback"
minor, major = GC.stat.values_at(:minor_gc_count, :major_gc_count)
p(major: major - major_before, minor: minor - minor_before)

puts "elapsed: %10.6f" % [after_ts - before_ts]

I've also added some allocations because we're also seeing a lot of GC time while doing the directory recursion.

Results

$ podman run -ti --rm -v $PWD:/workdir -w /workdir ruby:3.3.3 bash
root@d542bd460ef9:/workdir# ruby test.rb --profile 
listener.start:   0.094773   0.143460   0.238233 (  0.898538)
File updated 21 times before callback
{:major=>9, :minor=>42}
elapsed:   2.238860
root@d542bd460ef9:/workdir# ruby test.rb --profile --fork
listener.start:   0.000231   0.000285   0.000516 (  0.000746)
File updated 32 times before callback
{:major=>7, :minor=>38}
elapsed:   3.291573

Here are profiles: the Linux adapter, the forking Linux adapter, and for an unfair comparison the Darwin adapter (unfair because running on the machine while the others run in a container in a VM).

These can be opened using https://vernier.prof/. Unfortunately I can't link there directly because CORS.

We can see that the inotify implementation is still faster to receive the first event, but the call to listener.start is very slow (900ms).


To test this, for now I've just changed the acceptance test to prefer forking, this shows a few failures:

diff --git i/spec/acceptance/listen_spec.rb w/spec/acceptance/listen_spec.rb
index 7d25a56..0b6c40e 100644
--- i/spec/acceptance/listen_spec.rb
+++ w/spec/acceptance/listen_spec.rb
@@ -3,7 +3,7 @@
 RSpec.describe 'Listen', acceptance: true do
   let(:base_options) { { latency: 0.1 } }
   let(:polling_options) { {} }
-  let(:options) { {} }
+  let(:options) { { prefer_fork: true } }
   let(:all_options) { base_options.merge(polling_options).merge(options) }
 
   let(:wrapper) { setup_listener(all_options, :track_changes) }

I think the forking implementation shouldn't be necessary, and we could fix rb_inotify to do all the directory recursion in a C extension that releases the GVL. That should be enough to be able to start a listener in a thread without blocking the rest of the Ruby code. However I don't know how to do that, so this is my other attempt, just using forking to work around the GVL contention issue.


Can you take a look and evaluate whether the adapter selecting option to choose between different available adapters for a given platform would be acceptable?
Can you provide pointers as to how to fix the issue I'm seeing with deleted files? And let me know if having a way to disable snapshots would make sense too, or a way to improve the performance of that feature.

Thanks!

lib/listen/adapter.rb Outdated Show resolved Hide resolved
lib/listen/adapter/process_linux.rb Outdated Show resolved Hide resolved
lib/listen/listener/config.rb Outdated Show resolved Hide resolved
spec/lib/listen/listener/config_spec.rb Outdated Show resolved Hide resolved
Process.kill("KILL", @pipe.pid) if process_running?(@pipe.pid)
@pipe.close
end
rescue IOError, Errno::EBADF

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lint/HandleExceptions: Do not suppress exceptions.

lib/listen/adapter/process_linux.rb Outdated Show resolved Hide resolved
dir = command[1..-1].chomp("\0") # remove status (M/A/D) and terminator null byte
@callback.call([dir])
end
rescue Interrupt, IOError, Errno::EBADF

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lint/HandleExceptions: Do not suppress exceptions.

lib/listen/adapter/process_linux.rb Outdated Show resolved Hide resolved
@callback = block
end

def run

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Metrics/MethodLength: Method has too many lines. [11/10]

def _run
dirs_to_watch = @callbacks.keys.map(&:to_s)
worker = Worker.new(dirs_to_watch, &method(:_process_changes))
@worker_thread = Thread.new("worker_thread") { worker.run }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style/StringLiterals: Prefer single-quoted strings when you don't need string interpolation or special symbols.

true
end

def _configure(dir, &callback)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style/EmptyMethod: Put empty method definitions on a single line.

lib/listen/adapter/process_linux.rb Outdated Show resolved Hide resolved

module Listen
module Adapter
class ProcessLinux < Base

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style/Documentation: Missing top-level class documentation comment.

@@ -0,0 +1,87 @@
# frozen_string_literal: true

require "listen"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style/StringLiterals: Prefer single-quoted strings when you don't need string interpolation or special symbols.

command = @pipe.gets("\0")
next unless command
# remove status (M/A/D) and terminator null byte
dir = command[1..].chomp("\0")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lint/Syntax: unexpected token tRBRACK

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants