CI fail on TruffleRuby and JRuby #16

Open
dentarg opened this issue Nov 21, 2021 · 4 comments
Labels
bug Something isn't working

Comments

@dentarg
Member

dentarg commented Nov 21, 2021

CI has been unstable for TruffleRuby (and JRuby) since they were added; the tests don't complete. On CRuby, all tests complete in ~10s. Here's a recent example of TruffleRuby timing out: https://github.com/cloudamqp/amqp-client.rb/runs/4280214610?check_suite_focus=true#step:5:17

  1) Error:
HighLevelTest#test_it_can_reopen_channel_1_after_failed_publish:
TimeoutEveryTestCase::TestTimeout: timed out after 60 seconds
    /home/runner/work/amqp-client.rb/amqp-client.rb/lib/amqp/client.rb:88:in `stop'
    /home/runner/work/amqp-client.rb/amqp-client.rb/test/amqp/high_level_test.rb:82:in `test_it_can_reopen_channel_1_after_failed_publish'
    /home/runner/work/amqp-client.rb/amqp-client.rb/test/test_helper.rb:27:in `block (2 levels) in run'
    /home/runner/.rubies/truffleruby-21.3.0/lib/truffle/timeout.rb:163:in `timeout'
    /home/runner/work/amqp-client.rb/amqp-client.rb/test/test_helper.rb:21:in `block in run'

48 runs, 10058 assertions, 0 failures, 1 errors, 3 skips
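
For readers unfamiliar with the per-test timeout seen in the trace above, here is a minimal sketch of the general technique using Ruby's standard `Timeout` module. It is not the project's actual `test_helper.rb`; the `TestTimeout` class is only named after the one in the trace, and `run_with_timeout` is a hypothetical helper.

```ruby
require "timeout"

# Hypothetical error class, named after the one that appears in the trace.
class TestTimeout < StandardError; end

# Hypothetical helper: run the block and raise TestTimeout if it takes
# longer than `seconds`.
def run_with_timeout(seconds, &block)
  Timeout.timeout(seconds, TestTimeout, "timed out after #{seconds} seconds", &block)
end

# A block that finishes in time simply returns its value.
fast = run_with_timeout(1) { :done }

# A block that hangs is interrupted and reported as an error.
slow = begin
  run_with_timeout(0.1) { sleep 5 }
rescue TestTimeout => e
  e.message
end
```

A wrapper like this turns a hang into an error with a backtrace pointing at the line that was stuck, which is how `client.rb:88:in 'stop'` shows up in the report above.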

TruffleRuby and JRuby were marked as allowed to fail in #14, so they will look green, but if one looks at the summary page of a CI run, e.g. https://github.com/cloudamqp/amqp-client.rb/actions/runs/1487738770, there will be annotations if they failed.

@dentarg dentarg added the bug Something isn't working label Nov 21, 2021
@dentarg dentarg changed the title CI fail on TruffleRuby (and JRuby?) CI fail on TruffleRuby and JRuby Nov 30, 2021
@dentarg
Member Author

dentarg commented Nov 3, 2022

Looks more stable now, but here's a recent TruffleRuby failure: https://github.com/cloudamqp/amqp-client.rb/actions/runs/3377816169/jobs/5607211045#step:6:17

dentarg added a commit that referenced this issue Feb 15, 2024
Was mostly skipped before 678b47d

Related to #16
@dentarg
Member Author

dentarg commented Mar 16, 2024

How to reproduce / debug

  • TruffleRuby: docker run --rm -it -v $(pwd):/app -w /app ghcr.io/graalvm/truffleruby-community:23.1.2-debian bash
  • CRuby: docker run --rm -it -v $(pwd):/app -w /app ruby:3.2.3 bash
  • apt-get update && apt-get install --yes git rabbitmq-server sudo && service rabbitmq-server start && while true; do rabbitmq-diagnostics status 2>/dev/null && break; echo -n .; sleep 2; done && bundle

I've only been playing with one test case: TESTOPTS="--name=test_it_can_be_blocked" RUN_SUDO_TESTS=true bundle exec rake

It is important to move system("sudo rabbitmqctl set_vm_memory_high_watermark 0.4") to the ensure block, before connection&.close, otherwise the test will be stuck in Connection#expect on the #pop call (until the test case times out).
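
The ordering issue can be illustrated with a small simulation. This is only a sketch: `FakeConnection`, `clear_alarm` and `cleanup_finishes?` are hypothetical stand-ins, not part of amqp-client.rb. It assumes that closing waits for a reply which only arrives once the alarm has been lifted.

```ruby
# Hypothetical stand-in for a broker connection: #close waits for a reply
# that only arrives after the (simulated) memory alarm has been cleared.
class FakeConnection
  def initialize
    @replies = Queue.new
  end

  # Simulates lifting the alarm, after which the peer answers again.
  def clear_alarm
    @replies.push(:close_ok)
  end

  # Blocks on Queue#pop until a reply is available.
  def close
    @replies.pop
  end
end

# Runs the cleanup in a thread and reports whether it finished in time.
def cleanup_finishes?(clear_first:)
  conn = FakeConnection.new
  t = Thread.new do
    begin
      # the test body would go here
    ensure
      conn.clear_alarm if clear_first
      conn.close
    end
  end
  finished = !t.join(0.2).nil?
  t.kill # do not leave a blocked thread behind
  finished
end

with_reset_first = cleanup_finishes?(clear_first: true)
without_reset    = cleanup_finishes?(clear_first: false)
```

With the alarm cleared first the cleanup completes; without it, the close call stays stuck on the `pop`, which mirrors the hang described above.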

The real failure on TruffleRuby is that this assert fails:

assert_nil t.join(0.1) # make sure the thread is blocked

because writing to the socket never blocks at

# Write byte array(s) directly to the socket (thread-safe)
# @param bytes [String] One or more byte arrays
# @return [Integer] number of bytes written
# @api private
def write_bytes(*bytes)
  @write_lock.synchronize do
    @socket.write(*bytes)
  end
rescue *READ_EXCEPTIONS => e
  raise Error::ConnectionClosed.new(*@closed) if @closed
  raise Error, "Could not write to socket, #{e.message}"
end

With CRuby, writing does block (sometimes the first ch.basic_publish is not blocked; I guess that's why the test case has two of them).
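
For reference, here is a standalone sketch of what the assert relies on: `Thread#join` with a timeout returns nil while the thread is still running, and a socket write blocks once the peer stops reading and the buffers fill up. It uses a local `UNIXSocket.pair` instead of RabbitMQ, and the payload size is an assumption chosen to exceed typical kernel buffers. It says nothing about why TruffleRuby behaves differently.

```ruby
require "socket"

# Connected pair of local sockets; `reader` plays a peer that has stopped reading.
writer, reader = UNIXSocket.pair

# Assumed to be much larger than the kernel socket buffers.
payload = "x" * (16 * 1024 * 1024)

t = Thread.new { writer.write(payload) }

# Thread#join(timeout) returns nil if the thread is still running,
# which here means the write is blocked.
blocked = t.join(0.1).nil?

# Once the peer starts reading again, the write can complete.
received = 0
received += reader.readpartial(65_536).bytesize while received < payload.bytesize
finished = !t.join(5).nil?

writer.close
reader.close
```

On CRuby the expectation is that `blocked` is true before the drain and `finished` is true after it, matching the `assert_nil t.join(0.1)` idiom in the test.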

@dentarg
Member Author

dentarg commented Mar 18, 2024

sometimes the first ch.basic_publish is not blocked; I guess that's why the test case has two of them

Regarding this, I just saw this comment that probably explains it: rabbitmq/rabbitmq-server#1986 (comment)

All publishers are blocked but only after the node detects a published message or content or body chunk frame. In other words, at least one message will be observed as "published" by certain clients. This is by design: we don't want to block any activity of connections that consume (such as basic.consume frames) or at least do not publish.

@dentarg dentarg mentioned this issue Mar 25, 2024
@dentarg
Member Author

dentarg commented May 15, 2024

These tests

- name: Run TLS tests
  run: bundle exec rake
  env:
    TESTOPTS: --name=/_tls$/

now time out with truffleruby 24.0.0 (logs)

truffleruby 23.1.2 was passing (logs)
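
As a quick illustration of which tests the `--name=/_tls$/` filter in the step above selects, here is a sketch. The test names are hypothetical, and treating a slash-delimited value as a regular expression is an assumption about how the test runner interprets the option.

```ruby
# The filter value as passed via TESTOPTS.
filter = "/_tls$/"

# Assumption: a value wrapped in slashes is treated as a regular expression,
# anything else as an exact test name.
pattern = filter =~ %r{\A/(.*)/\z} ? Regexp.new(Regexp.last_match(1)) : filter

# Hypothetical test names, for illustration only.
names = %w[
  test_it_can_connect
  test_it_can_connect_tls
  test_tls_handshake
  test_publish_tls
]

# Regexp#=== matches, String#=== compares for equality.
selected = names.select { |name| pattern === name }
```

Only names ending in `_tls` are selected here, so `test_tls_handshake` is left out.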

dentarg added a commit that referenced this issue May 15, 2024
truffleruby 24.0.0 times out: #16 (comment)