fix(connections): use Unis to enable internal connection retry behaviour #244

Merged on Jan 19, 2024 (8 commits)

@andrewazores andrewazores commented Jan 18, 2024

Welcome to Cryostat3! 👋

Before contributing, make sure you have:

  • Read the contributing guidelines
  • Linked a relevant issue which this PR resolves
  • Linked any other relevant issues, PRs, or documentation, if any
  • Resolved all conflicts, if any
  • Rebased your branch PR on top of the latest upstream main branch
  • Attached at least one of the following labels to the PR: [chore, ci, docs, feat, fix, test]
  • Signed all commits using a GPG signature

To recreate commits with a GPG signature, run: git fetch upstream && git rebase --force --gpg-sign upstream/main


Related to #243
Related to #147

Description of the change:

Refactors the internal TargetConnectionManager implementation to use Uni. The unused executeConnectedTaskAsync is replaced with executeConnectedTaskUni, which executes on the current thread but in the reactive style. executeConnectedTask, which is used in many places and is generally how Cryostat performs remote operations on JMX or Agent HTTP targets, now calls executeConnectedTaskUni internally. executeDirect is refactored to behave like executeConnectedTaskUni, except that operations called through it do not use the target connection cache, target connection locks, etc., so they are independent of any other operations and of discovery events. Both executeDirect and executeConnectedTaskUni are configured so that certain classes of exceptions trigger a retry when caught, while all other exceptions propagate normally.
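For readers unfamiliar with this layering, here is a rough sketch of the delegation described above. It is not the code from this PR: it uses a plain java.util.function.Supplier as a stand-in for a lazy Uni, and the class name, the Connection type, the executeConnectedTaskLazy method, and the opened counter are all hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for the manager described above. A Supplier plays the
// role of a lazy Uni: nothing runs until get() is called.
class ConnectionManagerSketch {
    // Hypothetical connection type, identified only by its target.
    static final class Connection {
        final String target;
        Connection(String target) { this.target = target; }
    }

    // Counts how many connections were opened, to make cache use visible.
    final AtomicInteger opened = new AtomicInteger();
    private final Map<String, Connection> cache = new ConcurrentHashMap<>();

    private Connection open(String target) {
        opened.incrementAndGet();
        return new Connection(target);
    }

    // Reactive-style form: describes the work, then runs it on the caller's
    // thread when the returned Supplier is evaluated. Uses the shared cache.
    <T> Supplier<T> executeConnectedTaskLazy(String target, Function<Connection, T> task) {
        return () -> task.apply(cache.computeIfAbsent(target, this::open));
    }

    // Blocking form: a thin wrapper that delegates to the lazy form.
    <T> T executeConnectedTask(String target, Function<Connection, T> task) {
        return executeConnectedTaskLazy(target, task).get();
    }

    // Direct form: same shape, but opens a fresh connection and never
    // touches the cache, so it is independent of other operations.
    <T> Supplier<T> executeDirect(String target, Function<Connection, T> task) {
        return () -> task.apply(open(target));
    }

    public static void main(String[] args) {
        ConnectionManagerSketch manager = new ConnectionManagerSketch();
        manager.executeConnectedTask("target-a", c -> c.target);
        manager.executeConnectedTask("target-a", c -> c.target);
        System.out.println("opened after two cached calls: " + manager.opened.get());
        manager.executeDirect("target-a", c -> c.target).get();
        System.out.println("opened after one direct call: " + manager.opened.get());
    }
}
```

The point of the shape is that the blocking entry point and the reactive-style one share a single code path, so any behaviour attached to the latter (such as retries) applies to both.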

Motivation for the change:

The "retry-on-certain-exceptions" behaviour allows Cryostat to internally re-attempt remote operations on targets when an operation fails with one of a set of exceptions that are known to occur sporadically and transiently. This increases Cryostat's reliability from the client's point of view: rather than the client having to send a new request after such a failure, Cryostat re-attempts the operation on its own until it either succeeds or times out.
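The retry behaviour described here can be illustrated with a minimal sketch in plain Java. This is not the Uni-based implementation from the PR: the class and method names are hypothetical, unchecked exceptions are used only to keep the example short, and treating UncheckedIOException as the "transient" class is an assumption made for the demo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

class RetrySketch {
    // Re-runs the task while its failures match `isTransient` and the timeout
    // has not elapsed. Any other failure propagates immediately; once the
    // timeout has elapsed, the most recent failure propagates.
    static <T> T retryUntil(Supplier<T> task, Predicate<Throwable> isTransient, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                boolean expired = System.nanoTime() - deadline >= 0;
                if (!isTransient.test(e) || expired) {
                    throw e;
                }
                // Transient failure with time remaining: try again.
                // A real implementation would likely pause or back off here.
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = retryUntil(() -> {
            // Simulate an operation that fails twice before succeeding.
            if (++attempts[0] < 3) {
                throw new UncheckedIOException(new IOException("connection dropped"));
            }
            return "archived";
        }, e -> e instanceof UncheckedIOException, Duration.ofSeconds(5));
        System.out.println(result + " after " + attempts[0] + " attempts");
        // prints "archived after 3 attempts"
    }
}
```

From the caller's side the two transient failures are invisible; only a non-transient exception or an expired timeout would surface as an error.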

How to manually test:

  1. ./smoketest.bash -Ogtr
  2. Open the web UI and try various operations: starting recordings, creating snapshots, archiving recordings, deleting recordings, generating reports, defining automated rules, etc. Everything should continue working as normal. Sporadic (and hard to reproduce consistently) JMX connection failures should be reduced or eliminated. For example, select Cryostat itself as the target and try archiving the onstart recording that is already there. Before this PR, I occasionally saw JMX connection failures when doing this.

@andrewazores andrewazores marked this pull request as ready for review January 18, 2024 19:16
@andrewazores

/build_test

Workflow started at 1/18/2024, 4:20:45 PM. View Actions Run.

CI build and push: All tests pass ✅
https://github.com/cryostatio/cryostat3/actions/runs/7576111198

@mwangggg mwangggg left a comment

lgtm

@andrewazores

/build_test

Workflow started at 1/19/2024, 3:05:31 PM. View Actions Run.

CI build and push: All tests pass ✅
https://github.com/cryostatio/cryostat3/actions/runs/7588619023

@andrewazores andrewazores merged commit 34c7a6b into cryostatio:main Jan 19, 2024
8 checks passed
@andrewazores andrewazores deleted the retry-mutiny branch January 19, 2024 20:12