Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_coalescing might be flaky #828

Closed
cvybhu opened this issue Oct 10, 2023 · 3 comments
Closed

test_coalescing might be flaky #828

cvybhu opened this issue Oct 10, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@cvybhu
Copy link
Contributor

cvybhu commented Oct 10, 2023

test_coalescing failed in run https://github.com/scylladb/scylla-rust-driver/actions/runs/6473573762/job/17576595276.
It might be a flaky test.

Here's the failure log:

---- transport::connection::tests::test_coalescing stdout ----
Unique name: test_rust_1696964594_11
thread 'transport::connection::tests::test_coalescing' panicked at 'assertion failed: `(left == right)`
  left: `[(0, []), (1, [1]), (2, [2, 2]), (4, [4, 4, 4, 4]), (5, [5, 5, 5, 5, 5]), (6, [6, 6, 6, 6, 6, 6]), (7, [7, 7, 7, 7, 7, 7, 7]), (8, [8, 8, 8, 8, 8, 8, 8, 8]), (9, [9, 9, 9, 9, 9, 9, 9, 9, 9]), (10, [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]), (11, [11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]), (12, [12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12]), (13, [13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13]), (14, [14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14]), (15, [15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]), (16, [16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16]), (17, [17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17]), (18, [18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18]), (19, [19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19]), (20, [20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20]), (21, [21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21]), (22, [22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22]), (23, [23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23]), (24, [24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24]), (25, [25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25]), (26, [26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26]), (27, [27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27]), (28, [28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28]), (29, [29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29]), (30, [30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30]), (31, [31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31]), (32, [32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32]), (33, [33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33]), (34, [34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34]), (35, [35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35]), (36, [36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36]), (37, [37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37]), (38, [38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38]), (39, [39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39]), (40, [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40]), (41, [41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41]), (42, [42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42]), (43, [43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43]), (44, [44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44])]`,
 right: `[(0, []), (1, [1]), (2, [2, 2]), (3, [3, 3, 3]), (4, [4, 4, 4, 4]), (5, [5, 5, 5, 5, 5]), (6, [6, 6, 6, 6, 6, 6]), (7, [7, 7, 7, 7, 7, 7, 7]), (8, [8, 8, 8, 8, 8, 8, 8, 8]), (9, [9, 9, 9, 9, 9, 9, 9, 9, 9]), (10, [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]), (11, [11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11]), (12, [12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12]), (13, [13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13]), (14, [14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14]), (15, [15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]), (16, [16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16]), (17, [17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17]), (18, [18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18]), (19, [19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19]), (20, [20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20]), (21, [21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21]), (22, [22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22]), (23, [23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23]), (24, [24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24]), (25, [25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25]), (26, [26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26]), (27, [27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27]), (28, [28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28]), (29, [29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29]), (30, [30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30]), (31, [31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31]), (32, [32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32]), (33, [33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33]), (34, [34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34]), (35, [35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35]), (36, [36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36, 36]), (37, [37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37]), (38, [38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38]), (39, [39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39, 39]), (40, [40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40]), (41, [41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41, 41]), (42, [42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42]), (43, [43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43]), (44, [44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44])]`', scylla/src/transport/connection.rs:2059:13
@cvybhu cvybhu added the bug Something isn't working label Oct 10, 2023
@piodul
Copy link
Collaborator

piodul commented Oct 10, 2023

Hmm, it looks like error propagation is incorrect in that test. It uses join_all to wait for futures that return Result<...>, but the results are actually not checked. The test also performs lots of writes in parallel, so perhaps some of them time out or are rejected with an error, thus the latter query which is supposed to retrieve all of the rows fails (here, one of the rows is missing).

We should change join_all to try_join_all and perhaps reduce the concurrency so that the test doesn't fail with Cassandra.

cvybhu added a commit to cvybhu/scylla-rust-driver that referenced this issue Oct 13, 2023
`test_coalescing` spawns a lot of tokio tasks, each of which
runs a query on the database.
If a query fails, the task will panic because of the `unwrap()`.

The problem is that all of the tasks are joined using `join_all`,
which doesn't check for errors in the joined tasks.
For example, this piece of code prints "Ok!" and the program finishes successfuly:
```rust
let mut tasks = Vec::new();
for _i in 0..4 {
    tasks.push(tokio::spawn(async { panic!("Panic!") }));
}

futures::future::join_all(tasks).await;

println!("Ok!");
```
There are some error messages in the output, but the important thig is
that they don't stop the control flow.

To fix it, let's propagate the errors upwards and use `try_join_all`
to join all the tasks. `try_join_all` will check whether the task
was successful and fail the test if it isn't.

Refs: scylladb#828

Signed-off-by: Jan Ciolek <[email protected]>
@cvybhu
Copy link
Contributor Author

cvybhu commented Oct 13, 2023

I opened #834 to fix the problem with join_all, but I didn't mark it as Fixes.

The test failed because one of the queries has failed, but join_all didn't cause this failure, it must've been something else.
The test doesn't do that much, it runs 44 queries at once, that should be absolutely fine.
My theory is that Cassandra was overwhelmed because it also had to run a test which runs a logged batch with 60k queries inside (it was a CI run for #824).

IMO the test looks okay, the failure was most likely caused by external factors. I think we can close this issue and reopen it if it fails again in the future.

@piodul
Copy link
Collaborator

piodul commented Oct 17, 2023

The test is not OK because it does not propagate errors properly. However, I agree that it would still be flaky even if it did.

I'll close the issue as suggested.

@piodul piodul closed this as not planned Won't fix, can't repro, duplicate, stale Oct 17, 2023
cvybhu added a commit to cvybhu/scylla-rust-driver that referenced this issue Oct 19, 2023
`test_coalescing` spawns a lot of tokio tasks, each of which
runs a query on the database.
If a query fails, the task will panic because of the `unwrap()`.

The problem is that all of the tasks are joined using `join_all`,
which doesn't check for errors in the joined tasks.
For example, this piece of code prints "Ok!" and the program finishes successfuly:
```rust
let mut tasks = Vec::new();
for _i in 0..4 {
    tasks.push(tokio::spawn(async { panic!("Panic!") }));
}

futures::future::join_all(tasks).await;

println!("Ok!");
```
There are some error messages in the output, but the important thig is
that they don't stop the control flow.

To fix it, let's use `try_join_all` instead of `join_all`.
`try_join_all` will check whether the task panicked or not,
and return an error if there was a panic.

I manually tested that it works by inserting a panic!()
after `conn.query()`.

Let's also make sure that the result of join_all doesn't
contain any `Result<>`, as it could hide the errors.
The result of `join_all` should be just a `Vec<>` of unit values.

Refs: scylladb#828

Signed-off-by: Jan Ciolek <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants