The result channel doesn't get closed after load ends blocking the results file generation #385

Open
vponomaryov opened this issue Jul 11, 2023 · 2 comments
@vponomaryov
Collaborator

vponomaryov commented Jul 11, 2023

Issue description

Running the following gemini command:

gemini -d --duration 3h --warmup 30m -c 50 -m mixed -f --non-interactive --cql-features normal \
    --max-mutation-retries 5 --max-mutation-retries-backoff 500ms --async-objects-stabilization-attempts 5 \
    --async-objects-stabilization-backoff 500ms --replication-strategy "{'class': 'SimpleStrategy', 'replication_factor': '3'}" \
    --oracle-replication-strategy "{'class': 'SimpleStrategy', 'replication_factor': '1'}" \
    --test-cluster=10.12.1.102,10.12.2.40,10.12.2.200 --outfile /gemini/gemini_result_dd524c59-4d74-41f7-a8bd-ada4d99c9e23.log \
    --seed 70 --request-timeout 180s --connect-timeout 120s --oracle-cluster=10.12.3.145

Resulted in the following log output:

{"L":"INFO","T":"2023-07-10T16:29:32.918Z","N":"generator","M":"starting partition key generation loop"}
{"L":"INFO","T":"2023-07-10T19:59:32.884Z","N":"pump","M":"Test run stopped. Exiting."}
{"L":"INFO","T":"2023-07-10T19:59:32.919Z","M":"Test run completed. Exiting."}
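As a sanity check on the timestamps above, here is a minimal sketch that computes how long the run lasted. It assumes that the "starting partition key generation loop" line marks the start of the run and that the pump's "Test run stopped" line marks its end. It also assumes that the warmup time is added on top of `--duration`; the timestamps are consistent with that, but the issue does not state it.

```python
from datetime import datetime, timedelta


def parse_ts(ts):
    """Parse a log timestamp such as 2023-07-10T16:29:32.918Z (UTC)."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


# Timestamps copied from the failed run's log.
started = parse_ts("2023-07-10T16:29:32.918Z")  # generator loop started
stopped = parse_ts("2023-07-10T19:59:32.884Z")  # pump reported "Test run stopped"

elapsed = stopped - started

# Assumption: total run time = --warmup 30m + --duration 3h.
requested = timedelta(minutes=30) + timedelta(hours=3)

print("elapsed:", elapsed)
print("within one second of requested time:",
      abs(requested - elapsed) < timedelta(seconds=1))
```

Under those assumptions, the load ran for the full requested time before stopping.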

In a normal run, the log looks like this:

{"L":"INFO","T":"2023-06-26T09:29:11.605Z","N":"generator","M":"starting partition key generation loop"}
{"L":"INFO","T":"2023-06-26T12:59:11.579Z","N":"pump","M":"Test run stopped. Exiting."}
{"L":"INFO","T":"2023-06-26T12:59:11.608Z","M":"result channel closed"}
{"L":"INFO","T":"2023-06-26T12:59:11.609Z","M":"Test run completed. Exiting."}

So the "result channel closed" message is missing from the log of the failed test run.
This blocked the generation of the results file.
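The comparison of the two logs can be automated. The following is a minimal sketch, not part of gemini or SCT: the helper names `messages` and `result_channel_closed` are hypothetical, and the only thing taken from the logs is the `"M"` field of each JSON line.

```python
import json

# Log lines copied from the failed run and from the normal run.
FAILED_RUN = [
    '{"L":"INFO","T":"2023-07-10T16:29:32.918Z","N":"generator","M":"starting partition key generation loop"}',
    '{"L":"INFO","T":"2023-07-10T19:59:32.884Z","N":"pump","M":"Test run stopped. Exiting."}',
    '{"L":"INFO","T":"2023-07-10T19:59:32.919Z","M":"Test run completed. Exiting."}',
]
NORMAL_RUN = [
    '{"L":"INFO","T":"2023-06-26T09:29:11.605Z","N":"generator","M":"starting partition key generation loop"}',
    '{"L":"INFO","T":"2023-06-26T12:59:11.579Z","N":"pump","M":"Test run stopped. Exiting."}',
    '{"L":"INFO","T":"2023-06-26T12:59:11.608Z","M":"result channel closed"}',
    '{"L":"INFO","T":"2023-06-26T12:59:11.609Z","M":"Test run completed. Exiting."}',
]


def messages(log_lines):
    """Return the "M" field of every JSON log line, skipping anything else."""
    found = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line
        if isinstance(record, dict) and "M" in record:
            found.append(record["M"])
    return found


def result_channel_closed(log_lines):
    """True if the log contains the "result channel closed" message."""
    return "result channel closed" in messages(log_lines)


print("normal run:", result_channel_closed(NORMAL_RUN))
print("failed run:", result_channel_closed(FAILED_RUN))
```

A check like this could be run against the gemini log after a test to flag runs where the results file is likely to be missing.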

Impact

The results file from gemini cannot be collected.

How frequently does it reproduce?

Observed for the first time.

Installation details

Kernel Version: 5.15.0-1039-aws
Scylla version (or git commit hash): 5.2.4-20230708.9f79c9f41d6e with build-id edaa90c2e7660d794d2d308e93c1ba956e829d7d
Gemini version: 1.7.8

Cluster size: 3 nodes (i3.large)

Scylla Nodes used in this run:

  • gemini-with-nemesis-3h-normal-5-2-oracle-db-node-e576d8f7-1 (44.203.218.69 | 10.12.3.145) (shards: 2)
  • gemini-with-nemesis-3h-normal-5-2-db-node-e576d8f7-3 (44.204.201.39 | 10.12.2.200) (shards: 2)
  • gemini-with-nemesis-3h-normal-5-2-db-node-e576d8f7-2 (3.239.28.146 | 10.12.2.40) (shards: 2)
  • gemini-with-nemesis-3h-normal-5-2-db-node-e576d8f7-1 (18.213.151.8 | 10.12.1.102) (shards: 2)

OS / Image: ami-0a69901a7e05f1029 (aws: us-east-1)

Test: gemini-3h-with-nemesis-test
Test id: e576d8f7-262b-4864-b411-4e5c65631b55
Test name: scylla-5.2/gemini-/gemini-3h-with-nemesis-test
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor e576d8f7-262b-4864-b411-4e5c65631b55
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs e576d8f7-262b-4864-b411-4e5c65631b55

Logs:

Jenkins job URL
Argus

@fruch
Collaborator

fruch commented Jul 11, 2023

Which version of Gemini is used?

We haven't yet backported all of the recent fixes to 5.2 (it's still not fully stable on SCT master).

@vponomaryov
Collaborator Author

> Which version of Gemini is used?

The version is in the bug description: 1.7.8

> We didn't yet backport all of the recent fixes to 5.2 (it's still not fully stable on SCT master)

I haven't seen a similar bug report, so I filed this one.

fruch self-assigned this Jul 11, 2023