weird disconnection and reconnection #269

Open
faithware opened this issue Nov 3, 2021 · 5 comments


@faithware

faithware commented Nov 3, 2021

My two-node cluster is initialized this way:

  ptr<state_machine> my_state_machine = cs_new<echo_state_machine>();
  ptr<state_mgr> my_state_manager =
      cs_new<inmem_state_mgr>(cluster_id, def_endpoint);

  asio_service::options asio_opt;
  asio_opt.thread_pool_size_ = 4;

  // Raft parameters.
  raft_params params;
  // heartbeat: 100 ms, election timeout: 200 - 400 ms.
  params.heart_beat_interval_ = 100;
  params.election_timeout_lower_bound_ = 200;
  params.election_timeout_upper_bound_ = 400;
  //params.auto_adjust_quorum_for_small_cluster_ = true;
  // Up to 5 logs will be preserved ahead of the last snapshot.
  params.reserved_log_items_ = 5;
  // Snapshot will be created for every 5 log appends.
  params.snapshot_distance_ = 5;
  // Client timeout: 3000 ms.
  params.client_req_timeout_ = 3000;
  // According to this method, `append_log` function
  // should be handled differently.
  params.return_method_ = raft_params::blocking;

  // Initialize Raft server listening on the configured clustering port.
  // It will organize a single-node Raft cluster.
  raft_launcher launcher;

  server =
      launcher.init(my_state_machine, my_state_manager, my_logger,
                    net_config.clustering_port, asio_opt, params, user_init);

When I launch, I set cluster ID 3 to be the leader, and it adds cluster ID 1. After that, the cluster renegotiates the leader and cluster ID 1 becomes the leader.
After that, cluster ID 1 (the new leader) is stuck in this loop:

 Connection opened 
I am a leader
new session from leader127.0.0.1
Connection closed leader:1peer :-1
new session from leader192.168.3.107
 Connection opened 
new session from leader127.0.0.1
Connection closed leader:1peer :-1
 Connection opened 
new session from leader127.0.0.1
Connection closed leader:1peer :-1
 Connection opened 

It stays like this forever.
Is this normal? I thought that once everything is set up, the leader would not keep disconnecting from and reconnecting to itself.

@greensky00
Contributor

Hi @faithware
This doesn't seem normal. Can you share the log file of both servers?

@faithware
Author

It turned out to be a network issue with my Wi-Fi card. Is there any clear way to restart the Raft server?

@greensky00
Contributor

@faithware
If you meant restarting it without killing the process, please refer to this test code to shut it down and then restart it:

s3.raftServer->shutdown();
s3.stopAsio();

Call raft_server::shutdown(), and then close Asio like this:

if (asioListener) {
    asioListener->stop();
    asioListener->shutdown();
}
if (asioSvc) {
    asioSvc->stop();
    size_t count = 0;
    while (asioSvc->get_active_workers() && count < 500) {
        // 10ms per tick.
        timer_helper::sleep_ms(10);
        count++;
    }
}
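
The wait loop above is a bounded poll: check every 10 ms and give up after 500 ticks (about 5 seconds). A minimal sketch of the same pattern as a reusable helper follows. The name `wait_until_idle` and its parameters are hypothetical and not part of NuRaft, and it uses `std::this_thread::sleep_for` in place of `timer_helper::sleep_ms` so that it compiles on its own.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical helper (not a NuRaft API): polls `still_busy` once per tick
// and stops after `max_ticks` polls, mirroring the loop above.
// Returns true if the work drained, false if the limit was reached first.
inline bool wait_until_idle(const std::function<bool()>& still_busy,
                            std::size_t max_ticks = 500,
                            std::size_t tick_ms = 10) {
    std::size_t count = 0;
    while (still_busy() && count < max_ticks) {
        std::this_thread::sleep_for(std::chrono::milliseconds(tick_ms));
        ++count;
    }
    // One final check so the caller can tell a drain from a timeout.
    return !still_busy();
}
```

Assuming `asioSvc` is the same pointer as in the snippet above, it could be called as `wait_until_idle([&] { return asioSvc->get_active_workers() > 0; });` right after `asioSvc->stop()`. The return value makes it possible to log a warning when workers are still active after the limit.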

@faithware
Author

Thanks @greensky00
I just called server->shutdown() and ran the Raft initialization again, and it works.
Do I have to explicitly shut down Asio even though my intent is to restart the server?

@greensky00
Contributor

@faithware You don't have to, but just to be safe. :)
