This repository has been archived by the owner on Dec 13, 2022. It is now read-only.

HA failover causes cinder-volume to stop responding #942

Open
JCallicoat opened this issue Apr 30, 2014 · 2 comments

@JCallicoat

When an HA failover occurs, cinder-volume stops consuming messages from the cinder-volume queue and does not resume until the cinder-volume service is restarted.

During this time, cinder-volume.log shows that the service has re-established its MySQL and RabbitMQ connections and is sending service updates, which also appear in cinder service-list.

Jason discovered that cinder is using a direct consumer queue (see Direct Consumer at http://docs.openstack.org/developer/cinder/devref/rpc.html ) that is created when the cinder-volume service starts and is removed when the failover occurs.

E.g., cinder-volume_fanout_37c73e1379414cb7a0461aab85c69288
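
One way to check for this condition after a failover is to look at the broker's queue list and see whether a fanout queue for cinder-volume still exists. The following is a minimal sketch, assuming the output of `rabbitmqctl list_queues name consumers` has already been captured as text. The function names and the sample output are made up for illustration and are not part of cinder or the cookbooks.

```python
def parse_list_queues(output):
    """Parse tab-separated `rabbitmqctl list_queues name consumers` output.

    Returns a dict mapping queue name to consumer count.
    """
    queues = {}
    for line in output.splitlines():
        parts = line.strip().split("\t")
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skip banner lines such as "Listing queues ..."
        queues[parts[0]] = int(parts[1])
    return queues


def fanout_queues(queues, topic="cinder-volume"):
    """Return only the fanout queues that belong to the given topic."""
    prefix = topic + "_fanout_"
    return {name: n for name, n in queues.items() if name.startswith(prefix)}


# Hypothetical captured output; real output will differ per deployment.
SAMPLE = (
    "Listing queues ...\n"
    "cinder-volume\t1\n"
    "cinder-volume_fanout_37c73e1379414cb7a0461aab85c69288\t1\n"
    "cinder-scheduler\t1\n"
    "...done.\n"
)

if __name__ == "__main__":
    found = fanout_queues(parse_list_queues(SAMPLE))
    if not found:
        print("no cinder-volume fanout queue present")
    for name, consumers in found.items():
        print(name, "consumers:", consumers)
```

An empty result from `fanout_queues` after a failover would be consistent with the queue having been removed and not recreated.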

I traced the creation of this queue to https://github.com/openstack/cinder/blob/stable/havana/cinder/openstack/common/rpc/impl_kombu.py#L267 , which is reached via https://github.com/openstack/cinder/blob/stable/havana/cinder/openstack/common/rpc/impl_kombu.py#L694 and then https://github.com/openstack/cinder/blob/stable/havana/cinder/openstack/common/rpc/impl_kombu.py#L740 . That last call is only made with fanout=True on service startup: https://github.com/openstack/cinder/blob/stable/havana/cinder/openstack/common/rpc/service.py#L58

So it looks like the direct consumer queue is dropped when the connection to RabbitMQ drops during failover and is never recreated, so no messages are processed until cinder-volume is restarted and a new direct consumer fanout queue is created.
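
The suspected failure mode can be illustrated with a toy model. This sketch assumes the queue has auto-delete semantics, meaning it disappears when its last consumer goes away, and that is an assumption rather than something confirmed in this issue. The classes are in-memory stand-ins invented for illustration; they are not cinder, kombu, or RabbitMQ code.

```python
class ToyBroker:
    """In-memory stand-in for a message broker."""

    def __init__(self):
        self.queues = {}  # queue name -> set of connection ids consuming from it

    def declare(self, name, conn_id):
        self.queues.setdefault(name, set()).add(conn_id)

    def drop_connection(self, conn_id):
        # Assumed auto-delete behaviour: a queue goes away with its last consumer.
        for name in list(self.queues):
            self.queues[name].discard(conn_id)
            if not self.queues[name]:
                del self.queues[name]


class ToyConnection:
    """Client connection that remembers which queues it consumes from."""

    def __init__(self, broker, conn_id):
        self.broker = broker
        self.conn_id = conn_id
        self.consumers = []  # remembered so they can be re-declared later

    def consume(self, queue_name):
        self.consumers.append(queue_name)
        self.broker.declare(queue_name, self.conn_id)

    def reconnect(self, redeclare=True):
        # The old connection is lost, taking its auto-delete queues with it.
        self.broker.drop_connection(self.conn_id)
        if redeclare:
            for name in self.consumers:
                self.broker.declare(name, self.conn_id)


broker = ToyBroker()
conn = ToyConnection(broker, "cinder-volume-1")
conn.consume("cinder-volume_fanout_example")

# Reconnect without re-declaring: the queue is gone, as described above.
conn.reconnect(redeclare=False)
lost = "cinder-volume_fanout_example" not in broker.queues

# Reconnect and re-declare remembered consumers: the queue exists again.
conn.reconnect(redeclare=True)
restored = "cinder-volume_fanout_example" in broker.queues
```

In this model, restarting the service corresponds to calling `consume` again on a fresh connection, which is why a restart restores message processing.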

Cookbooks: v4.2.2
Cinder packages: 1:2013.2.2-0ubuntu1~cloud0

@breu
Contributor

breu commented Apr 30, 2014

This may actually be related to the RabbitMQ connection not getting severed on failover of the RabbitMQ VIP.

@breu
Contributor

breu commented Apr 30, 2014

OK, I've tracked this down to an issue with cinder-scheduler on the controller nodes, where the schedulers do not reconnect correctly when the VIP fails over. Since cinder-scheduler isn't all that useful when the cinder-volume node is down, I propose that we change the ha-controller* roles to include only cinder-setup (for controller1) and cinder-api for both nodes. The cinder volume storage nodes then get cinder-scheduler and cinder-volume. If the volume nodes are offline, it doesn't make much sense to have cinder-schedulers available that cannot schedule volumes to volume servers.

more to come tomorrow

@claco claco added this to the v4.2.3 milestone May 23, 2014
claco added a commit to claco/openstack-ha that referenced this issue May 28, 2014
Removed VXPVNC from monitoring. RHEL does not support and being that
RHEL is a supported platform we need to make sure that the offering
is consistent on both RHEL and Ubuntu.

Issue
rcbops/chef-cookbooks#942

(cherry picked from commit 8b8e203)
claco added a commit to claco/chef-cookbooks that referenced this issue Jun 13, 2014
HA failover causes cinder-volume to stop responding because the
scheduler does not reconnect properly after the vip failover. Since the
scheduler is worthless w/o the volume service anyways, just put it right
there  where the volume is and off of the ha controller 1/2 nodes.

Issue rcbops#942
claco added a commit to claco/chef-cookbooks that referenced this issue Jun 13, 2014
HA failover causes cinder-volume to stop responding because the
scheduler does not reconnect properly after the vip failover. Since the
scheduler is worthless w/o the volume service anyways, just put it right
there  where the volume is and off of the ha controller 1/2 nodes.

Issue rcbops#942

(cherry picked from commit 74bb5b7)