/kind bug

What steps did you take and what happened:

Hello!

First of all, thanks for all the hard work put into this provider!

I've been playing around with CAPI + CAPO + DevStack quite a bit lately; I've got everything running locally on my laptop (on top of libvirt + kind).

Today when I restarted my DevStack VM and the CAPO controller started up, I noticed that it logged a flood of messages (--v=2) at roughly 13/s. Checking the Nova compute API on the DevStack side showed the same rate; it logged this:
Mar 05 20:45:53 devstack01 [email protected][1267]: DEBUG nova.compute.api [None req-91fcfda4-b4e2-4aa0-a902-c13d915ac5c4 admin admin] [instance: 5d357a3e-be75-4ed0-99c5-e03dc283d795] Fetching instance by UUID {{(pid=1267) get /opt/stack/nova/nova/compute/api.py:2981}}
Mar 05 20:45:53 devstack01 [email protected][1267]: DEBUG nova.compute.api [None req-568f1596-3b9c-4346-acc4-f7aff64684b1 admin admin] [instance: 5d357a3e-be75-4ed0-99c5-e03dc283d795] Fetching instance by UUID {{(pid=1267) get /opt/stack/nova/nova/compute/api.py:2981}}
Mar 05 20:45:53 devstack01 [email protected][1267]: DEBUG nova.compute.api [None req-34d8ba23-dc2e-4c61-92ca-1af2e3459b28 admin admin] [instance: 5d357a3e-be75-4ed0-99c5-e03dc283d795] Fetching instance by UUID {{(pid=1267) get /opt/stack/nova/nova/compute/api.py:2981}}
Mar 05 20:45:53 devstack01 [email protected][1267]: DEBUG nova.compute.api [None req-8217ec12-bdde-49da-8305-f1694882c004 admin admin] [instance: 5d357a3e-be75-4ed0-99c5-e03dc283d795] Fetching instance by UUID {{(pid=1267) get /opt/stack/nova/nova/compute/api.py:2981}}
Mar 05 20:45:53 devstack01 [email protected][1267]: DEBUG nova.compute.api [None req-b7e57d26-638d-49ab-82f5-892202ee3d33 admin admin] [instance: 5d357a3e-be75-4ed0-99c5-e03dc283d795] Fetching instance by UUID {{(pid=1267) get /opt/stack/nova/nova/compute/api.py:2981}}
Mar 05 20:45:53 devstack01 [email protected][1267]: DEBUG nova.compute.api [None req-dc1d4323-8ac1-426d-a7e2-c6293a97f661 admin admin] [instance: 5d357a3e-be75-4ed0-99c5-e03dc283d795] Fetching instance by UUID {{(pid=1267) get /opt/stack/nova/nova/compute/api.py:2981}}
At this point, the state of the instances and the load balancer (Octavia) in DevStack was:

- Load balancer in Error state
- Amphora instance reporting SHUTOFF
- My single-node k8s cluster instance also reporting SHUTOFF

Not the best state 😏, and probably a corner case, but still, it was enough to make the OpenStack machine controller behave a bit oddly.
When I fetched the OpenStackMachine object in a watch loop from the CAPI + CAPO (kind) cluster, I observed that these two fields constantly changed, at the same rate as the logs were emitted:

- resourceVersion
- The message field in the InstanceReady condition
That message field was being set to:
message: 'Instance state is not handled: 0xc001129610'
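For context on where that hex value comes from: this is just how Go's fmt package renders a pointer to a plain (non-struct) type, printing the address rather than the value. A tiny standalone sketch (the `InstanceState` type below is a stand-in I made up for illustration, not the real CAPO type):

```go
package main

import "fmt"

// InstanceState is a stand-in for CAPO's string-based instance state type,
// defined here only so the example compiles on its own.
type InstanceState string

func main() {
	state := InstanceState("SHUTOFF")
	statePtr := &state

	// Formatting the pointer with %v prints its address, e.g. 0xc000096010,
	// which is exactly the shape of the condition message above.
	fmt.Printf("Instance state is not handled: %v\n", statePtr)

	// Dereferencing the pointer prints the actual state.
	fmt.Printf("Instance state is not handled: %v\n", *statePtr)
}
```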
The problem seems to be that we're writing the pointer's memory address to the message field here: https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/controllers/openstackmachine_controller.go#L456. When we set that in the condition, the resourceVersion changes and we immediately retrigger a reconcile (AFAICT). If I dereference that instance state variable instead, it works as expected (SHUTOFF is written instead) and the reconcile flooding stops.

The OpenStack server controller does something similar, but it instead reads the state off the object, which is correct in this case and does not trigger flooding: https://github.com/kubernetes-sigs/cluster-api-provider-openstack/blob/main/controllers/openstackserver_controller.go#L390
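To spell out the fix I tried, here is a sketch, not the actual controller code; `unhandledStateMessage` and the `InstanceState` stand-in are made up for illustration. Building the message from the dereferenced value keeps it stable across reconciles, whereas the pointer form embeds a freshly allocated address every time, which is what keeps bumping resourceVersion:

```go
package main

import "fmt"

// InstanceState is again a stand-in for CAPO's string-based state type.
type InstanceState string

// unhandledStateMessage is a hypothetical helper showing the nil-safe
// dereference: the message contains the actual state (e.g. SHUTOFF) and
// stays identical between reconciles, so the condition is not rewritten
// and resourceVersion stops churning.
func unhandledStateMessage(state *InstanceState) string {
	if state == nil {
		return "Instance state is not handled: <nil>"
	}
	return fmt.Sprintf("Instance state is not handled: %s", *state)
}

func main() {
	s := InstanceState("SHUTOFF")
	fmt.Println(unhandledStateMessage(&s)) // Instance state is not handled: SHUTOFF
	fmt.Println(unhandledStateMessage(nil)) // Instance state is not handled: <nil>
}
```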
What did you expect to happen:
The OpenStackMachine controller should not write a pointer address into a status condition field; doing so causes a flood of reconcile attempts and logs.
Anything else you would like to add:
OpenStackMachine `status`
Here we can see the full `status` of the object in CAPO:

Environment:

- Cluster API Provider OpenStack version (`git rev-parse HEAD` if manually built): fc2c57a
- Kubernetes version (`kubectl version`): v1.32.0
- OS (e.g. from `/etc/os-release`): Arch Linux