Add proposal for status field in akri resources #77
This proposal aims to provide a way to give feedback to users about the status of their Akri resources.
Signed-off-by: Nicolas Belouin <[email protected]>
One comment, but overall I think this is a great addition!
When the BrokerScheduled or Ready condition is False, the message shall include what resources are the cause of this.
In order to be able to correctly set the "Healthy" condition, a list of healthy nodes (must be a subset of `spec.nodes` field) is needed.
Why does `healthyNodes` need to be listed in the `status`? Could it be enough to determine that at least one node is healthy for "Healthy" to be true, without that information having to be listed in the instance status?
Well, I don't see any way for an agent to know that it is the last one with a "healthy" device (maybe "healthy" is not the best term, as here it means "not in grace period") and thus to switch the Healthy condition status if it goes down.
This is making me realize something: when an agent on a node discovers an instance, it adds itself to the instance's `nodes` list; however, when it stops being able to see the instance, it deletes the instance rather than removing itself from `nodes` and only deleting the instance if it is the last one left. This seems problematic because the device could be updated to explicitly no longer be able to connect with that node while still being able to connect to the others.
Is the idea here that `healthyNodes` accurately reflects which nodes can see the device, and that we update the flow to no longer delete an instance when it goes offline for a single agent unless that agent is the last one online? Right now, an unhealthy instance would never exist; it would already have been deleted.
Yes, we need to change that behavior (in fact, I think I'll create a specific issue to fix that, as this looks like a bug to me).
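To make the flow discussed in this thread concrete, here is a minimal sketch in Python. It is illustrative only and is not Akri's implementation: the `Instance` class, the function names, and the exact moment at which a node leaves `nodes` (assumed here to be the end of its grace period) are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical, simplified stand-in for an Akri Instance."""

    nodes: set[str] = field(default_factory=set)          # mirrors spec.nodes
    healthy_nodes: set[str] = field(default_factory=set)  # proposed status list


def healthy_status(instance: Instance) -> str:
    # Only "True" (detected by at least one agent) and "Unknown"
    # (every remaining node is in its grace period) are valid statuses.
    return "True" if instance.healthy_nodes else "Unknown"


def on_device_lost(instance: Instance, node: str) -> None:
    """The agent on `node` no longer sees the device: its grace period starts."""
    instance.healthy_nodes.discard(node)


def on_grace_period_expired(instance: Instance, node: str) -> bool:
    """Remove `node` from the instance (assumed to happen at grace period end).

    Returns True only when no node is left, meaning the instance
    should now be deleted.
    """
    instance.healthy_nodes.discard(node)
    instance.nodes.discard(node)
    return not instance.nodes
```

With two nodes `a` and `b`, losing the device on `a` leaves Healthy at "True" and the instance in place; only once `b` is also removed does `on_grace_period_expired` report that the instance should be deleted.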
Thank you for creating the issue!
- Healthy: The instance is currently detected by an agent, "True" (instance detected) and "Unknown" (in grace period) are the only valid statuses
- BrokerScheduled: The broker resources are all created (absent if no broker)
- BrokerReady: All broker resources are "Ready" or "Succeeded"
- Ready: All above conditions are True
What if no brokers are deployed for an instance? What does `BrokerReady` display as?
If there are no brokers defined to be deployed for an Instance, the `BrokerReady` condition would simply not be present and `Ready` would simply reflect the `Healthy` condition, similar to how `BrokerScheduled` behaves.
I'll update the proposal to add the same comment that exists for `BrokerScheduled`.
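As a rough illustration of how `Ready` could be derived when the broker conditions may be absent, here is a small Python sketch. It is not taken from the proposal or from Akri's code: the function name, the plain-dict representation of conditions, and the choice to report "False" (rather than "Unknown") when `Healthy` is "Unknown" are assumptions. It also lists blocking conditions only, whereas the proposal asks for the message to name the offending resources.

```python
from __future__ import annotations


def ready_condition(conditions: dict[str, str]) -> tuple[str, list[str]]:
    """Derive the Ready status from the other conditions.

    `conditions` maps a condition type to its status ("True", "False"
    or "Unknown"). BrokerScheduled and BrokerReady are simply missing
    from the mapping when no broker is defined for the Instance.

    Returns the Ready status and the conditions that block readiness.
    """
    blocking = []
    # Healthy is always expected; a missing Healthy counts as not ready.
    if conditions.get("Healthy") != "True":
        blocking.append("Healthy")
    # Broker conditions only count when they are present.
    for name in ("BrokerScheduled", "BrokerReady"):
        if name in conditions and conditions[name] != "True":
            blocking.append(name)
    # Assumption: anything short of all-True is reported as "False".
    return ("True" if not blocking else "False", blocking)
```

For an Instance without brokers, `ready_condition({"Healthy": "True"})` yields a "True" status with no blocking conditions, so `Ready` simply follows `Healthy`.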
This proposal aims to provide a way to give feedback to users about the status of their Akri resources.
It aims to provide a long-term solution for errors linked to Configurations and Instances, as explained in project-akri/akri#615.