appsconfig.proto: Add edge node cluster defines #46
Conversation
It would make sense to have a separate document focusing on the cluster configuration and the API and publish that on the LF-Edge EVE wiki.
I'll hold off creating the generated files until we have agreement on the modified files.
This one was posted last week on the LF-Edge wiki: https://wiki.lfedge.org/display/EVE/Edge-Node+Clustering+Design
@famleebob I think we discussed not putting internal docs in the public domain.
(force-pushed cb75589 to b104f42)
@zedi-pramodh I updated the comments to remove links to internal docs. Erik is right, though, that we need a public design doc on the LF-Edge wiki or in this repo, so the wiki link is a good one.
Yes we did, my bad. I just wanted to tie the work done to a specific description of the API.
I took the liberty of adding that link to the PR description.
proto/config/appconfig.proto
Outdated
// This edge-node UUID for the Designate Node for the Application
string designated_node_id = 24;
Still not clear we need this in the API as opposed to the cluster selecting one control-plane node to do the download and volume creation. If we do it inside the cluster it means that the cluster can re-select the node should the originally selected one fail. Doing it in the API means that the controller (or the user??) now has to react to the failed node and based on that change the configuration it sends to the nodes.
From what I understand of this Longhorn persistent-volume provisioning, this will be used at initial setup to decide who should perform this task. Even if this node later fails, this does not change.
Another way to do this is to allow the cluster to decide, but in general we would need to send down a set of nodes for the replication (even though initially we only have 3 nodes), and each volume instance would need some election going on among the nodes. That can also be complex.
Sorry, I thought this part was for the volume instance.
For the app instance here: we may later want to allow 'manual' provisioning from the user, so this will use the same scheme, letting the API indicate which node should run the app.
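For illustration only, a hedged sketch of how the field comment could spell this out in appconfig.proto (the field name and number are from this PR; the comment wording is just a suggestion, not the final API):
// UUID of the edge node designated by the controller to bring up this
// app instance in an edge-node cluster (e.g. perform the initial download
// and launch). The cluster may later reschedule the app instance to
// another node at runtime.
string designated_node_id = 24;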
// To inform the edge-node if the device receiving this Volume is
// responsible for volume creation, convert PVC, or not
string designated_node_id = 10;
See comment about this field for the app instance.
I put the reply in the app-instance comment above.
FWIW I still think this will cause technical debt, since the controller has to be involved in re-selecting a designated node if the previously selected one fails. So I'm approving this holding my nose.
proto/info/info.proto
Outdated
Node_Status_READY = 1; // cluster reports our node is ready
Node_Status_NOTREADY = 2; // cluster reports our node is ready
Comments are identical for the two fields.
But how does the state of the node relate to the state of the cluster?
Take the case of a cluster where its config lists node1, node2, and node3.
When, e.g., node3 reports READY, does it mean not only that it has joined the cluster but also that it sees node2 and node1 as part of the cluster?
What status do we expect to see reported by the nodes if/when one of the nodes fails to join the cluster? Will the other two report it as READY?
These semantics should presumably be described in comments in the proto file.
I think this is only for the node to report its own status as part of the cluster. We'll need another info or metric message to report the cluster status (more like the Kubernetes view) with an elected reporter in the cluster, but that is more complex and is not part of this PR.
So this is only saying, as a device, what my node's status in this cluster is. If all nodes report 'unknown', then the cluster is down.
Can you 1) correct the comment and 2) add the above information as comments?
Also, does it make sense to have a separate value for DOWN vs. using UNKNOWN? Either one works for me as long as this is clearly documented.
Will correct the comment.
We can use 'DOWN' or 'CANTCONNECT'; basically it says the node cannot talk to the cluster's API server.
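As a sketch only (not the merged definition), the corrected enum comments discussed here could look something like this; the UNKNOWN and DOWN values and the exact wording are suggestions taken from this thread:
// Each node reports only its own membership status in the cluster; a
// cluster-wide view would need a separate info/metric message.
Node_Status_UNKNOWN = 0;  // status not (yet) known or not reported
Node_Status_READY = 1;    // this node has joined and the cluster reports it as ready
Node_Status_NOTREADY = 2; // this node has joined but the cluster reports it as not ready
Node_Status_DOWN = 3;     // this node cannot talk to the cluster's API server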
proto/info/info.proto
Outdated
// The App in cluster mode is currently running on this edge-node
bool cluster_app_running = 20;
I think I already commented on this field but I don't see the comment any more.
Does "running" imply that 1) this node was selected as the place to run the app instance and 2) the app instance has reached "Online" state?
What about the cases when the node has been selected but the state of the app instance is, e.g., booting, halting, halted, or error?
It would be helpful to know the intended use (who is going to do something based on this field? Is it merely for display? Or will a user/controller somehow react to this value to change things?).
Once we know that, the semantics can be documented in the proto file.
The main usage of this info status is for app 'migration': the node is not the designated node for the app (related to the app-instance DNID above), but the Kubernetes cluster migrated this app onto this node and the app is now running here. The controller can then change the app-instance config via the API to indicate that this node is now the designated node. All the app domain operations will work normally on this node, e.g. the 'activated' bool will be set to true on the domain status, etc.
I think this design with a designated node is problematic, since you depend on the controller to adjust the config based on where the app (or volume) is currently running. What if the controller does not adjust it? Should EVE interpret that as meaning it should be migrated back? (Well, it can't, since that would be a race in the eventual-consistency model.)
Thus I think the placement needs to be based on policy statements (only run on node X, avoid running on node Y, run on the same node as app 2) and not involve the controller reacting to where the app instance is currently running; that is not a declarative API statement.
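To illustrate the alternative being argued for here (purely hypothetical, not part of this PR or the EVE API), such declarative placement constraints could be expressed roughly as:
// Hypothetical sketch, NOT in this PR: placement expressed as declarative
// constraints instead of a controller-maintained designated node.
message PlacementPolicy {
  repeated string required_node_ids = 1;       // only run on these nodes
  repeated string excluded_node_ids = 2;       // never run on these nodes
  repeated string colocate_with_app_uuids = 3; // run on the same node as these apps
}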
This is just an 'info' message to report that this app is now running on this node; it does not involve any placement decision, since the app has already been rescheduled by the cluster itself.
The node does not depend on the controller to change the DNID for the app in the downloaded config. The plan is to have the controller eventually change the DNID to this node, so the app status on this new node sort of becomes 'legal' and it starts to report app metrics and that sort of thing, since the reporting of the metrics depends on the app's domain status being fully activated.
I think that, regardless of the details of the process on this node, reporting 'app is running on this node' as an 'info' message is important for the controller, or the user, to know.
@naiming-zededa We still need to make it clear what "running" means here, in cases where the node is trying to run it (so the state could be booting) or tried to run it (and it failed, so the state is currently halting, halted, or the app instance has an error).
Thus, is this "running" as in "trying to run and the state is currently online", or "selected" as in trying/intending to run on this node? Please clarify in comments and in the name of the field.
@eriknordmark makes sense. We will add comments for this item: the meaning is that the deployed app has been scheduled and launched on this node and carries this node's 'node-id' property; the app can be in any operating state.
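A possible wording for that comment (a sketch only; the field name and number are as quoted above, the text is just a suggestion):
// True when the cluster has scheduled and launched this app instance on
// this edge node (i.e. this node currently owns the app), regardless of
// the app's operating state (booting, online, halting, halted, error).
// It does not by itself imply the app instance is Online.
bool cluster_app_running = 20;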
proto/config/edge_node_cluster.proto
Outdated
string cluster_ip_prefix = 4;
// This device is an 'Agent' node
bool is_agent = 5;
Should this be is_agent or is_worker? I think maybe worker? The agent/server language is a k3s one; control plane/worker is generic Kubernetes. We are using k3s here, but that is an implementation detail. I think the API should be generic to Kubernetes.
Sure, we can change it to is_worker_node. Although we have things like join_server_ip, we don't want to be too Kubernetes-specific with something like join_control_plane_node_ip, and 'master' should not be used these days :-)
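For reference, the renamed field could then read as follows (a sketch only, assuming it keeps field number 5 from the quoted snippet above):
// This device is a worker node (as opposed to a control-plane/server node)
bool is_worker_node = 5;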
Yeah, I think server is fine though. Lots of people use "server" as the thing to talk to, and even refer to a "control plane server" or "master server". Besides, the only thing I can join is a control plane server anyway.
This still needs to be addressed
Thanks! I missed it yesterday. Done; it is now is_worker_node.
I'll do another rebase when ready.
You've done some strange pull+merge from main so now other PRs appear in the diffs.
Note that you should NEVER do merge commits. Instead fetch from main and rebase on that.
And for this particular repo with the protobuf files it makes sense to always keep two separate commits: one with the changes to the proto files and a separate one with the generated files. That makes it trivial to rebase since you can drop the second commit (in git rebase -i for instance), and then invoke make to recreate the generated files once you've rebased the commit with the proto changes on top of main.
So please rebase and create the two commits which are needed.
(force-pushed 86fb9db to dbf886d)
Completed the rebase; now at two commits, and force-pushed.
Looks good to me, other than Erik's requests (the multiple commits).
(force-pushed b0b81da to 8d19735)
Fixed the Yetus errors, committed the changes, rebased, built, and committed the generated files. This should answer all outstanding requests.
LGTM, but @eriknordmark needs to approve.
Quick update to try to fix the Yetus errors; did I shoot too soon?
(force-pushed 9c12cbc to e41e113)
I committed these changes shortly after they were requested; can we at least mark this as done, or as not blocking the merge, please?
proto/config/appconfig.proto
Outdated
// This edge-node UUID for the Designate Node for the Application
string designated_node_id = 25;
FWIW I still think this will cause technical debt, since the controller has to be involved in re-selecting a designated node if the previously selected one fails. So I'm approving this holding my nose.
There are still two outstanding clarification requests; once those are addressed I will approve this, holding my nose.
(force-pushed e41e113 to 65320ce)
LGTM.
You probably have to redo the generated-files commit, since I just merged another API addition.
First pass adding edge node cluster to the API; fix some Yetus errors. Signed-off-by: Gerald (Bob) Lee <[email protected]>
All generated files after `make`. Signed-off-by: Gerald (Bob) Lee <[email protected]>
(force-pushed 65320ce to ec0f94e)
PR #46 accidentally added the allow_to_discover field to app_instance, which was removed in #52. Not deprecating field number 25 (allow_to_discover) because we haven't released EVE with this API yet, so we don't have to. Signed-off-by: Pavel Abramov <[email protected]>
First pass adding edge node cluster to the API
Based on the design and API in https://wiki.lfedge.org/display/EVE/Edge-Node+Clustering+Design