Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: pacemaker-attrd: prevent segfault if a peer leaves when its name is unknown yet #3847

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
42 changes: 29 additions & 13 deletions daemons/attrd/attrd_corosync.c
Original file line number Diff line number Diff line change
Expand Up @@ -114,7 +114,7 @@ attrd_cpg_dispatch(cpg_handle_t handle,
xml = pcmk__xml_parse(data);

if (xml == NULL) {
crm_err("Bad message received from %s[%u]: '%.120s'",
crm_err("Bad message received from %s[%" PRIu32 "]: '%.120s'",
from, nodeid, data);
} else {
attrd_peer_message(pcmk__get_node(nodeid, from, NULL,
Expand Down Expand Up @@ -168,41 +168,52 @@ attrd_peer_change_cb(enum pcmk__node_update kind, pcmk__node_status_t *peer,

switch (kind) {
case pcmk__node_update_name:
crm_debug("%s node %s is now %s",
crm_debug("%s node %s[%" PRIu32 "] is now %s",
(is_remote? "Remote" : "Cluster"),
peer->name, state_text(peer->state));
pcmk__s(peer->name, "unknown"), peer->cluster_layer_id,
state_text(peer->state));
break;

case pcmk__node_update_processes:
if (!pcmk_is_set(peer->processes, crm_get_cluster_proc())) {
gone = true;
}
crm_debug("Node %s is %s a peer",
peer->name, (gone? "no longer" : "now"));
crm_debug("Node %s[%" PRIu32 "] is %s a peer",
pcmk__s(peer->name, "unknown"), peer->cluster_layer_id,
(gone? "no longer" : "now"));
break;

case pcmk__node_update_state:
crm_debug("%s node %s is now %s (was %s)",
crm_debug("%s node %s[%" PRIu32 "] is now %s (was %s)",
(is_remote? "Remote" : "Cluster"),
peer->name, state_text(peer->state), state_text(data));
pcmk__s(peer->name, "unknown"), peer->cluster_layer_id,
state_text(peer->state), state_text(data));
if (pcmk__str_eq(peer->state, PCMK_VALUE_MEMBER, pcmk__str_none)) {
/* If we're the writer, send new peers a list of all attributes
* (unless it's a remote node, which doesn't run its own attrd)
*/
if (attrd_election_won()
&& !pcmk_is_set(peer->flags, pcmk__node_status_remote)) {
attrd_peer_sync(peer);
if (!is_remote) {
if (attrd_election_won()) {
attrd_peer_sync(peer);

} else {
// Anyway send a message so that the peer learns our name
attrd_send_protocol(peer);
}
}

} else {
// Remove all attribute values associated with lost nodes
attrd_peer_remove(peer->name, false, "loss");
if (peer->name != NULL) {
attrd_peer_remove(peer->name, false, "loss");
}
gone = true;
}
break;
}

// Remove votes from cluster nodes that leave, in case election in progress
if (gone && !is_remote) {
if (gone && !is_remote && peer->name != NULL) {
attrd_remove_voter(peer);
attrd_remove_peer_protocol_ver(peer->name);
attrd_do_not_expect_from_peer(peer->name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unclear on how the learning process works. As you noted here, attrd_remove_peer_protocol_ver will segfault. I think all that version communication and reply expectation code is built around the assumption that nodes will have names. What's the window of time where we won't know a node name?

Basically, I am wondering if we need to make a lot more changes to take care of node name being NULL.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A daemon learns a peer's name whenever it receives a CPG message from the peer and via pcmk__cpg_message_data() -> pcmk__get_node() -> update_peer_uname().

Particularly for attrd, right after it has joined the cluster, it learns the writer's name rather quickly since it soon receives a sync-response from the writer.

But it doesn't directly learn non-writers' names since non-writers don't proactively send any CPG messages until actually needed, for instance whenever handling an update request.

Previously when we processed sync-response from the writer via attrd_peer_sync_response() -> attrd_peer_update() -> attrd_peer_update_one() -> update_attr_on_host(), we could learn the non-writers' names through record_peer_nodeid() -> pcmk__get_node() in here:

https://github.com/ClusterLabs/pacemaker/pull/3796/files#diff-dd072fd2300bdeabe7bfd1d1e6d1512b16ff815db39b8e20b476f7ef9e64b365L326

, but not any more.

We'd probably better still do this. So that we could learn names of non-writers as early as possible through processing sync-response from the writer.

Copy link
Member Author

@gao-yan gao-yan Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A potential alternative solution or an additional assurance could be for every other non-writer to broadcast a protocol message with attrd_broadcast_protocol() or send it specifically to us once they see us coming online, so that we receive their messages and learn their names, like how controld and fenced do with their "hello"/"poke" message.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach of using attrd_broadcast_protocol() for this. At a quick glance, it looks like we should learn the peer name that way because attrd_cpg_dispatch will call pcmk__cpg_message_data, which follows the path you've already outlined.

Unfortunately I don't think we can get away with doing just that. I think we would need to increase the attrd protocol version for that change. And then since pacemaker supports mixed version clusters, we would could potentially have some attrds running that know to broadcast their protocol version in these new cases, and some that do not.

So, if we want to have non-writers broadcast with attrd_broadcast_protocol, I think we still need to apply these patches. Does that make sense to you too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. With the two approaches together, we shouldn't need to be concerned about mixed-version cluster.

Expand Down Expand Up @@ -320,8 +331,13 @@ update_attr_on_host(attribute_t *a, const pcmk__node_status_t *peer,
// Remember node's XML ID if we're just learning it
if ((node_xml_id != NULL)
&& !pcmk__str_eq(node_xml_id, prev_xml_id, pcmk__str_none)) {
// Remember node's name in case unknown in the membership cache
pcmk__node_status_t *known_peer =
pcmk__get_node(0, host, node_xml_id,
pcmk__node_search_cluster_member);

crm_trace("Learned %s[%s] node XML ID is %s (was %s)",
a->id, v->nodename, node_xml_id,
a->id, known_peer->name, node_xml_id,
pcmk__s(prev_xml_id, "unknown"));
attrd_set_node_xml_id(v->nodename, node_xml_id);
if (attrd_election_won()) {
Expand Down
26 changes: 20 additions & 6 deletions daemons/attrd/attrd_messages.c
Original file line number Diff line number Diff line change
Expand Up @@ -302,10 +302,10 @@ attrd_handle_request(pcmk__request_t *request)

/*!
\internal
\brief Broadcast private attribute for local node with protocol version
\brief Send or broadcast private attribute for local node with protocol version
*/
void
attrd_broadcast_protocol(void)
attrd_send_protocol(const pcmk__node_status_t *peer)
{
xmlNode *attrd_op = pcmk__xe_create(NULL, __func__);

Expand All @@ -319,16 +319,30 @@ attrd_broadcast_protocol(void)
crm_xml_add(attrd_op, PCMK__XA_ATTR_HOST_ID,
attrd_cluster->priv->node_xml_id);

crm_debug("Broadcasting attrd protocol version %s for node %s",
ATTRD_PROTOCOL_VERSION, attrd_cluster->priv->node_name);
if (peer == NULL) {
crm_debug("Broadcasting attrd protocol version %s for node %s[%" PRIu32
"]",
ATTRD_PROTOCOL_VERSION,
pcmk__s(attrd_cluster->priv->node_name, "unknown"),
attrd_cluster->priv->node_id);

attrd_send_message(NULL, attrd_op, false); /* ends up at attrd_peer_message() */
} else {
crm_debug("Sending attrd protocol version %s for node %s[%" PRIu32
"] to node %s[%" PRIu32 "]",
ATTRD_PROTOCOL_VERSION,
pcmk__s(attrd_cluster->priv->node_name, "unknown"),
attrd_cluster->priv->node_id,
pcmk__s(peer->name, "unknown"),
peer->cluster_layer_id);
}

attrd_send_message(peer, attrd_op, false); /* ends up at attrd_peer_message() */

pcmk__xml_free(attrd_op);
}

gboolean
attrd_send_message(pcmk__node_status_t *node, xmlNode *data, bool confirm)
attrd_send_message(const pcmk__node_status_t *node, xmlNode *data, bool confirm)
{
const char *op = crm_element_value(data, PCMK_XA_TASK);

Expand Down
2 changes: 1 addition & 1 deletion daemons/attrd/pacemaker-attrd.c
Original file line number Diff line number Diff line change
Expand Up @@ -184,7 +184,7 @@ main(int argc, char **argv)
* across all nodes. It also ensures that the writer learns our node name,
* so it can send our attributes to the CIB.
*/
attrd_broadcast_protocol();
attrd_send_protocol(NULL);

attrd_init_ipc();
crm_notice("Pacemaker node attribute manager successfully started and accepting connections");
Expand Down
4 changes: 2 additions & 2 deletions daemons/attrd/pacemaker-attrd.h
Original file line number Diff line number Diff line change
Expand Up @@ -194,13 +194,13 @@ void attrd_peer_clear_failure(pcmk__request_t *request);
void attrd_peer_sync_response(const pcmk__node_status_t *peer, bool peer_won,
xmlNode *xml);

void attrd_broadcast_protocol(void);
void attrd_send_protocol(const pcmk__node_status_t *peer);
xmlNode *attrd_client_peer_remove(pcmk__request_t *request);
xmlNode *attrd_client_clear_failure(pcmk__request_t *request);
xmlNode *attrd_client_update(pcmk__request_t *request);
xmlNode *attrd_client_refresh(pcmk__request_t *request);
xmlNode *attrd_client_query(pcmk__request_t *request);
gboolean attrd_send_message(pcmk__node_status_t *node, xmlNode *data,
gboolean attrd_send_message(const pcmk__node_status_t *node, xmlNode *data,
bool confirm);

xmlNode *attrd_add_value_xml(xmlNode *parent, const attribute_t *a,
Expand Down
2 changes: 1 addition & 1 deletion lib/cluster/cluster.c
Original file line number Diff line number Diff line change
Expand Up @@ -271,7 +271,7 @@ pcmk__cluster_node_name(uint32_t nodeid)
}

crm_notice("Could not obtain a node name for node with "
PCMK_XA_ID "=" PRIu32,
PCMK_XA_ID "=%" PRIu32,
nodeid);
return NULL;
}
Expand Down