
attrd_updater doesn't write bundle node attributes to CIB
Open, Normal, Public

Assigned To
None
Authored By
nrwahl2
Dec 9 2023, 1:33 AM
Tags
  • Restricted Project
  • Restricted Project
Referenced Files
None

Description

Not sure whether this applies to all remote nodes or just to bundle/guest nodes. Probably all.

In attrd_cib.c:write_attribute(), crm_get_peer_full(v->nodeid, v->nodename, CRM_GET_PEER_ANY) returns a crm_node_t with uname set but nothing else: no uuid and no crm_remote_node flag. I guess it hasn't been added to the crm_remote_peer_cache?

[root@pcmktest-bundle-0 /]# attrd_updater --name testattr --update false
[root@pcmktest-bundle-0 /]#

The attribute update reaches pacemaker-attrd but is not written to the CIB. The logs on the host show:

Dec 08 22:29:46.686 fastvm-rhel-9-0-42 pacemaker-attrd     [558604] (update_attr_on_host)       notice: Setting testattr[pcmktest-bundle-0] in instance_attributes: (unset) -> false | from fastvm-rhel-9-0-42 with no write delay
Dec 08 22:29:46.687 fastvm-rhel-9-0-42 pacemaker-attrd     [558604] (pcmk__get_peer)    info: Created entry 53899250-391d-4001-930c-1e5441b11663/0x5636df853d40 for node pcmktest-bundle-0/0 (2 total)
Dec 08 22:29:46.687 fastvm-rhel-9-0-42 pacemaker-attrd     [558604] (pcmk__corosync_uuid)       info: Node pcmktest-bundle-0 is not yet known by Corosync
Dec 08 22:29:46.687 fastvm-rhel-9-0-42 pacemaker-attrd     [558604] (pcmk__get_peer)    info: Cannot obtain a UUID for node 0/pcmktest-bundle-0
Dec 08 22:29:46.687 fastvm-rhel-9-0-42 pacemaker-attrd     [558604] (attrd_write_attribute)     notice: Cannot update testattr[pcmktest-bundle-0]=false because peer UUID not known (will retry if learned)

I reverted the "CIB transaction in attrd" commit and this issue persisted, so this one (as opposed to T732) is not a recent regression from me :)


This does not happen with crm_attribute, which uses query_node_uuid() to detect whether the node is remote.

This comment in attrd_updater.c appears to be inaccurate:

/* @TODO We don't know whether the specified node is a Pacemaker Remote
 * node or not, so we can't set pcmk__node_attr_remote when appropriate.
 * However, it's not a big problem, because pacemaker-attrd will learn
 * and remember a node's "remoteness".
 */

https://github.com/ClusterLabs/pacemaker/blob/Pacemaker-2.1.7-rc3/tools/attrd_updater.c#L354-L358

Seems like we could just use query_node_uuid() in attrd_updater.
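
As a rough illustration of that idea, here is a toy model (not Pacemaker code) of a client checking a node's remoteness before sending the update. Everything named toy_* is hypothetical: toy_query_node() only mimics the spirit of query_node_uuid() (look up a node by name, report its ID and whether it is remote), and toy_attr_remote stands in for pcmk__node_attr_remote. The real functions and flags have their own signatures, which this sketch does not try to reproduce.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a node entry found in the CIB; not a Pacemaker type. */
struct toy_cib_node {
    const char *uname;   /* node name */
    const char *uuid;    /* node ID as recorded in the toy CIB */
    bool is_remote;      /* true for remote, guest, or bundle nodes */
};

/* Hypothetical request flags; toy_attr_remote stands in for pcmk__node_attr_remote. */
enum toy_attr_flags {
    toy_attr_none   = 0,
    toy_attr_remote = 1 << 0
};

/*
 * Toy lookup in the spirit of query_node_uuid(): find a node by name and
 * report its ID and remoteness. Returns 0 on success, -1 if the name is unknown.
 */
static int
toy_query_node(const struct toy_cib_node *nodes, size_t n_nodes,
               const char *uname, const char **uuid, bool *is_remote)
{
    for (size_t i = 0; i < n_nodes; i++) {
        if (strcmp(nodes[i].uname, uname) == 0) {
            *uuid = nodes[i].uuid;
            *is_remote = nodes[i].is_remote;
            return 0;
        }
    }
    return -1;
}

/*
 * Decide which flags the client would attach to its update request.
 * If the lookup fails, fall back to no flags, which matches today's
 * behavior of leaving remoteness for the daemon to learn later.
 */
static unsigned int
toy_update_flags(const struct toy_cib_node *nodes, size_t n_nodes,
                 const char *uname)
{
    const char *uuid = NULL;
    bool is_remote = false;

    if ((toy_query_node(nodes, n_nodes, uname, &uuid, &is_remote) == 0)
        && is_remote) {
        return toy_attr_remote;
    }
    return toy_attr_none;
}
```

With a toy node list containing pcmktest-bundle-0 marked as remote, toy_update_flags() returns toy_attr_remote for it and toy_attr_none for a cluster node or an unknown name.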

Event Timeline

nrwahl2 triaged this task as High priority. Dec 9 2023, 1:33 AM
nrwahl2 created this task.
nrwahl2 created this object with edit policy "Restricted Project (Project)".
nrwahl2 renamed this task from Transient attribute updates for remote nodes are ignored to Transient attribute updates for remote nodes are not written to CIB. Dec 10 2023, 7:42 PM
nrwahl2 added projects: Restricted Project, Restricted Project.
nrwahl2 renamed this task from Transient attribute updates for remote nodes are not written to CIB to Transient attribute updates for bundle nodes are not written to CIB. Dec 10 2023, 8:16 PM
nrwahl2 updated the task description.
nrwahl2 renamed this task from Transient attribute updates for bundle nodes are not written to CIB to attrd_updater doesn't write bundle nodes attribute to CIB. Dec 10 2023, 8:39 PM
nrwahl2 updated the task description.
nrwahl2 edited projects, added Restricted Project; removed Restricted Project.
nrwahl2 updated the task description.
nrwahl2 updated the task description.
nrwahl2 renamed this task from attrd_updater doesn't write bundle nodes attribute to CIB to attrd_updater doesn't write bundle node attributes to CIB. Dec 12 2023, 3:56 PM
nrwahl2 updated the task description.

This is the currently expected behavior until pacemaker-attrd has learned that a node is remote. Once it learns that, it will update the attribute cache entry and write the attribute out.
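
To make that sequence concrete, here is a toy model (not Pacemaker code) of a write that is deferred while the peer's UUID is unknown and retried once the node is learned to be remote. The toy_peer and toy_value types and both functions are hypothetical. The sketch also assumes that a remote node's ID is simply its node name, which is why learning remoteness is enough to unblock the write.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical peer cache entry; uuid stays NULL until it is learned. */
struct toy_peer {
    const char *uname;
    const char *uuid;
};

/* Hypothetical cached attribute value for one node. */
struct toy_value {
    const char *name;       /* attribute name, e.g. "testattr" */
    const char *value;      /* attribute value, e.g. "false" */
    struct toy_peer *peer;  /* node the value belongs to */
    bool written;           /* true once written to the toy CIB */
};

/*
 * Try to write the value out. Without a peer UUID the write is skipped,
 * mirroring the "peer UUID not known (will retry if learned)" log message.
 * Returns true if the value was written.
 */
static bool
toy_write_value(struct toy_value *v)
{
    if (v->peer->uuid == NULL) {
        return false;   /* deferred: nothing reaches the toy CIB yet */
    }
    v->written = true;
    return true;
}

/*
 * Called when the daemon learns that the peer is a remote node.
 * Assumption of this model: a remote node's ID is its node name,
 * so the UUID can be filled in and the deferred write retried.
 */
static bool
toy_learn_remote(struct toy_value *v)
{
    v->peer->uuid = v->peer->uname;
    return toy_write_value(v);
}
```

In this model the first toy_write_value() call for testattr on pcmktest-bundle-0 is skipped, and the value is written only after toy_learn_remote() runs. The open question in this task is how long the real daemon can stay in the first state.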

However, we may need a better bound on how soon that will be learned. pacemaker-attrd uses pcmk__cluster_lookup_remote_node() rather than pcmk__refresh_node_caches_from_cib() to maintain the remote peer cache, and that may be worth looking into.

kgaillot lowered the priority of this task from High to Normal. Thu, Jan 2, 5:36 PM