Currently, the controller initiates metadata actions on its own: they are neither scheduled by the scheduler nor run via the executor, as all other resource actions are. The controller executes metadata actions asynchronously when possible, but there are situations where it has to execute them synchronously. This has significant drawbacks:
- Metadata actions are the only actions executed as the hacluster user instead of root.
- Metadata actions are executed with a hardcoded 30s timeout, and ignore any timeout configured in the CIB.
- If any asynchronous action is pending when a synchronous metadata call is made, the asynchronous action could complete while the controller is waiting for the synchronous call, causing its SIGCHLD to be ignored and leaving it as a zombie process.
The scheduler should schedule a metadata action, as a normal resource action, when any other resource action requires metadata (see crm_op_needs_metadata()), and order the metadata action before the other one(s). There only needs to be a single metadata action per agent (not per resource). The action would be added to the graph normally, and the DC would farm it out to controllers normally.
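As a rough illustration of the deduplication and ordering described above, here is a minimal Python sketch. It is a model of the idea, not Pacemaker code: `Action`, `needs_metadata()`, `add_metadata_actions()`, and the task set are hypothetical stand-ins (the real decision belongs to crm_op_needs_metadata()).

```python
from dataclasses import dataclass

# Hypothetical stand-in for crm_op_needs_metadata(); the real set of tasks
# is decided by Pacemaker, not by this list.
TASKS_NEEDING_METADATA = {"start", "monitor", "promote", "demote", "reload"}


@dataclass(frozen=True)
class Action:
    task: str            # e.g. "start", "stop", "meta-data"
    agent: str           # e.g. "ocf:heartbeat:IPaddr2"
    resource: str = ""   # empty for metadata actions (per agent, not per resource)


def needs_metadata(action):
    """Illustrative predicate: does this action require agent metadata?"""
    return action.task in TASKS_NEEDING_METADATA


def add_metadata_actions(actions):
    """Return (all_actions, orderings); each ordering is a (first, then) pair."""
    metadata_by_agent = {}
    orderings = []
    for action in actions:
        if not needs_metadata(action):
            continue
        # Reuse the agent's metadata action if one was already scheduled
        meta = metadata_by_agent.setdefault(
            action.agent, Action(task="meta-data", agent=action.agent))
        # Order the metadata action before the action that needs it
        orderings.append((meta, action))
    return list(metadata_by_agent.values()) + list(actions), orderings
```

With two resources using the same agent, the sketch yields one metadata action and two orderings, one for each dependent action.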
Considerations:
- Metadata actions should always assume requires="none" (that is, not require quorum or fencing).
- Start and probe actions always require fresh metadata (not cached), so metadata actions needed for those should be marked in some way.
- For this task, metadata actions needed for actions on a Pacemaker Remote node should be scheduled on the cluster node hosting the connection, not the remote node. (Remote metadata poses enough problems to merit its own project, T359.)
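The three considerations above could be modeled roughly as follows. This is only a sketch under stated assumptions: `Node`, `MetadataAction`, `plan_metadata()`, the `fresh` flag, and the "probe" task label are hypothetical names chosen for illustration, and the actual marking mechanism is left open by this task.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    name: str
    # For a Pacemaker Remote node: the cluster node hosting its connection
    # (hypothetical field; None for an ordinary cluster node)
    connection_host: Optional[str] = None


@dataclass(frozen=True)
class MetadataAction:
    agent: str
    node: str                # where the metadata action is scheduled
    requires: str = "none"   # never waits for quorum or fencing
    fresh: bool = False      # hypothetical marker: cached metadata not acceptable


# Task labels assumed for illustration; start and probe need fresh metadata
FRESH_TASKS = {"start", "probe"}


def plan_metadata(agent, task, target):
    """Build the metadata action needed before `task` runs on `target`."""
    # Remote nodes: schedule on the cluster node hosting the connection
    exec_node = target.connection_host or target.name
    return MetadataAction(agent=agent, node=exec_node,
                          fresh=task in FRESH_TASKS)
```

For example, a start on a remote node whose connection is hosted by `node2` would produce a metadata action scheduled on `node2`, marked fresh, with requires="none".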
When a controller processes a metadata action that is not marked as requiring fresh metadata (see above), the controller should consider the action successful immediately (like a pseudo-op) if the metadata is already cached. Otherwise, it would send the metadata action to its local executor as usual, and cache the metadata on success.
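The controller-side decision could look something like the sketch below. It is a simplified, synchronous model: `Controller`, `process_metadata_action()`, and the `run_on_executor` callback are hypothetical, and the real controller would talk to the executor asynchronously.

```python
class Controller:
    """Toy model of a controller's metadata cache (not Pacemaker code)."""

    def __init__(self, run_on_executor):
        # run_on_executor(agent) is an assumed callback that returns the
        # metadata string on success, or None on failure
        self._run_on_executor = run_on_executor
        self._cache = {}

    def process_metadata_action(self, agent, fresh=False):
        """Return True if the metadata action is considered successful."""
        if not fresh and agent in self._cache:
            # Cached and cache is acceptable: succeed at once, like a pseudo-op
            return True
        # Otherwise run the action on the local executor as usual
        metadata = self._run_on_executor(agent)
        if metadata is None:
            # Failure: any previously cached entry is left untouched here,
            # which is an assumption of this sketch
            return False
        self._cache[agent] = metadata  # cache on success
        return True
```

A second unmarked request for the same agent never reaches the executor, while a request marked fresh always does.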
Once this is done, update the statement in Pacemaker Explained that meta-data "is not performed as root".
See also: