diff --git a/cts/cli/regression.daemons.exp b/cts/cli/regression.daemons.exp
index 51a0edc88d..66bd7b3aed 100644
--- a/cts/cli/regression.daemons.exp
+++ b/cts/cli/regression.daemons.exp
@@ -1,441 +1,446 @@
=#=#=#= Begin test: Get CIB manager metadata =#=#=#=
1.1Cluster options used by Pacemaker's Cluster Information Base managerCluster Information Base manager optionsEnable Access Control Lists (ACLs) for the CIBEnable Access Control Lists (ACLs) for the CIBRaise this if log has "Evicting client" messages for cluster daemon PIDs (a good value is the number of resources in the cluster multiplied by the number of nodes).Maximum IPC message backlog before disconnecting a cluster daemon
=#=#=#= End test: Get CIB manager metadata - OK (0) =#=#=#=
* Passed: pacemaker-based - Get CIB manager metadata
=#=#=#= Begin test: Get controller metadata =#=#=#=
1.1Cluster options used by Pacemaker's controllerPacemaker controller optionsIncludes a hash which identifies the exact changeset the code was built from. Used for diagnostic purposes.Pacemaker version on cluster node elected Designated Controller (DC)Used for informational and diagnostic purposes.The messaging stack on which Pacemaker is currently runningThis optional value is mostly for users' convenience as desired in administration, but may also be used in Pacemaker configuration rules via the #cluster-name node attribute, and by higher-level tools and resource agents.An arbitrary name for the clusterThe optimal value will depend on the speed and load of your network and the type of switches used.How long to wait for a response from other nodes during start-upPacemaker is primarily event-driven, and looks ahead to know when to recheck cluster state for failure timeouts and most time-based rules. However, it will also recheck the cluster after this amount of inactivity, to evaluate rules with date specifications and serve as a fail-safe for certain types of scheduler bugs. Allowed values: Zero disables polling, while positive values are an interval in seconds(unless other units are specified, for example "5min")Polling interval to recheck cluster state and evaluate rules with date specificationsThe cluster will slow down its recovery process when the amount of system resources used (currently CPU) approaches this limitMaximum amount of system load that should be used by cluster nodesMaximum number of jobs that can be scheduled per node (defaults to 2x cores)Maximum number of jobs that can be scheduled per node (defaults to 2x cores)A cluster node may receive notification of its own fencing if fencing is misconfigured, or if fabric fencing is in use that doesn't cut cluster communication. Allowed values are "stop" to attempt to immediately stop Pacemaker and stay stopped, or "panic" to attempt to immediately reboot the local node, falling back to stop on failure.How a cluster node should react if notified of its own fencingDeclare an election failed if it is not decided within this much time. If you need to adjust this value, it probably indicates the presence of a bug.*** Advanced Use Only ***Exit immediately if shutdown does not complete within this much time. If you need to adjust this value, it probably indicates the presence of a bug.*** Advanced Use Only ***If you need to adjust this value, it probably indicates the presence of a bug.*** Advanced Use Only ***If you need to adjust this value, it probably indicates the presence of a bug.*** Advanced Use Only ***Delay cluster recovery for this much time to allow for additional events to occur. Useful if your configuration is sensitive to the order in which ping updates arrive.*** Advanced Use Only *** Enabling this option will slow down cluster recovery under all conditionsIf this is set to a positive value, lost nodes are assumed to self-fence using watchdog-based SBD within this much time. This does not require a fencing resource to be explicitly configured, though a fence_watchdog resource can be configured, to limit use to specific nodes. If this is set to 0 (the default), the cluster will never assume watchdog-based self-fencing. If this is set to a negative value, the cluster will use twice the local value of the `SBD_WATCHDOG_TIMEOUT` environment variable if that is positive, or otherwise treat this as 0. 
WARNING: When used, this timeout must be larger than `SBD_WATCHDOG_TIMEOUT` on all nodes that use watchdog-based SBD, and Pacemaker will refuse to start on any of those nodes where this is not true for the local value or SBD is not active. When this is set to a negative value, `SBD_WATCHDOG_TIMEOUT` must be set to the same value on all nodes that use SBD, otherwise data corruption or loss could occur.How long before nodes can be assumed to be safely down when watchdog-based self-fencing via SBD is in useHow many times fencing can fail before it will no longer be immediately re-attempted on a targetHow many times fencing can fail before it will no longer be immediately re-attempted on a targetWhat to do when the cluster does not have quorum Allowed values: stop, freeze, ignore, demote, suicideWhat to do when the cluster does not have quorumWhen true, resources active on a node when it is cleanly shut down are kept "locked" to that node (not allowed to run elsewhere) until they start again on that node after it rejoins (or for at most shutdown-lock-limit, if set). Stonith resources and Pacemaker Remote connections are never locked. Clone and bundle instances and the promoted role of promotable clones are currently never locked, though support could be added in a future release.Whether to lock resources to a cleanly shut down node
+
+ If shutdown-lock is true and this is set to a nonzero time duration, shutdown locks will expire after this much time has passed since the shutdown was initiated, even if the node has not rejoined.
+ Do not lock resources to a cleanly shut down node longer than this
+
+
=#=#=#= End test: Get controller metadata - OK (0) =#=#=#=
* Passed: pacemaker-controld - Get controller metadata
=#=#=#= Begin test: Get fencer metadata =#=#=#=
1.1Instance attributes available for all "stonith"-class resources and used by Pacemaker's fence daemon, formerly known as stonithdInstance attributes available for all "stonith"-class resourcessome devices do not support the standard 'port' parameter or may provide additional ones. Use this to specify an alternate, device-specific, parameter that should indicate the machine to be fenced. A value of none can be used to tell the cluster not to supply any additional parameters.Advanced use only: An alternate parameter to supply instead of 'port'Eg. node1:1;node2:2,3 would tell the cluster to use port 1 for node1 and ports 2 and 3 for node2A mapping of host names to ports numbers for devices that do not support host names.A list of machines controlled by this device (Optional unless pcmk_host_list=static-list)Eg. node1,node2,node3Allowed values: dynamic-list (query the device via the 'list' command), static-list (check the pcmk_host_list attribute), status (query the device via the 'status' command), none (assume every device can fence every machine)How to determine which machines are controlled by the device.Enable a delay of no more than the time specified before executing fencing actions. Pacemaker derives the overall delay by taking the value of pcmk_delay_base and adding a random delay value such that the sum is kept below this maximum.Enable a base delay for fencing actions and specify base delay value.This enables a static delay for fencing actions, which can help avoid "death matches" where two nodes try to fence each other at the same time. If pcmk_delay_max is also used, a random delay will be added such that the total delay is kept below that value.This can be set to a single time value to apply to any node targeted by this device (useful if a separate device is configured for each target), or to a node map (for example, "node1:1s;node2:5") to set a different value per target.Enable a base delay for fencing actions and specify base delay value.Cluster property concurrent-fencing=true needs to be configured first.Then use this to specify the maximum number of actions can be performed in parallel on this device. -1 is unlimited.The maximum number of actions can be performed in parallel on this deviceSome devices do not support the standard commands or may provide additional ones.\nUse this to specify an alternate, device-specific, command that implements the 'reboot' action.Advanced use only: An alternate command to run instead of 'reboot'Some devices need much more/less time to complete than normal.Use this to specify an alternate, device-specific, timeout for 'reboot' actions.Advanced use only: Specify an alternate timeout to use for reboot actions instead of stonith-timeoutSome devices do not support multiple connections. Operations may 'fail' if the device is busy with another task so Pacemaker will automatically retry the operation, if there is time remaining. 
Use this option to alter the number of times Pacemaker retries 'reboot' actions before giving up.Advanced use only: The maximum number of times to retry the 'reboot' command within the timeout periodSome devices do not support the standard commands or may provide additional ones.Use this to specify an alternate, device-specific, command that implements the 'off' action.Advanced use only: An alternate command to run instead of 'off'Some devices need much more/less time to complete than normal.Use this to specify an alternate, device-specific, timeout for 'off' actions.Advanced use only: Specify an alternate timeout to use for off actions instead of stonith-timeoutSome devices do not support multiple connections. Operations may 'fail' if the device is busy with another task so Pacemaker will automatically retry the operation, if there is time remaining. Use this option to alter the number of times Pacemaker retries 'off' actions before giving up.Advanced use only: The maximum number of times to retry the 'off' command within the timeout periodSome devices do not support the standard commands or may provide additional ones.Use this to specify an alternate, device-specific, command that implements the 'on' action.Advanced use only: An alternate command to run instead of 'on'Some devices need much more/less time to complete than normal.Use this to specify an alternate, device-specific, timeout for 'on' actions.Advanced use only: Specify an alternate timeout to use for on actions instead of stonith-timeoutSome devices do not support multiple connections. Operations may 'fail' if the device is busy with another task so Pacemaker will automatically retry the operation, if there is time remaining. Use this option to alter the number of times Pacemaker retries 'on' actions before giving up.Advanced use only: The maximum number of times to retry the 'on' command within the timeout periodSome devices do not support the standard commands or may provide additional ones.Use this to specify an alternate, device-specific, command that implements the 'list' action.Advanced use only: An alternate command to run instead of 'list'Some devices need much more/less time to complete than normal.Use this to specify an alternate, device-specific, timeout for 'list' actions.Advanced use only: Specify an alternate timeout to use for list actions instead of stonith-timeoutSome devices do not support multiple connections. Operations may 'fail' if the device is busy with another task so Pacemaker will automatically retry the operation, if there is time remaining. Use this option to alter the number of times Pacemaker retries 'list' actions before giving up.Advanced use only: The maximum number of times to retry the 'list' command within the timeout periodSome devices do not support the standard commands or may provide additional ones.Use this to specify an alternate, device-specific, command that implements the 'monitor' action.Advanced use only: An alternate command to run instead of 'monitor'Some devices need much more/less time to complete than normal.\nUse this to specify an alternate, device-specific, timeout for 'monitor' actions.Advanced use only: Specify an alternate timeout to use for monitor actions instead of stonith-timeoutSome devices do not support multiple connections. Operations may 'fail' if the device is busy with another task so Pacemaker will automatically retry the operation, if there is time remaining. 
Use this option to alter the number of times Pacemaker retries 'monitor' actions before giving up.Advanced use only: The maximum number of times to retry the 'monitor' command within the timeout periodSome devices do not support the standard commands or may provide additional ones.Use this to specify an alternate, device-specific, command that implements the 'status' action.Advanced use only: An alternate command to run instead of 'status'Some devices need much more/less time to complete than normal.Use this to specify an alternate, device-specific, timeout for 'status' actions.Advanced use only: Specify an alternate timeout to use for status actions instead of stonith-timeoutSome devices do not support multiple connections. Operations may 'fail' if the device is busy with another task so Pacemaker will automatically retry the operation, if there is time remaining. Use this option to alter the number of times Pacemaker retries 'status' actions before giving up.Advanced use only: The maximum number of times to retry the 'status' command within the timeout period
=#=#=#= End test: Get fencer metadata - OK (0) =#=#=#=
* Passed: pacemaker-fenced - Get fencer metadata
=#=#=#= Begin test: Get scheduler metadata =#=#=#=
1.1Cluster options used by Pacemaker's schedulerPacemaker scheduler optionsWhat to do when the cluster does not have quorum Allowed values: stop, freeze, ignore, demote, suicideWhat to do when the cluster does not have quorumWhether resources can run on any node by defaultWhether resources can run on any node by defaultWhether the cluster should refrain from monitoring, starting, and stopping resourcesWhether the cluster should refrain from monitoring, starting, and stopping resourcesWhen true, the cluster will immediately ban a resource from a node if it fails to start there. When false, the cluster will instead check the resource's fail count against its migration-threshold.Whether a start failure should prevent a resource from being recovered on the same nodeWhether the cluster should check for active resources during start-upWhether the cluster should check for active resources during start-upWhen true, resources active on a node when it is cleanly shut down are kept "locked" to that node (not allowed to run elsewhere) until they start again on that node after it rejoins (or for at most shutdown-lock-limit, if set). Stonith resources and Pacemaker Remote connections are never locked. Clone and bundle instances and the promoted role of promotable clones are currently never locked, though support could be added in a future release.Whether to lock resources to a cleanly shut down nodeIf shutdown-lock is true and this is set to a nonzero time duration, shutdown locks will expire after this much time has passed since the shutdown was initiated, even if the node has not rejoined.Do not lock resources to a cleanly shut down node longer than thisIf false, unresponsive nodes are immediately assumed to be harmless, and resources that were active on them may be recovered elsewhere. This can result in a "split-brain" situation, potentially leading to data loss and/or service unavailability.*** Advanced Use Only *** Whether nodes may be fenced as part of recoveryAction to send to fence device when a node needs to be fenced ("poweroff" is a deprecated alias for "off") Allowed values: reboot, off, poweroffAction to send to fence device when a node needs to be fenced ("poweroff" is a deprecated alias for "off")This value is not used by Pacemaker, but is kept for backward compatibility, and certain legacy fence agents might use it.*** Advanced Use Only *** Unused by PacemakerThis is set automatically by the cluster according to whether SBD is detected to be in use. User-configured values are ignored. The value `true` is meaningful if diskless SBD is used and `stonith-watchdog-timeout` is nonzero. In that case, if fencing is required, watchdog-based self-fencing will be performed via SBD without requiring a fencing resource explicitly configured.Whether watchdog integration is enabledAllow performing fencing operations in parallelAllow performing fencing operations in parallelSetting this to false may lead to a "split-brain" situation,potentially leading to data loss and/or service unavailability.*** Advanced Use Only *** Whether to fence unseen nodes at start-upApply specified delay for the fencings that are targeting the lost nodes with the highest total resource priority in case we don't have the majority of the nodes in our cluster partition, so that the more significant nodes potentially win any fencing match, which is especially meaningful under split-brain of 2-node cluster. A promoted resource instance takes the base priority + 1 on calculation if the base priority is not 0. 
Any static/random delays that are introduced by `pcmk_delay_base/max` configured for the corresponding fencing resources will be added to this delay. This delay should be significantly greater than, safely twice, the maximum `pcmk_delay_base/max`. By default, priority fencing delay is disabled.Apply fencing delay targeting the lost nodes with the highest total resource priorityThe node elected Designated Controller (DC) will consider an action failed if it does not get a response from the node executing the action within this time (after considering the action's own timeout). The "correct" value will depend on the speed and load of your network and cluster nodes.Maximum time for node-to-node communicationThe "correct" value will depend on the speed and load of your network and cluster nodes. If set to 0, the cluster will impose a dynamically calculated limit when any node has a high load.Maximum number of jobs that the cluster may execute in parallel across all nodesThe number of live migration actions that the cluster is allowed to execute in parallel on a node (-1 means no limit)The number of live migration actions that the cluster is allowed to execute in parallel on a node (-1 means no limit)Whether the cluster should stop all active resourcesWhether the cluster should stop all active resourcesWhether to stop resources that were removed from the configurationWhether to stop resources that were removed from the configurationWhether to cancel recurring actions removed from the configurationWhether to cancel recurring actions removed from the configurationValues other than default are poorly tested and potentially dangerous. This option will be removed in a future release.*** Deprecated *** Whether to remove stopped resources from the executorZero to disable, -1 to store unlimited.The number of scheduler inputs resulting in errors to saveZero to disable, -1 to store unlimited.The number of scheduler inputs resulting in warnings to saveZero to disable, -1 to store unlimited.The number of scheduler inputs without errors or warnings to saveRequires external entities to create node attributes (named with the prefix "#health") with values "red", "yellow", or "green". Allowed values: none, migrate-on-red, only-green, progressive, customHow cluster should react to node health attributesOnly used when "node-health-strategy" is set to "progressive".Base health score assigned to a nodeOnly used when "node-health-strategy" is set to "custom" or "progressive".The score to use for a node health attribute whose value is "green"Only used when "node-health-strategy" is set to "custom" or "progressive".The score to use for a node health attribute whose value is "yellow"Only used when "node-health-strategy" is set to "custom" or "progressive".The score to use for a node health attribute whose value is "red"How the cluster should allocate resources to nodes Allowed values: default, utilization, minimal, balancedHow the cluster should allocate resources to nodes
=#=#=#= End test: Get scheduler metadata - OK (0) =#=#=#=
* Passed: pacemaker-schedulerd - Get scheduler metadata
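Note on the controller metadata above: the stonith-watchdog-timeout description encodes a small arithmetic rule (a negative setting falls back to twice SBD_WATCHDOG_TIMEOUT when that is positive, otherwise 0). A hypothetical C illustration of just that rule — the helper name and main() harness are not Pacemaker code:

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper mirroring the rule quoted in the metadata above:
 * a non-negative setting is used as-is (0 disables the self-fencing
 * assumption); a negative setting falls back to twice SBD_WATCHDOG_TIMEOUT
 * (seconds, read from the environment) when that is positive, else 0.
 */
static long
resolve_stonith_watchdog_timeout_ms(long configured_ms)
{
    if (configured_ms >= 0) {
        return configured_ms;
    }
    const char *env = getenv("SBD_WATCHDOG_TIMEOUT");
    long sbd_s = (env != NULL)? strtol(env, NULL, 10) : 0L;

    return (sbd_s > 0)? (2L * sbd_s * 1000L) : 0L;
}

int
main(void)
{
    /* With SBD_WATCHDOG_TIMEOUT=5 in the environment, a negative setting
     * resolves to 10000 ms; with it unset, to 0 (disabled).
     */
    printf("%ld\n", resolve_stonith_watchdog_timeout_ms(-1));
    return 0;
}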
diff --git a/daemons/attrd/attrd_cib.c b/daemons/attrd/attrd_cib.c
index d04a67cb3f..928c013374 100644
--- a/daemons/attrd/attrd_cib.c
+++ b/daemons/attrd/attrd_cib.c
@@ -1,380 +1,380 @@
/*
- * Copyright 2013-2022 the Pacemaker project contributors
+ * Copyright 2013-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include "pacemaker-attrd.h"
static int last_cib_op_done = 0;
static gboolean
attribute_timer_cb(gpointer data)
{
attribute_t *a = data;
crm_trace("Dampen interval expired for %s", a->id);
attrd_write_or_elect_attribute(a);
return FALSE;
}
static void
attrd_cib_callback(xmlNode *msg, int call_id, int rc, xmlNode *output, void *user_data)
{
int level = LOG_ERR;
GHashTableIter iter;
const char *peer = NULL;
attribute_value_t *v = NULL;
char *name = user_data;
attribute_t *a = g_hash_table_lookup(attributes, name);
if(a == NULL) {
crm_info("Attribute %s no longer exists", name);
return;
}
a->update = 0;
if (rc == pcmk_ok && call_id < 0) {
rc = call_id;
}
switch (rc) {
case pcmk_ok:
level = LOG_INFO;
last_cib_op_done = call_id;
if (a->timer && !a->timeout_ms) {
// Remove temporary dampening for failed writes
mainloop_timer_del(a->timer);
a->timer = NULL;
}
break;
case -pcmk_err_diff_failed: /* When an attr changes while the CIB is syncing */
case -ETIME: /* When an attr changes while there is a DC election */
case -ENXIO: /* When an attr changes while the CIB is syncing a
* newer config from a node that just came up
*/
level = LOG_WARNING;
break;
}
do_crm_log(level, "CIB update %d result for %s: %s " CRM_XS " rc=%d",
call_id, a->id, pcmk_strerror(rc), rc);
g_hash_table_iter_init(&iter, a->values);
while (g_hash_table_iter_next(&iter, (gpointer *) & peer, (gpointer *) & v)) {
do_crm_log(level, "* %s[%s]=%s", a->id, peer, v->requested);
free(v->requested);
v->requested = NULL;
if (rc != pcmk_ok) {
a->changed = true; /* Attempt write out again */
}
}
if (a->changed && attrd_election_won()) {
if (rc == pcmk_ok) {
/* We deferred a write of a new update because this update was in
* progress. Write out the new value without additional delay.
*/
attrd_write_attribute(a, false);
/* We're re-attempting a write because the original failed; delay
* the next attempt so we don't potentially flood the CIB manager
* and logs with a zillion attempts per second.
*
* @TODO We could elect a new writer instead. However, we'd have to
* somehow downgrade our vote, and we'd still need something like this
* if all peers similarly fail to write this attribute (which may
* indicate a corrupted attribute entry rather than a CIB issue).
*/
} else if (a->timer) {
// Attribute has a dampening value, so use that as delay
if (!mainloop_timer_running(a->timer)) {
crm_trace("Delayed re-attempted write for %s by %s",
name, pcmk__readable_interval(a->timeout_ms));
mainloop_timer_start(a->timer);
}
} else {
/* Set a temporary dampening of 2 seconds (timer will continue
* to exist until the attribute's dampening gets set or the
* write succeeds).
*/
a->timer = attrd_add_timer(a->id, 2000, a);
mainloop_timer_start(a->timer);
}
}
}
static void
build_update_element(xmlNode *parent, attribute_t *a, const char *nodeid, const char *value)
{
const char *set = NULL;
xmlNode *xml_obj = NULL;
xml_obj = create_xml_node(parent, XML_CIB_TAG_STATE);
crm_xml_add(xml_obj, XML_ATTR_ID, nodeid);
xml_obj = create_xml_node(xml_obj, XML_TAG_TRANSIENT_NODEATTRS);
crm_xml_add(xml_obj, XML_ATTR_ID, nodeid);
if (pcmk__str_eq(a->set_type, XML_TAG_ATTR_SETS, pcmk__str_null_matches)) {
xml_obj = create_xml_node(xml_obj, XML_TAG_ATTR_SETS);
} else if (pcmk__str_eq(a->set_type, XML_TAG_UTILIZATION, pcmk__str_none)) {
xml_obj = create_xml_node(xml_obj, XML_TAG_UTILIZATION);
} else {
crm_err("Unknown set type attribute: %s", a->set_type);
}
if (a->set_id) {
crm_xml_set_id(xml_obj, "%s", a->set_id);
} else {
crm_xml_set_id(xml_obj, "%s-%s", XML_CIB_TAG_STATUS, nodeid);
}
set = ID(xml_obj);
xml_obj = create_xml_node(xml_obj, XML_CIB_TAG_NVPAIR);
if (a->uuid) {
crm_xml_set_id(xml_obj, "%s", a->uuid);
} else {
crm_xml_set_id(xml_obj, "%s-%s", set, a->id);
}
crm_xml_add(xml_obj, XML_NVPAIR_ATTR_NAME, a->id);
if(value) {
crm_xml_add(xml_obj, XML_NVPAIR_ATTR_VALUE, value);
} else {
crm_xml_add(xml_obj, XML_NVPAIR_ATTR_VALUE, "");
crm_xml_add(xml_obj, "__delete__", XML_NVPAIR_ATTR_VALUE);
}
}
static void
send_alert_attributes_value(attribute_t *a, GHashTable *t)
{
int rc = 0;
attribute_value_t *at = NULL;
GHashTableIter vIter;
g_hash_table_iter_init(&vIter, t);
while (g_hash_table_iter_next(&vIter, NULL, (gpointer *) & at)) {
rc = attrd_send_attribute_alert(at->nodename, at->nodeid,
a->id, at->current);
crm_trace("Sent alerts for %s[%s]=%s: nodeid=%d rc=%d",
a->id, at->nodename, at->current, at->nodeid, rc);
}
}
static void
set_alert_attribute_value(GHashTable *t, attribute_value_t *v)
{
attribute_value_t *a_v = NULL;
a_v = calloc(1, sizeof(attribute_value_t));
CRM_ASSERT(a_v != NULL);
a_v->nodeid = v->nodeid;
a_v->nodename = strdup(v->nodename);
pcmk__str_update(&a_v->current, v->current);
g_hash_table_replace(t, a_v->nodename, a_v);
}
mainloop_timer_t *
attrd_add_timer(const char *id, int timeout_ms, attribute_t *attr)
{
return mainloop_timer_add(id, timeout_ms, FALSE, attribute_timer_cb, attr);
}
void
attrd_write_attribute(attribute_t *a, bool ignore_delay)
{
int private_updates = 0, cib_updates = 0;
xmlNode *xml_top = NULL;
attribute_value_t *v = NULL;
GHashTableIter iter;
- enum cib_call_options flags = cib_quorum_override;
+ enum cib_call_options flags = cib_none;
GHashTable *alert_attribute_value = NULL;
if (a == NULL) {
return;
}
/* If this attribute will be written to the CIB ... */
if (!stand_alone && !a->is_private) {
/* Defer the write if now's not a good time */
CRM_CHECK(the_cib != NULL, return);
if (a->update && (a->update < last_cib_op_done)) {
crm_info("Write out of '%s' continuing: update %d considered lost", a->id, a->update);
a->update = 0; // Don't log this message again
} else if (a->update) {
crm_info("Write out of '%s' delayed: update %d in progress", a->id, a->update);
return;
} else if (mainloop_timer_running(a->timer)) {
if (ignore_delay) {
/* 'refresh' forces a write of the current value of all attributes
* Cancel any existing timers, we're writing it NOW
*/
mainloop_timer_stop(a->timer);
crm_debug("Write out of '%s': timer is running but ignore delay", a->id);
} else {
crm_info("Write out of '%s' delayed: timer is running", a->id);
return;
}
}
/* Initialize the status update XML */
xml_top = create_xml_node(NULL, XML_CIB_TAG_STATUS);
}
/* Attribute will be written shortly, so clear changed flag */
a->changed = false;
/* We will check all peers' uuids shortly, so initialize this to false */
a->unknown_peer_uuids = false;
/* Attribute will be written shortly, so clear forced write flag */
a->force_write = FALSE;
/* Make the table for the attribute trap */
alert_attribute_value = pcmk__strikey_table(NULL, attrd_free_attribute_value);
/* Iterate over each peer value of this attribute */
g_hash_table_iter_init(&iter, a->values);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) & v)) {
crm_node_t *peer = crm_get_peer_full(v->nodeid, v->nodename, CRM_GET_PEER_ANY);
/* If the value's peer info does not correspond to a peer, ignore it */
if (peer == NULL) {
crm_notice("Cannot update %s[%s]=%s because peer not known",
a->id, v->nodename, v->current);
continue;
}
/* If we're just learning the peer's node id, remember it */
if (peer->id && (v->nodeid == 0)) {
crm_trace("Learned ID %u for node %s", peer->id, v->nodename);
v->nodeid = peer->id;
}
/* If this is a private attribute, no update needs to be sent */
if (stand_alone || a->is_private) {
private_updates++;
continue;
}
/* If the peer is found, but its uuid is unknown, defer write */
if (peer->uuid == NULL) {
a->unknown_peer_uuids = true;
crm_notice("Cannot update %s[%s]=%s because peer UUID not known "
"(will retry if learned)",
a->id, v->nodename, v->current);
continue;
}
/* Add this value to status update XML */
crm_debug("Updating %s[%s]=%s (peer known as %s, UUID %s, ID %u/%u)",
a->id, v->nodename, v->current,
peer->uname, peer->uuid, peer->id, v->nodeid);
build_update_element(xml_top, a, peer->uuid, v->current);
cib_updates++;
/* Preservation of the attribute to transmit alert */
set_alert_attribute_value(alert_attribute_value, v);
free(v->requested);
v->requested = NULL;
if (v->current) {
v->requested = strdup(v->current);
} else {
/* Older attrd versions don't know about the cib_mixed_update
* flag so make sure it goes to the local cib which does
*/
cib__set_call_options(flags, crm_system_name,
cib_mixed_update|cib_scope_local);
}
}
if (private_updates) {
crm_info("Processed %d private change%s for %s, id=%s, set=%s",
private_updates, pcmk__plural_s(private_updates),
a->id, pcmk__s(a->uuid, "n/a"), pcmk__s(a->set_id, "n/a"));
}
if (cib_updates) {
crm_log_xml_trace(xml_top, __func__);
a->update = cib_internal_op(the_cib, PCMK__CIB_REQUEST_MODIFY, NULL,
XML_CIB_TAG_STATUS, xml_top, NULL, flags,
a->user);
crm_info("Sent CIB request %d with %d change%s for %s (id %s, set %s)",
a->update, cib_updates, pcmk__plural_s(cib_updates),
a->id, pcmk__s(a->uuid, "n/a"), pcmk__s(a->set_id, "n/a"));
the_cib->cmds->register_callback_full(the_cib, a->update,
CIB_OP_TIMEOUT_S, FALSE,
strdup(a->id),
"attrd_cib_callback",
attrd_cib_callback, free);
/* Transmit alert of the attribute */
send_alert_attributes_value(a, alert_attribute_value);
}
g_hash_table_destroy(alert_attribute_value);
free_xml(xml_top);
}
void
attrd_write_attributes(bool all, bool ignore_delay)
{
GHashTableIter iter;
attribute_t *a = NULL;
crm_debug("Writing out %s attributes", all? "all" : "changed");
g_hash_table_iter_init(&iter, attributes);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) & a)) {
if (!all && a->unknown_peer_uuids) {
// Try writing this attribute again, in case peer ID was learned
a->changed = true;
} else if (a->force_write) {
/* If the force_write flag is set, write the attribute. */
a->changed = true;
}
if(all || a->changed) {
/* When forced write flag is set, ignore delay. */
attrd_write_attribute(a, (a->force_write ? true : ignore_delay));
} else {
crm_trace("Skipping unchanged attribute %s", a->id);
}
}
}
void
attrd_write_or_elect_attribute(attribute_t *a)
{
if (attrd_election_won()) {
attrd_write_attribute(a, false);
} else {
attrd_start_election_if_needed();
}
}
diff --git a/daemons/attrd/pacemaker-attrd.c b/daemons/attrd/pacemaker-attrd.c
index bdd0913652..037825b313 100644
--- a/daemons/attrd/pacemaker-attrd.c
+++ b/daemons/attrd/pacemaker-attrd.c
@@ -1,359 +1,358 @@
/*
* Copyright 2013-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include "pacemaker-attrd.h"
#define SUMMARY "daemon for managing Pacemaker node attributes"
gboolean stand_alone = FALSE;
gchar **log_files = NULL;
static GOptionEntry entries[] = {
{ "stand-alone", 's', G_OPTION_FLAG_NONE, G_OPTION_ARG_NONE, &stand_alone,
"(Advanced use only) Run in stand-alone mode", NULL },
{ "logfile", 'l', G_OPTION_FLAG_NONE, G_OPTION_ARG_FILENAME_ARRAY,
&log_files, "Send logs to the additional named logfile", NULL },
{ NULL }
};
static pcmk__output_t *out = NULL;
static pcmk__supported_format_t formats[] = {
PCMK__SUPPORTED_FORMAT_NONE,
PCMK__SUPPORTED_FORMAT_TEXT,
PCMK__SUPPORTED_FORMAT_XML,
{ NULL, NULL, NULL }
};
lrmd_t *the_lrmd = NULL;
crm_cluster_t *attrd_cluster = NULL;
crm_trigger_t *attrd_config_read = NULL;
crm_exit_t attrd_exit_status = CRM_EX_OK;
static void
attrd_cib_destroy_cb(gpointer user_data)
{
cib_t *conn = user_data;
conn->cmds->signoff(conn); /* Ensure IPC is cleaned up */
if (attrd_shutting_down()) {
crm_info("Connection disconnection complete");
} else {
/* eventually this should trigger a reconnect, not a shutdown */
crm_crit("Lost connection to the CIB manager, shutting down");
attrd_exit_status = CRM_EX_DISCONNECT;
attrd_shutdown(0);
}
return;
}
static void
attrd_erase_cb(xmlNode *msg, int call_id, int rc, xmlNode *output,
void *user_data)
{
do_crm_log_unlikely((rc? LOG_NOTICE : LOG_DEBUG),
"Cleared transient attributes: %s "
CRM_XS " xpath=%s rc=%d",
pcmk_strerror(rc), (char *) user_data, rc);
}
#define XPATH_TRANSIENT "//node_state[@uname='%s']/" XML_TAG_TRANSIENT_NODEATTRS
/*!
* \internal
* \brief Wipe all transient attributes for this node from the CIB
*
* Clear any previous transient node attributes from the CIB. This is
* normally done by the DC's controller when this node leaves the cluster, but
* this handles the case where the node restarted so quickly that the
* cluster layer didn't notice.
*
* \todo If pacemaker-attrd respawns after crashing (see PCMK_respawned),
* ideally we'd skip this and sync our attributes from the writer.
* However, currently we reject any values for us that the writer has, in
* attrd_peer_update().
*/
static void
attrd_erase_attrs(void)
{
int call_id;
char *xpath = crm_strdup_printf(XPATH_TRANSIENT, attrd_cluster->uname);
crm_info("Clearing transient attributes from CIB " CRM_XS " xpath=%s",
xpath);
- call_id = the_cib->cmds->remove(the_cib, xpath, NULL,
- cib_quorum_override | cib_xpath);
+ call_id = the_cib->cmds->remove(the_cib, xpath, NULL, cib_xpath);
the_cib->cmds->register_callback_full(the_cib, call_id, 120, FALSE, xpath,
"attrd_erase_cb", attrd_erase_cb,
free);
}
static int
attrd_cib_connect(int max_retry)
{
static int attempts = 0;
int rc = -ENOTCONN;
the_cib = cib_new();
if (the_cib == NULL) {
return -ENOTCONN;
}
do {
if(attempts > 0) {
sleep(attempts);
}
attempts++;
crm_debug("Connection attempt %d to the CIB manager", attempts);
rc = the_cib->cmds->signon(the_cib, T_ATTRD, cib_command);
} while(rc != pcmk_ok && attempts < max_retry);
if (rc != pcmk_ok) {
crm_err("Connection to the CIB manager failed: %s " CRM_XS " rc=%d",
pcmk_strerror(rc), rc);
goto cleanup;
}
crm_debug("Connected to the CIB manager after %d attempts", attempts);
rc = the_cib->cmds->set_connection_dnotify(the_cib, attrd_cib_destroy_cb);
if (rc != pcmk_ok) {
crm_err("Could not set disconnection callback");
goto cleanup;
}
rc = the_cib->cmds->add_notify_callback(the_cib, T_CIB_REPLACE_NOTIFY, attrd_cib_replaced_cb);
if(rc != pcmk_ok) {
crm_err("Could not set CIB notification callback");
goto cleanup;
}
rc = the_cib->cmds->add_notify_callback(the_cib, T_CIB_DIFF_NOTIFY, attrd_cib_updated_cb);
if (rc != pcmk_ok) {
crm_err("Could not set CIB notification callback (update)");
goto cleanup;
}
return pcmk_ok;
cleanup:
cib__clean_up_connection(&the_cib);
return -ENOTCONN;
}
/*!
* \internal
* \brief Prepare the CIB after cluster is connected
*/
static void
attrd_cib_init(void)
{
// We have no attribute values in memory, wipe the CIB to match
attrd_erase_attrs();
// Set a trigger for reading the CIB (for the alerts section)
attrd_config_read = mainloop_add_trigger(G_PRIORITY_HIGH, attrd_read_options, NULL);
// Always read the CIB at start-up
mainloop_set_trigger(attrd_config_read);
}
static bool
ipc_already_running(void)
{
pcmk_ipc_api_t *old_instance = NULL;
int rc = pcmk_rc_ok;
rc = pcmk_new_ipc_api(&old_instance, pcmk_ipc_attrd);
if (rc != pcmk_rc_ok) {
return false;
}
rc = pcmk_connect_ipc(old_instance, pcmk_ipc_dispatch_sync);
if (rc != pcmk_rc_ok) {
pcmk_free_ipc_api(old_instance);
return false;
}
pcmk_disconnect_ipc(old_instance);
pcmk_free_ipc_api(old_instance);
return true;
}
static GOptionContext *
build_arg_context(pcmk__common_args_t *args, GOptionGroup **group) {
GOptionContext *context = NULL;
context = pcmk__build_arg_context(args, "text (default), xml", group, NULL);
pcmk__add_main_args(context, entries);
return context;
}
int
main(int argc, char **argv)
{
int rc = pcmk_rc_ok;
GError *error = NULL;
bool initialized = false;
GOptionGroup *output_group = NULL;
pcmk__common_args_t *args = pcmk__new_common_args(SUMMARY);
gchar **processed_args = pcmk__cmdline_preproc(argv, NULL);
GOptionContext *context = build_arg_context(args, &output_group);
attrd_init_mainloop();
crm_log_preinit(NULL, argc, argv);
mainloop_add_signal(SIGTERM, attrd_shutdown);
pcmk__register_formats(output_group, formats);
if (!g_option_context_parse_strv(context, &processed_args, &error)) {
attrd_exit_status = CRM_EX_USAGE;
goto done;
}
rc = pcmk__output_new(&out, args->output_ty, args->output_dest, argv);
if ((rc != pcmk_rc_ok) || (out == NULL)) {
attrd_exit_status = CRM_EX_ERROR;
g_set_error(&error, PCMK__EXITC_ERROR, attrd_exit_status,
"Error creating output format %s: %s",
args->output_ty, pcmk_rc_str(rc));
goto done;
}
if (args->version) {
out->version(out, false);
goto done;
}
// Open additional log files
pcmk__add_logfiles(log_files, out);
crm_log_init(T_ATTRD, LOG_INFO, TRUE, FALSE, argc, argv, FALSE);
crm_notice("Starting Pacemaker node attribute manager%s",
stand_alone ? " in standalone mode" : "");
if (ipc_already_running()) {
const char *msg = "pacemaker-attrd is already active, aborting startup";
attrd_exit_status = CRM_EX_OK;
g_set_error(&error, PCMK__EXITC_ERROR, attrd_exit_status, "%s", msg);
crm_err("%s", msg);
goto done;
}
initialized = true;
attributes = pcmk__strkey_table(NULL, attrd_free_attribute);
/* Connect to the CIB before connecting to the cluster or listening for IPC.
* This allows us to assume the CIB is connected whenever we process a
* cluster or IPC message (which also avoids start-up race conditions).
*/
if (!stand_alone) {
if (attrd_cib_connect(30) != pcmk_ok) {
attrd_exit_status = CRM_EX_FATAL;
g_set_error(&error, PCMK__EXITC_ERROR, attrd_exit_status,
"Could not connect to the CIB");
goto done;
}
crm_info("CIB connection active");
}
if (attrd_cluster_connect() != pcmk_ok) {
attrd_exit_status = CRM_EX_FATAL;
g_set_error(&error, PCMK__EXITC_ERROR, attrd_exit_status,
"Could not connect to the cluster");
goto done;
}
crm_info("Cluster connection active");
// Initialization that requires the cluster to be connected
attrd_election_init();
if (!stand_alone) {
attrd_cib_init();
}
/* Set a private attribute for ourselves with the protocol version we
* support. This lets all nodes determine the minimum supported version
* across all nodes. It also ensures that the writer learns our node name,
* so it can send our attributes to the CIB.
*/
attrd_broadcast_protocol();
attrd_init_ipc();
crm_notice("Pacemaker node attribute manager successfully started and accepting connections");
attrd_run_mainloop();
done:
if (initialized) {
crm_info("Shutting down attribute manager");
attrd_election_fini();
attrd_ipc_fini();
attrd_lrmd_disconnect();
if (!stand_alone) {
attrd_cib_disconnect();
}
attrd_free_waitlist();
pcmk_cluster_free(attrd_cluster);
g_hash_table_destroy(attributes);
}
g_strfreev(processed_args);
pcmk__free_arg_context(context);
g_strfreev(log_files);
pcmk__output_and_clear_error(&error, out);
if (out != NULL) {
out->finish(out, attrd_exit_status, true, NULL);
pcmk__output_free(out);
}
pcmk__unregister_formats();
crm_exit(attrd_exit_status);
}
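attrd_erase_attrs() above likewise drops cib_quorum_override, passing only cib_xpath, and hands ownership of the xpath string to the callback layer, which frees it via the destructor argument. A compressed sketch of that ownership pattern, assuming the libcib API used above (erase_done and erase_transient_attrs are illustrative names):

#include <crm/cib.h>            // cib_t and the cmds API used above (assumed)
#include <crm/common/util.h>    // crm_strdup_printf() (assumed header)

#define XPATH_TRANSIENT "//node_state[@uname='%s']/transient_attributes"

/* Sketch: fire the async removal, then register a callback with a 120s
 * timeout. The xpath string is owned by the callback table from here on
 * (freed by the registered destructor), so the caller must not free it.
 */
static void
erase_done(xmlNode *msg, int call_id, int rc, xmlNode *output, void *user_data)
{
    const char *xpath = user_data;   // still owned by the callback table
    crm_info("Cleared transient attributes: rc=%d xpath=%s", rc, xpath);
}

static void
erase_transient_attrs(cib_t *cib, const char *node_name)
{
    char *xpath = crm_strdup_printf(XPATH_TRANSIENT, node_name);
    int call_id = cib->cmds->remove(cib, xpath, NULL, cib_xpath);

    cib->cmds->register_callback_full(cib, call_id, 120, FALSE, xpath,
                                      "erase_done", erase_done, free);
}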
diff --git a/daemons/based/based_callbacks.c b/daemons/based/based_callbacks.c
index c558931c4d..3726caa0b1 100644
--- a/daemons/based/based_callbacks.c
+++ b/daemons/based/based_callbacks.c
@@ -1,1696 +1,1696 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include // uint32_t, uint64_t, UINT64_C()
#include
#include
#include // PRIu64
#include
#include
#include
#include
#include
#include
#include
#define EXIT_ESCALATION_MS 10000
#define OUR_NODENAME (stand_alone? "localhost" : crm_cluster->uname)
static unsigned long cib_local_bcast_num = 0;
typedef struct cib_local_notify_s {
xmlNode *notify_src;
char *client_id;
gboolean from_peer;
gboolean sync_reply;
} cib_local_notify_t;
int next_client_id = 0;
gboolean legacy_mode = FALSE;
qb_ipcs_service_t *ipcs_ro = NULL;
qb_ipcs_service_t *ipcs_rw = NULL;
qb_ipcs_service_t *ipcs_shm = NULL;
static void cib_process_request(xmlNode *request, gboolean privileged,
const pcmk__client_t *cib_client);
static int cib_process_command(xmlNode *request, xmlNode **reply,
xmlNode **cib_diff, gboolean privileged);
static gboolean cib_common_callback(qb_ipcs_connection_t *c, void *data,
size_t size, gboolean privileged);
gboolean
cib_legacy_mode(void)
{
return legacy_mode;
}
static int32_t
cib_ipc_accept(qb_ipcs_connection_t * c, uid_t uid, gid_t gid)
{
if (cib_shutdown_flag) {
crm_info("Ignoring new IPC client [%d] during shutdown",
pcmk__client_pid(c));
return -EPERM;
}
if (pcmk__new_client(c, uid, gid) == NULL) {
return -EIO;
}
return 0;
}
static int32_t
cib_ipc_dispatch_rw(qb_ipcs_connection_t * c, void *data, size_t size)
{
pcmk__client_t *client = pcmk__find_client(c);
crm_trace("%p message from %s", c, client->id);
return cib_common_callback(c, data, size, TRUE);
}
static int32_t
cib_ipc_dispatch_ro(qb_ipcs_connection_t * c, void *data, size_t size)
{
pcmk__client_t *client = pcmk__find_client(c);
crm_trace("%p message from %s", c, client->id);
return cib_common_callback(c, data, size, FALSE);
}
/* Error code means? */
static int32_t
cib_ipc_closed(qb_ipcs_connection_t * c)
{
pcmk__client_t *client = pcmk__find_client(c);
if (client == NULL) {
return 0;
}
crm_trace("Connection %p", c);
pcmk__free_client(client);
return 0;
}
static void
cib_ipc_destroy(qb_ipcs_connection_t * c)
{
crm_trace("Connection %p", c);
cib_ipc_closed(c);
if (cib_shutdown_flag) {
cib_shutdown(0);
}
}
struct qb_ipcs_service_handlers ipc_ro_callbacks = {
.connection_accept = cib_ipc_accept,
.connection_created = NULL,
.msg_process = cib_ipc_dispatch_ro,
.connection_closed = cib_ipc_closed,
.connection_destroyed = cib_ipc_destroy
};
struct qb_ipcs_service_handlers ipc_rw_callbacks = {
.connection_accept = cib_ipc_accept,
.connection_created = NULL,
.msg_process = cib_ipc_dispatch_rw,
.connection_closed = cib_ipc_closed,
.connection_destroyed = cib_ipc_destroy
};
void
cib_common_callback_worker(uint32_t id, uint32_t flags, xmlNode * op_request,
pcmk__client_t *cib_client, gboolean privileged)
{
const char *op = crm_element_value(op_request, F_CIB_OPERATION);
if (pcmk__str_eq(op, CRM_OP_REGISTER, pcmk__str_none)) {
if (flags & crm_ipc_client_response) {
xmlNode *ack = create_xml_node(NULL, __func__);
crm_xml_add(ack, F_CIB_OPERATION, CRM_OP_REGISTER);
crm_xml_add(ack, F_CIB_CLIENTID, cib_client->id);
pcmk__ipc_send_xml(cib_client, id, ack, flags);
cib_client->request_id = 0;
free_xml(ack);
}
return;
} else if (pcmk__str_eq(op, T_CIB_NOTIFY, pcmk__str_none)) {
/* Update the notify filters for this client */
int on_off = 0;
crm_exit_t status = CRM_EX_OK;
uint64_t bit = UINT64_C(0);
const char *type = crm_element_value(op_request, F_CIB_NOTIFY_TYPE);
crm_element_value_int(op_request, F_CIB_NOTIFY_ACTIVATE, &on_off);
crm_debug("Setting %s callbacks %s for client %s",
type, (on_off? "on" : "off"), pcmk__client_name(cib_client));
if (pcmk__str_eq(type, T_CIB_POST_NOTIFY, pcmk__str_casei)) {
bit = cib_notify_post;
} else if (pcmk__str_eq(type, T_CIB_PRE_NOTIFY, pcmk__str_casei)) {
bit = cib_notify_pre;
} else if (pcmk__str_eq(type, T_CIB_UPDATE_CONFIRM, pcmk__str_casei)) {
bit = cib_notify_confirm;
} else if (pcmk__str_eq(type, T_CIB_DIFF_NOTIFY, pcmk__str_casei)) {
bit = cib_notify_diff;
} else if (pcmk__str_eq(type, T_CIB_REPLACE_NOTIFY, pcmk__str_casei)) {
bit = cib_notify_replace;
} else {
status = CRM_EX_INVALID_PARAM;
}
if (bit != 0) {
if (on_off) {
pcmk__set_client_flags(cib_client, bit);
} else {
pcmk__clear_client_flags(cib_client, bit);
}
}
pcmk__ipc_send_ack(cib_client, id, flags, "ack", NULL, status);
return;
}
cib_process_request(op_request, privileged, cib_client);
}
int32_t
cib_common_callback(qb_ipcs_connection_t * c, void *data, size_t size, gboolean privileged)
{
uint32_t id = 0;
uint32_t flags = 0;
int call_options = 0;
pcmk__client_t *cib_client = pcmk__find_client(c);
xmlNode *op_request = pcmk__client_data2xml(cib_client, data, &id, &flags);
if (op_request) {
crm_element_value_int(op_request, F_CIB_CALLOPTS, &call_options);
}
if (op_request == NULL) {
crm_trace("Invalid message from %p", c);
pcmk__ipc_send_ack(cib_client, id, flags, "nack", NULL, CRM_EX_PROTOCOL);
return 0;
} else if(cib_client == NULL) {
crm_trace("Invalid client %p", c);
return 0;
}
if (pcmk_is_set(call_options, cib_sync_call)) {
CRM_LOG_ASSERT(flags & crm_ipc_client_response);
CRM_LOG_ASSERT(cib_client->request_id == 0); /* This means the client has two synchronous events in-flight */
cib_client->request_id = id; /* Reply only to the last one */
}
if (cib_client->name == NULL) {
const char *value = crm_element_value(op_request, F_CIB_CLIENTNAME);
if (value == NULL) {
cib_client->name = pcmk__itoa(cib_client->pid);
} else {
cib_client->name = strdup(value);
if (crm_is_daemon_name(value)) {
pcmk__set_client_flags(cib_client, cib_is_daemon);
}
}
}
/* Allow cluster daemons more leeway before being evicted */
if (pcmk_is_set(cib_client->flags, cib_is_daemon)) {
const char *qmax = cib_config_lookup("cluster-ipc-limit");
if (pcmk__set_client_queue_max(cib_client, qmax)) {
crm_trace("IPC threshold for client %s[%u] is now %u",
pcmk__client_name(cib_client), cib_client->pid,
cib_client->queue_max);
}
}
crm_xml_add(op_request, F_CIB_CLIENTID, cib_client->id);
crm_xml_add(op_request, F_CIB_CLIENTNAME, cib_client->name);
CRM_LOG_ASSERT(cib_client->user != NULL);
pcmk__update_acl_user(op_request, F_CIB_USER, cib_client->user);
cib_common_callback_worker(id, flags, op_request, cib_client, privileged);
free_xml(op_request);
return 0;
}
static uint64_t ping_seq = 0;
static char *ping_digest = NULL;
static bool ping_modified_since = FALSE;
static gboolean
cib_digester_cb(gpointer data)
{
if (based_is_primary) {
char buffer[32];
xmlNode *ping = create_xml_node(NULL, "ping");
ping_seq++;
free(ping_digest);
ping_digest = NULL;
ping_modified_since = FALSE;
snprintf(buffer, 32, "%" PRIu64, ping_seq);
crm_trace("Requesting peer digests (%s)", buffer);
crm_xml_add(ping, F_TYPE, "cib");
crm_xml_add(ping, F_CIB_OPERATION, CRM_OP_PING);
crm_xml_add(ping, F_CIB_PING_ID, buffer);
crm_xml_add(ping, XML_ATTR_CRM_VERSION, CRM_FEATURE_SET);
send_cluster_message(NULL, crm_msg_cib, ping, TRUE);
free_xml(ping);
}
return FALSE;
}
static void
process_ping_reply(xmlNode *reply)
{
uint64_t seq = 0;
const char *host = crm_element_value(reply, F_ORIG);
xmlNode *pong = get_message_xml(reply, F_CIB_CALLDATA);
const char *seq_s = crm_element_value(pong, F_CIB_PING_ID);
const char *digest = crm_element_value(pong, XML_ATTR_DIGEST);
if (seq_s == NULL) {
crm_debug("Ignoring ping reply with no " F_CIB_PING_ID);
return;
} else {
long long seq_ll;
if (pcmk__scan_ll(seq_s, &seq_ll, 0LL) != pcmk_rc_ok) {
return;
}
seq = (uint64_t) seq_ll;
}
if(digest == NULL) {
crm_trace("Ignoring ping reply %s from %s with no digest", seq_s, host);
} else if(seq != ping_seq) {
crm_trace("Ignoring out of sequence ping reply %s from %s", seq_s, host);
} else if(ping_modified_since) {
crm_trace("Ignoring ping reply %s from %s: cib updated since", seq_s, host);
} else {
const char *version = crm_element_value(pong, XML_ATTR_CRM_VERSION);
if(ping_digest == NULL) {
crm_trace("Calculating new digest");
ping_digest = calculate_xml_versioned_digest(the_cib, FALSE, TRUE, version);
}
crm_trace("Processing ping reply %s from %s (%s)", seq_s, host, digest);
if (!pcmk__str_eq(ping_digest, digest, pcmk__str_casei)) {
xmlNode *remote_cib = get_message_xml(pong, F_CIB_CALLDATA);
crm_notice("Local CIB %s.%s.%s.%s differs from %s: %s.%s.%s.%s %p",
crm_element_value(the_cib, XML_ATTR_GENERATION_ADMIN),
crm_element_value(the_cib, XML_ATTR_GENERATION),
crm_element_value(the_cib, XML_ATTR_NUMUPDATES),
ping_digest, host,
remote_cib?crm_element_value(remote_cib, XML_ATTR_GENERATION_ADMIN):"_",
remote_cib?crm_element_value(remote_cib, XML_ATTR_GENERATION):"_",
remote_cib?crm_element_value(remote_cib, XML_ATTR_NUMUPDATES):"_",
digest, remote_cib);
if(remote_cib && remote_cib->children) {
// Additional debug
xml_calculate_changes(the_cib, remote_cib);
pcmk__output_set_log_level(logger_out, LOG_INFO);
pcmk__xml_show_changes(logger_out, remote_cib);
crm_trace("End of differences");
}
free_xml(remote_cib);
sync_our_cib(reply, FALSE);
}
}
}
static void
do_local_notify(xmlNode * notify_src, const char *client_id,
gboolean sync_reply, gboolean from_peer)
{
int rid = 0;
int call_id = 0;
pcmk__client_t *client_obj = NULL;
CRM_ASSERT(notify_src && client_id);
crm_element_value_int(notify_src, F_CIB_CALLID, &call_id);
client_obj = pcmk__find_client_by_id(client_id);
if (client_obj == NULL) {
crm_debug("Could not send response %d: client %s not found",
call_id, client_id);
return;
}
if (sync_reply) {
if (client_obj->ipcs) {
CRM_LOG_ASSERT(client_obj->request_id);
rid = client_obj->request_id;
client_obj->request_id = 0;
crm_trace("Sending response %d to client %s%s",
rid, pcmk__client_name(client_obj),
(from_peer? " (originator of delegated request)" : ""));
} else {
crm_trace("Sending response (call %d) to client %s%s",
call_id, pcmk__client_name(client_obj),
(from_peer? " (originator of delegated request)" : ""));
}
} else {
crm_trace("Sending event %d to client %s%s",
call_id, pcmk__client_name(client_obj),
(from_peer? " (originator of delegated request)" : ""));
}
switch (PCMK__CLIENT_TYPE(client_obj)) {
case pcmk__client_ipc:
{
int rc = pcmk__ipc_send_xml(client_obj, rid, notify_src,
(sync_reply? crm_ipc_flags_none
: crm_ipc_server_event));
if (rc != pcmk_rc_ok) {
crm_warn("%s reply to client %s failed: %s " CRM_XS " rc=%d",
(sync_reply? "Synchronous" : "Asynchronous"),
pcmk__client_name(client_obj), pcmk_rc_str(rc),
rc);
}
}
break;
#ifdef HAVE_GNUTLS_GNUTLS_H
case pcmk__client_tls:
#endif
case pcmk__client_tcp:
pcmk__remote_send_xml(client_obj->remote, notify_src);
break;
default:
crm_err("Unknown transport for client %s "
CRM_XS " flags=%#016" PRIx64,
pcmk__client_name(client_obj), client_obj->flags);
}
}
static void
local_notify_destroy_callback(gpointer data)
{
cib_local_notify_t *notify = data;
free_xml(notify->notify_src);
free(notify->client_id);
free(notify);
}
static void
check_local_notify(int bcast_id)
{
cib_local_notify_t *notify = NULL;
if (!local_notify_queue) {
return;
}
notify = pcmk__intkey_table_lookup(local_notify_queue, bcast_id);
if (notify) {
do_local_notify(notify->notify_src, notify->client_id, notify->sync_reply,
notify->from_peer);
pcmk__intkey_table_remove(local_notify_queue, bcast_id);
}
}
static void
queue_local_notify(xmlNode * notify_src, const char *client_id, gboolean sync_reply,
gboolean from_peer)
{
cib_local_notify_t *notify = calloc(1, sizeof(cib_local_notify_t));
notify->notify_src = notify_src;
notify->client_id = strdup(client_id);
notify->sync_reply = sync_reply;
notify->from_peer = from_peer;
if (!local_notify_queue) {
local_notify_queue = pcmk__intkey_table(local_notify_destroy_callback);
}
pcmk__intkey_table_insert(local_notify_queue, cib_local_bcast_num, notify);
// cppcheck doesn't know notify will get freed when hash table is destroyed
// cppcheck-suppress memleak
}
static void
parse_local_options_v1(const pcmk__client_t *cib_client, int call_type,
int call_options, const char *host, const char *op,
gboolean *local_notify, gboolean *needs_reply,
gboolean *process, gboolean *needs_forward)
{
if (cib_op_modifies(call_type)
&& !(call_options & cib_inhibit_bcast)) {
/* we need to send an update anyway */
*needs_reply = TRUE;
} else {
*needs_reply = FALSE;
}
if (host == NULL && (call_options & cib_scope_local)) {
crm_trace("Processing locally scoped %s op from client %s",
op, pcmk__client_name(cib_client));
*local_notify = TRUE;
} else if ((host == NULL) && based_is_primary) {
crm_trace("Processing %s op locally from client %s as primary",
op, pcmk__client_name(cib_client));
*local_notify = TRUE;
} else if (pcmk__str_eq(host, OUR_NODENAME, pcmk__str_casei)) {
crm_trace("Processing locally addressed %s op from client %s",
op, pcmk__client_name(cib_client));
*local_notify = TRUE;
} else if (stand_alone) {
*needs_forward = FALSE;
*local_notify = TRUE;
*process = TRUE;
} else {
crm_trace("%s op from %s needs to be forwarded to client %s",
op, pcmk__client_name(cib_client),
pcmk__s(host, "the primary instance"));
*needs_forward = TRUE;
*process = FALSE;
}
}
static void
parse_local_options_v2(const pcmk__client_t *cib_client, int call_type,
int call_options, const char *host, const char *op,
gboolean *local_notify, gboolean *needs_reply,
gboolean *process, gboolean *needs_forward)
{
if (cib_op_modifies(call_type)) {
if (pcmk__str_any_of(op, PCMK__CIB_REQUEST_PRIMARY,
PCMK__CIB_REQUEST_SECONDARY, NULL)) {
/* Always handle these locally */
*process = TRUE;
*needs_reply = FALSE;
*local_notify = TRUE;
*needs_forward = FALSE;
return;
} else {
/* Redirect all other updates via CPG */
*needs_reply = TRUE;
*needs_forward = TRUE;
*process = FALSE;
crm_trace("%s op from %s needs to be forwarded to client %s",
op, pcmk__client_name(cib_client),
pcmk__s(host, "the primary instance"));
return;
}
}
*process = TRUE;
*needs_reply = FALSE;
*local_notify = TRUE;
*needs_forward = FALSE;
if (stand_alone) {
crm_trace("Processing %s op from client %s (stand-alone)",
op, pcmk__client_name(cib_client));
} else if (host == NULL) {
crm_trace("Processing unaddressed %s op from client %s",
op, pcmk__client_name(cib_client));
} else if (pcmk__str_eq(host, OUR_NODENAME, pcmk__str_casei)) {
crm_trace("Processing locally addressed %s op from client %s",
op, pcmk__client_name(cib_client));
} else {
crm_trace("%s op from %s needs to be forwarded to client %s",
op, pcmk__client_name(cib_client), host);
*needs_forward = TRUE;
*process = FALSE;
}
}
static void
parse_local_options(const pcmk__client_t *cib_client, int call_type,
int call_options, const char *host, const char *op,
gboolean *local_notify, gboolean *needs_reply,
gboolean *process, gboolean *needs_forward)
{
if(cib_legacy_mode()) {
parse_local_options_v1(cib_client, call_type, call_options, host,
op, local_notify, needs_reply, process, needs_forward);
} else {
parse_local_options_v2(cib_client, call_type, call_options, host,
op, local_notify, needs_reply, process, needs_forward);
}
}
static gboolean
parse_peer_options_v1(int call_type, xmlNode * request,
gboolean * local_notify, gboolean * needs_reply, gboolean * process,
gboolean * needs_forward)
{
const char *op = NULL;
const char *host = NULL;
const char *delegated = NULL;
const char *originator = crm_element_value(request, F_ORIG);
const char *reply_to = crm_element_value(request, F_CIB_ISREPLY);
gboolean is_reply = pcmk__str_eq(reply_to, OUR_NODENAME, pcmk__str_casei);
if (pcmk__xe_attr_is_true(request, F_CIB_GLOBAL_UPDATE)) {
*needs_reply = FALSE;
if (is_reply) {
*local_notify = TRUE;
crm_trace("Processing global/peer update from %s"
" that originated from us", originator);
} else {
crm_trace("Processing global/peer update from %s", originator);
}
return TRUE;
}
op = crm_element_value(request, F_CIB_OPERATION);
crm_trace("Processing %s request sent by %s", op, originator);
if (pcmk__str_eq(op, PCMK__CIB_REQUEST_SHUTDOWN, pcmk__str_none)) {
/* Always process these */
*local_notify = FALSE;
if (reply_to == NULL || is_reply) {
*process = TRUE;
}
if (is_reply) {
*needs_reply = FALSE;
}
return *process;
}
if (is_reply && pcmk__str_eq(op, CRM_OP_PING, pcmk__str_casei)) {
process_ping_reply(request);
return FALSE;
}
if (is_reply) {
crm_trace("Forward reply sent from %s to local clients", originator);
*process = FALSE;
*needs_reply = FALSE;
*local_notify = TRUE;
return TRUE;
}
host = crm_element_value(request, F_CIB_HOST);
if (pcmk__str_eq(host, OUR_NODENAME, pcmk__str_casei)) {
crm_trace("Processing %s request sent to us from %s", op, originator);
return TRUE;
} else if(is_reply == FALSE && pcmk__str_eq(op, CRM_OP_PING, pcmk__str_casei)) {
crm_trace("Processing %s request sent to %s by %s", op, host?host:"everyone", originator);
*needs_reply = TRUE;
return TRUE;
} else if ((host == NULL) && based_is_primary) {
crm_trace("Processing %s request sent to primary instance from %s",
op, originator);
return TRUE;
}
delegated = crm_element_value(request, F_CIB_DELEGATED);
if (delegated != NULL) {
crm_trace("Ignoring message for primary instance");
} else if (host != NULL) {
/* this is for a specific instance and we're not it */
crm_trace("Ignoring msg for instance on %s", host);
} else if ((reply_to == NULL) && !based_is_primary) {
// This is for the primary instance, and we're not it
crm_trace("Ignoring reply for primary instance");
} else if (pcmk__str_eq(op, PCMK__CIB_REQUEST_SHUTDOWN, pcmk__str_none)) {
        if (reply_to != NULL) {
            crm_debug("Processing %s reply from %s", op, originator);
            *needs_reply = FALSE;
        } else {
            crm_debug("Processing %s request from %s", op, originator);
        }
return TRUE;
} else {
crm_err("Nothing for us to do?");
crm_log_xml_err(request, "Peer[inbound]");
}
return FALSE;
}
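/*!
 * \internal
 * \brief Decide how to handle a peer message in current (v2) mode
 *
 * \return TRUE if the message should be processed further, else FALSE
 */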
static gboolean
parse_peer_options_v2(int call_type, xmlNode * request,
gboolean * local_notify, gboolean * needs_reply, gboolean * process,
gboolean * needs_forward)
{
const char *host = NULL;
const char *delegated = crm_element_value(request, F_CIB_DELEGATED);
const char *op = crm_element_value(request, F_CIB_OPERATION);
const char *originator = crm_element_value(request, F_ORIG);
const char *reply_to = crm_element_value(request, F_CIB_ISREPLY);
gboolean is_reply = pcmk__str_eq(reply_to, OUR_NODENAME, pcmk__str_casei);
if (pcmk__str_eq(op, PCMK__CIB_REQUEST_REPLACE, pcmk__str_none)) {
/* sync_our_cib() sets F_CIB_ISREPLY */
if (reply_to) {
delegated = reply_to;
}
goto skip_is_reply;
} else if (pcmk__str_eq(op, PCMK__CIB_REQUEST_SYNC_TO_ALL,
pcmk__str_none)) {
// Nothing to do
} else if (is_reply && pcmk__str_eq(op, CRM_OP_PING, pcmk__str_casei)) {
process_ping_reply(request);
return FALSE;
} else if (pcmk__str_eq(op, PCMK__CIB_REQUEST_UPGRADE, pcmk__str_none)) {
/* Only the DC (node with the oldest software) should process
* this operation if F_CIB_SCHEMA_MAX is unset
*
* If the DC is happy it will then send out another
* PCMK__CIB_REQUEST_UPGRADE which will tell all nodes to do the actual
* upgrade.
*
* Except this time F_CIB_SCHEMA_MAX will be set which puts a
* limit on how far newer nodes will go
*/
const char *max = crm_element_value(request, F_CIB_SCHEMA_MAX);
const char *upgrade_rc = crm_element_value(request, F_CIB_UPGRADE_RC);
crm_trace("Parsing %s operation%s for %s with max=%s and upgrade_rc=%s",
op, (is_reply? " reply" : ""),
(based_is_primary? "primary" : "secondary"),
(max? max : "none"), (upgrade_rc? upgrade_rc : "none"));
if (upgrade_rc != NULL) {
// Our upgrade request was rejected by DC, notify clients of result
crm_xml_add(request, F_CIB_RC, upgrade_rc);
} else if ((max == NULL) && based_is_primary) {
/* We are the DC, check if this upgrade is allowed */
goto skip_is_reply;
} else if(max) {
/* Ok, go ahead and upgrade to 'max' */
goto skip_is_reply;
} else {
// Ignore broadcast client requests when we're not DC
return FALSE;
}
} else if (pcmk__xe_attr_is_true(request, F_CIB_GLOBAL_UPDATE)) {
crm_info("Detected legacy %s global update from %s", op, originator);
send_sync_request(NULL);
legacy_mode = TRUE;
return FALSE;
} else if (is_reply && cib_op_modifies(call_type)) {
crm_trace("Ignoring legacy %s reply sent from %s to local clients", op, originator);
return FALSE;
} else if (pcmk__str_eq(op, PCMK__CIB_REQUEST_SHUTDOWN, pcmk__str_none)) {
/* Legacy handling */
crm_debug("Legacy handling of %s message from %s", op, originator);
*local_notify = FALSE;
if (reply_to == NULL) {
*process = TRUE;
}
return *process;
}
if(is_reply) {
crm_trace("Handling %s reply sent from %s to local clients", op, originator);
*process = FALSE;
*needs_reply = FALSE;
*local_notify = TRUE;
return TRUE;
}
skip_is_reply:
*process = TRUE;
*needs_reply = FALSE;
*local_notify = pcmk__str_eq(delegated, OUR_NODENAME, pcmk__str_casei);
host = crm_element_value(request, F_CIB_HOST);
if (pcmk__str_eq(host, OUR_NODENAME, pcmk__str_casei)) {
crm_trace("Processing %s request sent to us from %s", op, originator);
*needs_reply = TRUE;
return TRUE;
} else if (host != NULL) {
/* this is for a specific instance and we're not it */
crm_trace("Ignoring %s operation for instance on %s", op, host);
return FALSE;
} else if(is_reply == FALSE && pcmk__str_eq(op, CRM_OP_PING, pcmk__str_casei)) {
*needs_reply = TRUE;
}
crm_trace("Processing %s request sent to everyone by %s/%s on %s %s", op,
crm_element_value(request, F_CIB_CLIENTNAME),
crm_element_value(request, F_CIB_CALLID),
originator, (*local_notify)?"(notify)":"");
return TRUE;
}
static gboolean
parse_peer_options(int call_type, xmlNode * request,
gboolean * local_notify, gboolean * needs_reply, gboolean * process,
gboolean * needs_forward)
{
/* TODO: What happens when an update comes in after node A
* requests the CIB from node B, but before it gets the reply (and
* sends out the replace operation)
*/
if(cib_legacy_mode()) {
return parse_peer_options_v1(
call_type, request, local_notify, needs_reply, process, needs_forward);
} else {
return parse_peer_options_v2(
call_type, request, local_notify, needs_reply, process, needs_forward);
}
}
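/*!
 * \internal
 * \brief Forward a client request via the cluster layer
 *
 * Sends the request to the host named in F_CIB_HOST if set, otherwise to
 * all nodes (to reach the primary instance), temporarily marking the
 * request as delegated by this node.
 */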
static void
forward_request(xmlNode *request, int call_options)
{
const char *op = crm_element_value(request, F_CIB_OPERATION);
const char *host = crm_element_value(request, F_CIB_HOST);
crm_xml_add(request, F_CIB_DELEGATED, OUR_NODENAME);
if (host != NULL) {
crm_trace("Forwarding %s op to %s", op, host);
send_cluster_message(crm_get_peer(0, host), crm_msg_cib, request, FALSE);
} else {
crm_trace("Forwarding %s op to primary instance", op);
send_cluster_message(NULL, crm_msg_cib, request, FALSE);
}
/* Return the request to its original state */
xml_remove_prop(request, F_CIB_DELEGATED);
if (call_options & cib_discard_reply) {
crm_trace("Client not interested in reply");
}
}
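/*!
 * \internal
 * \brief Send a CIB message to peers
 *
 * If \p broadcast is true, attach the update diff and broadcast \p msg to
 * all nodes; otherwise send it as a reply to \p originator alone.
 *
 * \return Result of send_cluster_message(), or FALSE if nothing was sent
 */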
static gboolean
send_peer_reply(xmlNode * msg, xmlNode * result_diff, const char *originator, gboolean broadcast)
{
CRM_ASSERT(msg != NULL);
if (broadcast) {
/* this (successful) call modified the CIB _and_ the
* change needs to be broadcast...
* send via HA to other nodes
*/
int diff_add_updates = 0;
int diff_add_epoch = 0;
int diff_add_admin_epoch = 0;
int diff_del_updates = 0;
int diff_del_epoch = 0;
int diff_del_admin_epoch = 0;
const char *digest = NULL;
int format = 1;
CRM_LOG_ASSERT(result_diff != NULL);
digest = crm_element_value(result_diff, XML_ATTR_DIGEST);
crm_element_value_int(result_diff, "format", &format);
cib_diff_version_details(result_diff,
&diff_add_admin_epoch, &diff_add_epoch, &diff_add_updates,
&diff_del_admin_epoch, &diff_del_epoch, &diff_del_updates);
crm_trace("Sending update diff %d.%d.%d -> %d.%d.%d %s",
diff_del_admin_epoch, diff_del_epoch, diff_del_updates,
diff_add_admin_epoch, diff_add_epoch, diff_add_updates, digest);
crm_xml_add(msg, F_CIB_ISREPLY, originator);
pcmk__xe_set_bool_attr(msg, F_CIB_GLOBAL_UPDATE, true);
crm_xml_add(msg, F_CIB_OPERATION, PCMK__CIB_REQUEST_APPLY_PATCH);
crm_xml_add(msg, F_CIB_USER, CRM_DAEMON_USER);
if (format == 1) {
CRM_ASSERT(digest != NULL);
}
add_message_xml(msg, F_CIB_UPDATE_DIFF, result_diff);
crm_log_xml_explicit(msg, "copy");
return send_cluster_message(NULL, crm_msg_cib, msg, TRUE);
} else if (originator != NULL) {
/* send reply via HA to originating node */
crm_trace("Sending request result to %s only", originator);
crm_xml_add(msg, F_CIB_ISREPLY, originator);
return send_cluster_message(crm_get_peer(0, originator), crm_msg_cib, msg, FALSE);
}
return FALSE;
}
/*!
* \internal
* \brief Handle an IPC or CPG message containing a request
*
* \param[in,out] request Request XML
* \param[in] privileged Whether privileged commands may be run
* (see cib_server_ops[] definition)
* \param[in] cib_client IPC client that sent request (or NULL if CPG)
*/
static void
cib_process_request(xmlNode *request, gboolean privileged,
const pcmk__client_t *cib_client)
{
int call_type = 0;
int call_options = 0;
gboolean process = TRUE; // Whether to process request locally now
gboolean is_update = TRUE; // Whether request would modify CIB
gboolean needs_reply = TRUE; // Whether to build a reply
gboolean local_notify = FALSE; // Whether to notify (local) requester
gboolean needs_forward = FALSE; // Whether to forward request somewhere else
xmlNode *op_reply = NULL;
xmlNode *result_diff = NULL;
int rc = pcmk_ok;
const char *op = crm_element_value(request, F_CIB_OPERATION);
const char *originator = crm_element_value(request, F_ORIG);
const char *host = crm_element_value(request, F_CIB_HOST);
const char *target = NULL;
const char *call_id = crm_element_value(request, F_CIB_CALLID);
const char *client_id = crm_element_value(request, F_CIB_CLIENTID);
const char *client_name = crm_element_value(request, F_CIB_CLIENTNAME);
const char *reply_to = crm_element_value(request, F_CIB_ISREPLY);
crm_element_value_int(request, F_CIB_CALLOPTS, &call_options);
if ((host != NULL) && (*host == '\0')) {
host = NULL;
}
if (host) {
target = host;
} else if (call_options & cib_scope_local) {
target = "local host";
} else {
target = "primary";
}
if (cib_client == NULL) {
crm_trace("Processing peer %s operation from %s/%s on %s intended for %s (reply=%s)",
op, client_name, call_id, originator, target, reply_to);
} else {
crm_xml_add(request, F_ORIG, OUR_NODENAME);
crm_trace("Processing local %s operation from %s/%s intended for %s", op, client_name, call_id, target);
}
rc = cib_get_operation_id(op, &call_type);
if (rc != pcmk_ok) {
/* TODO: construct error reply? */
crm_err("Pre-processing of command failed: %s", pcmk_strerror(rc));
return;
}
if (cib_client != NULL) {
parse_local_options(cib_client, call_type, call_options, host, op,
&local_notify, &needs_reply, &process, &needs_forward);
} else if (parse_peer_options(call_type, request, &local_notify,
&needs_reply, &process, &needs_forward) == FALSE) {
return;
}
is_update = cib_op_modifies(call_type);
if (call_options & cib_discard_reply) {
/* If the request will modify the CIB, and we are in legacy mode, we
* need to build a reply so we can broadcast a diff, even if the
* requester doesn't want one.
*/
needs_reply = is_update && cib_legacy_mode();
local_notify = FALSE;
}
if (needs_forward) {
const char *section = crm_element_value(request, F_CIB_SECTION);
int log_level = LOG_INFO;
if (pcmk__str_eq(op, PCMK__CIB_REQUEST_NOOP, pcmk__str_none)) {
log_level = LOG_DEBUG;
}
do_crm_log(log_level,
"Forwarding %s operation for section %s to %s (origin=%s/%s/%s)",
op,
section ? section : "'all'",
pcmk__s(host, (cib_legacy_mode() ? "primary" : "all")),
originator ? originator : "local",
client_name, call_id);
forward_request(request, call_options);
return;
}
if (cib_status != pcmk_ok) {
const char *call = crm_element_value(request, F_CIB_CALLID);
rc = cib_status;
crm_err("Operation ignored, cluster configuration is invalid."
" Please repair and restart: %s", pcmk_strerror(cib_status));
op_reply = create_xml_node(NULL, "cib-reply");
crm_xml_add(op_reply, F_TYPE, T_CIB);
crm_xml_add(op_reply, F_CIB_OPERATION, op);
crm_xml_add(op_reply, F_CIB_CALLID, call);
crm_xml_add(op_reply, F_CIB_CLIENTID, client_id);
crm_xml_add_int(op_reply, F_CIB_CALLOPTS, call_options);
crm_xml_add_int(op_reply, F_CIB_RC, rc);
crm_trace("Attaching reply output");
add_message_xml(op_reply, F_CIB_CALLDATA, the_cib);
crm_log_xml_explicit(op_reply, "cib:reply");
} else if (process) {
time_t finished = 0;
time_t now = time(NULL);
int level = LOG_INFO;
const char *section = crm_element_value(request, F_CIB_SECTION);
rc = cib_process_command(request, &op_reply, &result_diff, privileged);
if (!is_update) {
level = LOG_TRACE;
} else if (pcmk__xe_attr_is_true(request, F_CIB_GLOBAL_UPDATE)) {
switch (rc) {
case pcmk_ok:
level = LOG_INFO;
break;
case -pcmk_err_old_data:
case -pcmk_err_diff_resync:
case -pcmk_err_diff_failed:
level = LOG_TRACE;
break;
default:
level = LOG_ERR;
}
} else if (rc != pcmk_ok) {
level = LOG_WARNING;
}
do_crm_log(level,
"Completed %s operation for section %s: %s (rc=%d, origin=%s/%s/%s, version=%s.%s.%s)",
op, section ? section : "'all'", pcmk_strerror(rc), rc,
originator ? originator : "local", client_name, call_id,
the_cib ? crm_element_value(the_cib, XML_ATTR_GENERATION_ADMIN) : "0",
the_cib ? crm_element_value(the_cib, XML_ATTR_GENERATION) : "0",
the_cib ? crm_element_value(the_cib, XML_ATTR_NUMUPDATES) : "0");
finished = time(NULL);
if ((finished - now) > 3) {
crm_trace("%s operation took %lds to complete", op, (long)(finished - now));
crm_write_blackbox(0, NULL);
}
if (op_reply == NULL && (needs_reply || local_notify)) {
crm_err("Unexpected NULL reply to message");
crm_log_xml_err(request, "null reply");
needs_reply = FALSE;
local_notify = FALSE;
}
}
if (is_update && !cib_legacy_mode()) {
crm_trace("Completed pre-sync update from %s/%s/%s%s",
originator ? originator : "local", client_name, call_id,
local_notify?" with local notification":"");
} else if (!needs_reply || stand_alone) {
// This was a non-originating secondary update
crm_trace("Completed update as secondary");
} else if (cib_legacy_mode() &&
rc == pcmk_ok && result_diff != NULL && !(call_options & cib_inhibit_bcast)) {
gboolean broadcast = FALSE;
cib_local_bcast_num++;
crm_xml_add_int(request, F_CIB_LOCAL_NOTIFY_ID, cib_local_bcast_num);
broadcast = send_peer_reply(request, result_diff, originator, TRUE);
if (broadcast && client_id && local_notify && op_reply) {
/* If we have been asked to sync the reply,
* and a bcast msg has gone out, we queue the local notify
* until we know the bcast message has been received */
local_notify = FALSE;
crm_trace("Queuing local %ssync notification for %s",
(call_options & cib_sync_call) ? "" : "a-", client_id);
queue_local_notify(op_reply, client_id,
pcmk_is_set(call_options, cib_sync_call),
(cib_client == NULL));
op_reply = NULL; /* the reply is queued, so don't free here */
}
} else if (call_options & cib_discard_reply) {
crm_trace("Caller isn't interested in reply");
} else if (cib_client == NULL) {
if (is_update == FALSE || result_diff == NULL) {
crm_trace("Request not broadcast: R/O call");
} else if (call_options & cib_inhibit_bcast) {
crm_trace("Request not broadcast: inhibited");
} else if (rc != pcmk_ok) {
crm_trace("Request not broadcast: call failed: %s", pcmk_strerror(rc));
} else {
crm_trace("Directing reply to %s", originator);
}
send_peer_reply(op_reply, result_diff, originator, FALSE);
}
if (local_notify && client_id) {
crm_trace("Performing local %ssync notification for %s",
(pcmk_is_set(call_options, cib_sync_call)? "" : "a"),
client_id);
if (process == FALSE) {
do_local_notify(request, client_id,
pcmk_is_set(call_options, cib_sync_call),
(cib_client == NULL));
} else {
do_local_notify(op_reply, client_id,
pcmk_is_set(call_options, cib_sync_call),
(cib_client == NULL));
}
}
free_xml(op_reply);
free_xml(result_diff);
return;
}
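/*!
 * \internal
 * \brief Calculate the digest of a CIB section selected by XPath
 *
 * \return Newly allocated digest string (free with free()), or NULL if the
 * section is absent
 */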
static char *
calculate_section_digest(const char *xpath, xmlNode * xml_obj)
{
xmlNode *xml_section = NULL;
if (xml_obj == NULL) {
return NULL;
}
xml_section = get_xpath_object(xpath, xml_obj, LOG_TRACE);
if (xml_section == NULL) {
return NULL;
}
return calculate_xml_versioned_digest(xml_section, FALSE, TRUE, CRM_FEATURE_SET);
}
// v1 and v2 patch formats
#define XPATH_CONFIG_CHANGE \
"//" XML_CIB_TAG_CRMCONFIG " | " \
"//" XML_DIFF_CHANGE \
"[contains(@" XML_DIFF_PATH ",'/" XML_CIB_TAG_CRMCONFIG "/')]"
static bool
contains_config_change(xmlNode *diff)
{
bool changed = false;
if (diff) {
xmlXPathObject *xpathObj = xpath_search(diff, XPATH_CONFIG_CHANGE);
if (numXpathResults(xpathObj) > 0) {
changed = true;
}
freeXpathObject(xpathObj);
}
return changed;
}
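/*!
 * \internal
 * \brief Run a validated CIB request and build its reply and diff
 *
 * \param[in,out] request Request XML
 * \param[out] reply Where to store the reply (if one is built)
 * \param[out] cib_diff Where to store the resulting CIB diff
 * \param[in] privileged Whether privileged commands may be run
 *
 * \return pcmk_ok on success, negative (legacy) error code otherwise
 */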
static int
cib_process_command(xmlNode * request, xmlNode ** reply, xmlNode ** cib_diff, gboolean privileged)
{
xmlNode *input = NULL;
xmlNode *output = NULL;
xmlNode *result_cib = NULL;
xmlNode *current_cib = NULL;
int call_type = 0;
int call_options = 0;
const char *op = NULL;
const char *section = NULL;
const char *call_id = crm_element_value(request, F_CIB_CALLID);
+ const char *client_id = crm_element_value(request, F_CIB_CLIENTID);
+ const char *client_name = crm_element_value(request, F_CIB_CLIENTNAME);
+ const char *origin = crm_element_value(request, F_ORIG);
int rc = pcmk_ok;
int rc2 = pcmk_ok;
gboolean send_r_notify = FALSE;
- gboolean global_update = FALSE;
gboolean config_changed = FALSE;
gboolean manage_counters = TRUE;
static mainloop_timer_t *digest_timer = NULL;
char *current_nodes_digest = NULL;
char *current_alerts_digest = NULL;
char *current_status_digest = NULL;
uint32_t change_section = cib_change_section_nodes
|cib_change_section_alerts
|cib_change_section_status;
CRM_ASSERT(cib_status == pcmk_ok);
if(digest_timer == NULL) {
digest_timer = mainloop_timer_add("digester", 5000, FALSE, cib_digester_cb, NULL);
}
*reply = NULL;
*cib_diff = NULL;
current_cib = the_cib;
/* Start processing the request... */
op = crm_element_value(request, F_CIB_OPERATION);
crm_element_value_int(request, F_CIB_CALLOPTS, &call_options);
rc = cib_get_operation_id(op, &call_type);
if (rc == pcmk_ok && privileged == FALSE) {
- rc = cib_op_can_run(call_type, call_options, privileged, global_update);
+ rc = cib_op_can_run(call_type, call_options, privileged);
}
    rc2 = cib_op_prepare(call_type, request, &input, &section);
if (rc == pcmk_ok) {
rc = rc2;
}
if (rc != pcmk_ok) {
crm_trace("Call setup failed: %s", pcmk_strerror(rc));
goto done;
} else if (cib_op_modifies(call_type) == FALSE) {
rc = cib_perform_op(op, call_options, cib_op_func(call_type), TRUE,
section, request, input, FALSE, &config_changed,
current_cib, &result_cib, NULL, &output);
CRM_CHECK(result_cib == NULL, free_xml(result_cib));
goto done;
}
/* Handle a valid write action */
if (pcmk__xe_attr_is_true(request, F_CIB_GLOBAL_UPDATE)) {
/* legacy code */
manage_counters = FALSE;
cib__set_call_options(call_options, "call", cib_force_diff);
crm_trace("Global update detected");
CRM_CHECK(call_type == 3 || call_type == 4, crm_err("Call type: %d", call_type);
crm_log_xml_err(request, "bad op"));
}
ping_modified_since = TRUE;
if (pcmk_is_set(call_options, cib_inhibit_bcast)) {
crm_trace("Skipping update: inhibit broadcast");
manage_counters = FALSE;
}
if (!pcmk_is_set(call_options, cib_dryrun)
&& pcmk__str_eq(section, XML_CIB_TAG_STATUS, pcmk__str_casei)) {
        // Copying large CIBs accounts for a huge percentage of our CPU usage
cib__set_call_options(call_options, "call", cib_zero_copy);
} else {
cib__clear_call_options(call_options, "call", cib_zero_copy);
}
#define XPATH_CONFIG "//" XML_TAG_CIB "/" XML_CIB_TAG_CONFIGURATION
#define XPATH_NODES XPATH_CONFIG "/" XML_CIB_TAG_NODES
#define XPATH_ALERTS XPATH_CONFIG "/" XML_CIB_TAG_ALERTS
#define XPATH_STATUS "//" XML_TAG_CIB "/" XML_CIB_TAG_STATUS
// Calculate the hash value of the section before the change
if (pcmk__str_eq(PCMK__CIB_REQUEST_REPLACE, op, pcmk__str_none)) {
current_nodes_digest = calculate_section_digest(XPATH_NODES,
current_cib);
current_alerts_digest = calculate_section_digest(XPATH_ALERTS,
current_cib);
current_status_digest = calculate_section_digest(XPATH_STATUS,
current_cib);
crm_trace("current-digest %s:%s:%s", current_nodes_digest,
current_alerts_digest, current_status_digest);
}
// result_cib must not be modified after cib_perform_op() returns
rc = cib_perform_op(op, call_options, cib_op_func(call_type), FALSE,
section, request, input, manage_counters,
&config_changed, current_cib, &result_cib, cib_diff,
&output);
if (!manage_counters) {
int format = 1;
/* Legacy code
* If the diff is NULL at this point, it's because nothing changed
*/
if (*cib_diff != NULL) {
crm_element_value_int(*cib_diff, "format", &format);
}
if (format == 1) {
config_changed = cib__config_changed_v1(NULL, NULL, cib_diff);
}
}
/* Always write to disk for successful replace and upgrade ops. This also
* negates the need to detect ordering changes.
*/
if ((rc == pcmk_ok)
&& pcmk__str_any_of(op,
PCMK__CIB_REQUEST_REPLACE,
PCMK__CIB_REQUEST_UPGRADE,
NULL)) {
config_changed = TRUE;
}
if (rc == pcmk_ok && !pcmk_is_set(call_options, cib_dryrun)) {
crm_trace("Activating %s->%s%s%s",
crm_element_value(current_cib, XML_ATTR_NUMUPDATES),
crm_element_value(result_cib, XML_ATTR_NUMUPDATES),
(pcmk_is_set(call_options, cib_zero_copy)? " zero-copy" : ""),
(config_changed? " changed" : ""));
if (!pcmk_is_set(call_options, cib_zero_copy)) {
rc = activateCibXml(result_cib, config_changed, op);
crm_trace("Activated %s (%d)",
crm_element_value(current_cib, XML_ATTR_NUMUPDATES), rc);
}
if ((rc == pcmk_ok) && contains_config_change(*cib_diff)) {
cib_read_config(config_hash, result_cib);
}
if (pcmk__str_eq(PCMK__CIB_REQUEST_REPLACE, op, pcmk__str_none)) {
char *result_nodes_digest = NULL;
char *result_alerts_digest = NULL;
char *result_status_digest = NULL;
/* Calculate the hash value of the changed section. */
result_nodes_digest = calculate_section_digest(XPATH_NODES,
result_cib);
result_alerts_digest = calculate_section_digest(XPATH_ALERTS,
result_cib);
result_status_digest = calculate_section_digest(XPATH_STATUS,
result_cib);
crm_trace("result-digest %s:%s:%s", result_nodes_digest,
result_alerts_digest, result_status_digest);
if (pcmk__str_eq(current_nodes_digest, result_nodes_digest,
pcmk__str_none)) {
change_section =
pcmk__clear_flags_as(__func__, __LINE__, LOG_TRACE,
"CIB change section",
"change_section", change_section,
cib_change_section_nodes, "nodes");
}
if (pcmk__str_eq(current_alerts_digest, result_alerts_digest,
pcmk__str_none)) {
change_section =
pcmk__clear_flags_as(__func__, __LINE__, LOG_TRACE,
"CIB change section",
"change_section", change_section,
cib_change_section_alerts, "alerts");
}
if (pcmk__str_eq(current_status_digest, result_status_digest,
pcmk__str_none)) {
change_section =
pcmk__clear_flags_as(__func__, __LINE__, LOG_TRACE,
"CIB change section",
"change_section", change_section,
cib_change_section_status, "status");
}
if (change_section != cib_change_section_none) {
send_r_notify = TRUE;
}
free(result_nodes_digest);
free(result_alerts_digest);
free(result_status_digest);
} else if (pcmk__str_eq(PCMK__CIB_REQUEST_ERASE, op, pcmk__str_none)) {
send_r_notify = TRUE;
}
mainloop_timer_stop(digest_timer);
mainloop_timer_start(digest_timer);
} else if (rc == -pcmk_err_schema_validation) {
CRM_ASSERT(!pcmk_is_set(call_options, cib_zero_copy));
if (output != NULL) {
crm_log_xml_info(output, "cib:output");
free_xml(output);
}
output = result_cib;
} else {
crm_trace("Not activating %d %d %s", rc,
pcmk_is_set(call_options, cib_dryrun),
crm_element_value(result_cib, XML_ATTR_NUMUPDATES));
if (!pcmk_is_set(call_options, cib_zero_copy)) {
free_xml(result_cib);
}
}
if ((call_options & (cib_inhibit_notify|cib_dryrun)) == 0) {
- const char *client = crm_element_value(request, F_CIB_CLIENTNAME);
-
crm_trace("Sending notifications %d",
pcmk_is_set(call_options, cib_dryrun));
- cib_diff_notify(call_options, client, call_id, op, input, rc, *cib_diff);
+ cib_diff_notify(op, rc, call_id, client_id, client_name, origin, input,
+ *cib_diff);
}
if (send_r_notify) {
- const char *origin = crm_element_value(request, F_ORIG);
-
- cib_replace_notify(origin, the_cib, rc, *cib_diff, change_section);
+ cib_replace_notify(op, rc, call_id, client_id, client_name, origin,
+ the_cib, *cib_diff, change_section);
}
pcmk__output_set_log_level(logger_out, LOG_TRACE);
logger_out->message(logger_out, "xml-patchset", *cib_diff);
done:
if (!pcmk_is_set(call_options, cib_discard_reply) || cib_legacy_mode()) {
const char *caller = crm_element_value(request, F_CIB_CLIENTID);
*reply = create_xml_node(NULL, "cib-reply");
crm_xml_add(*reply, F_TYPE, T_CIB);
crm_xml_add(*reply, F_CIB_OPERATION, op);
crm_xml_add(*reply, F_CIB_CALLID, call_id);
crm_xml_add(*reply, F_CIB_CLIENTID, caller);
crm_xml_add_int(*reply, F_CIB_CALLOPTS, call_options);
crm_xml_add_int(*reply, F_CIB_RC, rc);
if (output != NULL) {
crm_trace("Attaching reply output");
add_message_xml(*reply, F_CIB_CALLDATA, output);
}
crm_log_xml_explicit(*reply, "cib:reply");
}
crm_trace("cleanup");
if (cib_op_modifies(call_type) == FALSE && output != current_cib) {
free_xml(output);
output = NULL;
}
if (call_type >= 0) {
cib_op_cleanup(call_type, call_options, &input, &output);
}
free(current_nodes_digest);
free(current_alerts_digest);
free(current_status_digest);
crm_trace("done");
return rc;
}
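/*!
 * \internal
 * \brief Handle a CIB message received from the cluster layer
 *
 * Acknowledges legacy self-originated broadcasts, then processes peer
 * requests via cib_process_request() with privileges.
 */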
void
cib_peer_callback(xmlNode * msg, void *private_data)
{
const char *reason = NULL;
const char *originator = crm_element_value(msg, F_ORIG);
if (cib_legacy_mode()
&& pcmk__str_eq(originator, OUR_NODENAME,
pcmk__str_casei|pcmk__str_null_matches)) {
/* message is from ourselves */
int bcast_id = 0;
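        /* crm_element_value_int() returns 0 on success */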
if (!(crm_element_value_int(msg, F_CIB_LOCAL_NOTIFY_ID, &bcast_id))) {
check_local_notify(bcast_id);
}
return;
} else if (crm_peer_cache == NULL) {
reason = "membership not established";
goto bail;
}
if (crm_element_value(msg, F_CIB_CLIENTNAME) == NULL) {
crm_xml_add(msg, F_CIB_CLIENTNAME, originator);
}
/* crm_log_xml_trace(msg, "Peer[inbound]"); */
cib_process_request(msg, TRUE, NULL);
return;
bail:
if (reason) {
const char *seq = crm_element_value(msg, F_SEQ);
const char *op = crm_element_value(msg, F_CIB_OPERATION);
crm_warn("Discarding %s message (%s) from %s: %s", op, seq, originator, reason);
}
}
static gboolean
cib_force_exit(gpointer data)
{
crm_notice("Forcing exit!");
terminate_cib(__func__, CRM_EX_ERROR);
return FALSE;
}
static void
disconnect_remote_client(gpointer key, gpointer value, gpointer user_data)
{
pcmk__client_t *a_client = value;
crm_err("Can't disconnect client %s: Not implemented",
pcmk__client_name(a_client));
}
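/*!
 * \internal
 * \brief Notify peers of shutdown and schedule a forced exit as fallback
 */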
static void
initiate_exit(void)
{
int active = 0;
xmlNode *leaving = NULL;
active = crm_active_peers();
if (active < 2) {
terminate_cib(__func__, 0);
return;
}
crm_info("Sending disconnect notification to %d peers...", active);
leaving = create_xml_node(NULL, "exit-notification");
crm_xml_add(leaving, F_TYPE, "cib");
crm_xml_add(leaving, F_CIB_OPERATION, PCMK__CIB_REQUEST_SHUTDOWN);
send_cluster_message(NULL, crm_msg_cib, leaving, TRUE);
free_xml(leaving);
g_timeout_add(EXIT_ESCALATION_MS, cib_force_exit, NULL);
}
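/*!
 * \internal
 * \brief Begin shutdown: disconnect all clients, then exit when none remain
 *
 * Called initially as a signal handler, and again (with \p nsig 0) as each
 * remaining client disconnects.
 */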
void
cib_shutdown(int nsig)
{
struct qb_ipcs_stats srv_stats;
if (cib_shutdown_flag == FALSE) {
int disconnects = 0;
qb_ipcs_connection_t *c = NULL;
cib_shutdown_flag = TRUE;
c = qb_ipcs_connection_first_get(ipcs_rw);
while (c != NULL) {
qb_ipcs_connection_t *last = c;
c = qb_ipcs_connection_next_get(ipcs_rw, last);
crm_debug("Disconnecting r/w client %p...", last);
qb_ipcs_disconnect(last);
qb_ipcs_connection_unref(last);
disconnects++;
}
c = qb_ipcs_connection_first_get(ipcs_ro);
while (c != NULL) {
qb_ipcs_connection_t *last = c;
c = qb_ipcs_connection_next_get(ipcs_ro, last);
crm_debug("Disconnecting r/o client %p...", last);
qb_ipcs_disconnect(last);
qb_ipcs_connection_unref(last);
disconnects++;
}
c = qb_ipcs_connection_first_get(ipcs_shm);
while (c != NULL) {
qb_ipcs_connection_t *last = c;
c = qb_ipcs_connection_next_get(ipcs_shm, last);
crm_debug("Disconnecting non-blocking r/w client %p...", last);
qb_ipcs_disconnect(last);
qb_ipcs_connection_unref(last);
disconnects++;
}
disconnects += pcmk__ipc_client_count();
crm_debug("Disconnecting %d remote clients", pcmk__ipc_client_count());
pcmk__foreach_ipc_client(disconnect_remote_client, NULL);
crm_info("Disconnected %d clients", disconnects);
}
qb_ipcs_stats_get(ipcs_rw, &srv_stats, QB_FALSE);
if (pcmk__ipc_client_count() == 0) {
crm_info("All clients disconnected (%d)", srv_stats.active_connections);
initiate_exit();
} else {
crm_info("Waiting on %d clients to disconnect (%d)",
pcmk__ipc_client_count(), srv_stats.active_connections);
}
}
extern int remote_fd;
extern int remote_tls_fd;
/*!
* \internal
* \brief Close remote sockets, free the global CIB and quit
*
* \param[in] caller Name of calling function (for log message)
 * \param[in] fast If -1, skip disconnect; if positive, exit immediately
 *                 with this value as the exit status
*/
void
terminate_cib(const char *caller, int fast)
{
crm_info("%s: Exiting%s...", caller,
(fast > 0)? " fast" : mainloop ? " from mainloop" : "");
if (remote_fd > 0) {
close(remote_fd);
remote_fd = 0;
}
if (remote_tls_fd > 0) {
close(remote_tls_fd);
remote_tls_fd = 0;
}
uninitializeCib();
if (logger_out != NULL) {
logger_out->finish(logger_out, CRM_EX_OK, true, NULL);
pcmk__output_free(logger_out);
logger_out = NULL;
}
if (fast > 0) {
/* Quit fast on error */
pcmk__stop_based_ipc(ipcs_ro, ipcs_rw, ipcs_shm);
crm_exit(fast);
} else if ((mainloop != NULL) && g_main_loop_is_running(mainloop)) {
/* Quit via returning from the main loop. If fast == -1, we skip the
* disconnect here, and it will be done when the main loop returns
* (this allows the peer status callback to avoid messing with the
* peer caches).
*/
if (fast == 0) {
crm_cluster_disconnect(crm_cluster);
}
g_main_loop_quit(mainloop);
} else {
/* Quit via clean exit. Even the peer status callback can disconnect
* here, because we're not returning control to the caller. */
crm_cluster_disconnect(crm_cluster);
pcmk__stop_based_ipc(ipcs_ro, ipcs_rw, ipcs_shm);
crm_exit(CRM_EX_OK);
}
}
diff --git a/daemons/based/based_common.c b/daemons/based/based_common.c
index 5027b98f63..7e68cf0ff9 100644
--- a/daemons/based/based_common.c
+++ b/daemons/based/based_common.c
@@ -1,362 +1,352 @@
/*
- * Copyright 2008-2022 the Pacemaker project contributors
+ * Copyright 2008-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
gboolean stand_alone = FALSE;
extern int cib_perform_command(xmlNode * request, xmlNode ** reply, xmlNode ** cib_diff,
gboolean privileged);
static xmlNode *
cib_prepare_common(xmlNode * root, const char *section)
{
xmlNode *data = NULL;
/* extract the CIB from the fragment */
if (root == NULL) {
return NULL;
} else if (pcmk__strcase_any_of(crm_element_name(root), XML_TAG_FRAGMENT,
F_CRM_DATA, F_CIB_CALLDATA, NULL)) {
data = first_named_child(root, XML_TAG_CIB);
} else {
data = root;
}
/* grab the section specified for the command */
if (section != NULL && data != NULL && pcmk__str_eq(crm_element_name(data), XML_TAG_CIB, pcmk__str_none)) {
data = pcmk_find_cib_element(data, section);
}
/* crm_log_xml_trace(root, "cib:input"); */
return data;
}
static int
cib_prepare_none(xmlNode * request, xmlNode ** data, const char **section)
{
*data = NULL;
*section = crm_element_value(request, F_CIB_SECTION);
return pcmk_ok;
}
static int
cib_prepare_data(xmlNode * request, xmlNode ** data, const char **section)
{
xmlNode *input_fragment = get_message_xml(request, F_CIB_CALLDATA);
*section = crm_element_value(request, F_CIB_SECTION);
*data = cib_prepare_common(input_fragment, *section);
/* crm_log_xml_debug(*data, "data"); */
return pcmk_ok;
}
static int
cib_prepare_sync(xmlNode * request, xmlNode ** data, const char **section)
{
*data = NULL;
*section = crm_element_value(request, F_CIB_SECTION);
return pcmk_ok;
}
static int
cib_prepare_diff(xmlNode * request, xmlNode ** data, const char **section)
{
xmlNode *input_fragment = NULL;
*data = NULL;
*section = NULL;
if (pcmk__xe_attr_is_true(request, F_CIB_GLOBAL_UPDATE)) {
input_fragment = get_message_xml(request, F_CIB_UPDATE_DIFF);
} else {
input_fragment = get_message_xml(request, F_CIB_CALLDATA);
}
CRM_CHECK(input_fragment != NULL, crm_log_xml_warn(request, "no input"));
*data = cib_prepare_common(input_fragment, NULL);
return pcmk_ok;
}
static int
cib_cleanup_query(int options, xmlNode ** data, xmlNode ** output)
{
CRM_LOG_ASSERT(*data == NULL);
if ((options & cib_no_children)
|| pcmk__str_eq(crm_element_name(*output), "xpath-query", pcmk__str_casei)) {
free_xml(*output);
}
return pcmk_ok;
}
static int
cib_cleanup_data(int options, xmlNode ** data, xmlNode ** output)
{
free_xml(*output);
*data = NULL;
return pcmk_ok;
}
static int
cib_cleanup_output(int options, xmlNode ** data, xmlNode ** output)
{
free_xml(*output);
return pcmk_ok;
}
static int
cib_cleanup_none(int options, xmlNode ** data, xmlNode ** output)
{
CRM_LOG_ASSERT(*data == NULL);
CRM_LOG_ASSERT(*output == NULL);
return pcmk_ok;
}
static cib_operation_t cib_server_ops[] = {
- // Booleans are modifies_cib, needs_privileges, needs_quorum
+ // Booleans are modifies_cib, needs_privileges
{
- NULL, FALSE, FALSE, FALSE,
+ NULL, FALSE, FALSE,
cib_prepare_none, cib_cleanup_none, cib_process_default
},
{
- PCMK__CIB_REQUEST_QUERY, FALSE, FALSE, FALSE,
+ PCMK__CIB_REQUEST_QUERY, FALSE, FALSE,
cib_prepare_none, cib_cleanup_query, cib_process_query
},
{
- PCMK__CIB_REQUEST_MODIFY, TRUE, TRUE, TRUE,
+ PCMK__CIB_REQUEST_MODIFY, TRUE, TRUE,
cib_prepare_data, cib_cleanup_data, cib_process_modify
},
{
- PCMK__CIB_REQUEST_APPLY_PATCH, TRUE, TRUE, TRUE,
+ PCMK__CIB_REQUEST_APPLY_PATCH, TRUE, TRUE,
cib_prepare_diff, cib_cleanup_data, cib_server_process_diff
},
{
- PCMK__CIB_REQUEST_REPLACE, TRUE, TRUE, TRUE,
+ PCMK__CIB_REQUEST_REPLACE, TRUE, TRUE,
cib_prepare_data, cib_cleanup_data, cib_process_replace_svr
},
{
- PCMK__CIB_REQUEST_CREATE, TRUE, TRUE, TRUE,
+ PCMK__CIB_REQUEST_CREATE, TRUE, TRUE,
cib_prepare_data, cib_cleanup_data, cib_process_create
},
{
- PCMK__CIB_REQUEST_DELETE, TRUE, TRUE, TRUE,
+ PCMK__CIB_REQUEST_DELETE, TRUE, TRUE,
cib_prepare_data, cib_cleanup_data, cib_process_delete
},
{
- PCMK__CIB_REQUEST_SYNC_TO_ALL, FALSE, TRUE, FALSE,
+ PCMK__CIB_REQUEST_SYNC_TO_ALL, FALSE, TRUE,
cib_prepare_sync, cib_cleanup_none, cib_process_sync
},
{
- PCMK__CIB_REQUEST_BUMP, TRUE, TRUE, TRUE,
+ PCMK__CIB_REQUEST_BUMP, TRUE, TRUE,
cib_prepare_none, cib_cleanup_output, cib_process_bump
},
{
- PCMK__CIB_REQUEST_ERASE, TRUE, TRUE, TRUE,
+ PCMK__CIB_REQUEST_ERASE, TRUE, TRUE,
cib_prepare_none, cib_cleanup_output, cib_process_erase
},
{
- PCMK__CIB_REQUEST_NOOP, FALSE, FALSE, FALSE,
+ PCMK__CIB_REQUEST_NOOP, FALSE, FALSE,
cib_prepare_none, cib_cleanup_none, cib_process_default
},
{
- PCMK__CIB_REQUEST_ABS_DELETE, TRUE, TRUE, TRUE,
+ PCMK__CIB_REQUEST_ABS_DELETE, TRUE, TRUE,
cib_prepare_data, cib_cleanup_data, cib_process_delete_absolute
},
{
- PCMK__CIB_REQUEST_UPGRADE, TRUE, TRUE, TRUE,
+ PCMK__CIB_REQUEST_UPGRADE, TRUE, TRUE,
cib_prepare_none, cib_cleanup_output, cib_process_upgrade_server
},
{
- PCMK__CIB_REQUEST_SECONDARY, FALSE, TRUE, FALSE,
+ PCMK__CIB_REQUEST_SECONDARY, FALSE, TRUE,
cib_prepare_none, cib_cleanup_none, cib_process_readwrite
},
{
- PCMK__CIB_REQUEST_ALL_SECONDARY, FALSE, TRUE, FALSE,
+ PCMK__CIB_REQUEST_ALL_SECONDARY, FALSE, TRUE,
cib_prepare_none, cib_cleanup_none, cib_process_readwrite
},
{
- PCMK__CIB_REQUEST_SYNC_TO_ONE, FALSE, TRUE, FALSE,
+ PCMK__CIB_REQUEST_SYNC_TO_ONE, FALSE, TRUE,
cib_prepare_sync, cib_cleanup_none, cib_process_sync_one
},
{
- PCMK__CIB_REQUEST_PRIMARY, TRUE, TRUE, FALSE,
+ PCMK__CIB_REQUEST_PRIMARY, TRUE, TRUE,
cib_prepare_data, cib_cleanup_data, cib_process_readwrite
},
{
- PCMK__CIB_REQUEST_IS_PRIMARY, FALSE, TRUE, FALSE,
+ PCMK__CIB_REQUEST_IS_PRIMARY, FALSE, TRUE,
cib_prepare_none, cib_cleanup_none, cib_process_readwrite
},
{
- PCMK__CIB_REQUEST_SHUTDOWN, FALSE, TRUE, FALSE,
+ PCMK__CIB_REQUEST_SHUTDOWN, FALSE, TRUE,
cib_prepare_sync, cib_cleanup_none, cib_process_shutdown_req
},
{
- CRM_OP_PING, FALSE, FALSE, FALSE,
+ CRM_OP_PING, FALSE, FALSE,
cib_prepare_none, cib_cleanup_output, cib_process_ping
},
};
int
cib_get_operation_id(const char *op, int *operation)
{
static GHashTable *operation_hash = NULL;
if (operation_hash == NULL) {
int lpc = 0;
int max_msg_types = PCMK__NELEM(cib_server_ops);
operation_hash = pcmk__strkey_table(NULL, free);
for (lpc = 1; lpc < max_msg_types; lpc++) {
int *value = malloc(sizeof(int));
if(value) {
*value = lpc;
g_hash_table_insert(operation_hash, (gpointer) cib_server_ops[lpc].operation, value);
}
}
}
if (op != NULL) {
int *value = g_hash_table_lookup(operation_hash, op);
if (value) {
*operation = *value;
return pcmk_ok;
}
}
crm_err("Operation %s is not valid", op);
*operation = -1;
return -EINVAL;
}
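/*!
 * \internal
 * \brief Copy the routing fields (and optionally the payload) of a CIB message
 *
 * \return Newly created partial copy of \p msg
 */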
xmlNode *
cib_msg_copy(xmlNode * msg, gboolean with_data)
{
int lpc = 0;
const char *field = NULL;
const char *value = NULL;
xmlNode *value_struct = NULL;
static const char *field_list[] = {
F_XML_TAGNAME,
F_TYPE,
F_CIB_CLIENTID,
F_CIB_CALLOPTS,
F_CIB_CALLID,
F_CIB_OPERATION,
F_CIB_ISREPLY,
F_CIB_SECTION,
F_CIB_HOST,
F_CIB_RC,
F_CIB_DELEGATED,
F_CIB_OBJID,
F_CIB_OBJTYPE,
F_CIB_EXISTING,
F_CIB_SEENCOUNT,
F_CIB_TIMEOUT,
- F_CIB_CALLBACK_TOKEN,
F_CIB_GLOBAL_UPDATE,
F_CIB_CLIENTNAME,
F_CIB_USER,
F_CIB_NOTIFY_TYPE,
F_CIB_NOTIFY_ACTIVATE
};
static const char *data_list[] = {
F_CIB_CALLDATA,
F_CIB_UPDATE,
F_CIB_UPDATE_RESULT
};
xmlNode *copy = create_xml_node(NULL, "copy");
CRM_ASSERT(copy != NULL);
for (lpc = 0; lpc < PCMK__NELEM(field_list); lpc++) {
field = field_list[lpc];
value = crm_element_value(msg, field);
if (value != NULL) {
crm_xml_add(copy, field, value);
}
}
for (lpc = 0; with_data && lpc < PCMK__NELEM(data_list); lpc++) {
field = data_list[lpc];
value_struct = get_message_xml(msg, field);
if (value_struct != NULL) {
add_message_xml(copy, field, value_struct);
}
}
return copy;
}
cib_op_t *
cib_op_func(int call_type)
{
return &(cib_server_ops[call_type].fn);
}
gboolean
cib_op_modifies(int call_type)
{
return cib_server_ops[call_type].modifies_cib;
}
int
-cib_op_can_run(int call_type, int call_options, gboolean privileged, gboolean global_update)
+cib_op_can_run(int call_type, int call_options, bool privileged)
{
- if (privileged == FALSE && cib_server_ops[call_type].needs_privileges) {
- /* abort */
+ if (!privileged && cib_server_ops[call_type].needs_privileges) {
return -EACCES;
}
-#if 0
- if (rc == pcmk_ok
- && stand_alone == FALSE
- && global_update == FALSE
- && (call_options & cib_quorum_override) == 0 && cib_server_ops[call_type].needs_quorum) {
- return -pcmk_err_no_quorum;
- }
-#endif
return pcmk_ok;
}
int
cib_op_prepare(int call_type, xmlNode * request, xmlNode ** input, const char **section)
{
crm_trace("Prepare %d", call_type);
return cib_server_ops[call_type].prepare(request, input, section);
}
int
cib_op_cleanup(int call_type, int options, xmlNode ** input, xmlNode ** output)
{
crm_trace("Cleanup %d", call_type);
return cib_server_ops[call_type].cleanup(options, input, output);
}
diff --git a/daemons/based/based_notify.c b/daemons/based/based_notify.c
index 38e2658b6f..5881f6d41d 100644
--- a/daemons/based/based_notify.c
+++ b/daemons/based/based_notify.c
@@ -1,285 +1,305 @@
/*
- * Copyright 2004-2022 the Pacemaker project contributors
+ * Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include // PRIx64
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
-int pending_updates = 0;
-
struct cib_notification_s {
xmlNode *msg;
struct iovec *iov;
int32_t iov_size;
};
-void attach_cib_generation(xmlNode * msg, const char *field, xmlNode * a_cib);
-
-static void do_cib_notify(int options, const char *op, xmlNode *update,
- int result, xmlNode * result_data,
- const char *msg_type);
-
static void
cib_notify_send_one(gpointer key, gpointer value, gpointer user_data)
{
const char *type = NULL;
gboolean do_send = FALSE;
int rc = pcmk_rc_ok;
pcmk__client_t *client = value;
struct cib_notification_s *update = user_data;
if (client->ipcs == NULL && client->remote == NULL) {
crm_warn("Skipping client with NULL channel");
return;
}
type = crm_element_value(update->msg, F_SUBTYPE);
CRM_LOG_ASSERT(type != NULL);
if (pcmk_is_set(client->flags, cib_notify_diff)
&& pcmk__str_eq(type, T_CIB_DIFF_NOTIFY, pcmk__str_casei)) {
do_send = TRUE;
} else if (pcmk_is_set(client->flags, cib_notify_replace)
&& pcmk__str_eq(type, T_CIB_REPLACE_NOTIFY, pcmk__str_casei)) {
do_send = TRUE;
} else if (pcmk_is_set(client->flags, cib_notify_confirm)
&& pcmk__str_eq(type, T_CIB_UPDATE_CONFIRM, pcmk__str_casei)) {
do_send = TRUE;
} else if (pcmk_is_set(client->flags, cib_notify_pre)
&& pcmk__str_eq(type, T_CIB_PRE_NOTIFY, pcmk__str_casei)) {
do_send = TRUE;
} else if (pcmk_is_set(client->flags, cib_notify_post)
&& pcmk__str_eq(type, T_CIB_POST_NOTIFY, pcmk__str_casei)) {
do_send = TRUE;
}
if (do_send) {
switch (PCMK__CLIENT_TYPE(client)) {
case pcmk__client_ipc:
rc = pcmk__ipc_send_iov(client, update->iov,
crm_ipc_server_event);
if (rc != pcmk_rc_ok) {
crm_warn("Could not notify client %s: %s " CRM_XS " id=%s",
pcmk__client_name(client), pcmk_rc_str(rc),
client->id);
}
break;
#ifdef HAVE_GNUTLS_GNUTLS_H
case pcmk__client_tls:
#endif
case pcmk__client_tcp:
crm_debug("Sent %s notification to client %s (id %s)",
type, pcmk__client_name(client), client->id);
pcmk__remote_send_xml(client->remote, update->msg);
break;
default:
crm_err("Unknown transport for client %s "
CRM_XS " flags=%#016" PRIx64,
pcmk__client_name(client), client->flags);
}
}
}
static void
cib_notify_send(xmlNode * xml)
{
struct iovec *iov;
struct cib_notification_s update;
ssize_t bytes = 0;
int rc = pcmk__ipc_prepare_iov(0, xml, 0, &iov, &bytes);
if (rc == pcmk_rc_ok) {
update.msg = xml;
update.iov = iov;
update.iov_size = bytes;
pcmk__foreach_ipc_client(cib_notify_send_one, &update);
} else {
crm_notice("Could not notify clients: %s " CRM_XS " rc=%d",
pcmk_rc_str(rc), rc);
}
pcmk_free_ipc_event(iov);
}
+static void
+attach_cib_generation(xmlNode *msg, const char *field, xmlNode *a_cib)
+{
+ xmlNode *generation = create_xml_node(NULL, XML_CIB_TAG_GENERATION_TUPPLE);
+
+ if (a_cib != NULL) {
+ copy_in_properties(generation, a_cib);
+ }
+ add_message_xml(msg, field, generation);
+ free_xml(generation);
+}
+
void
-cib_diff_notify(int options, const char *client, const char *call_id, const char *op,
- xmlNode * update, int result, xmlNode * diff)
+cib_diff_notify(const char *op, int result, const char *call_id,
+ const char *client_id, const char *client_name,
+ const char *origin, xmlNode *update, xmlNode *diff)
{
int add_updates = 0;
int add_epoch = 0;
int add_admin_epoch = 0;
int del_updates = 0;
int del_epoch = 0;
int del_admin_epoch = 0;
- int log_level = LOG_TRACE;
+ uint8_t log_level = LOG_TRACE;
+
+ xmlNode *update_msg = NULL;
+ const char *type = NULL;
if (diff == NULL) {
return;
}
if (result != pcmk_ok) {
log_level = LOG_WARNING;
}
cib_diff_version_details(diff, &add_admin_epoch, &add_epoch, &add_updates,
&del_admin_epoch, &del_epoch, &del_updates);
- if (add_updates != del_updates) {
+ if ((add_admin_epoch != del_admin_epoch)
+ || (add_epoch != del_epoch)
+ || (add_updates != del_updates)) {
+
do_crm_log(log_level,
- "Update (client: %s%s%s): %d.%d.%d -> %d.%d.%d (%s)",
- client, call_id ? ", call:" : "", call_id ? call_id : "",
+ "Updated CIB generation %d.%d.%d to %d.%d.%d from client "
+ "%s%s%s (%s) (%s)",
del_admin_epoch, del_epoch, del_updates,
- add_admin_epoch, add_epoch, add_updates, pcmk_strerror(result));
+ add_admin_epoch, add_epoch, add_updates,
+ client_name,
+ ((call_id != NULL)? " call " : ""), pcmk__s(call_id, ""),
+ pcmk__s(origin, "unspecified peer"), pcmk_strerror(result));
+
+ } else if ((add_admin_epoch != 0)
+ || (add_epoch != 0)
+ || (add_updates != 0)) {
- } else if (diff != NULL) {
do_crm_log(log_level,
- "Local-only Change (client:%s%s%s): %d.%d.%d (%s)",
- client, call_id ? ", call: " : "", call_id ? call_id : "",
- add_admin_epoch, add_epoch, add_updates, pcmk_strerror(result));
+ "Local-only change to CIB generation %d.%d.%d from client "
+ "%s%s%s (%s) (%s)",
+ add_admin_epoch, add_epoch, add_updates,
+ client_name,
+ ((call_id != NULL)? " call " : ""), pcmk__s(call_id, ""),
+ pcmk__s(origin, "unspecified peer"), pcmk_strerror(result));
}
- do_cib_notify(options, op, update, result, diff, T_CIB_DIFF_NOTIFY);
-}
-
-static void
-do_cib_notify(int options, const char *op, xmlNode * update,
- int result, xmlNode * result_data, const char *msg_type)
-{
- xmlNode *update_msg = NULL;
- const char *id = NULL;
-
update_msg = create_xml_node(NULL, "notify");
- if (result_data != NULL) {
- id = crm_element_value(result_data, XML_ATTR_ID);
- }
-
crm_xml_add(update_msg, F_TYPE, T_CIB_NOTIFY);
- crm_xml_add(update_msg, F_SUBTYPE, msg_type);
+ crm_xml_add(update_msg, F_SUBTYPE, T_CIB_DIFF_NOTIFY);
crm_xml_add(update_msg, F_CIB_OPERATION, op);
+ crm_xml_add(update_msg, F_CIB_CLIENTID, client_id);
+ crm_xml_add(update_msg, F_CIB_CALLID, call_id);
+ crm_xml_add(update_msg, F_ORIG, origin);
crm_xml_add_int(update_msg, F_CIB_RC, result);
- if (id != NULL) {
- crm_xml_add(update_msg, F_CIB_OBJID, id);
- }
-
if (update != NULL) {
- crm_trace("Setting type to update->name: %s", crm_element_name(update));
- crm_xml_add(update_msg, F_CIB_OBJTYPE, crm_element_name(update));
-
- } else if (result_data != NULL) {
- crm_trace("Setting type to new_obj->name: %s", crm_element_name(result_data));
- crm_xml_add(update_msg, F_CIB_OBJTYPE, crm_element_name(result_data));
-
+ type = crm_element_name(update);
+ crm_trace("Setting type to update->name: %s", type);
} else {
- crm_trace("Not Setting type");
+ type = crm_element_name(diff);
+ crm_trace("Setting type to new_obj->name: %s", type);
}
-
+ crm_xml_add(update_msg, F_CIB_OBJID, ID(diff));
+ crm_xml_add(update_msg, F_CIB_OBJTYPE, type);
attach_cib_generation(update_msg, "cib_generation", the_cib);
+
if (update != NULL) {
add_message_xml(update_msg, F_CIB_UPDATE, update);
}
- if (result_data != NULL) {
- add_message_xml(update_msg, F_CIB_UPDATE_RESULT, result_data);
- }
+ add_message_xml(update_msg, F_CIB_UPDATE_RESULT, diff);
cib_notify_send(update_msg);
free_xml(update_msg);
}
void
-attach_cib_generation(xmlNode * msg, const char *field, xmlNode * a_cib)
-{
- xmlNode *generation = create_xml_node(NULL, XML_CIB_TAG_GENERATION_TUPPLE);
-
- if (a_cib != NULL) {
- copy_in_properties(generation, a_cib);
- }
- add_message_xml(msg, field, generation);
- free_xml(generation);
-}
-
-void
-cib_replace_notify(const char *origin, xmlNode *update, int result,
- xmlNode *diff, uint32_t change_section)
+cib_replace_notify(const char *op, int result, const char *call_id,
+ const char *client_id, const char *client_name,
+ const char *origin, xmlNode *update, xmlNode *diff,
+ uint32_t change_section)
{
xmlNode *replace_msg = NULL;
int add_updates = 0;
int add_epoch = 0;
int add_admin_epoch = 0;
int del_updates = 0;
int del_epoch = 0;
int del_admin_epoch = 0;
+ uint8_t log_level = LOG_INFO;
+
if (diff == NULL) {
return;
}
+ if (result != pcmk_ok) {
+ log_level = LOG_WARNING;
+ }
+
cib_diff_version_details(diff, &add_admin_epoch, &add_epoch, &add_updates,
&del_admin_epoch, &del_epoch, &del_updates);
if (del_updates < 0) {
crm_log_xml_debug(diff, "Bad replace diff");
}
- if (add_updates != del_updates) {
- crm_info("Replaced: %d.%d.%d -> %d.%d.%d from %s",
- del_admin_epoch, del_epoch, del_updates,
- add_admin_epoch, add_epoch, add_updates,
- pcmk__s(origin, "unspecified peer"));
- } else if (diff != NULL) {
- crm_info("Local-only Replace: %d.%d.%d from %s",
- add_admin_epoch, add_epoch, add_updates,
- pcmk__s(origin, "unspecified peer"));
+ if ((add_admin_epoch != del_admin_epoch)
+ || (add_epoch != del_epoch)
+ || (add_updates != del_updates)) {
+
+ do_crm_log(log_level,
+ "Replaced CIB generation %d.%d.%d with %d.%d.%d from client "
+ "%s%s%s (%s) (%s)",
+ del_admin_epoch, del_epoch, del_updates,
+ add_admin_epoch, add_epoch, add_updates,
+ client_name,
+ ((call_id != NULL)? " call " : ""), pcmk__s(call_id, ""),
+ pcmk__s(origin, "unspecified peer"), pcmk_strerror(result));
+
+ } else if ((add_admin_epoch != 0)
+ || (add_epoch != 0)
+ || (add_updates != 0)) {
+
+ do_crm_log(log_level,
+ "Local-only replace of CIB generation %d.%d.%d from client "
+ "%s%s%s (%s) (%s)",
+ add_admin_epoch, add_epoch, add_updates,
+ client_name,
+ ((call_id != NULL)? " call " : ""), pcmk__s(call_id, ""),
+ pcmk__s(origin, "unspecified peer"), pcmk_strerror(result));
}
replace_msg = create_xml_node(NULL, "notify-replace");
+
crm_xml_add(replace_msg, F_TYPE, T_CIB_NOTIFY);
crm_xml_add(replace_msg, F_SUBTYPE, T_CIB_REPLACE_NOTIFY);
- crm_xml_add(replace_msg, F_CIB_OPERATION, PCMK__CIB_REQUEST_REPLACE);
+ crm_xml_add(replace_msg, F_CIB_OPERATION, op);
+ crm_xml_add(replace_msg, F_CIB_CLIENTID, client_id);
+ crm_xml_add(replace_msg, F_CIB_CALLID, call_id);
+ crm_xml_add(replace_msg, F_ORIG, origin);
crm_xml_add_int(replace_msg, F_CIB_RC, result);
crm_xml_add_ll(replace_msg, F_CIB_CHANGE_SECTION,
(long long) change_section);
attach_cib_generation(replace_msg, "cib-replace-generation", update);
- crm_log_xml_trace(replace_msg, "CIB Replaced");
+ /* We can include update and diff if a replace callback needs them. Until
+ * then, avoid the overhead.
+ */
+
+ crm_log_xml_trace(replace_msg, "CIB replaced");
cib_notify_send(replace_msg);
free_xml(replace_msg);
}
diff --git a/daemons/based/based_remote.c b/daemons/based/based_remote.c
index 4013b21859..38136d2d95 100644
--- a/daemons/based/based_remote.c
+++ b/daemons/based/based_remote.c
@@ -1,691 +1,680 @@
/*
* Copyright 2004-2021 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
#include // PRIx64
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include "pacemaker-based.h"
/* #undef HAVE_PAM_PAM_APPL_H */
/* #undef HAVE_GNUTLS_GNUTLS_H */
#ifdef HAVE_GNUTLS_GNUTLS_H
# include
#endif
#include
#include
#if HAVE_SECURITY_PAM_APPL_H
# include
# define HAVE_PAM 1
#else
# if HAVE_PAM_PAM_APPL_H
# include
# define HAVE_PAM 1
# endif
#endif
extern int remote_tls_fd;
extern gboolean cib_shutdown_flag;
int init_remote_listener(int port, gboolean encrypted);
void cib_remote_connection_destroy(gpointer user_data);
#ifdef HAVE_GNUTLS_GNUTLS_H
gnutls_dh_params_t dh_params;
gnutls_anon_server_credentials_t anon_cred_s;
static void
debug_log(int level, const char *str)
{
fputs(str, stderr);
}
#endif
#define REMOTE_AUTH_TIMEOUT 10000
int num_clients;
int authenticate_user(const char *user, const char *passwd);
static int cib_remote_listen(gpointer data);
static int cib_remote_msg(gpointer data);
static void
remote_connection_destroy(gpointer user_data)
{
crm_info("No longer listening for remote connections");
return;
}
int
init_remote_listener(int port, gboolean encrypted)
{
int rc;
int *ssock = NULL;
struct sockaddr_in saddr;
int optval;
static struct mainloop_fd_callbacks remote_listen_fd_callbacks = {
.dispatch = cib_remote_listen,
.destroy = remote_connection_destroy,
};
if (port <= 0) {
/* don't start it */
return 0;
}
if (encrypted) {
#ifndef HAVE_GNUTLS_GNUTLS_H
crm_warn("TLS support is not available");
return 0;
#else
crm_notice("Starting TLS listener on port %d", port);
crm_gnutls_global_init();
/* gnutls_global_set_log_level (10); */
gnutls_global_set_log_function(debug_log);
if (pcmk__init_tls_dh(&dh_params) != pcmk_rc_ok) {
return -1;
}
gnutls_anon_allocate_server_credentials(&anon_cred_s);
gnutls_anon_set_server_dh_params(anon_cred_s, dh_params);
#endif
} else {
crm_warn("Starting plain-text listener on port %d", port);
}
#ifndef HAVE_PAM
crm_warn("PAM is _not_ enabled!");
#endif
/* create server socket */
ssock = malloc(sizeof(int));
if(ssock == NULL) {
crm_perror(LOG_ERR, "Listener socket allocation failed");
return -1;
}
*ssock = socket(AF_INET, SOCK_STREAM, 0);
if (*ssock == -1) {
crm_perror(LOG_ERR, "Listener socket creation failed");
free(ssock);
return -1;
}
/* reuse address */
optval = 1;
rc = setsockopt(*ssock, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval));
if (rc < 0) {
crm_perror(LOG_WARNING,
"Local address reuse not allowed on listener socket");
}
/* bind server socket */
memset(&saddr, '\0', sizeof(saddr));
saddr.sin_family = AF_INET;
saddr.sin_addr.s_addr = INADDR_ANY;
saddr.sin_port = htons(port);
if (bind(*ssock, (struct sockaddr *)&saddr, sizeof(saddr)) == -1) {
crm_perror(LOG_ERR, "Cannot bind to listener socket");
close(*ssock);
free(ssock);
return -2;
}
if (listen(*ssock, 10) == -1) {
crm_perror(LOG_ERR, "Cannot listen on socket");
close(*ssock);
free(ssock);
return -3;
}
mainloop_add_fd("cib-remote", G_PRIORITY_DEFAULT, *ssock, ssock, &remote_listen_fd_callbacks);
crm_debug("Started listener on port %d", port);
return *ssock;
}
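/*!
 * \internal
 * \brief Check whether a user is a member of a group
 *
 * Accepts membership via the user's primary GID as well as the group's
 * member list.
 *
 * \return TRUE if \p usr belongs to \p grp, else FALSE
 */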
static int
check_group_membership(const char *usr, const char *grp)
{
int index = 0;
struct passwd *pwd = NULL;
struct group *group = NULL;
CRM_CHECK(usr != NULL, return FALSE);
CRM_CHECK(grp != NULL, return FALSE);
pwd = getpwnam(usr);
if (pwd == NULL) {
crm_err("No user named '%s' exists!", usr);
return FALSE;
}
group = getgrgid(pwd->pw_gid);
if (group != NULL && pcmk__str_eq(grp, group->gr_name, pcmk__str_none)) {
return TRUE;
}
group = getgrnam(grp);
if (group == NULL) {
crm_err("No group named '%s' exists!", grp);
return FALSE;
}
while (TRUE) {
char *member = group->gr_mem[index++];
if (member == NULL) {
break;
} else if (pcmk__str_eq(usr, member, pcmk__str_none)) {
return TRUE;
}
    }
return FALSE;
}
static gboolean
cib_remote_auth(xmlNode * login)
{
const char *user = NULL;
const char *pass = NULL;
const char *tmp = NULL;
crm_log_xml_info(login, "Login: ");
if (login == NULL) {
return FALSE;
}
tmp = crm_element_name(login);
if (!pcmk__str_eq(tmp, "cib_command", pcmk__str_casei)) {
crm_err("Wrong tag: %s", tmp);
return FALSE;
}
tmp = crm_element_value(login, "op");
if (!pcmk__str_eq(tmp, "authenticate", pcmk__str_casei)) {
crm_err("Wrong operation: %s", tmp);
return FALSE;
}
user = crm_element_value(login, "user");
pass = crm_element_value(login, "password");
if (!user || !pass) {
crm_err("missing auth credentials");
return FALSE;
}
/* Non-root daemons can only validate the password of the
* user they're running as
*/
if (check_group_membership(user, CRM_DAEMON_GROUP) == FALSE) {
crm_err("User is not a member of the required group");
return FALSE;
} else if (authenticate_user(user, pass) == FALSE) {
crm_err("PAM auth failed");
return FALSE;
}
return TRUE;
}
static gboolean
remote_auth_timeout_cb(gpointer data)
{
pcmk__client_t *client = data;
client->remote->auth_timeout = 0;
if (pcmk_is_set(client->flags, pcmk__client_authenticated)) {
return FALSE;
}
mainloop_del_fd(client->remote->source);
crm_err("Remote client authentication timed out");
return FALSE;
}
static int
cib_remote_listen(gpointer data)
{
int csock = 0;
unsigned laddr;
struct sockaddr_storage addr;
char ipstr[INET6_ADDRSTRLEN];
int ssock = *(int *)data;
int rc;
pcmk__client_t *new_client = NULL;
static struct mainloop_fd_callbacks remote_client_fd_callbacks = {
.dispatch = cib_remote_msg,
.destroy = cib_remote_connection_destroy,
};
/* accept the connection */
laddr = sizeof(addr);
memset(&addr, 0, sizeof(addr));
csock = accept(ssock, (struct sockaddr *)&addr, &laddr);
if (csock == -1) {
crm_perror(LOG_ERR, "Could not accept socket connection");
return TRUE;
}
pcmk__sockaddr2str(&addr, ipstr);
crm_debug("New %s connection from %s",
((ssock == remote_tls_fd)? "secure" : "clear-text"), ipstr);
rc = pcmk__set_nonblocking(csock);
if (rc != pcmk_rc_ok) {
crm_err("Could not set socket non-blocking: %s " CRM_XS " rc=%d",
pcmk_rc_str(rc), rc);
close(csock);
return TRUE;
}
num_clients++;
new_client = pcmk__new_unauth_client(NULL);
new_client->remote = calloc(1, sizeof(pcmk__remote_t));
if (ssock == remote_tls_fd) {
#ifdef HAVE_GNUTLS_GNUTLS_H
pcmk__set_client_flags(new_client, pcmk__client_tls);
/* create gnutls session for the server socket */
new_client->remote->tls_session = pcmk__new_tls_session(csock,
GNUTLS_SERVER,
GNUTLS_CRD_ANON,
anon_cred_s);
if (new_client->remote->tls_session == NULL) {
close(csock);
return TRUE;
}
#endif
} else {
pcmk__set_client_flags(new_client, pcmk__client_tcp);
new_client->remote->tcp_socket = csock;
}
// Require the client to authenticate within this time
new_client->remote->auth_timeout = g_timeout_add(REMOTE_AUTH_TIMEOUT,
remote_auth_timeout_cb,
new_client);
crm_info("Remote CIB client pending authentication "
CRM_XS " %p id: %s", new_client, new_client->id);
new_client->remote->source =
mainloop_add_fd("cib-remote-client", G_PRIORITY_DEFAULT, csock, new_client,
&remote_client_fd_callbacks);
return TRUE;
}
void
cib_remote_connection_destroy(gpointer user_data)
{
pcmk__client_t *client = user_data;
int csock = 0;
if (client == NULL) {
return;
}
crm_trace("Cleaning up after client %s disconnect",
pcmk__client_name(client));
num_clients--;
crm_trace("Num unfree'd clients: %d", num_clients);
switch (PCMK__CLIENT_TYPE(client)) {
case pcmk__client_tcp:
csock = client->remote->tcp_socket;
break;
#ifdef HAVE_GNUTLS_GNUTLS_H
case pcmk__client_tls:
if (client->remote->tls_session) {
void *sock_ptr = gnutls_transport_get_ptr(*client->remote->tls_session);
csock = GPOINTER_TO_INT(sock_ptr);
if (pcmk_is_set(client->flags,
pcmk__client_tls_handshake_complete)) {
gnutls_bye(*client->remote->tls_session, GNUTLS_SHUT_WR);
}
gnutls_deinit(*client->remote->tls_session);
gnutls_free(client->remote->tls_session);
client->remote->tls_session = NULL;
}
break;
#endif
default:
crm_warn("Unknown transport for client %s "
CRM_XS " flags=%#016" PRIx64,
pcmk__client_name(client), client->flags);
}
if (csock > 0) {
close(csock);
}
pcmk__free_client(client);
crm_trace("Freed the cib client");
if (cib_shutdown_flag) {
cib_shutdown(0);
}
return;
}
static void
cib_handle_remote_msg(pcmk__client_t *client, xmlNode *command)
{
const char *value = NULL;
value = crm_element_name(command);
if (!pcmk__str_eq(value, "cib_command", pcmk__str_casei)) {
crm_log_xml_trace(command, "Bad command: ");
return;
}
if (client->name == NULL) {
value = crm_element_value(command, F_CLIENTNAME);
if (value == NULL) {
client->name = strdup(client->id);
} else {
client->name = strdup(value);
}
}
- if (client->userdata == NULL) {
- value = crm_element_value(command, F_CIB_CALLBACK_TOKEN);
- if (value != NULL) {
- client->userdata = strdup(value);
- crm_trace("Callback channel for %s is %s", client->id, (char*)client->userdata);
-
- } else {
- client->userdata = strdup(client->id);
- }
- }
-
/* unset dangerous options */
xml_remove_prop(command, F_ORIG);
xml_remove_prop(command, F_CIB_HOST);
xml_remove_prop(command, F_CIB_GLOBAL_UPDATE);
crm_xml_add(command, F_TYPE, T_CIB);
crm_xml_add(command, F_CIB_CLIENTID, client->id);
crm_xml_add(command, F_CIB_CLIENTNAME, client->name);
crm_xml_add(command, F_CIB_USER, client->user);
if (crm_element_value(command, F_CIB_CALLID) == NULL) {
char *call_uuid = crm_generate_uuid();
/* fix the command */
crm_xml_add(command, F_CIB_CALLID, call_uuid);
free(call_uuid);
}
if (crm_element_value(command, F_CIB_CALLOPTS) == NULL) {
crm_xml_add_int(command, F_CIB_CALLOPTS, 0);
}
crm_log_xml_trace(command, "Remote command: ");
cib_common_callback_worker(0, 0, command, client, TRUE);
}
static int
cib_remote_msg(gpointer data)
{
xmlNode *command = NULL;
pcmk__client_t *client = data;
int rc;
int timeout = 1000;
if (pcmk_is_set(client->flags, pcmk__client_authenticated)) {
timeout = -1;
}
crm_trace("Remote %s message received for client %s",
pcmk__client_type_str(PCMK__CLIENT_TYPE(client)),
pcmk__client_name(client));
#ifdef HAVE_GNUTLS_GNUTLS_H
if ((PCMK__CLIENT_TYPE(client) == pcmk__client_tls)
&& !pcmk_is_set(client->flags, pcmk__client_tls_handshake_complete)) {
int rc = pcmk__read_handshake_data(client);
if (rc == EAGAIN) {
/* No more data is available at the moment. Just return for now;
* we'll get invoked again once the client sends more.
*/
return 0;
} else if (rc != pcmk_rc_ok) {
return -1;
}
crm_debug("TLS handshake with remote CIB client completed");
pcmk__set_client_flags(client, pcmk__client_tls_handshake_complete);
if (client->remote->auth_timeout) {
g_source_remove(client->remote->auth_timeout);
}
// Require the client to authenticate within this time
client->remote->auth_timeout = g_timeout_add(REMOTE_AUTH_TIMEOUT,
remote_auth_timeout_cb,
client);
return 0;
}
#endif
rc = pcmk__read_remote_message(client->remote, timeout);
/* must pass auth before we will process anything else */
if (!pcmk_is_set(client->flags, pcmk__client_authenticated)) {
xmlNode *reg;
const char *user = NULL;
command = pcmk__remote_message_xml(client->remote);
if (cib_remote_auth(command) == FALSE) {
free_xml(command);
return -1;
}
crm_notice("Remote CIB client connection accepted");
pcmk__set_client_flags(client, pcmk__client_authenticated);
g_source_remove(client->remote->auth_timeout);
client->remote->auth_timeout = 0;
client->name = crm_element_value_copy(command, "name");
user = crm_element_value(command, "user");
if (user) {
client->user = strdup(user);
}
/* send ACK */
reg = create_xml_node(NULL, "cib_result");
crm_xml_add(reg, F_CIB_OPERATION, CRM_OP_REGISTER);
crm_xml_add(reg, F_CIB_CLIENTID, client->id);
pcmk__remote_send_xml(client->remote, reg);
free_xml(reg);
free_xml(command);
}
command = pcmk__remote_message_xml(client->remote);
while (command) {
crm_trace("Remote client message received");
cib_handle_remote_msg(client, command);
free_xml(command);
command = pcmk__remote_message_xml(client->remote);
}
if (rc == ENOTCONN) {
crm_trace("Remote CIB client disconnected while reading from it");
return -1;
}
return 0;
}
#ifdef HAVE_PAM
static int
construct_pam_passwd(int num_msg, const struct pam_message **msg,
struct pam_response **response, void *data)
{
int count = 0;
struct pam_response *reply;
char *string = (char *)data;
CRM_CHECK(data, return PAM_CONV_ERR);
CRM_CHECK(num_msg == 1, return PAM_CONV_ERR); /* We only want to handle one message */
reply = calloc(1, sizeof(struct pam_response));
CRM_ASSERT(reply != NULL);
for (count = 0; count < num_msg; ++count) {
switch (msg[count]->msg_style) {
case PAM_TEXT_INFO:
crm_info("PAM: %s", msg[count]->msg);
break;
case PAM_PROMPT_ECHO_OFF:
case PAM_PROMPT_ECHO_ON:
reply[count].resp_retcode = 0;
reply[count].resp = string; /* We already made a copy */
break;
case PAM_ERROR_MSG:
/* In theory we'd want to print this, but then
* we see the password prompt in the logs
*/
/* crm_err("PAM error: %s", msg[count]->msg); */
break;
default:
crm_err("Unhandled conversation type: %d", msg[count]->msg_style);
goto bail;
}
}
*response = reply;
reply = NULL;
return PAM_SUCCESS;
bail:
for (count = 0; count < num_msg; ++count) {
if (reply[count].resp != NULL) {
switch (msg[count]->msg_style) {
case PAM_PROMPT_ECHO_ON:
case PAM_PROMPT_ECHO_OFF:
/* Erase the data - it contained a password. Zero the bytes in
 * place rather than advancing the stored pointer, so that
 * free() below still receives the original address.
 */
for (char *p = reply[count].resp; *p != '\0'; p++) {
    *p = '\0';
}
free(reply[count].resp);
break;
}
reply[count].resp = NULL;
}
}
free(reply);
reply = NULL;
return PAM_CONV_ERR;
}
#endif
int
authenticate_user(const char *user, const char *passwd)
{
#ifndef HAVE_PAM
gboolean pass = TRUE;
#else
int rc = 0;
gboolean pass = FALSE;
const void *p_user = NULL;
struct pam_conv p_conv;
struct pam_handle *pam_h = NULL;
static const char *pam_name = NULL;
if (pam_name == NULL) {
pam_name = getenv("CIB_pam_service");
}
if (pam_name == NULL) {
pam_name = "login";
}
p_conv.conv = construct_pam_passwd;
p_conv.appdata_ptr = strdup(passwd);
rc = pam_start(pam_name, user, &p_conv, &pam_h);
if (rc != PAM_SUCCESS) {
crm_err("Could not initialize PAM: %s (%d)", pam_strerror(pam_h, rc), rc);
goto bail;
}
rc = pam_authenticate(pam_h, 0);
if (rc != PAM_SUCCESS) {
crm_err("Authentication failed for %s: %s (%d)", user, pam_strerror(pam_h, rc), rc);
goto bail;
}
/* Make sure we authenticated the user we wanted to authenticate.
 * Since we also run as non-root, it might be worth pre-checking
 * that the user has the same effective UID as us, since that is
 * the only user we can authenticate.
 */
rc = pam_get_item(pam_h, PAM_USER, &p_user);
if (rc != PAM_SUCCESS) {
crm_err("Internal PAM error: %s (%d)", pam_strerror(pam_h, rc), rc);
goto bail;
} else if (p_user == NULL) {
crm_err("Unknown user authenticated.");
goto bail;
} else if (!pcmk__str_eq(p_user, user, pcmk__str_casei)) {
crm_err("User mismatch: %s vs. %s.", (const char *)p_user, (const char *)user);
goto bail;
}
rc = pam_acct_mgmt(pam_h, 0);
if (rc != PAM_SUCCESS) {
crm_err("Access denied: %s (%d)", pam_strerror(pam_h, rc), rc);
goto bail;
}
pass = TRUE;
bail:
pam_end(pam_h, rc);
#endif
return pass;
}
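/* Usage note (a sketch, not part of this patch): the PAM service consulted
 * above defaults to "login" but can be overridden through the environment
 * before the daemon starts, e.g.
 *
 *   CIB_pam_service=my-pam-service pacemaker-based
 *
 * where "my-pam-service" is a hypothetical PAM stack configured by the
 * administrator.
 */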
diff --git a/daemons/based/pacemaker-based.h b/daemons/based/pacemaker-based.h
index 44818bb4dd..05e49b3591 100644
--- a/daemons/based/pacemaker-based.h
+++ b/daemons/based/pacemaker-based.h
@@ -1,150 +1,150 @@
/*
- * Copyright 2004-2022 the Pacemaker project contributors
+ * Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU Lesser General Public License
* version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY.
*/
#ifndef PACEMAKER_BASED__H
# define PACEMAKER_BASED__H
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#ifdef HAVE_GNUTLS_GNUTLS_H
# include <gnutls/gnutls.h>
#endif
// CIB-specific client flags
enum cib_client_flags {
// Notifications
cib_notify_pre = (UINT64_C(1) << 0),
cib_notify_post = (UINT64_C(1) << 1),
cib_notify_replace = (UINT64_C(1) << 2),
cib_notify_confirm = (UINT64_C(1) << 3),
cib_notify_diff = (UINT64_C(1) << 4),
// Whether client is another cluster daemon
cib_is_daemon = (UINT64_C(1) << 12),
};
typedef struct cib_operation_s {
const char *operation;
gboolean modifies_cib;
gboolean needs_privileges;
- gboolean needs_quorum;
int (*prepare) (xmlNode *, xmlNode **, const char **);
int (*cleanup) (int, xmlNode **, xmlNode **);
int (*fn) (const char *, int, const char *, xmlNode *,
xmlNode *, xmlNode *, xmlNode **, xmlNode **);
} cib_operation_t;
extern bool based_is_primary;
extern GHashTable *config_hash;
extern xmlNode *the_cib;
extern crm_trigger_t *cib_writer;
extern gboolean cib_writes_enabled;
extern GMainLoop *mainloop;
extern crm_cluster_t *crm_cluster;
extern GHashTable *local_notify_queue;
extern gboolean legacy_mode;
extern gboolean stand_alone;
extern gboolean cib_shutdown_flag;
extern gchar *cib_root;
extern int cib_status;
extern pcmk__output_t *logger_out;
extern struct qb_ipcs_service_handlers ipc_ro_callbacks;
extern struct qb_ipcs_service_handlers ipc_rw_callbacks;
extern qb_ipcs_service_t *ipcs_ro;
extern qb_ipcs_service_t *ipcs_rw;
extern qb_ipcs_service_t *ipcs_shm;
void cib_peer_callback(xmlNode *msg, void *private_data);
void cib_common_callback_worker(uint32_t id, uint32_t flags,
xmlNode *op_request, pcmk__client_t *cib_client,
gboolean privileged);
void cib_shutdown(int nsig);
void terminate_cib(const char *caller, int fast);
gboolean cib_legacy_mode(void);
gboolean uninitializeCib(void);
xmlNode *readCibXmlFile(const char *dir, const char *file,
gboolean discard_status);
int activateCibXml(xmlNode *doc, gboolean to_disk, const char *op);
int cib_process_shutdown_req(const char *op, int options, const char *section,
xmlNode *req, xmlNode *input,
xmlNode *existing_cib, xmlNode **result_cib,
xmlNode **answer);
int cib_process_default(const char *op, int options, const char *section,
xmlNode *req, xmlNode *input, xmlNode *existing_cib,
xmlNode **result_cib, xmlNode **answer);
int cib_process_ping(const char *op, int options, const char *section,
xmlNode *req, xmlNode *input, xmlNode *existing_cib,
xmlNode **result_cib, xmlNode **answer);
int cib_process_readwrite(const char *op, int options, const char *section,
xmlNode *req, xmlNode *input, xmlNode *existing_cib,
xmlNode **result_cib, xmlNode **answer);
int cib_process_replace_svr(const char *op, int options, const char *section,
xmlNode *req, xmlNode *input, xmlNode *existing_cib,
xmlNode **result_cib, xmlNode **answer);
int cib_server_process_diff(const char *op, int options, const char *section,
xmlNode *req, xmlNode *input, xmlNode *existing_cib,
xmlNode **result_cib, xmlNode **answer);
int cib_process_sync(const char *op, int options, const char *section,
xmlNode *req, xmlNode *input, xmlNode *existing_cib,
xmlNode **result_cib, xmlNode **answer);
int cib_process_sync_one(const char *op, int options, const char *section,
xmlNode *req, xmlNode *input, xmlNode *existing_cib,
xmlNode **result_cib, xmlNode **answer);
int cib_process_delete_absolute(const char *op, int options,
const char *section, xmlNode *req,
xmlNode *input, xmlNode *existing_cib,
xmlNode **result_cib, xmlNode **answer);
int cib_process_upgrade_server(const char *op, int options, const char *section,
xmlNode *req, xmlNode *input,
xmlNode *existing_cib, xmlNode **result_cib,
xmlNode **answer);
void send_sync_request(const char *host);
int sync_our_cib(xmlNode *request, gboolean all);
xmlNode *cib_msg_copy(xmlNode *msg, gboolean with_data);
int cib_get_operation_id(const char *op, int *operation);
cib_op_t *cib_op_func(int call_type);
gboolean cib_op_modifies(int call_type);
int cib_op_prepare(int call_type, xmlNode *request, xmlNode **input,
const char **section);
int cib_op_cleanup(int call_type, int options, xmlNode **input,
xmlNode **output);
-int cib_op_can_run(int call_type, int call_options, gboolean privileged,
- gboolean global_update);
-void cib_diff_notify(int options, const char *client, const char *call_id,
- const char *op, xmlNode *update, int result,
- xmlNode *old_cib);
-void cib_replace_notify(const char *origin, xmlNode *update, int result,
- xmlNode *diff, uint32_t change_section);
+int cib_op_can_run(int call_type, int call_options, bool privileged);
+void cib_diff_notify(const char *op, int result, const char *call_id,
+ const char *client_id, const char *client_name,
+ const char *origin, xmlNode *update, xmlNode *diff);
+void cib_replace_notify(const char *op, int result, const char *call_id,
+ const char *client_id, const char *client_name,
+ const char *origin, xmlNode *update, xmlNode *diff,
+ uint32_t change_section);
static inline const char *
cib_config_lookup(const char *opt)
{
return g_hash_table_lookup(config_hash, opt);
}
#endif // PACEMAKER_BASED__H
diff --git a/daemons/controld/controld_cib.c b/daemons/controld/controld_cib.c
index bfa5a5b4fb..94b99ddf52 100644
--- a/daemons/controld/controld_cib.c
+++ b/daemons/controld/controld_cib.c
@@ -1,1033 +1,1138 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include <unistd.h> /* sleep */
#include
#include
#include
#include
#include
#include
-static int cib_retries = 0;
+// Call ID of the most recent in-progress CIB resource update (or 0 if none)
+static int pending_rsc_update = 0;
+
+// Call IDs of requested CIB replacements that won't trigger a new election
+// (used as a set of gint values)
+static GHashTable *cib_replacements = NULL;
+
+/*!
+ * \internal
+ * \brief Store the call ID of a CIB replacement that the controller requested
+ *
+ * The \p do_cib_replaced() callback function will avoid triggering a new
+ * election when we're notified of one of these expected replacements.
+ *
+ * \param[in] call_id CIB call ID (or 0 for a synchronous call)
+ *
+ * \note This function should be called after making any asynchronous CIB
+ * request (or before making any synchronous CIB request) that may replace
+ * part of the nodes or status section. This may include CIB sync calls.
+ */
+void
+controld_record_cib_replace_call(int call_id)
+{
+ CRM_CHECK(call_id >= 0, return);
+
+ if (cib_replacements == NULL) {
+ cib_replacements = g_hash_table_new(NULL, NULL);
+ }
+
+ /* If the call ID is already present in the table, then it's old. We may not
+ * be removing them properly, and we could improperly ignore replacement
+ * notifications if cib_t:call_id wraps around.
+ */
+ CRM_LOG_ASSERT(g_hash_table_add(cib_replacements,
+ GINT_TO_POINTER((gint) call_id)));
+}
+
+/*!
+ * \internal
+ * \brief Remove the call ID of a CIB replacement from the replacements table
+ *
+ * \param[in] call_id CIB call ID (or 0 for a synchronous call)
+ *
+ * \return \p true if \p call_id was found in the table, or \p false otherwise
+ *
+ * \note CIB notifications run before CIB callbacks. If this function is called
+ * from within a callback, \p do_cib_replaced() will have removed
+ * \p call_id from the table first if relevant changes triggered a
+ * notification.
+ */
+bool
+controld_forget_cib_replace_call(int call_id)
+{
+ CRM_CHECK(call_id >= 0, return false);
+
+ if (cib_replacements == NULL) {
+ return false;
+ }
+ return g_hash_table_remove(cib_replacements,
+ GINT_TO_POINTER((gint) call_id));
+}
+
+/*!
+ * \internal
+ * \brief Empty the hash table containing call IDs of CIB replacement requests
+ */
+void
+controld_forget_all_cib_replace_calls(void)
+{
+ if (cib_replacements != NULL) {
+ g_hash_table_remove_all(cib_replacements);
+ }
+}
+
+/*!
+ * \internal
+ * \brief Free the hash table containing call IDs of CIB replacement requests
+ */
+void
+controld_destroy_cib_replacements_table(void)
+{
+ if (cib_replacements != NULL) {
+ g_hash_table_destroy(cib_replacements);
+ cib_replacements = NULL;
+ }
+}
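+/* A minimal usage sketch (hypothetical caller, based on the notes above):
+ * record the call ID right after submitting an asynchronous replacement, so
+ * that do_cib_replaced() can recognize and ignore the resulting notification.
+ *
+ *   int call_id = cib_conn->cmds->replace(cib_conn, XML_CIB_TAG_STATUS,
+ *                                         status_xml, cib_scope_local);
+ *   if (call_id >= 0) {
+ *       controld_record_cib_replace_call(call_id);
+ *   }
+ */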
/*!
* \internal
* \brief Respond to a dropped CIB connection
*
* \param[in] user_data CIB connection that dropped
*/
static void
handle_cib_disconnect(gpointer user_data)
{
CRM_LOG_ASSERT(user_data == controld_globals.cib_conn);
controld_trigger_fsa();
controld_globals.cib_conn->state = cib_disconnected;
if (pcmk_is_set(controld_globals.fsa_input_register, R_CIB_CONNECTED)) {
// @TODO This should trigger a reconnect, not a shutdown
crm_crit("Lost connection to the CIB manager, shutting down");
register_fsa_input(C_FSA_INTERNAL, I_ERROR, NULL);
controld_clear_fsa_input_flags(R_CIB_CONNECTED);
} else { // Expected
crm_info("Connection to the CIB manager terminated");
}
}
static void
do_cib_updated(const char *event, xmlNode * msg)
{
if (pcmk__alert_in_patchset(msg, TRUE)) {
controld_trigger_config();
}
}
static void
do_cib_replaced(const char *event, xmlNode * msg)
{
+ int call_id = 0;
+ const char *client_id = crm_element_value(msg, F_CIB_CLIENTID);
uint32_t change_section = cib_change_section_nodes
|cib_change_section_status;
long long value = 0;
crm_debug("Updating the CIB after a replace: DC=%s", pcmk__btoa(AM_I_DC));
- if (AM_I_DC == FALSE) {
+ if (!AM_I_DC) {
return;
+ }
- } else if ((controld_globals.fsa_state == S_FINALIZE_JOIN)
- && pcmk_is_set(controld_globals.fsa_input_register,
- R_CIB_ASKED)) {
- /* no need to restart the join - we asked for this replace op */
+ if ((crm_element_value_int(msg, F_CIB_CALLID, &call_id) == 0)
+ && pcmk__str_eq(client_id, controld_globals.cib_client_id,
+ pcmk__str_none)
+ && controld_forget_cib_replace_call(call_id)) {
+ // We requested this replace op. No need to restart the join.
return;
}
if ((crm_element_value_ll(msg, F_CIB_CHANGE_SECTION, &value) < 0)
|| (value < 0) || (value > UINT32_MAX)) {
crm_trace("Couldn't parse '%s' from message", F_CIB_CHANGE_SECTION);
} else {
change_section = (uint32_t) value;
}
if (pcmk_any_flags_set(change_section, cib_change_section_nodes
|cib_change_section_status)) {
/* start the join process again so we get everyone's LRM status */
populate_cib_nodes(node_update_quick|node_update_all, __func__);
register_fsa_input(C_FSA_INTERNAL, I_ELECTION, NULL);
}
}
void
controld_disconnect_cib_manager(void)
{
cib_t *cib_conn = controld_globals.cib_conn;
CRM_ASSERT(cib_conn != NULL);
crm_info("Disconnecting from the CIB manager");
controld_clear_fsa_input_flags(R_CIB_CONNECTED);
cib_conn->cmds->del_notify_callback(cib_conn, T_CIB_REPLACE_NOTIFY,
do_cib_replaced);
cib_conn->cmds->del_notify_callback(cib_conn, T_CIB_DIFF_NOTIFY,
do_cib_updated);
cib_free_callbacks(cib_conn);
if (cib_conn->state != cib_disconnected) {
cib_conn->cmds->set_secondary(cib_conn,
cib_scope_local|cib_discard_reply);
cib_conn->cmds->signoff(cib_conn);
}
crm_notice("Disconnected from the CIB manager");
}
/* A_CIB_STOP, A_CIB_START, O_CIB_RESTART */
void
do_cib_control(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
+ static int cib_retries = 0;
+
cib_t *cib_conn = controld_globals.cib_conn;
void (*dnotify_fn) (gpointer user_data) = handle_cib_disconnect;
void (*replace_cb) (const char *event, xmlNodePtr msg) = do_cib_replaced;
void (*update_cb) (const char *event, xmlNodePtr msg) = do_cib_updated;
int rc = pcmk_ok;
CRM_ASSERT(cib_conn != NULL);
if (pcmk_is_set(action, A_CIB_STOP)) {
if ((cib_conn->state != cib_disconnected)
- && (controld_globals.resource_update != 0)) {
+ && (pending_rsc_update != 0)) {
crm_info("Waiting for resource update %d to complete",
- controld_globals.resource_update);
+ pending_rsc_update);
crmd_fsa_stall(FALSE);
return;
}
controld_disconnect_cib_manager();
}
if (!pcmk_is_set(action, A_CIB_START)) {
return;
}
if (cur_state == S_STOPPING) {
crm_err("Ignoring request to connect to the CIB manager after "
"shutdown");
return;
}
rc = cib_conn->cmds->signon(cib_conn, CRM_SYSTEM_CRMD,
cib_command_nonblocking);
if (rc != pcmk_ok) {
// A short wait that usually avoids stalling the FSA
sleep(1);
rc = cib_conn->cmds->signon(cib_conn, CRM_SYSTEM_CRMD,
cib_command_nonblocking);
}
if (rc != pcmk_ok) {
crm_info("Could not connect to the CIB manager: %s", pcmk_strerror(rc));
} else if (cib_conn->cmds->set_connection_dnotify(cib_conn,
dnotify_fn) != pcmk_ok) {
crm_err("Could not set dnotify callback");
} else if (cib_conn->cmds->add_notify_callback(cib_conn,
T_CIB_REPLACE_NOTIFY,
replace_cb) != pcmk_ok) {
crm_err("Could not set CIB notification callback (replace)");
} else if (cib_conn->cmds->add_notify_callback(cib_conn,
T_CIB_DIFF_NOTIFY,
update_cb) != pcmk_ok) {
crm_err("Could not set CIB notification callback (update)");
} else {
controld_set_fsa_input_flags(R_CIB_CONNECTED);
cib_retries = 0;
+ cib_conn->cmds->client_id(cib_conn, &controld_globals.cib_client_id,
+ NULL);
}
if (!pcmk_is_set(controld_globals.fsa_input_register, R_CIB_CONNECTED)) {
cib_retries++;
if (cib_retries < 30) {
crm_warn("Couldn't complete CIB registration %d times... "
"pause and retry", cib_retries);
controld_start_wait_timer();
crmd_fsa_stall(FALSE);
} else {
crm_err("Could not complete CIB registration %d times... "
"hard error", cib_retries);
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
}
}
}
#define MIN_CIB_OP_TIMEOUT (30)
/*!
* \internal
* \brief Get the timeout (in seconds) that should be used with CIB operations
*
* \return The maximum of 30 seconds, the value of the PCMK_cib_timeout
* environment variable, or 10 seconds times one more than the number of
* nodes in the cluster.
*/
unsigned int
cib_op_timeout(void)
{
static int env_timeout = -1;
unsigned int calculated_timeout = 0;
if (env_timeout == -1) {
const char *env = getenv("PCMK_cib_timeout");
pcmk__scan_min_int(env, &env_timeout, MIN_CIB_OP_TIMEOUT);
crm_trace("Minimum CIB op timeout: %ds (environment: %s)",
env_timeout, (env? env : "none"));
}
calculated_timeout = 1 + crm_active_peers();
if (crm_remote_peer_cache) {
calculated_timeout += g_hash_table_size(crm_remote_peer_cache);
}
calculated_timeout *= 10;
calculated_timeout = QB_MAX(calculated_timeout, env_timeout);
crm_trace("Calculated timeout: %us", calculated_timeout);
if (controld_globals.cib_conn) {
controld_globals.cib_conn->call_timeout = calculated_timeout;
}
return calculated_timeout;
}
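/* Worked example (assumed cluster size): with 4 active cluster peers and 2
 * remote nodes, calculated_timeout = (1 + 4 + 2) * 10 = 70s. With
 * PCMK_cib_timeout unset, env_timeout stays at the 30s minimum, so 70s is
 * used; with PCMK_cib_timeout=120, the result would be 120s.
 */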
/*!
* \internal
* \brief Get CIB call options to use local scope if primary is unavailable
*
* \return CIB call options
*/
int
crmd_cib_smart_opt(void)
{
- int call_opt = cib_quorum_override;
+ int call_opt = cib_none;
if ((controld_globals.fsa_state == S_ELECTION)
|| (controld_globals.fsa_state == S_PENDING)) {
crm_info("Sending update to local CIB in state: %s",
fsa_state2string(controld_globals.fsa_state));
cib__set_call_options(call_opt, "update", cib_scope_local);
}
return call_opt;
}
static void
cib_delete_callback(xmlNode *msg, int call_id, int rc, xmlNode *output,
void *user_data)
{
char *desc = user_data;
if (rc == 0) {
crm_debug("Deletion of %s (via CIB call %d) succeeded", desc, call_id);
} else {
crm_warn("Deletion of %s (via CIB call %d) failed: %s " CRM_XS " rc=%d",
desc, call_id, pcmk_strerror(rc), rc);
}
}
// Searches for various portions of node_state to delete
// Match a particular node's node_state (takes node name 1x)
#define XPATH_NODE_STATE "//" XML_CIB_TAG_STATE "[@" XML_ATTR_UNAME "='%s']"
// Node's lrm section (name 1x)
#define XPATH_NODE_LRM XPATH_NODE_STATE "/" XML_CIB_TAG_LRM
-// Node's lrm_rsc_op entries and lrm_resource entries without lock (name 2x)
+/* Node's lrm_rsc_op entries and lrm_resource entries without unexpired lock
+ * (name 2x, (seconds_since_epoch - XML_CONFIG_ATTR_SHUTDOWN_LOCK_LIMIT) 1x)
+ */
#define XPATH_NODE_LRM_UNLOCKED XPATH_NODE_STATE "//" XML_LRM_TAG_RSC_OP \
"|" XPATH_NODE_STATE \
"//" XML_LRM_TAG_RESOURCE \
- "[not(@" XML_CONFIG_ATTR_SHUTDOWN_LOCK ")]"
+ "[not(@" XML_CONFIG_ATTR_SHUTDOWN_LOCK ") " \
+ "or " XML_CONFIG_ATTR_SHUTDOWN_LOCK "<%lld]"
// Node's transient_attributes section (name 1x)
#define XPATH_NODE_ATTRS XPATH_NODE_STATE "/" XML_TAG_TRANSIENT_NODEATTRS
// Everything under node_state (name 1x)
#define XPATH_NODE_ALL XPATH_NODE_STATE "/*"
-// Unlocked history + transient attributes (name 3x)
+/* Unlocked history + transient attributes
+ * (name 2x, (seconds_since_epoch - XML_CONFIG_ATTR_SHUTDOWN_LOCK_LIMIT) 1x,
+ * name 1x)
+ */
#define XPATH_NODE_ALL_UNLOCKED XPATH_NODE_LRM_UNLOCKED "|" XPATH_NODE_ATTRS
/*!
* \internal
* \brief Delete subsection of a node's CIB node_state
*
* \param[in] uname Desired node
* \param[in] section Subsection of node_state to delete
* \param[in] options CIB call options to use
*/
void
controld_delete_node_state(const char *uname, enum controld_section_e section,
int options)
{
cib_t *cib_conn = controld_globals.cib_conn;
char *xpath = NULL;
char *desc = NULL;
+ // Shutdown locks that started before this time are expired
+ long long expire = (long long) time(NULL)
+ - controld_globals.shutdown_lock_limit;
+
CRM_CHECK(uname != NULL, return);
switch (section) {
case controld_section_lrm:
xpath = crm_strdup_printf(XPATH_NODE_LRM, uname);
desc = crm_strdup_printf("resource history for node %s", uname);
break;
case controld_section_lrm_unlocked:
- xpath = crm_strdup_printf(XPATH_NODE_LRM_UNLOCKED, uname, uname);
+ xpath = crm_strdup_printf(XPATH_NODE_LRM_UNLOCKED,
+ uname, uname, expire);
desc = crm_strdup_printf("resource history (other than shutdown "
"locks) for node %s", uname);
break;
case controld_section_attrs:
xpath = crm_strdup_printf(XPATH_NODE_ATTRS, uname);
desc = crm_strdup_printf("transient attributes for node %s", uname);
break;
case controld_section_all:
xpath = crm_strdup_printf(XPATH_NODE_ALL, uname);
desc = crm_strdup_printf("all state for node %s", uname);
break;
case controld_section_all_unlocked:
xpath = crm_strdup_printf(XPATH_NODE_ALL_UNLOCKED,
- uname, uname, uname);
+ uname, uname, expire, uname);
desc = crm_strdup_printf("all state (other than shutdown locks) "
"for node %s", uname);
break;
}
if (cib_conn == NULL) {
crm_warn("Unable to delete %s: no CIB connection", desc);
free(desc);
} else {
int call_id;
cib__set_call_options(options, "node state deletion",
- cib_quorum_override|cib_xpath|cib_multiple);
+ cib_xpath|cib_multiple);
call_id = cib_conn->cmds->remove(cib_conn, xpath, NULL, options);
crm_info("Deleting %s (via CIB call %d) " CRM_XS " xpath=%s",
desc, call_id, xpath);
fsa_register_cib_callback(call_id, desc, cib_delete_callback);
// CIB library handles freeing desc
}
free(xpath);
}
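/* For illustration (assuming the usual expansions of the XML_* constants,
 * e.g. XML_CIB_TAG_STATE as "node_state"), a hypothetical call such as
 *
 *   controld_delete_node_state("node1", controld_section_attrs,
 *                              cib_scope_local);
 *
 * requests an xpath-based delete of
 * //node_state[@uname='node1']/transient_attributes.
 */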
// Takes node name and resource ID
#define XPATH_RESOURCE_HISTORY "//" XML_CIB_TAG_STATE \
"[@" XML_ATTR_UNAME "='%s']/" \
XML_CIB_TAG_LRM "/" XML_LRM_TAG_RESOURCES \
"/" XML_LRM_TAG_RESOURCE \
"[@" XML_ATTR_ID "='%s']"
// @TODO could add "and @XML_CONFIG_ATTR_SHUTDOWN_LOCK" to limit to locks
/*!
* \internal
* \brief Clear resource history from CIB for a given resource and node
*
* \param[in] rsc_id ID of resource to be cleared
* \param[in] node Node whose resource history should be cleared
* \param[in] user_name ACL user name to use
* \param[in] call_options CIB call options
*
* \return Standard Pacemaker return code
*/
int
controld_delete_resource_history(const char *rsc_id, const char *node,
const char *user_name, int call_options)
{
char *desc = NULL;
char *xpath = NULL;
int rc = pcmk_rc_ok;
CRM_CHECK((rsc_id != NULL) && (node != NULL), return EINVAL);
desc = crm_strdup_printf("resource history for %s on %s", rsc_id, node);
if (controld_globals.cib_conn == NULL) {
crm_err("Unable to clear %s: no CIB connection", desc);
free(desc);
return ENOTCONN;
}
// Ask CIB to delete the entry
xpath = crm_strdup_printf(XPATH_RESOURCE_HISTORY, node, rsc_id);
rc = cib_internal_op(controld_globals.cib_conn, PCMK__CIB_REQUEST_DELETE,
NULL, xpath, NULL, NULL, call_options|cib_xpath,
user_name);
if (rc < 0) {
rc = pcmk_legacy2rc(rc);
crm_err("Could not delete resource status of %s on %s%s%s: %s "
CRM_XS " rc=%d", rsc_id, node,
(user_name? " for user " : ""), (user_name? user_name : ""),
pcmk_rc_str(rc), rc);
free(desc);
free(xpath);
return rc;
}
if (pcmk_is_set(call_options, cib_sync_call)) {
if (pcmk_is_set(call_options, cib_dryrun)) {
crm_debug("Deletion of %s would succeed", desc);
} else {
crm_debug("Deletion of %s succeeded", desc);
}
free(desc);
} else {
crm_info("Clearing %s (via CIB call %d) " CRM_XS " xpath=%s",
desc, rc, xpath);
fsa_register_cib_callback(rc, desc, cib_delete_callback);
// CIB library handles freeing desc
}
free(xpath);
return pcmk_rc_ok;
}
/*!
* \internal
* \brief Build XML and string of parameters meeting some criteria, for digest
*
* \param[in] op Executor event with parameter table to use
* \param[in] metadata Parsed meta-data for executed resource agent
* \param[in] param_type Flag used for selection criteria
* \param[out] result Will be set to newly created XML with selected
* parameters as attributes
*
* \return Newly allocated space-separated string of parameter names
* \note Selection criteria varies by param_type: for the restart digest, we
* want parameters that are *not* marked reloadable (OCF 1.1) or that
* *are* marked unique (pre-1.1), for both string and XML results; for the
* secure digest, we want parameters that *are* marked private for the
* string, but parameters that are *not* marked private for the XML.
* \note It is the caller's responsibility to free the string return value with
* \p g_string_free() and the XML result with \p free_xml().
*/
static GString *
build_parameter_list(const lrmd_event_data_t *op,
const struct ra_metadata_s *metadata,
enum ra_param_flags_e param_type, xmlNode **result)
{
GString *list = NULL;
*result = create_xml_node(NULL, XML_TAG_PARAMS);
/* Consider all parameters except private ones, to be consistent with
 * what the scheduler does with calculate_secure_digest().
 */
if (param_type == ra_param_private
&& compare_version(controld_globals.dc_version, "3.16.0") >= 0) {
g_hash_table_foreach(op->params, hash2field, *result);
pcmk__filter_op_for_digest(*result);
}
for (GList *iter = metadata->ra_params; iter != NULL; iter = iter->next) {
struct ra_param_s *param = (struct ra_param_s *) iter->data;
bool accept_for_list = false;
bool accept_for_xml = false;
switch (param_type) {
case ra_param_reloadable:
accept_for_list = !pcmk_is_set(param->rap_flags, param_type);
accept_for_xml = accept_for_list;
break;
case ra_param_unique:
accept_for_list = pcmk_is_set(param->rap_flags, param_type);
accept_for_xml = accept_for_list;
break;
case ra_param_private:
accept_for_list = pcmk_is_set(param->rap_flags, param_type);
accept_for_xml = !accept_for_list;
break;
}
if (accept_for_list) {
crm_trace("Attr %s is %s", param->rap_name, ra_param_flag2text(param_type));
if (list == NULL) {
// We will later search for " WORD ", so start list with a space
pcmk__add_word(&list, 256, " ");
}
pcmk__add_word(&list, 0, param->rap_name);
} else {
crm_trace("Rejecting %s for %s", param->rap_name, ra_param_flag2text(param_type));
}
if (accept_for_xml) {
const char *v = g_hash_table_lookup(op->params, param->rap_name);
if (v != NULL) {
crm_trace("Adding attr %s=%s to the xml result", param->rap_name, v);
crm_xml_add(*result, param->rap_name, v);
}
} else {
crm_trace("Removing attr %s from the xml result", param->rap_name);
xml_remove_prop(*result, param->rap_name);
}
}
if (list != NULL) {
// We will later search for " WORD ", so end list with a space
pcmk__add_word(&list, 0, " ");
}
return list;
}
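/* Illustration (hypothetical agent parameter): for param_type ==
 * ra_param_private, a parameter "passwd" marked private is accepted for the
 * string list, yielding " passwd " (padded so callers can search for
 * " WORD "), while the XML result omits it, matching the secure-digest
 * rules described above.
 */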
static void
append_restart_list(lrmd_event_data_t *op, struct ra_metadata_s *metadata,
xmlNode *update, const char *version)
{
GString *list = NULL;
char *digest = NULL;
xmlNode *restart = NULL;
CRM_LOG_ASSERT(op->params != NULL);
if (op->interval_ms > 0) {
/* monitors are not reloadable */
return;
}
if (pcmk_is_set(metadata->ra_flags, ra_supports_reload_agent)) {
// Add parameters not marked reloadable to the "op-force-restart" list
list = build_parameter_list(op, metadata, ra_param_reloadable,
&restart);
} else if (pcmk_is_set(metadata->ra_flags, ra_supports_legacy_reload)) {
/* @COMPAT pre-OCF-1.1 resource agents
*
* Before OCF 1.1, Pacemaker abused "unique=0" to indicate
* reloadability. Add any parameters with unique="1" to the
* "op-force-restart" list.
*/
list = build_parameter_list(op, metadata, ra_param_unique, &restart);
} else {
// Resource does not support agent reloads
return;
}
digest = calculate_operation_digest(restart, version);
/* Add "op-force-restart" and "op-restart-digest" to indicate the resource supports reload,
* no matter if it actually supports any parameters with unique="1"). */
crm_xml_add(update, XML_LRM_ATTR_OP_RESTART,
(list == NULL)? "" : (const char *) list->str);
crm_xml_add(update, XML_LRM_ATTR_RESTART_DIGEST, digest);
if ((list != NULL) && (list->len > 0)) {
crm_trace("%s: %s, %s", op->rsc_id, digest, (const char *) list->str);
} else {
crm_trace("%s: %s", op->rsc_id, digest);
}
if (list != NULL) {
g_string_free(list, TRUE);
}
free_xml(restart);
free(digest);
}
static void
append_secure_list(lrmd_event_data_t *op, struct ra_metadata_s *metadata,
xmlNode *update, const char *version)
{
GString *list = NULL;
char *digest = NULL;
xmlNode *secure = NULL;
CRM_LOG_ASSERT(op->params != NULL);
/*
* To keep XML_LRM_ATTR_OP_SECURE short, we want it to contain the
* secure parameters but XML_LRM_ATTR_SECURE_DIGEST to be based on
* the insecure ones
*/
list = build_parameter_list(op, metadata, ra_param_private, &secure);
if (list != NULL) {
digest = calculate_operation_digest(secure, version);
crm_xml_add(update, XML_LRM_ATTR_OP_SECURE, (const char *) list->str);
crm_xml_add(update, XML_LRM_ATTR_SECURE_DIGEST, digest);
crm_trace("%s: %s, %s", op->rsc_id, digest, (const char *) list->str);
g_string_free(list, TRUE);
} else {
crm_trace("%s: no secure parameters", op->rsc_id);
}
free_xml(secure);
free(digest);
}
/*!
* \internal
* \brief Create XML for a resource history entry
*
* \param[in] func Function name of caller
* \param[in,out] parent XML to add entry to
* \param[in] rsc Affected resource
* \param[in,out] op Action to add an entry for (or NULL to do nothing)
* \param[in] node_name Node where action occurred
*/
void
controld_add_resource_history_xml_as(const char *func, xmlNode *parent,
const lrmd_rsc_info_t *rsc,
lrmd_event_data_t *op,
const char *node_name)
{
int target_rc = 0;
xmlNode *xml_op = NULL;
struct ra_metadata_s *metadata = NULL;
const char *caller_version = NULL;
lrm_state_t *lrm_state = NULL;
if (op == NULL) {
return;
}
target_rc = rsc_op_expected_rc(op);
caller_version = g_hash_table_lookup(op->params, XML_ATTR_CRM_VERSION);
CRM_CHECK(caller_version != NULL, caller_version = CRM_FEATURE_SET);
xml_op = pcmk__create_history_xml(parent, op, caller_version, target_rc,
controld_globals.our_nodename, func);
if (xml_op == NULL) {
return;
}
if ((rsc == NULL) || (op->params == NULL)
|| !crm_op_needs_metadata(rsc->standard, op->op_type)) {
crm_trace("No digests needed for %s action on %s (params=%p rsc=%p)",
op->op_type, op->rsc_id, op->params, rsc);
return;
}
lrm_state = lrm_state_find(node_name);
if (lrm_state == NULL) {
crm_warn("Cannot calculate digests for operation " PCMK__OP_FMT
" because we have no connection to executor for %s",
op->rsc_id, op->op_type, op->interval_ms, node_name);
return;
}
/* Ideally the metadata is cached, and the agent is just a fallback.
*
* @TODO Go through all callers and ensure they get metadata asynchronously
* first.
*/
metadata = controld_get_rsc_metadata(lrm_state, rsc,
controld_metadata_from_agent
|controld_metadata_from_cache);
if (metadata == NULL) {
return;
}
crm_trace("Including additional digests for %s:%s:%s",
rsc->standard, rsc->provider, rsc->type);
append_restart_list(op, metadata, xml_op, caller_version);
append_secure_list(op, metadata, xml_op, caller_version);
return;
}
/*!
* \internal
* \brief Record an action as pending in the CIB, if appropriate
*
* \param[in] node_name Node where the action is pending
* \param[in] rsc Resource that action is for
* \param[in,out] op Pending action
*
* \return true if action was recorded in CIB, otherwise false
*/
bool
controld_record_pending_op(const char *node_name, const lrmd_rsc_info_t *rsc,
lrmd_event_data_t *op)
{
const char *record_pending = NULL;
CRM_CHECK((node_name != NULL) && (rsc != NULL) && (op != NULL),
return false);
// Never record certain operation types as pending
if ((op->op_type == NULL) || (op->params == NULL)
|| !controld_action_is_recordable(op->op_type)) {
return false;
}
// Check action's record-pending meta-attribute (defaults to true)
record_pending = crm_meta_value(op->params, XML_OP_ATTR_PENDING);
if ((record_pending != NULL) && !crm_is_true(record_pending)) {
return false;
}
op->call_id = -1;
op->t_run = time(NULL);
op->t_rcchange = op->t_run;
lrmd__set_result(op, PCMK_OCF_UNKNOWN, PCMK_EXEC_PENDING, NULL);
crm_debug("Recording pending %s-interval %s for %s on %s in the CIB",
pcmk__readable_interval(op->interval_ms), op->op_type, op->rsc_id,
node_name);
controld_update_resource_history(node_name, rsc, op, 0);
return true;
}
static void
cib_rsc_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
switch (rc) {
case pcmk_ok:
case -pcmk_err_diff_failed:
case -pcmk_err_diff_resync:
crm_trace("Resource update %d complete: rc=%d", call_id, rc);
break;
default:
crm_warn("Resource update %d failed: (rc=%d) %s", call_id, rc, pcmk_strerror(rc));
}
- if (call_id == controld_globals.resource_update) { // Most recent CIB call
- controld_globals.resource_update = 0;
+ if (call_id == pending_rsc_update) {
+ pending_rsc_update = 0;
controld_trigger_fsa();
}
}
/* Only successful stops, and probes that found the resource inactive, get locks
* recorded in the history. This ensures the resource stays locked to the node
* until it is active there again after the node comes back up.
*/
static bool
should_preserve_lock(lrmd_event_data_t *op)
{
if (!pcmk_is_set(controld_globals.flags, controld_shutdown_lock_enabled)) {
return false;
}
if (!strcmp(op->op_type, RSC_STOP) && (op->rc == PCMK_OCF_OK)) {
return true;
}
if (!strcmp(op->op_type, RSC_STATUS) && (op->rc == PCMK_OCF_NOT_RUNNING)) {
return true;
}
return false;
}
/*!
* \internal
* \brief Request a CIB update
*
* \param[in] section Section of CIB to update
* \param[in,out] data New XML of CIB section to update
* \param[in] options CIB call options
* \param[in] callback If not NULL, set this as the operation callback
*
* \return Standard Pacemaker return code
+ *
+ * \note If \p callback is \p cib_rsc_callback(), the CIB update's call ID is
+ * stored in \p pending_rsc_update on success.
*/
int
controld_update_cib(const char *section, xmlNode *data, int options,
void (*callback)(xmlNode *, int, int, xmlNode *, void *))
{
int cib_rc = -ENOTCONN;
CRM_ASSERT(data != NULL);
if (controld_globals.cib_conn != NULL) {
cib_rc = cib_internal_op(controld_globals.cib_conn,
PCMK__CIB_REQUEST_MODIFY, NULL, section,
data, NULL, options, NULL);
if (cib_rc >= 0) {
crm_debug("Submitted CIB update %d for %s section",
cib_rc, section);
}
}
if (callback == NULL) {
if (cib_rc < 0) {
crm_err("Failed to update CIB %s section: %s",
section, pcmk_rc_str(pcmk_legacy2rc(cib_rc)));
}
} else {
if ((cib_rc >= 0) && (callback == cib_rsc_callback)) {
/* Checking for a particular callback is a little hacky, but it
* didn't seem worth adding an output argument for cib_rc for just
* one use case.
*/
- controld_globals.resource_update = cib_rc;
+ pending_rsc_update = cib_rc;
}
fsa_register_cib_callback(cib_rc, NULL, callback);
}
return (cib_rc >= 0)? pcmk_rc_ok : pcmk_legacy2rc(cib_rc);
}
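/* Typical call, mirroring the real use later in this file:
 *
 *   rc = controld_update_cib(XML_CIB_TAG_STATUS, update,
 *                            crmd_cib_smart_opt(), cib_rsc_callback);
 */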
/*!
* \internal
* \brief Update resource history entry in CIB
*
* \param[in] node_name Node where action occurred
* \param[in] rsc Resource that action is for
* \param[in,out] op Action to record
* \param[in] lock_time If nonzero, when resource was locked to node
*
* \note On success, the CIB update's call ID will be stored in
- * controld_globals.resource_update.
+ * pending_rsc_update.
*/
void
controld_update_resource_history(const char *node_name,
const lrmd_rsc_info_t *rsc,
lrmd_event_data_t *op, time_t lock_time)
{
xmlNode *update = NULL;
xmlNode *xml = NULL;
int call_opt = crmd_cib_smart_opt();
const char *node_id = NULL;
const char *container = NULL;
CRM_CHECK((node_name != NULL) && (op != NULL), return);
if (rsc == NULL) {
crm_warn("Resource %s no longer exists in the executor", op->rsc_id);
controld_ack_event_directly(NULL, NULL, rsc, op, op->rsc_id);
return;
}
// <status>
update = create_xml_node(NULL, XML_CIB_TAG_STATUS);
//   <node_state ...>
xml = create_xml_node(update, XML_CIB_TAG_STATE);
if (pcmk__str_eq(node_name, controld_globals.our_nodename,
pcmk__str_casei)) {
node_id = controld_globals.our_uuid;
} else {
node_id = node_name;
pcmk__xe_set_bool_attr(xml, XML_NODE_IS_REMOTE, true);
}
crm_xml_add(xml, XML_ATTR_ID, node_id);
crm_xml_add(xml, XML_ATTR_UNAME, node_name);
crm_xml_add(xml, XML_ATTR_ORIGIN, __func__);
//     <lrm>
xml = create_xml_node(xml, XML_CIB_TAG_LRM);
crm_xml_add(xml, XML_ATTR_ID, node_id);
//       <lrm_resources>
xml = create_xml_node(xml, XML_LRM_TAG_RESOURCES);
//         <lrm_resource ...>
xml = create_xml_node(xml, XML_LRM_TAG_RESOURCE);
crm_xml_add(xml, XML_ATTR_ID, op->rsc_id);
crm_xml_add(xml, XML_AGENT_ATTR_CLASS, rsc->standard);
crm_xml_add(xml, XML_AGENT_ATTR_PROVIDER, rsc->provider);
crm_xml_add(xml, XML_ATTR_TYPE, rsc->type);
if (lock_time != 0) {
/* Actions on a locked resource should either preserve the lock by
* recording it with the action result, or clear it.
*/
if (!should_preserve_lock(op)) {
lock_time = 0;
}
crm_xml_add_ll(xml, XML_CONFIG_ATTR_SHUTDOWN_LOCK,
(long long) lock_time);
}
if (op->params != NULL) {
container = g_hash_table_lookup(op->params,
CRM_META "_" XML_RSC_ATTR_CONTAINER);
if (container != NULL) {
crm_trace("Resource %s is a part of container resource %s",
op->rsc_id, container);
crm_xml_add(xml, XML_RSC_ATTR_CONTAINER, container);
}
}
//           <lrm_rsc_op ...> (possibly more than one)
controld_add_resource_history_xml(xml, rsc, op, node_name);
/* Update CIB asynchronously. Even if it fails, the resource state should be
* discovered during the next election. Worst case, the node is wrongly
* fenced for running a resource it isn't.
*/
crm_log_xml_trace(update, __func__);
controld_update_cib(XML_CIB_TAG_STATUS, update, call_opt, cib_rsc_callback);
free_xml(update);
}
/*!
* \internal
* \brief Erase an LRM history entry from the CIB, given the operation data
*
* \param[in] op Operation whose history should be deleted
*/
void
controld_delete_action_history(const lrmd_event_data_t *op)
{
xmlNode *xml_top = NULL;
CRM_CHECK(op != NULL, return);
xml_top = create_xml_node(NULL, XML_LRM_TAG_RSC_OP);
crm_xml_add_int(xml_top, XML_LRM_ATTR_CALLID, op->call_id);
crm_xml_add(xml_top, XML_ATTR_TRANSITION_KEY, op->user_data);
if (op->interval_ms > 0) {
char *op_id = pcmk__op_key(op->rsc_id, op->op_type, op->interval_ms);
/* Avoid deleting last_failure too (if it was a result of this recurring op failing) */
crm_xml_add(xml_top, XML_ATTR_ID, op_id);
free(op_id);
}
crm_debug("Erasing resource operation history for " PCMK__OP_FMT " (call=%d)",
op->rsc_id, op->op_type, op->interval_ms, op->call_id);
controld_globals.cib_conn->cmds->remove(controld_globals.cib_conn,
XML_CIB_TAG_STATUS, xml_top,
- cib_quorum_override);
+ cib_none);
crm_log_xml_trace(xml_top, "op:cancel");
free_xml(xml_top);
}
/* Define xpath to find LRM resource history entry by node and resource */
#define XPATH_HISTORY \
"/" XML_TAG_CIB "/" XML_CIB_TAG_STATUS \
"/" XML_CIB_TAG_STATE "[@" XML_ATTR_UNAME "='%s']" \
"/" XML_CIB_TAG_LRM "/" XML_LRM_TAG_RESOURCES \
"/" XML_LRM_TAG_RESOURCE "[@" XML_ATTR_ID "='%s']" \
"/" XML_LRM_TAG_RSC_OP
/* ... and also by operation key */
#define XPATH_HISTORY_ID XPATH_HISTORY \
"[@" XML_ATTR_ID "='%s']"
/* ... and also by operation key and operation call ID */
#define XPATH_HISTORY_CALL XPATH_HISTORY \
"[@" XML_ATTR_ID "='%s' and @" XML_LRM_ATTR_CALLID "='%d']"
/* ... and also by operation key and original operation key */
#define XPATH_HISTORY_ORIG XPATH_HISTORY \
"[@" XML_ATTR_ID "='%s' and @" XML_LRM_ATTR_TASK_KEY "='%s']"
/*!
* \internal
* \brief Delete a last_failure resource history entry from the CIB
*
* \param[in] rsc_id Name of resource to clear history for
* \param[in] node Name of node to clear history for
* \param[in] action If specified, delete only if this was failed action
* \param[in] interval_ms If \p action is specified, it has this interval
*/
void
controld_cib_delete_last_failure(const char *rsc_id, const char *node,
const char *action, guint interval_ms)
{
char *xpath = NULL;
char *last_failure_key = NULL;
CRM_CHECK((rsc_id != NULL) && (node != NULL), return);
// Generate XPath to match desired entry
last_failure_key = pcmk__op_key(rsc_id, "last_failure", 0);
if (action == NULL) {
xpath = crm_strdup_printf(XPATH_HISTORY_ID, node, rsc_id,
last_failure_key);
} else {
char *action_key = pcmk__op_key(rsc_id, action, interval_ms);
xpath = crm_strdup_printf(XPATH_HISTORY_ORIG, node, rsc_id,
last_failure_key, action_key);
free(action_key);
}
free(last_failure_key);
controld_globals.cib_conn->cmds->remove(controld_globals.cib_conn, xpath,
- NULL,
- cib_quorum_override|cib_xpath);
+ NULL, cib_xpath);
free(xpath);
}
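/* For illustration (assuming pcmk__op_key() builds keys of the form
 * "<rsc>_<action>_<interval>"): clearing a failed 10s monitor of "myrsc" on
 * "node1" matches the entry with id "myrsc_last_failure_0" and
 * operation_key "myrsc_monitor_10000".
 */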
/*!
* \internal
* \brief Delete resource history entry from the CIB, given operation key
*
* \param[in] rsc_id Name of resource to clear history for
* \param[in] node Name of node to clear history for
* \param[in] key Operation key of operation to clear history for
* \param[in] call_id If specified, delete entry only if it has this call ID
*/
void
controld_delete_action_history_by_key(const char *rsc_id, const char *node,
const char *key, int call_id)
{
char *xpath = NULL;
CRM_CHECK((rsc_id != NULL) && (node != NULL) && (key != NULL), return);
if (call_id > 0) {
xpath = crm_strdup_printf(XPATH_HISTORY_CALL, node, rsc_id, key,
call_id);
} else {
xpath = crm_strdup_printf(XPATH_HISTORY_ID, node, rsc_id, key);
}
controld_globals.cib_conn->cmds->remove(controld_globals.cib_conn, xpath,
- NULL,
- cib_quorum_override|cib_xpath);
+ NULL, cib_xpath);
free(xpath);
}
diff --git a/daemons/controld/controld_cib.h b/daemons/controld/controld_cib.h
index eb7fe1094d..bd9492a796 100644
--- a/daemons/controld/controld_cib.h
+++ b/daemons/controld/controld_cib.h
@@ -1,119 +1,125 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU Lesser General Public License
* version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY.
*/
#ifndef PCMK__CONTROLD_CIB__H
#define PCMK__CONTROLD_CIB__H
+#include <stdbool.h>
+
+#include <glib.h>
+
#include
#include
#include <crm/cib/internal.h> // PCMK__CIB_REQUEST_MODIFY
-#include // controld_globals.cib_conn
+#include "controld_globals.h" // controld_globals.cib_conn
static inline void
fsa_cib_anon_update(const char *section, xmlNode *data) {
if (controld_globals.cib_conn == NULL) {
crm_err("No CIB connection available");
} else {
- const int opts = cib_scope_local|cib_quorum_override|cib_can_create;
-
controld_globals.cib_conn->cmds->modify(controld_globals.cib_conn,
- section, data, opts);
+ section, data,
+ cib_scope_local|cib_can_create);
}
}
static inline void
fsa_cib_anon_update_discard_reply(const char *section, xmlNode *data) {
if (controld_globals.cib_conn == NULL) {
crm_err("No CIB connection available");
} else {
- const int opts = cib_scope_local
- |cib_quorum_override
- |cib_can_create
- |cib_discard_reply;
-
controld_globals.cib_conn->cmds->modify(controld_globals.cib_conn,
- section, data, opts);
+ section, data,
+ cib_scope_local
+ |cib_can_create
+ |cib_discard_reply);
}
}
+void controld_record_cib_replace_call(int call_id);
+bool controld_forget_cib_replace_call(int call_id);
+void controld_forget_all_cib_replace_calls(void);
+void controld_destroy_cib_replacements_table(void);
+
int controld_update_cib(const char *section, xmlNode *data, int options,
void (*callback)(xmlNode *, int, int, xmlNode *,
void *));
unsigned int cib_op_timeout(void);
// Subsections of node_state
enum controld_section_e {
controld_section_lrm,
controld_section_lrm_unlocked,
controld_section_attrs,
controld_section_all,
controld_section_all_unlocked
};
void controld_delete_node_state(const char *uname,
enum controld_section_e section, int options);
int controld_delete_resource_history(const char *rsc_id, const char *node,
const char *user_name, int call_options);
/* Convenience macro for registering a CIB callback
* (assumes that data can be freed with free())
*/
# define fsa_register_cib_callback(id, data, fn) do { \
cib_t *cib_conn = controld_globals.cib_conn; \
\
CRM_ASSERT(cib_conn != NULL); \
cib_conn->cmds->register_callback_full(cib_conn, id, cib_op_timeout(), \
FALSE, data, #fn, fn, free); \
} while(0)
void controld_add_resource_history_xml_as(const char *func, xmlNode *parent,
const lrmd_rsc_info_t *rsc,
lrmd_event_data_t *op,
const char *node_name);
#define controld_add_resource_history_xml(parent, rsc, op, node_name) \
controld_add_resource_history_xml_as(__func__, (parent), (rsc), \
(op), (node_name))
bool controld_record_pending_op(const char *node_name,
const lrmd_rsc_info_t *rsc,
lrmd_event_data_t *op);
void controld_update_resource_history(const char *node_name,
const lrmd_rsc_info_t *rsc,
lrmd_event_data_t *op, time_t lock_time);
void controld_delete_action_history(const lrmd_event_data_t *op);
void controld_cib_delete_last_failure(const char *rsc_id, const char *node,
const char *action, guint interval_ms);
void controld_delete_action_history_by_key(const char *rsc_id, const char *node,
const char *key, int call_id);
void controld_disconnect_cib_manager(void);
int crmd_cib_smart_opt(void);
/*!
* \internal
* \brief Check whether an action type should be recorded in the CIB
*
* \param[in] action Action type
*
* \return true if action should be recorded, false otherwise
*/
static inline bool
controld_action_is_recordable(const char *action)
{
return !pcmk__str_any_of(action, CRMD_ACTION_CANCEL, CRMD_ACTION_DELETE,
CRMD_ACTION_NOTIFY, CRMD_ACTION_METADATA, NULL);
}
#endif // PCMK__CONTROLD_CIB__H
diff --git a/daemons/controld/controld_control.c b/daemons/controld/controld_control.c
index bd67feae65..ffc62a01ac 100644
--- a/daemons/controld/controld_control.c
+++ b/daemons/controld/controld_control.c
@@ -1,839 +1,857 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
static qb_ipcs_service_t *ipcs = NULL;
static crm_trigger_t *config_read_trigger = NULL;
#if SUPPORT_COROSYNC
extern gboolean crm_connect_corosync(crm_cluster_t * cluster);
#endif
void crm_shutdown(int nsig);
static gboolean crm_read_options(gpointer user_data);
/* A_HA_CONNECT */
void
do_ha_control(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
gboolean registered = FALSE;
static crm_cluster_t *cluster = NULL;
if (cluster == NULL) {
cluster = pcmk_cluster_new();
}
if (action & A_HA_DISCONNECT) {
crm_cluster_disconnect(cluster);
crm_info("Disconnected from the cluster");
controld_set_fsa_input_flags(R_HA_DISCONNECTED);
}
if (action & A_HA_CONNECT) {
crm_set_status_callback(&peer_update_callback);
crm_set_autoreap(FALSE);
#if SUPPORT_COROSYNC
if (is_corosync_cluster()) {
registered = crm_connect_corosync(cluster);
}
#endif // SUPPORT_COROSYNC
if (registered) {
controld_election_init(cluster->uname);
controld_globals.our_nodename = cluster->uname;
controld_globals.our_uuid = cluster->uuid;
if(cluster->uuid == NULL) {
crm_err("Could not obtain local uuid");
registered = FALSE;
}
}
if (!registered) {
controld_set_fsa_input_flags(R_HA_DISCONNECTED);
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
return;
}
populate_cib_nodes(node_update_none, __func__);
controld_clear_fsa_input_flags(R_HA_DISCONNECTED);
crm_info("Connected to the cluster");
}
if (action & ~(A_HA_CONNECT | A_HA_DISCONNECT)) {
crm_err("Unexpected action %s in %s", fsa_action2string(action),
__func__);
}
}
/* A_SHUTDOWN */
void
do_shutdown(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state, enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
/* just in case */
controld_set_fsa_input_flags(R_SHUTDOWN);
controld_disconnect_fencer(FALSE);
}
/* A_SHUTDOWN_REQ */
void
do_shutdown_req(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
xmlNode *msg = NULL;
controld_set_fsa_input_flags(R_SHUTDOWN);
//controld_set_fsa_input_flags(R_STAYDOWN);
crm_info("Sending shutdown request to all peers (DC is %s)",
pcmk__s(controld_globals.dc_name, "not set"));
msg = create_request(CRM_OP_SHUTDOWN_REQ, NULL, NULL, CRM_SYSTEM_CRMD, CRM_SYSTEM_CRMD, NULL);
if (send_cluster_message(NULL, crm_msg_crmd, msg, TRUE) == FALSE) {
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
}
free_xml(msg);
}
void
crmd_fast_exit(crm_exit_t exit_code)
{
if (pcmk_is_set(controld_globals.fsa_input_register, R_STAYDOWN)) {
crm_warn("Inhibiting respawn "CRM_XS" remapping exit code %d to %d",
exit_code, CRM_EX_FATAL);
exit_code = CRM_EX_FATAL;
} else if ((exit_code == CRM_EX_OK)
&& pcmk_is_set(controld_globals.fsa_input_register,
R_IN_RECOVERY)) {
crm_err("Could not recover from internal error");
exit_code = CRM_EX_ERROR;
}
if (controld_globals.logger_out != NULL) {
controld_globals.logger_out->finish(controld_globals.logger_out,
exit_code, true, NULL);
pcmk__output_free(controld_globals.logger_out);
controld_globals.logger_out = NULL;
}
crm_exit(exit_code);
}
crm_exit_t
crmd_exit(crm_exit_t exit_code)
{
GMainLoop *mloop = controld_globals.mainloop;
static bool in_progress = FALSE;
if (in_progress && (exit_code == CRM_EX_OK)) {
crm_debug("Exit is already in progress");
return exit_code;
} else if(in_progress) {
crm_notice("Error during shutdown process, exiting now with status %d (%s)",
exit_code, crm_exit_str(exit_code));
crm_write_blackbox(SIGTRAP, NULL);
crmd_fast_exit(exit_code);
}
in_progress = TRUE;
crm_trace("Preparing to exit with status %d (%s)",
exit_code, crm_exit_str(exit_code));
/* Suppress secondary errors resulting from us disconnecting everything */
controld_set_fsa_input_flags(R_HA_DISCONNECTED);
/* Close all IPC servers and clients to ensure any and all shared memory files are cleaned up */
if(ipcs) {
crm_trace("Closing IPC server");
mainloop_del_ipc_server(ipcs);
ipcs = NULL;
}
controld_close_attrd_ipc();
controld_shutdown_schedulerd_ipc();
controld_disconnect_fencer(TRUE);
if ((exit_code == CRM_EX_OK) && (controld_globals.mainloop == NULL)) {
crm_debug("No mainloop detected");
exit_code = CRM_EX_ERROR;
}
/* On an error, just get out.
 *
 * Otherwise, make the effort to have the mainloop exit gracefully so
 * that it (mostly) cleans up after itself and valgrind has less to
 * report on, allowing real errors to stand out.
 */
if (exit_code != CRM_EX_OK) {
crm_notice("Forcing immediate exit with status %d (%s)",
exit_code, crm_exit_str(exit_code));
crm_write_blackbox(SIGTRAP, NULL);
crmd_fast_exit(exit_code);
}
/* Clean up as much memory as possible for valgrind */
for (GList *iter = controld_globals.fsa_message_queue; iter != NULL;
iter = iter->next) {
fsa_data_t *fsa_data = (fsa_data_t *) iter->data;
crm_info("Dropping %s: [ state=%s cause=%s origin=%s ]",
fsa_input2string(fsa_data->fsa_input),
fsa_state2string(controld_globals.fsa_state),
fsa_cause2string(fsa_data->fsa_cause), fsa_data->origin);
delete_fsa_input(fsa_data);
}
controld_clear_fsa_input_flags(R_MEMBERSHIP);
g_list_free(controld_globals.fsa_message_queue);
controld_globals.fsa_message_queue = NULL;
controld_election_fini();
/* Tear down the CIB manager connection, but don't free it yet -- it could
* be used when we drain the mainloop later.
*/
controld_disconnect_cib_manager();
verify_stopped(controld_globals.fsa_state, LOG_WARNING);
controld_clear_fsa_input_flags(R_LRM_CONNECTED);
lrm_state_destroy_all();
mainloop_destroy_trigger(config_read_trigger);
config_read_trigger = NULL;
controld_destroy_fsa_trigger();
controld_destroy_transition_trigger();
pcmk__client_cleanup();
crm_peer_destroy();
controld_free_fsa_timers();
te_cleanup_stonith_history_sync(NULL, TRUE);
controld_free_sched_timer();
free(controld_globals.our_nodename);
controld_globals.our_nodename = NULL;
free(controld_globals.our_uuid);
controld_globals.our_uuid = NULL;
free(controld_globals.dc_name);
controld_globals.dc_name = NULL;
free(controld_globals.dc_version);
controld_globals.dc_version = NULL;
free(controld_globals.cluster_name);
controld_globals.cluster_name = NULL;
free(controld_globals.te_uuid);
controld_globals.te_uuid = NULL;
free_max_generation();
+ controld_destroy_cib_replacements_table();
controld_destroy_failed_sync_table();
+ controld_destroy_outside_events_table();
mainloop_destroy_signal(SIGPIPE);
mainloop_destroy_signal(SIGUSR1);
mainloop_destroy_signal(SIGTERM);
mainloop_destroy_signal(SIGTRAP);
/* leave SIGCHLD engaged as we might still want to drain some service-actions */
if (mloop) {
GMainContext *ctx = g_main_loop_get_context(controld_globals.mainloop);
/* Don't re-enter this block */
controld_globals.mainloop = NULL;
/* no more signal handling during this final drain */
mainloop_destroy_signal(SIGCHLD);
crm_trace("Draining mainloop %d %d", g_main_loop_is_running(mloop), g_main_context_pending(ctx));
{
int lpc = 0;
while (g_main_context_pending(ctx) && (lpc < 10)) {
lpc++;
crm_trace("Iteration %d", lpc);
g_main_context_dispatch(ctx);
}
}
crm_trace("Closing mainloop %d %d", g_main_loop_is_running(mloop), g_main_context_pending(ctx));
g_main_loop_quit(mloop);
/* Won't do anything yet, since we're inside it now */
g_main_loop_unref(mloop);
} else {
mainloop_destroy_signal(SIGCHLD);
}
cib_delete(controld_globals.cib_conn);
controld_globals.cib_conn = NULL;
throttle_fini();
/* Graceful */
crm_trace("Done preparing for exit with status %d (%s)",
exit_code, crm_exit_str(exit_code));
return exit_code;
}
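/* Illustrative sketch (not part of the original patch): the bounded drain
 * pattern used in crmd_exit() above, shown in isolation. Pending GLib sources
 * are dispatched at most a fixed number of times, so shutdown cannot spin
 * forever on a source that keeps re-arming itself. All names below are
 * hypothetical.
 */
#if 0
static void
example_drain_context(GMainContext *ctx, int max_iterations)
{
    for (int i = 0; (i < max_iterations) && g_main_context_pending(ctx); i++) {
        g_main_context_dispatch(ctx);   // run one batch of ready sources
    }
}
#endif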
/* A_EXIT_0, A_EXIT_1 */
void
do_exit(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state, enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
crm_exit_t exit_code = CRM_EX_OK;
int log_level = LOG_INFO;
const char *exit_type = "gracefully";
if (action & A_EXIT_1) {
log_level = LOG_ERR;
exit_type = "forcefully";
exit_code = CRM_EX_ERROR;
}
verify_stopped(cur_state, LOG_ERR);
do_crm_log(log_level, "Performing %s - %s exiting the controller",
fsa_action2string(action), exit_type);
crm_info("[%s] stopped (%d)", crm_system_name, exit_code);
crmd_exit(exit_code);
}
static void sigpipe_ignore(int nsig) { return; }
/* A_STARTUP */
void
do_startup(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state, enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
crm_debug("Registering Signal Handlers");
mainloop_add_signal(SIGTERM, crm_shutdown);
mainloop_add_signal(SIGPIPE, sigpipe_ignore);
config_read_trigger = mainloop_add_trigger(G_PRIORITY_HIGH,
crm_read_options, NULL);
controld_init_fsa_trigger();
controld_init_transition_trigger();
crm_debug("Creating CIB manager and executor objects");
controld_globals.cib_conn = cib_new();
lrm_state_init_local();
if (controld_init_fsa_timers() == FALSE) {
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
}
}
// \return libqb error code (0 on success, -errno on error)
static int32_t
accept_controller_client(qb_ipcs_connection_t *c, uid_t uid, gid_t gid)
{
crm_trace("Accepting new IPC client connection");
if (pcmk__new_client(c, uid, gid) == NULL) {
return -EIO;
}
return 0;
}
// \return libqb error code (0 on success, -errno on error)
static int32_t
dispatch_controller_ipc(qb_ipcs_connection_t * c, void *data, size_t size)
{
uint32_t id = 0;
uint32_t flags = 0;
pcmk__client_t *client = pcmk__find_client(c);
xmlNode *msg = pcmk__client_data2xml(client, data, &id, &flags);
if (msg == NULL) {
pcmk__ipc_send_ack(client, id, flags, "ack", NULL, CRM_EX_PROTOCOL);
return 0;
}
pcmk__ipc_send_ack(client, id, flags, "ack", NULL, CRM_EX_INDETERMINATE);
CRM_ASSERT(client->user != NULL);
pcmk__update_acl_user(msg, F_CRM_USER, client->user);
crm_xml_add(msg, F_CRM_SYS_FROM, client->id);
if (controld_authorize_ipc_message(msg, client, NULL)) {
crm_trace("Processing IPC message from client %s",
pcmk__client_name(client));
route_message(C_IPC_MESSAGE, msg);
}
controld_trigger_fsa();
free_xml(msg);
return 0;
}
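/* Illustrative note (not part of the original patch): the dispatcher above
 * acknowledges every request immediately -- CRM_EX_PROTOCOL when the payload
 * cannot be parsed, CRM_EX_INDETERMINATE otherwise -- and then routes the
 * message through the FSA, so any definitive outcome reaches the client
 * asynchronously rather than in the ack itself.
 */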
static int32_t
ipc_client_disconnected(qb_ipcs_connection_t *c)
{
pcmk__client_t *client = pcmk__find_client(c);
if (client) {
crm_trace("Disconnecting %sregistered client %s (%p/%p)",
(client->userdata? "" : "un"), pcmk__client_name(client),
c, client);
free(client->userdata);
pcmk__free_client(client);
controld_trigger_fsa();
}
return 0;
}
static void
ipc_connection_destroyed(qb_ipcs_connection_t *c)
{
crm_trace("Connection %p", c);
ipc_client_disconnected(c);
}
/* A_STOP */
void
do_stop(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state, enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
crm_trace("Closing IPC server");
mainloop_del_ipc_server(ipcs); ipcs = NULL;
register_fsa_input(C_FSA_INTERNAL, I_TERMINATE, NULL);
}
/* A_STARTED */
void
do_started(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state, enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
static struct qb_ipcs_service_handlers crmd_callbacks = {
.connection_accept = accept_controller_client,
.connection_created = NULL,
.msg_process = dispatch_controller_ipc,
.connection_closed = ipc_client_disconnected,
.connection_destroyed = ipc_connection_destroyed
};
if (cur_state != S_STARTING) {
crm_err("Start cancelled... %s", fsa_state2string(cur_state));
return;
} else if (!pcmk_is_set(controld_globals.fsa_input_register,
R_MEMBERSHIP)) {
crm_info("Delaying start, no membership data (%.16llx)", R_MEMBERSHIP);
crmd_fsa_stall(TRUE);
return;
} else if (!pcmk_is_set(controld_globals.fsa_input_register,
R_LRM_CONNECTED)) {
crm_info("Delaying start, not connected to executor (%.16llx)", R_LRM_CONNECTED);
crmd_fsa_stall(TRUE);
return;
} else if (!pcmk_is_set(controld_globals.fsa_input_register,
R_CIB_CONNECTED)) {
crm_info("Delaying start, CIB not connected (%.16llx)", R_CIB_CONNECTED);
crmd_fsa_stall(TRUE);
return;
} else if (!pcmk_is_set(controld_globals.fsa_input_register,
R_READ_CONFIG)) {
crm_info("Delaying start, Config not read (%.16llx)", R_READ_CONFIG);
crmd_fsa_stall(TRUE);
return;
} else if (!pcmk_is_set(controld_globals.fsa_input_register, R_PEER_DATA)) {
crm_info("Delaying start, No peer data (%.16llx)", R_PEER_DATA);
crmd_fsa_stall(TRUE);
return;
}
crm_debug("Init server comms");
ipcs = pcmk__serve_controld_ipc(&crmd_callbacks);
if (ipcs == NULL) {
crm_err("Failed to create IPC server: shutting down and inhibiting respawn");
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
} else {
crm_notice("Pacemaker controller successfully started and accepting connections");
}
controld_trigger_fencer_connect();
controld_clear_fsa_input_flags(R_STARTING);
register_fsa_input(msg_data->fsa_cause, I_PENDING, NULL);
}
/* A_RECOVER */
void
do_recover(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state, enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
controld_set_fsa_input_flags(R_IN_RECOVERY);
crm_warn("Fast-tracking shutdown in response to errors");
register_fsa_input(C_FSA_INTERNAL, I_TERMINATE, NULL);
}
static pcmk__cluster_option_t controller_options[] = {
/* name, old name, type, allowed values,
* default value, validator,
* short description,
* long description
*/
{
"dc-version", NULL, "string", NULL, PCMK__VALUE_NONE, NULL,
N_("Pacemaker version on cluster node elected Designated Controller (DC)"),
N_("Includes a hash which identifies the exact changeset the code was "
"built from. Used for diagnostic purposes.")
},
{
"cluster-infrastructure", NULL, "string", NULL, "corosync", NULL,
N_("The messaging stack on which Pacemaker is currently running"),
N_("Used for informational and diagnostic purposes.")
},
{
"cluster-name", NULL, "string", NULL, NULL, NULL,
N_("An arbitrary name for the cluster"),
N_("This optional value is mostly for users' convenience as desired "
"in administration, but may also be used in Pacemaker "
"configuration rules via the #cluster-name node attribute, and "
"by higher-level tools and resource agents.")
},
{
XML_CONFIG_ATTR_DC_DEADTIME, NULL, "time",
NULL, "20s", pcmk__valid_interval_spec,
N_("How long to wait for a response from other nodes during start-up"),
N_("The optimal value will depend on the speed and load of your network "
"and the type of switches used.")
},
{
XML_CONFIG_ATTR_RECHECK, NULL, "time",
N_("Zero disables polling, while positive values are an interval in seconds"
"(unless other units are specified, for example \"5min\")"),
"15min", pcmk__valid_interval_spec,
N_("Polling interval to recheck cluster state and evaluate rules "
"with date specifications"),
N_("Pacemaker is primarily event-driven, and looks ahead to know when to "
"recheck cluster state for failure timeouts and most time-based "
"rules. However, it will also recheck the cluster after this "
"amount of inactivity, to evaluate rules with date specifications "
"and serve as a fail-safe for certain types of scheduler bugs.")
},
{
"load-threshold", NULL, "percentage", NULL,
"80%", pcmk__valid_percentage,
N_("Maximum amount of system load that should be used by cluster nodes"),
N_("The cluster will slow down its recovery process when the amount of "
"system resources used (currently CPU) approaches this limit"),
},
{
"node-action-limit", NULL, "integer", NULL,
"0", pcmk__valid_number,
N_("Maximum number of jobs that can be scheduled per node "
"(defaults to 2x cores)")
},
{ XML_CONFIG_ATTR_FENCE_REACTION, NULL, "string", NULL, "stop", NULL,
N_("How a cluster node should react if notified of its own fencing"),
N_("A cluster node may receive notification of its own fencing if fencing "
"is misconfigured, or if fabric fencing is in use that doesn't cut "
"cluster communication. Allowed values are \"stop\" to attempt to "
"immediately stop Pacemaker and stay stopped, or \"panic\" to attempt "
"to immediately reboot the local node, falling back to stop on failure.")
},
{
XML_CONFIG_ATTR_ELECTION_FAIL, NULL, "time", NULL,
"2min", pcmk__valid_interval_spec,
"*** Advanced Use Only ***",
N_("Declare an election failed if it is not decided within this much "
"time. If you need to adjust this value, it probably indicates "
"the presence of a bug.")
},
{
XML_CONFIG_ATTR_FORCE_QUIT, NULL, "time", NULL,
"20min", pcmk__valid_interval_spec,
"*** Advanced Use Only ***",
N_("Exit immediately if shutdown does not complete within this much "
"time. If you need to adjust this value, it probably indicates "
"the presence of a bug.")
},
{
"join-integration-timeout", "crmd-integration-timeout", "time", NULL,
"3min", pcmk__valid_interval_spec,
"*** Advanced Use Only ***",
N_("If you need to adjust this value, it probably indicates "
"the presence of a bug.")
},
{
"join-finalization-timeout", "crmd-finalization-timeout", "time", NULL,
"30min", pcmk__valid_interval_spec,
"*** Advanced Use Only ***",
N_("If you need to adjust this value, it probably indicates "
"the presence of a bug.")
},
{
"transition-delay", "crmd-transition-delay", "time", NULL,
"0s", pcmk__valid_interval_spec,
N_("*** Advanced Use Only *** Enabling this option will slow down "
"cluster recovery under all conditions"),
N_("Delay cluster recovery for this much time to allow for additional "
"events to occur. Useful if your configuration is sensitive to "
"the order in which ping updates arrive.")
},
{
"stonith-watchdog-timeout", NULL, "time", NULL,
"0", controld_verify_stonith_watchdog_timeout,
N_("How long before nodes can be assumed to be safely down when "
"watchdog-based self-fencing via SBD is in use"),
N_("If this is set to a positive value, lost nodes are assumed to "
"self-fence using watchdog-based SBD within this much time. This "
"does not require a fencing resource to be explicitly configured, "
"though a fence_watchdog resource can be configured, to limit use "
"to specific nodes. If this is set to 0 (the default), the cluster "
"will never assume watchdog-based self-fencing. If this is set to a "
"negative value, the cluster will use twice the local value of the "
"`SBD_WATCHDOG_TIMEOUT` environment variable if that is positive, "
"or otherwise treat this as 0. WARNING: When used, this timeout "
"must be larger than `SBD_WATCHDOG_TIMEOUT` on all nodes that use "
"watchdog-based SBD, and Pacemaker will refuse to start on any of "
"those nodes where this is not true for the local value or SBD is "
"not active. When this is set to a negative value, "
"`SBD_WATCHDOG_TIMEOUT` must be set to the same value on all nodes "
"that use SBD, otherwise data corruption or loss could occur.")
},
{
"stonith-max-attempts", NULL, "integer", NULL,
"10", pcmk__valid_positive_number,
N_("How many times fencing can fail before it will no longer be "
"immediately re-attempted on a target")
},
// Already documented in libpe_status (other values must be kept identical)
{
"no-quorum-policy", NULL, "select",
"stop, freeze, ignore, demote, suicide", "stop", pcmk__valid_quorum,
- "What to do when the cluster does not have quorum", NULL
+ N_("What to do when the cluster does not have quorum"), NULL
},
{
XML_CONFIG_ATTR_SHUTDOWN_LOCK, NULL, "boolean", NULL,
"false", pcmk__valid_boolean,
- "Whether to lock resources to a cleanly shut down node",
- "When true, resources active on a node when it is cleanly shut down "
+ N_("Whether to lock resources to a cleanly shut down node"),
+ N_("When true, resources active on a node when it is cleanly shut down "
"are kept \"locked\" to that node (not allowed to run elsewhere) "
"until they start again on that node after it rejoins (or for at "
"most shutdown-lock-limit, if set). Stonith resources and "
"Pacemaker Remote connections are never locked. Clone and bundle "
- "instances and the promoted role of promotable clones are currently"
- " never locked, though support could be added in a future release."
+ "instances and the promoted role of promotable clones are "
+ "currently never locked, though support could be added in a future "
+ "release.")
+ },
+ {
+ XML_CONFIG_ATTR_SHUTDOWN_LOCK_LIMIT, NULL, "time", NULL,
+ "0", pcmk__valid_interval_spec,
+ N_("Do not lock resources to a cleanly shut down node longer than "
+ "this"),
+ N_("If shutdown-lock is true and this is set to a nonzero time "
+ "duration, shutdown locks will expire after this much time has "
+ "passed since the shutdown was initiated, even if the node has not "
+ "rejoined.")
},
};
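/* Illustrative sketch (not part of the original patch): how a configured
 * stonith-watchdog-timeout could be resolved per the option description
 * above -- nonnegative values are used as-is (0 disables the assumption),
 * while negative values mean "twice SBD_WATCHDOG_TIMEOUT if that is set and
 * positive, otherwise 0". The function name is hypothetical.
 */
#if 0
static long
example_effective_watchdog_timeout_s(long configured_s)
{
    if (configured_s >= 0) {
        return configured_s;    // 0 means never assume self-fencing
    } else {
        const char *env = getenv("SBD_WATCHDOG_TIMEOUT");
        long sbd_s = (env != NULL)? strtol(env, NULL, 10) : 0;

        return (sbd_s > 0)? (2 * sbd_s) : 0;
    }
}
#endif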
void
crmd_metadata(void)
{
const char *desc_short = "Pacemaker controller options";
const char *desc_long = "Cluster options used by Pacemaker's controller";
gchar *s = pcmk__format_option_metadata("pacemaker-controld", desc_short,
desc_long, controller_options,
PCMK__NELEM(controller_options));
printf("%s", s);
g_free(s);
}
static void
config_query_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
const char *value = NULL;
GHashTable *config_hash = NULL;
crm_time_t *now = crm_time_new(NULL);
xmlNode *crmconfig = NULL;
xmlNode *alerts = NULL;
if (rc != pcmk_ok) {
fsa_data_t *msg_data = NULL;
crm_err("Local CIB query resulted in an error: %s", pcmk_strerror(rc));
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
if (rc == -EACCES || rc == -pcmk_err_schema_validation) {
crm_err("The cluster is mis-configured - shutting down and staying down");
controld_set_fsa_input_flags(R_STAYDOWN);
}
goto bail;
}
crmconfig = output;
if ((crmconfig) &&
(crm_element_name(crmconfig)) &&
(strcmp(crm_element_name(crmconfig), XML_CIB_TAG_CRMCONFIG) != 0)) {
crmconfig = first_named_child(crmconfig, XML_CIB_TAG_CRMCONFIG);
}
if (!crmconfig) {
fsa_data_t *msg_data = NULL;
crm_err("Local CIB query for " XML_CIB_TAG_CRMCONFIG " section failed");
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
goto bail;
}
crm_debug("Call %d : Parsing CIB options", call_id);
config_hash = pcmk__strkey_table(free, free);
pe_unpack_nvpairs(crmconfig, crmconfig, XML_CIB_TAG_PROPSET, NULL,
config_hash, CIB_OPTIONS_FIRST, FALSE, now, NULL);
// Validate all options, and use defaults if not already present in hash
pcmk__validate_cluster_options(config_hash, controller_options,
PCMK__NELEM(controller_options));
value = g_hash_table_lookup(config_hash, "no-quorum-policy");
if (pcmk__str_eq(value, "suicide", pcmk__str_casei) && pcmk__locate_sbd()) {
controld_set_global_flags(controld_no_quorum_suicide);
}
value = g_hash_table_lookup(config_hash, XML_CONFIG_ATTR_SHUTDOWN_LOCK);
if (crm_is_true(value)) {
controld_set_global_flags(controld_shutdown_lock_enabled);
} else {
controld_clear_global_flags(controld_shutdown_lock_enabled);
}
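+ // crm_parse_interval_spec() yields milliseconds, while the global lock
+ // limit is tracked in seconds, hence the division by 1000 below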
+ value = g_hash_table_lookup(config_hash,
+ XML_CONFIG_ATTR_SHUTDOWN_LOCK_LIMIT);
+ controld_globals.shutdown_lock_limit = crm_parse_interval_spec(value)
+ / 1000;
+
value = g_hash_table_lookup(config_hash, "cluster-name");
pcmk__str_update(&(controld_globals.cluster_name), value);
// Let subcomponents initialize their own static variables
controld_configure_election(config_hash);
controld_configure_fencing(config_hash);
controld_configure_fsa_timers(config_hash);
controld_configure_throttle(config_hash);
alerts = first_named_child(output, XML_CIB_TAG_ALERTS);
crmd_unpack_alerts(alerts);
controld_set_fsa_input_flags(R_READ_CONFIG);
controld_trigger_fsa();
g_hash_table_destroy(config_hash);
bail:
crm_time_free(now);
}
/*!
* \internal
* \brief Trigger read and processing of the configuration
*
* \param[in] fn Calling function name
* \param[in] line Line number where call occurred
*/
void
controld_trigger_config_as(const char *fn, int line)
{
if (config_read_trigger != NULL) {
crm_trace("%s:%d - Triggered config processing", fn, line);
mainloop_set_trigger(config_read_trigger);
}
}
gboolean
crm_read_options(gpointer user_data)
{
cib_t *cib_conn = controld_globals.cib_conn;
int call_id = cib_conn->cmds->query(cib_conn,
"//" XML_CIB_TAG_CRMCONFIG
" | //" XML_CIB_TAG_ALERTS,
NULL, cib_xpath|cib_scope_local);
fsa_register_cib_callback(call_id, NULL, config_query_callback);
crm_trace("Querying the CIB... call %d", call_id);
return TRUE;
}
/* A_READCONFIG */
void
do_read_config(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
throttle_init();
controld_trigger_config();
}
void
crm_shutdown(int nsig)
{
const char *value = NULL;
guint default_period_ms = 0;
if ((controld_globals.mainloop == NULL)
|| !g_main_loop_is_running(controld_globals.mainloop)) {
crmd_exit(CRM_EX_OK);
return;
}
if (pcmk_is_set(controld_globals.fsa_input_register, R_SHUTDOWN)) {
crm_err("Escalating shutdown");
register_fsa_input_before(C_SHUTDOWN, I_ERROR, NULL);
return;
}
controld_set_fsa_input_flags(R_SHUTDOWN);
register_fsa_input(C_SHUTDOWN, I_SHUTDOWN, NULL);
/* If shutdown timer doesn't have a period set, use the default
*
* @TODO: Evaluate whether this is still necessary. As long as
* config_query_callback() has been run at least once, it doesn't look like
* anything could have changed the timer period since then.
*/
value = pcmk__cluster_option(NULL, controller_options,
PCMK__NELEM(controller_options),
XML_CONFIG_ATTR_FORCE_QUIT);
default_period_ms = crm_parse_interval_spec(value);
controld_shutdown_start_countdown(default_period_ms);
}
diff --git a/daemons/controld/controld_election.c b/daemons/controld/controld_election.c
index 7d6bd76ead..5f33d5b92d 100644
--- a/daemons/controld/controld_election.c
+++ b/daemons/controld/controld_election.c
@@ -1,293 +1,292 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
#include
#include
static election_t *fsa_election = NULL;
static gboolean
election_win_cb(gpointer data)
{
register_fsa_input(C_FSA_INTERNAL, I_ELECTION_DC, NULL);
return FALSE;
}
void
controld_election_init(const char *uname)
{
fsa_election = election_init("DC", uname, 60000 /*60s*/, election_win_cb);
}
/*!
* \internal
* \brief Configure election options based on the CIB
*
* \param[in,out] options Name/value pairs for configured options
*/
void
controld_configure_election(GHashTable *options)
{
const char *value = NULL;
value = g_hash_table_lookup(options, XML_CONFIG_ATTR_ELECTION_FAIL);
election_timeout_set_period(fsa_election, crm_parse_interval_spec(value));
}
void
controld_remove_voter(const char *uname)
{
election_remove(fsa_election, uname);
if (pcmk__str_eq(uname, controld_globals.dc_name, pcmk__str_casei)) {
/* Clear any election dampening in effect. Otherwise, if the lost DC had
* just won, an immediate new election could fizzle out with no new DC.
*/
election_clear_dampening(fsa_election);
}
}
void
controld_election_fini(void)
{
election_fini(fsa_election);
fsa_election = NULL;
}
void
controld_stop_current_election_timeout(void)
{
election_timeout_stop(fsa_election);
}
/* A_ELECTION_VOTE */
void
do_election_vote(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
gboolean not_voting = FALSE;
/* don't vote if we're in one of these states or wanting to shut down */
switch (cur_state) {
case S_STARTING:
case S_RECOVERY:
case S_STOPPING:
case S_TERMINATE:
crm_warn("Not voting in election, we're in state %s", fsa_state2string(cur_state));
not_voting = TRUE;
break;
case S_ELECTION:
case S_INTEGRATION:
case S_RELEASE_DC:
break;
default:
crm_err("Broken? Voting in state %s", fsa_state2string(cur_state));
break;
}
if (not_voting == FALSE) {
if (pcmk_is_set(controld_globals.fsa_input_register, R_STARTING)) {
not_voting = TRUE;
}
}
if (not_voting) {
if (AM_I_DC) {
register_fsa_input(C_FSA_INTERNAL, I_RELEASE_DC, NULL);
} else {
register_fsa_input(C_FSA_INTERNAL, I_PENDING, NULL);
}
return;
}
election_vote(fsa_election);
return;
}
void
do_election_check(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
if (controld_globals.fsa_state == S_ELECTION) {
election_check(fsa_election);
} else {
crm_debug("Ignoring election check because we are not in an election");
}
}
/* A_ELECTION_COUNT */
void
do_election_count_vote(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
enum election_result rc = 0;
ha_msg_input_t *vote = fsa_typed_data(fsa_dt_ha_msg);
if(crm_peer_cache == NULL) {
if (!pcmk_is_set(controld_globals.fsa_input_register, R_SHUTDOWN)) {
crm_err("Internal error, no peer cache");
}
return;
}
rc = election_count_vote(fsa_election, vote->msg, cur_state != S_STARTING);
switch(rc) {
case election_start:
election_reset(fsa_election);
register_fsa_input(C_FSA_INTERNAL, I_ELECTION, NULL);
break;
case election_lost:
update_dc(NULL);
if (pcmk_is_set(controld_globals.fsa_input_register, R_THE_DC)) {
cib_t *cib_conn = controld_globals.cib_conn;
register_fsa_input(C_FSA_INTERNAL, I_RELEASE_DC, NULL);
cib_conn->cmds->set_secondary(cib_conn, cib_scope_local);
} else if (cur_state != S_STARTING) {
register_fsa_input(C_FSA_INTERNAL, I_PENDING, NULL);
}
break;
default:
crm_trace("Election message resulted in state %d", rc);
}
}
static void
feature_update_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
if (rc != pcmk_ok) {
fsa_data_t *msg_data = NULL;
crm_notice("Feature update failed: %s "CRM_XS" rc=%d",
pcmk_strerror(rc), rc);
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
}
}
/*!
* \internal
* \brief Update a node attribute in the CIB during a DC takeover
*
* \param[in] name Name of attribute to update
* \param[in] value New attribute value
*/
#define dc_takeover_update_attr(name, value) do { \
cib__update_node_attr(controld_globals.logger_out, \
controld_globals.cib_conn, cib_none, \
XML_CIB_TAG_CRMCONFIG, NULL, NULL, NULL, NULL, \
name, value, NULL, NULL); \
} while (0)
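/* Editor's illustration (not part of the original patch): the do { ... }
 * while (0) wrapper above is the usual idiom for statement-like macros; it
 * makes each expansion a single statement, so the macro composes safely with
 * unbraced if/else, as in this hypothetical use:
 *
 *     if (become_dc)
 *         dc_takeover_update_attr("cluster-name", name);
 *     else
 *         crm_trace("Not taking over");
 */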
/* A_DC_TAKEOVER */
void
do_dc_takeover(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
xmlNode *cib = NULL;
const char *cluster_type = name_for_cluster_type(get_cluster_type());
pid_t watchdog = pcmk__locate_sbd();
crm_info("Taking over DC status for this partition");
controld_set_fsa_input_flags(R_THE_DC);
execute_stonith_cleanup();
election_reset(fsa_election);
controld_set_fsa_input_flags(R_JOIN_OK|R_INVOKE_PE);
controld_globals.cib_conn->cmds->set_primary(controld_globals.cib_conn,
cib_scope_local);
cib = create_xml_node(NULL, XML_TAG_CIB);
crm_xml_add(cib, XML_ATTR_CRM_VERSION, CRM_FEATURE_SET);
- controld_update_cib(XML_TAG_CIB, cib, cib_quorum_override,
- feature_update_callback);
+ controld_update_cib(XML_TAG_CIB, cib, cib_none, feature_update_callback);
dc_takeover_update_attr(XML_ATTR_HAVE_WATCHDOG, pcmk__btoa(watchdog));
dc_takeover_update_attr("dc-version", PACEMAKER_VERSION "-" BUILD_VERSION);
dc_takeover_update_attr("cluster-infrastructure", cluster_type);
#if SUPPORT_COROSYNC
if ((controld_globals.cluster_name == NULL) && is_corosync_cluster()) {
char *cluster_name = pcmk__corosync_cluster_name();
if (cluster_name != NULL) {
dc_takeover_update_attr("cluster-name", cluster_name);
}
free(cluster_name);
}
#endif
controld_trigger_config();
free_xml(cib);
}
/* A_DC_RELEASE */
void
do_dc_release(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
if (action & A_DC_RELEASE) {
crm_debug("Releasing the role of DC");
controld_clear_fsa_input_flags(R_THE_DC);
controld_expect_sched_reply(NULL);
} else if (action & A_DC_RELEASED) {
crm_info("DC role released");
#if 0
if (are there errors) {
/* we can't stay up if not healthy */
/* or perhaps I_ERROR and go to S_RECOVER? */
result = I_SHUTDOWN;
}
#endif
if (pcmk_is_set(controld_globals.fsa_input_register, R_SHUTDOWN)) {
xmlNode *update = NULL;
crm_node_t *node = crm_get_peer(0, controld_globals.our_nodename);
pcmk__update_peer_expected(__func__, node, CRMD_JOINSTATE_DOWN);
update = create_node_state_update(node, node_update_expected, NULL,
__func__);
/* No response from the CIB manager (pacemaker-based) is needed, because controld will stop. */
fsa_cib_anon_update_discard_reply(XML_CIB_TAG_STATUS, update);
free_xml(update);
}
register_fsa_input(C_FSA_INTERNAL, I_RELEASE_SUCCESS, NULL);
} else {
crm_err("Unknown DC action %s", fsa_action2string(action));
}
crm_trace("Am I still the DC? %s", AM_I_DC ? XML_BOOLEAN_YES : XML_BOOLEAN_NO);
}
diff --git a/daemons/controld/controld_execd_state.c b/daemons/controld/controld_execd_state.c
index c900f3defe..f9a6e825ae 100644
--- a/daemons/controld/controld_execd_state.c
+++ b/daemons/controld/controld_execd_state.c
@@ -1,798 +1,806 @@
/*
* Copyright 2012-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
static GHashTable *lrm_state_table = NULL;
extern GHashTable *proxy_table;
int lrmd_internal_proxy_send(lrmd_t * lrmd, xmlNode *msg);
void lrmd_internal_set_proxy_callback(lrmd_t * lrmd, void *userdata, void (*callback)(lrmd_t *lrmd, void *userdata, xmlNode *msg));
static void
free_rsc_info(gpointer value)
{
lrmd_rsc_info_t *rsc_info = value;
lrmd_free_rsc_info(rsc_info);
}
static void
free_deletion_op(gpointer value)
{
struct pending_deletion_op_s *op = value;
free(op->rsc);
delete_ha_msg_input(op->input);
free(op);
}
static void
free_recurring_op(gpointer value)
{
active_op_t *op = value;
free(op->user_data);
free(op->rsc_id);
free(op->op_type);
free(op->op_key);
if (op->params) {
g_hash_table_destroy(op->params);
}
free(op);
}
static gboolean
fail_pending_op(gpointer key, gpointer value, gpointer user_data)
{
lrmd_event_data_t event = { 0, };
lrm_state_t *lrm_state = user_data;
active_op_t *op = value;
crm_trace("Pre-emptively failing " PCMK__OP_FMT " on %s (call=%s, %s)",
op->rsc_id, op->op_type, op->interval_ms,
lrm_state->node_name, (char*)key, op->user_data);
event.type = lrmd_event_exec_complete;
event.rsc_id = op->rsc_id;
event.op_type = op->op_type;
event.user_data = op->user_data;
event.timeout = 0;
event.interval_ms = op->interval_ms;
lrmd__set_result(&event, PCMK_OCF_UNKNOWN_ERROR, PCMK_EXEC_NOT_CONNECTED,
"Action was pending when executor connection was dropped");
event.t_run = (unsigned int) op->start_time;
event.t_rcchange = (unsigned int) op->start_time;
event.call_id = op->call_id;
event.remote_nodename = lrm_state->node_name;
event.params = op->params;
process_lrm_event(lrm_state, &event, op, NULL);
lrmd__reset_result(&event);
return TRUE;
}
gboolean
lrm_state_is_local(lrm_state_t *lrm_state)
{
return (lrm_state != NULL)
&& pcmk__str_eq(lrm_state->node_name, controld_globals.our_nodename,
pcmk__str_casei);
}
-lrm_state_t *
+/*!
+ * \internal
+ * \brief Create executor state entry for a node and add it to the state table
+ *
+ * \param[in] node_name Node to create entry for
+ *
+ * \return Newly allocated executor state object initialized for \p node_name
+ */
+static lrm_state_t *
lrm_state_create(const char *node_name)
{
lrm_state_t *state = NULL;
if (!node_name) {
crm_err("No node name given for lrm state object");
return NULL;
}
state = calloc(1, sizeof(lrm_state_t));
if (!state) {
return NULL;
}
state->node_name = strdup(node_name);
state->rsc_info_cache = pcmk__strkey_table(NULL, free_rsc_info);
state->deletion_ops = pcmk__strkey_table(free, free_deletion_op);
state->active_ops = pcmk__strkey_table(free, free_recurring_op);
state->resource_history = pcmk__strkey_table(NULL, history_free);
state->metadata_cache = metadata_cache_new();
g_hash_table_insert(lrm_state_table, (char *)state->node_name, state);
return state;
}
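/* Illustrative note (not part of the original patch): lrm_state_table is
 * created in lrm_state_init_local() with a NULL key-destroy function because
 * the key inserted above is state->node_name, which is owned by the value
 * and freed in internal_lrm_state_destroy().
 */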
void
lrm_state_destroy(const char *node_name)
{
g_hash_table_remove(lrm_state_table, node_name);
}
static gboolean
remote_proxy_remove_by_node(gpointer key, gpointer value, gpointer user_data)
{
remote_proxy_t *proxy = value;
const char *node_name = user_data;
if (pcmk__str_eq(node_name, proxy->node_name, pcmk__str_casei)) {
return TRUE;
}
return FALSE;
}
static void
internal_lrm_state_destroy(gpointer data)
{
lrm_state_t *lrm_state = data;
if (!lrm_state) {
return;
}
crm_trace("Destroying proxy table %s with %u members",
lrm_state->node_name, g_hash_table_size(proxy_table));
g_hash_table_foreach_remove(proxy_table, remote_proxy_remove_by_node, (char *) lrm_state->node_name);
remote_ra_cleanup(lrm_state);
lrmd_api_delete(lrm_state->conn);
if (lrm_state->rsc_info_cache) {
crm_trace("Destroying rsc info cache with %u members",
g_hash_table_size(lrm_state->rsc_info_cache));
g_hash_table_destroy(lrm_state->rsc_info_cache);
}
if (lrm_state->resource_history) {
crm_trace("Destroying history op cache with %u members",
g_hash_table_size(lrm_state->resource_history));
g_hash_table_destroy(lrm_state->resource_history);
}
if (lrm_state->deletion_ops) {
crm_trace("Destroying deletion op cache with %u members",
g_hash_table_size(lrm_state->deletion_ops));
g_hash_table_destroy(lrm_state->deletion_ops);
}
if (lrm_state->active_ops != NULL) {
crm_trace("Destroying pending op cache with %u members",
g_hash_table_size(lrm_state->active_ops));
g_hash_table_destroy(lrm_state->active_ops);
}
metadata_cache_free(lrm_state->metadata_cache);
free((char *)lrm_state->node_name);
free(lrm_state);
}
void
lrm_state_reset_tables(lrm_state_t * lrm_state, gboolean reset_metadata)
{
if (lrm_state->resource_history) {
crm_trace("Resetting resource history cache with %u members",
g_hash_table_size(lrm_state->resource_history));
g_hash_table_remove_all(lrm_state->resource_history);
}
if (lrm_state->deletion_ops) {
crm_trace("Resetting deletion operations cache with %u members",
g_hash_table_size(lrm_state->deletion_ops));
g_hash_table_remove_all(lrm_state->deletion_ops);
}
if (lrm_state->active_ops != NULL) {
crm_trace("Resetting active operations cache with %u members",
g_hash_table_size(lrm_state->active_ops));
g_hash_table_remove_all(lrm_state->active_ops);
}
if (lrm_state->rsc_info_cache) {
crm_trace("Resetting resource information cache with %u members",
g_hash_table_size(lrm_state->rsc_info_cache));
g_hash_table_remove_all(lrm_state->rsc_info_cache);
}
if (reset_metadata) {
metadata_cache_reset(lrm_state->metadata_cache);
}
}
gboolean
lrm_state_init_local(void)
{
if (lrm_state_table) {
return TRUE;
}
lrm_state_table = pcmk__strikey_table(NULL, internal_lrm_state_destroy);
if (!lrm_state_table) {
return FALSE;
}
proxy_table = pcmk__strikey_table(NULL, remote_proxy_free);
if (!proxy_table) {
g_hash_table_destroy(lrm_state_table);
lrm_state_table = NULL;
return FALSE;
}
return TRUE;
}
void
lrm_state_destroy_all(void)
{
if (lrm_state_table) {
crm_trace("Destroying state table with %u members",
g_hash_table_size(lrm_state_table));
g_hash_table_destroy(lrm_state_table); lrm_state_table = NULL;
}
if(proxy_table) {
crm_trace("Destroying proxy table with %u members",
g_hash_table_size(proxy_table));
g_hash_table_destroy(proxy_table); proxy_table = NULL;
}
}
lrm_state_t *
lrm_state_find(const char *node_name)
{
if (!node_name) {
return NULL;
}
return g_hash_table_lookup(lrm_state_table, node_name);
}
lrm_state_t *
lrm_state_find_or_create(const char *node_name)
{
lrm_state_t *lrm_state;
lrm_state = g_hash_table_lookup(lrm_state_table, node_name);
if (!lrm_state) {
lrm_state = lrm_state_create(node_name);
}
return lrm_state;
}
GList *
lrm_state_get_list(void)
{
return g_hash_table_get_values(lrm_state_table);
}
static remote_proxy_t *
find_connected_proxy_by_node(const char * node_name)
{
GHashTableIter gIter;
remote_proxy_t *proxy = NULL;
CRM_CHECK(proxy_table != NULL, return NULL);
g_hash_table_iter_init(&gIter, proxy_table);
while (g_hash_table_iter_next(&gIter, NULL, (gpointer *) &proxy)) {
if (proxy->source
&& pcmk__str_eq(node_name, proxy->node_name, pcmk__str_casei)) {
return proxy;
}
}
return NULL;
}
static void
remote_proxy_disconnect_by_node(const char * node_name)
{
remote_proxy_t *proxy = NULL;
CRM_CHECK(proxy_table != NULL, return);
while ((proxy = find_connected_proxy_by_node(node_name)) != NULL) {
/* mainloop_del_ipc_client() eventually calls remote_proxy_disconnected(),
 * which removes the entry from proxy_table. Do not do this inside a
 * g_hash_table_iter_next() loop. */
if (proxy->source) {
mainloop_del_ipc_client(proxy->source);
}
}
return;
}
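/* Illustrative sketch (not part of the original patch): when entries must be
 * removed while walking a GHashTable, GLib offers two safe patterns -- either
 * g_hash_table_iter_remove() on the iterator itself, or
 * g_hash_table_foreach_remove() with a predicate (as
 * internal_lrm_state_destroy() does above via remote_proxy_remove_by_node()).
 * remote_proxy_disconnect_by_node() sidesteps the issue by re-searching after
 * each removal. The predicate name below is hypothetical.
 */
#if 0
static void
example_safe_removal(GHashTable *table)
{
    GHashTableIter iter;
    gpointer key = NULL;
    gpointer value = NULL;

    g_hash_table_iter_init(&iter, table);
    while (g_hash_table_iter_next(&iter, &key, &value)) {
        if (should_remove(value)) {             // hypothetical predicate
            g_hash_table_iter_remove(&iter);    // safe removal point
        }
    }
}
#endif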
void
lrm_state_disconnect_only(lrm_state_t * lrm_state)
{
int removed = 0;
if (!lrm_state->conn) {
return;
}
crm_trace("Disconnecting %s", lrm_state->node_name);
remote_proxy_disconnect_by_node(lrm_state->node_name);
((lrmd_t *) lrm_state->conn)->cmds->disconnect(lrm_state->conn);
if (!pcmk_is_set(controld_globals.fsa_input_register, R_SHUTDOWN)) {
removed = g_hash_table_foreach_remove(lrm_state->active_ops,
fail_pending_op, lrm_state);
crm_trace("Synthesized %d operation failures for %s", removed, lrm_state->node_name);
}
}
void
lrm_state_disconnect(lrm_state_t * lrm_state)
{
if (!lrm_state->conn) {
return;
}
lrm_state_disconnect_only(lrm_state);
lrmd_api_delete(lrm_state->conn);
lrm_state->conn = NULL;
}
int
lrm_state_is_connected(lrm_state_t * lrm_state)
{
if (!lrm_state->conn) {
return FALSE;
}
return ((lrmd_t *) lrm_state->conn)->cmds->is_connected(lrm_state->conn);
}
int
lrm_state_poke_connection(lrm_state_t * lrm_state)
{
if (!lrm_state->conn) {
return -ENOTCONN;
}
return ((lrmd_t *) lrm_state->conn)->cmds->poke_connection(lrm_state->conn);
}
// \return Standard Pacemaker return code
int
controld_connect_local_executor(lrm_state_t *lrm_state)
{
int rc = pcmk_rc_ok;
if (lrm_state->conn == NULL) {
lrmd_t *api = NULL;
rc = lrmd__new(&api, NULL, NULL, 0);
if (rc != pcmk_rc_ok) {
return rc;
}
api->cmds->set_callback(api, lrm_op_callback);
lrm_state->conn = api;
}
rc = ((lrmd_t *) lrm_state->conn)->cmds->connect(lrm_state->conn,
CRM_SYSTEM_CRMD, NULL);
rc = pcmk_legacy2rc(rc);
if (rc == pcmk_rc_ok) {
lrm_state->num_lrm_register_fails = 0;
} else {
lrm_state->num_lrm_register_fails++;
}
return rc;
}
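/* Illustrative note (not part of the original patch): the executor API used
 * above still returns legacy codes (pcmk_ok == 0, negative errno values on
 * failure), while newer controller code paths use standard Pacemaker return
 * codes (pcmk_rc_ok and friends), so results are normalized with
 * pcmk_legacy2rc() before being returned to callers.
 */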
static remote_proxy_t *
crmd_remote_proxy_new(lrmd_t *lrmd, const char *node_name, const char *session_id, const char *channel)
{
struct ipc_client_callbacks proxy_callbacks = {
.dispatch = remote_proxy_dispatch,
.destroy = remote_proxy_disconnected
};
remote_proxy_t *proxy = remote_proxy_new(lrmd, &proxy_callbacks, node_name,
session_id, channel);
return proxy;
}
gboolean
crmd_is_proxy_session(const char *session)
{
return g_hash_table_lookup(proxy_table, session) ? TRUE : FALSE;
}
void
crmd_proxy_send(const char *session, xmlNode *msg)
{
remote_proxy_t *proxy = g_hash_table_lookup(proxy_table, session);
lrm_state_t *lrm_state = NULL;
if (!proxy) {
return;
}
crm_log_xml_trace(msg, "to-proxy");
lrm_state = lrm_state_find(proxy->node_name);
if (lrm_state) {
crm_trace("Sending event to %.8s on %s", proxy->session_id, proxy->node_name);
remote_proxy_relay_event(proxy, msg);
}
}
static void
crmd_proxy_dispatch(const char *session, xmlNode *msg)
{
crm_trace("Processing proxied IPC message from session %s", session);
crm_log_xml_trace(msg, "controller[inbound]");
crm_xml_add(msg, F_CRM_SYS_FROM, session);
if (controld_authorize_ipc_message(msg, NULL, session)) {
route_message(C_IPC_MESSAGE, msg);
}
controld_trigger_fsa();
}
static void
remote_config_check(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
if (rc != pcmk_ok) {
crm_err("Query resulted in an error: %s", pcmk_strerror(rc));
if (rc == -EACCES || rc == -pcmk_err_schema_validation) {
crm_err("The cluster is mis-configured - shutting down and staying down");
}
} else {
lrmd_t * lrmd = (lrmd_t *)user_data;
crm_time_t *now = crm_time_new(NULL);
GHashTable *config_hash = pcmk__strkey_table(free, free);
crm_debug("Call %d : Parsing CIB options", call_id);
pe_unpack_nvpairs(output, output, XML_CIB_TAG_PROPSET, NULL,
config_hash, CIB_OPTIONS_FIRST, FALSE, now, NULL);
/* Now send it to the remote peer */
lrmd__validate_remote_settings(lrmd, config_hash);
g_hash_table_destroy(config_hash);
crm_time_free(now);
}
}
static void
crmd_remote_proxy_cb(lrmd_t *lrmd, void *userdata, xmlNode *msg)
{
lrm_state_t *lrm_state = userdata;
const char *session = crm_element_value(msg, F_LRMD_IPC_SESSION);
remote_proxy_t *proxy = g_hash_table_lookup(proxy_table, session);
const char *op = crm_element_value(msg, F_LRMD_IPC_OP);
if (pcmk__str_eq(op, LRMD_IPC_OP_NEW, pcmk__str_casei)) {
const char *channel = crm_element_value(msg, F_LRMD_IPC_IPC_SERVER);
proxy = crmd_remote_proxy_new(lrmd, lrm_state->node_name, session, channel);
if (!remote_ra_controlling_guest(lrm_state)) {
if (proxy != NULL) {
cib_t *cib_conn = controld_globals.cib_conn;
/* Look up stonith-watchdog-timeout and send to the remote peer for validation */
int rc = cib_conn->cmds->query(cib_conn, XML_CIB_TAG_CRMCONFIG,
NULL, cib_scope_local);
cib_conn->cmds->register_callback_full(cib_conn, rc, 10, FALSE,
lrmd,
"remote_config_check",
remote_config_check,
NULL);
}
} else {
crm_debug("Skipping remote_config_check for guest-nodes");
}
} else if (pcmk__str_eq(op, LRMD_IPC_OP_SHUTDOWN_REQ, pcmk__str_casei)) {
char *now_s = NULL;
crm_notice("%s requested shutdown of its remote connection",
lrm_state->node_name);
if (!remote_ra_is_in_maintenance(lrm_state)) {
now_s = pcmk__ttoa(time(NULL));
update_attrd(lrm_state->node_name, XML_CIB_ATTR_SHUTDOWN, now_s, NULL, TRUE);
free(now_s);
remote_proxy_ack_shutdown(lrmd);
crm_warn("Reconnection attempts to %s may result in failures that must be cleared",
lrm_state->node_name);
} else {
remote_proxy_nack_shutdown(lrmd);
crm_notice("Remote resource for %s is not managed so no ordered shutdown happening",
lrm_state->node_name);
}
return;
} else if (pcmk__str_eq(op, LRMD_IPC_OP_REQUEST, pcmk__str_casei) && proxy && proxy->is_local) {
/* This is for the controller, which we are, so don't try
* to send to ourselves over IPC -- do it directly.
*/
int flags = 0;
xmlNode *request = get_message_xml(msg, F_LRMD_IPC_MSG);
CRM_CHECK(request != NULL, return);
CRM_CHECK(lrm_state->node_name, return);
crm_xml_add(request, XML_ACL_TAG_ROLE, "pacemaker-remote");
pcmk__update_acl_user(request, F_LRMD_IPC_USER, lrm_state->node_name);
/* Pacemaker Remote nodes don't know their own names (as known to the
* cluster). When getting a node info request with no name or ID, add
* the name, so we don't return info for ourselves instead of the
* Pacemaker Remote node.
*/
if (pcmk__str_eq(crm_element_value(request, F_CRM_TASK), CRM_OP_NODE_INFO, pcmk__str_casei)) {
int node_id = 0;
crm_element_value_int(request, XML_ATTR_ID, &node_id);
if ((node_id <= 0)
&& (crm_element_value(request, XML_ATTR_UNAME) == NULL)) {
crm_xml_add(request, XML_ATTR_UNAME, lrm_state->node_name);
}
}
crmd_proxy_dispatch(session, request);
crm_element_value_int(msg, F_LRMD_IPC_MSG_FLAGS, &flags);
if (flags & crm_ipc_client_response) {
int msg_id = 0;
xmlNode *op_reply = create_xml_node(NULL, "ack");
crm_xml_add(op_reply, "function", __func__);
crm_xml_add_int(op_reply, "line", __LINE__);
crm_element_value_int(msg, F_LRMD_IPC_MSG_ID, &msg_id);
remote_proxy_relay_response(proxy, op_reply, msg_id);
free_xml(op_reply);
}
} else {
remote_proxy_cb(lrmd, lrm_state->node_name, msg);
}
}
// \return Standard Pacemaker return code
int
controld_connect_remote_executor(lrm_state_t *lrm_state, const char *server,
int port, int timeout_ms)
{
int rc = pcmk_rc_ok;
if (lrm_state->conn == NULL) {
lrmd_t *api = NULL;
rc = lrmd__new(&api, lrm_state->node_name, server, port);
if (rc != pcmk_rc_ok) {
crm_warn("Pacemaker Remote connection to %s:%s failed: %s "
CRM_XS " rc=%d", server, port, pcmk_rc_str(rc), rc);
return rc;
}
lrm_state->conn = api;
api->cmds->set_callback(api, remote_lrm_op_callback);
lrmd_internal_set_proxy_callback(api, lrm_state, crmd_remote_proxy_cb);
}
crm_trace("Initiating remote connection to %s:%d with timeout %dms",
server, port, timeout_ms);
rc = ((lrmd_t *) lrm_state->conn)->cmds->connect_async(lrm_state->conn,
lrm_state->node_name,
timeout_ms);
if (rc == pcmk_ok) {
lrm_state->num_lrm_register_fails = 0;
} else {
lrm_state->num_lrm_register_fails++; // Ignored for remote connections
}
return pcmk_legacy2rc(rc);
}
int
lrm_state_get_metadata(lrm_state_t * lrm_state,
const char *class,
const char *provider,
const char *agent, char **output, enum lrmd_call_options options)
{
lrmd_key_value_t *params = NULL;
if (!lrm_state->conn) {
return -ENOTCONN;
}
/* Add the node name to the environment, as is done with normal resource
* action calls. Meta-data calls shouldn't need it, but some agents are
* written with an ocf_local_nodename call at the beginning regardless of
* action. Without the environment variable, the agent would try to contact
* the controller to get the node name -- but the controller would be
* blocking on the synchronous meta-data call.
*
* At this point, we have to assume that agents are unlikely to make other
* calls that require the controller, such as crm_node --quorum or
* --cluster-id.
*
* @TODO Make meta-data calls asynchronous. (This will be part of a larger
* project to make meta-data calls via the executor rather than directly.)
*/
params = lrmd_key_value_add(params, CRM_META "_" XML_LRM_ATTR_TARGET,
lrm_state->node_name);
return ((lrmd_t *) lrm_state->conn)->cmds->get_metadata_params(lrm_state->conn,
class, provider, agent, output, options, params);
}
int
lrm_state_cancel(lrm_state_t *lrm_state, const char *rsc_id, const char *action,
guint interval_ms)
{
if (!lrm_state->conn) {
return -ENOTCONN;
}
/* Figure out a way to make this async?
* NOTICE: Currently it's synced and directly acknowledged in do_lrm_invoke(). */
if (is_remote_lrmd_ra(NULL, NULL, rsc_id)) {
return remote_ra_cancel(lrm_state, rsc_id, action, interval_ms);
}
return ((lrmd_t *) lrm_state->conn)->cmds->cancel(lrm_state->conn, rsc_id,
action, interval_ms);
}
lrmd_rsc_info_t *
lrm_state_get_rsc_info(lrm_state_t * lrm_state, const char *rsc_id, enum lrmd_call_options options)
{
lrmd_rsc_info_t *rsc = NULL;
if (!lrm_state->conn) {
return NULL;
}
if (is_remote_lrmd_ra(NULL, NULL, rsc_id)) {
return remote_ra_get_rsc_info(lrm_state, rsc_id);
}
rsc = g_hash_table_lookup(lrm_state->rsc_info_cache, rsc_id);
if (rsc == NULL) {
/* only contact the lrmd if we don't already have a cached rsc info */
rsc = ((lrmd_t *) lrm_state->conn)->cmds->get_rsc_info(lrm_state->conn, rsc_id, options);
if (rsc == NULL) {
return NULL;
}
/* cache the result */
g_hash_table_insert(lrm_state->rsc_info_cache, rsc->id, rsc);
}
return lrmd_copy_rsc_info(rsc);
}
/*!
* \internal
* \brief Initiate a resource agent action
*
* \param[in,out] lrm_state Executor state object
* \param[in] rsc_id ID of resource for action
* \param[in] action Action to execute
* \param[in] userdata String to copy and pass to execution callback
* \param[in] interval_ms Action interval (in milliseconds)
* \param[in] timeout_ms Action timeout (in milliseconds)
* \param[in] start_delay_ms Delay (in ms) before initiating action
* \param[in] parameters Hash table of resource parameters
* \param[out] call_id Where to store call ID on success
*
* \return Standard Pacemaker return code
*/
int
controld_execute_resource_agent(lrm_state_t *lrm_state, const char *rsc_id,
const char *action, const char *userdata,
guint interval_ms, int timeout_ms,
int start_delay_ms, GHashTable *parameters,
int *call_id)
{
int rc = pcmk_rc_ok;
lrmd_key_value_t *params = NULL;
if (lrm_state->conn == NULL) {
return ENOTCONN;
}
// Convert parameters from hash table to list
if (parameters != NULL) {
const char *key = NULL;
const char *value = NULL;
GHashTableIter iter;
g_hash_table_iter_init(&iter, parameters);
while (g_hash_table_iter_next(&iter, (gpointer *) &key,
(gpointer *) &value)) {
params = lrmd_key_value_add(params, key, value);
}
}
if (is_remote_lrmd_ra(NULL, NULL, rsc_id)) {
rc = controld_execute_remote_agent(lrm_state, rsc_id, action,
userdata, interval_ms, timeout_ms,
start_delay_ms, params, call_id);
} else {
rc = ((lrmd_t *) lrm_state->conn)->cmds->exec(lrm_state->conn, rsc_id,
action, userdata,
interval_ms, timeout_ms,
start_delay_ms,
lrmd_opt_notify_changes_only,
params);
if (rc < 0) {
rc = pcmk_legacy2rc(rc);
} else {
*call_id = rc;
rc = pcmk_rc_ok;
}
}
return rc;
}
int
lrm_state_register_rsc(lrm_state_t * lrm_state,
const char *rsc_id,
const char *class,
const char *provider, const char *agent, enum lrmd_call_options options)
{
lrmd_t *conn = (lrmd_t *) lrm_state->conn;
if (conn == NULL) {
return -ENOTCONN;
}
if (is_remote_lrmd_ra(agent, provider, NULL)) {
return lrm_state_find_or_create(rsc_id)? pcmk_ok : -EINVAL;
}
/* @TODO Implement an asynchronous version of this (currently a blocking
* call to the lrmd).
*/
return conn->cmds->register_rsc(lrm_state->conn, rsc_id, class, provider,
agent, options);
}
int
lrm_state_unregister_rsc(lrm_state_t * lrm_state,
const char *rsc_id, enum lrmd_call_options options)
{
if (!lrm_state->conn) {
return -ENOTCONN;
}
if (is_remote_lrmd_ra(NULL, NULL, rsc_id)) {
lrm_state_destroy(rsc_id);
return pcmk_ok;
}
g_hash_table_remove(lrm_state->rsc_info_cache, rsc_id);
/* @TODO Optimize this ... this function is a blocking round trip from
* client to daemon. The controld_execd_state.c code path that uses this
* function should always treat it as an async operation. The executor API
* should make an async version available.
*/
return ((lrmd_t *) lrm_state->conn)->cmds->unregister_rsc(lrm_state->conn, rsc_id, options);
}
diff --git a/daemons/controld/controld_fencing.c b/daemons/controld/controld_fencing.c
index 35cd95bf34..cdfa08192b 100644
--- a/daemons/controld/controld_fencing.c
+++ b/daemons/controld/controld_fencing.c
@@ -1,1108 +1,1106 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
static void
tengine_stonith_history_synced(stonith_t *st, stonith_event_t *st_event);
/*
* stonith failure counting
*
* We don't want to get stuck in a permanent fencing loop. Keep track of the
* number of fencing failures for each target node, and the maximum number of
* failures we'll tolerate before refusing to restart a transition for it.
*/
struct st_fail_rec {
int count;
};
static bool fence_reaction_panic = false;
static unsigned long int stonith_max_attempts = 10;
static GHashTable *stonith_failures = NULL;
/*!
* \internal
* \brief Update max fencing attempts before giving up
*
* \param[in] value New max fencing attempts
*/
static void
update_stonith_max_attempts(const char *value)
{
int score = char2score(value);

/* char2score() can return a negative score, which would wrap to a huge
 * value if assigned directly to the unsigned counter, so validate the
 * signed result first
 */
stonith_max_attempts = (score < 1)? 10UL : (unsigned long) score;
}
/*!
* \internal
* \brief Configure reaction to notification of local node being fenced
*
* \param[in] reaction_s Reaction type
*/
static void
set_fence_reaction(const char *reaction_s)
{
if (pcmk__str_eq(reaction_s, "panic", pcmk__str_casei)) {
fence_reaction_panic = true;
} else {
if (!pcmk__str_eq(reaction_s, "stop", pcmk__str_casei)) {
crm_warn("Invalid value '%s' for %s, using 'stop'",
reaction_s, XML_CONFIG_ATTR_FENCE_REACTION);
}
fence_reaction_panic = false;
}
}
/*!
* \internal
* \brief Configure fencing options based on the CIB
*
* \param[in,out] options Name/value pairs for configured options
*/
void
controld_configure_fencing(GHashTable *options)
{
const char *value = NULL;
value = g_hash_table_lookup(options, XML_CONFIG_ATTR_FENCE_REACTION);
set_fence_reaction(value);
value = g_hash_table_lookup(options, "stonith-max-attempts");
update_stonith_max_attempts(value);
}
static gboolean
too_many_st_failures(const char *target)
{
GHashTableIter iter;
const char *key = NULL;
struct st_fail_rec *value = NULL;
if (stonith_failures == NULL) {
return FALSE;
}
if (target == NULL) {
g_hash_table_iter_init(&iter, stonith_failures);
while (g_hash_table_iter_next(&iter, (gpointer *) &key,
(gpointer *) &value)) {
if (value->count >= stonith_max_attempts) {
target = (const char*)key;
goto too_many;
}
}
} else {
value = g_hash_table_lookup(stonith_failures, target);
if ((value != NULL) && (value->count >= stonith_max_attempts)) {
goto too_many;
}
}
return FALSE;
too_many:
crm_warn("Too many failures (%d) to fence %s, giving up",
value->count, target);
return TRUE;
}
/*!
* \internal
* \brief Reset a stonith fail count
*
* \param[in] target Name of node to reset, or NULL for all
*/
void
st_fail_count_reset(const char *target)
{
if (stonith_failures == NULL) {
return;
}
if (target) {
struct st_fail_rec *rec = NULL;
rec = g_hash_table_lookup(stonith_failures, target);
if (rec) {
rec->count = 0;
}
} else {
GHashTableIter iter;
const char *key = NULL;
struct st_fail_rec *rec = NULL;
g_hash_table_iter_init(&iter, stonith_failures);
while (g_hash_table_iter_next(&iter, (gpointer *) &key,
(gpointer *) &rec)) {
rec->count = 0;
}
}
}
static void
st_fail_count_increment(const char *target)
{
struct st_fail_rec *rec = NULL;
if (stonith_failures == NULL) {
stonith_failures = pcmk__strkey_table(free, free);
}
rec = g_hash_table_lookup(stonith_failures, target);
if (rec) {
rec->count++;
} else {
rec = malloc(sizeof(struct st_fail_rec));
if(rec == NULL) {
return;
}
rec->count = 1;
g_hash_table_insert(stonith_failures, strdup(target), rec);
}
}
/* end stonith fail count functions */
static void
cib_fencing_updated(xmlNode *msg, int call_id, int rc, xmlNode *output,
void *user_data)
{
if (rc < pcmk_ok) {
crm_err("Fencing update %d for %s: failed - %s (%d)",
call_id, (char *)user_data, pcmk_strerror(rc), rc);
crm_log_xml_warn(msg, "Failed update");
abort_transition(INFINITY, pcmk__graph_shutdown, "CIB update failed",
NULL);
} else {
crm_info("Fencing update %d for %s: complete", call_id, (char *)user_data);
}
}
static void
send_stonith_update(pcmk__graph_action_t *action, const char *target,
const char *uuid)
{
int rc = pcmk_ok;
crm_node_t *peer = NULL;
/* We (usually) rely on the membership layer to do node_update_cluster,
* and the peer status callback to do node_update_peer, because the node
* might have already rejoined before we get the stonith result here.
*/
int flags = node_update_join | node_update_expected;
/* zero out the node-status & remove all LRM status info */
xmlNode *node_state = NULL;
CRM_CHECK(target != NULL, return);
CRM_CHECK(uuid != NULL, return);
/* Make sure the membership and join caches are accurate */
peer = crm_get_peer_full(0, target, CRM_GET_PEER_ANY);
CRM_CHECK(peer != NULL, return);
if (peer->state == NULL) {
/* Usually, we rely on the membership layer to update the cluster state
* in the CIB. However, if the node has never been seen, do it here, so
* the node is not considered unclean.
*/
flags |= node_update_cluster;
}
if (peer->uuid == NULL) {
crm_info("Recording uuid '%s' for node '%s'", uuid, target);
peer->uuid = strdup(uuid);
}
crmd_peer_down(peer, TRUE);
/* Generate a node state update for the CIB */
node_state = create_node_state_update(peer, flags, NULL, __func__);
/* we have to mark whether or not remote nodes have already been fenced */
if (peer->flags & crm_remote_node) {
char *now_s = pcmk__ttoa(time(NULL));
crm_xml_add(node_state, XML_NODE_IS_FENCED, now_s);
free(now_s);
}
/* Force our known ID */
crm_xml_add(node_state, XML_ATTR_ID, uuid);
- rc = controld_globals.cib_conn->cmds->update(controld_globals.cib_conn,
+ rc = controld_globals.cib_conn->cmds->modify(controld_globals.cib_conn,
XML_CIB_TAG_STATUS, node_state,
- cib_quorum_override
- |cib_scope_local
+ cib_scope_local
|cib_can_create);
/* Delay processing the trigger until the update completes */
crm_debug("Sending fencing update %d for %s", rc, target);
fsa_register_cib_callback(rc, strdup(target), cib_fencing_updated);
// Make sure it sticks
/* controld_globals.cib_conn->cmds->bump_epoch(controld_globals.cib_conn,
- * cib_quorum_override
- * |cib_scope_local);
+ * cib_scope_local);
*/
controld_delete_node_state(peer->uname, controld_section_all,
cib_scope_local);
free_xml(node_state);
return;
}
/*!
* \internal
* \brief Abort transition due to stonith failure
*
* \param[in] abort_action Whether to restart or stop transition
* \param[in] target Don't restart if this (NULL for any) has too many failures
* \param[in] reason Log this stonith action XML as abort reason (or NULL)
*/
static void
abort_for_stonith_failure(enum pcmk__graph_next abort_action,
const char *target, const xmlNode *reason)
{
/* If stonith repeatedly fails, we eventually give up on starting a new
* transition for that reason.
*/
if ((abort_action != pcmk__graph_wait) && too_many_st_failures(target)) {
abort_action = pcmk__graph_wait;
}
abort_transition(INFINITY, abort_action, "Stonith failed", reason);
}
/*
* stonith cleanup list
*
* If the DC is shot, proper notifications might not go out.
* The stonith cleanup list allows the cluster to (re-)send
* notifications once a new DC is elected.
*/
static GList *stonith_cleanup_list = NULL;
/*!
* \internal
* \brief Add a node to the stonith cleanup list
*
* \param[in] target Name of node to add
*/
void
add_stonith_cleanup(const char *target) {
stonith_cleanup_list = g_list_append(stonith_cleanup_list, strdup(target));
}
/*!
* \internal
* \brief Remove a node from the stonith cleanup list
*
* \param[in] target Name of node to remove
*/
void
remove_stonith_cleanup(const char *target)
{
GList *iter = stonith_cleanup_list;
while (iter != NULL) {
GList *tmp = iter;
char *iter_name = tmp->data;
iter = iter->next;
if (pcmk__str_eq(target, iter_name, pcmk__str_casei)) {
crm_trace("Removing %s from the cleanup list", iter_name);
stonith_cleanup_list = g_list_delete_link(stonith_cleanup_list, tmp);
free(iter_name);
}
}
}
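/* Illustrative note (not part of the original patch): the loop above uses the
 * standard GList idiom for deleting during traversal -- the cursor is advanced
 * *before* g_list_delete_link() frees the current link, so the walk never
 * dereferences freed memory.
 */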
/*!
* \internal
* \brief Purge all entries from the stonith cleanup list
*/
void
purge_stonith_cleanup(void)
{
if (stonith_cleanup_list) {
GList *iter = NULL;
for (iter = stonith_cleanup_list; iter != NULL; iter = iter->next) {
char *target = iter->data;
crm_info("Purging %s from stonith cleanup list", target);
free(target);
}
g_list_free(stonith_cleanup_list);
stonith_cleanup_list = NULL;
}
}
/*!
* \internal
* \brief Send stonith updates for all entries in cleanup list, then purge it
*/
void
execute_stonith_cleanup(void)
{
GList *iter;
for (iter = stonith_cleanup_list; iter != NULL; iter = iter->next) {
char *target = iter->data;
crm_node_t *target_node = crm_get_peer(0, target);
const char *uuid = crm_peer_uuid(target_node);
crm_notice("Marking %s, target of a previous stonith action, as clean", target);
send_stonith_update(NULL, target, uuid);
free(target);
}
g_list_free(stonith_cleanup_list);
stonith_cleanup_list = NULL;
}
/* end stonith cleanup list functions */
/* stonith API client
*
* Functions that need to interact directly with the fencer via its API
*/
static stonith_t *stonith_api = NULL;
static crm_trigger_t *stonith_reconnect = NULL;
static char *te_client_id = NULL;
static gboolean
fail_incompletable_stonith(pcmk__graph_t *graph)
{
GList *lpc = NULL;
const char *task = NULL;
xmlNode *last_action = NULL;
if (graph == NULL) {
return FALSE;
}
for (lpc = graph->synapses; lpc != NULL; lpc = lpc->next) {
GList *lpc2 = NULL;
pcmk__graph_synapse_t *synapse = (pcmk__graph_synapse_t *) lpc->data;
if (pcmk_is_set(synapse->flags, pcmk__synapse_confirmed)) {
continue;
}
for (lpc2 = synapse->actions; lpc2 != NULL; lpc2 = lpc2->next) {
pcmk__graph_action_t *action = (pcmk__graph_action_t *) lpc2->data;
if ((action->type != pcmk__cluster_graph_action)
|| pcmk_is_set(action->flags, pcmk__graph_action_confirmed)) {
continue;
}
task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
if (task && pcmk__str_eq(task, CRM_OP_FENCE, pcmk__str_casei)) {
pcmk__set_graph_action_flags(action, pcmk__graph_action_failed);
last_action = action->xml;
pcmk__update_graph(graph, action);
crm_notice("Failing action %d (%s): fencer terminated",
action->id, ID(action->xml));
}
}
}
if (last_action != NULL) {
crm_warn("Fencer failure resulted in unrunnable actions");
abort_for_stonith_failure(pcmk__graph_restart, NULL, last_action);
return TRUE;
}
return FALSE;
}
static void
tengine_stonith_connection_destroy(stonith_t *st, stonith_event_t *e)
{
te_cleanup_stonith_history_sync(st, FALSE);
if (pcmk_is_set(controld_globals.fsa_input_register, R_ST_REQUIRED)) {
crm_crit("Fencing daemon connection failed");
mainloop_set_trigger(stonith_reconnect);
} else {
crm_info("Fencing daemon disconnected");
}
if (stonith_api) {
/* the client API won't properly reconnect notifications
* if they are still in the table - so remove them
*/
if (stonith_api->state != stonith_disconnected) {
stonith_api->cmds->disconnect(st);
}
stonith_api->cmds->remove_notification(stonith_api, NULL);
}
if (AM_I_DC) {
fail_incompletable_stonith(controld_globals.transition_graph);
trigger_graph();
}
}
/*!
* \internal
* \brief Handle an event notification from the fencing API
*
* \param[in] st Fencing API connection (ignored)
* \param[in] event Fencing API event notification
*/
static void
handle_fence_notification(stonith_t *st, stonith_event_t *event)
{
bool succeeded = true;
const char *executioner = "the cluster";
const char *client = "a client";
const char *reason = NULL;
int exec_status;
if (te_client_id == NULL) {
te_client_id = crm_strdup_printf("%s.%lu", crm_system_name,
(unsigned long) getpid());
}
if (event == NULL) {
crm_err("Notify data not found");
return;
}
if (event->executioner != NULL) {
executioner = event->executioner;
}
if (event->client_origin != NULL) {
client = event->client_origin;
}
exec_status = stonith__event_execution_status(event);
if ((stonith__event_exit_status(event) != CRM_EX_OK)
|| (exec_status != PCMK_EXEC_DONE)) {
succeeded = false;
if (exec_status == PCMK_EXEC_DONE) {
exec_status = PCMK_EXEC_ERROR;
}
}
reason = stonith__event_exit_reason(event);
crmd_alert_fencing_op(event);
if (pcmk__str_eq("on", event->action, pcmk__str_none)) {
// Unfencing doesn't need special handling, just a log message
if (succeeded) {
crm_notice("%s was unfenced by %s at the request of %s@%s",
event->target, executioner, client, event->origin);
} else {
crm_err("Unfencing of %s by %s failed (%s%s%s) with exit status %d",
event->target, executioner,
pcmk_exec_status_str(exec_status),
((reason == NULL)? "" : ": "),
((reason == NULL)? "" : reason),
stonith__event_exit_status(event));
}
return;
}
if (succeeded
&& pcmk__str_eq(event->target, controld_globals.our_nodename,
pcmk__str_casei)) {
/* We were notified of our own fencing. Most likely, either fencing was
* misconfigured, or fabric fencing that doesn't cut cluster
* communication is in use.
*
* Either way, shutting down the local host is a good idea, to require
* administrator intervention. Also, other nodes would otherwise likely
* set our status to lost because of the fencing callback and discard
* our subsequent election votes as "not part of our cluster".
*/
crm_crit("We were allegedly just fenced by %s for %s!",
executioner, event->origin); // Dumps blackbox if enabled
if (fence_reaction_panic) {
pcmk__panic(__func__);
} else {
crm_exit(CRM_EX_FATAL);
}
return; // Should never get here
}
/* Update the count of fencing failures for this target, in case we become
* DC later. The current DC has already updated its fail count in
* tengine_stonith_callback().
*/
if (!AM_I_DC) {
if (succeeded) {
st_fail_count_reset(event->target);
} else {
st_fail_count_increment(event->target);
}
}
crm_notice("Peer %s was%s terminated (%s) by %s on behalf of %s@%s: "
"%s%s%s%s " CRM_XS " event=%s",
event->target, (succeeded? "" : " not"),
event->action, executioner, client, event->origin,
(succeeded? "OK" : pcmk_exec_status_str(exec_status)),
((reason == NULL)? "" : " ("),
((reason == NULL)? "" : reason),
((reason == NULL)? "" : ")"),
event->id);
if (succeeded) {
crm_node_t *peer = pcmk__search_known_node_cache(0, event->target,
CRM_GET_PEER_ANY);
const char *uuid = NULL;
if (peer == NULL) {
return;
}
uuid = crm_peer_uuid(peer);
if (AM_I_DC) {
/* The DC always sends updates */
send_stonith_update(NULL, event->target, uuid);
/* @TODO Ideally, at this point, we'd check whether the fenced node
* hosted any guest nodes, and call remote_node_down() for them.
* Unfortunately, the controller doesn't have a simple, reliable way
* to map hosts to guests. It might be possible to track this in the
* peer cache via crm_remote_peer_cache_refresh(). For now, we rely
* on the scheduler creating fence pseudo-events for the guests.
*/
if (!pcmk__str_eq(client, te_client_id, pcmk__str_casei)) {
/* Abort the current transition if it wasn't the cluster that
* initiated fencing.
*/
crm_info("External fencing operation from %s fenced %s",
client, event->target);
abort_transition(INFINITY, pcmk__graph_restart,
"External Fencing Operation", NULL);
}
} else if (pcmk__str_eq(controld_globals.dc_name, event->target,
pcmk__str_null_matches|pcmk__str_casei)
&& !pcmk_is_set(peer->flags, crm_remote_node)) {
// Assume the target was our DC if we don't currently have one
if (controld_globals.dc_name != NULL) {
crm_notice("Fencing target %s was our DC", event->target);
} else {
crm_notice("Fencing target %s may have been our DC",
event->target);
}
/* Given the CIB resyncing that occurs around elections,
* have one node update the CIB now and, if the new DC is different,
* have them do so too after the election
*/
if (pcmk__str_eq(event->executioner, controld_globals.our_nodename,
pcmk__str_casei)) {
send_stonith_update(NULL, event->target, uuid);
}
add_stonith_cleanup(event->target);
}
/* If the target is a remote node, and we host its connection,
* immediately fail all monitors so it can be recovered quickly.
* The connection won't necessarily drop when a remote node is fenced,
* so the failure might not otherwise be detected until the next poke.
*/
if (pcmk_is_set(peer->flags, crm_remote_node)) {
remote_ra_fail(event->target);
}
crmd_peer_down(peer, TRUE);
}
}
/*!
* \brief Connect to fencer
*
* \param[in] user_data If NULL, retry failures now, otherwise retry in main loop
*
* \return TRUE
* \note If user_data is NULL, this will wait 2s between attempts, for up to
* 30 attempts, meaning the controller could be blocked as long as 58s.
*/
static gboolean
te_connect_stonith(gpointer user_data)
{
int rc = pcmk_ok;
if (stonith_api == NULL) {
stonith_api = stonith_api_new();
if (stonith_api == NULL) {
crm_err("Could not connect to fencer: API memory allocation failed");
return TRUE;
}
}
if (stonith_api->state != stonith_disconnected) {
crm_trace("Already connected to fencer, no need to retry");
return TRUE;
}
if (user_data == NULL) {
// Blocking (retry failures now until successful)
rc = stonith_api_connect_retry(stonith_api, crm_system_name, 30);
if (rc != pcmk_ok) {
crm_err("Could not connect to fencer in 30 attempts: %s "
CRM_XS " rc=%d", pcmk_strerror(rc), rc);
}
} else {
// Non-blocking (retry failures later in main loop)
rc = stonith_api->cmds->connect(stonith_api, crm_system_name, NULL);
if (rc != pcmk_ok) {
if (pcmk_is_set(controld_globals.fsa_input_register,
R_ST_REQUIRED)) {
crm_notice("Fencer connection failed (will retry): %s "
CRM_XS " rc=%d", pcmk_strerror(rc), rc);
mainloop_set_trigger(stonith_reconnect);
} else {
crm_info("Fencer connection failed (ignoring because no longer required): %s "
CRM_XS " rc=%d", pcmk_strerror(rc), rc);
}
return TRUE;
}
}
if (rc == pcmk_ok) {
stonith_api->cmds->register_notification(stonith_api,
T_STONITH_NOTIFY_DISCONNECT,
tengine_stonith_connection_destroy);
stonith_api->cmds->register_notification(stonith_api,
T_STONITH_NOTIFY_FENCE,
handle_fence_notification);
stonith_api->cmds->register_notification(stonith_api,
T_STONITH_NOTIFY_HISTORY_SYNCED,
tengine_stonith_history_synced);
te_trigger_stonith_history_sync(TRUE);
crm_notice("Fencer successfully connected");
}
return TRUE;
}
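/* The 58s figure in the \note above follows from the blocking retry loop:
* 30 connection attempts with a 2s wait between consecutive attempts means up
* to 29 waits, i.e. 29 * 2s = 58s spent waiting (time spent in the attempts
* themselves not included).
*/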
/*!
\internal
\brief Schedule fencer connection attempt in main loop
*/
void
controld_trigger_fencer_connect(void)
{
if (stonith_reconnect == NULL) {
stonith_reconnect = mainloop_add_trigger(G_PRIORITY_LOW,
te_connect_stonith,
GINT_TO_POINTER(TRUE));
}
controld_set_fsa_input_flags(R_ST_REQUIRED);
mainloop_set_trigger(stonith_reconnect);
}
void
controld_disconnect_fencer(bool destroy)
{
if (stonith_api) {
// Prevent fencer connection from coming up again
controld_clear_fsa_input_flags(R_ST_REQUIRED);
if (stonith_api->state != stonith_disconnected) {
stonith_api->cmds->disconnect(stonith_api);
}
stonith_api->cmds->remove_notification(stonith_api, NULL);
}
if (destroy) {
if (stonith_api) {
stonith_api->cmds->free(stonith_api);
stonith_api = NULL;
}
if (stonith_reconnect) {
mainloop_destroy_trigger(stonith_reconnect);
stonith_reconnect = NULL;
}
if (te_client_id) {
free(te_client_id);
te_client_id = NULL;
}
}
}
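/*!
* \internal
* \brief Ask the fencer to broadcast the fencing history (trigger handler)
*
* \param[in] user_data Ignored
*
* \return TRUE if connected to the fencer and a synchronous broadcast of the
* fencing history was requested, otherwise FALSE
*/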
static gboolean
do_stonith_history_sync(gpointer user_data)
{
if (stonith_api && (stonith_api->state != stonith_disconnected)) {
stonith_history_t *history = NULL;
te_cleanup_stonith_history_sync(stonith_api, FALSE);
stonith_api->cmds->history(stonith_api,
st_opt_sync_call | st_opt_broadcast,
NULL, &history, 5);
stonith_history_free(history);
return TRUE;
} else {
crm_info("Skip triggering stonith history-sync as stonith is disconnected");
return FALSE;
}
}
static void
tengine_stonith_callback(stonith_t *stonith, stonith_callback_data_t *data)
{
char *uuid = NULL;
int stonith_id = -1;
int transition_id = -1;
pcmk__graph_action_t *action = NULL;
const char *target = NULL;
if ((data == NULL) || (data->userdata == NULL)) {
crm_err("Ignoring fence operation %d result: "
"No transition key given (bug?)",
((data == NULL)? -1 : data->call_id));
return;
}
if (!AM_I_DC) {
const char *reason = stonith__exit_reason(data);
if (reason == NULL) {
reason = pcmk_exec_status_str(stonith__execution_status(data));
}
crm_notice("Result of fence operation %d: %d (%s) " CRM_XS " key=%s",
data->call_id, stonith__exit_status(data), reason,
(const char *) data->userdata);
return;
}
CRM_CHECK(decode_transition_key(data->userdata, &uuid, &transition_id,
&stonith_id, NULL),
goto bail);
if (controld_globals.transition_graph->complete || (stonith_id < 0)
|| !pcmk__str_eq(uuid, controld_globals.te_uuid, pcmk__str_none)
|| (controld_globals.transition_graph->id != transition_id)) {
crm_info("Ignoring fence operation %d result: "
"Not from current transition " CRM_XS
" complete=%s action=%d uuid=%s (vs %s) transition=%d (vs %d)",
data->call_id,
pcmk__btoa(controld_globals.transition_graph->complete),
stonith_id, uuid, controld_globals.te_uuid, transition_id,
controld_globals.transition_graph->id);
goto bail;
}
action = controld_get_action(stonith_id);
if (action == NULL) {
crm_err("Ignoring fence operation %d result: "
"Action %d not found in transition graph (bug?) "
CRM_XS " uuid=%s transition=%d",
data->call_id, stonith_id, uuid, transition_id);
goto bail;
}
target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
if (target == NULL) {
crm_err("Ignoring fence operation %d result: No target given (bug?)",
data->call_id);
goto bail;
}
stop_te_timer(action);
if (stonith__exit_status(data) == CRM_EX_OK) {
const char *uuid = crm_element_value(action->xml, XML_LRM_ATTR_TARGET_UUID);
const char *op = crm_meta_value(action->params, "stonith_action");
crm_info("Fence operation %d for %s succeeded", data->call_id, target);
if (!(pcmk_is_set(action->flags, pcmk__graph_action_confirmed))) {
te_action_confirmed(action, NULL);
if (pcmk__str_eq("on", op, pcmk__str_casei)) {
const char *value = NULL;
char *now = pcmk__ttoa(time(NULL));
gboolean is_remote_node = FALSE;
/* This check is not 100% reliable, since this node is not
* guaranteed to have the remote node cached. However, it
* doesn't have to be reliable, since the attribute manager can
* learn a node's "remoteness" by other means sooner or later.
* This allows it to learn more quickly if this node does have
* the information.
*/
if (g_hash_table_lookup(crm_remote_peer_cache, uuid) != NULL) {
is_remote_node = TRUE;
}
update_attrd(target, CRM_ATTR_UNFENCED, now, NULL,
is_remote_node);
free(now);
value = crm_meta_value(action->params, XML_OP_ATTR_DIGESTS_ALL);
update_attrd(target, CRM_ATTR_DIGESTS_ALL, value, NULL,
is_remote_node);
value = crm_meta_value(action->params, XML_OP_ATTR_DIGESTS_SECURE);
update_attrd(target, CRM_ATTR_DIGESTS_SECURE, value, NULL,
is_remote_node);
} else if (!(pcmk_is_set(action->flags, pcmk__graph_action_sent_update))) {
send_stonith_update(action, target, uuid);
pcmk__set_graph_action_flags(action,
pcmk__graph_action_sent_update);
}
}
st_fail_count_reset(target);
} else {
enum pcmk__graph_next abort_action = pcmk__graph_restart;
int status = stonith__execution_status(data);
const char *reason = stonith__exit_reason(data);
if (reason == NULL) {
if (status == PCMK_EXEC_DONE) {
reason = "Agent returned error";
} else {
reason = pcmk_exec_status_str(status);
}
}
pcmk__set_graph_action_flags(action, pcmk__graph_action_failed);
/* If no fence devices were available, there's no use in immediately
* checking again, so don't start a new transition in that case.
*/
if (status == PCMK_EXEC_NO_FENCE_DEVICE) {
crm_warn("Fence operation %d for %s failed: %s "
"(aborting transition and giving up for now)",
data->call_id, target, reason);
abort_action = pcmk__graph_wait;
} else {
crm_notice("Fence operation %d for %s failed: %s "
"(aborting transition)", data->call_id, target, reason);
}
/* Increment the fail count now, so abort_for_stonith_failure() can
* check it. Non-DC nodes will increment it in
* handle_fence_notification().
*/
st_fail_count_increment(target);
abort_for_stonith_failure(abort_action, target, NULL);
}
pcmk__update_graph(controld_globals.transition_graph, action);
trigger_graph();
bail:
free(data->userdata);
free(uuid);
return;
}
static int
fence_with_delay(const char *target, const char *type, const char *delay)
{
uint32_t options = st_opt_none; // Group of enum stonith_call_options
int timeout_sec = (int) (controld_globals.transition_graph->stonith_timeout
/ 1000);
int delay_i;
if (crmd_join_phase_count(crm_join_confirmed) == 1) {
stonith__set_call_options(options, target, st_opt_allow_suicide);
}
pcmk__scan_min_int(delay, &delay_i, 0);
return stonith_api->cmds->fence_with_delay(stonith_api, options, target,
type, timeout_sec, 0, delay_i);
}
/*!
* \internal
* \brief Execute a fencing action from a transition graph
*
* \param[in] graph Transition graph being executed (ignored)
* \param[in] action Fencing action to execute
*
* \return Standard Pacemaker return code
*/
int
controld_execute_fence_action(pcmk__graph_t *graph,
pcmk__graph_action_t *action)
{
int rc = 0;
const char *id = ID(action->xml);
const char *uuid = crm_element_value(action->xml, XML_LRM_ATTR_TARGET_UUID);
const char *target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
const char *type = crm_meta_value(action->params, "stonith_action");
char *transition_key = NULL;
const char *priority_delay = NULL;
gboolean invalid_action = FALSE;
guint stonith_timeout = controld_globals.transition_graph->stonith_timeout;
CRM_CHECK(id != NULL, invalid_action = TRUE);
CRM_CHECK(uuid != NULL, invalid_action = TRUE);
CRM_CHECK(type != NULL, invalid_action = TRUE);
CRM_CHECK(target != NULL, invalid_action = TRUE);
if (invalid_action) {
crm_log_xml_warn(action->xml, "BadAction");
return EPROTO;
}
priority_delay = crm_meta_value(action->params, XML_CONFIG_ATTR_PRIORITY_FENCING_DELAY);
crm_notice("Requesting fencing (%s) of node %s "
CRM_XS " action=%s timeout=%u%s%s",
type, target, id, stonith_timeout,
priority_delay ? " priority_delay=" : "",
priority_delay ? priority_delay : "");
/* Passing NULL means block until we can connect... */
te_connect_stonith(NULL);
rc = fence_with_delay(target, type, priority_delay);
transition_key = pcmk__transition_key(controld_globals.transition_graph->id,
action->id, 0,
controld_globals.te_uuid);
stonith_api->cmds->register_callback(stonith_api, rc,
(int) (stonith_timeout / 1000),
st_opt_timeout_updates, transition_key,
"tengine_stonith_callback",
tengine_stonith_callback);
return pcmk_rc_ok;
}
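/* Illustrative sketch (not part of the original source): the transition key
* registered above is the same string later unpacked by
* tengine_stonith_callback(), so the encoding and decoding calls must stay in
* sync. The literal IDs here are hypothetical.
*/
static void
transition_key_example(void)
{
char *key = pcmk__transition_key(1, 2, 0, controld_globals.te_uuid);
char *uuid = NULL;
int transition_id = -1;
int action_id = -1;
decode_transition_key(key, &uuid, &transition_id, &action_id, NULL);
// transition_id is 1 and action_id is 2 again
free(uuid);
free(key);
}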
bool
controld_verify_stonith_watchdog_timeout(const char *value)
{
const char *our_nodename = controld_globals.our_nodename;
gboolean rv = TRUE;
if (stonith_api && (stonith_api->state != stonith_disconnected) &&
stonith__watchdog_fencing_enabled_for_node_api(stonith_api,
our_nodename)) {
rv = pcmk__valid_sbd_timeout(value);
}
return rv;
}
/* end stonith API client functions */
/*
* stonith history synchronization
*
* Each node's fencer keeps track of a cluster-wide fencing history. When a node
* joins or leaves, we need to synchronize the history across all nodes.
*/
static crm_trigger_t *stonith_history_sync_trigger = NULL;
static mainloop_timer_t *stonith_history_sync_timer_short = NULL;
static mainloop_timer_t *stonith_history_sync_timer_long = NULL;
void
te_cleanup_stonith_history_sync(stonith_t *st, bool free_timers)
{
if (free_timers) {
mainloop_timer_del(stonith_history_sync_timer_short);
stonith_history_sync_timer_short = NULL;
mainloop_timer_del(stonith_history_sync_timer_long);
stonith_history_sync_timer_long = NULL;
} else {
mainloop_timer_stop(stonith_history_sync_timer_short);
mainloop_timer_stop(stonith_history_sync_timer_long);
}
if (st) {
st->cmds->remove_notification(st, T_STONITH_NOTIFY_HISTORY_SYNCED);
}
}
static void
tengine_stonith_history_synced(stonith_t *st, stonith_event_t *st_event)
{
te_cleanup_stonith_history_sync(st, FALSE);
crm_debug("Fence-history synced - cancel all timers");
}
static gboolean
stonith_history_sync_set_trigger(gpointer user_data)
{
mainloop_set_trigger(stonith_history_sync_trigger);
return FALSE;
}
void
te_trigger_stonith_history_sync(bool long_timeout)
{
/* Trigger a sync in 5s, to give more nodes a chance to show up so that we
* don't create unnecessary stonith-history-sync traffic.
*
* The long timeout of 30s is a fallback: after a successful connection to
* the fencer, we wait up to 30s for the DC to trigger a history sync. If
* that doesn't happen (e.g. the fencer segfaulted and was restarted by
* pacemakerd), we trigger a sync locally.
*/
/* Since do_stonith_history_sync() checks the fencer connection anyway, it is
* fine to leave stonith_history_sync_timer_short/_long and
* stonith_history_sync_trigger around.
*/
if (stonith_history_sync_trigger == NULL) {
stonith_history_sync_trigger =
mainloop_add_trigger(G_PRIORITY_LOW,
do_stonith_history_sync, NULL);
}
if (long_timeout) {
if (stonith_history_sync_timer_long == NULL) {
stonith_history_sync_timer_long =
mainloop_timer_add("history_sync_long", 30000,
FALSE, stonith_history_sync_set_trigger,
NULL);
}
crm_info("Fence history will be synchronized cluster-wide within 30 seconds");
mainloop_timer_start(stonith_history_sync_timer_long);
} else {
if (stonith_history_sync_timer_short == NULL) {
stonith_history_sync_timer_short =
mainloop_timer_add("history_sync_short", 5000,
FALSE, stonith_history_sync_set_trigger,
NULL);
}
crm_info("Fence history will be synchronized cluster-wide within 5 seconds");
mainloop_timer_start(stonith_history_sync_timer_short);
}
}
/* end stonith history synchronization functions */
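/* Timing summary for the triggers above (descriptive only): a normal request
* waits 5s (history_sync_short) so that more nodes can show up first, while
* the 30s fallback (history_sync_long) is armed after a fresh fencer
* connection and fires only if the DC has not triggered a history sync in the
* meantime.
*/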
diff --git a/daemons/controld/controld_fsa.h b/daemons/controld/controld_fsa.h
index 18ca61754f..2b79f07ccb 100644
--- a/daemons/controld/controld_fsa.h
+++ b/daemons/controld/controld_fsa.h
@@ -1,695 +1,694 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU Lesser General Public License
* version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY.
*/
#ifndef CRMD_FSA__H
# define CRMD_FSA__H
# include
# include
# include
# include
# include
# include
# include
/*! States the controller can be in */
enum crmd_fsa_state {
S_IDLE = 0, /* Nothing happening */
S_ELECTION, /* Take part in the election algorithm as
* described below
*/
S_INTEGRATION, /* integrate the status of new nodes (which is
* all of them if we have just been elected DC)
* to form a complete and up-to-date picture of
* the CIB
*/
S_FINALIZE_JOIN, /* integrate the status of new nodes (which is
* all of them if we have just been elected DC)
* to form a complete and up-to-date picture of
* the CIB
*/
S_NOT_DC, /* we are in non-DC mode */
S_POLICY_ENGINE, /* Determine next stable state of the cluster */
S_RECOVERY, /* Something bad happened, check everything is ok
* before continuing and attempt to recover if
* required
*/
S_RELEASE_DC, /* we were the DC, but now we aren't anymore,
* possibly by our own request, and we should
* release all unnecessary sub-systems, finish
* any pending actions, do general cleanup and
* unset anything that makes us think we are
* special :)
*/
S_STARTING, /* we are just starting out */
S_PENDING, /* we are not a full/active member yet */
S_STOPPING, /* We are in the final stages of shutting down */
S_TERMINATE, /* We are going to shut down; this is the equivalent of
* "Sending TERM signal to all processes" in Linux,
* and in worst-case scenarios could be considered
* a self-STONITH
*/
S_TRANSITION_ENGINE, /* Attempt to make the calculated next stable
* state of the cluster a reality
*/
S_HALT, /* Freeze - don't do anything
* Something bad happened that needs the admin to fix
* Wait for I_ELECTION
*/
/* ----------- Last input found in table is above ---------- */
S_ILLEGAL /* This is an illegal FSA state */
/* (must be last) */
};
# define MAXSTATE S_ILLEGAL
/*
Once we start and do some basic sanity checks, we go into the
S_NOT_DC state and await instructions from the DC or input from
the cluster layer which indicates the election algorithm needs to run.
If the election algorithm is triggered, we enter the S_ELECTION state
from where we can either go back to the S_NOT_DC state or progress
to the S_INTEGRATION state (or S_RELEASE_DC if we used to be the DC
but aren't anymore). See the libcrmcluster API documentation for more
information about the election algorithm.
Once the election is complete, if we are the DC, we enter the
S_INTEGRATION state which is a DC-in-waiting style state. We are
the DC, but we shouldn't do anything yet because we may not have an
up-to-date picture of the cluster. There may of course be times
when this fails, so we should go back to the S_RECOVERY stage and
check everything is ok. We may also end up here if a new node came
online, since each node is authoritative about itself, and we would want
to incorporate its information into the CIB.
Once we have the latest CIB, we then enter the S_POLICY_ENGINE state,
where we invoke the scheduler. It is possible that, between invoking the
scheduler and receiving an answer, we receive more input. In this case, we
would discard the original result and invoke the scheduler again.
Once we are satisfied with the output from the scheduler, we
enter S_TRANSITION_ENGINE and feed the scheduler's output to the
Transition Engine, which attempts to make the scheduler's
calculation a reality. If the transition completes successfully,
we enter S_IDLE; otherwise, we go back to S_POLICY_ENGINE with the
current unstable state and try again.
Of course, we may be asked to shut down at any time; however, we must
progress to S_NOT_DC before doing so. Once we have handed over DC
duties to another node, we can then shut down like everyone else,
that is, by asking the DC for permission and waiting for it to take all
our resources away.
The case where we are the DC and the only node in the cluster is a
special case and handled as an escalation which takes us to
S_SHUTDOWN. Similarly, if any other point in the shutdown
fails or stalls, this is escalated and we end up in S_TERMINATE.
At any point, the controller can relay messages for its subsystems,
but outbound messages (from subsystems) should probably be blocked
until S_INTEGRATION (for the DC) or the join protocol has
completed (for non-DC controllers).
*/
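/*
Illustrative sketch (not part of the original header): the election path
described above, reduced to a hypothetical helper using the states defined
in this file. If we win, we become a DC-in-waiting (S_INTEGRATION); if we
lose, we return to S_NOT_DC, passing through S_RELEASE_DC first if we had
been the DC.

enum crmd_fsa_state
election_outcome(bool won, bool was_dc)
{
if (won) {
return S_INTEGRATION;
}
return was_dc? S_RELEASE_DC : S_NOT_DC;
}
*/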
/*======================================
*
* Inputs/Events/Stimuli to be given to the finite state machine
*
* Some of these are true events, and others are synthesized based on
* the "register" (see below) and the contents or source of messages.
*
* The machine keeps processing until receiving I_NULL
*
*======================================*/
enum crmd_fsa_input {
/* 0 */
I_NULL, /* Nothing happened */
/* 1 */
I_CIB_OP, /* An update to the CIB occurred */
I_CIB_UPDATE, /* An update to the CIB occurred */
I_DC_TIMEOUT, /* We have lost communication with the DC */
I_ELECTION, /* Someone started an election */
I_PE_CALC, /* The scheduler needs to be invoked */
I_RELEASE_DC, /* The election completed and we were not
* elected, but we were the DC beforehand
*/
I_ELECTION_DC, /* The election completed and we were (re-)elected
* DC
*/
I_ERROR, /* Something bad happened (more serious than
* I_FAIL) and may not have been due to the action
* being performed. For example, we may have lost
* our connection to the CIB.
*/
/* 9 */
I_FAIL, /* The action failed to complete successfully */
I_INTEGRATED,
I_FINALIZED,
I_NODE_JOIN, /* A node has entered the cluster */
I_NOT_DC, /* We are not and were not the DC before or after
* the current operation or state
*/
I_RECOVERED, /* The recovery process completed successfully */
I_RELEASE_FAIL, /* We could not give up DC status for some reason
*/
I_RELEASE_SUCCESS, /* We are no longer the DC */
I_RESTART, /* The current set of actions needs to be
* restarted
*/
I_TE_SUCCESS, /* Some non-resource, non-cluster-layer action
* is required of us, e.g. ping
*/
/* 20 */
I_ROUTER, /* Do our job as router and forward this to the
* right place
*/
I_SHUTDOWN, /* We are asking to shut down */
I_STOP, /* We have been told to shut down */
I_TERMINATE, /* Actually exit */
I_STARTUP,
I_PE_SUCCESS, /* The action completed successfully */
I_JOIN_OFFER, /* The DC is offering membership */
I_JOIN_REQUEST, /* The client is requesting membership */
I_JOIN_RESULT, /* If not the DC: The result of a join request
* Else: A client is responding with its local state info
*/
I_WAIT_FOR_EVENT, /* we may be waiting for an async task to "happen"
* and until it does, we can't do anything else
*/
I_DC_HEARTBEAT, /* The DC is telling us that it is alive and well */
I_LRM_EVENT,
/* 30 */
I_PENDING,
I_HALT,
/* ------------ Last input found in table is above ----------- */
I_ILLEGAL /* This is an illegal value for an FSA input */
/* (must be last) */
};
# define MAXINPUT I_ILLEGAL
# define I_MESSAGE I_ROUTER
/*======================================
*
* actions
*
* Some of the actions below will always occur together for now, but this may
* not always be the case, so they are split up so that they can easily be
* called independently in the future, if necessary.
*
* For example, separating A_LRM_CONNECT from A_STARTUP might be useful
* if we ever try to recover from a faulty or disconnected executor.
*
*======================================*/
/* Don't do anything */
# define A_NOTHING 0x0000000000000000ULL
/* -- Startup actions -- */
/* Hook to perform any actions (other than connecting to other daemons)
* that might be needed as part of the startup.
*/
# define A_STARTUP 0x0000000000000001ULL
/* Hook to perform any actions that might be needed
* after startup completes successfully.
*/
# define A_STARTED 0x0000000000000002ULL
/* Connect to cluster layer */
# define A_HA_CONNECT 0x0000000000000004ULL
# define A_HA_DISCONNECT 0x0000000000000008ULL
# define A_INTEGRATE_TIMER_START 0x0000000000000010ULL
# define A_INTEGRATE_TIMER_STOP 0x0000000000000020ULL
# define A_FINALIZE_TIMER_START 0x0000000000000040ULL
# define A_FINALIZE_TIMER_STOP 0x0000000000000080ULL
/* -- Election actions -- */
# define A_DC_TIMER_START 0x0000000000000100ULL
# define A_DC_TIMER_STOP 0x0000000000000200ULL
# define A_ELECTION_COUNT 0x0000000000000400ULL
# define A_ELECTION_VOTE 0x0000000000000800ULL
# define A_ELECTION_START 0x0000000000001000ULL
/* -- Message processing -- */
/* Process the queue of requests */
# define A_MSG_PROCESS 0x0000000000002000ULL
/* Send the message to the correct recipient */
# define A_MSG_ROUTE 0x0000000000004000ULL
/* Send a welcome message to new node(s) */
# define A_DC_JOIN_OFFER_ONE 0x0000000000008000ULL
/* -- Server Join protocol actions -- */
/* Send a welcome message to all nodes */
# define A_DC_JOIN_OFFER_ALL 0x0000000000010000ULL
/* Process the remote node's ack of our join message */
# define A_DC_JOIN_PROCESS_REQ 0x0000000000020000ULL
/* Send out the results of the Join phase */
# define A_DC_JOIN_FINALIZE 0x0000000000040000ULL
/* Send out the results of the Join phase */
# define A_DC_JOIN_PROCESS_ACK 0x0000000000080000ULL
/* -- Client Join protocol actions -- */
# define A_CL_JOIN_QUERY 0x0000000000100000ULL
# define A_CL_JOIN_ANNOUNCE 0x0000000000200000ULL
/* Request membership to the DC list */
# define A_CL_JOIN_REQUEST 0x0000000000400000ULL
/* Did the DC accept or reject the request */
# define A_CL_JOIN_RESULT 0x0000000000800000ULL
/* -- Recovery, DC start/stop -- */
/* Something bad happened, try to recover */
# define A_RECOVER 0x0000000001000000ULL
/* Hook to perform any actions (apart from starting the TE and scheduler,
* and gathering the latest CIB) that might be necessary before
* giving up the responsibilities of being the DC.
*/
# define A_DC_RELEASE 0x0000000002000000ULL
/* */
# define A_DC_RELEASED 0x0000000004000000ULL
/* Hook to perform any actions (apart from starting the TE and scheduler,
* and gathering the latest CIB) that might be necessary before
* taking over the responsibilities of being the DC.
*/
# define A_DC_TAKEOVER 0x0000000008000000ULL
/* -- Shutdown actions -- */
# define A_SHUTDOWN 0x0000000010000000ULL
# define A_STOP 0x0000000020000000ULL
# define A_EXIT_0 0x0000000040000000ULL
# define A_EXIT_1 0x0000000080000000ULL
# define A_SHUTDOWN_REQ 0x0000000100000000ULL
# define A_ELECTION_CHECK 0x0000000200000000ULL
# define A_DC_JOIN_FINAL 0x0000000400000000ULL
/* -- CIB actions -- */
# define A_CIB_START 0x0000020000000000ULL
# define A_CIB_STOP 0x0000040000000000ULL
/* -- Transition Engine actions -- */
/* Attempt to reach the newly calculated cluster state. This is
* only called once per transition (except if it is asked to
* stop the transition or start a new one).
* Once given a cluster state to reach, the TE will determine
* tasks that can be performed in parallel, execute them, wait
* for replies and then determine the next set until the new
* state is reached or no further tasks can be taken.
*/
# define A_TE_INVOKE 0x0000100000000000ULL
# define A_TE_START 0x0000200000000000ULL
# define A_TE_STOP 0x0000400000000000ULL
# define A_TE_CANCEL 0x0000800000000000ULL
# define A_TE_HALT 0x0001000000000000ULL
/* -- Scheduler actions -- */
/* Calculate the next state for the cluster. This is only
* invoked once per needed calculation.
*/
# define A_PE_INVOKE 0x0002000000000000ULL
# define A_PE_START 0x0004000000000000ULL
# define A_PE_STOP 0x0008000000000000ULL
/* -- Misc actions -- */
/* Add a system-generated "block" so that resources aren't moved to, or are
* actively moved away from, the affected node. This way we can return
* quickly even if busy with other things.
*/
# define A_NODE_BLOCK 0x0010000000000000ULL
/* Update our information in the local CIB */
# define A_UPDATE_NODESTATUS 0x0020000000000000ULL
# define A_READCONFIG 0x0080000000000000ULL
/* -- LRM Actions -- */
/* Connect to pacemaker-execd */
# define A_LRM_CONNECT 0x0100000000000000ULL
/* Disconnect from pacemaker-execd */
# define A_LRM_DISCONNECT 0x0200000000000000ULL
# define A_LRM_INVOKE 0x0400000000000000ULL
# define A_LRM_EVENT 0x0800000000000000ULL
/* -- Logging actions -- */
# define A_LOG 0x1000000000000000ULL
# define A_ERROR 0x2000000000000000ULL
# define A_WARN 0x4000000000000000ULL
# define O_EXIT (A_SHUTDOWN|A_STOP|A_LRM_DISCONNECT|A_HA_DISCONNECT|A_EXIT_0|A_CIB_STOP)
# define O_RELEASE (A_DC_TIMER_STOP|A_DC_RELEASE|A_PE_STOP|A_TE_STOP|A_DC_RELEASED)
# define O_PE_RESTART (A_PE_START|A_PE_STOP)
# define O_TE_RESTART (A_TE_START|A_TE_STOP)
# define O_CIB_RESTART (A_CIB_START|A_CIB_STOP)
# define O_LRM_RECONNECT (A_LRM_CONNECT|A_LRM_DISCONNECT)
# define O_DC_TIMER_RESTART (A_DC_TIMER_STOP|A_DC_TIMER_START)
/*======================================
*
* "register" contents
*
* Things we may want to remember regardless of which state we are in.
*
* These also count as inputs for synthesizing I_*
*
*======================================*/
# define R_THE_DC 0x00000001ULL
/* Are we the DC? */
# define R_STARTING 0x00000002ULL
/* Are we starting up? */
# define R_SHUTDOWN 0x00000004ULL
/* Are we trying to shut down? */
# define R_STAYDOWN 0x00000008ULL
/* Should we restart? */
# define R_JOIN_OK 0x00000010ULL /* Have we completed the join process */
# define R_READ_CONFIG 0x00000040ULL
# define R_INVOKE_PE 0x00000080ULL // Should the scheduler be invoked?
# define R_CIB_CONNECTED 0x00000100ULL
/* Is the CIB connected? */
# define R_PE_CONNECTED 0x00000200ULL // Is the scheduler connected?
# define R_TE_CONNECTED 0x00000400ULL
/* Is the Transition Engine connected? */
# define R_LRM_CONNECTED 0x00000800ULL // Is pacemaker-execd connected?
# define R_CIB_REQUIRED 0x00001000ULL
/* Is the CIB required? */
# define R_PE_REQUIRED 0x00002000ULL // Is the scheduler required?
# define R_TE_REQUIRED 0x00004000ULL
/* Is the Transition Engine required? */
# define R_ST_REQUIRED 0x00008000ULL
/* Is the Stonith daemon required? */
# define R_CIB_DONE 0x00010000ULL
/* Have we calculated the CIB? */
# define R_HAVE_CIB 0x00020000ULL /* Do we have an up-to-date CIB */
-# define R_CIB_ASKED 0x00040000ULL /* Have we asked for an up-to-date CIB */
# define R_MEMBERSHIP 0x00100000ULL /* Have we got cluster layer data yet */
# define R_PEER_DATA 0x00200000ULL /* Have we got T_CL_STATUS data yet */
# define R_HA_DISCONNECTED 0x00400000ULL /* did we sign out of our own accord */
# define R_REQ_PEND 0x01000000ULL
/* Are there Requests waiting for
processing? */
# define R_PE_PEND 0x02000000ULL // Are we awaiting reply from scheduler?
# define R_TE_PEND 0x04000000ULL
/* Has the TE been invoked and we're
awaiting completion? */
# define R_RESP_PEND 0x08000000ULL
/* Do we have clients waiting on a
response? If so, perhaps we shouldn't
stop yet */
# define R_SENT_RSC_STOP 0x20000000ULL /* Have we sent a stop action to all
* resources in preparation for
* shutting down */
# define R_IN_RECOVERY 0x80000000ULL
#define CRM_DIRECT_NACK_RC (99) // Deprecated (see PCMK_EXEC_INVALID)
enum crmd_fsa_cause {
C_UNKNOWN = 0,
C_STARTUP,
C_IPC_MESSAGE,
C_HA_MESSAGE,
C_CRMD_STATUS_CALLBACK,
C_LRM_OP_CALLBACK,
C_TIMER_POPPED,
C_SHUTDOWN,
C_FSA_INTERNAL,
};
enum fsa_data_type {
fsa_dt_none,
fsa_dt_ha_msg,
fsa_dt_xml,
fsa_dt_lrm,
};
typedef struct fsa_data_s fsa_data_t;
struct fsa_data_s {
int id;
enum crmd_fsa_input fsa_input;
enum crmd_fsa_cause fsa_cause;
uint64_t actions;
const char *origin;
void *data;
enum fsa_data_type data_type;
};
#define controld_set_fsa_input_flags(flags_to_set) do { \
controld_globals.fsa_input_register \
= pcmk__set_flags_as(__func__, __LINE__, LOG_TRACE, \
"FSA input", "controller", \
controld_globals.fsa_input_register, \
(flags_to_set), #flags_to_set); \
} while (0)
#define controld_clear_fsa_input_flags(flags_to_clear) do { \
controld_globals.fsa_input_register \
= pcmk__clear_flags_as(__func__, __LINE__, LOG_TRACE, \
"FSA input", "controller", \
controld_globals.fsa_input_register, \
(flags_to_clear), \
#flags_to_clear); \
} while (0)
#define controld_set_fsa_action_flags(flags_to_set) do { \
controld_globals.fsa_actions \
= pcmk__set_flags_as(__func__, __LINE__, LOG_DEBUG, \
"FSA action", "controller", \
controld_globals.fsa_actions, \
(flags_to_set), #flags_to_set); \
} while (0)
#define controld_clear_fsa_action_flags(flags_to_clear) do { \
controld_globals.fsa_actions \
= pcmk__clear_flags_as(__func__, __LINE__, LOG_DEBUG, \
"FSA action", "controller", \
controld_globals.fsa_actions, \
(flags_to_clear), #flags_to_clear); \
} while (0)
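/* Illustrative usage sketch (not part of the original header), assuming the
* R_* and A_* constants above: callers manipulate the FSA registers only
* through these macros, so that every flag change is logged with its call
* site.
*
* controld_set_fsa_input_flags(R_PE_REQUIRED);
* controld_set_fsa_action_flags(A_PE_START);
* ...
* controld_clear_fsa_input_flags(R_PE_REQUIRED);
*/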
// This should be moved elsewhere
xmlNode *controld_query_executor_state(void);
const char *fsa_input2string(enum crmd_fsa_input input);
const char *fsa_state2string(enum crmd_fsa_state state);
const char *fsa_cause2string(enum crmd_fsa_cause cause);
const char *fsa_action2string(long long action);
enum crmd_fsa_state s_crmd_fsa(enum crmd_fsa_cause cause);
enum crmd_fsa_state controld_fsa_get_next_state(enum crmd_fsa_input input);
uint64_t controld_fsa_get_action(enum crmd_fsa_input input);
void controld_init_fsa_trigger(void);
void controld_destroy_fsa_trigger(void);
void free_max_generation(void);
# define AM_I_DC pcmk_is_set(controld_globals.fsa_input_register, R_THE_DC)
# define controld_trigger_fsa() controld_trigger_fsa_as(__func__, __LINE__)
void controld_trigger_fsa_as(const char *fn, int line);
/* A_READCONFIG */
void do_read_config(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t *msg_data);
/* A_PE_INVOKE */
void do_pe_invoke(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t *msg_data);
/* A_LOG */
void do_log(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_STARTUP */
void do_startup(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_CIB_START, STOP, RESTART */
void do_cib_control(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_HA_CONNECT */
void do_ha_control(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_LRM_CONNECT */
void do_lrm_control(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_PE_START, STOP, RESTART */
void do_pe_control(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_TE_START, STOP, RESTART */
void do_te_control(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_STARTED */
void do_started(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_MSG_ROUTE */
void do_msg_route(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_RECOVER */
void do_recover(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_ELECTION_VOTE */
void do_election_vote(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_ELECTION_COUNT */
void do_election_count_vote(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input,
fsa_data_t *msg_data);
/* A_ELECTION_CHECK */
void do_election_check(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_DC_TIMER_STOP */
void do_timer_control(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_DC_TAKEOVER */
void do_dc_takeover(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_DC_RELEASE */
void do_dc_release(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_DC_JOIN_OFFER_ALL */
void do_dc_join_offer_all(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_DC_JOIN_OFFER_ONE */
void do_dc_join_offer_one(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_DC_JOIN_ACK */
void do_dc_join_ack(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_DC_JOIN_REQ */
void do_dc_join_filter_offer(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input,
fsa_data_t *msg_data);
/* A_DC_JOIN_FINALIZE */
void do_dc_join_finalize(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_CL_JOIN_QUERY */
/* is there a DC out there? */
void do_cl_join_query(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t *msg_data);
/* A_CL_JOIN_ANNOUNCE */
void do_cl_join_announce(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t *msg_data);
/* A_CL_JOIN_REQUEST */
void do_cl_join_offer_respond(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input,
fsa_data_t *msg_data);
/* A_CL_JOIN_RESULT */
void do_cl_join_finalize_respond(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input,
fsa_data_t *msg_data);
/* A_LRM_INVOKE */
void do_lrm_invoke(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_LRM_EVENT */
void do_lrm_event(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_TE_INVOKE, A_TE_CANCEL */
void do_te_invoke(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_SHUTDOWN_REQ */
void do_shutdown_req(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_SHUTDOWN */
void do_shutdown(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_STOP */
void do_stop(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_EXIT_0, A_EXIT_1 */
void do_exit(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input cur_input, fsa_data_t *msg_data);
/* A_DC_JOIN_FINAL */
void do_dc_join_final(long long action, enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t *msg_data);
#endif
diff --git a/daemons/controld/controld_globals.h b/daemons/controld/controld_globals.h
index b674762683..eff1607ae3 100644
--- a/daemons/controld/controld_globals.h
+++ b/daemons/controld/controld_globals.h
@@ -1,140 +1,143 @@
/*
* Copyright 2022-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU Lesser General Public License
* version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY.
*/
#ifndef CONTROLD_GLOBALS__H
# define CONTROLD_GLOBALS__H
#include // pcmk__output_t, etc.
#include // uint32_t, uint64_t
#include // GList, GMainLoop
#include // cib_t
#include // pcmk__graph_t
#include // enum crmd_fsa_state
typedef struct {
// Booleans
//! Group of \p controld_flags values
uint32_t flags;
// Controller FSA
//! FSA state
enum crmd_fsa_state fsa_state;
//! FSA actions (group of \p A_* flags)
uint64_t fsa_actions;
//! FSA input register contents (group of \p R_* flags)
uint64_t fsa_input_register;
//! FSA message queue
GList *fsa_message_queue;
// CIB
//! Connection to the CIB
cib_t *cib_conn;
- //! Call ID of the in-progress CIB resource update (or 0 if none)
- int resource_update;
+ //! CIB connection's client ID
+ const char *cib_client_id;
// Scheduler
//! Reference of the scheduler request being waited on
char *fsa_pe_ref;
// Transitioner
//! Transitioner UUID
char *te_uuid;
//! Graph of transition currently being processed
pcmk__graph_t *transition_graph;
// Logging
//! Output object for controller log messages
pcmk__output_t *logger_out;
// Other
//! Cluster name
char *cluster_name;
//! Designated controller name
char *dc_name;
//! Designated controller's Pacemaker version
char *dc_version;
//! Local node's node name
char *our_nodename;
//! Local node's UUID
char *our_uuid;
//! Last saved cluster communication layer membership ID
unsigned long long membership_id;
+ //! Max lifetime (in seconds) of a resource's shutdown lock to a node
+ guint shutdown_lock_limit;
+
//! Main event loop
GMainLoop *mainloop;
} controld_globals_t;
extern controld_globals_t controld_globals;
/*!
* \internal
* \enum controld_flags
* \brief Bit flags to store various controller state and configuration info
*/
enum controld_flags {
//! The DC left in a membership change that is being processed
controld_dc_left = (1 << 0),
//! The FSA is stalled waiting for further input
controld_fsa_is_stalled = (1 << 1),
//! The local node has been in a quorate partition at some point
controld_ever_had_quorum = (1 << 2),
//! The local node is currently in a quorate partition
controld_has_quorum = (1 << 3),
//! Panic the local node if it loses quorum
controld_no_quorum_suicide = (1 << 4),
//! Lock resources to the local node when it shuts down cleanly
controld_shutdown_lock_enabled = (1 << 5),
};
# define controld_set_global_flags(flags_to_set) do { \
controld_globals.flags = pcmk__set_flags_as(__func__, __LINE__, \
LOG_TRACE, \
"Global", "controller", \
controld_globals.flags, \
(flags_to_set), \
#flags_to_set); \
} while (0)
# define controld_clear_global_flags(flags_to_clear) do { \
controld_globals.flags \
= pcmk__clear_flags_as(__func__, __LINE__, LOG_TRACE, "Global", \
"controller", controld_globals.flags, \
(flags_to_clear), #flags_to_clear); \
} while (0)
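/* Illustrative usage sketch (not part of the original header), assuming the
* controld_flags values above:
*
* controld_set_global_flags(controld_has_quorum);
* if (pcmk_is_set(controld_globals.flags, controld_has_quorum)) {
* // act only while in a quorate partition
* }
* controld_clear_global_flags(controld_has_quorum);
*/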
#endif // ifndef CONTROLD_GLOBALS__H
diff --git a/daemons/controld/controld_join_dc.c b/daemons/controld/controld_join_dc.c
index 8526a13a2d..804db171fa 100644
--- a/daemons/controld/controld_join_dc.c
+++ b/daemons/controld/controld_join_dc.c
@@ -1,958 +1,961 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
static char *max_generation_from = NULL;
static xmlNodePtr max_generation_xml = NULL;
/*!
* \internal
* \brief Nodes from which a CIB sync has failed since the peer joined
*
* This table is of the form (node_name -> join_id). \p node_name is
* the name of a client node from which a CIB \p sync_from() call has failed in
* \p do_dc_join_finalize() since the client joined the cluster as a peer.
* \p join_id is the ID of the join round in which the \p sync_from() failed,
* and is intended for use in nack log messages.
*/
static GHashTable *failed_sync_nodes = NULL;
void finalize_join_for(gpointer key, gpointer value, gpointer user_data);
void finalize_sync_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data);
gboolean check_join_state(enum crmd_fsa_state cur_state, const char *source);
/* Numeric counter used to identify join rounds (an unsigned int would be
* appropriate, except we get and set it in XML as int)
*/
static int current_join_id = 0;
/*!
* \internal
* \brief Destroy the hash table containing failed sync nodes
*/
void
controld_destroy_failed_sync_table(void)
{
if (failed_sync_nodes != NULL) {
g_hash_table_destroy(failed_sync_nodes);
failed_sync_nodes = NULL;
}
}
/*!
* \internal
* \brief Remove a node from the failed sync nodes table if present
*
* \param[in] node_name Node name to remove
*/
void
controld_remove_failed_sync_node(const char *node_name)
{
if (failed_sync_nodes != NULL) {
g_hash_table_remove(failed_sync_nodes, (gchar *) node_name);
}
}
/*!
* \internal
* \brief Add to a hash table a node whose CIB failed to sync
*
* \param[in] node_name Name of node whose CIB failed to sync
* \param[in] join_id Join round when the failure occurred
*/
static void
record_failed_sync_node(const char *node_name, gint join_id)
{
if (failed_sync_nodes == NULL) {
failed_sync_nodes = pcmk__strikey_table(g_free, NULL);
}
/* If the node is already in the table then we failed to nack it during the
* filter offer step
*/
CRM_LOG_ASSERT(g_hash_table_insert(failed_sync_nodes, g_strdup(node_name),
GINT_TO_POINTER(join_id)));
}
/*!
* \internal
* \brief Look up a node name in the failed sync table
*
* \param[in] node_name Name of node to look up
* \param[out] join_id Where to store the join ID of when the sync failed
*
* \return Standard Pacemaker return code. Specifically, \p pcmk_rc_ok if the
* node name was found, or \p pcmk_rc_node_unknown otherwise.
* \note \p *join_id is set to -1 if the node is not found.
*/
static int
lookup_failed_sync_node(const char *node_name, gint *join_id)
{
*join_id = -1;
if (failed_sync_nodes != NULL) {
gpointer result = g_hash_table_lookup(failed_sync_nodes,
(gchar *) node_name);
if (result != NULL) {
*join_id = GPOINTER_TO_INT(result);
return pcmk_rc_ok;
}
}
return pcmk_rc_node_unknown;
}
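/* Illustrative sketch (not part of the original source): the intended
* lifecycle of the failed-sync table, using only helpers from this file. The
* node name and join ID are hypothetical.
*/
static void
failed_sync_table_example(void)
{
gint join_id = -1;
record_failed_sync_node("node1", 5);
if (lookup_failed_sync_node("node1", &join_id) == pcmk_rc_ok) {
// join_id is now 5; node1's next join request will be nacked
}
controld_remove_failed_sync_node("node1"); // e.g. after a later successful sync
}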
void
crm_update_peer_join(const char *source, crm_node_t * node, enum crm_join_phase phase)
{
enum crm_join_phase last = 0;
CRM_CHECK(node != NULL, return);
/* Remote nodes do not participate in joins */
if (pcmk_is_set(node->flags, crm_remote_node)) {
return;
}
last = node->join;
if (phase == last) {
crm_trace("Node %s join-%d phase is still %s "
CRM_XS " nodeid=%u source=%s",
node->uname, current_join_id, crm_join_phase_str(last),
node->id, source);
} else if ((phase <= crm_join_none) || (phase == (last + 1))) {
node->join = phase;
crm_trace("Node %s join-%d phase is now %s (was %s) "
CRM_XS " nodeid=%u source=%s",
node->uname, current_join_id, crm_join_phase_str(phase),
crm_join_phase_str(last), node->id, source);
} else {
crm_warn("Rejecting join-%d phase update for node %s because "
"can't go from %s to %s " CRM_XS " nodeid=%u source=%s",
current_join_id, node->uname, crm_join_phase_str(last),
crm_join_phase_str(phase), node->id, source);
}
}
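/* Illustrative sketch (not part of the original source):
* crm_update_peer_join() only lets a peer's join phase reset backward or
* advance by exactly one step. Assuming a peer currently at crm_join_none:
*/
static void
join_phase_example(crm_node_t *peer)
{
crm_update_peer_join(__func__, peer, crm_join_welcomed); // advance by one: accepted
crm_update_peer_join(__func__, peer, crm_join_confirmed); // skips ahead: rejected
crm_update_peer_join(__func__, peer, crm_join_none); // reset backward: accepted
}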
static void
start_join_round(void)
{
GHashTableIter iter;
crm_node_t *peer = NULL;
crm_debug("Starting new join round join-%d", current_join_id);
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &peer)) {
crm_update_peer_join(__func__, peer, crm_join_none);
}
if (max_generation_from != NULL) {
free(max_generation_from);
max_generation_from = NULL;
}
if (max_generation_xml != NULL) {
free_xml(max_generation_xml);
max_generation_xml = NULL;
}
- controld_clear_fsa_input_flags(R_HAVE_CIB|R_CIB_ASKED);
+ controld_clear_fsa_input_flags(R_HAVE_CIB);
+ controld_forget_all_cib_replace_calls();
}
/*!
* \internal
* \brief Create a join message from the DC
*
* \param[in] join_op Join operation name
* \param[in] host_to Recipient of message
*/
static xmlNode *
create_dc_message(const char *join_op, const char *host_to)
{
xmlNode *msg = create_request(join_op, NULL, host_to, CRM_SYSTEM_CRMD,
CRM_SYSTEM_DC, NULL);
/* Identify which election this is a part of */
crm_xml_add_int(msg, F_CRM_JOIN_ID, current_join_id);
/* Add a field specifying whether the DC is shutting down. This keeps the
* joining node from fencing the old DC if it becomes the new DC.
*/
pcmk__xe_set_bool_attr(msg, F_CRM_DC_LEAVING,
pcmk_is_set(controld_globals.fsa_input_register,
R_SHUTDOWN));
return msg;
}
static void
join_make_offer(gpointer key, gpointer value, gpointer user_data)
{
xmlNode *offer = NULL;
crm_node_t *member = (crm_node_t *)value;
CRM_ASSERT(member != NULL);
if (crm_is_peer_active(member) == FALSE) {
crm_info("Not making join-%d offer to inactive node %s",
current_join_id,
(member->uname? member->uname : "with unknown name"));
if (member->expected == NULL
&& pcmk__str_eq(member->state, CRM_NODE_LOST, pcmk__str_casei)) {
/* You would think this unsafe, but in fact this plus an
* active resource is what causes it to be fenced.
*
* Yes, this does mean that any node that dies at the same
* time as the old DC and is not still running resources
* won't be fenced.
*
* I'm not happy about this either.
*/
pcmk__update_peer_expected(__func__, member, CRMD_JOINSTATE_DOWN);
}
return;
}
if (member->uname == NULL) {
crm_info("Not making join-%d offer to node uuid %s with unknown name",
current_join_id, member->uuid);
return;
}
if (controld_globals.membership_id != crm_peer_seq) {
controld_globals.membership_id = crm_peer_seq;
crm_info("Making join-%d offers based on membership event %llu",
current_join_id, crm_peer_seq);
}
if (user_data && member->join > crm_join_none) {
crm_info("Not making join-%d offer to already known node %s (%s)",
current_join_id, member->uname,
crm_join_phase_str(member->join));
return;
}
crm_update_peer_join(__func__, member, crm_join_none);
offer = create_dc_message(CRM_OP_JOIN_OFFER, member->uname);
// Advertise our feature set so the joining node can bail if not compatible
crm_xml_add(offer, XML_ATTR_CRM_VERSION, CRM_FEATURE_SET);
crm_info("Sending join-%d offer to %s", current_join_id, member->uname);
send_cluster_message(member, crm_msg_crmd, offer, TRUE);
free_xml(offer);
crm_update_peer_join(__func__, member, crm_join_welcomed);
}
/* A_DC_JOIN_OFFER_ALL */
void
do_dc_join_offer_all(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
int count;
/* Reset everyone's status back to down or in_ccm in the CIB.
* Any nodes that are active in the CIB but not in the cluster membership
* will be seen as offline by the scheduler anyway.
*/
current_join_id++;
start_join_round();
update_dc(NULL);
if (cause == C_HA_MESSAGE && current_input == I_NODE_JOIN) {
crm_info("A new node joined the cluster");
}
g_hash_table_foreach(crm_peer_cache, join_make_offer, NULL);
count = crmd_join_phase_count(crm_join_welcomed);
crm_info("Waiting on join-%d requests from %d outstanding node%s",
current_join_id, count, pcmk__plural_s(count));
// Don't waste time by invoking the scheduler yet
}
/* A_DC_JOIN_OFFER_ONE */
void
do_dc_join_offer_one(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
crm_node_t *member;
ha_msg_input_t *welcome = NULL;
int count;
const char *join_to = NULL;
if (msg_data->data == NULL) {
crm_info("Making join-%d offers to any unconfirmed nodes "
"because an unknown node joined", current_join_id);
g_hash_table_foreach(crm_peer_cache, join_make_offer, &member);
check_join_state(cur_state, __func__);
return;
}
welcome = fsa_typed_data(fsa_dt_ha_msg);
if (welcome == NULL) {
// fsa_typed_data() already logged an error
return;
}
join_to = crm_element_value(welcome->msg, F_CRM_HOST_FROM);
if (join_to == NULL) {
crm_err("Can't make join-%d offer to unknown node", current_join_id);
return;
}
member = crm_get_peer(0, join_to);
/* It is possible that a node will have been sick or starting up when the
* original offer was made. However, it will either re-announce itself in
* due course, or we can re-store the original offer on the client.
*/
crm_update_peer_join(__func__, member, crm_join_none);
join_make_offer(NULL, member, NULL);
/* If the offer isn't to the local node, make an offer to the local node as
* well, to ensure the correct value for max_generation_from.
*/
if (strcasecmp(join_to, controld_globals.our_nodename) != 0) {
member = crm_get_peer(0, controld_globals.our_nodename);
join_make_offer(NULL, member, NULL);
}
/* This was a genuine join request; cancel any existing transition and
* invoke the scheduler.
*/
abort_transition(INFINITY, pcmk__graph_restart, "Node join", NULL);
count = crmd_join_phase_count(crm_join_welcomed);
crm_info("Waiting on join-%d requests from %d outstanding node%s",
current_join_id, count, pcmk__plural_s(count));
// Don't waste time by invoking the scheduler yet
}
static int
compare_int_fields(xmlNode * left, xmlNode * right, const char *field)
{
const char *elem_l = crm_element_value(left, field);
const char *elem_r = crm_element_value(right, field);
long long int_elem_l;
long long int_elem_r;
pcmk__scan_ll(elem_l, &int_elem_l, -1LL);
pcmk__scan_ll(elem_r, &int_elem_r, -1LL);
if (int_elem_l < int_elem_r) {
return -1;
} else if (int_elem_l > int_elem_r) {
return 1;
}
return 0;
}
/* A_DC_JOIN_PROCESS_REQ */
void
do_dc_join_filter_offer(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
xmlNode *generation = NULL;
int cmp = 0;
int join_id = -1;
int count = 0;
gint value = 0;
gboolean ack_nack_bool = TRUE;
ha_msg_input_t *join_ack = fsa_typed_data(fsa_dt_ha_msg);
const char *join_from = crm_element_value(join_ack->msg, F_CRM_HOST_FROM);
const char *ref = crm_element_value(join_ack->msg, F_CRM_REFERENCE);
const char *join_version = crm_element_value(join_ack->msg,
XML_ATTR_CRM_VERSION);
crm_node_t *join_node = NULL;
if (join_from == NULL) {
crm_err("Ignoring invalid join request without node name");
return;
}
join_node = crm_get_peer(0, join_from);
crm_element_value_int(join_ack->msg, F_CRM_JOIN_ID, &join_id);
if (join_id != current_join_id) {
crm_debug("Ignoring join-%d request from %s because we are on join-%d",
join_id, join_from, current_join_id);
check_join_state(cur_state, __func__);
return;
}
generation = join_ack->xml;
if (max_generation_xml != NULL && generation != NULL) {
int lpc = 0;
const char *attributes[] = {
XML_ATTR_GENERATION_ADMIN,
XML_ATTR_GENERATION,
XML_ATTR_NUMUPDATES,
};
for (lpc = 0; cmp == 0 && lpc < PCMK__NELEM(attributes); lpc++) {
cmp = compare_int_fields(max_generation_xml, generation, attributes[lpc]);
}
}
if (ref == NULL) {
ref = "none"; // for logging only
}
if (lookup_failed_sync_node(join_from, &value) == pcmk_rc_ok) {
crm_err("Rejecting join-%d request from node %s because we failed to "
"sync its CIB in join-%d " CRM_XS " ref=%s",
join_id, join_from, value, ref);
ack_nack_bool = FALSE;
} else if (!crm_is_peer_active(join_node)) {
if (match_down_event(join_from) != NULL) {
/* The join request was received after the node was fenced or
* otherwise shutdown in a way that we're aware of. No need to log
* an error in this rare occurrence; we know the client was recently
* shut down, and receiving a lingering in-flight request is not
* cause for alarm.
*/
crm_debug("Rejecting join-%d request from inactive node %s "
CRM_XS " ref=%s", join_id, join_from, ref);
} else {
crm_err("Rejecting join-%d request from inactive node %s "
CRM_XS " ref=%s", join_id, join_from, ref);
}
ack_nack_bool = FALSE;
} else if (generation == NULL) {
crm_err("Rejecting invalid join-%d request from node %s "
"missing CIB generation " CRM_XS " ref=%s",
join_id, join_from, ref);
ack_nack_bool = FALSE;
} else if ((join_version == NULL)
|| !feature_set_compatible(CRM_FEATURE_SET, join_version)) {
crm_err("Rejecting join-%d request from node %s because feature set %s"
" is incompatible with ours (%s) " CRM_XS " ref=%s",
join_id, join_from, (join_version? join_version : "pre-3.1.0"),
CRM_FEATURE_SET, ref);
ack_nack_bool = FALSE;
} else if (max_generation_xml == NULL) {
const char *validation = crm_element_value(generation,
XML_ATTR_VALIDATION);
if (get_schema_version(validation) < 0) {
crm_err("Rejecting join-%d request from %s (with first CIB "
"generation) due to unknown schema version %s "
CRM_XS " ref=%s",
join_id, join_from, validation, ref);
ack_nack_bool = FALSE;
} else {
crm_debug("Accepting join-%d request from %s (with first CIB "
"generation) " CRM_XS " ref=%s",
join_id, join_from, ref);
max_generation_xml = copy_xml(generation);
pcmk__str_update(&max_generation_from, join_from);
}
} else if ((cmp < 0)
|| ((cmp == 0)
&& pcmk__str_eq(join_from, controld_globals.our_nodename,
pcmk__str_casei))) {
const char *validation = crm_element_value(generation,
XML_ATTR_VALIDATION);
if (get_schema_version(validation) < 0) {
crm_err("Rejecting join-%d request from %s (with better CIB "
"generation than current best from %s) due to unknown "
"schema version %s " CRM_XS " ref=%s",
join_id, join_from, max_generation_from, validation, ref);
ack_nack_bool = FALSE;
} else {
crm_debug("Accepting join-%d request from %s (with better CIB "
"generation than current best from %s) " CRM_XS " ref=%s",
join_id, join_from, max_generation_from, ref);
crm_log_xml_debug(max_generation_xml, "Old max generation");
crm_log_xml_debug(generation, "New max generation");
free_xml(max_generation_xml);
max_generation_xml = copy_xml(join_ack->xml);
pcmk__str_update(&max_generation_from, join_from);
}
} else {
crm_debug("Accepting join-%d request from %s " CRM_XS " ref=%s",
join_id, join_from, ref);
}
if (!ack_nack_bool) {
if (compare_version(join_version, "3.17.0") < 0) {
/* Clients with CRM_FEATURE_SET < 3.17.0 may respawn infinitely
* after a nack message, don't send one
*/
crm_update_peer_join(__func__, join_node, crm_join_nack_quiet);
} else {
crm_update_peer_join(__func__, join_node, crm_join_nack);
}
pcmk__update_peer_expected(__func__, join_node, CRMD_JOINSTATE_NACK);
} else {
crm_update_peer_join(__func__, join_node, crm_join_integrated);
pcmk__update_peer_expected(__func__, join_node, CRMD_JOINSTATE_MEMBER);
}
count = crmd_join_phase_count(crm_join_integrated);
crm_debug("%d node%s currently integrated in join-%d",
count, pcmk__plural_s(count), join_id);
if (check_join_state(cur_state, __func__) == FALSE) {
// Don't waste time by invoking the scheduler yet
count = crmd_join_phase_count(crm_join_welcomed);
crm_debug("Waiting on join-%d requests from %d outstanding node%s",
join_id, count, pcmk__plural_s(count));
}
}
/* A_DC_JOIN_FINALIZE */
void
do_dc_join_finalize(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
char *sync_from = NULL;
int rc = pcmk_ok;
int count_welcomed = crmd_join_phase_count(crm_join_welcomed);
int count_finalizable = crmd_join_phase_count(crm_join_integrated)
+ crmd_join_phase_count(crm_join_nack)
+ crmd_join_phase_count(crm_join_nack_quiet);
/* We can check this straight away and avoid clients timing us out
 * while we compute the latest CIB
 */
if (count_welcomed != 0) {
crm_debug("Waiting on join-%d requests from %d outstanding node%s "
"before finalizing join", current_join_id, count_welcomed,
pcmk__plural_s(count_welcomed));
crmd_join_phase_log(LOG_DEBUG);
/* crmd_fsa_stall(FALSE); Needed? */
return;
} else if (count_finalizable == 0) {
crm_debug("Finalization not needed for join-%d at the current time",
current_join_id);
crmd_join_phase_log(LOG_DEBUG);
check_join_state(controld_globals.fsa_state, __func__);
return;
}
controld_clear_fsa_input_flags(R_HAVE_CIB);
if (pcmk__str_eq(max_generation_from, controld_globals.our_nodename,
pcmk__str_null_matches|pcmk__str_casei)) {
controld_set_fsa_input_flags(R_HAVE_CIB);
}
if (!controld_globals.transition_graph->complete) {
crm_warn("Delaying join-%d finalization while transition in progress",
current_join_id);
crmd_join_phase_log(LOG_DEBUG);
crmd_fsa_stall(FALSE);
return;
}
- if ((max_generation_from != NULL)
- && !pcmk_is_set(controld_globals.fsa_input_register, R_HAVE_CIB)) {
- /* ask for the agreed best CIB */
- pcmk__str_update(&sync_from, max_generation_from);
- controld_set_fsa_input_flags(R_CIB_ASKED);
- crm_notice("Finalizing join-%d for %d node%s (sync'ing CIB from %s)",
- current_join_id, count_finalizable,
- pcmk__plural_s(count_finalizable), sync_from);
- crm_log_xml_notice(max_generation_xml, "Requested CIB version");
-
- } else {
- /* Send _our_ CIB out to everyone */
+ if (pcmk_is_set(controld_globals.fsa_input_register, R_HAVE_CIB)) {
+ // Send our CIB out to everyone
pcmk__str_update(&sync_from, controld_globals.our_nodename);
crm_debug("Finalizing join-%d for %d node%s (sync'ing from local CIB)",
current_join_id, count_finalizable,
pcmk__plural_s(count_finalizable));
crm_log_xml_debug(max_generation_xml, "Requested CIB version");
+
+ } else {
+ // Ask for the agreed best CIB
+ pcmk__str_update(&sync_from, max_generation_from);
+ crm_notice("Finalizing join-%d for %d node%s (sync'ing CIB from %s)",
+ current_join_id, count_finalizable,
+ pcmk__plural_s(count_finalizable), sync_from);
+ crm_log_xml_notice(max_generation_xml, "Requested CIB version");
}
crmd_join_phase_log(LOG_DEBUG);
rc = controld_globals.cib_conn->cmds->sync_from(controld_globals.cib_conn,
- sync_from, NULL,
- cib_quorum_override);
+ sync_from, NULL, cib_none);
+
+ if (pcmk_is_set(controld_globals.fsa_input_register, R_HAVE_CIB)) {
+ controld_record_cib_replace_call(rc);
+ }
fsa_register_cib_callback(rc, sync_from, finalize_sync_callback);
}
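/* Free the tracked best CIB generation and the name of the node it came from */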
void
free_max_generation(void)
{
free(max_generation_from);
max_generation_from = NULL;
free_xml(max_generation_xml);
max_generation_xml = NULL;
}
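/* CIB callback for the sync requested in do_dc_join_finalize(): on failure,
 * record the sync source as failed (unless its data was merely old) and
 * restart the join process; on success, mark the CIB usable and notify peers
 * of their join results
 */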
void
finalize_sync_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
CRM_LOG_ASSERT(-EPERM != rc);
- controld_clear_fsa_input_flags(R_CIB_ASKED);
+
+ controld_forget_cib_replace_call(call_id);
+
if (rc != pcmk_ok) {
const char *sync_from = (const char *) user_data;
do_crm_log(((rc == -pcmk_err_old_data)? LOG_WARNING : LOG_ERR),
"Could not sync CIB from %s in join-%d: %s",
sync_from, current_join_id, pcmk_strerror(rc));
if (rc != -pcmk_err_old_data) {
record_failed_sync_node(sync_from, current_join_id);
}
/* restart the whole join process */
register_fsa_error_adv(C_FSA_INTERNAL, I_ELECTION_DC, NULL, NULL,
__func__);
} else if (!AM_I_DC) {
crm_debug("Sync'ed CIB for join-%d but no longer DC", current_join_id);
} else if (controld_globals.fsa_state != S_FINALIZE_JOIN) {
crm_debug("Sync'ed CIB for join-%d but no longer in S_FINALIZE_JOIN "
"(%s)", current_join_id,
fsa_state2string(controld_globals.fsa_state));
} else {
controld_set_fsa_input_flags(R_HAVE_CIB);
- controld_clear_fsa_input_flags(R_CIB_ASKED);
/* make sure dc_uuid is re-set to us */
if (!check_join_state(controld_globals.fsa_state, __func__)) {
int count_finalizable = 0;
count_finalizable = crmd_join_phase_count(crm_join_integrated)
+ crmd_join_phase_count(crm_join_nack)
+ crmd_join_phase_count(crm_join_nack_quiet);
crm_debug("Notifying %d node%s of join-%d results",
count_finalizable, pcmk__plural_s(count_finalizable),
current_join_id);
g_hash_table_foreach(crm_peer_cache, finalize_join_for, NULL);
}
}
}
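/* CIB callback for the node history update triggered by a join confirmation */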
static void
join_update_complete_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
fsa_data_t *msg_data = NULL;
if (rc == pcmk_ok) {
crm_debug("join-%d node history update (via CIB call %d) complete",
current_join_id, call_id);
check_join_state(controld_globals.fsa_state, __func__);
} else {
crm_err("join-%d node history update (via CIB call %d) failed: %s "
"(next transition may determine resource status incorrectly)",
current_join_id, call_id, pcmk_strerror(rc));
crm_log_xml_debug(msg, "failed");
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
}
}
/* A_DC_JOIN_PROCESS_ACK */
void
do_dc_join_ack(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
int join_id = -1;
ha_msg_input_t *join_ack = fsa_typed_data(fsa_dt_ha_msg);
enum controld_section_e section = controld_section_lrm;
- const int cib_opts = cib_scope_local|cib_quorum_override|cib_can_create;
+ const int cib_opts = cib_scope_local|cib_can_create;
const char *op = crm_element_value(join_ack->msg, F_CRM_TASK);
const char *join_from = crm_element_value(join_ack->msg, F_CRM_HOST_FROM);
crm_node_t *peer = NULL;
// Sanity checks
if (join_from == NULL) {
crm_warn("Ignoring message received without node identification");
return;
}
if (op == NULL) {
crm_warn("Ignoring message received from %s without task", join_from);
return;
}
if (strcmp(op, CRM_OP_JOIN_CONFIRM)) {
crm_debug("Ignoring '%s' message from %s while waiting for '%s'",
op, join_from, CRM_OP_JOIN_CONFIRM);
return;
}
if (crm_element_value_int(join_ack->msg, F_CRM_JOIN_ID, &join_id) != 0) {
crm_warn("Ignoring join confirmation from %s without valid join ID",
join_from);
return;
}
peer = crm_get_peer(0, join_from);
if (peer->join != crm_join_finalized) {
crm_info("Ignoring out-of-sequence join-%d confirmation from %s "
"(currently %s not %s)",
join_id, join_from, crm_join_phase_str(peer->join),
crm_join_phase_str(crm_join_finalized));
return;
}
if (join_id != current_join_id) {
crm_err("Rejecting join-%d confirmation from %s "
"because currently on join-%d",
join_id, join_from, current_join_id);
crm_update_peer_join(__func__, peer, crm_join_nack);
return;
}
crm_update_peer_join(__func__, peer, crm_join_confirmed);
/* Update CIB with node's current executor state. A new transition will be
* triggered later, when the CIB notifies us of the change.
*/
if (pcmk_is_set(controld_globals.flags, controld_shutdown_lock_enabled)) {
section = controld_section_lrm_unlocked;
}
controld_delete_node_state(join_from, section, cib_scope_local);
if (pcmk__str_eq(join_from, controld_globals.our_nodename,
pcmk__str_casei)) {
xmlNode *now_dc_lrmd_state = controld_query_executor_state();
if (now_dc_lrmd_state != NULL) {
crm_debug("Updating local node history for join-%d "
"from query result", join_id);
controld_update_cib(XML_CIB_TAG_STATUS, now_dc_lrmd_state, cib_opts,
join_update_complete_callback);
free_xml(now_dc_lrmd_state);
} else {
crm_warn("Updating local node history from join-%d confirmation "
"because query failed", join_id);
controld_update_cib(XML_CIB_TAG_STATUS, join_ack->xml, cib_opts,
join_update_complete_callback);
}
} else {
crm_debug("Updating node history for %s from join-%d confirmation",
join_from, join_id);
controld_update_cib(XML_CIB_TAG_STATUS, join_ack->xml, cib_opts,
join_update_complete_callback);
}
}
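/* g_hash_table_foreach() callback that sends the join result (ack or nack)
 * to one peer, updating its node entry in the CIB and its join phase
 */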
void
finalize_join_for(gpointer key, gpointer value, gpointer user_data)
{
xmlNode *acknak = NULL;
xmlNode *tmp1 = NULL;
crm_node_t *join_node = value;
const char *join_to = join_node->uname;
bool integrated = false;
switch (join_node->join) {
case crm_join_integrated:
integrated = true;
break;
case crm_join_nack:
case crm_join_nack_quiet:
break;
default:
crm_trace("Not updating non-integrated and non-nacked node %s (%s) "
"for join-%d", join_to,
crm_join_phase_str(join_node->join), current_join_id);
return;
}
/* Update the element with the node's name and UUID, in case they
* weren't known before
*/
crm_trace("Updating node name and UUID in CIB for %s", join_to);
tmp1 = create_xml_node(NULL, XML_CIB_TAG_NODE);
set_uuid(tmp1, XML_ATTR_ID, join_node);
crm_xml_add(tmp1, XML_ATTR_UNAME, join_to);
fsa_cib_anon_update(XML_CIB_TAG_NODES, tmp1);
free_xml(tmp1);
if (join_node->join == crm_join_nack_quiet) {
crm_trace("Not sending nack message to node %s with feature set older "
"than 3.17.0", join_to);
return;
}
join_node = crm_get_peer(0, join_to);
if (!crm_is_peer_active(join_node)) {
/*
* NACK'ing nodes that the membership layer doesn't know about yet
* simply creates more churn
*
* Better to leave them waiting and let the join restart when
* the new membership event comes in
*
* All other NACKs (due to versions etc) should still be processed
*/
pcmk__update_peer_expected(__func__, join_node, CRMD_JOINSTATE_PENDING);
return;
}
// Acknowledge or nack node's join request
crm_debug("%sing join-%d request from %s",
integrated? "Acknowledg" : "Nack", current_join_id, join_to);
acknak = create_dc_message(CRM_OP_JOIN_ACKNAK, join_to);
pcmk__xe_set_bool_attr(acknak, CRM_OP_JOIN_ACKNAK, integrated);
if (integrated) {
// No change needed for a nacked node
crm_update_peer_join(__func__, join_node, crm_join_finalized);
pcmk__update_peer_expected(__func__, join_node, CRMD_JOINSTATE_MEMBER);
}
send_cluster_message(join_node, crm_msg_crmd, acknak, TRUE);
free_xml(acknak);
return;
}
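/* Check the progress of the current join round after a membership change or
 * join phase update, registering an appropriate FSA input if the round must
 * restart or can advance
 */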
gboolean
check_join_state(enum crmd_fsa_state cur_state, const char *source)
{
static unsigned long long highest_seq = 0;
if (controld_globals.membership_id != crm_peer_seq) {
crm_debug("join-%d: Membership changed from %llu to %llu "
CRM_XS " highest=%llu state=%s for=%s",
current_join_id, controld_globals.membership_id, crm_peer_seq,
highest_seq, fsa_state2string(cur_state), source);
if(highest_seq < crm_peer_seq) {
/* Don't spam the FSA with duplicates */
highest_seq = crm_peer_seq;
register_fsa_input_before(C_FSA_INTERNAL, I_NODE_JOIN, NULL);
}
} else if (cur_state == S_INTEGRATION) {
if (crmd_join_phase_count(crm_join_welcomed) == 0) {
int count = crmd_join_phase_count(crm_join_integrated);
crm_debug("join-%d: Integration of %d peer%s complete "
CRM_XS " state=%s for=%s",
current_join_id, count, pcmk__plural_s(count),
fsa_state2string(cur_state), source);
register_fsa_input_before(C_FSA_INTERNAL, I_INTEGRATED, NULL);
return TRUE;
}
} else if (cur_state == S_FINALIZE_JOIN) {
if (!pcmk_is_set(controld_globals.fsa_input_register, R_HAVE_CIB)) {
crm_debug("join-%d: Delaying finalization until we have CIB "
CRM_XS " state=%s for=%s",
current_join_id, fsa_state2string(cur_state), source);
return TRUE;
} else if (crmd_join_phase_count(crm_join_welcomed) != 0) {
int count = crmd_join_phase_count(crm_join_welcomed);
crm_debug("join-%d: Still waiting on %d welcomed node%s "
CRM_XS " state=%s for=%s",
current_join_id, count, pcmk__plural_s(count),
fsa_state2string(cur_state), source);
crmd_join_phase_log(LOG_DEBUG);
} else if (crmd_join_phase_count(crm_join_integrated) != 0) {
int count = crmd_join_phase_count(crm_join_integrated);
crm_debug("join-%d: Still waiting on %d integrated node%s "
CRM_XS " state=%s for=%s",
current_join_id, count, pcmk__plural_s(count),
fsa_state2string(cur_state), source);
crmd_join_phase_log(LOG_DEBUG);
} else if (crmd_join_phase_count(crm_join_finalized) != 0) {
int count = crmd_join_phase_count(crm_join_finalized);
crm_debug("join-%d: Still waiting on %d finalized node%s "
CRM_XS " state=%s for=%s",
current_join_id, count, pcmk__plural_s(count),
fsa_state2string(cur_state), source);
crmd_join_phase_log(LOG_DEBUG);
} else {
crm_debug("join-%d: Complete " CRM_XS " state=%s for=%s",
current_join_id, fsa_state2string(cur_state), source);
register_fsa_input_later(C_FSA_INTERNAL, I_FINALIZED, NULL);
return TRUE;
}
}
return FALSE;
}
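/* A_DC_JOIN_FINAL */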
void
do_dc_join_final(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
crm_debug("Ensuring DC, quorum and node attributes are up-to-date");
crm_update_quorum(crm_have_quorum, TRUE);
}
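/* Return the number of cluster peers currently in the given join phase */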
int crmd_join_phase_count(enum crm_join_phase phase)
{
int count = 0;
crm_node_t *peer;
GHashTableIter iter;
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &peer)) {
if(peer->join == phase) {
count++;
}
}
return count;
}
void crmd_join_phase_log(int level)
{
crm_node_t *peer;
GHashTableIter iter;
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &peer)) {
do_crm_log(level, "join-%d: %s=%s", current_join_id, peer->uname,
crm_join_phase_str(peer->join));
}
}
diff --git a/daemons/controld/controld_lrm.h b/daemons/controld/controld_lrm.h
index b6fce1566a..25f3db3316 100644
--- a/daemons/controld/controld_lrm.h
+++ b/daemons/controld/controld_lrm.h
@@ -1,193 +1,188 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU Lesser General Public License
* version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY.
*/
#ifndef CONTROLD_LRM__H
# define CONTROLD_LRM__H
#include
extern gboolean verify_stopped(enum crmd_fsa_state cur_state, int log_level);
void lrm_clear_last_failure(const char *rsc_id, const char *node_name,
const char *operation, guint interval_ms);
void lrm_op_callback(lrmd_event_data_t * op);
lrmd_t *crmd_local_lrmd_conn(void);
typedef struct resource_history_s {
char *id;
uint32_t last_callid;
lrmd_rsc_info_t rsc;
lrmd_event_data_t *last;
lrmd_event_data_t *failed;
GList *recurring_op_list;
/* Resources must be stopped using the same
* parameters they were started with. This hashtable
* holds the parameters that should be used for the next stop
* cmd on this resource. */
GHashTable *stop_params;
} rsc_history_t;
void history_free(gpointer data);
enum active_op_e {
active_op_remove = (1 << 0),
active_op_cancelled = (1 << 1),
};
// In-flight action (recurring or pending)
typedef struct active_op_s {
guint interval_ms;
int call_id;
uint32_t flags; // bitmask of active_op_e
time_t start_time;
time_t lock_time;
char *rsc_id;
char *op_type;
char *op_key;
char *user_data;
GHashTable *params;
} active_op_t;
#define controld_set_active_op_flags(active_op, flags_to_set) do { \
(active_op)->flags = pcmk__set_flags_as(__func__, __LINE__, \
LOG_TRACE, "Active operation", (active_op)->op_key, \
(active_op)->flags, (flags_to_set), #flags_to_set); \
} while (0)
#define controld_clear_active_op_flags(active_op, flags_to_clear) do { \
(active_op)->flags = pcmk__clear_flags_as(__func__, __LINE__, \
LOG_TRACE, "Active operation", (active_op)->op_key, \
(active_op)->flags, (flags_to_clear), #flags_to_clear); \
} while (0)
typedef struct lrm_state_s {
const char *node_name;
void *conn; // Reserved for controld_execd_state.c usage
void *remote_ra_data; // Reserved for controld_remote_ra.c usage
GHashTable *resource_history;
GHashTable *active_ops; // Pending and recurring actions
GHashTable *deletion_ops;
GHashTable *rsc_info_cache;
GHashTable *metadata_cache; // key = class[:provider]:agent, value = ra_metadata_s
int num_lrm_register_fails;
} lrm_state_t;
struct pending_deletion_op_s {
char *rsc;
ha_msg_input_t *input;
};
/*!
* \brief Check whether this is the local IPC connection to the executor
*/
gboolean
lrm_state_is_local(lrm_state_t *lrm_state);
/*!
* \brief Clear all state information from a single state entry.
* \note It is sometimes useful to preserve the metadata cache when it won't go stale.
* \note This does not close the executor connection
*/
void lrm_state_reset_tables(lrm_state_t * lrm_state, gboolean reset_metadata);
GList *lrm_state_get_list(void);
/*!
* \brief Initialize internal state tables
*/
gboolean lrm_state_init_local(void);
/*!
* \brief Destroy all state entries and internal state tables
*/
void lrm_state_destroy_all(void);
-/*!
- * \brief Create executor connection entry
- */
-lrm_state_t *lrm_state_create(const char *node_name);
-
/*!
* \brief Destroy executor connection by node name
*/
void lrm_state_destroy(const char *node_name);
/*!
* \brief Find lrm_state data by node name
*/
lrm_state_t *lrm_state_find(const char *node_name);
/*!
* \brief Either find or create a new entry
*/
lrm_state_t *lrm_state_find_or_create(const char *node_name);
/*!
* The functions below are wrappers for the executor API that the controller
* uses. These wrapper functions allow us to treat the controller's remote
* executor connection resources the same as regular resources. Internally,
* regular resources go to the executor, and remote connection resources are
* handled locally in the controller.
*/
void lrm_state_disconnect_only(lrm_state_t * lrm_state);
void lrm_state_disconnect(lrm_state_t * lrm_state);
int controld_connect_local_executor(lrm_state_t *lrm_state);
int controld_connect_remote_executor(lrm_state_t *lrm_state, const char *server,
int port, int timeout);
int lrm_state_is_connected(lrm_state_t * lrm_state);
int lrm_state_poke_connection(lrm_state_t * lrm_state);
int lrm_state_get_metadata(lrm_state_t * lrm_state,
const char *class,
const char *provider,
const char *agent, char **output, enum lrmd_call_options options);
int lrm_state_cancel(lrm_state_t *lrm_state, const char *rsc_id,
const char *action, guint interval_ms);
int controld_execute_resource_agent(lrm_state_t *lrm_state, const char *rsc_id,
const char *action, const char *userdata,
guint interval_ms, int timeout_ms,
int start_delay_ms,
GHashTable *parameters, int *call_id);
lrmd_rsc_info_t *lrm_state_get_rsc_info(lrm_state_t * lrm_state,
const char *rsc_id, enum lrmd_call_options options);
int lrm_state_register_rsc(lrm_state_t * lrm_state,
const char *rsc_id,
const char *class,
const char *provider, const char *agent, enum lrmd_call_options options);
int lrm_state_unregister_rsc(lrm_state_t * lrm_state,
const char *rsc_id, enum lrmd_call_options options);
// Functions used to manage remote executor connection resources
void remote_lrm_op_callback(lrmd_event_data_t * op);
gboolean is_remote_lrmd_ra(const char *agent, const char *provider, const char *id);
lrmd_rsc_info_t *remote_ra_get_rsc_info(lrm_state_t * lrm_state, const char *rsc_id);
int remote_ra_cancel(lrm_state_t *lrm_state, const char *rsc_id,
const char *action, guint interval_ms);
int controld_execute_remote_agent(const lrm_state_t *lrm_state,
const char *rsc_id, const char *action,
const char *userdata,
guint interval_ms, int timeout_ms,
int start_delay_ms, lrmd_key_value_t *params,
int *call_id);
void remote_ra_cleanup(lrm_state_t * lrm_state);
void remote_ra_fail(const char *node_name);
void remote_ra_process_pseudo(xmlNode *xml);
gboolean remote_ra_is_in_maintenance(lrm_state_t * lrm_state);
void remote_ra_process_maintenance_nodes(xmlNode *xml);
gboolean remote_ra_controlling_guest(lrm_state_t * lrm_state);
void process_lrm_event(lrm_state_t *lrm_state, lrmd_event_data_t *op,
active_op_t *pending, const xmlNode *action_xml);
void controld_ack_event_directly(const char *to_host, const char *to_sys,
const lrmd_rsc_info_t *rsc,
lrmd_event_data_t *op, const char *rsc_id);
void controld_rc2event(lrmd_event_data_t *event, int rc);
void controld_trigger_delete_refresh(const char *from_sys, const char *rsc_id);
#endif
diff --git a/daemons/controld/controld_membership.c b/daemons/controld/controld_membership.c
index 7ea4331157..1f7e4c0f56 100644
--- a/daemons/controld/controld_membership.c
+++ b/daemons/controld/controld_membership.c
@@ -1,462 +1,457 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
/* put these first so that uuid_t is defined without conflicts */
#include
#include
#include
#include
#include
#include
#include
#include
void post_cache_update(int instance);
extern gboolean check_join_state(enum crmd_fsa_state cur_state, const char *source);
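/* g_hash_table_foreach() callback that resets the join phase of any peer no
 * longer active in the membership, triggering an election or error if the
 * lost node was the DC or the local node
 */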
static void
reap_dead_nodes(gpointer key, gpointer value, gpointer user_data)
{
crm_node_t *node = value;
if (crm_is_peer_active(node) == FALSE) {
crm_update_peer_join(__func__, node, crm_join_none);
if(node && node->uname) {
if (pcmk__str_eq(controld_globals.our_nodename, node->uname,
pcmk__str_casei)) {
crm_err("We're not part of the cluster anymore");
register_fsa_input(C_FSA_INTERNAL, I_ERROR, NULL);
} else if (!AM_I_DC
&& pcmk__str_eq(node->uname, controld_globals.dc_name,
pcmk__str_casei)) {
crm_warn("Our DC node (%s) left the cluster", node->uname);
register_fsa_input(C_FSA_INTERNAL, I_ELECTION, NULL);
}
}
if ((controld_globals.fsa_state == S_INTEGRATION)
|| (controld_globals.fsa_state == S_FINALIZE_JOIN)) {
check_join_state(controld_globals.fsa_state, __func__);
}
if ((node != NULL) && (node->uuid != NULL)) {
fail_incompletable_actions(controld_globals.transition_graph,
node->uuid);
}
}
}
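/* Handle a completed membership update: reap inactive nodes, refresh CIB node
 * entries if we are DC, recheck the election status, and broadcast a no-op
 * message to aid detection of duplicate DCs
 */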
void
post_cache_update(int instance)
{
xmlNode *no_op = NULL;
crm_peer_seq = instance;
crm_debug("Updated cache after membership event %d.", instance);
g_hash_table_foreach(crm_peer_cache, reap_dead_nodes, NULL);
controld_set_fsa_input_flags(R_MEMBERSHIP);
if (AM_I_DC) {
populate_cib_nodes(node_update_quick | node_update_cluster | node_update_peer |
node_update_expected, __func__);
}
/*
* If we lost nodes, we should re-check the election status
* Safe to call outside of an election
*/
controld_set_fsa_action_flags(A_ELECTION_CHECK);
controld_trigger_fsa();
/* Membership changed, remind everyone we're here.
* This will aid detection of duplicate DCs
*/
no_op = create_request(CRM_OP_NOOP, NULL, NULL, CRM_SYSTEM_CRMD,
AM_I_DC ? CRM_SYSTEM_DC : CRM_SYSTEM_CRMD, NULL);
send_cluster_message(NULL, crm_msg_crmd, no_op, FALSE);
free_xml(no_op);
}
static void
crmd_node_update_complete(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
fsa_data_t *msg_data = NULL;
if (rc == pcmk_ok) {
crm_trace("Node update %d complete", call_id);
} else if(call_id < pcmk_ok) {
crm_err("Node update failed: %s (%d)", pcmk_strerror(call_id), call_id);
crm_log_xml_debug(msg, "failed");
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
} else {
crm_err("Node update %d failed: %s (%d)", call_id, pcmk_strerror(rc), rc);
crm_log_xml_debug(msg, "failed");
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
}
}
/*!
* \internal
* \brief Create an XML node state tag with updates
*
* \param[in,out] node Node whose state will be used for update
* \param[in] flags Bitmask of node_update_flags indicating what to update
* \param[in,out] parent XML node to contain update (or NULL)
* \param[in] source Who requested the update (only used for logging)
*
* \return Pointer to created node state tag
*/
xmlNode *
create_node_state_update(crm_node_t *node, int flags, xmlNode *parent,
const char *source)
{
const char *value = NULL;
xmlNode *node_state;
if (!node->state) {
crm_info("Node update for %s cancelled: no state, not seen yet", node->uname);
return NULL;
}
node_state = create_xml_node(parent, XML_CIB_TAG_STATE);
if (pcmk_is_set(node->flags, crm_remote_node)) {
pcmk__xe_set_bool_attr(node_state, XML_NODE_IS_REMOTE, true);
}
set_uuid(node_state, XML_ATTR_ID, node);
if (crm_element_value(node_state, XML_ATTR_ID) == NULL) {
crm_info("Node update for %s cancelled: no id", node->uname);
free_xml(node_state);
return NULL;
}
crm_xml_add(node_state, XML_ATTR_UNAME, node->uname);
if ((flags & node_update_cluster) && node->state) {
pcmk__xe_set_bool_attr(node_state, XML_NODE_IN_CLUSTER,
pcmk__str_eq(node->state, CRM_NODE_MEMBER, pcmk__str_casei));
}
if (!pcmk_is_set(node->flags, crm_remote_node)) {
if (flags & node_update_peer) {
value = OFFLINESTATUS;
if (pcmk_is_set(node->processes, crm_get_cluster_proc())) {
value = ONLINESTATUS;
}
crm_xml_add(node_state, XML_NODE_IS_PEER, value);
}
if (flags & node_update_join) {
if (node->join <= crm_join_none) {
value = CRMD_JOINSTATE_DOWN;
} else {
value = CRMD_JOINSTATE_MEMBER;
}
crm_xml_add(node_state, XML_NODE_JOIN_STATE, value);
}
if (flags & node_update_expected) {
crm_xml_add(node_state, XML_NODE_EXPECTED, node->expected);
}
}
crm_xml_add(node_state, XML_ATTR_ORIGIN, source);
return node_state;
}
static void
remove_conflicting_node_callback(xmlNode * msg, int call_id, int rc,
xmlNode * output, void *user_data)
{
char *node_uuid = user_data;
do_crm_log_unlikely(rc == 0 ? LOG_DEBUG : LOG_NOTICE,
"Deletion of the unknown conflicting node \"%s\": %s (rc=%d)",
node_uuid, pcmk_strerror(rc), rc);
}
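/* CIB query callback that removes node entries (and their status sections)
 * sharing a known node's uname but carrying an unknown UUID
 */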
static void
search_conflicting_node_callback(xmlNode * msg, int call_id, int rc,
xmlNode * output, void *user_data)
{
char *new_node_uuid = user_data;
xmlNode *node_xml = NULL;
if (rc != pcmk_ok) {
if (rc != -ENXIO) {
crm_notice("Searching conflicting nodes for %s failed: %s (%d)",
new_node_uuid, pcmk_strerror(rc), rc);
}
return;
} else if (output == NULL) {
return;
}
if (pcmk__str_eq(crm_element_name(output), XML_CIB_TAG_NODE, pcmk__str_casei)) {
node_xml = output;
} else {
node_xml = pcmk__xml_first_child(output);
}
for (; node_xml != NULL; node_xml = pcmk__xml_next(node_xml)) {
const char *node_uuid = NULL;
const char *node_uname = NULL;
GHashTableIter iter;
crm_node_t *node = NULL;
gboolean known = FALSE;
if (!pcmk__str_eq(crm_element_name(node_xml), XML_CIB_TAG_NODE, pcmk__str_casei)) {
continue;
}
node_uuid = crm_element_value(node_xml, XML_ATTR_ID);
node_uname = crm_element_value(node_xml, XML_ATTR_UNAME);
if (node_uuid == NULL || node_uname == NULL) {
continue;
}
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &node)) {
if (node->uuid
&& pcmk__str_eq(node->uuid, node_uuid, pcmk__str_casei)
&& node->uname
&& pcmk__str_eq(node->uname, node_uname, pcmk__str_casei)) {
known = TRUE;
break;
}
}
if (known == FALSE) {
cib_t *cib_conn = controld_globals.cib_conn;
int delete_call_id = 0;
xmlNode *node_state_xml = NULL;
crm_notice("Deleting unknown node %s/%s which has conflicting uname with %s",
node_uuid, node_uname, new_node_uuid);
delete_call_id = cib_conn->cmds->remove(cib_conn, XML_CIB_TAG_NODES,
- node_xml,
- cib_scope_local
- |cib_quorum_override);
+ node_xml, cib_scope_local);
fsa_register_cib_callback(delete_call_id, strdup(node_uuid),
remove_conflicting_node_callback);
node_state_xml = create_xml_node(NULL, XML_CIB_TAG_STATE);
crm_xml_add(node_state_xml, XML_ATTR_ID, node_uuid);
crm_xml_add(node_state_xml, XML_ATTR_UNAME, node_uname);
delete_call_id = cib_conn->cmds->remove(cib_conn,
XML_CIB_TAG_STATUS,
node_state_xml,
- cib_scope_local
- |cib_quorum_override);
+ cib_scope_local);
fsa_register_cib_callback(delete_call_id, strdup(node_uuid),
remove_conflicting_node_callback);
free_xml(node_state_xml);
}
}
}
static void
node_list_update_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
fsa_data_t *msg_data = NULL;
if(call_id < pcmk_ok) {
crm_err("Node list update failed: %s (%d)", pcmk_strerror(call_id), call_id);
crm_log_xml_debug(msg, "update:failed");
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
} else if(rc < pcmk_ok) {
crm_err("Node update %d failed: %s (%d)", call_id, pcmk_strerror(rc), rc);
crm_log_xml_debug(msg, "update:failed");
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
}
}
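/* Update the CIB's node entries and (if DC) node states from current
 * membership, scheduling removal of any stale entries with conflicting names
 */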
void
populate_cib_nodes(enum node_update_flags flags, const char *source)
{
cib_t *cib_conn = controld_globals.cib_conn;
int call_id = 0;
gboolean from_hashtable = TRUE;
- int call_options = cib_scope_local | cib_quorum_override;
xmlNode *node_list = create_xml_node(NULL, XML_CIB_TAG_NODES);
#if SUPPORT_COROSYNC
if (!pcmk_is_set(flags, node_update_quick) && is_corosync_cluster()) {
from_hashtable = pcmk__corosync_add_nodes(node_list);
}
#endif
if (from_hashtable) {
GHashTableIter iter;
crm_node_t *node = NULL;
GString *xpath = NULL;
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &node)) {
xmlNode *new_node = NULL;
if ((node->uuid != NULL) && (node->uname != NULL)) {
crm_trace("Creating node entry for %s/%s", node->uname, node->uuid);
if (xpath == NULL) {
xpath = g_string_sized_new(512);
} else {
g_string_truncate(xpath, 0);
}
/* We need both to be valid */
new_node = create_xml_node(node_list, XML_CIB_TAG_NODE);
crm_xml_add(new_node, XML_ATTR_ID, node->uuid);
crm_xml_add(new_node, XML_ATTR_UNAME, node->uname);
/* Search and remove unknown nodes with the conflicting uname from CIB */
pcmk__g_strcat(xpath,
"/" XML_TAG_CIB "/" XML_CIB_TAG_CONFIGURATION
"/" XML_CIB_TAG_NODES "/" XML_CIB_TAG_NODE
"[@" XML_ATTR_UNAME "='", node->uname, "']"
"[@" XML_ATTR_ID "!='", node->uuid, "']", NULL);
call_id = cib_conn->cmds->query(cib_conn,
(const char *) xpath->str,
NULL,
cib_scope_local|cib_xpath);
fsa_register_cib_callback(call_id, strdup(node->uuid),
search_conflicting_node_callback);
}
}
if (xpath != NULL) {
g_string_free(xpath, TRUE);
}
}
crm_trace("Populating section from %s", from_hashtable ? "hashtable" : "cluster");
- if ((controld_update_cib(XML_CIB_TAG_NODES, node_list, call_options,
+ if ((controld_update_cib(XML_CIB_TAG_NODES, node_list, cib_scope_local,
node_list_update_callback) == pcmk_rc_ok)
&& (crm_peer_cache != NULL) && AM_I_DC) {
/*
* There is no need to update the local CIB with our values if
* we've not seen valid membership data
*/
GHashTableIter iter;
crm_node_t *node = NULL;
free_xml(node_list);
node_list = create_xml_node(NULL, XML_CIB_TAG_STATUS);
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &node)) {
create_node_state_update(node, flags, node_list, source);
}
if (crm_remote_peer_cache) {
g_hash_table_iter_init(&iter, crm_remote_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &node)) {
create_node_state_update(node, flags, node_list, source);
}
}
- controld_update_cib(XML_CIB_TAG_STATUS, node_list, call_options,
+ controld_update_cib(XML_CIB_TAG_STATUS, node_list, cib_scope_local,
crmd_node_update_complete);
}
free_xml(node_list);
}
static void
cib_quorum_update_complete(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
fsa_data_t *msg_data = NULL;
if (rc == pcmk_ok) {
crm_trace("Quorum update %d complete", call_id);
} else {
crm_err("Quorum update %d failed: %s (%d)", call_id, pcmk_strerror(rc), rc);
crm_log_xml_debug(msg, "failed");
register_fsa_error(C_FSA_INTERNAL, I_ERROR, NULL);
}
}
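/* Track a change in quorum, updating the CIB and triggering a new transition
 * if we are DC (after a short delay when quorum was gained, so that nodes
 * joining at roughly the same time aren't fenced)
 */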
void
crm_update_quorum(gboolean quorum, gboolean force_update)
{
bool has_quorum = pcmk_is_set(controld_globals.flags, controld_has_quorum);
if (quorum) {
controld_set_global_flags(controld_ever_had_quorum);
} else if (pcmk_all_flags_set(controld_globals.flags,
controld_ever_had_quorum
|controld_no_quorum_suicide)) {
pcmk__panic(__func__);
}
if (AM_I_DC
&& ((has_quorum && !quorum) || (!has_quorum && quorum)
|| force_update)) {
xmlNode *update = NULL;
- int call_options = cib_scope_local | cib_quorum_override;
update = create_xml_node(NULL, XML_TAG_CIB);
crm_xml_add_int(update, XML_ATTR_HAVE_QUORUM, quorum);
crm_xml_add(update, XML_ATTR_DC_UUID, controld_globals.our_uuid);
crm_debug("Updating quorum status to %s", pcmk__btoa(quorum));
- controld_update_cib(XML_TAG_CIB, update, call_options,
+ controld_update_cib(XML_TAG_CIB, update, cib_scope_local,
cib_quorum_update_complete);
free_xml(update);
/* Quorum changes usually cause a new transition via other activity:
* quorum gained via a node joining will abort via the node join,
* and quorum lost via a node leaving will usually abort via resource
* activity and/or fencing.
*
* However, it is possible that nothing else causes a transition (e.g.
* someone forces quorum via corosync-cmapctl, or quorum is lost due to
* a node in standby shutting down cleanly), so here ensure a new
* transition is triggered.
*/
if (quorum) {
/* If quorum was gained, abort after a short delay, in case multiple
* nodes are joining around the same time, so the one that brings us
* to quorum doesn't cause all the remaining ones to be fenced.
*/
abort_after_delay(INFINITY, pcmk__graph_restart, "Quorum gained",
5000);
} else {
abort_transition(INFINITY, pcmk__graph_restart, "Quorum lost",
NULL);
}
}
if (quorum) {
controld_set_global_flags(controld_has_quorum);
} else {
controld_clear_global_flags(controld_has_quorum);
}
}
diff --git a/daemons/controld/controld_messages.c b/daemons/controld/controld_messages.c
index 079e165850..43e20ba6e0 100644
--- a/daemons/controld/controld_messages.c
+++ b/daemons/controld/controld_messages.c
@@ -1,1307 +1,1303 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
extern void crm_shutdown(int nsig);
static enum crmd_fsa_input handle_message(xmlNode *msg,
enum crmd_fsa_cause cause);
static void handle_response(xmlNode *stored_msg);
static enum crmd_fsa_input handle_request(xmlNode *stored_msg,
enum crmd_fsa_cause cause);
static enum crmd_fsa_input handle_shutdown_request(xmlNode *stored_msg);
static void send_msg_via_ipc(xmlNode * msg, const char *sys);
/* debug only, can wrap all it likes */
static int last_data_id = 0;
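/* Save any pending FSA actions as a new input, reset the action list, and
 * queue the given error input
 */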
void
register_fsa_error_adv(enum crmd_fsa_cause cause, enum crmd_fsa_input input,
fsa_data_t * cur_data, void *new_data, const char *raised_from)
{
/* save the current actions if any */
if (controld_globals.fsa_actions != A_NOTHING) {
register_fsa_input_adv(cur_data ? cur_data->fsa_cause : C_FSA_INTERNAL,
I_NULL, cur_data ? cur_data->data : NULL,
controld_globals.fsa_actions, TRUE, __func__);
}
/* reset the action list */
crm_info("Resetting the current action list");
fsa_dump_actions(controld_globals.fsa_actions, "Drop");
controld_globals.fsa_actions = A_NOTHING;
/* register the error */
register_fsa_input_adv(cause, input, new_data, A_NOTHING, TRUE, raised_from);
}
void
register_fsa_input_adv(enum crmd_fsa_cause cause, enum crmd_fsa_input input,
void *data, uint64_t with_actions,
gboolean prepend, const char *raised_from)
{
unsigned old_len = g_list_length(controld_globals.fsa_message_queue);
fsa_data_t *fsa_data = NULL;
if (raised_from == NULL) {
raised_from = "";
}
if (input == I_NULL && with_actions == A_NOTHING /* && data == NULL */ ) {
/* no point doing anything */
crm_err("Cannot add entry to queue: no input and no action");
return;
}
if (input == I_WAIT_FOR_EVENT) {
controld_set_global_flags(controld_fsa_is_stalled);
crm_debug("Stalling the FSA pending further input: source=%s cause=%s data=%p queue=%d",
raised_from, fsa_cause2string(cause), data, old_len);
if (old_len > 0) {
fsa_dump_queue(LOG_TRACE);
prepend = FALSE;
}
if (data == NULL) {
controld_set_fsa_action_flags(with_actions);
fsa_dump_actions(with_actions, "Restored");
return;
}
/* Store everything in the new event and reset
* controld_globals.fsa_actions
*/
with_actions |= controld_globals.fsa_actions;
controld_globals.fsa_actions = A_NOTHING;
}
last_data_id++;
crm_trace("%s %s FSA input %d (%s) due to %s, %s data",
raised_from, (prepend? "prepended" : "appended"), last_data_id,
fsa_input2string(input), fsa_cause2string(cause),
(data? "with" : "without"));
fsa_data = calloc(1, sizeof(fsa_data_t));
fsa_data->id = last_data_id;
fsa_data->fsa_input = input;
fsa_data->fsa_cause = cause;
fsa_data->origin = raised_from;
fsa_data->data = NULL;
fsa_data->data_type = fsa_dt_none;
fsa_data->actions = with_actions;
if (with_actions != A_NOTHING) {
crm_trace("Adding actions %.16llx to input",
(unsigned long long) with_actions);
}
if (data != NULL) {
switch (cause) {
case C_FSA_INTERNAL:
case C_CRMD_STATUS_CALLBACK:
case C_IPC_MESSAGE:
case C_HA_MESSAGE:
CRM_CHECK(((ha_msg_input_t *) data)->msg != NULL,
crm_err("Bogus data from %s", raised_from));
crm_trace("Copying %s data from %s as cluster message data",
fsa_cause2string(cause), raised_from);
fsa_data->data = copy_ha_msg_input(data);
fsa_data->data_type = fsa_dt_ha_msg;
break;
case C_LRM_OP_CALLBACK:
crm_trace("Copying %s data from %s as lrmd_event_data_t",
fsa_cause2string(cause), raised_from);
fsa_data->data = lrmd_copy_event((lrmd_event_data_t *) data);
fsa_data->data_type = fsa_dt_lrm;
break;
case C_TIMER_POPPED:
case C_SHUTDOWN:
case C_UNKNOWN:
case C_STARTUP:
crm_crit("Copying %s data (from %s) is not yet implemented",
fsa_cause2string(cause), raised_from);
crmd_exit(CRM_EX_SOFTWARE);
break;
}
}
/* make sure to free it properly later */
if (prepend) {
controld_globals.fsa_message_queue
= g_list_prepend(controld_globals.fsa_message_queue, fsa_data);
} else {
controld_globals.fsa_message_queue
= g_list_append(controld_globals.fsa_message_queue, fsa_data);
}
crm_trace("FSA message queue length is %d",
g_list_length(controld_globals.fsa_message_queue));
/* fsa_dump_queue(LOG_TRACE); */
if (old_len == g_list_length(controld_globals.fsa_message_queue)) {
crm_err("Couldn't add message to the queue");
}
if (input != I_WAIT_FOR_EVENT) {
controld_trigger_fsa();
}
}
void
fsa_dump_queue(int log_level)
{
int offset = 0;
for (GList *iter = controld_globals.fsa_message_queue; iter != NULL;
iter = iter->next) {
fsa_data_t *data = (fsa_data_t *) iter->data;
do_crm_log_unlikely(log_level,
"queue[%d.%d]: input %s raised by %s(%p.%d)\t(cause=%s)",
offset++, data->id, fsa_input2string(data->fsa_input),
data->origin, data->data, data->data_type,
fsa_cause2string(data->fsa_cause));
}
}
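/* Deep-copy a cluster message input, extracting its F_CRM_DATA XML as data */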
ha_msg_input_t *
copy_ha_msg_input(ha_msg_input_t * orig)
{
ha_msg_input_t *copy = calloc(1, sizeof(ha_msg_input_t));
CRM_ASSERT(copy != NULL);
copy->msg = (orig && orig->msg)? copy_xml(orig->msg) : NULL;
copy->xml = get_message_xml(copy->msg, F_CRM_DATA);
return copy;
}
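/* Free an FSA input and its data according to the data type */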
void
delete_fsa_input(fsa_data_t * fsa_data)
{
lrmd_event_data_t *op = NULL;
xmlNode *foo = NULL;
if (fsa_data == NULL) {
return;
}
crm_trace("About to free %s data", fsa_cause2string(fsa_data->fsa_cause));
if (fsa_data->data != NULL) {
switch (fsa_data->data_type) {
case fsa_dt_ha_msg:
delete_ha_msg_input(fsa_data->data);
break;
case fsa_dt_xml:
foo = fsa_data->data;
free_xml(foo);
break;
case fsa_dt_lrm:
op = (lrmd_event_data_t *) fsa_data->data;
lrmd_free_event(op);
break;
case fsa_dt_none:
if (fsa_data->data != NULL) {
crm_err("Don't know how to free %s data from %s",
fsa_cause2string(fsa_data->fsa_cause), fsa_data->origin);
crmd_exit(CRM_EX_SOFTWARE);
}
break;
}
crm_trace("%s data freed", fsa_cause2string(fsa_data->fsa_cause));
}
free(fsa_data);
}
/* returns the next message */
fsa_data_t *
get_message(void)
{
fsa_data_t *message
= (fsa_data_t *) controld_globals.fsa_message_queue->data;
controld_globals.fsa_message_queue
= g_list_remove(controld_globals.fsa_message_queue, message);
crm_trace("Processing input %d", message->id);
return message;
}
void *
fsa_typed_data_adv(fsa_data_t * fsa_data, enum fsa_data_type a_type, const char *caller)
{
void *ret_val = NULL;
if (fsa_data == NULL) {
crm_err("%s: No FSA data available", caller);
} else if (fsa_data->data == NULL) {
crm_err("%s: No message data available. Origin: %s", caller, fsa_data->origin);
} else if (fsa_data->data_type != a_type) {
crm_crit("%s: Message data was the wrong type! %d vs. requested=%d. Origin: %s",
caller, fsa_data->data_type, a_type, fsa_data->origin);
CRM_ASSERT(fsa_data->data_type == a_type);
} else {
ret_val = fsa_data->data;
}
return ret_val;
}
/* A_MSG_ROUTE */
void
do_msg_route(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
ha_msg_input_t *input = fsa_typed_data(fsa_dt_ha_msg);
route_message(msg_data->fsa_cause, input->msg);
}
void
route_message(enum crmd_fsa_cause cause, xmlNode * input)
{
ha_msg_input_t fsa_input;
enum crmd_fsa_input result = I_NULL;
fsa_input.msg = input;
CRM_CHECK(cause == C_IPC_MESSAGE || cause == C_HA_MESSAGE, return);
/* try passing the buck first */
if (relay_message(input, cause == C_IPC_MESSAGE)) {
return;
}
/* handle locally */
result = handle_message(input, cause);
/* done or process later? */
switch (result) {
case I_NULL:
case I_CIB_OP:
case I_ROUTER:
case I_NODE_JOIN:
case I_JOIN_REQUEST:
case I_JOIN_RESULT:
break;
default:
/* Deferring local processing of message */
register_fsa_input_later(cause, result, &fsa_input);
return;
}
if (result != I_NULL) {
/* add to the front of the queue */
register_fsa_input(cause, result, &fsa_input);
}
}
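/* Route a message toward its destination, returning TRUE if no further
 * processing is needed or FALSE if the caller must handle it locally
 */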
gboolean
relay_message(xmlNode * msg, gboolean originated_locally)
{
int dest = 1;
bool is_for_dc = false;
bool is_for_dcib = false;
bool is_for_te = false;
bool is_for_crm = false;
bool is_for_cib = false;
bool is_local = false;
const char *host_to = crm_element_value(msg, F_CRM_HOST_TO);
const char *sys_to = crm_element_value(msg, F_CRM_SYS_TO);
const char *sys_from = crm_element_value(msg, F_CRM_SYS_FROM);
const char *type = crm_element_value(msg, F_TYPE);
const char *task = crm_element_value(msg, F_CRM_TASK);
const char *ref = crm_element_value(msg, XML_ATTR_REFERENCE);
if (ref == NULL) {
ref = "without reference ID";
}
if (msg == NULL) {
crm_warn("Cannot route empty message");
return TRUE;
} else if (pcmk__str_eq(task, CRM_OP_HELLO, pcmk__str_casei)) {
crm_trace("No routing needed for hello message %s", ref);
return TRUE;
} else if (!pcmk__str_eq(type, T_CRM, pcmk__str_casei)) {
crm_warn("Received invalid message %s: type '%s' not '" T_CRM "'",
ref, pcmk__s(type, ""));
crm_log_xml_warn(msg, "[bad message type]");
return TRUE;
} else if (sys_to == NULL) {
crm_warn("Received invalid message %s: no subsystem", ref);
crm_log_xml_warn(msg, "[no subsystem]");
return TRUE;
}
is_for_dc = (strcasecmp(CRM_SYSTEM_DC, sys_to) == 0);
is_for_dcib = (strcasecmp(CRM_SYSTEM_DCIB, sys_to) == 0);
is_for_te = (strcasecmp(CRM_SYSTEM_TENGINE, sys_to) == 0);
is_for_cib = (strcasecmp(CRM_SYSTEM_CIB, sys_to) == 0);
is_for_crm = (strcasecmp(CRM_SYSTEM_CRMD, sys_to) == 0);
is_local = false;
if (pcmk__str_empty(host_to)) {
if (is_for_dc || is_for_te) {
is_local = false;
} else if (is_for_crm) {
if (pcmk__strcase_any_of(task, CRM_OP_NODE_INFO,
PCMK__CONTROLD_CMD_NODES, NULL)) {
/* Node info requests do not specify a host, which is normally
* treated as "all hosts", because the whole point is that the
* client may not know the local node name. Always handle these
* requests locally.
*/
is_local = true;
} else {
is_local = !originated_locally;
}
} else {
is_local = true;
}
} else if (pcmk__str_eq(controld_globals.our_nodename, host_to,
pcmk__str_casei)) {
is_local = true;
} else if (is_for_crm && pcmk__str_eq(task, CRM_OP_LRM_DELETE, pcmk__str_casei)) {
xmlNode *msg_data = get_message_xml(msg, F_CRM_DATA);
const char *mode = crm_element_value(msg_data, PCMK__XA_MODE);
if (pcmk__str_eq(mode, XML_TAG_CIB, pcmk__str_casei)) {
// Local delete of an offline node's resource history
is_local = true;
}
}
if (is_for_dc || is_for_dcib || is_for_te) {
if (AM_I_DC && is_for_te) {
crm_trace("Route message %s locally as transition request", ref);
send_msg_via_ipc(msg, sys_to);
} else if (AM_I_DC) {
crm_trace("Route message %s locally as DC request", ref);
return FALSE; // More to be done by caller
} else if (originated_locally && !pcmk__strcase_any_of(sys_from, CRM_SYSTEM_PENGINE,
CRM_SYSTEM_TENGINE, NULL)) {
-#if SUPPORT_COROSYNC
if (is_corosync_cluster()) {
dest = text2msg_type(sys_to);
}
-#endif
crm_trace("Relay message %s to DC", ref);
send_cluster_message(host_to ? crm_get_peer(0, host_to) : NULL, dest, msg, TRUE);
} else {
/* Neither the TE nor the scheduler should be sending messages
* to DCs on other nodes. By definition, if we are no longer the DC,
* then the scheduler's or TE's data should be discarded.
*/
crm_trace("Discard message %s because we are not DC", ref);
}
} else if (is_local && (is_for_crm || is_for_cib)) {
crm_trace("Route message %s locally as controller request", ref);
return FALSE; // More to be done by caller
} else if (is_local) {
crm_trace("Relay message %s locally to %s",
ref, (sys_to? sys_to : "unknown client"));
crm_log_xml_trace(msg, "[IPC relay]");
send_msg_via_ipc(msg, sys_to);
} else {
crm_node_t *node_to = NULL;
-#if SUPPORT_COROSYNC
if (is_corosync_cluster()) {
dest = text2msg_type(sys_to);
if (dest == crm_msg_none || dest > crm_msg_stonith_ng) {
dest = crm_msg_crmd;
}
}
-#endif
if (host_to) {
node_to = pcmk__search_cluster_node_cache(0, host_to);
if (node_to == NULL) {
crm_warn("Cannot route message %s: Unknown node %s",
ref, host_to);
return TRUE;
}
crm_trace("Relay message %s to %s",
ref, (node_to->uname? node_to->uname : "peer"));
} else {
crm_trace("Broadcast message %s to all peers", ref);
}
send_cluster_message(host_to ? node_to : NULL, dest, msg, TRUE);
}
return TRUE; // No further processing of message is needed
}
// Return true if field contains a valid (non-negative) integer version
static bool
authorize_version(xmlNode *message_data, const char *field,
const char *client_name, const char *ref, const char *uuid)
{
const char *version = crm_element_value(message_data, field);
long long version_num;
if ((pcmk__scan_ll(version, &version_num, -1LL) != pcmk_rc_ok)
|| (version_num < 0LL)) {
crm_warn("Rejected IPC hello from %s: '%s' is not a valid protocol %s "
CRM_XS " ref=%s uuid=%s",
client_name, ((version == NULL)? "" : version),
field, (ref? ref : "none"), uuid);
return false;
}
return true;
}
/*!
* \internal
* \brief Check whether a client IPC message is acceptable
*
* If a given client IPC message is a hello, "authorize" it by ensuring it has
* valid information such as a protocol version, and return false indicating
* that nothing further needs to be done with the message. If the message is not
* a hello, just return true to indicate it needs further processing.
*
* \param[in] client_msg XML of IPC message
* \param[in,out] curr_client If IPC is not proxied, client that sent message
* \param[in] proxy_session If IPC is proxied, the session ID
*
* \return true if message needs further processing, false if it doesn't
*/
bool
controld_authorize_ipc_message(const xmlNode *client_msg, pcmk__client_t *curr_client,
const char *proxy_session)
{
xmlNode *message_data = NULL;
const char *client_name = NULL;
const char *op = crm_element_value(client_msg, F_CRM_TASK);
const char *ref = crm_element_value(client_msg, XML_ATTR_REFERENCE);
const char *uuid = (curr_client? curr_client->id : proxy_session);
if (uuid == NULL) {
crm_warn("IPC message from client rejected: No client identifier "
CRM_XS " ref=%s", (ref? ref : "none"));
goto rejected;
}
if (!pcmk__str_eq(CRM_OP_HELLO, op, pcmk__str_casei)) {
// Only hello messages need to be authorized
return true;
}
message_data = get_message_xml(client_msg, F_CRM_DATA);
client_name = crm_element_value(message_data, "client_name");
if (pcmk__str_empty(client_name)) {
crm_warn("IPC hello from client rejected: No client name",
CRM_XS " ref=%s uuid=%s", (ref? ref : "none"), uuid);
goto rejected;
}
if (!authorize_version(message_data, "major_version", client_name, ref,
uuid)) {
goto rejected;
}
if (!authorize_version(message_data, "minor_version", client_name, ref,
uuid)) {
goto rejected;
}
crm_trace("Validated IPC hello from client %s", client_name);
if (curr_client) {
curr_client->userdata = strdup(client_name);
}
controld_trigger_fsa();
return false;
rejected:
if (curr_client) {
qb_ipcs_disconnect(curr_client->ipcs);
}
return false;
}
static enum crmd_fsa_input
handle_message(xmlNode *msg, enum crmd_fsa_cause cause)
{
const char *type = NULL;
CRM_CHECK(msg != NULL, return I_NULL);
type = crm_element_value(msg, F_CRM_MSG_TYPE);
if (pcmk__str_eq(type, XML_ATTR_REQUEST, pcmk__str_none)) {
return handle_request(msg, cause);
} else if (pcmk__str_eq(type, XML_ATTR_RESPONSE, pcmk__str_none)) {
handle_response(msg);
return I_NULL;
}
crm_err("Unknown message type: %s", type);
return I_NULL;
}
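/* Handle a failcount-clearing request by clearing the failures from the
 * attribute manager, the CIB, and the controller's executor state
 */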
static enum crmd_fsa_input
handle_failcount_op(xmlNode * stored_msg)
{
const char *rsc = NULL;
const char *uname = NULL;
const char *op = NULL;
char *interval_spec = NULL;
guint interval_ms = 0;
gboolean is_remote_node = FALSE;
xmlNode *xml_op = get_message_xml(stored_msg, F_CRM_DATA);
if (xml_op) {
xmlNode *xml_rsc = first_named_child(xml_op, XML_CIB_TAG_RESOURCE);
xmlNode *xml_attrs = first_named_child(xml_op, XML_TAG_ATTRS);
if (xml_rsc) {
rsc = ID(xml_rsc);
}
if (xml_attrs) {
op = crm_element_value(xml_attrs,
CRM_META "_" XML_RSC_ATTR_CLEAR_OP);
crm_element_value_ms(xml_attrs,
CRM_META "_" XML_RSC_ATTR_CLEAR_INTERVAL,
&interval_ms);
}
}
uname = crm_element_value(xml_op, XML_LRM_ATTR_TARGET);
if ((rsc == NULL) || (uname == NULL)) {
crm_log_xml_warn(stored_msg, "invalid failcount op");
return I_NULL;
}
if (crm_element_value(xml_op, XML_LRM_ATTR_ROUTER_NODE)) {
is_remote_node = TRUE;
}
crm_debug("Clearing failures for %s-interval %s on %s "
"from attribute manager, CIB, and executor state",
pcmk__readable_interval(interval_ms), rsc, uname);
if (interval_ms) {
interval_spec = crm_strdup_printf("%ums", interval_ms);
}
update_attrd_clear_failures(uname, rsc, op, interval_spec, is_remote_node);
free(interval_spec);
controld_cib_delete_last_failure(rsc, uname, op, interval_ms);
lrm_clear_last_failure(rsc, uname, op, interval_ms);
return I_NULL;
}
static enum crmd_fsa_input
handle_lrm_delete(xmlNode *stored_msg)
{
const char *mode = NULL;
xmlNode *msg_data = get_message_xml(stored_msg, F_CRM_DATA);
CRM_CHECK(msg_data != NULL, return I_NULL);
/* CRM_OP_LRM_DELETE has two distinct modes. The default behavior is to
* relay the operation to the affected node, which will unregister the
* resource from the local executor, clear the resource's history from the
* CIB, and do some bookkeeping in the controller.
*
* However, if the affected node is offline, the client will specify
* mode="cib" which means the controller receiving the operation should
* clear the resource's history from the CIB and nothing else. This is used
* to clear shutdown locks.
*/
mode = crm_element_value(msg_data, PCMK__XA_MODE);
if ((mode == NULL) || strcmp(mode, XML_TAG_CIB)) {
// Relay to affected node
crm_xml_add(stored_msg, F_CRM_SYS_TO, CRM_SYSTEM_LRMD);
return I_ROUTER;
} else {
// Delete CIB history locally (compare with do_lrm_delete())
const char *from_sys = NULL;
const char *user_name = NULL;
const char *rsc_id = NULL;
const char *node = NULL;
xmlNode *rsc_xml = NULL;
int rc = pcmk_rc_ok;
rsc_xml = first_named_child(msg_data, XML_CIB_TAG_RESOURCE);
CRM_CHECK(rsc_xml != NULL, return I_NULL);
rsc_id = ID(rsc_xml);
from_sys = crm_element_value(stored_msg, F_CRM_SYS_FROM);
node = crm_element_value(msg_data, XML_LRM_ATTR_TARGET);
user_name = pcmk__update_acl_user(stored_msg, F_CRM_USER, NULL);
crm_debug("Handling " CRM_OP_LRM_DELETE " for %s on %s locally%s%s "
"(clearing CIB resource history only)", rsc_id, node,
(user_name? " for user " : ""), (user_name? user_name : ""));
rc = controld_delete_resource_history(rsc_id, node, user_name,
cib_dryrun|cib_sync_call);
if (rc == pcmk_rc_ok) {
rc = controld_delete_resource_history(rsc_id, node, user_name,
crmd_cib_smart_opt());
}
// Notify the client, and the TE if this is a mode="cib" CRM_OP_LRM_DELETE
if (from_sys) {
lrmd_event_data_t *op = NULL;
const char *from_host = crm_element_value(stored_msg,
F_CRM_HOST_FROM);
const char *transition;
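/* The TE carries the transition key on the message itself; other
* requesters put it in the message data */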
if (strcmp(from_sys, CRM_SYSTEM_TENGINE)) {
transition = crm_element_value(msg_data,
XML_ATTR_TRANSITION_KEY);
} else {
transition = crm_element_value(stored_msg,
XML_ATTR_TRANSITION_KEY);
}
crm_info("Notifying %s on %s that %s was%s deleted",
from_sys, (from_host? from_host : "local node"), rsc_id,
((rc == pcmk_rc_ok)? "" : " not"));
op = lrmd_new_event(rsc_id, CRMD_ACTION_DELETE, 0);
op->type = lrmd_event_exec_complete;
op->user_data = strdup(transition? transition : FAKE_TE_ID);
op->params = pcmk__strkey_table(free, free);
g_hash_table_insert(op->params, strdup(XML_ATTR_CRM_VERSION),
strdup(CRM_FEATURE_SET));
controld_rc2event(op, rc);
controld_ack_event_directly(from_host, from_sys, NULL, op, rsc_id);
lrmd_free_event(op);
controld_trigger_delete_refresh(from_sys, rsc_id);
}
return I_NULL;
}
}
/*!
* \brief Handle a CRM_OP_REMOTE_STATE message by updating remote peer cache
*
* \param[in] msg Message XML
*
* \return Next FSA input
*/
static enum crmd_fsa_input
handle_remote_state(const xmlNode *msg)
{
const char *remote_uname = ID(msg);
crm_node_t *remote_peer;
bool remote_is_up = false;
int rc = pcmk_rc_ok;
rc = pcmk__xe_get_bool_attr(msg, XML_NODE_IN_CLUSTER, &remote_is_up);
CRM_CHECK(remote_uname && rc == pcmk_rc_ok, return I_NULL);
remote_peer = crm_remote_peer_get(remote_uname);
CRM_CHECK(remote_peer, return I_NULL);
pcmk__update_peer_state(__func__, remote_peer,
remote_is_up ? CRM_NODE_MEMBER : CRM_NODE_LOST,
0);
return I_NULL;
}
/*!
* \brief Handle a CRM_OP_PING message
*
* \param[in] msg Message XML
*
* \return Next FSA input
*/
static enum crmd_fsa_input
handle_ping(const xmlNode *msg)
{
const char *value = NULL;
xmlNode *ping = NULL;
xmlNode *reply = NULL;
// Build reply
ping = create_xml_node(NULL, XML_CRM_TAG_PING);
value = crm_element_value(msg, F_CRM_SYS_TO);
crm_xml_add(ping, XML_PING_ATTR_SYSFROM, value);
// Add controller state
value = fsa_state2string(controld_globals.fsa_state);
crm_xml_add(ping, XML_PING_ATTR_CRMDSTATE, value);
crm_notice("Current ping state: %s", value); // CTS needs this
// Add controller health
// @TODO maybe do some checks to determine meaningful status
crm_xml_add(ping, XML_PING_ATTR_STATUS, "ok");
// Send reply
reply = create_reply(msg, ping);
free_xml(ping);
if (reply != NULL) {
(void) relay_message(reply, TRUE);
free_xml(reply);
}
// Nothing further to do
return I_NULL;
}
/*!
* \brief Handle a PCMK__CONTROLD_CMD_NODES message
*
* \param[in] request Message XML
*
* \return Next FSA input
*/
static enum crmd_fsa_input
handle_node_list(const xmlNode *request)
{
GHashTableIter iter;
crm_node_t *node = NULL;
xmlNode *reply = NULL;
xmlNode *reply_data = NULL;
// Create message data for reply
reply_data = create_xml_node(NULL, XML_CIB_TAG_NODES);
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) & node)) {
xmlNode *xml = create_xml_node(reply_data, XML_CIB_TAG_NODE);
crm_xml_add_ll(xml, XML_ATTR_ID, (long long) node->id); // uint32_t
crm_xml_add(xml, XML_ATTR_UNAME, node->uname);
crm_xml_add(xml, XML_NODE_IN_CLUSTER, node->state);
}
// Create and send reply
reply = create_reply(request, reply_data);
free_xml(reply_data);
if (reply) {
(void) relay_message(reply, TRUE);
free_xml(reply);
}
// Nothing further to do
return I_NULL;
}
/*!
* \brief Handle a CRM_OP_NODE_INFO request
*
* \param[in] msg Message XML
*
* \return Next FSA input
*/
static enum crmd_fsa_input
handle_node_info_request(const xmlNode *msg)
{
const char *value = NULL;
crm_node_t *node = NULL;
int node_id = 0;
xmlNode *reply = NULL;
xmlNode *reply_data = NULL;
// Build reply
reply_data = create_xml_node(NULL, XML_CIB_TAG_NODE);
crm_xml_add(reply_data, XML_PING_ATTR_SYSFROM, CRM_SYSTEM_CRMD);
// Add whether current partition has quorum
pcmk__xe_set_bool_attr(reply_data, XML_ATTR_HAVE_QUORUM,
pcmk_is_set(controld_globals.flags,
controld_has_quorum));
// Check whether client requested node info by ID and/or name
crm_element_value_int(msg, XML_ATTR_ID, &node_id);
if (node_id < 0) {
node_id = 0;
}
value = crm_element_value(msg, XML_ATTR_UNAME);
// Default to local node if none given
if ((node_id == 0) && (value == NULL)) {
value = controld_globals.our_nodename;
}
node = pcmk__search_node_caches(node_id, value, CRM_GET_PEER_ANY);
if (node) {
crm_xml_add(reply_data, XML_ATTR_ID, node->uuid);
crm_xml_add(reply_data, XML_ATTR_UNAME, node->uname);
crm_xml_add(reply_data, XML_NODE_IS_PEER, node->state);
pcmk__xe_set_bool_attr(reply_data, XML_NODE_IS_REMOTE,
pcmk_is_set(node->flags, crm_remote_node));
}
// Send reply
reply = create_reply(msg, reply_data);
free_xml(reply_data);
if (reply != NULL) {
(void) relay_message(reply, TRUE);
free_xml(reply);
}
// Nothing further to do
return I_NULL;
}
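// Exit fatally if the local feature set is incompatible with the DC's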
static void
verify_feature_set(xmlNode *msg)
{
const char *dc_version = crm_element_value(msg, XML_ATTR_CRM_VERSION);
if (dc_version == NULL) {
/* All we really know is that the DC feature set is older than 3.1.0,
* but that's also all that really matters.
*/
dc_version = "3.0.14";
}
if (feature_set_compatible(dc_version, CRM_FEATURE_SET)) {
crm_trace("Local feature set (%s) is compatible with DC's (%s)",
CRM_FEATURE_SET, dc_version);
} else {
crm_err("Local feature set (%s) is incompatible with DC's (%s)",
CRM_FEATURE_SET, dc_version);
// Nothing is likely to improve without administrator involvement
controld_set_fsa_input_flags(R_STAYDOWN);
crmd_exit(CRM_EX_FATAL);
}
}
// DC gets own shutdown all-clear
static enum crmd_fsa_input
handle_shutdown_self_ack(xmlNode *stored_msg)
{
const char *host_from = crm_element_value(stored_msg, F_CRM_HOST_FROM);
if (pcmk_is_set(controld_globals.fsa_input_register, R_SHUTDOWN)) {
// The expected case -- we initiated our own shutdown sequence
crm_info("Shutting down controller");
return I_STOP;
}
if (pcmk__str_eq(host_from, controld_globals.dc_name, pcmk__str_casei)) {
// Must be logic error -- DC confirming its own unrequested shutdown
crm_err("Shutting down controller immediately due to "
"unexpected shutdown confirmation");
return I_TERMINATE;
}
if (controld_globals.fsa_state != S_STOPPING) {
// Shouldn't happen -- non-DC confirming unrequested shutdown
crm_err("Starting new DC election because %s is "
"confirming shutdown we did not request",
(host_from? host_from : "another node"));
return I_ELECTION;
}
// Shouldn't happen, but we are already stopping anyway
crm_debug("Ignoring unexpected shutdown confirmation from %s",
(host_from? host_from : "another node"));
return I_NULL;
}
// Non-DC gets shutdown all-clear from DC
static enum crmd_fsa_input
handle_shutdown_ack(xmlNode *stored_msg)
{
const char *host_from = crm_element_value(stored_msg, F_CRM_HOST_FROM);
if (host_from == NULL) {
crm_warn("Ignoring shutdown request without origin specified");
return I_NULL;
}
if (pcmk__str_eq(host_from, controld_globals.dc_name,
pcmk__str_null_matches|pcmk__str_casei)) {
if (pcmk_is_set(controld_globals.fsa_input_register, R_SHUTDOWN)) {
crm_info("Shutting down controller after confirmation from %s",
host_from);
} else {
crm_err("Shutting down controller after unexpected "
"shutdown request from %s", host_from);
controld_set_fsa_input_flags(R_STAYDOWN);
}
return I_STOP;
}
crm_warn("Ignoring shutdown request from %s because DC is %s",
host_from, controld_globals.dc_name);
return I_NULL;
}
static enum crmd_fsa_input
handle_request(xmlNode *stored_msg, enum crmd_fsa_cause cause)
{
xmlNode *msg = NULL;
const char *op = crm_element_value(stored_msg, F_CRM_TASK);
/* Optimize this for the DC - it has the most to do */
if (op == NULL) {
crm_log_xml_warn(stored_msg, "[request without " F_CRM_TASK "]");
return I_NULL;
}
if (strcmp(op, CRM_OP_SHUTDOWN_REQ) == 0) {
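// Record the sender's intent to leave, whether or not we are the DC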
const char *from = crm_element_value(stored_msg, F_CRM_HOST_FROM);
crm_node_t *node = pcmk__search_cluster_node_cache(0, from);
pcmk__update_peer_expected(__func__, node, CRMD_JOINSTATE_DOWN);
if(AM_I_DC == FALSE) {
return I_NULL; /* Done */
}
}
/*========== DC-Only Actions ==========*/
if (AM_I_DC) {
if (strcmp(op, CRM_OP_JOIN_ANNOUNCE) == 0) {
return I_NODE_JOIN;
} else if (strcmp(op, CRM_OP_JOIN_REQUEST) == 0) {
return I_JOIN_REQUEST;
} else if (strcmp(op, CRM_OP_JOIN_CONFIRM) == 0) {
return I_JOIN_RESULT;
} else if (strcmp(op, CRM_OP_SHUTDOWN) == 0) {
return handle_shutdown_self_ack(stored_msg);
} else if (strcmp(op, CRM_OP_SHUTDOWN_REQ) == 0) {
// Another controller wants to shut down its node
return handle_shutdown_request(stored_msg);
} else if (strcmp(op, CRM_OP_REMOTE_STATE) == 0) {
/* a remote connection host is letting us know the node state */
return handle_remote_state(stored_msg);
}
}
/*========== common actions ==========*/
if (strcmp(op, CRM_OP_NOVOTE) == 0) {
ha_msg_input_t fsa_input;
fsa_input.msg = stored_msg;
register_fsa_input_adv(C_HA_MESSAGE, I_NULL, &fsa_input,
A_ELECTION_COUNT | A_ELECTION_CHECK, FALSE,
__func__);
} else if (strcmp(op, CRM_OP_THROTTLE) == 0) {
throttle_update(stored_msg);
if (AM_I_DC && (controld_globals.transition_graph != NULL)
&& !controld_globals.transition_graph->complete) {
crm_debug("The throttle changed. Trigger a graph.");
trigger_graph();
}
return I_NULL;
} else if (strcmp(op, CRM_OP_CLEAR_FAILCOUNT) == 0) {
return handle_failcount_op(stored_msg);
} else if (strcmp(op, CRM_OP_VOTE) == 0) {
/* count the vote and decide what to do after that */
ha_msg_input_t fsa_input;
fsa_input.msg = stored_msg;
register_fsa_input_adv(C_HA_MESSAGE, I_NULL, &fsa_input,
A_ELECTION_COUNT | A_ELECTION_CHECK, FALSE,
__func__);
/* Sometimes we _must_ go into S_ELECTION */
if (controld_globals.fsa_state == S_HALT) {
crm_debug("Forcing an election from S_HALT");
return I_ELECTION;
#if 0
} else if (AM_I_DC) {
/* This is the old way of doing things but what is gained? */
return I_ELECTION;
#endif
}
} else if (strcmp(op, CRM_OP_JOIN_OFFER) == 0) {
verify_feature_set(stored_msg);
crm_debug("Raising I_JOIN_OFFER: join-%s", crm_element_value(stored_msg, F_CRM_JOIN_ID));
return I_JOIN_OFFER;
} else if (strcmp(op, CRM_OP_JOIN_ACKNAK) == 0) {
crm_debug("Raising I_JOIN_RESULT: join-%s", crm_element_value(stored_msg, F_CRM_JOIN_ID));
return I_JOIN_RESULT;
} else if (strcmp(op, CRM_OP_LRM_DELETE) == 0) {
return handle_lrm_delete(stored_msg);
} else if ((strcmp(op, CRM_OP_LRM_FAIL) == 0)
|| (strcmp(op, CRM_OP_LRM_REFRESH) == 0) // @COMPAT
|| (strcmp(op, CRM_OP_REPROBE) == 0)) {
crm_xml_add(stored_msg, F_CRM_SYS_TO, CRM_SYSTEM_LRMD);
return I_ROUTER;
} else if (strcmp(op, CRM_OP_NOOP) == 0) {
return I_NULL;
} else if (strcmp(op, CRM_OP_LOCAL_SHUTDOWN) == 0) {
crm_shutdown(SIGTERM);
/*return I_SHUTDOWN; */
return I_NULL;
} else if (strcmp(op, CRM_OP_PING) == 0) {
return handle_ping(stored_msg);
} else if (strcmp(op, CRM_OP_NODE_INFO) == 0) {
return handle_node_info_request(stored_msg);
} else if (strcmp(op, CRM_OP_RM_NODE_CACHE) == 0) {
int id = 0;
const char *name = NULL;
crm_element_value_int(stored_msg, XML_ATTR_ID, &id);
name = crm_element_value(stored_msg, XML_ATTR_UNAME);
if(cause == C_IPC_MESSAGE) {
msg = create_request(CRM_OP_RM_NODE_CACHE, NULL, NULL, CRM_SYSTEM_CRMD, CRM_SYSTEM_CRMD, NULL);
if (send_cluster_message(NULL, crm_msg_crmd, msg, TRUE) == FALSE) {
crm_err("Could not instruct peers to remove references to node %s/%u", name, id);
} else {
crm_notice("Instructing peers to remove references to node %s/%u", name, id);
}
free_xml(msg);
} else {
reap_crm_member(id, name);
/* If we're forgetting this node, also forget any failures to fence
* it, so we don't carry that over to any node added later with the
* same name.
*/
st_fail_count_reset(name);
}
} else if (strcmp(op, CRM_OP_MAINTENANCE_NODES) == 0) {
xmlNode *xml = get_message_xml(stored_msg, F_CRM_DATA);
remote_ra_process_maintenance_nodes(xml);
} else if (strcmp(op, PCMK__CONTROLD_CMD_NODES) == 0) {
return handle_node_list(stored_msg);
/*========== (NOT_DC)-Only Actions ==========*/
} else if (!AM_I_DC) {
if (strcmp(op, CRM_OP_SHUTDOWN) == 0) {
return handle_shutdown_ack(stored_msg);
}
} else {
crm_err("Unexpected request (%s) sent to %s", op, AM_I_DC ? "the DC" : "non-DC node");
crm_log_xml_err(stored_msg, "Unexpected");
}
return I_NULL;
}
static void
handle_response(xmlNode *stored_msg)
{
const char *op = crm_element_value(stored_msg, F_CRM_TASK);
if (op == NULL) {
crm_log_xml_err(stored_msg, "Bad message");
} else if (AM_I_DC && strcmp(op, CRM_OP_PECALC) == 0) {
// Check whether the scheduler's answer has been superseded by a later request
const char *msg_ref = crm_element_value(stored_msg, XML_ATTR_REFERENCE);
if (msg_ref == NULL) {
crm_err("%s - Ignoring calculation with no reference", op);
} else if (pcmk__str_eq(msg_ref, controld_globals.fsa_pe_ref,
pcmk__str_none)) {
ha_msg_input_t fsa_input;
controld_stop_sched_timer();
fsa_input.msg = stored_msg;
register_fsa_input_later(C_IPC_MESSAGE, I_PE_SUCCESS, &fsa_input);
} else {
crm_info("%s calculation %s is obsolete", op, msg_ref);
}
} else if (strcmp(op, CRM_OP_VOTE) == 0
|| strcmp(op, CRM_OP_SHUTDOWN_REQ) == 0 || strcmp(op, CRM_OP_SHUTDOWN) == 0) {
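// Acknowledgements of votes and shutdown handshakes need no processing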
} else {
const char *host_from = crm_element_value(stored_msg, F_CRM_HOST_FROM);
crm_err("Unexpected response (op=%s, src=%s) sent to the %s",
op, host_from, AM_I_DC ? "DC" : "controller");
}
}
static enum crmd_fsa_input
handle_shutdown_request(xmlNode * stored_msg)
{
/* handle here to avoid potential version issues
* where the shutdown message/procedure may have
* been changed in later versions.
*
* This way the DC is always in control of the shutdown
*/
char *now_s = NULL;
const char *host_from = crm_element_value(stored_msg, F_CRM_HOST_FROM);
if (host_from == NULL) {
/* No origin means the request is local: the DC itself is shutting down */
host_from = controld_globals.our_nodename;
}
crm_info("Creating shutdown request for %s (state=%s)", host_from,
fsa_state2string(controld_globals.fsa_state));
crm_log_xml_trace(stored_msg, "message");
now_s = pcmk__ttoa(time(NULL));
update_attrd(host_from, XML_CIB_ATTR_SHUTDOWN, now_s, NULL, FALSE);
free(now_s);
/* will be picked up by the TE as long as it's running */
return I_NULL;
}
static void
send_msg_via_ipc(xmlNode * msg, const char *sys)
{
pcmk__client_t *client_channel = NULL;
CRM_CHECK(sys != NULL, return);
client_channel = pcmk__find_client_by_id(sys);
if (crm_element_value(msg, F_CRM_HOST_FROM) == NULL) {
crm_xml_add(msg, F_CRM_HOST_FROM, controld_globals.our_nodename);
}
if (client_channel != NULL) {
/* Transient clients such as crmadmin */
pcmk__ipc_send_xml(client_channel, 0, msg, crm_ipc_server_event);
} else if (pcmk__str_eq(sys, CRM_SYSTEM_TENGINE, pcmk__str_none)) {
xmlNode *data = get_message_xml(msg, F_CRM_DATA);
process_te_message(msg, data);
} else if (pcmk__str_eq(sys, CRM_SYSTEM_LRMD, pcmk__str_none)) {
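// Invoke the local executor interface directly with synthesized FSA data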
fsa_data_t fsa_data;
ha_msg_input_t fsa_input;
fsa_input.msg = msg;
fsa_input.xml = get_message_xml(msg, F_CRM_DATA);
fsa_data.id = 0;
fsa_data.actions = 0;
fsa_data.data = &fsa_input;
fsa_data.fsa_input = I_MESSAGE;
fsa_data.fsa_cause = C_IPC_MESSAGE;
fsa_data.origin = __func__;
fsa_data.data_type = fsa_dt_ha_msg;
do_lrm_invoke(A_LRM_INVOKE, C_IPC_MESSAGE, controld_globals.fsa_state,
I_MESSAGE, &fsa_data);
} else if (crmd_is_proxy_session(sys)) {
crmd_proxy_send(sys, msg);
} else {
crm_info("Received invalid request: unknown subsystem '%s'", sys);
}
}
void
delete_ha_msg_input(ha_msg_input_t * orig)
{
if (orig == NULL) {
return;
}
free_xml(orig->msg);
free(orig);
}
/*!
* \internal
* \brief Notify the DC of a remote node state change
*
* \param[in] node_name Node's name
* \param[in] node_up TRUE if node is up, FALSE if down
*/
void
send_remote_state_message(const char *node_name, gboolean node_up)
{
/* If we don't have a DC, or the message fails, we have a failsafe:
* the DC will eventually pick up the change via the CIB node state.
* The message allows it to happen sooner if possible.
*/
if (controld_globals.dc_name != NULL) {
xmlNode *msg = create_request(CRM_OP_REMOTE_STATE, NULL,
controld_globals.dc_name, CRM_SYSTEM_DC,
CRM_SYSTEM_CRMD, NULL);
crm_info("Notifying DC %s of Pacemaker Remote node %s %s",
controld_globals.dc_name, node_name,
node_up? "coming up" : "going down");
crm_xml_add(msg, XML_ATTR_ID, node_name);
pcmk__xe_set_bool_attr(msg, XML_NODE_IN_CLUSTER, node_up);
send_cluster_message(crm_get_peer(0, controld_globals.dc_name),
crm_msg_crmd, msg, TRUE);
free_xml(msg);
} else {
crm_debug("No DC to notify of Pacemaker Remote node %s %s",
node_name, (node_up? "coming up" : "going down"));
}
}
diff --git a/daemons/controld/controld_te_actions.c b/daemons/controld/controld_te_actions.c
index 59a5746a94..d8cfcad489 100644
--- a/daemons/controld/controld_te_actions.c
+++ b/daemons/controld/controld_te_actions.c
@@ -1,747 +1,746 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include // lrmd_event_data_t, lrmd_free_event()
#include
#include
#include
#include
#include
static GHashTable *te_targets = NULL;
void send_rsc_command(pcmk__graph_action_t *action);
static void te_update_job_count(pcmk__graph_action_t *action, int offset);
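// Start a timer for an action, allowing for its timeout plus network delay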
static void
te_start_action_timer(const pcmk__graph_t *graph, pcmk__graph_action_t *action)
{
action->timer = g_timeout_add(action->timeout + graph->network_delay,
action_timer_callback, (void *) action);
CRM_ASSERT(action->timer != 0);
}
/*!
* \internal
* \brief Execute a graph pseudo-action
*
* \param[in,out] graph Transition graph being executed
* \param[in,out] pseudo Pseudo-action to execute
*
* \return Standard Pacemaker return code
*/
static int
execute_pseudo_action(pcmk__graph_t *graph, pcmk__graph_action_t *pseudo)
{
const char *task = crm_element_value(pseudo->xml, XML_LRM_ATTR_TASK);
/* send to peers as well? */
if (pcmk__str_eq(task, CRM_OP_MAINTENANCE_NODES, pcmk__str_casei)) {
GHashTableIter iter;
crm_node_t *node = NULL;
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &node)) {
xmlNode *cmd = NULL;
if (pcmk__str_eq(controld_globals.our_nodename, node->uname,
pcmk__str_casei)) {
continue;
}
cmd = create_request(task, pseudo->xml, node->uname,
CRM_SYSTEM_CRMD, CRM_SYSTEM_TENGINE, NULL);
send_cluster_message(node, crm_msg_crmd, cmd, FALSE);
free_xml(cmd);
}
remote_ra_process_maintenance_nodes(pseudo->xml);
} else {
/* Check action for Pacemaker Remote node side effects */
remote_ra_process_pseudo(pseudo->xml);
}
crm_debug("Pseudo-action %d (%s) fired and confirmed", pseudo->id,
crm_element_value(pseudo->xml, XML_LRM_ATTR_TASK_KEY));
te_action_confirmed(pseudo, graph);
return pcmk_rc_ok;
}
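// Get the expected result code from an action's transition metadata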
static int
get_target_rc(pcmk__graph_action_t *action)
{
int exit_status;
pcmk__scan_min_int(crm_meta_value(action->params, XML_ATTR_TE_TARGET_RC),
&exit_status, 0);
return exit_status;
}
/*!
* \internal
* \brief Execute a cluster action from a transition graph
*
* \param[in,out] graph Transition graph being executed
* \param[in,out] action Cluster action to execute
*
* \return Standard Pacemaker return code
*/
static int
execute_cluster_action(pcmk__graph_t *graph, pcmk__graph_action_t *action)
{
char *counter = NULL;
xmlNode *cmd = NULL;
gboolean is_local = FALSE;
const char *id = NULL;
const char *task = NULL;
const char *value = NULL;
const char *on_node = NULL;
const char *router_node = NULL;
gboolean rc = TRUE;
gboolean no_wait = FALSE;
id = ID(action->xml);
CRM_CHECK(!pcmk__str_empty(id), return EPROTO);
task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
CRM_CHECK(!pcmk__str_empty(task), return EPROTO);
on_node = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
CRM_CHECK(!pcmk__str_empty(on_node), return pcmk_rc_node_unknown);
router_node = crm_element_value(action->xml, XML_LRM_ATTR_ROUTER_NODE);
if (router_node == NULL) {
router_node = on_node;
if (pcmk__str_eq(task, CRM_OP_LRM_DELETE, pcmk__str_none)) {
const char *mode = crm_element_value(action->xml, PCMK__XA_MODE);
if (pcmk__str_eq(mode, XML_TAG_CIB, pcmk__str_none)) {
router_node = controld_globals.our_nodename;
}
}
}
if (pcmk__str_eq(router_node, controld_globals.our_nodename,
pcmk__str_casei)) {
is_local = TRUE;
}
value = crm_meta_value(action->params, XML_ATTR_TE_NOWAIT);
if (crm_is_true(value)) {
no_wait = TRUE;
}
crm_info("Handling controller request '%s' (%s on %s)%s%s",
id, task, on_node, (is_local? " locally" : ""),
(no_wait? " without waiting" : ""));
if (is_local && pcmk__str_eq(task, CRM_OP_SHUTDOWN, pcmk__str_none)) {
/* defer until everything else completes */
crm_info("Controller request '%s' is a local shutdown", id);
graph->completion_action = pcmk__graph_shutdown;
graph->abort_reason = "local shutdown";
te_action_confirmed(action, graph);
return pcmk_rc_ok;
} else if (pcmk__str_eq(task, CRM_OP_SHUTDOWN, pcmk__str_none)) {
crm_node_t *peer = crm_get_peer(0, router_node);
pcmk__update_peer_expected(__func__, peer, CRMD_JOINSTATE_DOWN);
}
cmd = create_request(task, action->xml, router_node, CRM_SYSTEM_CRMD, CRM_SYSTEM_TENGINE, NULL);
counter = pcmk__transition_key(controld_globals.transition_graph->id,
action->id, get_target_rc(action),
controld_globals.te_uuid);
crm_xml_add(cmd, XML_ATTR_TRANSITION_KEY, counter);
rc = send_cluster_message(crm_get_peer(0, router_node), crm_msg_crmd, cmd, TRUE);
free(counter);
free_xml(cmd);
if (rc == FALSE) {
crm_err("Action %d failed: send", action->id);
return ECOMM;
} else if (no_wait) {
te_action_confirmed(action, graph);
} else {
if (action->timeout <= 0) {
crm_err("Action %d: %s on %s had an invalid timeout (%dms). Using %ums instead",
action->id, task, on_node, action->timeout, graph->network_delay);
action->timeout = (int) graph->network_delay;
}
te_start_action_timer(graph, action);
}
return pcmk_rc_ok;
}
/*!
* \internal
* \brief Synthesize an executor event for a resource action timeout
*
* \param[in] action Resource action that timed out
* \param[in] target_rc Expected result of action that timed out
*
* Synthesize an executor event for a resource action timeout. (If the executor
* gets a timeout while waiting for a resource action to complete, that will be
* reported via the usual callback. This timeout means we didn't hear from the
* executor itself or the controller that relayed the action to the executor.)
*
* \return Newly created executor event for result of \p action
* \note The caller is responsible for freeing the return value using
* lrmd_free_event().
*/
static lrmd_event_data_t *
synthesize_timeout_event(const pcmk__graph_action_t *action, int target_rc)
{
lrmd_event_data_t *op = NULL;
const char *target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
const char *reason = NULL;
char *dynamic_reason = NULL;
if (pcmk__str_eq(target, get_local_node_name(), pcmk__str_casei)) {
reason = "Local executor did not return result in time";
} else {
const char *router_node = NULL;
router_node = crm_element_value(action->xml, XML_LRM_ATTR_ROUTER_NODE);
if (router_node == NULL) {
router_node = target;
}
dynamic_reason = crm_strdup_printf("Controller on %s did not return "
"result in time", router_node);
reason = dynamic_reason;
}
op = pcmk__event_from_graph_action(NULL, action, PCMK_EXEC_TIMEOUT,
PCMK_OCF_UNKNOWN_ERROR, reason);
op->call_id = -1;
op->user_data = pcmk__transition_key(controld_globals.transition_graph->id,
action->id, target_rc,
controld_globals.te_uuid);
free(dynamic_reason);
return op;
}
static void
controld_record_action_event(pcmk__graph_action_t *action,
lrmd_event_data_t *op)
{
cib_t *cib_conn = controld_globals.cib_conn;
xmlNode *state = NULL;
xmlNode *rsc = NULL;
xmlNode *action_rsc = NULL;
int rc = pcmk_ok;
const char *rsc_id = NULL;
const char *target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
const char *task_uuid = crm_element_value(action->xml, XML_LRM_ATTR_TASK_KEY);
const char *target_uuid = crm_element_value(action->xml, XML_LRM_ATTR_TARGET_UUID);
- int call_options = cib_quorum_override | cib_scope_local;
int target_rc = get_target_rc(action);
action_rsc = find_xml_node(action->xml, XML_CIB_TAG_RESOURCE, TRUE);
if (action_rsc == NULL) {
return;
}
rsc_id = ID(action_rsc);
CRM_CHECK(rsc_id != NULL,
crm_log_xml_err(action->xml, "Bad:action"); return);
/*
update the CIB
*/
state = create_xml_node(NULL, XML_CIB_TAG_STATE);
crm_xml_add(state, XML_ATTR_ID, target_uuid);
crm_xml_add(state, XML_ATTR_UNAME, target);
rsc = create_xml_node(state, XML_CIB_TAG_LRM);
crm_xml_add(rsc, XML_ATTR_ID, target_uuid);
rsc = create_xml_node(rsc, XML_LRM_TAG_RESOURCES);
rsc = create_xml_node(rsc, XML_LRM_TAG_RESOURCE);
crm_xml_add(rsc, XML_ATTR_ID, rsc_id);
crm_copy_xml_element(action_rsc, rsc, XML_ATTR_TYPE);
crm_copy_xml_element(action_rsc, rsc, XML_AGENT_ATTR_CLASS);
crm_copy_xml_element(action_rsc, rsc, XML_AGENT_ATTR_PROVIDER);
pcmk__create_history_xml(rsc, op, CRM_FEATURE_SET, target_rc, target,
__func__);
- rc = cib_conn->cmds->update(cib_conn, XML_CIB_TAG_STATUS, state,
- call_options);
+ rc = cib_conn->cmds->modify(cib_conn, XML_CIB_TAG_STATUS, state,
+ cib_scope_local);
fsa_register_cib_callback(rc, NULL, cib_action_updated);
free_xml(state);
crm_trace("Sent CIB update (call ID %d) for synthesized event of action %d (%s on %s)",
rc, action->id, task_uuid, target);
pcmk__set_graph_action_flags(action, pcmk__graph_action_sent_update);
}
void
controld_record_action_timeout(pcmk__graph_action_t *action)
{
lrmd_event_data_t *op = NULL;
const char *target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
const char *task_uuid = crm_element_value(action->xml, XML_LRM_ATTR_TASK_KEY);
int target_rc = get_target_rc(action);
crm_warn("%s %d: %s on %s timed out",
crm_element_name(action->xml), action->id, task_uuid, target);
op = synthesize_timeout_event(action, target_rc);
controld_record_action_event(action, op);
lrmd_free_event(op);
}
/*!
* \internal
* \brief Execute a resource action from a transition graph
*
* \param[in,out] graph Transition graph being executed
* \param[in,out] action Resource action to execute
*
* \return Standard Pacemaker return code
*/
static int
execute_rsc_action(pcmk__graph_t *graph, pcmk__graph_action_t *action)
{
/* never overwrite stop actions in the CIB with
* anything other than completed results
*
* Writing pending stops makes it look like the
* resource is running again
*/
xmlNode *cmd = NULL;
xmlNode *rsc_op = NULL;
gboolean rc = TRUE;
gboolean no_wait = FALSE;
gboolean is_local = FALSE;
char *counter = NULL;
const char *task = NULL;
const char *value = NULL;
const char *on_node = NULL;
const char *router_node = NULL;
const char *task_uuid = NULL;
CRM_ASSERT(action != NULL);
CRM_ASSERT(action->xml != NULL);
pcmk__clear_graph_action_flags(action, pcmk__graph_action_executed);
on_node = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
CRM_CHECK(!pcmk__str_empty(on_node),
crm_err("Corrupted command(id=%s) %s: no node",
ID(action->xml), pcmk__s(task, "without task"));
return pcmk_rc_node_unknown);
rsc_op = action->xml;
task = crm_element_value(rsc_op, XML_LRM_ATTR_TASK);
task_uuid = crm_element_value(action->xml, XML_LRM_ATTR_TASK_KEY);
router_node = crm_element_value(rsc_op, XML_LRM_ATTR_ROUTER_NODE);
if (!router_node) {
router_node = on_node;
}
counter = pcmk__transition_key(controld_globals.transition_graph->id,
action->id, get_target_rc(action),
controld_globals.te_uuid);
crm_xml_add(rsc_op, XML_ATTR_TRANSITION_KEY, counter);
if (pcmk__str_eq(router_node, controld_globals.our_nodename,
pcmk__str_casei)) {
is_local = TRUE;
}
value = crm_meta_value(action->params, XML_ATTR_TE_NOWAIT);
if (crm_is_true(value)) {
no_wait = TRUE;
}
crm_notice("Initiating %s operation %s%s on %s%s "CRM_XS" action %d",
task, task_uuid, (is_local? " locally" : ""), on_node,
(no_wait? " without waiting" : ""), action->id);
cmd = create_request(CRM_OP_INVOKE_LRM, rsc_op, router_node,
CRM_SYSTEM_LRMD, CRM_SYSTEM_TENGINE, NULL);
if (is_local) {
/* shortcut local resource commands */
ha_msg_input_t data = {
.msg = cmd,
.xml = rsc_op,
};
fsa_data_t msg = {
.id = 0,
.data = &data,
.data_type = fsa_dt_ha_msg,
.fsa_input = I_NULL,
.fsa_cause = C_FSA_INTERNAL,
.actions = A_LRM_INVOKE,
.origin = __func__,
};
do_lrm_invoke(A_LRM_INVOKE, C_FSA_INTERNAL, controld_globals.fsa_state,
I_NULL, &msg);
} else {
rc = send_cluster_message(crm_get_peer(0, router_node), crm_msg_lrmd, cmd, TRUE);
}
free(counter);
free_xml(cmd);
pcmk__set_graph_action_flags(action, pcmk__graph_action_executed);
if (rc == FALSE) {
crm_err("Action %d failed: send", action->id);
return ECOMM;
} else if (no_wait) {
/* Just mark confirmed. Don't bump the job count only to immediately
* decrement it.
*/
crm_info("Action %d confirmed - no wait", action->id);
pcmk__set_graph_action_flags(action, pcmk__graph_action_confirmed);
pcmk__update_graph(controld_globals.transition_graph, action);
trigger_graph();
} else if (pcmk_is_set(action->flags, pcmk__graph_action_confirmed)) {
crm_debug("Action %d: %s %s on %s(timeout %dms) was already confirmed.",
action->id, task, task_uuid, on_node, action->timeout);
} else {
if (action->timeout <= 0) {
crm_err("Action %d: %s %s on %s had an invalid timeout (%dms). Using %ums instead",
action->id, task, task_uuid, on_node, action->timeout, graph->network_delay);
action->timeout = (int) graph->network_delay;
}
te_update_job_count(action, 1);
te_start_action_timer(graph, action);
}
return pcmk_rc_ok;
}
struct te_peer_s
{
char *name;
int jobs;
int migrate_jobs;
};
static void te_peer_free(gpointer p)
{
struct te_peer_s *peer = p;
free(peer->name);
free(peer);
}
void te_reset_job_counts(void)
{
GHashTableIter iter;
struct te_peer_s *peer = NULL;
if(te_targets == NULL) {
te_targets = pcmk__strkey_table(NULL, te_peer_free);
}
g_hash_table_iter_init(&iter, te_targets);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) & peer)) {
peer->jobs = 0;
peer->migrate_jobs = 0;
}
}
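// Adjust a node's in-flight job count (and migration count if applicable)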
static void
te_update_job_count_on(const char *target, int offset, bool migrate)
{
struct te_peer_s *r = NULL;
if(target == NULL || te_targets == NULL) {
return;
}
r = g_hash_table_lookup(te_targets, target);
if(r == NULL) {
r = calloc(1, sizeof(struct te_peer_s));
r->name = strdup(target);
g_hash_table_insert(te_targets, r->name, r);
}
r->jobs += offset;
if(migrate) {
r->migrate_jobs += offset;
}
crm_trace("jobs[%s] = %d", target, r->jobs);
}
static void
te_update_job_count(pcmk__graph_action_t *action, int offset)
{
const char *task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
const char *target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
if ((action->type != pcmk__rsc_graph_action) || (target == NULL)) {
/* No limit on these */
return;
}
/* If there is a router node, the action is being performed on a remote
* node. For now, we count all actions occurring on a remote node
* against the job list of the cluster node hosting the connection
* resource. */
target = crm_element_value(action->xml, XML_LRM_ATTR_ROUTER_NODE);
if ((target == NULL) && pcmk__strcase_any_of(task, CRMD_ACTION_MIGRATE,
CRMD_ACTION_MIGRATED, NULL)) {
const char *t1 = crm_meta_value(action->params, XML_LRM_ATTR_MIGRATE_SOURCE);
const char *t2 = crm_meta_value(action->params, XML_LRM_ATTR_MIGRATE_TARGET);
te_update_job_count_on(t1, offset, TRUE);
te_update_job_count_on(t2, offset, TRUE);
return;
} else if (target == NULL) {
target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
}
te_update_job_count_on(target, offset, FALSE);
}
/*!
* \internal
* \brief Check whether a graph action is allowed to be executed on a node
*
* \param[in] graph Transition graph being executed
* \param[in] action Graph action being executed
* \param[in] target Name of node where action should be executed
*
* \return true if action is allowed, otherwise false
*/
static bool
allowed_on_node(const pcmk__graph_t *graph, const pcmk__graph_action_t *action,
const char *target)
{
int limit = 0;
struct te_peer_s *r = NULL;
const char *task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
const char *id = crm_element_value(action->xml, XML_LRM_ATTR_TASK_KEY);
if(target == NULL) {
/* No limit on these */
return true;
} else if(te_targets == NULL) {
return false;
}
r = g_hash_table_lookup(te_targets, target);
limit = throttle_get_job_limit(target);
if(r == NULL) {
r = calloc(1, sizeof(struct te_peer_s));
r->name = strdup(target);
g_hash_table_insert(te_targets, r->name, r);
}
if(limit <= r->jobs) {
crm_trace("Peer %s is over their job limit of %d (%d): deferring %s",
target, limit, r->jobs, id);
return false;
} else if(graph->migration_limit > 0 && r->migrate_jobs >= graph->migration_limit) {
if (pcmk__strcase_any_of(task, CRMD_ACTION_MIGRATE, CRMD_ACTION_MIGRATED, NULL)) {
crm_trace("Peer %s is over their migration job limit of %d (%d): deferring %s",
target, graph->migration_limit, r->migrate_jobs, id);
return false;
}
}
crm_trace("Peer %s has not hit their limit yet. current jobs = %d limit= %d limit", target, r->jobs, limit);
return true;
}
/*!
* \internal
* \brief Check whether a graph action is allowed to be executed
*
* \param[in] graph Transition graph being executed
* \param[in] action Graph action being executed
*
* \return true if action is allowed, otherwise false
*/
static bool
graph_action_allowed(pcmk__graph_t *graph, pcmk__graph_action_t *action)
{
const char *target = NULL;
const char *task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
if (action->type != pcmk__rsc_graph_action) {
/* No limit on these */
return true;
}
/* If there is a router node, the action is being performed on a remote
* node. For now, we count all actions occurring on a remote node
* against the job list of the cluster node hosting the connection
* resource. */
target = crm_element_value(action->xml, XML_LRM_ATTR_ROUTER_NODE);
if ((target == NULL) && pcmk__strcase_any_of(task, CRMD_ACTION_MIGRATE,
CRMD_ACTION_MIGRATED, NULL)) {
target = crm_meta_value(action->params, XML_LRM_ATTR_MIGRATE_SOURCE);
if (!allowed_on_node(graph, action, target)) {
return false;
}
target = crm_meta_value(action->params, XML_LRM_ATTR_MIGRATE_TARGET);
} else if (target == NULL) {
target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
}
return allowed_on_node(graph, action, target);
}
/*!
* \brief Confirm a graph action (and optionally update graph)
*
* \param[in,out] action Action to confirm
* \param[in,out] graph Update and trigger this graph (if non-NULL)
*/
void
te_action_confirmed(pcmk__graph_action_t *action, pcmk__graph_t *graph)
{
if (!pcmk_is_set(action->flags, pcmk__graph_action_confirmed)) {
if ((action->type == pcmk__rsc_graph_action)
&& (crm_element_value(action->xml, XML_LRM_ATTR_TARGET) != NULL)) {
te_update_job_count(action, -1);
}
pcmk__set_graph_action_flags(action, pcmk__graph_action_confirmed);
}
if (graph) {
pcmk__update_graph(graph, action);
trigger_graph();
}
}
static pcmk__graph_functions_t te_graph_fns = {
execute_pseudo_action,
execute_rsc_action,
execute_cluster_action,
controld_execute_fence_action,
graph_action_allowed,
};
/*!
* \internal
* \brief Register the transitioner's graph functions with \p libpacemaker
*/
void
controld_register_graph_functions(void)
{
pcmk__set_graph_functions(&te_graph_fns);
}
void
notify_crmd(pcmk__graph_t *graph)
{
const char *type = "unknown";
enum crmd_fsa_input event = I_NULL;
crm_debug("Processing transition completion in state %s",
fsa_state2string(controld_globals.fsa_state));
CRM_CHECK(graph->complete, graph->complete = true);
switch (graph->completion_action) {
case pcmk__graph_wait:
type = "stop";
if (controld_globals.fsa_state == S_TRANSITION_ENGINE) {
event = I_TE_SUCCESS;
}
break;
case pcmk__graph_done:
type = "done";
if (controld_globals.fsa_state == S_TRANSITION_ENGINE) {
event = I_TE_SUCCESS;
}
break;
case pcmk__graph_restart:
type = "restart";
if (controld_globals.fsa_state == S_TRANSITION_ENGINE) {
if (controld_get_period_transition_timer() > 0) {
controld_stop_transition_timer();
controld_start_transition_timer();
} else {
event = I_PE_CALC;
}
} else if (controld_globals.fsa_state == S_POLICY_ENGINE) {
controld_set_fsa_action_flags(A_PE_INVOKE);
controld_trigger_fsa();
}
break;
case pcmk__graph_shutdown:
type = "shutdown";
if (pcmk_is_set(controld_globals.fsa_input_register, R_SHUTDOWN)) {
event = I_STOP;
} else {
crm_err("We didn't ask to be shut down, yet the scheduler is telling us to");
event = I_TERMINATE;
}
}
crm_debug("Transition %d status: %s - %s", graph->id, type,
pcmk__s(graph->abort_reason, "unspecified reason"));
graph->abort_reason = NULL;
graph->completion_action = pcmk__graph_done;
if (event != I_NULL) {
register_fsa_input(C_FSA_INTERNAL, event, NULL);
} else {
controld_trigger_fsa();
}
}
diff --git a/daemons/controld/controld_te_callbacks.c b/daemons/controld/controld_te_callbacks.c
index cc09778483..cf9de837a1 100644
--- a/daemons/controld/controld_te_callbacks.c
+++ b/daemons/controld/controld_te_callbacks.c
@@ -1,689 +1,689 @@
/*
* Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include /* For ONLINESTATUS etc */
#include
void te_update_confirm(const char *event, xmlNode * msg);
#define RSC_OP_PREFIX "//" XML_TAG_DIFF_ADDED "//" XML_TAG_CIB \
"//" XML_LRM_TAG_RSC_OP "[@" XML_ATTR_ID "='"
// An explicit shutdown-lock of 0 means the lock has been cleared
static bool
shutdown_lock_cleared(xmlNode *lrm_resource)
{
time_t shutdown_lock = 0;
return (crm_element_value_epoch(lrm_resource, XML_CONFIG_ATTR_SHUTDOWN_LOCK,
&shutdown_lock) == pcmk_ok)
&& (shutdown_lock == 0);
}
static void
te_update_diff_v1(const char *event, xmlNode *diff)
{
int lpc, max;
xmlXPathObject *xpathObj = NULL;
GString *rsc_op_xpath = NULL;
CRM_CHECK(diff != NULL, return);
pcmk__output_set_log_level(controld_globals.logger_out, LOG_TRACE);
controld_globals.logger_out->message(controld_globals.logger_out,
"xml-patchset", diff);
if (cib__config_changed_v1(NULL, NULL, &diff)) {
abort_transition(INFINITY, pcmk__graph_restart, "Non-status change",
diff);
goto bail; /* configuration changed */
}
/* Tickets Attributes - Added/Updated */
xpathObj =
xpath_search(diff,
"//" F_CIB_UPDATE_RESULT "//" XML_TAG_DIFF_ADDED "//" XML_CIB_TAG_TICKETS);
if (numXpathResults(xpathObj) > 0) {
xmlNode *aborted = getXpathResult(xpathObj, 0);
abort_transition(INFINITY, pcmk__graph_restart,
"Ticket attribute: update", aborted);
goto bail;
}
freeXpathObject(xpathObj);
/* Tickets Attributes - Removed */
xpathObj =
xpath_search(diff,
"//" F_CIB_UPDATE_RESULT "//" XML_TAG_DIFF_REMOVED "//" XML_CIB_TAG_TICKETS);
if (numXpathResults(xpathObj) > 0) {
xmlNode *aborted = getXpathResult(xpathObj, 0);
abort_transition(INFINITY, pcmk__graph_restart,
"Ticket attribute: removal", aborted);
goto bail;
}
freeXpathObject(xpathObj);
/* Transient Attributes - Removed */
xpathObj =
xpath_search(diff,
"//" F_CIB_UPDATE_RESULT "//" XML_TAG_DIFF_REMOVED "//"
XML_TAG_TRANSIENT_NODEATTRS);
if (numXpathResults(xpathObj) > 0) {
xmlNode *aborted = getXpathResult(xpathObj, 0);
abort_transition(INFINITY, pcmk__graph_restart,
"Transient attribute: removal", aborted);
goto bail;
}
freeXpathObject(xpathObj);
// Check for lrm_resource entries
xpathObj = xpath_search(diff,
"//" F_CIB_UPDATE_RESULT
"//" XML_TAG_DIFF_ADDED
"//" XML_LRM_TAG_RESOURCE);
max = numXpathResults(xpathObj);
/*
* Updates by, or in response to, graph actions will never affect more than
* one resource at a time, so such updates indicate an LRM refresh. In that
* case, start a new transition rather than check each result individually,
* which can result in _huge_ speedups in large clusters.
*
* Unfortunately, we can only do so when there are no pending actions.
* Otherwise, we could mistakenly throw away those results here, and
* the cluster will stall waiting for them and time out the operation.
*/
if ((controld_globals.transition_graph->pending == 0) && (max > 1)) {
crm_debug("Ignoring resource operation updates due to history refresh of %d resources",
max);
crm_log_xml_trace(diff, "lrm-refresh");
abort_transition(INFINITY, pcmk__graph_restart, "History refresh",
NULL);
goto bail;
}
if (max == 1) {
xmlNode *lrm_resource = getXpathResult(xpathObj, 0);
if (shutdown_lock_cleared(lrm_resource)) {
// @TODO would be more efficient to abort once after transition done
abort_transition(INFINITY, pcmk__graph_restart,
"Shutdown lock cleared", lrm_resource);
// Still process results, so we stop timers and update failcounts
}
}
freeXpathObject(xpathObj);
/* Process operation updates */
xpathObj =
xpath_search(diff,
"//" F_CIB_UPDATE_RESULT "//" XML_TAG_DIFF_ADDED "//" XML_LRM_TAG_RSC_OP);
max = numXpathResults(xpathObj);
if (max > 0) {
int lpc = 0;
for (lpc = 0; lpc < max; lpc++) {
xmlNode *rsc_op = getXpathResult(xpathObj, lpc);
const char *node = get_node_id(rsc_op);
process_graph_event(rsc_op, node);
}
}
freeXpathObject(xpathObj);
/* Detect deleted (as opposed to replaced or added) actions, for example crm_resource -C */
xpathObj = xpath_search(diff, "//" XML_TAG_DIFF_REMOVED "//" XML_LRM_TAG_RSC_OP);
max = numXpathResults(xpathObj);
for (lpc = 0; lpc < max; lpc++) {
const char *op_id = NULL;
xmlXPathObject *op_match = NULL;
xmlNode *match = getXpathResult(xpathObj, lpc);
CRM_LOG_ASSERT(match != NULL);
if (match == NULL) { continue; }
op_id = ID(match);
if (rsc_op_xpath == NULL) {
rsc_op_xpath = g_string_new(RSC_OP_PREFIX);
} else {
g_string_truncate(rsc_op_xpath, sizeof(RSC_OP_PREFIX) - 1);
}
pcmk__g_strcat(rsc_op_xpath, op_id, "']", NULL);
op_match = xpath_search(diff, (const char *) rsc_op_xpath->str);
if (numXpathResults(op_match) == 0) {
/* Prevent false positives by matching cancellations too */
const char *node = get_node_id(match);
pcmk__graph_action_t *cancelled = get_cancel_action(op_id, node);
if (cancelled == NULL) {
crm_debug("No match for deleted action %s (%s on %s)",
(const char *) rsc_op_xpath->str, op_id, node);
abort_transition(INFINITY, pcmk__graph_restart,
"Resource op removal", match);
freeXpathObject(op_match);
goto bail;
} else {
crm_debug("Deleted lrm_rsc_op %s on %s was for graph event %d",
op_id, node, cancelled->id);
}
}
freeXpathObject(op_match);
}
bail:
freeXpathObject(xpathObj);
if (rsc_op_xpath != NULL) {
g_string_free(rsc_op_xpath, TRUE);
}
}
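// Process each operation history entry in a single resource's diff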
static void
process_lrm_resource_diff(xmlNode *lrm_resource, const char *node)
{
for (xmlNode *rsc_op = pcmk__xml_first_child(lrm_resource); rsc_op != NULL;
rsc_op = pcmk__xml_next(rsc_op)) {
process_graph_event(rsc_op, node);
}
if (shutdown_lock_cleared(lrm_resource)) {
// @TODO would be more efficient to abort once after transition done
abort_transition(INFINITY, pcmk__graph_restart, "Shutdown lock cleared",
lrm_resource);
}
}
static void
process_resource_updates(const char *node, xmlNode *xml, xmlNode *change,
const char *op, const char *xpath)
{
xmlNode *rsc = NULL;
if (xml == NULL) {
return;
}
if (strcmp(TYPE(xml), XML_CIB_TAG_LRM) == 0) {
xml = first_named_child(xml, XML_LRM_TAG_RESOURCES);
CRM_CHECK(xml != NULL, return);
}
CRM_CHECK(strcmp(TYPE(xml), XML_LRM_TAG_RESOURCES) == 0, return);
/*
* Updates by, or in response to, TE actions will never contain updates
* for more than one resource at a time, so such updates indicate an
* LRM refresh.
*
* In that case, start a new transition rather than check each result
* individually, which can result in _huge_ speedups in large clusters.
*
* Unfortunately, we can only do so when there are no pending actions.
* Otherwise, we could mistakenly throw away those results here, and
* the cluster will stall waiting for them and time out the operation.
*/
if ((controld_globals.transition_graph->pending == 0)
&& (xml->children != NULL) && (xml->children->next != NULL)) {
crm_log_xml_trace(change, "lrm-refresh");
abort_transition(INFINITY, pcmk__graph_restart, "History refresh",
NULL);
return;
}
for (rsc = pcmk__xml_first_child(xml); rsc != NULL;
rsc = pcmk__xml_next(rsc)) {
crm_trace("Processing %s", ID(rsc));
process_lrm_resource_diff(rsc, node);
}
}
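// Copy the node ID from an XPath of the form ...node_state[@id='ID']...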
static char *extract_node_uuid(const char *xpath)
{
char *mutable_path = strdup(xpath);
char *node_uuid = NULL;
char *search = NULL;
char *match = NULL;
match = strstr(mutable_path, "node_state[@" XML_ATTR_ID "=\'");
if (match == NULL) {
free(mutable_path);
return NULL;
}
match += strlen("node_state[@" XML_ATTR_ID "=\'");
search = strchr(match, '\'');
if (search == NULL) {
free(mutable_path);
return NULL;
}
search[0] = 0;
node_uuid = strdup(match);
free(mutable_path);
return node_uuid;
}
static void
abort_unless_down(const char *xpath, const char *op, xmlNode *change,
const char *reason)
{
char *node_uuid = NULL;
pcmk__graph_action_t *down = NULL;
if(!pcmk__str_eq(op, "delete", pcmk__str_casei)) {
abort_transition(INFINITY, pcmk__graph_restart, reason, change);
return;
}
node_uuid = extract_node_uuid(xpath);
if(node_uuid == NULL) {
crm_err("Could not extract node ID from %s", xpath);
abort_transition(INFINITY, pcmk__graph_restart, reason, change);
return;
}
down = match_down_event(node_uuid);
if (down == NULL) {
crm_trace("Not expecting %s to be down (%s)", node_uuid, xpath);
abort_transition(INFINITY, pcmk__graph_restart, reason, change);
} else {
crm_trace("Expecting changes to %s (%s)", node_uuid, xpath);
}
free(node_uuid);
}
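// Handle removal of a single operation history entry from the CIB status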
static void
process_op_deletion(const char *xpath, xmlNode *change)
{
char *mutable_key = strdup(xpath);
char *key;
char *node_uuid;
// Extract the part of xpath between last pair of single quotes
key = strrchr(mutable_key, '\'');
if (key != NULL) {
*key = '\0';
key = strrchr(mutable_key, '\'');
}
if (key == NULL) {
crm_warn("Ignoring malformed CIB update (resource deletion of %s)",
xpath);
free(mutable_key);
return;
}
++key;
node_uuid = extract_node_uuid(xpath);
if (confirm_cancel_action(key, node_uuid) == FALSE) {
abort_transition(INFINITY, pcmk__graph_restart,
"Resource operation removal", change);
}
free(mutable_key);
free(node_uuid);
}
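// Dispatch a CIB deletion to the appropriate handler based on its XPath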
static void
process_delete_diff(const char *xpath, const char *op, xmlNode *change)
{
if (strstr(xpath, "/" XML_LRM_TAG_RSC_OP "[")) {
process_op_deletion(xpath, change);
} else if (strstr(xpath, "/" XML_CIB_TAG_LRM "[")) {
abort_unless_down(xpath, op, change, "Resource state removal");
} else if (strstr(xpath, "/" XML_CIB_TAG_STATE "[")) {
abort_unless_down(xpath, op, change, "Node state removal");
} else {
crm_trace("Ignoring delete of %s", xpath);
}
}
static void
process_node_state_diff(xmlNode *state, xmlNode *change, const char *op,
const char *xpath)
{
xmlNode *lrm = first_named_child(state, XML_CIB_TAG_LRM);
process_resource_updates(ID(state), lrm, change, op, xpath);
}
static void
process_status_diff(xmlNode *status, xmlNode *change, const char *op,
const char *xpath)
{
for (xmlNode *state = pcmk__xml_first_child(status); state != NULL;
state = pcmk__xml_next(state)) {
process_node_state_diff(state, change, op, xpath);
}
}
static void
process_cib_diff(xmlNode *cib, xmlNode *change, const char *op,
const char *xpath)
{
xmlNode *status = first_named_child(cib, XML_CIB_TAG_STATUS);
xmlNode *config = first_named_child(cib, XML_CIB_TAG_CONFIGURATION);
if (status) {
process_status_diff(status, change, op, xpath);
}
if (config) {
abort_transition(INFINITY, pcmk__graph_restart,
"Non-status-only change", change);
}
}
static void
te_update_diff_v2(xmlNode *diff)
{
crm_log_xml_trace(diff, "Patch:Raw");
for (xmlNode *change = pcmk__xml_first_child(diff); change != NULL;
change = pcmk__xml_next(change)) {
xmlNode *match = NULL;
const char *name = NULL;
const char *xpath = crm_element_value(change, XML_DIFF_PATH);
// Possible ops: create, modify, delete, move
const char *op = crm_element_value(change, XML_DIFF_OP);
// Ignore uninteresting updates
if (op == NULL) {
continue;
} else if (xpath == NULL) {
crm_trace("Ignoring %s change for version field", op);
continue;
} else if ((strcmp(op, "move") == 0)
&& (strstr(xpath,
"/" XML_TAG_CIB "/" XML_CIB_TAG_CONFIGURATION
"/" XML_CIB_TAG_RESOURCES) == NULL)) {
/* We still need to consider moves within the resources section,
* since they affect placement order.
*/
crm_trace("Ignoring move change at %s", xpath);
continue;
}
// Find the result of create/modify ops
if (strcmp(op, "create") == 0) {
match = change->children;
} else if (strcmp(op, "modify") == 0) {
match = first_named_child(change, XML_DIFF_RESULT);
if(match) {
match = match->children;
}
} else if (!pcmk__str_any_of(op, "delete", "move", NULL)) {
crm_warn("Ignoring malformed CIB update (%s operation on %s is unrecognized)",
op, xpath);
continue;
}
if (match) {
if (match->type == XML_COMMENT_NODE) {
crm_trace("Ignoring %s operation for comment at %s", op, xpath);
continue;
}
name = (const char *)match->name;
}
crm_trace("Handling %s operation for %s%s%s",
op, (xpath? xpath : "CIB"),
(name? " matched by " : ""), (name? name : ""));
if (strstr(xpath, "/" XML_TAG_CIB "/" XML_CIB_TAG_CONFIGURATION)) {
abort_transition(INFINITY, pcmk__graph_restart,
"Configuration change", change);
break; // Won't be packaged with operation results we may be waiting for
} else if (strstr(xpath, "/" XML_CIB_TAG_TICKETS)
|| pcmk__str_eq(name, XML_CIB_TAG_TICKETS, pcmk__str_none)) {
abort_transition(INFINITY, pcmk__graph_restart,
"Ticket attribute change", change);
break; // Won't be packaged with operation results we may be waiting for
} else if (strstr(xpath, "/" XML_TAG_TRANSIENT_NODEATTRS "[")
|| pcmk__str_eq(name, XML_TAG_TRANSIENT_NODEATTRS,
pcmk__str_none)) {
abort_unless_down(xpath, op, change, "Transient attribute change");
break; // Won't be packaged with operation results we may be waiting for
} else if (strcmp(op, "delete") == 0) {
process_delete_diff(xpath, op, change);
} else if (name == NULL) {
crm_warn("Ignoring malformed CIB update (%s at %s has no result)",
op, xpath);
} else if (strcmp(name, XML_TAG_CIB) == 0) {
process_cib_diff(match, change, op, xpath);
} else if (strcmp(name, XML_CIB_TAG_STATUS) == 0) {
process_status_diff(match, change, op, xpath);
} else if (strcmp(name, XML_CIB_TAG_STATE) == 0) {
process_node_state_diff(match, change, op, xpath);
} else if (strcmp(name, XML_CIB_TAG_LRM) == 0) {
process_resource_updates(ID(match), match, change, op, xpath);
} else if (strcmp(name, XML_LRM_TAG_RESOURCES) == 0) {
char *local_node = pcmk__xpath_node_id(xpath, "lrm");
process_resource_updates(local_node, match, change, op, xpath);
free(local_node);
} else if (strcmp(name, XML_LRM_TAG_RESOURCE) == 0) {
char *local_node = pcmk__xpath_node_id(xpath, "lrm");
process_lrm_resource_diff(match, local_node);
free(local_node);
} else if (strcmp(name, XML_LRM_TAG_RSC_OP) == 0) {
char *local_node = pcmk__xpath_node_id(xpath, "lrm");
process_graph_event(match, local_node);
free(local_node);
} else {
crm_warn("Ignoring malformed CIB update (%s at %s has unrecognized result %s)",
op, xpath, name);
}
}
}
void
te_update_diff(const char *event, xmlNode * msg)
{
xmlNode *diff = NULL;
const char *op = NULL;
int rc = -EINVAL;
int format = 1;
int p_add[] = { 0, 0, 0 };
int p_del[] = { 0, 0, 0 };
CRM_CHECK(msg != NULL, return);
crm_element_value_int(msg, F_CIB_RC, &rc);
if (controld_globals.transition_graph == NULL) {
crm_trace("No graph");
return;
} else if (rc < pcmk_ok) {
crm_trace("Filter rc=%d (%s)", rc, pcmk_strerror(rc));
return;
} else if (controld_globals.transition_graph->complete
&& (controld_globals.fsa_state != S_IDLE)
&& (controld_globals.fsa_state != S_TRANSITION_ENGINE)
&& (controld_globals.fsa_state != S_POLICY_ENGINE)) {
crm_trace("Filter state=%s (complete)",
fsa_state2string(controld_globals.fsa_state));
return;
}
op = crm_element_value(msg, F_CIB_OPERATION);
diff = get_message_xml(msg, F_CIB_UPDATE_RESULT);
xml_patch_versions(diff, p_add, p_del);
crm_debug("Processing (%s) diff: %d.%d.%d -> %d.%d.%d (%s)", op,
p_del[0], p_del[1], p_del[2], p_add[0], p_add[1], p_add[2],
fsa_state2string(controld_globals.fsa_state));
crm_element_value_int(diff, "format", &format);
switch (format) {
case 1:
te_update_diff_v1(event, diff);
break;
case 2:
te_update_diff_v2(diff);
break;
default:
crm_warn("Ignoring malformed CIB update (unknown patch format %d)",
format);
}
- controld_destroy_outside_event_table();
+ controld_remove_all_outside_events();
}
void
process_te_message(xmlNode * msg, xmlNode * xml_data)
{
const char *value = NULL;
xmlXPathObject *xpathObj = NULL;
int nmatches = 0;
CRM_CHECK(msg != NULL, return);
// Transition requests must specify transition engine as subsystem
value = crm_element_value(msg, F_CRM_SYS_TO);
if (pcmk__str_empty(value)
|| !pcmk__str_eq(value, CRM_SYSTEM_TENGINE, pcmk__str_none)) {
crm_info("Received invalid transition request: subsystem '%s' not '"
CRM_SYSTEM_TENGINE "'", pcmk__s(value, ""));
return;
}
// Only the lrm_invoke command is supported as a transition request
value = crm_element_value(msg, F_CRM_TASK);
if (!pcmk__str_eq(value, CRM_OP_INVOKE_LRM, pcmk__str_none)) {
crm_info("Received invalid transition request: command '%s' not '"
CRM_OP_INVOKE_LRM "'", pcmk__s(value, ""));
return;
}
// Transition requests must be marked as coming from the executor
value = crm_element_value(msg, F_CRM_SYS_FROM);
if (!pcmk__str_eq(value, CRM_SYSTEM_LRMD, pcmk__str_none)) {
crm_info("Received invalid transition request: from '%s' not '"
CRM_SYSTEM_LRMD "'", pcmk__s(value, ""));
return;
}
crm_debug("Processing transition request with ref='%s' origin='%s'",
pcmk__s(crm_element_value(msg, F_CRM_REFERENCE), ""),
pcmk__s(crm_element_value(msg, F_ORIG), ""));
xpathObj = xpath_search(xml_data, "//" XML_LRM_TAG_RSC_OP);
nmatches = numXpathResults(xpathObj);
if (nmatches == 0) {
crm_err("Received transition request with no results (bug?)");
} else {
for (int lpc = 0; lpc < nmatches; lpc++) {
xmlNode *rsc_op = getXpathResult(xpathObj, lpc);
const char *node = get_node_id(rsc_op);
process_graph_event(rsc_op, node);
}
}
freeXpathObject(xpathObj);
}
void
cib_action_updated(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
if (rc < pcmk_ok) {
crm_err("Update %d FAILED: %s", call_id, pcmk_strerror(rc));
}
}
/*!
* \brief Handle a timeout in node-to-node communication
*
* \param[in,out] data Pointer to graph action
*
* \return FALSE (indicating that the source should not be re-added)
*/
gboolean
action_timer_callback(gpointer data)
{
pcmk__graph_action_t *action = (pcmk__graph_action_t *) data;
const char *task = NULL;
const char *on_node = NULL;
const char *via_node = NULL;
CRM_CHECK(data != NULL, return FALSE);
stop_te_timer(action);
task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
on_node = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
via_node = crm_element_value(action->xml, XML_LRM_ATTR_ROUTER_NODE);
if (controld_globals.transition_graph->complete) {
crm_notice("Node %s did not send %s result (via %s) within %dms "
"(ignoring because transition not in progress)",
(on_node? on_node : ""), (task? task : "unknown action"),
(via_node? via_node : "controller"), action->timeout);
} else {
/* fail the action */
crm_err("Node %s did not send %s result (via %s) within %dms "
"(action timeout plus cluster-delay)",
(on_node? on_node : ""), (task? task : "unknown action"),
(via_node? via_node : "controller"),
(action->timeout
+ controld_globals.transition_graph->network_delay));
pcmk__log_graph_action(LOG_ERR, action);
pcmk__set_graph_action_flags(action, pcmk__graph_action_failed);
te_action_confirmed(action, controld_globals.transition_graph);
abort_transition(INFINITY, pcmk__graph_restart, "Action lost", NULL);
// Record timeout in the CIB if appropriate
if ((action->type == pcmk__rsc_graph_action)
&& controld_action_is_recordable(task)) {
controld_record_action_timeout(action);
}
}
return FALSE;
}
diff --git a/daemons/controld/controld_te_events.c b/daemons/controld/controld_te_events.c
index 1dfa4ceaf2..d4e2b0fe86 100644
--- a/daemons/controld/controld_te_events.c
+++ b/daemons/controld/controld_te_events.c
@@ -1,589 +1,601 @@
/*
- * Copyright 2004-2022 the Pacemaker project contributors
+ * Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include
#include
#include
#include
#include
#include
#include
/*!
* \internal
* \brief Action numbers of outside events processed in current update diff
*
* This table is to be used as a set. It should be empty when the transitioner
* begins processing a CIB update diff. It ensures that if there are multiple
* events (for example, "_last_0" and "_last_failure_0") for the same action,
* only one of them updates the failcount. Events that originate outside the
* cluster can't be confirmed, since they're not in the transition graph.
*/
static GHashTable *outside_events = NULL;
+/*!
+ * \internal
+ * \brief Empty the hash table containing action numbers of outside events
+ */
+void
+controld_remove_all_outside_events(void)
+{
+ if (outside_events != NULL) {
+ g_hash_table_remove_all(outside_events);
+ }
+}
+
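/* Per the table documentation above, the set is expected to be emptied each
 * time the transitioner begins processing a new CIB update diff, e.g. (call
 * site hypothetical):
 *
 *   controld_remove_all_outside_events();
 *   // ... then walk the diff, feeding each event to process_graph_event()
 */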
/*!
* \internal
* \brief Destroy the hash table containing action numbers of outside events
*/
void
-controld_destroy_outside_event_table(void)
+controld_destroy_outside_events_table(void)
{
if (outside_events != NULL) {
g_hash_table_destroy(outside_events);
outside_events = NULL;
}
}
/*!
* \internal
* \brief Add an outside event's action number to a set
*
* \return Standard Pacemaker return code. Specifically, \p pcmk_rc_ok if the
* event was not already in the set, or \p pcmk_rc_already otherwise.
*/
static int
record_outside_event(gint action_num)
{
if (outside_events == NULL) {
outside_events = g_hash_table_new(NULL, NULL);
}
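/* The NULL hash/equal functions above select g_direct_hash()/g_direct_equal(),
 * which suits keys that are small integers packed into pointers with
 * GINT_TO_POINTER().
 */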
if (g_hash_table_add(outside_events, GINT_TO_POINTER(action_num))) {
return pcmk_rc_ok;
}
return pcmk_rc_already;
}
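/* Usage sketch showing the set semantics:
 *
 *   record_outside_event(42);  // pcmk_rc_ok: 42 newly added
 *   record_outside_event(42);  // pcmk_rc_already: 42 was already present
 */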
gboolean
fail_incompletable_actions(pcmk__graph_t *graph, const char *down_node)
{
const char *target_uuid = NULL;
const char *router = NULL;
const char *router_uuid = NULL;
xmlNode *last_action = NULL;
GList *gIter = NULL;
GList *gIter2 = NULL;
if (graph == NULL || graph->complete) {
return FALSE;
}
gIter = graph->synapses;
for (; gIter != NULL; gIter = gIter->next) {
pcmk__graph_synapse_t *synapse = (pcmk__graph_synapse_t *) gIter->data;
if (pcmk_any_flags_set(synapse->flags, pcmk__synapse_confirmed|pcmk__synapse_failed)) {
/* We've already been here */
continue;
}
gIter2 = synapse->actions;
for (; gIter2 != NULL; gIter2 = gIter2->next) {
pcmk__graph_action_t *action = (pcmk__graph_action_t *) gIter2->data;
if ((action->type == pcmk__pseudo_graph_action)
|| pcmk_is_set(action->flags, pcmk__graph_action_confirmed)) {
continue;
} else if (action->type == pcmk__cluster_graph_action) {
const char *task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
if (pcmk__str_eq(task, CRM_OP_FENCE, pcmk__str_casei)) {
continue;
}
}
target_uuid = crm_element_value(action->xml, XML_LRM_ATTR_TARGET_UUID);
router = crm_element_value(action->xml, XML_LRM_ATTR_ROUTER_NODE);
if (router) {
crm_node_t *node = crm_get_peer(0, router);
if (node) {
router_uuid = node->uuid;
}
}
if (pcmk__str_eq(target_uuid, down_node, pcmk__str_casei) || pcmk__str_eq(router_uuid, down_node, pcmk__str_casei)) {
pcmk__set_graph_action_flags(action, pcmk__graph_action_failed);
pcmk__set_synapse_flags(synapse, pcmk__synapse_failed);
last_action = action->xml;
stop_te_timer(action);
pcmk__update_graph(graph, action);
if (pcmk_is_set(synapse->flags, pcmk__synapse_executed)) {
crm_notice("Action %d (%s) was pending on %s (offline)",
action->id, crm_element_value(action->xml, XML_LRM_ATTR_TASK_KEY), down_node);
} else {
crm_info("Action %d (%s) is scheduled for %s (offline)",
action->id, crm_element_value(action->xml, XML_LRM_ATTR_TASK_KEY), down_node);
}
}
}
}
if (last_action != NULL) {
crm_info("Node %s shutdown resulted in un-runnable actions", down_node);
abort_transition(INFINITY, pcmk__graph_restart, "Node failure",
last_action);
return TRUE;
}
return FALSE;
}
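/* A typical caller reacts to peer loss, e.g. (hypothetical call site):
 *
 *   if (fail_incompletable_actions(controld_globals.transition_graph,
 *                                  node->uuid)) {
 *       // the transition was aborted; a new graph will be computed
 *   }
 */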
/*!
* \internal
* \brief Update failure-related node attributes if warranted
*
* \param[in] event XML describing operation that (maybe) failed
* \param[in] event_node_uuid Node that event occurred on
* \param[in] rc Actual operation return code
* \param[in] target_rc Expected operation return code
* \param[in] do_update If TRUE, do update regardless of operation type
* \param[in] ignore_failures If TRUE, update last failure but not fail count
*
* \return TRUE if this was not a direct nack, a success, or an LRM status
*         refresh
*/
static gboolean
update_failcount(const xmlNode *event, const char *event_node_uuid, int rc,
int target_rc, gboolean do_update, gboolean ignore_failures)
{
guint interval_ms = 0;
char *task = NULL;
char *rsc_id = NULL;
const char *value = NULL;
const char *id = crm_element_value(event, XML_LRM_ATTR_TASK_KEY);
const char *on_uname = crm_peer_uname(event_node_uuid);
const char *origin = crm_element_value(event, XML_ATTR_ORIGIN);
// Nothing needs to be done for success or status refresh
if (rc == target_rc) {
return FALSE;
} else if (pcmk__str_eq(origin, "build_active_RAs", pcmk__str_casei)) {
crm_debug("No update for %s (rc=%d) on %s: Old failure from lrm status refresh",
id, rc, on_uname);
return FALSE;
}
/* Sanity check */
CRM_CHECK(on_uname != NULL, return TRUE);
CRM_CHECK(parse_op_key(id, &rsc_id, &task, &interval_ms),
crm_err("Couldn't parse: %s", ID(event)); goto bail);
/* Decide whether update is necessary and what value to use */
if ((interval_ms > 0)
|| pcmk__str_eq(task, CRMD_ACTION_PROMOTE, pcmk__str_none)
|| pcmk__str_eq(task, CRMD_ACTION_DEMOTE, pcmk__str_none)) {
do_update = TRUE;
} else if (pcmk__str_eq(task, CRMD_ACTION_START, pcmk__str_none)) {
do_update = TRUE;
value = pcmk__s(controld_globals.transition_graph->failed_start_offset,
CRM_INFINITY_S);
} else if (pcmk__str_eq(task, CRMD_ACTION_STOP, pcmk__str_none)) {
do_update = TRUE;
value = pcmk__s(controld_globals.transition_graph->failed_stop_offset,
CRM_INFINITY_S);
}
if (do_update) {
pcmk__attrd_query_pair_t *fail_pair = NULL;
pcmk__attrd_query_pair_t *last_pair = NULL;
char *fail_name = NULL;
char *last_name = NULL;
GList *attrs = NULL;
uint32_t opts = pcmk__node_attr_none;
char *now = pcmk__ttoa(time(NULL));
// Fail count will be either incremented or set to infinity
if (!pcmk_str_is_infinity(value)) {
value = XML_NVPAIR_ATTR_VALUE "++";
}
if (g_hash_table_lookup(crm_remote_peer_cache, event_node_uuid)) {
opts |= pcmk__node_attr_remote;
}
crm_info("Updating %s for %s on %s after failed %s: rc=%d (update=%s, time=%s)",
(ignore_failures? "last failure" : "failcount"),
rsc_id, on_uname, task, rc, value, now);
/* Update the fail count, if we're not ignoring failures */
if (!ignore_failures) {
fail_pair = calloc(1, sizeof(pcmk__attrd_query_pair_t));
CRM_ASSERT(fail_pair != NULL);
fail_name = pcmk__failcount_name(rsc_id, task, interval_ms);
fail_pair->name = fail_name;
fail_pair->value = value;
fail_pair->node = on_uname;
attrs = g_list_prepend(attrs, fail_pair);
}
/* Update the last failure time (even if we're ignoring failures,
* so that failure can still be detected and shown, e.g. by crm_mon)
*/
last_pair = calloc(1, sizeof(pcmk__attrd_query_pair_t));
CRM_ASSERT(last_pair != NULL);
last_name = pcmk__lastfailure_name(rsc_id, task, interval_ms);
last_pair->name = last_name;
last_pair->value = now;
last_pair->node = on_uname;
attrs = g_list_prepend(attrs, last_pair);
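/* attrs now holds the last-failure pair and, unless failures are being
 * ignored, the fail-count pair; both are pushed to the attribute manager
 * in a single request. */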
update_attrd_list(attrs, opts);
free(fail_name);
free(fail_pair);
free(last_name);
free(last_pair);
g_list_free(attrs);
free(now);
}
bail:
free(rsc_id);
free(task);
return TRUE;
}
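/* For reference, the attribute names built above look like this for a failed
 * recurring 10-second monitor of a resource named "myrsc" (values
 * illustrative):
 *
 *   pcmk__failcount_name("myrsc", "monitor", 10000)
 *       => "fail-count-myrsc#monitor_10000"
 *   pcmk__lastfailure_name("myrsc", "monitor", 10000)
 *       => "last-failure-myrsc#monitor_10000"
 */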
pcmk__graph_action_t *
controld_get_action(int id)
{
for (GList *item = controld_globals.transition_graph->synapses;
item != NULL; item = item->next) {
pcmk__graph_synapse_t *synapse = (pcmk__graph_synapse_t *) item->data;
for (GList *item2 = synapse->actions; item2; item2 = item2->next) {
pcmk__graph_action_t *action = (pcmk__graph_action_t *) item2->data;
if (action->id == id) {
return action;
}
}
}
return NULL;
}
pcmk__graph_action_t *
get_cancel_action(const char *id, const char *node)
{
GList *gIter = NULL;
GList *gIter2 = NULL;
gIter = controld_globals.transition_graph->synapses;
for (; gIter != NULL; gIter = gIter->next) {
pcmk__graph_synapse_t *synapse = (pcmk__graph_synapse_t *) gIter->data;
gIter2 = synapse->actions;
for (; gIter2 != NULL; gIter2 = gIter2->next) {
const char *task = NULL;
const char *target = NULL;
pcmk__graph_action_t *action = (pcmk__graph_action_t *) gIter2->data;
task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
if (!pcmk__str_eq(CRMD_ACTION_CANCEL, task, pcmk__str_casei)) {
continue;
}
task = crm_element_value(action->xml, XML_LRM_ATTR_TASK_KEY);
if (!pcmk__str_eq(task, id, pcmk__str_casei)) {
crm_trace("Wrong key %s for %s on %s", task, id, node);
continue;
}
target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET_UUID);
if (node && !pcmk__str_eq(target, node, pcmk__str_casei)) {
crm_trace("Wrong node %s for %s on %s", target, id, node);
continue;
}
crm_trace("Found %s on %s", id, node);
return action;
}
}
return NULL;
}
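/* Example (hypothetical values): look up the pending cancellation of a
 * recurring monitor, matching on operation key and target node UUID:
 *
 *   pcmk__graph_action_t *cancel =
 *       get_cancel_action("myrsc_monitor_10000", node_uuid);
 */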
bool
confirm_cancel_action(const char *id, const char *node_id)
{
const char *op_key = NULL;
const char *node_name = NULL;
pcmk__graph_action_t *cancel = get_cancel_action(id, node_id);
if (cancel == NULL) {
return FALSE;
}
op_key = crm_element_value(cancel->xml, XML_LRM_ATTR_TASK_KEY);
node_name = crm_element_value(cancel->xml, XML_LRM_ATTR_TARGET);
stop_te_timer(cancel);
te_action_confirmed(cancel, controld_globals.transition_graph);
crm_info("Cancellation of %s on %s confirmed (action %d)",
op_key, node_name, cancel->id);
return TRUE;
}
/* downed nodes are listed like: <downed> <node id="UUID1"/> ... </downed> */
#define XPATH_DOWNED "//" XML_GRAPH_TAG_DOWNED \
"/" XML_CIB_TAG_NODE "[@" XML_ATTR_ID "='%s']"
/*!
* \brief Find a transition event that would have made a specified node down
*
* \param[in] target UUID of node to match
*
* \return Matching event if found, NULL otherwise
*/
pcmk__graph_action_t *
match_down_event(const char *target)
{
pcmk__graph_action_t *match = NULL;
xmlXPathObjectPtr xpath_ret = NULL;
GList *gIter, *gIter2;
char *xpath = crm_strdup_printf(XPATH_DOWNED, target);
for (gIter = controld_globals.transition_graph->synapses;
gIter != NULL && match == NULL;
gIter = gIter->next) {
for (gIter2 = ((pcmk__graph_synapse_t * ) gIter->data)->actions;
gIter2 != NULL && match == NULL;
gIter2 = gIter2->next) {
match = (pcmk__graph_action_t *) gIter2->data;
if (pcmk_is_set(match->flags, pcmk__graph_action_executed)) {
xpath_ret = xpath_search(match->xml, xpath);
if (numXpathResults(xpath_ret) < 1) {
match = NULL;
}
freeXpathObject(xpath_ret);
} else {
// Only actions that were actually started can match
match = NULL;
}
}
}
free(xpath);
if (match != NULL) {
crm_debug("Shutdown action %d (%s) found for node %s", match->id,
crm_element_value(match->xml, XML_LRM_ATTR_TASK_KEY), target);
} else {
crm_debug("No reason to expect node %s to be down", target);
}
return match;
}
void
process_graph_event(xmlNode *event, const char *event_node)
{
int rc = -1; // Actual result
int target_rc = -1; // Expected result
int status = -1; // Executor status
int callid = -1; // Executor call ID
int transition_num = -1; // Transition number
int action_num = -1; // Action number within transition
char *update_te_uuid = NULL;
bool ignore_failures = FALSE;
const char *id = NULL;
const char *desc = NULL;
const char *magic = NULL;
const char *uname = NULL;
CRM_ASSERT(event != NULL);
/* The event is an <lrm_rsc_op> resource history entry; the attributes read
 * below include its transition key (of the form
 * <action>:<transition>:<target_rc>:<te_uuid>), op-status, rc-code, call-id,
 * and operation key.
 */
magic = crm_element_value(event, XML_ATTR_TRANSITION_KEY);
if (magic == NULL) {
/* non-change */
return;
}
crm_element_value_int(event, XML_LRM_ATTR_OPSTATUS, &status);
if (status == PCMK_EXEC_PENDING) {
return;
}
id = crm_element_value(event, XML_LRM_ATTR_TASK_KEY);
crm_element_value_int(event, XML_LRM_ATTR_RC, &rc);
crm_element_value_int(event, XML_LRM_ATTR_CALLID, &callid);
rc = pcmk__effective_rc(rc);
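/* pcmk__effective_rc() maps "degraded" agent results to their successful
 * equivalents, so a degraded-but-running resource is not counted as a
 * failure below. */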
if (decode_transition_key(magic, &update_te_uuid, &transition_num,
&action_num, &target_rc) == FALSE) {
// decode_transition_key() already logged the bad key
crm_err("Can't process action %s result: Incompatible versions? "
CRM_XS " call-id=%d", id, callid);
abort_transition(INFINITY, pcmk__graph_restart, "Bad event", event);
return;
}
if (transition_num == -1) {
// E.g. crm_resource --fail
if (record_outside_event(action_num) != pcmk_rc_ok) {
crm_debug("Outside event with transition key '%s' has already been "
"processed", magic);
goto bail;
}
desc = "initiated outside of the cluster";
abort_transition(INFINITY, pcmk__graph_restart, "Unexpected event",
event);
} else if ((action_num < 0)
|| !pcmk__str_eq(update_te_uuid, controld_globals.te_uuid,
pcmk__str_none)) {
desc = "initiated by a different DC";
abort_transition(INFINITY, pcmk__graph_restart, "Foreign event", event);
} else if ((controld_globals.transition_graph->id != transition_num)
|| controld_globals.transition_graph->complete) {
// Action is not from currently active transition
guint interval_ms = 0;
if (parse_op_key(id, NULL, NULL, &interval_ms)
&& (interval_ms != 0)) {
/* Recurring actions have the transition number they were first
* scheduled in.
*/
if (status == PCMK_EXEC_CANCELLED) {
confirm_cancel_action(id, get_node_id(event));
goto bail;
}
desc = "arrived after initial scheduling";
abort_transition(INFINITY, pcmk__graph_restart,
"Change in recurring result", event);
} else if (controld_globals.transition_graph->id != transition_num) {
desc = "arrived really late";
abort_transition(INFINITY, pcmk__graph_restart, "Old event", event);
} else {
desc = "arrived late";
abort_transition(INFINITY, pcmk__graph_restart, "Inactive graph",
event);
}
} else {
// Event is result of an action from currently active transition
pcmk__graph_action_t *action = controld_get_action(action_num);
if (action == NULL) {
// Should never happen
desc = "unknown";
abort_transition(INFINITY, pcmk__graph_restart, "Unknown event",
event);
} else if (pcmk_is_set(action->flags, pcmk__graph_action_confirmed)) {
/* Nothing further needs to be done if the action has already been
* confirmed. This can happen, e.g., when processing both an
* "xxx_last_0" or "xxx_last_failure_0" record and the main
* history record, which would otherwise result in incorrectly
* bumping the fail count twice.
*/
crm_log_xml_debug(event, "Event already confirmed:");
goto bail;
} else {
/* An action result needs to be confirmed.
* (This is the only case where desc == NULL.)
*/
if (pcmk__str_eq(crm_meta_value(action->params, XML_OP_ATTR_ON_FAIL), "ignore", pcmk__str_casei)) {
ignore_failures = TRUE;
} else if (rc != target_rc) {
pcmk__set_graph_action_flags(action, pcmk__graph_action_failed);
}
stop_te_timer(action);
te_action_confirmed(action, controld_globals.transition_graph);
if (pcmk_is_set(action->flags, pcmk__graph_action_failed)) {
abort_transition(action->synapse->priority + 1,
pcmk__graph_restart, "Event failed", event);
}
}
}
if (id == NULL) {
id = "unknown action";
}
uname = crm_element_value(event, XML_LRM_ATTR_TARGET);
if (uname == NULL) {
uname = "unknown node";
}
if (status == PCMK_EXEC_INVALID) {
// We couldn't attempt the action
crm_info("Transition %d action %d (%s on %s): %s",
transition_num, action_num, id, uname,
pcmk_exec_status_str(status));
} else if (desc && update_failcount(event, event_node, rc, target_rc,
(transition_num == -1), FALSE)) {
crm_notice("Transition %d action %d (%s on %s): expected '%s' but got '%s' "
CRM_XS " target-rc=%d rc=%d call-id=%d event='%s'",
transition_num, action_num, id, uname,
services_ocf_exitcode_str(target_rc),
services_ocf_exitcode_str(rc),
target_rc, rc, callid, desc);
} else if (desc) {
crm_info("Transition %d action %d (%s on %s): %s "
CRM_XS " rc=%d target-rc=%d call-id=%d",
transition_num, action_num, id, uname,
desc, rc, target_rc, callid);
} else if (rc == target_rc) {
crm_info("Transition %d action %d (%s on %s) confirmed: %s "
CRM_XS " rc=%d call-id=%d",
transition_num, action_num, id, uname,
services_ocf_exitcode_str(rc), rc, callid);
} else {
update_failcount(event, event_node, rc, target_rc,
(transition_num == -1), ignore_failures);
crm_notice("Transition %d action %d (%s on %s): expected '%s' but got '%s' "
CRM_XS " target-rc=%d rc=%d call-id=%d",
transition_num, action_num, id, uname,
services_ocf_exitcode_str(target_rc),
services_ocf_exitcode_str(rc),
target_rc, rc, callid);
}
bail:
free(update_te_uuid);
}
diff --git a/daemons/controld/controld_transition.h b/daemons/controld/controld_transition.h
index 3a54704820..2da4221901 100644
--- a/daemons/controld/controld_transition.h
+++ b/daemons/controld/controld_transition.h
@@ -1,62 +1,63 @@
/*
- * Copyright 2004-2022 the Pacemaker project contributors
+ * Copyright 2004-2023 the Pacemaker project contributors
*
* This source code is licensed under the GNU Lesser General Public License
* version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY.
*/
#ifndef TENGINE__H
# define TENGINE__H
# include
# include
# include
# include
/* tengine */
pcmk__graph_action_t *match_down_event(const char *target);
pcmk__graph_action_t *get_cancel_action(const char *id, const char *node);
bool confirm_cancel_action(const char *id, const char *node_id);
void controld_record_action_timeout(pcmk__graph_action_t *action);
-void controld_destroy_outside_event_table(void);
+void controld_destroy_outside_events_table(void);
+void controld_remove_all_outside_events(void);
gboolean fail_incompletable_actions(pcmk__graph_t *graph, const char *down_node);
void process_graph_event(xmlNode *event, const char *event_node);
/* utils */
pcmk__graph_action_t *controld_get_action(int id);
gboolean stop_te_timer(pcmk__graph_action_t *action);
const char *get_rsc_state(const char *task, enum pcmk_exec_status status);
void process_te_message(xmlNode *msg, xmlNode *xml_data);
void controld_register_graph_functions(void);
void notify_crmd(pcmk__graph_t * graph);
void cib_action_updated(xmlNode *msg, int call_id, int rc, xmlNode *output,
void *user_data);
gboolean action_timer_callback(gpointer data);
void te_update_diff(const char *event, xmlNode *msg);
void controld_init_transition_trigger(void);
void controld_destroy_transition_trigger(void);
void controld_trigger_graph_as(const char *fn, int line);
void abort_after_delay(int abort_priority, enum pcmk__graph_next abort_action,
const char *abort_text, guint delay_ms);
void abort_transition_graph(int abort_priority,
enum pcmk__graph_next abort_action,
const char *abort_text, const xmlNode *reason,
const char *fn, int line);
# define trigger_graph() controld_trigger_graph_as(__func__, __LINE__)
# define abort_transition(pri, action, text, reason) \
abort_transition_graph(pri, action, text, reason,__func__,__LINE__);
void te_action_confirmed(pcmk__graph_action_t *action, pcmk__graph_t *graph);
void te_reset_job_counts(void);
#endif
diff --git a/daemons/controld/controld_utils.c b/daemons/controld/controld_utils.c
index c22e26e427..4ce09d9add 100644
--- a/daemons/controld/controld_utils.c
+++ b/daemons/controld/controld_utils.c
@@ -1,840 +1,837 @@
/*
- * Copyright 2004-2022 the Pacemaker project contributors
+ * Copyright 2004-2023 the Pacemaker project contributors
*
* The version control history for this file may have further details.
*
* This source code is licensed under the GNU General Public License version 2
* or later (GPLv2+) WITHOUT ANY WARRANTY.
*/
#include
#include
#include