diff --git a/ChangeLog b/ChangeLog
index 4843fcfc9c..6807e26497 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,1732 +1,1832 @@
+* Thu Feb 13 2014 David Vossel <dvossel@redhat.com> Pacemaker-1.1.11-1
+- Update source tarball to revision: 33f9d09
+- Changesets: 462
+- Diff: 147 files changed, 6810 insertions(+), 4057 deletions(-)
+
+- Features added since Pacemaker-1.1.10
+
+ + attrd: A truly atomic version of attrd for use where CPG is used for cluster communication
+ + cib: Allow values to be added/updated and removed in a single update
+ + cib: Support XML comments in diffs
+ + Core: Allow blackbox logging to be disabled with SIGUSR2
+ + crmd: Do not block on proxied calls from pacemaker_remoted
+ + crmd: Enable cluster-wide throttling when the cib heavily exceeds its target load
+ + crmd: Make the per-node action limit directly configurable in the CIB
+ + crmd: Slow down recovery on nodes with IO load
+ + crmd: Track CPU usage on cluster nodes and slow down recovery on nodes with high CPU/IO load
+ + crm_mon: add --hide-headers option to hide all headers
+ + crm_node: Display partition output in sorted order
+ + crm_report: Collect logs directly from journald if available
+ + Fencing: On timeout, clean up the agent's entire process group
+ + Fencing: Support agents that need the host to be unfenced at startup
+ + ipc: Raise the default buffer size to 128k
+ + PE: Add a special attribute for distinguishing between real nodes and containers in constraint rules
+ + PE: Allow location constraints to take a regex pattern to match against resource IDs
+ + pengine: Distinguish between the agent being missing and something the agent needs being missing
+ + remote: Properly version the remote connection protocol
+
+- Changes since Pacemaker-1.1.10
+
+ + Bug rhbz#1011618 - Consistently use 'Slave' as the role for unpromoted master/slave resources
+ + Bug rhbz#1057697 - Use native DBus library for systemd and upstart support to avoid problematic use of threads
+ + attrd: Any variable called 'cluster' makes the daemon crash before reaching main()
+ + attrd: Avoid infinite write loop for unknown peers
+ + attrd: Drop all attributes for peers that left the cluster
+ + attrd: Give remote nodes the ability to set attributes with attrd
+ + attrd: Prevent inflation of attribute dampen intervals
+ + attrd: Support SI units for attribute dampening
+ + Bug cl#5171 - pengine: Don't prevent clones from running due to dependent resources
+ + Bug cl#5179 - Corosync: Attempt to retrieve a peer's node name if it is not already known
+ + Bug cl#5181 - corosync: Ensure node IDs are written to the CIB as unsigned integers
+ + Bug rhbz#902407 - crm_resource: Handle --ban for master/slave resources as advertised
+ + cib: Correctly check for archived configuration files
+ + cib: Correctly log short-form xml diffs
+ + cib: Fix remote cib based on TLS
+ + cibadmin: Report errors during sign-off
+ + cli: Do not enable the blackbox for CLI tools
+ + cluster: Fix segfault on removing a node
+ + cman: Do not start pacemaker if cman startup fails
+ + cman: Start clvmd and friends from the init script if enabled
+ + Command-line tools should stop after an assertion failure
+ + controld: Use the correct variant of dlm_controld for corosync-2 clusters
+ + cpg: Correctly set the group name length
+ + cpg: Ensure the CPG group is always null-terminated
+ + cpg: Only process one message at a time to allow other priority jobs to be performed
+ + crmd: Correctly observe the configured batch-limit
+ + crmd: Correctly update expected state when the previous DC shuts down
+ + crmd: Correctly update the history cache when recurring ops change their return code
+ + crmd: Don't add node_state to the cib if we have not seen or fenced this node yet
+ + crmd: Don't segfault on shutdown when using heartbeat
+ + crmd: Prevent recurring monitors being cancelled due to notify operations
+ + crmd: Reliably detect and act on reprobe operations from the policy engine
+ + crmd: When a peer shuts down as expected, record the new join and expected states into the cib
+ + crmd: When the DC gracefully shuts down, record the new expected state into the cib
+ + crm_attribute: Do not swallow hostname lookup failures
+ + crm_mon: Do not display duplicates of failed actions
+ + crm_mon: Reduce flickering in interactive mode
+ + crm_resource: Observe --master modifier for --move
+ + crm_resource: Provide a meaningful error if --master is used for primitives and groups
+ + fencing: Allow fencing for node after topology entries are deleted
+ + fencing: Apply the correct score to the resource of a group
+ + fencing: Ignore changes to non-fencing resources
+ + fencing: Observe pcmk_host_list during automatic unfencing
+ + fencing: Put all fencing agent processes into their own process group
+ + fencing: Wait until all possible replies are received before continuing with unverified devices
+ + ipc: Compress msgs based on client's actual max send size
+ + ipc: Have the ipc server enforce a minimum buffer size all clients must use
+ + iso8601: Prevent dates from jumping backwards a day in some timezones
+ + lrmd: Correctly calculate metadata for the 'service' class
+ + lrmd: Correctly cancel monitor actions for lsb/systemd/service resources on cleaning up
+ + mcp: Remove LSB hints that instruct chkconfig to start pacemaker at boot time
+ + mcp: Some distros complain when LSB scripts do not include Default-Start/Stop directives
+ + pengine: Allow fencing of baremetal remote nodes
+ + pengine: cl#5186 - Avoid running rsc on two nodes when node is fenced during migration
+ + pengine: Correctly account for the location preferences of things colocated with a group
+ + pengine: Correctly handle demotion of grouped masters that are partially demoted
+ + pengine: Disable container node probes due to constraint conflicts
+ + pengine: Do not allow colocation with blocked clone instances
+ + pengine: Do not re-allocate clone instances that are blocked in the Stopped state
+ + pengine: Do not restart resources that depend on unmanaged resources
+ + pengine: Force record pending for migrate_to actions
+ + pengine: Location constraints with role=Started should prevent masters from running at all
+ + pengine: Order demote/promote of resources on remote nodes to happen only once the connection is up
+ + pengine: Properly handle orphaned multistate resources living on remote-nodes
+ + pengine: Properly shutdown orphaned remote connection resources
+ + pengine: Recover unexpectedly running container nodes
+ + remote: Add support for ipv6 into pacemaker_remote daemon
+ + remote: Handle endian changes between client and server and improve forward compatibility
+ + services: Fix segfault associated with cancelling in-flight recurring operations
+ + services: Reset the scheduling policy and priority for lrmd's children without relying on SCHED_RESET_ON_FORK
+
* Fri Jul 26 2013 Andrew Beekhof <andrew@beekhof.net> Pacemaker-1.1.10-1
- Update source tarball to revision: ab2e209
- Changesets: 602
- Diff: 143 files changed, 8162 insertions(+), 5159 deletions(-)
- Features added since Pacemaker-1.1.9
+ Core: Convert all exit codes to positive errno values
+ crm_error: Add the ability to list and print error symbols
+ crm_resource: Allow individual resources to be reprobed
+ crm_resource: Allow options to be set recursively
+ crm_resource: Implement --ban for moving resources away from nodes and --clear (replaces --unmove)
+ crm_resource: Support OCF tracing when using --force-(check|start|stop)
+ PE: Allow active nodes in our current membership to be fenced without quorum
+ PE: Suppress meaningless IDs when displaying anonymous clone status
+ Turn off auto-respawning of systemd services when the cluster starts them
+ Bug cl#5128 - pengine: Support maintenance mode for a single node
- Changes since Pacemaker-1.1.9
+ crmd: cib: stonithd: Memory leaks resolved and improved use of glib reference counting
+ attrd: Fix attributes being deleted during DC election
+ Bug cl#5153 - Correctly display clone failcounts in crm_mon
+ Bug cl#5133 - pengine: Correctly observe on-fail=block for failed demote operation
+ Bug cl#5148 - legacy: Correctly remove a node that used to have a different nodeid
+ Bug cl#5151 - Ensure node names are consistently compared without case
+ Bug cl#5152 - crmd: Correctly clean up fenced nodes during membership changes
+ Bug cl#5154 - Do not expire failures when on-fail=block is present
+ Bug cl#5155 - pengine: Block the stop of resources if any dependent resource is unmanaged
+ Bug cl#5157 - Allow migration in the absence of some colocation constraints
+ Bug cl#5161 - crmd: Prevent memory leak in operation cache
+ Bug cl#5164 - crmd: Fixes crash when using pacemaker-remote
+ Bug cl#5164 - pengine: Fix segfault when calculating transitions with remote nodes
+ Bug cl#5167 - crm_mon: Only print "stopped" node list for incomplete clone sets
+ Bug cl#5168 - Prevent clones from being bounced around the cluster due to location constraints
+ Bug cl#5170 - Correctly support on-fail=block for clones
+ cib: Correctly read back archived configurations if the primary is corrupted
+ cib: The result is not valid when diffs fail to apply cleanly for CLI tools
+ cib: Restore the ability to embed comments in the configuration
+ cluster: Detect and warn about node names with capitals
+ cman: Do not pretend we know the state of nodes we've never seen
+ cman: Do not unconditionally start cman if it is already running
+ cman: Support non-blocking CPG calls
+ Core: Ensure the blackbox is saved on abnormal program termination
+ corosync: Detect the loss of members for which we only know the nodeid
+ corosync: Do not pretend we know the state of nodes we've never seen
+ corosync: Ensure removed peers are erased from all caches
+ corosync: Nodes that can persist in sending CPG messages must be alive after all
+ crmd: Do not get stuck in S_POLICY_ENGINE if a node we couldn't fence returns
+ crmd: Do not update fail-count and last-failure for old failures
+ crmd: Ensure all membership operations can complete while trying to cancel a transition
+ crmd: Ensure operations for cleaned up resources don't block recovery
+ crmd: Ensure we return to a stable state if there have been too many fencing failures
+ crmd: Initiate node shutdown if another node claims to have successfully fenced us
+ crmd: Prevent messages for remote crmd clients from being relayed to wrong daemons
+ crmd: Properly handle recurring monitor operations for remote-node agent
+ crmd: Store last-run and last-rc-change for all operations
+ crm_mon: Ensure stale pid files are updated when a new process is started
+ crm_report: Correctly collect logs when 'uname -n' reports fully qualified names
+ fencing: Fail the operation once all peers have been exhausted
+ fencing: Restore the ability to manually confirm that fencing completed
+ ipc: Allow unprivileged clients to clean up after server failures
+ ipc: Restore the ability for members of the haclient group to connect to the cluster
+ legacy: Support "crm_node --remove" with a node name for corosync plugin (bnc#805278)
+ lrmd: Default to the upstream location for resource agent scratch directory
+ lrmd: Pass errors from lsb metadata generation back to the caller
+ pengine: Correctly handle resources that recover before we operate on them
+ pengine: Delete the old resource state on every node whenever the resource type is changed
+ pengine: Detect constraints with inappropriate actions (i.e. promote for a clone)
+ pengine: Ensure per-node resource parameters are used during probes
+ pengine: If fencing is unavailable or disabled, block further recovery for resources that fail to stop
+ pengine: Implement the rest of get_timet_now() and rename to get_effective_time
+ pengine: Re-initiate _active_ recurring monitors that previously failed but have timed out
+ remote: Workaround for inconsistent tls handshake behavior between gnutls versions
+ systemd: Ensure we get shut down correctly by systemd
+ systemd: Reload systemd after adding/removing override files for cluster services
+ xml: Check for and replace non-printing characters with their octal equivalent while exporting xml text
+ xml: Prevent lockups by setting a more reliable buffer allocation strategy
* Fri Mar 08 2013 Andrew Beekhof <andrew@beekhof.net> Pacemaker-1.1.9-1
- Update source tarball to revision: 7e42d77
- Statistics:
Changesets: 731
Diff: 1301 files changed, 92909 insertions(+), 57455 deletions(-)
- Features added in Pacemaker-1.1.9
+ corosync: Allow cman and corosync 2.0 nodes to use a name other than uname()
+ corosync: Use queues to avoid blocking when sending CPG messages
+ ipc: Compress messages that exceed the configured IPC message limit
+ ipc: Use queues to prevent slow clients from blocking the server
+ ipc: Use shared memory by default
+ lrmd: Support nagios remote monitoring
+ lrmd: Pacemaker Remote Daemon for extending pacemaker functionality outside the corosync cluster
+ pengine: Check for master/slave resources that are not OCF agents
+ pengine: Support a 'requires' resource meta-attribute for controlling whether it needs quorum, fencing or nothing
+ pengine: Support for resource container
+ pengine: Support resources that require unfencing before start
- Changes since Pacemaker-1.1.8
+ attrd: Correctly handle deletion of non-existent attributes
+ Bug cl#5135 - Improved detection of the active cluster type
+ Bug rhbz#913093 - Use crm_node instead of uname
+ cib: Avoid use-after-free by correctly supporting cib_no_children for non-xpath queries
+ cib: Correctly process XML diffs involving element removal
+ cib: Performance improvements for non-DC nodes
+ cib: Prevent error message by correctly handling peer replies
+ cib: Prevent ordering changes when applying xml diffs
+ cib: Remove text nodes from cib replace operations
+ cluster: Detect node name collisions in corosync
+ cluster: Preserve corosync membership state when matching node name/id entries
+ cman: Force fenced to terminate on shutdown
+ cman: Ignore qdisk 'nodes'
+ core: Drop per-user core directories
+ corosync: Avoid errors when closing failed connections
+ corosync: Ensure peer state is preserved when matching names to nodeids
+ corosync: Clean up CMAP connections after querying node name
+ corosync: Correctly detect corosync 2.0 clusters even if we don't have permission to access it
+ crmd: Bug cl#5144 - Do not update the expected status of failed nodes
+ crmd: Correctly determine if cluster disconnection was abnormal
+ crmd: Correctly relay messages for remote clients (bnc#805626, bnc#804704)
+ crmd: Correctly stall the FSA when waiting for additional inputs
+ crmd: Detect and recover when we are evicted from CPG
+ crmd: Differentiate between a node that is up and coming up in peer_update_callback()
+ crmd: Have cib operation timeouts scale with node count
+ crmd: Improved continue/wait logic in do_dc_join_finalize()
+ crmd: Prevent election storms caused by getrusage() values being too close
+ crmd: Prevent timeouts when performing pacemaker level membership negotiation
+ crmd: Prevent use-after-free of fsa_message_queue during exit
+ crmd: Store all current actions when stalling the FSA
+ crm_mon: Do not try to render a blank cib and indicate the previous output is now stale
+ crm_mon: Fix crm_mon crash when using SNMP traps
+ crm_mon: Look for the correct error codes when applying configuration updates
+ crm_report: Ensure policy engine logs are found
+ crm_report: Fix node list detection
+ crm_resource: Have crm_resource generate a valid transition key when sending resource commands to the crmd
+ date/time: Bug cl#5118 - Correctly convert seconds-since-epoch to the current time
+ fencing: Attempt to provide more information than just 'generic error' for failed actions
+ fencing: Correctly record completed but previously unknown fencing operations
+ fencing: Correctly terminate when all device options have been exhausted
+ fencing: cov#739453 - String not null terminated
+ fencing: Do not merge new fencing requests with stale ones from dead nodes
+ fencing: Do not start fencing until the entire device topology is found or the query results time out
+ fencing: Do not wait for the query timeout if all replies have arrived
+ fencing: Fix passing of parameters from CMAN containing '='
+ fencing: Fix non-comparison when sorting devices by priority
+ fencing: On failure, only try a topology device once from the remote level.
+ fencing: Only try peers for non-topology based operations once
+ fencing: Retry stonith device for duration of action's timeout period.
+ heartbeat: Remove incorrect assert during cluster connect
+ ipc: Bug cl#5110 - Prevent 100% CPU usage when looking for synchronous replies
+ ipc: Use 50k as the default compression threshold
+ legacy: Prevent assertion failure on routing ais messages (bnc#805626)
+ legacy: Re-enable logging from the pacemaker plugin
+ legacy: Relax the 'active' check for plugin based clusters to avoid false negatives
+ legacy: Skip peer process check if the process list is empty in crm_is_corosync_peer_active()
+ mcp: Only define HA_DEBUGLOG to avoid agent calls to ocf_log printing everything twice
+ mcp: Re-attach to existing pacemaker components when mcp fails
+ pengine: Any location constraint for the slave role applies to all roles
+ pengine: Avoid leaking memory when cleaning up failcounts and using containers
+ pengine: Bug cl#5101 - Ensure stop order is preserved for partially active groups
+ pengine: Bug cl#5140 - Allow set members to be stopped when the subsequent set has require-all=false
+ pengine: Bug cl#5143 - Prevent shuffling of anonymous master/slave instances
+ pengine: Bug rhbz#880249 - Ensure orphan masters are demoted before being stopped
+ pengine: Bug rhbz#880249 - Teach the PE how to recover masters into primitives
+ pengine: cl#5025 - Automatically clear failcount for start/monitor failures after resource parameters change
+ pengine: cl#5099 - Probe operation uses the timeout value from the minimum interval monitor by default (bnc#776386)
+ pengine: cl#5111 - When a clone/master child rsc has on-fail=stop, ensure all children stop on failure
+ pengine: cl#5142 - Do not delete orphaned children of an anonymous clone
+ pengine: Correctly unpack active anonymous clones
+ pengine: Ensure previous migrations are closed out before attempting another one
+ pengine: Introduce the whitebox container resources feature
+ pengine: Prevent double-free for cloned primitive from template
+ pengine: Process rsc_ticket dependencies earlier for correctly allocating resources (bnc#802307)
+ pengine: Remove special cases for fencing resources
+ pengine: rhbz#902459 - Remove rsc node status for orphan resources
+ systemd: Gracefully handle unexpected DBus return types
+ Replace the use of the insecure mktemp(3) with mkstemp(3)
* Thu Sep 20 2012 Andrew Beekhof <andrew@beekhof.net> Pacemaker-1.1.8-1
- Update source tarball to revision: 1a5341f
- Statistics:
Changesets: 1019
Diff: 2107 files changed, 117258 insertions(+), 73606 deletions(-)
- All APIs have been cleaned up and reduced to essentials
- Pacemaker now includes a replacement lrmd that supports systemd and upstart agents
- Config and state files (cib.xml, PE inputs and core files) have moved to new locations
- The crm shell has become a separate project and is no longer included with Pacemaker
- All daemons/tools now have a unified set of error codes based on errno.h (see crm_error)
- Changes since Pacemaker-1.1.7
+ Core: Bug cl#5032 - Rewrite the iso8601 date handling code
+ Core: Correctly extract the version details from a diff
+ Core: Log blackbox contents, if enabled, when an error occurs
+ Core: Only LOG_NOTICE and higher are sent to syslog
+ Core: Replace use of IPC from clplumbing with IPC from libqb
+ Core: SIGUSR1 now enables blackbox logging, SIGTRAP to write out
+ Core: Support a blackbox for additional logging detail after crashes/errors
+ Promote support for advanced fencing logic to the stable schema
+ Promote support for node starting scores to the stable schema
+ Promote support for service and systemd to the stable schema
+ attrd: Differentiate between updating all our attributes and everybody updating all theirs too
+ attrd: Have single-shot clients wait for an ack before disconnecting
+ cib: cl#5026 - Synced cib updates should not return until the cpg broadcast is complete.
+ corosync: Detect when the first corosync has not yet formed and handle it gracefully
+ corosync: Obtain a full list of configured nodes, including their names, when we connect to the quorum API
+ corosync: Obtain a node name from DNS if one was not already known
+ corosync: Populate the cib nodelist from corosync if available
+ corosync: Use the CFG API and DNS to determine node names if not configured in corosync.conf
+ crmd: Block after 10 failed fencing attempts for a node
+ crmd: cl#5051 - Fix file leak in PE IPC connection initialization
+ crmd: cl#5053 - Fix fail-count not being updated properly
+ crmd: cl#5057 - Restart sub-systems correctly (bnc#755671)
+ crmd: cl#5068 - Fixes crm_node -R option so it works with corosync 2.0
+ crmd: Correctly re-establish failed attrd connections
+ crmd: Detect when the quorum API isn't configured for corosync 2.0
+ crmd: Do not overwrite any configured node type (e.g. quorum node)
+ crmd: Enable use of new lrmd daemon and client library in crmd.
+ crmd: Overhaul the way node state is recorded and updated in the CIB
+ fencing: Bug rhbz#853537 - Prevent use-of-NULL when the cib libraries are not available
+ fencing: cl#5073 - Add 'off' as a valid value for the stonith-action option
+ fencing: cl#5092 - Always time out stonith operations if the timeout period expires
+ fencing: cl#5093 - Stonith per device timeout option
+ fencing: Clean up if we detect a failed connection
+ fencing: Delegate complex self fencing requests - we won't be around to see it to completion
+ fencing: Ensure all peers are notified of complex fencing op completion
+ fencing: Fix passing of fence_legacy parameters containing '='
+ fencing: Gracefully handle metadata requests for unknown agents
+ fencing: Return cached dynamic target list for busy devices.
+ fencing: rhbz#801355 - Abort transition on DC when external fencing operation is detected
+ fencing: rhbz#801355 - Merge fence requests for identical operations already in progress.
+ fencing: rhbz#801355 - Report fencing operations external of pacemaker to cib
+ fencing: Specify the action to perform using action= instead of the older option=
+ fencing: Stop building fake metadata for broken agents
+ fencing: Tolerate agents that report empty metadata in the admin tool
+ mcp: Correctly retry the connection to corosync on failure
+ mcp: Do not shut down IPC until the last client exits
+ mcp: Prevent use-after-free when running against corosync 1.x
+ pengine: Bug cl#5059 - Use the correct action's status when calculating required actions for interleaved clones
+ pengine: Bypass online/offline checking resource detection for ping/quorum nodes
+ pengine: cl#5044 - migrate_to no longer requires load_stopped for avoiding possible transition loop
+ pengine: cl#5069 - Honor 'on-fail=ignore' even when operation is disabled.
+ pengine: cl#5070 - Allow influence of promotion score when multistate rsc is left hand of colocation
+ pengine: cl#5072 - Fix monitor op stopping after rsc promotion
+ pengine: cl#5072 - Fix pengine regression test failures
+ pengine: Correctly set the status for nodes not intended to run Pacemaker
+ pengine: Do not append instance numbers to anonymous clones
+ pengine: Fix failcount expiration
+ pengine: Fix memory leaks found by valgrind
+ pengine: Fix use-after-free and use-of-NULL errors detected by coverity
+ pengine: Fix use of colocation scores other than +/- INFINITY
+ pengine: Improve detection of rejoining nodes
+ pengine: Prevent use-of-NULL when tracing is enabled
+ pengine: Stonith resources are allowed to start even if their probes haven't completed on partially active nodes
+ services: New class called 'service' which expands to the correct (LSB/systemd/upstart) standard
+ services: Support asynchronous systemd/upstart actions
+ Tools: crm_shadow - Bug cl#5062 - Correctly set argv[0] when forking a shell process
+ Tools: crm_report: Always include system logs (if we can find them)
* Wed Mar 28 2012 Andrew Beekhof <andrew@beekhof.net> Pacemaker-1.1.7-1
- Update source tarball to revision: bc7ff2c
- Statistics:
Changesets: 513
Diff: 1171 files changed, 90472 insertions(+), 19368 deletions(-)
- Changes since Pacemaker-1.1.6.1
+ ais: Prepare for corosync versions using IPC from libqb
+ cib: Correctly shutdown in the presence of peers without relying on timers
+ cib: Don't halt disk writes if the previous digest is missing
+ cib: Determine when there are no peers to respond to our shutdown request and exit
+ cib: Ensure no additional messages are processed after we begin terminating
+ Cluster: Hook up the callbacks to the corosync quorum notifications
+ Core: basename() may modify its input, do not pass in a constant
+ Core: Bug cl#5016 - Prevent failures in recurring ops from being lost
+ Core: Bug rhbz#800054 - Correctly retrieve heartbeat uuids
+ Core: Correctly determine when an XML file should be decompressed
+ Core: Correctly track the length of a string without reading from uninitialized memory (valgrind)
+ Core: Ensure signals are handled eventually in the absence of timer sources or IPC messages
+ Core: Prevent use-of-NULL in crm_update_peer()
+ Core: Strip text nodes from on disk xml files
+ Core: Support libqb for logging
+ corosync: Consistently set the correct uuid with get_node_uuid()
+ Corosync: Correctly disconnect from corosync variants
+ Corosync: Correctly extract the node id from membership updates
+ corosync: Correctly infer lost members from the quorum API
+ Corosync: Default to using the nodeid as the node's uuid (instead of uname)
+ corosync: Ensure we catch nodes that leave the membership, even if the ringid doesn't change
+ corosync: Hook up CPG membership
+ corosync: Relax a development assert and gracefully handle the error condition
+ corosync: Remove deprecated member of the CFG API
+ corosync: Treat CS_ERR_QUEUE_FULL the same as CS_ERR_TRY_AGAIN
+ corosync: Unset the process list when nodes disappear on us
+ crmd: Also purge fencing results when we enter S_NOT_DC
+ crmd: Bug cl#5015 - Remove the failed operation as well as the resulting fail-count and last-failure attributes
+ crmd: Correctly determine when a node can suicide with fencing
+ crmd: Election - perform the age comparison only once
+ crmd: Fast-track shutdown if we couldn't request it via attrd
+ crmd: Leave it up to the PE to decide which ops can/cannot be reloaded
+ crmd: Prevent use-after-free when calling delete_resource due to CRM_OP_REPROBE
+ crmd: Supply format arguments in the correct order
+ fencing: Add missing format parameter
+ fencing: Add the fencing topology section to the 1.1 configuration schema
+ fencing: fence_legacy - Drop spurious host argument from status query
+ fencing: fence_legacy - Ensure port is available as an environment variable when calling monitor
+ fencing: fence_pcmk - don't block if nothing is specified on stdin
+ fencing: Fix log format error
+ fencing: Fix segfault caused by passing garbage to dlsym()
+ fencing: Fix use-of-NULL in process_remote_stonith_query()
+ fencing: Fix use-of-NULL when listing installed devices
+ fencing: Implement support for advanced fencing topologies: eg. kdump || (network && disk) || power
+ fencing: More gracefully handle failed 'list' operations for devices that only support a single connection
+ fencing: Prevent duplicate free when listing devices
+ fencing: Prevent uninitialized pointers being passed to free
+ fencing: Prevent use-after-free, we may need the query result for subsequent operations
+ fencing: Provide enough data to construct an entry in the node's fencing history
+ fencing: Standardize on /one/ method for clients to request members be fenced
+ fencing: Suppress errors when listing all registered devices
+ mcp: corosync_cfg_state_track was removed from the corosync API; luckily we didn't use it for anything
+ mcp: Do not specify a WorkingDirectory in the systemd unit file - startup fails if it's not available
+ mcp: Set the HA_quorum_type env variable consistently with our corosync plugin
+ mcp: Shut down if one of our child processes can/should not be respawned
+ pengine: Bug cl#5000 - Ensure ordering is preserved when depending on partial sets
+ pengine: Bug cl#5028 - Unmanaged services should block shutdown unless in maintenance mode
+ pengine: Bug cl#5038 - Prevent restart of anonymous clones when clone-max decreases
+ pengine: Bug cl#5007 - Fixes use of colocation constraints with multi-state resources
+ pengine: Bug cl#5014 - Prevent asymmetrical order constraints from causing resource stops
+ pengine: Bug cl#5000 - Implement the ability to create rsc_order constraint sets such that A can start after B or C has started
+ pengine: Correctly migrate a resource that has just migrated
+ pengine: Correct return from error path
+ pengine: Detect reloads of previously migrated resources
+ pengine: Ensure post-migration stop actions occur before node shutdown
+ pengine: Log as loudly as possible when we cannot shut down a cluster node
+ pengine: Reload of a resource no longer causes a restart of dependent resources
+ pengine: Support limiting the number of concurrent live migrations
+ pengine: Support referencing templates in constraints
+ pengine: Support referencing resource templates in resource sets
+ pengine: Support to make tickets standby for relinquishing tickets gracefully
+ stonith: A "start" operation of a stonith resource does a "monitor" on the device beyond registering it
+ stonith: Bug rhbz#745526 - Ensure stonith_admin actually gets called by fence_pcmk
+ Stonith: Ensure all nodes receive and deliver notifications of the manual override
+ stonith: Fix the stonith timeout issue (cl#5009, bnc#727498)
+ Stonith: Implement a manual override for when nodes are known to be safely off
+ Tools: Bug cl#5003 - Prevent use-after-free in crm_simulate
+ Tools: crm_mon - Support to display tickets (based on Yuusuke Iida's work)
+ Tools: crm_simulate - Support to grant/revoke/standby/activate tickets from the new ticket state section
+ Tools: Implement crm_node functionality for native corosync
+ Fix a number of potential problems reported by coverity
* Wed Aug 31 2011 Andrew Beekhof <andrew@beekhof.net> 1.1.6-1
- Update source tarball to revision: 676e5f25aa46 tip
- Statistics:
Changesets: 376
Diff: 1761 files changed, 36259 insertions(+), 140578 deletions(-)
- Changes since Pacemaker-1.1.5
+ ais: Check for retryable errors when dispatching AIS messages
+ ais: Correctly disconnect from Corosync and Cman based clusters
+ ais: Followup to previous patch - Ensure we drain the corosync queue of messages when Glib tells us there is input
+ ais: Handle IPC error before checking for NULL data (bnc#702907)
+ cib: Check the validation version before adding the originator details of a CIB change
+ cib: Remove disconnected remote connections from mainloop
+ cman: Correctly override existing fenced operations
+ cman: Dequeue all the cman-emitted events, not only the first one, instead of leaving the others in the event queue
+ cman: Don't call fenced_join and fenced_leave when notifying cman of a fencing event
+ cman: We need to run the crmd as root for CMAN so that we can ACK fencing operations
+ Core: Cancelled and pending operations do not count as failed
+ Core: Ensure there is sufficient space for EOS when building short-form option strings
+ Core: Fix variable expansion in pkg-config files
+ Core: Partial revert of accidental commit in previous patch
+ Core: Use dlopen to load heartbeat libraries on-demand
+ crmd: Bug lf#2509 - Watch for config option changes from the CIB even if we're not the DC
+ crmd: Bug lf#2528 - Introduce a slight delay when creating a transition to allow attrd time to perform its updates
+ crmd: Bug lf#2559 - Fail actions that were scheduled for a failed/fenced node
+ crmd: Bug lf#2584 - Allow nodes to fence themselves if they're the last one standing
+ crmd: Bug lf#2632 - Correctly handle nodes that return faster than stonith
+ crmd: Cancel timers for actions that were pending on dead nodes
+ crmd: Catch fence operations that claim to succeed but did not really
+ crmd: Do not wait for actions that were pending on dead nodes
+ crmd: Ensure we do not attempt to perform actions on failed nodes
+ crmd: Prevent use-of-NULL by g_hash_table_iter_next()
+ crmd: Recurring actions shouldn't cause the last non-recurring action to be forgotten
+ crmd: Store only the last and last failed operation in the CIB
+ mcp: dirname() modifies the input path - pass in a copy of the logfile path
+ mcp: Enable stack detection logic instead of forcing 'corosync'
+ mcp: Fix spelling mistake in systemd service script that prevents shutdown
+ mcp: Shut down if corosync becomes unavailable
+ mcp: systemd control file is now functional
+ pengine: Before migrating a utilization-using resource to a node, take off the load which will no longer run there (lf#2599, bnc#695440)
+ pengine: Before migrating a utilization-using resource to a node, take off the load which will no longer run there (regression tests) (lf#2599, bnc#695440)
+ pengine: Bug lf#2574 - Prevent shuffling by choosing the correct clone instance to stop
+ pengine: Bug lf#2575 - Use uname for migration variables, id is a UUID on heartbeat
+ pengine: Bug lf#2581 - Avoid group restart when clone (re)starts on an unrelated node
+ pengine: Bug lf#2613, lf#2619 - Group migration after failures and non-default utilization policies
+ pengine: Bug suse#707150 - Prevent services being active if dependencies on clones are not satisfied
+ pengine: Correctly recognise which recurring operations are currently active
+ pengine: Demote from Master does not clear previous errors
+ pengine: Ensure restarts due to definition changes cause the start action to be re-issued not probes
+ pengine: Ensure role is preserved for unmanaged resources
+ pengine: Ensure unmanaged resources have the correct role set so the correct monitor operation is chosen
+ pengine: Fix memory leak for re-allocated resources reported by valgrind
+ pengine: Implement cluster ticket and deadman
+ pengine: Implement resource template
+ pengine: Correctly determine the state of multi-state resources with a partial operation history
+ pengine: Only allocate master/slave resources once
+ pengine: Partial revert of 'Minor code cleanup CS: cf6bca32376c On: 2011-08-15'
+ pengine: Resolve memory leak reported by valgrind
+ pengine: Restore the ability to save inputs to disk
+ Shell: implement -w,--wait option to wait for the transition to finish
+ Shell: repair template list command
+ Shell: set of commands to examine logs, reports, etc
+ Stonith: Consolidate pcmk_host_map into run_stonith_agent so that it is applied consistently
+ Stonith: Deprecate pcmk_arg_map for the saner pcmk_host_argument
+ Stonith: Fix use-of-NULL by g_hash_table_lookup
+ Stonith: Improved pcmk_host_map parsing
+ Stonith: Prevent use-of-NULL by g_hash_table_lookup
+ Stonith: Prevent use-of-NULL when no Linux-HA stonith agents are present
+ stonith: Add missing entries to stonith_error2string()
+ Stonith: Correctly finish sending agent options if the initial write is interrupted
+ stonith: Correctly handle synchronous calls
+ stonith: Coverity - Correctly construct result list for the query API call
+ stonith: Coverity - Remove badly constructed memory allocation from the query API call
+ stonith: Ensure completed operations are recorded as such in the history
+ Stonith: Ensure device parameters are passed to the daemon during registration
+ stonith: Fix use-of-NULL in stonith_api_device_list()
+ stonith: stonith_admin - Prevent use of uninitialized pointer by --history command
+ Tools: Bug lf#2528 - Make progress when attrd_updater is called repeatedly within the dampen interval but with the same value
+ Tools: crm_report - Correctly extract data from the local node
+ Tools: crm_report - Remove newlines when detecting the node list
+ Tools: crm_report - Repair the ability to extract data from the local machine
+ Tools: crm_report - Report on all detected backtraces
* Fri Feb 11 2011 Andrew Beekhof <andrew@beekhof.net> 1.1.5-1
- Update source tarball to revision: baad6636a053
- Statistics:
Changesets: 184
Diff: 605 files changed, 46103 insertions(+), 26417 deletions(-)
- Changes since Pacemaker-1.1.4
+ Add the ability to delegate sub-sections of the cluster to non-root users via ACLs
Needs to be enabled at compile time, not enabled by default.
+ ais: Bug lf#2550 - Report failed processes immediately
+ Core: Prevent recently introduced use-after-free in replace_xml_child()
+ Core: Reinstate the logic that skips past non-XML_ELEMENT_NODE children
+ Core: Remove extra calls to xmlCleanupParser resulting in use-after-free
+ Core: Repair reference to child-of-child after removal of xml_child_iter_filter from get_message_xml()
+ crmd: Bug lf#2545 - Ensure notify variables are accurate for stop operations
+ crmd: Cancel recurring operations while we're still connected to the lrmd
+ crmd: Reschedule the PE_START action if it's not already running when we try to use it
+ crmd: Update failcount for failed promote and demote operations
+ pengine: Bug lf#2445 - Avoid relying on stickiness for stable clone placement
+ pengine: Bug lf#2445 - Do not override configured clone stickiness values
+ pengine: Bug lf#2493 - Don't imply colocation requirements when applying ordering constraints with clones
+ pengine: Bug lf#2495 - Prevent segfault by validating the contents of ordering sets
+ pengine: Bug lf#2508 - Correctly reconstruct the status of anonymous cloned groups
+ pengine: Bug lf#2518 - Avoid spamming the logs with errors for orphan resources
+ pengine: Bug lf#2544 - Prevent unstable clone placement by factoring in the current node's score before all others
+ pengine: Bug lf#2554 - target-role alone is not sufficient to promote resources
+ pengine: Correct target_rc for probes of inactive resources (fix regression introduced by cs:ac3f03006e95)
+ pengine: Ensure that fencing has completed for stop actions on stonith-dependent resources (lf#2551)
+ pengine: Only update the node's promotion score if the resource is active there
+ pengine: Only use the promotion score from the current clone instance
+ pengine: Prevent use-of-NULL resulting from variable shadowing spotted by Coverity
+ pengine: Prevent use-of-NULL when there is status for an undefined node
+ pengine: Prevent use-after-free resulting from unintended recursion when choosing a node to promote master/slave resources
+ Shell: don't create empty optional sections (bnc#665131)
+ Stonith: Teach stonith_admin to automagically obtain the current node attributes for the target from the CIB
+ tools: Bug lf#2527 - Prevent use-of-NULL in crm_simulate
+ Tools: Prevent crm_resource commands from being lost due to the use of cib_scope_local
* Wed Oct 20 2010 Andrew Beekhof <andrew@beekhof.net> 1.1.4-1
- Update source tarball to revision: 75406c3eb2c1 tip
- Statistics:
Changesets: 169
Diff: 772 files changed, 56172 insertions(+), 39309 deletions(-)
- Changes since Pacemaker-1.1.3
+ Italian translation of Clusters from Scratch
+ Significant performance enhancements to the Policy Engine and CIB
+ cib: Bug lf#2506 - Don't remove clients when notifications fail, they might just be too big
+ cib: Drop invalid/failed connections from the client hashtable
+ cib: Ensure all diffs sent to peers have sufficient ordering information
+ cib: Ensure non-change diffs can preserve the ordering on the other side
+ cib: Fix the feature set check
+ cib: Include version information on our synthesised diffs when nothing changed
+ cib: Optimize the way we detect group/set ordering changes - 15% speedup
+ cib: Prevent false detection of config updates with the new diff format
+ cib: Reduce unnecessary copying when comparing xml objects
+ cib: Repair the processing of updates sent from peer nodes
+ cib: Revert part of a recent commit that purged still valid connections
+ cib: The feature set version check is only valid if the current value is non-NULL
+ Core: Actually removing diff markers is necessary
+ Core: Bug lf#2506 - Drop the compression limit because Heartbeat's IPC code sucks
+ Core: Cache Relax-NG schemas - profiling indicates many cycles are wasted needlessly re-parsing them
+ Core: Correctly compare against crm_log_level in the logging macros
+ Core: Correctly extract the version details from a diff
+ Core: Correctly hook up the RNG schema cache
+ Core: Correctly use lazy_xml_sort() for v2 digests
+ Core: Don't compress large payload elements unless we're approaching message limits
+ Core: Don't insert empty ID tags when applying diffs
+ Core: Enable the improved v2 digests
+ Core: Ensure ordering is preserved when applying diffs
+ Core: Fix the CRM_CHECK macro
+ Core: Modify the v2 digest algorithm so that some fields are sorted
+ Core: Prevent use-after-free when creating a CIB update for a timed out action
+ Core: Prevent use-of-NULL when cleaning up RelaxNG data structures
+ Core: Provide significant performance improvements by implementing versioned diffs and digests
+ crmd: All pending operations should be recorded, even recurring ones with high start delays
+ crmd: Don't abort transitions when probes are completed on a node
+ crmd: Don't hide stop events that time out - allowing faster recovery in the presence of overloaded hosts
+ crmd: Ensure the CIB is always writable on the DC by removing a timing hole
+ crmd: Include the correct transition details for timed out operations
+ crmd: Prevent use of NULL by making copies of the operation's hash table
+ crmd: There's no need to check the cib version from the 'added' part of diff updates
+ crmd: Use the supplied timeout for stop actions
+ mcp: Ensure valgrind is able to log its output somewhere
+ mcp: Use 99/01 for the start/stop sequence to avoid problems with services (such as libvirtd) started by init - Patch from Vladislav Bogdanov
+ pengine: Ensure fencing of the DC precedes the STONITH_DONE operation
+ pengine: Fix memory leak introduced as part of the conversion to GHashTables
+ pengine: Fix memory leak when processing completed migration actions
+ pengine: Fix typo leading to use-of-NULL in the new ordering code
+ pengine: Free memory in recently introduced helper function
+ pengine: lf#2478 - Implement improved handling and recovery of atomic resource migrations
+ pengine: Obtain massive speedup by prepending to the list of ordering constraints (which can grow quite large)
+ pengine: Optimize the logic for deciding which non-grouped anonymous clone instances to probe for
+ pengine: Prevent clones from being stopped because resources colocated with them cannot be active
+ pengine: Try to ensure atomic migration ops occur within a single transition
+ pengine: Use hashtables instead of linked lists for performance sensitive datastructures
+ pengine: Use the original digest algorithm for parameter lists
+ stonith: cleanup children on timeout in fence_legacy
+ Stonith: Fix two memory leaks
+ Tools: crm_shadow - Avoid replacing the entire configuration (including status)
* Tue Sep 21 2010 Andrew Beekhof <andrew@beekhof.net> 1.1.3-1
- Update source tarball to revision: e3bb31c56244 tip
- Statistics:
Changesets: 352
Diff: 481 files changed, 14130 insertions(+), 11156 deletions(-)
- Changes since Pacemaker-1.1.2.1
+ ais: Bug lf#2401 - Improved processing when the peer crmd processes join/leave
+ ais: Correct the logic for connecting to plugin based clusters
+ ais: Do not supply a process list in mcp-mode
+ ais: Drop support for whitetank in the 1.1 release series
+ ais: Get an initial dump of the node membership when connecting to quorum-based clusters
+ ais: Guard against saturated cpg connections
+ ais: Handle CS_ERR_TRY_AGAIN in more cases
+ ais: Move the code for finding uid before the fork so that the child does no logging
+ ais: Never allow quorum plugins to affect connection to the pacemaker plugin
+ ais: Sign everyone up for peer process updates, not just the crmd
+ ais: The cluster type needs to be set before initializing classic openais connections
+ cib: Also free query result for xpath operations that return more than one hit
+ cib: Attempt to resolve memory corruption when forking a child to write the cib to disk
+ cib: Correctly free memory when writing out the cib to disk
+ cib: Fix the application of unversioned diffs
+ cib: Remove old developmental error logging
+ cib: Restructure the 'valid peer' check for deciding which instructions to ignore
+ cman: Correctly process membership/quorum changes from the pcmk plugin. Allow other message types through untouched
+ cman: Filter directed messages not intended for us
+ cman: Grab the initial membership when we connect
+ cman: Keep the list of peer processes up-to-date
+ cman: Make sure our common hooks are called after a cman membership update
+ cman: Make sure we can compile without cman present
+ cman: Populate sender details for cpg messages
+ cman: Update the ringid for cman based clusters
+ Core: Correctly unpack HA_Messages containing multiple entries with the same name
+ Core: crm_count_member() should only track nodes that have the full stack up
+ Core: New developmental logging system inspired by the kernel and a PoC from Lars Ellenberg
+ crmd: All nodes should see status updates, not just the DC
+ crmd: Allow non-DC nodes to clear failcounts too
+ crmd: Base DC election on process relative uptime
+ crmd: Bug lf#2439 - cancel_op() can also return HA_RSCBUSY
+ crmd: Bug lf#2439 - Handle asynchronous notification of resource deletion events
+ crmd: Bug lf#2458 - Ensure stop actions always have the relevant resource attributes
+ crmd: Disable age as a criteria for cman based clusters, it's not reliable enough
+ crmd: Ensure we activate the DC timer if we detect an alternate DC
+ crmd: Factor the nanosecond component of process uptime in elections
+ crmd: Fix assertion failure when performing async resource failures
+ crmd: Fix handling of async resource deletion results
+ crmd: Include the action for crm graph operations
+ crmd: Make sure the membership cache is accurate after a successful fencing operation
+ crmd: Make sure we always poke the FSA after a transition to clear any TE_HALT actions
+ crmd: Offer crm-level membership once the peer starts the crmd process
+ crmd: Only need to request quorum update for plugin based clusters
+ crmd: Prevent assertion failure for stop actions resulting from cs: 3c0bc17c6daf
+ crmd: Prevent everyone from losing DC elections by correctly initializing all relevant variables
+ crmd: Prevent segmentation fault
+ crmd: several fixes for async resource delete (thanks to beekhof)
+ crmd: Use the correct define/size for lrm resource IDs
+ Introduce two new cluster types 'cman' and 'corosync', replaces 'quorum_provider' concept
+ mcp: Add missing headers when built without heartbeat support
+ mcp: Correctly initialize the string containing the list of active daemons
+ mcp: Fix macro expansion in init script
+ mcp: Fix the expansion of the pid file in the init script
+ mcp: Handle CS_ERR_TRY_AGAIN when connecting to libcfg
+ mcp: Make sure we can compile the mcp without cman present
+ mcp: New master control process for (re)spawning pacemaker daemons
+ mcp: Read config early so we can re-initialize logging asap if daemonizing
+ mcp: Rename the mcp binary to pacemakerd and create a 'pacemaker' init script
+ mcp: Resend our process list after every CPG change
+ mcp: Tell chkconfig we need to shut down early on
+ pengine: Avoid creating invalid ordering constraints for probes that are not needed
+ pengine: Bug lf#1959 - Failed unmanaged resources should not prevent other services from shutting down
+ pengine: Bug lf#2422 - Ordering dependencies on partially active groups not observed properly
+ pengine: Bug lf#2424 - Use notify operation definition if it exists in the configuration
+ pengine: Bug lf#2433 - No services should be stopped until probes finish
+ pengine: Bug lf#2453 - Enforce clone ordering in the absence of colocation constraints
+ pengine: Bug lf#2476 - Repair on-fail=block for groups and primitive resources
+ pengine: Correctly detect when there is a real failcount that expired and needs to be cleared
+ pengine: Correctly handle pseudo action creation
+ pengine: Correctly order clone startup after group/clone start
+ pengine: Correct use-after-free introduced in the prior patch
+ pengine: Do not demote resources because something that requires it can not run
+ pengine: Fix colocation for interleaved clones
+ pengine: Fix colocation with partially active groups
+ pengine: Fix potential use-after-free defect from coverity
+ pengine: Fix previous merge
+ pengine: Fix use-after-free in order_actions() reported by valgrind
+ pengine: Make the current data set a global variable so it does not need to be passed around everywhere
+ pengine: Prevent endless loop when looking for operation definitions in the configuration
+ pengine: Prevent segfault by ensuring the arguments to do_calculations() are initialized
+ pengine: Rewrite the ordering constraint logic for simplicity, clarity and maintainability
+ pengine: Wait until stonith is available, do not fall back to shutdown for nodes requesting termination
+ Resolve coverity RESOURCE_LEAK defects
+ Shell: Complete the transition to using crm_attribute instead of crm_failcount and crm_standby
+ stonith: Advertise stonith-ng options in the metadata
+ stonith: Bug lf#2461 - Prevent segfault by not looking up operations if the hashtable has not been initialized yet
+ stonith: Bug lf#2473 - Add the timeout at the top level where the daemon is looking for it
+ Stonith: Bug lf#2473 - Ensure stonith operations complete within the timeout and are terminated if they run too long
+ stonith: Bug lf#2473 - Ensure timeouts are included for fencing operations
+ stonith: Bug lf#2473 - Gracefully handle remote operations that arrive late (after we have done notifications)
+ stonith: Correctly parse pcmk_host_list parameters that appear on a single line
+ stonith: Map poweron/poweroff back to on/off expected by the stonith tool from cluster-glue
+ stonith: pass the configuration to the stonith program via environment variables (bnc#620781)
+ Stonith: Use the timeout specified by the user
+ Support starting plugin-based Pacemaker clusters with the MCP as well
+ Tools: Bug lf#2456 - Fix assertion failure in crm_resource
+ tools: crm_node - Repair the ability to connect to openais based clusters
+ tools: crm_node - Use the correct short option for --cman
+ tools: crm_report - corosync.conf won't necessarily contain the text 'pacemaker' anymore
+ Tools: crm_simulate - Fix use-after-free when terminating
+ tools: crm_simulate - Resolve coverity USE_AFTER_FREE defect
+ Tools: Drop the 'pingd' daemon and resource agent in favor of ocf:pacemaker:ping
+ Tools: Fix recently introduced use-of-NULL
+ Tools: Fix use-after-free defects from coverity
* Wed May 12 2010 Andrew Beekhof <andrew@beekhof.net> 1.1.2-1
- Update source tarball to revision: c25c972a25cc tip
- Statistics:
Changesets: 339
Diff: 708 files changed, 37918 insertions(+), 10584 deletions(-)
- Changes since Pacemaker-1.1.1
+ ais: Do not count votes from offline nodes and calculate current votes before sending quorum data
+ ais: Ensure the list of active processes sent to clients is always up-to-date
+ ais: Look for the correct conf variable for turning on file logging
+ ais: Need to find a better and thread-safe way to set core_uses_pid. Disable for now.
+ ais: Use the threadsafe version of getpwnam
+ Core: Bump the feature set due to the new failcount expiry feature
+ Core: fix memory leaks exposed by valgrind
+ Core: Bug lf#2414 - Prevent use-after-free reported by valgrind when doing xpath based deletions
+ crmd: Bug lf#2414 - Prevent use-after-free of the PE connection after it dies
+ crmd: Bug lf#2414 - Prevent use-after-free of the stonith-ng connection
+ crmd: Bug lf#2401 - Improved detection of partially active peers
+ crmd: Bug lf#2379 - Ensure the cluster terminates when the PE is not available
+ crmd: Do not allow the target_rc to be misused by resource agents
+ crmd: Do not ignore action timeouts based on FSA state
+ crmd: Ensure we don't get stuck in S_PENDING if we lose an election to someone that never talks to us again
+ crmd: Fix memory leaks exposed by valgrind
+ crmd: Remove race condition that could lead to multiple instances of a clone being active on a machine
+ crmd: Send erase_status_tag() calls to the local CIB when the DC is fenced, since there is no DC to accept them
+ crmd: Use global fencing notifications to prevent secondary fencing operations of the DC
+ pengine: Bug lf#2317 - Avoid needless restart of primitive depending on a clone
+ pengine: Bug lf#2361 - Ensure clones observe mandatory ordering constraints if the LHS is unrunnable
+ pengine: Bug lf#2383 - Combine failcounts for all instances of an anonymous clone on a host
+ pengine: Bug lf#2384 - Fix intra-set colocation and ordering
+ pengine: Bug lf#2403 - Enforce mandatory promotion (colocation) constraints
+ pengine: Bug lf#2412 - Correctly find clone instances by their prefix
+ pengine: Do not be so quick to pull the trigger on nodes that are coming up
+ pengine: Fix memory leaks exposed by valgrind
+ pengine: Rewrite native_merge_weights() to avoid a use-after-free
+ Shell: Bug bnc#590035 - always reload status if working with the cluster
+ Shell: Bug bnc#592762 - Default to using the status section from the live CIB
+ Shell: Bug lf#2315 - edit multiple meta_attributes sets in resource management
+ Shell: Bug lf#2221 - enable comments
+ Shell: Bug bnc#580492 - implement new cibstatus interface and commands
+ Shell: Bug bnc#585471 - new cibstatus import command
+ Shell: check timeouts also against the default-action-timeout property
+ Shell: new configure filter command
+ Tools: crm_mon - fix memory leaks exposed by valgrind
* Tue Feb 16 2010 Andrew Beekhof <andrew@beekhof.net> - 1.1.1-1
- First public release of Pacemaker 1.1
- Package reference documentation in a doc subpackage
- Move cts into a subpackage so that it can be easily consumed by others
- Update source tarball to revision: 17d9cd4ee29f
+ New stonith daemon that supports global notifications
+ Service placement influenced by the physical resources
+ A new tool for simulating failures and the cluster’s reaction to them
+ Ability to serialize an otherwise unrelated set of resource actions (eg. Xen migrations)
* Wed Feb 10 2010 Andrew Beekhof <andrew@beekhof.net> - 1.0.7-4
- Rebuild for heartbeat 3.0.2-2
* Wed Feb 10 2010 Andrew Beekhof <andrew@beekhof.net> - 1.0.7-3
- Rebuild for cluster-glue 1.0.3
* Tue Jan 19 2010 Andrew Beekhof <andrew@beekhof.net> - 1.0.7-2
- Rebuild for corosync 1.2.0
* Mon Jan 18 2010 Andrew Beekhof <andrew@beekhof.net> - 1.0.7-1
- Update source tarball to revision: 2eed906f43e9 (stable-1.0) tip
- Statistics:
Changesets: 193
Diff: 220 files changed, 15933 insertions(+), 8782 deletions(-)
- Changes since 1.0.5-4
+ pengine: Bug 2213 - Ensure groups process location constraints so that clone-node-max works for cloned groups
+ pengine: Bug lf#2153 - non-clones should not restart when clones stop/start on other nodes
+ pengine: Bug lf#2209 - Clone ordering should be able to prevent startup of dependent clones
+ pengine: Bug lf#2216 - Correctly identify the state of anonymous clones when deciding when to probe
+ pengine: Bug lf#2225 - Operations that require fencing should wait for 'stonith_complete' not 'all_stopped'.
+ pengine: Bug lf#2225 - Prevent clone peers from stopping while another instance is (potentially) being fenced
+ pengine: Correctly anti-colocate with a group
+ pengine: Correctly unpack ordering constraints for resource sets to avoid graph loops
+ Tools: crm: load help from crm_cli.txt
+ Tools: crm: resource sets (bnc#550923)
+ Tools: crm: support for comments (LF 2221)
+ Tools: crm: support for description attribute in resources/operations (bnc#548690)
+ Tools: hb2openais: add EVMS2 CSM processing (and other changes) (bnc#548093)
+ Tools: hb2openais: do not allow empty rules, clones, or groups (LF 2215)
+ Tools: hb2openais: refuse to convert pure EVMS volumes
+ cib: Ensure the loop for login message terminates
+ cib: Finally fix reliability of receiving large messages over remote plaintext connections
+ cib: Fix remote notifications
+ cib: For remote connections, default to CRM_DAEMON_USER since that's the only one that the cib can validate the password for using PAM
+ cib: Remote plaintext - Retry sending parts of the message that did not fit the first time
+ crmd: Ensure batch-limit is correctly enforced
+ crmd: Ensure we have the latest status after a transition abort
+ (bnc#547579,547582): Tools: crm: status section editing support
+ shell: Add allow-migrate as allowed meta-attribute (bnc#539968)
+ Medium: Build: Do not automatically add -L/lib, it could cause 64-bit arches to break
+ Medium: pengine: Bug lf#2206 - rsc_order constraints always use score at the top level
+ Medium: pengine: Only complain about target-role=master for non m/s resources
+ Medium: pengine: Prevent non-multistate resources from being promoted through target-role
+ Medium: pengine: Provide a default action for resource-set ordering
+ Medium: pengine: Silently fix requires=fencing for stonith resources so that it can be set in op_defaults
+ Medium: Tools: Bug lf#2286 - Allow the shell to accept template parameters on the command line
+ Medium: Tools: Bug lf#2307 - Provide a way to determine the nodeid of past cluster members
+ Medium: Tools: crm: add update method to template apply (LF 2289)
+ Medium: Tools: crm: direct RA interface for ocf class resource agents (LF 2270)
+ Medium: Tools: crm: direct RA interface for stonith class resource agents (LF 2270)
+ Medium: Tools: crm: do not add score which does not exist
+ Medium: Tools: crm: do not consider warnings as errors (LF 2274)
+ Medium: Tools: crm: do not remove sets which contain id-ref attribute (LF 2304)
+ Medium: Tools: crm: drop empty attributes elements
+ Medium: Tools: crm: exclude locations when testing for pathological constraints (LF 2300)
+ Medium: Tools: crm: fix exit code on single shot commands
+ Medium: Tools: crm: fix node delete (LF 2305)
+ Medium: Tools: crm: implement -F (--force) option
+ Medium: Tools: crm: rename status to cibstatus (LF 2236)
+ Medium: Tools: crm: revisit configure commit
+ Medium: Tools: crm: stay in crm if user specified level only (LF 2286)
+ Medium: Tools: crm: verify changes on exit from the configure level
+ Medium: ais: Some clients such as gfs_controld want a cluster name, allow one to be specified in corosync.conf
+ Medium: cib: Clean up logic for receiving remote messages
+ Medium: cib: Create valid notification control messages
+ Medium: cib: Indicate where the remote connection came from
+ Medium: cib: Send password prompt to stderr so that stdout can be redirected
+ Medium: cts: Fix rsh handling when stdout is not required
+ Medium: doc: Fill in the section on removing a node from an AIS-based cluster
+ Medium: doc: Update the docs to reflect the 0.6/1.0 rolling upgrade problem
+ Medium: doc: Use Publican for docbook based documentation
+ Medium: fencing: stonithd: add metadata for stonithd instance attributes (and support in the shell)
+ Medium: fencing: stonithd: ignore case when comparing host names (LF 2292)
+ Medium: tools: Make crm_mon functional with remote connections
+ Medium: xml: Add stopped as a supported role for operations
+ Medium: xml: Bug bnc#552713 - Treat node unames as text fields not IDs
+ Medium: xml: Bug lf#2215 - Create an always-true expression for empty rules when upgrading from 0.6
* Thu Oct 29 2009 Andrew Beekhof <andrew@beekhof.net> - 1.0.5-4
- Include the fixes from CoroSync integration testing
- Move the resource templates - they are not documentation
- Ensure documentation is placed in a standard location
- Exclude documentation that is included elsewhere in the package
- Update the tarball from upstream to version ee19d8e83c2a
+ cib: Correctly clean up when both plaintext and tls remote ports are requested
+ pengine: Bug bnc#515172 - Provide better defaults for lt(e) and gt(e) comparisons
+ pengine: Bug lf#2197 - Allow master instance placement to be influenced by colocation constraints
+ pengine: Make sure promote/demote pseudo actions are created correctly
+ pengine: Prevent target-role from promoting more than master-max instances
+ ais: Bug lf#2199 - Prevent expected-quorum-votes from being populated with garbage
+ ais: Prevent deadlock - don't try to release IPC message if the connection failed
+ cib: For validation errors, send back the full CIB so the client can display the errors
+ cib: Prevent use-after-free for remote plaintext connections
+ crmd: Bug lf#2201 - Prevent use-of-NULL when running heartbeat
* Wed Oct 13 2009 Andrew Beekhof <andrew@beekhof.net> - 1.0.5-3
- Update the tarball from upstream to version 38cd629e5c3c
+ Core: Bug lf#2169 - Allow dtd/schema validation to be disabled
+ pengine: Bug lf#2106 - Not all anonymous clone children are restarted after configuration change
+ pengine: Bug lf#2170 - stop-all-resources option had no effect
+ pengine: Bug lf#2171 - Prevent groups from starting if they depend on a complex resource which cannot run
+ pengine: Disable resource management if stonith-enabled=true and no stonith resources are defined
+ pengine: do not include master score if it would prevent allocation
+ ais: Avoid excessive load by checking for dead children every 1s (instead of 100ms)
+ ais: Bug rh#525589 - Prevent shutdown deadlocks when running on CoroSync
+ ais: Gracefully handle changes to the AIS nodeid
+ crmd: Bug bnc#527530 - Wait for the transition to complete before leaving S_TRANSITION_ENGINE
+ crmd: Prevent use-after-free with LOG_DEBUG_3
+ Medium: xml: Mask the "symmetrical" attribute on rsc_colocation constraints (bnc#540672)
+ Medium (bnc#520707): Tools: crm: new templates ocfs2 and clvm
+ Medium: Build: Invert the disable ais/heartbeat logic so that --without (ais|heartbeat) is available to rpmbuild
+ Medium: pengine: Bug lf#2178 - Indicate unmanaged clones
+ Medium: pengine: Bug lf#2180 - Include node information for all failed ops
+ Medium: pengine: Bug lf#2189 - Incorrect error message when unpacking simple ordering constraint
+ Medium: pengine: Correctly log resources that would like to start but can not
+ Medium: pengine: Stop ptest from logging to syslog
+ Medium: ais: Include version details in plugin name
+ Medium: crmd: Requery the resource metadata after every start operation
* Fri Aug 21 2009 Tomas Mraz <tmraz@redhat.com> - 1.0.5-2.1
- rebuilt with new openssl
* Wed Aug 19 2009 Andrew Beekhof <andrew@beekhof.net> - 1.0.5-2
- Add versioned perl dependency as specified by
https://fedoraproject.org/wiki/Packaging/Perl#Packages_that_link_to_libperl
- No longer remove RPATH data, it prevents us finding libperl.so and no other
libraries were being hardcoded
- Compile in support for heartbeat
- Conditionally add heartbeat-devel and corosynclib-devel to the -devel requirements
depending on which stacks are supported
* Mon Aug 17 2009 Andrew Beekhof <andrew@beekhof.net> - 1.0.5-1
- Add dependency on resource-agents
- Use the version of the configure macro that supplies --prefix, --libdir, etc
- Update the tarball from upstream to version 462f1569a437 (Pacemaker 1.0.5 final)
+ Tools: crm_resource - Advertise --move instead of --migrate
+ Medium: Extra: New node connectivity RA that uses system ping and attrd_updater
 + Medium: crmd: Note that dc-deadtime can be used to mask the brokenness of some switches
* Tue Aug 11 2009 Ville Skyttä <ville.skytta@iki.fi> - 1.0.5-0.7.c9120a53a6ae.hg
- Use bzipped upstream tarball.
* Wed Jul 29 2009 Andrew Beekhof <andrew@beekhof.net> - 1.0.5-0.6.c9120a53a6ae.hg
- Add back missing build auto* dependencies
- Minor cleanups to the install directive
* Tue Jul 28 2009 Andrew Beekhof <andrew@beekhof.net> - 1.0.5-0.5.c9120a53a6ae.hg
- Add a leading zero to the revision when alphatag is used
* Tue Jul 28 2009 Andrew Beekhof <andrew@beekhof.net> - 1.0.5-0.4.c9120a53a6ae.hg
- Incorporate the feedback from the cluster-glue review
- Realistically, the version is a 1.0.5 pre-release
- Use the global directive instead of define for variables
- Use the haclient/hacluster group/user instead of daemon
- Use the _configure macro
- Fix install dependencies
* Fri Jul 24 2009 Andrew Beekhof <andrew@beekhof.net> - 1.0.4-3
- Initial Fedora checkin
- Include an AUTHORS and license file in each package
- Change the library package name to pacemaker-libs to be more
Fedora compliant
- Remove execute permissions from xml related files
- Reference the new cluster-glue devel package name
- Update the tarball from upstream to version c9120a53a6ae
+ pengine: Only prevent migration if the clone dependency is stopping/starting on the target node
 + pengine: Bug 2160 - Do not shuffle clones due to colocation
+ pengine: New implementation of the resource migration (not stop/start) logic
+ Medium: Tools: crm_resource - Prevent use-of-NULL by requiring a resource name for the -A and -a options
+ Medium: pengine: Prevent use-of-NULL in find_first_action()
* Tue Jul 14 2009 Andrew Beekhof <andrew@beekhof.net> - 1.0.4-2
- Reference authors from the project AUTHORS file instead of listing in description
- Change Source0 to reference the Mercurial repo
- Cleaned up the summaries and descriptions
- Incorporate the results of Fedora package self-review
* Thu Jun 04 2009 Andrew Beekhof <abeekhof@suse.de> - 1.0.4-1
- Update source tarball to revision: 1d87d3e0fc7f (stable-1.0)
- Statistics:
Changesets: 209
Diff: 266 files changed, 12010 insertions(+), 8276 deletions(-)
- Changes since Pacemaker-1.0.3
+ (bnc#488291): ais: do not rely on byte endianness on ptr cast
+ (bnc#507255): Tools: crm: delete rsc/op_defaults (these meta_attributes are killing me)
+ (bnc#507255): Tools: crm: import properly rsc/op_defaults
+ (LF 2114): Tools: crm: add support for operation instance attributes
 + ais: Bug lf#2126 - Message replies cannot be routed to transient clients
+ ais: Fix compilation for the latest Corosync API (v1719)
+ attrd: Do not perform all updates as complete refreshes
+ cib: Fix huge memory leak affecting heartbeat-based clusters
+ Core: Allow xpath queries to match attributes
+ Core: Generate the help text directly from a tool options struct
+ Core: Handle differences in 0.6 messaging format
+ crmd: Bug lf#2120 - All transient node attribute updates need to go via attrd
+ crmd: Correctly calculate how long an FSA action took to avoid spamming the logs with errors
+ crmd: Fix another large memory leak affecting Heartbeat based clusters
 + lha: Restore compatibility with older versions
+ pengine: Bug bnc#495687 - Filesystem is not notified of successful STONITH under some conditions
+ pengine: Make running a cluster with STONITH enabled but no STONITH resources an error and provide details on resolutions
 + pengine: Prevent use-of-NULL when using resource ordering sets
+ pengine: Provide inter-notification ordering guarantees
 + pengine: Rewrite the notification code to be understandable and extendable
+ Tools: attrd - Prevent race condition resulting in the cluster forgetting the node wishes to shut down
+ Tools: crm: regression tests
+ Tools: crm_mon - Fix smtp notifications
+ Tools: crm_resource - Repair the ability to query meta attributes
 + Low: Build: Bug lf#2105 - Debian package should contain pacemaker doc and crm templates
+ Medium (bnc#507255): Tools: crm: handle empty rsc/op_defaults properly
+ Medium (bnc#507255): Tools: crm: use the right obj_type when creating objects from xml nodes
+ Medium (LF 2107): Tools: crm: revisit exit codes in configure
+ Medium: cib: Do not bother validating updates that only affect the status section
+ Medium: Core: Include supported stacks in version information
+ Medium: crmd: Record in the CIB, the cluster infrastructure being used
+ Medium: cts: Do not combine crm_standby arguments - the wrapper can not process them
 + Medium: cts: Fix the CIBAudit class
+ Medium: Extra: Refresh showscores script from Dominik
+ Medium: pengine: Build a statically linked version of ptest
+ Medium: pengine: Correctly log the actions for resources that are being recovered
 + Medium: pengine: Correctly log the occurrence of promotion events
 + Medium: pengine: Implement node health based on a patch from Mark Hamzy
+ Medium: Tools: Add examples to help text outputs
+ Medium: Tools: crm: catch syntax errors for configure load
+ Medium: Tools: crm: implement erasing nodes in configure erase
+ Medium: Tools: crm: work with parents only when managing xml objects
+ Medium: Tools: crm_mon - Add option to run custom notification program on resource operations (Patch by Dominik Klein)
+ Medium: Tools: crm_resource - Allow --cleanup to function on complex resources and cluster-wide
+ Medium: Tools: haresource2cib.py - Patch from horms to fix conversion error
+ Medium: Tools: Include stack information in crm_mon output
+ Medium: Tools: Two new options (--stack,--constraints) to crm_resource for querying how a resource is configured
* Wed Apr 08 2009 Andrew Beekhof <abeekhof@suse.de> - 1.0.3-1
- Update source tarball to revision: b133b3f19797 (stable-1.0) tip
- Statistics:
Changesets: 383
Diff: 329 files changed, 15471 insertions(+), 15119 deletions(-)
- Changes since Pacemaker-1.0.2
+ Added tag SLE11-HAE-GMC for changeset 9196be9830c2
+ ais plugin: Fix quorum calculation (bnc#487003)
 + ais: Another memory leak fix in error path
+ ais: Bug bnc#482847, bnc#482905 - Force a clean exit of OpenAIS once Pacemaker has finished unloading
+ ais: Bug bnc#486858 - Fix update_member() to prevent spamming clients with membership events containing no changes
 + ais: Centralize all quorum calculations in the ais plugin and allow expected votes to be configured in the cib
+ ais: Correctly handle a return value of zero from openais_dispatch_recv()
+ ais: Disable logging to a file
+ ais: Fix memory leak in error path
+ ais: IPC messages are only in scope until a response is sent
+ All signal handlers used with CL_SIGNAL() need to be as minimal as possible
+ cib: Bug bnc#482885 - Simplify CIB disk-writes to prevent data loss. Required a change to the backup filename format
 + cib: crmd: Revert part of 9782ab035003. Complex shutdown routines need G_main_add_SignalHandler to avoid race conditions
+ crm: Avoid infinite loop during crm configure edit (bnc#480327)
+ crmd: Avoid a race condition by waiting for the attrd update to trigger a transition automatically
+ crmd: Bug bnc#480977 - Prevent extra, partial, shutdown when a node restarts too quickly
+ crmd: Bug bnc#480977 - Prevent extra, partial, shutdown when a node restarts too quickly (verified)
 + crmd: Bug bnc#489063 - Ensure the DC is always unset after we 'lose' an election
+ crmd: Bug BSC#479543 - Correctly find the migration source for timed out migrate_from actions
+ crmd: Call crm_peer_init() before we start the FSA - prevents a race condition when used with Heartbeat
+ crmd: Erasing the status section should not be forced to the local node
 + crmd: Fix memory leak in cib notification processing code
+ crmd: Fix memory leak in transition graph processing
+ crmd: Fix memory leaks found by valgrind
+ crmd: More memory leaks fixes found by valgrind
+ fencing: stonithd: is_heartbeat_cluster is a no-no if there is no heartbeat support
+ pengine: Bug bnc#466788 - Exclude nodes that can not run resources
+ pengine: Bug bnc#466788 - Make colocation based on node attributes work
+ pengine: Bug BNC#478687 - Do not crash when clone-max is 0
+ pengine: Bug bnc#488721 - Fix id-ref expansion for clones, the doc-root for clone children is not the cib root
+ pengine: Bug bnc#490418 - Correctly determine node state for nodes wishing to be terminated
+ pengine: Bug LF#2087 - Correctly parse the state of anonymous clones that have multiple instances on a given node
+ pengine: Bug lf#2089 - Meta attributes are not inherited by clone children
+ pengine: Bug lf#2091 - Correctly restart modified resources that were found active by a probe
+ pengine: Bug lf#2094 - Fix probe ordering for cloned groups
+ pengine: Bug LF:2075 - Fix large pingd memory leaks
+ pengine: Correctly attach orphaned clone children to their parent
+ pengine: Correctly handle terminate node attributes that are set to the output from time()
+ pengine: Ensure orphaned clone members are hooked up to the parent when clone-max=0
+ pengine: Fix memory leak in LogActions
+ pengine: Fix the determination of whether a group is active
+ pengine: Look up the correct promotion preference for anonymous masters
+ pengine: Simplify handling of start failures by changing the default migration-threshold to INFINITY
+ pengine: The ordered option for clones no longer causes extra start/stop operations
+ RA: Bug bnc#490641 - Shut down dlm_controld with -TERM instead of -KILL
+ RA: pingd: Set default ping interval to 1 instead of 0 seconds
+ Resources: pingd - Correctly tell the ping daemon to shut down
+ Tools: Bug bnc#483365 - Ensure the command from cluster_test includes a value for --log-facility
+ Tools: cli: fix and improve delete command
+ Tools: crm: add and implement templates
+ Tools: crm: add support for command aliases and some common commands (i.e. cd,exit)
+ Tools: crm: create top configuration nodes if they are missing
+ Tools: crm: fix parsing attributes for rules (broken by the previous changeset)
+ Tools: crm: new ra set of commands
+ Tools: crm: resource agents information management
+ Tools: crm: rsc/op_defaults
+ Tools: crm: support for no value attribute in nvpairs
+ Tools: crm: the new configure monitor command
+ Tools: crm: the new configure node command
+ Tools: crm_mon - Prevent use-of-NULL when summarizing an orphan
+ Tools: hb2openais: create clvmd clone for respawn evmsd in ha.cf
+ Tools: hb2openais: fix a serious recursion bug in xml node processing
+ Tools: hb2openais: fix ocfs2 processing
+ Tools: pingd - prevent double free of getaddrinfo() output in error path
+ Tools: The default re-ping interval for pingd should be 1s not 1ms
+ Medium (bnc#479049): Tools: crm: add validation of resource type for the configure primitive command
+ Medium (bnc#479050): Tools: crm: add help for RA parameters in tab completion
+ Medium (bnc#479050): Tools: crm: add tab completion for primitive params/meta/op
+ Medium (bnc#479050): Tools: crm: reimplement cluster properties completion
+ Medium (bnc#486968): Tools: crm: listnodes function requires no parameters (do not mix completion with other stuff)
+ Medium: ais: Remove the ugly hack for dampening AIS membership changes
+ Medium: cib: Fix memory leaks by using mainloop_add_signal
+ Medium: cib: Move more logging to the debug level (was info)
+ Medium: cib: Overhaul the processing of synchronous replies
+ Medium: Core: Add library functions for instructing the cluster to terminate nodes
+ Medium: crmd: Add new expected-quorum-votes option
 + Medium: crmd: Allow up to 5 retries when an attrd update fails
+ Medium: crmd: Automatically detect and use new values for crm_config options
+ Medium: crmd: Bug bnc#490426 - Escalated shutdowns stall when there are pending resource operations
+ Medium: crmd: Clean up and optimize the DC election algorithm
+ Medium: crmd: Fix memory leak in shutdown
+ Medium: crmd: Fix memory leaks spotted by Valgrind
 + Medium: crmd: Ignore join messages from hosts other than our DC
+ Medium: crmd: Limit the scope of resource updates to the status section
 + Medium: crmd: Prevent the crmd from being respawned if it is told to shut down when it did not ask to be
+ Medium: crmd: Re-check the election status after membership events
+ Medium: crmd: Send resource updates via the local CIB during elections
+ Medium: pengine: Bug bnc#491441 - crm_mon does not display operations returning 'uninstalled' correctly
+ Medium: pengine: Bug lf#2101 - For location constraints, role=Slave is equivalent to role=Started
 + Medium: pengine: Clean up the API - removed ->children() and renamed ->find_child() to find_rsc()
+ Medium: pengine: Compress the display of healthy anonymous clones
+ Medium: pengine: Correctly log the actions for resources that are being recovered
 + Medium: pengine: Determine a promotion score for complex resources
+ Medium: pengine: Ensure clones always have a value for globally-unique
+ Medium: pengine: Prevent orphan clones from being allocated
+ Medium: RA: controld: Return proper exit code for stop op.
+ Medium: Tools: Bug bnc#482558 - Fix logging test in cluster_test
+ Medium: Tools: Bug bnc#482828 - Fix quoting in cluster_test logging setup
+ Medium: Tools: Bug bnc#482840 - Include directory path to CTSlab.py
+ Medium: Tools: crm: add more user input checks
 + Medium: Tools: crm: do not check resource status if we are working with a shadow
+ Medium: Tools: crm: fix id-refs and allow reference to top objects (i.e. primitive)
+ Medium: Tools: crm: ignore comments in the CIB
+ Medium: Tools: crm: multiple column output would not work with small lists
+ Medium: Tools: crm: refuse to delete running resources
+ Medium: Tools: crm: rudimentary if-else for templates
+ Medium: Tools: crm: Start/stop clones via target-role.
+ Medium: Tools: crm_mon - Compress the node status for healthy and offline nodes
+ Medium: Tools: crm_shadow - Return 0/cib_ok when --create-empty succeeds
+ Medium: Tools: crm_shadow - Support -e, the short form of --create-empty
+ Medium: Tools: Make attrd quieter
+ Medium: Tools: pingd - Avoid using various clplumbing functions as they seem to leak
+ Medium: Tools: Reduce pingd logging
* Mon Feb 16 2009 Andrew Beekhof <abeekhof@suse.de> - 1.0.2-1
- Update source tarball to revision: d232d19daeb9 (stable-1.0) tip
- Statistics:
Changesets: 441
Diff: 639 files changed, 20871 insertions(+), 21594 deletions(-)
- Changes since Pacemaker-1.0.1
+ (bnc#450815): Tools: crm cli: do not generate id for the operations tag
+ ais: Add support for the new AIS IPC layer
+ ais: Always set header.error to the correct default: SA_AIS_OK
+ ais: Bug BNC#456243 - Ensure the membership cache always contains an entry for the local node
+ ais: Bug BNC:456208 - Prevent deadlocks by not logging in the child process before exec()
 + ais: By default, disable support for the WIP openais IPC patch
+ ais: Detect and handle situations where ais and the crm disagree on the node name
+ ais: Ensure crm_peer_seq is updated after a membership update
+ ais: Make sure all IPC header fields are set to sane defaults
+ ais: Repair and streamline service load now that whitetank startup functions correctly
+ build: create and install doc files
+ cib: Allow clients without mainloop to connect to the cib
+ cib: CID:18 - Fix use-of-NULL in cib_perform_op
+ cib: CID:18 - Repair errors introduced in b5a18704477b - Fix use-of-NULL in cib_perform_op
+ cib: Ensure diffs contain the correct values of admin_epoch
+ cib: Fix four moderately sized memory leaks detected by Valgrind
+ Core: CID:10 - Prevent indexing into an array of schemas with a negative value
+ Core: CID:13 - Fix memory leak in log_data_element
+ Core: CID:15 - Fix memory leak in crm_get_peer
+ Core: CID:6 - Fix use-of-NULL in copy_ha_msg_input
+ Core: Fix crash in the membership code preventing node shutdown
 + Core: Fix more memory leaks found by valgrind
+ Core: Prevent unterminated strings after decompression
+ crmd: Bug BNC:467995 - Delay marking STONITH operations complete until STONITH tells us so
+ crmd: Bug LF:1962 - Do not NACK peers because they are not (yet) in our membership. Just ignore them.
 + crmd: Bug LF:2010 - Ensure fencing cib updates create the node_state entry if needed to prevent re-fencing during cluster startup
+ crmd: Correctly handle reconnections to attrd
 + crmd: Ensure updates for lost migrate operations indicate which node it tried to migrate to
+ crmd: If there are no nodes to finalize, start an election.
+ crmd: If there are no nodes to welcome, start an election.
+ crmd: Prevent node attribute loss by detecting attrd disconnections immediately
 + crmd: Prevent node re-probe loops by ensuring mandatory actions always complete
+ pengine: Bug 2005 - Fix startup ordering of cloned stonith groups
+ pengine: Bug 2006 - Correctly reprobe cloned groups
+ pengine: Bug BNC:465484 - Fix the no-quorum-policy=suicide option
+ pengine: Bug LF:1996 - Correctly process disabled monitor operations
+ pengine: CID:19 - Fix use-of-NULL in determine_online_status
+ pengine: Clones now default to globally-unique=false
+ pengine: Correctly calculate the number of available nodes for the clone to use
+ pengine: Only shoot online nodes with no-quorum-policy=suicide
+ pengine: Prevent on-fail settings being ignored after a resource is successfully stopped
+ pengine: Prevent use-of-NULL for failed migrate actions in process_rsc_state()
 + pengine: Remove an optimization for the terminate node attribute that caused the cluster to block indefinitely
 + pengine: Repair the ability to colocate based on node attributes other than uname
+ pengine: Start the correct monitor operation for unmanaged masters
+ stonith: CID:3 - Fix another case of exceptionally poor error handling by the original stonith developers
+ stonith: CID:5 - Checking for NULL and then dereferencing it anyway is an interesting approach to error handling
+ stonithd: Sending IPC to the cluster is a privileged operation
 + stonithd: Fix wrong checks for shmid (0 is a valid id)
+ Tools: attrd - Correctly determine when an attribute has stopped changing and should be committed to the CIB
+ Tools: Bug 2003 - pingd does not correctly detect failures when the interface is down
+ Tools: Bug 2003 - pingd does not correctly handle node-down events on multi-NIC systems
+ Tools: Bug 2021 - pingd does not detect sequence wrapping correctly, incorrectly reports nodes offline
 + Tools: Bug BNC:468066 - Do not use the result of uname() when it is no longer in scope
+ Tools: Bug BNC:473265 - crm_resource -L dumps core
+ Tools: Bug LF:2001 - Transient node attributes should be set via attrd
+ Tools: Bug LF:2036 - crm_resource cannot set/get parameters for cloned resources
+ Tools: Bug LF:2046 - Node attribute updates are lost because attrd can take too long to start
+ Tools: Cause the correct clone instance to be failed with crm_resource -F
+ Tools: cluster_test - Allow the user to select a stack and fix CTS invocation
+ Tools: crm cli: allow rename only if the resource is stopped
+ Tools: crm cli: catch system errors on file operations
+ Tools: crm cli: completion for ids in configure
+ Tools: crm cli: drop '-rsc' from attributes for order constraint
+ Tools: crm cli: exit with an appropriate exit code
+ Tools: crm cli: fix wrong order of action and resource in order constraint
 + Tools: crm cli: fix wrong exit code
+ Tools: crm cli: improve handling of cib attributes
+ Tools: crm cli: new command: configure rename
+ Tools: crm cli: new command: configure upgrade
+ Tools: crm cli: new command: node delete
+ Tools: crm cli: prevent key errors on missing cib attributes
+ Tools: crm cli: print long help for help topics
+ Tools: crm cli: return on syntax error when parsing score
+ Tools: crm cli: rsc_location can be without nvpairs
+ Tools: crm cli: short node preference location constraint
+ Tools: crm cli: sometimes, on errors, level would change on single shot use
+ Tools: crm cli: syntax: drop a bunch of commas (remains of help tables conversion)
+ Tools: crm cli: verify user input for sanity
+ Tools: crm: find expressions within rules (do not always skip xml nodes due to used id)
+ Tools: crm_master should not define a set id now that attrd is used. Defining one can break lookups
+ Tools: crm_mon Use the OID assigned to the project by IANA for SNMP traps
+ Medium (bnc#445622): Tools: crm cli: improve the node show command and drop node status
+ Medium (LF 2009): stonithd: improve timeouts for remote fencing
+ Medium: ais: Allow dead peers to be removed from membership calculations
+ Medium: ais: Pass node deletion events on to clients
+ Medium: ais: Sanitize ipc usage
 + Medium: ais: Supply the node uname in addition to the id
+ Medium: Build: Clean up configure to ensure NON_FATAL_CFLAGS is consistent with CFLAGS (ie. includes -g)
+ Medium: Build: Install cluster_test
+ Medium: Build: Use more restrictive CFLAGS and fix the resulting errors
+ Medium: cib: CID:20 - Fix potential use-after-free in cib_native_signon
+ Medium: Core: Bug BNC:474727 - Set a maximum time to wait for IPC messages
+ Medium: Core: CID:12 - Fix memory leak in decode_transition_magic error path
+ Medium: Core: CID:14 - Fix memory leak in calculate_xml_digest error path
+ Medium: Core: CID:16 - Fix memory leak in date_to_string error path
+ Medium: Core: Try to track down the cause of XML parsing errors
+ Medium: crmd: Bug BNC:472473 - Do not wait excessive amounts of time for lost actions
+ Medium: crmd: Bug BNC:472473 - Reduce the transition timeout to action_timeout+network_delay
+ Medium: crmd: Do not fast-track the processing of LRM refreshes when there are pending actions.
+ Medium: crmd: do_dc_join_filter_offer - Check the 'join' message is for the current instance before deciding to NACK peers
+ Medium: crmd: Find option values without having to do a config upgrade
+ Medium: crmd: Implement shutdown using a transient node attribute
+ Medium: crmd: Update the crmd options to use dashes instead of underscores
+ Medium: cts: Add 'cluster reattach' to the suite of automated regression tests
+ Medium: cts: cluster_test - Make some usability enhancements
+ Medium: CTS: cluster_test - suggest a valid port number
+ Medium: CTS: Fix python import order
+ Medium: cts: Implement an automated SplitBrain test
+ Medium: CTS: Remove references to deleted classes
+ Medium: Extra: Resources - Use HA_VARRUN instead of HA_RSCTMP for state files as Heartbeat removes HA_RSCTMP at startup
+ Medium: HB: Bug 1933 - Fake crmd_client_status_callback() calls because HB does not provide them for already running processes
+ Medium: pengine: CID:17 - Fix memory leak in find_actions_by_task error path
+ Medium: pengine: CID:7,8 - Prevent hypothetical use-of-NULL in LogActions
+ Medium: pengine: Defer logging the actions performed on a resource until we have processed ordering constraints
+ Medium: pengine: Remove the symmetrical attribute of colocation constraints
+ Medium: Resources: pingd - fix the meta defaults
+ Medium: Resources: Stateful - Add missing meta defaults
 + Medium: stonithd: exit if the pid file cannot be locked
+ Medium: Tools: Allow attrd clients to specify the ID the attribute should be created with
+ Medium: Tools: attrd - Allow attribute updates to be performed from a hosts peer
+ Medium: Tools: Bug LF:1994 - Clean up crm_verify return codes
+ Medium: Tools: Change the pingd defaults to ping hosts once every second (instead of 5 times every 10 seconds)
 + Medium: Tools: cibmon - Detect resource operations with a view to providing email/snmp/cim notification
+ Medium: Tools: crm cli: add back symmetrical for order constraints
+ Medium: Tools: crm cli: generate role in location when converting from xml
+ Medium: Tools: crm cli: handle shlex exceptions
+ Medium: Tools: crm cli: keep order of help topics
+ Medium: Tools: crm cli: refine completion for ids in configure
+ Medium: Tools: crm cli: replace inf with INFINITY
+ Medium: Tools: crm cli: streamline cib load and parsing
+ Medium: Tools: crm cli: supply provider only for ocf class primitives
+ Medium: Tools: crm_mon - Add support for sending mail notifications of resource events
+ Medium: Tools: crm_mon - Include the DC version in status summary
+ Medium: Tools: crm_mon - Sanitize startup and option processing
+ Medium: Tools: crm_mon - switch to event-driven updates and add support for sending snmp traps
+ Medium: Tools: crm_shadow - Replace the --locate option with the saner --edit
+ Medium: Tools: hb2openais: do not remove Evmsd resources, but replace them with clvmd
+ Medium: Tools: hb2openais: replace crmadmin with crm_mon
+ Medium: Tools: hb2openais: replace the lsb class with ocf for o2cb
+ Medium: Tools: hb2openais: reuse code
+ Medium: Tools: LF:2029 - Display an error if crm_resource is used to reset the operation history of non-primitive resources
+ Medium: Tools: Make pingd resilient to attrd failures
+ Medium: Tools: pingd - fix the command line switches
+ Medium: Tools: Rename ccm_tool to crm_node
* Tue Nov 18 2008 Andrew Beekhof <abeekhof@suse.de> - 1.0.1-1
- Update source tarball to revision: 6fc5ce8302ab (stable-1.0) tip
- Statistics:
Changesets: 170
Diff: 816 files changed, 7633 insertions(+), 6286 deletions(-)
- Changes since Pacemaker-1.0.0
+ ais: Allow the crmd to get callbacks whenever a node state changes
+ ais: Create an option for starting the mgmtd daemon automatically
+ ais: Ensure HA_RSCTMP exists for use by resource agents
+ ais: Hook up the openais.conf config logging options
+ ais: Zero out the PID of disconnecting clients
+ cib: Ensure global updates cause a disk write when appropriate
 + Core: Add an extra sanity check to getXpathResults() to prevent segfaults
+ Core: Do not redefine __FUNCTION__ unnecessarily
+ Core: Repair the ability to have comments in the configuration
+ crmd: Bug:1975 - crmd should wait indefinitely for stonith operations to complete
+ crmd: Ensure PE processing does not occur for all error cases in do_pe_invoke_callback
+ crmd: Requests to the CIB should cause any prior PE calculations to be ignored
+ heartbeat: Wait for membership 'up' events before removing stale node status data
+ pengine: Bug LF:1988 - Ensure recurring operations always have the correct target-rc set
+ pengine: Bug LF:1988 - For unmanaged resources we need to skip the usual can_run_resources() checks
+ pengine: Ensure the terminate node attribute is handled correctly
+ pengine: Fix optional colocation
 + pengine: Improve the detection of 'new' nodes joining the cluster
+ pengine: Prevent assert failures in master_color() by ensuring unmanaged masters are always reallocated to their current location
+ Tools: crm cli: parser: return False on syntax error and None for comments
+ Tools: crm cli: unify template and edit commands
+ Tools: crm_shadow - Show more line number information after validation failures
+ Tools: hb2openais: add option to upgrade the CIB to v3.0
+ Tools: hb2openais: add U option to getopts and update usage
+ Tools: hb2openais: backup improved and multiple fixes
+ Tools: hb2openais: fix class/provider reversal
+ Tools: hb2openais: fix testing
+ Tools: hb2openais: move the CIB update to the end
+ Tools: hb2openais: update logging and set logfile appropriately
+ Tools: LF:1969 - Attrd never sets any properties in the cib
+ Tools: Make attrd functional on OpenAIS
+ Medium: ais: Hook up the options for specifying the expected number of nodes and total quorum votes
 + Medium: ais: Look for pacemaker options inside the service block with 'name: pacemaker' instead of creating an additional configuration block
+ Medium: ais: Provide better feedback when nodes change nodeids (in openais.conf)
+ Medium: cib: Always store cib contents on disk with num_updates=0
+ Medium: cib: Ensure remote access ports are cleaned up on shutdown
+ Medium: crmd: Detect deleted resource operations automatically
+ Medium: crmd: Erase a nodes resource operations and transient attributes after a successful STONITH
+ Medium: crmd: Find a more appropriate place to update quorum and refresh attrd attributes
+ Medium: crmd: Fix the handling of unexpected PE exits to ensure the current CIB is stored
+ Medium: crmd: Fix the recording of pending operations in the CIB
+ Medium: crmd: Initiate an attrd refresh _after_ the status section has been fully repopulated
+ Medium: crmd: Only the DC should update quorum in an openais cluster
 + Medium: Ensure meta attributes are used consistently
+ Medium: pengine: Allow group and clone level resource attributes
+ Medium: pengine: Bug N:437719 - Ensure scores from colocated resources count when allocating groups
+ Medium: pengine: Prevent lsb scripts from being used in globally unique clones
+ Medium: pengine: Make a best-effort guess at a migration threshold for people with 0.6 configs
+ Medium: Resources: controld - ensure we are part of a clone with globally_unique=false
+ Medium: Tools: attrd - Automatically refresh all attributes after a CIB replace operation
+ Medium: Tools: Bug LF:1985 - crm_mon - Correctly process failed cib queries to allow reconnection after cluster restarts
+ Medium: Tools: Bug LF:1987 - crm_verify incorrectly warns of configuration upgrades for the most recent version
+ Medium: Tools: crm (bnc#441028): check for key error in attributes management
+ Medium: Tools: crm_mon - display the meaning of the operation rc code instead of the status
+ Medium: Tools: crm_mon - Fix the display of timing data
+ Medium: Tools: crm_verify - check that we are being asked to validate a complete config
 + Medium: xml: Relax the restriction on the contents of rsc_location.node
* Thu Oct 16 2008 Andrew Beekhof <abeekhof@suse.de> - 1.0.0-1
- Update source tarball to revision: 388654dfef8f tip
- Statistics:
Changesets: 261
Diff: 3021 files changed, 244985 insertions(+), 111596 deletions(-)
- Changes since f805e1b30103
+ add the crm cli program
+ ais: Move the service id definition to a common location and make sure it is always used
+ build: rename hb2openais.sh to .in and replace paths with vars
+ cib: Implement --create for crm_shadow
+ cib: Remove dead files
 + Core: Allow the expected number of quorum votes to be configurable
+ Core: cl_malloc and friends were removed from Heartbeat
+ Core: Only call xmlCleanupParser() if we parsed anything. Doing so unconditionally seems to cause a segfault
+ hb2openais.sh: improve pingd handling; several bugs fixed
+ hb2openais: fix clone creation; replace EVMS strings
+ new hb2openais.sh conversion script
+ pengine: Bug LF:1950 - Ensure the current values for all notification variables are always set (even if empty)
+ pengine: Bug LF:1955 - Ensure unmanaged masters are unconditionally repromoted to ensure they are monitored correctly.
+ pengine: Bug LF:1955 - Fix another case of filtering causing unmanaged master failures
 + pengine: Bug LF:1955 - Unmanaged mode prevents master resources from being allocated correctly
 + pengine: Bug N:420538 - Anti-colocation caused a positive node preference
+ pengine: Correctly handle unmanaged resources to prevent them from being started elsewhere
+ pengine: crm_resource - Fix the --migrate command
 + pengine: Make stonith-enabled default to true and warn if no STONITH resources are found
+ pengine: Make sure orphaned clone children are created correctly
+ pengine: Monitors for unmanaged resources do not need to wait for start/promote/demote actions to complete
+ stonithd (LF 1951): fix remote stonith operations
+ stonithd: fix handling of timeouts
+ stonithd: fix logic for stonith resource priorities
+ stonithd: implement the fence-timeout instance attribute
+ stonithd: initialize value before reading fence-timeout
+ stonithd: set timeouts for fencing ops to the timeout of the start op
+ stonithd: stonith rsc priorities (new feature)
+ Tools: Add hb2openais - a tool for upgrading a Heartbeat cluster to use OpenAIS instead
+ Tools: crm_verify - clean up the upgrade logic to prevent crash on invalid configurations
+ Tools: Make pingd functional on Linux
+ Update version numbers for 1.0 candidates
+ Medium: ais: Add support for a synchronous call to retrieve the nodes nodeid
+ Medium: ais: Use the agreed service number
+ Medium: Build: Reliably detect heartbeat libraries during configure
+ Medium: Build: Supply prototypes for libreplace functions when needed
+ Medium: Build: Teach configure how to find corosync
+ Medium: Core: Provide better feedback if Pacemaker is started by a stack it does not support
+ Medium: crmd: Avoid calling GHashTable functions with NULL
+ Medium: crmd: Delay raising I_ERROR when the PE exits until we have had a chance to save the current CIB
+ Medium: crmd: Hook up the stonith-timeout option to stonithd
+ Medium: crmd: Prevent potential use-of-NULL in global_timer_callback
+ Medium: crmd: Rationalize the logging of graph aborts
+ Medium: pengine: Add a stonith_timeout option and remove new options that are better set in rsc_defaults
+ Medium: pengine: Allow external entities to ask for a node to be shot by creating a terminate=true transient node attribute
+ Medium: pengine: Bug LF:1950 - Notifications do not contain all documented resource state fields
+ Medium: pengine: Bug N:417585 - Do not restart group children whose individual score drops below zero
+ Medium: pengine: Detect clients that disconnect before receiving their reply
+ Medium: pengine: Implement a true maintenance mode
+ Medium: pengine: Implement on-fail=standby for NTT. Derived from a patch by Satomi TANIGUCHI
+ Medium: pengine: Print the correct message when stonith is disabled
+ Medium: pengine: ptest - check the input is valid before proceeding
+ Medium: pengine: Revert group stickiness to the 'old way'
+ Medium: pengine: Use the correct attribute for action 'requires' (was prereq)
+ Medium: stonithd: Fix compilation without full heartbeat install
+ Medium: stonithd: exit with better code on empty host list
+ Medium: tools: Add a new regression test for CLI tools
+ Medium: tools: crm_resource - return with non-zero when a resource migration command is invalid
+ Medium: tools: crm_shadow - Allow the admin to start with an empty CIB (and no cluster connection)
+ Medium: xml: pacemaker-0.7 is now an alias for the 1.0 schema
* Mon Sep 22 2008 Andrew Beekhof <abeekhof@suse.de> - 0.7.3-1
- Update source tarball to revision: 33e677ab7764+ tip
- Statistics:
Changesets: 133
Diff: 89 files changed, 7492 insertions(+), 1125 deletions(-)
- Changes since f805e1b30103
+ Tools: add the crm cli program
+ Core: cl_malloc and friends were removed from Heartbeat
+ Core: Only call xmlCleanupParser() if we parsed anything. Doing so unconditionally seems to cause a segfault
+ new hb2openais.sh conversion script
+ pengine: Bug LF:1950 - Ensure the current values for all notification variables are always set (even if empty)
+ pengine: Bug LF:1955 - Ensure unmanaged masters are unconditionally repromoted to ensure they are monitored correctly.
+ pengine: Bug LF:1955 - Fix another case of filtering causing unmanaged master failures
+ pengine: Bug LF:1955 - Unmanaged mode prevents master resources from being allocated correctly
+ pengine: Bug N:420538 - Anti-colocation caused a positive node preference
+ pengine: Correctly handle unmanaged resources to prevent them from being started elsewhere
+ pengine: crm_resource - Fix the --migrate command
+ pengine: Make stonith-enabled default to true and warn if no STONITH resources are found
+ pengine: Make sure orphaned clone children are created correctly
+ pengine: Monitors for unmanaged resources do not need to wait for start/promote/demote actions to complete
+ stonithd (LF 1951): fix remote stonith operations
+ Tools: crm_verify - clean up the upgrade logic to prevent crash on invalid configurations
+ Medium: ais: Add support for a synchronous call to retrieve the node's nodeid
+ Medium: ais: Use the agreed service number
+ Medium: pengine: Allow external entities to ask for a node to be shot by creating a terminate=true transient node attribute
+ Medium: pengine: Bug LF:1950 - Notifications do not contain all documented resource state fields
+ Medium: pengine: Bug N:417585 - Do not restart group children whose individual score drops below zero
+ Medium: pengine: Implement a true maintenance mode
+ Medium: pengine: Print the correct message when stonith is disabled
+ Medium: stonithd: exit with better code on empty host list
+ Medium: xml: pacemaker-0.7 is now an alias for the 1.0 schema
* Wed Aug 20 2008 Andrew Beekhof <abeekhof@suse.de> - 0.7.1-1
- Update source tarball to revision: f805e1b30103+ tip
- Statistics:
Changesets: 184
Diff: 513 files changed, 43408 insertions(+), 43783 deletions(-)
- Changes since 0.7.0-19
+ Fix compilation when GNUTLS isn't found
+ admin: Fix use-after-free in crm_mon
+ Build: Remove testing code that prevented heartbeat-only builds
+ cib: Use single quotes so that the xpath queries for nvpairs will succeed
+ crmd: Always connect to stonithd when the TE starts and ensure we notice if it dies
+ crmd: Correctly handle a dead PE process
+ crmd: Make sure async-failures cause the failcount to be incremented
+ pengine: Bug LF:1941 - Handle failed clone instance probes when clone-max < #nodes
+ pengine: Parse resource ordering sets correctly
+ pengine: Prevent use-of-NULL - order->rsc_rh will not always be non-NULL
+ pengine: Unpack colocation sets correctly
+ Tools: crm_mon - Prevent use-of-NULL for orphaned resources
+ Medium: ais: Add support for a synchronous call to retrieve the node's nodeid
+ Medium: ais: Allow transient clients to receive membership updates
+ Medium: ais: Avoid double-free in error path
+ Medium: ais: Include in the membership nodes for which we have not determined their hostname
+ Medium: ais: Spawn the PE from the ais plugin instead of the crmd
+ Medium: cib: By default, new configurations use the latest schema
+ Medium: cib: Clean up the CIB if it was already disconnected
+ Medium: cib: Only increment num_updates if something actually changed
+ Medium: cib: Prevent use-after-free in client after abnormal termination of the CIB
+ Medium: Core: Fix memory leak in xpath searches
+ Medium: Core: Get more details regarding parser errors
+ Medium: Core: Repair expand_plus_plus - do not call char2score on unexpanded values
+ Medium: Core: Switch to the libxml2 parser - it's significantly faster
+ Medium: Core: Use a libxml2 library function for xml -> text conversion
+ Medium: crmd: Asynchronous failure actions have no parameters
+ Medium: crmd: Avoid calling glib functions with NULL
+ Medium: crmd: Do not allow an election to promote a node from S_STARTING
+ Medium: crmd: Do not vote if we have not completed the local startup
+ Medium: crmd: Fix te_update_diff() now that get_object_root() functions differently
+ Medium: crmd: Fix the lrmd xpath expressions to not contain quotes
+ Medium: crmd: If we get a join offer during an election, better restart the election
+ Medium: crmd: No further processing is needed when using the LRMs API call for failing resources
+ Medium: crmd: Only update have-quorum if the value changed
+ Medium: crmd: Repair the input validation logic in do_te_invoke
+ Medium: cts: CIBs can no longer contain comments
+ Medium: cts: Enable a bunch of tests that were incorrectly disabled
+ Medium: cts: The libxml2 parser won't allow v1 resources to use integers as parameter names
+ Medium: Do not use the cluster UID and GID directly. Look them up based on the configured value of HA_CCMUSER
+ Medium: Fix compilation when heartbeat is not supported
+ Medium: pengine: Allow groups to be involved in optional ordering constraints
+ Medium: pengine: Allow sets of operations to be reused by multiple resources
+ Medium: pengine: Bug LF:1941 - Mark extra clone instances as orphans and do not show inactive ones
+ Medium: pengine: Determine the correct migration-threshold during resource expansion
+ Medium: pengine: Implement no-quorum-policy=suicide (FATE #303619)
+ Medium: pengine: Clean up resources after stopping old copies of the PE
+ Medium: pengine: Teach the PE how to stop old copies of itself
+ Medium: Tools: Backport hb_report updates
+ Medium: Tools: cib_shadow - On create, spawn a new shell with CIB_shadow and PS1 set accordingly
+ Medium: Tools: Rename cib_shadow to crm_shadow
* Fri Jul 18 2008 Andrew Beekhof <abeekhof@suse.de> - 0.7.0-19
- Update source tarball to revision: 007c3a1c50f5 (unstable) tip
- Statistics:
Changesets: 108
Diff: 216 files changed, 4632 insertions(+), 4173 deletions(-)
- Changes added since unstable-0.7
+ admin: Fix use-after-free in crm_mon
+ ais: Change the tag for the ais plugin to "pacemaker" (used in openais.conf)
+ ais: Log terminated processes as an error
+ cib: Performance - Reorganize things to avoid calculating the XML diff twice
+ pengine: Bug LF:1941 - Handle failed clone instance probes when clone-max < #nodes
+ pengine: Fix memory leak in action2xml
+ pengine: Make OCF_ERR_ARGS a node-level error rather than a cluster-level one
+ pengine: Properly handle clones that are not installed on all nodes
+ Medium: admin: cibadmin - Show any validation errors if the upgrade failed
+ Medium: admin: cib_shadow - Implement --locate to display the underlying filename
+ Medium: admin: cib_shadow - Implement a --diff option
+ Medium: admin: cib_shadow - Implement a --switch option
+ Medium: admin: crm_resource - create more compact constraints that do not use lifetime (which is deprecated)
+ Medium: ais: Approximate born_on for OpenAIS based clusters
+ Medium: cib: Remove do_id_check, it is a poor substitute for ID validation by a schema
+ Medium: cib: Skip construction of pre-notify messages if no-one wants one
+ Medium: Core: Attempt to streamline some key functions to increase performance
+ Medium: Core: Clean up XML parser after validation
+ Medium: crmd: Detect and optimize the CRMs behavior when processing diffs of an LRM refresh
+ Medium: Fix memory leaks when resetting the name of an XML object
+ Medium: pengine: Prefer the current location if it is one of a group of nodes with the same (highest) score
* Wed Jun 25 2008 Andrew Beekhof <abeekhof@suse.de> - 0.7.0-1
- Update source tarball to revision: bde0c7db74fb tip
- Statistics:
Changesets: 439
Diff: 676 files changed, 41310 insertions(+), 52071 deletions(-)
- Changes added since stable-0.6
+ A new tool for setting up and invoking CTS
+ Admin: All tools now use --node (-N) for specifying node unames
+ Admin: All tools now use --xml-file (-x) and --xml-text (-X) for specifying where to find XML blobs
+ cib: Cleanup the API - remove redundant input fields
+ cib: Implement CIB_shadow - a facility for making and testing changes before uploading them to the cluster
+ cib: Make registering per-op callbacks an API call and renamed (for clarity) the API call for requesting notifications
+ Core: Add a facility for automatically upgrading old configurations
+ Core: Adopt libxml2 as the XML processing library - all external clients need to be recompiled
+ Core: Allow sending TLS messages larger than the MTU
+ Core: Fix parsing of time-only ISO dates
+ Core: Smarter handling of XML values containing quotes
+ Core: XML memory corruption - catch, and handle, cases where we are overwriting an attribute value with itself
+ Core: The xml ID type does not allow UUIDs that start with a number
+ Core: Implement XPath based versions of query/delete/replace/modify
+ Core: Remove some HA2.0.(3,4) compatibility code
+ crmd: Overhaul the detection of nodes that are starting vs. failed
+ pengine: Bug LF:1459 - Allow failures to expire
+ pengine: Have the PE do non-persistent configuration upgrades before performing calculations
+ pengine: Replace failure-stickiness with a simple 'migration-threshold'
+ tengine: Simplify the design by folding the tengine process into the crmd
+ Medium: Admin: Bug LF:1438 - Allow the list of all/active resource operations to be queried by crm_resource
+ Medium: Admin: Bug LF:1708 - crm_resource should print a warning if an attribute is already set as a meta attribute
+ Medium: Admin: Bug LF:1883 - crm_mon should display fail-count and operation history
+ Medium: Admin: Bug LF:1883 - crm_mon should display operation timing data
+ Medium: Admin: Bug N:371785 - crm_resource -C does not also clean up fail-count attributes
+ Medium: Admin: crm_mon - include timing data for failed actions
+ Medium: ais: Read options from the environment since objdb is not completely usable yet
+ Medium: cib: Add sections for op_defaults and rsc_defaults
+ Medium: cib: Better matching notification callbacks (for detecting duplicates and removal)
+ Medium: cib: Bug LF:1348 - Allow rules and attribute sets to be referenced for use in other objects
+ Medium: cib: Bug LF:1918 - By default, all cib calls now time out after 30s
+ Medium: cib: Detect updates that decrease the version tuple
+ Medium: cib: Implement a client-side operation timeout - Requires LHA update
+ Medium: cib: Implement callbacks and async notifications for remote connections
+ Medium: cib: Make cib->cmds->update() an alias for modify at the API level (also implemented in cibadmin)
+ Medium: cib: Mark the CIB as disconnected if the IPC connection is terminated
+ Medium: cib: New call option 'cib_can_create' which can be passed to modify actions - allows the object to be created if it does not exist yet
+ Medium: cib: Reimplement get|set|delete attributes using XPath
+ Medium: cib: Remove some useless parts of the API
+ Medium: cib: Remove the 'attributes' scaffolding from the new format
+ Medium: cib: Implement the ability for clients to connect to remote servers
+ Medium: Core: Add support for validating xml against RelaxNG schemas
+ Medium: Core: Allow more than one item to be modified/deleted in XPath based operations
+ Medium: Core: Fix the sort_pairs function for creating sorted xml objects
+ Medium: Core: iso8601 - Implement subtract_duration and fix subtract_time
+ Medium: Core: Reduce the amount of xml copying occurring
+ Medium: Core: Support value='value+=N' XML updates (in addition to value='value++')
+ Medium: crmd: Add support for lrm_ops->fail_rsc if its available
+ Medium: crmd: HB - watch link status for node leaving events
+ Medium: crmd: Bug LF:1924 - Improved handling of lrmd disconnects and shutdowns
+ Medium: crmd: Do not wait for actions with a start_delay over 5 minutes. Confirm them immediately
+ Medium: pengine: Bug LF:1328 - Do not fence nodes in clusters without managed resources
+ Medium: pengine: Bug LF:1461 - Give transient node attributes (in <status/>) preference over persistent ones (in <nodes/>)
+ Medium: pengine: Bug LF:1884, Bug LF:1885 - Implement N:M ordering and colocation constraints
+ Medium: pengine: Bug LF:1886 - Create a resource and operation 'defaults' config section
+ Medium: pengine: Bug LF:1892 - Allow recurring actions to be triggered at known times
+ Medium: pengine: Bug LF:1926 - Probes should complete before stop actions are invoked
+ Medium: pengine: Fix standby mode when it's set as a transient attribute
+ Medium: pengine: Implement a global 'stop-all-resources' option
+ Medium: pengine: Implement cibpipe, a tool for performing/simulating config changes "offline"
+ Medium: pengine: We do not allow colocation with specific clone instances
+ Medium: Tools: pingd - Implement a stack-independent version of pingd
+ Medium: xml: Ship an xslt for upgrading from 0.6 to 0.7
* Thu Jun 19 2008 Andrew Beekhof <abeekhof@suse.de> - 0.6.5-1
- Update source tarball to revision: b9fe723d1ac5 tip
- Statistics:
Changesets: 48
Diff: 37 files changed, 1204 insertions(+), 234 deletions(-)
- Changes since Pacemaker-0.6.4
+ Admin: Repair the ability to delete failcounts
+ ais: Audit IPC handling between the AIS plugin and CRM processes
+ ais: Have the plugin create needed /var/lib directories
+ ais: Make sure the sync and async connections are assigned correctly (not swapped)
+ cib: Correctly detect configuration changes - num_updates does not count
+ pengine: Apply stickiness values to the whole group, not the individual resources
+ pengine: Bug N:385265 - Ensure groups are migrated instead of remaining partially active on the current node
+ pengine: Bug N:396293 - Enforce mandatory group restarts due to ordering constraints
+ pengine: Correctly recover master instances found active on more than one node
+ pengine: Fix memory leaks reported by Valgrind
+ Medium: Admin: crm_mon - Misc improvements from Satomi Taniguchi
+ Medium: Bug LF:1900 - Resource stickiness should not allow placement in asynchronous clusters
+ Medium: crmd: Ensure joins are completed promptly when a node taking part dies
+ Medium: pengine: Avoid clone instance shuffling in more cases
+ Medium: pengine: Bug LF:1906 - Remove an optimization in native_merge_weights() causing group scores to behave erratically
+ Medium: pengine: Make use of target_rc data to correctly process resource operations
+ Medium: pengine: Prevent a possible use of NULL in sort_clone_instance()
+ Medium: tengine: Include target rc in the transition key - used to correctly determine operation failure
* Thu May 22 2008 Andrew Beekhof <abeekhof@suse.de> - 0.6.4-1
- Update source tarball to revision: 226d8e356924 tip
- Statistics:
Changesets: 55
Diff: 199 files changed, 7103 insertions(+), 12378 deletions(-)
- Changes since Pacemaker-0.6.3
+ crmd: Bug LF:1881 LF:1882 - Overhaul the logic for operation cancellation and deletion
+ crmd: Bug LF:1894 - Make sure cancelled recurring operations are cleaned out from the CIB
+ pengine: Bug N:387749 - Colocation with clones causes unnecessary clone instance shuffling
+ pengine: Ensure 'master' monitor actions are cancelled _before_ we demote the resource
+ pengine: Fix assert failure leading to core dump - make sure variable is properly initialized
+ pengine: Make sure 'slave' monitoring happens after the resource has been demoted
+ pengine: Prevent failure stickiness underflows (where too many failures become a _positive_ preference)
+ Medium: Admin: crm_mon - Only complain if the output file could not be opened
+ Medium: Common: filter_action_parameters - enable legacy handling only for older versions
+ Medium: pengine: Bug N:385265 - The failure stickiness of group children is ignored until it reaches -INFINITY
+ Medium: pengine: Implement master and clone colocation by excluding nodes rather than setting a node's score to INFINITY (similar to cs: 756afc42dc51)
+ Medium: tengine: Bug LF:1875 - Correctly find actions to cancel when their node leaves the cluster
* Wed Apr 23 2008 Andrew Beekhof <abeekhof@suse.de> - 0.6.3-1
- Update source tarball to revision: fd8904c9bc67 tip
- Statistics:
Changesets: 117
Diff: 354 files changed, 19094 insertions(+), 11338 deletions(-)
- Changes since Pacemaker-0.6.2
+ Admin: Bug LF:1848 - crm_resource - Pass set name and id to delete_resource_attr() in the correct order
+ Build: SNMP has been moved to the management/pygui project
+ crmd: Bug LF1837 - Unmanaged resources prevent crmd from shutting down
+ crmd: Prevent use-after-free in lrm interface code (Patch based on work by Keisuke MORI)
+ pengine: Allow the cluster to make progress by not retrying failed demote actions
+ pengine: Anti-colocation with slave should not prevent master colocation
+ pengine: Bug LF 1768 - Wait more often for STONITH ops to complete before starting resources
+ pengine: Bug LF1836 - Allow is-managed-default=false to be overridden by individual resources
+ pengine: Bug LF185 - Prevent pointless master/slave instance shuffling by ignoring the master-pref of stopped instances
+ pengine: Bug N-191176 - Implement interleaved ordering for clone-to-clone scenarios
+ pengine: Bug N-347004 - Ensure clone notifications are always sent when an instance is stopped/started
+ pengine: Bug N-347004 - Ensure notification ordering is correct for interleaved clones
+ pengine: Bug PM-11 - Directly link probe_complete to starting clone instances
+ pengine: Bug PM1 - Fix setting failcounts when applied to complex resources
+ pengine: Bug PM12, LF1648 - Extensive revision of group ordering
+ pengine: Bug PM7 - Ensure masters are always demoted before they are stopped
+ pengine: Create probes after allocation to allow smarter handling of anonymous clones
+ pengine: Do not prioritize clone instances that must be moved
+ pengine: Fix error in previous commit that allowed more than the required number of masters to be promoted
+ pengine: Group start ordering fixes
+ pengine: Implement promote/demote ordering for cloned groups
+ tengine: Repair failcount updates
+ tengine: Use the correct offset when updating failcount
+ Medium: Admin: Add a summary output that can be easily parsed by CTS for audit purposes
+ Medium: Build: Make configure fail if bz2 or libxml2 are not present
+ Medium: Build: Re-instate a better default for LCRSODIR
+ Medium: CIB: Bug LF-1861 - Filter irrelevant error status from synchronous CIB clients
+ Medium: Core: Bug 1849 - Invalid conversion of ordinal leap year to gregorian date
+ Medium: Core: Drop compatibility code for 2.0.4 and 2.0.5 clusters
+ Medium: crmd: Bug LF-1860 - Automatically cancel recurring ops before demote and promote operations (not only stops)
+ Medium: crmd: Save the current CIB contents if we detect the PE crashed
+ Medium: pengine: Bug LF:1866 - Fix version check when applying compatibility handling for failed start operations
+ Medium: pengine: Bug LF:1866 - Restore the ability to have start failures not be fatal
+ Medium: pengine: Bug PM1 - Failcount applies to all instances of non-unique clone
+ Medium: pengine: Correctly set the state of partially active master/slave groups
+ Medium: pengine: Do not claim to be stopping an already stopped orphan
+ Medium: pengine: Ensure implies_left ordering constraints are always effective
+ Medium: pengine: Indicate each resource's 'promotion' score
+ Medium: pengine: Prevent a possible use-of-NULL
+ Medium: pengine: Reprocess the current action if it changed (so that any prior dependencies are updated)
+ Medium: tengine: Bug LF-1859 - Wait for fail-count updates to complete before terminating the transition
+ Medium: tengine: Bug LF:1859 - Do not abort graphs due to our own failcount updates
+ Medium: tengine: Bug LF:1859 - Prevent the TE from interupting itself
* Thu Feb 14 2008 Andrew Beekhof <abeekhof@suse.de> - 0.6.2-1
- Update source tarball to revision: 28b1a8c1868b tip
- Statistics:
Changesets: 11
Diff: 7 files changed, 58 insertions(+), 18 deletions(-)
- Changes since Pacemaker-0.6.1
+ haresources2cib.py: set default-action-timeout to the default (20s)
+ haresources2cib.py: update ra parameters lists
+ Medium: SNMP: Allow the snmp subagent to be built (patch from MATSUDA, Daiki)
+ Medium: Tools: Make sure the autoconf variables in haresources2cib are expanded
* Tue Feb 12 2008 Andrew Beekhof <abeekhof@suse.de> - 0.6.1-1
- Update source tarball to revision: e7152d1be933 tip
- Statistics:
Changesets: 25
Diff: 37 files changed, 1323 insertions(+), 227 deletions(-)
- Changes since Pacemaker-0.6.0
+ CIB: Ensure changes to top-level attributes (like admin_epoch) cause a disk write
+ CIB: Ensure the archived file hits the disk before returning
+ CIB: Repair the ability to do 'atomic increment' updates (value="value++")
+ crmd: Bug #7 - Connecting to the crmd immediately after startup causes use-of-NULL
+ Medium: CIB: Mask cib_diff_resync results from the caller - they do not need to know
+ Medium: crmd: Delay starting the IPC server until we are fully functional
+ Medium: CTS: Fix the startup patterns
+ Medium: pengine: Bug 1820 - Allow the first resource in a group to be migrated
+ Medium: pengine: Bug 1820 - Check the colocation dependencies of resources to be migrated
* Mon Jan 14 2008 Andrew Beekhof <abeekhof@suse.de> - 0.6.0-2
- This is the first release of the Pacemaker Cluster Resource Manager formerly part of Heartbeat.
- For those looking for the GUI, mgmtd, CIM or TSA components, they are now found in
the new pacemaker-pygui project. Build dependencies prevent them from being
included in Heartbeat (since the built-in CRM is no longer supported) and,
being non-core components, are not included with Pacemaker.
- Update source tarball to revision: c94b92d550cf
- Statistics:
Changesets: 347
Diff: 2272 files changed, 132508 insertions(+), 305991 deletions(-)
- Test hardware:
+ 6-node vmware cluster (sles10-sp1/256Mb/vmware stonith) on a single host (opensuse10.3/2Gb/2.66Ghz Quad Core2)
+ 7-node EMC Centera cluster (sles10/512Mb/2Ghz Xeon/ssh stonith)
- Notes: Heartbeat Stack
+ All testing was performed with STONITH enabled
+ The CRM was enabled using the "crm respawn" directive
- Notes: OpenAIS Stack
+ This release contains a preview of support for the OpenAIS cluster stack
+ The current release of the OpenAIS project is missing two important
patches that we require. OpenAIS packages containing these patches are
available for most major distributions at:
http://download.opensuse.org/repositories/server:/ha-clustering
+ The OpenAIS stack is not currently recommended for use in clusters that
have shared data as STONITH support is not yet implemented
+ pingd is not yet available for use with the OpenAIS stack
+ 3 significant OpenAIS issues were found during testing of 4 and 6 node
clusters. We are actively working together with the OpenAIS project to
get these resolved.
- Pending bugs encountered during testing:
+ OpenAIS #1736 - Openais membership took 20s to stabilize
+ Heartbeat #1750 - ipc_bufpool_update: magic number in head does not match
+ OpenAIS #1793 - Assertion failure in memb_state_gather_enter()
+ OpenAIS #1796 - Cluster message corruption
- Changes since Heartbeat-2.1.2-24
+ Add OpenAIS support
+ Admin: crm_uuid - Look in the right place for Heartbeat UUID files
+ admin: Exit and indicate a problem if the crmd exits while crmadmin is performing a query
+ cib: Fix CIB_OP_UPDATE calls that modify the whole CIB
+ cib: Fix compilation when supporting the heartbeat stack
+ cib: Fix memory leaks caused by the switch to get_message_xml()
+ cib: HA_VALGRIND_ENABLED needs to be set _and_ set to 1|yes|true
+ cib: Use get_message_xml() in preference to cl_get_struct()
+ cib: Use the return value from call to write() in cib_send_plaintext()
+ Core: ccm nodes can legitimately have a node id of 0
+ Core: Fix peer-process tracking for the Heartbeat stack
+ Core: Heartbeat does not send status notifications for nodes that were already part of the cluster. Fake them instead
+ CRM: Add children to HA_Messages such that the field name matches F_XML_TAGNAME
+ crm: Adopt a more flexible approach to enabling Valgrind
+ crm: Fix compilation when bzip2 is not installed
+ CRM: Future-proof get_message_xml()
+ crmd: Filter election responses based on time not FSA state
+ crmd: Handle all possible peer states in crmd_ha_status_callback()
+ crmd: Make sure the current date/time is set - prevents use-of-NULL when evaluating rules
+ crmd: Relax an assertion regarding ccm membership instances
+ crmd: Use (node->processes&crm_proc_ais) to accurately update the CIB after replace operations
+ crmd: Heartbeat: Accurately record peer client status
+ pengine: Bug 1777 - Allow colocation with a resource in the Stopped state
+ pengine: Bug 1822 - Prevent use-of-NULL in PromoteRsc()
+ pengine: Implement three recovery policies based on op_status and op_rc
+ pengine: Parse fail-count correctly (it may be set to INFINITY)
+ pengine: Prevent graph-loop when stonith agents need to be moved around before a STONITH op
+ pengine: Prevent graph-loops when two operations have the same name+interval
+ tengine: Cancel active timers when destroying graphs
+ tengine: Ensure failcount is set correctly for failed stops/starts
+ tengine: Update failcount for operations that time out
+ Medium: admin: Prevent hang in crm_mon -1 when there is no cib connection - Patch from Junko IKEDA
+ Medium: cib: Require --force|-f when performing potentially dangerous commands with cibadmin
+ Medium: cib: Tweak the shutdown code
+ Medium: Common: Only count peer processes of active nodes
+ Medium: Core: Create generic cluster sign-in method
+ Medium: core: Fix compilation when Heartbeat support is disabled
+ Medium: Core: General cleanup for supporting two stacks
+ Medium: Core: iso8601 - Support parsing of time-only strings
+ Medium: core: Isolate more code that is only needed when SUPPORT_HEARTBEAT is enabled
+ Medium: crm: Improved logging of errors in the XML parser
+ Medium: crmd: Fix potential use-of-NULL in string comparison
+ Medium: crmd: Reimplement synchronizing of CIB queries and updates when invoking the PE
+ Medium: crm_mon: Indicate when a node is both in standby mode and offline
+ Medium: pengine: Bug 1822 - Do not try to promote groups unless the whole group is active
+ Medium: pengine: on_fail=nothing is an alias for 'ignore' not 'restart'
+ Medium: pengine: Prevent a potential use-of-NULL in cron_range_satisfied()
+ snmp subagent: fix a problem on displaying an unmanaged group
+ snmp subagent: use the syslog setting
+ snmp: v2 support (thanks to Keisuke MORI)
+ snmp_subagent - made it not complain about some things if shutting down
* Mon Dec 10 2007 Andrew Beekhof <abeekhof@suse.de> - 0.6.0-1
- Initial opensuse package check-in
diff --git a/attrd/commands.c b/attrd/commands.c
index 459eb41bdd..108db51dc7 100644
--- a/attrd/commands.c
+++ b/attrd/commands.c
@@ -1,771 +1,775 @@
/*
 * Copyright (C) 2013 Andrew Beekhof <andrew@beekhof.net>
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public
 * License as published by the Free Software Foundation; either
 * version 2 of the License, or (at your option) any later version.
 *
 * This software is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * General Public License for more details.
 *
 * You should have received a copy of the GNU General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
 */
#include <crm_internal.h>
#include <glib.h>
#include <crm/msg_xml.h>
#include <crm/cluster.h>
#include <crm/cib.h>
#include <crm/cluster/internal.h>
#include <crm/cluster/election.h>
#include <crm/cib/internal.h>
#include <internal.h>
#define ATTRD_PROTOCOL_VERSION "1"
int last_cib_op_done = 0;
char *peer_writer = NULL;
GHashTable *attributes = NULL;
typedef struct attribute_s {
char *uuid; /* TODO: Remove if at all possible */
char *id;
char *set;
GHashTable *values;
int update;
int timeout_ms;
bool changed;
bool unknown_peer_uuids;
mainloop_timer_t *timer;
char *user;
} attribute_t;
typedef struct attribute_value_s {
uint32_t nodeid;
gboolean is_remote;
char *nodename;
char *current;
char *requested;
char *stored;
} attribute_value_t;
void write_attribute(attribute_t *a);
void write_or_elect_attribute(attribute_t *a);
void attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool filter);
void attrd_peer_sync(crm_node_t *peer, xmlNode *xml);
void attrd_peer_remove(const char *host, const char *source);
static gboolean
send_attrd_message(crm_node_t * node, xmlNode * data)
{
crm_xml_add(data, F_TYPE, T_ATTRD);
crm_xml_add(data, F_ATTRD_IGNORE_LOCALLY, "atomic-version"); /* Tell older versions to ignore our messages */
crm_xml_add(data, F_ATTRD_VERSION, ATTRD_PROTOCOL_VERSION);
crm_xml_add_int(data, F_ATTRD_WRITER, election_state(writer));
return send_cluster_message(node, crm_msg_attrd, data, TRUE);
}
static gboolean
attribute_timer_cb(gpointer data)
{
attribute_t *a = data;
crm_trace("Dampen interval expired for %s in state %d", a->id, election_state(writer));
write_or_elect_attribute(a);
return FALSE;
}
static void
free_attribute_value(gpointer data)
{
attribute_value_t *v = data;
free(v->nodename);
free(v->current);
free(v->requested);
free(v->stored);
free(v);
}
void
free_attribute(gpointer data)
{
attribute_t *a = data;
if(a) {
free(a->id);
free(a->set);
free(a->uuid);
free(a->user);
mainloop_timer_del(a->timer);
g_hash_table_destroy(a->values);
free(a);
}
}
xmlNode *
build_attribute_xml(
xmlNode *parent, const char *name, const char *set, const char *uuid, unsigned int timeout_ms, const char *user,
const char *peer, uint32_t peerid, const char *value)
{
xmlNode *xml = create_xml_node(parent, __FUNCTION__);
crm_xml_add(xml, F_ATTRD_ATTRIBUTE, name);
crm_xml_add(xml, F_ATTRD_SET, set);
crm_xml_add(xml, F_ATTRD_KEY, uuid);
crm_xml_add(xml, F_ATTRD_USER, user);
crm_xml_add(xml, F_ATTRD_HOST, peer);
crm_xml_add_int(xml, F_ATTRD_HOST_ID, peerid);
crm_xml_add(xml, F_ATTRD_VALUE, value);
crm_xml_add_int(xml, F_ATTRD_DAMPEN, timeout_ms/1000);
return xml;
}
static attribute_t *
create_attribute(xmlNode *xml)
{
int dampen = 0;
const char *value = crm_element_value(xml, F_ATTRD_DAMPEN);
attribute_t *a = calloc(1, sizeof(attribute_t));
a->id = crm_element_value_copy(xml, F_ATTRD_ATTRIBUTE);
a->set = crm_element_value_copy(xml, F_ATTRD_SET);
a->uuid = crm_element_value_copy(xml, F_ATTRD_KEY);
a->values = g_hash_table_new_full(crm_str_hash, g_str_equal, NULL, free_attribute_value);
#if ENABLE_ACL
a->user = crm_element_value_copy(xml, F_ATTRD_USER);
crm_trace("Performing all %s operations as user '%s'", a->id, a->user);
#endif
if(value) {
dampen = crm_get_msec(value);
crm_trace("Created attribute %s with delay %dms (%s)", a->id, dampen, value);
} else {
crm_trace("Created attribute %s with no delay", a->id);
}
if(dampen > 0) {
a->timeout_ms = dampen;
a->timer = mainloop_timer_add(strdup(a->id), a->timeout_ms, FALSE, attribute_timer_cb, a);
}
g_hash_table_replace(attributes, a->id, a);
return a;
}
void
attrd_client_message(crm_client_t *client, xmlNode *xml)
{
bool broadcast = FALSE;
static int plus_plus_len = 5;
const char *op = crm_element_value(xml, F_ATTRD_TASK);
if(safe_str_eq(op, "peer-remove")) {
const char *host = crm_element_value(xml, F_ATTRD_HOST);
crm_info("Client %s is requesting all values for %s be removed", client->name, host);
if(host) {
broadcast = TRUE;
}
} else if(safe_str_eq(op, "update")) {
attribute_t *a = NULL;
attribute_value_t *v = NULL;
char *key = crm_element_value_copy(xml, F_ATTRD_KEY);
char *set = crm_element_value_copy(xml, F_ATTRD_SET);
char *host = crm_element_value_copy(xml, F_ATTRD_HOST);
const char *attr = crm_element_value(xml, F_ATTRD_ATTRIBUTE);
const char *value = crm_element_value(xml, F_ATTRD_VALUE);
a = g_hash_table_lookup(attributes, attr);
if(host == NULL) {
crm_trace("Inferring host");
host = strdup(attrd_cluster->uname);
crm_xml_add(xml, F_ATTRD_HOST, host);
crm_xml_add_int(xml, F_ATTRD_HOST_ID, attrd_cluster->nodeid);
}
if (value) {
int offset = 1;
int int_value = 0;
int value_len = strlen(value);
if (value_len < (plus_plus_len + 2)
|| value[plus_plus_len] != '+'
|| (value[plus_plus_len + 1] != '+' && value[plus_plus_len + 1] != '=')) {
goto send;
}
if(a) {
v = g_hash_table_lookup(a->values, host);
}
if(v) {
int_value = char2score(v->current);
}
if (value[plus_plus_len + 1] != '+') {
const char *offset_s = value + (plus_plus_len + 2);
offset = char2score(offset_s);
}
int_value += offset;
if (int_value > INFINITY) {
int_value = INFINITY;
}
crm_info("Expanded %s=%s to %d", attr, value, int_value);
crm_xml_add_int(xml, F_ATTRD_VALUE, int_value);
}
send:
if(peer_writer == NULL && election_state(writer) != election_in_progress) {
crm_info("Starting an election to determine the writer");
election_vote(writer);
}
crm_info("Broadcasting %s[%s] = %s%s", attr, host, value, election_state(writer) == election_won?" (writer)":"");
broadcast = TRUE;
free(key);
free(set);
free(host);
+
+ } else if(safe_str_eq(op, "refresh")) {
+ crm_notice("Updating all attributes");
+ write_attributes(TRUE, FALSE);
}
if(broadcast) {
send_attrd_message(NULL, xml);
}
}
void
attrd_peer_message(crm_node_t *peer, xmlNode *xml)
{
int peer_state = 0;
const char *v = crm_element_value(xml, F_ATTRD_VERSION);
const char *op = crm_element_value(xml, F_ATTRD_TASK);
const char *election_op = crm_element_value(xml, F_CRM_TASK);
if(election_op) {
enum election_result rc = 0;
crm_xml_add(xml, F_CRM_HOST_FROM, peer->uname);
rc = election_count_vote(writer, xml, TRUE);
switch(rc) {
case election_start:
free(peer_writer);
peer_writer = NULL;
election_vote(writer);
break;
case election_lost:
free(peer_writer);
peer_writer = strdup(peer->uname);
break;
default:
election_check(writer);
break;
}
return;
} else if(v == NULL) {
/* From the non-atomic version */
if(safe_str_eq(op, "update")) {
const char *name = crm_element_value(xml, F_ATTRD_ATTRIBUTE);
crm_trace("Compatibility update of %s from %s", name, peer->uname);
attrd_peer_update(peer, xml, FALSE);
} else if(safe_str_eq(op, "flush")) {
const char *name = crm_element_value(xml, F_ATTRD_ATTRIBUTE);
attribute_t *a = g_hash_table_lookup(attributes, name);
if(a) {
crm_trace("Compatibility write-out of %s for %s from %s", a->id, op, peer->uname);
write_or_elect_attribute(a);
}
} else if(safe_str_eq(op, "refresh")) {
GHashTableIter aIter;
attribute_t *a = NULL;
g_hash_table_iter_init(&aIter, attributes);
while (g_hash_table_iter_next(&aIter, NULL, (gpointer *) & a)) {
crm_trace("Compatibility write-out of %s for %s from %s", a->id, op, peer->uname);
write_or_elect_attribute(a);
}
}
}
crm_element_value_int(xml, F_ATTRD_WRITER, &peer_state);
if(election_state(writer) == election_won
&& peer_state == election_won
&& safe_str_neq(peer->uname, attrd_cluster->uname)) {
crm_notice("Detected another attribute writer: %s", peer->uname);
election_vote(writer);
} else if(peer_state == election_won) {
if(peer_writer == NULL) {
peer_writer = strdup(peer->uname);
crm_notice("Recorded attribute writer: %s", peer->uname);
} else if(safe_str_neq(peer->uname, peer_writer)) {
crm_notice("Recorded new attribute writer: %s (was %s)", peer->uname, peer_writer);
free(peer_writer);
peer_writer = strdup(peer->uname);
}
}
if(safe_str_eq(op, "update")) {
attrd_peer_update(peer, xml, FALSE);
} else if(safe_str_eq(op, "sync")) {
attrd_peer_sync(peer, xml);
} else if(safe_str_eq(op, "peer-remove")) {
const char *host = crm_element_value(xml, F_ATTRD_HOST);
attrd_peer_remove(host, peer->uname);
} else if(safe_str_eq(op, "sync-response")
&& safe_str_neq(peer->uname, attrd_cluster->uname)) {
xmlNode *child = NULL;
crm_notice("Processing %s from %s", op, peer->uname);
for (child = __xml_first_child(xml); child != NULL; child = __xml_next(child)) {
attrd_peer_update(peer, child, TRUE);
}
}
}
void
attrd_peer_sync(crm_node_t *peer, xmlNode *xml)
{
GHashTableIter aIter;
GHashTableIter vIter;
attribute_t *a = NULL;
attribute_value_t *v = NULL;
xmlNode *sync = create_xml_node(NULL, __FUNCTION__);
crm_xml_add(sync, F_ATTRD_TASK, "sync-response");
g_hash_table_iter_init(&aIter, attributes);
while (g_hash_table_iter_next(&aIter, NULL, (gpointer *) & a)) {
g_hash_table_iter_init(&vIter, a->values);
while (g_hash_table_iter_next(&vIter, NULL, (gpointer *) & v)) {
crm_debug("Syncing %s[%s] = %s to %s", a->id, v->nodename, v->current, peer?peer->uname:"everyone");
build_attribute_xml(sync, a->id, a->set, a->uuid, a->timeout_ms, a->user, v->nodename, v->nodeid, v->current);
}
}
crm_debug("Syncing values to %s", peer?peer->uname:"everyone");
send_attrd_message(peer, sync);
free_xml(sync);
}
void
attrd_peer_remove(const char *host, const char *source)
{
attribute_t *a = NULL;
GHashTableIter aIter;
if(host == NULL) {
return;
}
crm_notice("Removing all %s attributes for %s", host, source);
g_hash_table_iter_init(&aIter, attributes);
while (g_hash_table_iter_next(&aIter, NULL, (gpointer *) & a)) {
if(g_hash_table_remove(a->values, host)) {
crm_debug("Removed %s[%s] for %s", a->id, host, source);
}
}
/* if this matches a remote peer, it will be removed from the cache */
crm_remote_peer_cache_remove(host);
}
void
attrd_peer_update(crm_node_t *peer, xmlNode *xml, bool filter)
{
bool changed = FALSE;
attribute_value_t *v = NULL;
const char *host = crm_element_value(xml, F_ATTRD_HOST);
const char *attr = crm_element_value(xml, F_ATTRD_ATTRIBUTE);
const char *value = crm_element_value(xml, F_ATTRD_VALUE);
attribute_t *a = g_hash_table_lookup(attributes, attr);
if(a == NULL) {
a = create_attribute(xml);
}
v = g_hash_table_lookup(a->values, host);
if(v == NULL) {
crm_trace("Setting %s[%s] to %s from %s", attr, host, value, peer->uname);
v = calloc(1, sizeof(attribute_value_t));
if(value) {
v->current = strdup(value);
}
v->nodename = strdup(host);
crm_element_value_int(xml, F_ATTRD_IS_REMOTE, &v->is_remote);
g_hash_table_replace(a->values, v->nodename, v);
if (v->is_remote == TRUE) {
crm_remote_peer_cache_add(host);
}
changed = TRUE;
} else if(filter
&& safe_str_neq(v->current, value)
&& safe_str_eq(host, attrd_cluster->uname)) {
xmlNode *sync = create_xml_node(NULL, __FUNCTION__);
crm_notice("%s[%s]: local value '%s' takes priority over '%s' from %s",
a->id, host, v->current, value, peer->uname);
crm_xml_add(sync, F_ATTRD_TASK, "sync-response");
v = g_hash_table_lookup(a->values, host);
build_attribute_xml(sync, a->id, a->set, a->uuid, a->timeout_ms, a->user, v->nodename, v->nodeid, v->current);
crm_xml_add_int(sync, F_ATTRD_WRITER, election_state(writer));
send_attrd_message(peer, sync);
free_xml(sync);
} else if(safe_str_neq(v->current, value)) {
crm_info("Setting %s[%s]: %s -> %s from %s", attr, host, v->current, value, peer->uname);
free(v->current);
if(value) {
v->current = strdup(value);
} else {
v->current = NULL;
}
changed = TRUE;
} else {
crm_trace("Unchanged %s[%s] from %s is %s", attr, host, peer->uname, value);
}
a->changed |= changed;
/* this only involves cluster nodes. */
if(v->nodeid == 0 && (v->is_remote == FALSE)) {
if(crm_element_value_int(xml, F_ATTRD_HOST_ID, (int*)&v->nodeid) == 0) {
/* Create the name/id association */
crm_node_t *known_peer = crm_get_peer(v->nodeid, host);
crm_trace("We know %s's node id now: %s", known_peer->uname, known_peer->uuid);
if(election_state(writer) == election_won) {
write_attributes(FALSE, TRUE);
return;
}
}
}
if(changed) {
if(a->timer) {
crm_trace("Delayed write out (%dms) for %s", a->timeout_ms, a->id);
mainloop_timer_start(a->timer);
} else {
write_or_elect_attribute(a);
}
}
}
void
write_or_elect_attribute(attribute_t *a)
{
enum election_result rc = election_state(writer);
if(rc == election_won) {
write_attribute(a);
} else if(rc == election_in_progress) {
crm_trace("Election in progress to determine who will write out %s", a->id);
} else if(peer_writer == NULL) {
crm_info("Starting an election to determine who will write out %s", a->id);
election_vote(writer);
} else {
crm_trace("%s will write out %s, we are in state %d", peer_writer, a->id, rc);
}
}
gboolean
attrd_election_cb(gpointer user_data)
{
crm_trace("Election complete");
free(peer_writer);
peer_writer = strdup(attrd_cluster->uname);
/* Update the peers after an election */
attrd_peer_sync(NULL, NULL);
/* Update the CIB after an election */
write_attributes(TRUE, FALSE);
return FALSE;
}
void
attrd_peer_change_cb(enum crm_status_type kind, crm_node_t *peer, const void *data)
{
if(election_state(writer) == election_won
&& kind == crm_status_nstate
&& safe_str_eq(peer->state, CRM_NODE_MEMBER)) {
attrd_peer_sync(peer, NULL);
} else if(kind == crm_status_nstate
&& safe_str_neq(peer->state, CRM_NODE_MEMBER)) {
attrd_peer_remove(peer->uname, __FUNCTION__);
if(peer_writer && safe_str_eq(peer->uname, peer_writer)) {
free(peer_writer);
peer_writer = NULL;
crm_notice("Lost attribute writer %s", peer->uname);
}
} else if(kind == crm_status_processes) {
if(is_set(peer->processes, crm_proc_cpg)) {
crm_update_peer_state(__FUNCTION__, peer, CRM_NODE_MEMBER, 0);
} else {
crm_update_peer_state(__FUNCTION__, peer, CRM_NODE_LOST, 0);
}
}
}
static void
attrd_cib_callback(xmlNode * msg, int call_id, int rc, xmlNode * output, void *user_data)
{
int level = LOG_ERR;
GHashTableIter iter;
const char *peer = NULL;
attribute_value_t *v = NULL;
char *name = user_data;
attribute_t *a = g_hash_table_lookup(attributes, name);
if(a == NULL) {
crm_info("Attribute %s no longer exists", name);
goto done;
}
a->update = 0;
if (rc == pcmk_ok && call_id < 0) {
rc = call_id;
}
switch (rc) {
case pcmk_ok:
level = LOG_INFO;
last_cib_op_done = call_id;
break;
case -pcmk_err_diff_failed: /* When an attr changes while the CIB is syncing */
case -ETIME: /* When an attr changes while there is a DC election */
case -ENXIO: /* When an attr changes while the CIB is syncing a
* newer config from a node that just came up
*/
level = LOG_WARNING;
break;
}
do_crm_log(level, "Update %d for %s: %s (%d)", call_id, name, pcmk_strerror(rc), rc);
g_hash_table_iter_init(&iter, a->values);
while (g_hash_table_iter_next(&iter, (gpointer *) & peer, (gpointer *) & v)) {
crm_notice("Update %d for %s[%s]=%s: %s (%d)", call_id, a->id, peer, v->requested, pcmk_strerror(rc), rc);
if(rc == pcmk_ok) {
free(v->stored);
v->stored = v->requested;
v->requested = NULL;
} else {
free(v->requested);
v->requested = NULL;
a->changed = TRUE; /* Attempt write out again */
}
}
done:
free(name);
if(a && a->changed && election_state(writer) == election_won) {
write_attribute(a);
}
}
void
write_attributes(bool all, bool peer_discovered)
{
GHashTableIter iter;
attribute_t *a = NULL;
g_hash_table_iter_init(&iter, attributes);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) & a)) {
if (peer_discovered && a->unknown_peer_uuids) {
/* a new peer uuid has been discovered, try writing this attribute again. */
a->changed = TRUE;
}
if(all || a->changed) {
write_attribute(a);
} else {
crm_debug("Skipping unchanged attribute %s", a->id);
}
}
}
static void
build_update_element(xmlNode *parent, attribute_t *a, const char *nodeid, const char *value)
{
char *set = NULL;
char *uuid = NULL;
xmlNode *xml_obj = NULL;
if(a->set) {
set = g_strdup(a->set);
} else {
set = g_strdup_printf("%s-%s", XML_CIB_TAG_STATUS, nodeid);
}
if(a->uuid) {
uuid = g_strdup(a->uuid);
} else {
int lpc;
uuid = g_strdup_printf("%s-%s", set, a->id);
/* Minimal attempt at sanitizing automatic IDs */
for (lpc = 0; uuid[lpc] != 0; lpc++) {
switch (uuid[lpc]) {
case ':':
uuid[lpc] = '.';
}
}
}
xml_obj = create_xml_node(parent, XML_CIB_TAG_STATE);
crm_xml_add(xml_obj, XML_ATTR_ID, nodeid);
xml_obj = create_xml_node(xml_obj, XML_TAG_TRANSIENT_NODEATTRS);
crm_xml_add(xml_obj, XML_ATTR_ID, nodeid);
xml_obj = create_xml_node(xml_obj, XML_TAG_ATTR_SETS);
crm_xml_add(xml_obj, XML_ATTR_ID, set);
xml_obj = create_xml_node(xml_obj, XML_CIB_TAG_NVPAIR);
crm_xml_add(xml_obj, XML_ATTR_ID, uuid);
crm_xml_add(xml_obj, XML_NVPAIR_ATTR_NAME, a->id);
if(value) {
crm_xml_add(xml_obj, XML_NVPAIR_ATTR_VALUE, value);
} else {
crm_xml_add(xml_obj, XML_NVPAIR_ATTR_VALUE, "");
crm_xml_add(xml_obj, "__delete__", XML_NVPAIR_ATTR_VALUE);
}
g_free(uuid);
g_free(set);
}
void
write_attribute(attribute_t *a)
{
int updates = 0;
xmlNode *xml_top = NULL;
attribute_value_t *v = NULL;
GHashTableIter iter;
enum cib_call_options flags = cib_quorum_override;
if (a == NULL) {
return;
} else if (the_cib == NULL) {
crm_info("Write out of '%s' delayed: cib not connected", a->id);
return;
} else if(a->update && a->update < last_cib_op_done) {
crm_info("Write out of '%s' continuing: update %d considered lost", a->id, a->update);
} else if(a->update) {
crm_info("Write out of '%s' delayed: update %d in progress", a->id, a->update);
return;
} else if(mainloop_timer_running(a->timer)) {
crm_info("Write out of '%s' delayed: timer is running", a->id);
return;
}
a->changed = FALSE;
a->unknown_peer_uuids = FALSE;
xml_top = create_xml_node(NULL, XML_CIB_TAG_STATUS);
g_hash_table_iter_init(&iter, a->values);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) & v)) {
crm_node_t *peer = crm_get_peer_full(v->nodeid, v->nodename, CRM_GET_PEER_REMOTE|CRM_GET_PEER_CLUSTER);
if(peer && peer->id && v->nodeid == 0) {
crm_trace("Updating value's nodeid");
v->nodeid = peer->id;
}
if (peer == NULL) {
/* If the user is trying to set an attribute on an unknown peer, ignore it. */
crm_notice("Update error (peer not found): %s[%s]=%s failed (host=%p)", v->nodename, a->id, v->current, peer);
} else if (peer->uuid == NULL) {
/* peer is found, but we don't know the uuid yet. Wait until we discover a new uuid before attempting to write */
a->unknown_peer_uuids = TRUE;
crm_notice("Update error (unknown peer uuid, retry will be attempted once uuid is discovered): %s[%s]=%s failed (host=%p)", v->nodename, a->id, v->current, peer);
} else {
crm_debug("Update: %s[%s]=%s (%s %u %u %s)", v->nodename, a->id, v->current, peer->uuid, peer->id, v->nodeid, peer->uname);
build_update_element(xml_top, a, peer->uuid, v->current);
updates++;
free(v->requested);
v->requested = NULL;
if(v->current) {
v->requested = strdup(v->current);
} else {
/* Older versions don't know about the cib_mixed_update flag
* Make sure it goes to the local cib which does
*/
flags |= cib_mixed_update|cib_scope_local;
}
}
}
if(updates) {
crm_log_xml_trace(xml_top, __FUNCTION__);
a->update = cib_internal_op(the_cib, CIB_OP_MODIFY, NULL, XML_CIB_TAG_STATUS, xml_top, NULL,
flags, a->user);
crm_notice("Sent update %d with %d changes for %s, id=%s, set=%s",
a->update, updates, a->id, a->uuid ? a->uuid : "<n/a>", a->set);
the_cib->cmds->register_callback(
the_cib, a->update, 120, FALSE, strdup(a->id), "attrd_cib_callback", attrd_cib_callback);
}
free_xml(xml_top);
}
diff --git a/crmd/messages.c b/crmd/messages.c
index 1aa53b2e3d..bc3bad3f28 100644
--- a/crmd/messages.c
+++ b/crmd/messages.c
@@ -1,960 +1,971 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This software is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#include <sys/param.h>
#include <crm/crm.h>
#include <string.h>
#include <time.h>
#include <crmd_fsa.h>
#include <crm/msg_xml.h>
#include <crm/common/xml.h>
#include <crm/cluster/internal.h>
#include <crm/cib.h>
#include <crm/common/ipcs.h>
#include <crmd.h>
#include <crmd_messages.h>
#include <crmd_lrm.h>
#include <throttle.h>
GListPtr fsa_message_queue = NULL;
extern void crm_shutdown(int nsig);
void handle_response(xmlNode * stored_msg);
enum crmd_fsa_input handle_request(xmlNode * stored_msg, enum crmd_fsa_cause cause);
enum crmd_fsa_input handle_shutdown_request(xmlNode * stored_msg);
#define ROUTER_RESULT(x) crm_trace("Router result: %s", x)
/* debug only, can wrap all it likes */
int last_data_id = 0;
void
register_fsa_error_adv(enum crmd_fsa_cause cause, enum crmd_fsa_input input,
fsa_data_t * cur_data, void *new_data, const char *raised_from)
{
/* save the current actions if any */
if (fsa_actions != A_NOTHING) {
register_fsa_input_adv(cur_data ? cur_data->fsa_cause : C_FSA_INTERNAL,
I_NULL, cur_data ? cur_data->data : NULL,
fsa_actions, TRUE, __FUNCTION__);
}
/* reset the action list */
crm_info("Resetting the current action list");
fsa_dump_actions(fsa_actions, "Drop");
fsa_actions = A_NOTHING;
/* register the error */
register_fsa_input_adv(cause, input, new_data, A_NOTHING, TRUE, raised_from);
}
int
register_fsa_input_adv(enum crmd_fsa_cause cause, enum crmd_fsa_input input,
void *data, long long with_actions,
gboolean prepend, const char *raised_from)
{
unsigned old_len = g_list_length(fsa_message_queue);
fsa_data_t *fsa_data = NULL;
CRM_CHECK(raised_from != NULL, raised_from = "<unknown>");
if (input == I_NULL && with_actions == A_NOTHING /* && data == NULL */ ) {
/* no point doing anything */
crm_err("Cannot add entry to queue: no input and no action");
return 0;
}
if (input == I_WAIT_FOR_EVENT) {
do_fsa_stall = TRUE;
crm_debug("Stalling the FSA pending further input: source=%s cause=%s data=%p queue=%d",
raised_from, fsa_cause2string(cause), data, old_len);
if (old_len > 0) {
fsa_dump_queue(LOG_TRACE);
prepend = FALSE;
}
if (data == NULL) {
fsa_actions |= with_actions;
fsa_dump_actions(with_actions, "Restored");
return 0;
}
/* Store everything in the new event and reset fsa_actions */
with_actions |= fsa_actions;
fsa_actions = A_NOTHING;
}
last_data_id++;
crm_trace("%s %s FSA input %d (%s) (cause=%s) %s data",
raised_from, prepend ? "prepended" : "appended", last_data_id,
fsa_input2string(input), fsa_cause2string(cause), data ? "with" : "without");
fsa_data = calloc(1, sizeof(fsa_data_t));
fsa_data->id = last_data_id;
fsa_data->fsa_input = input;
fsa_data->fsa_cause = cause;
fsa_data->origin = raised_from;
fsa_data->data = NULL;
fsa_data->data_type = fsa_dt_none;
fsa_data->actions = with_actions;
if (with_actions != A_NOTHING) {
crm_trace("Adding actions %.16llx to input", with_actions);
}
if (data != NULL) {
switch (cause) {
case C_FSA_INTERNAL:
case C_CRMD_STATUS_CALLBACK:
case C_IPC_MESSAGE:
case C_HA_MESSAGE:
crm_trace("Copying %s data from %s as a HA msg",
fsa_cause2string(cause), raised_from);
CRM_CHECK(((ha_msg_input_t *) data)->msg != NULL,
crm_err("Bogus data from %s", raised_from));
fsa_data->data = copy_ha_msg_input(data);
fsa_data->data_type = fsa_dt_ha_msg;
break;
case C_LRM_OP_CALLBACK:
crm_trace("Copying %s data from %s as lrmd_event_data_t",
fsa_cause2string(cause), raised_from);
fsa_data->data = lrmd_copy_event((lrmd_event_data_t *) data);
fsa_data->data_type = fsa_dt_lrm;
break;
case C_CCM_CALLBACK:
case C_SUBSYSTEM_CONNECT:
case C_LRM_MONITOR_CALLBACK:
case C_TIMER_POPPED:
case C_SHUTDOWN:
case C_HEARTBEAT_FAILED:
case C_HA_DISCONNECT:
case C_ILLEGAL:
case C_UNKNOWN:
case C_STARTUP:
crm_err("Copying %s data (from %s)"
" not yet implemented", fsa_cause2string(cause), raised_from);
crmd_exit(pcmk_err_generic);
break;
}
crm_trace("%s data copied", fsa_cause2string(fsa_data->fsa_cause));
}
/* make sure to free it properly later */
if (prepend) {
crm_trace("Prepending input");
fsa_message_queue = g_list_prepend(fsa_message_queue, fsa_data);
} else {
fsa_message_queue = g_list_append(fsa_message_queue, fsa_data);
}
crm_trace("Queue len: %d", g_list_length(fsa_message_queue));
/* fsa_dump_queue(LOG_DEBUG_2); */
if (old_len == g_list_length(fsa_message_queue)) {
crm_err("Couldn't add message to the queue");
}
if (fsa_source && input != I_WAIT_FOR_EVENT) {
crm_trace("Triggering FSA: %s", __FUNCTION__);
mainloop_set_trigger(fsa_source);
}
return last_data_id;
}
void
fsa_dump_queue(int log_level)
{
int offset = 0;
GListPtr lpc = NULL;
for (lpc = fsa_message_queue; lpc != NULL; lpc = lpc->next) {
fsa_data_t *data = (fsa_data_t *) lpc->data;
do_crm_log_unlikely(log_level,
"queue[%d.%d]: input %s raised by %s(%p.%d)\t(cause=%s)",
offset++, data->id, fsa_input2string(data->fsa_input),
data->origin, data->data, data->data_type,
fsa_cause2string(data->fsa_cause));
}
}
ha_msg_input_t *
copy_ha_msg_input(ha_msg_input_t * orig)
{
ha_msg_input_t *copy = NULL;
xmlNodePtr data = NULL;
if (orig != NULL) {
crm_trace("Copy msg");
data = copy_xml(orig->msg);
} else {
crm_trace("No message to copy");
}
copy = new_ha_msg_input(data);
if (orig && orig->msg != NULL) {
CRM_CHECK(copy->msg != NULL, crm_err("copy failed"));
}
return copy;
}
void
delete_fsa_input(fsa_data_t * fsa_data)
{
lrmd_event_data_t *op = NULL;
xmlNode *foo = NULL;
if (fsa_data == NULL) {
return;
}
crm_trace("About to free %s data", fsa_cause2string(fsa_data->fsa_cause));
if (fsa_data->data != NULL) {
switch (fsa_data->data_type) {
case fsa_dt_ha_msg:
delete_ha_msg_input(fsa_data->data);
break;
case fsa_dt_xml:
foo = fsa_data->data;
free_xml(foo);
break;
case fsa_dt_lrm:
op = (lrmd_event_data_t *) fsa_data->data;
lrmd_free_event(op);
break;
case fsa_dt_none:
if (fsa_data->data != NULL) {
crm_err("Don't know how to free %s data from %s",
fsa_cause2string(fsa_data->fsa_cause), fsa_data->origin);
crmd_exit(pcmk_err_generic);
}
break;
}
crm_trace("%s data freed", fsa_cause2string(fsa_data->fsa_cause));
}
free(fsa_data);
}
/* returns the next message */
fsa_data_t *
get_message(void)
{
fsa_data_t *message = g_list_nth_data(fsa_message_queue, 0);
fsa_message_queue = g_list_remove(fsa_message_queue, message);
crm_trace("Processing input %d", message->id);
return message;
}
/* returns the current head of the FIFO queue */
gboolean
is_message(void)
{
return (g_list_length(fsa_message_queue) > 0);
}
void *
fsa_typed_data_adv(fsa_data_t * fsa_data, enum fsa_data_type a_type, const char *caller)
{
void *ret_val = NULL;
if (fsa_data == NULL) {
crm_err("%s: No FSA data available", caller);
} else if (fsa_data->data == NULL) {
crm_err("%s: No message data available. Origin: %s", caller, fsa_data->origin);
} else if (fsa_data->data_type != a_type) {
crm_crit("%s: Message data was the wrong type! %d vs. requested=%d. Origin: %s",
caller, fsa_data->data_type, a_type, fsa_data->origin);
CRM_ASSERT(fsa_data->data_type == a_type);
} else {
ret_val = fsa_data->data;
}
return ret_val;
}
/* A_MSG_ROUTE */
void
do_msg_route(long long action,
enum crmd_fsa_cause cause,
enum crmd_fsa_state cur_state,
enum crmd_fsa_input current_input, fsa_data_t * msg_data)
{
ha_msg_input_t *input = fsa_typed_data(fsa_dt_ha_msg);
route_message(msg_data->fsa_cause, input->msg);
}
void
route_message(enum crmd_fsa_cause cause, xmlNode * input)
{
ha_msg_input_t fsa_input;
enum crmd_fsa_input result = I_NULL;
fsa_input.msg = input;
CRM_CHECK(cause == C_IPC_MESSAGE || cause == C_HA_MESSAGE, return);
/* try passing the buck first */
if (relay_message(input, cause == C_IPC_MESSAGE)) {
return;
}
/* handle locally */
result = handle_message(input, cause);
/* done or process later? */
switch (result) {
case I_NULL:
case I_CIB_OP:
case I_ROUTER:
case I_NODE_JOIN:
case I_JOIN_REQUEST:
case I_JOIN_RESULT:
break;
default:
/* Deferring local processing of message */
register_fsa_input_later(cause, result, &fsa_input);
return;
}
if (result != I_NULL) {
/* add to the front of the queue */
register_fsa_input(cause, result, &fsa_input);
}
}
gboolean
relay_message(xmlNode * msg, gboolean originated_locally)
{
int dest = 1;
int is_for_dc = 0;
int is_for_dcib = 0;
int is_for_te = 0;
int is_for_crm = 0;
int is_for_cib = 0;
int is_local = 0;
gboolean processing_complete = FALSE;
const char *host_to = crm_element_value(msg, F_CRM_HOST_TO);
const char *sys_to = crm_element_value(msg, F_CRM_SYS_TO);
const char *sys_from = crm_element_value(msg, F_CRM_SYS_FROM);
const char *type = crm_element_value(msg, F_TYPE);
const char *msg_error = NULL;
crm_trace("Routing message %s", crm_element_value(msg, XML_ATTR_REFERENCE));
if (msg == NULL) {
msg_error = "Cannot route empty message";
} else if (safe_str_eq(CRM_OP_HELLO, crm_element_value(msg, F_CRM_TASK))) {
/* quietly ignore */
processing_complete = TRUE;
} else if (safe_str_neq(type, T_CRM)) {
msg_error = "Bad message type";
} else if (sys_to == NULL) {
msg_error = "Bad message destination: no subsystem";
}
if (msg_error != NULL) {
processing_complete = TRUE;
crm_err("%s", msg_error);
crm_log_xml_warn(msg, "bad msg");
}
if (processing_complete) {
return TRUE;
}
processing_complete = TRUE;
is_for_dc = (strcasecmp(CRM_SYSTEM_DC, sys_to) == 0);
is_for_dcib = (strcasecmp(CRM_SYSTEM_DCIB, sys_to) == 0);
is_for_te = (strcasecmp(CRM_SYSTEM_TENGINE, sys_to) == 0);
is_for_cib = (strcasecmp(CRM_SYSTEM_CIB, sys_to) == 0);
is_for_crm = (strcasecmp(CRM_SYSTEM_CRMD, sys_to) == 0);
is_local = 0;
if (host_to == NULL || strlen(host_to) == 0) {
if (is_for_dc || is_for_te) {
is_local = 0;
} else if (is_for_crm && originated_locally) {
is_local = 0;
} else {
is_local = 1;
}
} else if (safe_str_eq(fsa_our_uname, host_to)) {
is_local = 1;
}
if (is_for_dc || is_for_dcib || is_for_te) {
if (AM_I_DC && is_for_te) {
ROUTER_RESULT("Message result: Local relay");
send_msg_via_ipc(msg, sys_to);
} else if (AM_I_DC) {
ROUTER_RESULT("Message result: DC/CRMd process");
processing_complete = FALSE; /* more to be done by caller */
} else if (originated_locally && safe_str_neq(sys_from, CRM_SYSTEM_PENGINE)
&& safe_str_neq(sys_from, CRM_SYSTEM_TENGINE)) {
/* Neither the TE nor the PE should be sending messages
* to DCs on other nodes
*
* By definition, if we are no longer the DC, then
* the PE's or TE's data should be discarded
*/
#if SUPPORT_COROSYNC
if (is_openais_cluster()) {
dest = text2msg_type(sys_to);
}
#endif
ROUTER_RESULT("Message result: External relay to DC");
send_cluster_message(host_to ? crm_get_peer(0, host_to) : NULL, dest, msg, TRUE);
} else {
/* discard */
ROUTER_RESULT("Message result: Discard, not DC");
}
} else if (is_local && (is_for_crm || is_for_cib)) {
ROUTER_RESULT("Message result: CRMd process");
processing_complete = FALSE; /* more to be done by caller */
} else if (is_local) {
ROUTER_RESULT("Message result: Local relay");
send_msg_via_ipc(msg, sys_to);
} else {
+ crm_node_t *node_to = NULL;
+
#if SUPPORT_COROSYNC
if (is_openais_cluster()) {
dest = text2msg_type(sys_to);
if (dest == crm_msg_none || dest > crm_msg_stonith_ng) {
dest = crm_msg_crmd;
}
}
#endif
+
+ if (host_to) {
+ node_to = crm_find_peer(0, host_to);
+ if (node_to == NULL) {
+ crm_err("Cannot route message to unknown node %s", host_to);
+ return TRUE;
+ }
+ }
+
ROUTER_RESULT("Message result: External relay");
- send_cluster_message(host_to ? crm_get_peer(0, host_to) : NULL, dest, msg, TRUE);
+ send_cluster_message(host_to ? node_to : NULL, dest, msg, TRUE);
}
return processing_complete;
}
static gboolean
process_hello_message(xmlNode * hello,
char **client_name, char **major_version, char **minor_version)
{
const char *local_client_name;
const char *local_major_version;
const char *local_minor_version;
*client_name = NULL;
*major_version = NULL;
*minor_version = NULL;
if (hello == NULL) {
return FALSE;
}
local_client_name = crm_element_value(hello, "client_name");
local_major_version = crm_element_value(hello, "major_version");
local_minor_version = crm_element_value(hello, "minor_version");
if (local_client_name == NULL || strlen(local_client_name) == 0) {
crm_err("Hello message was not valid (field %s not found)", "client name");
return FALSE;
} else if (local_major_version == NULL || strlen(local_major_version) == 0) {
crm_err("Hello message was not valid (field %s not found)", "major version");
return FALSE;
} else if (local_minor_version == NULL || strlen(local_minor_version) == 0) {
crm_err("Hello message was not valid (field %s not found)", "minor version");
return FALSE;
}
*client_name = strdup(local_client_name);
*major_version = strdup(local_major_version);
*minor_version = strdup(local_minor_version);
crm_trace("Hello message ok");
return TRUE;
}
gboolean
crmd_authorize_message(xmlNode * client_msg, crm_client_t * curr_client, const char *proxy_session)
{
char *client_name = NULL;
char *major_version = NULL;
char *minor_version = NULL;
gboolean auth_result = FALSE;
xmlNode *xml = NULL;
const char *op = crm_element_value(client_msg, F_CRM_TASK);
const char *uuid = curr_client ? curr_client->id : proxy_session;
if (uuid == NULL) {
crm_warn("Message [%s] not authorized", crm_element_value(client_msg, XML_ATTR_REFERENCE));
return FALSE;
} else if (safe_str_neq(CRM_OP_HELLO, op)) {
return TRUE;
}
xml = get_message_xml(client_msg, F_CRM_DATA);
auth_result = process_hello_message(xml, &client_name, &major_version, &minor_version);
if (auth_result == TRUE) {
if (client_name == NULL) {
crm_err("Bad client details (client_name=%s, uuid=%s)",
crm_str(client_name), uuid);
auth_result = FALSE;
}
}
if (auth_result == TRUE) {
/* check version */
int mav = atoi(major_version);
int miv = atoi(minor_version);
crm_trace("Checking client version number");
if (mav < 0 || miv < 0) {
crm_err("Client version (%d:%d) is not acceptable", mav, miv);
auth_result = FALSE;
}
}
if (auth_result == TRUE) {
crm_trace("Accepted client %s", client_name);
if (curr_client) {
curr_client->userdata = strdup(client_name);
}
crm_trace("Triggering FSA: %s", __FUNCTION__);
mainloop_set_trigger(fsa_source);
} else {
crm_warn("Rejected client logon request");
if (curr_client) {
qb_ipcs_disconnect(curr_client->ipcs);
}
}
free(minor_version);
free(major_version);
free(client_name);
/* hello messages should never be processed further */
return FALSE;
}
enum crmd_fsa_input
handle_message(xmlNode * msg, enum crmd_fsa_cause cause)
{
const char *type = NULL;
CRM_CHECK(msg != NULL, return I_NULL);
type = crm_element_value(msg, F_CRM_MSG_TYPE);
if (crm_str_eq(type, XML_ATTR_REQUEST, TRUE)) {
return handle_request(msg, cause);
} else if (crm_str_eq(type, XML_ATTR_RESPONSE, TRUE)) {
handle_response(msg);
return I_NULL;
}
crm_err("Unknown message type: %s", type);
return I_NULL;
}
static enum crmd_fsa_input
handle_failcount_op(xmlNode * stored_msg)
{
const char *rsc = NULL;
const char *uname = NULL;
gboolean is_remote_node = FALSE;
xmlNode *xml_rsc = get_xpath_object("//" XML_CIB_TAG_RESOURCE, stored_msg, LOG_ERR);
if (xml_rsc) {
rsc = ID(xml_rsc);
}
uname = crm_element_value(stored_msg, XML_LRM_ATTR_TARGET);
if (crm_element_value(stored_msg, XML_LRM_ATTR_ROUTER_NODE)) {
is_remote_node = TRUE;
}
if (rsc) {
char *attr = NULL;
crm_info("Removing failcount for %s", rsc);
attr = crm_concat("fail-count", rsc, '-');
update_attrd(uname, attr, NULL, NULL, is_remote_node);
free(attr);
attr = crm_concat("last-failure", rsc, '-');
update_attrd(uname, attr, NULL, NULL, is_remote_node);
free(attr);
lrm_clear_last_failure(rsc, uname);
} else {
crm_log_xml_warn(stored_msg, "invalid failcount op");
}
return I_NULL;
}
enum crmd_fsa_input
handle_request(xmlNode * stored_msg, enum crmd_fsa_cause cause)
{
xmlNode *msg = NULL;
const char *op = crm_element_value(stored_msg, F_CRM_TASK);
/* Optimize this for the DC - it has the most to do */
if (op == NULL) {
crm_log_xml_err(stored_msg, "Bad message");
return I_NULL;
}
/*========== DC-Only Actions ==========*/
if (AM_I_DC) {
if (strcmp(op, CRM_OP_JOIN_ANNOUNCE) == 0) {
return I_NODE_JOIN;
} else if (strcmp(op, CRM_OP_JOIN_REQUEST) == 0) {
return I_JOIN_REQUEST;
} else if (strcmp(op, CRM_OP_JOIN_CONFIRM) == 0) {
return I_JOIN_RESULT;
} else if (strcmp(op, CRM_OP_SHUTDOWN) == 0) {
const char *host_from = crm_element_value(stored_msg, F_CRM_HOST_FROM);
gboolean dc_match = safe_str_eq(host_from, fsa_our_dc);
if (is_set(fsa_input_register, R_SHUTDOWN)) {
crm_info("Shutting ourselves down (DC)");
return I_STOP;
} else if (dc_match) {
crm_err("We didn't ask to be shut down, yet our"
" TE is telling us to. Better get out now!");
return I_TERMINATE;
} else if (fsa_state != S_STOPPING) {
crm_err("Another node is asking us to shut down, but we think we're ok.");
return I_ELECTION;
}
} else if (strcmp(op, CRM_OP_SHUTDOWN_REQ) == 0) {
/* a slave wants to shut down */
/* create cib fragment and add to message */
return handle_shutdown_request(stored_msg);
}
}
/*========== common actions ==========*/
if (strcmp(op, CRM_OP_NOVOTE) == 0) {
ha_msg_input_t fsa_input;
fsa_input.msg = stored_msg;
register_fsa_input_adv(C_HA_MESSAGE, I_NULL, &fsa_input,
A_ELECTION_COUNT | A_ELECTION_CHECK, FALSE, __FUNCTION__);
} else if (strcmp(op, CRM_OP_THROTTLE) == 0) {
throttle_update(stored_msg);
return I_NULL;
} else if (strcmp(op, CRM_OP_CLEAR_FAILCOUNT) == 0) {
return handle_failcount_op(stored_msg);
} else if (strcmp(op, CRM_OP_VOTE) == 0) {
/* count the vote and decide what to do after that */
ha_msg_input_t fsa_input;
fsa_input.msg = stored_msg;
register_fsa_input_adv(C_HA_MESSAGE, I_NULL, &fsa_input,
A_ELECTION_COUNT | A_ELECTION_CHECK, FALSE, __FUNCTION__);
/* Sometimes we _must_ go into S_ELECTION */
if (fsa_state == S_HALT) {
crm_debug("Forcing an election from S_HALT");
return I_ELECTION;
#if 0
} else if (AM_I_DC) {
/* This is the old way of doing things but what is gained? */
return I_ELECTION;
#endif
}
} else if (strcmp(op, CRM_OP_JOIN_OFFER) == 0) {
crm_debug("Raising I_JOIN_OFFER: join-%s", crm_element_value(stored_msg, F_CRM_JOIN_ID));
return I_JOIN_OFFER;
} else if (strcmp(op, CRM_OP_JOIN_ACKNAK) == 0) {
crm_debug("Raising I_JOIN_RESULT: join-%s", crm_element_value(stored_msg, F_CRM_JOIN_ID));
return I_JOIN_RESULT;
} else if (strcmp(op, CRM_OP_LRM_DELETE) == 0
|| strcmp(op, CRM_OP_LRM_FAIL) == 0
|| strcmp(op, CRM_OP_LRM_REFRESH) == 0 || strcmp(op, CRM_OP_REPROBE) == 0) {
crm_xml_add(stored_msg, F_CRM_SYS_TO, CRM_SYSTEM_LRMD);
return I_ROUTER;
} else if (strcmp(op, CRM_OP_NOOP) == 0) {
return I_NULL;
} else if (strcmp(op, CRM_OP_LOCAL_SHUTDOWN) == 0) {
crm_shutdown(SIGTERM);
/*return I_SHUTDOWN; */
return I_NULL;
/*========== (NOT_DC)-Only Actions ==========*/
} else if (AM_I_DC == FALSE && strcmp(op, CRM_OP_SHUTDOWN) == 0) {
const char *host_from = crm_element_value(stored_msg, F_CRM_HOST_FROM);
gboolean dc_match = safe_str_eq(host_from, fsa_our_dc);
if (dc_match || fsa_our_dc == NULL) {
if (is_set(fsa_input_register, R_SHUTDOWN) == FALSE) {
crm_err("We didn't ask to be shut down, yet our DC is telling us to.");
set_bit(fsa_input_register, R_STAYDOWN);
return I_STOP;
}
crm_info("Shutting down");
return I_STOP;
} else {
crm_warn("Discarding %s op from %s", op, host_from);
}
} else if (strcmp(op, CRM_OP_PING) == 0) {
/* eventually do some checks to figure out
* whether we really /are/ ok
*/
const char *sys_to = crm_element_value(stored_msg, F_CRM_SYS_TO);
xmlNode *ping = create_xml_node(NULL, XML_CRM_TAG_PING);
crm_xml_add(ping, XML_PING_ATTR_STATUS, "ok");
crm_xml_add(ping, XML_PING_ATTR_SYSFROM, sys_to);
crm_xml_add(ping, "crmd_state", fsa_state2string(fsa_state));
/* Ok, so technically not so interesting, but CTS needs to see this */
crm_notice("Current ping state: %s", fsa_state2string(fsa_state));
msg = create_reply(stored_msg, ping);
if(msg) {
relay_message(msg, TRUE);
}
free_xml(ping);
free_xml(msg);
} else if (strcmp(op, CRM_OP_RM_NODE_CACHE) == 0) {
int id = 0;
const char *name = NULL;
crm_element_value_int(stored_msg, XML_ATTR_ID, &id);
name = crm_element_value(stored_msg, XML_ATTR_UNAME);
if(cause == C_IPC_MESSAGE) {
msg = create_request(CRM_OP_RM_NODE_CACHE, NULL, NULL, CRM_SYSTEM_CRMD, CRM_SYSTEM_CRMD, NULL);
if (send_cluster_message(NULL, crm_msg_crmd, msg, TRUE) == FALSE) {
crm_err("Could not instruct peers to remove references to node %s/%u", name, id);
} else {
crm_notice("Instructing peers to remove references to node %s/%u", name, id);
}
free_xml(msg);
} else {
reap_crm_member(id, name);
}
} else {
crm_err("Unexpected request (%s) sent to %s", op, AM_I_DC ? "the DC" : "non-DC node");
crm_log_xml_err(stored_msg, "Unexpected");
}
return I_NULL;
}
void
handle_response(xmlNode * stored_msg)
{
const char *op = crm_element_value(stored_msg, F_CRM_TASK);
if (op == NULL) {
crm_log_xml_err(stored_msg, "Bad message");
} else if (AM_I_DC && strcmp(op, CRM_OP_PECALC) == 0) {
/* Check whether the PE answer has been superseded by a subsequent request */
const char *msg_ref = crm_element_value(stored_msg, XML_ATTR_REFERENCE);
if (msg_ref == NULL) {
crm_err("%s - Ignoring calculation with no reference", op);
} else if (safe_str_eq(msg_ref, fsa_pe_ref)) {
ha_msg_input_t fsa_input;
fsa_input.msg = stored_msg;
register_fsa_input_later(C_IPC_MESSAGE, I_PE_SUCCESS, &fsa_input);
crm_trace("Completed: %s...", fsa_pe_ref);
} else {
crm_info("%s calculation %s is obsolete", op, msg_ref);
}
} else if (strcmp(op, CRM_OP_VOTE) == 0
|| strcmp(op, CRM_OP_SHUTDOWN_REQ) == 0 || strcmp(op, CRM_OP_SHUTDOWN) == 0) {
} else {
const char *host_from = crm_element_value(stored_msg, F_CRM_HOST_FROM);
crm_err("Unexpected response (op=%s, src=%s) sent to the %s",
op, host_from, AM_I_DC ? "DC" : "CRMd");
}
}
enum crmd_fsa_input
handle_shutdown_request(xmlNode * stored_msg)
{
/* Handle the request here to avoid potential version issues, in case
* the shutdown message/procedure was changed in later versions.
*
* This way the DC is always in control of the shutdown.
*/
char *now_s = NULL;
time_t now = time(NULL);
const char *host_from = crm_element_value(stored_msg, F_CRM_HOST_FROM);
if (host_from == NULL) {
/* we are the DC, shutting ourselves down */
host_from = fsa_our_uname;
}
crm_info("Creating shutdown request for %s (state=%s)", host_from, fsa_state2string(fsa_state));
crm_log_xml_trace(stored_msg, "message");
now_s = crm_itoa(now);
update_attrd(host_from, XML_CIB_ATTR_SHUTDOWN, now_s, NULL, FALSE);
free(now_s);
/* will be picked up by the TE as long as it's running */
return I_NULL;
}
/* msg is deleted by the time this returns */
extern gboolean process_te_message(xmlNode * msg, xmlNode * xml_data);
gboolean
send_msg_via_ipc(xmlNode * msg, const char *sys)
{
gboolean send_ok = TRUE;
crm_client_t *client_channel = crm_client_get_by_id(sys);
if (crm_element_value(msg, F_CRM_HOST_FROM) == NULL) {
crm_xml_add(msg, F_CRM_HOST_FROM, fsa_our_uname);
}
if (client_channel != NULL) {
/* Transient clients such as crmadmin */
send_ok = crm_ipcs_send(client_channel, 0, msg, crm_ipc_server_event);
} else if (sys != NULL && strcmp(sys, CRM_SYSTEM_TENGINE) == 0) {
xmlNode *data = get_message_xml(msg, F_CRM_DATA);
process_te_message(msg, data);
} else if (sys != NULL && strcmp(sys, CRM_SYSTEM_LRMD) == 0) {
fsa_data_t fsa_data;
ha_msg_input_t fsa_input;
fsa_input.msg = msg;
fsa_input.xml = get_message_xml(msg, F_CRM_DATA);
fsa_data.id = 0;
fsa_data.actions = 0;
fsa_data.data = &fsa_input;
fsa_data.fsa_input = I_MESSAGE;
fsa_data.fsa_cause = C_IPC_MESSAGE;
fsa_data.origin = __FUNCTION__;
fsa_data.data_type = fsa_dt_ha_msg;
#ifdef FSA_TRACE
crm_trace("Invoking action A_LRM_INVOKE (%.16llx)", A_LRM_INVOKE);
#endif
do_lrm_invoke(A_LRM_INVOKE, C_IPC_MESSAGE, fsa_state, I_MESSAGE, &fsa_data);
} else if (sys != NULL && crmd_is_proxy_session(sys)) {
crmd_proxy_send(sys, msg);
} else {
crm_debug("Unknown Sub-system (%s)... discarding message.", crm_str(sys));
send_ok = FALSE;
}
return send_ok;
}
ha_msg_input_t *
new_ha_msg_input(xmlNode * orig)
{
ha_msg_input_t *input_copy = NULL;
input_copy = calloc(1, sizeof(ha_msg_input_t));
input_copy->msg = orig;
input_copy->xml = get_message_xml(input_copy->msg, F_CRM_DATA);
return input_copy;
}
void
delete_ha_msg_input(ha_msg_input_t * orig)
{
if (orig == NULL) {
return;
}
free_xml(orig->msg);
free(orig);
}
diff --git a/crmd/te_utils.c b/crmd/te_utils.c
index 0c92e95881..47880dd58a 100644
--- a/crmd/te_utils.c
+++ b/crmd/te_utils.c
@@ -1,491 +1,497 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This software is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#include <sys/param.h>
#include <crm/crm.h>
#include <crm/msg_xml.h>
#include <crm/common/xml.h>
#include <tengine.h>
#include <crmd_fsa.h>
#include <crmd_messages.h>
#include <throttle.h>
#include <crm/fencing/internal.h>
crm_trigger_t *stonith_reconnect = NULL;
GListPtr stonith_cleanup_list = NULL;
static gboolean
fail_incompletable_stonith(crm_graph_t * graph)
{
GListPtr lpc = NULL;
const char *task = NULL;
xmlNode *last_action = NULL;
if (graph == NULL) {
return FALSE;
}
for (lpc = graph->synapses; lpc != NULL; lpc = lpc->next) {
GListPtr lpc2 = NULL;
synapse_t *synapse = (synapse_t *) lpc->data;
if (synapse->confirmed) {
continue;
}
for (lpc2 = synapse->actions; lpc2 != NULL; lpc2 = lpc2->next) {
crm_action_t *action = (crm_action_t *) lpc2->data;
if (action->type != action_type_crm || action->confirmed) {
continue;
}
task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
if (task && safe_str_eq(task, CRM_OP_FENCE)) {
action->failed = TRUE;
last_action = action->xml;
update_graph(graph, action);
crm_notice("Failing action %d (%s): STONITHd terminated",
action->id, ID(action->xml));
}
}
}
if (last_action != NULL) {
crm_warn("STONITHd failure resulted in un-runnable actions");
abort_transition(INFINITY, tg_restart, "Stonith failure", last_action);
return TRUE;
}
return FALSE;
}
static void
tengine_stonith_connection_destroy(stonith_t * st, stonith_event_t * e)
{
if (is_set(fsa_input_register, R_ST_REQUIRED)) {
crm_crit("Fencing daemon connection failed");
mainloop_set_trigger(stonith_reconnect);
} else {
crm_info("Fencing daemon disconnected");
}
/* cbchan will be garbage at this point, arrange for it to be reset */
if(stonith_api) {
stonith_api->state = stonith_disconnected;
}
if (AM_I_DC) {
fail_incompletable_stonith(transition_graph);
trigger_graph();
}
}
#if SUPPORT_CMAN
# include <libfenced.h>
#endif
char *te_client_id = NULL;
#ifdef HAVE_SYS_REBOOT_H
# include <unistd.h>
# include <sys/reboot.h>
#endif
static void
tengine_stonith_notify(stonith_t * st, stonith_event_t * st_event)
{
if(te_client_id == NULL) {
te_client_id = g_strdup_printf("%s.%d", crm_system_name, getpid());
}
if (st_event == NULL) {
crm_err("Notify data not found");
return;
}
if (st_event->result == pcmk_ok && crm_str_eq(st_event->target, fsa_our_uname, TRUE)) {
crm_crit("We were allegedly just fenced by %s for %s with %s!", st_event->executioner,
st_event->origin, st_event->device); /* Dumps blackbox if enabled */
qb_log_fini(); /* Try to get the above log message to disk - somehow */
/* Get out ASAP and do not come back up.
*
* Triggering a reboot is not the worst idea either, since
* the rest of the cluster thinks we're safely down
*/
#ifdef RB_HALT_SYSTEM
reboot(RB_HALT_SYSTEM);
#endif
/*
* If reboot() fails or is not supported, coming back up will
* probably lead to a situation where the other nodes set our
* status to 'lost' because of the fencing callback and will
* discard subsequent election votes with:
*
* Election 87 (current: 5171, owner: 103): Processed vote from east-03 (Peer is not part of our cluster)
*
* So just stay dead, something is seriously messed up anyway.
*
*/
exit(100); /* None of our wrappers since we already called qb_log_fini() */
return;
}
if (st_event->result == pcmk_ok &&
safe_str_eq(st_event->operation, T_STONITH_NOTIFY_FENCE)) {
st_fail_count_reset(st_event->target);
}
crm_notice("Peer %s was%s terminated (%s) by %s for %s: %s (ref=%s) by client %s",
st_event->target, st_event->result == pcmk_ok ? "" : " not",
st_event->action,
st_event->executioner ? st_event->executioner : "<anyone>",
st_event->origin, pcmk_strerror(st_event->result), st_event->id,
st_event->client_origin ? st_event->client_origin : "<unknown>");
#if SUPPORT_CMAN
if (st_event->result == pcmk_ok && is_cman_cluster()) {
int local_rc = 0;
int confirm = 0;
char *target_copy = strdup(st_event->target);
/* In case fenced hasn't noticed yet
*
* Any fencing that has been initiated will be completed by way of the fence_pcmk redirect
*/
local_rc = fenced_external(target_copy);
if (local_rc != 0) {
crm_err("Could not notify CMAN that '%s' is now fenced: %d", st_event->target,
local_rc);
} else {
crm_notice("Notified CMAN that '%s' is now fenced", st_event->target);
}
/* In case fenced is already trying to shoot it */
confirm = open("/var/run/cluster/fenced_override", O_NONBLOCK|O_WRONLY);
if (confirm >= 0) { /* open() returns -1 on failure */
int ignore = 0;
int len = strlen(target_copy);
errno = 0;
local_rc = write(confirm, target_copy, len);
ignore = write(confirm, "\n", 1);
if(ignore < 0 && errno == EBADF) {
crm_trace("CMAN not expecting %s to be fenced (yet)", st_event->target);
} else if (local_rc < len) {
crm_perror(LOG_ERR, "Confirmation of CMAN fencing event for '%s' failed: %d", st_event->target, local_rc);
} else {
fsync(confirm);
crm_notice("Confirmed CMAN fencing event for '%s'", st_event->target);
}
close(confirm);
}
free(target_copy);
}
#endif
- if (st_event->result == pcmk_ok) {
- crm_node_t *peer = crm_get_peer_full(0, st_event->target, CRM_GET_PEER_REMOTE | CRM_GET_PEER_CLUSTER);
- const char *uuid = crm_peer_uuid(peer);
- gboolean we_are_executioner = safe_str_eq(st_event->executioner, fsa_our_uname);
+ if (st_event->result == pcmk_ok) {
+ crm_node_t *peer = crm_find_peer_full(0, st_event->target, CRM_GET_PEER_REMOTE | CRM_GET_PEER_CLUSTER);
+ const char *uuid = NULL;
+ gboolean we_are_executioner = safe_str_eq(st_event->executioner, fsa_our_uname);
+
+ if (peer == NULL) {
+ return;
+ }
+
+ uuid = crm_peer_uuid(peer);
crm_trace("target=%s dc=%s", st_event->target, fsa_our_dc);
if(AM_I_DC) {
/* The DC always sends updates */
send_stonith_update(NULL, st_event->target, uuid);
if (st_event->client_origin && safe_str_neq(st_event->client_origin, te_client_id)) {
/* Abort the current transition graph if it wasn't us
* that invoked stonith to fence someone
*/
crm_info("External fencing operation from %s fenced %s", st_event->client_origin, st_event->target);
abort_transition(INFINITY, tg_restart, "External Fencing Operation", NULL);
}
/* Assume it was our leader if we don't currently have one */
} else if (fsa_our_dc == NULL || safe_str_eq(fsa_our_dc, st_event->target)) {
crm_notice("Target %s our leader %s (recorded: %s)",
fsa_our_dc ? "was" : "may have been", st_event->target,
fsa_our_dc ? fsa_our_dc : "<unset>");
/* Given the CIB resyncing that occurs around elections,
* have one node update the CIB now and, if the new DC is different,
* have them do so too after the election
*/
if (we_are_executioner) {
send_stonith_update(NULL, st_event->target, uuid);
}
stonith_cleanup_list = g_list_append(stonith_cleanup_list, strdup(st_event->target));
}
crmd_peer_down(peer, TRUE);
}
}
gboolean
te_connect_stonith(gpointer user_data)
{
int lpc = 0;
int rc = pcmk_ok;
if (stonith_api == NULL) {
stonith_api = stonith_api_new();
}
if (stonith_api->state != stonith_disconnected) {
crm_trace("Still connected");
return TRUE;
}
for (lpc = 0; lpc < 30; lpc++) {
crm_debug("Attempting connection to fencing daemon...");
sleep(1);
rc = stonith_api->cmds->connect(stonith_api, crm_system_name, NULL);
if (rc == pcmk_ok) {
break;
}
if (user_data != NULL) {
crm_err("Sign-in failed: triggered a retry");
mainloop_set_trigger(stonith_reconnect);
return TRUE;
}
crm_err("Sign-in failed: pausing and trying again in 2s...");
sleep(1);
}
CRM_CHECK(rc == pcmk_ok, return TRUE); /* If not, we failed 30 times... just get out */
stonith_api->cmds->register_notification(stonith_api, T_STONITH_NOTIFY_DISCONNECT,
tengine_stonith_connection_destroy);
stonith_api->cmds->register_notification(stonith_api, T_STONITH_NOTIFY_FENCE,
tengine_stonith_notify);
crm_trace("Connected");
return TRUE;
}
gboolean
stop_te_timer(crm_action_timer_t * timer)
{
const char *timer_desc = "action timer";
if (timer == NULL) {
return FALSE;
}
if (timer->reason == timeout_abort) {
timer_desc = "global timer";
crm_trace("Stopping %s", timer_desc);
}
if (timer->source_id != 0) {
crm_trace("Stopping %s", timer_desc);
g_source_remove(timer->source_id);
timer->source_id = 0;
} else {
crm_trace("%s was already stopped", timer_desc);
return FALSE;
}
return TRUE;
}
gboolean
te_graph_trigger(gpointer user_data)
{
enum transition_status graph_rc = -1;
if (transition_graph == NULL) {
crm_debug("Nothing to do");
return TRUE;
}
crm_trace("Invoking graph %d in state %s", transition_graph->id, fsa_state2string(fsa_state));
switch (fsa_state) {
case S_STARTING:
case S_PENDING:
case S_NOT_DC:
case S_HALT:
case S_ILLEGAL:
case S_STOPPING:
case S_TERMINATE:
return TRUE;
break;
default:
break;
}
if (transition_graph->complete == FALSE) {
int limit = transition_graph->batch_limit;
transition_graph->batch_limit = throttle_get_total_job_limit(limit);
graph_rc = run_graph(transition_graph);
transition_graph->batch_limit = limit; /* Restore the configured value */
print_graph(LOG_DEBUG_3, transition_graph);
if (graph_rc == transition_active) {
crm_trace("Transition not yet complete");
return TRUE;
} else if (graph_rc == transition_pending) {
crm_trace("Transition not yet complete - no actions fired");
return TRUE;
}
if (graph_rc != transition_complete) {
crm_warn("Transition failed: %s", transition_status(graph_rc));
print_graph(LOG_NOTICE, transition_graph);
}
}
crm_debug("Transition %d is now complete", transition_graph->id);
transition_graph->complete = TRUE;
notify_crmd(transition_graph);
return TRUE;
}
void
trigger_graph_processing(const char *fn, int line)
{
crm_trace("%s:%d - Triggered graph processing", fn, line);
mainloop_set_trigger(transition_trigger);
}
void
abort_transition_graph(int abort_priority, enum transition_action abort_action,
const char *abort_text, xmlNode * reason, const char *fn, int line)
{
const char *magic = NULL;
CRM_CHECK(transition_graph != NULL, return);
if (reason) {
int diff_add_updates = 0;
int diff_add_epoch = 0;
int diff_add_admin_epoch = 0;
int diff_del_updates = 0;
int diff_del_epoch = 0;
int diff_del_admin_epoch = 0;
const char *uname = "";
xmlNode *search = reason;
xmlNode *diff = get_xpath_object("//" F_CIB_UPDATE_RESULT "//diff", reason, LOG_DEBUG_2);
magic = crm_element_value(reason, XML_ATTR_TRANSITION_MAGIC);
while(search) {
const char *kind = TYPE(search);
if (safe_str_eq(XML_CIB_TAG_STATE, kind)
|| safe_str_eq(XML_CIB_TAG_NODE, kind)) {
uname = crm_peer_uname(ID(search));
break;
}
search = search->parent;
}
if (diff) {
cib_diff_version_details(diff,
&diff_add_admin_epoch, &diff_add_epoch, &diff_add_updates,
&diff_del_admin_epoch, &diff_del_epoch, &diff_del_updates);
if (crm_str_eq(TYPE(reason), XML_CIB_TAG_NVPAIR, TRUE)) {
crm_info
("%s:%d - Triggered transition abort (complete=%d, node=%s, tag=%s, id=%s, name=%s, value=%s, magic=%s, cib=%d.%d.%d) : %s",
fn, line, transition_graph->complete, uname, TYPE(reason), ID(reason), NAME(reason),
VALUE(reason), magic ? magic : "NA", diff_add_admin_epoch, diff_add_epoch,
diff_add_updates, abort_text);
} else {
crm_info
("%s:%d - Triggered transition abort (complete=%d, node=%s, tag=%s, id=%s, magic=%s, cib=%d.%d.%d) : %s",
fn, line, transition_graph->complete, uname, TYPE(reason), ID(reason),
magic ? magic : "NA", diff_add_admin_epoch, diff_add_epoch, diff_add_updates,
abort_text);
}
} else {
crm_info
("%s:%d - Triggered transition abort (complete=%d, node=%s, tag=%s, id=%s, magic=%s) : %s",
fn, line, transition_graph->complete, uname, TYPE(reason), ID(reason),
magic ? magic : "NA", abort_text);
}
} else {
crm_info("%s:%d - Triggered transition abort (complete=%d) : %s",
fn, line, transition_graph->complete, abort_text);
}
switch (fsa_state) {
case S_STARTING:
case S_PENDING:
case S_NOT_DC:
case S_HALT:
case S_ILLEGAL:
case S_STOPPING:
case S_TERMINATE:
crm_info("Abort suppressed: state=%s (complete=%d)",
fsa_state2string(fsa_state), transition_graph->complete);
return;
default:
break;
}
if (magic == NULL && reason != NULL) {
crm_log_xml_debug(reason, "Cause");
}
/* Make sure any queued calculations are discarded ASAP */
free(fsa_pe_ref);
fsa_pe_ref = NULL;
if (transition_graph->complete) {
if (transition_timer->period_ms > 0) {
crm_timer_stop(transition_timer);
crm_timer_start(transition_timer);
} else {
register_fsa_input(C_FSA_INTERNAL, I_PE_CALC, NULL);
}
return;
}
update_abort_priority(transition_graph, abort_priority, abort_action, abort_text);
mainloop_set_trigger(transition_trigger);
}
diff --git a/include/crm/cluster/internal.h b/include/crm/cluster/internal.h
index d7fca2607f..619fc392ce 100644
--- a/include/crm/cluster/internal.h
+++ b/include/crm/cluster/internal.h
@@ -1,443 +1,446 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This software is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef CRM_CLUSTER_INTERNAL__H
# define CRM_CLUSTER_INTERNAL__H
# include <crm/cluster.h>
# define AIS_IPC_NAME "ais-crm-ipc"
# define AIS_IPC_MESSAGE_SIZE 8192*128
# define CRM_MESSAGE_IPC_ACK 0
# ifndef INTERFACE_MAX
# define INTERFACE_MAX 2 /* from the private coroapi.h header */
# endif
typedef struct crm_ais_host_s AIS_Host;
typedef struct crm_ais_msg_s AIS_Message;
struct crm_ais_host_s {
uint32_t id;
uint32_t pid;
gboolean local;
enum crm_ais_msg_types type;
uint32_t size;
char uname[MAX_NAME];
} __attribute__ ((packed));
struct crm_ais_msg_s {
cs_ipc_header_response_t header __attribute__ ((aligned(8)));
uint32_t id;
gboolean is_compressed;
AIS_Host host;
AIS_Host sender;
uint32_t size;
uint32_t compressed_size;
/* 584 bytes */
char data[0];
} __attribute__ ((packed));
struct crm_ais_nodeid_resp_s {
cs_ipc_header_response_t header __attribute__ ((aligned(8)));
uint32_t id;
uint32_t counter;
char uname[MAX_NAME];
char cname[MAX_NAME];
} __attribute__ ((packed));
struct crm_ais_quorum_resp_s {
cs_ipc_header_response_t header __attribute__ ((aligned(8)));
uint64_t id;
uint32_t votes;
uint32_t expected_votes;
uint32_t quorate;
} __attribute__ ((packed));
/* *INDENT-OFF* */
enum crm_proc_flag {
crm_proc_none = 0x00000001,
/* These values are sent over the network by the legacy plugin.
* Therefore, changing any of them will break compatibility.
*
* So don't.
*/
/* 3 messaging types */
crm_proc_heartbeat = 0x01000000,
crm_proc_plugin = 0x00000002,
crm_proc_cpg = 0x04000000,
crm_proc_lrmd = 0x00000010,
crm_proc_cib = 0x00000100,
crm_proc_crmd = 0x00000200,
crm_proc_attrd = 0x00001000,
crm_proc_stonithd = 0x00002000,
crm_proc_stonith_ng= 0x00100000,
crm_proc_pe = 0x00010000,
crm_proc_te = 0x00020000,
crm_proc_mgmtd = 0x00040000,
};
/* *INDENT-ON* */
static inline const char *
peer2text(enum crm_proc_flag proc)
{
const char *text = "unknown";
if (proc == (crm_proc_cpg | crm_proc_crmd)) {
return "peer";
}
switch (proc) {
case crm_proc_none:
text = "none";
break;
case crm_proc_plugin:
text = "ais";
break;
case crm_proc_heartbeat:
text = "heartbeat";
break;
case crm_proc_cib:
text = "cib";
break;
case crm_proc_crmd:
text = "crmd";
break;
case crm_proc_pe:
text = "pengine";
break;
case crm_proc_te:
text = "tengine";
break;
case crm_proc_lrmd:
text = "lrmd";
break;
case crm_proc_attrd:
text = "attrd";
break;
case crm_proc_stonithd:
text = "stonithd";
break;
case crm_proc_stonith_ng:
text = "stonith-ng";
break;
case crm_proc_mgmtd:
text = "mgmtd";
break;
case crm_proc_cpg:
text = "corosync-cpg";
break;
}
return text;
}
static inline enum crm_proc_flag
text2proc(const char *proc)
{
/* We only care about these two so far */
if (proc && strcmp(proc, "cib") == 0) {
return crm_proc_cib;
} else if (proc && strcmp(proc, "crmd") == 0) {
return crm_proc_crmd;
}
return crm_proc_none;
}
static inline const char *
ais_dest(const struct crm_ais_host_s *host)
{
if (host->local) {
return "local";
} else if (host->size > 0) {
return host->uname;
} else {
return "<all>";
}
}
# define ais_data_len(msg) (msg->is_compressed?msg->compressed_size:msg->size)
static inline AIS_Message *
ais_msg_copy(const AIS_Message * source)
{
AIS_Message *target = malloc(sizeof(AIS_Message) + ais_data_len(source));
if(target) {
memcpy(target, source, sizeof(AIS_Message));
memcpy(target->data, source->data, ais_data_len(target));
}
return target;
}
/*
typedef enum {
CS_OK = 1,
CS_ERR_LIBRARY = 2,
CS_ERR_VERSION = 3,
CS_ERR_INIT = 4,
CS_ERR_TIMEOUT = 5,
CS_ERR_TRY_AGAIN = 6,
CS_ERR_INVALID_PARAM = 7,
CS_ERR_NO_MEMORY = 8,
CS_ERR_BAD_HANDLE = 9,
CS_ERR_BUSY = 10,
CS_ERR_ACCESS = 11,
CS_ERR_NOT_EXIST = 12,
CS_ERR_NAME_TOO_LONG = 13,
CS_ERR_EXIST = 14,
CS_ERR_NO_SPACE = 15,
CS_ERR_INTERRUPT = 16,
CS_ERR_NAME_NOT_FOUND = 17,
CS_ERR_NO_RESOURCES = 18,
CS_ERR_NOT_SUPPORTED = 19,
CS_ERR_BAD_OPERATION = 20,
CS_ERR_FAILED_OPERATION = 21,
CS_ERR_MESSAGE_ERROR = 22,
CS_ERR_QUEUE_FULL = 23,
CS_ERR_QUEUE_NOT_AVAILABLE = 24,
CS_ERR_BAD_FLAGS = 25,
CS_ERR_TOO_BIG = 26,
CS_ERR_NO_SECTIONS = 27,
CS_ERR_CONTEXT_NOT_FOUND = 28,
CS_ERR_TOO_MANY_GROUPS = 30,
CS_ERR_SECURITY = 100
} cs_error_t;
*/
static inline const char *
ais_error2text(int error)
{
const char *text = "unknown";
# if SUPPORT_COROSYNC
switch (error) {
case CS_OK:
text = "OK";
break;
case CS_ERR_LIBRARY:
text = "Library error";
break;
case CS_ERR_VERSION:
text = "Version error";
break;
case CS_ERR_INIT:
text = "Initialization error";
break;
case CS_ERR_TIMEOUT:
text = "Timeout";
break;
case CS_ERR_TRY_AGAIN:
text = "Try again";
break;
case CS_ERR_INVALID_PARAM:
text = "Invalid parameter";
break;
case CS_ERR_NO_MEMORY:
text = "No memory";
break;
case CS_ERR_BAD_HANDLE:
text = "Bad handle";
break;
case CS_ERR_BUSY:
text = "Busy";
break;
case CS_ERR_ACCESS:
text = "Access error";
break;
case CS_ERR_NOT_EXIST:
text = "Doesn't exist";
break;
case CS_ERR_NAME_TOO_LONG:
text = "Name too long";
break;
case CS_ERR_EXIST:
text = "Exists";
break;
case CS_ERR_NO_SPACE:
text = "No space";
break;
case CS_ERR_INTERRUPT:
text = "Interrupt";
break;
case CS_ERR_NAME_NOT_FOUND:
text = "Name not found";
break;
case CS_ERR_NO_RESOURCES:
text = "No resources";
break;
case CS_ERR_NOT_SUPPORTED:
text = "Not supported";
break;
case CS_ERR_BAD_OPERATION:
text = "Bad operation";
break;
case CS_ERR_FAILED_OPERATION:
text = "Failed operation";
break;
case CS_ERR_MESSAGE_ERROR:
text = "Message error";
break;
case CS_ERR_QUEUE_FULL:
text = "Queue full";
break;
case CS_ERR_QUEUE_NOT_AVAILABLE:
text = "Queue not available";
break;
case CS_ERR_BAD_FLAGS:
text = "Bad flags";
break;
case CS_ERR_TOO_BIG:
text = "Too big";
break;
case CS_ERR_NO_SECTIONS:
text = "No sections";
break;
}
# endif
return text;
}
static inline const char *
msg_type2text(enum crm_ais_msg_types type)
{
const char *text = "unknown";
switch (type) {
case crm_msg_none:
text = "unknown";
break;
case crm_msg_ais:
text = "ais";
break;
case crm_msg_cib:
text = "cib";
break;
case crm_msg_crmd:
text = "crmd";
break;
case crm_msg_pe:
text = "pengine";
break;
case crm_msg_te:
text = "tengine";
break;
case crm_msg_lrmd:
text = "lrmd";
break;
case crm_msg_attrd:
text = "attrd";
break;
case crm_msg_stonithd:
text = "stonithd";
break;
case crm_msg_stonith_ng:
text = "stonith-ng";
break;
}
return text;
}
enum crm_ais_msg_types text2msg_type(const char *text);
char *get_ais_data(const AIS_Message * msg);
gboolean check_message_sanity(const AIS_Message * msg, const char *data);
# if SUPPORT_HEARTBEAT
extern ll_cluster_t *heartbeat_cluster;
gboolean send_ha_message(ll_cluster_t * hb_conn, xmlNode * msg,
const char *node, gboolean force_ordered);
gboolean ha_msg_dispatch(ll_cluster_t * cluster_conn, gpointer user_data);
gboolean register_heartbeat_conn(crm_cluster_t * cluster);
xmlNode *convert_ha_message(xmlNode * parent, HA_Message * msg, const char *field);
gboolean ccm_have_quorum(oc_ed_t event);
const char *ccm_event_name(oc_ed_t event);
crm_node_t *crm_update_ccm_node(const oc_ev_membership_t * oc, int offset, const char *state,
uint64_t seq);
gboolean heartbeat_initialize_nodelist(void *cluster, gboolean force_member, xmlNode * xml_parent);
# endif
# if SUPPORT_COROSYNC
gboolean send_cpg_iov(struct iovec * iov);
# if SUPPORT_PLUGIN
char *classic_node_name(uint32_t nodeid);
void plugin_handle_membership(AIS_Message *msg);
bool send_plugin_text(int class, struct iovec *iov);
# else
char *corosync_node_name(uint64_t /*cmap_handle_t */ cmap_handle, uint32_t nodeid);
# endif
gboolean corosync_initialize_nodelist(void *cluster, gboolean force_member, xmlNode * xml_parent);
gboolean send_cluster_message_cs(xmlNode * msg, gboolean local,
crm_node_t * node, enum crm_ais_msg_types dest);
enum cluster_type_e find_corosync_variant(void);
void terminate_cs_connection(crm_cluster_t * cluster);
gboolean init_cs_connection(crm_cluster_t * cluster);
gboolean init_cs_connection_once(crm_cluster_t * cluster);
# endif
# ifdef SUPPORT_CMAN
char *cman_node_name(uint32_t nodeid);
# endif
enum crm_quorum_source {
crm_quorum_cman,
crm_quorum_corosync,
crm_quorum_pacemaker,
};
int get_corosync_id(int id, const char *uuid);
char *get_corosync_uuid(crm_node_t *peer);
enum crm_quorum_source get_quorum_source(void);
void crm_update_peer_proc(const char *source, crm_node_t * peer, uint32_t flag, const char *status);
crm_node_t *crm_update_peer(const char *source, unsigned int id, uint64_t born, uint64_t seen,
int32_t votes, uint32_t children, const char *uuid, const char *uname,
const char *addr, const char *state);
void crm_update_peer_expected(const char *source, crm_node_t * node, const char *expected);
void crm_update_peer_state(const char *source, crm_node_t * node, const char *state,
int membership);
gboolean init_cman_connection(gboolean(*dispatch) (unsigned long long, gboolean),
void (*destroy) (gpointer));
gboolean cluster_connect_quorum(gboolean(*dispatch) (unsigned long long, gboolean),
void (*destroy) (gpointer));
void set_node_uuid(const char *uname, const char *uuid);
gboolean node_name_is_valid(const char *key, const char *name);
+crm_node_t * crm_find_peer_full(unsigned int id, const char *uname, int flags);
+crm_node_t * crm_find_peer(unsigned int id, const char *uname);
+
#endif
diff --git a/include/crm/pengine/common.h b/include/crm/pengine/common.h
index 2477e47cdc..acd806ac4a 100644
--- a/include/crm/pengine/common.h
+++ b/include/crm/pengine/common.h
@@ -1,126 +1,128 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This software is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef PE_COMMON__H
# define PE_COMMON__H
# include <glib.h>
extern gboolean was_processing_error;
extern gboolean was_processing_warning;
/* Order is significant here:
 * items are listed in order of ascending severity;
 * more severe actions take precedence over less severe ones
 */
enum action_fail_response {
action_fail_ignore,
action_fail_recover,
action_fail_migrate, /* recover by moving it somewhere else */
action_fail_block,
action_fail_stop,
action_fail_standby,
action_fail_fence,
action_fail_restart_container
};
/* the "done" action must be the "pre" action +1 */
enum action_tasks {
no_action,
monitor_rsc,
stop_rsc,
stopped_rsc,
start_rsc,
started_rsc,
action_notify,
action_notified,
action_promote,
action_promoted,
action_demote,
action_demoted,
shutdown_crm,
stonith_node
};
enum rsc_recovery_type {
recovery_stop_start,
recovery_stop_only,
recovery_block
};
enum rsc_start_requirement {
rsc_req_nothing, /* Allowed by custom_action() */
rsc_req_quorum, /* Enforced by custom_action() */
rsc_req_stonith /* Enforced by native_start_constraints() */
};
enum rsc_role_e {
RSC_ROLE_UNKNOWN,
RSC_ROLE_STOPPED,
RSC_ROLE_STARTED,
RSC_ROLE_SLAVE,
RSC_ROLE_MASTER,
};
# define RSC_ROLE_MAX RSC_ROLE_MASTER+1
# define RSC_ROLE_UNKNOWN_S "Unknown"
# define RSC_ROLE_STOPPED_S "Stopped"
# define RSC_ROLE_STARTED_S "Started"
# define RSC_ROLE_SLAVE_S "Slave"
# define RSC_ROLE_MASTER_S "Master"
/* *INDENT-OFF* */
enum pe_print_options {
pe_print_log = 0x0001,
pe_print_html = 0x0002,
pe_print_ncurses = 0x0004,
pe_print_printf = 0x0008,
pe_print_dev = 0x0010,
pe_print_details = 0x0020,
pe_print_max_details = 0x0040,
pe_print_rsconly = 0x0080,
pe_print_ops = 0x0100,
pe_print_suppres_nl = 0x0200,
pe_print_xml = 0x0400,
+ pe_print_brief = 0x0800,
+ pe_print_pending = 0x1000,
};
/* *INDENT-ON* */
const char *task2text(enum action_tasks task);
enum action_tasks text2task(const char *task);
enum rsc_role_e text2role(const char *role);
const char *role2text(enum rsc_role_e role);
const char *fail2text(enum action_fail_response fail);
const char *pe_pref(GHashTable * options, const char *name);
void calculate_active_ops(GList * sorted_op_list, int *start_index, int *stop_index);
static inline const char *
recovery2text(enum rsc_recovery_type type)
{
switch (type) {
case recovery_stop_only:
return "shutting it down";
case recovery_stop_start:
return "attempting recovery";
case recovery_block:
return "waiting for an administrator";
}
return "Unknown";
}
#endif
diff --git a/include/crm/pengine/internal.h b/include/crm/pengine/internal.h
index 46e63c4eb1..072c0a90f8 100644
--- a/include/crm/pengine/internal.h
+++ b/include/crm/pengine/internal.h
@@ -1,268 +1,271 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This software is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef PE_INTERNAL__H
# define PE_INTERNAL__H
# include <crm/pengine/status.h>
# define pe_rsc_info(rsc, fmt, args...) crm_log_tag(LOG_INFO, rsc ? rsc->id : "<NULL>", fmt, ##args)
# define pe_rsc_debug(rsc, fmt, args...) crm_log_tag(LOG_DEBUG, rsc ? rsc->id : "<NULL>", fmt, ##args)
# define pe_rsc_trace(rsc, fmt, args...) crm_log_tag(LOG_TRACE, rsc ? rsc->id : "<NULL>", fmt, ##args)
# define pe_err(fmt...) { was_processing_error = TRUE; crm_config_error = TRUE; crm_err(fmt); }
# define pe_warn(fmt...) { was_processing_warning = TRUE; crm_config_warning = TRUE; crm_warn(fmt); }
# define pe_proc_err(fmt...) { was_processing_error = TRUE; crm_err(fmt); }
# define pe_proc_warn(fmt...) { was_processing_warning = TRUE; crm_warn(fmt); }
# define pe_set_action_bit(action, bit) action->flags = crm_set_bit(__FUNCTION__, action->uuid, action->flags, bit)
# define pe_clear_action_bit(action, bit) action->flags = crm_clear_bit(__FUNCTION__, action->uuid, action->flags, bit)
typedef struct notify_data_s {
GHashTable *keys;
const char *action;
action_t *pre;
action_t *post;
action_t *pre_done;
action_t *post_done;
GListPtr active; /* notify_entry_t* */
GListPtr inactive; /* notify_entry_t* */
GListPtr start; /* notify_entry_t* */
GListPtr stop; /* notify_entry_t* */
GListPtr demote; /* notify_entry_t* */
GListPtr promote; /* notify_entry_t* */
GListPtr master; /* notify_entry_t* */
GListPtr slave; /* notify_entry_t* */
} notify_data_t;
bool pe_can_fence(pe_working_set_t *data_set, node_t *node);
int merge_weights(int w1, int w2);
void add_hash_param(GHashTable * hash, const char *name, const char *value);
void append_hashtable(gpointer key, gpointer value, gpointer user_data);
char *native_parameter(resource_t * rsc, node_t * node, gboolean create, const char *name,
pe_working_set_t * data_set);
node_t *native_location(resource_t * rsc, GListPtr * list, gboolean current);
void pe_metadata(void);
void verify_pe_options(GHashTable * options);
void common_update_score(resource_t * rsc, const char *id, int score);
void native_add_running(resource_t * rsc, node_t * node, pe_working_set_t * data_set);
node_t *rsc_known_on(resource_t * rsc, GListPtr * list);
gboolean native_unpack(resource_t * rsc, pe_working_set_t * data_set);
gboolean group_unpack(resource_t * rsc, pe_working_set_t * data_set);
gboolean clone_unpack(resource_t * rsc, pe_working_set_t * data_set);
gboolean master_unpack(resource_t * rsc, pe_working_set_t * data_set);
resource_t *native_find_rsc(resource_t * rsc, const char *id, node_t * node, int flags);
gboolean native_active(resource_t * rsc, gboolean all);
gboolean group_active(resource_t * rsc, gboolean all);
gboolean clone_active(resource_t * rsc, gboolean all);
gboolean master_active(resource_t * rsc, gboolean all);
void native_print(resource_t * rsc, const char *pre_text, long options, void *print_data);
void group_print(resource_t * rsc, const char *pre_text, long options, void *print_data);
void clone_print(resource_t * rsc, const char *pre_text, long options, void *print_data);
void master_print(resource_t * rsc, const char *pre_text, long options, void *print_data);
void native_free(resource_t * rsc);
void group_free(resource_t * rsc);
void clone_free(resource_t * rsc);
void master_free(resource_t * rsc);
enum rsc_role_e native_resource_state(const resource_t * rsc, gboolean current);
enum rsc_role_e group_resource_state(const resource_t * rsc, gboolean current);
enum rsc_role_e clone_resource_state(const resource_t * rsc, gboolean current);
enum rsc_role_e master_resource_state(const resource_t * rsc, gboolean current);
gboolean common_unpack(xmlNode * xml_obj, resource_t ** rsc, resource_t * parent,
pe_working_set_t * data_set);
void common_print(resource_t * rsc, const char *pre_text, long options, void *print_data);
void common_free(resource_t * rsc);
extern pe_working_set_t *pe_dataset;
extern node_t *node_copy(node_t * this_node);
extern time_t get_effective_time(pe_working_set_t * data_set);
extern int get_failcount(node_t * node, resource_t * rsc, time_t *last_failure,
pe_working_set_t * data_set);
extern int get_failcount_full(node_t * node, resource_t * rsc, time_t *last_failure,
bool effective, pe_working_set_t * data_set);
extern int get_failcount_all(node_t * node, resource_t * rsc, time_t *last_failure,
pe_working_set_t * data_set);
/* Binary like operators for lists of nodes */
extern void node_list_exclude(GHashTable * list, GListPtr list2, gboolean merge_scores);
extern GListPtr node_list_dup(GListPtr list, gboolean reset, gboolean filter);
extern GListPtr node_list_from_hash(GHashTable * hash, gboolean reset, gboolean filter);
extern GHashTable *node_hash_from_list(GListPtr list);
static inline gpointer
pe_hash_table_lookup(GHashTable * hash, gconstpointer key)
{
if (hash) {
return g_hash_table_lookup(hash, key);
}
return NULL;
}
extern action_t *get_pseudo_op(const char *name, pe_working_set_t * data_set);
extern gboolean order_actions(action_t * lh_action, action_t * rh_action, enum pe_ordering order);
GHashTable *node_hash_dup(GHashTable * hash);
extern GListPtr node_list_and(GListPtr list1, GListPtr list2, gboolean filter);
extern GListPtr node_list_xor(GListPtr list1, GListPtr list2, gboolean filter);
extern GListPtr node_list_minus(GListPtr list1, GListPtr list2, gboolean filter);
extern void pe_free_shallow(GListPtr alist);
extern void pe_free_shallow_adv(GListPtr alist, gboolean with_data);
/* Printing functions for debug */
extern void print_node(const char *pre_text, node_t * node, gboolean details);
extern void print_resource(int log_level, const char *pre_text, resource_t * rsc, gboolean details);
extern void dump_node_scores_worker(int level, const char *file, const char *function, int line,
resource_t * rsc, const char *comment, GHashTable * nodes);
extern void dump_node_capacity(int level, const char *comment, node_t * node);
extern void dump_rsc_utilization(int level, const char *comment, resource_t * rsc, node_t * node);
# define dump_node_scores(level, rsc, text, nodes) do { \
dump_node_scores_worker(level, __FILE__, __FUNCTION__, __LINE__, rsc, text, nodes); \
} while(0)
/* Sorting functions */
extern gint sort_rsc_priority(gconstpointer a, gconstpointer b);
extern gint sort_rsc_index(gconstpointer a, gconstpointer b);
extern xmlNode *find_rsc_op_entry(resource_t * rsc, const char *key);
extern action_t *custom_action(resource_t * rsc, char *key, const char *task, node_t * on_node,
gboolean optional, gboolean foo, pe_working_set_t * data_set);
# define delete_key(rsc) generate_op_key(rsc->id, CRMD_ACTION_DELETE, 0)
# define delete_action(rsc, node, optional) custom_action( \
rsc, delete_key(rsc), CRMD_ACTION_DELETE, node, \
optional, TRUE, data_set);
# define stopped_key(rsc) generate_op_key(rsc->id, CRMD_ACTION_STOPPED, 0)
# define stopped_action(rsc, node, optional) custom_action( \
rsc, stopped_key(rsc), CRMD_ACTION_STOPPED, node, \
optional, TRUE, data_set);
# define stop_key(rsc) generate_op_key(rsc->id, CRMD_ACTION_STOP, 0)
# define stop_action(rsc, node, optional) custom_action( \
rsc, stop_key(rsc), CRMD_ACTION_STOP, node, \
optional, TRUE, data_set);
# define start_key(rsc) generate_op_key(rsc->id, CRMD_ACTION_START, 0)
# define start_action(rsc, node, optional) custom_action( \
rsc, start_key(rsc), CRMD_ACTION_START, node, \
optional, TRUE, data_set)
# define started_key(rsc) generate_op_key(rsc->id, CRMD_ACTION_STARTED, 0)
# define started_action(rsc, node, optional) custom_action( \
rsc, started_key(rsc), CRMD_ACTION_STARTED, node, \
optional, TRUE, data_set)
# define promote_key(rsc) generate_op_key(rsc->id, CRMD_ACTION_PROMOTE, 0)
# define promote_action(rsc, node, optional) custom_action( \
rsc, promote_key(rsc), CRMD_ACTION_PROMOTE, node, \
optional, TRUE, data_set)
# define promoted_key(rsc) generate_op_key(rsc->id, CRMD_ACTION_PROMOTED, 0)
# define promoted_action(rsc, node, optional) custom_action( \
rsc, promoted_key(rsc), CRMD_ACTION_PROMOTED, node, \
optional, TRUE, data_set)
# define demote_key(rsc) generate_op_key(rsc->id, CRMD_ACTION_DEMOTE, 0)
# define demote_action(rsc, node, optional) custom_action( \
rsc, demote_key(rsc), CRMD_ACTION_DEMOTE, node, \
optional, TRUE, data_set)
# define demoted_key(rsc) generate_op_key(rsc->id, CRMD_ACTION_DEMOTED, 0)
# define demoted_action(rsc, node, optional) custom_action( \
rsc, demoted_key(rsc), CRMD_ACTION_DEMOTED, node, \
optional, TRUE, data_set)
extern action_t *find_first_action(GListPtr input, const char *uuid, const char *task,
node_t * on_node);
extern enum action_tasks get_complex_task(resource_t * rsc, const char *name,
gboolean allow_non_atomic);
extern GListPtr find_actions(GListPtr input, const char *key, node_t * on_node);
extern GListPtr find_actions_exact(GListPtr input, const char *key, node_t * on_node);
extern GListPtr find_recurring_actions(GListPtr input, node_t * not_on_node);
extern void pe_free_action(action_t * action);
extern void resource_location(resource_t * rsc, node_t * node, int score, const char *tag,
pe_working_set_t * data_set);
extern gint sort_op_by_callid(gconstpointer a, gconstpointer b);
extern gboolean get_target_role(resource_t * rsc, enum rsc_role_e *role);
extern resource_t *find_clone_instance(resource_t * rsc, const char *sub_id,
pe_working_set_t * data_set);
extern void destroy_ticket(gpointer data);
extern ticket_t *ticket_new(const char *ticket_id, pe_working_set_t * data_set);
char *clone_strip(const char *last_rsc_id);
char *clone_zero(const char *last_rsc_id);
gint sort_node_uname(gconstpointer a, gconstpointer b);
bool is_set_recursive(resource_t * rsc, long long flag, bool any);
enum rsc_digest_cmp_val {
/*! Digests are the same */
RSC_DIGEST_MATCH = 0,
/*! Params that require a restart changed */
RSC_DIGEST_RESTART,
/*! Some parameter changed. */
RSC_DIGEST_ALL,
/*! rsc op didn't have a digest associated with it, so
* it is unknown if parameters changed or not. */
RSC_DIGEST_UNKNOWN,
};
typedef struct op_digest_cache_s {
enum rsc_digest_cmp_val rc;
xmlNode *params_all;
xmlNode *params_restart;
char *digest_all_calc;
char *digest_restart_calc;
} op_digest_cache_t;
op_digest_cache_t *rsc_action_digest_cmp(resource_t * rsc, xmlNode * xml_op, node_t * node,
pe_working_set_t * data_set);
gboolean xml_contains_remote_node(xmlNode *xml);
gboolean is_baremetal_remote_node(node_t *node);
gboolean is_container_remote_node(node_t *node);
gboolean is_remote_node(node_t *node);
resource_t * rsc_contains_remote_node(pe_working_set_t * data_set, resource_t *rsc);
+
+void print_rscs_brief(GListPtr rsc_list, const char * pre_text, long options,
+ void * print_data, gboolean print_all);
#endif
diff --git a/include/crm/pengine/status.h b/include/crm/pengine/status.h
index d85095ee54..fed24824a7 100644
--- a/include/crm/pengine/status.h
+++ b/include/crm/pengine/status.h
@@ -1,374 +1,376 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This software is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#ifndef PENGINE_STATUS__H
# define PENGINE_STATUS__H
# include <glib.h>
# include <crm/common/iso8601.h>
# include <crm/pengine/common.h>
typedef struct node_s pe_node_t;
typedef struct node_s node_t;
typedef struct pe_action_s action_t;
typedef struct pe_action_s pe_action_t;
typedef struct resource_s resource_t;
typedef struct ticket_s ticket_t;
typedef enum no_quorum_policy_e {
no_quorum_freeze,
no_quorum_stop,
no_quorum_ignore,
no_quorum_suicide
} no_quorum_policy_t;
enum node_type {
node_ping,
node_member,
node_remote
};
enum pe_restart {
pe_restart_restart,
pe_restart_ignore
};
enum pe_find {
pe_find_renamed = 0x001,
pe_find_clone = 0x004,
pe_find_current = 0x008,
pe_find_inactive = 0x010,
};
# define pe_flag_have_quorum 0x00000001ULL
# define pe_flag_symmetric_cluster 0x00000002ULL
# define pe_flag_is_managed_default 0x00000004ULL
# define pe_flag_maintenance_mode 0x00000008ULL
# define pe_flag_stonith_enabled 0x00000010ULL
# define pe_flag_have_stonith_resource 0x00000020ULL
# define pe_flag_stop_rsc_orphans 0x00000100ULL
# define pe_flag_stop_action_orphans 0x00000200ULL
# define pe_flag_stop_everything 0x00000400ULL
# define pe_flag_start_failure_fatal 0x00001000ULL
# define pe_flag_remove_after_stop 0x00002000ULL
# define pe_flag_startup_probes 0x00010000ULL
# define pe_flag_have_status 0x00020000ULL
# define pe_flag_have_remote_nodes 0x00040000ULL
# define pe_flag_quick_location 0x00100000ULL
typedef struct pe_working_set_s {
xmlNode *input;
crm_time_t *now;
/* options extracted from the input */
char *dc_uuid;
node_t *dc_node;
const char *stonith_action;
const char *placement_strategy;
unsigned long long flags;
int stonith_timeout;
int default_resource_stickiness;
no_quorum_policy_t no_quorum_policy;
GHashTable *config_hash;
GHashTable *domains;
GHashTable *tickets;
GListPtr nodes;
GListPtr resources;
GListPtr placement_constraints;
GListPtr ordering_constraints;
GListPtr colocation_constraints;
GListPtr ticket_constraints;
GListPtr actions;
xmlNode *failed;
xmlNode *op_defaults;
xmlNode *rsc_defaults;
/* stats */
int num_synapse;
int max_valid_nodes;
int order_id;
int action_id;
/* final output */
xmlNode *graph;
GHashTable *template_rsc_sets;
const char *localhost;
} pe_working_set_t;
struct node_shared_s {
const char *id;
const char *uname;
/* Make all these flags into a bitfield one day */
gboolean online;
gboolean standby;
gboolean standby_onfail;
gboolean pending;
gboolean unclean;
gboolean unseen;
gboolean shutdown;
gboolean expected_up;
gboolean is_dc;
int num_resources;
GListPtr running_rsc; /* resource_t* */
GListPtr allocated_rsc; /* resource_t* */
resource_t *remote_rsc;
GHashTable *attrs; /* char* => char* */
enum node_type type;
GHashTable *utilization;
/*! cache of calculated rsc digests for this node. */
GHashTable *digest_cache;
gboolean maintenance;
};
struct node_s {
int weight;
gboolean fixed;
int count;
struct node_shared_s *details;
};
# include <crm/pengine/complex.h>
# define pe_rsc_orphan 0x00000001ULL
# define pe_rsc_managed 0x00000002ULL
# define pe_rsc_block 0x00000004ULL /* Further operations are prohibited due to failure policy */
# define pe_rsc_orphan_container_filler 0x00000008ULL
# define pe_rsc_notify 0x00000010ULL
# define pe_rsc_unique 0x00000020ULL
# define pe_rsc_provisional 0x00000100ULL
# define pe_rsc_allocating 0x00000200ULL
# define pe_rsc_merging 0x00000400ULL
# define pe_rsc_munging 0x00000800ULL
# define pe_rsc_try_reload 0x00001000ULL
# define pe_rsc_reload 0x00002000ULL
# define pe_rsc_failed 0x00010000ULL
# define pe_rsc_shutdown 0x00020000ULL
# define pe_rsc_runnable 0x00040000ULL
# define pe_rsc_start_pending 0x00080000ULL
# define pe_rsc_starting 0x00100000ULL
# define pe_rsc_stopping 0x00200000ULL
# define pe_rsc_migrating 0x00400000ULL
# define pe_rsc_allow_migrate 0x00800000ULL
# define pe_rsc_failure_ignored 0x01000000ULL
# define pe_rsc_unexpectedly_running 0x02000000ULL
# define pe_rsc_needs_quorum 0x10000000ULL
# define pe_rsc_needs_fencing 0x20000000ULL
# define pe_rsc_needs_unfencing 0x40000000ULL
enum pe_graph_flags {
pe_graph_none = 0x00000,
pe_graph_updated_first = 0x00001,
pe_graph_updated_then = 0x00002,
pe_graph_disable = 0x00004,
};
/* *INDENT-OFF* */
enum pe_action_flags {
pe_action_pseudo = 0x00001,
pe_action_runnable = 0x00002,
pe_action_optional = 0x00004,
pe_action_print_always = 0x00008,
pe_action_have_node_attrs = 0x00010,
pe_action_failure_is_fatal = 0x00020,
pe_action_implied_by_stonith = 0x00040,
pe_action_migrate_runnable = 0x00080,
pe_action_dumped = 0x00100,
pe_action_processed = 0x00200,
pe_action_clear = 0x00400,
pe_action_dangle = 0x00800,
    pe_action_requires_any = 0x01000, /* This action requires one or more of its dependencies to be runnable.
                                       * We use this to clear the runnable flag before checking dependencies
                                       */
};
/* *INDENT-ON* */
struct resource_s {
char *id;
char *clone_name;
xmlNode *xml;
xmlNode *orig_xml;
xmlNode *ops_xml;
resource_t *parent;
void *variant_opaque;
enum pe_obj_types variant;
resource_object_functions_t *fns;
resource_alloc_functions_t *cmds;
enum rsc_recovery_type recovery_type;
enum pe_restart restart_type;
int priority;
int stickiness;
int sort_index;
int failure_timeout;
int effective_priority;
int migration_threshold;
gboolean is_remote_node;
unsigned long long flags;
GListPtr rsc_cons_lhs; /* rsc_colocation_t* */
GListPtr rsc_cons; /* rsc_colocation_t* */
GListPtr rsc_location; /* rsc_to_node_t* */
GListPtr actions; /* action_t* */
GListPtr rsc_tickets; /* rsc_ticket* */
node_t *allocated_to;
GListPtr running_on; /* node_t* */
GHashTable *known_on; /* node_t* */
GHashTable *allowed_nodes; /* node_t* */
enum rsc_role_e role;
enum rsc_role_e next_role;
GHashTable *meta;
GHashTable *parameters;
GHashTable *utilization;
GListPtr children; /* resource_t* */
GListPtr dangling_migrations; /* node_t* */
node_t *partial_migration_target;
node_t *partial_migration_source;
resource_t *container;
GListPtr fillers;
+
+ char *pending_task;
};
struct pe_action_s {
int id;
int priority;
resource_t *rsc;
node_t *node;
xmlNode *op_entry;
char *task;
char *uuid;
enum pe_action_flags flags;
enum rsc_start_requirement needs;
enum action_fail_response on_fail;
enum rsc_role_e fail_role;
action_t *pre_notify;
action_t *pre_notified;
action_t *post_notify;
action_t *post_notified;
int seen_count;
GHashTable *meta;
GHashTable *extra;
GListPtr actions_before; /* action_wrapper_t* */
GListPtr actions_after; /* action_wrapper_t* */
};
struct ticket_s {
char *id;
gboolean granted;
time_t last_granted;
gboolean standby;
GHashTable *state;
};
enum pe_link_state {
pe_link_not_dumped,
pe_link_dumped,
pe_link_dup,
};
/* *INDENT-OFF* */
enum pe_ordering {
pe_order_none = 0x0, /* deleted */
pe_order_optional = 0x1, /* pure ordering, nothing implied */
pe_order_apply_first_non_migratable = 0x2, /* Only apply this constraint's ordering if first is not migratable. */
pe_order_implies_first = 0x10, /* If 'first' is required, ensure 'then' is too */
pe_order_implies_then = 0x20, /* If 'then' is required, ensure 'first' is too */
pe_order_implies_first_master = 0x40, /* Imply 'first' is required when 'then' is required and then's rsc holds Master role. */
/* first requires then to be both runnable and migrate runnable. */
pe_order_implies_first_migratable = 0x80,
pe_order_runnable_left = 0x100, /* 'then' requires 'first' to be runnable */
pe_order_pseudo_left = 0x200, /* 'then' can only be pseudo if 'first' is runnable */
pe_order_restart = 0x1000, /* 'then' is runnable if 'first' is optional or runnable */
pe_order_stonith_stop = 0x2000, /* only applies if the action is non-pseudo */
pe_order_serialize_only = 0x4000, /* serialize */
pe_order_implies_first_printed = 0x10000, /* Like ..implies_first but only ensures 'first' is printed, not mandatory */
pe_order_implies_then_printed = 0x20000, /* Like ..implies_then but only ensures 'then' is printed, not mandatory */
pe_order_asymmetrical = 0x100000, /* Indicates asymmetrical one way ordering constraint. */
pe_order_load = 0x200000, /* Only relevant if... */
pe_order_one_or_more = 0x400000, /* 'then' is only runnable if one or more of its dependencies are too */
pe_order_trace = 0x4000000 /* test marker */
};
/* *INDENT-ON* */
typedef struct action_wrapper_s action_wrapper_t;
struct action_wrapper_s {
enum pe_ordering type;
enum pe_link_state state;
action_t *action;
};
const char *rsc_printable_id(resource_t *rsc);
gboolean cluster_status(pe_working_set_t * data_set);
void set_working_set_defaults(pe_working_set_t * data_set);
void cleanup_calculations(pe_working_set_t * data_set);
resource_t *pe_find_resource(GListPtr rsc_list, const char *id_rh);
node_t *pe_find_node(GListPtr node_list, const char *uname);
node_t *pe_find_node_id(GListPtr node_list, const char *id);
node_t *pe_find_node_any(GListPtr node_list, const char *id, const char *uname);
GListPtr find_operations(const char *rsc, const char *node, gboolean active_filter,
pe_working_set_t * data_set);
#endif
diff --git a/lib/cluster/membership.c b/lib/cluster/membership.c
index b151072d7a..d2bf780953 100644
--- a/lib/cluster/membership.c
+++ b/lib/cluster/membership.c
@@ -1,641 +1,674 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <sys/param.h>
#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <glib.h>
#include <crm/common/ipc.h>
#include <crm/cluster/internal.h>
#include <crm/msg_xml.h>
#include <crm/stonith-ng.h>
GHashTable *crm_peer_cache = NULL;
GHashTable *crm_remote_peer_cache = NULL;
unsigned long long crm_peer_seq = 0;
gboolean crm_have_quorum = FALSE;
int
crm_remote_peer_cache_size(void)
{
if (crm_remote_peer_cache == NULL) {
return 0;
}
return g_hash_table_size(crm_remote_peer_cache);
}
void
crm_remote_peer_cache_add(const char *node_name)
{
crm_node_t *node = g_hash_table_lookup(crm_remote_peer_cache, node_name);
if (node == NULL) {
crm_trace("added %s to remote cache", node_name);
node = calloc(1, sizeof(crm_node_t));
CRM_ASSERT(node);
node->flags = crm_remote_node;
node->uname = strdup(node_name);
node->uuid = strdup(node_name);
node->state = strdup(CRM_NODE_MEMBER);
g_hash_table_replace(crm_remote_peer_cache, node->uname, node);
}
}
void
crm_remote_peer_cache_remove(const char *node_name)
{
g_hash_table_remove(crm_remote_peer_cache, node_name);
}
static void
remote_cache_refresh_helper(xmlNode *cib, const char *xpath, const char *field, int flags)
{
const char *remote = NULL;
crm_node_t *node = NULL;
xmlXPathObjectPtr xpathObj = NULL;
int max = 0;
int lpc = 0;
xpathObj = xpath_search(cib, xpath);
max = numXpathResults(xpathObj);
for (lpc = 0; lpc < max; lpc++) {
xmlNode *xml = getXpathResult(xpathObj, lpc);
CRM_CHECK(xml != NULL, continue);
remote = crm_element_value(xml, field);
if (remote) {
crm_trace("added %s to remote cache", remote);
node = calloc(1, sizeof(crm_node_t));
CRM_ASSERT(node);
node->flags = flags;
node->uname = strdup(remote);
node->uuid = strdup(remote);
node->state = strdup(CRM_NODE_MEMBER);
g_hash_table_replace(crm_remote_peer_cache, node->uname, node);
}
}
freeXpathObject(xpathObj);
}
void crm_remote_peer_cache_refresh(xmlNode *cib)
{
const char *xpath = NULL;
g_hash_table_remove_all(crm_remote_peer_cache);
/* remote nodes associated with a cluster resource */
xpath = "//" XML_TAG_CIB "//" XML_CIB_TAG_CONFIGURATION "//" XML_CIB_TAG_RESOURCE "//" XML_TAG_META_SETS "//" XML_CIB_TAG_NVPAIR "[@name='remote-node']";
remote_cache_refresh_helper(cib, xpath, "value", crm_remote_node | crm_remote_container);
/* baremetal nodes defined by connection resources */
xpath = "//" XML_TAG_CIB "//" XML_CIB_TAG_CONFIGURATION "//" XML_CIB_TAG_RESOURCE "[@type='remote'][@provider='pacemaker']";
remote_cache_refresh_helper(cib, xpath, "id", crm_remote_node | crm_remote_baremetal);
/* baremetal nodes we have seen in the config that may or may not have connection
* resources associated with them anymore */
xpath = "//" XML_TAG_CIB "//" XML_CIB_TAG_STATUS "//" XML_CIB_TAG_STATE "[@remote_node='true']";
remote_cache_refresh_helper(cib, xpath, "id", crm_remote_node | crm_remote_baremetal);
}
gboolean
crm_is_peer_active(const crm_node_t * node)
{
if(node == NULL) {
return FALSE;
}
if (is_set(node->flags, crm_remote_node)) {
/* remote nodes are never considered active members. This
* guarantees they will never be considered for DC membership.*/
return FALSE;
}
#if SUPPORT_COROSYNC
if (is_openais_cluster()) {
return crm_is_corosync_peer_active(node);
}
#endif
#if SUPPORT_HEARTBEAT
if (is_heartbeat_cluster()) {
return crm_is_heartbeat_peer_active(node);
}
#endif
crm_err("Unhandled cluster type: %s", name_for_cluster_type(get_cluster_type()));
return FALSE;
}
static gboolean
crm_reap_dead_member(gpointer key, gpointer value, gpointer user_data)
{
crm_node_t *node = value;
crm_node_t *search = user_data;
if (search == NULL) {
return FALSE;
} else if (search->id && node->id != search->id) {
return FALSE;
} else if (search->id == 0 && safe_str_neq(node->uname, search->uname)) {
return FALSE;
} else if (crm_is_peer_active(value) == FALSE) {
crm_notice("Removing %s/%u from the membership list", node->uname, node->id);
return TRUE;
}
return FALSE;
}
guint
reap_crm_member(uint32_t id, const char *name)
{
int matches = 0;
crm_node_t search;
if (crm_peer_cache == NULL) {
crm_trace("Nothing to do, cache not initialized");
return 0;
}
search.id = id;
search.uname = name ? strdup(name) : NULL;
matches = g_hash_table_foreach_remove(crm_peer_cache, crm_reap_dead_member, &search);
if(matches) {
crm_notice("Purged %d peers with id=%u and/or uname=%s from the membership cache", matches, id, name);
} else {
crm_info("No peers with id=%u and/or uname=%s exist", id, name);
}
free(search.uname);
return matches;
}
static void
crm_count_peer(gpointer key, gpointer value, gpointer user_data)
{
guint *count = user_data;
crm_node_t *node = value;
if (crm_is_peer_active(node)) {
*count = *count + 1;
}
}
guint
crm_active_peers(void)
{
guint count = 0;
if (crm_peer_cache) {
g_hash_table_foreach(crm_peer_cache, crm_count_peer, &count);
}
return count;
}
static void
destroy_crm_node(gpointer data)
{
crm_node_t *node = data;
crm_trace("Destroying entry for node %u: %s", node->id, node->uname);
free(node->addr);
free(node->uname);
free(node->state);
free(node->uuid);
free(node->expected);
free(node);
}
void
crm_peer_init(void)
{
if (crm_peer_cache == NULL) {
crm_peer_cache = g_hash_table_new_full(crm_str_hash, g_str_equal, free, destroy_crm_node);
}
if (crm_remote_peer_cache == NULL) {
crm_remote_peer_cache = g_hash_table_new_full(crm_str_hash, g_str_equal, NULL, destroy_crm_node);
}
}
void
crm_peer_destroy(void)
{
if (crm_peer_cache != NULL) {
crm_trace("Destroying peer cache with %d members", g_hash_table_size(crm_peer_cache));
g_hash_table_destroy(crm_peer_cache);
crm_peer_cache = NULL;
}
if (crm_remote_peer_cache != NULL) {
crm_trace("Destroying remote peer cache with %d members", g_hash_table_size(crm_remote_peer_cache));
g_hash_table_destroy(crm_remote_peer_cache);
crm_remote_peer_cache = NULL;
}
}
void (*crm_status_callback) (enum crm_status_type, crm_node_t *, const void *) = NULL;
void
crm_set_status_callback(void (*dispatch) (enum crm_status_type, crm_node_t *, const void *))
{
crm_status_callback = dispatch;
}
static void crm_dump_peer_hash(int level, const char *caller)
{
GHashTableIter iter;
const char *id = NULL;
crm_node_t *node = NULL;
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, (gpointer *) &id, (gpointer *) &node)) {
do_crm_log(level, "%s: Node %u/%s = %p - %s", caller, node->id, node->uname, node, id);
}
}
static gboolean crm_hash_find_by_data(gpointer key, gpointer value, gpointer user_data)
{
if(value == user_data) {
return TRUE;
}
return FALSE;
}
+crm_node_t *
+crm_find_peer_full(unsigned int id, const char *uname, int flags)
+{
+ crm_node_t *node = NULL;
+
+ CRM_ASSERT(id > 0 || uname != NULL);
+
+ crm_peer_init();
+
+ if (flags & CRM_GET_PEER_REMOTE) {
+ node = g_hash_table_lookup(crm_remote_peer_cache, uname);
+ }
+
+ if (node == NULL && (flags & CRM_GET_PEER_CLUSTER)) {
+ node = crm_find_peer(id, uname);
+ }
+ return node;
+}
+
crm_node_t *
crm_get_peer_full(unsigned int id, const char *uname, int flags)
{
crm_node_t *node = NULL;
CRM_ASSERT(id > 0 || uname != NULL);
crm_peer_init();
if (flags & CRM_GET_PEER_REMOTE) {
node = g_hash_table_lookup(crm_remote_peer_cache, uname);
}
if (node == NULL && (flags & CRM_GET_PEER_CLUSTER)) {
node = crm_get_peer(id, uname);
}
return node;
}
-/* coverity[-alloc] Memory is referenced in one or both hashtables */
crm_node_t *
-crm_get_peer(unsigned int id, const char *uname)
+crm_find_peer(unsigned int id, const char *uname)
{
GHashTableIter iter;
crm_node_t *node = NULL;
crm_node_t *by_id = NULL;
crm_node_t *by_name = NULL;
- char *uname_lookup = NULL;
CRM_ASSERT(id > 0 || uname != NULL);
crm_peer_init();
if (uname != NULL) {
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &node)) {
if(node->uname && strcasecmp(node->uname, uname) == 0) {
crm_trace("Name match: %s = %p", node->uname, node);
by_name = node;
break;
}
}
}
if (id > 0) {
g_hash_table_iter_init(&iter, crm_peer_cache);
while (g_hash_table_iter_next(&iter, NULL, (gpointer *) &node)) {
if(node->id == id) {
crm_trace("ID match: %u = %p", node->id, node);
by_id = node;
break;
}
}
}
node = by_id; /* Good default */
if(by_id == by_name) {
/* Nothing to do if they match (both NULL counts) */
crm_trace("Consistent: %p for %u/%s", by_id, id, uname);
} else if(by_id == NULL && by_name) {
crm_trace("Only one: %p for %u/%s", by_name, id, uname);
if(id && by_name->id) {
crm_dump_peer_hash(LOG_WARNING, __FUNCTION__);
crm_crit("Node %u and %u share the same name '%s'",
id, by_name->id, uname);
node = NULL; /* Create a new one */
} else {
node = by_name;
}
} else if(by_name == NULL && by_id) {
crm_trace("Only one: %p for %u/%s", by_id, id, uname);
if(uname && by_id->uname) {
crm_dump_peer_hash(LOG_WARNING, __FUNCTION__);
crm_crit("Node '%s' and '%s' share the same cluster nodeid %u: assuming '%s' is correct",
uname, by_id->uname, id, uname);
}
} else if(uname && by_id->uname) {
crm_warn("Node '%s' and '%s' share the same cluster nodeid: %u", by_id->uname, by_name->uname, id);
} else if(id && by_name->id) {
crm_warn("Node %u and %u share the same name: '%s'", by_id->id, by_name->id, uname);
} else {
/* Simple merge */
/* Only corosync-based clusters use node IDs.
 *
 * The functions that call crm_update_peer_state() only know the node ID,
 * so 'by_id' is authoritative when merging.
 *
 * The same is true for crm_update_peer_proc().
 */
crm_dump_peer_hash(LOG_DEBUG, __FUNCTION__);
crm_info("Merging %p into %p", by_name, by_id);
g_hash_table_foreach_remove(crm_peer_cache, crm_hash_find_by_data, by_name);
}
+ return node;
+}
+
+/* coverity[-alloc] Memory is referenced in one or both hashtables */
+crm_node_t *
+crm_get_peer(unsigned int id, const char *uname)
+{
+ crm_node_t *node = NULL;
+ char *uname_lookup = NULL;
+
+ CRM_ASSERT(id > 0 || uname != NULL);
+
+ crm_peer_init();
+
+ node = crm_find_peer(id, uname);
+
if (node == NULL) {
char *uniqueid = crm_generate_uuid();
node = calloc(1, sizeof(crm_node_t));
CRM_ASSERT(node);
crm_info("Created entry %s/%p for node %s/%u (%d total)",
uniqueid, node, uname, id, 1 + g_hash_table_size(crm_peer_cache));
g_hash_table_replace(crm_peer_cache, uniqueid, node);
}
if(id && uname == NULL && node->uname == NULL) {
uname_lookup = get_node_name(id);
uname = uname_lookup;
crm_trace("Inferred a name of '%s' for node %u", uname, id);
}
if(id > 0 && uname && (node->id == 0 || node->uname == NULL)) {
crm_info("Node %u is now known as %s", id, uname);
}
if(id > 0 && node->id == 0) {
node->id = id;
}
if(uname && node->uname == NULL) {
int lpc, len = strlen(uname);
for (lpc = 0; lpc < len; lpc++) {
if (uname[lpc] >= 'A' && uname[lpc] <= 'Z') {
crm_warn("Node names with capitals are discouraged; consider changing '%s' to something else",
uname);
break;
}
}
node->uname = strdup(uname);
if (crm_status_callback) {
crm_status_callback(crm_status_uname, node, NULL);
}
}
if(node->uuid == NULL) {
const char *uuid = crm_peer_uuid(node);
if (uuid) {
crm_info("Node %u has uuid %s", id, uuid);
} else {
crm_info("Cannot obtain a UUID for node %u/%s", id, node->uname);
}
}
free(uname_lookup);
return node;
}
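The by-id/by-name reconciliation inside crm_find_peer() above is easier to see in isolation. The following is a minimal, self-contained sketch of the same precedence rules (hypothetical struct and function names, not the real Pacemaker API): an id match is authoritative, a name-only match is reused unless it already carries a conflicting id, and a conflict yields NULL so the caller creates a fresh entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of crm_find_peer(): search the cache once by
 * numeric id and once by name, then decide which entry (if either)
 * represents the peer being looked up. */
struct peer { unsigned int id; const char *uname; };

static struct peer *
find_peer(struct peer *cache, size_t n, unsigned int id, const char *uname)
{
    struct peer *by_id = NULL, *by_name = NULL;

    for (size_t i = 0; i < n; i++) {
        if (uname && cache[i].uname && strcmp(cache[i].uname, uname) == 0) {
            by_name = &cache[i];
            break;
        }
    }
    for (size_t i = 0; i < n; i++) {
        if (id > 0 && cache[i].id == id) {
            by_id = &cache[i];
            break;
        }
    }

    if (by_id == by_name) {
        return by_id;                 /* consistent (both NULL counts too) */
    } else if (by_id == NULL) {
        /* Name matched only: reuse it unless it already has a different id */
        return (id && by_name->id) ? NULL : by_name;
    }
    return by_id;                     /* id match is authoritative */
}
```

The real function additionally merges the two cache entries when both lookups hit different nodes; this sketch only shows which entry wins.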
crm_node_t *
crm_update_peer(const char *source, unsigned int id, uint64_t born, uint64_t seen, int32_t votes,
uint32_t children, const char *uuid, const char *uname, const char *addr,
const char *state)
{
#if SUPPORT_PLUGIN
gboolean addr_changed = FALSE;
gboolean votes_changed = FALSE;
#endif
crm_node_t *node = NULL;
id = get_corosync_id(id, uuid);
node = crm_get_peer(id, uname);
CRM_ASSERT(node != NULL);
if (node->uuid == NULL) {
if (is_openais_cluster()) {
/* Yes, overrule whatever was passed in */
crm_peer_uuid(node);
} else if (uuid != NULL) {
node->uuid = strdup(uuid);
}
}
if (children > 0) {
crm_update_peer_proc(source, node, children, state);
}
if (state != NULL) {
crm_update_peer_state(source, node, state, seen);
}
#if SUPPORT_HEARTBEAT
if (born != 0) {
node->born = born;
}
#endif
#if SUPPORT_PLUGIN
/* These were only used by the plugin */
if (born != 0) {
node->born = born;
}
if (votes > 0 && node->votes != votes) {
votes_changed = TRUE;
node->votes = votes;
}
if (addr != NULL) {
if (node->addr == NULL || crm_str_eq(node->addr, addr, FALSE) == FALSE) {
addr_changed = TRUE;
free(node->addr);
node->addr = strdup(addr);
}
}
if (addr_changed || votes_changed) {
crm_info("%s: Node %s: id=%u state=%s addr=%s%s votes=%d%s born=" U64T " seen=" U64T
" proc=%.32x", source, node->uname, node->id, node->state,
node->addr, addr_changed ? " (new)" : "", node->votes,
votes_changed ? " (new)" : "", node->born, node->last_seen, node->processes);
}
#endif
return node;
}
void
crm_update_peer_proc(const char *source, crm_node_t * node, uint32_t flag, const char *status)
{
uint32_t last = 0;
gboolean changed = FALSE;
CRM_CHECK(node != NULL, crm_err("%s: Could not set %s to %s for NULL",
source, peer2text(flag), status); return);
last = node->processes;
if (status == NULL) {
node->processes = flag;
if (node->processes != last) {
changed = TRUE;
}
} else if (safe_str_eq(status, ONLINESTATUS)) {
if ((node->processes & flag) == 0) {
set_bit(node->processes, flag);
changed = TRUE;
}
#if SUPPORT_PLUGIN
} else if (safe_str_eq(status, CRM_NODE_MEMBER)) {
if (flag > 0 && node->processes != flag) {
node->processes = flag;
changed = TRUE;
}
#endif
} else if (node->processes & flag) {
clear_bit(node->processes, flag);
changed = TRUE;
}
if (changed) {
if (status == NULL && flag <= crm_proc_none) {
crm_info("%s: Node %s[%u] - all processes are now offline", source, node->uname,
node->id);
} else {
crm_info("%s: Node %s[%u] - %s is now %s", source, node->uname, node->id,
peer2text(flag), status);
}
if (crm_status_callback) {
crm_status_callback(crm_status_processes, node, &last);
}
} else {
crm_trace("%s: Node %s[%u] - %s is unchanged (%s)", source, node->uname, node->id,
peer2text(flag), status);
}
}
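Stripped of logging and callbacks, the flag bookkeeping in crm_update_peer_proc() reduces to three cases on the process mask. A sketch under assumed names (the flag constants and "online" string below are illustrative, and the plugin-only CRM_NODE_MEMBER branch is omitted):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-ins for the peer process flags */
#define PROC_CRMD 0x00000010U
#define PROC_CIB  0x00000100U

/* A NULL status replaces the whole mask; "online" sets the bit;
 * any other status clears it. */
static uint32_t
update_proc(uint32_t procs, uint32_t flag, const char *status)
{
    if (status == NULL) {
        return flag;                 /* wholesale replacement of the mask */
    } else if (strcmp(status, "online") == 0) {
        return procs | flag;         /* set_bit() */
    }
    return procs & ~flag;            /* clear_bit() */
}
```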
void
crm_update_peer_expected(const char *source, crm_node_t * node, const char *expected)
{
char *last = NULL;
gboolean changed = FALSE;
CRM_CHECK(node != NULL, crm_err("%s: Could not set 'expected' to %s", source, expected);
return);
last = node->expected;
if (expected != NULL && safe_str_neq(node->expected, expected)) {
node->expected = strdup(expected);
changed = TRUE;
}
if (changed) {
crm_info("%s: Node %s[%u] - expected state is now %s (was %s)", source, node->uname, node->id,
expected, last);
free(last);
} else {
crm_trace("%s: Node %s[%u] - expected state is unchanged (%s)", source, node->uname,
node->id, expected);
}
}
void
crm_update_peer_state(const char *source, crm_node_t * node, const char *state, int membership)
{
char *last = NULL;
gboolean changed = FALSE;
CRM_CHECK(node != NULL, crm_err("%s: Could not set 'state' to %s", source, state);
return);
last = node->state;
if (state != NULL && safe_str_neq(node->state, state)) {
node->state = strdup(state);
changed = TRUE;
}
if (membership != 0 && safe_str_eq(node->state, CRM_NODE_MEMBER)) {
node->last_seen = membership;
}
if (changed) {
crm_notice("%s: Node %s[%u] - state is now %s (was %s)", source, node->uname, node->id, state, last);
if (crm_status_callback) {
enum crm_status_type status_type = crm_status_nstate;
if (is_set(node->flags, crm_remote_node)) {
status_type = crm_status_rstate;
}
crm_status_callback(status_type, node, last);
}
free(last);
} else {
crm_trace("%s: Node %s[%u] - state is unchanged (%s)", source, node->uname, node->id,
state);
}
}
int
crm_terminate_member(int nodeid, const char *uname, void *unused)
{
/* Always use the synchronous, non-mainloop version */
return stonith_api_kick(nodeid, uname, 120, TRUE);
}
int
crm_terminate_member_no_mainloop(int nodeid, const char *uname, int *connection)
{
return stonith_api_kick(nodeid, uname, 120, TRUE);
}
diff --git a/lib/pengine/complex.c b/lib/pengine/complex.c
index 2ada463796..a592fc8af8 100644
--- a/lib/pengine/complex.c
+++ b/lib/pengine/complex.c
@@ -1,747 +1,748 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#include <crm/pengine/rules.h>
#include <crm/pengine/internal.h>
#include <crm/msg_xml.h>
void populate_hash(xmlNode * nvpair_list, GHashTable * hash, const char **attrs, int attrs_length);
resource_object_functions_t resource_class_functions[] = {
{
native_unpack,
native_find_rsc,
native_parameter,
native_print,
native_active,
native_resource_state,
native_location,
native_free},
{
group_unpack,
native_find_rsc,
native_parameter,
group_print,
group_active,
group_resource_state,
native_location,
group_free},
{
clone_unpack,
native_find_rsc,
native_parameter,
clone_print,
clone_active,
clone_resource_state,
native_location,
clone_free},
{
master_unpack,
native_find_rsc,
native_parameter,
clone_print,
clone_active,
clone_resource_state,
native_location,
clone_free}
};
enum pe_obj_types
get_resource_type(const char *name)
{
if (safe_str_eq(name, XML_CIB_TAG_RESOURCE)) {
return pe_native;
} else if (safe_str_eq(name, XML_CIB_TAG_GROUP)) {
return pe_group;
} else if (safe_str_eq(name, XML_CIB_TAG_INCARNATION)) {
return pe_clone;
} else if (safe_str_eq(name, XML_CIB_TAG_MASTER)) {
return pe_master;
}
return pe_unknown;
}
const char *
get_resource_typename(enum pe_obj_types type)
{
switch (type) {
case pe_native:
return XML_CIB_TAG_RESOURCE;
case pe_group:
return XML_CIB_TAG_GROUP;
case pe_clone:
return XML_CIB_TAG_INCARNATION;
case pe_master:
return XML_CIB_TAG_MASTER;
case pe_unknown:
return "unknown";
}
return "<unknown>";
}
static void
dup_attr(gpointer key, gpointer value, gpointer user_data)
{
add_hash_param(user_data, key, value);
}
void
get_meta_attributes(GHashTable * meta_hash, resource_t * rsc,
node_t * node, pe_working_set_t * data_set)
{
GHashTable *node_hash = NULL;
if (node) {
node_hash = node->details->attrs;
}
if (rsc->xml) {
xmlAttrPtr xIter = NULL;
for (xIter = rsc->xml->properties; xIter; xIter = xIter->next) {
const char *prop_name = (const char *)xIter->name;
const char *prop_value = crm_element_value(rsc->xml, prop_name);
add_hash_param(meta_hash, prop_name, prop_value);
}
}
unpack_instance_attributes(data_set->input, rsc->xml, XML_TAG_META_SETS, node_hash,
meta_hash, NULL, FALSE, data_set->now);
/* populate from the regular attributes until the GUI can create
* meta attributes
*/
unpack_instance_attributes(data_set->input, rsc->xml, XML_TAG_ATTR_SETS, node_hash,
meta_hash, NULL, FALSE, data_set->now);
/* set anything else based on the parent */
if (rsc->parent != NULL) {
g_hash_table_foreach(rsc->parent->meta, dup_attr, meta_hash);
}
/* and finally check the defaults */
unpack_instance_attributes(data_set->input, data_set->rsc_defaults, XML_TAG_META_SETS,
node_hash, meta_hash, NULL, FALSE, data_set->now);
}
void
get_rsc_attributes(GHashTable * meta_hash, resource_t * rsc,
node_t * node, pe_working_set_t * data_set)
{
GHashTable *node_hash = NULL;
if (node) {
node_hash = node->details->attrs;
}
unpack_instance_attributes(data_set->input, rsc->xml, XML_TAG_ATTR_SETS, node_hash,
meta_hash, NULL, FALSE, data_set->now);
if (rsc->container) {
g_hash_table_replace(meta_hash, strdup(CRM_META"_"XML_RSC_ATTR_CONTAINER), strdup(rsc->container->id));
}
/* set anything else based on the parent */
if (rsc->parent != NULL) {
get_rsc_attributes(meta_hash, rsc->parent, node, data_set);
} else {
/* and finally check the defaults */
unpack_instance_attributes(data_set->input, data_set->rsc_defaults, XML_TAG_ATTR_SETS,
node_hash, meta_hash, NULL, FALSE, data_set->now);
}
}
static char *
template_op_key(xmlNode * op)
{
const char *name = crm_element_value(op, "name");
const char *role = crm_element_value(op, "role");
char *key = NULL;
if (role == NULL || crm_str_eq(role, RSC_ROLE_STARTED_S, TRUE)
|| crm_str_eq(role, RSC_ROLE_SLAVE_S, TRUE)) {
role = RSC_ROLE_UNKNOWN_S;
}
key = crm_concat(name, role, '-');
return key;
}
static gboolean
unpack_template(xmlNode * xml_obj, xmlNode ** expanded_xml, pe_working_set_t * data_set)
{
xmlNode *cib_resources = NULL;
xmlNode *template = NULL;
xmlNode *new_xml = NULL;
xmlNode *child_xml = NULL;
xmlNode *rsc_ops = NULL;
xmlNode *template_ops = NULL;
const char *template_ref = NULL;
const char *id = NULL;
if (xml_obj == NULL) {
pe_err("No resource object for template unpacking");
return FALSE;
}
template_ref = crm_element_value(xml_obj, XML_CIB_TAG_RSC_TEMPLATE);
if (template_ref == NULL) {
return TRUE;
}
id = ID(xml_obj);
if (id == NULL) {
pe_err("'%s' object must have an id", crm_element_name(xml_obj));
return FALSE;
}
if (crm_str_eq(template_ref, id, TRUE)) {
pe_err("The resource object '%s' should not reference itself", id);
return FALSE;
}
cib_resources = get_xpath_object("//"XML_CIB_TAG_RESOURCES, data_set->input, LOG_TRACE);
if (cib_resources == NULL) {
pe_err("No resources configured");
return FALSE;
}
template = find_entity(cib_resources, XML_CIB_TAG_RSC_TEMPLATE, template_ref);
if (template == NULL) {
pe_err("No template named '%s'", template_ref);
return FALSE;
}
new_xml = copy_xml(template);
xmlNodeSetName(new_xml, xml_obj->name);
crm_xml_replace(new_xml, XML_ATTR_ID, id);
template_ops = find_xml_node(new_xml, "operations", FALSE);
for (child_xml = __xml_first_child(xml_obj); child_xml != NULL;
child_xml = __xml_next(child_xml)) {
xmlNode *new_child = NULL;
new_child = add_node_copy(new_xml, child_xml);
if (crm_str_eq((const char *)new_child->name, "operations", TRUE)) {
rsc_ops = new_child;
}
}
if (template_ops && rsc_ops) {
xmlNode *op = NULL;
GHashTable *rsc_ops_hash =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, NULL);
for (op = __xml_first_child(rsc_ops); op != NULL; op = __xml_next(op)) {
char *key = template_op_key(op);
g_hash_table_insert(rsc_ops_hash, key, op);
}
for (op = __xml_first_child(template_ops); op != NULL; op = __xml_next(op)) {
char *key = template_op_key(op);
if (g_hash_table_lookup(rsc_ops_hash, key) == NULL) {
add_node_copy(rsc_ops, op);
}
free(key);
}
if (rsc_ops_hash) {
g_hash_table_destroy(rsc_ops_hash);
}
free_xml(template_ops);
}
/*free_xml(*expanded_xml); */
*expanded_xml = new_xml;
/* Disable multi-level templates for now */
/*if(unpack_template(new_xml, expanded_xml, data_set) == FALSE) {
free_xml(*expanded_xml);
*expanded_xml = NULL;
return FALSE;
} */
return TRUE;
}
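Two details of unpack_template() are worth pulling out: operations defined on the resource shadow same-keyed operations inherited from the template, and template_op_key() collapses the Started and Slave roles to the unknown role so they compare equal. A self-contained sketch of both rules (role strings and helper names here are illustrative):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* template_op_key(): "name" plus "role", with Started/Slave collapsed
 * to the unknown role so those operations compare equal. */
static void
op_key(char *buf, size_t len, const char *name, const char *role)
{
    if (role == NULL || strcmp(role, "Started") == 0
        || strcmp(role, "Slave") == 0) {
        role = "Unknown";
    }
    snprintf(buf, len, "%s-%s", name, role);
}

/* Precedence rule: if the resource already defines an operation with
 * this key, the template's copy is skipped. */
static int
op_shadowed(const char *template_key, const char **rsc_keys, int n)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(rsc_keys[i], template_key) == 0) {
            return 1;
        }
    }
    return 0;
}
```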
static gboolean
add_template_rsc(xmlNode * xml_obj, pe_working_set_t * data_set)
{
const char *template_ref = NULL;
const char *id = NULL;
xmlNode *rsc_set = NULL;
xmlNode *rsc_ref = NULL;
if (xml_obj == NULL) {
pe_err("No resource object for processing resource list of template");
return FALSE;
}
template_ref = crm_element_value(xml_obj, XML_CIB_TAG_RSC_TEMPLATE);
if (template_ref == NULL) {
return TRUE;
}
id = ID(xml_obj);
if (id == NULL) {
pe_err("'%s' object must have an id", crm_element_name(xml_obj));
return FALSE;
}
if (crm_str_eq(template_ref, id, TRUE)) {
pe_err("The resource object '%s' should not reference itself", id);
return FALSE;
}
rsc_set = g_hash_table_lookup(data_set->template_rsc_sets, template_ref);
if (rsc_set == NULL) {
rsc_set = create_xml_node(NULL, XML_CONS_TAG_RSC_SET);
crm_xml_add(rsc_set, XML_ATTR_ID, template_ref);
g_hash_table_insert(data_set->template_rsc_sets, strdup(template_ref), rsc_set);
}
rsc_ref = create_xml_node(rsc_set, XML_TAG_RESOURCE_REF);
crm_xml_add(rsc_ref, XML_ATTR_ID, id);
return TRUE;
}
gboolean
common_unpack(xmlNode * xml_obj, resource_t ** rsc,
resource_t * parent, pe_working_set_t * data_set)
{
xmlNode *expanded_xml = NULL;
xmlNode *ops = NULL;
resource_t *top = NULL;
const char *value = NULL;
const char *id = crm_element_value(xml_obj, XML_ATTR_ID);
const char *class = crm_element_value(xml_obj, XML_AGENT_ATTR_CLASS);
crm_log_xml_trace(xml_obj, "Processing resource input...");
if (id == NULL) {
pe_err("Must specify id tag in <resource>");
return FALSE;
} else if (rsc == NULL) {
pe_err("Nowhere to unpack resource into");
return FALSE;
}
if (unpack_template(xml_obj, &expanded_xml, data_set) == FALSE) {
return FALSE;
}
*rsc = calloc(1, sizeof(resource_t));
if (expanded_xml) {
crm_log_xml_trace(expanded_xml, "Expanded resource...");
(*rsc)->xml = expanded_xml;
(*rsc)->orig_xml = xml_obj;
} else {
(*rsc)->xml = xml_obj;
(*rsc)->orig_xml = NULL;
}
(*rsc)->parent = parent;
ops = find_xml_node((*rsc)->xml, "operations", FALSE);
(*rsc)->ops_xml = expand_idref(ops, data_set->input);
(*rsc)->variant = get_resource_type(crm_element_name(xml_obj));
if ((*rsc)->variant == pe_unknown) {
pe_err("Unknown resource type: %s", crm_element_name(xml_obj));
free(*rsc);
return FALSE;
}
(*rsc)->parameters =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, g_hash_destroy_str);
(*rsc)->meta =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, g_hash_destroy_str);
(*rsc)->allowed_nodes =
g_hash_table_new_full(crm_str_hash, g_str_equal, NULL, g_hash_destroy_str);
(*rsc)->known_on = g_hash_table_new_full(crm_str_hash, g_str_equal, NULL, g_hash_destroy_str);
value = crm_element_value(xml_obj, XML_RSC_ATTR_INCARNATION);
if (value) {
(*rsc)->id = crm_concat(id, value, ':');
add_hash_param((*rsc)->meta, XML_RSC_ATTR_INCARNATION, value);
} else {
(*rsc)->id = strdup(id);
}
(*rsc)->fns = &resource_class_functions[(*rsc)->variant];
pe_rsc_trace((*rsc), "Unpacking resource...");
get_meta_attributes((*rsc)->meta, *rsc, NULL, data_set);
(*rsc)->flags = 0;
set_bit((*rsc)->flags, pe_rsc_runnable);
set_bit((*rsc)->flags, pe_rsc_provisional);
if (is_set(data_set->flags, pe_flag_is_managed_default)) {
set_bit((*rsc)->flags, pe_rsc_managed);
}
(*rsc)->rsc_cons = NULL;
(*rsc)->rsc_tickets = NULL;
(*rsc)->actions = NULL;
(*rsc)->role = RSC_ROLE_STOPPED;
(*rsc)->next_role = RSC_ROLE_UNKNOWN;
(*rsc)->recovery_type = recovery_stop_start;
(*rsc)->stickiness = data_set->default_resource_stickiness;
(*rsc)->migration_threshold = INFINITY;
(*rsc)->failure_timeout = 0;
value = g_hash_table_lookup((*rsc)->meta, XML_CIB_ATTR_PRIORITY);
(*rsc)->priority = crm_parse_int(value, "0");
(*rsc)->effective_priority = (*rsc)->priority;
value = g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_NOTIFY);
if (crm_is_true(value)) {
set_bit((*rsc)->flags, pe_rsc_notify);
}
value = g_hash_table_lookup((*rsc)->meta, XML_OP_ATTR_ALLOW_MIGRATE);
if (crm_is_true(value)) {
set_bit((*rsc)->flags, pe_rsc_allow_migrate);
}
value = g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_MANAGED);
if (value != NULL && safe_str_neq("default", value)) {
gboolean bool_value = TRUE;
crm_str_to_boolean(value, &bool_value);
if (bool_value == FALSE) {
clear_bit((*rsc)->flags, pe_rsc_managed);
} else {
set_bit((*rsc)->flags, pe_rsc_managed);
}
}
if (is_set(data_set->flags, pe_flag_maintenance_mode)) {
clear_bit((*rsc)->flags, pe_rsc_managed);
}
pe_rsc_trace((*rsc), "Options for %s", (*rsc)->id);
value = g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_UNIQUE);
top = uber_parent(*rsc);
if (crm_is_true(value) || top->variant < pe_clone) {
set_bit((*rsc)->flags, pe_rsc_unique);
}
value = g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_RESTART);
if (safe_str_eq(value, "restart")) {
(*rsc)->restart_type = pe_restart_restart;
pe_rsc_trace((*rsc), "\tDependency restart handling: restart");
} else {
(*rsc)->restart_type = pe_restart_ignore;
pe_rsc_trace((*rsc), "\tDependency restart handling: ignore");
}
value = g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_MULTIPLE);
if (safe_str_eq(value, "stop_only")) {
(*rsc)->recovery_type = recovery_stop_only;
pe_rsc_trace((*rsc), "\tMultiple running resource recovery: stop only");
} else if (safe_str_eq(value, "block")) {
(*rsc)->recovery_type = recovery_block;
pe_rsc_trace((*rsc), "\tMultiple running resource recovery: block");
} else {
(*rsc)->recovery_type = recovery_stop_start;
pe_rsc_trace((*rsc), "\tMultiple running resource recovery: stop/start");
}
value = g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_STICKINESS);
if (value != NULL && safe_str_neq("default", value)) {
(*rsc)->stickiness = char2score(value);
}
value = g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_FAIL_STICKINESS);
if (value != NULL && safe_str_neq("default", value)) {
(*rsc)->migration_threshold = char2score(value);
} else if (value == NULL) {
/* Make a best-effort guess at a migration threshold for people with 0.6 configs:
 * try with underscores and hyphens, from both the resource and global defaults sections
 */
value = g_hash_table_lookup((*rsc)->meta, "resource-failure-stickiness");
if (value == NULL) {
value = g_hash_table_lookup((*rsc)->meta, "resource_failure_stickiness");
}
if (value == NULL) {
value =
g_hash_table_lookup(data_set->config_hash, "default-resource-failure-stickiness");
}
if (value == NULL) {
value =
g_hash_table_lookup(data_set->config_hash, "default_resource_failure_stickiness");
}
if (value) {
int fail_sticky = char2score(value);
if (fail_sticky == -INFINITY) {
(*rsc)->migration_threshold = 1;
pe_rsc_info((*rsc),
"Set a migration threshold of %d for %s based on a failure-stickiness of %s",
(*rsc)->migration_threshold, (*rsc)->id, value);
} else if ((*rsc)->stickiness != 0 && fail_sticky != 0) {
(*rsc)->migration_threshold = (*rsc)->stickiness / fail_sticky;
if ((*rsc)->migration_threshold < 0) {
/* Make sure it's positive */
(*rsc)->migration_threshold = 0 - (*rsc)->migration_threshold;
}
(*rsc)->migration_threshold += 1;
pe_rsc_info((*rsc),
"Calculated a migration threshold for %s of %d based on a stickiness of %d/%s",
(*rsc)->id, (*rsc)->migration_threshold, (*rsc)->stickiness, value);
}
}
}
value = g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_REQUIRES);
if (safe_str_eq(value, "nothing")) {
} else if (safe_str_eq(value, "quorum")) {
set_bit((*rsc)->flags, pe_rsc_needs_quorum);
} else if (safe_str_eq(value, "unfencing")) {
set_bit((*rsc)->flags, pe_rsc_needs_fencing);
set_bit((*rsc)->flags, pe_rsc_needs_unfencing);
if (is_set(data_set->flags, pe_flag_stonith_enabled) == FALSE) {
crm_notice("%s requires (un)fencing but fencing is disabled", (*rsc)->id);
}
} else if (safe_str_eq(value, "fencing")) {
set_bit((*rsc)->flags, pe_rsc_needs_fencing);
if (is_set(data_set->flags, pe_flag_stonith_enabled) == FALSE) {
crm_notice("%s requires fencing but fencing is disabled", (*rsc)->id);
}
} else {
if (value) {
crm_config_err("Invalid value for %s->requires: %s%s",
(*rsc)->id, value,
is_set(data_set->flags,
pe_flag_stonith_enabled) ? "" : " (stonith-enabled=false)");
}
if (is_set(data_set->flags, pe_flag_stonith_enabled)) {
set_bit((*rsc)->flags, pe_rsc_needs_fencing);
value = "fencing (default)";
} else if (data_set->no_quorum_policy == no_quorum_ignore) {
value = "nothing (default)";
} else {
set_bit((*rsc)->flags, pe_rsc_needs_quorum);
value = "quorum (default)";
}
}
pe_rsc_trace((*rsc), "\tRequired to start: %s", value);
value = g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_FAIL_TIMEOUT);
if (value != NULL) {
/* call crm_get_msec() and convert back to seconds */
(*rsc)->failure_timeout = (crm_get_msec(value) / 1000);
}
get_target_role(*rsc, &((*rsc)->next_role));
pe_rsc_trace((*rsc), "\tDesired next state: %s",
(*rsc)->next_role != RSC_ROLE_UNKNOWN ? role2text((*rsc)->next_role) : "default");
if ((*rsc)->fns->unpack(*rsc, data_set) == FALSE) {
return FALSE;
}
if (is_set(data_set->flags, pe_flag_symmetric_cluster)) {
resource_location(*rsc, NULL, 0, "symmetric_default", data_set);
} else if (xml_contains_remote_node(xml_obj) && g_hash_table_lookup((*rsc)->meta, XML_RSC_ATTR_CONTAINER)) {
/* Remote resources tied to a container resource must always be allowed
 * to opt in to the cluster. Whether the connection resource is actually
 * allowed to be placed on a node depends on the container resource */
resource_location(*rsc, NULL, 0, "remote_connection_default", data_set);
}
pe_rsc_trace((*rsc), "\tAction notification: %s",
is_set((*rsc)->flags, pe_rsc_notify) ? "required" : "not required");
if (safe_str_eq(class, "stonith")) {
set_bit(data_set->flags, pe_flag_have_stonith_resource);
}
(*rsc)->utilization =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, g_hash_destroy_str);
unpack_instance_attributes(data_set->input, (*rsc)->xml, XML_TAG_UTILIZATION, NULL,
(*rsc)->utilization, NULL, FALSE, data_set->now);
/* data_set->resources = g_list_append(data_set->resources, (*rsc)); */
if (expanded_xml) {
if (add_template_rsc(xml_obj, data_set) == FALSE) {
return FALSE;
}
}
return TRUE;
}
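The 0.6-compatibility branch of common_unpack() derives migration-threshold from stickiness and the legacy failure-stickiness score: a failure-stickiness of -INFINITY maps to a threshold of 1, otherwise the quotient is made positive and incremented by one. A sketch of just that arithmetic, with an illustrative INFINITY constant standing in for Pacemaker's score type:

```c
#include <assert.h>

/* Illustrative stand-in for Pacemaker's INFINITY score */
#define MY_INFINITY 1000000

static int
legacy_migration_threshold(int stickiness, int fail_sticky)
{
    int threshold = MY_INFINITY;     /* default: failures never force a move */

    if (fail_sticky == -MY_INFINITY) {
        threshold = 1;               /* any failure moves the resource */
    } else if (stickiness != 0 && fail_sticky != 0) {
        threshold = stickiness / fail_sticky;
        if (threshold < 0) {
            threshold = -threshold;  /* make sure it is positive */
        }
        threshold += 1;
    }
    return threshold;
}
```

For example, a stickiness of 100 with a failure-stickiness of -10 yields a threshold of 11 failures.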
void
common_update_score(resource_t * rsc, const char *id, int score)
{
node_t *node = NULL;
node = pe_hash_table_lookup(rsc->allowed_nodes, id);
if (node != NULL) {
pe_rsc_trace(rsc, "Updating score for %s on %s: %d + %d", rsc->id, id, node->weight, score);
node->weight = merge_weights(node->weight, score);
}
if (rsc->children) {
GListPtr gIter = rsc->children;
for (; gIter != NULL; gIter = gIter->next) {
resource_t *child_rsc = (resource_t *) gIter->data;
common_update_score(child_rsc, id, score);
}
}
}
gboolean
is_parent(resource_t *child, resource_t *rsc)
{
resource_t *parent = child;
if (parent == NULL || rsc == NULL) {
return FALSE;
}
while (parent->parent != NULL) {
if (parent->parent == rsc) {
return TRUE;
}
parent = parent->parent;
}
return FALSE;
}
resource_t *
uber_parent(resource_t * rsc)
{
resource_t *parent = rsc;
if (parent == NULL) {
return NULL;
}
while (parent->parent != NULL) {
parent = parent->parent;
}
return parent;
}
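The parent-chain walks in is_parent() and uber_parent() can be exercised with a tiny stand-in structure. A minimal sketch (hypothetical names, carrying only the parent pointer that these two functions actually use):

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for the resource parent chain */
struct rsc { struct rsc *parent; };

/* Mirrors is_parent(): walk up from child, looking for candidate
 * among its ancestors. A node is not its own ancestor. */
static int
is_ancestor(struct rsc *child, struct rsc *candidate)
{
    if (child == NULL || candidate == NULL) {
        return 0;
    }
    for (struct rsc *p = child; p->parent != NULL; p = p->parent) {
        if (p->parent == candidate) {
            return 1;
        }
    }
    return 0;
}

/* Mirrors uber_parent(): follow parent pointers to the top of the tree */
static struct rsc *
top_parent(struct rsc *rsc)
{
    if (rsc == NULL) {
        return NULL;
    }
    while (rsc->parent != NULL) {
        rsc = rsc->parent;
    }
    return rsc;
}
```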
void
common_free(resource_t * rsc)
{
if (rsc == NULL) {
return;
}
pe_rsc_trace(rsc, "Freeing %s %d", rsc->id, rsc->variant);
g_list_free(rsc->rsc_cons);
g_list_free(rsc->rsc_cons_lhs);
g_list_free(rsc->rsc_tickets);
g_list_free(rsc->dangling_migrations);
if (rsc->parameters != NULL) {
g_hash_table_destroy(rsc->parameters);
}
if (rsc->meta != NULL) {
g_hash_table_destroy(rsc->meta);
}
if (rsc->utilization != NULL) {
g_hash_table_destroy(rsc->utilization);
}
if (rsc->parent == NULL && is_set(rsc->flags, pe_rsc_orphan)) {
free_xml(rsc->xml);
rsc->xml = NULL;
free_xml(rsc->orig_xml);
rsc->orig_xml = NULL;
/* if rsc->orig_xml, then rsc->xml is an expanded xml from a template */
} else if (rsc->orig_xml) {
free_xml(rsc->xml);
rsc->xml = NULL;
}
if (rsc->running_on) {
g_list_free(rsc->running_on);
rsc->running_on = NULL;
}
if (rsc->known_on) {
g_hash_table_destroy(rsc->known_on);
rsc->known_on = NULL;
}
if (rsc->actions) {
g_list_free(rsc->actions);
rsc->actions = NULL;
}
if (rsc->allowed_nodes) {
g_hash_table_destroy(rsc->allowed_nodes);
rsc->allowed_nodes = NULL;
}
g_list_free(rsc->fillers);
g_list_free(rsc->rsc_location);
pe_rsc_trace(rsc, "Resource freed");
free(rsc->id);
free(rsc->clone_name);
free(rsc->allocated_to);
free(rsc->variant_opaque);
+ free(rsc->pending_task);
free(rsc);
}
diff --git a/lib/pengine/group.c b/lib/pengine/group.c
index 885486eda2..831376c23b 100644
--- a/lib/pengine/group.c
+++ b/lib/pengine/group.c
@@ -1,228 +1,233 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#include <crm/pengine/rules.h>
#include <crm/pengine/status.h>
#include <crm/pengine/internal.h>
#include <unpack.h>
#include <crm/msg_xml.h>
#define VARIANT_GROUP 1
#include "./variant.h"
gboolean
group_unpack(resource_t * rsc, pe_working_set_t * data_set)
{
xmlNode *xml_obj = rsc->xml;
xmlNode *xml_native_rsc = NULL;
group_variant_data_t *group_data = NULL;
const char *group_ordered = g_hash_table_lookup(rsc->meta, XML_RSC_ATTR_ORDERED);
const char *group_colocated = g_hash_table_lookup(rsc->meta, "collocated");
const char *clone_id = NULL;
pe_rsc_trace(rsc, "Processing resource %s...", rsc->id);
group_data = calloc(1, sizeof(group_variant_data_t));
group_data->num_children = 0;
group_data->first_child = NULL;
group_data->last_child = NULL;
rsc->variant_opaque = group_data;
group_data->ordered = TRUE;
group_data->colocated = TRUE;
if (group_ordered != NULL) {
crm_str_to_boolean(group_ordered, &(group_data->ordered));
}
if (group_colocated != NULL) {
crm_str_to_boolean(group_colocated, &(group_data->colocated));
}
clone_id = crm_element_value(rsc->xml, XML_RSC_ATTR_INCARNATION);
for (xml_native_rsc = __xml_first_child(xml_obj); xml_native_rsc != NULL;
xml_native_rsc = __xml_next(xml_native_rsc)) {
if (crm_str_eq((const char *)xml_native_rsc->name, XML_CIB_TAG_RESOURCE, TRUE)) {
resource_t *new_rsc = NULL;
crm_xml_add(xml_native_rsc, XML_RSC_ATTR_INCARNATION, clone_id);
if (common_unpack(xml_native_rsc, &new_rsc, rsc, data_set) == FALSE) {
pe_err("Failed unpacking resource %s", crm_element_value(xml_obj, XML_ATTR_ID));
if (new_rsc != NULL && new_rsc->fns != NULL) {
new_rsc->fns->free(new_rsc);
}
}
group_data->num_children++;
rsc->children = g_list_append(rsc->children, new_rsc);
if (group_data->first_child == NULL) {
group_data->first_child = new_rsc;
}
group_data->last_child = new_rsc;
print_resource(LOG_DEBUG_3, "Added ", new_rsc, FALSE);
}
}
if (group_data->num_children == 0) {
#if 0
/* Bug #1287 */
crm_config_err("Group %s did not have any children", rsc->id);
return FALSE;
#else
crm_config_warn("Group %s did not have any children", rsc->id);
return TRUE;
#endif
}
pe_rsc_trace(rsc, "Added %d children to resource %s...", group_data->num_children, rsc->id);
return TRUE;
}
gboolean
group_active(resource_t * rsc, gboolean all)
{
gboolean c_all = TRUE;
gboolean c_any = FALSE;
GListPtr gIter = rsc->children;
for (; gIter != NULL; gIter = gIter->next) {
resource_t *child_rsc = (resource_t *) gIter->data;
if (child_rsc->fns->active(child_rsc, all)) {
c_any = TRUE;
} else {
c_all = FALSE;
}
}
if (c_any == FALSE) {
return FALSE;
} else if (all && c_all == FALSE) {
return FALSE;
}
return TRUE;
}
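The any/all semantics of group_active() above can be sketched in a minimal standalone model (a simplification, not the Pacemaker API: child activity is reduced to an int array):

```c
#include <assert.h>

/* Minimal model of group_active(): with all != 0 the group counts as
 * active only if every child is active; otherwise, if any child is. */
static int group_active_model(const int *child_active, int n, int all)
{
    int c_all = 1;   /* every child active so far */
    int c_any = 0;   /* at least one child active */
    int i;

    for (i = 0; i < n; i++) {
        if (child_active[i]) {
            c_any = 1;
        } else {
            c_all = 0;
        }
    }
    if (!c_any) {
        return 0;            /* nothing active at all */
    }
    if (all && !c_all) {
        return 0;            /* "all" requested, but some child is down */
    }
    return 1;
}
```

Note that a fully stopped group reports inactive regardless of the `all` flag, matching the early `c_any == FALSE` return in the real function.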
static void
group_print_xml(resource_t * rsc, const char *pre_text, long options, void *print_data)
{
GListPtr gIter = rsc->children;
char *child_text = crm_concat(pre_text, " ", ' ');
status_print("%s<group id=\"%s\" ", pre_text, rsc->id);
status_print("number_resources=\"%d\" ", g_list_length(rsc->children));
status_print(">\n");
for (; gIter != NULL; gIter = gIter->next) {
resource_t *child_rsc = (resource_t *) gIter->data;
child_rsc->fns->print(child_rsc, child_text, options, print_data);
}
status_print("%s</group>\n", pre_text);
free(child_text);
}
void
group_print(resource_t * rsc, const char *pre_text, long options, void *print_data)
{
char *child_text = NULL;
GListPtr gIter = rsc->children;
if (pre_text == NULL) {
pre_text = " ";
}
if (options & pe_print_xml) {
group_print_xml(rsc, pre_text, options, print_data);
return;
}
child_text = crm_concat(pre_text, " ", ' ');
status_print("%sResource Group: %s", pre_text ? pre_text : "", rsc->id);
if (options & pe_print_html) {
status_print("\n<ul>\n");
} else if ((options & pe_print_log) == 0) {
status_print("\n");
}
- for (; gIter != NULL; gIter = gIter->next) {
- resource_t *child_rsc = (resource_t *) gIter->data;
+ if (options & pe_print_brief) {
+ print_rscs_brief(rsc->children, child_text, options, print_data, TRUE);
- if (options & pe_print_html) {
- status_print("<li>\n");
- }
- child_rsc->fns->print(child_rsc, child_text, options, print_data);
- if (options & pe_print_html) {
- status_print("</li>\n");
+ } else {
+ for (; gIter != NULL; gIter = gIter->next) {
+ resource_t *child_rsc = (resource_t *) gIter->data;
+
+ if (options & pe_print_html) {
+ status_print("<li>\n");
+ }
+ child_rsc->fns->print(child_rsc, child_text, options, print_data);
+ if (options & pe_print_html) {
+ status_print("</li>\n");
+ }
}
}
if (options & pe_print_html) {
status_print("</ul>\n");
}
free(child_text);
}
void
group_free(resource_t * rsc)
{
GListPtr gIter = NULL;
CRM_CHECK(rsc != NULL, return);
gIter = rsc->children;
pe_rsc_trace(rsc, "Freeing %s", rsc->id);
for (; gIter != NULL; gIter = gIter->next) {
resource_t *child_rsc = (resource_t *) gIter->data;
pe_rsc_trace(child_rsc, "Freeing child %s", child_rsc->id);
child_rsc->fns->free(child_rsc);
}
pe_rsc_trace(rsc, "Freeing child list");
g_list_free(rsc->children);
common_free(rsc);
}
enum rsc_role_e
group_resource_state(const resource_t * rsc, gboolean current)
{
enum rsc_role_e group_role = RSC_ROLE_UNKNOWN;
GListPtr gIter = rsc->children;
for (; gIter != NULL; gIter = gIter->next) {
resource_t *child_rsc = (resource_t *) gIter->data;
enum rsc_role_e role = child_rsc->fns->state(child_rsc, current);
if (role > group_role) {
group_role = role;
}
}
pe_rsc_trace(rsc, "%s role: %s", rsc->id, role2text(group_role));
return group_role;
}
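group_resource_state() above relies on the role enum being ordered from least to most active, so "highest child role wins" is a plain max over the children. A minimal sketch of that reduction (the enum names here are illustrative stand-ins for RSC_ROLE_*):

```c
#include <assert.h>

/* Illustrative stand-in for the RSC_ROLE_* ordering: values ascend
 * from unknown through master, so a numeric max picks the group role. */
typedef enum {
    ROLE_UNKNOWN = 0,
    ROLE_STOPPED,
    ROLE_STARTED,
    ROLE_SLAVE,
    ROLE_MASTER
} role_t;

static role_t group_role_model(const role_t *children, int n)
{
    role_t group_role = ROLE_UNKNOWN;
    int i;

    for (i = 0; i < n; i++) {
        if (children[i] > group_role) {
            group_role = children[i];   /* keep the most active role seen */
        }
    }
    return group_role;
}
```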
diff --git a/lib/pengine/native.c b/lib/pengine/native.c
index adfd5bacd5..1c64c8f64d 100644
--- a/lib/pengine/native.c
+++ b/lib/pengine/native.c
@@ -1,609 +1,849 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#include <crm/pengine/rules.h>
#include <crm/pengine/status.h>
#include <crm/pengine/complex.h>
#include <crm/pengine/internal.h>
#include <unpack.h>
#include <crm/msg_xml.h>
#define VARIANT_NATIVE 1
#include "./variant.h"
void
native_add_running(resource_t * rsc, node_t * node, pe_working_set_t * data_set)
{
GListPtr gIter = rsc->running_on;
CRM_CHECK(node != NULL, return);
for (; gIter != NULL; gIter = gIter->next) {
node_t *a_node = (node_t *) gIter->data;
CRM_CHECK(a_node != NULL, return);
if (safe_str_eq(a_node->details->id, node->details->id)) {
return;
}
}
pe_rsc_trace(rsc, "Adding %s to %s %s", rsc->id, node->details->uname,
is_set(rsc->flags, pe_rsc_managed)?"":"(unmanaged)");
rsc->running_on = g_list_append(rsc->running_on, node);
if (rsc->variant == pe_native) {
node->details->running_rsc = g_list_append(node->details->running_rsc, rsc);
}
if (rsc->variant == pe_native && node->details->maintenance) {
clear_bit(rsc->flags, pe_rsc_managed);
}
if (is_not_set(rsc->flags, pe_rsc_managed)) {
resource_t *p = rsc->parent;
pe_rsc_info(rsc, "resource %s isn't managed", rsc->id);
resource_location(rsc, node, INFINITY, "not_managed_default", data_set);
while(p && node->details->online) {
/* add without the additional location constraint */
p->running_on = g_list_append(p->running_on, node);
p = p->parent;
}
return;
}
if (rsc->variant == pe_native && g_list_length(rsc->running_on) > 1) {
switch (rsc->recovery_type) {
case recovery_stop_only:
{
GHashTableIter gIter;
node_t *local_node = NULL;
/* make sure it doesn't come up again */
g_hash_table_destroy(rsc->allowed_nodes);
rsc->allowed_nodes = node_hash_from_list(data_set->nodes);
g_hash_table_iter_init(&gIter, rsc->allowed_nodes);
while (g_hash_table_iter_next(&gIter, NULL, (void **)&local_node)) {
local_node->weight = -INFINITY;
}
}
break;
case recovery_stop_start:
break;
case recovery_block:
clear_bit(rsc->flags, pe_rsc_managed);
set_bit(rsc->flags, pe_rsc_block);
break;
}
crm_debug("%s is active on %d nodes including %s: %s",
rsc->id, g_list_length(rsc->running_on), node->details->uname,
recovery2text(rsc->recovery_type));
} else {
pe_rsc_trace(rsc, "Resource %s is active on: %s", rsc->id, node->details->uname);
}
if (rsc->parent != NULL) {
native_add_running(rsc->parent, node, data_set);
}
}
extern void force_non_unique_clone(resource_t * rsc, const char *rid, pe_working_set_t * data_set);
gboolean
native_unpack(resource_t * rsc, pe_working_set_t * data_set)
{
resource_t *parent = uber_parent(rsc);
native_variant_data_t *native_data = NULL;
const char *class = crm_element_value(rsc->xml, XML_AGENT_ATTR_CLASS);
pe_rsc_trace(rsc, "Processing resource %s...", rsc->id);
native_data = calloc(1, sizeof(native_variant_data_t));
rsc->variant_opaque = native_data;
if (is_set(rsc->flags, pe_rsc_unique) && rsc->parent) {
if (safe_str_eq(class, "lsb")) {
resource_t *top = uber_parent(rsc);
force_non_unique_clone(top, rsc->id, data_set);
}
}
if (safe_str_eq(class, "ocf") == FALSE) {
const char *stateful = g_hash_table_lookup(parent->meta, "stateful");
if (safe_str_eq(stateful, XML_BOOLEAN_TRUE)) {
pe_err
("Resource %s is of type %s and therefore cannot be used as a master/slave resource",
rsc->id, class);
return FALSE;
}
}
return TRUE;
}
resource_t *
native_find_rsc(resource_t * rsc, const char *id, node_t * on_node, int flags)
{
gboolean match = FALSE;
resource_t *result = NULL;
GListPtr gIter = rsc->children;
CRM_ASSERT(id != NULL);
if (flags & pe_find_clone) {
const char *rid = ID(rsc->xml);
if (rsc->parent == NULL) {
match = FALSE;
} else if (safe_str_eq(rsc->id, id)) {
match = TRUE;
} else if (safe_str_eq(rid, id)) {
match = TRUE;
}
} else {
if (strcmp(rsc->id, id) == 0) {
match = TRUE;
} else if (is_set(flags, pe_find_renamed)
&& rsc->clone_name && strcmp(rsc->clone_name, id) == 0) {
match = TRUE;
}
}
if (match && on_node) {
pe_rsc_trace(rsc, "Now checking %s is on %s", rsc->id, on_node->details->uname);
if (is_set(flags, pe_find_current) && rsc->running_on) {
GListPtr gIter = rsc->running_on;
for (; gIter != NULL; gIter = gIter->next) {
node_t *loc = (node_t *) gIter->data;
if (loc->details == on_node->details) {
return rsc;
}
}
} else if (is_set(flags, pe_find_inactive) && rsc->running_on == NULL) {
return rsc;
} else if (is_not_set(flags, pe_find_current) && rsc->allocated_to
&& rsc->allocated_to->details == on_node->details) {
return rsc;
}
} else if (match) {
return rsc;
}
for (; gIter != NULL; gIter = gIter->next) {
resource_t *child = (resource_t *) gIter->data;
result = rsc->fns->find_rsc(child, id, on_node, flags);
if (result) {
return result;
}
}
return NULL;
}
char *
native_parameter(resource_t * rsc, node_t * node, gboolean create, const char *name,
pe_working_set_t * data_set)
{
char *value_copy = NULL;
const char *value = NULL;
GHashTable *hash = NULL;
GHashTable *local_hash = NULL;
CRM_CHECK(rsc != NULL, return NULL);
CRM_CHECK(name != NULL && strlen(name) != 0, return NULL);
hash = rsc->parameters;
pe_rsc_trace(rsc, "Looking up %s in %s", name, rsc->id);
if (create || g_hash_table_size(rsc->parameters) == 0) {
if (node != NULL) {
pe_rsc_trace(rsc, "Creating hash with node %s", node->details->uname);
} else {
pe_rsc_trace(rsc, "Creating default hash");
}
local_hash = g_hash_table_new_full(crm_str_hash, g_str_equal,
g_hash_destroy_str, g_hash_destroy_str);
get_rsc_attributes(local_hash, rsc, node, data_set);
hash = local_hash;
}
value = g_hash_table_lookup(hash, name);
if (value == NULL) {
/* try meta attributes instead */
value = g_hash_table_lookup(rsc->meta, name);
}
if (value != NULL) {
value_copy = strdup(value);
}
if (local_hash != NULL) {
g_hash_table_destroy(local_hash);
}
return value_copy;
}
gboolean
native_active(resource_t * rsc, gboolean all)
{
GListPtr gIter = rsc->running_on;
for (; gIter != NULL; gIter = gIter->next) {
node_t *a_node = (node_t *) gIter->data;
if (a_node->details->unclean) {
crm_debug("Resource %s: node %s is unclean", rsc->id, a_node->details->uname);
return TRUE;
} else if (a_node->details->online == FALSE) {
crm_debug("Resource %s: node %s is offline", rsc->id, a_node->details->uname);
} else {
crm_debug("Resource %s active on %s", rsc->id, a_node->details->uname);
return TRUE;
}
}
return FALSE;
}
struct print_data_s {
long options;
void *print_data;
};
static void
native_print_attr(gpointer key, gpointer value, gpointer user_data)
{
long options = ((struct print_data_s *)user_data)->options;
void *print_data = ((struct print_data_s *)user_data)->print_data;
status_print("Option: %s = %s\n", (char *)key, (char *)value);
}
+static const char *
+native_pending_state(resource_t * rsc)
+{
+ const char *pending_state = NULL;
+
+ if (safe_str_eq(rsc->pending_task, CRMD_ACTION_START)) {
+ pending_state = "Starting";
+
+ } else if (safe_str_eq(rsc->pending_task, CRMD_ACTION_STOP)) {
+ pending_state = "Stopping";
+
+ } else if (safe_str_eq(rsc->pending_task, CRMD_ACTION_MIGRATE)) {
+ pending_state = "Migrating";
+
+ } else if (safe_str_eq(rsc->pending_task, CRMD_ACTION_MIGRATED)) {
+ /* The source and target of a migration both display as "Migrating" */
+ pending_state = "Migrating";
+
+ } else if (safe_str_eq(rsc->pending_task, CRMD_ACTION_PROMOTE)) {
+ pending_state = "Promoting";
+
+ } else if (safe_str_eq(rsc->pending_task, CRMD_ACTION_DEMOTE)) {
+ pending_state = "Demoting";
+ }
+
+ return pending_state;
+}
+
+static const char *
+native_pending_task(resource_t * rsc)
+{
+ const char *pending_task = NULL;
+
+ if (safe_str_eq(rsc->pending_task, CRMD_ACTION_NOTIFY)) {
+ /* "Notifying" is not very useful to be shown. */
+ pending_task = NULL;
+
+ } else if (safe_str_eq(rsc->pending_task, CRMD_ACTION_STATUS)) {
+ pending_task = "Monitoring";
+
+ /* Comment this out until someone requests it */
+ /*
+ } else if (safe_str_eq(rsc->pending_task, "probe")) {
+ pending_task = "Checking";
+ */
+ }
+
+ return pending_task;
+}
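The two pending-task helpers above are essentially string-to-label maps; the same mapping can be written table-driven, which keeps the task/label pairs in one place. This is a sketch, not the patch's code: the literal task names stand in for the CRMD_ACTION_* constants.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Table-driven sketch of native_pending_state(): map an in-flight
 * task name to a display state, or NULL when there is none. */
static const char *pending_state_label(const char *task)
{
    static const struct {
        const char *task;
        const char *label;
    } map[] = {
        { "start",        "Starting"  },
        { "stop",         "Stopping"  },
        { "migrate_to",   "Migrating" },
        { "migrate_from", "Migrating" },  /* both migration halves display alike */
        { "promote",      "Promoting" },
        { "demote",       "Demoting"  },
    };
    size_t i;

    if (task != NULL) {
        for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
            if (strcmp(task, map[i].task) == 0) {
                return map[i].label;
            }
        }
    }
    return NULL;   /* unknown or no pending task: caller falls back to role2text() */
}
```

Returning NULL for unmapped tasks mirrors how the callers fall back to the role text when no pending state applies.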
+
static void
native_print_xml(resource_t * rsc, const char *pre_text, long options, void *print_data)
{
enum rsc_role_e role = rsc->role;
const char *class = crm_element_value(rsc->xml, XML_AGENT_ATTR_CLASS);
const char *prov = crm_element_value(rsc->xml, XML_AGENT_ATTR_PROVIDER);
+ const char *rsc_state = NULL;
if(role == RSC_ROLE_STARTED && uber_parent(rsc)->variant == pe_master) {
role = RSC_ROLE_SLAVE;
}
/* resource information. */
status_print("%s<resource ", pre_text);
status_print("id=\"%s\" ", rsc_printable_id(rsc));
status_print("resource_agent=\"%s%s%s:%s\" ",
class,
prov ? "::" : "", prov ? prov : "", crm_element_value(rsc->xml, XML_ATTR_TYPE));
- status_print("role=\"%s\" ", role2text(role));
+
+ if (options & pe_print_pending) {
+ rsc_state = native_pending_state(rsc);
+ }
+ if (rsc_state == NULL) {
+ rsc_state = role2text(role);
+ }
+ status_print("role=\"%s\" ", rsc_state);
status_print("active=\"%s\" ", rsc->fns->active(rsc, TRUE) ? "true" : "false");
status_print("orphaned=\"%s\" ", is_set(rsc->flags, pe_rsc_orphan) ? "true" : "false");
status_print("managed=\"%s\" ", is_set(rsc->flags, pe_rsc_managed) ? "true" : "false");
status_print("failed=\"%s\" ", is_set(rsc->flags, pe_rsc_failed) ? "true" : "false");
status_print("failure_ignored=\"%s\" ",
is_set(rsc->flags, pe_rsc_failure_ignored) ? "true" : "false");
status_print("nodes_running_on=\"%d\" ", g_list_length(rsc->running_on));
+ if (options & pe_print_pending) {
+ const char *pending_task = native_pending_task(rsc);
+
+ if (pending_task) {
+ status_print("pending=\"%s\" ", pending_task);
+ }
+ }
+
if (options & pe_print_dev) {
status_print("provisional=\"%s\" ",
is_set(rsc->flags, pe_rsc_provisional) ? "true" : "false");
status_print("runnable=\"%s\" ", is_set(rsc->flags, pe_rsc_runnable) ? "true" : "false");
status_print("priority=\"%f\" ", (double)rsc->priority);
status_print("variant=\"%s\" ", crm_element_name(rsc->xml));
}
/* print out the nodes this resource is running on */
if (options & pe_print_rsconly) {
status_print("/>\n");
/* do nothing */
} else if (g_list_length(rsc->running_on) > 0) {
GListPtr gIter = rsc->running_on;
status_print(">\n");
for (; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
status_print("%s <node name=\"%s\" id=\"%s\" cached=\"%s\"/>\n", pre_text,
node->details->uname, node->details->id,
node->details->online ? "false" : "true");
}
status_print("%s</resource>\n", pre_text);
} else {
status_print("/>\n");
}
}
void
native_print(resource_t * rsc, const char *pre_text, long options, void *print_data)
{
node_t *node = NULL;
const char *class = crm_element_value(rsc->xml, XML_AGENT_ATTR_CLASS);
const char *kind = crm_element_value(rsc->xml, XML_ATTR_TYPE);
int offset = 0;
char buffer[LINE_MAX];
CRM_ASSERT(rsc->variant == pe_native);
CRM_ASSERT(kind != NULL);
if (rsc->meta) {
const char *is_internal = g_hash_table_lookup(rsc->meta, XML_RSC_ATTR_INTERNAL_RSC);
if (crm_is_true(is_internal)) {
crm_trace("skipping print of internal resource %s", rsc->id);
return;
}
}
if (pre_text == NULL && (options & pe_print_printf)) {
pre_text = " ";
}
if (options & pe_print_xml) {
native_print_xml(rsc, pre_text, options, print_data);
return;
}
if (rsc->running_on != NULL) {
node = rsc->running_on->data;
}
if ((options & pe_print_rsconly) || g_list_length(rsc->running_on) > 1) {
node = NULL;
}
if (options & pe_print_html) {
if (is_not_set(rsc->flags, pe_rsc_managed)) {
status_print("<font color=\"yellow\">");
} else if (is_set(rsc->flags, pe_rsc_failed)) {
status_print("<font color=\"red\">");
} else if (rsc->variant == pe_native && g_list_length(rsc->running_on) == 0) {
status_print("<font color=\"red\">");
} else if (g_list_length(rsc->running_on) > 1) {
status_print("<font color=\"orange\">");
} else if (is_set(rsc->flags, pe_rsc_failure_ignored)) {
status_print("<font color=\"yellow\">");
} else {
status_print("<font color=\"green\">");
}
}
if(pre_text) {
offset += snprintf(buffer + offset, LINE_MAX - offset, "%s", pre_text);
}
offset += snprintf(buffer + offset, LINE_MAX - offset, "%s", rsc_printable_id(rsc));
offset += snprintf(buffer + offset, LINE_MAX - offset, "\t(%s", class);
if (safe_str_eq(class, "ocf")) {
const char *prov = crm_element_value(rsc->xml, XML_AGENT_ATTR_PROVIDER);
offset += snprintf(buffer + offset, LINE_MAX - offset, "::%s", prov);
}
offset += snprintf(buffer + offset, LINE_MAX - offset, ":%s):\t", kind);
if(is_set(rsc->flags, pe_rsc_orphan)) {
offset += snprintf(buffer + offset, LINE_MAX - offset, " ORPHANED ");
}
if(rsc->role > RSC_ROLE_SLAVE && is_set(rsc->flags, pe_rsc_failed)) {
offset += snprintf(buffer + offset, LINE_MAX - offset, "FAILED %s ", role2text(rsc->role));
} else if(is_set(rsc->flags, pe_rsc_failed)) {
offset += snprintf(buffer + offset, LINE_MAX - offset, "FAILED ");
} else {
- offset += snprintf(buffer + offset, LINE_MAX - offset, "%s ", role2text(rsc->role));
+ const char *rsc_state = NULL;
+
+ if (options & pe_print_pending) {
+ rsc_state = native_pending_state(rsc);
+ }
+ if (rsc_state == NULL) {
+ rsc_state = role2text(rsc->role);
+ }
+ offset += snprintf(buffer + offset, LINE_MAX - offset, "%s ", rsc_state);
}
if(node) {
offset += snprintf(buffer + offset, LINE_MAX - offset, "%s ", node->details->uname);
}
+
+ if (options & pe_print_pending) {
+ const char *pending_task = native_pending_task(rsc);
+
+ if (pending_task) {
+ offset += snprintf(buffer + offset, LINE_MAX - offset, "(%s) ", pending_task);
+ }
+ }
+
if(is_not_set(rsc->flags, pe_rsc_managed)) {
offset += snprintf(buffer + offset, LINE_MAX - offset, "(unmanaged) ");
}
if(is_set(rsc->flags, pe_rsc_failure_ignored)) {
offset += snprintf(buffer + offset, LINE_MAX - offset, "(failure ignored)");
}
if ((options & pe_print_rsconly) || g_list_length(rsc->running_on) > 1) {
const char *desc = crm_element_value(rsc->xml, XML_ATTR_DESC);
if(desc) {
offset += snprintf(buffer + offset, LINE_MAX - offset, "%s", desc);
}
}
status_print("%s", buffer);
#if CURSES_ENABLED
if ((options & pe_print_rsconly) || g_list_length(rsc->running_on) > 1) {
/* Done */
} else if (options & pe_print_ncurses) {
/* coverity[negative_returns] False positive */
move(-1, 0);
}
#endif
if (options & pe_print_html) {
status_print(" </font> ");
}
if ((options & pe_print_rsconly)) {
} else if (g_list_length(rsc->running_on) > 1) {
GListPtr gIter = rsc->running_on;
int counter = 0;
if (options & pe_print_html) {
status_print("<ul>\n");
} else if ((options & pe_print_printf)
|| (options & pe_print_ncurses)) {
status_print("[");
}
for (; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
counter++;
if (options & pe_print_html) {
status_print("<li>\n%s", node->details->uname);
} else if ((options & pe_print_printf)
|| (options & pe_print_ncurses)) {
status_print(" %s", node->details->uname);
} else if ((options & pe_print_log)) {
status_print("\t%d : %s", counter, node->details->uname);
} else {
status_print("%s", node->details->uname);
}
if (options & pe_print_html) {
status_print("</li>\n");
}
}
if (options & pe_print_html) {
status_print("</ul>\n");
} else if ((options & pe_print_printf)
|| (options & pe_print_ncurses)) {
status_print(" ]");
}
}
if (options & pe_print_html) {
status_print("<br/>\n");
} else if (options & pe_print_suppres_nl) {
/* nothing */
} else if ((options & pe_print_printf) || (options & pe_print_ncurses)) {
status_print("\n");
}
if (options & pe_print_details) {
struct print_data_s pdata;
pdata.options = options;
pdata.print_data = print_data;
g_hash_table_foreach(rsc->parameters, native_print_attr, &pdata);
}
if (options & pe_print_dev) {
GHashTableIter iter;
node_t *node = NULL;
status_print("%s\t(%s%svariant=%s, priority=%f)", pre_text,
is_set(rsc->flags, pe_rsc_provisional) ? "provisional, " : "",
is_set(rsc->flags, pe_rsc_runnable) ? "" : "non-startable, ",
crm_element_name(rsc->xml), (double)rsc->priority);
status_print("%s\tAllowed Nodes", pre_text);
g_hash_table_iter_init(&iter, rsc->allowed_nodes);
while (g_hash_table_iter_next(&iter, NULL, (void **)&node)) {
status_print("%s\t * %s %d", pre_text, node->details->uname, node->weight);
}
}
if (options & pe_print_max_details) {
GHashTableIter iter;
node_t *node = NULL;
status_print("%s\t=== Allowed Nodes\n", pre_text);
g_hash_table_iter_init(&iter, rsc->allowed_nodes);
while (g_hash_table_iter_next(&iter, NULL, (void **)&node)) {
print_node("\t", node, FALSE);
}
}
}
void
native_free(resource_t * rsc)
{
pe_rsc_trace(rsc, "Freeing resource action list (not the data)");
common_free(rsc);
}
enum rsc_role_e
native_resource_state(const resource_t * rsc, gboolean current)
{
enum rsc_role_e role = rsc->next_role;
if (current) {
role = rsc->role;
}
pe_rsc_trace(rsc, "%s state: %s", rsc->id, role2text(role));
return role;
}
node_t *
native_location(resource_t * rsc, GListPtr * list, gboolean current)
{
node_t *one = NULL;
GListPtr result = NULL;
if (rsc->children) {
GListPtr gIter = rsc->children;
for (; gIter != NULL; gIter = gIter->next) {
resource_t *child = (resource_t *) gIter->data;
child->fns->location(child, &result, current);
}
} else if (current && rsc->running_on) {
result = g_list_copy(rsc->running_on);
} else if (current == FALSE && rsc->allocated_to) {
result = g_list_append(NULL, rsc->allocated_to);
}
if (result && g_list_length(result) == 1) {
one = g_list_nth_data(result, 0);
}
if (list) {
GListPtr gIter = result;
for (; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
if (*list == NULL || pe_find_node_id(*list, node->details->id) == NULL) {
*list = g_list_append(*list, node);
}
}
}
g_list_free(result);
return one;
}
+
+static void
+get_rscs_brief(GListPtr rsc_list, GHashTable * rsc_table, GHashTable * active_table)
+{
+ GListPtr gIter = rsc_list;
+
+ for (; gIter != NULL; gIter = gIter->next) {
+ resource_t *rsc = (resource_t *) gIter->data;
+
+ const char *class = crm_element_value(rsc->xml, XML_AGENT_ATTR_CLASS);
+ const char *kind = crm_element_value(rsc->xml, XML_ATTR_TYPE);
+
+ int offset = 0;
+ char buffer[LINE_MAX];
+
+ int *rsc_counter = NULL;
+ int *active_counter = NULL;
+
+ if (rsc->variant != pe_native) {
+ continue;
+ }
+
+ offset += snprintf(buffer + offset, LINE_MAX - offset, "%s", class);
+ if (safe_str_eq(class, "ocf")) {
+ const char *prov = crm_element_value(rsc->xml, XML_AGENT_ATTR_PROVIDER);
+ offset += snprintf(buffer + offset, LINE_MAX - offset, "::%s", prov);
+ }
+ offset += snprintf(buffer + offset, LINE_MAX - offset, ":%s", kind);
+
+ if (rsc_table) {
+ rsc_counter = g_hash_table_lookup(rsc_table, buffer);
+ if (rsc_counter == NULL) {
+ rsc_counter = calloc(1, sizeof(int));
+ *rsc_counter = 0;
+ g_hash_table_insert(rsc_table, strdup(buffer), rsc_counter);
+ }
+ (*rsc_counter)++;
+ }
+
+ if (active_table) {
+ GListPtr gIter2 = rsc->running_on;
+
+ for (; gIter2 != NULL; gIter2 = gIter2->next) {
+ node_t *node = (node_t *) gIter2->data;
+ GHashTable *node_table = NULL;
+
+ if (node->details->unclean == FALSE && node->details->online == FALSE) {
+ continue;
+ }
+
+ node_table = g_hash_table_lookup(active_table, node->details->uname);
+ if (node_table == NULL) {
+ node_table = g_hash_table_new_full(crm_str_hash, g_str_equal, free, free);
+ g_hash_table_insert(active_table, strdup(node->details->uname), node_table);
+ }
+
+ active_counter = g_hash_table_lookup(node_table, buffer);
+ if (active_counter == NULL) {
+ active_counter = calloc(1, sizeof(int));
+ *active_counter = 0;
+ g_hash_table_insert(node_table, strdup(buffer), active_counter);
+ }
+ (*active_counter)++;
+ }
+ }
+ }
+}
+
+static void
+destroy_node_table(gpointer data)
+{
+ GHashTable *node_table = data;
+
+ if (node_table) {
+ g_hash_table_destroy(node_table);
+ }
+}
+
+void
+print_rscs_brief(GListPtr rsc_list, const char *pre_text, long options,
+ void *print_data, gboolean print_all)
+{
+ GHashTable *rsc_table = g_hash_table_new_full(crm_str_hash, g_str_equal, free, free);
+ GHashTable *active_table = g_hash_table_new_full(crm_str_hash, g_str_equal,
+ free, destroy_node_table);
+ GHashTableIter hash_iter;
+ char *type = NULL;
+ int *rsc_counter = NULL;
+
+ get_rscs_brief(rsc_list, rsc_table, active_table);
+
+ g_hash_table_iter_init(&hash_iter, rsc_table);
+ while (g_hash_table_iter_next(&hash_iter, (gpointer *)&type, (gpointer *)&rsc_counter)) {
+ GHashTableIter hash_iter2;
+ char *node_name = NULL;
+ GHashTable *node_table = NULL;
+ int active_counter_all = 0;
+
+ g_hash_table_iter_init(&hash_iter2, active_table);
+ while (g_hash_table_iter_next(&hash_iter2, (gpointer *)&node_name, (gpointer *)&node_table)) {
+ int *active_counter = g_hash_table_lookup(node_table, type);
+
+ if (active_counter == NULL || *active_counter == 0) {
+ continue;
+
+ } else {
+ active_counter_all += *active_counter;
+ }
+
+ if (options & pe_print_rsconly) {
+ node_name = NULL;
+ }
+
+ if (options & pe_print_html) {
+ status_print("<li>\n");
+ }
+
+ if (print_all) {
+ status_print("%s%d/%d\t(%s):\tActive %s\n", pre_text ? pre_text : "",
+ active_counter ? *active_counter : 0,
+ rsc_counter ? *rsc_counter : 0, type,
+ active_counter && (*active_counter > 0) && node_name ? node_name : "");
+ } else {
+ status_print("%s%d\t(%s):\tActive %s\n", pre_text ? pre_text : "",
+ active_counter ? *active_counter : 0, type,
+ active_counter && (*active_counter > 0) && node_name ? node_name : "");
+ }
+
+ if (options & pe_print_html) {
+ status_print("</li>\n");
+ }
+ }
+
+ if (print_all && active_counter_all == 0) {
+ if (options & pe_print_html) {
+ status_print("<li>\n");
+ }
+
+ status_print("%s%d/%d\t(%s):\tActive\n", pre_text ? pre_text : "",
+ active_counter_all,
+ rsc_counter ? *rsc_counter : 0, type);
+
+ if (options & pe_print_html) {
+ status_print("</li>\n");
+ }
+ }
+ }
+
+ if (rsc_table) {
+ g_hash_table_destroy(rsc_table);
+ rsc_table = NULL;
+ }
+ if (active_table) {
+ g_hash_table_destroy(active_table);
+ active_table = NULL;
+ }
+}
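The core of get_rscs_brief() above is building a "class[::provider]:type" key per resource and counting occurrences per key. A minimal model of that aggregation, using a flat array in place of GHashTable (names and struct layout here are hypothetical, for illustration only):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for get_rscs_brief(): count resources per
 * "class[::provider]:type" key, a flat array replacing the GHashTable. */
struct rsc_info  { const char *class; const char *prov; const char *type; };
struct rsc_count { char key[256]; int n; };

static int count_brief(const struct rsc_info *rscs, int nrsc,
                       struct rsc_count *out, int max)
{
    int used = 0;
    int i, j;

    for (i = 0; i < nrsc; i++) {
        char key[256];

        if (rscs[i].prov) {   /* ocf agents carry a provider segment */
            snprintf(key, sizeof(key), "%s::%s:%s",
                     rscs[i].class, rscs[i].prov, rscs[i].type);
        } else {
            snprintf(key, sizeof(key), "%s:%s", rscs[i].class, rscs[i].type);
        }

        for (j = 0; j < used; j++) {
            if (strcmp(out[j].key, key) == 0) {
                out[j].n++;            /* existing key: bump its counter */
                break;
            }
        }
        if (j == used && used < max) { /* new key: start a counter at 1 */
            snprintf(out[used].key, sizeof(out[used].key), "%s", key);
            out[used].n = 1;
            used++;
        }
    }
    return used;   /* number of distinct agent types seen */
}
```

The real function keeps a second, per-node table for active counts; the key construction and increment-or-insert pattern are the same.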
diff --git a/lib/pengine/unpack.c b/lib/pengine/unpack.c
index af15e4bec7..b25e10c9ab 100644
--- a/lib/pengine/unpack.c
+++ b/lib/pengine/unpack.c
@@ -1,3082 +1,3093 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#include <glib.h>
#include <crm/crm.h>
#include <crm/services.h>
#include <crm/msg_xml.h>
#include <crm/common/xml.h>
#include <crm/common/util.h>
#include <crm/pengine/rules.h>
#include <crm/pengine/internal.h>
#include <unpack.h>
CRM_TRACE_INIT_DATA(pe_status);
#define set_config_flag(data_set, option, flag) do { \
const char *tmp = pe_pref(data_set->config_hash, option); \
if(tmp) { \
if(crm_is_true(tmp)) { \
set_bit(data_set->flags, flag); \
} else { \
clear_bit(data_set->flags, flag); \
} \
} \
} while(0)
gboolean unpack_rsc_op(resource_t * rsc, node_t * node, xmlNode * xml_op,
enum action_fail_response *failed, pe_working_set_t * data_set);
static gboolean determine_remote_online_status(node_t * this_node);
static void
pe_fence_node(pe_working_set_t * data_set, node_t * node, const char *reason)
{
CRM_CHECK(node, return);
/* fence remote nodes living in a container by marking the container as failed. */
if (is_container_remote_node(node)) {
resource_t *rsc = node->details->remote_rsc->container;
if (is_set(rsc->flags, pe_rsc_failed) == FALSE) {
crm_warn("Remote node %s will be fenced by recovering container resource %s (%s)",
node->details->uname, rsc->id, reason);
set_bit(rsc->flags, pe_rsc_failed);
}
} else if (node->details->unclean == FALSE) {
if(pe_can_fence(data_set, node)) {
crm_warn("Node %s will be fenced %s", node->details->uname, reason);
} else {
crm_warn("Node %s is unclean %s", node->details->uname, reason);
}
node->details->unclean = TRUE;
}
}
gboolean
unpack_config(xmlNode * config, pe_working_set_t * data_set)
{
const char *value = NULL;
GHashTable *config_hash =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, g_hash_destroy_str);
data_set->config_hash = config_hash;
unpack_instance_attributes(data_set->input, config, XML_CIB_TAG_PROPSET, NULL, config_hash,
CIB_OPTIONS_FIRST, FALSE, data_set->now);
verify_pe_options(data_set->config_hash);
set_config_flag(data_set, "enable-startup-probes", pe_flag_startup_probes);
if(is_not_set(data_set->flags, pe_flag_startup_probes)) {
crm_info("Startup probes: disabled (dangerous)");
}
value = pe_pref(data_set->config_hash, "stonith-timeout");
data_set->stonith_timeout = crm_get_msec(value);
crm_debug("STONITH timeout: %d", data_set->stonith_timeout);
set_config_flag(data_set, "stonith-enabled", pe_flag_stonith_enabled);
crm_debug("STONITH of failed nodes is %s",
is_set(data_set->flags, pe_flag_stonith_enabled) ? "enabled" : "disabled");
data_set->stonith_action = pe_pref(data_set->config_hash, "stonith-action");
crm_trace("STONITH will %s nodes", data_set->stonith_action);
set_config_flag(data_set, "stop-all-resources", pe_flag_stop_everything);
crm_debug("Stop all active resources: %s",
is_set(data_set->flags, pe_flag_stop_everything) ? "true" : "false");
set_config_flag(data_set, "symmetric-cluster", pe_flag_symmetric_cluster);
if (is_set(data_set->flags, pe_flag_symmetric_cluster)) {
crm_debug("Cluster is symmetric" " - resources can run anywhere by default");
}
value = pe_pref(data_set->config_hash, "default-resource-stickiness");
data_set->default_resource_stickiness = char2score(value);
crm_debug("Default stickiness: %d", data_set->default_resource_stickiness);
value = pe_pref(data_set->config_hash, "no-quorum-policy");
if (safe_str_eq(value, "ignore")) {
data_set->no_quorum_policy = no_quorum_ignore;
} else if (safe_str_eq(value, "freeze")) {
data_set->no_quorum_policy = no_quorum_freeze;
} else if (safe_str_eq(value, "suicide")) {
gboolean do_panic = FALSE;
crm_element_value_int(data_set->input, XML_ATTR_QUORUM_PANIC, &do_panic);
if (is_set(data_set->flags, pe_flag_stonith_enabled) == FALSE) {
crm_config_err
("Setting no-quorum-policy=suicide makes no sense if stonith-enabled=false");
}
if (do_panic && is_set(data_set->flags, pe_flag_stonith_enabled)) {
data_set->no_quorum_policy = no_quorum_suicide;
} else if (is_set(data_set->flags, pe_flag_have_quorum) == FALSE && do_panic == FALSE) {
crm_notice("Resetting no-quorum-policy to 'stop': The cluster has never had quorum");
data_set->no_quorum_policy = no_quorum_stop;
}
} else {
data_set->no_quorum_policy = no_quorum_stop;
}
switch (data_set->no_quorum_policy) {
case no_quorum_freeze:
crm_debug("On loss of CCM Quorum: Freeze resources");
break;
case no_quorum_stop:
crm_debug("On loss of CCM Quorum: Stop ALL resources");
break;
case no_quorum_suicide:
crm_notice("On loss of CCM Quorum: Fence all remaining nodes");
break;
case no_quorum_ignore:
crm_notice("On loss of CCM Quorum: Ignore");
break;
}
set_config_flag(data_set, "stop-orphan-resources", pe_flag_stop_rsc_orphans);
crm_trace("Orphan resources are %s",
is_set(data_set->flags, pe_flag_stop_rsc_orphans) ? "stopped" : "ignored");
set_config_flag(data_set, "stop-orphan-actions", pe_flag_stop_action_orphans);
crm_trace("Orphan resource actions are %s",
is_set(data_set->flags, pe_flag_stop_action_orphans) ? "stopped" : "ignored");
set_config_flag(data_set, "remove-after-stop", pe_flag_remove_after_stop);
crm_trace("Stopped resources are removed from the status section: %s",
is_set(data_set->flags, pe_flag_remove_after_stop) ? "true" : "false");
set_config_flag(data_set, "maintenance-mode", pe_flag_maintenance_mode);
crm_trace("Maintenance mode: %s",
is_set(data_set->flags, pe_flag_maintenance_mode) ? "true" : "false");
if (is_set(data_set->flags, pe_flag_maintenance_mode)) {
clear_bit(data_set->flags, pe_flag_is_managed_default);
} else {
set_config_flag(data_set, "is-managed-default", pe_flag_is_managed_default);
}
crm_trace("By default resources are %smanaged",
is_set(data_set->flags, pe_flag_is_managed_default) ? "" : "not ");
set_config_flag(data_set, "start-failure-is-fatal", pe_flag_start_failure_fatal);
crm_trace("Start failures are %s",
is_set(data_set->flags,
pe_flag_start_failure_fatal) ? "always fatal" : "handled by failcount");
node_score_red = char2score(pe_pref(data_set->config_hash, "node-health-red"));
node_score_green = char2score(pe_pref(data_set->config_hash, "node-health-green"));
node_score_yellow = char2score(pe_pref(data_set->config_hash, "node-health-yellow"));
crm_debug("Node scores: 'red' = %s, 'yellow' = %s, 'green' = %s",
pe_pref(data_set->config_hash, "node-health-red"),
pe_pref(data_set->config_hash, "node-health-yellow"),
pe_pref(data_set->config_hash, "node-health-green"));
data_set->placement_strategy = pe_pref(data_set->config_hash, "placement-strategy");
crm_trace("Placement strategy: %s", data_set->placement_strategy);
return TRUE;
}
static void
destroy_digest_cache(gpointer ptr)
{
op_digest_cache_t *data = ptr;
free_xml(data->params_all);
free_xml(data->params_restart);
free(data->digest_all_calc);
free(data->digest_restart_calc);
free(data);
}
static node_t *
create_node(const char *id, const char *uname, const char *type, const char *score, pe_working_set_t * data_set)
{
node_t *new_node = NULL;
if (pe_find_node(data_set->nodes, uname) != NULL) {
crm_config_warn("Detected multiple node entries with uname=%s"
" - this is rarely intended", uname);
}
new_node = calloc(1, sizeof(node_t));
if (new_node == NULL) {
return NULL;
}
new_node->weight = char2score(score);
new_node->fixed = FALSE;
new_node->details = calloc(1, sizeof(struct node_shared_s));
if (new_node->details == NULL) {
free(new_node);
return NULL;
}
crm_trace("Creating node for entry %s/%s", uname, id);
new_node->details->id = id;
new_node->details->uname = uname;
new_node->details->online = FALSE;
new_node->details->shutdown = FALSE;
new_node->details->running_rsc = NULL;
new_node->details->type = node_ping;
if (safe_str_eq(type, "remote")) {
new_node->details->type = node_remote;
set_bit(data_set->flags, pe_flag_have_remote_nodes);
} else if (type == NULL || safe_str_eq(type, "member")
|| safe_str_eq(type, NORMALNODE)) {
new_node->details->type = node_member;
}
new_node->details->attrs = g_hash_table_new_full(crm_str_hash, g_str_equal,
g_hash_destroy_str,
g_hash_destroy_str);
new_node->details->utilization =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str,
g_hash_destroy_str);
new_node->details->digest_cache =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str,
destroy_digest_cache);
data_set->nodes = g_list_insert_sorted(data_set->nodes, new_node, sort_node_uname);
return new_node;
}
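/* Illustrative sketch (not generated verbatim; all IDs here are hypothetical)
 * of what expand_remote_rsc_meta() below does. A resource whose meta
 * attributes name a remote node, e.g.:
 *
 *   <primitive id="vm1" ...>
 *     <meta_attributes id="vm1-meta">
 *       <nvpair id="vm1-meta-remote" name="remote-node" value="guest1"/>
 *     </meta_attributes>
 *   </primitive>
 *
 * is expanded with an implicit connection resource roughly equivalent to:
 *
 *   <primitive id="guest1" class="ocf" provider="pacemaker" type="remote">
 *     <operations>
 *       <op name="monitor" interval="30s" timeout="30s" .../>
 *       <op name="start" interval="0" timeout="60s" .../>
 *     </operations>
 *   </primitive>
 *
 * where the start timeout comes from remote-connect-timeout (default 60s),
 * and remote-addr/remote-port, when set, become "addr"/"port" instance
 * attributes of the connection resource.
 */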
static const char *
expand_remote_rsc_meta(xmlNode *xml_obj, xmlNode *parent, GHashTable **rsc_name_check)
{
xmlNode *xml_rsc = NULL;
xmlNode *xml_tmp = NULL;
xmlNode *attr_set = NULL;
xmlNode *attr = NULL;
const char *container_id = ID(xml_obj);
const char *remote_name = NULL;
const char *remote_server = NULL;
const char *remote_port = NULL;
const char *connect_timeout = "60s";
char *tmp_id = NULL;
for (attr_set = __xml_first_child(xml_obj); attr_set != NULL; attr_set = __xml_next(attr_set)) {
if (safe_str_neq((const char *)attr_set->name, XML_TAG_META_SETS)) {
continue;
}
for (attr = __xml_first_child(attr_set); attr != NULL; attr = __xml_next(attr)) {
const char *value = crm_element_value(attr, XML_NVPAIR_ATTR_VALUE);
const char *name = crm_element_value(attr, XML_NVPAIR_ATTR_NAME);
if (safe_str_eq(name, "remote-node")) {
remote_name = value;
} else if (safe_str_eq(name, "remote-addr")) {
remote_server = value;
} else if (safe_str_eq(name, "remote-port")) {
remote_port = value;
} else if (safe_str_eq(name, "remote-connect-timeout")) {
connect_timeout = value;
}
}
}
if (remote_name == NULL) {
return NULL;
}
if (*rsc_name_check == NULL) {
*rsc_name_check = g_hash_table_new(crm_str_hash, g_str_equal);
for (xml_rsc = __xml_first_child(parent); xml_rsc != NULL; xml_rsc = __xml_next(xml_rsc)) {
const char *id = ID(xml_rsc);
/* avoid a heap allocation here: this table's lifetime is shorter than
 * that of the XML the keys point into */
g_hash_table_insert(*rsc_name_check, (char *) id, (char *) id);
}
}
if (g_hash_table_lookup(*rsc_name_check, remote_name)) {
crm_err("Naming conflict with remote-node=%s: a remote node cannot have the same name as a resource",
        remote_name);
return NULL;
}
xml_rsc = create_xml_node(parent, XML_CIB_TAG_RESOURCE);
crm_xml_add(xml_rsc, XML_ATTR_ID, remote_name);
crm_xml_add(xml_rsc, XML_AGENT_ATTR_CLASS, "ocf");
crm_xml_add(xml_rsc, XML_AGENT_ATTR_PROVIDER, "pacemaker");
crm_xml_add(xml_rsc, XML_ATTR_TYPE, "remote");
xml_tmp = create_xml_node(xml_rsc, XML_TAG_META_SETS);
tmp_id = crm_concat(remote_name, XML_TAG_META_SETS, '_');
crm_xml_add(xml_tmp, XML_ATTR_ID, tmp_id);
free(tmp_id);
attr = create_xml_node(xml_tmp, XML_CIB_TAG_NVPAIR);
tmp_id = crm_concat(remote_name, "meta-attributes-container", '_');
crm_xml_add(attr, XML_ATTR_ID, tmp_id);
crm_xml_add(attr, XML_NVPAIR_ATTR_NAME, XML_RSC_ATTR_CONTAINER);
crm_xml_add(attr, XML_NVPAIR_ATTR_VALUE, container_id);
free(tmp_id);
attr = create_xml_node(xml_tmp, XML_CIB_TAG_NVPAIR);
tmp_id = crm_concat(remote_name, "meta-attributes-internal", '_');
crm_xml_add(attr, XML_ATTR_ID, tmp_id);
crm_xml_add(attr, XML_NVPAIR_ATTR_NAME, XML_RSC_ATTR_INTERNAL_RSC);
crm_xml_add(attr, XML_NVPAIR_ATTR_VALUE, "true");
free(tmp_id);
xml_tmp = create_xml_node(xml_rsc, "operations");
attr = create_xml_node(xml_tmp, XML_ATTR_OP);
tmp_id = crm_concat(remote_name, "monitor-interval-30s", '_');
crm_xml_add(attr, XML_ATTR_ID, tmp_id);
crm_xml_add(attr, XML_ATTR_TIMEOUT, "30s");
crm_xml_add(attr, XML_LRM_ATTR_INTERVAL, "30s");
crm_xml_add(attr, XML_NVPAIR_ATTR_NAME, "monitor");
free(tmp_id);
if (connect_timeout) {
attr = create_xml_node(xml_tmp, XML_ATTR_OP);
tmp_id = crm_concat(remote_name, "start-interval-0", '_');
crm_xml_add(attr, XML_ATTR_ID, tmp_id);
crm_xml_add(attr, XML_ATTR_TIMEOUT, connect_timeout);
crm_xml_add(attr, XML_LRM_ATTR_INTERVAL, "0");
crm_xml_add(attr, XML_NVPAIR_ATTR_NAME, "start");
free(tmp_id);
}
if (remote_port || remote_server) {
xml_tmp = create_xml_node(xml_rsc, XML_TAG_ATTR_SETS);
tmp_id = crm_concat(remote_name, XML_TAG_ATTR_SETS, '_');
crm_xml_add(xml_tmp, XML_ATTR_ID, tmp_id);
free(tmp_id);
if (remote_server) {
attr = create_xml_node(xml_tmp, XML_CIB_TAG_NVPAIR);
tmp_id = crm_concat(remote_name, "instance-attributes-addr", '_');
crm_xml_add(attr, XML_ATTR_ID, tmp_id);
crm_xml_add(attr, XML_NVPAIR_ATTR_NAME, "addr");
crm_xml_add(attr, XML_NVPAIR_ATTR_VALUE, remote_server);
free(tmp_id);
}
if (remote_port) {
attr = create_xml_node(xml_tmp, XML_CIB_TAG_NVPAIR);
tmp_id = crm_concat(remote_name, "instance-attributes-port", '_');
crm_xml_add(attr, XML_ATTR_ID, tmp_id);
crm_xml_add(attr, XML_NVPAIR_ATTR_NAME, "port");
crm_xml_add(attr, XML_NVPAIR_ATTR_VALUE, remote_port);
free(tmp_id);
}
}
return remote_name;
}
static void
handle_startup_fencing(pe_working_set_t *data_set, node_t *new_node)
{
static const char *blind_faith = NULL;
static gboolean unseen_are_unclean = TRUE;
static gboolean init_startup_fence_params = FALSE;
if (init_startup_fence_params == FALSE) {
blind_faith = pe_pref(data_set->config_hash, "startup-fencing");
init_startup_fence_params = TRUE;
if (crm_is_true(blind_faith) == FALSE) {
unseen_are_unclean = FALSE;
crm_warn("Blind faith: not fencing unseen nodes");
}
}
if (is_set(data_set->flags, pe_flag_stonith_enabled) == FALSE
|| unseen_are_unclean == FALSE) {
/* blind faith... */
new_node->details->unclean = FALSE;
} else {
/* all nodes are unclean until we've seen their
* status entry
*/
new_node->details->unclean = TRUE;
}
/* We need to be able to determine if a node's status section
* exists or not separate from whether the node is unclean. */
new_node->details->unseen = TRUE;
}
gboolean
unpack_nodes(xmlNode * xml_nodes, pe_working_set_t * data_set)
{
xmlNode *xml_obj = NULL;
node_t *new_node = NULL;
const char *id = NULL;
const char *uname = NULL;
const char *type = NULL;
const char *score = NULL;
for (xml_obj = __xml_first_child(xml_nodes); xml_obj != NULL; xml_obj = __xml_next(xml_obj)) {
if (crm_str_eq((const char *)xml_obj->name, XML_CIB_TAG_NODE, TRUE)) {
new_node = NULL;
id = crm_element_value(xml_obj, XML_ATTR_ID);
uname = crm_element_value(xml_obj, XML_ATTR_UNAME);
type = crm_element_value(xml_obj, XML_ATTR_TYPE);
score = crm_element_value(xml_obj, XML_RULE_ATTR_SCORE);
crm_trace("Processing node %s/%s", uname, id);
if (id == NULL) {
crm_config_err("Must specify id tag in <node>");
continue;
}
new_node = create_node(id, uname, type, score, data_set);
if (new_node == NULL) {
return FALSE;
}
/* if(data_set->have_quorum == FALSE */
/* && data_set->no_quorum_policy == no_quorum_stop) { */
/* /\* start shutting resources down *\/ */
/* new_node->weight = -INFINITY; */
/* } */
handle_startup_fencing(data_set, new_node);
add_node_attrs(xml_obj, new_node, FALSE, data_set);
unpack_instance_attributes(data_set->input, xml_obj, XML_TAG_UTILIZATION, NULL,
new_node->details->utilization, NULL, FALSE, data_set->now);
crm_trace("Done with node %s", crm_element_value(xml_obj, XML_ATTR_UNAME));
}
}
if (data_set->localhost && pe_find_node(data_set->nodes, data_set->localhost) == NULL) {
crm_info("Creating a fake local node");
create_node(data_set->localhost, data_set->localhost, NULL, NULL, data_set);
}
return TRUE;
}
static void
g_hash_destroy_node_list(gpointer data)
{
GListPtr domain = data;
g_list_free_full(domain, free);
}
gboolean
unpack_domains(xmlNode * xml_domains, pe_working_set_t * data_set)
{
const char *id = NULL;
GListPtr domain = NULL;
xmlNode *xml_node = NULL;
xmlNode *xml_domain = NULL;
crm_debug("Unpacking domains");
data_set->domains =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str,
g_hash_destroy_node_list);
for (xml_domain = __xml_first_child(xml_domains); xml_domain != NULL;
xml_domain = __xml_next(xml_domain)) {
if (crm_str_eq((const char *)xml_domain->name, XML_CIB_TAG_DOMAIN, TRUE)) {
domain = NULL;
id = crm_element_value(xml_domain, XML_ATTR_ID);
for (xml_node = __xml_first_child(xml_domain); xml_node != NULL;
xml_node = __xml_next(xml_node)) {
if (crm_str_eq((const char *)xml_node->name, XML_CIB_TAG_NODE, TRUE)) {
node_t *copy = NULL;
node_t *node = NULL;
const char *uname = crm_element_value(xml_node, "name");
const char *score = crm_element_value(xml_node, XML_RULE_ATTR_SCORE);
if (uname == NULL) {
crm_config_err("Invalid domain %s: Must specify name attribute in <node>", id);
continue;
}
node = pe_find_node(data_set->nodes, uname);
if (node == NULL) {
node = pe_find_node_id(data_set->nodes, uname);
}
if (node == NULL) {
crm_config_warn("Invalid domain %s: Node %s does not exist", id, uname);
continue;
}
copy = node_copy(node);
copy->weight = char2score(score);
crm_debug("Adding %s to domain %s with score %s", node->details->uname, id,
score);
domain = g_list_prepend(domain, copy);
}
}
if (domain) {
crm_debug("Created domain %s with %d members", id, g_list_length(domain));
g_hash_table_replace(data_set->domains, strdup(id), domain);
}
}
}
return TRUE;
}
static void
destroy_template_rsc_set(gpointer data)
{
xmlNode *rsc_set = data;
free_xml(rsc_set);
}
static void
setup_container(resource_t * rsc, pe_working_set_t * data_set)
{
const char *container_id = NULL;
if (rsc->children) {
GListPtr gIter = rsc->children;
for (; gIter != NULL; gIter = gIter->next) {
resource_t *child_rsc = (resource_t *) gIter->data;
setup_container(child_rsc, data_set);
}
return;
}
container_id = g_hash_table_lookup(rsc->meta, XML_RSC_ATTR_CONTAINER);
if (container_id && safe_str_neq(container_id, rsc->id)) {
resource_t *container = pe_find_resource(data_set->resources, container_id);
if (container) {
rsc->container = container;
container->fillers = g_list_append(container->fillers, rsc);
pe_rsc_trace(rsc, "Resource %s's container is %s", rsc->id, container_id);
} else {
pe_err("Resource %s: Unknown resource container (%s)", rsc->id, container_id);
}
}
}
gboolean
unpack_remote_nodes(xmlNode * xml_resources, pe_working_set_t * data_set)
{
xmlNode *xml_obj = NULL;
GHashTable *rsc_name_check = NULL;
/* generate remote nodes from resource config before unpacking resources */
for (xml_obj = __xml_first_child(xml_resources); xml_obj != NULL; xml_obj = __xml_next(xml_obj)) {
const char *new_node_id = NULL;
/* remote rsc can be defined as primitive, or exist within the metadata of another rsc */
if (xml_contains_remote_node(xml_obj)) {
new_node_id = ID(xml_obj);
/* This check is here to make sure we don't iterate over
* an expanded node that has already been added to the node list. */
if (new_node_id && pe_find_node(data_set->nodes, new_node_id) != NULL) {
continue;
}
} else {
/* Expand a remote resource defined in another resource's meta attributes
 * into an actual primitive in the XML config, to be unpacked later. */
new_node_id = expand_remote_rsc_meta(xml_obj, xml_resources, &rsc_name_check);
}
if (new_node_id) {
crm_trace("detected remote node %s", new_node_id);
create_node(new_node_id, new_node_id, "remote", NULL, data_set);
}
}
if (rsc_name_check) {
g_hash_table_destroy(rsc_name_check);
}
return TRUE;
}
/* Call this after all the nodes and resources have been
* unpacked, but before the status section is read.
*
* A remote node's online status is reflected by the state
* of the remote node's connection resource. We need to link
* the remote node to this connection resource so we can have
* easy access to the connection resource during the PE calculations.
*/
static void
link_rsc2remotenode(pe_working_set_t *data_set, resource_t *new_rsc)
{
node_t *remote_node = NULL;
if (new_rsc->is_remote_node == FALSE) {
return;
}
if (is_set(data_set->flags, pe_flag_quick_location)) {
/* remote_nodes and remote_resources are not linked in quick location calculations */
return;
}
print_resource(LOG_DEBUG_3, "Linking remote-node connection resource, ", new_rsc, FALSE);
remote_node = pe_find_node(data_set->nodes, new_rsc->id);
CRM_CHECK(remote_node != NULL, return;);
remote_node->details->remote_rsc = new_rsc;
/* If this is a baremetal remote-node (no container resource
* associated with it) then we need to handle startup fencing the same way
* as cluster nodes. */
if (new_rsc->container == NULL) {
handle_startup_fencing(data_set, remote_node);
return;
}
}
gboolean
unpack_resources(xmlNode * xml_resources, pe_working_set_t * data_set)
{
xmlNode *xml_obj = NULL;
GListPtr gIter = NULL;
data_set->template_rsc_sets =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str,
destroy_template_rsc_set);
for (xml_obj = __xml_first_child(xml_resources); xml_obj != NULL; xml_obj = __xml_next(xml_obj)) {
resource_t *new_rsc = NULL;
if (crm_str_eq((const char *)xml_obj->name, XML_CIB_TAG_RSC_TEMPLATE, TRUE)) {
const char *template_id = ID(xml_obj);
if (template_id && g_hash_table_lookup_extended(data_set->template_rsc_sets,
template_id, NULL, NULL) == FALSE) {
/* Record the template's ID so we at least know it exists. */
g_hash_table_insert(data_set->template_rsc_sets, strdup(template_id), NULL);
}
continue;
}
crm_trace("Beginning unpack... <%s id=%s... >", crm_element_name(xml_obj), ID(xml_obj));
if (common_unpack(xml_obj, &new_rsc, NULL, data_set)) {
data_set->resources = g_list_append(data_set->resources, new_rsc);
if (xml_contains_remote_node(xml_obj)) {
new_rsc->is_remote_node = TRUE;
}
print_resource(LOG_DEBUG_3, "Added ", new_rsc, FALSE);
} else {
crm_config_err("Failed unpacking %s %s",
crm_element_name(xml_obj), crm_element_value(xml_obj, XML_ATTR_ID));
if (new_rsc != NULL && new_rsc->fns != NULL) {
new_rsc->fns->free(new_rsc);
}
}
}
for (gIter = data_set->resources; gIter != NULL; gIter = gIter->next) {
resource_t *rsc = (resource_t *) gIter->data;
setup_container(rsc, data_set);
link_rsc2remotenode(data_set, rsc);
}
data_set->resources = g_list_sort(data_set->resources, sort_rsc_priority);
if (is_not_set(data_set->flags, pe_flag_quick_location)
&& is_set(data_set->flags, pe_flag_stonith_enabled)
&& is_set(data_set->flags, pe_flag_have_stonith_resource) == FALSE) {
crm_config_err("Resource start-up disabled since no STONITH resources have been defined");
crm_config_err("Either configure some or disable STONITH with the stonith-enabled option");
crm_config_err("NOTE: Clusters with shared data need STONITH to ensure data integrity");
}
return TRUE;
}
/* The ticket state section:
* "/cib/status/tickets/ticket_state" */
static gboolean
unpack_ticket_state(xmlNode * xml_ticket, pe_working_set_t * data_set)
{
const char *ticket_id = NULL;
const char *granted = NULL;
const char *last_granted = NULL;
const char *standby = NULL;
xmlAttrPtr xIter = NULL;
ticket_t *ticket = NULL;
ticket_id = ID(xml_ticket);
if (ticket_id == NULL || strlen(ticket_id) == 0) {
return FALSE;
}
crm_trace("Processing ticket state for %s", ticket_id);
ticket = g_hash_table_lookup(data_set->tickets, ticket_id);
if (ticket == NULL) {
ticket = ticket_new(ticket_id, data_set);
if (ticket == NULL) {
return FALSE;
}
}
for (xIter = xml_ticket->properties; xIter; xIter = xIter->next) {
const char *prop_name = (const char *)xIter->name;
const char *prop_value = crm_element_value(xml_ticket, prop_name);
if (crm_str_eq(prop_name, XML_ATTR_ID, TRUE)) {
continue;
}
g_hash_table_replace(ticket->state, strdup(prop_name), strdup(prop_value));
}
granted = g_hash_table_lookup(ticket->state, "granted");
if (granted && crm_is_true(granted)) {
ticket->granted = TRUE;
crm_info("We have ticket '%s'", ticket->id);
} else {
ticket->granted = FALSE;
crm_info("We do not have ticket '%s'", ticket->id);
}
last_granted = g_hash_table_lookup(ticket->state, "last-granted");
if (last_granted) {
ticket->last_granted = crm_parse_int(last_granted, 0);
}
standby = g_hash_table_lookup(ticket->state, "standby");
if (standby && crm_is_true(standby)) {
ticket->standby = TRUE;
if (ticket->granted) {
crm_info("Granted ticket '%s' is in standby-mode", ticket->id);
}
} else {
ticket->standby = FALSE;
}
crm_trace("Done with ticket state for %s", ticket_id);
return TRUE;
}
static gboolean
unpack_tickets_state(xmlNode * xml_tickets, pe_working_set_t * data_set)
{
xmlNode *xml_obj = NULL;
for (xml_obj = __xml_first_child(xml_tickets); xml_obj != NULL; xml_obj = __xml_next(xml_obj)) {
if (crm_str_eq((const char *)xml_obj->name, XML_CIB_TAG_TICKET_STATE, TRUE) == FALSE) {
continue;
}
unpack_ticket_state(xml_obj, data_set);
}
return TRUE;
}
/* Compatibility with the deprecated ticket state section:
* "/cib/status/tickets/instance_attributes" */
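/* Illustrative sketch of the legacy key formats parsed below (the ticket ID
 * "ticketA" is hypothetical):
 *
 *   granted-ticket-ticketA = "true"       -> state key "granted"
 *   last-granted-ticketA   = "1391538000" -> state key "last-granted"
 *   <key>-ticketA          = "<value>"    -> state key "<key>" (generic form,
 *                                            split at the last '-')
 */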
static void
get_ticket_state_legacy(gpointer key, gpointer value, gpointer user_data)
{
const char *long_key = key;
char *state_key = NULL;
const char *granted_prefix = "granted-ticket-";
const char *last_granted_prefix = "last-granted-";
static int granted_prefix_strlen = 0;
static int last_granted_prefix_strlen = 0;
const char *ticket_id = NULL;
const char *is_granted = NULL;
const char *last_granted = NULL;
const char *sep = NULL;
ticket_t *ticket = NULL;
pe_working_set_t *data_set = user_data;
if (granted_prefix_strlen == 0) {
granted_prefix_strlen = strlen(granted_prefix);
}
if (last_granted_prefix_strlen == 0) {
last_granted_prefix_strlen = strlen(last_granted_prefix);
}
if (strstr(long_key, granted_prefix) == long_key) {
ticket_id = long_key + granted_prefix_strlen;
if (strlen(ticket_id)) {
state_key = strdup("granted");
is_granted = value;
}
} else if (strstr(long_key, last_granted_prefix) == long_key) {
ticket_id = long_key + last_granted_prefix_strlen;
if (strlen(ticket_id)) {
state_key = strdup("last-granted");
last_granted = value;
}
} else if ((sep = strrchr(long_key, '-'))) {
ticket_id = sep + 1;
state_key = strndup(long_key, strlen(long_key) - strlen(sep));
}
if (ticket_id == NULL || strlen(ticket_id) == 0) {
free(state_key);
return;
}
if (state_key == NULL || strlen(state_key) == 0) {
free(state_key);
return;
}
ticket = g_hash_table_lookup(data_set->tickets, ticket_id);
if (ticket == NULL) {
ticket = ticket_new(ticket_id, data_set);
if (ticket == NULL) {
free(state_key);
return;
}
}
g_hash_table_replace(ticket->state, state_key, strdup(value));
if (is_granted) {
if (crm_is_true(is_granted)) {
ticket->granted = TRUE;
crm_info("We have ticket '%s'", ticket->id);
} else {
ticket->granted = FALSE;
crm_info("We do not have ticket '%s'", ticket->id);
}
} else if (last_granted) {
ticket->last_granted = crm_parse_int(last_granted, 0);
}
}
/* remove nodes that are down, stopping */
/* create +ve rsc_to_node constraints between resources and the nodes they are running on */
/* anything else? */
gboolean
unpack_status(xmlNode * status, pe_working_set_t * data_set)
{
const char *id = NULL;
const char *uname = NULL;
xmlNode *state = NULL;
xmlNode *lrm_rsc = NULL;
node_t *this_node = NULL;
crm_trace("Beginning unpack");
if (data_set->tickets == NULL) {
data_set->tickets =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, destroy_ticket);
}
for (state = __xml_first_child(status); state != NULL; state = __xml_next(state)) {
if (crm_str_eq((const char *)state->name, XML_CIB_TAG_TICKETS, TRUE)) {
xmlNode *xml_tickets = state;
GHashTable *state_hash = NULL;
/* Compatibility with the deprecated ticket state section:
* Unpack the attributes in the deprecated "/cib/status/tickets/instance_attributes" if it exists. */
state_hash =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str,
g_hash_destroy_str);
unpack_instance_attributes(data_set->input, xml_tickets, XML_TAG_ATTR_SETS, NULL,
state_hash, NULL, TRUE, data_set->now);
g_hash_table_foreach(state_hash, get_ticket_state_legacy, data_set);
if (state_hash) {
g_hash_table_destroy(state_hash);
}
/* Unpack the new "/cib/status/tickets/ticket_state"s */
unpack_tickets_state(xml_tickets, data_set);
}
if (crm_str_eq((const char *)state->name, XML_CIB_TAG_STATE, TRUE)) {
xmlNode *attrs = NULL;
id = crm_element_value(state, XML_ATTR_ID);
uname = crm_element_value(state, XML_ATTR_UNAME);
this_node = pe_find_node_any(data_set->nodes, id, uname);
if (uname == NULL) {
/* error */
continue;
} else if (this_node == NULL) {
crm_config_warn("Node %s in status section no longer exists", uname);
continue;
} else if (is_remote_node(this_node)) {
/* online state for remote nodes is determined by the rsc state
* after all the unpacking is done. */
continue;
}
crm_trace("Processing node id=%s, uname=%s", id, uname);
/* Mark the node as provisionally clean
* - at least we have seen it in the current cluster's lifetime
*/
this_node->details->unclean = FALSE;
this_node->details->unseen = FALSE;
attrs = find_xml_node(state, XML_TAG_TRANSIENT_NODEATTRS, FALSE);
add_node_attrs(attrs, this_node, TRUE, data_set);
if (crm_is_true(g_hash_table_lookup(this_node->details->attrs, "standby"))) {
crm_info("Node %s is in standby-mode", this_node->details->uname);
this_node->details->standby = TRUE;
}
if (crm_is_true(g_hash_table_lookup(this_node->details->attrs, "maintenance"))) {
crm_info("Node %s is in maintenance-mode", this_node->details->uname);
this_node->details->maintenance = TRUE;
}
crm_trace("determining node state");
determine_online_status(state, this_node, data_set);
if (this_node->details->online && data_set->no_quorum_policy == no_quorum_suicide) {
/* Everything else should flow from this automatically
* At least until the PE becomes able to migrate off healthy resources
*/
pe_fence_node(data_set, this_node, "because the cluster does not have quorum");
}
}
}
/* Now that we know all node states, we can safely handle migration ops */
for (state = __xml_first_child(status); state != NULL; state = __xml_next(state)) {
if (crm_str_eq((const char *)state->name, XML_CIB_TAG_STATE, TRUE) == FALSE) {
continue;
}
id = crm_element_value(state, XML_ATTR_ID);
uname = crm_element_value(state, XML_ATTR_UNAME);
this_node = pe_find_node_any(data_set->nodes, id, uname);
if (this_node == NULL) {
crm_info("Node %s is unknown", id);
continue;
} else if (is_remote_node(this_node)) {
/* The online status of a remote node cannot be determined until all other
 * resource status is unpacked. */
continue;
} else if (this_node->details->online || is_set(data_set->flags, pe_flag_stonith_enabled)) {
crm_trace("Processing lrm resource entries on healthy node: %s",
this_node->details->uname);
lrm_rsc = find_xml_node(state, XML_CIB_TAG_LRM, FALSE);
lrm_rsc = find_xml_node(lrm_rsc, XML_LRM_TAG_RESOURCES, FALSE);
unpack_lrm_resources(this_node, lrm_rsc, data_set);
}
}
/* now that the rest of the cluster's status is determined
* calculate remote-nodes */
unpack_remote_status(status, data_set);
return TRUE;
}
gboolean
unpack_remote_status(xmlNode * status, pe_working_set_t * data_set)
{
const char *id = NULL;
const char *uname = NULL;
GListPtr gIter = NULL;
xmlNode *state = NULL;
xmlNode *lrm_rsc = NULL;
node_t *this_node = NULL;
if (is_set(data_set->flags, pe_flag_have_remote_nodes) == FALSE) {
crm_trace("no remote nodes to unpack");
return TRUE;
}
/* get online status */
for (gIter = data_set->nodes; gIter != NULL; gIter = gIter->next) {
this_node = gIter->data;
if ((this_node == NULL) || (is_remote_node(this_node) == FALSE)) {
continue;
}
determine_remote_online_status(this_node);
}
/* process attributes */
for (state = __xml_first_child(status); state != NULL; state = __xml_next(state)) {
xmlNode *attrs = NULL;
if (crm_str_eq((const char *)state->name, XML_CIB_TAG_STATE, TRUE) == FALSE) {
continue;
}
id = crm_element_value(state, XML_ATTR_ID);
uname = crm_element_value(state, XML_ATTR_UNAME);
this_node = pe_find_node_any(data_set->nodes, id, uname);
if ((this_node == NULL) || (is_remote_node(this_node) == FALSE)) {
continue;
}
crm_trace("Processing remote node id=%s, uname=%s", id, uname);
this_node->details->unclean = FALSE;
this_node->details->unseen = FALSE;
attrs = find_xml_node(state, XML_TAG_TRANSIENT_NODEATTRS, FALSE);
add_node_attrs(attrs, this_node, TRUE, data_set);
if (crm_is_true(g_hash_table_lookup(this_node->details->attrs, "standby"))) {
crm_info("Node %s is in standby-mode", this_node->details->uname);
this_node->details->standby = TRUE;
}
}
/* process node rsc status */
for (state = __xml_first_child(status); state != NULL; state = __xml_next(state)) {
if (crm_str_eq((const char *)state->name, XML_CIB_TAG_STATE, TRUE) == FALSE) {
continue;
}
id = crm_element_value(state, XML_ATTR_ID);
uname = crm_element_value(state, XML_ATTR_UNAME);
this_node = pe_find_node_any(data_set->nodes, id, uname);
if ((this_node == NULL) || (is_remote_node(this_node) == FALSE)) {
continue;
}
crm_trace("Processing lrm resource entries on healthy remote node: %s",
this_node->details->uname);
lrm_rsc = find_xml_node(state, XML_CIB_TAG_LRM, FALSE);
lrm_rsc = find_xml_node(lrm_rsc, XML_LRM_TAG_RESOURCES, FALSE);
unpack_lrm_resources(this_node, lrm_rsc, data_set);
}
return TRUE;
}
static gboolean
determine_online_status_no_fencing(pe_working_set_t * data_set, xmlNode * node_state,
node_t * this_node)
{
gboolean online = FALSE;
const char *join = crm_element_value(node_state, XML_NODE_JOIN_STATE);
const char *is_peer = crm_element_value(node_state, XML_NODE_IS_PEER);
const char *in_cluster = crm_element_value(node_state, XML_NODE_IN_CLUSTER);
const char *exp_state = crm_element_value(node_state, XML_NODE_EXPECTED);
if (!crm_is_true(in_cluster)) {
crm_trace("Node is down: in_cluster=%s", crm_str(in_cluster));
} else if (safe_str_eq(is_peer, ONLINESTATUS)) {
if (safe_str_eq(join, CRMD_JOINSTATE_MEMBER)) {
online = TRUE;
} else {
crm_debug("Node is not ready to run resources: %s", join);
}
} else if (this_node->details->expected_up == FALSE) {
crm_trace("CRMd is down: in_cluster=%s", crm_str(in_cluster));
crm_trace("\tis_peer=%s, join=%s, expected=%s",
crm_str(is_peer), crm_str(join), crm_str(exp_state));
} else {
/* mark it unclean */
pe_fence_node(data_set, this_node, "because it is partially and/or unexpectedly down");
crm_info("\tin_cluster=%s, is_peer=%s, join=%s, expected=%s",
crm_str(in_cluster), crm_str(is_peer), crm_str(join), crm_str(exp_state));
}
return online;
}
static gboolean
determine_online_status_fencing(pe_working_set_t * data_set, xmlNode * node_state,
node_t * this_node)
{
gboolean online = FALSE;
gboolean do_terminate = FALSE;
const char *join = crm_element_value(node_state, XML_NODE_JOIN_STATE);
const char *is_peer = crm_element_value(node_state, XML_NODE_IS_PEER);
const char *in_cluster = crm_element_value(node_state, XML_NODE_IN_CLUSTER);
const char *exp_state = crm_element_value(node_state, XML_NODE_EXPECTED);
const char *terminate = g_hash_table_lookup(this_node->details->attrs, "terminate");
/*
- XML_NODE_IN_CLUSTER ::= true|false
- XML_NODE_IS_PEER ::= true|false|online|offline
- XML_NODE_JOIN_STATE ::= member|down|pending|banned
- XML_NODE_EXPECTED ::= member|down
*/
if (crm_is_true(terminate)) {
do_terminate = TRUE;
} else if (terminate != NULL && strlen(terminate) > 0) {
/* could be a time() value */
char t = terminate[0];
if (t != '0' && isdigit(t)) {
do_terminate = TRUE;
}
}
crm_trace("%s: in_cluster=%s, is_peer=%s, join=%s, expected=%s, term=%d",
this_node->details->uname, crm_str(in_cluster), crm_str(is_peer),
crm_str(join), crm_str(exp_state), do_terminate);
online = crm_is_true(in_cluster);
if (safe_str_eq(is_peer, ONLINESTATUS)) {
is_peer = XML_BOOLEAN_YES;
}
if (exp_state == NULL) {
exp_state = CRMD_JOINSTATE_DOWN;
}
if (this_node->details->shutdown) {
crm_debug("%s is shutting down", this_node->details->uname);
online = crm_is_true(is_peer); /* Slightly different criteria, since we can't shut down a dead peer */
} else if (in_cluster == NULL) {
pe_fence_node(data_set, this_node, "because the peer has not been seen by the cluster");
} else if (safe_str_eq(join, CRMD_JOINSTATE_NACK)) {
pe_fence_node(data_set, this_node, "because it failed the pacemaker membership criteria");
} else if (do_terminate == FALSE && safe_str_eq(exp_state, CRMD_JOINSTATE_DOWN)) {
if (crm_is_true(in_cluster) || crm_is_true(is_peer)) {
crm_info("- Node %s is not ready to run resources", this_node->details->uname);
this_node->details->standby = TRUE;
this_node->details->pending = TRUE;
} else {
crm_trace("%s is down or still coming up", this_node->details->uname);
}
} else if (do_terminate && safe_str_eq(join, CRMD_JOINSTATE_DOWN)
&& crm_is_true(in_cluster) == FALSE && crm_is_true(is_peer) == FALSE) {
crm_info("Node %s was just shot", this_node->details->uname);
online = FALSE;
} else if (crm_is_true(in_cluster) == FALSE) {
pe_fence_node(data_set, this_node, "because the node is no longer part of the cluster");
} else if (crm_is_true(is_peer) == FALSE) {
pe_fence_node(data_set, this_node, "because our peer process is no longer available");
/* Everything is running at this point, now check join state */
} else if (do_terminate) {
pe_fence_node(data_set, this_node, "because termination was requested");
} else if (safe_str_eq(join, CRMD_JOINSTATE_MEMBER)) {
crm_info("Node %s is active", this_node->details->uname);
} else if (safe_str_eq(join, CRMD_JOINSTATE_PENDING)
|| safe_str_eq(join, CRMD_JOINSTATE_DOWN)) {
crm_info("Node %s is not ready to run resources", this_node->details->uname);
this_node->details->standby = TRUE;
this_node->details->pending = TRUE;
} else {
pe_fence_node(data_set, this_node, "because the peer was in an unknown state");
crm_warn("%s: in-cluster=%s, is-peer=%s, join=%s, expected=%s, term=%d, shutdown=%d",
this_node->details->uname, crm_str(in_cluster), crm_str(is_peer),
crm_str(join), crm_str(exp_state), do_terminate, this_node->details->shutdown);
}
return online;
}
static gboolean
determine_remote_online_status(node_t * this_node)
{
resource_t *rsc = this_node->details->remote_rsc;
resource_t *container = NULL;
CRM_ASSERT(rsc != NULL);
container = rsc->container;
/* If the resource is currently started, mark it online. */
if (rsc->role == RSC_ROLE_STARTED) {
crm_trace("Remote node %s is set to ONLINE. role == started", this_node->details->id);
this_node->details->online = TRUE;
}
/* consider this node shutting down if transitioning start->stop */
if (rsc->role == RSC_ROLE_STARTED && rsc->next_role == RSC_ROLE_STOPPED) {
crm_trace("Remote node %s is shutting down: transitioning from started to stopped role", this_node->details->id);
this_node->details->shutdown = TRUE;
}
/* Now check all the failure conditions. */
if (is_set(rsc->flags, pe_rsc_failed) ||
(rsc->role == RSC_ROLE_STOPPED) ||
(container && is_set(container->flags, pe_rsc_failed)) ||
(container && container->role == RSC_ROLE_STOPPED)) {
crm_trace("Remote node %s is set to OFFLINE. node is stopped or rsc failed.", this_node->details->id);
this_node->details->online = FALSE;
}
crm_trace("Remote node %s online=%s",
this_node->details->id, this_node->details->online ? "TRUE" : "FALSE");
return this_node->details->online;
}
gboolean
determine_online_status(xmlNode * node_state, node_t * this_node, pe_working_set_t * data_set)
{
gboolean online = FALSE;
const char *shutdown = NULL;
const char *exp_state = crm_element_value(node_state, XML_NODE_EXPECTED);
if (this_node == NULL) {
crm_config_err("No node to check");
return online;
}
this_node->details->shutdown = FALSE;
this_node->details->expected_up = FALSE;
shutdown = g_hash_table_lookup(this_node->details->attrs, XML_CIB_ATTR_SHUTDOWN);
if (shutdown != NULL && safe_str_neq("0", shutdown)) {
this_node->details->shutdown = TRUE;
} else if (safe_str_eq(exp_state, CRMD_JOINSTATE_MEMBER)) {
this_node->details->expected_up = TRUE;
}
if (this_node->details->type == node_ping) {
this_node->details->unclean = FALSE;
online = FALSE; /* As far as resource management is concerned,
* the node is safely offline.
* Anyone caught abusing this logic will be shot
*/
} else if (is_set(data_set->flags, pe_flag_stonith_enabled) == FALSE) {
online = determine_online_status_no_fencing(data_set, node_state, this_node);
} else {
online = determine_online_status_fencing(data_set, node_state, this_node);
}
if (online) {
this_node->details->online = TRUE;
} else {
/* remove node from contention */
this_node->fixed = TRUE;
this_node->weight = -INFINITY;
}
if (online && this_node->details->shutdown) {
/* don't run resources here */
this_node->fixed = TRUE;
this_node->weight = -INFINITY;
}
if (this_node->details->type == node_ping) {
crm_info("Node %s is not a pacemaker node", this_node->details->uname);
} else if (this_node->details->unclean) {
pe_proc_warn("Node %s is unclean", this_node->details->uname);
} else if (this_node->details->online) {
crm_info("Node %s is %s", this_node->details->uname,
this_node->details->shutdown ? "shutting down" :
this_node->details->pending ? "pending" :
this_node->details->standby ? "standby" :
this_node->details->maintenance ? "maintenance" : "online");
} else {
crm_trace("Node %s is offline", this_node->details->uname);
}
return online;
}
char *
clone_strip(const char *last_rsc_id)
{
int lpc = 0;
char *zero = NULL;
CRM_CHECK(last_rsc_id != NULL, return NULL);
lpc = strlen(last_rsc_id);
while (--lpc > 0) {
switch (last_rsc_id[lpc]) {
case 0:
crm_err("Empty string: %s", last_rsc_id);
return NULL;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
break;
case ':':
zero = calloc(1, lpc + 1);
memcpy(zero, last_rsc_id, lpc);
zero[lpc] = 0;
return zero;
default:
goto done;
}
}
done:
zero = strdup(last_rsc_id);
return zero;
}
char *
clone_zero(const char *last_rsc_id)
{
int lpc = 0;
char *zero = NULL;
CRM_CHECK(last_rsc_id != NULL, return NULL);
lpc = strlen(last_rsc_id);
while (--lpc > 0) {
switch (last_rsc_id[lpc]) {
case 0:
return NULL;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
break;
case ':':
zero = calloc(1, lpc + 3);
memcpy(zero, last_rsc_id, lpc);
zero[lpc] = ':';
zero[lpc + 1] = '0';
zero[lpc + 2] = 0;
return zero;
default:
goto done;
}
}
done:
lpc = strlen(last_rsc_id);
zero = calloc(1, lpc + 3);
memcpy(zero, last_rsc_id, lpc);
zero[lpc] = ':';
zero[lpc + 1] = '0';
zero[lpc + 2] = 0;
crm_trace("%s -> %s", last_rsc_id, zero);
return zero;
}
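/* Worked examples for the two helpers above (names illustrative):
 *   clone_strip("grp:0") -> "grp"   (instance suffix removed)
 *   clone_strip("grp")   -> "grp"   (copy, unchanged)
 *   clone_zero("grp:3")  -> "grp:0" (suffix normalised to :0)
 *   clone_zero("grp")    -> "grp:0" (":0" appended)
 * Callers own the returned string and must free() it.
 */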
static resource_t *
create_fake_resource(const char *rsc_id, xmlNode * rsc_entry, pe_working_set_t * data_set)
{
resource_t *rsc = NULL;
xmlNode *xml_rsc = create_xml_node(NULL, XML_CIB_TAG_RESOURCE);
copy_in_properties(xml_rsc, rsc_entry);
crm_xml_add(xml_rsc, XML_ATTR_ID, rsc_id);
crm_log_xml_debug(xml_rsc, "Orphan resource");
if (!common_unpack(xml_rsc, &rsc, NULL, data_set)) {
return NULL;
}
if (xml_contains_remote_node(xml_rsc)) {
node_t *node;
crm_debug("Detected orphaned remote node %s", rsc_id);
rsc->is_remote_node = TRUE;
node = create_node(rsc_id, rsc_id, "remote", NULL, data_set);
link_rsc2remotenode(data_set, rsc);
if (node) {
crm_trace("Setting node %s as shutting down due to orphaned connection resource", rsc_id);
node->details->shutdown = TRUE;
}
}
if (crm_element_value(rsc_entry, XML_RSC_ATTR_CONTAINER)) {
/* This orphaned rsc needs to be mapped to a container. */
crm_trace("Detected orphaned container filler %s", rsc_id);
set_bit(rsc->flags, pe_rsc_orphan_container_filler);
}
set_bit(rsc->flags, pe_rsc_orphan);
data_set->resources = g_list_append(data_set->resources, rsc);
return rsc;
}
extern resource_t *create_child_clone(resource_t * rsc, int sub_id, pe_working_set_t * data_set);
static resource_t *
find_anonymous_clone(pe_working_set_t * data_set, node_t * node, resource_t * parent,
const char *rsc_id)
{
GListPtr rIter = NULL;
resource_t *rsc = NULL;
gboolean skip_inactive = FALSE;
CRM_ASSERT(parent != NULL);
CRM_ASSERT(parent->variant == pe_clone || parent->variant == pe_master);
CRM_ASSERT(is_not_set(parent->flags, pe_rsc_unique));
/* Find an instance active (or partially active for grouped clones) on the specified node */
pe_rsc_trace(parent, "Looking for %s on %s in %s", rsc_id, node->details->uname, parent->id);
for (rIter = parent->children; rsc == NULL && rIter; rIter = rIter->next) {
GListPtr nIter = NULL;
GListPtr locations = NULL;
resource_t *child = rIter->data;
child->fns->location(child, &locations, TRUE);
if (locations == NULL) {
pe_rsc_trace(child, "Resource %s, skip inactive", child->id);
continue;
}
for (nIter = locations; nIter && rsc == NULL; nIter = nIter->next) {
node_t *childnode = nIter->data;
if (childnode->details == node->details) {
/* ->find_rsc() because we might be a cloned group */
rsc = parent->fns->find_rsc(child, rsc_id, NULL, pe_find_clone);
pe_rsc_trace(rsc, "Resource %s, active", rsc->id);
}
/* Keep this block, it means we'll do the right thing if
* anyone toggles the unique flag to 'off'
*/
if (rsc && rsc->running_on) {
crm_notice("/Anonymous/ clone %s is already running on %s",
parent->id, node->details->uname);
skip_inactive = TRUE;
rsc = NULL;
}
}
g_list_free(locations);
}
/* Find an inactive instance */
if (skip_inactive == FALSE) {
pe_rsc_trace(parent, "Looking for %s anywhere", rsc_id);
for (rIter = parent->children; rsc == NULL && rIter; rIter = rIter->next) {
GListPtr locations = NULL;
resource_t *child = rIter->data;
if (is_set(child->flags, pe_rsc_block)) {
pe_rsc_trace(child, "Skip: blocked in stopped state");
continue;
}
child->fns->location(child, &locations, TRUE);
if (locations == NULL) {
/* ->find_rsc() because we might be a cloned group */
rsc = parent->fns->find_rsc(child, rsc_id, NULL, pe_find_clone);
pe_rsc_trace(parent, "Resource %s, empty slot", rsc->id);
}
g_list_free(locations);
}
}
if (rsc == NULL) {
/* Create an extra orphan */
resource_t *top = create_child_clone(parent, -1, data_set);
/* ->find_rsc() because we might be a cloned group */
rsc = top->fns->find_rsc(top, rsc_id, NULL, pe_find_clone);
CRM_ASSERT(rsc != NULL);
pe_rsc_debug(parent, "Created orphan %s for %s: %s on %s", top->id, parent->id, rsc_id,
node->details->uname);
}
if (safe_str_neq(rsc_id, rsc->id)) {
pe_rsc_debug(rsc, "Internally renamed %s on %s to %s%s",
rsc_id, node->details->uname, rsc->id,
is_set(rsc->flags, pe_rsc_orphan) ? " (ORPHAN)" : "");
}
return rsc;
}
static resource_t *
unpack_find_resource(pe_working_set_t * data_set, node_t * node, const char *rsc_id,
xmlNode * rsc_entry)
{
resource_t *rsc = NULL;
resource_t *parent = NULL;
crm_trace("looking for %s", rsc_id);
rsc = pe_find_resource(data_set->resources, rsc_id);
/* no match */
if (rsc == NULL) {
/* Even when clone-max=0, we still create a single :0 orphan to match against */
char *tmp = clone_zero(rsc_id);
resource_t *clone0 = pe_find_resource(data_set->resources, tmp);
if (clone0 && is_not_set(clone0->flags, pe_rsc_unique)) {
rsc = clone0;
} else {
crm_trace("%s is not known as %s either", rsc_id, tmp);
}
parent = uber_parent(clone0);
free(tmp);
crm_trace("%s not found: %s", rsc_id, parent ? parent->id : "orphan");
} else if (rsc->variant > pe_native) {
crm_trace("%s is no longer a primitive resource, the lrm_resource entry is obsolete",
rsc_id);
return NULL;
} else {
parent = uber_parent(rsc);
}
if (parent && parent->variant > pe_group) {
if (is_not_set(parent->flags, pe_rsc_unique)) {
char *base = clone_strip(rsc_id);
rsc = find_anonymous_clone(data_set, node, parent, base);
CRM_ASSERT(rsc != NULL);
free(base);
}
if (rsc && safe_str_neq(rsc_id, rsc->id)) {
free(rsc->clone_name);
rsc->clone_name = strdup(rsc_id);
}
}
return rsc;
}
static resource_t *
process_orphan_resource(xmlNode * rsc_entry, node_t * node, pe_working_set_t * data_set)
{
resource_t *rsc = NULL;
const char *rsc_id = crm_element_value(rsc_entry, XML_ATTR_ID);
crm_debug("Detected orphan resource %s on %s", rsc_id, node->details->uname);
rsc = create_fake_resource(rsc_id, rsc_entry, data_set);
CRM_CHECK(rsc != NULL, return NULL);
if (is_set(data_set->flags, pe_flag_stop_rsc_orphans) == FALSE) {
clear_bit(rsc->flags, pe_rsc_managed);
} else {
GListPtr gIter = NULL;
print_resource(LOG_DEBUG_3, "Added orphan", rsc, FALSE);
resource_location(rsc, NULL, -INFINITY, "__orphan_dont_run__", data_set);
for (gIter = data_set->nodes; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
if (node->details->online && get_failcount(node, rsc, NULL, data_set)) {
action_t *clear_op = NULL;
action_t *ready = NULL;
if (is_remote_node(node)) {
char *pseudo_op_name = crm_concat(CRM_OP_PROBED, node->details->id, '_');
ready = get_pseudo_op(pseudo_op_name, data_set);
free(pseudo_op_name);
} else {
ready = get_pseudo_op(CRM_OP_PROBED, data_set);
}
clear_op = custom_action(rsc, crm_concat(rsc->id, CRM_OP_CLEAR_FAILCOUNT, '_'),
CRM_OP_CLEAR_FAILCOUNT, node, FALSE, TRUE, data_set);
add_hash_param(clear_op->meta, XML_ATTR_TE_NOWAIT, XML_BOOLEAN_TRUE);
pe_rsc_info(rsc, "Clearing failcount (%d) for orphaned resource %s on %s (%s)",
get_failcount(node, rsc, NULL, data_set), rsc->id, node->details->uname,
clear_op->uuid);
order_actions(clear_op, ready, pe_order_optional);
}
}
}
return rsc;
}
static void
process_rsc_state(resource_t * rsc, node_t * node,
enum action_fail_response on_fail,
xmlNode * migrate_op, pe_working_set_t * data_set)
{
pe_rsc_trace(rsc, "Resource %s is %s on %s: on_fail=%s",
rsc->id, role2text(rsc->role), node->details->uname, fail2text(on_fail));
/* process current state */
if (rsc->role != RSC_ROLE_UNKNOWN) {
resource_t *iter = rsc;
while (iter) {
if (g_hash_table_lookup(iter->known_on, node->details->id) == NULL) {
node_t *n = node_copy(node);
pe_rsc_trace(rsc, "%s (aka. %s) known on %s", rsc->id, rsc->clone_name,
n->details->uname);
g_hash_table_insert(iter->known_on, (gpointer) n->details->id, n);
}
if (is_set(iter->flags, pe_rsc_unique)) {
break;
}
iter = iter->parent;
}
}
if (rsc->role > RSC_ROLE_STOPPED
&& node->details->online == FALSE && is_set(rsc->flags, pe_rsc_managed)) {
gboolean should_fence = FALSE;
/* if this is a remote_node living in a container, fence the container
* by recovering it. Mark the resource as unmanaged. Once the container
* and remote connection are re-established, the status section will
* get reset in the crmd, freeing up this resource to run again once we
* are sure we know the resource's state. */
if (is_container_remote_node(node)) {
set_bit(rsc->flags, pe_rsc_failed);
should_fence = TRUE;
} else if (is_set(data_set->flags, pe_flag_stonith_enabled)) {
should_fence = TRUE;
}
if (should_fence) {
char *reason = g_strdup_printf("because %s is thought to be active there", rsc->id);
pe_fence_node(data_set, node, reason);
g_free(reason);
}
}
if (node->details->unclean) {
/* No extra processing needed
* Also allows resources to be started again after a node is shot
*/
on_fail = action_fail_ignore;
}
switch (on_fail) {
case action_fail_ignore:
/* nothing to do */
break;
case action_fail_fence:
/* treat it as if it is still running
* but also mark the node as unclean
*/
pe_fence_node(data_set, node, "because of resource failure(s)");
break;
case action_fail_standby:
node->details->standby = TRUE;
node->details->standby_onfail = TRUE;
break;
case action_fail_block:
/* is_managed == FALSE will prevent any
* actions being sent for the resource
*/
clear_bit(rsc->flags, pe_rsc_managed);
set_bit(rsc->flags, pe_rsc_block);
break;
case action_fail_migrate:
/* make sure it comes up somewhere else
* or not at all
*/
resource_location(rsc, node, -INFINITY, "__action_migration_auto__", data_set);
break;
case action_fail_stop:
rsc->next_role = RSC_ROLE_STOPPED;
break;
case action_fail_recover:
if (rsc->role != RSC_ROLE_STOPPED && rsc->role != RSC_ROLE_UNKNOWN) {
set_bit(rsc->flags, pe_rsc_failed);
stop_action(rsc, node, FALSE);
}
break;
case action_fail_restart_container:
set_bit(rsc->flags, pe_rsc_failed);
if (rsc->container) {
stop_action(rsc->container, node, FALSE);
} else if (rsc->role != RSC_ROLE_STOPPED && rsc->role != RSC_ROLE_UNKNOWN) {
stop_action(rsc, node, FALSE);
}
break;
}
if (rsc->role != RSC_ROLE_STOPPED && rsc->role != RSC_ROLE_UNKNOWN) {
if (is_set(rsc->flags, pe_rsc_orphan)) {
if (is_set(rsc->flags, pe_rsc_managed)) {
crm_config_warn("Detected active orphan %s running on %s",
rsc->id, node->details->uname);
} else {
crm_config_warn("Cluster configured not to stop active orphans."
" %s must be stopped manually on %s",
rsc->id, node->details->uname);
}
}
native_add_running(rsc, node, data_set);
if (on_fail != action_fail_ignore) {
set_bit(rsc->flags, pe_rsc_failed);
}
} else if (rsc->clone_name && strchr(rsc->clone_name, ':') != NULL) {
/* Only do this for older status sections that included instance numbers.
* Otherwise stopped instances will appear as orphans.
*/
pe_rsc_trace(rsc, "Resetting clone_name %s for %s (stopped)", rsc->clone_name, rsc->id);
free(rsc->clone_name);
rsc->clone_name = NULL;
} else {
char *key = stop_key(rsc);
GListPtr possible_matches = find_actions(rsc->actions, key, node);
GListPtr gIter = possible_matches;
for (; gIter != NULL; gIter = gIter->next) {
action_t *stop = (action_t *) gIter->data;
stop->flags |= pe_action_optional;
}
g_list_free(possible_matches);
free(key);
}
}
/* create active recurring operations as optional */
static void
process_recurring(node_t * node, resource_t * rsc,
int start_index, int stop_index,
GListPtr sorted_op_list, pe_working_set_t * data_set)
{
int counter = -1;
const char *task = NULL;
const char *status = NULL;
GListPtr gIter = sorted_op_list;
pe_rsc_trace(rsc, "%s: Start index %d, stop index = %d", rsc->id, start_index, stop_index);
for (; gIter != NULL; gIter = gIter->next) {
xmlNode *rsc_op = (xmlNode *) gIter->data;
int interval = 0;
char *key = NULL;
const char *id = ID(rsc_op);
const char *interval_s = NULL;
counter++;
if (node->details->online == FALSE) {
pe_rsc_trace(rsc, "Skipping %s/%s: node is offline", rsc->id, node->details->uname);
break;
/* Need to check if there's a monitor for role="Stopped" */
} else if (start_index < stop_index && counter <= stop_index) {
pe_rsc_trace(rsc, "Skipping %s/%s: resource is not active", id, node->details->uname);
continue;
} else if (counter < start_index) {
pe_rsc_trace(rsc, "Skipping %s/%s: old %d", id, node->details->uname, counter);
continue;
}
interval_s = crm_element_value(rsc_op, XML_LRM_ATTR_INTERVAL);
interval = crm_parse_int(interval_s, "0");
if (interval == 0) {
pe_rsc_trace(rsc, "Skipping %s/%s: non-recurring", id, node->details->uname);
continue;
}
status = crm_element_value(rsc_op, XML_LRM_ATTR_OPSTATUS);
if (safe_str_eq(status, "-1")) {
pe_rsc_trace(rsc, "Skipping %s/%s: status", id, node->details->uname);
continue;
}
task = crm_element_value(rsc_op, XML_LRM_ATTR_TASK);
/* create the action */
key = generate_op_key(rsc->id, task, interval);
pe_rsc_trace(rsc, "Creating %s/%s", key, node->details->uname);
custom_action(rsc, key, task, node, TRUE, TRUE, data_set);
}
}
void
calculate_active_ops(GListPtr sorted_op_list, int *start_index, int *stop_index)
{
int counter = -1;
int implied_monitor_start = -1;
int implied_master_start = -1;
const char *task = NULL;
const char *status = NULL;
GListPtr gIter = sorted_op_list;
*stop_index = -1;
*start_index = -1;
for (; gIter != NULL; gIter = gIter->next) {
xmlNode *rsc_op = (xmlNode *) gIter->data;
counter++;
task = crm_element_value(rsc_op, XML_LRM_ATTR_TASK);
status = crm_element_value(rsc_op, XML_LRM_ATTR_OPSTATUS);
if (safe_str_eq(task, CRMD_ACTION_STOP)
&& safe_str_eq(status, "0")) {
*stop_index = counter;
} else if (safe_str_eq(task, CRMD_ACTION_START) || safe_str_eq(task, CRMD_ACTION_MIGRATED)) {
*start_index = counter;
} else if ((implied_monitor_start <= *stop_index) && safe_str_eq(task, CRMD_ACTION_STATUS)) {
const char *rc = crm_element_value(rsc_op, XML_LRM_ATTR_RC);
if (safe_str_eq(rc, "0") || safe_str_eq(rc, "8")) {
implied_monitor_start = counter;
}
} else if (safe_str_eq(task, CRMD_ACTION_PROMOTE) || safe_str_eq(task, CRMD_ACTION_DEMOTE)) {
implied_master_start = counter;
}
}
if (*start_index == -1) {
if (implied_master_start != -1) {
*start_index = implied_master_start;
} else if (implied_monitor_start != -1) {
*start_index = implied_monitor_start;
}
}
}
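/* Worked example for calculate_active_ops() (op names illustrative):
 * given the sorted history start(0) monitor(1) stop(2) start(3)
 * monitor(4), the successful stop sets *stop_index = 2 and the later
 * start sets *start_index = 3, so only entries 3..4 describe the
 * current activation. A successful monitor (or promote/demote) is only
 * used as the implied start when no explicit start survives after the
 * last stop.
 */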
static resource_t *
unpack_lrm_rsc_state(node_t * node, xmlNode * rsc_entry, pe_working_set_t * data_set)
{
GListPtr gIter = NULL;
int stop_index = -1;
int start_index = -1;
enum rsc_role_e req_role = RSC_ROLE_UNKNOWN;
const char *task = NULL;
const char *rsc_id = crm_element_value(rsc_entry, XML_ATTR_ID);
resource_t *rsc = NULL;
GListPtr op_list = NULL;
GListPtr sorted_op_list = NULL;
xmlNode *migrate_op = NULL;
xmlNode *rsc_op = NULL;
enum action_fail_response on_fail = action_fail_ignore;
enum rsc_role_e saved_role = RSC_ROLE_UNKNOWN;
crm_trace("[%s] Processing %s on %s",
crm_element_name(rsc_entry), rsc_id, node->details->uname);
/* extract operations */
op_list = NULL;
sorted_op_list = NULL;
for (rsc_op = __xml_first_child(rsc_entry); rsc_op != NULL; rsc_op = __xml_next(rsc_op)) {
if (crm_str_eq((const char *)rsc_op->name, XML_LRM_TAG_RSC_OP, TRUE)) {
op_list = g_list_prepend(op_list, rsc_op);
}
}
if (op_list == NULL) {
/* if there are no operations, there is nothing to do */
return NULL;
}
/* find the resource */
rsc = unpack_find_resource(data_set, node, rsc_id, rsc_entry);
if (rsc == NULL) {
rsc = process_orphan_resource(rsc_entry, node, data_set);
}
CRM_ASSERT(rsc != NULL);
/* process operations */
saved_role = rsc->role;
on_fail = action_fail_ignore;
rsc->role = RSC_ROLE_UNKNOWN;
sorted_op_list = g_list_sort(op_list, sort_op_by_callid);
for (gIter = sorted_op_list; gIter != NULL; gIter = gIter->next) {
xmlNode *rsc_op = (xmlNode *) gIter->data;
task = crm_element_value(rsc_op, XML_LRM_ATTR_TASK);
if (safe_str_eq(task, CRMD_ACTION_MIGRATED)) {
migrate_op = rsc_op;
}
unpack_rsc_op(rsc, node, rsc_op, &on_fail, data_set);
}
/* create active recurring operations as optional */
calculate_active_ops(sorted_op_list, &start_index, &stop_index);
process_recurring(node, rsc, start_index, stop_index, sorted_op_list, data_set);
/* no need to free the contents */
g_list_free(sorted_op_list);
process_rsc_state(rsc, node, on_fail, migrate_op, data_set);
if (get_target_role(rsc, &req_role)) {
if (rsc->next_role == RSC_ROLE_UNKNOWN || req_role < rsc->next_role) {
pe_rsc_debug(rsc, "%s: Overwriting calculated next role %s"
" with requested next role %s",
rsc->id, role2text(rsc->next_role), role2text(req_role));
rsc->next_role = req_role;
} else if (req_role > rsc->next_role) {
pe_rsc_info(rsc, "%s: Not overwriting calculated next role %s"
" with requested next role %s",
rsc->id, role2text(rsc->next_role), role2text(req_role));
}
}
if (saved_role > rsc->role) {
rsc->role = saved_role;
}
return rsc;
}
static void
handle_orphaned_container_fillers(xmlNode * lrm_rsc_list, pe_working_set_t * data_set)
{
xmlNode *rsc_entry = NULL;
for (rsc_entry = __xml_first_child(lrm_rsc_list); rsc_entry != NULL;
rsc_entry = __xml_next(rsc_entry)) {
resource_t *rsc;
resource_t *container;
const char *rsc_id;
const char *container_id;
if (safe_str_neq((const char *)rsc_entry->name, XML_LRM_TAG_RESOURCE)) {
continue;
}
container_id = crm_element_value(rsc_entry, XML_RSC_ATTR_CONTAINER);
rsc_id = crm_element_value(rsc_entry, XML_ATTR_ID);
if (container_id == NULL || rsc_id == NULL) {
continue;
}
container = pe_find_resource(data_set->resources, container_id);
if (container == NULL) {
continue;
}
rsc = pe_find_resource(data_set->resources, rsc_id);
if (rsc == NULL ||
is_set(rsc->flags, pe_rsc_orphan_container_filler) == FALSE ||
rsc->container != NULL) {
continue;
}
pe_rsc_trace(rsc, "Mapped orphaned rsc %s's container to %s", rsc->id, container_id);
rsc->container = container;
container->fillers = g_list_append(container->fillers, rsc);
}
}
gboolean
unpack_lrm_resources(node_t * node, xmlNode * lrm_rsc_list, pe_working_set_t * data_set)
{
xmlNode *rsc_entry = NULL;
gboolean found_orphaned_container_filler = FALSE;
GListPtr unexpected_containers = NULL;
GListPtr gIter = NULL;
resource_t *remote = NULL;
CRM_CHECK(node != NULL, return FALSE);
crm_trace("Unpacking resources on %s", node->details->uname);
for (rsc_entry = __xml_first_child(lrm_rsc_list); rsc_entry != NULL;
rsc_entry = __xml_next(rsc_entry)) {
if (crm_str_eq((const char *)rsc_entry->name, XML_LRM_TAG_RESOURCE, TRUE)) {
resource_t *rsc;
rsc = unpack_lrm_rsc_state(node, rsc_entry, data_set);
if (!rsc) {
continue;
}
if (is_set(rsc->flags, pe_rsc_orphan_container_filler)) {
found_orphaned_container_filler = TRUE;
}
if (is_set(rsc->flags, pe_rsc_unexpectedly_running)) {
remote = rsc_contains_remote_node(data_set, rsc);
if (remote) {
unexpected_containers = g_list_append(unexpected_containers, remote);
}
}
}
}
/* If a container resource is unexpectedly up... and the remote-node
* connection resource for that container is not up, the entire container
* must be recovered. */
for (gIter = unexpected_containers; gIter != NULL; gIter = gIter->next) {
remote = (resource_t *) gIter->data;
if (remote->role != RSC_ROLE_STARTED) {
crm_warn("Recovering container resource %s. Resource is unexpectedly running and involves a remote-node.",
remote->container->id);
set_bit(remote->container->flags, pe_rsc_failed);
}
}
/* now that all the resource state has been unpacked for this node
* we have to go back and map any orphaned container fillers to their
* container resource */
if (found_orphaned_container_filler) {
handle_orphaned_container_fillers(lrm_rsc_list, data_set);
}
g_list_free(unexpected_containers);
return TRUE;
}
static void
set_active(resource_t * rsc)
{
resource_t *top = uber_parent(rsc);
if (top && top->variant == pe_master) {
rsc->role = RSC_ROLE_SLAVE;
} else {
rsc->role = RSC_ROLE_STARTED;
}
}
static void
set_node_score(gpointer key, gpointer value, gpointer user_data)
{
node_t *node = value;
int *score = user_data;
node->weight = *score;
}
#define STATUS_PATH_MAX 1024
static xmlNode *
find_lrm_op(const char *resource, const char *op, const char *node, const char *source,
pe_working_set_t * data_set)
{
int offset = 0;
char xpath[STATUS_PATH_MAX];
offset += snprintf(xpath + offset, STATUS_PATH_MAX - offset, "//node_state[@uname='%s']", node);
offset +=
snprintf(xpath + offset, STATUS_PATH_MAX - offset, "//" XML_LRM_TAG_RESOURCE "[@id='%s']",
resource);
/* Need to check against transition_magic too? */
if (source && safe_str_eq(op, CRMD_ACTION_MIGRATE)) {
offset +=
snprintf(xpath + offset, STATUS_PATH_MAX - offset,
"/" XML_LRM_TAG_RSC_OP "[@operation='%s' and @migrate_target='%s']", op,
source);
} else if (source && safe_str_eq(op, CRMD_ACTION_MIGRATED)) {
offset +=
snprintf(xpath + offset, STATUS_PATH_MAX - offset,
"/" XML_LRM_TAG_RSC_OP "[@operation='%s' and @migrate_source='%s']", op,
source);
} else {
offset +=
snprintf(xpath + offset, STATUS_PATH_MAX - offset,
"/" XML_LRM_TAG_RSC_OP "[@operation='%s']", op);
}
return get_xpath_object(xpath, data_set->input, LOG_DEBUG);
}
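/* Example (resource/node names illustrative) of the xpath built above:
 * find_lrm_op("rsc1", "stop", "node1", NULL, data_set) queries
 *   //node_state[@uname='node1']//lrm_resource[@id='rsc1']
 *       /lrm_rsc_op[@operation='stop']
 * against data_set->input.
 */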
static void
unpack_rsc_migration(resource_t *rsc, node_t *node, xmlNode *xml_op, pe_working_set_t * data_set)
{
/*
* The normal sequence is (now): migrate_to(Src) -> migrate_from(Tgt) -> stop(Src)
*
* So if a migrate_to is followed by a stop, then we don't need to care what
* happened on the target node
*
* Without the stop, we need to look for a successful migrate_from.
* This would also imply we're no longer running on the source
*
* Without the stop, and without a migrate_from op we make sure the resource
* gets stopped on both source and target (assuming the target is up)
*
*/
int stop_id = 0;
int task_id = 0;
xmlNode *stop_op =
find_lrm_op(rsc->id, CRMD_ACTION_STOP, node->details->id, NULL, data_set);
if (stop_op) {
crm_element_value_int(stop_op, XML_LRM_ATTR_CALLID, &stop_id);
}
crm_element_value_int(xml_op, XML_LRM_ATTR_CALLID, &task_id);
if (stop_op == NULL || stop_id < task_id) {
int from_rc = 0, from_status = 0;
const char *migrate_source =
crm_element_value(xml_op, XML_LRM_ATTR_MIGRATE_SOURCE);
const char *migrate_target =
crm_element_value(xml_op, XML_LRM_ATTR_MIGRATE_TARGET);
node_t *target = pe_find_node(data_set->nodes, migrate_target);
node_t *source = pe_find_node(data_set->nodes, migrate_source);
xmlNode *migrate_from =
find_lrm_op(rsc->id, CRMD_ACTION_MIGRATED, migrate_target, migrate_source,
data_set);
rsc->role = RSC_ROLE_STARTED; /* can be master? */
if (migrate_from) {
crm_element_value_int(migrate_from, XML_LRM_ATTR_RC, &from_rc);
crm_element_value_int(migrate_from, XML_LRM_ATTR_OPSTATUS, &from_status);
pe_rsc_trace(rsc, "%s op on %s exited with status=%d, rc=%d",
ID(migrate_from), migrate_target, from_status, from_rc);
}
if (migrate_from && from_rc == PCMK_OCF_OK
&& from_status == PCMK_LRM_OP_DONE) {
pe_rsc_trace(rsc, "Detected dangling migration op: %s on %s", ID(xml_op),
migrate_source);
/* all good
* just need to arrange for the stop action to get sent
* but _without_ affecting the target somehow
*/
rsc->role = RSC_ROLE_STOPPED;
rsc->dangling_migrations = g_list_prepend(rsc->dangling_migrations, node);
} else if (migrate_from) { /* Failed */
if (target && target->details->online) {
pe_rsc_trace(rsc, "Marking active on %s %p %d", migrate_target, target,
target->details->online);
native_add_running(rsc, target, data_set);
}
} else { /* Pending or complete but erased */
if (target && target->details->online) {
pe_rsc_trace(rsc, "Marking active on %s %p %d", migrate_target, target,
target->details->online);
native_add_running(rsc, target, data_set);
if (source && source->details->online) {
/* If we make it here we have a partial migration. The migrate_to
* has completed but the migrate_from on the target has not. Hold on
* to the target and source on the resource. Later on if we detect that
* the resource is still going to run on that target, we may continue
* the migration */
rsc->partial_migration_target = target;
rsc->partial_migration_source = source;
}
} else {
/* Consider it failed here - forces a restart, prevents migration */
set_bit(rsc->flags, pe_rsc_failed);
clear_bit(rsc->flags, pe_rsc_allow_migrate);
}
}
}
}
static void
unpack_rsc_migration_failure(resource_t *rsc, node_t *node, xmlNode *xml_op, pe_working_set_t * data_set)
{
const char *task = crm_element_value(xml_op, XML_LRM_ATTR_TASK);
if (safe_str_eq(task, CRMD_ACTION_MIGRATED)) {
int stop_id = 0;
int migrate_id = 0;
const char *migrate_source = crm_element_value(xml_op, XML_LRM_ATTR_MIGRATE_SOURCE);
const char *migrate_target = crm_element_value(xml_op, XML_LRM_ATTR_MIGRATE_TARGET);
xmlNode *stop_op =
find_lrm_op(rsc->id, CRMD_ACTION_STOP, migrate_source, NULL, data_set);
xmlNode *migrate_op =
find_lrm_op(rsc->id, CRMD_ACTION_MIGRATE, migrate_source, migrate_target,
data_set);
if (stop_op) {
crm_element_value_int(stop_op, XML_LRM_ATTR_CALLID, &stop_id);
}
if (migrate_op) {
crm_element_value_int(migrate_op, XML_LRM_ATTR_CALLID, &migrate_id);
}
/* Get our state right */
rsc->role = RSC_ROLE_STARTED; /* can be master? */
if (stop_op == NULL || stop_id < migrate_id) {
node_t *source = pe_find_node(data_set->nodes, migrate_source);
if (source && source->details->online) {
native_add_running(rsc, source, data_set);
}
}
} else if (safe_str_eq(task, CRMD_ACTION_MIGRATE)) {
int stop_id = 0;
int migrate_id = 0;
const char *migrate_source = crm_element_value(xml_op, XML_LRM_ATTR_MIGRATE_SOURCE);
const char *migrate_target = crm_element_value(xml_op, XML_LRM_ATTR_MIGRATE_TARGET);
xmlNode *stop_op =
find_lrm_op(rsc->id, CRMD_ACTION_STOP, migrate_target, NULL, data_set);
xmlNode *migrate_op =
find_lrm_op(rsc->id, CRMD_ACTION_MIGRATED, migrate_target, migrate_source,
data_set);
if (stop_op) {
crm_element_value_int(stop_op, XML_LRM_ATTR_CALLID, &stop_id);
}
if (migrate_op) {
crm_element_value_int(migrate_op, XML_LRM_ATTR_CALLID, &migrate_id);
}
/* Get our state right */
rsc->role = RSC_ROLE_STARTED; /* can be master? */
if (stop_op == NULL || stop_id < migrate_id) {
node_t *target = pe_find_node(data_set->nodes, migrate_target);
pe_rsc_trace(rsc, "Stop: %p %d, Migrated: %p %d", stop_op, stop_id, migrate_op,
migrate_id);
if (target && target->details->online) {
native_add_running(rsc, target, data_set);
}
} else if (migrate_op == NULL) {
/* Make sure it gets cleaned up, the stop may pre-date the migrate_from */
rsc->dangling_migrations = g_list_prepend(rsc->dangling_migrations, node);
}
}
}
static const char *get_op_key(xmlNode *xml_op)
{
const char *key = crm_element_value(xml_op, XML_LRM_ATTR_TASK_KEY);
if(key == NULL) {
key = ID(xml_op);
}
return key;
}
static void
unpack_rsc_op_failure(resource_t *rsc, node_t *node, int rc, xmlNode *xml_op, enum action_fail_response *on_fail, pe_working_set_t * data_set)
{
int interval = 0;
bool is_probe = FALSE;
action_t *action = NULL;
const char *key = get_op_key(xml_op);
const char *task = crm_element_value(xml_op, XML_LRM_ATTR_TASK);
const char *op_version = crm_element_value(xml_op, XML_ATTR_CRM_VERSION);
crm_element_value_int(xml_op, XML_LRM_ATTR_INTERVAL, &interval);
if(interval == 0 && safe_str_eq(task, CRMD_ACTION_STATUS)) {
is_probe = TRUE;
pe_rsc_trace(rsc, "is a probe: %s", key);
}
if (rc != PCMK_OCF_NOT_INSTALLED || is_set(data_set->flags, pe_flag_symmetric_cluster)) {
crm_warn("Processing failed op %s for %s on %s: %s (%d)",
task, rsc->id, node->details->uname, services_ocf_exitcode_str(rc),
rc);
crm_xml_add(xml_op, XML_ATTR_UNAME, node->details->uname);
if ((node->details->shutdown == FALSE) || (node->details->online == TRUE)) {
add_node_copy(data_set->failed, xml_op);
}
} else {
crm_trace("Processing failed op %s for %s on %s: %s (%d)",
task, rsc->id, node->details->uname, services_ocf_exitcode_str(rc),
rc);
}
action = custom_action(rsc, strdup(key), task, NULL, TRUE, FALSE, data_set);
if ((action->on_fail <= action_fail_fence && *on_fail < action->on_fail) ||
(action->on_fail == action_fail_restart_container
&& *on_fail <= action_fail_recover) || (*on_fail == action_fail_restart_container
&& action->on_fail >=
action_fail_migrate)) {
pe_rsc_trace(rsc, "on-fail %s -> %s for %s (%s)", fail2text(*on_fail),
fail2text(action->on_fail), action->uuid, key);
*on_fail = action->on_fail;
}
if (safe_str_eq(task, CRMD_ACTION_STOP)) {
resource_location(rsc, node, -INFINITY, "__stop_fail__", data_set);
} else if (safe_str_eq(task, CRMD_ACTION_MIGRATE) || safe_str_eq(task, CRMD_ACTION_MIGRATED)) {
unpack_rsc_migration_failure(rsc, node, xml_op, data_set);
} else if (safe_str_eq(task, CRMD_ACTION_PROMOTE)) {
rsc->role = RSC_ROLE_MASTER;
} else if (safe_str_eq(task, CRMD_ACTION_DEMOTE)) {
/*
* staying in role=master ends up putting the PE/TE into a loop.
* Setting role=slave is not dangerous because no master will be
* promoted until the failed resource has been fully stopped
*/
rsc->next_role = RSC_ROLE_STOPPED;
if (action->on_fail == action_fail_block) {
rsc->role = RSC_ROLE_MASTER;
} else {
crm_warn("Forcing %s to stop after a failed demote action", rsc->id);
rsc->role = RSC_ROLE_SLAVE;
}
} else if (compare_version("2.0", op_version) > 0 && safe_str_eq(task, CRMD_ACTION_START)) {
crm_warn("Compatibility handling for failed op %s on %s", key, node->details->uname);
resource_location(rsc, node, -INFINITY, "__legacy_start__", data_set);
}
if(is_probe && rc == PCMK_OCF_NOT_INSTALLED) {
/* leave stopped */
pe_rsc_trace(rsc, "Leaving %s stopped", rsc->id);
rsc->role = RSC_ROLE_STOPPED;
} else if (rsc->role < RSC_ROLE_STARTED) {
pe_rsc_trace(rsc, "Setting %s active", rsc->id);
set_active(rsc);
}
pe_rsc_trace(rsc, "Resource %s: role=%s, unclean=%s, on_fail=%s, fail_role=%s",
rsc->id, role2text(rsc->role),
node->details->unclean ? "true" : "false",
fail2text(action->on_fail), role2text(action->fail_role));
if (action->fail_role != RSC_ROLE_STARTED && rsc->next_role < action->fail_role) {
rsc->next_role = action->fail_role;
}
if (action->fail_role == RSC_ROLE_STOPPED) {
int score = -INFINITY;
resource_t *fail_rsc = rsc;
if (fail_rsc->parent) {
resource_t *parent = uber_parent(fail_rsc);
if ((parent->variant == pe_clone || parent->variant == pe_master)
&& is_not_set(parent->flags, pe_rsc_unique)) {
/* for clone and master resources, if a child fails on an operation
* with on-fail = stop, all the resources fail. Do this by preventing
* the parent from coming up again. */
fail_rsc = parent;
}
}
crm_warn("Making sure %s doesn't come up again", fail_rsc->id);
/* make sure it doesn't come up again */
g_hash_table_destroy(fail_rsc->allowed_nodes);
fail_rsc->allowed_nodes = node_hash_from_list(data_set->nodes);
g_hash_table_foreach(fail_rsc->allowed_nodes, set_node_score, &score);
}
pe_free_action(action);
}
static int
determine_op_status(
resource_t *rsc, int rc, int target_rc, node_t * node, xmlNode * xml_op, enum action_fail_response * on_fail, pe_working_set_t * data_set)
{
int interval = 0;
int result = PCMK_LRM_OP_DONE;
const char *key = get_op_key(xml_op);
const char *task = crm_element_value(xml_op, XML_LRM_ATTR_TASK);
bool is_probe = FALSE;
crm_element_value_int(xml_op, XML_LRM_ATTR_INTERVAL, &interval);
if (interval == 0 && safe_str_eq(task, CRMD_ACTION_STATUS)) {
is_probe = TRUE;
}
if (target_rc >= 0 && target_rc != rc) {
result = PCMK_LRM_OP_ERROR;
pe_rsc_debug(rsc, "%s on %s returned '%s' (%d) instead of the expected value: '%s' (%d)",
key, node->details->uname,
services_ocf_exitcode_str(rc), rc,
services_ocf_exitcode_str(target_rc), target_rc);
}
/* we could clean this up significantly except for old LRMs and CRMs that
* didn't include target_rc and liked to remap status
*/
switch (rc) {
case PCMK_OCF_OK:
if (is_probe && target_rc == 7) {
result = PCMK_LRM_OP_DONE;
pe_rsc_info(rsc, "Operation %s found resource %s active on %s",
task, rsc->id, node->details->uname);
set_bit(rsc->flags, pe_rsc_unexpectedly_running);
/* legacy code for pre-0.6.5 operations */
} else if (target_rc < 0 && interval > 0 && rsc->role == RSC_ROLE_MASTER) {
/* catch status ops that return 0 instead of 8 while they
* are supposed to be in master mode
*/
result = PCMK_LRM_OP_ERROR;
}
break;
case PCMK_OCF_NOT_RUNNING:
if (is_probe || target_rc == rc) {
result = PCMK_LRM_OP_DONE;
rsc->role = RSC_ROLE_STOPPED;
/* clear any previous failure actions */
*on_fail = action_fail_ignore;
rsc->next_role = RSC_ROLE_UNKNOWN;
} else if (safe_str_neq(task, CRMD_ACTION_STOP)) {
result = PCMK_LRM_OP_ERROR;
}
break;
case PCMK_OCF_RUNNING_MASTER:
if (is_probe) {
result = PCMK_LRM_OP_DONE;
pe_rsc_info(rsc, "Operation %s found resource %s active in master mode on %s",
task, rsc->id, node->details->uname);
} else if (target_rc == rc) {
/* nothing to do */
} else if (target_rc >= 0) {
result = PCMK_LRM_OP_ERROR;
/* legacy code for pre-0.6.5 operations */
} else if (safe_str_neq(task, CRMD_ACTION_STATUS)
|| rsc->role != RSC_ROLE_MASTER) {
result = PCMK_LRM_OP_ERROR;
if (rsc->role != RSC_ROLE_MASTER) {
crm_err("%s reported %s in master mode on %s",
key, rsc->id, node->details->uname);
}
}
rsc->role = RSC_ROLE_MASTER;
break;
case PCMK_OCF_FAILED_MASTER:
rsc->role = RSC_ROLE_MASTER;
result = PCMK_LRM_OP_ERROR;
break;
case PCMK_OCF_NOT_CONFIGURED:
result = PCMK_LRM_OP_ERROR_FATAL;
break;
case PCMK_OCF_NOT_INSTALLED:
case PCMK_OCF_INVALID_PARAM:
case PCMK_OCF_INSUFFICIENT_PRIV:
case PCMK_OCF_UNIMPLEMENT_FEATURE:
if (rc == PCMK_OCF_UNIMPLEMENT_FEATURE && interval > 0) {
result = PCMK_LRM_OP_NOTSUPPORTED;
break;
} else if(pe_can_fence(data_set, node) == FALSE
&& safe_str_eq(task, CRMD_ACTION_STOP)) {
/* If a stop fails and we can't fence, there's nothing else we can do */
pe_proc_err("No further recovery can be attempted for %s: %s action failed with '%s' (%d)",
rsc->id, task, services_ocf_exitcode_str(rc), rc);
clear_bit(rsc->flags, pe_rsc_managed);
set_bit(rsc->flags, pe_rsc_block);
}
result = PCMK_LRM_OP_ERROR_HARD;
break;
default:
if (result == PCMK_LRM_OP_DONE) {
crm_info("Treating %s (rc=%d) on %s as an ERROR",
key, rc, node->details->uname);
result = PCMK_LRM_OP_ERROR;
}
}
return result;
}
static bool check_operation_expiry(resource_t *rsc, node_t *node, int rc, xmlNode *xml_op, pe_working_set_t * data_set)
{
bool expired = FALSE;
time_t last_failure = 0;
int clear_failcount = 0;
int interval = 0;
const char *key = get_op_key(xml_op);
const char *task = crm_element_value(xml_op, XML_LRM_ATTR_TASK);
if (rsc->failure_timeout > 0) {
int last_run = 0;
if (crm_element_value_int(xml_op, XML_RSC_OP_LAST_CHANGE, &last_run) == 0) {
time_t now = get_effective_time(data_set);
if (now > (last_run + rsc->failure_timeout)) {
expired = TRUE;
}
}
}
if (expired) {
if (rsc->failure_timeout > 0) {
int fc = get_failcount_full(node, rsc, &last_failure, FALSE, data_set);
if(fc && get_failcount_full(node, rsc, &last_failure, TRUE, data_set) == 0) {
clear_failcount = 1;
crm_notice("Clearing expired failcount for %s on %s", rsc->id, node->details->uname);
}
}
} else if (strstr(ID(xml_op), "last_failure") &&
((strcmp(task, "start") == 0) || (strcmp(task, "monitor") == 0))) {
op_digest_cache_t *digest_data = NULL;
digest_data = rsc_action_digest_cmp(rsc, xml_op, node, data_set);
if (digest_data->rc == RSC_DIGEST_UNKNOWN) {
crm_trace("rsc op %s/%s on node %s does not have an op digest to compare against",
rsc->id, key, node->details->id);
} else if (digest_data->rc != RSC_DIGEST_MATCH) {
clear_failcount = 1;
crm_info
("Clearing failcount for %s on %s, %s failed and now resource parameters have changed.",
task, rsc->id, node->details->uname);
}
}
if (clear_failcount) {
action_t *clear_op = NULL;
clear_op = custom_action(rsc, crm_concat(rsc->id, CRM_OP_CLEAR_FAILCOUNT, '_'),
CRM_OP_CLEAR_FAILCOUNT, node, FALSE, TRUE, data_set);
add_hash_param(clear_op->meta, XML_ATTR_TE_NOWAIT, XML_BOOLEAN_TRUE);
}
crm_element_value_int(xml_op, XML_LRM_ATTR_INTERVAL, &interval);
if(expired && interval == 0 && safe_str_eq(task, CRMD_ACTION_STATUS)) {
switch(rc) {
case PCMK_OCF_OK:
case PCMK_OCF_NOT_RUNNING:
case PCMK_OCF_RUNNING_MASTER:
/* Don't expire probes that return these values */
expired = FALSE;
break;
}
}
return expired;
}
static int get_target_rc(xmlNode *xml_op)
{
int dummy = 0;
int target_rc = 0;
char *dummy_string = NULL;
const char *key = crm_element_value(xml_op, XML_ATTR_TRANSITION_KEY);
if (key == NULL) {
return -1;
}
decode_transition_key(key, &dummy_string, &dummy, &dummy, &target_rc);
free(dummy_string);
return target_rc;
}
static enum action_fail_response
get_action_on_fail(resource_t *rsc, const char *key, const char *task, pe_working_set_t * data_set)
{
int result = action_fail_recover;
action_t *action = custom_action(rsc, strdup(key), task, NULL, TRUE, FALSE, data_set);
result = action->on_fail;
pe_free_action(action);
return result;
}
static void
update_resource_state(resource_t *rsc, node_t * node, xmlNode * xml_op, const char *task, int rc,
enum action_fail_response *on_fail, pe_working_set_t * data_set)
{
gboolean clear_past_failure = FALSE;
if (rc == PCMK_OCF_NOT_RUNNING) {
clear_past_failure = TRUE;
} else if (rc == PCMK_OCF_NOT_INSTALLED) {
rsc->role = RSC_ROLE_STOPPED;
} else if (safe_str_eq(task, CRMD_ACTION_STATUS)) {
clear_past_failure = TRUE;
if (rsc->role < RSC_ROLE_STARTED) {
set_active(rsc);
}
} else if (safe_str_eq(task, CRMD_ACTION_START)) {
rsc->role = RSC_ROLE_STARTED;
clear_past_failure = TRUE;
} else if (safe_str_eq(task, CRMD_ACTION_STOP)) {
rsc->role = RSC_ROLE_STOPPED;
clear_past_failure = TRUE;
} else if (safe_str_eq(task, CRMD_ACTION_PROMOTE)) {
rsc->role = RSC_ROLE_MASTER;
clear_past_failure = TRUE;
} else if (safe_str_eq(task, CRMD_ACTION_DEMOTE)) {
/* Demote from Master does not clear an error */
rsc->role = RSC_ROLE_SLAVE;
} else if (safe_str_eq(task, CRMD_ACTION_MIGRATED)) {
rsc->role = RSC_ROLE_STARTED;
clear_past_failure = TRUE;
} else if (safe_str_eq(task, CRMD_ACTION_MIGRATE)) {
unpack_rsc_migration(rsc, node, xml_op, data_set);
} else if (rsc->role < RSC_ROLE_STARTED) {
/* migrate_to and migrate_from will land here */
pe_rsc_trace(rsc, "%s active on %s", rsc->id, node->details->uname);
set_active(rsc);
}
/* clear any previous failure actions */
if (clear_past_failure) {
switch (*on_fail) {
case action_fail_stop:
case action_fail_fence:
case action_fail_migrate:
case action_fail_standby:
pe_rsc_trace(rsc, "%s.%s is not cleared by a completed stop",
rsc->id, fail2text(*on_fail));
break;
case action_fail_block:
case action_fail_ignore:
case action_fail_recover:
*on_fail = action_fail_ignore;
rsc->next_role = RSC_ROLE_UNKNOWN;
break;
case action_fail_restart_container:
*on_fail = action_fail_ignore;
rsc->next_role = RSC_ROLE_UNKNOWN;
}
}
}
gboolean
unpack_rsc_op(resource_t * rsc, node_t * node, xmlNode * xml_op,
enum action_fail_response * on_fail, pe_working_set_t * data_set)
{
int task_id = 0;
const char *key = NULL;
const char *task = NULL;
const char *task_key = NULL;
int rc = 0;
int status = PCMK_LRM_OP_PENDING-1;
int target_rc = get_target_rc(xml_op);
+ int interval = 0;
gboolean expired = FALSE;
resource_t *parent = rsc;
enum action_fail_response failure_strategy = action_fail_recover;
CRM_CHECK(rsc != NULL, return FALSE);
CRM_CHECK(node != NULL, return FALSE);
CRM_CHECK(xml_op != NULL, return FALSE);
task_key = get_op_key(xml_op);
task = crm_element_value(xml_op, XML_LRM_ATTR_TASK);
key = crm_element_value(xml_op, XML_ATTR_TRANSITION_KEY);
crm_element_value_int(xml_op, XML_LRM_ATTR_RC, &rc);
crm_element_value_int(xml_op, XML_LRM_ATTR_CALLID, &task_id);
crm_element_value_int(xml_op, XML_LRM_ATTR_OPSTATUS, &status);
+ crm_element_value_int(xml_op, XML_LRM_ATTR_INTERVAL, &interval);
CRM_CHECK(task != NULL, return FALSE);
CRM_CHECK(status <= PCMK_LRM_OP_NOT_INSTALLED, return FALSE);
CRM_CHECK(status >= PCMK_LRM_OP_PENDING, return FALSE);
if (safe_str_eq(task, CRMD_ACTION_NOTIFY)) {
/* safe to ignore these */
return TRUE;
}
if (is_not_set(rsc->flags, pe_rsc_unique)) {
parent = uber_parent(rsc);
}
pe_rsc_trace(rsc, "Unpacking task %s/%s (call_id=%d, status=%d, rc=%d) on %s (role=%s)",
task_key, task, task_id, status, rc, node->details->uname, role2text(rsc->role));
if (node->details->unclean) {
pe_rsc_trace(rsc, "Node %s (where %s is running) is unclean."
" Further action depends on the value of the stop's on-fail attribute",
node->details->uname, rsc->id);
}
if (status == PCMK_LRM_OP_ERROR) {
/* Older versions set this if rc != 0 but it's up to us to decide */
status = PCMK_LRM_OP_DONE;
}
if(status != PCMK_LRM_OP_NOT_INSTALLED) {
expired = check_operation_expiry(rsc, node, rc, xml_op, data_set);
}
if (expired && target_rc != rc) {
- int interval = 0;
const char *magic = crm_element_value(xml_op, XML_ATTR_TRANSITION_MAGIC);
pe_rsc_debug(rsc, "Expired operation '%s' on %s returned '%s' (%d) instead of the expected value: '%s' (%d)",
key, node->details->uname,
services_ocf_exitcode_str(rc), rc,
services_ocf_exitcode_str(target_rc), target_rc);
- crm_element_value_int(xml_op, XML_LRM_ATTR_INTERVAL, &interval);
if(interval == 0) {
crm_notice("Ignoring expired calculated failure %s (rc=%d, magic=%s) on %s",
task_key, rc, magic, node->details->uname);
goto done;
} else if(node->details->online && node->details->unclean == FALSE) {
crm_notice("Re-initiated expired calculated failure %s (rc=%d, magic=%s) on %s",
task_key, rc, magic, node->details->uname);
/* This is SO horrible, but we don't have access to CancelXmlOp() yet */
crm_xml_add(xml_op, XML_LRM_ATTR_RESTART_DIGEST, "calculated-failure-timeout");
goto done;
}
}
if(status == PCMK_LRM_OP_DONE || status == PCMK_LRM_OP_ERROR) {
status = determine_op_status(rsc, rc, target_rc, node, xml_op, on_fail, data_set);
}
pe_rsc_trace(rsc, "Handling status: %d", status);
switch (status) {
case PCMK_LRM_OP_CANCELLED:
/* do nothing?? */
pe_err("Don't know what to do for cancelled ops yet");
break;
case PCMK_LRM_OP_PENDING:
if (safe_str_eq(task, CRMD_ACTION_START)) {
set_bit(rsc->flags, pe_rsc_start_pending);
set_active(rsc);
} else if (safe_str_eq(task, CRMD_ACTION_PROMOTE)) {
rsc->role = RSC_ROLE_MASTER;
} else if (safe_str_eq(task, CRMD_ACTION_MIGRATE) && node->details->unclean) {
/* If a pending migrate_to action is out on an unclean node,
* we have to force the stop action on the target. */
const char *migrate_target = crm_element_value(xml_op, XML_LRM_ATTR_MIGRATE_TARGET);
node_t *target = pe_find_node(data_set->nodes, migrate_target);
if (target) {
stop_action(rsc, target, FALSE);
}
}
+
+ if (rsc->pending_task == NULL) {
+ if (safe_str_eq(task, CRMD_ACTION_STATUS) && interval == 0) {
+ /* Comment this out until someone requests it */
+ /* Comment this out until cl#5184 is fixed */
+ /*rsc->pending_task = strdup("probe");*/
+
+ } else {
+ rsc->pending_task = strdup(task);
+ }
+ }
break;
case PCMK_LRM_OP_DONE:
pe_rsc_trace(rsc, "%s/%s completed on %s", rsc->id, task, node->details->uname);
update_resource_state(rsc, node, xml_op, task, rc, on_fail, data_set);
break;
case PCMK_LRM_OP_NOT_INSTALLED:
failure_strategy = get_action_on_fail(rsc, task_key, task, data_set);
if (failure_strategy == action_fail_ignore) {
crm_warn("Cannot ignore failed %s (status=%d, rc=%d) on %s: "
"Resource agent doesn't exist",
task_key, status, rc, node->details->uname);
/* Also for printing it as "FAILED" by marking it as pe_rsc_failed later */
*on_fail = action_fail_migrate;
}
resource_location(parent, node, -INFINITY, "hard-error", data_set);
unpack_rsc_op_failure(rsc, node, rc, xml_op, on_fail, data_set);
break;
case PCMK_LRM_OP_ERROR:
case PCMK_LRM_OP_ERROR_HARD:
case PCMK_LRM_OP_ERROR_FATAL:
case PCMK_LRM_OP_TIMEOUT:
case PCMK_LRM_OP_NOTSUPPORTED:
failure_strategy = get_action_on_fail(rsc, task_key, task, data_set);
if ((failure_strategy == action_fail_ignore)
|| (failure_strategy == action_fail_restart_container
&& safe_str_eq(task, CRMD_ACTION_STOP))) {
crm_warn("Pretending the failure of %s (rc=%d) on %s succeeded",
task_key, rc, node->details->uname);
update_resource_state(rsc, node, xml_op, task, target_rc, on_fail, data_set);
crm_xml_add(xml_op, XML_ATTR_UNAME, node->details->uname);
set_bit(rsc->flags, pe_rsc_failure_ignored);
if ((node->details->shutdown == FALSE) || (node->details->online == TRUE)) {
crm_xml_add(xml_op, XML_ATTR_UNAME, node->details->uname);
add_node_copy(data_set->failed, xml_op);
}
if (failure_strategy == action_fail_restart_container && *on_fail <= action_fail_recover) {
*on_fail = failure_strategy;
}
} else {
unpack_rsc_op_failure(rsc, node, rc, xml_op, on_fail, data_set);
if(status == PCMK_LRM_OP_ERROR_HARD) {
do_crm_log(rc != PCMK_OCF_NOT_INSTALLED?LOG_ERR:LOG_NOTICE,
"Preventing %s from re-starting on %s: operation %s failed '%s' (%d)",
parent->id, node->details->uname,
task, services_ocf_exitcode_str(rc), rc);
resource_location(parent, node, -INFINITY, "hard-error", data_set);
} else if(status == PCMK_LRM_OP_ERROR_FATAL) {
crm_err("Preventing %s from re-starting anywhere: operation %s failed '%s' (%d)",
parent->id, task, services_ocf_exitcode_str(rc), rc);
resource_location(parent, NULL, -INFINITY, "fatal-error", data_set);
}
}
break;
}
done:
pe_rsc_trace(rsc, "Resource %s after %s: role=%s", rsc->id, task, role2text(rsc->role));
return TRUE;
}
gboolean
add_node_attrs(xmlNode * xml_obj, node_t * node, gboolean overwrite, pe_working_set_t * data_set)
{
g_hash_table_insert(node->details->attrs,
strdup("#uname"), strdup(node->details->uname));
g_hash_table_insert(node->details->attrs,
strdup("#kind"), strdup(node->details->remote_rsc?"container":"cluster"));
g_hash_table_insert(node->details->attrs, strdup("#" XML_ATTR_ID), strdup(node->details->id));
if (safe_str_eq(node->details->id, data_set->dc_uuid)) {
data_set->dc_node = node;
node->details->is_dc = TRUE;
g_hash_table_insert(node->details->attrs,
strdup("#" XML_ATTR_DC), strdup(XML_BOOLEAN_TRUE));
} else {
g_hash_table_insert(node->details->attrs,
strdup("#" XML_ATTR_DC), strdup(XML_BOOLEAN_FALSE));
}
unpack_instance_attributes(data_set->input, xml_obj, XML_TAG_ATTR_SETS, NULL,
node->details->attrs, NULL, overwrite, data_set->now);
return TRUE;
}
static GListPtr
extract_operations(const char *node, const char *rsc, xmlNode * rsc_entry, gboolean active_filter)
{
int counter = -1;
int stop_index = -1;
int start_index = -1;
xmlNode *rsc_op = NULL;
GListPtr gIter = NULL;
GListPtr op_list = NULL;
GListPtr sorted_op_list = NULL;
/* extract operations */
op_list = NULL;
sorted_op_list = NULL;
for (rsc_op = __xml_first_child(rsc_entry); rsc_op != NULL; rsc_op = __xml_next(rsc_op)) {
if (crm_str_eq((const char *)rsc_op->name, XML_LRM_TAG_RSC_OP, TRUE)) {
crm_xml_add(rsc_op, "resource", rsc);
crm_xml_add(rsc_op, XML_ATTR_UNAME, node);
op_list = g_list_prepend(op_list, rsc_op);
}
}
if (op_list == NULL) {
/* if there are no operations, there is nothing to do */
return NULL;
}
sorted_op_list = g_list_sort(op_list, sort_op_by_callid);
/* create active recurring operations as optional */
if (active_filter == FALSE) {
return sorted_op_list;
}
op_list = NULL;
calculate_active_ops(sorted_op_list, &start_index, &stop_index);
for (gIter = sorted_op_list; gIter != NULL; gIter = gIter->next) {
xmlNode *rsc_op = (xmlNode *) gIter->data;
counter++;
if (start_index < stop_index) {
crm_trace("Skipping %s: not active", ID(rsc_entry));
break;
} else if (counter < start_index) {
crm_trace("Skipping %s: old", ID(rsc_op));
continue;
}
op_list = g_list_append(op_list, rsc_op);
}
g_list_free(sorted_op_list);
return op_list;
}
GListPtr
find_operations(const char *rsc, const char *node, gboolean active_filter,
pe_working_set_t * data_set)
{
GListPtr output = NULL;
GListPtr intermediate = NULL;
xmlNode *tmp = NULL;
xmlNode *status = find_xml_node(data_set->input, XML_CIB_TAG_STATUS, TRUE);
node_t *this_node = NULL;
xmlNode *node_state = NULL;
for (node_state = __xml_first_child(status); node_state != NULL;
node_state = __xml_next(node_state)) {
if (crm_str_eq((const char *)node_state->name, XML_CIB_TAG_STATE, TRUE)) {
const char *uname = crm_element_value(node_state, XML_ATTR_UNAME);
if (node != NULL && safe_str_neq(uname, node)) {
continue;
}
this_node = pe_find_node(data_set->nodes, uname);
CRM_CHECK(this_node != NULL, continue);
if (is_remote_node(this_node)) {
determine_remote_online_status(this_node);
} else {
determine_online_status(node_state, this_node, data_set);
}
if (this_node->details->online || is_set(data_set->flags, pe_flag_stonith_enabled)) {
/* offline nodes run no resources...
* unless stonith is enabled in which case we need to
* make sure rsc start events happen after the stonith
*/
xmlNode *lrm_rsc = NULL;
tmp = find_xml_node(node_state, XML_CIB_TAG_LRM, FALSE);
tmp = find_xml_node(tmp, XML_LRM_TAG_RESOURCES, FALSE);
for (lrm_rsc = __xml_first_child(tmp); lrm_rsc != NULL;
lrm_rsc = __xml_next(lrm_rsc)) {
if (crm_str_eq((const char *)lrm_rsc->name, XML_LRM_TAG_RESOURCE, TRUE)) {
const char *rsc_id = crm_element_value(lrm_rsc, XML_ATTR_ID);
if (rsc != NULL && safe_str_neq(rsc_id, rsc)) {
continue;
}
intermediate = extract_operations(uname, rsc_id, lrm_rsc, active_filter);
output = g_list_concat(output, intermediate);
}
}
}
}
}
return output;
}
diff --git a/tools/crm_mon.c b/tools/crm_mon.c
index dd81364d37..2aff78d0c3 100644
--- a/tools/crm_mon.c
+++ b/tools/crm_mon.c
@@ -1,2523 +1,2588 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This software is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#include <sys/param.h>
#include <crm/crm.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <sys/utsname.h>
#include <crm/msg_xml.h>
#include <crm/services.h>
#include <crm/lrmd.h>
#include <crm/common/util.h>
#include <crm/common/xml.h>
#include <crm/common/ipc.h>
#include <crm/common/mainloop.h>
#include <crm/cib/internal.h>
#include <crm/pengine/status.h>
#include <../lib/pengine/unpack.h>
#include <../pengine/pengine.h>
#include <crm/stonith-ng.h>
/* GMainLoop *mainloop = NULL; */
void wait_for_refresh(int offset, const char *prefix, int msec);
void clean_up(int rc);
void crm_diff_update(const char *event, xmlNode * msg);
gboolean mon_refresh_display(gpointer user_data);
int cib_connect(gboolean full);
void mon_st_callback(stonith_t * st, stonith_event_t * e);
char *xml_file = NULL;
char *as_html_file = NULL;
int as_xml = 0;
char *pid_file = NULL;
char *snmp_target = NULL;
char *snmp_community = NULL;
gboolean as_console = TRUE;
gboolean simple_status = FALSE;
gboolean group_by_node = FALSE;
gboolean inactive_resources = FALSE;
gboolean web_cgi = FALSE;
int reconnect_msec = 5000;
gboolean daemonize = FALSE;
GMainLoop *mainloop = NULL;
guint timer_id = 0;
GList *attr_list = NULL;
const char *crm_mail_host = NULL;
const char *crm_mail_prefix = NULL;
const char *crm_mail_from = NULL;
const char *crm_mail_to = NULL;
const char *external_agent = NULL;
const char *external_recipient = NULL;
cib_t *cib = NULL;
stonith_t *st = NULL;
xmlNode *current_cib = NULL;
gboolean one_shot = FALSE;
gboolean has_warnings = FALSE;
gboolean print_failcount = FALSE;
gboolean print_operations = FALSE;
gboolean print_timing = FALSE;
gboolean print_nodes_attr = FALSE;
gboolean print_last_updated = TRUE;
gboolean print_last_change = TRUE;
gboolean print_tickets = FALSE;
gboolean watch_fencing = FALSE;
gboolean hide_headers = FALSE;
+gboolean print_brief = FALSE;
+gboolean print_pending = FALSE;
/* FIXME allow, detect, and correctly interpret glob pattern or regex? */
const char *print_neg_location_prefix;
const char *print_neg_location_prefix_toggle;
#define FILTER_STR {"shutdown", "terminate", "standby", "fail-count", \
"last-failure", "probe_complete", "#id", "#uname", \
"#is_dc", "#kind", NULL}
gboolean log_diffs = FALSE;
gboolean log_updates = FALSE;
long last_refresh = 0;
crm_trigger_t *refresh_trigger = NULL;
/*
* 1.3.6.1.4.1.32723 has been assigned to the project by IANA
* http://www.iana.org/assignments/enterprise-numbers
*/
#define PACEMAKER_PREFIX "1.3.6.1.4.1.32723"
#define PACEMAKER_TRAP_PREFIX PACEMAKER_PREFIX ".1"
#define snmp_crm_trap_oid PACEMAKER_TRAP_PREFIX
#define snmp_crm_oid_node PACEMAKER_TRAP_PREFIX ".1"
#define snmp_crm_oid_rsc PACEMAKER_TRAP_PREFIX ".2"
#define snmp_crm_oid_task PACEMAKER_TRAP_PREFIX ".3"
#define snmp_crm_oid_desc PACEMAKER_TRAP_PREFIX ".4"
#define snmp_crm_oid_status PACEMAKER_TRAP_PREFIX ".5"
#define snmp_crm_oid_rc PACEMAKER_TRAP_PREFIX ".6"
#define snmp_crm_oid_trc PACEMAKER_TRAP_PREFIX ".7"
#if CURSES_ENABLED
# define print_dot() if(as_console) { \
printw("."); \
clrtoeol(); \
refresh(); \
} else { \
fprintf(stdout, "."); \
}
#else
# define print_dot() fprintf(stdout, ".");
#endif
#if CURSES_ENABLED
# define print_as(fmt, args...) if(as_console) { \
printw(fmt, ##args); \
clrtoeol(); \
refresh(); \
} else { \
fprintf(stdout, fmt, ##args); \
}
#else
# define print_as(fmt, args...) fprintf(stdout, fmt, ##args);
#endif
static void
blank_screen(void)
{
#if CURSES_ENABLED
int lpc = 0;
for (lpc = 0; lpc < LINES; lpc++) {
move(lpc, 0);
clrtoeol();
}
move(0, 0);
refresh();
#endif
}
static gboolean
mon_timer_popped(gpointer data)
{
int rc = pcmk_ok;
#if CURSES_ENABLED
if(as_console) {
clear();
refresh();
}
#endif
if (timer_id > 0) {
g_source_remove(timer_id);
}
print_as("Reconnecting...\n");
rc = cib_connect(TRUE);
if (rc != pcmk_ok) {
timer_id = g_timeout_add(reconnect_msec, mon_timer_popped, NULL);
}
return FALSE;
}
static void
mon_cib_connection_destroy(gpointer user_data)
{
print_as("Connection to the CIB terminated\n");
if (cib) {
cib->cmds->signoff(cib);
timer_id = g_timeout_add(reconnect_msec, mon_timer_popped, NULL);
}
return;
}
/*
* Mainloop signal handler.
*/
static void
mon_shutdown(int nsig)
{
clean_up(EX_OK);
}
#if ON_DARWIN
# define sighandler_t sig_t
#endif
#if CURSES_ENABLED
# ifndef HAVE_SIGHANDLER_T
typedef void (*sighandler_t) (int);
# endif
static sighandler_t ncurses_winch_handler;
static void
mon_winresize(int nsig)
{
static int not_done;
int lines = 0, cols = 0;
if (!not_done++) {
if (ncurses_winch_handler)
/* the original ncurses WINCH signal handler does the
* magic of retrieving the new window size;
* otherwise, we'd have to use ioctl or tgetent */
(*ncurses_winch_handler) (SIGWINCH);
getmaxyx(stdscr, lines, cols);
resizeterm(lines, cols);
mainloop_set_trigger(refresh_trigger);
}
not_done--;
}
#endif
int
cib_connect(gboolean full)
{
int rc = pcmk_ok;
static gboolean need_pass = TRUE;
CRM_CHECK(cib != NULL, return -EINVAL);
if (getenv("CIB_passwd") != NULL) {
need_pass = FALSE;
}
if (watch_fencing && st == NULL) {
st = stonith_api_new();
}
if (watch_fencing && st->state == stonith_disconnected) {
crm_trace("Connecting to stonith");
rc = st->cmds->connect(st, crm_system_name, NULL);
if (rc == pcmk_ok) {
crm_trace("Setting up stonith callbacks");
st->cmds->register_notification(st, T_STONITH_NOTIFY_FENCE, mon_st_callback);
}
}
if (cib->state != cib_connected_query && cib->state != cib_connected_command) {
crm_trace("Connecting to the CIB");
if (as_console && need_pass && cib->variant == cib_remote) {
need_pass = FALSE;
print_as("Password:");
}
rc = cib->cmds->signon(cib, crm_system_name, cib_query);
if (rc != pcmk_ok) {
return rc;
}
current_cib = get_cib_copy(cib);
mon_refresh_display(NULL);
if (full) {
if (rc == pcmk_ok) {
rc = cib->cmds->set_connection_dnotify(cib, mon_cib_connection_destroy);
if (rc == -EPROTONOSUPPORT) {
print_as
("Notification setup not supported, won't be able to reconnect after failure");
if (as_console) {
sleep(2);
}
rc = pcmk_ok;
}
}
if (rc == pcmk_ok) {
cib->cmds->del_notify_callback(cib, T_CIB_DIFF_NOTIFY, crm_diff_update);
rc = cib->cmds->add_notify_callback(cib, T_CIB_DIFF_NOTIFY, crm_diff_update);
}
if (rc != pcmk_ok) {
print_as("Notification setup failed, could not monitor CIB actions");
if (as_console) {
sleep(2);
}
clean_up(-rc);
}
}
}
return rc;
}
/* *INDENT-OFF* */
static struct crm_option long_options[] = {
/* Top-level Options */
{"help", 0, 0, '?', "\tThis text"},
{"version", 0, 0, '$', "\tVersion information" },
{"verbose", 0, 0, 'V', "\tIncrease debug output"},
{"quiet", 0, 0, 'Q', "\tDisplay only essential output" },
{"-spacer-", 1, 0, '-', "\nModes:"},
{"as-html", 1, 0, 'h', "\tWrite cluster status to the named html file"},
{"as-xml", 0, 0, 'X', "\t\tWrite cluster status as xml to stdout. This will enable one-shot mode."},
{"web-cgi", 0, 0, 'w', "\t\tWeb mode with output suitable for cgi"},
{"simple-status", 0, 0, 's', "\tDisplay the cluster status once as a simple one line output (suitable for nagios)"},
{"snmp-traps", 1, 0, 'S', "\tSend SNMP traps to this station", !ENABLE_SNMP},
{"snmp-community", 1, 0, 'C', "Specify community for SNMP traps(default is NULL)", !ENABLE_SNMP},
{"mail-to", 1, 0, 'T', "\tSend Mail alerts to this user. See also --mail-from, --mail-host, --mail-prefix", !ENABLE_ESMTP},
{"-spacer-", 1, 0, '-', "\nDisplay Options:"},
{"group-by-node", 0, 0, 'n', "\tGroup resources by node" },
{"inactive", 0, 0, 'r', "\t\tDisplay inactive resources" },
{"failcounts", 0, 0, 'f', "\tDisplay resource fail counts"},
{"operations", 0, 0, 'o', "\tDisplay resource operation history" },
{"timing-details", 0, 0, 't', "\tDisplay resource operation history with timing details" },
{"tickets", 0, 0, 'c', "\t\tDisplay cluster tickets"},
{"watch-fencing", 0, 0, 'W', "\tListen for fencing events. For use with --external-agent, --mail-to and/or --snmp-traps where supported"},
{"neg-locations", 2, 0, 'L', "Display negative location constraints [optionally filtered by id prefix]"},
{"show-node-attributes", 0, 0, 'A', "Display node attributes" },
{"hide-headers", 0, 0, 'D', "\tHide all headers" },
+ {"brief", 0, 0, 'b', "\t\tBrief output" },
+ {"pending", 0, 0, 'j', "\t\tDisplay pending state if 'record-pending' is enabled" },
{"-spacer-", 1, 0, '-', "\nAdditional Options:"},
{"interval", 1, 0, 'i', "\tUpdate frequency in seconds" },
{"one-shot", 0, 0, '1', "\t\tDisplay the cluster status once on the console and exit"},
{"disable-ncurses",0, 0, 'N', "\tDisable the use of ncurses", !CURSES_ENABLED},
{"daemonize", 0, 0, 'd', "\tRun in the background as a daemon"},
{"pid-file", 1, 0, 'p', "\t(Advanced) Daemon pid file location"},
{"mail-from", 1, 0, 'F', "\tMail alerts should come from the named user", !ENABLE_ESMTP},
{"mail-host", 1, 0, 'H', "\tMail alerts should be sent via the named host", !ENABLE_ESMTP},
{"mail-prefix", 1, 0, 'P', "Subjects for mail alerts should start with this string", !ENABLE_ESMTP},
{"external-agent", 1, 0, 'E', "A program to run when resource operations take place."},
{"external-recipient",1, 0, 'e', "A recipient for your program (assuming you want the program to send something to someone)."},
{"xml-file", 1, 0, 'x', NULL, 1},
{"-spacer-", 1, 0, '-', "\nExamples:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', "Display the cluster status on the console with updates as they occur:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_mon", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Display the cluster status on the console just once then exit:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_mon -1", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Display your cluster status, group resources by node, and include inactive resources in the list:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_mon --group-by-node --inactive", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Start crm_mon as a background daemon and have it write the cluster status to an HTML file:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_mon --daemonize --as-html /path/to/docroot/filename.html", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Start crm_mon, export the current cluster status as XML to stdout, then exit:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_mon --as-xml", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Start crm_mon as a background daemon and have it send email alerts:", pcmk_option_paragraph|!ENABLE_ESMTP},
{"-spacer-", 1, 0, '-', " crm_mon --daemonize --mail-to user@example.com --mail-host mail.example.com", pcmk_option_example|!ENABLE_ESMTP},
{"-spacer-", 1, 0, '-', "Start crm_mon as a background daemon and have it send SNMP alerts:", pcmk_option_paragraph|!ENABLE_SNMP},
{"-spacer-", 1, 0, '-', " crm_mon --daemonize --snmp-traps snmptrapd.example.com", pcmk_option_example|!ENABLE_SNMP},
{NULL, 0, 0, 0}
};
/* *INDENT-ON* */
#if CURSES_ENABLED
static const char *
get_option_desc(char c)
{
int lpc;
for (lpc = 0; long_options[lpc].name != NULL; lpc++) {
if (long_options[lpc].name[0] == '-')
continue;
if (long_options[lpc].val == c) {
const char * tab = NULL;
tab = strrchr(long_options[lpc].desc, '\t');
return tab ? ++tab : long_options[lpc].desc;
}
}
return NULL;
}
static gboolean
detect_user_input(GIOChannel *channel, GIOCondition condition, gpointer unused)
{
int c;
gboolean config_mode = FALSE;
while (1) {
/* Get user input */
c = getchar();
switch (c) {
case 'c':
print_tickets = ! print_tickets;
break;
case 'f':
print_failcount = ! print_failcount;
break;
case 'n':
group_by_node = ! group_by_node;
break;
case 'o':
print_operations = ! print_operations;
break;
case 'r':
inactive_resources = ! inactive_resources;
break;
case 't':
print_timing = ! print_timing;
if (print_timing)
print_operations = TRUE;
break;
case 'A':
print_nodes_attr = ! print_nodes_attr;
break;
case 'L':
if (print_neg_location_prefix) {
/* toggle off */
print_neg_location_prefix_toggle = print_neg_location_prefix;
print_neg_location_prefix = NULL;
} else if (print_neg_location_prefix_toggle) {
/* toggle on */
print_neg_location_prefix = print_neg_location_prefix_toggle;
print_neg_location_prefix_toggle = NULL;
} else {
/* toggled on for the first time at runtime */
print_neg_location_prefix = "";
}
break;
case 'D':
hide_headers = ! hide_headers;
break;
+ case 'b':
+ print_brief = ! print_brief;
+ break;
+ case 'j':
+ print_pending = ! print_pending;
+ break;
case '?':
config_mode = TRUE;
break;
default:
goto refresh;
}
if (!config_mode)
goto refresh;
blank_screen();
print_as("Display option change mode\n");
print_as("\n");
print_as("%c c: \t%s\n", print_tickets ? '*': ' ', get_option_desc('c'));
print_as("%c f: \t%s\n", print_failcount ? '*': ' ', get_option_desc('f'));
print_as("%c n: \t%s\n", group_by_node ? '*': ' ', get_option_desc('n'));
print_as("%c o: \t%s\n", print_operations ? '*': ' ', get_option_desc('o'));
print_as("%c r: \t%s\n", inactive_resources ? '*': ' ', get_option_desc('r'));
print_as("%c t: \t%s\n", print_timing ? '*': ' ', get_option_desc('t'));
print_as("%c A: \t%s\n", print_nodes_attr ? '*': ' ', get_option_desc('A'));
print_as("%c L: \t%s\n", print_neg_location_prefix ? '*': ' ', get_option_desc('L'));
print_as("%c D: \t%s\n", hide_headers ? '*': ' ', get_option_desc('D'));
+ print_as("%c b: \t%s\n", print_brief ? '*': ' ', get_option_desc('b'));
+ print_as("%c j: \t%s\n", print_pending ? '*': ' ', get_option_desc('j'));
print_as("\n");
print_as("Toggle fields via field letter, type any other key to return");
}
refresh:
mon_refresh_display(NULL);
return TRUE;
}
#endif
int
main(int argc, char **argv)
{
int flag;
int argerr = 0;
int exit_code = 0;
int option_index = 0;
pid_file = strdup("/tmp/ClusterMon.pid");
crm_log_cli_init("crm_mon");
crm_set_options(NULL, "mode [options]", long_options,
"Provides a summary of the cluster's current state."
"\n\nOutputs varying levels of detail in a number of different formats.\n");
#if !defined (ON_DARWIN) && !defined (ON_BSD)
/* prevent zombies */
signal(SIGCLD, SIG_IGN);
#endif
if (strcmp(crm_system_name, "crm_mon.cgi") == 0) {
web_cgi = TRUE;
one_shot = TRUE;
}
while (1) {
flag = crm_get_option(argc, argv, &option_index);
if (flag == -1)
break;
switch (flag) {
case 'V':
crm_bump_log_level(argc, argv);
break;
case 'Q':
print_last_updated = FALSE;
print_last_change = FALSE;
break;
case 'i':
reconnect_msec = crm_get_msec(optarg);
break;
case 'n':
group_by_node = TRUE;
break;
case 'r':
inactive_resources = TRUE;
break;
case 'W':
watch_fencing = TRUE;
break;
case 'd':
daemonize = TRUE;
break;
case 't':
print_timing = TRUE;
print_operations = TRUE;
break;
case 'o':
print_operations = TRUE;
break;
case 'f':
print_failcount = TRUE;
break;
case 'A':
print_nodes_attr = TRUE;
break;
case 'L':
print_neg_location_prefix = optarg ?: "";
break;
case 'D':
hide_headers = TRUE;
break;
+ case 'b':
+ print_brief = TRUE;
+ break;
+ case 'j':
+ print_pending = TRUE;
+ break;
case 'c':
print_tickets = TRUE;
break;
case 'p':
free(pid_file);
pid_file = strdup(optarg);
break;
case 'x':
xml_file = strdup(optarg);
one_shot = TRUE;
break;
case 'h':
as_html_file = strdup(optarg);
umask(S_IWGRP | S_IWOTH);
break;
case 'X':
as_xml = TRUE;
one_shot = TRUE;
break;
case 'w':
web_cgi = TRUE;
one_shot = TRUE;
break;
case 's':
simple_status = TRUE;
one_shot = TRUE;
break;
case 'S':
snmp_target = optarg;
break;
case 'T':
crm_mail_to = optarg;
break;
case 'F':
crm_mail_from = optarg;
break;
case 'H':
crm_mail_host = optarg;
break;
case 'P':
crm_mail_prefix = optarg;
break;
case 'E':
external_agent = optarg;
break;
case 'e':
external_recipient = optarg;
break;
case '1':
one_shot = TRUE;
break;
case 'N':
as_console = FALSE;
break;
case 'C':
snmp_community = optarg;
break;
case '$':
case '?':
crm_help(flag, EX_OK);
break;
default:
printf("Argument code 0%o (%c) is not (?yet?) supported\n", flag, flag);
++argerr;
break;
}
}
if (optind < argc) {
printf("non-option ARGV-elements: ");
while (optind < argc)
printf("%s ", argv[optind++]);
printf("\n");
}
if (argerr) {
crm_help('?', EX_USAGE);
}
if (one_shot) {
as_console = FALSE;
} else if (daemonize) {
as_console = FALSE;
crm_enable_stderr(FALSE);
if (!as_html_file && !snmp_target && !crm_mail_to && !external_agent && !as_xml) {
printf
("Looks like you forgot to specify one or more of: --as-html, --as-xml, --mail-to, --snmp-traps, --external-agent\n");
crm_help('?', EX_USAGE);
}
crm_make_daemon(crm_system_name, TRUE, pid_file);
} else if (as_console) {
#if CURSES_ENABLED
initscr();
cbreak();
noecho();
crm_enable_stderr(FALSE);
#else
one_shot = TRUE;
as_console = FALSE;
printf("Defaulting to one-shot mode\n");
printf("You need to have curses available at compile time to enable console mode\n");
#endif
}
crm_info("Starting %s", crm_system_name);
if (xml_file != NULL) {
current_cib = filename2xml(xml_file);
mon_refresh_display(NULL);
return exit_code;
}
if (current_cib == NULL) {
cib = cib_new();
do {
if (!one_shot) {
print_as("Attempting connection to the cluster...\n");
}
exit_code = cib_connect(!one_shot);
if (one_shot) {
break;
} else if (exit_code != pcmk_ok) {
sleep(reconnect_msec / 1000);
#if CURSES_ENABLED
if(as_console) {
clear();
refresh();
}
#endif
}
} while (exit_code == -ENOTCONN);
if (exit_code != pcmk_ok) {
print_as("\nConnection to cluster failed: %s\n", pcmk_strerror(exit_code));
if (as_console) {
sleep(2);
}
clean_up(-exit_code);
}
}
if (one_shot) {
return exit_code;
}
mainloop = g_main_new(FALSE);
mainloop_add_signal(SIGTERM, mon_shutdown);
mainloop_add_signal(SIGINT, mon_shutdown);
#if CURSES_ENABLED
if (as_console) {
ncurses_winch_handler = signal(SIGWINCH, mon_winresize);
if (ncurses_winch_handler == SIG_DFL ||
ncurses_winch_handler == SIG_IGN || ncurses_winch_handler == SIG_ERR)
ncurses_winch_handler = NULL;
g_io_add_watch(g_io_channel_unix_new(STDIN_FILENO), G_IO_IN, detect_user_input, NULL);
}
#endif
refresh_trigger = mainloop_add_trigger(G_PRIORITY_LOW, mon_refresh_display, NULL);
g_main_run(mainloop);
g_main_destroy(mainloop);
crm_info("Exiting %s", crm_system_name);
clean_up(0);
return 0; /* never reached */
}
#define mon_warn(fmt...) do { \
if (!has_warnings) { \
print_as("Warning:"); \
} else { \
print_as(","); \
} \
print_as(fmt); \
has_warnings = TRUE; \
} while(0)
static int
count_resources(pe_working_set_t * data_set, resource_t * rsc)
{
int count = 0;
GListPtr gIter = NULL;
if (rsc == NULL) {
gIter = data_set->resources;
} else if (rsc->children) {
gIter = rsc->children;
} else {
return is_not_set(rsc->flags, pe_rsc_orphan);
}
for (; gIter != NULL; gIter = gIter->next) {
count += count_resources(data_set, gIter->data);
}
return count;
}
static int
print_simple_status(pe_working_set_t * data_set)
{
node_t *dc = NULL;
GListPtr gIter = NULL;
int nodes_online = 0;
int nodes_standby = 0;
int nodes_maintenance = 0;
dc = data_set->dc_node;
if (dc == NULL) {
mon_warn("No DC ");
}
for (gIter = data_set->nodes; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
if (node->details->standby && node->details->online) {
nodes_standby++;
} else if (node->details->maintenance && node->details->online) {
nodes_maintenance++;
} else if (node->details->online) {
nodes_online++;
} else {
mon_warn("offline node: %s", node->details->uname);
}
}
if (!has_warnings) {
print_as("Ok: %d nodes online", nodes_online);
if (nodes_standby > 0) {
print_as(", %d standby nodes", nodes_standby);
}
if (nodes_maintenance > 0) {
print_as(", %d maintenance nodes", nodes_maintenance);
}
print_as(", %d resources configured", count_resources(data_set, NULL));
}
print_as("\n");
return 0;
}
static void
print_date(time_t time)
{
int lpc = 0;
char date_str[26];
asctime_r(localtime(&time), date_str);
for (; lpc < 26; lpc++) {
if (date_str[lpc] == '\n') {
date_str[lpc] = 0;
}
}
print_as("'%s'", date_str);
}
#include <crm/pengine/internal.h>
static void
print_rsc_summary(pe_working_set_t * data_set, node_t * node, resource_t * rsc, gboolean all)
{
gboolean printed = FALSE;
time_t last_failure = 0;
int failcount = get_failcount_full(node, rsc, &last_failure, FALSE, data_set);
if (all || failcount || last_failure > 0) {
printed = TRUE;
print_as(" %s: migration-threshold=%d", rsc_printable_id(rsc), rsc->migration_threshold);
}
if (failcount > 0) {
printed = TRUE;
print_as(" fail-count=%d", failcount);
}
if (last_failure > 0) {
printed = TRUE;
print_as(" last-failure=");
print_date(last_failure);
}
if (printed) {
print_as("\n");
}
}
static void
print_rsc_history(pe_working_set_t * data_set, node_t * node, xmlNode * rsc_entry)
{
GListPtr gIter = NULL;
GListPtr op_list = NULL;
gboolean print_name = TRUE;
GListPtr sorted_op_list = NULL;
const char *rsc_id = crm_element_value(rsc_entry, XML_ATTR_ID);
resource_t *rsc = pe_find_resource(data_set->resources, rsc_id);
xmlNode *rsc_op = NULL;
for (rsc_op = __xml_first_child(rsc_entry); rsc_op != NULL; rsc_op = __xml_next(rsc_op)) {
if (crm_str_eq((const char *)rsc_op->name, XML_LRM_TAG_RSC_OP, TRUE)) {
op_list = g_list_append(op_list, rsc_op);
}
}
sorted_op_list = g_list_sort(op_list, sort_op_by_callid);
for (gIter = sorted_op_list; gIter != NULL; gIter = gIter->next) {
xmlNode *xml_op = (xmlNode *) gIter->data;
const char *value = NULL;
const char *call = crm_element_value(xml_op, XML_LRM_ATTR_CALLID);
const char *task = crm_element_value(xml_op, XML_LRM_ATTR_TASK);
const char *op_rc = crm_element_value(xml_op, XML_LRM_ATTR_RC);
const char *interval = crm_element_value(xml_op, XML_LRM_ATTR_INTERVAL);
int rc = crm_parse_int(op_rc, "0");
if (safe_str_eq(task, CRMD_ACTION_STATUS)
&& safe_str_eq(interval, "0")) {
task = "probe";
}
/* Skip probes that found the resource not running (OCF rc 7) */
if (rc == 7 && safe_str_eq(task, "probe")) {
continue;
} else if (safe_str_eq(task, CRMD_ACTION_NOTIFY)) {
continue;
}
if (print_name) {
print_name = FALSE;
if (rsc == NULL) {
print_as("Orphan resource: %s", rsc_id);
} else {
print_rsc_summary(data_set, node, rsc, TRUE);
}
}
print_as(" + (%s) %s:", call, task);
if (safe_str_neq(interval, "0")) {
print_as(" interval=%sms", interval);
}
if (print_timing) {
int int_value;
const char *attr = XML_RSC_OP_LAST_CHANGE;
value = crm_element_value(xml_op, attr);
if (value) {
int_value = crm_parse_int(value, NULL);
if (int_value > 0) {
print_as(" %s=", attr);
print_date(int_value);
}
}
attr = XML_RSC_OP_LAST_RUN;
value = crm_element_value(xml_op, attr);
if (value) {
int_value = crm_parse_int(value, NULL);
if (int_value > 0) {
print_as(" %s=", attr);
print_date(int_value);
}
}
attr = XML_RSC_OP_T_EXEC;
value = crm_element_value(xml_op, attr);
if (value) {
int_value = crm_parse_int(value, NULL);
print_as(" %s=%dms", attr, int_value);
}
attr = XML_RSC_OP_T_QUEUE;
value = crm_element_value(xml_op, attr);
if (value) {
int_value = crm_parse_int(value, NULL);
print_as(" %s=%dms", attr, int_value);
}
}
print_as(" rc=%s (%s)\n", op_rc, services_ocf_exitcode_str(rc));
}
/* no need to free the contents */
g_list_free(sorted_op_list);
}
static void
print_attr_msg(node_t * node, GListPtr rsc_list, const char *attrname, const char *attrvalue)
{
GListPtr gIter = NULL;
for (gIter = rsc_list; gIter != NULL; gIter = gIter->next) {
resource_t *rsc = (resource_t *) gIter->data;
const char *type = g_hash_table_lookup(rsc->meta, "type");
if (rsc->children != NULL) {
print_attr_msg(node, rsc->children, attrname, attrvalue);
}
if (safe_str_eq(type, "ping") || safe_str_eq(type, "pingd")) {
const char *name = g_hash_table_lookup(rsc->parameters, "name");
if (name == NULL) {
name = "pingd";
}
/* Match the resource to the attribute name */
if (safe_str_eq(name, attrname)) {
int host_list_num = 0;
int expected_score = 0;
int value = crm_parse_int(attrvalue, "0");
const char *hosts = g_hash_table_lookup(rsc->parameters, "host_list");
const char *multiplier = g_hash_table_lookup(rsc->parameters, "multiplier");
if(hosts) {
char **host_list = g_strsplit(hosts, " ", 0);
host_list_num = g_strv_length(host_list);
g_strfreev(host_list);
}
/* The pingd multiplier defaults to 1 */
expected_score = host_list_num * crm_parse_int(multiplier, "1");
/* Flag abnormal pingd scores */
if (value <= 0) {
print_as("\t: Connectivity is lost");
} else if (value < expected_score) {
print_as("\t: Connectivity is degraded (Expected=%d)", expected_score);
}
}
}
}
}
static int
compare_attribute(gconstpointer a, gconstpointer b)
{
int rc;
rc = strcmp((const char *)a, (const char *)b);
return rc;
}
static void
create_attr_list(gpointer name, gpointer value, gpointer data)
{
int i;
const char *filt_str[] = FILTER_STR;
CRM_CHECK(name != NULL, return);
/* filtering automatic attributes */
for (i = 0; filt_str[i] != NULL; i++) {
if (g_str_has_prefix(name, filt_str[i])) {
return;
}
}
attr_list = g_list_insert_sorted(attr_list, name, compare_attribute);
}
static void
print_node_attribute(gpointer name, gpointer node_data)
{
const char *value = NULL;
node_t *node = (node_t *) node_data;
value = g_hash_table_lookup(node->details->attrs, name);
print_as(" + %-32s\t: %-10s", (char *)name, value);
print_attr_msg(node, node->details->running_rsc, name, value);
print_as("\n");
}
static void
print_node_summary(pe_working_set_t * data_set, gboolean operations)
{
xmlNode *lrm_rsc = NULL;
xmlNode *rsc_entry = NULL;
xmlNode *node_state = NULL;
xmlNode *cib_status = get_object_root(XML_CIB_TAG_STATUS, data_set->input);
if (operations) {
print_as("\nOperations:\n");
} else {
print_as("\nMigration summary:\n");
}
for (node_state = __xml_first_child(cib_status); node_state != NULL;
node_state = __xml_next(node_state)) {
if (crm_str_eq((const char *)node_state->name, XML_CIB_TAG_STATE, TRUE)) {
node_t *node = pe_find_node_id(data_set->nodes, ID(node_state));
if (node == NULL || node->details->online == FALSE) {
continue;
}
print_as("* Node %s: ", crm_element_value(node_state, XML_ATTR_UNAME));
print_as("\n");
lrm_rsc = find_xml_node(node_state, XML_CIB_TAG_LRM, FALSE);
lrm_rsc = find_xml_node(lrm_rsc, XML_LRM_TAG_RESOURCES, FALSE);
for (rsc_entry = __xml_first_child(lrm_rsc); rsc_entry != NULL;
rsc_entry = __xml_next(rsc_entry)) {
if (crm_str_eq((const char *)rsc_entry->name, XML_LRM_TAG_RESOURCE, TRUE)) {
if (operations) {
print_rsc_history(data_set, node, rsc_entry);
} else {
const char *rsc_id = crm_element_value(rsc_entry, XML_ATTR_ID);
resource_t *rsc = pe_find_resource(data_set->resources, rsc_id);
if (rsc) {
print_rsc_summary(data_set, node, rsc, FALSE);
} else {
print_as(" %s: orphan\n", rsc_id);
}
}
}
}
}
}
}
static void
print_ticket(gpointer name, gpointer value, gpointer data)
{
ticket_t *ticket = (ticket_t *) value;
print_as(" %s\t%s%10s", ticket->id,
ticket->granted ? "granted" : "revoked", ticket->standby ? " [standby]" : "");
if (ticket->last_granted > -1) {
print_as(" last-granted=");
print_date(ticket->last_granted);
}
print_as("\n");
return;
}
static void
print_cluster_tickets(pe_working_set_t * data_set)
{
print_as("\nTickets:\n");
g_hash_table_foreach(data_set->tickets, print_ticket, NULL);
return;
}
static void print_neg_locations(pe_working_set_t *data_set)
{
GListPtr gIter, gIter2;
print_as("\nNegative location constraints:\n");
for (gIter = data_set->placement_constraints; gIter != NULL; gIter = gIter->next) {
rsc_to_node_t *location = (rsc_to_node_t *) gIter->data;
if (!g_str_has_prefix(location->id, print_neg_location_prefix))
continue;
for (gIter2 = location->node_list_rh; gIter2 != NULL; gIter2 = gIter2->next) {
node_t *node = (node_t *) gIter2->data;
if (node->weight >= 0) /* != -INFINITY ??? */
continue;
print_as(" %s\tprevents %s from running %son %s\n",
location->id, location->rsc_lh->id,
location->role_filter == RSC_ROLE_MASTER ? "as Master " : "",
node->details->uname);
}
}
}
static int
print_status(pe_working_set_t * data_set)
{
static int updates = 0;
GListPtr gIter = NULL;
node_t *dc = NULL;
char *since_epoch = NULL;
char *online_nodes = NULL;
char *online_remote_nodes = NULL;
char *online_remote_containers = NULL;
char *offline_nodes = NULL;
char *offline_remote_nodes = NULL;
const char *stack_s = NULL;
xmlNode *dc_version = NULL;
xmlNode *quorum_node = NULL;
xmlNode *stack = NULL;
time_t a_time = time(NULL);
int print_opts = pe_print_ncurses;
const char *quorum_votes = "unknown";
if (as_console) {
blank_screen();
} else {
print_opts = pe_print_printf;
}
+ if (print_pending) {
+ print_opts |= pe_print_pending;
+ }
+
updates++;
dc = data_set->dc_node;
if (a_time == (time_t) - 1) {
crm_perror(LOG_ERR, "time(): Invalid time returned");
return 1;
}
since_epoch = ctime(&a_time);
if (since_epoch != NULL && print_last_updated && !hide_headers) {
print_as("Last updated: %s", since_epoch);
}
if (print_last_change && !hide_headers) {
const char *last_written = crm_element_value(data_set->input, XML_CIB_ATTR_WRITTEN);
const char *user = crm_element_value(data_set->input, XML_ATTR_UPDATE_USER);
const char *client = crm_element_value(data_set->input, XML_ATTR_UPDATE_CLIENT);
const char *origin = crm_element_value(data_set->input, XML_ATTR_UPDATE_ORIG);
print_as("Last change: %s", last_written ? last_written : "");
if (user) {
print_as(" by %s", user);
}
if (client) {
print_as(" via %s", client);
}
if (origin) {
print_as(" on %s", origin);
}
print_as("\n");
}
stack =
get_xpath_object("//nvpair[@name='cluster-infrastructure']", data_set->input, LOG_DEBUG);
if (stack) {
stack_s = crm_element_value(stack, XML_NVPAIR_ATTR_VALUE);
if (!hide_headers) {
print_as("Stack: %s\n", stack_s);
}
}
dc_version = get_xpath_object("//nvpair[@name='dc-version']", data_set->input, LOG_DEBUG);
if (dc == NULL) {
print_as("Current DC: NONE\n");
} else if (!hide_headers) {
const char *quorum = crm_element_value(data_set->input, XML_ATTR_HAVE_QUORUM);
if (safe_str_neq(dc->details->uname, dc->details->id)) {
print_as("Current DC: %s (%s)", dc->details->uname, dc->details->id);
} else {
print_as("Current DC: %s", dc->details->uname);
}
print_as(" - partition %s quorum\n", crm_is_true(quorum) ? "with" : "WITHOUT");
if (dc_version) {
print_as("Version: %s\n", crm_element_value(dc_version, XML_NVPAIR_ATTR_VALUE));
}
}
quorum_node =
get_xpath_object("//nvpair[@name='" XML_ATTR_EXPECTED_VOTES "']", data_set->input,
LOG_DEBUG);
if (quorum_node) {
quorum_votes = crm_element_value(quorum_node, XML_NVPAIR_ATTR_VALUE);
}
if(!hide_headers) {
if(stack_s && strstr(stack_s, "classic openais") != NULL) {
print_as("%d Nodes configured, %s expected votes\n", g_list_length(data_set->nodes),
quorum_votes);
} else {
print_as("%d Nodes configured\n", g_list_length(data_set->nodes));
}
print_as("%d Resources configured\n", count_resources(data_set, NULL));
print_as("\n\n");
}
for (gIter = data_set->nodes; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
const char *node_mode = NULL;
char *node_name = NULL;
if (is_container_remote_node(node)) {
node_name = g_strdup_printf("%s:%s", node->details->uname, node->details->remote_rsc->container->id);
} else {
node_name = g_strdup(node->details->uname);
}
if (node->details->unclean) {
if (node->details->online) {
node_mode = "UNCLEAN (online)";
} else if (node->details->pending) {
node_mode = "UNCLEAN (pending)";
} else {
node_mode = "UNCLEAN (offline)";
}
} else if (node->details->pending) {
node_mode = "pending";
} else if (node->details->standby_onfail && node->details->online) {
node_mode = "standby (on-fail)";
} else if (node->details->standby) {
if (node->details->online) {
node_mode = "standby";
} else {
node_mode = "OFFLINE (standby)";
}
} else if (node->details->maintenance) {
if (node->details->online) {
node_mode = "maintenance";
} else {
node_mode = "OFFLINE (maintenance)";
}
} else if (node->details->online) {
node_mode = "online";
if (group_by_node == FALSE) {
if (is_container_remote_node(node)) {
online_remote_containers = add_list_element(online_remote_containers, node_name);
} else if (is_baremetal_remote_node(node)) {
online_remote_nodes = add_list_element(online_remote_nodes, node_name);
} else {
online_nodes = add_list_element(online_nodes, node_name);
}
continue;
}
} else {
node_mode = "OFFLINE";
if (group_by_node == FALSE) {
if (is_baremetal_remote_node(node)) {
offline_remote_nodes = add_list_element(offline_remote_nodes, node_name);
} else if (is_container_remote_node(node)) {
/* ignore offline container nodes */
} else {
offline_nodes = add_list_element(offline_nodes, node_name);
}
continue;
}
}
if (is_container_remote_node(node)) {
print_as("ContainerNode %s: %s\n", node_name, node_mode);
} else if (is_baremetal_remote_node(node)) {
print_as("RemoteNode %s: %s\n", node_name, node_mode);
} else if (safe_str_eq(node->details->uname, node->details->id)) {
print_as("Node %s: %s\n", node_name, node_mode);
} else {
print_as("Node %s (%s): %s\n", node_name, node->details->id, node_mode);
}
- if (group_by_node) {
+ if (print_brief && group_by_node) {
+ print_rscs_brief(node->details->running_rsc, "\t", print_opts | pe_print_rsconly,
+ stdout, FALSE);
+
+ } else if (group_by_node) {
GListPtr gIter2 = NULL;
for (gIter2 = node->details->running_rsc; gIter2 != NULL; gIter2 = gIter2->next) {
resource_t *rsc = (resource_t *) gIter2->data;
rsc->fns->print(rsc, "\t", print_opts | pe_print_rsconly, stdout);
}
}
free(node_name);
}
if (online_nodes) {
print_as("Online: [%s ]\n", online_nodes);
free(online_nodes);
}
if (offline_nodes) {
print_as("OFFLINE: [%s ]\n", offline_nodes);
free(offline_nodes);
}
if (online_remote_nodes) {
print_as("RemoteOnline: [%s ]\n", online_remote_nodes);
free(online_remote_nodes);
}
if (offline_remote_nodes) {
print_as("RemoteOFFLINE: [%s ]\n", offline_remote_nodes);
free(offline_remote_nodes);
}
if (online_remote_containers) {
print_as("Containers: [%s ]\n", online_remote_containers);
free(online_remote_containers);
}
if (group_by_node == FALSE && inactive_resources) {
print_as("\nFull list of resources:\n");
} else if (inactive_resources) {
print_as("\nInactive resources:\n");
}
if (group_by_node == FALSE || inactive_resources) {
print_as("\n");
+
+ if (print_brief && group_by_node == FALSE) {
+ print_opts |= pe_print_brief;
+ print_rscs_brief(data_set->resources, NULL, print_opts, stdout,
+ inactive_resources);
+ }
+
for (gIter = data_set->resources; gIter != NULL; gIter = gIter->next) {
resource_t *rsc = (resource_t *) gIter->data;
gboolean is_active = rsc->fns->active(rsc, TRUE);
gboolean partially_active = rsc->fns->active(rsc, FALSE);
+ if (print_brief && group_by_node == FALSE
+ && rsc->variant == pe_native) {
+ continue;
+ }
+
if (is_set(rsc->flags, pe_rsc_orphan) && is_active == FALSE) {
continue;
} else if (group_by_node == FALSE) {
if (partially_active || inactive_resources) {
rsc->fns->print(rsc, NULL, print_opts, stdout);
}
} else if (is_active == FALSE && inactive_resources) {
rsc->fns->print(rsc, NULL, print_opts, stdout);
}
}
}
if (print_nodes_attr) {
print_as("\nNode Attributes:\n");
for (gIter = data_set->nodes; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
if (node == NULL || node->details->online == FALSE) {
continue;
}
print_as("* Node %s:\n", node->details->uname);
g_hash_table_foreach(node->details->attrs, create_attr_list, NULL);
g_list_foreach(attr_list, print_node_attribute, node);
g_list_free(attr_list);
attr_list = NULL;
}
}
if (print_operations || print_failcount) {
print_node_summary(data_set, print_operations);
}
if (xml_has_children(data_set->failed)) {
xmlNode *xml_op = NULL;
print_as("\nFailed actions:\n");
for (xml_op = __xml_first_child(data_set->failed); xml_op != NULL;
xml_op = __xml_next(xml_op)) {
int status = 0;
int rc = 0;
const char *id = ID(xml_op);
const char *op_key = crm_element_value(xml_op, XML_LRM_ATTR_TASK_KEY);
const char *last = crm_element_value(xml_op, XML_RSC_OP_LAST_CHANGE);
const char *node = crm_element_value(xml_op, XML_ATTR_UNAME);
const char *call = crm_element_value(xml_op, XML_LRM_ATTR_CALLID);
const char *rc_s = crm_element_value(xml_op, XML_LRM_ATTR_RC);
const char *status_s = crm_element_value(xml_op, XML_LRM_ATTR_OPSTATUS);
rc = crm_parse_int(rc_s, "0");
status = crm_parse_int(status_s, "0");
if (last) {
time_t run_at = crm_parse_int(last, "0");
char *run_at_s = ctime(&run_at);
if(run_at_s) {
run_at_s[24] = 0; /* Overwrite the newline */
}
print_as(" %s on %s '%s' (%d): call=%s, status=%s, last-rc-change='%s', queued=%sms, exec=%sms\n",
op_key ? op_key : id, node, services_ocf_exitcode_str(rc), rc, call, services_lrm_status_str(status),
run_at_s, crm_element_value(xml_op, XML_RSC_OP_T_QUEUE), crm_element_value(xml_op, XML_RSC_OP_T_EXEC));
} else {
print_as(" %s on %s '%s' (%d): call=%s, status=%s\n",
op_key ? op_key : id, node, services_ocf_exitcode_str(rc), rc, call, services_lrm_status_str(status));
}
}
print_as("\n");
}
if (print_tickets || print_neg_location_prefix) {
/* Record tickets that are referenced in rsc_ticket constraints
* but have not yet been granted.
* Unpacking the constraints is also needed in order to print
* the negative location constraint summary. */
xmlNode *cib_constraints = get_object_root(XML_CIB_TAG_CONSTRAINTS, data_set->input);
unpack_constraints(cib_constraints, data_set);
}
if (print_tickets) {
print_cluster_tickets(data_set);
}
if (print_neg_location_prefix) {
print_neg_locations(data_set);
}
#if CURSES_ENABLED
if (as_console) {
refresh();
}
#endif
return 0;
}
static int
print_xml_status(pe_working_set_t * data_set)
{
FILE *stream = stdout;
GListPtr gIter = NULL;
node_t *dc = NULL;
xmlNode *stack = NULL;
xmlNode *quorum_node = NULL;
const char *quorum_votes = "unknown";
+ int print_opts = pe_print_xml;
dc = data_set->dc_node;
+ if (print_pending) {
+ print_opts |= pe_print_pending;
+ }
+
fprintf(stream, "<?xml version=\"1.0\"?>\n");
fprintf(stream, "<crm_mon version=\"%s\">\n", VERSION);
/*** SUMMARY ***/
fprintf(stream, " <summary>\n");
if (print_last_updated) {
time_t now = time(NULL);
char *now_str = ctime(&now);
now_str[24] = EOS; /* replace the newline */
fprintf(stream, " <last_update time=\"%s\" />\n", now_str);
}
if (print_last_change) {
const char *last_written = crm_element_value(data_set->input, XML_CIB_ATTR_WRITTEN);
const char *user = crm_element_value(data_set->input, XML_ATTR_UPDATE_USER);
const char *client = crm_element_value(data_set->input, XML_ATTR_UPDATE_CLIENT);
const char *origin = crm_element_value(data_set->input, XML_ATTR_UPDATE_ORIG);
fprintf(stream,
" <last_change time=\"%s\" user=\"%s\" client=\"%s\" origin=\"%s\" />\n",
last_written ? last_written : "", user ? user : "", client ? client : "",
origin ? origin : "");
}
stack = get_xpath_object("//nvpair[@name='cluster-infrastructure']",
data_set->input, LOG_DEBUG);
if (stack) {
fprintf(stream, " <stack type=\"%s\" />\n",
crm_element_value(stack, XML_NVPAIR_ATTR_VALUE));
}
if (!dc) {
fprintf(stream, " <current_dc present=\"false\" />\n");
} else {
const char *quorum = crm_element_value(data_set->input, XML_ATTR_HAVE_QUORUM);
const char *uname = dc->details->uname;
const char *id = dc->details->id;
xmlNode *dc_version = get_xpath_object("//nvpair[@name='dc-version']",
data_set->input,
LOG_DEBUG);
fprintf(stream,
" <current_dc present=\"true\" version=\"%s\" name=\"%s\" id=\"%s\" with_quorum=\"%s\" />\n",
dc_version ? crm_element_value(dc_version, XML_NVPAIR_ATTR_VALUE) : "", uname, id,
quorum ? (crm_is_true(quorum) ? "true" : "false") : "false");
}
quorum_node = get_xpath_object("//nvpair[@name='" XML_ATTR_EXPECTED_VOTES "']",
data_set->input, LOG_DEBUG);
if (quorum_node) {
quorum_votes = crm_element_value(quorum_node, XML_NVPAIR_ATTR_VALUE);
}
fprintf(stream, " <nodes_configured number=\"%d\" expected_votes=\"%s\" />\n",
g_list_length(data_set->nodes), quorum_votes);
fprintf(stream, " <resources_configured number=\"%d\" />\n",
count_resources(data_set, NULL));
fprintf(stream, " </summary>\n");
/*** NODES ***/
fprintf(stream, " <nodes>\n");
for (gIter = data_set->nodes; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
const char *node_type = "unknown";
switch (node->details->type) {
case node_member:
node_type = "member";
break;
case node_remote:
node_type = "remote";
break;
case node_ping:
node_type = "ping";
break;
}
fprintf(stream, " <node name=\"%s\" ", node->details->uname);
fprintf(stream, "id=\"%s\" ", node->details->id);
fprintf(stream, "online=\"%s\" ", node->details->online ? "true" : "false");
fprintf(stream, "standby=\"%s\" ", node->details->standby ? "true" : "false");
fprintf(stream, "standby_onfail=\"%s\" ", node->details->standby_onfail ? "true" : "false");
fprintf(stream, "maintenance=\"%s\" ", node->details->maintenance ? "true" : "false");
fprintf(stream, "pending=\"%s\" ", node->details->pending ? "true" : "false");
fprintf(stream, "unclean=\"%s\" ", node->details->unclean ? "true" : "false");
fprintf(stream, "shutdown=\"%s\" ", node->details->shutdown ? "true" : "false");
fprintf(stream, "expected_up=\"%s\" ", node->details->expected_up ? "true" : "false");
fprintf(stream, "is_dc=\"%s\" ", node->details->is_dc ? "true" : "false");
fprintf(stream, "resources_running=\"%d\" ", g_list_length(node->details->running_rsc));
fprintf(stream, "type=\"%s\" ", node_type);
if (group_by_node) {
GListPtr lpc2 = NULL;
fprintf(stream, ">\n");
for (lpc2 = node->details->running_rsc; lpc2 != NULL; lpc2 = lpc2->next) {
resource_t *rsc = (resource_t *) lpc2->data;
- rsc->fns->print(rsc, " ", pe_print_xml | pe_print_rsconly, stream);
+ rsc->fns->print(rsc, " ", print_opts | pe_print_rsconly, stream);
}
fprintf(stream, " </node>\n");
} else {
fprintf(stream, "/>\n");
}
}
fprintf(stream, " </nodes>\n");
/*** RESOURCES ***/
if (group_by_node == FALSE || inactive_resources) {
fprintf(stream, " <resources>\n");
for (gIter = data_set->resources; gIter != NULL; gIter = gIter->next) {
resource_t *rsc = (resource_t *) gIter->data;
gboolean is_active = rsc->fns->active(rsc, TRUE);
gboolean partially_active = rsc->fns->active(rsc, FALSE);
if (is_set(rsc->flags, pe_rsc_orphan) && is_active == FALSE) {
continue;
} else if (group_by_node == FALSE) {
if (partially_active || inactive_resources) {
- rsc->fns->print(rsc, " ", pe_print_xml, stream);
+ rsc->fns->print(rsc, " ", print_opts, stream);
}
} else if (is_active == FALSE && inactive_resources) {
- rsc->fns->print(rsc, " ", pe_print_xml, stream);
+ rsc->fns->print(rsc, " ", print_opts, stream);
}
}
fprintf(stream, " </resources>\n");
}
fprintf(stream, "</crm_mon>\n");
fflush(stream);
fclose(stream);
return 0;
}
static int
print_html_status(pe_working_set_t * data_set, const char *filename, gboolean web_cgi)
{
FILE *stream;
GListPtr gIter = NULL;
node_t *dc = NULL;
static int updates = 0;
char *filename_tmp = NULL;
+ int print_opts = pe_print_html;
+
+ if (print_pending) {
+ print_opts |= pe_print_pending;
+ }
if (web_cgi) {
stream = stdout;
fprintf(stream, "Content-type: text/html\n\n");
} else {
filename_tmp = crm_concat(filename, "tmp", '.');
stream = fopen(filename_tmp, "w");
if (stream == NULL) {
crm_perror(LOG_ERR, "Cannot open %s for writing", filename_tmp);
free(filename_tmp);
return -1;
}
}
updates++;
dc = data_set->dc_node;
fprintf(stream, "<html>");
fprintf(stream, "<head>");
fprintf(stream, "<title>Cluster status</title>");
/* content="%d;url=http://webdesign.about.com" */
fprintf(stream, "<meta http-equiv=\"refresh\" content=\"%d\">", reconnect_msec / 1000);
fprintf(stream, "</head>");
/*** SUMMARY ***/
fprintf(stream, "<h2>Cluster summary</h2>");
{
char *now_str = NULL;
time_t now = time(NULL);
now_str = ctime(&now);
now_str[24] = EOS; /* replace the newline */
fprintf(stream, "Last updated: <b>%s</b><br/>\n", now_str);
}
if (dc == NULL) {
fprintf(stream, "Current DC: <font color=\"red\"><b>NONE</b></font><br/>");
} else {
fprintf(stream, "Current DC: %s (%s)<br/>", dc->details->uname, dc->details->id);
}
fprintf(stream, "%d Nodes configured.<br/>", g_list_length(data_set->nodes));
fprintf(stream, "%d Resources configured.<br/>", count_resources(data_set, NULL));
/*** CONFIG ***/
fprintf(stream, "<h3>Config Options</h3>\n");
fprintf(stream, "<table>\n");
fprintf(stream, "<tr><td>STONITH of failed nodes</td><td>:</td><td>%s</td></tr>\n",
is_set(data_set->flags, pe_flag_stonith_enabled) ? "enabled" : "disabled");
fprintf(stream, "<tr><td>Cluster is</td><td>:</td><td>%ssymmetric</td></tr>\n",
is_set(data_set->flags, pe_flag_symmetric_cluster) ? "" : "a-");
fprintf(stream, "<tr><td>No Quorum Policy</td><td>:</td><td>");
switch (data_set->no_quorum_policy) {
case no_quorum_freeze:
fprintf(stream, "Freeze resources");
break;
case no_quorum_stop:
fprintf(stream, "Stop ALL resources");
break;
case no_quorum_ignore:
fprintf(stream, "Ignore");
break;
case no_quorum_suicide:
fprintf(stream, "Suicide");
break;
}
fprintf(stream, "\n</td></tr>\n</table>\n");
/*** NODE LIST ***/
fprintf(stream, "<h2>Node List</h2>\n");
fprintf(stream, "<ul>\n");
for (gIter = data_set->nodes; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
fprintf(stream, "<li>");
if (node->details->standby_onfail && node->details->online) {
fprintf(stream, "Node: %s (%s): %s", node->details->uname, node->details->id,
"<font color=\"orange\">standby (on-fail)</font>\n");
} else if (node->details->standby && node->details->online) {
fprintf(stream, "Node: %s (%s): %s", node->details->uname, node->details->id,
"<font color=\"orange\">standby</font>\n");
} else if (node->details->standby) {
fprintf(stream, "Node: %s (%s): %s", node->details->uname, node->details->id,
"<font color=\"red\">OFFLINE (standby)</font>\n");
} else if (node->details->maintenance && node->details->online) {
fprintf(stream, "Node: %s (%s): %s", node->details->uname, node->details->id,
"<font color=\"blue\">maintenance</font>\n");
} else if (node->details->maintenance) {
fprintf(stream, "Node: %s (%s): %s", node->details->uname, node->details->id,
"<font color=\"red\">OFFLINE (maintenance)</font>\n");
} else if (node->details->online) {
fprintf(stream, "Node: %s (%s): %s", node->details->uname, node->details->id,
"<font color=\"green\">online</font>\n");
} else {
fprintf(stream, "Node: %s (%s): %s", node->details->uname, node->details->id,
"<font color=\"red\">OFFLINE</font>\n");
}
- if (group_by_node) {
+ if (print_brief && group_by_node) {
+ fprintf(stream, "<ul>\n");
+ print_rscs_brief(node->details->running_rsc, NULL, print_opts | pe_print_rsconly,
+ stream, FALSE);
+ fprintf(stream, "</ul>\n");
+
+ } else if (group_by_node) {
GListPtr lpc2 = NULL;
fprintf(stream, "<ul>\n");
for (lpc2 = node->details->running_rsc; lpc2 != NULL; lpc2 = lpc2->next) {
resource_t *rsc = (resource_t *) lpc2->data;
fprintf(stream, "<li>");
- rsc->fns->print(rsc, NULL, pe_print_html | pe_print_rsconly, stream);
+ rsc->fns->print(rsc, NULL, print_opts | pe_print_rsconly, stream);
fprintf(stream, "</li>\n");
}
fprintf(stream, "</ul>\n");
}
fprintf(stream, "</li>\n");
}
fprintf(stream, "</ul>\n");
if (group_by_node && inactive_resources) {
fprintf(stream, "<h2>Inactive Resources</h2>\n");
} else if (group_by_node == FALSE) {
fprintf(stream, "<h2>Resource List</h2>\n");
}
if (group_by_node == FALSE || inactive_resources) {
+ if (print_brief && group_by_node == FALSE) {
+ print_opts |= pe_print_brief;
+ print_rscs_brief(data_set->resources, NULL, print_opts, stream,
+ inactive_resources);
+ }
+
for (gIter = data_set->resources; gIter != NULL; gIter = gIter->next) {
resource_t *rsc = (resource_t *) gIter->data;
gboolean is_active = rsc->fns->active(rsc, TRUE);
gboolean partially_active = rsc->fns->active(rsc, FALSE);
+ if (print_brief && group_by_node == FALSE
+ && rsc->variant == pe_native) {
+ continue;
+ }
+
if (is_set(rsc->flags, pe_rsc_orphan) && is_active == FALSE) {
continue;
} else if (group_by_node == FALSE) {
if (partially_active || inactive_resources) {
- rsc->fns->print(rsc, NULL, pe_print_html, stream);
+ rsc->fns->print(rsc, NULL, print_opts, stream);
}
} else if (is_active == FALSE && inactive_resources) {
- rsc->fns->print(rsc, NULL, pe_print_html, stream);
+ rsc->fns->print(rsc, NULL, print_opts, stream);
}
}
}
fprintf(stream, "</html>");
fflush(stream);
fclose(stream);
if (!web_cgi) {
if (rename(filename_tmp, filename) != 0) {
crm_perror(LOG_ERR, "Unable to rename %s->%s", filename_tmp, filename);
}
free(filename_tmp);
}
return 0;
}
#if ENABLE_SNMP
# include <net-snmp/net-snmp-config.h>
# include <net-snmp/snmpv3_api.h>
# include <net-snmp/agent/agent_trap.h>
# include <net-snmp/library/snmp_client.h>
# include <net-snmp/library/mib.h>
# include <net-snmp/library/snmp_debug.h>
# define add_snmp_field(list, oid_string, value) do { \
oid name[MAX_OID_LEN]; \
size_t name_length = MAX_OID_LEN; \
if (snmp_parse_oid(oid_string, name, &name_length)) { \
int s_rc = snmp_add_var(list, name, name_length, 's', (value)); \
if(s_rc != 0) { \
crm_err("Could not add %s=%s rc=%d", oid_string, value, s_rc); \
} else { \
crm_trace("Added %s=%s", oid_string, value); \
} \
} else { \
crm_err("Could not parse OID: %s", oid_string); \
} \
} while(0) \
# define add_snmp_field_int(list, oid_string, value) do { \
oid name[MAX_OID_LEN]; \
size_t name_length = MAX_OID_LEN; \
if (snmp_parse_oid(oid_string, name, &name_length)) { \
if(NULL == snmp_pdu_add_variable( \
list, name, name_length, ASN_INTEGER, \
(u_char *) & value, sizeof(value))) { \
crm_err("Could not add %s=%d", oid_string, value); \
} else { \
crm_trace("Added %s=%d", oid_string, value); \
} \
} else { \
crm_err("Could not parse OID: %s", oid_string); \
} \
} while(0) \
static int
snmp_input(int operation, netsnmp_session * session, int reqid, netsnmp_pdu * pdu, void *magic)
{
return 1;
}
static netsnmp_session *
crm_snmp_init(const char *target, char *community)
{
static netsnmp_session *session = NULL;
# ifdef NETSNMPV53
char target53[128];
snprintf(target53, sizeof(target53), "%s:162", target);
# endif
if (session) {
return session;
}
if (target == NULL) {
return NULL;
}
if (get_crm_log_level() > LOG_INFO) {
char *debug_tokens = strdup("run:shell,snmptrap,tdomain");
debug_register_tokens(debug_tokens);
snmp_set_do_debugging(1);
}
session = calloc(1, sizeof(netsnmp_session));
snmp_sess_init(session);
session->version = SNMP_VERSION_2c;
session->callback = snmp_input;
session->callback_magic = NULL;
if (community) {
session->community_len = strlen(community);
session->community = (unsigned char *)community;
}
session = snmp_add(session,
# ifdef NETSNMPV53
netsnmp_tdomain_transport(target53, 0, "udp"),
# else
netsnmp_transport_open_client("snmptrap", target),
# endif
NULL, NULL);
if (session == NULL) {
snmp_sess_perror("Could not create snmp transport", session);
}
return session;
}
#endif
static int
send_snmp_trap(const char *node, const char *rsc, const char *task, int target_rc, int rc,
int status, const char *desc)
{
int ret = 1;
#if ENABLE_SNMP
static oid snmptrap_oid[] = { 1, 3, 6, 1, 6, 3, 1, 1, 4, 1, 0 };
static oid sysuptime_oid[] = { 1, 3, 6, 1, 2, 1, 1, 3, 0 };
netsnmp_pdu *trap_pdu;
netsnmp_session *session = crm_snmp_init(snmp_target, snmp_community);
trap_pdu = snmp_pdu_create(SNMP_MSG_TRAP2);
if (!trap_pdu) {
crm_err("Failed to create SNMP notification");
return SNMPERR_GENERR;
}
if (1) {
/* send uptime */
char csysuptime[20];
time_t now = time(NULL);
sprintf(csysuptime, "%ld", (long) now);
snmp_add_var(trap_pdu, sysuptime_oid, sizeof(sysuptime_oid) / sizeof(oid), 't', csysuptime);
}
/* Indicate what the trap is by setting snmpTrapOid.0 */
ret =
snmp_add_var(trap_pdu, snmptrap_oid, sizeof(snmptrap_oid) / sizeof(oid), 'o',
snmp_crm_trap_oid);
if (ret != 0) {
crm_err("Failed to set snmpTrapOid.0=%s", snmp_crm_trap_oid);
return ret;
}
/* Add entries to the trap */
if (rsc) {
add_snmp_field(trap_pdu, snmp_crm_oid_rsc, rsc);
}
add_snmp_field(trap_pdu, snmp_crm_oid_node, node);
add_snmp_field(trap_pdu, snmp_crm_oid_task, task);
add_snmp_field(trap_pdu, snmp_crm_oid_desc, desc);
add_snmp_field_int(trap_pdu, snmp_crm_oid_rc, rc);
add_snmp_field_int(trap_pdu, snmp_crm_oid_trc, target_rc);
add_snmp_field_int(trap_pdu, snmp_crm_oid_status, status);
/* Send and cleanup */
ret = snmp_send(session, trap_pdu);
if (ret == 0) {
/* error */
snmp_sess_perror("Could not send SNMP trap", session);
snmp_free_pdu(trap_pdu);
ret = SNMPERR_GENERR;
} else {
ret = SNMPERR_SUCCESS;
}
#else
crm_err("Sending SNMP traps is not supported by this installation");
#endif
return ret;
}
#if ENABLE_ESMTP
# include <auth-client.h>
# include <libesmtp.h>
static void
print_recipient_status(smtp_recipient_t recipient, const char *mailbox, void *arg)
{
const smtp_status_t *status;
status = smtp_recipient_status(recipient);
printf("%s: %d %s", mailbox, status->code, status->text);
}
static void
event_cb(smtp_session_t session, int event_no, void *arg, ...)
{
int *ok;
va_list alist;
va_start(alist, arg);
switch (event_no) {
case SMTP_EV_CONNECT:
case SMTP_EV_MAILSTATUS:
case SMTP_EV_RCPTSTATUS:
case SMTP_EV_MESSAGEDATA:
case SMTP_EV_MESSAGESENT:
case SMTP_EV_DISCONNECT:
break;
case SMTP_EV_WEAK_CIPHER:{
int bits = va_arg(alist, long);
ok = va_arg(alist, int *);
crm_debug("SMTP_EV_WEAK_CIPHER, bits=%d - accepted.", bits);
*ok = 1;
break;
}
case SMTP_EV_STARTTLS_OK:
crm_debug("SMTP_EV_STARTTLS_OK - TLS started here.");
break;
case SMTP_EV_INVALID_PEER_CERTIFICATE:{
long vfy_result = va_arg(alist, long);
ok = va_arg(alist, int *);
/* There is a table in handle_invalid_peer_certificate() of mail-file.c */
crm_err("SMTP_EV_INVALID_PEER_CERTIFICATE: %ld", vfy_result);
*ok = 1;
break;
}
case SMTP_EV_NO_PEER_CERTIFICATE:
ok = va_arg(alist, int *);
crm_debug("SMTP_EV_NO_PEER_CERTIFICATE - accepted.");
*ok = 1;
break;
case SMTP_EV_WRONG_PEER_CERTIFICATE:
ok = va_arg(alist, int *);
crm_debug("SMTP_EV_WRONG_PEER_CERTIFICATE - accepted.");
*ok = 1;
break;
case SMTP_EV_NO_CLIENT_CERTIFICATE:
ok = va_arg(alist, int *);
crm_debug("SMTP_EV_NO_CLIENT_CERTIFICATE - accepted.");
*ok = 1;
break;
default:
crm_debug("Got event: %d - ignored.\n", event_no);
}
va_end(alist);
}
#endif
#define BODY_MAX 2048
#if ENABLE_ESMTP
static void
crm_smtp_debug(const char *buf, int buflen, int writing, void *arg)
{
char type = 0;
int lpc = 0, last = 0, level = *(int *)arg;
if (writing == SMTP_CB_HEADERS) {
type = 'H';
} else if (writing) {
type = 'C';
} else {
type = 'S';
}
for (; lpc < buflen; lpc++) {
switch (buf[lpc]) {
case 0:
case '\n':
if (last > 0) {
do_crm_log(level, " %.*s", lpc - last, buf + last);
} else {
do_crm_log(level, "%c: %.*s", type, lpc - last, buf + last);
}
last = lpc + 1;
break;
}
}
}
#endif
static int
send_custom_trap(const char *node, const char *rsc, const char *task, int target_rc, int rc,
int status, const char *desc)
{
pid_t pid;
/* setenv() needs strings; these values are ints */
char *rc_s = crm_itoa(rc);
char *status_s = crm_itoa(status);
char *target_rc_s = crm_itoa(target_rc);
crm_debug("Sending external notification to '%s' via '%s'", external_recipient, external_agent);
setenv("CRM_notify_recipient", external_recipient, 1);
setenv("CRM_notify_node", node, 1);
setenv("CRM_notify_rsc", rsc, 1);
setenv("CRM_notify_task", task, 1);
setenv("CRM_notify_desc", desc, 1);
setenv("CRM_notify_rc", rc_s, 1);
setenv("CRM_notify_target_rc", target_rc_s, 1);
setenv("CRM_notify_status", status_s, 1);
pid = fork();
if (pid == -1) {
crm_perror(LOG_ERR, "notification fork() failed.");
}
if (pid == 0) {
/* crm_debug("notification: I am the child. Executing the notification program."); */
execl(external_agent, external_agent, NULL);
crm_perror(LOG_ERR, "notification execl() failed");
_exit(EXIT_FAILURE); /* only reached if exec failed */
}
crm_trace("Finished running custom notification program '%s'.", external_agent);
free(target_rc_s);
free(status_s);
free(rc_s);
return 0;
}
static int
send_smtp_trap(const char *node, const char *rsc, const char *task, int target_rc, int rc,
int status, const char *desc)
{
#if ENABLE_ESMTP
smtp_session_t session;
smtp_message_t message;
auth_context_t authctx;
struct sigaction sa;
int len = 25; /* Note: Check extra padding on the Subject line below */
int noauth = 1;
int smtp_debug = LOG_DEBUG;
char crm_mail_body[BODY_MAX];
char *crm_mail_subject = NULL;
memset(&sa, 0, sizeof(struct sigaction));
if (node == NULL) {
node = "-";
}
if (rsc == NULL) {
rsc = "-";
}
if (desc == NULL) {
desc = "-";
}
if (crm_mail_to == NULL) {
return 1;
}
if (crm_mail_host == NULL) {
crm_mail_host = "localhost:25";
}
if (crm_mail_prefix == NULL) {
crm_mail_prefix = "Cluster notification";
}
crm_debug("Sending '%s' mail to %s via %s", crm_mail_prefix, crm_mail_to, crm_mail_host);
len += strlen(crm_mail_prefix);
len += strlen(task);
len += strlen(rsc);
len += strlen(node);
len += strlen(desc);
len++;
crm_mail_subject = calloc(1, len);
/* If you edit this line, ensure you allocate enough memory for it by altering 'len' above */
snprintf(crm_mail_subject, len, "%s - %s event for %s on %s: %s\r\n", crm_mail_prefix, task,
rsc, node, desc);
len = 0;
len += snprintf(crm_mail_body + len, BODY_MAX - len, "\r\n%s\r\n", crm_mail_prefix);
len += snprintf(crm_mail_body + len, BODY_MAX - len, "====\r\n\r\n");
if (rc == target_rc) {
len += snprintf(crm_mail_body + len, BODY_MAX - len,
"Completed operation %s for resource %s on %s\r\n", task, rsc, node);
} else {
len += snprintf(crm_mail_body + len, BODY_MAX - len,
"Operation %s for resource %s on %s failed: %s\r\n", task, rsc, node, desc);
}
len += snprintf(crm_mail_body + len, BODY_MAX - len, "\r\nDetails:\r\n");
len += snprintf(crm_mail_body + len, BODY_MAX - len,
"\toperation status: (%d) %s\r\n", status, services_lrm_status_str(status));
if (status == PCMK_LRM_OP_DONE) {
len += snprintf(crm_mail_body + len, BODY_MAX - len,
"\tscript returned: (%d) %s\r\n", rc, services_ocf_exitcode_str(rc));
len += snprintf(crm_mail_body + len, BODY_MAX - len,
"\texpected return value: (%d) %s\r\n", target_rc,
services_ocf_exitcode_str(target_rc));
}
auth_client_init();
session = smtp_create_session();
message = smtp_add_message(session);
smtp_starttls_enable(session, Starttls_ENABLED);
sa.sa_handler = SIG_IGN;
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sigaction(SIGPIPE, &sa, NULL);
smtp_set_server(session, crm_mail_host);
authctx = auth_create_context();
auth_set_mechanism_flags(authctx, AUTH_PLUGIN_PLAIN, 0);
smtp_set_eventcb(session, event_cb, NULL);
/* Now tell libESMTP it can use the SMTP AUTH extension.
*/
if (!noauth) {
crm_debug("Adding authentication context");
smtp_auth_set_context(session, authctx);
}
if (crm_mail_from == NULL) {
struct utsname us;
char auto_from[BODY_MAX];
CRM_ASSERT(uname(&us) == 0);
snprintf(auto_from, BODY_MAX, "crm_mon@%s", us.nodename);
smtp_set_reverse_path(message, auto_from);
} else {
/* NULL is ok */
smtp_set_reverse_path(message, crm_mail_from);
}
smtp_set_header(message, "To", NULL /*phrase */ , NULL /*addr */ ); /* "Phrase" <addr> */
smtp_add_recipient(message, crm_mail_to);
/* Set the Subject: header and override any subject line in the message headers. */
smtp_set_header(message, "Subject", crm_mail_subject);
smtp_set_header_option(message, "Subject", Hdr_OVERRIDE, 1);
smtp_set_message_str(message, crm_mail_body);
smtp_set_monitorcb(session, crm_smtp_debug, &smtp_debug, 1);
if (smtp_start_session(session)) {
char buf[128];
int rc = smtp_errno();
crm_err("SMTP server problem: %s (%d)", smtp_strerror(rc, buf, sizeof buf), rc);
} else {
char buf[128];
int rc = smtp_errno();
const smtp_status_t *smtp_status = smtp_message_transfer_status(message);
if (rc != 0) {
crm_err("SMTP server problem: %s (%d)", smtp_strerror(rc, buf, sizeof buf), rc);
}
crm_info("Send status: %d %s", smtp_status->code, crm_str(smtp_status->text));
smtp_enumerate_recipients(message, print_recipient_status, NULL);
}
smtp_destroy_session(session);
auth_destroy_context(authctx);
auth_client_exit();
#endif
return 0;
}
static void
handle_rsc_op(xmlNode * rsc_op)
{
int rc = -1;
int status = -1;
int action = -1;
int interval = 0;
int target_rc = -1;
int transition_num = -1;
gboolean notify = TRUE;
char *rsc = NULL;
char *task = NULL;
const char *desc = NULL;
const char *node = NULL;
const char *magic = NULL;
const char *id = crm_element_value(rsc_op, XML_LRM_ATTR_TASK_KEY);
char *update_te_uuid = NULL;
xmlNode *n = rsc_op;
if (id == NULL) {
/* Compatibility with <= 1.1.5 */
id = ID(rsc_op);
}
magic = crm_element_value(rsc_op, XML_ATTR_TRANSITION_MAGIC);
if (magic == NULL) {
/* non-change */
return;
}
if (FALSE == decode_transition_magic(magic, &update_te_uuid, &transition_num, &action,
&status, &rc, &target_rc)) {
crm_err("Invalid event %s detected for %s", magic, id);
return;
}
if (parse_op_key(id, &rsc, &task, &interval) == FALSE) {
crm_err("Invalid event detected for %s", id);
goto bail;
}
while (n != NULL && safe_str_neq(XML_CIB_TAG_STATE, TYPE(n))) {
n = n->parent;
}
node = crm_element_value(n, XML_ATTR_UNAME);
if (node == NULL) {
node = ID(n);
}
if (node == NULL) {
crm_err("No node detected for event %s (%s)", magic, id);
goto bail;
}
/* look up where we expected it to be? */
desc = pcmk_strerror(pcmk_ok);
if (status == PCMK_LRM_OP_DONE && target_rc == rc) {
crm_notice("%s of %s on %s completed: %s", task, rsc, node, desc);
if (rc == PCMK_OCF_NOT_RUNNING) {
notify = FALSE;
}
} else if (status == PCMK_LRM_OP_DONE) {
desc = services_ocf_exitcode_str(rc);
crm_warn("%s of %s on %s failed: %s", task, rsc, node, desc);
} else {
desc = services_lrm_status_str(status);
crm_warn("%s of %s on %s failed: %s", task, rsc, node, desc);
}
if (notify && snmp_target) {
send_snmp_trap(node, rsc, task, target_rc, rc, status, desc);
}
if (notify && crm_mail_to) {
send_smtp_trap(node, rsc, task, target_rc, rc, status, desc);
}
if (notify && external_agent) {
send_custom_trap(node, rsc, task, target_rc, rc, status, desc);
}
bail:
free(update_te_uuid);
free(rsc);
free(task);
}
static gboolean
mon_trigger_refresh(gpointer user_data)
{
mainloop_set_trigger(refresh_trigger);
return FALSE;
}
void
crm_diff_update(const char *event, xmlNode * msg)
{
int rc = -1;
long now = time(NULL);
static bool stale = FALSE;
static int updates = 0;
static mainloop_timer_t *refresh_timer = NULL;
print_dot();
if(refresh_timer == NULL) {
refresh_timer = mainloop_timer_add("refresh", 2000, FALSE, mon_trigger_refresh, NULL);
}
if (current_cib != NULL) {
xmlNode *cib_last = current_cib;
current_cib = NULL;
rc = cib_apply_patch_event(msg, cib_last, &current_cib, LOG_DEBUG);
free_xml(cib_last);
switch (rc) {
case -pcmk_err_diff_resync:
case -pcmk_err_diff_failed:
crm_notice("[%s] Patch aborted: %s (%d)", event, pcmk_strerror(rc), rc);
break;
case pcmk_ok:
updates++;
break;
default:
crm_notice("[%s] ABORTED: %s (%d)", event, pcmk_strerror(rc), rc);
}
}
if (current_cib == NULL) {
current_cib = get_cib_copy(cib);
}
if (crm_mail_to || snmp_target || external_agent) {
/* Process operation updates */
xmlXPathObject *xpathObj = xpath_search(msg,
"//" F_CIB_UPDATE_RESULT "//" XML_TAG_DIFF_ADDED
"//" XML_LRM_TAG_RSC_OP);
int lpc = 0, max = numXpathResults(xpathObj);
for (lpc = 0; lpc < max; lpc++) {
xmlNode *rsc_op = getXpathResult(xpathObj, lpc);
handle_rsc_op(rsc_op);
}
freeXpathObject(xpathObj);
}
if (current_cib == NULL) {
if(!stale) {
print_as("--- Stale data ---");
}
stale = TRUE;
return;
}
stale = FALSE;
/* Refresh
 * - immediately if the last refresh was longer ago than the reconnect interval
 * - after every 10 updates
 * - otherwise, at most 2s after the most recent update
 */
if ((now - last_refresh) > (reconnect_msec / 1000)) {
mainloop_set_trigger(refresh_trigger);
mainloop_timer_stop(refresh_timer);
updates = 0;
} else if(updates > 10) {
mainloop_set_trigger(refresh_trigger);
mainloop_timer_stop(refresh_timer);
updates = 0;
} else {
mainloop_timer_start(refresh_timer);
}
}
gboolean
mon_refresh_display(gpointer user_data)
{
xmlNode *cib_copy = copy_xml(current_cib);
pe_working_set_t data_set;
last_refresh = time(NULL);
if (cli_config_update(&cib_copy, NULL, FALSE) == FALSE) {
if (cib) {
cib->cmds->signoff(cib);
}
print_as("Upgrade failed: %s", pcmk_strerror(-pcmk_err_dtd_validation));
if (as_console) {
sleep(2);
}
clean_up(EX_USAGE);
return FALSE;
}
set_working_set_defaults(&data_set);
data_set.input = cib_copy;
cluster_status(&data_set);
if (as_html_file || web_cgi) {
if (print_html_status(&data_set, as_html_file, web_cgi) != 0) {
fprintf(stderr, "Critical: Unable to output html file\n");
clean_up(EX_USAGE);
}
} else if (as_xml) {
if (print_xml_status(&data_set) != 0) {
fprintf(stderr, "Critical: Unable to output xml file\n");
clean_up(EX_USAGE);
}
} else if (daemonize) {
/* do nothing */
} else if (simple_status) {
print_simple_status(&data_set);
if (has_warnings) {
clean_up(EX_USAGE);
}
} else {
print_status(&data_set);
}
cleanup_calculations(&data_set);
return TRUE;
}
void
mon_st_callback(stonith_t * st, stonith_event_t * e)
{
char *desc = g_strdup_printf("Operation %s requested by %s for peer %s: %s (ref=%s)",
e->operation, e->origin, e->target, pcmk_strerror(e->result),
e->id);
if (snmp_target) {
send_snmp_trap(e->target, NULL, e->operation, pcmk_ok, e->result, 0, desc);
}
if (crm_mail_to) {
send_smtp_trap(e->target, NULL, e->operation, pcmk_ok, e->result, 0, desc);
}
if (external_agent) {
send_custom_trap(e->target, NULL, e->operation, pcmk_ok, e->result, 0, desc);
}
g_free(desc);
}
/*
* De-init ncurses, signoff from the CIB and deallocate memory.
*/
void
clean_up(int rc)
{
#if ENABLE_SNMP
netsnmp_session *session = crm_snmp_init(NULL, NULL);
if (session) {
snmp_close(session);
snmp_shutdown("snmpapp");
}
#endif
#if CURSES_ENABLED
if (as_console) {
as_console = FALSE;
echo();
nocbreak();
endwin();
}
#endif
if (cib != NULL) {
cib->cmds->signoff(cib);
cib_delete(cib);
cib = NULL;
}
free(as_html_file);
free(xml_file);
free(pid_file);
if (rc >= 0) {
crm_exit(rc);
}
return;
}
diff --git a/tools/crm_resource.c b/tools/crm_resource.c
index bd576cceea..1d110031a0 100644
--- a/tools/crm_resource.c
+++ b/tools/crm_resource.c
@@ -1,2232 +1,2251 @@
/*
* Copyright (C) 2004 Andrew Beekhof <andrew@beekhof.net>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This software is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#include <sys/param.h>
#include <crm/crm.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <crm/msg_xml.h>
#include <crm/services.h>
#include <crm/common/xml.h>
#include <crm/common/mainloop.h>
#include <crm/cib.h>
#include <crm/attrd.h>
#include <crm/pengine/rules.h>
#include <crm/pengine/status.h>
#include <crm/pengine/internal.h>
bool scope_master = FALSE;
gboolean do_force = FALSE;
gboolean BE_QUIET = FALSE;
const char *attr_set_type = XML_TAG_ATTR_SETS;
char *host_id = NULL;
const char *rsc_id = NULL;
const char *host_uname = NULL;
const char *prop_name = NULL;
const char *prop_value = NULL;
const char *rsc_type = NULL;
const char *prop_id = NULL;
const char *prop_set = NULL;
char *move_lifetime = NULL;
char rsc_cmd = 'L';
const char *rsc_long_cmd = NULL;
char *our_pid = NULL;
crm_ipc_t *crmd_channel = NULL;
char *xml_file = NULL;
int cib_options = cib_sync_call;
int crmd_replies_needed = 1; /* The welcome message */
GMainLoop *mainloop = NULL;
+gboolean print_pending = FALSE;
extern void cleanup_alloc_calculations(pe_working_set_t * data_set);
#define CMD_ERR(fmt, args...) do { \
crm_warn(fmt, ##args); \
fprintf(stderr, fmt, ##args); \
} while(0)
#define message_timeout_ms 60*1000
static gboolean
resource_ipc_timeout(gpointer data)
{
fprintf(stderr, "No messages received in %d seconds... aborting\n",
(int)message_timeout_ms / 1000);
crm_err("No messages received in %d seconds", (int)message_timeout_ms / 1000);
return crm_exit(-1);
}
static void
resource_ipc_connection_destroy(gpointer user_data)
{
crm_info("Connection to CRMd was terminated");
crm_exit(1);
}
static void
start_mainloop(void)
{
mainloop = g_main_new(FALSE);
fprintf(stderr, "Waiting for %d replies from the CRMd", crmd_replies_needed);
crm_debug("Waiting for %d replies from the CRMd", crmd_replies_needed);
g_timeout_add(message_timeout_ms, resource_ipc_timeout, NULL);
g_main_run(mainloop);
}
static int
resource_ipc_callback(const char *buffer, ssize_t length, gpointer userdata)
{
xmlNode *msg = string2xml(buffer);
fprintf(stderr, ".");
crm_log_xml_trace(msg, "[inbound]");
crmd_replies_needed--;
if (crmd_replies_needed == 0) {
fprintf(stderr, " OK\n");
crm_debug("Got all the replies we expected");
return crm_exit(pcmk_ok);
}
free_xml(msg);
return 0;
}
struct ipc_client_callbacks crm_callbacks = {
.dispatch = resource_ipc_callback,
.destroy = resource_ipc_connection_destroy,
};
static int
do_find_resource(const char *rsc, resource_t * the_rsc, pe_working_set_t * data_set)
{
int found = 0;
GListPtr lpc = NULL;
if (the_rsc == NULL) {
the_rsc = pe_find_resource(data_set->resources, rsc);
}
if (the_rsc == NULL) {
return -ENXIO;
}
if (the_rsc->variant >= pe_clone) {
GListPtr gIter = the_rsc->children;
for (; gIter != NULL; gIter = gIter->next) {
found += do_find_resource(rsc, gIter->data, data_set);
}
return found;
}
for (lpc = the_rsc->running_on; lpc != NULL; lpc = lpc->next) {
node_t *node = (node_t *) lpc->data;
crm_trace("resource %s is running on: %s", rsc, node->details->uname);
if (BE_QUIET) {
fprintf(stdout, "%s\n", node->details->uname);
} else {
const char *state = "";
if (the_rsc->variant == pe_native && the_rsc->role == RSC_ROLE_MASTER) {
state = "Master";
}
fprintf(stdout, "resource %s is running on: %s %s\n", rsc, node->details->uname, state);
}
found++;
}
if (BE_QUIET == FALSE && found == 0) {
fprintf(stderr, "resource %s is NOT running\n", rsc);
}
return 0;
}
#define cons_string(x) x?x:"NA"
static void
print_cts_constraints(pe_working_set_t * data_set)
{
xmlNode *xml_obj = NULL;
xmlNode *lifetime = NULL;
xmlNode *cib_constraints = get_object_root(XML_CIB_TAG_CONSTRAINTS, data_set->input);
for (xml_obj = __xml_first_child(cib_constraints); xml_obj != NULL;
xml_obj = __xml_next(xml_obj)) {
const char *id = crm_element_value(xml_obj, XML_ATTR_ID);
if (id == NULL) {
continue;
}
lifetime = first_named_child(xml_obj, "lifetime");
if (test_ruleset(lifetime, NULL, data_set->now) == FALSE) {
continue;
}
if (safe_str_eq(XML_CONS_TAG_RSC_DEPEND, crm_element_name(xml_obj))) {
printf("Constraint %s %s %s %s %s %s %s\n",
crm_element_name(xml_obj),
cons_string(crm_element_value(xml_obj, XML_ATTR_ID)),
cons_string(crm_element_value(xml_obj, XML_COLOC_ATTR_SOURCE)),
cons_string(crm_element_value(xml_obj, XML_COLOC_ATTR_TARGET)),
cons_string(crm_element_value(xml_obj, XML_RULE_ATTR_SCORE)),
cons_string(crm_element_value(xml_obj, XML_COLOC_ATTR_SOURCE_ROLE)),
cons_string(crm_element_value(xml_obj, XML_COLOC_ATTR_TARGET_ROLE)));
} else if (safe_str_eq(XML_CONS_TAG_RSC_LOCATION, crm_element_name(xml_obj))) {
/* unpack_location(xml_obj, data_set); */
}
}
}
static void
print_cts_rsc(resource_t * rsc)
{
GListPtr lpc = NULL;
const char *host = NULL;
gboolean needs_quorum = TRUE;
const char *rtype = crm_element_value(rsc->xml, XML_ATTR_TYPE);
const char *rprov = crm_element_value(rsc->xml, XML_AGENT_ATTR_PROVIDER);
const char *rclass = crm_element_value(rsc->xml, XML_AGENT_ATTR_CLASS);
if (safe_str_eq(rclass, "stonith")) {
xmlNode *op = NULL;
needs_quorum = FALSE;
for (op = __xml_first_child(rsc->ops_xml); op != NULL; op = __xml_next(op)) {
if (crm_str_eq((const char *)op->name, "op", TRUE)) {
const char *name = crm_element_value(op, "name");
if (safe_str_neq(name, CRMD_ACTION_START)) {
const char *value = crm_element_value(op, "requires");
if (safe_str_eq(value, "nothing")) {
needs_quorum = FALSE;
}
break;
}
}
}
}
if (rsc->running_on != NULL && g_list_length(rsc->running_on) == 1) {
node_t *tmp = rsc->running_on->data;
host = tmp->details->uname;
}
printf("Resource: %s %s %s %s %s %s %s %s %d %lld 0x%.16llx\n",
crm_element_name(rsc->xml), rsc->id,
rsc->clone_name ? rsc->clone_name : rsc->id, rsc->parent ? rsc->parent->id : "NA",
rprov ? rprov : "NA", rclass, rtype, host ? host : "NA", needs_quorum, rsc->flags,
rsc->flags);
for (lpc = rsc->children; lpc != NULL; lpc = lpc->next) {
resource_t *child = (resource_t *) lpc->data;
print_cts_rsc(child);
}
}
static void
print_raw_rsc(resource_t * rsc)
{
GListPtr lpc = NULL;
GListPtr children = rsc->children;
if (children == NULL) {
printf("%s\n", rsc->id);
}
for (lpc = children; lpc != NULL; lpc = lpc->next) {
resource_t *child = (resource_t *) lpc->data;
print_raw_rsc(child);
}
}
static int
do_find_resource_list(pe_working_set_t * data_set, gboolean raw)
{
int found = 0;
GListPtr lpc = NULL;
+ int opts = pe_print_printf | pe_print_rsconly;
+
+ if (print_pending) {
+ opts |= pe_print_pending;
+ }
for (lpc = data_set->resources; lpc != NULL; lpc = lpc->next) {
resource_t *rsc = (resource_t *) lpc->data;
if (is_set(rsc->flags, pe_rsc_orphan)
&& rsc->fns->active(rsc, TRUE) == FALSE) {
continue;
}
- rsc->fns->print(rsc, NULL, pe_print_printf | pe_print_rsconly, stdout);
+ rsc->fns->print(rsc, NULL, opts, stdout);
found++;
}
if (found == 0) {
printf("NO resources configured\n");
return -ENXIO;
}
return 0;
}
static resource_t *
find_rsc_or_clone(const char *rsc, pe_working_set_t * data_set)
{
resource_t *the_rsc = pe_find_resource(data_set->resources, rsc);
if (the_rsc == NULL) {
char *as_clone = crm_concat(rsc, "0", ':');
the_rsc = pe_find_resource(data_set->resources, as_clone);
free(as_clone);
}
return the_rsc;
}
static int
dump_resource(const char *rsc, pe_working_set_t * data_set, gboolean expanded)
{
char *rsc_xml = NULL;
resource_t *the_rsc = find_rsc_or_clone(rsc, data_set);
+ int opts = pe_print_printf;
if (the_rsc == NULL) {
return -ENXIO;
}
- the_rsc->fns->print(the_rsc, NULL, pe_print_printf, stdout);
+
+ if (print_pending) {
+ opts |= pe_print_pending;
+ }
+ the_rsc->fns->print(the_rsc, NULL, opts, stdout);
if (expanded) {
rsc_xml = dump_xml_formatted(the_rsc->xml);
} else {
if (the_rsc->orig_xml) {
rsc_xml = dump_xml_formatted(the_rsc->orig_xml);
} else {
rsc_xml = dump_xml_formatted(the_rsc->xml);
}
}
fprintf(stdout, "%sxml:\n%s\n", expanded ? "" : "raw ", rsc_xml);
free(rsc_xml);
return 0;
}
static int
dump_resource_attr(const char *rsc, const char *attr, pe_working_set_t * data_set)
{
int rc = -ENXIO;
node_t *current = NULL;
GHashTable *params = NULL;
resource_t *the_rsc = find_rsc_or_clone(rsc, data_set);
const char *value = NULL;
if (the_rsc == NULL) {
return -ENXIO;
}
if (g_list_length(the_rsc->running_on) == 1) {
current = the_rsc->running_on->data;
} else if (g_list_length(the_rsc->running_on) > 1) {
CMD_ERR("%s is active on more than one node,"
" returning the default value for %s\n", the_rsc->id, crm_str(attr));
}
params = g_hash_table_new_full(crm_str_hash, g_str_equal,
g_hash_destroy_str, g_hash_destroy_str);
if (safe_str_eq(attr_set_type, XML_TAG_ATTR_SETS)) {
get_rsc_attributes(params, the_rsc, current, data_set);
} else if (safe_str_eq(attr_set_type, XML_TAG_META_SETS)) {
get_meta_attributes(params, the_rsc, current, data_set);
} else {
unpack_instance_attributes(data_set->input, the_rsc->xml, XML_TAG_UTILIZATION, NULL,
params, NULL, FALSE, data_set->now);
}
crm_debug("Looking up %s in %s", attr, the_rsc->id);
value = g_hash_table_lookup(params, attr);
if (value != NULL) {
fprintf(stdout, "%s\n", value);
rc = 0;
}
g_hash_table_destroy(params);
return rc;
}
static int
find_resource_attr(cib_t * the_cib, const char *attr, const char *rsc, const char *set_type,
const char *set_name, const char *attr_id, const char *attr_name, char **value)
{
int offset = 0;
static int xpath_max = 1024;
int rc = pcmk_ok;
xmlNode *xml_search = NULL;
char *xpath_string = NULL;
CRM_ASSERT(value != NULL);
*value = NULL;
xpath_string = calloc(1, xpath_max);
offset +=
snprintf(xpath_string + offset, xpath_max - offset, "%s", get_object_path("resources"));
offset += snprintf(xpath_string + offset, xpath_max - offset, "//*[@id=\"%s\"]", rsc);
if (set_type) {
offset += snprintf(xpath_string + offset, xpath_max - offset, "/%s", set_type);
if (set_name) {
offset += snprintf(xpath_string + offset, xpath_max - offset, "[@id=\"%s\"]", set_name);
}
}
offset += snprintf(xpath_string + offset, xpath_max - offset, "//nvpair[");
if (attr_id) {
offset += snprintf(xpath_string + offset, xpath_max - offset, "@id=\"%s\"", attr_id);
}
if (attr_name) {
if (attr_id) {
offset += snprintf(xpath_string + offset, xpath_max - offset, " and ");
}
offset += snprintf(xpath_string + offset, xpath_max - offset, "@name=\"%s\"", attr_name);
}
offset += snprintf(xpath_string + offset, xpath_max - offset, "]");
rc = the_cib->cmds->query(the_cib, xpath_string, &xml_search,
cib_sync_call | cib_scope_local | cib_xpath);
if (rc != pcmk_ok) {
goto bail;
}
crm_log_xml_debug(xml_search, "Match");
if (xml_has_children(xml_search)) {
xmlNode *child = NULL;
rc = -EINVAL;
printf("Multiple attributes match name=%s\n", attr_name);
for (child = __xml_first_child(xml_search); child != NULL; child = __xml_next(child)) {
printf(" Value: %s \t(id=%s)\n",
crm_element_value(child, XML_NVPAIR_ATTR_VALUE), ID(child));
}
} else {
const char *tmp = crm_element_value(xml_search, attr);
if (tmp) {
*value = strdup(tmp);
}
}
bail:
free(xpath_string);
free_xml(xml_search);
return rc;
}
#include "../pengine/pengine.h"
static int
set_resource_attr(const char *rsc_id, const char *attr_set, const char *attr_id,
const char *attr_name, const char *attr_value, bool recursive,
cib_t * cib, pe_working_set_t * data_set)
{
int rc = pcmk_ok;
static bool need_init = TRUE;
char *local_attr_id = NULL;
char *local_attr_set = NULL;
xmlNode *xml_top = NULL;
xmlNode *xml_obj = NULL;
gboolean use_attributes_tag = FALSE;
resource_t *rsc = find_rsc_or_clone(rsc_id, data_set);
if (rsc == NULL) {
return -ENXIO;
}
if (safe_str_eq(attr_set_type, XML_TAG_ATTR_SETS)) {
rc = find_resource_attr(cib, XML_ATTR_ID, rsc_id, XML_TAG_META_SETS, attr_set, attr_id,
attr_name, &local_attr_id);
if (rc == pcmk_ok) {
printf("WARNING: There is already a meta attribute called %s (id=%s)\n", attr_name,
local_attr_id);
}
}
rc = find_resource_attr(cib, XML_ATTR_ID, rsc_id, attr_set_type, attr_set, attr_id, attr_name,
&local_attr_id);
if (rc == pcmk_ok) {
crm_debug("Found a match for name=%s: id=%s", attr_name, local_attr_id);
attr_id = local_attr_id;
} else if (rc != -ENXIO) {
free(local_attr_id);
return rc;
} else {
const char *value = NULL;
xmlNode *cib_top = NULL;
const char *tag = crm_element_name(rsc->xml);
rc = cib->cmds->query(cib, "/cib", &cib_top,
cib_sync_call | cib_scope_local | cib_xpath | cib_no_children);
value = crm_element_value(cib_top, "ignore_dtd");
if (value != NULL) {
use_attributes_tag = TRUE;
} else {
value = crm_element_value(cib_top, XML_ATTR_VALIDATION);
if (value && strstr(value, "-0.6")) {
use_attributes_tag = TRUE;
}
}
free_xml(cib_top);
if (attr_set == NULL) {
local_attr_set = crm_concat(rsc_id, attr_set_type, '-');
attr_set = local_attr_set;
}
if (attr_id == NULL) {
local_attr_id = crm_concat(attr_set, attr_name, '-');
attr_id = local_attr_id;
}
if (use_attributes_tag && safe_str_eq(tag, XML_CIB_TAG_MASTER)) {
tag = "master_slave"; /* use the old name */
}
xml_top = create_xml_node(NULL, tag);
crm_xml_add(xml_top, XML_ATTR_ID, rsc_id);
xml_obj = create_xml_node(xml_top, attr_set_type);
crm_xml_add(xml_obj, XML_ATTR_ID, attr_set);
if (use_attributes_tag) {
xml_obj = create_xml_node(xml_obj, XML_TAG_ATTRS);
}
}
xml_obj = create_xml_node(xml_obj, XML_CIB_TAG_NVPAIR);
if (xml_top == NULL) {
xml_top = xml_obj;
}
crm_xml_add(xml_obj, XML_ATTR_ID, attr_id);
crm_xml_add(xml_obj, XML_NVPAIR_ATTR_NAME, attr_name);
crm_xml_add(xml_obj, XML_NVPAIR_ATTR_VALUE, attr_value);
crm_log_xml_debug(xml_top, "Update");
rc = cib->cmds->modify(cib, XML_CIB_TAG_RESOURCES, xml_top, cib_options);
free_xml(xml_top);
free(local_attr_id);
free(local_attr_set);
if(recursive && safe_str_eq(attr_set_type, XML_TAG_META_SETS)) {
GListPtr lpc = NULL;
if(need_init) {
xmlNode *cib_constraints = get_object_root(XML_CIB_TAG_CONSTRAINTS, data_set->input);
need_init = FALSE;
unpack_constraints(cib_constraints, data_set);
for (lpc = data_set->resources; lpc != NULL; lpc = lpc->next) {
resource_t *r = (resource_t *) lpc->data;
clear_bit(r->flags, pe_rsc_allocating);
}
}
crm_debug("Looking for dependencies %p", rsc->rsc_cons_lhs);
set_bit(rsc->flags, pe_rsc_allocating);
for (lpc = rsc->rsc_cons_lhs; lpc != NULL; lpc = lpc->next) {
rsc_colocation_t *cons = (rsc_colocation_t *) lpc->data;
resource_t *peer = cons->rsc_lh;
crm_debug("Checking %s %d", cons->id, cons->score);
if (cons->score > 0 && is_not_set(peer->flags, pe_rsc_allocating)) {
/* Don't get into colocation loops */
crm_debug("Setting %s=%s for dependent resource %s", attr_name, attr_value, peer->id);
set_resource_attr(peer->id, NULL, NULL, attr_name, attr_value, recursive, cib, data_set);
}
}
}
return rc;
}
static int
delete_resource_attr(const char *rsc_id, const char *attr_set, const char *attr_id,
const char *attr_name, cib_t * cib, pe_working_set_t * data_set)
{
xmlNode *xml_obj = NULL;
int rc = pcmk_ok;
char *local_attr_id = NULL;
resource_t *rsc = find_rsc_or_clone(rsc_id, data_set);
if (rsc == NULL) {
return -ENXIO;
}
rc = find_resource_attr(cib, XML_ATTR_ID, rsc_id, attr_set_type, attr_set, attr_id, attr_name,
&local_attr_id);
if (rc == -ENXIO) {
return pcmk_ok;
} else if (rc != pcmk_ok) {
return rc;
}
if (attr_id == NULL) {
attr_id = local_attr_id;
}
xml_obj = create_xml_node(NULL, XML_CIB_TAG_NVPAIR);
crm_xml_add(xml_obj, XML_ATTR_ID, attr_id);
crm_xml_add(xml_obj, XML_NVPAIR_ATTR_NAME, attr_name);
crm_log_xml_debug(xml_obj, "Delete");
rc = cib->cmds->delete(cib, XML_CIB_TAG_RESOURCES, xml_obj, cib_options);
if (rc == pcmk_ok) {
printf("Deleted %s option: id=%s%s%s%s%s\n", rsc_id, local_attr_id,
attr_set ? " set=" : "", attr_set ? attr_set : "",
attr_name ? " name=" : "", attr_name ? attr_name : "");
}
free_xml(xml_obj);
free(local_attr_id);
return rc;
}
static int
dump_resource_prop(const char *rsc, const char *attr, pe_working_set_t * data_set)
{
const char *value = NULL;
resource_t *the_rsc = pe_find_resource(data_set->resources, rsc);
if (the_rsc == NULL) {
return -ENXIO;
}
value = crm_element_value(the_rsc->xml, attr);
if (value != NULL) {
fprintf(stdout, "%s\n", value);
return 0;
}
return -ENXIO;
}
static int
send_lrm_rsc_op(crm_ipc_t * crmd_channel, const char *op,
const char *host_uname, const char *rsc_id,
gboolean only_failed, pe_working_set_t * data_set)
{
char *key = NULL;
int rc = -ECOMM;
xmlNode *cmd = NULL;
xmlNode *xml_rsc = NULL;
const char *value = NULL;
const char *router_node = host_uname;
xmlNode *params = NULL;
xmlNode *msg_data = NULL;
resource_t *rsc = pe_find_resource(data_set->resources, rsc_id);
if (rsc == NULL) {
CMD_ERR("Resource %s not found\n", rsc_id);
return -ENXIO;
} else if (rsc->variant != pe_native) {
CMD_ERR("We can only process primitive resources, not %s\n", rsc_id);
return -EINVAL;
} else if (host_uname == NULL) {
CMD_ERR("Please supply a hostname with -H\n");
return -EINVAL;
} else {
node_t *node = pe_find_node(data_set->nodes, host_uname);
if (node && is_remote_node(node)) {
if (node->details->remote_rsc->running_on) {
node = node->details->remote_rsc->running_on->data;
router_node = node->details->uname;
} else {
CMD_ERR("No lrmd connection detected to remote node %s", host_uname);
return -ENXIO;
}
}
}
key = generate_transition_key(0, getpid(), 0, "xxxxxxxx-xrsc-opxx-xcrm-resourcexxxx");
msg_data = create_xml_node(NULL, XML_GRAPH_TAG_RSC_OP);
crm_xml_add(msg_data, XML_ATTR_TRANSITION_KEY, key);
free(key);
crm_xml_add(msg_data, XML_LRM_ATTR_TARGET, host_uname);
if (safe_str_neq(router_node, host_uname)) {
crm_xml_add(msg_data, XML_LRM_ATTR_ROUTER_NODE, router_node);
}
xml_rsc = create_xml_node(msg_data, XML_CIB_TAG_RESOURCE);
if (rsc->clone_name) {
crm_xml_add(xml_rsc, XML_ATTR_ID, rsc->clone_name);
crm_xml_add(xml_rsc, XML_ATTR_ID_LONG, rsc->id);
} else {
crm_xml_add(xml_rsc, XML_ATTR_ID, rsc->id);
}
value = crm_element_value(rsc->xml, XML_ATTR_TYPE);
crm_xml_add(xml_rsc, XML_ATTR_TYPE, value);
if (value == NULL) {
CMD_ERR("%s has no type! Aborting...\n", rsc_id);
return -ENXIO;
}
value = crm_element_value(rsc->xml, XML_AGENT_ATTR_CLASS);
crm_xml_add(xml_rsc, XML_AGENT_ATTR_CLASS, value);
if (value == NULL) {
CMD_ERR("%s has no class! Aborting...\n", rsc_id);
return -ENXIO;
}
value = crm_element_value(rsc->xml, XML_AGENT_ATTR_PROVIDER);
crm_xml_add(xml_rsc, XML_AGENT_ATTR_PROVIDER, value);
params = create_xml_node(msg_data, XML_TAG_ATTRS);
crm_xml_add(params, XML_ATTR_CRM_VERSION, CRM_FEATURE_SET);
key = crm_meta_name(XML_LRM_ATTR_INTERVAL);
crm_xml_add(params, key, "60000"); /* 1 minute */
free(key);
cmd = create_request(op, msg_data, router_node, CRM_SYSTEM_CRMD, crm_system_name, our_pid);
/* crm_log_xml_warn(cmd, "send_lrm_rsc_op"); */
free_xml(msg_data);
if (crm_ipc_send(crmd_channel, cmd, 0, 0, NULL) > 0) {
rc = 0;
} else {
CMD_ERR("Could not send %s op to the crmd", op);
rc = -ENOTCONN;
}
free_xml(cmd);
return rc;
}
static int
delete_lrm_rsc(cib_t *cib_conn, crm_ipc_t * crmd_channel, const char *host_uname,
resource_t * rsc, pe_working_set_t * data_set)
{
int rc = pcmk_ok;
if (rsc == NULL) {
return -ENXIO;
} else if (rsc->children) {
GListPtr lpc = NULL;
for (lpc = rsc->children; lpc != NULL; lpc = lpc->next) {
resource_t *child = (resource_t *) lpc->data;
delete_lrm_rsc(cib_conn, crmd_channel, host_uname, child, data_set);
}
return pcmk_ok;
} else if (host_uname == NULL) {
GListPtr lpc = NULL;
for (lpc = data_set->nodes; lpc != NULL; lpc = lpc->next) {
node_t *node = (node_t *) lpc->data;
if (node->details->online) {
delete_lrm_rsc(cib_conn, crmd_channel, node->details->uname, rsc, data_set);
}
}
return pcmk_ok;
}
printf("Cleaning up %s on %s\n", rsc->id, host_uname);
rc = send_lrm_rsc_op(crmd_channel, CRM_OP_LRM_DELETE, host_uname, rsc->id, TRUE, data_set);
if (rc == pcmk_ok) {
char *attr_name = NULL;
const char *id = rsc->id;
node_t *node = pe_find_node(data_set->nodes, host_uname);
if(node && node->details->remote_rsc == NULL) {
crmd_replies_needed++;
}
if (rsc->clone_name) {
id = rsc->clone_name;
}
attr_name = crm_concat("fail-count", id, '-');
rc = attrd_update_delegate(NULL, 'D', host_uname, attr_name, NULL, XML_CIB_TAG_STATUS, NULL,
NULL, NULL, node ? is_remote_node(node) : FALSE);
free(attr_name);
}
return rc;
}
static int
fail_lrm_rsc(crm_ipc_t * crmd_channel, const char *host_uname,
const char *rsc_id, pe_working_set_t * data_set)
{
crm_warn("Failing: %s", rsc_id);
return send_lrm_rsc_op(crmd_channel, CRM_OP_LRM_FAIL, host_uname, rsc_id, FALSE, data_set);
}
static char *
parse_cli_lifetime(const char *input)
{
char *later_s = NULL;
crm_time_t *now = NULL;
crm_time_t *later = NULL;
crm_time_t *duration = NULL;
if (input == NULL) {
return NULL;
}
duration = crm_time_parse_duration(input);
if (duration == NULL) {
CMD_ERR("Invalid duration specified: %s\n", input);
CMD_ERR("Please refer to"
" http://en.wikipedia.org/wiki/ISO_8601#Duration"
" for examples of valid durations\n");
return NULL;
}
now = crm_time_new(NULL);
later = crm_time_add(now, duration);
crm_time_log(LOG_INFO, "now ", now,
crm_time_log_date | crm_time_log_timeofday | crm_time_log_with_timezone);
crm_time_log(LOG_INFO, "later ", later,
crm_time_log_date | crm_time_log_timeofday | crm_time_log_with_timezone);
crm_time_log(LOG_INFO, "duration", duration, crm_time_log_date | crm_time_log_timeofday);
later_s = crm_time_as_string(later, crm_time_log_date | crm_time_log_timeofday);
printf("Migration will take effect until: %s\n", later_s);
crm_time_free(duration);
crm_time_free(later);
crm_time_free(now);
return later_s;
}
static int
ban_resource(const char *rsc_id, const char *host, GListPtr allnodes, cib_t * cib_conn)
{
char *later_s = NULL;
int rc = pcmk_ok;
char *id = NULL;
xmlNode *fragment = NULL;
xmlNode *location = NULL;
if(host == NULL) {
GListPtr n = allnodes;
for(; n && rc == pcmk_ok; n = n->next) {
node_t *target = n->data;
rc = ban_resource(rsc_id, target->details->uname, NULL, cib_conn);
}
return rc;
}
later_s = parse_cli_lifetime(move_lifetime);
if(move_lifetime && later_s == NULL) {
return -EINVAL;
}
fragment = create_xml_node(NULL, XML_CIB_TAG_CONSTRAINTS);
id = g_strdup_printf("cli-ban-%s-on-%s", rsc_id, host);
location = create_xml_node(fragment, XML_CONS_TAG_RSC_LOCATION);
crm_xml_add(location, XML_ATTR_ID, id);
free(id);
if (BE_QUIET == FALSE) {
CMD_ERR("WARNING: Creating rsc_location constraint '%s'"
" with a score of -INFINITY for resource %s"
" on %s.\n", ID(location), rsc_id, host);
CMD_ERR("\tThis will prevent %s from %s"
" on %s until the constraint is removed using"
" the 'crm_resource --clear' command or manually"
" with cibadmin\n", rsc_id, scope_master?"being promoted":"running", host);
CMD_ERR("\tThis will be the case even if %s is"
" the last node in the cluster\n", host);
CMD_ERR("\tThis message can be disabled with --quiet\n");
}
crm_xml_add(location, XML_COLOC_ATTR_SOURCE, rsc_id);
if(scope_master) {
crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_MASTER_S);
} else {
crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_STARTED_S);
}
if (later_s == NULL) {
/* Short form */
crm_xml_add(location, XML_CIB_TAG_NODE, host);
crm_xml_add(location, XML_RULE_ATTR_SCORE, MINUS_INFINITY_S);
} else {
xmlNode *rule = create_xml_node(location, XML_TAG_RULE);
xmlNode *expr = create_xml_node(rule, XML_TAG_EXPRESSION);
id = g_strdup_printf("cli-ban-%s-on-%s-rule", rsc_id, host);
crm_xml_add(rule, XML_ATTR_ID, id);
free(id);
crm_xml_add(rule, XML_RULE_ATTR_SCORE, MINUS_INFINITY_S);
crm_xml_add(rule, XML_RULE_ATTR_BOOLEAN_OP, "and");
id = g_strdup_printf("cli-ban-%s-on-%s-expr", rsc_id, host);
crm_xml_add(expr, XML_ATTR_ID, id);
free(id);
crm_xml_add(expr, XML_EXPR_ATTR_ATTRIBUTE, "#uname");
crm_xml_add(expr, XML_EXPR_ATTR_OPERATION, "eq");
crm_xml_add(expr, XML_EXPR_ATTR_VALUE, host);
crm_xml_add(expr, XML_EXPR_ATTR_TYPE, "string");
expr = create_xml_node(rule, "date_expression");
id = g_strdup_printf("cli-ban-%s-on-%s-lifetime", rsc_id, host);
crm_xml_add(expr, XML_ATTR_ID, id);
free(id);
crm_xml_add(expr, "operation", "lt");
crm_xml_add(expr, "end", later_s);
}
crm_log_xml_notice(fragment, "Modify");
rc = cib_conn->cmds->update(cib_conn, XML_CIB_TAG_CONSTRAINTS, fragment, cib_options);
free_xml(fragment);
free(later_s);
return rc;
}
static int
prefer_resource(const char *rsc_id, const char *host, cib_t * cib_conn)
{
char *later_s = parse_cli_lifetime(move_lifetime);
int rc = pcmk_ok;
char *id = NULL;
xmlNode *location = NULL;
xmlNode *fragment = NULL;
if(move_lifetime && later_s == NULL) {
return -EINVAL;
}
fragment = create_xml_node(NULL, XML_CIB_TAG_CONSTRAINTS);
id = g_strdup_printf("cli-prefer-%s", rsc_id);
location = create_xml_node(fragment, XML_CONS_TAG_RSC_LOCATION);
crm_xml_add(location, XML_ATTR_ID, id);
free(id);
crm_xml_add(location, XML_COLOC_ATTR_SOURCE, rsc_id);
if(scope_master) {
crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_MASTER_S);
} else {
crm_xml_add(location, XML_RULE_ATTR_ROLE, RSC_ROLE_STARTED_S);
}
if (later_s == NULL) {
/* Short form */
crm_xml_add(location, XML_CIB_TAG_NODE, host);
crm_xml_add(location, XML_RULE_ATTR_SCORE, INFINITY_S);
} else {
xmlNode *rule = create_xml_node(location, XML_TAG_RULE);
xmlNode *expr = create_xml_node(rule, XML_TAG_EXPRESSION);
id = crm_concat("cli-prefer-rule", rsc_id, '-');
crm_xml_add(rule, XML_ATTR_ID, id);
free(id);
crm_xml_add(rule, XML_RULE_ATTR_SCORE, INFINITY_S);
crm_xml_add(rule, XML_RULE_ATTR_BOOLEAN_OP, "and");
id = crm_concat("cli-prefer-expr", rsc_id, '-');
crm_xml_add(expr, XML_ATTR_ID, id);
free(id);
crm_xml_add(expr, XML_EXPR_ATTR_ATTRIBUTE, "#uname");
crm_xml_add(expr, XML_EXPR_ATTR_OPERATION, "eq");
crm_xml_add(expr, XML_EXPR_ATTR_VALUE, host);
crm_xml_add(expr, XML_EXPR_ATTR_TYPE, "string");
expr = create_xml_node(rule, "date_expression");
id = crm_concat("cli-prefer-lifetime-end", rsc_id, '-');
crm_xml_add(expr, XML_ATTR_ID, id);
free(id);
crm_xml_add(expr, "operation", "lt");
crm_xml_add(expr, "end", later_s);
}
crm_log_xml_info(fragment, "Modify");
rc = cib_conn->cmds->update(cib_conn, XML_CIB_TAG_CONSTRAINTS, fragment, cib_options);
free_xml(fragment);
free(later_s);
return rc;
}
static int
clear_resource(const char *rsc_id, const char *host, GListPtr allnodes, cib_t * cib_conn)
{
char *id = NULL;
int rc = pcmk_ok;
xmlNode *fragment = NULL;
xmlNode *location = NULL;
fragment = create_xml_node(NULL, XML_CIB_TAG_CONSTRAINTS);
if(host) {
id = g_strdup_printf("cli-ban-%s-on-%s", rsc_id, host);
location = create_xml_node(fragment, XML_CONS_TAG_RSC_LOCATION);
crm_xml_add(location, XML_ATTR_ID, id);
free(id);
} else {
GListPtr n = allnodes;
for(; n; n = n->next) {
node_t *target = n->data;
id = g_strdup_printf("cli-ban-%s-on-%s", rsc_id, target->details->uname);
location = create_xml_node(fragment, XML_CONS_TAG_RSC_LOCATION);
crm_xml_add(location, XML_ATTR_ID, id);
free(id);
}
}
id = g_strdup_printf("cli-prefer-%s", rsc_id);
location = create_xml_node(fragment, XML_CONS_TAG_RSC_LOCATION);
crm_xml_add(location, XML_ATTR_ID, id);
if(host && do_force == FALSE) {
crm_xml_add(location, XML_CIB_TAG_NODE, host);
}
free(id);
crm_log_xml_info(fragment, "Delete");
rc = cib_conn->cmds->delete(cib_conn, XML_CIB_TAG_CONSTRAINTS, fragment, cib_options);
if (rc == -ENXIO) {
rc = pcmk_ok;
} else if (rc != pcmk_ok) {
goto bail;
}
bail:
free_xml(fragment);
return rc;
}
static int
list_resource_operations(const char *rsc_id, const char *host_uname, gboolean active,
pe_working_set_t * data_set)
{
resource_t *rsc = NULL;
int opts = pe_print_printf | pe_print_rsconly | pe_print_suppres_nl;
GListPtr ops = find_operations(rsc_id, host_uname, active, data_set);
GListPtr lpc = NULL;
+ if (print_pending) {
+ opts |= pe_print_pending;
+ }
+
for (lpc = ops; lpc != NULL; lpc = lpc->next) {
xmlNode *xml_op = (xmlNode *) lpc->data;
const char *op_rsc = crm_element_value(xml_op, "resource");
const char *last = crm_element_value(xml_op, XML_RSC_OP_LAST_CHANGE);
const char *status_s = crm_element_value(xml_op, XML_LRM_ATTR_OPSTATUS);
const char *op_key = crm_element_value(xml_op, XML_LRM_ATTR_TASK_KEY);
int status = crm_parse_int(status_s, "0");
rsc = pe_find_resource(data_set->resources, op_rsc);
if(rsc) {
rsc->fns->print(rsc, "", opts, stdout);
} else {
fprintf(stdout, "Unknown resource %s", op_rsc);
}
fprintf(stdout, ": %s (node=%s, call=%s, rc=%s",
op_key ? op_key : ID(xml_op),
crm_element_value(xml_op, XML_ATTR_UNAME),
crm_element_value(xml_op, XML_LRM_ATTR_CALLID),
crm_element_value(xml_op, XML_LRM_ATTR_RC));
if (last) {
time_t run_at = crm_parse_int(last, "0");
fprintf(stdout, ", last-rc-change=%s, exec=%sms",
crm_strip_trailing_newline(ctime(&run_at)), crm_element_value(xml_op, XML_RSC_OP_T_EXEC));
}
fprintf(stdout, "): %s\n", services_lrm_status_str(status));
}
return pcmk_ok;
}
static void
show_location(resource_t * rsc, const char *prefix)
{
GListPtr lpc = NULL;
GListPtr list = rsc->rsc_location;
int offset = 0;
if (prefix) {
offset = strlen(prefix) - 2;
}
for (lpc = list; lpc != NULL; lpc = lpc->next) {
rsc_to_node_t *cons = (rsc_to_node_t *) lpc->data;
GListPtr lpc2 = NULL;
for (lpc2 = cons->node_list_rh; lpc2 != NULL; lpc2 = lpc2->next) {
node_t *node = (node_t *) lpc2->data;
char *score = score2char(node->weight);
fprintf(stdout, "%s: Node %-*s (score=%s, id=%s)\n",
prefix ? prefix : " ", 71 - offset, node->details->uname, score, cons->id);
free(score);
}
}
}
static void
show_colocation(resource_t * rsc, gboolean dependants, gboolean recursive, int offset)
{
char *prefix = NULL;
GListPtr lpc = NULL;
GListPtr list = rsc->rsc_cons;
prefix = calloc(1, (offset * 4) + 1);
memset(prefix, ' ', offset * 4);
if (dependants) {
list = rsc->rsc_cons_lhs;
}
if (is_set(rsc->flags, pe_rsc_allocating)) {
/* Break colocation loops */
printf("loop %s\n", rsc->id);
free(prefix);
return;
}
set_bit(rsc->flags, pe_rsc_allocating);
for (lpc = list; lpc != NULL; lpc = lpc->next) {
rsc_colocation_t *cons = (rsc_colocation_t *) lpc->data;
char *score = NULL;
resource_t *peer = cons->rsc_rh;
if (dependants) {
peer = cons->rsc_lh;
}
if (is_set(peer->flags, pe_rsc_allocating)) {
if (dependants == FALSE) {
fprintf(stdout, "%s%-*s (id=%s - loop)\n", prefix, 80 - (4 * offset), peer->id,
cons->id);
}
continue;
}
if (dependants && recursive) {
show_colocation(peer, dependants, recursive, offset + 1);
}
score = score2char(cons->score);
if (cons->role_rh > RSC_ROLE_STARTED) {
fprintf(stdout, "%s%-*s (score=%s, %s role=%s, id=%s)\n", prefix, 80 - (4 * offset),
peer->id, score, dependants ? "needs" : "with", role2text(cons->role_rh),
cons->id);
} else {
fprintf(stdout, "%s%-*s (score=%s, id=%s)\n", prefix, 80 - (4 * offset),
peer->id, score, cons->id);
}
show_location(peer, prefix);
free(score);
if (!dependants && recursive) {
show_colocation(peer, dependants, recursive, offset + 1);
}
}
free(prefix);
}
static GHashTable *
generate_resource_params(resource_t * rsc, pe_working_set_t * data_set)
{
GHashTable *params = NULL;
GHashTable *meta = NULL;
GHashTable *combined = NULL;
GHashTableIter iter;
if (!rsc) {
crm_err("Resource does not exist in config");
return NULL;
}
params =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, g_hash_destroy_str);
meta = g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, g_hash_destroy_str);
combined =
g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, g_hash_destroy_str);
get_rsc_attributes(params, rsc, NULL /* TODO: Pass in local node */ , data_set);
get_meta_attributes(meta, rsc, NULL /* TODO: Pass in local node */ , data_set);
if (params) {
char *key = NULL;
char *value = NULL;
g_hash_table_iter_init(&iter, params);
while (g_hash_table_iter_next(&iter, (gpointer *) & key, (gpointer *) & value)) {
g_hash_table_insert(combined, strdup(key), strdup(value));
}
g_hash_table_destroy(params);
}
if (meta) {
char *key = NULL;
char *value = NULL;
g_hash_table_iter_init(&iter, meta);
while (g_hash_table_iter_next(&iter, (gpointer *) & key, (gpointer *) & value)) {
char *crm_name = crm_meta_name(key);
g_hash_table_insert(combined, crm_name, strdup(value));
}
g_hash_table_destroy(meta);
}
return combined;
}
/* *INDENT-OFF* */
static struct crm_option long_options[] = {
/* Top-level Options */
{"help", 0, 0, '?', "\t\tThis text"},
{"version", 0, 0, '$', "\t\tVersion information" },
{"verbose", 0, 0, 'V', "\t\tIncrease debug output"},
{"quiet", 0, 0, 'Q', "\t\tPrint only the value on stdout\n"},
{"resource", 1, 0, 'r', "\tResource ID" },
{"-spacer-",1, 0, '-', "\nQueries:"},
{"list", 0, 0, 'L', "\t\tList all cluster resources"},
{"list-raw", 0, 0, 'l', "\tList the IDs of all instantiated resources (no groups/clones/...)"},
{"list-cts", 0, 0, 'c', NULL, 1},
{"list-operations", 0, 0, 'O', "\tList active resource operations. Optionally filtered by resource (-r) and/or node (-N)"},
- {"list-all-operations", 0, 0, 'o', "List all resource operations. Optionally filtered by resource (-r) and/or node (-N)\n"},
+ {"list-all-operations", 0, 0, 'o', "List all resource operations. Optionally filtered by resource (-r) and/or node (-N)"},
+ {"pending", 0, 0, 'j', "\t\tDisplay pending state if 'record-pending' is enabled\n"},
{"list-standards", 0, 0, 0, "\tList supported standards"},
{"list-ocf-providers", 0, 0, 0, "List all available OCF providers"},
{"list-agents", 1, 0, 0, "List all agents available for the named standard and/or provider."},
{"list-ocf-alternatives", 1, 0, 0, "List all available providers for the named OCF agent\n"},
{"show-metadata", 1, 0, 0, "Show the metadata for the named class:provider:agent"},
{"query-xml", 0, 0, 'q', "\tQuery the definition of a resource (template expanded)"},
{"query-xml-raw", 0, 0, 'w', "\tQuery the definition of a resource (raw xml)"},
{"locate", 0, 0, 'W', "\t\tDisplay the current location(s) of a resource"},
{"stack", 0, 0, 'A', "\t\tDisplay the prerequisites and dependents of a resource"},
{"constraints",0, 0, 'a', "\tDisplay the (co)location constraints that apply to a resource"},
{"-spacer-", 1, 0, '-', "\nCommands:"},
{"cleanup", 0, 0, 'C', "\t\tDelete the resource history and re-check the current state. Optional: --resource"},
{"set-parameter", 1, 0, 'p', "Set the named parameter for a resource. See also -m, --meta"},
{"get-parameter", 1, 0, 'g', "Display the named parameter for a resource. See also -m, --meta"},
{"delete-parameter",1, 0, 'd', "Delete the named parameter for a resource. See also -m, --meta"},
{"get-property", 1, 0, 'G', "Display the 'class', 'type' or 'provider' of a resource", 1},
{"set-property", 1, 0, 'S', "(Advanced) Set the class, type or provider of a resource", 1},
{"-spacer-", 1, 0, '-', "\nResource location:"},
{
"move", 0, 0, 'M',
"\t\tMove a resource from its current location to the named destination.\n "
"\t\t\t\tRequires: --host. Optional: --lifetime, --master\n\n"
"\t\t\t\tNOTE: This may prevent the resource from running on its previous location until the implicit constraints expire or are removed with --unban\n"
},
{
"ban", 0, 0, 'B',
"\t\tPrevent the named resource from running on the named --host. \n"
"\t\t\t\tRequires: --resource. Optional: --host, --lifetime, --master\n\n"
"\t\t\t\tIf --host is not specified, it defaults to:\n"
"\t\t\t\t * the current location for primitives and groups, or\n\n"
"\t\t\t\t * the current location of the master for m/s resources with master-max=1\n\n"
"\t\t\t\tAll other situations result in an error as there is no sane default.\n\n"
"\t\t\t\tNOTE: This will prevent the resource from running on this node until the constraint expires or is removed with --clear\n"
},
{
"clear", 0, 0, 'U', "\t\tRemove all constraints created by the --ban and/or --move commands. \n"
"\t\t\t\tRequires: --resource. Optional: --host, --master\n\n"
"\t\t\t\tIf --host is not specified, all constraints created by --ban and --move will be removed for the named resource.\n"
},
{"lifetime", 1, 0, 'u', "\tLifespan of constraints created by the --ban and --move commands"},
{
"master", 0, 0, 0,
"\t\tLimit the scope of the --ban, --move and --clear commands to the Master role.\n"
"\t\t\t\tFor --ban and --move, the previous master can still remain active in the Slave role."
},
{"-spacer-", 1, 0, '-', "\nAdvanced Commands:"},
{"delete", 0, 0, 'D', "\t\t(Advanced) Delete a resource from the CIB"},
{"fail", 0, 0, 'F', "\t\t(Advanced) Tell the cluster this resource has failed"},
{"force-stop", 0, 0, 0, "\t(Advanced) Bypass the cluster and stop a resource on the local node. Additional detail with -V"},
{"force-start",0, 0, 0, "\t(Advanced) Bypass the cluster and start a resource on the local node. Additional detail with -V"},
{"force-check",0, 0, 0, "\t(Advanced) Bypass the cluster and check the state of a resource on the local node. Additional detail with -V\n"},
{"-spacer-", 1, 0, '-', "\nAdditional Options:"},
{"node", 1, 0, 'N', "\tHost uname"},
{"recursive", 0, 0, 0, "\tFollow colocation chains when using --set-parameter"},
{"resource-type", 1, 0, 't', "Resource type (primitive, clone, group, ...)"},
- {"parameter-value", 1, 0, 'v', "Value to use with -p, -g or -d"},
+ {"parameter-value", 1, 0, 'v', "Value to use with -p or -S"},
{"meta", 0, 0, 'm', "\t\tModify a resource's configuration option rather than one which is passed to the resource agent script. For use with -p, -g, -d"},
{"utilization", 0, 0, 'z', "\tModify a resource's utilization attribute. For use with -p, -g, -d"},
{"set-name", 1, 0, 's', "\t(Advanced) ID of the instance_attributes object to change"},
{"nvpair", 1, 0, 'i', "\t(Advanced) ID of the nvpair object to change/delete"},
{"force", 0, 0, 'f', "\n" /* Is this actually true anymore?
"\t\tForce the resource to move by creating a rule for the current location and a score of -INFINITY"
"\n\t\tThis should be used if the resource's stickiness and constraint scores total more than INFINITY (Currently 100,000)"
"\n\t\tNOTE: This will prevent the resource from running on this node until the constraint is removed with -U or the --lifetime duration expires\n"*/ },
{"xml-file", 1, 0, 'x', NULL, 1},
/* legacy options */
{"host-uname", 1, 0, 'H', NULL, 1},
{"migrate", 0, 0, 'M', NULL, 1},
{"un-migrate", 0, 0, 'U', NULL, 1},
{"un-move", 0, 0, 'U', NULL, 1},
{"refresh", 0, 0, 'R', NULL, 1},
{"reprobe", 0, 0, 'P', NULL, 1},
{"-spacer-", 1, 0, '-', "\nExamples:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', "List the configured resources:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --list", pcmk_option_example},
{"-spacer-", 1, 0, '-', "List the available OCF agents:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --list-agents ocf", pcmk_option_example},
{"-spacer-", 1, 0, '-', "List the available OCF agents from the linux-ha project:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --list-agents ocf:heartbeat", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Display the current location of 'myResource':", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --resource myResource --locate", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Move 'myResource' to another machine:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --resource myResource --move", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Move 'myResource' to a specific machine:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --resource myResource --move --node altNode", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Allow (but not force) 'myResource' to move back to its original location:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --resource myResource --un-move", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Tell the cluster that 'myResource' failed:", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --resource myResource --fail", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Stop 'myResource' (and anything that depends on it):", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --resource myResource --set-parameter target-role --meta --parameter-value Stopped", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Tell the cluster not to manage 'myResource':", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', "The cluster will not attempt to start or stop the resource under any circumstances."},
{"-spacer-", 1, 0, '-', "Useful when performing maintenance tasks on a resource.", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --resource myResource --set-parameter is-managed --meta --parameter-value false", pcmk_option_example},
{"-spacer-", 1, 0, '-', "Erase the operation history of 'myResource' on 'aNode':", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', "The cluster will 'forget' the existing resource state (including any errors) and attempt to recover the resource."},
{"-spacer-", 1, 0, '-', "Useful when a resource has failed permanently and has been repaired by an administrator.", pcmk_option_paragraph},
{"-spacer-", 1, 0, '-', " crm_resource --resource myResource --cleanup --node aNode", pcmk_option_example},
{0, 0, 0, 0}
};
/* *INDENT-ON* */
int
main(int argc, char **argv)
{
const char *longname = NULL;
pe_working_set_t data_set;
xmlNode *cib_xml_copy = NULL;
cib_t *cib_conn = NULL;
bool do_trace = FALSE;
bool recursive = FALSE;
int rc = pcmk_ok;
int option_index = 0;
int argerr = 0;
int flag;
crm_log_cli_init("crm_resource");
crm_set_options(NULL, "(query|command) [options]", long_options,
"Perform tasks related to cluster resources.\nAllows resources to be queried (definition and location), modified, and moved around the cluster.\n");
if (argc < 2) {
crm_help('?', EX_USAGE);
}
while (1) {
flag = crm_get_option_long(argc, argv, &option_index, &longname);
if (flag == -1)
break;
switch (flag) {
case 0:
if (safe_str_eq("master", longname)) {
scope_master = TRUE;
} else if(safe_str_eq(longname, "recursive")) {
recursive = TRUE;
} else if (safe_str_eq("force-stop", longname)
|| safe_str_eq("force-start", longname)
|| safe_str_eq("force-check", longname)) {
rsc_cmd = flag;
rsc_long_cmd = longname;
} else if (safe_str_eq("list-ocf-providers", longname)
|| safe_str_eq("list-ocf-alternatives", longname)
|| safe_str_eq("list-standards", longname)) {
const char *text = NULL;
lrmd_list_t *list = NULL;
lrmd_list_t *iter = NULL;
lrmd_t *lrmd_conn = lrmd_api_new();
if (safe_str_eq("list-ocf-providers", longname)
|| safe_str_eq("list-ocf-alternatives", longname)) {
rc = lrmd_conn->cmds->list_ocf_providers(lrmd_conn, optarg, &list);
text = "OCF providers";
} else if (safe_str_eq("list-standards", longname)) {
rc = lrmd_conn->cmds->list_standards(lrmd_conn, &list);
text = "standards";
}
if (rc > 0) {
rc = 0;
for (iter = list; iter != NULL; iter = iter->next) {
rc++;
printf("%s\n", iter->val);
}
lrmd_list_freeall(list);
} else if (optarg) {
fprintf(stderr, "No %s found for %s\n", text, optarg);
} else {
fprintf(stderr, "No %s found\n", text);
}
lrmd_api_delete(lrmd_conn);
return crm_exit(rc);
} else if (safe_str_eq("show-metadata", longname)) {
char standard[512];
char provider[512];
char type[512];
char *metadata = NULL;
lrmd_t *lrmd_conn = lrmd_api_new();
rc = sscanf(optarg, "%[^:]:%[^:]:%s", standard, provider, type);
if (rc == 3) {
rc = lrmd_conn->cmds->get_metadata(lrmd_conn, standard, provider, type,
&metadata, 0);
} else if (rc == 2) {
/* Only two fields were parsed (standard:type), so the second
 * capture, stored in 'provider', is actually the agent type */
rc = lrmd_conn->cmds->get_metadata(lrmd_conn, standard, NULL, provider,
&metadata, 0);
} else if (rc < 2) {
fprintf(stderr,
"Please specify standard:type or standard:provider:type, not %s\n",
optarg);
rc = -EINVAL;
}
if (metadata) {
printf("%s\n", metadata);
} else {
fprintf(stderr, "Metadata query for %s failed: %d\n", optarg, rc);
}
lrmd_api_delete(lrmd_conn);
return crm_exit(rc);
} else if (safe_str_eq("list-agents", longname)) {
lrmd_list_t *list = NULL;
lrmd_list_t *iter = NULL;
char standard[512];
char provider[512];
lrmd_t *lrmd_conn = lrmd_api_new();
rc = sscanf(optarg, "%[^:]:%s", standard, provider);
if (rc == 1) {
rc = lrmd_conn->cmds->list_agents(lrmd_conn, &list, optarg, NULL);
provider[0] = '*';
provider[1] = 0;
} else if (rc == 2) {
rc = lrmd_conn->cmds->list_agents(lrmd_conn, &list, standard, provider);
}
if (rc > 0) {
rc = 0;
for (iter = list; iter != NULL; iter = iter->next) {
printf("%s\n", iter->val);
rc++;
}
lrmd_list_freeall(list);
rc = 0;
} else {
fprintf(stderr, "No agents found for standard=%s, provider=%s\n", standard,
provider);
rc = -1;
}
lrmd_api_delete(lrmd_conn);
return crm_exit(rc);
} else {
crm_err("Unhandled long option: %s", longname);
}
break;
case 'V':
do_trace = TRUE;
crm_bump_log_level(argc, argv);
break;
case '$':
case '?':
crm_help(flag, EX_OK);
break;
case 'x':
xml_file = strdup(optarg);
break;
case 'Q':
BE_QUIET = TRUE;
break;
case 'm':
attr_set_type = XML_TAG_META_SETS;
break;
case 'z':
attr_set_type = XML_TAG_UTILIZATION;
break;
case 'u':
move_lifetime = strdup(optarg);
break;
case 'f':
do_force = TRUE;
break;
case 'i':
prop_id = optarg;
break;
case 's':
prop_set = optarg;
break;
case 'r':
rsc_id = optarg;
break;
case 'v':
prop_value = optarg;
break;
case 't':
rsc_type = optarg;
break;
case 'C':
case 'R':
case 'P':
rsc_cmd = 'C';
break;
case 'L':
case 'c':
case 'l':
case 'q':
case 'w':
case 'D':
case 'F':
case 'W':
case 'M':
case 'U':
case 'B':
case 'O':
case 'o':
case 'A':
case 'a':
rsc_cmd = flag;
break;
+ case 'j':
+ print_pending = TRUE;
+ break;
case 'p':
case 'g':
case 'd':
case 'S':
case 'G':
prop_name = optarg;
rsc_cmd = flag;
break;
case 'h':
case 'H':
case 'N':
crm_trace("Option %c => %s", flag, optarg);
host_uname = optarg;
break;
default:
CMD_ERR("Argument code 0%o (%c) is not (?yet?) supported\n", flag, flag);
++argerr;
break;
}
}
if (optind < argc && argv[optind] != NULL) {
CMD_ERR("non-option ARGV-elements: ");
while (optind < argc && argv[optind] != NULL) {
CMD_ERR("%s ", argv[optind++]);
++argerr;
}
CMD_ERR("\n");
}
if (optind > argc) {
++argerr;
}
if (argerr) {
crm_help('?', EX_USAGE);
}
our_pid = calloc(1, 11);
if (our_pid != NULL) {
snprintf(our_pid, 10, "%d", getpid());
our_pid[10] = '\0';
}
if (do_force) {
crm_debug("Forcing...");
cib_options |= cib_quorum_override;
}
set_working_set_defaults(&data_set);
if (rsc_cmd != 'P' || rsc_id) {
resource_t *rsc = NULL;
cib_conn = cib_new();
rc = cib_conn->cmds->signon(cib_conn, crm_system_name, cib_command);
if (rc != pcmk_ok) {
CMD_ERR("Error signing on to the CIB service: %s\n", pcmk_strerror(rc));
return crm_exit(rc);
}
if (xml_file != NULL) {
cib_xml_copy = filename2xml(xml_file);
} else {
cib_xml_copy = get_cib_copy(cib_conn);
}
if (cli_config_update(&cib_xml_copy, NULL, FALSE) == FALSE) {
rc = -ENOKEY;
goto bail;
}
data_set.input = cib_xml_copy;
data_set.now = crm_time_new(NULL);
cluster_status(&data_set);
if (rsc_id) {
rsc = find_rsc_or_clone(rsc_id, &data_set);
}
if (rsc == NULL && rsc_cmd != 'C') {
rc = -ENXIO;
}
}
if (rsc_cmd == 'R' || rsc_cmd == 'C' || rsc_cmd == 'F' || rsc_cmd == 'P') {
xmlNode *xml = NULL;
mainloop_io_t *source =
mainloop_add_ipc_client(CRM_SYSTEM_CRMD, G_PRIORITY_DEFAULT, 0, NULL, &crm_callbacks);
crmd_channel = mainloop_get_ipc_client(source);
if (crmd_channel == NULL) {
CMD_ERR("Error signing on to the CRMd service\n");
rc = -ENOTCONN;
goto bail;
}
xml = create_hello_message(our_pid, crm_system_name, "0", "1");
crm_ipc_send(crmd_channel, xml, 0, 0, NULL);
free_xml(xml);
}
if (rsc_cmd == 'L') {
rc = pcmk_ok;
do_find_resource_list(&data_set, FALSE);
} else if (rsc_cmd == 'l') {
int found = 0;
GListPtr lpc = NULL;
rc = pcmk_ok;
for (lpc = data_set.resources; lpc != NULL; lpc = lpc->next) {
resource_t *rsc = (resource_t *) lpc->data;
found++;
print_raw_rsc(rsc);
}
if (found == 0) {
printf("NO resources configured\n");
rc = -ENXIO;
goto bail;
}
} else if (rsc_cmd == 0 && rsc_long_cmd) {
svc_action_t *op = NULL;
const char *rtype = NULL;
const char *rprov = NULL;
const char *rclass = NULL;
const char *action = NULL;
GHashTable *params = NULL;
resource_t *rsc = pe_find_resource(data_set.resources, rsc_id);
if (rsc == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
if (safe_str_eq(rsc_long_cmd, "force-stop")) {
action = "stop";
} else if (safe_str_eq(rsc_long_cmd, "force-start")) {
action = "start";
if(rsc->variant >= pe_clone) {
rc = do_find_resource(rsc_id, NULL, &data_set);
if(rc > 0 && do_force == FALSE) {
CMD_ERR("It is not safe to start %s here: the cluster claims it is already active", rsc_id);
CMD_ERR("Try setting target-role=stopped first or specifying --force");
crm_exit(EPERM);
}
}
} else if (safe_str_eq(rsc_long_cmd, "force-check")) {
action = "monitor";
}
rclass = crm_element_value(rsc->xml, XML_AGENT_ATTR_CLASS);
rprov = crm_element_value(rsc->xml, XML_AGENT_ATTR_PROVIDER);
rtype = crm_element_value(rsc->xml, XML_ATTR_TYPE);
if(safe_str_eq(rclass, "stonith")){
CMD_ERR("Sorry, --%s doesn't support %s resources yet\n", rsc_long_cmd, rclass);
crm_exit(EOPNOTSUPP);
}
params = generate_resource_params(rsc, &data_set);
op = resources_action_create(rsc->id, rclass, rprov, rtype, action, 0, -1, params);
if(do_trace) {
setenv("OCF_TRACE_RA", "1", 1);
}
if(op == NULL) {
/* Re-run but with stderr enabled so we can display a sane error message */
crm_enable_stderr(TRUE);
resources_action_create(rsc->id, rclass, rprov, rtype, action, 0, -1, params);
return crm_exit(EINVAL);
} else if (services_action_sync(op)) {
int more, lpc, last;
char *local_copy = NULL;
if (op->status == PCMK_LRM_OP_DONE) {
printf("Operation %s for %s (%s:%s:%s) returned %d\n",
action, rsc->id, rclass, rprov ? rprov : "", rtype, op->rc);
} else {
printf("Operation %s for %s (%s:%s:%s) failed: %d\n",
action, rsc->id, rclass, rprov ? rprov : "", rtype, op->status);
}
if (op->stdout_data) {
local_copy = strdup(op->stdout_data);
more = strlen(local_copy);
last = 0;
for (lpc = 0; lpc < more; lpc++) {
if (local_copy[lpc] == '\n' || local_copy[lpc] == 0) {
local_copy[lpc] = 0;
printf(" > stdout: %s\n", local_copy + last);
last = lpc + 1;
}
}
free(local_copy);
}
if (op->stderr_data) {
local_copy = strdup(op->stderr_data);
more = strlen(local_copy);
last = 0;
for (lpc = 0; lpc < more; lpc++) {
if (local_copy[lpc] == '\n' || local_copy[lpc] == 0) {
local_copy[lpc] = 0;
printf(" > stderr: %s\n", local_copy + last);
last = lpc + 1;
}
}
free(local_copy);
}
}
rc = op->rc;
services_action_free(op);
return crm_exit(rc);
} else if (rsc_cmd == 'A' || rsc_cmd == 'a') {
GListPtr lpc = NULL;
resource_t *rsc = pe_find_resource(data_set.resources, rsc_id);
xmlNode *cib_constraints = get_object_root(XML_CIB_TAG_CONSTRAINTS, data_set.input);
if (rsc == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
unpack_constraints(cib_constraints, &data_set);
for (lpc = data_set.resources; lpc != NULL; lpc = lpc->next) {
resource_t *r = (resource_t *) lpc->data;
clear_bit(r->flags, pe_rsc_allocating);
}
show_colocation(rsc, TRUE, rsc_cmd == 'A', 1);
fprintf(stdout, "* %s\n", rsc->id);
show_location(rsc, NULL);
for (lpc = data_set.resources; lpc != NULL; lpc = lpc->next) {
resource_t *r = (resource_t *) lpc->data;
clear_bit(r->flags, pe_rsc_allocating);
}
show_colocation(rsc, FALSE, rsc_cmd == 'A', 1);
} else if (rsc_cmd == 'c') {
int found = 0;
GListPtr lpc = NULL;
rc = pcmk_ok;
for (lpc = data_set.resources; lpc != NULL; lpc = lpc->next) {
resource_t *rsc = (resource_t *) lpc->data;
print_cts_rsc(rsc);
found++;
}
print_cts_constraints(&data_set);
} else if (rsc_cmd == 'F') {
rc = fail_lrm_rsc(crmd_channel, host_uname, rsc_id, &data_set);
if (rc == pcmk_ok) {
start_mainloop();
}
} else if (rsc_cmd == 'O') {
rc = list_resource_operations(rsc_id, host_uname, TRUE, &data_set);
} else if (rsc_cmd == 'o') {
rc = list_resource_operations(rsc_id, host_uname, FALSE, &data_set);
} else if (rc == -ENXIO) {
CMD_ERR("Resource '%s' not found: %s\n", crm_str(rsc_id), pcmk_strerror(rc));
} else if (rsc_cmd == 'W') {
if (rsc_id == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
rc = do_find_resource(rsc_id, NULL, &data_set);
} else if (rsc_cmd == 'q') {
if (rsc_id == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
rc = dump_resource(rsc_id, &data_set, TRUE);
} else if (rsc_cmd == 'w') {
if (rsc_id == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
rc = dump_resource(rsc_id, &data_set, FALSE);
} else if (rsc_cmd == 'U') {
node_t *dest = NULL;
if (rsc_id == NULL) {
CMD_ERR("No value specified for --resource\n");
rc = -ENXIO;
goto bail;
}
if (host_uname) {
dest = pe_find_node(data_set.nodes, host_uname);
if (dest == NULL) {
CMD_ERR("Unknown node: %s\n", host_uname);
rc = -ENXIO;
goto bail;
}
rc = clear_resource(rsc_id, dest->details->uname, NULL, cib_conn);
} else {
rc = clear_resource(rsc_id, NULL, data_set.nodes, cib_conn);
}
} else if (rsc_cmd == 'M' && host_uname) {
int count = 0;
node_t *current = NULL;
node_t *dest = pe_find_node(data_set.nodes, host_uname);
resource_t *rsc = pe_find_resource(data_set.resources, rsc_id);
rc = -EINVAL;
if (rsc == NULL) {
CMD_ERR("Resource '%s' not moved: not found\n", rsc_id);
rc = -ENXIO;
goto bail;
} else if (scope_master && rsc->variant < pe_master) {
resource_t *p = uber_parent(rsc);
if(p->variant == pe_master) {
CMD_ERR("Using parent '%s' for --move command instead of '%s'.\n", p->id, rsc_id);
rsc_id = p->id;
rsc = p;
} else {
CMD_ERR("Ignoring '--master' option: not valid for %s resources.\n",
get_resource_typename(rsc->variant));
scope_master = FALSE;
}
}
if(rsc->variant == pe_master) {
GListPtr iter = NULL;
for(iter = rsc->children; iter; iter = iter->next) {
resource_t *child = (resource_t *)iter->data;
if(child->role == RSC_ROLE_MASTER) {
rsc = child;
count++;
}
}
if(scope_master == FALSE && count == 0) {
count = g_list_length(rsc->running_on);
}
} else if (rsc->variant > pe_group) {
count = g_list_length(rsc->running_on);
} else if (g_list_length(rsc->running_on) > 1) {
CMD_ERR("Resource '%s' not moved: active on multiple nodes\n", rsc_id);
goto bail;
}
if(dest == NULL) {
CMD_ERR("Error performing operation: node '%s' is unknown\n", host_uname);
rc = -ENXIO;
goto bail;
}
if(g_list_length(rsc->running_on) == 1) {
current = rsc->running_on->data;
}
if(current == NULL) {
/* Nothing to check */
} else if(scope_master && rsc->role != RSC_ROLE_MASTER) {
crm_trace("%s is already active on %s but not in the correct state", rsc_id, dest->details->uname);
} else if (safe_str_eq(current->details->uname, dest->details->uname)) {
CMD_ERR("Error performing operation: %s is already %s on %s\n",
rsc_id, scope_master?"promoted":"active", dest->details->uname);
goto bail;
}
/* Clear any previous constraints for 'dest' */
clear_resource(rsc_id, dest->details->uname, data_set.nodes, cib_conn);
/* Record an explicit preference for 'dest' */
rc = prefer_resource(rsc_id, dest->details->uname, cib_conn);
crm_trace("%s%s now prefers node %s%s",
rsc->id, scope_master?" (master)":"", dest->details->uname, do_force?" (forced)":"");
if(do_force) {
/* Ban the original location if possible */
if(current) {
ban_resource(rsc_id, current->details->uname, NULL, cib_conn);
} else if(count > 1) {
CMD_ERR("Resource '%s' is currently %s in %d locations. It may now be moved to %s\n",
rsc_id, scope_master?"promoted":"active", count, dest->details->uname);
CMD_ERR("You can prevent '%s' from being %s at a specific location with:"
" --ban %s--host <name>\n", rsc_id, scope_master?"promoted":"active", scope_master?"--master ":"");
} else {
crm_trace("Not banning %s from its current location: not active", rsc_id);
}
}
} else if (rsc_cmd == 'B' && host_uname) {
resource_t *rsc = pe_find_resource(data_set.resources, rsc_id);
node_t *dest = pe_find_node(data_set.nodes, host_uname);
rc = -ENXIO;
if (rsc_id == NULL) {
CMD_ERR("No value specified for --resource\n");
goto bail;
} else if(rsc == NULL) {
CMD_ERR("Resource '%s' not moved: unknown\n", rsc_id);
} else if (dest == NULL) {
CMD_ERR("Error performing operation: node '%s' is unknown\n", host_uname);
goto bail;
}
rc = ban_resource(rsc_id, dest->details->uname, NULL, cib_conn);
} else if (rsc_cmd == 'B' || rsc_cmd == 'M') {
resource_t *rsc = pe_find_resource(data_set.resources, rsc_id);
rc = -ENXIO;
if (rsc_id == NULL) {
CMD_ERR("No value specified for --resource\n");
goto bail;
}
rc = -EINVAL;
if(rsc == NULL) {
CMD_ERR("Resource '%s' not moved: unknown\n", rsc_id);
} else if(g_list_length(rsc->running_on) == 1) {
node_t *current = rsc->running_on->data;
rc = ban_resource(rsc_id, current->details->uname, NULL, cib_conn);
} else if(rsc->variant == pe_master) {
int count = 0;
GListPtr iter = NULL;
node_t *current = NULL;
for(iter = rsc->children; iter; iter = iter->next) {
resource_t *child = (resource_t *)iter->data;
if(child->role == RSC_ROLE_MASTER) {
count++;
current = child->running_on->data;
}
}
if(count == 1 && current) {
rc = ban_resource(rsc_id, current->details->uname, NULL, cib_conn);
} else {
CMD_ERR("Resource '%s' not moved: active in %d locations (promoted in %d).\n", rsc_id, g_list_length(rsc->running_on), count);
CMD_ERR("You can prevent '%s' from running on a specific location with: --ban --host <name>\n", rsc_id);
CMD_ERR("You can prevent '%s' from being promoted at a specific location with:"
" --ban --master --host <name>\n", rsc_id);
}
} else {
CMD_ERR("Resource '%s' not moved: active in %d locations.\n", rsc_id, g_list_length(rsc->running_on));
CMD_ERR("You can prevent '%s' from running on a specific location with: --ban --host <name>\n", rsc_id);
}
} else if (rsc_cmd == 'G') {
if (rsc_id == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
rc = dump_resource_prop(rsc_id, prop_name, &data_set);
} else if (rsc_cmd == 'S') {
xmlNode *msg_data = NULL;
if (prop_value == NULL || strlen(prop_value) == 0) {
CMD_ERR("You need to supply a value with the -v option\n");
rc = -EINVAL;
goto bail;
} else if (cib_conn == NULL) {
rc = -ENOTCONN;
goto bail;
}
if (rsc_id == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
CRM_LOG_ASSERT(rsc_type != NULL);
CRM_LOG_ASSERT(prop_name != NULL);
CRM_LOG_ASSERT(prop_value != NULL);
msg_data = create_xml_node(NULL, rsc_type);
crm_xml_add(msg_data, XML_ATTR_ID, rsc_id);
crm_xml_add(msg_data, prop_name, prop_value);
rc = cib_conn->cmds->modify(cib_conn, XML_CIB_TAG_RESOURCES, msg_data, cib_options);
free_xml(msg_data);
} else if (rsc_cmd == 'g') {
if (rsc_id == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
rc = dump_resource_attr(rsc_id, prop_name, &data_set);
} else if (rsc_cmd == 'p') {
if (rsc_id == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
if (prop_value == NULL || strlen(prop_value) == 0) {
CMD_ERR("You need to supply a value with the -v option\n");
rc = -EINVAL;
goto bail;
}
/* coverity[var_deref_model] False positive */
rc = set_resource_attr(rsc_id, prop_set, prop_id, prop_name,
prop_value, recursive, cib_conn, &data_set);
} else if (rsc_cmd == 'd') {
if (rsc_id == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
/* coverity[var_deref_model] False positive */
rc = delete_resource_attr(rsc_id, prop_set, prop_id, prop_name, cib_conn, &data_set);
} else if (rsc_cmd == 'C' && rsc_id) {
resource_t *rsc = pe_find_resource(data_set.resources, rsc_id);
crm_debug("Re-checking the state of %s on %s", rsc_id, host_uname);
if(rsc) {
crmd_replies_needed = 0;
rc = delete_lrm_rsc(cib_conn, crmd_channel, host_uname, rsc, &data_set);
} else {
rc = -ENODEV;
}
if (rc == pcmk_ok) {
start_mainloop();
}
} else if (rsc_cmd == 'C') {
xmlNode *cmd = create_request(CRM_OP_REPROBE, NULL, host_uname,
CRM_SYSTEM_CRMD, crm_system_name, our_pid);
crm_debug("Re-checking the state of all resources on %s", host_uname);
if (crm_ipc_send(crmd_channel, cmd, 0, 0, NULL) > 0) {
start_mainloop();
}
free_xml(cmd);
} else if (rsc_cmd == 'D') {
xmlNode *msg_data = NULL;
if (rsc_id == NULL) {
CMD_ERR("Must supply a resource id with -r\n");
rc = -ENXIO;
goto bail;
}
if (rsc_type == NULL) {
CMD_ERR("You need to specify a resource type with -t");
rc = -ENXIO;
goto bail;
} else if (cib_conn == NULL) {
rc = -ENOTCONN;
goto bail;
}
msg_data = create_xml_node(NULL, rsc_type);
crm_xml_add(msg_data, XML_ATTR_ID, rsc_id);
rc = cib_conn->cmds->delete(cib_conn, XML_CIB_TAG_RESOURCES, msg_data, cib_options);
free_xml(msg_data);
} else {
CMD_ERR("Unknown command: %c\n", rsc_cmd);
}
bail:
if (cib_conn != NULL) {
cleanup_alloc_calculations(&data_set);
cib_conn->cmds->signoff(cib_conn);
cib_delete(cib_conn);
}
if (rc == -pcmk_err_no_quorum) {
CMD_ERR("Error performing operation: %s\n", pcmk_strerror(rc));
CMD_ERR("Try using -f\n");
} else if (rc != pcmk_ok) {
CMD_ERR("Error performing operation: %s\n", pcmk_strerror(rc));
}
return crm_exit(rc);
}
diff --git a/tools/crm_simulate.c b/tools/crm_simulate.c
index 61065c440f..0956141174 100644
--- a/tools/crm_simulate.c
+++ b/tools/crm_simulate.c
@@ -1,1596 +1,1603 @@
/*
* Copyright (C) 2009 Andrew Beekhof <andrew@beekhof.net>
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This software is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include <crm_internal.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/param.h>
#include <sys/types.h>
#include <dirent.h>
#include <crm/crm.h>
#include <crm/cib.h>
#include <crm/common/util.h>
#include <crm/transition.h>
#include <crm/common/iso8601.h>
#include <crm/pengine/status.h>
#include <allocate.h>
cib_t *global_cib = NULL;
GListPtr op_fail = NULL;
gboolean quiet = FALSE;
gboolean bringing_nodes_online = FALSE;
+gboolean print_pending = FALSE;
#define new_node_template "//"XML_CIB_TAG_NODE"[@uname='%s']"
#define node_template "//"XML_CIB_TAG_STATE"[@uname='%s']"
#define rsc_template "//"XML_CIB_TAG_STATE"[@uname='%s']//"XML_LRM_TAG_RESOURCE"[@id='%s']"
#define op_template "//"XML_CIB_TAG_STATE"[@uname='%s']//"XML_LRM_TAG_RESOURCE"[@id='%s']/"XML_LRM_TAG_RSC_OP"[@id='%s']"
/* #define op_template "//"XML_CIB_TAG_STATE"[@uname='%s']//"XML_LRM_TAG_RESOURCE"[@id='%s']/"XML_LRM_TAG_RSC_OP"[@id='%s' and @"XML_LRM_ATTR_CALLID"='%d']" */
#define quiet_log(fmt, args...) do { \
if(quiet == FALSE) { \
printf(fmt , ##args); \
} \
} while(0)
extern void cleanup_alloc_calculations(pe_working_set_t * data_set);
extern xmlNode *do_calculations(pe_working_set_t * data_set, xmlNode * xml_input, crm_time_t * now);
char *use_date = NULL;
static crm_time_t *
get_date(void)
{
if (use_date) {
return crm_time_new(use_date);
}
return NULL;
}
static xmlNode *
find_resource(xmlNode * cib_node, const char *resource)
{
char *xpath = NULL;
xmlNode *match = NULL;
const char *node = crm_element_value(cib_node, XML_ATTR_UNAME);
int max = strlen(rsc_template) + strlen(resource) + strlen(node) + 1;
xpath = calloc(1, max);
snprintf(xpath, max, rsc_template, node, resource);
match = get_xpath_object(xpath, cib_node, LOG_DEBUG_2);
free(xpath);
return match;
}
static void
create_node_entry(cib_t * cib_conn, const char *node)
{
int rc = pcmk_ok;
int max = strlen(new_node_template) + strlen(node) + 1;
char *xpath = NULL;
xpath = calloc(1, max);
snprintf(xpath, max, new_node_template, node);
rc = cib_conn->cmds->query(cib_conn, xpath, NULL, cib_xpath | cib_sync_call | cib_scope_local);
if (rc == -ENXIO) {
xmlNode *cib_object = create_xml_node(NULL, XML_CIB_TAG_NODE);
/* Using node uname as uuid ala corosync/openais */
crm_xml_add(cib_object, XML_ATTR_ID, node);
crm_xml_add(cib_object, XML_ATTR_UNAME, node);
cib_conn->cmds->create(cib_conn, XML_CIB_TAG_NODES, cib_object,
cib_sync_call | cib_scope_local);
/* Not bothering with subsequent query to see if it exists,
we'll bomb out later in the call to query_node_uuid()... */
free_xml(cib_object);
}
free(xpath);
}
static xmlNode *
inject_node_state(cib_t * cib_conn, const char *node, const char *uuid)
{
int rc = pcmk_ok;
int max = strlen(node_template) + strlen(node) + 1;
char *xpath = NULL;
xmlNode *cib_object = NULL;
xpath = calloc(1, max);
if (bringing_nodes_online) {
create_node_entry(cib_conn, node);
}
snprintf(xpath, max, node_template, node);
rc = cib_conn->cmds->query(cib_conn, xpath, &cib_object,
cib_xpath | cib_sync_call | cib_scope_local);
if (cib_object && ID(cib_object) == NULL) {
crm_err("Detected multiple node_state entries for xpath=%s, bailing", xpath);
crm_log_xml_warn(cib_object, "Duplicates");
crm_exit(ENOTUNIQ);
}
if (rc == -ENXIO) {
char *found_uuid = NULL;
if (uuid == NULL) {
query_node_uuid(cib_conn, node, &found_uuid, NULL);
} else {
found_uuid = strdup(uuid);
}
cib_object = create_xml_node(NULL, XML_CIB_TAG_STATE);
crm_xml_add(cib_object, XML_ATTR_UUID, found_uuid);
crm_xml_add(cib_object, XML_ATTR_UNAME, node);
cib_conn->cmds->create(cib_conn, XML_CIB_TAG_STATUS, cib_object,
cib_sync_call | cib_scope_local);
free_xml(cib_object);
free(found_uuid);
rc = cib_conn->cmds->query(cib_conn, xpath, &cib_object,
cib_xpath | cib_sync_call | cib_scope_local);
crm_trace("injecting node state for %s. rc is %d", node, rc);
}
free(xpath);
CRM_ASSERT(rc == pcmk_ok);
return cib_object;
}
static xmlNode *
modify_node(cib_t * cib_conn, char *node, gboolean up)
{
xmlNode *cib_node = inject_node_state(cib_conn, node, NULL);
if (up) {
crm_xml_add(cib_node, XML_NODE_IN_CLUSTER, XML_BOOLEAN_YES);
crm_xml_add(cib_node, XML_NODE_IS_PEER, ONLINESTATUS);
crm_xml_add(cib_node, XML_NODE_JOIN_STATE, CRMD_JOINSTATE_MEMBER);
crm_xml_add(cib_node, XML_NODE_EXPECTED, CRMD_JOINSTATE_MEMBER);
} else {
crm_xml_add(cib_node, XML_NODE_IN_CLUSTER, XML_BOOLEAN_NO);
crm_xml_add(cib_node, XML_NODE_IS_PEER, OFFLINESTATUS);
crm_xml_add(cib_node, XML_NODE_JOIN_STATE, CRMD_JOINSTATE_DOWN);
crm_xml_add(cib_node, XML_NODE_EXPECTED, CRMD_JOINSTATE_DOWN);
}
crm_xml_add(cib_node, XML_ATTR_ORIGIN, crm_system_name);
return cib_node;
}
static void
inject_transient_attr(xmlNode * cib_node, const char *name, const char *value)
{
xmlNode *attrs = NULL;
xmlNode *container = NULL;
xmlNode *nvp = NULL;
const char *node_uuid = ID(cib_node);
char *nvp_id = crm_concat(name, node_uuid, '-');
crm_info("Injecting attribute %s=%s into %s '%s'", name, value, xmlGetNodePath(cib_node),
ID(cib_node));
attrs = first_named_child(cib_node, XML_TAG_TRANSIENT_NODEATTRS);
if (attrs == NULL) {
attrs = create_xml_node(cib_node, XML_TAG_TRANSIENT_NODEATTRS);
crm_xml_add(attrs, XML_ATTR_ID, node_uuid);
}
container = first_named_child(attrs, XML_TAG_ATTR_SETS);
if (container == NULL) {
container = create_xml_node(attrs, XML_TAG_ATTR_SETS);
crm_xml_add(container, XML_ATTR_ID, node_uuid);
}
nvp = create_xml_node(container, XML_CIB_TAG_NVPAIR);
crm_xml_add(nvp, XML_ATTR_ID, nvp_id);
crm_xml_add(nvp, XML_NVPAIR_ATTR_NAME, name);
crm_xml_add(nvp, XML_NVPAIR_ATTR_VALUE, value);
free(nvp_id);
}
static xmlNode *
inject_resource(xmlNode * cib_node, const char *resource, const char *rclass, const char *rtype,
const char *rprovider)
{
xmlNode *lrm = NULL;
xmlNode *container = NULL;
xmlNode *cib_resource = NULL;
char *xpath = NULL;
cib_resource = find_resource(cib_node, resource);
if (cib_resource != NULL) {
return cib_resource;
}
/* One day, add query for class, provider, type */
if (rclass == NULL || rtype == NULL) {
fprintf(stderr, "Resource %s not found in the status section of %s."
" Please supply the class and type to continue\n", resource, ID(cib_node));
return NULL;
} else if (safe_str_neq(rclass, "ocf")
&& safe_str_neq(rclass, "stonith")
&& safe_str_neq(rclass, "heartbeat")
&& safe_str_neq(rclass, "service")
&& safe_str_neq(rclass, "upstart")
&& safe_str_neq(rclass, "systemd")
&& safe_str_neq(rclass, "lsb")) {
fprintf(stderr, "Invalid class for %s: %s\n", resource, rclass);
return NULL;
} else if (safe_str_eq(rclass, "ocf") && rprovider == NULL) {
fprintf(stderr, "Please specify the provider for resource %s\n", resource);
return NULL;
}
xpath = (char *)xmlGetNodePath(cib_node);
crm_info("Injecting new resource %s into %s '%s'", resource, xpath, ID(cib_node));
free(xpath);
lrm = first_named_child(cib_node, XML_CIB_TAG_LRM);
if (lrm == NULL) {
const char *node_uuid = ID(cib_node);
lrm = create_xml_node(cib_node, XML_CIB_TAG_LRM);
crm_xml_add(lrm, XML_ATTR_ID, node_uuid);
}
container = first_named_child(lrm, XML_LRM_TAG_RESOURCES);
if (container == NULL) {
container = create_xml_node(lrm, XML_LRM_TAG_RESOURCES);
}
cib_resource = create_xml_node(container, XML_LRM_TAG_RESOURCE);
crm_xml_add(cib_resource, XML_ATTR_ID, resource);
crm_xml_add(cib_resource, XML_AGENT_ATTR_CLASS, rclass);
crm_xml_add(cib_resource, XML_AGENT_ATTR_PROVIDER, rprovider);
crm_xml_add(cib_resource, XML_ATTR_TYPE, rtype);
return cib_resource;
}
static lrmd_event_data_t *
create_op(xmlNode * cib_resource, const char *task, int interval, int outcome)
{
lrmd_event_data_t *op = NULL;
xmlNode *xop = NULL;
op = calloc(1, sizeof(lrmd_event_data_t));
op->rsc_id = strdup(ID(cib_resource));
op->interval = interval;
op->op_type = strdup(task);
op->rc = outcome;
op->op_status = 0;
op->params = NULL; /* TODO: Fill me in */
op->t_run = time(NULL);
op->t_rcchange = op->t_run;
op->call_id = 0;
for (xop = __xml_first_child(cib_resource); xop != NULL; xop = __xml_next(xop)) {
int tmp = 0;
crm_element_value_int(xop, XML_LRM_ATTR_CALLID, &tmp);
if (tmp > op->call_id) {
op->call_id = tmp;
}
}
op->call_id++;
return op;
}
static xmlNode *
inject_op(xmlNode * cib_resource, lrmd_event_data_t * op, int target_rc)
{
return create_operation_update(cib_resource, op, CRM_FEATURE_SET, target_rc, crm_system_name,
LOG_DEBUG_2);
}
static void
update_failcounts(xmlNode * cib_node, const char *resource, int interval, int rc)
{
if (rc == 0) {
return;
} else if (rc == 7 && interval == 0) {
return;
} else {
char *name = NULL;
char *now = crm_itoa(time(NULL));
name = crm_concat("fail-count", resource, '-');
inject_transient_attr(cib_node, name, "value++");
free(name);
name = crm_concat("last-failure", resource, '-');
inject_transient_attr(cib_node, name, now);
free(name);
free(now);
}
}
static gboolean
exec_pseudo_action(crm_graph_t * graph, crm_action_t * action)
{
const char *node = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
const char *task = crm_element_value(action->xml, XML_LRM_ATTR_TASK_KEY);
action->confirmed = TRUE;
quiet_log(" * Pseudo action: %s%s%s\n", task, node ? " on " : "", node ? node : "");
update_graph(graph, action);
return TRUE;
}
GListPtr resource_list = NULL;
static gboolean
exec_rsc_action(crm_graph_t * graph, crm_action_t * action)
{
int rc = 0;
GListPtr gIter = NULL;
lrmd_event_data_t *op = NULL;
int target_outcome = 0;
gboolean uname_is_uuid = FALSE;
const char *rtype = NULL;
const char *rclass = NULL;
const char *resource = NULL;
const char *rprovider = NULL;
const char *operation = crm_element_value(action->xml, "operation");
const char *target_rc_s = crm_meta_value(action->params, XML_ATTR_TE_TARGET_RC);
xmlNode *cib_node = NULL;
xmlNode *cib_resource = NULL;
xmlNode *action_rsc = first_named_child(action->xml, XML_CIB_TAG_RESOURCE);
char *node = crm_element_value_copy(action->xml, XML_LRM_ATTR_TARGET);
char *uuid = crm_element_value_copy(action->xml, XML_LRM_ATTR_TARGET_UUID);
const char *router_node = crm_element_value(action->xml, XML_LRM_ATTR_ROUTER_NODE);
if (safe_str_eq(operation, CRM_OP_PROBED)
|| safe_str_eq(operation, CRM_OP_REPROBE)) {
crm_info("Skipping %s op for %s\n", operation, node);
goto done;
}
if (action_rsc == NULL) {
crm_log_xml_err(action->xml, "Bad");
free(node); free(uuid);
return FALSE;
}
/* Look for the preferred name
* If not found, try the expected 'local' name
* If not found use the preferred name anyway
*/
resource = crm_element_value(action_rsc, XML_ATTR_ID);
if (pe_find_resource(resource_list, resource) == NULL) {
const char *longname = crm_element_value(action_rsc, XML_ATTR_ID_LONG);
if (pe_find_resource(resource_list, longname)) {
resource = longname;
}
}
if (safe_str_eq(operation, "delete")) {
quiet_log(" * Resource action: %-15s delete on %s\n", resource, node);
goto done;
}
rclass = crm_element_value(action_rsc, XML_AGENT_ATTR_CLASS);
rtype = crm_element_value(action_rsc, XML_ATTR_TYPE);
rprovider = crm_element_value(action_rsc, XML_AGENT_ATTR_PROVIDER);
if (target_rc_s != NULL) {
target_outcome = crm_parse_int(target_rc_s, "0");
}
CRM_ASSERT(global_cib->cmds->query(global_cib, NULL, NULL, cib_sync_call | cib_scope_local) ==
pcmk_ok);
if (router_node) {
uname_is_uuid = TRUE;
}
cib_node = inject_node_state(global_cib, node, uname_is_uuid ? node : uuid);
CRM_ASSERT(cib_node != NULL);
cib_resource = inject_resource(cib_node, resource, rclass, rtype, rprovider);
CRM_ASSERT(cib_resource != NULL);
op = convert_graph_action(cib_resource, action, 0, target_outcome);
if (op->interval) {
quiet_log(" * Resource action: %-15s %s=%d on %s\n", resource, op->op_type, op->interval,
node);
} else {
quiet_log(" * Resource action: %-15s %s on %s\n", resource, op->op_type, node);
}
for (gIter = op_fail; gIter != NULL; gIter = gIter->next) {
char *spec = (char *)gIter->data;
char *key = NULL;
key = calloc(1, 1 + strlen(spec));
snprintf(key, 1 + strlen(spec), "%s_%s_%d@%s=", resource, op->op_type, op->interval, node);
if (strncasecmp(key, spec, strlen(key)) == 0) {
rc = sscanf(spec, "%*[^=]=%d", (int *)&op->rc);
action->failed = TRUE;
graph->abort_priority = INFINITY;
printf("\tPretending action %d failed with rc=%d\n", action->id, op->rc);
update_failcounts(cib_node, resource, op->interval, op->rc);
free(key);
break;
}
free(key);
}
inject_op(cib_resource, op, target_outcome);
lrmd_free_event(op);
rc = global_cib->cmds->modify(global_cib, XML_CIB_TAG_STATUS, cib_node,
cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
done:
free(node); free(uuid);
free_xml(cib_node);
action->confirmed = TRUE;
update_graph(graph, action);
return TRUE;
}
static gboolean
exec_crmd_action(crm_graph_t * graph, crm_action_t * action)
{
const char *node = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
const char *task = crm_element_value(action->xml, XML_LRM_ATTR_TASK);
xmlNode *rsc = first_named_child(action->xml, XML_CIB_TAG_RESOURCE);
action->confirmed = TRUE;
if(rsc) {
quiet_log(" * Cluster action: %s for %s on %s\n", task, ID(rsc), node);
} else {
quiet_log(" * Cluster action: %s on %s\n", task, node);
}
update_graph(graph, action);
return TRUE;
}
#define STATUS_PATH_MAX 512
static gboolean
exec_stonith_action(crm_graph_t * graph, crm_action_t * action)
{
int rc = 0;
char xpath[STATUS_PATH_MAX];
char *target = crm_element_value_copy(action->xml, XML_LRM_ATTR_TARGET);
xmlNode *cib_node = modify_node(global_cib, target, FALSE);
CRM_ASSERT(cib_node != NULL);
crm_xml_add(cib_node, XML_ATTR_ORIGIN, __FUNCTION__);
quiet_log(" * Fencing %s\n", target);
rc = global_cib->cmds->replace(global_cib, XML_CIB_TAG_STATUS, cib_node,
cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
snprintf(xpath, STATUS_PATH_MAX, "//node_state[@uname='%s']/%s", target, XML_CIB_TAG_LRM);
rc = global_cib->cmds->delete(global_cib, xpath, NULL,
cib_xpath | cib_sync_call | cib_scope_local);
snprintf(xpath, STATUS_PATH_MAX, "//node_state[@uname='%s']/%s", target,
XML_TAG_TRANSIENT_NODEATTRS);
rc = global_cib->cmds->delete(global_cib, xpath, NULL,
cib_xpath | cib_sync_call | cib_scope_local);
action->confirmed = TRUE;
update_graph(graph, action);
free_xml(cib_node);
free(target);
return TRUE;
}
static void
-print_cluster_status(pe_working_set_t * data_set)
+print_cluster_status(pe_working_set_t * data_set, long options)
{
char *online_nodes = NULL;
char *online_remote_nodes = NULL;
char *online_remote_containers = NULL;
char *offline_nodes = NULL;
char *offline_remote_nodes = NULL;
GListPtr gIter = NULL;
for (gIter = data_set->nodes; gIter != NULL; gIter = gIter->next) {
node_t *node = (node_t *) gIter->data;
const char *node_mode = NULL;
char *node_name = NULL;
if (is_container_remote_node(node)) {
node_name = g_strdup_printf("%s:%s", node->details->uname, node->details->remote_rsc->container->id);
} else {
node_name = g_strdup_printf("%s", node->details->uname);
}
if (node->details->unclean) {
if (node->details->online) {
node_mode = "UNCLEAN (online)";
} else if (node->details->pending) {
node_mode = "UNCLEAN (pending)";
} else {
node_mode = "UNCLEAN (offline)";
}
} else if (node->details->pending) {
node_mode = "pending";
} else if (node->details->standby_onfail && node->details->online) {
node_mode = "standby (on-fail)";
} else if (node->details->standby) {
if (node->details->online) {
node_mode = "standby";
} else {
node_mode = "OFFLINE (standby)";
}
} else if (node->details->maintenance) {
if (node->details->online) {
node_mode = "maintenance";
} else {
node_mode = "OFFLINE (maintenance)";
}
} else if (node->details->online) {
node_mode = "online";
if (is_container_remote_node(node)) {
online_remote_containers = add_list_element(online_remote_containers, node_name);
} else if (is_baremetal_remote_node(node)) {
online_remote_nodes = add_list_element(online_remote_nodes, node_name);
} else {
online_nodes = add_list_element(online_nodes, node_name);
}
free(node_name);
continue;
} else {
node_mode = "OFFLINE";
if (is_baremetal_remote_node(node)) {
offline_remote_nodes = add_list_element(offline_remote_nodes, node_name);
} else if (is_container_remote_node(node)) {
/* ignore offline container nodes */
} else {
offline_nodes = add_list_element(offline_nodes, node_name);
}
free(node_name);
continue;
}
if (is_container_remote_node(node)) {
printf("ContainerNode %s: %s\n", node_name, node_mode);
} else if (is_baremetal_remote_node(node)) {
printf("RemoteNode %s: %s\n", node_name, node_mode);
} else if (safe_str_eq(node->details->uname, node->details->id)) {
printf("Node %s: %s\n", node_name, node_mode);
} else {
printf("Node %s (%s): %s\n", node_name, node->details->id, node_mode);
}
free(node_name);
}
if (online_nodes) {
printf("Online: [%s ]\n", online_nodes);
free(online_nodes);
}
if (offline_nodes) {
printf("OFFLINE: [%s ]\n", offline_nodes);
free(offline_nodes);
}
if (online_remote_nodes) {
printf("RemoteOnline: [%s ]\n", online_remote_nodes);
free(online_remote_nodes);
}
if (offline_remote_nodes) {
printf("RemoteOFFLINE: [%s ]\n", offline_remote_nodes);
free(offline_remote_nodes);
}
if (online_remote_containers) {
printf("Containers: [%s ]\n", online_remote_containers);
free(online_remote_containers);
}
fprintf(stdout, "\n");
for (gIter = data_set->resources; gIter != NULL; gIter = gIter->next) {
resource_t *rsc = (resource_t *) gIter->data;
if (is_set(rsc->flags, pe_rsc_orphan)
&& rsc->role == RSC_ROLE_STOPPED) {
continue;
}
- rsc->fns->print(rsc, NULL, pe_print_printf, stdout);
+ rsc->fns->print(rsc, NULL, pe_print_printf | options, stdout);
}
fprintf(stdout, "\n");
}
static int
run_simulation(pe_working_set_t * data_set)
{
crm_graph_t *transition = NULL;
enum transition_status graph_rc = -1;
crm_graph_functions_t exec_fns = {
exec_pseudo_action,
exec_rsc_action,
exec_crmd_action,
exec_stonith_action,
};
set_graph_functions(&exec_fns);
quiet_log("\nExecuting cluster transition:\n");
transition = unpack_graph(data_set->graph, crm_system_name);
print_graph(LOG_DEBUG, transition);
resource_list = data_set->resources;
do {
graph_rc = run_graph(transition);
} while (graph_rc == transition_active);
resource_list = NULL;
if (graph_rc != transition_complete) {
fprintf(stdout, "Transition failed: %s\n", transition_status(graph_rc));
print_graph(LOG_ERR, transition);
}
destroy_graph(transition);
if (graph_rc != transition_complete) {
fprintf(stdout, "An invalid transition was produced\n");
}
if (quiet == FALSE) {
xmlNode *cib_object = NULL;
int rc =
global_cib->cmds->query(global_cib, NULL, &cib_object, cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
quiet_log("\nRevised cluster status:\n");
cleanup_alloc_calculations(data_set);
data_set->input = cib_object;
data_set->now = get_date();
cluster_status(data_set);
- print_cluster_status(data_set);
+ print_cluster_status(data_set, 0);
}
if (graph_rc != transition_complete) {
return graph_rc;
}
return 0;
}
static char *
create_action_name(action_t * action)
{
char *action_name = NULL;
const char *prefix = NULL;
const char *action_host = NULL;
const char *task = action->task;
if (action->node) {
action_host = action->node->details->uname;
} else if (is_not_set(action->flags, pe_action_pseudo)) {
action_host = "<none>";
}
if (safe_str_eq(action->task, RSC_CANCEL)) {
prefix = "Cancel ";
task = "monitor"; /* TO-DO: Hack! */
}
if (action->rsc && action->rsc->clone_name) {
char *key = NULL;
const char *name = action->rsc->clone_name;
const char *interval_s = g_hash_table_lookup(action->meta, XML_LRM_ATTR_INTERVAL);
int interval = crm_parse_int(interval_s, "0");
if (safe_str_eq(action->task, RSC_NOTIFY)
|| safe_str_eq(action->task, RSC_NOTIFIED)) {
const char *n_type = g_hash_table_lookup(action->meta, "notify_key_type");
const char *n_task = g_hash_table_lookup(action->meta, "notify_key_operation");
CRM_ASSERT(n_type != NULL);
CRM_ASSERT(n_task != NULL);
key = generate_notify_key(name, n_type, n_task);
} else {
key = generate_op_key(name, task, interval);
}
if (action_host) {
action_name = g_strdup_printf("%s%s %s", prefix ? prefix : "", key, action_host);
} else {
action_name = g_strdup_printf("%s%s", prefix ? prefix : "", key);
}
free(key);
} else if (safe_str_eq(action->task, CRM_OP_FENCE)) {
action_name = g_strdup_printf("%s%s %s", prefix ? prefix : "", action->task, action_host);
} else if (action_host) {
action_name = g_strdup_printf("%s%s %s", prefix ? prefix : "", action->uuid, action_host);
} else {
action_name = g_strdup_printf("%s", action->uuid);
}
return action_name;
}
static void
create_dotfile(pe_working_set_t * data_set, const char *dot_file, gboolean all_actions)
{
GListPtr gIter = NULL;
FILE *dot_strm = fopen(dot_file, "w");
if (dot_strm == NULL) {
crm_perror(LOG_ERR, "Could not open %s for writing", dot_file);
return;
}
fprintf(dot_strm, " digraph \"g\" {\n");
for (gIter = data_set->actions; gIter != NULL; gIter = gIter->next) {
action_t *action = (action_t *) gIter->data;
const char *style = "dashed";
const char *font = "black";
const char *color = "black";
char *action_name = create_action_name(action);
crm_trace("Action %d: %p", action->id, action);
if (is_set(action->flags, pe_action_pseudo)) {
font = "orange";
}
if (is_set(action->flags, pe_action_dumped)) {
style = "bold";
color = "green";
} else if (action->rsc != NULL && is_not_set(action->rsc->flags, pe_rsc_managed)) {
color = "red";
font = "purple";
if (all_actions == FALSE) {
goto dont_write;
}
} else if (is_set(action->flags, pe_action_optional)) {
color = "blue";
if (all_actions == FALSE) {
goto dont_write;
}
} else {
color = "red";
CRM_CHECK(is_set(action->flags, pe_action_runnable) == FALSE, ;);
}
set_bit(action->flags, pe_action_dumped);
fprintf(dot_strm, "\"%s\" [ style=%s color=\"%s\" fontcolor=\"%s\"]\n",
action_name, style, color, font);
dont_write:
free(action_name);
}
for (gIter = data_set->actions; gIter != NULL; gIter = gIter->next) {
action_t *action = (action_t *) gIter->data;
GListPtr gIter2 = NULL;
for (gIter2 = action->actions_before; gIter2 != NULL; gIter2 = gIter2->next) {
action_wrapper_t *before = (action_wrapper_t *) gIter2->data;
char *before_name = NULL;
char *after_name = NULL;
const char *style = "dashed";
gboolean optional = TRUE;
if (before->state == pe_link_dumped) {
optional = FALSE;
style = "bold";
} else if (is_set(action->flags, pe_action_pseudo)
&& (before->type & pe_order_stonith_stop)) {
continue;
} else if (before->state == pe_link_dup) {
continue;
} else if (before->type == pe_order_none) {
continue;
} else if (is_set(before->action->flags, pe_action_dumped)
&& is_set(action->flags, pe_action_dumped)
&& before->type != pe_order_load) {
optional = FALSE;
}
if (all_actions || optional == FALSE) {
before_name = create_action_name(before->action);
after_name = create_action_name(action);
fprintf(dot_strm, "\"%s\" -> \"%s\" [ style = %s]\n",
before_name, after_name, style);
free(before_name);
free(after_name);
}
}
}
fprintf(dot_strm, "}\n");
fflush(dot_strm);
fclose(dot_strm);
}
static int
find_ticket_state(cib_t * the_cib, const char *ticket_id, xmlNode ** ticket_state_xml)
{
int offset = 0;
static int xpath_max = 1024;
int rc = pcmk_ok;
xmlNode *xml_search = NULL;
char *xpath_string = NULL;
CRM_ASSERT(ticket_state_xml != NULL);
*ticket_state_xml = NULL;
xpath_string = calloc(1, xpath_max);
offset += snprintf(xpath_string + offset, xpath_max - offset, "%s", "/cib/status/tickets");
if (ticket_id) {
offset += snprintf(xpath_string + offset, xpath_max - offset, "/%s[@id=\"%s\"]",
XML_CIB_TAG_TICKET_STATE, ticket_id);
}
rc = the_cib->cmds->query(the_cib, xpath_string, &xml_search,
cib_sync_call | cib_scope_local | cib_xpath);
if (rc != pcmk_ok) {
goto bail;
}
crm_log_xml_debug(xml_search, "Match");
if (xml_has_children(xml_search) && ticket_id) {
fprintf(stdout, "Multiple ticket_states match ticket_id=%s\n", ticket_id);
}
*ticket_state_xml = xml_search;
bail:
free(xpath_string);
return rc;
}
static int
set_ticket_state_attr(const char *ticket_id, const char *attr_name,
const char *attr_value, cib_t * cib, int cib_options)
{
int rc = pcmk_ok;
xmlNode *xml_top = NULL;
xmlNode *ticket_state_xml = NULL;
rc = find_ticket_state(cib, ticket_id, &ticket_state_xml);
if (rc == pcmk_ok) {
crm_debug("Found a matching ticket state: id=%s", ticket_id);
xml_top = ticket_state_xml;
} else if (rc != -ENXIO) {
return rc;
} else {
xmlNode *xml_obj = NULL;
xml_top = create_xml_node(NULL, XML_CIB_TAG_STATUS);
xml_obj = create_xml_node(xml_top, XML_CIB_TAG_TICKETS);
ticket_state_xml = create_xml_node(xml_obj, XML_CIB_TAG_TICKET_STATE);
crm_xml_add(ticket_state_xml, XML_ATTR_ID, ticket_id);
}
crm_xml_add(ticket_state_xml, attr_name, attr_value);
crm_log_xml_debug(xml_top, "Update");
rc = cib->cmds->modify(cib, XML_CIB_TAG_STATUS, xml_top, cib_options);
free_xml(xml_top);
return rc;
}
static void
modify_configuration(pe_working_set_t * data_set,
const char *quorum, GListPtr node_up, GListPtr node_down, GListPtr node_fail,
GListPtr op_inject, GListPtr ticket_grant, GListPtr ticket_revoke,
GListPtr ticket_standby, GListPtr ticket_activate)
{
int rc = pcmk_ok;
GListPtr gIter = NULL;
xmlNode *cib_op = NULL;
xmlNode *cib_node = NULL;
xmlNode *cib_resource = NULL;
lrmd_event_data_t *op = NULL;
if (quorum) {
xmlNode *top = create_xml_node(NULL, XML_TAG_CIB);
quiet_log(" + Setting quorum: %s\n", quorum);
/* crm_xml_add(top, XML_ATTR_DC_UUID, dc_uuid); */
crm_xml_add(top, XML_ATTR_HAVE_QUORUM, quorum);
rc = global_cib->cmds->modify(global_cib, NULL, top, cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
}
for (gIter = node_up; gIter != NULL; gIter = gIter->next) {
char *node = (char *)gIter->data;
quiet_log(" + Bringing node %s online\n", node);
cib_node = modify_node(global_cib, node, TRUE);
CRM_ASSERT(cib_node != NULL);
rc = global_cib->cmds->modify(global_cib, XML_CIB_TAG_STATUS, cib_node,
cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
free_xml(cib_node);
}
for (gIter = node_down; gIter != NULL; gIter = gIter->next) {
char *node = (char *)gIter->data;
quiet_log(" + Taking node %s offline\n", node);
cib_node = modify_node(global_cib, node, FALSE);
CRM_ASSERT(cib_node != NULL);
rc = global_cib->cmds->modify(global_cib, XML_CIB_TAG_STATUS, cib_node,
cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
free_xml(cib_node);
}
for (gIter = node_fail; gIter != NULL; gIter = gIter->next) {
char *node = (char *)gIter->data;
quiet_log(" + Failing node %s\n", node);
cib_node = modify_node(global_cib, node, TRUE);
CRM_ASSERT(cib_node != NULL);
crm_xml_add(cib_node, XML_NODE_IN_CLUSTER, XML_BOOLEAN_NO);
rc = global_cib->cmds->modify(global_cib, XML_CIB_TAG_STATUS, cib_node,
cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
free_xml(cib_node);
}
for (gIter = ticket_grant; gIter != NULL; gIter = gIter->next) {
char *ticket_id = (char *)gIter->data;
quiet_log(" + Granting ticket %s\n", ticket_id);
rc = set_ticket_state_attr(ticket_id, "granted", "true",
global_cib, cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
}
for (gIter = ticket_revoke; gIter != NULL; gIter = gIter->next) {
char *ticket_id = (char *)gIter->data;
quiet_log(" + Revoking ticket %s\n", ticket_id);
rc = set_ticket_state_attr(ticket_id, "granted", "false",
global_cib, cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
}
for (gIter = ticket_standby; gIter != NULL; gIter = gIter->next) {
char *ticket_id = (char *)gIter->data;
quiet_log(" + Making ticket %s standby\n", ticket_id);
rc = set_ticket_state_attr(ticket_id, "standby", "true",
global_cib, cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
}
for (gIter = ticket_activate; gIter != NULL; gIter = gIter->next) {
char *ticket_id = (char *)gIter->data;
quiet_log(" + Activating ticket %s\n", ticket_id);
rc = set_ticket_state_attr(ticket_id, "standby", "false",
global_cib, cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
}
for (gIter = op_inject; gIter != NULL; gIter = gIter->next) {
char *spec = (char *)gIter->data;
int rc = 0;
int outcome = 0;
int interval = 0;
char *key = NULL;
char *node = NULL;
char *task = NULL;
char *resource = NULL;
const char *rtype = NULL;
const char *rclass = NULL;
const char *rprovider = NULL;
resource_t *rsc = NULL;
quiet_log(" + Injecting %s into the configuration\n", spec);
key = calloc(1, strlen(spec) + 1);
node = calloc(1, strlen(spec) + 1);
rc = sscanf(spec, "%[^@]@%[^=]=%d", key, node, &outcome);
CRM_CHECK(rc == 3,
fprintf(stderr, "Invalid operation spec: %s. Only found %d fields\n", spec, rc);
continue);
parse_op_key(key, &resource, &task, &interval);
rsc = pe_find_resource(data_set->resources, resource);
if (rsc == NULL) {
fprintf(stderr, " - Invalid resource name: %s\n", resource);
} else {
rclass = crm_element_value(rsc->xml, XML_AGENT_ATTR_CLASS);
rtype = crm_element_value(rsc->xml, XML_ATTR_TYPE);
rprovider = crm_element_value(rsc->xml, XML_AGENT_ATTR_PROVIDER);
cib_node = inject_node_state(global_cib, node, NULL);
CRM_ASSERT(cib_node != NULL);
update_failcounts(cib_node, resource, interval, outcome);
cib_resource = inject_resource(cib_node, resource, rclass, rtype, rprovider);
CRM_ASSERT(cib_resource != NULL);
op = create_op(cib_resource, task, interval, outcome);
CRM_ASSERT(op != NULL);
cib_op = inject_op(cib_resource, op, 0);
CRM_ASSERT(cib_op != NULL);
lrmd_free_event(op);
rc = global_cib->cmds->modify(global_cib, XML_CIB_TAG_STATUS, cib_node,
cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
}
free(task);
free(node);
free(key);
}
}
static void
setup_input(const char *input, const char *output)
{
int rc = pcmk_ok;
cib_t *cib_conn = NULL;
xmlNode *cib_object = NULL;
char *local_output = NULL;
if (input == NULL) {
/* Use live CIB */
cib_conn = cib_new();
rc = cib_conn->cmds->signon(cib_conn, crm_system_name, cib_command);
if (rc == pcmk_ok) {
cib_object = get_cib_copy(cib_conn);
}
cib_conn->cmds->signoff(cib_conn);
cib_delete(cib_conn);
cib_conn = NULL;
if (cib_object == NULL) {
fprintf(stderr, "Live CIB query failed: empty result\n");
crm_exit(ENOTCONN);
}
} else if (safe_str_eq(input, "-")) {
cib_object = filename2xml(NULL);
} else {
cib_object = filename2xml(input);
}
if (get_object_root(XML_CIB_TAG_STATUS, cib_object) == NULL) {
create_xml_node(cib_object, XML_CIB_TAG_STATUS);
}
if (cli_config_update(&cib_object, NULL, FALSE) == FALSE) {
free_xml(cib_object);
crm_exit(ENOKEY);
}
if (validate_xml(cib_object, NULL, FALSE) != TRUE) {
free_xml(cib_object);
crm_exit(pcmk_err_dtd_validation);
}
if (output == NULL) {
char *pid = crm_itoa(getpid());
local_output = get_shadow_file(pid);
output = local_output;
free(pid);
}
rc = write_xml_file(cib_object, output, FALSE);
free_xml(cib_object);
cib_object = NULL;
if (rc < 0) {
fprintf(stderr, "Could not create '%s': %s\n", output, strerror(errno));
crm_exit(rc);
}
setenv("CIB_file", output, 1);
free(local_output);
}
/* *INDENT-OFF* */
static struct crm_option long_options[] = {
/* Top-level Options */
{"help", 0, 0, '?', "\tThis text"},
{"version", 0, 0, '$', "\tVersion information" },
{"quiet", 0, 0, 'Q', "\tDisplay only essential output"},
{"verbose", 0, 0, 'V', "\tIncrease debug output"},
{"-spacer-", 0, 0, '-', "\nOperations:"},
{"run", 0, 0, 'R', "\tDetermine the cluster's response to the given configuration and status"},
{"simulate", 0, 0, 'S', "Simulate the transition's execution and display the resulting cluster status"},
{"in-place", 0, 0, 'X', "Simulate the transition's execution and store the result back to the input file"},
{"show-scores", 0, 0, 's', "Show allocation scores"},
{"show-utilization", 0, 0, 'U', "Show utilization information"},
{"profile", 1, 0, 'P', "Run all tests in the named directory to create profiling data"},
+ {"pending", 0, 0, 'j', "\tDisplay pending state if 'record-pending' is enabled"},
{"-spacer-", 0, 0, '-', "\nSynthetic Cluster Events:"},
{"node-up", 1, 0, 'u', "\tBring a node online"},
{"node-down", 1, 0, 'd', "\tTake a node offline"},
{"node-fail", 1, 0, 'f', "\tMark a node as failed"},
{"op-inject", 1, 0, 'i', "\tGenerate a failure for the cluster to react to in the simulation"},
{"-spacer-", 0, 0, '-', "\t\tValue is of the form ${resource}_${task}_${interval}@${node}=${rc}."},
{"-spacer-", 0, 0, '-', "\t\tE.g. memcached_monitor_20000@bart.example.com=7"},
{"-spacer-", 0, 0, '-', "\t\tFor more information on OCF return codes, refer to: http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-ocf-return-codes.html"},
{"op-fail", 1, 0, 'F', "\tIf the specified task occurs during the simulation, have it fail with return code ${rc}"},
{"-spacer-", 0, 0, '-', "\t\tValue is of the form ${resource}_${task}_${interval}@${node}=${rc}."},
{"-spacer-", 0, 0, '-', "\t\tE.g. memcached_stop_0@bart.example.com=1\n"},
{"-spacer-", 0, 0, '-', "\t\tThe transition will normally stop at the failed action. Save the result with --save-output and re-run with --xml-file"},
{"set-datetime", 1, 0, 't', "Set date/time"},
{"quorum", 1, 0, 'q', "\tSpecify a value for quorum"},
{"ticket-grant", 1, 0, 'g', "Grant a ticket"},
{"ticket-revoke", 1, 0, 'r', "Revoke a ticket"},
{"ticket-standby", 1, 0, 'b', "Make a ticket standby"},
{"ticket-activate", 1, 0, 'e', "Activate a ticket"},
{"-spacer-", 0, 0, '-', "\nOutput Options:"},
{"save-input", 1, 0, 'I', "\tSave the input configuration to the named file"},
{"save-output", 1, 0, 'O', "Save the output configuration to the named file"},
{"save-graph", 1, 0, 'G', "\tSave the transition graph (XML format) to the named file"},
{"save-dotfile", 1, 0, 'D', "Save the transition graph (DOT format) to the named file"},
{"all-actions", 0, 0, 'a', "\tDisplay all possible actions in the DOT graph - even ones not part of the transition"},
{"-spacer-", 0, 0, '-', "\nData Source:"},
{"live-check", 0, 0, 'L', "\tConnect to the CIB and use the current contents as input"},
{"xml-file", 1, 0, 'x', "\tRetrieve XML from the named file"},
{"xml-pipe", 0, 0, 'p', "\tRetrieve XML from stdin"},
{"-spacer-", 0, 0, '-', "\nExamples:\n"},
{"-spacer-", 0, 0, '-', "Pretend a recurring monitor action found memcached stopped on node fred.example.com and, during recovery, that the memcached stop action failed", pcmk_option_paragraph},
{"-spacer-", 0, 0, '-', " crm_simulate -LS --op-inject memcached:0_monitor_20000@bart.example.com=7 --op-fail memcached:0_stop_0@fred.example.com=1 --save-output /tmp/memcached-test.xml", pcmk_option_example},
{"-spacer-", 0, 0, '-', "Now see what the reaction to the stop failure would be", pcmk_option_paragraph},
{"-spacer-", 0, 0, '-', " crm_simulate -S --xml-file /tmp/memcached-test.xml", pcmk_option_example},
{0, 0, 0, 0}
};
/* *INDENT-ON* */
static void
profile_one(const char *xml_file)
{
xmlNode *cib_object = NULL;
pe_working_set_t data_set;
printf("* Testing %s\n", xml_file);
cib_object = filename2xml(xml_file);
if (get_object_root(XML_CIB_TAG_STATUS, cib_object) == NULL) {
create_xml_node(cib_object, XML_CIB_TAG_STATUS);
}
if (cli_config_update(&cib_object, NULL, FALSE) == FALSE) {
free_xml(cib_object);
return;
}
if (validate_xml(cib_object, NULL, FALSE) != TRUE) {
free_xml(cib_object);
return;
}
set_working_set_defaults(&data_set);
data_set.input = cib_object;
data_set.now = get_date();
do_calculations(&data_set, cib_object, NULL);
cleanup_alloc_calculations(&data_set);
}
#ifndef FILENAME_MAX
# define FILENAME_MAX 512
#endif
static int
profile_all(const char *dir)
{
struct dirent **namelist;
int lpc = 0;
int file_num = scandir(dir, &namelist, 0, alphasort);
if (file_num > 0) {
struct stat prop;
char buffer[FILENAME_MAX + 1];
while (file_num--) {
if ('.' == namelist[file_num]->d_name[0]) {
free(namelist[file_num]);
continue;
} else if (strstr(namelist[file_num]->d_name, ".xml") == NULL) {
free(namelist[file_num]);
continue;
}
lpc++;
snprintf(buffer, FILENAME_MAX, "%s/%s", dir, namelist[file_num]->d_name);
if (stat(buffer, &prop) == 0 && S_ISREG(prop.st_mode)) {
profile_one(buffer);
}
free(namelist[file_num]);
}
free(namelist);
}
return lpc;
}
int
main(int argc, char **argv)
{
int rc = 0;
guint modified = 0;
gboolean store = FALSE;
gboolean process = FALSE;
gboolean simulate = FALSE;
gboolean all_actions = FALSE;
gboolean have_stdout = FALSE;
pe_working_set_t data_set;
const char *xml_file = "-";
const char *quorum = NULL;
const char *test_dir = NULL;
const char *dot_file = NULL;
const char *graph_file = NULL;
const char *input_file = NULL;
const char *output_file = NULL;
int flag = 0;
int index = 0;
int argerr = 0;
GListPtr node_up = NULL;
GListPtr node_down = NULL;
GListPtr node_fail = NULL;
GListPtr op_inject = NULL;
GListPtr ticket_grant = NULL;
GListPtr ticket_revoke = NULL;
GListPtr ticket_standby = NULL;
GListPtr ticket_activate = NULL;
xmlNode *input = NULL;
crm_log_cli_init("crm_simulate");
crm_set_options(NULL, "datasource operation [additional options]",
long_options, "Tool for simulating the cluster's response to events");
if (argc < 2) {
crm_help('?', EX_USAGE);
}
while (1) {
flag = crm_get_option(argc, argv, &index);
if (flag == -1)
break;
switch (flag) {
case 'V':
if (have_stdout == FALSE) {
/* Redirect stderr to stdout so we can grep the output */
have_stdout = TRUE;
close(STDERR_FILENO);
dup2(STDOUT_FILENO, STDERR_FILENO);
}
crm_bump_log_level(argc, argv);
break;
case '?':
case '$':
crm_help(flag, EX_OK);
break;
case 'p':
xml_file = "-";
break;
case 'Q':
quiet = TRUE;
break;
case 'L':
xml_file = NULL;
break;
case 'x':
xml_file = optarg;
break;
case 'u':
modified++;
bringing_nodes_online = TRUE;
node_up = g_list_append(node_up, optarg);
break;
case 'd':
modified++;
node_down = g_list_append(node_down, optarg);
break;
case 'f':
modified++;
node_fail = g_list_append(node_fail, optarg);
break;
case 't':
use_date = strdup(optarg);
break;
case 'i':
modified++;
op_inject = g_list_append(op_inject, optarg);
break;
case 'F':
process = TRUE;
simulate = TRUE;
op_fail = g_list_append(op_fail, optarg);
break;
case 'q':
modified++;
quorum = optarg;
break;
case 'g':
modified++;
ticket_grant = g_list_append(ticket_grant, optarg);
break;
case 'r':
modified++;
ticket_revoke = g_list_append(ticket_revoke, optarg);
break;
case 'b':
modified++;
ticket_standby = g_list_append(ticket_standby, optarg);
break;
case 'e':
modified++;
ticket_activate = g_list_append(ticket_activate, optarg);
break;
case 'a':
all_actions = TRUE;
break;
case 's':
process = TRUE;
show_scores = TRUE;
break;
case 'U':
process = TRUE;
show_utilization = TRUE;
break;
+ case 'j':
+ print_pending = TRUE;
+ break;
case 'S':
process = TRUE;
simulate = TRUE;
break;
case 'X':
store = TRUE;
process = TRUE;
simulate = TRUE;
break;
case 'R':
process = TRUE;
break;
case 'D':
process = TRUE;
dot_file = optarg;
break;
case 'G':
process = TRUE;
graph_file = optarg;
break;
case 'I':
input_file = optarg;
break;
case 'O':
output_file = optarg;
break;
case 'P':
test_dir = optarg;
break;
default:
++argerr;
break;
}
}
if (optind > argc) {
++argerr;
}
if (argerr) {
crm_help('?', EX_USAGE);
}
if (test_dir != NULL) {
return profile_all(test_dir);
}
setup_input(xml_file, store ? xml_file : output_file);
global_cib = cib_new();
global_cib->cmds->signon(global_cib, crm_system_name, cib_command);
set_working_set_defaults(&data_set);
if (data_set.now != NULL) {
quiet_log(" + Setting effective cluster time: %s", use_date);
crm_time_log(LOG_WARNING, "Set fake 'now' to", data_set.now,
crm_time_log_date | crm_time_log_timeofday);
}
rc = global_cib->cmds->query(global_cib, NULL, &input, cib_sync_call | cib_scope_local);
CRM_ASSERT(rc == pcmk_ok);
data_set.input = input;
data_set.now = get_date();
cluster_status(&data_set);
if (quiet == FALSE) {
+ int options = print_pending ? pe_print_pending : 0;
+
quiet_log("\nCurrent cluster status:\n");
- print_cluster_status(&data_set);
+ print_cluster_status(&data_set, options);
}
if (modified) {
quiet_log("Performing requested modifications\n");
modify_configuration(&data_set, quorum, node_up, node_down, node_fail, op_inject,
ticket_grant, ticket_revoke, ticket_standby, ticket_activate);
rc = global_cib->cmds->query(global_cib, NULL, &input, cib_sync_call);
if (rc != pcmk_ok) {
fprintf(stderr, "Could not connect to the CIB for input: %s\n", pcmk_strerror(rc));
goto done;
}
cleanup_alloc_calculations(&data_set);
data_set.now = get_date();
data_set.input = input;
}
if (input_file != NULL) {
rc = write_xml_file(input, input_file, FALSE);
if (rc < 0) {
fprintf(stderr, "Could not create '%s': %s\n", input_file, strerror(errno));
goto done;
}
}
rc = 0;
if (process || simulate) {
crm_time_t *local_date = NULL;
if (show_scores && show_utilization) {
printf("Allocation scores and utilization information:\n");
} else if (show_scores) {
fprintf(stdout, "Allocation scores:\n");
} else if (show_utilization) {
printf("Utilization information:\n");
}
do_calculations(&data_set, input, local_date);
input = NULL; /* Don't try and free it twice */
if (graph_file != NULL) {
write_xml_file(data_set.graph, graph_file, FALSE);
}
if (dot_file != NULL) {
create_dotfile(&data_set, dot_file, all_actions);
}
if (quiet == FALSE) {
GListPtr gIter = NULL;
quiet_log("%sTransition Summary:\n", show_scores || show_utilization
|| modified ? "\n" : "");
fflush(stdout);
for (gIter = data_set.resources; gIter != NULL; gIter = gIter->next) {
resource_t *rsc = (resource_t *) gIter->data;
LogActions(rsc, &data_set, TRUE);
}
}
}
if (simulate) {
rc = run_simulation(&data_set);
}
done:
cleanup_alloc_calculations(&data_set);
global_cib->cmds->signoff(global_cib);
cib_delete(global_cib);
free(use_date);
fflush(stderr);
return crm_exit(rc);
}
