diff --git a/ChangeLog b/ChangeLog
index 34fe994101..905f22ec69 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,2336 +1,2387 @@
+* Thu Nov 03 2016 Ken Gaillot Pacemaker-1.1.16-rc1
+- Update source tarball to revision: 2fc4716
+- Changesets: 360
+- Diff: 148 files changed, 7187 insertions(+), 5592 deletions(-)
+
+- Features added since Pacemaker-1.1.15
+  + Location constraints may use rsc-pattern, with submatches expanded
+  + node-health-base available with node-health-strategy=progressive
+  + new Pacemaker Development document for working on pacemaker code base
+  + new PCMK_panic_action variable allows crash instead of reboot on panic
+  + resources: add resource agent for managing a node attribute
+  + systemd: include socket units when listing all systemd agents
+
+- Changes since Pacemaker-1.1.15
+  + Important security fix for CVE-2016-7035
+  + Logging is now synchronous when blackboxes are enabled
+  + All python code except CTS is now compatible with python 2.6+ and 3.2+
+  + build: take advantage of compiler features for security and performance
+  + build: update SuSE spec modifications for recent spec changes
+  + build: avoid watchdog reboot when upgrading pacemaker_remote with sbd
+  + build: numerous other improvements in environment detection, etc.
+  + cib: fix infinite loop when no schema validates
+  + crmd: cl#5185 - record pending operations in CIB before they are performed
+  + crmd: don't abort transitions for CIB comment changes
+  + crmd: resend shutdown request if DC loses original request
+  + documentation: install improved README in doc instead of now-removed AUTHORS
+  + documentation: clarify licensing and provide copy of all licenses
+  + documentation: document various features and upgrades better
+  + fence_legacy: use "list" action when searching cluster-glue agents
+  + libcib: don't stop sending alerts after releasing DC role
+  + libcrmcommon: properly handle XML comments when comparing v2 patchset diffs
+  + libcrmcommon: report errors consistently when waiting for data on connection
+  + libpengine: avoid potential use-of-NULL
+  + libservices: use DBusError API properly
+  + pacemaker_remote: init script stop should always return 0
+  + pacemaker_remote: allow remote clients to timeout/reconnect
+  + pacemaker_remote: correctly calculate remaining timeout when receiving messages
+  + pengine: avoid transition loop for start-then-stop + unfencing
+  + pengine: correctly update dependent actions of un-runnable clones
+  + pengine: do not fence a node in maintenance mode if it shuts down cleanly
+  + pengine: set OCF_RESKEY_CRM_meta_notify_active_* for multistate resources
+  + resources: ping - avoid temporary files for fping check, support FreeBSD
+  + resources: SysInfo - better support for FreeBSD
+  + resources: variable name typo in docker-wrapper
+  + tools: correct attrd_updater help and error messages when using CMAN
+  + tools: crm_standby --version/--help should work without cluster running
+  + tools: make crm_report sanitize CIB before generating readable version
+  + tools: display pending resource state by default when available
+  + tools: avoid matching other process with same PID in ClusterMon
+
 * Tue Jun 21 2016 Ken Gaillot Pacemaker-1.1.15-1
 - Update source tarball to revision: 32fa6a5
 - Changesets: 533
 - Diff: 219 files changed, 6659 insertions(+), 3989 deletions(-)

 - Features added since Pacemaker-1.1.14
   + Event-driven alerts allow scripts to be called after significant events
   + build: Some files moved from pacemaker package to pacemaker-cli for cleaner pacemaker-remote dependencies
   + build: ./configure --with-configdir argument for /etc/sysconfig, /etc/default, etc.
   + fencing: Simplify watchdog integration
   + fencing: Support concurrent fencing actions via new pcmk_action_limit option
   + remote: pacemaker_remote may be stopped without disabling resource first
   + remote: Report integration status of Pacemaker Remote nodes in CIB node_state
   + tools: crm_mon now reports why resources are not starting
   + tools: crm_report now obscures passwords in logfiles
   + tools: attrd_updater --update-both/--update-delay options allow changing dampening value
   + tools: allow stonith_admin -H '*' to show history for all nodes

 - Changes since Pacemaker-1.1.14
   + Fix multiple memory issues (leaks, use-after-free) in daemons, libraries and tools
   + Make various log messages more user-friendly
   + Improve FreeBSD and Hurd support
   + attrd: Prevent possible segfault on exit
   + cib: Fix regression to restore support for compressed CIB larger than 1MB
   + common: fix regression in 1.1.14 that made have-watchdog always true
   + controld: handle DLM "wait fencing" state better
   + crmd: Fix regression so that fenced unseen nodes do not remain unclean
   + crmd: Take start-delay into account when calculation action timeouts
   + crmd: Avoid timeout on older peers when cancelling a resource operation
   + fencing: Allow fencing by node ID (e.g. by DLM) even if node left cluster
   + lrmd: Fix potential issues when cluster is stopped via systemd shutdown
   + pacemakerd: Properly respawn stonithd if it fails
   + pengine: Fix regression with multiple monitor levels that could ignore failure
   + pengine: Correctly set OCF_RESKEY_CRM_meta_timeout when start-delay is configured
   + pengine: Properly order actions for master/slave resources in anti-colocations
   + pengine: Respect asymmetrical ordering when trying to move resources
   + pengine: Properly order stop actions on guest node relative to host stonith
   + pengine: Correctly block actions dependent on unrunnable clones
   + remote: Allow remote nodes to have node attributes even with legacy attrd
   + remote: Recover from remote node fencing more quickly
   + remote: Place resources on newly rejoined remote nodes more quickly
   + resources: ping agent can now use fping6 for IPv6 hosts
   + resources: SysInfo now resets #health_disk to green when there's sufficient free disk
   + tools: crm_report is now more efficient and handles Pacemaker Remote nodes better
   + tools: Prevent crm_resource segfault when --resource is not supplied with --restart
   + tools: crm_shadow --display option now works
   + tools: crm_resource --restart handles groups, target-roles and moving resources better

 * Thu Jan 14 2016 Ken Gaillot Pacemaker-1.1.14-1
 - Update source tarball to revision: f0b585a
 - Changesets: 724
 - Diff: 179 files changed, 13142 insertions(+), 7695 deletions(-)

 - Features added since Pacemaker-1.1.13
   + crm_resource: Indicate common reasons why a resource may not start after a cleanup
   + crm_resource: New --force-promote and --force-demote options for debugging
   + fencing: Support targeting fencing topologies by node name pattern or node attribute
   + fencing: Remap sequential topology reboots to all-off-then-all-on
   + pengine: Allow resources to start and stop as soon as their state is known on all nodes
   + pengine: Include a list of all and available nodes with clone notifications
   + pengine: Addition of the clone resource clone-min metadata option
   + pengine: Support of multiple-active=block for resource groups
   + remote: Resources that create guest nodes can be included in a group resource
   + remote: reconnect_interval option for remote nodes to delay reconnect after fence

 - Changes since Pacemaker-1.1.13
   + improve support for building on FreeBSD and Debian
   + fix multiple memory issues (leaks, use-after-free, double free, use-of-NULL) in components and tools
   + cib: Do not terminate due to badly behaving clients
   + cman: handle corosync-invented node names of the form Node{id} for peers not in its node list
   + controld: replace bashism
   + crm_node: Display node state with -l and quorum status with -q, if available
   + crmd: resources would sometimes be restarted when only non-unique parameters changed
   + crmd: fence remote node after connection failure only once
   + crmd: handle resources named the same as cluster nodes
   + crmd: Pre-emptively fail in-flight actions when lrmd connections fail
   + crmd: Record actions in the CIB as failed if we cannot execute them
   + crm_report: Enable password sanitizing by default
   + crm_report: Allow log file discovery to be disabled
   + crm_resource: Allow the resource configuration to be modified for --force-{check,start,..} calls
   + crm_resource: Compensate for -C and -p being called with the child resource for clones
   + crm_resource: Correctly clean up all children for anonymous cloned groups
   + crm_resource: Correctly clean up failcounts for inactive anonymous clones
   + crm_resource: Correctly observe --force when deleting and updating attributes
   + crm_shadow: Fix "crm_shadow --diff"
   + crm_simulate: Prevent segfault on arches with 64bit time_t
   + fencing: ensure "required"/"automatic" only apply to "on" actions
   + fencing: Return a provider for the internal fencing agent "#watchdog" instead of logging an error
   + fencing: ignore stderr output of fence agents (often used for debug messages)
   + fencing: fix issue where deleting a fence device attribute can delete the device
   + libcib: potential user input overflow
   + libcluster: overhaul peer cache management
   + log: make syslog less noisy
   + log: fix various misspellings in log messages
   + lrmd: cancel currently pending STONITH op if stonithd connection is lost
   + lrmd: Finalize all pending and recurring operations when cleaning up a resource
   + pengine: Bug cl#5247 - Imply resources running on a container are stopped when the container is stopped
   + pengine: cl#5235 - Prevent graph loops that can be introduced by "load_stopped -> migrate_to" ordering
   + pengine: Correctly bypass fencing for resources that do not require it
   + pengine: do not timeout remote node recurring monitor op failure until after fencing
   + pengine: Ensure recurring monitor operations are cancelled when clone instances are de-allocated
   + pengine: fixes segfault in pengine when fencing remote node
   + pengine: properly handle blocked clone actions
   + pengine: ensure failed actions that occurred in node shutdown are displayed
   + remote: Correctly display the usage of the ocf:pacemaker:remote resource agent
   + remote: do not fail operations because of a migration
   + remote: enable reloads for select remote connection options
   + resources: allow for top output with or without percent sign in HealthCPU
   + resources: Prevent an error message on stopping "Dummy" resource
   + systemd: Prevent segfault when logging failed operations
   + systemd: Reconnect to System DBus if the connection is closed
   + systemd: set systemd resources' timeout values higher than systemd's own default
   + tools: Do not send command lines to syslog
   + tools: update SNMP MIB
   + upstart: Ensure pending structs are correctly unreferenced

 * Wed Jun 24 2015 Andrew Beekhof Pacemaker-1.1.13-1
 - Update source tarball to revision: 2a1847e
 - Changesets: 750
 - Diff: 156 files changed, 11323 insertions(+), 3725 deletions(-)

 - Features added since Pacemaker-1.1.12
   + Allow fail-counts to be removed en-mass when the new attrd is in operation
   + attrd supports private attributes (not written to CIB)
   + crmd: Ensure a watchdog device is in use if stonith-watchdog-timeout is configured
   + crmd: If configured, trigger the watchdog immediately if we lose quorum and no-quorum-policy=suicide
   + crm_diff: Support generating a difference without versions details if --no-version/-u is supplied
   + crm_resource: Implement an intelligent restart capability
   + Fencing: Advertise the watchdog device for fencing operations
   + Fencing: Allow the cluster to recover resources if the watchdog is in use
   + fencing: cl#5134 - Support random fencing delay to avoid double fencing
   + mcp: Allow orphan children to initiate node panic via SIGQUIT
   + mcp: Turn on sbd integration if pacemakerd finds it running
   + mcp: Two new error codes that result in machine reset or power off
   + Officially support the resource-discovery attribute for location constraints
   + PE: Allow natural ordering of colocation sets
   + PE: Support non-actionable degraded mode for OCF
   + pengine: cl#5207 - Display "UNCLEAN" for resources running on unclean offline nodes
   + remote: pcmk remote client tool for use with container wrapper script
   + Support machine panics for some kinds of errors (via sbd if available)
   + tools: add crm_resource --wait option
   + tools: attrd_updater supports --query and --all options
   + tools: attrd_updater: Allow attributes to be set for other nodes

 - Changes since Pacemaker-1.1.12
   + pengine: exclusive discovery implies rsc is only allowed on exclusive subset of nodes
   + acl: Correctly implement the 'reference' acl directive
   + acl: Do not delay evaluation of added nodes in some situations
   + attrd: b22b1fe did uuid test too early
   + attrd: Clean out the node cache when requested by the admin
   + attrd: fixes double free in attrd legacy
   + attrd: properly write attributes for peers once uuid is discovered
   + attrd: refresh should force an immediate write-out of all attributes
   + attrd: Simplify how node deletions happen
   + Bug rhbz#1067544 - Tools: Correctly handle --ban, --move and --locate for master/slave groups
   + Bug rhbz#1181824 - Ensure the DC can be reliably fenced
   + cib: Ability to upgrade cib validation schema in legacy mode
   + cib: Always generate digests for cib diffs in legacy mode
   + cib: assignment where comparison intended
   + cib: Avoid nodeid conflicts we don't care about
   + cib: Correctly add "update-origin", "update-client" and "update-user" attributes for cib
   + cib: Correctly set up signal handlers
   + cib: Correctly track node state
   + cib: Do not update on disk backups if we're just querying them
   + cib: Enable cib legacy mode for plugin-based clusters
   + cib: Ensure file-based backends treat '-o section' consistently with the native backend
   + cib: Ensure upgrade operations from a non-DC get an acknowledgement
   + cib: No need to enforce cib digests for v2 diffs in legacy mode
   + cib: Revert d153b86 to instantly get cib synchronized in legacy mode
   + cib: tls sock cleanup for remote cib connections
   + cli: Ensure subsequent unknown long options are correctly detected
   + cluster: Invoke crm_remove_conflicting_peer() only when the new node's uname is being assigned in the node cache
   + common: Increment current and age for lib common as a result of APIs being added
   + corosync: Bug cl#5232 - Somewhat gracefully handle nodes with invalid UUIDs
   + corosync: Avoid unnecessary repeated CMAP API calls
   + crmd/pengine: handle on-fail=ignore properly
   + crmd: Add "on_node" attribute for *_last_failure_0 lrm resource operations
   + crmd: All peers need to track node shutdown requests
   + crmd: Cached copies of transient attributes cease to be valid once a node leaves the membership
   + crmd: Correctly add the local option that validates against schema for pengine to calculate
   + crmd: Disable debug logging that results in significant overhead
   + crmd: do not remove connection resources during re-probe
   + crmd: don't update fail count twice for same failure
   + crmd: Ensure remote connection resources timeout properly during 'migrate_from' action
   + crmd: Ensure throttle_mode() does something on Linux
   + crmd: Fixes crash when remote connection migration fails
   + crmd: gracefully handle remote node disconnects during op execution
   + crmd: Handle remote connection failures while executing ops on remote connection
   + crmd: include remote nodes when forcing cluster wide resource reprobe
   + crmd: never stop recurring monitor ops for pcmk remote during incomplete migration
   + crmd: Prevent the old version of DC from being fenced when it shuts down for rolling-upgrade
   + crmd: Prevent use-of-NULL during reprobe
   + crmd: properly update job limit for baremetal remote-nodes
   + crmd: Remote-node throttle jobs count towards cluster-node hosting conneciton rsc
   + crmd: Reset stonith failcount to recover transitioner when the node rejoins
   + crmd: resolves memory leak in crmd.
   + crmd: respect start-failure-is-fatal even for artifically injected events
   + crmd: Wait for all pending operations to complete before poking the policy engine
   + crmd: When container's host is fenced, cancel in-flight operations
   + crm_attribute: Correctly update config options when -o crm_config is specified
   + crm_failcount: Better error reporting when no resource is specified
   + crm_mon: add exit reason to resource failure output
   + crm_mon: Fill CRM_notify_node in traps with node's uname rather than node's id if possible
   + crm_mon: Repair notification delivery when the v2 patch format is in use
   + crm_node: Correctly remove nodes from the CIB by nodeid
   + crm_report: More patterns for finding logs on non-DC nodes
   + crm_resource: Allow resource restart operations to be node specific
   + crm_resource: avoid deletion of lrm cache on node with resource discovery disabled.
+ crm_resource: Calculate how long to wait for a restart based on the resource timeouts + crm_resource: Clean up memory in --restart error paths + crm_resource: Display the locations of all anonymous clone children when supplying the children's common ID + crm_resource: Ensure --restart sets/clears meta attributes + crm_resource: Ensure fail-counts are purged when we redetect the state of all resources + crm_resource: Implement --timeout for resource restart operations + crm_resource: Include group members when calculating the next timeout + crm_resource: Memory leak in error paths + crm_resource: Prevent use-after-free + crm_resource: Repair regression test outputs + crm_resource: Use-after-free when restarting a resource + dbus: ref count leaks + dbus: Ensure both the read and write queues get dispatched + dbus: Fail gracefully if malloc fails + dbus: handle dispatch queue when multiple replies need to be processed + dbus: Notice when dbus connections get disabled + dbus: Remove double-free introduced while trying to make coverity shut up + ensure if B is colocated with A, B can never run without A + fence_legacy: Avoid passing 'port' to cluster-glue agents + fencing: Allow nodes to be purged from the member cache + fencing: Correctly make args for fencing agents + fencing: Correctly wait for self-fencing to occur when the watchdog is in use + fencing: Ensure the hostlist parameter is set for watchdog agents + fencing: Force 'stonith-ng' as the system name + fencing: Gracefully handle invalid metadata from agents + fencing: If configured, wait stonith-watchdog-timer seconds for self-fencing to complete + fencing: Reject actions for devices that haven't been explicitly registered yet + ipc: properly allocate server enforced buffer size on client + ipc: use server enforced buffer during ipc client send + lrmd, services: interpret LSB status codes properly + lrmd: add back support for class heartbeat agents + lrmd: cancel pending async connection during disconnect + 
lrmd: enable ipc proxy for docker-wrapper privileged mode + lrmd: fix rescheduling of systemd monitor op during start + lrmd: Handle systemd reporting 'done' before a resource is actually stopped + lrmd: Hint to child processes that using sd_notify is not required + lrmd: Log with the correct personality + lrmd: Prevent glib assert triggered by timers being removed from mainloop more than once + lrmd: report original timeout when systemd operation completes + lrmd: store failed operation exit reason in cib + mainloop: resolves race condition mainloop poll involving modification of ipc connections + make targetted reprobe for remote node work, crm_resource -C -N + mcp: Allow a configurable delay when debugging shutdown issues + mcp: Avoid requiring 'export' for SYS-V sysconfig options + Membership: Detect and resolve nodes that change their ID + pacemakerd: resolves memory leak of xml structure in pacemakerd + pengine: ability to launch resources in isolated containers + pengine: add #kind=remote for baremetal remote-nodes + pengine: allow baremetal remote-nodes to recover without requiring fencing when cluster-node fails + pengine: allow remote-nodes to be placed in maintenance mode + pengine: Avoid trailing whitespaces when printing resource state + pengine: cl#5130 - Choose nodes capable of running all the colocated utilization resources + pengine: cl#5130 - Only check the capacities of the nodes that are allowed to run the resource + pengine: Correctly compare feature set to determine how to unpack meta attributes + pengine: disable migrations for resources with isolation containers + pengine: disable reloading of resources within isolated container wrappers + pengine: Do not aggregate children in a pending state into the started/stopped/etc lists + pengine: Do not record duplicate copies of the failed actions + pengine: Do not reschedule monitors that are no longer needed while resource definitions have changed + pengine: Fence baremetal remote when recurring 
monitor op fails + pengine: Fix colocation with unmanaged resources + pengine: Fix the behaviors of multi-state resources with asymmetrical ordering + pengine: fixes pengine crash with orphaned remote node connection resource + pengine: fixes segfault caused by malformed log warning + pengine: handle cloned isolated resources in a sane way + pengine: handle isolated resource scenario, cloned group of isolated resources + pengine: Handle ordering between stateful and migratable resources + pengine: imply stop in container node resources when host node is fenced + pengine: only fence baremetal remote when connection can fails or can not be recovered + pengine: only kill process group on timeout when on-fail does not equal block. + pengine: per-node control over resource discovery + pengine: prefer migration target for remote node connections + pengine: prevent disabling rsc discovery per node in certain situations + pengine: Prevent use-after-free in sort_rsc_process_order() + pengine: properly handle ordering during remote connection partial migration + pengine: properly recover remote-nodes when cluster-node proxy goes offline + pengine: remove unnecessary whitespace from notify environment variables + pengine: require-all feature for ordered clones + pengine: Resolve memory leaks + pengine: resource discovery mode for location constraints + pengine: restart master instances on instance attribute changes + pengine: Turn off legacy unpacking of resource options into the meta hashtable + pengine: Watchdog integration is sufficient for fencing + Perform systemd reloads asynchronously + ping: Correctly advertise multiplier default + Prefer to inherit the watchdog timeout from SBD + properly record stop args after reload + provide fake meta data for ra class heartbeat + remote: report timestamps for remote connection resource operations + remote: Treat recv msg timeout as a disconnect + service: Prevent potential use-of-NULL in metadata lookups + solaris: Allow 
compilation when dirent.d_type is not available + solaris: Correctly replace the linux swab functions + solaris: Disable throttling since /proc doesn't exist + stonith-ng: Correctly observe the watchdog completion timeout + stonith-ng: Correctly track node state + stonith-ng: Reset mainloop source IDs after removing them + systemd: Correctly handle long running stop actions + systemd: Ensure failed monitor operations always return + systemd: Ensure we don't call dbus_message_unref() with NULL + systemd: fix crash caused when canceling in-flight operation + systemd: Kindly ask dbus NOT to kill the process if the dbus connection fails + systemd: Perform actions asynchronously + systemd: Perform monitor operations without blocking + systemd: Tell systemd not to take DBus down from underneath us + systemd: Trick systemd into not stopping our services before us during shutdown + tools: Improve crm_mon output with certain option combinations + upstart: Monitor actions always return 'ok' or 'not running' + upstart: Perform more parts of monitor operations without blocking + xml: add 'require-all' to xml schema for constraints + xml: cl#5231 - Unset the deleted attributes in the resulting diffs + xml: Clone the latest constraint schema in preparation for changes" + xml: Correctly create v1 patchsets when deleting attributes + xml: Do not change the ordering of properties when applying v1 cib diffs + xml: Do not dump deleted attributes + xml: Do not prune leaves from v1 cib diffs that are being created with digests + xml: Ensure ACLs are reapplied before calculating what a replace operation changed + xml: Fix upgrade-1.3.xsl to correctly transform ACL rules with "attribute" + xml: Prevent assert errors in crm_element_value() on applying a patch without version information + xml: Prevent potential use-of-NULL * Tue Jul 22 2014 Andrew Beekhof Pacemaker-1.1.12-1 - Update source tarball to revision: 93a037d - Changesets: 795 - Diff: 195 files changed, 13772 insertions(+), 6176 
deletions(-) - Features added since Pacemaker-1.1.11 + Changes to the ACL schema to support nodes and unix groups + cib: Check ACLs prior to making the update instead of parsing the diff afterwards + cib: Default ACL support to on + cib: Enable the more efficient xml patchset format + cib: Implement zero-copy status update + cib: Send all r/w operations via the cluster connection and have all nodes process them + crmd: Set "cluster-name" property to corosync's "cluster_name" by default for corosync-2 + crm_mon: Display brief output if "-b/--brief" is supplied or 'b' is toggled + crm_report: Allow ssh alternatives to be used + crm_ticket: Support multiple modifications for a ticket in an atomic operation + extra: Add logrotate configuration file for /var/log/pacemaker.log + Fencing: Add the ability to call stonith_api_time() from stonith_admin + logging: daemons always get a log file, unless explicitly set to configured 'none' + logging: allows the user to specify a log level that is output to syslog + PE: Automatically re-unfence a node if the fencing device definition changes + pengine: cl#5174 - Allow resource sets and templates for location constraints + pengine: Support cib object tags + pengine: Support cluster-specific instance attributes based on rules + pengine: Support id-ref in nvpair with optional "name" + pengine: Support per-resource maintenance mode + pengine: Support site-specific instance attributes based on rules + tools: Allow crm_shadow to create older configuration versions + tools: Display pending state in crm_mon/crm_resource/crm_simulate if --pending/-j is supplied (cl#5178) + xml: Add the ability to have lightweight schema revisions + xml: Enable resource sets in location constraints for 1.2 schema + xml: Support resources that require unfencing - Changes since Pacemaker-1.1.11 + acl: Authenticate pacemaker-remote requests with the node name as the client + acl: Read access must be explicitly granted + attrd: Ensure attribute dampening is 
always observed + attrd: Remove offline nodes from node cache for "peer-remove" requests + Bug cl#5055 - Improved migration support. + Bug cl#5184 - Ensure pending probes that ultimately fail are correctly updated + Bug cl#5196 - pengine: Check values after expanding templates + Bug cl#5212 - Do not promote instances when quorum is lots and no-quorum-policy=freeze + Bug cl#5213 - Ensure role colocation with -INFINITY is enforced + Bug cl#5213 - Limit the scope of the previous commit to the masters role + Bug cl#5219 - pengine: Allow unrelated resources with a common colocation target to remain promoted + Bug cl#5222 - cib: Repair rolling update capability + Bug cl#5222 - Enable legacy mode whenever a broadcast update is detected + Bug rhbz#1036631 - Stop members of cloned groups when dependencies are stopped + Bug rhbz#1054307 - cname pattern match should be more restrictive in init script + Bug rhbz#1057697 - Use native DBus library for systemd/upstart support to avoid problematic use of threads + Bug rhbz#1097457 - Limit the scope of the previous fix and include a helpful comment + Bug rhbz#1097457 - Prevent invalid transition when resource are ordered to start after the container they're started in + cib: allow setting permanent remote-node attributes + cib: Auto-detect which patchset format to use + cib: Determine the best value of validate-with if one is not supplied + cib: Do not disable cib disk writes if on-disk cib is corrupt + cib: Ensure 'cibadmin -R/--replace' commands get replies + cib: Erasing the cib is an admin action, bump the admin_epoch instead + cib: Fix remote cib based on TLS + cib: Ignore patch failures if we already have their contents + cib: Validate that everyone still sees the same configuration once all updates have completed + cibadmin: Allow priviliged clients to perform tasks as unpriviliged users + cibadmin: Remove dangerous commands that exposed unnecessary implementation internal details + cluster: Fix segfault on removing a node + 
cluster: Prevent search of unames from attempting to create node entries for unknown nodes + cluster: Remove unknown offline nodes with conflicting unames from node cache + controld: Do not consider the dlm up until the address list is present + controld: handling startup fencing within the controld agent, not the dlm + controld: Return OCF_ERR_INSTALLED instead of OCF_NOT_INSTALLED + crmd: Ack pending operations that were cancelled due to rsc deletion + crmd: Actions can only be executed if their pre-requisits completed successfully + crmd: avoid double free caused by nested hash table removal + crmd: Avoid spamming the cib by triggering a transition only once per non-status change + crmd: Correctly react to successful unfencing operations + crmd: Correctly recognise operation cancellations we initiated + crmd: Do not erase the status section for unfenced nodes + crmd: Do not overwrite existing node state when fencing completes + crmd: Do not start timers for already completed operations + crmd: Ensure crm_config options are re-read on updates + crmd: Fenced nodes that return prior to an election do not need to have their status section reset + crmd: make lrm_state hash table not case sensitive + crmd: make node_state erase correctly + crmd: Only write fence_averride if open() returns a positive file descriptor + crmd: Prevent manual fencing confirmations from attempting to create node entries for unknown nodes + crmd: Prevent SIGPIPE when notifying CMAN about fencing operations + crmd: Remove state of unknown nodes with conflicting unames from CIB + crmd: Remove unknown nodes with conflicting unames from CIB + crmd: Report unsuccessful unfencing operations + crm_diff: Allow the generation of xml patchsets without digests + crm_mon: Allow the file created by --as-html to be world readable + crm_mon: Ensure resource attributes have been unpacked before displaying connectivity data + crm_node: Only remove the named resource from the cib + crm_report: Gracefully 
handle rediculously large logfiles + crm_report: Only gather dlm data if dlm_controld is running + crm_resource: Gracefully handle -EACCESS when querying the cib + crm_verify: Perform a full set of calculations whenever the status section is present + fencing: Advertise support for reboot/on/off in the metadata for legacy agents + fencing: Automatically switch from 'list' to 'status' to 'static-list' if those actions are not advertised in the metadata + fencing: Cache metadata lookups to avoid repeated blocking during device registration + fencing: Correctly record which peer performed the fencing operation + fencing: default to 'off' when agent does not advertise 'reboot' in metadata + fencing: Do not unregister/register all stonith devices on every resource agent change + fencing: Execute all required fencing devices regardless of what topology level they are at + fencing: Fence using all required devices + fencing: Pass the correct options when looking up the history by node name + fencing: Update stonith device list only if stonith is enabled + get_cluster_type: failing concurrent tool invocations on heartbeat + ignore SIGPIPE when gnutls is in use + iso8601: Different logic is needed when logging and calculating durations + iso8601: Fix memory leak in duration calculation + Logging: Bootstrap daemon logging before processing arguments but configure it afterwards + lrmd: Cancel recurring operations before stop action is executed + lrmd: Expose logging variables expected by OCF agents + lrmd: Handle systemd reporting 'done' before a resource is actually stopped/started + lrmd: Merge duplicate recurring monitor operations + lrmd: Prevent OCF agents from logging to random files due to "value" of setenv() being NULL + lrmd: Provide stderr output from agents if available, otherwise fall back to stdout + mainloop: Better handle the killing of processes in the act of exiting + mainloop: Canceling in-flight operations should not fail if child process has already 
exited. + mainloop: Fixes use after free in process monitor code + mcp: Tell systemd not to respawn us if we exit with rc=100 + membership: Avoid duplicate peer entries in the peer cache + pengine: Allow container nodes to migrate with connection resource + pengine: avoid assert by searching for stop action on correct node during LogActions + pengine: Block restart of resources if any dependent resource in a group is unmanaged + pengine: cl#5186 - Avoid running rsc on two nodes when node is fenced during migration + pengine: cl#5187 - Prevent resources in an anti-colocation from even temporarily running on a same node + pengine: cl#5200 - Before migrating utilization-using resources to a node, take off the load that will no longer run there if it's not introducing transition loop + pengine: Correctly handle origin offsets in the future + pengine: Correctly observe requires=nothing + pengine: Default sequential to TRUE for resource sets for consistency with colocation sets + pengine: Delay unfencing until after we know the state of all resources that require unfencing + pengine: Do not initiate fencing for unclean nodes when fencing is disabled + pengine: Ensure instance numbers are preserved for cloned templates + pengine: Ensure unfencing only happens once, even if the transition is interrupted + pengine: Fencing devices default to only requiring quorum in order to start + pengine: fixes invalid transition caused by clones with more than 10 instances + pengine: Force record pending for migrate_to actions + pengine: handles edge case where container order constraints are not honored during migration + pengine: Ignore failure-timeout only if the failed operation has on-fail="block" + pengine: Mark unrunnable stop actions as "blocked" and show the correct current locations + pengine: Memory leaks + pengine: properly handle fencing of container remote-nodes when the container is orphaned + pengine: properly place resource within a container when container is a 
remote-node. + pengine: Unfencing is based on device probes, there is no need to unfence when normal resources are found active + pengine: Use "#cluster-name" in rules for setting cluster-specific instance attributes + pengine: Use "#site-name" in rules for setting site-specific instance attributes + remote: Allow baremetal remote-node connection resources to migrate + remote: clear remote-node status correctly + remote: Enable migration support for baremetal connection resources by default + remote: Handle request/response ipc proxy correctly + services: Correctly reset the nice value for lrmd's children + services: Do not allow duplicate recurring op entries + services: Do not block synced service executions + services: Fixes segfault associated with cancelling in-flight recurring operations. + services: Remove cancelled recurring ops from internal lists as early as possible + services: Remove file descriptors from mainloop as soon as we have drained them + services: Reset the scheduling policy and priority for lrmd's children without relying on SCHED_RESET_ON_FORK + services_action_cancel: Interpret return code from mainloop_child_kill() correctly + stonith_admin: Ensure pointers passed to sscanf() are properly initialized + stonith_api_time_helper now returns when the most recent fencing operation completed + systemd: Prevent use-of-NULL when determining if an agent exists + systemd: Try to handle dbus actions that complete prior to configuring a callback + Tools: Non-daemons shouldn't abort just because xml parsing failed + Upstart: Allow compilation with glib versions older than 2.28 + Upstart: Do not attempt upstart jobs if we cannot connect to dbus + When data was old, it fixed so that the newest cib might not be acquired. 
+ xml: Check all available schemas when doing upgrades + xml: Correctly determine the lowest allowed schema version + xml: Correctly enforce ACLs after a replace operation + xml: Correctly infer attribute changes after a replace operation + xml: Create the correct diff when only part of a document is changed + xml: Detect attribute ordering changes + xml: Detect content that is added and removed in the same update + xml: Do not prune meaningful leaves from v1 patchsets + xml: Empty patchsets are considered to have applied cleanly + xml: Ensure patches always have version details set + xml: Find the minimal set of changes when part of a document is replaced + xml: If validate-with is missing, we find the most recent schema that accepts it and go from there + xml: Introduce a 'move' primitive for v2 patch sets + xml: Preserve the attribute order in the patch for subsequent digest validation + xml: Resolve memory leak when logging xml blobs + xml: Update xml validation to allow '' * Thu Feb 13 2014 David Vossel Pacemaker-1.1.11-1 - Update source tarball to revision: 33f9d09 - Changesets: 462 - Diff: 147 files changed, 6810 insertions(+), 4057 deletions(-) - Features added since Pacemaker-1.1.10 + attrd: A truly atomic version of attrd for use where CPG is used for cluster communication + cib: Allow values to be added/updated and removed in a single update + cib: Support XML comments in diffs + Core: Allow blackbox logging to be disabled with SIGUSR2 + crmd: Do not block on proxied calls from pacemaker_remoted + crmd: Enable cluster-wide throttling when the cib heavily exceeds its target load + crmd: Make the per-node action limit directly configurable in the CIB + crmd: Slow down recovery on nodes with IO load + crmd: Track CPU usage on cluster nodes and slow down recovery on nodes with high CPU/IO load + crm_mon: add --hide-headers option to hide all headers + crm_node: Display partition output in sorted order + crm_report: Collect logs directly from journald if 
available + Fencing: On timeout, clean up the agent's entire process group + Fencing: Support agents that need the host to be unfenced at startup + ipc: Raise the default buffer size to 128k + PE: Add a special attribute for distinguishing between real nodes and containers in constraint rules + PE: Allow location constraints to take a regex pattern to match against resource IDs + pengine: Distinguish between the agent being missing and something the agent needs being missing + remote: Properly version the remote connection protocol - Changes since Pacemaker-1.1.10 + Bug rhbz#1011618 - Consistently use 'Slave' as the role for unpromoted master/slave resources + Bug rhbz#1057697 - Use native DBus library for systemd and upstart support to avoid problematic use of threads + attrd: Any variable called 'cluster' makes the daemon crash before reaching main() + attrd: Avoid infinite write loop for unknown peers + attrd: Drop all attributes for peers that left the cluster + attrd: Give remote-nodes ability to set attributes with attrd + attrd: Prevent inflation of attribute dampen intervals + attrd: Support SI units for attribute dampening + Bug cl#5171 - pengine: Don't prevent clones from running due to dependent resources + Bug cl#5179 - Corosync: Attempt to retrieve a peer's node name if it is not already known + Bug cl#5181 - corosync: Ensure node IDs are written to the CIB as unsigned integers + Bug rhbz#902407 - crm_resource: Handle --ban for master/slave resources as advertised + cib: Correctly check for archived configuration files + cib: Correctly log short-form xml diffs + cib: Fix remote cib based on TLS + cibadmin: Report errors during sign-off + cli: Do not enable blackbox for cli tools + cluster: Fix segfault on removing a node + cman: Do not start pacemaker if cman startup fails + cman: Start clvmd and friends from the init script if enabled + Command-line tools should stop after an assertion failure + controld: Use the correct variant of dlm_controld for 
corosync-2 clusters + cpg: Correctly set the group name length + cpg: Ensure the CPG group is always null-terminated + cpg: Only process one message at a time to allow other priority jobs to be performed + crmd: Correctly observe the configured batch-limit + crmd: Correctly update expected state when the previous DC shuts down + crmd: Correctly update the history cache when recurring ops change their return code + crmd: Don't add node_state to cib, if we have not seen or fenced this node yet + crmd: don't segfault on shutdown when using heartbeat + crmd: Prevent recurring monitors being cancelled due to notify operations + crmd: Reliably detect and act on reprobe operations from the policy engine + crmd: When a peer expectedly shuts down, record the new join and expected states into the cib + crmd: When the DC gracefully shuts down, record the new expected state into the cib + crm_attribute: Do not swallow hostname lookup failures + crm_mon: Do not display duplicates of failed actions + crm_mon: Reduce flickering in interactive mode + crm_resource: Observe --master modifier for --move + crm_resource: Provide a meaningful error if --master is used for primitives and groups + fencing: Allow fencing for node after topology entries are deleted + fencing: Apply correct score to the resource of group + fencing: Ignore changes to non-fencing resources + fencing: Observe pcmk_host_list during automatic unfencing + fencing: Put all fencing agent processes into their own process group + fencing: Wait until all possible replies are received before continuing with unverified devices + ipc: Compress msgs based on client's actual max send size + ipc: Have the ipc server enforce a minimum buffer size all clients must use. 
+ iso8601: Prevent dates from jumping backwards a day in some timezones + lrmd: Correctly calculate metadata for the 'service' class + lrmd: Correctly cancel monitor actions for lsb/systemd/service resources on cleaning up + mcp: Remove LSB hints that instruct chkconfig to start pacemaker at boot time + mcp: Some distros complain when LSB scripts do not include Default-Start/Stop directives + pengine: Allow fencing of baremetal remote nodes + pengine: cl#5186 - Avoid running rsc on two nodes when node is fenced during migration + pengine: Correctly account for the location preferences of things colocated with a group + pengine: Correctly handle demotion of grouped masters that are partially demoted + pengine: Disable container node probes due to constraint conflicts + pengine: Do not allow colocation with blocked clone instances + pengine: Do not re-allocate clone instances that are blocked in the Stopped state + pengine: Do not restart resources that depend on unmanaged resources + pengine: Force record pending for migrate_to actions + pengine: Location constraints with role=Started should prevent masters from running at all + pengine: Order demote/promote of resources on remote nodes to happen only once the connection is up + pengine: Properly handle orphaned multistate resources living on remote-nodes + pengine: Properly shutdown orphaned remote connection resources + pengine: Recover unexpectedly running container nodes. + remote: Add support for ipv6 into pacemaker_remote daemon + remote: Handle endian changes between client and server and improve forward compatibility + services: Fixes segfault associated with cancelling in-flight recurring operations. 
+ services: Reset the scheduling policy and priority for lrmd's children without relying on SCHED_RESET_ON_FORK * Fri Jul 26 2013 Andrew Beekhof Pacemaker-1.1.10-1 - Update source tarball to revision: ab2e209 - Changesets: 602 - Diff: 143 files changed, 8162 insertions(+), 5159 deletions(-) - Features added since Pacemaker-1.1.9 + Core: Convert all exit codes to positive errno values + crm_error: Add the ability to list and print error symbols + crm_resource: Allow individual resources to be reprobed + crm_resource: Allow options to be set recursively + crm_resource: Implement --ban for moving resources away from nodes and --clear (replaces --unmove) + crm_resource: Support OCF tracing when using --force-(check|start|stop) + PE: Allow active nodes in our current membership to be fenced without quorum + PE: Suppress meaningless IDs when displaying anonymous clone status + Turn off auto-respawning of systemd services when the cluster starts them + Bug cl#5128 - pengine: Support maintenance mode for a single node - Changes since Pacemaker-1.1.9 + crmd: cib: stonithd: Memory leaks resolved and improved use of glib reference counting + attrd: Fixes deleted attributes during dc election + Bug cf#5153 - Correctly display clone failcounts in crm_mon + Bug cl#5133 - pengine: Correctly observe on-fail=block for failed demote operation + Bug cl#5148 - legacy: Correctly remove a node that used to have a different nodeid + Bug cl#5151 - Ensure node names are consistently compared without case + Bug cl#5152 - crmd: Correctly clean up fenced nodes during membership changes + Bug cl#5154 - Do not expire failures when on-fail=block is present + Bug cl#5155 - pengine: Block the stop of resources if any depending resource is unmanaged + Bug cl#5157 - Allow migration in the absence of some colocation constraints + Bug cl#5161 - crmd: Prevent memory leak in operation cache + Bug cl#5164 - crmd: Fixes crash when using pacemaker-remote + Bug cl#5164 - pengine: Fixes segfault when 
calculating transition with remote-nodes. + Bug cl#5167 - crm_mon: Only print "stopped" node list for incomplete clone sets + Bug cl#5168 - Prevent clones from being bounced around the cluster due to location constraints + Bug cl#5170 - Correctly support on-fail=block for clones + cib: Correctly read back archived configurations if the primary is corrupted + cib: The result is not valid when diffs fail to apply cleanly for CLI tools + cib: Restore the ability to embed comments in the configuration + cluster: Detect and warn about node names with capitals + cman: Do not pretend we know the state of nodes we've never seen + cman: Do not unconditionally start cman if it is already running + cman: Support non-blocking CPG calls + Core: Ensure the blackbox is saved on abnormal program termination + corosync: Detect the loss of members for which we only know the nodeid + corosync: Do not pretend we know the state of nodes we've never seen + corosync: Ensure removed peers are erased from all caches + corosync: Nodes that can persist in sending CPG messages must be alive after all + crmd: Do not get stuck in S_POLICY_ENGINE if a node we couldn't fence returns + crmd: Do not update fail-count and last-failure for old failures + crmd: Ensure all membership operations can complete while trying to cancel a transition + crmd: Ensure operations for cleaned up resources don't block recovery + crmd: Ensure we return to a stable state if there have been too many fencing failures + crmd: Initiate node shutdown if another node claims to have successfully fenced us + crmd: Prevent messages for remote crmd clients from being relayed to wrong daemons + crmd: Properly handle recurring monitor operations for remote-node agent + crmd: Store last-run and last-rc-change for all operations + crm_mon: Ensure stale pid files are updated when a new process is started + crm_report: Correctly collect logs when 'uname -n' reports fully qualified names + fencing: Fail the operation once all peers 
have been exhausted + fencing: Restore the ability to manually confirm that fencing completed + ipc: Allow unprivileged clients to clean up after server failures + ipc: Restore the ability for members of the haclient group to connect to the cluster + legacy: Support "crm_node --remove" with a node name for corosync plugin (bnc#805278) + lrmd: Default to the upstream location for resource agent scratch directory + lrmd: Pass errors from lsb metadata generation back to the caller + pengine: Correctly handle resources that recover before we operate on them + pengine: Delete the old resource state on every node whenever the resource type is changed + pengine: Detect constraints with inappropriate actions (ie. promote for a clone) + pengine: Ensure per-node resource parameters are used during probes + pengine: If fencing is unavailable or disabled, block further recovery for resources that fail to stop + pengine: Implement the rest of get_timet_now() and rename to get_effective_time + pengine: Re-initiate _active_ recurring monitors that previously failed but have timed out + remote: Workaround for inconsistent tls handshake behavior between gnutls versions + systemd: Ensure we get shut down correctly by systemd + systemd: Reload systemd after adding/removing override files for cluster services + xml: Check for and replace non-printing characters with their octal equivalent while exporting xml text + xml: Prevent lockups by setting a more reliable buffer allocation strategy * Fri Mar 08 2013 Andrew Beekhof Pacemaker-1.1.9-1 - Update source tarball to revision: 7e42d77 - Statistics: Changesets: 731 Diff: 1301 files changed, 92909 insertions(+), 57455 deletions(-) - Features added in Pacemaker-1.1.9 + corosync: Allow cman and corosync 2.0 nodes to use a name other than uname() + corosync: Use queues to avoid blocking when sending CPG messages + ipc: Compress messages that exceed the configured IPC message limit + ipc: Use queues to prevent slow clients from blocking the 
server + ipc: Use shared memory by default + lrmd: Support nagios remote monitoring + lrmd: Pacemaker Remote Daemon for extending pacemaker functionality outside corosync cluster. + pengine: Check for master/slave resources that are not OCF agents + pengine: Support a 'requires' resource meta-attribute for controlling whether it needs quorum, fencing or nothing + pengine: Support for resource container + pengine: Support resources that require unfencing before start - Changes since Pacemaker-1.1.8 + attrd: Correctly handle deletion of non-existent attributes + Bug cl#5135 - Improved detection of the active cluster type + Bug rhbz#913093 - Use crm_node instead of uname + cib: Avoid use-after-free by correctly supporting cib_no_children for non-xpath queries + cib: Correctly process XML diffs involving element removal + cib: Performance improvements for non-DC nodes + cib: Prevent error message by correctly handling peer replies + cib: Prevent ordering changes when applying xml diffs + cib: Remove text nodes from cib replace operations + cluster: Detect node name collisions in corosync + cluster: Preserve corosync membership state when matching node name/id entries + cman: Force fenced to terminate on shutdown + cman: Ignore qdisk 'nodes' + core: Drop per-user core directories + corosync: Avoid errors when closing failed connections + corosync: Ensure peer state is preserved when matching names to nodeids + corosync: Clean up CMAP connections after querying node name + corosync: Correctly detect corosync 2.0 clusters even if we don't have permission to access it + crmd: Bug cl#5144 - Do not update the expected status of failed nodes + crmd: Correctly determine if cluster disconnection was abnormal + crmd: Correctly relay messages for remote clients (bnc#805626, bnc#804704) + crmd: Correctly stall the FSA when waiting for additional inputs + crmd: Detect and recover when we are evicted from CPG + crmd: Differentiate between a node that is up and coming up in 
peer_update_callback() + crmd: Have cib operation timeouts scale with node count + crmd: Improved continue/wait logic in do_dc_join_finalize() + crmd: Prevent election storms caused by getrusage() values being too close + crmd: Prevent timeouts when performing pacemaker level membership negotiation + crmd: Prevent use-after-free of fsa_message_queue during exit + crmd: Store all current actions when stalling the FSA + crm_mon: Do not try to render a blank cib and indicate the previous output is now stale + crm_mon: Fixes crm_mon crash when using snmp traps. + crm_mon: Look for the correct error codes when applying configuration updates + crm_report: Ensure policy engine logs are found + crm_report: Fix node list detection + crm_resource: Have crm_resource generate a valid transition key when sending resource commands to the crmd + date/time: Bug cl#5118 - Correctly convert seconds-since-epoch to the current time + fencing: Attempt to provide more information than just 'generic error' for failed actions + fencing: Correctly record completed but previously unknown fencing operations + fencing: Correctly terminate when all device options have been exhausted + fencing: cov#739453 - String not null terminated + fencing: Do not merge new fencing requests with stale ones from dead nodes + fencing: Do not start fencing until entire device topology is found or query results timeout. + fencing: Do not wait for the query timeout if all replies have arrived + fencing: Fix passing of parameters from CMAN containing '=' + fencing: Fix non-comparison when sorting devices by priority + fencing: On failure, only try a topology device once from the remote level. + fencing: Only try peers for non-topology based operations once + fencing: Retry stonith device for duration of action's timeout period. 
+ heartbeat: Remove incorrect assert during cluster connect + ipc: Bug cl#5110 - Prevent 100% CPU usage when looking for synchronous replies + ipc: Use 50k as the default compression threshold + legacy: Prevent assertion failure on routing ais messages (bnc#805626) + legacy: Re-enable logging from the pacemaker plugin + legacy: Relax the 'active' check for plugin based clusters to avoid false negatives + legacy: Skip peer process check if the process list is empty in crm_is_corosync_peer_active() + mcp: Only define HA_DEBUGLOG to avoid agent calls to ocf_log printing everything twice + mcp: Re-attach to existing pacemaker components when mcp fails + pengine: Any location constraint for the slave role applies to all roles + pengine: Avoid leaking memory when cleaning up failcounts and using containers + pengine: Bug cl#5101 - Ensure stop order is preserved for partially active groups + pengine: Bug cl#5140 - Allow set members to be stopped when the subsequent set has require-all=false + pengine: Bug cl#5143 - Prevent shuffling of anonymous master/slave instances + pengine: Bug rhbz#880249 - Ensure orphan masters are demoted before being stopped + pengine: Bug rhbz#880249 - Teach the PE how to recover masters into primitives + pengine: cl#5025 - Automatically clear failcount for start/monitor failures after resource parameters change + pengine: cl#5099 - Probe operation uses the timeout value from the minimum interval monitor by default (#bnc776386) + pengine: cl#5111 - When clone/master child rsc has on-fail=stop, ensure all children stop on failure. 
+ pengine: cl#5142 - Do not delete orphaned children of an anonymous clone + pengine: Correctly unpack active anonymous clones + pengine: Ensure previous migrations are closed out before attempting another one + pengine: Introducing the whitebox container resources feature + pengine: Prevent double-free for cloned primitive from template + pengine: Process rsc_ticket dependencies earlier for correctly allocating resources (bnc#802307) + pengine: Remove special cases for fencing resources + pengine: rhbz#902459 - Remove rsc node status for orphan resources + systemd: Gracefully handle unexpected DBus return types + Replace the use of the insecure mktemp(3) with mkstemp(3) * Thu Sep 20 2012 Andrew Beekhof Pacemaker-1.1.8-1 - Update source tarball to revision: 1a5341f - Statistics: Changesets: 1019 Diff: 2107 files changed, 117258 insertions(+), 73606 deletions(-) - All APIs have been cleaned up and reduced to essentials - Pacemaker now includes a replacement lrmd that supports systemd and upstart agents - Config and state files (cib.xml, PE inputs and core files) have moved to new locations - The crm shell has become a separate project and no longer included with Pacemaker - All daemons/tools now have a unified set of error codes based on errno.h (see crm_error) - Changes since Pacemaker-1.1.7 + Core: Bug cl#5032 - Rewrite the iso8601 date handling code + Core: Correctly extract the version details from a diff + Core: Log blackbox contents, if enabled, when an error occurs + Core: Only LOG_NOTICE and higher are sent to syslog + Core: Replace use of IPC from clplumbing with IPC from libqb + Core: SIGUSR1 now enables blackbox logging, SIGTRAP to write out + Core: Support a blackbox for additional logging detail after crashes/errors + Promote support for advanced fencing logic to the stable schema + Promote support for node starting scores to the stable schema + Promote support for service and systemd to the stable schema + attrd: Differentiate between updating all our 
attributes and everybody updating all theirs too + attrd: Have single-shot clients wait for an ack before disconnecting + cib: cl#5026 - Synced cib updates should not return until the cpg broadcast is complete. + corosync: Detect when the first corosync has not yet formed and handle it gracefully + corosync: Obtain a full list of configured nodes, including their names, when we connect to the quorum API + corosync: Obtain a node name from DNS if one was not already known + corosync: Populate the cib nodelist from corosync if available + corosync: Use the CFG API and DNS to determine node names if not configured in corosync.conf + crmd: Block after 10 failed fencing attempts for a node + crmd: cl#5051 - Fixes file leak in PE ipc connection initialization. + crmd: cl#5053 - Fixes fail-count not being updated properly. + crmd: cl#5057 - Restart sub-systems correctly (bnc#755671) + crmd: cl#5068 - Fixes crm_node -R option so it works with corosync 2.0 + crmd: Correctly re-establish failed attrd connections + crmd: Detect when the quorum API isn't configured for corosync 2.0 + crmd: Do not overwrite any configured node type (eg. quorum node) + crmd: Enable use of new lrmd daemon and client library in crmd. + crmd: Overhaul the way node state is recorded and updated in the CIB + fencing: Bug rhbz#853537 - Prevent use-of-NULL when the cib libraries are not available + fencing: cl#5073 - Add 'off' as a valid value for stonith-action option. + fencing: cl#5092 - Always timeout stonith operations if timeout period expires. 
+ fencing: cl#5093 - Stonith per device timeout option + fencing: Clean up if we detect a failed connection + fencing: Delegate complex self fencing requests - we won't be around to see it to completion + fencing: Ensure all peers are notified of complex fencing op completion + fencing: Fix passing of fence_legacy parameters containing '=' + fencing: Gracefully handle metadata requests for unknown agents + fencing: Return cached dynamic target list for busy devices. + fencing: rhbz#801355 - Abort transition on DC when external fencing operation is detected + fencing: rhbz#801355 - Merge fence requests for identical operations already in progress. + fencing: rhbz#801355 - Report fencing operations external of pacemaker to cib + fencing: Specify the action to perform using action= instead of the older option= + fencing: Stop building fake metadata for broken agents + fencing: Tolerate agents that report empty metadata in the admin tool + mcp: Correctly retry the connection to corosync on failure + mcp: Do not shut down IPC until the last client exits + mcp: Prevent use-after-free when running against corosync 1.x + pengine: Bug cl#5059 - Use the correct action's status when calculating required actions for interleaved clones + pengine: Bypass online/offline checking resource detection for ping/quorum nodes + pengine: cl#5044 - migrate_to no longer requires load_stopped for avoiding possible transition loop + pengine: cl#5069 - Honor 'on-fail=ignore' even when operation is disabled. + pengine: cl#5070 - Allow influence of promotion score when multistate rsc is left hand of colocation + pengine: cl#5072 - Fixes monitor op stopping after rsc promotion. 
+ pengine: cl#5072 - Fixes pengine regression test failures + pengine: Correctly set the status for nodes not intended to run Pacemaker + pengine: Do not append instance numbers to anonymous clones + pengine: Fix failcount expiration + pengine: Fix memory leaks found by valgrind + pengine: Fix use-after-free and use-of-NULL errors detected by coverity + pengine: Fixes use of colocation scores other than +/- INFINITY + pengine: Improve detection of rejoining nodes + pengine: Prevent use-of-NULL when tracing is enabled + pengine: Stonith resources are allowed to start even if their probes haven't completed on partially active nodes + services: New class called 'service' which expands to the correct (LSB/systemd/upstart) standard + services: Support Asynchronous systemd/upstart actions + Tools: crm_shadow - Bug cl#5062 - Correctly set argv[0] when forking a shell process + Tools: crm_report: Always include system logs (if we can find them) * Wed Mar 28 2012 Andrew Beekhof Pacemaker-1.1.7-1 - Update source tarball to revision: bc7ff2c - Statistics: Changesets: 513 Diff: 1171 files changed, 90472 insertions(+), 19368 deletions(-) - Changes since Pacemaker-1.1.6.1 + ais: Prepare for corosync versions using IPC from libqb + cib: Correctly shutdown in the presence of peers without relying on timers + cib: Don't halt disk writes if the previous digest is missing + cib: Determine when there are no peers to respond to our shutdown request and exit + cib: Ensure no additional messages are processed after we begin terminating + Cluster: Hook up the callbacks to the corosync quorum notifications + Core: basename() may modify its input, do not pass in a constant + Core: Bug cl#5016 - Prevent failures in recurring ops from being lost + Core: Bug rhbz#800054 - Correctly retrieve heartbeat uuids + Core: Correctly determine when an XML file should be decompressed + Core: Correctly track the length of a string without reading from uninitialized memory (valgrind) + Core: Ensure signals 
are handled eventually in the absence of timer sources or IPC messages + Core: Prevent use-of-NULL in crm_update_peer() + Core: Strip text nodes from on disk xml files + Core: Support libqb for logging + corosync: Consistently set the correct uuid with get_node_uuid() + Corosync: Correctly disconnect from corosync variants + Corosync: Correctly extract the node id from membership updates + corosync: Correctly infer lost members from the quorum API + Corosync: Default to using the nodeid as the node's uuid (instead of uname) + corosync: Ensure we catch nodes that leave the membership, even if the ringid doesn't change + corosync: Hook up CPG membership + corosync: Relax a development assert and gracefully handle the error condition + corosync: Remove deprecated member of the CFG API + corosync: Treat CS_ERR_QUEUE_FULL the same as CS_ERR_TRY_AGAIN + corosync: Unset the process list when nodes disappear on us + crmd: Also purge fencing results when we enter S_NOT_DC + crmd: Bug cl#5015 - Remove the failed operation as well as the resulting fail-count and last-failure attributes + crmd: Correctly determine when a node can suicide with fencing + crmd: Election - perform the age comparison only once + crmd: Fast-track shutdown if we couldn't request it via attrd + crmd: Leave it up to the PE to decide which ops can/cannot be reload + crmd: Prevent use-after-free when calling delete_resource due to CRM_OP_REPROBE + crmd: Supply format arguments in the correct order + fencing: Add missing format parameter + fencing: Add the fencing topology section to the 1.1 configuration schema + fencing: fence_legacy - Drop spurious host argument from status query + fencing: fence_legacy - Ensure port is available as an environment variable when calling monitor + fencing: fence_pcmk - don't block if nothing is specified on stdin + fencing: Fix log format error + fencing: Fix segfault caused by passing garbage to dlsym() + fencing: Fix use-of-NULL in process_remote_stonith_query() + 
fencing: Fix use-of-NULL when listing installed devices + fencing: Implement support for advanced fencing topologies: eg. kdump || (network && disk) || power + fencing: More gracefully handle failed 'list' operations for devices that only support a single connection + fencing: Prevent duplicate free when listing devices + fencing: Prevent uninitialized pointers being passed to free + fencing: Prevent use-after-free, we may need the query result for subsequent operations + fencing: Provide enough data to construct an entry in the node's fencing history + fencing: Standardize on /one/ method for clients to request members be fenced + fencing: Suppress errors when listing all registered devices + mcp: corosync_cfg_state_track was removed from the corosync API, luckily we didn't use it for anything + mcp: Do not specify a WorkingDirectory in the systemd unit file - startup fails if it's not available + mcp: Set the HA_quorum_type env variable consistently with our corosync plugin + mcp: Shut down if one of our child processes can/should not be respawned + pengine: Bug cl#5000 - Ensure ordering is preserved when depending on partial sets + pengine: Bug cl#5028 - Unmanaged services should block shutdown unless in maintenance mode + pengine: Bug cl#5038 - Prevent restart of anonymous clones when clone-max decreases + pengine: Bug cl#5007 - Fixes use of colocation constraints with multi-state resources + pengine: Bug cl#5014 - Prevent asymmetrical order constraints from causing resource stops + pengine: Bug cl#5000 - Implements ability to create rsc_order constraint sets such that A can start after B or C has started. 
+ pengine: Correctly migrate a resource that has just migrated + pengine: Correct return from error path + pengine: Detect reloads of previously migrated resources + pengine: Ensure post-migration stop actions occur before node shutdown + pengine: Log as loudly as possible when we cannot shut down a cluster node + pengine: Reload of a resource no longer causes a restart of dependent resources + pengine: Support limiting the number of concurrent live migrations + pengine: Support referencing templates in constraints + pengine: Support of referencing resource templates in resource sets + pengine: Support to make tickets standby for relinquishing tickets gracefully + stonith: A "start" operation of a stonith resource does a "monitor" on the device beyond registering it + stonith: Bug rhbz#745526 - Ensure stonith_admin actually gets called by fence_pcmk + Stonith: Ensure all nodes receive and deliver notifications of the manual override + stonith: Fix the stonith timeout issue (cl#5009, bnc#727498) + Stonith: Implement a manual override for when nodes are known to be safely off + Tools: Bug cl#5003 - Prevent use-after-free in crm_simulate + Tools: crm_mon - Support to display tickets (based on Yuusuke Iida's work) + Tools: crm_simulate - Support to grant/revoke/standby/activate tickets from the new ticket state section + Tools: Implement crm_node functionality for native corosync + Fix a number of potential problems reported by coverity * Wed Aug 31 2011 Andrew Beekhof 1.1.6-1 - Update source tarball to revision: 676e5f25aa46 tip - Statistics: Changesets: 376 Diff: 1761 files changed, 36259 insertions(+), 140578 deletions(-) - Changes since Pacemaker-1.1.5 + ais: check for retryable errors when dispatching AIS messages + ais: Correctly disconnect from Corosync and Cman based clusters + ais: Followup to previous patch - Ensure we drain the corosync queue of messages when Glib tells us there is input + ais: Handle IPC error before checking for NULL data (bnc#702907) +
cib: Check the validation version before adding the originator details of a CIB change + cib: Remove disconnected remote connections from mainloop + cman: Correctly override existing fenced operations + cman: Dequeue all the cman emitted events and not only the first one leaving the others in the event's queue. + cman: Don't call fenced_join and fenced_leave when notifying cman of a fencing event. + cman: We need to run the crmd as root for CMAN so that we can ACK fencing operations + Core: Cancelled and pending operations do not count as failed + Core: Ensure there is sufficient space for EOS when building short-form option strings + Core: Fix variable expansion in pkg-config files + Core: Partial revert of accidental commit in previous patch + Core: Use dlopen to load heartbeat libraries on-demand + crmd: Bug lf#2509 - Watch for config option changes from the CIB even if we're not the DC + crmd: Bug lf#2528 - Introduce a slight delay when creating a transition to allow attrd time to perform its updates + crmd: Bug lf#2559 - Fail actions that were scheduled for a failed/fenced node + crmd: Bug lf#2584 - Allow nodes to fence themselves if they're the last one standing + crmd: Bug lf#2632 - Correctly handle nodes that return faster than stonith + crmd: Cancel timers for actions that were pending on dead nodes + crmd: Catch fence operations that claim to succeed but did not really + crmd: Do not wait for actions that were pending on dead nodes + crmd: Ensure we do not attempt to perform action on failed nodes + crmd: Prevent use-of-NULL by g_hash_table_iter_next() + crmd: Recurring actions shouldn't cause the last non-recurring action to be forgotten + crmd: Store only the last and last failed operation in the CIB + mcp: dirname() modifies the input path - pass in a copy of the logfile path + mcp: Enable stack detection logic instead of forcing 'corosync' + mcp: Fix spelling mistake in systemd service script that prevents shutdown + mcp: Shut down if corosync becomes 
unavailable + mcp: systemd control file is now functional + pengine: Before migrating an utilization-using resource to a node, take off the load which will no longer run there (lf#2599, bnc#695440) + pengine: Before migrating an utilization-using resource to a node, take off the load which will no longer run there (regression tests) (lf#2599, bnc#695440) + pengine: Bug lf#2574 - Prevent shuffling by choosing the correct clone instance to stop + pengine: Bug lf#2575 - Use uname for migration variables, id is a UUID on heartbeat + pengine: Bug lf#2581 - Avoid group restart when clone (re)starts on an unrelated node + pengine: Bug lf#2613, lf#2619 - Group migration after failures and non-default utilization policies + pengine: Bug suse#707150 - Prevent services being active if dependencies on clones are not satisfied + pengine: Correctly recognise which recurring operations are currently active + pengine: Demote from Master does not clear previous errors + pengine: Ensure restarts due to definition changes cause the start action to be re-issued not probes + pengine: Ensure role is preserved for unmanaged resources + pengine: Ensure unmanaged resources have the correct role set so the correct monitor operation is chosen + pengine: Fix memory leak for re-allocated resources reported by valgrind + pengine: Implement cluster ticket and deadman + pengine: Implement resource template + pengine: Correctly determine the state of multi-state resources with a partial operation history + pengine: Only allocate master/slave resources once + pengine: Partial revert of 'Minor code cleanup CS: cf6bca32376c On: 2011-08-15' + pengine: Resolve memory leak reported by valgrind + pengine: Restore the ability to save inputs to disk + Shell: implement -w,--wait option to wait for the transition to finish + Shell: repair template list command + Shell: set of commands to examine logs, reports, etc + Stonith: Consolidate pcmk_host_map into run_stonith_agent so that it is applied consistently 
+ Stonith: Deprecate pcmk_arg_map for the saner pcmk_host_argument + Stonith: Fix use-of-NULL by g_hash_table_lookup + Stonith: Improved pcmk_host_map parsing + Stonith: Prevent use-of-NULL by g_hash_table_lookup + Stonith: Prevent use-of-NULL when no Linux-HA stonith agents are present + stonith: Add missing entries to stonith_error2string() + Stonith: Correctly finish sending agent options if the initial write is interrupted + stonith: Correctly handle synchronous calls + stonith: Coverity - Correctly construct result list for the query API call + stonith: Coverity - Remove badly constructed memory allocation from the query API call + stonith: Ensure completed operations are recorded as such in the history + Stonith: Ensure device parameters are passed to the daemon during registration + stonith: Fix use-of-NULL in stonith_api_device_list() + stonith: stonith_admin - Prevent use of uninitialized pointer by --history command + Tools: Bug lf#2528 - Make progress when attrd_updater is called repeatedly within the dampen interval but with the same value + Tools: crm_report - Correctly extract data from the local node + Tools: crm_report - Remove newlines when detecting the node list + Tools: crm_report - Repair the ability to extract data from the local machine + Tools: crm_report - Report on all detected backtraces * Fri Feb 11 2011 Andrew Beekhof 1.1.5-1 - Update source tarball to revision: baad6636a053 - Statistics: Changesets: 184 Diff: 605 files changed, 46103 insertions(+), 26417 deletions(-) - Changes since Pacemaker-1.1.4 + Add the ability to delegate sub-sections of the cluster to non-root users via ACLs Needs to be enabled at compile time, not enabled by default. 
+ ais: Bug lf#2550 - Report failed processes immediately + Core: Prevent recently introduced use-after-free in replace_xml_child() + Core: Reinstate the logic that skips past non-XML_ELEMENT_NODE children + Core: Remove extra calls to xmlCleanupParser resulting in use-after-free + Core: Repair reference to child-of-child after removal of xml_child_iter_filter from get_message_xml() + crmd: Bug lf#2545 - Ensure notify variables are accurate for stop operations + crmd: Cancel recurring operations while we're still connected to the lrmd + crmd: Reschedule the PE_START action if it's not already running when we try to use it + crmd: Update failcount for failed promote and demote operations + pengine: Bug lf#2445 - Avoid relying on stickiness for stable clone placement + pengine: Bug lf#2445 - Do not override configured clone stickiness values + pengine: Bug lf#2493 - Don't imply colocation requirements when applying ordering constraints with clones + pengine: Bug lf#2495 - Prevent segfault by validating the contents of ordering sets + pengine: Bug lf#2508 - Correctly reconstruct the status of anonymous cloned groups + pengine: Bug lf#2518 - Avoid spamming the logs with errors for orphan resources + pengine: Bug lf#2544 - Prevent unstable clone placement by factoring in the current node's score before all others + pengine: Bug lf#2554 - target-role alone is not sufficient to promote resources + pengine: Correct target_rc for probes of inactive resources (fix regression introduced by cs:ac3f03006e95) + pengine: Ensure that fencing has completed for stop actions on stonith-dependent resources (lf#2551) + pengine: Only update the node's promotion score if the resource is active there + pengine: Only use the promotion score from the current clone instance + pengine: Prevent use-of-NULL resulting from variable shadowing spotted by Coverity + pengine: Prevent use-of-NULL when there is status for an undefined node + pengine: Prevent use-after-free resulting from unintended
recursion when choosing a node to promote master/slave resources + Shell: don't create empty optional sections (bnc#665131) + Stonith: Teach stonith_admin to automagically obtain the current node attributes for the target from the CIB + tools: Bug lf#2527 - Prevent use-of-NULL in crm_simulate + Tools: Prevent crm_resource commands from being lost due to the use of cib_scope_local * Wed Oct 20 2010 Andrew Beekhof 1.1.4-1 - Update source tarball to revision: 75406c3eb2c1 tip - Statistics: Changesets: 169 Diff: 772 files changed, 56172 insertions(+), 39309 deletions(-) - Changes since Pacemaker-1.1.3 + Italian translation of Clusters from Scratch + Significant performance enhancements to the Policy Engine and CIB + cib: Bug lf#2506 - Don't remove clients when notifications fail, they might just be too big + cib: Drop invalid/failed connections from the client hashtable + cib: Ensure all diffs sent to peers have sufficient ordering information + cib: Ensure non-change diffs can preserve the ordering on the other side + cib: Fix the feature set check + cib: Include version information on our synthesised diffs when nothing changed + cib: Optimize the way we detect group/set ordering changes - 15% speedup + cib: Prevent false detection of config updates with the new diff format + cib: Reduce unnecessary copying when comparing xml objects + cib: Repair the processing of updates sent from peer nodes + cib: Revert part of a recent commit that purged still valid connections + cib: The feature set version check is only valid if the current value is non-NULL + Core: Actually removing diff markers is necessary + Core: Bug lf#2506 - Drop the compression limit because Heartbeat's IPC code sucks + Core: Cache Relax-NG schemas - profiling indicates many cycles are wasted needlessly re-parsing them + Core: Correctly compare against crm_log_level in the logging macros + Core: Correctly extract the version details from a diff + Core: Correctly hook up the RNG schema cache + Core:
Correctly use lazy_xml_sort() for v2 digests + Core: Don't compress large payload elements unless we're approaching message limits + Core: Don't insert empty ID tags when applying diffs + Core: Enable the improved v2 digests + Core: Ensure ordering is preserved when applying diffs + Core: Fix the CRM_CHECK macro + Core: Modify the v2 digest algorithm so that some fields are sorted + Core: Prevent use-after-free when creating a CIB update for a timed out action + Core: Prevent use-of-NULL when cleaning up RelaxNG data structures + Core: Provide significant performance improvements by implementing versioned diffs and digests + crmd: All pending operations should be recorded, even recurring ones with high start delays + crmd: Don't abort transitions when probes are completed on a node + crmd: Don't hide stop events that time out - allowing faster recovery in the presence of overloaded hosts + crmd: Ensure the CIB is always writable on the DC by removing a timing hole + crmd: Include the correct transition details for timed out operations + crmd: Prevent use of NULL by making copies of the operation's hash table + crmd: There's no need to check the cib version from the 'added' part of diff updates + crmd: Use the supplied timeout for stop actions + mcp: Ensure valgrind is able to log its output somewhere + mcp: Use 99/01 for the start/stop sequence to avoid problems with services (such as libvirtd) started by init - Patch from Vladislav Bogdanov + pengine: Ensure fencing of the DC precedes the STONITH_DONE operation + pengine: Fix memory leak introduced as part of the conversion to GHashTables + pengine: Fix memory leak when processing completed migration actions + pengine: Fix typo leading to use-of-NULL in the new ordering code + pengine: Free memory in recently introduced helper function + pengine: lf#2478 - Implement improved handling and recovery of atomic resource migrations + pengine: Obtain massive speedup by prepending to the list of ordering constraints (which
can grow quite large) + pengine: Optimize the logic for deciding which non-grouped anonymous clone instances to probe for + pengine: Prevent clones from being stopped because resources colocated with them cannot be active + pengine: Try to ensure atomic migration ops occur within a single transition + pengine: Use hashtables instead of linked lists for performance sensitive datastructures + pengine: Use the original digest algorithm for parameter lists + stonith: cleanup children on timeout in fence_legacy + Stonith: Fix two memory leaks + Tools: crm_shadow - Avoid replacing the entire configuration (including status) * Tue Sep 21 2010 Andrew Beekhof 1.1.3-1 - Update source tarball to revision: e3bb31c56244 tip - Statistics: Changesets: 352 Diff: 481 files changed, 14130 insertions(+), 11156 deletions(-) - Changes since Pacemaker-1.1.2.1 + ais: Bug lf#2401 - Improved processing when the peer crmd processes join/leave + ais: Correct the logic for conecting to plugin based clusters + ais: Do not supply a process list in mcp-mode + ais: Drop support for whitetank in the 1.1 release series + ais: Get an initial dump of the node membership when connecting to quorum-based clusters + ais: Guard against saturated cpg connections + ais: Handle CS_ERR_TRY_AGAIN in more cases + ais: Move the code for finding uid before the fork so that the child does no logging + ais: Never allow quorum plugins to affect connection to the pacemaker plugin + ais: Sign everyone up for peer process updates, not just the crmd + ais: The cluster type needs to be set before initializing classic openais connections + cib: Also free query result for xpath operations that return more than one hit + cib: Attempt to resolve memory corruption when forking a child to write the cib to disk + cib: Correctly free memory when writing out the cib to disk + cib: Fix the application of unversioned diffs + cib: Remove old developmental error logging + cib: Restructure the 'valid peer' check for deciding which 
instructions to ignore + cman: Correctly process membership/quorum changes from the pcmk plugin. Allow other message types through untouched + cman: Filter directed messages not intended for us + cman: Grab the initial membership when we connect + cman: Keep the list of peer processes up-to-date + cman: Make sure our common hooks are called after a cman membership update + cman: Make sure we can compile without cman present + cman: Populate sender details for cpg messages + cman: Update the ringid for cman based clusters + Core: Correctly unpack HA_Messages containing multiple entries with the same name + Core: crm_count_member() should only track nodes that have the full stack up + Core: New developmental logging system inspired by the kernel and a PoC from Lars Ellenberg + crmd: All nodes should see status updates, not just the DC + crmd: Allow non-DC nodes to clear failcounts too + crmd: Base DC election on process relative uptime + crmd: Bug lf#2439 - cancel_op() can also return HA_RSCBUSY + crmd: Bug lf#2439 - Handle asynchronous notification of resource deletion events + crmd: Bug lf#2458 - Ensure stop actions always have the relevant resource attributes + crmd: Disable age as a criterion for cman based clusters, it's not reliable enough + crmd: Ensure we activate the DC timer if we detect an alternate DC + crmd: Factor the nanosecond component of process uptime in elections + crmd: Fix assertion failure when performing async resource failures + crmd: Fix handling of async resource deletion results + crmd: Include the action for crm graph operations + crmd: Make sure the membership cache is accurate after a successful fencing operation + crmd: Make sure we always poke the FSA after a transition to clear any TE_HALT actions + crmd: Offer crm-level membership once the peer starts the crmd process + crmd: Only need to request quorum update for plugin based clusters + crmd: Prevent assertion failure for stop actions resulting from cs: 3c0bc17c6daf + crmd: Prevent
everyone from losing DC elections by correctly initializing all relevant variables + crmd: Prevent segmentation fault + crmd: several fixes for async resource delete (thanks to beekhof) + crmd: Use the correct define/size for lrm resource IDs + Introduce two new cluster types 'cman' and 'corosync', replaces 'quorum_provider' concept + mcp: Add missing headers when built without heartbeat support + mcp: Correctly initialize the string containing the list of active daemons + mcp: Fix macro expansion in init script + mcp: Fix the expansion of the pid file in the init script + mcp: Handle CS_ERR_TRY_AGAIN when connecting to libcfg + mcp: Make sure we can compile the mcp without cman present + mcp: New master control process for (re)spawning pacemaker daemons + mcp: Read config early so we can re-initialize logging asap if daemonizing + mcp: Rename the mcp binary to pacemakerd and create a 'pacemaker' init script + mcp: Resend our process list after every CPG change + mcp: Tell chkconfig we need to shut down early on + pengine: Avoid creating invalid ordering constraints for probes that are not needed + pengine: Bug lf#1959 - Failed unmanaged resources should not prevent other services from shutting down + pengine: Bug lf#2422 - Ordering dependencies on partially active groups not observed properly + pengine: Bug lf#2424 - Use notify operation definition if it exists in the configuration + pengine: Bug lf#2433 - No services should be stopped until probes finish + pengine: Bug lf#2453 - Enforce clone ordering in the absence of colocation constraints + pengine: Bug lf#2476 - Repair on-fail=block for groups and primitive resources + pengine: Correctly detect when there is a real failcount that expired and needs to be cleared + pengine: Correctly handle pseudo action creation + pengine: Correctly order clone startup after group/clone start + pengine: Correct use-after-free introduced in the prior patch + pengine: Do not demote resources because something that requires it can
not run + pengine: Fix colocation for interleaved clones + pengine: Fix colocation with partially active groups + pengine: Fix potential use-after-free defect from coverity + pengine: Fix previous merge + pengine: Fix use-after-free in order_actions() reported by valgrind + pengine: Make the current data set a global variable so it does not need to be passed around everywhere + pengine: Prevent endless loop when looking for operation definitions in the configuration + pengine: Prevent segfault by ensuring the arguments to do_calculations() are initialized + pengine: Rewrite the ordering constraint logic for simplicity, clarity and maintainability + pengine: Wait until stonith is available, do not fall back to shutdown for nodes requesting termination + Resolve coverity RESOURCE_LEAK defects + Shell: Complete the transition to using crm_attribute instead of crm_failcount and crm_standby + stonith: Advertise stonith-ng options in the metadata + stonith: Bug lf#2461 - Prevent segfault by not looking up operations if the hashtable has not been initialized yet + stonith: Bug lf#2473 - Add the timeout at the top level where the daemon is looking for it + Stonith: Bug lf#2473 - Ensure stonith operations complete within the timeout and are terminated if they run too long + stonith: Bug lf#2473 - Ensure timeouts are included for fencing operations + stonith: Bug lf#2473 - Gracefully handle remote operations that arrive late (after we have done notifications) + stonith: Correctly parse pcmk_host_list parameters that appear on a single line + stonith: Map poweron/poweroff back to on/off expected by the stonith tool from cluster-glue + stonith: pass the configuration to the stonith program via environment variables (bnc#620781) + Stonith: Use the timeout specified by the user + Support starting plugin-based Pacemaker clusters with the MCP as well + Tools: Bug lf#2456 - Fix assertion failure in crm_resource + tools: crm_node - Repair the ability to connect to openais based
clusters + tools: crm_node - Use the correct short option for --cman + tools: crm_report - corosync.conf won't necessarily contain the text 'pacemaker' anymore + Tools: crm_simulate - Fix use-after-free when terminating + tools: crm_simulate - Resolve coverity USE_AFTER_FREE defect + Tools: Drop the 'pingd' daemon and resource agent in favor of ocf:pacemaker:ping + Tools: Fix recently introduced use-of-NULL + Tools: Fix use-after-free defects from coverity * Wed May 12 2010 Andrew Beekhof 1.1.2-1 - Update source tarball to revision: c25c972a25cc tip - Statistics: Changesets: 339 Diff: 708 files changed, 37918 insertions(+), 10584 deletions(-) - Changes since Pacemaker-1.1.1 + ais: Do not count votes from offline nodes and calculate current votes before sending quorum data + ais: Ensure the list of active processes sent to clients is always up-to-date + ais: Look for the correct conf variable for turning on file logging + ais: Need to find a better and thread-safe way to set core_uses_pid. Disable for now.
+ ais: Use the threadsafe version of getpwnam + Core: Bump the feature set due to the new failcount expiry feature + Core: fix memory leaks exposed by valgrind + Core: Bug lf#2414 - Prevent use-after-free reported by valgrind when doing xpath based deletions + crmd: Bug lf#2414 - Prevent use-after-free of the PE connection after it dies + crmd: Bug lf#2414 - Prevent use-after-free of the stonith-ng connection + crmd: Bug lf#2401 - Improved detection of partially active peers + crmd: Bug lf#2379 - Ensure the cluster terminates when the PE is not available + crmd: Do not allow the target_rc to be misused by resource agents + crmd: Do not ignore action timeouts based on FSA state + crmd: Ensure we don't get stuck in S_PENDING if we lose an election to someone that never talks to us again + crmd: Fix memory leaks exposed by valgrind + crmd: Remove race condition that could lead to multiple instances of a clone being active on a machine + crmd: Send erase_status_tag() calls to the local CIB when the DC is fenced, since there is no DC to accept them + crmd: Use global fencing notifications to prevent secondary fencing operations of the DC + pengine: Bug lf#2317 - Avoid needless restart of primitive depending on a clone + pengine: Bug lf#2361 - Ensure clones observe mandatory ordering constraints if the LHS is unrunnable + pengine: Bug lf#2383 - Combine failcounts for all instances of an anonymous clone on a host + pengine: Bug lf#2384 - Fix intra-set colocation and ordering + pengine: Bug lf#2403 - Enforce mandatory promotion (colocation) constraints + pengine: Bug lf#2412 - Correctly find clone instances by their prefix + pengine: Do not be so quick to pull the trigger on nodes that are coming up + pengine: Fix memory leaks exposed by valgrind + pengine: Rewrite native_merge_weights() to avoid Fix use-after-free + Shell: Bug bnc#590035 - always reload status if working with the cluster + Shell: Bug bnc#592762 - Default to using the status section from the live CIB + 
Shell: Bug lf#2315 - edit multiple meta_attributes sets in resource management + Shell: Bug lf#2221 - enable comments + Shell: Bug bnc#580492 - implement new cibstatus interface and commands + Shell: Bug bnc#585471 - new cibstatus import command + Shell: check timeouts also against the default-action-timeout property + Shell: new configure filter command + Tools: crm_mon - fix memory leaks exposed by valgrind * Tue Feb 16 2010 Andrew Beekhof - 1.1.1-1 - First public release of Pacemaker 1.1 - Package reference documentation in a doc subpackage - Move cts into a subpackage so that it can be easily consumed by others - Update source tarball to revision: 17d9cd4ee29f + New stonith daemon that supports global notifications + Service placement influenced by the physical resources + A new tool for simulating failures and the cluster’s reaction to them + Ability to serialize an otherwise unrelated a set of resource actions (eg. Xen migrations) * Mon Jan 18 2010 Andrew Beekhof - 1.0.7-1 - Update source tarball to revision: 2eed906f43e9 (stable-1.0) tip - Statistics: Changesets: 193 Diff: 220 files changed, 15933 insertions(+), 8782 deletions(-) - Changes since 1.0.5-4 + pengine: Bug 2213 - Ensure groups process location constraints so that clone-node-max works for cloned groups + pengine: Bug lf#2153 - non-clones should not restart when clones stop/start on other nodes + pengine: Bug lf#2209 - Clone ordering should be able to prevent startup of dependent clones + pengine: Bug lf#2216 - Correctly identify the state of anonymous clones when deciding when to probe + pengine: Bug lf#2225 - Operations that require fencing should wait for 'stonith_complete' not 'all_stopped'. 
+ pengine: Bug lf#2225 - Prevent clone peers from stopping while another instance is (potentially) being fenced + pengine: Correctly anti-colocate with a group + pengine: Correctly unpack ordering constraints for resource sets to avoid graph loops + Tools: crm: load help from crm_cli.txt + Tools: crm: resource sets (bnc#550923) + Tools: crm: support for comments (LF 2221) + Tools: crm: support for description attribute in resources/operations (bnc#548690) + Tools: hb2openais: add EVMS2 CSM processing (and other changes) (bnc#548093) + Tools: hb2openais: do not allow empty rules, clones, or groups (LF 2215) + Tools: hb2openais: refuse to convert pure EVMS volumes + cib: Ensure the loop for login message terminates + cib: Finally fix reliability of receiving large messages over remote plaintext connections + cib: Fix remote notifications + cib: For remote connections, default to CRM_DAEMON_USER since that's the only one that the cib can validate the password for using PAM + cib: Remote plaintext - Retry sending parts of the message that did not fit the first time + crmd: Ensure batch-limit is correctly enforced + crmd: Ensure we have the latest status after a transition abort + (bnc#547579,547582): Tools: crm: status section editing support + shell: Add allow-migrate as allowed meta-attribute (bnc#539968) + Medium: Build: Do not automatically add -L/lib, it could cause 64-bit arches to break + Medium: pengine: Bug lf#2206 - rsc_order constraints always use score at the top level + Medium: pengine: Only complain about target-role=master for non m/s resources + Medium: pengine: Prevent non-multistate resources from being promoted through target-role + Medium: pengine: Provide a default action for resource-set ordering + Medium: pengine: Silently fix requires=fencing for stonith resources so that it can be set in op_defaults + Medium: Tools: Bug lf#2286 - Allow the shell to accept template parameters on the command line + Medium: Tools: Bug lf#2307 - Provide a way to
determin the nodeid of past cluster members + Medium: Tools: crm: add update method to template apply (LF 2289) + Medium: Tools: crm: direct RA interface for ocf class resource agents (LF 2270) + Medium: Tools: crm: direct RA interface for stonith class resource agents (LF 2270) + Medium: Tools: crm: do not add score which does not exist + Medium: Tools: crm: do not consider warnings as errors (LF 2274) + Medium: Tools: crm: do not remove sets which contain id-ref attribute (LF 2304) + Medium: Tools: crm: drop empty attributes elements + Medium: Tools: crm: exclude locations when testing for pathological constraints (LF 2300) + Medium: Tools: crm: fix exit code on single shot commands + Medium: Tools: crm: fix node delete (LF 2305) + Medium: Tools: crm: implement -F (--force) option + Medium: Tools: crm: rename status to cibstatus (LF 2236) + Medium: Tools: crm: revisit configure commit + Medium: Tools: crm: stay in crm if user specified level only (LF 2286) + Medium: Tools: crm: verify changes on exit from the configure level + Medium: ais: Some clients such as gfs_controld want a cluster name, allow one to be specified in corosync.conf + Medium: cib: Clean up logic for receiving remote messages + Medium: cib: Create valid notification control messages + Medium: cib: Indicate where the remote connection came from + Medium: cib: Send password prompt to stderr so that stdout can be redirected + Medium: cts: Fix rsh handling when stdout is not required + Medium: doc: Fill in the section on removing a node from an AIS-based cluster + Medium: doc: Update the docs to reflect the 0.6/1.0 rolling upgrade problem + Medium: doc: Use Publican for docbook based documentation + Medium: fencing: stonithd: add metadata for stonithd instance attributes (and support in the shell) + Medium: fencing: stonithd: ignore case when comparing host names (LF 2292) + Medium: tools: Make crm_mon functional with remote connections + Medium: xml: Add stopped as a supported role for operations 
+ Medium: xml: Bug bnc#552713 - Treat node unames as text fields not IDs + Medium: xml: Bug lf#2215 - Create an always-true expression for empty rules when upgrading from 0.6 * Thu Oct 29 2009 Andrew Beekhof - 1.0.5-4 - Include the fixes from CoroSync integration testing - Move the resource templates - they are not documentation - Ensure documentation is placed in a standard location - Exclude documentation that is included elsewhere in the package - Update the tarball from upstream to version ee19d8e83c2a + cib: Correctly clean up when both plaintext and tls remote ports are requested + pengine: Bug bnc#515172 - Provide better defaults for lt(e) and gt(e) comparisons + pengine: Bug lf#2197 - Allow master instance placement to be influenced by colocation constraints + pengine: Make sure promote/demote pseudo actions are created correctly + pengine: Prevent target-role from promoting more than master-max instances + ais: Bug lf#2199 - Prevent expected-quorum-votes from being populated with garbage + ais: Prevent deadlock - don't try to release IPC message if the connection failed + cib: For validation errors, send back the full CIB so the client can display the errors + cib: Prevent use-after-free for remote plaintext connections + crmd: Bug lf#2201 - Prevent use-of-NULL when running heartbeat * Wed Oct 13 2009 Andrew Beekhof - 1.0.5-3 - Update the tarball from upstream to version 38cd629e5c3c + Core: Bug lf#2169 - Allow dtd/schema validation to be disabled + pengine: Bug lf#2106 - Not all anonymous clone children are restarted after configuration change + pengine: Bug lf#2170 - stop-all-resources option had no effect + pengine: Bug lf#2171 - Prevent groups from starting if they depend on a complex resource which can not + pengine: Disable resource management if stonith-enabled=true and no stonith resources are defined + pengine: do not include master score if it would prevent allocation + ais: Avoid excessive load by checking for dead children every 1s (instead
of 100ms) + ais: Bug rh#525589 - Prevent shutdown deadlocks when running on CoroSync + ais: Gracefully handle changes to the AIS nodeid + crmd: Bug bnc#527530 - Wait for the transition to complete before leaving S_TRANSITION_ENGINE + crmd: Prevent use-after-free with LOG_DEBUG_3 + Medium: xml: Mask the "symmetrical" attribute on rsc_colocation constraints (bnc#540672) + Medium (bnc#520707): Tools: crm: new templates ocfs2 and clvm + Medium: Build: Invert the disable ais/heartbeat logic so that --without (ais|heartbeat) is available to rpmbuild + Medium: pengine: Bug lf#2178 - Indicate unmanaged clones + Medium: pengine: Bug lf#2180 - Include node information for all failed ops + Medium: pengine: Bug lf#2189 - Incorrect error message when unpacking simple ordering constraint + Medium: pengine: Correctly log resources that would like to start but can not + Medium: pengine: Stop ptest from logging to syslog + Medium: ais: Include version details in plugin name + Medium: crmd: Requery the resource metadata after every start operation * Fri Aug 21 2009 Tomas Mraz - 1.0.5-2.1 - rebuilt with new openssl * Wed Aug 19 2009 Andrew Beekhof - 1.0.5-2 - Add versioned perl dependency as specified by https://fedoraproject.org/wiki/Packaging/Perl#Packages_that_link_to_libperl - No longer remove RPATH data, it prevents us finding libperl.so and no other libraries were being hardcoded - Compile in support for heartbeat - Conditionally add heartbeat-devel and corosynclib-devel to the -devel requirements depending on which stacks are supported * Mon Aug 17 2009 Andrew Beekhof - 1.0.5-1 - Add dependency on resource-agents - Use the version of the configure macro that supplies --prefix, --libdir, etc - Update the tarball from upstream to version 462f1569a437 (Pacemaker 1.0.5 final) + Tools: crm_resource - Advertise --move instead of --migrate + Medium: Extra: New node connectivity RA that uses system ping and attrd_updater + Medium: crmd: Note that dc-deadtime can be used to mask the 
brokenness of some switches * Tue Aug 11 2009 Ville Skyttä - 1.0.5-0.7.c9120a53a6ae.hg - Use bzipped upstream tarball. * Wed Jul 29 2009 Andrew Beekhof - 1.0.5-0.6.c9120a53a6ae.hg - Add back missing build auto* dependencies - Minor cleanups to the install directive * Tue Jul 28 2009 Andrew Beekhof - 1.0.5-0.5.c9120a53a6ae.hg - Add a leading zero to the revision when alphatag is used * Tue Jul 28 2009 Andrew Beekhof - 1.0.5-0.4.c9120a53a6ae.hg - Incorporate the feedback from the cluster-glue review - Realistically, the version is a 1.0.5 pre-release - Use the global directive instead of define for variables - Use the haclient/hacluster group/user instead of daemon - Use the _configure macro - Fix install dependencies * Fri Jul 24 2009 Andrew Beekhof - 1.0.4-3 - Initial Fedora checkin - Include an AUTHORS and license file in each package - Change the library package name to pacemaker-libs to be more Fedora compliant - Remove execute permissions from xml related files - Reference the new cluster-glue devel package name - Update the tarball from upstream to version c9120a53a6ae + pengine: Only prevent migration if the clone dependency is stopping/starting on the target node + pengine: Bug 2160 - Don't shuffle clones due to colocation + pengine: New implementation of the resource migration (not stop/start) logic + Medium: Tools: crm_resource - Prevent use-of-NULL by requiring a resource name for the -A and -a options + Medium: pengine: Prevent use-of-NULL in find_first_action() * Tue Jul 14 2009 Andrew Beekhof - 1.0.4-2 - Reference authors from the project AUTHORS file instead of listing in description - Change Source0 to reference the Mercurial repo - Cleaned up the summaries and descriptions - Incorporate the results of Fedora package self-review * Thu Jun 04 2009 Andrew Beekhof - 1.0.4-1 - Update source tarball to revision: 1d87d3e0fc7f (stable-1.0) - Statistics: Changesets: 209 Diff: 266 files changed, 12010 insertions(+), 8276 deletions(-) - Changes since 
Pacemaker-1.0.3 + (bnc#488291): ais: do not rely on byte endianness on ptr cast + (bnc#507255): Tools: crm: delete rsc/op_defaults (these meta_attributes are killing me) + (bnc#507255): Tools: crm: import properly rsc/op_defaults + (LF 2114): Tools: crm: add support for operation instance attributes + ais: Bug lf#2126 - Messages replies cannot be routed to transient clients + ais: Fix compilation for the latest Corosync API (v1719) + attrd: Do not perform all updates as complete refreshes + cib: Fix huge memory leak affecting heartbeat-based clusters + Core: Allow xpath queries to match attributes + Core: Generate the help text directly from a tool options struct + Core: Handle differences in 0.6 messaging format + crmd: Bug lf#2120 - All transient node attribute updates need to go via attrd + crmd: Correctly calculate how long an FSA action took to avoid spamming the logs with errors + crmd: Fix another large memory leak affecting Heartbeat based clusters + lha: Restore compatibility with older versions + pengine: Bug bnc#495687 - Filesystem is not notified of successful STONITH under some conditions + pengine: Make running a cluster with STONITH enabled but no STONITH resources an error and provide details on resolutions + pengine: Prevent use-of-NULL when using resource ordering sets + pengine: Provide inter-notification ordering guarantees + pengine: Rewrite the notification code to be understandable and extendable + Tools: attrd - Prevent race condition resulting in the cluster forgetting the node wishes to shut down + Tools: crm: regression tests + Tools: crm_mon - Fix smtp notifications + Tools: crm_resource - Repair the ability to query meta attributes + Low Build: Bug lf#2105 - Debian package should contain pacemaker doc and crm templates + Medium (bnc#507255): Tools: crm: handle empty rsc/op_defaults properly + Medium (bnc#507255): Tools: crm: use the right obj_type when creating objects from xml nodes + Medium (LF 2107): Tools: crm: revisit exit codes in 
configure + Medium: cib: Do not bother validating updates that only affect the status section + Medium: Core: Include supported stacks in version information + Medium: crmd: Record in the CIB, the cluster infrastructure being used + Medium: cts: Do not combine crm_standby arguments - the wrapper can not process them + Medium: cts: Fix the CIBAudit class + Medium: Extra: Refresh showscores script from Dominik + Medium: pengine: Build a statically linked version of ptest + Medium: pengine: Correctly log the actions for resources that are being recovered + Medium: pengine: Correctly log the occurrence of promotion events + Medium: pengine: Implement node health based on a patch from Mark Hamzy + Medium: Tools: Add examples to help text outputs + Medium: Tools: crm: catch syntax errors for configure load + Medium: Tools: crm: implement erasing nodes in configure erase + Medium: Tools: crm: work with parents only when managing xml objects + Medium: Tools: crm_mon - Add option to run custom notification program on resource operations (Patch by Dominik Klein) + Medium: Tools: crm_resource - Allow --cleanup to function on complex resources and cluster-wide + Medium: Tools: haresource2cib.py - Patch from horms to fix conversion error + Medium: Tools: Include stack information in crm_mon output + Medium: Tools: Two new options (--stack,--constraints) to crm_resource for querying how a resource is configured * Wed Apr 08 2009 Andrew Beekhof - 1.0.3-1 - Update source tarball to revision: b133b3f19797 (stable-1.0) tip - Statistics: Changesets: 383 Diff: 329 files changed, 15471 insertions(+), 15119 deletions(-) - Changes since Pacemaker-1.0.2 + Added tag SLE11-HAE-GMC for changeset 9196be9830c2 + ais plugin: Fix quorum calculation (bnc#487003) + ais: Another memory leak fix in error path + ais: Bug bnc#482847, bnc#482905 - Force a clean exit of OpenAIS once Pacemaker has finished unloading + ais: Bug bnc#486858 - Fix update_member() to prevent spamming clients with membership 
events containing no changes + ais: Centralize all quorum calculations in the ais plugin and allow expected votes to be configured in the cib + ais: Correctly handle a return value of zero from openais_dispatch_recv() + ais: Disable logging to a file + ais: Fix memory leak in error path + ais: IPC messages are only in scope until a response is sent + All signal handlers used with CL_SIGNAL() need to be as minimal as possible + cib: Bug bnc#482885 - Simplify CIB disk-writes to prevent data loss. Required a change to the backup filename format + cib: crmd: Revert part of 9782ab035003. Complex shutdown routines need G_main_add_SignalHandler to avoid race conditions + crm: Avoid infinite loop during crm configure edit (bnc#480327) + crmd: Avoid a race condition by waiting for the attrd update to trigger a transition automatically + crmd: Bug bnc#480977 - Prevent extra, partial, shutdown when a node restarts too quickly + crmd: Bug bnc#480977 - Prevent extra, partial, shutdown when a node restarts too quickly (verified) + crmd: Bug bnc#489063 - Ensure the DC is always unset after we 'lose' an election + crmd: Bug BSC#479543 - Correctly find the migration source for timed out migrate_from actions + crmd: Call crm_peer_init() before we start the FSA - prevents a race condition when used with Heartbeat + crmd: Erasing the status section should not be forced to the local node + crmd: Fix memory leak in cib notification processing code + crmd: Fix memory leak in transition graph processing + crmd: Fix memory leaks found by valgrind + crmd: More memory leaks fixes found by valgrind + fencing: stonithd: is_heartbeat_cluster is a no-no if there is no heartbeat support + pengine: Bug bnc#466788 - Exclude nodes that can not run resources + pengine: Bug bnc#466788 - Make colocation based on node attributes work + pengine: Bug BNC#478687 - Do not crash when clone-max is 0 + pengine: Bug bnc#488721 - Fix id-ref expansion for clones, the doc-root for clone children is not the cib root + 
pengine: Bug bnc#490418 - Correctly determine node state for nodes wishing to be terminated + pengine: Bug LF#2087 - Correctly parse the state of anonymous clones that have multiple instances on a given node + pengine: Bug lf#2089 - Meta attributes are not inherited by clone children + pengine: Bug lf#2091 - Correctly restart modified resources that were found active by a probe + pengine: Bug lf#2094 - Fix probe ordering for cloned groups + pengine: Bug LF:2075 - Fix large pingd memory leaks + pengine: Correctly attach orphaned clone children to their parent + pengine: Correctly handle terminate node attributes that are set to the output from time() + pengine: Ensure orphaned clone members are hooked up to the parent when clone-max=0 + pengine: Fix memory leak in LogActions + pengine: Fix the determination of whether a group is active + pengine: Look up the correct promotion preference for anonymous masters + pengine: Simplify handling of start failures by changing the default migration-threshold to INFINITY + pengine: The ordered option for clones no longer causes extra start/stop operations + RA: Bug bnc#490641 - Shut down dlm_controld with -TERM instead of -KILL + RA: pingd: Set default ping interval to 1 instead of 0 seconds + Resources: pingd - Correctly tell the ping daemon to shut down + Tools: Bug bnc#483365 - Ensure the command from cluster_test includes a value for --log-facility + Tools: cli: fix and improve delete command + Tools: crm: add and implement templates + Tools: crm: add support for command aliases and some common commands (i.e. 
cd,exit) + Tools: crm: create top configuration nodes if they are missing + Tools: crm: fix parsing attributes for rules (broken by the previous changeset) + Tools: crm: new ra set of commands + Tools: crm: resource agents information management + Tools: crm: rsc/op_defaults + Tools: crm: support for no value attribute in nvpairs + Tools: crm: the new configure monitor command + Tools: crm: the new configure node command + Tools: crm_mon - Prevent use-of-NULL when summarizing an orphan + Tools: hb2openais: create clvmd clone for respawn evmsd in ha.cf + Tools: hb2openais: fix a serious recursion bug in xml node processing + Tools: hb2openais: fix ocfs2 processing + Tools: pingd - prevent double free of getaddrinfo() output in error path + Tools: The default re-ping interval for pingd should be 1s not 1ms + Medium (bnc#479049): Tools: crm: add validation of resource type for the configure primitive command + Medium (bnc#479050): Tools: crm: add help for RA parameters in tab completion + Medium (bnc#479050): Tools: crm: add tab completion for primitive params/meta/op + Medium (bnc#479050): Tools: crm: reimplement cluster properties completion + Medium (bnc#486968): Tools: crm: listnodes function requires no parameters (do not mix completion with other stuff) + Medium: ais: Remove the ugly hack for dampening AIS membership changes + Medium: cib: Fix memory leaks by using mainloop_add_signal + Medium: cib: Move more logging to the debug level (was info) + Medium: cib: Overhaul the processing of synchronous replies + Medium: Core: Add library functions for instructing the cluster to terminate nodes + Medium: crmd: Add new expected-quorum-votes option + Medium: crmd: Allow up to 5 retries when an attrd update fails + Medium: crmd: Automatically detect and use new values for crm_config options + Medium: crmd: Bug bnc#490426 - Escalated shutdowns stall when there are pending resource operations + Medium: crmd: Clean up and optimize the DC election algorithm + Medium: crmd: 
Fix memory leak in shutdown + Medium: crmd: Fix memory leaks spotted by Valgrind + Medium: crmd: Ignore join messages from hosts other than our DC + Medium: crmd: Limit the scope of resource updates to the status section + Medium: crmd: Prevent the crmd from being respawned if it's told to shut down when it did not ask to be + Medium: crmd: Re-check the election status after membership events + Medium: crmd: Send resource updates via the local CIB during elections + Medium: pengine: Bug bnc#491441 - crm_mon does not display operations returning 'uninstalled' correctly + Medium: pengine: Bug lf#2101 - For location constraints, role=Slave is equivalent to role=Started + Medium: pengine: Clean up the API - removed ->children() and renamed ->find_child() to find_rsc() + Medium: pengine: Compress the display of healthy anonymous clones + Medium: pengine: Correctly log the actions for resources that are being recovered + Medium: pengine: Determine a promotion score for complex resources + Medium: pengine: Ensure clones always have a value for globally-unique + Medium: pengine: Prevent orphan clones from being allocated + Medium: RA: controld: Return proper exit code for stop op. + Medium: Tools: Bug bnc#482558 - Fix logging test in cluster_test + Medium: Tools: Bug bnc#482828 - Fix quoting in cluster_test logging setup + Medium: Tools: Bug bnc#482840 - Include directory path to CTSlab.py + Medium: Tools: crm: add more user input checks + Medium: Tools: crm: do not check resource status if we are working with a shadow + Medium: Tools: crm: fix id-refs and allow reference to top objects (i.e. primitive) + Medium: Tools: crm: ignore comments in the CIB + Medium: Tools: crm: multiple column output would not work with small lists + Medium: Tools: crm: refuse to delete running resources + Medium: Tools: crm: rudimentary if-else for templates + Medium: Tools: crm: Start/stop clones via target-role. 
+ Medium: Tools: crm_mon - Compress the node status for healthy and offline nodes + Medium: Tools: crm_shadow - Return 0/cib_ok when --create-empty succeeds + Medium: Tools: crm_shadow - Support -e, the short form of --create-empty + Medium: Tools: Make attrd quieter + Medium: Tools: pingd - Avoid using various clplumbing functions as they seem to leak + Medium: Tools: Reduce pingd logging * Mon Feb 16 2009 Andrew Beekhof - 1.0.2-1 - Update source tarball to revision: d232d19daeb9 (stable-1.0) tip - Statistics: Changesets: 441 Diff: 639 files changed, 20871 insertions(+), 21594 deletions(-) - Changes since Pacemaker-1.0.1 + (bnc#450815): Tools: crm cli: do not generate id for the operations tag + ais: Add support for the new AIS IPC layer + ais: Always set header.error to the correct default: SA_AIS_OK + ais: Bug BNC#456243 - Ensure the membership cache always contains an entry for the local node + ais: Bug BNC:456208 - Prevent deadlocks by not logging in the child process before exec() + ais: By default, disable support for the WIP openais IPC patch + ais: Detect and handle situations where ais and the crm disagree on the node name + ais: Ensure crm_peer_seq is updated after a membership update + ais: Make sure all IPC header fields are set to sane defaults + ais: Repair and streamline service load now that whitetank startup functions correctly + build: create and install doc files + cib: Allow clients without mainloop to connect to the cib + cib: CID:18 - Fix use-of-NULL in cib_perform_op + cib: CID:18 - Repair errors introduced in b5a18704477b - Fix use-of-NULL in cib_perform_op + cib: Ensure diffs contain the correct values of admin_epoch + cib: Fix four moderately sized memory leaks detected by Valgrind + Core: CID:10 - Prevent indexing into an array of schemas with a negative value + Core: CID:13 - Fix memory leak in log_data_element + Core: CID:15 - Fix memory leak in crm_get_peer + Core: CID:6 - Fix use-of-NULL in copy_ha_msg_input + Core: Fix crash in the 
membership code preventing node shutdown + Core: Fix more memory leaks found by valgrind + Core: Prevent unterminated strings after decompression + crmd: Bug BNC:467995 - Delay marking STONITH operations complete until STONITH tells us so + crmd: Bug LF:1962 - Do not NACK peers because they are not (yet) in our membership. Just ignore them. + crmd: Bug LF:2010 - Ensure fencing cib updates create the node_state entry if needed to prevent re-fencing during cluster startup + crmd: Correctly handle reconnections to attrd + crmd: Ensure updates for lost migrate operations indicate which node it tried to migrate to + crmd: If there are no nodes to finalize, start an election. + crmd: If there are no nodes to welcome, start an election. + crmd: Prevent node attribute loss by detecting attrd disconnections immediately + crmd: Prevent node re-probe loops by ensuring mandatory actions always complete + pengine: Bug 2005 - Fix startup ordering of cloned stonith groups + pengine: Bug 2006 - Correctly reprobe cloned groups + pengine: Bug BNC:465484 - Fix the no-quorum-policy=suicide option + pengine: Bug LF:1996 - Correctly process disabled monitor operations + pengine: CID:19 - Fix use-of-NULL in determine_online_status + pengine: Clones now default to globally-unique=false + pengine: Correctly calculate the number of available nodes for the clone to use + pengine: Only shoot online nodes with no-quorum-policy=suicide + pengine: Prevent on-fail settings being ignored after a resource is successfully stopped + pengine: Prevent use-of-NULL for failed migrate actions in process_rsc_state() + pengine: Remove an optimization for the terminate node attribute that caused the cluster to block indefinitely + pengine: Repair the ability to colocate based on node attributes other than uname + pengine: Start the correct monitor operation for unmanaged masters + stonith: CID:3 - Fix another case of exceptionally poor error handling by the original stonith developers + stonith: CID:5 - 
Checking for NULL and then dereferencing it anyway is an interesting approach to error handling + stonithd: Sending IPC to the cluster is a privileged operation + stonithd: wrong checks for shmid (0 is a valid id) + Tools: attrd - Correctly determine when an attribute has stopped changing and should be committed to the CIB + Tools: Bug 2003 - pingd does not correctly detect failures when the interface is down + Tools: Bug 2003 - pingd does not correctly handle node-down events on multi-NIC systems + Tools: Bug 2021 - pingd does not detect sequence wrapping correctly, incorrectly reports nodes offline + Tools: Bug BNC:468066 - Do not use the result of uname() when it's no longer in scope + Tools: Bug BNC:473265 - crm_resource -L dumps core + Tools: Bug LF:2001 - Transient node attributes should be set via attrd + Tools: Bug LF:2036 - crm_resource cannot set/get parameters for cloned resources + Tools: Bug LF:2046 - Node attribute updates are lost because attrd can take too long to start + Tools: Cause the correct clone instance to be failed with crm_resource -F + Tools: cluster_test - Allow the user to select a stack and fix CTS invocation + Tools: crm cli: allow rename only if the resource is stopped + Tools: crm cli: catch system errors on file operations + Tools: crm cli: completion for ids in configure + Tools: crm cli: drop '-rsc' from attributes for order constraint + Tools: crm cli: exit with an appropriate exit code + Tools: crm cli: fix wrong order of action and resource in order constraint + Tools: crm cli: fix wrong exit code + Tools: crm cli: improve handling of cib attributes + Tools: crm cli: new command: configure rename + Tools: crm cli: new command: configure upgrade + Tools: crm cli: new command: node delete + Tools: crm cli: prevent key errors on missing cib attributes + Tools: crm cli: print long help for help topics + Tools: crm cli: return on syntax error when parsing score + Tools: crm cli: rsc_location can be without nvpairs + Tools: crm cli: 
short node preference location constraint + Tools: crm cli: sometimes, on errors, level would change on single shot use + Tools: crm cli: syntax: drop a bunch of commas (remains of help tables conversion) + Tools: crm cli: verify user input for sanity + Tools: crm: find expressions within rules (do not always skip xml nodes due to used id) + Tools: crm_master should not define a set id now that attrd is used. Defining one can break lookups + Tools: crm_mon - Use the OID assigned to the project by IANA for SNMP traps + Medium (bnc#445622): Tools: crm cli: improve the node show command and drop node status + Medium (LF 2009): stonithd: improve timeouts for remote fencing + Medium: ais: Allow dead peers to be removed from membership calculations + Medium: ais: Pass node deletion events on to clients + Medium: ais: Sanitize ipc usage + Medium: ais: Supply the node uname in addition to the id + Medium: Build: Clean up configure to ensure NON_FATAL_CFLAGS is consistent with CFLAGS (ie. includes -g) + Medium: Build: Install cluster_test + Medium: Build: Use more restrictive CFLAGS and fix the resulting errors + Medium: cib: CID:20 - Fix potential use-after-free in cib_native_signon + Medium: Core: Bug BNC:474727 - Set a maximum time to wait for IPC messages + Medium: Core: CID:12 - Fix memory leak in decode_transition_magic error path + Medium: Core: CID:14 - Fix memory leak in calculate_xml_digest error path + Medium: Core: CID:16 - Fix memory leak in date_to_string error path + Medium: Core: Try to track down the cause of XML parsing errors + Medium: crmd: Bug BNC:472473 - Do not wait excessive amounts of time for lost actions + Medium: crmd: Bug BNC:472473 - Reduce the transition timeout to action_timeout+network_delay + Medium: crmd: Do not fast-track the processing of LRM refreshes when there are pending actions. 
+ Medium: crmd: do_dc_join_filter_offer - Check the 'join' message is for the current instance before deciding to NACK peers + Medium: crmd: Find option values without having to do a config upgrade + Medium: crmd: Implement shutdown using a transient node attribute + Medium: crmd: Update the crmd options to use dashes instead of underscores + Medium: cts: Add 'cluster reattach' to the suite of automated regression tests + Medium: cts: cluster_test - Make some usability enhancements + Medium: CTS: cluster_test - suggest a valid port number + Medium: CTS: Fix python import order + Medium: cts: Implement an automated SplitBrain test + Medium: CTS: Remove references to deleted classes + Medium: Extra: Resources - Use HA_VARRUN instead of HA_RSCTMP for state files as Heartbeat removes HA_RSCTMP at startup + Medium: HB: Bug 1933 - Fake crmd_client_status_callback() calls because HB does not provide them for already running processes + Medium: pengine: CID:17 - Fix memory leak in find_actions_by_task error path + Medium: pengine: CID:7,8 - Prevent hypothetical use-of-NULL in LogActions + Medium: pengine: Defer logging the actions performed on a resource until we have processed ordering constraints + Medium: pengine: Remove the symmetrical attribute of colocation constraints + Medium: Resources: pingd - fix the meta defaults + Medium: Resources: Stateful - Add missing meta defaults + Medium: stonithd: exit if the pid file cannot be locked + Medium: Tools: Allow attrd clients to specify the ID the attribute should be created with + Medium: Tools: attrd - Allow attribute updates to be performed from a host's peer + Medium: Tools: Bug LF:1994 - Clean up crm_verify return codes + Medium: Tools: Change the pingd defaults to ping hosts once every second (instead of 5 times every 10 seconds) + Medium: Tools: cibmin - Detect resource operations with a view to providing email/snmp/cim notification + Medium: Tools: crm cli: add back symmetrical for order constraints + Medium: 
Tools: crm cli: generate role in location when converting from xml + Medium: Tools: crm cli: handle shlex exceptions + Medium: Tools: crm cli: keep order of help topics + Medium: Tools: crm cli: refine completion for ids in configure + Medium: Tools: crm cli: replace inf with INFINITY + Medium: Tools: crm cli: streamline cib load and parsing + Medium: Tools: crm cli: supply provider only for ocf class primitives + Medium: Tools: crm_mon - Add support for sending mail notifications of resource events + Medium: Tools: crm_mon - Include the DC version in status summary + Medium: Tools: crm_mon - Sanitize startup and option processing + Medium: Tools: crm_mon - switch to event-driven updates and add support for sending snmp traps + Medium: Tools: crm_shadow - Replace the --locate option with the saner --edit + Medium: Tools: hb2openais: do not remove Evmsd resources, but replace them with clvmd + Medium: Tools: hb2openais: replace crmadmin with crm_mon + Medium: Tools: hb2openais: replace the lsb class with ocf for o2cb + Medium: Tools: hb2openais: reuse code + Medium: Tools: LF:2029 - Display an error if crm_resource is used to reset the operation history of non-primitive resources + Medium: Tools: Make pingd resilient to attrd failures + Medium: Tools: pingd - fix the command line switches + Medium: Tools: Rename ccm_tool to crm_node * Tue Nov 18 2008 Andrew Beekhof - 1.0.1-1 - Update source tarball to revision: 6fc5ce8302ab (stable-1.0) tip - Statistics: Changesets: 170 Diff: 816 files changed, 7633 insertions(+), 6286 deletions(-) - Changes since Pacemaker-1.0.0 + ais: Allow the crmd to get callbacks whenever a node state changes + ais: Create an option for starting the mgmtd daemon automatically + ais: Ensure HA_RSCTMP exists for use by resource agents + ais: Hook up the openais.conf config logging options + ais: Zero out the PID of disconnecting clients + cib: Ensure global updates cause a disk write when appropriate + Core: Add an extra snaity check to 
getXpathResults() to prevent segfaults + Core: Do not redefine __FUNCTION__ unnecessarily + Core: Repair the ability to have comments in the configuration + crmd: Bug:1975 - crmd should wait indefinitely for stonith operations to complete + crmd: Ensure PE processing does not occur for all error cases in do_pe_invoke_callback + crmd: Requests to the CIB should cause any prior PE calculations to be ignored + heartbeat: Wait for membership 'up' events before removing stale node status data + pengine: Bug LF:1988 - Ensure recurring operations always have the correct target-rc set + pengine: Bug LF:1988 - For unmanaged resources we need to skip the usual can_run_resources() checks + pengine: Ensure the terminate node attribute is handled correctly + pengine: Fix optional colocation + pengine: Improve the detection of 'new' nodes joining the cluster + pengine: Prevent assert failures in master_color() by ensuring unmanaged masters are always reallocated to their current location + Tools: crm cli: parser: return False on syntax error and None for comments + Tools: crm cli: unify template and edit commands + Tools: crm_shadow - Show more line number information after validation failures + Tools: hb2openais: add option to upgrade the CIB to v3.0 + Tools: hb2openais: add U option to getopts and update usage + Tools: hb2openais: backup improved and multiple fixes + Tools: hb2openais: fix class/provider reversal + Tools: hb2openais: fix testing + Tools: hb2openais: move the CIB update to the end + Tools: hb2openais: update logging and set logfile appropriately + Tools: LF:1969 - Attrd never sets any properties in the cib + Tools: Make attrd functional on OpenAIS + Medium: ais: Hook up the options for specifying the expected number of nodes and total quorum votes + Medium: ais: Look for pacemaker options inside the service block with 'name: pacemaker' instead of creating an additional configuration block + Medium: ais: Provide better feedback when nodes change nodeids (in 
openais.conf) + Medium: cib: Always store cib contents on disk with num_updates=0 + Medium: cib: Ensure remote access ports are cleaned up on shutdown + Medium: crmd: Detect deleted resource operations automatically + Medium: crmd: Erase a node's resource operations and transient attributes after a successful STONITH + Medium: crmd: Find a more appropriate place to update quorum and refresh attrd attributes + Medium: crmd: Fix the handling of unexpected PE exits to ensure the current CIB is stored + Medium: crmd: Fix the recording of pending operations in the CIB + Medium: crmd: Initiate an attrd refresh _after_ the status section has been fully repopulated + Medium: crmd: Only the DC should update quorum in an openais cluster + Medium: Ensure meta attributes are used consistently + Medium: pengine: Allow group and clone level resource attributes + Medium: pengine: Bug N:437719 - Ensure scores from colocated resources count when allocating groups + Medium: pengine: Prevent lsb scripts from being used in globally unique clones + Medium: pengine: Make a best-effort guess at a migration threshold for people with 0.6 configs + Medium: Resources: controld - ensure we are part of a clone with globally_unique=false + Medium: Tools: attrd - Automatically refresh all attributes after a CIB replace operation + Medium: Tools: Bug LF:1985 - crm_mon - Correctly process failed cib queries to allow reconnection after cluster restarts + Medium: Tools: Bug LF:1987 - crm_verify incorrectly warns of configuration upgrades for the most recent version + Medium: Tools: crm (bnc#441028): check for key error in attributes management + Medium: Tools: crm_mon - display the meaning of the operation rc code instead of the status + Medium: Tools: crm_mon - Fix the display of timing data + Medium: Tools: crm_verify - check that we are being asked to validate a complete config + Medium: xml: Relax the restriction on the contents of rsc_location.node * Thu Oct 16 2008 Andrew Beekhof - 1.0.0-1 - 
Update source tarball to revision: 388654dfef8f tip - Statistics: Changesets: 261 Diff: 3021 files changed, 244985 insertions(+), 111596 deletions(-) - Changes since f805e1b30103 + add the crm cli program + ais: Move the service id definition to a common location and make sure it is always used + build: rename hb2openais.sh to .in and replace paths with vars + cib: Implement --create for crm_shadow + cib: Remove dead files + Core: Allow the expected number of quorum votes to be configurable + Core: cl_malloc and friends were removed from Heartbeat + Core: Only call xmlCleanupParser() if we parsed anything. Doing so unconditionally seems to cause a segfault + hb2openais.sh: improve pingd handling; several bugs fixed + hb2openais: fix clone creation; replace EVMS strings + new hb2openais.sh conversion script + pengine: Bug LF:1950 - Ensure the current values for all notification variables are always set (even if empty) + pengine: Bug LF:1955 - Ensure unmanaged masters are unconditionally repromoted to ensure they are monitored correctly. 
+ pengine: Bug LF:1955 - Fix another case of filtering causing unmanaged master failures + pengine: Bug LF:1955 - Umanaged mode prevents master resources from being allocated correctly + pengine: Bug N:420538 - Anit-colocation caused a positive node preference + pengine: Correctly handle unmanaged resources to prevent them from being started elsewhere + pengine: crm_resource - Fix the --migrate command + pengine: MAke stonith-enabled default to true and warn if no STONITH resources are found + pengine: Make sure orphaned clone children are created correctly + pengine: Monitors for unmanaged resources do not need to wait for start/promote/demote actions to complete + stonithd (LF 1951): fix remote stonith operations + stonithd: fix handling of timeouts + stonithd: fix logic for stonith resource priorities + stonithd: implement the fence-timeout instance attribute + stonithd: initialize value before reading fence-timeout + stonithd: set timeouts for fencing ops to the timeout of the start op + stonithd: stonith rsc priorities (new feature) + Tools: Add hb2openais - a tool for upgrading a Heartbeat cluster to use OpenAIS instead + Tools: crm_verify - clean up the upgrade logic to prevent crash on invalid configurations + Tools: Make pingd functional on Linux + Update version numbers for 1.0 candidates + Medium: ais: Add support for a synchronous call to retrieve the nodes nodeid + Medium: ais: Use the agreed service number + Medium: Build: Reliably detect heartbeat libraries during configure + Medium: Build: Supply prototypes for libreplace functions when needed + Medium: Build: Teach configure how to find corosync + Medium: Core: Provide better feedback if Pacemaker is started by a stack it does not support + Medium: crmd: Avoid calling GHashTable functions with NULL + Medium: crmd: Delay raising I_ERROR when the PE exits until we have had a chance to save the current CIB + Medium: crmd: Hook up the stonith-timeout option to stonithd + Medium: crmd: Prevent potential 
use-of-NULL in global_timer_callback + Medium: crmd: Rationalize the logging of graph aborts + Medium: pengine: Add a stonith_timeout option and remove new options that are better set in rsc_defaults + Medium: pengine: Allow external entities to ask for a node to be shot by creating a terminate=true transient node attribute + Medium: pengine: Bug LF:1950 - Notifications do not contain all documented resource state fields + Medium: pengine: Bug N:417585 - Do not restart group children whos individual score drops below zero + Medium: pengine: Detect clients that disconnect before receiving their reply + Medium: pengine: Implement a true maintenance mode + Medium: pengine: Implement on-fail=standby for NTT. Derived from a patch by Satomi TANIGUCHI + Medium: pengine: Print the correct message when stonith is disabled + Medium: pengine: ptest - check the input is valid before proceeding + Medium: pengine: Revert group stickiness to the 'old way' + Medium: pengine: Use the correct attribute for action 'requires' (was prereq) + Medium: stonithd: Fix compilation without full heartbeat install + Medium: stonithd: exit with better code on empty host list + Medium: tools: Add a new regression test for CLI tools + Medium: tools: crm_resource - return with non-zero when a resource migration command is invalid + Medium: tools: crm_shadow - Allow the admin to start with an empty CIB (and no cluster connection) + Medium: xml: pacemaker-0.7 is now an alias for the 1.0 schema * Mon Sep 22 2008 Andrew Beekhof - 0.7.3-1 - Update source tarball to revision: 33e677ab7764+ tip - Statistics: Changesets: 133 Diff: 89 files changed, 7492 insertions(+), 1125 deletions(-) - Changes since f805e1b30103 + Tools: add the crm cli program + Core: cl_malloc and friends were removed from Heartbeat + Core: Only call xmlCleanupParser() if we parsed anything. 
Doing so unconditionally seems to cause a segfault + new hb2openais.sh conversion script + pengine: Bug LF:1950 - Ensure the current values for all notification variables are always set (even if empty) + pengine: Bug LF:1955 - Ensure unmanaged masters are unconditionally repromoted to ensure they are monitored correctly. + pengine: Bug LF:1955 - Fix another case of filtering causing unmanaged master failures + pengine: Bug LF:1955 - Umanaged mode prevents master resources from being allocated correctly + pengine: Bug N:420538 - Anit-colocation caused a positive node preference + pengine: Correctly handle unmanaged resources to prevent them from being started elsewhere + pengine: crm_resource - Fix the --migrate command + pengine: MAke stonith-enabled default to true and warn if no STONITH resources are found + pengine: Make sure orphaned clone children are created correctly + pengine: Monitors for unmanaged resources do not need to wait for start/promote/demote actions to complete + stonithd (LF 1951): fix remote stonith operations + Tools: crm_verify - clean up the upgrade logic to prevent crash on invalid configurations + Medium: ais: Add support for a synchronous call to retrieve the nodes nodeid + Medium: ais: Use the agreed service number + Medium: pengine: Allow external entities to ask for a node to be shot by creating a terminate=true transient node attribute + Medium: pengine: Bug LF:1950 - Notifications do not contain all documented resource state fields + Medium: pengine: Bug N:417585 - Do not restart group children whos individual score drops below zero + Medium: pengine: Implement a true maintenance mode + Medium: pengine: Print the correct message when stonith is disabled + Medium: stonithd: exit with better code on empty host list + Medium: xml: pacemaker-0.7 is now an alias for the 1.0 schema * Wed Aug 20 2008 Andrew Beekhof - 0.7.1-1 - Update source tarball to revision: f805e1b30103+ tip - Statistics: Changesets: 184 Diff: 513 files changed, 43408 
insertions(+), 43783 deletions(-) - Changes since 0.7.0-19 + Fix compilation when GNUTLS isn't found + admin: Fix use-after-free in crm_mon + Build: Remove testing code that prevented heartbeat-only builds + cib: Use single quotes so that the xpath queries for nvpairs will succeed + crmd: Always connect to stonithd when the TE starts and ensure we notice if it dies + crmd: Correctly handle a dead PE process + crmd: Make sure async-failures cause the failcount to be incremented + pengine: Bug LF:1941 - Handle failed clone instance probes when clone-max < #nodes + pengine: Parse resource ordering sets correctly + pengine: Prevent use-of-NULL - order->rsc_rh will not always be non-NULL + pengine: Unpack colocation sets correctly + Tools: crm_mon - Prevent use-of-NULL for orphaned resources + Medium: ais: Add support for a synchronous call to retrieve the nodes nodeid + Medium: ais: Allow transient clients to receive membership updates + Medium: ais: Avoid double-free in error path + Medium: ais: Include in the mebership nodes for which we have not determined their hostname + Medium: ais: Spawn the PE from the ais plugin instead of the crmd + Medium: cib: By default, new configurations use the latest schema + Medium: cib: Clean up the CIB if it was already disconnected + Medium: cib: Only increment num_updates if something actually changed + Medium: cib: Prevent use-after-free in client after abnormal termination of the CIB + Medium: Core: Fix memory leak in xpath searches + Medium: Core: Get more details regarding parser errors + Medium: Core: Repair expand_plus_plus - do not call char2score on unexpanded values + Medium: Core: Switch to the libxml2 parser - its significantly faster + Medium: Core: Use a libxml2 library function for xml -> text conversion + Medium: crmd: Asynchronous failure actions have no parameters + Medium: crmd: Avoid calling glib functions with NULL + Medium: crmd: Do not allow an election to promote a node from S_STARTING + Medium: crmd: Do not 
vote if we have not completed the local startup + Medium: crmd: Fix te_update_diff() now that get_object_root() functions differently + Medium: crmd: Fix the lrmd xpath expressions to not contain quotes + Medium: crmd: If we get a join offer during an election, better restart the election + Medium: crmd: No further processing is needed when using the LRMs API call for failing resources + Medium: crmd: Only update have-quorum if the value changed + Medium: crmd: Repair the input validation logic in do_te_invoke + Medium: cts: CIBs can no longer contain comments + Medium: cts: Enable a bunch of tests that were incorrectly disabled + Medium: cts: The libxml2 parser wont allow v1 resources to use integers as parameter names + Medium: Do not use the cluster UID and GID directly. Look them up based on the configured value of HA_CCMUSER + Medium: Fix compilation when heartbeat is not supported + Medium: pengine: Allow groups to be involved in optional ordering constraints + Medium: pengine: Allow sets of operations to be reused by multiple resources + Medium: pengine: Bug LF:1941 - Mark extra clone instances as orphans and do not show inactive ones + Medium: pengine: Determin the correct migration-threshold during resource expansion + Medium: pengine: Implement no-quorum-policy=suicide (FATE #303619) + Medium: pengine: Clean up resources after stopping old copies of the PE + Medium: pengine: Teach the PE how to stop old copies of itself + Medium: Tools: Backport hb_report updates + Medium: Tools: cib_shadow - On create, spawn a new shell with CIB_shadow and PS1 set accordingly + Medium: Tools: Rename cib_shadow to crm_shadow * Fri Jul 18 2008 Andrew Beekhof - 0.7.0-19 - Update source tarball to revision: 007c3a1c50f5 (unstable) tip - Statistics: Changesets: 108 Diff: 216 files changed, 4632 insertions(+), 4173 deletions(-) - Changes added since unstable-0.7 + admin: Fix use-after-free in crm_mon + ais: Change the tag for the ais plugin to "pacemaker" (used in 
openais.conf) + ais: Log terminated processes as an error + cib: Performance - Reorganize things to avoid calculating the XML diff twice + pengine: Bug LF:1941 - Handle failed clone instance probes when clone-max < #nodes + pengine: Fix memory leak in action2xml + pengine: Make OCF_ERR_ARGS a node-level error rather than a cluster-level one + pengine: Properly handle clones that are not installed on all nodes + Medium: admin: cibadmin - Show any validation errors if the upgrade failed + Medium: admin: cib_shadow - Implement --locate to display the underlying filename + Medium: admin: cib_shadow - Implement a --diff option + Medium: admin: cib_shadow - Implement a --switch option + Medium: admin: crm_resource - create more compact constraints that do not use lifetime (which is deprecated) + Medium: ais: Approximate born_on for OpenAIS based clusters + Medium: cib: Remove do_id_check, it is a poor substitute for ID validation by a schema + Medium: cib: Skip construction of pre-notify messages if no-one wants one + Medium: Core: Attempt to streamline some key functions to increase performance + Medium: Core: Clean up XML parser after validation + Medium: crmd: Detect and optimize the CRMs behavior when processing diffs of an LRM refresh + Medium: Fix memory leaks when resetting the name of an XML object + Medium: pengine: Prefer the current location if it is one of a group of nodes with the same (highest) score * Wed Jun 25 2008 Andrew Beekhof - 0.7.0-1 - Update source tarball to revision: bde0c7db74fb tip - Statistics: Changesets: 439 Diff: 676 files changed, 41310 insertions(+), 52071 deletions(-) - Changes added since stable-0.6 + A new tool for setting up and invoking CTS + Admin: All tools now use --node (-N) for specifying node unames + Admin: All tools now use --xml-file (-x) and --xml-text (-X) for specifying where to find XML blobs + cib: Cleanup the API - remove redundant input fields + cib: Implement CIB_shadow - a facility for making and testing changes 
before uploading them to the cluster + cib: Make registering per-op callbacks an API call and renamed (for clarity) the API call for requesting notifications + Core: Add a facility for automatically upgrading old configurations + Core: Adopt libxml2 as the XML processing library - all external clients need to be recompiled + Core: Allow sending TLS messages larger than the MTU + Core: Fix parsing of time-only ISO dates + Core: Smarter handling of XML values containing quotes + Core: XML memory corruption - catch, and handle, cases where we are overwriting an attribute value with itself + Core: The xml ID type does not allow UUIDs that start with a number + Core: Implement XPath based versions of query/delete/replace/modify + Core: Remove some HA2.0.(3,4) compatibility code + crmd: Overhaul the detection of nodes that are starting vs. failed + pengine: Bug LF:1459 - Allow failures to expire + pengine: Have the PE do non-persistent configuration upgrades before performing calculations + pengine: Replace failure-stickiness with a simple 'migration-threshold' + tengine: Simplify the design by folding the tengine process into the crmd + Medium: Admin: Bug LF:1438 - Allow the list of all/active resource operations to be queried by crm_resource + Medium: Admin: Bug LF:1708 - crm_resource should print a warning if an attribute is already set as a meta attribute + Medium: Admin: Bug LF:1883 - crm_mon should display fail-count and operation history + Medium: Admin: Bug LF:1883 - crm_mon should display operation timing data + Medium: Admin: Bug N:371785 - crm_resource -C does not also clean up fail-count attributes + Medium: Admin: crm_mon - include timing data for failed actions + Medium: ais: Read options from the environment since objdb is not completely usable yet + Medium: cib: Add sections for op_defaults and rsc_defaults + Medium: cib: Better matching notification callbacks (for detecting duplicates and removal) + Medium: cib: Bug LF:1348 - Allow rules and attribute 
sets to be referenced for use in other objects + Medium: cib: BUG LF:1918 - By default, all cib calls now timeout after 30s + Medium: cib: Detect updates that decrease the version tuple + Medium: cib: Implement a client-side operation timeout - Requires LHA update + Medium: cib: Implement callbacks and async notifications for remote connections + Medium: cib: Make cib->cmds->update() an alias for modify at the API level (also implemented in cibadmin) + Medium: cib: Mark the CIB as disconnected if the IPC connection is terminated + Medium: cib: New call option 'cib_can_create' which can be passed to modify actions - allows the object to be created if it does not exist yet + Medium: cib: Reimplement get|set|delete attributes using XPath + Medium: cib: Remove some useless parts of the API + Medium: cib: Remove the 'attributes' scaffolding from the new format + Medium: cib: Implement the ability for clients to connect to remote servers + Medium: Core: Add support for validating xml against RelaxNG schemas + Medium: Core: Allow more than one item to be modified/deleted in XPath based operations + Medium: Core: Fix the sort_pairs function for creating sorted xml objects + Medium: Core: iso8601 - Implement subtract_duration and fix subtract_time + Medium: Core: Reduce the amount of xml copying occuring + Medium: Core: Support value='value+=N' XML updates (in addtion to value='value++') + Medium: crmd: Add support for lrm_ops->fail_rsc if its available + Medium: crmd: HB - watch link status for node leaving events + Medium: crmd: Bug LF:1924 - Improved handling of lrmd disconnects and shutdowns + Medium: crmd: Do not wait for actions with a start_delay over 5 minutes. 
Confirm them immediately + Medium: pengine: Bug LF:1328 - Do not fence nodes in clusters without managed resources + Medium: pengine: Bug LF:1461 - Give transient node attributes (in ) preference over persistent ones (in ) + Medium: pengine: Bug LF:1884, Bug LF:1885 - Implement N:M ordering and colocation constraints + Medium: pengine: Bug LF:1886 - Create a resource and operation 'defaults' config section + Medium: pengine: Bug LF:1892 - Allow recurring actions to be triggered at known times + Medium: pengine: Bug LF:1926 - Probes should complete before stop actions are invoked + Medium: pengine: Fix the standby when it's set as a transient attribute + Medium: pengine: Implement a global 'stop-all-resources' option + Medium: pengine: Implement cibpipe, a tool for performing/simulating config changes "offline" + Medium: pengine: We do not allow colocation with specific clone instances + Medium: Tools: pingd - Implement a stack-independent version of pingd + Medium: xml: Ship an xslt for upgrading from 0.6 to 0.7 * Thu Jun 19 2008 Andrew Beekhof - 0.6.5-1 - Update source tarball to revision: b9fe723d1ac5 tip - Statistics: Changesets: 48 Diff: 37 files changed, 1204 insertions(+), 234 deletions(-) - Changes since Pacemaker-0.6.4 + Admin: Repair the ability to delete failcounts + ais: Audit IPC handling between the AIS plugin and CRM processes + ais: Have the plugin create needed /var/lib directories + ais: Make sure the sync and async connections are assigned correctly (not swapped) + cib: Correctly detect configuration changes - num_updates does not count + pengine: Apply stickiness values to the whole group, not the individual resources + pengine: Bug N:385265 - Ensure groups are migrated instead of remaining partially active on the current node + pengine: Bug N:396293 - Enforce mandatory group restarts due to ordering constraints + pengine: Correctly recover master instances found active on more than one node + pengine: Fix memory leaks reported by Valgrind + 
Medium: Admin: crm_mon - Misc improvements from Satomi Taniguchi + Medium: Bug LF:1900 - Resource stickiness should not allow placement in asynchronous clusters + Medium: crmd: Ensure joins are completed promptly when a node taking part dies + Medium: pengine: Avoid clone instance shuffling in more cases + Medium: pengine: Bug LF:1906 - Remove an optimization in native_merge_weights() causing group scores to behave erratically + Medium: pengine: Make use of target_rc data to correctly process resource operations + Medium: pengine: Prevent a possible use of NULL in sort_clone_instance() + Medium: tengine: Include target rc in the transition key - used to correctly determine operation failure * Thu May 22 2008 Andrew Beekhof - 0.6.4-1 - Update source tarball to revision: 226d8e356924 tip - Statistics: Changesets: 55 Diff: 199 files changed, 7103 insertions(+), 12378 deletions(-) - Changes since Pacemaker-0.6.3 + crmd: Bug LF:1881 LF:1882 - Overhaul the logic for operation cancellation and deletion + crmd: Bug LF:1894 - Make sure cancelled recurring operations are cleaned out from the CIB + pengine: Bug N:387749 - Colocation with clones causes unnecessary clone instance shuffling + pengine: Ensure 'master' monitor actions are cancelled _before_ we demote the resource + pengine: Fix assert failure leading to core dump - make sure variable is properly initialized + pengine: Make sure 'slave' monitoring happens after the resource has been demoted + pengine: Prevent failure stickiness underflows (where too many failures become a _positive_ preference) + Medium: Admin: crm_mon - Only complain if the output file could not be opened + Medium: Common: filter_action_parameters - enable legacy handling only for older versions + Medium: pengine: Bug N:385265 - The failure stickiness of group children is ignored until it reaches -INFINITY + Medium: pengine: Implement master and clone colocation by excluding nodes rather than setting ones score to INFINITY (similar to cs: 
756afc42dc51) + Medium: tengine: Bug LF:1875 - Correctly find actions to cancel when their node leaves the cluster * Wed Apr 23 2008 Andrew Beekhof - 0.6.3-1 - Update source tarball to revision: fd8904c9bc67 tip - Statistics: Changesets: 117 Diff: 354 files changed, 19094 insertions(+), 11338 deletions(-) - Changes since Pacemaker-0.6.2 + Admin: Bug LF:1848 - crm_resource - Pass set name and id to delete_resource_attr() in the correct order + Build: SNMP has been moved to the management/pygui project + crmd: Bug LF1837 - Unmanaged resources prevent crmd from shutting down + crmd: Prevent use-after-free in lrm interface code (Patch based on work by Keisuke MORI) + pengine: Allow the cluster to make progress by not retrying failed demote actions + pengine: Anti-colocation with slave should not prevent master colocation + pengine: Bug LF 1768 - Wait more often for STONITH ops to complete before starting resources + pengine: Bug LF1836 - Allow is-managed-default=false to be overridden by individual resources + pengine: Bug LF185 - Prevent pointless master/slave instance shuffling by ignoring the master-pref of stopped instances + pengine: Bug N-191176 - Implement interleaved ordering for clone-to-clone scenarios + pengine: Bug N-347004 - Ensure clone notifications are always sent when an instance is stopped/started + pengine: Bug N-347004 - Include notification ordering is correct for interleaved clones + pengine: Bug PM-11 - Directly link probe_complete to starting clone instances + pengine: Bug PM1 - Fix setting failcounts when applied to complex resources + pengine: Bug PM12, LF1648 - Extensive revision of group ordering + pengine: Bug PM7 - Ensure masters are always demoted before they are stopped + pengine: Create probes after allocation to allow smarter handling of anonymous clones + pengine: Do not prioritize clone instances that must be moved + pengine: Fix error in previous commit that allowed more than the required number of masters to be promoted + pengine: 
Group start ordering fixes + pengine: Implement promote/demote ordering for cloned groups + tengine: Repair failcount updates + tengine: Use the correct offset when updating failcount + Medium: Admin: Add a summary output that can be easily parsed by CTS for audit purposes + Medium: Build: Make configure fail if bz2 or libxml2 are not present + Medium: Build: Re-instate a better default for LCRSODIR + Medium: CIB: Bug LF-1861 - Filter irrelvant error status from synchronous CIB clients + Medium: Core: Bug 1849 - Invalid conversion of ordinal leap year to gregorian date + Medium: Core: Drop compatibility code for 2.0.4 and 2.0.5 clusters + Medium: crmd: Bug LF-1860 - Automatically cancel recurring ops before demote and promote operations (not only stops) + Medium: crmd: Save the current CIB contents if we detect the PE crashed + Medium: pengine: Bug LF:1866 - Fix version check when applying compatibility handling for failed start operations + Medium: pengine: Bug LF:1866 - Restore the ability to have start failures not be fatal + Medium: pengine: Bug PM1 - Failcount applies to all instances of non-unique clone + Medium: pengine: Correctly set the state of partially active master/slave groups + Medium: pengine: Do not claim to be stopping an already stopped orphan + Medium: pengine: Ensure implies_left ordering constraints are always effective + Medium: pengine: Indicate each resources 'promotion' score + Medium: pengine: Prevent a possible use-of-NULL + Medium: pengine: Reprocess the current action if it changed (so that any prior dependencies are updated) + Medium: tengine: Bug LF-1859 - Wait for fail-count updates to complete before terminating the transition + Medium: tengine: Bug LF:1859 - Do not abort graphs due to our own failcount updates + Medium: tengine: Bug LF:1859 - Prevent the TE from interupting itself * Thu Feb 14 2008 Andrew Beekhof - 0.6.2-1 - Update source tarball to revision: 28b1a8c1868b tip - Statistics: Changesets: 11 Diff: 7 files changed, 58 
insertions(+), 18 deletions(-) - Changes since Pacemaker-0.6.1 + haresources2cib.py: set default-action-timeout to the default (20s) + haresources2cib.py: update ra parameters lists + Medium: SNMP: Allow the snmp subagent to be built (patch from MATSUDA, Daiki) + Medium: Tools: Make sure the autoconf variables in haresources2cib are expanded * Tue Feb 12 2008 Andrew Beekhof - 0.6.1-1 - Update source tarball to revision: e7152d1be933 tip - Statistics: Changesets: 25 Diff: 37 files changed, 1323 insertions(+), 227 deletions(-) - Changes since Pacemaker-0.6.0 + CIB: Ensure changes to top-level attributes (like admin_epoch) cause a disk write + CIB: Ensure the archived file hits the disk before returning + CIB: Repair the ability to do 'atomic increment' updates (value="value++") + crmd: Bug #7 - Connecting to the crmd immediately after startup causes use-of-NULL + Medium: CIB: Mask cib_diff_resync results from the caller - they do not need to know + Medium: crmd: Delay starting the IPC server until we are fully functional + Medium: CTS: Fix the startup patterns + Medium: pengine: Bug 1820 - Allow the first resource in a group to be migrated + Medium: pengine: Bug 1820 - Check the colocation dependencies of resources to be migrated * Mon Jan 14 2008 Andrew Beekhof - 0.6.0-1 - This is the first release of the Pacemaker Cluster Resource Manager formerly part of Heartbeat. - For those looking for the GUI, mgmtd, CIM or TSA components, they are now found in the new pacemaker-pygui project. Build dependencies prevent them from being included in Heartbeat (since the built-in CRM is no longer supported) and, being non-core components, are not included with Pacemaker. 
- Update source tarball to revision: c94b92d550cf - Statistics: Changesets: 347 Diff: 2272 files changed, 132508 insertions(+), 305991 deletions(-) - Test hardware: + 6-node vmware cluster (sles10-sp1/256MB/vmware stonith) on a single host (opensuse10.3/2GB/2.66GHz Quad Core2) + 7-node EMC Centera cluster (sles10/512MB/2GHz Xeon/ssh stonith) - Notes: Heartbeat Stack + All testing was performed with STONITH enabled + The CRM was enabled using the "crm respawn" directive - Notes: OpenAIS Stack + This release contains a preview of support for the OpenAIS cluster stack + The current release of the OpenAIS project is missing two important patches that we require. OpenAIS packages containing these patches are available for most major distributions at: http://download.opensuse.org/repositories/server:/ha-clustering + The OpenAIS stack is not currently recommended for use in clusters that have shared data as STONITH support is not yet implemented + pingd is not yet available for use with the OpenAIS stack + 3 significant OpenAIS issues were found during testing of 4 and 6 node clusters. We are actively working together with the OpenAIS project to get these resolved. 
- Pending bugs encountered during testing: + OpenAIS #1736 - Openais membership took 20s to stabilize + Heartbeat #1750 - ipc_bufpool_update: magic number in head does not match + OpenAIS #1793 - Assertion failure in memb_state_gather_enter() + OpenAIS #1796 - Cluster message corruption - Changes since Heartbeat-2.1.2-24 + Add OpenAIS support + Admin: crm_uuid - Look in the right place for Heartbeat UUID files + admin: Exit and indicate a problem if the crmd exits while crmadmin is performing a query + cib: Fix CIB_OP_UPDATE calls that modify the whole CIB + cib: Fix compilation when supporting the heartbeat stack + cib: Fix memory leaks caused by the switch to get_message_xml() + cib: HA_VALGRIND_ENABLED needs to be set _and_ set to 1|yes|true + cib: Use get_message_xml() in preference to cl_get_struct() + cib: Use the return value from call to write() in cib_send_plaintext() + Core: ccm nodes can legitimately have a node id of 0 + Core: Fix peer-process tracking for the Heartbeat stack + Core: Heartbeat does not send status notifications for nodes that were already part of the cluster. 
Fake them instead + CRM: Add children to HA_Messages such that the field name matches F_XML_TAGNAME + crm: Adopt a more flexible approach to enabling Valgrind + crm: Fix compilation when bzip2 is not installed + CRM: Future-proof get_message_xml() + crmd: Filter election responses based on time not FSA state + crmd: Handle all possible peer states in crmd_ha_status_callback() + crmd: Make sure the current date/time is set - prevents use-of-NULL when evaluating rules + crmd: Relax an assertion regarding ccm membership instances + crmd: Use (node->processes&crm_proc_ais) to accurately update the CIB after replace operations + crmd: Heartbeat: Accurately record peer client status + pengine: Bug 1777 - Allow colocation with a resource in the Stopped state + pengine: Bug 1822 - Prevent use-of-NULL in PromoteRsc() + pengine: Implement three recovery policies based on op_status and op_rc + pengine: Parse fail-count correctly (it may be set to INFINITY) + pengine: Prevent graph-loop when stonith agents need to be moved around before a STONITH op + pengine: Prevent graph-loops when two operations have the same name+interval + tengine: Cancel active timers when destroying graphs + tengine: Ensure failcount is set correctly for failed stops/starts + tengine: Update failcount for operations that time out + Medium: admin: Prevent hang in crm_mon -1 when there is no cib connection - Patch from Junko IKEDA + Medium: cib: Require --force|-f when performing potentially dangerous commands with cibadmin + Medium: cib: Tweak the shutdown code + Medium: Common: Only count peer processes of active nodes + Medium: Core: Create generic cluster sign-in method + Medium: core: Fix compilation when Heartbeat support is disabled + Medium: Core: General cleanup for supporting two stacks + Medium: Core: iso8601 - Support parsing of time-only strings + Medium: core: Isolate more code that is only needed when SUPPORT_HEARTBEAT is enabled + Medium: crm: Improved logging of errors in the XML parser 
+ Medium: crmd: Fix potential use-of-NULL in string comparison + Medium: crmd: Reimplement synchronizing of CIB queries and updates when invoking the PE + Medium: crm_mon: Indicate when a node is both in standby mode and offline + Medium: pengine: Bug 1822 - Do not try to promote groups if not all of it is active + Medium: pengine: on_fail=nothing is an alias for 'ignore' not 'restart' + Medium: pengine: Prevent a potential use-of-NULL in cron_range_satisfied() + snmp subagent: fix a problem on displaying an unmanaged group + snmp subagent: use the syslog setting + snmp: v2 support (thanks to Keisuke MORI) + snmp_subagent - made it not complain about some things if shutting down diff --git a/doc/Clusters_from_Scratch/pot/Ap-Configuration.pot b/doc/Clusters_from_Scratch/pot/Ap-Configuration.pot index 12fc354d9e..0774e362d0 100644 --- a/doc/Clusters_from_Scratch/pot/Ap-Configuration.pot +++ b/doc/Clusters_from_Scratch/pot/Ap-Configuration.pot @@ -1,536 +1,536 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Configuration Recap" msgstr "" #. Tag: title #, no-c-format msgid "Final Cluster Configuration" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-1 pcmk-2 ]\n" " Clone Set: dlm-clone [dlm]\n" " Started: [ pcmk-1 pcmk-2 ]\n" " Clone Set: ClusterIP-clone [ClusterIP] (unique)\n" " ClusterIP:0 (ocf::heartbeat:IPaddr2): Started\n" " ClusterIP:1 (ocf::heartbeat:IPaddr2): Started\n" " Clone Set: WebFS-clone [WebFS]\n" " Started: [ pcmk-1 pcmk-2 ]\n" " Clone Set: WebSite-clone [WebSite]\n" " Started: [ pcmk-1 pcmk-2 ]" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource op defaults\n" "timeout: 240s" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs stonith\n" " impi-fencing (stonith:fence_ipmilan) Started" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs constraint\n" "Location Constraints:\n" "Ordering Constraints:\n" " start ClusterIP-clone then start WebSite-clone (kind:Mandatory)\n" " promote WebDataClone then start WebFS-clone (kind:Mandatory)\n" " start WebFS-clone then start WebSite-clone (kind:Mandatory)\n" " start dlm-clone then start WebFS-clone (kind:Mandatory)\n" "Colocation Constraints:\n" " WebSite-clone with ClusterIP-clone (score:INFINITY)\n" " WebFS-clone with WebDataClone (score:INFINITY) (with-rsc-role:Master)\n" " WebSite-clone with WebFS-clone (score:INFINITY)\n" " WebFS-clone with dlm-clone (score:INFINITY)" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Aug 14 12:05:37 2015\n" "Last change: Fri Aug 14 11:49:29 2015\n" "Stack: corosync\n" "Current DC: pcmk-1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "11 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " impi-fencing (stonith:fence_ipmilan): Started pcmk-1\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-1 pcmk-2 ]\n" " Clone Set: dlm-clone [dlm]\n" " Started: [ pcmk-1 pcmk-2 ]\n" " Clone Set: ClusterIP-clone [ClusterIP] (unique)\n" " ClusterIP:0 (ocf::heartbeat:IPaddr2): Started pcmk-2\n" " ClusterIP:1 (ocf::heartbeat:IPaddr2): Started pcmk-1\n" " Clone Set: WebFS-clone [WebFS]\n" " Started: [ pcmk-1 pcmk-2 ]\n" " Clone Set: WebSite-clone [WebSite]\n" " Started: [ pcmk-1 pcmk-2 ]\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<cib crm_feature_set=\"3.0.9\" validate-with=\"pacemaker-2.3\" epoch=\"51\" num_updates=\"16\" admin_epoch=\"0\" cib-last-written=\"Fri Aug 14 11:49:29 2015\" have-quorum=\"1\" dc-uuid=\"1\">\n" " <crm_config>\n" " <cluster_property_set id=\"cib-bootstrap-options\">\n" " <nvpair id=\"cib-bootstrap-options-have-watchdog\" name=\"have-watchdog\" value=\"false\"/>\n" " <nvpair id=\"cib-bootstrap-options-dc-version\" name=\"dc-version\" value=\"1.1.12-a14efad\"/>\n" " <nvpair id=\"cib-bootstrap-options-cluster-infrastructure\" name=\"cluster-infrastructure\" value=\"corosync\"/>\n" " <nvpair id=\"cib-bootstrap-options-cluster-name\" name=\"cluster-name\" value=\"mycluster\"/>\n" " <nvpair id=\"cib-bootstrap-options-last-lrm-refresh\" name=\"last-lrm-refresh\" value=\"1419129162\"/>\n" " <nvpair id=\"cib-bootstrap-options-stonith-enabled\" name=\"stonith-enabled\" value=\"true\"/>\n" " </cluster_property_set>\n" " </crm_config>\n" " <nodes>\n" " <node id=\"1\" uname=\"pcmk-1\">\n" " <instance_attributes id=\"nodes-1\"/>\n" " </node>\n" " <node id=\"2\" uname=\"pcmk-2\">\n" " <instance_attributes id=\"nodes-2\"/>\n" " </node>\n" " </nodes>\n" " <resources>\n" " <primitive class=\"stonith\" id=\"impi-fencing\" type=\"fence_ipmilan\">\n" " <instance_attributes id=\"impi-fencing-instance_attributes\">\n" " <nvpair id=\"impi-fencing-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"pcmk-1 pcmk-2\"/>\n" " <nvpair id=\"impi-fencing-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"10.0.0.1\"/>\n" " <nvpair id=\"impi-fencing-instance_attributes-login\" name=\"login\" value=\"testuser\"/>\n" " <nvpair id=\"impi-fencing-instance_attributes-passwd\" name=\"passwd\" value=\"acd123\"/>\n" " </instance_attributes>\n" " <operations>\n" " <op id=\"impi-fencing-interval-60s\" interval=\"60s\" name=\"monitor\"/>\n" " </operations>\n" " </primitive>\n" " <master id=\"WebDataClone\">\n" " <primitive class=\"ocf\" 
id=\"WebData\" provider=\"linbit\" type=\"drbd\">\n" " <instance_attributes id=\"WebData-instance_attributes\">\n" " <nvpair id=\"WebData-instance_attributes-drbd_resource\" name=\"drbd_resource\" value=\"wwwdata\"/>\n" " </instance_attributes>\n" " <operations>\n" " <op id=\"WebData-start-timeout-240\" interval=\"0s\" name=\"start\" timeout=\"240\"/>\n" " <op id=\"WebData-promote-timeout-90\" interval=\"0s\" name=\"promote\" timeout=\"90\"/>\n" " <op id=\"WebData-demote-timeout-90\" interval=\"0s\" name=\"demote\" timeout=\"90\"/>\n" " <op id=\"WebData-stop-timeout-100\" interval=\"0s\" name=\"stop\" timeout=\"100\"/>\n" " <op id=\"WebData-monitor-interval-60s\" interval=\"60s\" name=\"monitor\"/>\n" " </operations>\n" " </primitive>\n" " <meta_attributes id=\"WebDataClone-meta_attributes\">\n" " <nvpair id=\"WebDataClone-meta_attributes-master-max\" name=\"master-max\" value=\"2\"/>\n" " <nvpair id=\"WebDataClone-meta_attributes-master-node-max\" name=\"master-node-max\" value=\"1\"/>\n" " <nvpair id=\"WebDataClone-meta_attributes-clone-max\" name=\"clone-max\" value=\"2\"/>\n" " <nvpair id=\"WebDataClone-meta_attributes-clone-node-max\" name=\"clone-node-max\" value=\"1\"/>\n" " <nvpair id=\"WebDataClone-meta_attributes-notify\" name=\"notify\" value=\"true\"/>\n" " </meta_attributes>\n" " </master>\n" " <clone id=\"dlm-clone\">\n" " <primitive class=\"ocf\" id=\"dlm\" provider=\"pacemaker\" type=\"controld\">\n" " <instance_attributes id=\"dlm-instance_attributes\"/>\n" " <operations>\n" " <op id=\"dlm-start-timeout-90\" interval=\"0s\" name=\"start\" timeout=\"90\"/>\n" " <op id=\"dlm-stop-timeout-100\" interval=\"0s\" name=\"stop\" timeout=\"100\"/>\n" " <op id=\"dlm-monitor-interval-60s\" interval=\"60s\" name=\"monitor\"/>\n" " </operations>\n" " </primitive>\n" " <meta_attributes id=\"dlm-clone-meta\">\n" " <nvpair id=\"dlm-clone-max\" name=\"clone-max\" value=\"2\"/>\n" " <nvpair id=\"dlm-clone-node-max\" name=\"clone-node-max\" value=\"1\"/>\n" " 
</meta_attributes>\n" " </clone>\n" " <clone id=\"ClusterIP-clone\">\n" " <primitive class=\"ocf\" id=\"ClusterIP\" provider=\"heartbeat\" type=\"IPaddr2\">\n" " <instance_attributes id=\"ClusterIP-instance_attributes\">\n" " <nvpair id=\"ClusterIP-instance_attributes-ip\" name=\"ip\" value=\"192.168.122.120\"/>\n" " <nvpair id=\"ClusterIP-instance_attributes-cidr_netmask\" name=\"cidr_netmask\" value=\"32\"/>\n" " <nvpair id=\"ClusterIP-instance_attributes-clusterip_hash\" name=\"clusterip_hash\" value=\"sourceip\"/>\n" " </instance_attributes>\n" " <operations>\n" " <op id=\"ClusterIP-start-timeout-20s\" interval=\"0s\" name=\"start\" timeout=\"20s\"/>\n" " <op id=\"ClusterIP-stop-timeout-20s\" interval=\"0s\" name=\"stop\" timeout=\"20s\"/>\n" " <op id=\"ClusterIP-monitor-interval-30s\" interval=\"30s\" name=\"monitor\"/>\n" " </operations>\n" " <meta_attributes id=\"ClusterIP-meta_attributes\"/>\n" " </primitive>\n" " <meta_attributes id=\"ClusterIP-clone-meta\">\n" " <nvpair id=\"ClusterIP-clone-max\" name=\"clone-max\" value=\"2\"/>\n" " <nvpair id=\"ClusterIP-clone-node-max\" name=\"clone-node-max\" value=\"2\"/>\n" " <nvpair id=\"ClusterIP-globally-unique\" name=\"globally-unique\" value=\"true\"/>\n" " </meta_attributes>\n" " </clone>\n" " <clone id=\"WebFS-clone\">\n" " <primitive class=\"ocf\" id=\"WebFS\" provider=\"heartbeat\" type=\"Filesystem\">\n" " <instance_attributes id=\"WebFS-instance_attributes\">\n" " <nvpair id=\"WebFS-instance_attributes-device\" name=\"device\" value=\"/dev/drbd1\"/>\n" " <nvpair id=\"WebFS-instance_attributes-directory\" name=\"directory\" value=\"/var/www/html\"/>\n" " <nvpair id=\"WebFS-instance_attributes-fstype\" name=\"fstype\" value=\"gfs2\"/>\n" " </instance_attributes>\n" " <operations>\n" " <op id=\"WebFS-start-timeout-60\" interval=\"0s\" name=\"start\" timeout=\"60\"/>\n" " <op id=\"WebFS-stop-timeout-60\" interval=\"0s\" name=\"stop\" timeout=\"60\"/>\n" " <op id=\"WebFS-monitor-interval-20\" interval=\"20\" 
name=\"monitor\" timeout=\"40\"/>\n" " </operations>\n" " <meta_attributes id=\"WebFS-meta_attributes\"/>\n" " </primitive>\n" " <meta_attributes id=\"WebFS-clone-meta\"/>\n" " </clone>\n" " <clone id=\"WebSite-clone\">\n" " <primitive class=\"ocf\" id=\"WebSite\" provider=\"heartbeat\" type=\"apache\">\n" " <instance_attributes id=\"WebSite-instance_attributes\">\n" " <nvpair id=\"WebSite-instance_attributes-configfile\" name=\"configfile\" value=\"/etc/httpd/conf/httpd.conf\"/>\n" " <nvpair id=\"WebSite-instance_attributes-statusurl\" name=\"statusurl\" value=\"http://localhost/server-status\"/>\n" " </instance_attributes>\n" " <operations>\n" " <op id=\"WebSite-start-timeout-40s\" interval=\"0s\" name=\"start\" timeout=\"40s\"/>\n" " <op id=\"WebSite-stop-timeout-60s\" interval=\"0s\" name=\"stop\" timeout=\"60s\"/>\n" " <op id=\"WebSite-monitor-interval-1min\" interval=\"1min\" name=\"monitor\"/>\n" " </operations>\n" " </primitive>\n" " <meta_attributes id=\"WebSite-clone-meta\"/>\n" " </clone>\n" " </resources>\n" " <constraints>\n" " <rsc_colocation id=\"colocation-WebSite-ClusterIP-INFINITY\" rsc=\"WebSite-clone\" score=\"INFINITY\" with-rsc=\"ClusterIP-clone\"/>\n" " <rsc_order first=\"ClusterIP-clone\" first-action=\"start\" id=\"order-ClusterIP-WebSite-mandatory\" then=\"WebSite-clone\" then-action=\"start\"/>\n" " <rsc_colocation id=\"colocation-WebFS-WebDataClone-INFINITY\" rsc=\"WebFS-clone\" score=\"INFINITY\" with-rsc=\"WebDataClone\" with-rsc-role=\"Master\"/>\n" " <rsc_order first=\"WebDataClone\" first-action=\"promote\" id=\"order-WebDataClone-WebFS-mandatory\" then=\"WebFS-clone\" then-action=\"start\"/>\n" " <rsc_colocation id=\"colocation-WebSite-WebFS-INFINITY\" rsc=\"WebSite-clone\" score=\"INFINITY\" with-rsc=\"WebFS-clone\"/>\n" " <rsc_order first=\"WebFS-clone\" first-action=\"start\" id=\"order-WebFS-WebSite-mandatory\" then=\"WebSite-clone\" then-action=\"start\"/>\n" " <rsc_colocation id=\"colocation-WebFS-clone-dlm-clone-INFINITY\" 
rsc=\"WebFS-clone\" score=\"INFINITY\" with-rsc=\"dlm-clone\"/>\n" " <rsc_order first=\"dlm-clone\" first-action=\"start\" id=\"order-dlm-clone-WebFS-clone-mandatory\" then=\"WebFS-clone\" then-action=\"start\"/>\n" " </constraints>\n" " <rsc_defaults>\n" " <meta_attributes id=\"rsc_defaults-options\">\n" " <nvpair id=\"rsc_defaults-options-resource-stickiness\" name=\"resource-stickiness\" value=\"100\"/>\n" " </meta_attributes>\n" " </rsc_defaults>\n" " <op_defaults>\n" " <meta_attributes id=\"op_defaults-options\">\n" " <nvpair id=\"op_defaults-options-timeout\" name=\"timeout\" value=\"240s\"/>\n" " </meta_attributes>\n" " </op_defaults>\n" " </configuration>\n" " <status>\n" " <node_state id=\"1\" uname=\"pcmk-1\" in_ccm=\"true\" crmd=\"online\" crm-debug-origin=\"do_update_resource\" join=\"member\" expected=\"member\">\n" " <lrm id=\"1\">\n" " <lrm_resources>\n" " <lrm_resource id=\"WebData\" type=\"drbd\" class=\"ocf\" provider=\"linbit\">\n" " <lrm_rsc_op id=\"WebData_last_0\" operation_key=\"WebData_promote_0\" operation=\"promote\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"13:4:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;13:4:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"44\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264508\" last-rc-change=\"1419264508\" exec-time=\"26\" queue-time=\"0\" op-digest=\"bc5c2e08730036ec602d79a958821da4\" on_node=\"pcmk-1\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"dlm\" type=\"controld\" class=\"ocf\" provider=\"pacemaker\">\n" " <lrm_rsc_op id=\"dlm_last_0\" operation_key=\"dlm_start_0\" operation=\"start\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"37:2:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;37:2:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"37\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264506\" last-rc-change=\"1419264506\" exec-time=\"1041\" 
queue-time=\"0\" op-digest=\"f2317cad3d54cec5d7d7aa7d0bf35cf8\" on_node=\"pcmk-1\"/>\n" " <lrm_rsc_op id=\"dlm_monitor_60000\" operation_key=\"dlm_monitor_60000\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"39:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;39:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"38\" rc-code=\"0\" op-status=\"0\" interval=\"60000\" last-rc-change=\"1419264507\" exec-time=\"11\" queue-time=\"0\" op-digest=\"968cc450c09e98fdac3043cb6a194d3d\" on_node=\"pcmk-1\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"ClusterIP:0\" type=\"IPaddr2\" class=\"ocf\" provider=\"heartbeat\">\n" " <lrm_rsc_op id=\"ClusterIP:0_last_0\" operation_key=\"ClusterIP:0_monitor_0\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"7:0:7:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:7;7:0:7:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"19\" rc-code=\"7\" op-status=\"0\" interval=\"0\" last-run=\"1419264506\" last-rc-change=\"1419264506\" exec-time=\"28\" queue-time=\"0\" op-digest=\"ac61ecc765070218997f6d876fa1d76c\" on_node=\"pcmk-1\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"ClusterIP:1\" type=\"IPaddr2\" class=\"ocf\" provider=\"heartbeat\">\n" " <lrm_rsc_op id=\"ClusterIP:1_last_0\" operation_key=\"ClusterIP:1_start_0\" operation=\"start\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"49:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;49:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"40\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264507\" last-rc-change=\"1419264507\" exec-time=\"190\" queue-time=\"0\" op-digest=\"ac61ecc765070218997f6d876fa1d76c\" on_node=\"pcmk-1\"/>\n" " <lrm_rsc_op id=\"ClusterIP:1_monitor_30000\" operation_key=\"ClusterIP:1_monitor_30000\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" 
crm_feature_set=\"3.0.9\" transition-key=\"50:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;50:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"41\" rc-code=\"0\" op-status=\"0\" interval=\"30000\" last-rc-change=\"1419264507\" exec-time=\"27\" queue-time=\"0\" op-digest=\"8ce33853c31576b708595f1d8a4a215c\" on_node=\"pcmk-1\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"WebFS\" type=\"Filesystem\" class=\"ocf\" provider=\"heartbeat\">\n" " <lrm_rsc_op id=\"WebFS_last_0\" operation_key=\"WebFS_start_0\" operation=\"start\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"62:5:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;62:5:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"46\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264508\" last-rc-change=\"1419264508\" exec-time=\"585\" queue-time=\"0\" op-digest=\"9d797b0e3b7f9729195992c0dafb5a9e\" on_node=\"pcmk-1\"/>\n" " <lrm_rsc_op id=\"WebFS_monitor_20000\" operation_key=\"WebFS_monitor_20000\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"62:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;62:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"47\" rc-code=\"0\" op-status=\"0\" interval=\"20000\" last-rc-change=\"1419264508\" exec-time=\"21\" queue-time=\"1\" op-digest=\"099af723b175851f09e5391e0c13854e\" on_node=\"pcmk-1\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"WebSite\" type=\"apache\" class=\"ocf\" provider=\"heartbeat\">\n" " <lrm_rsc_op id=\"WebSite_last_0\" operation_key=\"WebSite_start_0\" operation=\"start\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"72:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;72:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"48\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264508\" last-rc-change=\"1419264508\" exec-time=\"65\" 
queue-time=\"0\" op-digest=\"49ba395a3f2c142631c2ef2c431a29d9\" on_node=\"pcmk-1\"/>\n" " <lrm_rsc_op id=\"WebSite_monitor_60000\" operation_key=\"WebSite_monitor_60000\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"73:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;73:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"49\" rc-code=\"0\" op-status=\"0\" interval=\"60000\" last-rc-change=\"1419264508\" exec-time=\"26\" queue-time=\"0\" op-digest=\"eddc33bef3f1592ad847638ee485316f\" on_node=\"pcmk-1\"/>\n" " </lrm_resource>\n" " </lrm_resources>\n" " </lrm>\n" " <transient_attributes id=\"1\">\n" " <instance_attributes id=\"status-1\">\n" " <nvpair id=\"status-1-shutdown\" name=\"shutdown\" value=\"0\"/>\n" " <nvpair id=\"status-1-probe_complete\" name=\"probe_complete\" value=\"true\"/>\n" " <nvpair id=\"status-1-master-WebData\" name=\"master-WebData\" value=\"10000\"/>\n" " </instance_attributes>\n" " </transient_attributes>\n" " </node_state>\n" " <node_state id=\"2\" uname=\"pcmk-2\" in_ccm=\"true\" crmd=\"online\" crm-debug-origin=\"do_update_resource\" join=\"member\" expected=\"member\">\n" " <transient_attributes id=\"2\">\n" " <instance_attributes id=\"status-2\">\n" " <nvpair id=\"status-2-shutdown\" name=\"shutdown\" value=\"0\"/>\n" " <nvpair id=\"status-2-probe_complete\" name=\"probe_complete\" value=\"true\"/>\n" " <nvpair id=\"status-2-master-WebData\" name=\"master-WebData\" value=\"10000\"/>\n" " </instance_attributes>\n" " </transient_attributes>\n" " <lrm id=\"2\">\n" " <lrm_resources>\n" " <lrm_resource id=\"WebData\" type=\"drbd\" class=\"ocf\" provider=\"linbit\">\n" " <lrm_rsc_op id=\"WebData_last_0\" operation_key=\"WebData_promote_0\" operation=\"promote\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"16:4:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;16:4:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" 
call-id=\"41\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264508\" last-rc-change=\"1419264508\" exec-time=\"26\" queue-time=\"0\" op-digest=\"bc5c2e08730036ec602d79a958821da4\" on_node=\"pcmk-2\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"dlm\" type=\"controld\" class=\"ocf\" provider=\"pacemaker\">\n" " <lrm_rsc_op id=\"dlm_last_0\" operation_key=\"dlm_start_0\" operation=\"start\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"35:2:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;35:2:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"34\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264506\" last-rc-change=\"1419264506\" exec-time=\"1053\" queue-time=\"0\" op-digest=\"f2317cad3d54cec5d7d7aa7d0bf35cf8\" on_node=\"pcmk-2\"/>\n" " <lrm_rsc_op id=\"dlm_monitor_60000\" operation_key=\"dlm_monitor_60000\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"42:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;42:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"35\" rc-code=\"0\" op-status=\"0\" interval=\"60000\" last-rc-change=\"1419264507\" exec-time=\"19\" queue-time=\"0\" op-digest=\"968cc450c09e98fdac3043cb6a194d3d\" on_node=\"pcmk-2\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"ClusterIP:0\" type=\"IPaddr2\" class=\"ocf\" provider=\"heartbeat\">\n" " <lrm_rsc_op id=\"ClusterIP:0_last_0\" operation_key=\"ClusterIP:0_start_0\" operation=\"start\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"47:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;47:3:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"36\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264507\" last-rc-change=\"1419264507\" exec-time=\"237\" queue-time=\"0\" op-digest=\"ac61ecc765070218997f6d876fa1d76c\" on_node=\"pcmk-2\"/>\n" " <lrm_rsc_op id=\"ClusterIP:0_monitor_30000\" 
operation_key=\"ClusterIP:0_monitor_30000\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"51:4:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;51:4:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"39\" rc-code=\"0\" op-status=\"0\" interval=\"30000\" last-rc-change=\"1419264507\" exec-time=\"34\" queue-time=\"0\" op-digest=\"8ce33853c31576b708595f1d8a4a215c\" on_node=\"pcmk-2\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"ClusterIP:1\" type=\"IPaddr2\" class=\"ocf\" provider=\"heartbeat\">\n" " <lrm_rsc_op id=\"ClusterIP:1_last_0\" operation_key=\"ClusterIP:1_monitor_0\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"16:0:7:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:7;16:0:7:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"23\" rc-code=\"7\" op-status=\"0\" interval=\"0\" last-run=\"1419264506\" last-rc-change=\"1419264506\" exec-time=\"28\" queue-time=\"0\" op-digest=\"ac61ecc765070218997f6d876fa1d76c\" on_node=\"pcmk-2\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"WebFS\" type=\"Filesystem\" class=\"ocf\" provider=\"heartbeat\">\n" " <lrm_rsc_op id=\"WebFS_last_0\" operation_key=\"WebFS_start_0\" operation=\"start\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"60:5:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;60:5:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"43\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264508\" last-rc-change=\"1419264508\" exec-time=\"662\" queue-time=\"0\" op-digest=\"9d797b0e3b7f9729195992c0dafb5a9e\" on_node=\"pcmk-2\"/>\n" " <lrm_rsc_op id=\"WebFS_monitor_20000\" operation_key=\"WebFS_monitor_20000\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"65:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" 
transition-magic=\"0:0;65:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"44\" rc-code=\"0\" op-status=\"0\" interval=\"20000\" last-rc-change=\"1419264508\" exec-time=\"29\" queue-time=\"0\" op-digest=\"099af723b175851f09e5391e0c13854e\" on_node=\"pcmk-2\"/>\n" " </lrm_resource>\n" " <lrm_resource id=\"WebSite\" type=\"apache\" class=\"ocf\" provider=\"heartbeat\">\n" " <lrm_rsc_op id=\"WebSite_last_0\" operation_key=\"WebSite_start_0\" operation=\"start\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"70:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;70:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"45\" rc-code=\"0\" op-status=\"0\" interval=\"0\" last-run=\"1419264508\" last-rc-change=\"1419264508\" exec-time=\"64\" queue-time=\"0\" op-digest=\"49ba395a3f2c142631c2ef2c431a29d9\" on_node=\"pcmk-2\"/>\n" " <lrm_rsc_op id=\"WebSite_monitor_60000\" operation_key=\"WebSite_monitor_60000\" operation=\"monitor\" crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.9\" transition-key=\"71:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" transition-magic=\"0:0;71:6:0:225c8bc5-8fb0-49b6-9f75-337085b080de\" call-id=\"46\" rc-code=\"0\" op-status=\"0\" interval=\"60000\" last-rc-change=\"1419264508\" exec-time=\"28\" queue-time=\"0\" op-digest=\"eddc33bef3f1592ad847638ee485316f\" on_node=\"pcmk-2\"/>\n" " </lrm_resource>\n" " </lrm_resources>\n" " </lrm>\n" " </node_state>\n" " </status>\n" "</cib>" msgstr "" #. Tag: title #, no-c-format msgid "Node List" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status nodes\n" "Pacemaker Nodes:\n" " Online: pcmk-1 pcmk-2\n" " Standby:\n" " Offline:" msgstr "" #. Tag: title #, no-c-format msgid "Cluster Options" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs property\n" "Cluster Properties:\n" " cluster-infrastructure: corosync\n" " cluster-name: mycluster\n" " dc-version: 1.1.12-a14efad\n" " have-watchdog: false\n" " last-lrm-refresh: 1439569053\n" " stonith-enabled: true" msgstr "" #. Tag: para #, no-c-format msgid "The output shows state information automatically obtained about the cluster, including:" msgstr "" #. Tag: para #, no-c-format msgid "cluster-infrastructure - the cluster communications layer in use (heartbeat or corosync)" msgstr "" #. Tag: para #, no-c-format msgid "cluster-name - the cluster name chosen by the administrator when the cluster was created" msgstr "" #. Tag: para #, no-c-format msgid "dc-version - the version (including upstream source-code hash) of Pacemaker used on the Designated Controller" msgstr "" #. Tag: para #, no-c-format msgid "The output also shows options set by the administrator that control the way the cluster operates, including:" msgstr "" #. Tag: para #, no-c-format msgid "stonith-enabled=true - whether the cluster is allowed to use STONITH resources" msgstr "" #. Tag: title #, no-c-format msgid "Resources" msgstr "" #. Tag: title #, no-c-format msgid "Default Options" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource defaults\n" "resource-stickiness: 100" msgstr "" #. Tag: para #, no-c-format msgid "This shows cluster option defaults that apply to every resource that does not explicitly set the option itself. Above:" msgstr "" #. Tag: para #, no-c-format msgid "resource-stickiness - Specify the aversion to moving healthy resources to other machines" msgstr "" #. Tag: title #, no-c-format msgid "Fencing" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs stonith show\n" " ipmi-fencing (stonith:fence_ipmilan) Started\n" "[root@pcmk-1 ~]# pcs stonith show ipmi-fencing\n" " Resource: ipmi-fencing (class=stonith type=fence_ipmilan)\n" " Attributes: ipaddr=\"10.0.0.1\" login=\"testuser\" passwd=\"acd123\" pcmk_host_list=\"pcmk-1 pcmk-2\"\n" " Operations: monitor interval=60s (fence-monitor-interval-60s)" msgstr "" #. Tag: title #, no-c-format msgid "Service Address" msgstr "" #. Tag: para #, no-c-format msgid "Users of the services provided by the cluster require an unchanging address with which to access it. Additionally, we cloned the address so it will be active on both nodes. An iptables rule (created as part of the resource agent) is used to ensure that each request only gets processed by one of the two clone instances. The additional meta options tell the cluster that we want two instances of the clone (one \"request bucket\" for each node) and that if one node fails, then the remaining node should hold both." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource show ClusterIP-clone\n" " Clone: ClusterIP-clone\n" " Meta Attrs: clone-max=2 clone-node-max=2 globally-unique=true\n" " Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)\n" " Attributes: ip=192.168.122.120 cidr_netmask=32 clusterip_hash=sourceip\n" " Operations: start interval=0s timeout=20s (ClusterIP-start-timeout-20s)\n" " stop interval=0s timeout=20s (ClusterIP-stop-timeout-20s)\n" " monitor interval=30s (ClusterIP-monitor-interval-30s)" msgstr "" #. Tag: title #, no-c-format msgid "DRBD - Shared Storage" msgstr "" #. Tag: para #, no-c-format msgid "Here, we define the DRBD service and specify which DRBD resource (from /etc/drbd.d/*.res) it should manage. We make it a master/slave resource and, in order to have an active/active setup, allow both instances to be promoted to master at the same time. 
We also set the notify option so that the cluster will tell DRBD agent when its peer changes state." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource show WebDataClone\n" " Master: WebDataClone\n" " Meta Attrs: master-max=2 master-node-max=1 clone-max=2 clone-node-max=1 notify=true\n" " Resource: WebData (class=ocf provider=linbit type=drbd)\n" " Attributes: drbd_resource=wwwdata\n" " Operations: start interval=0s timeout=240 (WebData-start-timeout-240)\n" " promote interval=0s timeout=90 (WebData-promote-timeout-90)\n" " demote interval=0s timeout=90 (WebData-demote-timeout-90)\n" " stop interval=0s timeout=100 (WebData-stop-timeout-100)\n" " monitor interval=60s (WebData-monitor-interval-60s)\n" "[root@pcmk-1 ~]# pcs constraint ref WebDataClone\n" "Resource: WebDataClone\n" " colocation-WebFS-WebDataClone-INFINITY\n" " order-WebDataClone-WebFS-mandatory" msgstr "" #. Tag: title #, no-c-format msgid "Cluster Filesystem" msgstr "" #. Tag: para #, no-c-format msgid "The cluster filesystem ensures that files are read and written correctly. We need to specify the block device (provided by DRBD), where we want it mounted and that we are using GFS2. Again, it is a clone because it is intended to be active on both nodes. The additional constraints ensure that it can only be started on nodes with active DLM and DRBD instances." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource show WebFS-clone\n" " Clone: WebFS-clone\n" " Resource: WebFS (class=ocf provider=heartbeat type=Filesystem)\n" " Attributes: device=/dev/drbd1 directory=/var/www/html fstype=gfs2\n" " Operations: start interval=0s timeout=60 (WebFS-start-timeout-60)\n" " stop interval=0s timeout=60 (WebFS-stop-timeout-60)\n" " monitor interval=20 timeout=40 (WebFS-monitor-interval-20)\n" "[root@pcmk-1 ~]# pcs constraint ref WebFS-clone\n" "Resource: WebFS-clone\n" " colocation-WebFS-WebDataClone-INFINITY\n" " colocation-WebSite-WebFS-INFINITY\n" " colocation-WebFS-clone-dlm-clone-INFINITY\n" " order-WebDataClone-WebFS-mandatory\n" " order-WebFS-WebSite-mandatory\n" " order-dlm-clone-WebFS-clone-mandatory" msgstr "" #. Tag: title #, no-c-format msgid "Apache" msgstr "" #. Tag: para #, no-c-format msgid "Lastly, we have the actual service, Apache. We need only tell the cluster where to find its main configuration file and restrict it to running on nodes that have the required filesystem mounted and the IP address active." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource show WebSite-clone\n" " Clone: WebSite-clone\n" " Resource: WebSite (class=ocf provider=heartbeat type=apache)\n" " Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status\n" " Operations: start interval=0s timeout=40s (WebSite-start-timeout-40s)\n" " stop interval=0s timeout=60s (WebSite-stop-timeout-60s)\n" " monitor interval=1min (WebSite-monitor-interval-1min)\n" "[root@pcmk-1 ~]# pcs constraint ref WebSite-clone\n" "Resource: WebSite-clone\n" " colocation-WebSite-ClusterIP-INFINITY\n" " colocation-WebSite-WebFS-INFINITY\n" " order-ClusterIP-WebSite-mandatory\n" " order-WebFS-WebSite-mandatory" msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ap-Corosync-Conf.pot b/doc/Clusters_from_Scratch/pot/Ap-Corosync-Conf.pot index b26a496c0d..19815df1ab 100644 --- a/doc/Clusters_from_Scratch/pot/Ap-Corosync-Conf.pot +++ b/doc/Clusters_from_Scratch/pot/Ap-Corosync-Conf.pot @@ -1,54 +1,54 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Sample Corosync Configuration" msgstr "" #. Tag: title #, no-c-format msgid "Sample corosync.conf for two-node cluster created by pcs." msgstr "" #. 
Tag: literallayout #, no-c-format msgid "totem {\n" "version: 2\n" "secauth: off\n" "cluster_name: mycluster\n" "transport: udpu\n" "}\n" "\n" "nodelist {\n" " node {\n" " ring0_addr: pcmk-1\n" " nodeid: 1\n" " }\n" " node {\n" " ring0_addr: pcmk-2\n" " nodeid: 2\n" " }\n" "}\n" "\n" "quorum {\n" "provider: corosync_votequorum\n" "two_node: 1\n" "}\n" "\n" "logging {\n" "to_syslog: yes\n" "}" msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ap-Reading.pot b/doc/Clusters_from_Scratch/pot/Ap-Reading.pot index 793acf98ba..238690822d 100644 --- a/doc/Clusters_from_Scratch/pot/Ap-Reading.pot +++ b/doc/Clusters_from_Scratch/pot/Ap-Reading.pot @@ -1,34 +1,34 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Further Reading" msgstr "" #. Tag: para #, no-c-format msgid "Project Website http://www.clusterlabs.org/" msgstr "" #. Tag: para #, no-c-format msgid "SuSE has a comprehensive guide to cluster commands (though using the crmsh command-line shell rather than pcs) at: https://www.suse.com/documentation/sle_ha/book_sleha/data/book_sleha.html" msgstr "" #. Tag: para #, no-c-format msgid "Corosync http://www.corosync.org/" msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Author_Group.pot b/doc/Clusters_from_Scratch/pot/Author_Group.pot index e1dac4cb14..2a05b480d1 100644 --- a/doc/Clusters_from_Scratch/pot/Author_Group.pot +++ b/doc/Clusters_from_Scratch/pot/Author_Group.pot @@ -1,64 +1,64 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: firstname #, no-c-format msgid "Andrew" msgstr "" #. Tag: surname #, no-c-format msgid "Beekhof" msgstr "" #. Tag: orgname #, no-c-format msgid "Red Hat" msgstr "" #. Tag: contrib #, no-c-format msgid "Primary author" msgstr "" #. Tag: firstname #, no-c-format msgid "Raoul" msgstr "" #. Tag: surname #, no-c-format msgid "Scarazzini" msgstr "" #. Tag: contrib #, no-c-format msgid "Italian translation" msgstr "" #. Tag: firstname #, no-c-format msgid "Dan" msgstr "" #. Tag: surname #, no-c-format msgid "Frîncu" msgstr "" #. Tag: contrib #, no-c-format msgid "Romanian translation" msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Book_Info.pot b/doc/Clusters_from_Scratch/pot/Book_Info.pot index 3ced027eed..f29960512e 100644 --- a/doc/Clusters_from_Scratch/pot/Book_Info.pot +++ b/doc/Clusters_from_Scratch/pot/Book_Info.pot @@ -1,69 +1,69 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Clusters from Scratch" msgstr "" #. Tag: subtitle #, no-c-format msgid "Step-by-Step Instructions for Building Your First High-Availability Cluster" msgstr "" #. Tag: productname #, no-c-format msgid "Pacemaker" msgstr "" #. 
Tag: para #, no-c-format msgid "The purpose of this document is to provide a start-to-finish guide to building an example active/passive cluster with Pacemaker and show how it can be converted to an active/active one." msgstr "" #. Tag: para #, no-c-format msgid "The example cluster will use:" msgstr "" #. Tag: para #, no-c-format msgid "&DISTRO; &DISTRO_VERSION; as the host operating system" msgstr "" #. Tag: para #, no-c-format msgid "Corosync to provide messaging and membership services," msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker to perform resource management," msgstr "" #. Tag: para #, no-c-format msgid "DRBD as a cost-effective alternative to shared storage," msgstr "" #. Tag: para #, no-c-format msgid "GFS2 as the cluster filesystem (in active/active mode)" msgstr "" #. Tag: para #, no-c-format msgid "Given the graphical nature of the install process, a number of screenshots are included. However the guide is primarily composed of commands, the reasons for executing them and their expected outputs." msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ch-Active-Active.pot b/doc/Clusters_from_Scratch/pot/Ch-Active-Active.pot index 5dd6e355ee..2f319d5762 100644 --- a/doc/Clusters_from_Scratch/pot/Ch-Active-Active.pot +++ b/doc/Clusters_from_Scratch/pot/Ch-Active-Active.pot @@ -1,489 +1,489 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Convert Cluster to Active/Active" msgstr "" #. 
Tag: para #, no-c-format msgid "The primary requirement for an Active/Active cluster is that the data required for your services is available, simultaneously, on both machines. Pacemaker makes no requirement on how this is achieved; you could use a SAN if you had one available, but since DRBD supports multiple Primaries, we can continue to use it here." msgstr "" #. Tag: title #, no-c-format msgid "Install Cluster Filesystem Software" msgstr "" #. Tag: para #, no-c-format msgid "The only hitch is that we need to use a cluster-aware filesystem. The one we used earlier with DRBD, xfs, is not one of those. Both OCFS2 and GFS2 are supported; here, we will use GFS2." msgstr "" #. Tag: para #, no-c-format msgid "On both nodes, install the GFS2 command-line utilities and the Distributed Lock Manager (DLM) required by cluster filesystems:" msgstr "" #. Tag: screen #, no-c-format msgid "# yum install -y gfs2-utils dlm" msgstr "" #. Tag: title #, no-c-format msgid "Configure the Cluster for the DLM" msgstr "" #. Tag: para #, no-c-format msgid "The DLM needs to run on both nodes, so we’ll start by creating a resource for it (using the ocf:pacemaker:controld resource script), and clone it:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib dlm_cfg\n" "[root@pcmk-1 ~]# pcs -f dlm_cfg resource create dlm ocf:pacemaker:controld op monitor interval=60s\n" "[root@pcmk-1 ~]# pcs -f dlm_cfg resource clone dlm clone-max=2 clone-node-max=1\n" "[root@pcmk-1 ~]# pcs -f dlm_cfg resource show\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started\n" " WebSite (ocf::heartbeat:apache): Started\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-2 ]\n" " Slaves: [ pcmk-1 ]\n" " WebFS (ocf::heartbeat:Filesystem): Started\n" " Clone Set: dlm-clone [dlm]\n" " Stopped: [ pcmk-1 pcmk-2 ]" msgstr "" #. Tag: para #, no-c-format msgid "Activate our new configuration, and see how the cluster responds:" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib-push dlm_cfg\n" "CIB updated\n" "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Aug 14 11:19:36 2015\n" "Last change: Fri Aug 14 11:19:28 2015\n" "Stack: corosync\n" "Current DC: pcmk-1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "8 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2\n" " WebSite (ocf::heartbeat:apache): Started pcmk-2\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-2 ]\n" " Slaves: [ pcmk-1 ]\n" " WebFS (ocf::heartbeat:Filesystem): Started pcmk-2\n" " ipmi-fencing (stonith:fence_ipmilan): Started pcmk-1\n" " Clone Set: dlm-clone [dlm]\n" " Started: [ pcmk-1 pcmk-2 ]\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: title #, no-c-format msgid "Create and Populate GFS2 Filesystem" msgstr "" #. Tag: para #, no-c-format msgid "Before we do anything to the existing partition, we need to make sure it is unmounted. We do this by telling the cluster to stop the WebFS resource. This will ensure that other resources (in our case, Apache) using WebFS are not only stopped, but stopped in the correct order." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource disable WebFS\n" "[root@pcmk-1 ~]# pcs resource\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started\n" " WebSite (ocf::heartbeat:apache): Stopped\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-2 ]\n" " Slaves: [ pcmk-1 ]\n" " WebFS (ocf::heartbeat:Filesystem): Stopped\n" " Clone Set: dlm-clone [dlm]\n" " Started: [ pcmk-1 pcmk-2 ]" msgstr "" #. 
Tag: para #, no-c-format msgid "You can see that both Apache and WebFS have been stopped, and that pcmk-2 is the current master for the DRBD device." msgstr "" #. Tag: para #, no-c-format msgid "Now we can create a new GFS2 filesystem on the DRBD device." msgstr "" #. Tag: para #, no-c-format msgid "This will erase all previous content stored on the DRBD device. Ensure you have a copy of any important data." msgstr "" #. Tag: para #, no-c-format msgid "Run the next command on whichever node has the DRBD Primary role. Otherwise, you will receive the message:" msgstr "" #. Tag: screen #, no-c-format msgid "/dev/drbd1: Read-only file system" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-2 ~]# mkfs.gfs2 -p lock_dlm -j 2 -t mycluster:web /dev/drbd1\n" "It appears to contain an existing filesystem (xfs)\n" "This will destroy any data on /dev/drbd1\n" "Are you sure you want to proceed? [y/n]y\n" "Device: /dev/drbd1\n" "Block size: 4096\n" "Device size: 1.00 GB (262127 blocks)\n" "Filesystem size: 1.00 GB (262126 blocks)\n" "Journals: 2\n" "Resource groups: 5\n" "Locking protocol: \"lock_dlm\"\n" "Lock table: \"mycluster:web\"\n" "UUID: 9a72c488-d8a7-24c9-ceee-add7a8ca52c2" msgstr "" #. Tag: para #, no-c-format msgid "The mkfs.gfs2 command required a number of additional parameters:" msgstr "" #. Tag: para #, no-c-format msgid "-p lock_dlm specifies that we want to use the kernel’s DLM." msgstr "" #. Tag: para #, no-c-format msgid "-j 2 indicates that the filesystem should reserve enough space for two journals (one for each node that will access the filesystem)." msgstr "" #. Tag: para #, no-c-format msgid "-t mycluster:web specifies the lock table name. The format for this field is clustername:fsname. For clustername, we need to use the same value we specified originally with pcs cluster setup --name (which is also the value of cluster_name in /etc/corosync/corosync.conf). 
If you are unsure what your cluster name is, you can look in /etc/corosync/corosync.conf or execute the command pcs cluster corosync pcmk-1 | grep cluster_name." msgstr "" #. Tag: para #, no-c-format msgid "Now we can (re-)populate the new filesystem with data (web pages). We’ll create yet another variation on our home page." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-2 ~]# mount /dev/drbd1 /mnt\n" "[root@pcmk-2 ~]# cat <<-END >/mnt/index.html\n" "<html>\n" "<body>My Test Site - GFS2</body>\n" "</html>\n" "END\n" "[root@pcmk-2 ~]# chcon -R --reference=/var/www/html /mnt\n" "[root@pcmk-2 ~]# umount /dev/drbd1\n" "[root@pcmk-2 ~]# drbdadm verify wwwdata" msgstr "" #. Tag: title #, no-c-format msgid "Reconfigure the Cluster for GFS2" msgstr "" #. Tag: para #, no-c-format msgid "With the WebFS resource stopped, let’s update the configuration." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource show WebFS\n" " Resource: WebFS (class=ocf provider=heartbeat type=Filesystem)\n" " Attributes: device=/dev/drbd1 directory=/var/www/html fstype=xfs\n" " Meta Attrs: target-role=Stopped\n" " Operations: start interval=0s timeout=60 (WebFS-start-timeout-60)\n" " stop interval=0s timeout=60 (WebFS-stop-timeout-60)\n" " monitor interval=20 timeout=40 (WebFS-monitor-interval-20)" msgstr "" #. Tag: para #, no-c-format msgid "The fstype option needs to be updated to gfs2 instead of xfs." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource update WebFS fstype=gfs2\n" "[root@pcmk-1 ~]# pcs resource show WebFS\n" " Resource: WebFS (class=ocf provider=heartbeat type=Filesystem)\n" " Attributes: device=/dev/drbd1 directory=/var/www/html fstype=gfs2\n" " Meta Attrs: target-role=Stopped\n" " Operations: start interval=0s timeout=60 (WebFS-start-timeout-60)\n" " stop interval=0s timeout=60 (WebFS-stop-timeout-60)\n" " monitor interval=20 timeout=40 (WebFS-monitor-interval-20)" msgstr "" #. 
Tag: para #, no-c-format msgid "GFS2 requires that DLM be running, so we also need to set up new colocation and ordering constraints for it:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs constraint colocation add WebFS with dlm-clone INFINITY\n" "[root@pcmk-1 ~]# pcs constraint order dlm-clone then WebFS\n" "Adding dlm-clone WebFS (kind: Mandatory) (Options: first-action=start then-action=start)" msgstr "" #. Tag: title #, no-c-format msgid "Clone the IP address" msgstr "" #. Tag: para #, no-c-format msgid "There’s no point making the services active on both locations if we can’t reach them both, so let’s clone the IP address." msgstr "" #. Tag: para #, no-c-format msgid "The IPaddr2 resource agent has built-in intelligence for when it is configured as a clone. It will utilize a multicast MAC address to have the local switch send the relevant packets to all nodes in the cluster, together with iptables clusterip rules on the nodes so that any given packet will be grabbed by exactly one node. This will give us a simple but effective form of load-balancing requests between our two nodes." msgstr "" #. Tag: para #, no-c-format msgid "Let’s start a new config, and clone our IP:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib loadbalance_cfg\n" "[root@pcmk-1 ~]# pcs -f loadbalance_cfg resource clone ClusterIP \\\n" " clone-max=2 clone-node-max=2 globally-unique=true" msgstr "" #. Tag: para #, no-c-format msgid "clone-max=2 tells the resource agent to split packets this many ways. This should equal the number of nodes that can host the IP." msgstr "" #. Tag: para #, no-c-format msgid "clone-node-max=2 says that one node can run up to 2 instances of the clone. This should also equal the number of nodes that can host the IP, so that if any node goes down, another node can take over the failed node’s \"request bucket\". Otherwise, requests intended for the failed node would be discarded." msgstr "" #. 
Tag: para #, no-c-format msgid "globally-unique=true tells the cluster that one clone isn’t identical to another (each handles a different \"bucket\"). This also tells the resource agent to insert iptables rules so each host only processes packets in its bucket(s)." msgstr "" #. Tag: para #, no-c-format msgid "Notice that when the ClusterIP becomes a clone, the constraints referencing ClusterIP now reference the clone. This is done automatically by pcs." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs -f loadbalance_cfg constraint\n" "Location Constraints:\n" "Ordering Constraints:\n" " start ClusterIP-clone then start WebSite (kind:Mandatory)\n" " promote WebDataClone then start WebFS (kind:Mandatory)\n" " start WebFS then start WebSite (kind:Mandatory)\n" " start dlm-clone then start WebFS (kind:Mandatory)\n" "Colocation Constraints:\n" " WebSite with ClusterIP-clone (score:INFINITY)\n" " WebFS with WebDataClone (score:INFINITY) (with-rsc-role:Master)\n" " WebSite with WebFS (score:INFINITY)\n" " WebFS with dlm-clone (score:INFINITY)" msgstr "" #. Tag: para #, no-c-format msgid "Now we must tell the resource how to decide which requests are processed by which hosts. To do this, we specify the clusterip_hash parameter. The value of sourceip means that the source IP address of incoming packets will be hashed; each node will process a certain range of hashes." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs -f loadbalance_cfg resource update ClusterIP clusterip_hash=sourceip" msgstr "" #. Tag: para #, no-c-format msgid "Load our configuration to the cluster, and see how it responds." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib-push loadbalance_cfg\n" "CIB updated\n" "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Aug 14 11:32:07 2015\n" "Last change: Fri Aug 14 11:32:04 2015\n" "Stack: corosync\n" "Current DC: pcmk-1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "9 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " WebSite (ocf::heartbeat:apache): Stopped\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-1 ]\n" " Slaves: [ pcmk-2 ]\n" " WebFS (ocf::heartbeat:Filesystem): Stopped\n" " ipmi-fencing (stonith:fence_ipmilan): Started pcmk-1\n" " Clone Set: dlm-clone [dlm]\n" " Started: [ pcmk-1 pcmk-2 ]\n" " Clone Set: ClusterIP-clone [ClusterIP] (unique)\n" " ClusterIP:0 (ocf::heartbeat:IPaddr2): Started pcmk-1\n" " ClusterIP:1 (ocf::heartbeat:IPaddr2): Started pcmk-2\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "If desired, you can demonstrate that all request buckets are working by using a tool such as arping from several source hosts to see which host responds to each." msgstr "" #. Tag: title #, no-c-format msgid "Clone the Filesystem and Apache Resources" msgstr "" #. Tag: para #, no-c-format msgid "Now that we have a cluster filesystem ready to go, and our nodes can load-balance requests to a shared IP address, we can configure the cluster so both nodes mount the filesystem and respond to web requests." msgstr "" #. Tag: para #, no-c-format msgid "Clone the filesystem and Apache resources in a new configuration. Notice how pcs automatically updates the relevant constraints again." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib active_cfg\n" "[root@pcmk-1 ~]# pcs -f active_cfg resource clone WebFS\n" "[root@pcmk-1 ~]# pcs -f active_cfg resource clone WebSite\n" "[root@pcmk-1 ~]# pcs -f active_cfg constraint\n" "Location Constraints:\n" "Ordering Constraints:\n" " start ClusterIP-clone then start WebSite-clone (kind:Mandatory)\n" " promote WebDataClone then start WebFS-clone (kind:Mandatory)\n" " start WebFS-clone then start WebSite-clone (kind:Mandatory)\n" " start dlm-clone then start WebFS-clone (kind:Mandatory)\n" "Colocation Constraints:\n" " WebSite-clone with ClusterIP-clone (score:INFINITY)\n" " WebFS-clone with WebDataClone (score:INFINITY) (with-rsc-role:Master)\n" " WebSite-clone with WebFS-clone (score:INFINITY)\n" " WebFS-clone with dlm-clone (score:INFINITY)" msgstr "" #. Tag: para #, no-c-format msgid "Tell the cluster that it is now allowed to promote both instances to be DRBD Primary (aka. master)." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs -f active_cfg resource update WebDataClone master-max=2" msgstr "" #. Tag: para #, no-c-format msgid "Finally, load our configuration to the cluster, and re-enable the WebFS resource (which we disabled earlier)." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib-push active_cfg\n" "CIB updated\n" "[root@pcmk-1 ~]# pcs resource enable WebFS" msgstr "" #. Tag: para #, no-c-format msgid "After all the processes are started, the status should look similar to this." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-1 pcmk-2 ]\n" " Clone Set: dlm-clone [dlm]\n" " Started: [ pcmk-1 pcmk-2 ]\n" " Clone Set: ClusterIP-clone [ClusterIP] (unique)\n" " ClusterIP:0 (ocf::heartbeat:IPaddr2): Started\n" " ClusterIP:1 (ocf::heartbeat:IPaddr2): Started\n" " Clone Set: WebFS-clone [WebFS]\n" " Started: [ pcmk-1 pcmk-2 ]\n" " Clone Set: WebSite-clone [WebSite]\n" " Started: [ pcmk-1 pcmk-2 ]" msgstr "" #. Tag: title #, no-c-format msgid "Test Failover" msgstr "" #. Tag: para #, no-c-format msgid "Testing failover is left as an exercise for the reader. For example, you can put one node into standby mode, use pcs status to confirm that its ClusterIP clone was moved to the other node, and use arping to verify that packets are not being lost from any source host." msgstr "" #. Tag: para #, no-c-format msgid "You may find that when a failed node rejoins the cluster, both ClusterIP clones stay on one node, due to the resource stickiness. While this works fine, it effectively eliminates load-balancing and returns the cluster to an active-passive setup again. You can avoid this by disabling stickiness for the IP address resource:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource meta ClusterIP resource-stickiness=0" msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ch-Active-Passive.pot b/doc/Clusters_from_Scratch/pot/Ch-Active-Passive.pot index 2adb086b7e..918aac41c2 100644 --- a/doc/Clusters_from_Scratch/pot/Ch-Active-Passive.pot +++ b/doc/Clusters_from_Scratch/pot/Ch-Active-Passive.pot @@ -1,508 +1,508 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Create an Active/Passive Cluster" msgstr "" #. Tag: title #, no-c-format msgid "Explore the Existing Configuration" msgstr "" #. Tag: para #, no-c-format msgid "When Pacemaker starts up, it automatically records the number and details of the nodes in the cluster, as well as which stack is being used and the version of Pacemaker being used." msgstr "" #. Tag: para #, no-c-format msgid "The first few lines of output should look like this:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "WARNING: no stonith devices and stonith-enabled is not false\n" "Last updated: Tue Dec 16 16:15:29 2014\n" "Last change: Tue Dec 16 15:49:47 2014\n" "Stack: corosync\n" "Current DC: pcmk-2 (2) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "0 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]" msgstr "" #. Tag: para #, no-c-format msgid "For those who are not afraid of XML, you can see the raw cluster configuration and status by using the pcs cluster cib command." msgstr "" #. Tag: title #, no-c-format msgid "The last XML you’ll see in this document" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib" msgstr "" #.
Tag: programlisting #, no-c-format msgid "<cib crm_feature_set=\"3.0.9\" validate-with=\"pacemaker-2.3\" epoch=\"5\" num_updates=\"8\" admin_epoch=\"0\" cib-last-written=\"Tue Dec 16 15:49:47 2014\" have-quorum=\"1\" dc-uuid=\"2\">\n" " <configuration>\n" " <crm_config>\n" " <cluster_property_set id=\"cib-bootstrap-options\">\n" " <nvpair id=\"cib-bootstrap-options-have-watchdog\" name=\"have-watchdog\" value=\"false\"/>\n" " <nvpair id=\"cib-bootstrap-options-dc-version\" name=\"dc-version\" value=\"1.1.12-a14efad\"/>\n" " <nvpair id=\"cib-bootstrap-options-cluster-infrastructure\" name=\"cluster-infrastructure\" value=\"corosync\"/>\n" " <nvpair id=\"cib-bootstrap-options-cluster-name\" name=\"cluster-name\" value=\"mycluster\"/>\n" " </cluster_property_set>\n" " </crm_config>\n" " <nodes>\n" " <node id=\"1\" uname=\"pcmk-1\"/>\n" " <node id=\"2\" uname=\"pcmk-2\"/>\n" " </nodes>\n" " <resources/>\n" " <constraints/>\n" " </configuration>\n" " <status>\n" " <node_state id=\"2\" uname=\"pcmk-2\" in_ccm=\"true\" crmd=\"online\" crm-debug-origin=\"do_state_transition\" join=\"member\" expected=\"member\">\n" " <lrm id=\"2\">\n" " <lrm_resources/>\n" " </lrm>\n" " <transient_attributes id=\"2\">\n" " <instance_attributes id=\"status-2\">\n" " <nvpair id=\"status-2-shutdown\" name=\"shutdown\" value=\"0\"/>\n" " <nvpair id=\"status-2-probe_complete\" name=\"probe_complete\" value=\"true\"/>\n" " </instance_attributes>\n" " </transient_attributes>\n" " </node_state>\n" " <node_state id=\"1\" uname=\"pcmk-1\" in_ccm=\"true\" crmd=\"online\" crm-debug-origin=\"do_state_transition\" join=\"member\" expected=\"member\">\n" " <lrm id=\"1\">\n" " <lrm_resources/>\n" " </lrm>\n" " <transient_attributes id=\"1\">\n" " <instance_attributes id=\"status-1\">\n" " <nvpair id=\"status-1-shutdown\" name=\"shutdown\" value=\"0\"/>\n" " <nvpair id=\"status-1-probe_complete\" name=\"probe_complete\" value=\"true\"/>\n" " </instance_attributes>\n" " </transient_attributes>\n" " 
</node_state>\n" " </status>\n" "</cib>" msgstr "" #. Tag: para #, no-c-format msgid "Before we make any changes, it’s a good idea to check the validity of the configuration." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# crm_verify -L -V\n" " error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined\n" " error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option\n" " error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity\n" "Errors found during check: config not valid" msgstr "" #. Tag: para #, no-c-format msgid "As you can see, the tool has found some errors." msgstr "" #. Tag: para #, no-c-format msgid "In order to guarantee the safety of your data (if the data is corrupt, there is little point in continuing to make it available), the default for STONITH (a common node fencing mechanism, used to ensure data integrity by powering off \"bad\" nodes) in Pacemaker is enabled. However, it also knows when no STONITH configuration has been supplied and reports this as a problem (since the cluster would not be able to make progress if a situation requiring node fencing arose)." msgstr "" #. Tag: para #, no-c-format msgid "We will disable this feature for now and configure it later." msgstr "" #. Tag: para #, no-c-format msgid "To disable STONITH, set the stonith-enabled cluster option to false:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs property set stonith-enabled=false\n" "[root@pcmk-1 ~]# crm_verify -L" msgstr "" #. Tag: para #, no-c-format msgid "With the new cluster option set, the configuration is now valid." msgstr "" #. Tag: para #, no-c-format msgid "The use of stonith-enabled=false is completely inappropriate for a production cluster. It tells the cluster to simply pretend that failed nodes are safely powered off. Some vendors will refuse to support clusters that have STONITH disabled." msgstr "" #.
Tag: para #, no-c-format msgid "We disable STONITH here only to defer the discussion of its configuration, which can differ widely from one installation to the next. See for information on why STONITH is important and details on how to configure it." msgstr "" #. Tag: title #, no-c-format msgid "Add a Resource" msgstr "" #. Tag: para #, no-c-format msgid "Our first resource will be a unique IP address that the cluster can bring up on either node. Regardless of where any cluster service(s) are running, end users need a consistent address to contact them on. Here, I will choose 192.168.122.120 as the floating address, give it the imaginative name ClusterIP and tell the cluster to check whether it is running every 30 seconds." msgstr "" #. Tag: para #, no-c-format msgid "The chosen address must not already be in use on the network. Do not reuse an IP address one of the nodes already has configured." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \\\n" " ip=192.168.122.120 cidr_netmask=32 op monitor interval=30s" msgstr "" #. Tag: para #, no-c-format msgid "Another important piece of information here is ocf:heartbeat:IPaddr2. This tells Pacemaker three things about the resource you want to add:" msgstr "" #. Tag: para #, no-c-format msgid "The first field (ocf in this case) is the standard to which the resource script conforms and where to find it." msgstr "" #. Tag: para #, no-c-format msgid "The second field (heartbeat in this case) is standard-specific; for OCF resources, it tells the cluster which OCF namespace the resource script is in." msgstr "" #. Tag: para #, no-c-format msgid "The third field (IPaddr2 in this case) is the name of the resource script." msgstr "" #. Tag: para #, no-c-format msgid "To obtain a list of the available resource standards (the ocf part of ocf:heartbeat:IPaddr2), run:" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource standards\n" "ocf\n" "lsb\n" "service\n" "systemd\n" "stonith" msgstr "" #. Tag: para #, no-c-format msgid "To obtain a list of the available OCF resource providers (the heartbeat part of ocf:heartbeat:IPaddr2), run:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource providers\n" "heartbeat\n" "openstack\n" "pacemaker" msgstr "" #. Tag: para #, no-c-format msgid "Finally, if you want to see all the resource agents available for a specific OCF provider (the IPaddr2 part of ocf:heartbeat:IPaddr2), run:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource agents ocf:heartbeat\n" "CTDB\n" "Delay\n" "Dummy\n" "Filesystem\n" "IPaddr\n" "IPaddr2\n" ".\n" ". (skipping lots of resources to save space)\n" ".\n" "rsyncd\n" "slapd\n" "symlink\n" "tomcat" msgstr "" #. Tag: para #, no-c-format msgid "Now, verify that the IP resource has been added, and display the cluster’s status to see that it is now active:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Tue Dec 16 17:44:40 2014\n" "Last change: Tue Dec 16 17:44:26 2014\n" "Stack: corosync\n" "Current DC: pcmk-1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "1 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-1\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: title #, no-c-format msgid "Perform a Failover" msgstr "" #. Tag: para #, no-c-format msgid "Since our ultimate goal is high availability, we should test failover of our new resource before moving on." msgstr "" #. Tag: para #, no-c-format msgid "First, find the node on which the IP address is running." 
msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Tue Dec 16 17:44:40 2014\n" "Last change: Tue Dec 16 17:44:26 2014\n" "Stack: corosync\n" "Current DC: pcmk-1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "1 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-1" msgstr "" #. Tag: para #, no-c-format msgid "You can see that the status of the ClusterIP resource is Started on a particular node (in this example, pcmk-1). Shut down Pacemaker and Corosync on that machine to trigger a failover." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster stop pcmk-1\n" "Stopping Cluster..." msgstr "" #. Tag: para #, no-c-format msgid "A cluster command such as pcs cluster stop nodename can be run from any node in the cluster, not just the affected node." msgstr "" #. Tag: para #, no-c-format msgid "Verify that pacemaker and corosync are no longer running:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status\n" "Error: cluster is not currently running on this node" msgstr "" #. Tag: para #, no-c-format msgid "Go to the other node, and check the cluster status." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-2 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Wed Dec 17 10:30:56 2014\n" "Last change: Tue Dec 16 17:44:26 2014\n" "Stack: corosync\n" "Current DC: pcmk-2 (2) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "1 Resources configured\n" "\n" "\n" "Online: [ pcmk-2 ]\n" "OFFLINE: [ pcmk-1 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. 
Tag: para #, no-c-format msgid "Notice that pcmk-1 is OFFLINE for cluster purposes (its PCSD is still Online, allowing it to receive pcs commands, but it is not participating in the cluster)." msgstr "" #. Tag: para #, no-c-format msgid "Also notice that ClusterIP is now running on pcmk-2; failover happened automatically, and no errors are reported." msgstr "" #. Tag: title #, no-c-format msgid "Quorum" msgstr "" #. Tag: para #, no-c-format msgid "If a cluster splits into two (or more) groups of nodes that can no longer communicate with each other (aka. partitions), quorum is used to prevent resources from starting on more nodes than desired, which would risk data corruption." msgstr "" #. Tag: para #, no-c-format msgid "A cluster has quorum when more than half of all known nodes are online in the same partition, or for the mathematically inclined, whenever the following equation is true:" msgstr "" #. Tag: literallayout #, no-c-format msgid "total_nodes < 2 * active_nodes" msgstr "" #. Tag: para #, no-c-format msgid "For example, if a 5-node cluster split into 3- and 2-node partitions, the 3-node partition would have quorum and could continue serving resources. If a 6-node cluster split into two 3-node partitions, neither partition would have quorum; pacemaker’s default behavior in such cases is to stop all resources, in order to prevent data corruption." msgstr "" #. Tag: para #, no-c-format msgid "Two-node clusters are a special case. By the above definition, a two-node cluster would only have quorum when both nodes are running. This would make the creation of a two-node cluster pointless (some would argue that two-node clusters are always pointless, but that is an argument for another time), but corosync has the ability to treat two-node clusters as if only one node is required for quorum." msgstr "" #.
Tag: para #, no-c-format msgid "The pcs cluster setup command will automatically configure two_node: 1 in corosync.conf, so a two-node cluster will \"just work\"." msgstr "" #. Tag: para #, no-c-format msgid "If you are using a different cluster shell, you will have to configure corosync.conf appropriately yourself. If you are using older versions of corosync, you will have to ignore quorum at the pacemaker level, using pcs property set no-quorum-policy=ignore (or the equivalent command if you are using a different cluster shell)." msgstr "" #. Tag: para #, no-c-format msgid "Now, simulate node recovery by restarting the cluster stack on pcmk-1, and check the cluster’s status. (It may take a little while before the cluster gets going on the node, but it eventually will look like the below.)" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster start pcmk-1\n" "pcmk-1: Starting Cluster...\n" "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Wed Dec 17 10:50:11 2014\n" "Last change: Tue Dec 16 17:44:26 2014\n" "Stack: corosync\n" "Current DC: pcmk-2 (2) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "1 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "With older versions of pacemaker, the cluster might move the IP back to its original location (pcmk-1). Usually, this is no longer the case." msgstr "" #. Tag: title #, no-c-format msgid "Prevent Resources from Moving after Recovery" msgstr "" #. Tag: para #, no-c-format msgid "In most circumstances, it is highly desirable to prevent healthy resources from being moved around the cluster. 
Moving resources almost always requires a period of downtime. For complex services such as databases, this period can be quite long." msgstr "" #. Tag: para #, no-c-format msgid "To address this, Pacemaker has the concept of resource stickiness, which controls how strongly a service prefers to stay running where it is. You may like to think of it as the \"cost\" of any downtime. By default, Pacemaker assumes there is zero cost associated with moving resources and will do so to achieve \"optimal\" resource placement. (Pacemaker’s definition of optimal may not always agree with that of a human. The order in which Pacemaker processes lists of resources and nodes creates implicit preferences in situations where the administrator has not explicitly specified them.) We can specify a different stickiness for every resource, but it is often sufficient to change the default." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource defaults resource-stickiness=100\n" "[root@pcmk-1 ~]# pcs resource defaults\n" "resource-stickiness: 100" msgstr "" #. Tag: para #, no-c-format msgid "Older versions of pcs required that rsc be added after resource in the above commands." msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ch-Apache.pot b/doc/Clusters_from_Scratch/pot/Ch-Apache.pot index 966ef24efe..0c2691edb5 100644 --- a/doc/Clusters_from_Scratch/pot/Ch-Apache.pot +++ b/doc/Clusters_from_Scratch/pot/Ch-Apache.pot @@ -1,490 +1,503 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #.
Tag: title #, no-c-format -msgid "Add Apache as a Cluster Service" +msgid "Add Apache HTTP Server as a Cluster Service" msgstr "" #. Tag: para #, no-c-format -msgid "Now that we have a basic but functional active/passive two-node cluster, we’re ready to add some real services. We’re going to start with Apache because it is a feature of many clusters and relatively simple to configure." +msgid " Apache HTTP Server " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Now that we have a basic but functional active/passive two-node cluster, we’re ready to add some real services. We’re going to start with Apache HTTP Server because it is a feature of many clusters and relatively simple to configure." msgstr "" #. Tag: title #, no-c-format msgid "Install Apache" msgstr "" #. Tag: para #, no-c-format msgid "Before continuing, we need to make sure Apache is installed on both hosts. We also need the wget tool in order for the cluster to be able to check the status of the Apache server." msgstr "" #. Tag: screen #, no-c-format msgid "# yum install -y httpd wget\n" "# firewall-cmd --permanent --add-service=http\n" "# firewall-cmd --reload" msgstr "" #. Tag: para #, no-c-format msgid "Do not enable the httpd service. Services that are intended to be managed via the cluster software should never be managed by the OS." msgstr "" #. Tag: para #, no-c-format msgid "It is often useful, however, to manually start the service, verify that it works, then stop it again, before adding it to the cluster. This allows you to resolve any non-cluster-related problems before continuing. Since this is a simple example, we’ll skip that step here." msgstr "" #. Tag: title #, no-c-format msgid "Create Website Documents" msgstr "" #. Tag: para #, no-c-format msgid "We need to create a page for Apache to serve. On &DISTRO; &DISTRO_VERSION;, the default Apache document root is /var/www/html, so we’ll create an index file there. 
For the moment, we will simplify things by serving a static site and manually synchronizing the data between the two nodes, so run this command on both nodes:" msgstr "" #. Tag: screen #, no-c-format msgid "# cat <<-END >/var/www/html/index.html\n" " <html>\n" " <body>My Test Site - $(hostname)</body>\n" " </html>\n" "END" msgstr "" #. Tag: title #, no-c-format msgid "Enable the Apache status URL" msgstr "" +#. Tag: para +#, no-c-format +msgid " Apache HTTP Server/server-status /server-status " +msgstr "" + #. Tag: para #, no-c-format msgid "In order to monitor the health of your Apache instance, and recover it if it fails, the resource agent used by Pacemaker assumes the server-status URL is available. On both nodes, enable the URL with:" msgstr "" #. Tag: screen #, no-c-format msgid "# cat <<-END >/etc/httpd/conf.d/status.conf\n" " <Location /server-status>\n" " SetHandler server-status\n" -" Order deny,allow\n" -" Deny from all\n" -" Allow from 127.0.0.1\n" +" Require local\n" " </Location>\n" "END" msgstr "" #. Tag: para #, no-c-format -msgid "If you are using a different operating system, server-status may already be enabled or may be configurable in a different location." +msgid "If you are using a different operating system, server-status may already be enabled or may be configurable in a different location. If you are using a version of Apache HTTP Server less than 2.4, the syntax will be different." msgstr "" #. Tag: title #, no-c-format msgid "Configure the Cluster" msgstr "" +#. Tag: para +#, no-c-format +msgid " Apache HTTP ServerApache resource configuration Apache resource configuration " +msgstr "" + #. Tag: para #, no-c-format msgid "At this point, Apache is ready to go, and all that needs to be done is to add it to the cluster. Let’s call the resource WebSite. We need to use an OCF resource script called apache in the heartbeat namespace. 
Compare the key used here, ocf:heartbeat:apache, with the one we used earlier for the IP address, ocf:heartbeat:IPaddr2. The script’s only required parameter is the path to the main Apache configuration file, and we’ll tell the cluster to check once a minute that Apache is still running." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource create WebSite ocf:heartbeat:apache \\\n" " configfile=/etc/httpd/conf/httpd.conf \\\n" " statusurl=\"http://localhost/server-status\" \\\n" " op monitor interval=1min" msgstr "" #. Tag: para #, no-c-format msgid "By default, the operation timeout for all resources' start, stop, and monitor operations is 20 seconds. In many cases, this timeout period is less than a particular resource’s advised timeout period. For the purposes of this tutorial, we will adjust the global operation timeout default to 240 seconds." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs resource op defaults timeout=240s\n" "[root@pcmk-1 ~]# pcs resource op defaults\n" "timeout: 240s" msgstr "" #. Tag: para #, no-c-format msgid "In a production cluster, it is usually better to adjust each resource’s start, stop, and monitor timeouts to values that are appropriate to the behavior observed in your environment, rather than adjust the global default." msgstr "" #. Tag: para #, no-c-format msgid "After a short delay, we should see the cluster start Apache." msgstr "" #.
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Wed Dec 17 12:40:41 2014\n" "Last change: Wed Dec 17 12:40:05 2014\n" "Stack: corosync\n" "Current DC: pcmk-2 (2) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "2 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2\n" " WebSite (ocf::heartbeat:apache): Started pcmk-1\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "Wait a moment, the WebSite resource isn’t running on the same host as our IP address!" msgstr "" #. Tag: para #, no-c-format msgid "If, in the pcs status output, you see the WebSite resource has failed to start, then you’ve likely not enabled the status URL correctly. You can check whether this is the problem by running:" msgstr "" #. Tag: literallayout #, no-c-format -msgid "wget -O - http://127.0.0.1/server-status" +msgid "wget -O - http://localhost/server-status" msgstr "" #. Tag: para #, no-c-format -msgid "If you see Connection refused in the output, then this is likely the problem. Ensure that Allow from 127.0.0.1 is present for the <Location /server-status> block." +msgid "If you see Not Found or Forbidden in the output, then this is likely the problem. Ensure that the <Location /server-status> block is correct." msgstr "" #. Tag: title #, no-c-format msgid "Ensure Resources Run on the Same Host" msgstr "" #. Tag: para #, no-c-format msgid "To reduce the load on any one machine, Pacemaker will generally try to spread the configured resources across the cluster nodes. However, we can tell the cluster that two resources are related and need to run on the same host (or not at all). 
Here, we instruct the cluster that WebSite can only run on the host that ClusterIP is active on." msgstr "" #. Tag: para #, no-c-format msgid "To achieve this, we use a colocation constraint that indicates it is mandatory for WebSite to run on the same node as ClusterIP. The \"mandatory\" part of the colocation constraint is indicated by using a score of INFINITY. The INFINITY score also means that if ClusterIP is not active anywhere, WebSite will not be permitted to run." msgstr "" #. Tag: para #, no-c-format msgid "If ClusterIP is not active anywhere, WebSite will not be permitted to run anywhere." msgstr "" #. Tag: para #, no-c-format msgid "Colocation constraints are \"directional\", in that they imply certain things about the order in which the two resources will have a location chosen. In this case, we’re saying that WebSite needs to be placed on the same machine as ClusterIP, which implies that the cluster must know the location of ClusterIP before choosing a location for WebSite." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs constraint colocation add WebSite with ClusterIP INFINITY\n" "[root@pcmk-1 ~]# pcs constraint\n" "Location Constraints:\n" "Ordering Constraints:\n" "Colocation Constraints:\n" " WebSite with ClusterIP (score:INFINITY)\n" "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Wed Dec 17 13:57:58 2014\n" "Last change: Wed Dec 17 13:57:22 2014\n" "Stack: corosync\n" "Current DC: pcmk-2 (2) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "2 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2\n" " WebSite (ocf::heartbeat:apache): Started pcmk-2\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. 
Tag: title #, no-c-format msgid "Ensure Resources Start and Stop in Order" msgstr "" #. Tag: para #, no-c-format msgid "Like many services, Apache can be configured to bind to specific IP addresses on a host or to the wildcard IP address. If Apache binds to the wildcard, it doesn’t matter whether an IP address is added before or after Apache starts; Apache will respond on that IP just the same. However, if Apache binds only to certain IP address(es), the order matters: If the address is added after Apache starts, Apache won’t respond on that address." msgstr "" #. Tag: para #, no-c-format msgid "To be sure our WebSite responds regardless of Apache’s address configuration, we need to make sure ClusterIP not only runs on the same node, but starts before WebSite. A colocation constraint only ensures the resources run together, not the order in which they are started and stopped." msgstr "" #. Tag: para #, no-c-format msgid "We do this by adding an ordering constraint. By default, all order constraints are mandatory, which means that the recovery of ClusterIP will also trigger the recovery of WebSite." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs constraint order ClusterIP then WebSite\n" "Adding ClusterIP WebSite (kind: Mandatory) (Options: first-action=start then-action=start)\n" "[root@pcmk-1 ~]# pcs constraint\n" "Location Constraints:\n" "Ordering Constraints:\n" " start ClusterIP then start WebSite (kind:Mandatory)\n" "Colocation Constraints:\n" " WebSite with ClusterIP (score:INFINITY)" msgstr "" #. Tag: title #, no-c-format msgid "Prefer One Node Over Another" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker does not rely on any sort of hardware symmetry between nodes, so it may well be that one machine is more powerful than the other. In such cases, it makes sense to host the resources on the more powerful node if it is available. To do this, we create a location constraint." msgstr "" #. 
Tag: para #, no-c-format msgid "In the location constraint below, we are saying the WebSite resource prefers the node pcmk-1 with a score of 50. Here, the score indicates how badly we’d like the resource to run at this location." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs constraint location WebSite prefers pcmk-1=50\n" "[root@pcmk-1 ~]# pcs constraint\n" "Location Constraints:\n" " Resource: WebSite\n" " Enabled on: pcmk-1 (score:50)\n" "Ordering Constraints:\n" " start ClusterIP then start WebSite (kind:Mandatory)\n" "Colocation Constraints:\n" " WebSite with ClusterIP (score:INFINITY)\n" "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Wed Dec 17 14:11:49 2014\n" "Last change: Wed Dec 17 14:11:20 2014\n" "Stack: corosync\n" "Current DC: pcmk-2 (2) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "2 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2\n" " WebSite (ocf::heartbeat:apache): Started pcmk-2\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "Wait a minute, the resources are still on pcmk-2!" msgstr "" #. Tag: para #, no-c-format msgid "Even though WebSite now prefers to run on pcmk-1, that preference is (intentionally) less than the resource stickiness (how much we preferred not to have unnecessary downtime)." msgstr "" #. Tag: para #, no-c-format msgid "To see the current placement scores, you can use a tool called crm_simulate." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# crm_simulate -sL\n" "\n" "Current cluster status:\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2\n" " WebSite (ocf::heartbeat:apache): Started pcmk-2\n" "\n" "Allocation scores:\n" "native_color: ClusterIP allocation score on pcmk-1: 50\n" "native_color: ClusterIP allocation score on pcmk-2: 200\n" "native_color: WebSite allocation score on pcmk-1: -INFINITY\n" "native_color: WebSite allocation score on pcmk-2: 100\n" "\n" "Transition Summary:" msgstr "" #. Tag: title #, no-c-format msgid "Move Resources Manually" msgstr "" #. Tag: para #, no-c-format msgid "There are always times when an administrator needs to override the cluster and force resources to move to a specific location. In this example, we will force the WebSite to move to pcmk-1 by updating our previous location constraint with a score of INFINITY." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs constraint location WebSite prefers pcmk-1=INFINITY\n" "[root@pcmk-1 ~]# pcs constraint\n" "Location Constraints:\n" " Resource: WebSite\n" " Enabled on: pcmk-1 (score:INFINITY)\n" "Ordering Constraints:\n" " start ClusterIP then start WebSite (kind:Mandatory)\n" "Colocation Constraints:\n" " WebSite with ClusterIP (score:INFINITY)\n" "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Wed Dec 17 14:19:34 2014\n" "Last change: Wed Dec 17 14:18:37 2014\n" "Stack: corosync\n" "Current DC: pcmk-2 (2) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "2 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-1\n" " WebSite (ocf::heartbeat:apache): Started pcmk-1\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. 
Tag: para #, no-c-format msgid "Once we’ve finished whatever activity required us to move the resources to pcmk-1 (in our case nothing), we can then allow the cluster to resume normal operation by removing the new constraint. Since we previously configured a default stickiness, the resources will remain on pcmk-1." msgstr "" #. Tag: para #, no-c-format msgid "First, use the --full option to get the constraint’s ID:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs constraint --full\n" "Location Constraints:\n" " Resource: WebSite\n" " Enabled on: pcmk-1 (score:INFINITY) (id:location-WebSite-pcmk-1-INFINITY)\n" "Ordering Constraints:\n" " start ClusterIP then start WebSite (kind:Mandatory) (id:order-ClusterIP-WebSite-mandatory)\n" "Colocation Constraints:\n" " WebSite with ClusterIP (score:INFINITY) (id:colocation-WebSite-ClusterIP-INFINITY)" msgstr "" #. Tag: para #, no-c-format msgid "Then remove the desired constraint using its ID:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs constraint remove location-WebSite-pcmk-1-INFINITY\n" "[root@pcmk-1 ~]# pcs constraint\n" "Location Constraints:\n" "Ordering Constraints:\n" " start ClusterIP then start WebSite (kind:Mandatory)\n" "Colocation Constraints:\n" " WebSite with ClusterIP (score:INFINITY)" msgstr "" #. Tag: para #, no-c-format msgid "Note that the location constraint is now gone. If we check the cluster status, we can also see that (as expected) the resources are still active on pcmk-1." msgstr "" #.
Tag: screen #, no-c-format msgid "# pcs status\n" "Cluster name: mycluster\n" "Last updated: Wed Dec 17 14:25:21 2014\n" "Last change: Wed Dec 17 14:24:29 2014\n" "Stack: corosync\n" "Current DC: pcmk-2 (2) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "2 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-1\n" " WebSite (ocf::heartbeat:apache): Started pcmk-1\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ch-Installation.pot b/doc/Clusters_from_Scratch/pot/Ch-Installation.pot index 396fb7042c..d1b5ef1c08 100644 --- a/doc/Clusters_from_Scratch/pot/Ch-Installation.pot +++ b/doc/Clusters_from_Scratch/pot/Ch-Installation.pot @@ -1,712 +1,712 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Installation" msgstr "" #. Tag: title #, no-c-format msgid "Install &DISTRO; &DISTRO_VERSION;" msgstr "" #. Tag: title #, no-c-format msgid "Boot the Install Image" msgstr "" #. Tag: para #, no-c-format msgid "Download the 4GB &DISTRO; &DISTRO_VERSION; DVD ISO. Use the image to boot a virtual machine, or burn it to a DVD or USB drive and boot a physical server from that." msgstr "" #. Tag: para #, no-c-format msgid "After starting the installation, select your language and keyboard layout at the welcome screen." msgstr "" #. 
Tag: title #, no-c-format msgid "&DISTRO; &DISTRO_VERSION; Installation Welcome Screen" msgstr "" #. Tag: title #, no-c-format msgid "Installation Options" msgstr "" #. Tag: para #, no-c-format msgid "At this point, you get a chance to tweak the default installation options." msgstr "" #. Tag: title #, no-c-format msgid "&DISTRO; &DISTRO_VERSION; Installation Summary Screen" msgstr "" #. Tag: para #, no-c-format msgid "Ignore the SOFTWARE SELECTION section (try saying that 10 times quickly). The Infrastructure Server environment does have add-ons with much of the software we need, but we will leave it as a Minimal Install here, so that we can see exactly what software is required later." msgstr "" #. Tag: title #, no-c-format msgid "Configure Network" msgstr "" #. Tag: para #, no-c-format msgid "In the NETWORK & HOSTNAME section:" msgstr "" #. Tag: para #, no-c-format msgid "Edit Host Name: as desired. For this example, we will use pcmk-1.localdomain." msgstr "" #. Tag: para #, no-c-format msgid "Select your network device, press Configure…, and manually assign a fixed IP address. For this example, we’ll use 192.168.122.101 under IPv4 Settings (with an appropriate netmask, gateway and DNS server)." msgstr "" #. Tag: para #, no-c-format msgid "Flip the switch to turn your network device on." msgstr "" #. Tag: para #, no-c-format msgid "Do not accept the default network settings. Cluster machines should never obtain an IP address via DHCP, because DHCP’s periodic address renewal will interfere with corosync." msgstr "" #. Tag: title #, no-c-format msgid "Configure Disk" msgstr "" #. Tag: para #, no-c-format msgid "By default, the installer’s automatic partitioning will use LVM (which allows us to dynamically change the amount of space allocated to a given partition). However, it allocates all free space to the / (aka. root) partition, which cannot be reduced in size later (dynamic increases are fine)." msgstr "" #. 
Tag: para #, no-c-format msgid "In order to follow the DRBD and GFS2 portions of this guide, we need to reserve space on each machine for a replicated volume." msgstr "" #. Tag: para #, no-c-format msgid "Enter the INSTALLATION DESTINATION section, ensure the hard drive you want to install to is selected, select I will configure partitioning, and press Done." msgstr "" #. Tag: para #, no-c-format msgid "In the MANUAL PARTITIONING screen that comes next, click the option to create mountpoints automatically. Select the / mountpoint, and reduce the desired capacity by 1GiB or so. Select Modify… by the volume group name, and change the Size policy: to As large as possible, to make the reclaimed space available inside the LVM volume group. We’ll add the additional volume later." msgstr "" #. Tag: title #, no-c-format msgid "Configure Time Synchronization" msgstr "" #. Tag: para #, no-c-format msgid "It is highly recommended to enable NTP on your cluster nodes. Doing so ensures all nodes agree on the current time and makes reading log files significantly easier." msgstr "" #. Tag: para #, no-c-format msgid "&DISTRO; will enable NTP automatically. If you want to change any time-related settings (such as time zone or NTP server), you can do this in the TIME & DATE section." msgstr "" #. Tag: title #, no-c-format msgid "Finish Install" msgstr "" #. Tag: para #, no-c-format msgid "Select Begin Installation. Once it completes, set a root password, and reboot as instructed. For the purposes of this document, it is not necessary to create any additional users. After the node reboots, you’ll see a login prompt on the console. Log in as root with the password you created earlier." msgstr "" #. Tag: title #, no-c-format msgid "&DISTRO; &DISTRO_VERSION; Console Prompt" msgstr "" #. Tag: para #, no-c-format msgid "From here on, we’re going to be working exclusively from the terminal." msgstr "" #. Tag: title #, no-c-format msgid "Configure the OS" msgstr "" #.
Tag: title #, no-c-format msgid "Verify Networking" msgstr "" #. Tag: para #, no-c-format msgid "Ensure that the machine has the static IP address you configured earlier." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ip addr\n" "1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN\n" " link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00\n" " inet 127.0.0.1/8 scope host lo\n" " inet6 ::1/128 scope host\n" " valid_lft forever preferred_lft forever\n" "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000\n" " link/ether 52:54:00:d7:d6:08 brd ff:ff:ff:ff:ff:ff\n" " inet 192.168.122.101/24 brd 192.168.122.255 scope global eth0\n" " valid_lft forever preferred_lft forever\n" " inet6 fe80::5054:ff:fed7:d608/64 scope link\n" " valid_lft forever preferred_lft forever" msgstr "" #. Tag: para #, no-c-format msgid "If you ever need to change the node’s IP address from the command line, follow" msgstr "" #. Tag: literallayout #, no-c-format msgid "[root@pcmk-1 ~]# vi /etc/sysconfig/network-scripts/ifcfg-${device} # manually edit as desired\n" "[root@pcmk-1 ~]# nmcli dev disconnect ${device}\n" "[root@pcmk-1 ~]# nmcli con reload ${device}\n" "[root@pcmk-1 ~]# nmcli con up ${device}" msgstr "" #. Tag: para #, no-c-format msgid "This makes NetworkManager aware that a change was made on the config file." msgstr "" #. Tag: para #, no-c-format msgid "Next, ensure that the routes are as expected:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ip route\n" "default via 192.168.122.1 dev eth0 proto static metric 100\n" "192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.101 metric 100" msgstr "" #. Tag: para #, no-c-format msgid "If there is no line beginning with default via, then you may need to add a line such as" msgstr "" #. Tag: programlisting #, no-c-format msgid "GATEWAY=\"192.168.122.1\"" msgstr "" #. 
Tag: para #, no-c-format msgid "to the device configuration using the same process as described above for changing the IP address." msgstr "" #. Tag: para #, no-c-format msgid "Now, check for connectivity to the outside world. Start small by testing whether we can reach the gateway we configured." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ping -c 1 192.168.122.1\n" "PING 192.168.122.1 (192.168.122.1) 56(84) bytes of data.\n" "64 bytes from 192.168.122.1: icmp_req=1 ttl=64 time=0.249 ms\n" "\n" " --- 192.168.122.1 ping statistics ---\n" "1 packets transmitted, 1 received, 0% packet loss, time 0ms\n" "rtt min/avg/max/mdev = 0.249/0.249/0.249/0.000 ms" msgstr "" #. Tag: para #, no-c-format msgid "Now try something external; choose a location you know should be available." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ping -c 1 www.google.com\n" "PING www.l.google.com (173.194.72.106) 56(84) bytes of data.\n" "64 bytes from tf-in-f106.1e100.net (173.194.72.106): icmp_req=1 ttl=41 time=167 ms\n" "\n" " --- www.l.google.com ping statistics ---\n" "1 packets transmitted, 1 received, 0% packet loss, time 0ms\n" "rtt min/avg/max/mdev = 167.618/167.618/167.618/0.000 ms" msgstr "" #. Tag: title #, no-c-format msgid "Login Remotely" msgstr "" #. Tag: para #, no-c-format msgid "The console isn’t a very friendly place to work from, so we will now switch to accessing the machine remotely via SSH where we can use copy and paste, etc." msgstr "" #. Tag: para #, no-c-format msgid "From another host, check whether we can see the new host at all:" msgstr "" #. Tag: screen #, no-c-format msgid "beekhof@f16 ~ # ping -c 1 192.168.122.101\n" "PING 192.168.122.101 (192.168.122.101) 56(84) bytes of data.\n" "64 bytes from 192.168.122.101: icmp_req=1 ttl=64 time=1.01 ms\n" "\n" "--- 192.168.122.101 ping statistics ---\n" "1 packets transmitted, 1 received, 0% packet loss, time 0ms\n" "rtt min/avg/max/mdev = 1.012/1.012/1.012/0.000 ms" msgstr "" #. 
Tag: para #, no-c-format msgid "Next, log in as root via SSH." msgstr "" #. Tag: screen #, no-c-format msgid "beekhof@f16 ~ # ssh -l root 192.168.122.101\n" "The authenticity of host '192.168.122.101 (192.168.122.101)' can't be established.\n" "ECDSA key fingerprint is 6e:b7:8f:e2:4c:94:43:54:a8:53:cc:20:0f:29:a4:e0.\n" "Are you sure you want to continue connecting (yes/no)? yes\n" "Warning: Permanently added '192.168.122.101' (ECDSA) to the list of known hosts.\n" "root@192.168.122.101's password:\n" "Last login: Tue Aug 11 13:14:39 2015\n" "[root@pcmk-1 ~]#" msgstr "" #. Tag: title #, no-c-format msgid "Apply Updates" msgstr "" #. Tag: para #, no-c-format msgid "Apply any package updates released since your installation image was created:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# yum update" msgstr "" #. Tag: title #, no-c-format msgid "Use Short Node Names" msgstr "" #. Tag: para #, no-c-format msgid "During installation, we filled in the machine’s fully qualified domain name (FQDN), which can be rather long when it appears in cluster logs and status output. See for yourself how the machine identifies itself: Nodesshort name short name " msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# uname -n\n" "pcmk-1.localdomain" msgstr "" #. Tag: para #, no-c-format msgid " NodesDomain name (Query) Domain name (Query) " msgstr "" #. Tag: para #, no-c-format msgid "We can use the hostnamectl tool to strip off the domain name:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# hostnamectl set-hostname $(uname -n | sed s/\\\\..*//)" msgstr "" #. Tag: para #, no-c-format msgid " NodesDomain name (Remove from host name) Domain name (Remove from host name) " msgstr "" #. Tag: para #, no-c-format msgid "Now, check that the machine is using the correct name:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# uname -n\n" "pcmk-1" msgstr "" #. Tag: title #, no-c-format msgid "Repeat for Second Node" msgstr "" #.
Tag: para #, no-c-format msgid "Repeat the Installation steps so far, so that you have two nodes ready to have the cluster software installed." msgstr "" #. Tag: para #, no-c-format msgid "For the purposes of this document, the additional node is called pcmk-2 with address 192.168.122.102." msgstr "" #. Tag: title #, no-c-format msgid "Configure Communication Between Nodes" msgstr "" #. Tag: title #, no-c-format msgid "Configure Host Name Resolution" msgstr "" #. Tag: para #, no-c-format msgid "Confirm that you can communicate between the two new nodes:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ping -c 3 192.168.122.102\n" "PING 192.168.122.102 (192.168.122.102) 56(84) bytes of data.\n" "64 bytes from 192.168.122.102: icmp_seq=1 ttl=64 time=0.343 ms\n" "64 bytes from 192.168.122.102: icmp_seq=2 ttl=64 time=0.402 ms\n" "64 bytes from 192.168.122.102: icmp_seq=3 ttl=64 time=0.558 ms\n" "\n" "--- 192.168.122.102 ping statistics ---\n" "3 packets transmitted, 3 received, 0% packet loss, time 2000ms\n" "rtt min/avg/max/mdev = 0.343/0.434/0.558/0.092 ms" msgstr "" #. Tag: para #, no-c-format msgid "Now we need to make sure we can communicate with the machines by their name. If you have a DNS server, add additional entries for the two machines. Otherwise, you’ll need to add the machines to /etc/hosts on both nodes. Below are the entries for my cluster nodes:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# grep pcmk /etc/hosts\n" "192.168.122.101 pcmk-1.clusterlabs.org pcmk-1\n" "192.168.122.102 pcmk-2.clusterlabs.org pcmk-2" msgstr "" #. Tag: para #, no-c-format msgid "We can now verify the setup by again using ping:" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ping -c 3 pcmk-2\n" "PING pcmk-2.clusterlabs.org (192.168.122.102) 56(84) bytes of data.\n" "64 bytes from pcmk-2.clusterlabs.org (192.168.122.102): icmp_seq=1 ttl=64 time=0.164 ms\n" "64 bytes from pcmk-2.clusterlabs.org (192.168.122.102): icmp_seq=2 ttl=64 time=0.475 ms\n" "64 bytes from pcmk-2.clusterlabs.org (192.168.122.102): icmp_seq=3 ttl=64 time=0.186 ms\n" "\n" "--- pcmk-2.clusterlabs.org ping statistics ---\n" "3 packets transmitted, 3 received, 0% packet loss, time 2001ms\n" "rtt min/avg/max/mdev = 0.164/0.275/0.475/0.141 ms" msgstr "" #. Tag: title #, no-c-format msgid "Configure SSH" msgstr "" #. Tag: para #, no-c-format msgid "SSH is a convenient and secure way to copy files and perform commands remotely. For the purposes of this guide, we will create a key without a password (using the -N option) so that we can perform remote actions without being prompted." msgstr "" #. Tag: para #, no-c-format msgid " SSH " msgstr "" #. Tag: para #, no-c-format msgid "Unprotected SSH keys (those without a password) are not recommended for servers exposed to the outside world. We use them here only to simplify the demo." msgstr "" #. Tag: para #, no-c-format msgid "Create a new key and allow anyone with that key to log in:" msgstr "" #. Tag: title #, no-c-format msgid "Creating and Activating a new SSH Key" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ssh-keygen -t dsa -f ~/.ssh/id_dsa -N \"\"\n" "Generating public/private dsa key pair.\n" "Your identification has been saved in /root/.ssh/id_dsa.\n" "Your public key has been saved in /root/.ssh/id_dsa.pub.\n" "The key fingerprint is:\n" "91:09:5c:82:5a:6a:50:08:4e:b2:0c:62:de:cc:74:44 root@pcmk-1.clusterlabs.org\n" "The key's randomart image is:\n" "+--[ DSA 1024]----+\n" "|==.ooEo.. |\n" "|X O + .o o |\n" "| * A + |\n" "| + . |\n" "| .
S |\n" "| |\n" "| |\n" "| |\n" "| |\n" "+-----------------+\n" "[root@pcmk-1 ~]# cp ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys" msgstr "" #. Tag: para #, no-c-format msgid " Creating and Activating a new SSH Key " msgstr "" #. Tag: para #, no-c-format msgid "Install the key on the other node:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# scp -r ~/.ssh pcmk-2:\n" "The authenticity of host 'pcmk-2 (192.168.122.102)' can't be established.\n" "ECDSA key fingerprint is a4:f5:b2:34:9d:86:2b:34:a2:87:37:b9:ca:68:52:ec.\n" "Are you sure you want to continue connecting (yes/no)? yes\n" "Warning: Permanently added 'pcmk-2,192.168.122.102' (ECDSA) to the list of known hosts.\n" "root@pcmk-2's password:\n" "id_dsa.pub 100% 616 0.6KB/s 00:00\n" "id_dsa 100% 672 0.7KB/s 00:00\n" "known_hosts 100% 400 0.4KB/s 00:00\n" "authorized_keys 100% 616 0.6KB/s 00:00" msgstr "" #. Tag: para #, no-c-format msgid "Test that you can now run commands remotely, without being prompted:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ssh pcmk-2 -- uname -n\n" "pcmk-2" msgstr "" #. Tag: title #, no-c-format msgid "Install the Cluster Software" msgstr "" #. Tag: para #, no-c-format msgid "Fire up a shell on both nodes and run the following to install pacemaker, and while we’re at it, some command-line tools to make our lives easier:" msgstr "" #. Tag: screen #, no-c-format msgid "# yum install -y pacemaker pcs psmisc policycoreutils-python" msgstr "" #. Tag: para #, no-c-format msgid "This document will show commands that need to be executed on both nodes with a simple # prompt. Be sure to run them on each node individually." msgstr "" #. Tag: para #, no-c-format msgid "This document uses pcs for cluster management. Other alternatives, such as crmsh, are available, but their syntax will differ from the examples used here." msgstr "" #. Tag: title #, no-c-format msgid "Configure the Cluster Software" msgstr "" #. 
Tag: title #, no-c-format msgid "Allow cluster services through firewall" msgstr "" #. Tag: para #, no-c-format msgid "On each node, allow cluster-related services through the local firewall:" msgstr "" #. Tag: screen #, no-c-format msgid "# firewall-cmd --permanent --add-service=high-availability\n" "success\n" "# firewall-cmd --reload\n" "success" msgstr "" #. Tag: para #, no-c-format msgid "If you are using iptables directly, or some other firewall solution besides firewalld, simply open the following ports, which can be used by various clustering components: TCP ports 2224, 3121, and 21064, and UDP port 5405." msgstr "" #. Tag: para #, no-c-format msgid "If you run into any problems during testing, you might want to disable the firewall and SELinux entirely until you have everything working. This may create significant security issues and should not be performed on machines that will be exposed to the outside world, but may be appropriate during development and testing on a protected host." msgstr "" #. Tag: para #, no-c-format msgid "To disable security measures:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# setenforce 0\n" "[root@pcmk-1 ~]# sed -i.bak \"s/SELINUX=enforcing/SELINUX=permissive/g\" /etc/selinux/config\n" "[root@pcmk-1 ~]# systemctl disable firewalld.service\n" "[root@pcmk-1 ~]# systemctl stop firewalld.service\n" "[root@pcmk-1 ~]# iptables --flush" msgstr "" #. Tag: title #, no-c-format msgid "Enable pcs Daemon" msgstr "" #. Tag: para #, no-c-format msgid "Before the cluster can be configured, the pcs daemon must be started and enabled to start at boot time on each node. This daemon works with the pcs command-line interface to manage synchronizing the corosync configuration across all nodes in the cluster." msgstr "" #. Tag: para #, no-c-format msgid "Start and enable the daemon by issuing the following commands on each node:" msgstr "" #. 
Tag: screen #, no-c-format msgid "# systemctl start pcsd.service\n" "# systemctl enable pcsd.service\n" "ln -s '/usr/lib/systemd/system/pcsd.service' '/etc/systemd/system/multi-user.target.wants/pcsd.service'" msgstr "" #. Tag: para #, no-c-format msgid "The installed packages will create a hacluster user with a disabled password. While this is fine for running pcs commands locally, the account needs a login password in order to perform such tasks as syncing the corosync configuration, or starting and stopping the cluster on other nodes." msgstr "" #. Tag: para #, no-c-format msgid "This tutorial will make use of such commands, so now we will set a password for the hacluster user, using the same password on both nodes:" msgstr "" #. Tag: screen #, no-c-format msgid "# passwd hacluster\n" "Changing password for user hacluster.\n" "New password:\n" "Retype new password:\n" "passwd: all authentication tokens updated successfully." msgstr "" #. Tag: para #, no-c-format msgid "Alternatively, to script this process or set the password on a different machine from the one you’re logged into, you can use the --stdin option for passwd:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ssh pcmk-2 -- 'echo redhat1 | passwd --stdin hacluster'" msgstr "" #. Tag: title #, no-c-format msgid "Configure Corosync" msgstr "" #. Tag: para #, no-c-format msgid "On either node, use pcs cluster auth to authenticate as the hacluster user:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster auth pcmk-1 pcmk-2\n" "Username: hacluster\n" "Password:\n" "pcmk-1: Authorized\n" "pcmk-2: Authorized" msgstr "" #. Tag: para #, no-c-format msgid "Next, use pcs cluster setup on the same node to generate and synchronize the corosync configuration:" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster setup --name mycluster pcmk-1 pcmk-2\n" "Shutting down pacemaker/corosync services...\n" "Redirecting to /bin/systemctl stop pacemaker.service\n" "Redirecting to /bin/systemctl stop corosync.service\n" "Killing any remaining services...\n" "Removing all cluster configuration files...\n" "pcmk-1: Succeeded\n" "pcmk-2: Succeeded" msgstr "" #. Tag: para #, no-c-format msgid "If you received an authorization error for either of those commands, make sure you configured the hacluster user account on each node with the same password." msgstr "" #. Tag: para #, no-c-format msgid "Early versions of pcs required that --name be omitted from the above command." msgstr "" #. Tag: para #, no-c-format msgid "If you are not using pcs for cluster administration, follow whatever procedures are appropriate for your tools to create a corosync.conf and copy it to all nodes." msgstr "" #. Tag: para #, no-c-format msgid "The pcs command will configure corosync to use UDP unicast transport; if you choose to use multicast instead, choose a multicast address carefully. For some subtle issues, see the now-defunct http://web.archive.org/web/20101211210054/http://29west.com/docs/THPM/multicast-address-assignment.html or the more detailed treatment in Cisco’s Guidelines for Enterprise IP Multicast Address Allocation paper." msgstr "" #. Tag: para #, no-c-format msgid "The final /etc/corosync.conf configuration on each node should look something like the sample in ." msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ch-Intro.pot b/doc/Clusters_from_Scratch/pot/Ch-Intro.pot index 686edc3126..f94ca5946d 100644 --- a/doc/Clusters_from_Scratch/pot/Ch-Intro.pot +++ b/doc/Clusters_from_Scratch/pot/Ch-Intro.pot @@ -1,289 +1,289 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Read-Me-First" msgstr "" #. Tag: title #, no-c-format msgid "The Scope of this Document" msgstr "" #. Tag: para #, no-c-format msgid "Computer clusters can be used to provide highly available services or resources. The redundancy of multiple machines is used to guard against failures of many types." msgstr "" #. Tag: para #, no-c-format msgid "This document will walk through the installation and setup of simple clusters using the &DISTRO; distribution, version &DISTRO_VERSION;." msgstr "" #. Tag: para #, no-c-format msgid "The clusters described here will use Pacemaker and Corosync to provide resource management and messaging. Required packages and modifications to their configuration files are described, along with the use of the Pacemaker command-line tool for generating the XML used for cluster control." msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker is a central component and provides the resource management required in these systems. This management includes detecting and recovering from the failure of various nodes, resources and services under its control." msgstr "" #. Tag: para #, no-c-format msgid "When more in-depth information is required, and for real-world usage, please refer to the Pacemaker Explained manual." msgstr "" #. Tag: title #, no-c-format msgid "What Is Pacemaker?" msgstr "" #.
Tag: para #, no-c-format msgid "Pacemaker is a cluster resource manager: the logic responsible for the life cycle of deployed software (and, indirectly, perhaps even whole systems or their interconnections) under its control within a set of computers (a.k.a. nodes), driven by prescribed rules." msgstr "" #. Tag: para #, no-c-format msgid "It achieves maximum availability for your cluster services (a.k.a. resources) by detecting and recovering from node- and resource-level failures, making use of the messaging and membership capabilities provided by your preferred cluster infrastructure (either Corosync or Heartbeat), and possibly utilizing other parts of the overall cluster stack." msgstr "" #. Tag: para #, no-c-format msgid "The term high availability was coined for this goal of minimal downtime and, together with its acronym HA, is well established in the sector. Where context requires distinguishing this sort of cluster from high-performance computing (HPC) clusters (not the case in this document), the term HA cluster can be used." msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker’s key features include:" msgstr "" #. Tag: para #, no-c-format msgid "Detection and recovery of node and service-level failures" msgstr "" #. Tag: para #, no-c-format msgid "Storage agnostic, no requirement for shared storage" msgstr "" #. Tag: para #, no-c-format msgid "Resource agnostic, anything that can be scripted can be clustered" msgstr "" #. Tag: para #, no-c-format msgid "Supports fencing (also referred to as the STONITH acronym, deciphered later on) for ensuring data integrity" msgstr "" #. Tag: para #, no-c-format msgid "Supports large and small clusters" msgstr "" #. Tag: para #, no-c-format msgid "Supports both quorate and resource-driven clusters" msgstr "" #. Tag: para #, no-c-format msgid "Supports practically any redundancy configuration" msgstr "" #.
Tag: para #, no-c-format msgid "Automatically replicated configuration that can be updated from any node" msgstr "" #. Tag: para #, no-c-format msgid "Ability to specify cluster-wide service ordering, colocation and anti-colocation" msgstr "" #. Tag: para #, no-c-format msgid "Support for advanced service types" msgstr "" #. Tag: para #, no-c-format msgid "Clones: for services which need to be active on multiple nodes" msgstr "" #. Tag: para #, no-c-format msgid "Multi-state: for services with multiple modes (e.g. master/slave, primary/secondary)" msgstr "" #. Tag: para #, no-c-format msgid "Unified, scriptable cluster management tools" msgstr "" #. Tag: title #, no-c-format msgid "Pacemaker Architecture" msgstr "" #. Tag: para #, no-c-format msgid "At the highest level, the cluster is made up of three pieces:" msgstr "" #. Tag: para #, no-c-format msgid "Non-cluster-aware components. These pieces include the resources themselves; scripts that start, stop and monitor them; and a local daemon that masks the differences between the different standards these scripts implement. Even though interactions of these resources when run as multiple instances can resemble a distributed system, they still lack the proper HA mechanisms and/or autonomous cluster-wide governance as subsumed in the following item." msgstr "" #. Tag: para #, no-c-format msgid "Resource management. Pacemaker provides the brain that processes and reacts to events regarding the cluster. These events include nodes joining or leaving the cluster; resource events caused by failures, maintenance and scheduled activities; and other administrative actions. Pacemaker will compute the ideal state of the cluster and plot a path to achieve it after any of these events. This may include moving resources, stopping nodes and even forcing them offline with remote power switches." msgstr "" #. Tag: para #, no-c-format msgid "Low-level infrastructure. 
Projects like Corosync, CMAN and Heartbeat provide reliable messaging, membership and quorum information about the cluster." msgstr "" #. Tag: para #, no-c-format msgid "When combined with Corosync, Pacemaker also supports popular open source cluster filesystems. Even though Pacemaker also supports Heartbeat, the filesystems need to use the stack for messaging and membership, and Corosync seems to be what they’re standardizing on. Technically, it would be possible for them to support Heartbeat as well, but there seems little interest in this. " msgstr "" #. Tag: para #, no-c-format msgid "Due to past standardization within the cluster filesystem community, cluster filesystems make use of a common distributed lock manager, which makes use of Corosync for its messaging and membership capabilities (which nodes are up/down) and Pacemaker for fencing services." msgstr "" #. Tag: title #, no-c-format msgid "The Pacemaker Stack" msgstr "" #. Tag: title #, no-c-format msgid "Internal Components" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker itself is composed of five key components:" msgstr "" #. Tag: para #, no-c-format msgid "Cluster Information Base (CIB)" msgstr "" #. Tag: para #, no-c-format msgid "Cluster Resource Management daemon (CRMd)" msgstr "" #. Tag: para #, no-c-format msgid "Local Resource Management daemon (LRMd)" msgstr "" #. Tag: para #, no-c-format msgid "Policy Engine (PEngine or PE)" msgstr "" #. Tag: para #, no-c-format msgid "Fencing daemon (STONITHd)" msgstr "" #. Tag: para #, no-c-format msgid "The CIB uses XML to represent both the cluster’s configuration and current state of all resources in the cluster. The contents of the CIB are automatically kept in sync across the entire cluster and are used by the PEngine to compute the ideal state of the cluster and how it should be achieved." msgstr "" #. Tag: para #, no-c-format msgid "This list of instructions is then fed to the Designated Controller (DC). 
Pacemaker centralizes all cluster decision making by electing one of the CRMd instances to act as a master. Should the elected CRMd process (or the node it is on) fail, a new one is quickly established." msgstr "" #. Tag: para #, no-c-format msgid "The DC carries out the PEngine’s instructions in the required order by passing them to either the Local Resource Management daemon (LRMd) or CRMd peers on other nodes via the cluster messaging infrastructure (which in turn passes them on to their LRMd process)." msgstr "" #. Tag: para #, no-c-format msgid "The peer nodes all report the results of their operations back to the DC, which, based on the expected and actual results, will either execute any actions that needed to wait for the previous one to complete, or abort processing and ask the PEngine to recalculate the ideal cluster state based on the unexpected results." msgstr "" #. Tag: para #, no-c-format msgid "In some cases, it may be necessary to power off nodes in order to protect shared data or complete resource recovery. For this, Pacemaker comes with STONITHd." msgstr "" #. Tag: para #, no-c-format msgid "STONITH is an acronym for Shoot-The-Other-Node-In-The-Head, a recommended practice in which a misbehaving node is promptly fenced (shut off, cut off from shared resources or otherwise immobilized), usually implemented with a remote power switch." msgstr "" #. Tag: para #, no-c-format msgid "In Pacemaker, STONITH devices are modeled as resources (and configured in the CIB) so that they can be easily monitored for failure. However, STONITHd takes care of understanding the STONITH topology, so its clients simply request that a node be fenced, and it does the rest." msgstr "" #. Tag: title #, no-c-format msgid "Types of Pacemaker Clusters" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker makes no assumptions about your environment.
This allows it to support practically any redundancy configuration including Active/Active, Active/Passive, N+1, N+M, N-to-1 and N-to-N." msgstr "" #. Tag: title #, no-c-format msgid "Active/Passive Redundancy" msgstr "" #. Tag: para #, no-c-format msgid "Two-node Active/Passive clusters using Pacemaker and DRBD are a cost-effective solution for many High Availability situations." msgstr "" #. Tag: title #, no-c-format msgid "Shared Failover" msgstr "" #. Tag: para #, no-c-format msgid "By supporting many nodes, Pacemaker can dramatically reduce hardware costs by allowing several active/passive clusters to be combined and share a common backup node." msgstr "" #. Tag: title #, no-c-format msgid "N to N Redundancy" msgstr "" #. Tag: para #, no-c-format msgid "When shared storage is available, every node can potentially be used for failover. Pacemaker can even run multiple copies of services to spread out the workload." msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ch-Shared-Storage.pot b/doc/Clusters_from_Scratch/pot/Ch-Shared-Storage.pot index 520b1a2ac7..2cef0b4397 100644 --- a/doc/Clusters_from_Scratch/pot/Ch-Shared-Storage.pot +++ b/doc/Clusters_from_Scratch/pot/Ch-Shared-Storage.pot @@ -1,695 +1,695 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Replicate Storage Using DRBD" msgstr "" #. Tag: para #, no-c-format msgid "Even if you’re serving up static websites, having to manually synchronize the contents of that website to all the machines in the cluster is not ideal. For dynamic websites, such as a wiki, it’s not even an option. 
Not everyone can afford network-attached storage, but somehow the data needs to be kept in sync." msgstr "" #. Tag: para #, no-c-format msgid "Enter DRBD, which can be thought of as network-based RAID-1. See http://www.drbd.org/ for details." msgstr "" #. Tag: title #, no-c-format msgid "Install the DRBD Packages" msgstr "" #. Tag: para #, no-c-format msgid "DRBD itself is included in the upstream kernel (since version 2.6.33), but we do need some utilities to use it effectively." msgstr "" #. Tag: para #, no-c-format msgid "CentOS does not ship these utilities, so we need to enable a third-party repository to get them. Supported packages for many OSes are available from DRBD’s maker LINBIT, but here we’ll use the free ELRepo repository." msgstr "" #. Tag: para #, no-c-format msgid "On both nodes, import the ELRepo package signing key, and enable the repository:" msgstr "" #. Tag: screen #, no-c-format msgid "# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org\n" "# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm" msgstr "" #. Tag: para #, no-c-format msgid "Now, we can install the DRBD kernel module and utilities:" msgstr "" #. Tag: screen #, no-c-format msgid "# yum install -y kmod-drbd84 drbd84-utils" msgstr "" #. Tag: para #, no-c-format msgid "The version of drbd84-utils shipped with CentOS 7.1 has a bug in the Pacemaker integration script. Until a fix is packaged, download the affected script directly from upstream, on both nodes:" msgstr "" #. Tag: screen #, no-c-format msgid "# curl -o /usr/lib/ocf/resource.d/linbit/drbd 'http://git.linbit.com/gitweb.cgi?p=drbd-utils.git;a=blob_plain;f=scripts/drbd.ocf;h=cf6b966341377a993d1bf5f585a5b9fe72eaa5f2;hb=c11ba026bbbbc647b8112543df142f2185cb4b4b'" msgstr "" #. Tag: para #, no-c-format msgid "This is a temporary fix that will be overwritten if the package is upgraded." msgstr "" #.
Tag: para #, no-c-format msgid "DRBD will not be able to run under the default SELinux security policies. If you are familiar with SELinux, you can modify the policies in a more fine-grained manner, but here we will simply exempt DRBD processes from SELinux control:" msgstr "" #. Tag: screen #, no-c-format msgid "# semanage permissive -a drbd_t" msgstr "" #. Tag: para #, no-c-format msgid "We will configure DRBD to use port 7789, so allow that port from each host to the other:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" source address=\"192.168.122.102\" port port=\"7789\" protocol=\"tcp\" accept'\n" "success\n" "[root@pcmk-1 ~]# firewall-cmd --reload\n" "success" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-2 ~]# firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" source address=\"192.168.122.101\" port port=\"7789\" protocol=\"tcp\" accept'\n" "success\n" "[root@pcmk-2 ~]# firewall-cmd --reload\n" "success" msgstr "" #. Tag: para #, no-c-format msgid "In this example, we have only two nodes, and all network traffic is on the same LAN. In production, it is recommended to use a dedicated, isolated network for cluster-related traffic, so the firewall configuration would likely be different; one approach would be to add the dedicated network interfaces to the trusted zone." msgstr "" #. Tag: title #, no-c-format msgid "Allocate a Disk Volume for DRBD" msgstr "" #. Tag: para #, no-c-format msgid "DRBD will need its own block device on each node. This can be a physical disk partition or logical volume, of whatever size you need for your data. For this document, we will use a 1GiB logical volume, which is more than sufficient for a single HTML file and (later) GFS2 metadata." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# vgdisplay | grep -e Name -e Free\n" " VG Name centos_pcmk-1\n" " Free PE / Size 382 / 1.49 GiB\n" "[root@pcmk-1 ~]# lvcreate --name drbd-demo --size 1G centos_pcmk-1\n" "Logical volume \"drbd-demo\" created\n" "[root@pcmk-1 ~]# lvs\n" " LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert\n" " drbd-demo centos_pcmk-1 -wi-a----- 1.00g\n" " root centos_pcmk-1 -wi-ao---- 5.00g\n" " swap centos_pcmk-1 -wi-ao---- 1.00g" msgstr "" #. Tag: para #, no-c-format msgid "Repeat for the second node, making sure to use the same size:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ssh pcmk-2 -- lvcreate --name drbd-demo --size 1G centos_pcmk-2\n" "Logical volume \"drbd-demo\" created" msgstr "" #. Tag: title #, no-c-format msgid "Configure DRBD" msgstr "" #. Tag: para #, no-c-format msgid "There is no series of commands for building a DRBD configuration, so simply run this on both nodes to use this sample configuration:" msgstr "" #. Tag: screen #, no-c-format msgid "# cat <<END >/etc/drbd.d/wwwdata.res\n" "resource wwwdata {\n" " protocol C;\n" " meta-disk internal;\n" " device /dev/drbd1;\n" " syncer {\n" " verify-alg sha1;\n" " }\n" " net {\n" " allow-two-primaries;\n" " }\n" " on pcmk-1 {\n" " disk /dev/centos_pcmk-1/drbd-demo;\n" " address 192.168.122.101:7789;\n" " }\n" " on pcmk-2 {\n" " disk /dev/centos_pcmk-2/drbd-demo;\n" " address 192.168.122.102:7789;\n" " }\n" "}\n" "END" msgstr "" #. Tag: para #, no-c-format msgid "Edit the file to use the hostnames, IP addresses and logical volume paths of your nodes if they differ from the ones used in this guide." msgstr "" #. Tag: para #, no-c-format msgid "Detailed information on the directives used in this configuration (and other alternatives) is available at http://www.drbd.org/users-guide/ch-configure.html" msgstr "" #. Tag: para #, no-c-format msgid "The allow-two-primaries option would not normally be used in an active/passive cluster. 
We are adding it here for the convenience of changing to an active/active cluster later." msgstr "" #. Tag: title #, no-c-format msgid "Initialize DRBD" msgstr "" #. Tag: para #, no-c-format msgid "With the configuration in place, we can now get DRBD running." msgstr "" #. Tag: para #, no-c-format msgid "These commands create the local metadata for the DRBD resource, ensure the DRBD kernel module is loaded, and bring up the DRBD resource. Run them on one node:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# drbdadm create-md wwwdata\n" "initializing activity log\n" "NOT initializing bitmap\n" "Writing meta data...\n" "New drbd meta data block successfully created.\n" "[root@pcmk-1 ~]# modprobe drbd\n" "[root@pcmk-1 ~]# drbdadm up wwwdata" msgstr "" #. Tag: para #, no-c-format msgid "We can confirm DRBD’s status on this node:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# cat /proc/drbd\n" "version: 8.4.6 (api:1/proto:86-101)\n" "GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R7, 2015-04-10 05:13:52\n" "\n" " 1: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----s\n" " ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1048508" msgstr "" #. Tag: para #, no-c-format msgid "Because we have not yet initialized the data, this node’s data is marked as Inconsistent. Because we have not yet initialized the second node, the local state is WFConnection (waiting for connection), and the partner node’s status is marked as Unknown." msgstr "" #. Tag: para #, no-c-format msgid "Now, repeat the above commands on the second node. This time, when we check the status, it shows:" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-2 ~]# cat /proc/drbd\n" "version: 8.4.6 (api:1/proto:86-101)\n" "GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R7, 2015-04-10 05:13:52\n" "\n" " 1: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----\n" " ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1048508" msgstr "" #. Tag: para #, no-c-format msgid "You can see the state has changed to Connected, meaning the two DRBD nodes are communicating properly, and both nodes are in Secondary role with Inconsistent data." msgstr "" #. Tag: para #, no-c-format msgid "To make the data consistent, we need to tell DRBD which node should be considered to have the correct data. In this case, since we are creating a new resource, both have garbage, so we’ll just pick pcmk-1 and run this command on it:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# drbdadm primary --force wwwdata" msgstr "" #. Tag: para #, no-c-format msgid "If you are using an older version of DRBD, the required syntax may be different. See the documentation for your version for how to perform these commands." msgstr "" #. Tag: para #, no-c-format msgid "If we check the status immediately, we’ll see something like this:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# cat /proc/drbd\n" "version: 8.4.6 (api:1/proto:86-101)\n" "GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R7, 2015-04-10 05:13:52\n" "\n" " 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----\n" " ns:2872 nr:0 dw:0 dr:3784 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1045636\n" " [>....................] sync'ed: 0.4% (1045636/1048508)K\n" " finish: 0:10:53 speed: 1,436 (1,436) K/sec" msgstr "" #. 
Tag: para #, no-c-format msgid "We can see that this node has the Primary role, the partner node has the Secondary role, this node’s data is now considered UpToDate, the partner node’s data is still Inconsistent, and a progress bar shows how far along the partner node is in synchronizing the data." msgstr "" #. Tag: para #, no-c-format msgid "After a while, the sync should finish, and you’ll see something like:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# cat /proc/drbd\n" "version: 8.4.6 (api:1/proto:86-101)\n" "GIT-hash: 833d830e0152d1e457fa7856e71e11248ccf3f70 build by phil@Build64R7, 2015-04-10 05:13:52\n" "\n" " 1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----\n" " ns:1048508 nr:0 dw:0 dr:1049420 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0" msgstr "" #. Tag: para #, no-c-format msgid "Both sets of data are now UpToDate, and we can proceed to creating and populating a filesystem for our WebSite resource’s documents." msgstr "" #. Tag: title #, no-c-format msgid "Populate the DRBD Disk" msgstr "" #. Tag: para #, no-c-format msgid "On the node with the primary role (pcmk-1 in this example), create a filesystem on the DRBD device:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# mkfs.xfs /dev/drbd1\n" "meta-data=/dev/drbd1 isize=256 agcount=4, agsize=65532 blks\n" " = sectsz=512 attr=2, projid32bit=1\n" " = crc=0 finobt=0\n" "data = bsize=4096 blocks=262127, imaxpct=25\n" " = sunit=0 swidth=0 blks\n" "naming =version 2 bsize=4096 ascii-ci=0 ftype=0\n" "log =internal log bsize=4096 blocks=853, version=2\n" " = sectsz=512 sunit=0 blks, lazy-count=1\n" "realtime =none extsz=4096 blocks=0, rtextents=0" msgstr "" #. Tag: para #, no-c-format msgid "In this example, we create an xfs filesystem with no special options. In a production environment, you should choose a filesystem type and options that are suitable for your application." msgstr "" #. 
Tag: para #, no-c-format msgid "Mount the newly created filesystem, populate it with our web document, give it the same SELinux policy as the web document root, then unmount it (the cluster will handle mounting and unmounting it later):" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# mount /dev/drbd1 /mnt\n" "[root@pcmk-1 ~]# cat <<-END >/mnt/index.html\n" " <html>\n" " <body>My Test Site - DRBD</body>\n" " </html>\n" "END\n" "[root@pcmk-1 ~]# chcon -R --reference=/var/www/html /mnt\n" "[root@pcmk-1 ~]# umount /dev/drbd1" msgstr "" #. Tag: title #, no-c-format msgid "Configure the Cluster for the DRBD device" msgstr "" #. Tag: para #, no-c-format msgid "One handy feature pcs has is the ability to queue up several changes into a file and commit those changes atomically. To do this, start by populating the file with the current raw XML config from the CIB." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib drbd_cfg" msgstr "" #. Tag: para #, no-c-format msgid "Using the pcs -f option, make changes to the configuration saved in the drbd_cfg file. These changes will not be seen by the cluster until the drbd_cfg file is pushed into the live cluster’s CIB later." msgstr "" #. Tag: para #, no-c-format msgid "Here, we create a cluster resource for the DRBD device, and an additional clone resource to allow the resource to run on both nodes at the same time." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs -f drbd_cfg resource create WebData ocf:linbit:drbd \\\n" " drbd_resource=wwwdata op monitor interval=60s\n" "[root@pcmk-1 ~]# pcs -f drbd_cfg resource master WebDataClone WebData \\\n" " master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \\\n" " notify=true\n" "[root@pcmk-1 ~]# pcs -f drbd_cfg resource show\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started\n" " WebSite (ocf::heartbeat:apache): Started\n" " Master/Slave Set: WebDataClone [WebData]\n" " Stopped: [ pcmk-1 pcmk-2 ]" msgstr "" #. 
Tag: para #, no-c-format msgid "After you are satisfied with all the changes, you can commit them all at once by pushing the drbd_cfg file into the live CIB." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib-push drbd_cfg\n" "CIB updated" msgstr "" #. Tag: para #, no-c-format msgid "Early versions of pcs required push cib in place of cib-push above." msgstr "" #. Tag: para #, no-c-format msgid "Let’s see what the cluster did with the new configuration:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Aug 14 09:29:41 2015\n" "Last change: Fri Aug 14 09:29:25 2015\n" "Stack: corosync\n" "Current DC: pcmk-1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "4 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-1\n" " WebSite (ocf::heartbeat:apache): Started pcmk-1\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-1 ]\n" " Slaves: [ pcmk-2 ]\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "We can see that WebDataClone (our DRBD device) is running as master (DRBD’s primary role) on pcmk-1 and slave (DRBD’s secondary role) on pcmk-2." msgstr "" #. Tag: para #, no-c-format msgid "The resource agent should load the DRBD module when needed if it’s not already loaded. If that does not happen, configure your operating system to load the module at boot time. For &DISTRO; &DISTRO_VERSION;, you would run this on both nodes:" msgstr "" #. Tag: screen #, no-c-format msgid "# echo drbd >/etc/modules-load.d/drbd.conf" msgstr "" #. Tag: title #, no-c-format msgid "Configure the Cluster for the Filesystem" msgstr "" #. 
Tag: para #, no-c-format msgid "Now that we have a working DRBD device, we need to mount its filesystem." msgstr "" #. Tag: para #, no-c-format msgid "In addition to defining the filesystem, we also need to tell the cluster where it can be located (only on the DRBD Primary) and when it is allowed to start (after the Primary was promoted)." msgstr "" #. Tag: para #, no-c-format msgid "We are going to take a shortcut when creating the resource this time. Instead of explicitly saying we want the ocf:heartbeat:Filesystem script, we are only going to ask for Filesystem. We can do this because we know there is only one resource script named Filesystem available to pacemaker, and that pcs is smart enough to fill in the ocf:heartbeat: portion for us correctly in the configuration. If there were multiple Filesystem scripts from different OCF providers, we would need to specify the exact one we wanted." msgstr "" #. Tag: para #, no-c-format msgid "Once again, we will queue our changes to a file and then push the new configuration to the cluster as the final step." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib fs_cfg\n" "[root@pcmk-1 ~]# pcs -f fs_cfg resource create WebFS Filesystem \\\n" " device=\"/dev/drbd1\" directory=\"/var/www/html\" fstype=\"xfs\"\n" "[root@pcmk-1 ~]# pcs -f fs_cfg constraint colocation add WebFS with WebDataClone INFINITY with-rsc-role=Master\n" "[root@pcmk-1 ~]# pcs -f fs_cfg constraint order promote WebDataClone then start WebFS\n" "Adding WebDataClone WebFS (kind: Mandatory) (Options: first-action=promote then-action=start)" msgstr "" #. Tag: para #, no-c-format msgid "We also need to tell the cluster that Apache needs to run on the same machine as the filesystem and that it must be active before Apache can start." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs -f fs_cfg constraint colocation add WebSite with WebFS INFINITY\n" "[root@pcmk-1 ~]# pcs -f fs_cfg constraint order WebFS then WebSite\n" "Adding WebFS WebSite (kind: Mandatory) (Options: first-action=start then-action=start)" msgstr "" #. Tag: para #, no-c-format msgid "Review the updated configuration." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs -f fs_cfg constraint\n" "Location Constraints:\n" "Ordering Constraints:\n" " start ClusterIP then start WebSite (kind:Mandatory)\n" " promote WebDataClone then start WebFS (kind:Mandatory)\n" " start WebFS then start WebSite (kind:Mandatory)\n" "Colocation Constraints:\n" " WebSite with ClusterIP (score:INFINITY)\n" " WebFS with WebDataClone (score:INFINITY) (with-rsc-role:Master)\n" " WebSite with WebFS (score:INFINITY)" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs -f fs_cfg resource show\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started\n" " WebSite (ocf::heartbeat:apache): Started\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-1 ]\n" " Slaves: [ pcmk-2 ]\n" " WebFS (ocf::heartbeat:Filesystem): Stopped" msgstr "" #. Tag: para #, no-c-format msgid "After reviewing the new configuration, upload it and watch the cluster put it into effect." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster cib-push fs_cfg\n" "[root@pcmk-1 ~]# pcs status\n" "Last updated: Fri Aug 14 09:34:11 2015\n" "Last change: Fri Aug 14 09:34:09 2015\n" "Stack: corosync\n" "Current DC: pcmk-1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "5 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-1\n" " WebSite (ocf::heartbeat:apache): Started pcmk-1\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-1 ]\n" " Slaves: [ pcmk-2 ]\n" " WebFS (ocf::heartbeat:Filesystem): Started pcmk-1\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: title #, no-c-format msgid "Test Cluster Failover" msgstr "" #. Tag: para #, no-c-format msgid "Previously, we used pcs cluster stop pcmk-1 to stop all cluster services on pcmk-1, failing over the cluster resources, but there is another way to safely simulate node failure." msgstr "" #. Tag: para #, no-c-format msgid "We can put the node into standby mode. Nodes in this state continue to run corosync and pacemaker but are not allowed to run resources. Any resources found active there will be moved elsewhere. This feature can be particularly useful when performing system administration tasks such as updating packages used by cluster resources." msgstr "" #. Tag: para #, no-c-format msgid "Put the active node into standby mode, and observe the cluster move all the resources to the other node. The node’s status will change to indicate that it can no longer host resources." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster standby pcmk-1\n" "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Aug 14 09:36:49 2015\n" "Last change: Fri Aug 14 09:36:43 2015\n" "Stack: corosync\n" "Current DC: pcmk-1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "5 Resources configured\n" "\n" "\n" "Node pcmk-1 (1): standby\n" "Online: [ pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2\n" " WebSite (ocf::heartbeat:apache): Started pcmk-2\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-2 ]\n" " Stopped: [ pcmk-1 ]\n" " WebFS (ocf::heartbeat:Filesystem): Started pcmk-2\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "Once we’ve done everything we needed to on pcmk-1 (in this case nothing, we just wanted to see the resources move), we can allow the node to be a full cluster member again." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster unstandby pcmk-1\n" "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Aug 14 09:38:02 2015\n" "Last change: Fri Aug 14 09:37:56 2015\n" "Stack: corosync\n" "Current DC: pcmk-1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "5 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" " ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2\n" " WebSite (ocf::heartbeat:apache): Started pcmk-2\n" " Master/Slave Set: WebDataClone [WebData]\n" " Masters: [ pcmk-2 ]\n" " Slaves: [ pcmk-1 ]\n" " WebFS (ocf::heartbeat:Filesystem): Started pcmk-2\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "Notice that pcmk-1 is back to the Online state, and that the cluster resources stay where they are due to our resource stickiness settings configured earlier." msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ch-Stonith.pot b/doc/Clusters_from_Scratch/pot/Ch-Stonith.pot index 94738db0f8..6bb747ce29 100644 --- a/doc/Clusters_from_Scratch/pot/Ch-Stonith.pot +++ b/doc/Clusters_from_Scratch/pot/Ch-Stonith.pot @@ -1,252 +1,252 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Configure STONITH" msgstr "" #. Tag: title #, no-c-format msgid "What is STONITH?" msgstr "" #. 
Tag: para #, no-c-format msgid "STONITH (Shoot The Other Node In The Head aka. fencing) protects your data from being corrupted by rogue nodes or unintended concurrent access." msgstr "" #. Tag: para #, no-c-format msgid "Just because a node is unresponsive doesn’t mean it has stopped accessing your data. The only way to be 100% sure that your data is safe, is to use STONITH to ensure that the node is truly offline before allowing the data to be accessed from another node." msgstr "" #. Tag: para #, no-c-format msgid "STONITH also has a role to play in the event that a clustered service cannot be stopped. In this case, the cluster uses STONITH to force the whole node offline, thereby making it safe to start the service elsewhere." msgstr "" #. Tag: title #, no-c-format msgid "Choose a STONITH Device" msgstr "" #. Tag: para #, no-c-format msgid "It is crucial that your STONITH device can allow the cluster to differentiate between a node failure and a network failure." msgstr "" #. Tag: para #, no-c-format msgid "A common mistake people make when choosing a STONITH device is to use a remote power switch (such as many on-board IPMI controllers) that shares power with the node it controls. If the power fails in such a case, the cluster cannot be sure whether the node is really offline, or active and suffering from a network fault, so the cluster will stop all resources to avoid a possible split-brain situation." msgstr "" #. Tag: para #, no-c-format msgid "Likewise, any device that relies on the machine being active (such as SSH-based \"devices\" sometimes used during testing) is inappropriate." msgstr "" #. Tag: title #, no-c-format msgid "Configure the Cluster for STONITH" msgstr "" #. Tag: para #, no-c-format msgid "Install the STONITH agent(s). To see what packages are available, run yum search fence-. Be sure to install the package(s) on all cluster nodes." msgstr "" #. 
Tag: para #, no-c-format msgid "Configure the STONITH device itself to be able to fence your nodes and accept fencing requests. This includes any necessary configuration on the device and on the nodes, and any firewall or SELinux changes needed. Test the communication between the device and your nodes." msgstr "" #. Tag: para #, no-c-format msgid "Find the correct STONITH agent script: pcs stonith list" msgstr "" #. Tag: para #, no-c-format msgid "Find the parameters associated with the device: pcs stonith describe agent_name" msgstr "" #. Tag: para #, no-c-format msgid "Create a local copy of the CIB: pcs cluster cib stonith_cfg" msgstr "" #. Tag: para #, no-c-format msgid "Create the fencing resource: pcs -f stonith_cfg stonith create stonith_id stonith_device_type [stonith_device_options]" msgstr "" #. Tag: para #, no-c-format msgid "Any flags that do not take arguments, such as --ssl, should be passed as ssl=1." msgstr "" #. Tag: para #, no-c-format msgid "Enable STONITH in the cluster: pcs -f stonith_cfg property set stonith-enabled=true" msgstr "" #. Tag: para #, no-c-format msgid "If the device does not know how to fence nodes based on their uname, you may also need to set the special pcmk_host_map parameter. See man stonithd for details." msgstr "" #. Tag: para #, no-c-format msgid "If the device does not support the list command, you may also need to set the special pcmk_host_list and/or pcmk_host_check parameters. See man stonithd for details." msgstr "" #. Tag: para #, no-c-format msgid "If the device does not expect the victim to be specified with the port parameter, you may also need to set the special pcmk_host_argument parameter. See man stonithd for details." msgstr "" #. Tag: para #, no-c-format msgid "Commit the new configuration: pcs cluster cib-push stonith_cfg" msgstr "" #. 
Tag: para #, no-c-format msgid "Once the STONITH resource is running, test it (you might want to stop the cluster on that machine first): stonith_admin --reboot nodename" msgstr "" #. Tag: title #, no-c-format msgid "Example" msgstr "" #. Tag: para #, no-c-format msgid "For this example, assume we have a chassis containing four nodes and an IPMI device active on 10.0.0.1. Following the steps above would go something like this:" msgstr "" #. Tag: para #, no-c-format msgid "Step 1: Install the fence-agents-ipmilan package on both nodes." msgstr "" #. Tag: para #, no-c-format msgid "Step 2: Configure the IP address, authentication credentials, etc. in the IPMI device itself." msgstr "" #. Tag: para #, no-c-format msgid "Step 3: Choose the fence_ipmilan STONITH agent." msgstr "" #. Tag: para #, no-c-format msgid "Step 4: Obtain the agent’s possible parameters:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs stonith describe fence_ipmilan\n" "Stonith options for: fence_ipmilan\n" " ipport: TCP/UDP port to use for connection with device\n" " inet6_only: Forces agent to use IPv6 addresses only\n" " ipaddr (required): IP Address or Hostname\n" " passwd_script: Script to retrieve password\n" " method: Method to fence (onoff|cycle)\n" " inet4_only: Forces agent to use IPv4 addresses only\n" " passwd: Login password or passphrase\n" " lanplus: Use Lanplus to improve security of connection\n" " auth: IPMI Lan Auth type.\n" " cipher: Ciphersuite to use (same as ipmitool -C parameter)\n" " privlvl: Privilege level on IPMI device\n" " action (required): Fencing Action\n" " login: Login Name\n" " verbose: Verbose mode\n" " debug: Write debug information to given file\n" " version: Display version information and exit\n" " help: Display help and exit\n" " power_wait: Wait X seconds after issuing ON/OFF\n" " login_timeout: Wait X seconds for cmd prompt after login\n" " power_timeout: Test X seconds for status change after ON/OFF\n" " delay: Wait X seconds before 
fencing is started\n" " ipmitool_path: Path to ipmitool binary\n" " shell_timeout: Wait X seconds for cmd prompt after issuing command\n" " retry_on: Count of attempts to retry power on\n" " sudo: Use sudo (without password) when calling 3rd party sotfware.\n" " stonith-timeout: How long to wait for the STONITH action (reboot, on, off) to complete per a stonith device.\n" " priority: The priority of the stonith resource. Devices are tried in order of highest priority to lowest.\n" " pcmk_host_map: A mapping of host names to ports numbers for devices that do not support host names.\n" " pcmk_host_list: A list of machines controlled by this device (Optional unless pcmk_host_check=static-list).\n" " pcmk_host_check: How to determine which machines are controlled by the device." msgstr "" #. Tag: para #, no-c-format msgid "Step 5: pcs cluster cib stonith_cfg" msgstr "" #. Tag: para #, no-c-format msgid "Step 6: Here are example parameters for creating our STONITH resource:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs -f stonith_cfg stonith create ipmi-fencing fence_ipmilan \\\n" " pcmk_host_list=\"pcmk-1 pcmk-2\" ipaddr=10.0.0.1 login=testuser \\\n" " passwd=acd123 op monitor interval=60s\n" "[root@pcmk-1 ~]# pcs -f stonith_cfg stonith\n" " ipmi-fencing (stonith:fence_ipmilan): Stopped" msgstr "" #. Tag: para #, no-c-format msgid "Steps 7-10: Enable STONITH in the cluster:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs -f stonith_cfg property set stonith-enabled=true\n" "[root@pcmk-1 ~]# pcs -f stonith_cfg property\n" "Cluster Properties:\n" " cluster-infrastructure: corosync\n" " cluster-name: mycluster\n" " dc-version: 1.1.12-a14efad\n" " have-watchdog: false\n" " stonith-enabled: true" msgstr "" #. Tag: para #, no-c-format msgid "Step 11: pcs cluster cib-push stonith_cfg" msgstr "" #. Tag: para #, no-c-format msgid "Step 12: Test:" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster stop pcmk-2\n" "[root@pcmk-1 ~]# stonith_admin --reboot pcmk-2" msgstr "" #. Tag: para #, no-c-format msgid "After a successful test, login to any rebooted nodes, and start the cluster (with pcs cluster start)." msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ch-Tools.pot b/doc/Clusters_from_Scratch/pot/Ch-Tools.pot index 27d8cf17ca..07e11c1358 100644 --- a/doc/Clusters_from_Scratch/pot/Ch-Tools.pot +++ b/doc/Clusters_from_Scratch/pot/Ch-Tools.pot @@ -1,149 +1,144 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Pacemaker Tools" msgstr "" #. Tag: title #, no-c-format msgid "Simplify administration using a cluster shell" msgstr "" #. Tag: para #, no-c-format msgid "In the dark past, configuring Pacemaker required the administrator to read and write XML. In true UNIX style, there were also a number of different commands that specialized in different aspects of querying and updating the cluster." msgstr "" #. Tag: para #, no-c-format msgid "All of that has been greatly simplified with the creation of unified command-line shells (and GUIs) that hide all the messy XML scaffolding." msgstr "" #. Tag: para #, no-c-format msgid "These shells take all the individual aspects required for managing and configuring a cluster, and pack them into one simple-to-use command line tool." msgstr "" #. Tag: para #, no-c-format msgid "They even allow you to queue up several changes at once and commit them atomically." msgstr "" #. 
Tag: para #, no-c-format msgid "Two popular command-line shells are pcs and crmsh. This edition of Clusters from Scratch is based on pcs." msgstr "" #. Tag: para #, no-c-format msgid "The two shells share many concepts but the scope, layout and syntax does differ, so make sure you read the version of this guide that corresponds to the software installed on your system." msgstr "" -#. Tag: para -#, no-c-format -msgid "Since pcs has the ability to manage all aspects of the cluster (both corosync and pacemaker), it requires a specific cluster stack to be in use: corosync 2.0 or later with votequorum plus Pacemaker 1.1.8 or later." -msgstr "" - #. Tag: title #, no-c-format msgid "Explore pcs" msgstr "" #. Tag: para #, no-c-format msgid "Start by taking some time to familiarize yourself with what pcs can do." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs\n" "Usage: pcs [-f file] [-h] [commands]...\n" "Control and configure pacemaker and corosync.\n" "\n" "Options:\n" " -h, --help Display usage and exit\n" " -f file Perform actions on file instead of active CIB\n" " --debug Print all network traffic and external commands run\n" " --version Print pcs version information\n" "\n" "Commands:\n" " cluster Configure cluster options and nodes\n" " resource Manage cluster resources\n" " stonith Configure fence devices\n" " constraint Set resource constraints\n" " property Set pacemaker properties\n" " acl Set pacemaker access control lists\n" " status View cluster status\n" " config View and manage cluster configuration" msgstr "" #. Tag: para #, no-c-format msgid "As you can see, the different aspects of cluster management are separated into categories: resource, cluster, stonith, property, constraint, and status. To discover the functionality available in each of these categories, one can issue the command pcs category help. Below is an example of all the options available under the status category." msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status help\n" "Usage: pcs status [commands]...\n" "View current cluster and resource status\n" "Commands:\n" " [status] [--full]\n" " View all information about the cluster and resources (--full provides\n" " more details)\n" "\n" " resources\n" " View current status of cluster resources\n" "\n" " groups\n" " View currently configured groups and their resources\n" "\n" " cluster\n" " View current cluster status\n" "\n" " corosync\n" " View current membership information as seen by corosync\n" "\n" " nodes [corosync|both|config]\n" " View current status of nodes from pacemaker. If 'corosync' is\n" " specified, print nodes currently configured in corosync, if 'both'\n" " is specified, print nodes from both corosync & pacemaker. If 'config'\n" " is specified, print nodes from corosync & pacemaker configuration.\n" "\n" " pcsd <node> ...\n" " Show the current status of pcsd on the specified nodes\n" "\n" " xml\n" " View xml version of status (output from crm_mon -r -1 -X)" msgstr "" #. Tag: para #, no-c-format msgid "Additionally, if you are interested in the version and supported cluster stack(s) available with your Pacemaker installation, run:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pacemakerd --features\n" "Pacemaker 1.1.12 (Build: a14efad)\n" " Supporting v3.0.9: generated-manpages agent-manpages ascii-docs publican-docs ncurses libqb-logging libqb-ipc upstart systemd nagios corosync-native atomic-attrd acls" msgstr "" #. Tag: para #, no-c-format msgid "If the SNMP and/or email options are not listed, then Pacemaker was not built to support them. This may be by the choice of your distribution, or the required libraries may not have been available. Please contact whoever supplied you with the packages for more details." 
msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Ch-Verification.pot b/doc/Clusters_from_Scratch/pot/Ch-Verification.pot index 6ad0af650c..3dbbceff9a 100644 --- a/doc/Clusters_from_Scratch/pot/Ch-Verification.pot +++ b/doc/Clusters_from_Scratch/pot/Ch-Verification.pot @@ -1,203 +1,203 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Start and Verify Cluster" msgstr "" #. Tag: title #, no-c-format msgid "Start the Cluster" msgstr "" #. Tag: para #, no-c-format msgid "Now that corosync is configured, it is time to start the cluster. The command below will start corosync and pacemaker on both nodes in the cluster. If you are issuing the start command from a different node than the one you ran the pcs cluster auth command on earlier, you must authenticate on the current node you are logged into before you will be allowed to start the cluster." msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs cluster start --all\n" "pcmk-1: Starting Cluster...\n" "pcmk-2: Starting Cluster..." msgstr "" #. Tag: para #, no-c-format msgid "An alternative to using the pcs cluster start --all command is to issue either of the below command sequences on each node in the cluster separately:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs cluster start\n" "Starting Cluster..." msgstr "" #. Tag: para #, no-c-format msgid "or" msgstr "" #. Tag: screen #, no-c-format msgid "# systemctl start corosync.service\n" "# systemctl start pacemaker.service" msgstr "" #. 
Tag: para #, no-c-format msgid "In this example, we are not enabling the corosync and pacemaker services to start at boot. If a cluster node fails or is rebooted, you will need to run pcs cluster start nodename (or --all) to start the cluster on it. While you could enable the services to start at boot, requiring a manual start of cluster services gives you the opportunity to do a post-mortem investigation of a node failure before returning it to the cluster." msgstr "" #. Tag: title #, no-c-format msgid "Verify Corosync Installation" msgstr "" #. Tag: para #, no-c-format msgid "First, use corosync-cfgtool to check whether cluster communication is happy:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# corosync-cfgtool -s\n" "Printing ring status.\n" "Local node ID 1\n" "RING ID 0\n" " id = 192.168.122.101\n" " status = ring 0 active with no faults" msgstr "" #. Tag: para #, no-c-format msgid "We can see here that everything appears normal with our fixed IP address (not a 127.0.0.x loopback address) listed as the id, and no faults for the status." msgstr "" #. Tag: para #, no-c-format msgid "If you see something different, you might want to start by checking the node’s network, firewall and selinux configurations." msgstr "" #. Tag: para #, no-c-format msgid "Next, check the membership and quorum APIs:" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# corosync-cmapctl | grep members\n" "runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0\n" "runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.122.101)\n" "runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1\n" "runtime.totem.pg.mrp.srp.members.1.status (str) = joined\n" "runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0\n" "runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.122.102)\n" "runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 2\n" "runtime.totem.pg.mrp.srp.members.2.status (str) = joined\n" "\n" "[root@pcmk-1 ~]# pcs status corosync\n" "Membership information\n" " --------------------------\n" " Nodeid Votes Name\n" " 1 1 pcmk-1 (local)\n" " 2 1 pcmk-2" msgstr "" #. Tag: para #, no-c-format msgid "You should see both nodes have joined the cluster." msgstr "" #. Tag: title #, no-c-format msgid "Verify Pacemaker Installation" msgstr "" #. Tag: para #, no-c-format msgid "Now that we have confirmed that Corosync is functional, we can check the rest of the stack. Pacemaker has already been started, so verify the necessary processes are running:" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# ps axf\n" " PID TTY STAT TIME COMMAND\n" " 2 ? S 0:00 [kthreadd]\n" "...lots of processes...\n" " 1362 ? Ssl 0:35 corosync\n" " 1379 ? Ss 0:00 /usr/sbin/pacemakerd -f\n" " 1380 ? Ss 0:00 \\_ /usr/libexec/pacemaker/cib\n" " 1381 ? Ss 0:00 \\_ /usr/libexec/pacemaker/stonithd\n" " 1382 ? Ss 0:00 \\_ /usr/libexec/pacemaker/lrmd\n" " 1383 ? Ss 0:00 \\_ /usr/libexec/pacemaker/attrd\n" " 1384 ? Ss 0:00 \\_ /usr/libexec/pacemaker/pengine\n" " 1385 ? Ss 0:00 \\_ /usr/libexec/pacemaker/crmd" msgstr "" #. Tag: para #, no-c-format msgid "If that looks OK, check the pcs status output:" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# pcs status\n" "Cluster name: mycluster\n" "WARNING: no stonith devices and stonith-enabled is not false\n" "Last updated: Tue Dec 16 16:15:29 2014\n" "Last change: Tue Dec 16 15:49:47 2014\n" "Stack: corosync\n" "Current DC: pcmk-2 (2) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "0 Resources configured\n" "\n" "\n" "Online: [ pcmk-1 pcmk-2 ]\n" "\n" "Full list of resources:\n" "\n" "\n" "PCSD Status:\n" " pcmk-1: Online\n" " pcmk-2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "Finally, ensure there are no startup errors (aside from messages relating to not having STONITH configured, which are OK at this point):" msgstr "" #. Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# journalctl | grep -i error" msgstr "" #. Tag: para #, no-c-format msgid "Other operating systems may report startup errors in other locations, for example /var/log/messages." msgstr "" #. Tag: para #, no-c-format msgid "Repeat these checks on the other node. The results should be the same." msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Preface.pot b/doc/Clusters_from_Scratch/pot/Preface.pot index 01aa1cc95d..1f366bf020 100644 --- a/doc/Clusters_from_Scratch/pot/Preface.pot +++ b/doc/Clusters_from_Scratch/pot/Preface.pot @@ -1,19 +1,19 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. 
Tag: title #, no-c-format msgid "Preface" msgstr "" diff --git a/doc/Clusters_from_Scratch/pot/Revision_History.pot b/doc/Clusters_from_Scratch/pot/Revision_History.pot index 6552a88957..5fed5857da 100644 --- a/doc/Clusters_from_Scratch/pot/Revision_History.pot +++ b/doc/Clusters_from_Scratch/pot/Revision_History.pot @@ -1,109 +1,109 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Revision History" msgstr "" #. Tag: firstname #, no-c-format msgid "Andrew" msgstr "" #. Tag: surname #, no-c-format msgid "Beekhof" msgstr "" #. Tag: member #, no-c-format msgid "Import from Pages.app" msgstr "" #. Tag: firstname #, no-c-format msgid "Raoul" msgstr "" #. Tag: surname #, no-c-format msgid "Scarazzini" msgstr "" #. Tag: member #, no-c-format msgid "Italian translation" msgstr "" #. Tag: member #, no-c-format msgid "Updated for Fedora 13" msgstr "" #. Tag: member #, no-c-format msgid "Update the GFS2 section to use CMAN" msgstr "" #. Tag: member #, no-c-format msgid "Generate docbook content from asciidoc sources" msgstr "" #. Tag: member #, no-c-format msgid "Updated for Fedora 17" msgstr "" #. Tag: firstname #, no-c-format msgid "David" msgstr "" #. Tag: surname #, no-c-format msgid "Vossel" msgstr "" #. Tag: member #, no-c-format msgid "Updated for pcs" msgstr "" #. Tag: firstname #, no-c-format msgid "Ken" msgstr "" #. Tag: surname #, no-c-format msgid "Gaillot" msgstr "" #. Tag: member #, no-c-format msgid "Updated for Fedora 21" msgstr "" #. Tag: member #, no-c-format msgid "Minor corrections, plus use include file for intro" msgstr "" #. 
Tag: member #, no-c-format msgid "Update for CentOS 7.1 and leaving firewalld/SELinux enabled" msgstr "" diff --git a/doc/Pacemaker_Remote/pot/Author_Group.pot b/doc/Pacemaker_Development/pot/Author_Group.pot similarity index 64% copy from doc/Pacemaker_Remote/pot/Author_Group.pot copy to doc/Pacemaker_Development/pot/Author_Group.pot index 3b4203d1d1..12c6fb9897 100644 --- a/doc/Pacemaker_Remote/pot/Author_Group.pot +++ b/doc/Pacemaker_Development/pot/Author_Group.pot @@ -1,34 +1,44 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: firstname #, no-c-format -msgid "David" +msgid "Andrew" msgstr "" #. Tag: surname #, no-c-format -msgid "Vossel" +msgid "Beekhof" msgstr "" #. Tag: orgname #, no-c-format msgid "Red Hat" msgstr "" #. Tag: contrib #, no-c-format -msgid "Primary author" +msgid "Co-author" +msgstr "" + +#. Tag: firstname +#, no-c-format +msgid "Ken" +msgstr "" + +#. Tag: surname +#, no-c-format +msgid "Gaillot" msgstr "" diff --git a/doc/Pacemaker_Development/pot/Book_Info.pot b/doc/Pacemaker_Development/pot/Book_Info.pot new file mode 100644 index 0000000000..6f7e1a0424 --- /dev/null +++ b/doc/Pacemaker_Development/pot/Book_Info.pot @@ -0,0 +1,29 @@ +# +# AUTHOR , YEAR. +# +msgid "" +msgstr "" +"Project-Id-Version: 0\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" +"Last-Translator: Automatically generated\n" +"Language-Team: None\n" +"MIME-Version: 1.0\n" +"Content-Type: application/x-publican; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. 
Tag: title +#, no-c-format +msgid "Pacemaker Development" +msgstr "" + +#. Tag: subtitle +#, no-c-format +msgid "Working with the Pacemaker Code Base" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "This document has guidelines and tips for developers interested in editing Pacemaker source code and submitting changes for inclusion in the project." +msgstr "" + diff --git a/doc/Pacemaker_Development/pot/Ch-Coding.pot b/doc/Pacemaker_Development/pot/Ch-Coding.pot new file mode 100644 index 0000000000..0a391620b6 --- /dev/null +++ b/doc/Pacemaker_Development/pot/Ch-Coding.pot @@ -0,0 +1,276 @@ +# +# AUTHOR , YEAR. +# +msgid "" +msgstr "" +"Project-Id-Version: 0\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" +"Last-Translator: Automatically generated\n" +"Language-Team: None\n" +"MIME-Version: 1.0\n" +"Content-Type: application/x-publican; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Tag: title +#, no-c-format +msgid "C Coding Guidelines" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "C Boilerplate" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " Cboilerplate boilerplate licensingC boilerplate C boilerplate " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Every C file should start like this:" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "/*\n" +" * Copyright (C) <YYYY[-YYYY]> Andrew Beekhof <andrew@beekhof.net>\n" +" *\n" +" * This source code is licensed under <LICENSE> WITHOUT ANY WARRANTY.\n" +" */" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "<YYYY> is the year the code was originally created (it is the most important date for copyright purposes, as it establishes priority and the point from which expiration is calculated). If the code is modified in later years, add -YYYY with the most recent year of modification." +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "<LICENSE> should follow the policy set forth in the COPYING file, generally one of \"GNU General Public License version 2 or later (GPLv2+)\" or \"GNU Lesser General Public License version 2.1 or later (LGPLv2.1+)\"." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Formatting" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Whitespace" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " Cwhitespace whitespace " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Indentation must be 4 spaces, no tabs." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Do not leave trailing whitespace." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Line Length" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Lines should be no longer than 80 characters unless limiting line length significantly impacts readability." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Pointers" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " Cpointers pointers " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The * goes by the variable name, not the type:" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "char *foo;" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Use a space before the * and after the closing parenthesis in a cast:" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "char *foo = (char *) bar;" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Functions" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " Cfunctions functions " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "In the function definition, put the return type on its own line, and place the opening brace by itself on a line:" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "static int\n" +"foo(void)\n" +"{" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "For functions with enough arguments that they must break to the next line, align arguments with the first argument:" +msgstr "" + +#. 
Tag: programlisting +#, no-c-format +msgid "static int\n" +"function_name(int bar, const char *a, const char *b,\n" +" const char *c, const char *d)\n" +"{" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "If a function name gets really long, start the arguments on their own line with 8 spaces of indentation:" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "static int\n" +"really_really_long_function_name_this_is_getting_silly_now(\n" +" int bar, const char *a, const char *b,\n" +" const char *c, const char *d)\n" +"{" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Control Statements (if, else, while, for, switch)" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The keyword is followed by one space, then left parenthesis without space, condition, right parenthesis, space, opening bracket on the same line. else and else if are on the same line with the ending brace and opening brace, separated by a space:" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "if (condition1) {\n" +" statement1;\n" +"} else if (condition2) {\n" +" statement2;\n" +"} else {\n" +" statement3;\n" +"}" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "In a switch statement, case is indented one level, and the body of each case is indented by another level. The opening brace is on the same line as switch." +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "switch (expression) {\n" +" case 0:\n" +" command1;\n" +" break;\n" +" case 1:\n" +" command2;\n" +" break;\n" +" default:\n" +" command3;\n" +"}" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Operators" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " Coperators operators " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Operators have spaces from both sides. Do not rely on operator precedence; use parentheses when mixing operators with different priority." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "No space is used after opening parenthesis and before closing parenthesis." 
+msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "x = a + b - (c * d);" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Naming Conventions" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " Cnaming naming " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Any exposed symbols in libraries (non-static function names, type names, etc.) must begin with a prefix appropriate to the library, for example, crm_, pe_, st_, lrm_." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "vim Settings" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " vim " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Developers who use vim to edit source code can add the following settings to their ~/.vimrc file to follow Pacemaker C coding guidelines:" +msgstr "" + +#. Tag: screen +#, no-c-format +msgid "\" follow Pacemaker coding guidelines when editing C source code files\n" +"filetype plugin indent on\n" +"au FileType c setlocal expandtab tabstop=4 softtabstop=4 shiftwidth=4 textwidth=80\n" +"autocmd BufNewFile,BufRead *.h set filetype=c\n" +"let c_space_errors = 1" +msgstr "" + diff --git a/doc/Pacemaker_Development/pot/Ch-FAQ.pot b/doc/Pacemaker_Development/pot/Ch-FAQ.pot new file mode 100644 index 0000000000..67a4f77ccd --- /dev/null +++ b/doc/Pacemaker_Development/pot/Ch-FAQ.pot @@ -0,0 +1,139 @@ +# +# AUTHOR , YEAR. +# +msgid "" +msgstr "" +"Project-Id-Version: 0\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" +"Last-Translator: Automatically generated\n" +"Language-Team: None\n" +"MIME-Version: 1.0\n" +"Content-Type: application/x-publican; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Tag: title +#, no-c-format +msgid "Frequently Asked Questions" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Who is this document intended for?" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Anyone who wishes to read and/or edit the Pacemaker source code. 
Casual contributors should feel free to read just this FAQ, and consult other sections as needed." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Where is the source code for Pacemaker?" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " downloads source code gitGitHub GitHub The source code for Pacemaker is kept on GitHub, as are all software projects under the ClusterLabs umbrella. Pacemaker uses Git for source code management. If you are a Git newbie, the gittutorial(7) man page is an excellent starting point. If you’re familiar with using Git from the command line, you can create a local copy of the Pacemaker source code with: git clone https://github.com/ClusterLabs/pacemaker.git pacemaker" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "What are the different Git branches and repositories used for?" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " branches " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The master branch is the primary branch used for development." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The 1.1 branch contains the latest official release, and normally does not receive any changes. During the release cycle, it will contain release candidates for the next official release, and will receive only bug fixes." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The 1.0 repository is a frozen snapshot of the 1.0 release series, and is no longer developed." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Messages will be posted to the developers@clusterlabs.org mailing list during the release cycle, with instructions about which branches to use when submitting requests." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "How do I build from the source code?" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "See INSTALL.md in the main checkout directory." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "What coding style should I follow?" +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "You’ll be mostly fine if you simply follow the example of existing code. When unsure, see the relevant section of this document for language-specific recommendations. Pacemaker has grown and evolved organically over many years, so you will see much code that doesn’t conform to the current guidelines. We discourage making changes solely to bring code into conformance, as any change requires developer time for review and opens the possibility of adding bugs. However, new code should follow the guidelines, and it is fine to bring lines of older code into conformance when modifying that code for other reasons." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "How should I format my Git commit messages?" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " gitcommit messages commit messages See existing examples in the git log. The first line should look like change-type: affected-code: explanation where change-type can be Fix or Bug for most bug fixes, Feature for new features, Log for changes to log messages or handling, Doc for changes to documentation or comments, or Test for changes in CTS and regression tests. You will sometimes see Low, Med (or Mid) and High used instead for bug fixes, to indicate the severity. The important thing is that only commits with Feature, Fix, Bug, or High will automatically be included in the change log for the next release. The affected-code is the name of the component(s) being changed, for example, crmd or libcrmcommon (it’s more free-form, so don’t sweat getting it exact). The explanation briefly describes the change. The git project recommends the entire summary line stay under 50 characters, but more is fine if needed for clarity. Except for the most simple and obvious of changes, the summary should be followed by a blank line and then a longer explanation of why the change was made." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "How can I test my changes?" +msgstr "" + +#. 
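The summary-line convention in the commit message answer above (change-type: affected-code: explanation) can be checked mechanically. The following is only a sketch under stated assumptions: `check_summary`, `CHANGE_TYPES`, and `CHANGELOG_TYPES` are hypothetical names made up for illustration, not part of Pacemaker or Git, and the example summaries are invented.

```python
import re

# Change types named in the FAQ answer (hypothetical constant names)
CHANGE_TYPES = ("Fix", "Bug", "Feature", "Log", "Doc", "Test",
                "Low", "Med", "Mid", "High")

# Per the FAQ, only these types are picked up for the change log
CHANGELOG_TYPES = ("Feature", "Fix", "Bug", "High")

# change-type: affected-code: explanation
SUMMARY_RE = re.compile(r"^(?P<type>[A-Za-z]+): (?P<code>[^:]+): (?P<text>.+)$")


def check_summary(line):
    """Return (well_formed, in_changelog, too_long) for a summary line."""
    # The FAQ suggests staying under 50 characters, while allowing more
    too_long = len(line) >= 50
    match = SUMMARY_RE.match(line)
    if match is None or match.group("type") not in CHANGE_TYPES:
        return (False, False, too_long)
    return (True, match.group("type") in CHANGELOG_TYPES, too_long)
```

A long summary is only flagged, not rejected, because the text says more than 50 characters is fine when needed for clarity.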
Tag: para +#, no-c-format +msgid "Most importantly, Pacemaker has regression tests for most major components; these will automatically be run for any pull requests submitted through GitHub. Additionally, Pacemaker’s Cluster Test Suite (CTS) can be used to set up a test cluster and run a wide variety of complex tests. This document will have more detail on testing in the future." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "What is Pacemaker’s license?" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " licensing Except where noted otherwise in the file itself, the source code for all Pacemaker programs is licensed under version 2 or later of the GNU General Public License (GPLv2+), its headers and libraries under version 2.1 or later of the less restrictive GNU Lesser General Public License (LGPLv2.1+), its documentation under version 4.0 or later of the Creative Commons Attribution-ShareAlike International Public License (CC-BY-SA), and its init scripts under the Revised BSD license. If you find any deviations from this policy, or wish to inquire about alternate licensing arrangements, please e-mail andrew@beekhof.net. Licensing issues are also discussed on the ClusterLabs wiki." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "How can I contribute my changes to the project?" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Contributions of bug fixes or new features are very much appreciated! Patches can be submitted as pull requests via GitHub (the preferred method, due to its excellent features), or e-mailed to the developers@clusterlabs.org mailing list as an attachment in a format Git can import." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "What if I still have questions?" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " mailing lists Ask on the developers@clusterlabs.org mailing list for development-related questions, or on the users@clusterlabs.org mailing list for general questions about using Pacemaker. 
Developers often also hang out on freenode’s #clusterlabs IRC channel." +msgstr "" + diff --git a/doc/Pacemaker_Development/pot/Ch-Python.pot b/doc/Pacemaker_Development/pot/Ch-Python.pot new file mode 100644 index 0000000000..04fc3f9c04 --- /dev/null +++ b/doc/Pacemaker_Development/pot/Ch-Python.pot @@ -0,0 +1,297 @@ +# +# AUTHOR , YEAR. +# +msgid "" +msgstr "" +"Project-Id-Version: 0\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" +"Last-Translator: Automatically generated\n" +"Language-Team: None\n" +"MIME-Version: 1.0\n" +"Content-Type: application/x-publican; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Tag: title +#, no-c-format +msgid "Python Coding Guidelines" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Python Boilerplate" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " Pythonboilerplate boilerplate licensingPython boilerplate Python boilerplate " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Every Python file should start like this:" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "[<SHEBANG>]\n" +"\"\"\" <BRIEF-DESCRIPTION>\n" +"\"\"\"\n" +"\n" +"# Pacemaker targets compatibility with Python 2.6+ and 3.2+\n" +"from __future__ import print_function, unicode_literals, absolute_import, division\n" +"\n" +"__copyright__ = \"Copyright (C) <YYYY[-YYYY]> Andrew Beekhof <andrew@beekhof.net>\"\n" +"__license__ = \"<LICENSE> WITHOUT ANY WARRANTY\"" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "If the file is meant to be directly executed, the first line (<SHEBANG>) should be #!/usr/bin/python. If it is meant to be imported, omit this line." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "<BRIEF-DESCRIPTION> is obviously a brief description of the file’s purpose. The string may contain any other information typically used in a Python file docstring." +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "The import statement is discussed further in ." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "<YYYY> is the year the code was originally created (it is the most important date for copyright purposes, as it establishes priority and the point from which expiration is calculated). If the code is modified in later years, add -YYYY with the most recent year of modification." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "<LICENSE> should follow the policy set forth in the COPYING file, generally one of \"GNU General Public License version 2 or later (GPLv2+)\" or \"GNU Lesser General Public License version 2.1 or later (LGPLv2.1+)\"." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Python Compatibility" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " Python2 2 Python3 3 Pythonversions versions " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Pacemaker targets compatibility with Python 2.6 and later, and Python 3.2 and later. These versions have added features to be more compatible with each other, allowing us to support both the 2 and 3 series with the same code. It is a good idea to test any changes with both Python 2 and 3." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Python Future Imports" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The future imports used in mean:" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "All print statements must use parentheses, and printing without a newline is accomplished with the end=' ' parameter rather than a trailing comma." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "All string literals will be treated as Unicode (the u prefix is unnecessary, and must not be used, because it is not available in Python 3.2)." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Local modules must be imported using from . import (rather than just import). To import one item from a local module, use from .modulename import (rather than from modulename import)." 
+msgstr "" + +#. Tag: para +#, no-c-format +msgid "Division using / will always return a floating-point result (use // if you want the integer floor instead)." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Other Python Compatibility Requirements" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "When specifying an exception variable, always use as instead of a comma (e.g. except Exception as e or except (TypeError, IOError) as e). Use e.args to access the error arguments (instead of iterating over or subscripting e)." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Use in (not has_key()) to determine if a dictionary has a particular key." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Always use the I/O functions from the io module rather than the native I/O functions (e.g. io.open() rather than open())." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "When opening a file, always use the t (text) or b (binary) mode flag." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "When creating classes, always specify a parent class to ensure that it is a \"new-style\" class (e.g. class Foo(object): rather than class Foo:)" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Be aware of the bytes type added in Python 3. Many places where strings are used in Python 2 use bytes or bytearrays in Python 3 (for example, the pipes used with subprocess.Popen()). Code should handle both possibilities." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Be aware that the items(), keys(), and values() methods of dictionaries return lists in Python 2 and views in Python 3. In many cases, no special handling is required, but if the code needs to use list methods on the result, cast the result to list first." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Do not name variables with or as." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Do not raise or catch strings as exceptions (e.g. raise \"Bad thing\")." +msgstr "" + +#.
Tag: para +#, no-c-format +msgid "Do not use the cmp parameter of sorting functions (use key instead, if needed) or the __cmp__() method of classes (implement rich comparison methods such as __lt__() instead, if needed)." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Do not use the buffer type." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Do not use features not available in all targeted Python versions. Common examples include:" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The argparse, html, ipaddress, sysconfig, and UserDict modules" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The collections.OrderedDict class" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The subprocess.run() function" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The subprocess.DEVNULL constant" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "subprocess module-specific exceptions" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Set literals ({1, 2, 3})" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Python Usages to Avoid" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Avoid the following if possible, otherwise research the compatibility issues involved (hacky workarounds are often available):" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "long integers" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "octal integer literals" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "mixed binary and string data in one data file or variable" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "metaclasses" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "locale.strcoll and locale.strxfrm" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "the configparser and ConfigParser modules" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "importing compatibility modules such as six (so we don’t have to add them to Pacemaker’s dependencies)" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Formatting Python Code" +msgstr "" + +#. 
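As a rough illustration of the Python compatibility requirements listed in this chapter, the sketch below combines several of them in one small module. The names `Counter`, `read_words`, and `demo` are hypothetical and not part of Pacemaker, and the boilerplate's `from __future__` line is assumed to sit at the top of a real file rather than being repeated here.

```python
# Illustrative sketch only; a real file would begin with the Python
# boilerplate, including the "from __future__ import ..." line.
import io
import os
import tempfile


class Counter(object):
    """New-style class: the parent class is always given explicitly."""

    def __init__(self):
        self.counts = {}

    def add(self, word):
        # Use "in" rather than has_key() to test for a key
        if word in self.counts:
            self.counts[word] += 1
        else:
            self.counts[word] = 1

    def sorted_words(self):
        # keys() is a list in Python 2 but a view in Python 3, so convert
        # it before using list methods; sort with key= rather than cmp=
        words = list(self.counts.keys())
        words.sort(key=lambda w: (-self.counts[w], w))
        return words


def read_words(path):
    """Return the words in a text file, or [] if it cannot be read."""
    try:
        # io.open() with an explicit text-mode flag
        with io.open(path, "rt", encoding="utf-8") as handle:
            return handle.read().split()
    except (IOError, OSError) as e:
        # "as" syntax, and e.args instead of subscripting e
        print("Could not read %s: %s" % (path, e.args))
        return []


def demo():
    """Write a small file, count its words, and return them by frequency."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with io.open(path, "wt", encoding="utf-8") as handle:
            handle.write(u"node resource node\n")
        counter = Counter()
        for word in read_words(path):
            counter.add(word)
    finally:
        os.remove(path)
    return counter.sorted_words()
```

Note that the `u` prefix on the written string is there only because this standalone sketch omits the `unicode_literals` import; with the boilerplate in place, the guidelines say the prefix must not be used.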
Tag: para +#, no-c-format +msgid " Pythonformatting formatting " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Indentation must be 4 spaces, no tabs." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Do not leave trailing whitespace." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Lines should be no longer than 80 characters unless limiting line length significantly impacts readability. For Python, this limitation is flexible since breaking a line often impacts readability, but definitely keep it under 120 characters." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Where not conflicting with this style guide, it is recommended (but not required) to follow PEP 8." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "It is recommended (but not required) to format Python code such that pylint --disable=line-too-long,too-many-lines,too-many-instance-attributes,too-many-arguments,too-many-statements produces minimal complaints (even better if you don’t need to disable all those checks)." +msgstr "" + diff --git a/doc/Pacemaker_Remote/pot/Author_Group.pot b/doc/Pacemaker_Development/pot/Revision_History.pot similarity index 53% copy from doc/Pacemaker_Remote/pot/Author_Group.pot copy to doc/Pacemaker_Development/pot/Revision_History.pot index 3b4203d1d1..cfdb4e3836 100644 --- a/doc/Pacemaker_Remote/pot/Author_Group.pot +++ b/doc/Pacemaker_Development/pot/Revision_History.pot @@ -1,34 +1,39 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" +#. Tag: title +#, no-c-format +msgid "Revision History" +msgstr "" + #. Tag: firstname #, no-c-format -msgid "David" +msgid "Ken" msgstr "" #. 
Tag: surname #, no-c-format -msgid "Vossel" +msgid "Gaillot" msgstr "" -#. Tag: orgname +#. Tag: member #, no-c-format -msgid "Red Hat" +msgid "Convert coding guidelines and developer FAQ to Publican document" msgstr "" -#. Tag: contrib +#. Tag: member #, no-c-format -msgid "Primary author" +msgid "Add Python coding guidelines, and more about licensing" msgstr "" diff --git a/doc/Pacemaker_Explained/en-US/Book_Info.xml b/doc/Pacemaker_Explained/en-US/Book_Info.xml index de4ddbebe4..5321593452 100644 --- a/doc/Pacemaker_Explained/en-US/Book_Info.xml +++ b/doc/Pacemaker_Explained/en-US/Book_Info.xml @@ -1,35 +1,35 @@ Configuration Explained An A-Z guide to Pacemaker's Configuration Options Pacemaker 1.1 - 7 - 1 + 8 + 0 The purpose of this document is to definitively explain the concepts used to configure Pacemaker. To achieve this, it will focus exclusively on the XML syntax used to configure Pacemaker's Cluster Information Base (CIB). diff --git a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt index 9527b1ab18..e3f4634c5a 100644 --- a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt +++ b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt @@ -1,821 +1,823 @@ = Advanced Configuration = [[s-remote-connection]] == Connecting from a Remote Machine == indexterm:[Cluster,Remote connection] indexterm:[Cluster,Remote administration] Provided Pacemaker is installed on a machine, it is possible to connect to the cluster even if the machine itself is not in the same cluster. To do this, one simply sets up a number of environment variables and runs the same commands as when working on a cluster node. .Environment Variables Used to Connect to Remote Instances of the CIB [width="95%",cols="1m,1,3<",options="header",align="center"] |========================================================= |Environment Variable |Default |Description |CIB_user |$USER |The user to connect as. 
Needs to be part of the +hacluster+ group on the target host.
indexterm:[Environment Variable,CIB_user]

|CIB_passwd
|
|The user's password. Read from the command line if unset.
indexterm:[Environment Variable,CIB_passwd]

|CIB_server
|localhost
|The host to contact
indexterm:[Environment Variable,CIB_server]

|CIB_port
|
|The port on which to contact the server; required.
indexterm:[Environment Variable,CIB_port]

|CIB_encrypted
|TRUE
|Whether to encrypt network traffic
indexterm:[Environment Variable,CIB_encrypted]

|=========================================================

So, if *c001n01* is an active cluster node and is listening on port 1234 for connections, and *someuser* is a member of the *hacluster* group, then the following would prompt for *someuser*'s password and return the cluster's current configuration:

----
# export CIB_port=1234; export CIB_server=c001n01; export CIB_user=someuser;
# cibadmin -Q
----

For security reasons, the cluster does not listen for remote connections by default. If you wish to allow remote access, you need to set the +remote-tls-port+ (encrypted) or +remote-clear-port+ (unencrypted) CIB properties (i.e., those kept in the +cib+ tag, like +num_updates+ and +epoch+).

.Extra top-level CIB properties for remote access
[width="95%",cols="1m,1,3<",options="header",align="center"]
|=========================================================

|Field
|Default
|Description

|remote-tls-port
|_none_
|Listen for encrypted remote connections on this port.
indexterm:[remote-tls-port,Remote Connection Option]
indexterm:[Remote Connection,Option,remote-tls-port]

|remote-clear-port
|_none_
|Listen for plaintext remote connections on this port.
indexterm:[remote-clear-port,Remote Connection Option]
indexterm:[Remote Connection,Option,remote-clear-port]

|=========================================================

[[s-recurring-start]]
== Specifying When Recurring Actions are Performed ==

By default, recurring actions are scheduled relative to when the resource started. So if your resource was last started at 14:32 and you have a backup set to be performed every 24 hours, then the backup will always run in the middle of the business day -- hardly desirable.

To specify a date and time that the operation should be relative to, set the operation's +interval-origin+. The cluster uses this point to calculate the correct +start-delay+ such that the operation will occur at _origin + (interval * N)_.

So, if the operation's interval is 24h, its interval-origin is set to 02:00 and it is currently 14:32, then the cluster would initiate the operation with a start delay of 11 hours and 28 minutes. If the resource is moved to another node before 2am, then the operation is cancelled.

The value specified for +interval+ and +interval-origin+ can be any date/time conforming to the http://en.wikipedia.org/wiki/ISO_8601[ISO8601 standard]. By way of example, to specify an operation that would run on the first Monday of 2009 and every Monday after that, you would add:

.Specifying a Base for Recurring Action Intervals
=====
[source,XML]
=====

== Moving Resources ==
indexterm:[Moving,Resources]
indexterm:[Resource,Moving]

=== Moving Resources Manually ===

There are primarily two occasions when you would want to move a resource from its current location: when the whole node is under maintenance, and when a single resource needs to be moved.

==== Standby Mode ====

Since everything eventually comes down to a score, you could create constraints for every resource to prevent them from running on one node. While pacemaker configuration can seem convoluted at times, not even we would require this of administrators.
Instead, one can set a special node attribute which tells the cluster "don't let anything run here". There is even a helpful tool to help query and set it, called `crm_standby`. To check the standby status of the current machine, run: ---- # crm_standby -G ---- A value of +on+ indicates that the node is _not_ able to host any resources, while a value of +off+ says that it _can_. You can also check the status of other nodes in the cluster by specifying the `--node` option: ---- # crm_standby -G --node sles-2 ---- To change the current node's standby status, use `-v` instead of `-G`: ---- # crm_standby -v on ---- Again, you can change another host's value by supplying a hostname with `--node`. ==== Moving One Resource ==== When only one resource is required to move, we could do this by creating location constraints. However, once again we provide a user-friendly shortcut as part of the `crm_resource` command, which creates and modifies the extra constraints for you. If +Email+ were running on +sles-1+ and you wanted it moved to a specific location, the command would look something like: ---- # crm_resource -M -r Email -H sles-2 ---- Behind the scenes, the tool will create the following location constraint: [source,XML] It is important to note that subsequent invocations of `crm_resource -M` are not cumulative. So, if you ran these commands ---- # crm_resource -M -r Email -H sles-2 # crm_resource -M -r Email -H sles-3 ---- then it is as if you had never performed the first command. To allow the resource to move back again, use: ---- # crm_resource -U -r Email ---- Note the use of the word _allow_. The resource can move back to its original location but, depending on +resource-stickiness+, it might stay where it is. 
To be absolutely certain that it moves back to +sles-1+, move it there before issuing the call to `crm_resource -U`: ---- # crm_resource -M -r Email -H sles-1 # crm_resource -U -r Email ---- Alternatively, if you only care that the resource should be moved from its current location, try: ---- # crm_resource -B -r Email ---- Which will instead create a negative constraint, like [source,XML] This will achieve the desired effect, but will also have long-term consequences. As the tool will warn you, the creation of a +-INFINITY+ constraint will prevent the resource from running on that node until `crm_resource -U` is used. This includes the situation where every other cluster node is no longer available! In some cases, such as when +resource-stickiness+ is set to +INFINITY+, it is possible that you will end up with the problem described in <>. The tool can detect some of these cases and deals with them by creating both positive and negative constraints. E.g. +Email+ prefers +sles-1+ with a score of +-INFINITY+ +Email+ prefers +sles-2+ with a score of +INFINITY+ which has the same long-term consequences as discussed earlier. [[s-failure-migration]] === Moving Resources Due to Failure === Normally, if a running resource fails, pacemaker will try to start it again on the same node. However if a resource fails repeatedly, it is possible that there is an underlying problem on that node, and you might desire trying a different node in such a case. indexterm:[migration-threshold] indexterm:[failure-timeout] indexterm:[start-failure-is-fatal] Pacemaker allows you to set your preference via the +migration-threshold+ resource option. footnote:[ The naming of this option was perhaps unfortunate as it is easily confused with live migration, the process of moving a resource from one node to another without stopping it. Xen virtual guests are the most common example of resources that can be migrated in this manner. 
] Simply define +migration-threshold=pass:[N]+ for a resource and it will migrate to a new node after 'N' failures. There is no threshold defined by default. To determine the resource's current failure status and limits, run `crm_mon --failcounts`. By default, once the threshold has been reached, the troublesome node will no longer be allowed to run the failed resource until the administrator manually resets the resource's failcount using `crm_failcount` (after hopefully first fixing the failure's cause). Alternatively, it is possible to expire them by setting the +failure-timeout+ option for the resource. For example, a setting of +migration-threshold=2+ and +failure-timeout=60s+ would cause the resource to move to a new node after 2 failures, and allow it to move back (depending on stickiness and constraint scores) after one minute. There are two exceptions to the migration threshold concept: when a resource either fails to start or fails to stop. If the cluster property +start-failure-is-fatal+ is set to +true+ (which is the default), start failures cause the failcount to be set to +INFINITY+ and thus always cause the resource to move immediately. Stop failures are slightly different and crucial. If a resource fails to stop and STONITH is enabled, then the cluster will fence the node in order to be able to start the resource elsewhere. If STONITH is not enabled, then the cluster has no way to continue and will not try to start the resource elsewhere, but will try to stop it again after the failure timeout. [IMPORTANT] Please read <> to understand how timeouts work before configuring a +failure-timeout+. === Moving Resources Due to Connectivity Changes === You can configure the cluster to move resources when external connectivity is lost in two steps. ==== Tell Pacemaker to Monitor Connectivity ==== First, add an *ocf:pacemaker:ping* resource to the cluster. 
The *ping* resource uses the system utility of the same name to test whether a list of machines (specified by DNS hostname or IPv4/IPv6 address) are reachable and uses the results to maintain a node attribute called +pingd+ by default. footnote:[ The attribute name is customizable, in order to allow multiple ping groups to be defined. ] [NOTE] =========== Older versions of Heartbeat required users to add ping nodes to +ha.cf+, but this is no longer required. Older versions of Pacemaker used a different agent *ocf:pacemaker:pingd* which is now deprecated in favor of *ping*. If your version of Pacemaker does not contain the *ping* resource agent, download the latest version from https://github.com/ClusterLabs/pacemaker/tree/master/extra/resources/ping =========== Normally, the ping resource should run on all cluster nodes, which means that you'll need to create a clone. A template for this can be found below along with a description of the most interesting parameters. .Common Options for a 'ping' Resource [width="95%",cols="1m,4<",options="header",align="center"] |========================================================= |Field |Description |dampen |The time to wait (dampening) for further changes to occur. Use this to prevent a resource from bouncing around the cluster when cluster nodes notice the loss of connectivity at slightly different times. indexterm:[dampen,Ping Resource Option] indexterm:[Ping Resource,Option,dampen] |multiplier |The number of connected ping nodes gets multiplied by this value to get a score. Useful when there are multiple ping nodes configured. indexterm:[multiplier,Ping Resource Option] indexterm:[Ping Resource,Option,multiplier] |host_list |The machines to contact in order to determine the current connectivity status. Allowed values include resolvable DNS host names, IPv4 and IPv6 addresses.
indexterm:[host_list,Ping Resource Option] indexterm:[Ping Resource,Option,host_list] |========================================================= .An example ping cluster resource that checks node connectivity once every minute ===== [source,XML] ------------ ------------ ===== [IMPORTANT] =========== You're only half done. The next section deals with telling Pacemaker how to deal with the connectivity status that +ocf:pacemaker:ping+ is recording. =========== ==== Tell Pacemaker How to Interpret the Connectivity Data ==== [IMPORTANT] ====== Before attempting the following, make sure you understand <>. ====== There are a number of ways to use the connectivity data. The most common setup is for people to have a single ping target (e.g. the service network's default gateway), to prevent the cluster from running a resource on any unconnected node. .Don't run a resource on unconnected nodes ===== [source,XML] ------- ------- ===== A more complex setup is to have a number of ping targets configured. You can require the cluster to only run resources on nodes that can connect to all (or a minimum subset) of them. .Run only on nodes connected to three or more ping targets. ===== [source,XML] ------- ... ... ... ------- ===== Alternatively, you can tell the cluster only to _prefer_ nodes with the best connectivity. Just be sure to set +multiplier+ to a value higher than that of +resource-stickiness+ (and don't set either of them to +INFINITY+). .Prefer the node with the most connected ping nodes ===== [source,XML] ------- ------- ===== It is perhaps easier to think of this in terms of the simple constraints that the cluster translates it into. 
For example, if *sles-1* is connected to all five ping nodes but *sles-2* is only connected to two, then it would be as if you instead had the following constraints in your configuration: .How the cluster translates the above location constraint ===== [source,XML] ------- ------- ===== The advantage is that you don't have to manually update any constraints whenever your network connectivity changes. You can also combine the concepts above into something even more complex. The example below shows how you can prefer the node with the most connected ping nodes provided they have connectivity to at least three (again assuming that +multiplier+ is set to 1000). .A more complex example of choosing a location based on connectivity ===== [source,XML] ------- ------- ===== [[s-migrating-resources]] === Migrating Resources === Normally, when the cluster needs to move a resource, it fully restarts the resource (i.e. stops the resource on the current node and starts it on the new node). However, some types of resources, such as Xen virtual guests, are able to move to another location without loss of state (often referred to as live migration or hot migration). In pacemaker, this is called resource migration. Pacemaker can be configured to migrate a resource when moving it, rather than restarting it. Not all resources are able to migrate (see the Migration Checklist below), and those that can won't do so in all situations. Conceptually, there are two requirements from which the other prerequisites follow: * The resource must be active and healthy at the old location; and * everything required for the resource to run must be available on both the old and new locations. The cluster is able to accommodate both 'push' and 'pull' migration models by requiring the resource agent to support two special actions: +migrate_to+ (performed on the current location) and +migrate_from+ (performed on the destination).
In push migration, the process on the current location transfers the resource to the new location where it is later activated. In this scenario, most of the work would be done in the +migrate_to+ action and, if anything, the activation would occur during +migrate_from+. Conversely for pull, the +migrate_to+ action is practically empty and +migrate_from+ does most of the work, extracting the relevant resource state from the old location and activating it. There is no wrong or right way for a resource agent to implement migration, as long as it works. .Migration Checklist * The resource may not be a clone. * The resource must use an OCF style agent. * The resource must not be in a failed or degraded state. * The resource agent must support +migrate_to+ and +migrate_from+ actions, and advertise them in its metadata. * The resource must have the +allow-migrate+ meta-attribute set to +true+ (which is not the default). If an otherwise migratable resource depends on another resource via an ordering constraint, there are special situations in which it will be restarted rather than migrated. For example, if the resource depends on a clone, and at the time the resource needs to be moved, the clone has instances that are stopping and instances that are starting, then the resource will be restarted. The Policy Engine is not yet able to model this situation correctly and so takes the safer (if less optimal) path. In pacemaker 1.1.11 and earlier, a migratable resource will be restarted when moving if it directly or indirectly depends on 'any' primitive or group resources. Even in newer versions, if a migratable resource depends on a non-migratable resource, and both need to be moved, the migratable resource will be restarted. [[s-node-health]] == Tracking Node Health == A node may be functioning adequately as far as cluster membership is concerned, and yet be "unhealthy" in some respect that makes it an undesirable location for resources.
For example, a disk drive may be reporting SMART errors, or the CPU may be highly loaded. Pacemaker offers a way to automatically move resources off unhealthy nodes. === Node Health Attributes === Pacemaker will treat any node attribute whose name starts with +#health+ as an indicator of node health. Node health attributes may have one of the following values: .Allowed Values for Node Health Attributes [width="95%",cols="1,3<",options="header",align="center"] |========================================================= |Value |Intended significance |+red+ |This indicator is unhealthy indexterm:[Node health,red] |+yellow+ |This indicator is becoming unhealthy indexterm:[Node health,yellow] |+green+ |This indicator is healthy indexterm:[Node health,green] |'integer' |A numeric score to apply to all resources on this node (0 or positive is healthy, negative is unhealthy) indexterm:[Node health,score] |========================================================= === Node Health Strategy === Pacemaker assigns a node health score to each node, as the sum of the values of all its node health attributes. This score will be used as a location constraint applied to this node for all resources. The +node-health-strategy+ cluster option controls how Pacemaker responds to changes in node health attributes, and how it translates +red+, +yellow+, and +green+ to scores. Allowed values are: .Node Health Strategies [width="95%",cols="1m,3<",options="header",align="center"] |========================================================= |Value |Effect |none |Do not track node health attributes at all. indexterm:[Node health,none] |migrate-on-red |Assign the value of +-INFINITY+ to +red+, and 0 to +yellow+ and +green+. This will cause all resources to move off the node if any attribute is +red+. indexterm:[Node health,migrate-on-red] |only-green |Assign the value of +-INFINITY+ to +red+ and +yellow+, and 0 to +green+. 
This will cause all resources to move off the node if any attribute is +red+ or +yellow+. indexterm:[Node health,only-green] |progressive |Assign the value of the +node-health-red+ cluster option to +red+, the value of +node-health-yellow+ to +yellow+, and the value of +node-health-green+ to - +green+. This strategy gives the administrator finer control over how - important each value is. + +green+. Each node is additionally assigned a score of +node-health-base+ + (this allows resources to start even if some attributes are +yellow+). This + strategy gives the administrator finer control over how important each value + is. indexterm:[Node health,progressive] |custom |Track node health attributes using the same values as +progressive+ for +red+, +yellow+, and +green+, but do not take them into account. The administrator is expected to implement a policy by defining rules (see <>) referencing node health attributes. indexterm:[Node health,custom] |========================================================= === Measuring Node Health === Since Pacemaker calculates node health based on node attributes, any method that sets node attributes may be used to measure node health. The most common ways are resource agents or separate daemons. Pacemaker provides examples that can be used directly or as a basis for custom code. The +ocf:pacemaker:HealthCPU+ and +ocf:pacemaker:HealthSMART+ resource agents set node health attributes based on CPU and disk parameters. The +ipmiservicelogd+ daemon sets node health attributes based on IPMI values (the +ocf:pacemaker:SystemHealth+ resource agent can be used to manage the daemon as a cluster resource). [[s-reusing-config-elements]] == Reusing Rules, Options and Sets of Operations == Sometimes a number of constraints need to use the same set of rules, and resources need to set the same options and parameters. To simplify this situation, you can refer to an existing object using an +id-ref+ instead of an id. 
So if for one resource you have [source,XML] ------ ------ Then instead of duplicating the rule for all your other resources, you can instead specify: .Referencing rules from other constraints ===== [source,XML] ------- ------- ===== [IMPORTANT] =========== The cluster will insist that the +rule+ exists somewhere. Attempting to add a reference to a non-existing rule will cause a validation failure, as will attempting to remove a +rule+ that is referenced elsewhere. =========== The same principle applies for +meta_attributes+ and +instance_attributes+ as illustrated in the example below: .Referencing attributes, options, and operations from other resources ===== [source,XML] ------- ------- ===== == Reloading Services After a Definition Change == The cluster automatically detects changes to the definition of services it manages. The normal response is to stop the service (using the old definition) and start it again (with the new definition). This works well, but some services are smarter and can be told to use a new set of options without restarting. To take advantage of this capability, the resource agent must: . Accept the +reload+ operation and perform any required actions. _The actions here depend completely on your application!_ + .The DRBD agent's logic for supporting +reload+ ===== [source,Bash] ------- case $1 in start) drbd_start ;; stop) drbd_stop ;; reload) drbd_reload ;; monitor) drbd_monitor ;; *) drbd_usage exit $OCF_ERR_UNIMPLEMENTED ;; esac exit $? ------- ===== . Advertise the +reload+ operation in the +actions+ section of its metadata + .The DRBD Agent Advertising Support for the +reload+ Operation ===== [source,XML] ------- 1.1 Master/Slave OCF Resource Agent for DRBD ... ------- ===== . Advertise one or more parameters that can take effect using +reload+. + Any parameter with the +unique+ set to 0 is eligible to be used in this way. + .Parameter that can be changed using reload ===== [source,XML] ------- Full path to the drbd.conf file. 
Path to drbd.conf ------- ===== Once these requirements are satisfied, the cluster will automatically know to reload the resource (instead of restarting) when a non-unique field changes. [NOTE] ====== Metadata will not be re-read unless the resource needs to be started. This may mean that the resource will be restarted the first time, even though you changed a parameter with +unique=0+. ====== [NOTE] ====== If both a unique and non-unique field are changed simultaneously, the resource will still be restarted. ====== diff --git a/doc/Pacemaker_Explained/en-US/Ch-Constraints.txt b/doc/Pacemaker_Explained/en-US/Ch-Constraints.txt index 2f5bec7b11..05dc42e5f1 100644 --- a/doc/Pacemaker_Explained/en-US/Ch-Constraints.txt +++ b/doc/Pacemaker_Explained/en-US/Ch-Constraints.txt @@ -1,836 +1,846 @@ = Resource Constraints = indexterm:[Resource,Constraints] == Scores == Scores of all kinds are integral to how the cluster works. Practically everything from moving a resource to deciding which resource to stop in a degraded cluster is achieved by manipulating scores in some way. Scores are calculated per resource and node. Any node with a negative score for a resource can't run that resource. The cluster places a resource on the node with the highest score for it. === Infinity Math === Pacemaker implements +INFINITY+ (or equivalently, ++INFINITY+) internally as a score of 1,000,000. Addition and subtraction with it follow these three basic rules: * Any value + +INFINITY+ = +INFINITY+ * Any value - +INFINITY+ = +-INFINITY+ * +INFINITY+ - +INFINITY+ = +-INFINITY+ [NOTE] ====== What if you want to use a score higher than 1,000,000? Typically this possibility arises when someone wants to base the score on some external metric that might go above 1,000,000. The short answer is you can't. The long answer is it is sometimes possible to work around this limitation creatively.
You may be able to set the score to some computed value based on the external metric rather than use the metric directly. For nodes, you can store the metric as a node attribute, and query the attribute when computing the score (possibly as part of a custom resource agent). ====== == Deciding Which Nodes a Resource Can Run On == indexterm:[Location Constraints] indexterm:[Resource,Constraints,Location] 'Location constraints' tell the cluster which nodes a resource can run on. There are two alternative strategies. One way is to say that, by default, resources can run anywhere, and then the location constraints specify nodes that are not allowed (an 'opt-out' cluster). The other way is to start with nothing able to run anywhere, and use location constraints to selectively enable allowed nodes (an 'opt-in' cluster). Whether you should choose opt-in or opt-out depends on your personal preference and the make-up of your cluster. If most of your resources can run on most of the nodes, then an opt-out arrangement is likely to result in a simpler configuration. On the other hand, if most resources can only run on a small subset of nodes, an opt-in configuration might be simpler. === Location Properties === .Properties of a rsc_location Constraint [width="95%",cols="2m,1,5>), the + submatches can be referenced as +%0+ through +%9+ in the rule's + +score-attribute+ or a rule expression's +attribute+ '(since 1.1.16)' +indexterm:[rsc-pattern,Location Constraints] +indexterm:[Constraints,Location,rsc-pattern] + |node | |A node's name indexterm:[node,Location Constraints] indexterm:[Constraints,Location,node] |score | |Positive values indicate the resource should run on this node. Negative values indicate the resource should not run on this node. Values of \+/- +INFINITY+ change "should"/"should not" to "must"/"must not".
indexterm:[score,Location Constraints] indexterm:[Constraints,Location,score] |resource-discovery |always |Whether Pacemaker should perform resource discovery (that is, check whether the resource is already running) for this resource on this node. This should normally be left as the default, so that rogue instances of a service can be stopped when they are running where they are not supposed to be. However, there are two situations where disabling resource discovery is a good idea: when a service is not installed on a node, discovery might return an error (properly written OCF agents will not, so this is usually only seen with other agent types); and when Pacemaker Remote is used to scale a cluster to hundreds of nodes, limiting resource discovery to allowed nodes can significantly boost performance. '(since 1.1.13)' * +always:+ Always perform resource discovery for the specified resource on this node. * +never:+ Never perform resource discovery for the specified resource on this node. This option should generally be used with a -INFINITY score, although that is not strictly required. * +exclusive:+ Perform resource discovery for the specified resource only on this node (and other nodes similarly marked as +exclusive+). Multiple location constraints using +exclusive+ discovery for the same resource across different nodes creates a subset of nodes resource-discovery is exclusive to. If a resource is marked for +exclusive+ discovery on one or more nodes, that resource is only allowed to be placed within that subset of nodes. indexterm:[Resource Discovery,Location Constraints] indexterm:[Constraints,Location,Resource Discovery] |========================================================= [WARNING] ========= Setting resource-discovery to +never+ or +exclusive+ removes Pacemaker's ability to detect and stop unwanted instances of a service running where it's not supposed to be. It is up to the system administrator (you!) 
to make sure that the service can 'never' be active on nodes without resource-discovery (such as by leaving the relevant software uninstalled). ========= === Asymmetrical "Opt-In" Clusters === indexterm:[Asymmetrical Opt-In Clusters] indexterm:[Cluster Type,Asymmetrical Opt-In] To create an opt-in cluster, start by preventing resources from running anywhere by default: ---- # crm_attribute --name symmetric-cluster --update false ---- Then start enabling nodes. The following fragment says that the web server prefers *sles-1*, the database prefers *sles-2* and both can fail over to *sles-3* if their most preferred node fails. .Opt-in location constraints for two resources ====== [source,XML] ------- ------- ====== === Symmetrical "Opt-Out" Clusters === indexterm:[Symmetrical Opt-Out Clusters] indexterm:[Cluster Type,Symmetrical Opt-Out] To create an opt-out cluster, start by allowing resources to run anywhere by default: ---- # crm_attribute --name symmetric-cluster --update true ---- Then start disabling nodes. The following fragment is the equivalent of the above opt-in configuration. .Opt-out location constraints for two resources ====== [source,XML] ------- ------- ====== [[node-score-equal]] === What if Two Nodes Have the Same Score === If two nodes have the same score, then the cluster will choose one. This choice may seem random and may not be what was intended, however the cluster was not given enough information to know any better. .Constraints where a resource prefers two nodes equally ====== [source,XML] ------- ------- ====== In the example above, assuming no other constraints and an inactive cluster, +Webserver+ would probably be placed on +sles-1+ and +Database+ on +sles-2+. It would likely have placed +Webserver+ based on the node's uname and +Database+ based on the desire to spread the resource load evenly across the cluster. However other factors can also be involved in more complex configurations. 
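The score rules from the start of this chapter, together with the tie behaviour just described, can be illustrated with a small sketch. This is not Pacemaker's allocation code: the function names +add_scores+ and +choose_node+ are hypothetical, the clamping of finite sums is an assumption, and breaking ties by node name is only a stand-in for whatever the cluster actually does when it has no better information.

```python
INFINITY = 1_000_000  # internal value of INFINITY, per the Infinity Math section


def add_scores(a, b):
    """Add two scores using the three infinity rules; -INFINITY always wins."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY  # any value - INFINITY, and INFINITY - INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY  # any value + INFINITY
    # Assumption: finite sums are clamped to the infinite bounds.
    return max(-INFINITY, min(INFINITY, a + b))


def choose_node(scores):
    """Pick a node for one resource from a {node_name: score} mapping.

    Nodes with a negative score cannot run the resource. Among the rest,
    the highest score wins; equal scores are broken by node name, which is
    a hypothetical tie-break used only for this illustration.
    """
    allowed = {name: score for name, score in scores.items() if score >= 0}
    if not allowed:
        return None  # the resource cannot run anywhere
    return min(allowed, key=lambda name: (-allowed[name], name))


if __name__ == "__main__":
    # Two nodes with equal scores, one node ruled out entirely.
    print(choose_node({"sles-1": 200, "sles-2": 200, "sles-3": -INFINITY}))
```

If the tie matters, the sketch makes the remedy clear: give one node a higher score so that the choice no longer depends on the tie-break.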
[[s-resource-ordering]] == Specifying the Order in which Resources Should Start/Stop == indexterm:[Resource,Constraints,Ordering] indexterm:[Resource,Start Order] indexterm:[Ordering Constraints] 'Ordering constraints' tell the cluster the order in which resources should start. [IMPORTANT] ==== Ordering constraints affect 'only' the ordering of resources; they do 'not' require that the resources be placed on the same node. If you want resources to be started on the same node 'and' in a specific order, you need both an ordering constraint 'and' a colocation constraint (see <>), or alternatively, a group (see <>). ==== === Ordering Properties === .Properties of a rsc_order Constraint [width="95%",cols="1m,1,4> resources. === Optional and mandatory ordering === Here is an example of ordering constraints where +Database+ 'must' start before +Webserver+, and +IP+ 'should' start before +Webserver+ if they both need to be started: .Optional and mandatory ordering constraints ====== [source,XML] ------- ------- ====== Because the above example lets +symmetrical+ default to TRUE, +Webserver+ must be stopped before +Database+ can be stopped, and +Webserver+ should be stopped before +IP+ if they both need to be stopped. [[s-resource-colocation]] == Placing Resources Relative to other Resources == indexterm:[Resource,Constraints,Colocation] indexterm:[Resource,Location Relative to other Resources] 'Colocation constraints' tell the cluster that the location of one resource depends on the location of another one. Colocation has an important side-effect: it affects the order in which resources are assigned to a node. Think about it: You can't place A relative to B unless you know where B is. footnote:[ While the human brain is sophisticated enough to read the constraint in any order and choose the correct one depending on the situation, the cluster is not quite so smart. Yet. 
] So when you are creating colocation constraints, it is important to consider whether you should colocate A with B, or B with A. Another thing to keep in mind is that, assuming A is colocated with B, the cluster will take into account A's preferences when deciding which node to choose for B. For a detailed look at exactly how this occurs, see http://clusterlabs.org/doc/Colocation_Explained.pdf[Colocation Explained]. [IMPORTANT] ==== Colocation constraints affect 'only' the placement of resources; they do 'not' require that the resources be started in a particular order. If you want resources to be started on the same node 'and' in a specific order, you need both an ordering constraint (see <>) 'and' a colocation constraint, or alternatively, a group (see <>). ==== === Colocation Properties === .Properties of a rsc_colocation Constraint [width="95%",cols="2m,5<",options="header",align="center"] |========================================================= |Field |Description |id |A unique name for the constraint. indexterm:[id,Colocation Constraints] indexterm:[Constraints,Colocation,id] |rsc |The name of a resource that should be located relative to +with-rsc+. indexterm:[rsc,Colocation Constraints] indexterm:[Constraints,Colocation,rsc] |with-rsc |The name of the resource used as the colocation target. The cluster will decide where to put this resource first and then decide where to put +rsc+. indexterm:[with-rsc,Colocation Constraints] indexterm:[Constraints,Colocation,with-rsc] |score |Positive values indicate the resources should run on the same node. Negative values indicate the resources should run on different nodes. Values of \+/- +INFINITY+ change "should" to "must". indexterm:[score,Colocation Constraints] indexterm:[Constraints,Colocation,score] |========================================================= === Mandatory Placement === Mandatory placement occurs when the constraint's score is ++INFINITY+ or +-INFINITY+. 
In such cases, if the constraint can't be satisfied, then the +rsc+ resource is not permitted to run. For +score=INFINITY+, this includes cases where the +with-rsc+ resource is not active. If you need resource +A+ to always run on the same machine as resource +B+, you would add the following constraint: .Mandatory colocation constraint for two resources ==== [source,XML] ==== Remember, because +INFINITY+ was used, if +B+ can't run on any of the cluster nodes (for whatever reason) then +A+ will not be allowed to run. Whether +A+ is running or not has no effect on +B+. Alternatively, you may want the opposite -- that +A+ 'cannot' run on the same machine as +B+. In this case, use +score="-INFINITY"+. .Mandatory anti-colocation constraint for two resources ==== [source,XML] ==== Again, by specifying +-INFINITY+, the constraint is binding. So if the only place left to run is where +B+ already is, then +A+ may not run anywhere. As with +INFINITY+, +B+ can run even if +A+ is stopped. However, in this case +A+ also can run if +B+ is stopped, because it still meets the constraint of +A+ and +B+ not running on the same node. === Advisory Placement === If mandatory placement is about "must" and "must not", then advisory placement is the "I'd prefer if" alternative. For constraints with scores greater than +-INFINITY+ and less than +INFINITY+, the cluster will try to accommodate your wishes but may ignore them if the alternative is to stop some of the cluster resources. As in life, where if enough people prefer something it effectively becomes mandatory, advisory colocation constraints can combine with other elements of the configuration to behave as if they were mandatory. .Advisory colocation constraint for two resources ==== [source,XML] ==== [[s-resource-sets]] == Resource Sets == 'Resource sets' allow multiple resources to be affected by a single constraint. 
.A set of 3 resources ==== [source,XML] ---- ---- ==== Resource sets are valid inside +rsc_location+, +rsc_order+ (see <>), +rsc_colocation+ (see <>), and +rsc_ticket+ (see <>) constraints. A resource set has a number of properties that can be set, though not all have an effect in all contexts. .Properties of a resource_set [width="95%",cols="2m,1,5 ------- ====== .Visual representation of the four resources' start order for the above constraints image::images/resource-set.png["Ordered set",width="16cm",height="2.5cm",align="center"] === Ordered Set === To simplify this situation, resource sets (see <>) can be used within ordering constraints: .A chain of ordered resources expressed as a set ====== [source,XML] ------- ------- ====== While the set-based format is not less verbose, it is significantly easier to get right and maintain. [IMPORTANT] ========= If you use a higher-level tool, pay attention to how it exposes this functionality. Depending on the tool, creating a set +A B+ may be equivalent to +A then B+, or +B then A+. ========= === Ordering Multiple Sets === The syntax can be expanded to allow sets of resources to be ordered relative to each other, where the members of each individual set may be ordered or unordered (controlled by the +sequential+ property). In the example below, +A+ and +B+ can both start in parallel, as can +C+ and +D+, however +C+ and +D+ can only start once _both_ +A+ _and_ +B+ are active. .Ordered sets of unordered resources ====== [source,XML] ------- ------- ====== .Visual representation of the start order for two ordered sets of unordered resources image::images/two-sets.png["Two ordered sets",width="13cm",height="7.5cm",align="center"] Of course either set -- or both sets -- of resources can also be internally ordered (by setting +sequential="true"+) and there is no limit to the number of sets that can be specified. 
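
For instance, a minimal sketch that mixes one unordered and one internally ordered set might look like the following. The ids are assumptions made for illustration; only the +sequential+ property and the set layout come from the text above.

[source,XML]
-------
<!-- Hypothetical ids throughout -->
<rsc_order id="order-example">
  <!-- A and B may start in parallel -->
  <resource_set id="ordered-set-1" sequential="false">
    <resource_ref id="A"/>
    <resource_ref id="B"/>
  </resource_set>
  <!-- once both A and B are active, C starts, then D -->
  <resource_set id="ordered-set-2" sequential="true">
    <resource_ref id="C"/>
    <resource_ref id="D"/>
  </resource_set>
</rsc_order>
-------
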
.Advanced use of set ordering - Three ordered sets, two of which are internally unordered ====== [source,XML] ------- ------- ====== .Visual representation of the start order for the three sets defined above image::images/three-sets.png["Three ordered sets",width="16cm",height="7.5cm",align="center"] [IMPORTANT] ==== An ordered set with +sequential=false+ makes sense only if there is another set in the constraint. Otherwise, the constraint has no effect. ==== === Resource Set OR Logic === The unordered set logic discussed so far has all been "AND" logic. To illustrate this take the 3 resource set figure in the previous section. Those sets can be expressed, +(A and B) then \(C) then (D) then (E and F)+. Say for example we want to change the first set, +(A and B)+, to use "OR" logic so the sets look like this: +(A or B) then \(C) then (D) then (E and F)+. This functionality can be achieved through the use of the +require-all+ option. This option defaults to TRUE which is why the "AND" logic is used by default. Setting +require-all=false+ means only one resource in the set needs to be started before continuing on to the next set. .Resource Set "OR" logic: Three ordered sets, where the first set is internally unordered with "OR" logic ====== [source,XML] ------- ------- ====== [IMPORTANT] ==== An ordered set with +require-all=false+ makes sense only in conjunction with +sequential=false+. Think of it like this: +sequential=false+ modifies the set to be an unordered set using "AND" logic by default, and adding +require-all=false+ flips the unordered set's "AND" logic to "OR" logic. ==== [[s-resource-sets-colocation]] == Colocating Sets of Resources == Another common situation is for an administrator to create a set of colocated resources. One way to do this would be to define a resource group (see <>), but that cannot always accurately express the desired state. 
Another way would be to define each relationship as an individual constraint, but that causes a constraint explosion as the number of resources and combinations grow. An example of this approach: .Chain of colocated resources ====== [source,XML] ------- ------- ====== To make things easier, resource sets (see <>) can be used within colocation constraints. As with the chained version, a resource that can't be active prevents any resource that must be colocated with it from being active. For example, if +B+ is not able to run, then both +C+ and by inference +D+ must also remain stopped. Here is an example +resource_set+: .Equivalent colocation chain expressed using +resource_set+ ====== [source,XML] ------- ------- ====== [IMPORTANT] ========= If you use a higher-level tool, pay attention to how it exposes this functionality. Depending on the tool, creating a set +A B+ may be equivalent to +A with B+, or +B with A+. ========= This notation can also be used to tell the cluster that sets of resources must be colocated relative to each other, where the individual members of each set may or may not depend on each other being active (controlled by the +sequential+ property). In this example, +A+, +B+, and +C+ will each be colocated with +D+. +D+ must be active, but any of +A+, +B+, or +C+ may be inactive without affecting any other resources. .Using colocated sets to specify a common peer ====== [source,XML] ------- ------- ====== [IMPORTANT] ==== A colocated set with +sequential=false+ makes sense only if there is another set in the constraint. Otherwise, the constraint has no effect. ==== There is no inherent limit to the number and size of the sets used. 
The only thing that matters is that in order for any member of one set in the constraint to be active, all members of sets listed after it must also be active (and naturally on the same node); and if a set has +sequential="true"+, then in order for one member of that set to be active, all members listed before it must also be active. If desired, you can restrict the dependency to instances of multistate resources that are in a specific role, using the set's +role+ property. .Colocation chain in which the members of the middle set have no interdependencies, and the last listed set (which the cluster places first) is restricted to instances in master status. ====== [source,XML] ------- ------- ====== .Visual representation of the above example (resources to the left are placed first) image::images/three-sets-complex.png["Colocation chain",width="16cm",height="9cm",align="center"] [NOTE] ==== Pay close attention to the order in which resources and sets are listed. While the colocation dependency for members of any one set is last-to-first, the colocation dependency for multiple sets is first-to-last. In the above example, +B+ is colocated with +A+, but +colocated-set-1+ is colocated with +colocated-set-2+. Unlike ordered sets, colocated sets do not use the +require-all+ option. ==== diff --git a/doc/Pacemaker_Explained/en-US/Ch-Options.txt b/doc/Pacemaker_Explained/en-US/Ch-Options.txt index f8a3daffc8..894458d0ce 100644 --- a/doc/Pacemaker_Explained/en-US/Ch-Options.txt +++ b/doc/Pacemaker_Explained/en-US/Ch-Options.txt @@ -1,435 +1,441 @@ = Cluster-Wide Configuration = == CIB Properties == Certain settings are defined by CIB properties (that is, attributes of the +cib+ tag) rather than with the rest of the cluster configuration in the +configuration+ section. The reason is simply a matter of parsing. These options are used by the configuration database which is, by design, mostly ignorant of the content it holds.
So the decision was made to place them in an easy-to-find location. .CIB Properties [width="95%",cols="2m,5<",options="header",align="center"] |========================================================= |Field |Description | admin_epoch | indexterm:[Configuration Version,Cluster] indexterm:[Cluster,Option,Configuration Version] indexterm:[admin_epoch,Cluster Option] indexterm:[Cluster,Option,admin_epoch] When a node joins the cluster, the cluster performs a check to see which node has the best configuration. It asks the node with the highest (+admin_epoch+, +epoch+, +num_updates+) tuple to replace the configuration on all the nodes -- which makes setting them, and setting them correctly, very important. +admin_epoch+ is never modified by the cluster; you can use this to make the configurations on any inactive nodes obsolete. _Never set this value to zero_. In such cases, the cluster cannot tell the difference between your configuration and the "empty" one used when nothing is found on disk. | epoch | indexterm:[epoch,Cluster Option] indexterm:[Cluster,Option,epoch] The cluster increments this every time the configuration is updated (usually by the administrator). | num_updates | indexterm:[num_updates,Cluster Option] indexterm:[Cluster,Option,num_updates] The cluster increments this every time the configuration or status is updated (usually by the cluster) and resets it to 0 when epoch changes. | validate-with | indexterm:[validate-with,Cluster Option] indexterm:[Cluster,Option,validate-with] Determines the type of XML validation that will be done on the configuration. If set to +none+, the cluster will not verify that updates conform to the DTD (nor reject ones that don't). This option can be useful when operating a mixed-version cluster during an upgrade. |cib-last-written | indexterm:[cib-last-written,Cluster Property] indexterm:[Cluster,Property,cib-last-written] Indicates when the configuration was last written to disk. 
Maintained by the cluster; for informational purposes only. |have-quorum | indexterm:[have-quorum,Cluster Property] indexterm:[Cluster,Property,have-quorum] Indicates if the cluster has quorum. If false, this may mean that the cluster cannot start resources or fence other nodes (see +no-quorum-policy+ below). Maintained by the cluster. |dc-uuid | indexterm:[dc-uuid,Cluster Property] indexterm:[Cluster,Property,dc-uuid] Indicates which cluster node is the current leader. Used by the cluster when placing resources and determining the order of some events. Maintained by the cluster. |========================================================= === Working with CIB Properties === Although these fields can be written to by the user, in most cases the cluster will overwrite any values specified by the user with the "correct" ones. To change the ones that can be specified by the user, for example +admin_epoch+, one should use: ---- # cibadmin --modify --xml-text '' ---- A complete set of CIB properties will look something like this: .Attributes set for a cib object ====== [source,XML] ------- ------- ====== [[s-cluster-options]] == Cluster Options == Cluster options, as you might expect, control how the cluster behaves when confronted with certain situations. They are grouped into sets within the +crm_config+ section, and, in advanced configurations, there may be more than one set. (This will be described later in the section on <> where we will show how to have the cluster use different sets of options during working hours than during weekends.) For now, we will describe the simple case where each option is present at most once. You can obtain an up-to-date list of cluster options, including their default values, by running the `man pengine` and `man crmd` commands. .Cluster Options [width="95%",cols="5m,2,11>). 
| enable-startup-probes | TRUE | indexterm:[enable-startup-probes,Cluster Option] indexterm:[Cluster,Option,enable-startup-probes] Should the cluster check for active resources during startup? | maintenance-mode | FALSE | indexterm:[maintenance-mode,Cluster Option] indexterm:[Cluster,Option,maintenance-mode] Should the cluster refrain from monitoring, starting and stopping resources? | stonith-enabled | TRUE | indexterm:[stonith-enabled,Cluster Option] indexterm:[Cluster,Option,stonith-enabled] Should failed nodes and nodes with resources that can't be stopped be shot? If you value your data, set up a STONITH device and enable this. If true, or unset, the cluster will refuse to start resources unless one or more STONITH resources have been configured. If false, unresponsive nodes are immediately assumed to be running no resources, and resource takeover to online nodes starts without any further protection (which means _data loss_ if the unresponsive node still accesses shared storage, for example). See also the +requires+ meta-attribute in <>. | stonith-action | reboot | indexterm:[stonith-action,Cluster Option] indexterm:[Cluster,Option,stonith-action] Action to send to STONITH device. Allowed values are +reboot+ and +off+. The value +poweroff+ is also allowed, but is only used for legacy devices. | stonith-timeout | 60s | indexterm:[stonith-timeout,Cluster Option] indexterm:[Cluster,Option,stonith-timeout] How long to wait for STONITH actions (reboot, on, off) to complete | concurrent-fencing | FALSE | indexterm:[concurrent-fencing,Cluster Option] indexterm:[Cluster,Option,concurrent-fencing] Is the cluster allowed to initiate multiple fence actions concurrently? | cluster-delay | 60s | indexterm:[cluster-delay,Cluster Option] indexterm:[Cluster,Option,cluster-delay] Estimated maximum round-trip delay over the network (excluding action execution). 
If the TE requires an action to be executed on another node, it will consider the action failed if it does not get a response from the other node in this time (after considering the action's own timeout). The "correct" value will depend on the speed and load of your network and cluster nodes. | dc-deadtime | 20s | indexterm:[dc-deadtime,Cluster Option] indexterm:[Cluster,Option,dc-deadtime] How long to wait for a response from other nodes during startup. The "correct" value will depend on the speed/load of your network and the type of switches used. | cluster-recheck-interval | 15min | indexterm:[cluster-recheck-interval,Cluster Option] indexterm:[Cluster,Option,cluster-recheck-interval] Polling interval for time-based changes to options, resource parameters and constraints. The Cluster is primarily event-driven, but your configuration can have elements that take effect based on the time of day. To ensure these changes take effect, we can optionally poll the cluster's status for changes. A value of 0 disables polling. Positive values are an interval (in seconds unless other SI units are specified, e.g. 5min). | pe-error-series-max | -1 | indexterm:[pe-error-series-max,Cluster Option] indexterm:[Cluster,Option,pe-error-series-max] The number of PE inputs resulting in ERRORs to save. Used when reporting problems. A value of -1 means unlimited (report all). | pe-warn-series-max | -1 | indexterm:[pe-warn-series-max,Cluster Option] indexterm:[Cluster,Option,pe-warn-series-max] The number of PE inputs resulting in WARNINGs to save. Used when reporting problems. A value of -1 means unlimited (report all). | pe-input-series-max | -1 | indexterm:[pe-input-series-max,Cluster Option] indexterm:[Cluster,Option,pe-input-series-max] The number of "normal" PE inputs to save. Used when reporting problems. A value of -1 means unlimited (report all). 
| node-health-strategy | none | indexterm:[node-health-strategy,Cluster Option] indexterm:[Cluster,Option,node-health-strategy] How the cluster should react to node health attributes (see <>). Allowed values are +none+, +migrate-on-red+, +only-green+, +progressive+, and +custom+. +| node-health-base | 0 | +indexterm:[node-health-base,Cluster Option] +indexterm:[Cluster,Option,node-health-base] + The base health score assigned to a node. Only used when + +node-health-strategy+ is +progressive+. '(since 1.1.16)' + | node-health-green | 0 | indexterm:[node-health-green,Cluster Option] indexterm:[Cluster,Option,node-health-green] The score to use for a node health attribute whose value is +green+. Only used when +node-health-strategy+ is +progressive+ or +custom+. | node-health-yellow | 0 | indexterm:[node-health-yellow,Cluster Option] indexterm:[Cluster,Option,node-health-yellow] The score to use for a node health attribute whose value is +yellow+. Only used when +node-health-strategy+ is +progressive+ or +custom+. | node-health-red | 0 | indexterm:[node-health-red,Cluster Option] indexterm:[Cluster,Option,node-health-red] The score to use for a node health attribute whose value is +red+. Only used when +node-health-strategy+ is +progressive+ or +custom+. | remove-after-stop | FALSE | indexterm:[remove-after-stop,Cluster Option] indexterm:[Cluster,Option,remove-after-stop] _Advanced Use Only:_ Should the cluster remove resources from the LRM after they are stopped? Values other than the default are, at best, poorly tested and potentially dangerous. | startup-fencing | TRUE | indexterm:[startup-fencing,Cluster Option] indexterm:[Cluster,Option,startup-fencing] _Advanced Use Only:_ Should the cluster shoot unseen nodes? Not using the default is very unsafe! | election-timeout | 2min | indexterm:[election-timeout,Cluster Option] indexterm:[Cluster,Option,election-timeout] _Advanced Use Only:_ If you need to adjust this value, it probably indicates the presence of a bug. 
| shutdown-escalation | 20min | indexterm:[shutdown-escalation,Cluster Option] indexterm:[Cluster,Option,shutdown-escalation] _Advanced Use Only:_ If you need to adjust this value, it probably indicates the presence of a bug. | crmd-integration-timeout | 3min | indexterm:[crmd-integration-timeout,Cluster Option] indexterm:[Cluster,Option,crmd-integration-timeout] _Advanced Use Only:_ If you need to adjust this value, it probably indicates the presence of a bug. | crmd-finalization-timeout | 30min | indexterm:[crmd-finalization-timeout,Cluster Option] indexterm:[Cluster,Option,crmd-finalization-timeout] _Advanced Use Only:_ If you need to adjust this value, it probably indicates the presence of a bug. | crmd-transition-delay | 0s | indexterm:[crmd-transition-delay,Cluster Option] indexterm:[Cluster,Option,crmd-transition-delay] _Advanced Use Only:_ Delay cluster recovery for the configured interval to allow for additional/related events to occur. Useful if your configuration is sensitive to the order in which ping updates arrive. Enabling this option will slow down cluster recovery under all conditions. |default-resource-stickiness | 0 | indexterm:[default-resource-stickiness,Cluster Option] indexterm:[Cluster,Option,default-resource-stickiness] _Deprecated:_ See <> instead | is-managed-default | TRUE | indexterm:[is-managed-default,Cluster Option] indexterm:[Cluster,Option,is-managed-default] _Deprecated:_ See <> instead | default-action-timeout | 20s | indexterm:[default-action-timeout,Cluster Option] indexterm:[Cluster,Option,default-action-timeout] _Deprecated:_ See <> instead |========================================================= === Querying and Setting Cluster Options === indexterm:[Querying,Cluster Option] indexterm:[Setting,Cluster Option] indexterm:[Cluster,Querying Options] indexterm:[Cluster,Setting Options] Cluster options can be queried and modified using the `crm_attribute` tool. 
To get the current value of +cluster-delay+, you can run:

----
# crm_attribute --query --name cluster-delay
----

which is more simply written as

----
# crm_attribute -G -n cluster-delay
----

If a value is found, you'll see a result like this:

----
# crm_attribute -G -n cluster-delay
scope=crm_config name=cluster-delay value=60s
----

If no value is found, the tool will display an error:

----
# crm_attribute -G -n clusta-deway
scope=crm_config name=clusta-deway value=(null)
Error performing operation: No such device or address
----

To use a different value (for example, 30 seconds), simply run:

----
# crm_attribute --name cluster-delay --update 30s
----

To go back to the cluster's default value, you can delete the value, for example:

----
# crm_attribute --name cluster-delay --delete
Deleted crm_config option: id=cib-bootstrap-options-cluster-delay name=cluster-delay
----

=== When Options are Listed More Than Once ===

If you ever see something like the following, it means that the option you're modifying is present more than once.

.Deleting an option that is listed twice
=======
-------
# crm_attribute --name batch-limit --delete

Multiple attributes match name=batch-limit in crm_config:
Value: 50 (set=cib-bootstrap-options, id=cib-bootstrap-options-batch-limit)
Value: 100 (set=custom, id=custom-batch-limit)
Please choose from one of the matches above and supply the 'id' with --id
-------
=======

In such cases, follow the on-screen instructions to perform the requested action. To determine which value is currently being used by the cluster, refer to <>.
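
For orientation, this situation arises when the +crm_config+ section holds more than one property set defining the same option. The following sketch is consistent with the set names, ids and values in the output shown for +batch-limit+, but the surrounding XML is an assumption for illustration and is not copied from a real CIB:

[source,XML]
-------
<!-- Illustrative only: two property sets that both define batch-limit -->
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <nvpair id="cib-bootstrap-options-batch-limit" name="batch-limit" value="50"/>
  </cluster_property_set>
  <cluster_property_set id="custom">
    <nvpair id="custom-batch-limit" name="batch-limit" value="100"/>
  </cluster_property_set>
</crm_config>
-------

With a configuration like this, repeating the command with the id of the entry you mean, for example `crm_attribute --name batch-limit --delete --id custom-batch-limit`, should act on that entry alone.
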
diff --git a/doc/Pacemaker_Explained/en-US/Revision_History.xml b/doc/Pacemaker_Explained/en-US/Revision_History.xml index c781a023e8..bdf18a409c 100644 --- a/doc/Pacemaker_Explained/en-US/Revision_History.xml +++ b/doc/Pacemaker_Explained/en-US/Revision_History.xml @@ -1,96 +1,108 @@ Revision History 1-0 19 Oct 2009 Andrew Beekhof andrew@beekhof.net Import from Pages.app 2-0 26 Oct 2009 Andrew Beekhof andrew@beekhof.net Cleanup and reformatting of docbook xml complete 3-0 Tue Nov 12 2009 Andrew Beekhof andrew@beekhof.net Split book into chapters and pass validation Re-organize book for use with Publican 4-0 Mon Oct 8 2012 Andrew Beekhof andrew@beekhof.net Converted to asciidoc (which is converted to docbook for use with Publican) 5-0 Mon Feb 23 2015 Ken Gaillot kgaillot@redhat.com Update for clarity, stylistic consistency and current command-line syntax 6-0 Tue Dec 8 2015 Ken Gaillot kgaillot@redhat.com Update for Pacemaker 1.1.14 7-0 Tue May 3 2016 Ken Gaillot kgaillot@redhat.com Update for Pacemaker 1.1.15 7-1 Fri Oct 28 2016 Ken Gaillot kgaillot@redhat.com Overhaul upgrade documentation, and document node health strategies + + 8-0 + Tue Oct 25 2016 + Ken Gaillot kgaillot@redhat.com + + + + Update for Pacemaker 1.1.16 + + + + diff --git a/doc/Pacemaker_Explained/pot/Ap-Debug.pot b/doc/Pacemaker_Explained/pot/Ap-Debug.pot index 3da22721c0..9a815092cd 100644 --- a/doc/Pacemaker_Explained/pot/Ap-Debug.pot +++ b/doc/Pacemaker_Explained/pot/Ap-Debug.pot @@ -1,189 +1,189 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Debugging Cluster Startup" msgstr "" #.
Tag: title #, no-c-format msgid "Corosync" msgstr "" #. Tag: title #, no-c-format msgid "Prerequisites" msgstr "" #. Tag: title #, no-c-format msgid "Minimum logging configuration" msgstr "" #. Tag: programlisting #, no-c-format msgid "\n" " # /etc/init.d/openais start\n" " \n" " \n" " logging {\n" " to_syslog: yes\n" " syslog_facility: daemon\n" " }\n" " \n" " " msgstr "" #. Tag: caption #, no-c-format msgid "Whatever other logging you have, these two lines are required for Pacemaker clusters" msgstr "" #. Tag: title #, no-c-format msgid "Confirm Corosync Started" msgstr "" #. Tag: title #, no-c-format msgid "Expected output when starting openais" msgstr "" #. Tag: screen #, no-c-format msgid "\n" " # /etc/init.d/openais start\n" " \n" " \n" " Starting Corosync daemon (aisexec): starting... rc=0: OK\n" " \n" " " msgstr "" #. Tag: title #, no-c-format msgid "Expected log messages - startup" msgstr "" #. Tag: screen #, no-c-format msgid "\n" " # grep -e \"openais.*network interface\" -e \"AIS Executive Service\" /var/log/messages\n" " \n" " \n" " Aug 27 16:23:37 test1 openais[26337]: [MAIN ] AIS Executive Service RELEASE 'subrev 1152 version 0.80'\n" " Aug 27 16:23:38 test1 openais[26337]: [MAIN ] AIS Executive Service: started and ready to provide service.\n" " Aug 27 16:23:38 test1 openais[26337]: [TOTEM] The network interface [192.168.9.41] is now up.\n" " \n" " " msgstr "" #. Tag: caption #, no-c-format msgid "The versions may differ, but you should see Corosync indicate it started and successfully attached to the machine's network interface" msgstr "" #. Tag: title #, no-c-format msgid "Expected log messages - membership" msgstr "" #.
Tag: screen #, no-c-format msgid "\n" " # grep CLM /var/log/messages\n" " \n" " \n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] CLM CONFIGURATION CHANGE\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] New Configuration:\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] Members Left:\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] Members Joined:\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] CLM CONFIGURATION CHANGE\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] New Configuration:\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] r(0) ip(192.168.9.41)\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] Members Left:\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] Members Joined:\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] r(0) ip(192.168.9.41)\n" " Aug 27 16:53:15 test1 openais[2166]: [CLM ] got nodejoin message 192.168.9.41\n" " \n" " " msgstr "" #. Tag: caption #, no-c-format msgid "The exact messages will differ, but you should see a new membership formed with the real IP address of your node" msgstr "" #. Tag: title #, no-c-format msgid "Checking Pacemaker" msgstr "" #. Tag: para #, no-c-format msgid "Now that we have confirmed that Corosync is functional we can check the rest of the stack." msgstr "" #. Tag: title #, no-c-format msgid "Expected Pacemaker startup logging for Corosync" msgstr "" #. Tag: screen #, no-c-format msgid "\n" " # grep pcmk_plugin_init /var/log/messages\n" " \n" " \n" " Aug 27 16:53:15 test1 openais[2166]: [pcmk ] info: pcmk_plugin_init: CRM: Initialized\n" " Aug 27 16:53:15 test1 openais[2166]: [pcmk ] Logging: Initialized pcmk_plugin_init\n" " Aug 27 16:53:15 test1 openais[2166]: [pcmk ] info: pcmk_plugin_init: Service: 9\n" " Aug 27 16:53:15 test1 openais[2166]: [pcmk ] info: pcmk_plugin_init: Local hostname: test1\n" " \n" " " msgstr "" #. Tag: caption #, no-c-format msgid "If you don't see these messages, or some like them, there is likely a problem finding or loading the pacemaker plugin." msgstr "" #. 
Tag: title #, no-c-format msgid "Expected process listing on a 64-bit machine" msgstr "" #. Tag: screen #, no-c-format msgid "\n" " # ps axf\n" " \n" " \n" " 3718 ? Ssl 0:05 /usr/sbin/aisexec\n" " 3723 ? SLs 0:00 \\_ /usr/lib64/heartbeat/stonithd\n" " 3724 ? S 0:05 \\_ /usr/lib64/heartbeat/cib\n" " 3725 ? S 0:21 \\_ /usr/lib64/heartbeat/lrmd\n" " 3726 ? S 0:01 \\_ /usr/lib64/heartbeat/attrd\n" " 3727 ? S 0:00 \\_ /usr/lib64/heartbeat/pengine\n" " 3728 ? S 0:01 \\_ /usr/lib64/heartbeat/crmd\n" " \n" " " msgstr "" #. Tag: caption #, no-c-format msgid "On 32-bit systems the exact path may differ, but all the above processes should be listed." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ap-FAQ.pot b/doc/Pacemaker_Explained/pot/Ap-FAQ.pot index 47d40a5941..84b0312b47 100644 --- a/doc/Pacemaker_Explained/pot/Ap-FAQ.pot +++ b/doc/Pacemaker_Explained/pot/Ap-FAQ.pot @@ -1,124 +1,124 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "FAQ" msgstr "" #. Tag: para #, no-c-format msgid "Why is the Project Called Pacemaker?" msgstr "" #. Tag: para #, no-c-format msgid " Pacemaker First of all, the reason it’s not called the CRM is because of the abundance of terms http://en.wikipedia.org/wiki/CRM that are commonly abbreviated to those three letters. The Pacemaker name came from Kham, http://khamsouk.souvanlasy.com/ a good friend of Pacemaker developer Andrew Beekhof’s, and was originally used by a Java GUI that Beekhof was prototyping in early 2007. 
Alas, other commitments prevented the GUI from progressing much and, when it came time to choose a name for this project, Lars Marowsky-Bree suggested it was an even better fit for an independent CRM. The idea stems from the analogy between the role of this software and that of the little device that keeps the human heart pumping. Pacemaker monitors the cluster and intervenes when necessary to ensure the smooth operation of the services it provides. There were a number of other names (and acronyms) tossed around, but suffice to say \"Pacemaker\" was the best." msgstr "" #. Tag: para #, no-c-format msgid "Why was the Pacemaker Project Created?" msgstr "" #. Tag: para #, no-c-format msgid "The decision was made to spin-off the CRM into its own project after the 2.1.3 Heartbeat release in order to:" msgstr "" #. Tag: para #, no-c-format msgid "support both the Corosync and Heartbeat cluster stacks equally" msgstr "" #. Tag: para #, no-c-format msgid "decouple the release cycles of two projects at very different stages of their life-cycles" msgstr "" #. Tag: para #, no-c-format msgid "foster clearer package boundaries, thus leading to better and more stable interfaces" msgstr "" #. Tag: para #, no-c-format msgid "What Messaging Layers are Supported?" msgstr "" #. Tag: para #, no-c-format msgid " Messaging Layers " msgstr "" #. Tag: para #, no-c-format msgid "Corosync" msgstr "" #. Tag: para #, no-c-format msgid "Heartbeat" msgstr "" #. Tag: para #, no-c-format msgid "Can I Choose Which Messaging Layer to Use at Run Time?" msgstr "" #. Tag: para #, no-c-format msgid "Yes. The CRM will automatically detect which started it and behave accordingly." msgstr "" #. Tag: para #, no-c-format msgid "Can I Have a Mixed Heartbeat-Corosync Cluster?" msgstr "" #. Tag: para #, no-c-format msgid "No." msgstr "" #. Tag: para #, no-c-format msgid " Which Messaging Layer Should I Choose?" msgstr "" #. 
Tag: para #, no-c-format msgid "You can choose from multiple messaging layers, including heartbeat, corosync 1 (with or without CMAN), and corosync 2. Corosync 2 is the current state of the art due to its more advanced features and better support for pacemaker, but often the best choice is to use whatever comes with your Linux distribution, and follow the distribution’s setup instructions." msgstr "" #. Tag: para #, no-c-format msgid "Where Can I Get Pre-built Packages?" msgstr "" #. Tag: para #, no-c-format msgid "Most major Linux distributions have pacemaker packages in their standard package repositories. See the Install wiki page for details." msgstr "" #. Tag: para #, no-c-format msgid "What Versions of Pacemaker Are Supported?" msgstr "" #. Tag: para #, no-c-format msgid "Some Linux distributions (such as Red Hat Enterprise Linux and SUSE Linux Enterprise) offer technical support for their customers; contact them for details of such support. For help within the community (mailing lists, IRC, etc.) from Pacemaker developers and users, refer to the Releases wiki page for an up-to-date list of versions considered to be supported by the project. When seeking assistance, please try to ensure you have one of these versions." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ap-Install.pot b/doc/Pacemaker_Explained/pot/Ap-Install.pot index 6148f76a9c..bb5a76cf9e 100644 --- a/doc/Pacemaker_Explained/pot/Ap-Install.pot +++ b/doc/Pacemaker_Explained/pot/Ap-Install.pot @@ -1,268 +1,268 @@ # # AUTHOR , YEAR.
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Installing" msgstr "" #. Tag: title #, no-c-format msgid "Installing the Software" msgstr "" #. Tag: para #, no-c-format msgid "Most major Linux distributions have pacemaker packages in their standard package repositories, or the software can be built from source code. See the Install wiki page for details." msgstr "" #. Tag: para #, no-c-format msgid "See Which Messaging Layer Should I Choose? for information about choosing a messaging layer." msgstr "" #. Tag: title #, no-c-format msgid "Enabling Pacemaker" msgstr "" #. Tag: title #, no-c-format msgid "Enabling Pacemaker For Corosync 2.x" msgstr "" #. Tag: para #, no-c-format msgid "High-level cluster management tools are available that can configure corosync for you. This document focuses on the lower-level details if you want to configure corosync yourself." msgstr "" #. Tag: para #, no-c-format msgid "Corosync configuration is normally located in /etc/corosync/corosync.conf." msgstr "" #. Tag: title #, no-c-format msgid "Corosync 2.x configuration file for two nodes myhost1 and myhost2" msgstr "" #. Tag: screen #, no-c-format msgid "totem {\n" "version: 2\n" "secauth: off\n" "cluster_name: mycluster\n" "transport: udpu\n" "}\n" "\n" "nodelist {\n" " node {\n" " ring0_addr: myhost1\n" " nodeid: 1\n" " }\n" " node {\n" " ring0_addr: myhost2\n" " nodeid: 2\n" " }\n" "}\n" "\n" "quorum {\n" "provider: corosync_votequorum\n" "two_node: 1\n" "}\n" "\n" "logging {\n" "to_syslog: yes\n" "}" msgstr "" #. 
Tag: title #, no-c-format msgid "Corosync 2.x configuration file for three nodes myhost1, myhost2 and myhost3" msgstr "" #. Tag: screen #, no-c-format msgid "totem {\n" "version: 2\n" "secauth: off\n" "cluster_name: mycluster\n" "transport: udpu\n" "}\n" "\n" "nodelist {\n" " node {\n" " ring0_addr: myhost1\n" " nodeid: 1\n" " }\n" " node {\n" " ring0_addr: myhost2\n" " nodeid: 2\n" " }\n" " node {\n" " ring0_addr: myhost3\n" " nodeid: 3\n" " }\n" "}\n" "\n" "quorum {\n" "provider: corosync_votequorum\n" "\n" "}\n" "\n" "logging {\n" "to_syslog: yes\n" "}" msgstr "" #. Tag: para #, no-c-format msgid "In the above examples, the totem section defines what protocol version and options (including encryption) to use, Please consult the Corosync website (http://www.corosync.org/) and documentation for details on enabling encryption and peer authentication for the cluster. and gives the cluster a unique name (mycluster in these examples)." msgstr "" #. Tag: para #, no-c-format msgid "The node section lists the nodes in this cluster. (See for how this affects pacemaker.)" msgstr "" #. Tag: para #, no-c-format msgid "The quorum section defines how the cluster uses quorum. The important thing is that two-node clusters must be handled specially, so two_node: 1 must be defined for two-node clusters (and only for two-node clusters)." msgstr "" #. Tag: para #, no-c-format msgid "The logging section should be self-explanatory." msgstr "" #. Tag: title #, no-c-format msgid "Enabling Pacemaker For Corosync 1.x" msgstr "" #. Tag: title #, no-c-format msgid "Corosync 1.x configuration file for a cluster with all nodes on the 192.0.2.0/24 network" msgstr "" #. 
Tag: programlisting #, no-c-format msgid " totem {\n" " version: 2\n" " secauth: off\n" " threads: 0\n" " interface {\n" " ringnumber: 0\n" " bindnetaddr: 192.0.2.0\n" " mcastaddr: 239.255.1.1\n" " mcastport: 1234\n" " }\n" " }\n" " logging {\n" " fileline: off\n" " to_syslog: yes\n" " syslog_facility: daemon\n" " }\n" " amf {\n" " mode: disabled\n" " }" msgstr "" #. Tag: para #, no-c-format msgid "With corosync 1.x, the totem section contains the protocol version and options as with 2.x. However, nodes are also listed here, in the interface section. The bindnetaddr option is usually the network address, thus allowing the same configuration file to be used on all nodes. IPv4 or IPv6 addresses can be used with corosync." msgstr "" #. Tag: para #, no-c-format msgid "The amf section refers to the Availability Management Framework and is not covered in this document." msgstr "" #. Tag: para #, no-c-format msgid "The above corosync configuration is enough for corosync to operate by itself, but corosync 1.x additionally needs to be told when it is being used in conjunction with Pacemaker. This can be accomplished in one of two ways:" msgstr "" #. Tag: para #, no-c-format msgid "Via the CMAN software provided with Red Hat Enterprise Linux 6 and its derivatives" msgstr "" #. Tag: para #, no-c-format msgid "Via the pacemaker corosync plugin" msgstr "" #. Tag: para #, no-c-format msgid "To use CMAN, consult its documentation." msgstr "" #. Tag: para #, no-c-format msgid "To use the pacemaker corosync plugin, add the following fragment to the corosync configuration and restart the cluster." msgstr "" #. Tag: title #, no-c-format msgid "Corosync 1.x configuration fragment to enable Pacemaker plugin" msgstr "" #. Tag: programlisting #, no-c-format msgid "aisexec {\n" " user: root\n" " group: root\n" "}\n" "service {\n" " name: pacemaker\n" " ver: 0\n" "}" msgstr "" #.
Tag: para #, no-c-format msgid "The cluster needs to be run as root so that its child processes (the lrmd in particular) have sufficient privileges to perform the actions requested of it. After all, a cluster manager that can’t add an IP address or start apache is of little use." msgstr "" #. Tag: para #, no-c-format msgid "The second directive is the one that actually instructs the cluster to run Pacemaker." msgstr "" #. Tag: title #, no-c-format msgid "Enabling Pacemaker For Heartbeat" msgstr "" #. Tag: para #, no-c-format msgid "See the heartbeat documentation for how to set up a ha.cf configuration file." msgstr "" #. Tag: para #, no-c-format msgid "To enable the use of pacemaker with heartbeat, add the following to a functional ha.cf configuration file and restart Heartbeat:" msgstr "" #. Tag: title #, no-c-format msgid "Heartbeat configuration fragment to enable Pacemaker" msgstr "" #. Tag: screen #, no-c-format msgid "crm respawn" msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ap-LSB.pot b/doc/Pacemaker_Explained/pot/Ap-LSB.pot index a09d77916c..f34d4428f7 100644 --- a/doc/Pacemaker_Explained/pot/Ap-LSB.pot +++ b/doc/Pacemaker_Explained/pot/Ap-LSB.pot @@ -1,139 +1,139 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Init Script LSB Compliance" msgstr "" #. Tag: para #, no-c-format msgid "The relevant part of the LSB specifications includes a description of all the return codes listed here." msgstr "" #. 
Tag: para #, no-c-format msgid "Assuming some_service is configured correctly and currently inactive, the following sequence will help you determine if it is LSB-compatible:" msgstr "" #. Tag: para #, no-c-format msgid "Start (stopped):" msgstr "" #. Tag: screen #, no-c-format msgid "# /etc/init.d/some_service start ; echo \"result: $?\"" msgstr "" #. Tag: para #, no-c-format msgid "Did the service start?" msgstr "" #. Tag: para #, no-c-format msgid "Did the command print result: 0 (in addition to its usual output)?" msgstr "" #. Tag: para #, no-c-format msgid "Status (running):" msgstr "" #. Tag: screen #, no-c-format msgid "# /etc/init.d/some_service status ; echo \"result: $?\"" msgstr "" #. Tag: para #, no-c-format msgid "Did the script accept the command?" msgstr "" #. Tag: para #, no-c-format msgid "Did the script indicate the service was running?" msgstr "" #. Tag: para #, no-c-format msgid "Start (running):" msgstr "" #. Tag: para #, no-c-format msgid "Is the service still running?" msgstr "" #. Tag: para #, no-c-format msgid "Stop (running):" msgstr "" #. Tag: screen #, no-c-format msgid "# /etc/init.d/some_service stop ; echo \"result: $?\"" msgstr "" #. Tag: para #, no-c-format msgid "Was the service stopped?" msgstr "" #. Tag: para #, no-c-format msgid "Status (stopped):" msgstr "" #. Tag: para #, no-c-format msgid "Did the script indicate the service was not running?" msgstr "" #. Tag: para #, no-c-format msgid "Did the command print result: 3 (in addition to its usual output)?" msgstr "" #. Tag: para #, no-c-format msgid "Stop (stopped):" msgstr "" #. Tag: para #, no-c-format msgid "Is the service still stopped?" msgstr "" #. Tag: para #, no-c-format msgid "Status (failed):" msgstr "" #. Tag: para #, no-c-format msgid "This step is not readily testable and relies on manual inspection of the script." msgstr "" #. 
Tag: para #, no-c-format msgid "The script can use one of the error codes (other than 3) listed in the LSB spec to indicate that it is active but failed. This tells the cluster that before moving the resource to another node, it needs to stop it on the existing one first." msgstr "" #. Tag: para #, no-c-format msgid "If the answer to any of the above questions is no, then the script is not LSB-compliant. Your options are then to either fix the script or write an OCF agent based on the existing script." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ap-OCF.pot b/doc/Pacemaker_Explained/pot/Ap-OCF.pot index 34e1c3032d..2c20cdee0e 100644 --- a/doc/Pacemaker_Explained/pot/Ap-OCF.pot +++ b/doc/Pacemaker_Explained/pot/Ap-OCF.pot @@ -1,514 +1,514 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "More About OCF Resource Agents" msgstr "" #. Tag: title #, no-c-format msgid "Location of Custom Scripts" msgstr "" #. Tag: para #, no-c-format msgid " OCF Resource Agents OCF Resource Agents are found in /usr/lib/ocf/resource.d/provider" msgstr "" #. Tag: para #, no-c-format msgid "When creating your own agents, you are encouraged to create a new directory under /usr/lib/ocf/resource.d/ so that they are not confused with (or overwritten by) the agents shipped by existing providers." msgstr "" #. Tag: para #, no-c-format msgid "So, for example, if you choose the provider name of bigCorp and want a new resource named bigApp, you would create a resource agent called /usr/lib/ocf/resource.d/bigCorp/bigApp and define a resource:" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<primitive id=\"custom-app\" class=\"ocf\" provider=\"bigCorp\" type=\"bigApp\"/>" msgstr "" #. Tag: title #, no-c-format msgid "Actions" msgstr "" #. Tag: para #, no-c-format msgid "All OCF resource agents are required to implement the following actions." msgstr "" #. Tag: title #, no-c-format msgid "Required Actions for OCF Agents" msgstr "" #. Tag: entry #, no-c-format msgid "Action" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: entry #, no-c-format msgid "Instructions" msgstr "" #. Tag: para #, no-c-format msgid "start" msgstr "" #. Tag: para #, no-c-format msgid "Start the resource" msgstr "" #. Tag: para #, no-c-format msgid "Return 0 on success and an appropriate error code otherwise. Must not report success until the resource is fully active. startOCF Action OCF Action OCFActionstart Actionstart start " msgstr "" #. Tag: para #, no-c-format msgid "stop" msgstr "" #. Tag: para #, no-c-format msgid "Stop the resource" msgstr "" #. Tag: para #, no-c-format msgid "Return 0 on success and an appropriate error code otherwise. Must not report success until the resource is fully stopped. stopOCF Action OCF Action OCFActionstop Actionstop stop " msgstr "" #. Tag: para #, no-c-format msgid "monitor" msgstr "" #. Tag: para #, no-c-format msgid "Check the resource’s state" msgstr "" #. Tag: para #, no-c-format msgid "Exit 0 if the resource is running, 7 if it is stopped, and anything else if it is failed. monitorOCF Action OCF Action OCFActionmonitor Actionmonitor monitor " msgstr "" #. Tag: para #, no-c-format msgid "NOTE: The monitor script should test the state of the resource on the local machine only." msgstr "" #. Tag: para #, no-c-format msgid "meta-data" msgstr "" #. Tag: para #, no-c-format msgid "Describe the resource" msgstr "" #. Tag: para #, no-c-format msgid "Provide information about this resource as an XML snippet. Exit with 0. 
meta-dataOCF Action OCF Action OCFActionmeta-data Actionmeta-data meta-data " msgstr "" #. Tag: para #, no-c-format msgid "NOTE: This is not performed as root." msgstr "" #. Tag: para #, no-c-format msgid "validate-all" msgstr "" #. Tag: para #, no-c-format msgid "Verify the supplied parameters" msgstr "" #. Tag: para #, no-c-format msgid "Return 0 if parameters are valid, 2 if not valid, and 6 if resource is not configured. validate-allOCF Action OCF Action OCFActionvalidate-all Actionvalidate-all validate-all " msgstr "" #. Tag: para #, no-c-format msgid "Additional requirements (not part of the OCF specification) are placed on agents that will be used for advanced concepts such as clones and multi-state resources." msgstr "" #. Tag: title #, no-c-format msgid "Optional Actions for OCF Resource Agents" msgstr "" #. Tag: para #, no-c-format msgid "promote" msgstr "" #. Tag: para #, no-c-format msgid "Promote the local instance of a multi-state resource to the master (primary) state." msgstr "" #. Tag: para #, no-c-format msgid "Return 0 on success promoteOCF Action OCF Action OCFActionpromote Actionpromote promote " msgstr "" #. Tag: para #, no-c-format msgid "demote" msgstr "" #. Tag: para #, no-c-format msgid "Demote the local instance of a multi-state resource to the slave (secondary) state." msgstr "" #. Tag: para #, no-c-format msgid "Return 0 on success demoteOCF Action OCF Action OCFActiondemote Actiondemote demote " msgstr "" #. Tag: para #, no-c-format msgid "notify" msgstr "" #. Tag: para #, no-c-format msgid "Used by the cluster to send the agent pre- and post-notification events telling the resource what has happened and will happen." msgstr "" #. Tag: para #, no-c-format msgid "Must not fail. Must exit with 0 notifyOCF Action OCF Action OCFActionnotify Actionnotify notify " msgstr "" #. Tag: para #, no-c-format msgid "One action specified in the OCF specs, recover, is not currently used by the cluster. 
It is intended to be a variant of the start action that tries to recover a resource locally." msgstr "" #. Tag: para #, no-c-format msgid "If you create a new OCF resource agent, use ocf-tester ocf-tester to verify that the agent complies with the OCF standard properly." msgstr "" #. Tag: title #, no-c-format msgid "How are OCF Return Codes Interpreted?" msgstr "" #. Tag: para #, no-c-format msgid "The first thing the cluster does is to check the return code against the expected result. If the result does not match the expected value, then the operation is considered to have failed, and recovery action is initiated." msgstr "" #. Tag: para #, no-c-format msgid "There are three types of failure recovery:" msgstr "" #. Tag: title #, no-c-format msgid "Types of recovery performed by the cluster" msgstr "" #. Tag: entry #, no-c-format msgid "Type" msgstr "" #. Tag: entry #, no-c-format msgid "Action Taken by the Cluster" msgstr "" #. Tag: para #, no-c-format msgid "soft" msgstr "" #. Tag: para #, no-c-format msgid "A transient error occurred" msgstr "" #. Tag: para #, no-c-format msgid "Restart the resource or move it to a new location softOCF error OCF error OCFerrorsoft errorsoft soft " msgstr "" #. Tag: para #, no-c-format msgid "hard" msgstr "" #. Tag: para #, no-c-format msgid "A non-transient error that may be specific to the current node occurred" msgstr "" #. Tag: para #, no-c-format msgid "Move the resource elsewhere and prevent it from being retried on the current node hardOCF error OCF error OCFerrorhard errorhard hard " msgstr "" #. Tag: para #, no-c-format msgid "fatal" msgstr "" #. Tag: para #, no-c-format msgid "A non-transient error that will be common to all cluster nodes (e.g. a bad configuration was specified)" msgstr "" #. Tag: para #, no-c-format msgid "Stop the resource and prevent it from being started on any cluster node fatalOCF error OCF error OCFerrorfatal errorfatal fatal " msgstr "" #. 
Tag: title #, no-c-format msgid "OCF Return Codes" msgstr "" #. Tag: para #, no-c-format msgid "The following table outlines the different OCF return codes and the type of recovery the cluster will initiate when a failure code is received. Although counterintuitive, even actions that return 0 (aka. OCF_SUCCESS) can be considered to have failed, if 0 was not the expected return value." msgstr "" #. Tag: title #, no-c-format msgid "OCF Return Codes and their Recovery Types" msgstr "" #. Tag: entry #, no-c-format msgid "RC" msgstr "" #. Tag: entry #, no-c-format msgid "OCF Alias" msgstr "" #. Tag: entry #, no-c-format msgid "RT" msgstr "" #. Tag: para #, no-c-format msgid "0" msgstr "" #. Tag: para #, no-c-format msgid "OCF_SUCCESS" msgstr "" #. Tag: para #, no-c-format msgid "Success. The command completed successfully. This is the expected result for all start, stop, promote and demote commands. Return CodeOCF_SUCCESS OCF_SUCCESS Return Code0OCF_SUCCESS 0OCF_SUCCESS OCF_SUCCESS " msgstr "" #. Tag: para #, no-c-format msgid "1" msgstr "" #. Tag: para #, no-c-format msgid "OCF_ERR_GENERIC" msgstr "" #. Tag: para #, no-c-format msgid "Generic \"there was a problem\" error code. Return CodeOCF_ERR_GENERIC OCF_ERR_GENERIC Return Code1OCF_ERR_GENERIC 1OCF_ERR_GENERIC OCF_ERR_GENERIC " msgstr "" #. Tag: para #, no-c-format msgid "2" msgstr "" #. Tag: para #, no-c-format msgid "OCF_ERR_ARGS" msgstr "" #. Tag: para #, no-c-format msgid "The resource’s configuration is not valid on this machine. E.g. it refers to a location not found on the node. Return CodeOCF_ERR_ARGS OCF_ERR_ARGS Return Code2OCF_ERR_ARGS 2OCF_ERR_ARGS OCF_ERR_ARGS " msgstr "" #. Tag: para #, no-c-format msgid "3" msgstr "" #. Tag: para #, no-c-format msgid "OCF_ERR_UNIMPLEMENTED" msgstr "" #. Tag: para #, no-c-format msgid "The requested action is not implemented. 
Return CodeOCF_ERR_UNIMPLEMENTED OCF_ERR_UNIMPLEMENTED Return Code3OCF_ERR_UNIMPLEMENTED 3OCF_ERR_UNIMPLEMENTED OCF_ERR_UNIMPLEMENTED " msgstr "" #. Tag: para #, no-c-format msgid "4" msgstr "" #. Tag: para #, no-c-format msgid "OCF_ERR_PERM" msgstr "" #. Tag: para #, no-c-format msgid "The resource agent does not have sufficient privileges to complete the task. Return CodeOCF_ERR_PERM OCF_ERR_PERM Return Code4OCF_ERR_PERM 4OCF_ERR_PERM OCF_ERR_PERM " msgstr "" #. Tag: para #, no-c-format msgid "5" msgstr "" #. Tag: para #, no-c-format msgid "OCF_ERR_INSTALLED" msgstr "" #. Tag: para #, no-c-format msgid "The tools required by the resource are not installed on this machine. Return CodeOCF_ERR_INSTALLED OCF_ERR_INSTALLED Return Code5OCF_ERR_INSTALLED 5OCF_ERR_INSTALLED OCF_ERR_INSTALLED " msgstr "" #. Tag: para #, no-c-format msgid "6" msgstr "" #. Tag: para #, no-c-format msgid "OCF_ERR_CONFIGURED" msgstr "" #. Tag: para #, no-c-format msgid "The resource’s configuration is invalid. E.g. required parameters are missing. Return CodeOCF_ERR_CONFIGURED OCF_ERR_CONFIGURED Return Code6OCF_ERR_CONFIGURED 6OCF_ERR_CONFIGURED OCF_ERR_CONFIGURED " msgstr "" #. Tag: para #, no-c-format msgid "7" msgstr "" #. Tag: para #, no-c-format msgid "OCF_NOT_RUNNING" msgstr "" #. Tag: para #, no-c-format msgid "The resource is safely stopped. The cluster will not attempt to stop a resource that returns this for any action. Return CodeOCF_NOT_RUNNING OCF_NOT_RUNNING Return Code7OCF_NOT_RUNNING 7OCF_NOT_RUNNING OCF_NOT_RUNNING " msgstr "" #. Tag: para #, no-c-format msgid "N/A" msgstr "" #. Tag: para #, no-c-format msgid "8" msgstr "" #. Tag: para #, no-c-format msgid "OCF_RUNNING_MASTER" msgstr "" #. Tag: para #, no-c-format msgid "The resource is running in master mode. Return CodeOCF_RUNNING_MASTER OCF_RUNNING_MASTER Return Code8OCF_RUNNING_MASTER 8OCF_RUNNING_MASTER OCF_RUNNING_MASTER " msgstr "" #. Tag: para #, no-c-format msgid "9" msgstr "" #. 
Tag: para #, no-c-format msgid "OCF_FAILED_MASTER" msgstr "" #. Tag: para #, no-c-format msgid "The resource is in master mode but has failed. The resource will be demoted, stopped and then started (and possibly promoted) again. Return CodeOCF_FAILED_MASTER OCF_FAILED_MASTER Return Code9OCF_FAILED_MASTER 9OCF_FAILED_MASTER OCF_FAILED_MASTER " msgstr "" #. Tag: para #, no-c-format msgid "other" msgstr "" #. Tag: para #, no-c-format msgid "Custom error code. Return Codeother other " msgstr "" #. Tag: para #, no-c-format msgid "Exceptions to the recovery handling described above:" msgstr "" #. Tag: para #, no-c-format msgid "Probes (non-recurring monitor actions) that find a resource active (or in master mode) will not result in recovery action unless it is also found active elsewhere." msgstr "" #. Tag: para #, no-c-format msgid "The recovery action taken when a resource is found active more than once is determined by the resource’s multiple-active property (see )." msgstr "" #. Tag: para #, no-c-format msgid "Recurring actions that return OCF_ERR_UNIMPLEMENTED do not cause any type of recovery." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ap-Samples.pot b/doc/Pacemaker_Explained/pot/Ap-Samples.pot index e6aac2e68a..a411b88deb 100644 --- a/doc/Pacemaker_Explained/pot/Ap-Samples.pot +++ b/doc/Pacemaker_Explained/pot/Ap-Samples.pot @@ -1,184 +1,184 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Sample Configurations" msgstr "" #. Tag: title #, no-c-format msgid "Empty" msgstr "" #. 
Tag: title #, no-c-format msgid "An Empty Configuration" msgstr "" #. Tag: programlisting #, no-c-format msgid "<cib crm_feature_set=\"3.0.7\" validate-with=\"pacemaker-1.2\" admin_epoch=\"1\" epoch=\"0\" num_updates=\"0\">\n" " <configuration>\n" " <crm_config/>\n" " <nodes/>\n" " <resources/>\n" " <constraints/>\n" " </configuration>\n" " <status/>\n" "</cib>" msgstr "" #. Tag: title #, no-c-format msgid "Simple" msgstr "" #. Tag: title #, no-c-format msgid "A simple configuration with two nodes, some cluster options and a resource" msgstr "" #. Tag: programlisting #, no-c-format msgid "<cib crm_feature_set=\"3.0.7\" validate-with=\"pacemaker-1.2\" admin_epoch=\"1\" epoch=\"0\" num_updates=\"0\">\n" " <configuration>\n" " <crm_config>\n" " <cluster_property_set id=\"cib-bootstrap-options\">\n" " <nvpair id=\"option-1\" name=\"symmetric-cluster\" value=\"true\"/>\n" " <nvpair id=\"option-2\" name=\"no-quorum-policy\" value=\"stop\"/>\n" " <nvpair id=\"option-3\" name=\"stonith-enabled\" value=\"0\"/>\n" " </cluster_property_set>\n" " </crm_config>\n" " <nodes>\n" " <node id=\"xxx\" uname=\"c001n01\" type=\"normal\"/>\n" " <node id=\"yyy\" uname=\"c001n02\" type=\"normal\"/>\n" " </nodes>\n" " <resources>\n" " <primitive id=\"myAddr\" class=\"ocf\" provider=\"heartbeat\" type=\"IPaddr\">\n" " <operations>\n" " <op id=\"myAddr-monitor\" name=\"monitor\" interval=\"300s\"/>\n" " </operations>\n" " <instance_attributes id=\"myAddr-params\">\n" " <nvpair id=\"myAddr-ip\" name=\"ip\" value=\"192.0.2.10\"/>\n" " </instance_attributes>\n" " </primitive>\n" " </resources>\n" " <constraints>\n" " <rsc_location id=\"myAddr-prefer\" rsc=\"myAddr\" node=\"c001n01\" score=\"INFINITY\"/>\n" " </constraints>\n" " <rsc_defaults>\n" " <meta_attributes id=\"rsc_defaults-options\">\n" " <nvpair id=\"rsc-default-1\" name=\"resource-stickiness\" value=\"100\"/>\n" " <nvpair id=\"rsc-default-2\" name=\"migration-threshold\" value=\"10\"/>\n" " </meta_attributes>\n" " </rsc_defaults>\n" 
" <op_defaults>\n" " <meta_attributes id=\"op_defaults-options\">\n" " <nvpair id=\"op-default-1\" name=\"timeout\" value=\"30s\"/>\n" " </meta_attributes>\n" " </op_defaults>\n" " </configuration>\n" " <status/>\n" "</cib>" msgstr "" #. Tag: para #, no-c-format msgid "In the above example, we have one resource (an IP address) that we check every five minutes and will run on host c001n01 until either the resource fails 10 times or the host shuts down." msgstr "" #. Tag: title #, no-c-format msgid "Advanced Configuration" msgstr "" #. Tag: title #, no-c-format msgid "An advanced configuration with groups, clones and STONITH" msgstr "" #. Tag: programlisting #, no-c-format msgid "<cib crm_feature_set=\"3.0.7\" validate-with=\"pacemaker-1.2\" admin_epoch=\"1\" epoch=\"0\" num_updates=\"0\">\n" " <configuration>\n" " <crm_config>\n" " <cluster_property_set id=\"cib-bootstrap-options\">\n" " <nvpair id=\"option-1\" name=\"symmetric-cluster\" value=\"true\"/>\n" " <nvpair id=\"option-2\" name=\"no-quorum-policy\" value=\"stop\"/>\n" " <nvpair id=\"option-3\" name=\"stonith-enabled\" value=\"true\"/>\n" " </cluster_property_set>\n" " </crm_config>\n" " <nodes>\n" " <node id=\"xxx\" uname=\"c001n01\" type=\"normal\"/>\n" " <node id=\"yyy\" uname=\"c001n02\" type=\"normal\"/>\n" " <node id=\"zzz\" uname=\"c001n03\" type=\"normal\"/>\n" " </nodes>\n" " <resources>\n" " <primitive id=\"myAddr\" class=\"ocf\" provider=\"heartbeat\" type=\"IPaddr\">\n" " <operations>\n" " <op id=\"myAddr-monitor\" name=\"monitor\" interval=\"300s\"/>\n" " </operations>\n" " <instance_attributes id=\"myAddr-attrs\">\n" " <nvpair id=\"myAddr-attr-1\" name=\"ip\" value=\"192.0.2.10\"/>\n" " </instance_attributes>\n" " </primitive>\n" " <group id=\"myGroup\">\n" " <primitive id=\"database\" class=\"lsb\" type=\"oracle\">\n" " <operations>\n" " <op id=\"database-monitor\" name=\"monitor\" interval=\"300s\"/>\n" " </operations>\n" " </primitive>\n" " <primitive id=\"webserver\" class=\"lsb\" 
type=\"apache\">\n" " <operations>\n" " <op id=\"webserver-monitor\" name=\"monitor\" interval=\"300s\"/>\n" " </operations>\n" " </primitive>\n" " </group>\n" " <clone id=\"STONITH\">\n" " <meta_attributes id=\"stonith-options\">\n" " <nvpair id=\"stonith-option-1\" name=\"globally-unique\" value=\"false\"/>\n" " </meta_attributes>\n" " <primitive id=\"stonithclone\" class=\"stonith\" type=\"external/ssh\">\n" " <operations>\n" " <op id=\"stonith-op-mon\" name=\"monitor\" interval=\"5s\"/>\n" " </operations>\n" " <instance_attributes id=\"stonith-attrs\">\n" " <nvpair id=\"stonith-attr-1\" name=\"hostlist\" value=\"c001n01,c001n02\"/>\n" " </instance_attributes>\n" " </primitive>\n" " </clone>\n" " </resources>\n" " <constraints>\n" " <rsc_location id=\"myAddr-prefer\" rsc=\"myAddr\" node=\"c001n01\"\n" " score=\"INFINITY\"/>\n" " <rsc_colocation id=\"group-with-ip\" rsc=\"myGroup\" with-rsc=\"myAddr\"\n" " score=\"INFINITY\"/>\n" " </constraints>\n" " <op_defaults>\n" " <meta_attributes id=\"op_defaults-options\">\n" " <nvpair id=\"op-default-1\" name=\"timeout\" value=\"30s\"/>\n" " </meta_attributes>\n" " </op_defaults>\n" " <rsc_defaults>\n" " <meta_attributes id=\"rsc_defaults-options\">\n" " <nvpair id=\"rsc-default-1\" name=\"resource-stickiness\" value=\"100\"/>\n" " <nvpair id=\"rsc-default-2\" name=\"migration-threshold\" value=\"10\"/>\n" " </meta_attributes>\n" " </rsc_defaults>\n" " </configuration>\n" " <status/>\n" "</cib>" msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ap-Upgrade.pot b/doc/Pacemaker_Explained/pot/Ap-Upgrade.pot index e9f15efe06..1d5d08a6e9 100644 --- a/doc/Pacemaker_Explained/pot/Ap-Upgrade.pot +++ b/doc/Pacemaker_Explained/pot/Ap-Upgrade.pot @@ -1,329 +1,750 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" +#. Tag: title +#, no-c-format +msgid "Upgrading" +msgstr "" + #. Tag: title #, no-c-format msgid "Upgrading Cluster Software" msgstr "" #. Tag: para #, no-c-format -msgid "There will always be an upgrade path from any pacemaker 1.x release to any other 1.y release." +msgid "There are three approaches to upgrading a cluster, each with advantages and disadvantages." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Upgrade Methods" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Method" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Available between all versions" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Can be used with Pacemaker Remote nodes" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Service outage during upgrade" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Service recovery during upgrade" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Exercises failover logic" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Allows change of messaging layer Clusterswitching between stacks switching between stacks Changing cluster stack For example, switching from Heartbeat to Corosync." msgstr "" #. Tag: para #, no-c-format -msgid "Consult the documentation for your messaging layer (Heartbeat or Corosync) to see whether upgrading them to a newer version is also supported." +msgid "Complete cluster shutdown upgradeshutdown shutdown shutdown upgrade " msgstr "" #. Tag: para #, no-c-format -msgid "There are three approaches to upgrading your cluster software:" +msgid "yes" msgstr "" #. 
Tag: para #, no-c-format -msgid "Complete Cluster Shutdown" +msgid "always" msgstr "" #. Tag: para #, no-c-format -msgid "Rolling (node by node)" +msgid "N/A" msgstr "" #. Tag: para #, no-c-format -msgid "Disconnect and Reattach" +msgid "no" msgstr "" #. Tag: para #, no-c-format -msgid "Each method has advantages and disadvantages, some of which are listed in the table below, and you should choose the one most appropriate to your needs." +msgid "Rolling (node by node) upgraderolling rolling rolling upgrade " msgstr "" -#. Tag: title +#. Tag: para #, no-c-format -msgid "Upgrade Methods" +msgid "always Any active resources will be moved off the node being upgraded, so there will be at least a brief outage unless all resources can be migrated \"live\"." msgstr "" -#. Tag: entry +#. Tag: para #, no-c-format -msgid "Type" +msgid "Detach and reattach upgradereattach reattach reattach upgrade " msgstr "" -#. Tag: entry +#. Tag: para #, no-c-format -msgid "Available between all software versions" +msgid "only due to failure" msgstr "" -#. Tag: entry +#. Tag: title #, no-c-format -msgid "Service Outage During Upgrade" +msgid "Complete Cluster Shutdown" msgstr "" -#. Tag: entry +#. Tag: para #, no-c-format -msgid "Service Recovery During Upgrade" +msgid "In this scenario, one shuts down all cluster nodes and resources, then upgrades all the nodes before restarting the cluster." msgstr "" -#. Tag: entry +#. Tag: para #, no-c-format -msgid "Exercises Failover Logic/Configuration" +msgid "On each node:" msgstr "" -#. Tag: entry +#. Tag: para #, no-c-format -msgid "Allows change of cluster stack type ClusterSwitching between Stacks Switching between Stacks Changing Cluster Stack For example, switching from Heartbeat to Corosync." +msgid "Shutdown the cluster software (pacemaker and the messaging layer)." msgstr "" #. Tag: para #, no-c-format -msgid "Shutdown UpgradeShutdown Shutdown Shutdown Upgrade " +msgid "Upgrade the Pacemaker software. 
This may also include upgrading the messaging layer and/or the underlying operating system." msgstr "" #. Tag: para #, no-c-format -msgid "yes" +msgid "Check the configuration with the crm_verify tool." msgstr "" #. Tag: para #, no-c-format -msgid "always" +msgid "Start the cluster software. The messaging layer can be either Corosync or Heartbeat and does not need to be the same one before the upgrade." msgstr "" #. Tag: para #, no-c-format -msgid "N/A" +msgid "One variation of this approach is to build a new cluster on new hosts. This allows the new version to be tested beforehand, and minimizes downtime by having the new nodes ready to be placed in production as soon as the old nodes are shut down." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Rolling (node by node)" msgstr "" #. Tag: para #, no-c-format -msgid "no" +msgid "In this scenario, each node is removed from the cluster, upgraded, and then brought back online, until all nodes are running the newest version." msgstr "" #. Tag: para #, no-c-format -msgid "Rolling UpgradeRolling Rolling Rolling Upgrade " +msgid "If you plan to upgrade other cluster software — such as the messaging layer — at the same time, consult that software’s documentation for its compatibility with a rolling upgrade." msgstr "" #. Tag: para #, no-c-format -msgid "Reattach UpgradeReattach Reattach Reattach Upgrade " +msgid "Pacemaker has three version numbers that affect rolling upgrades:" msgstr "" #. Tag: para #, no-c-format -msgid "only due to failure" +msgid "Pacemaker release version: Rolling upgrades are possible as long as the major version number (the x in x.y.z) stays the same. For example, a rolling upgrade may be done from 1.0.8 to 1.1.15, but not from 0.6.7 to 1.0.0." msgstr "" #. Tag: para #, no-c-format -msgid "In this scenario, one shuts down all cluster nodes and resources, then upgrades all the nodes before restarting the cluster." 
+msgid "CRM feature set: This version number applies to the communication between full cluster nodes." msgstr "" #. Tag: para #, no-c-format -msgid "On each node:" +msgid "It increases when a cluster node running the older version would have problems if the cluster’s Designated Controller (DC) has the newer version. To avoid these problems, Pacemaker ensures that the longest-running node is the DC, and that nodes with an older feature set cannot join the cluster." msgstr "" #. Tag: para #, no-c-format -msgid "Shutdown the cluster software (pacemaker and the messaging layer)." +msgid "Therefore, if the CRM feature set is changing in the Pacemaker version you are upgrading to, you should run a mixed-version cluster only during a small rolling upgrade window. If one of the older nodes drops out of the cluster for any reason, it will not be able to rejoin until it is upgraded." msgstr "" #. Tag: para #, no-c-format -msgid "Upgrade the Pacemaker software. This may also include upgrading the messaging layer and/or the underlying operating system." +msgid "LRMD protocol version: This version number applies to communication between a Pacemaker Remote node and the cluster. It increases when an older cluster node would have problems hosting the connection to a newer Pacemaker Remote node. To avoid these problems, Pacemaker Remote nodes will accept connections only from cluster nodes with the same or newer LRMD protocol version." msgstr "" #. Tag: para #, no-c-format -msgid "Check the configuration manually or with the crm_verify tool if available." +msgid "For rolling upgrades, this means that all cluster nodes should be upgraded before upgrading any Pacemaker Remote nodes." msgstr "" #. Tag: para #, no-c-format -msgid "Start the cluster software. The messaging layer can be either Corosync or Heartbeat and does not need to be the same one before the upgrade." 
+msgid "Unlike with CRM feature set differences between full cluster nodes, mixed LRMD protocol versions between Pacemaker Remote nodes and full cluster nodes are fine, as long as the Pacemaker Remote nodes have the older version. This can be useful, for example, to host a legacy application in an older operating system version used as a Pacemaker Remote node." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "See the ClusterLabs wiki’s Release Calendar to figure out whether the CRM feature set and/or LRMD protocol version changed between the Pacemaker release versions in your rolling upgrade." msgstr "" #. Tag: para #, no-c-format -msgid "In this scenario, each node is removed from the cluster, upgraded and then brought back online until all nodes are running the newest version." +msgid "The interpretation of the LRMD protocol version changed in Pacemaker 1.1.15. If you are planning a rolling upgrade from an earlier Pacemaker version to Pacemaker 1.1.15 or later involving Pacemaker Remote nodes, you will need to take special precautions to avoid problems. See Upgrading to Pacemaker 1.1.15 or later from an earlier version on the ClusterLabs wiki." msgstr "" #. Tag: para #, no-c-format -msgid "Rolling upgrades should always be possible for pacemaker versions 1.0.0 and later." +msgid "To perform a rolling upgrade, on each node in turn:" msgstr "" #. Tag: para #, no-c-format -msgid "Put the node into standby mode, and wait for any active resources to be moved cleanly to another node." +msgid "Put the node into standby mode, and wait for any active resources to be moved cleanly to another node. (This step is optional, but allows you to deal with any resource issues before the upgrade.)" msgstr "" #. Tag: para #, no-c-format msgid "Shutdown the cluster software (pacemaker and the messaging layer) on the node." msgstr "" #.
Tag: para #, no-c-format -msgid "If this is the first node to be upgraded, check the configuration manually or with the crm_verify tool if available." +msgid "If this is the first node to be upgraded, check the configuration with the crm_verify tool." msgstr "" #. Tag: para #, no-c-format -msgid "Start the messaging layer. This must be the same messaging layer (Corosync or Heartbeat) that the rest of the cluster is using. Upgrading the messaging layer may also be possible; consult the documentation for those projects to see whether the two versions will be compatible." +msgid "Start the messaging layer. This must be the same messaging layer (Corosync or Heartbeat) that the rest of the cluster is using." msgstr "" #. Tag: para #, no-c-format -msgid "Rolling upgrades were not always possible with older heartbeat and pacemaker versions. The table below shows which versions were compatible during rolling upgrades. Rolling upgrades that cross compatibility boundaries must be performed in multiple steps (for example, upgrading heartbeat 2.0.6 to heartbeat 2.1.3, and then upgrading again to pacemaker 0.6.6). Rolling upgrades from pacemaker 0.x to 1.y are not possible." +msgid "Rolling upgrades were not always possible with older heartbeat and pacemaker versions. Rolling upgrades that cross compatibility boundaries listed in the following table must be performed in multiple steps." msgstr "" #. Tag: title #, no-c-format msgid "Version Compatibility Table" msgstr "" #. Tag: entry #, no-c-format msgid "Version being Installed" msgstr "" #. Tag: entry #, no-c-format msgid "Oldest Compatible Version" msgstr "" #. Tag: para #, no-c-format -msgid "Pacemaker 1.0.x" +msgid "Pacemaker 1.x.y" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker 1.0.0" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker 0.7.x" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker 0.6 or Heartbeat 2.1.3" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker 0.6.x" msgstr "" #. 
Tag: para #, no-c-format msgid "Heartbeat 2.0.8" msgstr "" #. Tag: para #, no-c-format msgid "Heartbeat 2.1.3 (or less)" msgstr "" #. Tag: para #, no-c-format msgid "Heartbeat 2.0.4" msgstr "" #. Tag: para #, no-c-format msgid "Heartbeat 2.0.4 (or less)" msgstr "" #. Tag: para #, no-c-format msgid "Heartbeat 2.0.0" msgstr "" #. Tag: para #, no-c-format msgid "None. Use an alternate upgrade strategy." msgstr "" +#. Tag: title +#, no-c-format +msgid "Detach and Reattach" +msgstr "" + #. Tag: para #, no-c-format msgid "The reattach method is a variant of a complete cluster shutdown, where the resources are left active and get re-detected when the cluster is restarted." msgstr "" +#. Tag: para +#, no-c-format +msgid "This method may not be used if the cluster contains any Pacemaker Remote nodes." +msgstr "" + #. Tag: para #, no-c-format msgid "Tell the cluster to stop managing services. This is required to allow the services to remain active after the cluster shuts down." msgstr "" #. Tag: screen #, no-c-format -msgid "# crm_attribute -t crm_config -n is-managed-default -v false" +msgid "# crm_attribute --name maintenance-mode --update true" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "On each node, shutdown the cluster software (pacemaker and the messaging layer), and upgrade the Pacemaker software. This may also include upgrading the messaging layer. While the underlying operating system may be upgraded at the same time, that will be more likely to cause outages in the detached services (certainly, if a reboot is required)." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "On each node, start the cluster software. The messaging layer can be either Corosync or Heartbeat and does not need to be the same one as before the upgrade." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Verify that the cluster re-detected all resources correctly." +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "Allow the cluster to resume managing resources again:" +msgstr "" + +#. Tag: screen +#, no-c-format +msgid "# crm_attribute --name maintenance-mode --delete" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Support for maintenance mode was added in Pacemaker 1.0.0. If you are upgrading from an earlier version, you can detach by setting is-managed to false for all resources." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Upgrading the Configuration" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " upgradeConfiguration Configuration Configurationupgrading upgrading " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Pacemaker’s configuration — the Configuration Information Base (CIB) — has its own XML schema version, independent of the Pacemaker software version." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "After cluster software is upgraded, the cluster will continue to use the older schema version that it was previously using. This can be useful, for example, when administrators have written tools that modify the configuration, and are based on the older syntax." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "However, when using an older syntax, new features may be unavailable, and there is a performance impact, since the cluster must do a non-persistent configuration upgrade before each transition. So while using the old syntax is possible, it is not advisable to continue using it indefinitely." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Even if you wish to continue using the old syntax, it is a good idea to follow the upgrade procedure outlined below, except for the last step, to ensure that the new software has no problems with your existing configuration (since it will perform much the same task internally)." msgstr "" #. 
Tag: para #, no-c-format -msgid "For any resource that has a value for is-managed, make sure it is set to false so that the cluster will not stop it (replacing $rsc_id appropriately):" +msgid "If you are brave, it is sufficient simply to run cibadmin --upgrade." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "A more cautious approach would proceed like this:" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Create a shadow copy of the configuration. The later commands will automatically operate on this copy, rather than the live configuration." msgstr "" #. Tag: screen #, no-c-format -msgid "# crm_resource -t primitive -r $rsc_id -p is-managed -v false" +msgid "# crm_shadow --create shadow" msgstr "" #. Tag: para #, no-c-format -msgid "Start the cluster software. The messaging layer can be either Corosync or Heartbeat and does not need to be the same one as before the upgrade." +msgid "Verify the configuration is valid with the new software (which may be stricter about syntax mistakes, or may have dropped support for deprecated features): Configurationverify verify verifyConfiguration Configuration " +msgstr "" + +#. Tag: screen +#, no-c-format +msgid "# crm_verify --live-check" msgstr "" #. Tag: para #, no-c-format -msgid "Verify that the cluster re-detected all resources correctly." +msgid "Fix any errors or warnings." msgstr "" #. Tag: para #, no-c-format -msgid "Allow the cluster to resume managing resources again:" +msgid "Perform the upgrade:" +msgstr "" + +#. Tag: screen +#, no-c-format +msgid "# cibadmin --upgrade" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "If this step fails, there are three main possibilities:" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The configuration was not valid to start with (did you do steps 2 and 3?)." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The transformation failed - report a bug or email the project." +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "The transformation was successful but produced an invalid result." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "If the result of the transformation is invalid, you may see a number of errors from the validation library. If these are not helpful, visit the Validation FAQ wiki page and/or try the manual upgrade procedure described below." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Check the changes:" +msgstr "" + +#. Tag: screen +#, no-c-format +msgid "# crm_shadow --diff" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "If at this point there is anything about the upgrade that you wish to fine-tune (for example, to change some of the automatic IDs), now is the time to do so:" +msgstr "" + +#. Tag: screen +#, no-c-format +msgid "# crm_shadow --edit" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "This will open the configuration in your favorite editor (whichever is specified by the standard $EDITOR environment variable)." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Preview how the cluster will react:" +msgstr "" + +#. Tag: screen +#, no-c-format +msgid "# crm_simulate --live-check --save-dotfile shadow.dot -S\n" +"# graphviz shadow.dot" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Verify that either no resource actions will occur or that you are happy with any that are scheduled. If the output contains actions you do not expect (possibly due to changes to the score calculations), you may need to make further manual changes. See for further details on how to interpret the output of crm_simulate and graphviz." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Upload the changes:" msgstr "" #. Tag: screen #, no-c-format -msgid "# crm_attribute -t crm_config -n is-managed-default -v true" +msgid "# crm_shadow --commit shadow --force" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "In the unlikely event this step fails, please report a bug." +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid " Configurationupgrade manually upgrade manually It is also possible to perform the configuration upgrade steps manually:" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Locate the upgrade*.xsl conversion scripts provided with the source code. These will often be installed in a location such as /usr/share/pacemaker, or may be obtained from the source repository." msgstr "" #. Tag: para #, no-c-format -msgid "For any resource that has a value for is-managed, reset it to true (so the cluster can recover the service if it fails) if desired:" +msgid "Run the conversion scripts that apply to your older version, for example: XMLconvert convert " msgstr "" #. Tag: screen #, no-c-format -msgid "# crm_resource -t primitive -r $rsc_id -p is-managed -v true" +msgid "# xsltproc /path/to/upgrade06.xsl config06.xml > config10.xml" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Locate the pacemaker.rng script (from the same location as the xsl files)." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Check the XML validity: validate configuration Configurationvalidate XML validate XML " +msgstr "" + +#. Tag: screen +#, no-c-format +msgid "# xmllint --relaxng /path/to/pacemaker.rng config10.xml" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The advantage of this method is that it can be performed without the cluster running, and any validation errors are often more informative." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "What Changed in 1.0" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "New" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Failure timeouts. See " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "New section for resource and operation defaults. See and " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Tool for making offline configuration changes. See " +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "Rules, instance_attributes, meta_attributes and sets of operations can be defined once and referenced in multiple places. See " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The CIB now accepts XPath-based create/modify/delete operations. See the cibadmin help text." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Multi-dimensional colocation and ordering constraints. See and " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The ability to connect to the CIB from non-cluster machines. See " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Allow recurring actions to be triggered at known times. See " +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Changed" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Syntax" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "All resource and cluster options now use dashes (-) instead of underscores (_)" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "master_slave was renamed to master" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The attributes container tag was removed" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The operation field pre-req has been renamed requires" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "All operations must have an interval, start/stop must have it set to zero" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The stonith-enabled option now defaults to true." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The cluster will refuse to start resources if stonith-enabled is true (or unset) and no STONITH resources have been defined" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The attributes of colocation and ordering constraints were renamed for clarity. See and " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "resource-failure-stickiness has been replaced by migration-threshold. See " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The parameters for command-line tools have been made consistent" +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "Switched to RelaxNG schema validation and libxml2 parser" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "id fields are now XML IDs which have the following limitations:" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "id’s cannot contain colons (:)" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "id’s cannot begin with a number" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "id’s must be globally unique (not just unique for that tag)" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Some fields (such as those in constraints that refer to resources) are IDREFs." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "This means that they must reference existing resources or objects in order for the configuration to be valid. Removing an object which is referenced elsewhere will therefore fail." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The CIB representation, from which an MD5 digest is calculated to verify CIBs on the nodes, has changed." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "This means that every CIB update will require a full refresh on any upgraded nodes until the cluster is fully upgraded to 1.0. This will result in significant performance degradation and it is therefore highly inadvisable to run a mixed 1.0/0.6 cluster for any longer than absolutely necessary." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Ping node information no longer needs to be added to ha.cf." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Simply include the lists of hosts in your ping resource(s)." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Removed" msgstr "" #. Tag: para #, no-c-format -msgid "The oldest version of the CRM to support this upgrade type was in Heartbeat 2.0.4." +msgid "It is no longer possible to set resource meta options as top-level attributes. Use meta attributes instead." msgstr "" #.
Tag: para #, no-c-format -msgid "Always check your existing configuration is still compatible with the version you are installing before starting the cluster." +msgid "Resource and operation defaults are no longer read from crm_config. See and instead." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Author_Group.pot b/doc/Pacemaker_Explained/pot/Author_Group.pot index ae3aa135e6..e756452773 100644 --- a/doc/Pacemaker_Explained/pot/Author_Group.pot +++ b/doc/Pacemaker_Explained/pot/Author_Group.pot @@ -1,139 +1,139 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: firstname #, no-c-format msgid "Andrew" msgstr "" #. Tag: surname #, no-c-format msgid "Beekhof" msgstr "" #. Tag: orgname #, no-c-format msgid "Red Hat" msgstr "" #. Tag: contrib #, no-c-format msgid "Primary author" msgstr "" #. Tag: firstname #, no-c-format msgid "Dan" msgstr "" #. Tag: surname #, no-c-format msgid "Frîncu" msgstr "" #. Tag: contrib #, no-c-format msgid "Romanian translation" msgstr "" #. Tag: firstname #, no-c-format msgid "Philipp" msgstr "" #. Tag: surname #, no-c-format msgid "Marek" msgstr "" #. Tag: orgname #, no-c-format msgid "LINBit" msgstr "" #. Tag: contrib #, no-c-format msgid "Style and formatting updates. Indexing." msgstr "" #. Tag: firstname #, no-c-format msgid "Tanja" msgstr "" #. Tag: surname #, no-c-format msgid "Roth" msgstr "" #. Tag: orgname #, no-c-format msgid "SUSE" msgstr "" #. Tag: contrib #, no-c-format msgid "Utilization chapter" msgstr "" #. Tag: contrib #, no-c-format msgid "Resource Templates chapter" msgstr "" #. 
Tag: contrib #, no-c-format msgid "Multi-Site Clusters chapter" msgstr "" #. Tag: firstname #, no-c-format msgid "Lars" msgstr "" #. Tag: surname #, no-c-format msgid "Marowsky-Bree" msgstr "" #. Tag: firstname #, no-c-format msgid "Yan" msgstr "" #. Tag: surname #, no-c-format msgid "Gao" msgstr "" #. Tag: firstname #, no-c-format msgid "Thomas" msgstr "" #. Tag: surname #, no-c-format msgid "Schraitle" msgstr "" #. Tag: firstname #, no-c-format msgid "Dejan" msgstr "" #. Tag: surname #, no-c-format msgid "Muhamedagic" msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Book_Info.pot b/doc/Pacemaker_Explained/pot/Book_Info.pot index 3ad9435c46..e65deae975 100644 --- a/doc/Pacemaker_Explained/pot/Book_Info.pot +++ b/doc/Pacemaker_Explained/pot/Book_Info.pot @@ -1,34 +1,34 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Configuration Explained" msgstr "" #. Tag: subtitle #, no-c-format msgid "An A-Z guide to Pacemaker's Configuration Options" msgstr "" #. Tag: productname #, no-c-format msgid "Pacemaker" msgstr "" #. Tag: para #, no-c-format msgid "The purpose of this document is to definitively explain the concepts used to configure Pacemaker. To achieve this, it will focus exclusively on the XML syntax used to configure Pacemaker's Cluster Information Base (CIB)." 
msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Advanced-Options.pot b/doc/Pacemaker_Explained/pot/Ch-Advanced-Options.pot index 5975f6b785..9e5bcd9de7 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Advanced-Options.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Advanced-Options.pot @@ -1,948 +1,1123 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Advanced Configuration" msgstr "" #. Tag: title #, no-c-format msgid "Connecting from a Remote Machine" msgstr "" #. Tag: para #, no-c-format msgid " ClusterRemote connection Remote connection ClusterRemote administration Remote administration " msgstr "" #. Tag: para #, no-c-format msgid "Provided Pacemaker is installed on a machine, it is possible to connect to the cluster even if the machine itself is not in the same cluster. To do this, one simply sets up a number of environment variables and runs the same commands as when working on a cluster node." msgstr "" #. Tag: title #, no-c-format msgid "Environment Variables Used to Connect to Remote Instances of the CIB" msgstr "" #. Tag: entry #, no-c-format msgid "Environment Variable" msgstr "" #. Tag: entry #, no-c-format msgid "Default" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "CIB_user" msgstr "" #. Tag: para #, no-c-format msgid "$USER" msgstr "" #. Tag: para #, no-c-format msgid "The user to connect as. Needs to be part of the hacluster group on the target host. Environment VariableCIB_user CIB_user " msgstr "" #. Tag: para #, no-c-format msgid "CIB_passwd" msgstr "" #. 
Tag: para #, no-c-format msgid "The user’s password. Read from the command line if unset. Environment VariableCIB_passwd CIB_passwd " msgstr "" #. Tag: para #, no-c-format msgid "CIB_server" msgstr "" #. Tag: para #, no-c-format msgid "localhost" msgstr "" #. Tag: para #, no-c-format msgid "The host to contact Environment VariableCIB_server CIB_server " msgstr "" #. Tag: para #, no-c-format msgid "CIB_port" msgstr "" #. Tag: para #, no-c-format msgid "The port on which to contact the server; required. Environment VariableCIB_port CIB_port " msgstr "" #. Tag: para #, no-c-format msgid "CIB_encrypted" msgstr "" #. Tag: para #, no-c-format msgid "TRUE" msgstr "" #. Tag: para #, no-c-format msgid "Whether to encrypt network traffic Environment VariableCIB_encrypted CIB_encrypted " msgstr "" #. Tag: para #, no-c-format msgid "So, if c001n01 is an active cluster node and is listening on port 1234 for connections, and someuser is a member of the hacluster group, then the following would prompt for someuser's password and return the cluster’s current configuration:" msgstr "" #. Tag: screen #, no-c-format msgid "# export CIB_port=1234; export CIB_server=c001n01; export CIB_user=someuser;\n" "# cibadmin -Q" msgstr "" #. Tag: para #, no-c-format msgid "For security reasons, the cluster does not listen for remote connections by default. If you wish to allow remote access, you need to set the remote-tls-port (encrypted) or remote-clear-port (unencrypted) CIB properties (i.e., those kept in the cib tag, like num_updates and epoch)." msgstr "" #. Tag: title #, no-c-format msgid "Extra top-level CIB properties for remote access" msgstr "" #. Tag: entry #, no-c-format msgid "Field" msgstr "" #. Tag: para #, no-c-format msgid "remote-tls-port" msgstr "" #. Tag: para #, no-c-format msgid "none" msgstr "" #. Tag: para #, no-c-format msgid "Listen for encrypted remote connections on this port. 
remote-tls-portRemote Connection Option Remote Connection Option Remote ConnectionOptionremote-tls-port Optionremote-tls-port remote-tls-port " msgstr "" #. Tag: para #, no-c-format msgid "remote-clear-port" msgstr "" #. Tag: para #, no-c-format msgid "Listen for plaintext remote connections on this port. remote-clear-portRemote Connection Option Remote Connection Option Remote ConnectionOptionremote-clear-port Optionremote-clear-port remote-clear-port " msgstr "" #. Tag: title #, no-c-format msgid "Specifying When Recurring Actions are Performed" msgstr "" #. Tag: para #, no-c-format msgid "By default, recurring actions are scheduled relative to when the resource started. So if your resource was last started at 14:32 and you have a backup set to be performed every 24 hours, then the backup will always run in the middle of the business day — hardly desirable." msgstr "" #. Tag: para #, no-c-format msgid "To specify a date and time that the operation should be relative to, set the operation’s interval-origin. The cluster uses this point to calculate the correct start-delay such that the operation will occur at origin + (interval * N)." msgstr "" #. Tag: para #, no-c-format msgid "So, if the operation’s interval is 24h, its interval-origin is set to 02:00 and it is currently 14:32, then the cluster would initiate the operation with a start delay of 11 hours and 28 minutes. If the resource is moved to another node before 2am, then the operation is cancelled." msgstr "" #. Tag: para #, no-c-format msgid "The value specified for interval and interval-origin can be any date/time conforming to the ISO8601 standard. By way of example, to specify an operation that would run on the first Monday of 2009 and every Monday after that, you would add:" msgstr "" #. Tag: title #, no-c-format msgid "Specifying a Base for Recurring Action Intervals" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<op id=\"my-weekly-action\" name=\"custom-action\" interval=\"P7D\" interval-origin=\"2009-W01-1\"/>" msgstr "" #. Tag: title #, no-c-format msgid "Moving Resources" msgstr "" #. Tag: para #, no-c-format msgid " MovingResources Resources ResourceMoving Moving " msgstr "" #. Tag: title #, no-c-format msgid "Moving Resources Manually" msgstr "" #. Tag: para #, no-c-format msgid "There are primarily two occasions when you would want to move a resource from its current location: when the whole node is under maintenance, and when a single resource needs to be moved." msgstr "" #. Tag: title #, no-c-format msgid "Standby Mode" msgstr "" #. Tag: para #, no-c-format msgid "Since everything eventually comes down to a score, you could create constraints for every resource to prevent them from running on one node. While pacemaker configuration can seem convoluted at times, not even we would require this of administrators." msgstr "" #. Tag: para #, no-c-format msgid "Instead, one can set a special node attribute which tells the cluster \"don’t let anything run here\". There is even a helpful tool to help query and set it, called crm_standby. To check the standby status of the current machine, run:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_standby -G" msgstr "" #. Tag: para #, no-c-format msgid "A value of on indicates that the node is not able to host any resources, while a value of off says that it can." msgstr "" #. Tag: para #, no-c-format msgid "You can also check the status of other nodes in the cluster by specifying the --node option:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_standby -G --node sles-2" msgstr "" #. Tag: para #, no-c-format msgid "To change the current node’s standby status, use -v instead of -G:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_standby -v on" msgstr "" #. Tag: para #, no-c-format msgid "Again, you can change another host’s value by supplying a hostname with --node." 
msgstr "" #. Tag: title #, no-c-format msgid "Moving One Resource" msgstr "" #. Tag: para #, no-c-format msgid "When only one resource is required to move, we could do this by creating location constraints. However, once again we provide a user-friendly shortcut as part of the crm_resource command, which creates and modifies the extra constraints for you. If Email were running on sles-1 and you wanted it moved to a specific location, the command would look something like:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource -M -r Email -H sles-2" msgstr "" #. Tag: para #, no-c-format msgid "Behind the scenes, the tool will create the following location constraint:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_location rsc=\"Email\" node=\"sles-2\" score=\"INFINITY\"/>" msgstr "" #. Tag: para #, no-c-format msgid "It is important to note that subsequent invocations of crm_resource -M are not cumulative. So, if you ran these commands" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource -M -r Email -H sles-2\n" "# crm_resource -M -r Email -H sles-3" msgstr "" #. Tag: para #, no-c-format msgid "then it is as if you had never performed the first command." msgstr "" #. Tag: para #, no-c-format msgid "To allow the resource to move back again, use:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource -U -r Email" msgstr "" #. Tag: para #, no-c-format msgid "Note the use of the word allow. The resource can move back to its original location but, depending on resource-stickiness, it might stay where it is. To be absolutely certain that it moves back to sles-1, move it there before issuing the call to crm_resource -U:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource -M -r Email -H sles-1\n" "# crm_resource -U -r Email" msgstr "" #. Tag: para #, no-c-format msgid "Alternatively, if you only care that the resource should be moved from its current location, try:" msgstr "" #. 
Tag: screen #, no-c-format msgid "# crm_resource -B -r Email" msgstr "" #. Tag: para #, no-c-format msgid "Which will instead create a negative constraint, like" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_location rsc=\"Email\" node=\"sles-1\" score=\"-INFINITY\"/>" msgstr "" #. Tag: para #, no-c-format msgid "This will achieve the desired effect, but will also have long-term consequences. As the tool will warn you, the creation of a -INFINITY constraint will prevent the resource from running on that node until crm_resource -U is used. This includes the situation where every other cluster node is no longer available!" msgstr "" #. Tag: para #, no-c-format msgid "In some cases, such as when resource-stickiness is set to INFINITY, it is possible that you will end up with the problem described in . The tool can detect some of these cases and deals with them by creating both positive and negative constraints. E.g." msgstr "" #. Tag: para #, no-c-format msgid "Email prefers sles-1 with a score of -INFINITY" msgstr "" #. Tag: para #, no-c-format msgid "Email prefers sles-2 with a score of INFINITY" msgstr "" #. Tag: para #, no-c-format msgid "which has the same long-term consequences as discussed earlier." msgstr "" #. Tag: title #, no-c-format msgid "Moving Resources Due to Failure" msgstr "" #. Tag: para #, no-c-format msgid "Normally, if a running resource fails, pacemaker will try to start it again on the same node. However if a resource fails repeatedly, it is possible that there is an underlying problem on that node, and you might desire trying a different node in such a case." msgstr "" #. Tag: para #, no-c-format msgid " migration-threshold failure-timeout start-failure-is-fatal " msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker allows you to set your preference via the migration-threshold resource option. 
The naming of this option was perhaps unfortunate as it is easily confused with live migration, the process of moving a resource from one node to another without stopping it. Xen virtual guests are the most common example of resources that can be migrated in this manner. " msgstr "" #. Tag: para #, no-c-format msgid "Simply define migration-threshold=N for a resource and it will migrate to a new node after N failures. There is no threshold defined by default. To determine the resource’s current failure status and limits, run crm_mon --failcounts." msgstr "" #. Tag: para #, no-c-format msgid "By default, once the threshold has been reached, the troublesome node will no longer be allowed to run the failed resource until the administrator manually resets the resource’s failcount using crm_failcount (after hopefully first fixing the failure’s cause). Alternatively, it is possible to expire them by setting the failure-timeout option for the resource." msgstr "" #. Tag: para #, no-c-format msgid "For example, a setting of migration-threshold=2 and failure-timeout=60s would cause the resource to move to a new node after 2 failures, and allow it to move back (depending on stickiness and constraint scores) after one minute." msgstr "" #. Tag: para #, no-c-format msgid "There are two exceptions to the migration threshold concept: when a resource either fails to start or fails to stop." msgstr "" #. Tag: para #, no-c-format msgid "If the cluster property start-failure-is-fatal is set to true (which is the default), start failures cause the failcount to be set to INFINITY and thus always cause the resource to move immediately." msgstr "" #. Tag: para #, no-c-format msgid "Stop failures are slightly different and crucial. If a resource fails to stop and STONITH is enabled, then the cluster will fence the node in order to be able to start the resource elsewhere. 
If STONITH is not enabled, then the cluster has no way to continue and will not try to start the resource elsewhere, but will try to stop it again after the failure timeout." msgstr "" #. Tag: para #, no-c-format msgid "Please read to understand how timeouts work before configuring a failure-timeout." msgstr "" #. Tag: title #, no-c-format msgid "Moving Resources Due to Connectivity Changes" msgstr "" #. Tag: para #, no-c-format msgid "You can configure the cluster to move resources when external connectivity is lost in two steps." msgstr "" #. Tag: title #, no-c-format msgid "Tell Pacemaker to Monitor Connectivity" msgstr "" #. Tag: para #, no-c-format msgid "First, add an ocf:pacemaker:ping resource to the cluster. The ping resource uses the system utility of the same name to test whether a list of machines (specified by DNS hostname or IPv4/IPv6 address) are reachable and uses the results to maintain a node attribute called pingd by default. The attribute name is customizable, in order to allow multiple ping groups to be defined. " msgstr "" #. Tag: para #, no-c-format msgid "Older versions of Heartbeat required users to add ping nodes to ha.cf, but this is no longer required." msgstr "" #. Tag: para #, no-c-format msgid "Older versions of Pacemaker used a different agent ocf:pacemaker:pingd which is now deprecated in favor of ping. If your version of Pacemaker does not contain the ping resource agent, download the latest version from https://github.com/ClusterLabs/pacemaker/tree/master/extra/resources/ping" msgstr "" #. Tag: para #, no-c-format msgid "Normally, the ping resource should run on all cluster nodes, which means that you’ll need to create a clone. A template for this can be found below along with a description of the most interesting parameters." msgstr "" #. Tag: title #, no-c-format msgid "Common Options for a ping Resource" msgstr "" #. Tag: para #, no-c-format msgid "dampen" msgstr "" #.
Tag: para #, no-c-format msgid "The time to wait (dampening) for further changes to occur. Use this to prevent a resource from bouncing around the cluster when cluster nodes notice the loss of connectivity at slightly different times. dampenPing Resource Option Ping Resource Option Ping ResourceOptiondampen Optiondampen dampen " msgstr "" #. Tag: para #, no-c-format msgid "multiplier" msgstr "" #. Tag: para #, no-c-format msgid "The number of connected ping nodes gets multiplied by this value to get a score. Useful when there are multiple ping nodes configured. multiplierPing Resource Option Ping Resource Option Ping ResourceOptionmultiplier Optionmultiplier multiplier " msgstr "" #. Tag: para #, no-c-format msgid "host_list" msgstr "" #. Tag: para #, no-c-format msgid "The machines to contact in order to determine the current connectivity status. Allowed values include resolvable DNS host names, IPv4 and IPv6 addresses. host_listPing Resource Option Ping Resource Option Ping ResourceOptionhost_list Optionhost_list host_list " msgstr "" #. Tag: title #, no-c-format msgid "An example ping cluster resource that checks node connectivity once every minute" msgstr "" #. Tag: programlisting #, no-c-format msgid "<clone id=\"Connected\">\n" " <primitive id=\"ping\" provider=\"pacemaker\" class=\"ocf\" type=\"ping\">\n" " <instance_attributes id=\"ping-attrs\">\n" " <nvpair id=\"pingd-dampen\" name=\"dampen\" value=\"5s\"/>\n" " <nvpair id=\"pingd-multiplier\" name=\"multiplier\" value=\"1000\"/>\n" " <nvpair id=\"pingd-hosts\" name=\"host_list\" value=\"my.gateway.com www.bigcorp.com\"/>\n" " </instance_attributes>\n" " <operations>\n" " <op id=\"ping-monitor-60s\" interval=\"60s\" name=\"monitor\"/>\n" " </operations>\n" " </primitive>\n" "</clone>" msgstr "" #. Tag: para #, no-c-format msgid "You’re only half done. The next section deals with telling Pacemaker how to deal with the connectivity status that ocf:pacemaker:ping is recording." msgstr "" #. 
Tag: title #, no-c-format msgid "Tell Pacemaker How to Interpret the Connectivity Data" msgstr "" #. Tag: para #, no-c-format msgid "Before attempting the following, make sure you understand ." msgstr "" #. Tag: para #, no-c-format msgid "There are a number of ways to use the connectivity data." msgstr "" #. Tag: para #, no-c-format msgid "The most common setup is for people to have a single ping target (e.g. the service network’s default gateway), to prevent the cluster from running a resource on any unconnected node." msgstr "" #. Tag: title #, no-c-format msgid "Don’t run a resource on unconnected nodes" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_location id=\"WebServer-no-connectivity\" rsc=\"Webserver\">\n" " <rule id=\"ping-exclude-rule\" score=\"-INFINITY\" >\n" " <expression id=\"ping-exclude\" attribute=\"pingd\" operation=\"not_defined\"/>\n" " </rule>\n" "</rsc_location>" msgstr "" #. Tag: para #, no-c-format msgid "A more complex setup is to have a number of ping targets configured. You can require the cluster to only run resources on nodes that can connect to all (or a minimum subset) of them." msgstr "" #. Tag: title #, no-c-format msgid "Run only on nodes connected to three or more ping targets." msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive id=\"ping\" provider=\"pacemaker\" class=\"ocf\" type=\"ping\">\n" "... <!-- omitting some configuration to highlight important parts -->\n" " <nvpair id=\"pingd-multiplier\" name=\"multiplier\" value=\"1000\"/>\n" "...\n" "</primitive>\n" "...\n" "<rsc_location id=\"WebServer-connectivity\" rsc=\"Webserver\">\n" " <rule id=\"ping-prefer-rule\" score=\"-INFINITY\" >\n" " <expression id=\"ping-prefer\" attribute=\"pingd\" operation=\"lt\" value=\"3000\"/>\n" " </rule>\n" "</rsc_location>" msgstr "" #. Tag: para #, no-c-format msgid "Alternatively, you can tell the cluster only to prefer nodes with the best connectivity. 
Just be sure to set multiplier to a value higher than that of resource-stickiness (and don’t set either of them to INFINITY)." msgstr "" #. Tag: title #, no-c-format msgid "Prefer the node with the most connected ping nodes" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_location id=\"WebServer-connectivity\" rsc=\"Webserver\">\n" " <rule id=\"ping-prefer-rule\" score-attribute=\"pingd\" >\n" " <expression id=\"ping-prefer\" attribute=\"pingd\" operation=\"defined\"/>\n" " </rule>\n" "</rsc_location>" msgstr "" #. Tag: para #, no-c-format msgid "It is perhaps easier to think of this in terms of the simple constraints that the cluster translates it into. For example, if sles-1 is connected to all five ping nodes but sles-2 is only connected to two, then it would be as if you instead had the following constraints in your configuration:" msgstr "" #. Tag: title #, no-c-format msgid "How the cluster translates the above location constraint" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_location id=\"ping-1\" rsc=\"Webserver\" node=\"sles-1\" score=\"5000\"/>\n" "<rsc_location id=\"ping-2\" rsc=\"Webserver\" node=\"sles-2\" score=\"2000\"/>" msgstr "" #. Tag: para #, no-c-format msgid "The advantage is that you don’t have to manually update any constraints whenever your network connectivity changes." msgstr "" #. Tag: para #, no-c-format msgid "You can also combine the concepts above into something even more complex. The example below shows how you can prefer the node with the most connected ping nodes provided they have connectivity to at least three (again assuming that multiplier is set to 1000)." msgstr "" #. Tag: title #, no-c-format msgid "A more complex example of choosing a location based on connectivity" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<rsc_location id=\"WebServer-connectivity\" rsc=\"Webserver\">\n" " <rule id=\"ping-exclude-rule\" score=\"-INFINITY\" >\n" " <expression id=\"ping-exclude\" attribute=\"pingd\" operation=\"lt\" value=\"3000\"/>\n" " </rule>\n" " <rule id=\"ping-prefer-rule\" score-attribute=\"pingd\" >\n" " <expression id=\"ping-prefer\" attribute=\"pingd\" operation=\"defined\"/>\n" " </rule>\n" "</rsc_location>" msgstr "" #. Tag: title #, no-c-format msgid "Migrating Resources" msgstr "" #. Tag: para #, no-c-format msgid "Normally, when the cluster needs to move a resource, it fully restarts the resource (i.e. stops the resource on the current node and starts it on the new node)." msgstr "" #. Tag: para #, no-c-format msgid "However, some types of resources, such as Xen virtual guests, are able to move to another location without loss of state (often referred to as live migration or hot migration). In pacemaker, this is called resource migration. Pacemaker can be configured to migrate a resource when moving it, rather than restarting it." msgstr "" #. Tag: para #, no-c-format msgid "Not all resources are able to migrate; see the Migration Checklist below, and those that can, won’t do so in all situations. Conceptually, there are two requirements from which the other prerequisites follow:" msgstr "" #. Tag: para #, no-c-format msgid "The resource must be active and healthy at the old location; and" msgstr "" #. Tag: para #, no-c-format msgid "everything required for the resource to run must be available on both the old and new locations." msgstr "" #. Tag: para #, no-c-format msgid "The cluster is able to accommodate both push and pull migration models by requiring the resource agent to support two special actions: migrate_to (performed on the current location) and migrate_from (performed on the destination)." msgstr "" #. 
Tag: para #, no-c-format msgid "In push migration, the process on the current location transfers the resource to the new location where it is later activated. In this scenario, most of the work would be done in the migrate_to action and, if anything, the activation would occur during migrate_from." msgstr "" #. Tag: para #, no-c-format msgid "Conversely for pull, the migrate_to action is practically empty and migrate_from does most of the work, extracting the relevant resource state from the old location and activating it." msgstr "" #. Tag: para #, no-c-format msgid "There is no wrong or right way for a resource agent to implement migration, as long as it works." msgstr "" #. Tag: title #, no-c-format msgid "Migration Checklist" msgstr "" #. Tag: para #, no-c-format msgid "The resource may not be a clone." msgstr "" #. Tag: para #, no-c-format msgid "The resource must use an OCF style agent." msgstr "" #. Tag: para #, no-c-format msgid "The resource must not be in a failed or degraded state." msgstr "" #. Tag: para #, no-c-format msgid "The resource agent must support migrate_to and migrate_from actions, and advertise them in its metadata." msgstr "" #. Tag: para #, no-c-format msgid "The resource must have the allow-migrate meta-attribute set to true (which is not the default)." msgstr "" #. Tag: para #, no-c-format msgid "If an otherwise migratable resource depends on another resource via an ordering constraint, there are special situations in which it will be restarted rather than migrated." msgstr "" #. Tag: para #, no-c-format msgid "For example, if the resource depends on a clone, and at the time the resource needs to be moved, the clone has instances that are stopping and instances that are starting, then the resource will be restarted. The Policy Engine is not yet able to model this situation correctly and so takes the safer (if less optimal) path." msgstr "" #.
Tag: para #, no-c-format msgid "In pacemaker 1.1.11 and earlier, a migratable resource will be restarted when moving if it directly or indirectly depends on any primitive or group resources." msgstr "" #. Tag: para #, no-c-format msgid "Even in newer versions, if a migratable resource depends on a non-migratable resource, and both need to be moved, the migratable resource will be restarted." msgstr "" +#. Tag: title +#, no-c-format +msgid "Tracking Node Health" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "A node may be functioning adequately as far as cluster membership is concerned, and yet be \"unhealthy\" in some respect that makes it an undesirable location for resources. For example, a disk drive may be reporting SMART errors, or the CPU may be highly loaded." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Pacemaker offers a way to automatically move resources off unhealthy nodes." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Node Health Attributes" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Pacemaker will treat any node attribute whose name starts with #health as an indicator of node health. Node health attributes may have one of the following values:" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Allowed Values for Node Health Attributes" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Value" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Intended significance" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "red" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "This indicator is unhealthy Node healthred red " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "yellow" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "This indicator is becoming unhealthy Node healthyellow yellow " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "green" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "This indicator is healthy Node healthgreen green " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "integer" +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "A numeric score to apply to all resources on this node (0 or positive is healthy, negative is unhealthy) Node healthscore score " +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Node Health Strategy" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Pacemaker assigns a node health score to each node, as the sum of the values of all its node health attributes. This score will be used as a location constraint applied to this node for all resources." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The node-health-strategy cluster option controls how Pacemaker responds to changes in node health attributes, and how it translates red, yellow, and green to scores." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Allowed values are:" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Node Health Strategies" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Effect" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "none" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Do not track node health attributes at all. Node healthnone none " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "migrate-on-red" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Assign the value of -INFINITY to red, and 0 to yellow and green. This will cause all resources to move off the node if any attribute is red. Node healthmigrate-on-red migrate-on-red " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "only-green" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Assign the value of -INFINITY to red and yellow, and 0 to green. This will cause all resources to move off the node if any attribute is red or yellow. Node healthonly-green only-green " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "progressive" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Assign the value of the node-health-red cluster option to red, the value of node-health-yellow to yellow, and the value of node-health-green to green. 
Each node is additionally assigned a score of node-health-base (this allows resources to start even if some attributes are yellow). This strategy gives the administrator finer control over how important each value is. Node healthprogressive progressive " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "custom" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Track node health attributes using the same values as progressive for red, yellow, and green, but do not take them into account. The administrator is expected to implement a policy by defining rules (see ) referencing node health attributes. Node healthcustom custom " +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Measuring Node Health" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Since Pacemaker calculates node health based on node attributes, any method that sets node attributes may be used to measure node health. The most common ways are resource agents or separate daemons." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Pacemaker provides examples that can be used directly or as a basis for custom code. The ocf:pacemaker:HealthCPU and ocf:pacemaker:HealthSMART resource agents set node health attributes based on CPU and disk parameters. The ipmiservicelogd daemon sets node health attributes based on IPMI values (the ocf:pacemaker:SystemHealth resource agent can be used to manage the daemon as a cluster resource)." +msgstr "" + #. Tag: title #, no-c-format msgid "Reusing Rules, Options and Sets of Operations" msgstr "" #. Tag: para #, no-c-format msgid "Sometimes a number of constraints need to use the same set of rules, and resources need to set the same options and parameters. To simplify this situation, you can refer to an existing object using an id-ref instead of an id." msgstr "" #. Tag: para #, no-c-format msgid "So if for one resource you have" msgstr "" #. 
Tag: para #, no-c-format msgid "Then instead of duplicating the rule for all your other resources, you can instead specify:" msgstr "" #. Tag: title #, no-c-format msgid "Referencing rules from other constraints" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_location id=\"WebDB-connectivity\" rsc=\"WebDB\">\n" " <rule id-ref=\"ping-prefer-rule\"/>\n" "</rsc_location>" msgstr "" #. Tag: para #, no-c-format msgid "The cluster will insist that the rule exists somewhere. Attempting to add a reference to a non-existing rule will cause a validation failure, as will attempting to remove a rule that is referenced elsewhere." msgstr "" #. Tag: para #, no-c-format msgid "The same principle applies for meta_attributes and instance_attributes as illustrated in the example below:" msgstr "" #. Tag: title #, no-c-format msgid "Referencing attributes, options, and operations from other resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive id=\"mySpecialRsc\" class=\"ocf\" type=\"Special\" provider=\"me\">\n" " <instance_attributes id=\"mySpecialRsc-attrs\" score=\"1\" >\n" " <nvpair id=\"default-interface\" name=\"interface\" value=\"eth0\"/>\n" " <nvpair id=\"default-port\" name=\"port\" value=\"9999\"/>\n" " </instance_attributes>\n" " <meta_attributes id=\"mySpecialRsc-options\">\n" " <nvpair id=\"failure-timeout\" name=\"failure-timeout\" value=\"5m\"/>\n" " <nvpair id=\"migration-threshold\" name=\"migration-threshold\" value=\"1\"/>\n" " <nvpair id=\"stickiness\" name=\"resource-stickiness\" value=\"0\"/>\n" " </meta_attributes>\n" " <operations id=\"health-checks\">\n" " <op id=\"health-check\" name=\"monitor\" interval=\"60s\"/>\n" " <op id=\"health-check\" name=\"monitor\" interval=\"30min\"/>\n" " </operations>\n" "</primitive>\n" "<primitive id=\"myOtherlRsc\" class=\"ocf\" type=\"Other\" provider=\"me\">\n" " <instance_attributes id-ref=\"mySpecialRsc-attrs\"/>\n" " <meta_attributes id-ref=\"mySpecialRsc-options\"/>\n" " 
<operations id-ref=\"health-checks\"/>\n" "</primitive>" msgstr "" #. Tag: title #, no-c-format msgid "Reloading Services After a Definition Change" msgstr "" #. Tag: para #, no-c-format msgid "The cluster automatically detects changes to the definition of services it manages. The normal response is to stop the service (using the old definition) and start it again (with the new definition). This works well, but some services are smarter and can be told to use a new set of options without restarting." msgstr "" #. Tag: para #, no-c-format msgid "To take advantage of this capability, the resource agent must:" msgstr "" #. Tag: para #, no-c-format msgid "Accept the reload operation and perform any required actions. The actions here depend completely on your application!" msgstr "" #. Tag: title #, no-c-format msgid "The DRBD agent’s logic for supporting reload" msgstr "" #. Tag: programlisting #, no-c-format msgid "case $1 in\n" " start)\n" " drbd_start\n" " ;;\n" " stop)\n" " drbd_stop\n" " ;;\n" " reload)\n" " drbd_reload\n" " ;;\n" " monitor)\n" " drbd_monitor\n" " ;;\n" " *)\n" " drbd_usage\n" " exit $OCF_ERR_UNIMPLEMENTED\n" " ;;\n" "esac\n" "exit $?" msgstr "" #. Tag: para #, no-c-format msgid "Advertise the reload operation in the actions section of its metadata" msgstr "" #. Tag: title #, no-c-format msgid "The DRBD Agent Advertising Support for the reload Operation" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<?xml version=\"1.0\"?>\n" " <!DOCTYPE resource-agent SYSTEM \"ra-api-1.dtd\">\n" " <resource-agent name=\"drbd\">\n" " <version>1.1</version>\n" "\n" " <longdesc>\n" " Master/Slave OCF Resource Agent for DRBD\n" " </longdesc>\n" "\n" " ...\n" "\n" " <actions>\n" " <action name=\"start\" timeout=\"240\" />\n" " <action name=\"reload\" timeout=\"240\" />\n" " <action name=\"promote\" timeout=\"90\" />\n" " <action name=\"demote\" timeout=\"90\" />\n" " <action name=\"notify\" timeout=\"90\" />\n" " <action name=\"stop\" timeout=\"100\" />\n" " <action name=\"meta-data\" timeout=\"5\" />\n" " <action name=\"validate-all\" timeout=\"30\" />\n" " </actions>\n" " </resource-agent>" msgstr "" #. Tag: para #, no-c-format msgid "Advertise one or more parameters that can take effect using reload." msgstr "" #. Tag: para #, no-c-format msgid "Any parameter with the unique set to 0 is eligible to be used in this way." msgstr "" #. Tag: title #, no-c-format msgid "Parameter that can be changed using reload" msgstr "" #. Tag: programlisting #, no-c-format msgid "<parameter name=\"drbdconf\" unique=\"0\">\n" " <longdesc>Full path to the drbd.conf file.</longdesc>\n" " <shortdesc>Path to drbd.conf</shortdesc>\n" " <content type=\"string\" default=\"${OCF_RESKEY_drbdconf_default}\"/>\n" "</parameter>" msgstr "" #. Tag: para #, no-c-format msgid "Once these requirements are satisfied, the cluster will automatically know to reload the resource (instead of restarting) when a non-unique field changes." msgstr "" #. Tag: para #, no-c-format msgid "Metadata will not be re-read unless the resource needs to be started. This may mean that the resource will be restarted the first time, even though you changed a parameter with unique=0." msgstr "" #. Tag: para #, no-c-format msgid "If both a unique and non-unique field are changed simultaneously, the resource will still be restarted." 
msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Advanced-Resources.pot b/doc/Pacemaker_Explained/pot/Ch-Advanced-Resources.pot index 5af5a9f04d..80bd4401a1 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Advanced-Resources.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Advanced-Resources.pot @@ -1,1425 +1,1405 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Advanced Resource Types" msgstr "" #. Tag: title #, no-c-format msgid "Groups - A Syntactic Shortcut" msgstr "" #. Tag: para #, no-c-format msgid " Group Resources ResourcesGroups Groups " msgstr "" #. Tag: para #, no-c-format msgid "One of the most common elements of a cluster is a set of resources that need to be located together, start sequentially, and stop in the reverse order. To simplify this configuration, we support the concept of groups." msgstr "" #. Tag: title #, no-c-format msgid "A group of two primitive resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<group id=\"shortcut\">\n" " <primitive id=\"Public-IP\" class=\"ocf\" type=\"IPaddr\" provider=\"heartbeat\">\n" " <instance_attributes id=\"params-public-ip\">\n" " <nvpair id=\"public-ip-addr\" name=\"ip\" value=\"192.0.2.2\"/>\n" " </instance_attributes>\n" " </primitive>\n" " <primitive id=\"Email\" class=\"lsb\" type=\"exim\"/>\n" "</group>" msgstr "" #. Tag: para #, no-c-format msgid "Although the example above contains only two resources, there is no limit to the number of resources a group can contain. The example is also sufficient to explain the fundamental properties of a group:" msgstr "" #. 
Tag: para #, no-c-format msgid "Resources are started in the order they appear in (Public-IP first, then Email)" msgstr "" #. Tag: para #, no-c-format msgid "Resources are stopped in the reverse of the order in which they appear (Email first, then Public-IP)" msgstr "" #. Tag: para #, no-c-format msgid "If a resource in the group can’t run anywhere, then nothing after that is allowed to run, either." msgstr "" #. Tag: para #, no-c-format msgid "If Public-IP can’t run anywhere, neither can Email;" msgstr "" #. Tag: para #, no-c-format msgid "but if Email can’t run anywhere, this does not affect Public-IP in any way" msgstr "" #. Tag: para #, no-c-format msgid "The group above is logically equivalent to writing:" msgstr "" #. Tag: title #, no-c-format msgid "How the cluster sees a group resource" msgstr "" #. Tag: programlisting #, no-c-format msgid "<configuration>\n" " <resources>\n" " <primitive id=\"Public-IP\" class=\"ocf\" type=\"IPaddr\" provider=\"heartbeat\">\n" " <instance_attributes id=\"params-public-ip\">\n" " <nvpair id=\"public-ip-addr\" name=\"ip\" value=\"192.0.2.2\"/>\n" " </instance_attributes>\n" " </primitive>\n" " <primitive id=\"Email\" class=\"lsb\" type=\"exim\"/>\n" " </resources>\n" " <constraints>\n" " <rsc_colocation id=\"xxx\" rsc=\"Email\" with-rsc=\"Public-IP\" score=\"INFINITY\"/>\n" " <rsc_order id=\"yyy\" first=\"Public-IP\" then=\"Email\"/>\n" " </constraints>\n" "</configuration>" msgstr "" #. Tag: para #, no-c-format msgid "Obviously as the group grows bigger, the reduced configuration effort can become significant." msgstr "" #. Tag: para #, no-c-format msgid "Another (typical) example of a group is a DRBD volume, the filesystem mount, an IP address, and an application that uses them." msgstr "" #. Tag: title #, no-c-format msgid "Group Properties" msgstr "" #. Tag: title #, no-c-format msgid "Properties of a Group Resource" msgstr "" #. Tag: entry #, no-c-format msgid "Field" msgstr "" #.
Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "id" msgstr "" #. Tag: para #, no-c-format msgid "A unique name for the group idGroup Resource Property Group Resource Property ResourceGroup Propertyid Group Propertyid id " msgstr "" #. Tag: title #, no-c-format msgid "Group Options" msgstr "" #. Tag: para #, no-c-format msgid "Groups inherit the priority, target-role, and is-managed properties from primitive resources. See for information about those properties." msgstr "" #. Tag: title #, no-c-format msgid "Group Instance Attributes" msgstr "" #. Tag: para #, no-c-format msgid "Groups have no instance attributes. However, any that are set for the group object will be inherited by the group’s children." msgstr "" #. Tag: title #, no-c-format msgid "Group Contents" msgstr "" #. Tag: para #, no-c-format msgid "Groups may only contain a collection of cluster resources (see ). To refer to a child of a group resource, just use the child’s id instead of the group’s." msgstr "" #. Tag: title #, no-c-format msgid "Group Constraints" msgstr "" #. Tag: para #, no-c-format msgid "Although it is possible to reference a group’s children in constraints, it is usually preferable to reference the group itself." msgstr "" #. Tag: title #, no-c-format msgid "Some constraints involving groups" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_location id=\"group-prefers-node1\" rsc=\"shortcut\" node=\"node1\" score=\"500\"/>\n" " <rsc_colocation id=\"webserver-with-group\" rsc=\"Webserver\" with-rsc=\"shortcut\"/>\n" " <rsc_order id=\"start-group-then-webserver\" first=\"Webserver\" then=\"shortcut\"/>\n" "</constraints>" msgstr "" #. Tag: title #, no-c-format msgid "Group Stickiness" msgstr "" #. Tag: para #, no-c-format msgid " resource-stickinessGroups Groups " msgstr "" #. Tag: para #, no-c-format msgid "Stickiness, the measure of how much a resource wants to stay where it is, is additive in groups. 
Every active resource of the group will contribute its stickiness value to the group’s total. So if the default resource-stickiness is 100, and a group has seven members, five of which are active, then the group as a whole will prefer its current location with a score of 500." msgstr "" #. Tag: title #, no-c-format msgid "Clones - Resources That Get Active on Multiple Hosts" msgstr "" #. Tag: para #, no-c-format msgid " Clone Resources ResourcesClones Clones " msgstr "" #. Tag: para #, no-c-format msgid "Clones were initially conceived as a convenient way to start multiple instances of an IP address resource and have them distributed throughout the cluster for load balancing. They have turned out to be quite useful for a number of purposes including integrating with the Distributed Lock Manager (used by many cluster filesystems), the fencing subsystem, and OCFS2." msgstr "" #. Tag: para #, no-c-format msgid "You can clone any resource, provided the resource agent supports it." msgstr "" #. Tag: para #, no-c-format msgid "Three types of cloned resources exist:" msgstr "" #. Tag: para #, no-c-format msgid "Anonymous" msgstr "" #. Tag: para #, no-c-format msgid "Globally unique" msgstr "" #. Tag: para #, no-c-format msgid "Stateful" msgstr "" #. Tag: para #, no-c-format msgid "Anonymous clones are the simplest. These behave completely identically everywhere they are running. Because of this, there can be only one copy of an anonymous clone active per machine." msgstr "" #. Tag: para #, no-c-format msgid "Globally unique clones are distinct entities. A copy of the clone running on one machine is not equivalent to another instance on another node, nor would any two copies on the same node be equivalent." msgstr "" #. Tag: para #, no-c-format msgid "Stateful clones are covered later in ." msgstr "" #. Tag: title #, no-c-format msgid "A clone of an LSB resource" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<clone id=\"apache-clone\">\n" " <meta_attributes id=\"apache-clone-meta\">\n" " <nvpair id=\"apache-unique\" name=\"globally-unique\" value=\"false\"/>\n" " </meta_attributes>\n" " <primitive id=\"apache\" class=\"lsb\" type=\"apache\"/>\n" "</clone>" msgstr "" #. Tag: title #, no-c-format msgid "Clone Properties" msgstr "" #. Tag: title #, no-c-format msgid "Properties of a Clone Resource" msgstr "" #. Tag: para #, no-c-format msgid "A unique name for the clone idClone Property Clone Property ClonePropertyid Propertyid id " msgstr "" #. Tag: title #, no-c-format msgid "Clone Options" msgstr "" #. Tag: para #, no-c-format msgid "Options inherited from primitive resources: priority, target-role, is-managed" msgstr "" #. Tag: title #, no-c-format msgid "Clone-specific configuration options" msgstr "" #. Tag: entry #, no-c-format msgid "Default" msgstr "" #. Tag: para #, no-c-format msgid "clone-max" msgstr "" #. Tag: para #, no-c-format msgid "number of nodes in cluster" msgstr "" #. Tag: para #, no-c-format msgid "How many copies of the resource to start clone-maxClone Option Clone Option CloneOptionclone-max Optionclone-max clone-max " msgstr "" #. Tag: para #, no-c-format msgid "clone-node-max" msgstr "" #. Tag: para #, no-c-format msgid "1" msgstr "" #. Tag: para #, no-c-format msgid "How many copies of the resource can be started on a single node clone-node-maxClone Option Clone Option CloneOptionclone-node-max Optionclone-node-max clone-node-max " msgstr "" #. Tag: para #, no-c-format msgid "clone-min" msgstr "" #. Tag: para #, no-c-format msgid "Require at least this number of clone instances to be runnable before allowing resources depending on the clone to be runnable (since 1.1.14) clone-minClone Option Clone Option CloneOptionclone-min Optionclone-min clone-min " msgstr "" #. Tag: para #, no-c-format msgid "notify" msgstr "" #. Tag: para #, no-c-format msgid "true" msgstr "" #. 
Tag: para #, no-c-format msgid "When stopping or starting a copy of the clone, tell all the other copies beforehand and again when the action was successful. Allowed values: false, true notifyClone Option Clone Option CloneOptionnotify Optionnotify notify " msgstr "" #. Tag: para #, no-c-format msgid "globally-unique" msgstr "" #. Tag: para #, no-c-format msgid "false" msgstr "" #. Tag: para #, no-c-format msgid "Does each copy of the clone perform a different function? Allowed values: false, true globally-uniqueClone Option Clone Option CloneOptionglobally-unique Optionglobally-unique globally-unique " msgstr "" #. Tag: para #, no-c-format msgid "ordered" msgstr "" #. Tag: para #, no-c-format msgid "Should the copies be started in series (instead of in parallel)? Allowed values: false, true orderedClone Option Clone Option CloneOptionordered Optionordered ordered " msgstr "" #. Tag: para #, no-c-format msgid "interleave" msgstr "" #. Tag: para #, no-c-format msgid "If this clone depends on another clone via an ordering constraint, is it allowed to start after the local instance of the other clone starts, rather than wait for all instances of the other clone to start? Allowed values: false, true interleaveClone Option Clone Option CloneOptioninterleave Optioninterleave interleave " msgstr "" #. Tag: title #, no-c-format msgid "Clone Instance Attributes" msgstr "" #. Tag: para #, no-c-format msgid "Clones have no instance attributes; however, any that are set here will be inherited by the clone’s children." msgstr "" #. Tag: title #, no-c-format msgid "Clone Contents" msgstr "" #. Tag: para #, no-c-format msgid "Clones must contain exactly one primitive or group resource." msgstr "" #. Tag: para #, no-c-format msgid "You should never reference the name of a clone’s child. If you think you need to do this, you probably need to re-evaluate your design." msgstr "" #. Tag: title #, no-c-format msgid "Clone Constraints" msgstr "" #. 
Tag: para #, no-c-format msgid "In most cases, a clone will have a single copy on each active cluster node. If this is not the case, you can indicate which nodes the cluster should preferentially assign copies to with resource location constraints. These constraints are written no differently from those for primitive resources except that the clone’s id is used." msgstr "" #. Tag: title #, no-c-format msgid "Some constraints involving clones" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_location id=\"clone-prefers-node1\" rsc=\"apache-clone\" node=\"node1\" score=\"500\"/>\n" " <rsc_colocation id=\"stats-with-clone\" rsc=\"apache-stats\" with-rsc=\"apache-clone\"/>\n" " <rsc_order id=\"start-clone-then-stats\" first=\"apache-clone\" then=\"apache-stats\"/>\n" "</constraints>" msgstr "" #. Tag: para #, no-c-format msgid "Ordering constraints behave slightly differently for clones. In the example above, apache-stats will wait until all copies of apache-clone that need to be started have done so before being started itself. Only if no copies can be started will apache-stats be prevented from being active. Additionally, the clone will wait for apache-stats to be stopped before stopping itself." msgstr "" #. Tag: para #, no-c-format msgid "Colocation of a primitive or group resource with a clone means that the resource can run on any machine with an active copy of the clone. The cluster will choose a copy based on where the clone is running and the resource’s own location preferences." msgstr "" #. Tag: para #, no-c-format msgid "Colocation between clones is also possible. If one clone A is colocated with another clone B, the set of allowed locations for A is limited to nodes on which B is (or will be) active. Placement is then performed normally." msgstr "" #. Tag: title #, no-c-format msgid "Clone Stickiness" msgstr "" #. Tag: para #, no-c-format msgid " resource-stickinessClones Clones " msgstr "" #. 
Tag: para #, no-c-format msgid "To achieve a stable allocation pattern, clones are slightly sticky by default. If no value for resource-stickiness is provided, the clone will use a value of 1. Being a small value, it causes minimal disturbance to the score calculations of other resources but is enough to prevent Pacemaker from needlessly moving copies around the cluster." msgstr "" #. Tag: para #, no-c-format msgid "For globally unique clones, this may result in multiple instances of the clone staying on a single node, even after another eligible node becomes active (for example, after being put into standby mode then made active again). If you do not want this behavior, specify a resource-stickiness of 0 for the clone temporarily and let the cluster adjust, then set it back to 1 if you want the default behavior to apply again." msgstr "" #. Tag: title #, no-c-format msgid "Clone Resource Agent Requirements" msgstr "" #. Tag: para #, no-c-format msgid "Any resource can be used as an anonymous clone, as it requires no additional support from the resource agent. Whether it makes sense to do so depends on your resource and its resource agent." msgstr "" #. Tag: para #, no-c-format msgid "Globally unique clones do require some additional support in the resource agent. In particular, it must only respond with ${OCF_SUCCESS} if the node has that exact instance active. All other probes for instances of the clone should result in ${OCF_NOT_RUNNING} (or one of the other OCF error codes if they are failed)." msgstr "" #. Tag: para #, no-c-format msgid "Individual instances of a clone are identified by appending a colon and a numerical offset, e.g. apache:2." msgstr "" #. Tag: para #, no-c-format msgid "Resource agents can find out how many copies there are by examining the OCF_RESKEY_CRM_meta_clone_max environment variable and which copy it is by examining OCF_RESKEY_CRM_meta_clone." msgstr "" #. 
Tag: para #, no-c-format msgid "The resource agent must not make any assumptions (based on OCF_RESKEY_CRM_meta_clone) about which numerical instances are active. In particular, the list of active copies will not always be an unbroken sequence, nor always start at 0." msgstr "" #. Tag: title #, no-c-format msgid "Clone Notifications" msgstr "" #. Tag: para #, no-c-format msgid "Supporting notifications requires the notify action to be implemented. If supported, the notify action will be passed a number of extra variables which, when combined with additional context, can be used to calculate the current state of the cluster and what is about to happen to it." msgstr "" #. Tag: title #, no-c-format msgid "Environment variables supplied with Clone notify actions" msgstr "" #. Tag: entry #, no-c-format msgid "Variable" msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_type" msgstr "" #. Tag: para #, no-c-format msgid "Allowed values: pre, post Environment VariableOCF_RESKEY_CRM_meta_notify_type OCF_RESKEY_CRM_meta_notify_type type typeNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_operation" msgstr "" #. Tag: para #, no-c-format msgid "Allowed values: start, stop Environment VariableOCF_RESKEY_CRM_meta_notify_operation OCF_RESKEY_CRM_meta_notify_operation operation operationNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_start_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources to be started Environment VariableOCF_RESKEY_CRM_meta_notify_start_resource OCF_RESKEY_CRM_meta_notify_start_resource start_resource start_resourceNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_stop_resource" msgstr "" #. 
Tag: para #, no-c-format msgid "Resources to be stopped Environment VariableOCF_RESKEY_CRM_meta_notify_stop_resource OCF_RESKEY_CRM_meta_notify_stop_resource stop_resource stop_resourceNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_active_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources that are running Environment VariableOCF_RESKEY_CRM_meta_notify_active_resource OCF_RESKEY_CRM_meta_notify_active_resource active_resource active_resourceNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_inactive_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources that are not running Environment VariableOCF_RESKEY_CRM_meta_notify_inactive_resource OCF_RESKEY_CRM_meta_notify_inactive_resource inactive_resource inactive_resourceNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_start_uname" msgstr "" #. Tag: para #, no-c-format msgid "Nodes on which resources will be started Environment VariableOCF_RESKEY_CRM_meta_notify_start_uname OCF_RESKEY_CRM_meta_notify_start_uname start_uname start_unameNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_stop_uname" msgstr "" #. Tag: para #, no-c-format msgid "Nodes on which resources will be stopped Environment VariableOCF_RESKEY_CRM_meta_notify_stop_uname OCF_RESKEY_CRM_meta_notify_stop_uname stop_uname stop_unameNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_active_uname" msgstr "" #. 
Tag: para #, no-c-format msgid "Nodes on which resources are running Environment VariableOCF_RESKEY_CRM_meta_notify_active_uname OCF_RESKEY_CRM_meta_notify_active_uname active_uname active_unameNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_inactive_uname" msgstr "" #. Tag: para #, no-c-format msgid "Nodes on which resources are not running Environment VariableOCF_RESKEY_CRM_meta_notify_inactive_uname OCF_RESKEY_CRM_meta_notify_inactive_uname inactive_uname inactive_unameNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "The variables come in pairs, such as OCF_RESKEY_CRM_meta_notify_start_resource and OCF_RESKEY_CRM_meta_notify_start_uname and should be treated as an array of whitespace-separated elements." msgstr "" #. Tag: para #, no-c-format msgid "Thus in order to indicate that clone:0 will be started on sles-1, clone:2 will be started on sles-3, and clone:3 will be started on sles-2, the cluster would set" msgstr "" #. Tag: title #, no-c-format msgid "Notification variables" msgstr "" #. Tag: programlisting #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_start_resource=\"clone:0 clone:2 clone:3\"\n" "OCF_RESKEY_CRM_meta_notify_start_uname=\"sles-1 sles-3 sles-2\"" msgstr "" #. Tag: title #, no-c-format msgid "Proper Interpretation of Notification Environment Variables" msgstr "" #. Tag: title #, no-c-format msgid "Pre-notification (stop):" msgstr "" #. Tag: para #, no-c-format msgid "Active resources: $OCF_RESKEY_CRM_meta_notify_active_resource" msgstr "" #. Tag: para #, no-c-format msgid "Inactive resources: $OCF_RESKEY_CRM_meta_notify_inactive_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources to be started: $OCF_RESKEY_CRM_meta_notify_start_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources to be stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource" msgstr "" #. 
Tag: title #, no-c-format msgid "Post-notification (stop) / Pre-notification (start):" msgstr "" #. Tag: para #, no-c-format msgid "Active resources" msgstr "" #. Tag: para #, no-c-format msgid "$OCF_RESKEY_CRM_meta_notify_active_resource" msgstr "" #. Tag: para #, no-c-format msgid "minus $OCF_RESKEY_CRM_meta_notify_stop_resource" msgstr "" #. Tag: para #, no-c-format msgid "Inactive resources" msgstr "" #. Tag: para #, no-c-format msgid "$OCF_RESKEY_CRM_meta_notify_inactive_resource" msgstr "" #. Tag: para #, no-c-format msgid "plus $OCF_RESKEY_CRM_meta_notify_stop_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources that were started: $OCF_RESKEY_CRM_meta_notify_start_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources that were stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource" msgstr "" #. Tag: title #, no-c-format msgid "Post-notification (start):" msgstr "" #. Tag: para #, no-c-format msgid "Active resources:" msgstr "" #. Tag: para #, no-c-format msgid "plus $OCF_RESKEY_CRM_meta_notify_start_resource" msgstr "" #. Tag: para #, no-c-format msgid "Inactive resources:" msgstr "" #. Tag: para #, no-c-format msgid "minus $OCF_RESKEY_CRM_meta_notify_start_resource" msgstr "" #. Tag: title #, no-c-format msgid "Multi-state - Resources That Have Multiple Modes" msgstr "" #. Tag: para #, no-c-format msgid " Multi-state Resources ResourcesMulti-state Multi-state " msgstr "" #. Tag: para #, no-c-format msgid "Multi-state resources are a specialization of clone resources; please ensure you understand before continuing!" msgstr "" #. Tag: para #, no-c-format msgid "Multi-state resources allow the instances to be in one of two operating modes (called roles). The roles are called master and slave, but can mean whatever you wish them to mean. The only limitation is that when an instance is started, it must come up in the slave role." msgstr "" #. Tag: title #, no-c-format msgid "Multi-state Properties" msgstr "" #. 
Tag: title #, no-c-format msgid "Properties of a Multi-State Resource" msgstr "" #. Tag: para #, no-c-format msgid "Your name for the multi-state resource idMulti-State Property Multi-State Property Multi-StatePropertyid Propertyid id " msgstr "" #. Tag: title #, no-c-format msgid "Multi-state Options" msgstr "" #. Tag: para #, no-c-format msgid "Options inherited from primitive resources: priority, target-role, is-managed" msgstr "" #. Tag: para #, no-c-format msgid "Options inherited from clone resources: clone-max, clone-node-max, notify, globally-unique, ordered, interleave" msgstr "" #. Tag: title #, no-c-format msgid "Multi-state-specific resource configuration options" msgstr "" #. Tag: para #, no-c-format msgid "master-max" msgstr "" #. Tag: para #, no-c-format msgid "How many copies of the resource can be promoted to the master role master-maxMulti-State Option Multi-State Option Multi-StateOptionmaster-max Optionmaster-max master-max " msgstr "" #. Tag: para #, no-c-format msgid "master-node-max" msgstr "" #. Tag: para #, no-c-format msgid "How many copies of the resource can be promoted to the master role on a single node master-node-maxMulti-State Option Multi-State Option Multi-StateOptionmaster-node-max Optionmaster-node-max master-node-max " msgstr "" #. Tag: title #, no-c-format msgid "Multi-state Instance Attributes" msgstr "" #. Tag: para #, no-c-format msgid "Multi-state resources have no instance attributes; however, any that are set here will be inherited by a master’s children." msgstr "" #. Tag: title #, no-c-format msgid "Multi-state Contents" msgstr "" #. Tag: para #, no-c-format msgid "Masters must contain exactly one primitive or group resource." msgstr "" #. Tag: para #, no-c-format msgid "You should never reference the name of a master’s child. If you think you need to do this, you probably need to re-evaluate your design." msgstr "" #. Tag: title #, no-c-format msgid "Monitoring Multi-State Resources" msgstr "" #. 
Tag: para #, no-c-format msgid "The usual monitor actions are insufficient to monitor a multi-state resource, because Pacemaker needs to verify not only that the resource is active, but also that its actual role matches its intended one." msgstr "" #. Tag: para #, no-c-format msgid "Define two monitoring actions: the usual one will cover the slave role, and an additional one with role=\"Master\" will cover the master role." msgstr "" #. Tag: title #, no-c-format msgid "Monitoring both states of a multi-state resource" msgstr "" #. Tag: programlisting #, no-c-format msgid "<master id=\"myMasterRsc\">\n" " <primitive id=\"myRsc\" class=\"ocf\" type=\"myApp\" provider=\"myCorp\">\n" " <operations>\n" " <op id=\"public-ip-slave-check\" name=\"monitor\" interval=\"60\"/>\n" " <op id=\"public-ip-master-check\" name=\"monitor\" interval=\"61\" role=\"Master\"/>\n" " </operations>\n" " </primitive>\n" "</master>" msgstr "" #. Tag: para #, no-c-format msgid "It is crucial that every monitor operation has a different interval! Pacemaker currently differentiates between operations only by resource and interval; so if (for example) a master/slave resource had the same monitor interval for both roles, Pacemaker would ignore the role when checking the status, which would cause unexpected return codes, and therefore unnecessary complications." msgstr "" #. Tag: title #, no-c-format msgid "Multi-state Constraints" msgstr "" #. Tag: para #, no-c-format msgid "In most cases, multi-state resources will have a single copy on each active cluster node. If this is not the case, you can indicate which nodes the cluster should preferentially assign copies to with resource location constraints. These constraints are written no differently from those for primitive resources except that the master’s id is used." msgstr "" #. Tag: para #, no-c-format msgid "When considering multi-state resources in constraints, for most purposes it is sufficient to treat them as clones. 
The exception is that the first-action and/or then-action fields for ordering constraints may be set to promote or demote to constrain the master role, and colocation constraints may contain rsc-role and/or with-rsc-role fields." msgstr "" #. Tag: title #, no-c-format msgid "Additional colocation constraint options for multi-state resources" msgstr "" #. Tag: para #, no-c-format msgid "rsc-role" msgstr "" #. Tag: para #, no-c-format -msgid "started" +msgid "Started" msgstr "" #. Tag: para #, no-c-format -msgid "An additional attribute of colocation constraints that specifies the role that rsc must be in. Allowed values: started, master, slave. rsc-roleOrdering Constraints Ordering Constraints ConstraintsOrderingrsc-role Orderingrsc-role rsc-role " +msgid "An additional attribute of colocation constraints that specifies the role that rsc must be in. Allowed values: Started, Master, Slave. rsc-roleOrdering Constraints Ordering Constraints ConstraintsOrderingrsc-role Orderingrsc-role rsc-role " msgstr "" #. Tag: para #, no-c-format msgid "with-rsc-role" msgstr "" #. Tag: para #, no-c-format -msgid "An additional attribute of colocation constraints that specifies the role that with-rsc must be in. Allowed values: started, master, slave. with-rsc-roleOrdering Constraints Ordering Constraints ConstraintsOrderingwith-rsc-role Orderingwith-rsc-role with-rsc-role " +msgid "An additional attribute of colocation constraints that specifies the role that with-rsc must be in. Allowed values: Started, Master, Slave. with-rsc-roleOrdering Constraints Ordering Constraints ConstraintsOrderingwith-rsc-role Orderingwith-rsc-role with-rsc-role " msgstr "" #. Tag: title #, no-c-format msgid "Constraints involving multi-state resources" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_location id=\"db-prefers-node1\" rsc=\"database\" node=\"node1\" score=\"500\"/>\n" " <rsc_colocation id=\"backup-with-db-slave\" rsc=\"backup\"\n" " with-rsc=\"database\" with-rsc-role=\"Slave\"/>\n" " <rsc_colocation id=\"myapp-with-db-master\" rsc=\"myApp\"\n" " with-rsc=\"database\" with-rsc-role=\"Master\"/>\n" " <rsc_order id=\"start-db-before-backup\" first=\"database\" then=\"backup\"/>\n" " <rsc_order id=\"promote-db-then-app\" first=\"database\" first-action=\"promote\"\n" " then=\"myApp\" then-action=\"start\"/>\n" "</constraints>" msgstr "" #. Tag: para #, no-c-format msgid "In the example above, myApp will wait until one of the database copies has been started and promoted to master before being started itself on the same node. Only if no copies can be promoted will myApp be prevented from being active. Additionally, the cluster will wait for myApp to be stopped before demoting the database." msgstr "" #. Tag: para #, no-c-format msgid "Colocation of a primitive or group resource with a multi-state resource means that it can run on any machine with an active copy of the multi-state resource that has the specified role (master or slave). In the example above, the cluster will choose a location based on where database is running as a master, and if there are multiple master instances it will also factor in myApp's own location preferences when deciding which location to choose." msgstr "" #. Tag: para #, no-c-format msgid "Colocation with regular clones and other multi-state resources is also possible. In such cases, the set of allowed locations for the rsc clone is (after role filtering) limited to nodes on which the with-rsc multi-state resource is (or will be) in the specified role. Placement is then performed as normal." msgstr "" #. Tag: title #, no-c-format msgid "Using Multi-state Resources in Colocation Sets" msgstr "" #. 
Tag: title #, no-c-format msgid "Additional colocation set options relevant to multi-state resources" msgstr "" #. Tag: para #, no-c-format msgid "role" msgstr "" #. Tag: para #, no-c-format -msgid "The role that all members of the set must be in. Allowed values: started, master, slave. roleOrdering Constraints Ordering Constraints ConstraintsOrderingrole Orderingrole role " +msgid "The role that all members of the set must be in. Allowed values: Started, Master, Slave. roleOrdering Constraints Ordering Constraints ConstraintsOrderingrole Orderingrole role " msgstr "" #. Tag: para #, no-c-format msgid "In the following example B's master must be located on the same node as A's master. Additionally resources C and D must be located on the same node as A's and B's masters." msgstr "" #. Tag: title #, no-c-format msgid "Colocate C and D with A’s and B’s master instances" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_colocation id=\"coloc-1\" score=\"INFINITY\" >\n" " <resource_set id=\"colocated-set-example-1\" sequential=\"true\" role=\"Master\">\n" " <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" " </resource_set>\n" " <resource_set id=\"colocated-set-example-2\" sequential=\"true\">\n" " <resource_ref id=\"C\"/>\n" " <resource_ref id=\"D\"/>\n" " </resource_set>\n" " </rsc_colocation>\n" "</constraints>" msgstr "" #. Tag: title #, no-c-format msgid "Using Multi-state Resources in Ordering Sets" msgstr "" #. Tag: title #, no-c-format msgid "Additional ordered set options relevant to multi-state resources" msgstr "" #. Tag: para #, no-c-format msgid "action" msgstr "" #. Tag: para #, no-c-format msgid "value of first-action" msgstr "" #. Tag: para #, no-c-format msgid "An additional attribute of ordering constraint sets that specifies the action that applies to all members of the set. Allowed values: start, stop, promote, demote. 
actionOrdering Constraints Ordering Constraints ConstraintsOrderingaction Orderingaction action " msgstr "" #. Tag: title #, no-c-format msgid "Start C and D after first promoting A and B" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_order id=\"order-1\" score=\"INFINITY\" >\n" " <resource_set id=\"ordered-set-1\" sequential=\"true\" action=\"promote\">\n" " <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" " </resource_set>\n" " <resource_set id=\"ordered-set-2\" sequential=\"true\" action=\"start\">\n" " <resource_ref id=\"C\"/>\n" " <resource_ref id=\"D\"/>\n" " </resource_set>\n" " </rsc_order>\n" "</constraints>" msgstr "" #. Tag: para #, no-c-format msgid "In the above example, B cannot be promoted to a master role until A has been promoted. Additionally, resources C and D must wait until A and B have been promoted before they can start." msgstr "" #. Tag: title #, no-c-format msgid "Multi-state Stickiness" msgstr "" #. Tag: para #, no-c-format msgid " resource-stickinessMulti-State Multi-State As with regular clones, multi-state resources are slightly sticky by default. See for details." msgstr "" #. Tag: title #, no-c-format msgid "Which Resource Instance is Promoted" msgstr "" #. Tag: para #, no-c-format msgid "During the start operation, most resource agents should call the crm_master utility. This tool automatically detects both the resource and host and should be used to set a preference for being promoted. Based on this, master-max, and master-node-max, the instance(s) with the highest preference will be promoted." msgstr "" #. Tag: para #, no-c-format msgid "An alternative is to create a location constraint that indicates which nodes are most preferred as masters." msgstr "" #. Tag: title #, no-c-format msgid "Explicitly preferring node1 to be promoted to master" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<rsc_location id=\"master-location\" rsc=\"myMasterRsc\">\n" " <rule id=\"master-rule\" score=\"100\" role=\"Master\">\n" " <expression id=\"master-exp\" attribute=\"#uname\" operation=\"eq\" value=\"node1\"/>\n" " </rule>\n" "</rsc_location>" msgstr "" #. Tag: title #, no-c-format msgid "Requirements for Multi-state Resource Agents" msgstr "" #. Tag: para #, no-c-format msgid "Since multi-state resources are an extension of cloned resources, all the requirements for resource agents that support clones are also requirements for resource agents that support multi-state resources." msgstr "" #. Tag: para #, no-c-format msgid "Additionally, multi-state resources require two extra actions, demote and promote, which are responsible for changing the state of the resource. Like start and stop, they should return ${OCF_SUCCESS} if they completed successfully or a relevant error code if they did not." msgstr "" #. Tag: para #, no-c-format msgid "The states can mean whatever you wish, but when the resource is started, it must come up in the mode called slave. From there the cluster will decide which instances to promote to master." msgstr "" #. Tag: para #, no-c-format msgid "In addition to the clone requirements for monitor actions, agents must also accurately report which state they are in. The cluster relies on the agent to report its status (including role) accurately and does not indicate to the agent what role it currently believes it to be in." msgstr "" #. Tag: title #, no-c-format msgid "Role implications of OCF return codes" msgstr "" #. Tag: entry #, no-c-format msgid "Monitor Return Code" msgstr "" #. Tag: para #, no-c-format msgid "OCF_NOT_RUNNING" msgstr "" #. Tag: para #, no-c-format msgid "Stopped Return CodeOCF_NOT_RUNNING OCF_NOT_RUNNING " msgstr "" #. Tag: para #, no-c-format msgid "OCF_SUCCESS" msgstr "" #. Tag: para #, no-c-format msgid "Running (Slave) Return CodeOCF_SUCCESS OCF_SUCCESS " msgstr "" #. 
Tag: para #, no-c-format msgid "OCF_RUNNING_MASTER" msgstr "" #. Tag: para #, no-c-format msgid "Running (Master) Return CodeOCF_RUNNING_MASTER OCF_RUNNING_MASTER " msgstr "" #. Tag: para #, no-c-format msgid "OCF_FAILED_MASTER" msgstr "" #. Tag: para #, no-c-format msgid "Failed (Master) Return CodeOCF_FAILED_MASTER OCF_FAILED_MASTER " msgstr "" #. Tag: para #, no-c-format msgid "Other" msgstr "" #. Tag: para #, no-c-format msgid "Failed (Slave)" msgstr "" #. Tag: title #, no-c-format msgid "Multi-state Notifications" msgstr "" #. Tag: para #, no-c-format msgid "Like clones, supporting notifications requires the notify action to be implemented. If supported, the notify action will be passed a number of extra variables which, when combined with additional context, can be used to calculate the current state of the cluster and what is about to happen to it." msgstr "" #. Tag: title #, no-c-format msgid "Environment variables supplied with multi-state notify actions Emphasized variables are specific to Master resources, and all behave in the same manner as described for Clone resources." msgstr "" -#. Tag: para -#, no-c-format -msgid "Resources the that are running Environment VariableOCF_RESKEY_CRM_meta_notify_active_resource OCF_RESKEY_CRM_meta_notify_active_resource active_resource active_resourceNotification Environment Variable Notification Environment Variable " -msgstr "" - -#. Tag: para -#, no-c-format -msgid "Resources the that are not running Environment VariableOCF_RESKEY_CRM_meta_notify_inactive_resource OCF_RESKEY_CRM_meta_notify_inactive_resource inactive_resource inactive_resourceNotification Environment Variable Notification Environment Variable " -msgstr "" - #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_master_resource" msgstr "" #. 
Tag: para #, no-c-format msgid "Resources that are running in Master mode Environment VariableOCF_RESKEY_CRM_meta_notify_master_resource OCF_RESKEY_CRM_meta_notify_master_resource master_resource master_resourceNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_slave_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources that are running in Slave mode Environment VariableOCF_RESKEY_CRM_meta_notify_slave_resource OCF_RESKEY_CRM_meta_notify_slave_resource slave_resource slave_resourceNotification Environment Variable Notification Environment Variable " msgstr "" -#. Tag: para -#, no-c-format -msgid " Environment VariableOCF_RESKEY_CRM_meta_notify_stop_resource OCF_RESKEY_CRM_meta_notify_stop_resource stop_resource stop_resourceNotification Environment Variable Notification Environment Variable OCF_RESKEY_CRM_meta_notify_stop_resource" -msgstr "" - -#. Tag: para -#, no-c-format -msgid "Resources to be stopped" -msgstr "" - #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_promote_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources to be promoted Environment VariableOCF_RESKEY_CRM_meta_notify_promote_resource OCF_RESKEY_CRM_meta_notify_promote_resource promote_resource promote_resourceNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_demote_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources to be demoted Environment VariableOCF_RESKEY_CRM_meta_notify_demote_resource OCF_RESKEY_CRM_meta_notify_demote_resource demote_resource demote_resourceNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_promote_uname" msgstr "" #. 
Tag: para #, no-c-format -msgid "Nodes on which resources will be promote Environment VariableOCF_RESKEY_CRM_meta_notify_promote_uname OCF_RESKEY_CRM_meta_notify_promote_uname promote_uname promote_unameNotification Environment Variable Notification Environment Variable " +msgid "Nodes on which resources will be promoted Environment VariableOCF_RESKEY_CRM_meta_notify_promote_uname OCF_RESKEY_CRM_meta_notify_promote_uname promote_uname promote_unameNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_demote_uname" msgstr "" #. Tag: para #, no-c-format msgid "Nodes on which resources will be demoted Environment VariableOCF_RESKEY_CRM_meta_notify_demote_uname OCF_RESKEY_CRM_meta_notify_demote_uname demote_uname demote_unameNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_master_uname" msgstr "" #. Tag: para #, no-c-format msgid "Nodes on which resources are running in Master mode Environment VariableOCF_RESKEY_CRM_meta_notify_master_uname OCF_RESKEY_CRM_meta_notify_master_uname master_uname master_unameNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: para #, no-c-format msgid "OCF_RESKEY_CRM_meta_notify_slave_uname" msgstr "" #. Tag: para #, no-c-format msgid "Nodes on which resources are running in Slave mode Environment VariableOCF_RESKEY_CRM_meta_notify_slave_uname OCF_RESKEY_CRM_meta_notify_slave_uname slave_uname slave_unameNotification Environment Variable Notification Environment Variable " msgstr "" #. Tag: title #, no-c-format msgid "Proper Interpretation of Multi-state Notification Environment Variables" msgstr "" #. Tag: title #, no-c-format msgid "Pre-notification (demote):" msgstr "" #. Tag: para #, no-c-format msgid "Active resources: $OCF_RESKEY_CRM_meta_notify_active_resource" msgstr "" #. 
Tag: para #, no-c-format msgid "Master resources: $OCF_RESKEY_CRM_meta_notify_master_resource" msgstr "" #. Tag: para #, no-c-format msgid "Slave resources: $OCF_RESKEY_CRM_meta_notify_slave_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources to be promoted: $OCF_RESKEY_CRM_meta_notify_promote_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources to be demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource" msgstr "" #. Tag: title #, no-c-format msgid "Post-notification (demote) / Pre-notification (stop):" msgstr "" #. Tag: para #, no-c-format msgid "Master resources:" msgstr "" #. Tag: para #, no-c-format msgid "$OCF_RESKEY_CRM_meta_notify_master_resource" msgstr "" #. Tag: para #, no-c-format msgid "minus $OCF_RESKEY_CRM_meta_notify_demote_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources that were demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource" msgstr "" #. Tag: title #, no-c-format msgid "Post-notification (stop) / Pre-notification (start)" msgstr "" #. Tag: para #, no-c-format msgid "Active resources:" msgstr "" #. Tag: para #, no-c-format msgid "Slave resources:" msgstr "" #. Tag: para #, no-c-format msgid "$OCF_RESKEY_CRM_meta_notify_slave_resource" msgstr "" #. Tag: title #, no-c-format msgid "Post-notification (start) / Pre-notification (promote)" msgstr "" #. Tag: title #, no-c-format msgid "Post-notification (promote)" msgstr "" #. Tag: para #, no-c-format msgid "plus $OCF_RESKEY_CRM_meta_notify_promote_resource" msgstr "" #. Tag: para #, no-c-format msgid "minus $OCF_RESKEY_CRM_meta_notify_promote_resource" msgstr "" #. Tag: para #, no-c-format msgid "Resources that were promoted: $OCF_RESKEY_CRM_meta_notify_promote_resource" msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Alerts.pot b/doc/Pacemaker_Explained/pot/Ch-Alerts.pot new file mode 100644 index 0000000000..826b23b793 --- /dev/null +++ b/doc/Pacemaker_Explained/pot/Ch-Alerts.pot @@ -0,0 +1,486 @@ +# +# AUTHOR , YEAR. 
+# +msgid "" +msgstr "" +"Project-Id-Version: 0\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" +"Last-Translator: Automatically generated\n" +"Language-Team: None\n" +"MIME-Version: 1.0\n" +"Content-Type: application/x-publican; charset=UTF-8\n" +"Content-Transfer-Encoding: 8bit\n" + +#. Tag: title +#, no-c-format +msgid "Alerts" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " ResourceAlerts Alerts " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Alerts may be configured to take some external action when a cluster event occurs (node failure, resource starting or stopping, etc.)." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Alert Agents" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "As with resource agents, the cluster calls an external program (an alert agent) to handle alerts. The cluster passes information about the event to the agent via environment variables. Agents can do anything desired with this information (send an e-mail, log to a file, update a monitoring system, etc.)." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Simple alert configuration" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "<configuration>\n" +" <alerts>\n" +" <alert id=\"my-alert\" path=\"/path/to/my-script.sh\" />\n" +" </alerts>\n" +"</configuration>" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "In the example above, the cluster will call my-script.sh for each event." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Multiple alert agents may be configured; the cluster will call all of them for each event." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Alert agents will be called only on cluster nodes. They will be called for events involving Pacemaker Remote nodes, but they will never be called on those nodes." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Alert Recipients" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Usually alerts are directed towards a recipient. 
Thus each alert may be additionally configured with one or more recipients. The cluster will call the agent separately for each recipient." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Alert configuration with recipient" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "<configuration>\n" +" <alerts>\n" +" <alert id=\"my-alert\" path=\"/path/to/my-script.sh\">\n" +" <recipient id=\"my-alert-recipient\" value=\"some-address\"/>\n" +" </alert>\n" +" </alerts>\n" +"</configuration>" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "In the above example, the cluster will call my-script.sh for each event, passing the recipient some-address as an environment variable." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The recipient may be anything the alert agent can recognize — an IP address, an e-mail address, a file name, whatever the particular agent supports." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Alert Meta-Attributes" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "As with resource agents, meta-attributes can be configured for alert agents to affect how Pacemaker calls them." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Meta-Attributes of an Alert" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Meta-Attribute" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Default" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Description" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "timestamp-format" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "%H:%M:%S.%06N" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Format the cluster will use when sending the event’s timestamp to the agent. This is a string as used with the date(1) command. AlertOptiontimestamp-format Optiontimestamp-format timestamp-format " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "timeout" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "30s" +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "If the alert agent does not complete within this amount of time, it will be terminated. AlertOptiontimeout Optiontimeout timeout " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Meta-attributes can be configured per alert agent and/or per recipient." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Alert configuration with meta-attributes" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "<configuration>\n" +" <alerts>\n" +" <alert id=\"my-alert\" path=\"/path/to/my-script.sh\">\n" +" <meta_attributes id=\"my-alert-attributes\">\n" +" <nvpair id=\"my-alert-attributes-timeout\" name=\"timeout\"\n" +" value=\"15s\"/>\n" +" </meta_attributes>\n" +" <recipient id=\"my-alert-recipient1\" value=\"someuser@example.com\">\n" +" <meta_attributes id=\"my-alert-recipient1-attributes\">\n" +" <nvpair id=\"my-alert-recipient1-timestamp-format\"\n" +" name=\"timestamp-format\" value=\"%D %H:%M\"/>\n" +" </meta_attributes>\n" +" </recipient>\n" +" <recipient id=\"my-alert-recipient2\" value=\"otheruser@example.com\">\n" +" <meta_attributes id=\"my-alert-recipient2-attributes\">\n" +" <nvpair id=\"my-alert-recipient2-timestamp-format\"\n" +" name=\"timestamp-format\" value=\"%c\"/>\n" +" </meta_attributes>\n" +" </recipient>\n" +" </alert>\n" +" </alerts>\n" +"</configuration>" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "In the above example, the my-script.sh will get called twice for each event, with each call using a 15-second timeout. One call will be passed the recipient someuser@example.com and a timestamp in the format %D %H:%M, while the other call will be passed the recipient otheruser@example.com and a timestamp in the format %c." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Alert Instance Attributes" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "As with resource agents, agent-specific configuration values may be configured as instance attributes. 
These will be passed to the agent as additional environment variables. The number, names and allowed values of these instance attributes are completely up to the particular agent." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Alert configuration with instance attributes" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid "<configuration>\n" +" <alerts>\n" +" <alert id=\"my-alert\" path=\"/path/to/my-script.sh\">\n" +" <meta_attributes id=\"my-alert-attributes\">\n" +" <nvpair id=\"my-alert-attributes-timeout\" name=\"timeout\"\n" +" value=\"15s\"/>\n" +" </meta_attributes>\n" +" <instance_attributes id=\"my-alert-options\">\n" +" <nvpair id=\"my-alert-options-debug\" name=\"debug\" value=\"false\"/>\n" +" </instance_attributes>\n" +" <recipient id=\"my-alert-recipient1\" value=\"someuser@example.com\"/>\n" +" </alert>\n" +" </alerts>\n" +"</configuration>" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Using the Sample Alert Agents" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Pacemaker provides several sample alert agents, installed in /usr/share/pacemaker/alerts by default." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "While these sample scripts may be copied and used as-is, they are provided mainly as templates to be edited to suit your purposes. See their source code for the full set of instance attributes they support." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Sending cluster events as SNMP traps" +msgstr "" + +#. 
Tag: programlisting +#, no-c-format +msgid "<configuration>\n" +" <alerts>\n" +" <alert id=\"snmp_alert\" path=\"/path/to/alert_snmp.sh\">\n" +" <instance_attributes id=\"config_for_alert_snmp\">\n" +" <nvpair id=\"trap_node_states\" name=\"trap_node_states\" value=\"all\"/>\n" +" </instance_attributes>\n" +" <meta_attributes id=\"config_for_timestamp\">\n" +" <nvpair id=\"ts_fmt\" name=\"timestamp-format\"\n" +" value=\"%Y-%m-%d,%H:%M:%S.%01N\"/>\n" +" </meta_attributes>\n" +" <recipient id=\"snmp_destination\" value=\"192.168.1.2\"/>\n" +" </alert>\n" +" </alerts>\n" +"</configuration>" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Sending cluster events as e-mails" +msgstr "" + +#. Tag: programlisting +#, no-c-format +msgid " <configuration>\n" +" <alerts>\n" +" <alert id=\"smtp_alert\" path=\"/path/to/alert_smtp.sh\">\n" +" <instance_attributes id=\"config_for_alert_smtp\">\n" +" <nvpair id=\"email_sender\" name=\"email_sender\"\n" +" value=\"donotreply@example.com\"/>\n" +" </instance_attributes>\n" +" <recipient id=\"smtp_destination\" value=\"admin@example.com\"/>\n" +" </alert>\n" +" </alerts>\n" +" </configuration>" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Writing an Alert Agent" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Environment variables passed to alert agents" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Environment Variable" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_kind" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The type of alert (node, fencing, or resource) Environment VariableCRM_alert_kind CRM_alert_kind kind " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_version" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The version of Pacemaker sending the alert Environment VariableCRM_alert_version CRM_alert_version version " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_recipient" +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "The configured recipient Environment VariableCRM_alert_recipient CRM_alert_recipient recipient " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_node_sequence" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "A sequence number increased whenever an alert is being issued on the local node, which can be used to reference the order in which alerts have been issued by Pacemaker. An alert for an event that happened later in time reliably has a higher sequence number than alerts for earlier events. Be aware that this number has no cluster-wide meaning. Environment VariableCRM_alert_node_sequence CRM_alert_node_sequence sequence " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_timestamp" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "A timestamp created prior to executing the agent, in the format specified by the timestamp-format meta-attribute. This allows the agent to have a reliable, high-precision time of when the event occurred, regardless of when the agent itself was invoked (which could potentially be delayed due to system load, etc.). Environment VariableCRM_alert_timestamp CRM_alert_timestamp timestamp " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_node" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Name of affected node Environment VariableCRM_alert_node CRM_alert_node node " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_desc" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Detail about event. For node alerts, this is the node’s current state (member or lost). For fencing alerts, this is a summary of the requested fencing operation, including origin, target, and fencing operation error code, if any. For resource alerts, this is a readable string equivalent of CRM_alert_status. Environment VariableCRM_alert_desc CRM_alert_desc desc " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_nodeid" +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "ID of node whose status changed (provided with node alerts only) Environment VariableCRM_alert_nodeid CRM_alert_nodeid nodeid " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_task" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The requested fencing or resource operation (provided with fencing and resource alerts only) Environment VariableCRM_alert_task CRM_alert_task task " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_rc" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The numerical return code of the fencing or resource operation (provided with fencing and resource alerts only) Environment VariableCRM_alert_rc CRM_alert_rc rc " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_rsc" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The name of the affected resource (resource alerts only) Environment VariableCRM_alert_rsc CRM_alert_rsc rsc " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_interval" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The interval of the resource operation (resource alerts only) Environment VariableCRM_alert_interval CRM_alert_interval interval " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_target_rc" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The expected numerical return code of the operation (resource alerts only) Environment VariableCRM_alert_target_rc CRM_alert_target_rc target_rc " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "CRM_alert_status" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "A numerical code used by Pacemaker to represent the operation result (resource alerts only) Environment VariableCRM_alert_status CRM_alert_status status " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Special concerns when writing alert agents:" +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "Alert agents may be called with no recipient (if none is configured), so the agent must be able to handle this situation, even if it only exits in that case. (Users may modify the configuration in stages, and add a recipient later.)" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "If more than one recipient is configured for an alert, the alert agent will be called once per recipient. If an agent is not able to run concurrently, it should be configured with only a single recipient. The agent is free, however, to interpret the recipient as a list." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "When a cluster event occurs, all alerts are fired off at the same time as separate processes. Depending on how many alerts and recipients are configured, and on what is done within the alert agents, a significant load burst may occur. The agent could be written to take this into consideration, for example by queueing resource-intensive actions into some other instance, instead of directly executing them." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Alert agents are run as the hacluster user, which has a minimal set of permissions. If an agent requires additional privileges, it is recommended to configure sudo to allow the agent to run the necessary commands as another user with the appropriate privileges." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "As always, take care to validate and sanitize user-configured parameters, such as CRM_alert_timestamp (whose content is specified by the user-configured timestamp-format), CRM_alert_recipient, and all instance attributes. Mostly this is needed simply to protect against configuration errors, but if some user can modify the CIB without having hacluster-level access to the cluster nodes, it is a potential security concern as well, to avoid the possibility of code injection." +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "The alerts interface is designed to be backward compatible with the external scripts interface used by the ocf:pacemaker:ClusterMon resource, which is now deprecated. To preserve this compatibility, the environment variables passed to alert agents are available prepended with CRM_notify_ as well as CRM_alert_. One break in compatibility is that ClusterMon ran external scripts as the root user, while alert agents are run as the hacluster user." +msgstr "" + diff --git a/doc/Pacemaker_Explained/pot/Ch-Basics.pot b/doc/Pacemaker_Explained/pot/Ch-Basics.pot index 8e852ee568..00234cc197 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Basics.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Basics.pot @@ -1,563 +1,563 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Configuration Basics" msgstr "" #. Tag: title #, no-c-format msgid "Configuration Layout" msgstr "" #. Tag: para #, no-c-format msgid "The cluster is defined by the Cluster Information Base (CIB), which uses XML notation. The simplest CIB, an empty one, looks like this:" msgstr "" #. Tag: title #, no-c-format msgid "An empty configuration" msgstr "" #. Tag: programlisting #, no-c-format msgid "<cib crm_feature_set=\"3.0.7\" validate-with=\"pacemaker-1.2\" admin_epoch=\"1\" epoch=\"0\" num_updates=\"0\">\n" " <configuration>\n" " <crm_config/>\n" " <nodes/>\n" " <resources/>\n" " <constraints/>\n" " </configuration>\n" " <status/>\n" "</cib>" msgstr "" #. Tag: para #, no-c-format msgid "The empty configuration above contains the major sections that make up a CIB:" msgstr "" #. 
Tag: para #, no-c-format msgid "cib: The entire CIB is enclosed with a cib tag. Certain fundamental settings are defined as attributes of this tag." msgstr "" #. Tag: para #, no-c-format msgid "configuration: This section — the primary focus of this document —  contains traditional configuration information such as what resources the cluster serves and the relationships among them." msgstr "" #. Tag: para #, no-c-format msgid "crm_config: cluster-wide configuration options" msgstr "" #. Tag: para #, no-c-format msgid "nodes: the machines that host the cluster" msgstr "" #. Tag: para #, no-c-format msgid "resources: the services run by the cluster" msgstr "" #. Tag: para #, no-c-format msgid "constraints: indications of how resources should be placed" msgstr "" #. Tag: para #, no-c-format msgid "status: This section contains the history of each resource on each node. Based on this data, the cluster can construct the complete current state of the cluster. The authoritative source for this section is the local resource manager (lrmd process) on each cluster node, and the cluster will occasionally repopulate the entire section. For this reason, it is never written to disk, and administrators are advised against modifying it in any way." msgstr "" #. Tag: para #, no-c-format msgid "In this document, configuration settings will be described as properties or options based on how they are defined in the CIB:" msgstr "" #. Tag: para #, no-c-format msgid "Properties are XML attributes of an XML element." msgstr "" #. Tag: para #, no-c-format msgid "Options are name-value pairs expressed as nvpair child elements of an XML element." msgstr "" #. Tag: para #, no-c-format msgid "Normally you will use command-line tools that abstract the XML, so the distinction will be unimportant; both properties and options are cluster settings you can tweak." msgstr "" #. Tag: title #, no-c-format msgid "The Current State of the Cluster" msgstr "" #. 
Tag: para #, no-c-format msgid "Before one starts to configure a cluster, it is worth explaining how to view the finished product. For this purpose we have created the crm_mon utility, which will display the current state of an active cluster. It can show the cluster status by node or by resource and can be used in either single-shot or dynamically-updating mode. There are also modes for displaying a list of the operations performed (grouped by node and resource) as well as information about failures." msgstr "" #. Tag: para #, no-c-format msgid "Using this tool, you can examine the state of the cluster for irregularities and see how it responds when you cause or simulate failures." msgstr "" #. Tag: para #, no-c-format msgid "Details on all the available options can be obtained using the crm_mon --help command." msgstr "" #. Tag: title #, no-c-format msgid "Sample output from crm_mon" msgstr "" #. Tag: screen #, no-c-format msgid " ============\n" " Last updated: Fri Nov 23 15:26:13 2007\n" " Current DC: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec)\n" " 3 Nodes configured.\n" " 5 Resources configured.\n" " ============\n" "\n" " Node: sles-1 (1186dc9a-324d-425a-966e-d757e693dc86): online\n" " 192.168.100.181 (heartbeat::ocf:IPaddr): Started sles-1\n" " 192.168.100.182 (heartbeat:IPaddr): Started sles-1\n" " 192.168.100.183 (heartbeat::ocf:IPaddr): Started sles-1\n" " rsc_sles-1 (heartbeat::ocf:IPaddr): Started sles-1\n" " child_DoFencing:2 (stonith:external/vmware): Started sles-1\n" " Node: sles-2 (02fb99a8-e30e-482f-b3ad-0fb3ce27d088): standby\n" " Node: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec): online\n" " rsc_sles-2 (heartbeat::ocf:IPaddr): Started sles-3\n" " rsc_sles-3 (heartbeat::ocf:IPaddr): Started sles-3\n" " child_DoFencing:0 (stonith:external/vmware): Started sles-3" msgstr "" #. Tag: title #, no-c-format msgid "Sample output from crm_mon -n" msgstr "" #. 
Tag: screen #, no-c-format msgid " ============\n" " Last updated: Fri Nov 23 15:26:13 2007\n" " Current DC: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec)\n" " 3 Nodes configured.\n" " 5 Resources configured.\n" " ============\n" "\n" " Node: sles-1 (1186dc9a-324d-425a-966e-d757e693dc86): online\n" " Node: sles-2 (02fb99a8-e30e-482f-b3ad-0fb3ce27d088): standby\n" " Node: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec): online\n" "\n" " Resource Group: group-1\n" " 192.168.100.181 (heartbeat::ocf:IPaddr): Started sles-1\n" " 192.168.100.182 (heartbeat:IPaddr): Started sles-1\n" " 192.168.100.183 (heartbeat::ocf:IPaddr): Started sles-1\n" " rsc_sles-1 (heartbeat::ocf:IPaddr): Started sles-1\n" " rsc_sles-2 (heartbeat::ocf:IPaddr): Started sles-3\n" " rsc_sles-3 (heartbeat::ocf:IPaddr): Started sles-3\n" " Clone Set: DoFencing\n" " child_DoFencing:0 (stonith:external/vmware): Started sles-3\n" " child_DoFencing:1 (stonith:external/vmware): Stopped\n" " child_DoFencing:2 (stonith:external/vmware): Started sles-1" msgstr "" #. Tag: para #, no-c-format msgid "The DC (Designated Controller) node is where all the decisions are made, and if the current DC fails a new one is elected from the remaining cluster nodes. The choice of DC is of no significance to an administrator beyond the fact that its logs will generally be more interesting." msgstr "" #. Tag: title #, no-c-format msgid "How Should the Configuration be Updated?" msgstr "" #. Tag: para #, no-c-format msgid "There are three basic rules for updating the cluster configuration:" msgstr "" #. Tag: para #, no-c-format msgid "Rule 1 - Never edit the cib.xml file manually. Ever. I’m not making this up." msgstr "" #. Tag: para #, no-c-format msgid "Rule 2 - Read Rule 1 again." msgstr "" #. Tag: para #, no-c-format msgid "Rule 3 - The cluster will notice if you ignored rules 1 & 2 and refuse to use the configuration." msgstr "" #. 
Tag: para #, no-c-format msgid "Now that it is clear how not to update the configuration, we can begin to explain how you should." msgstr "" #. Tag: title #, no-c-format msgid "Editing the CIB Using XML" msgstr "" #. Tag: para #, no-c-format msgid "The most powerful tool for modifying the configuration is the cibadmin command. With cibadmin, you can query, add, remove, update or replace any part of the configuration. All changes take effect immediately, so there is no need to perform a reload-like operation." msgstr "" #. Tag: para #, no-c-format msgid "The simplest way of using cibadmin is to use it to save the current configuration to a temporary file, edit that file with your favorite text or XML editor, and then upload the revised configuration. This process might appear to risk overwriting changes that happen after the initial cibadmin call, but pacemaker will reject any update that is \"too old\". If the CIB is updated in some other fashion after the initial cibadmin, the second cibadmin will be rejected because the version number will be too low." msgstr "" #. Tag: title #, no-c-format msgid "Safely using an editor to modify the cluster configuration" msgstr "" #. Tag: screen #, no-c-format msgid "# cibadmin --query > tmp.xml\n" "# vi tmp.xml\n" "# cibadmin --replace --xml-file tmp.xml" msgstr "" #. Tag: para #, no-c-format msgid "Some of the better XML editors can make use of a Relax NG schema to help make sure any changes you make are valid. The schema describing the configuration can be found in pacemaker.rng, which may be deployed in a location such as /usr/share/pacemaker or /usr/lib/heartbeat depending on your operating system and how you installed the software." msgstr "" #. Tag: para #, no-c-format msgid "If you want to modify just one section of the configuration, you can query and replace just that section to avoid modifying any others." msgstr "" #. 
Tag: title #, no-c-format msgid "Safely using an editor to modify only the resources section" msgstr "" #. Tag: screen #, no-c-format msgid "# cibadmin --query --scope resources > tmp.xml\n" "# vi tmp.xml\n" "# cibadmin --replace --scope resources --xml-file tmp.xml" msgstr "" #. Tag: title #, no-c-format msgid "Quickly Deleting Part of the Configuration" msgstr "" #. Tag: para #, no-c-format msgid "Identify the object you wish to delete by XML tag and id. For example, you might search the CIB for all STONITH-related configuration:" msgstr "" #. Tag: title #, no-c-format msgid "Searching for STONITH-related configuration items" msgstr "" #. Tag: screen #, no-c-format msgid "# cibadmin -Q | grep stonith\n" " <nvpair id=\"cib-bootstrap-options-stonith-action\" name=\"stonith-action\" value=\"reboot\"/>\n" " <nvpair id=\"cib-bootstrap-options-stonith-enabled\" name=\"stonith-enabled\" value=\"1\"/>\n" " <primitive id=\"child_DoFencing\" class=\"stonith\" type=\"external/vmware\">\n" " <lrm_resource id=\"child_DoFencing:0\" type=\"external/vmware\" class=\"stonith\">\n" " <lrm_resource id=\"child_DoFencing:0\" type=\"external/vmware\" class=\"stonith\">\n" " <lrm_resource id=\"child_DoFencing:1\" type=\"external/vmware\" class=\"stonith\">\n" " <lrm_resource id=\"child_DoFencing:0\" type=\"external/vmware\" class=\"stonith\">\n" " <lrm_resource id=\"child_DoFencing:2\" type=\"external/vmware\" class=\"stonith\">\n" " <lrm_resource id=\"child_DoFencing:0\" type=\"external/vmware\" class=\"stonith\">\n" " <lrm_resource id=\"child_DoFencing:3\" type=\"external/vmware\" class=\"stonith\">" msgstr "" #. Tag: para #, no-c-format msgid "If you wanted to delete the primitive tag with id child_DoFencing, you would run:" msgstr "" #. Tag: screen #, no-c-format -msgid "# cibadmin --delete --crm_xml '<primitive id=\"child_DoFencing\"/>'" +msgid "# cibadmin --delete --xml-text '<primitive id=\"child_DoFencing\"/>'" msgstr "" #. 
Tag: title #, no-c-format msgid "Updating the Configuration Without Using XML" msgstr "" #. Tag: para #, no-c-format msgid "Most tasks can be performed with one of the other command-line tools provided with pacemaker, avoiding the need to read or edit XML." msgstr "" #. Tag: para #, no-c-format msgid "To enable STONITH for example, one could run:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --name stonith-enabled --update 1" msgstr "" #. Tag: para #, no-c-format msgid "Or, to check whether somenode is allowed to run resources, there is:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_standby --get-value --node somenode" msgstr "" #. Tag: para #, no-c-format msgid "Or, to find the current location of my-test-rsc, one can use:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource --locate --resource my-test-rsc" msgstr "" #. Tag: para #, no-c-format msgid "Examples of using these tools for specific cases will be given throughout this document where appropriate." msgstr "" #. Tag: para #, no-c-format msgid "Old versions of pacemaker (1.0.3 and earlier) had different command-line tool syntax. If you are using an older version, check your installed manual pages for the proper syntax to use." msgstr "" #. Tag: title #, no-c-format msgid "Making Configuration Changes in a Sandbox" msgstr "" #. Tag: para #, no-c-format msgid "Often it is desirable to preview the effects of a series of changes before updating the configuration atomically. For this purpose we have created crm_shadow which creates a \"shadow\" copy of the configuration and arranges for all the command line tools to use it." msgstr "" #. Tag: para #, no-c-format msgid "To begin, simply invoke crm_shadow --create with the name of a configuration to create (shadow copies are identified with a name, making it possible to have more than one), and follow the simple on-screen instructions." msgstr "" #.
Tag: para #, no-c-format msgid "Read this section and the on-screen instructions carefully; failure to do so could result in destroying the cluster’s active configuration!" msgstr "" #. Tag: title #, no-c-format msgid "Creating and displaying the active sandbox" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_shadow --create test\n" "Setting up shadow instance\n" "Type Ctrl-D to exit the crm_shadow shell\n" "shadow[test]:\n" "shadow[test] # crm_shadow --which\n" "test" msgstr "" #. Tag: para #, no-c-format msgid "From this point on, all cluster commands will automatically use the shadow copy instead of talking to the cluster’s active configuration. Once you have finished experimenting, you can either make the changes active via the --commit option, or discard them using the --delete option. Again, be sure to follow the on-screen instructions carefully!" msgstr "" #. Tag: para #, no-c-format msgid "For a full list of crm_shadow options and commands, invoke it with the --help option." msgstr "" #. Tag: title #, no-c-format msgid "Using a sandbox to make multiple changes atomically, discard them and verify the real configuration is untouched" msgstr "" #. 
Tag: screen #, no-c-format msgid " shadow[test] # crm_failcount -G -r rsc_c001n01\n" " name=fail-count-rsc_c001n01 value=0\n" " shadow[test] # crm_standby -v on -N c001n02\n" " shadow[test] # crm_standby -G -N c001n02\n" " name=c001n02 scope=nodes value=on\n" " shadow[test] # cibadmin --erase --force\n" " shadow[test] # cibadmin --query\n" " <cib cib_feature_revision=\"1\" validate-with=\"pacemaker-1.0\" admin_epoch=\"0\" crm_feature_set=\"3.0\" have-quorum=\"1\" epoch=\"112\"\n" " dc-uuid=\"c001n01\" num_updates=\"1\" cib-last-written=\"Fri Jun 27 12:17:10 2008\">\n" " <configuration>\n" " <crm_config/>\n" " <nodes/>\n" " <resources/>\n" " <constraints/>\n" " </configuration>\n" " <status/>\n" " </cib>\n" " shadow[test] # crm_shadow --delete test --force\n" " Now type Ctrl-D to exit the crm_shadow shell\n" " shadow[test] # exit\n" " # crm_shadow --which\n" " No active shadow configuration defined\n" " # cibadmin -Q\n" " <cib cib_feature_revision=\"1\" validate-with=\"pacemaker-1.0\" admin_epoch=\"0\" crm_feature_set=\"3.0\" have-quorum=\"1\" epoch=\"110\"\n" " dc-uuid=\"c001n01\" num_updates=\"551\">\n" " <configuration>\n" " <crm_config>\n" " <cluster_property_set id=\"cib-bootstrap-options\">\n" " <nvpair id=\"cib-bootstrap-1\" name=\"stonith-enabled\" value=\"1\"/>\n" " <nvpair id=\"cib-bootstrap-2\" name=\"pe-input-series-max\" value=\"30000\"/>" msgstr "" #. Tag: title #, no-c-format msgid "Testing Your Configuration Changes" msgstr "" #. Tag: para #, no-c-format msgid "We saw previously how to make a series of changes to a \"shadow\" copy of the configuration. Before loading the changes back into the cluster (e.g. crm_shadow --commit mytest --force), it is often advisable to simulate the effect of the changes with crm_simulate. For example:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_simulate --live-check -VVVVV --save-graph tmp.graph --save-dotfile tmp.dot" msgstr "" #. 
Tag: para #, no-c-format msgid "This tool uses the same library as the live cluster to show what it would have done given the supplied input. Its output, in addition to a significant amount of logging, is stored in two files tmp.graph and tmp.dot. Both files are representations of the same thing: the cluster’s response to your changes." msgstr "" #. Tag: para #, no-c-format msgid "The graph file stores the complete transition from the existing cluster state to your desired new state, containing a list of all the actions, their parameters and their pre-requisites. Because the transition graph is not terribly easy to read, the tool also generates a Graphviz dot-file representing the same information. (Graphviz is graph visualization software; see http://www.graphviz.org/ for details.)" msgstr "" #. Tag: para #, no-c-format msgid "For information on the options supported by crm_simulate, use its --help option." msgstr "" #. Tag: title #, no-c-format msgid "Interpreting the Graphviz output" msgstr "" #. Tag: para #, no-c-format msgid "Arrows indicate ordering dependencies" msgstr "" #. Tag: para #, no-c-format msgid "Dashed arrows indicate dependencies that are not present in the transition graph" msgstr "" #. Tag: para #, no-c-format msgid "Actions with a dashed border of any color do not form part of the transition graph" msgstr "" #. Tag: para #, no-c-format msgid "Actions with a green border form part of the transition graph" msgstr "" #. Tag: para #, no-c-format msgid "Actions with a red border are ones the cluster would like to execute but cannot run" msgstr "" #. Tag: para #, no-c-format msgid "Actions with a blue border are ones the cluster does not feel need to be executed" msgstr "" #. Tag: para #, no-c-format msgid "Actions with orange text are pseudo/pretend actions that the cluster uses to simplify the graph" msgstr "" #. Tag: para #, no-c-format msgid "Actions with black text are sent to the LRM" msgstr "" #.
Tag: para #, no-c-format msgid "Resource actions have text of the form rsc_action_interval node" msgstr "" #. Tag: para #, no-c-format msgid "Any action depending on an action with a red border will not be able to execute." msgstr "" #. Tag: para #, no-c-format msgid "Loops are really bad. Please report them to the development team." msgstr "" #. Tag: title #, no-c-format msgid "Small Cluster Transition" msgstr "" #. Tag: para #, no-c-format msgid "In the above example, it appears that a new node, pcmk-2, has come online and that the cluster is checking to make sure rsc1, rsc2 and rsc3 are not already running there (Indicated by the rscN_monitor_0 entries). Once it did that, and assuming the resources were not active there, it would have liked to stop rsc1 and rsc2 on pcmk-1 and move them to pcmk-2. However, there appears to be some problem and the cluster cannot or is not permitted to perform the stop actions which implies it also cannot perform the start actions. For some reason the cluster does not want to start rsc3 anywhere." msgstr "" #. Tag: title #, no-c-format msgid "Complex Cluster Transition" msgstr "" #. Tag: title #, no-c-format msgid "Do I Need to Update the Configuration on All Cluster Nodes?" msgstr "" #. Tag: para #, no-c-format msgid "No. Any changes are immediately synchronized to the other active members of the cluster." msgstr "" #. Tag: para #, no-c-format msgid "To reduce bandwidth, the cluster only broadcasts the incremental updates that result from your changes and uses MD5 checksums to ensure that each copy is completely consistent." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Constraints.pot b/doc/Pacemaker_Explained/pot/Ch-Constraints.pot index 44d103e956..25c4ffd17c 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Constraints.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Constraints.pot @@ -1,1046 +1,1071 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Resource Constraints" msgstr "" #. Tag: para #, no-c-format msgid " ResourceConstraints Constraints " msgstr "" #. Tag: title #, no-c-format msgid "Scores" msgstr "" #. Tag: para #, no-c-format msgid "Scores of all kinds are integral to how the cluster works. Practically everything from moving a resource to deciding which resource to stop in a degraded cluster is achieved by manipulating scores in some way." msgstr "" #. Tag: para #, no-c-format msgid "Scores are calculated per resource and node. Any node with a negative score for a resource can’t run that resource. The cluster places a resource on the node with the highest score for it." msgstr "" #. Tag: title #, no-c-format msgid "Infinity Math" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker implements INFINITY (or equivalently, +INFINITY) internally as a score of 1,000,000. Addition and subtraction with it follow these three basic rules:" msgstr "" #. Tag: para #, no-c-format msgid "Any value + INFINITY = INFINITY" msgstr "" #. Tag: para #, no-c-format msgid "Any value - INFINITY = -INFINITY" msgstr "" #. Tag: para #, no-c-format msgid "INFINITY - INFINITY = -INFINITY" msgstr "" #. Tag: para #, no-c-format msgid "What if you want to use a score higher than 1,000,000? Typically this possibility arises when someone wants to base the score on some external metric that might go above 1,000,000." msgstr "" #. Tag: para #, no-c-format msgid "The short answer is you can’t." msgstr "" #. 
Tag: para #, no-c-format msgid "The long answer is it is sometimes possible to work around this limitation creatively. You may be able to set the score to some computed value based on the external metric rather than use the metric directly. For nodes, you can store the metric as a node attribute, and query the attribute when computing the score (possibly as part of a custom resource agent)." msgstr "" #. Tag: title #, no-c-format msgid "Deciding Which Nodes a Resource Can Run On" msgstr "" #. Tag: para #, no-c-format msgid " Location Constraints ResourceConstraintsLocation ConstraintsLocation Location Location constraints tell the cluster which nodes a resource can run on." msgstr "" #. Tag: para #, no-c-format msgid "There are two alternative strategies. One way is to say that, by default, resources can run anywhere, and then the location constraints specify nodes that are not allowed (an opt-out cluster). The other way is to start with nothing able to run anywhere, and use location constraints to selectively enable allowed nodes (an opt-in cluster)." msgstr "" #. Tag: para #, no-c-format msgid "Whether you should choose opt-in or opt-out depends on your personal preference and the make-up of your cluster. If most of your resources can run on most of the nodes, then an opt-out arrangement is likely to result in a simpler configuration. On the other hand, if most resources can only run on a small subset of nodes, an opt-in configuration might be simpler." msgstr "" #. Tag: title #, no-c-format msgid "Location Properties" msgstr "" #. Tag: title #, no-c-format msgid "Properties of a rsc_location Constraint" msgstr "" #. Tag: entry #, no-c-format msgid "Field" msgstr "" #. Tag: entry #, no-c-format msgid "Default" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "id" msgstr "" #.
Tag: para #, no-c-format msgid "A unique name for the constraint idLocation Constraints Location Constraints ConstraintsLocationid Locationid id " msgstr "" #. Tag: para #, no-c-format msgid "rsc" msgstr "" #. Tag: para #, no-c-format -msgid "A resource name rscLocation Constraints Location Constraints ConstraintsLocationrsc Locationrsc rsc " +msgid "The name of the resource to which this constraint applies rscLocation Constraints Location Constraints ConstraintsLocationrsc Locationrsc rsc " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "rsc-pattern" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "A regular expression matching the names of resources to which this constraint applies, if rsc is not specified; if the regular expression contains submatches and the constraint is governed by a rule (see ), the submatches can be referenced as %0 through %9 in the rule’s score-attribute or a rule expression’s attribute (since 1.1.16) rsc-patternLocation Constraints Location Constraints ConstraintsLocationrsc-pattern Locationrsc-pattern rsc-pattern " msgstr "" #. Tag: para #, no-c-format msgid "node" msgstr "" #. Tag: para #, no-c-format msgid "A node’s name nodeLocation Constraints Location Constraints ConstraintsLocationnode Locationnode node " msgstr "" #. Tag: para #, no-c-format msgid "score" msgstr "" #. Tag: para #, no-c-format msgid "Positive values indicate the resource should run on this node. Negative values indicate the resource should not run on this node. Values of +/- INFINITY change \"should\"/\"should not\" to \"must\"/\"must not\". scoreLocation Constraints Location Constraints ConstraintsLocationscore Locationscore score " msgstr "" #. Tag: para #, no-c-format msgid "resource-discovery" msgstr "" #. Tag: para #, no-c-format msgid "always" msgstr "" #. Tag: para #, no-c-format -msgid "Whether Pacemaker should perform resource discovery (that is, check whether the resource is already running) for this resource on this node. 
This should normally be left as the default, so that rogue instances of a service can be stopped when they are running where they are not supposed to be. However, there are two situations where disabling resource discovery is a good idea: when a service is not installed on a node, discovery might return an error (properly written OCF agents will not, so this is usually only seen with other agent types); and when Pacemaker Remote is used to scale a cluster to hundreds of nodes, limiting resource discovery to allowed nodes can significantly boost performance." +msgid "Whether Pacemaker should perform resource discovery (that is, check whether the resource is already running) for this resource on this node. This should normally be left as the default, so that rogue instances of a service can be stopped when they are running where they are not supposed to be. However, there are two situations where disabling resource discovery is a good idea: when a service is not installed on a node, discovery might return an error (properly written OCF agents will not, so this is usually only seen with other agent types); and when Pacemaker Remote is used to scale a cluster to hundreds of nodes, limiting resource discovery to allowed nodes can significantly boost performance. (since 1.1.13)" msgstr "" #. Tag: para #, no-c-format msgid "always: Always perform resource discovery for the specified resource on this node." msgstr "" #. Tag: para #, no-c-format msgid "never: Never perform resource discovery for the specified resource on this node. This option should generally be used with a -INFINITY score, although that is not strictly required." msgstr "" #. Tag: para #, no-c-format msgid "exclusive: Perform resource discovery for the specified resource only on this node (and other nodes similarly marked as exclusive). Multiple location constraints using exclusive discovery for the same resource across different nodes creates a subset of nodes resource-discovery is exclusive to. 
If a resource is marked for exclusive discovery on one or more nodes, that resource is only allowed to be placed within that subset of nodes." msgstr "" #. Tag: para #, no-c-format msgid " Resource DiscoveryLocation Constraints Location Constraints ConstraintsLocationResource Discovery LocationResource Discovery Resource Discovery " msgstr "" #. Tag: para #, no-c-format msgid "Setting resource-discovery to never or exclusive removes Pacemaker’s ability to detect and stop unwanted instances of a service running where it’s not supposed to be. It is up to the system administrator (you!) to make sure that the service can never be active on nodes without resource-discovery (such as by leaving the relevant software uninstalled)." msgstr "" #. Tag: title #, no-c-format msgid "Asymmetrical \"Opt-In\" Clusters" msgstr "" #. Tag: para #, no-c-format msgid " Asymmetrical Opt-In Clusters Cluster TypeAsymmetrical Opt-In Asymmetrical Opt-In " msgstr "" #. Tag: para #, no-c-format msgid "To create an opt-in cluster, start by preventing resources from running anywhere by default:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --name symmetric-cluster --update false" msgstr "" #. Tag: para #, no-c-format msgid "Then start enabling nodes. The following fragment says that the web server prefers sles-1, the database prefers sles-2 and both can fail over to sles-3 if their most preferred node fails." msgstr "" #. Tag: title #, no-c-format msgid "Opt-in location constraints for two resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_location id=\"loc-1\" rsc=\"Webserver\" node=\"sles-1\" score=\"200\"/>\n" " <rsc_location id=\"loc-2\" rsc=\"Webserver\" node=\"sles-3\" score=\"0\"/>\n" " <rsc_location id=\"loc-3\" rsc=\"Database\" node=\"sles-2\" score=\"200\"/>\n" " <rsc_location id=\"loc-4\" rsc=\"Database\" node=\"sles-3\" score=\"0\"/>\n" "</constraints>" msgstr "" #. 
Tag: title #, no-c-format msgid "Symmetrical \"Opt-Out\" Clusters" msgstr "" #. Tag: para #, no-c-format msgid " Symmetrical Opt-Out Clusters Cluster TypeSymmetrical Opt-Out Symmetrical Opt-Out " msgstr "" #. Tag: para #, no-c-format msgid "To create an opt-out cluster, start by allowing resources to run anywhere by default:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --name symmetric-cluster --update true" msgstr "" #. Tag: para #, no-c-format msgid "Then start disabling nodes. The following fragment is the equivalent of the above opt-in configuration." msgstr "" #. Tag: title #, no-c-format msgid "Opt-out location constraints for two resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_location id=\"loc-1\" rsc=\"Webserver\" node=\"sles-1\" score=\"200\"/>\n" " <rsc_location id=\"loc-2-dont-run\" rsc=\"Webserver\" node=\"sles-2\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"loc-3-dont-run\" rsc=\"Database\" node=\"sles-1\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"loc-4\" rsc=\"Database\" node=\"sles-2\" score=\"200\"/>\n" "</constraints>" msgstr "" #. Tag: title #, no-c-format msgid "What if Two Nodes Have the Same Score" msgstr "" #. Tag: para #, no-c-format msgid "If two nodes have the same score, then the cluster will choose one. This choice may seem random and may not be what was intended, however the cluster was not given enough information to know any better." msgstr "" #. Tag: title #, no-c-format msgid "Constraints where a resource prefers two nodes equally" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_location id=\"loc-1\" rsc=\"Webserver\" node=\"sles-1\" score=\"INFINITY\"/>\n" " <rsc_location id=\"loc-2\" rsc=\"Webserver\" node=\"sles-2\" score=\"INFINITY\"/>\n" " <rsc_location id=\"loc-3\" rsc=\"Database\" node=\"sles-1\" score=\"500\"/>\n" " <rsc_location id=\"loc-4\" rsc=\"Database\" node=\"sles-2\" score=\"300\"/>\n" " <rsc_location id=\"loc-5\" rsc=\"Database\" node=\"sles-2\" score=\"200\"/>\n" "</constraints>" msgstr "" #. Tag: para #, no-c-format msgid "In the example above, assuming no other constraints and an inactive cluster, Webserver would probably be placed on sles-1 and Database on sles-2. It would likely have placed Webserver based on the node’s uname and Database based on the desire to spread the resource load evenly across the cluster. However other factors can also be involved in more complex configurations." msgstr "" #. Tag: title #, no-c-format msgid "Specifying the Order in which Resources Should Start/Stop" msgstr "" #. Tag: para #, no-c-format msgid " ResourceConstraintsOrdering ConstraintsOrdering Ordering ResourceStart Order Start Order Ordering Constraints " msgstr "" #. Tag: para #, no-c-format msgid "Ordering constraints tell the cluster the order in which resources should start." msgstr "" #. Tag: para #, no-c-format msgid "Ordering constraints affect only the ordering of resources; they do not require that the resources be placed on the same node. If you want resources to be started on the same node and in a specific order, you need both an ordering constraint and a colocation constraint (see ), or alternatively, a group (see )." msgstr "" #. Tag: title #, no-c-format msgid "Ordering Properties" msgstr "" #. Tag: title #, no-c-format msgid "Properties of a rsc_order Constraint" msgstr "" #. Tag: para #, no-c-format msgid "A unique name for the constraint idOrdering Constraints Ordering Constraints ConstraintsOrderingid Orderingid id " msgstr "" #. 
Tag: para #, no-c-format msgid "first" msgstr "" #. Tag: para #, no-c-format msgid "Name of the resource that the then resource depends on firstOrdering Constraints Ordering Constraints ConstraintsOrderingfirst Orderingfirst first " msgstr "" #. Tag: para #, no-c-format msgid "then" msgstr "" #. Tag: para #, no-c-format msgid "Name of the dependent resource thenOrdering Constraints Ordering Constraints ConstraintsOrderingthen Orderingthen then " msgstr "" #. Tag: para #, no-c-format msgid "first-action" msgstr "" #. Tag: para #, no-c-format msgid "start" msgstr "" #. Tag: para #, no-c-format msgid "The action that the first resource must complete before then-action can be initiated for the then resource. Allowed values: start, stop, promote, demote. first-actionOrdering Constraints Ordering Constraints ConstraintsOrderingfirst-action Orderingfirst-action first-action " msgstr "" #. Tag: para #, no-c-format msgid "then-action" msgstr "" #. Tag: para #, no-c-format msgid "value of first-action" msgstr "" #. Tag: para #, no-c-format msgid "The action that the then resource can execute only after the first-action on the first resource has completed. Allowed values: start, stop, promote, demote. then-actionOrdering Constraints Ordering Constraints ConstraintsOrderingthen-action Orderingthen-action then-action " msgstr "" #. Tag: para #, no-c-format msgid "kind" msgstr "" #. Tag: para #, no-c-format msgid "How to enforce the constraint. Allowed values:" msgstr "" #. Tag: para #, no-c-format msgid "Optional: Just a suggestion. Only applies if both resources are executing the specified actions. Any change in state by the first resource will have no effect on the then resource." msgstr "" #. Tag: para #, no-c-format msgid "Mandatory: Always. If first does not perform first-action, then will not be allowed to perform then-action. If first is restarted, then (if running) will be stopped beforehand and started afterward." msgstr "" #.
Tag: para #, no-c-format msgid "Serialize: Ensure that no two stop/start actions occur concurrently for the resources. First and then can start in either order, but one must complete starting before the other can be started. A typical use case is when resource start-up puts a high load on the host." msgstr "" #. Tag: para #, no-c-format msgid " kindOrdering Constraints Ordering Constraints ConstraintsOrderingkind Orderingkind kind " msgstr "" #. Tag: para #, no-c-format msgid "symmetrical" msgstr "" #. Tag: para #, no-c-format msgid "TRUE" msgstr "" #. Tag: para #, no-c-format msgid "If true, the reverse of the constraint applies for the opposite action (for example, if B starts after A starts, then B stops before A stops). symmetricalOrdering Constraints Ordering Constraints Ordering Constraintssymmetrical symmetrical " msgstr "" #. Tag: para #, no-c-format msgid "Promote and demote apply to the master role of multi-state resources." msgstr "" #. Tag: title #, no-c-format msgid "Optional and mandatory ordering" msgstr "" #. Tag: para #, no-c-format msgid "Here is an example of ordering constraints where Database must start before Webserver, and IP should start before Webserver if they both need to be started:" msgstr "" #. Tag: title #, no-c-format msgid "Optional and mandatory ordering constraints" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" "<rsc_order id=\"order-1\" first=\"IP\" then=\"Webserver\" kind=\"Optional\"/>\n" "<rsc_order id=\"order-2\" first=\"Database\" then=\"Webserver\" kind=\"Mandatory\" />\n" "</constraints>" msgstr "" #. Tag: para #, no-c-format msgid "Because the above example lets symmetrical default to TRUE, Webserver must be stopped before Database can be stopped, and Webserver should be stopped before IP if they both need to be stopped." msgstr "" #. Tag: title #, no-c-format msgid "Placing Resources Relative to other Resources" msgstr "" #. 
Tag: para #, no-c-format msgid " ResourceConstraintsColocation ConstraintsColocation Colocation ResourceLocation Relative to other Resources Location Relative to other Resources Colocation constraints tell the cluster that the location of one resource depends on the location of another one." msgstr "" #. Tag: para #, no-c-format msgid "Colocation has an important side-effect: it affects the order in which resources are assigned to a node. Think about it: You can’t place A relative to B unless you know where B is. While the human brain is sophisticated enough to read the constraint in any order and choose the correct one depending on the situation, the cluster is not quite so smart. Yet. " msgstr "" #. Tag: para #, no-c-format msgid "So when you are creating colocation constraints, it is important to consider whether you should colocate A with B, or B with A." msgstr "" #. Tag: para #, no-c-format msgid "Another thing to keep in mind is that, assuming A is colocated with B, the cluster will take into account A’s preferences when deciding which node to choose for B." msgstr "" #. Tag: para #, no-c-format msgid "For a detailed look at exactly how this occurs, see Colocation Explained." msgstr "" #. Tag: para #, no-c-format msgid "Colocation constraints affect only the placement of resources; they do not require that the resources be started in a particular order. If you want resources to be started on the same node and in a specific order, you need both an ordering constraint (see ) and a colocation constraint, or alternatively, a group (see )." msgstr "" #. Tag: title #, no-c-format msgid "Colocation Properties" msgstr "" #. Tag: title #, no-c-format msgid "Properties of a rsc_colocation Constraint" msgstr "" #. Tag: para #, no-c-format msgid "A unique name for the constraint. idColocation Constraints Colocation Constraints ConstraintsColocationid Colocationid id " msgstr "" #. 
Tag: para #, no-c-format msgid "The name of a resource that should be located relative to with-rsc. rscColocation Constraints Colocation Constraints ConstraintsColocationrsc Colocationrsc rsc " msgstr "" #. Tag: para #, no-c-format msgid "with-rsc" msgstr "" #. Tag: para #, no-c-format msgid "The name of the resource used as the colocation target. The cluster will decide where to put this resource first and then decide where to put rsc. with-rscColocation Constraints Colocation Constraints ConstraintsColocationwith-rsc Colocationwith-rsc with-rsc " msgstr "" #. Tag: para #, no-c-format msgid "Positive values indicate the resources should run on the same node. Negative values indicate the resources should run on different nodes. Values of +/- INFINITY change \"should\" to \"must\". scoreColocation Constraints Colocation Constraints ConstraintsColocationscore Colocationscore score " msgstr "" #. Tag: title #, no-c-format msgid "Mandatory Placement" msgstr "" #. Tag: para #, no-c-format msgid "Mandatory placement occurs when the constraint’s score is +INFINITY or -INFINITY. In such cases, if the constraint can’t be satisfied, then the rsc resource is not permitted to run. For score=INFINITY, this includes cases where the with-rsc resource is not active." msgstr "" #. Tag: para #, no-c-format msgid "If you need resource A to always run on the same machine as resource B, you would add the following constraint:" msgstr "" #. Tag: title #, no-c-format msgid "Mandatory colocation constraint for two resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_colocation id=\"colocate\" rsc=\"A\" with-rsc=\"B\" score=\"INFINITY\"/>" msgstr "" #. Tag: para #, no-c-format msgid "Remember, because INFINITY was used, if B can’t run on any of the cluster nodes (for whatever reason) then A will not be allowed to run. Whether A is running or not has no effect on B." msgstr "" #. 
Tag: para #, no-c-format msgid "Alternatively, you may want the opposite — that A cannot run on the same machine as B. In this case, use score=\"-INFINITY\"." msgstr "" #. Tag: title #, no-c-format msgid "Mandatory anti-colocation constraint for two resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_colocation id=\"anti-colocate\" rsc=\"A\" with-rsc=\"B\" score=\"-INFINITY\"/>" msgstr "" #. Tag: para #, no-c-format msgid "Again, by specifying -INFINITY, the constraint is binding. So if the only place left to run is where B already is, then A may not run anywhere." msgstr "" #. Tag: para #, no-c-format msgid "As with INFINITY, B can run even if A is stopped. However, in this case A also can run if B is stopped, because it still meets the constraint of A and B not running on the same node." msgstr "" #. Tag: title #, no-c-format msgid "Advisory Placement" msgstr "" #. Tag: para #, no-c-format msgid "If mandatory placement is about \"must\" and \"must not\", then advisory placement is the \"I’d prefer if\" alternative. For constraints with scores greater than -INFINITY and less than INFINITY, the cluster will try to accommodate your wishes but may ignore them if the alternative is to stop some of the cluster resources." msgstr "" #. Tag: para #, no-c-format msgid "As in life, where if enough people prefer something it effectively becomes mandatory, advisory colocation constraints can combine with other elements of the configuration to behave as if they were mandatory." msgstr "" #. Tag: title #, no-c-format msgid "Advisory colocation constraint for two resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_colocation id=\"colocate-maybe\" rsc=\"A\" with-rsc=\"B\" score=\"500\"/>" msgstr "" #. Tag: title #, no-c-format msgid "Resource Sets" msgstr "" #. Tag: para #, no-c-format msgid "Resource sets allow multiple resources to be affected by a single constraint." msgstr "" #. 
Tag: title #, no-c-format msgid "A set of 3 resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<resource_set id=\"resource-set-example\">\n" " <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" " <resource_ref id=\"C\"/>\n" "</resource_set>" msgstr "" #. Tag: para #, no-c-format msgid "Resource sets are valid inside rsc_location, rsc_order (see ), rsc_colocation (see ), and rsc_ticket (see ) constraints." msgstr "" #. Tag: para #, no-c-format msgid "A resource set has a number of properties that can be set, though not all have an effect in all contexts." msgstr "" #. Tag: title #, no-c-format msgid "Properties of a resource_set" msgstr "" #. Tag: para #, no-c-format msgid "A unique name for the set idResource Sets Resource Sets ConstraintsResource Setsid Resource Setsid id " msgstr "" #. Tag: para #, no-c-format msgid "sequential" msgstr "" #. Tag: para #, no-c-format msgid "true" msgstr "" #. Tag: para #, no-c-format msgid "Whether the members of the set must be acted on in order. Meaningful within rsc_order and rsc_colocation. sequentialResource Sets Resource Sets ConstraintsResource Setssequential Resource Setssequential sequential " msgstr "" #. Tag: para #, no-c-format msgid "require-all" msgstr "" #. Tag: para #, no-c-format -msgid "Whether all members of the set must be active before continuing. Meaningful within rsc_order. require-allResource Sets Resource Sets ConstraintsResource Setsrequire-all Resource Setsrequire-all require-all " +msgid "Whether all members of the set must be active before continuing. Meaningful within rsc_order. (since 1.1.13) require-allResource Sets Resource Sets ConstraintsResource Setsrequire-all Resource Setsrequire-all require-all " msgstr "" #. Tag: para #, no-c-format msgid "role" msgstr "" #. Tag: para #, no-c-format msgid "Limit the effect of the constraint to the specified role. Meaningful within rsc_location, rsc_colocation and rsc_ticket. 
roleResource Sets Resource Sets ConstraintsResource Setsrole Resource Setsrole role " msgstr "" #. Tag: para #, no-c-format msgid "action" msgstr "" #. Tag: para #, no-c-format msgid "Limit the effect of the constraint to the specified action. Meaningful within rsc_order. actionResource Sets Resource Sets ConstraintsResource Setsaction Resource Setsaction action " msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. Use a specific score for this set within the constraint. scoreResource Sets Resource Sets ConstraintsResource Setsscore Resource Setsscore score " msgstr "" #. Tag: title #, no-c-format msgid "Ordering Sets of Resources" msgstr "" #. Tag: para #, no-c-format msgid "A common situation is for an administrator to create a chain of ordered resources, such as:" msgstr "" #. Tag: title #, no-c-format msgid "A chain of ordered resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_order id=\"order-1\" first=\"A\" then=\"B\" />\n" " <rsc_order id=\"order-2\" first=\"B\" then=\"C\" />\n" " <rsc_order id=\"order-3\" first=\"C\" then=\"D\" />\n" "</constraints>" msgstr "" #. Tag: title #, no-c-format msgid "Visual representation of the four resources' start order for the above constraints" msgstr "" #. Tag: title #, no-c-format msgid "Ordered Set" msgstr "" #. Tag: para #, no-c-format msgid "To simplify this situation, resource sets (see ) can be used within ordering constraints:" msgstr "" #. Tag: title #, no-c-format msgid "A chain of ordered resources expressed as a set" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_order id=\"order-1\">\n" " <resource_set id=\"ordered-set-example\" sequential=\"true\">\n" " <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" " <resource_ref id=\"C\"/>\n" " <resource_ref id=\"D\"/>\n" " </resource_set>\n" " </rsc_order>\n" "</constraints>" msgstr "" #. 
Tag: para #, no-c-format msgid "While the set-based format is not less verbose, it is significantly easier to get right and maintain." msgstr "" #. Tag: para #, no-c-format msgid "If you use a higher-level tool, pay attention to how it exposes this functionality. Depending on the tool, creating a set A B may be equivalent to A then B, or B then A." msgstr "" #. Tag: title #, no-c-format msgid "Ordering Multiple Sets" msgstr "" #. Tag: para #, no-c-format -msgid "The syntax can be expanded to allow ordered sets of (un)ordered resources. In the example below, A and B can both start in parallel, as can C and D, however C and D can only start once both A and B are active." +msgid "The syntax can be expanded to allow sets of resources to be ordered relative to each other, where the members of each individual set may be ordered or unordered (controlled by the sequential property). In the example below, A and B can both start in parallel, as can C and D, however C and D can only start once both A and B are active." msgstr "" #. Tag: title #, no-c-format msgid "Ordered sets of unordered resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_order id=\"order-1\">\n" " <resource_set id=\"ordered-set-1\" sequential=\"false\">\n" " <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" " </resource_set>\n" " <resource_set id=\"ordered-set-2\" sequential=\"false\">\n" " <resource_ref id=\"C\"/>\n" " <resource_ref id=\"D\"/>\n" " </resource_set>\n" " </rsc_order>\n" " </constraints>" msgstr "" #. Tag: title #, no-c-format msgid "Visual representation of the start order for two ordered sets of unordered resources" msgstr "" #. Tag: para #, no-c-format msgid "Of course either set — or both sets — of resources can also be internally ordered (by setting sequential=\"true\") and there is no limit to the number of sets that can be specified." msgstr "" #. 
Tag: title #, no-c-format msgid "Advanced use of set ordering - Three ordered sets, two of which are internally unordered" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_order id=\"order-1\">\n" " <resource_set id=\"ordered-set-1\" sequential=\"false\">\n" " <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" " </resource_set>\n" " <resource_set id=\"ordered-set-2\" sequential=\"true\">\n" " <resource_ref id=\"C\"/>\n" " <resource_ref id=\"D\"/>\n" " </resource_set>\n" " <resource_set id=\"ordered-set-3\" sequential=\"false\">\n" " <resource_ref id=\"E\"/>\n" " <resource_ref id=\"F\"/>\n" " </resource_set>\n" " </rsc_order>\n" "</constraints>" msgstr "" #. Tag: title #, no-c-format msgid "Visual representation of the start order for the three sets defined above" msgstr "" #. Tag: para #, no-c-format msgid "An ordered set with sequential=false makes sense only if there is another set in the constraint. Otherwise, the constraint has no effect." msgstr "" #. Tag: title #, no-c-format msgid "Resource Set OR Logic" msgstr "" #. Tag: para #, no-c-format msgid "The unordered set logic discussed so far has all been \"AND\" logic. To illustrate this take the 3 resource set figure in the previous section. Those sets can be expressed, (A and B) then (C) then (D) then (E and F)." msgstr "" #. Tag: para #, no-c-format msgid "Say for example we want to change the first set, (A and B), to use \"OR\" logic so the sets look like this: (A or B) then (C) then (D) then (E and F). This functionality can be achieved through the use of the require-all option. This option defaults to TRUE which is why the \"AND\" logic is used by default. Setting require-all=false means only one resource in the set needs to be started before continuing on to the next set." msgstr "" #. Tag: title #, no-c-format msgid "Resource Set \"OR\" logic: Three ordered sets, where the first set is internally unordered with \"OR\" logic" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_order id=\"order-1\">\n" " <resource_set id=\"ordered-set-1\" sequential=\"false\" require-all=\"false\">\n" " <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" " </resource_set>\n" " <resource_set id=\"ordered-set-2\" sequential=\"true\">\n" " <resource_ref id=\"C\"/>\n" " <resource_ref id=\"D\"/>\n" " </resource_set>\n" " <resource_set id=\"ordered-set-3\" sequential=\"false\">\n" " <resource_ref id=\"E\"/>\n" " <resource_ref id=\"F\"/>\n" " </resource_set>\n" " </rsc_order>\n" "</constraints>" msgstr "" #. Tag: para #, no-c-format msgid "An ordered set with require-all=false makes sense only in conjunction with sequential=false. Think of it like this: sequential=false modifies the set to be an unordered set using \"AND\" logic by default, and adding require-all=false flips the unordered set’s \"AND\" logic to \"OR\" logic." msgstr "" #. Tag: title #, no-c-format msgid "Colocating Sets of Resources" msgstr "" #. Tag: para #, no-c-format msgid "Another common situation is for an administrator to create a set of colocated resources." msgstr "" #. Tag: para #, no-c-format msgid "One way to do this would be to define a resource group (see ), but that cannot always accurately express the desired state." msgstr "" #. Tag: para #, no-c-format msgid "Another way would be to define each relationship as an individual constraint, but that causes a constraint explosion as the number of resources and combinations grow. An example of this approach:" msgstr "" #. Tag: title #, no-c-format msgid "Chain of colocated resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_colocation id=\"coloc-1\" rsc=\"D\" with-rsc=\"C\" score=\"INFINITY\"/>\n" " <rsc_colocation id=\"coloc-2\" rsc=\"C\" with-rsc=\"B\" score=\"INFINITY\"/>\n" " <rsc_colocation id=\"coloc-3\" rsc=\"B\" with-rsc=\"A\" score=\"INFINITY\"/>\n" "</constraints>" msgstr "" #. 
Tag: para #, no-c-format msgid "To make things easier, resource sets (see ) can be used within colocation constraints. As with the chained version, a resource that can’t be active prevents any resource that must be colocated with it from being active. For example, if B is not able to run, then both C and by inference D must also remain stopped. Here is an example resource_set:" msgstr "" #. Tag: title #, no-c-format msgid "Equivalent colocation chain expressed using resource_set" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_colocation id=\"coloc-1\" score=\"INFINITY\" >\n" " <resource_set id=\"colocated-set-example\" sequential=\"true\">\n" " <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" " <resource_ref id=\"C\"/>\n" " <resource_ref id=\"D\"/>\n" " </resource_set>\n" " </rsc_colocation>\n" "</constraints>" msgstr "" #. Tag: para #, no-c-format msgid "If you use a higher-level tool, pay attention to how it exposes this functionality. Depending on the tool, creating a set A B may be equivalent to A with B, or B with A." msgstr "" #. Tag: para #, no-c-format -msgid "This notation can also be used to tell the cluster that a set of resources must all be located with a common peer, but have no dependencies on each other. In this scenario, unlike the previous, B would be allowed to remain active even if A or C (or both) were inactive." +msgid "This notation can also be used to tell the cluster that sets of resources must be colocated relative to each other, where the individual members of each set may or may not depend on each other being active (controlled by the sequential property)." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "In this example, A, B, and C will each be colocated with D. D must be active, but any of A, B, or C may be inactive without affecting any other resources." msgstr "" #. Tag: title #, no-c-format msgid "Using colocated sets to specify a common peer" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_colocation id=\"coloc-1\" score=\"INFINITY\" >\n" " <resource_set id=\"colocated-set-1\" sequential=\"false\">\n" " <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" " <resource_ref id=\"C\"/>\n" " </resource_set>\n" " <resource_set id=\"colocated-set-2\" sequential=\"true\">\n" " <resource_ref id=\"D\"/>\n" " </resource_set>\n" " </rsc_colocation>\n" "</constraints>" msgstr "" #. Tag: para #, no-c-format msgid "A colocated set with sequential=false makes sense only if there is another set in the constraint. Otherwise, the constraint has no effect." msgstr "" #. Tag: para #, no-c-format -msgid "There is no inherent limit to the number and size of the sets used. The only thing that matters is that in order for any member of one set in the constraint to be active, all members of sets listed after it must also be active (and naturally on the same node); and if a set has sequential=\"true\", then in order for one member of that set to be active, all members listed after it must also be active. You can even specify the role in which the members of a set must be in using the set’s role attribute." +msgid "There is no inherent limit to the number and size of the sets used. The only thing that matters is that in order for any member of one set in the constraint to be active, all members of sets listed after it must also be active (and naturally on the same node); and if a set has sequential=\"true\", then in order for one member of that set to be active, all members listed before it must also be active." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "If desired, you can restrict the dependency to instances of multistate resources that are in a specific role, using the set’s role property." msgstr "" #. Tag: title #, no-c-format -msgid "A colocation chain where the members of the middle set have no interdependencies and the last has master status." 
+msgid "Colocation chain in which the members of the middle set have no interdependencies, and the last listed set (which the cluster places first) is restricted to instances in master status." msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_colocation id=\"coloc-1\" score=\"INFINITY\" >\n" " <resource_set id=\"colocated-set-1\" sequential=\"true\">\n" -" <resource_ref id=\"A\"/>\n" " <resource_ref id=\"B\"/>\n" +" <resource_ref id=\"A\"/>\n" " </resource_set>\n" " <resource_set id=\"colocated-set-2\" sequential=\"false\">\n" " <resource_ref id=\"C\"/>\n" " <resource_ref id=\"D\"/>\n" " <resource_ref id=\"E\"/>\n" " </resource_set>\n" " <resource_set id=\"colocated-set-3\" sequential=\"true\" role=\"Master\">\n" -" <resource_ref id=\"F\"/>\n" " <resource_ref id=\"G\"/>\n" +" <resource_ref id=\"F\"/>\n" " </resource_set>\n" " </rsc_colocation>\n" "</constraints>" msgstr "" #. Tag: title #, no-c-format -msgid "Visual representation of a colocation chain where the members of the middle set have no inter-dependencies" +msgid "Visual representation the above example (resources to the left are placed first)" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Pay close attention to the order in which resources and sets are listed. While the colocation dependency for members of any one set is last-to-first, the colocation dependency for multiple sets is first-to-last. In the above example, B is colocated with A, but colocated-set-1 is colocated with colocated-set-2." msgstr "" #. Tag: para #, no-c-format msgid "Unlike ordered sets, colocated sets do not use the require-all option." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Intro.pot b/doc/Pacemaker_Explained/pot/Ch-Intro.pot index 9c36266561..8af3ff965b 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Intro.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Intro.pot @@ -1,284 +1,284 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Read-Me-First" msgstr "" #. Tag: title #, no-c-format msgid "The Scope of this Document" msgstr "" #. Tag: para #, no-c-format msgid "The purpose of this document is to definitively explain the concepts used to configure Pacemaker. To achieve this, it will focus exclusively on the XML syntax used to configure the CIB." msgstr "" #. Tag: para #, no-c-format msgid "For those that are allergic to XML, there exist several unified shells and GUIs for Pacemaker. However these tools will not be covered at all in this document I hope, however, that the concepts explained here make the functionality of these tools more easily understood. , precisely because they hide the XML." msgstr "" #. Tag: para #, no-c-format msgid "Additionally, this document is NOT a step-by-step how-to guide for configuring a specific clustering scenario." msgstr "" #. Tag: para #, no-c-format msgid "Although such guides exist, For example, see the Clusters from Scratch guide. the purpose of this document is to provide an understanding of the building blocks that can be used to construct any type of Pacemaker cluster." msgstr "" #. Tag: title #, no-c-format msgid "What Is Pacemaker?" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker is a cluster resource manager, that is, a logic responsible for a life-cycle of deployed software — indirectly perhaps even whole systems or their interconnections — under its control within a set of computers (a.k.a. nodes) and driven by prescribed rules." msgstr "" #. 
Tag: para #, no-c-format msgid "It achieves maximum availability for your cluster services (a.k.a. resources) by detecting and recovering from node- and resource-level failures by making use of the messaging and membership capabilities provided by your preferred cluster infrastructure (either Corosync or Heartbeat), and possibly by utilizing other parts of the overall cluster stack." msgstr "" #. Tag: para #, no-c-format msgid "For the goal of minimal downtime a term high availability was coined and together with its acronym, HA, is well-established in the sector. To differentiate this sort of clusters from high performance computing (HPC) ones, should a context require it (apparently, not the case in this document), using HA cluster is an option." msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker’s key features include:" msgstr "" #. Tag: para #, no-c-format msgid "Detection and recovery of node and service-level failures" msgstr "" #. Tag: para #, no-c-format msgid "Storage agnostic, no requirement for shared storage" msgstr "" #. Tag: para #, no-c-format msgid "Resource agnostic, anything that can be scripted can be clustered" msgstr "" #. Tag: para #, no-c-format msgid "Supports fencing (also referred to as the STONITH acronym, deciphered later on) for ensuring data integrity" msgstr "" #. Tag: para #, no-c-format msgid "Supports large and small clusters" msgstr "" #. Tag: para #, no-c-format msgid "Supports both quorate and resource-driven clusters" msgstr "" #. Tag: para #, no-c-format msgid "Supports practically any redundancy configuration" msgstr "" #. Tag: para #, no-c-format msgid "Automatically replicated configuration that can be updated from any node" msgstr "" #. Tag: para #, no-c-format msgid "Ability to specify cluster-wide service ordering, colocation and anti-colocation" msgstr "" #. Tag: para #, no-c-format msgid "Support for advanced service types" msgstr "" #. 
Tag: para #, no-c-format msgid "Clones: for services which need to be active on multiple nodes" msgstr "" #. Tag: para #, no-c-format msgid "Multi-state: for services with multiple modes (e.g. master/slave, primary/secondary)" msgstr "" #. Tag: para #, no-c-format msgid "Unified, scriptable cluster management tools" msgstr "" #. Tag: title #, no-c-format msgid "Pacemaker Architecture" msgstr "" #. Tag: para #, no-c-format msgid "At the highest level, the cluster is made up of three pieces:" msgstr "" #. Tag: para #, no-c-format msgid "Non-cluster-aware components. These pieces include the resources themselves; scripts that start, stop and monitor them; and a local daemon that masks the differences between the different standards these scripts implement. Even though interactions of these resources when run as multiple instances can resemble a distributed system, they still lack the proper HA mechanisms and/or autonomous cluster-wide governance as subsumed in the following item." msgstr "" #. Tag: para #, no-c-format msgid "Resource management. Pacemaker provides the brain that processes and reacts to events regarding the cluster. These events include nodes joining or leaving the cluster; resource events caused by failures, maintenance and scheduled activities; and other administrative actions. Pacemaker will compute the ideal state of the cluster and plot a path to achieve it after any of these events. This may include moving resources, stopping nodes and even forcing them offline with remote power switches." msgstr "" #. Tag: para #, no-c-format msgid "Low-level infrastructure. Projects like Corosync, CMAN and Heartbeat provide reliable messaging, membership and quorum information about the cluster." msgstr "" #. Tag: para #, no-c-format msgid "When combined with Corosync, Pacemaker also supports popular open source cluster filesystems. 
Even though Pacemaker also supports Heartbeat, the filesystems need to use the stack for messaging and membership, and Corosync seems to be what they’re standardizing on. Technically, it would be possible for them to support Heartbeat as well, but there seems little interest in this. " msgstr "" #. Tag: para #, no-c-format msgid "Due to past standardization within the cluster filesystem community, cluster filesystems make use of a common distributed lock manager, which makes use of Corosync for its messaging and membership capabilities (which nodes are up/down) and Pacemaker for fencing services." msgstr "" #. Tag: title #, no-c-format msgid "The Pacemaker Stack" msgstr "" #. Tag: title #, no-c-format msgid "Internal Components" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker itself is composed of five key components:" msgstr "" #. Tag: para #, no-c-format msgid "Cluster Information Base (CIB)" msgstr "" #. Tag: para #, no-c-format msgid "Cluster Resource Management daemon (CRMd)" msgstr "" #. Tag: para #, no-c-format msgid "Local Resource Management daemon (LRMd)" msgstr "" #. Tag: para #, no-c-format msgid "Policy Engine (PEngine or PE)" msgstr "" #. Tag: para #, no-c-format msgid "Fencing daemon (STONITHd)" msgstr "" #. Tag: para #, no-c-format msgid "The CIB uses XML to represent both the cluster’s configuration and current state of all resources in the cluster. The contents of the CIB are automatically kept in sync across the entire cluster and are used by the PEngine to compute the ideal state of the cluster and how it should be achieved." msgstr "" #. Tag: para #, no-c-format msgid "This list of instructions is then fed to the Designated Controller (DC). Pacemaker centralizes all cluster decision making by electing one of the CRMd instances to act as a master. Should the elected CRMd process (or the node it is on) fail, a new one is quickly established." msgstr "" #. 
Tag: para #, no-c-format msgid "The DC carries out the PEngine’s instructions in the required order by passing them to either the Local Resource Management daemon (LRMd) or CRMd peers on other nodes via the cluster messaging infrastructure (which in turn passes them on to their LRMd process)." msgstr "" #. Tag: para #, no-c-format msgid "The peer nodes all report the results of their operations back to the DC and, based on the expected and actual results, will either execute any actions that needed to wait for the previous one to complete, or abort processing and ask the PEngine to recalculate the ideal cluster state based on the unexpected results." msgstr "" #. Tag: para #, no-c-format msgid "In some cases, it may be necessary to power off nodes in order to protect shared data or complete resource recovery. For this, Pacemaker comes with STONITHd." msgstr "" #. Tag: para #, no-c-format msgid "STONITH is an acronym for Shoot-The-Other-Node-In-The-Head, a recommended practice that misbehaving node is best to be promptly fenced (shut off, cut from shared resources or otherwise immobilized), and is usually implemented with a remote power switch." msgstr "" #. Tag: para #, no-c-format msgid "In Pacemaker, STONITH devices are modeled as resources (and configured in the CIB) to enable them to be easily monitored for failure, however STONITHd takes care of understanding the STONITH topology such that its clients simply request a node be fenced, and it does the rest." msgstr "" #. Tag: title #, no-c-format msgid "Types of Pacemaker Clusters" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker makes no assumptions about your environment. This allows it to support practically any redundancy configuration including Active/Active, Active/Passive, N+1, N+M, N-to-1 and N-to-N." msgstr "" #. Tag: title #, no-c-format msgid "Active/Passive Redundancy" msgstr "" #. 
Tag: para #, no-c-format msgid "Two-node Active/Passive clusters using Pacemaker and DRBD are a cost-effective solution for many High Availability situations." msgstr "" #. Tag: title #, no-c-format msgid "Shared Failover" msgstr "" #. Tag: para #, no-c-format msgid "By supporting many nodes, Pacemaker can dramatically reduce hardware costs by allowing several active/passive clusters to be combined and share a common backup node." msgstr "" #. Tag: title #, no-c-format msgid "N to N Redundancy" msgstr "" #. Tag: para #, no-c-format msgid "When shared storage is available, every node can potentially be used for failover. Pacemaker can even run multiple copies of services to spread out the workload." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Multi-site-Clusters.pot b/doc/Pacemaker_Explained/pot/Ch-Multi-site-Clusters.pot index d9adc6dfd9..eabbed90bc 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Multi-site-Clusters.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Multi-site-Clusters.pot @@ -1,458 +1,458 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Multi-Site Clusters and Tickets" msgstr "" #. Tag: para #, no-c-format msgid "Apart from local clusters, Pacemaker also supports multi-site clusters. That means you can have multiple, geographically dispersed sites, each with a local cluster. Failover between these clusters can be coordinated manually by the administrator, or automatically by a higher-level entity called a Cluster Ticket Registry (CTR)." msgstr "" #. Tag: title #, no-c-format msgid "Challenges for Multi-Site Clusters" msgstr "" #. 
Tag: para #, no-c-format msgid "Typically, multi-site environments are too far apart to support synchronous communication and data replication between the sites. That leads to significant challenges:" msgstr "" #. Tag: para #, no-c-format msgid "How do we make sure that a cluster site is up and running?" msgstr "" #. Tag: para #, no-c-format msgid "How do we make sure that resources are only started once?" msgstr "" #. Tag: para #, no-c-format msgid "How do we make sure that quorum can be reached between the different sites and a split-brain scenario avoided?" msgstr "" #. Tag: para #, no-c-format msgid "How do we manage failover between sites?" msgstr "" #. Tag: para #, no-c-format msgid "How do we deal with high latency in case of resources that need to be stopped?" msgstr "" #. Tag: para #, no-c-format msgid "In the following sections, learn how to meet these challenges." msgstr "" #. Tag: title #, no-c-format msgid "Conceptual Overview" msgstr "" #. Tag: para #, no-c-format msgid "Multi-site clusters can be considered as “overlay” clusters where each cluster site corresponds to a cluster node in a traditional cluster. The overlay cluster can be managed by a CTR in order to guarantee that any cluster resource will be active on no more than one cluster site. This is achieved by using tickets that are treated as failover domain between cluster sites, in case a site should be down." msgstr "" #. Tag: para #, no-c-format msgid "The following sections explain the individual components and mechanisms that were introduced for multi-site clusters in more detail." msgstr "" #. Tag: title #, no-c-format msgid "Ticket" msgstr "" #. Tag: para #, no-c-format msgid "Tickets are, essentially, cluster-wide attributes. A ticket grants the right to run certain resources on a specific cluster site. Resources can be bound to a certain ticket by rsc_ticket constraints. Only if the ticket is available at a site can the respective resources be started there. 
Vice versa, if the ticket is revoked, the resources depending on that ticket must be stopped." msgstr "" #. Tag: para #, no-c-format msgid "The ticket thus is similar to a site quorum, i.e. the permission to manage/own resources associated with that site. (One can also think of the current have-quorum flag as a special, cluster-wide ticket that is granted in case of node majority.)" msgstr "" #. Tag: para #, no-c-format msgid "Tickets can be granted and revoked either manually by administrators (which could be the default for classic enterprise clusters), or via the automated CTR mechanism described below." msgstr "" #. Tag: para #, no-c-format msgid "A ticket can only be owned by one site at a time. Initially, none of the sites has a ticket. Each ticket must be granted once by the cluster administrator." msgstr "" #. Tag: para #, no-c-format msgid "The presence or absence of tickets for a site is stored in the CIB as a cluster status. With regards to a certain ticket, there are only two states for a site: true (the site has the ticket) or false (the site does not have the ticket). The absence of a certain ticket (during the initial state of the multi-site cluster) is the same as the value false." msgstr "" #. Tag: title #, no-c-format msgid "Dead Man Dependency" msgstr "" #. Tag: para #, no-c-format msgid "A site can only activate resources safely if it can be sure that the other site has deactivated them. However after a ticket is revoked, it can take a long time until all resources depending on that ticket are stopped \"cleanly\", especially in case of cascaded resources. To cut that process short, the concept of a Dead Man Dependency was introduced." msgstr "" #. Tag: para #, no-c-format msgid "If a dead man dependency is in force, if a ticket is revoked from a site, the nodes that are hosting dependent resources are fenced. This considerably speeds up the recovery process of the cluster and makes sure that resources can be migrated more quickly." msgstr "" #. 
Tag: para #, no-c-format msgid "This can be configured by specifying a loss-policy=\"fence\" in rsc_ticket constraints." msgstr "" #. Tag: title #, no-c-format msgid "Cluster Ticket Registry" msgstr "" #. Tag: para #, no-c-format msgid "A CTR is a coordinated group of network daemons that automatically handles granting, revoking, and timing out tickets (instead of the administrator revoking the ticket somewhere, waiting for everything to stop, and then granting it on the desired site)." msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker does not implement its own CTR, but interoperates with external software designed for that purpose (similar to how resource and fencing agents are not directly part of pacemaker)." msgstr "" #. Tag: para #, no-c-format msgid "Participating clusters run the CTR daemons, which connect to each other, exchange information about their connectivity, and vote on which sites gets which tickets." msgstr "" #. Tag: para #, no-c-format msgid "A ticket is granted to a site only once the CTR is sure that the ticket has been relinquished by the previous owner, implemented via a timer in most scenarios. If a site loses connection to its peers, its tickets time out and recovery occurs. After the connection timeout plus the recovery timeout has passed, the other sites are allowed to re-acquire the ticket and start the resources again." msgstr "" #. Tag: para #, no-c-format msgid "This can also be thought of as a \"quorum server\", except that it is not a single quorum ticket, but several." msgstr "" #. Tag: title #, no-c-format msgid "Configuration Replication" msgstr "" #. Tag: para #, no-c-format msgid "As usual, the CIB is synchronized within each cluster, but it is not synchronized across cluster sites of a multi-site cluster. You have to configure the resources that will be highly available across the multi-site cluster for every site accordingly." msgstr "" #. Tag: title #, no-c-format msgid "Configuring Ticket Dependencies" msgstr "" #. 
Tag: para #, no-c-format msgid "The rsc_ticket constraint lets you specify the resources depending on a certain ticket. Together with the constraint, you can set a loss-policy that defines what should happen to the respective resources if the ticket is revoked." msgstr "" #. Tag: para #, no-c-format msgid "The attribute loss-policy can have the following values:" msgstr "" #. Tag: para #, no-c-format msgid "fence: Fence the nodes that are running the relevant resources." msgstr "" #. Tag: para #, no-c-format msgid "stop: Stop the relevant resources." msgstr "" #. Tag: para #, no-c-format msgid "freeze: Do nothing to the relevant resources." msgstr "" #. Tag: para #, no-c-format msgid "demote: Demote relevant resources that are running in master mode to slave mode." msgstr "" #. Tag: title #, no-c-format msgid "Constraint that fences node if ticketA is revoked" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_ticket id=\"rsc1-req-ticketA\" rsc=\"rsc1\" ticket=\"ticketA\" loss-policy=\"fence\"/>" msgstr "" #. Tag: para #, no-c-format msgid "The example above creates a constraint with the ID rsc1-req-ticketA. It defines that the resource rsc1 depends on ticketA and that the node running the resource should be fenced if ticketA is revoked." msgstr "" #. Tag: para #, no-c-format msgid "If resource rsc1 were a multi-state resource (i.e. it could run in master or slave mode), you might want to configure that only master mode depends on ticketA. With the following configuration, rsc1 will be demoted to slave mode if ticketA is revoked:" msgstr "" #. Tag: title #, no-c-format msgid "Constraint that demotes rsc1 if ticketA is revoked" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_ticket id=\"rsc1-req-ticketA\" rsc=\"rsc1\" rsc-role=\"Master\" ticket=\"ticketA\" loss-policy=\"demote\"/>" msgstr "" #. Tag: para #, no-c-format msgid "You can create multiple rsc_ticket constraints to let multiple resources depend on the same ticket. 
However, rsc_ticket also supports resource sets (see ), so one can easily list all the resources in one rsc_ticket constraint instead." msgstr "" #. Tag: title #, no-c-format msgid "Ticket constraint for multiple resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_ticket id=\"resources-dep-ticketA\" ticket=\"ticketA\" loss-policy=\"fence\">\n" " <resource_set id=\"resources-dep-ticketA-0\" role=\"Started\">\n" " <resource_ref id=\"rsc1\"/>\n" " <resource_ref id=\"group1\"/>\n" " <resource_ref id=\"clone1\"/>\n" " </resource_set>\n" " <resource_set id=\"resources-dep-ticketA-1\" role=\"Master\">\n" " <resource_ref id=\"ms1\"/>\n" " </resource_set>\n" "</rsc_ticket>" msgstr "" #. Tag: para #, no-c-format msgid "In the example above, there are two resource sets, so we can list resources with different roles in a single rsc_ticket constraint. There’s no dependency between the two resource sets, and there’s no dependency among the resources within a resource set. Each of the resources just depends on ticketA." msgstr "" #. Tag: para #, no-c-format msgid "Referencing resource templates in rsc_ticket constraints, and even referencing them within resource sets, is also supported." msgstr "" #. Tag: para #, no-c-format msgid "If you want other resources to depend on further tickets, create as many constraints as necessary with rsc_ticket." msgstr "" #. Tag: title #, no-c-format msgid "Managing Multi-Site Clusters" msgstr "" #. Tag: title #, no-c-format msgid "Granting and Revoking Tickets Manually" msgstr "" #. Tag: para #, no-c-format msgid "You can grant tickets to sites or revoke them from sites manually. If you want to re-distribute a ticket, you should wait for the dependent resources to stop cleanly at the previous site before you grant the ticket to the new site." msgstr "" #. Tag: para #, no-c-format msgid "Use the crm_ticket command line tool to grant and revoke tickets." msgstr "" #. 
Tag: para #, no-c-format msgid "To grant a ticket to this site:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_ticket --ticket ticketA --grant" msgstr "" #. Tag: para #, no-c-format msgid "To revoke a ticket from this site:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_ticket --ticket ticketA --revoke" msgstr "" #. Tag: para #, no-c-format msgid "If you are managing tickets manually, use the crm_ticket command with great care, because it cannot check whether the same ticket is already granted elsewhere." msgstr "" #. Tag: title #, no-c-format msgid "Granting and Revoking Tickets via a Cluster Ticket Registry" msgstr "" #. Tag: para #, no-c-format msgid "We will use Booth here as an example of software that can be used with pacemaker as a Cluster Ticket Registry. Booth implements the Raft algorithm to guarantee the distributed consensus among different cluster sites, and manages the ticket distribution (and thus the failover process between sites)." msgstr "" #. Tag: para #, no-c-format msgid "Each of the participating clusters and arbitrators runs the Booth daemon boothd." msgstr "" #. Tag: para #, no-c-format msgid "An arbitrator is the multi-site equivalent of a quorum-only node in a local cluster. If you have a setup with an even number of sites, you need an additional instance to reach consensus about decisions such as failover of resources across sites. In this case, add one or more arbitrators running at additional sites. Arbitrators are single machines that run a booth instance in a special mode. An arbitrator is especially important for a two-site scenario, otherwise there is no way for one site to distinguish between a network failure between it and the other site, and a failure of the other site." msgstr "" #. Tag: para #, no-c-format msgid "The most common multi-site scenario is probably a multi-site cluster with two sites and a single arbitrator on a third site. 
However, technically, there are no limitations with regard to the number of sites and the number of arbitrators involved." msgstr "" #. Tag: para #, no-c-format msgid "Boothd at each site connects to its peers running at the other sites and exchanges connectivity details. Once a ticket is granted to a site, the booth mechanism will manage the ticket automatically: If the site which holds the ticket is out of service, the booth daemons will vote on which of the other sites will get the ticket. To protect against brief connection failures, sites that lose the vote (either explicitly or implicitly by being disconnected from the voting body) need to relinquish the ticket after a time-out. This ensures that a ticket will only be re-distributed after it has been relinquished by the previous site. The resources that depend on that ticket will fail over to the new site holding the ticket. The nodes that have run the resources before will be treated according to the loss-policy you set within the rsc_ticket constraint." msgstr "" #. Tag: para #, no-c-format msgid "Before Booth can manage a certain ticket within the multi-site cluster, you initially need to grant it to a site manually via the booth command-line tool. After you have initially granted a ticket to a site, boothd will take over and manage the ticket automatically." msgstr "" #. Tag: para #, no-c-format msgid "The booth command-line tool can be used to grant, list, or revoke tickets and can be run on any machine where boothd is running. If you are managing tickets via Booth, use only booth for manual intervention, not crm_ticket. That ensures the same ticket will only be owned by one cluster site at a time." msgstr "" #. Tag: title #, no-c-format msgid "Booth Requirements" msgstr "" #. Tag: para #, no-c-format msgid "All clusters that will be part of the multi-site cluster must be based on Pacemaker." msgstr "" #.
Tag: para #, no-c-format msgid "Booth must be installed on all cluster nodes and on all arbitrators that will be part of the multi-site cluster." msgstr "" #. Tag: para #, no-c-format msgid "Nodes belonging to the same cluster site should be synchronized via NTP. However, time synchronization is not required between the individual cluster sites." msgstr "" #. Tag: title #, no-c-format msgid "General Management of Tickets" msgstr "" #. Tag: para #, no-c-format msgid "Display the information of tickets:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_ticket --info" msgstr "" #. Tag: para #, no-c-format msgid "Or you can monitor them with:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_mon --tickets" msgstr "" #. Tag: para #, no-c-format msgid "Display the rsc_ticket constraints that apply to a ticket:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_ticket --ticket ticketA --constraints" msgstr "" #. Tag: para #, no-c-format msgid "When you want to do maintenance or manual switch-over of a ticket, revoking the ticket would trigger the loss policies. If loss-policy=\"fence\", the dependent resources could not be gracefully stopped/demoted, and other unrelated resources could even be affected." msgstr "" #. Tag: para #, no-c-format msgid "The proper way is making the ticket standby first with:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_ticket --ticket ticketA --standby" msgstr "" #. Tag: para #, no-c-format msgid "Then the dependent resources will be stopped or demoted gracefully without triggering the loss policies." msgstr "" #. Tag: para #, no-c-format msgid "If you have finished the maintenance and want to activate the ticket again, you can run:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_ticket --ticket ticketA --activate" msgstr "" #. Tag: title #, no-c-format msgid "For more information" msgstr "" #. Tag: para #, no-c-format msgid "SUSE’s Geo Clustering quick start" msgstr "" #. 
Tag: para #, no-c-format msgid "Booth" msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Nodes.pot b/doc/Pacemaker_Explained/pot/Ch-Nodes.pot index fc613d28f2..d80f79b305 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Nodes.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Nodes.pot @@ -1,384 +1,384 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Cluster Nodes" msgstr "" #. Tag: title #, no-c-format msgid "Defining a Cluster Node" msgstr "" #. Tag: para #, no-c-format msgid "Each node in the cluster will have an entry in the nodes section containing its UUID, uname, and type." msgstr "" #. Tag: title #, no-c-format msgid "Example Heartbeat cluster node entry" msgstr "" #. Tag: programlisting #, no-c-format msgid "<node id=\"1186dc9a-324d-425a-966e-d757e693dc86\" uname=\"pcmk-1\" type=\"normal\"/>" msgstr "" #. Tag: title #, no-c-format msgid "Example Corosync cluster node entry" msgstr "" #. Tag: programlisting #, no-c-format msgid "<node id=\"101\" uname=\"pcmk-1\" type=\"normal\"/>" msgstr "" #. Tag: para #, no-c-format msgid "In normal circumstances, the admin should let the cluster populate this information automatically from the communications and membership data. However for Heartbeat, one can use the crm_uuid tool to read an existing UUID or define a value before the cluster starts." msgstr "" #. Tag: title #, no-c-format msgid "Where Pacemaker Gets the Node Name" msgstr "" #. Tag: para #, no-c-format msgid "Traditionally, Pacemaker required nodes to be referred to by the value returned by uname -n. 
This can be problematic for services that require the uname -n to be a specific value (e.g. for a licence file)." msgstr "" #. Tag: para #, no-c-format msgid "This requirement has been relaxed for clusters using Corosync 2.0 or later. The name Pacemaker uses is:" msgstr "" #. Tag: para #, no-c-format msgid "The value stored in corosync.conf under ring0_addr in the nodelist, if it does not contain an IP address; otherwise" msgstr "" #. Tag: para #, no-c-format msgid "The value stored in corosync.conf under name in the nodelist; otherwise" msgstr "" #. Tag: para #, no-c-format msgid "The value of uname -n" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker provides the crm_node -n command which displays the name used by a running cluster." msgstr "" #. Tag: para #, no-c-format msgid "If a Corosync nodelist is used, crm_node --name-for-id number is also available to display the name used by the node with the corosync nodeid of number, for example: crm_node --name-for-id 2." msgstr "" #. Tag: title #, no-c-format msgid "Node Attributes" msgstr "" #. Tag: para #, no-c-format msgid " Nodeattribute attribute Node attributes are a special type of option (name-value pair) that applies to a node object." msgstr "" #. Tag: para #, no-c-format msgid "Beyond the basic definition of a node, the administrator can describe the node’s attributes, such as how much RAM, disk, what OS or kernel version it has, perhaps even its physical location. This information can then be used by the cluster when deciding where to place resources. For more information on the use of node attributes, see ." msgstr "" #. Tag: para #, no-c-format msgid "Node attributes can be specified ahead of time or populated later, when the cluster is running, using crm_attribute." msgstr "" #. Tag: para #, no-c-format msgid "Below is what the node’s definition would look like if the admin ran the command:" msgstr "" #. 
Tag: title #, no-c-format msgid "Result of using crm_attribute to specify which kernel pcmk-1 is running" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --type nodes --node pcmk-1 --name kernel --update $(uname -r)" msgstr "" #. Tag: programlisting #, no-c-format msgid "<node uname=\"pcmk-1\" type=\"normal\" id=\"101\">\n" " <instance_attributes id=\"nodes-101\">\n" " <nvpair id=\"nodes-101-kernel\" name=\"kernel\" value=\"3.10.0-123.13.2.el7.x86_64\"/>\n" " </instance_attributes>\n" "</node>" msgstr "" #. Tag: para #, no-c-format msgid "Rather than having to read the XML, a simpler way to determine the current value of an attribute is to use crm_attribute again:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --type nodes --node pcmk-1 --name kernel --query\n" "scope=nodes name=kernel value=3.10.0-123.13.2.el7.x86_64" msgstr "" #. Tag: para #, no-c-format msgid "By specifying --type nodes the admin tells the cluster that this attribute is persistent. There are also transient attributes which are kept in the status section which are \"forgotten\" whenever the node rejoins the cluster. The cluster uses this area to store a record of how many times a resource has failed on that node, but administrators can also read and write to this section by specifying --type status." msgstr "" #. Tag: title #, no-c-format msgid "Managing Nodes in a Corosync-Based Cluster" msgstr "" #. Tag: title #, no-c-format msgid "Adding a New Corosync Node" msgstr "" #. Tag: para #, no-c-format msgid " CorosyncAdd Cluster Node Add Cluster Node Add Cluster NodeCorosync Corosync " msgstr "" #. Tag: para #, no-c-format msgid "To add a new node:" msgstr "" #. Tag: para #, no-c-format msgid "Install Corosync and Pacemaker on the new host." msgstr "" #. Tag: para #, no-c-format msgid "Copy /etc/corosync/corosync.conf and /etc/corosync/authkey (if it exists) from an existing node. You may need to modify the mcastaddr option to match the new node’s IP address." 
msgstr "" #. Tag: para #, no-c-format msgid "Start the cluster software on the new host. If a log message containing \"Invalid digest\" appears from Corosync, the keys are not consistent between the machines." msgstr "" #. Tag: title #, no-c-format msgid "Removing a Corosync Node" msgstr "" #. Tag: para #, no-c-format msgid " CorosyncRemove Cluster Node Remove Cluster Node Remove Cluster NodeCorosync Corosync " msgstr "" #. Tag: para #, no-c-format msgid "Because the messaging and membership layers are the authoritative source for cluster nodes, deleting them from the CIB is not a complete solution. First, one must arrange for corosync to forget about the node (pcmk-1 in the example below)." msgstr "" #. Tag: para #, no-c-format msgid "Stop the cluster on the host to be removed. How to do this will vary with your operating system and installed versions of cluster software, for example, pcs cluster stop if you are using pcs for cluster management, or service corosync stop on a host using corosync 1.x with the pacemaker plugin." msgstr "" #. Tag: para #, no-c-format msgid "From one of the remaining active cluster nodes, tell Pacemaker to forget about the removed host, which will also delete the node from the CIB:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_node -R pcmk-1" msgstr "" #. Tag: para #, no-c-format msgid "This procedure only works for pacemaker 1.1.8 and later." msgstr "" #. Tag: title #, no-c-format msgid "Replacing a Corosync Node" msgstr "" #. Tag: para #, no-c-format msgid " CorosyncReplace Cluster Node Replace Cluster Node Replace Cluster NodeCorosync Corosync " msgstr "" #. Tag: para #, no-c-format msgid "To replace an existing cluster node:" msgstr "" #. Tag: para #, no-c-format msgid "Make sure the old node is completely stopped." msgstr "" #. Tag: para #, no-c-format msgid "Give the new machine the same hostname and IP address as the old one." msgstr "" #. Tag: para #, no-c-format msgid "Follow the procedure above for adding a node." 
msgstr "" #. Tag: title #, no-c-format msgid "Managing Nodes in a Heartbeat-based Cluster" msgstr "" #. Tag: title #, no-c-format msgid "Adding a New Heartbeat Node" msgstr "" #. Tag: para #, no-c-format msgid " HeartbeatAdd Cluster Node Add Cluster Node Add Cluster NodeHeartbeat Heartbeat " msgstr "" #. Tag: para #, no-c-format msgid "Install heartbeat and pacemaker on the new host." msgstr "" #. Tag: para #, no-c-format msgid "Copy ha.cf and authkeys from an existing node." msgstr "" #. Tag: para #, no-c-format msgid "If you do not use autojoin any in ha.cf, run:" msgstr "" #. Tag: screen #, no-c-format msgid "hb_addnode $(uname -n)" msgstr "" #. Tag: para #, no-c-format msgid "Start the cluster software on the new node." msgstr "" #. Tag: title #, no-c-format msgid "Removing a Heartbeat Node" msgstr "" #. Tag: para #, no-c-format msgid " HeartbeatRemove Cluster Node Remove Cluster Node Remove Cluster NodeHeartbeat Heartbeat " msgstr "" #. Tag: para #, no-c-format msgid "Because the messaging and membership layers are the authoritative source for cluster nodes, deleting them from the CIB is not a complete solution. First, one must arrange for Heartbeat to forget about the node (pcmk-1 in the example below)." msgstr "" #. Tag: para #, no-c-format msgid "On the host to be removed, stop the cluster:" msgstr "" #. Tag: screen #, no-c-format msgid "service heartbeat stop" msgstr "" #. Tag: para #, no-c-format msgid "From one of the remaining active cluster nodes, tell Heartbeat the node should be removed:" msgstr "" #. Tag: screen #, no-c-format msgid "hb_delnode pcmk-1" msgstr "" #. Tag: para #, no-c-format msgid "Tell Pacemaker to forget about the removed host:" msgstr "" #. Tag: screen #, no-c-format msgid "crm_node -R pcmk-1" msgstr "" #. Tag: para #, no-c-format msgid "This procedure only works for pacemaker versions after 1.1.8." msgstr "" #. Tag: title #, no-c-format msgid "Replacing a Heartbeat Node" msgstr "" #. 
Tag: para #, no-c-format msgid " HeartbeatReplace Cluster Node Replace Cluster Node Replace Cluster NodeHeartbeat Heartbeat To replace an existing cluster node:" msgstr "" #. Tag: para #, no-c-format msgid "Give the new machine the same hostname as the old one." msgstr "" #. Tag: para #, no-c-format msgid "Go to an active cluster node and look up the UUID for the old node in /var/lib/heartbeat/hostcache." msgstr "" #. Tag: para #, no-c-format msgid "Install the cluster software." msgstr "" #. Tag: para #, no-c-format msgid "Copy ha.cf and authkeys to the new node." msgstr "" #. Tag: para #, no-c-format msgid "On the new node, populate its UUID using crm_uuid -w and the UUID obtained earlier." msgstr "" #. Tag: para #, no-c-format msgid "Start the new cluster node." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Notifications.pot b/doc/Pacemaker_Explained/pot/Ch-Notifications.pot index 0e0fed996f..d1de90fb1a 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Notifications.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Notifications.pot @@ -1,233 +1,233 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Receiving Notification for Cluster Events" msgstr "" #. Tag: para #, no-c-format msgid " ResourceNotification Notification " msgstr "" #. Tag: para #, no-c-format msgid "A Pacemaker cluster is an event-driven system. In this context, an event might be a resource failure or a configuration change, among others." msgstr "" #. 
Tag: para #, no-c-format msgid "The ocf:pacemaker:ClusterMon resource can monitor the cluster status and trigger alerts on each cluster event. This resource runs crm_mon in the background at regular (configurable) intervals and uses crm_mon capabilities to trigger emails (SMTP), SNMP traps or external programs (via the extra_options parameter)." msgstr "" #. Tag: para #, no-c-format msgid "Depending on your system settings and compilation settings, SNMP or email alerts might be unavailable. Check the output of crm_mon --help to see whether these options are available to you. In any case, executing an external agent will always be available, and you can use this agent to send emails, SNMP traps or whatever action you develop." msgstr "" #. Tag: title #, no-c-format msgid "Configuring SNMP Notifications" msgstr "" #. Tag: para #, no-c-format msgid " ResourceNotificationSNMP NotificationSNMP SNMP " msgstr "" #. Tag: para #, no-c-format msgid "Requires an IP to send SNMP traps to, and an SNMP community string. The Pacemaker MIB is provided with the source, and is typically installed in /usr/share/snmp/mibs/PCMK-MIB.txt." msgstr "" #. Tag: para #, no-c-format msgid "This example uses snmphost.example.com as the SNMP IP and public as the community string:" msgstr "" #. Tag: title #, no-c-format msgid "Configuring ClusterMon to send SNMP traps" msgstr "" #. Tag: programlisting #, no-c-format msgid "<clone id=\"ClusterMon-clone\">\n" " <primitive class=\"ocf\" id=\"ClusterMon-SNMP\" provider=\"pacemaker\" type=\"ClusterMon\">\n" " <instance_attributes id=\"ClusterMon-instance_attributes\">\n" " <nvpair id=\"ClusterMon-instance_attributes-user\" name=\"user\" value=\"root\"/>\n" " <nvpair id=\"ClusterMon-instance_attributes-update\" name=\"update\" value=\"30\"/>\n" " <nvpair id=\"ClusterMon-instance_attributes-extra_options\" name=\"extra_options\" value=\"-S snmphost.example.com -C public\"/>\n" " </instance_attributes>\n" " </primitive>\n" "</clone>" msgstr "" #. 
Tag: title #, no-c-format msgid "Configuring Email Notifications" msgstr "" #. Tag: para #, no-c-format msgid " ResourceNotificationSMTP NotificationSMTP SMTP " msgstr "" #. Tag: para #, no-c-format msgid "Requires the recipient e-mail address. You can also optionally configure the sender e-mail address, the hostname of the SMTP relay, and a prefix string for the subject line." msgstr "" #. Tag: title #, no-c-format msgid "Configuring ClusterMon to send email alerts" msgstr "" #. Tag: programlisting #, no-c-format msgid "<clone id=\"ClusterMon-clone\">\n" " <primitive class=\"ocf\" id=\"ClusterMon-SMTP\" provider=\"pacemaker\" type=\"ClusterMon\">\n" " <instance_attributes id=\"ClusterMon-instance_attributes\">\n" " <nvpair id=\"ClusterMon-instance_attributes-user\" name=\"user\" value=\"root\"/>\n" " <nvpair id=\"ClusterMon-instance_attributes-update\" name=\"update\" value=\"30\"/>\n" " <nvpair id=\"ClusterMon-instance_attributes-extra_options\" name=\"extra_options\" value=\"-T pacemaker@example.com -F pacemaker@node2.example.com -P PACEMAKER -H mail.example.com\"/>\n" " </instance_attributes>\n" " </primitive>\n" "</clone>" msgstr "" #. Tag: title #, no-c-format msgid "Configuring Notifications via External-Agent" msgstr "" #. Tag: para #, no-c-format msgid "Requires a program (external-agent) to run when resource operations take place, and an external-recipient (IP address, email address, URI). When triggered, the external-agent is fed with dynamically filled environment variables describing precisely the cluster event that occurred. By making smart usage of these variables in your external-agent code, you can trigger any action." msgstr "" #. Tag: title #, no-c-format msgid "Configuring ClusterMon to execute an external-agent" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<clone id=\"ClusterMon-clone\">\n" " <primitive class=\"ocf\" id=\"ClusterMon\" provider=\"pacemaker\" type=\"ClusterMon\">\n" " <instance_attributes id=\"ClusterMon-instance_attributes\">\n" " <nvpair id=\"ClusterMon-instance_attributes-user\" name=\"user\" value=\"root\"/>\n" " <nvpair id=\"ClusterMon-instance_attributes-update\" name=\"update\" value=\"30\"/>\n" " <nvpair id=\"ClusterMon-instance_attributes-extra_options\" name=\"extra_options\" value=\"-E /usr/local/bin/example.sh -e 192.168.12.1\"/>\n" " </instance_attributes>\n" " </primitive>\n" "</clone>" msgstr "" #. Tag: title #, no-c-format msgid "Environment Variables Passed to the External Agent" msgstr "" #. Tag: entry #, no-c-format msgid "Environment Variable" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "CRM_notify_recipient" msgstr "" #. Tag: para #, no-c-format msgid "The static external-recipient from the resource definition. Environment VariableCRM_notify_recipient CRM_notify_recipient " msgstr "" #. Tag: para #, no-c-format msgid "CRM_notify_node" msgstr "" #. Tag: para #, no-c-format msgid "The node on which the status change happened. Environment VariableCRM_notify_node CRM_notify_node " msgstr "" #. Tag: para #, no-c-format msgid "CRM_notify_rsc" msgstr "" #. Tag: para #, no-c-format msgid "The name of the resource that changed the status. Environment VariableCRM_notify_rsc CRM_notify_rsc " msgstr "" #. Tag: para #, no-c-format msgid "CRM_notify_task" msgstr "" #. Tag: para #, no-c-format msgid "The operation that caused the status change. Environment VariableCRM_notify_task CRM_notify_task " msgstr "" #. Tag: para #, no-c-format msgid "CRM_notify_desc" msgstr "" #. Tag: para #, no-c-format msgid "The textual output relevant error code of the operation (if any) that caused the status change. Environment VariableCRM_notify_desc CRM_notify_desc " msgstr "" #. 
Tag: para #, no-c-format msgid "CRM_notify_rc" msgstr "" #. Tag: para #, no-c-format msgid "The return code of the operation. Environment VariableCRM_notify_rc CRM_notify_rc " msgstr "" #. Tag: para #, no-c-format msgid "CRM_notify_target_rc" msgstr "" #. Tag: para #, no-c-format msgid "The expected return code of the operation. Environment VariableCRM_notify_target_rc CRM_notify_target_rc " msgstr "" #. Tag: para #, no-c-format msgid "CRM_notify_status" msgstr "" #. Tag: para #, no-c-format msgid "The numerical representation of the status of the operation. Environment VariableCRM_notify_status CRM_notify_status " msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Options.pot b/doc/Pacemaker_Explained/pot/Ch-Options.pot index 0c64043489..a2710809c3 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Options.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Options.pot @@ -1,726 +1,781 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Cluster-Wide Configuration" msgstr "" #. Tag: title #, no-c-format msgid "CIB Properties" msgstr "" #. Tag: para #, no-c-format msgid "Certain settings are defined by CIB properties (that is, attributes of the cib tag) rather than with the rest of the cluster configuration in the configuration section." msgstr "" #. Tag: para #, no-c-format msgid "The reason is simply a matter of parsing. These options are used by the configuration database which is, by design, mostly ignorant of the content it holds. So the decision was made to place them in an easy-to-find location." msgstr "" #.
Tag: entry #, no-c-format msgid "Field" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "admin_epoch" msgstr "" #. Tag: para #, no-c-format msgid " Configuration VersionCluster Cluster ClusterOptionConfiguration Version OptionConfiguration Version Configuration Version admin_epochCluster Option Cluster Option ClusterOptionadmin_epoch Optionadmin_epoch admin_epoch When a node joins the cluster, the cluster performs a check to see which node has the best configuration. It asks the node with the highest (admin_epoch, epoch, num_updates) tuple to replace the configuration on all the nodes — which makes setting them, and setting them correctly, very important. admin_epoch is never modified by the cluster; you can use this to make the configurations on any inactive nodes obsolete. Never set this value to zero. In such cases, the cluster cannot tell the difference between your configuration and the \"empty\" one used when nothing is found on disk." msgstr "" #. Tag: para #, no-c-format msgid "epoch" msgstr "" #. Tag: para #, no-c-format msgid " epochCluster Option Cluster Option ClusterOptionepoch Optionepoch epoch The cluster increments this every time the configuration is updated (usually by the administrator)." msgstr "" #. Tag: para #, no-c-format msgid "num_updates" msgstr "" #. Tag: para #, no-c-format msgid " num_updatesCluster Option Cluster Option ClusterOptionnum_updates Optionnum_updates num_updates The cluster increments this every time the configuration or status is updated (usually by the cluster) and resets it to 0 when epoch changes." msgstr "" #. Tag: para #, no-c-format msgid "validate-with" msgstr "" #. Tag: para #, no-c-format msgid " validate-withCluster Option Cluster Option ClusterOptionvalidate-with Optionvalidate-with validate-with Determines the type of XML validation that will be done on the configuration. 
If set to none, the cluster will not verify that updates conform to the DTD (nor reject ones that don’t). This option can be useful when operating a mixed-version cluster during an upgrade." msgstr "" #. Tag: para #, no-c-format msgid "cib-last-written" msgstr "" #. Tag: para #, no-c-format msgid " cib-last-writtenCluster Property Cluster Property ClusterPropertycib-last-written Propertycib-last-written cib-last-written Indicates when the configuration was last written to disk. Maintained by the cluster; for informational purposes only." msgstr "" #. Tag: para #, no-c-format msgid "have-quorum" msgstr "" #. Tag: para #, no-c-format msgid " have-quorumCluster Property Cluster Property ClusterPropertyhave-quorum Propertyhave-quorum have-quorum Indicates if the cluster has quorum. If false, this may mean that the cluster cannot start resources or fence other nodes (see no-quorum-policy below). Maintained by the cluster." msgstr "" #. Tag: para #, no-c-format msgid "dc-uuid" msgstr "" #. Tag: para #, no-c-format msgid " dc-uuidCluster Property Cluster Property ClusterPropertydc-uuid Propertydc-uuid dc-uuid Indicates which cluster node is the current leader. Used by the cluster when placing resources and determining the order of some events. Maintained by the cluster." msgstr "" #. Tag: title #, no-c-format msgid "Working with CIB Properties" msgstr "" #. Tag: para #, no-c-format msgid "Although these fields can be written to by the user, in most cases the cluster will overwrite any values specified by the user with the \"correct\" ones." msgstr "" #. Tag: para #, no-c-format msgid "To change the ones that can be specified by the user, for example admin_epoch, one should use:" msgstr "" #. Tag: screen #, no-c-format -msgid "# cibadmin --modify --crm_xml '<cib admin_epoch=\"42\"/>'" +msgid "# cibadmin --modify --xml-text '<cib admin_epoch=\"42\"/>'" msgstr "" #. Tag: para #, no-c-format msgid "A complete set of CIB properties will look something like this:" msgstr "" #. 
Tag: title #, no-c-format msgid "Attributes set for a cib object" msgstr "" #. Tag: programlisting #, no-c-format msgid "<cib crm_feature_set=\"3.0.7\" validate-with=\"pacemaker-1.2\"\n" " admin_epoch=\"42\" epoch=\"116\" num_updates=\"1\"\n" " cib-last-written=\"Mon Jan 12 15:46:39 2015\" update-origin=\"rhel7-1\"\n" " update-client=\"crm_attribute\" have-quorum=\"1\" dc-uuid=\"1\">" msgstr "" #. Tag: title #, no-c-format msgid "Cluster Options" msgstr "" #. Tag: para #, no-c-format msgid "Cluster options, as you might expect, control how the cluster behaves when confronted with certain situations." msgstr "" #. Tag: para #, no-c-format msgid "They are grouped into sets within the crm_config section, and, in advanced configurations, there may be more than one set. (This will be described later in the section on where we will show how to have the cluster use different sets of options during working hours than during weekends.) For now, we will describe the simple case where each option is present at most once." msgstr "" #. Tag: para #, no-c-format msgid "You can obtain an up-to-date list of cluster options, including their default values, by running the man pengine and man crmd commands." msgstr "" #. Tag: entry #, no-c-format msgid "Option" msgstr "" #. Tag: entry #, no-c-format msgid "Default" msgstr "" #. Tag: para #, no-c-format msgid "dc-version" msgstr "" #. Tag: para #, no-c-format msgid " dc-versionCluster Property Cluster Property ClusterPropertydc-version Propertydc-version dc-version Version of Pacemaker on the cluster’s DC. Determined automatically by the cluster. Often includes the hash which identifies the exact Git changeset it was built from. Used for diagnostic purposes." msgstr "" #. Tag: para #, no-c-format msgid "cluster-infrastructure" msgstr "" #. 
Tag: para #, no-c-format msgid " cluster-infrastructureCluster Property Cluster Property ClusterPropertycluster-infrastructure Propertycluster-infrastructure cluster-infrastructure The messaging stack on which Pacemaker is currently running. Determined automatically by the cluster. Used for informational and diagnostic purposes." msgstr "" #. Tag: para #, no-c-format msgid "expected-quorum-votes" msgstr "" #. Tag: para #, no-c-format msgid " expected-quorum-votesCluster Property Cluster Property ClusterPropertyexpected-quorum-votes Propertyexpected-quorum-votes expected-quorum-votes The number of nodes expected to be in the cluster. Determined automatically by the cluster. Used to calculate quorum in clusters that use Corosync 1.x without CMAN as the messaging layer." msgstr "" #. Tag: para #, no-c-format msgid "no-quorum-policy" msgstr "" #. Tag: para #, no-c-format msgid "stop" msgstr "" #. Tag: para #, no-c-format msgid " no-quorum-policyCluster Option Cluster Option ClusterOptionno-quorum-policy Optionno-quorum-policy no-quorum-policy What to do when the cluster does not have quorum. Allowed values:" msgstr "" #. Tag: para #, no-c-format msgid "ignore: continue all resource management" msgstr "" #. Tag: para #, no-c-format msgid "freeze: continue resource management, but don’t recover resources from nodes not in the affected partition" msgstr "" #. Tag: para #, no-c-format msgid "stop: stop all resources in the affected cluster partition" msgstr "" #. Tag: para #, no-c-format msgid "suicide: fence all nodes in the affected cluster partition" msgstr "" #. Tag: para #, no-c-format msgid "batch-limit" msgstr "" #. Tag: para #, no-c-format msgid "30" msgstr "" #. Tag: para #, no-c-format msgid " batch-limitCluster Option Cluster Option ClusterOptionbatch-limit Optionbatch-limit batch-limit The number of jobs that the Transition Engine (TE) is allowed to execute in parallel. 
The TE is the logic in pacemaker’s CRMd that executes the actions determined by the Policy Engine (PE). The \"correct\" value will depend on the speed and load of your network and cluster nodes." msgstr "" #. Tag: para #, no-c-format msgid "migration-limit" msgstr "" #. Tag: para #, no-c-format msgid "-1" msgstr "" #. Tag: para #, no-c-format msgid " migration-limitCluster Option Cluster Option ClusterOptionmigration-limit Optionmigration-limit migration-limit The number of migration jobs that the TE is allowed to execute in parallel on a node. A value of -1 means unlimited." msgstr "" #. Tag: para #, no-c-format msgid "symmetric-cluster" msgstr "" #. Tag: para #, no-c-format msgid "TRUE" msgstr "" #. Tag: para #, no-c-format msgid " symmetric-clusterCluster Option Cluster Option ClusterOptionsymmetric-cluster Optionsymmetric-cluster symmetric-cluster Can all resources run on any node by default?" msgstr "" #. Tag: para #, no-c-format msgid "stop-all-resources" msgstr "" #. Tag: para #, no-c-format msgid "FALSE" msgstr "" #. Tag: para #, no-c-format msgid " stop-all-resourcesCluster Option Cluster Option ClusterOptionstop-all-resources Optionstop-all-resources stop-all-resources Should the cluster stop all resources?" msgstr "" #. Tag: para #, no-c-format msgid "stop-orphan-resources" msgstr "" #. Tag: para #, no-c-format msgid " stop-orphan-resourcesCluster Option Cluster Option ClusterOptionstop-orphan-resources Optionstop-orphan-resources stop-orphan-resources Should deleted resources be stopped?" msgstr "" #. Tag: para #, no-c-format msgid "stop-orphan-actions" msgstr "" #. Tag: para #, no-c-format msgid " stop-orphan-actionsCluster Option Cluster Option ClusterOptionstop-orphan-actions Optionstop-orphan-actions stop-orphan-actions Should deleted actions be cancelled?" msgstr "" #. Tag: para #, no-c-format msgid "start-failure-is-fatal" msgstr "" #. 
Tag: para #, no-c-format -msgid " start-failure-is-fatalCluster Option Cluster Option ClusterOptionstart-failure-is-fatal Optionstart-failure-is-fatal start-failure-is-fatal Should a failure to start a resource on a particular node prevent further start attempts on that node? If FALSE, the cluster will decide whether to try starting on the same node again based on the resource’s current failure count and migration-threshold (see )." +msgid " start-failure-is-fatalCluster Option Cluster Option ClusterOptionstart-failure-is-fatal Optionstart-failure-is-fatal start-failure-is-fatal Should a failure to start a resource on a particular node prevent further start attempts on that node? If FALSE, the cluster will decide whether the same node is still eligible based on the resource’s current failure count and migration-threshold (see )." msgstr "" #. Tag: para #, no-c-format msgid "enable-startup-probes" msgstr "" #. Tag: para #, no-c-format msgid " enable-startup-probesCluster Option Cluster Option ClusterOptionenable-startup-probes Optionenable-startup-probes enable-startup-probes Should the cluster check for active resources during startup?" msgstr "" #. Tag: para #, no-c-format msgid "maintenance-mode" msgstr "" #. Tag: para #, no-c-format msgid " maintenance-modeCluster Option Cluster Option ClusterOptionmaintenance-mode Optionmaintenance-mode maintenance-mode Should the cluster refrain from monitoring, starting and stopping resources?" msgstr "" #. Tag: para #, no-c-format msgid "stonith-enabled" msgstr "" #. Tag: para #, no-c-format msgid " stonith-enabledCluster Option Cluster Option ClusterOptionstonith-enabled Optionstonith-enabled stonith-enabled Should failed nodes and nodes with resources that can’t be stopped be shot? If you value your data, set up a STONITH device and enable this." msgstr "" #. Tag: para #, no-c-format msgid "If true, or unset, the cluster will refuse to start resources unless one or more STONITH resources have been configured. 
If false, unresponsive nodes are immediately assumed to be running no resources, and resource takeover to online nodes starts without any further protection (which means data loss if the unresponsive node still accesses shared storage, for example). See also the requires meta-attribute in ." msgstr "" #. Tag: para #, no-c-format msgid "stonith-action" msgstr "" #. Tag: para #, no-c-format msgid "reboot" msgstr "" #. Tag: para #, no-c-format msgid " stonith-actionCluster Option Cluster Option ClusterOptionstonith-action Optionstonith-action stonith-action Action to send to STONITH device. Allowed values are reboot and off. The value poweroff is also allowed, but is only used for legacy devices." msgstr "" #. Tag: para #, no-c-format msgid "stonith-timeout" msgstr "" #. Tag: para #, no-c-format msgid "60s" msgstr "" #. Tag: para #, no-c-format msgid " stonith-timeoutCluster Option Cluster Option ClusterOptionstonith-timeout Optionstonith-timeout stonith-timeout How long to wait for STONITH actions (reboot, on, off) to complete" msgstr "" #. Tag: para #, no-c-format msgid "concurrent-fencing" msgstr "" #. Tag: para #, no-c-format msgid " concurrent-fencingCluster Option Cluster Option ClusterOptionconcurrent-fencing Optionconcurrent-fencing concurrent-fencing Is the cluster allowed to initiate multiple fence actions concurrently?" msgstr "" #. Tag: para #, no-c-format msgid "cluster-delay" msgstr "" #. Tag: para #, no-c-format msgid " cluster-delayCluster Option Cluster Option ClusterOptioncluster-delay Optioncluster-delay cluster-delay Estimated maximum round-trip delay over the network (excluding action execution). If the TE requires an action to be executed on another node, it will consider the action failed if it does not get a response from the other node in this time (after considering the action’s own timeout). The \"correct\" value will depend on the speed and load of your network and cluster nodes." msgstr "" #. 
Tag: para #, no-c-format msgid "dc-deadtime" msgstr "" #. Tag: para #, no-c-format msgid "20s" msgstr "" #. Tag: para #, no-c-format msgid " dc-deadtimeCluster Option Cluster Option ClusterOptiondc-deadtime Optiondc-deadtime dc-deadtime How long to wait for a response from other nodes during startup." msgstr "" #. Tag: para #, no-c-format msgid "The \"correct\" value will depend on the speed/load of your network and the type of switches used." msgstr "" #. Tag: para #, no-c-format msgid "cluster-recheck-interval" msgstr "" #. Tag: para #, no-c-format msgid "15min" msgstr "" #. Tag: para #, no-c-format msgid " cluster-recheck-intervalCluster Option Cluster Option ClusterOptioncluster-recheck-interval Optioncluster-recheck-interval cluster-recheck-interval Polling interval for time-based changes to options, resource parameters and constraints." msgstr "" #. Tag: para #, no-c-format msgid "The Cluster is primarily event-driven, but your configuration can have elements that take effect based on the time of day. To ensure these changes take effect, we can optionally poll the cluster’s status for changes. A value of 0 disables polling. Positive values are an interval (in seconds unless other SI units are specified, e.g. 5min)." msgstr "" #. Tag: para #, no-c-format msgid "pe-error-series-max" msgstr "" #. Tag: para #, no-c-format msgid " pe-error-series-maxCluster Option Cluster Option ClusterOptionpe-error-series-max Optionpe-error-series-max pe-error-series-max The number of PE inputs resulting in ERRORs to save. Used when reporting problems. A value of -1 means unlimited (report all)." msgstr "" #. Tag: para #, no-c-format msgid "pe-warn-series-max" msgstr "" #. Tag: para #, no-c-format msgid " pe-warn-series-maxCluster Option Cluster Option ClusterOptionpe-warn-series-max Optionpe-warn-series-max pe-warn-series-max The number of PE inputs resulting in WARNINGs to save. Used when reporting problems. A value of -1 means unlimited (report all)." msgstr "" #. 
Tag: para #, no-c-format msgid "pe-input-series-max" msgstr "" #. Tag: para #, no-c-format msgid " pe-input-series-maxCluster Option Cluster Option ClusterOptionpe-input-series-max Optionpe-input-series-max pe-input-series-max The number of \"normal\" PE inputs to save. Used when reporting problems. A value of -1 means unlimited (report all)." msgstr "" +#. Tag: para +#, no-c-format +msgid "node-health-strategy" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "none" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " node-health-strategyCluster Option Cluster Option ClusterOptionnode-health-strategy Optionnode-health-strategy node-health-strategy How the cluster should react to node health attributes (see ). Allowed values are none, migrate-on-red, only-green, progressive, and custom." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "node-health-base" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "0" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " node-health-baseCluster Option Cluster Option ClusterOptionnode-health-base Optionnode-health-base node-health-base The base health score assigned to a node. Only used when node-health-strategy is progressive. (since 1.1.16)" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "node-health-green" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " node-health-greenCluster Option Cluster Option ClusterOptionnode-health-green Optionnode-health-green node-health-green The score to use for a node health attribute whose value is green. Only used when node-health-strategy is progressive or custom." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "node-health-yellow" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " node-health-yellowCluster Option Cluster Option ClusterOptionnode-health-yellow Optionnode-health-yellow node-health-yellow The score to use for a node health attribute whose value is yellow. Only used when node-health-strategy is progressive or custom." +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "node-health-red" +msgstr "" + +#. Tag: para +#, no-c-format +msgid " node-health-redCluster Option Cluster Option ClusterOptionnode-health-red Optionnode-health-red node-health-red The score to use for a node health attribute whose value is red. Only used when node-health-strategy is progressive or custom." +msgstr "" + #. Tag: para #, no-c-format msgid "remove-after-stop" msgstr "" #. Tag: para #, no-c-format msgid " remove-after-stopCluster Option Cluster Option ClusterOptionremove-after-stop Optionremove-after-stop remove-after-stop Advanced Use Only: Should the cluster remove resources from the LRM after they are stopped? Values other than the default are, at best, poorly tested and potentially dangerous." msgstr "" #. Tag: para #, no-c-format msgid "startup-fencing" msgstr "" #. Tag: para #, no-c-format msgid " startup-fencingCluster Option Cluster Option ClusterOptionstartup-fencing Optionstartup-fencing startup-fencing Advanced Use Only: Should the cluster shoot unseen nodes? Not using the default is very unsafe!" msgstr "" #. Tag: para #, no-c-format msgid "election-timeout" msgstr "" #. Tag: para #, no-c-format msgid "2min" msgstr "" #. Tag: para #, no-c-format msgid " election-timeoutCluster Option Cluster Option ClusterOptionelection-timeout Optionelection-timeout election-timeout Advanced Use Only: If you need to adjust this value, it probably indicates the presence of a bug." msgstr "" #. Tag: para #, no-c-format msgid "shutdown-escalation" msgstr "" #. Tag: para #, no-c-format msgid "20min" msgstr "" #. Tag: para #, no-c-format msgid " shutdown-escalationCluster Option Cluster Option ClusterOptionshutdown-escalation Optionshutdown-escalation shutdown-escalation Advanced Use Only: If you need to adjust this value, it probably indicates the presence of a bug." msgstr "" #. Tag: para #, no-c-format msgid "crmd-integration-timeout" msgstr "" #. Tag: para #, no-c-format msgid "3min" msgstr "" #. 
Tag: para #, no-c-format msgid " crmd-integration-timeoutCluster Option Cluster Option ClusterOptioncrmd-integration-timeout Optioncrmd-integration-timeout crmd-integration-timeout Advanced Use Only: If you need to adjust this value, it probably indicates the presence of a bug." msgstr "" #. Tag: para #, no-c-format msgid "crmd-finalization-timeout" msgstr "" #. Tag: para #, no-c-format msgid "30min" msgstr "" #. Tag: para #, no-c-format msgid " crmd-finalization-timeoutCluster Option Cluster Option ClusterOptioncrmd-finalization-timeout Optioncrmd-finalization-timeout crmd-finalization-timeout Advanced Use Only: If you need to adjust this value, it probably indicates the presence of a bug." msgstr "" #. Tag: para #, no-c-format msgid "crmd-transition-delay" msgstr "" #. Tag: para #, no-c-format msgid "0s" msgstr "" #. Tag: para #, no-c-format msgid " crmd-transition-delayCluster Option Cluster Option ClusterOptioncrmd-transition-delay Optioncrmd-transition-delay crmd-transition-delay Advanced Use Only: Delay cluster recovery for the configured interval to allow for additional/related events to occur. Useful if your configuration is sensitive to the order in which ping updates arrive. Enabling this option will slow down cluster recovery under all conditions." msgstr "" #. Tag: para #, no-c-format msgid "default-resource-stickiness" msgstr "" -#. Tag: para -#, no-c-format -msgid "0" -msgstr "" - #. Tag: para #, no-c-format msgid " default-resource-stickinessCluster Option Cluster Option ClusterOptiondefault-resource-stickiness Optiondefault-resource-stickiness default-resource-stickiness Deprecated: See instead" msgstr "" #. Tag: para #, no-c-format msgid "is-managed-default" msgstr "" #. Tag: para #, no-c-format msgid " is-managed-defaultCluster Option Cluster Option ClusterOptionis-managed-default Optionis-managed-default is-managed-default Deprecated: See instead" msgstr "" #. Tag: para #, no-c-format msgid "default-action-timeout" msgstr "" #. 
Tag: para #, no-c-format msgid " default-action-timeoutCluster Option Cluster Option ClusterOptiondefault-action-timeout Optiondefault-action-timeout default-action-timeout Deprecated: See instead" msgstr "" #. Tag: title #, no-c-format msgid "Querying and Setting Cluster Options" msgstr "" #. Tag: para #, no-c-format msgid " QueryingCluster Option Cluster Option SettingCluster Option Cluster Option ClusterQuerying Options Querying Options ClusterSetting Options Setting Options " msgstr "" #. Tag: para #, no-c-format msgid "Cluster options can be queried and modified using the crm_attribute tool. To get the current value of cluster-delay, you can run:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --query --name cluster-delay" msgstr "" #. Tag: para #, no-c-format msgid "which is more simply written as" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute -G -n cluster-delay" msgstr "" #. Tag: para #, no-c-format msgid "If a value is found, you’ll see a result like this:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute -G -n cluster-delay\n" "scope=crm_config name=cluster-delay value=60s" msgstr "" #. Tag: para #, no-c-format msgid "If no value is found, the tool will display an error:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute -G -n clusta-deway\n" "scope=crm_config name=clusta-deway value=(null)\n" "Error performing operation: No such device or address" msgstr "" #. Tag: para #, no-c-format msgid "To use a different value (for example, 30 seconds), simply run:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --name cluster-delay --update 30s" msgstr "" #. Tag: para #, no-c-format msgid "To go back to the cluster’s default value, you can delete the value, for example:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --name cluster-delay --delete\n" "Deleted crm_config option: id=cib-bootstrap-options-cluster-delay name=cluster-delay" msgstr "" #. 
Tag: title #, no-c-format msgid "When Options are Listed More Than Once" msgstr "" #. Tag: para #, no-c-format msgid "If you ever see something like the following, it means that the option you’re modifying is present more than once." msgstr "" #. Tag: title #, no-c-format msgid "Deleting an option that is listed twice" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --name batch-limit --delete\n" "\n" "Multiple attributes match name=batch-limit in crm_config:\n" "Value: 50 (set=cib-bootstrap-options, id=cib-bootstrap-options-batch-limit)\n" "Value: 100 (set=custom, id=custom-batch-limit)\n" "Please choose from one of the matches above and supply the 'id' with --id" msgstr "" #. Tag: para #, no-c-format msgid "In such cases, follow the on-screen instructions to perform the requested action. To determine which value is currently being used by the cluster, refer to ." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Resource-Templates.pot b/doc/Pacemaker_Explained/pot/Ch-Resource-Templates.pot index 96fc81ea81..2c5cb5d167 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Resource-Templates.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Resource-Templates.pot @@ -1,317 +1,317 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Resource Templates" msgstr "" #. Tag: para #, no-c-format msgid "If you want to create lots of resources with similar configurations, defining a resource template simplifies the task. Once defined, it can be referenced in primitives or in certain types of constraints." msgstr "" #. 
Tag: title #, no-c-format msgid "Configuring Resources with Templates" msgstr "" #. Tag: para #, no-c-format msgid "The primitives referencing the template will inherit all meta-attributes, instance attributes, utilization attributes and operations defined in the template. And you can define specific attributes and operations for any of the primitives. If any of these are defined in both the template and the primitive, the values defined in the primitive will take precedence over the ones defined in the template." msgstr "" #. Tag: para #, no-c-format msgid "Hence, resource templates help to reduce the amount of configuration work. If any changes are needed, they can be done to the template definition and will take effect globally in all resource definitions referencing that template." msgstr "" #. Tag: para #, no-c-format msgid "Resource templates have a syntax similar to that of primitives." msgstr "" #. Tag: title #, no-c-format msgid "Resource template for a migratable Xen virtual machine" msgstr "" #. Tag: programlisting #, no-c-format msgid "<template id=\"vm-template\" class=\"ocf\" provider=\"heartbeat\" type=\"Xen\">\n" " <meta_attributes id=\"vm-template-meta_attributes\">\n" " <nvpair id=\"vm-template-meta_attributes-allow-migrate\" name=\"allow-migrate\" value=\"true\"/>\n" " </meta_attributes>\n" " <utilization id=\"vm-template-utilization\">\n" " <nvpair id=\"vm-template-utilization-memory\" name=\"memory\" value=\"512\"/>\n" " </utilization>\n" " <operations>\n" " <op id=\"vm-template-monitor-15s\" interval=\"15s\" name=\"monitor\" timeout=\"60s\"/>\n" " <op id=\"vm-template-start-0\" interval=\"0\" name=\"start\" timeout=\"60s\"/>\n" " </operations>\n" "</template>" msgstr "" #. Tag: para #, no-c-format msgid "Once you define a resource template, you can use it in primitives by specifying the template property." msgstr "" #. Tag: title #, no-c-format msgid "Xen primitive resource using a resource template" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<primitive id=\"vm1\" template=\"vm-template\">\n" " <instance_attributes id=\"vm1-instance_attributes\">\n" " <nvpair id=\"vm1-instance_attributes-name\" name=\"name\" value=\"vm1\"/>\n" " <nvpair id=\"vm1-instance_attributes-xmfile\" name=\"xmfile\" value=\"/etc/xen/shared-vm/vm1\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. Tag: para #, no-c-format msgid "In the example above, the new primitive vm1 will inherit everything from vm-template. For example, the equivalent of the above two examples would be:" msgstr "" #. Tag: title #, no-c-format msgid "Equivalent Xen primitive resource not using a resource template" msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive id=\"vm1\" class=\"ocf\" provider=\"heartbeat\" type=\"Xen\">\n" " <meta_attributes id=\"vm-template-meta_attributes\">\n" " <nvpair id=\"vm-template-meta_attributes-allow-migrate\" name=\"allow-migrate\" value=\"true\"/>\n" " </meta_attributes>\n" " <utilization id=\"vm-template-utilization\">\n" " <nvpair id=\"vm-template-utilization-memory\" name=\"memory\" value=\"512\"/>\n" " </utilization>\n" " <operations>\n" " <op id=\"vm-template-monitor-15s\" interval=\"15s\" name=\"monitor\" timeout=\"60s\"/>\n" " <op id=\"vm-template-start-0\" interval=\"0\" name=\"start\" timeout=\"60s\"/>\n" " </operations>\n" " <instance_attributes id=\"vm1-instance_attributes\">\n" " <nvpair id=\"vm1-instance_attributes-name\" name=\"name\" value=\"vm1\"/>\n" " <nvpair id=\"vm1-instance_attributes-xmfile\" name=\"xmfile\" value=\"/etc/xen/shared-vm/vm1\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. Tag: para #, no-c-format msgid "If you want to overwrite some attributes or operations, add them to the particular primitive’s definition." msgstr "" #. Tag: title #, no-c-format msgid "Xen resource overriding template values" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<primitive id=\"vm2\" template=\"vm-template\">\n" " <meta_attributes id=\"vm2-meta_attributes\">\n" " <nvpair id=\"vm2-meta_attributes-allow-migrate\" name=\"allow-migrate\" value=\"false\"/>\n" " </meta_attributes>\n" " <utilization id=\"vm2-utilization\">\n" " <nvpair id=\"vm2-utilization-memory\" name=\"memory\" value=\"1024\"/>\n" " </utilization>\n" " <instance_attributes id=\"vm2-instance_attributes\">\n" " <nvpair id=\"vm2-instance_attributes-name\" name=\"name\" value=\"vm2\"/>\n" " <nvpair id=\"vm2-instance_attributes-xmfile\" name=\"xmfile\" value=\"/etc/xen/shared-vm/vm2\"/>\n" " </instance_attributes>\n" " <operations>\n" " <op id=\"vm2-monitor-30s\" interval=\"30s\" name=\"monitor\" timeout=\"120s\"/>\n" " <op id=\"vm2-stop-0\" interval=\"0\" name=\"stop\" timeout=\"60s\"/>\n" " </operations>\n" "</primitive>" msgstr "" #. Tag: para #, no-c-format msgid "In the example above, the new primitive vm2 has special attribute values. Its monitor operation has a longer timeout and interval, and the primitive has an additional stop operation." msgstr "" #. Tag: para #, no-c-format msgid "To see the resulting definition of a resource, run:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource --query-xml --resource vm2" msgstr "" #. Tag: para #, no-c-format msgid "To see the raw definition of a resource in the CIB, run:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource --query-xml-raw --resource vm2" msgstr "" #. Tag: title #, no-c-format msgid "Referencing Templates in Constraints" msgstr "" #. Tag: para #, no-c-format msgid "A resource template can be referenced in the following types of constraints:" msgstr "" #. Tag: para #, no-c-format msgid "order constraints (see )" msgstr "" #. Tag: para #, no-c-format msgid "colocation constraints (see )" msgstr "" #. Tag: para #, no-c-format msgid "rsc_ticket constraints (for multi-site clusters as described in )" msgstr "" #. 
Tag: para #, no-c-format msgid "Resource templates referenced in constraints stand for all primitives which are derived from that template. This means, the constraint applies to all primitive resources referencing the resource template. Referencing resource templates in constraints is an alternative to resource sets and can simplify the cluster configuration considerably." msgstr "" #. Tag: para #, no-c-format msgid "For example, given the example templates earlier in this section:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_colocation id=\"vm-template-colo-base-rsc\" rsc=\"vm-template\" rsc-role=\"Started\" with-rsc=\"base-rsc\" score=\"INFINITY\"/>" msgstr "" #. Tag: para #, no-c-format msgid "would colocate all VMs with base-rsc and is the equivalent of the following constraint configuration:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_colocation id=\"vm-colo-base-rsc\" score=\"INFINITY\">\n" " <resource_set id=\"vm-colo-base-rsc-0\" sequential=\"false\" role=\"Started\">\n" " <resource_ref id=\"vm1\"/>\n" " <resource_ref id=\"vm2\"/>\n" " </resource_set>\n" " <resource_set id=\"vm-colo-base-rsc-1\">\n" " <resource_ref id=\"base-rsc\"/>\n" " </resource_set>\n" "</rsc_colocation>" msgstr "" #. Tag: para #, no-c-format msgid "In a colocation constraint, only one template may be referenced from either rsc or with-rsc; the other reference must be a regular resource." msgstr "" #. Tag: title #, no-c-format msgid "Referencing Resource Templates in Sequential Resource Sets" msgstr "" #. Tag: para #, no-c-format msgid "Resource templates can also be referenced in resource sets." msgstr "" #. Tag: para #, no-c-format msgid "For example:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_order id=\"order1\" score=\"INFINITY\">\n" " <resource_set id=\"order1-0\">\n" " <resource_ref id=\"base-rsc\"/>\n" " <resource_ref id=\"vm-template\"/>\n" " <resource_ref id=\"top-rsc\"/>\n" " </resource_set>\n" "</rsc_order>" msgstr "" #. 
Tag: para #, no-c-format msgid "is the equivalent of the following constraint configuration:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_order id=\"order1\" score=\"INFINITY\">\n" " <resource_set id=\"order1-0\">\n" " <resource_ref id=\"base-rsc\"/>\n" " <resource_ref id=\"vm1\"/>\n" " <resource_ref id=\"vm2\"/>\n" " <resource_ref id=\"top-rsc\"/>\n" " </resource_set>\n" "</rsc_order>" msgstr "" #. Tag: title #, no-c-format msgid "Referencing Resource Templates in Parallel Resource Sets" msgstr "" #. Tag: para #, no-c-format msgid "If the resources referencing the template can run in parallel:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_order id=\"order2\" score=\"INFINITY\">\n" " <resource_set id=\"order2-0\">\n" " <resource_ref id=\"base-rsc\"/>\n" " </resource_set>\n" " <resource_set id=\"order2-1\" sequential=\"false\">\n" " <resource_ref id=\"vm-template\"/>\n" " </resource_set>\n" " <resource_set id=\"order2-2\">\n" " <resource_ref id=\"top-rsc\"/>\n" " </resource_set>\n" "</rsc_order>" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_order id=\"order2\" score=\"INFINITY\">\n" " <resource_set id=\"order2-0\">\n" " <resource_ref id=\"base-rsc\"/>\n" " </resource_set>\n" " <resource_set id=\"order2-1\" sequential=\"false\">\n" " <resource_ref id=\"vm1\"/>\n" " <resource_ref id=\"vm2\"/>\n" " </resource_set>\n" " <resource_set id=\"order2-2\">\n" " <resource_ref id=\"top-rsc\"/>\n" " </resource_set>\n" "</rsc_order>" msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Resources.pot b/doc/Pacemaker_Explained/pot/Ch-Resources.pot index db8be236ab..fbf648f144 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Resources.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Resources.pot @@ -1,1230 +1,1255 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Cluster Resources" msgstr "" #. Tag: title #, no-c-format msgid "What is a Cluster Resource?" msgstr "" #. Tag: para #, no-c-format msgid " Resource " msgstr "" #. Tag: para #, no-c-format msgid "A resource is a service made highly available by a cluster. The simplest type of resource, a primitive resource, is described in this section. More complex forms, such as groups and clones, are described in later sections." msgstr "" #. Tag: para #, no-c-format msgid "Every primitive resource has a resource agent. A resource agent is an external program that abstracts the service it provides and presents a consistent view to the cluster." msgstr "" #. Tag: para #, no-c-format msgid "This allows the cluster to be agnostic about the resources it manages. The cluster doesn’t need to understand how the resource works because it relies on the resource agent to do the right thing when given a start, stop or monitor command. For this reason, it is crucial that resource agents are well-tested." msgstr "" #. Tag: para #, no-c-format msgid "Typically, resource agents come in the form of shell scripts. However, they can be written using any technology (such as C, Python or Perl) that the author is comfortable with." msgstr "" #. Tag: title #, no-c-format msgid "Resource Classes" msgstr "" #. Tag: para #, no-c-format msgid " Resourceclass class " msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker supports several classes of agents:" msgstr "" #. Tag: para #, no-c-format msgid "OCF" msgstr "" #.
Tag: para #, no-c-format msgid "LSB" msgstr "" #. Tag: para #, no-c-format msgid "Upstart" msgstr "" #. Tag: para #, no-c-format msgid "Systemd" msgstr "" #. Tag: para #, no-c-format msgid "Service" msgstr "" #. Tag: para #, no-c-format msgid "Fencing" msgstr "" #. Tag: para #, no-c-format msgid "Nagios Plugins" msgstr "" #. Tag: title #, no-c-format msgid "Open Cluster Framework" msgstr "" #. Tag: para #, no-c-format msgid " ResourceOCF OCF OCFResources Resources Open Cluster FrameworkResources Resources " msgstr "" #. Tag: para #, no-c-format msgid "The OCF standard See http://www.opencf.org/cgi-bin/viewcvs.cgi/specs/ra/resource-agent-api.txt?rev=HEAD  — at least as it relates to resource agents. The Pacemaker implementation has been somewhat extended from the OCF specs, but none of those changes are incompatible with the original OCF specification. is basically an extension of the Linux Standard Base conventions for init scripts to:" msgstr "" #. Tag: para #, no-c-format msgid "support parameters," msgstr "" #. Tag: para #, no-c-format msgid "make them self-describing, and" msgstr "" #. Tag: para #, no-c-format msgid "make them extensible" msgstr "" #. Tag: para #, no-c-format msgid "OCF specs have strict definitions of the exit codes that actions must return. The resource-agents source code includes the ocf-tester script, which can be useful in this regard. " msgstr "" #. Tag: para #, no-c-format msgid "The cluster follows these specifications exactly, and giving the wrong exit code will cause the cluster to behave in ways you will likely find puzzling and annoying. In particular, the cluster needs to distinguish a completely stopped resource from one which is in some erroneous and indeterminate state." msgstr "" #. Tag: para #, no-c-format msgid "Parameters are passed to the resource agent as environment variables, with the special prefix OCF_RESKEY_. So, a parameter which the user thinks of as ip will be passed to the resource agent as OCF_RESKEY_ip. 
The number and purpose of the parameters is left to the resource agent; however, the resource agent should use the meta-data command to advertise any that it supports." msgstr "" #. Tag: para #, no-c-format msgid "The OCF class is the most preferred as it is an industry standard, highly flexible (allowing parameters to be passed to agents in a non-positional manner) and self-describing." msgstr "" #. Tag: para #, no-c-format msgid "For more information, see the reference and ." msgstr "" #. Tag: title #, no-c-format msgid "Linux Standard Base" msgstr "" #. Tag: para #, no-c-format msgid " ResourceLSB LSB LSBResources Resources Linux Standard BaseResources Resources " msgstr "" #. Tag: para #, no-c-format msgid "LSB resource agents are those found in /etc/init.d." msgstr "" #. Tag: para #, no-c-format msgid "Generally, they are provided by the OS distribution and, in order to be used with the cluster, they must conform to the LSB Spec. See http://refspecs.linux-foundation.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html for the LSB Spec as it relates to init scripts. " msgstr "" #. Tag: para #, no-c-format msgid "Many distributions claim LSB compliance but ship with broken init scripts. For details on how to check whether your init script is LSB-compatible, see . Common problematic violations of the LSB standard include:" msgstr "" #. Tag: para #, no-c-format msgid "Not implementing the status operation at all" msgstr "" #. Tag: para #, no-c-format msgid "Not observing the correct exit status codes for start/stop/status actions" msgstr "" #. Tag: para #, no-c-format msgid "Starting a started resource returns an error" msgstr "" #. Tag: para #, no-c-format msgid "Stopping a stopped resource returns an error" msgstr "" #. Tag: para #, no-c-format msgid "Remember to make sure the computer is not configured to start any services at boot time — that should be controlled by the cluster." msgstr "" #. 
Tag: para #, no-c-format msgid " ResourceSystemd Systemd SystemdResources Resources " msgstr "" #. Tag: para #, no-c-format msgid "Some newer distributions have replaced the old \"SysV\" style of initialization daemons and scripts with an alternative called Systemd." msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker is able to manage these services if they are present." msgstr "" #. Tag: para #, no-c-format msgid "Instead of init scripts, systemd has unit files. Generally, the services (unit files) are provided by the OS distribution, but there are online guides for converting from init scripts. For example, http://0pointer.de/blog/projects/systemd-for-admins-3.html" msgstr "" #. Tag: para #, no-c-format msgid " ResourceUpstart Upstart UpstartResources Resources " msgstr "" #. Tag: para #, no-c-format msgid "Some newer distributions have replaced the old \"SysV\" style of initialization daemons (and scripts) with an alternative called Upstart." msgstr "" #. Tag: para #, no-c-format msgid "Instead of init scripts, upstart has jobs. Generally, the services (jobs) are provided by the OS distribution." msgstr "" #. Tag: title #, no-c-format msgid "System Services" msgstr "" #. Tag: para #, no-c-format msgid " ResourceSystem Services System Services System ServiceResources Resources " msgstr "" #. Tag: para #, no-c-format msgid "Since there are various types of system services (systemd, upstart, and lsb), Pacemaker supports a special service alias which intelligently figures out which one applies to a given cluster node." msgstr "" #. Tag: para #, no-c-format msgid "This is particularly useful when the cluster contains a mix of systemd, upstart, and lsb." msgstr "" #. Tag: para #, no-c-format msgid "In order, Pacemaker will try to find the named service as:" msgstr "" #. Tag: para #, no-c-format msgid "an LSB init script" msgstr "" #. Tag: para #, no-c-format msgid "a Systemd unit file" msgstr "" #. Tag: para #, no-c-format msgid "an Upstart job" msgstr "" #. 
Tag: title #, no-c-format msgid "STONITH" msgstr "" #. Tag: para #, no-c-format msgid " ResourceSTONITH STONITH STONITHResources Resources " msgstr "" #. Tag: para #, no-c-format msgid "The STONITH class is used exclusively for fencing-related resources. This is discussed later in ." msgstr "" #. Tag: para #, no-c-format msgid " ResourceNagios Plugins Nagios Plugins Nagios PluginsResources Resources " msgstr "" #. Tag: para #, no-c-format msgid "Nagios Plugins The project has two independent forks, hosted at https://www.nagios-plugins.org/ and https://www.monitoring-plugins.org/. Output from both projects' plugins is similar, so plugins from either project can be used with pacemaker. allow us to monitor services on remote hosts." msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker is able to do remote monitoring with the plugins if they are present." msgstr "" #. Tag: para #, no-c-format msgid "A common use case is to configure them as resources belonging to a resource container (usually a virtual machine), and the container will be restarted if any of them has failed. Another use is to configure them as ordinary resources to be used for monitoring hosts or services via the network." msgstr "" #. Tag: para #, no-c-format msgid "The supported parameters are the same as the long options of the plugin." msgstr "" #. Tag: title #, no-c-format msgid "Resource Properties" msgstr "" #. Tag: para #, no-c-format msgid "These values tell the cluster which resource agent to use for the resource, where to find that resource agent and what standards it conforms to." msgstr "" #. Tag: title #, no-c-format msgid "Properties of a Primitive Resource" msgstr "" #. Tag: entry #, no-c-format msgid "Field" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "id" msgstr "" #. Tag: para #, no-c-format msgid "Your name for the resource idResource Resource ResourcePropertyid Propertyid id " msgstr "" #.
Tag: para #, no-c-format msgid "class" msgstr "" #. Tag: para #, no-c-format msgid "The standard the resource agent conforms to. Allowed values: lsb, nagios, ocf, service, stonith, systemd, upstart classResource Resource ResourcePropertyclass Propertyclass class " msgstr "" #. Tag: para #, no-c-format msgid "type" msgstr "" #. Tag: para #, no-c-format msgid "The name of the Resource Agent you wish to use. E.g. IPaddr or Filesystem typeResource Resource ResourcePropertytype Propertytype type " msgstr "" #. Tag: para #, no-c-format msgid "provider" msgstr "" #. Tag: para #, no-c-format msgid "The OCF spec allows multiple vendors to supply the same resource agent. To use the OCF resource agents supplied by the Heartbeat project, you would specify heartbeat here. providerResource Resource ResourcePropertyprovider Propertyprovider provider " msgstr "" #. Tag: para #, no-c-format msgid "The XML definition of a resource can be queried with the crm_resource tool. For example:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource --resource Email --query-xml" msgstr "" #. Tag: para #, no-c-format msgid "might produce:" msgstr "" #. Tag: title #, no-c-format msgid "A system resource definition" msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive id=\"Email\" class=\"service\" type=\"exim\"/>" msgstr "" #. Tag: para #, no-c-format msgid "One of the main drawbacks to system services (LSB, systemd or Upstart) resources is that they do not allow any parameters!" msgstr "" #. Tag: title #, no-c-format msgid "An OCF resource definition" msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive id=\"Public-IP\" class=\"ocf\" type=\"IPaddr\" provider=\"heartbeat\">\n" " <instance_attributes id=\"Public-IP-params\">\n" " <nvpair id=\"Public-IP-ip\" name=\"ip\" value=\"192.0.2.2\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. Tag: title #, no-c-format msgid "Resource Options" msgstr "" #. 
Tag: para #, no-c-format msgid "Resources have two types of options: meta-attributes and instance attributes. Meta-attributes apply to any type of resource, while instance attributes are specific to each resource agent." msgstr "" #. Tag: title #, no-c-format msgid "Resource Meta-Attributes" msgstr "" #. Tag: para #, no-c-format msgid "Meta-attributes are used by the cluster to decide how a resource should behave and can be easily set using the --meta option of the crm_resource command." msgstr "" #. Tag: title #, no-c-format msgid "Meta-attributes of a Primitive Resource" msgstr "" #. Tag: entry #, no-c-format msgid "Default" msgstr "" #. Tag: para #, no-c-format msgid "priority" msgstr "" #. Tag: para #, no-c-format msgid "0" msgstr "" #. Tag: para #, no-c-format msgid "If not all resources can be active, the cluster will stop lower priority resources in order to keep higher priority ones active. priorityResource Option Resource Option ResourceOptionpriority Optionpriority priority " msgstr "" #. Tag: para #, no-c-format msgid "target-role" msgstr "" #. Tag: para #, no-c-format -msgid "started" +msgid "Started" msgstr "" #. Tag: para #, no-c-format msgid "What state should the cluster attempt to keep this resource in? Allowed values:" msgstr "" #. Tag: para #, no-c-format -msgid "stopped: Force the resource to be stopped" +msgid "Stopped: Force the resource to be stopped" msgstr "" #. Tag: para #, no-c-format -msgid "started: Allow the resource to be started (In the case of multi-state resources, they will not be promoted to master)" +msgid "Started: Allow the resource to be started (and in the case of multi-state resources, promoted to master if appropriate)" msgstr "" #. 
Tag: para #, no-c-format -msgid "master: Allow the resource to be started and, if appropriate, promoted target-roleResource Option Resource Option ResourceOptiontarget-role Optiontarget-role target-role " +msgid "Slave: Allow the resource to be started, but only in Slave mode if the resource is multi-state" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Master: Equivalent to Started target-roleResource Option Resource Option ResourceOptiontarget-role Optiontarget-role target-role " msgstr "" #. Tag: para #, no-c-format msgid "is-managed" msgstr "" #. Tag: para #, no-c-format msgid "TRUE" msgstr "" #. Tag: para #, no-c-format msgid "Is the cluster allowed to start and stop the resource? Allowed values: true, false is-managedResource Option Resource Option ResourceOptionis-managed Optionis-managed is-managed " msgstr "" #. Tag: para #, no-c-format msgid "resource-stickiness" msgstr "" #. Tag: para #, no-c-format msgid "value of resource-stickiness in the rsc_defaults section" msgstr "" #. Tag: para #, no-c-format msgid "How much does the resource prefer to stay where it is? resource-stickinessResource Option Resource Option ResourceOptionresource-stickiness Optionresource-stickiness resource-stickiness " msgstr "" #. Tag: para #, no-c-format msgid "requires" msgstr "" #. Tag: para #, no-c-format msgid "fencing (unless stonith-enabled is false or class is stonith, in which case it defaults to quorum)" msgstr "" #. Tag: para #, no-c-format -msgid "Conditions under which the resource can be started (Since 1.1.8) Allowed values:" +msgid "Conditions under which the resource can be started (since 1.1.8) Allowed values:" msgstr "" #. Tag: para #, no-c-format msgid "nothing: can always be started" msgstr "" #. Tag: para #, no-c-format msgid "quorum: The cluster can only start this resource if a majority of the configured nodes are active" msgstr "" #. 
Tag: para #, no-c-format msgid "fencing: The cluster can only start this resource if a majority of the configured nodes are active and any failed or unknown nodes have been powered off" msgstr "" #. Tag: para #, no-c-format -msgid "unfencing: The cluster can only start this resource if a majority of the configured nodes are active and any failed or unknown nodes have been powered off and only on nodes that have been unfenced" +msgid "unfencing: The cluster can only start this resource if a majority of the configured nodes are active and any failed or unknown nodes have been powered off and only on nodes that have been unfenced (since 1.1.9)" msgstr "" #. Tag: para #, no-c-format msgid " requiresResource Option Resource Option ResourceOptionrequires Optionrequires requires " msgstr "" #. Tag: para #, no-c-format msgid "migration-threshold" msgstr "" #. Tag: para #, no-c-format msgid "INFINITY" msgstr "" #. Tag: para #, no-c-format -msgid "How many failures may occur for this resource on a node, before this node is marked ineligible to host this resource. A value of INFINITY indicates that this feature is disabled. migration-thresholdResource Option Resource Option ResourceOptionmigration-threshold Optionmigration-threshold migration-threshold " +msgid "How many failures may occur for this resource on a node, before this node is marked ineligible to host this resource. A value of 0 indicates that this feature is disabled (the node will never be marked ineligible); by contrast, the cluster treats INFINITY (the default) as a very large but finite number. This option has an effect only if the failed operation has on-fail=restart (the default), and additionally for failed start operations, if the cluster property start-failure-is-fatal is false. migration-thresholdResource Option Resource Option ResourceOptionmigration-threshold Optionmigration-threshold migration-threshold " msgstr "" #. Tag: para #, no-c-format msgid "failure-timeout" msgstr "" #.
Tag: para #, no-c-format -msgid "How many seconds to wait before acting as if the failure had not occurred, and potentially allowing the resource back to the node on which it failed. A value of 0 indicates that this feature is disabled. failure-timeoutResource Option Resource Option ResourceOptionfailure-timeout Optionfailure-timeout failure-timeout " +msgid "How many seconds to wait before acting as if the failure had not occurred, and potentially allowing the resource back to the node on which it failed. A value of 0 indicates that this feature is disabled. As with any time-based actions, this is not guaranteed to be checked more frequently than the value of cluster-recheck-interval (see ). failure-timeoutResource Option Resource Option ResourceOptionfailure-timeout Optionfailure-timeout failure-timeout " msgstr "" #. Tag: para #, no-c-format msgid "multiple-active" msgstr "" #. Tag: para #, no-c-format msgid "stop_start" msgstr "" #. Tag: para #, no-c-format msgid "What should the cluster do if it ever finds the resource active on more than one node? Allowed values:" msgstr "" #. Tag: para #, no-c-format msgid "block: mark the resource as unmanaged" msgstr "" #. Tag: para #, no-c-format msgid "stop_only: stop all active instances and leave them that way" msgstr "" #. Tag: para #, no-c-format msgid "stop_start: stop all active instances and start the resource in one location only" msgstr "" #. Tag: para #, no-c-format msgid " multiple-activeResource Option Resource Option ResourceOptionmultiple-active Optionmultiple-active multiple-active " msgstr "" +#. Tag: para +#, no-c-format +msgid "allow-migrate" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "TRUE for ocf:pacemaker:remote resources, FALSE otherwise" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Whether the cluster should try to \"live migrate\" this resource when it needs to be moved (see )" +msgstr "" + #. Tag: para #, no-c-format msgid "remote-node" msgstr "" #. 
Tag: para #, no-c-format msgid "The name of the remote-node this resource defines. This both enables the resource as a remote-node and defines the unique name used to identify the remote-node. If no other parameters are set, this value will also be assumed as the hostname to connect to at the port specified by remote-port. WARNING: This value cannot overlap with any resource or node IDs. If not specified, this feature is disabled." msgstr "" #. Tag: para #, no-c-format msgid "remote-port" msgstr "" #. Tag: para #, no-c-format msgid "3121" msgstr "" #. Tag: para #, no-c-format msgid "Port to use for the guest connection to pacemaker_remote" msgstr "" #. Tag: para #, no-c-format msgid "remote-addr" msgstr "" #. Tag: para #, no-c-format msgid "value of remote-node" msgstr "" #. Tag: para #, no-c-format msgid "The IP address or hostname to connect to if remote-node’s name is not the hostname of the guest." msgstr "" #. Tag: para #, no-c-format msgid "remote-connect-timeout" msgstr "" #. Tag: para #, no-c-format msgid "60s" msgstr "" #. Tag: para #, no-c-format msgid "How long before a pending guest connection will time out." msgstr "" #. Tag: para #, no-c-format msgid "Support for remote nodes was added in pacemaker 1.1.10. If you are using an earlier version, options related to remote nodes will not be available." msgstr "" #. Tag: para #, no-c-format msgid "As an example of setting resource options, if you performed the following commands on an LSB Email resource:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource --meta --resource Email --set-parameter priority --parameter-value 100\n" "# crm_resource -m -r Email -p multiple-active -v block" msgstr "" #. Tag: para #, no-c-format msgid "the resulting resource definition might be:" msgstr "" #. Tag: title #, no-c-format msgid "An LSB resource with cluster options" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<primitive id=\"Email\" class=\"lsb\" type=\"exim\">\n" " <meta_attributes id=\"Email-meta_attributes\">\n" " <nvpair id=\"Email-meta_attributes-priority\" name=\"priority\" value=\"100\"/>\n" " <nvpair id=\"Email-meta_attributes-multiple-active\" name=\"multiple-active\" value=\"block\"/>\n" " </meta_attributes>\n" "</primitive>" msgstr "" #. Tag: title #, no-c-format msgid "Setting Global Defaults for Resource Meta-Attributes" msgstr "" #. Tag: para #, no-c-format msgid "To set a default value for a resource option, add it to the rsc_defaults section with crm_attribute. For example," msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --type rsc_defaults --name is-managed --update false" msgstr "" #. Tag: para #, no-c-format msgid "would prevent the cluster from starting or stopping any of the resources in the configuration (unless of course the individual resources were specifically enabled by having their is-managed set to true)." msgstr "" #. Tag: title #, no-c-format msgid "Resource Instance Attributes" msgstr "" #. Tag: para #, no-c-format msgid "The resource agents of some resource classes (lsb, systemd and upstart not among them) can be given parameters which determine how they behave and which instance of a service they control." msgstr "" #. Tag: para #, no-c-format msgid "If your resource agent supports parameters, you can add them with the crm_resource command. For example," msgstr "" #. Tag: screen #, no-c-format msgid "# crm_resource --resource Public-IP --set-parameter ip --parameter-value 192.0.2.2" msgstr "" #. Tag: para #, no-c-format msgid "would create an entry in the resource like this:" msgstr "" #. Tag: title #, no-c-format msgid "An example OCF resource with instance attributes" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<primitive id=\"Public-IP\" class=\"ocf\" type=\"IPaddr\" provider=\"heartbeat\">\n" " <instance_attributes id=\"params-public-ip\">\n" " <nvpair id=\"public-ip-addr\" name=\"ip\" value=\"192.0.2.2\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. Tag: para #, no-c-format msgid "For an OCF resource, the result would be an environment variable called OCF_RESKEY_ip with a value of 192.0.2.2." msgstr "" #. Tag: para #, no-c-format msgid "The list of instance attributes supported by an OCF resource agent can be found by calling the resource agent with the meta-data command. The output contains an XML description of all the supported attributes, their purpose and default values." msgstr "" #. Tag: title #, no-c-format msgid "Displaying the metadata for the Dummy resource agent template" msgstr "" #. Tag: screen #, no-c-format msgid "# export OCF_ROOT=/usr/lib/ocf\n" "# $OCF_ROOT/resource.d/pacemaker/Dummy meta-data" msgstr "" #. Tag: programlisting #, no-c-format msgid "<?xml version=\"1.0\"?>\n" "<!DOCTYPE resource-agent SYSTEM \"ra-api-1.dtd\">\n" "<resource-agent name=\"Dummy\" version=\"1.0\">\n" "<version>1.0</version>\n" "\n" "<longdesc>\n" "This is a Dummy Resource Agent. It does absolutely nothing except\n" "keep track of whether its running or not.\n" "Its purpose in life is for testing and to serve as a template for RA writers.\n" "\n" "NB: Please pay attention to the timeouts specified in the actions\n" "section below. They should be meaningful for the kind of resource\n" "the agent manages. They should be the minimum advised timeouts,\n" "but they shouldn't/cannot cover _all_ possible resource\n" "instances. So, try to be neither overly generous nor too stingy,\n" "but moderate. 
The minimum timeouts should never be below 10 seconds.\n" "</longdesc>\n" "<shortdesc>Example stateless resource agent</shortdesc>\n" "\n" "<parameters>\n" "<parameter name=\"state\" unique=\"1\">\n" "<longdesc>\n" "Location to store the resource state in.\n" "</longdesc>\n" "<shortdesc>State file</shortdesc>\n" "<content type=\"string\" default=\"/var/run/Dummy-default.state\" />\n" "</parameter>\n" "\n" "<parameter name=\"fake\" unique=\"0\">\n" "<longdesc>\n" "Fake attribute that can be changed to cause a reload\n" "</longdesc>\n" "<shortdesc>Fake attribute that can be changed to cause a reload</shortdesc>\n" "<content type=\"string\" default=\"dummy\" />\n" "</parameter>\n" "\n" "<parameter name=\"op_sleep\" unique=\"1\">\n" "<longdesc>\n" "Number of seconds to sleep during operations. This can be used to test how\n" "the cluster reacts to operation timeouts.\n" "</longdesc>\n" "<shortdesc>Operation sleep duration in seconds.</shortdesc>\n" "<content type=\"string\" default=\"0\" />\n" "</parameter>\n" "\n" "</parameters>\n" "\n" "<actions>\n" "<action name=\"start\" timeout=\"20\" />\n" "<action name=\"stop\" timeout=\"20\" />\n" "<action name=\"monitor\" timeout=\"20\" interval=\"10\" depth=\"0\"/>\n" "<action name=\"reload\" timeout=\"20\" />\n" "<action name=\"migrate_to\" timeout=\"20\" />\n" "<action name=\"migrate_from\" timeout=\"20\" />\n" "<action name=\"validate-all\" timeout=\"20\" />\n" "<action name=\"meta-data\" timeout=\"5\" />\n" "</actions>\n" "</resource-agent>" msgstr "" #. Tag: title #, no-c-format msgid "Resource Operations" msgstr "" #. Tag: para #, no-c-format msgid " ResourceAction Action " msgstr "" #. Tag: para #, no-c-format msgid "Operations are actions the cluster can perform on a resource by calling the resource agent. Resource agents must support certain common operations such as start, stop and monitor, and may implement any others." msgstr "" #. 
Tag: para #, no-c-format msgid "Some operations are generated by the cluster itself, for example, stopping and starting resources as needed." msgstr "" #. Tag: para #, no-c-format msgid "You can configure operations in the cluster configuration. As an example, by default the cluster will not ensure your resources stay healthy once they are started. Currently, anyway. Automatic monitoring operations may be added in a future version of Pacemaker. To instruct the cluster to do this, you need to add a monitor operation to the resource’s definition." msgstr "" #. Tag: title #, no-c-format msgid "An OCF resource with a recurring health check" msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive id=\"Public-IP\" class=\"ocf\" type=\"IPaddr\" provider=\"heartbeat\">\n" " <operations>\n" " <op id=\"public-ip-check\" name=\"monitor\" interval=\"60s\"/>\n" " </operations>\n" " <instance_attributes id=\"params-public-ip\">\n" " <nvpair id=\"public-ip-addr\" name=\"ip\" value=\"192.0.2.2\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. Tag: title #, no-c-format msgid "Properties of an Operation" msgstr "" #. Tag: para #, no-c-format msgid "A unique name for the operation. idAction Property Action Property ActionPropertyid Propertyid id " msgstr "" #. Tag: para #, no-c-format msgid "name" msgstr "" #. Tag: para #, no-c-format msgid "The action to perform. This can be any action supported by the agent; common values include monitor, start, and stop. nameAction Property Action Property ActionPropertyname Propertyname name " msgstr "" #. Tag: para #, no-c-format msgid "interval" msgstr "" #. Tag: para #, no-c-format msgid "How frequently (in seconds) to perform the operation. A value of 0 means never. A positive value defines a recurring action, which is typically used with monitor. intervalAction Property Action Property ActionPropertyinterval Propertyinterval interval " msgstr "" #. Tag: para #, no-c-format msgid "timeout" msgstr "" #. 
Tag: para #, no-c-format msgid "How long to wait before declaring the action has failed timeoutAction Property Action Property ActionPropertytimeout Propertytimeout timeout " msgstr "" #. Tag: para #, no-c-format msgid "on-fail" msgstr "" #. Tag: para #, no-c-format msgid "restart (except for stop operations, which default to fence when STONITH is enabled and block otherwise)" msgstr "" #. Tag: para #, no-c-format msgid "The action to take if this action ever fails. Allowed values:" msgstr "" #. Tag: para #, no-c-format msgid "ignore: Pretend the resource did not fail." msgstr "" #. Tag: para #, no-c-format msgid "block: Don’t perform any further operations on the resource." msgstr "" #. Tag: para #, no-c-format msgid "stop: Stop the resource and do not start it elsewhere." msgstr "" #. Tag: para #, no-c-format msgid "restart: Stop the resource and start it again (possibly on a different node)." msgstr "" #. Tag: para #, no-c-format msgid "fence: STONITH the node on which the resource failed." msgstr "" #. Tag: para #, no-c-format msgid "standby: Move all resources away from the node on which the resource failed." msgstr "" #. Tag: para #, no-c-format msgid " on-failAction Property Action Property ActionPropertyon-fail Propertyon-fail on-fail " msgstr "" #. Tag: para #, no-c-format msgid "enabled" msgstr "" #. Tag: para #, no-c-format msgid "If false, ignore this operation definition. This is typically used to pause a particular recurring monitor operation; for instance, it can complement the respective resource being unmanaged (is-managed=false), as this alone will not block any configured monitoring. Disabling the operation does not suppress all actions of the given type. Allowed values: true, false. enabledAction Property Action Property ActionPropertyenabled Propertyenabled enabled " msgstr "" #. Tag: para #, no-c-format msgid "record-pending" msgstr "" +#. Tag: para +#, no-c-format +msgid "FALSE" +msgstr "" + #. 
Tag: para #, no-c-format msgid "If true, the intention to perform the operation is recorded so that GUIs and CLI tools can indicate that an operation is in progress. This is best set as an operation default (see next section). Allowed values: true, false. enabledAction Property Action Property ActionPropertyenabled Propertyenabled enabled " msgstr "" #. Tag: para #, no-c-format msgid "role" msgstr "" #. Tag: para #, no-c-format msgid "Run the operation only on node(s) that the cluster thinks should be in the specified role. This only makes sense for recurring monitor operations. Allowed (case-sensitive) values: Stopped, Started, and in the case of multi-state resources, Slave and Master. roleAction Property Action Property ActionPropertyrole Propertyrole role " msgstr "" #. Tag: title #, no-c-format msgid "Monitoring Resources for Failure" msgstr "" #. Tag: para #, no-c-format msgid "When Pacemaker first starts a resource, it runs one-time monitor operations (referred to as probes) to ensure the resource is running where it’s supposed to be, and not running where it’s not supposed to be. (This behavior can be affected by the resource-discovery location constraint property.)" msgstr "" #. Tag: para #, no-c-format msgid "Other than those initial probes, Pacemaker will not (by default) check that the resource continues to stay healthy. As in the example above, you must configure monitor operations explicitly to perform these checks." msgstr "" #. Tag: para #, no-c-format msgid "By default, a monitor operation will ensure that the resource is running where it is supposed to. The target-role property can be used for further checking." msgstr "" #. 
Tag: para #, no-c-format msgid "For example, if a resource has one monitor operation with interval=10 role=Started and a second monitor operation with interval=11 role=Stopped, the cluster will run the first monitor on any nodes it thinks should be running the resource, and the second monitor on any nodes that it thinks should not be running the resource (for the truly paranoid, who want to know when an administrator manually starts a service by mistake)." msgstr "" #. Tag: title #, no-c-format msgid "Monitoring Resources When Administration is Disabled" msgstr "" #. Tag: para #, no-c-format msgid "Recurring monitor operations behave differently under various administrative settings:" msgstr "" #. Tag: para #, no-c-format msgid "When a resource is unmanaged (by setting is-managed=false): No monitors will be stopped." msgstr "" #. Tag: para #, no-c-format msgid "If the unmanaged resource is stopped on a node where the cluster thinks it should be running, the cluster will detect and report that it is not, but it will not consider the monitor failed, and will not try to start the resource until it is managed again." msgstr "" #. Tag: para #, no-c-format msgid "Starting the unmanaged resource on a different node is strongly discouraged and will at least cause the cluster to consider the resource failed, and may require the resource’s target-role to be set to Stopped then Started to be recovered." msgstr "" #. Tag: para #, no-c-format msgid "When a node is put into standby: All resources will be moved away from the node, and all monitor operations will be stopped on the node, except those with role=Stopped. Monitor operations with role=Stopped will be started on the node if appropriate." msgstr "" #. Tag: para #, no-c-format msgid "When the cluster is put into maintenance mode: All resources will be marked as unmanaged. All monitor operations will be stopped, except those with role=Stopped. 
As with single unmanaged resources, starting a resource on a node other than where the cluster expects it to be will cause problems." msgstr "" #. Tag: title #, no-c-format msgid "Setting Global Defaults for Operations" msgstr "" #. Tag: para #, no-c-format msgid "You can change the global default values for operation properties in a given cluster. These are defined in an op_defaults section of the CIB’s configuration section, and can be set with crm_attribute. For example," msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --type op_defaults --name timeout --update 20s" msgstr "" #. Tag: para #, no-c-format msgid "would default each operation’s timeout to 20 seconds. If an operation’s definition also includes a value for timeout, then that value would be used for that operation instead." msgstr "" #. Tag: title #, no-c-format msgid "When Implicit Operations Take a Long Time" msgstr "" #. Tag: para #, no-c-format msgid "The cluster will always perform a number of implicit operations: start, stop and a non-recurring monitor operation used at startup to check whether the resource is already active. If one of these is taking too long, then you can create an entry for them and specify a longer timeout." msgstr "" #. Tag: title #, no-c-format msgid "An OCF resource with custom timeouts for its implicit actions" msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive id=\"Public-IP\" class=\"ocf\" type=\"IPaddr\" provider=\"heartbeat\">\n" " <operations>\n" " <op id=\"public-ip-startup\" name=\"monitor\" interval=\"0\" timeout=\"90s\"/>\n" " <op id=\"public-ip-start\" name=\"start\" interval=\"0\" timeout=\"180s\"/>\n" " <op id=\"public-ip-stop\" name=\"stop\" interval=\"0\" timeout=\"15min\"/>\n" " </operations>\n" " <instance_attributes id=\"params-public-ip\">\n" " <nvpair id=\"public-ip-addr\" name=\"ip\" value=\"192.0.2.2\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. 
Tag: title #, no-c-format msgid "Multiple Monitor Operations" msgstr "" #. Tag: para #, no-c-format msgid "Provided no two operations (for a single resource) have the same name and interval, you can have as many monitor operations as you like. In this way, you can do a superficial health check every minute and progressively more intense ones at higher intervals." msgstr "" #. Tag: para #, no-c-format msgid "To tell the resource agent what kind of check to perform, you need to provide each monitor with a different value for a common parameter. The OCF standard creates a special parameter called OCF_CHECK_LEVEL for this purpose and dictates that it is \"made available to the resource agent without the normal OCF_RESKEY prefix\"." msgstr "" #. Tag: para #, no-c-format msgid "Whatever name you choose, you can specify it by adding an instance_attributes block to the op tag. It is up to each resource agent to look for the parameter and decide how to use it." msgstr "" #. Tag: title #, no-c-format msgid "An OCF resource with two recurring health checks, performing different levels of checks specified via OCF_CHECK_LEVEL." msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<primitive id=\"Public-IP\" class=\"ocf\" type=\"IPaddr\" provider=\"heartbeat\">\n" " <operations>\n" " <op id=\"public-ip-health-60\" name=\"monitor\" interval=\"60\">\n" " <instance_attributes id=\"params-public-ip-depth-60\">\n" " <nvpair id=\"public-ip-depth-60\" name=\"OCF_CHECK_LEVEL\" value=\"10\"/>\n" " </instance_attributes>\n" " </op>\n" " <op id=\"public-ip-health-300\" name=\"monitor\" interval=\"300\">\n" " <instance_attributes id=\"params-public-ip-depth-300\">\n" " <nvpair id=\"public-ip-depth-300\" name=\"OCF_CHECK_LEVEL\" value=\"20\"/>\n" -" </instance_attributes>\n" +" </instance_attributes>\n" " </op>\n" " </operations>\n" " <instance_attributes id=\"params-public-ip\">\n" " <nvpair id=\"public-ip-level\" name=\"ip\" value=\"192.0.2.2\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. Tag: title #, no-c-format msgid "Disabling a Monitor Operation" msgstr "" #. Tag: para #, no-c-format msgid "The easiest way to stop a recurring monitor is to just delete it. However, there can be times when you only want to disable it temporarily. In such cases, simply add enabled=\"false\" to the operation’s definition." msgstr "" #. Tag: title #, no-c-format msgid "Example of an OCF resource with a disabled health check" msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive id=\"Public-IP\" class=\"ocf\" type=\"IPaddr\" provider=\"heartbeat\">\n" " <operations>\n" " <op id=\"public-ip-check\" name=\"monitor\" interval=\"60s\" enabled=\"false\"/>\n" " </operations>\n" " <instance_attributes id=\"params-public-ip\">\n" " <nvpair id=\"public-ip-addr\" name=\"ip\" value=\"192.0.2.2\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. Tag: para #, no-c-format msgid "This can be achieved from the command line by executing:" msgstr "" #. Tag: screen #, no-c-format msgid "# cibadmin --modify --xml-text '<op id=\"public-ip-check\" enabled=\"false\"/>'" msgstr "" #. 
Tag: para #, no-c-format msgid "Once you’ve done whatever you needed to do, you can then re-enable it with" msgstr "" #. Tag: screen #, no-c-format msgid "# cibadmin --modify --xml-text '<op id=\"public-ip-check\" enabled=\"true\"/>'" msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Rules.pot b/doc/Pacemaker_Explained/pot/Ch-Rules.pot index f97eb12b36..a1f76f423d 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Rules.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Rules.pot @@ -1,807 +1,857 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Rules" msgstr "" #. Tag: para #, no-c-format msgid " ResourceConstraintRule ConstraintRule Rule " msgstr "" #. Tag: para #, no-c-format msgid "Rules can be used to make your configuration more dynamic. One common example is to set one value for resource-stickiness during working hours, to prevent resources from being moved back to their most preferred location, and another on weekends when no-one is around to notice an outage." msgstr "" #. Tag: para #, no-c-format msgid "Another use of rules might be to assign machines to different processing groups (using a node attribute) based on time and to then use that attribute when creating location constraints." msgstr "" #. Tag: para #, no-c-format msgid "Each rule can contain a number of expressions, date-expressions and even other rules. The results of the expressions are combined based on the rule’s boolean-op field to determine if the rule ultimately evaluates to true or false. What happens next depends on the context in which the rule is being used." msgstr "" #. 
Tag: title #, no-c-format msgid "Rule Properties" msgstr "" #. Tag: title #, no-c-format msgid "Properties of a Rule" msgstr "" #. Tag: entry #, no-c-format msgid "Field" msgstr "" #. Tag: entry #, no-c-format msgid "Default" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "role" msgstr "" #. Tag: para #, no-c-format -msgid "started" +msgid "Started" msgstr "" #. Tag: para #, no-c-format -msgid "Limits the rule to apply only when the resource is in the specified role. Allowed values are started, slave, and master. A rule with role=\"master\" cannot determine the initial location of a clone instance and will only affect which of the active instances will be promoted. roleConstraint Rule Constraint Rule ConstraintRulerole Rulerole role " +msgid "Limits the rule to apply only when the resource is in the specified role. Allowed values are Started, Slave, and Master. A rule with role=\"Master\" cannot determine the initial location of a clone instance and will only affect which of the active instances will be promoted. roleConstraint Rule Constraint Rule ConstraintRulerole Rulerole role " msgstr "" #. Tag: para #, no-c-format msgid "score" msgstr "" #. Tag: para #, no-c-format msgid "The score to apply if the rule evaluates to true. Limited to use in rules that are part of location constraints. scoreConstraint Rule Constraint Rule ConstraintRulescore Rulescore score " msgstr "" #. Tag: para #, no-c-format msgid "score-attribute" msgstr "" #. Tag: para #, no-c-format msgid "The node attribute to look up and use as a score if the rule evaluates to true. Limited to use in rules that are part of location constraints. score-attributeConstraint Rule Constraint Rule ConstraintRulescore-attribute Rulescore-attribute score-attribute " msgstr "" #. Tag: para #, no-c-format msgid "boolean-op" msgstr "" #. Tag: para #, no-c-format msgid "and" msgstr "" #. 
Tag: para #, no-c-format msgid "How to combine the result of multiple expression objects. Allowed values are and and or. boolean-opConstraint Rule Constraint Rule ConstraintRuleboolean-op Ruleboolean-op boolean-op " msgstr "" #. Tag: title #, no-c-format msgid "Node Attribute Expressions" msgstr "" #. Tag: para #, no-c-format msgid " ResourceConstraintAttribute Expression ConstraintAttribute Expression Attribute Expression " msgstr "" #. Tag: para #, no-c-format -msgid "Expression objects are used to control a resource based on the attributes defined by a node or nodes. In addition to any attributes added by the administrator, each node has a built-in node attribute called #uname that can also be used." +msgid "Expression objects are used to control a resource based on the attributes defined by a node or nodes." msgstr "" #. Tag: title #, no-c-format msgid "Properties of an Expression" msgstr "" #. Tag: para #, no-c-format msgid "value" msgstr "" #. Tag: para #, no-c-format msgid "User-supplied value for comparison valueConstraint Expression Constraint Expression ConstraintAttribute Expressionvalue Attribute Expressionvalue value " msgstr "" #. Tag: para #, no-c-format msgid "attribute" msgstr "" #. Tag: para #, no-c-format msgid "The node attribute to test attributeConstraint Expression Constraint Expression ConstraintAttribute Expressionattribute Attribute Expressionattribute attribute " msgstr "" #. Tag: para #, no-c-format msgid "type" msgstr "" #. Tag: para #, no-c-format msgid "string" msgstr "" #. Tag: para #, no-c-format msgid "Determines how the value(s) should be tested. Allowed values are string, integer, and version. typeConstraint Expression Constraint Expression ConstraintAttribute Expressiontype Attribute Expressiontype type " msgstr "" #. Tag: para #, no-c-format msgid "operation" msgstr "" #. Tag: para #, no-c-format msgid "The comparison to perform. Allowed values:" msgstr "" #. 
Tag: para #, no-c-format msgid "lt: True if the value of the node’s attribute is less than value" msgstr "" #. Tag: para #, no-c-format msgid "gt: True if the value of the node’s attribute is greater than value" msgstr "" #. Tag: para #, no-c-format msgid "lte: True if the value of the node’s attribute is less than or equal to value" msgstr "" #. Tag: para #, no-c-format msgid "gte: True if the value of the node’s attribute is greater than or equal to value" msgstr "" #. Tag: para #, no-c-format msgid "eq: True if the value of the node’s attribute is equal to value" msgstr "" #. Tag: para #, no-c-format msgid "ne: True if the value of the node’s attribute is not equal to value" msgstr "" #. Tag: para #, no-c-format msgid "defined: True if the node has the named attribute" msgstr "" #. Tag: para #, no-c-format msgid "not_defined: True if the node does not have the named attribute operationConstraint Expression Constraint Expression ConstraintAttribute Expressionoperation Attribute Expressionoperation operation " msgstr "" +#. Tag: para +#, no-c-format +msgid "In addition to any attributes added by the administrator, the cluster defines special, built-in node attributes for each node that can also be used." +msgstr "" + +#. Tag: title +#, no-c-format +msgid "Built-in node attributes" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Name" +msgstr "" + +#. Tag: entry +#, no-c-format +msgid "Value" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "#uname" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Node name" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "#kind" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Node type. Possible values are cluster, remote, and container. Kind is remote for Pacemaker Remote nodes created with the ocf:pacemaker:remote resource, and container for Pacemaker Remote guest nodes (a legacy name unrelated to the now-common use of \"container\" for resource isolation). (since 1.1.13)" +msgstr "" + +#. 
Tag: para +#, no-c-format +msgid "#ra-version" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The installed version of the resource agent on the node, as defined by the version attribute of the resource-agent tag in the agent’s metadata. Valid only within rules controlling resource options. This can be useful during rolling upgrades of a backward-incompatible resource agent. (coming in 1.1.17)" +msgstr "" + #. Tag: title #, no-c-format msgid "Time- and Date-Based Expressions" msgstr "" #. Tag: para #, no-c-format msgid " Time Based Expressions ResourceConstraintDate/Time Expression ConstraintDate/Time Expression Date/Time Expression " msgstr "" #. Tag: para #, no-c-format msgid "As the name suggests, date_expressions are used to control a resource or cluster option based on the current date/time. They may contain an optional date_spec and/or duration object depending on the context." msgstr "" #. Tag: title #, no-c-format msgid "Properties of a Date Expression" msgstr "" #. Tag: para #, no-c-format msgid "start" msgstr "" #. Tag: para #, no-c-format msgid "A date/time conforming to the ISO8601 specification. startConstraint Expression Constraint Expression ConstraintDate/Time Expressionstart Date/Time Expressionstart start " msgstr "" #. Tag: para #, no-c-format msgid "end" msgstr "" #. Tag: para #, no-c-format msgid "A date/time conforming to the ISO8601 specification. Can be inferred by supplying a value for start and a duration. endConstraint Expression Constraint Expression ConstraintDate/Time Expressionend Date/Time Expressionend end " msgstr "" #. Tag: para #, no-c-format msgid "Compares the current date/time with the start and/or end date, depending on the context. Allowed values:" msgstr "" #. Tag: para #, no-c-format msgid "gt: True if the current date/time is after start" msgstr "" #. Tag: para #, no-c-format msgid "lt: True if the current date/time is before end" msgstr "" #. 
Tag: para #, no-c-format -msgid "in-range: True if the current date/time is after start and before end" +msgid "in_range: True if the current date/time is after start and before end" msgstr "" #. Tag: para #, no-c-format -msgid "date-spec: True if the current date/time matches a date_spec object (described below) operationConstraint Expression Constraint Expression ConstraintDate/Time Expressionoperation Date/Time Expressionoperation operation " +msgid "date_spec: True if the current date/time matches a date_spec object (described below) operationConstraint Expression Constraint Expression ConstraintDate/Time Expressionoperation Date/Time Expressionoperation operation " msgstr "" #. Tag: para #, no-c-format msgid "As these comparisons (except for date_spec) include the time, the eq, neq, gte and lte operators have not been implemented since they would only be valid for a single second." msgstr "" #. Tag: title #, no-c-format msgid "Date Specifications" msgstr "" #. Tag: para #, no-c-format msgid " Date Specification ResourceConstraintDate Specification ConstraintDate Specification Date Specification " msgstr "" #. Tag: para #, no-c-format msgid "date_spec objects are used to create cron-like expressions relating to time. Each field can contain a single number or a single range. Instead of defaulting to zero, any field not supplied is ignored." msgstr "" #. Tag: para #, no-c-format msgid "For example, monthdays=\"1\" matches the first day of every month and hours=\"09-17\" matches the hours between 9am and 5pm (inclusive). At this time, multiple ranges (e.g. weekdays=\"1,2\" or weekdays=\"1-2,5-6\") are not supported; depending on demand, this might be implemented in a future release." msgstr "" #. Tag: title #, no-c-format msgid "Properties of a Date Specification" msgstr "" #. Tag: para #, no-c-format msgid "id" msgstr "" #. 
Tag: para #, no-c-format msgid "A unique name for the object idDate Specification Date Specification ConstraintDate Specificationid Date Specificationid id " msgstr "" #. Tag: para #, no-c-format msgid "hours" msgstr "" #. Tag: para #, no-c-format msgid "Allowed values: 0-23 hoursDate Specification Date Specification ConstraintDate Specificationhours Date Specificationhours hours " msgstr "" #. Tag: para #, no-c-format msgid "monthdays" msgstr "" #. Tag: para #, no-c-format msgid "Allowed values: 1-31 (depending on month and year) monthdaysDate Specification Date Specification ConstraintDate Specificationmonthdays Date Specificationmonthdays monthdays " msgstr "" #. Tag: para #, no-c-format msgid "weekdays" msgstr "" #. Tag: para #, no-c-format msgid "Allowed values: 1-7 (1=Monday, 7=Sunday) weekdaysDate Specification Date Specification ConstraintDate Specificationweekdays Date Specificationweekdays weekdays " msgstr "" #. Tag: para #, no-c-format msgid "yeardays" msgstr "" #. Tag: para #, no-c-format msgid "Allowed values: 1-366 (depending on the year) yeardaysDate Specification Date Specification ConstraintDate Specificationyeardays Date Specificationyeardays yeardays " msgstr "" #. Tag: para #, no-c-format msgid "months" msgstr "" #. Tag: para #, no-c-format msgid "Allowed values: 1-12 monthsDate Specification Date Specification ConstraintDate Specificationmonths Date Specificationmonths months " msgstr "" #. Tag: para #, no-c-format msgid "weeks" msgstr "" #. Tag: para #, no-c-format msgid "Allowed values: 1-53 (depending on weekyear) weeksDate Specification Date Specification ConstraintDate Specificationweeks Date Specificationweeks weeks " msgstr "" #. Tag: para #, no-c-format msgid "years" msgstr "" #. Tag: para #, no-c-format msgid "Year according to the Gregorian calendar yearsDate Specification Date Specification ConstraintDate Specificationyears Date Specificationyears years " msgstr "" #. Tag: para #, no-c-format msgid "weekyears" msgstr "" #. 
Tag: para #, no-c-format msgid "Year in which the week started; e.g. 1 January 2005 can be specified as 2005-001 Ordinal, 2005-01-01 Gregorian or 2004-W53-6 Weekly and thus would match years=\"2005\" or weekyears=\"2004\" weekyearsDate Specification Date Specification ConstraintDate Specificationweekyears Date Specificationweekyears weekyears " msgstr "" #. Tag: para #, no-c-format msgid "moon" msgstr "" #. Tag: para #, no-c-format msgid "Allowed values are 0-7 (0 is new, 4 is full moon). Seriously, you can use this. This was implemented to demonstrate the ease with which new comparisons could be added. moonDate Specification Date Specification ConstraintDate Specificationmoon Date Specificationmoon moon " msgstr "" #. Tag: title #, no-c-format msgid "Durations" msgstr "" #. Tag: para #, no-c-format msgid " Duration ResourceConstraintDuration ConstraintDuration Duration " msgstr "" #. Tag: para #, no-c-format -msgid "Durations are used to calculate a value for end when one is not supplied to in-range operations. They contain the same fields as date_spec objects but without the limitations (e.g. you can have a duration of 19 months). As with date_specs, any field not supplied is ignored." +msgid "Durations are used to calculate a value for end when one is not supplied to in_range operations. They contain the same fields as date_spec objects but without the limitations (e.g. you can have a duration of 19 months). As with date_specs, any field not supplied is ignored." msgstr "" #. Tag: title #, no-c-format msgid "Sample Time-Based Expressions" msgstr "" #. Tag: para #, no-c-format msgid "A small sample of how time-based expressions can be used:" msgstr "" #. Tag: title #, no-c-format msgid "True if now is any time in the year 2005" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rule id=\"rule1\">\n" " <date_expression id=\"date_expr1\" start=\"2005-001\" operation=\"in_range\">\n" " <duration years=\"1\"/>\n" " </date_expression>\n" "</rule>" msgstr "" #. 
Tag: title #, no-c-format msgid "Equivalent expression" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rule id=\"rule2\">\n" " <date_expression id=\"date_expr2\" operation=\"date_spec\">\n" " <date_spec years=\"2005\"/>\n" " </date_expression>\n" "</rule>" msgstr "" #. Tag: title #, no-c-format msgid "9am-5pm Monday-Friday" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rule id=\"rule3\">\n" " <date_expression id=\"date_expr3\" operation=\"date_spec\">\n" " <date_spec hours=\"9-16\" weekdays=\"1-5\"/>\n" " </date_expression>\n" "</rule>" msgstr "" #. Tag: para #, no-c-format msgid "Please note that the 16 matches up to 16:59:59, as the numeric value (hour) still matches!" msgstr "" #. Tag: title #, no-c-format msgid "9am-5pm Monday through Friday or anytime Saturday" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rule id=\"rule4\" boolean-op=\"or\">\n" " <date_expression id=\"date_expr4-1\" operation=\"date_spec\">\n" " <date_spec hours=\"9-16\" weekdays=\"1-5\"/>\n" " </date_expression>\n" " <date_expression id=\"date_expr4-2\" operation=\"date_spec\">\n" " <date_spec weekdays=\"6\"/>\n" " </date_expression>\n" "</rule>" msgstr "" #. Tag: title #, no-c-format msgid "9am-5pm or 9pm-12am Monday through Friday" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rule id=\"rule5\" boolean-op=\"and\">\n" " <rule id=\"rule5-nested1\" boolean-op=\"or\">\n" " <date_expression id=\"date_expr5-1\" operation=\"date_spec\">\n" " <date_spec hours=\"9-16\"/>\n" " </date_expression>\n" " <date_expression id=\"date_expr5-2\" operation=\"date_spec\">\n" " <date_spec hours=\"21-23\"/>\n" " </date_expression>\n" " </rule>\n" " <date_expression id=\"date_expr5-3\" operation=\"date_spec\">\n" " <date_spec weekdays=\"1-5\"/>\n" " </date_expression>\n" " </rule>" msgstr "" #. Tag: title #, no-c-format msgid "Mondays in March 2005" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<rule id=\"rule6\" boolean-op=\"and\">\n" " <date_expression id=\"date_expr6-1\" operation=\"date_spec\">\n" " <date_spec weekdays=\"1\"/>\n" " </date_expression>\n" " <date_expression id=\"date_expr6-2\" operation=\"in_range\"\n" " start=\"2005-03-01\" end=\"2005-04-01\"/>\n" " </rule>" msgstr "" #. Tag: para #, no-c-format msgid "Because no time is specified with the above dates, 00:00:00 is implied. This means that the range includes all of 2005-03-01 but none of 2005-04-01. You may wish to write end=\"2005-03-31T23:59:59\" to avoid confusion." msgstr "" #. Tag: title #, no-c-format msgid "A full moon on Friday the 13th" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rule id=\"rule7\" boolean-op=\"and\">\n" " <date_expression id=\"date_expr7\" operation=\"date_spec\">\n" " <date_spec weekdays=\"5\" monthdays=\"13\" moon=\"4\"/>\n" " </date_expression>\n" "</rule>" msgstr "" #. Tag: title #, no-c-format msgid "Using Rules to Determine Resource Location" msgstr "" #. Tag: para #, no-c-format msgid " RuleDetermine Resource Location Determine Resource Location ResourceLocationDetermine by Rules LocationDetermine by Rules Determine by Rules " msgstr "" #. Tag: para #, no-c-format msgid "A location constraint may contain rules. When the constraint’s outermost rule evaluates to false, the cluster treats the constraint as if it were not there. When the rule evaluates to true, the node’s preference for running the resource is updated with the score associated with the rule." msgstr "" #. Tag: para #, no-c-format msgid "If this sounds familiar, it is because you have been using a simplified syntax for location constraint rules already. Consider the following location constraint:" msgstr "" #. Tag: title #, no-c-format msgid "Prevent myApacheRsc from running on c001n03" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<rsc_location id=\"dont-run-apache-on-c001n03\" rsc=\"myApacheRsc\"\n" " score=\"-INFINITY\" node=\"c001n03\"/>" msgstr "" #. Tag: para #, no-c-format msgid "This constraint can be more verbosely written as:" msgstr "" #. Tag: title #, no-c-format msgid "Prevent myApacheRsc from running on c001n03 - expanded version" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rsc_location id=\"dont-run-apache-on-c001n03\" rsc=\"myApacheRsc\">\n" " <rule id=\"dont-run-apache-rule\" score=\"-INFINITY\">\n" " <expression id=\"dont-run-apache-expr\" attribute=\"#uname\"\n" " operation=\"eq\" value=\"c001n03\"/>\n" " </rule>\n" "</rsc_location>" msgstr "" #. Tag: para #, no-c-format msgid "The advantage of using the expanded form is that one can then add extra clauses to the rule, such as limiting the rule such that it only applies during certain times of the day or days of the week." msgstr "" #. Tag: title #, no-c-format msgid "Location Rules Based on Other Node Properties" msgstr "" #. Tag: para #, no-c-format msgid "The expanded form allows us to match on node properties other than its name. If we rated each machine’s CPU power such that the cluster had the following nodes section:" msgstr "" #. Tag: title #, no-c-format msgid "A sample nodes section for use with score-attribute" msgstr "" #. Tag: programlisting #, no-c-format msgid "<nodes>\n" " <node id=\"uuid1\" uname=\"c001n01\" type=\"normal\">\n" " <instance_attributes id=\"uuid1-custom_attrs\">\n" " <nvpair id=\"uuid1-cpu_mips\" name=\"cpu_mips\" value=\"1234\"/>\n" " </instance_attributes>\n" " </node>\n" " <node id=\"uuid2\" uname=\"c001n02\" type=\"normal\">\n" " <instance_attributes id=\"uuid2-custom_attrs\">\n" " <nvpair id=\"uuid2-cpu_mips\" name=\"cpu_mips\" value=\"5678\"/>\n" " </instance_attributes>\n" " </node>\n" "</nodes>" msgstr "" #. 
Tag: para #, no-c-format msgid "then we could prevent resources from running on underpowered machines with this rule:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<rule id=\"need-more-power-rule\" score=\"-INFINITY\">\n" " <expression id=\"need-more-power-expr\" attribute=\"cpu_mips\"\n" " operation=\"lt\" value=\"3000\"/>\n" "</rule>" msgstr "" #. Tag: title #, no-c-format msgid "Using score-attribute Instead of score" msgstr "" #. Tag: para #, no-c-format msgid "When using score-attribute instead of score, each node matched by the rule has its score adjusted differently, according to its value for the named node attribute. Thus, in the previous example, if a rule used score-attribute=\"cpu_mips\", c001n01 would have its preference to run the resource increased by 1234 whereas c001n02 would have its preference increased by 5678." msgstr "" #. Tag: title #, no-c-format msgid "Using Rules to Control Resource Options" msgstr "" #. Tag: para #, no-c-format msgid "Often some cluster nodes will be different from their peers. Sometimes, these differences — e.g. the location of a binary or the names of network interfaces — require resources to be configured differently depending on the machine they’re hosted on." msgstr "" #. Tag: para #, no-c-format msgid "By defining multiple instance_attributes objects for the resource and adding a rule to each, we can easily handle these special cases." msgstr "" #. Tag: para #, no-c-format msgid "In the example below, mySpecialRsc will use eth1 and port 9999 when run on node1, eth2 and port 8888 on node2 and default to eth0 and port 9999 for all other nodes." msgstr "" #. Tag: title #, no-c-format msgid "Defining different resource options based on the node name" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<primitive id=\"mySpecialRsc\" class=\"ocf\" type=\"Special\" provider=\"me\">\n" " <instance_attributes id=\"special-node1\" score=\"3\">\n" " <rule id=\"node1-special-case\" score=\"INFINITY\" >\n" " <expression id=\"node1-special-case-expr\" attribute=\"#uname\"\n" " operation=\"eq\" value=\"node1\"/>\n" " </rule>\n" " <nvpair id=\"node1-interface\" name=\"interface\" value=\"eth1\"/>\n" " </instance_attributes>\n" " <instance_attributes id=\"special-node2\" score=\"2\" >\n" " <rule id=\"node2-special-case\" score=\"INFINITY\">\n" " <expression id=\"node2-special-case-expr\" attribute=\"#uname\"\n" " operation=\"eq\" value=\"node2\"/>\n" " </rule>\n" " <nvpair id=\"node2-interface\" name=\"interface\" value=\"eth2\"/>\n" " <nvpair id=\"node2-port\" name=\"port\" value=\"8888\"/>\n" " </instance_attributes>\n" " <instance_attributes id=\"defaults\" score=\"1\" >\n" " <nvpair id=\"default-interface\" name=\"interface\" value=\"eth0\"/>\n" " <nvpair id=\"default-port\" name=\"port\" value=\"9999\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. Tag: para #, no-c-format msgid "The order in which instance_attributes objects are evaluated is determined by their score (highest to lowest). If not supplied, score defaults to zero, and objects with an equal score are processed in listed order. If the instance_attributes object has no rule or a rule that evaluates to true, then for any parameter the resource does not yet have a value for, the resource will use the parameter values defined by the instance_attributes." msgstr "" #. Tag: para #, no-c-format msgid "For example, given the configuration above, if the resource is placed on node1:" msgstr "" #. Tag: para #, no-c-format msgid "special-node1 has the highest score (3) and so is evaluated first; its rule evaluates to true, so interface is set to eth1." msgstr "" #. 
Tag: para #, no-c-format msgid "special-node2 is evaluated next with score 2, but its rule evaluates to false, so it is ignored." msgstr "" #. Tag: para #, no-c-format msgid "defaults is evaluated last with score 1, and has no rule, so its values are examined; interface is already defined, so the value here is not used, but port is not yet defined, so port is set to 9999." msgstr "" #. Tag: title #, no-c-format msgid "Using Rules to Control Cluster Options" msgstr "" #. Tag: para #, no-c-format msgid " RuleControlling Cluster Options Controlling Cluster Options ClusterSetting Options with Rules Setting Options with Rules " msgstr "" #. Tag: para #, no-c-format msgid "Controlling cluster options is achieved in much the same manner as specifying different resource options on different nodes." msgstr "" #. Tag: para #, no-c-format msgid "The difference is that because they are cluster options, one cannot (or should not, because they won’t work) use attribute-based expressions. The following example illustrates how to set a different resource-stickiness value during and outside work hours. This allows resources to automatically move back to their most preferred hosts, but at a time that (in theory) does not interfere with business activities." msgstr "" #. Tag: title #, no-c-format msgid "Change resource-stickiness during working hours" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<rsc_defaults>\n" " <meta_attributes id=\"core-hours\" score=\"2\">\n" " <rule id=\"core-hour-rule\" score=\"0\">\n" " <date_expression id=\"nine-to-five-Mon-to-Fri\" operation=\"date_spec\">\n" " <date_spec id=\"nine-to-five-Mon-to-Fri-spec\" hours=\"9-16\" weekdays=\"1-5\"/>\n" " </date_expression>\n" " </rule>\n" " <nvpair id=\"core-stickiness\" name=\"resource-stickiness\" value=\"INFINITY\"/>\n" " </meta_attributes>\n" " <meta_attributes id=\"after-hours\" score=\"1\" >\n" " <nvpair id=\"after-stickiness\" name=\"resource-stickiness\" value=\"0\"/>\n" " </meta_attributes>\n" "</rsc_defaults>" msgstr "" #. Tag: title #, no-c-format msgid "Ensuring Time-Based Rules Take Effect" msgstr "" #. Tag: para #, no-c-format msgid "A Pacemaker cluster is an event-driven system. As such, it won’t recalculate the best place for resources to run unless something (like a resource failure or configuration change) happens. This can mean that a location constraint that only allows resource X to run between 9am and 5pm is not enforced." msgstr "" #. Tag: para #, no-c-format msgid "If you rely on time-based rules, the cluster-recheck-interval cluster option (which defaults to 15 minutes) is essential. This tells the cluster to periodically recalculate the ideal state of the cluster." msgstr "" #. Tag: para #, no-c-format msgid "For example, if you set cluster-recheck-interval=\"5m\", then sometime between 09:00 and 09:05 the cluster would notice that it needs to start resource X, and between 17:00 and 17:05 it would realize that X needed to be stopped. The timing of the actual start and stop actions depends on what other actions the cluster may need to perform first." 
msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Status.pot b/doc/Pacemaker_Explained/pot/Ch-Status.pot index 8c4f651df1..38e1754f8c 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Status.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Status.pot @@ -1,632 +1,632 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Status — Here be dragons" msgstr "" #. Tag: para #, no-c-format msgid "Most users never need to understand the contents of the status section and can be happy with the output from crm_mon." msgstr "" #. Tag: para #, no-c-format msgid "However for those with a curious inclination, this section attempts to provide an overview of its contents." msgstr "" #. Tag: title #, no-c-format msgid "Node Status" msgstr "" #. Tag: para #, no-c-format msgid " NodeStatus Status Status of a Node " msgstr "" #. Tag: para #, no-c-format msgid "In addition to the cluster’s configuration, the CIB holds an up-to-date representation of each cluster node in the status section." msgstr "" #. Tag: title #, no-c-format msgid "A bare-bones status entry for a healthy node cl-virt-1" msgstr "" #. Tag: programlisting #, no-c-format msgid " <node_state id=\"cl-virt-1\" uname=\"cl-virt-1\" ha=\"active\" in_ccm=\"true\" crmd=\"online\" join=\"member\" expected=\"member\" crm-debug-origin=\"do_update_resource\">\n" " <transient_attributes id=\"cl-virt-1\"/>\n" " <lrm id=\"cl-virt-1\"/>\n" " </node_state>" msgstr "" #. Tag: para #, no-c-format msgid "Users are strongly advised not to modify any part of a node’s state directly. 
The cluster will periodically regenerate the entire section from authoritative sources, so any changes should be done with the tools appropriate to those sources." msgstr "" #. Tag: title #, no-c-format msgid "Authoritative Sources for State Information" msgstr "" #. Tag: entry #, no-c-format msgid "CIB Object" msgstr "" #. Tag: entry #, no-c-format msgid "Authoritative Source" msgstr "" #. Tag: para #, no-c-format msgid "node_state" msgstr "" #. Tag: para #, no-c-format msgid "crmd" msgstr "" #. Tag: para #, no-c-format msgid "transient_attributes" msgstr "" #. Tag: para #, no-c-format msgid "attrd" msgstr "" #. Tag: para #, no-c-format msgid "lrm" msgstr "" #. Tag: para #, no-c-format msgid "lrmd" msgstr "" #. Tag: para #, no-c-format msgid "The fields used in the node_state objects are named as they are largely for historical reasons and are rooted in Pacemaker’s origins as the Heartbeat resource manager. They have remained unchanged to preserve compatibility with older versions." msgstr "" #. Tag: title #, no-c-format msgid "Node Status Fields" msgstr "" #. Tag: entry #, no-c-format msgid "Field" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "id" msgstr "" #. Tag: para #, no-c-format msgid " idNode Status Node Status NodeStatusid Statusid id Unique identifier for the node. Corosync-based clusters use a numeric counter, while Heartbeat clusters use a (barely) human-readable UUID." msgstr "" #. Tag: para #, no-c-format msgid "uname" msgstr "" #. Tag: para #, no-c-format msgid " unameNode Status Node Status NodeStatusuname Statusuname uname The node’s machine name (output from uname -n)." msgstr "" #. Tag: para #, no-c-format msgid "ha" msgstr "" #. Tag: para #, no-c-format msgid " haNode Status Node Status NodeStatusha Statusha ha Is the cluster software active on this node? Allowed values: active, dead." msgstr "" #. Tag: para #, no-c-format msgid "in_ccm" msgstr "" #. 
Tag: para #, no-c-format msgid " in_ccmNode Status Node Status NodeStatusin_ccm Statusin_ccm in_ccm Is the node a member of the cluster? Allowed values: true, false." msgstr "" #. Tag: para #, no-c-format msgid "crmd" msgstr "" #. Tag: para #, no-c-format msgid " crmdNode Status Node Status NodeStatuscrmd Statuscrmd crmd Is the crmd process active on the node? Allowed values: online, offline." msgstr "" #. Tag: para #, no-c-format msgid "join" msgstr "" #. Tag: para #, no-c-format msgid " joinNode Status Node Status NodeStatusjoin Statusjoin join Does the node participate in hosting resources? Allowed values: down, pending, member, banned." msgstr "" #. Tag: para #, no-c-format msgid "expected" msgstr "" #. Tag: para #, no-c-format msgid " expectedNode Status Node Status NodeStatusexpected Statusexpected expected Expected value for join." msgstr "" #. Tag: para #, no-c-format msgid "crm-debug-origin" msgstr "" #. Tag: para #, no-c-format msgid " crm-debug-originNode Status Node Status NodeStatuscrm-debug-origin Statuscrm-debug-origin crm-debug-origin The origin of the most recent change(s). For diagnostic purposes." msgstr "" #. Tag: para #, no-c-format msgid "The cluster uses these fields to determine whether, at the node level, the node is healthy or is in a failed state and needs to be fenced." msgstr "" #. Tag: title #, no-c-format msgid "Transient Node Attributes" msgstr "" #. Tag: para #, no-c-format msgid "Like regular node attributes, the name/value pairs listed in the transient_attributes section help to describe the node. However they are forgotten by the cluster when the node goes offline. This can be useful, for instance, when you want a node to be in standby mode (not able to run resources) just until the next reboot." msgstr "" #. Tag: para #, no-c-format msgid "In addition to any values the administrator sets, the cluster will also store information about failed resources here." msgstr "" #. 
Tag: title #, no-c-format msgid "A set of transient node attributes for node cl-virt-1" msgstr "" #. Tag: programlisting #, no-c-format msgid "<transient_attributes id=\"cl-virt-1\">\n" " <instance_attributes id=\"status-cl-virt-1\">\n" " <nvpair id=\"status-cl-virt-1-pingd\" name=\"pingd\" value=\"3\"/>\n" " <nvpair id=\"status-cl-virt-1-probe_complete\" name=\"probe_complete\" value=\"true\"/>\n" " <nvpair id=\"status-cl-virt-1-fail-count-pingd:0\" name=\"fail-count-pingd:0\" value=\"1\"/>\n" " <nvpair id=\"status-cl-virt-1-last-failure-pingd:0\" name=\"last-failure-pingd:0\" value=\"1239009742\"/>\n" " </instance_attributes>\n" "</transient_attributes>" msgstr "" #. Tag: para #, no-c-format msgid "In the above example, we can see that the pingd:0 resource has failed once, at 09:22:22 UTC 6 April 2009. You can use the standard date command to print a human-readable version of any seconds-since-epoch value, for example date -d @1239009742. We also see that the node is connected to three pingd peers and that all known resources have been checked for on this machine (probe_complete)." msgstr "" #. Tag: title #, no-c-format msgid "Operation History" msgstr "" #. Tag: para #, no-c-format msgid " Operation History " msgstr "" #. Tag: para #, no-c-format msgid "A node’s resource history is held in the lrm_resources tag (a child of the lrm tag). The information stored here includes enough information for the cluster to stop the resource safely if it is removed from the configuration section. Specifically, the resource’s id, class, type and provider are stored." msgstr "" #. Tag: title #, no-c-format msgid "A record of the apcstonith resource" msgstr "" #. Tag: programlisting #, no-c-format msgid "<lrm_resource id=\"apcstonith\" type=\"apcmastersnmp\" class=\"stonith\"/>" msgstr "" #. Tag: para #, no-c-format msgid "Additionally, we store the last job for every combination of resource, action and interval. 
The concatenation of the values in this tuple is used to create the id of the lrm_rsc_op object." msgstr "" #. Tag: title #, no-c-format msgid "Contents of an lrm_rsc_op job" msgstr "" #. Tag: para #, no-c-format msgid " idAction Status Action Status ActionStatusid Statusid id " msgstr "" #. Tag: para #, no-c-format msgid "Identifier for the job constructed from the resource’s id, operation and interval." msgstr "" #. Tag: para #, no-c-format msgid "call-id" msgstr "" #. Tag: para #, no-c-format msgid " call-idAction Status Action Status ActionStatuscall-id Statuscall-id call-id " msgstr "" #. Tag: para #, no-c-format msgid "The job’s ticket number. Used as a sort key to determine the order in which the jobs were executed." msgstr "" #. Tag: para #, no-c-format msgid "operation" msgstr "" #. Tag: para #, no-c-format msgid " operationAction Status Action Status ActionStatusoperation Statusoperation operation " msgstr "" #. Tag: para #, no-c-format msgid "The action the resource agent was invoked with." msgstr "" #. Tag: para #, no-c-format msgid "interval" msgstr "" #. Tag: para #, no-c-format msgid " intervalAction Status Action Status ActionStatusinterval Statusinterval interval " msgstr "" #. Tag: para #, no-c-format msgid "The frequency, in milliseconds, at which the operation will be repeated. A one-off job is indicated by 0." msgstr "" #. Tag: para #, no-c-format msgid "op-status" msgstr "" #. Tag: para #, no-c-format msgid " op-statusAction Status Action Status ActionStatusop-status Statusop-status op-status " msgstr "" #. Tag: para #, no-c-format msgid "The job’s status. Generally this will be either 0 (done) or -1 (pending). Rarely used in favor of rc-code." msgstr "" #. Tag: para #, no-c-format msgid "rc-code" msgstr "" #. Tag: para #, no-c-format msgid " rc-codeAction Status Action Status ActionStatusrc-code Statusrc-code rc-code " msgstr "" #. Tag: para #, no-c-format msgid "The job’s result.
Refer to for details on what the values here mean and how they are interpreted." msgstr "" #. Tag: para #, no-c-format msgid "last-run" msgstr "" #. Tag: para #, no-c-format msgid " last-runAction Status Action Status ActionStatuslast-run Statuslast-run last-run " msgstr "" #. Tag: para #, no-c-format msgid "Machine-local date/time, in seconds since epoch, at which the job was executed. For diagnostic purposes." msgstr "" #. Tag: para #, no-c-format msgid "last-rc-change" msgstr "" #. Tag: para #, no-c-format msgid " last-rc-changeAction Status Action Status ActionStatuslast-rc-change Statuslast-rc-change last-rc-change " msgstr "" #. Tag: para #, no-c-format msgid "Machine-local date/time, in seconds since epoch, at which the job first returned the current value of rc-code. For diagnostic purposes." msgstr "" #. Tag: para #, no-c-format msgid "exec-time" msgstr "" #. Tag: para #, no-c-format msgid " exec-timeAction Status Action Status ActionStatusexec-time Statusexec-time exec-time " msgstr "" #. Tag: para #, no-c-format msgid "Time, in milliseconds, that the job was running for. For diagnostic purposes." msgstr "" #. Tag: para #, no-c-format msgid "queue-time" msgstr "" #. Tag: para #, no-c-format msgid " queue-timeAction Status Action Status ActionStatusqueue-time Statusqueue-time queue-time " msgstr "" #. Tag: para #, no-c-format msgid "Time, in seconds, that the job was queued for in the LRMd. For diagnostic purposes." msgstr "" #. Tag: para #, no-c-format msgid "crm_feature_set" msgstr "" #. Tag: para #, no-c-format msgid " crm_feature_setAction Status Action Status ActionStatuscrm_feature_set Statuscrm_feature_set crm_feature_set " msgstr "" #. Tag: para #, no-c-format msgid "The version which this job description conforms to. Used when processing op-digest." msgstr "" #. Tag: para #, no-c-format msgid "transition-key" msgstr "" #. 
Tag: para #, no-c-format msgid " transition-keyAction Status Action Status ActionStatustransition-key Statustransition-key transition-key " msgstr "" #. Tag: para #, no-c-format msgid "A concatenation of the job’s graph action number, the graph number, the expected result and the UUID of the crmd instance that scheduled it. This is used to construct transition-magic (below)." msgstr "" #. Tag: para #, no-c-format msgid "transition-magic" msgstr "" #. Tag: para #, no-c-format msgid " transition-magicAction Status Action Status ActionStatustransition-magic Statustransition-magic transition-magic " msgstr "" #. Tag: para #, no-c-format msgid "A concatenation of the job’s op-status, rc-code and transition-key. Guaranteed to be unique for the life of the cluster (which ensures it is part of CIB update notifications) and contains all the information needed for the crmd to correctly analyze and process the completed job. Most importantly, the decomposed elements tell the crmd if the job entry was expected and whether it failed." msgstr "" #. Tag: para #, no-c-format msgid "op-digest" msgstr "" #. Tag: para #, no-c-format msgid " op-digestAction Status Action Status ActionStatusop-digest Statusop-digest op-digest " msgstr "" #. Tag: para #, no-c-format msgid "An MD5 sum representing the parameters passed to the job. Used to detect changes to the configuration, to restart resources if necessary." msgstr "" #. Tag: para #, no-c-format msgid " crm-debug-originAction Status Action Status ActionStatuscrm-debug-origin Statuscrm-debug-origin crm-debug-origin " msgstr "" #. Tag: para #, no-c-format msgid "The origin of the current values. For diagnostic purposes." msgstr "" #. Tag: title #, no-c-format msgid "Simple Operation History Example" msgstr "" #. Tag: title #, no-c-format msgid "A monitor operation (determines current state of the apcstonith resource)" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<lrm_resource id=\"apcstonith\" type=\"apcmastersnmp\" class=\"stonith\">\n" " <lrm_rsc_op id=\"apcstonith_monitor_0\" operation=\"monitor\" call-id=\"2\"\n" " rc-code=\"7\" op-status=\"0\" interval=\"0\"\n" " crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.1\"\n" " op-digest=\"2e3da9274d3550dc6526fb24bfcbcba0\"\n" " transition-key=\"22:2:7:2668bbeb-06d5-40f9-936d-24cb7f87006a\"\n" " transition-magic=\"0:7;22:2:7:2668bbeb-06d5-40f9-936d-24cb7f87006a\"\n" " last-run=\"1239008085\" last-rc-change=\"1239008085\" exec-time=\"10\" queue-time=\"0\"/>\n" "</lrm_resource>" msgstr "" #. Tag: para #, no-c-format msgid "In the above example, the job is a non-recurring monitor operation often referred to as a \"probe\" for the apcstonith resource." msgstr "" #. Tag: para #, no-c-format msgid "The cluster schedules probes for every configured resource on a node when the node first starts, in order to determine the resource’s current state before it takes any further action." msgstr "" #. Tag: para #, no-c-format msgid "From the transition-key, we can see that this was the 22nd action of the 2nd graph produced by this instance of the crmd (2668bbeb-06d5-40f9-936d-24cb7f87006a)." msgstr "" #. Tag: para #, no-c-format msgid "The third field of the transition-key contains a 7, which indicates that the job expects to find the resource inactive. By looking at the rc-code property, we see that this was the case." msgstr "" #. Tag: para #, no-c-format msgid "As that is the only job recorded for this node, we can conclude that the cluster started the resource elsewhere." msgstr "" #. Tag: title #, no-c-format msgid "Complex Operation History Example" msgstr "" #. Tag: title #, no-c-format msgid "Resource history of a pingd clone with multiple jobs" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<lrm_resource id=\"pingd:0\" type=\"pingd\" class=\"ocf\" provider=\"pacemaker\">\n" " <lrm_rsc_op id=\"pingd:0_monitor_30000\" operation=\"monitor\" call-id=\"34\"\n" " rc-code=\"0\" op-status=\"0\" interval=\"30000\"\n" " crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.1\"\n" " transition-key=\"10:11:0:2668bbeb-06d5-40f9-936d-24cb7f87006a\"\n" " ...\n" " last-run=\"1239009741\" last-rc-change=\"1239009741\" exec-time=\"10\" queue-time=\"0\"/>\n" " <lrm_rsc_op id=\"pingd:0_stop_0\" operation=\"stop\"\n" " crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.1\" call-id=\"32\"\n" " rc-code=\"0\" op-status=\"0\" interval=\"0\"\n" " transition-key=\"11:11:0:2668bbeb-06d5-40f9-936d-24cb7f87006a\"\n" " ...\n" " last-run=\"1239009741\" last-rc-change=\"1239009741\" exec-time=\"10\" queue-time=\"0\"/>\n" " <lrm_rsc_op id=\"pingd:0_start_0\" operation=\"start\" call-id=\"33\"\n" " rc-code=\"0\" op-status=\"0\" interval=\"0\"\n" " crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.1\"\n" " transition-key=\"31:11:0:2668bbeb-06d5-40f9-936d-24cb7f87006a\"\n" " ...\n" " last-run=\"1239009741\" last-rc-change=\"1239009741\" exec-time=\"10\" queue-time=\"0\" />\n" " <lrm_rsc_op id=\"pingd:0_monitor_0\" operation=\"monitor\" call-id=\"3\"\n" " rc-code=\"0\" op-status=\"0\" interval=\"0\"\n" " crm-debug-origin=\"do_update_resource\" crm_feature_set=\"3.0.1\"\n" " transition-key=\"23:2:7:2668bbeb-06d5-40f9-936d-24cb7f87006a\"\n" " ...\n" " last-run=\"1239008085\" last-rc-change=\"1239008085\" exec-time=\"20\" queue-time=\"0\"/>\n" " </lrm_resource>" msgstr "" #. Tag: para #, no-c-format msgid "When more than one job record exists, it is important to first sort them by call-id before interpreting them." msgstr "" #. Tag: para #, no-c-format msgid "Once sorted, the above example can be summarized as:" msgstr "" #. 
Tag: para #, no-c-format msgid "A non-recurring monitor operation returning 7 (not running), with a call-id of 3" msgstr "" #. Tag: para #, no-c-format msgid "A stop operation returning 0 (success), with a call-id of 32" msgstr "" #. Tag: para #, no-c-format msgid "A start operation returning 0 (success), with a call-id of 33" msgstr "" #. Tag: para #, no-c-format msgid "A recurring monitor returning 0 (success), with a call-id of 34" msgstr "" #. Tag: para #, no-c-format msgid "The cluster processes each job record to build up a picture of the resource’s state. After the first and second entries, it is considered stopped, and after the third it is considered active." msgstr "" #. Tag: para #, no-c-format msgid "Based on the last operation, we can tell that the resource is currently active." msgstr "" #. Tag: para #, no-c-format msgid "Additionally, from the presence of a stop operation with a lower call-id than that of the start operation, we can conclude that the resource has been restarted. Specifically, this occurred as part of actions 11 and 31 of transition 11 from the crmd instance with the key 2668bbeb…. This information can be helpful for locating the relevant section of the logs when looking for the source of a failure." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Stonith.pot b/doc/Pacemaker_Explained/pot/Ch-Stonith.pot index d3ef340bed..68f84425b6 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Stonith.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Stonith.pot @@ -1,1262 +1,1272 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #.
Tag: title #, no-c-format msgid "STONITH" msgstr "" #. Tag: para #, no-c-format msgid " STONITHConfiguration Configuration " msgstr "" #. Tag: title #, no-c-format msgid "What Is STONITH?" msgstr "" #. Tag: para #, no-c-format msgid "STONITH (an acronym for \"Shoot The Other Node In The Head\"), also called fencing, protects your data from being corrupted by rogue nodes or concurrent access." msgstr "" #. Tag: para #, no-c-format msgid "Just because a node is unresponsive, this doesn’t mean it isn’t accessing your data. The only way to be 100% sure that your data is safe is to use STONITH so we can be certain that the node is truly offline, before allowing the data to be accessed from another node." msgstr "" #. Tag: para #, no-c-format msgid "STONITH also has a role to play in the event that a clustered service cannot be stopped. In this case, the cluster uses STONITH to force the whole node offline, thereby making it safe to start the service elsewhere." msgstr "" #. Tag: title #, no-c-format msgid "What STONITH Device Should You Use?" msgstr "" #. Tag: para #, no-c-format msgid "It is crucial that the STONITH device can allow the cluster to differentiate between a node failure and a network one." msgstr "" #. Tag: para #, no-c-format msgid "The biggest mistake people make in choosing a STONITH device is to use a remote power switch (such as many on-board IPMI controllers) that shares power with the node it controls. In such cases, the cluster cannot be sure if the node is really offline, or active and suffering from a network fault." msgstr "" #. Tag: para #, no-c-format msgid "Likewise, any device that relies on the machine being active (such as SSH-based \"devices\" used during testing) is inappropriate." msgstr "" #. Tag: title #, no-c-format msgid "Special Treatment of STONITH Resources" msgstr "" #. Tag: para #, no-c-format msgid "STONITH resources are somewhat special in Pacemaker." msgstr "" #.
Tag: para #, no-c-format msgid "STONITH may be initiated by pacemaker or by other parts of the cluster (such as resources like DRBD or DLM). To accommodate this, pacemaker does not require the STONITH resource to be in the started state in order to be used, thus allowing reliable use of STONITH devices in such a case." msgstr "" #. Tag: para #, no-c-format msgid "In pacemaker versions 1.1.9 and earlier, this feature either did not exist or did not work well. Only \"running\" STONITH resources could be used by Pacemaker for fencing, and if another component tried to fence a node while Pacemaker was moving STONITH resources, the fencing could fail." msgstr "" #. Tag: para #, no-c-format msgid "All nodes have access to STONITH devices' definitions and instantiate them on-the-fly when needed, but preference is given to verified instances, which are the ones that are started according to the cluster’s knowledge." msgstr "" #. Tag: para #, no-c-format msgid "In the case of a cluster split, the partition with a verified instance will have a slight advantage, because the STONITH daemon in the other partition will have to hear from all its current peers before choosing a node to perform the fencing." msgstr "" #. Tag: para #, no-c-format msgid "Fencing resources do work the same as regular resources in some respects:" msgstr "" #. Tag: para #, no-c-format msgid "target-role can be used to enable or disable the resource" msgstr "" #. Tag: para #, no-c-format msgid "Location constraints can be used to prevent a specific node from using the resource" msgstr "" #. Tag: para #, no-c-format msgid "Currently there is a limitation that fencing resources may only have one set of meta-attributes and one set of instance attributes. This can be revisited if it becomes a significant limitation for people." msgstr "" #. 
Tag: para #, no-c-format msgid "See the table below or run man stonithd to see special instance attributes that may be set for any fencing resource, regardless of fence agent." msgstr "" #. Tag: title #, no-c-format msgid "Properties of Fencing Resources" msgstr "" #. Tag: entry #, no-c-format msgid "Field" msgstr "" #. Tag: entry #, no-c-format msgid "Type" msgstr "" #. Tag: entry #, no-c-format msgid "Default" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "stonith-timeout" msgstr "" #. Tag: para #, no-c-format msgid "NA" msgstr "" #. Tag: para #, no-c-format msgid "Older versions used this to override the default period to wait for a STONITH (reboot, on, off) action to complete for this device. It has been replaced by the pcmk_reboot_timeout and pcmk_off_timeout properties. stonith-timeoutFencing Fencing FencingPropertystonith-timeout Propertystonith-timeout stonith-timeout " msgstr "" #. Tag: para #, no-c-format msgid "priority" msgstr "" #. Tag: para #, no-c-format msgid "integer" msgstr "" #. Tag: para #, no-c-format msgid "0" msgstr "" #. Tag: para #, no-c-format msgid "The priority of the STONITH resource. Devices are tried in order of highest priority to lowest. priorityFencing Fencing FencingPropertypriority Propertypriority priority " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_host_map" msgstr "" #. Tag: para #, no-c-format msgid "string" msgstr "" #. Tag: para #, no-c-format msgid "A mapping of host names to port numbers for devices that do not support host names. Example: node1:1;node2:2,3 tells the cluster to use port 1 for node1 and ports 2 and 3 for node2. pcmk_host_mapFencing Fencing FencingPropertypcmk_host_map Propertypcmk_host_map pcmk_host_map " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_host_list" msgstr "" #. Tag: para #, no-c-format msgid "A list of machines controlled by this device (optional unless pcmk_host_check is static-list).
pcmk_host_listFencing Fencing FencingPropertypcmk_host_list Propertypcmk_host_list pcmk_host_list " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_host_check" msgstr "" #. Tag: para #, no-c-format msgid "dynamic-list" msgstr "" #. Tag: para #, no-c-format msgid "How to determine which machines are controlled by the device. Allowed values:" msgstr "" #. Tag: para #, no-c-format msgid "dynamic-list: query the device" msgstr "" #. Tag: para #, no-c-format msgid "static-list: check the pcmk_host_list attribute" msgstr "" #. Tag: para #, no-c-format msgid "none: assume every device can fence every machine" msgstr "" #. Tag: para #, no-c-format msgid " pcmk_host_checkFencing Fencing FencingPropertypcmk_host_check Propertypcmk_host_check pcmk_host_check " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_delay_max" msgstr "" #. Tag: para #, no-c-format msgid "time" msgstr "" #. Tag: para #, no-c-format msgid "0s" msgstr "" #. Tag: para #, no-c-format msgid "Enable a random delay of up to the time specified before executing stonith actions. This is sometimes used in two-node clusters to ensure that the nodes don’t fence each other at the same time." msgstr "" #. Tag: para #, no-c-format msgid " pcmk_delay_maxFencing Fencing FencingPropertypcmk_delay_max Propertypcmk_delay_max pcmk_delay_max " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_action_limit" msgstr "" #. Tag: para #, no-c-format msgid "1" msgstr "" #. Tag: para #, no-c-format msgid "The maximum number of actions that can be performed in parallel on this device, if the cluster option concurrent-fencing is true. -1 is unlimited." msgstr "" #. Tag: para #, no-c-format msgid " pcmk_action_limitFencing Fencing FencingPropertypcmk_action_limit Propertypcmk_action_limit pcmk_action_limit " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_host_argument" msgstr "" #. Tag: para #, no-c-format msgid "port" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. 
Which parameter should be supplied to the resource agent to identify the node to be fenced. Some devices do not support the standard port parameter or may provide additional ones. Use this to specify an alternate, device-specific parameter. A value of none tells the cluster not to supply any additional parameters. pcmk_host_argumentFencing Fencing FencingPropertypcmk_host_argument Propertypcmk_host_argument pcmk_host_argument " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_reboot_action" msgstr "" #. Tag: para #, no-c-format msgid "reboot" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The command to send to the resource agent in order to reboot a node. Some devices do not support the standard commands or may provide additional ones. Use this to specify an alternate, device-specific command. pcmk_reboot_actionFencing Fencing FencingPropertypcmk_reboot_action Propertypcmk_reboot_action pcmk_reboot_action " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_reboot_timeout" msgstr "" #. Tag: para #, no-c-format msgid "60s" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. Specify an alternate timeout to use for reboot actions instead of the value of stonith-timeout. Some devices need much more or less time to complete than normal. Use this to specify an alternate, device-specific timeout. pcmk_reboot_timeoutFencing Fencing FencingPropertypcmk_reboot_timeout Propertypcmk_reboot_timeout pcmk_reboot_timeout stonith-timeoutFencing Fencing FencingPropertystonith-timeout Propertystonith-timeout stonith-timeout " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_reboot_retries" msgstr "" #. Tag: para #, no-c-format msgid "2" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The maximum number of times to retry the reboot command within the timeout period. 
Some devices do not support multiple connections, and operations may fail if the device is busy with another task, so Pacemaker will automatically retry the operation, if there is time remaining. Use this option to alter the number of times Pacemaker retries before giving up. pcmk_reboot_retriesFencing Fencing FencingPropertypcmk_reboot_retries Propertypcmk_reboot_retries pcmk_reboot_retries " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_off_action" msgstr "" #. Tag: para #, no-c-format msgid "off" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The command to send to the resource agent in order to shut down a node. Some devices do not support the standard commands or may provide additional ones. Use this to specify an alternate, device-specific command. pcmk_off_actionFencing Fencing FencingPropertypcmk_off_action Propertypcmk_off_action pcmk_off_action " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_off_timeout" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. Specify an alternate timeout to use for off actions instead of the value of stonith-timeout. Some devices need much more or less time to complete than normal. Use this to specify an alternate, device-specific timeout. pcmk_off_timeoutFencing Fencing FencingPropertypcmk_off_timeout Propertypcmk_off_timeout pcmk_off_timeout stonith-timeoutFencing Fencing FencingPropertystonith-timeout Propertystonith-timeout stonith-timeout " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_off_retries" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The maximum number of times to retry the off command within the timeout period. Some devices do not support multiple connections, and operations may fail if the device is busy with another task, so Pacemaker will automatically retry the operation, if there is time remaining. Use this option to alter the number of times Pacemaker retries before giving up. 
pcmk_off_retriesFencing Fencing FencingPropertypcmk_off_retries Propertypcmk_off_retries pcmk_off_retries " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_list_action" msgstr "" #. Tag: para #, no-c-format msgid "list" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The command to send to the resource agent in order to list nodes. Some devices do not support the standard commands or may provide additional ones. Use this to specify an alternate, device-specific command. pcmk_list_actionFencing Fencing FencingPropertypcmk_list_action Propertypcmk_list_action pcmk_list_action " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_list_timeout" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. Specify an alternate timeout to use for list actions instead of the value of stonith-timeout. Some devices need much more or less time to complete than normal. Use this to specify an alternate, device-specific timeout. pcmk_list_timeoutFencing Fencing FencingPropertypcmk_list_timeout Propertypcmk_list_timeout pcmk_list_timeout " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_list_retries" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The maximum number of times to retry the list command within the timeout period. Some devices do not support multiple connections, and operations may fail if the device is busy with another task, so Pacemaker will automatically retry the operation, if there is time remaining. Use this option to alter the number of times Pacemaker retries before giving up. pcmk_list_retriesFencing Fencing FencingPropertypcmk_list_retries Propertypcmk_list_retries pcmk_list_retries " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_monitor_action" msgstr "" #. Tag: para #, no-c-format msgid "monitor" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The command to send to the resource agent in order to report extended status. 
Some devices do not support the standard commands or may provide additional ones. Use this to specify an alternate, device-specific command. pcmk_monitor_actionFencing Fencing FencingPropertypcmk_monitor_action Propertypcmk_monitor_action pcmk_monitor_action " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_monitor_timeout" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. Specify an alternate timeout to use for monitor actions instead of the value of stonith-timeout. Some devices need much more or less time to complete than normal. Use this to specify an alternate, device-specific timeout. pcmk_monitor_timeoutFencing Fencing FencingPropertypcmk_monitor_timeout Propertypcmk_monitor_timeout pcmk_monitor_timeout " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_monitor_retries" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The maximum number of times to retry the monitor command within the timeout period. Some devices do not support multiple connections, and operations may fail if the device is busy with another task, so Pacemaker will automatically retry the operation, if there is time remaining. Use this option to alter the number of times Pacemaker retries before giving up. pcmk_monitor_retriesFencing Fencing FencingPropertypcmk_monitor_retries Propertypcmk_monitor_retries pcmk_monitor_retries " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_status_action" msgstr "" #. Tag: para #, no-c-format msgid "status" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The command to send to the resource agent in order to report status. Some devices do not support the standard commands or may provide additional ones. Use this to specify an alternate, device-specific command. pcmk_status_actionFencing Fencing FencingPropertypcmk_status_action Propertypcmk_status_action pcmk_status_action " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_status_timeout" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. 
Specify an alternate timeout to use for status actions instead of the value of stonith-timeout. Some devices need much more or less time to complete than normal. Use this to specify an alternate, device-specific timeout. pcmk_status_timeoutFencing Fencing FencingPropertypcmk_status_timeout Propertypcmk_status_timeout pcmk_status_timeout " msgstr "" #. Tag: para #, no-c-format msgid "pcmk_status_retries" msgstr "" #. Tag: para #, no-c-format msgid "Advanced use only. The maximum number of times to retry the status command within the timeout period. Some devices do not support multiple connections, and operations may fail if the device is busy with another task, so Pacemaker will automatically retry the operation, if there is time remaining. Use this option to alter the number of times Pacemaker retries before giving up. pcmk_status_retriesFencing Fencing FencingPropertypcmk_status_retries Propertypcmk_status_retries pcmk_status_retries " msgstr "" #. Tag: title #, no-c-format msgid "Configuring STONITH" msgstr "" #. Tag: para #, no-c-format msgid "Higher-level configuration shells include functionality to simplify the process below, particularly the step for deciding which parameters are required. However since this document deals only with core components, you should refer to the STONITH section of the Clusters from Scratch guide for those details." msgstr "" #. Tag: para #, no-c-format msgid "Find the correct driver:" msgstr "" #. Tag: screen #, no-c-format msgid "# stonith_admin --list-installed" msgstr "" #. Tag: para #, no-c-format msgid "Find the required parameters associated with the device (replacing $AGENT_NAME with the name obtained from the previous step):" msgstr "" #. Tag: screen #, no-c-format msgid "# stonith_admin --metadata --agent $AGENT_NAME" msgstr "" #. 
Tag: para #, no-c-format msgid "Create a file called stonith.xml containing a primitive resource with a class of stonith, a type equal to the agent name obtained earlier, and a parameter for each of the values returned in the previous step." msgstr "" #. Tag: para #, no-c-format msgid "If the device does not know how to fence nodes based on their uname, you may also need to set the special pcmk_host_map parameter. See man stonithd for details." msgstr "" #. Tag: para #, no-c-format msgid "If the device does not support the list command, you may also need to set the special pcmk_host_list and/or pcmk_host_check parameters. See man stonithd for details." msgstr "" #. Tag: para #, no-c-format msgid "If the device does not expect the victim to be specified with the port parameter, you may also need to set the special pcmk_host_argument parameter. See man stonithd for details." msgstr "" #. Tag: para #, no-c-format msgid "Upload it into the CIB using cibadmin:" msgstr "" #. Tag: screen #, no-c-format msgid "# cibadmin -C -o resources --xml-file stonith.xml" msgstr "" #. Tag: para #, no-c-format msgid "Set stonith-enabled to true:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute -t crm_config -n stonith-enabled -v true" msgstr "" #. Tag: para #, no-c-format msgid "Once the stonith resource is running, you can test it by executing the following (although you might want to stop the cluster on that machine first):" msgstr "" #. Tag: screen #, no-c-format msgid "# stonith_admin --reboot nodename" msgstr "" #. Tag: title #, no-c-format msgid "Example STONITH Configuration" msgstr "" #. Tag: para #, no-c-format msgid "Assume we have a chassis containing four nodes and an IPMI device active on 192.0.2.1. We would choose the fence_ipmilan driver, and obtain the following list of parameters:" msgstr "" #. Tag: title #, no-c-format msgid "Obtaining a list of STONITH Parameters" msgstr "" #. 
Tag: screen #, no-c-format msgid "# stonith_admin --metadata -a fence_ipmilan" msgstr "" #. Tag: programlisting #, no-c-format msgid "<resource-agent name=\"fence_ipmilan\" shortdesc=\"Fence agent for IPMI over LAN\">\n" " <symlink name=\"fence_ilo3\" shortdesc=\"Fence agent for HP iLO3\"/>\n" " <symlink name=\"fence_ilo4\" shortdesc=\"Fence agent for HP iLO4\"/>\n" " <symlink name=\"fence_idrac\" shortdesc=\"Fence agent for Dell iDRAC\"/>\n" " <symlink name=\"fence_imm\" shortdesc=\"Fence agent for IBM Integrated Management Module\"/>\n" " <longdesc>\n" " </longdesc>\n" " <vendor-url>\n" " </vendor-url>\n" " <parameters>\n" " <parameter name=\"auth\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-A\"/>\n" " <content type=\"string\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"ipaddr\" unique=\"0\" required=\"1\">\n" " <getopt mixed=\"-a\"/>\n" " <content type=\"string\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"passwd\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-p\"/>\n" " <content type=\"string\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"passwd_script\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-S\"/>\n" " <content type=\"string\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"lanplus\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-P\"/>\n" " <content type=\"boolean\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"login\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-l\"/>\n" " <content type=\"string\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"action\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-o\"/>\n" " <content type=\"string\" default=\"reboot\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"timeout\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-t\"/>\n" " <content type=\"string\"/>\n" " 
<shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"cipher\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-C\"/>\n" " <content type=\"string\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"method\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-M\"/>\n" " <content type=\"string\" default=\"onoff\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"power_wait\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-T\"/>\n" " <content type=\"string\" default=\"2\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"delay\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-f\"/>\n" " <content type=\"string\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"privlvl\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-L\"/>\n" " <content type=\"string\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " <parameter name=\"verbose\" unique=\"0\" required=\"0\">\n" " <getopt mixed=\"-v\"/>\n" " <content type=\"boolean\"/>\n" " <shortdesc>\n" " </shortdesc>\n" " </parameter>\n" " </parameters>\n" " <actions>\n" " <action name=\"on\"/>\n" " <action name=\"off\"/>\n" " <action name=\"reboot\"/>\n" " <action name=\"status\"/>\n" " <action name=\"diag\"/>\n" " <action name=\"list\"/>\n" " <action name=\"monitor\"/>\n" " <action name=\"metadata\"/>\n" " <action name=\"stop\" timeout=\"20s\"/>\n" " <action name=\"start\" timeout=\"20s\"/>\n" " </actions>\n" "</resource-agent>" msgstr "" #. Tag: para #, no-c-format msgid "Based on that, we would create a STONITH resource fragment that might look like this:" msgstr "" #. Tag: title #, no-c-format msgid "An IPMI-based STONITH Resource" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<primitive id=\"Fencing\" class=\"stonith\" type=\"fence_ipmilan\" >\n" " <instance_attributes id=\"Fencing-params\" >\n" " <nvpair id=\"Fencing-passwd\" name=\"passwd\" value=\"testuser\" />\n" " <nvpair id=\"Fencing-login\" name=\"login\" value=\"abc123\" />\n" " <nvpair id=\"Fencing-ipaddr\" name=\"ipaddr\" value=\"192.0.2.1\" />\n" " <nvpair id=\"Fencing-pcmk_host_list\" name=\"pcmk_host_list\" value=\"pcmk-1 pcmk-2\" />\n" " </instance_attributes>\n" " <operations >\n" " <op id=\"Fencing-monitor-10m\" interval=\"10m\" name=\"monitor\" timeout=\"300s\" />\n" " </operations>\n" "</primitive>" msgstr "" #. Tag: para #, no-c-format msgid "Finally, we need to enable STONITH:" msgstr "" #. Tag: title #, no-c-format msgid "Advanced STONITH Configurations" msgstr "" #. Tag: para #, no-c-format msgid "Some people consider that having one fencing device is a single point of failure (not true, since a node or resource must fail before fencing even has a chance to); others prefer removing the node from the storage and network instead of turning it off." msgstr "" #. Tag: para #, no-c-format msgid "Whatever the reason, Pacemaker supports fencing nodes with multiple devices through a feature called fencing topologies." msgstr "" #. Tag: para #, no-c-format msgid "Simply create the individual devices as you normally would, then define one or more fencing-level entries in the fencing-topology section of the configuration." msgstr "" #. Tag: para #, no-c-format -msgid "Each fencing level is attempted in order of ascending index. Allowed indexes are 0 to 9." +msgid "Each fencing level is attempted in order of ascending index. Allowed values are 1 through 9." msgstr "" #. Tag: para #, no-c-format msgid "If a device fails, processing terminates for the current level. No further devices in that level are exercised, and the next level is attempted instead." msgstr "" #. 
Tag: para #, no-c-format msgid "If the operation succeeds for all the listed devices in a level, the level is deemed to have passed." msgstr "" #. Tag: para #, no-c-format msgid "The operation is finished when a level has passed (success), or all levels have been attempted (failed)." msgstr "" #. Tag: para #, no-c-format msgid "If the operation failed, the next step is determined by the Policy Engine and/or crmd." msgstr "" #. Tag: para #, no-c-format msgid "Some possible uses of topologies include:" msgstr "" #. Tag: para #, no-c-format msgid "Try poison-pill and fall back to power" msgstr "" #. Tag: para #, no-c-format msgid "Try disk and network, and fall back to power if either fails" msgstr "" #. Tag: para #, no-c-format msgid "Initiate a kdump and then power off the node" msgstr "" #. Tag: title #, no-c-format msgid "Properties of Fencing Levels" msgstr "" #. Tag: para #, no-c-format msgid "id" msgstr "" #. Tag: para #, no-c-format msgid "A unique name for the level idfencing-level fencing-level Fencingfencing-levelid fencing-levelid id " msgstr "" #. Tag: para #, no-c-format msgid "target" msgstr "" #. Tag: para #, no-c-format msgid "The name of a single node to which this level applies targetfencing-level fencing-level Fencingfencing-leveltarget fencing-leveltarget target " msgstr "" #. Tag: para #, no-c-format msgid "target-pattern" msgstr "" #. Tag: para #, no-c-format msgid "A regular expression matching the names of nodes to which this level applies (since 1.1.14) target-patternfencing-level fencing-level Fencingfencing-leveltarget-pattern fencing-leveltarget-pattern target-pattern " msgstr "" #. Tag: para #, no-c-format msgid "target-attribute" msgstr "" #. 
Tag: para #, no-c-format -msgid "The name of a node attribute that is set for nodes to which this level applies (since 1.1.14) target-attributefencing-level fencing-level Fencingfencing-leveltarget-attribute fencing-leveltarget-attribute target-attribute " +msgid "The name of a node attribute that is set (to target-value) for nodes to which this level applies (since 1.1.14) target-attributefencing-level fencing-level Fencingfencing-leveltarget-attribute fencing-leveltarget-attribute target-attribute " +msgstr "" + +#. Tag: para +#, no-c-format +msgid "target-value" +msgstr "" + +#. Tag: para +#, no-c-format +msgid "The node attribute value (of target-attribute) that is set for nodes to which this level applies (since 1.1.14) target-attributefencing-level fencing-level Fencingfencing-leveltarget-attribute fencing-leveltarget-attribute target-attribute " msgstr "" #. Tag: para #, no-c-format msgid "index" msgstr "" #. Tag: para #, no-c-format -msgid "The order in which to attempt the levels. Levels are attempted in ascending order until one succeeds. indexfencing-level fencing-level Fencingfencing-levelindex fencing-levelindex index " +msgid "The order in which to attempt the levels. Levels are attempted in ascending order until one succeeds. Valid values are 1 through 9. indexfencing-level fencing-level Fencingfencing-levelindex fencing-levelindex index " msgstr "" #. Tag: para #, no-c-format msgid "devices" msgstr "" #. Tag: para #, no-c-format msgid "A comma-separated list of devices that must all be tried for this level devicesfencing-level fencing-level Fencingfencing-leveldevices fencing-leveldevices devices " msgstr "" #. Tag: title #, no-c-format msgid "Fencing topology with different devices for different nodes" msgstr "" #. 
Tag: programlisting #, no-c-format msgid " <cib crm_feature_set=\"3.0.6\" validate-with=\"pacemaker-1.2\" admin_epoch=\"1\" epoch=\"0\" num_updates=\"0\">\n" " <configuration>\n" " ...\n" " <fencing-topology>\n" " <!-- For pcmk-1, try poison-pill and fall back to power -->\n" " <fencing-level id=\"f-p1.1\" target=\"pcmk-1\" index=\"1\" devices=\"poison-pill\"/>\n" " <fencing-level id=\"f-p1.2\" target=\"pcmk-1\" index=\"2\" devices=\"power\"/>\n" "\n" " <!-- For pcmk-2, try disk and network, and fall back to power -->\n" " <fencing-level id=\"f-p2.1\" target=\"pcmk-2\" index=\"1\" devices=\"disk,network\"/>\n" " <fencing-level id=\"f-p2.2\" target=\"pcmk-2\" index=\"2\" devices=\"power\"/>\n" " </fencing-topology>\n" " ...\n" " </configuration>\n" " <status/>\n" "</cib>" msgstr "" #. Tag: title #, no-c-format msgid "Example Dual-Layer, Dual-Device Fencing Topologies" msgstr "" #. Tag: para #, no-c-format msgid "The following example illustrates an advanced use of fencing-topology in a cluster with the following properties:" msgstr "" #. Tag: para #, no-c-format msgid "3 nodes (2 active prod-mysql nodes, 1 prod-mysql-rep in standby for quorum purposes)" msgstr "" #. Tag: para #, no-c-format msgid "the active nodes have an IPMI-controlled power board reached at 192.0.2.1 and 192.0.2.2" msgstr "" #. Tag: para #, no-c-format msgid "the active nodes also have two independent PSUs (Power Supply Units) connected to two independent PDUs (Power Distribution Units) reached at 198.51.100.1 (port 10 and port 11) and 203.0.113.1 (port 10 and port 11)" msgstr "" #. Tag: para #, no-c-format msgid "the first fencing method uses the fence_ipmi agent" msgstr "" #. Tag: para #, no-c-format msgid "the second fencing method uses the fence_apc_snmp agent targeting 2 fencing devices (one per PSU, either port 10 or 11)" msgstr "" #. Tag: para #, no-c-format msgid "fencing is only implemented for the active nodes and has location constraints" msgstr "" #. 
Tag: para #, no-c-format msgid "fencing topology is set to try IPMI fencing first then default to a \"sure-kill\" dual PDU fencing" msgstr "" #. Tag: para #, no-c-format msgid "In a normal failure scenario, STONITH will first select fence_ipmi to try to kill the faulty node. Using a fencing topology, if that first method fails, STONITH will then move on to selecting fence_apc_snmp twice:" msgstr "" #. Tag: para #, no-c-format msgid "once for the first PDU" msgstr "" #. Tag: para #, no-c-format msgid "again for the second PDU" msgstr "" #. Tag: para #, no-c-format msgid "The fence action is considered successful only if both PDUs report the required status. If either of them fails, STONITH loops back to the first fencing method, fence_ipmi, and so on until the node is fenced or the fencing action is cancelled." msgstr "" #. Tag: title #, no-c-format msgid "First fencing method: single IPMI device" msgstr "" #. Tag: para #, no-c-format msgid "Each cluster node has its own dedicated IPMI channel that can be called for fencing using the following primitives:" msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<primitive class=\"stonith\" id=\"fence_prod-mysql1_ipmi\" type=\"fence_ipmilan\">\n" " <instance_attributes id=\"fence_prod-mysql1_ipmi-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"192.0.2.1\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-passwd\" name=\"passwd\" value=\"finishme\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-verbose\" name=\"verbose\" value=\"true\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql1\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-lanplus\" name=\"lanplus\" value=\"true\"/>\n" " </instance_attributes>\n" "</primitive>\n" "<primitive class=\"stonith\" id=\"fence_prod-mysql2_ipmi\" type=\"fence_ipmilan\">\n" " <instance_attributes id=\"fence_prod-mysql2_ipmi-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"192.0.2.2\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-passwd\" name=\"passwd\" value=\"finishme\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-verbose\" name=\"verbose\" value=\"true\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql2\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-lanplus\" name=\"lanplus\" value=\"true\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. 
Tag: title #, no-c-format msgid "Second fencing method: dual PDU devices" msgstr "" #. Tag: para #, no-c-format msgid "Each cluster node also has two distinct power channels controlled by two distinct PDUs. That means a total of 4 fencing devices configured as follows:" msgstr "" #. Tag: para #, no-c-format msgid "Node 1, PDU 1, PSU 1 @ port 10" msgstr "" #. Tag: para #, no-c-format msgid "Node 1, PDU 2, PSU 2 @ port 10" msgstr "" #. Tag: para #, no-c-format msgid "Node 2, PDU 1, PSU 1 @ port 11" msgstr "" #. Tag: para #, no-c-format msgid "Node 2, PDU 2, PSU 2 @ port 11" msgstr "" #. Tag: para #, no-c-format msgid "The matching fencing agents are configured as follows:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive class=\"stonith\" id=\"fence_prod-mysql1_apc1\" type=\"fence_apc_snmp\">\n" " <instance_attributes id=\"fence_prod-mysql1_apc1-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"198.51.100.1\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-port\" name=\"port\" value=\"10\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-passwd\" name=\"passwd\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql1\"/>\n" " </instance_attributes>\n" "</primitive>\n" "<primitive class=\"stonith\" id=\"fence_prod-mysql1_apc2\" type=\"fence_apc_snmp\">\n" " <instance_attributes id=\"fence_prod-mysql1_apc2-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"203.0.113.1\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair 
id=\"fence_prod-mysql1_apc2-instance_attributes-port\" name=\"port\" value=\"10\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-passwd\" name=\"passwd\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql1\"/>\n" " </instance_attributes>\n" "</primitive>\n" "<primitive class=\"stonith\" id=\"fence_prod-mysql2_apc1\" type=\"fence_apc_snmp\">\n" " <instance_attributes id=\"fence_prod-mysql2_apc1-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"198.51.100.1\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-port\" name=\"port\" value=\"11\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-passwd\" name=\"passwd\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql2\"/>\n" " </instance_attributes>\n" "</primitive>\n" "<primitive class=\"stonith\" id=\"fence_prod-mysql2_apc2\" type=\"fence_apc_snmp\">\n" " <instance_attributes id=\"fence_prod-mysql2_apc2-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"203.0.113.1\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-port\" name=\"port\" value=\"11\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-passwd\" name=\"passwd\" 
value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql2\"/>\n" " </instance_attributes>\n" "</primitive>" msgstr "" #. Tag: title #, no-c-format msgid "Location Constraints" msgstr "" #. Tag: para #, no-c-format msgid "To prevent STONITH from trying to run a fencing agent on the same node it is supposed to fence, constraints are placed on all the fencing primitives:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<constraints>\n" " <rsc_location id=\"l_fence_prod-mysql1_ipmi\" node=\"prod-mysql1\" rsc=\"fence_prod-mysql1_ipmi\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"l_fence_prod-mysql2_ipmi\" node=\"prod-mysql2\" rsc=\"fence_prod-mysql2_ipmi\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"l_fence_prod-mysql1_apc2\" node=\"prod-mysql1\" rsc=\"fence_prod-mysql1_apc2\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"l_fence_prod-mysql1_apc1\" node=\"prod-mysql1\" rsc=\"fence_prod-mysql1_apc1\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"l_fence_prod-mysql2_apc1\" node=\"prod-mysql2\" rsc=\"fence_prod-mysql2_apc1\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"l_fence_prod-mysql2_apc2\" node=\"prod-mysql2\" rsc=\"fence_prod-mysql2_apc2\" score=\"-INFINITY\"/>\n" "</constraints>" msgstr "" #. Tag: title #, no-c-format msgid "Fencing topology" msgstr "" #. Tag: para #, no-c-format msgid "Now that all the fencing resources are defined, it’s time to create the right topology. We want to first fence using IPMI and if that does not work, fence both PDUs to effectively and surely kill the node." msgstr "" #. 
Tag: programlisting #, no-c-format msgid "<fencing-topology>\n" " <fencing-level devices=\"fence_prod-mysql1_ipmi\" id=\"fencing-2\" index=\"1\" target=\"prod-mysql1\"/>\n" " <fencing-level devices=\"fence_prod-mysql1_apc1,fence_prod-mysql1_apc2\" id=\"fencing-3\" index=\"2\" target=\"prod-mysql1\"/>\n" " <fencing-level devices=\"fence_prod-mysql2_ipmi\" id=\"fencing-0\" index=\"1\" target=\"prod-mysql2\"/>\n" " <fencing-level devices=\"fence_prod-mysql2_apc1,fence_prod-mysql2_apc2\" id=\"fencing-1\" index=\"2\" target=\"prod-mysql2\"/>\n" "</fencing-topology>" msgstr "" #. Tag: para #, no-c-format msgid "Note that in fencing-topology, the level with the lowest index value is attempted first." msgstr "" #. Tag: title #, no-c-format msgid "Final configuration" msgstr "" #. Tag: para #, no-c-format msgid "Put together, the configuration looks like this:" msgstr "" #. Tag: programlisting #, no-c-format msgid "<cib admin_epoch=\"0\" crm_feature_set=\"3.0.7\" epoch=\"292\" have-quorum=\"1\" num_updates=\"29\" validate-with=\"pacemaker-1.2\">\n" " <configuration>\n" " <crm_config>\n" " <cluster_property_set id=\"cib-bootstrap-options\">\n" " <nvpair id=\"cib-bootstrap-options-stonith-enabled\" name=\"stonith-enabled\" value=\"true\"/>\n" " <nvpair id=\"cib-bootstrap-options-stonith-action\" name=\"stonith-action\" value=\"off\"/>\n" " <nvpair id=\"cib-bootstrap-options-expected-quorum-votes\" name=\"expected-quorum-votes\" value=\"3\"/>\n" " ...\n" " </cluster_property_set>\n" " </crm_config>\n" " <nodes>\n" " <node id=\"prod-mysql1\" uname=\"prod-mysql1\"/>\n" " <node id=\"prod-mysql2\" uname=\"prod-mysql2\"/>\n" " <node id=\"prod-mysql-rep1\" uname=\"prod-mysql-rep1\">\n" " <instance_attributes id=\"prod-mysql-rep1\">\n" " <nvpair id=\"prod-mysql-rep1-standby\" name=\"standby\" value=\"on\"/>\n" " </instance_attributes>\n" " </node>\n" " </nodes>\n" " <resources>\n" " <primitive class=\"stonith\" id=\"fence_prod-mysql1_ipmi\" 
type=\"fence_ipmilan\">\n" " <instance_attributes id=\"fence_prod-mysql1_ipmi-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"192.0.2.1\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-passwd\" name=\"passwd\" value=\"finishme\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-verbose\" name=\"verbose\" value=\"true\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql1\"/>\n" " <nvpair id=\"fence_prod-mysql1_ipmi-instance_attributes-lanplus\" name=\"lanplus\" value=\"true\"/>\n" " </instance_attributes>\n" " </primitive>\n" " <primitive class=\"stonith\" id=\"fence_prod-mysql2_ipmi\" type=\"fence_ipmilan\">\n" " <instance_attributes id=\"fence_prod-mysql2_ipmi-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"192.0.2.2\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-passwd\" name=\"passwd\" value=\"finishme\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-verbose\" name=\"verbose\" value=\"true\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql2\"/>\n" " <nvpair id=\"fence_prod-mysql2_ipmi-instance_attributes-lanplus\" name=\"lanplus\" value=\"true\"/>\n" " </instance_attributes>\n" " </primitive>\n" " <primitive class=\"stonith\" id=\"fence_prod-mysql1_apc1\" type=\"fence_apc_snmp\">\n" " <instance_attributes 
id=\"fence_prod-mysql1_apc1-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"198.51.100.1\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-port\" name=\"port\" value=\"10\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-passwd\" name=\"passwd\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc1-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql1\"/>\n" " </instance_attributes>\n" " </primitive>\n" " <primitive class=\"stonith\" id=\"fence_prod-mysql1_apc2\" type=\"fence_apc_snmp\">\n" " <instance_attributes id=\"fence_prod-mysql1_apc2-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"203.0.113.1\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-port\" name=\"port\" value=\"10\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-passwd\" name=\"passwd\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql1_apc2-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql1\"/>\n" " </instance_attributes>\n" " </primitive>\n" " <primitive class=\"stonith\" id=\"fence_prod-mysql2_apc1\" type=\"fence_apc_snmp\">\n" " <instance_attributes id=\"fence_prod-mysql2_apc1-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"198.51.100.1\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair 
id=\"fence_prod-mysql2_apc1-instance_attributes-port\" name=\"port\" value=\"11\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-passwd\" name=\"passwd\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc1-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql2\"/>\n" " </instance_attributes>\n" " </primitive>\n" " <primitive class=\"stonith\" id=\"fence_prod-mysql2_apc2\" type=\"fence_apc_snmp\">\n" " <instance_attributes id=\"fence_prod-mysql2_apc2-instance_attributes\">\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-ipaddr\" name=\"ipaddr\" value=\"203.0.113.1\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-action\" name=\"action\" value=\"off\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-port\" name=\"port\" value=\"11\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-login\" name=\"login\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-passwd\" name=\"passwd\" value=\"fencing\"/>\n" " <nvpair id=\"fence_prod-mysql2_apc2-instance_attributes-pcmk_host_list\" name=\"pcmk_host_list\" value=\"prod-mysql2\"/>\n" " </instance_attributes>\n" " </primitive>\n" " </resources>\n" " <constraints>\n" " <rsc_location id=\"l_fence_prod-mysql1_ipmi\" node=\"prod-mysql1\" rsc=\"fence_prod-mysql1_ipmi\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"l_fence_prod-mysql2_ipmi\" node=\"prod-mysql2\" rsc=\"fence_prod-mysql2_ipmi\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"l_fence_prod-mysql1_apc2\" node=\"prod-mysql1\" rsc=\"fence_prod-mysql1_apc2\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"l_fence_prod-mysql1_apc1\" node=\"prod-mysql1\" rsc=\"fence_prod-mysql1_apc1\" score=\"-INFINITY\"/>\n" " <rsc_location id=\"l_fence_prod-mysql2_apc1\" node=\"prod-mysql2\" rsc=\"fence_prod-mysql2_apc1\" score=\"-INFINITY\"/>\n" " 
<rsc_location id=\"l_fence_prod-mysql2_apc2\" node=\"prod-mysql2\" rsc=\"fence_prod-mysql2_apc2\" score=\"-INFINITY\"/>\n" " </constraints>\n" " <fencing-topology>\n" " <fencing-level devices=\"fence_prod-mysql1_ipmi\" id=\"fencing-2\" index=\"1\" target=\"prod-mysql1\"/>\n" " <fencing-level devices=\"fence_prod-mysql1_apc1,fence_prod-mysql1_apc2\" id=\"fencing-3\" index=\"2\" target=\"prod-mysql1\"/>\n" " <fencing-level devices=\"fence_prod-mysql2_ipmi\" id=\"fencing-0\" index=\"1\" target=\"prod-mysql2\"/>\n" " <fencing-level devices=\"fence_prod-mysql2_apc1,fence_prod-mysql2_apc2\" id=\"fencing-1\" index=\"2\" target=\"prod-mysql2\"/>\n" " </fencing-topology>\n" " ...\n" " </configuration>\n" "</cib>" msgstr "" #. Tag: title #, no-c-format msgid "Remapping Reboots" msgstr "" #. Tag: para #, no-c-format msgid "When the cluster needs to reboot a node, whether because stonith-action is reboot or because a reboot was manually requested (such as by stonith_admin --reboot), it will remap that to other commands in two cases:" msgstr "" #. Tag: para #, no-c-format msgid "If the chosen fencing device does not support the reboot command, the cluster will ask it to perform off instead." msgstr "" #. Tag: para #, no-c-format msgid "If a fencing topology level with multiple devices must be executed, the cluster will ask all the devices to perform off, then ask the devices to perform on." msgstr "" #. Tag: para #, no-c-format msgid "To understand the second case, consider the example of a node with redundant power supplies connected to intelligent power switches. Rebooting one switch and then the other would have no effect on the node. Turning both switches off, and then on, actually reboots the node." msgstr "" #. Tag: para #, no-c-format msgid "In such a case, the fencing operation will be treated as successful as long as the off commands succeed, because then it is safe for the cluster to recover any resources that were on the node. 
Timeouts and errors in the on phase will be logged but ignored." msgstr "" #. Tag: para #, no-c-format msgid "When a reboot operation is remapped, any action-specific timeout for the remapped action will be used (for example, pcmk_off_timeout will be used when executing the off command, not pcmk_reboot_timeout)." msgstr "" #. Tag: para #, no-c-format msgid "In Pacemaker versions 1.1.13 and earlier, reboots will not be remapped in the second case. To achieve the same effect, separate fencing devices for off and on actions must be configured." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Ch-Utilization.pot b/doc/Pacemaker_Explained/pot/Ch-Utilization.pot index ba87400325..5d437315ea 100644 --- a/doc/Pacemaker_Explained/pot/Ch-Utilization.pot +++ b/doc/Pacemaker_Explained/pot/Ch-Utilization.pot @@ -1,362 +1,362 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Utilization and Placement Strategy" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker decides where to place a resource according to the resource allocation scores on every node. The resource will be allocated to the node where the resource has the highest score." msgstr "" #. Tag: para #, no-c-format msgid "If the resource allocation scores on all the nodes are equal, by the default placement strategy, Pacemaker will choose a node with the least number of allocated resources for balancing the load. If the number of resources on each node is equal, the first eligible node listed in the CIB will be chosen to run the resource." msgstr "" #. 
Tag: para #, no-c-format msgid "Often, in real-world situations, different resources use significantly different proportions of a node’s capacities (memory, I/O, etc.). We cannot balance the load ideally just according to the number of resources allocated to a node. Besides, if resources are placed such that their combined requirements exceed the provided capacity, they may fail to start completely or run with degraded performance." msgstr "" #. Tag: para #, no-c-format msgid "To take these factors into account, Pacemaker allows you to configure:" msgstr "" #. Tag: para #, no-c-format msgid "The capacity a certain node provides." msgstr "" #. Tag: para #, no-c-format msgid "The capacity a certain resource requires." msgstr "" #. Tag: para #, no-c-format msgid "An overall strategy for placement of resources." msgstr "" #. Tag: title #, no-c-format msgid "Utilization attributes" msgstr "" #. Tag: para #, no-c-format msgid "To configure the capacity that a node provides or a resource requires, you can use utilization attributes in node and resource objects. You can name utilization attributes according to your preferences and define as many name/value pairs as your configuration needs. However, the attributes' values must be integers." msgstr "" #. Tag: title #, no-c-format msgid "Specifying CPU and RAM capacities of two nodes" msgstr "" #. Tag: programlisting #, no-c-format msgid "<node id=\"node1\" type=\"normal\" uname=\"node1\">\n" " <utilization id=\"node1-utilization\">\n" " <nvpair id=\"node1-utilization-cpu\" name=\"cpu\" value=\"2\"/>\n" " <nvpair id=\"node1-utilization-memory\" name=\"memory\" value=\"2048\"/>\n" " </utilization>\n" "</node>\n" "<node id=\"node2\" type=\"normal\" uname=\"node2\">\n" " <utilization id=\"node2-utilization\">\n" " <nvpair id=\"node2-utilization-cpu\" name=\"cpu\" value=\"4\"/>\n" " <nvpair id=\"node2-utilization-memory\" name=\"memory\" value=\"4096\"/>\n" " </utilization>\n" "</node>" msgstr "" #. 
Tag: title #, no-c-format msgid "Specifying CPU and RAM consumed by several resources" msgstr "" #. Tag: programlisting #, no-c-format msgid "<primitive id=\"rsc-small\" class=\"ocf\" provider=\"pacemaker\" type=\"Dummy\">\n" " <utilization id=\"rsc-small-utilization\">\n" " <nvpair id=\"rsc-small-utilization-cpu\" name=\"cpu\" value=\"1\"/>\n" " <nvpair id=\"rsc-small-utilization-memory\" name=\"memory\" value=\"1024\"/>\n" " </utilization>\n" "</primitive>\n" "<primitive id=\"rsc-medium\" class=\"ocf\" provider=\"pacemaker\" type=\"Dummy\">\n" " <utilization id=\"rsc-medium-utilization\">\n" " <nvpair id=\"rsc-medium-utilization-cpu\" name=\"cpu\" value=\"2\"/>\n" " <nvpair id=\"rsc-medium-utilization-memory\" name=\"memory\" value=\"2048\"/>\n" " </utilization>\n" "</primitive>\n" "<primitive id=\"rsc-large\" class=\"ocf\" provider=\"pacemaker\" type=\"Dummy\">\n" " <utilization id=\"rsc-large-utilization\">\n" " <nvpair id=\"rsc-large-utilization-cpu\" name=\"cpu\" value=\"3\"/>\n" " <nvpair id=\"rsc-large-utilization-memory\" name=\"memory\" value=\"3072\"/>\n" " </utilization>\n" "</primitive>" msgstr "" #. Tag: para #, no-c-format msgid "A node is considered eligible for a resource if it has sufficient free capacity to satisfy the resource’s requirements. The nature of the required or provided capacities is completely irrelevant to Pacemaker — it just makes sure that all capacity requirements of a resource are satisfied before placing a resource to a node." msgstr "" #. Tag: title #, no-c-format msgid "Placement Strategy" msgstr "" #. Tag: para #, no-c-format msgid "After you have configured the capacities your nodes provide and the capacities your resources require, you need to set the placement-strategy in the global cluster options, otherwise the capacity configurations have no effect." msgstr "" #. Tag: para #, no-c-format msgid "Four values are available for the placement-strategy:" msgstr "" #. Tag: term #, no-c-format msgid "default" msgstr "" #. 
Tag: para #, no-c-format msgid "Utilization values are not taken into account at all. Resources are allocated according to allocation scores. If scores are equal, resources are evenly distributed across nodes." msgstr "" #. Tag: term #, no-c-format msgid "utilization" msgstr "" #. Tag: para #, no-c-format msgid "Utilization values are taken into account only when deciding whether a node is considered eligible (i.e. whether it has sufficient free capacity to satisfy the resource’s requirements). Load-balancing is still done based on the number of resources allocated to a node." msgstr "" #. Tag: term #, no-c-format msgid "balanced" msgstr "" #. Tag: para #, no-c-format msgid "Utilization values are taken into account when deciding whether a node is eligible to serve a resource and when load-balancing, so an attempt is made to spread the resources in a way that optimizes resource performance." msgstr "" #. Tag: term #, no-c-format msgid "minimal" msgstr "" #. Tag: para #, no-c-format msgid "Utilization values are taken into account only when deciding whether a node is eligible to serve a resource. For load-balancing, an attempt is made to concentrate the resources on as few nodes as possible, thereby enabling possible power savings on the remaining nodes." msgstr "" #. Tag: para #, no-c-format msgid "Set placement-strategy with crm_attribute:" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_attribute --name placement-strategy --update balanced" msgstr "" #. Tag: para #, no-c-format msgid "Now Pacemaker will ensure the load from your resources will be distributed evenly throughout the cluster, without the need for convoluted sets of colocation constraints." msgstr "" #. Tag: title #, no-c-format msgid "Allocation Details" msgstr "" #. Tag: title #, no-c-format msgid "Which node is preferred to get consumed first when allocating resources?" msgstr "" #. Tag: para #, no-c-format msgid "The node with the highest node weight gets consumed first. 
Node weight is a score maintained by the cluster to represent node health." msgstr "" #. Tag: para #, no-c-format msgid "If multiple nodes have the same node weight:" msgstr "" #. Tag: para #, no-c-format msgid "If placement-strategy is default or utilization, the node that has the least number of allocated resources gets consumed first." msgstr "" #. Tag: para #, no-c-format msgid "If their numbers of allocated resources are equal, the first eligible node listed in the CIB gets consumed first." msgstr "" #. Tag: para #, no-c-format msgid "If placement-strategy is balanced, the node that has the most free capacity gets consumed first." msgstr "" #. Tag: para #, no-c-format msgid "If the free capacities of the nodes are equal, the node that has the least number of allocated resources gets consumed first." msgstr "" #. Tag: para #, no-c-format msgid "If placement-strategy is minimal, the first eligible node listed in the CIB gets consumed first." msgstr "" #. Tag: title #, no-c-format msgid "Which node has more free capacity?" msgstr "" #. Tag: para #, no-c-format msgid "If only one type of utilization attribute has been defined, free capacity is a simple numeric comparison." msgstr "" #. Tag: para #, no-c-format msgid "If multiple types of utilization attributes have been defined, then the node that is numerically highest in the most attribute types has the most free capacity. For example:" msgstr "" #. Tag: para #, no-c-format msgid "If nodeA has more free cpus, and nodeB has more free memory, then their free capacities are equal." msgstr "" #. Tag: para #, no-c-format msgid "If nodeA has more free cpus, while nodeB has more free memory and storage, then nodeB has more free capacity." msgstr "" #. Tag: title #, no-c-format msgid "Which resource is preferred to be assigned first?" msgstr "" #. Tag: para #, no-c-format msgid "The resource that has the highest priority (see ) gets allocated first." msgstr "" #. 
Tag: para #, no-c-format msgid "If their priorities are equal, check whether they are already running. The resource that has the highest score on the node where it’s running gets allocated first, to prevent resource shuffling." msgstr "" #. Tag: para #, no-c-format msgid "If the scores above are equal or the resources are not running, the resource that has the highest score on the preferred node gets allocated first." msgstr "" #. Tag: para #, no-c-format msgid "If the scores above are equal, the first runnable resource listed in the CIB gets allocated first." msgstr "" #. Tag: title #, no-c-format msgid "Limitations and Workarounds" msgstr "" #. Tag: para #, no-c-format msgid "The type of problem Pacemaker is dealing with here is known as the knapsack problem and falls into the NP-complete category of computer science problems — a fancy way of saying \"it takes a really long time to solve\"." msgstr "" #. Tag: para #, no-c-format msgid "Clearly in a HA cluster, it’s not acceptable to spend minutes, let alone hours or days, finding an optimal solution while services remain unavailable." msgstr "" #. Tag: para #, no-c-format msgid "So instead of trying to solve the problem completely, Pacemaker uses a best effort algorithm for determining which node should host a particular service. This means it arrives at a solution much faster than traditional linear programming algorithms, but at the price of leaving some services stopped." msgstr "" #. Tag: para #, no-c-format msgid "In the contrived example at the start of this section:" msgstr "" #. Tag: para #, no-c-format msgid "rsc-small would be allocated to node1" msgstr "" #. Tag: para #, no-c-format msgid "rsc-medium would be allocated to node2" msgstr "" #. Tag: para #, no-c-format msgid "rsc-large would remain inactive" msgstr "" #. Tag: para #, no-c-format msgid "Which is not ideal." msgstr "" #. 
Tag: para #, no-c-format msgid "There are various approaches to dealing with the limitations of pacemaker’s placement strategy:" msgstr "" #. Tag: term #, no-c-format msgid "Ensure you have sufficient physical capacity." msgstr "" #. Tag: para #, no-c-format msgid "It might sound obvious, but if the physical capacity of your nodes is (close to) maxed out by the cluster under normal conditions, then failover isn’t going to go well. Even without the utilization feature, you’ll start hitting timeouts and getting secondary failures." msgstr "" #. Tag: term #, no-c-format msgid "Build some buffer into the capabilities advertised by the nodes." msgstr "" #. Tag: para #, no-c-format msgid "Advertise slightly more resources than we physically have, on the (usually valid) assumption that a resource will not use 100% of the configured amount of CPU, memory and so forth all the time. This practice is sometimes called overcommit." msgstr "" #. Tag: term #, no-c-format msgid "Specify resource priorities." msgstr "" #. Tag: para #, no-c-format msgid "If the cluster is going to sacrifice services, it should be the ones you care about (comparatively) the least. Ensure that resource priorities are properly set so that your most important resources are scheduled first." msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Pacemaker_Explained.pot b/doc/Pacemaker_Explained/pot/Pacemaker_Explained.pot index f715ca374e..0779b0581f 100644 --- a/doc/Pacemaker_Explained/pot/Pacemaker_Explained.pot +++ b/doc/Pacemaker_Explained/pot/Pacemaker_Explained.pot @@ -1,44 +1,44 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Further Reading" msgstr "" #. Tag: para #, no-c-format msgid "Project Website: " msgstr "" #. Tag: para #, no-c-format msgid "Project Documentation: " msgstr "" #. Tag: para #, no-c-format msgid "SUSE High Availability Guide: " msgstr "" #. Tag: para #, no-c-format msgid "Heartbeat configuration: " msgstr "" #. Tag: para #, no-c-format msgid "Corosync Configuration: " msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Preface.pot b/doc/Pacemaker_Explained/pot/Preface.pot index 01aa1cc95d..1f366bf020 100644 --- a/doc/Pacemaker_Explained/pot/Preface.pot +++ b/doc/Pacemaker_Explained/pot/Preface.pot @@ -1,19 +1,19 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Preface" msgstr "" diff --git a/doc/Pacemaker_Explained/pot/Revision_History.pot b/doc/Pacemaker_Explained/pot/Revision_History.pot index 886c776f17..62420935c8 100644 --- a/doc/Pacemaker_Explained/pot/Revision_History.pot +++ b/doc/Pacemaker_Explained/pot/Revision_History.pot @@ -1,79 +1,89 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Revision History" msgstr "" #. Tag: firstname #, no-c-format msgid "Andrew" msgstr "" #. Tag: surname #, no-c-format msgid "Beekhof" msgstr "" #. Tag: member #, no-c-format msgid "Import from Pages.app" msgstr "" #. Tag: member #, no-c-format msgid "Cleanup and reformatting of docbook xml complete" msgstr "" #. Tag: member #, no-c-format msgid "Split book into chapters and pass validation" msgstr "" #. Tag: member #, no-c-format msgid "Re-organize book for use with Publican" msgstr "" #. Tag: member #, no-c-format msgid "Converted to asciidoc (which is converted to docbook for use with Publican)" msgstr "" #. Tag: firstname #, no-c-format msgid "Ken" msgstr "" #. Tag: surname #, no-c-format msgid "Gaillot" msgstr "" #. Tag: member #, no-c-format msgid "Update for clarity, stylistic consistency and current command-line syntax" msgstr "" #. Tag: member #, no-c-format msgid "Update for Pacemaker 1.1.14" msgstr "" #. Tag: member #, no-c-format msgid "Update for Pacemaker 1.1.15" msgstr "" +#. Tag: member +#, no-c-format +msgid "Overhaul upgrade documentation, and document node health strategies" +msgstr "" + +#. 
Tag: member +#, no-c-format +msgid "Update for Pacemaker 1.1.16" +msgstr "" + diff --git a/doc/Pacemaker_Remote/en-US/Book_Info.xml b/doc/Pacemaker_Remote/en-US/Book_Info.xml index 1e3675b9d1..64e6b32237 100644 --- a/doc/Pacemaker_Remote/en-US/Book_Info.xml +++ b/doc/Pacemaker_Remote/en-US/Book_Info.xml @@ -1,75 +1,75 @@ %BOOK_ENTITIES; ]> Pacemaker Remote Scaling High Availability Clusters - 6 + 7 0 The document exists as both a reference and deployment guide for the Pacemaker Remote service. The example commands in this document will use: &DISTRO; &DISTRO_VERSION; as the host operating system Pacemaker Remote to perform resource management within guest nodes and remote nodes KVM for virtualization libvirt to manage guest nodes Corosync to provide messaging and membership services on cluster nodes Pacemaker to perform resource management on cluster nodes pcs as the cluster configuration toolset The concepts are the same for other distributions, virtualization platforms, toolsets, and messaging layers, and should be easily adaptable. diff --git a/doc/Pacemaker_Remote/en-US/Ch-Intro.txt b/doc/Pacemaker_Remote/en-US/Ch-Intro.txt index df5e1eae23..72bd15d9d0 100644 --- a/doc/Pacemaker_Remote/en-US/Ch-Intro.txt +++ b/doc/Pacemaker_Remote/en-US/Ch-Intro.txt @@ -1,205 +1,201 @@ = Scaling a Pacemaker Cluster = == Overview == In a basic Pacemaker high-availability cluster,footnote:[See the http://www.clusterlabs.org/doc/[Pacemaker documentation], especially 'Clusters From Scratch' and 'Pacemaker Explained', for basic information about high-availability using Pacemaker] each node runs the full cluster stack of corosync and all Pacemaker components. This allows great flexibility but limits scalability to around 16 nodes. To allow for scalability to dozens or even hundreds of nodes, Pacemaker allows nodes not running the full cluster stack to integrate into the cluster and have the cluster manage their resources as if they were a cluster node. 
== Terms == cluster node:: A node running the full high-availability stack of corosync and all Pacemaker components. Cluster nodes may run cluster resources, run all Pacemaker command-line tools (`crm_mon`, `crm_resource` and so on), execute fencing actions, count toward cluster quorum, and serve as the cluster's Designated Controller (DC). (((cluster node))) (((node,cluster node))) pacemaker_remote:: A small service daemon that allows a host to be used as a Pacemaker node without running the full cluster stack. Nodes running pacemaker_remote may run cluster resources and most command-line tools, but cannot perform other functions of full cluster nodes such as fencing execution, quorum voting or DC eligibility. The pacemaker_remote daemon is an enhanced version of Pacemaker's local resource management daemon (LRMD). (((pacemaker_remote))) remote node:: A physical host running pacemaker_remote. Remote nodes have a special resource that manages communication with the cluster. This is sometimes referred to as the 'baremetal' case. (((remote node))) (((node,remote node))) guest node:: A virtual host running pacemaker_remote. Guest nodes differ from remote nodes mainly in that the guest node is itself a resource that the cluster manages. (((guest node))) (((node,guest node))) [NOTE] ====== 'Remote' in this document refers to the node not being a part of the underlying corosync cluster. It has nothing to do with physical proximity. Remote nodes and guest nodes are subject to the same latency requirements as cluster nodes, which means they are typically in the same data center. ====== [NOTE] ====== It is important to distinguish the various roles a virtual machine can serve in Pacemaker clusters: * A virtual machine can run the full cluster stack, in which case it is a cluster node and is not itself managed by the cluster. 
* A virtual machine can be managed by the cluster as a resource, without the cluster having any awareness of the services running inside the virtual machine. The virtual machine is 'opaque' to the cluster. * A virtual machine can be a cluster resource, and run pacemaker_remote to make it a guest node, allowing the cluster to manage services inside it. The virtual machine is 'transparent' to the cluster. ====== == Support in Pacemaker Versions == It is recommended to run Pacemaker 1.1.12 or later when using pacemaker_remote due to important bug fixes. An overview of changes in pacemaker_remote -capability by version: +capability by version (aside from bug fixes, which are included in every +version): + +.1.1.16 +* Support for watchdog-based fencing (sbd) on remote nodes .1.1.15 * If pacemaker_remote is stopped on an active node, it will wait for the cluster to migrate all resources off before exiting, rather than exit immediately and get fenced. -* Bug fixes .1.1.14 * Resources that create guest nodes can be included in groups * reconnect_interval option for remote nodes -* Bug fixes, including a memory leak .1.1.13 * Support for maintenance mode * Remote nodes can recover without being fenced when the cluster node hosting their connection fails * Running pacemaker_remote within LXC environments is deprecated due to newly added Pacemaker support for isolated resources * +#kind+ built-in node attribute for use with rules -* Bug fixes .1.1.12 * Support for permanent node attributes * Support for migration -* Bug fixes .1.1.11 * Support for IPv6 * Support for remote nodes * Support for transient node attributes * Support for clusters with mixed endian architectures -* Bug fixes - -.1.1.10 -* Bug fixes .1.1.9 * Initial version to include pacemaker_remote * Limited to guest nodes in KVM/LXC environments using only IPv4; all nodes' architectures must have same endianness == Guest Nodes == (((guest node))) (((node,guest node))) *"I want a Pacemaker cluster to manage 
virtual machine resources, but I also want Pacemaker to be able to manage the resources that live within those virtual machines."* Without pacemaker_remote, the possibilities for implementing the above use case have significant limitations: * The cluster stack could be run on the physical hosts only, which loses the ability to monitor resources within the guests. * A separate cluster could be on the virtual guests, which quickly hits scalability issues. * The cluster stack could be run on the guests using the same cluster as the physical hosts, which also hits scalability issues and complicates fencing. With pacemaker_remote: * The physical hosts are cluster nodes (running the full cluster stack). * The virtual machines are guest nodes (running the pacemaker_remote service). Nearly zero configuration is required on the virtual machine. * The cluster stack on the cluster nodes launches the virtual machines and immediately connects to the pacemaker_remote service on them, allowing the virtual machines to integrate into the cluster. The key difference here between the guest nodes and the cluster nodes is that the guest nodes do not run the cluster stack. This means they will never become the DC, initiate fencing actions or participate in quorum voting. On the other hand, this also means that they are not bound to the scalability limits associated with the cluster stack (no 16-node corosync member limits to deal with). That isn't to say that guest nodes can scale indefinitely, but it is known that guest nodes scale horizontally much further than cluster nodes. Other than the quorum limitation, these guest nodes behave just like cluster nodes with respect to resource management. The cluster is fully capable of managing and monitoring resources on each guest node. You can build constraints against guest nodes, put them in standby, or do whatever else you'd expect to be able to do with cluster nodes. They even show up in `crm_mon` output as nodes. 
To solidify the concept, below is an example that is very similar to an actual deployment we test in our developer environment to verify guest node scalability: * 16 cluster nodes running the full corosync + pacemaker stack * 64 Pacemaker-managed virtual machine resources running pacemaker_remote configured as guest nodes * 64 Pacemaker-managed webserver and database resources configured to run on the 64 guest nodes With this deployment, you would have 64 webservers and databases running on 64 virtual machines on 16 hardware nodes, all of which are managed and monitored by the same Pacemaker deployment. It is known that pacemaker_remote can scale to these lengths and possibly much further depending on the specific scenario. == Remote Nodes == (((remote node))) (((node,remote node))) *"I want my traditional high-availability cluster to scale beyond the limits imposed by the corosync messaging layer."* Ultimately, the primary advantage of remote nodes over cluster nodes is scalability. There are likely some other use cases related to geographically distributed HA clusters that remote nodes may serve a purpose in, but those use cases are not well understood at this point. Like guest nodes, remote nodes will never become the DC, initiate fencing actions or participate in quorum voting. That is not to say, however, that fencing of a remote node works any differently than that of a cluster node. The Pacemaker policy engine understands how to fence remote nodes. As long as a fencing device exists, the cluster is capable of ensuring remote nodes are fenced in the exact same way as cluster nodes. 
== Expanding the Cluster Stack == With pacemaker_remote, the traditional view of the high-availability stack can be expanded to include a new layer: .Traditional HA Stack image::images/pcmk-ha-cluster-stack.png["Traditional Pacemaker+Corosync Stack",width="17cm",height="9cm",align="center"] .HA Stack With Guest Nodes image::images/pcmk-ha-remote-stack.png["Pacemaker+Corosync Stack With pacemaker_remote",width="20cm",height="10cm",align="center"] diff --git a/doc/Pacemaker_Remote/en-US/Revision_History.xml b/doc/Pacemaker_Remote/en-US/Revision_History.xml index b3d1fd285d..d0ad93af96 100644 --- a/doc/Pacemaker_Remote/en-US/Revision_History.xml +++ b/doc/Pacemaker_Remote/en-US/Revision_History.xml @@ -1,49 +1,55 @@ %BOOK_ENTITIES; ]> Revision History 1-0 Tue Mar 19 2013 DavidVosseldavidvossel@gmail.com Import from Pages.app 2-0 Tue May 13 2013 DavidVosseldavidvossel@gmail.com Added Future Features Section 3-0 Fri Oct 18 2013 DavidVosseldavidvossel@gmail.com Added Baremetal remote-node feature documentation 4-0 Tue Aug 25 2015 KenGaillotkgaillot@redhat.com Targeted CentOS 7.1 and Pacemaker 1.1.12+, updated for current terminology and practice 5-0 Tue Dec 8 2015 KenGaillotkgaillot@redhat.com Updated for Pacemaker 1.1.14 6-0 Tue May 3 2016 KenGaillotkgaillot@redhat.com Updated for Pacemaker 1.1.15 + + 7-0 + Mon Oct 31 2016 + KenGaillotkgaillot@redhat.com + Updated for Pacemaker 1.1.16 + diff --git a/doc/Pacemaker_Remote/pot/Author_Group.pot b/doc/Pacemaker_Remote/pot/Author_Group.pot index 3b4203d1d1..bf98059974 100644 --- a/doc/Pacemaker_Remote/pot/Author_Group.pot +++ b/doc/Pacemaker_Remote/pot/Author_Group.pot @@ -1,34 +1,34 @@ # # AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: firstname #, no-c-format msgid "David" msgstr "" #. Tag: surname #, no-c-format msgid "Vossel" msgstr "" #. Tag: orgname #, no-c-format msgid "Red Hat" msgstr "" #. Tag: contrib #, no-c-format msgid "Primary author" msgstr "" diff --git a/doc/Pacemaker_Remote/pot/Book_Info.pot b/doc/Pacemaker_Remote/pot/Book_Info.pot index d2a6a8e310..b05dd8738e 100644 --- a/doc/Pacemaker_Remote/pot/Book_Info.pot +++ b/doc/Pacemaker_Remote/pot/Book_Info.pot @@ -1,74 +1,74 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Pacemaker Remote" msgstr "" #. Tag: subtitle #, no-c-format msgid "Scaling High Availability Clusters" msgstr "" #. Tag: para #, no-c-format msgid "The document exists as both a reference and deployment guide for the Pacemaker Remote service." msgstr "" #. Tag: para #, no-c-format msgid "The example commands in this document will use:" msgstr "" #. Tag: para #, no-c-format msgid "&DISTRO; &DISTRO_VERSION; as the host operating system" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker Remote to perform resource management within guest nodes and remote nodes" msgstr "" #. 
Tag: para #, no-c-format msgid "KVM for virtualization" msgstr "" #. 
Tag: para #, no-c-format msgid "libvirt to manage guest nodes" msgstr "" #. Tag: para #, no-c-format msgid "Corosync to provide messaging and membership services on cluster nodes" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker to perform resource management on cluster nodes" msgstr "" #. Tag: para #, no-c-format msgid "pcs as the cluster configuration toolset" msgstr "" #. Tag: para #, no-c-format msgid "The concepts are the same for other distributions, virtualization platforms, toolsets, and messaging layers, and should be easily adaptable." msgstr "" diff --git a/doc/Pacemaker_Remote/pot/Ch-Alternatives.pot b/doc/Pacemaker_Remote/pot/Ch-Alternatives.pot index f79938fd9c..56442f3139 100644 --- a/doc/Pacemaker_Remote/pot/Ch-Alternatives.pot +++ b/doc/Pacemaker_Remote/pot/Ch-Alternatives.pot @@ -1,109 +1,109 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Alternative Configurations" msgstr "" #. Tag: para #, no-c-format msgid "These alternative configurations may be appropriate in limited cases, such as a test cluster, but are not the best method in most situations. They are presented here for completeness and as an example of pacemaker’s flexibility to suit your needs." msgstr "" #. Tag: title #, no-c-format msgid "Virtual Machines as Cluster Nodes" msgstr "" #. Tag: para #, no-c-format msgid "The preferred use of virtual machines in a pacemaker cluster is as a cluster resource, whether opaque or as a guest node. However, it is possible to run the full cluster stack on a virtual node instead." msgstr "" #. 
Tag: para #, no-c-format msgid "This is commonly used to set up test environments; a single physical host (that does not participate in the cluster) runs two or more virtual machines, all running the full cluster stack. This can be used to simulate a larger cluster for testing purposes." msgstr "" #. Tag: para #, no-c-format msgid "In a production environment, fencing becomes more complicated, especially if the underlying hosts run any services besides the clustered VMs. If the VMs are not guaranteed a minimum amount of host resources, CPU and I/O contention can cause timing issues for cluster components." msgstr "" #. Tag: para #, no-c-format msgid "Another situation where this approach is sometimes used is when the cluster owner leases the VMs from a provider and does not have direct access to the underlying host. The main concerns in this case are proper fencing (usually via a custom resource agent that communicates with the provider’s APIs) and maintaining a static IP address between reboots, as well as resource contention issues." msgstr "" #. Tag: title #, no-c-format msgid "Virtual Machines as Remote Nodes" msgstr "" #. Tag: para #, no-c-format msgid "Virtual machines may be configured following the process for remote nodes rather than guest nodes (i.e., using an ocf:pacemaker:remote resource rather than letting the cluster manage the VM directly)." msgstr "" #. Tag: para #, no-c-format msgid "This is mainly useful in testing, to use a single physical host to simulate a larger cluster involving remote nodes. Pacemaker’s Cluster Test Suite (CTS) uses this approach to test remote node functionality." msgstr "" #. Tag: title #, no-c-format msgid "Containers as Guest Nodes" msgstr "" #. Tag: para #, no-c-format msgid "Containers,https://en.wikipedia.org/wiki/Operating-system-level_virtualization and in particular Linux containers (LXC) and Docker, have become a popular method of isolating services in a resource-efficient manner." msgstr "" #. 
Tag: para #, no-c-format msgid "The preferred means of integrating containers into Pacemaker is as a cluster resource, whether opaque or using Pacemaker’s built-in resource isolation support.Documentation for this support is planned but not yet available." msgstr "" #. Tag: para #, no-c-format msgid "However, it is possible to run pacemaker_remote inside a container, following the process for guest nodes. This is not recommended but can be useful, for example, in testing scenarios, to simulate a large number of guest nodes." msgstr "" #. Tag: para #, no-c-format msgid "The configuration process is very similar to that described for guest nodes using virtual machines. Key differences:" msgstr "" #. Tag: para #, no-c-format msgid "The underlying host must install the libvirt driver for the desired container technology — for example, the libvirt-daemon-lxc package to get the libvirt-lxc driver for LXC containers." msgstr "" #. Tag: para #, no-c-format msgid "Libvirt XML definitions must be generated for the containers. The pacemaker-cts package includes a script for this purpose, /usr/share/pacemaker/tests/cts/lxc_autogen.sh. Run it with the --help option for details on how to use it. It is intended for testing purposes only, and hardcodes various parameters that would need to be set appropriately in real usage. Of course, you can create XML definitions manually, following the appropriate libvirt driver documentation." msgstr "" #. Tag: para #, no-c-format msgid "To share the authentication key, either share the host’s /etc/pacemaker directory with the container, or copy the key into the container’s filesystem." msgstr "" #. Tag: para #, no-c-format msgid "The VirtualDomain resource for a container will need force_stop=\"true\" and an appropriate hypervisor option, for example hypervisor=\"lxc:///\" for LXC containers." 
msgstr "" diff --git a/doc/Pacemaker_Remote/pot/Ch-Baremetal-Tutorial.pot b/doc/Pacemaker_Remote/pot/Ch-Baremetal-Tutorial.pot index 1d1bc56c28..25d3249fad 100644 --- a/doc/Pacemaker_Remote/pot/Ch-Baremetal-Tutorial.pot +++ b/doc/Pacemaker_Remote/pot/Ch-Baremetal-Tutorial.pot @@ -1,458 +1,458 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Remote Node Walk-through" msgstr "" #. Tag: para #, no-c-format msgid "What this tutorial is: An in-depth walk-through of how to get Pacemaker to integrate a remote node into the cluster as a node capable of running cluster resources." msgstr "" #. Tag: para #, no-c-format msgid "What this tutorial is not: A realistic deployment scenario. The steps shown here are meant to get users familiar with the concept of remote nodes as quickly as possible." msgstr "" #. Tag: para #, no-c-format msgid "This tutorial requires three machines: two to act as cluster nodes, and a third to act as the remote node." msgstr "" #. Tag: title #, no-c-format msgid "Configure Remote Node" msgstr "" #. Tag: title #, no-c-format msgid "Configure Firewall on Remote Node" msgstr "" #. Tag: para #, no-c-format msgid "Allow cluster-related services through the local firewall:" msgstr "" #. Tag: screen #, no-c-format msgid "# firewall-cmd --permanent --add-service=high-availability\n" "success\n" "# firewall-cmd --reload\n" "success" msgstr "" #. 
Tag: para #, no-c-format msgid "If you are using iptables directly, or some other firewall solution besides firewalld, simply open the following ports, which can be used by various clustering components: TCP ports 2224, 3121, and 21064, and UDP port 5405." msgstr "" #. Tag: para #, no-c-format msgid "If you run into any problems during testing, you might want to disable the firewall and SELinux entirely until you have everything working. This may create significant security issues and should not be performed on machines that will be exposed to the outside world, but may be appropriate during development and testing on a protected host." msgstr "" #. Tag: para #, no-c-format msgid "To disable security measures:" msgstr "" #. Tag: screen #, no-c-format msgid "# setenforce 0\n" "# sed -i.bak \"s/SELINUX=enforcing/SELINUX=permissive/g\" /etc/selinux/config\n" "# systemctl disable firewalld.service\n" "# systemctl stop firewalld.service\n" "# iptables --flush" msgstr "" #. Tag: title #, no-c-format msgid "Configure pacemaker_remote on Remote Node" msgstr "" #. Tag: para #, no-c-format msgid "Install the pacemaker_remote daemon on the remote node." msgstr "" #. Tag: screen #, no-c-format msgid "# yum install -y pacemaker-remote resource-agents pcs" msgstr "" #. Tag: para #, no-c-format msgid "Create a location for the shared authentication key:" msgstr "" #. Tag: screen #, no-c-format msgid "# mkdir -p --mode=0750 /etc/pacemaker\n" "# chgrp haclient /etc/pacemaker" msgstr "" #. Tag: para #, no-c-format msgid "All nodes (both cluster nodes and remote nodes) must have the same authentication key installed for the communication to work correctly. If you already have a key on an existing node, copy it to the new remote node. Otherwise, create a new key, for example:" msgstr "" #. Tag: screen #, no-c-format msgid "# dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1" msgstr "" #. 
Tag: para #, no-c-format msgid "Now start and enable the pacemaker_remote daemon on the remote node." msgstr "" #. Tag: screen #, no-c-format msgid "# systemctl enable pacemaker_remote.service\n" "# systemctl start pacemaker_remote.service" msgstr "" #. Tag: para #, no-c-format msgid "Verify the start is successful." msgstr "" #. Tag: screen #, no-c-format msgid "# systemctl status pacemaker_remote\n" "pacemaker_remote.service - Pacemaker Remote Service\n" " Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled)\n" " Active: active (running) since Fri 2015-08-21 15:21:20 CDT; 20s ago\n" " Main PID: 21273 (pacemaker_remot)\n" " CGroup: /system.slice/pacemaker_remote.service\n" " └─21273 /usr/sbin/pacemaker_remoted\n" "\n" "Aug 21 15:21:20 remote1 systemd[1]: Starting Pacemaker Remote Service...\n" "Aug 21 15:21:20 remote1 systemd[1]: Started Pacemaker Remote Service.\n" "Aug 21 15:21:20 remote1 pacemaker_remoted[21273]: notice: crm_add_logfile: Additional logging available in /var/log/pacemaker.log\n" "Aug 21 15:21:20 remote1 pacemaker_remoted[21273]: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3121.\n" "Aug 21 15:21:20 remote1 pacemaker_remoted[21273]: notice: bind_and_listen: Listening on address ::" msgstr "" #. Tag: title #, no-c-format msgid "Verify Connection to Remote Node" msgstr "" #. Tag: para #, no-c-format msgid "Before moving forward, it’s worth verifying that the cluster nodes can contact the remote node on port 3121. Here’s a trick you can use. Connect using ssh from each of the cluster nodes. The connection will get destroyed, but how it is destroyed tells you whether it worked or not." msgstr "" #. Tag: para #, no-c-format msgid "First, add the remote node’s hostname (we’re using remote1 in this tutorial) to the cluster nodes' /etc/hosts files if you haven’t already. This is required unless you have DNS set up in a way where remote1’s address can be discovered." msgstr "" #. 
Tag: para #, no-c-format msgid "Execute the following on each cluster node, replacing the IP address with the actual IP address of the remote node." msgstr "" #. Tag: screen #, no-c-format msgid "# cat << END >> /etc/hosts\n" "192.168.122.10 remote1\n" "END" msgstr "" #. Tag: para #, no-c-format msgid "If running the ssh command on one of the cluster nodes results in this output before disconnecting, the connection works:" msgstr "" #. Tag: screen #, no-c-format msgid "# ssh -p 3121 remote1\n" "ssh_exchange_identification: read: Connection reset by peer" msgstr "" #. Tag: para #, no-c-format msgid "If you see one of these, the connection is not working:" msgstr "" #. Tag: screen #, no-c-format msgid "# ssh -p 3121 remote1\n" "ssh: connect to host remote1 port 3121: No route to host" msgstr "" #. Tag: screen #, no-c-format msgid "# ssh -p 3121 remote1\n" "ssh: connect to host remote1 port 3121: Connection refused" msgstr "" #. Tag: para #, no-c-format msgid "Once you can successfully connect to the remote node from the both cluster nodes, move on to setting up Pacemaker on the cluster nodes." msgstr "" #. Tag: title #, no-c-format msgid "Configure Cluster Nodes" msgstr "" #. Tag: title #, no-c-format msgid "Configure Firewall on Cluster Nodes" msgstr "" #. Tag: para #, no-c-format msgid "On each cluster node, allow cluster-related services through the local firewall, following the same procedure as in ." msgstr "" #. Tag: title #, no-c-format msgid "Install Pacemaker on Cluster Nodes" msgstr "" #. Tag: para #, no-c-format msgid "On the two cluster nodes, install the following packages." msgstr "" #. Tag: screen #, no-c-format msgid "# yum install -y pacemaker corosync pcs resource-agents" msgstr "" #. Tag: title #, no-c-format msgid "Copy Authentication Key to Cluster Nodes" msgstr "" #. Tag: para #, no-c-format msgid "Create a location for the shared authentication key, and copy it from any existing node:" msgstr "" #. 
Tag: screen #, no-c-format msgid "# mkdir -p --mode=0750 /etc/pacemaker\n" "# chgrp haclient /etc/pacemaker\n" "# scp remote1:/etc/pacemaker/authkey /etc/pacemaker/authkey" msgstr "" #. Tag: title #, no-c-format msgid "Configure Corosync on Cluster Nodes" msgstr "" #. Tag: para #, no-c-format msgid "Corosync handles Pacemaker’s cluster membership and messaging. The corosync config file is located in /etc/corosync/corosync.conf. That config file must be initialized with information about the two cluster nodes before pacemaker can start." msgstr "" #. Tag: para #, no-c-format msgid "To initialize the corosync config file, execute the following pcs command on both nodes, filling in the information in <> with your nodes' information." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs cluster setup --force --local --name mycluster <node1 ip or hostname> <node2 ip or hostname>" msgstr "" #. Tag: title #, no-c-format msgid "Start Pacemaker on Cluster Nodes" msgstr "" #. Tag: para #, no-c-format msgid "Start the cluster stack on both cluster nodes using the following command." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs cluster start" msgstr "" #. Tag: para #, no-c-format msgid "Verify corosync membership" msgstr "" #. Tag: literallayout #, no-c-format msgid "# pcs status corosync\n" "Membership information\n" "----------------------\n" " Nodeid Votes Name\n" " 1 1 node1 (local)" msgstr "" #. Tag: para #, no-c-format msgid "Verify Pacemaker status. At first, the pcs cluster status output will look like this." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Aug 21 16:14:05 2015\n" "Last change: Fri Aug 21 14:02:14 2015\n" "Stack: corosync\n" "Current DC: NONE\n" "Version: 1.1.12-a14efad\n" "1 Nodes configured, unknown expected votes\n" "0 Resources configured" msgstr "" #. Tag: para #, no-c-format msgid "After about a minute, you should see your two cluster nodes come online." msgstr "" #. 
Tag: screen #, no-c-format msgid "# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Aug 21 16:16:32 2015\n" "Last change: Fri Aug 21 14:02:14 2015\n" "Stack: corosync\n" "Current DC: node1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "2 Nodes configured\n" "0 Resources configured\n" "\n" "Online: [ node1 node2 ]" msgstr "" #. Tag: para #, no-c-format msgid "For the sake of this tutorial, we are going to disable stonith to avoid having to cover fencing device configuration." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs property set stonith-enabled=false" msgstr "" #. Tag: title #, no-c-format msgid "Integrate Remote Node into Cluster" msgstr "" #. Tag: para #, no-c-format msgid "Integrating a remote node into the cluster is achieved through the creation of a remote node connection resource. The remote node connection resource both establishes the connection to the remote node and defines that the remote node exists. Note that this resource is actually internal to Pacemaker’s crmd component. A metadata file for this resource can be found in the /usr/lib/ocf/resource.d/pacemaker/remote file that describes what options are available, but there is no actual ocf:pacemaker:remote resource agent script that performs any work." msgstr "" #. Tag: para #, no-c-format msgid "Define the remote node connection resource to our remote node, remote1, using the following command on any cluster node." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs resource create remote1 ocf:pacemaker:remote" msgstr "" #. Tag: para #, no-c-format msgid "That’s it. After a moment you should see the remote node come online." msgstr "" #. 
Tag: screen #, no-c-format msgid "Cluster name: mycluster\n" "Last updated: Fri Aug 21 17:13:09 2015\n" "Last change: Fri Aug 21 17:02:02 2015\n" "Stack: corosync\n" "Current DC: node1 (1) - partition with quorum\n" "Version: 1.1.12-a14efad\n" "3 Nodes configured\n" "1 Resources configured\n" "\n" "\n" "Online: [ node1 node2 ]\n" "RemoteOnline: [ remote1 ]\n" "\n" "Full list of resources:\n" "\n" " remote1 (ocf::pacemaker:remote): Started node1\n" "\n" "PCSD Status:\n" " node1: Online\n" " node2: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: title #, no-c-format msgid "Starting Resources on Remote Node" msgstr "" #. Tag: para #, no-c-format msgid "Once the remote node is integrated into the cluster, starting resources on a remote node is the exact same as on cluster nodes. Refer to the Clusters from Scratch document for examples of resource creation." msgstr "" #. Tag: para #, no-c-format msgid "Never involve a remote node connection resource in a resource group, colocation constraint, or order constraint." msgstr "" #. Tag: title #, no-c-format msgid "Fencing Remote Nodes" msgstr "" #. Tag: para #, no-c-format msgid "Remote nodes are fenced the same way as cluster nodes. No special considerations are required. Configure fencing resources for use with remote nodes the same as you would with cluster nodes." msgstr "" #. Tag: para #, no-c-format msgid "Note, however, that remote nodes can never initiate a fencing action. Only cluster nodes are capable of actually executing a fencing operation against another node." msgstr "" #. Tag: title #, no-c-format msgid "Accessing Cluster Tools from a Remote Node" msgstr "" #. Tag: para #, no-c-format msgid "Besides allowing the cluster to manage resources on a remote node, pacemaker_remote has one other trick. The pacemaker_remote daemon allows nearly all the pacemaker tools (crm_resource, crm_mon, crm_attribute, crm_master, etc.) 
to work on remote nodes natively." msgstr "" #. Tag: para #, no-c-format msgid "Try it: Run crm_mon on the remote node after pacemaker has integrated it into the cluster. These tools just work. These means resource agents such as master/slave resources which need access to tools like crm_master work seamlessly on the remote nodes." msgstr "" #. Tag: para #, no-c-format msgid "Higher-level command shells such as pcs may have partial support on remote nodes, but it is recommended to run them from a cluster node." msgstr "" diff --git a/doc/Pacemaker_Remote/pot/Ch-Example.pot b/doc/Pacemaker_Remote/pot/Ch-Example.pot index 869200d5a5..f8b6fd11dc 100644 --- a/doc/Pacemaker_Remote/pot/Ch-Example.pot +++ b/doc/Pacemaker_Remote/pot/Ch-Example.pot @@ -1,174 +1,174 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Guest Node Quick Example" msgstr "" #. Tag: para #, no-c-format msgid "If you already know how to use Pacemaker, you’ll likely be able to grasp this new concept of guest nodes by reading through this quick example without having to sort through all the detailed walk-through steps. Here are the key configuration ingredients that make this possible using libvirt and KVM virtual guests. These steps strip everything down to the very basics. guest node nodeguest node guest node " msgstr "" #. Tag: title #, no-c-format msgid "Mile-High View of Configuration Steps" msgstr "" #. Tag: para #, no-c-format msgid "Give each virtual machine that will be used as a guest node a static network address and unique hostname." msgstr "" #. 
Tag: para #, no-c-format msgid "Put the same authentication key with the path /etc/pacemaker/authkey on every cluster node and virtual machine. This secures remote communication." msgstr "" #. Tag: para #, no-c-format msgid "Run this command if you want to make a somewhat random key:" msgstr "" #. Tag: screen #, no-c-format msgid "dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1" msgstr "" #. Tag: para #, no-c-format msgid "Install pacemaker_remote on every virtual machine, enabling it to start at boot, and if a local firewall is used, allow the node to accept connections on TCP port 3121." msgstr "" #. Tag: screen #, no-c-format msgid "yum install pacemaker-remote resource-agents\n" "systemctl enable pacemaker_remote\n" "firewall-cmd --add-port 3121/tcp --permanent" msgstr "" #. Tag: para #, no-c-format msgid "If you just want to see this work, you may want to simply disable the local firewall and put SELinux in permissive mode while testing. This creates security risks and should not be done on a production machine exposed to the Internet, but can be appropriate for a protected test machine." msgstr "" #. Tag: para #, no-c-format msgid "Create a Pacemaker resource to launch each virtual machine, using the remote-node meta-attribute to let Pacemaker know this will be a guest node capable of running resources." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs resource create vm-guest1 VirtualDomain hypervisor=\"qemu:///system\" config=\"vm-guest1.xml\" meta remote-node=\"guest1\"" msgstr "" #. Tag: para #, no-c-format msgid "The above command will create CIB XML similar to the following:" msgstr "" #. 
Tag: programlisting #, no-c-format msgid " <primitive class=\"ocf\" id=\"vm-guest1\" provider=\"heartbeat\" type=\"VirtualDomain\">\n" " <instance_attributes id=\"vm-guest-instance_attributes\">\n" " <nvpair id=\"vm-guest1-instance_attributes-hypervisor\" name=\"hypervisor\" value=\"qemu:///system\"/>\n" " <nvpair id=\"vm-guest1-instance_attributes-config\" name=\"config\" value=\"guest1.xml\"/>\n" " </instance_attributes>\n" " <operations>\n" " <op id=\"vm-guest1-interval-30s\" interval=\"30s\" name=\"monitor\"/>\n" " </operations>\n" " <meta_attributes id=\"vm-guest1-meta_attributes\">\n" " <nvpair id=\"vm-guest1-meta_attributes-remote-node\" name=\"remote-node\" value=\"guest1\"/>\n" " </meta_attributes>\n" " </primitive>" msgstr "" #. Tag: para #, no-c-format msgid "In the example above, the meta-attribute remote-node=\"guest1\" tells Pacemaker that this resource is a guest node with the hostname guest1. The cluster will attempt to contact the virtual machine’s pacemaker_remote service at the hostname guest1 after it launches." msgstr "" #. Tag: para #, no-c-format msgid "The ID of the resource creating the virtual machine (vm-guest1 in the above example) must be different from the virtual machine’s uname (guest1 in the above example). Pacemaker will create an implicit internal resource for the pacemaker_remote connection to the guest, named with the value of remote-node, so that value cannot be used as the name of any other resource." msgstr "" #. Tag: title #, no-c-format msgid "Using a Guest Node" msgstr "" #. Tag: para #, no-c-format msgid "Guest nodes will show up in crm_mon output as normal:" msgstr "" #. Tag: title #, no-c-format msgid "Example crm_mon output after guest1 is integrated into cluster" msgstr "" #. 
Tag: screen #, no-c-format msgid "Last updated: Wed Mar 13 13:52:39 2013\n" "Last change: Wed Mar 13 13:25:17 2013 via crmd on node1\n" "Stack: corosync\n" "Current DC: node1 (24815808) - partition with quorum\n" "Version: 1.1.10\n" "2 Nodes configured, unknown expected votes\n" "2 Resources configured.\n" "\n" "Online: [ node1 guest1]\n" "\n" "vm-guest1 (ocf::heartbeat:VirtualDomain): Started node1" msgstr "" #. Tag: para #, no-c-format msgid "Now, you could place a resource, such as a webserver, on guest1:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs resource create webserver apache params configfile=/etc/httpd/conf/httpd.conf op monitor interval=30s\n" "# pcs constraint location webserver prefers guest1" msgstr "" #. Tag: para #, no-c-format msgid "Now, the crm_mon output would show:" msgstr "" #. Tag: screen #, no-c-format msgid "Last updated: Wed Mar 13 13:52:39 2013\n" "Last change: Wed Mar 13 13:25:17 2013 via crmd on node1\n" "Stack: corosync\n" "Current DC: node1 (24815808) - partition with quorum\n" "Version: 1.1.10\n" "2 Nodes configured, unknown expected votes\n" "2 Resources configured.\n" "\n" "Online: [ node1 guest1]\n" "\n" "vm-guest1 (ocf::heartbeat:VirtualDomain): Started node1\n" "webserver (ocf::heartbeat::apache): Started guest1" msgstr "" #. Tag: para #, no-c-format msgid "It is worth noting that after guest1 is integrated into the cluster, nearly all the Pacemaker command-line tools immediately become available to the guest node. This means things like crm_mon, crm_resource, and crm_attribute will work natively on the guest node, as long as the connection between the guest node and a cluster node exists. This is particularly important for any master/slave resources executing on the guest node that need access to crm_master to set transient attributes." 
msgstr "" diff --git a/doc/Pacemaker_Remote/pot/Ch-Intro.pot b/doc/Pacemaker_Remote/pot/Ch-Intro.pot index 4616770a77..fecba84c9e 100644 --- a/doc/Pacemaker_Remote/pot/Ch-Intro.pot +++ b/doc/Pacemaker_Remote/pot/Ch-Intro.pot @@ -1,374 +1,374 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Scaling a Pacemaker Cluster" msgstr "" #. Tag: title #, no-c-format msgid "Overview" msgstr "" #. Tag: para #, no-c-format msgid "In a basic Pacemaker high-availability cluster,See the Pacemaker documentation, especially Clusters From Scratch and Pacemaker Explained, for basic information about high-availability using Pacemaker each node runs the full cluster stack of corosync and all Pacemaker components. This allows great flexibility but limits scalability to around 16 nodes." msgstr "" #. Tag: para #, no-c-format msgid "To allow for scalability to dozens or even hundreds of nodes, Pacemaker allows nodes not running the full cluster stack to integrate into the cluster and have the cluster manage their resources as if they were a cluster node." msgstr "" #. Tag: title #, no-c-format msgid "Terms" msgstr "" #. Tag: term #, no-c-format msgid "cluster node" msgstr "" #. Tag: para #, no-c-format msgid "A node running the full high-availability stack of corosync and all Pacemaker components. Cluster nodes may run cluster resources, run all Pacemaker command-line tools (crm_mon, crm_resource and so on), execute fencing actions, count toward cluster quorum, and serve as the cluster’s Designated Controller (DC). cluster node nodecluster node cluster node " msgstr "" #. 
Tag: term #, no-c-format msgid "pacemaker_remote" msgstr "" #. Tag: para #, no-c-format msgid "A small service daemon that allows a host to be used as a Pacemaker node without running the full cluster stack. Nodes running pacemaker_remote may run cluster resources and most command-line tools, but cannot perform other functions of full cluster nodes such as fencing execution, quorum voting or DC eligibility. The pacemaker_remote daemon is an enhanced version of Pacemaker’s local resource management daemon (LRMD). pacemaker_remote " msgstr "" #. Tag: term #, no-c-format msgid "remote node" msgstr "" #. Tag: para #, no-c-format msgid "A physical host running pacemaker_remote. Remote nodes have a special resource that manages communication with the cluster. This is sometimes referred to as the baremetal case. remote node noderemote node remote node " msgstr "" #. Tag: term #, no-c-format msgid "guest node" msgstr "" #. Tag: para #, no-c-format msgid "A virtual host running pacemaker_remote. Guest nodes differ from remote nodes mainly in that the guest node is itself a resource that the cluster manages. guest node nodeguest node guest node " msgstr "" #. Tag: para #, no-c-format msgid "Remote in this document refers to the node not being a part of the underlying corosync cluster. It has nothing to do with physical proximity. Remote nodes and guest nodes are subject to the same latency requirements as cluster nodes, which means they are typically in the same data center." msgstr "" #. Tag: para #, no-c-format msgid "It is important to distinguish the various roles a virtual machine can serve in Pacemaker clusters:" msgstr "" #. Tag: para #, no-c-format msgid "A virtual machine can run the full cluster stack, in which case it is a cluster node and is not itself managed by the cluster." msgstr "" #. 
Tag: para #, no-c-format msgid "A virtual machine can be managed by the cluster as a resource, without the cluster having any awareness of the services running inside the virtual machine. The virtual machine is opaque to the cluster." msgstr "" #. Tag: para #, no-c-format msgid "A virtual machine can be a cluster resource, and run pacemaker_remote to make it a guest node, allowing the cluster to manage services inside it. The virtual machine is transparent to the cluster." msgstr "" #. Tag: title #, no-c-format msgid "Support in Pacemaker Versions" msgstr "" #. Tag: para #, no-c-format -msgid "It is recommended to run Pacemaker 1.1.12 or later when using pacemaker_remote due to important bug fixes. An overview of changes in pacemaker_remote capability by version:" +msgid "It is recommended to run Pacemaker 1.1.12 or later when using pacemaker_remote due to important bug fixes. An overview of changes in pacemaker_remote capability by version (aside from bug fixes, which are included in every version):" msgstr "" #. Tag: title #, no-c-format -msgid "1.1.15" +msgid "1.1.16" msgstr "" #. Tag: para #, no-c-format -msgid "If pacemaker_remote is stopped on an active node, it will wait for the cluster to migrate all resources off before exiting, rather than exit immediately and get fenced." +msgid "Support for watchdog-based fencing (sbd) on remote nodes" +msgstr "" + +#. Tag: title +#, no-c-format +msgid "1.1.15" msgstr "" #. Tag: para #, no-c-format -msgid "Bug fixes" +msgid "If pacemaker_remote is stopped on an active node, it will wait for the cluster to migrate all resources off before exiting, rather than exit immediately and get fenced." msgstr "" #. Tag: title #, no-c-format msgid "1.1.14" msgstr "" #. Tag: para #, no-c-format msgid "Resources that create guest nodes can be included in groups" msgstr "" #. Tag: para #, no-c-format msgid "reconnect_interval option for remote nodes" msgstr "" -#. 
Tag: para -#, no-c-format -msgid "Bug fixes, including a memory leak" -msgstr "" - #. Tag: title #, no-c-format msgid "1.1.13" msgstr "" #. Tag: para #, no-c-format msgid "Support for maintenance mode" msgstr "" #. Tag: para #, no-c-format msgid "Remote nodes can recover without being fenced when the cluster node hosting their connection fails" msgstr "" #. Tag: para #, no-c-format msgid "Running pacemaker_remote within LXC environments is deprecated due to newly added Pacemaker support for isolated resources" msgstr "" +#. Tag: para +#, no-c-format +msgid "#kind built-in node attribute for use with rules" +msgstr "" + #. Tag: title #, no-c-format msgid "1.1.12" msgstr "" #. Tag: para #, no-c-format msgid "Support for permanent node attributes" msgstr "" #. Tag: para #, no-c-format msgid "Support for migration" msgstr "" #. Tag: title #, no-c-format msgid "1.1.11" msgstr "" #. Tag: para #, no-c-format msgid "Support for IPv6" msgstr "" #. Tag: para #, no-c-format msgid "Support for remote nodes" msgstr "" #. Tag: para #, no-c-format msgid "Support for transient node attributes" msgstr "" #. Tag: para #, no-c-format msgid "Support for clusters with mixed endian architectures" msgstr "" -#. Tag: title -#, no-c-format -msgid "1.1.10" -msgstr "" - #. Tag: title #, no-c-format msgid "1.1.9" msgstr "" #. Tag: para #, no-c-format msgid "Initial version to include pacemaker_remote" msgstr "" #. Tag: para #, no-c-format msgid "Limited to guest nodes in KVM/LXC environments using only IPv4; all nodes' architectures must have same endianness" msgstr "" #. Tag: title #, no-c-format msgid "Guest Nodes" msgstr "" #. Tag: para #, no-c-format msgid " guest node nodeguest node guest node " msgstr "" #. Tag: para #, no-c-format msgid "\"I want a Pacemaker cluster to manage virtual machine resources, but I also want Pacemaker to be able to manage the resources that live within those virtual machines.\"" msgstr "" #. 
Tag: para #, no-c-format msgid "Without pacemaker_remote, the possibilities for implementing the above use case have significant limitations:" msgstr "" #. Tag: para #, no-c-format msgid "The cluster stack could be run on the physical hosts only, which loses the ability to monitor resources within the guests." msgstr "" #. Tag: para #, no-c-format msgid "A separate cluster could be on the virtual guests, which quickly hits scalability issues." msgstr "" #. Tag: para #, no-c-format msgid "The cluster stack could be run on the guests using the same cluster as the physical hosts, which also hits scalability issues and complicates fencing." msgstr "" #. Tag: para #, no-c-format msgid "With pacemaker_remote:" msgstr "" #. Tag: para #, no-c-format msgid "The physical hosts are cluster nodes (running the full cluster stack)." msgstr "" #. Tag: para #, no-c-format msgid "The virtual machines are guest nodes (running the pacemaker_remote service). Nearly zero configuration is required on the virtual machine." msgstr "" #. Tag: para #, no-c-format msgid "The cluster stack on the cluster nodes launches the virtual machines and immediately connects to the pacemaker_remote service on them, allowing the virtual machines to integrate into the cluster." msgstr "" #. Tag: para #, no-c-format msgid "The key difference here between the guest nodes and the cluster nodes is that the guest nodes do not run the cluster stack. This means they will never become the DC, initiate fencing actions or participate in quorum voting." msgstr "" #. Tag: para #, no-c-format msgid "On the other hand, this also means that they are not bound to the scalability limits associated with the cluster stack (no 16-node corosync member limits to deal with). That isn’t to say that guest nodes can scale indefinitely, but it is known that guest nodes scale horizontally much further than cluster nodes." msgstr "" #. 
Tag: para #, no-c-format msgid "Other than the quorum limitation, these guest nodes behave just like cluster nodes with respect to resource management. The cluster is fully capable of managing and monitoring resources on each guest node. You can build constraints against guest nodes, put them in standby, or do whatever else you’d expect to be able to do with cluster nodes. They even show up in crm_mon output as nodes." msgstr "" #. Tag: para #, no-c-format msgid "To solidify the concept, below is an example that is very similar to an actual deployment we test in our developer environment to verify guest node scalability:" msgstr "" #. Tag: para #, no-c-format msgid "16 cluster nodes running the full corosync + pacemaker stack" msgstr "" #. Tag: para #, no-c-format msgid "64 Pacemaker-managed virtual machine resources running pacemaker_remote configured as guest nodes" msgstr "" #. Tag: para #, no-c-format msgid "64 Pacemaker-managed webserver and database resources configured to run on the 64 guest nodes" msgstr "" #. Tag: para #, no-c-format msgid "With this deployment, you would have 64 webservers and databases running on 64 virtual machines on 16 hardware nodes, all of which are managed and monitored by the same Pacemaker deployment. It is known that pacemaker_remote can scale to these lengths and possibly much further depending on the specific scenario." msgstr "" #. Tag: title #, no-c-format msgid "Remote Nodes" msgstr "" #. Tag: para #, no-c-format msgid " remote node noderemote node remote node " msgstr "" #. Tag: para #, no-c-format msgid "\"I want my traditional high-availability cluster to scale beyond the limits imposed by the corosync messaging layer.\"" msgstr "" #. Tag: para #, no-c-format msgid "Ultimately, the primary advantage of remote nodes over cluster nodes is scalability. 
There are likely some other use cases related to geographically distributed HA clusters that remote nodes may serve a purpose in, but those use cases are not well understood at this point." msgstr "" #. Tag: para #, no-c-format msgid "Like guest nodes, remote nodes will never become the DC, initiate fencing actions or participate in quorum voting." msgstr "" #. Tag: para #, no-c-format msgid "That is not to say, however, that fencing of a remote node works any differently than that of a cluster node. The Pacemaker policy engine understands how to fence remote nodes. As long as a fencing device exists, the cluster is capable of ensuring remote nodes are fenced in the exact same way as cluster nodes." msgstr "" #. Tag: title #, no-c-format msgid "Expanding the Cluster Stack" msgstr "" #. Tag: para #, no-c-format msgid "With pacemaker_remote, the traditional view of the high-availability stack can be expanded to include a new layer:" msgstr "" #. Tag: title #, no-c-format msgid "Traditional HA Stack" msgstr "" #. Tag: title #, no-c-format msgid "HA Stack With Guest Nodes" msgstr "" diff --git a/doc/Pacemaker_Remote/pot/Ch-KVM-Tutorial.pot b/doc/Pacemaker_Remote/pot/Ch-KVM-Tutorial.pot index ae9f3718b9..3e230bf8af 100644 --- a/doc/Pacemaker_Remote/pot/Ch-KVM-Tutorial.pot +++ b/doc/Pacemaker_Remote/pot/Ch-KVM-Tutorial.pot @@ -1,767 +1,767 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Guest Node Walk-through" msgstr "" #. 
Tag: para #, no-c-format msgid "What this tutorial is: An in-depth walk-through of how to get Pacemaker to manage a KVM guest instance and integrate that guest into the cluster as a guest node." msgstr "" #. Tag: para #, no-c-format msgid "What this tutorial is not: A realistic deployment scenario. The steps shown here are meant to get users familiar with the concept of guest nodes as quickly as possible." msgstr "" #. Tag: title #, no-c-format msgid "Configure the Physical Host" msgstr "" #. Tag: para #, no-c-format msgid "For this example, we will use a single physical host named example-host. A production cluster would likely have multiple physical hosts, in which case you would run the commands here on each one, unless noted otherwise." msgstr "" #. Tag: title #, no-c-format msgid "Configure Firewall on Host" msgstr "" #. Tag: para #, no-c-format msgid "On the physical host, allow cluster-related services through the local firewall:" msgstr "" #. Tag: screen #, no-c-format msgid "# firewall-cmd --permanent --add-service=high-availability\n" "success\n" "# firewall-cmd --reload\n" "success" msgstr "" #. Tag: para #, no-c-format msgid "If you are using iptables directly, or some other firewall solution besides firewalld, simply open the following ports, which can be used by various clustering components: TCP ports 2224, 3121, and 21064, and UDP port 5405." msgstr "" #. Tag: para #, no-c-format msgid "If you run into any problems during testing, you might want to disable the firewall and SELinux entirely until you have everything working. This may create significant security issues and should not be performed on machines that will be exposed to the outside world, but may be appropriate during development and testing on a protected host." msgstr "" #. Tag: para #, no-c-format msgid "To disable security measures:" msgstr "" #. 
Tag: screen #, no-c-format msgid "[root@pcmk-1 ~]# setenforce 0\n" "[root@pcmk-1 ~]# sed -i.bak \"s/SELINUX=enforcing/SELINUX=permissive/g\" /etc/selinux/config\n" "[root@pcmk-1 ~]# systemctl disable firewalld.service\n" "[root@pcmk-1 ~]# systemctl stop firewalld.service\n" "[root@pcmk-1 ~]# iptables --flush" msgstr "" #. Tag: title #, no-c-format msgid "Install Cluster Software" msgstr "" #. Tag: screen #, no-c-format msgid "# yum install -y pacemaker corosync pcs resource-agents" msgstr "" #. Tag: title #, no-c-format msgid "Configure Corosync" msgstr "" #. Tag: para #, no-c-format msgid "Corosync handles pacemaker’s cluster membership and messaging. The corosync config file is located in /etc/corosync/corosync.conf. That config file must be initialized with information about the cluster nodes before pacemaker can start." msgstr "" #. Tag: para #, no-c-format msgid "To initialize the corosync config file, execute the following pcs command, replacing the cluster name and hostname as desired:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs cluster setup --force --local --name mycluster example-host" msgstr "" #. Tag: para #, no-c-format msgid "If you have multiple physical hosts, you would execute the setup command on only one host, but list all of them at the end of the command." msgstr "" #. Tag: title #, no-c-format msgid "Configure Pacemaker for Remote Node Communication" msgstr "" #. Tag: para #, no-c-format msgid "Create a place to hold an authentication key for use with pacemaker_remote:" msgstr "" #. Tag: screen #, no-c-format msgid "# mkdir -p --mode=0750 /etc/pacemaker\n" "# chgrp haclient /etc/pacemaker" msgstr "" #. Tag: para #, no-c-format msgid "Generate a key:" msgstr "" #. Tag: screen #, no-c-format msgid "# dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1" msgstr "" #. Tag: para #, no-c-format msgid "If you have multiple physical hosts, you would generate the key on only one host, and copy it to the same location on all hosts." 
msgstr "" #. Tag: title #, no-c-format msgid "Verify Cluster Software" msgstr "" #. Tag: para #, no-c-format msgid "Start the cluster" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs cluster start" msgstr "" #. Tag: para #, no-c-format msgid "Verify corosync membership" msgstr "" #. Tag: literallayout #, no-c-format msgid "# pcs status corosync\n" "\n" "Membership information\n" "----------------------\n" " Nodeid Votes Name\n" " 1 1 example-host (local)" msgstr "" #. Tag: para #, no-c-format msgid "Verify pacemaker status. At first, the output will look like this:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs status\n" "Cluster name: mycluster\n" "WARNING: no stonith devices and stonith-enabled is not false\n" "Last updated: Fri Oct 9 15:18:32 2015 Last change: Fri Oct 9 12:42:21 2015 by root via cibadmin on example-host\n" "Stack: corosync\n" "Current DC: NONE\n" "1 node and 0 resources configured\n" "\n" "Node example-host: UNCLEAN (offline)\n" "\n" "Full list of resources:\n" "\n" "\n" "PCSD Status:\n" " example-host: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "After a short amount of time, you should see your host as a single node in the cluster:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs status\n" "Cluster name: mycluster\n" "WARNING: no stonith devices and stonith-enabled is not false\n" "Last updated: Fri Oct 9 15:20:05 2015 Last change: Fri Oct 9 12:42:21 2015 by root via cibadmin on example-host\n" "Stack: corosync\n" "Current DC: example-host (version 1.1.13-a14efad) - partition WITHOUT quorum\n" "1 node and 0 resources configured\n" "\n" "Online: [ example-host ]\n" "\n" "Full list of resources:\n" "\n" "\n" "PCSD Status:\n" " example-host: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. 
Tag: title #, no-c-format msgid "Disable STONITH and Quorum" msgstr "" #. Tag: para #, no-c-format msgid "Now, enable the cluster to work without quorum or stonith. This is required for the sake of getting this tutorial to work with a single cluster node." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs property set stonith-enabled=false\n" "# pcs property set no-quorum-policy=ignore" msgstr "" #. Tag: para #, no-c-format msgid "The use of stonith-enabled=false is completely inappropriate for a production cluster. It tells the cluster to simply pretend that failed nodes are safely powered off. Some vendors will refuse to support clusters that have STONITH disabled. We disable STONITH here only to focus the discussion on pacemaker_remote, and to be able to use a single physical host in the example." msgstr "" #. Tag: para #, no-c-format msgid "Now, the status output should look similar to this:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Oct 9 15:22:49 2015 Last change: Fri Oct 9 15:22:46 2015 by root via cibadmin on example-host\n" "Stack: corosync\n" "Current DC: example-host (version 1.1.13-a14efad) - partition with quorum\n" "1 node and 0 resources configured\n" "\n" "Online: [ example-host ]\n" "\n" "Full list of resources:\n" "\n" "\n" "PCSD Status:\n" " example-host: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "Go ahead and stop the cluster for now after verifying everything is in order." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs cluster stop --force" msgstr "" #. Tag: title #, no-c-format msgid "Install Virtualization Software" msgstr "" #. Tag: screen #, no-c-format msgid "# yum install -y kvm libvirt qemu-system qemu-kvm bridge-utils virt-manager\n" "# systemctl enable libvirtd.service" msgstr "" #. Tag: para #, no-c-format msgid "Reboot the host." 
msgstr "" #. Tag: para #, no-c-format msgid "While KVM is used in this example, any virtualization platform with a Pacemaker resource agent can be used to create a guest node. The resource agent needs only to support usual commands (start, stop, etc.); Pacemaker implements the remote-node meta-attribute, independent of the agent." msgstr "" #. Tag: title #, no-c-format msgid "Configure the KVM guest" msgstr "" #. Tag: title #, no-c-format msgid "Create Guest" msgstr "" #. Tag: para #, no-c-format msgid "We will not outline here the installation steps required to create a KVM guest. There are plenty of tutorials available elsewhere that do that. Just be sure to configure the guest with a hostname and a static IP address (as an example here, we will use guest1 and 192.168.122.10)." msgstr "" #. Tag: title #, no-c-format msgid "Configure Firewall on Guest" msgstr "" #. Tag: para #, no-c-format msgid "On each guest, allow cluster-related services through the local firewall, following the same procedure as in ." msgstr "" #. Tag: title #, no-c-format msgid "Verify Connectivity" msgstr "" #. Tag: para #, no-c-format msgid "At this point, you should be able to ping and ssh into guests from hosts, and vice versa." msgstr "" #. Tag: title #, no-c-format msgid "Configure pacemaker_remote" msgstr "" #. Tag: para #, no-c-format msgid "Install pacemaker_remote, and enable it to run at start-up. Here, we also install the pacemaker package; it is not required, but it contains the dummy resource agent that we will use later for testing." msgstr "" #. Tag: screen #, no-c-format msgid "# yum install -y pacemaker pacemaker-remote resource-agents\n" "# systemctl enable pacemaker_remote.service" msgstr "" #. Tag: para #, no-c-format msgid "Copy the authentication key from a host:" msgstr "" #. Tag: screen #, no-c-format msgid "# mkdir -p --mode=0750 /etc/pacemaker\n" "# chgrp haclient /etc/pacemaker\n" "# scp root@example-host:/etc/pacemaker/authkey /etc/pacemaker" msgstr "" #. 
Tag: para #, no-c-format msgid "Start pacemaker_remote, and verify the start was successful:" msgstr "" #. Tag: screen #, no-c-format msgid "# systemctl start pacemaker_remote\n" "# systemctl status pacemaker_remote\n" "\n" " pacemaker_remote.service - Pacemaker Remote Service\n" " Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled)\n" " Active: active (running) since Thu 2013-03-14 18:24:04 EDT; 2min 8s ago\n" " Main PID: 1233 (pacemaker_remot)\n" " CGroup: name=systemd:/system/pacemaker_remote.service\n" " └─1233 /usr/sbin/pacemaker_remoted\n" "\n" " Mar 14 18:24:04 guest1 systemd[1]: Starting Pacemaker Remote Service...\n" " Mar 14 18:24:04 guest1 systemd[1]: Started Pacemaker Remote Service.\n" " Mar 14 18:24:04 guest1 pacemaker_remoted[1233]: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3121." msgstr "" #. Tag: title #, no-c-format msgid "Verify Host Connection to Guest" msgstr "" #. Tag: para #, no-c-format msgid "Before moving forward, it’s worth verifying that the host can contact the guest on port 3121. Here’s a trick you can use. Connect using ssh from the host. The connection will get destroyed, but how it is destroyed tells you whether it worked or not." msgstr "" #. Tag: para #, no-c-format msgid "First add guest1 to the host machine’s /etc/hosts file if you haven’t already. This is required unless you have DNS setup in a way where guest1’s address can be discovered." msgstr "" #. Tag: screen #, no-c-format msgid "# cat << END >> /etc/hosts\n" "192.168.122.10 guest1\n" "END" msgstr "" #. Tag: para #, no-c-format msgid "If running the ssh command on one of the cluster nodes results in this output before disconnecting, the connection works:" msgstr "" #. Tag: screen #, no-c-format msgid "# ssh -p 3121 guest1\n" "ssh_exchange_identification: read: Connection reset by peer" msgstr "" #. Tag: para #, no-c-format msgid "If you see one of these, the connection is not working:" msgstr "" #. 
Tag: screen #, no-c-format msgid "# ssh -p 3121 guest1\n" "ssh: connect to host guest1 port 3121: No route to host" msgstr "" #. Tag: screen #, no-c-format msgid "# ssh -p 3121 guest1\n" "ssh: connect to host guest1 port 3121: Connection refused" msgstr "" #. Tag: para #, no-c-format msgid "Once you can successfully connect to the guest from the host, shut down the guest. Pacemaker will be managing the virtual machine from this point forward." msgstr "" #. Tag: title #, no-c-format msgid "Integrate Guest into Cluster" msgstr "" #. Tag: para #, no-c-format msgid "Now the fun part, integrating the virtual machine you’ve just created into the cluster. It is incredibly simple." msgstr "" #. Tag: title #, no-c-format msgid "Start the Cluster" msgstr "" #. Tag: para #, no-c-format msgid "On the host, start pacemaker." msgstr "" #. Tag: para #, no-c-format msgid "Wait for the host to become the DC. The output of pcs status should look as it did in ." msgstr "" #. Tag: title #, no-c-format msgid "Integrate as Guest Node" msgstr "" #. Tag: para #, no-c-format msgid "If you didn’t already do this earlier in the verify host to guest connection section, add the KVM guest’s IP address to the host’s /etc/hosts file so we can connect by hostname. For this example:" msgstr "" #. Tag: para #, no-c-format msgid "We will use the VirtualDomain resource agent for the management of the virtual machine. This agent requires the virtual machine’s XML config to be dumped to a file on disk. To do this, pick out the name of the virtual machine you just created from the output of this list." msgstr "" #. Tag: literallayout #, no-c-format msgid "# virsh list --all\n" " Id Name State\n" "----------------------------------------------------\n" " - guest1 shut off" msgstr "" #. Tag: para #, no-c-format msgid "In my case I named it guest1. Dump the xml to a file somewhere on the host using the following command." msgstr "" #.
Tag: screen #, no-c-format msgid "# virsh dumpxml guest1 > /etc/pacemaker/guest1.xml" msgstr "" #. Tag: para #, no-c-format msgid "Now just register the resource with pacemaker and you’re set!" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs resource create vm-guest1 VirtualDomain hypervisor=\"qemu:///system\" \\\n" " config=\"/etc/pacemaker/guest1.xml\" meta remote-node=guest1" msgstr "" #. Tag: para #, no-c-format msgid "This example puts the guest XML under /etc/pacemaker because the permissions and SELinux labeling should not need any changes. If you run into trouble with this or any step, try disabling SELinux with setenforce 0. If it works after that, see SELinux documentation for how to troubleshoot, if you wish to reenable SELinux." msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker will automatically monitor pacemaker_remote connections for failure, so it is not necessary to create a recurring monitor on the VirtualDomain resource." msgstr "" #. Tag: para #, no-c-format msgid "Once the vm-guest1 resource is started you will see guest1 appear in the pcs status output as a node. The final pcs status output should look something like this." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Oct 9 18:00:45 2015 Last change: Fri Oct 9 17:53:44 2015 by root via crm_resource on example-host\n" "Stack: corosync\n" "Current DC: example-host (version 1.1.13-a14efad) - partition with quorum\n" "2 nodes and 2 resources configured\n" "\n" "Online: [ example-host ]\n" "GuestOnline: [ guest1@example-host ]\n" "\n" "Full list of resources:\n" "\n" " vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host\n" "\n" "PCSD Status:\n" " example-host: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: title #, no-c-format msgid "Starting Resources on KVM Guest" msgstr "" #. 
Tag: para #, no-c-format msgid "The commands below demonstrate how resources can be executed on both the guest node and the cluster node." msgstr "" #. Tag: para #, no-c-format msgid "Create a few Dummy resources. Dummy resources are real resource agents used just for testing purposes. They actually execute on the host they are assigned to just like an apache server or database would, except their execution just means a file was created. When the resource is stopped, the file it created is removed." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs resource create FAKE1 ocf:pacemaker:Dummy\n" "# pcs resource create FAKE2 ocf:pacemaker:Dummy\n" "# pcs resource create FAKE3 ocf:pacemaker:Dummy\n" "# pcs resource create FAKE4 ocf:pacemaker:Dummy\n" "# pcs resource create FAKE5 ocf:pacemaker:Dummy" msgstr "" #. Tag: para #, no-c-format msgid "Now check your pcs status output. In the resource section, you should see something like the following, where some of the resources started on the cluster node, and some started on the guest node." msgstr "" #. Tag: screen #, no-c-format msgid "Full list of resources:\n" "\n" " vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host\n" " FAKE1 (ocf::pacemaker:Dummy): Started guest1\n" " FAKE2 (ocf::pacemaker:Dummy): Started guest1\n" " FAKE3 (ocf::pacemaker:Dummy): Started example-host\n" " FAKE4 (ocf::pacemaker:Dummy): Started guest1\n" " FAKE5 (ocf::pacemaker:Dummy): Started example-host" msgstr "" #. Tag: para #, no-c-format msgid "The guest node, guest1, reacts just like any other node in the cluster. For example, pick out a resource that is running on your cluster node. For my purposes, I am picking FAKE3 from the output above. We can force FAKE3 to run on guest1 in the exact same way we would any other node." msgstr "" #. Tag: screen #, no-c-format msgid "# pcs constraint location FAKE3 prefers guest1" msgstr "" #.
Tag: para #, no-c-format msgid "Now, looking at the bottom of the pcs status output you’ll see FAKE3 is on guest1." msgstr "" #. Tag: screen #, no-c-format msgid "Full list of resources:\n" "\n" " vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host\n" " FAKE1 (ocf::pacemaker:Dummy): Started guest1\n" " FAKE2 (ocf::pacemaker:Dummy): Started guest1\n" " FAKE3 (ocf::pacemaker:Dummy): Started guest1\n" " FAKE4 (ocf::pacemaker:Dummy): Started example-host\n" " FAKE5 (ocf::pacemaker:Dummy): Started example-host" msgstr "" #. Tag: title #, no-c-format msgid "Testing Recovery and Fencing" msgstr "" #. Tag: para #, no-c-format msgid "Pacemaker’s policy engine is smart enough to know fencing guest nodes associated with a virtual machine means shutting off/rebooting the virtual machine. No special configuration is necessary to make this happen. If you are interested in testing this functionality out, try stopping the guest’s pacemaker_remote daemon. This would be the equivalent of abruptly terminating a cluster node’s corosync membership without properly shutting it down." msgstr "" #. Tag: para #, no-c-format msgid "ssh into the guest and run this command." msgstr "" #. Tag: screen #, no-c-format msgid "# kill -9 `pidof pacemaker_remoted`" msgstr "" #. Tag: para #, no-c-format msgid "Within a few seconds, your pcs status output will show a monitor failure, and the guest1 node will not be shown while it is being recovered." msgstr "" #.
Tag: screen #, no-c-format msgid "# pcs status\n" "Cluster name: mycluster\n" "Last updated: Fri Oct 9 18:08:35 2015 Last change: Fri Oct 9 18:07:00 2015 by root via cibadmin on example-host\n" "Stack: corosync\n" "Current DC: example-host (version 1.1.13-a14efad) - partition with quorum\n" "2 nodes and 7 resources configured\n" "\n" "Online: [ example-host ]\n" "\n" "Full list of resources:\n" "\n" " vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host\n" " FAKE1 (ocf::pacemaker:Dummy): Stopped\n" " FAKE2 (ocf::pacemaker:Dummy): Stopped\n" " FAKE3 (ocf::pacemaker:Dummy): Stopped\n" " FAKE4 (ocf::pacemaker:Dummy): Started example-host\n" " FAKE5 (ocf::pacemaker:Dummy): Started example-host\n" "\n" "Failed Actions:\n" "* guest1_monitor_30000 on example-host 'unknown error' (1): call=8, status=Error, exitreason='none',\n" " last-rc-change='Fri Oct 9 18:08:29 2015', queued=0ms, exec=0ms\n" "\n" "\n" "PCSD Status:\n" " example-host: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "A guest node involves two resources: the one you explicitly configured creates the guest, and Pacemaker creates an implicit resource for the pacemaker_remote connection, which will be named the same as the value of the remote-node attribute of the explicit resource. When we killed pacemaker_remote, it is the implicit resource that failed, which is why the failed action starts with guest1 and not vm-guest1." msgstr "" #. Tag: para #, no-c-format msgid "Once recovery of the guest is complete, you’ll see it automatically get re-integrated into the cluster. The final pcs status output should look something like this." msgstr "" #. 
Tag: screen #, no-c-format msgid "Cluster name: mycluster\n" "Last updated: Fri Oct 9 18:18:30 2015 Last change: Fri Oct 9 18:07:00 2015 by root via cibadmin on example-host\n" "Stack: corosync\n" "Current DC: example-host (version 1.1.13-a14efad) - partition with quorum\n" "2 nodes and 7 resources configured\n" "\n" "Online: [ example-host ]\n" "GuestOnline: [ guest1@example-host ]\n" "\n" "Full list of resources:\n" "\n" " vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host\n" " FAKE1 (ocf::pacemaker:Dummy): Started guest1\n" " FAKE2 (ocf::pacemaker:Dummy): Started guest1\n" " FAKE3 (ocf::pacemaker:Dummy): Started guest1\n" " FAKE4 (ocf::pacemaker:Dummy): Started example-host\n" " FAKE5 (ocf::pacemaker:Dummy): Started example-host\n" "\n" "Failed Actions:\n" "* guest1_monitor_30000 on example-host 'unknown error' (1): call=8, status=Error, exitreason='none',\n" " last-rc-change='Fri Oct 9 18:08:29 2015', queued=0ms, exec=0ms\n" "\n" "\n" "PCSD Status:\n" " example-host: Online\n" "\n" "Daemon Status:\n" " corosync: active/disabled\n" " pacemaker: active/disabled\n" " pcsd: active/enabled" msgstr "" #. Tag: para #, no-c-format msgid "Normally, once you’ve investigated and addressed a failed action, you can clear the failure. However Pacemaker does not yet support cleanup for the implicitly created connection resource while the explicit resource is active. If you want to clear the failed action from the status output, stop the guest resource before clearing it. For example:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs resource disable vm-guest1 --wait\n" "# pcs resource cleanup guest1\n" "# pcs resource enable vm-guest1" msgstr "" #. Tag: title #, no-c-format msgid "Accessing Cluster Tools from Guest Node" msgstr "" #. Tag: para #, no-c-format msgid "Besides allowing the cluster to manage resources on a guest node, pacemaker_remote has one other trick. 
The pacemaker_remote daemon allows nearly all the pacemaker tools (crm_resource, crm_mon, crm_attribute, crm_master, etc.) to work on guest nodes natively." msgstr "" #. Tag: para #, no-c-format msgid "Try it: Run crm_mon on the guest after pacemaker has integrated the guest node into the cluster. These tools just work. This means resource agents such as master/slave resources which need access to tools like crm_master work seamlessly on the guest nodes." msgstr "" #. Tag: para #, no-c-format msgid "Higher-level command shells such as pcs may have partial support on guest nodes, but it is recommended to run them from a cluster node." msgstr "" diff --git a/doc/Pacemaker_Remote/pot/Ch-Options.pot b/doc/Pacemaker_Remote/pot/Ch-Options.pot index a16b6653d3..945cfa2e7f 100644 --- a/doc/Pacemaker_Remote/pot/Ch-Options.pot +++ b/doc/Pacemaker_Remote/pot/Ch-Options.pot @@ -1,222 +1,232 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. Tag: title #, no-c-format msgid "Configuration Explained" msgstr "" #. Tag: para #, no-c-format msgid "The walk-through examples use some of these options, but don’t explain exactly what they mean or do. This section is meant to be the go-to resource for all the options available for configuring pacemaker_remote-based nodes. configuration " msgstr "" #. Tag: title #, no-c-format msgid "Resource Meta-Attributes for Guest Nodes" msgstr "" #. Tag: para #, no-c-format -msgid "When configuring a virtual machine to use as a guest node, these are the metadata options available to enable the resource as a guest node and define its connection parameters." 
+msgid "When configuring a virtual machine as a guest node, the virtual machine is created using one of the usual resource agents for that purpose (for example, ocf:heartbeat:VirtualDomain or ocf:heartbeat:Xen), with additional metadata parameters." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "No restrictions are enforced on what agents may be used to create a guest node, but obviously the agent must create a distinct environment capable of running the pacemaker_remote daemon and cluster resources. An additional requirement is that fencing the host running the guest node resource must be sufficient for ensuring the guest node is stopped. This means, for example, that not all hypervisors supported by VirtualDomain may be used to create guest nodes; if the guest can survive the hypervisor being fenced, it may not be used as a guest node." +msgstr "" + +#. Tag: para +#, no-c-format +msgid "Below are the metadata options available to enable a resource as a guest node and define its connection parameters." msgstr "" #. Tag: title #, no-c-format msgid "Meta-attributes for configuring VM resources as guest nodes" msgstr "" #. Tag: entry #, no-c-format msgid "Option" msgstr "" #. Tag: entry #, no-c-format msgid "Default" msgstr "" #. Tag: entry #, no-c-format msgid "Description" msgstr "" #. Tag: para #, no-c-format msgid "remote-node" msgstr "" #. Tag: para #, no-c-format msgid "none" msgstr "" #. Tag: para #, no-c-format msgid "The node name of the guest node this resource defines. This both enables the resource as a guest node and defines the unique name used to identify the guest node. If no other parameters are set, this value will also be assumed as the hostname to use when connecting to pacemaker_remote on the VM. This value must not overlap with any resource or node IDs." msgstr "" #. Tag: para #, no-c-format msgid "remote-port" msgstr "" #. Tag: para #, no-c-format msgid "3121" msgstr "" #. 
Tag: para #, no-c-format msgid "The port on the virtual machine that the cluster will use to connect to pacemaker_remote." msgstr "" #. Tag: para #, no-c-format msgid "remote-addr" msgstr "" #. Tag: para #, no-c-format msgid "value of remote-node" msgstr "" #. Tag: para #, no-c-format msgid "The IP address or hostname to use when connecting to pacemaker_remote on the VM." msgstr "" #. Tag: para #, no-c-format msgid "remote-connect-timeout" msgstr "" #. Tag: para #, no-c-format msgid "60s" msgstr "" #. Tag: para #, no-c-format msgid "How long before a pending guest connection will time out." msgstr "" #. Tag: title #, no-c-format msgid "Connection Resources for Remote Nodes" msgstr "" #. Tag: para #, no-c-format msgid "A remote node is defined by a connection resource. That connection resource has instance attributes that define where the remote node is located on the network and how to communicate with it." msgstr "" #. Tag: para #, no-c-format msgid "Descriptions of these instance attributes can be retrieved using the following pcs command:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs resource describe remote\n" "ocf:pacemaker:remote - remote resource agent\n" "\n" "Resource options:\n" " server: Server location to connect to. This can be an ip address or hostname.\n" " port: tcp port to connect to.\n" -" reconnect_interval: Time in seconds to wait before attempting to reconnect to\n" -" a remote node after an active connection to the remote\n" -" node has been severed. This wait is recurring. If\n" -" reconnect fails after the wait period, a new reconnect\n" -" attempt will be made after observing the wait time. When\n" -" this option is in use, pacemaker will keep attempting to\n" -" reach out and connect to the remote node indefinitely\n" -" after each wait interval." +" reconnect_interval: Interval in seconds at which Pacemaker will attempt to\n" +" reconnect to a remote node after an active connection to\n" +" the remote node has been severed. 
When this value is\n" +" nonzero, Pacemaker will retry the connection\n" +" indefinitely, at the specified interval. As with any\n" +" time-based actions, this is not guaranteed to be checked\n" +" more frequently than the value of the\n" +" cluster-recheck-interval cluster option." msgstr "" #. Tag: para #, no-c-format msgid "When defining a remote node’s connection resource, it is common and recommended to name the connection resource the same as the remote node’s hostname. By default, if no server option is provided, the cluster will attempt to contact the remote node using the resource name as the hostname." msgstr "" #. Tag: para #, no-c-format msgid "Example defining a remote node with the hostname remote1:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs resource create remote1 remote" msgstr "" #. Tag: para #, no-c-format msgid "Example defining a remote node to connect to a specific IP address and port:" msgstr "" #. Tag: screen #, no-c-format msgid "# pcs resource create remote1 remote server=192.168.122.200 port=8938" msgstr "" #. Tag: title #, no-c-format msgid "Environment Variables for Daemon Start-up" msgstr "" #. Tag: para #, no-c-format msgid "Authentication and encryption of the connection between cluster nodes and nodes running pacemaker_remote is achieved using TLS-PSK encryption/authentication over TCP (port 3121 by default). This means that both the cluster node and remote node must share the same private key. By default, this key is placed at /etc/pacemaker/authkey on each node." msgstr "" #. Tag: para #, no-c-format -msgid "You can change the default port and/or key location for Pacemaker and pacemaker_remote via environment variables. These environment variables can be enabled by placing them in the /etc/sysconfig/pacemaker file." +msgid "You can change the default port and/or key location for Pacemaker and pacemaker_remote via environment variables.
How these variables are set varies by OS, but usually they are set in the /etc/sysconfig/pacemaker or /etc/default/pacemaker file." msgstr "" #. Tag: screen #, no-c-format msgid "#==#==# Pacemaker Remote\n" "# Use a custom directory for finding the authkey.\n" "PCMK_authkey_location=/etc/pacemaker/authkey\n" "#\n" "# Specify a custom port for Pacemaker Remote connections\n" "PCMK_remote_port=3121" msgstr "" #. Tag: title #, no-c-format msgid "Removing Remote Nodes and Guest Nodes" msgstr "" #. Tag: para #, no-c-format msgid "If the resource creating a guest node, or the ocf:pacemaker:remote resource creating a connection to a remote node, is removed from the configuration, the affected node will continue to show up in output as an offline node." msgstr "" #. Tag: para #, no-c-format msgid "If you want to get rid of that output, run (replacing $NODE_NAME appropriately):" msgstr "" #. Tag: screen #, no-c-format msgid "# crm_node --force --remove $NODE_NAME" msgstr "" #. Tag: para #, no-c-format -msgid "Be absolutely sure that the node’s resource has been deleted from the configuration first." +msgid "Be absolutely sure that there are no references to the node’s resource in the configuration before running the above command." msgstr "" diff --git a/doc/Pacemaker_Remote/pot/Revision_History.pot b/doc/Pacemaker_Remote/pot/Revision_History.pot index 707c7a7ee7..1690a8c7c3 100644 --- a/doc/Pacemaker_Remote/pot/Revision_History.pot +++ b/doc/Pacemaker_Remote/pot/Revision_History.pot @@ -1,69 +1,74 @@ # # AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: 0\n" -"POT-Creation-Date: 2016-05-03 17:45-0500\n" -"PO-Revision-Date: 2016-05-03 17:45-0500\n" +"POT-Creation-Date: 2016-11-02 17:32-0500\n" +"PO-Revision-Date: 2016-11-02 17:32-0500\n" "Last-Translator: Automatically generated\n" "Language-Team: None\n" "MIME-Version: 1.0\n" "Content-Type: application/x-publican; charset=UTF-8\n" "Content-Transfer-Encoding: 8bit\n" #. 
Tag: title #, no-c-format msgid "Revision History" msgstr "" #. Tag: firstname #, no-c-format msgid "David" msgstr "" #. Tag: surname #, no-c-format msgid "Vossel" msgstr "" #. Tag: member #, no-c-format msgid "Import from Pages.app" msgstr "" #. Tag: member #, no-c-format msgid "Added Future Features Section" msgstr "" #. Tag: member #, no-c-format msgid "Added Baremetal remote-node feature documentation" msgstr "" #. Tag: firstname #, no-c-format msgid "Ken" msgstr "" #. Tag: surname #, no-c-format msgid "Gaillot" msgstr "" #. Tag: member #, no-c-format msgid "Targeted CentOS 7.1 and Pacemaker 1.1.12+, updated for current terminology and practice" msgstr "" #. Tag: member #, no-c-format msgid "Updated for Pacemaker 1.1.14" msgstr "" #. Tag: member #, no-c-format msgid "Updated for Pacemaker 1.1.15" msgstr "" +#. Tag: member +#, no-c-format +msgid "Updated for Pacemaker 1.1.16" +msgstr "" + diff --git a/include/crm/crm.h b/include/crm/crm.h index 08853a3c63..9628c5237d 100644 --- a/include/crm/crm.h +++ b/include/crm/crm.h @@ -1,216 +1,216 @@ /* * Copyright (C) 2004 Andrew Beekhof * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public * License as published by the Free Software Foundation; either * version 2 of the License, or (at your option) any later version. * * This software is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * General Public License for more details. 
* * You should have received a copy of the GNU Lesser General Public * License along with this library; if not, write to the Free Software * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ #ifndef CRM__H # define CRM__H /** * \file * \brief A dumping ground * \ingroup core */ # include # include # include # include # include # include -# define CRM_FEATURE_SET "3.0.11" +# define CRM_FEATURE_SET "3.0.12" # define EOS '\0' # define DIMOF(a) ((int) (sizeof(a)/sizeof(a[0])) ) # ifndef MAX_NAME # define MAX_NAME 256 # endif # ifndef __GNUC__ # define __builtin_expect(expr, result) (expr) # endif /* Some handy macros used by the Linux kernel */ # define __likely(expr) __builtin_expect(expr, 1) # define __unlikely(expr) __builtin_expect(expr, 0) # define CRM_META "CRM_meta" extern char *crm_system_name; /* *INDENT-OFF* */ /* Clean these up at some point, some probably should be runtime options */ # define SOCKET_LEN 1024 # define APPNAME_LEN 256 # define MAX_IPC_FAIL 5 # define MAX_IPC_DELAY 120 # define DAEMON_RESPAWN_STOP 100 # define MSG_LOG 1 # define DOT_FSA_ACTIONS 1 # define DOT_ALL_FSA_INPUTS 1 /* #define FSA_TRACE 1 */ /* This header defines INFINITY, but it might be defined elsewhere as well * (e.g. math.h), so undefine it first. This, of course, complicates any attempt * to use the other definition in any code that includes this header. * * @TODO: Rename our constant (which will break API backward compatibility). 
*/ # undef INFINITY # define INFINITY_S "INFINITY" # define MINUS_INFINITY_S "-INFINITY" # define INFINITY 1000000 /* Sub-systems */ # define CRM_SYSTEM_DC "dc" # define CRM_SYSTEM_DCIB "dcib" /* The master CIB */ # define CRM_SYSTEM_CIB "cib" # define CRM_SYSTEM_CRMD "crmd" # define CRM_SYSTEM_LRMD "lrmd" # define CRM_SYSTEM_PENGINE "pengine" # define CRM_SYSTEM_TENGINE "tengine" # define CRM_SYSTEM_STONITHD "stonithd" # define CRM_SYSTEM_MCP "pacemakerd" /* Valid operations */ # define CRM_OP_NOOP "noop" # define CRM_OP_JOIN_ANNOUNCE "join_announce" # define CRM_OP_JOIN_OFFER "join_offer" # define CRM_OP_JOIN_REQUEST "join_request" # define CRM_OP_JOIN_ACKNAK "join_ack_nack" # define CRM_OP_JOIN_CONFIRM "join_confirm" # define CRM_OP_DIE "die_no_respawn" # define CRM_OP_RETRIVE_CIB "retrieve_cib" # define CRM_OP_PING "ping" # define CRM_OP_THROTTLE "throttle" # define CRM_OP_VOTE "vote" # define CRM_OP_NOVOTE "no-vote" # define CRM_OP_HELLO "hello" # define CRM_OP_HBEAT "dc_beat" # define CRM_OP_PECALC "pe_calc" # define CRM_OP_ABORT "abort" # define CRM_OP_QUIT "quit" # define CRM_OP_LOCAL_SHUTDOWN "start_shutdown" # define CRM_OP_SHUTDOWN_REQ "req_shutdown" # define CRM_OP_SHUTDOWN "do_shutdown" # define CRM_OP_FENCE "stonith" # define CRM_OP_EVENTCC "event_cc" # define CRM_OP_TEABORT "te_abort" # define CRM_OP_TEABORTED "te_abort_confirmed" /* we asked */ # define CRM_OP_TE_HALT "te_halt" # define CRM_OP_TECOMPLETE "te_complete" # define CRM_OP_TETIMEOUT "te_timeout" # define CRM_OP_TRANSITION "transition" # define CRM_OP_REGISTER "register" # define CRM_OP_IPC_FWD "ipc_fwd" # define CRM_OP_DEBUG_UP "debug_inc" # define CRM_OP_DEBUG_DOWN "debug_dec" # define CRM_OP_INVOKE_LRM "lrm_invoke" # define CRM_OP_LRM_REFRESH "lrm_refresh" /* Deprecated */ # define CRM_OP_LRM_QUERY "lrm_query" # define CRM_OP_LRM_DELETE "lrm_delete" # define CRM_OP_LRM_FAIL "lrm_fail" # define CRM_OP_PROBED "probe_complete" # define CRM_OP_NODES_PROBED "probe_nodes_complete" # define 
CRM_OP_REPROBE "probe_again" # define CRM_OP_CLEAR_FAILCOUNT "clear_failcount" # define CRM_OP_REMOTE_STATE "remote_state" # define CRM_OP_RELAXED_SET "one-or-more" # define CRM_OP_RELAXED_CLONE "clone-one-or-more" # define CRM_OP_RM_NODE_CACHE "rm_node_cache" # define CRMD_JOINSTATE_DOWN "down" # define CRMD_JOINSTATE_PENDING "pending" # define CRMD_JOINSTATE_MEMBER "member" # define CRMD_JOINSTATE_NACK "banned" # define CRMD_ACTION_DELETE "delete" # define CRMD_ACTION_CANCEL "cancel" # define CRMD_ACTION_RELOAD "reload" # define CRMD_ACTION_MIGRATE "migrate_to" # define CRMD_ACTION_MIGRATED "migrate_from" # define CRMD_ACTION_START "start" # define CRMD_ACTION_STARTED "running" # define CRMD_ACTION_STOP "stop" # define CRMD_ACTION_STOPPED "stopped" # define CRMD_ACTION_PROMOTE "promote" # define CRMD_ACTION_PROMOTED "promoted" # define CRMD_ACTION_DEMOTE "demote" # define CRMD_ACTION_DEMOTED "demoted" # define CRMD_ACTION_NOTIFY "notify" # define CRMD_ACTION_NOTIFIED "notified" # define CRMD_ACTION_STATUS "monitor" /* short names */ # define RSC_DELETE CRMD_ACTION_DELETE # define RSC_CANCEL CRMD_ACTION_CANCEL # define RSC_MIGRATE CRMD_ACTION_MIGRATE # define RSC_MIGRATED CRMD_ACTION_MIGRATED # define RSC_START CRMD_ACTION_START # define RSC_STARTED CRMD_ACTION_STARTED # define RSC_STOP CRMD_ACTION_STOP # define RSC_STOPPED CRMD_ACTION_STOPPED # define RSC_PROMOTE CRMD_ACTION_PROMOTE # define RSC_PROMOTED CRMD_ACTION_PROMOTED # define RSC_DEMOTE CRMD_ACTION_DEMOTE # define RSC_DEMOTED CRMD_ACTION_DEMOTED # define RSC_NOTIFY CRMD_ACTION_NOTIFY # define RSC_NOTIFIED CRMD_ACTION_NOTIFIED # define RSC_STATUS CRMD_ACTION_STATUS /* *INDENT-ON* */ typedef GList *GListPtr; # include # include # include # define crm_str_hash g_str_hash_traditional guint crm_strcase_hash(gconstpointer v); guint g_str_hash_traditional(gconstpointer v); static inline const char *crm_action_str(const char *task, int interval) { if(safe_str_eq(task, RSC_STATUS) && !interval) { return "probe"; } 
return task; } #endif diff --git a/lib/cib/Makefile.am b/lib/cib/Makefile.am index 637ea8cf73..2d22ff963c 100644 --- a/lib/cib/Makefile.am +++ b/lib/cib/Makefile.am @@ -1,36 +1,36 @@ # # Copyright (C) 2004 Andrew Beekhof # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation; either version 2 # of the License, or (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. # include $(top_srcdir)/Makefile.common ## libraries lib_LTLIBRARIES = libcib.la ## SOURCES libcib_la_SOURCES = cib_ops.c cib_utils.c cib_client.c cib_native.c cib_attrs.c libcib_la_SOURCES += cib_file.c cib_remote.c -libcib_la_LDFLAGS = -version-info 5:1:1 +libcib_la_LDFLAGS = -version-info 5:2:1 libcib_la_CPPFLAGS = -I$(top_srcdir) $(AM_CPPFLAGS) libcib_la_CFLAGS = $(CFLAGS_HARDENED_LIB) libcib_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB) libcib_la_LIBADD = $(CRYPTOLIB) $(top_builddir)/lib/pengine/libpe_rules.la $(top_builddir)/lib/common/libcrmcommon.la clean-generic: rm -f *.log *.debug *.xml *~ diff --git a/lib/cluster/Makefile.am b/lib/cluster/Makefile.am index 9a57bbb283..25085d0922 100644 --- a/lib/cluster/Makefile.am +++ b/lib/cluster/Makefile.am @@ -1,45 +1,45 @@ # # Copyright (C) 2004 Andrew Beekhof # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation; either version 2 # of the License, or (at your option) any later version. 
# # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. # include $(top_srcdir)/Makefile.common ## libraries lib_LTLIBRARIES = libcrmcluster.la -libcrmcluster_la_LDFLAGS = -version-info 6:0:2 +libcrmcluster_la_LDFLAGS = -version-info 6:1:2 libcrmcluster_la_CFLAGS = $(CFLAGS_HARDENED_LIB) libcrmcluster_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB) libcrmcluster_la_LIBADD = $(top_builddir)/lib/common/libcrmcommon.la $(top_builddir)/lib/fencing/libstonithd.la $(CLUSTERLIBS) libcrmcluster_la_SOURCES = election.c cluster.c membership.c if BUILD_CS_SUPPORT libcrmcluster_la_SOURCES += cpg.c if BUILD_CS_PLUGIN libcrmcluster_la_SOURCES += legacy.c else libcrmcluster_la_SOURCES += corosync.c endif endif if BUILD_HEARTBEAT_SUPPORT libcrmcluster_la_SOURCES += heartbeat.c #libcrmcluster_la_LIBADD += -ldl endif clean-generic: rm -f *.log *.debug *.xml *~ diff --git a/lib/cluster/election.c b/lib/cluster/election.c index 032091a9ee..a8902d3df3 100644 --- a/lib/cluster/election.c +++ b/lib/cluster/election.c @@ -1,527 +1,517 @@ /* - * Copyright (C) 2004 Andrew Beekhof + * Copyright (C) 2004-2016 Andrew Beekhof * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public - * License as published by the Free Software Foundation; either - * version 2 of the License, or (at your option) any later version. - * - * This software is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public - * License along with this library; if not, write to the Free Software - * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * This source code is licensed under the GNU Lesser General Public License + * version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY. */ + #include #include #include #include #include #include #include #include #include #define STORM_INTERVAL 2 /* in seconds */ #define STORM_MULTIPLIER 5 /* multiplied by the number of nodes */ struct election_s { enum election_result state; guint count; char *name; char *uname; GSourceFunc cb; GHashTable *voted; mainloop_timer_t *timeout; /* When to stop if not everyone casts a vote */ }; static void election_complete(election_t *e) { crm_info("Election %s complete", e->name); e->state = election_won; if(e->cb) { e->cb(e); } election_reset(e); } static gboolean election_timer_cb(gpointer user_data) { election_t *e = user_data; crm_info("Election %s %p timed out", e->name, e); election_complete(e); return FALSE; } enum election_result election_state(election_t *e) { if(e) { return e->state; } return election_error; } election_t * election_init(const char *name, const char *uname, guint period_ms, GSourceFunc cb) { static guint count = 0; election_t *e = calloc(1, sizeof(election_t)); if(e != NULL) { if(name) { e->name = crm_strdup_printf("election-%s", name); } else { e->name = crm_strdup_printf("election-%u", count++); } e->cb = cb; e->uname = strdup(uname); e->timeout = mainloop_timer_add(e->name, period_ms, FALSE, election_timer_cb, e); crm_trace("Created %s %p", e->name, e); } return e; } void election_remove(election_t *e, const char *uname) { if(e && uname && e->voted) { g_hash_table_remove(e->voted, uname); } } void election_reset(election_t *e) { crm_trace("Resetting election %s", e->name); if(e) { mainloop_timer_stop(e->timeout); } if (e && e->voted) { 
crm_trace("Destroying voted cache with %d members", g_hash_table_size(e->voted)); g_hash_table_destroy(e->voted); e->voted = NULL; } } void election_fini(election_t *e) { if(e) { election_reset(e); crm_trace("Destroying %s", e->name); mainloop_timer_del(e->timeout); free(e->uname); free(e->name); free(e); } } static void election_timeout_start(election_t *e) { if(e) { mainloop_timer_start(e->timeout); } } void election_timeout_stop(election_t *e) { if(e) { mainloop_timer_stop(e->timeout); } } void election_timeout_set_period(election_t *e, guint period) { if(e) { mainloop_timer_set_period(e->timeout, period); } else { crm_err("No election defined"); } } static int crm_uptime(struct timeval *output) { static time_t expires = 0; static struct rusage info; time_t tm_now = time(NULL); if (expires < tm_now) { int rc = 0; info.ru_utime.tv_sec = 0; info.ru_utime.tv_usec = 0; rc = getrusage(RUSAGE_SELF, &info); output->tv_sec = 0; output->tv_usec = 0; if (rc < 0) { crm_perror(LOG_ERR, "Could not calculate the current uptime"); expires = 0; return -1; } crm_debug("Current CPU usage is: %lds, %ldus", (long)info.ru_utime.tv_sec, (long)info.ru_utime.tv_usec); } expires = tm_now + STORM_INTERVAL; /* N seconds after the last _access_ */ output->tv_sec = info.ru_utime.tv_sec; output->tv_usec = info.ru_utime.tv_usec; return 1; } static int crm_compare_age(struct timeval your_age) { struct timeval our_age; crm_uptime(&our_age); /* If an error occurred, our_age will be compared as {0,0} */ if (our_age.tv_sec > your_age.tv_sec) { crm_debug("Win: %ld vs %ld (seconds)", (long)our_age.tv_sec, (long)your_age.tv_sec); return 1; } else if (our_age.tv_sec < your_age.tv_sec) { crm_debug("Lose: %ld vs %ld (seconds)", (long)our_age.tv_sec, (long)your_age.tv_sec); return -1; } else if (our_age.tv_usec > your_age.tv_usec) { crm_debug("Win: %ld.%ld vs %ld.%ld (usec)", (long)our_age.tv_sec, (long)our_age.tv_usec, (long)your_age.tv_sec, (long)your_age.tv_usec); return 1; } else if (our_age.tv_usec 
< your_age.tv_usec) { crm_debug("Lose: %ld.%ld vs %ld.%ld (usec)", (long)our_age.tv_sec, (long)our_age.tv_usec, (long)your_age.tv_sec, (long)your_age.tv_usec); return -1; } return 0; } void election_vote(election_t *e) { struct timeval age; xmlNode *vote = NULL; crm_node_t *our_node; if(e == NULL) { crm_trace("Not voting in election: not initialized"); return; } our_node = crm_get_peer(0, e->uname); if (our_node == NULL || crm_is_peer_active(our_node) == FALSE) { crm_trace("Cannot vote yet: %p", our_node); return; } e->state = election_in_progress; vote = create_request(CRM_OP_VOTE, NULL, NULL, CRM_SYSTEM_CRMD, CRM_SYSTEM_CRMD, NULL); e->count++; crm_xml_add(vote, F_CRM_ELECTION_OWNER, our_node->uuid); crm_xml_add_int(vote, F_CRM_ELECTION_ID, e->count); crm_uptime(&age); crm_xml_add_int(vote, F_CRM_ELECTION_AGE_S, age.tv_sec); crm_xml_add_int(vote, F_CRM_ELECTION_AGE_US, age.tv_usec); send_cluster_message(NULL, crm_msg_crmd, vote, TRUE); free_xml(vote); crm_debug("Started election %d", e->count); if (e->voted) { g_hash_table_destroy(e->voted); e->voted = NULL; } election_timeout_start(e); return; } bool election_check(election_t *e) { int voted_size = 0; int num_members = crm_active_peers(); if(e == NULL) { crm_trace("not initialized"); return FALSE; } if (e->voted) { voted_size = g_hash_table_size(e->voted); } /* in the case of #voted > #members, it is better to * wait for the timeout and give the cluster time to * stabilize */ if (voted_size >= num_members) { /* we won and everyone has voted */ election_timeout_stop(e); if (voted_size > num_members) { GHashTableIter gIter; const crm_node_t *node; char *key = NULL; g_hash_table_iter_init(&gIter, crm_peer_cache); while (g_hash_table_iter_next(&gIter, NULL, (gpointer *) & node)) { if (crm_is_peer_active(node)) { crm_err("member: %s proc=%.32x", node->uname, node->processes); } } g_hash_table_iter_init(&gIter, e->voted); while (g_hash_table_iter_next(&gIter, (gpointer *) & key, NULL)) { crm_err("voted: %s", key); } } 
election_complete(e); return TRUE; } else { crm_debug("Still waiting on %d non-votes (%d total)", num_members - voted_size, num_members); } return FALSE; } #define loss_dampen 2 /* in seconds */ /* A_ELECTION_COUNT */ enum election_result election_count_vote(election_t *e, xmlNode *vote, bool can_win) { int age = 0; int election_id = -1; int log_level = LOG_INFO; gboolean use_born_on = FALSE; gboolean done = FALSE; gboolean we_lose = FALSE; const char *op = NULL; const char *from = NULL; const char *reason = "unknown"; const char *election_owner = NULL; crm_node_t *our_node = NULL, *your_node = NULL; static int election_wins = 0; xmlNode *novote = NULL; time_t tm_now = time(NULL); static time_t expires = 0; static time_t last_election_loss = 0; /* if the membership copy is NULL we REALLY shouldn't be voting * the question is how we managed to get here. */ CRM_CHECK(vote != NULL, return election_error); if(e == NULL) { crm_info("Not voting in election: not initialized"); return election_lost; } else if(crm_peer_cache == NULL) { crm_info("Not voting in election: no peer cache"); return election_lost; } op = crm_element_value(vote, F_CRM_TASK); from = crm_element_value(vote, F_CRM_HOST_FROM); election_owner = crm_element_value(vote, F_CRM_ELECTION_OWNER); crm_element_value_int(vote, F_CRM_ELECTION_ID, &election_id); your_node = crm_get_peer(0, from); our_node = crm_get_peer(0, e->uname); if (e->voted == NULL) { crm_debug("Created voted hash"); e->voted = g_hash_table_new_full(crm_str_hash, g_str_equal, g_hash_destroy_str, g_hash_destroy_str); } if (is_heartbeat_cluster()) { use_born_on = TRUE; } else if (is_classic_ais_cluster()) { use_born_on = TRUE; } if(can_win == FALSE) { reason = "Not eligible"; we_lose = TRUE; } else if (our_node == NULL || crm_is_peer_active(our_node) == FALSE) { reason = "We are not part of the cluster"; log_level = LOG_ERR; we_lose = TRUE; } else if (election_id != e->count && crm_str_eq(our_node->uuid, election_owner, TRUE)) { log_level = 
LOG_TRACE; reason = "Superseded"; done = TRUE; } else if (your_node == NULL || crm_is_peer_active(your_node) == FALSE) { /* Possibly we cached the message in the FSA queue at a point that it wasn't */ reason = "Peer is not part of our cluster"; log_level = LOG_WARNING; done = TRUE; } else if (crm_str_eq(op, CRM_OP_NOVOTE, TRUE)) { char *op_copy = strdup(op); char *uname_copy = strdup(from); CRM_ASSERT(crm_str_eq(our_node->uuid, election_owner, TRUE)); /* update the list of nodes that have voted */ g_hash_table_replace(e->voted, uname_copy, op_copy); reason = "Recorded"; done = TRUE; } else { struct timeval your_age; const char *your_version = crm_element_value(vote, F_CRM_VERSION); int tv_sec = 0; int tv_usec = 0; crm_element_value_int(vote, F_CRM_ELECTION_AGE_S, &tv_sec); crm_element_value_int(vote, F_CRM_ELECTION_AGE_US, &tv_usec); your_age.tv_sec = tv_sec; your_age.tv_usec = tv_usec; age = crm_compare_age(your_age); if (crm_str_eq(from, e->uname, TRUE)) { char *op_copy = strdup(op); char *uname_copy = strdup(from); CRM_ASSERT(crm_str_eq(our_node->uuid, election_owner, TRUE)); /* update ourselves in the list of nodes that have voted */ g_hash_table_replace(e->voted, uname_copy, op_copy); reason = "Recorded"; done = TRUE; } else if (compare_version(your_version, CRM_FEATURE_SET) < 0) { reason = "Version"; we_lose = TRUE; } else if (compare_version(your_version, CRM_FEATURE_SET) > 0) { reason = "Version"; } else if (age < 0) { reason = "Uptime"; we_lose = TRUE; } else if (age > 0) { reason = "Uptime"; /* TODO: Check for y(our) born < 0 */ } else if (use_born_on && your_node->born < our_node->born) { reason = "Born"; we_lose = TRUE; } else if (use_born_on && your_node->born > our_node->born) { reason = "Born"; } else if (e->uname == NULL) { reason = "Unknown host name"; we_lose = TRUE; } else if (strcasecmp(e->uname, from) > 0) { reason = "Host name"; we_lose = TRUE; } else { reason = "Host name"; CRM_ASSERT(strcasecmp(e->uname, from) < 0); /* can't happen... 
* } else if(strcasecmp(e->uname, from) == 0) { * */ } } if (expires < tm_now) { election_wins = 0; expires = tm_now + STORM_INTERVAL; } else if (done == FALSE && we_lose == FALSE) { int peers = 1 + g_hash_table_size(crm_peer_cache); /* If every node has to vote down every other node, thats N*(N-1) total elections * Allow some leway before _really_ complaining */ election_wins++; if (election_wins > (peers * peers)) { crm_warn("Election storm detected: %d elections in %d seconds", election_wins, STORM_INTERVAL); election_wins = 0; expires = tm_now + STORM_INTERVAL; crm_write_blackbox(0, NULL); } } if (done) { do_crm_log(log_level + 1, "Election %d (current: %d, owner: %s): Processed %s from %s (%s)", election_id, e->count, election_owner, op, from, reason); return e->state; } else if (we_lose == FALSE) { do_crm_log(log_level, "Election %d (owner: %s) pass: %s from %s (%s)", election_id, election_owner, op, from, reason); if (last_election_loss == 0 || tm_now - last_election_loss > (time_t) loss_dampen) { last_election_loss = 0; election_timeout_stop(e); /* Start a new election by voting down this, and other, peers */ e->state = election_start; return e->state; } crm_info("Election %d ignore: We already lost an election less than %ds ago (%s)", election_id, loss_dampen, ctime(&last_election_loss)); } novote = create_request(CRM_OP_NOVOTE, NULL, from, CRM_SYSTEM_CRMD, CRM_SYSTEM_CRMD, NULL); do_crm_log(log_level, "Election %d (owner: %s) lost: %s from %s (%s)", election_id, election_owner, op, from, reason); election_timeout_stop(e); crm_xml_add(novote, F_CRM_ELECTION_OWNER, election_owner); crm_xml_add_int(novote, F_CRM_ELECTION_ID, election_id); send_cluster_message(your_node, crm_msg_crmd, novote, TRUE); free_xml(novote); last_election_loss = tm_now; e->state = election_lost; return e->state; } diff --git a/lib/common/Makefile.am b/lib/common/Makefile.am index 03526b7ff1..49f2fb29f6 100644 --- a/lib/common/Makefile.am +++ b/lib/common/Makefile.am @@ -1,50 +1,50 @@ 
# # Copyright (C) 2004 Andrew Beekhof # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation; either version 2 # of the License, or (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. # include $(top_srcdir)/Makefile.common AM_CPPFLAGS += -I$(top_builddir)/lib/gnu -I$(top_srcdir)/lib/gnu \ -DSBINDIR=\"$(sbindir)\" ## libraries lib_LTLIBRARIES = libcrmcommon.la # Can't use -Wcast-qual here because glib insists on pretending things are const # when they're not and thus we need the crm_element_value_const() hack # s390 needs -fPIC # s390-suse-linux/bin/ld: .libs/ipc.o: relocation R_390_PC32DBL against `__stack_chk_fail@@GLIBC_2.4' can not be used when making a shared object; recompile with -fPIC CFLAGS = $(CFLAGS_COPY:-Wcast-qual=) -fPIC -libcrmcommon_la_LDFLAGS = -version-info 9:0:6 +libcrmcommon_la_LDFLAGS = -version-info 9:1:6 libcrmcommon_la_CFLAGS = $(CFLAGS_HARDENED_LIB) libcrmcommon_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB) libcrmcommon_la_LIBADD = @LIBADD_DL@ $(GNUTLSLIBS) -lm libcrmcommon_la_SOURCES = compat.c digest.c ipc.c io.c procfs.c utils.c xml.c \ iso8601.c remote.c mainloop.c logging.c watchdog.c \ schemas.c strings.c xpath.c if BUILD_CIBSECRETS libcrmcommon_la_SOURCES += cib_secrets.c endif libcrmcommon_la_SOURCES += $(top_builddir)/lib/gnu/md5.c clean-generic: rm -f *.log *.debug *.xml *~ diff --git a/lib/common/ipc.c b/lib/common/ipc.c index f060fcdb09..2949837e37 100644 --- a/lib/common/ipc.c +++ 
b/lib/common/ipc.c @@ -1,1293 +1,1285 @@ /* * Copyright (C) 2004 Andrew Beekhof * * This library is free software; you can redistribute it and/or * modify it under the terms of the GNU Lesser General Public * License as published by the Free Software Foundation; either * version 2.1 of the License, or (at your option) any later version. * * This library is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * Lesser General Public License for more details. * * You should have received a copy of the GNU Lesser General Public * License along with this library; if not, write to the Free Software * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define PCMK_IPC_VERSION 1 struct crm_ipc_response_header { struct qb_ipc_response_header qb; uint32_t size_uncompressed; uint32_t size_compressed; uint32_t flags; uint8_t version; /* Protect against version changes for anyone that might bother to statically link us */ }; static int hdr_offset = 0; static unsigned int ipc_buffer_max = 0; static unsigned int pick_ipc_buffer(unsigned int max); static inline void crm_ipc_init(void) { if (hdr_offset == 0) { hdr_offset = sizeof(struct crm_ipc_response_header); } if (ipc_buffer_max == 0) { ipc_buffer_max = pick_ipc_buffer(0); } } unsigned int crm_ipc_default_buffer_size(void) { return pick_ipc_buffer(0); } static char * generateReference(const char *custom1, const char *custom2) { static uint ref_counter = 0; const char *local_cust1 = custom1; const char *local_cust2 = custom2; int reference_len = 4; char *since_epoch = NULL; reference_len += 20; /* too big */ reference_len += 40; /* too big */ if (local_cust1 == NULL) { local_cust1 = "_empty_"; } reference_len += strlen(local_cust1); if (local_cust2 == 
NULL) { local_cust2 = "_empty_"; } reference_len += strlen(local_cust2); since_epoch = calloc(1, reference_len); if (since_epoch != NULL) { sprintf(since_epoch, "%s-%s-%lu-%u", local_cust1, local_cust2, (unsigned long)time(NULL), ref_counter++); } return since_epoch; } xmlNode * create_request_adv(const char *task, xmlNode * msg_data, const char *host_to, const char *sys_to, const char *sys_from, const char *uuid_from, const char *origin) { char *true_from = NULL; xmlNode *request = NULL; char *reference = generateReference(task, sys_from); if (uuid_from != NULL) { true_from = generate_hash_key(sys_from, uuid_from); } else if (sys_from != NULL) { true_from = strdup(sys_from); } else { crm_err("No sys from specified"); } /* host_from will get set for us if necessary by CRMd when routed */ request = create_xml_node(NULL, __FUNCTION__); crm_xml_add(request, F_CRM_ORIGIN, origin); crm_xml_add(request, F_TYPE, T_CRM); crm_xml_add(request, F_CRM_VERSION, CRM_FEATURE_SET); crm_xml_add(request, F_CRM_MSG_TYPE, XML_ATTR_REQUEST); crm_xml_add(request, F_CRM_REFERENCE, reference); crm_xml_add(request, F_CRM_TASK, task); crm_xml_add(request, F_CRM_SYS_TO, sys_to); crm_xml_add(request, F_CRM_SYS_FROM, true_from); /* HOSTTO will be ignored if it is to the DC anyway. 
*/ if (host_to != NULL && strlen(host_to) > 0) { crm_xml_add(request, F_CRM_HOST_TO, host_to); } if (msg_data != NULL) { add_message_xml(request, F_CRM_DATA, msg_data); } free(reference); free(true_from); return request; } /* * This method adds a copy of xml_response_data */ xmlNode * create_reply_adv(xmlNode * original_request, xmlNode * xml_response_data, const char *origin) { xmlNode *reply = NULL; const char *host_from = crm_element_value(original_request, F_CRM_HOST_FROM); const char *sys_from = crm_element_value(original_request, F_CRM_SYS_FROM); const char *sys_to = crm_element_value(original_request, F_CRM_SYS_TO); const char *type = crm_element_value(original_request, F_CRM_MSG_TYPE); const char *operation = crm_element_value(original_request, F_CRM_TASK); const char *crm_msg_reference = crm_element_value(original_request, F_CRM_REFERENCE); if (type == NULL) { crm_err("Cannot create new_message, no message type in original message"); CRM_ASSERT(type != NULL); return NULL; #if 0 } else if (strcasecmp(XML_ATTR_REQUEST, type) != 0) { crm_err("Cannot create new_message, original message was not a request"); return NULL; #endif } reply = create_xml_node(NULL, __FUNCTION__); if (reply == NULL) { crm_err("Cannot create new_message, malloc failed"); return NULL; } crm_xml_add(reply, F_CRM_ORIGIN, origin); crm_xml_add(reply, F_TYPE, T_CRM); crm_xml_add(reply, F_CRM_VERSION, CRM_FEATURE_SET); crm_xml_add(reply, F_CRM_MSG_TYPE, XML_ATTR_RESPONSE); crm_xml_add(reply, F_CRM_REFERENCE, crm_msg_reference); crm_xml_add(reply, F_CRM_TASK, operation); /* since this is a reply, we reverse the from and to */ crm_xml_add(reply, F_CRM_SYS_TO, sys_from); crm_xml_add(reply, F_CRM_SYS_FROM, sys_to); /* HOSTTO will be ignored if it is to the DC anyway. 
*/ if (host_from != NULL && strlen(host_from) > 0) { crm_xml_add(reply, F_CRM_HOST_TO, host_from); } if (xml_response_data != NULL) { add_message_xml(reply, F_CRM_DATA, xml_response_data); } return reply; } /* Libqb based IPC */ /* Server... */ GHashTable *client_connections = NULL; crm_client_t * crm_client_get(qb_ipcs_connection_t * c) { if (client_connections) { return g_hash_table_lookup(client_connections, c); } crm_trace("No client found for %p", c); return NULL; } crm_client_t * crm_client_get_by_id(const char *id) { gpointer key; crm_client_t *client; GHashTableIter iter; if (client_connections && id) { g_hash_table_iter_init(&iter, client_connections); while (g_hash_table_iter_next(&iter, &key, (gpointer *) & client)) { if (strcmp(client->id, id) == 0) { return client; } } } crm_trace("No client found with id=%s", id); return NULL; } const char * crm_client_name(crm_client_t * c) { if (c == NULL) { return "null"; } else if (c->name == NULL && c->id == NULL) { return "unknown"; } else if (c->name == NULL) { return c->id; } else { return c->name; } } void crm_client_init(void) { if (client_connections == NULL) { crm_trace("Creating client hash table"); client_connections = g_hash_table_new(g_direct_hash, g_direct_equal); } } void crm_client_cleanup(void) { if (client_connections != NULL) { int active = g_hash_table_size(client_connections); if (active) { crm_err("Exiting with %d active connections", active); } g_hash_table_destroy(client_connections); client_connections = NULL; } } void crm_client_disconnect_all(qb_ipcs_service_t *service) { qb_ipcs_connection_t *c = NULL; if (service == NULL) { return; } c = qb_ipcs_connection_first_get(service); while (c != NULL) { qb_ipcs_connection_t *last = c; c = qb_ipcs_connection_next_get(service, last); /* There really shouldn't be anyone connected at this point */ crm_notice("Disconnecting client %p, pid=%d...", last, crm_ipcs_client_pid(last)); qb_ipcs_disconnect(last); qb_ipcs_connection_unref(last); } } 
crm_client_t * crm_client_new(qb_ipcs_connection_t * c, uid_t uid_client, gid_t gid_client) { - static uid_t uid_server = 0; static gid_t gid_cluster = 0; crm_client_t *client = NULL; CRM_LOG_ASSERT(c); if (c == NULL) { return NULL; } if (gid_cluster == 0) { - uid_server = getuid(); if(crm_user_lookup(CRM_DAEMON_USER, NULL, &gid_cluster) < 0) { static bool have_error = FALSE; if(have_error == FALSE) { crm_warn("Could not find group for user %s", CRM_DAEMON_USER); have_error = TRUE; } } } - if(gid_cluster != 0 && gid_client != 0) { - uid_t best_uid = -1; /* Passing -1 to chown(2) means don't change */ - - if(uid_client == 0 || uid_server == 0) { /* Someone is priveliged, but the other may not be */ - best_uid = QB_MAX(uid_client, uid_server); - crm_trace("Allowing user %u to clean up after disconnect", best_uid); - } - + if (uid_client != 0) { crm_trace("Giving access to group %u", gid_cluster); - qb_ipcs_connection_auth_set(c, best_uid, gid_cluster, S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP); + /* Passing -1 to chown(2) means don't change */ + qb_ipcs_connection_auth_set(c, -1, gid_cluster, S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP); } crm_client_init(); /* TODO: Do our own auth checking, return NULL if unauthorized */ client = calloc(1, sizeof(crm_client_t)); client->ipcs = c; client->kind = CRM_CLIENT_IPC; client->pid = crm_ipcs_client_pid(c); client->id = crm_generate_uuid(); crm_debug("Connecting %p for uid=%d gid=%d pid=%u id=%s", c, uid_client, gid_client, client->pid, client->id); #if ENABLE_ACL client->user = uid2username(uid_client); #endif g_hash_table_insert(client_connections, c, client); return client; } void crm_client_destroy(crm_client_t * c) { if (c == NULL) { return; } if (client_connections) { if (c->ipcs) { crm_trace("Destroying %p/%p (%d remaining)", c, c->ipcs, crm_hash_table_size(client_connections) - 1); g_hash_table_remove(client_connections, c->ipcs); } else { crm_trace("Destroying remote connection %p (%d remaining)", c, 
crm_hash_table_size(client_connections) - 1); g_hash_table_remove(client_connections, c->id); } } if (c->event_timer) { g_source_remove(c->event_timer); } crm_debug("Destroying %d events", g_list_length(c->event_queue)); while (c->event_queue) { struct iovec *event = c->event_queue->data; c->event_queue = g_list_remove(c->event_queue, event); free(event[0].iov_base); free(event[1].iov_base); free(event); } free(c->id); free(c->name); free(c->user); if (c->remote) { if (c->remote->auth_timeout) { g_source_remove(c->remote->auth_timeout); } free(c->remote->buffer); free(c->remote); } free(c); } int crm_ipcs_client_pid(qb_ipcs_connection_t * c) { struct qb_ipcs_connection_stats stats; stats.client_pid = 0; qb_ipcs_connection_stats_get(c, &stats, 0); return stats.client_pid; } xmlNode * crm_ipcs_recv(crm_client_t * c, void *data, size_t size, uint32_t * id, uint32_t * flags) { xmlNode *xml = NULL; char *uncompressed = NULL; char *text = ((char *)data) + sizeof(struct crm_ipc_response_header); struct crm_ipc_response_header *header = data; if (id) { *id = ((struct qb_ipc_response_header *)data)->id; } if (flags) { *flags = header->flags; } if (is_set(header->flags, crm_ipc_proxied)) { /* mark this client as being the endpoint of a proxy connection. 
* Proxy connection responses are sent on the event channel to avoid * blocking the proxy daemon (crmd) */ c->flags |= crm_client_flag_ipc_proxied; } if(header->version > PCMK_IPC_VERSION) { crm_err("Filtering incompatible v%d IPC message, we only support versions <= %d", header->version, PCMK_IPC_VERSION); return NULL; } if (header->size_compressed) { int rc = 0; unsigned int size_u = 1 + header->size_uncompressed; uncompressed = calloc(1, size_u); crm_trace("Decompressing message data %u bytes into %u bytes", header->size_compressed, size_u); rc = BZ2_bzBuffToBuffDecompress(uncompressed, &size_u, text, header->size_compressed, 1, 0); text = uncompressed; if (rc != BZ_OK) { crm_err("Decompression failed: %s (%d)", bz2_strerror(rc), rc); free(uncompressed); return NULL; } } CRM_ASSERT(text[header->size_uncompressed - 1] == 0); crm_trace("Received %.200s", text); xml = string2xml(text); free(uncompressed); return xml; } ssize_t crm_ipcs_flush_events(crm_client_t * c); static gboolean crm_ipcs_flush_events_cb(gpointer data) { crm_client_t *c = data; c->event_timer = 0; crm_ipcs_flush_events(c); return FALSE; } ssize_t crm_ipcs_flush_events(crm_client_t * c) { int sent = 0; ssize_t rc = 0; int queue_len = 0; if (c == NULL) { return pcmk_ok; } else if (c->event_timer) { /* There is already a timer, wait until it goes off */ crm_trace("Timer active for %p - %d", c->ipcs, c->event_timer); return pcmk_ok; } queue_len = g_list_length(c->event_queue); while (c->event_queue && sent < 100) { struct crm_ipc_response_header *header = NULL; struct iovec *event = c->event_queue->data; rc = qb_ipcs_event_sendv(c->ipcs, event, 2); if (rc < 0) { break; } sent++; header = event[0].iov_base; if (header->size_compressed) { crm_trace("Event %d to %p[%d] (%lld compressed bytes) sent", header->qb.id, c->ipcs, c->pid, (long long) rc); } else { crm_trace("Event %d to %p[%d] (%lld bytes) sent: %.120s", header->qb.id, c->ipcs, c->pid, (long long) rc, (char *) (event[1].iov_base)); }
c->event_queue = g_list_remove(c->event_queue, event); free(event[0].iov_base); free(event[1].iov_base); free(event); } queue_len -= sent; if (sent > 0 || c->event_queue) { crm_trace("Sent %d events (%d remaining) for %p[%d]: %s (%lld)", sent, queue_len, c->ipcs, c->pid, pcmk_strerror(rc < 0 ? rc : 0), (long long) rc); } if (c->event_queue) { if (queue_len % 100 == 0 && queue_len > 99) { crm_warn("Event queue for %p[%d] has grown to %d", c->ipcs, c->pid, queue_len); } else if (queue_len > 500) { crm_err("Evicting slow client %p[%d]: event queue reached %d entries", c->ipcs, c->pid, queue_len); qb_ipcs_disconnect(c->ipcs); return rc; } c->event_timer = g_timeout_add(1000 + 100 * queue_len, crm_ipcs_flush_events_cb, c); } return rc; } ssize_t crm_ipc_prepare(uint32_t request, xmlNode * message, struct iovec ** result, uint32_t max_send_size) { static unsigned int biggest = 0; struct iovec *iov; unsigned int total = 0; char *compressed = NULL; char *buffer = dump_xml_unformatted(message); struct crm_ipc_response_header *header = calloc(1, sizeof(struct crm_ipc_response_header)); CRM_ASSERT(result != NULL); crm_ipc_init(); if (max_send_size == 0) { max_send_size = ipc_buffer_max; } CRM_LOG_ASSERT(max_send_size != 0); *result = NULL; iov = calloc(2, sizeof(struct iovec)); iov[0].iov_len = hdr_offset; iov[0].iov_base = header; header->version = PCMK_IPC_VERSION; header->size_uncompressed = 1 + strlen(buffer); total = iov[0].iov_len + header->size_uncompressed; if (total < max_send_size) { iov[1].iov_base = buffer; iov[1].iov_len = header->size_uncompressed; } else { unsigned int new_size = 0; if (crm_compress_string (buffer, header->size_uncompressed, max_send_size, &compressed, &new_size)) { header->flags |= crm_ipc_compressed; header->size_compressed = new_size; iov[1].iov_len = header->size_compressed; iov[1].iov_base = compressed; free(buffer); biggest = QB_MAX(header->size_compressed, biggest); } else { ssize_t rc = -EMSGSIZE; crm_log_xml_trace(message, "EMSGSIZE"); 
biggest = QB_MAX(header->size_uncompressed, biggest); crm_err ("Could not compress the message (%u bytes) into less than the configured ipc limit (%u bytes). " "Set PCMK_ipc_buffer to a higher value (%u bytes suggested)", header->size_uncompressed, max_send_size, 4 * biggest); free(compressed); free(buffer); free(header); free(iov); return rc; } } header->qb.size = iov[0].iov_len + iov[1].iov_len; header->qb.id = (int32_t)request; /* Replying to a specific request */ *result = iov; CRM_ASSERT(header->qb.size > 0); return header->qb.size; } ssize_t crm_ipcs_sendv(crm_client_t * c, struct iovec * iov, enum crm_ipc_flags flags) { ssize_t rc; static uint32_t id = 1; struct crm_ipc_response_header *header = iov[0].iov_base; if (c->flags & crm_client_flag_ipc_proxied) { /* _ALL_ replies to proxied connections need to be sent as events */ if (is_not_set(flags, crm_ipc_server_event)) { flags |= crm_ipc_server_event; /* this flag lets us know this was originally meant to be a response. * even though we're sending it over the event channel. 
*/ flags |= crm_ipc_proxied_relay_response; } } header->flags |= flags; if (flags & crm_ipc_server_event) { header->qb.id = id++; /* We don't really use it, but doesn't hurt to set one */ if (flags & crm_ipc_server_free) { crm_trace("Sending the original to %p[%d]", c->ipcs, c->pid); c->event_queue = g_list_append(c->event_queue, iov); } else { struct iovec *iov_copy = calloc(2, sizeof(struct iovec)); crm_trace("Sending a copy to %p[%d]", c->ipcs, c->pid); iov_copy[0].iov_len = iov[0].iov_len; iov_copy[0].iov_base = malloc(iov[0].iov_len); memcpy(iov_copy[0].iov_base, iov[0].iov_base, iov[0].iov_len); iov_copy[1].iov_len = iov[1].iov_len; iov_copy[1].iov_base = malloc(iov[1].iov_len); memcpy(iov_copy[1].iov_base, iov[1].iov_base, iov[1].iov_len); c->event_queue = g_list_append(c->event_queue, iov_copy); } } else { CRM_LOG_ASSERT(header->qb.id != 0); /* Replying to a specific request */ rc = qb_ipcs_response_sendv(c->ipcs, iov, 2); if (rc < header->qb.size) { crm_notice("Response %d to %p[%d] (%u bytes) failed: %s (%d)", header->qb.id, c->ipcs, c->pid, header->qb.size, pcmk_strerror(rc), rc); } else { crm_trace("Response %d sent, %lld bytes to %p[%d]", header->qb.id, (long long) rc, c->ipcs, c->pid); } if (flags & crm_ipc_server_free) { free(iov[0].iov_base); free(iov[1].iov_base); free(iov); } } if (flags & crm_ipc_server_event) { rc = crm_ipcs_flush_events(c); } else { crm_ipcs_flush_events(c); } if (rc == -EPIPE || rc == -ENOTCONN) { crm_trace("Client %p disconnected", c->ipcs); } return rc; } ssize_t crm_ipcs_send(crm_client_t * c, uint32_t request, xmlNode * message, enum crm_ipc_flags flags) { struct iovec *iov = NULL; ssize_t rc = 0; if(c == NULL) { return -EDESTADDRREQ; } crm_ipc_init(); rc = crm_ipc_prepare(request, message, &iov, ipc_buffer_max); if (rc > 0) { rc = crm_ipcs_sendv(c, iov, flags | crm_ipc_server_free); } else { free(iov); crm_notice("Message to %p[%d] failed: %s (%d)", c->ipcs, c->pid, pcmk_strerror(rc), rc); } return rc; } void 
crm_ipcs_send_ack(crm_client_t * c, uint32_t request, uint32_t flags, const char *tag, const char *function, int line) { if (flags & crm_ipc_client_response) { xmlNode *ack = create_xml_node(NULL, tag); crm_trace("Ack'ing msg from %s (%p)", crm_client_name(c), c); c->request_id = 0; crm_xml_add(ack, "function", function); crm_xml_add_int(ack, "line", line); crm_ipcs_send(c, request, ack, flags); free_xml(ack); } } /* Client... */ #define MIN_MSG_SIZE 12336 /* sizeof(struct qb_ipc_connection_response) */ #define MAX_MSG_SIZE 128*1024 /* 128k default */ struct crm_ipc_s { struct pollfd pfd; /* the max size we can send/receive over ipc */ unsigned int max_buf_size; /* Size of the allocated 'buffer' */ unsigned int buf_size; int msg_size; int need_reply; char *buffer; char *name; uint32_t buffer_flags; qb_ipcc_connection_t *ipc; }; static unsigned int pick_ipc_buffer(unsigned int max) { static unsigned int global_max = 0; if (global_max == 0) { const char *env = getenv("PCMK_ipc_buffer"); if (env) { int env_max = crm_parse_int(env, "0"); global_max = (env_max > 0)? QB_MAX(MIN_MSG_SIZE, env_max) : MAX_MSG_SIZE; } else { global_max = MAX_MSG_SIZE; } } return QB_MAX(max, global_max); } crm_ipc_t * crm_ipc_new(const char *name, size_t max_size) { crm_ipc_t *client = NULL; client = calloc(1, sizeof(crm_ipc_t)); client->name = strdup(name); client->buf_size = pick_ipc_buffer(max_size); client->buffer = malloc(client->buf_size); /* Clients initiating connection pick the max buf size */ client->max_buf_size = client->buf_size; client->pfd.fd = -1; client->pfd.events = POLLIN; client->pfd.revents = 0; return client; } /*! 
* \brief Establish an IPC connection to a Pacemaker component * * \param[in] client Connection instance obtained from crm_ipc_new() * * \return TRUE on success, FALSE otherwise (in which case errno will be set) */ bool crm_ipc_connect(crm_ipc_t * client) { client->need_reply = FALSE; client->ipc = qb_ipcc_connect(client->name, client->buf_size); if (client->ipc == NULL) { crm_debug("Could not establish %s connection: %s (%d)", client->name, pcmk_strerror(errno), errno); return FALSE; } client->pfd.fd = crm_ipc_get_fd(client); if (client->pfd.fd < 0) { crm_debug("Could not obtain file descriptor for %s connection: %s (%d)", client->name, pcmk_strerror(errno), errno); return FALSE; } qb_ipcc_context_set(client->ipc, client); #ifdef HAVE_IPCS_GET_BUFFER_SIZE client->max_buf_size = qb_ipcc_get_buffer_size(client->ipc); if (client->max_buf_size > client->buf_size) { free(client->buffer); client->buffer = calloc(1, client->max_buf_size); client->buf_size = client->max_buf_size; } #endif return TRUE; } void crm_ipc_close(crm_ipc_t * client) { if (client) { crm_trace("Disconnecting %s IPC connection %p (%p)", client->name, client, client->ipc); if (client->ipc) { qb_ipcc_connection_t *ipc = client->ipc; client->ipc = NULL; qb_ipcc_disconnect(ipc); } } } void crm_ipc_destroy(crm_ipc_t * client) { if (client) { if (client->ipc && qb_ipcc_is_connected(client->ipc)) { crm_notice("Destroying an active IPC connection to %s", client->name); /* The next line is basically unsafe * * If this connection was attached to mainloop and mainloop is active, * the 'disconnected' callback will end up back here and we'll end * up free'ing the memory twice - something that can still happen * even without this if we destroy a connection and it closes before * we call exit */ /* crm_ipc_close(client); */ } crm_trace("Destroying IPC connection to %s: %p", client->name, client); free(client->buffer); free(client->name); free(client); } } int crm_ipc_get_fd(crm_ipc_t * client) { int fd = 0; if 
(client && client->ipc && (qb_ipcc_fd_get(client->ipc, &fd) == 0)) { return fd; } errno = EINVAL; crm_perror(LOG_ERR, "Could not obtain file IPC descriptor for %s", (client? client->name : "unspecified client")); return -errno; } bool crm_ipc_connected(crm_ipc_t * client) { bool rc = FALSE; if (client == NULL) { crm_trace("No client"); return FALSE; } else if (client->ipc == NULL) { crm_trace("No connection"); return FALSE; } else if (client->pfd.fd < 0) { crm_trace("Bad descriptor"); return FALSE; } rc = qb_ipcc_is_connected(client->ipc); if (rc == FALSE) { client->pfd.fd = -EINVAL; } return rc; } /*! * \brief Check whether an IPC connection is ready to be read * * \param[in] client Connection to check * * \return Positive value if ready to be read, 0 if not ready, -errno on error */ int crm_ipc_ready(crm_ipc_t *client) { int rc; CRM_ASSERT(client != NULL); if (crm_ipc_connected(client) == FALSE) { return -ENOTCONN; } client->pfd.revents = 0; rc = poll(&(client->pfd), 1, 0); return (rc < 0)? -errno : rc; } static int crm_ipc_decompress(crm_ipc_t * client) { struct crm_ipc_response_header *header = (struct crm_ipc_response_header *)(void*)client->buffer; if (header->size_compressed) { int rc = 0; unsigned int size_u = 1 + header->size_uncompressed; /* never let buf size fall below our max size required for ipc reads. */ unsigned int new_buf_size = QB_MAX((hdr_offset + size_u), client->max_buf_size); char *uncompressed = calloc(1, new_buf_size); crm_trace("Decompressing message data %u bytes into %u bytes", header->size_compressed, size_u); rc = BZ2_bzBuffToBuffDecompress(uncompressed + hdr_offset, &size_u, client->buffer + hdr_offset, header->size_compressed, 1, 0); if (rc != BZ_OK) { crm_err("Decompression failed: %s (%d)", bz2_strerror(rc), rc); free(uncompressed); return -EILSEQ; } /* * This assert no longer holds true. For an identical msg, some clients may * require compression, and others may not. 
If that same msg (event) is sent * to multiple clients, it could result in some clients receiving a compressed * msg even though compression was not explicitly required for them. * * CRM_ASSERT((header->size_uncompressed + hdr_offset) >= ipc_buffer_max); */ CRM_ASSERT(size_u == header->size_uncompressed); memcpy(uncompressed, client->buffer, hdr_offset); /* Preserve the header */ header = (struct crm_ipc_response_header *)(void*)uncompressed; free(client->buffer); client->buf_size = new_buf_size; client->buffer = uncompressed; } CRM_ASSERT(client->buffer[hdr_offset + header->size_uncompressed - 1] == 0); return pcmk_ok; } long crm_ipc_read(crm_ipc_t * client) { struct crm_ipc_response_header *header = NULL; CRM_ASSERT(client != NULL); CRM_ASSERT(client->ipc != NULL); CRM_ASSERT(client->buffer != NULL); crm_ipc_init(); client->buffer[0] = 0; client->msg_size = qb_ipcc_event_recv(client->ipc, client->buffer, client->buf_size - 1, 0); if (client->msg_size >= 0) { int rc = crm_ipc_decompress(client); if (rc != pcmk_ok) { return rc; } header = (struct crm_ipc_response_header *)(void*)client->buffer; if(header->version > PCMK_IPC_VERSION) { crm_err("Filtering incompatible v%d IPC message, we only support versions <= %d", header->version, PCMK_IPC_VERSION); return -EBADMSG; } crm_trace("Received %s event %d, size=%u, rc=%d, text: %.100s", client->name, header->qb.id, header->qb.size, client->msg_size, client->buffer + hdr_offset); } else { crm_trace("No message from %s received: %s", client->name, pcmk_strerror(client->msg_size)); } if (crm_ipc_connected(client) == FALSE || client->msg_size == -ENOTCONN) { crm_err("Connection to %s failed", client->name); } if (header) { /* Data excluding the header */ return header->size_uncompressed; } return -ENOMSG; } const char * crm_ipc_buffer(crm_ipc_t * client) { CRM_ASSERT(client != NULL); return client->buffer + sizeof(struct crm_ipc_response_header); } uint32_t crm_ipc_buffer_flags(crm_ipc_t * client) { struct 
crm_ipc_response_header *header = NULL; CRM_ASSERT(client != NULL); if (client->buffer == NULL) { return 0; } header = (struct crm_ipc_response_header *)(void*)client->buffer; return header->flags; } const char * crm_ipc_name(crm_ipc_t * client) { CRM_ASSERT(client != NULL); return client->name; } static int internal_ipc_send_recv(crm_ipc_t * client, const void *iov) { int rc = 0; do { rc = qb_ipcc_sendv_recv(client->ipc, iov, 2, client->buffer, client->buf_size, -1); } while (rc == -EAGAIN && crm_ipc_connected(client)); return rc; } static int internal_ipc_send_request(crm_ipc_t * client, const void *iov, int ms_timeout) { int rc = 0; time_t timeout = time(NULL) + 1 + (ms_timeout / 1000); do { rc = qb_ipcc_sendv(client->ipc, iov, 2); } while (rc == -EAGAIN && time(NULL) < timeout && crm_ipc_connected(client)); return rc; } static int internal_ipc_get_reply(crm_ipc_t * client, int request_id, int ms_timeout) { time_t timeout = time(NULL) + 1 + (ms_timeout / 1000); int rc = 0; crm_ipc_init(); /* get the reply */ crm_trace("client %s waiting on reply to msg id %d", client->name, request_id); do { rc = qb_ipcc_recv(client->ipc, client->buffer, client->buf_size, 1000); if (rc > 0) { struct crm_ipc_response_header *hdr = NULL; int rc = crm_ipc_decompress(client); if (rc != pcmk_ok) { return rc; } hdr = (struct crm_ipc_response_header *)(void*)client->buffer; if (hdr->qb.id == request_id) { /* Got it */ break; } else if (hdr->qb.id < request_id) { xmlNode *bad = string2xml(crm_ipc_buffer(client)); crm_err("Discarding old reply %d (need %d)", hdr->qb.id, request_id); crm_log_xml_notice(bad, "OldIpcReply"); } else { xmlNode *bad = string2xml(crm_ipc_buffer(client)); crm_err("Discarding newer reply %d (need %d)", hdr->qb.id, request_id); crm_log_xml_notice(bad, "ImpossibleReply"); CRM_ASSERT(hdr->qb.id <= request_id); } } else if (crm_ipc_connected(client) == FALSE) { crm_err("Server disconnected client %s while waiting for msg id %d", client->name, request_id); break; } } 
while (time(NULL) < timeout); return rc; } int crm_ipc_send(crm_ipc_t * client, xmlNode * message, enum crm_ipc_flags flags, int32_t ms_timeout, xmlNode ** reply) { long rc = 0; struct iovec *iov; static uint32_t id = 0; static int factor = 8; struct crm_ipc_response_header *header; crm_ipc_init(); if (client == NULL) { crm_notice("Invalid connection"); return -ENOTCONN; } else if (crm_ipc_connected(client) == FALSE) { /* Don't even bother */ crm_notice("Connection to %s closed", client->name); return -ENOTCONN; } if (ms_timeout == 0) { ms_timeout = 5000; } if (client->need_reply) { crm_trace("Trying again to obtain pending reply from %s", client->name); rc = qb_ipcc_recv(client->ipc, client->buffer, client->buf_size, ms_timeout); if (rc < 0) { crm_warn("Sending to %s (%p) is disabled until pending reply is received", client->name, client->ipc); return -EALREADY; } else { crm_notice("Lost reply from %s (%p) finally arrived, sending re-enabled", client->name, client->ipc); client->need_reply = FALSE; } } id++; CRM_LOG_ASSERT(id != 0); /* Crude wrap-around detection */ rc = crm_ipc_prepare(id, message, &iov, client->max_buf_size); if(rc < 0) { return rc; } header = iov[0].iov_base; header->flags |= flags; if(is_set(flags, crm_ipc_proxied)) { /* Don't look for a synchronous response */ clear_bit(flags, crm_ipc_client_response); } if(header->size_compressed) { if(factor < 10 && (client->max_buf_size / 10) < (rc / factor)) { crm_notice("Compressed message exceeds %d0%% of the configured ipc limit (%u bytes), " "consider setting PCMK_ipc_buffer to %u or higher", factor, client->max_buf_size, 2 * client->max_buf_size); factor++; } } crm_trace("Sending from client: %s request id: %d bytes: %u timeout:%d msg...", client->name, header->qb.id, header->qb.size, ms_timeout); if (ms_timeout > 0 || is_not_set(flags, crm_ipc_client_response)) { rc = internal_ipc_send_request(client, iov, ms_timeout); if (rc <= 0) { crm_trace("Failed to send from client %s request %d with %u 
bytes...", client->name, header->qb.id, header->qb.size); goto send_cleanup; } else if (is_not_set(flags, crm_ipc_client_response)) { crm_trace("Message sent, not waiting for reply to %d from %s to %u bytes...", header->qb.id, client->name, header->qb.size); goto send_cleanup; } rc = internal_ipc_get_reply(client, header->qb.id, ms_timeout); if (rc < 0) { /* No reply, for now, disable sending * * The alternative is to close the connection since we don't know * how to detect and discard out-of-sequence replies * * TODO - implement the above */ client->need_reply = TRUE; } } else { rc = internal_ipc_send_recv(client, iov); } if (rc > 0) { struct crm_ipc_response_header *hdr = (struct crm_ipc_response_header *)(void*)client->buffer; crm_trace("Received response %d, size=%u, rc=%ld, text: %.200s", hdr->qb.id, hdr->qb.size, rc, crm_ipc_buffer(client)); if (reply) { *reply = string2xml(crm_ipc_buffer(client)); } } else { crm_trace("Response not received: rc=%ld, errno=%d", rc, errno); } send_cleanup: if (crm_ipc_connected(client) == FALSE) { crm_notice("Connection to %s closed: %s (%ld)", client->name, pcmk_strerror(rc), rc); } else if (rc == -ETIMEDOUT) { crm_warn("Request %d to %s (%p) failed: %s (%ld) after %dms", header->qb.id, client->name, client->ipc, pcmk_strerror(rc), rc, ms_timeout); crm_write_blackbox(0, NULL); } else if (rc <= 0) { crm_warn("Request %d to %s (%p) failed: %s (%ld)", header->qb.id, client->name, client->ipc, pcmk_strerror(rc), rc); } free(header); free(iov[1].iov_base); free(iov); return rc; } /* Utils */ xmlNode * create_hello_message(const char *uuid, const char *client_name, const char *major_version, const char *minor_version) { xmlNode *hello_node = NULL; xmlNode *hello = NULL; if (uuid == NULL || strlen(uuid) == 0 || client_name == NULL || strlen(client_name) == 0 || major_version == NULL || strlen(major_version) == 0 || minor_version == NULL || strlen(minor_version) == 0) { crm_err("Missing fields, Hello message will not be valid."); 
return NULL; } hello_node = create_xml_node(NULL, XML_TAG_OPTIONS); crm_xml_add(hello_node, "major_version", major_version); crm_xml_add(hello_node, "minor_version", minor_version); crm_xml_add(hello_node, "client_name", client_name); crm_xml_add(hello_node, "client_uuid", uuid); crm_trace("creating hello message"); hello = create_request(CRM_OP_HELLO, hello_node, NULL, NULL, client_name, uuid); free_xml(hello_node); return hello; } diff --git a/lib/fencing/Makefile.am b/lib/fencing/Makefile.am index dc157995de..c7756a00db 100644 --- a/lib/fencing/Makefile.am +++ b/lib/fencing/Makefile.am @@ -1,29 +1,29 @@ # File: Makefile.am # Author: Sun Jiang Dong # Copyright (c) 2004 International Business Machines # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation; either version 2 # of the License, or (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
# include $(top_srcdir)/Makefile.common lib_LTLIBRARIES = libstonithd.la -libstonithd_la_LDFLAGS = -version-info 4:1:2 +libstonithd_la_LDFLAGS = -version-info 4:2:2 libstonithd_la_CFLAGS = $(CFLAGS_HARDENED_LIB) libstonithd_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB) libstonithd_la_LIBADD = $(top_builddir)/lib/common/libcrmcommon.la libstonithd_la_SOURCES = st_client.c diff --git a/lib/lrmd/Makefile.am b/lib/lrmd/Makefile.am index 611675e814..44b938df82 100644 --- a/lib/lrmd/Makefile.am +++ b/lib/lrmd/Makefile.am @@ -1,29 +1,29 @@ # Copyright (c) 2012 David Vossel # # This library is free software; you can redistribute it and/or # modify it under the terms of the GNU Lesser General Public # License as published by the Free Software Foundation; either # version 2.1 of the License, or (at your option) any later version. # # This library is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU # Lesser General Public License for more details. 
# # You should have received a copy of the GNU Lesser General Public # License along with this library; if not, write to the Free Software # Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA # include $(top_srcdir)/Makefile.common lib_LTLIBRARIES = liblrmd.la -liblrmd_la_LDFLAGS = -version-info 4:0:3 +liblrmd_la_LDFLAGS = -version-info 4:1:3 liblrmd_la_CFLAGS = $(CFLAGS_HARDENED_LIB) liblrmd_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB) liblrmd_la_LIBADD = $(top_builddir)/lib/common/libcrmcommon.la \ $(top_builddir)/lib/services/libcrmservice.la \ $(top_builddir)/lib/fencing/libstonithd.la liblrmd_la_SOURCES = lrmd_client.c proxy_common.c diff --git a/lib/pengine/Makefile.am b/lib/pengine/Makefile.am index ad5c5c3f0f..a4f3558b44 100644 --- a/lib/pengine/Makefile.am +++ b/lib/pengine/Makefile.am @@ -1,44 +1,44 @@ # # Copyright (C) 2004 Andrew Beekhof # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation; either version 2 # of the License, or (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. 
# include $(top_srcdir)/Makefile.common ## libraries lib_LTLIBRARIES = libpe_rules.la libpe_status.la ## SOURCES noinst_HEADERS = unpack.h variant.h -libpe_rules_la_LDFLAGS = -version-info 2:6:0 +libpe_rules_la_LDFLAGS = -version-info 3:0:1 libpe_rules_la_CFLAGS = $(CFLAGS_HARDENED_LIB) libpe_rules_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB) libpe_rules_la_LIBADD = $(top_builddir)/lib/common/libcrmcommon.la libpe_rules_la_SOURCES = rules.c common.c -libpe_status_la_LDFLAGS = -version-info 11:0:1 +libpe_status_la_LDFLAGS = -version-info 12:0:2 libpe_status_la_CFLAGS = $(CFLAGS_HARDENED_LIB) libpe_status_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB) libpe_status_la_LIBADD = @CURSESLIBS@ $(top_builddir)/lib/common/libcrmcommon.la libpe_status_la_SOURCES = status.c unpack.c utils.c complex.c native.c \ group.c clone.c rules.c common.c remote.c clean-generic: rm -f *.log *.debug *~ diff --git a/lib/services/Makefile.am b/lib/services/Makefile.am index 8186dc449c..ff4c0a0140 100644 --- a/lib/services/Makefile.am +++ b/lib/services/Makefile.am @@ -1,44 +1,44 @@ # Copyright (c) 2012 David Vossel # # This library is free software; you can redistribute it and/or # modify it under the terms of the GNU Lesser General Public # License as published by the Free Software Foundation; either # version 2.1 of the License, or (at your option) any later version. # # This library is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU # Lesser General Public License for more details. 
# # You should have received a copy of the GNU Lesser General Public # License along with this library; if not, write to the Free Software # Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA # # MAINTAINERCLEANFILES = Makefile.in AM_CPPFLAGS = -I$(top_builddir)/include lib_LTLIBRARIES = libcrmservice.la noinst_HEADERS = pcmk-dbus.h upstart.h systemd.h services_private.h -libcrmservice_la_LDFLAGS = -version-info 4:1:1 +libcrmservice_la_LDFLAGS = -version-info 4:2:1 libcrmservice_la_CPPFLAGS = -DOCF_ROOT_DIR=\"@OCF_ROOT_DIR@\" $(AM_CPPFLAGS) libcrmservice_la_CFLAGS = $(GIO_CFLAGS) libcrmservice_la_CFLAGS += $(CFLAGS_HARDENED_LIB) libcrmservice_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB) libcrmservice_la_LIBADD = $(GIO_LIBS) $(top_builddir)/lib/common/libcrmcommon.la $(DBUS_LIBS) libcrmservice_la_SOURCES = services.c services_linux.c if BUILD_DBUS libcrmservice_la_SOURCES += dbus.c endif if BUILD_UPSTART libcrmservice_la_SOURCES += upstart.c endif if BUILD_SYSTEMD libcrmservice_la_SOURCES += systemd.c endif diff --git a/lib/services/services.c b/lib/services/services.c index a3c99a65eb..4be425c8ee 100644 --- a/lib/services/services.c +++ b/lib/services/services.c @@ -1,861 +1,850 @@ /* - * Copyright (C) 2010 Andrew Beekhof + * Copyright (C) 2010-2016 Andrew Beekhof * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public - * License as published by the Free Software Foundation; either - * version 2.1 of the License, or (at your option) any later version. - * - * This software is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. 
- * - * You should have received a copy of the GNU General Public - * License along with this library; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * This source code is licensed under the GNU Lesser General Public License + * version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY. */ #include #ifndef _GNU_SOURCE # define _GNU_SOURCE #endif #include #include #include #include #include #include #include #include #include #include #include #include "services_private.h" #if SUPPORT_UPSTART # include #endif #if SUPPORT_SYSTEMD # include #endif /* TODO: Develop a rollover strategy */ static int operations = 0; GHashTable *recurring_actions = NULL; /* ops waiting to run async because of conflicting active * pending ops*/ GList *blocked_ops = NULL; /* ops currently active (in-flight) */ GList *inflight_ops = NULL; svc_action_t * services_action_create(const char *name, const char *action, int interval, int timeout) { return resources_action_create(name, "lsb", NULL, name, action, interval, timeout, NULL, 0); } const char * resources_find_service_class(const char *agent) { /* Priority is: * - lsb * - systemd * - upstart */ int rc = 0; struct stat st; char *path = NULL; #ifdef LSB_ROOT_DIR rc = asprintf(&path, "%s/%s", LSB_ROOT_DIR, agent); if (rc > 0 && stat(path, &st) == 0) { free(path); return "lsb"; } free(path); #endif #if SUPPORT_SYSTEMD if (systemd_unit_exists(agent)) { return "systemd"; } #endif #if SUPPORT_UPSTART if (upstart_job_exists(agent)) { return "upstart"; } #endif return NULL; } svc_action_t * resources_action_create(const char *name, const char *standard, const char *provider, const char *agent, const char *action, int interval, int timeout, GHashTable * params, enum svc_action_flags flags) { svc_action_t *op = NULL; /* * Do some up front sanity checks before we go off and * build the svc_action_t instance. 
*/ if (crm_strlen_zero(name)) { crm_err("A service or resource action must have a name."); goto return_error; } if (crm_strlen_zero(standard)) { crm_err("A service action must have a valid standard."); goto return_error; } if (!strcasecmp(standard, "ocf") && crm_strlen_zero(provider)) { crm_err("An OCF resource action must have a provider."); goto return_error; } if (crm_strlen_zero(agent)) { crm_err("A service or resource action must have an agent."); goto return_error; } if (crm_strlen_zero(action)) { crm_err("A service or resource action must specify an action."); goto return_error; } if (safe_str_eq(action, "monitor") && ( #if SUPPORT_HEARTBEAT safe_str_eq(standard, "heartbeat") || #endif safe_str_eq(standard, "lsb") || safe_str_eq(standard, "service"))) { action = "status"; } /* * Sanity checks passed, proceed! */ op = calloc(1, sizeof(svc_action_t)); op->opaque = calloc(1, sizeof(svc_action_private_t)); op->rsc = strdup(name); op->action = strdup(action); op->interval = interval; op->timeout = timeout; op->standard = strdup(standard); op->agent = strdup(agent); op->sequence = ++operations; op->flags = flags; if (asprintf(&op->id, "%s_%s_%d", name, action, interval) == -1) { goto return_error; } if (strcasecmp(op->standard, "service") == 0) { const char *expanded = resources_find_service_class(op->agent); if(expanded) { crm_debug("Found a %s agent for %s/%s", expanded, op->rsc, op->agent); free(op->standard); op->standard = strdup(expanded); } else { crm_info("Cannot determine the standard for %s (%s)", op->rsc, op->agent); free(op->standard); op->standard = strdup("lsb"); } CRM_ASSERT(op->standard); } if (strcasecmp(op->standard, "ocf") == 0) { op->provider = strdup(provider); op->params = params; params = NULL; if (asprintf(&op->opaque->exec, "%s/resource.d/%s/%s", OCF_ROOT_DIR, provider, agent) == -1) { crm_err("Internal error: cannot create agent path"); goto return_error; } op->opaque->args[0] = strdup(op->opaque->exec); op->opaque->args[1] = 
strdup(action); } else if (strcasecmp(op->standard, "lsb") == 0) { if (op->agent[0] == '/') { /* if given an absolute path, use that instead * of tacking on the LSB_ROOT_DIR path to the front */ op->opaque->exec = strdup(op->agent); } else if (asprintf(&op->opaque->exec, "%s/%s", LSB_ROOT_DIR, op->agent) == -1) { crm_err("Internal error: cannot create agent path"); goto return_error; } op->opaque->args[0] = strdup(op->opaque->exec); op->opaque->args[1] = strdup(op->action); op->opaque->args[2] = NULL; #if SUPPORT_HEARTBEAT } else if (strcasecmp(op->standard, "heartbeat") == 0) { int index; int param_num; char buf_tmp[20]; void *value_tmp; if (op->agent[0] == '/') { /* if given an absolute path, use that instead * of tacking on the HB_RA_DIR path to the front */ op->opaque->exec = strdup(op->agent); } else if (asprintf(&op->opaque->exec, "%s/%s", HB_RA_DIR, op->agent) == -1) { crm_err("Internal error: cannot create agent path"); goto return_error; } op->opaque->args[0] = strdup(op->opaque->exec); /* The "heartbeat" agent class only has positional arguments, * which we keyed by their decimal position number. */ param_num = 1; for (index = 1; index <= MAX_ARGC - 3; index++ ) { snprintf(buf_tmp, sizeof(buf_tmp), "%d", index); value_tmp = g_hash_table_lookup(params, buf_tmp); if (value_tmp == NULL) { /* maybe: strdup("") ?? * But the old lrmd did simply continue as well. 
*/ continue; } op->opaque->args[param_num++] = strdup(value_tmp); } /* Add operation code as the last argument, */ /* and the teminating NULL pointer */ op->opaque->args[param_num++] = strdup(op->action); op->opaque->args[param_num] = NULL; #endif #if SUPPORT_SYSTEMD } else if (strcasecmp(op->standard, "systemd") == 0) { op->opaque->exec = strdup("systemd-dbus"); #endif #if SUPPORT_UPSTART } else if (strcasecmp(op->standard, "upstart") == 0) { op->opaque->exec = strdup("upstart-dbus"); #endif } else if (strcasecmp(op->standard, "service") == 0) { op->opaque->exec = strdup(SERVICE_SCRIPT); op->opaque->args[0] = strdup(SERVICE_SCRIPT); op->opaque->args[1] = strdup(agent); op->opaque->args[2] = strdup(action); #if SUPPORT_NAGIOS } else if (strcasecmp(op->standard, "nagios") == 0) { int index = 0; if (op->agent[0] == '/') { /* if given an absolute path, use that instead * of tacking on the NAGIOS_PLUGIN_DIR path to the front */ op->opaque->exec = strdup(op->agent); } else if (asprintf(&op->opaque->exec, "%s/%s", NAGIOS_PLUGIN_DIR, op->agent) == -1) { crm_err("Internal error: cannot create agent path"); goto return_error; } op->opaque->args[0] = strdup(op->opaque->exec); index = 1; if (safe_str_eq(op->action, "monitor") && op->interval == 0) { /* Invoke --version for a nagios probe */ op->opaque->args[index] = strdup("--version"); index++; } else if (params) { GHashTableIter iter; char *key = NULL; char *value = NULL; static int args_size = sizeof(op->opaque->args) / sizeof(char *); g_hash_table_iter_init(&iter, params); while (g_hash_table_iter_next(&iter, (gpointer *) & key, (gpointer *) & value) && index <= args_size - 3) { int len = 3; char *long_opt = NULL; if (safe_str_eq(key, XML_ATTR_CRM_VERSION) || strstr(key, CRM_META "_")) { continue; } len += strlen(key); long_opt = calloc(1, len); sprintf(long_opt, "--%s", key); long_opt[len - 1] = 0; op->opaque->args[index] = long_opt; op->opaque->args[index + 1] = strdup(value); index += 2; } } op->opaque->args[index] = 
NULL; #endif } else { crm_err("Unknown resource standard: %s", op->standard); services_action_free(op); op = NULL; } if(params) { g_hash_table_destroy(params); } return op; return_error: if(params) { g_hash_table_destroy(params); } services_action_free(op); return NULL; } svc_action_t * services_action_create_generic(const char *exec, const char *args[]) { svc_action_t *op; unsigned int cur_arg; op = calloc(1, sizeof(*op)); op->opaque = calloc(1, sizeof(svc_action_private_t)); op->opaque->exec = strdup(exec); op->opaque->args[0] = strdup(exec); for (cur_arg = 1; args && args[cur_arg - 1]; cur_arg++) { op->opaque->args[cur_arg] = strdup(args[cur_arg - 1]); if (cur_arg == DIMOF(op->opaque->args) - 1) { crm_err("svc_action_t args list not long enough for '%s' execution request.", exec); break; } } return op; } #if SUPPORT_DBUS /*! * \internal * \brief Update operation's pending DBus call, unreferencing old one if needed * * \param[in,out] op Operation to modify * \param[in] pending Pending call to set */ void services_set_op_pending(svc_action_t *op, DBusPendingCall *pending) { if (op->opaque->pending && (op->opaque->pending != pending)) { if (pending) { crm_info("Lost pending %s DBus call (%p)", op->id, op->opaque->pending); } else { crm_trace("Done with pending %s DBus call (%p)", op->id, op->opaque->pending); } dbus_pending_call_unref(op->opaque->pending); } op->opaque->pending = pending; if (pending) { crm_trace("Updated pending %s DBus call (%p)", op->id, pending); } else { crm_trace("Cleared pending %s DBus call", op->id); } } #endif void services_action_cleanup(svc_action_t * op) { if(op->opaque == NULL) { return; } #if SUPPORT_DBUS if(op->opaque->timerid != 0) { crm_trace("Removing timer for call %s to %s", op->action, op->rsc); g_source_remove(op->opaque->timerid); op->opaque->timerid = 0; } if(op->opaque->pending) { crm_trace("Cleaning up pending dbus call %p %s for %s", op->opaque->pending, op->action, op->rsc); 
if(dbus_pending_call_get_completed(op->opaque->pending)) { crm_warn("Pending dbus call %s for %s did not complete", op->action, op->rsc); } dbus_pending_call_cancel(op->opaque->pending); dbus_pending_call_unref(op->opaque->pending); op->opaque->pending = NULL; } #endif if (op->opaque->stderr_gsource) { mainloop_del_fd(op->opaque->stderr_gsource); op->opaque->stderr_gsource = NULL; } if (op->opaque->stdout_gsource) { mainloop_del_fd(op->opaque->stdout_gsource); op->opaque->stdout_gsource = NULL; } } void services_action_free(svc_action_t * op) { unsigned int i; if (op == NULL) { return; } services_action_cleanup(op); if (op->opaque->repeat_timer) { g_source_remove(op->opaque->repeat_timer); op->opaque->repeat_timer = 0; } free(op->id); free(op->opaque->exec); for (i = 0; i < DIMOF(op->opaque->args); i++) { free(op->opaque->args[i]); } free(op->opaque); free(op->rsc); free(op->action); free(op->standard); free(op->agent); free(op->provider); free(op->stdout_data); free(op->stderr_data); if (op->params) { g_hash_table_destroy(op->params); op->params = NULL; } free(op); } gboolean cancel_recurring_action(svc_action_t * op) { crm_info("Cancelling %s operation %s", op->standard, op->id); if (recurring_actions) { g_hash_table_remove(recurring_actions, op->id); } if (op->opaque->repeat_timer) { g_source_remove(op->opaque->repeat_timer); op->opaque->repeat_timer = 0; } return TRUE; } gboolean services_action_cancel(const char *name, const char *action, int interval) { svc_action_t *op = NULL; char id[512]; snprintf(id, sizeof(id), "%s_%s_%d", name, action, interval); if (!(op = g_hash_table_lookup(recurring_actions, id))) { return FALSE; } /* Always kill the recurring timer */ cancel_recurring_action(op); if (op->pid == 0) { op->status = PCMK_LRM_OP_CANCELLED; if (op->opaque->callback) { op->opaque->callback(op); } blocked_ops = g_list_remove(blocked_ops, op); services_action_free(op); } else { crm_info("Cancelling in-flight op: performing early termination of %s (pid=%d)", 
id, op->pid); op->cancel = 1; if (mainloop_child_kill(op->pid) == FALSE) { /* even though the early termination failed, * the op will be marked as cancelled once it completes. */ crm_err("Termination of %s (pid=%d) failed", id, op->pid); return FALSE; } } return TRUE; } gboolean services_action_kick(const char *name, const char *action, int interval /* ms */) { svc_action_t * op = NULL; char *id = NULL; if (asprintf(&id, "%s_%s_%d", name, action, interval) == -1) { return FALSE; } op = g_hash_table_lookup(recurring_actions, id); free(id); if (op == NULL) { return FALSE; } if (op->pid) { return TRUE; } else { if (op->opaque->repeat_timer) { g_source_remove(op->opaque->repeat_timer); op->opaque->repeat_timer = 0; } recurring_action_timer(op); return TRUE; } } /* add new recurring operation, check for duplicates. * - if duplicate found, return TRUE, immediately reschedule op. * - if no dup, return FALSE, inserve into recurring op list.*/ static gboolean handle_duplicate_recurring(svc_action_t * op, void (*action_callback) (svc_action_t *)) { svc_action_t * dup = NULL; if (recurring_actions == NULL) { recurring_actions = g_hash_table_new_full(g_str_hash, g_str_equal, NULL, NULL); return FALSE; } /* check for duplicates */ dup = g_hash_table_lookup(recurring_actions, op->id); if (dup && (dup != op)) { /* update user data */ if (op->opaque->callback) { dup->opaque->callback = op->opaque->callback; dup->cb_data = op->cb_data; op->cb_data = NULL; } /* immediately execute the next interval */ if (dup->pid != 0) { if (op->opaque->repeat_timer) { g_source_remove(op->opaque->repeat_timer); op->opaque->repeat_timer = 0; } recurring_action_timer(dup); } /* free the dup. 
*/ services_action_free(op); return TRUE; } return FALSE; } static gboolean action_async_helper(svc_action_t * op) { if (op->standard && strcasecmp(op->standard, "upstart") == 0) { #if SUPPORT_UPSTART return upstart_job_exec(op, FALSE); #endif } else if (op->standard && strcasecmp(op->standard, "systemd") == 0) { #if SUPPORT_SYSTEMD return systemd_unit_exec(op); #endif } else { return services_os_action_execute(op, FALSE); } /* The 'op' has probably been freed if the execution functions return TRUE. */ /* Avoid using the 'op' in here. */ return FALSE; } void services_add_inflight_op(svc_action_t * op) { if (op == NULL) { return; } CRM_ASSERT(op->synchronous == FALSE); /* keep track of ops that are in-flight to avoid collisions in the same namespace */ if (op->rsc) { inflight_ops = g_list_append(inflight_ops, op); } } gboolean services_action_async(svc_action_t * op, void (*action_callback) (svc_action_t *)) { op->synchronous = false; if (action_callback) { op->opaque->callback = action_callback; } if (op->interval > 0) { if (handle_duplicate_recurring(op, action_callback) == TRUE) { /* entry rescheduled, dup freed */ /* exit early */ return TRUE; } g_hash_table_replace(recurring_actions, op->id, op); } if (op->rsc && is_op_blocked(op->rsc)) { blocked_ops = g_list_append(blocked_ops, op); return TRUE; } return action_async_helper(op); } static gboolean processing_blocked_ops = FALSE; gboolean is_op_blocked(const char *rsc) { GList *gIter = NULL; svc_action_t *op = NULL; for (gIter = inflight_ops; gIter != NULL; gIter = gIter->next) { op = gIter->data; if (safe_str_eq(op->rsc, rsc)) { return TRUE; } } return FALSE; } void handle_blocked_ops(void) { GList *executed_ops = NULL; GList *gIter = NULL; svc_action_t *op = NULL; gboolean res = FALSE; if (processing_blocked_ops) { /* avoid nested calling of this function */ return; } processing_blocked_ops = TRUE; /* n^2 operation here, but blocked ops are incredibly rare. this list * will be empty 99% of the time. 
*/ for (gIter = blocked_ops; gIter != NULL; gIter = gIter->next) { op = gIter->data; if (is_op_blocked(op->rsc)) { continue; } executed_ops = g_list_append(executed_ops, op); res = action_async_helper(op); if (res == FALSE) { op->status = PCMK_LRM_OP_ERROR; /* this can cause this function to be called recursively * which is why we have processing_blocked_ops static variable */ operation_finalize(op); } } for (gIter = executed_ops; gIter != NULL; gIter = gIter->next) { op = gIter->data; blocked_ops = g_list_remove(blocked_ops, op); } g_list_free(executed_ops); processing_blocked_ops = FALSE; } gboolean services_action_sync(svc_action_t * op) { gboolean rc = TRUE; if (op == NULL) { crm_trace("No operation to execute"); return FALSE; } op->synchronous = true; if (op->standard && strcasecmp(op->standard, "upstart") == 0) { #if SUPPORT_UPSTART rc = upstart_job_exec(op, TRUE); #endif } else if (op->standard && strcasecmp(op->standard, "systemd") == 0) { #if SUPPORT_SYSTEMD rc = systemd_unit_exec(op); #endif } else { rc = services_os_action_execute(op, TRUE); } crm_trace(" > %s_%s_%d: %s = %d", op->rsc, op->action, op->interval, op->opaque->exec, op->rc); if (op->stdout_data) { crm_trace(" > stdout: %s", op->stdout_data); } if (op->stderr_data) { crm_trace(" > stderr: %s", op->stderr_data); } return rc; } GList * get_directory_list(const char *root, gboolean files, gboolean executable) { return services_os_get_directory_list(root, files, executable); } GList * services_list(void) { return resources_list_agents("lsb", NULL); } #if SUPPORT_HEARTBEAT static GList * resources_os_list_hb_agents(void) { return services_os_get_directory_list(HB_RA_DIR, TRUE, TRUE); } #endif GList * resources_list_standards(void) { GList *standards = NULL; GList *agents = NULL; standards = g_list_append(standards, strdup("ocf")); standards = g_list_append(standards, strdup("lsb")); standards = g_list_append(standards, strdup("service")); #if SUPPORT_SYSTEMD agents = systemd_unit_listall(); if 
(agents) { standards = g_list_append(standards, strdup("systemd")); g_list_free_full(agents, free); } #endif #if SUPPORT_UPSTART agents = upstart_job_listall(); if (agents) { standards = g_list_append(standards, strdup("upstart")); g_list_free_full(agents, free); } #endif #if SUPPORT_NAGIOS agents = resources_os_list_nagios_agents(); if (agents) { standards = g_list_append(standards, strdup("nagios")); g_list_free_full(agents, free); } #endif #if SUPPORT_HEARTBEAT standards = g_list_append(standards, strdup("heartbeat")); #endif return standards; } GList * resources_list_providers(const char *standard) { if (strcasecmp(standard, "ocf") == 0) { return resources_os_list_ocf_providers(); } return NULL; } GList * resources_list_agents(const char *standard, const char *provider) { if (standard == NULL || strcasecmp(standard, "service") == 0) { GList *tmp1; GList *tmp2; GList *result = resources_os_list_lsb_agents(); if (standard == NULL) { tmp1 = result; tmp2 = resources_os_list_ocf_agents(NULL); if (tmp2) { result = g_list_concat(tmp1, tmp2); } } #if SUPPORT_SYSTEMD tmp1 = result; tmp2 = systemd_unit_listall(); if (tmp2) { result = g_list_concat(tmp1, tmp2); } #endif #if SUPPORT_UPSTART tmp1 = result; tmp2 = upstart_job_listall(); if (tmp2) { result = g_list_concat(tmp1, tmp2); } #endif return result; } else if (strcasecmp(standard, "ocf") == 0) { return resources_os_list_ocf_agents(provider); } else if (strcasecmp(standard, "lsb") == 0) { return resources_os_list_lsb_agents(); #if SUPPORT_HEARTBEAT } else if (strcasecmp(standard, "heartbeat") == 0) { return resources_os_list_hb_agents(); #endif #if SUPPORT_SYSTEMD } else if (strcasecmp(standard, "systemd") == 0) { return systemd_unit_listall(); #endif #if SUPPORT_UPSTART } else if (strcasecmp(standard, "upstart") == 0) { return upstart_job_listall(); #endif #if SUPPORT_NAGIOS } else if (strcasecmp(standard, "nagios") == 0) { return resources_os_list_nagios_agents(); #endif } return NULL; } diff --git 
a/lib/services/services_linux.c b/lib/services/services_linux.c index 398a36b602..cd7fd3f213 100644 --- a/lib/services/services_linux.c +++ b/lib/services/services_linux.c @@ -1,913 +1,902 @@ /* - * Copyright (C) 2010 Andrew Beekhof + * Copyright (C) 2010-2016 Andrew Beekhof * - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public - * License as published by the Free Software Foundation; either - * version 2.1 of the License, or (at your option) any later version. - * - * This software is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public - * License along with this library; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * This source code is licensed under the GNU Lesser General Public License + * version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY. 
*/ #include #ifndef _GNU_SOURCE # define _GNU_SOURCE #endif #include #include #include #include #include #include #include #include #include #include #ifdef HAVE_SYS_SIGNALFD_H #include #endif #include "crm/crm.h" #include "crm/common/mainloop.h" #include "crm/services.h" #include "services_private.h" #if SUPPORT_CIBSECRETS # include "crm/common/cib_secrets.h" #endif /* ops currently active (in-flight) */ extern GList *inflight_ops; static inline void set_fd_opts(int fd, int opts) { int flag; if ((flag = fcntl(fd, F_GETFL)) >= 0) { if (fcntl(fd, F_SETFL, flag | opts) < 0) { crm_err("fcntl() write failed"); } } else { crm_err("fcntl() read failed"); } } static gboolean svc_read_output(int fd, svc_action_t * op, bool is_stderr) { char *data = NULL; int rc = 0, len = 0; char buf[500]; static const size_t buf_read_len = sizeof(buf) - 1; if (fd < 0) { crm_trace("No fd for %s", op->id); return FALSE; } if (is_stderr && op->stderr_data) { len = strlen(op->stderr_data); data = op->stderr_data; crm_trace("Reading %s stderr into offset %d", op->id, len); } else if (is_stderr == FALSE && op->stdout_data) { len = strlen(op->stdout_data); data = op->stdout_data; crm_trace("Reading %s stdout into offset %d", op->id, len); } else { crm_trace("Reading %s %s into offset %d", op->id, is_stderr?"stderr":"stdout", len); } do { rc = read(fd, buf, buf_read_len); if (rc > 0) { crm_trace("Got %d chars: %.80s", rc, buf); buf[rc] = 0; data = realloc_safe(data, len + rc + 1); len += sprintf(data + len, "%s", buf); } else if (errno != EINTR) { /* error or EOF * Cleanup happens in pipe_done() */ rc = FALSE; break; } } while (rc == buf_read_len || rc < 0); if (is_stderr) { op->stderr_data = data; } else { op->stdout_data = data; } return rc; } static int dispatch_stdout(gpointer userdata) { svc_action_t *op = (svc_action_t *) userdata; return svc_read_output(op->opaque->stdout_fd, op, FALSE); } static int dispatch_stderr(gpointer userdata) { svc_action_t *op = (svc_action_t *) userdata; return 
svc_read_output(op->opaque->stderr_fd, op, TRUE); } static void pipe_out_done(gpointer user_data) { svc_action_t *op = (svc_action_t *) user_data; crm_trace("%p", op); op->opaque->stdout_gsource = NULL; if (op->opaque->stdout_fd > STDOUT_FILENO) { close(op->opaque->stdout_fd); } op->opaque->stdout_fd = -1; } static void pipe_err_done(gpointer user_data) { svc_action_t *op = (svc_action_t *) user_data; op->opaque->stderr_gsource = NULL; if (op->opaque->stderr_fd > STDERR_FILENO) { close(op->opaque->stderr_fd); } op->opaque->stderr_fd = -1; } static struct mainloop_fd_callbacks stdout_callbacks = { .dispatch = dispatch_stdout, .destroy = pipe_out_done, }; static struct mainloop_fd_callbacks stderr_callbacks = { .dispatch = dispatch_stderr, .destroy = pipe_err_done, }; static void set_ocf_env(const char *key, const char *value, gpointer user_data) { if (setenv(key, value, 1) != 0) { crm_perror(LOG_ERR, "setenv failed for key:%s and value:%s", key, value); } } static void set_ocf_env_with_prefix(gpointer key, gpointer value, gpointer user_data) { char buffer[500]; snprintf(buffer, sizeof(buffer), "OCF_RESKEY_%s", (char *)key); set_ocf_env(buffer, value, user_data); } static void add_OCF_env_vars(svc_action_t * op) { if (!op->standard || strcasecmp("ocf", op->standard) != 0) { return; } if (op->params) { g_hash_table_foreach(op->params, set_ocf_env_with_prefix, NULL); } set_ocf_env("OCF_RA_VERSION_MAJOR", "1", NULL); set_ocf_env("OCF_RA_VERSION_MINOR", "0", NULL); set_ocf_env("OCF_ROOT", OCF_ROOT_DIR, NULL); set_ocf_env("OCF_EXIT_REASON_PREFIX", PCMK_OCF_REASON_PREFIX, NULL); if (op->rsc) { set_ocf_env("OCF_RESOURCE_INSTANCE", op->rsc, NULL); } if (op->agent != NULL) { set_ocf_env("OCF_RESOURCE_TYPE", op->agent, NULL); } /* Notes: this is not added to specification yet. 
Sept 10,2004 */ if (op->provider != NULL) { set_ocf_env("OCF_RESOURCE_PROVIDER", op->provider, NULL); } } gboolean recurring_action_timer(gpointer data) { svc_action_t *op = data; crm_debug("Scheduling another invocation of %s", op->id); /* Clean out the old result */ free(op->stdout_data); op->stdout_data = NULL; free(op->stderr_data); op->stderr_data = NULL; op->opaque->repeat_timer = 0; services_action_async(op, NULL); return FALSE; } /* Returns FALSE if 'op' should be free'd by the caller */ gboolean operation_finalize(svc_action_t * op) { int recurring = 0; if (op->interval) { if (op->cancel) { op->status = PCMK_LRM_OP_CANCELLED; cancel_recurring_action(op); } else { recurring = 1; op->opaque->repeat_timer = g_timeout_add(op->interval, recurring_action_timer, (void *)op); } } if (op->opaque->callback) { op->opaque->callback(op); } op->pid = 0; inflight_ops = g_list_remove(inflight_ops, op); handle_blocked_ops(); if (!recurring && op->synchronous == FALSE) { /* * If this is a recurring action, do not free explicitly. * It will get freed whenever the action gets cancelled. */ services_action_free(op); return TRUE; } services_action_cleanup(op); return FALSE; } static void operation_finished(mainloop_child_t * p, pid_t pid, int core, int signo, int exitcode) { svc_action_t *op = mainloop_child_userdata(p); char *prefix = crm_strdup_printf("%s:%d", op->id, op->pid); mainloop_clear_child_userdata(p); op->status = PCMK_LRM_OP_DONE; CRM_ASSERT(op->pid == pid); crm_trace("%s %p %p", prefix, op->opaque->stderr_gsource, op->opaque->stdout_gsource); if (op->opaque->stderr_gsource) { /* Make sure we have read everything from the buffer. * Depending on the priority mainloop gives the fd, operation_finished * could occur before all the reads are done. 
Force the read now.*/ crm_trace("%s dispatching stderr", prefix); dispatch_stderr(op); crm_trace("%s: %p", op->id, op->stderr_data); mainloop_del_fd(op->opaque->stderr_gsource); op->opaque->stderr_gsource = NULL; } if (op->opaque->stdout_gsource) { /* Make sure we have read everything from the buffer. * Depending on the priority mainloop gives the fd, operation_finished * could occur before all the reads are done. Force the read now.*/ crm_trace("%s dispatching stdout", prefix); dispatch_stdout(op); crm_trace("%s: %p", op->id, op->stdout_data); mainloop_del_fd(op->opaque->stdout_gsource); op->opaque->stdout_gsource = NULL; } if (signo) { if (mainloop_child_timeout(p)) { crm_warn("%s - timed out after %dms", prefix, op->timeout); op->status = PCMK_LRM_OP_TIMEOUT; op->rc = PCMK_OCF_TIMEOUT; } else { do_crm_log_unlikely((op->cancel) ? LOG_INFO : LOG_WARNING, "%s - terminated with signal %d", prefix, signo); op->status = PCMK_LRM_OP_ERROR; op->rc = PCMK_OCF_SIGNAL; } } else { op->rc = exitcode; crm_debug("%s - exited with rc=%d", prefix, exitcode); } free(prefix); prefix = crm_strdup_printf("%s:%d:stderr", op->id, op->pid); crm_log_output(LOG_NOTICE, prefix, op->stderr_data); free(prefix); prefix = crm_strdup_printf("%s:%d:stdout", op->id, op->pid); crm_log_output(LOG_DEBUG, prefix, op->stdout_data); free(prefix); operation_finalize(op); } /*! 
* \internal * \brief Set operation rc and status per errno from stat(), fork() or execvp() * * \param[in,out] op Operation to set rc and status for * \param[in] error Value of errno after system call * * \return void */ static void services_handle_exec_error(svc_action_t * op, int error) { int rc_not_installed, rc_insufficient_priv, rc_exec_error; /* Mimic the return codes for each standard as that's what we'll convert back from in get_uniform_rc() */ if (safe_str_eq(op->standard, "lsb") && safe_str_eq(op->action, "status")) { rc_not_installed = PCMK_LSB_STATUS_NOT_INSTALLED; rc_insufficient_priv = PCMK_LSB_STATUS_INSUFFICIENT_PRIV; rc_exec_error = PCMK_LSB_STATUS_UNKNOWN; #if SUPPORT_NAGIOS } else if (safe_str_eq(op->standard, "nagios")) { rc_not_installed = NAGIOS_NOT_INSTALLED; rc_insufficient_priv = NAGIOS_INSUFFICIENT_PRIV; rc_exec_error = PCMK_OCF_EXEC_ERROR; #endif } else { rc_not_installed = PCMK_OCF_NOT_INSTALLED; rc_insufficient_priv = PCMK_OCF_INSUFFICIENT_PRIV; rc_exec_error = PCMK_OCF_EXEC_ERROR; } switch (error) { /* see execve(2), stat(2) and fork(2) */ case ENOENT: /* No such file or directory */ case EISDIR: /* Is a directory */ case ENOTDIR: /* Path component is not a directory */ case EINVAL: /* Invalid executable format */ case ENOEXEC: /* Invalid executable format */ op->rc = rc_not_installed; op->status = PCMK_LRM_OP_NOT_INSTALLED; break; case EACCES: /* permission denied (various errors) */ case EPERM: /* permission denied (various errors) */ op->rc = rc_insufficient_priv; op->status = PCMK_LRM_OP_ERROR; break; default: op->rc = rc_exec_error; op->status = PCMK_LRM_OP_ERROR; } } static void action_launch_child(svc_action_t *op) { int lpc; /* SIGPIPE is ignored (which is different from signal blocking) by the gnutls library. * Depending on the libqb version in use, libqb may set SIGPIPE to be ignored as well. * We do not want this to be inherited by the child process. 
By resetting this the signal * to the default behavior, we avoid some potential odd problems that occur during OCF * scripts when SIGPIPE is ignored by the environment. */ signal(SIGPIPE, SIG_DFL); #if defined(HAVE_SCHED_SETSCHEDULER) if (sched_getscheduler(0) != SCHED_OTHER) { struct sched_param sp; memset(&sp, 0, sizeof(sp)); sp.sched_priority = 0; if (sched_setscheduler(0, SCHED_OTHER, &sp) == -1) { crm_perror(LOG_ERR, "Could not reset scheduling policy to SCHED_OTHER for %s", op->id); } } #endif if (setpriority(PRIO_PROCESS, 0, 0) == -1) { crm_perror(LOG_ERR, "Could not reset process priority to 0 for %s", op->id); } /* Man: The call setpgrp() is equivalent to setpgid(0,0) * _and_ compiles on BSD variants too * need to investigate if it works the same too. */ setpgid(0, 0); /* close all descriptors except stdin/out/err and channels to logd */ for (lpc = getdtablesize() - 1; lpc > STDERR_FILENO; lpc--) { close(lpc); } #if SUPPORT_CIBSECRETS if (replace_secret_params(op->rsc, op->params) < 0) { /* replacing secrets failed! */ if (safe_str_eq(op->action,"stop")) { /* don't fail on stop! 
*/ crm_info("proceeding with the stop operation for %s", op->rsc); } else { crm_err("failed to get secrets for %s, " "considering resource not configured", op->rsc); _exit(PCMK_OCF_NOT_CONFIGURED); } } #endif /* Setup environment correctly */ add_OCF_env_vars(op); /* execute the RA */ execvp(op->opaque->exec, op->opaque->args); /* Most cases should have been already handled by stat() */ services_handle_exec_error(op, errno); _exit(op->rc); } #ifndef HAVE_SYS_SIGNALFD_H static int sigchld_pipe[2] = { -1, -1 }; static void sigchld_handler() { if ((sigchld_pipe[1] >= 0) && (write(sigchld_pipe[1], "", 1) == -1)) { crm_perror(LOG_TRACE, "Could not poke SIGCHLD self-pipe"); } } #endif static void action_synced_wait(svc_action_t * op, sigset_t *mask) { int status = 0; int timeout = op->timeout; int sfd = -1; time_t start = -1; struct pollfd fds[3]; int wait_rc = 0; #ifdef HAVE_SYS_SIGNALFD_H sfd = signalfd(-1, mask, SFD_NONBLOCK); if (sfd < 0) { crm_perror(LOG_ERR, "signalfd() failed"); } #else sfd = sigchld_pipe[0]; #endif fds[0].fd = op->opaque->stdout_fd; fds[0].events = POLLIN; fds[0].revents = 0; fds[1].fd = op->opaque->stderr_fd; fds[1].events = POLLIN; fds[1].revents = 0; fds[2].fd = sfd; fds[2].events = POLLIN; fds[2].revents = 0; crm_trace("Waiting for %d", op->pid); start = time(NULL); do { int poll_rc = poll(fds, 3, timeout); if (poll_rc > 0) { if (fds[0].revents & POLLIN) { svc_read_output(op->opaque->stdout_fd, op, FALSE); } if (fds[1].revents & POLLIN) { svc_read_output(op->opaque->stderr_fd, op, TRUE); } if (fds[2].revents & POLLIN) { #ifdef HAVE_SYS_SIGNALFD_H struct signalfd_siginfo fdsi; ssize_t s; s = read(sfd, &fdsi, sizeof(struct signalfd_siginfo)); if (s != sizeof(struct signalfd_siginfo)) { crm_perror(LOG_ERR, "Read from signal fd %d failed", sfd); } else if (fdsi.ssi_signo == SIGCHLD) { #else if (1) { /* Clear out the sigchld pipe. 
*/ char ch; while (read(sfd, &ch, 1) == 1); #endif wait_rc = waitpid(op->pid, &status, WNOHANG); if (wait_rc < 0){ crm_perror(LOG_ERR, "waitpid() for %d failed", op->pid); } else if (wait_rc > 0) { break; } } } } else if (poll_rc == 0) { timeout = 0; break; } else if (poll_rc < 0) { if (errno != EINTR) { crm_perror(LOG_ERR, "poll() failed"); break; } } timeout = op->timeout - (time(NULL) - start) * 1000; } while ((op->timeout < 0 || timeout > 0)); crm_trace("Child done: %d", op->pid); if (wait_rc <= 0) { int killrc = kill(op->pid, SIGKILL); op->rc = PCMK_OCF_UNKNOWN_ERROR; if (op->timeout > 0 && timeout <= 0) { op->status = PCMK_LRM_OP_TIMEOUT; crm_warn("%s:%d - timed out after %dms", op->id, op->pid, op->timeout); } else { op->status = PCMK_LRM_OP_ERROR; } if (killrc && errno != ESRCH) { crm_err("kill(%d, KILL) failed: %d", op->pid, errno); } /* * From sigprocmask(2): * It is not possible to block SIGKILL or SIGSTOP. Attempts to do so are silently ignored. * * This makes it safe to skip WNOHANG here */ waitpid(op->pid, &status, 0); } else if (WIFEXITED(status)) { op->status = PCMK_LRM_OP_DONE; op->rc = WEXITSTATUS(status); crm_info("Managed %s process %d exited with rc=%d", op->id, op->pid, op->rc); } else if (WIFSIGNALED(status)) { int signo = WTERMSIG(status); op->status = PCMK_LRM_OP_ERROR; crm_err("Managed %s process %d exited with signal=%d", op->id, op->pid, signo); } #ifdef WCOREDUMP if (WCOREDUMP(status)) { crm_err("Managed %s process %d dumped core", op->id, op->pid); } #endif svc_read_output(op->opaque->stdout_fd, op, FALSE); svc_read_output(op->opaque->stderr_fd, op, TRUE); close(op->opaque->stdout_fd); close(op->opaque->stderr_fd); #ifdef HAVE_SYS_SIGNALFD_H close(sfd); #endif } /* For an asynchronous 'op', returns FALSE if 'op' should be free'd by the caller */ /* For a synchronous 'op', returns FALSE if 'op' fails */ gboolean services_os_action_execute(svc_action_t * op, gboolean synchronous) { int stdout_fd[2]; int stderr_fd[2]; struct stat st; 
sigset_t *pmask; #ifdef HAVE_SYS_SIGNALFD_H sigset_t mask; sigset_t old_mask; #define sigchld_cleanup() do { \ if (sigismember(&old_mask, SIGCHLD) == 0) { \ if (sigprocmask(SIG_UNBLOCK, &mask, NULL) < 0) { \ crm_perror(LOG_ERR, "sigprocmask() failed to unblock sigchld"); \ } \ } \ } while (0) #else struct sigaction sa; struct sigaction old_sa; #define sigchld_cleanup() do { \ if (sigaction(SIGCHLD, &old_sa, NULL) < 0) { \ crm_perror(LOG_ERR, "sigaction() failed to remove sigchld handler"); \ } \ close(sigchld_pipe[0]); \ close(sigchld_pipe[1]); \ sigchld_pipe[0] = sigchld_pipe[1] = -1; \ } while(0) #endif /* Fail fast */ if(stat(op->opaque->exec, &st) != 0) { int rc = errno; crm_warn("Cannot execute '%s': %s (%d)", op->opaque->exec, pcmk_strerror(rc), rc); services_handle_exec_error(op, rc); if (!synchronous) { return operation_finalize(op); } return FALSE; } if (pipe(stdout_fd) < 0) { int rc = errno; crm_err("pipe(stdout_fd) failed. '%s': %s (%d)", op->opaque->exec, pcmk_strerror(rc), rc); services_handle_exec_error(op, rc); if (!synchronous) { return operation_finalize(op); } return FALSE; } if (pipe(stderr_fd) < 0) { int rc = errno; close(stdout_fd[0]); close(stdout_fd[1]); crm_err("pipe(stderr_fd) failed. 
'%s': %s (%d)", op->opaque->exec, pcmk_strerror(rc), rc); services_handle_exec_error(op, rc); if (!synchronous) { return operation_finalize(op); } return FALSE; } if (synchronous) { #ifdef HAVE_SYS_SIGNALFD_H sigemptyset(&mask); sigaddset(&mask, SIGCHLD); sigemptyset(&old_mask); if (sigprocmask(SIG_BLOCK, &mask, &old_mask) < 0) { crm_perror(LOG_ERR, "sigprocmask() failed to block sigchld"); } pmask = &mask; #else if(pipe(sigchld_pipe) == -1) { crm_perror(LOG_ERR, "pipe() failed"); } set_fd_opts(sigchld_pipe[0], O_NONBLOCK); set_fd_opts(sigchld_pipe[1], O_NONBLOCK); sa.sa_handler = sigchld_handler; sa.sa_flags = 0; sigemptyset(&sa.sa_mask); if (sigaction(SIGCHLD, &sa, &old_sa) < 0) { crm_perror(LOG_ERR, "sigaction() failed to set sigchld handler"); } pmask = NULL; #endif } op->pid = fork(); switch (op->pid) { case -1: { int rc = errno; close(stdout_fd[0]); close(stdout_fd[1]); close(stderr_fd[0]); close(stderr_fd[1]); crm_err("Could not execute '%s': %s (%d)", op->opaque->exec, pcmk_strerror(rc), rc); services_handle_exec_error(op, rc); if (!synchronous) { return operation_finalize(op); } sigchld_cleanup(); return FALSE; } case 0: /* Child */ close(stdout_fd[0]); close(stderr_fd[0]); if (STDOUT_FILENO != stdout_fd[1]) { if (dup2(stdout_fd[1], STDOUT_FILENO) != STDOUT_FILENO) { crm_err("dup2() failed (stdout)"); } close(stdout_fd[1]); } if (STDERR_FILENO != stderr_fd[1]) { if (dup2(stderr_fd[1], STDERR_FILENO) != STDERR_FILENO) { crm_err("dup2() failed (stderr)"); } close(stderr_fd[1]); } if (synchronous) { sigchld_cleanup(); } action_launch_child(op); CRM_ASSERT(0); /* action_launch_child is effectively noreturn */ } /* Only the parent reaches here */ close(stdout_fd[1]); close(stderr_fd[1]); op->opaque->stdout_fd = stdout_fd[0]; set_fd_opts(op->opaque->stdout_fd, O_NONBLOCK); op->opaque->stderr_fd = stderr_fd[0]; set_fd_opts(op->opaque->stderr_fd, O_NONBLOCK); if (synchronous) { action_synced_wait(op, pmask); sigchld_cleanup(); } else { crm_trace("Async waiting for 
%d - %s", op->pid, op->opaque->exec); mainloop_child_add_with_flags(op->pid, op->timeout, op->id, op, (op->flags & SVC_ACTION_LEAVE_GROUP) ? mainloop_leave_pid_group : 0, operation_finished); op->opaque->stdout_gsource = mainloop_add_fd(op->id, G_PRIORITY_LOW, op->opaque->stdout_fd, op, &stdout_callbacks); op->opaque->stderr_gsource = mainloop_add_fd(op->id, G_PRIORITY_LOW, op->opaque->stderr_fd, op, &stderr_callbacks); services_add_inflight_op(op); } return TRUE; } GList * services_os_get_directory_list(const char *root, gboolean files, gboolean executable) { GList *list = NULL; struct dirent **namelist; int entries = 0, lpc = 0; char buffer[PATH_MAX]; entries = scandir(root, &namelist, NULL, alphasort); if (entries <= 0) { return list; } for (lpc = 0; lpc < entries; lpc++) { struct stat sb; if ('.' == namelist[lpc]->d_name[0]) { free(namelist[lpc]); continue; } snprintf(buffer, sizeof(buffer), "%s/%s", root, namelist[lpc]->d_name); if (stat(buffer, &sb)) { continue; } if (S_ISDIR(sb.st_mode)) { if (files) { free(namelist[lpc]); continue; } } else if (S_ISREG(sb.st_mode)) { if (files == FALSE) { free(namelist[lpc]); continue; } else if (executable && (sb.st_mode & S_IXUSR) == 0 && (sb.st_mode & S_IXGRP) == 0 && (sb.st_mode & S_IXOTH) == 0) { free(namelist[lpc]); continue; } } list = g_list_append(list, strdup(namelist[lpc]->d_name)); free(namelist[lpc]); } free(namelist); return list; } GList * resources_os_list_lsb_agents(void) { return get_directory_list(LSB_ROOT_DIR, TRUE, TRUE); } GList * resources_os_list_ocf_providers(void) { return get_directory_list(OCF_ROOT_DIR "/resource.d", FALSE, TRUE); } GList * resources_os_list_ocf_agents(const char *provider) { GList *gIter = NULL; GList *result = NULL; GList *providers = NULL; if (provider) { char buffer[500]; snprintf(buffer, sizeof(buffer), "%s/resource.d/%s", OCF_ROOT_DIR, provider); return get_directory_list(buffer, TRUE, TRUE); } providers = resources_os_list_ocf_providers(); for (gIter = providers; gIter != 
NULL; gIter = gIter->next) { GList *tmp1 = result; GList *tmp2 = resources_os_list_ocf_agents(gIter->data); if (tmp2) { result = g_list_concat(tmp1, tmp2); } } g_list_free_full(providers, free); return result; } #if SUPPORT_NAGIOS GList * resources_os_list_nagios_agents(void) { GList *plugin_list = NULL; GList *result = NULL; GList *gIter = NULL; plugin_list = get_directory_list(NAGIOS_PLUGIN_DIR, TRUE, TRUE); /* Make sure both the plugin and its metadata exist */ for (gIter = plugin_list; gIter != NULL; gIter = gIter->next) { const char *plugin = gIter->data; char *metadata = crm_strdup_printf(NAGIOS_METADATA_DIR "/%s.xml", plugin); struct stat st; if (stat(metadata, &st) == 0) { result = g_list_append(result, strdup(plugin)); } free(metadata); } g_list_free_full(plugin_list, free); return result; } #endif diff --git a/lib/services/systemd.c b/lib/services/systemd.c index c3d9c01c02..e6e11147b9 100644 --- a/lib/services/systemd.c +++ b/lib/services/systemd.c @@ -1,770 +1,759 @@ /* - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public - * License as published by the Free Software Foundation; either - * version 2.1 of the License, or (at your option) any later version. + * Copyright (C) 2012-2016 Andrew Beekhof * - * This software is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - * General Public License for more details. - * - * You should have received a copy of the GNU General Public - * License along with this library; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - * - * Copyright (C) 2012 Andrew Beekhof + * This source code is licensed under the GNU Lesser General Public License + * version 2.1 or later (LGPLv2.1+) WITHOUT ANY WARRANTY. 
*/ #include #include #include #include #include #include #include #include #include #include gboolean systemd_unit_exec_with_unit(svc_action_t * op, const char *unit); #define BUS_NAME "org.freedesktop.systemd1" #define BUS_NAME_MANAGER BUS_NAME ".Manager" #define BUS_NAME_UNIT BUS_NAME ".Unit" #define BUS_PATH "/org/freedesktop/systemd1" static inline DBusMessage * systemd_new_method(const char *method) { crm_trace("Calling: %s on " BUS_NAME_MANAGER, method); return dbus_message_new_method_call(BUS_NAME, BUS_PATH, BUS_NAME_MANAGER, method); } /* * Functions to manage a static DBus connection */ static DBusConnection* systemd_proxy = NULL; static inline DBusPendingCall * systemd_send(DBusMessage *msg, void(*done)(DBusPendingCall *pending, void *user_data), void *user_data, int timeout) { return pcmk_dbus_send(msg, systemd_proxy, done, user_data, timeout); } static inline DBusMessage * systemd_send_recv(DBusMessage *msg, DBusError *error, int timeout) { return pcmk_dbus_send_recv(msg, systemd_proxy, error, timeout); } /*! * \internal * \brief Send a method to systemd without arguments, and wait for reply * * \param[in] method Method to send * * \return Systemd reply on success, NULL (and error will be logged) otherwise * * \note The caller must call dbus_message_unref() on the reply after * handling it. 
*/ static DBusMessage * systemd_call_simple_method(const char *method) { DBusMessage *msg = systemd_new_method(method); DBusMessage *reply = NULL; DBusError error; /* Don't call systemd_init() here, because that calls this */ CRM_CHECK(systemd_proxy, return NULL); if (msg == NULL) { crm_err("Could not create message to send %s to systemd", method); return NULL; } dbus_error_init(&error); reply = systemd_send_recv(msg, &error, DBUS_TIMEOUT_USE_DEFAULT); dbus_message_unref(msg); if (dbus_error_is_set(&error)) { crm_err("Could not send %s to systemd: %s (%s)", method, error.message, error.name); dbus_error_free(&error); return NULL; } else if (reply == NULL) { crm_err("Could not send %s to systemd: no reply received", method); return NULL; } return reply; } static gboolean systemd_init(void) { static int need_init = 1; /* http://dbus.freedesktop.org/doc/api/html/group__DBusConnection.html */ if (systemd_proxy && dbus_connection_get_is_connected(systemd_proxy) == FALSE) { crm_warn("Connection to System DBus is closed. Reconnecting..."); pcmk_dbus_disconnect(systemd_proxy); systemd_proxy = NULL; need_init = 1; } if (need_init) { need_init = 0; systemd_proxy = pcmk_dbus_connect(); } if (systemd_proxy == NULL) { return FALSE; } return TRUE; } static inline char * systemd_get_property(const char *unit, const char *name, void (*callback)(const char *name, const char *value, void *userdata), void *userdata, DBusPendingCall **pending, int timeout) { return systemd_proxy? pcmk_dbus_get_property(systemd_proxy, BUS_NAME, unit, BUS_NAME_UNIT, name, callback, userdata, pending, timeout) : NULL; } void systemd_cleanup(void) { if (systemd_proxy) { pcmk_dbus_disconnect(systemd_proxy); systemd_proxy = NULL; } } /* * end of systemd_proxy functions */ /*! 
* \internal * \brief Check whether a file name represents a systemd unit * * \param[in] name File name to check * * \return Pointer to "dot" before filename extension if so, NULL otherwise */ static const char * systemd_unit_extension(const char *name) { if (name) { const char *dot = strrchr(name, '.'); if (dot && (!strcmp(dot, ".service") || !strcmp(dot, ".socket"))) { return dot; } } return NULL; } static char * systemd_service_name(const char *name) { if (name == NULL) { return NULL; } if (systemd_unit_extension(name)) { return strdup(name); } return crm_strdup_printf("%s.service", name); } static void systemd_daemon_reload_complete(DBusPendingCall *pending, void *user_data) { DBusError error; DBusMessage *reply = NULL; unsigned int reload_count = GPOINTER_TO_UINT(user_data); dbus_error_init(&error); if(pending) { reply = dbus_pending_call_steal_reply(pending); } if (pcmk_dbus_find_error(pending, reply, &error)) { crm_err("Could not issue systemd reload %d: %s", reload_count, error.message); dbus_error_free(&error); } else { crm_trace("Reload %d complete", reload_count); } if(pending) { dbus_pending_call_unref(pending); } if(reply) { dbus_message_unref(reply); } } static bool systemd_daemon_reload(int timeout) { static unsigned int reload_count = 0; DBusMessage *msg = systemd_new_method("Reload"); reload_count++; CRM_ASSERT(msg != NULL); systemd_send(msg, systemd_daemon_reload_complete, GUINT_TO_POINTER(reload_count), timeout); dbus_message_unref(msg); return TRUE; } static bool systemd_mask_error(svc_action_t *op, const char *error) { crm_trace("Could not issue %s for %s: %s", op->action, op->rsc, error); if(strstr(error, "org.freedesktop.systemd1.InvalidName") || strstr(error, "org.freedesktop.systemd1.LoadFailed") || strstr(error, "org.freedesktop.systemd1.NoSuchUnit")) { if (safe_str_eq(op->action, "stop")) { crm_trace("Masking %s failure for %s: unknown services are stopped", op->action, op->rsc); op->rc = PCMK_OCF_OK; return TRUE; } else { 
crm_trace("Mapping %s failure for %s: unknown services are not installed", op->action, op->rsc); op->rc = PCMK_OCF_NOT_INSTALLED; op->status = PCMK_LRM_OP_NOT_INSTALLED; return FALSE; } } return FALSE; } static const char * systemd_loadunit_result(DBusMessage *reply, svc_action_t * op) { const char *path = NULL; DBusError error; if (pcmk_dbus_find_error((void*)&path, reply, &error)) { if(op && !systemd_mask_error(op, error.name)) { crm_err("Could not load systemd unit %s for %s: %s", op->agent, op->id, error.message); } dbus_error_free(&error); } else if(pcmk_dbus_type_check(reply, NULL, DBUS_TYPE_OBJECT_PATH, __FUNCTION__, __LINE__)) { dbus_message_get_args (reply, NULL, DBUS_TYPE_OBJECT_PATH, &path, DBUS_TYPE_INVALID); } if(op) { if (path) { systemd_unit_exec_with_unit(op, path); } else if (op->synchronous == FALSE) { operation_finalize(op); } } return path; } static void systemd_loadunit_cb(DBusPendingCall *pending, void *user_data) { DBusMessage *reply = NULL; svc_action_t * op = user_data; if(pending) { reply = dbus_pending_call_steal_reply(pending); } crm_trace("Got result: %p for %p / %p for %s", reply, pending, op->opaque->pending, op->id); CRM_LOG_ASSERT(pending == op->opaque->pending); services_set_op_pending(op, NULL); systemd_loadunit_result(reply, user_data); if(reply) { dbus_message_unref(reply); } } static char * systemd_unit_by_name(const gchar * arg_name, svc_action_t *op) { DBusMessage *msg; DBusMessage *reply = NULL; DBusPendingCall* pending = NULL; char *name = NULL; /* Equivalent to GetUnit if it's already loaded */ if (systemd_init() == FALSE) { return NULL; } msg = systemd_new_method("LoadUnit"); CRM_ASSERT(msg != NULL); name = systemd_service_name(arg_name); CRM_LOG_ASSERT(dbus_message_append_args(msg, DBUS_TYPE_STRING, &name, DBUS_TYPE_INVALID)); free(name); if(op == NULL || op->synchronous) { const char *unit = NULL; char *munit = NULL; reply = systemd_send_recv(msg, NULL, (op? 
op->timeout : DBUS_TIMEOUT_USE_DEFAULT)); dbus_message_unref(msg); unit = systemd_loadunit_result(reply, op); if(unit) { munit = strdup(unit); } if(reply) { dbus_message_unref(reply); } return munit; } pending = systemd_send(msg, systemd_loadunit_cb, op, op->timeout); if(pending) { services_set_op_pending(op, pending); } dbus_message_unref(msg); return NULL; } GList * systemd_unit_listall(void) { int lpc = 0; GList *units = NULL; DBusMessageIter args; DBusMessageIter unit; DBusMessageIter elem; DBusMessage *reply = NULL; if (systemd_init() == FALSE) { return NULL; } /* " \n" \ " \n" \ " \n" \ */ reply = systemd_call_simple_method("ListUnits"); if (reply == NULL) { return NULL; } if (!dbus_message_iter_init(reply, &args)) { crm_err("Could not list systemd units: systemd reply has no arguments"); dbus_message_unref(reply); return NULL; } if (!pcmk_dbus_type_check(reply, &args, DBUS_TYPE_ARRAY, __FUNCTION__, __LINE__)) { crm_err("Could not list systemd units: systemd reply has invalid arguments"); dbus_message_unref(reply); return NULL; } dbus_message_iter_recurse(&args, &unit); while (dbus_message_iter_get_arg_type (&unit) != DBUS_TYPE_INVALID) { DBusBasicValue value; if(!pcmk_dbus_type_check(reply, &unit, DBUS_TYPE_STRUCT, __FUNCTION__, __LINE__)) { continue; } dbus_message_iter_recurse(&unit, &elem); if(!pcmk_dbus_type_check(reply, &elem, DBUS_TYPE_STRING, __FUNCTION__, __LINE__)) { continue; } dbus_message_iter_get_basic(&elem, &value); crm_trace("DBus ListUnits listed: %s", value.str); if(value.str) { const char *match = systemd_unit_extension(value.str); if (match) { char *unit_name; if (!strcmp(match, ".service")) { /* service is the "default" unit type, so strip it */ unit_name = strndup(value.str, match - value.str); } else { unit_name = strdup(value.str); } lpc++; units = g_list_append(units, unit_name); } } dbus_message_iter_next (&unit); } dbus_message_unref(reply); crm_trace("Found %d systemd services", lpc); return units; } gboolean 
systemd_unit_exists(const char *name) { char *unit = NULL; /* Note: Makes blocking DBus calls * Used by resources_find_service_class() when resource class=service */ unit = systemd_unit_by_name(name, NULL); if(unit) { free(unit); return TRUE; } return FALSE; } static char * systemd_unit_metadata(const char *name, int timeout) { char *meta = NULL; char *desc = NULL; char *path = systemd_unit_by_name(name, NULL); if (path) { /* TODO: Worth making a blocking call for? Probably not. Possibly if cached. */ desc = systemd_get_property(path, "Description", NULL, NULL, NULL, timeout); } else { desc = crm_strdup_printf("Systemd unit file for %s", name); } meta = crm_strdup_printf("\n" "\n" "\n" " 1.0\n" " \n" " %s\n" " \n" " systemd unit file for %s\n" " \n" " \n" " \n" " \n" " \n" " \n" " \n" " \n" " \n" " \n" " \n" "\n", name, desc, name); free(desc); free(path); return meta; } static void systemd_exec_result(DBusMessage *reply, svc_action_t *op) { DBusError error; if (pcmk_dbus_find_error((void*)&error, reply, &error)) { /* ignore "already started" or "not running" errors */ if (!systemd_mask_error(op, error.name)) { crm_err("Could not issue %s for %s: %s", op->action, op->rsc, error.message); } dbus_error_free(&error); } else { if(!pcmk_dbus_type_check(reply, NULL, DBUS_TYPE_OBJECT_PATH, __FUNCTION__, __LINE__)) { crm_warn("Call to %s passed but return type was unexpected", op->action); op->rc = PCMK_OCF_OK; } else { const char *path = NULL; dbus_message_get_args (reply, NULL, DBUS_TYPE_OBJECT_PATH, &path, DBUS_TYPE_INVALID); crm_info("Call to %s passed: %s", op->action, path); op->rc = PCMK_OCF_OK; } } operation_finalize(op); } static void systemd_async_dispatch(DBusPendingCall *pending, void *user_data) { DBusMessage *reply = NULL; svc_action_t *op = user_data; if(pending) { reply = dbus_pending_call_steal_reply(pending); } crm_trace("Got result: %p for %p for %s, %s", reply, pending, op->rsc, op->action); CRM_LOG_ASSERT(pending == op->opaque->pending); 
services_set_op_pending(op, NULL); systemd_exec_result(reply, op); if(reply) { dbus_message_unref(reply); } } #define SYSTEMD_OVERRIDE_ROOT "/run/systemd/system/" static void systemd_unit_check(const char *name, const char *state, void *userdata) { svc_action_t * op = userdata; crm_trace("Resource %s has %s='%s'", op->rsc, name, state); if(state == NULL) { op->rc = PCMK_OCF_NOT_RUNNING; } else if (g_strcmp0(state, "active") == 0) { op->rc = PCMK_OCF_OK; } else if (g_strcmp0(state, "activating") == 0) { op->rc = PCMK_OCF_PENDING; } else if (g_strcmp0(state, "deactivating") == 0) { op->rc = PCMK_OCF_PENDING; } else { op->rc = PCMK_OCF_NOT_RUNNING; } if (op->synchronous == FALSE) { services_set_op_pending(op, NULL); operation_finalize(op); } } gboolean systemd_unit_exec_with_unit(svc_action_t * op, const char *unit) { const char *method = op->action; DBusMessage *msg = NULL; DBusMessage *reply = NULL; CRM_ASSERT(unit); if (safe_str_eq(op->action, "monitor") || safe_str_eq(method, "status")) { DBusPendingCall *pending = NULL; char *state; state = systemd_get_property(unit, "ActiveState", (op->synchronous? NULL : systemd_unit_check), op, (op->synchronous? NULL : &pending), op->timeout); if (op->synchronous) { systemd_unit_check("ActiveState", state, op); free(state); return op->rc == PCMK_OCF_OK; } else if (pending) { services_set_op_pending(op, pending); return TRUE; } else { return operation_finalize(op); } } else if (g_strcmp0(method, "start") == 0) { FILE *file_strm = NULL; char *override_dir = crm_strdup_printf("%s/%s.service.d", SYSTEMD_OVERRIDE_ROOT, op->agent); char *override_file = crm_strdup_printf("%s/%s.service.d/50-pacemaker.conf", SYSTEMD_OVERRIDE_ROOT, op->agent); mode_t orig_umask; method = "StartUnit"; crm_build_path(override_dir, 0755); /* Ensure the override file is world-readable. This is not strictly * necessary, but it avoids a systemd warning in the logs. 
*/ orig_umask = umask(S_IWGRP | S_IWOTH); file_strm = fopen(override_file, "w"); umask(orig_umask); if (file_strm != NULL) { /* TODO: Insert the start timeout in too */ char *override = crm_strdup_printf( "[Unit]\n" "Description=Cluster Controlled %s\n" "Before=pacemaker.service\n" "\n" "[Service]\n" "Restart=no\n", op->agent); int rc = fprintf(file_strm, "%s\n", override); free(override); if (rc < 0) { crm_perror(LOG_ERR, "Cannot write to systemd override file %s", override_file); } } else { crm_err("Cannot open systemd override file %s for writing", override_file); } if (file_strm != NULL) { fflush(file_strm); fclose(file_strm); } systemd_daemon_reload(op->timeout); free(override_file); free(override_dir); } else if (g_strcmp0(method, "stop") == 0) { char *override_file = crm_strdup_printf("%s/%s.service.d/50-pacemaker.conf", SYSTEMD_OVERRIDE_ROOT, op->agent); method = "StopUnit"; unlink(override_file); free(override_file); systemd_daemon_reload(op->timeout); } else if (g_strcmp0(method, "restart") == 0) { method = "RestartUnit"; } else { op->rc = PCMK_OCF_UNIMPLEMENT_FEATURE; goto cleanup; } crm_debug("Calling %s for %s: %s", method, op->rsc, unit); msg = systemd_new_method(method); CRM_ASSERT(msg != NULL); /* (ss) */ { const char *replace_s = "replace"; char *name = systemd_service_name(op->agent); CRM_LOG_ASSERT(dbus_message_append_args(msg, DBUS_TYPE_STRING, &name, DBUS_TYPE_INVALID)); CRM_LOG_ASSERT(dbus_message_append_args(msg, DBUS_TYPE_STRING, &replace_s, DBUS_TYPE_INVALID)); free(name); } if (op->synchronous == FALSE) { DBusPendingCall *pending = systemd_send(msg, systemd_async_dispatch, op, op->timeout); dbus_message_unref(msg); if(pending) { services_set_op_pending(op, pending); return TRUE; } else { return operation_finalize(op); } } else { reply = systemd_send_recv(msg, NULL, op->timeout); dbus_message_unref(msg); systemd_exec_result(reply, op); if(reply) { dbus_message_unref(reply); } return FALSE; } cleanup: if (op->synchronous == FALSE) { return 
operation_finalize(op); } return op->rc == PCMK_OCF_OK; } static gboolean systemd_timeout_callback(gpointer p) { svc_action_t * op = p; op->opaque->timerid = 0; crm_warn("%s operation on systemd unit %s named '%s' timed out", op->action, op->agent, op->rsc); operation_finalize(op); return FALSE; } /* For an asynchronous 'op', returns FALSE if 'op' should be free'd by the caller */ /* For a synchronous 'op', returns FALSE if 'op' fails */ gboolean systemd_unit_exec(svc_action_t * op) { char *unit = NULL; CRM_ASSERT(op); CRM_ASSERT(systemd_init()); op->rc = PCMK_OCF_UNKNOWN_ERROR; crm_debug("Performing %ssynchronous %s op on systemd unit %s named '%s'", op->synchronous ? "" : "a", op->action, op->agent, op->rsc); if (safe_str_eq(op->action, "meta-data")) { /* TODO: See if we can teach the lrmd not to make these calls synchronously */ op->stdout_data = systemd_unit_metadata(op->agent, op->timeout); op->rc = PCMK_OCF_OK; if (op->synchronous == FALSE) { return operation_finalize(op); } return TRUE; } unit = systemd_unit_by_name(op->agent, op); free(unit); if (op->synchronous == FALSE) { if (op->opaque->pending) { op->opaque->timerid = g_timeout_add(op->timeout + 5000, systemd_timeout_callback, op); services_add_inflight_op(op); return TRUE; } else { return operation_finalize(op); } } return op->rc == PCMK_OCF_OK; } diff --git a/lib/transition/Makefile.am b/lib/transition/Makefile.am index 4d6cd23d0f..533b0483e3 100644 --- a/lib/transition/Makefile.am +++ b/lib/transition/Makefile.am @@ -1,35 +1,35 @@ # # Copyright (C) 2004 Andrew Beekhof # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation; either version 2 # of the License, or (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. # include $(top_srcdir)/Makefile.common ## libraries lib_LTLIBRARIES = libtransitioner.la ## SOURCES -libtransitioner_la_LDFLAGS = -version-info 2:5:0 +libtransitioner_la_LDFLAGS = -version-info 2:6:0 libtransitioner_la_CPPFLAGS = -I$(top_builddir) $(AM_CPPFLAGS) libtransitioner_la_CFLAGS = $(CFLAGS_HARDENED_LIB) libtransitioner_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB) libtransitioner_la_LIBADD = $(top_builddir)/lib/common/libcrmcommon.la libtransitioner_la_SOURCES = unpack.c graph.c utils.c clean-generic: rm -f *~ diff --git a/lrmd/regression.py.in b/lrmd/regression.py.in index eef76918be..3fd7b60ecd 100755 --- a/lrmd/regression.py.in +++ b/lrmd/regression.py.in @@ -1,1290 +1,1290 @@ #!/usr/bin/python """ Regression tests for Pacemaker's lrmd """ # Pacemaker targets compatibility with Python 2.6+ and 3.2+ from __future__ import print_function, unicode_literals, absolute_import, division __copyright__ = "Copyright (C) 2012-2016 Andrew Beekhof " __license__ = "GNU General Public License version 2 or later (GPLv2+) WITHOUT ANY WARRANTY" import io import os import sys import subprocess import shlex import time # Where to find test binaries # Prefer the source tree if available BUILD_DIR = "@abs_top_builddir@" TEST_DIR = sys.path[0] def update_path(): """ Set the PATH environment variable appropriately for the tests """ new_path = os.environ['PATH'] if os.path.exists("%s/regression.py.in" % TEST_DIR): print("Running tests from the source tree: %s (%s)" % (BUILD_DIR, TEST_DIR)) new_path = "%s/lrmd:%s" % (BUILD_DIR, new_path) # For lrmd, lrmd_test and pacemaker_remoted new_path = "%s/tools:%s" % (BUILD_DIR, new_path) # For crm_resource new_path = "%s/fencing:%s" % (BUILD_DIR, new_path) # For stonithd else: 
print("Running tests from the install tree: @CRM_DAEMON_DIR@ (not %s)" % TEST_DIR) new_path = "@CRM_DAEMON_DIR@:%s" % (new_path) # For stonithd, lrmd, lrmd_test and pacemaker_remoted print(new_path) os.environ['PATH'] = new_path def shlex_split(command): """ Wrapper for shlex.split() that works around Python 2.6 bug """ if sys.version_info < (2, 7,): return shlex.split(command.encode('ascii')) else: return shlex.split(command) def pipe_output(pipes, stdout=True, stderr=False): """ Wrapper to get text output from pipes regardless of Python version """ output = "" pipe_outputs = pipes.communicate() if sys.version_info < (3,): if stdout: output = output + pipe_outputs[0] if stderr: output = output + pipe_outputs[1] else: if stdout: output = output + pipe_outputs[0].decode(sys.stdout.encoding) if stderr: output = output + pipe_outputs[1].decode(sys.stderr.encoding) return output def output_from_command(command): """ Run a command, and return its standard output. """ test = subprocess.Popen(shlex_split(command), stdout=subprocess.PIPE) test.wait() return pipe_output(test).split("\n") class Test(object): """ Executor for a single lrmd regression test """ def __init__(self, name, description, verbose=0, tls=0): self.name = name self.description = description self.cmds = [] if tls: self.daemon_location = "pacemaker_remoted" else: self.daemon_location = "lrmd" self.test_tool_location = "lrmd_test" self.verbose = verbose self.tls = tls self.result_txt = "" self.cmd_tool_output = "" self.result_exitcode = 0 self.lrmd_process = None self.stonith_process = None self.executed = 0 def __new_cmd(self, cmd, args, exitcode, stdout_match="", no_wait=0, stdout_negative_match="", kill=None): """ Add a command to be executed as part of this test """ if self.verbose and cmd == self.test_tool_location: args = args + " -V " if (cmd == self.test_tool_location) and self.tls: args = args + " -S " self.cmds.append( { "cmd" : cmd, "kill" : kill, "args" : args, "expected_exitcode" : exitcode, 
"stdout_match" : stdout_match, "stdout_negative_match" : stdout_negative_match, "no_wait" : no_wait, "cmd_output" : "", } ) def start_environment(self): """ Prepare the host for running a test """ ### make sure we are in full control here ### cmd = shlex_split("killall -q -9 stonithd lt-stonithd lrmd lt-lrmd lrmd_test lt-lrmd_test pacemaker_remoted") test = subprocess.Popen(cmd, stdout=subprocess.PIPE) test.wait() additional_args = "" if self.tls == 0: self.stonith_process = subprocess.Popen(shlex_split("stonithd -s")) if self.verbose: additional_args = additional_args + " -V" self.lrmd_process = subprocess.Popen(shlex_split("%s %s -l /tmp/lrmd-regression.log" % (self.daemon_location, additional_args))) time.sleep(1) def clean_environment(self): """ Clean up the host after running a test """ if self.lrmd_process: self.lrmd_process.terminate() self.lrmd_process.wait() if self.verbose: print("Daemon output") logfile = io.open('/tmp/lrmd-regression.log', 'rt') for line in logfile.readlines(): print(line.strip()) os.remove('/tmp/lrmd-regression.log') if self.stonith_process: self.stonith_process.terminate() self.stonith_process.wait() self.lrmd_process = None self.stonith_process = None def add_sys_cmd(self, cmd, args): """ Add a simple command to be executed as part of this test """ self.__new_cmd(cmd, args, 0, "") def add_sys_cmd_no_wait(self, cmd, args): """ Add a simple command to be executed (without waiting) as part of this test """ self.__new_cmd(cmd, args, 0, "", 1) def add_expected_fail_sys_cmd(self, cmd, args, exitcode): """ Add a command to be executed as part of this test and expected to fail """ self.__new_cmd(cmd, args, exitcode) def add_cmd_check_stdout(self, args, match, no_match=""): """ Add a command with expected output to be executed as part of this test """ self.__new_cmd(self.test_tool_location, args, 0, match, 0, no_match) def add_cmd(self, args): """ Add an lrmd_test command to be executed as part of this test """ 
self.__new_cmd(self.test_tool_location, args, 0, "") def add_cmd_and_kill(self, kill_proc, args): """ Add an lrmd_test command and system command to be executed as part of this test """ self.__new_cmd(self.test_tool_location, args, 0, "", kill=kill_proc) def add_expected_fail_cmd(self, args): """ Add an lrmd_test command to be executed as part of this test and expected to fail """ self.__new_cmd(self.test_tool_location, args, 1, "") def get_exitcode(self): """ Return the exit status of the last test execution """ return self.result_exitcode def print_result(self, filler): """ Print the result of the last test execution """ print("%s%s" % (filler, self.result_txt)) def run_cmd(self, args): """ Execute a command as part of this test """ cmd = shlex_split(args['args']) cmd.insert(0, args['cmd']) if self.verbose: print("\n\nRunning: "+" ".join(cmd)) test = subprocess.Popen(cmd, stdout=subprocess.PIPE) if args['kill']: if self.verbose: print("Also running: "+args['kill']) ### Typically, the kill argument is used to detect some sort of ### failure. Without yielding for a few seconds here, the process ### launched earlier that is listening for the failure may not have ### time to connect to the lrmd. time.sleep(2) subprocess.Popen(shlex_split(args['kill'])) if args['no_wait'] == 0: test.wait() else: return 0 output = pipe_output(test) if args['stdout_match'] != "" and output.count(args['stdout_match']) == 0: test.returncode = -2 print("STDOUT string '%s' was not found in cmd output" % (args['stdout_match'])) if args['stdout_negative_match'] != "" and output.count(args['stdout_negative_match']) != 0: test.returncode = -2 print("STDOUT string '%s' was found in cmd output" % (args['stdout_negative_match'])) args['cmd_output'] = output return test.returncode def run(self): """ Execute this test. 
""" res = 0 i = 1 if self.tls and self.name.count("stonith") != 0: self.result_txt = "SKIPPED - '%s' - disabled when testing pacemaker_remote" % (self.name) print(self.result_txt) return res self.start_environment() if self.verbose: print("\n--- START TEST - %s" % self.name) self.result_txt = "SUCCESS - '%s'" % (self.name) self.result_exitcode = 0 for cmd in self.cmds: res = self.run_cmd(cmd) if res != cmd['expected_exitcode']: print(cmd['cmd_output']) print("Step %d FAILED - command returned %d, expected %d" % (i, res, cmd['expected_exitcode'])) msg = "FAILURE - '%s' failed at step %d. Command: lrmd_test %s" self.result_txt = msg % (self.name, i, cmd['args']) self.result_exitcode = -1 break else: if self.verbose: print(cmd['cmd_output'].strip()) print("Step %d SUCCESS" % (i)) i = i + 1 self.clean_environment() print(self.result_txt) if self.verbose: print("--- END TEST - %s\n" % self.name) self.executed = 1 return res class Tests(object): """ Collection of all lrmd regression tests """ def __init__(self, verbose=0, tls=0): self.tests = [] self.verbose = verbose self.tls = tls self.rsc_classes = output_from_command("crm_resource --list-standards") self.rsc_classes = self.rsc_classes[:-1] # Strip trailing empty line self.need_authkey = 0 self.action_timeout = " -t 5000 " if self.tls: self.rsc_classes.remove("stonith") if "systemd" in self.rsc_classes: try: # This code doesn't need this import, but lrmd_dummy_daemon does, # so ensure the dependency is available rather than cause all # systemd tests to fail. import systemd.daemon except ImportError: print("Fatal error: python systemd bindings not found. 
Is package installed?", file=sys.stderr) sys.exit(1) print("Testing "+repr(self.rsc_classes)) self.common_cmds = { "ocf_reg_line" : "-c register_rsc -r ocf_test_rsc "+self.action_timeout+" -C ocf -P pacemaker -T Dummy", "ocf_reg_event" : "-l \"NEW_EVENT event_type:register rsc_id:ocf_test_rsc action:none rc:ok op_status:complete\"", "ocf_unreg_line" : "-c unregister_rsc -r \"ocf_test_rsc\" "+self.action_timeout, "ocf_unreg_event" : "-l \"NEW_EVENT event_type:unregister rsc_id:ocf_test_rsc action:none rc:ok op_status:complete\"", "ocf_start_line" : "-c exec -r \"ocf_test_rsc\" -a \"start\" "+self.action_timeout, "ocf_start_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:ocf_test_rsc action:start rc:ok op_status:complete\" ", "ocf_stop_line" : "-c exec -r \"ocf_test_rsc\" -a \"stop\" "+self.action_timeout, "ocf_stop_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:ocf_test_rsc action:stop rc:ok op_status:complete\" ", "ocf_monitor_line" : "-c exec -r \"ocf_test_rsc\" -a \"monitor\" -i \"2000\" "+self.action_timeout, "ocf_monitor_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:ocf_test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout, "ocf_cancel_line" : "-c cancel -r \"ocf_test_rsc\" -a \"monitor\" -i \"2000\" -t \"6000\" ", "ocf_cancel_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:ocf_test_rsc action:monitor rc:ok op_status:Cancelled\" ", "systemd_reg_line" : "-c register_rsc -r systemd_test_rsc "+self.action_timeout+" -C systemd -T lrmd_dummy_daemon", "systemd_reg_event" : "-l \"NEW_EVENT event_type:register rsc_id:systemd_test_rsc action:none rc:ok op_status:complete\"", "systemd_unreg_line" : "-c unregister_rsc -r \"systemd_test_rsc\" "+self.action_timeout, "systemd_unreg_event" : "-l \"NEW_EVENT event_type:unregister rsc_id:systemd_test_rsc action:none rc:ok op_status:complete\"", "systemd_start_line" : "-c exec -r \"systemd_test_rsc\" -a \"start\" "+self.action_timeout, "systemd_start_event" : "-l 
\"NEW_EVENT event_type:exec_complete rsc_id:systemd_test_rsc action:start rc:ok op_status:complete\" ", "systemd_stop_line" : "-c exec -r \"systemd_test_rsc\" -a \"stop\" "+self.action_timeout, "systemd_stop_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:systemd_test_rsc action:stop rc:ok op_status:complete\" ", "systemd_monitor_line" : "-c exec -r \"systemd_test_rsc\" -a \"monitor\" -i \"2000\" "+self.action_timeout, "systemd_monitor_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:systemd_test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout, "systemd_cancel_line" : "-c cancel -r \"systemd_test_rsc\" -a \"monitor\" -i \"2000\" -t \"6000\" ", "systemd_cancel_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:systemd_test_rsc action:monitor rc:ok op_status:Cancelled\" ", "upstart_reg_line" : "-c register_rsc -r upstart_test_rsc "+self.action_timeout+" -C upstart -T lrmd_dummy_daemon", "upstart_reg_event" : "-l \"NEW_EVENT event_type:register rsc_id:upstart_test_rsc action:none rc:ok op_status:complete\"", "upstart_unreg_line" : "-c unregister_rsc -r \"upstart_test_rsc\" "+self.action_timeout, "upstart_unreg_event" : "-l \"NEW_EVENT event_type:unregister rsc_id:upstart_test_rsc action:none rc:ok op_status:complete\"", "upstart_start_line" : "-c exec -r \"upstart_test_rsc\" -a \"start\" "+self.action_timeout, "upstart_start_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:upstart_test_rsc action:start rc:ok op_status:complete\" ", "upstart_stop_line" : "-c exec -r \"upstart_test_rsc\" -a \"stop\" "+self.action_timeout, "upstart_stop_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:upstart_test_rsc action:stop rc:ok op_status:complete\" ", "upstart_monitor_line" : "-c exec -r \"upstart_test_rsc\" -a \"monitor\" -i \"2000\" "+self.action_timeout, "upstart_monitor_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:upstart_test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout, 
"upstart_cancel_line" : "-c cancel -r \"upstart_test_rsc\" -a \"monitor\" -i \"2000\" -t \"6000\" ", "upstart_cancel_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:upstart_test_rsc action:monitor rc:ok op_status:Cancelled\" ", "service_reg_line" : "-c register_rsc -r service_test_rsc "+self.action_timeout+" -C service -T LSBDummy", "service_reg_event" : "-l \"NEW_EVENT event_type:register rsc_id:service_test_rsc action:none rc:ok op_status:complete\"", "service_unreg_line" : "-c unregister_rsc -r \"service_test_rsc\" "+self.action_timeout, "service_unreg_event" : "-l \"NEW_EVENT event_type:unregister rsc_id:service_test_rsc action:none rc:ok op_status:complete\"", "service_start_line" : "-c exec -r \"service_test_rsc\" -a \"start\" "+self.action_timeout, "service_start_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:service_test_rsc action:start rc:ok op_status:complete\" ", "service_stop_line" : "-c exec -r \"service_test_rsc\" -a \"stop\" "+self.action_timeout, "service_stop_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:service_test_rsc action:stop rc:ok op_status:complete\" ", "service_monitor_line" : "-c exec -r \"service_test_rsc\" -a \"monitor\" -i \"2000\" "+self.action_timeout, "service_monitor_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:service_test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout, "service_cancel_line" : "-c cancel -r \"service_test_rsc\" -a \"monitor\" -i \"2000\" -t \"6000\" ", "service_cancel_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:service_test_rsc action:monitor rc:ok op_status:Cancelled\" ", "lsb_reg_line" : "-c register_rsc -r lsb_test_rsc "+self.action_timeout+" -C lsb -T LSBDummy", "lsb_reg_event" : "-l \"NEW_EVENT event_type:register rsc_id:lsb_test_rsc action:none rc:ok op_status:complete\" ", "lsb_unreg_line" : "-c unregister_rsc -r \"lsb_test_rsc\" "+self.action_timeout, "lsb_unreg_event" : "-l \"NEW_EVENT event_type:unregister rsc_id:lsb_test_rsc 
action:none rc:ok op_status:complete\"", "lsb_start_line" : "-c exec -r \"lsb_test_rsc\" -a \"start\" "+self.action_timeout, "lsb_start_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:lsb_test_rsc action:start rc:ok op_status:complete\" ", "lsb_stop_line" : "-c exec -r \"lsb_test_rsc\" -a \"stop\" "+self.action_timeout, "lsb_stop_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:lsb_test_rsc action:stop rc:ok op_status:complete\" ", "lsb_monitor_line" : "-c exec -r \"lsb_test_rsc\" -a status -i \"2000\" "+self.action_timeout, "lsb_monitor_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:lsb_test_rsc action:status rc:ok op_status:complete\" "+self.action_timeout, "lsb_cancel_line" : "-c cancel -r \"lsb_test_rsc\" -a \"status\" -i \"2000\" -t \"6000\" ", "lsb_cancel_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:lsb_test_rsc action:status rc:ok op_status:Cancelled\" ", "heartbeat_reg_line" : "-c register_rsc -r hb_test_rsc "+self.action_timeout+" -C heartbeat -T HBDummy", "heartbeat_reg_event" : "-l \"NEW_EVENT event_type:register rsc_id:hb_test_rsc action:none rc:ok op_status:complete\" ", "heartbeat_unreg_line" : "-c unregister_rsc -r \"hb_test_rsc\" "+self.action_timeout, "heartbeat_unreg_event" : "-l \"NEW_EVENT event_type:unregister rsc_id:hb_test_rsc action:none rc:ok op_status:complete\"", "heartbeat_start_line" : "-c exec -r \"hb_test_rsc\" -a \"start\" -k 1 -v a -k 2 -v b "+self.action_timeout, "heartbeat_start_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:hb_test_rsc action:start rc:ok op_status:complete\" ", "heartbeat_stop_line" : "-c exec -r \"hb_test_rsc\" -a \"stop\" -k 1 -v a -k 2 -v b "+self.action_timeout, "heartbeat_stop_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:hb_test_rsc action:stop rc:ok op_status:complete\" ", "heartbeat_monitor_line" : "-c exec -r \"hb_test_rsc\" -a status -k 1 -v a -k 2 -v b -i \"2000\" "+self.action_timeout, "heartbeat_monitor_event" : "-l \"NEW_EVENT 
event_type:exec_complete rsc_id:hb_test_rsc action:status rc:ok op_status:complete\" "+self.action_timeout, "heartbeat_cancel_line" : "-c cancel -r \"hb_test_rsc\" -a \"status\" -k 1 -v a -k 2 -v b -i \"2000\" -t \"6000\" ", "heartbeat_cancel_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:hb_test_rsc action:status rc:ok op_status:Cancelled\" ", "stonith_reg_line" : "-c register_rsc -r stonith_test_rsc "+self.action_timeout+" -C stonith -P pacemaker -T fence_dummy_monitor", "stonith_reg_event" : "-l \"NEW_EVENT event_type:register rsc_id:stonith_test_rsc action:none rc:ok op_status:complete\" ", "stonith_unreg_line" : "-c unregister_rsc -r \"stonith_test_rsc\" "+self.action_timeout, "stonith_unreg_event" : "-l \"NEW_EVENT event_type:unregister rsc_id:stonith_test_rsc action:none rc:ok op_status:complete\"", "stonith_start_line" : "-c exec -r \"stonith_test_rsc\" -a \"start\" -t 8000 ", "stonith_start_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:stonith_test_rsc action:start rc:ok op_status:complete\" ", "stonith_stop_line" : "-c exec -r \"stonith_test_rsc\" -a \"stop\" "+self.action_timeout, "stonith_stop_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:stonith_test_rsc action:stop rc:ok op_status:complete\" ", "stonith_monitor_line" : "-c exec -r \"stonith_test_rsc\" -a \"monitor\" -i \"2000\" "+self.action_timeout, "stonith_monitor_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:stonith_test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout, "stonith_cancel_line" : "-c cancel -r \"stonith_test_rsc\" -a \"monitor\" -i \"2000\" -t \"6000\" ", "stonith_cancel_event" : "-l \"NEW_EVENT event_type:exec_complete rsc_id:stonith_test_rsc action:monitor rc:ok op_status:Cancelled\" ", } def new_test(self, name, description): """ Create a named test """ test = Test(name, description, self.verbose, self.tls) self.tests.append(test) return test def setup_test_environment(self): """ Prepare the host before executing any 
tests """ os.system("service pacemaker_remote stop") self.cleanup_test_environment() if self.tls and not os.path.isfile("/etc/pacemaker/authkey"): self.need_authkey = 1 os.system("mkdir -p /etc/pacemaker") os.system("dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1") ### Make fake systemd daemon and unit file ### dummy_daemon = """#!/usr/bin/python import time, systemd.daemon time.sleep(3) systemd.daemon.notify("READY=1") while True: time.sleep(5) """ dummy_service_file = """ [Unit] Description=Dummy resource that takes a while to start [Service] Type=notify ExecStart=/usr/sbin/lrmd_dummy_daemon """ dummy_upstart_job = (""" description "Dummy service for regression tests" exec dd if=/dev/random of=/dev/null """) dummy_fence_sleep_agent = ("""#!/usr/bin/python import sys import time def main(): for line in sys.stdin.readlines(): if line.count("monitor") > 0: time.sleep(30000) sys.exit(0) sys.exit(-1) if __name__ == "__main__": main() """) dummy_fence_agent = ("""#!/usr/bin/python from __future__ import print_function, unicode_literals, absolute_import, division import sys def main(): for line in sys.stdin.readlines(): if line.count("monitor") > 0: sys.exit(0) if line.count("metadata") > 0: print('') print(' dummy description.') print(' http://www.example.com') print(' ') print(' ') print(' ') print(' ') print(' Fencing Action') print(' ') print(' ') print(' ') print(' ') print(' Physical plug number or name of virtual machine') print(' ') print(' ') print(' ') print(' ') print(' ') print(' ') print(' ') print(' ') print('') sys.exit(0) sys.exit(-1) if __name__ == "__main__": main() """) os.system("cat <<-END >>/etc/init/lrmd_dummy_daemon.conf\n%s\nEND" % (dummy_upstart_job)) os.system("cat <<-END >>/usr/sbin/lrmd_dummy_daemon\n%s\nEND" % (dummy_daemon)) os.system("cat <<-END >>/lib/systemd/system/lrmd_dummy_daemon.service\n%s\nEND" % (dummy_service_file)) os.system("chmod a+x /usr/sbin/lrmd_dummy_daemon") os.system("cat <<-END 
>>/usr/sbin/fence_dummy_sleep\n%s\nEND" % (dummy_fence_sleep_agent)) os.system("chmod 711 /usr/sbin/fence_dummy_sleep") os.system("cat <<-END >>/usr/sbin/fence_dummy_monitor\n%s\nEND" % (dummy_fence_agent)) os.system("chmod 711 /usr/sbin/fence_dummy_monitor") if os.path.exists("%s/cts/LSBDummy" % BUILD_DIR): print("Using %s/cts/LSBDummy" % BUILD_DIR) os.system("cp %s/cts/LSBDummy /etc/init.d/LSBDummy" % BUILD_DIR) if not os.path.exists("@OCF_RA_DIR@/pacemaker"): os.system("mkdir -p @OCF_RA_DIR@/pacemaker/") # Install helper OCF agents for agent in ["Dummy", "Stateful", "ping"]: os.system("cp %s/extra/resources/%s @OCF_RA_DIR@/pacemaker/%s" % (BUILD_DIR, agent, agent)) os.system("chmod a+x @OCF_RA_DIR@/pacemaker/%s" % (agent)) else: # Assume it's installed print("Using @datadir@/@PACKAGE@/tests/cts/LSBDummy") os.system("cp @datadir@/@PACKAGE@/tests/cts/LSBDummy /etc/init.d/LSBDummy") os.system("chmod a+x /etc/init.d/LSBDummy") os.system("ls -al /etc/init.d/LSBDummy") os.system("mkdir -p @CRM_CORE_DIR@/root") os.system("mkdir -p /etc/ha.d/resource.d") if os.path.exists("%s/cts/HBDummy" % BUILD_DIR): print("Using %s/cts/HBDummy" % BUILD_DIR) os.system("cp %s/cts/HBDummy /etc/ha.d/resource.d/HBDummy" % BUILD_DIR) else: # Assume it's installed print("Using @datadir@/@PACKAGE@/tests/cts/HBDummy") os.system("cp @datadir@/@PACKAGE@/tests/cts/HBDummy /etc/ha.d/resource.d/HBDummy") os.system("chmod a+x /etc/ha.d/resource.d/HBDummy") os.system("ls -al /etc/ha.d/resource.d/HBDummy") if os.path.exists("/bin/systemctl"): os.system("systemctl daemon-reload") def cleanup_test_environment(self): """ Clean up the host after executing desired tests """ if self.need_authkey: os.system("rm -f /etc/pacemaker/authkey") os.system("rm -f /etc/init.d/LSBDummy") os.system("rm -f /etc/ha.d/resource.d/HBDummy") os.system("rm -f /lib/systemd/system/lrmd_dummy_daemon.service") os.system("rm -f /usr/sbin/lrmd_dummy_daemon") os.system("rm -f /usr/sbin/fence_dummy_monitor") os.system("rm -f 
/usr/sbin/fence_dummy_sleep") if os.path.exists("/bin/systemctl"): os.system("systemctl daemon-reload") def build_generic_tests(self): """ Register tests that apply to all resource classes """ common_cmds = self.common_cmds ### register/unregister tests ### for rsc in self.rsc_classes: test = self.new_test("generic_registration_%s" % (rsc), "Simple resource registration test for %s standard" % (rsc)) test.add_cmd(common_cmds["%s_reg_line" % (rsc)] + " " + common_cmds["%s_reg_event" % (rsc)]) test.add_cmd(common_cmds["%s_unreg_line" % (rsc)] + " " + common_cmds["%s_unreg_event" % (rsc)]) ### start/stop tests ### for rsc in self.rsc_classes: test = self.new_test("generic_start_stop_%s" % (rsc), "Simple start and stop test for %s standard" % (rsc)) test.add_cmd(common_cmds["%s_reg_line" % (rsc)] + " " + common_cmds["%s_reg_event" % (rsc)]) test.add_cmd(common_cmds["%s_start_line" % (rsc)] + " " + common_cmds["%s_start_event" % (rsc)]) test.add_cmd(common_cmds["%s_stop_line" % (rsc)] + " " + common_cmds["%s_stop_event" % (rsc)]) test.add_cmd(common_cmds["%s_unreg_line" % (rsc)] + " " + common_cmds["%s_unreg_event" % (rsc)]) ### monitor cancel test ### for rsc in self.rsc_classes: test = self.new_test("generic_monitor_cancel_%s" % (rsc), "Simple monitor cancel test for %s standard" % (rsc)) test.add_cmd(common_cmds["%s_reg_line" % (rsc)] + " " + common_cmds["%s_reg_event" % (rsc)]) test.add_cmd(common_cmds["%s_start_line" % (rsc)] + " " + common_cmds["%s_start_event" % (rsc)]) test.add_cmd(common_cmds["%s_monitor_line" % (rsc)] + " " + common_cmds["%s_monitor_event" % (rsc)]) ### If this fails, that means the monitor may not be getting rescheduled #### test.add_cmd(common_cmds["%s_monitor_event" % (rsc)]) ### If this fails, that means the monitor may not be getting rescheduled #### test.add_cmd(common_cmds["%s_monitor_event" % (rsc)]) test.add_cmd(common_cmds["%s_cancel_line" % (rsc)] + " " + common_cmds["%s_cancel_event" % (rsc)]) ### If this happens the monitor did 
not actually cancel correctly. ### test.add_expected_fail_cmd(common_cmds["%s_monitor_event" % (rsc)]) ### If this happens the monitor did not actually cancel correctly. ### test.add_expected_fail_cmd(common_cmds["%s_monitor_event" % (rsc)]) test.add_cmd(common_cmds["%s_stop_line" % (rsc)] + " " + common_cmds["%s_stop_event" % (rsc)]) test.add_cmd(common_cmds["%s_unreg_line" % (rsc)] + " " + common_cmds["%s_unreg_event" % (rsc)]) ### monitor duplicate test ### for rsc in self.rsc_classes: test = self.new_test("generic_monitor_duplicate_%s" % (rsc), "Test creation and canceling of duplicate monitors for %s standard" % (rsc)) test.add_cmd(common_cmds["%s_reg_line" % (rsc)] + " " + common_cmds["%s_reg_event" % (rsc)]) test.add_cmd(common_cmds["%s_start_line" % (rsc)] + " " + common_cmds["%s_start_event" % (rsc)]) test.add_cmd(common_cmds["%s_monitor_line" % (rsc)] + " " + common_cmds["%s_monitor_event" % (rsc)]) ### If this fails, that means the monitor may not be getting rescheduled #### test.add_cmd(common_cmds["%s_monitor_event" % (rsc)]) ### If this fails, that means the monitor may not be getting rescheduled #### test.add_cmd(common_cmds["%s_monitor_event" % (rsc)]) # Add the duplicate monitors test.add_cmd(common_cmds["%s_monitor_line" % (rsc)] + " " + common_cmds["%s_monitor_event" % (rsc)]) test.add_cmd(common_cmds["%s_monitor_line" % (rsc)] + " " + common_cmds["%s_monitor_event" % (rsc)]) test.add_cmd(common_cmds["%s_monitor_line" % (rsc)] + " " + common_cmds["%s_monitor_event" % (rsc)]) test.add_cmd(common_cmds["%s_monitor_line" % (rsc)] + " " + common_cmds["%s_monitor_event" % (rsc)]) # verify we still get update events ### If this fails, that means the monitor may not be getting rescheduled #### test.add_cmd(common_cmds["%s_monitor_event" % (rsc)]) # cancel the monitor, if the duplicate merged with the original, we should no longer see monitor updates test.add_cmd(common_cmds["%s_cancel_line" % (rsc)] + " " + common_cmds["%s_cancel_event" % (rsc)]) ### If 
this happens the monitor did not actually cancel correctly. ### test.add_expected_fail_cmd(common_cmds["%s_monitor_event" % (rsc)]) ### If this happens the monitor did not actually cancel correctly. ### test.add_expected_fail_cmd(common_cmds["%s_monitor_event" % (rsc)]) test.add_cmd(common_cmds["%s_stop_line" % (rsc)] + " " + common_cmds["%s_stop_event" % (rsc)]) test.add_cmd(common_cmds["%s_unreg_line" % (rsc)] + " " + common_cmds["%s_unreg_event" % (rsc)]) ### stop implies cancel test ### for rsc in self.rsc_classes: test = self.new_test("generic_stop_implies_cancel_%s" % (rsc), "Verify stopping a resource implies cancel of recurring ops for %s standard" % (rsc)) test.add_cmd(common_cmds["%s_reg_line" % (rsc)] + " " + common_cmds["%s_reg_event" % (rsc)]) test.add_cmd(common_cmds["%s_start_line" % (rsc)] + " " + common_cmds["%s_start_event" % (rsc)]) test.add_cmd(common_cmds["%s_monitor_line" % (rsc)] + " " + common_cmds["%s_monitor_event" % (rsc)]) ### If this fails, that means the monitor may not be getting rescheduled #### test.add_cmd(common_cmds["%s_monitor_event" % (rsc)]) ### If this fails, that means the monitor may not be getting rescheduled #### test.add_cmd(common_cmds["%s_monitor_event" % (rsc)]) test.add_cmd(common_cmds["%s_stop_line" % (rsc)] + " " + common_cmds["%s_stop_event" % (rsc)]) ### If this happens the monitor did not actually cancel correctly. ### test.add_expected_fail_cmd(common_cmds["%s_monitor_event" % (rsc)]) ### If this happens the monitor did not actually cancel correctly. ### test.add_expected_fail_cmd(common_cmds["%s_monitor_event" % (rsc)]) test.add_cmd(common_cmds["%s_unreg_line" % (rsc)] + " " + common_cmds["%s_unreg_event" % (rsc)]) def build_multi_rsc_tests(self): """ Register complex tests that involve managing multiple resources of different types """ common_cmds = self.common_cmds # do not use service and systemd at the same time, it is the same resource.
### register start monitor stop unregister resources of each type at the same time. ### test = self.new_test("multi_rsc_start_stop_all", "Start, monitor, and stop resources of multiple types and classes") for rsc in self.rsc_classes: test.add_cmd(common_cmds["%s_reg_line" % (rsc)] + " " + common_cmds["%s_reg_event" % (rsc)]) for rsc in self.rsc_classes: test.add_cmd(common_cmds["%s_start_line" % (rsc)] + " " + common_cmds["%s_start_event" % (rsc)]) for rsc in self.rsc_classes: test.add_cmd(common_cmds["%s_monitor_line" % (rsc)] + " " + common_cmds["%s_monitor_event" % (rsc)]) for rsc in self.rsc_classes: ### If this fails, that means the monitor is not being rescheduled #### test.add_cmd(common_cmds["%s_monitor_event" % (rsc)]) for rsc in self.rsc_classes: test.add_cmd(common_cmds["%s_cancel_line" % (rsc)] + " " + common_cmds["%s_cancel_event" % (rsc)]) for rsc in self.rsc_classes: test.add_cmd(common_cmds["%s_stop_line" % (rsc)] + " " + common_cmds["%s_stop_event" % (rsc)]) for rsc in self.rsc_classes: test.add_cmd(common_cmds["%s_unreg_line" % (rsc)] + " " + common_cmds["%s_unreg_event" % (rsc)]) def build_negative_tests(self): """ Register tests related to how the lrmd handles failures """ ### ocf start timeout test ### test = self.new_test("ocf_start_timeout", "Force start timeout to occur, verify start failure.") test.add_cmd("-c register_rsc -r \"test_rsc\" -C \"ocf\" -P \"pacemaker\" -T \"Dummy\" " + self.action_timeout + "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") # -t must be less than self.action_timeout test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" -k \"op_sleep\" -v \"5\" -t 1000 -w") test.add_cmd('-l "NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:unknown error op_status:Timed Out" ' + self.action_timeout) test.add_cmd("-c exec -r test_rsc -a stop " + self.action_timeout + "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:stop rc:ok op_status:complete\" ") 
test.add_cmd("-c unregister_rsc -r test_rsc " + self.action_timeout + "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### stonith start timeout test ### test = self.new_test("stonith_start_timeout", "Force start timeout to occur, verify start failure.") test.add_cmd("-c register_rsc -r \"test_rsc\" -C \"stonith\" -P \"pacemaker\" -T \"fence_dummy_sleep\" " + self.action_timeout + "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" -t 1000 -w") # -t must be less than self.action_timeout test.add_cmd('-l "NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:unknown error op_status:Timed Out" ' + self.action_timeout) test.add_cmd("-c exec -r test_rsc -a stop " + self.action_timeout + "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:stop rc:ok op_status:complete\" ") test.add_cmd("-c unregister_rsc -r test_rsc " + self.action_timeout + "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### stonith component fail ### common_cmds = self.common_cmds test = self.new_test("stonith_component_fail", "Kill stonith component after lrmd connects") test.add_cmd(common_cmds["stonith_reg_line"] + " " + common_cmds["stonith_reg_event"]) test.add_cmd(common_cmds["stonith_start_line"] + " " + common_cmds["stonith_start_event"]) test.add_cmd('-c exec -r "stonith_test_rsc" -a "monitor" -i "600000" -l "NEW_EVENT event_type:exec_complete rsc_id:stonith_test_rsc action:monitor rc:ok op_status:complete" ' + self.action_timeout) test.add_cmd_and_kill("killall -9 -q stonithd lt-stonithd", '-l "NEW_EVENT event_type:exec_complete rsc_id:stonith_test_rsc action:monitor rc:unknown error op_status:error" -t 15000') test.add_cmd(common_cmds["stonith_unreg_line"] + " " + common_cmds["stonith_unreg_event"]) ### monitor fail for ocf resources ### test = self.new_test("monitor_fail_ocf", 
"Force ocf monitor to fail, verify failure is reported.") test.add_cmd("-c register_rsc -r \"test_rsc\" -C \"ocf\" -P \"pacemaker\" -T \"Dummy\" " + self.action_timeout + "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" " + self.action_timeout + "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" " + self.action_timeout + "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd('-c exec -r "test_rsc" -a "monitor" -i "100" ' + self.action_timeout + '-l "NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete"') test.add_cmd('-l "NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete"' + self.action_timeout) test.add_cmd('-l "NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete"' + self.action_timeout) test.add_cmd_and_kill("rm -f @localstatedir@/run/Dummy-test_rsc.state", '-l "NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:complete" -t 6000') test.add_cmd("-c cancel -r \"test_rsc\" -a \"monitor\" -i \"100\" -t \"6000\" " "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:Cancelled\" ") test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:complete\" " + self.action_timeout) test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" " + self.action_timeout) test.add_cmd("-c unregister_rsc -r \"test_rsc\" " + self.action_timeout + "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### verify notify changes only for monitor operation. 
### test = self.new_test("monitor_changes_only", "Verify when flag is set, only monitor changes are notified.") test.add_cmd("-c register_rsc -r \"test_rsc\" -C \"ocf\" -P \"pacemaker\" -T \"Dummy\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+" -o " "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd('-c exec -r "test_rsc" -a "monitor" -i "100" ' + self.action_timeout + ' -o -l "NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete" ') test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd_and_kill("rm -f @localstatedir@/run/Dummy-test_rsc.state", "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:complete\" -t 6000") test.add_cmd("-c cancel -r \"test_rsc\" -a \"monitor\" -i \"100\" -t \"6000\" " "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:Cancelled\" ") test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:complete\" "+self.action_timeout) test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd('-c unregister_rsc -r "test_rsc" ' + self.action_timeout + '-l "NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete"') ### monitor fail for systemd resource ### if "systemd" in self.rsc_classes: test = self.new_test("monitor_fail_systemd", "Force systemd monitor to fail, verify failure is reported.") test.add_cmd("-c register_rsc -r \"test_rsc\" -C systemd -T lrmd_dummy_daemon "+self.action_timeout+ "-l \"NEW_EVENT
event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"monitor\" -i \"100\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" ") test.add_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd_and_kill("killall -9 -q lrmd_dummy_daemon", "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:complete\" -t 8000") test.add_cmd("-c cancel -r \"test_rsc\" -a \"monitor\" -i \"100\" -t \"6000\" " "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:Cancelled\" ") test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:complete\" "+self.action_timeout) test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd("-c unregister_rsc -r \"test_rsc\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### monitor fail for upstart resource ### if "upstart" in self.rsc_classes: test = self.new_test("monitor_fail_upstart", "Force upstart monitor to fail, verify failure is reported.") test.add_cmd("-c register_rsc -r \"test_rsc\" -C upstart -T lrmd_dummy_daemon
"+self.action_timeout+ "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"monitor\" -i \"100\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" ") test.add_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd_and_kill("killall -9 -q dd", "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:complete\" -t 8000") test.add_cmd("-c cancel -r \"test_rsc\" -a \"monitor\" -i \"100\" -t \"6000\" " "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:Cancelled\" ") test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:complete\" "+self.action_timeout) test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd("-c unregister_rsc -r \"test_rsc\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### Cancel non-existent operation on a resource ### test = self.new_test("cancel_non_existent_op", "Attempt to cancel the wrong monitor operation, verify expected failure") test.add_cmd("-c register_rsc -r \"test_rsc\" -C \"ocf\" -P \"pacemaker\" -T 
\"Dummy\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"monitor\" -i \"100\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" ") test.add_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_expected_fail_cmd("-c cancel -r test_rsc -a \"monitor\" -i 1234 -t \"6000\" " ### interval is wrong, should fail "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:Cancelled\" ") test.add_expected_fail_cmd("-c cancel -r test_rsc -a stop -i 100 -t \"6000\" " ### action name is wrong, should fail "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:not running op_status:Cancelled\" ") test.add_cmd("-c unregister_rsc -r \"test_rsc\" " + self.action_timeout + "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### Attempt to invoke non-existent rsc id ### test = self.new_test("invoke_non_existent_rsc", "Attempt to perform operations on a non-existent rsc id.") test.add_expected_fail_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:unknown error op_status:complete\" ") test.add_expected_fail_cmd("-c exec -r test_rsc -a stop "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:stop rc:ok op_status:complete\" ") test.add_expected_fail_cmd("-c exec 
-r test_rsc -a monitor -i 6000 "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" ") test.add_expected_fail_cmd("-c cancel -r test_rsc -a start "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:Cancelled\" ") test.add_cmd("-c unregister_rsc -r \"test_rsc\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### Register and start a resource that doesn't exist, systemd ### if "systemd" in self.rsc_classes: test = self.new_test("start_uninstalled_systemd", "Register uninstalled systemd agent, try to start, verify expected failure") test.add_cmd("-c register_rsc -r \"test_rsc\" -C systemd -T this_is_fake1234 "+self.action_timeout+ "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:not installed op_status:Not installed\" ") test.add_cmd("-c unregister_rsc -r \"test_rsc\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") if "upstart" in self.rsc_classes: test = self.new_test("start_uninstalled_upstart", "Register uninstalled upstart agent, try to start, verify expected failure") test.add_cmd("-c register_rsc -r \"test_rsc\" -C upstart -T this_is_fake1234 "+self.action_timeout+ "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:not installed op_status:Not installed\" ") test.add_cmd("-c unregister_rsc -r \"test_rsc\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### 
Register and start a resource that doesn't exist, ocf ### test = self.new_test("start_uninstalled_ocf", "Register uninstalled ocf agent, try to start, verify expected failure.") test.add_cmd("-c register_rsc -r \"test_rsc\" -C ocf -P pacemaker -T this_is_fake1234 "+self.action_timeout+ "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:not installed op_status:Not installed\" ") test.add_cmd("-c unregister_rsc -r \"test_rsc\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### Register ocf with non-existent provider ### test = self.new_test("start_ocf_bad_provider", "Register ocf agent with a non-existent provider, verify expected failure.") test.add_cmd("-c register_rsc -r \"test_rsc\" -C ocf -P pancakes -T Dummy "+self.action_timeout+ "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:not installed op_status:Not installed\" ") test.add_cmd("-c unregister_rsc -r \"test_rsc\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### Register ocf with empty provider field ### test = self.new_test("start_ocf_no_provider", "Register ocf agent with no provider, verify expected failure.") test.add_expected_fail_cmd("-c register_rsc -r \"test_rsc\" -C ocf -T Dummy "+self.action_timeout+ "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_expected_fail_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:Error\" ")
test.add_cmd("-c unregister_rsc -r \"test_rsc\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") def build_stress_tests(self): """ Register stress tests """ timeout = "-t 20000" iterations = 25 test = self.new_test("ocf_stress", "Verify OCF agent handling works under load") for i in range(iterations): test.add_cmd("-c register_rsc -r rsc_%s %s -C ocf -P heartbeat -T Dummy -l \"NEW_EVENT event_type:register rsc_id:rsc_%s action:none rc:ok op_status:complete\"" % (i, timeout, i)) test.add_cmd("-c exec -r rsc_%s -a start %s -l \"NEW_EVENT event_type:exec_complete rsc_id:rsc_%s action:start rc:ok op_status:complete\"" % (i, timeout, i)) test.add_cmd("-c exec -r rsc_%s -a monitor %s -i 1000 -l \"NEW_EVENT event_type:exec_complete rsc_id:rsc_%s action:monitor rc:ok op_status:complete\"" % (i, timeout, i)) for i in range(iterations): test.add_cmd("-c exec -r rsc_%s -a stop %s -l \"NEW_EVENT event_type:exec_complete rsc_id:rsc_%s action:stop rc:ok op_status:complete\"" % (i, timeout, i)) test.add_cmd("-c unregister_rsc -r rsc_%s %s -l \"NEW_EVENT event_type:unregister rsc_id:rsc_%s action:none rc:ok op_status:complete\"" % (i, timeout, i)) if "systemd" in self.rsc_classes: test = self.new_test("systemd_stress", "Verify systemd dbus connection works under load") for i in range(iterations): test.add_cmd("-c register_rsc -r rsc_%s %s -C systemd -T lrmd_dummy_daemon -l \"NEW_EVENT event_type:register rsc_id:rsc_%s action:none rc:ok op_status:complete\"" % (i, timeout, i)) test.add_cmd("-c exec -r rsc_%s -a start %s -l \"NEW_EVENT event_type:exec_complete rsc_id:rsc_%s action:start rc:ok op_status:complete\"" % (i, timeout, i)) test.add_cmd("-c exec -r rsc_%s -a monitor %s -i 1000 -l \"NEW_EVENT event_type:exec_complete rsc_id:rsc_%s action:monitor rc:ok op_status:complete\"" % (i, timeout, i)) for i in range(iterations): test.add_cmd("-c exec -r rsc_%s -a stop %s -l \"NEW_EVENT event_type:exec_complete 
rsc_id:rsc_%s action:stop rc:ok op_status:complete\"" % (i, timeout, i)) test.add_cmd("-c unregister_rsc -r rsc_%s %s -l \"NEW_EVENT event_type:unregister rsc_id:rsc_%s action:none rc:ok op_status:complete\"" % (i, timeout, i)) iterations = 9 timeout = "-t 30000" ### Verify recurring op in-flight collision is handled in series properly test = self.new_test("rsc_inflight_collision", "Verify recurring ops do not collide with other operations for the same rsc.") test.add_cmd("-c register_rsc -r test_rsc -P pacemaker -C ocf -T Dummy " "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd("-c exec -r test_rsc -a start %s -k op_sleep -v 1 -l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\"" % (timeout)) for i in range(iterations): test.add_cmd("-c exec -r test_rsc -a monitor %s -i 100%d -k op_sleep -v 2 -l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\"" % (timeout, i)) # test.add_sys_cmd("sleep", "-al @CRM_RSCTMP_DIR@") test.add_cmd("-c exec -r test_rsc -a stop %s -l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:stop rc:ok op_status:complete\"" % (timeout)) test.add_cmd("-c unregister_rsc -r test_rsc %s -l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\"" % (timeout)) def build_custom_tests(self): """ Register tests that target specific cases """ ### verify resource temporary folder is created and used by heartbeat agents. 
### test = self.new_test("rsc_tmp_dir", "Verify creation and use of rsc temporary state directory") test.add_sys_cmd("ls", "-al @CRM_RSCTMP_DIR@") test.add_cmd("-c register_rsc -r test_rsc -P heartbeat -C ocf -T Dummy " "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd("-c exec -r test_rsc -a start -t 4000") test.add_sys_cmd("ls", "-al @CRM_RSCTMP_DIR@") test.add_sys_cmd("ls", "@CRM_RSCTMP_DIR@/Dummy-test_rsc.state") test.add_cmd("-c exec -r test_rsc -a stop -t 4000") test.add_cmd("-c unregister_rsc -r test_rsc "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### start delay then stop test ### test = self.new_test("start_delay", "Verify start delay works as expected.") test.add_cmd("-c register_rsc -r test_rsc -P pacemaker -C ocf -T Dummy " "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd("-c exec -r test_rsc -s 6000 -a start -w -t 6000") test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" -t 2000") test.add_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" -t 6000") test.add_cmd("-c exec -r test_rsc -a stop " + self.action_timeout + "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:stop rc:ok op_status:complete\" ") test.add_cmd("-c unregister_rsc -r test_rsc " + self.action_timeout + "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### start delay, but cancel before it gets a chance to start. 
### test = self.new_test("start_delay_cancel", "Using start_delay, start a rsc, but cancel the start op before execution.") test.add_cmd("-c register_rsc -r test_rsc -P pacemaker -C ocf -T Dummy " "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd("-c exec -r test_rsc -s 5000 -a start -w -t 4000") test.add_cmd("-c cancel -r test_rsc -a start " + self.action_timeout + "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:Cancelled\" ") test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" -t 5000") test.add_cmd("-c unregister_rsc -r test_rsc " + self.action_timeout + "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### Register a bunch of resources, verify we can get info on them ### test = self.new_test("verify_get_rsc_info", "Register multiple resources, verify retrieval of rsc info.") if "systemd" in self.rsc_classes: test.add_cmd("-c register_rsc -r rsc1 -C systemd -T lrmd_dummy_daemon "+self.action_timeout) test.add_cmd("-c get_rsc_info -r rsc1 ") test.add_cmd("-c unregister_rsc -r rsc1 "+self.action_timeout) test.add_expected_fail_cmd("-c get_rsc_info -r rsc1 ") if "upstart" in self.rsc_classes: test.add_cmd("-c register_rsc -r rsc1 -C upstart -T lrmd_dummy_daemon "+self.action_timeout) test.add_cmd("-c get_rsc_info -r rsc1 ") test.add_cmd("-c unregister_rsc -r rsc1 "+self.action_timeout) test.add_expected_fail_cmd("-c get_rsc_info -r rsc1 ") test.add_cmd("-c register_rsc -r rsc2 -C ocf -T Dummy -P pacemaker "+self.action_timeout) test.add_cmd("-c get_rsc_info -r rsc2 ") test.add_cmd("-c unregister_rsc -r rsc2 "+self.action_timeout) test.add_expected_fail_cmd("-c get_rsc_info -r rsc2 ") ### Register duplicate, verify only one entry exists and can still be removed. 
test = self.new_test("duplicate_registration", "Register resource multiple times, verify only one entry exists and can be removed.") test.add_cmd("-c register_rsc -r rsc2 -C ocf -T Dummy -P pacemaker "+self.action_timeout) test.add_cmd_check_stdout("-c get_rsc_info -r rsc2 ", "id:rsc2 class:ocf provider:pacemaker type:Dummy") test.add_cmd("-c register_rsc -r rsc2 -C ocf -T Dummy -P pacemaker "+self.action_timeout) test.add_cmd_check_stdout("-c get_rsc_info -r rsc2 ", "id:rsc2 class:ocf provider:pacemaker type:Dummy") test.add_cmd("-c register_rsc -r rsc2 -C ocf -T Stateful -P pacemaker "+self.action_timeout) test.add_cmd_check_stdout("-c get_rsc_info -r rsc2 ", "id:rsc2 class:ocf provider:pacemaker type:Stateful") test.add_cmd("-c unregister_rsc -r rsc2 "+self.action_timeout) test.add_expected_fail_cmd("-c get_rsc_info -r rsc2 ") ### verify the option to only send notification to the original client. ### test = self.new_test("notify_orig_client_only", "Verify option to only send notifications to the client originating the action.") test.add_cmd("-c register_rsc -r \"test_rsc\" -C \"ocf\" -P \"pacemaker\" -T \"Dummy\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"start\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:start rc:ok op_status:complete\" ") test.add_cmd("-c exec -r \"test_rsc\" -a \"monitor\" -i \"100\" "+self.action_timeout+" -n " "-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" ") # this will fail because the monitor notifications should only go to the original caller, which no longer exists. 
test.add_expected_fail_cmd("-l \"NEW_EVENT event_type:exec_complete rsc_id:test_rsc action:monitor rc:ok op_status:complete\" "+self.action_timeout) test.add_cmd("-c cancel -r \"test_rsc\" -a \"monitor\" -i \"100\" -t \"6000\" ") test.add_cmd("-c unregister_rsc -r \"test_rsc\" "+self.action_timeout+ "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") - ### Verify that versioned resource parameters are chosen properly + ### Verify that versioned resource parameters are chosen properly ### # We'll default to one state file, then check the version to use a different one state_file_default = "@CRM_RSCTMP_DIR@/versioned_params_default.state" state_file_expected = "@CRM_RSCTMP_DIR@/versioned_params_expected.state" # Versioned attributes are passed as a single key-value pair. # Here, we define the value to choose the state file; # ocf:heartbeat:Dummy should always be at least version 0.9. versioned_key = "#versioned_attributes" versioned_attrs = """ """ % (state_file_expected, state_file_default) cmd_params = "-k '%s' -v '%s' " % (versioned_key, versioned_attrs) test = self.new_test("versioned_params", "Verify use of versioned resource parameters") # First, remove the possible state files, so we can reliably tell what gets created. test.add_sys_cmd("rm", "-f %s %s" % (state_file_expected, state_file_default)) # Register and start the resource. test.add_cmd("-c register_rsc -r test_rsc -P heartbeat -C ocf -T Dummy " "-l 'NEW_EVENT event_type:register rsc_id:test_rsc action:none rc:ok op_status:complete' " "%s" % (self.action_timeout)) test.add_cmd("-c exec -r test_rsc -a start -t 6000 %s" % cmd_params) # Check the created state file. test.add_expected_fail_sys_cmd("ls", state_file_default, 2) test.add_sys_cmd("ls", state_file_expected) # Stop and unregister the resource. 
test.add_cmd("-c exec -r test_rsc -a stop -t 4000 %s" % cmd_params) test.add_cmd("-c unregister_rsc -r test_rsc " + self.action_timeout + "-l \"NEW_EVENT event_type:unregister rsc_id:test_rsc action:none rc:ok op_status:complete\" ") ### get metadata ### test = self.new_test("get_ocf_metadata", "Retrieve metadata for a resource") test.add_cmd_check_stdout("-c metadata -C \"ocf\" -P \"pacemaker\" -T \"Dummy\"", "resource-agent name=\"Dummy\"") test.add_cmd("-c metadata -C \"ocf\" -P \"pacemaker\" -T \"Stateful\"") test.add_expected_fail_cmd("-c metadata -P \"pacemaker\" -T \"Stateful\"") test.add_expected_fail_cmd("-c metadata -C \"ocf\" -P \"pacemaker\" -T \"fake_agent\"") ### get metadata ### test = self.new_test("get_lsb_metadata", "Retrieve metadata for a resource") test.add_cmd_check_stdout("-c metadata -C \"lsb\" -T \"LSBDummy\"", "resource-agent name='LSBDummy'") ### get stonith metadata ### test = self.new_test("get_stonith_metadata", "Retrieve stonith metadata for a resource") test.add_cmd_check_stdout("-c metadata -C \"stonith\" -P \"pacemaker\" -T \"fence_dummy_monitor\"", "resource-agent name=\"fence_dummy_monitor\"") ### get metadata ### if "systemd" in self.rsc_classes: test = self.new_test("get_systemd_metadata", "Retrieve metadata for a resource") test.add_cmd_check_stdout("-c metadata -C \"systemd\" -T \"lrmd_dummy_daemon\"", "resource-agent name=\"lrmd_dummy_daemon\"") ### get metadata ### if "upstart" in self.rsc_classes: test = self.new_test("get_upstart_metadata", "Retrieve metadata for a resource") test.add_cmd_check_stdout("-c metadata -C \"upstart\" -T \"lrmd_dummy_daemon\"", "resource-agent name=\"lrmd_dummy_daemon\"") if "heartbeat" in self.rsc_classes: test = self.new_test("get_heartbeat_metadata", "Retrieve metadata for a resource") test.add_cmd_check_stdout("-c metadata -C \"heartbeat\" -T \"HBDummy\"", "resource-agent name='HBDummy'") ### get ocf providers ### test = self.new_test("list_ocf_providers", "Retrieve list of available 
resource providers, verifies pacemaker is a provider.") test.add_cmd_check_stdout("-c list_ocf_providers ", "pacemaker") test.add_cmd_check_stdout("-c list_ocf_providers -T ping", "pacemaker") ### Verify agents only exist in their lists ### test = self.new_test("verify_agent_lists", "Verify the agent lists contain the right data.") test.add_cmd_check_stdout("-c list_agents ", "Stateful") ### ocf ### test.add_cmd_check_stdout("-c list_agents -C ocf", "Stateful") test.add_cmd_check_stdout("-c list_agents -C lsb", "", "Stateful") ### should not exist test.add_cmd_check_stdout("-c list_agents -C service", "", "Stateful") ### should not exist test.add_cmd_check_stdout("-c list_agents ", "LSBDummy") ### init.d ### test.add_cmd_check_stdout("-c list_agents -C lsb", "LSBDummy") test.add_cmd_check_stdout("-c list_agents -C service", "LSBDummy") test.add_cmd_check_stdout("-c list_agents -C ocf", "", "lrmd_dummy_daemon") ### should not exist test.add_cmd_check_stdout("-c list_agents -C ocf", "", "lrmd_dummy_daemon") ### should not exist test.add_cmd_check_stdout("-c list_agents -C lsb", "", "fence_dummy_monitor") ### should not exist test.add_cmd_check_stdout("-c list_agents -C service", "", "fence_dummy_monitor") ### should not exist test.add_cmd_check_stdout("-c list_agents -C ocf", "", "fence_dummy_monitor") ### should not exist if "systemd" in self.rsc_classes: test.add_cmd_check_stdout("-c list_agents ", "lrmd_dummy_daemon") ### systemd ### test.add_cmd_check_stdout("-c list_agents -C service", "LSBDummy") test.add_cmd_check_stdout("-c list_agents -C systemd", "", "Stateful") ### should not exist test.add_cmd_check_stdout("-c list_agents -C systemd", "lrmd_dummy_daemon") test.add_cmd_check_stdout("-c list_agents -C systemd", "", "fence_dummy_monitor") ### should not exist if "upstart" in self.rsc_classes: test.add_cmd_check_stdout("-c list_agents ", "lrmd_dummy_daemon") ### upstart ### test.add_cmd_check_stdout("-c list_agents -C service", "LSBDummy") 
test.add_cmd_check_stdout("-c list_agents -C upstart", "", "Stateful") ### should not exist test.add_cmd_check_stdout("-c list_agents -C upstart", "lrmd_dummy_daemon") test.add_cmd_check_stdout("-c list_agents -C upstart", "", "fence_dummy_monitor") ### should not exist if "stonith" in self.rsc_classes: test.add_cmd_check_stdout("-c list_agents -C stonith", "fence_dummy_monitor") ### stonith ### test.add_cmd_check_stdout("-c list_agents -C stonith", "", "lrmd_dummy_daemon") ### should not exist test.add_cmd_check_stdout("-c list_agents -C stonith", "", "Stateful") ### should not exist test.add_cmd_check_stdout("-c list_agents ", "fence_dummy_monitor") if "heartbeat" in self.rsc_classes: test.add_cmd_check_stdout("-c list_agents -C heartbeat", "HBDummy") test.add_cmd_check_stdout("-c list_agents -C heartbeat", "", "LSBDummy") ### should not exist test.add_cmd_check_stdout("-c list_agents -C service", "", "HBDummy") ### should not exist def print_list(self): """ List all registered tests """ print("\n==== %d TESTS FOUND ====" % (len(self.tests))) print("%35s - %s" % ("TEST NAME", "TEST DESCRIPTION")) print("%35s - %s" % ("--------------------", "--------------------")) for test in self.tests: print("%35s - %s" % (test.name, test.description)) print("==== END OF LIST ====\n") def run_single(self, name): """ Run a single named test """ for test in self.tests: if test.name == name: test.run() break def run_tests_matching(self, pattern): """ Run all tests whose name matches a pattern """ for test in self.tests: if test.name.count(pattern) != 0: test.run() def run_tests(self): """ Run all tests """ for test in self.tests: test.run() def exit(self): """ Exit (with error status code if any test failed) """ for test in self.tests: if test.executed == 0: continue if test.get_exitcode() != 0: sys.exit(-1) sys.exit(0) def print_results(self): """ Print summary of results of executed tests """ failures = 0 success = 0 print("\n\n======= FINAL RESULTS ==========") print("\n--- 
FAILURE RESULTS:") for test in self.tests: if test.executed == 0: continue if test.get_exitcode() != 0: failures = failures + 1 test.print_result(" ") else: success = success + 1 if failures == 0: print(" None") print("\n--- TOTALS\n Pass:%d\n Fail:%d\n" % (success, failures)) class TestOptions(object): """ Option handler """ def __init__(self): self.options = {} self.options['list-tests'] = 0 self.options['run-all'] = 1 self.options['run-only'] = "" self.options['run-only-pattern'] = "" self.options['verbose'] = 0 self.options['invalid-arg'] = "" self.options['show-usage'] = 0 self.options['pacemaker-remote'] = 0 def build_options(self, argv): """ Set options based on command-line arguments """ args = argv[1:] skip = 0 for i in range(0, len(args)): if skip: skip = 0 continue elif args[i] == "-h" or args[i] == "--help": self.options['show-usage'] = 1 elif args[i] == "-l" or args[i] == "--list-tests": self.options['list-tests'] = 1 elif args[i] == "-V" or args[i] == "--verbose": self.options['verbose'] = 1 elif args[i] == "-R" or args[i] == "--pacemaker-remote": self.options['pacemaker-remote'] = 1 elif args[i] == "-r" or args[i] == "--run-only": self.options['run-only'] = args[i+1] skip = 1 elif args[i] == "-p" or args[i] == "--run-only-pattern": self.options['run-only-pattern'] = args[i+1] skip = 1 def show_usage(self): """ Show command usage """ print("usage: " + sys.argv[0] + " [options]") print("If no options are provided, all tests will run") print("Options:") print("\t [--help | -h] Show usage") print("\t [--list-tests | -l] Print out all registered tests.") print("\t [--run-only | -r 'testname'] Run a specific test") print("\t [--verbose | -V] Verbose output") print("\t [--pacemaker-remote | -R] Test pacemaker-remote binary instead of lrmd.") print("\t [--run-only-pattern | -p 'string'] Run only tests containing the string value") print("\n\tExample: Run only the test 'start_stop'") print("\t\t python ./regression.py --run-only start_stop") print("\n\tExample:
Run only the tests with the string 'systemd' present in them") print("\t\t python ./regression.py --run-only-pattern systemd") def main(argv): """ Run lrmd regression tests as specified by arguments """ update_path() opts = TestOptions() opts.build_options(argv) tests = Tests(opts.options['verbose'], opts.options['pacemaker-remote']) tests.build_generic_tests() tests.build_multi_rsc_tests() tests.build_negative_tests() tests.build_custom_tests() tests.build_stress_tests() tests.setup_test_environment() print("Starting ...") if opts.options['list-tests']: tests.print_list() elif opts.options['show-usage']: opts.show_usage() elif opts.options['run-only-pattern'] != "": tests.run_tests_matching(opts.options['run-only-pattern']) tests.print_results() elif opts.options['run-only'] != "": tests.run_single(opts.options['run-only']) tests.print_results() else: tests.run_tests() tests.print_results() tests.cleanup_test_environment() tests.exit() if __name__ == "__main__": main(sys.argv) diff --git a/pacemaker.spec.in b/pacemaker.spec.in index b8ef84bc96..bdb4a9bd5e 100644 --- a/pacemaker.spec.in +++ b/pacemaker.spec.in @@ -1,777 +1,777 @@ # Globals and defines to control package behavior (configure these as desired) ## User and group to use for nonprivileged services %global uname hacluster %global gname haclient ## Where to install Pacemaker documentation %global pcmk_docdir %{_docdir}/%{name} ## GitHub entity that distributes source (for ease of using a fork) %global github_owner ClusterLabs ## Upstream pacemaker version, and its package version (specversion ## can be incremented to build packages reliably considered "newer" ## than previously built packages with the same pcmkversion) -%global pcmkversion 1.1.15 +%global pcmkversion 1.1.16 %global specversion 1 ## Upstream commit (or git tag, such as "Pacemaker-" plus the ## {pcmkversion} macro for an official release) to use for this package %global commit HEAD # Define globals for convenient use later ## Workaround to 
use parentheses in other globals %global lparen ( %global rparen ) ## Short version of git commit %define shortcommit %(c=%{commit}; case ${c} in Pacemaker-*%{rparen} echo ${c:10};; *%{rparen} echo ${c:0:7};; esac) ## Whether this is a release candidate %define pre_release %(s=%{shortcommit}; [ ${s: -4:3} != -rc ]; echo $?) ## Whether this is a development branch %define post_release %([ %{commit} = Pacemaker-%{shortcommit} ]; echo $?) ## Turn off auto-compilation of python files outside site-packages directory, ## so that the -devel package is multilib-compliant %define __os_install_post %(echo '%{__os_install_post}' | sed -e 's!/usr/lib[^[:space:]]*/brp-python-bytecompile[[:space:]].*$!!g') ## Heuristic used to infer bleeding-edge deployments that are ## less likely to have working versions of the documentation tools %define bleeding %(test ! -e /etc/yum.repos.d/fedora-rawhide.repo; echo $?) ## Corosync version %define cs_version %(pkg-config corosync --modversion 2>/dev/null | awk -F . 
'{print $1}') ## Where to install python site libraries (currently, this uses the unversioned ## python_sitearch macro to get the default system python, but at some point, ## we should explicitly choose python2_sitearch or python3_sitearch -- or both) %define py_site %{?python_sitearch}%{!?python_sitearch:%( python -c 'from distutils.sysconfig import get_python_lib as gpl; print(gpl(1))' 2>/dev/null)} ## Whether this platform defaults to using CMAN %define cman_native (0%{?el6} || (0%{?fedora} > 0 && 0%{?fedora} < 17)) ## Whether this platform defaults to using systemd as an init system ## (needs to be evaluated prior to BuildRequires being enumerated and ## installed as it's intended to conditionally select some of these, and ## for that there are only few indicators with varying reliability: ## - presence of systemd-defined macros (when building in a full-fledged ## environment, which is not the case with ordinary mock-based builds) ## - systemd-aware rpm as manifested with the presence of particular ## macro (rpm itself will trivially always be present when building) ## - existence of /usr/lib/os-release file, which is something heavily ## propagated by systemd project ## - when not good enough, there's always a possibility to check ## particular distro-specific macros (incl. version comparison) %define systemd_native (%{?_unitdir:1}%{?!_unitdir:0}%{nil \ } || %{?__transaction_systemd_inhibit:1}%{?!__transaction_systemd_inhibit:0}%{nil \ } || %(test -f /usr/lib/os-release; test $? -ne 0; echo $?)) # Definitions for backward compatibility with older RPM versions ## Ensure %license macro behaves consistently (older RPM will otherwise ## overwrite %license once it encounters "License:"). 
Courtesy Jason Tibbitts: ## https://pkgs.fedoraproject.org/cgit/rpms/epel-rpm-macros.git/tree/macros.zzz-epel?h=el6&id=e1adcb77 %if !%{defined _licensedir} %define description %{lua: rpm.define("license %doc") print("%description") } %endif # Define conditionals so that "rpmbuild --with <feature>" and # "rpmbuild --without <feature>" can enable and disable specific features ## Add option to enable support for stonith/external fencing agents %bcond_with stonithd ## Add option to create binaries suitable for use with profiling tools %bcond_with profiling ## Add option to create binaries with coverage analysis %bcond_with coverage ## Add option to skip generating documentation ## (the build tools aren't available everywhere) %bcond_without doc ## Add option to prefix package version with "0." ## (so later "official" packages will be considered updates) %bcond_with pre_release ## Add option to ship Upstart job files %bcond_with upstart_job ## Add option to turn off CMAN support on CMAN-native platforms %bcond_without cman ## Add option to turn off hardening of libraries and daemon executables %bcond_without hardening # Keep sane profiling data if requested %if %{with profiling} ## Disable -debuginfo package and stripping binaries/libraries %define debug_package %{nil} %endif # Define the release version %if %{with pre_release} || 0%{pre_release} %if 0%{pre_release} %define pcmk_release 0.%{specversion}.%(s=%{shortcommit}; echo ${s: -3}) %else %define pcmk_release 0.%{specversion}.%{shortcommit}.git %endif %else %if 0%{post_release} %define pcmk_release %{specversion}.%{shortcommit}.git %else %define pcmk_release %{specversion} %endif %endif Name: pacemaker Summary: Scalable High-Availability cluster resource manager Version: %{pcmkversion} Release: %{pcmk_release}%{?dist} %if %{defined _unitdir} License: GPLv2+ and LGPLv2+ %else # initscript is Revised BSD License: GPLv2+ and LGPLv2+ and BSD %endif Url: http://www.clusterlabs.org Group: System Environment/Daemons # eg.
https://github.com/ClusterLabs/pacemaker/archive/8ae45302394b039fb098e150f156df29fc0cb576/pacemaker-8ae4530.tar.gz Source0: https://github.com/%{github_owner}/%{name}/archive/%{commit}/%{name}-%{shortcommit}.tar.gz BuildRoot: %(mktemp -ud %{_tmppath}/%{name}-%{version}-%{release}-XXXXXX) AutoReqProv: on Requires: resource-agents Requires: %{name}-libs = %{version}-%{release} Requires: %{name}-cluster-libs = %{version}-%{release} Requires: %{name}-cli = %{version}-%{release} %if %{defined systemd_requires} %systemd_requires %endif # Pacemaker targets compatibility with python 2.6+ and 3.2+ Requires: python >= 2.6 BuildRequires: python-devel >= 2.6 # Pacemaker requires a minimum libqb functionality Requires: libqb >= 0.13.0 BuildRequires: libqb-devel >= 0.13.0 # Basics required for the build (even if usually satisfied through other BRs) BuildRequires: coreutils findutils grep sed # Required for core functionality BuildRequires: automake autoconf libtool pkgconfig libtool-ltdl-devel BuildRequires: pkgconfig(glib-2.0) libxml2-devel libxslt-devel libuuid-devel BuildRequires: bzip2-devel pam-devel # Required for agent_config.h which specifies the correct scratch directory BuildRequires: resource-agents # Enables optional functionality BuildRequires: ncurses-devel docbook-style-xsl BuildRequires: bison byacc flex help2man gnutls-devel pkgconfig(dbus-1) %if %{systemd_native} BuildRequires: pkgconfig(systemd) %endif %if %{with cman} && %{cman_native} BuildRequires: clusterlib-devel # pacemaker initscript: cman initscript, fence_tool (+ some soft-dependencies) # "post" scriptlet: ccs_update_schema Requires: cman %endif Requires: corosync BuildRequires: corosynclib-devel %if %{with stonithd} BuildRequires: cluster-glue-libs-devel %endif ## (note no avoiding effect when building through non-customized mock) %if !%{bleeding} %if %{with doc} BuildRequires: publican inkscape asciidoc %endif %endif %description Pacemaker is an advanced, scalable High-Availability cluster resource 
manager for Corosync, CMAN and/or Linux-HA. It supports more than 16 node clusters with significant capabilities for managing resources and dependencies. It will run scripts at initialization, when machines go up or down, when related resources fail and can be configured to periodically check resource health. Available rpmbuild rebuild options: --with(out) : cman coverage doc stonithd hardening pre_release profiling upstart_job %package cli License: GPLv2+ and LGPLv2+ Summary: Command line tools for controlling Pacemaker clusters Group: System Environment/Daemons Requires: %{name}-libs = %{version}-%{release} Requires: perl-TimeDate %description cli Pacemaker is an advanced, scalable High-Availability cluster resource manager for Corosync, CMAN and/or Linux-HA. The %{name}-cli package contains command line tools that can be used to query and control the cluster from machines that may, or may not, be part of the cluster. %package -n %{name}-libs License: GPLv2+ and LGPLv2+ Summary: Core Pacemaker libraries Group: System Environment/Daemons %description -n %{name}-libs Pacemaker is an advanced, scalable High-Availability cluster resource manager for Corosync, CMAN and/or Linux-HA. The %{name}-libs package contains shared libraries needed for cluster nodes and those just running the CLI tools. %package -n %{name}-cluster-libs License: GPLv2+ and LGPLv2+ Summary: Cluster Libraries used by Pacemaker Group: System Environment/Daemons Requires: %{name}-libs = %{version}-%{release} %description -n %{name}-cluster-libs Pacemaker is an advanced, scalable High-Availability cluster resource manager for Corosync, CMAN and/or Linux-HA. The %{name}-cluster-libs package contains cluster-aware shared libraries needed for nodes that will form part of the cluster nodes. 
%package remote %if %{defined _unitdir} License: GPLv2+ and LGPLv2+ %else # initscript is Revised BSD License: GPLv2+ and LGPLv2+ and BSD %endif Summary: Pacemaker remote daemon for non-cluster nodes Group: System Environment/Daemons Requires: %{name}-libs = %{version}-%{release} Requires: %{name}-cli = %{version}-%{release} Requires: resource-agents %if %{defined systemd_requires} %systemd_requires %endif %description remote Pacemaker is an advanced, scalable High-Availability cluster resource manager for Corosync, CMAN and/or Linux-HA. The %{name}-remote package contains the Pacemaker Remote daemon which is capable of extending pacemaker functionality to remote nodes not running the full corosync/cluster stack. %package -n %{name}-libs-devel License: GPLv2+ and LGPLv2+ Summary: Pacemaker development package Group: Development/Libraries Requires: %{name}-cts = %{version}-%{release} Requires: %{name}-libs = %{version}-%{release} Requires: %{name}-cluster-libs = %{version}-%{release} Requires: libtool-ltdl-devel libqb-devel libuuid-devel Requires: libxml2-devel libxslt-devel bzip2-devel glib2-devel Requires: corosynclib-devel %description -n %{name}-libs-devel Pacemaker is an advanced, scalable High-Availability cluster resource manager for Corosync, CMAN and/or Linux-HA. The %{name}-libs-devel package contains headers and shared libraries for developing tools for Pacemaker. 
%package cts License: GPLv2+ and LGPLv2+ Summary: Test framework for cluster-related technologies like Pacemaker Group: System Environment/Daemons Requires: python >= 2.6 Requires: %{name}-libs = %{version}-%{release} # systemd python bindings are separate package in some distros %if %{defined systemd_requires} %if 0%{?fedora} > 20 Requires: systemd-python %endif %if 0%{?rhel} > 6 Requires: systemd-python %endif %endif %description cts Test framework for cluster-related technologies like Pacemaker %package doc License: CC-BY-SA Summary: Documentation for Pacemaker Group: Documentation %description doc Documentation for Pacemaker. Pacemaker is an advanced, scalable High-Availability cluster resource manager for Corosync, CMAN and/or Linux-HA. %prep %setup -q -n %{name}-%{commit} # Force the local time # # 'git' sets the file date to the date of the last commit. # This can result in files having been created in the future # when building on machines in timezones 'behind' the one the # commit occurred in - which seriously confuses 'make' find . -exec touch \{\} \; %build # Early versions of autotools (e.g. 
RHEL <= 5) do not support --docdir export docdir=%{pcmk_docdir} export systemdunitdir=%{?_unitdir}%{?!_unitdir:no} %if %{with hardening} # prefer distro-provided hardening flags in case they are defined # through _hardening_{c,ld}flags macros, configure script will # use its own defaults otherwise; if such hardenings are completely # undesired, rpmbuild using "--without hardening" # (or "--define '_without_hardening 1'") export CFLAGS_HARDENED_EXE="%{?_hardening_cflags}" export CFLAGS_HARDENED_LIB="%{?_hardening_cflags}" export LDFLAGS_HARDENED_EXE="%{?_hardening_ldflags}" export LDFLAGS_HARDENED_LIB="%{?_hardening_ldflags}" %endif ./autogen.sh %{configure} \ %{?with_profiling: --with-profiling} \ %{?with_coverage: --with-coverage} \ %{!?with_cman: --without-cman} \ --without-heartbeat \ %{!?with_doc: --with-brand=} \ %{!?with_hardening: --disable-hardening} \ --with-initdir=%{_initrddir} \ --localstatedir=%{_var} \ --with-version=%{version}-%{release} %if 0%{?suse_version} >= 1200 # Fedora handles rpath removal automagically sed -i 's|^hardcode_libdir_flag_spec=.*|hardcode_libdir_flag_spec=""|g' libtool sed -i 's|^runpath_var=LD_RUN_PATH|runpath_var=DIE_RPATH_DIE|g' libtool %endif make %{_smp_mflags} V=1 all %install rm -rf %{buildroot} make DESTDIR=%{buildroot} docdir=%{pcmk_docdir} V=1 install mkdir -p ${RPM_BUILD_ROOT}%{_sysconfdir}/sysconfig install -m 644 mcp/pacemaker.sysconfig ${RPM_BUILD_ROOT}%{_sysconfdir}/sysconfig/pacemaker install -m 644 tools/crm_mon.sysconfig ${RPM_BUILD_ROOT}%{_sysconfdir}/sysconfig/crm_mon %if %{with upstart_job} mkdir -p ${RPM_BUILD_ROOT}%{_sysconfdir}/init install -m 644 mcp/pacemaker.upstart ${RPM_BUILD_ROOT}%{_sysconfdir}/init/pacemaker.conf install -m 644 mcp/pacemaker.combined.upstart ${RPM_BUILD_ROOT}%{_sysconfdir}/init/pacemaker.combined.conf install -m 644 tools/crm_mon.upstart ${RPM_BUILD_ROOT}%{_sysconfdir}/init/crm_mon.conf %endif %if %{defined _unitdir} mkdir -p ${RPM_BUILD_ROOT}%{_localstatedir}/lib/rpm-state/%{name} 
%endif # Scripts that should be executable chmod a+x %{buildroot}/%{_datadir}/pacemaker/tests/cts/CTSlab.py # These are not actually scripts find %{buildroot} -name '*.xml' -type f -print0 | xargs -0 chmod a-x # Don't package static libs find %{buildroot} -name '*.a' -type f -print0 | xargs -0 rm -f find %{buildroot} -name '*.la' -type f -print0 | xargs -0 rm -f # Do not package these either rm -f %{buildroot}/%{_libdir}/service_crm.so # Don't ship init scripts for systemd based platforms %if %{defined _unitdir} rm -f %{buildroot}/%{_initrddir}/pacemaker rm -f %{buildroot}/%{_initrddir}/pacemaker_remote %endif # Don't ship fence_pcmk where it has no use %if %{without cman} rm -f %{buildroot}/%{_sbindir}/fence_pcmk %endif %if %{with coverage} GCOV_BASE=%{buildroot}/%{_var}/lib/pacemaker/gcov mkdir -p $GCOV_BASE find . -name '*.gcno' -type f | while read F ; do D=`dirname $F` mkdir -p ${GCOV_BASE}/$D cp $F ${GCOV_BASE}/$D done %endif %clean rm -rf %{buildroot} %post %if %{defined _unitdir} %systemd_post pacemaker.service %else /sbin/chkconfig --add pacemaker || : %if %{with cman} && %{cman_native} # make fence_pcmk in cluster.conf valid instantly otherwise tools like ccs may # choke (until schema gets auto-regenerated on the next start of cluster), # per the protocol shared with other packages contributing to cluster.rng /usr/sbin/ccs_update_schema >/dev/null 2>&1 || : %endif %endif %preun %if %{defined _unitdir} %systemd_preun pacemaker.service %else /sbin/service pacemaker stop >/dev/null 2>&1 || : if [ $1 -eq 0 ]; then # Package removal, not upgrade /sbin/chkconfig --del pacemaker || : fi %endif %postun %if %{defined _unitdir} %systemd_postun_with_restart pacemaker.service %endif %pre remote %if %{defined _unitdir} # Stop the service before anything is touched, and remember to restart # it as one of the last actions (compared to using systemd_postun_with_restart, # this avoids suicide when sbd is in use) systemctl --quiet is-active pacemaker_remote if [ $? 
-eq 0 ] ; then mkdir -p %{_localstatedir}/lib/rpm-state/%{name} touch %{_localstatedir}/lib/rpm-state/%{name}/restart_pacemaker_remote systemctl stop pacemaker_remote >/dev/null 2>&1 else rm -f %{_localstatedir}/lib/rpm-state/%{name}/restart_pacemaker_remote fi %endif %post remote %if %{defined _unitdir} %systemd_post pacemaker_remote.service %else /sbin/chkconfig --add pacemaker_remote || : %endif %preun remote %if %{defined _unitdir} %systemd_preun pacemaker_remote.service %else /sbin/service pacemaker_remote stop >/dev/null 2>&1 || : if [ $1 -eq 0 ]; then # Package removal, not upgrade /sbin/chkconfig --del pacemaker_remote || : fi %endif %postun remote %if %{defined _unitdir} # This next line is a no-op, because we stopped the service earlier, but # we leave it here because it allows us to revert to the standard behavior # in the future if desired %systemd_postun_with_restart pacemaker_remote.service # Explicitly take care of removing the flag-file(s) upon final removal if [ $1 -eq 0 ] ; then rm -f %{_localstatedir}/lib/rpm-state/%{name}/restart_pacemaker_remote fi %endif %posttrans remote %if %{defined _unitdir} if [ -e %{_localstatedir}/lib/rpm-state/%{name}/restart_pacemaker_remote ] ; then systemctl start pacemaker_remote >/dev/null 2>&1 rm -f %{_localstatedir}/lib/rpm-state/%{name}/restart_pacemaker_remote fi %endif %post cli %if %{defined _unitdir} %systemd_post crm_mon.service %endif %preun cli %if %{defined _unitdir} %systemd_preun crm_mon.service %endif %postun cli %if %{defined _unitdir} %systemd_postun_with_restart crm_mon.service %endif %pre -n %{name}-libs getent group %{gname} >/dev/null || groupadd -r %{gname} -g 189 getent passwd %{uname} >/dev/null || useradd -r -g %{gname} -u 189 -s /sbin/nologin -c "cluster user" %{uname} exit 0 %post -n %{name}-libs -p /sbin/ldconfig %postun -n %{name}-libs -p /sbin/ldconfig %post -n %{name}-cluster-libs -p /sbin/ldconfig %postun -n %{name}-cluster-libs -p /sbin/ldconfig %files 
########################################################### %defattr(-,root,root) %config(noreplace) %{_sysconfdir}/sysconfig/pacemaker %{_sbindir}/pacemakerd %if %{defined _unitdir} %{_unitdir}/pacemaker.service %else %{_initrddir}/pacemaker %endif %exclude %{_libexecdir}/pacemaker/lrmd_test %exclude %{_sbindir}/pacemaker_remoted %{_libexecdir}/pacemaker/* %{_sbindir}/crm_attribute %{_sbindir}/crm_master %{_sbindir}/crm_node %{_sbindir}/fence_legacy %if %{with cman} %{_sbindir}/fence_pcmk %endif %{_sbindir}/stonith_admin %doc %{_mandir}/man7/crmd.* %doc %{_mandir}/man7/pengine.* %doc %{_mandir}/man7/stonithd.* %if %{without cman} || !%{cman_native} %doc %{_mandir}/man7/ocf_pacemaker_controld.* %endif %doc %{_mandir}/man7/ocf_pacemaker_o2cb.* %doc %{_mandir}/man7/ocf_pacemaker_remote.* %doc %{_mandir}/man8/crm_attribute.* %doc %{_mandir}/man8/crm_node.* %doc %{_mandir}/man8/crm_master.* %if %{with cman} %doc %{_mandir}/man8/fence_pcmk.* %endif %doc %{_mandir}/man8/fence_legacy.* %doc %{_mandir}/man8/pacemakerd.* %doc %{_mandir}/man8/stonith_admin.* %doc %{_datadir}/pacemaker/alerts %license licenses/GPLv2 %doc COPYING %doc ChangeLog %dir %attr (750, %{uname}, %{gname}) %{_var}/lib/pacemaker/cib %dir %attr (750, %{uname}, %{gname}) %{_var}/lib/pacemaker/pengine %if %{without cman} || !%{cman_native} /usr/lib/ocf/resource.d/pacemaker/controld %endif /usr/lib/ocf/resource.d/pacemaker/o2cb /usr/lib/ocf/resource.d/pacemaker/remote /usr/lib/ocf/resource.d/.isolation %if "%{?cs_version}" != "UNKNOWN" %if 0%{?cs_version} < 2 %{_libexecdir}/lcrso/pacemaker.lcrso %endif %endif %if %{with upstart_job} %config(noreplace) %{_sysconfdir}/init/pacemaker.conf %config(noreplace) %{_sysconfdir}/init/pacemaker.combined.conf %endif %files cli %defattr(-,root,root) %config(noreplace) %{_sysconfdir}/logrotate.d/pacemaker %config(noreplace) %{_sysconfdir}/sysconfig/crm_mon %if %{defined _unitdir} %{_unitdir}/crm_mon.service %endif %if %{with upstart_job} %config(noreplace) 
%{_sysconfdir}/init/crm_mon.conf %endif %{_sbindir}/attrd_updater %{_sbindir}/cibadmin %{_sbindir}/crm_diff %{_sbindir}/crm_error %{_sbindir}/crm_failcount %{_sbindir}/crm_mon %{_sbindir}/crm_resource %{_sbindir}/crm_standby %{_sbindir}/crm_verify %{_sbindir}/crmadmin %{_sbindir}/iso8601 %{_sbindir}/crm_shadow %{_sbindir}/crm_simulate %{_sbindir}/crm_report %{_sbindir}/crm_ticket %exclude %{_datadir}/pacemaker/alerts %exclude %{_datadir}/pacemaker/tests %{_datadir}/pacemaker %{_datadir}/snmp/mibs/PCMK-MIB.txt %exclude /usr/lib/ocf/resource.d/pacemaker/controld %exclude /usr/lib/ocf/resource.d/pacemaker/o2cb %exclude /usr/lib/ocf/resource.d/pacemaker/remote %dir /usr/lib/ocf %dir /usr/lib/ocf/resource.d /usr/lib/ocf/resource.d/pacemaker %doc %{_mandir}/man7/* %exclude %{_mandir}/man7/crmd.* %exclude %{_mandir}/man7/pengine.* %exclude %{_mandir}/man7/stonithd.* %exclude %{_mandir}/man7/ocf_pacemaker_controld.* %exclude %{_mandir}/man7/ocf_pacemaker_o2cb.* %exclude %{_mandir}/man7/ocf_pacemaker_remote.* %doc %{_mandir}/man8/* %exclude %{_mandir}/man8/crm_attribute.* %exclude %{_mandir}/man8/crm_node.* %exclude %{_mandir}/man8/crm_master.* %exclude %{_mandir}/man8/fence_pcmk.* %exclude %{_mandir}/man8/fence_legacy.* %exclude %{_mandir}/man8/pacemakerd.* %exclude %{_mandir}/man8/pacemaker_remoted.* %exclude %{_mandir}/man8/stonith_admin.* %license licenses/GPLv2 %doc COPYING %doc ChangeLog %dir %attr (750, %{uname}, %{gname}) %{_var}/lib/pacemaker %dir %attr (750, %{uname}, %{gname}) %{_var}/lib/pacemaker/blackbox %dir %attr (750, %{uname}, %{gname}) %{_var}/lib/pacemaker/cores %files -n %{name}-libs %defattr(-,root,root) %{_libdir}/libcib.so.* %{_libdir}/liblrmd.so.* %{_libdir}/libcrmservice.so.* %{_libdir}/libcrmcommon.so.* %{_libdir}/libpe_status.so.* %{_libdir}/libpe_rules.so.* %{_libdir}/libpengine.so.* %{_libdir}/libstonithd.so.* %{_libdir}/libtransitioner.so.* %license licenses/LGPLv2.1 %doc COPYING %doc ChangeLog %files -n %{name}-cluster-libs 
%defattr(-,root,root) %{_libdir}/libcrmcluster.so.* %license licenses/LGPLv2.1 %doc COPYING %doc ChangeLog %files remote %defattr(-,root,root) %config(noreplace) %{_sysconfdir}/sysconfig/pacemaker %if %{defined _unitdir} # state directory is shared between the subpackets # let rpm take care of removing it once it isn't # referenced anymore and empty %ghost %dir %{_localstatedir}/lib/rpm-state/%{name} %{_unitdir}/pacemaker_remote.service %else %{_initrddir}/pacemaker_remote %endif %{_sbindir}/pacemaker_remoted %{_mandir}/man8/pacemaker_remoted.* %license licenses/GPLv2 %doc COPYING %doc ChangeLog %files doc %defattr(-,root,root) %doc %{pcmk_docdir} %license licenses/CC-BY-SA-4.0 %files cts %defattr(-,root,root) %{py_site}/cts %{_datadir}/pacemaker/tests/cts %{_libexecdir}/pacemaker/lrmd_test %license licenses/GPLv2 %doc COPYING %doc ChangeLog %files -n %{name}-libs-devel %defattr(-,root,root) %exclude %{_datadir}/pacemaker/tests/cts %{_datadir}/pacemaker/tests %{_includedir}/pacemaker %{_libdir}/*.so %if %{with coverage} %{_var}/lib/pacemaker/gcov %endif %{_libdir}/pkgconfig/*.pc %license licenses/LGPLv2.1 %doc COPYING %doc ChangeLog %changelog diff --git a/pengine/Makefile.am b/pengine/Makefile.am index a980347e4f..5131dcb591 100644 --- a/pengine/Makefile.am +++ b/pengine/Makefile.am @@ -1,90 +1,90 @@ # # Copyright (C) 2004 Andrew Beekhof # # This program is free software; you can redistribute it and/or # modify it under the terms of the GNU General Public License # as published by the Free Software Foundation; either version 2 # of the License, or (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
 #
 # You should have received a copy of the GNU General Public License
 # along with this program; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
 #
 include $(top_srcdir)/Makefile.common

 AM_CPPFLAGS += -I$(top_builddir) -I$(top_srcdir)

 halibdir = $(CRM_DAEMON_DIR)

 PE_TESTS = $(wildcard test10/*.scores)

 testdir = $(datadir)/$(PACKAGE)/tests/pengine
 test_SCRIPTS = regression.sh
 test_DATA = regression.core.sh

 test10dir = $(datadir)/$(PACKAGE)/tests/pengine/test10
 test10_DATA = $(PE_TESTS) $(PE_TESTS:%.scores=%.xml) $(PE_TESTS:%.scores=%.exp) $(PE_TESTS:%.scores=%.dot) $(PE_TESTS:%.scores=%.summary) $(wildcard test10/*.stderr)

 beekhof:
 	echo $(shell ls -1 test10/*.xml)

 #TESTS = test10/*.xml
 TESTS = test10/bug-rh-1097457.xml
 TEST_EXTENSIONS = .xml
 XML_LOG_COMPILER = ./regression.sh
 AM_XML_LOG_FLAGS = -V --run
 #LOG_COMPILER =
 #AM_LOG_FLAGS = -V

 COMMONLIBS = $(top_builddir)/lib/common/libcrmcommon.la \
 		$(top_builddir)/lib/pengine/libpe_status.la \
 		libpengine.la

 ## libraries
 lib_LTLIBRARIES = libpengine.la

 ## binary progs
 halib_PROGRAMS = pengine

 if BUILD_XML_HELP
 man7_MANS = pengine.7
 endif

 ## SOURCES
 noinst_HEADERS = allocate.h notif.h utils.h pengine.h

-libpengine_la_LDFLAGS = -version-info 11:0:1
+libpengine_la_LDFLAGS = -version-info 12:0:2

 libpengine_la_CFLAGS = $(CFLAGS_HARDENED_LIB)
 libpengine_la_LDFLAGS += $(LDFLAGS_HARDENED_LIB)

 libpengine_la_LIBADD = $(top_builddir)/lib/pengine/libpe_status.la \
 		$(top_builddir)/lib/cib/libcib.la
 # -L$(top_builddir)/lib/pils -lpils -export-dynamic -module -avoid-version
 libpengine_la_SOURCES = pengine.c allocate.c notif.c utils.c constraints.c
 libpengine_la_SOURCES += native.c group.c clone.c master.c graph.c utilization.c

 pengine_CFLAGS = $(CFLAGS_HARDENED_EXE)
 pengine_LDFLAGS = $(LDFLAGS_HARDENED_EXE)

 pengine_LDADD = $(top_builddir)/lib/cib/libcib.la $(COMMONLIBS)
 # libcib for get_object_root()
 # $(top_builddir)/lib/hbclient/libhbclient.la
 pengine_SOURCES = main.c

 install-exec-local:
 	$(mkinstalldirs) $(DESTDIR)/$(PE_STATE_DIR)
 	-chown $(CRM_DAEMON_USER) $(DESTDIR)/$(PE_STATE_DIR)
 	-chgrp $(CRM_DAEMON_GROUP) $(DESTDIR)/$(PE_STATE_DIR)
 	-chmod 750 $(DESTDIR)/$(PE_STATE_DIR)

 uninstall-local:

 clean-local:
 	rm -f test10/*.pe.* $(man7_MANS)
diff --git a/version.m4 b/version.m4
index 10221da6f7..c6bfaba1c1 100644
--- a/version.m4
+++ b/version.m4
@@ -1,2 +1,2 @@
-m4_define([VERSION_NUMBER], [1.1.15])
+m4_define([VERSION_NUMBER], [1.1.16])
 m4_define([PCMK_URL], [http://clusterlabs.org/])
diff --git a/xml/Readme.md b/xml/Readme.md
index 88d9137828..fdcf2aef7a 100644
--- a/xml/Readme.md
+++ b/xml/Readme.md
@@ -1,84 +1,85 @@
 # Schema Reference

 Besides the version of Pacemaker itself, the XML schema of the Pacemaker
 configuration has its own version.

 ## Versioned Schema Evolution

 A versioned schema offers transparent backward/forward compatibility.

 - It reflects the timeline of schema-backed features (introduction,
   changes to the syntax, possibly deprecation) through the versioned
   stable schema increments, while keeping schema versions used by default
   by older Pacemaker versions untouched.

 - Pacemaker internally uses the latest stable schema version, and relies on
   supplemental transformations to promote cluster configurations based on
   older, incompatible schema versions into the desired form.

 - It allows experimental features with a possibly unstable configuration
   interface to be developed using the special `next` version of the schema.

 ## Mapping Pacemaker Versions to Schema Versions

 | Pacemaker | Latest Schema | Changed
 | --------- | ------------- | ----------------------------------------------
+| `1.1.16`  | `2.6`         | `constraints`
 | `1.1.15`  | `2.5`         | `alerts`
 | `1.1.14`  | `2.4`         | `fencing`
 | `1.1.13`  | `2.3`         | `constraints`
 | `1.1.12`  | `2.0`         | `nodes`, `nvset`, `resources`, `tags` + `acls`
 | `1.1.8`+  | `1.2`         |

 # Updating schema files #

 ## Experimental features ##

 Experimental features go into `${base}-next.rng`

 Create from the most recent `${base}-${X}.${Y}.rng` if it does not already exist

 ## Stable features ##

 The current stable version is determined at runtime when
 __xml_build_schema_list() interrogates the CRM_DTD_DIRECTORY.

 It will have the form `pacemaker-${X}.${Y}` and the highest
 `${X}.${Y}` wins.

 ### Simple Additions

 When the new syntax is a simple addition to the previous one, create a
 new entry with `${Y} = ${Yold} + 1`

 ### Feature Removal or otherwise Incompatible Changes

 When the new syntax is not a simple addition to the previous one,
 create a new entry with `${X} = ${Xold} + 1` and `${Y} = 0`.

 An XSLT file is also required that converts an old syntax to the new
 one and must be named `upgrade-${Xold}.${Yold}.xsl`.

 See `xml/upgrade06.xsl` for an example.

 ### General Procedure

 1. Copy the most recent version of `${base}-*.rng` to `${base}-${X}.${Y}.rng`
 1. Commit the copy, e.g. `"Clone the latest ${base} schema in preparation for changes"`.
    This way the actual change will be obvious in the commit history.
 1. Modify `${base}-${X}.${Y}.rng` as required
 1. Add an XSLT file if required and update `xslt_SCRIPTS` in `xml/Makefile.am`
 1. Commit

 ## Admin Tasks
 New features will not be available until the admin

 1. Updates all the nodes
 1. Runs the equivalent of `cibadmin --upgrade`

 ## Random Notes

 From the source directory, run `make -C xml diff` to see the changes
 in the current schema (compared to the previous ones) and also the
 pending changes in `pacemaker-next`.
 Alternatively, if the intention is to grok the overall historical schema
 evolution, use `make -C xml fulldiff`.
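
The versioning rules in the "Stable features" section of the Readme can be illustrated with a small sketch. This is not Pacemaker's implementation (which lives in C around `__xml_build_schema_list()`); the helper names `latest_schema` and `next_version` are hypothetical, and the sketch assumes schema files are named `pacemaker-${X}.${Y}.rng`. It shows two things: picking the highest `${X}.${Y}` numerically (so `2.10` beats `2.9`), and deriving the next version plus the required upgrade XSLT file name.

```python
import re

# Hypothetical helpers for illustration only; not part of Pacemaker.
# Assumed file naming: pacemaker-<X>.<Y>.rng (pacemaker-next.rng is skipped).
SCHEMA_RE = re.compile(r"^pacemaker-(\d+)\.(\d+)\.rng$")


def latest_schema(filenames):
    """Return the highest (X, Y) among versioned schema files, or None."""
    versions = []
    for name in filenames:
        match = SCHEMA_RE.match(name)
        if match:
            # Compare as integers so that 2.10 sorts above 2.9.
            versions.append((int(match.group(1)), int(match.group(2))))
    return max(versions) if versions else None


def next_version(current, simple_addition):
    """Return ((X, Y), xslt_name) for the next schema version.

    A simple addition bumps Y and needs no XSLT file; an incompatible
    change bumps X, resets Y to 0, and needs upgrade-<Xold>.<Yold>.xsl.
    """
    x, y = current
    if simple_addition:
        return (x, y + 1), None
    return (x + 1, 0), "upgrade-%d.%d.xsl" % (x, y)
```

For example, given a directory listing that contains `pacemaker-2.5.rng` and `pacemaker-2.6.rng`, `latest_schema` yields `(2, 6)`; an incompatible change on top of that would call for version `3.0` and a file named `upgrade-2.6.xsl`.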