
diff --git a/doc/Pacemaker_Explained/en-US/Book_Info.xml b/doc/Pacemaker_Explained/en-US/Book_Info.xml
index bce0089524..c189d07a6c 100644
--- a/doc/Pacemaker_Explained/en-US/Book_Info.xml
+++ b/doc/Pacemaker_Explained/en-US/Book_Info.xml
@@ -1,35 +1,35 @@
<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE bookinfo PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
]>
<bookinfo>
<title>Configuration Explained</title>
<subtitle>An A-Z guide to Pacemaker's Configuration Options</subtitle>
<productname>Pacemaker</productname>
<productnumber>1.1</productnumber>
<!--
EDITION-PUBSNUMBER should match REVNUMBER in Revision_History.xml.
Increment EDITION when the syntax of the documented software
changes (pacemaker), and PUBSNUMBER for
simple textual changes (corrections, translations, etc.).
-->
- <edition>6</edition>
+ <edition>7</edition>
<pubsnumber>0</pubsnumber>
<abstract>
<para>
The purpose of this document is to definitively explain the concepts used to configure Pacemaker.
To achieve this, it will focus exclusively on the XML syntax used to configure Pacemaker's
Cluster Information Base (CIB).
</para>
</abstract>
<corpauthor>
<inlinemediaobject>
<imageobject>
<imagedata fileref="Common_Content/images/title_logo.svg" format="SVG"/>
</imageobject>
</inlinemediaobject>
</corpauthor>
<xi:include href="Common_Content/Legal_Notice.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
</xi:include>
<xi:include href="Author_Group.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
</xi:include>
</bookinfo>
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Options.txt b/doc/Pacemaker_Explained/en-US/Ch-Options.txt
index 0c1a2e7620..a2fbfe2473 100644
--- a/doc/Pacemaker_Explained/en-US/Ch-Options.txt
+++ b/doc/Pacemaker_Explained/en-US/Ch-Options.txt
@@ -1,404 +1,409 @@
= Cluster-Wide Configuration =
== CIB Properties ==
Certain settings are defined by CIB properties (that is, attributes of the
+cib+ tag) rather than with the rest of the cluster configuration in the
+configuration+ section.
The reason is simply a matter of parsing. These options are used by the
configuration database which is, by design, mostly ignorant of the content it
holds. So the decision was made to place them in an easy-to-find location.
.CIB Properties
[width="95%",cols="2m,5<",options="header",align="center"]
|=========================================================
|Field |Description
| admin_epoch |
indexterm:[Configuration Version,Cluster]
indexterm:[Cluster,Option,Configuration Version]
indexterm:[admin_epoch,Cluster Option]
indexterm:[Cluster,Option,admin_epoch]
When a node joins the cluster, the cluster performs a check to see
which node has the best configuration. It asks the node with the highest
(+admin_epoch+, +epoch+, +num_updates+) tuple to replace the configuration on
all the nodes -- which makes setting them, and setting them correctly, very
important. +admin_epoch+ is never modified by the cluster; you can use this
to make the configurations on any inactive nodes obsolete. _Never set this
value to zero_. In such cases, the cluster cannot tell the difference between
your configuration and the "empty" one used when nothing is found on disk.
| epoch |
indexterm:[epoch,Cluster Option]
indexterm:[Cluster,Option,epoch]
The cluster increments this every time the configuration is updated (usually by
the administrator).
| num_updates |
indexterm:[num_updates,Cluster Option]
indexterm:[Cluster,Option,num_updates]
The cluster increments this every time the configuration or status is updated
(usually by the cluster) and resets it to 0 when epoch changes.
| validate-with |
indexterm:[validate-with,Cluster Option]
indexterm:[Cluster,Option,validate-with]
Determines the type of XML validation that will be done on the configuration.
If set to +none+, the cluster will not verify that updates conform to the
DTD (nor reject ones that don't). This option can be useful when
operating a mixed-version cluster during an upgrade.
|cib-last-written |
indexterm:[cib-last-written,Cluster Property]
indexterm:[Cluster,Property,cib-last-written]
Indicates when the configuration was last written to disk. Maintained by the
cluster; for informational purposes only.
|have-quorum |
indexterm:[have-quorum,Cluster Property]
indexterm:[Cluster,Property,have-quorum]
Indicates if the cluster has quorum. If false, this may mean that the
cluster cannot start resources or fence other nodes (see
+no-quorum-policy+ below). Maintained by the cluster.
|dc-uuid |
indexterm:[dc-uuid,Cluster Property]
indexterm:[Cluster,Property,dc-uuid]
Indicates which cluster node is the current leader. Used by the
cluster when placing resources and determining the order of some
events. Maintained by the cluster.
|=========================================================
=== Working with CIB Properties ===
Although these fields can be written to by the user, in
most cases the cluster will overwrite any user-specified values
with the "correct" ones.
To change one of the properties that the user may legitimately specify,
for example +admin_epoch+, one should use:
----
# cibadmin --modify --xml-text '<cib admin_epoch="42"/>'
----
A complete set of CIB properties will look something like this:
.Attributes set for a cib object
======
[source,XML]
-------
<cib crm_feature_set="3.0.7" validate-with="pacemaker-1.2"
admin_epoch="42" epoch="116" num_updates="1"
cib-last-written="Mon Jan 12 15:46:39 2015" update-origin="rhel7-1"
update-client="crm_attribute" have-quorum="1" dc-uuid="1">
-------
======
== Cluster Options ==
Cluster options, as you might expect, control how the cluster behaves
when confronted with certain situations.
They are grouped into sets within the +crm_config+ section, and, in advanced
configurations, there may be more than one set. (This will be described later
in the section on <<ch-rules>> where we will show how to have the cluster use
different sets of options during working hours than during weekends.) For now,
we will describe the simple case where each option is present at most once.
You can obtain an up-to-date list of cluster options, including
their default values, by running the `man pengine` and `man crmd` commands.
.Cluster Options
[width="95%",cols="5m,2,11<a",options="header",align="center"]
|=========================================================
|Option |Default |Description
| dc-version | |
indexterm:[dc-version,Cluster Property]
indexterm:[Cluster,Property,dc-version]
Version of Pacemaker on the cluster's DC.
Determined automatically by the cluster.
Often includes the hash which identifies the exact Git changeset it was built
from. Used for diagnostic purposes.
| cluster-infrastructure | |
indexterm:[cluster-infrastructure,Cluster Property]
indexterm:[Cluster,Property,cluster-infrastructure]
The messaging stack on which Pacemaker is currently running.
Determined automatically by the cluster.
Used for informational and diagnostic purposes.
| expected-quorum-votes | |
indexterm:[expected-quorum-votes,Cluster Property]
indexterm:[Cluster,Property,expected-quorum-votes]
The number of nodes expected to be in the cluster.
Determined automatically by the cluster.
Used to calculate quorum in clusters that use Corosync 1.x without CMAN
as the messaging layer.
| no-quorum-policy | stop |
indexterm:[no-quorum-policy,Cluster Option]
indexterm:[Cluster,Option,no-quorum-policy]
What to do when the cluster does not have quorum. Allowed values:
* +ignore:+ continue all resource management
* +freeze:+ continue resource management, but don't recover resources from nodes not in the affected partition
* +stop:+ stop all resources in the affected cluster partition
* +suicide:+ fence all nodes in the affected cluster partition
| batch-limit | 30 |
indexterm:[batch-limit,Cluster Option]
indexterm:[Cluster,Option,batch-limit]
The number of jobs that the Transition Engine (TE) is allowed to execute in
parallel. The TE is the logic in pacemaker's CRMd that executes the actions
determined by the Policy Engine (PE). The "correct" value will depend on the
speed and load of your network and cluster nodes.
| migration-limit | -1 |
indexterm:[migration-limit,Cluster Option]
indexterm:[Cluster,Option,migration-limit]
The number of migration jobs that the TE is allowed to execute in
parallel on a node. A value of -1 means unlimited.
| symmetric-cluster | TRUE |
indexterm:[symmetric-cluster,Cluster Option]
indexterm:[Cluster,Option,symmetric-cluster]
Can all resources run on any node by default?
| stop-all-resources | FALSE |
indexterm:[stop-all-resources,Cluster Option]
indexterm:[Cluster,Option,stop-all-resources]
Should the cluster stop all resources?
| stop-orphan-resources | TRUE |
indexterm:[stop-orphan-resources,Cluster Option]
indexterm:[Cluster,Option,stop-orphan-resources]
Should deleted resources be stopped?
| stop-orphan-actions | TRUE |
indexterm:[stop-orphan-actions,Cluster Option]
indexterm:[Cluster,Option,stop-orphan-actions]
Should deleted actions be cancelled?
| start-failure-is-fatal | TRUE |
indexterm:[start-failure-is-fatal,Cluster Option]
indexterm:[Cluster,Option,start-failure-is-fatal]
Should a failure to start a resource on a particular node prevent further start
attempts on that node? If FALSE, the cluster will decide whether to try
starting on the same node again based on the resource's current failure count
and +migration-threshold+ (see <<s-failure-migration>>).
| enable-startup-probes | TRUE |
indexterm:[enable-startup-probes,Cluster Option]
indexterm:[Cluster,Option,enable-startup-probes]
Should the cluster check for active resources during startup?
| maintenance-mode | FALSE |
indexterm:[maintenance-mode,Cluster Option]
indexterm:[Cluster,Option,maintenance-mode]
Should the cluster refrain from monitoring, starting and stopping resources?
| stonith-enabled | TRUE |
indexterm:[stonith-enabled,Cluster Option]
indexterm:[Cluster,Option,stonith-enabled]
Should failed nodes and nodes with resources that can't be stopped be
shot? If you value your data, set up a STONITH device and enable this.
If true, or unset, the cluster will refuse to start resources unless
one or more STONITH resources have been configured.
If false, unresponsive nodes are immediately assumed to be running no
resources, and resource takeover to online nodes starts without any
further protection (which means _data loss_ if the unresponsive node
still accesses shared storage, for example). See also the +requires+
meta-attribute in <<s-resource-options>>.
| stonith-action | reboot |
indexterm:[stonith-action,Cluster Option]
indexterm:[Cluster,Option,stonith-action]
Action to send to STONITH device. Allowed values are +reboot+ and +off+.
The value +poweroff+ is also allowed, but is only used for
legacy devices.
| stonith-timeout | 60s |
indexterm:[stonith-timeout,Cluster Option]
indexterm:[Cluster,Option,stonith-timeout]
How long to wait for STONITH actions (reboot, on, off) to complete
+| concurrent-fencing | FALSE |
+indexterm:[concurrent-fencing,Cluster Option]
+indexterm:[Cluster,Option,concurrent-fencing]
+Is the cluster allowed to initiate multiple fence actions concurrently?
+
| cluster-delay | 60s |
indexterm:[cluster-delay,Cluster Option]
indexterm:[Cluster,Option,cluster-delay]
Estimated maximum round-trip delay over the network (excluding action
execution). If the TE requires an action to be executed on another node,
it will consider the action failed if it does not get a response
from the other node in this time (after considering the action's
own timeout). The "correct" value will depend on the speed and load of your
network and cluster nodes.
| dc-deadtime | 20s |
indexterm:[dc-deadtime,Cluster Option]
indexterm:[Cluster,Option,dc-deadtime]
How long to wait for a response from other nodes during startup.
The "correct" value will depend on the speed/load of your network and the type of switches used.
| cluster-recheck-interval | 15min |
indexterm:[cluster-recheck-interval,Cluster Option]
indexterm:[Cluster,Option,cluster-recheck-interval]
Polling interval for time-based changes to options, resource parameters and constraints.
The cluster is primarily event-driven, but your configuration can have
elements that take effect based on the time of day. To ensure these changes
take effect, we can optionally poll the cluster's status for changes. A value
of 0 disables polling. Positive values are an interval (in seconds unless other
SI units are specified, e.g. 5min).
| pe-error-series-max | -1 |
indexterm:[pe-error-series-max,Cluster Option]
indexterm:[Cluster,Option,pe-error-series-max]
The number of PE inputs resulting in ERRORs to save. Used when reporting problems.
A value of -1 means unlimited (report all).
| pe-warn-series-max | -1 |
indexterm:[pe-warn-series-max,Cluster Option]
indexterm:[Cluster,Option,pe-warn-series-max]
The number of PE inputs resulting in WARNINGs to save. Used when reporting problems.
A value of -1 means unlimited (report all).
| pe-input-series-max | -1 |
indexterm:[pe-input-series-max,Cluster Option]
indexterm:[Cluster,Option,pe-input-series-max]
The number of "normal" PE inputs to save. Used when reporting problems.
A value of -1 means unlimited (report all).
| remove-after-stop | FALSE |
indexterm:[remove-after-stop,Cluster Option]
indexterm:[Cluster,Option,remove-after-stop]
_Advanced Use Only:_ Should the cluster remove resources from the LRM after
they are stopped? Values other than the default are, at best, poorly tested and
potentially dangerous.
| startup-fencing | TRUE |
indexterm:[startup-fencing,Cluster Option]
indexterm:[Cluster,Option,startup-fencing]
_Advanced Use Only:_ Should the cluster shoot unseen nodes?
Not using the default is very unsafe!
| election-timeout | 2min |
indexterm:[election-timeout,Cluster Option]
indexterm:[Cluster,Option,election-timeout]
_Advanced Use Only:_ If you need to adjust this value, it probably indicates
the presence of a bug.
| shutdown-escalation | 20min |
indexterm:[shutdown-escalation,Cluster Option]
indexterm:[Cluster,Option,shutdown-escalation]
_Advanced Use Only:_ If you need to adjust this value, it probably indicates
the presence of a bug.
| crmd-integration-timeout | 3min |
indexterm:[crmd-integration-timeout,Cluster Option]
indexterm:[Cluster,Option,crmd-integration-timeout]
_Advanced Use Only:_ If you need to adjust this value, it probably indicates
the presence of a bug.
| crmd-finalization-timeout | 30min |
indexterm:[crmd-finalization-timeout,Cluster Option]
indexterm:[Cluster,Option,crmd-finalization-timeout]
_Advanced Use Only:_ If you need to adjust this value, it probably indicates
the presence of a bug.
| crmd-transition-delay | 0s |
indexterm:[crmd-transition-delay,Cluster Option]
indexterm:[Cluster,Option,crmd-transition-delay]
_Advanced Use Only:_ Delay cluster recovery for the configured interval to
allow for additional/related events to occur. Useful if your configuration is
sensitive to the order in which ping updates arrive.
Enabling this option will slow down cluster recovery under
all conditions.
|default-resource-stickiness | 0 |
indexterm:[default-resource-stickiness,Cluster Option]
indexterm:[Cluster,Option,default-resource-stickiness]
_Deprecated:_ See <<s-resource-defaults>> instead
| is-managed-default | TRUE |
indexterm:[is-managed-default,Cluster Option]
indexterm:[Cluster,Option,is-managed-default]
_Deprecated:_ See <<s-resource-defaults>> instead
| default-action-timeout | 20s |
indexterm:[default-action-timeout,Cluster Option]
indexterm:[Cluster,Option,default-action-timeout]
_Deprecated:_ See <<s-operation-defaults>> instead
|=========================================================
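Cluster options are stored as +nvpair+ elements inside a
+cluster_property_set+ within the +crm_config+ section. A minimal sketch
(the +id+ values are illustrative) might look like:

.Cluster options set in the +crm_config+ section
======
[source,XML]
-------
<crm_config>
  <cluster_property_set id="cib-bootstrap-options">
    <nvpair id="cib-bootstrap-options-no-quorum-policy"
            name="no-quorum-policy" value="stop"/>
    <nvpair id="cib-bootstrap-options-stonith-enabled"
            name="stonith-enabled" value="true"/>
  </cluster_property_set>
</crm_config>
-------
======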
=== Querying and Setting Cluster Options ===
indexterm:[Querying,Cluster Option]
indexterm:[Setting,Cluster Option]
indexterm:[Cluster,Querying Options]
indexterm:[Cluster,Setting Options]
Cluster options can be queried and modified using the `crm_attribute` tool. To
get the current value of +cluster-delay+, you can run:
----
# crm_attribute --query --name cluster-delay
----
which is more simply written as
----
# crm_attribute -G -n cluster-delay
----
If a value is found, you'll see a result like this:
----
# crm_attribute -G -n cluster-delay
scope=crm_config name=cluster-delay value=60s
----
If no value is found, the tool will display an error:
----
# crm_attribute -G -n clusta-deway
scope=crm_config name=clusta-deway value=(null)
Error performing operation: No such device or address
----
To use a different value (for example, 30 seconds), simply run:
----
# crm_attribute --name cluster-delay --update 30s
----
To go back to the cluster's default value, you can delete the value, for example:
----
# crm_attribute --name cluster-delay --delete
Deleted crm_config option: id=cib-bootstrap-options-cluster-delay name=cluster-delay
----
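Under the hood, `crm_attribute` creates, updates or deletes an +nvpair+ in
the +crm_config+ section. For example, the update above results in an entry
like the following sketch (the +id+ follows the convention visible in the
delete output):

[source,XML]
----
<nvpair id="cib-bootstrap-options-cluster-delay" name="cluster-delay" value="30s"/>
----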
=== When Options are Listed More Than Once ===
If you ever see something like the following, it means that the option you're modifying is present more than once.
.Deleting an option that is listed twice
=======
------
# crm_attribute --name batch-limit --delete
Multiple attributes match name=batch-limit in crm_config:
Value: 50 (set=cib-bootstrap-options, id=cib-bootstrap-options-batch-limit)
Value: 100 (set=custom, id=custom-batch-limit)
Please choose from one of the matches above and supply the 'id' with --id
------
=======
In such cases, follow the on-screen instructions to perform the
requested action. To determine which value is currently being used by
the cluster, refer to <<ch-rules>>.
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt b/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt
index a5bcf0dcfa..d2880e0843 100644
--- a/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt
+++ b/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt
@@ -1,892 +1,901 @@
= STONITH =
////
We prefer [[ch-stonith]], but older versions of asciidoc don't deal well
with that construct for chapter headings
////
anchor:ch-stonith[Chapter 13, STONITH]
indexterm:[STONITH, Configuration]
== What Is STONITH? ==
STONITH (an acronym for "Shoot The Other Node In The Head"), also called
'fencing', protects your data from being corrupted by rogue nodes or concurrent
access.
Just because a node is unresponsive doesn't mean it has stopped
accessing your data. The only way to be 100% sure that your data is
safe is to use STONITH, so we can be certain that the node is truly
offline before allowing the data to be accessed from another node.
STONITH also has a role to play in the event that a clustered service
cannot be stopped. In this case, the cluster uses STONITH to force the
whole node offline, thereby making it safe to start the service
elsewhere.
== What STONITH Device Should You Use? ==
It is crucial that the STONITH device allows the cluster to
differentiate between a node failure and a network failure.
The biggest mistake people make in choosing a STONITH device is to
use a remote power switch (such as many on-board IPMI controllers) that
shares power with the node it controls. In such cases, the cluster
cannot be sure if the node is really offline, or active and suffering
from a network fault.
Likewise, any device that relies on the machine being active (such as
SSH-based "devices" used during testing) is inappropriate.
== Special Treatment of STONITH Resources ==
STONITH resources are somewhat special in Pacemaker.
STONITH may be initiated by pacemaker or by other parts of the cluster
(such as resources like DRBD or DLM). To accommodate this, pacemaker
does not require the STONITH resource to be in the 'started' state
in order to be used, thus allowing reliable use of STONITH devices in such a
case.
[NOTE]
====
In pacemaker versions 1.1.9 and earlier, this feature either did not exist or
did not work well. Only "running" STONITH resources could be used by Pacemaker
for fencing, and if another component tried to fence a node while Pacemaker was
moving STONITH resources, the fencing could fail.
====
All nodes have access to STONITH devices' definitions and instantiate them
on-the-fly when needed, but preference is given to 'verified' instances, which
are the ones that are 'started' according to the cluster's knowledge.
In the case of a cluster split, the partition with a verified instance
will have a slight advantage, because the STONITH daemon in the other partition
will have to hear from all its current peers before choosing a node to
perform the fencing.
Fencing resources work the same as regular resources in some respects:
* +target-role+ can be used to enable or disable the resource
* Location constraints can be used to prevent a specific node from using the resource
[IMPORTANT]
===========
Currently there is a limitation that fencing resources may only have
one set of meta-attributes and one set of instance attributes. This
can be revisited if it becomes a significant limitation for people.
===========
See the table below or run `man stonithd` to see special instance attributes
that may be set for any fencing resource, regardless of fence agent.
.Properties of Fencing Resources
[width="95%",cols="5m,2,3,10<a",options="header",align="center"]
|=========================================================
|Field
|Type
|Default
|Description
|stonith-timeout
|NA
|NA
|Older versions used this to override the default period to wait for a STONITH (reboot, on, off) action to complete for this device.
It has been replaced by the +pcmk_reboot_timeout+ and +pcmk_off_timeout+ properties.
indexterm:[stonith-timeout,Fencing]
indexterm:[Fencing,Property,stonith-timeout]
|priority
|integer
|0
|The priority of the STONITH resource. Devices are tried in order of highest priority to lowest.
indexterm:[priority,Fencing]
indexterm:[Fencing,Property,priority]
|pcmk_host_map
|string
|
|A mapping of host names to port numbers for devices that do not support host names.
Example: +node1:1;node2:2,3+ tells the cluster to use port 1 for
*node1* and ports 2 and 3 for *node2*.
indexterm:[pcmk_host_map,Fencing]
indexterm:[Fencing,Property,pcmk_host_map]
|pcmk_host_list
|string
|
|A list of machines controlled by this device (optional unless
+pcmk_host_check+ is +static-list+).
indexterm:[pcmk_host_list,Fencing]
indexterm:[Fencing,Property,pcmk_host_list]
|pcmk_host_check
|string
|dynamic-list
|How to determine which machines are controlled by the device.
Allowed values:
* +dynamic-list:+ query the device
* +static-list:+ check the +pcmk_host_list+ attribute
* +none:+ assume every device can fence every machine
indexterm:[pcmk_host_check,Fencing]
indexterm:[Fencing,Property,pcmk_host_check]
|pcmk_delay_max
|time
|0s
|Enable a random delay of up to the time specified before executing stonith
actions. This is sometimes used in two-node clusters to ensure that the
nodes don't fence each other at the same time.
indexterm:[pcmk_delay_max,Fencing]
indexterm:[Fencing,Property,pcmk_delay_max]
+|pcmk_action_limit
+|integer
+|1
+|The maximum number of actions that can be performed in parallel on this
+ device, if the cluster option +concurrent-fencing+ is +true+. -1 is unlimited.
+
+indexterm:[pcmk_action_limit,Fencing]
+indexterm:[Fencing,Property,pcmk_action_limit]
+
|pcmk_host_argument
|string
|port
|'Advanced use only.' Which parameter should be supplied to the resource agent
to identify the node to be fenced. Some devices do not support the standard
+port+ parameter or may provide additional ones. Use this to specify an
alternate, device-specific parameter. A value of +none+ tells the
cluster not to supply any additional parameters.
indexterm:[pcmk_host_argument,Fencing]
indexterm:[Fencing,Property,pcmk_host_argument]
|pcmk_reboot_action
|string
|reboot
|'Advanced use only.' The command to send to the resource agent in order to
reboot a node. Some devices do not support the standard commands or may provide
additional ones. Use this to specify an alternate, device-specific command.
indexterm:[pcmk_reboot_action,Fencing]
indexterm:[Fencing,Property,pcmk_reboot_action]
|pcmk_reboot_timeout
|time
|60s
|'Advanced use only.' Specify an alternate timeout to use for `reboot` actions
instead of the value of +stonith-timeout+. Some devices need much more or less
time to complete than normal. Use this to specify an alternate, device-specific
timeout.
indexterm:[pcmk_reboot_timeout,Fencing]
indexterm:[Fencing,Property,pcmk_reboot_timeout]
indexterm:[stonith-timeout,Fencing]
indexterm:[Fencing,Property,stonith-timeout]
|pcmk_reboot_retries
|integer
|2
|'Advanced use only.' The maximum number of times to retry the `reboot` command
within the timeout period. Some devices do not support multiple connections, and
operations may fail if the device is busy with another task, so Pacemaker will
automatically retry the operation, if there is time remaining. Use this option
to alter the number of times Pacemaker retries before giving up.
indexterm:[pcmk_reboot_retries,Fencing]
indexterm:[Fencing,Property,pcmk_reboot_retries]
|pcmk_off_action
|string
|off
|'Advanced use only.' The command to send to the resource agent in order to
shut down a node. Some devices do not support the standard commands or may provide
additional ones. Use this to specify an alternate, device-specific command.
indexterm:[pcmk_off_action,Fencing]
indexterm:[Fencing,Property,pcmk_off_action]
|pcmk_off_timeout
|time
|60s
|'Advanced use only.' Specify an alternate timeout to use for `off` actions
instead of the value of +stonith-timeout+. Some devices need much more or less
time to complete than normal. Use this to specify an alternate, device-specific
timeout.
indexterm:[pcmk_off_timeout,Fencing]
indexterm:[Fencing,Property,pcmk_off_timeout]
indexterm:[stonith-timeout,Fencing]
indexterm:[Fencing,Property,stonith-timeout]
|pcmk_off_retries
|integer
|2
|'Advanced use only.' The maximum number of times to retry the `off` command
within the timeout period. Some devices do not support multiple connections, and
operations may fail if the device is busy with another task, so Pacemaker will
automatically retry the operation, if there is time remaining. Use this option
to alter the number of times Pacemaker retries before giving up.
indexterm:[pcmk_off_retries,Fencing]
indexterm:[Fencing,Property,pcmk_off_retries]
|pcmk_list_action
|string
|list
|'Advanced use only.' The command to send to the resource agent in order to
list nodes. Some devices do not support the standard commands or may provide
additional ones. Use this to specify an alternate, device-specific command.
indexterm:[pcmk_list_action,Fencing]
indexterm:[Fencing,Property,pcmk_list_action]
|pcmk_list_timeout
|time
|60s
|'Advanced use only.' Specify an alternate timeout to use for `list` actions
instead of the value of +stonith-timeout+. Some devices need much more or less
time to complete than normal. Use this to specify an alternate, device-specific
timeout.
indexterm:[pcmk_list_timeout,Fencing]
indexterm:[Fencing,Property,pcmk_list_timeout]
|pcmk_list_retries
|integer
|2
|'Advanced use only.' The maximum number of times to retry the `list` command
within the timeout period. Some devices do not support multiple connections, and
operations may fail if the device is busy with another task, so Pacemaker will
automatically retry the operation, if there is time remaining. Use this option
to alter the number of times Pacemaker retries before giving up.
indexterm:[pcmk_list_retries,Fencing]
indexterm:[Fencing,Property,pcmk_list_retries]
|pcmk_monitor_action
|string
|monitor
|'Advanced use only.' The command to send to the resource agent in order to
report extended status. Some devices do not support the standard commands or may provide
additional ones. Use this to specify an alternate, device-specific command.
indexterm:[pcmk_monitor_action,Fencing]
indexterm:[Fencing,Property,pcmk_monitor_action]
|pcmk_monitor_timeout
|time
|60s
|'Advanced use only.' Specify an alternate timeout to use for `monitor` actions
instead of the value of +stonith-timeout+. Some devices need much more or less
time to complete than normal. Use this to specify an alternate, device-specific
timeout.
indexterm:[pcmk_monitor_timeout,Fencing]
indexterm:[Fencing,Property,pcmk_monitor_timeout]
|pcmk_monitor_retries
|integer
|2
|'Advanced use only.' The maximum number of times to retry the `monitor` command
within the timeout period. Some devices do not support multiple connections, and
operations may fail if the device is busy with another task, so Pacemaker will
automatically retry the operation, if there is time remaining. Use this option
to alter the number of times Pacemaker retries before giving up.
indexterm:[pcmk_monitor_retries,Fencing]
indexterm:[Fencing,Property,pcmk_monitor_retries]
|pcmk_status_action
|string
|status
|'Advanced use only.' The command to send to the resource agent in order to
report status. Some devices do not support the standard commands or may provide
additional ones. Use this to specify an alternate, device-specific command.
indexterm:[pcmk_status_action,Fencing]
indexterm:[Fencing,Property,pcmk_status_action]
|pcmk_status_timeout
|time
|60s
|'Advanced use only.' Specify an alternate timeout to use for `status` actions
instead of the value of +stonith-timeout+. Some devices need much more or less
time to complete than normal. Use this to specify an alternate, device-specific
timeout.
indexterm:[pcmk_status_timeout,Fencing]
indexterm:[Fencing,Property,pcmk_status_timeout]
|pcmk_status_retries
|integer
|2
|'Advanced use only.' The maximum number of times to retry the `status` command
within the timeout period. Some devices do not support multiple connections, and
operations may fail if the device is busy with another task, so Pacemaker will
automatically retry the operation, if there is time remaining. Use this option
to alter the number of times Pacemaker retries before giving up.
indexterm:[pcmk_status_retries,Fencing]
indexterm:[Fencing,Property,pcmk_status_retries]
|=========================================================
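As an illustration of how these properties combine, a fencing resource for
an IPMI-controlled device might be configured as in the following sketch
(the resource +id+, node names and values are examples only):

.A fencing resource using special instance attributes
======
[source,XML]
-------
<primitive id="Fencing" class="stonith" type="fence_ipmilan">
  <instance_attributes id="Fencing-params">
    <nvpair id="Fencing-ipaddr" name="ipaddr" value="192.0.2.1"/>
    <nvpair id="Fencing-pcmk_host_list" name="pcmk_host_list" value="node1 node2"/>
    <nvpair id="Fencing-pcmk_host_check" name="pcmk_host_check" value="static-list"/>
    <nvpair id="Fencing-pcmk_delay_max" name="pcmk_delay_max" value="5s"/>
  </instance_attributes>
</primitive>
-------
======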
== Configuring STONITH ==
[NOTE]
===========
Higher-level configuration shells include functionality to simplify the
process below, particularly the step for deciding which parameters are
required. However, since this document deals only with core
components, you should refer to the STONITH chapter of the
http://www.clusterlabs.org/doc/[Clusters from Scratch] guide for those details.
===========
. Find the correct driver:
+
----
# stonith_admin --list-installed
----
. Find the required parameters associated with the device
(replacing $AGENT_NAME with the name obtained from the previous step):
+
----
# stonith_admin --metadata --agent $AGENT_NAME
----
. Create a file called +stonith.xml+ containing a primitive resource
with a class of +stonith+, a type equal to the agent name obtained earlier,
and a parameter for each of the values returned in the previous step.
. If the device does not know how to fence nodes based on their uname,
you may also need to set the special +pcmk_host_map+ parameter. See
`man stonithd` for details.
. If the device does not support the `list` command, you may also need
to set the special +pcmk_host_list+ and/or +pcmk_host_check+
parameters. See `man stonithd` for details.
. If the device does not expect the victim to be specified with the
`port` parameter, you may also need to set the special
+pcmk_host_argument+ parameter. See `man stonithd` for details.
. Upload it into the CIB using cibadmin:
+
----
# cibadmin -C -o resources --xml-file stonith.xml
----
. Set +stonith-enabled+ to true:
+
----
# crm_attribute -t crm_config -n stonith-enabled -v true
----
. Once the stonith resource is running, you can test it by executing the
following (although you might want to stop the cluster on that machine
first):
+
----
# stonith_admin --reboot nodename
----
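To make step 3 concrete, a minimal +stonith.xml+ might look like the
following sketch (the agent name and parameter values are illustrative;
use the parameter names reported by the metadata for your device):

[source,XML]
----
<primitive id="my-fencing" class="stonith" type="fence_ipmilan">
  <instance_attributes id="my-fencing-params">
    <nvpair id="my-fencing-ipaddr" name="ipaddr" value="192.0.2.1"/>
    <nvpair id="my-fencing-login" name="login" value="testuser"/>
    <nvpair id="my-fencing-passwd" name="passwd" value="abc123"/>
  </instance_attributes>
</primitive>
----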
=== Example STONITH Configuration ===
Assume we have a chassis containing four nodes and an IPMI device
active on 192.0.2.1. We would choose the `fence_ipmilan` driver,
and obtain the following list of parameters:
.Obtaining a list of STONITH Parameters
====
----
# stonith_admin --metadata -a fence_ipmilan
----
[source,XML]
----
<resource-agent name="fence_ipmilan" shortdesc="Fence agent for IPMI over LAN">
<symlink name="fence_ilo3" shortdesc="Fence agent for HP iLO3"/>
<symlink name="fence_ilo4" shortdesc="Fence agent for HP iLO4"/>
<symlink name="fence_idrac" shortdesc="Fence agent for Dell iDRAC"/>
<symlink name="fence_imm" shortdesc="Fence agent for IBM Integrated Management Module"/>
<longdesc>
</longdesc>
<vendor-url>
</vendor-url>
<parameters>
<parameter name="auth" unique="0" required="0">
<getopt mixed="-A"/>
<content type="string"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="ipaddr" unique="0" required="1">
<getopt mixed="-a"/>
<content type="string"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="passwd" unique="0" required="0">
<getopt mixed="-p"/>
<content type="string"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="passwd_script" unique="0" required="0">
<getopt mixed="-S"/>
<content type="string"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="lanplus" unique="0" required="0">
<getopt mixed="-P"/>
<content type="boolean"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="login" unique="0" required="0">
<getopt mixed="-l"/>
<content type="string"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="action" unique="0" required="0">
<getopt mixed="-o"/>
<content type="string" default="reboot"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="timeout" unique="0" required="0">
<getopt mixed="-t"/>
<content type="string"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="cipher" unique="0" required="0">
<getopt mixed="-C"/>
<content type="string"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="method" unique="0" required="0">
<getopt mixed="-M"/>
<content type="string" default="onoff"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="power_wait" unique="0" required="0">
<getopt mixed="-T"/>
<content type="string" default="2"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="delay" unique="0" required="0">
<getopt mixed="-f"/>
<content type="string"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="privlvl" unique="0" required="0">
<getopt mixed="-L"/>
<content type="string"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
<parameter name="verbose" unique="0" required="0">
<getopt mixed="-v"/>
<content type="boolean"/>
<shortdesc lang="en">
</shortdesc>
</parameter>
</parameters>
<actions>
<action name="on"/>
<action name="off"/>
<action name="reboot"/>
<action name="status"/>
<action name="diag"/>
<action name="list"/>
<action name="monitor"/>
<action name="metadata"/>
<action name="stop" timeout="20s"/>
<action name="start" timeout="20s"/>
</actions>
</resource-agent>
----
====
Based on that, we would create a STONITH resource fragment that might look
like this:
.An IPMI-based STONITH Resource
====
[source,XML]
----
<primitive id="Fencing" class="stonith" type="fence_ipmilan" >
<instance_attributes id="Fencing-params" >
<nvpair id="Fencing-passwd" name="passwd" value="abc123" />
<nvpair id="Fencing-login" name="login" value="testuser" />
<nvpair id="Fencing-ipaddr" name="ipaddr" value="192.0.2.1" />
<nvpair id="Fencing-pcmk_host_list" name="pcmk_host_list" value="pcmk-1 pcmk-2" />
</instance_attributes>
<operations >
<op id="Fencing-monitor-10m" interval="10m" name="monitor" timeout="300s" />
</operations>
</primitive>
----
====
Finally, we need to enable STONITH:
----
# crm_attribute -t crm_config -n stonith-enabled -v true
----
== Advanced STONITH Configurations ==
Some people consider that having one fencing device is a single point
of failure footnote:[Not true, since a node or resource must fail
before fencing even has a chance to]; others prefer removing the node
from the storage and network instead of turning it off.
Whatever the reason, Pacemaker supports fencing nodes with multiple
devices through a feature called 'fencing topologies'.
Simply create the individual devices as you normally would, then
define one or more +fencing-level+ entries in the +fencing-topology+ section of
the configuration.
* Each fencing level is attempted in order of ascending +index+. Allowed
indexes are 0 to 9.
* If a device fails, processing terminates for the current level.
No further devices in that level are exercised, and the next level is attempted instead.
* If the operation succeeds for all the listed devices in a level, the level is deemed to have passed.
* The operation is finished when a level has passed (success), or all levels have been attempted (failed).
* If the operation failed, the next step is determined by the Policy Engine and/or `crmd`.
Some possible uses of topologies include:
* Try poison-pill and fail back to power
* Try disk and network, and fall back to power if either fails
* Initiate a kdump and then power off the node
.Properties of Fencing Levels
[width="95%",cols="1m,3<",options="header",align="center"]
|=========================================================
|Field
|Description
|id
|A unique name for the level
indexterm:[id,fencing-level]
indexterm:[Fencing,fencing-level,id]
|target
|The name of a single node to which this level applies
indexterm:[target,fencing-level]
indexterm:[Fencing,fencing-level,target]
|target-pattern
|A regular expression matching the names of nodes to which this level applies
'(since 1.1.14)'
indexterm:[target-pattern,fencing-level]
indexterm:[Fencing,fencing-level,target-pattern]
|target-attribute
|The name of a node attribute that is set for nodes to which this level applies
'(since 1.1.14)'
indexterm:[target-attribute,fencing-level]
indexterm:[Fencing,fencing-level,target-attribute]
|index
|The order in which to attempt the levels.
Levels are attempted in ascending order 'until one succeeds'.
indexterm:[index,fencing-level]
indexterm:[Fencing,fencing-level,index]
|devices
|A comma-separated list of devices that must all be tried for this level
indexterm:[devices,fencing-level]
indexterm:[Fencing,fencing-level,devices]
|=========================================================
.Fencing topology with different devices for different nodes
====
[source,XML]
----
<cib crm_feature_set="3.0.6" validate-with="pacemaker-1.2" admin_epoch="1" epoch="0" num_updates="0">
<configuration>
...
<fencing-topology>
<!-- For pcmk-1, try poison-pill and fail back to power -->
<fencing-level id="f-p1.1" target="pcmk-1" index="1" devices="poison-pill"/>
<fencing-level id="f-p1.2" target="pcmk-1" index="2" devices="power"/>
<!-- For pcmk-2, try disk and network, and fail back to power -->
<fencing-level id="f-p2.1" target="pcmk-2" index="1" devices="disk,network"/>
<fencing-level id="f-p2.2" target="pcmk-2" index="2" devices="power"/>
</fencing-topology>
...
</configuration>
<status/>
</cib>
----
====
=== Example Dual-Layer, Dual-Device Fencing Topologies ===
The following example illustrates an advanced use of +fencing-topology+ in a cluster with the following properties:
* 3 nodes (2 active prod-mysql nodes, and 1 prod-mysql-rep1 node in standby for quorum purposes)
* the active nodes have an IPMI-controlled power board reached at 192.0.2.1 and 192.0.2.2
* the active nodes also have two independent PSUs (Power Supply Units)
connected to two independent PDUs (Power Distribution Units) reached at
198.51.100.1 (port 10 and port 11) and 203.0.113.1 (port 10 and port 11)
* the first fencing method uses the `fence_ipmi` agent
* the second fencing method uses the `fence_apc_snmp` agent, targeting two fencing devices (one per PSU, either port 10 or port 11)
* fencing is only implemented for the active nodes and has location constraints
* the fencing topology is set to try IPMI fencing first, then fall back to a "sure-kill" dual-PDU fencing
In a normal failure scenario, STONITH will first select +fence_ipmi+ to try to kill the faulty node.
Using a fencing topology, if that first method fails, STONITH will then move on to selecting +fence_apc_snmp+ twice:
* once for the first PDU
* again for the second PDU
The fence action is considered successful only if both PDUs report the required status. If either fails, STONITH loops back to the first fencing method, +fence_ipmi+, and so on, until the node is fenced or the fencing action is cancelled.
.First fencing method: single IPMI device
Each cluster node has its own dedicated IPMI channel that can be called for fencing using the following primitives:
[source,XML]
----
<primitive class="stonith" id="fence_prod-mysql1_ipmi" type="fence_ipmilan">
<instance_attributes id="fence_prod-mysql1_ipmi-instance_attributes">
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-ipaddr" name="ipaddr" value="192.0.2.1"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-passwd" name="passwd" value="finishme"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-verbose" name="verbose" value="true"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql1"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-lanplus" name="lanplus" value="true"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="fence_prod-mysql2_ipmi" type="fence_ipmilan">
<instance_attributes id="fence_prod-mysql2_ipmi-instance_attributes">
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-ipaddr" name="ipaddr" value="192.0.2.2"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-passwd" name="passwd" value="finishme"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-verbose" name="verbose" value="true"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql2"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-lanplus" name="lanplus" value="true"/>
</instance_attributes>
</primitive>
----
.Second fencing method: dual PDU devices
Each cluster node also has two distinct power channels controlled by two
distinct PDUs. That means a total of 4 fencing devices configured as follows:
- Node 1, PDU 1, PSU 1 @ port 10
- Node 1, PDU 2, PSU 2 @ port 10
- Node 2, PDU 1, PSU 1 @ port 11
- Node 2, PDU 2, PSU 2 @ port 11
The matching fencing agents are configured as follows:
[source,XML]
----
<primitive class="stonith" id="fence_prod-mysql1_apc1" type="fence_apc_snmp">
<instance_attributes id="fence_prod-mysql1_apc1-instance_attributes">
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-ipaddr" name="ipaddr" value="198.51.100.1"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-port" name="port" value="10"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-passwd" name="passwd" value="fencing"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql1"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="fence_prod-mysql1_apc2" type="fence_apc_snmp">
<instance_attributes id="fence_prod-mysql1_apc2-instance_attributes">
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-ipaddr" name="ipaddr" value="203.0.113.1"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-port" name="port" value="10"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-passwd" name="passwd" value="fencing"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql1"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="fence_prod-mysql2_apc1" type="fence_apc_snmp">
<instance_attributes id="fence_prod-mysql2_apc1-instance_attributes">
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-ipaddr" name="ipaddr" value="198.51.100.1"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-port" name="port" value="11"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-passwd" name="passwd" value="fencing"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql2"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="fence_prod-mysql2_apc2" type="fence_apc_snmp">
<instance_attributes id="fence_prod-mysql2_apc2-instance_attributes">
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-ipaddr" name="ipaddr" value="203.0.113.1"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-port" name="port" value="11"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-passwd" name="passwd" value="fencing"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql2"/>
</instance_attributes>
</primitive>
----
.Location Constraints
To prevent STONITH from trying to run a fencing agent on the same node it is
supposed to fence, constraints are placed on all the fencing primitives:
[source,XML]
----
<constraints>
<rsc_location id="l_fence_prod-mysql1_ipmi" node="prod-mysql1" rsc="fence_prod-mysql1_ipmi" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql2_ipmi" node="prod-mysql2" rsc="fence_prod-mysql2_ipmi" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql1_apc2" node="prod-mysql1" rsc="fence_prod-mysql1_apc2" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql1_apc1" node="prod-mysql1" rsc="fence_prod-mysql1_apc1" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql2_apc1" node="prod-mysql2" rsc="fence_prod-mysql2_apc1" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql2_apc2" node="prod-mysql2" rsc="fence_prod-mysql2_apc2" score="-INFINITY"/>
</constraints>
----
.Fencing topology
Now that all the fencing resources are defined, it's time to create the right topology.
We want to fence using IPMI first, and if that does not work, fence both PDUs to reliably kill the node.
[source,XML]
----
<fencing-topology>
<fencing-level devices="fence_prod-mysql1_ipmi" id="fencing-2" index="1" target="prod-mysql1"/>
<fencing-level devices="fence_prod-mysql1_apc1,fence_prod-mysql1_apc2" id="fencing-3" index="2" target="prod-mysql1"/>
<fencing-level devices="fence_prod-mysql2_ipmi" id="fencing-0" index="1" target="prod-mysql2"/>
<fencing-level devices="fence_prod-mysql2_apc1,fence_prod-mysql2_apc2" id="fencing-1" index="2" target="prod-mysql2"/>
</fencing-topology>
----
Please note that in +fencing-topology+, the lowest +index+ value determines the priority of the first fencing method.
.Final configuration
Put together, the configuration looks like this:
[source,XML]
----
<cib admin_epoch="0" crm_feature_set="3.0.7" epoch="292" have-quorum="1" num_updates="29" validate-with="pacemaker-1.2">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="true"/>
<nvpair id="cib-bootstrap-options-stonith-action" name="stonith-action" value="off"/>
<nvpair id="cib-bootstrap-options-expected-quorum-votes" name="expected-quorum-votes" value="3"/>
...
</cluster_property_set>
</crm_config>
<nodes>
<node id="prod-mysql1" uname="prod-mysql1"/>
<node id="prod-mysql2" uname="prod-mysql2"/>
<node id="prod-mysql-rep1" uname="prod-mysql-rep1">
<instance_attributes id="prod-mysql-rep1">
<nvpair id="prod-mysql-rep1-standby" name="standby" value="on"/>
</instance_attributes>
</node>
</nodes>
<resources>
<primitive class="stonith" id="fence_prod-mysql1_ipmi" type="fence_ipmilan">
<instance_attributes id="fence_prod-mysql1_ipmi-instance_attributes">
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-ipaddr" name="ipaddr" value="192.0.2.1"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-passwd" name="passwd" value="finishme"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-verbose" name="verbose" value="true"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql1"/>
<nvpair id="fence_prod-mysql1_ipmi-instance_attributes-lanplus" name="lanplus" value="true"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="fence_prod-mysql2_ipmi" type="fence_ipmilan">
<instance_attributes id="fence_prod-mysql2_ipmi-instance_attributes">
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-ipaddr" name="ipaddr" value="192.0.2.2"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-passwd" name="passwd" value="finishme"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-verbose" name="verbose" value="true"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql2"/>
<nvpair id="fence_prod-mysql2_ipmi-instance_attributes-lanplus" name="lanplus" value="true"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="fence_prod-mysql1_apc1" type="fence_apc_snmp">
<instance_attributes id="fence_prod-mysql1_apc1-instance_attributes">
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-ipaddr" name="ipaddr" value="198.51.100.1"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-port" name="port" value="10"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-passwd" name="passwd" value="fencing"/>
<nvpair id="fence_prod-mysql1_apc1-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql1"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="fence_prod-mysql1_apc2" type="fence_apc_snmp">
<instance_attributes id="fence_prod-mysql1_apc2-instance_attributes">
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-ipaddr" name="ipaddr" value="203.0.113.1"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-port" name="port" value="10"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-passwd" name="passwd" value="fencing"/>
<nvpair id="fence_prod-mysql1_apc2-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql1"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="fence_prod-mysql2_apc1" type="fence_apc_snmp">
<instance_attributes id="fence_prod-mysql2_apc1-instance_attributes">
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-ipaddr" name="ipaddr" value="198.51.100.1"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-port" name="port" value="11"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-passwd" name="passwd" value="fencing"/>
<nvpair id="fence_prod-mysql2_apc1-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql2"/>
</instance_attributes>
</primitive>
<primitive class="stonith" id="fence_prod-mysql2_apc2" type="fence_apc_snmp">
<instance_attributes id="fence_prod-mysql2_apc2-instance_attributes">
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-ipaddr" name="ipaddr" value="203.0.113.1"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-action" name="action" value="off"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-port" name="port" value="11"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-login" name="login" value="fencing"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-passwd" name="passwd" value="fencing"/>
<nvpair id="fence_prod-mysql2_apc2-instance_attributes-pcmk_host_list" name="pcmk_host_list" value="prod-mysql2"/>
</instance_attributes>
</primitive>
</resources>
<constraints>
<rsc_location id="l_fence_prod-mysql1_ipmi" node="prod-mysql1" rsc="fence_prod-mysql1_ipmi" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql2_ipmi" node="prod-mysql2" rsc="fence_prod-mysql2_ipmi" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql1_apc2" node="prod-mysql1" rsc="fence_prod-mysql1_apc2" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql1_apc1" node="prod-mysql1" rsc="fence_prod-mysql1_apc1" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql2_apc1" node="prod-mysql2" rsc="fence_prod-mysql2_apc1" score="-INFINITY"/>
<rsc_location id="l_fence_prod-mysql2_apc2" node="prod-mysql2" rsc="fence_prod-mysql2_apc2" score="-INFINITY"/>
</constraints>
<fencing-topology>
<fencing-level devices="fence_prod-mysql1_ipmi" id="fencing-2" index="1" target="prod-mysql1"/>
<fencing-level devices="fence_prod-mysql1_apc1,fence_prod-mysql1_apc2" id="fencing-3" index="2" target="prod-mysql1"/>
<fencing-level devices="fence_prod-mysql2_ipmi" id="fencing-0" index="1" target="prod-mysql2"/>
<fencing-level devices="fence_prod-mysql2_apc1,fence_prod-mysql2_apc2" id="fencing-1" index="2" target="prod-mysql2"/>
</fencing-topology>
...
</configuration>
</cib>
----
== Remapping Reboots ==
When the cluster needs to reboot a node, whether because +stonith-action+ is +reboot+ or because
a reboot was manually requested (such as by `stonith_admin --reboot`), it will remap that to
other commands in two cases:
. If the chosen fencing device does not support the +reboot+ command, the cluster
will ask it to perform +off+ instead.
. If a fencing topology level with multiple devices must be executed, the cluster
will ask all the devices to perform +off+, then ask the devices to perform +on+.
To understand the second case, consider the example of a node with redundant
power supplies connected to intelligent power switches. Rebooting one switch
and then the other would have no effect on the node. Turning both switches off,
and then on, actually reboots the node.
In such a case, the fencing operation will be treated as successful as long as
the +off+ commands succeed, because then it is safe for the cluster to recover
any resources that were on the node. Timeouts and errors in the +on+ phase will
be logged but ignored.
When a reboot operation is remapped, any action-specific timeout for the
remapped action will be used (for example, +pcmk_off_timeout+ will be used when
executing the +off+ command, not +pcmk_reboot_timeout+).
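As a sketch (the device type and timeout values here are illustrative assumptions only), such action-specific timeouts are set as instance attributes of the fencing resource:

[source,XML]
----
<primitive id="power-fencing" class="stonith" type="fence_apc_snmp">
  <instance_attributes id="power-fencing-params">
    <!-- used when a remapped (or explicit) off command is executed -->
    <nvpair id="power-fencing-off-timeout" name="pcmk_off_timeout" value="120s"/>
    <!-- used only when the device itself performs a reboot -->
    <nvpair id="power-fencing-reboot-timeout" name="pcmk_reboot_timeout" value="60s"/>
  </instance_attributes>
</primitive>
----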
[NOTE]
====
In Pacemaker versions 1.1.13 and earlier, reboots will not be remapped in the
second case. To achieve the same effect, separate fencing devices for off and
on actions must be configured.
====
diff --git a/doc/Pacemaker_Explained/en-US/Revision_History.xml b/doc/Pacemaker_Explained/en-US/Revision_History.xml
index 33010d5c0e..4bd3485d26 100644
--- a/doc/Pacemaker_Explained/en-US/Revision_History.xml
+++ b/doc/Pacemaker_Explained/en-US/Revision_History.xml
@@ -1,72 +1,84 @@
<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
]>
<appendix>
<!-- see comment in Book_Info.xml for revision numbering -->
<title>Revision History</title>
<simpara>
<revhistory>
<revision>
<revnumber>1-0</revnumber>
<date>19 Oct 2009</date>
<author><firstname>Andrew</firstname><surname>Beekhof</surname><email>andrew@beekhof.net</email></author>
<revdescription><simplelist><member>Import from Pages.app</member></simplelist></revdescription>
</revision>
<revision>
<revnumber>2-0</revnumber>
<date>26 Oct 2009</date>
<author><firstname>Andrew</firstname><surname>Beekhof</surname><email>andrew@beekhof.net</email></author>
<revdescription><simplelist><member>Cleanup and reformatting of docbook xml complete</member></simplelist></revdescription>
</revision>
<revision>
<revnumber>3-0</revnumber>
<date>Tue Nov 12 2009</date>
<author><firstname>Andrew</firstname><surname>Beekhof</surname><email>andrew@beekhof.net</email></author>
<revdescription>
<simplelist>
<member>Split book into chapters and pass validation</member>
<member>Re-organize book for use with <ulink url="https://fedorahosted.org/publican/">Publican</ulink></member>
</simplelist>
</revdescription>
</revision>
<revision>
<revnumber>4-0</revnumber>
<date>Mon Oct 8 2012</date>
<author><firstname>Andrew</firstname><surname>Beekhof</surname><email>andrew@beekhof.net</email></author>
<revdescription>
<simplelist>
<member>
Converted to <ulink url="http://www.methods.co.nz/asciidoc">asciidoc</ulink>
(which is converted to docbook for use with
<ulink url="https://fedorahosted.org/publican/">Publican</ulink>)
</member>
</simplelist>
</revdescription>
</revision>
<revision>
<revnumber>5-0</revnumber>
<date>Mon Feb 23 2015</date>
<author><firstname>Ken</firstname><surname>Gaillot</surname><email>kgaillot@redhat.com</email></author>
<revdescription>
<simplelist>
<member>
Update for clarity, stylistic consistency and current command-line syntax
</member>
</simplelist>
</revdescription>
</revision>
<revision>
<revnumber>6-0</revnumber>
<date>Tue Dec 8 2015</date>
<author><firstname>Ken</firstname><surname>Gaillot</surname><email>kgaillot@redhat.com</email></author>
<revdescription>
<simplelist>
<member>
Update for Pacemaker 1.1.14
</member>
</simplelist>
</revdescription>
</revision>
+ <revision>
+ <revnumber>7-0</revnumber>
+ <date>Tue May 3 2016</date>
+ <author><firstname>Ken</firstname><surname>Gaillot</surname><email>kgaillot@redhat.com</email></author>
+ <revdescription>
+ <simplelist>
+ <member>
+ Update for Pacemaker 1.1.15
+ </member>
+ </simplelist>
+ </revdescription>
+ </revision>
</revhistory>
</simpara>
</appendix>
diff --git a/doc/Pacemaker_Remote/en-US/Book_Info.xml b/doc/Pacemaker_Remote/en-US/Book_Info.xml
index 12e1ab891d..1e3675b9d1 100644
--- a/doc/Pacemaker_Remote/en-US/Book_Info.xml
+++ b/doc/Pacemaker_Remote/en-US/Book_Info.xml
@@ -1,75 +1,75 @@
<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE bookinfo PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Pacemaker_Remote.ent">
%BOOK_ENTITIES;
]>
<bookinfo id="book-Pacemaker_Remote-Pacemaker_Remote">
<title>Pacemaker Remote</title>
<subtitle>Scaling High Availability Clusters</subtitle>
<!--
EDITION-PUBSNUMBER should match REVNUMBER in Revision_History.xml.
Increment EDITION when the syntax of the documented software
changes (OS, pacemaker, corosync, pcs), and PUBSNUMBER for
simple textual changes (corrections, translations, etc.).
-->
- <edition>5</edition>
+ <edition>6</edition>
<pubsnumber>0</pubsnumber>
<abstract>
<para>
This document serves as both a reference and a deployment guide for the Pacemaker Remote service.
</para>
<para>
The example commands in this document will use:
<orderedlist>
<listitem>
<para>
&DISTRO; &DISTRO_VERSION; as the host operating system
</para>
</listitem>
<listitem>
<para>
Pacemaker Remote to perform resource management within guest nodes and remote nodes
</para>
</listitem>
<listitem>
<para>
KVM for virtualization
</para>
</listitem>
<listitem>
<para>
libvirt to manage guest nodes
</para>
</listitem>
<listitem>
<para>
Corosync to provide messaging and membership services on cluster nodes
</para>
</listitem>
<listitem>
<para>
Pacemaker to perform resource management on cluster nodes
</para>
</listitem>
<listitem>
<para>
pcs as the cluster configuration toolset
</para>
</listitem>
</orderedlist>
The concepts are the same for other distributions,
virtualization platforms, toolsets, and messaging
layers, and should be easily adaptable.
</para>
</abstract>
<corpauthor>
<inlinemediaobject>
<imageobject>
<imagedata fileref="Common_Content/images/title_logo.svg" format="SVG" />
</imageobject>
</inlinemediaobject>
</corpauthor>
<xi:include href="Common_Content/Legal_Notice.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
<xi:include href="Author_Group.xml" xmlns:xi="http://www.w3.org/2001/XInclude" />
</bookinfo>
diff --git a/doc/Pacemaker_Remote/en-US/Ch-Alternatives.txt b/doc/Pacemaker_Remote/en-US/Ch-Alternatives.txt
index d2fd9f42fd..7cf45ab423 100644
--- a/doc/Pacemaker_Remote/en-US/Ch-Alternatives.txt
+++ b/doc/Pacemaker_Remote/en-US/Ch-Alternatives.txt
@@ -1,77 +1,78 @@
= Alternative Configurations =
These alternative configurations may be appropriate in limited cases, such as a
test cluster, but are not the best method in most situations. They are
presented here for completeness, and as examples of pacemaker's flexibility
in adapting to your needs.
== Virtual Machines as Cluster Nodes ==
The preferred use of virtual machines in a pacemaker cluster is as a
cluster resource, whether opaque or as a guest node. However, it is
possible to run the full cluster stack on a virtual node instead.
This is commonly used to set up test environments; a single physical host
(that does not participate in the cluster) runs two or more virtual machines,
all running the full cluster stack. This can be used to simulate a
larger cluster for testing purposes.
In a production environment, fencing becomes more complicated, especially
if the underlying hosts run any services besides the clustered VMs.
If the VMs are not guaranteed a minimum amount of host resources,
CPU and I/O contention can cause timing issues for cluster components.
Another situation where this approach is sometimes used is when
the cluster owner leases the VMs from a provider and does not have
direct access to the underlying host. The main concerns in this case
are proper fencing (usually via a custom resource agent that communicates
with the provider's APIs) and maintaining a static IP address between reboots,
as well as resource contention issues.
== Virtual Machines as Remote Nodes ==
Virtual machines may be configured following the process for remote nodes
rather than guest nodes (i.e., using an *ocf:pacemaker:remote* resource
rather than letting the cluster manage the VM directly).
This is mainly useful in testing, to use a single physical host to simulate a
larger cluster involving remote nodes. Pacemaker's Cluster Test Suite (CTS)
uses this approach to test remote node functionality.
== Containers as Guest Nodes ==
Containers,footnote:[https://en.wikipedia.org/wiki/Operating-system-level_virtualization]
and in particular Linux containers (LXC) and Docker, have become a popular
method of isolating services in a resource-efficient manner.
The preferred means of integrating containers into Pacemaker is as a
cluster resource, whether opaque or using Pacemaker's built-in
resource isolation support.footnote:[Documentation for this support is planned
but not yet available.]
However, it is possible to run `pacemaker_remote` inside a container,
following the process for guest nodes. This is not recommended but can
be useful, for example, in testing scenarios, to simulate a large number of
guest nodes.
The configuration process is very similar to that described for guest nodes
using virtual machines. Key differences:
* The underlying host must install the libvirt driver for the desired container
technology -- for example, the +libvirt-daemon-lxc+ package to get the
http://libvirt.org/drvlxc.html[libvirt-lxc] driver for LXC containers.
* Libvirt XML definitions must be generated for the containers. The
- +pacemaker-cts+ package includes a helpful script for this purpose,
+ +pacemaker-cts+ package includes a script for this purpose,
+/usr/share/pacemaker/tests/cts/lxc_autogen.sh+. Run it with the
- `--help` option for details on how to use it. Of course, you can create
- XML definitions manually, following the appropriate libvirt driver
- documentation.
+ `--help` option for details on how to use it. It is intended for testing
+ purposes only, and hardcodes various parameters that would need to be set
+ appropriately in real usage. Of course, you can create XML definitions
+ manually, following the appropriate libvirt driver documentation.
* To share the authentication key, either share the host's +/etc/pacemaker+
directory with the container, or copy the key into the container's
filesystem.
* The *VirtualDomain* resource for a container will need
*force_stop="true"* and an appropriate hypervisor option,
for example *hypervisor="lxc:///"* for LXC containers.
diff --git a/doc/Pacemaker_Remote/en-US/Ch-Baremetal-Tutorial.txt b/doc/Pacemaker_Remote/en-US/Ch-Baremetal-Tutorial.txt
index c187b2536f..f866c9a944 100644
--- a/doc/Pacemaker_Remote/en-US/Ch-Baremetal-Tutorial.txt
+++ b/doc/Pacemaker_Remote/en-US/Ch-Baremetal-Tutorial.txt
@@ -1,306 +1,310 @@
= Remote Node Walk-through =
*What this tutorial is:* An in-depth walk-through of how to get Pacemaker to
integrate a remote node into the cluster as a node capable of running cluster
resources.
*What this tutorial is not:* A realistic deployment scenario. The steps shown
here are meant to get users familiar with the concept of remote nodes as
quickly as possible.
This tutorial requires three machines: two to act as cluster nodes, and
a third to act as the remote node.
== Configure Remote Node ==
=== Configure Firewall on Remote Node ===
Allow cluster-related services through the local firewall:
----
# firewall-cmd --permanent --add-service=high-availability
success
# firewall-cmd --reload
success
----
[NOTE]
======
If you are using iptables directly, or some other firewall solution besides
firewalld, simply open the following ports, which can be used by various
clustering components: TCP ports 2224, 3121, and 21064, and UDP port 5405.
If you run into any problems during testing, you might want to disable
the firewall and SELinux entirely until you have everything working.
This may create significant security issues and should not be performed on
machines that will be exposed to the outside world, but may be appropriate
during development and testing on a protected host.
To disable security measures:
----
# setenforce 0
# sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config
# systemctl disable firewalld.service
# systemctl stop firewalld.service
# iptables --flush
----
======
=== Configure pacemaker_remote on Remote Node ===
Install the pacemaker_remote daemon on the remote node.
----
# yum install -y pacemaker-remote resource-agents pcs
----
Create a location for the shared authentication key:
----
# mkdir -p --mode=0750 /etc/pacemaker
# chgrp haclient /etc/pacemaker
----
All nodes (both cluster nodes and remote nodes) must have the same
authentication key installed for the communication to work correctly.
If you already have a key on an existing node, copy it to the new
remote node. Otherwise, create a new key, for example:
----
# dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
----
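The key-creation steps above can be sketched as a small idempotent script. This is a hypothetical helper, not part of Pacemaker; it uses a scratch directory so it can be tried without root, whereas real usage would target +/etc/pacemaker+ and also `chgrp` the directory to *haclient*:

```shell
# Sketch: create the shared authentication key only if one does not
# already exist. PCMK_DIR stands in for /etc/pacemaker in real usage.
PCMK_DIR="$(mktemp -d)/pacemaker"
mkdir -p --mode=0750 "$PCMK_DIR"
if [ ! -f "$PCMK_DIR/authkey" ]; then
    dd if=/dev/urandom of="$PCMK_DIR/authkey" bs=4096 count=1 2>/dev/null
    chmod 640 "$PCMK_DIR/authkey"
fi
stat -c '%s' "$PCMK_DIR/authkey"   # prints 4096
```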
Now start and enable the pacemaker_remote daemon on the remote node.
----
# systemctl enable pacemaker_remote.service
# systemctl start pacemaker_remote.service
----
Verify the start was successful.
----
# systemctl status pacemaker_remote
pacemaker_remote.service - Pacemaker Remote Service
Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled)
Active: active (running) since Fri 2015-08-21 15:21:20 CDT; 20s ago
Main PID: 21273 (pacemaker_remot)
CGroup: /system.slice/pacemaker_remote.service
└─21273 /usr/sbin/pacemaker_remoted
Aug 21 15:21:20 remote1 systemd[1]: Starting Pacemaker Remote Service...
Aug 21 15:21:20 remote1 systemd[1]: Started Pacemaker Remote Service.
Aug 21 15:21:20 remote1 pacemaker_remoted[21273]: notice: crm_add_logfile: Additional logging available in /var/log/pacemaker.log
Aug 21 15:21:20 remote1 pacemaker_remoted[21273]: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3121.
Aug 21 15:21:20 remote1 pacemaker_remoted[21273]: notice: bind_and_listen: Listening on address ::
----
== Verify Connection to Remote Node ==
Before moving forward, it's worth verifying that the cluster nodes
can contact the remote node on port 3121. Here's a trick you can use.
Connect using ssh from each of the cluster nodes. The connection will get
destroyed, but how it is destroyed tells you whether it worked or not.
First, add the remote node's hostname (we're using *remote1* in this tutorial)
to the cluster nodes' +/etc/hosts+ files if you haven't already. This
is required unless you have DNS set up in a way where remote1's address can be
discovered.
Execute the following on each cluster node, replacing the IP address with the
actual IP address of the remote node.
----
# cat << END >> /etc/hosts
192.168.122.10 remote1
END
----
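The append above can be made idempotent with a small helper, so that re-running the setup does not duplicate the entry. This is a hypothetical sketch using a temporary file in place of +/etc/hosts+:

```shell
# Sketch: append an address/hostname pair only if the hostname is
# not already present. HOSTS_FILE stands in for /etc/hosts.
HOSTS_FILE="$(mktemp)"          # real usage: HOSTS_FILE=/etc/hosts
add_host_entry() {
    # $1 = IP address, $2 = hostname; append only if hostname absent
    grep -qw "$2" "$HOSTS_FILE" || printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
}
add_host_entry 192.168.122.10 remote1
add_host_entry 192.168.122.10 remote1   # second call changes nothing
grep -c remote1 "$HOSTS_FILE"           # prints 1
```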
If running the ssh command on one of the cluster nodes results in this
-output before disconnecting, the connection works.
+output before disconnecting, the connection works:
----
# ssh -p 3121 remote1
ssh_exchange_identification: read: Connection reset by peer
----
-If you see this, the connection is not working.
+If you see one of these, the connection is not working:
----
# ssh -p 3121 remote1
ssh: connect to host remote1 port 3121: No route to host
----
+----
+# ssh -p 3121 remote1
+ssh: connect to host remote1 port 3121: Connection refused
+----
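If ssh is unavailable, a hypothetical bash helper can perform a similar reachability check using bash's +/dev/tcp+ feature. Note that this only tells you whether the port accepts TCP connections, not that pacemaker_remote is healthy:

```shell
# Sketch: probe a TCP port from a cluster node. Host and port below
# follow the tutorial's example values.
probe_port() {
    # $1 = host, $2 = port; prints "open" if a TCP connection succeeds
    if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
        echo open
    else
        echo closed
    fi
}
probe_port remote1 3121   # "open" if pacemaker_remote is listening
```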
Once you can successfully connect to the remote node from both
cluster nodes, move on to setting up Pacemaker on the cluster nodes.
== Configure Cluster Nodes ==
=== Configure Firewall on Cluster Nodes ===
On each cluster node, allow cluster-related services through the local
firewall, following the same procedure as in <<_configure_firewall_on_remote_node>>.
=== Install Pacemaker on Cluster Nodes ===
On the two cluster nodes, install the following packages.
----
# yum install -y pacemaker corosync pcs resource-agents
----
=== Copy Authentication Key to Cluster Nodes ===
Create a location for the shared authentication key,
and copy it from any existing node:
----
# mkdir -p --mode=0750 /etc/pacemaker
# chgrp haclient /etc/pacemaker
# scp remote1:/etc/pacemaker/authkey /etc/pacemaker/authkey
----
=== Configure Corosync on Cluster Nodes ===
Corosync handles Pacemaker's cluster membership and messaging. The corosync
config file is located in +/etc/corosync/corosync.conf+. That config file must be
initialized with information about the two cluster nodes before pacemaker can
start.
To initialize the corosync config file, execute the following pcs command on
both nodes, replacing the angle-bracket placeholders with your nodes' information.
----
# pcs cluster setup --force --local --name mycluster <node1 ip or hostname> <node2 ip or hostname>
----
=== Start Pacemaker on Cluster Nodes ===
Start the cluster stack on both cluster nodes using the following command.
----
# pcs cluster start
----
Verify corosync membership
....
# pcs status corosync
Membership information
----------------------
Nodeid Votes Name
1 1 node1 (local)
....
Verify Pacemaker status. At first, the `pcs cluster status` output will look
like this.
----
# pcs status
Cluster name: mycluster
Last updated: Fri Aug 21 16:14:05 2015
Last change: Fri Aug 21 14:02:14 2015
Stack: corosync
Current DC: NONE
Version: 1.1.12-a14efad
1 Nodes configured, unknown expected votes
0 Resources configured
----
After about a minute, you should see your two cluster nodes come online.
----
# pcs status
Cluster name: mycluster
Last updated: Fri Aug 21 16:16:32 2015
Last change: Fri Aug 21 14:02:14 2015
Stack: corosync
Current DC: node1 (1) - partition with quorum
Version: 1.1.12-a14efad
2 Nodes configured
0 Resources configured
Online: [ node1 node2 ]
----
For the sake of this tutorial, we are going to disable stonith to avoid having to cover fencing device configuration.
----
# pcs property set stonith-enabled=false
----
== Integrate Remote Node into Cluster ==
Integrating a remote node into the cluster is achieved through the
creation of a remote node connection resource. The remote node connection
resource both establishes the connection to the remote node and defines that
the remote node exists. Note that this resource is actually internal to
Pacemaker's crmd component. A metadata file describing the available options
can be found at +/usr/lib/ocf/resource.d/pacemaker/remote+, but there is no
actual *ocf:pacemaker:remote* resource agent script that performs any work.
Define the remote node connection resource to our remote node,
*remote1*, using the following command on any cluster node.
----
# pcs resource create remote1 ocf:pacemaker:remote
----
That's it. After a moment you should see the remote node come online.
----
Cluster name: mycluster
Last updated: Fri Aug 21 17:13:09 2015
Last change: Fri Aug 21 17:02:02 2015
Stack: corosync
Current DC: node1 (1) - partition with quorum
Version: 1.1.12-a14efad
3 Nodes configured
1 Resources configured
Online: [ node1 node2 ]
RemoteOnline: [ remote1 ]
Full list of resources:
remote1 (ocf::pacemaker:remote): Started node1
PCSD Status:
node1: Online
node2: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
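For reference, the connection resource created above corresponds to a CIB primitive along these lines. This is an illustrative sketch: the ids and the monitor operation shown are assumptions, and the defaults your pcs version applies may differ.

```xml
<primitive id="remote1" class="ocf" provider="pacemaker" type="remote">
  <!-- pcs typically adds a recurring monitor; the interval is illustrative -->
  <operations>
    <op id="remote1-monitor-interval-60s" interval="60s" name="monitor"/>
  </operations>
</primitive>
```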
== Starting Resources on Remote Node ==
Once the remote node is integrated into the cluster, starting resources on a
remote node is the exact same as on cluster nodes. Refer to the
http://clusterlabs.org/doc/['Clusters from Scratch'] document for examples of
resource creation.
[WARNING]
=========
Never involve a remote node connection resource in a resource group,
colocation constraint, or order constraint.
=========
== Fencing Remote Nodes ==
Remote nodes are fenced the same way as cluster nodes. No special
considerations are required. Configure fencing resources for use with
remote nodes the same as you would with cluster nodes.
Note, however, that remote nodes can never 'initiate' a fencing action. Only
cluster nodes are capable of actually executing a fencing operation against
another node.
== Accessing Cluster Tools from a Remote Node ==
Besides allowing the cluster to manage resources on a remote node,
pacemaker_remote has one other trick. The pacemaker_remote daemon allows
nearly all the pacemaker tools (`crm_resource`, `crm_mon`, `crm_attribute`,
`crm_master`, etc.) to work on remote nodes natively.
Try it: Run `crm_mon` on the remote node after pacemaker has
integrated it into the cluster. These tools just work. This means resource
agents that need access to tools such as `crm_master` (for example,
master/slave resources) work seamlessly on remote nodes.
Higher-level command shells such as `pcs` may have partial support
on remote nodes, but it is recommended to run them from a cluster node.
diff --git a/doc/Pacemaker_Remote/en-US/Ch-Intro.txt b/doc/Pacemaker_Remote/en-US/Ch-Intro.txt
index 9edf054a69..416c19d880 100644
--- a/doc/Pacemaker_Remote/en-US/Ch-Intro.txt
+++ b/doc/Pacemaker_Remote/en-US/Ch-Intro.txt
@@ -1,198 +1,204 @@
= Scaling a Pacemaker Cluster =
== Overview ==
In a basic Pacemaker high-availability
cluster,footnote:[See the http://www.clusterlabs.org/doc/[Pacemaker
documentation], especially 'Clusters From Scratch' and 'Pacemaker Explained',
for basic information about high-availability using Pacemaker]
each node runs the full cluster stack of corosync and all Pacemaker components.
This allows great flexibility but limits scalability to around 16 nodes.
To allow for scalability to dozens or even hundreds of nodes, Pacemaker
allows nodes not running the full cluster stack to integrate into the cluster
and have the cluster manage their resources as if they were a cluster node.
== Terms ==
cluster node::
A node running the full high-availability stack of corosync and all
Pacemaker components. Cluster nodes may run cluster resources, run
all Pacemaker command-line tools (`crm_mon`, `crm_resource` and so on),
execute fencing actions, count toward cluster quorum, and serve as the
cluster's Designated Controller (DC).
(((cluster node)))
(((node,cluster node)))
pacemaker_remote::
A small service daemon that allows a host to be used as a Pacemaker node
without running the full cluster stack. Nodes running pacemaker_remote
may run cluster resources and most command-line tools, but cannot perform
other functions of full cluster nodes such as fencing execution, quorum
voting or DC eligibility. The pacemaker_remote daemon is an enhanced
version of Pacemaker's local resource management daemon (LRMD).
(((pacemaker_remote)))
remote node::
A physical host running pacemaker_remote. Remote nodes have a special
resource that manages communication with the cluster. This is sometimes
referred to as the 'baremetal' case.
(((remote node)))
(((node,remote node)))
guest node::
A virtual host running pacemaker_remote. Guest nodes differ from remote
nodes mainly in that the guest node is itself a resource that the cluster
manages.
(((guest node)))
(((node,guest node)))
[NOTE]
======
'Remote' in this document refers to the node not being a part of the underlying
corosync cluster. It has nothing to do with physical proximity. Remote nodes
and guest nodes are subject to the same latency requirements as cluster nodes,
which means they are typically in the same data center.
======
[NOTE]
======
It is important to distinguish the various roles a virtual machine can serve
in Pacemaker clusters:
* A virtual machine can run the full cluster stack, in which case it is a
cluster node and is not itself managed by the cluster.
* A virtual machine can be managed by the cluster as a resource, without the
cluster having any awareness of the services running inside the virtual
machine. The virtual machine is 'opaque' to the cluster.
* A virtual machine can be a cluster resource, and run pacemaker_remote
to make it a guest node, allowing the cluster to manage services
inside it. The virtual machine is 'transparent' to the cluster.
======
== Support in Pacemaker Versions ==
It is recommended to run Pacemaker 1.1.12 or later when using pacemaker_remote
due to important bug fixes. An overview of changes in pacemaker_remote
capability by version:
+.1.1.15
+* If pacemaker_remote is stopped on an active node, it will wait for the
+ cluster to migrate all resources off before exiting, rather than exit
+ immediately and get fenced.
+* Bug fixes
+
.1.1.14
* Resources that create guest nodes can be included in groups
* reconnect_interval option for remote nodes
* Bug fixes, including a memory leak
.1.1.13
* Support for maintenance mode
* Remote nodes can recover without being fenced when the cluster node
hosting their connection fails
* Running pacemaker_remote within LXC environments is deprecated due to
newly added Pacemaker support for isolated resources
* Bug fixes
.1.1.12
* Support for permanent node attributes
* Support for migration
* Bug fixes
.1.1.11
* Support for IPv6
* Support for remote nodes
* Support for transient node attributes
* Support for clusters with mixed endian architectures
* Bug fixes
.1.1.10
* Bug fixes
.1.1.9
* Initial version to include pacemaker_remote
* Limited to guest nodes in KVM/LXC environments using only IPv4;
all nodes' architectures must have the same endianness
== Guest Nodes ==
(((guest node)))
(((node,guest node)))
*"I want a Pacemaker cluster to manage virtual machine resources, but I also
want Pacemaker to be able to manage the resources that live within those
virtual machines."*
Without pacemaker_remote, the possibilities for implementing the above use case
have significant limitations:
* The cluster stack could be run on the physical hosts only, which loses the
ability to monitor resources within the guests.
* A separate cluster could be on the virtual guests, which quickly hits
scalability issues.
* The cluster stack could be run on the guests using the same cluster as the
physical hosts, which also hits scalability issues and complicates fencing.
With pacemaker_remote:
* The physical hosts are cluster nodes (running the full cluster stack).
* The virtual machines are guest nodes (running the pacemaker_remote service).
Nearly zero configuration is required on the virtual machine.
* The cluster stack on the cluster nodes launches the virtual machines and
immediately connects to the pacemaker_remote service on them, allowing the
virtual machines to integrate into the cluster.
The key difference here between the guest nodes and the cluster nodes is that
the guest nodes do not run the cluster stack. This means they will never become
the DC, initiate fencing actions or participate in quorum voting.
On the other hand, this also means that they are not bound to the scalability
limits associated with the cluster stack (no 16-node corosync member limits to
deal with). That isn't to say that guest nodes can scale indefinitely, but it
is known that guest nodes scale horizontally much further than cluster nodes.
Other than the quorum limitation, these guest nodes behave just like cluster
nodes with respect to resource management. The cluster is fully capable of
managing and monitoring resources on each guest node. You can build constraints
against guest nodes, put them in standby, or do whatever else you'd expect to
be able to do with cluster nodes. They even show up in `crm_mon` output as
nodes.
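As an illustrative sketch, a location constraint preferring a guest node uses exactly the same CIB syntax as one for a cluster node (the resource and node names here are hypothetical):

```xml
<rsc_location id="loc-webserver-prefers-guest1" rsc="webserver"
              node="guest1" score="INFINITY"/>
```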
To solidify the concept, below is an example that is very similar to an actual
deployment we test in our developer environment to verify guest node scalability:
* 16 cluster nodes running the full corosync + pacemaker stack
* 64 Pacemaker-managed virtual machine resources running pacemaker_remote configured as guest nodes
* 64 Pacemaker-managed webserver and database resources configured to run on the 64 guest nodes
With this deployment, you would have 64 webservers and databases running on 64
virtual machines on 16 hardware nodes, all of which are managed and monitored by
the same Pacemaker deployment. It is known that pacemaker_remote can scale to
this scale, and possibly much further, depending on the specific scenario.
== Remote Nodes ==
(((remote node)))
(((node,remote node)))
*"I want my traditional high-availability cluster to scale beyond the limits
imposed by the corosync messaging layer."*
Ultimately, the primary advantage of remote nodes over cluster nodes is
scalability. Remote nodes may also serve a purpose in other use cases, such as
geographically distributed HA clusters, but those use cases are not well
understood at this point.
Like guest nodes, remote nodes will never become the DC, initiate
fencing actions or participate in quorum voting.
That is not to say, however, that fencing of a remote node works any
differently than that of a cluster node. The Pacemaker policy engine
understands how to fence remote nodes. As long as a fencing device exists, the
cluster is capable of ensuring remote nodes are fenced in the exact same way as
cluster nodes.
== Expanding the Cluster Stack ==
With pacemaker_remote, the traditional view of the high-availability stack can
be expanded to include a new layer:
.Traditional HA Stack
image::images/pcmk-ha-cluster-stack.png["Traditional Pacemaker+Corosync Stack",width="17cm",height="9cm",align="center"]
.HA Stack With Guest Nodes
image::images/pcmk-ha-remote-stack.png["Pacemaker+Corosync Stack With pacemaker_remote",width="20cm",height="10cm",align="center"]
diff --git a/doc/Pacemaker_Remote/en-US/Ch-KVM-Tutorial.txt b/doc/Pacemaker_Remote/en-US/Ch-KVM-Tutorial.txt
index 72a9076592..7f09598e31 100644
--- a/doc/Pacemaker_Remote/en-US/Ch-KVM-Tutorial.txt
+++ b/doc/Pacemaker_Remote/en-US/Ch-KVM-Tutorial.txt
@@ -1,583 +1,583 @@
= Guest Node Walk-through =
*What this tutorial is:* An in-depth walk-through of how to get Pacemaker to
manage a KVM guest instance and integrate that guest into the cluster as a
guest node.
*What this tutorial is not:* A realistic deployment scenario. The steps shown
here are meant to get users familiar with the concept of guest nodes as quickly
as possible.
== Configure the Physical Host ==
[NOTE]
======
For this example, we will use a single physical host named *example-host*.
A production cluster would likely have multiple physical hosts, in which case
you would run the commands here on each one, unless noted otherwise.
======
=== Configure Firewall on Host ===
On the physical host, allow cluster-related services through the local firewall:
----
# firewall-cmd --permanent --add-service=high-availability
success
# firewall-cmd --reload
success
----
[NOTE]
======
If you are using iptables directly, or some other firewall solution besides
firewalld, simply open the following ports, which can be used by various
clustering components: TCP ports 2224, 3121, and 21064, and UDP port 5405.
If you run into any problems during testing, you might want to disable
the firewall and SELinux entirely until you have everything working.
This may create significant security issues and should not be performed on
machines that will be exposed to the outside world, but may be appropriate
during development and testing on a protected host.
To disable security measures:
----
[root@pcmk-1 ~]# setenforce 0
[root@pcmk-1 ~]# sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config
[root@pcmk-1 ~]# systemctl disable firewalld.service
[root@pcmk-1 ~]# systemctl stop firewalld.service
[root@pcmk-1 ~]# iptables --flush
----
======
=== Install Cluster Software ===
----
# yum install -y pacemaker corosync pcs resource-agents
----
=== Configure Corosync ===
Corosync handles pacemaker's cluster membership and messaging. The corosync
config file is located in +/etc/corosync/corosync.conf+. That config file must
be initialized with information about the cluster nodes before pacemaker can
start.
To initialize the corosync config file, execute the following `pcs` command,
replacing the cluster name and hostname as desired:
----
# pcs cluster setup --force --local --name mycluster example-host
----
[NOTE]
======
If you have multiple physical hosts, you would execute the setup command on
only one host, but list all of them at the end of the command.
======
=== Configure Pacemaker for Remote Node Communication ===
Create a place to hold an authentication key for use with pacemaker_remote:
----
# mkdir -p --mode=0750 /etc/pacemaker
# chgrp haclient /etc/pacemaker
----
Generate a key:
----
# dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
----
[NOTE]
======
If you have multiple physical hosts, you would generate the key on only one
host, and copy it to the same location on all hosts.
======
=== Verify Cluster Software ===
Start the cluster
----
# pcs cluster start
----
Verify corosync membership
....
# pcs status corosync
Membership information
----------------------
Nodeid Votes Name
1 1 example-host (local)
....
Verify pacemaker status. At first, the output will look like this:
----
# pcs status
Cluster name: mycluster
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Fri Oct 9 15:18:32 2015 Last change: Fri Oct 9 12:42:21 2015 by root via cibadmin on example-host
Stack: corosync
Current DC: NONE
1 node and 0 resources configured
Node example-host: UNCLEAN (offline)
Full list of resources:
PCSD Status:
example-host: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
After a short amount of time, you should see your host as a single node in the
cluster:
----
# pcs status
Cluster name: mycluster
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Fri Oct 9 15:20:05 2015 Last change: Fri Oct 9 12:42:21 2015 by root via cibadmin on example-host
Stack: corosync
Current DC: example-host (version 1.1.13-a14efad) - partition WITHOUT quorum
1 node and 0 resources configured
Online: [ example-host ]
Full list of resources:
PCSD Status:
example-host: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
=== Disable STONITH and Quorum ===
Now, enable the cluster to work without quorum or stonith. This is required
for the sake of getting this tutorial to work with a single cluster node.
----
# pcs property set stonith-enabled=false
# pcs property set no-quorum-policy=ignore
----
[WARNING]
=========
The use of `stonith-enabled=false` is completely inappropriate for a production
cluster. It tells the cluster to simply pretend that failed nodes are safely
powered off. Some vendors will refuse to support clusters that have STONITH
disabled. We disable STONITH here only to focus the discussion on
pacemaker_remote, and to be able to use a single physical host in the example.
=========
Now, the status output should look similar to this:
----
# pcs status
Cluster name: mycluster
Last updated: Fri Oct 9 15:22:49 2015 Last change: Fri Oct 9 15:22:46 2015 by root via cibadmin on example-host
Stack: corosync
Current DC: example-host (version 1.1.13-a14efad) - partition with quorum
1 node and 0 resources configured
Online: [ example-host ]
Full list of resources:
PCSD Status:
example-host: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
Go ahead and stop the cluster for now after verifying everything is in order.
----
# pcs cluster stop --force
----
=== Install Virtualization Software ===
----
# yum install -y kvm libvirt qemu-system qemu-kvm bridge-utils virt-manager
# systemctl enable libvirtd.service
----
Reboot the host.
[NOTE]
======
While KVM is used in this example, any virtualization platform with a Pacemaker
resource agent can be used to create a guest node. The resource agent needs
only to support the usual commands (start, stop, etc.); Pacemaker implements the
*remote-node* meta-attribute, independent of the agent.
======
== Configure the KVM guest ==
=== Create Guest ===
We will not outline here the installation steps required to create a KVM
guest. There are plenty of tutorials available elsewhere that do that.
Just be sure to configure the guest with a hostname and a static IP address
(as an example here, we will use guest1 and 192.168.122.10).
=== Configure Firewall on Guest ===
On each guest, allow cluster-related services through the local firewall,
following the same procedure as in <<_configure_firewall_on_host>>.
=== Verify Connectivity ===
At this point, you should be able to ping and ssh into guests from hosts, and
vice versa.
=== Configure pacemaker_remote ===
Install pacemaker_remote, and enable it to run at start-up. Here, we also
install the pacemaker package; it is not required, but it contains the dummy
resource agent that we will use later for testing.
----
# yum install -y pacemaker pacemaker-remote resource-agents
# systemctl enable pacemaker_remote.service
----
Copy the authentication key from a host:
----
# mkdir -p --mode=0750 /etc/pacemaker
# chgrp haclient /etc/pacemaker
# scp root@example-host:/etc/pacemaker/authkey /etc/pacemaker
----
Start pacemaker_remote, and verify the start was successful:
----
# systemctl start pacemaker_remote
# systemctl status pacemaker_remote
pacemaker_remote.service - Pacemaker Remote Service
Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled)
Active: active (running) since Thu 2013-03-14 18:24:04 EDT; 2min 8s ago
Main PID: 1233 (pacemaker_remot)
CGroup: name=systemd:/system/pacemaker_remote.service
└─1233 /usr/sbin/pacemaker_remoted
Mar 14 18:24:04 guest1 systemd[1]: Starting Pacemaker Remote Service...
Mar 14 18:24:04 guest1 systemd[1]: Started Pacemaker Remote Service.
Mar 14 18:24:04 guest1 pacemaker_remoted[1233]: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3121.
----
=== Verify Host Connection to Guest ===
Before moving forward, it's worth verifying that the host can contact the guest
on port 3121. Here's a trick you can use. Connect using ssh from the host. The
connection will get destroyed, but how it is destroyed tells you whether it
worked or not.
First add guest1 to the host machine's +/etc/hosts+ file if you haven't
already. This is required unless you have DNS set up in a way where guest1's
address can be discovered.
----
# cat << END >> /etc/hosts
192.168.122.10 guest1
END
----
If running the ssh command on one of the cluster nodes results in this
-output before disconnecting, the connection works.
+output before disconnecting, the connection works:
----
# ssh -p 3121 guest1
ssh_exchange_identification: read: Connection reset by peer
----
-If you see one of these, the connection is not working.
+If you see one of these, the connection is not working:
----
# ssh -p 3121 guest1
ssh: connect to host guest1 port 3121: No route to host
----
----
# ssh -p 3121 guest1
ssh: connect to host guest1 port 3121: Connection refused
----
Once you can successfully connect to the guest from the host, shut down the guest. Pacemaker will be managing the virtual machine from this point forward.
== Integrate Guest into Cluster ==
Now the fun part, integrating the virtual machine you've just created into the cluster. It is incredibly simple.
=== Start the Cluster ===
On the host, start pacemaker.
----
# pcs cluster start
----
Wait for the host to become the DC. The output of `pcs status` should look
as it did in <<_disable_stonith_and_quorum>>.
=== Integrate as Guest Node ===
If you didn't already do this earlier in the verify host to guest connection
section, add the KVM guest's IP address to the host's +/etc/hosts+ file so we
can connect by hostname. For this example:
----
# cat << END >> /etc/hosts
192.168.122.10 guest1
END
----
We will use the *VirtualDomain* resource agent for the management of the
virtual machine. This agent requires the virtual machine's XML config to be
dumped to a file on disk. To do this, pick out the name of the virtual machine
you just created from the output of this list.
....
# virsh list --all
Id Name State
----------------------------------------------------
- guest1 shut off
....
In my case, I named it guest1. Dump the XML to a file somewhere on the host using the following command.
----
# virsh dumpxml guest1 > /etc/pacemaker/guest1.xml
----
Now just register the resource with pacemaker and you're set!
----
# pcs resource create vm-guest1 VirtualDomain hypervisor="qemu:///system" \
config="/etc/pacemaker/guest1.xml" meta remote-node=guest1
----
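For reference, the command above corresponds to CIB XML roughly like the following. The nvpair ids are illustrative assumptions; the important parts are the instance attributes and the *remote-node* meta-attribute that makes the VM a guest node.

```xml
<primitive id="vm-guest1" class="ocf" provider="heartbeat" type="VirtualDomain">
  <instance_attributes id="vm-guest1-instance_attributes">
    <nvpair id="vm-guest1-hypervisor" name="hypervisor" value="qemu:///system"/>
    <nvpair id="vm-guest1-config" name="config" value="/etc/pacemaker/guest1.xml"/>
  </instance_attributes>
  <meta_attributes id="vm-guest1-meta_attributes">
    <nvpair id="vm-guest1-remote-node" name="remote-node" value="guest1"/>
  </meta_attributes>
</primitive>
```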
[NOTE]
======
This example puts the guest XML under /etc/pacemaker because the
permissions and SELinux labeling should not need any changes.
If you run into trouble with this or any step, try disabling SELinux
with `setenforce 0`. If it works after that, see SELinux documentation
for how to troubleshoot, if you wish to reenable SELinux.
======
[NOTE]
======
Pacemaker will automatically monitor pacemaker_remote connections for failure,
so it is not necessary to create a recurring monitor on the VirtualDomain
resource.
======
Once the *vm-guest1* resource is started you will see *guest1* appear in the
`pcs status` output as a node. The final `pcs status` output should look
something like this.
----
# pcs status
Cluster name: mycluster
Last updated: Fri Oct 9 18:00:45 2015 Last change: Fri Oct 9 17:53:44 2015 by root via crm_resource on example-host
Stack: corosync
Current DC: example-host (version 1.1.13-a14efad) - partition with quorum
2 nodes and 2 resources configured
Online: [ example-host ]
GuestOnline: [ guest1@example-host ]
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
PCSD Status:
example-host: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
=== Starting Resources on KVM Guest ===
The commands below demonstrate how resources can be executed on both the
guest node and the cluster node.
Create a few Dummy resources. Dummy resources are real resource agents used just for testing purposes. They actually execute on the host they are assigned to, just as an apache server or database would, except their execution simply means a file was created. When the resource is stopped, the file it created is removed.
----
# pcs resource create FAKE1 ocf:pacemaker:Dummy
# pcs resource create FAKE2 ocf:pacemaker:Dummy
# pcs resource create FAKE3 ocf:pacemaker:Dummy
# pcs resource create FAKE4 ocf:pacemaker:Dummy
# pcs resource create FAKE5 ocf:pacemaker:Dummy
----
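The behavior of the Dummy agent described above can be sketched in a few lines
of shell. This is a simplified illustration, not the real agent: the state-file
path below is hypothetical (the actual agent derives its default from the
resource ID, e.g. something like +/var/run/Dummy-FAKE1.state+).

```shell
#!/bin/sh
# Minimal sketch of what ocf:pacemaker:Dummy does on start/stop/monitor.
# Hypothetical path; the real agent computes a default from the resource ID.
STATE=/tmp/Dummy-FAKE1.state

dummy_start()   { touch "$STATE"; }
dummy_stop()    { rm -f "$STATE"; }
dummy_monitor() { if [ -f "$STATE" ]; then echo running; else echo stopped; fi; }

dummy_start
dummy_monitor
dummy_stop
dummy_monitor
```

Running the sketch prints +running+ then +stopped+, mirroring the monitor
results Pacemaker would see before and after stopping the resource.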
Now check your `pcs status` output. In the resource section, you should see
something like the following, where some of the resources started on the
cluster node, and some started on the guest node.
----
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
FAKE1 (ocf::pacemaker:Dummy): Started guest1
FAKE2 (ocf::pacemaker:Dummy): Started guest1
FAKE3 (ocf::pacemaker:Dummy): Started example-host
FAKE4 (ocf::pacemaker:Dummy): Started guest1
FAKE5 (ocf::pacemaker:Dummy): Started example-host
----
The guest node, *guest1*, reacts just like any other node in the cluster. For
example, pick out a resource that is running on your cluster node. For my
purposes, I am picking FAKE3 from the output above. We can force FAKE3 to run
on *guest1* in the exact same way we would any other node.
----
# pcs constraint location FAKE3 prefers guest1
----
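Behind the scenes, this command adds a simple location constraint to the CIB.
It should look roughly like the sketch below (the +id+ is generated by `pcs`
and may differ):

----
<rsc_location id="location-FAKE3-guest1-INFINITY" rsc="FAKE3" node="guest1" score="INFINITY"/>
----

An +INFINITY+ score means FAKE3 will always run on *guest1* when that node is
available.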
Now, looking at the bottom of the `pcs status` output you'll see FAKE3 is on
*guest1*.
----
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
FAKE1 (ocf::pacemaker:Dummy): Started guest1
FAKE2 (ocf::pacemaker:Dummy): Started guest1
FAKE3 (ocf::pacemaker:Dummy): Started guest1
FAKE4 (ocf::pacemaker:Dummy): Started example-host
FAKE5 (ocf::pacemaker:Dummy): Started example-host
----
=== Testing Recovery and Fencing ===
Pacemaker's policy engine is smart enough to know that fencing guest nodes
associated with a virtual machine means shutting off/rebooting the virtual
machine. No special configuration is necessary to make this happen. If you
are interested in testing this functionality out, try stopping the guest's
pacemaker_remote daemon. This is the equivalent of abruptly terminating a
cluster node's corosync membership without properly shutting it down.
SSH into the guest and run this command.
----
# kill -9 `pidof pacemaker_remoted`
----
Within a few seconds, your `pcs status` output will show a monitor failure,
and the *guest1* node will not be shown while it is being recovered.
----
# pcs status
Cluster name: mycluster
Last updated: Fri Oct 9 18:08:35 2015 Last change: Fri Oct 9 18:07:00 2015 by root via cibadmin on example-host
Stack: corosync
Current DC: example-host (version 1.1.13-a14efad) - partition with quorum
2 nodes and 7 resources configured
Online: [ example-host ]
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
FAKE1 (ocf::pacemaker:Dummy): Stopped
FAKE2 (ocf::pacemaker:Dummy): Stopped
FAKE3 (ocf::pacemaker:Dummy): Stopped
FAKE4 (ocf::pacemaker:Dummy): Started example-host
FAKE5 (ocf::pacemaker:Dummy): Started example-host
Failed Actions:
* guest1_monitor_30000 on example-host 'unknown error' (1): call=8, status=Error, exitreason='none',
last-rc-change='Fri Oct 9 18:08:29 2015', queued=0ms, exec=0ms
PCSD Status:
example-host: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
[NOTE]
======
A guest node involves two resources: the one you explicitly configured creates the guest,
and Pacemaker creates an implicit resource for the pacemaker_remote connection, which
will be named the same as the value of the *remote-node* attribute of the explicit resource.
When we killed pacemaker_remote, it is the implicit resource that failed, which is why
the failed action starts with *guest1* and not *vm-guest1*.
======
Once recovery of the guest is complete, you'll see it automatically get
re-integrated into the cluster. The final `pcs status` output should look
something like this.
----
Cluster name: mycluster
Last updated: Fri Oct 9 18:18:30 2015 Last change: Fri Oct 9 18:07:00 2015 by root via cibadmin on example-host
Stack: corosync
Current DC: example-host (version 1.1.13-a14efad) - partition with quorum
2 nodes and 7 resources configured
Online: [ example-host ]
GuestOnline: [ guest1@example-host ]
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
FAKE1 (ocf::pacemaker:Dummy): Started guest1
FAKE2 (ocf::pacemaker:Dummy): Started guest1
FAKE3 (ocf::pacemaker:Dummy): Started guest1
FAKE4 (ocf::pacemaker:Dummy): Started example-host
FAKE5 (ocf::pacemaker:Dummy): Started example-host
Failed Actions:
* guest1_monitor_30000 on example-host 'unknown error' (1): call=8, status=Error, exitreason='none',
last-rc-change='Fri Oct 9 18:08:29 2015', queued=0ms, exec=0ms
PCSD Status:
example-host: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
Normally, once you've investigated and addressed a failed action, you can clear the
failure. However, Pacemaker does not yet support cleanup for the implicitly
created connection resource while the explicit resource is active. If you want
to clear the failed action from the status output, stop the guest resource before
clearing it. For example:
----
# pcs resource disable vm-guest1 --wait
# pcs resource cleanup guest1
# pcs resource enable vm-guest1
----
=== Accessing Cluster Tools from Guest Node ===
Besides allowing the cluster to manage resources on a guest node,
pacemaker_remote has one other trick. The pacemaker_remote daemon allows
nearly all the pacemaker tools (`crm_resource`, `crm_mon`, `crm_attribute`,
`crm_master`, etc.) to work on guest nodes natively.
Try it: Run `crm_mon` on the guest after pacemaker has
integrated the guest node into the cluster. These tools just work. This
means resource agents, such as those implementing master/slave resources, that
need access to tools like `crm_master` work seamlessly on guest nodes.
Higher-level command shells such as `pcs` may have partial support
on guest nodes, but it is recommended to run them from a cluster node.
diff --git a/doc/Pacemaker_Remote/en-US/Revision_History.xml b/doc/Pacemaker_Remote/en-US/Revision_History.xml
index 1954f14d96..b3d1fd285d 100644
--- a/doc/Pacemaker_Remote/en-US/Revision_History.xml
+++ b/doc/Pacemaker_Remote/en-US/Revision_History.xml
@@ -1,42 +1,49 @@
<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % BOOK_ENTITIES SYSTEM "Pacemaker_Remote.ent">
%BOOK_ENTITIES;
]>
<appendix id="appe-Pacemaker_Remote-Revision_History">
+ <!-- see comment in Book_Info.xml for revision numbering -->
<title>Revision History</title>
<simpara>
<revhistory>
<revision>
<revnumber>1-0</revnumber>
<date>Tue Mar 19 2013</date>
<author><firstname>David</firstname><surname>Vossel</surname><email>davidvossel@gmail.com</email></author>
<revdescription><simplelist><member>Import from Pages.app</member></simplelist></revdescription>
</revision>
<revision>
<revnumber>2-0</revnumber>
<date>Tue May 13 2013</date>
<author><firstname>David</firstname><surname>Vossel</surname><email>davidvossel@gmail.com</email></author>
<revdescription><simplelist><member>Added Future Features Section</member></simplelist></revdescription>
</revision>
<revision>
<revnumber>3-0</revnumber>
<date>Fri Oct 18 2013</date>
<author><firstname>David</firstname><surname>Vossel</surname><email>davidvossel@gmail.com</email></author>
<revdescription><simplelist><member>Added Baremetal remote-node feature documentation</member></simplelist></revdescription>
</revision>
<revision>
<revnumber>4-0</revnumber>
<date>Tue Aug 25 2015</date>
<author><firstname>Ken</firstname><surname>Gaillot</surname><email>kgaillot@redhat.com</email></author>
<revdescription><simplelist><member>Targeted CentOS 7.1 and Pacemaker 1.1.12+, updated for current terminology and practice</member></simplelist></revdescription>
</revision>
<revision>
<revnumber>5-0</revnumber>
<date>Tue Dec 8 2015</date>
<author><firstname>Ken</firstname><surname>Gaillot</surname><email>kgaillot@redhat.com</email></author>
<revdescription><simplelist><member>Updated for Pacemaker 1.1.14</member></simplelist></revdescription>
</revision>
+ <revision>
+ <revnumber>6-0</revnumber>
+ <date>Tue May 3 2016</date>
+ <author><firstname>Ken</firstname><surname>Gaillot</surname><email>kgaillot@redhat.com</email></author>
+ <revdescription><simplelist><member>Updated for Pacemaker 1.1.15</member></simplelist></revdescription>
+ </revision>
</revhistory>
</simpara>
</appendix>
