=== Example Dual-Layer, Dual-Device Fencing Topologies ===
The following example illustrates an advanced use of +fencing-topology+ in a cluster with the following properties:
* 3 nodes (2 active prod-mysql nodes, 1 prod-mysql-rep node in standby for quorum purposes)
* the active nodes have an IPMI-controlled power board reached at 192.0.2.1 and 192.0.2.2
* the active nodes also have two independent PSUs (Power Supply Units)
connected to two independent PDUs (Power Distribution Units) reached at
198.51.100.1 (port 10 and port 11) and 203.0.113.1 (port 10 and port 11)
* the first fencing method uses the `fence_ipmi` agent
* the second fencing method uses the `fence_apc_snmp` agent targeting 2 fencing devices (one per PSU, either port 10 or port 11)
* fencing is only implemented for the active nodes and has location constraints
* fencing topology is set to try IPMI fencing first, then fall back to a "sure-kill" dual-PDU fencing
In a normal failure scenario, STONITH will first select +fence_ipmi+ to try to kill the faulty node.
Using a fencing topology, if that first method fails, STONITH will then move on to selecting +fence_apc_snmp+ twice:
* once for the first PDU
* again for the second PDU
The fence action is considered successful only if both PDUs report the required status. If either of them fails, STONITH loops back to the first fencing method, +fence_ipmi+, and so on until the node is fenced or the fencing action is cancelled.
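The topology above can be sketched with `pcs` stonith levels (the device and node names here, such as +fence-prod-mysql1-ipmi+, are illustrative assumptions rather than taken from the original configuration; listing both PDU devices in a single level means both must succeed for that level to succeed):

```shell
# pcs stonith level add 1 prod-mysql1 fence-prod-mysql1-ipmi
# pcs stonith level add 2 prod-mysql1 fence-prod-mysql1-pdu1,fence-prod-mysql1-pdu2
# pcs stonith level add 1 prod-mysql2 fence-prod-mysql2-ipmi
# pcs stonith level add 2 prod-mysql2 fence-prod-mysql2-pdu1,fence-prod-mysql2-pdu2
```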
.First fencing method: single IPMI device
Each cluster node has its own dedicated IPMI channel that can be called for fencing using the following primitives:
Mar 14 18:24:04 guest1 systemd[1]: Starting Pacemaker Remote Service...
Mar 14 18:24:04 guest1 systemd[1]: Started Pacemaker Remote Service.
Mar 14 18:24:04 guest1 pacemaker_remoted[1233]: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3121.
----
=== Verify Host Connection to Guest ===
Before moving forward, it's worth verifying that the host can contact the guest
on port 3121. Here's a trick you can use. Connect using ssh from the host. The
connection will get destroyed, but how it is destroyed tells you whether it
worked or not.
First add guest1 to the host machine's +/etc/hosts+ file if you haven't
already. This is required unless you have DNS set up so that guest1's
address can be resolved.
----
# cat << END >> /etc/hosts
192.168.122.10 guest1
END
----
If running the ssh command on one of the cluster nodes results in this
output before disconnecting, the connection works:
----
# ssh -p 3121 guest1
ssh_exchange_identification: read: Connection reset by peer
----
If you see one of these, the connection is not working:
----
# ssh -p 3121 guest1
ssh: connect to host guest1 port 3121: No route to host
----
----
# ssh -p 3121 guest1
ssh: connect to host guest1 port 3121: Connection refused
----
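If you prefer a scripted check over reading ssh error messages, a netcat variant that supports `-z` (an assumption about your installed tools) can probe the port directly:

```shell
# nc -z -w 3 guest1 3121 && echo "port 3121 reachable" || echo "port 3121 unreachable"
```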
Once you can successfully connect to the guest from the host, shut down the guest. Pacemaker will be managing the virtual machine from this point forward.
== Integrate Guest into Cluster ==
Now the fun part, integrating the virtual machine you've just created into the cluster. It is incredibly simple.
=== Start the Cluster ===
On the host, start Pacemaker:
----
# pcs cluster start
----
Wait for the host to become the DC. The output of `pcs status` should look
as it did in <<_disable_stonith_and_quorum>>.
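If you are scripting the setup, one way to wait for the DC election to finish is to poll `crm_mon` (a sketch; the one-second interval and the absence of a timeout are simplifications):

```shell
# until crm_mon -1 2>/dev/null | grep "Current DC:" | grep -qv NONE; do sleep 1; done
```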
=== Integrate as Guest Node ===
If you didn't already do this earlier in the verify host to guest connection
section, add the KVM guest's IP address to the host's +/etc/hosts+ file so we
can connect by hostname. For this example:
----
# cat << END >> /etc/hosts
192.168.122.10 guest1
END
----
We will use the *VirtualDomain* resource agent for the management of the
virtual machine. This agent requires the virtual machine's XML config to be
dumped to a file on disk. To do this, pick out the name of the virtual machine
from `virsh list --all`, dump its configuration to a file, and create the
resource:
----
# virsh dumpxml guest1 > /etc/pacemaker/guest1.xml
# pcs resource create vm-guest1 VirtualDomain hypervisor="qemu:///system" \
    config="/etc/pacemaker/guest1.xml" meta remote-node=guest1
----
[NOTE]
======
This example puts the guest XML under /etc/pacemaker because the
permissions and SELinux labeling should not need any changes.
If you run into trouble with this or any step, try disabling SELinux
with `setenforce 0`. If it works after that, see SELinux documentation
for how to troubleshoot, if you wish to reenable SELinux.
======
[NOTE]
======
Pacemaker will automatically monitor pacemaker_remote connections for failure,
so it is not necessary to create a recurring monitor on the VirtualDomain
resource.
======
Once the *vm-guest1* resource is started you will see *guest1* appear in the
`pcs status` output as a node. The final `pcs status` output should look
something like this.
----
# pcs status
Cluster name: mycluster
Last updated: Fri Oct 9 18:00:45 2015 Last change: Fri Oct 9 17:53:44 2015 by root via crm_resource on example-host
Stack: corosync
Current DC: example-host (version 1.1.13-a14efad) - partition with quorum
2 nodes and 2 resources configured
Online: [ example-host ]
GuestOnline: [ guest1@example-host ]
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
PCSD Status:
example-host: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
=== Starting Resources on KVM Guest ===
The commands below demonstrate how resources can be executed on both the
guest node and the cluster node.
Create a few Dummy resources. The Dummy resource agent is a real agent used just for testing purposes. It actually executes on the node it is assigned to, just as an Apache server or database would, except its execution simply means a file was created. When the resource is stopped, the file it created is removed.
----
# pcs resource create FAKE1 ocf:pacemaker:Dummy
# pcs resource create FAKE2 ocf:pacemaker:Dummy
# pcs resource create FAKE3 ocf:pacemaker:Dummy
# pcs resource create FAKE4 ocf:pacemaker:Dummy
# pcs resource create FAKE5 ocf:pacemaker:Dummy
----
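To see where the cluster placed an individual resource without scanning the full status output, `crm_resource` can locate it for you:

```shell
# crm_resource --resource FAKE1 --locate
```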
Now check your `pcs status` output. In the resource section, you should see
something like the following, where some of the resources started on the
cluster node, and some started on the guest node.
----
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
FAKE1 (ocf::pacemaker:Dummy): Started guest1
FAKE2 (ocf::pacemaker:Dummy): Started guest1
FAKE3 (ocf::pacemaker:Dummy): Started example-host
FAKE4 (ocf::pacemaker:Dummy): Started guest1
FAKE5 (ocf::pacemaker:Dummy): Started example-host
----
The guest node, *guest1*, reacts just like any other node in the cluster. For
example, pick out a resource that is running on your cluster node. For my
purposes, I am picking FAKE3 from the output above. We can force FAKE3 to run
on *guest1* in the exact same way we would any other node.
----
# pcs constraint location FAKE3 prefers guest1
----
Now, looking at the bottom of the `pcs status` output you'll see FAKE3 is on
*guest1*.
----
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
FAKE1 (ocf::pacemaker:Dummy): Started guest1
FAKE2 (ocf::pacemaker:Dummy): Started guest1
FAKE3 (ocf::pacemaker:Dummy): Started guest1
FAKE4 (ocf::pacemaker:Dummy): Started example-host
FAKE5 (ocf::pacemaker:Dummy): Started example-host
----
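If you want to undo the test constraint later, list the constraint IDs and remove the one pcs generated (the ID shown here is illustrative; use whatever `pcs constraint --full` actually reports):

```shell
# pcs constraint --full
# pcs constraint remove location-FAKE3-guest1-INFINITY
```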
=== Testing Recovery and Fencing ===
Pacemaker's policy engine is smart enough to know that fencing a guest node
means shutting off or rebooting the virtual machine that hosts it. No special
configuration is necessary to make this happen. If you are interested in
testing this functionality, try stopping the guest's pacemaker_remote daemon.
This is equivalent to abruptly terminating a cluster node's corosync membership
without properly shutting it down.
Connect to the guest via ssh and run this command:
----
# kill -9 `pidof pacemaker_remoted`
----
Within a few seconds, your `pcs status` output will show a monitor failure,
and the *guest1* node will not be shown while it is being recovered.
----
# pcs status
Cluster name: mycluster
Last updated: Fri Oct 9 18:08:35 2015 Last change: Fri Oct 9 18:07:00 2015 by root via cibadmin on example-host
Stack: corosync
Current DC: example-host (version 1.1.13-a14efad) - partition with quorum
2 nodes and 7 resources configured
Online: [ example-host ]
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
FAKE1 (ocf::pacemaker:Dummy): Stopped
FAKE2 (ocf::pacemaker:Dummy): Stopped
FAKE3 (ocf::pacemaker:Dummy): Stopped
FAKE4 (ocf::pacemaker:Dummy): Started example-host
FAKE5 (ocf::pacemaker:Dummy): Started example-host
Failed Actions:
* guest1_monitor_30000 on example-host 'unknown error' (1): call=8, status=Error, exitreason='none',
last-rc-change='Fri Oct 9 18:08:29 2015', queued=0ms, exec=0ms
PCSD Status:
example-host: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
[NOTE]
======
A guest node involves two resources: the one you explicitly configured creates the guest,
and Pacemaker creates an implicit resource for the pacemaker_remote connection, which
will be named the same as the value of the *remote-node* attribute of the explicit resource.
When we killed pacemaker_remote, it is the implicit resource that failed, which is why
the failed action starts with *guest1* and not *vm-guest1*.
======
Once recovery of the guest is complete, you'll see it automatically get
re-integrated into the cluster. The final `pcs status` output should look
something like this.
----
Cluster name: mycluster
Last updated: Fri Oct 9 18:18:30 2015 Last change: Fri Oct 9 18:07:00 2015 by root via cibadmin on example-host
Stack: corosync
Current DC: example-host (version 1.1.13-a14efad) - partition with quorum
2 nodes and 7 resources configured
Online: [ example-host ]
GuestOnline: [ guest1@example-host ]
Full list of resources:
vm-guest1 (ocf::heartbeat:VirtualDomain): Started example-host
FAKE1 (ocf::pacemaker:Dummy): Started guest1
FAKE2 (ocf::pacemaker:Dummy): Started guest1
FAKE3 (ocf::pacemaker:Dummy): Started guest1
FAKE4 (ocf::pacemaker:Dummy): Started example-host
FAKE5 (ocf::pacemaker:Dummy): Started example-host
Failed Actions:
* guest1_monitor_30000 on example-host 'unknown error' (1): call=8, status=Error, exitreason='none',
last-rc-change='Fri Oct 9 18:08:29 2015', queued=0ms, exec=0ms
PCSD Status:
example-host: Online
Daemon Status:
corosync: active/disabled
pacemaker: active/disabled
pcsd: active/enabled
----
Normally, once you've investigated and addressed a failed action, you can clear the
failure. However, Pacemaker does not yet support cleanup of the implicitly
created connection resource while the explicit resource is active. If you want
to clear the failed action from the status output, stop the guest resource before
clearing it. For example:
----
# pcs resource disable vm-guest1 --wait
# pcs resource cleanup guest1
# pcs resource enable vm-guest1
----
=== Accessing Cluster Tools from Guest Node ===
Besides allowing the cluster to manage resources on a guest node,
pacemaker_remote has one other trick. The pacemaker_remote daemon allows
nearly all the pacemaker tools (`crm_resource`, `crm_mon`, `crm_attribute`,
`crm_master`, etc.) to work on guest nodes natively.
Try it: run `crm_mon` on the guest after Pacemaker has
integrated the guest node into the cluster. These tools just work, which
means resource agents such as master/slave resources that need access to tools
like `crm_master` work seamlessly on guest nodes.
Higher-level command shells such as `pcs` may have partial support
on guest nodes, but it is recommended to run them from a cluster node.
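For example, a one-shot status check from a shell on *guest1* behaves just as it would on a full cluster node:

```shell
# crm_mon -1
```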