diff --git a/doc/sphinx/Pacemaker_Remote/example.rst b/doc/sphinx/Pacemaker_Remote/example.rst
index 46ac1ed728..196fc01746 100644
--- a/doc/sphinx/Pacemaker_Remote/example.rst
+++ b/doc/sphinx/Pacemaker_Remote/example.rst
@@ -1,130 +1,134 @@
Guest Node Quick Example
------------------------
If you already know how to use Pacemaker, you'll likely be able to grasp this
new concept of guest nodes by reading through this quick example without
having to sort through all the detailed walk-through steps. Here are the key
configuration ingredients that make this possible using libvirt and KVM virtual
guests. These steps strip everything down to the very basics.
+.. index::
+ single: guest node
+ pair: node; guest node
+
Mile-High View of Configuration Steps
#####################################
* Give each virtual machine that will be used as a guest node a static network
address and unique hostname.
* Put the same authentication key with the path ``/etc/pacemaker/authkey`` on
every cluster node and virtual machine. This secures remote communication.
Run this command if you want to make a somewhat random key:
.. code-block:: none
# dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
* Install pacemaker_remote on every virtual machine, enabling it to start at
boot, and if a local firewall is used, allow the node to accept connections
on TCP port 3121.
.. code-block:: none
# yum install pacemaker-remote resource-agents
# systemctl enable pacemaker_remote
# firewall-cmd --add-port 3121/tcp --permanent
.. NOTE::
If you just want to see this work, you may want to simply disable the local
firewall and put SELinux in permissive mode while testing. This creates
security risks and should not be done on a production machine exposed to the
Internet, but can be appropriate for a protected test machine.
* Create a Pacemaker resource to launch each virtual machine, using the
**remote-node** meta-attribute to let Pacemaker know this will be a
guest node capable of running resources.
.. code-block:: none
# pcs resource create vm-guest1 VirtualDomain hypervisor="qemu:///system" config="vm-guest1.xml" meta remote-node="guest1"
The above command will create CIB XML similar to the following:
.. code-block:: xml

   <primitive class="ocf" id="vm-guest1" provider="heartbeat" type="VirtualDomain">
     <instance_attributes id="vm-guest1-instance_attributes">
       <nvpair id="vm-guest1-instance_attributes-hypervisor" name="hypervisor" value="qemu:///system"/>
       <nvpair id="vm-guest1-instance_attributes-config" name="config" value="vm-guest1.xml"/>
     </instance_attributes>
     <meta_attributes id="vm-guest1-meta_attributes">
       <nvpair id="vm-guest1-meta_attributes-remote-node" name="remote-node" value="guest1"/>
     </meta_attributes>
   </primitive>
In the example above, the meta-attribute **remote-node="guest1"** tells Pacemaker
that this resource is a guest node with the hostname **guest1**. The cluster will
attempt to contact the virtual machine's pacemaker_remote service at the
hostname **guest1** after it launches.
.. NOTE::
The ID of the resource creating the virtual machine (**vm-guest1** in the above
example) 'must' be different from the virtual machine's uname (**guest1** in the
above example). Pacemaker will create an implicit internal resource for the
pacemaker_remote connection to the guest, named with the value of **remote-node**,
so that value cannot be used as the name of any other resource.
Using a Guest Node
==================
Guest nodes will show up in ``crm_mon`` output as normal. For example, this is the
``crm_mon`` output after **guest1** is integrated into the cluster:
.. code-block:: none
Stack: corosync
Current DC: node1 (version 1.1.16-12.el7_4.5-94ff4df) - partition with quorum
Last updated: Fri Jan 12 13:52:39 2018
Last change: Fri Jan 12 13:25:17 2018 via pacemaker-controld on node1
2 nodes configured
2 resources configured
Online: [ node1 guest1 ]
vm-guest1 (ocf::heartbeat:VirtualDomain): Started node1
Now, you could place a resource, such as a webserver, on **guest1**:
.. code-block:: none
# pcs resource create webserver apache configfile=/etc/httpd/conf/httpd.conf op monitor interval=30s
# pcs constraint location webserver prefers guest1
Now, the ``crm_mon`` output would show:
.. code-block:: none
Stack: corosync
Current DC: node1 (version 1.1.16-12.el7_4.5-94ff4df) - partition with quorum
Last updated: Fri Jan 12 13:52:39 2018
Last change: Fri Jan 12 13:25:17 2018 via pacemaker-controld on node1
2 nodes configured
3 resources configured
Online: [ node1 guest1 ]
vm-guest1 (ocf::heartbeat:VirtualDomain): Started node1
webserver (ocf::heartbeat:apache): Started guest1
It is worth noting that after **guest1** is integrated into the cluster, nearly all the
Pacemaker command-line tools immediately become available to the guest node.
This means things like ``crm_mon``, ``crm_resource``, and ``crm_attribute`` will work
natively on the guest node, as long as the connection between the guest node
and a cluster node exists. This is particularly important for any promotable
clone resources executing on the guest node that need access to ``crm_master`` to
set transient attributes.
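For example, here is a minimal sketch (the attribute name is purely illustrative) of
setting a transient node attribute for **guest1** with ``crm_attribute``, exactly as
you would on a cluster node:

.. code-block:: none

   # crm_attribute --node guest1 --name my-test-attribute --update 1 --lifetime reboot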
diff --git a/doc/sphinx/Pacemaker_Remote/intro.rst b/doc/sphinx/Pacemaker_Remote/intro.rst
index bb0ef5088e..903fbeea4d 100644
--- a/doc/sphinx/Pacemaker_Remote/intro.rst
+++ b/doc/sphinx/Pacemaker_Remote/intro.rst
@@ -1,163 +1,186 @@
Scaling a Pacemaker Cluster
---------------------------
Overview
########
In a basic Pacemaker high-availability cluster [#]_, each node runs the full
cluster stack of corosync and all Pacemaker components. This allows great
flexibility but limits scalability to around 16 nodes.
To allow for scalability to dozens or even hundreds of nodes, Pacemaker
allows nodes not running the full cluster stack to integrate into the cluster
and have the cluster manage their resources as if they were a cluster node.
Terms
#####
**cluster node**
A node running the full high-availability stack of corosync and all
Pacemaker components. Cluster nodes may run cluster resources, run
all Pacemaker command-line tools (``crm_mon``, ``crm_resource`` and so on),
execute fencing actions, count toward cluster quorum, and serve as the
cluster's Designated Controller (DC).
+ .. index::
+ single: cluster node
+ pair: node; cluster node
+
**pacemaker_remote**
A small service daemon that allows a host to be used as a Pacemaker node
without running the full cluster stack. Nodes running pacemaker_remote
may run cluster resources and most command-line tools, but cannot perform
other functions of full cluster nodes such as fencing execution, quorum
voting or DC eligibility. The pacemaker_remote daemon is an enhanced
version of Pacemaker's local resource management daemon (LRMD).
+ .. index::
+ single: pacemaker_remote
+
**remote node**
A physical host running pacemaker_remote. Remote nodes have a special
resource that manages communication with the cluster. This is sometimes
referred to as the 'baremetal' case.
+ .. index::
+ single: remote node
+ pair: node; remote node
+
**guest node**
A virtual host running pacemaker_remote. Guest nodes differ from remote
nodes mainly in that the guest node is itself a resource that the cluster
manages.
+ .. index::
+ single: guest node
+ pair: node; guest node
+
.. NOTE::
'Remote' in this document refers to the node not being a part of the underlying
corosync cluster. It has nothing to do with physical proximity. Remote nodes
and guest nodes are subject to the same latency requirements as cluster nodes,
which means they are typically in the same data center.
.. NOTE::
It is important to distinguish the various roles a virtual machine can serve
in Pacemaker clusters:
* A virtual machine can run the full cluster stack, in which case it is a
cluster node and is not itself managed by the cluster.
* A virtual machine can be managed by the cluster as a resource, without the
cluster having any awareness of the services running inside the virtual
machine. The virtual machine is 'opaque' to the cluster.
* A virtual machine can be a cluster resource, and run pacemaker_remote
to make it a guest node, allowing the cluster to manage services
inside it. The virtual machine is 'transparent' to the cluster.
Guest Nodes
###########
+.. index::
+ single: guest node
+ pair: node; guest node
+
**"I want a Pacemaker cluster to manage virtual machine resources, but I also
want Pacemaker to be able to manage the resources that live within those
virtual machines."**
Without pacemaker_remote, the possibilities for implementing the above use case
have significant limitations:
* The cluster stack could be run on the physical hosts only, which loses the
ability to monitor resources within the guests.
* A separate cluster could run on the virtual guests, which quickly hits
scalability issues.
* The cluster stack could be run on the guests using the same cluster as the
physical hosts, which also hits scalability issues and complicates fencing.
With pacemaker_remote:
* The physical hosts are cluster nodes (running the full cluster stack).
* The virtual machines are guest nodes (running the pacemaker_remote service).
Nearly zero configuration is required on the virtual machine.
* The cluster stack on the cluster nodes launches the virtual machines and
immediately connects to the pacemaker_remote service on them, allowing the
virtual machines to integrate into the cluster.
The key difference here between the guest nodes and the cluster nodes is that
the guest nodes do not run the cluster stack. This means they will never become
the DC, initiate fencing actions or participate in quorum voting.
On the other hand, this also means that they are not bound to the scalability
limits associated with the cluster stack (no 16-node corosync member limits to
deal with). That isn't to say that guest nodes can scale indefinitely, but it
is known that guest nodes scale horizontally much further than cluster nodes.
Other than the quorum limitation, these guest nodes behave just like cluster
nodes with respect to resource management. The cluster is fully capable of
managing and monitoring resources on each guest node. You can build constraints
against guest nodes, put them in standby, or do whatever else you'd expect to
be able to do with cluster nodes. They even show up in ``crm_mon`` output as
nodes.
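For instance, here is a minimal sketch (using ``pcs``, with illustrative resource and
node names) of treating a guest node like any other node; older ``pcs`` versions use
``pcs cluster standby``/``unstandby`` instead of ``pcs node standby``/``unstandby``:

.. code-block:: none

   # pcs constraint location webserver prefers guest1
   # pcs node standby guest1
   # pcs node unstandby guest1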
To solidify the concept, below is an example that is very similar to an actual
deployment we test in our developer environment to verify guest node scalability:
* 16 cluster nodes running the full corosync + pacemaker stack
* 64 Pacemaker-managed virtual machine resources running pacemaker_remote configured as guest nodes
* 64 Pacemaker-managed webserver and database resources configured to run on the 64 guest nodes
With this deployment, you would have 64 webservers and databases running on 64
virtual machines on 16 hardware nodes, all of which are managed and monitored by
the same Pacemaker deployment. It is known that pacemaker_remote can scale to
deployments of this size, and possibly much further, depending on the specific scenario.
Remote Nodes
############
+.. index::
+ single: remote node
+ pair: node; remote node
+
**"I want my traditional high-availability cluster to scale beyond the limits
imposed by the corosync messaging layer."**
Ultimately, the primary advantage of remote nodes over cluster nodes is
scalability. Remote nodes may also serve a purpose in other use cases, such as
geographically distributed HA clusters, but those use cases are not well
understood at this point.
Like guest nodes, remote nodes will never become the DC, initiate
fencing actions or participate in quorum voting.
That is not to say, however, that fencing of a remote node works any
differently than that of a cluster node. The Pacemaker scheduler
understands how to fence remote nodes. As long as a fencing device exists, the
cluster is capable of ensuring remote nodes are fenced in the exact same way as
cluster nodes.
Expanding the Cluster Stack
###########################
With pacemaker_remote, the traditional view of the high-availability stack can
be expanded to include a new layer:
Traditional HA Stack
____________________
.. image:: images/pcmk-ha-cluster-stack.png
:width: 17cm
:height: 9cm
:alt: Traditional Pacemaker+Corosync Stack
:align: center
HA Stack With Guest Nodes
_________________________
.. image:: images/pcmk-ha-remote-stack.png
:width: 20cm
:height: 10cm
:alt: Pacemaker+Corosync Stack with pacemaker_remote
:align: center
.. [#] See the Pacemaker documentation,
especially 'Clusters From Scratch' and 'Pacemaker Explained'.
diff --git a/doc/sphinx/Pacemaker_Remote/options.rst b/doc/sphinx/Pacemaker_Remote/options.rst
index e79b947018..a62fe91883 100644
--- a/doc/sphinx/Pacemaker_Remote/options.rst
+++ b/doc/sphinx/Pacemaker_Remote/options.rst
@@ -1,131 +1,134 @@
Configuration Explained
-----------------------
The walk-through examples use some of these options, but don't explain exactly
what they mean or do. This section is meant to be the go-to resource for all
the options available for configuring pacemaker_remote-based nodes.
+.. index::
+ single: configuration
+
Resource Meta-Attributes for Guest Nodes
########################################
When configuring a virtual machine as a guest node, the virtual machine is
created using one of the usual resource agents for that purpose (for example,
ocf:heartbeat:VirtualDomain or ocf:heartbeat:Xen), with additional metadata
parameters.
No restrictions are enforced on what agents may be used to create a guest node,
but obviously the agent must create a distinct environment capable of running
the pacemaker_remote daemon and cluster resources. An additional requirement is
that fencing the host running the guest node resource must be sufficient for
ensuring the guest node is stopped. This means, for example, that not all
hypervisors supported by VirtualDomain may be used to create guest nodes; if
the guest can survive the hypervisor being fenced, it must not be used as a
guest node.
Below are the metadata options available to enable a resource as a guest node
and define its connection parameters.
.. table:: Meta-attributes for configuring VM resources as guest nodes
+------------------------+-----------------+-----------------------------------------------------------+
| Option | Default | Description |
+========================+=================+===========================================================+
| remote-node | none | The node name of the guest node this resource defines. |
| | | This both enables the resource as a guest node and |
| | | defines the unique name used to identify the guest node. |
| | | If no other parameters are set, this value will also be |
| | | assumed as the hostname to use when connecting to |
| | | pacemaker_remote on the VM. This value **must not** |
| | | overlap with any resource or node IDs. |
+------------------------+-----------------+-----------------------------------------------------------+
| remote-port | 3121 | The port on the virtual machine that the cluster will |
| | | use to connect to pacemaker_remote. |
+------------------------+-----------------+-----------------------------------------------------------+
| remote-addr | 'value of' | The IP address or hostname to use when connecting to |
| | ``remote-node`` | pacemaker_remote on the VM. |
+------------------------+-----------------+-----------------------------------------------------------+
| remote-connect-timeout | 60s | How long before a pending guest connection will time out. |
+------------------------+-----------------+-----------------------------------------------------------+
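As a sketch of how these meta-attributes fit together (the IP address is purely
illustrative), the ``pcs`` command from the quick example could specify the connection
address and timeout explicitly:

.. code-block:: none

   # pcs resource create vm-guest1 VirtualDomain hypervisor="qemu:///system" \
       config="vm-guest1.xml" meta remote-node="guest1" \
       remote-addr="192.168.122.10" remote-connect-timeout="60s"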
Connection Resources for Remote Nodes
#####################################
A remote node is defined by a connection resource. That connection resource
has instance attributes that define where the remote node is located on the
network and how to communicate with it.
Descriptions of these instance attributes can be retrieved using the following
``pcs`` command:
.. code-block:: none
# pcs resource describe remote
ocf:pacemaker:remote - remote resource agent
Resource options:
server: Server location to connect to. This can be an ip address or hostname.
port: tcp port to connect to.
reconnect_interval: Interval in seconds at which Pacemaker will attempt to
reconnect to a remote node after an active connection to
the remote node has been severed. When this value is
nonzero, Pacemaker will retry the connection
indefinitely, at the specified interval.
When defining a remote node's connection resource, it is common and recommended
to name the connection resource the same as the remote node's hostname. By
default, if no **server** option is provided, the cluster will attempt to contact
the remote node using the resource name as the hostname.
Example defining a remote node with the hostname **remote1**:
.. code-block:: none
# pcs resource create remote1 remote
Example defining a remote node to connect to a specific IP address and port:
.. code-block:: none
# pcs resource create remote1 remote server=192.168.122.200 port=8938
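The ``reconnect_interval`` option shown in the ``pcs resource describe`` output above
can be set the same way. For example (a sketch), to have the cluster retry a severed
connection every 60 seconds:

.. code-block:: none

   # pcs resource create remote1 remote server=192.168.122.200 reconnect_interval=60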
Environment Variables for Daemon Start-up
#########################################
Authentication and encryption of the connection between cluster nodes
and nodes running pacemaker_remote is achieved using
`TLS-PSK <https://en.wikipedia.org/wiki/TLS-PSK>`_ encryption/authentication
over TCP (port 3121 by default). This means that both the cluster node and
remote node must share the same private key. By default, this
key is placed at ``/etc/pacemaker/authkey`` on each node.
You can change the default port and/or key location for Pacemaker and
pacemaker_remote via environment variables. How these variables are set varies
by OS, but usually they are set in the ``/etc/sysconfig/pacemaker`` or
``/etc/default/pacemaker`` file.
.. code-block:: none
#==#==# Pacemaker Remote
# Use a custom directory for finding the authkey.
PCMK_authkey_location=/etc/pacemaker/authkey
#
# Specify a custom port for Pacemaker Remote connections
PCMK_remote_port=3121
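If you change ``PCMK_remote_port`` on a remote node, the connection resource on the
cluster side must point at the same port (for a guest node, the ``remote-port``
meta-attribute serves the same purpose). For example, a minimal sketch with an
arbitrary alternative port:

.. code-block:: none

   # On the remote node, in /etc/sysconfig/pacemaker (or /etc/default/pacemaker):
   PCMK_remote_port=3122

   # On a cluster node, the matching connection resource:
   # pcs resource create remote1 remote server=192.168.122.200 port=3122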
Removing Remote Nodes and Guest Nodes
#####################################
If the resource creating a guest node, or the **ocf:pacemaker:remote** resource
creating a connection to a remote node, is removed from the configuration, the
affected node will continue to show up in status output (for example, in ``crm_mon``) as an offline node.
If you want to get rid of that output, run (replacing $NODE_NAME appropriately):
.. code-block:: none
# crm_node --force --remove $NODE_NAME
.. WARNING::
Be absolutely sure that there are no references to the node's resource in the
configuration before running the above command.
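One way to double-check (a sketch; any method of searching the configuration will do)
is to query the live CIB for the node name before removing it:

.. code-block:: none

   # cibadmin --query | grep -w "$NODE_NAME"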