diff --git a/INSTALL.md b/INSTALL.md
index c4af9f7217..d25d4b04d9 100644
--- a/INSTALL.md
+++ b/INSTALL.md
@@ -1,79 +1,79 @@
# How to Install Pacemaker
## Build Dependencies
| Version | Fedora-based | SUSE-based | Debian-based |
|:---------------:|:------------------:|:------------------:|:--------------:|
| 1.11 or later | automake | automake | automake |
| 2.64 or later | autoconf | autoconf | autoconf |
| | libtool | libtool | libtool |
| | libtool-ltdl-devel | | libltdl-dev |
| | libuuid-devel | libuuid-devel | uuid-dev |
| | pkgconfig | pkgconfig | pkg-config |
| 2.32.0 or later | glib2-devel | glib2-devel | libglib2.0-dev |
| | libxml2-devel | libxml2-devel | libxml2-dev |
| | libxslt-devel | libxslt-devel | libxslt-dev |
| | bzip2-devel | libbz2-devel | libbz2-dev |
| | libqb-devel | libqb-devel | libqb-dev |
| 3.2 or later | python3 | python3 | python3 |
Also: GNU make
### Cluster Stack Dependencies
*Only corosync is currently supported*
| Version | Fedora-based | SUSE-based | Debian-based |
|:---------------:|:------------------:|:------------------:|:--------------:|
| 2.0.0 or later | corosynclib | libcorosync | corosync |
| 2.0.0 or later | corosynclib-devel | libcorosync-devel | |
| | | | libcfg-dev |
| | | | libcpg-dev |
| | | | libcmap-dev |
| | | | libquorum-dev |
### Optional Build Dependencies
| Feature Enabled | Version | Fedora-based | SUSE-based | Debian-based |
|:-----------------------------------------------:|:--------------:|:-----------------------:|:-----------------------:|:-----------------------:|
| Pacemaker Remote and encrypted remote CIB admin | 2.1.7 or later | gnutls-devel | libgnutls-devel | libgnutls-dev |
| encrypted remote CIB admin | | pam-devel | pam-devel | libpam0g-dev |
| interactive crm_mon | | ncurses-devel | ncurses-devel | ncurses-dev |
| systemd support | | systemd-devel | systemd-devel | libsystemd-dev |
| systemd/upstart resource support | | dbus-devel | dbus-devel | libdbus-1-dev |
| Linux-HA style fencing agents | | cluster-glue-libs-devel | libglue-devel | cluster-glue-dev |
| documentation | | asciidoc or asciidoctor | asciidoc or asciidoctor | asciidoc or asciidoctor |
| documentation | | help2man | help2man | help2man |
| documentation | | inkscape | inkscape | inkscape |
| documentation | | docbook-style-xsl | docbook-xsl-stylesheets | docbook-xsl |
| documentation | | python3-sphinx | python3-sphinx | python3-sphinx |
-| documentation (PDF) | | texlive, texlive-titlesec, texlive-framed, texlive-threeparttable texlive-wrapfig texlive-multirow | texlive, texlive-latex | texlive, texlive-latex-extra |
+| documentation (PDF) | | latexmk texlive texlive-capt-of texlive-collection-xetex texlive-fncychap texlive-framed texlive-multirow texlive-needspace texlive-tabulary texlive-titlesec texlive-threeparttable texlive-upquote texlive-wrapfig texlive-xetex | texlive texlive-latex | texlive texlive-latex-extra |
| RPM packages via "make rpm" | 4.11 or later | rpm | rpm | (n/a) |
## Optional testing dependencies
* procps and psmisc (if running cts-exec, cts-fencing, or CTS)
* valgrind (if running CTS valgrind tests)
* python3-systemd (if using CTS on cluster nodes running systemd)
* rsync (if running CTS container tests)
* libvirt-daemon-driver-lxc (if running CTS container tests)
* libvirt-daemon-lxc (if running CTS container tests)
* libvirt-login-shell (if running CTS container tests)
* nmap (if not specifying an IP address base)
* oprofile (if running CTS profiling tests)
* dlm (to log DLM debugging info after CTS tests)
## Simple install
$ make && sudo make install
If GNU make is not your default make, use "gmake" instead.
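
One way to choose between the two names automatically is to check what `make --version` reports. This is a minimal sketch, assuming a POSIX shell; the `MAKE` variable is only an illustrative name and is not used by the Pacemaker build itself.

```shell
# Prefer "make" when it identifies itself as GNU make; otherwise fall
# back to "gmake" (assumed to be the GNU make binary on such systems).
if make --version 2>/dev/null | grep -q "GNU Make"; then
    MAKE=make
else
    MAKE=gmake
fi
echo "Using $MAKE"
```

The build can then be started with `$MAKE && sudo $MAKE install`.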
## Detailed install
First, browse the build options that are available:
$ ./autogen.sh
$ ./configure --help
Re-run ./configure with any options you want, then proceed with the simple
method.
diff --git a/doc/sphinx/Clusters_from_Scratch/cluster-setup.rst b/doc/sphinx/Clusters_from_Scratch/cluster-setup.rst
index 56939c00e1..8aeb39cfb0 100644
--- a/doc/sphinx/Clusters_from_Scratch/cluster-setup.rst
+++ b/doc/sphinx/Clusters_from_Scratch/cluster-setup.rst
@@ -1,299 +1,295 @@
Set up a Cluster
----------------
Simplify Administration With a Cluster Shell
############################################
In the dark past, configuring Pacemaker required the administrator to
read and write XML. In true UNIX style, there were also a number of
different commands that specialized in different aspects of querying
and updating the cluster.
In addition, the various components of the cluster stack (corosync, pacemaker,
etc.) had to be configured separately, with different configuration tools and
formats.
All of that has been greatly simplified with the creation of higher-level tools,
whether command-line or GUIs, that hide all the mess underneath.
Command-line cluster shells take all the individual aspects required for
managing and configuring a cluster, and pack them into one simple-to-use
command-line tool.
They even allow you to queue up several changes and commit
them all at once.
Two popular command-line shells are ``pcs`` and ``crmsh``. Clusters from Scratch is
based on ``pcs`` because it comes with CentOS, but both have similar
functionality. Choosing a shell or GUI is a matter of personal preference and
what comes with (and perhaps is supported by) your choice of operating system.
Install the Cluster Software
############################
Fire up a shell on both nodes and run the following to activate the High
Availability repo.
.. code-block:: none
# dnf config-manager --set-enabled ha
.. IMPORTANT::
This document will show commands that need to be executed on both nodes
with a simple ``#`` prompt. Be sure to run them on each node individually.
Now, we'll install pacemaker, pcs, and some other command-line tools that will
make our lives easier:
.. code-block:: none
# yum install -y pacemaker pcs psmisc policycoreutils-python3
.. NOTE::
This document uses ``pcs`` for cluster management. Other alternatives,
such as ``crmsh``, are available, but their syntax
will differ from the examples used here.
Configure the Cluster Software
##############################
.. index::
single: firewall
Allow cluster services through firewall
_______________________________________
On each node, allow cluster-related services through the local firewall:
.. code-block:: none
# firewall-cmd --permanent --add-service=high-availability
success
# firewall-cmd --reload
success
.. NOTE::
If you are using iptables directly, or some other firewall solution besides
firewalld, simply open the following ports, which can be used by various
clustering components: TCP ports 2224, 3121, and 21064, and UDP port 5405.
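
For plain iptables, the ports listed above could be opened with rules along these lines. This is only a sketch: the helper name ``print_cluster_iptables_rules`` is hypothetical, the rules are printed rather than applied, and appending to the ``INPUT`` chain is an assumption that may not suit an existing ruleset (rule order and persistence across reboots are left to the administrator).

```shell
# Hypothetical helper: print (do not run) one iptables rule per port
# named in the note above. Review the output before applying it.
print_cluster_iptables_rules() {
    for port in 2224 3121 21064; do
        echo "iptables -A INPUT -p tcp --dport $port -j ACCEPT"
    done
    echo "iptables -A INPUT -p udp --dport 5405 -j ACCEPT"
}

print_cluster_iptables_rules
```

Printing first makes it easy to review the rules, or to adapt them for another firewall tool, before changing a live system.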
If you run into any problems during testing, you might want to disable
the firewall and SELinux entirely until you have everything working.
This may create significant security issues and should not be performed on
machines that will be exposed to the outside world, but may be appropriate
during development and testing on a protected host.
To disable security measures:
.. code-block:: none
[root@pcmk-1 ~]# setenforce 0
[root@pcmk-1 ~]# sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config
[root@pcmk-1 ~]# systemctl mask firewalld.service
[root@pcmk-1 ~]# systemctl stop firewalld.service
[root@pcmk-1 ~]# iptables --flush
Enable pcs Daemon
_________________
Before the cluster can be configured, the pcs daemon must be started and enabled
to start at boot time on each node. This daemon works with the pcs command-line interface
to manage synchronizing the corosync configuration across all nodes in the cluster.
Start and enable the daemon by issuing the following commands on each node:
.. code-block:: none
# systemctl start pcsd.service
# systemctl enable pcsd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
The installed packages will create a **hacluster** user with a disabled password.
While this is fine for running ``pcs`` commands locally,
the account needs a login password in order to perform such tasks as syncing
the corosync configuration, or starting and stopping the cluster on other nodes.
This tutorial will make use of such commands,
so now we will set a password for the **hacluster** user, using the same password
on both nodes:
.. code-block:: none
# passwd hacluster
Changing password for user hacluster.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
.. NOTE::
Alternatively, to script this process or set the password on a
different machine from the one you're logged into, you can use
the ``--stdin`` option for ``passwd``:
.. code-block:: none
[root@pcmk-1 ~]# ssh pcmk-2 -- 'echo mysupersecretpassword | passwd --stdin hacluster'
Configure Corosync
__________________
On either node, use ``pcs host auth`` to authenticate as the **hacluster** user:
.. code-block:: none
[root@pcmk-1 ~]# pcs host auth pcmk-1 pcmk-2
Username: hacluster
Password:
pcmk-2: Authorized
pcmk-1: Authorized
Next, use ``pcs cluster setup`` on the same node to generate and synchronize the
corosync configuration:
.. code-block:: none
[root@pcmk-1 ~]# pcs cluster setup mycluster pcmk-1 pcmk-2
No addresses specified for host 'pcmk-1', using 'pcmk-1'
No addresses specified for host 'pcmk-2', using 'pcmk-2'
Destroying cluster on hosts: 'pcmk-1', 'pcmk-2'...
pcmk-2: Successfully destroyed cluster
pcmk-1: Successfully destroyed cluster
Requesting remove 'pcsd settings' from 'pcmk-1', 'pcmk-2'
pcmk-1: successful removal of the file 'pcsd settings'
pcmk-2: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to 'pcmk-1', 'pcmk-2'
pcmk-1: successful distribution of the file 'corosync authkey'
pcmk-1: successful distribution of the file 'pacemaker authkey'
pcmk-2: successful distribution of the file 'corosync authkey'
pcmk-2: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to 'pcmk-1', 'pcmk-2'
pcmk-1: successful distribution of the file 'corosync.conf'
pcmk-2: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.
If you received an authorization error for either of those commands, make
sure you configured the **hacluster** user account on each node
with the same password.
The final corosync.conf configuration on each node should look
something like the sample in :ref:`sample-corosync-configuration`.
Explore pcs
###########
Start by taking some time to familiarize yourself with what ``pcs`` can do.
.. code-block:: none
[root@pcmk-1 ~]# pcs
Usage: pcs [-f file] [-h] [commands]...
Control and configure pacemaker and corosync.
Options:
-h, --help Display usage and exit.
-f file Perform actions on file instead of active CIB.
Commands supporting the option use the initial state of
the specified file as their input and then overwrite the
file with the state reflecting the requested
operation(s).
A few commands only use the specified file in read-only
mode since their effect is not a CIB modification.
--debug Print all network traffic and external commands run.
--version Print pcs version information. List pcs capabilities if
--full is specified.
--request-timeout Timeout for each outgoing request to another node in
seconds. Default is 60s.
--force Override checks and errors, the exact behavior depends on
the command. WARNING: Using the --force option is
strongly discouraged unless you know what you are doing.
Commands:
cluster Configure cluster options and nodes.
resource Manage cluster resources.
stonith Manage fence devices.
constraint Manage resource constraints.
property Manage pacemaker properties.
acl Manage pacemaker access control lists.
qdevice Manage quorum device provider on the local host.
quorum Manage cluster quorum settings.
booth Manage booth (cluster ticket manager).
status View cluster status.
config View and manage cluster configuration.
pcsd Manage pcs daemon.
host Manage hosts known to pcs/pcsd.
node Manage cluster nodes.
alert Manage pacemaker alerts.
client Manage pcsd client configuration.
dr Manage disaster recovery configuration.
tag Manage pacemaker tags.
As you can see, the different aspects of cluster management are separated
into categories. To discover the functionality available in each of these
categories, one can issue the command ``pcs <CATEGORY> help``. Below is an
example of all the options available under the status category.
.. code-block:: none
[root@pcmk-1 ~]# pcs status help
Usage: pcs status [commands]...
View current cluster and resource status
Commands:
[status] [--full] [--hide-inactive]
View all information about the cluster and resources (--full provides
more details, --hide-inactive hides inactive resources).
resources [--hide-inactive]
Show status of all currently configured resources. If --hide-inactive
is specified, only show active resources.
cluster
View current cluster status.
corosync
View current membership information as seen by corosync.
quorum
View current quorum status.
qdevice <device model> [--full] [<cluster name>]
Show runtime status of specified model of quorum device provider. Using
--full will give more detailed output. If <cluster name> is specified,
only information about the specified cluster will be displayed.
booth
Print current status of booth on the local node.
nodes [corosync | both | config]
View current status of nodes from pacemaker. If 'corosync' is
specified, view current status of nodes from corosync instead. If
'both' is specified, view current status of nodes from both corosync &
pacemaker. If 'config' is specified, print nodes from corosync &
pacemaker configuration.
pcsd [<node>]...
Show current status of pcsd on nodes specified, or on all nodes
configured in the local cluster if no nodes are specified.
xml
View xml version of status (output from crm_mon -r -1 -X).
Additionally, if you are interested in the version and supported cluster stack(s)
available with your Pacemaker installation, run:
.. code-block:: none
[root@pcmk-1 ~]# pacemakerd --features
Pacemaker 2.0.5-4.el8 (Build: ba59be7122)
Supporting v3.6.1: generated-manpages agent-manpages ncurses libqb-logging libqb-ipc systemd nagios corosync-native atomic-attrd acls cibsecrets
-
-
-.. [#] For some subtle issues, see `Topics in High-Performance Messaging: Multicast Address Assignment <http://web.archive.org/web/20101211210054/http://29west.com/docs/THPM/multicast-address-assignment.html>`_
- or the more detailed treatment in `Cisco's Guidelines for Enterprise IP Multicast Address Allocation <https://www.cisco.com/c/dam/en/us/support/docs/ip/ip-multicast/ipmlt_wp.pdf>`_.
diff --git a/doc/sphinx/Clusters_from_Scratch/installation.rst b/doc/sphinx/Clusters_from_Scratch/installation.rst
index 2827cda52a..6a0d4698cd 100644
--- a/doc/sphinx/Clusters_from_Scratch/installation.rst
+++ b/doc/sphinx/Clusters_from_Scratch/installation.rst
@@ -1,436 +1,416 @@
Installation
------------
Install |CFS_DISTRO| |CFS_DISTRO_VER|
################################################################################################
Boot the Install Image
______________________
Download the latest |CFS_DISTRO| |CFS_DISTRO_VER| DVD ISO by navigating to
the `CentOS Mirrors List <http://isoredirect.centos.org/centos/8-stream/isos/x86_64/>`_,
selecting a download mirror which is close to you, and finally selecting the
.iso file that has "dvd" in its name.
Use the image to boot a virtual machine, or burn it to a DVD or USB drive and
boot a physical server from that.
After starting the installation, select your language and keyboard layout at
the welcome screen.
.. figure:: images/WelcomeToCentos.png
- :scale: 80%
- :width: 1024
- :height: 800
:align: center
:alt: Installation Welcome Screen
|CFS_DISTRO| |CFS_DISTRO_VER| Installation Welcome Screen
Installation Options
____________________
At this point, you get a chance to tweak the default installation options.
.. figure:: images/InstallationSummary.png
- :scale: 80%
- :width: 1024
- :height: 800
:align: center
:alt: Installation Summary Screen
|CFS_DISTRO| |CFS_DISTRO_VER| Installation Summary Screen
Click on the **SOFTWARE SELECTION** section (try saying that 10 times quickly). The
default environment, **Server with GUI**, does have add-ons with much of the software
we need, but we will change the environment to a **Minimal Install** here, so that we
can see exactly what software is required later, and press **Done**.
.. figure:: images/SoftwareSelection.png
- :scale: 80%
- :width: 1024
- :height: 800
:align: center
:alt: Software Selection Screen
|CFS_DISTRO| |CFS_DISTRO_VER| Software Selection Screen
Configure Network
_________________
In the **NETWORK & HOSTNAME** section:
- Edit **Host Name:** as desired. For this example, we will use
**pcmk-1.localdomain** and then press **Apply**.
- Select your network device, press **Configure...**, and use the **Manual** method to
assign a fixed IP address. For this example, we'll use 192.168.122.101 under
**IPv4 Settings** (with an appropriate netmask, gateway and DNS server).
- Press **Save**.
- Flip the switch to turn your network device on, and press **Done**.
.. figure:: images/NetworkAndHostName.png
- :scale: 80%
- :width: 1024
- :height: 800
:align: center
:alt: Editing network settings
|CFS_DISTRO| |CFS_DISTRO_VER| Network Interface Screen
.. IMPORTANT::
Do not accept the default network settings.
Cluster machines should never obtain an IP address via DHCP, because
DHCP's periodic address renewal will interfere with corosync.
Configure Disk
______________
By default, the installer's automatic partitioning will use LVM (which allows
us to dynamically change the amount of space allocated to a given partition).
However, it allocates all free space to the ``/`` (a.k.a. **root**) partition, which
cannot be reduced in size later (dynamic increases are fine).
In order to follow the DRBD and GFS2 portions of this guide, we need to reserve
space on each machine for a replicated volume.
Enter the **INSTALLATION DESTINATION** section, ensure the hard drive you want to
install to is selected, select **Custom** to be the **Storage Configuration**, and
press **Done**.
In the **MANUAL PARTITIONING** screen that comes next, click the option to create
mountpoints automatically. Select the ``/`` mountpoint, and reduce the desired
capacity by 3GiB or so. Select **Modify...** by the volume group name, and change
the **Size policy:** to **As large as possible**, to make the reclaimed space
available inside the LVM volume group. We'll add the additional volume later.
.. figure:: images/ManualPartitioning.png
- :scale: 80%
- :width: 1024
- :height: 800
:align: center
:alt: Manual Partitioning Screen
|CFS_DISTRO| |CFS_DISTRO_VER| Manual Partitioning Screen
Press **Done**, then **Accept changes**.
Configure Time Synchronization
______________________________
It is highly recommended to enable NTP on your cluster nodes. Doing so
ensures all nodes agree on the current time and makes reading log files
significantly easier.
|CFS_DISTRO| will enable NTP automatically. If you want to change any time-related
settings (such as time zone or NTP server), you can do this in the
**TIME & DATE** section.
Root Password
______________________________
In order to continue to the next step, a **Root Password** must be set.
.. figure:: images/RootPassword.png
- :scale: 80%
- :width: 1024
- :height: 800
:align: center
:alt: Root Password Screen
|CFS_DISTRO| |CFS_DISTRO_VER| Root Password Screen
Press **Done** (depending on the password you chose, you may need to do so twice).
Finish Install
______________
Select **Begin Installation**. Once it completes, **Reboot System**
as instructed. After the node reboots, you'll see a login prompt on
the console. Log in using **root** and the password you created earlier.
.. figure:: images/ConsolePrompt.png
- :scale: 80%
- :width: 1024
- :height: 768
:align: center
:alt: Console Prompt
|CFS_DISTRO| |CFS_DISTRO_VER| Console Prompt
.. NOTE::
From here on, we're going to be working exclusively from the terminal.
Configure the OS
################
Verify Networking
_________________
Ensure that the machine has the static IP address you configured earlier.
.. code-block:: none
+
[root@pcmk-1 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:32:cf:a9 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.101/24 brd 192.168.122.255 scope global noprefixroute enp1s0
valid_lft forever preferred_lft forever
inet6 fe80::c3e1:3ba:959:fa96/64 scope link noprefixroute
valid_lft forever preferred_lft forever
.. NOTE::
If you ever need to change the node's IP address from the command line, follow
these instructions, replacing **${device}** with the name of your network device:
.. code-block:: none
[root@pcmk-1 ~]# vi /etc/sysconfig/network-scripts/ifcfg-${device} # manually edit as desired
[root@pcmk-1 ~]# nmcli dev disconnect ${device}
[root@pcmk-1 ~]# nmcli con reload ${device}
[root@pcmk-1 ~]# nmcli con up ${device}
This makes **NetworkManager** aware that a change was made to the config file.
Next, ensure that the routes are as expected:
.. code-block:: none
[root@pcmk-1 ~]# ip route
default via 192.168.122.1 dev enp1s0 proto static metric 100
192.168.122.0/24 dev enp1s0 proto kernel scope link src 192.168.122.101 metric 100
If there is no line beginning with **default via**, then you may need to add a line such as
``GATEWAY="192.168.122.1"``
to the device configuration using the same process as described above for
changing the IP address.
Now, check for connectivity to the outside world. Start small by
testing whether we can reach the gateway we configured.
.. code-block:: none
[root@pcmk-1 ~]# ping -c 1 192.168.122.1
PING 192.168.122.1 (192.168.122.1) 56(84) bytes of data.
64 bytes from 192.168.122.1: icmp_seq=1 ttl=64 time=0.492 ms
--- 192.168.122.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.492/0.492/0.492/0.000 ms
Now try something external; choose a location you know should be available.
.. code-block:: none
[root@pcmk-1 ~]# ping -c 1 www.clusterlabs.org
PING mx1.clusterlabs.org (95.217.104.78) 56(84) bytes of data.
64 bytes from mx1.clusterlabs.org (95.217.104.78): icmp_seq=1 ttl=54 time=134 ms
--- mx1.clusterlabs.org ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 133.987/133.987/133.987/0.000 ms
Login Remotely
______________
The console isn't a very friendly place to work from, so we will now
switch to accessing the machine remotely via SSH where we can
use copy and paste, etc.
From another host, check whether we can see the new host at all:
.. code-block:: none
[gchin@gchin ~]$ ping -c 1 192.168.122.101
PING 192.168.122.101 (192.168.122.101) 56(84) bytes of data.
64 bytes from 192.168.122.101: icmp_seq=1 ttl=64 time=0.344 ms
--- 192.168.122.101 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.344/0.344/0.344/0.000 ms
Next, log in as root via SSH.
.. code-block:: none
[gchin@gchin ~]$ ssh root@192.168.122.101
The authenticity of host '192.168.122.101 (192.168.122.101)' can't be established.
ECDSA key fingerprint is SHA256:NBvcRrPDLIt39Rf0Tz4/f2Rd/FA5wUiDOd9bZ9QWWjo.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.122.101' (ECDSA) to the list of known hosts.
root@192.168.122.101's password:
Last login: Tue Jan 10 20:46:30 2021
[root@pcmk-1 ~]#
Apply Updates
_____________
Apply any package updates released since your installation image was created:
.. code-block:: none
[root@pcmk-1 ~]# yum update
.. index::
single: node; short name
Use Short Node Names
____________________
During installation, we filled in the machine's fully qualified domain
name (FQDN), which can be rather long when it appears in cluster logs and
status output. See for yourself how the machine identifies itself:
.. code-block:: none
[root@pcmk-1 ~]# uname -n
pcmk-1.localdomain
We can use the ``hostnamectl`` tool to strip off the domain name:
.. code-block:: none
[root@pcmk-1 ~]# hostnamectl set-hostname $(uname -n | sed s/\\..*//)
Now, check that the machine is using the correct name:
.. code-block:: none
[root@pcmk-1 ~]# uname -n
pcmk-1
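
The domain stripping done by the ``sed`` expression above can also be expressed with shell parameter expansion. A minimal sketch, assuming a POSIX shell; ``short_name`` is a hypothetical helper, not part of any tool used in this guide:

```shell
# Hypothetical helper: print a host name with everything from the
# first dot onward removed (names without a dot are left unchanged).
short_name() {
    printf '%s\n' "${1%%.*}"
}

short_name "pcmk-1.localdomain"
```

The result could be passed to ``hostnamectl set-hostname`` in the same way as the ``sed`` output.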
You may want to reboot to ensure all updates take effect.
Repeat for Second Node
######################
Repeat the Installation steps so far, so that you have two
nodes ready to have the cluster software installed.
For the purposes of this document, the additional node is called
pcmk-2 with address 192.168.122.102.
Configure Communication Between Nodes
#####################################
Configure Host Name Resolution
______________________________
Confirm that you can communicate between the two new nodes:
.. code-block:: none
[root@pcmk-1 ~]# ping -c 3 192.168.122.102
PING 192.168.122.102 (192.168.122.102) 56(84) bytes of data.
64 bytes from 192.168.122.102: icmp_seq=1 ttl=64 time=1.22 ms
64 bytes from 192.168.122.102: icmp_seq=2 ttl=64 time=0.795 ms
64 bytes from 192.168.122.102: icmp_seq=3 ttl=64 time=0.751 ms
--- 192.168.122.102 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2054ms
rtt min/avg/max/mdev = 0.751/0.923/1.224/0.214 ms
Now we need to make sure we can communicate with the machines by their
name. If you have a DNS server, add entries for the two
machines. Otherwise, you'll need to add the machines to ``/etc/hosts``
on both nodes. Below are the entries for my cluster nodes:
.. code-block:: none
[root@pcmk-1 ~]# grep pcmk /etc/hosts
192.168.122.101 pcmk-1.clusterlabs.org pcmk-1
192.168.122.102 pcmk-2.clusterlabs.org pcmk-2
We can now verify the setup by again using ping:
.. code-block:: none
[root@pcmk-1 ~]# ping -c 3 pcmk-2
PING pcmk-2.clusterlabs.org (192.168.122.102) 56(84) bytes of data.
64 bytes from pcmk-2.clusterlabs.org (192.168.122.102): icmp_seq=1 ttl=64 time=0.295 ms
64 bytes from pcmk-2.clusterlabs.org (192.168.122.102): icmp_seq=2 ttl=64 time=0.616 ms
64 bytes from pcmk-2.clusterlabs.org (192.168.122.102): icmp_seq=3 ttl=64 time=0.809 ms
--- pcmk-2.clusterlabs.org ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2043ms
rtt min/avg/max/mdev = 0.295/0.573/0.809/0.212 ms
.. index:: SSH
Configure SSH
_____________
SSH is a convenient and secure way to copy files and perform commands
remotely. For the purposes of this guide, we will create a key without a
password (using the -N option) so that we can perform remote actions
without being prompted.
.. WARNING::
Unprotected SSH keys (those without a password) are not recommended for
servers exposed to the outside world. We use them here only to simplify
the demo.
Create a new key and allow anyone with that key to log in:
.. index::
single: SSH; key
-.Creating and Activating a new SSH Key
-
-.. code-block:: none
-
- [root@pcmk-1 ~]# ssh-keygen -t dsa -f ~/.ssh/id_dsa -N ""
- Generating public/private dsa key pair.
- Created directory '/root/.ssh'.
- Your identification has been saved in /root/.ssh/id_dsa.
- Your public key has been saved in /root/.ssh/id_dsa.pub.
- The key fingerprint is:
- SHA256:ehR595AVLAVpvFgqYXiayds2qx8emkvnHmfQZMTZ4jM root@pcmk-1
- The key's randomart image is:
- +---[DSA 1024]----+
- | . ..+.=+. |
- | . +o+ Bo. |
- | . *oo+*+o |
- | = .*E..o |
- | oS..o . |
- | .o+. |
- | o.*oo |
- | . B.* |
- | === |
- +----[SHA256]-----+
- [root@pcmk-1 ~]# cp ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
+.. topic:: Creating and Activating a new SSH Key
+
+ .. code-block:: none
+
+ [root@pcmk-1 ~]# ssh-keygen -t dsa -f ~/.ssh/id_dsa -N ""
+ Generating public/private dsa key pair.
+ Created directory '/root/.ssh'.
+ Your identification has been saved in /root/.ssh/id_dsa.
+ Your public key has been saved in /root/.ssh/id_dsa.pub.
+ The key fingerprint is:
+ SHA256:ehR595AVLAVpvFgqYXiayds2qx8emkvnHmfQZMTZ4jM root@pcmk-1
+ The key's randomart image is:
+ +---[DSA 1024]----+
+ | . ..+.=+. |
+ | . +o+ Bo. |
+ | . *oo+*+o |
+ | = .*E..o |
+ | oS..o . |
+ | .o+. |
+ | o.*oo |
+ | . B.* |
+ | === |
+ +----[SHA256]-----+
+ [root@pcmk-1 ~]# cp ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
Install the key on the other node:
.. code-block:: none
[root@pcmk-1 ~]# scp -r ~/.ssh pcmk-2:
The authenticity of host 'pcmk-2 (192.168.122.102)' can't be established.
ECDSA key fingerprint is SHA256:FQ4sVubTiHdQ6IetbN96fixoTVx/LuQUV8qoyiywnfs.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'pcmk-2,192.168.122.102' (ECDSA) to the list of known hosts.
root@pcmk-2's password:
id_dsa 100% 1385 1.6MB/s 00:00
id_dsa.pub 100% 601 1.0MB/s 00:00
authorized_keys 100% 601 1.3MB/s 00:00
known_hosts 100% 184 389.2KB/s 00:00
Test that you can now run commands remotely, without being prompted:
.. code-block:: none
[root@pcmk-1 ~]# ssh pcmk-2 -- uname -n
pcmk-2
diff --git a/doc/sphinx/Makefile.am b/doc/sphinx/Makefile.am
index 3359b252f5..afaf99c1bd 100644
--- a/doc/sphinx/Makefile.am
+++ b/doc/sphinx/Makefile.am
@@ -1,192 +1,193 @@
#
-# Copyright 2003-2020 the Pacemaker project contributors
+# Copyright 2003-2021 the Pacemaker project contributors
#
# The version control history for this file may have further details.
#
# This source code is licensed under the GNU General Public License version 2
# or later (GPLv2+) WITHOUT ANY WARRANTY.
#
include $(top_srcdir)/mk/common.mk
# Things you might want to override on the command line
# Books to generate
BOOKS ?= Clusters_from_Scratch \
Pacemaker_Administration \
Pacemaker_Development \
Pacemaker_Explained \
Pacemaker_Remote
# Output formats to generate. Possible values:
# html (multiple HTML files)
# dirhtml (HTML files named index.html in multiple directories)
# singlehtml (a single large HTML file)
# text
# pdf
# epub
# latex
# linkcheck (not actually a format; check validity of external links)
#
# The results will end up in <book>/_build/<format>
BOOK_FORMATS ?= singlehtml
-# Set to "a4" or "letter" if building latex format
-PAPER ?= letter
+# Set to "a4paper" or "letterpaper" if building latex format
+PAPER ?= letterpaper
# Additional options for sphinx-build
SPHINXFLAGS ?=
# toplevel rsync destination for www targets (without trailing slash)
RSYNC_DEST ?= root@www.clusterlabs.org:/var/www/html
# End of useful overrides
# Example scheduler transition graphs
# @TODO The original CIB XML for these is long lost. Ideally, we would recreate
# something similar and keep those here instead of the DOTs (or use a couple of
# scheduler regression test inputs instead), then regenerate the SVG
# equivalents using crm_simulate and dot when making a release.
DOTS = $(wildcard shared/images/*.dot)
# Vector sources for generated PNGs (including SVG equivalents of DOTS, created
# manually using dot)
SVGS = $(wildcard shared/images/pcmk-*.svg) $(DOTS:%.dot=%.svg)
# PNG images generated from SVGS
#
# These will not be accessible in a VPATH build, which will generate warnings
# when building the documentation, but the make will still succeed. It is
# nontrivial to get them working for VPATH builds and not worth the effort.
PNGS_GENERATED = $(SVGS:%.svg=%.png)
# Original PNG image sources
PNGS_Clusters_from_Scratch = $(wildcard Clusters_from_Scratch/images/*.png)
PNGS_Pacemaker_Explained = $(wildcard Pacemaker_Explained/images/*.png)
PNGS_Pacemaker_Remote = $(wildcard Pacemaker_Remote/images/*.png)
STATIC_FILES = $(wildcard _static/*.css)
EXTRA_DIST = $(wildcard */*.rst) $(DOTS) $(SVGS) \
$(PNGS_Clusters_from_Scratch) \
$(PNGS_Pacemaker_Explained) \
$(PNGS_Pacemaker_Remote) \
$(STATIC_FILES) \
conf.py.in
# recursive, preserve symlinks/permissions/times, verbose, compress,
# don't cross filesystems, sparse, show progress
RSYNC_OPTS = -rlptvzxS --progress
BOOK_RSYNC_DEST = $(RSYNC_DEST)/$(PACKAGE)/doc/$(PACKAGE_SERIES)
TAG ?= $(shell [ -n "`git tag --points-at HEAD | head -1`" ] \
&& ( git tag --points-at HEAD | head -1 ) \
|| git log --pretty=format:Pacemaker-$(VERSION)-%h -n 1 HEAD)
BOOK = none
DEPS_intro = shared/pacemaker-intro.rst $(PNGS_GENERATED)
DEPS_Clusters_from_Scratch = $(DEPS_intro) $(PNGS_Clusters_from_Scratch)
DEPS_Pacemaker_Administration = $(DEPS_intro)
DEPS_Pacemaker_Development =
DEPS_Pacemaker_Explained = $(DEPS_intro) $(PNGS_Pacemaker_Explained)
DEPS_Pacemaker_Remote = $(PNGS_Pacemaker_Remote)
if BUILD_SPHINX_DOCS
INKSCAPE_CMD = $(INKSCAPE) --export-dpi=90 -C
# Pattern rule to generate PNGs from SVGs
# (--export-png works with Inkscape <1.0, --export-filename with >=1.0;
# create the destination directory in case this is a VPATH build)
%.png: %.svg
$(AM_V_at)-$(MKDIR_P) "$(shell dirname "$@")"
$(AM_V_GEN) { \
$(INKSCAPE_CMD) --export-png="$@" "$<" 2>/dev/null \
|| $(INKSCAPE_CMD) --export-filename="$@" "$<"; \
} $(PCMK_quiet)
# Create a book's Sphinx configuration.
# Create the book directory in case this is a VPATH build.
$(BOOKS:%=%/conf.py): conf.py.in
$(AM_V_at)-$(MKDIR_P) "$(@:%/conf.py=%)"
$(AM_V_GEN)sed \
-e 's/%VERSION%/$(VERSION)/g' \
-e 's/%BOOK_ID%/$(@:%/conf.py=%)/g' \
-e 's/%BOOK_TITLE%/$(subst _, ,$(@:%/conf.py=%))/g' \
-e 's#%SRC_DIR%#$(abs_srcdir)#g' \
$(<) > "$@"
$(BOOK)/_build: $(STATIC_FILES) $(BOOK)/conf.py $(DEPS_$(BOOK)) $(wildcard $(srcdir)/$(BOOK)/*.rst)
@echo 'Building "$(subst _, ,$(BOOK))" because of $?' $(PCMK_quiet)
$(AM_V_at)rm -rf "$@"
$(AM_V_BOOK)for format in $(BOOK_FORMATS); do \
echo -e "\n * Building $$format" $(PCMK_quiet); \
doctrees="doctrees"; \
real_format="$$format"; \
case "$$format" in \
pdf) real_format="latex" ;; \
gettext) doctrees="gettext-doctrees" ;; \
esac; \
$(SPHINX) -b "$$real_format" -d "$@/$$doctrees" \
-c "$(builddir)/$(BOOK)" \
- -D latex_paper_size=$(PAPER) $(SPHINXFLAGS) \
+ -D latex_elements.papersize=$(PAPER) \
+ $(SPHINXFLAGS) \
"$(srcdir)/$(BOOK)" "$@/$$format" \
$(PCMK_quiet); \
if [ "$$format" = "pdf" ]; then \
$(MAKE) $(AM_MAKEFLAGS) -C "$@/$$format" \
all-pdf; \
fi; \
done
endif
.PHONY: books-upload
books-upload: all
if BUILD_SPHINX_DOCS
@echo "Uploading $(PACKAGE_SERIES) documentation set"
@for book in $(BOOKS); do \
echo " * $$book"; \
buildfile="$$book/_build/build-$(PACKAGE_SERIES).txt"; \
echo "Generated on `date --utc` from version $(TAG)" \
> "$$buildfile"; \
rsync $(RSYNC_OPTS) "$$buildfile" \
$(BOOK_FORMATS:%=$$book/_build/%) \
"$(BOOK_RSYNC_DEST)/$$book/"; \
done
all-local:
@for book in $(BOOKS); do \
$(MAKE) $(AM_MAKEFLAGS) BOOK=$$book \
PAPER="$(PAPER)" SPHINXFLAGS="$(SPHINXFLAGS)" \
BOOK_FORMATS="$(BOOK_FORMATS)" $$book/_build; \
done
install-data-local: all-local
$(AM_V_AT)for book in $(BOOKS); do \
for format in $(BOOK_FORMATS); do \
formatdir="$$book/_build/$$format"; \
for f in `find "$$formatdir" -print`; do \
dname="`echo $$f | sed s:_build/::`"; \
dloc="$(DESTDIR)/$(docdir)/$$dname"; \
if [ -d "$$f" ]; then \
$(INSTALL) -d -m 755 "$$dloc"; \
else \
$(INSTALL_DATA) "$$f" "$$dloc"; \
fi \
done; \
done; \
done
uninstall-local:
$(AM_V_AT)for book in $(BOOKS); do \
rm -rf "$(DESTDIR)/$(docdir)/$$book"; \
done
endif
clean-local:
$(AM_V_at)-rm -rf \
$(BOOKS:%="$(builddir)/%/_build") \
$(BOOKS:%="$(builddir)/%/conf.py") \
$(PNGS_GENERATED)
diff --git a/doc/sphinx/Pacemaker_Administration/tools.rst b/doc/sphinx/Pacemaker_Administration/tools.rst
index 5899467e91..e85edee403 100644
--- a/doc/sphinx/Pacemaker_Administration/tools.rst
+++ b/doc/sphinx/Pacemaker_Administration/tools.rst
@@ -1,570 +1,562 @@
.. index:: command-line tool
Using Pacemaker Command-Line Tools
----------------------------------
.. index::
single: command-line tool; output format
.. _cmdline_output:
Controlling Command Line Output
###############################
Some of the Pacemaker command-line utilities have been converted to a new
output system. Among these tools are ``crm_mon`` and ``stonith_admin``. This
is an ongoing project, and more tools will be converted over time. This system
lets you control the formatting of output with ``--output-as=`` and the
destination of output with ``--output-to=``.
The available formats vary by tool, but at least plain text and XML are
supported by all tools that use the new system. The default format is plain
text. The default destination is stdout but can be redirected to any file.
Some formats support command line options for changing the style of the output.
For instance:
.. code-block:: none
# crm_mon --help-output
Usage:
crm_mon [OPTION?]
Provides a summary of cluster's current state.
Outputs varying levels of detail in a number of different formats.
Output Options:
--output-as=FORMAT Specify output format as one of: console (default), html, text, xml
--output-to=DEST Specify file name for output (or "-" for stdout)
--html-cgi Add text needed to use output in a CGI program
--html-stylesheet=URI Link to an external CSS stylesheet
--html-title=TITLE Page title
--text-fancy Use more highly formatted output
.. index::
single: crm_mon
single: command-line tool; crm_mon
.. _crm_mon:
Monitor a Cluster with crm_mon
##############################
The ``crm_mon`` utility displays the current state of an active cluster. It can
show the cluster status organized by node or by resource, and can be used in
either single-shot or dynamically updating mode. It can also display operations
performed and information about failures.
Using this tool, you can examine the state of the cluster for irregularities,
and see how it responds when you cause or simulate failures.
See the manual page or the output of ``crm_mon --help`` for a full description
of its many options.
.. topic:: Sample output from crm_mon -1
.. code-block:: none
Cluster Summary:
* Stack: corosync
* Current DC: node2 (version 2.0.0-1) - partition with quorum
* Last updated: Mon Jan 29 12:18:42 2018
* Last change: Mon Jan 29 12:18:40 2018 by root via crm_attribute on node3
* 5 nodes configured
* 2 resources configured
Node List:
* Online: [ node1 node2 node3 node4 node5 ]
* Active resources:
* Fencing (stonith:fence_xvm): Started node1
* IP (ocf:heartbeat:IPaddr2): Started node2
.. topic:: Sample output from crm_mon -n -1
.. code-block:: none
Cluster Summary:
* Stack: corosync
* Current DC: node2 (version 2.0.0-1) - partition with quorum
* Last updated: Mon Jan 29 12:21:48 2018
* Last change: Mon Jan 29 12:18:40 2018 by root via crm_attribute on node3
* 5 nodes configured
* 2 resources configured
* Node List:
* Node node1: online
* Fencing (stonith:fence_xvm): Started
* Node node2: online
* IP (ocf:heartbeat:IPaddr2): Started
* Node node3: online
* Node node4: online
* Node node5: online
As mentioned in an earlier chapter, the DC is the node where decisions are
made. The cluster elects a node to be DC as needed. The only significance of
the choice of DC to an administrator is the fact that its logs will have the
most information about why decisions were made.
.. index::
pair: crm_mon; CSS
.. _crm_mon_css:
Styling crm_mon HTML output
___________________________
Various parts of ``crm_mon``'s HTML output have a CSS class associated with
them. Not everything does, but some of the most interesting portions do. In
the following example, the status of each node has an ``online`` class and the
details of each resource have an ``rsc-ok`` class.
.. code-block:: html
<h2>Node List</h2>
<ul>
<li>
<span>Node: cluster01</span><span class="online"> online</span>
</li>
<li><ul><li><span class="rsc-ok">ping (ocf::pacemaker:ping): Started</span></li></ul></li>
<li>
<span>Node: cluster02</span><span class="online"> online</span>
</li>
<li><ul><li><span class="rsc-ok">ping (ocf::pacemaker:ping): Started</span></li></ul></li>
</ul>
By default, a stylesheet for styling these classes is included in the head of
the HTML output. The relevant portions of this stylesheet that would be used
in the above example are:
.. code-block:: css
<style>
.online { color: green }
.rsc-ok { color: green }
</style>
If you want to override some or all of the styling, simply create your own
stylesheet, place it on a web server, and pass ``--html-stylesheet=<URL>``
to ``crm_mon``. The link is added after the default stylesheet, so your
changes take precedence. You don't need to duplicate the entire default.
Only include what you want to change.
.. index::
single: cibadmin
single: command-line tool; cibadmin
.. _cibadmin:
Edit the CIB XML with cibadmin
##############################
The most flexible tool for modifying the configuration is Pacemaker's
``cibadmin`` command. With ``cibadmin``, you can query, add, remove, update
or replace any part of the configuration. All changes take effect immediately,
so there is no need to perform a reload-like operation.
The simplest way of using ``cibadmin`` is to use it to save the current
configuration to a temporary file, edit that file with your favorite
text or XML editor, and then upload the revised configuration.
.. topic:: Safely using an editor to modify the cluster configuration
.. code-block:: none
# cibadmin --query > tmp.xml
# vi tmp.xml
# cibadmin --replace --xml-file tmp.xml
Some of the better XML editors can make use of a RELAX NG schema to
help make sure any changes you make are valid. The schema describing
the configuration can be found in ``pacemaker.rng``, which may be
deployed in a location such as ``/usr/share/pacemaker`` depending on your
operating system distribution and how you installed the software.
If you want to modify just one section of the configuration, you can
query and replace just that section to avoid modifying any others.
.. topic:: Safely using an editor to modify only the resources section
.. code-block:: none
# cibadmin --query --scope resources > tmp.xml
# vi tmp.xml
# cibadmin --replace --scope resources --xml-file tmp.xml
To quickly delete a part of the configuration, identify the object you wish to
delete by XML tag and id. For example, you might search the CIB for all
STONITH-related configuration:
.. topic:: Searching for STONITH-related configuration items
.. code-block:: none
# cibadmin --query | grep stonith
<nvpair id="cib-bootstrap-options-stonith-action" name="stonith-action" value="reboot"/>
<nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="1"/>
<primitive id="child_DoFencing" class="stonith" type="external/vmware">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:1" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:2" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:3" type="external/vmware" class="stonith">
If you wanted to delete the ``primitive`` tag with id ``child_DoFencing``,
you would run:
.. code-block:: none
# cibadmin --delete --xml-text '<primitive id="child_DoFencing"/>'
See the cibadmin man page for more options.
.. warning::
Never edit the live ``cib.xml`` file directly. Pacemaker will detect such
changes and refuse to use the configuration.
.. index::
single: crm_shadow
single: command-line tool; crm_shadow
.. _crm_shadow:
Batch Configuration Changes with crm_shadow
###########################################
Often, it is desirable to preview the effects of a series of configuration
changes before updating the live configuration all at once. For this purpose,
``crm_shadow`` creates a "shadow" copy of the configuration and arranges for
all the command-line tools to use it.
To begin, simply invoke ``crm_shadow --create`` with a name of your choice,
and follow the simple on-screen instructions. Shadow copies are identified with
a name to make it possible to have more than one.
.. warning::
Read this section and the on-screen instructions carefully; failure to do so
could result in destroying the cluster's active configuration!
.. topic:: Creating and displaying the active sandbox
.. code-block:: none
# crm_shadow --create test
Setting up shadow instance
Type Ctrl-D to exit the crm_shadow shell
shadow[test]:
shadow[test] # crm_shadow --which
test
From this point on, all cluster commands will automatically use the shadow copy
instead of talking to the cluster's active configuration. Once you have
finished experimenting, you can either make the changes active via the
``--commit`` option, or discard them using the ``--delete`` option. Again, be
sure to follow the on-screen instructions carefully!
For a full list of ``crm_shadow`` options and commands, invoke it with the
``--help`` option.
.. topic:: Use sandbox to make multiple changes all at once, discard them, and verify real configuration is untouched
.. code-block:: none
shadow[test] # crm_failcount -r rsc_c001n01 -G
scope=status name=fail-count-rsc_c001n01 value=0
shadow[test] # crm_standby --node c001n02 -v on
shadow[test] # crm_standby --node c001n02 -G
scope=nodes name=standby value=on
shadow[test] # cibadmin --erase --force
shadow[test] # cibadmin --query
<cib crm_feature_set="3.0.14" validate-with="pacemaker-3.0" epoch="112" num_updates="2" admin_epoch="0" cib-last-written="Mon Jan 8 23:26:47 2018" update-origin="rhel7-1" update-client="crm_node" update-user="root" have-quorum="1" dc-uuid="1">
<configuration>
<crm_config/>
<nodes/>
<resources/>
<constraints/>
</configuration>
<status/>
</cib>
shadow[test] # crm_shadow --delete test --force
Now type Ctrl-D to exit the crm_shadow shell
shadow[test] # exit
# crm_shadow --which
No active shadow configuration defined
# cibadmin -Q
<cib crm_feature_set="3.0.14" validate-with="pacemaker-3.0" epoch="110" num_updates="2" admin_epoch="0" cib-last-written="Mon Jan 8 23:26:47 2018" update-origin="rhel7-1" update-client="crm_node" update-user="root" have-quorum="1">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<nvpair id="cib-bootstrap-1" name="stonith-enabled" value="1"/>
<nvpair id="cib-bootstrap-2" name="pe-input-series-max" value="30000"/>
See the next section, :ref:`crm_simulate`, for how to test your changes before
committing them to the live cluster.
.. index::
single: crm_simulate
single: command-line tool; crm_simulate
.. _crm_simulate:
Simulate Cluster Activity with crm_simulate
###########################################
The command-line tool ``crm_simulate`` shows the results of the same logic
the cluster itself uses to respond to a particular cluster configuration and
status.
As always, the man page is the primary documentation, and should be consulted
for further details. This section aims for a better conceptual explanation and
practical examples.
Replaying cluster decision-making logic
_______________________________________
At any given time, one node in a Pacemaker cluster will be elected DC, and that
node will run Pacemaker's scheduler to make decisions.
Each time decisions need to be made (a "transition"), the DC will have log
messages like "Calculated transition ... saving inputs in ..." with a file
name. You can grab the named file and replay the cluster logic to see why
particular decisions were made. The file contains the live cluster
configuration at that moment, so you can also look at it directly to see the
value of node attributes, etc., at that time.
The simplest usage is (replacing $FILENAME with the actual file name):
.. topic:: Simulate cluster response to a given CIB
.. code-block:: none
# crm_simulate --simulate --xml-file $FILENAME
That will show the cluster state when the process started, the actions that
need to be taken ("Transition Summary"), and the resulting cluster state if the
actions succeed. Most actions will have a brief description of why they were
required.
The transition inputs may be compressed. ``crm_simulate`` can handle these
compressed files directly, though if you want to edit the file, you'll need to
uncompress it first.
You can do the same simulation for the live cluster configuration at the
current moment. This is useful mainly when using ``crm_shadow`` to create a
sandbox version of the CIB; the ``--live-check`` option will use the shadow CIB
if one is in effect.
.. topic:: Simulate cluster response to current live CIB or shadow CIB
.. code-block:: none
# crm_simulate --simulate --live-check
Why decisions were made
_______________________
Digging further into the "why" gets user-unfriendly very quickly. If
you add the ``--show-scores`` option, you will also see all the scores that
went into the decision-making. The node with the highest cumulative score for a
resource will run it. You can look for ``-INFINITY`` scores in particular to
see where complete bans came into effect.
You can also add ``-VVVV`` to get more detailed messages about what's happening
under the hood. You can add up to two more V's even, but that's usually useful
only if you're a masochist or tracing through the source code.
Visualizing the action sequence
_______________________________
Another handy feature is the ability to generate a visual graph of the actions
needed, using the ``--dot-file`` option. This relies on the separate
Graphviz [#]_ project.
.. topic:: Generate a visual graph of cluster actions from a saved CIB
.. code-block:: none
# crm_simulate --simulate --xml-file $FILENAME --dot-file $FILENAME.dot
# dot $FILENAME.dot -Tsvg > $FILENAME.svg
``$FILENAME.dot`` will contain a Graphviz representation of the cluster's
response to your changes, including all actions with their ordering
dependencies.
``$FILENAME.svg`` will be the same information in a standard graphical format
that you can view in your browser or other app of choice. You could, of course,
use other ``dot`` options to generate other formats.
How to interpret the graphical output:
* Bubbles indicate actions, and arrows indicate ordering dependencies
* Resource actions have text of the form
``<RESOURCE>_<ACTION>_<INTERVAL_IN_MS> <NODE>`` indicating that the
specified action will be executed for the specified resource on the
specified node, once if interval is 0 or at specified recurring interval
otherwise
* Actions with black text will be sent to the executor (that is, the
appropriate agent will be invoked)
* Actions with orange text are "pseudo" actions that the cluster uses
internally for ordering but require no real activity
* Actions with a solid green border are part of the transition (that is, the
cluster will attempt to execute them in the given order -- though a
transition can be interrupted by action failure or new events)
* Dashed arrows indicate dependencies that are not present in the transition
graph
* Actions with a dashed border will not be executed. If the dashed border is
blue, the cluster does not feel the action needs to be executed. If the
dashed border is red, the cluster would like to execute the action but
cannot. Any actions depending on an action with a dashed border will not be
able to execute.
* Loops should not happen, and should be reported as a bug if found.
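The resource action labels described in the list above can be split back into
their parts when post-processing a graph. The following is a minimal Python
sketch, not part of Pacemaker; ``GraphAction`` and ``parse_action_label`` are
hypothetical names. It assumes the label has exactly the form
``<RESOURCE>_<ACTION>_<INTERVAL_IN_MS> <NODE>`` and that the action name itself
contains no underscore, so resource names with underscores still parse.

```python
from typing import NamedTuple


class GraphAction(NamedTuple):
    """One resource action label from a transition graph (hypothetical type)."""
    resource: str
    action: str
    interval_ms: int
    node: str


def parse_action_label(label: str) -> GraphAction:
    """Split '<RESOURCE>_<ACTION>_<INTERVAL_IN_MS> <NODE>' into its fields.

    Assumption: the action name has no underscore, so the last two
    underscore-separated fields are the action and the interval.
    Raises ValueError if the label does not have the expected shape.
    """
    key, node = label.rsplit(" ", 1)
    resource, action, interval = key.rsplit("_", 2)
    return GraphAction(resource, action, int(interval), node)


def is_recurring(action: GraphAction) -> bool:
    """An interval of 0 means the action runs once; anything else recurs."""
    return action.interval_ms != 0
```

For example, ``parse_action_label("rsc1_monitor_0 pcmk-2")`` yields a one-time
``monitor`` of ``rsc1`` on ``pcmk-2``, which is the probe described in the
bullet about intervals.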
.. topic:: Small Cluster Transition
.. image:: ../shared/images/Policy-Engine-small.png
:alt: An example transition graph as represented by Graphviz
- :height: 325
- :width: 1161
- :scale: 75 %
:align: center
In the above example, it appears that a new node, ``pcmk-2``, has come online
and that the cluster is checking to make sure ``rsc1``, ``rsc2`` and ``rsc3``
are not already running there (indicated by the ``rscN_monitor_0`` entries).
Once it did that, and assuming the resources were not active there, it would
have liked to stop ``rsc1`` and ``rsc2`` on ``pcmk-1`` and move them to
``pcmk-2``. However, there appears to be some problem and the cluster cannot or
is not permitted to perform the stop actions, which implies it also cannot
perform the start actions. For some reason, the cluster does not want to start
``rsc3`` anywhere.
.. topic:: Complex Cluster Transition
.. image:: ../shared/images/Policy-Engine-big.png
:alt: Complex transition graph that you're not expected to be able to read
- :width: 1455
- :height: 1945
- :scale: 75 %
:align: center
What-if scenarios
_________________
You can make changes to the saved or shadow CIB and simulate it again, to see
how Pacemaker would react differently. You can edit the XML by hand, use
command-line tools such as ``cibadmin`` with either a shadow CIB or the
``CIB_file`` environment variable set to the filename, or use higher-level tool
support (see the man pages of the specific tool you're using for how to perform
actions on a saved CIB file rather than the live CIB).
You can also inject node failures and/or action failures into the simulation;
see the ``crm_simulate`` man page for more details.
This capability is useful when using a shadow CIB to edit the configuration.
Before committing the changes to the live cluster with ``crm_shadow --commit``,
you can use ``crm_simulate`` to see how the cluster will react to the changes.
-.. _attrd_updater:
-
.. _crm_attribute:
.. index::
single: attrd_updater
single: command-line tool; attrd_updater
single: crm_attribute
single: command-line tool; crm_attribute
Manage Node Attributes, Cluster Options and Defaults with crm_attribute and attrd_updater
#########################################################################################
``crm_attribute`` and ``attrd_updater`` are confusingly similar tools with subtle
differences.
``attrd_updater`` can query and update node attributes. ``crm_attribute`` can query
and update not only node attributes, but also cluster options, resource
defaults, and operation defaults.
To understand the differences, it helps to understand the various types of node
attribute.
.. table:: **Types of Node Attributes**
+-----------+----------+-------------------+------------------+----------------+----------------+
| Type | Recorded | Recorded in | Survive full | Manageable by | Manageable by |
| | in CIB? | attribute manager | cluster restart? | crm_attribute? | attrd_updater? |
| | | memory? | | | |
+===========+==========+===================+==================+================+================+
| permanent | yes | no | yes | yes | no |
+-----------+----------+-------------------+------------------+----------------+----------------+
| transient | yes | yes | no | yes | yes |
+-----------+----------+-------------------+------------------+----------------+----------------+
| private | no | yes | no | no | yes |
+-----------+----------+-------------------+------------------+----------------+----------------+
As you can see from the table above, ``crm_attribute`` can manage permanent and
transient node attributes, while ``attrd_updater`` can manage transient and
private node attributes.
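As an illustration only, the table above can be written down as a small Python
lookup. The dictionary keys and the ``tools_for`` helper are hypothetical names
chosen for this sketch; the values are copied from the table.

```python
# Each entry mirrors one row of the "Types of Node Attributes" table.
NODE_ATTRIBUTE_TYPES = {
    "permanent": {
        "in_cib": True,
        "in_attribute_manager_memory": False,
        "survives_full_restart": True,
        "crm_attribute": True,
        "attrd_updater": False,
    },
    "transient": {
        "in_cib": True,
        "in_attribute_manager_memory": True,
        "survives_full_restart": False,
        "crm_attribute": True,
        "attrd_updater": True,
    },
    "private": {
        "in_cib": False,
        "in_attribute_manager_memory": True,
        "survives_full_restart": False,
        "crm_attribute": False,
        "attrd_updater": True,
    },
}


def tools_for(attr_type: str) -> list:
    """Return the tools able to manage the given node attribute type."""
    row = NODE_ATTRIBUTE_TYPES[attr_type]  # KeyError for an unknown type
    return [tool for tool in ("attrd_updater", "crm_attribute") if row[tool]]
```

Calling ``tools_for("private")`` returns only ``attrd_updater``, matching the
last row of the table.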
The difference between the two tools lies mainly in *how* they update node
attributes: ``attrd_updater`` always contacts the Pacemaker attribute manager
directly, while ``crm_attribute`` will contact the attribute manager only for
transient node attributes, and will instead modify the CIB directly for
permanent node attributes (and for transient node attributes when unable to
contact the attribute manager).
By contacting the attribute manager directly, ``attrd_updater`` can change
an attribute's "dampening" (whether changes are immediately flushed to the CIB
or after a specified amount of time, to minimize disk writes for frequent
changes), set private node attributes (which are never written to the CIB), and
set attributes for nodes that don't yet exist.
By modifying the CIB directly, ``crm_attribute`` can set permanent node
attributes (which are only in the CIB and not managed by the attribute
manager), and can be used with saved CIB files and shadow CIBs.
Regardless of how a transient node attribute is set, it is synchronized between
the CIB and the attribute manager, on all nodes.
.. index::
single: crm_failcount
single: command-line tool; crm_failcount
single: crm_node
single: command-line tool; crm_node
single: crm_report
single: command-line tool; crm_report
single: crm_standby
single: command-line tool; crm_standby
single: crm_verify
single: command-line tool; crm_verify
single: stonith_admin
single: command-line tool; stonith_admin
Other Commonly Used Tools
#########################
Other command-line tools include:
* ``crm_failcount``: query or delete resource fail counts
* ``crm_node``: manage cluster nodes
* ``crm_report``: generate a detailed cluster report for bug submissions
* ``crm_resource``: manage cluster resources
* ``crm_standby``: manage standby status of nodes
* ``crm_verify``: validate a CIB
* ``stonith_admin``: manage fencing devices
See the manual pages for details.
.. rubric:: Footnotes
.. [#] Graph visualization software. See http://www.graphviz.org/ for details.
diff --git a/doc/sphinx/Pacemaker_Explained/nodes.rst b/doc/sphinx/Pacemaker_Explained/nodes.rst
index 84069ea931..25f2c2129f 100644
--- a/doc/sphinx/Pacemaker_Explained/nodes.rst
+++ b/doc/sphinx/Pacemaker_Explained/nodes.rst
@@ -1,247 +1,246 @@
Cluster Nodes
-------------
Defining a Cluster Node
_______________________
Each cluster node will have an entry in the ``nodes`` section containing at
least an ID and a name. A cluster node's ID is defined by the cluster layer
(Corosync).
.. topic:: **Example Corosync cluster node entry**
.. code-block:: xml
<node id="101" uname="pcmk-1"/>
In normal circumstances, the admin should let the cluster populate this
information automatically from the cluster layer.
.. _node_name:
Where Pacemaker Gets the Node Name
##################################
The name that Pacemaker uses for a node in the configuration does not have to
be the same as its local hostname. Pacemaker uses the following for a Corosync
node's name, in order of most preferred first:
* The value of ``name`` in the ``nodelist`` section of ``corosync.conf``
* The value of ``ring0_addr`` in the ``nodelist`` section of ``corosync.conf``
* The local hostname (value of ``uname -n``)
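The preference order above can be sketched in Python as follows. This is only
an illustration of the ordering, not Pacemaker's actual implementation, which
may apply further checks; ``pick_node_name`` and its parameters are
hypothetical names.

```python
from typing import Optional


def pick_node_name(nodelist_name: Optional[str],
                   ring0_addr: Optional[str],
                   hostname: str) -> str:
    """Choose a node name using the documented order of preference.

    nodelist_name: value of ``name`` in the corosync.conf nodelist, if any
    ring0_addr:    value of ``ring0_addr`` in the nodelist, if any
    hostname:      local hostname, as reported by ``uname -n``
    """
    for candidate in (nodelist_name, ring0_addr):
        if candidate:  # skip values that are unset or empty
            return candidate
    return hostname
```

With no ``name`` configured, ``pick_node_name(None, "pcmk-1", "host1")`` falls
through to the ``ring0_addr`` value.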
If the cluster is running, the ``crm_node -n`` command will display the local
node's name as used by the cluster.
If a Corosync ``nodelist`` is used, ``crm_node --name-for-id`` with a Corosync
node ID will display the name used by the node with the given Corosync
``nodeid``, for example:
.. code-block:: none
crm_node --name-for-id 2
.. index::
single: node; attribute
single: node attribute
.. _node_attributes:
Node Attributes
_______________
Pacemaker allows node-specific values to be specified using *node attributes*.
A node attribute has a name, and may have a distinct value for each node.
Node attributes come in two types, *permanent* and *transient*. Permanent node
attributes are kept within the ``node`` entry, and keep their values even if
the cluster restarts on a node. Transient node attributes are kept in the CIB's
``status`` section, and go away when the cluster stops on the node.
While certain node attributes have specific meanings to the cluster, they are
mainly intended to allow administrators and resource agents to track any
information desired.
For example, an administrator might choose to define node attributes for how
much RAM and disk space each node has, which OS each uses, or which server room
rack each node is in.
Users can configure :ref:`rules` that use node attributes to affect where
resources are placed.
Setting and querying node attributes
####################################
Node attributes can be set and queried using the ``crm_attribute`` and
``attrd_updater`` commands, so that the user does not have to deal with XML
configuration directly.
Here is an example command to set a permanent node attribute, and the XML
configuration that would be generated:
.. topic:: **Result of using crm_attribute to specify which kernel pcmk-1 is running**
.. code-block:: none
# crm_attribute --type nodes --node pcmk-1 --name kernel --update $(uname -r)
.. code-block:: xml
<node id="1" uname="pcmk-1">
<instance_attributes id="nodes-1-attributes">
<nvpair id="nodes-1-kernel" name="kernel" value="3.10.0-862.14.4.el7.x86_64"/>
</instance_attributes>
</node>
To read back the value that was just set:
.. code-block:: none
# crm_attribute --type nodes --node pcmk-1 --name kernel --query
scope=nodes name=kernel value=3.10.0-862.14.4.el7.x86_64
The ``--type nodes`` indicates that this is a permanent node attribute;
``--type status`` would indicate a transient node attribute.
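When scripting around ``crm_attribute``, the query output shown above can be
turned into a dictionary. The following Python sketch assumes the line has
exactly the ``scope=... name=... value=...`` shape seen in the sample output,
with ``value=`` last; other Pacemaker versions or output formats may differ.
``parse_attribute_line`` is a hypothetical helper, not a Pacemaker API.

```python
def parse_attribute_line(line: str) -> dict:
    """Parse one 'scope=... name=... value=...' line into a dict.

    Assumption: 'value=' is the final field, so everything after it
    (including any spaces) belongs to the value.
    Raises ValueError if the line does not match that shape.
    """
    head, sep, value = line.strip().partition(" value=")
    if not sep:
        raise ValueError(f"no value field in: {line!r}")
    # Remaining fields are whitespace-separated key=value tokens.
    fields = dict(token.split("=", 1) for token in head.split())
    fields["value"] = value
    return fields
```

Applied to the sample line above, the result maps ``scope`` to ``nodes``,
``name`` to ``kernel``, and ``value`` to the kernel version string.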
Special node attributes
#######################
Certain node attributes have special meaning to the cluster.
Node attribute names beginning with ``#`` are considered reserved for these
special attributes. Some special attributes do not start with ``#``, for
historical reasons.
Certain special attributes are set automatically by the cluster, should never
be modified directly, and can be used only within :ref:`rules`; these are
listed under
:ref:`built-in node attributes <node-attribute-expressions-special>`.
For true/false values, the cluster considers a value of "1", "y", "yes", "on",
or "true" (case-insensitively) to be true, "0", "n", "no", "off", "false", or
unset to be false, and anything else to be an error.
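The rule above can be expressed as a short Python sketch. It assumes the
case-insensitive comparison applies to the false values as well as the true
ones, and it represents "unset" as ``None``; ``parse_cluster_boolean`` is a
hypothetical helper, not a Pacemaker function.

```python
from typing import Optional

TRUE_VALUES = {"1", "y", "yes", "on", "true"}
FALSE_VALUES = {"0", "n", "no", "off", "false"}


def parse_cluster_boolean(value: Optional[str]) -> bool:
    """Interpret a true/false attribute value as described above."""
    if value is None:
        return False  # an unset attribute counts as false
    lowered = value.lower()
    if lowered in TRUE_VALUES:
        return True
    if lowered in FALSE_VALUES:
        return False
    # Anything else is treated as an error.
    raise ValueError(f"not a valid true/false value: {value!r}")
```

A script that reads an attribute such as ``maintenance`` could pass the raw
string through this helper before acting on it.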
.. table:: **Node attributes with special significance**
+----------------------------+-----------------------------------------------------+
| Name | Description |
+============================+=====================================================+
| fail-count-* | .. index:: |
| | pair: node attribute; fail-count |
| | |
| | Attributes whose names start with |
| | ``fail-count-`` are managed by the cluster |
| | to track how many times particular resource |
| | operations have failed on this node. These |
| | should be queried and cleared via the |
| | ``crm_failcount`` or |
| | ``crm_resource --cleanup`` commands rather |
| | than directly. |
+----------------------------+-----------------------------------------------------+
| last-failure-* | .. index:: |
| | pair: node attribute; last-failure |
| | |
| | Attributes whose names start with |
| | ``last-failure-`` are managed by the cluster |
| | to track when particular resource operations |
| | have most recently failed on this node. |
| | These should be cleared via the |
| | ``crm_failcount`` or |
| | ``crm_resource --cleanup`` commands rather |
| | than directly. |
+----------------------------+-----------------------------------------------------+
| maintenance | .. index:: |
| | pair: node attribute; maintenance |
| | |
| | Similar to the ``maintenance-mode`` |
| | :ref:`cluster option <cluster_options>`, but |
| | for a single node. If true, resources will |
| | not be started or stopped on the node, |
| | resources and individual clone instances |
| | running on the node will become unmanaged, |
| | and any recurring operations for those will |
| | be cancelled. |
| | |
- | | .. warning:: |
- | | Restarting pacemaker on a node that is in |
- | | single-node maintenance mode will likely |
- | | lead to undesirable effects. If |
- | | ``maintenance`` is set as a transient |
- | | attribute, it will be erased when |
- | | Pacemaker is stopped, which will |
- | | immediately take the node out of |
- | | maintenance mode and likely get it |
- | | fenced. Even if permanent, if Pacemaker |
- | | is restarted, any resources active on the |
- | | node will have their local history erased |
- | | when the node rejoins, so the cluster |
- | | will no longer consider them running on |
- | | the node and thus will consider them |
- | | managed again, leading them to be started |
- | | elsewhere. This behavior might be |
- | | improved in a future release. |
+ | | **Warning:** Restarting pacemaker on a node that is |
+ | | in single-node maintenance mode will likely |
+ | | lead to undesirable effects. If |
+ | | ``maintenance`` is set as a transient |
+ | | attribute, it will be erased when |
+ | | Pacemaker is stopped, which will |
+ | | immediately take the node out of |
+ | | maintenance mode and likely get it |
+ | | fenced. Even if permanent, if Pacemaker |
+ | | is restarted, any resources active on the |
+ | | node will have their local history erased |
+ | | when the node rejoins, so the cluster |
+ | | will no longer consider them running on |
+ | | the node and thus will consider them |
+ | | managed again, leading them to be started |
+ | | elsewhere. This behavior might be |
+ | | improved in a future release. |
+----------------------------+-----------------------------------------------------+
| probe_complete | .. index:: |
| | pair: node attribute; probe_complete |
| | |
| | This is managed by the cluster to detect |
| | when nodes need to be reprobed, and should |
| | never be used directly. |
+----------------------------+-----------------------------------------------------+
| resource-discovery-enabled | .. index:: |
| | pair: node attribute; resource-discovery-enabled |
| | |
| | If the node is a remote node, fencing is enabled, |
| | and this attribute is explicitly set to false |
| | (unset means true in this case), resource discovery |
| | (probes) will not be done on this node. This is |
| | highly discouraged; the ``resource-discovery`` |
| | location constraint property is preferred for this |
| | purpose. |
+----------------------------+-----------------------------------------------------+
| shutdown | .. index:: |
| | pair: node attribute; shutdown |
| | |
| | This is managed by the cluster to orchestrate the |
| | shutdown of a node, and should never be used |
| | directly. |
+----------------------------+-----------------------------------------------------+
| site-name | .. index:: |
| | pair: node attribute; site-name |
| | |
| | If set, this will be used as the value of the |
| | ``#site-name`` node attribute used in rules. (If |
| | not set, the value of the ``cluster-name`` cluster |
| | option will be used as ``#site-name`` instead.) |
+----------------------------+-----------------------------------------------------+
| standby | .. index:: |
| | pair: node attribute; standby |
| | |
| | If true, the node is in standby mode. This is |
| | typically set and queried via the ``crm_standby`` |
| | command rather than directly. |
+----------------------------+-----------------------------------------------------+
| terminate | .. index:: |
| | pair: node attribute; terminate |
| | |
| | If the value is true or begins with any nonzero |
| | number, the node will be fenced. This is typically |
| | set by tools rather than directly. |
+----------------------------+-----------------------------------------------------+
| #digests-* | .. index:: |
| | pair: node attribute; #digests |
| | |
| | Attributes whose names start with ``#digests-`` are |
| | managed by the cluster to detect when |
| | :ref:`unfencing` needs to be redone, and should |
| | never be used directly. |
+----------------------------+-----------------------------------------------------+
| #node-unfenced | .. index:: |
| | pair: node attribute; #node-unfenced |
| | |
| | When the node was last unfenced (as seconds since |
| | the epoch). This is managed by the cluster and |
| | should never be used directly. |
+----------------------------+-----------------------------------------------------+
diff --git a/doc/sphinx/Pacemaker_Explained/options.rst b/doc/sphinx/Pacemaker_Explained/options.rst
index 2e21bb6180..cbb849f991 100644
--- a/doc/sphinx/Pacemaker_Explained/options.rst
+++ b/doc/sphinx/Pacemaker_Explained/options.rst
@@ -1,614 +1,611 @@
Cluster-Wide Configuration
--------------------------
.. index::
pair: XML element; cib
pair: XML element; configuration
Configuration Layout
####################
The cluster is defined by the Cluster Information Base (CIB), which uses XML
notation. The simplest CIB, an empty one, looks like this:
.. topic:: An empty configuration
.. code-block:: xml
<cib crm_feature_set="3.6.0" validate-with="pacemaker-3.5" epoch="1" num_updates="0" admin_epoch="0">
<configuration>
<crm_config/>
<nodes/>
<resources/>
<constraints/>
</configuration>
<status/>
</cib>
The empty configuration above contains the major sections that make up a CIB:
* ``cib``: The entire CIB is enclosed with a ``cib`` element. Certain
fundamental settings are defined as attributes of this element.
* ``configuration``: This section -- the primary focus of this document --
contains traditional configuration information such as what resources the
cluster serves and the relationships among them.
* ``crm_config``: cluster-wide configuration options
* ``nodes``: the machines that host the cluster
* ``resources``: the services run by the cluster
* ``constraints``: indications of how resources should be placed
* ``status``: This section contains the history of each resource on each
node. Based on this data, the cluster can construct the complete current
state of the cluster. The authoritative source for this section is the
local executor (pacemaker-execd process) on each cluster node, and the
cluster will occasionally repopulate the entire section. For this reason,
it is never written to disk, and administrators are advised against
modifying it in any way.
In this document, configuration settings will be described as properties or
options based on how they are defined in the CIB:
* Properties are XML attributes of an XML element.
* Options are name-value pairs expressed as ``nvpair`` child elements of an XML
element.
Normally, you will use command-line tools that abstract the XML, so the
distinction will be unimportant; both properties and options are cluster
settings you can tweak.
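As a rough illustration of the distinction, in the sketch below ``admin_epoch``
is a property (an XML attribute of the ``cib`` element), while
``no-quorum-policy`` is an option (an ``nvpair`` child element). The ``id``
values and the chosen settings are arbitrary examples, not recommendations.

.. topic:: A property and an option (illustrative sketch)

   .. code-block:: xml

      <!-- admin_epoch is a property: an attribute of the cib element -->
      <cib crm_feature_set="3.6.0" validate-with="pacemaker-3.5" epoch="1" num_updates="0" admin_epoch="1">
        <configuration>
          <crm_config>
            <cluster_property_set id="example-options">
              <!-- no-quorum-policy is an option: a name-value pair -->
              <nvpair id="example-options-no-quorum-policy" name="no-quorum-policy" value="stop"/>
            </cluster_property_set>
          </crm_config>
          <nodes/>
          <resources/>
          <constraints/>
        </configuration>
        <status/>
      </cib>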
CIB Properties
##############
Certain settings are defined by CIB properties (that is, attributes of the
``cib`` tag) rather than with the rest of the cluster configuration in the
``configuration`` section.
The reason is simply a matter of parsing. These options are used by the
configuration database which is, by design, mostly ignorant of the content it
holds. So the decision was made to place them in an easy-to-find location.
.. table:: **CIB Properties**
+------------------+-----------------------------------------------------------+
| Attribute | Description |
+==================+===========================================================+
| admin_epoch | .. index:: |
| | pair: admin_epoch; cib |
| | |
| | When a node joins the cluster, the cluster performs a |
| | check to see which node has the best configuration. It |
| | asks the node with the highest (``admin_epoch``, |
| | ``epoch``, ``num_updates``) tuple to replace the |
| | configuration on all the nodes -- which makes setting |
| | them, and setting them correctly, very important. |
| | ``admin_epoch`` is never modified by the cluster; you can |
| | use this to make the configurations on any inactive nodes |
| | obsolete. |
| | |
- | | .. warning:: |
- | | Never set this value to zero. In such cases, the |
- | | cluster cannot tell the difference between your |
- | | configuration and the "empty" one used when nothing is |
- | | found on disk. |
+ | | **Warning:** Never set this value to zero. In such cases, |
+ | | the cluster cannot tell the difference between your |
+ | | configuration and the "empty" one used when nothing is |
+ | | found on disk. |
+------------------+-----------------------------------------------------------+
| epoch | .. index:: |
| | pair: epoch; cib |
| | |
| | The cluster increments this every time the configuration |
| | is updated (usually by the administrator). |
+------------------+-----------------------------------------------------------+
| num_updates | .. index:: |
| | pair: num_updates; cib |
| | |
| | The cluster increments this every time the configuration |
| | or status is updated (usually by the cluster) and resets |
| | it to 0 when epoch changes. |
+------------------+-----------------------------------------------------------+
| validate-with | .. index:: |
| | pair: validate-with; cib |
| | |
| | Determines the type of XML validation that will be done |
| | on the configuration. If set to ``none``, the cluster |
| | will not verify that updates conform to the DTD (nor |
| | reject ones that don't). |
+------------------+-----------------------------------------------------------+
| cib-last-written | .. index:: |
| | pair: cib-last-written; cib |
| | |
| | Indicates when the configuration was last written to |
| | disk. Maintained by the cluster; for informational |
| | purposes only. |
+------------------+-----------------------------------------------------------+
| have-quorum | .. index:: |
| | pair: have-quorum; cib |
| | |
| | Indicates if the cluster has quorum. If false, this may |
| | mean that the cluster cannot start resources or fence |
| | other nodes (see ``no-quorum-policy`` below). Maintained |
| | by the cluster. |
+------------------+-----------------------------------------------------------+
| dc-uuid | .. index:: |
| | pair: dc-uuid; cib |
| | |
| | Indicates which cluster node is the current leader. Used |
| | by the cluster when placing resources and determining the |
| | order of some events. Maintained by the cluster. |
+------------------+-----------------------------------------------------------+
.. _cluster_options:
Cluster Options
###############
Cluster options, as you might expect, control how the cluster behaves when
confronted with various situations.
They are grouped into sets within the ``crm_config`` section. In advanced
configurations, there may be more than one set. (This will be described later
in the chapter on :ref:`rules` where we will show how to have the cluster use
different sets of options during working hours than during weekends.) For now,
we will describe the simple case where each option is present at most once.
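For orientation, a single set of cluster options might look like the following
sketch. The set ``id`` and the option values shown are assumptions chosen for
illustration; in practice, command-line tools normally create and maintain
these entries for you.

.. topic:: One set of cluster options (illustrative sketch)

   .. code-block:: xml

      <crm_config>
        <cluster_property_set id="example-options">
          <!-- each option appears at most once in this simple case -->
          <nvpair id="example-options-stonith-enabled" name="stonith-enabled" value="true"/>
          <nvpair id="example-options-symmetric-cluster" name="symmetric-cluster" value="true"/>
          <nvpair id="example-options-cluster-recheck-interval" name="cluster-recheck-interval" value="15min"/>
        </cluster_property_set>
      </crm_config>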
You can obtain an up-to-date list of cluster options, including their default
values, by running the ``man pacemaker-schedulerd`` and
``man pacemaker-controld`` commands.
.. table:: **Cluster Options**
+---------------------------+---------+----------------------------------------------------+
| Option | Default | Description |
+===========================+=========+====================================================+
| cluster-name | | .. index:: |
| | | pair: cluster option; cluster-name |
| | | |
| | | An (optional) name for the cluster as a whole. |
| | | This is mostly for users' convenience for use |
| | | as desired in administration, but this can be |
| | | used in the Pacemaker configuration in |
| | | :ref:`rules` (as the ``#cluster-name`` |
| | | :ref:`node attribute |
| | | <node-attribute-expressions-special>`). It may |
| | | also be used by higher-level tools when |
| | | displaying cluster information, and by |
| | | certain resource agents (for example, the |
| | | ``ocf:heartbeat:GFS2`` agent stores the |
| | | cluster name in filesystem meta-data). |
+---------------------------+---------+----------------------------------------------------+
| dc-version | | .. index:: |
| | | pair: cluster option; dc-version |
| | | |
| | | Version of Pacemaker on the cluster's DC. |
| | | Determined automatically by the cluster. Often |
| | | includes the hash which identifies the exact |
| | | Git changeset it was built from. Used for |
| | | diagnostic purposes. |
+---------------------------+---------+----------------------------------------------------+
| cluster-infrastructure | | .. index:: |
| | | pair: cluster option; cluster-infrastructure |
| | | |
| | | The messaging stack on which Pacemaker is |
| | | currently running. Determined automatically by |
| | | the cluster. Used for informational and |
| | | diagnostic purposes. |
+---------------------------+---------+----------------------------------------------------+
| no-quorum-policy | stop | .. index:: |
| | | pair: cluster option; no-quorum-policy |
| | | |
| | | What to do when the cluster does not have |
| | | quorum. Allowed values: |
| | | |
| | | * ``ignore``: continue all resource management |
| | | * ``freeze``: continue resource management, but |
| | | don't recover resources from nodes not in the |
| | | affected partition |
| | | * ``stop``: stop all resources in the affected |
| | | cluster partition |
| | | * ``demote``: demote promotable resources and |
| | | stop all other resources in the affected |
| | | cluster partition *(since 2.0.5)* |
| | | * ``suicide``: fence all nodes in the affected |
| | | cluster partition |
+---------------------------+---------+----------------------------------------------------+
| batch-limit | 0 | .. index:: |
| | | pair: cluster option; batch-limit |
| | | |
| | | The maximum number of actions that the cluster |
| | | may execute in parallel across all nodes. The |
| | | "correct" value will depend on the speed and |
| | | load of your network and cluster nodes. If zero, |
| | | the cluster will impose a dynamically calculated |
| | | limit only when any node has high load. |
+---------------------------+---------+----------------------------------------------------+
| migration-limit | -1 | .. index:: |
| | | pair: cluster option; migration-limit |
| | | |
| | | The number of |
| | | :ref:`live migration <live-migration>` actions |
| | | that the cluster is allowed to execute in |
| | | parallel on a node. A value of -1 means |
| | | unlimited. |
+---------------------------+---------+----------------------------------------------------+
| symmetric-cluster | true | .. index:: |
| | | pair: cluster option; symmetric-cluster |
| | | |
| | | Whether resources can run on any node by default |
| | | (if false, a resource is allowed to run on a |
| | | node only if a |
| | | :ref:`location constraint <location-constraint>` |
| | | enables it) |
+---------------------------+---------+----------------------------------------------------+
| stop-all-resources | false | .. index:: |
| | | pair: cluster option; stop-all-resources |
| | | |
| | | Whether all resources should be disallowed from |
| | | running (can be useful during maintenance) |
+---------------------------+---------+----------------------------------------------------+
| stop-orphan-resources | true | .. index:: |
| | | pair: cluster option; stop-orphan-resources |
| | | |
| | | Whether resources that have been deleted from |
| | | the configuration should be stopped. This value |
| | | takes precedence over ``is-managed`` (that is, |
| | | even unmanaged resources will be stopped when |
| | | orphaned if this value is ``true``) |
+---------------------------+---------+----------------------------------------------------+
| stop-orphan-actions | true | .. index:: |
| | | pair: cluster option; stop-orphan-actions |
| | | |
| | | Whether recurring :ref:`operations <operation>` |
| | | that have been deleted from the configuration |
| | | should be cancelled |
+---------------------------+---------+----------------------------------------------------+
| start-failure-is-fatal | true | .. index:: |
| | | pair: cluster option; start-failure-is-fatal |
| | | |
| | | Whether a failure to start a resource on a |
| | | particular node prevents further start attempts |
| | | on that node. If ``false``, the cluster will |
| | | decide whether the node is still eligible based |
| | | on the resource's current failure count and |
| | | :ref:`migration-threshold <failure-handling>`. |
+---------------------------+---------+----------------------------------------------------+
| enable-startup-probes | true | .. index:: |
| | | pair: cluster option; enable-startup-probes |
| | | |
| | | Whether the cluster should check the |
| | | pre-existing state of resources when the cluster |
| | | starts |
+---------------------------+---------+----------------------------------------------------+
| maintenance-mode | false | .. index:: |
| | | pair: cluster option; maintenance-mode |
| | | |
| | | Whether the cluster should refrain from |
| | | monitoring, starting and stopping resources |
+---------------------------+---------+----------------------------------------------------+
| stonith-enabled | true | .. index:: |
| | | pair: cluster option; stonith-enabled |
| | | |
| | | Whether the cluster is allowed to fence nodes |
| | | (for example, failed nodes and nodes with |
| | | resources that can't be stopped). |
| | | |
| | | If true, at least one fence device must be |
| | | configured before resources are allowed to run. |
| | | |
| | | If false, unresponsive nodes are immediately |
| | | assumed to be running no resources, and resource |
| | | recovery on online nodes starts without any |
| | | further protection (which can mean *data loss* |
| | | if the unresponsive node still accesses shared |
| | | storage, for example). See also the |
| | | :ref:`requires <requires>` resource |
| | | meta-attribute. |
+---------------------------+---------+----------------------------------------------------+
| stonith-action | reboot | .. index:: |
| | | pair: cluster option; stonith-action |
| | | |
| | | Action the cluster should send to the fence agent |
| | | when a node must be fenced. Allowed values are |
| | | ``reboot``, ``off``, and (for legacy agents only) |
| | | ``poweroff``. |
+---------------------------+---------+----------------------------------------------------+
| stonith-timeout | 60s | .. index:: |
| | | pair: cluster option; stonith-timeout |
| | | |
| | | How long to wait for ``on``, ``off``, and |
| | | ``reboot`` fence actions to complete by default. |
+---------------------------+---------+----------------------------------------------------+
| stonith-max-attempts | 10 | .. index:: |
| | | pair: cluster option; stonith-max-attempts |
| | | |
| | | How many times fencing can fail for a target |
| | | before the cluster will no longer immediately |
| | | re-attempt it. |
+---------------------------+---------+----------------------------------------------------+
| stonith-watchdog-timeout | 0 | .. index:: |
| | | pair: cluster option; stonith-watchdog-timeout |
| | | |
| | | If nonzero, and the cluster detects |
| | | ``have-watchdog`` as ``true``, then watchdog-based |
| | | self-fencing will be performed via SBD when |
| | | fencing is required, without requiring a fencing |
| | | resource explicitly configured. |
| | | |
| | | If this is set to a positive value, unseen nodes |
| | | are assumed to self-fence within this much time. |
| | | |
- | | | .. warning:: |
- | | | It must be ensured that this value is larger |
- | | | than the ``SBD_WATCHDOG_TIMEOUT`` environment |
- | | | variable on all nodes. Pacemaker verifies the |
- | | | settings individually on all nodes and prevents |
- | | | startup or shuts down if configured wrongly on |
- | | | the fly. It is strongly recommended that |
- | | | ``SBD_WATCHDOG_TIMEOUT`` be set to the same |
- | | | value on all nodes. |
+ | | | **Warning:** It must be ensured that this value is |
+ | | | larger than the ``SBD_WATCHDOG_TIMEOUT`` |
+ | | | environment variable on all nodes. Pacemaker |
+ | | | verifies the settings individually on all nodes |
+ | | | and prevents startup or shuts down if configured |
+ | | | wrongly on the fly. It is strongly recommended |
+ | | | that ``SBD_WATCHDOG_TIMEOUT`` be set to the same |
+ | | | value on all nodes. |
| | | |
| | | If this is set to a negative value, and |
| | | ``SBD_WATCHDOG_TIMEOUT`` is set, twice that value |
| | | will be used. |
| | | |
- | | | .. warning:: |
- | | | In this case, it is essential (and currently |
- | | | not verified by pacemaker) that |
- | | | ``SBD_WATCHDOG_TIMEOUT`` is set to the same |
- | | | value on all nodes. |
+ | | | **Warning:** In this case, it is essential (and |
+ | | | currently not verified by pacemaker) that |
+ | | | ``SBD_WATCHDOG_TIMEOUT`` is set to the same |
+ | | | value on all nodes. |
+---------------------------+---------+----------------------------------------------------+
| concurrent-fencing | false | .. index:: |
| | | pair: cluster option; concurrent-fencing |
| | | |
| | | Whether the cluster is allowed to initiate multiple|
| | | fence actions concurrently |
+---------------------------+---------+----------------------------------------------------+
| fence-reaction | stop | .. index:: |
| | | pair: cluster option; fence-reaction |
| | | |
| | | How should a cluster node react if notified of its |
| | | own fencing? A cluster node may receive |
| | | notification of its own fencing if fencing is |
| | | misconfigured, or if fabric fencing is in use that |
| | | doesn't cut cluster communication. Allowed values |
| | | are ``stop`` to attempt to immediately stop |
| | | pacemaker and stay stopped, or ``panic`` to |
| | | attempt to immediately reboot the local node, |
| | | falling back to stop on failure. The default is |
| | | likely to be changed to ``panic`` in a future |
| | | release. *(since 2.0.3)* |
+---------------------------+---------+----------------------------------------------------+
| priority-fencing-delay | 0 | .. index:: |
| | | pair: cluster option; priority-fencing-delay |
| | | |
| | | Apply this delay to any fencing targeting the lost |
| | | nodes with the highest total resource priority in |
| | | case we don't have the majority of the nodes in |
| | | our cluster partition, so that the more |
| | | significant nodes potentially win any fencing |
| | | match (especially meaningful in a split-brain of a |
| | | 2-node cluster). A promoted resource instance |
| | | takes the resource's priority plus 1 if the |
| | | resource's priority is not 0. Any static or random |
| | | delays introduced by ``pcmk_delay_base`` and |
| | | ``pcmk_delay_max`` configured for the |
| | | corresponding fencing resources will be added to |
| | | this delay. This delay should be significantly |
| | | greater than (safely twice) the maximum delay from |
| | | those parameters. *(since 2.0.4)* |
+---------------------------+---------+----------------------------------------------------+
| cluster-delay | 60s | .. index:: |
| | | pair: cluster option; cluster-delay |
| | | |
| | | Estimated maximum round-trip delay over the |
| | | network (excluding action execution). If the DC |
| | | requires an action to be executed on another node, |
| | | it will consider the action failed if it does not |
| | | get a response from the other node in this time |
| | | (after considering the action's own timeout). The |
| | | "correct" value will depend on the speed and load |
| | | of your network and cluster nodes. |
+---------------------------+---------+----------------------------------------------------+
| dc-deadtime | 20s | .. index:: |
| | | pair: cluster option; dc-deadtime |
| | | |
| | | How long to wait for a response from other nodes |
| | | during startup. The "correct" value will depend on |
| | | the speed/load of your network and the type of |
| | | switches used. |
+---------------------------+---------+----------------------------------------------------+
| cluster-ipc-limit | 500 | .. index:: |
| | | pair: cluster option; cluster-ipc-limit |
| | | |
| | | The maximum IPC message backlog before one cluster |
| | | daemon will disconnect another. This is of use in |
| | | large clusters, for which a good value is the |
| | | number of resources in the cluster multiplied by |
| | | the number of nodes. The default of 500 is also |
| | | the minimum. Raise this if you see |
| | | "Evicting client" messages for cluster daemon PIDs |
| | | in the logs. |
+---------------------------+---------+----------------------------------------------------+
| pe-error-series-max | -1 | .. index:: |
| | | pair: cluster option; pe-error-series-max |
| | | |
| | | The number of scheduler inputs resulting in errors |
| | | to save. Used when reporting problems. A value of |
| | | -1 means unlimited (report all). |
+---------------------------+---------+----------------------------------------------------+
| pe-warn-series-max | -1 | .. index:: |
| | | pair: cluster option; pe-warn-series-max |
| | | |
| | | The number of scheduler inputs resulting in |
| | | warnings to save. Used when reporting problems. A |
| | | value of -1 means unlimited (report all). |
+---------------------------+---------+----------------------------------------------------+
| pe-input-series-max | -1 | .. index:: |
| | | pair: cluster option; pe-input-series-max |
| | | |
| | | The number of "normal" scheduler inputs to save. |
| | | Used when reporting problems. A value of -1 means |
| | | unlimited (report all). |
+---------------------------+---------+----------------------------------------------------+
| enable-acl | false | .. index:: |
| | | pair: cluster option; enable-acl |
| | | |
| | | Whether :ref:`acl` should be used to authorize |
| | | modifications to the CIB |
+---------------------------+---------+----------------------------------------------------+
| placement-strategy | default | .. index:: |
| | | pair: cluster option; placement-strategy |
| | | |
| | | How the cluster should allocate resources to nodes |
| | | (see :ref:`utilization`). Allowed values are |
| | | ``default``, ``utilization``, ``balanced``, and |
| | | ``minimal``. |
+---------------------------+---------+----------------------------------------------------+
| node-health-strategy | none | .. index:: |
| | | pair: cluster option; node-health-strategy |
| | | |
| | | How the cluster should react to node health |
| | | attributes (see :ref:`node-health`). Allowed values|
| | | are ``none``, ``migrate-on-red``, ``only-green``, |
| | | ``progressive``, and ``custom``. |
+---------------------------+---------+----------------------------------------------------+
| node-health-base | 0 | .. index:: |
| | | pair: cluster option; node-health-base |
| | | |
| | | The base health score assigned to a node. Only |
| | | used when ``node-health-strategy`` is |
| | | ``progressive``. |
+---------------------------+---------+----------------------------------------------------+
| node-health-green | 0 | .. index:: |
| | | pair: cluster option; node-health-green |
| | | |
| | | The score to use for a node health attribute whose |
| | | value is ``green``. Only used when |
| | | ``node-health-strategy`` is ``progressive`` or |
| | | ``custom``. |
+---------------------------+---------+----------------------------------------------------+
| node-health-yellow | 0 | .. index:: |
| | | pair: cluster option; node-health-yellow |
| | | |
| | | The score to use for a node health attribute whose |
| | | value is ``yellow``. Only used when |
| | | ``node-health-strategy`` is ``progressive`` or |
| | | ``custom``. |
+---------------------------+---------+----------------------------------------------------+
| node-health-red | 0 | .. index:: |
| | | pair: cluster option; node-health-red |
| | | |
| | | The score to use for a node health attribute whose |
| | | value is ``red``. Only used when |
| | | ``node-health-strategy`` is ``progressive`` or |
| | | ``custom``. |
+---------------------------+---------+----------------------------------------------------+
| cluster-recheck-interval | 15min | .. index:: |
| | | pair: cluster option; cluster-recheck-interval |
| | | |
| | | Pacemaker is primarily event-driven, and looks |
| | | ahead to know when to recheck the cluster for |
| | | failure timeouts and most time-based rules |
| | | *(since 2.0.3)*. However, it will also recheck the |
| | | cluster after this amount of inactivity. This has |
| | | two goals: rules with ``date_spec`` are only |
| | | guaranteed to be checked this often, and it also |
| | | serves as a fail-safe for some kinds of scheduler |
| | | bugs. A value of 0 disables this polling; positive |
| | | values are a time interval. |
+---------------------------+---------+----------------------------------------------------+
| shutdown-lock | false | .. index:: |
| | | pair: cluster option; shutdown-lock |
| | | |
| | | The default of false allows active resources to be |
| | | recovered elsewhere when their node is cleanly |
| | | shut down, which is what the vast majority of |
| | | users will want. However, some users prefer to |
| | | make resources highly available only for failures, |
| | | with no recovery for clean shutdowns. If this |
| | | option is true, resources active on a node when it |
| | | is cleanly shut down are kept "locked" to that |
| | | node (not allowed to run elsewhere) until they |
| | | start again on that node after it rejoins (or for |
| | | at most ``shutdown-lock-limit``, if set). Stonith |
| | | resources and Pacemaker Remote connections are |
| | | never locked. Clone and bundle instances and the |
| | | master role of promotable clones are currently |
| | | never locked, though support could be added in a |
| | | future release. Locks may be manually cleared |
| | | using the ``--refresh`` option of ``crm_resource`` |
| | | (both the resource and node must be specified; |
| | | this works with remote nodes if their connection |
| | | resource's ``target-role`` is set to ``Stopped``, |
| | | but not if Pacemaker Remote is stopped on the |
| | | remote node without disabling the connection |
| | | resource). *(since 2.0.4)* |
+---------------------------+---------+----------------------------------------------------+
| shutdown-lock-limit | 0 | .. index:: |
| | | pair: cluster option; shutdown-lock-limit |
| | | |
| | | If ``shutdown-lock`` is true, and this is set to a |
| | | nonzero time duration, locked resources will be |
| | | allowed to start after this much time has passed |
| | | since the node shutdown was initiated, even if the |
| | | node has not rejoined. (This works with remote |
| | | nodes only if their connection resource's |
| | | ``target-role`` is set to ``Stopped``.) |
| | | *(since 2.0.4)* |
+---------------------------+---------+----------------------------------------------------+
| remove-after-stop | false | .. index:: |
| | | pair: cluster option; remove-after-stop |
| | | |
| | | *Deprecated* Should the cluster remove |
| | | resources from Pacemaker's executor after they are |
| | | stopped? Values other than the default are, at |
| | | best, poorly tested and potentially dangerous. |
| | | This option is deprecated and will be removed in a |
| | | future release. |
+---------------------------+---------+----------------------------------------------------+
| startup-fencing | true | .. index:: |
| | | pair: cluster option; startup-fencing |
| | | |
| | | *Advanced Use Only:* Should the cluster fence |
| | | unseen nodes at start-up? Setting this to false is |
| | | unsafe, because the unseen nodes could be active |
| | | and running resources but unreachable. |
+---------------------------+---------+----------------------------------------------------+
| election-timeout | 2min | .. index:: |
| | | pair: cluster option; election-timeout |
| | | |
| | | *Advanced Use Only:* If you need to adjust this |
| | | value, it probably indicates the presence of a bug.|
+---------------------------+---------+----------------------------------------------------+
| shutdown-escalation | 20min | .. index:: |
| | | pair: cluster option; shutdown-escalation |
| | | |
| | | *Advanced Use Only:* If you need to adjust this |
| | | value, it probably indicates the presence of a bug.|
+---------------------------+---------+----------------------------------------------------+
| join-integration-timeout | 3min | .. index:: |
| | | pair: cluster option; join-integration-timeout |
| | | |
| | | *Advanced Use Only:* If you need to adjust this |
| | | value, it probably indicates the presence of a bug.|
+---------------------------+---------+----------------------------------------------------+
| join-finalization-timeout | 30min | .. index:: |
| | | pair: cluster option; join-finalization-timeout |
| | | |
| | | *Advanced Use Only:* If you need to adjust this |
| | | value, it probably indicates the presence of a bug.|
+---------------------------+---------+----------------------------------------------------+
| transition-delay | 0s | .. index:: |
| | | pair: cluster option; transition-delay |
| | | |
| | | *Advanced Use Only:* Delay cluster recovery for |
| | | the configured interval to allow for additional or |
| | | related events to occur. This can be useful if |
| | | your configuration is sensitive to the order in |
| | | which ping updates arrive. Enabling this option |
| | | will slow down cluster recovery under all |
| | | conditions. |
+---------------------------+---------+----------------------------------------------------+
diff --git a/doc/sphinx/Pacemaker_Remote/intro.rst b/doc/sphinx/Pacemaker_Remote/intro.rst
index 361d4fb82d..9c5dab81a0 100644
--- a/doc/sphinx/Pacemaker_Remote/intro.rst
+++ b/doc/sphinx/Pacemaker_Remote/intro.rst
@@ -1,190 +1,186 @@
Scaling a Pacemaker Cluster
---------------------------
Overview
########
In a basic Pacemaker high-availability cluster [#]_ each node runs the full
cluster stack of Corosync and all Pacemaker components. This allows great
flexibility but limits scalability to around 16 nodes.
To allow for scalability to dozens or even hundreds of nodes, Pacemaker
allows nodes not running the full cluster stack to integrate into the cluster
and have the cluster manage their resources as if they were a cluster node.
Terms
#####
.. index::
single: cluster node
single: node; cluster node
**cluster node**
A node running the full high-availability stack of corosync and all
Pacemaker components. Cluster nodes may run cluster resources, run
all Pacemaker command-line tools (``crm_mon``, ``crm_resource`` and so on),
execute fencing actions, count toward cluster quorum, and serve as the
cluster's Designated Controller (DC).
.. index:: pacemaker_remoted
**pacemaker_remoted**
A small service daemon that allows a host to be used as a Pacemaker node
without running the full cluster stack. Nodes running ``pacemaker_remoted``
may run cluster resources and most command-line tools, but cannot perform
other functions of full cluster nodes such as fencing execution, quorum
voting, or DC eligibility. The ``pacemaker_remoted`` daemon is an enhanced
version of Pacemaker's local resource management daemon (LRMD).
.. index::
single: remote node
single: node; remote node
**pacemaker_remote**
The name of the systemd service that manages ``pacemaker_remoted``.
**Pacemaker Remote**
A way to refer to the general technology implementing nodes running
``pacemaker_remoted``, including the cluster-side implementation
and the communication protocol between them.
**remote node**
A physical host running ``pacemaker_remoted``. Remote nodes have a special
resource that manages communication with the cluster. This is sometimes
referred to as the *bare metal* case.
.. index::
single: guest node
single: node; guest node
**guest node**
A virtual host running ``pacemaker_remoted``. Guest nodes differ from remote
nodes mainly in that the guest node is itself a resource that the cluster
manages.
.. NOTE::
*Remote* in this document refers to the node not being a part of the underlying
corosync cluster. It has nothing to do with physical proximity. Remote nodes
and guest nodes are subject to the same latency requirements as cluster nodes,
which means they are typically in the same data center.
.. NOTE::
It is important to distinguish the various roles a virtual machine can serve
in Pacemaker clusters:
* A virtual machine can run the full cluster stack, in which case it is a
cluster node and is not itself managed by the cluster.
* A virtual machine can be managed by the cluster as a resource, without the
cluster having any awareness of the services running inside the virtual
machine. The virtual machine is *opaque* to the cluster.
* A virtual machine can be a cluster resource, and run ``pacemaker_remoted``
to make it a guest node, allowing the cluster to manage services
inside it. The virtual machine is *transparent* to the cluster.
.. index::
single: virtual machine; as guest node
Guest Nodes
###########
**"I want a Pacemaker cluster to manage virtual machine resources, but I also
want Pacemaker to be able to manage the resources that live within those
virtual machines."**
Without ``pacemaker_remoted``, the possibilities for implementing the above use
case have significant limitations:
* The cluster stack could be run on the physical hosts only, which loses the
ability to monitor resources within the guests.
* A separate cluster could be on the virtual guests, which quickly hits
scalability issues.
* The cluster stack could be run on the guests using the same cluster as the
physical hosts, which also hits scalability issues and complicates fencing.
With ``pacemaker_remoted``:
* The physical hosts are cluster nodes (running the full cluster stack).
* The virtual machines are guest nodes (running ``pacemaker_remoted``).
Nearly zero configuration is required on the virtual machine.
* The cluster stack on the cluster nodes launches the virtual machines and
immediately connects to ``pacemaker_remoted`` on them, allowing the
virtual machines to integrate into the cluster.
The key difference here between the guest nodes and the cluster nodes is that
the guest nodes do not run the cluster stack. This means they will never become
the DC, initiate fencing actions or participate in quorum voting.
On the other hand, this also means that they are not bound to the scalability
limits associated with the cluster stack (no 16-node corosync member limits to
deal with). That isn't to say that guest nodes can scale indefinitely, but it
is known that guest nodes scale horizontally much further than cluster nodes.
Other than the quorum limitation, these guest nodes behave just like cluster
nodes with respect to resource management. The cluster is fully capable of
managing and monitoring resources on each guest node. You can build constraints
against guest nodes, put them in standby, or do whatever else you'd expect to
be able to do with cluster nodes. They even show up in ``crm_mon`` output as
nodes.
To solidify the concept, below is an example that is very similar to an actual
deployment we test in our developer environment to verify guest node scalability:
* 16 cluster nodes running the full Corosync + Pacemaker stack
* 64 Pacemaker-managed virtual machine resources running ``pacemaker_remoted``
configured as guest nodes
* 64 Pacemaker-managed webserver and database resources configured to run on
the 64 guest nodes
With this deployment, you would have 64 webservers and databases running on 64
virtual machines on 16 hardware nodes, all of which are managed and monitored by
the same Pacemaker deployment. It is known that ``pacemaker_remoted`` can scale
to these lengths and possibly much further depending on the specific scenario.
Remote Nodes
############
**"I want my traditional high-availability cluster to scale beyond the limits
imposed by the corosync messaging layer."**
Ultimately, the primary advantage of remote nodes over cluster nodes is
scalability. Remote nodes may also serve use cases related to geographically
distributed HA clusters, but those use cases are not well understood at this
point.
Like guest nodes, remote nodes will never become the DC, initiate
fencing actions or participate in quorum voting.
That is not to say, however, that fencing of a remote node works any
differently than that of a cluster node. The Pacemaker scheduler
understands how to fence remote nodes. As long as a fencing device exists, the
cluster is capable of ensuring remote nodes are fenced in the exact same way as
cluster nodes.
Expanding the Cluster Stack
###########################
With ``pacemaker_remoted``, the traditional view of the high-availability stack
can be expanded to include a new layer:
Traditional HA Stack
____________________
.. image:: images/pcmk-ha-cluster-stack.png
- :width: 17cm
- :height: 9cm
:alt: Traditional Pacemaker+Corosync Stack
:align: center
HA Stack With Guest Nodes
_________________________
.. image:: images/pcmk-ha-remote-stack.png
- :width: 20cm
- :height: 10cm
:alt: Pacemaker+Corosync Stack with pacemaker_remoted
:align: center
.. [#] See the `<https://www.clusterlabs.org/doc/>`_ Pacemaker documentation,
especially *Clusters From Scratch* and *Pacemaker Explained*.
diff --git a/doc/sphinx/conf.py.in b/doc/sphinx/conf.py.in
index 0c2615a1cb..91c40b0404 100644
--- a/doc/sphinx/conf.py.in
+++ b/doc/sphinx/conf.py.in
@@ -1,314 +1,316 @@
""" Sphinx configuration for Pacemaker documentation
"""
__copyright__ = "Copyright 2020 the Pacemaker project contributors"
__license__ = "GNU General Public License version 2 or later (GPLv2+) WITHOUT ANY WARRANTY"
# This file is execfile()d with the current directory set to its containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
import datetime
import os
import sys
# Variables that can be used later in this file
authors = "the Pacemaker project contributors"
year = datetime.datetime.now().year
doc_license = "Creative Commons Attribution-ShareAlike International Public License"
doc_license += " version 4.0 or later (CC-BY-SA v4.0+)"
# rST markup to insert at beginning of every document; mainly used for
#
# .. |<abbr>| replace:: <Full text>
#
# where occurrences of |<abbr>| in the rST will be substituted with <Full text>
rst_prolog="""
.. |CFS_DISTRO| replace:: CentOS Stream
.. |CFS_DISTRO_VER| replace:: 8
.. |REMOTE_DISTRO| replace:: CentOS
.. |REMOTE_DISTRO_VER| replace:: 7.4
"""
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#sys.path.insert(0, os.path.abspath('.'))
# -- General configuration -----------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = []
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = '.rst'
# The encoding of source files.
#source_encoding = 'utf-8-sig'
# The master toctree document.
master_doc = 'index'
# General information about the project.
project = '%BOOK_ID%'
copyright = "2009-%s %s. Released under the terms of the %s" % (year, authors, doc_license)
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The full version, including alpha/beta/rc tags.
release = '%VERSION%'
# The short X.Y version.
version = release.rsplit('.', 1)[0]
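As a rough illustration of the expression above, ``rsplit('.', 1)[0]`` drops only the final dot-separated component of the release string. The helper name ``short_version`` and the sample version strings below are hypothetical and are not part of the build; this is a sketch, not the project's code.

```python
def short_version(release: str) -> str:
    """Mirror the conf.py expression: drop the last dot-separated component."""
    return release.rsplit('.', 1)[0]


if __name__ == "__main__":
    # Sample strings are made up for illustration only.
    for sample in ("2.0.5", "2.1.0-rc1", "2.0"):
        print(sample, "->", short_version(sample))
```

Note that a two-component release string would be shortened to its first component only, and a string without any dot is returned unchanged.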
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#language = None
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
#today_fmt = '%B %d, %Y'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['_build']
# The reST default role (used for this markup: `text`) to use for all documents.
#default_role = None
# If true, '()' will be appended to :func: etc. cross-reference text.
#add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
#add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'vs'
# A list of ignored prefixes for module index sorting.
#modindex_common_prefix = []
# -- Options for HTML output ---------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'pyramid'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#html_theme_options = {}
# Add any paths that contain custom themes here, relative to this directory.
#html_theme_path = []
html_style = 'pacemaker.css'
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
html_title = "%BOOK_TITLE%"
# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = None
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
#html_logo = None
# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
#html_favicon = None
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = [ '%SRC_DIR%/_static' ]
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
#html_last_updated_fmt = '%b %d, %Y'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
#html_use_smartypants = True
# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}
# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}
# If false, no module index is generated.
#html_domain_indices = True
# If false, no index is generated.
#html_use_index = True
# If true, the index is split into individual pages for each letter.
#html_split_index = False
# If true, links to the reST sources are added to the pages.
#html_show_sourcelink = True
# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
#html_show_sphinx = True
# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
#html_show_copyright = True
# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''
# This is the file name suffix for HTML files (e.g. ".xhtml").
#html_file_suffix = None
# Output file base name for HTML help builder.
htmlhelp_basename = 'Pacemakerdoc'
# -- Options for LaTeX output --------------------------------------------------
+latex_engine = "xelatex"
+
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#'preamble': '',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass [howto/manual]).
latex_documents = [
('index', '%BOOK_ID%.tex', '%BOOK_TITLE%', authors, 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None
# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False
# If true, show page references after internal links.
#latex_show_pagerefs = False
# If true, show URL addresses after external links.
#latex_show_urls = False
# Documents to append as an appendix to all manuals.
#latex_appendices = []
# If false, no module index is generated.
#latex_domain_indices = True
# -- Options for manual page output --------------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
('index', '%BOOK_ID%', 'Part of the Pacemaker documentation set', [authors], 8)
]
# If true, show URL addresses after external links.
#man_show_urls = False
# -- Options for Texinfo output ------------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
('index', '%BOOK_ID%', '%BOOK_TITLE%', authors, '%BOOK_TITLE%',
'Pacemaker is an advanced, scalable high-availability cluster resource manager.',
'Miscellaneous'),
]
# Documents to append as an appendix to all manuals.
#texinfo_appendices = []
# If false, no module index is generated.
#texinfo_domain_indices = True
# How to display URL addresses: 'footnote', 'no', or 'inline'.
#texinfo_show_urls = 'footnote'
# -- Options for Epub output ---------------------------------------------------
# Bibliographic Dublin Core info.
epub_title = '%BOOK_TITLE%'
epub_author = authors
epub_publisher = 'ClusterLabs.org'
epub_copyright = copyright
# The language of the text. It defaults to the language option
# or en if the language is not set.
#epub_language = ''
# The scheme of the identifier. Typical schemes are ISBN or URL.
epub_scheme = 'URL'
# The unique identifier of the text. This can be an ISBN number
# or the project homepage.
epub_identifier = 'http://www.clusterlabs.org/pacemaker/doc/2.0/%BOOK_ID%/epub/%BOOK_ID%.epub'
# A unique identification for the text.
epub_uid = 'ClusterLabs.org-Pacemaker-%BOOK_ID%'
# A tuple containing the cover image and cover page html template filenames.
#epub_cover = ()
# HTML files that should be inserted before the pages created by sphinx.
# The format is a list of tuples containing the path and title.
#epub_pre_files = []
# HTML files that should be inserted after the pages created by sphinx.
# The format is a list of tuples containing the path and title.
#epub_post_files = []
# A list of files that should not be packed into the epub file.
epub_exclude_files = [
'_static/doctools.js',
'_static/jquery.js',
'_static/searchtools.js',
'_static/underscore.js',
'_static/basic.css',
'_static/websupport.js',
'search.html',
]
# The depth of the table of contents in toc.ncx.
#epub_tocdepth = 3
# Allow duplicate toc entries.
#epub_tocdup = True
diff --git a/doc/sphinx/shared/pacemaker-intro.rst b/doc/sphinx/shared/pacemaker-intro.rst
index a56e53fed7..3473636843 100644
--- a/doc/sphinx/shared/pacemaker-intro.rst
+++ b/doc/sphinx/shared/pacemaker-intro.rst
@@ -1,201 +1,196 @@
What Is Pacemaker?
####################
Pacemaker is a high-availability *cluster resource manager* -- software that
runs on a set of hosts (a *cluster* of *nodes*) in order to preserve integrity
and minimize downtime of desired services (*resources*). [#]_ It is maintained
by the `ClusterLabs <https://www.ClusterLabs.org/>`_ community.
Pacemaker's key features include:
* Detection of and recovery from node- and service-level failures
* Ability to ensure data integrity by fencing faulty nodes
* Support for one or more nodes per cluster
* Support for multiple resource interface standards (anything that can be
scripted can be clustered)
* Support (but no requirement) for shared storage
* Support for practically any redundancy configuration (active/passive, N+1,
etc.)
* Automatically replicated configuration that can be updated from any node
* Ability to specify cluster-wide relationships between services,
such as ordering, colocation and anti-colocation
* Support for advanced service types, such as *clones* (services that need to
be active on multiple nodes), *promotable clones* (clones that can run in
one of two roles), and containerized services
* Unified, scriptable cluster management tools
-.. note:: Fencing
+.. note:: **Fencing**
*Fencing*, also known as *STONITH* (an acronym for Shoot The Other Node In
The Head), is the ability to ensure that it is not possible for a node to be
running a service. This is accomplished via *fence devices* such as
intelligent power switches that cut power to the target, or intelligent
network switches that cut the target's access to the local network.
Pacemaker represents fence devices as a special class of resource.
A cluster cannot safely recover from certain failure conditions, such as an
unresponsive node, without fencing.
Cluster Architecture
____________________
At a high level, a cluster can be viewed as having these parts (which together
are often referred to as the *cluster stack*):
* **Resources:** These are the reason for the cluster's being -- the services
that need to be kept highly available.
* **Resource agents:** These are scripts or operating system components that
start, stop, and monitor resources, given a set of resource parameters.
These provide a uniform interface between Pacemaker and the managed
services.
* **Fence agents:** These are scripts that execute node fencing actions,
given a target and fence device parameters.
* **Cluster membership layer:** This component provides reliable messaging,
membership, and quorum information about the cluster. Currently, Pacemaker
supports `Corosync <http://www.corosync.org/>`_ as this layer.
* **Cluster resource manager:** Pacemaker provides the brain that processes
and reacts to events that occur in the cluster. These events may include
nodes joining or leaving the cluster; resource events caused by failures,
maintenance, or scheduled activities; and other administrative actions.
To achieve the desired availability, Pacemaker may start and stop resources
and fence nodes.
* **Cluster tools:** These provide an interface for users to interact with the
cluster. Various command-line and graphical (GUI) interfaces are available.
Most managed services are not, themselves, cluster-aware. However, many popular
open-source cluster filesystems make use of a common *Distributed Lock
Manager* (DLM), which makes direct use of Corosync for its messaging and
membership capabilities and Pacemaker for the ability to fence nodes.
.. image:: ../shared/images/pcmk-stack.png
:alt: Example cluster stack
- :scale: 75 %
:align: center
Pacemaker Architecture
______________________
Pacemaker itself is composed of multiple daemons that work together:
* pacemakerd
* pacemaker-attrd
* pacemaker-based
* pacemaker-controld
* pacemaker-execd
* pacemaker-fenced
* pacemaker-schedulerd
.. image:: ../shared/images/pcmk-internals.png
:alt: Pacemaker software components
- :scale: 65 %
:align: center
The Pacemaker master process (pacemakerd) spawns all the other daemons, and
respawns them if they unexpectedly exit.
The *Cluster Information Base* (CIB) is an
`XML <https://en.wikipedia.org/wiki/XML>`_ representation of the cluster's
configuration and the state of all nodes and resources. The *CIB manager*
(pacemaker-based) keeps the CIB synchronized across the cluster, and handles
requests to modify it.
The *attribute manager* (pacemaker-attrd) maintains a database of attributes
for all nodes, keeps it synchronized across the cluster, and handles requests
to modify them. These attributes are usually recorded in the CIB.
Given a snapshot of the CIB as input, the *scheduler* (pacemaker-schedulerd)
determines what actions are necessary to achieve the desired state of the
cluster.
The *local executor* (pacemaker-execd) handles requests to execute
resource agents on the local cluster node, and returns the result.
The *fencer* (pacemaker-fenced) handles requests to fence nodes. Given a target
node, the fencer decides which cluster node(s) should execute which fencing
device(s), and calls the necessary fencing agents (either directly, or via
requests to the fencer peers on other nodes), and returns the result.
The *controller* (pacemaker-controld) is Pacemaker's coordinator, maintaining a
consistent view of the cluster membership and orchestrating all the other
components.
Pacemaker centralizes cluster decision-making by electing one of the controller
instances as the 'Designated Controller' ('DC'). Should the elected DC process
(or the node it is on) fail, a new one is quickly established. The DC responds
to cluster events by taking a current snapshot of the CIB, feeding it to the
scheduler, then asking the executors (either directly on the local node, or via
requests to controller peers on other nodes) and the fencer to execute any
necessary actions.
.. note:: **Old daemon names**
The Pacemaker daemons were renamed in version 2.0. You may still find
references to the old names, especially in documentation targeted to
version 1.1.
.. table::
+-------------------+---------------------+
| Old name | New name |
+===================+=====================+
| attrd | pacemaker-attrd |
+-------------------+---------------------+
| cib | pacemaker-based |
+-------------------+---------------------+
| crmd | pacemaker-controld |
+-------------------+---------------------+
| lrmd | pacemaker-execd |
+-------------------+---------------------+
| stonithd | pacemaker-fenced |
+-------------------+---------------------+
| pacemaker_remoted | pacemaker-remoted |
+-------------------+---------------------+
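When reading logs or documentation written for version 1.1, the renaming table can be applied mechanically. The following is a minimal sketch: the dictionary contents come from the table, while the names ``OLD_TO_NEW`` and ``current_daemon_name`` are hypothetical helpers and not part of Pacemaker.

```python
# Old (1.1-era) daemon names mapped to the names used since Pacemaker 2.0,
# copied from the renaming table in this document.
OLD_TO_NEW = {
    "attrd": "pacemaker-attrd",
    "cib": "pacemaker-based",
    "crmd": "pacemaker-controld",
    "lrmd": "pacemaker-execd",
    "stonithd": "pacemaker-fenced",
    "pacemaker_remoted": "pacemaker-remoted",
}


def current_daemon_name(name: str) -> str:
    """Return the 2.0+ name for a daemon; unknown names pass through unchanged."""
    return OLD_TO_NEW.get(name, name)


if __name__ == "__main__":
    for old in ("crmd", "lrmd", "pacemakerd"):
        print(f"{old} -> {current_daemon_name(old)}")
```

Names that were not renamed, such as ``pacemakerd``, are returned as given.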
Node Redundancy Designs
_______________________
Pacemaker supports practically any `node redundancy configuration
<https://en.wikipedia.org/wiki/High-availability_cluster#Node_configurations>`_
including *Active/Active*, *Active/Passive*, *N+1*, *N+M*, *N-to-1* and
*N-to-N*.
Active/passive clusters with two (or more) nodes using Pacemaker and
`DRBD <https://en.wikipedia.org/wiki/Distributed_Replicated_Block_Device>`_ are
a cost-effective high-availability solution for many situations. One of the
nodes provides the desired services, and if it fails, the other node takes
over.
.. image:: ../shared/images/pcmk-active-passive.png
:alt: Active/Passive Redundancy
:align: center
- :scale: 75 %
Pacemaker also supports multiple nodes in a shared-failover design, reducing
hardware costs by allowing several active/passive clusters to be combined and
share a common backup node.
.. image:: ../shared/images/pcmk-shared-failover.png
:alt: Shared Failover
:align: center
- :scale: 75 %
When shared storage is available, every node can potentially be used for
failover. Pacemaker can even run multiple copies of services to spread out the
workload. This is sometimes called N to N Redundancy.
.. image:: ../shared/images/pcmk-active-active.png
:alt: N to N Redundancy
:align: center
- :scale: 75 %
.. rubric:: Footnotes
.. [#] *Cluster* is sometimes used in other contexts to refer to hosts grouped
together for other purposes, such as high-performance computing (HPC),
but Pacemaker is not intended for those purposes.
