diff --git a/doc/sphinx/Clusters_from_Scratch/active-active.rst b/doc/sphinx/Clusters_from_Scratch/active-active.rst index 3d4e2505fd..0d27174637 100644 --- a/doc/sphinx/Clusters_from_Scratch/active-active.rst +++ b/doc/sphinx/Clusters_from_Scratch/active-active.rst @@ -1,342 +1,343 @@ .. index:: single: storage; active/active Convert Storage to Active/Active -------------------------------- The primary requirement for an active/active cluster is that the data required for your services is available, simultaneously, on both machines. Pacemaker makes no requirement on how this is achieved; you could use a Storage Area Network (SAN) if you had one available, but since DRBD supports multiple Primaries, we can continue to use it here. .. index:: single: GFS2 single: DLM single: filesystem; GFS2 Install Cluster Filesystem Software ################################### The only hitch is that we need to use a cluster-aware filesystem. The one we used earlier with DRBD, xfs, is not one of those. Both OCFS2 and GFS2 are supported; here, we will use GFS2. On both nodes, install Distributed Lock Manager (DLM) and the GFS2 command- line utilities required by cluster filesystems: .. code-block:: console # dnf config-manager --set-enabled resilientstorage # dnf install -y dlm gfs2-utils Configure the Cluster for the DLM ################################# The DLM control daemon needs to run on both nodes, so we'll start by creating a resource for it (using the ``ocf:pacemaker:controld`` resource agent), and clone it: .. code-block:: console [root@pcmk-1 ~]# pcs cluster cib dlm_cfg [root@pcmk-1 ~]# pcs -f dlm_cfg resource create dlm \ ocf:pacemaker:controld op monitor interval=60s [root@pcmk-1 ~]# pcs -f dlm_cfg resource clone dlm clone-max=2 clone-node-max=1 [root@pcmk-1 ~]# pcs resource status * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-1 * WebSite (ocf:heartbeat:apache): Started pcmk-1 * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-1 ] * Unpromoted: [ pcmk-2 ] * WebFS (ocf:heartbeat:Filesystem): Started pcmk-1 Activate our new configuration, and see how the cluster responds: .. 
code-block:: console [root@pcmk-1 ~]# pcs cluster cib-push dlm_cfg --config CIB updated [root@pcmk-1 ~]# pcs resource status * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-1 * WebSite (ocf:heartbeat:apache): Started pcmk-1 * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-1 ] * Unpromoted: [ pcmk-2 ] * WebFS (ocf:heartbeat:Filesystem): Started pcmk-1 * Clone Set: dlm-clone [dlm]: * Started: [ pcmk-1 pcmk-2 ] [root@pcmk-1 ~]# pcs resource config Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2) Attributes: cidr_netmask=24 ip=192.168.122.120 Operations: monitor interval=30s (ClusterIP-monitor-interval-30s) start interval=0s timeout=20s (ClusterIP-start-interval-0s) stop interval=0s timeout=20s (ClusterIP-stop-interval-0s) Resource: WebSite (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status Operations: monitor interval=1min (WebSite-monitor-interval-1min) start interval=0s timeout=40s (WebSite-start-interval-0s) stop interval=0s timeout=60s (WebSite-stop-interval-0s) Clone: WebData-clone Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1 Resource: WebData (class=ocf provider=linbit type=drbd) Attributes: drbd_resource=wwwdata Operations: demote interval=0s timeout=90 (WebData-demote-interval-0s) - monitor interval=60s (WebData-monitor-interval-60s) + monitor interval=29s role=Promoted (WebData-monitor-interval-29s) + monitor interval=31s role=Unpromoted (WebData-monitor-interval-31s) notify interval=0s timeout=90 (WebData-notify-interval-0s) promote interval=0s timeout=90 (WebData-promote-interval-0s) reload interval=0s timeout=30 (WebData-reload-interval-0s) start interval=0s timeout=240 (WebData-start-interval-0s) stop interval=0s timeout=100 (WebData-stop-interval-0s) Resource: WebFS (class=ocf provider=heartbeat type=Filesystem) Attributes: device=/dev/drbd1 directory=/var/www/html fstype=xfs Operations: monitor interval=20s timeout=40s (WebFS-monitor-interval-20s) start interval=0s timeout=60s (WebFS-start-interval-0s) stop interval=0s timeout=60s (WebFS-stop-interval-0s) Clone: dlm-clone Meta Attrs: interleave=true ordered=true Resource: dlm (class=ocf provider=pacemaker type=controld) Operations: monitor interval=60s (dlm-monitor-interval-60s) start interval=0s timeout=90s (dlm-start-interval-0s) stop interval=0s timeout=100s (dlm-stop-interval-0s) Create and Populate GFS2 Filesystem ################################### Before we do anything to the existing partition, we need to make sure it is unmounted. We do this by telling the cluster to stop the ``WebFS`` resource. This will ensure that other resources (in our case, ``WebSite``) using ``WebFS`` are not only stopped, but stopped in the correct order. .. code-block:: console [root@pcmk-1 ~]# pcs resource disable WebFS [root@pcmk-1 ~]# pcs resource * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-1 * WebSite (ocf:heartbeat:apache): Stopped * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-1 ] * Unpromoted: [ pcmk-2 ] * WebFS (ocf:heartbeat:Filesystem): Stopped (disabled) * Clone Set: dlm-clone [dlm]: * Started: [ pcmk-1 pcmk-2 ] You can see that both ``WebSite`` and ``WebFS`` have been stopped, and that ``pcmk-1`` is currently running the promoted instance for the DRBD device. Now we can create a new GFS2 filesystem on the DRBD device. .. WARNING:: This will erase all previous content stored on the DRBD device. 
Ensure you have a copy of any important data. .. IMPORTANT:: Run the next command on whichever node has the DRBD Primary role. Otherwise, you will receive the message: .. code-block:: console /dev/drbd1: Read-only file system .. code-block:: console - [root@pcmk-2 ~]# mkfs.gfs2 -p lock_dlm -j 2 -t mycluster:web /dev/drbd1 + [root@pcmk-1 ~]# mkfs.gfs2 -p lock_dlm -j 2 -t mycluster:web /dev/drbd1 It appears to contain an existing filesystem (xfs) This will destroy any data on /dev/drbd1 Are you sure you want to proceed? [y/n] y Discarding device contents (may take a while on large devices): Done Adding journals: Done Building resource groups: Done Creating quota file: Done Writing superblock and syncing: Done Device: /dev/drbd1 Block size: 4096 Device size: 0.50 GB (131059 blocks) Filesystem size: 0.50 GB (131055 blocks) Journals: 2 Journal size: 8MB Resource groups: 4 Locking protocol: "lock_dlm" Lock table: "mycluster:web" UUID: 19712677-7206-4660-a079-5d17341dd720 The ``mkfs.gfs2`` command required a number of additional parameters: * ``-p lock_dlm`` specifies that we want to use DLM-based locking. * ``-j 2`` indicates that the filesystem should reserve enough space for two journals (one for each node that will access the filesystem). * ``-t mycluster:web`` specifies the lock table name. The format for this field is ``<CLUSTERNAME>:<FSNAME>``. For ``CLUSTERNAME``, we need to use the same value we specified originally with ``pcs cluster setup`` (which is also the value of ``cluster_name`` in ``/etc/corosync/corosync.conf``). If you are unsure what your cluster name is, you can look in ``/etc/corosync/corosync.conf`` or execute the command ``pcs cluster corosync | grep cluster_name``. Now we can (re-)populate the new filesystem with data (web pages). We'll create yet another variation on our home page. .. code-block:: console [root@pcmk-1 ~]# mount /dev/drbd1 /mnt [root@pcmk-1 ~]# cat <<-END >/mnt/index.html <html> <body>My Test Site - GFS2</body> </html> END [root@pcmk-1 ~]# chcon -R --reference=/var/www/html /mnt [root@pcmk-1 ~]# umount /dev/drbd1 [root@pcmk-1 ~]# drbdadm verify wwwdata Reconfigure the Cluster for GFS2 ################################ With the ``WebFS`` resource stopped, let's update the configuration. .. code-block:: console [root@pcmk-1 ~]# pcs resource config WebFS Resource: WebFS (class=ocf provider=heartbeat type=Filesystem) Attributes: device=/dev/drbd1 directory=/var/www/html fstype=xfs Meta Attrs: target-role=Stopped Operations: monitor interval=20s timeout=40s (WebFS-monitor-interval-20s) start interval=0s timeout=60s (WebFS-start-interval-0s) stop interval=0s timeout=60s (WebFS-stop-interval-0s) The ``fstype`` option needs to be updated to ``gfs2`` instead of ``xfs``. .. code-block:: console [root@pcmk-1 ~]# pcs resource update WebFS fstype=gfs2 [root@pcmk-1 ~]# pcs resource config WebFS Resource: WebFS (class=ocf provider=heartbeat type=Filesystem) Attributes: device=/dev/drbd1 directory=/var/www/html fstype=gfs2 Meta Attrs: target-role=Stopped Operations: monitor interval=20s timeout=40s (WebFS-monitor-interval-20s) start interval=0s timeout=60s (WebFS-start-interval-0s) stop interval=0s timeout=60s (WebFS-stop-interval-0s)
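Before wiring the filesystem into the cluster, it can be reassuring to confirm that the superblock really carries the lock table and locking protocol we asked for. A minimal sketch, assuming the ``tunegfs2`` utility from ``gfs2-utils``, whose ``-l`` option should list the superblock settings:

.. code-block:: console

   [root@pcmk-1 ~]# tunegfs2 -l /dev/drbd1

The listing should include the ``mycluster:web`` lock table and the ``lock_dlm`` locking protocol shown in the ``mkfs.gfs2`` output above.

GFS2 requires that DLM be running, so we also need to set up new colocation and ordering constraints for it: ..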
code-block:: console [root@pcmk-1 ~]# pcs constraint colocation add WebFS with dlm-clone [root@pcmk-1 ~]# pcs constraint order dlm-clone then WebFS Adding dlm-clone WebFS (kind: Mandatory) (Options: first-action=start then-action=start) [root@pcmk-1 ~]# pcs constraint Location Constraints: Resource: WebSite Enabled on: Node: pcmk-2 (score:50) Ordering Constraints: start ClusterIP then start WebSite (kind:Mandatory) promote WebData-clone then start WebFS (kind:Mandatory) start WebFS then start WebSite (kind:Mandatory) start dlm-clone then start WebFS (kind:Mandatory) Colocation Constraints: WebSite with ClusterIP (score:INFINITY) WebFS with WebData-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Promoted) WebSite with WebFS (score:INFINITY) WebFS with dlm-clone (score:INFINITY) Ticket Constraints: We also need to update the ``no-quorum-policy`` property to ``freeze``. By default, the value of ``no-quorum-policy`` is set to ``stop`` indicating that once quorum is lost, all the resources on the remaining partition will immediately be stopped. Typically this default is the safest and most optimal option, but unlike most resources, GFS2 requires quorum to function. When quorum is lost both the applications using the GFS2 mounts and the GFS2 mount itself cannot be correctly stopped. Any attempts to stop these resources without quorum will fail, which will ultimately result in the entire cluster being fenced every time quorum is lost. To address this situation, set ``no-quorum-policy`` to ``freeze`` when GFS2 is in use. This means that when quorum is lost, the remaining partition will do nothing until quorum is regained. .. code-block:: console [root@pcmk-1 ~]# pcs property set no-quorum-policy=freeze .. index:: pair: filesystem; clone Clone the Filesystem Resource ############################# Now that we have a cluster filesystem ready to go, we can configure the cluster so both nodes mount the filesystem. Clone the ``Filesystem`` resource in a new configuration. Notice how ``pcs`` automatically updates the relevant constraints again. .. code-block:: console [root@pcmk-1 ~]# pcs cluster cib active_cfg [root@pcmk-1 ~]# pcs -f active_cfg resource clone WebFS [root@pcmk-1 ~]# pcs -f active_cfg constraint Location Constraints: Resource: WebSite Enabled on: Node: pcmk-2 (score:50) Ordering Constraints: start ClusterIP then start WebSite (kind:Mandatory) promote WebData-clone then start WebFS-clone (kind:Mandatory) start WebFS-clone then start WebSite (kind:Mandatory) start dlm-clone then start WebFS-clone (kind:Mandatory) Colocation Constraints: WebSite with ClusterIP (score:INFINITY) WebFS-clone with WebData-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Promoted) WebSite with WebFS-clone (score:INFINITY) WebFS-clone with dlm-clone (score:INFINITY) Ticket Constraints: Tell the cluster that it is now allowed to promote both instances to be DRBD Primary. .. code-block:: console [root@pcmk-1 ~]# pcs -f active_cfg resource update WebData-clone promoted-max=2 Finally, load our configuration to the cluster, and re-enable the ``WebFS`` resource (which we disabled earlier). .. code-block:: console [root@pcmk-1 ~]# pcs cluster cib-push active_cfg --config CIB updated [root@pcmk-1 ~]# pcs resource enable WebFS After all the processes are started, the status should look similar to this. .. 
code-block:: console [root@pcmk-1 ~]# pcs resource * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-1 * WebSite (ocf:heartbeat:apache): Started pcmk-1 * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-1 pcmk-2 ] * Clone Set: dlm-clone [dlm]: * Started: [ pcmk-1 pcmk-2 ] * Clone Set: WebFS-clone [WebFS]: * Started: [ pcmk-1 pcmk-2 ] Test Failover ############# Testing failover is left as an exercise for the reader. With this configuration, the data is now active/active. The website administrator could change HTML files on either node, and the live website will show the changes even if it is running on the opposite node. If the web server is configured to listen on all IP addresses, it is possible to remove the constraints between the ``WebSite`` and ``ClusterIP`` resources, and clone the ``WebSite`` resource. The web server would always be ready to serve web pages, and only the IP address would need to be moved in a failover. diff --git a/doc/sphinx/Clusters_from_Scratch/ap-configuration.rst b/doc/sphinx/Clusters_from_Scratch/ap-configuration.rst index c24cd848b1..b71e9af67c 100644 --- a/doc/sphinx/Clusters_from_Scratch/ap-configuration.rst +++ b/doc/sphinx/Clusters_from_Scratch/ap-configuration.rst @@ -1,343 +1,345 @@ Configuration Recap ------------------- Final Cluster Configuration ########################### .. code-block:: console [root@pcmk-1 ~]# pcs resource * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-1 * WebSite (ocf:heartbeat:apache): Started pcmk-1 * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-1 pcmk-2 ] * Clone Set: dlm-clone [dlm]: * Started: [ pcmk-1 pcmk-2 ] * Clone Set: WebFS-clone [WebFS]: * Started: [ pcmk-1 pcmk-2 ] .. code-block:: console [root@pcmk-1 ~]# pcs resource op defaults Meta Attrs: op_defaults-meta_attributes timeout=240s .. code-block:: console [root@pcmk-1 ~]# pcs stonith * fence_dev (stonith:some_fence_agent): Started pcmk-1 .. code-block:: console [root@pcmk-1 ~]# pcs constraint Location Constraints: Resource: WebSite Enabled on: Node: pcmk-2 (score:50) Ordering Constraints: start ClusterIP then start WebSite (kind:Mandatory) promote WebData-clone then start WebFS-clone (kind:Mandatory) start WebFS-clone then start WebSite (kind:Mandatory) start dlm-clone then start WebFS-clone (kind:Mandatory) Colocation Constraints: WebSite with ClusterIP (score:INFINITY) WebFS-clone with WebData-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Promoted) WebSite with WebFS-clone (score:INFINITY) WebFS-clone with dlm-clone (score:INFINITY) Ticket Constraints: .. code-block:: console [root@pcmk-1 ~]# pcs status Cluster name: mycluster Cluster Summary: * Stack: corosync * Current DC: pcmk-1 (version 2.1.2-4.el9-ada5c3b36e2) - partition with quorum * Last updated: Wed Jul 27 08:57:57 2022 * Last change: Wed Jul 27 08:55:00 2022 by root via cibadmin on pcmk-1 * 2 nodes configured * 9 resource instances configured Node List: * Online: [ pcmk-1 pcmk-2 ] Full List of Resources: * fence_dev (stonith:some_fence_agent): Started pcmk-1 * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-1 * WebSite (ocf:heartbeat:apache): Started pcmk-1 * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-1 pcmk-2 ] * Clone Set: dlm-clone [dlm]: * Started: [ pcmk-1 pcmk-2 ] * Clone Set: WebFS-clone [WebFS]: * Started: [ pcmk-1 pcmk-2 ] Daemon Status: corosync: active/disabled pacemaker: active/disabled pcsd: active/enabled .. 
code-block:: console [root@pcmk-1 ~]# pcs config Cluster Name: mycluster Corosync Nodes: pcmk-1 pcmk-2 Pacemaker Nodes: pcmk-1 pcmk-2 Resources: Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2) Attributes: cidr_netmask=24 ip=192.168.122.120 Operations: monitor interval=30s (ClusterIP-monitor-interval-30s) start interval=0s timeout=20s (ClusterIP-start-interval-0s) stop interval=0s timeout=20s (ClusterIP-stop-interval-0s) Resource: WebSite (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status Operations: monitor interval=1min (WebSite-monitor-interval-1min) start interval=0s timeout=40s (WebSite-start-interval-0s) stop interval=0s timeout=60s (WebSite-stop-interval-0s) Clone: WebData-clone Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=2 promoted-node-max=1 Resource: WebData (class=ocf provider=linbit type=drbd) Attributes: drbd_resource=wwwdata Operations: demote interval=0s timeout=90 (WebData-demote-interval-0s) - monitor interval=60s (WebData-monitor-interval-60s) + monitor interval=29s role=Promoted (WebData-monitor-interval-29s) + monitor interval=31s role=Unpromoted (WebData-monitor-interval-31s) notify interval=0s timeout=90 (WebData-notify-interval-0s) promote interval=0s timeout=90 (WebData-promote-interval-0s) reload interval=0s timeout=30 (WebData-reload-interval-0s) start interval=0s timeout=240 (WebData-start-interval-0s) stop interval=0s timeout=100 (WebData-stop-interval-0s) Clone: dlm-clone Meta Attrs: interleave=true ordered=true Resource: dlm (class=ocf provider=pacemaker type=controld) Operations: monitor interval=60s (dlm-monitor-interval-60s) start interval=0s timeout=90s (dlm-start-interval-0s) stop interval=0s timeout=100s (dlm-stop-interval-0s) Clone: WebFS-clone Resource: WebFS (class=ocf provider=heartbeat type=Filesystem) Attributes: device=/dev/drbd1 directory=/var/www/html fstype=gfs2 Operations: monitor interval=20s timeout=40s (WebFS-monitor-interval-20s) start interval=0s timeout=60s (WebFS-start-interval-0s) stop interval=0s timeout=60s (WebFS-stop-interval-0s) Stonith Devices: Resource: fence_dev (class=stonith type=some_fence_agent) Attributes: pcmk_delay_base=pcmk-1:5s;pcmk-2:0s pcmk_host_map=pcmk-1:almalinux9-1;pcmk-2:almalinux9-2 Operations: monitor interval=60s (fence_dev-monitor-interval-60s) Fencing Levels: Location Constraints: Resource: WebSite Enabled on: Node: pcmk-2 (score:50) (id:location-WebSite-pcmk-2-50) Ordering Constraints: start ClusterIP then start WebSite (kind:Mandatory) (id:order-ClusterIP-WebSite-mandatory) promote WebData-clone then start WebFS-clone (kind:Mandatory) (id:order-WebData-clone-WebFS-mandatory) start WebFS-clone then start WebSite (kind:Mandatory) (id:order-WebFS-WebSite-mandatory) start dlm-clone then start WebFS-clone (kind:Mandatory) (id:order-dlm-clone-WebFS-mandatory) Colocation Constraints: WebSite with ClusterIP (score:INFINITY) (id:colocation-WebSite-ClusterIP-INFINITY) WebFS-clone with WebData-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Promoted) (id:colocation-WebFS-WebData-clone-INFINITY) WebSite with WebFS-clone (score:INFINITY) (id:colocation-WebSite-WebFS-INFINITY) WebFS-clone with dlm-clone (score:INFINITY) (id:colocation-WebFS-dlm-clone-INFINITY) Ticket Constraints: Alerts: No alerts defined Resources Defaults: Meta Attrs: build-resource-defaults resource-stickiness=100 Operations Defaults: Meta Attrs: op_defaults-meta_attributes timeout=240s Cluster Properties: 
cluster-infrastructure: corosync cluster-name: mycluster dc-version: 2.1.2-4.el9-ada5c3b36e2 have-watchdog: false last-lrm-refresh: 1658896047 no-quorum-policy: freeze stonith-enabled: true Tags: No tags defined Quorum: Options: Node List ######### .. code-block:: console [root@pcmk-1 ~]# pcs status nodes Pacemaker Nodes: Online: pcmk-1 pcmk-2 Standby: Standby with resource(s) running: Maintenance: Offline: Pacemaker Remote Nodes: Online: Standby: Standby with resource(s) running: Maintenance: Offline: Cluster Options ############### .. code-block:: console [root@pcmk-1 ~]# pcs property Cluster Properties: cluster-infrastructure: corosync cluster-name: mycluster dc-version: 2.1.2-4.el9-ada5c3b36e2 have-watchdog: false no-quorum-policy: freeze stonith-enabled: true The output shows cluster-wide configuration options, as well as some baseline- level state information. The output includes: * ``cluster-infrastructure`` - the cluster communications layer in use * ``cluster-name`` - the cluster name chosen by the administrator when the cluster was created * ``dc-version`` - the version (including upstream source-code hash) of ``pacemaker`` used on the Designated Controller, which is the node elected to determine what actions are needed when events occur * ``have-watchdog`` - whether watchdog integration is enabled; set automatically when SBD is enabled * ``stonith-enabled`` - whether nodes may be fenced as part of recovery .. NOTE:: This command is equivalent to ``pcs property config``. Resources ######### Default Options _______________ .. code-block:: console [root@pcmk-1 ~]# pcs resource defaults Meta Attrs: build-resource-defaults resource-stickiness=100 This shows cluster option defaults that apply to every resource that does not explicitly set the option itself. Above: * ``resource-stickiness`` - Specify how strongly a resource prefers to remain on its current node. Alternatively, you can view this as the level of aversion to moving healthy resources to other machines. Fencing _______ .. code-block:: console [root@pcmk-1 ~]# pcs stonith status * fence_dev (stonith:some_fence_agent): Started pcmk-1 [root@pcmk-1 ~]# pcs stonith config Resource: fence_dev (class=stonith type=some_fence_agent) Attributes: pcmk_delay_base=pcmk-1:5s;pcmk-2:0s pcmk_host_map=pcmk-1:almalinux9-1;pcmk-2:almalinux9-2 Operations: monitor interval=60s (fence_dev-monitor-interval-60s) Service Address _______________ Users of the services provided by the cluster require an unchanging address with which to access it. .. code-block:: console [root@pcmk-1 ~]# pcs resource config ClusterIP Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2) Attributes: cidr_netmask=24 ip=192.168.122.120 Operations: monitor interval=30s (ClusterIP-monitor-interval-30s) start interval=0s timeout=20s (ClusterIP-start-interval-0s) stop interval=0s timeout=20s (ClusterIP-stop-interval-0s) DRBD - Shared Storage _____________________ Here, we define the DRBD service and specify which DRBD resource (from ``/etc/drbd.d/\*.res``) it should manage. We make it a promotable clone resource and, in order to have an active/active setup, allow both instances to be promoted at the same time. We also set the notify option so that the cluster will tell the ``drbd`` agent when its peer changes state. .. 
code-block:: console [root@pcmk-1 ~]# pcs resource config WebData-clone Clone: WebData-clone Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=2 promoted-node-max=1 Resource: WebData (class=ocf provider=linbit type=drbd) Attributes: drbd_resource=wwwdata Operations: demote interval=0s timeout=90 (WebData-demote-interval-0s) - monitor interval=60s (WebData-monitor-interval-60s) + monitor interval=29s role=Promoted (WebData-monitor-interval-29s) + monitor interval=31s role=Unpromoted (WebData-monitor-interval-31s) notify interval=0s timeout=90 (WebData-notify-interval-0s) promote interval=0s timeout=90 (WebData-promote-interval-0s) reload interval=0s timeout=30 (WebData-reload-interval-0s) start interval=0s timeout=240 (WebData-start-interval-0s) stop interval=0s timeout=100 (WebData-stop-interval-0s) [root@pcmk-1 ~]# pcs constraint ref WebData-clone Resource: WebData-clone colocation-WebFS-WebData-clone-INFINITY order-WebData-clone-WebFS-mandatory Cluster Filesystem __________________ The cluster filesystem ensures that files are read and written correctly. We need to specify the block device (provided by DRBD), where we want it mounted and that we are using GFS2. Again, it is a clone because it is intended to be active on both nodes. The additional constraints ensure that it can only be started on nodes with active DLM and DRBD instances. .. code-block:: console [root@pcmk-1 ~]# pcs resource config WebFS-clone Clone: WebFS-clone Resource: WebFS (class=ocf provider=heartbeat type=Filesystem) Attributes: device=/dev/drbd1 directory=/var/www/html fstype=gfs2 Operations: monitor interval=20s timeout=40s (WebFS-monitor-interval-20s) start interval=0s timeout=60s (WebFS-start-interval-0s) stop interval=0s timeout=60s (WebFS-stop-interval-0s) [root@pcmk-1 ~]# pcs constraint ref WebFS-clone Resource: WebFS-clone colocation-WebFS-WebData-clone-INFINITY colocation-WebSite-WebFS-INFINITY colocation-WebFS-dlm-clone-INFINITY order-WebData-clone-WebFS-mandatory order-WebFS-WebSite-mandatory order-dlm-clone-WebFS-mandatory Apache ______ Lastly, we have the actual service, Apache. We need only tell the cluster where to find its main configuration file and restrict it to running on a node that has the required filesystem mounted and the IP address active. .. code-block:: console [root@pcmk-1 ~]# pcs resource config WebSite Resource: WebSite (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status Operations: monitor interval=1min (WebSite-monitor-interval-1min) start interval=0s timeout=40s (WebSite-start-interval-0s) stop interval=0s timeout=60s (WebSite-stop-interval-0s) [root@pcmk-1 ~]# pcs constraint ref WebSite Resource: WebSite colocation-WebSite-ClusterIP-INFINITY colocation-WebSite-WebFS-INFINITY location-WebSite-pcmk-2-50 order-ClusterIP-WebSite-mandatory order-WebFS-WebSite-mandatory diff --git a/doc/sphinx/Clusters_from_Scratch/shared-storage.rst b/doc/sphinx/Clusters_from_Scratch/shared-storage.rst index 11cee4994b..dea3e58027 100644 --- a/doc/sphinx/Clusters_from_Scratch/shared-storage.rst +++ b/doc/sphinx/Clusters_from_Scratch/shared-storage.rst @@ -1,626 +1,645 @@ .. index:: pair: storage; DRBD Replicate Storage Using DRBD ---------------------------- Even if you're serving up static websites, having to manually synchronize the contents of that website to all the machines in the cluster is not ideal. For dynamic websites, such as a wiki, it's not even an option. 
Not everyone can afford network-attached storage, but somehow the data needs to be kept in sync. Enter DRBD, which can be thought of as network-based RAID-1 [#]_. Install the DRBD Packages ######################### DRBD itself is included in the upstream kernel [#]_, but we do need some utilities to use it effectively. |CFS_DISTRO| does not ship these utilities, so we need to enable a third-party repository to get them. Supported packages for many OSes are available from DRBD's maker `LINBIT <https://www.linbit.com/>`_, but here we'll use the free `ELRepo <https://elrepo.org/>`_ repository. On both nodes, import the ELRepo package signing key, and enable the repository: .. code-block:: console [root@pcmk-1 ~]# rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org [root@pcmk-1 ~]# dnf install -y https://www.elrepo.org/elrepo-release-9.el9.elrepo.noarch.rpm Now, we can install the DRBD kernel module and utilities: .. code-block:: console # dnf install -y kmod-drbd9x drbd9x-utils DRBD will not be able to run under the default SELinux security policies. If you are familiar with SELinux, you can modify the policies in a more fine-grained manner, but here we will simply exempt DRBD processes from SELinux control: .. code-block:: console # dnf install -y policycoreutils-python-utils # semanage permissive -a drbd_t We will configure DRBD to use port 7789, so allow that port from each host to the other: .. code-block:: console [root@pcmk-1 ~]# firewall-cmd --permanent --add-rich-rule='rule family="ipv4" \ source address="192.168.122.102" port port="7789" protocol="tcp" accept' success [root@pcmk-1 ~]# firewall-cmd --reload success .. code-block:: console [root@pcmk-2 ~]# firewall-cmd --permanent --add-rich-rule='rule family="ipv4" \ source address="192.168.122.101" port port="7789" protocol="tcp" accept' success [root@pcmk-2 ~]# firewall-cmd --reload success .. NOTE:: In this example, we have only two nodes, and all network traffic is on the same LAN. In production, it is recommended to use a dedicated, isolated network for cluster-related traffic, so the firewall configuration would likely be different; one approach would be to add the dedicated network interfaces to the trusted zone. .. NOTE:: If the ``firewall-cmd --add-rich-rule`` command fails with ``Error: INVALID_RULE: unknown element``, ensure that there is no space at the beginning of the second line of the command. Allocate a Disk Volume for DRBD ############################### DRBD will need its own block device on each node. This can be a physical disk partition or logical volume, of whatever size you need for your data. For this document, we will use a 512MiB logical volume, which is more than sufficient for a single HTML file and (later) GFS2 metadata. .. code-block:: console [root@pcmk-1 ~]# vgs VG #PV #LV #SN Attr VSize VFree almalinux_pcmk-1 1 2 0 wz--n- <19.00g <13.00g [root@pcmk-1 ~]# lvcreate --name drbd-demo --size 512M almalinux_pcmk-1 Logical volume "drbd-demo" created. [root@pcmk-1 ~]# lvs LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert drbd-demo almalinux_pcmk-1 -wi-a----- 512.00m root almalinux_pcmk-1 -wi-ao---- 4.00g swap almalinux_pcmk-1 -wi-ao---- 2.00g Repeat for the second node, making sure to use the same size: .. code-block:: console [root@pcmk-1 ~]# ssh pcmk-2 -- lvcreate --name drbd-demo --size 512M almalinux_pcmk-2 Logical volume "drbd-demo" created.
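DRBD can only use as much space as the smaller of the two backing devices provides, so it is worth a quick check that both logical volumes really did end up the same size. A minimal sketch, assuming the volume group names used in this guide:

.. code-block:: console

   [root@pcmk-1 ~]# lvs --units m almalinux_pcmk-1/drbd-demo
   [root@pcmk-1 ~]# ssh pcmk-2 -- lvs --units m almalinux_pcmk-2/drbd-demo

Both commands should report a 512.00m ``drbd-demo`` volume.

Configure DRBD ############## There is no series of commands for building a DRBD configuration, so simply run this on both nodes to use this sample configuration: ..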
code-block:: console # cat <<END >/etc/drbd.d/wwwdata.res - resource wwwdata { - protocol C; - meta-disk internal; - device /dev/drbd1; - syncer { - verify-alg sha1; - } - net { - allow-two-primaries; - } - on pcmk-1 { - disk /dev/almalinux_pcmk-1/drbd-demo; - address 192.168.122.101:7789; - } - on pcmk-2 { - disk /dev/almalinux_pcmk-2/drbd-demo; - address 192.168.122.102:7789; - } + resource "wwwdata" { + device minor 1; + meta-disk internal; + + net { + protocol C; + allow-two-primaries yes; + fencing resource-and-stonith; + verify-alg sha1; + } + handlers { + fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh"; + unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh"; + } + on "pcmk-1" { + disk "/dev/almalinux_pcmk-1/drbd-demo"; + node-id 0; + } + on "pcmk-2" { + disk "/dev/almalinux_pcmk-2/drbd-demo"; + node-id 1; + } + connection { + host "pcmk-1" address 192.168.122.101:7789; + host "pcmk-2" address 192.168.122.102:7789; + } } END + .. IMPORTANT:: Edit the file to use the hostnames, IP addresses, and logical volume paths of your nodes if they differ from the ones used in this guide. .. NOTE:: Detailed information on the directives used in this configuration (and other alternatives) is available in the `DRBD User's Guide - <https://www.linbit.com/drbd-user-guide/>`_. - The ``allow-two-primaries`` option would not normally be used in + <https://www.linbit.com/drbd-user-guide/>`_. The + guide contains a wealth of information on such topics as core DRBD + concepts, replication settings, network connection options, quorum, split- + brain handling, administrative tasks, troubleshooting, and responding to + disk or node failures, among others. + + The ``allow-two-primaries: yes`` option would not normally be used in an active/passive cluster. We are adding it here for the convenience of changing to an active/active cluster later. Initialize DRBD ############### With the configuration in place, we can now get DRBD running. These commands create the local metadata for the DRBD resource, ensure the DRBD kernel module is loaded, and bring up the DRBD resource. Run them on one node: .. code-block:: console [root@pcmk-1 ~]# drbdadm create-md wwwdata initializing activity log initializing bitmap (16 KB) to all zero Writing meta data... New drbd meta data block successfully created. success [root@pcmk-1 ~]# modprobe drbd [root@pcmk-1 ~]# drbdadm up wwwdata --== Thank you for participating in the global usage survey ==-- The server's response is: you are the 25212th user to install this version We can confirm DRBD's status on this node: .. code-block:: console [root@pcmk-1 ~]# drbdadm status wwwdata role:Secondary disk:Inconsistent pcmk-2 connection:Connecting Because we have not yet initialized the data, this node's data is marked as ``Inconsistent``. Because we have not yet initialized the second node, the ``pcmk-2`` connection is ``Connecting`` (waiting for connection). Now, repeat the above commands on the second node, starting with creating ``wwwdata.res``. After giving it time to connect, when we check the status of the first node, it shows: .. code-block:: console [root@pcmk-1 ~]# drbdadm status wwwdata role:Secondary disk:Inconsistent pcmk-2 role:Secondary peer-disk:Inconsistent You can see that ``pcmk-2 connection:Connecting`` no longer appears in the output, meaning the two DRBD nodes are communicating properly, and both nodes are in ``Secondary`` role with ``Inconsistent`` data. To make the data consistent, we need to tell DRBD which node should be considered to have the correct data.
In this case, since we are creating a new resource, both have garbage, so we'll just pick ``pcmk-1`` and run this command on it: .. code-block:: console [root@pcmk-1 ~]# drbdadm primary --force wwwdata .. NOTE:: If you are using a different version of DRBD, the required syntax may be different. See the documentation for your version for how to perform these commands. If we check the status immediately, we'll see something like this: .. code-block:: console [root@pcmk-1 ~]# drbdadm status wwwdata role:Primary disk:UpToDate pcmk-2 role:Secondary peer-disk:Inconsistent It will be quickly followed by this: .. code-block:: console + [root@pcmk-1 ~]# drbdadm status wwwdata role:Primary disk:UpToDate pcmk-2 role:Secondary replication:SyncSource peer-disk:Inconsistent We can see that the first node has the ``Primary`` role, its partner node has the ``Secondary`` role, the first node's data is now considered ``UpToDate``, and the partner node's data is still ``Inconsistent``. After a while, the sync should finish, and you'll see something like: .. code-block:: console [root@pcmk-1 ~]# drbdadm status wwwdata role:Primary disk:UpToDate pcmk-2 role:Secondary peer-disk:UpToDate [root@pcmk-2 ~]# drbdadm status wwwdata role:Secondary disk:UpToDate pcmk-1 role:Primary peer-disk:UpToDate Both sets of data are now ``UpToDate``, and we can proceed to creating and populating a filesystem for our ``WebSite`` resource's documents. Populate the DRBD Disk ###################### On the node with the primary role (``pcmk-1`` in this example), create a filesystem on the DRBD device: .. code-block:: console [root@pcmk-1 ~]# mkfs.xfs /dev/drbd1 meta-data=/dev/drbd1 isize=512 agcount=4, agsize=32765 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=0 = reflink=1 data = bsize=4096 blocks=131059, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0, ftype=1 log =internal log bsize=4096 blocks=1368, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 Discarding blocks...Done. .. NOTE:: In this example, we create an xfs filesystem with no special options. In a production environment, you should choose a filesystem type and options that are suitable for your application. Mount the newly created filesystem, populate it with our web document, give it the same SELinux policy as the web document root, then unmount it (the cluster will handle mounting and unmounting it later): .. code-block:: console [root@pcmk-1 ~]# mount /dev/drbd1 /mnt [root@pcmk-1 ~]# cat <<-END >/mnt/index.html <html> <body>My Test Site - DRBD</body> </html> END [root@pcmk-1 ~]# chcon -R --reference=/var/www/html /mnt [root@pcmk-1 ~]# umount /dev/drbd1 Configure the Cluster for the DRBD device ######################################### One handy feature ``pcs`` has is the ability to queue up several changes into a file and commit those changes all at once. To do this, start by populating the file with the current raw XML config from the CIB. .. code-block:: console [root@pcmk-1 ~]# pcs cluster cib drbd_cfg Using ``pcs``'s ``-f`` option, make changes to the configuration saved in the ``drbd_cfg`` file. These changes will not be seen by the cluster until the ``drbd_cfg`` file is pushed into the live cluster's CIB later.
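If you want to see exactly what has been queued up at any point before the final push, you can compare the working file against the live CIB. A minimal sketch, assuming the ``drbd_cfg`` file created above; ``crm_diff`` ships with Pacemaker, and ``live_cib.xml`` is just a scratch copy of the running configuration:

.. code-block:: console

   [root@pcmk-1 ~]# pcs cluster cib > live_cib.xml
   [root@pcmk-1 ~]# crm_diff --original live_cib.xml --new drbd_cfg

Any queued changes are displayed as an XML patch.

Here, we create a cluster resource for the DRBD device, and an additional *clone* resource to allow the resource to run on both nodes at the same time. ..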
code-block:: console [root@pcmk-1 ~]# pcs -f drbd_cfg resource create WebData ocf:linbit:drbd \ - drbd_resource=wwwdata op monitor interval=60s + drbd_resource=wwwdata op monitor interval=29s role=Promoted \ + monitor interval=31s role=Unpromoted [root@pcmk-1 ~]# pcs -f drbd_cfg resource promotable WebData \ promoted-max=1 promoted-node-max=1 clone-max=2 clone-node-max=1 \ notify=true [root@pcmk-1 ~]# pcs resource status * ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-1 * WebSite (ocf::heartbeat:apache): Started pcmk-1 [root@pcmk-1 ~]# pcs resource config Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2) Attributes: cidr_netmask=24 ip=192.168.122.120 Operations: monitor interval=30s (ClusterIP-monitor-interval-30s) start interval=0s timeout=20s (ClusterIP-start-interval-0s) stop interval=0s timeout=20s (ClusterIP-stop-interval-0s) Resource: WebSite (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status Operations: monitor interval=1min (WebSite-monitor-interval-1min) start interval=0s timeout=40s (WebSite-start-interval-0s) stop interval=0s timeout=60s (WebSite-stop-interval-0s) After you are satisfied with all the changes, you can commit them all at once by pushing the ``drbd_cfg`` file into the live CIB. .. code-block:: console [root@pcmk-1 ~]# pcs cluster cib-push drbd_cfg --config CIB updated .. NOTE:: All the updates above can be done in one shot as follows: .. code-block:: console [root@pcmk-1 ~]# pcs resource create WebData ocf:linbit:drbd \ - drbd_resource=wwwdata op monitor interval=60s \ + drbd_resource=wwwdata op monitor interval=29s role=Promoted \ + monitor interval=31s role=Unpromoted \ promotable promoted-max=1 promoted-node-max=1 clone-max=2 \ clone-node-max=1 notify=true Let's see what the cluster did with the new configuration: .. 
code-block:: console [root@pcmk-1 ~]# pcs resource status * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-2 * WebSite (ocf:heartbeat:apache): Started pcmk-2 * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-1 ] * Unpromoted: [ pcmk-2 ] [root@pcmk-1 ~]# pcs resource config Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2) Attributes: cidr_netmask=24 ip=192.168.122.120 Operations: monitor interval=30s (ClusterIP-monitor-interval-30s) start interval=0s timeout=20s (ClusterIP-start-interval-0s) stop interval=0s timeout=20s (ClusterIP-stop-interval-0s) Resource: WebSite (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status Operations: monitor interval=1min (WebSite-monitor-interval-1min) start interval=0s timeout=40s (WebSite-start-interval-0s) stop interval=0s timeout=60s (WebSite-stop-interval-0s) Clone: WebData-clone Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1 Resource: WebData (class=ocf provider=linbit type=drbd) Attributes: drbd_resource=wwwdata Operations: demote interval=0s timeout=90 (WebData-demote-interval-0s) - monitor interval=60s (WebData-monitor-interval-60s) + monitor interval=29s role=Promoted (WebData-monitor-interval-29s) + monitor interval=31s role=Unpromoted (WebData-monitor-interval-31s) notify interval=0s timeout=90 (WebData-notify-interval-0s) promote interval=0s timeout=90 (WebData-promote-interval-0s) reload interval=0s timeout=30 (WebData-reload-interval-0s) start interval=0s timeout=240 (WebData-start-interval-0s) stop interval=0s timeout=100 (WebData-stop-interval-0s) We can see that ``WebData-clone`` (our DRBD device) is running as ``Promoted`` (DRBD's primary role) on ``pcmk-1`` and ``Unpromoted`` (DRBD's secondary role) on ``pcmk-2``. .. IMPORTANT:: The resource agent should load the DRBD module when needed if it's not already loaded. If that does not happen, configure your operating system to load the module at boot time. For |CFS_DISTRO| |CFS_DISTRO_VER|, you would run this on both nodes: .. code-block:: console # echo drbd >/etc/modules-load.d/drbd.conf Configure the Cluster for the Filesystem ######################################## Now that we have a working DRBD device, we need to mount its filesystem. In addition to defining the filesystem, we also need to tell the cluster where it can be located (only on the DRBD Primary) and when it is allowed to start (after the Primary was promoted). We are going to take a shortcut when creating the resource this time. Instead of explicitly saying we want the ``ocf:heartbeat:Filesystem`` script, we are only going to ask for ``Filesystem``. We can do this because we know there is only one resource script named ``Filesystem`` available to Pacemaker, and that ``pcs`` is smart enough to fill in the ``ocf:heartbeat:`` portion for us correctly in the configuration. If there were multiple ``Filesystem`` scripts from different OCF providers, we would need to specify the exact one we wanted. Once again, we will queue our changes to a file and then push the new configuration to the cluster as the final step. .. 
code-block:: console [root@pcmk-1 ~]# pcs cluster cib fs_cfg [root@pcmk-1 ~]# pcs -f fs_cfg resource create WebFS Filesystem \ device="/dev/drbd1" directory="/var/www/html" fstype="xfs" Assumed agent name 'ocf:heartbeat:Filesystem' (deduced from 'Filesystem') [root@pcmk-1 ~]# pcs -f fs_cfg constraint colocation add \ WebFS with Promoted WebData-clone [root@pcmk-1 ~]# pcs -f fs_cfg constraint order \ promote WebData-clone then start WebFS Adding WebData-clone WebFS (kind: Mandatory) (Options: first-action=promote then-action=start) We also need to tell the cluster that Apache needs to run on the same machine as the filesystem and that it must be active before Apache can start. .. code-block:: console [root@pcmk-1 ~]# pcs -f fs_cfg constraint colocation add WebSite with WebFS [root@pcmk-1 ~]# pcs -f fs_cfg constraint order WebFS then WebSite Adding WebFS WebSite (kind: Mandatory) (Options: first-action=start then-action=start) Review the updated configuration. .. code-block:: console [root@pcmk-1 ~]# pcs -f fs_cfg constraint Location Constraints: Resource: WebSite Enabled on: Node: pcmk-1 (score:50) Ordering Constraints: start ClusterIP then start WebSite (kind:Mandatory) promote WebData-clone then start WebFS (kind:Mandatory) start WebFS then start WebSite (kind:Mandatory) Colocation Constraints: WebSite with ClusterIP (score:INFINITY) WebFS with WebData-clone (score:INFINITY) (rsc-role:Started) (with-rsc-role:Promoted) WebSite with WebFS (score:INFINITY) Ticket Constraints: After reviewing the new configuration, upload it and watch the cluster put it into effect. .. code-block:: console [root@pcmk-1 ~]# pcs cluster cib-push fs_cfg --config CIB updated [root@pcmk-1 ~]# pcs resource status * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-2 * WebSite (ocf:heartbeat:apache): Started pcmk-2 * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-2 ] * Unpromoted: [ pcmk-1 ] * WebFS (ocf:heartbeat:Filesystem): Started pcmk-2 [root@pcmk-1 ~]# pcs resource config Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2) Attributes: cidr_netmask=24 ip=192.168.122.120 Operations: monitor interval=30s (ClusterIP-monitor-interval-30s) start interval=0s timeout=20s (ClusterIP-start-interval-0s) stop interval=0s timeout=20s (ClusterIP-stop-interval-0s) Resource: WebSite (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://localhost/server-status Operations: monitor interval=1min (WebSite-monitor-interval-1min) start interval=0s timeout=40s (WebSite-start-interval-0s) stop interval=0s timeout=60s (WebSite-stop-interval-0s) Clone: WebData-clone Meta Attrs: clone-max=2 clone-node-max=1 notify=true promotable=true promoted-max=1 promoted-node-max=1 Resource: WebData (class=ocf provider=linbit type=drbd) Attributes: drbd_resource=wwwdata Operations: demote interval=0s timeout=90 (WebData-demote-interval-0s) - monitor interval=60s (WebData-monitor-interval-60s) + monitor interval=29s role=Promoted (WebData-monitor-interval-29s) + monitor interval=31s role=Unpromoted (WebData-monitor-interval-31s) notify interval=0s timeout=90 (WebData-notify-interval-0s) promote interval=0s timeout=90 (WebData-promote-interval-0s) reload interval=0s timeout=30 (WebData-reload-interval-0s) start interval=0s timeout=240 (WebData-start-interval-0s) stop interval=0s timeout=100 (WebData-stop-interval-0s) Resource: WebFS (class=ocf provider=heartbeat type=Filesystem) Attributes: device=/dev/drbd1 directory=/var/www/html fstype=xfs Operations: monitor 
interval=20s timeout=40s (WebFS-monitor-interval-20s) start interval=0s timeout=60s (WebFS-start-interval-0s) stop interval=0s timeout=60s (WebFS-stop-interval-0s) Test Cluster Failover ##################### Previously, we used ``pcs cluster stop pcmk-2`` to stop all cluster services on ``pcmk-2``, failing over the cluster resources, but there is another way to safely simulate node failure. We can put the node into *standby mode*. Nodes in this state continue to run ``corosync`` and ``pacemaker`` but are not allowed to run resources. Any resources found active there will be moved elsewhere. This feature can be particularly useful when performing system administration tasks such as updating packages used by cluster resources. Put the active node into standby mode, and observe the cluster move all the resources to the other node. The node's status will change to indicate that it can no longer host resources, and eventually all the resources will move. .. code-block:: console [root@pcmk-1 ~]# pcs node standby pcmk-2 [root@pcmk-1 ~]# pcs status Cluster name: mycluster Cluster Summary: * Stack: corosync * Current DC: pcmk-1 (version 2.1.2-4.el9-ada5c3b36e2) - partition with quorum * Last updated: Wed Jul 27 05:28:01 2022 * Last change: Wed Jul 27 05:27:57 2022 by root via cibadmin on pcmk-1 * 2 nodes configured * 6 resource instances configured Node List: * Node pcmk-2: standby * Online: [ pcmk-1 ] Full List of Resources: * fence_dev (stonith:some_fence_agent): Started pcmk-1 * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-1 * WebSite (ocf:heartbeat:apache): Started pcmk-1 * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-1 ] * Stopped: [ pcmk-2 ] * WebFS (ocf:heartbeat:Filesystem): Started pcmk-1 Daemon Status: corosync: active/disabled pacemaker: active/disabled pcsd: active/enabled Once we've done everything we needed to on ``pcmk-2`` (in this case nothing, we just wanted to see the resources move), we can unstandby the node, making it eligible to host resources again. .. code-block:: console [root@pcmk-1 ~]# pcs node unstandby pcmk-2 [root@pcmk-1 ~]# pcs status Cluster name: mycluster Cluster Summary: * Stack: corosync * Current DC: pcmk-1 (version 2.1.2-4.el9-ada5c3b36e2) - partition with quorum * Last updated: Wed Jul 27 05:28:50 2022 * Last change: Wed Jul 27 05:28:47 2022 by root via cibadmin on pcmk-1 * 2 nodes configured * 6 resource instances configured Node List: * Online: [ pcmk-1 pcmk-2 ] Full List of Resources: * fence_dev (stonith:some_fence_agent): Started pcmk-1 * ClusterIP (ocf:heartbeat:IPaddr2): Started pcmk-1 * WebSite (ocf:heartbeat:apache): Started pcmk-1 * Clone Set: WebData-clone [WebData] (promotable): * Promoted: [ pcmk-1 ] * Unpromoted: [ pcmk-2 ] * WebFS (ocf:heartbeat:Filesystem): Started pcmk-1 Daemon Status: corosync: active/disabled pacemaker: active/disabled pcsd: active/enabled Notice that ``pcmk-2`` is back to the ``Online`` state, and that the cluster resources stay where they are due to our resource stickiness settings configured earlier. .. [#] See http://www.drbd.org for details. .. [#] Since version 2.6.33
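While exercising the standby and unstandby steps above, it can be handy to watch the transition from the surviving node as it happens. A minimal sketch using tools already present on the cluster nodes; ``crm_mon -1`` prints the cluster status once and exits (run it without ``-1`` for a continuously refreshing view), and ``drbdadm status`` confirms which node currently holds the DRBD ``Primary`` role:

.. code-block:: console

   [root@pcmk-1 ~]# crm_mon -1
   [root@pcmk-1 ~]# drbdadm status wwwdata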