This describes how to [[Configure Multiple Fencing Devices]] (using that page's example of IPMI followed by two switched PDUs) using the higher-level `pcs` tool.
== Starting Point ==
For a frame of reference, the cluster starts with this configuration (as shown by `pcs config`):
```
Cluster Name: an-cluster-03
Corosync Nodes:
pcmk-1 pcmk-2
Pacemaker Nodes:
pcmk-1 pcmk-2
Resources:
Stonith Devices:
Fencing Levels:
Location Constraints:
Ordering Constraints:
Colocation Constraints:
Cluster Properties:
cluster-infrastructure: corosync
no-quorum-policy: ignore
stonith-enabled: false
```
== Assumptions ==
We will need to make a few assumptions about our example cluster:
* It is a two-node cluster with the node names `pcmk-1` and `pcmk-2`.
* The two PDUs are accessible at the network addresses `pdu-1` and `pdu-2`, and will be accessed using the `fence_apc_snmp` fence agent.
* The fencing details for `pcmk-1` are:
** IPMI device address is `pcmk-1.ipmi`, the login name is `admin` and the password is `secret`.
** Its power supplies are connected to port 1 of both `pdu-1` and `pdu-2`.
* The fencing details for `pcmk-2` are:
** IPMI device address is `pcmk-2.ipmi`, the login name is `admin` and the password is `secret`.
** Its power supplies are connected to port 2 of both `pdu-1` and `pdu-2`.
Adapt the example below to the names, addresses, credentials, and fence agents appropriate to your cluster.
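Before wiring these devices into the cluster, it can be worth confirming that each fence agent can actually reach its hardware. A minimal sketch, assuming the addresses and credentials above (option names may differ between fence-agents versions; check `fence_ipmilan -h` on your system):

```shell
# Query power status of pcmk-1's IPMI interface and PDU ports.
fence_ipmilan --ip=pcmk-1.ipmi --username=admin --password=secret \
    --action=status
fence_apc_snmp --ip=pdu-1 --plug=1 --action=status
fence_apc_snmp --ip=pdu-2 --plug=1 --action=status
```

Each command should report the port or chassis status; if one fails here, it will also fail when Pacemaker tries to use it.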
== Configure Fencing Devices ==
* Configure the IPMI fence device for `pcmk-1`:
```
pcs stonith create fence_pcmk1_ipmi fence_ipmilan \
pcmk_host_list="pcmk-1" ipaddr="pcmk-1.ipmi" \
login="admin" passwd="secret" delay=15 \
op monitor interval=60s
```
* Configure the two PDU fence devices for `pcmk-1`:
Note that we've added `power_wait="5"` to the second PDU device. This tells the fence agent to wait 5 seconds after turning the port off before proceeding, giving the node's power supplies plenty of time to drain completely and ensuring that the node actually loses power.
```
pcs stonith create fence_pcmk1_psu1 fence_apc_snmp \
pcmk_host_list="pcmk-1" ipaddr="pdu-1" \
port="1" op monitor interval="60s"
pcs stonith create fence_pcmk1_psu2 fence_apc_snmp \
pcmk_host_list="pcmk-1" ipaddr="pdu-2" \
port="1" power_wait="5" op monitor interval="60s"
```
* Repeat for `pcmk-2`:
```
pcs stonith create fence_pcmk2_ipmi fence_ipmilan \
pcmk_host_list="pcmk-2" ipaddr="pcmk-2.ipmi" \
login="admin" passwd="secret" delay=15 \
op monitor interval=60s
pcs stonith create fence_pcmk2_psu1 fence_apc_snmp \
pcmk_host_list="pcmk-2" ipaddr="pdu-1" \
port="2" op monitor interval="60s"
pcs stonith create fence_pcmk2_psu2 fence_apc_snmp \
pcmk_host_list="pcmk-2" ipaddr="pdu-2" \
port="2" power_wait="5" op monitor interval="60s"
```
== Configuring fencing_topology ==
The next step is to tell Pacemaker the order in which the fencing methods should be tried.
Each fencing level may have one or more fence devices. When fencing is required, Pacemaker will try each level in sequence, stopping at the first level that succeeds. Therefore, separate levels function as a "fallback" mechanism (logical "or"). At any given level, all the devices in that level will be tried in succession, and all must succeed for the level to succeed (logical "and").
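This "and within a level, or across levels" behaviour can be sketched with mock devices, where each shell function stands in for a fence device that either succeeds or fails (the names here are illustrative, not real agents):

```shell
# Each level succeeds only if every device in it succeeds ("and");
# fencing succeeds at the first level that succeeds ("or").
try_level() {              # try_level dev1 [dev2 ...]
    for dev in "$@"; do
        "$dev" || return 1 # one failed device fails the whole level
    done
}
fence_node() {             # fence_node "level-1 devs" "level-2 devs" ...
    for level in "$@"; do
        if try_level $level; then   # word-splitting is intentional here
            echo "fenced"
            return 0
        fi
    done
    echo "fencing failed"
    return 1
}

# Mock devices: the IPMI interface is unreachable, both PDUs respond.
ipmi() { return 1; }
pdu1() { return 0; }
pdu2() { return 0; }

fence_node "ipmi" "pdu1 pdu2"   # prints "fenced": level 2 is the fallback
```

Because level 1 (`ipmi`) fails, the sketch falls through to level 2, where both PDU devices must succeed before the node counts as fenced.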
For our example, tell Pacemaker that the IPMI-based fence devices are the primary methods to use, and the switched PDUs are the fallback methods:
```
pcs stonith level add 1 pcmk-1 fence_pcmk1_ipmi
pcs stonith level add 2 pcmk-1 fence_pcmk1_psu1,fence_pcmk1_psu2
pcs stonith level add 1 pcmk-2 fence_pcmk2_ipmi
pcs stonith level add 2 pcmk-2 fence_pcmk2_psu1,fence_pcmk2_psu2
```
When Pacemaker needs to reboot a node using multiple devices in the same level, it turns them all off, then turns them all on, rather than rebooting each in turn, to ensure the node is completely fenced.
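With the levels in place, you can confirm the resulting topology before enabling fencing. A quick check, assuming the device names used above:

```shell
# List the configured fencing levels for each node
pcs stonith level
# Check that every node and device referenced by a level actually exists
pcs stonith level verify
```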
== Enable and Test Fencing ==
Now that fencing is configured, we can enable it:
```
pcs property set stonith-enabled=true
```
You can test by unplugging the IPMI interface for `pcmk-1` and then crashing the node, which will trigger `pcmk-2` to initiate fencing. After the IPMI attempt times out, you should see port 1 on `pdu-1` turn off, then port 1 on `pdu-2` turn off, then the crashed node power down, and finally the two ports turn back on in the same order. If your server's BIOS is configured to power on after power loss (or to return to its last state), the server should then begin to power back on.
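Rather than crashing a node, you can also trigger fencing manually from the surviving node and watch the same off/on sequence; a sketch:

```shell
# Ask the cluster to fence pcmk-1 (this WILL power-cycle the node)
pcs stonith fence pcmk-1
# Afterwards, review the fencing history for the node
stonith_admin --history pcmk-1
```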