
diff --git a/doc/Clusters_from_Scratch/en-US/Ch-Active-Active.txt b/doc/Clusters_from_Scratch/en-US/Ch-Active-Active.txt
index c8e9baf38d..571d9ed5a0 100644
--- a/doc/Clusters_from_Scratch/en-US/Ch-Active-Active.txt
+++ b/doc/Clusters_from_Scratch/en-US/Ch-Active-Active.txt
@@ -1,754 +1,755 @@
= Conversion to Active/Active =
== Requirements ==
The primary requirement for an Active/Active cluster is that the data
required for your services is available, simultaneously, on both
machines. Pacemaker makes no requirement on how this is achieved; you
could use a SAN if you had one available, but since DRBD supports
multiple Primaries, we can also use that.
The only hitch is that we need to use a cluster-aware filesystem. The
one we used earlier with DRBD, ext4, is not one of those. Both OCFS2
and GFS2 are supported; here we will use GFS2, which comes with
Fedora 17.
=== Installing the Required Software ===
[source,C]
# yum install -y gfs2-utils dlm kernel-modules-extra
.....
Loaded plugins: langpacks, presto, refresh-packagekit
Resolving Dependencies
--> Running transaction check
---> Package dlm.x86_64 0:3.99.4-1.fc17 will be installed
---> Package gfs2-utils.x86_64 0:3.1.4-3.fc17 will be installed
---> Package kernel-modules-extra.x86_64 0:3.4.4-3.fc17 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
Package Arch Version Repository Size
================================================================================
Installing:
dlm x86_64 3.99.4-1.fc17 updates 83 k
gfs2-utils x86_64 3.1.4-3.fc17 fedora 214 k
kernel-modules-extra x86_64 3.4.4-3.fc17 updates 1.7 M
Transaction Summary
================================================================================
Install 3 Packages
Total download size: 1.9 M
Installed size: 7.7 M
Downloading Packages:
(1/3): dlm-3.99.4-1.fc17.x86_64.rpm | 83 kB 00:00
(2/3): gfs2-utils-3.1.4-3.fc17.x86_64.rpm | 214 kB 00:00
(3/3): kernel-modules-extra-3.4.4-3.fc17.x86_64.rpm | 1.7 MB 00:01
--------------------------------------------------------------------------------
Total 615 kB/s | 1.9 MB 00:03
Running Transaction Check
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : kernel-modules-extra-3.4.4-3.fc17.x86_64 1/3
Installing : gfs2-utils-3.1.4-3.fc17.x86_64 2/3
Installing : dlm-3.99.4-1.fc17.x86_64 3/3
Verifying : dlm-3.99.4-1.fc17.x86_64 1/3
Verifying : gfs2-utils-3.1.4-3.fc17.x86_64 2/3
Verifying : kernel-modules-extra-3.4.4-3.fc17.x86_64 3/3
Installed:
dlm.x86_64 0:3.99.4-1.fc17
gfs2-utils.x86_64 0:3.1.4-3.fc17
kernel-modules-extra.x86_64 0:3.4.4-3.fc17
Complete!
.....
== Create a GFS2 Filesystem ==
[[GFS2_prep]]
=== Preparation ===
Before we do anything to the existing partition, we need to make sure it
is unmounted. We do this by telling the cluster to stop the WebFS resource.
This will ensure that other resources (in our case, Apache) using WebFS
are not only stopped, but stopped in the correct order.
ifdef::pcs[]
[source,C]
----
# pcs resource stop WebFS
# pcs resource
ClusterIP (ocf::heartbeat:IPaddr2) Started
WebSite (ocf::heartbeat:apache) Stopped
Master/Slave Set: WebDataClone [WebData]
Masters: [ pcmk-2 ]
Slaves: [ pcmk-1 ]
WebFS (ocf::heartbeat:Filesystem) Stopped
----
endif::[]
ifdef::crm[]
[source,C]
-----
# crm resource stop WebFS
# crm_mon -1
============
Last updated: Tue Apr 3 14:07:36 2012
Last change: Tue Apr 3 14:07:15 2012 via cibadmin on pcmk-1
Stack: corosync
Current DC: pcmk-1 (1702537408) - partition with quorum
Version: 1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, unknown expected votes
5 Resources configured.
============
Online: [ pcmk-1 pcmk-2 ]
ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2
Master/Slave Set: WebDataClone [WebData]
Masters: [ pcmk-2 ]
Slaves: [ pcmk-1 ]
-----
endif::[]
[NOTE]
=======
Note that both Apache and WebFS have been stopped.
=======
=== Create and Populate a GFS2 Partition ===
Now that the cluster stack and integration pieces are running smoothly,
we can create a GFS2 partition.
[WARNING]
=========
This will erase all previous content stored on the DRBD device. Ensure
you have a copy of any important data.
=========
We need to specify a number of additional parameters when creating a
GFS2 partition.
First we must use the -p option to specify that we want to use the
kernel's DLM. Next we use -j to indicate that it should reserve enough
space for two journals (one per node accessing the filesystem).
ifdef::pcs[]
Lastly, we use -t to specify the lock table name. The format for this
field is +clustername:fsname+. For the +clustername+, we need to use the same
value as specified in 'corosync.conf' for +cluster_name+. If you set up
corosync with the same cluster name we used in this tutorial, the cluster
name will be 'mycluster'. If you are unsure what your cluster name is,
open up /etc/corosync/corosync.conf, or execute the command
'pcs cluster corosync pcmk-1' to view the corosync config. The cluster
name will be in the +totem+ block.
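For example, a quick way to check (assuming the default configuration
location and the cluster name used in this tutorial) is:
[source,C]
----
# grep cluster_name /etc/corosync/corosync.conf
    cluster_name: mycluster
----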
endif::[]
ifdef::crm[]
Lastly, we use -t to specify the lock table name. The format for this
field is +clustername:fsname+. For the +clustername+, we need to use the same
value as specified in 'corosync.conf' for +cluster_name+. Just pick
something unique and descriptive and add it somewhere inside the +totem+
block. For example:
.....
totem {
version: 2
# crypto_cipher and crypto_hash: Used for mutual node authentication.
# If you choose to enable this, then do remember to create a shared
# secret with "corosync-keygen".
crypto_cipher: none
crypto_hash: none
cluster_name: mycluster
...
.....
[IMPORTANT]
===========
Do this on each node in the cluster and be sure to restart them before
continuing.
===========
endif::[]
[IMPORTANT]
===========
We must run the next command on whichever node last had '/dev/drbd'
mounted. Otherwise you will receive the message:
-----
/dev/drbd1: Read-only file system
-----
===========
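If you are unsure which node that is, DRBD can report which side is
currently Primary (a quick check, assuming the resource is still named
+wwwdata+ as earlier in this guide; the first role printed is the local
one):
[source,C]
-----
# drbdadm role wwwdata
-----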
[source,C]
-----
# ssh pcmk-2 -- mkfs.gfs2 -p lock_dlm -j 2 -t mycluster:web /dev/drbd1
This will destroy any data on /dev/drbd1.
It appears to contain: Linux rev 1.0 ext4 filesystem data, UUID=dc45fff3-c47a-4db2-96f7-a8049a323fe4 (extents) (large files) (huge files)
Are you sure you want to proceed? [y/n]y
Device: /dev/drbd1
Blocksize: 4096
Device Size 0.97 GB (253935 blocks)
Filesystem Size: 0.97 GB (253932 blocks)
Journals: 2
Resource Groups: 4
Locking Protocol: "lock_dlm"
Lock Table: "mycluster"
UUID: ed293a02-9eee-3fa3-ed1c-435ef1fd0116
-----
ifdef::pcs[]
[source,C]
----
# pcs cluster cib dlm_cfg
# pcs -f dlm_cfg resource create dlm ocf:pacemaker:controld op monitor interval=60s
# pcs -f dlm_cfg resource clone dlm clone-max=2 clone-node-max=1
# pcs -f dlm_cfg resource show
ClusterIP (ocf::heartbeat:IPaddr2) Started
WebSite (ocf::heartbeat:apache) Stopped
Master/Slave Set: WebDataClone [WebData]
Masters: [ pcmk-2 ]
Slaves: [ pcmk-1 ]
WebFS (ocf::heartbeat:Filesystem) Stopped
Clone Set: dlm-clone [dlm]
Stopped: [ dlm:0 dlm:1 ]
# pcs cluster push cib dlm_cfg
CIB updated
# pcs status
Last updated: Fri Sep 14 12:54:50 2012
Last change: Fri Sep 14 12:54:43 2012 via cibadmin on pcmk-1
Stack: corosync
Current DC: pcmk-1 (1) - partition with quorum
Version: 1.1.8-1.el7-60a19ed12fdb4d5c6a6b6767f52e5391e447fec0
2 Nodes configured, unknown expected votes
7 Resources configured.
Online: [ pcmk-1 pcmk-2 ]
Full list of resources:
ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-2
WebSite (ocf::heartbeat:apache): Stopped
Master/Slave Set: WebDataClone [WebData]
Masters: [ pcmk-2 ]
Slaves: [ pcmk-1 ]
WebFS (ocf::heartbeat:Filesystem): Stopped
Clone Set: dlm-clone [dlm]
Started: [ pcmk-1 pcmk-2 ]
----
endif::[]
ifdef::crm[]
[source,C]
-----
# crm
crm(live)# cib new dlm
INFO: dlm shadow CIB created
crm(dlm)# configure primitive dlm ocf:pacemaker:controld \
op monitor interval=60s
crm(dlm)# configure clone dlm_clone dlm meta clone-max=2 clone-node-max=1
crm(dlm)# configure show
node $id="1702537408" pcmk-1 \
attributes standby="off"
node $id="1719314624" pcmk-2
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.122.120" cidr_netmask="32" \
op monitor interval="30s"
primitive WebData ocf:linbit:drbd \
params drbd_resource="wwwdata" \
op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="ext4" \
meta target-role="Stopped"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
primitive dlm ocf:pacemaker:controld \
op monitor interval="60s"
ms WebDataClone WebData \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone dlm_clone dlm \
meta clone-max="2" clone-node-max="1"
location prefer-pcmk-1 WebSite 50: pcmk-1
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
property $id="cib-bootstrap-options" \
dc-version="1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff" \
cluster-infrastructure="corosync" \
stonith-enabled="false" \
no-quorum-policy="ignore" \
last-lrm-refresh="1333446866"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
op_defaults $id="op-options" \
timeout="240s"
crm(dlm)# cib commit dlm
INFO: commited 'dlm' shadow CIB to the cluster
crm(dlm)# quit
bye
# crm_mon -1
============
Last updated: Wed Apr 4 01:15:11 2012
Last change: Wed Apr 4 00:50:11 2012 via crmd on pcmk-1
Stack: corosync
Current DC: pcmk-1 (1702537408) - partition with quorum
Version: 1.1.7-2.fc17-ee0730e13d124c3d58f00016c3376a1de5323cff
2 Nodes configured, unknown expected votes
7 Resources configured.
============
Online: [ pcmk-1 pcmk-2 ]
ClusterIP (ocf::heartbeat:IPaddr2): Started pcmk-1
Master/Slave Set: WebDataClone [WebData]
Masters: [ pcmk-1 ]
Slaves: [ pcmk-2 ]
Clone Set: dlm_clone [dlm]
Started: [ pcmk-1 pcmk-2 ]
-----
endif::[]
Then (re)populate the new filesystem with data (web pages). For now we'll
create another variation on our home page.
[source,C]
-----
# mount /dev/drbd1 /mnt/
# cat <<-END >/mnt/index.html
<html>
<body>My Test Site - GFS2</body>
</html>
END
# umount /dev/drbd1
# drbdadm verify wwwdata
-----
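The verify operation runs in the background; if you are curious, its
progress can be watched in '/proc/drbd' (with the DRBD 8.x series used
here), for example:
[source,C]
-----
# cat /proc/drbd
-----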
== Reconfigure the Cluster for GFS2 ==
ifdef::pcs[]
With the WebFS resource stopped, let's update the configuration.
[source,C]
----
# pcs resource show WebFS
Resource: WebFS
device: /dev/drbd/by-res/wwwdata
directory: /var/www/html
fstype: ext4
target-role: Stopped
----
The fstype option needs to be updated to gfs2 instead of ext4.
[source,C]
----
# pcs resource update WebFS fstype=gfs2
# pcs resource show WebFS
Resource: WebFS
device: /dev/drbd/by-res/wwwdata
directory: /var/www/html
fstype: gfs2
target-role: Stopped
CIB updated
----
endif::[]
ifdef::crm[]
[source,C]
-----
# crm
crm(live) # cib new GFS2
INFO: GFS2 shadow CIB created
crm(GFS2) # configure delete WebFS
crm(GFS2) # configure primitive WebFS ocf:heartbeat:Filesystem params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
-----
Now that we've recreated the resource, we also need to recreate all the
constraints that used it. This is because the shell will automatically
remove any constraints that referenced WebFS.
[source,C]
-----
crm(GFS2) # configure colocation WebSite-with-WebFS inf: WebSite WebFS
crm(GFS2) # configure colocation fs_on_drbd inf: WebFS WebDataClone:Master
crm(GFS2) # configure order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
crm(GFS2) # configure order WebSite-after-WebFS inf: WebFS WebSite
crm(GFS2) # configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
params drbd_resource="wwwdata" \
op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.122.101" cidr_netmask="32" \
op monitor interval="30s"
ms WebDataClone WebData \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
property $id="cib-bootstrap-options" \
dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
-----
Review the configuration before uploading it to the cluster, quitting the
shell, and watching the cluster's response.
[source,C]
-----
crm(GFS2) # cib commit GFS2
INFO: commited 'GFS2' shadow CIB to the cluster
crm(GFS2) # quit
bye
# crm_mon
============
Last updated: Thu Sep 3 20:49:54 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============
Online: [ pcmk-1 pcmk-2 ]
WebSite (ocf::heartbeat:apache): Started pcmk-2
Master/Slave Set: WebDataClone
Masters: [ pcmk-1 ]
Slaves: [ pcmk-2 ]
ClusterIP (ocf::heartbeat:IPaddr): Started pcmk-2
WebFS (ocf::heartbeat:Filesystem): Started pcmk-1
-----
endif::[]
== Reconfigure Pacemaker for Active/Active ==
Almost everything is in place. Recent versions of DRBD are capable of
operating in Primary/Primary mode, and the filesystem we're using is
cluster-aware. All we need to do now is reconfigure the cluster to take
advantage of this.
ifdef::pcs[]
This will involve a number of changes, so we'll want to work with a
local CIB file.
[source,C]
----
# pcs cluster cib active_cfg
----
endif::[]
ifdef::crm[]
This will involve a number of changes, so we'll again use interactive
mode.
[source,C]
-----
# crm
# cib new active
-----
endif::[]
There's no point making the services active on both locations if we can't
reach them, so let's first clone the IP address. Cloned IPaddr2 resources
use an iptables rule to ensure that each request only gets processed by one of
the two clone instances. The additional meta options tell the cluster how
many instances of the clone we want (one "request bucket" for each node)
and that if all other nodes fail, then the remaining node should hold all
of them. Otherwise, the requests would simply be discarded.
ifdef::pcs[]
----
# pcs -f active_cfg resource clone ClusterIP \
globally-unique=true clone-max=2 clone-node-max=2
----
Notice that when ClusterIP becomes a clone, the constraints
referencing ClusterIP now reference the clone instead. This is
done automatically by pcs.
+endif::[]
ifdef::pcs[]
[source,C]
----
# pcs -f active_cfg constraint
Location Constraints:
Ordering Constraints:
start ClusterIP-clone then start WebSite
WebFS then WebSite
promote WebDataClone then start WebFS
Colocation Constraints:
WebSite with ClusterIP-clone
WebFS with WebDataClone (with-rsc-role:Master)
WebSite with WebFS
----
endif::[]
ifdef::crm[]
[source,C]
-----
# configure clone WebIP ClusterIP \
meta globally-unique="true" clone-max="2" clone-node-max="2"
-----
endif::[]
Now we must tell ClusterIP how to decide which requests are
processed by which hosts. To do this, we specify the
clusterip_hash parameter.
ifdef::pcs[]
[source,C]
----
# pcs -f active_cfg resource update ClusterIP clusterip_hash=sourceip
----
endif::[]
ifdef::crm[]
Open the ClusterIP resource
[source,C]
-----
# configure edit ClusterIP
-----
And add the following to the params line
.....
clusterip_hash="sourceip"
.....
So that the complete definition looks like:
.....
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
op monitor interval="30s"
.....
Here is the full transcript:
[source,C]
-----
# crm
crm(live)# cib new active
INFO: active shadow CIB created
crm(active) # configure clone WebIP ClusterIP \
meta globally-unique="true" clone-max="2" clone-node-max="2"
crm(active) # configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
params drbd_resource="wwwdata" \
op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
op monitor interval="30s"
ms WebDataClone WebData \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone WebIP ClusterIP \
meta globally-unique="true" clone-max="2" clone-node-max="2"
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_on_drbd inf: WebFS WebDataClone:Master
colocation website-with-ip inf: WebSite WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: WebIP WebSite
property $id="cib-bootstrap-options" \
dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
-----
Notice how any constraints that referenced ClusterIP have been updated
to use WebIP instead. This is an additional benefit of using the crm
shell.
endif::[]
Next we need to convert the filesystem and Apache resources into
clones.
ifdef::pcs[]
Notice how pcs automatically updates the relevant constraints again.
[source,C]
----
# pcs -f active_cfg resource clone WebFS
# pcs -f active_cfg resource clone WebSite
# pcs -f active_cfg constraint
Location Constraints:
Ordering Constraints:
start ClusterIP-clone then start WebSite-clone
WebFS-clone then WebSite-clone
promote WebDataClone then start WebFS-clone
Colocation Constraints:
WebSite-clone with ClusterIP-clone
WebFS-clone with WebDataClone (with-rsc-role:Master)
WebSite-clone with WebFS-clone
----
endif::[]
ifdef::crm[]
Again, the shell will automatically update any relevant
constraints.
[source,C]
-----
crm(active) # configure clone WebFSClone WebFS
crm(active) # configure clone WebSiteClone WebSite
-----
endif::[]
The last step is to tell the cluster that it is now allowed to promote
both instances to be Primary (aka. Master).
ifdef::pcs[]
[source,C]
-----
# pcs -f active_cfg resource update WebDataClone master-max=2
-----
endif::[]
ifdef::crm[]
[source,C]
-----
crm(active) # configure edit WebDataClone
-----
Change master-max to 2
[source,C]
-----
crm(active) # configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
params drbd_resource="wwwdata" \
op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
op monitor interval="30s"
ms WebDataClone WebData \
meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone WebFSClone WebFS
clone WebIP ClusterIP \
meta globally-unique="true" clone-max="2" clone-node-max="2"
clone WebSiteClone WebSite
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
colocation website-with-ip inf: WebSiteClone WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
order apache-after-ip inf: WebIP WebSiteClone
property $id="cib-bootstrap-options" \
dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
-----
endif::[]
Review the configuration before uploading it to the cluster, quitting the
shell, and watching the cluster's response.
ifdef::pcs[]
[source,C]
-----
# pcs cluster push cib active_cfg
# pcs resource start WebFS
-----
After all the processes are started, the status should look
similar to this.
[source,C]
-----
# pcs resource
Master/Slave Set: WebDataClone [WebData]
Masters: [ pcmk-2 pcmk-1 ]
Clone Set: dlm-clone [dlm]
Started: [ pcmk-2 pcmk-1 ]
Clone Set: ClusterIP-clone [ClusterIP] (unique)
ClusterIP:0 (ocf::heartbeat:IPaddr2) Started
ClusterIP:1 (ocf::heartbeat:IPaddr2) Started
Clone Set: WebFS-clone [WebFS]
Started: [ pcmk-1 pcmk-2 ]
Clone Set: WebSite-clone [WebSite]
Started: [ pcmk-1 pcmk-2 ]
-----
endif::[]
ifdef::crm[]
[source,C]
-----
crm(active) # cib commit active
INFO: commited 'active' shadow CIB to the cluster
crm(active) # quit
bye
# crm_mon
============
Last updated: Thu Sep 3 21:37:27 2009
Stack: openais
Current DC: pcmk-2 - partition with quorum
Version: 1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f
2 Nodes configured, 2 expected votes
6 Resources configured.
============
Online: [ pcmk-1 pcmk-2 ]
Master/Slave Set: WebDataClone
Masters: [ pcmk-1 pcmk-2 ]
Clone Set: WebIP
Started: [ pcmk-1 pcmk-2 ]
Clone Set: WebFSClone
Started: [ pcmk-1 pcmk-2 ]
Clone Set: WebSiteClone
Started: [ pcmk-1 pcmk-2 ]
Clone Set: dlm_clone
Started: [ pcmk-1 pcmk-2 ]
-----
endif::[]
=== Testing Recovery ===
[NOTE]
=======
TODO: Put one node into standby to demonstrate failover
=======
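In the meantime, a minimal sketch of what such a test might look like:
put one node into standby, watch the resources move with +crm_mon+, then
bring the node back online.
[source,C]
-----
# crm_standby -v on -n pcmk-1
# crm_mon -1
# crm_standby -v off -n pcmk-1
-----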
diff --git a/doc/Clusters_from_Scratch/en-US/Ch-Stonith.txt b/doc/Clusters_from_Scratch/en-US/Ch-Stonith.txt
index 695deea00b..dc37e905ee 100644
--- a/doc/Clusters_from_Scratch/en-US/Ch-Stonith.txt
+++ b/doc/Clusters_from_Scratch/en-US/Ch-Stonith.txt
@@ -1,307 +1,308 @@
= Configure STONITH =
== What Is STONITH ==
STONITH is an acronym for Shoot-The-Other-Node-In-The-Head and it
protects your data from being corrupted by rogue nodes or concurrent
access.
Just because a node is unresponsive doesn't mean it isn't
accessing your data. The only way to be 100% sure that your data is
safe is to use STONITH, so we can be certain that the node is truly
offline before allowing the data to be accessed from another node.
STONITH also has a role to play in the event that a clustered service
cannot be stopped. In this case, the cluster uses STONITH to force the
whole node offline, thereby making it safe to start the service
elsewhere.
== What STONITH Device Should You Use ==
It is crucial that the STONITH device allows the cluster to
differentiate between a node failure and a network one.
The biggest mistake people make in choosing a STONITH device is to
use a remote power switch (such as many on-board IPMI controllers) that
shares power with the node it controls. In such cases, the cluster
cannot be sure whether the node is really offline, or active and suffering
from a network fault.
Likewise, any device that relies on the machine being active (such as
SSH-based "devices" used during testing) is inappropriate.
== Configuring STONITH ==
ifdef::pcs[]
. Find the correct driver: +pcs stonith list+
. Find the parameters associated with the device: +pcs stonith describe <agent name>+
. Create a local copy of the CIB to make changes to: +pcs cluster cib stonith_cfg+
. Create the fencing resource using +pcs -f stonith_cfg stonith create <stonith_id>
<stonith device type> [stonith device options]+
. Set stonith-enabled to true: +pcs -f stonith_cfg property set stonith-enabled=true+
endif::[]
ifdef::crm[]
. Find the correct driver: +stonith_admin --list-installed+
. Since every device is different, the parameters needed to configure
it will vary. To find out the parameters associated with the device,
run: +stonith_admin --metadata --agent type+
The output should be XML-formatted text containing additional
parameter descriptions. We will endeavor to make the output more
friendly in a later version.
. Enter the shell: +crm+. Create an editable copy of the existing
configuration: +cib new stonith+. Then create a fencing resource
containing a primitive resource with a class of stonith, a type matching
the agent found in step 1, and a parameter for each of the values
returned in step 2: +configure primitive ...+
endif::[]
. If the device does not know how to fence nodes based on their uname,
you may also need to set the special +pcmk_host_map+ parameter. See
+man stonithd+ for details.
. If the device does not support the list command, you may also need
to set the special +pcmk_host_list+ and/or +pcmk_host_check+
parameters. See +man stonithd+ for details.
. If the device does not expect the victim to be specified with the
port parameter, you may also need to set the special
+pcmk_host_argument+ parameter. See +man stonithd+ for details.
ifdef::crm[]
. Upload it into the CIB from the shell: +cib commit stonith+
endif::[]
ifdef::pcs[]
. Commit the new configuration. +pcs cluster push cib stonith_cfg+
endif::[]
. Once the stonith resource is running, you can test it by executing
+stonith_admin --reboot nodename+ (a concrete invocation is sketched
below), though you might want to stop the cluster on that machine first.
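For example, assuming a node named +pcmk-2+ (substitute one of your own
nodes):
[source,C]
----
# stonith_admin --reboot pcmk-2
----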
== Example ==
Assuming we have a chassis containing four nodes and an IPMI device
active on 10.0.0.1, we would choose the fence_ipmilan driver in step
2 and obtain the following list of parameters:
.Obtaining a list of STONITH Parameters
ifdef::pcs[]
[source,C]
----
# pcs stonith describe fence_ipmilan
Stonith options for: fence_ipmilan
auth: IPMI Lan Auth type (md5, password, or none)
ipaddr: IPMI Lan IP to talk to
passwd: Password (if required) to control power on IPMI device
passwd_script: Script to retrieve password (if required)
lanplus: Use Lanplus
login: Username/Login (if required) to control power on IPMI device
action: Operation to perform. Valid operations: on, off, reboot, status, list, diag, monitor or metadata
timeout: Timeout (sec) for IPMI operation
cipher: Ciphersuite to use (same as ipmitool -C parameter)
method: Method to fence (onoff or cycle)
power_wait: Wait X seconds after on/off operation
delay: Wait X seconds before fencing is started
privlvl: Privilege level on IPMI device
verbose: Verbose mode
----
endif::[]
ifdef::crm[]
[source,C]
----
# stonith_admin --metadata -a fence_ipmilan
----
[source,XML]
----
<?xml version="1.0" ?>
<resource-agent name="fence_ipmilan" shortdesc="Fence agent for IPMI over LAN">
<longdesc>
fence_ipmilan is an I/O Fencing agent which can be used with machines controlled by IPMI. This agent calls support software using ipmitool (http://ipmitool.sf.net/).
To use fence_ipmilan with HP iLO 3 you have to enable lanplus option (lanplus / -P) and increase wait after operation to 4 seconds (power_wait=4 / -T 4)</longdesc>
<parameters>
<parameter name="auth" unique="1">
<getopt mixed="-A" />
<content type="string" />
<shortdesc lang="en">IPMI Lan Auth type (md5, password, or none)</shortdesc>
</parameter>
<parameter name="ipaddr" unique="1">
<getopt mixed="-a" />
<content type="string" />
<shortdesc lang="en">IPMI Lan IP to talk to</shortdesc>
</parameter>
<parameter name="passwd" unique="1">
<getopt mixed="-p" />
<content type="string" />
<shortdesc lang="en">Password (if required) to control power on IPMI device</shortdesc>
</parameter>
<parameter name="passwd_script" unique="1">
<getopt mixed="-S" />
<content type="string" />
<shortdesc lang="en">Script to retrieve password (if required)</shortdesc>
</parameter>
<parameter name="lanplus" unique="1">
<getopt mixed="-P" />
<content type="boolean" />
<shortdesc lang="en">Use Lanplus</shortdesc>
</parameter>
<parameter name="login" unique="1">
<getopt mixed="-l" />
<content type="string" />
<shortdesc lang="en">Username/Login (if required) to control power on IPMI device</shortdesc>
</parameter>
<parameter name="action" unique="1">
<getopt mixed="-o" />
<content type="string" default="reboot"/>
<shortdesc lang="en">Operation to perform. Valid operations: on, off, reboot, status, list, diag, monitor or metadata</shortdesc>
</parameter>
<parameter name="timeout" unique="1">
<getopt mixed="-t" />
<content type="string" />
<shortdesc lang="en">Timeout (sec) for IPMI operation</shortdesc>
</parameter>
<parameter name="cipher" unique="1">
<getopt mixed="-C" />
<content type="string" />
<shortdesc lang="en">Ciphersuite to use (same as ipmitool -C parameter)</shortdesc>
</parameter>
<parameter name="method" unique="1">
<getopt mixed="-M" />
<content type="string" default="onoff"/>
<shortdesc lang="en">Method to fence (onoff or cycle)</shortdesc>
</parameter>
<parameter name="power_wait" unique="1">
<getopt mixed="-T" />
<content type="string" default="2"/>
<shortdesc lang="en">Wait X seconds after on/off operation</shortdesc>
</parameter>
<parameter name="delay" unique="1">
<getopt mixed="-f" />
<content type="string" />
<shortdesc lang="en">Wait X seconds before fencing is started</shortdesc>
</parameter>
<parameter name="verbose" unique="1">
<getopt mixed="-v" />
<content type="boolean" />
<shortdesc lang="en">Verbose mode</shortdesc>
</parameter>
</parameters>
<actions>
<action name="on" />
<action name="off" />
<action name="reboot" />
<action name="status" />
<action name="diag" />
<action name="list" />
<action name="monitor" />
<action name="metadata" />
</actions>
</resource-agent>
----
endif::[]
from which we would create a STONITH resource fragment that might look
like this
.Sample STONITH Resource
ifdef::pcs[]
----
# pcs cluster cib stonith_cfg
# pcs -f stonith_cfg stonith create impi-fencing fence_ipmilan \
pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser \
passwd=acd123 op monitor interval=60s
----
[source,C]
----
# pcs -f stonith_cfg stonith
impi-fencing (stonith:fence_ipmilan) Stopped
----
endif::[]
ifdef::crm[]
[source,C]
----
# crm
crm(live)# cib new stonith
INFO: stonith shadow CIB created
crm(stonith)# configure primitive impi-fencing stonith::fence_ipmilan \
params pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser passwd=abc123 \
op monitor interval="60s"
----
endif::[]
And finally, since we disabled it earlier, we need to re-enable STONITH.
At this point we should have the following configuration.
ifdef::pcs[]
[source,C]
----
# pcs -f stonith_cfg property set stonith-enabled=true
# pcs -f stonith_cfg property
dc-version: 1.1.8-1.el7-60a19ed12fdb4d5c6a6b6767f52e5391e447fec0
cluster-infrastructure: corosync
no-quorum-policy: ignore
stonith-enabled: true
----
+endif::[]
Now push the configuration into the cluster.
ifdef::pcs[]
[source,C]
----
# pcs cluster push cib stonith_cfg
----
endif::[]
ifdef::crm[]
[source,C]
----
crm(stonith)# configure property stonith-enabled="true"
crm(stonith)# configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
params drbd_resource="wwwdata" \
op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
op monitor interval="30s"primitive ipmi-fencing stonith::fence_ipmilan \ params pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser passwd=abc123 \ op monitor interval="60s"ms WebDataClone WebData \
meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone WebFSClone WebFS
clone WebIP ClusterIP \
meta globally-unique="true" clone-max="2" clone-node-max="2"
clone WebSiteClone WebSite
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
colocation website-with-ip inf: WebSiteClone WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
order apache-after-ip inf: WebIP WebSiteClone
property $id="cib-bootstrap-options" \
dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="true" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
crm(stonith)# cib commit stonith
INFO: commited 'stonith' shadow CIB to the cluster
crm(stonith)# quit
bye
----
endif::[]
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Basics.txt b/doc/Pacemaker_Explained/en-US/Ch-Basics.txt
index 63681c64f2..57c0167424 100644
--- a/doc/Pacemaker_Explained/en-US/Ch-Basics.txt
+++ b/doc/Pacemaker_Explained/en-US/Ch-Basics.txt
@@ -1,368 +1,368 @@
= Configuration Basics =
== Configuration Layout ==
The cluster is written using XML notation and divided into two main
sections: configuration and status.
The status section contains the history of each resource on each node
and based on this data, the cluster can construct the complete current
state of the cluster. The authoritative source for the status section
is the local resource manager (lrmd) process on each cluster node and
the cluster will occasionally repopulate the entire section. For this
reason it is never written to disk and administrators are advised
against modifying it in any way.
The configuration section contains the more traditional information
like cluster options, lists of resources and indications of where they
should be placed. The configuration section is the primary focus of
this document.
The configuration section itself is divided into four parts:
* Configuration options (called +crm_config+)
* Nodes
* Resources
* Resource relationships (called +constraints+)
.An empty configuration
======
[source,XML]
-------
<cib admin_epoch="0" epoch="0" num_updates="0" have-quorum="false">
<configuration>
<crm_config/>
<nodes/>
<resources/>
<constraints/>
</configuration>
<status/>
</cib>
-------
======
== The Current State of the Cluster ==
Before one starts to configure a cluster, it is worth explaining how
to view the finished product. For this purpose we have created the
`crm_mon` utility that will display the
current state of an active cluster. It can show the cluster status by
node or by resource and can be used in either single-shot or
dynamically-updating mode. There are also modes for displaying a list
of the operations performed (grouped by node and resource) as well as
information about failures.
Using this tool, you can examine the state of the cluster for
irregularities and see how it responds when you cause or simulate
failures.
Details on all the available options can be obtained using the
`crm_mon --help` command.
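For example, a one-shot snapshot, or a one-shot view grouped by node
(the grouping used in the second sample below), can be obtained with:
[source,C]
--------
# crm_mon -1
# crm_mon -n -1
--------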
.Sample output from crm_mon
======
-------
============
Last updated: Fri Nov 23 15:26:13 2007
Current DC: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec)
3 Nodes configured.
5 Resources configured.
============
Node: sles-1 (1186dc9a-324d-425a-966e-d757e693dc86): online
192.168.100.181 (heartbeat::ocf:IPaddr): Started sles-1
192.168.100.182 (heartbeat:IPaddr): Started sles-1
192.168.100.183 (heartbeat::ocf:IPaddr): Started sles-1
rsc_sles-1 (heartbeat::ocf:IPaddr): Started sles-1
child_DoFencing:2 (stonith:external/vmware): Started sles-1
Node: sles-2 (02fb99a8-e30e-482f-b3ad-0fb3ce27d088): standby
Node: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec): online
rsc_sles-2 (heartbeat::ocf:IPaddr): Started sles-3
rsc_sles-3 (heartbeat::ocf:IPaddr): Started sles-3
child_DoFencing:0 (stonith:external/vmware): Started sles-3
-------
======
.Sample output from crm_mon -n
======
-------
============
Last updated: Fri Nov 23 15:26:13 2007
Current DC: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec)
3 Nodes configured.
5 Resources configured.
============
Node: sles-1 (1186dc9a-324d-425a-966e-d757e693dc86): online
Node: sles-2 (02fb99a8-e30e-482f-b3ad-0fb3ce27d088): standby
Node: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec): online
Resource Group: group-1
192.168.100.181 (heartbeat::ocf:IPaddr): Started sles-1
192.168.100.182 (heartbeat:IPaddr): Started sles-1
192.168.100.183 (heartbeat::ocf:IPaddr): Started sles-1
rsc_sles-1 (heartbeat::ocf:IPaddr): Started sles-1
rsc_sles-2 (heartbeat::ocf:IPaddr): Started sles-3
rsc_sles-3 (heartbeat::ocf:IPaddr): Started sles-3
Clone Set: DoFencing
child_DoFencing:0 (stonith:external/vmware): Started sles-3
child_DoFencing:1 (stonith:external/vmware): Stopped
child_DoFencing:2 (stonith:external/vmware): Started sles-1
-------
======
The DC (Designated Controller) node is where all the decisions are
made and if the current DC fails a new one is elected from the
remaining cluster nodes. The choice of DC is of no significance to an
administrator beyond the fact that its logs will generally be more
interesting.
== How Should the Configuration be Updated? ==
There are three basic rules for updating the cluster configuration:
* Rule 1 - Never edit the cib.xml file manually. Ever. I'm not making this up.
* Rule 2 - Read Rule 1 again.
* Rule 3 - The cluster will notice if you ignored rules 1 &amp; 2 and refuse to use the configuration.
Now that it is clear how NOT to update the configuration, we can begin
to explain how you should.
The most powerful tool for modifying the configuration is the
+cibadmin+ command which talks to a running cluster. With +cibadmin+,
the user can query, add, remove, update or replace any part of the
configuration; all changes take effect immediately, so there is no
need to perform a reload-like operation.
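For example, to view just the cluster options section of the
configuration (+crm_config+ is one of the accepted +--obj_type+ values,
used the same way as +resources+ further below):
[source,C]
# cibadmin --query --obj_type crm_config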
The simplest way of using cibadmin is to use it to save the current
configuration to a temporary file, edit that file with your favorite
text or XML editor and then upload the revised configuration.
.Safely using an editor to modify the cluster configuration
======
[source,C]
--------
# cibadmin --query > tmp.xml
# vi tmp.xml
# cibadmin --replace --xml-file tmp.xml
--------
======
Some of the better XML editors can make use of a Relax NG schema to
help make sure any changes you make are valid. The schema describing
the configuration can normally be found in
'/usr/lib/heartbeat/pacemaker.rng' on most systems.
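If you have `xmllint` installed, you can validate an edited file against
that schema before uploading it (adjust the schema path if your
distribution installs it elsewhere):
[source,C]
# xmllint --relaxng /usr/lib/heartbeat/pacemaker.rng tmp.xml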
If you only wanted to modify the resources section, you could instead
do
.Safely using an editor to modify a subsection of the cluster configuration
======
[source,C]
--------
# cibadmin --query --obj_type resources > tmp.xml
# vi tmp.xml
# cibadmin --replace --obj_type resources --xml-file tmp.xml
--------
======
to avoid modifying any other part of the configuration.
== Quickly Deleting Part of the Configuration ==
Identify the object you wish to delete, e.g. by running:
.Searching for STONITH related configuration items
======
[source,C]
# cibadmin -Q | grep stonith
[source,XML]
--------
<nvpair id="cib-bootstrap-options-stonith-action" name="stonith-action" value="reboot"/>
<nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="1"/>
<primitive id="child_DoFencing" class="stonith" type="external/vmware">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:1" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:2" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:0" type="external/vmware" class="stonith">
<lrm_resource id="child_DoFencing:3" type="external/vmware" class="stonith">
--------
======
Next identify the resource's tag name and id (in this case we'll
choose +primitive+ and +child_DoFencing+). Then simply execute:
[source,C]
# cibadmin --delete --crm_xml '&lt;primitive id="child_DoFencing"/>'
== Updating the Configuration Without Using XML ==
Some common tasks can also be performed with one of the higher level
tools that avoid the need to read or edit XML.
To enable stonith for example, one could run:
[source,C]
# crm_attribute --attr-name stonith-enabled --attr-value true
Or, to see if +somenode+ is allowed to run resources, there is:
[source,C]
# crm_standby --get-value --node-uname somenode
Or, to find the current location of +my-test-rsc+, one can use:
[source,C]
# crm_resource --locate --resource my-test-rsc
[[s-config-sandboxes]]
== Making Configuration Changes in a Sandbox ==
Often it is desirable to preview the effects of a series of changes
before updating the configuration atomically. For this purpose we
have created `crm_shadow` which creates a
"shadow" copy of the configuration and arranges for all the command
line tools to use it.
To begin, simply invoke `crm_shadow` and give
it the name of a configuration to create footnote:[Shadow copies are
identified with a name, making it possible to have more than one.];
be sure to follow the simple on-screen instructions.
WARNING: Read the above carefully; failure to do so could result in
destroying the cluster's active configuration!
.Creating and displaying the active sandbox
-[source,Bash]
======
+[source,Bash]
--------
# crm_shadow --create test
Setting up shadow instance
Type Ctrl-D to exit the crm_shadow shell
shadow[test]:
shadow[test] # crm_shadow --which
test
--------
======
From this point on, all cluster commands will automatically use the
shadow copy instead of talking to the cluster's active configuration.
Once you have finished experimenting, you can either commit the
changes, or discard them as shown below. Again, be sure to follow the
on-screen instructions carefully.
For a full list of `crm_shadow` options and
commands, invoke it with the `--help` option.
.Using a sandbox to make multiple changes atomically
======
[source,Bash]
--------
shadow[test] # crm_failcount -G -r rsc_c001n01
name=fail-count-rsc_c001n01 value=0
shadow[test] # crm_standby -v on -n c001n02
shadow[test] # crm_standby -G -n c001n02
name=c001n02 scope=nodes value=on
shadow[test] # cibadmin --erase --force
shadow[test] # cibadmin --query
<cib cib_feature_revision="1" validate-with="pacemaker-1.0" admin_epoch="0" crm_feature_set="3.0" have-quorum="1" epoch="112"
dc-uuid="c001n01" num_updates="1" cib-last-written="Fri Jun 27 12:17:10 2008">
<configuration>
<crm_config/>
<nodes/>
<resources/>
<constraints/>
</configuration>
<status/>
</cib>
shadow[test] # crm_shadow --delete test --force
Now type Ctrl-D to exit the crm_shadow shell
shadow[test] # exit
# crm_shadow --which
No shadow instance provided
# cibadmin -Q
<cib cib_feature_revision="1" validate-with="pacemaker-1.0" admin_epoch="0" crm_feature_set="3.0" have-quorum="1" epoch="110"
dc-uuid="c001n01" num_updates="551">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<nvpair id="cib-bootstrap-1" name="stonith-enabled" value="1"/>
<nvpair id="cib-bootstrap-2" name="pe-input-series-max" value="30000"/>
--------
======
Making changes in a sandbox and verifying the real configuration is untouched
[[s-config-testing-changes]]
== Testing Your Configuration Changes ==
We saw previously how to make a series of changes to a "shadow" copy
of the configuration. Before loading the changes back into the
cluster (e.g. `crm_shadow --commit mytest --force`), it is often
advisable to simulate the effect of the changes with +crm_simulate+,
e.g.
[source,C]
# crm_simulate --live-check -VVVVV --save-graph tmp.graph --save-dotfile tmp.dot
The tool uses the same library as the live cluster to show what it
would have done given the supplied input. Its output, in addition to
a significant amount of logging, is stored in two files, +tmp.graph+
and +tmp.dot+; both are representations of the same thing -- the
cluster's response to your changes.
The graph file stores the complete transition, containing a list
of all the actions, their parameters and their prerequisites.
Because the transition graph is not terribly easy to read, the tool
also generates a Graphviz dot-file representing the same information.
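If Graphviz is installed, the dot-file can be rendered to an image for
easier reading, for example:
[source,C]
# dot -Tsvg tmp.dot -o tmp.svg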
== Interpreting the Graphviz output ==
* Arrows indicate ordering dependencies
* Dashed-arrows indicate dependencies that are not present in the transition graph
* Actions with a dashed border of any color do not form part of the transition graph
* Actions with a green border form part of the transition graph
* Actions with a red border are ones the cluster would like to execute but cannot run
* Actions with a blue border are ones the cluster does not feel need to be executed
* Actions with orange text are pseudo/pretend actions that the cluster uses to simplify the graph
* Actions with black text are sent to the LRM
* Resource actions have text of the form pass:[<replaceable>rsc</replaceable>]_pass:[<replaceable>action</replaceable>]_pass:[<replaceable>interval</replaceable>] pass:[<replaceable>node</replaceable>]
* Any action depending on an action with a red border will not be able to execute.
* Loops are _really_ bad. Please report them to the development team.
=== Small Cluster Transition ===
image::images/Policy-Engine-small.png["An example transition graph as represented by Graphviz",width="16cm",height="6cm",align="center"]
In the above example, it appears that a new node, +node2+, has come
online and that the cluster is checking to make sure +rsc1+, +rsc2+
and +rsc3+ are not already running there (indicated by the
+*_monitor_0+ entries). Once it did that, and assuming the resources
were not active there, it would have liked to stop +rsc1+ and +rsc2+
on +node1+ and move them to +node2+. However, there appears to be
some problem, and the cluster cannot or is not permitted to perform the
stop actions, which implies it also cannot perform the start actions.
For some reason the cluster does not want to start +rsc3+ anywhere.
For information on the options supported by `crm_simulate`, use
the `--help` option.
=== Complex Cluster Transition ===
image::images/Policy-Engine-big.png["Another, slightly more complex, transition graph that you're not expected to be able to read",width="16cm",height="20cm",align="center"]
== Do I Need to Update the Configuration on all Cluster Nodes? ==
No. Any changes are immediately synchronized to the other active
members of the cluster.
To reduce bandwidth, the cluster only broadcasts the incremental
updates that result from your changes and uses MD5 checksums to ensure
that each copy is completely consistent.
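If you want to convince yourself of this, you can compare the CIB header
on two nodes; the +epoch+ and +num_updates+ values should match (a quick
sanity check only, not something you normally need to do):
[source,C]
# cibadmin --query | head -n 1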
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt b/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt
index e259ee24ec..4c831db8a7 100644
--- a/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt
+++ b/doc/Pacemaker_Explained/en-US/Ch-Stonith.txt
@@ -1,307 +1,308 @@
[[ch-stonith]]
= Configure STONITH =
== What Is STONITH ==
STONITH is an acronym for Shoot-The-Other-Node-In-The-Head and it
protects your data from being corrupted by rogue nodes or concurrent
access.
Just because a node is unresponsive doesn't mean it isn't
accessing your data. The only way to be 100% sure that your data is
safe is to use STONITH, so we can be certain that the node is truly
offline before allowing the data to be accessed from another node.
STONITH also has a role to play in the event that a clustered service
cannot be stopped. In this case, the cluster uses STONITH to force the
whole node offline, thereby making it safe to start the service
elsewhere.
== What STONITH Device Should You Use ==
It is crucial that the STONITH device allows the cluster to
differentiate between a node failure and a network one.
The biggest mistake people make in choosing a STONITH device is to
use a remote power switch (such as many on-board IPMI controllers) that
shares power with the node it controls. In such cases, the cluster
cannot be sure whether the node is really offline, or active and suffering
from a network fault.
Likewise, any device that relies on the machine being active (such as
SSH-based "devices" used during testing) is inappropriate.
== Configuring STONITH ==
ifdef::pcs[]
. Find the correct driver: +pcs stonith list+
. Find the parameters associated with the device: +pcs stonith describe <agent name>+
. Create a local copy of the CIB to make changes to: +pcs cluster cib stonith_cfg+
. Create the fencing resource using +pcs -f stonith_cfg stonith create <stonith_id>
<stonith device type> [stonith device options]+
. Set stonith-enabled to true: +pcs -f stonith_cfg property set stonith-enabled=true+
-endif::[]
+endif::pcs[]
ifdef::crm[]
. Find the correct driver: +stonith_admin --list-installed+
. Since every device is different, the parameters needed to configure
it will vary. To find out the parameters associated with the device,
run: +stonith_admin --metadata --agent type+
The output should be XML-formatted text containing additional
parameter descriptions. We will endeavor to make the output more
friendly in a later version.
. Enter the shell: +crm+. Create an editable copy of the existing
configuration: +cib new stonith+. Then create a fencing resource
containing a primitive resource with a class of stonith, a type matching
the agent found in step 1, and a parameter for each of the values
returned in step 2: +configure primitive ...+
-endif::[]
+endif::crm[]
. If the device does not know how to fence nodes based on their uname,
you may also need to set the special +pcmk_host_map+ parameter. See
+man stonithd+ for details.
. If the device does not support the list command, you may also need
to set the special +pcmk_host_list+ and/or +pcmk_host_check+
parameters. See +man stonithd+ for details.
. If the device does not expect the victim to be specified with the
port parameter, you may also need to set the special
+pcmk_host_argument+ parameter. See +man stonithd+ for details.
ifdef::crm[]
. Upload it into the CIB from the shell: +cib commit stonith+
-endif::[]
+endif::crm[]
ifdef::pcs[]
. Commit the new configuration. +pcs cluster push cib stonith_cfg+
-endif::[]
+endif::pcs[]
. Once the stonith resource is running, you can test it by executing
+stonith_admin --reboot nodename+ (a concrete invocation is sketched
below), though you might want to stop the cluster on that machine first.
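For example, assuming a node named +pcmk-2+ (substitute one of your own
nodes):
[source,Bash]
----
# stonith_admin --reboot pcmk-2
----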
== Example ==
Assuming we have a chassis containing four nodes and an IPMI device
active on 10.0.0.1, we would choose the fence_ipmilan driver in step
2 and obtain the following list of parameters:
.Obtaining a list of STONITH Parameters
ifdef::pcs[]
[source,Bash]
----
# pcs stonith describe fence_ipmilan
Stonith options for: fence_ipmilan
auth: IPMI Lan Auth type (md5, password, or none)
ipaddr: IPMI Lan IP to talk to
passwd: Password (if required) to control power on IPMI device
passwd_script: Script to retrieve password (if required)
lanplus: Use Lanplus
login: Username/Login (if required) to control power on IPMI device
action: Operation to perform. Valid operations: on, off, reboot, status, list, diag, monitor or metadata
timeout: Timeout (sec) for IPMI operation
cipher: Ciphersuite to use (same as ipmitool -C parameter)
method: Method to fence (onoff or cycle)
power_wait: Wait X seconds after on/off operation
delay: Wait X seconds before fencing is started
privlvl: Privilege level on IPMI device
verbose: Verbose mode
----
-endif::[]
+endif::pcs[]
ifdef::crm[]
[source,C]
----
# stonith_admin --metadata -a fence_ipmilan
----
[source,XML]
----
<?xml version="1.0" ?>
<resource-agent name="fence_ipmilan" shortdesc="Fence agent for IPMI over LAN">
<longdesc>
fence_ipmilan is an I/O Fencing agent which can be used with machines controlled by IPMI. This agent calls support software using ipmitool (http://ipmitool.sf.net/).
To use fence_ipmilan with HP iLO 3 you have to enable lanplus option (lanplus / -P) and increase wait after operation to 4 seconds (power_wait=4 / -T 4)</longdesc>
<parameters>
<parameter name="auth" unique="1">
<getopt mixed="-A" />
<content type="string" />
<shortdesc lang="en">IPMI Lan Auth type (md5, password, or none)</shortdesc>
</parameter>
<parameter name="ipaddr" unique="1">
<getopt mixed="-a" />
<content type="string" />
<shortdesc lang="en">IPMI Lan IP to talk to</shortdesc>
</parameter>
<parameter name="passwd" unique="1">
<getopt mixed="-p" />
<content type="string" />
<shortdesc lang="en">Password (if required) to control power on IPMI device</shortdesc>
</parameter>
<parameter name="passwd_script" unique="1">
<getopt mixed="-S" />
<content type="string" />
<shortdesc lang="en">Script to retrieve password (if required)</shortdesc>
</parameter>
<parameter name="lanplus" unique="1">
<getopt mixed="-P" />
<content type="boolean" />
<shortdesc lang="en">Use Lanplus</shortdesc>
</parameter>
<parameter name="login" unique="1">
<getopt mixed="-l" />
<content type="string" />
<shortdesc lang="en">Username/Login (if required) to control power on IPMI device</shortdesc>
</parameter>
<parameter name="action" unique="1">
<getopt mixed="-o" />
<content type="string" default="reboot"/>
<shortdesc lang="en">Operation to perform. Valid operations: on, off, reboot, status, list, diag, monitor or metadata</shortdesc>
</parameter>
<parameter name="timeout" unique="1">
<getopt mixed="-t" />
<content type="string" />
<shortdesc lang="en">Timeout (sec) for IPMI operation</shortdesc>
</parameter>
<parameter name="cipher" unique="1">
<getopt mixed="-C" />
<content type="string" />
<shortdesc lang="en">Ciphersuite to use (same as ipmitool -C parameter)</shortdesc>
</parameter>
<parameter name="method" unique="1">
<getopt mixed="-M" />
<content type="string" default="onoff"/>
<shortdesc lang="en">Method to fence (onoff or cycle)</shortdesc>
</parameter>
<parameter name="power_wait" unique="1">
<getopt mixed="-T" />
<content type="string" default="2"/>
<shortdesc lang="en">Wait X seconds after on/off operation</shortdesc>
</parameter>
<parameter name="delay" unique="1">
<getopt mixed="-f" />
<content type="string" />
<shortdesc lang="en">Wait X seconds before fencing is started</shortdesc>
</parameter>
<parameter name="verbose" unique="1">
<getopt mixed="-v" />
<content type="boolean" />
<shortdesc lang="en">Verbose mode</shortdesc>
</parameter>
</parameters>
<actions>
<action name="on" />
<action name="off" />
<action name="reboot" />
<action name="status" />
<action name="diag" />
<action name="list" />
<action name="monitor" />
<action name="metadata" />
</actions>
</resource-agent>
----
-endif::[]
+endif::crm[]
from which we would create a STONITH resource fragment that might look
like this
.Sample STONITH Resource
ifdef::pcs[]
[source,Bash]
----
# pcs cluster cib stonith_cfg
# pcs -f stonith_cfg stonith create impi-fencing fence_ipmilan \
pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser \
passwd=acd123 op monitor interval=60s
# pcs -f stonith_cfg stonith
impi-fencing (stonith:fence_ipmilan) Stopped
----
-endif::[]
+endif::pcs[]
ifdef::crm[]
[source,Bash]
----
# crm
crm(live)# cib new stonith
INFO: stonith shadow CIB created
crm(stonith)# configure primitive impi-fencing stonith::fence_ipmilan \
params pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser passwd=abc123 \
op monitor interval="60s"
----
-endif::[]
+endif::crm[]
And finally, since we disabled it earlier, we need to re-enable STONITH.
At this point we should have the following configuration.
ifdef::pcs[]
[source,Bash]
----
# pcs -f stonith_cfg property set stonith-enabled=true
# pcs -f stonith_cfg property
dc-version: 1.1.8-1.el7-60a19ed12fdb4d5c6a6b6767f52e5391e447fec0
cluster-infrastructure: corosync
no-quorum-policy: ignore
stonith-enabled: true
----
+endif::pcs[]
Now push the configuration into the cluster.
ifdef::pcs[]
[source,C]
----
# pcs cluster push cib stonith_cfg
----
-endif::[]
+endif::pcs[]
ifdef::crm[]
[source,Bash]
----
crm(stonith)# configure property stonith-enabled="true"
crm(stonith)# configure show
node pcmk-1
node pcmk-2
primitive WebData ocf:linbit:drbd \
params drbd_resource="wwwdata" \
op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.122.101" cidr_netmask="32" clusterip_hash="sourceip" \
op monitor interval="30s"primitive ipmi-fencing stonith::fence_ipmilan \ params pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser passwd=abc123 \ op monitor interval="60s"ms WebDataClone WebData \
meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone WebFSClone WebFS
clone WebIP ClusterIP \
meta globally-unique="true" clone-max="2" clone-node-max="2"
clone WebSiteClone WebSite
colocation WebSite-with-WebFS inf: WebSiteClone WebFSClone
colocation fs_on_drbd inf: WebFSClone WebDataClone:Master
colocation website-with-ip inf: WebSiteClone WebIP
order WebFS-after-WebData inf: WebDataClone:promote WebFSClone:start
order WebSite-after-WebFS inf: WebFSClone WebSiteClone
order apache-after-ip inf: WebIP WebSiteClone
property $id="cib-bootstrap-options" \
dc-version="1.1.5-bdd89e69ba545404d02445be1f3d72e6a203ba2f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="true" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
crm(stonith)# cib commit stonith
INFO: commited 'stonith' shadow CIB to the cluster
crm(stonith)# quit
bye
----
-endif::[]
+endif::crm[]
