diff --git a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt
new file mode 100644
index 0000000000..5881cf8a30
--- /dev/null
+++ b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.txt
@@ -0,0 +1,657 @@
+= Advanced Configuration =
+
+== Connecting from a Remote Machine ==
+indexterm:[Remote,connect]
+indexterm:[Remote,administration]
+anchor:s-remote-connection[Connecting from a Remote Machine]
+
+Provided Pacemaker is installed on a machine, it is possible to
+connect to the cluster even if the machine itself is not a member of
+the cluster. To do this, one simply sets a few environment
+variables and runs the same commands as when working on a cluster
+node.
+
+.Environment Variables Used to Connect to Remote Instances of the CIB
+[width="95%",cols="1m,4<",options="header",align="center"]
+|=========================================================
+
+|Environment Variable
+|Description
+indexterm:[Environment Variable,Remote Administration]
+
+|CIB_user
+
+|The user to connect as. Needs to be part of the +hacluster+ group on
+ the target host. Defaults to _$USER_.
+ indexterm:[CIB_*, Env. Var. for Remote Conn.,user]
+ indexterm:[Environment Variable,CIB_,user]
+
+|CIB_passwd
+|The user's password. Read from the command line if unset.
+ indexterm:[CIB_*, Env. Var. for Remote Conn.,passwd]
+ indexterm:[Environment Variable,CIB_,passwd]
+
+|CIB_server
+|The host to contact. Defaults to _localhost_.
+ indexterm:[CIB_*, Env. Var. for Remote Conn.,server]
+ indexterm:[Environment Variable,CIB_,server]
+
+|CIB_port
+|The port on which to contact the server; required.
+ indexterm:[CIB_*, Env. Var. for Remote Conn.,port]
+ indexterm:[Environment Variable,CIB_,port]
+
+|CIB_encrypted
+|Encrypt network traffic; defaults to _true_.
+ indexterm:[CIB_*, Env. Var. for Remote Conn.,encrypted]
+ indexterm:[Environment Variable,CIB_,encrypted]
+
+|=========================================================
+
+So, if +c001n01+ is an active cluster node and is listening on +1234+
+for connections, and +someguy+ is a member of the +hacluster+ group,
+then the following would prompt for +someguy+'s password and return
+the cluster's current configuration:
+
+[source,Bash]
+export CIB_port=1234; export CIB_server=c001n01; export CIB_user=someguy;
+cibadmin -Q
+
+For security reasons, the cluster does not listen for remote
+connections by default. If you wish to allow remote access, you need
+to set the +remote-tls-port+ (encrypted) or +remote-clear-port+
+(unencrypted) top-level options (i.e., those kept in the +cib+ tag,
+like +num_updates+ and +epoch+).
+
+indexterm:[Remote,connect, CIB options]
+
+.Extra top-level CIB options for remote access
+[width="95%",cols="1m,2<",options="header",align="center"]
+|=========================================================
+
+|Field
+|Description
+
+|remote-tls-port
+|Listen for encrypted remote connections on this port. Default: _none_
+ indexterm:[remote-tls-port]
+
+|remote-clear-port
+|Listen for plaintext remote connections on this port. Default: _none_
+ indexterm:[remote-clear-port]
+
+|=========================================================
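+
+Since these are top-level options, one way to set them is directly on
+the +cib+ tag. A minimal sketch, assuming you want encrypted
+connections on port 1234 (the port number is illustrative):
+
+[source,Bash]
+cibadmin --modify --xml-text '<cib remote-tls-port="1234"/>'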
+
+== Specifying When Recurring Actions are Performed ==
+
+anchor:s-recurring-start[Specifying When Recurring Actions are Performed]
+
+By default, recurring actions are scheduled relative to when the
+resource started. So if your resource was last started at 14:32 and
+you have a backup set to be performed every 24 hours, then the backup
+will always run in the middle of the business day, which is hardly
+desirable.
+
+To specify a date/time that the operation should be relative to, set
+the operation's +interval-origin+. The cluster uses this point to
+calculate the correct +start-delay+ such that the operation will occur
+at _origin + (interval * N)_.
+
+So, if the operation's interval is 24h, its +interval-origin+ is set
+to +02:00+ and it is currently +14:32+, then the cluster would
+initiate the operation with a start delay of 11 hours and 28 minutes.
+If the resource is moved to another node before 2am, then the
+operation is of course cancelled.
+
+The value specified for +interval+ and +interval-origin+ can be any
+date/time conforming to the
+http://en.wikipedia.org/wiki/ISO_8601[ISO8601 standard]. By way of
+example, to specify an operation that would run on the first Monday of
+2009 and every Monday after that you would add:
+
+.Specifying a Base for Recurring Action Intervals
+[source,XML]
+<op id="my-weekly-action" name="custom-action" interval="P7D" interval-origin="2009-W01-1"/>
+
+== Moving Resources ==
+indexterm:[Moving Resources] indexterm:[Resource,Moving]
+
+=== Manual Intervention ===
+
+There are primarily two occasions when you would want to move a
+resource from its current location: when the whole node is under
+maintenance, and when a single resource needs to be moved.
+
+Since everything eventually comes down to a score, you could create
+constraints for every resource to prevent them from running on one
+node. While the configuration can seem convoluted at times, not even
+we would require this of administrators.
+
+Instead one can set a special node attribute which tells the cluster
+"don't let anything run here". There is even a helpful tool,
+`crm_standby`, for querying and setting it. To check the standby
+status of the current machine, simply run:
+
+[source,Bash]
+crm_standby --get-value
+
+A value of +true+ indicates that the node is _NOT_ able to host any
+resources, while a value of +false+ says that it _CAN_.
+
+You can also check the status of other nodes in the cluster by
+specifying the `--node-uname` option:
+
+[source,Bash]
+crm_standby --get-value --node-uname sles-2
+
+To change the current node's standby status, use `--attr-value`
+instead of `--get-value`.
+
+[source,Bash]
+crm_standby --attr-value
+
+Again, you can change another host's value by supplying a host name with `--node-uname`.
+
+When only one resource needs to move, we do this by creating
+location constraints. However, once again we provide a user-friendly
+shortcut as part of the `crm_resource` command, which creates and
+modifies the extra constraints for you. If +Email+ was running on
++sles-1+ and you wanted it moved to a specific location, the command
+would look something like:
+
+[source,Bash]
+crm_resource -M -r Email -H sles-2
+
+Behind the scenes, the tool will create the following location constraint:
+
+[source,XML]
+<rsc_location rsc="Email" node="sles-2" score="INFINITY"/>
+
+It is important to note that subsequent invocations of `crm_resource
+-M` are not cumulative. So, if you ran these commands
+
+[source,Bash]
+crm_resource -M -r Email -H sles-2
+crm_resource -M -r Email -H sles-3
+
+then it is as if you had never performed the first command.
+
+To allow the resource to move back again, use:
+
+[source,Bash]
+crm_resource -U -r Email
+
+Note the use of the word _allow_. The resource can move back to its
+original location but, depending on +resource-stickiness+, it might
+stay where it is. To be absolutely certain that it moves back to
++sles-1+, move it there before issuing the call to `crm_resource -U`:
+
+[source,Bash]
+crm_resource -M -r Email -H sles-1
+crm_resource -U -r Email
+
+Alternatively, if you only care that the resource should be moved from
+its current location, try
+
+[source,Bash]
+crm_resource -M -r Email
+
+This will instead create a negative constraint, like:
+
+[source,XML]
+<rsc_location rsc="Email" node="sles-1" score="-INFINITY"/>
+
+This will achieve the desired effect, but will also have long-term
+consequences. As the tool will warn you, the creation of a
++-INFINITY+ constraint will prevent the resource from running on that
+node until `crm_resource -U` is used. This includes the situation
+where every other cluster node is no longer available!
+
+In some cases, such as when +resource-stickiness+ is set to
++INFINITY+, it is possible that you will end up with the problem
+described in xref:node-score-equal[]. The tool can detect
+some of these cases and deals with them by creating both a
+positive and a negative constraint. E.g.
+
++Email+ prefers +sles-1+ with a score of +-INFINITY+
+
++Email+ prefers +sles-2+ with a score of +INFINITY+
+
+which has the same long-term consequences as discussed earlier.
+
+=== Moving Resources Due to Failure ===
+
+anchor:s-failure-migration[Moving Resources Due to Failure]
+
+New in 1.0 is the concept of a migration threshold.
+footnote:[
+The naming of this option was perhaps unfortunate as it is easily
+confused with true migration, the process of moving a resource from
+one node to another without stopping it. Xen virtual guests are the
+most common example of resources that can be migrated in this manner.
+]
+
+Simply define +migration-threshold=N+ for a resource and it will
+migrate to a new node after N failures. There is no threshold defined
+by default. To determine the resource's current failure status and
+limits, use `crm_mon --failcounts`.
+
+By default, once the threshold has been reached, this node will no
+longer be allowed to run the failed resource until the administrator
+manually resets the resource's failcount using `crm_failcount` (after
+hopefully first fixing the failure's cause). However, it is possible
+to expire failcounts automatically by setting the resource's
++failure-timeout+ option.
+
+So a setting of +migration-threshold=2+ and +failure-timeout=60s+
+would cause the resource to move to a new node after 2 failures, and
+allow it to move back (depending on the stickiness and constraint
+scores) after one minute.
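+
+As a sketch, both options from the paragraph above could be set as
+meta-attributes with `crm_resource` (the resource name +myRsc+ is
+illustrative):
+
+[source,Bash]
+crm_resource --meta -r myRsc --set-parameter migration-threshold --parameter-value 2
+crm_resource --meta -r myRsc --set-parameter failure-timeout --parameter-value 60s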
+
+There are two exceptions to the migration threshold concept; they
+occur when a resource either fails to start or fails to stop. Start
+failures cause the failcount to be set to +INFINITY+ and thus always
+cause the resource to move immediately.
+
+Stop failures are slightly different and crucial. If a resource fails
+to stop and STONITH is enabled, then the cluster will fence the node
+in order to be able to start the resource elsewhere. If STONITH is
+not enabled, then the cluster has no way to continue and will not try
+to start the resource elsewhere, but will try to stop it again after
+the failure timeout.
+
+[IMPORTANT]
+Please read xref:s-rules-recheck[] before enabling this option.
+
+=== Moving Resources Due to Connectivity Changes ===
+
+Setting up the cluster to move resources when external connectivity is
+lost is a two-step process.
+
+==== Tell Pacemaker to monitor connectivity ====
+
+
+To do this, you need to add a +ping+ resource to the cluster. The
++ping+ resource uses the system utility of the same name to test
+whether a list of machines (specified by DNS hostname or IPv4/IPv6
+address) are reachable, and uses the results to maintain a node
+attribute normally called +pingd+.
+footnote:[
+The attribute name is customizable; that allows multiple ping groups to be defined.
+]
+
+[NOTE]
+Older versions of Heartbeat required users to add ping nodes to _ha.cf_ - this is no longer required.
+
+[IMPORTANT]
+===========
+Older versions of Pacemaker used a custom binary called 'pingd' for
+this functionality; this is now deprecated in favor of 'ping'.
+
+If your version of Pacemaker does not contain the ping agent, you can
+download the latest version from
+https://github.com/ClusterLabs/pacemaker/tree/master/extra/resources/ping
+===========
+
+Normally the resource will run on all cluster nodes, which means that
+you'll need to create a clone. A template for this can be found below
+along with a description of the most interesting parameters.
+
+.Common Options for a 'ping' Resource
+[width="95%",cols="1m,4<",options="header",align="center"]
+|=========================================================
+
+|Field
+|Description
+
+|dampen
+|The time to wait (dampening) for further changes to occur. Use this
+ to prevent a resource from bouncing around the cluster when cluster
+ nodes notice the loss of connectivity at slightly different times.
+ indexterm:[dampen,Resource Option]
+ indexterm:[Resource,Option,dampen]
+
+|multiplier
+|The number of connected ping nodes gets multiplied by this value to
+ get a score. Useful when there are multiple ping nodes configured.
+ indexterm:[multiplier,Resource Option]
+ indexterm:[Resource,Option,multiplier]
+
+|host_list
+|The machines to contact in order to determine the current
+ connectivity status. Allowed values include resolvable DNS host
+ names, IPv4 and IPv6 addresses.
+ indexterm:[host_list,Resource Option]
+ indexterm:[Resource,Option,host_list]
+
+|=========================================================
+
+An example ping cluster resource that checks node connectivity once every minute:
+[source,XML]
+------------
+<clone id="Connected">
+   <primitive id="ping" provider="pacemaker" class="ocf" type="ping">
+    <instance_attributes id="ping-attrs">
+      <nvpair id="pingd-dampen" name="dampen" value="5s"/>
+      <nvpair id="pingd-multiplier" name="multiplier" value="1000"/>
+      <nvpair id="pingd-hosts" name="host_list" value="my.gateway.com www.bigcorp.com"/>
+    </instance_attributes>
+    <operations>
+      <op id="ping-monitor-60s" interval="60s" name="monitor"/>
+    </operations>
+   </primitive>
+</clone>
+------------
+
+[IMPORTANT]
+===========
+You're only half done. The next section deals with telling Pacemaker
+how to deal with the connectivity status that +ocf:pacemaker:ping+ is
+recording.
+===========
+
+==== Tell Pacemaker how to interpret the connectivity data ====
+
+[NOTE]
+======
+Before reading the following, please make sure you have read and
+understood xref:ch-rules[] above.
+======
+
+There are a number of ways to use the connectivity data provided by
+the +ping+ resource. The most common setup is for people to have a
+single ping node, to prevent the cluster from running a resource on
+any unconnected node.
+
+////
+TODO: is the idea that only nodes that can reach eg. the router should have active resources?
+////
+
+.Don't run on unconnected nodes
+[source,XML]
+-------
+<rsc_location id="WebServer-no-connectivity" rsc="Webserver">
+   <rule id="ping-exclude-rule" score="-INFINITY" >
+    <expression id="ping-exclude" attribute="pingd" operation="not_defined"/>
+   </rule>
+</rsc_location>
+-------
+
+A more complex setup is to have a number of ping nodes configured.
+You can require the cluster to only run resources on nodes that can
+connect to all (or a minimum subset) of them.
+
+.Run only on nodes connected to three or more ping nodes; this assumes +multiplier+ is set to 1000
+[source,XML]
+-------
+<rsc_location id="WebServer-connectivity" rsc="Webserver">
+   <rule id="ping-prefer-rule" score="-INFINITY" >
+    <expression id="ping-prefer" attribute="pingd" operation="lt" value="3000"/>
+   </rule>
+</rsc_location>
+-------
+
+Instead you can tell the cluster only to _prefer_ nodes with the best
+connectivity. Just be sure to set +multiplier+ to a value higher than
+that of +resource-stickiness+ (and don't set either of them to
++INFINITY+).
+
+.Prefer the node with the most connected ping nodes
+[source,XML]
+-------
+<rsc_location id="WebServer-connectivity" rsc="Webserver">
+   <rule id="ping-prefer-rule" score-attribute="pingd" >
+    <expression id="ping-prefer" attribute="pingd" operation="defined"/>
+   </rule>
+</rsc_location>
+-------
+
+It is perhaps easier to think of this in terms of the simple
+constraints that the cluster translates it into. For example, if
++sles-1+ is connected to all 5 ping nodes but +sles-2+ is only
+connected to 2, then it would be as if you instead had the following
+constraints in your configuration:
+
+.How the cluster translates the pingd constraint
+[source,XML]
+-------
+<rsc_location id="ping-1" rsc="Webserver" node="sles-1" score="5000"/>
+<rsc_location id="ping-2" rsc="Webserver" node="sles-2" score="2000"/>
+-------
+
+The advantage is that you don't have to manually update any
+constraints whenever your network connectivity changes.
+
+You can also combine the concepts above into something even more
+complex. The example below shows how you can prefer the node with the
+most connected ping nodes provided they have connectivity to at least
+three (again assuming that +multiplier+ is set to 1000).
+
+.A more complex example of choosing a location based on connectivity
+[source,XML]
+-------
+<rsc_location id="WebServer-connectivity" rsc="Webserver">
+   <rule id="ping-exclude-rule" score="-INFINITY" >
+    <expression id="ping-exclude" attribute="pingd" operation="lt" value="3000"/>
+   </rule>
+   <rule id="ping-prefer-rule" score-attribute="pingd" >
+    <expression id="ping-prefer" attribute="pingd" operation="defined"/>
+   </rule>
+</rsc_location>
+-------
+
+=== Resource Migration ===
+
+Some resources, such as Xen virtual guests, are able to move to
+another location without loss of state. We call this resource
+migration; this is different from the normal practice of stopping the
+resource on the first machine and starting it elsewhere.
+
+Not all resources are able to migrate (see the Migration Checklist
+below), and those that can won't do so in all situations.
+Conceptually there are two requirements from which the other
+prerequisites follow:
+
+* the resource must be active and healthy at the old location
+* everything required for the resource to run must be available on
+ both the old and new locations
+
+The cluster is able to accommodate both push and pull migration models
+by requiring the resource agent to support two new actions:
++migrate_to+ (performed on the current location) and +migrate_from+
+(performed on the destination).
+
+In push migration, the process on the current location transfers the
+resource to the new location where it is later activated. In this
+scenario, most of the work would be done in the +migrate_to+ action
+and, if anything, the activation would occur during +migrate_from+.
+
+Conversely for pull, the +migrate_to+ action is practically empty and
++migrate_from+ does most of the work, extracting the relevant resource
+state from the old location and activating it.
+
+There is no wrong or right way to implement migration for your
+service, as long as it works.
+
+==== Migration Checklist ====
+
+* The resource may not be a clone.
+* The resource must use an OCF style agent.
+* The resource must not be in a failed or degraded state.
+* The resource must not, directly or indirectly, depend on any
+ primitive or group resources.
+* The resource must support two new actions: +migrate_to+ and
+ +migrate_from+, and advertise them in its metadata.
+* The resource must have the +allow-migrate+ meta-attribute set to
+ +true+ (which is not the default).
+
+////
+TODO: how can a KVM with DRBD migrate?
+////
+
+If the resource depends on a clone, and at the time the resource
+needs to be moved, the clone has instances that are stopping and
+instances that are starting, then the resource will be moved in the
+traditional manner. The Policy Engine is not yet able to model this
+situation correctly and so takes the safe (yet less optimal) path.
+
+== Reusing Rules, Options and Sets of Operations ==
+
+anchor:s-reusing-config-elements[Reusing Rules, Options and Sets of Operations]
+Sometimes a number of constraints need to use the same set of rules,
+and resources need to set the same options and parameters. To
+simplify this situation, you can refer to an existing object using an
++id-ref+ instead of an id.
+
+So if for one resource you have
+
+[source,XML]
+------
+<rsc_location id="WebServer-connectivity" rsc="Webserver">
+   <rule id="ping-prefer-rule" score-attribute="pingd" >
+    <expression id="ping-prefer" attribute="pingd" operation="defined"/>
+   </rule>
+</rsc_location>
+------
+
+Then instead of duplicating the rule for all your other resources, you can specify:
+
+.Referencing rules from other constraints
+[source,XML]
+-------
+<rsc_location id="WebDB-connectivity" rsc="WebDB">
+      <rule id-ref="ping-prefer-rule"/>
+</rsc_location>
+-------
+
+[IMPORTANT]
+===========
+The cluster will insist that the +rule+ exists somewhere. Attempting
+to add a reference to a non-existing rule will cause a validation
+failure, as will attempting to remove a +rule+ that is referenced
+elsewhere.
+===========
+
+The same principle applies for +meta_attributes+ and
++instance_attributes+ as illustrated in the example below:
+
+.Referencing attributes, options, and operations from other resources
+[source,XML]
+-------
+<primitive id="mySpecialRsc" class="ocf" type="Special" provider="me">
+   <instance_attributes id="mySpecialRsc-attrs" score="1" >
+     <nvpair id="default-interface" name="interface" value="eth0"/>
+     <nvpair id="default-port" name="port" value="9999"/>
+   </instance_attributes>
+   <meta_attributes id="mySpecialRsc-options">
+     <nvpair id="failure-timeout" name="failure-timeout" value="5m"/>
+     <nvpair id="migration-threshold" name="migration-threshold" value="1"/>
+     <nvpair id="stickiness" name="resource-stickiness" value="0"/>
+   </meta_attributes>
+   <operations id="health-checks">
+     <op id="health-check" name="monitor" interval="60s"/>
+     <op id="health-check-deep" name="monitor" interval="30min"/>
+   </operations>
+</primitive>
+<primitive id="myOtherRsc" class="ocf" type="Other" provider="me">
+   <instance_attributes id-ref="mySpecialRsc-attrs"/>
+   <meta_attributes id-ref="mySpecialRsc-options"/>
+   <operations id-ref="health-checks"/>
+</primitive>
+-------
+
+== Reloading Services After a Definition Change ==
+
+The cluster automatically detects changes to the definition of
+services it manages. However, the normal response is to stop the
+service (using the old definition) and start it again (with the new
+definition). This works well, but some services are smarter and can
+be told to use a new set of options without restarting.
+
+To take advantage of this capability, your resource agent must:
+
+. Accept the +reload+ operation and perform any required actions.
+ _The steps required here depend completely on your application!_
++
+.The DRBD Agent's Control logic for Supporting the +reload+ Operation
+[source,Bash]
+-------
+case $1 in
+ start)
+ drbd_start
+ ;;
+ stop)
+ drbd_stop
+ ;;
+ reload)
+ drbd_reload
+ ;;
+ monitor)
+ drbd_monitor
+ ;;
+ *)
+ drbd_usage
+ exit $OCF_ERR_UNIMPLEMENTED
+ ;;
+esac
+exit $?
+-------
+. Advertise the +reload+ operation in the +actions+ section of its metadata.
+  This is how the DRBD agent advertises support for the +reload+ operation:
++
+.The DRBD Agent Advertising Support for the +reload+ Operation
+[source,XML]
+-------
+<?xml version="1.0"?>
+<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
+<resource-agent name="drbd">
+    <version>1.1</version>
+
+    <longdesc lang="en">
+      Master/Slave OCF Resource Agent for DRBD
+    </longdesc>
+
+    ...
+
+    <actions>
+        <action name="start"   timeout="240" />
+        <action name="reload"  timeout="240" />
+        <action name="promote" timeout="90" />
+        <action name="demote"  timeout="90" />
+        <action name="notify"  timeout="90" />
+        <action name="stop"    timeout="100" />
+        <action name="meta-data"  timeout="5" />
+        <action name="validate-all"  timeout="30" />
+    </actions>
+</resource-agent>
+-------
+. Advertise one or more parameters that can take effect using +reload+.
++
+Any parameter with +unique+ set to 0 is eligible to be used in this way.
+For example, the +drbdconf+ parameter from the drbd agent:
++
+.Parameter that can be changed using reload
+[source,XML]
+-------
+<parameter name="drbdconf" unique="0">
+    <longdesc lang="en">Full path to the drbd.conf file.</longdesc>
+    <shortdesc lang="en">Path to drbd.conf</shortdesc>
+    <content type="string"/>
+</parameter>
+-------
+
+Once these requirements are satisfied, the cluster will automatically
+know to reload the resource (instead of restarting it) when a
+non-unique field changes.
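+
+For example, with the DRBD agent above, changing the non-unique
++drbdconf+ parameter would trigger a reload rather than a restart. A
+sketch, assuming a resource named +myDRBD+:
+
+[source,Bash]
+crm_resource -r myDRBD --set-parameter drbdconf --parameter-value /etc/drbd.conf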
+
+[NOTE]
+======
+The metadata is re-read when the resource is started. This may mean
+that the resource will be restarted the first time, even though you
+changed a parameter with +unique=0+.
+======
+
+[NOTE]
+======
+If both a unique and non-unique field are changed simultaneously, the
+resource will still be restarted.
+======
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.xml b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.xml
deleted file mode 100644
index cecc112b04..0000000000
--- a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Options.xml
+++ /dev/null
@@ -1,545 +0,0 @@
- Advanced Configuration
-
- Connecting from a Remote Machine
-
-
- Remote
- connect
-
- Remote
- administration
- Provided Pacemaker is installed on a machine, it is possible to connect to the cluster even if the machine itself is not in the same cluster.
- To do this, one simply sets up a number of environment variables and runs the same commands as when working on a cluster node.
-
-
- Environment Variables Used to Connect to Remote Instances of the CIB
-
-
-
-
-
- Environment VariableRemote Administration
- Environment Variable
- Description
-
-
- CIB_*, Env. Var. for Remote Conn.user
- Environment VariableCIB_user
- CIB_user
- The user to connect as. Needs to be part of the hacluster group on the target host. Defaults to $USER.
-
-
-
- CIB_*, Env. Var. for Remote Conn.passwd
- Environment VariableCIB_passwd
- CIB_passwd
- The user's password. Read from the command line if unset.
-
-
- CIB_*, Env. Var. for Remote Conn.server
- Environment VariableCIB_server
- CIB_server
- The host to contact. Defaults to localhost.
-
-
- CIB_*, Env. Var. for Remote Conn.port
- Environment VariableCIB_port
- CIB_port
- The port on which to contact the server; required.
-
-
- CIB_*, Env. Var. for Remote Conn.encrypted
- Environment VariableCIB_encrypted
- CIB_encrypted
- Encrypt network traffic; defaults to true.
-
-
-
-
-
- So, if c001n01 is an active cluster node and is listening on 1234 for connections, and someguy is a member of the hacluster group,
- then the following would prompt for someguy's password and return the cluster's current configuration:
-
-
- export CIB_port=1234; export CIB_server=c001n01; export CIB_user=someguy;
- cibadmin -Q
-
- For security reasons, the cluster does not listen for remote connections by default.
- If you wish to allow remote access, you need to set the remote-tls-port (encrypted) or remote-clear-port (unencrypted) top-level options (ie., those kept in the cib tag, like num_updates and epoch).
-
-
-
- Remoteconnect, CIB options
- Extra top-level CIB options for remote access
-
-
-
-
-
- Field
- Description
-
-
- remote-tls-port
- remote-tls-port
- Listen for encrypted remote connections on this port. Default: none
-
-
- remote-clear-port
- remote-clear-port
- Listen for plaintext remote connections on this port. Default: none
-
-
-
-
-
-
- Specifying When Recurring Actions are Performed
-
- By default, recurring actions are scheduled relative to when the resource started.
- So if your resource was last started at 14:32 and you have a backup set to be performed every 24 hours, then the backup will always run at in the middle of the business day - hardly desirable.
-
-
- To specify a date/time that the operation should be relative to, set the operation's interval-origin.
- The cluster uses this point to calculate the correct start-delay such that the operation will occur at origin + (interval * N).
-
-
- So, if the operation's interval is 24h, it's interval-origin is set to 02:00 and it is currently 14:32, then the cluster would initiate the operation with a start delay of 11 hours and 28 minutes.
- If the resource is moved to another node before 2am, then the operation is of course cancelled.
-
-
- The value specified for interval and interval-origin can be any date/time conforming to the ISO8601 standard.
- By way of example, to specify an operation that would run on the first Monday of 2009 and every Monday after that you would add:
-
-
- Specifying a Base for Recurring Action Intervals
- <op id="my-weekly-action" name="custom-action" interval="P7D" interval-origin="2009-W01-1"/>
-
-
-
- Moving Resources
- Moving Resources
- ResourceMoving
-
- Manual Intervention
- There are primarily two occasions when you would want to move a resource from it's current location: when the whole node is under maintenance, and when a single resource needs to be moved.
-
- Since everything eventually comes down to a score, you could create constraints for every resource to prevent them from running on one node.
- While the configuration can seem convoluted at times, not even we would require this of administrators.
-
-
- Instead one can set a special node attribute which tells the cluster "don't let anything run here".
- There is even a helpful tool to help query and set it, called crm_standby.
- To check the standby status of the current machine, simply run:
-
- crm_standby --get-value
-
- A value of true indicates that the node is NOT able to host any resources, while a value of false says that it CAN.
-
-
- You can also check the status of other nodes in the cluster by specifying the --node-uname option:
-
- crm_standby --get-value --node-uname sles-2
- To change the current node's standby status, use --attr-value instead of --get-value.
- crm_standby --attr-value
- Again, you can change another host's value by supplying a host name with --node-uname.
-
- When only one resource is required to move, we do this by creating location constraints.
- However, once again we provide a user friendly shortcut as part of the crm_resource command, which creates and modifies the extra constraints for you.
- If Email was running on sles-1 and you wanted it moved to a specific location, the command would look something like:
-
- crm_resource -M -r Email -H sles-2
- Behind the scenes, the tool will create the following location constraint:
- <rsc_location rsc="Email" node="sles-2" score="INFINITY"/>
- It is important to note that subsequent invocations of crm_resource -M are not cumulative. So, if you ran these commands
- crm_resource -M -r Email -H sles-2
- crm_resource -M -r Email -H sles-3
- then it is as if you had never performed the first command.
- To allow the resource to move back again, use:
- crm_resource -U -r Email
-
- Note the use of the word allow.
- The resource can move back to its original location but, depending on resource-stickiness, it might stay where it is.
- To be absolutely certain that it moves back to sles-1, move it there before issuing the call to crm_resource -U:
-
- crm_resource -M -r Email -H sles-1
- crm_resource -U -r Email
- Alternatively, if you only care that the resource should be moved from its current location, try
- crm_resource -M -r Email
- Which will instead create a negative constraint, like
- <rsc_location rsc="Email" node="sles-1" score="-INFINITY"/>
-
- This will achieve the desired effect, but will also have long-term consequences.
- As the tool will warn you, the creation of a -INFINITY constraint will prevent the resource from running on that node until crm_resource -U is used.
- This includes the situation where every other cluster node is no longer available!
-
-
- In some cases, such as when resource-stickiness is set to INFINITY, it is possible that you will end up with the problem described in .
- The tool can detect some of these cases and deals with them by also creating both a positive and negative constraint. Eg.
-
- Email prefers sles-1 with a score of -INFINITY
- Email prefers sles-2 with a score of INFINITY
- which has the same long-term consequences as discussed earlier.
-
-
- Moving Resources Due to Failure
- New in 1.0 is the concept of a migration threshold
-
- The naming of this option was unfortunate as it is easily confused with true migration, the process of moving a resource from one node to another without stopping it.
- Xen virtual guests are the most common example of resources that can be migrated in this manner.
-
- .
- Simply define migration-threshold=N for a resource and it will migrate to a new node after N failures.
- There is no threshold defined by default.
- To determine the resource's current failure status and limits, use crm_mon --failcounts.
-
-
- By default, once the threshold has been reached, this node will no longer be allowed to run the failed resource until the administrator manually resets the resource's failcount using crm_failcount (after hopefully first fixing the failure's cause).
- However it is possible to expire them by setting the resource's failure-timeout option.
-
- So a setting of migration-threshold=2 and failure-timeout=60s would cause the resource to move to a new node after 2 failures, and allow it to move back (depending on the stickiness and constraint scores) after one minute.
-
- There are two exceptions to the migration threshold concept; they occur when a resource either fails to start or fails to stop.
- Start failures cause the failcount to be set to INFINITY and thus always cause the resource to move immediately.
-
-
- Stop failures are slightly different and crucial.
- If a resource fails to stop and STONITH is enabled, then the cluster will fence the node in order to be able to start the resource elsewhere.
- If STONITH is not enabled, then the cluster has no way to continue and will not try to start the resource elsewhere, but will try to stop it again after the failure timeout.
-
- Please read before enabling this option.
-
-
- Moving Resources Due to Connectivity Changes
- Setting up the cluster to move resources when external connectivity is lost is a two-step process.
-
- Tell Pacemaker to monitor connectivity
-
- To do this, you need to add a ping resource to the cluster.
- The ping resource uses the system utility of the same name to a test if list of machines (specified by DNS hostname or IPv4/IPv6 address) are reachable and uses the results to maintain a node attribute normally called pingd
- The attribute name is customizable; that allows multiple ping groups to be defined.
- .
-
- Older versions of Heartbeat required users to add ping nodes to ha.cf - this is no longer required.
-
-
-
- Older versions of Pacemaker used a custom binary called pingd for this functionality; this is now deprecated in favor of ping.
- If your version of Pacemaker does not contain the ping agent, you can download the latest version.
-
-
-
-
- Normally the resource will run on all cluster nodes, which means that you'll need to create a clone.
- A template for this can be found below along with a description of the most interesting parameters.
-
-
- Common Options for a 'ping' Resource
-
-
-
-
-
- Field
- Description
-
-
-
-
-
- dampenResource Option
- ResourceOptiondampen
- dampen
- The time to wait (dampening) for further changes to occur. Use this to prevent a resource from bouncing around the cluster when cluster nodes notice the loss of connectivity at slightly different times.
-
-
-
- multiplierResource Option
- ResourceOptionmultiplier
- multiplier
- The number of connected ping nodes gets multiplied by this value to get a score. Useful when there are multiple ping nodes configured.
-
-
-
- host_listResource Option
- ResourceOptionhost_list
- host_list
- The machines to contact in order to determine the current connectivity status. Allowed values include resolvable DNS host names, IPv4 and IPv6 addresses.
-
-
-
-
-
- An example ping cluster resource, checks node connectivity once every minute
-
-
-
-
-
-
-
-
-
-
-
- ]]>
-
-
-
- You're only half done.
- The next section deals with telling Pacemaker how to deal with the connectivity status that ocf:pacemaker:ping is recording.
-
-
-
-
-
- Tell Pacemaker how to interpret the connectivity data
- NOTE: Before reading the following, please make sure you have read and understood above.
-
- There are a number of ways to use the connectivity data provided by Heartbeat.
- The most common setup is for people to have a single ping node, to prevent the cluster from running a resource on any unconnected node.
- TODO: is the idea that only nodes that can reach eg. the router should have active resources?
-
-
- Don't run on unconnected nodes
-
-
-
-
- ]]>
-
-
- A more complex setup is to have a number of ping nodes configured.
- You can require the cluster to only run resources on nodes that can connect to all (or a minimum subset) of them.
-
-
- Run only on nodes connected to three or more ping nodes; this assumes multiplier is set to 1000.
-
-
-
-
- ]]>
-
-
- Instead you can tell the cluster only to prefer nodes with the best connectivity.
- Just be sure to set multiplier to a value higher than that of resource-stickiness (and don't set either of them to INFINITY).
-
-
- Prefer the node with the most connected ping nodes
-
-
-
-
- ]]>
-
-
- It is perhaps easier to think of this in terms of the simple constraints that the cluster translates it into.
- For example, if sles-1 is connected to all 5 ping nodes but sles-2 is only connected to 2, then it would be as if you instead had the following constraints in your configuration:
-
-
- How the cluster translates the pingd constraint
-
- ]]>
-
- The advantage is that you don't have to manually update any constraints whenever your network connectivity changes.
-
- You can also combine the concepts above into something even more complex.
- The example below shows how you can prefer the node with the most connected ping nodes provided they have connectivity to at least three (again assuming that multiplier is set to 1000).
-
-
- A more complex example of choosing a location based on connectivity
-
-
-
-
-
-
-
- ]]>
-
-
-
-
- Resource Migration
-
- Some resources, such as Xen virtual guests, are able to move to another location without loss of state.
- We call this resource migration; this is different from the normal practice of stopping the resource on the first machine and starting it elsewhere.
-
-
- Not all resources are able to migrate, see the Migration Checklist below, and those that can, won't do so in all situations.
- Conceptually there are two requirements from which the other prerequisites follow:
-
- the resource must be active and healthy at the old location
- everything required for the resource to run must be available on both the old and new locations
-
-
- The cluster is able to accommodate both push and pull migration models by requiring the resource agent to support two new actions: migrate_to (performed on the current location) and migrate_from (performed on the destination).
-
- In push migration, the process on the current location transfers the resource to the new location where is it later activated.
- In this scenario, most of the work would be done in the migrate_to action and, if anything, the activation would occur during migrate_from.
-
- Conversely for pull, the migrate_to action is practically empty and migrate_from does most of the work, extracting the relevant resource state from the old location and activating it.
- There is no wrong or right way to implement migration for your service, as long as it works.
-
- Migration Checklist
-
- The resource may not be a clone.
- The resource must use an OCF style agent.
- The resource must not be in a failed or degraded state.
- The resource must not, directly or indirectly, depend on any primitive or group resources. TODO: how can a KVM with DRBD migrate?
- The resource must support two new actions: migrate_to and migrate_from, and advertise them in its metadata.
- The resource must have the allow-migrate meta-attribute set to true (which is not the default).
-
-
- If the resource depends on a clone, and at the time the resource needs to be move, the clone has instances that are stopping and instances that are starting, then the resource will be moved in the traditional manner.
- The Policy Engine is not yet able to model this situation correctly and so takes the safe (yet less optimal) path.
-
-
-
-
-
- Reusing Rules, Options and Sets of Operations
-
- Sometimes a number of constraints need to use the same set of rules, and resources need to set the same options and parameters.
- To simplify this situation, you can refer to an existing object using an id-ref instead of an id.
-
- So if for one resource you have
-
-
-
-
- ]]>
- Then instead of duplicating the rule for all your other resources, you can instead specify
-
- Referencing rules from other constraints
-
-
- ]]>
-
-
-
- The cluster will insist that the rule exists somewhere.
- Attempting to add a reference to a non-existing rule will cause a validation failure, as will attempting to remove a rule that is referenced elsewhere.
-
-
-
- The same principle applies for meta_attributes and instance_attributes as illustrated in the example below
-
- Referencing attributes, options, and operations from other resources
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- ]]>
-
-
-
-
- Reloading Services After a Definition Change
-
- The cluster automatically detects changes to the definition of services it manages.
- However, the normal response is to stop the service (using the old definition) and start it again (with the new definition).
- This works well, but some services are smarter and can be told to use a new set of options without restarting.
-
-
- To take advantage of this capability, your resource agent must:
-
-
- Accept the reload operation and perform any required actions.
- The steps required here depend completely on your application!
-
- The DRBD Agent's Control logic for Supporting the reload Operation
-
-
-
-
- Advertise the reload operation in the actions section of its metadata
-
- The DRBD Agent Advertising Support for the reload Operation
-
-
-
- 1.1
-
-
- Master/Slave OCF Resource Agent for DRBD
-
-
- ...
-
-
-
-
-
-
-
-
-
-
-
- ]]>
-
-
-
- Advertise one or more parameters that can take effect using reload.
- Any parameter with the unique set to 0 is eligible to be used in this way.
-
- Parameter that can be changed using reload
-
- Full path to the drbd.conf file.
- Path to drbd.conf
-
- ]]>
-
-
-
-
-
- Once these requirements are satisfied, the cluster will automatically know to reload the resource (instead of restarting) when a non-unique fields changes.
-
-
-
- The metadata is re-read when the resource is started.
- This may mean that the resource will be restarted the first time, even though you changed a parameter with unique=0
-
-
-
-
- If both a unique and non-unique field are changed simultaneously, the resource will still be restarted.
-
-
-
-
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Resources.txt b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Resources.txt
new file mode 100644
index 0000000000..397dfec285
--- /dev/null
+++ b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Resources.txt
@@ -0,0 +1,940 @@
+= Advanced Resource Types =
+
+== Groups - A Syntactic Shortcut ==
+indexterm:[Group Resources]
+indexterm:[Resources,Groups]
+anchor:group-resources[Group Resources]
+
+One of the most common elements of a cluster is a set of resources
+that need to be located together, start sequentially, and stop in the
+reverse order. To simplify this configuration we support the concept
+of groups.
+
+.An example group
+[source,XML]
+-------
+<group id="shortcut">
+   <primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
+    <instance_attributes id="params-public-ip">
+       <nvpair id="public-ip-addr" name="ip" value="1.2.3.4"/>
+    </instance_attributes>
+   </primitive>
+   <primitive id="Email" class="lsb" type="exim"/>
+</group>
+-------
+
+
+Although the example above contains only two resources, there is no
+limit to the number of resources a group can contain. The example is
+also sufficient to explain the fundamental properties of a group:
+
+* Resources are started in the order they appear in (+Public-IP+
+ first, then +Email+)
+* Resources are stopped in the reverse order to which they appear in
+ (+Email+ first, then +Public-IP+)
+
+If a resource in the group can't run anywhere, then nothing after it
+is allowed to run either.
+
+* If +Public-IP+ can't run anywhere, neither can +Email+;
+* but if +Email+ can't run anywhere, this does not affect +Public-IP+
+ in any way
+
+The group above is logically equivalent to writing:
+
+.How the cluster sees a group resource
+[source,XML]
+-------
+<configuration>
+   <resources>
+    <primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
+     <instance_attributes id="params-public-ip">
+        <nvpair id="public-ip-addr" name="ip" value="1.2.3.4"/>
+     </instance_attributes>
+    </primitive>
+    <primitive id="Email" class="lsb" type="exim"/>
+   </resources>
+   <constraints>
+      <rsc_colocation id="xxx" rsc="Email" with-rsc="Public-IP" score="INFINITY"/>
+      <rsc_order id="yyy" first="Public-IP" then="Email"/>
+   </constraints>
+</configuration>
+-------
+
+Obviously as the group grows bigger, the reduced configuration effort
+can become significant.
+
+Another (typical) example of a group is a DRBD volume, the filesystem
+mount, an IP address, and an application that uses them.
+
+=== Group Properties ===
+.Properties of a Group Resource
+[width="95%",cols="3m,5<",options="header",align="center"]
+|=========================================================
+
+|Field
+|Description
+
+|id
+|Your name for the group
+ indexterm:[id,Group Resource Property]
+ indexterm:[Group Resource Properties,id]
+ indexterm:[Resource,Group Property,id]
+
+|=========================================================
+
+=== Group Options ===
+
+Options inherited from xref:s-resource-options[]:
++priority, target-role, is-managed+
+
+=== Group Instance Attributes ===
+
+Groups have no instance attributes; however, any that are set here will
+be inherited by the group's children.
+
+=== Group Contents ===
+
+Groups may only contain a collection of
+xref:primitive-resource[] cluster resources. To refer to
+the child of a group resource, just use the child's id instead of the
+group's.
+
+=== Group Constraints ===
+
+Although it is possible to reference the group's children in
+constraints, it is usually preferable to use the group's name instead.
+
+.Example constraints involving groups
+[source,XML]
+-------
+<constraints>
+    <rsc_location id="group-prefers-node1" rsc="shortcut" node="node1" score="500"/>
+    <rsc_colocation id="webserver-with-group" rsc="Webserver" with-rsc="shortcut"/>
+    <rsc_order id="start-shortcut-then-webserver" first="shortcut" then="Webserver"/>
+</constraints>
+-------
+
+=== Group Stickiness ===
+indexterm:[resource-stickiness,of a Group Resource]
+
+Stickiness, the measure of how much a resource wants to stay where it
+is, is additive in groups. Every active resource of the group will
+contribute its stickiness value to the group's total. So if the
+default +resource-stickiness+ is 100, and a group has seven members,
+five of which are active, then the group as a whole will prefer its
+current location with a score of 500.
+
+== Clones - Resources That Get Active on Multiple Hosts ==
+indexterm:[Clone Resources]
+indexterm:[Resources,Clones]
+anchor:s-resource-clone[Clone Resources]
+
+Clones were initially conceived as a convenient way to start N
+instances of an IP resource and have them distributed throughout the
+cluster for load balancing. They have turned out to be quite useful
+for a number of purposes, including integrating with Red Hat's DLM,
+the fencing subsystem, and OCFS2.
+
+You can clone any resource, provided the resource agent supports it.
+
+Three types of cloned resources exist:
+
+* Anonymous
+* Globally Unique
+* Stateful
+
+Anonymous clones are the simplest type. These resources behave
+completely identically everywhere they are running. Because of this,
+there can only be one copy of an anonymous clone active per machine.
+
+Globally unique clones are distinct entities. A copy of the clone
+running on one machine is not equivalent to another instance on
+another node. Nor would any two copies on the same node be
+equivalent.
+
+Stateful clones are covered later in xref:s-resource-multistate[].
+
+.An example clone
+[source,XML]
+-------
+<clone id="apache-clone">
+    <meta_attributes id="apache-clone-meta">
+       <nvpair id="apache-unique" name="globally-unique" value="false"/>
+    </meta_attributes>
+    <primitive id="apache" class="lsb" type="apache"/>
+</clone>
+-------
+
+=== Clone Properties ===
+
+.Properties of a Clone Resource
+[width="95%",cols="3m,5<",options="header",align="center"]
+|=========================================================
+
+|Field
+|Description
+
+|id
+|Your name for the clone
+ indexterm:[id,Clone Resource Property]
+ indexterm:[Clone Resource Properties,id]
+ indexterm:[Resource,Clone Property,id]
+
+|=========================================================
+
+=== Clone Options ===
+
+Options inherited from xref:s-resource-options[] resources:
++priority, target-role, is-managed+
+
+.Clone specific configuration options
+[width="95%",cols="3m,5<",options="header",align="center"]
+|=========================================================
+
+|Field
+|Description
+
+|clone-max
+|How many copies of the resource to start. Defaults to the number of
+ nodes in the cluster.
+ indexterm:[clone-max Clone Resource Property]
+ indexterm:[Clone Resource Properties,clone-max]
+ indexterm:[Resource,Clone Property,clone-max]
+
+|clone-node-max
+|How many copies of the resource can be started on a single node;
+ default _1_.
+ indexterm:[clone-node-max Clone Resource Property]
+ indexterm:[Clone Resource Properties,clone-node-max]
+ indexterm:[Resource,Clone Property,clone-node-max]
+
+|notify
+|When stopping or starting a copy of the clone, tell all the other
+ copies beforehand and when the action was successful. Allowed values:
+ _false_, +true+
+ indexterm:[notify Clone Resource Property]
+ indexterm:[Clone Resource Properties,notify]
+ indexterm:[Resource,Clone Property,notify]
+
+|globally-unique
+|Does each copy of the clone perform a different function? Allowed
+ values: _false_, +true+
+ indexterm:[globally-unique Clone Resource Property]
+ indexterm:[Clone Resource Properties,globally-unique]
+ indexterm:[Resource,Clone Property,globally-unique]
+
+|ordered
+|Should the copies be started in series (instead of in
+ parallel)? Allowed values: _false_, +true+
+ indexterm:[ordered Clone Resource Property]
+ indexterm:[Clone Resource Properties,ordered]
+ indexterm:[Resource,Clone Property,ordered]
+
+|interleave
+|Changes the behavior of ordering constraints (between clones/masters)
+ so that instances can start/stop as soon as their peer instance has
+ (rather than waiting for every instance of the other clone to do
+ so). Allowed values: _false_, +true+
+ indexterm:[interleave Clone Resource Property]
+ indexterm:[Clone Resource Properties,interleave]
+ indexterm:[Resource,Clone Property,interleave]
+
+|=========================================================
+
+=== Clone Instance Attributes ===
+
+Clones have no instance attributes; however, any that are set here
+will be inherited by the clone's children.
+
+=== Clone Contents ===
+
+Clones must contain exactly one group or one regular resource.
+
+[WARNING]
+You should never reference the name of a clone's child.
+If you think you need to do this, you probably need to re-evaluate your design.
+
+=== Clone Constraints ===
+
+In most cases, a clone will have a single copy on each active cluster
+node. If this is not the case, you can indicate which nodes the
+cluster should preferentially assign copies to with resource location
+constraints. These constraints are written no differently to those
+for regular resources except that the clone's id is used.
+
+Ordering constraints behave slightly differently for clones. In the
+example below, +apache-stats+ will wait until all copies of the clone
+that need to be started have done so before being started itself.
+Only if _no_ copies can be started will +apache-stats+ be prevented
+from being active. Additionally, the clone will wait for
++apache-stats+ to be stopped before stopping itself.
+
+Colocation of a regular (or group) resource with a clone means that
+the resource can run on any machine with an active copy of the clone.
+The cluster will choose a copy based on where the clone is running and
+the resource's own location preferences.
+
+Colocation between clones is also possible. In such cases, the set of
+allowed locations for the clone is limited to nodes on which the clone
+is (or will be) active. Allocation is then performed as normal.
+
+.Example constraints involving clones
+[source,XML]
+-------
+<constraints>
+    <rsc_location id="clone-prefers-node1" rsc="apache-clone" node="node1" score="500"/>
+    <rsc_colocation id="stats-with-clone" rsc="apache-stats" with-rsc="apache-clone"/>
+    <rsc_order id="start-clone-then-stats" first="apache-clone" then="apache-stats"/>
+</constraints>
+-------
+
+=== Clone Stickiness ===
+
+indexterm:[resource-stickiness,of a Clone Resource]
+
+To achieve a stable allocation pattern, clones are slightly sticky by
+default. If no value for +resource-stickiness+ is provided, the clone
+will use a value of 1. Being a small value, it causes minimal
+disturbance to the score calculations of other resources but is enough
+to prevent Pacemaker from needlessly moving copies around the cluster.
+
+=== Clone Resource Agent Requirements ===
+
+Any resource can be used as an anonymous clone, as it requires no
+additional support from the resource agent. Whether it makes sense to
+do so depends on your resource and its resource agent.
+
+Globally unique clones do require some additional support in the
+resource agent. In particular, it must only respond with
++${OCF_SUCCESS}+ if the node has that exact instance active. All
+other probes for instances of the clone should result in
++${OCF_NOT_RUNNING}+ (unless, of course, they have failed, in which
+case they should return one of the other OCF error codes).
+
+Copies of a clone are identified by appending a colon and a numerical
+offset, e.g. +apache:2+.
+
+Resource agents can find out how many copies there are by examining
+the +OCF_RESKEY_CRM_meta_clone_max+ environment variable and which
+copy it is by examining +OCF_RESKEY_CRM_meta_clone+.
+
+You should not make any assumptions (based on
++OCF_RESKEY_CRM_meta_clone+) about which copies are active. In
+particular, the list of active copies will not always be an unbroken
+sequence, nor always start at 0.
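+
+A minimal sketch of how an agent might read these variables; the
+logic around them is purely illustrative:
+
+[source,Bash]
+-------
+# Both variables are supplied by the cluster at runtime.
+clone_id=${OCF_RESKEY_CRM_meta_clone:-0}
+clone_max=${OCF_RESKEY_CRM_meta_clone_max:-1}
+echo "Probing copy ${clone_id} of ${clone_max}"
+-------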
+
+==== Clone Notifications ====
+
+Supporting notifications requires the +notify+ action to be
+implemented. Once supported, the notify action will be passed a
+number of extra variables which, when combined with additional
+context, can be used to calculate the current state of the cluster and
+what is about to happen to it.
+
+.Environment variables supplied with Clone notify actions
+[width="95%",cols="5,3<",options="header",align="center"]
+|=========================================================
+
+|Variable
+|Description
+
+|OCF_RESKEY_CRM_meta_notify_type
+|Allowed values: +pre+, +post+
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_type]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_type]
+
+|OCF_RESKEY_CRM_meta_notify_operation
+|Allowed values: +start+, +stop+
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_operation]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_operation]
+
+|OCF_RESKEY_CRM_meta_notify_start_resource
+|Resources to be started
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_start_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_start_resource]
+
+|OCF_RESKEY_CRM_meta_notify_stop_resource
+|Resources to be stopped
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_stop_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_stop_resource]
+
+|OCF_RESKEY_CRM_meta_notify_active_resource
+|Resources that are running
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_active_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_active_resource]
+
+|OCF_RESKEY_CRM_meta_notify_inactive_resource
+|Resources that are not running
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_inactive_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_inactive_resource]
+
+|OCF_RESKEY_CRM_meta_notify_start_uname
+|Nodes on which resources will be started
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_start_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_start_uname]
+
+|OCF_RESKEY_CRM_meta_notify_stop_uname
+|Nodes on which resources will be stopped
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_stop_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_stop_uname]
+
+|OCF_RESKEY_CRM_meta_notify_active_uname
+|Nodes on which resources are running
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_active_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_active_uname]
+
+|OCF_RESKEY_CRM_meta_notify_inactive_uname
+|Nodes on which resources are not running
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_inactive_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_inactive_uname]
+
+|=========================================================
+
+The variables come in pairs, such as
++OCF_RESKEY_CRM_meta_notify_start_resource+ and
++OCF_RESKEY_CRM_meta_notify_start_uname+, and should be treated as an
+array of whitespace-separated elements.
+
+Thus in order to indicate that +clone:0+ will be started on +sles-1+,
++clone:2+ will be started on +sles-3+, and +clone:3+ will be started
+on +sles-2+, the cluster would set
+
+.Example notification variables
+[source,Bash]
+-------
+OCF_RESKEY_CRM_meta_notify_start_resource="clone:0 clone:2 clone:3"
+OCF_RESKEY_CRM_meta_notify_start_uname="sles-1 sles-3 sles-2"
+-------
+
+==== Proper Interpretation of Notification Environment Variables ====
+
+.Pre-notification (stop):
+
+* Active resources: +$OCF_RESKEY_CRM_meta_notify_active_resource+
+* Inactive resources: +$OCF_RESKEY_CRM_meta_notify_inactive_resource+
+* Resources to be started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources to be stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+
+
+.Post-notification (stop) / Pre-notification (start):
+
+* Active resources
+** +$OCF_RESKEY_CRM_meta_notify_active_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+* Inactive resources
+** +$OCF_RESKEY_CRM_meta_notify_inactive_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+* Resources that were started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources that were stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+
+
+.Post-notification (start):
+
+* Active resources:
+** +$OCF_RESKEY_CRM_meta_notify_active_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Inactive resources:
+** +$OCF_RESKEY_CRM_meta_notify_inactive_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources that were started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources that were stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
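+
+Putting the post-start case together, a sketch of how a notify action
+might compute the set of active copies (the variable handling is
+illustrative):
+
+[source,Bash]
+-------
+# active = active - stopped + started
+active=" $OCF_RESKEY_CRM_meta_notify_active_resource "
+for rsc in $OCF_RESKEY_CRM_meta_notify_stop_resource; do
+    active=${active/ $rsc / }   # drop each stopped copy
+done
+active="$active $OCF_RESKEY_CRM_meta_notify_start_resource"
+-------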
+
+== Multi-state - Resources That Have Multiple Modes ==
+
+indexterm:[Multi-state Resources]
+indexterm:[Resources,Multi-state]
+anchor:s-resource-multistate[Multi-state Resources]
+
+Multi-state resources are a specialization of Clone resources; please
+ensure you understand the section on clones before continuing! They
+allow the instances to be in one of two operating modes; these are
+called +Master+ and +Slave+, but can mean whatever you wish them to
+mean. The only limitation is that when an instance is started, it
+must come up in the +Slave+ state.
+
+=== Multi-state Properties ===
+
+.Properties of a Multi-State Resource
+[width="95%",cols="3m,5<",options="header",align="center"]
+|=========================================================
+
+|Field
+|Description
+
+|id
+|Your name for the multi-state resource
+ indexterm:[id,Multi-State Resource Property]
+ indexterm:[Multi-State Resource Properties,id]
+ indexterm:[Resource,Multi-State Property,id]
+
+|=========================================================
+
+=== Multi-state Options ===
+
+Options inherited from xref:s-resource-options[] resources:
++priority+, +target-role+, +is-managed+
+
+Options inherited from xref:s-resource-clone[]:
++clone-max+, +clone-node-max+, +notify+, +globally-unique+, +ordered+,
++interleave+
+
+.Multi-state specific resource configuration options
+[width="95%",cols="3m,5<",options="header",align="center"]
+|=========================================================
+
+|Field
+|Description
+
+|master-max
+|How many copies of the resource can be promoted to +master+ status;
+ default 1.
+ indexterm:[master-max Multi-State Resource Property]
+ indexterm:[Multi-State Resource Properties,master-max]
+ indexterm:[Resource,Multi-State Property,master-max]
+
+|master-node-max
+|How many copies of the resource can be promoted to +master+ status on
+ a single node; default 1.
+ indexterm:[master-node-max Multi-State Resource Property]
+ indexterm:[Multi-State Resource Properties,master-node-max]
+ indexterm:[Resource,Multi-State Property,master-node-max]
+
+|=========================================================
+
+=== Multi-state Instance Attributes ===
+
+Multi-state resources have no instance attributes; however, any that
+are set here will be inherited by the master's children.
+
+=== Multi-state Contents ===
+
+Masters must contain exactly one group or one regular resource.
+
+[WARNING]
+You should never reference the name of a master's child.
+If you think you need to do this, you probably need to re-evaluate your design.
+
+=== Monitoring Multi-State Resources ===
+
+The normal type of monitor actions is not sufficient to monitor a
+multi-state resource in the +Master+ state. To detect failures of the
++Master+ instance, you need to define an additional monitor action
+with +role="Master"+.
+
+[IMPORTANT]
+===========
+It is crucial that _every_ monitor operation has a different interval!
+
+This is because Pacemaker currently differentiates between operations
+only by resource and interval; so if, e.g., a master/slave resource has
+the same monitor interval for both roles, Pacemaker would ignore the
+role when checking the status - which would cause unexpected return
+codes, and therefore unnecessary complications.
+===========
+
+.Monitoring both states of a multi-state resource
+[source,XML]
+-------
+<master id="myMasterRsc">
+   <primitive id="myRsc" class="ocf" type="myApp" provider="myProvider">
+      <operations>
+         <op id="myRsc-slave-check" name="monitor" interval="60"/>
+         <op id="myRsc-master-check" name="monitor" interval="61" role="Master"/>
+      </operations>
+   </primitive>
+</master>
+-------
+
+=== Multi-state Constraints ===
+
+In most cases, a multi-state resource will have a single copy on each
+active cluster node. If this is not the case, you can indicate which
+nodes the cluster should preferentially assign copies to with resource
+location constraints. These constraints are written no differently
+from those for regular resources, except that the master's id is used.
+
+When considering multi-state resources in constraints, for most
+purposes it is sufficient to treat them as clones. The exception is
+when the +rsc-role+ and/or +with-rsc-role+ fields (for colocation
+constraints) and +first-action+ and/or +then-action+ fields (for
+ordering constraints) are used.
+
+.Additional constraint options relevant to multi-state resources
+[width="95%",cols="3m,5<",options="header",align="center"]
+|=========================================================
+
+|Field
+|Description
+
+|rsc-role
+|An additional attribute of colocation constraints that specifies the
+ role that +rsc+ must be in. Allowed values: +Started+, +Master+,
+ +Slave+.
+ indexterm:[rsc-role Multi-State Resource Constraints]
+ indexterm:[Multi-State Resource Constraints,rsc-role]
+ indexterm:[Resource,Multi-State Constraints,rsc-role]
+
+|with-rsc-role
+|An additional attribute of colocation constraints that specifies the
+ role that +with-rsc+ must be in. Allowed values: +Started+,
+ +Master+, +Slave+.
+ indexterm:[with-rsc-role Multi-State Resource Constraints]
+ indexterm:[Multi-State Resource Constraints,with-rsc-role]
+ indexterm:[Resource,Multi-State Constraints,with-rsc-role]
+
+|first-action
+|An additional attribute of ordering constraints that specifies the
+ action that the +first+ resource must complete before executing the
+ specified action for the +then+ resource. Allowed values: +start+,
+ +stop+, +promote+, +demote+.
+ indexterm:[first-action Multi-State Resource Constraints]
+ indexterm:[Multi-State Resource Constraints,first-action]
+ indexterm:[Resource,Multi-State Constraints,first-action]
+
+|then-action
+|An additional attribute of ordering constraints that specifies the
+ action that the +then+ resource can only execute after the
+ +first-action+ on the +first+ resource has completed. Allowed
+ values: +start+, +stop+, +promote+, +demote+. Defaults to the value
+ (specified or implied) of +first-action+.
+ indexterm:[then-action Multi-State Resource Constraints]
+ indexterm:[Multi-State Resource Constraints,then-action]
+ indexterm:[Resource,Multi-State Constraints,then-action]
+
+|=========================================================
+
+In the example below, +myApp+ will wait until one of the database
+copies has been started and promoted to master before being started
+itself. Only if no copies can be promoted will +myApp+ be
+prevented from being active. Additionally, the database will wait for
++myApp+ to be stopped before it is demoted.
+
+.Example constraints involving multi-state resources
+[source,XML]
+-------
+<constraints>
+   <rsc_location id="db-prefers-node1" rsc="database" node="node1" score="100"/>
+   <rsc_colocation id="myapp-with-db-master" score="INFINITY"
+     rsc="myApp" with-rsc="database" with-rsc-role="Master"/>
+   <rsc_order id="promote-db-then-app" first="database" first-action="promote"
+     then="myApp" then-action="start"/>
+</constraints>
+-------
+
+Colocation of a regular (or group) resource with a multi-state
+resource means that it can run on any machine with an active copy of
+the multi-state resource that is in the specified state (+Master+ or
++Slave+). In the example, the cluster will choose a location based on
+where database is running as a +Master+, and if there are multiple
++Master+ instances it will also factor in +myApp+'s own location
+preferences when deciding which location to choose.
+
+Colocation with regular clones and other multi-state resources is also
+possible. In such cases, the set of allowed locations for the +rsc+
+clone is (after role filtering) limited to nodes on which the
++with-rsc+ multi-state resource is (or will be) in the specified role.
+Allocation is then performed as normal.
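+
+For instance, a role-restricted colocation between a regular clone and
+a multi-state resource might be sketched as follows (ids illustrative):
+
+[source,XML]
+-------
+<rsc_colocation id="clone-with-db-master" score="INFINITY"
+    rsc="myClone" with-rsc="database" with-rsc-role="Master"/>
+-------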
+
+=== Multi-state Stickiness ===
+
+indexterm:[resource-stickiness,of a Multi-State Resource]
+To achieve a stable allocation pattern, multi-state resources are
+slightly sticky by default. If no value for +resource-stickiness+ is
+provided, the multi-state resource will use a value of 1. Being a
+small value, it causes minimal disturbance to the score calculations
+of other resources but is enough to prevent Pacemaker from needlessly
+moving copies around the cluster.
+
+=== Which Resource Instance is Promoted ===
+
+During the start operation, most Resource Agent scripts should call
+the `crm_master` utility. This tool automatically detects both the
+resource and host and should be used to set a preference for being
+promoted. Based on this, +master-max+, and +master-node-max+, the
+instance(s) with the highest preference will be promoted.
+
+The other alternative is to create a location constraint that
+indicates which nodes are most preferred as masters.
+
+.Manually specifying which node should be promoted
+[source,XML]
+-------
+<rsc_location id="master-location" rsc="myMasterRsc">
+   <rule id="master-rule" score="100" role="Master">
+      <expression id="master-exp" attribute="#uname" operation="eq" value="node1"/>
+   </rule>
+</rsc_location>
+-------
+
+=== Multi-state Resource Agent Requirements ===
+
+Since multi-state resources are an extension of cloned resources, all
+the requirements of Clones are also requirements of multi-state
+resources. Additionally, multi-state resources require two extra
+actions: +demote+ and +promote+; these actions are responsible for
+changing the state of the resource. Like +start+ and +stop+, they
+should return +OCF_SUCCESS+ if they completed successfully or a
+relevant error code if they did not.
+
+The states can mean whatever you wish, but when the resource is
+started, it must come up in the mode called +Slave+. From there the
+cluster will then decide which instances to promote to +Master+.
+
+In addition to the Clone requirements for monitor actions, agents must
+also _accurately_ report which state they are in. The cluster relies
+on the agent to report its status (including role) accurately and does
+not indicate to the agent what role it currently believes it to be in.
+
+.Role implications of OCF return codes
+[width="95%",cols="5,3<",options="header",align="center"]
+|=========================================================
+
+|Monitor Return Code
+|Description
+
+|OCF_NOT_RUNNING
+|Stopped
+ indexterm:[return code,OCF_NOT_RUNNING]
+ indexterm:[Environment Variable,OCF_NOT_RUNNING]
+ indexterm:[OCF_NOT_RUNNING]
+
+|OCF_SUCCESS
+|Running (Slave)
+ indexterm:[return code,OCF_SUCCESS]
+ indexterm:[Environment Variable,OCF_SUCCESS]
+ indexterm:[OCF_SUCCESS]
+
+|OCF_RUNNING_MASTER
+|Running (Master)
+ indexterm:[return code,OCF_RUNNING_MASTER]
+ indexterm:[Environment Variable,OCF_RUNNING_MASTER]
+ indexterm:[OCF_RUNNING_MASTER]
+
+|OCF_FAILED_MASTER
+|Failed (Master)
+ indexterm:[return code,OCF_FAILED_MASTER]
+ indexterm:[Environment Variable,OCF_FAILED_MASTER]
+ indexterm:[OCF_FAILED_MASTER]
+
+|Other
+|Failed (Slave)
+
+|=========================================================
+
+=== Multi-state Notifications ===
+
+Like clones, supporting notifications requires the +notify+ action to
+be implemented. Once supported, the notify action will be passed a
+number of extra variables which, when combined with additional
+context, can be used to calculate the current state of the cluster and
+what is about to happen to it.
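+
+Notifications are requested with the inherited +notify+ option; a
+sketch with illustrative ids:
+
+[source,XML]
+-------
+<master id="myMasterRsc">
+   <meta_attributes id="myMasterRsc-notify-meta">
+      <nvpair id="myMasterRsc-notify" name="notify" value="true"/>
+   </meta_attributes>
+   <primitive id="myRsc" class="ocf" provider="pacemaker" type="Stateful"/>
+</master>
+-------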
+
+.Environment variables supplied with Master notify actions footnote:[Emphasized variables are specific to +Master+ resources and all behave in the same manner as described for Clone resources.]
+[width="95%",cols="5,3<",options="header",align="center"]
+|=========================================================
+
+|Variable
+|Description
+
+|OCF_RESKEY_CRM_meta_notify_type
+|Allowed values: +pre+, +post+
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_type]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_type]
+
+|OCF_RESKEY_CRM_meta_notify_operation
+|Allowed values: +start+, +stop+
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_operation]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_operation]
+
+|OCF_RESKEY_CRM_meta_notify_active_resource
+|Resources that are running
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_active_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_active_resource]
+
+|OCF_RESKEY_CRM_meta_notify_inactive_resource
+|Resources that are not running
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_inactive_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_inactive_resource]
+
+|_OCF_RESKEY_CRM_meta_notify_master_resource_
+|Resources that are running in +Master+ mode
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_master_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_master_resource]
+
+|_OCF_RESKEY_CRM_meta_notify_slave_resource_
+|Resources that are running in +Slave+ mode
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_slave_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_slave_resource]
+
+|OCF_RESKEY_CRM_meta_notify_start_resource
+|Resources to be started
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_start_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_start_resource]
+
+|OCF_RESKEY_CRM_meta_notify_stop_resource
+|Resources to be stopped
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_stop_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_stop_resource]
+
+|_OCF_RESKEY_CRM_meta_notify_promote_resource_
+|Resources to be promoted
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_promote_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_promote_resource]
+
+|_OCF_RESKEY_CRM_meta_notify_demote_resource_
+|Resources to be demoted
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_demote_resource]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_demote_resource]
+
+|OCF_RESKEY_CRM_meta_notify_start_uname
+|Nodes on which resources will be started
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_start_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_start_uname]
+
+|OCF_RESKEY_CRM_meta_notify_stop_uname
+|Nodes on which resources will be stopped
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_stop_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_stop_uname]
+
+|_OCF_RESKEY_CRM_meta_notify_promote_uname_
+|Nodes on which resources will be promoted
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_promote_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_promote_uname]
+
+|_OCF_RESKEY_CRM_meta_notify_demote_uname_
+|Nodes on which resources will be demoted
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_demote_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_demote_uname]
+
+|OCF_RESKEY_CRM_meta_notify_active_uname
+|Nodes on which resources are running
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_active_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_active_uname]
+
+|OCF_RESKEY_CRM_meta_notify_inactive_uname
+|Nodes on which resources are not running
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_inactive_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_inactive_uname]
+
+|_OCF_RESKEY_CRM_meta_notify_master_uname_
+|Nodes on which resources are running in +Master+ mode
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_master_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_master_uname]
+
+|_OCF_RESKEY_CRM_meta_notify_slave_uname_
+|Nodes on which resources are running in +Slave+ mode
+ indexterm:[Environment Variable,OCF_RESKEY_CRM_,meta_notify_slave_uname]
+ indexterm:[OCF_RESKEY_CRM_,meta_notify_slave_uname]
+
+|=========================================================
+
+=== Multi-state - Proper Interpretation of Notification Environment Variables ===
+
+
+.Pre-notification (demote):
+
+* +Active+ resources: +$OCF_RESKEY_CRM_meta_notify_active_resource+
+* +Master+ resources: +$OCF_RESKEY_CRM_meta_notify_master_resource+
+* +Slave+ resources: +$OCF_RESKEY_CRM_meta_notify_slave_resource+
+* Inactive resources: +$OCF_RESKEY_CRM_meta_notify_inactive_resource+
+* Resources to be started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources to be promoted: +$OCF_RESKEY_CRM_meta_notify_promote_resource+
+* Resources to be demoted: +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* Resources to be stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+
+
+.Post-notification (demote) / Pre-notification (stop):
+
+* +Active+ resources: +$OCF_RESKEY_CRM_meta_notify_active_resource+
+* +Master+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_master_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* +Slave+ resources: +$OCF_RESKEY_CRM_meta_notify_slave_resource+
+* Inactive resources: +$OCF_RESKEY_CRM_meta_notify_inactive_resource+
+* Resources to be started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources to be promoted: +$OCF_RESKEY_CRM_meta_notify_promote_resource+
+* Resources to be demoted: +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* Resources to be stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+* Resources that were demoted: +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+
+
+.Post-notification (stop) / Pre-notification (start):
+
+* +Active+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_active_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+* +Master+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_master_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* +Slave+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_slave_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+* Inactive resources:
+** +$OCF_RESKEY_CRM_meta_notify_inactive_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+* Resources to be started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources to be promoted: +$OCF_RESKEY_CRM_meta_notify_promote_resource+
+* Resources to be demoted: +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* Resources to be stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+* Resources that were demoted: +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* Resources that were stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+
+
+.Post-notification (start) / Pre-notification (promote):
+
+* +Active+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_active_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* +Master+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_master_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* +Slave+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_slave_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Inactive resources:
+** +$OCF_RESKEY_CRM_meta_notify_inactive_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources to be started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources to be promoted: +$OCF_RESKEY_CRM_meta_notify_promote_resource+
+* Resources to be demoted: +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* Resources to be stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+* Resources that were started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources that were demoted: +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* Resources that were stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+
+.Post-notification (promote):
+
+* +Active+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_active_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* +Master+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_master_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_promote_resource+
+* +Slave+ resources:
+** +$OCF_RESKEY_CRM_meta_notify_slave_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_start_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_promote_resource+
+* Inactive resources:
+** +$OCF_RESKEY_CRM_meta_notify_inactive_resource+
+** plus +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+** minus +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources to be started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources to be promoted: +$OCF_RESKEY_CRM_meta_notify_promote_resource+
+* Resources to be demoted: +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* Resources to be stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
+* Resources that were started: +$OCF_RESKEY_CRM_meta_notify_start_resource+
+* Resources that were promoted: +$OCF_RESKEY_CRM_meta_notify_promote_resource+
+* Resources that were demoted: +$OCF_RESKEY_CRM_meta_notify_demote_resource+
+* Resources that were stopped: +$OCF_RESKEY_CRM_meta_notify_stop_resource+
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Resources.xml b/doc/Pacemaker_Explained/en-US/Ch-Advanced-Resources.xml
deleted file mode 100644
index b134b0df4f..0000000000
--- a/doc/Pacemaker_Explained/en-US/Ch-Advanced-Resources.xml
+++ /dev/null
@@ -1,1023 +0,0 @@
- Advanced Resource Types
-
- Group Resources
- ResourcesGroups
- Groups - A Syntactic Shortcut
-
- One of the most common elements of a cluster is a set of resources that need to be located together, start sequentially, and stop in the reverse order.
- To simplify this configuration we support the concept of groups.
-
-
- An example group
-
-
-
-
-
-
-
- ]]>
-
-
- Although the example above contains only two resources, there is no limit to the number of resources a group can contain.
- The example is also sufficient to explain the fundamental properties of a group:
-
-
- Resources are started in the order they appear in (Public-IP first, then Email)
- Resources are stopped in the reverse order to which they appear in (Email first, then Public-IP)
-
- If a resource in the group can't run anywhere, then nothing after that is allowed to run, too.
-
- If Public-IP can't run anywhere, neither can Email;
- but if Email can't run anywhere, this does not affect Public-IP in any way
-
-
-
-
- The group above is logically equivalent to writing:
-
- How the cluster sees a group resource
-
-
-
-
-
-
-
-
-
-
-
-
-
- ]]>
-
- Obviously as the group grows bigger, the reduced configuration effort can become significant.
- Another (typical) example of a group is a DRBD volume, the filesystem mount, an IP address, and an application that uses them.
-
- Properties
-
- Properties of a Group Resource
-
-
-
-
-
- Field
- Description
-
-
-
-
- idGroup Resource Property
- Group Resource Propertiesid
- ResourceGroup Propertyid
- id
- Your name for the group
-
-
-
-
-
-
- Options
- Options inherited from simple resources: priority, target-role, is-managed
-
-
- Instance Attributes
- Groups have no instance attributes, however any that are set here will be inherited by the group's children.
-
-
- Contents
-
- Groups may only contain a collection of primitive cluster resources.
- To refer to the child of a group resource, just use the child's id instead of the group's.
-
-
-
- Constraints
- Although it is possible to reference the group's children in constraints, it is usually preferable to use the group's name instead.
-
- Example constraints involving groups
-
-
-
-
- ]]>
-
-
-
- resource-stickinessof a Group Resource
- Stickiness
- Stickiness, the measure of how much a resource wants to stay where it is, is additive in groups.
- Every active resource of the group will contribute its stickiness value to the group's total.
- So if the default resource-stickiness is 100, and a group has seven members, five of which are active, then the group as a whole will prefer its current location with a score of 500.
-
-
-
- Clone Resources
- ResourcesClones
- Clones - Resources That Get Active on Multiple Hosts
-
- Clones were initially conceived as a convenient way to start N instances of an IP resource and have them distributed throughout the cluster for load balancing.
- They have turned out to quite useful for a number of purposes including integrating with Red Hat's DLM, the fencing subsystem, and OCFS2.
-
- You can clone any resource, provided the resource agent supports it.
- Three types of cloned resources exist:
-
- Anonymous
- Globally Unique
- Stateful
-
-
- Anonymous clones are the simplest type.
- These resources behave completely identically everywhere they are running.
- Because of this, there can only be one copy of an anonymous clone active per machine.
-
-
- Globally unique clones are distinct entities.
- A copy of the clone running on one machine is not equivalent to another instance on another node.
- Nor would any two copies on the same node be equivalent.
-
- Stateful clones are covered later in .
-
- An example clone
-
-
-
-
-
- ]]>
-
-
- Properties
-
- Properties of a Clone Resource
-
-
-
-
-
- Field
- Description
-
-
- idClone Resource Property
- Clone Resource Propertiesid
- ResourceClone Propertyid
- id
- Your name for the clone
-
-
-
-
- Clone specific configuration options
-
-
-
-
-
- Field
- Description
-
-
-
-
- clone-max Clone Resource Property
- Clone Resource Propertiesclone-max
- ResourceClone Propertyclone-max
- clone-max
- How many copies of the resource to start. Defaults to the number of nodes in the cluster.
-
-
- clone-node-max Clone Resource Property
- Clone Resource Propertiesclone-node-max
- ResourceClone Propertyclone-node-max
- clone-node-max
- How many copies of the resource can be started on a single node; default 1.
-
-
- notify Clone Resource Property
- Clone Resource Propertiesnotify
- ResourceClone Propertynotify
- notify
- When stopping or starting a copy of the clone, tell all the other copies beforehand and when the action was successful. Allowed values: false, true
-
-
-
- globally-unique Clone Resource Property
- Clone Resource Propertiesglobally-unique
- ResourceClone Propertyglobally-unique
- globally-unique
- Does each copy of the clone perform a different function? Allowed values: false, true
-
-
- ordered Clone Resource Property
- Clone Resource Propertiesordered
- ResourceClone Propertyordered
- ordered
- Should the copies be started in series (instead of in parallel). Allowed values: false, true
-
-
-
- interleave Clone Resource Property
- Clone Resource Propertiesinterleave
- ResourceClone Propertyinterleave
- interleave
- Changes the behavior of ordering constraints (between clones/masters) so that instances can start/stop as soon as their peer instance has (rather than waiting for every instance of the other clone has). Allowed values: false, true
-
-
-
-
-
-
-
- Instance Attributes
- Clones have no instance attributes; however, any that are set here will be inherited by the clone's children.
-
-
- Contents
- Clones must contain exactly one group or one regular resource.
-
-
- You should never reference the name of a clone's child.
- If you think you need to do this, you probably need to re-evaluate your design.
-
-
-
-
-
- Constraints
-
- In most cases, a clone will have a single copy on each active cluster node.
- If this is not the case, you can indicate which nodes the cluster should preferentially assign copies to with resource location constraints.
- These constraints are written no differently to those for regular resources except that the clone's id is used.
-
-
- Ordering constraints behave slightly differently for clones.
- In the example below, apache-stats will wait until all copies of the clone that need to be started have done so before being started itself.
- Only if no copies can be started apache-stats will be prevented from being active.
- Additionally, the clone will wait for apache-stats to be stopped before stopping the clone.
-
-
- Colocation of a regular (or group) resource with a clone means that the resource can run on any machine with an active copy of the clone.
- The cluster will choose a copy based on where the clone is running and the resource's own location preferences.
-
-
- Colocation between clones is also possible.
- In such cases, the set of allowed locations for the clone is limited to nodes on which the clone is (or will be) active.
- Allocation is then performed as normally.
-
-
- Example constraints involving clones
-
-
-
-
- ]]>
-
-
-
- Stickiness
-
- resource-stickinessof a Clone Resource
- To achieve a stable allocation pattern, clones are slightly sticky by default.
- If no value for resource-stickiness is provided, the clone will use a value of 1.
- Being a small value, it causes minimal disturbance to the score calculations of other resources but is enough to prevent Pacemaker from needlessly moving copies around the cluster.
-
-
-
- Resource Agent Requirements
-
- Any resource can be used as an anonymous clone, as it requires no additional support from the resource agent.
- Whether it makes sense to do so depends on your resource and its resource agent.
-
-
- Globally unique clones do require some additional support in the resource agent.
- In particular, it must only respond with ${OCF_SUCCESS} if the node has that exact instance active.
- All other probes for instances of the clone should result in ${OCF_NOT_RUNNING}.
- Unless of course they are failed, in which case they should return one of the other OCF error codes.
-
- Copies of a clone are identified by appending a colon and a numerical offset, eg. apache:2.
- Resource agents can find out how many copies there are by examining the OCF_RESKEY_CRM_meta_clone_max environment variable and which copy it is by examining OCF_RESKEY_CRM_meta_clone.
-
- You should not make any assumptions (based on OCF_RESKEY_CRM_meta_clone) about which copies are active.
- In particular, the list of active copies will not always be an unbroken sequence, nor always start at 0.
-
-
-
- Notifications
-
- Supporting notifications requires the notify action to be implemented.
- Once supported, the notify action will be passed a number of extra variables which, when combined with additional context, can be used to calculate the current state of the cluster and what is about to happen to it.
-
-
- Environment variables supplied with Clone notify actions
-
-
-
-
-
- Variable
- Description
-
-
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_type
- OCF_RESKEY_CRM_meta_notify_type
- OCF_RESKEY_CRM_meta_notify_type
- Allowed values: pre, post
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_operation
- OCF_RESKEY_CRM_meta_notify_operation
- OCF_RESKEY_CRM_meta_notify_operation
- Allowed values: start, stop
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_start_resource
- OCF_RESKEY_CRM_meta_notify_start_resource
- OCF_RESKEY_CRM_meta_notify_start_resource
- Resources to be started
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_stop_resource
- OCF_RESKEY_CRM_meta_notify_stop_resource
- OCF_RESKEY_CRM_meta_notify_stop_resource
- Resources to be stopped
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_active_resource
- OCF_RESKEY_CRM_meta_notify_active_resource
- OCF_RESKEY_CRM_meta_notify_active_resource
- Resources that are running
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_inactive_resource
- OCF_RESKEY_CRM_meta_notify_inactive_resource
- OCF_RESKEY_CRM_meta_notify_inactive_resource
- Resources that are not running
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_start_uname
- OCF_RESKEY_CRM_meta_notify_start_uname
- OCF_RESKEY_CRM_meta_notify_start_uname
- Nodes on which resources will be started
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_stop_uname
- OCF_RESKEY_CRM_meta_notify_stop_uname
- OCF_RESKEY_CRM_meta_notify_stop_uname
- Nodes on which resources will be stopped
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_active_uname
- OCF_RESKEY_CRM_meta_notify_active_uname
- OCF_RESKEY_CRM_meta_notify_active_uname
- Nodes on which resources are running
-
-
- Environment VariableOCF_RESKEY_CRM_meta_notify_inactive_uname
- OCF_RESKEY_CRM_meta_notify_inactive_uname
- OCF_RESKEY_CRM_meta_notify_inactive_uname
- Nodes on which resources are not running
-
-
-
-
- The variables come in pairs, such as OCF_RESKEY_CRM_meta_notify_start_resource and OCF_RESKEY_CRM_meta_notify_start_uname and should be treated as an array of whitespace separated elements.
- Thus in order to indicate that clone:0 will be started on sles-1, clone:2 will be started on sles-3, and clone:3 will be started on sles-2, the cluster would set
-
- Example notification variables
-
- OCF_RESKEY_CRM_meta_notify_start_resource="clone:0 clone:2 clone:3"
- OCF_RESKEY_CRM_meta_notify_start_uname="sles-1 sles-3 sles-2"
-
-
-
-
- Proper Interpretation of Notification Environment Variables
- Pre-notification (stop):
-
-
- Active resources: $OCF_RESKEY_CRM_meta_notify_active_resource
-
-
- Inactive resources: $OCF_RESKEY_CRM_meta_notify_inactive_resource
-
-
- Resources to be started: $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
- Resources to be stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
- Post-notification (stop) / Pre-notification (start):
-
-
- Active resources
-
- $OCF_RESKEY_CRM_meta_notify_active_resource
- minus $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
-
-
- Inactive resources
-
- $OCF_RESKEY_CRM_meta_notify_inactive_resource
- plus $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
-
-
- Resources that were started: $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
- Resources that were stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
- Post-notification (start):
-
-
- Active resources:
-
- $OCF_RESKEY_CRM_meta_notify_active_resource
- minus $OCF_RESKEY_CRM_meta_notify_stop_resource
- plus $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
-
-
- Inactive resources:
-
- $OCF_RESKEY_CRM_meta_notify_inactive_resource
- plus $OCF_RESKEY_CRM_meta_notify_stop_resource
- minus $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
-
-
- Resources that were started: $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
- Resources that were stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
-
-
-
- Multi-state Resources
- ResourcesMulti-state
- Multi-state - Resources That Have Multiple Modes
-
- Multi-state resources are a specialization of Clone resources; please ensure you understand the section on clones before continuing! They allow the instances to be in one of two operating modes;
- these are called Master and Slave, but can mean whatever you wish them to mean.
- The only limitation is that when an instance is started, it must come up in the Slave state.
-
-
- Properties
-
- Properties of a Multi-State Resource
-
-
-
-
-
- Field
- Description
-
-
- idMulti-State Resource Property
- Multi-State Resource Propertiesid
- ResourceMulti-State Propertyid
- id
- Your name for the multi-state resource
-
-
-
-
- Multi-state specific resource configuration options
-
-
-
-
-
- Field
- Description
-
-
- master-max Multi-State Resource Property
- Multi-State Resource Propertiesmaster-max
- ResourceMulti-State Propertymaster-max
- master-max
- How many copies of the resource can be promoted to master status; default 1.
-
-
- master-node-max Multi-State Resource Property
- Multi-State Resource Propertiesmaster-node-max
- ResourceMulti-State Propertymaster-node-max
- master-node-max
- How many copies of the resource can be promoted to master status on a single node; default 1.
-
-
-
-
- Instance Attributes
- Multi-state resources have no instance attributes; however, any that are set here will be inherited by master's children.
-
-
- Contents
- Masters must contain exactly one group or one regular resource.
-
-
- You should never reference the name of a master's child.
- If you think you need to do this, you probably need to re-evaluate your design.
-
-
-
-
-
- Monitoring Multi-State Resources
-
- The normal type of monitor actions are not sufficient to monitor a multi-state resource in the Master state.
- To detect failures of the Master instance, you need to define an additional monitor action with role="Master".
-
- It is crucial that every monitor operation has a different interval!
- This is because Pacemaker currently differentiates between operations only by resource and interval; so if eg. a master/slave resource has the same monitor interval for both roles, Pacemaker would ignore the role when checking the status - which would cause unexpected return codes, and therefore unnecessary complications.
-
- Monitoring both states of a multi-state resource
-
-
-
-
-
-
-
- ]]>
-
-
-
- Constraints
-
- In most cases, a multi-state resources will have a single copy on each active cluster node.
- If this is not the case, you can indicate which nodes the cluster should preferentially assign copies to with resource location constraints.
- These constraints are written no differently to those for regular resources except that the master's id is used.
-
-
- When considering multi-state resources in constraints, for most purposes it is sufficient to treat them as clones.
- The exception is when the rsc-role and/or with-rsc-role fields (for colocation constraints) and first-action and/or then-action fields (for ordering constraints) are used.
-
-
- Additional constraint options relevant to multi-state resources
-
-
-
-
-
- Field
- Description
-
-
- rsc-role Multi-State Resource Constraints
- Multi-State Resource Constraintsrsc-role
- ResourceMulti-State Constraintsrsc-role
- rsc-role
-
- An additional attribute of colocation constraints that specifies the role that rsc must be in.
- Allowed values: Started, Master, Slave.
-
-
-
- with-rsc-role Multi-State Resource Constraints
- Multi-State Resource Constraintswith-rsc-role
- ResourceMulti-State Constraintswith-rsc-role
- with-rsc-role
-
- An additional attribute of colocation constraints that specifies the role that with-rsc must be in.
- Allowed values: Started, Master, Slave.
-
-
-
- first-action Multi-State Resource Constraints
- Multi-State Resource Constraintsfirst-action
- ResourceMulti-State Constraintsfirst-action
- first-action
-
- An additional attribute of ordering constraints that specifies the action that the first resource must complete before executing the specified action for the then resource.
- Allowed values: start, stop, promote, demote.
-
-
-
- then-action Multi-State Resource Constraints
- Multi-State Resource Constraintsthen-action
- ResourceMulti-State Constraintsthen-action
- then-action
-
- An additional attribute of ordering constraints that specifies the action that the then resource can only execute after the first-action on the first resource has completed.
- Allowed values: start, stop, promote, demote. Defaults to the value (specified or implied) of first-action.
-
-
-
-
-
-
- In the example below, myApp will wait until one of the database copies has been started and promoted to master before being started itself.
- Only if no copies can be promoted will apache-stats be prevented from being active.
- Additionally, the database will wait for myApp to be stopped before it is demoted.
-
-
- Example constraints involving multi-state resources
-
-
-
-
-
-
- ]]>
-
-
- Colocation of a regular (or group) resource with a multi-state resource means that it can run on any machine with an active copy of the multi-state resource that is in the specified state (Master or Slave).
- In the example, the cluster will choose a location based on where database is running as a Master, and if there are multiple Master instances it will also factor in myApp's own location preferences when deciding which location to choose.
-
-
- Colocation with regular clones and other multi-state resources is also possible.
- In such cases, the set of allowed locations for the rsc clone is (after role filtering) limited to nodes on which the with-rsc multi-state resource is (or will be) in the specified role.
- Allocation is then performed as-per-normal.
-
-
-
- Stickiness
-
- resource-stickinessof a Multi-State Resource
- To achieve a stable allocation pattern, multi-state resources are slightly sticky by default.
- If no value for resource-stickiness is provided, the multi-state resource will use a value of 1.
- Being a small value, it causes minimal disturbance to the score calculations of other resources but is enough to prevent Pacemaker from needlessly moving copies around the cluster.
-
-
-
- Which Resource Instance is Promoted
-
- During the start operation, most Resource Agent scripts should call the crm_master utility.
- This tool automatically detects both the resource and host and should be used to set a preference for being promoted.
- Based on this, master-max, and master-node-max, the instance(s) with the highest preference will be promoted.
-
- The other alternative is to create a location constraint that indicates which nodes are most preferred as masters.
-
- Manually specifying which node should be promoted
-
-
-
-
- ]]>
-
-
-
- Resource Agent Requirements
-
- Since multi-state resources are an extension of cloned resources, all the requirements of Clones are also requirements of multi-state resources.
- Additionally, multi-state resources require two extra actions: demote and promote;
- these actions are responsible for changing the state of the resource.
- Like start and stop, they should return OCF_SUCCESS if they completed successfully or a relevant error code if they did not.
-
-
- The states can mean whatever you wish, but when the resource is started, it must come up in the mode called Slave.
- From there the cluster will then decide which instances to promote to Master.
-
-
- In addition to the Clone requirements for monitor actions, agents must also accurately report which state they are in.
- The cluster relies on the agent to report its status (including role) accurately and does not indicate to the agent what role it currently believes it to be in.
-
-
-
-
- Notifications
-
- Like clones, supporting notifications requires the notify action to be implemented.
- Once supported the notify action will be passed a number of extra variables which, when combined with additional context, can be used to calculate the current state of the cluster and what is about to happen to it.
-
-
- Environment variables supplied with Master notify actionsEmphasized variables are specific to Master resources and all behave in the same manner as described for Clone resources.
-
-
-
-
-
-
- Variable
- Description
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_type
- OCF_RESKEY_CRM_meta_notify_type
- OCF_RESKEY_CRM_meta_notify_type
- Allowed values: pre, post
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_operation
- OCF_RESKEY_CRM_meta_notify_operationOCF_RESKEY_CRM_meta_notify_operation
- Allowed values: start, stop
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_active_resource
- OCF_RESKEY_CRM_meta_notify_active_resource
- OCF_RESKEY_CRM_meta_notify_active_resource
- Resources the that are running
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_inactive_resource
- OCF_RESKEY_CRM_meta_notify_inactive_resource
- OCF_RESKEY_CRM_meta_notify_inactive_resource
- Resources the that are not running
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_master_resource
- OCF_RESKEY_CRM_meta_notify_master_resource
- OCF_RESKEY_CRM_meta_notify_master_resource
- Resources that are running in Master mode
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_slave_resource
- OCF_RESKEY_CRM_meta_notify_slave_resource
- OCF_RESKEY_CRM_meta_notify_slave_resource
- Resources that are running in Slave mode
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_start_resource
- OCF_RESKEY_CRM_meta_notify_start_resource
- OCF_RESKEY_CRM_meta_notify_start_resource
- Resources to be started
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_stop_resource
- OCF_RESKEY_CRM_meta_notify_stop_resource
- OCF_RESKEY_CRM_meta_notify_stop_resource
- Resources to be stopped
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_promote_resource
- OCF_RESKEY_CRM_meta_notify_promote_resource
- OCF_RESKEY_CRM_meta_notify_promote_resource
- Resources to be promoted
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_demote_resource
- OCF_RESKEY_CRM_meta_notify_demote_resource
- OCF_RESKEY_CRM_meta_notify_demote_resource
- Resources to be demoted
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_start_uname
- OCF_RESKEY_CRM_meta_notify_start_uname
- OCF_RESKEY_CRM_meta_notify_start_uname
- Nodes on which resources will be started
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_stop_uname
- OCF_RESKEY_CRM_meta_notify_stop_uname
- OCF_RESKEY_CRM_meta_notify_stop_uname
- Nodes on which resources will be stopped
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_promote_uname
- OCF_RESKEY_CRM_meta_notify_promote_uname
- OCF_RESKEY_CRM_meta_notify_promote_uname
- Nodes on which resources will be promoted
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_demote_uname
- OCF_RESKEY_CRM_meta_notify_demote_uname
- OCF_RESKEY_CRM_meta_notify_demote_uname
- Nodes on which resources will be demoted
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_active_uname
- OCF_RESKEY_CRM_meta_notify_active_uname
- OCF_RESKEY_CRM_meta_notify_active_uname
- Nodes on which resources are running
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_inactive_uname
- OCF_RESKEY_CRM_meta_notify_inactive_uname
- OCF_RESKEY_CRM_meta_notify_inactive_uname
- Nodes on which resources are not running
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_master_uname
- OCF_RESKEY_CRM_meta_notify_master_uname
- OCF_RESKEY_CRM_meta_notify_master_uname
- Nodes on which resources are running in Master mode
-
-
- Environment VariableOCF_RESKEY_CRM__meta_notify_slave_uname
- OCF_RESKEY_CRM_meta_notify_slave_uname
- OCF_RESKEY_CRM_meta_notify_slave_uname
- Nodes on which resources are running in Slave mode
-
-
-
-
- Proper Interpretation of Notification Environment Variables
- Pre-notification (demote):
-
- Active resources: $OCF_RESKEY_CRM_meta_notify_active_resource
- Master resources: $OCF_RESKEY_CRM_meta_notify_master_resource
- Slave resources: $OCF_RESKEY_CRM_meta_notify_slave_resource
- Inactive resources: $OCF_RESKEY_CRM_meta_notify_inactive_resource
- Resources to be started: $OCF_RESKEY_CRM_meta_notify_start_resource
- Resources to be promoted: $OCF_RESKEY_CRM_meta_notify_promote_resource
- Resources to be demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource
- Resources to be stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
- Post-notification (demote) / Pre-notification (stop):
-
- Active resources: $OCF_RESKEY_CRM_meta_notify_active_resource
- Master resources:
-
- $OCF_RESKEY_CRM_meta_notify_master_resource
- minus $OCF_RESKEY_CRM_meta_notify_demote_resource
-
-
-
- Slave resources: $OCF_RESKEY_CRM_meta_notify_slave_resource
- Inactive resources: $OCF_RESKEY_CRM_meta_notify_inactive_resource
- Resources to be started: $OCF_RESKEY_CRM_meta_notify_start_resource
- Resources to be promoted: $OCF_RESKEY_CRM_meta_notify_promote_resource
- Resources to be demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource
- Resources to be stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
- Resources that were demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource
-
-
- Post-notification (stop) / Pre-notification (start)
-
- Active resources:
-
- $OCF_RESKEY_CRM_meta_notify_active_resource
- minus $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
- Master resources:
-
- $OCF_RESKEY_CRM_meta_notify_master_resource
- minus $OCF_RESKEY_CRM_meta_notify_demote_resource
-
-
- Slave resources:
-
- $OCF_RESKEY_CRM_meta_notify_slave_resource
- minus $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
- Inactive resources:
-
- $OCF_RESKEY_CRM_meta_notify_inactive_resource
- plus $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
- Resources to be started: $OCF_RESKEY_CRM_meta_notify_start_resource
- Resources to be promoted: $OCF_RESKEY_CRM_meta_notify_promote_resource
- Resources to be demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource
- Resources to be stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
- Resources that were demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource
- Resources that were stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
-
- Post-notification (start) / Pre-notification (promote)
-
- Active resources:
-
- $OCF_RESKEY_CRM_meta_notify_active_resource
- minus $OCF_RESKEY_CRM_meta_notify_stop_resource
- plus $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
- Master resources:
-
- $OCF_RESKEY_CRM_meta_notify_master_resource
- minus $OCF_RESKEY_CRM_meta_notify_demote_resource
-
-
- Slave resources:
-
- $OCF_RESKEY_CRM_meta_notify_slave_resource
- minus $OCF_RESKEY_CRM_meta_notify_stop_resource
- plus $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
- Inactive resources:
-
- $OCF_RESKEY_CRM_meta_notify_inactive_resource
- plus $OCF_RESKEY_CRM_meta_notify_stop_resource
- minus $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
- Resources to be started: $OCF_RESKEY_CRM_meta_notify_start_resource
- Resources to be promoted: $OCF_RESKEY_CRM_meta_notify_promote_resource
- Resources to be demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource
- Resources to be stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
- Resources that were started: $OCF_RESKEY_CRM_meta_notify_start_resource
- Resources that were demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource
- Resources that were stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
- Post-notification (promote)
-
- Active resources:
-
- $OCF_RESKEY_CRM_meta_notify_active_resource
- minus $OCF_RESKEY_CRM_meta_notify_stop_resource
- plus $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
- Master resources:
-
- $OCF_RESKEY_CRM_meta_notify_master_resource
- minus $OCF_RESKEY_CRM_meta_notify_demote_resource
- plus $OCF_RESKEY_CRM_meta_notify_promote_resource
-
-
- Slave resources:
-
- $OCF_RESKEY_CRM_meta_notify_slave_resource
- minus $OCF_RESKEY_CRM_meta_notify_stop_resource
- plus $OCF_RESKEY_CRM_meta_notify_start_resource
- minus $OCF_RESKEY_CRM_meta_notify_promote_resource
-
-
- Inactive resources:
-
- $OCF_RESKEY_CRM_meta_notify_inactive_resource
- plus $OCF_RESKEY_CRM_meta_notify_stop_resource
- minus $OCF_RESKEY_CRM_meta_notify_start_resource
-
-
- Resources to be started: $OCF_RESKEY_CRM_meta_notify_start_resource
- Resources to be promoted: $OCF_RESKEY_CRM_meta_notify_promote_resource
- Resources to be demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource
- Resources to be stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
- Resources that were started: $OCF_RESKEY_CRM_meta_notify_start_resource
- Resources that were promoted: $OCF_RESKEY_CRM_meta_notify_promote_resource
- Resources that were demoted: $OCF_RESKEY_CRM_meta_notify_demote_resource
- Resources that were stopped: $OCF_RESKEY_CRM_meta_notify_stop_resource
-
-
-
-
-
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Resources.txt b/doc/Pacemaker_Explained/en-US/Ch-Resources.txt
index 9033b35ec5..a264c8e8c4 100644
--- a/doc/Pacemaker_Explained/en-US/Ch-Resources.txt
+++ b/doc/Pacemaker_Explained/en-US/Ch-Resources.txt
@@ -1,589 +1,591 @@
= Cluster Resources =
== What is a Cluster Resource ==
indexterm:[Resource,Description]
The role of a resource agent is to abstract the service it provides
and present a consistent view to the cluster, which allows the cluster
to be agnostic about the resources it manages.
The cluster doesn't need to understand how the resource works because
it relies on the resource agent to do the right thing when given a
+start+, +stop+ or +monitor+ command.
For this reason it is crucial that resource agents are well tested.
Typically resource agents come in the form of shell scripts, however
they can be written using any technology (such as C, Python or Perl)
that the author is comfortable with.
== Supported Resource Classes ==
indexterm:[Resource,Classes]
anchor:s-resource-supported[Supported Resource Classes]
There are three basic classes of agents supported by Pacemaker.
In order of encouraged usage they are:
=== Open Cluster Framework ===
indexterm:[Resource,OCF]
indexterm:[OCF,Resources]
indexterm:[Open Cluster Framework,Resources]
The OCF standard
footnote:[
http://www.opencf.org/cgi-bin/viewcvs.cgi/specs/ra/resource-agent-api.txt?rev=HEAD - at least as it relates to resource agents.
] footnote:[
The Pacemaker implementation has been somewhat extended from the OCF
Specs, but none of those changes are incompatible with the original
OCF specification.
]
is basically an extension of the Linux Standard Base conventions for
init scripts to:
* support parameters,
* make them self-describing, and
* make them extensible
OCF specs have strict definitions of the exit codes that actions must return.
footnote:[
Included with the cluster is the ocf-tester script, which can be
useful in this regard.
]
The cluster follows these specifications exactly, and giving the wrong
exit code will cause the cluster to behave in ways you will likely
find puzzling and annoying. In particular, the cluster needs to
distinguish a completely stopped resource from one which is in some
erroneous and indeterminate state.
Parameters are passed to the script as environment variables, with the
special prefix +OCF_RESKEY_+. So, a parameter which the user thinks
of as ip will be passed to the script as +OCF_RESKEY_ip+. The
number and purpose of the parameters is completely arbitrary, however
your script should advertise any that it supports using the
+meta-data+ command.
The OCF class is the most preferred one as it is an industry standard,
highly flexible (allowing parameters to be passed to agents in a
non-positional manner) and self-describing.
For more information, see the
http://www.linux-ha.org/wiki/OCF_Resource_Agents[reference] and
xref:ap-ocf[].
=== Linux Standard Base ===
indexterm:[Resource,LSB]
indexterm:[LSB,Resources]
indexterm:[Linux Standard Base,Resources]
LSB resource agents are those found in '/etc/init.d'.
Generally they are provided by the OS/distribution and, in order to be used with the cluster, they must conform to the LSB Spec.
footnote:[
See
http://refspecs.linux-foundation.org/LSB_3.0.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
for the LSB Spec (as it relates to init scripts).
]
Many distributions claim LSB compliance but ship with broken init
scripts. To see if your init script is LSB-compatible, see the FAQ
entry xref:ap-lsb[]. The most common problems are:
* Not implementing the status operation at all
* Not observing the correct exit status codes for start/stop/status actions
* Starting a started resource returns an error (this violates the LSB spec)
* Stopping a stopped resource returns an error (this violates the LSB spec)
=== Legacy Heartbeat ===
indexterm:[Resource,Heartbeat (legacy)]
indexterm:[Heartbeat,Legacy Resources]
Version 1 of Heartbeat came with its own style of resource agents and
it is highly likely that many people have written their own agents
based on its conventions. To enable administrators to continue to use
these agents, they are supported by the new cluster manager
footnote:[
See http://wiki.linux-ha.org/HeartbeatResourceAgent for more information
]
=== STONITH ===
indexterm:[Resource,STONITH]
indexterm:[STONITH,Resources]
There is also an additional class, STONITH, which is used exclusively
for fencing related resources. This is discussed later in
xref:ch-stonith[].
== Properties ==
-These values tell the cluster which script to use for the resource, where to find that script and what standards it conforms to.
+anchor:primitive-resource[primitive]
+These values tell the cluster which script to use for the resource,
+where to find that script and what standards it conforms to.
.Properties of a Primitive Resource
[width="95%",cols="1m,6<",options="header",align="center"]
|=========================================================
|Field
|Description
|id
indexterm:[id]
|Your name for the resource
|class
indexterm:[class,Resource Field]
indexterm:[Resource,Field,class]
|The standard the script conforms to. Allowed values: +heartbeat+, +lsb+, +ocf+, +stonith+
|type
indexterm:[type,Resource Field]
indexterm:[Resource,Field,type]
|The name of the Resource Agent you wish to use, e.g. _IPaddr_ or _Filesystem_
|provider
indexterm:[provider,Resource Field]
indexterm:[Resource,Field,provider]
|The OCF spec allows multiple vendors to supply the same
ResourceAgent. To use the OCF resource agents supplied with
Heartbeat, you should specify +heartbeat+ here.
|=========================================================
Resource definitions can be queried with the `crm_resource` tool. For example
[source,Bash]
crm_resource --resource Email --query-xml
might produce:
.An example LSB resource
[source,XML]
<primitive id="Email" class="lsb" type="exim"/>
[NOTE]
One of the main drawbacks to LSB resources is that they do not allow any parameters!
Example for an OCF resource:
.An example OCF resource
[source,XML]
-------
<primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
   <instance_attributes id="params-public-ip">
      <nvpair id="public-ip-addr" name="ip" value="1.2.3.4"/>
   </instance_attributes>
</primitive>
-------
Or, finally for the equivalent legacy Heartbeat resource:
.An example Heartbeat resource
[source,XML]
-------
<primitive id="Public-IP-legacy" class="heartbeat" type="IPaddr">
   <instance_attributes id="params-public-ip-legacy">
      <nvpair id="public-ip-addr-legacy" name="1" value="1.2.3.4"/>
   </instance_attributes>
</primitive>
-------
[NOTE]
======
Heartbeat resources take only ordered and unnamed parameters. The
supplied name therefore indicates the order in which they are passed
to the script. Only single digit values are allowed.
======
== Resource Options ==
anchor:s-resource-options[Resource Options]
Options are used by the cluster to decide how your resource should
behave and can be easily set using the `--meta` option of the
`crm_resource` command.
.Options for a Primitive Resource
[width="95%",cols="1m,1,4<",options="header",align="center"]
|=========================================================
|Field
|Default
|Description
|priority
|+0+
|If not all resources can be active, the cluster will stop lower
priority resources in order to keep higher priority ones active.
indexterm:[priority,Resource Option]
indexterm:[Resource,Option,priority]
|target-role
|+Started+
|What state should the cluster attempt to keep this resource in? Allowed values:
* 'Stopped' - Force the resource to be stopped
* 'Started' - Allow the resource to be started (In the case of
xref:s-resource-multistate[] resources, they will not be promoted to
master)
* 'Master' - Allow the resource to be started and, if appropriate, promoted
indexterm:[target-role,Resource Option]
indexterm:[Resource,Option,target-role]
|is-managed
|+TRUE+
|Is the cluster allowed to start and stop the resource? Allowed
values: +true+, +false+
indexterm:[is-managed,Resource Option]
indexterm:[Resource,Option,is-managed]
|resource-stickiness
|Inherited
|How much does the resource prefer to stay where it is? Defaults to
the value of +resource-stickiness+ in the +rsc_defaults+ section
indexterm:[resource-stickiness,Resource Option]
indexterm:[Resource,Option,resource-stickiness]
|migration-threshold
|+INFINITY+ (disabled)
|How many failures may occur for this resource on a node before this
node is marked ineligible to host this resource.
indexterm:[migration-threshold,Resource Option]
indexterm:[Resource,Option,migration-threshold]
|failure-timeout
|+0+ (disabled)
|How many seconds to wait before acting as if the failure had not
occurred, and potentially allowing the resource back to the node on
which it failed.
indexterm:[failure-timeout,Resource Option]
indexterm:[Resource,Option,failure-timeout]
|multiple-active
|+stop_start+
|What should the cluster do if it ever finds the resource active on
more than one node? Allowed values:
* 'block' - mark the resource as unmanaged
* 'stop_only' - stop all active instances and leave them that way
* 'stop_start' - stop all active instances and start the resource in
one location only
indexterm:[multiple-active,Resource Option]
indexterm:[Resource,Option,multiple-active]
|=========================================================
If you performed the following commands on the previous LSB Email resource
[source,Bash]
-------
crm_resource --meta --resource Email --set-parameter priority --property-value 100
crm_resource --meta --resource Email --set-parameter multiple-active --property-value block
-------
the resulting resource definition would be
.An LSB resource with cluster options
[source,XML]
-------
<primitive id="Email" class="lsb" type="exim">
   <meta_attributes id="meta-email">
      <nvpair id="email-priority" name="priority" value="100"/>
      <nvpair id="email-active" name="multiple-active" value="block"/>
   </meta_attributes>
</primitive>
-------
== Setting Global Defaults for Resource Options ==
anchor:s-resource-defaults[Resource Defaults]
To set a default value for a resource option, simply add it to the
+rsc_defaults+ section with `crm_attribute`. Thus,
[source,Bash]
crm_attribute --type rsc_defaults --attr-name is-managed --attr-value false
would prevent the cluster from starting or stopping any of the
resources in the configuration (unless of course the individual
resources were specifically enabled and had +is-managed+ set to
+true+).
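
The command above simply creates an +nvpair+ inside the +rsc_defaults+
section of the CIB; the resulting fragment would look something like
the following sketch (the +meta_attributes+ id may differ):

.Resource defaults as stored in the CIB (illustrative sketch)
[source,XML]
-------
<rsc_defaults>
   <meta_attributes id="rsc-options">
      <nvpair id="rsc-options-is-managed" name="is-managed" value="false"/>
   </meta_attributes>
</rsc_defaults>
-------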
== Instance Attributes ==
The scripts of some resource classes (LSB not being one of them) can
be given parameters which determine how they behave and which instance
of a service they control.
If your resource agent supports parameters, you can add them with the
`crm_resource` command. For instance
[source,Bash]
crm_resource --resource Public-IP --set-parameter ip --property-value 1.2.3.4
would create an entry in the resource like this:
.An example OCF resource with instance attributes
[source,XML]
-------
<primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
   <!-- illustrative reconstruction; ids are arbitrary -->
   <instance_attributes id="params-public-ip">
      <nvpair id="ip-addr" name="ip" value="1.2.3.4"/>
   </instance_attributes>
</primitive>
-------
For an OCF resource, the result would be an environment variable
called +OCF_RESKEY_ip+ with a value of +1.2.3.4+.
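
To illustrate, here is a minimal sketch (not taken from a real agent)
of how an OCF script might consume that variable:

[source,Bash]
-------
#!/bin/sh
# Instance attributes arrive as OCF_RESKEY_* environment variables,
# set by the cluster before the agent is invoked.
ip="${OCF_RESKEY_ip}"
if [ -z "$ip" ]; then
    echo "required parameter 'ip' is not set" >&2
    exit 6  # OCF_ERR_CONFIGURED
fi
echo "managing address $ip"
-------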
The list of instance attributes supported by an OCF script can be
found by calling the resource script with the `meta-data` command.
The output contains an XML description of all the supported
attributes, their purpose and default values.
.Displaying the metadata for the Dummy resource agent template
[source,XML]
-------
# export OCF_ROOT=/usr/lib/ocf
# $OCF_ROOT/resource.d/pacemaker/Dummy meta-data
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="Dummy" version="0.9">
  <version>1.0</version>
  <longdesc lang="en">
    This is a Dummy Resource Agent. It does absolutely nothing except
    keep track of whether its running or not.
    Its purpose in life is for testing and to serve as a template for RA writers.
  </longdesc>
  <shortdesc lang="en">Dummy resource agent</shortdesc>
  <parameters>
    <parameter name="state" unique="1">
      <longdesc lang="en">Location to store the resource state in.</longdesc>
      <shortdesc lang="en">State file</shortdesc>
    </parameter>
    <parameter name="fake" unique="0">
      <longdesc lang="en">Dummy attribute that can be changed to cause a reload</longdesc>
      <shortdesc lang="en">Dummy attribute that can be changed to cause a reload</shortdesc>
    </parameter>
  </parameters>
  <!-- the full output also lists the agent's supported actions; trimmed here -->
</resource-agent>
-------
== Resource Operations ==
=== Monitoring Resources for Failure ===
By default, the cluster will not ensure your resources are still
healthy. To instruct the cluster to do this, you need to add a
+monitor+ operation to the resource's definition.
.An OCF resource with a recurring health check
[source,XML]
-------
<primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
   <!-- illustrative reconstruction; ids and the interval are arbitrary -->
   <operations>
      <op id="public-ip-check" name="monitor" interval="60s"/>
   </operations>
   <instance_attributes id="params-public-ip">
      <nvpair id="ip-addr" name="ip" value="1.2.3.4"/>
   </instance_attributes>
</primitive>
-------
.Properties of an Operation
[width="95%",cols="1m,6<",options="header",align="center"]
|=========================================================
|Field
|Description
|id
|Your name for the action. Must be unique.
|name
|The action to perform. Common values: +monitor+, +start+, +stop+
|interval
|How frequently (in seconds) to perform the operation. Default value:
+0+, meaning never.
|timeout
|How long to wait before declaring the action has failed.
|requires
|What conditions need to be satisfied before this action
occurs. Allowed values:
* 'nothing' - The cluster may start this resource at any time
* 'quorum' - The cluster can only start this resource if a majority of
the configured nodes are active
* 'fencing' - The cluster can only start this resource if a majority
of the configured nodes are active _and_ any failed or unknown nodes
have been powered off.
STONITH resources default to +nothing+, and all others default to
+fencing+ if STONITH is enabled and +quorum+ otherwise.
|on-fail
|The action to take if this action ever fails. Allowed values:
* 'ignore' - Pretend the resource did not fail
* 'block' - Don't perform any further operations on the resource
* 'stop' - Stop the resource and do not start it elsewhere
* 'restart' - Stop the resource and start it again (possibly on a different node)
* 'fence' - STONITH the node on which the resource failed
* 'standby' - Move _all_ resources away from the node on which the resource failed
The default for the +stop+ operation is +fence+ when STONITH is
enabled and +block+ otherwise. All other operations default to +stop+.
|enabled
|If +false+, the operation is treated as if it does not exist. Allowed
values: +true+, +false+
|=========================================================
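
Putting several of these fields together, a single operation might be
defined as follows (a sketch; the id and values are illustrative):

[source,XML]
-------
<op id="my-stop" name="stop" interval="0" timeout="60s" on-fail="block" enabled="true"/>
-------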
=== Setting Global Defaults for Operations ===
anchor:s-operation-defaults[Operation Defaults]
To set a default value for an operation option, simply add it to the
+op_defaults+ section with `crm_attribute`. Thus,
[source,Bash]
crm_attribute --type op_defaults --attr-name timeout --attr-value 20s
would default each operation's +timeout+ to 20 seconds. If an
operation's definition also includes a value for +timeout+, then that
value would be used instead (for that operation only).
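
As with resource defaults, the value is stored as an +nvpair+ in the
+op_defaults+ section. A minimal sketch of the result (ids are
illustrative):

[source,XML]
-------
<op_defaults>
   <meta_attributes id="op-options">
      <nvpair id="op-options-timeout" name="timeout" value="20s"/>
   </meta_attributes>
</op_defaults>
-------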
==== When Resources Take a Long Time to Start/Stop ====
There are a number of implicit operations that the cluster will always
perform - +start+, +stop+ and a non-recurring +monitor+ operation
(used at startup to check that the resource isn't already active). If
one of these is taking too long, you can create an entry for it and
specify a new value.
.An OCF resource with custom timeouts for its implicit actions
[source,XML]
-------
<primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
   <!-- illustrative reconstruction; ids and timeouts are arbitrary -->
   <operations>
      <op id="public-ip-startup" name="monitor" interval="0" timeout="90s"/>
      <op id="public-ip-start" name="start" interval="0" timeout="180s"/>
      <op id="public-ip-stop" name="stop" interval="0" timeout="15min"/>
   </operations>
   <instance_attributes id="params-public-ip">
      <nvpair id="ip-addr" name="ip" value="1.2.3.4"/>
   </instance_attributes>
</primitive>
-------
==== Multiple Monitor Operations ====
Provided no two operations (for a single resource) have the same name
and interval, you can have as many monitor operations as you like. In
this way, you can do a superficial health check every minute and
progressively more intense ones at longer intervals.
To tell the resource agent what kind of check to perform, you need to
provide each monitor with a different value for a common parameter.
The OCF standard creates a special parameter called +OCF_CHECK_LEVEL+
for this purpose and dictates that it is _"made available to the
resource agent without the normal +OCF_RESKEY+ prefix"_.
Whatever name you choose, you can specify it by adding an
+instance_attributes+ block to the +op+ tag. Note that it is up to each
resource agent to look for the parameter and decide how to use it.
.An OCF resource with two recurring health checks, performing different levels of checks - specified via +OCF_CHECK_LEVEL+.
[source,XML]
-------
<primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
   <!-- illustrative reconstruction; ids, intervals and check levels are arbitrary -->
   <operations>
      <op id="health-60" name="monitor" interval="60">
         <instance_attributes id="params-health-60">
            <nvpair id="health-60-check-level" name="OCF_CHECK_LEVEL" value="10"/>
         </instance_attributes>
      </op>
      <op id="health-300" name="monitor" interval="300">
         <instance_attributes id="params-health-300">
            <nvpair id="health-300-check-level" name="OCF_CHECK_LEVEL" value="20"/>
         </instance_attributes>
      </op>
   </operations>
   <instance_attributes id="params-public-ip">
      <nvpair id="ip-addr" name="ip" value="1.2.3.4"/>
   </instance_attributes>
</primitive>
-------
==== Disabling a Monitor Operation ====
The easiest way to stop a recurring monitor is to just delete it.
However, there can be times when you only want to disable it
temporarily. In such cases, simply add +enabled="false"+ to the
operation's definition.
.Example of an OCF resource with a disabled health check
[source,XML]
-------
<primitive id="Public-IP" class="ocf" type="IPaddr" provider="heartbeat">
   <!-- illustrative reconstruction; ids are arbitrary -->
   <operations>
      <op id="public-ip-check" name="monitor" interval="60s" enabled="false"/>
   </operations>
   <instance_attributes id="params-public-ip">
      <nvpair id="ip-addr" name="ip" value="1.2.3.4"/>
   </instance_attributes>
</primitive>
-------
This can be achieved from the command line by executing
[source,Bash]
cibadmin -M -X '<op id="public-ip-check" enabled="false"/>'
Once you've done whatever you needed to do, you can then re-enable it with
[source,Bash]
cibadmin -M -X '<op id="public-ip-check" enabled="true"/>'
diff --git a/doc/Pacemaker_Explained/en-US/Revision_History.xml b/doc/Pacemaker_Explained/en-US/Revision_History.xml
index b3b43ea2d9..2cb4df7244 100644
--- a/doc/Pacemaker_Explained/en-US/Revision_History.xml
+++ b/doc/Pacemaker_Explained/en-US/Revision_History.xml
@@ -1,33 +1,46 @@
     <title>Revision History</title>
     <revhistory>
         <revision>
             <revnumber>1</revnumber>
             <date>19 Oct 2009</date>
             <author><firstname>Andrew</firstname><surname>Beekhof</surname><email>andrew@beekhof.net</email></author>
             <revdescription><simplelist><member>Import from Pages.app</member></simplelist></revdescription>
         </revision>
         <revision>
             <revnumber>2</revnumber>
             <date>26 Oct 2009</date>
             <author><firstname>Andrew</firstname><surname>Beekhof</surname><email>andrew@beekhof.net</email></author>
             <revdescription><simplelist><member>Cleanup and reformatting of docbook xml complete</member></simplelist></revdescription>
         </revision>
         <revision>
             <revnumber>3</revnumber>
             <date>Tue Nov 12 2009</date>
             <author><firstname>Andrew</firstname><surname>Beekhof</surname><email>andrew@beekhof.net</email></author>
             <revdescription><simplelist>
                 <member>Split book into chapters and pass validation</member>
                 <member>Re-organize book for use with Publican</member>
             </simplelist></revdescription>
         </revision>
+        <revision>
+            <revnumber>4</revnumber>
+            <date>Mon Oct 8 2012</date>
+            <author><firstname>Andrew</firstname><surname>Beekhof</surname><email>andrew@beekhof.net</email></author>
+            <revdescription><simplelist>
+                <member>Converted to asciidoc (which is converted to docbook for use with Publican)</member>
+            </simplelist></revdescription>
+        </revision>