diff --git a/doc/Pacemaker_Explained/en-US/Ch-Basics.txt b/doc/Pacemaker_Explained/en-US/Ch-Basics.txt new file mode 100644 index 0000000000..e5df1c439f --- /dev/null +++ b/doc/Pacemaker_Explained/en-US/Ch-Basics.txt @@ -0,0 +1,349 @@ += Configuration Basics = + +== Configuration Layout == + +The cluster is written using XML notation and divided into two main +sections: configuration and status. + +The status section contains the history of each resource on each node +and based on this data, the cluster can construct the complete current +state of the cluster. The authoritative source for the status section +is the local resource manager (lrmd) process on each cluster node and +the cluster will occasionally repopulate the entire section. For this +reason it is never written to disk and administrators are advised +against modifying it in any way. + +The configuration section contains the more traditional information +like cluster options, lists of resources and indications of where they +should be placed. The configuration section is the primary focus of +this document. + +The configuration section itself is divided into four parts: + + * Configuration options (called +crm_config+) + * Nodes + * Resources + * Resource relationships (called +constraints+) + +.An empty configuration +[source,XML] +------- + + + + + + + + + +------- + +== The Current State of the Cluster == + +Before one starts to configure a cluster, it is worth explaining how +to view the finished product. For this purpose we have created the +pass:[crm_mon] utility that will display the +current state of an active cluster. It can show the cluster status by +node or by resource and can be used in either single-shot or +dynamically-updating mode. There are also modes for displaying a list +of the operations performed (grouped by node and resource) as well as +information about failures. + + +Using this tool, you can examine the state of the cluster for +irregularities and see how it responds when you cause or simulate +failures. + +Details on all the available options can be obtained using the +pass:[crm_mon --help] command. + +.Sample output from crm_mon +------- + ============ + Last updated: Fri Nov 23 15:26:13 2007 + Current DC: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec) + 3 Nodes configured. + 5 Resources configured. + ============ + + Node: sles-1 (1186dc9a-324d-425a-966e-d757e693dc86): online + 192.168.100.181 (heartbeat::ocf:IPaddr): Started sles-1 + 192.168.100.182 (heartbeat:IPaddr): Started sles-1 + 192.168.100.183 (heartbeat::ocf:IPaddr): Started sles-1 + rsc_sles-1 (heartbeat::ocf:IPaddr): Started sles-1 + child_DoFencing:2 (stonith:external/vmware): Started sles-1 + Node: sles-2 (02fb99a8-e30e-482f-b3ad-0fb3ce27d088): standby + Node: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec): online + rsc_sles-2 (heartbeat::ocf:IPaddr): Started sles-3 + rsc_sles-3 (heartbeat::ocf:IPaddr): Started sles-3 + child_DoFencing:0 (stonith:external/vmware): Started sles-3 +------- + +.Sample output from crm_mon -n +------- + ============ + Last updated: Fri Nov 23 15:26:13 2007 + Current DC: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec) + 3 Nodes configured. + 5 Resources configured. 
+ ============
+
+ Node: sles-1 (1186dc9a-324d-425a-966e-d757e693dc86): online
+ Node: sles-2 (02fb99a8-e30e-482f-b3ad-0fb3ce27d088): standby
+ Node: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec): online
+
+ Resource Group: group-1
+ 192.168.100.181 (heartbeat::ocf:IPaddr): Started sles-1
+ 192.168.100.182 (heartbeat:IPaddr): Started sles-1
+ 192.168.100.183 (heartbeat::ocf:IPaddr): Started sles-1
+ rsc_sles-1 (heartbeat::ocf:IPaddr): Started sles-1
+ rsc_sles-2 (heartbeat::ocf:IPaddr): Started sles-3
+ rsc_sles-3 (heartbeat::ocf:IPaddr): Started sles-3
+ Clone Set: DoFencing
+ child_DoFencing:0 (stonith:external/vmware): Started sles-3
+ child_DoFencing:1 (stonith:external/vmware): Stopped
+ child_DoFencing:2 (stonith:external/vmware): Started sles-1
+-------
+
+The DC (Designated Controller) node is where all the decisions are
+made, and if the current DC fails, a new one is elected from the
+remaining cluster nodes.  The choice of DC is of no significance to an
+administrator beyond the fact that its logs will generally be more
+interesting.
+
+== How Should the Configuration be Updated? ==
+
+There are three basic rules for updating the cluster configuration:
+
+ * Rule 1 - Never edit the cib.xml file manually. Ever. I'm not making this up.
+ * Rule 2 - Read Rule 1 again.
+ * Rule 3 - The cluster will notice if you ignored rules 1 & 2 and refuse to use the configuration.
+
+Now that it is clear how NOT to update the configuration, we can begin
+to explain how you should.
+
+The most powerful tool for modifying the configuration is the
++cibadmin+ command, which talks to a running cluster.  With +cibadmin+,
+the user can query, add, remove, update or replace any part of the
+configuration; all changes take effect immediately, so there is no
+need to perform a reload-like operation.
+
+The simplest way of using +cibadmin+ is to save the current
+configuration to a temporary file, edit that file with your favorite
+text or XML editor, and then upload the revised configuration.
+
+.Safely using an editor to modify the cluster configuration
+[source,Bash]
+--------
+# cibadmin --query > tmp.xml
+# vi tmp.xml
+# cibadmin --replace --xml-file tmp.xml
+--------
+
+Some of the better XML editors can make use of a Relax NG schema to
+help make sure any changes you make are valid.  The schema describing
+the configuration can normally be found in
+pass:[/usr/lib/heartbeat/pacemaker.rng] on most
+systems.
+
+If you only wanted to modify the resources section, you could instead
+do
+
+.Safely using an editor to modify a subsection of the cluster configuration
+[source,Bash]
+--------
+# cibadmin --query --obj_type resources > tmp.xml
+# vi tmp.xml
+# cibadmin --replace --obj_type resources --xml-file tmp.xml
+--------
+
+to avoid modifying any other part of the configuration.
+
+== Quickly Deleting Part of the Configuration ==
+
+Identify the object you wish to delete. For example, run:
+
+.Searching for STONITH related configuration items
+[source,Bash]
+--------
+# cibadmin -Q | grep stonith
+
+
+
+
+
+
+
+
+
+
+--------
+
+Next, identify the resource's tag name and id (in this case we'll
+choose +primitive+ and +child_DoFencing+).  Then simply execute:
+
+pass:[cibadmin --delete --crm_xml '<primitive id="child_DoFencing"/>']
+
+== Updating the Configuration Without Using XML ==
+
+Some common tasks can also be performed with one of the higher level
+tools that avoid the need to read or edit XML.
+
+To enable stonith, for example, one could run:
+
+pass:[crm_attribute --attr-name stonith-enabled --attr-value true]
+
+Or, to see if +somenode+ is allowed to run resources, there is:
+
+pass:[crm_standby --get-value --node-uname somenode]
+
+Or, to find the current location of +my-test-rsc+, one can use:
+
+pass:[crm_resource --locate --resource my-test-rsc]
+
+[[s-config-sandboxes]]
+== Making Configuration Changes in a Sandbox ==
+
+Often it is desirable to preview the effects of a series of changes
+before updating the configuration atomically.  For this purpose we
+have created pass:[crm_shadow], which creates a
+"shadow" copy of the configuration and arranges for all the command
+line tools to use it.
+
+To begin, simply invoke pass:[crm_shadow] and give
+it the name of a configuration to create footnote:[Shadow copies are
+identified with a name, making it possible to have more than one.];
+be sure to follow the simple on-screen instructions.
+
+Read the above carefully; failure to do so could result in you
+destroying the cluster's active configuration!
+
+.Creating and displaying the active sandbox
+[source,Bash]
+--------
+ # crm_shadow --create test
+ Setting up shadow instance
+ Type Ctrl-D to exit the crm_shadow shell
+ shadow[test]:
+ shadow[test] # crm_shadow --which
+ test
+--------
+
+From this point on, all cluster commands will automatically use the
+shadow copy instead of talking to the cluster's active configuration.
+Once you have finished experimenting, you can either commit the
+changes or discard them, as shown below.  Again, be sure to follow the
+on-screen instructions carefully.
+
+For a full list of pass:[crm_shadow] options and
+commands, invoke it with the --help option.
+
+.Using a sandbox to make multiple changes atomically
+[source,Bash]
+--------
+ shadow[test] # crm_failcount -G -r rsc_c001n01
+ name=fail-count-rsc_c001n01 value=0
+ shadow[test] # crm_standby -v on -n c001n02
+ shadow[test] # crm_standby -G -n c001n02
+ name=c001n02 scope=nodes value=on
+ shadow[test] # cibadmin --erase --force
+ shadow[test] # cibadmin --query
+
+
+
+
+
+
+
+
+
+ shadow[test] # crm_shadow --delete test --force
+ Now type Ctrl-D to exit the crm_shadow shell
+ shadow[test] # exit
+ # crm_shadow --which
+ No shadow instance provided
+ # cibadmin -Q
+
+
+
+
+
+
+--------
+
+The above demonstrates making changes in a sandbox and verifying that
+the real configuration is untouched.
+
+[[s-config-testing-changes]]
+== Testing Your Configuration Changes ==
+
+We saw previously how to make a series of changes to a "shadow" copy
+of the configuration.  Before loading the changes back into the
+cluster (e.g. pass:[crm_shadow --commit mytest
+--force]), it is often advisable to simulate the effect of
+the changes with +ptest+, e.g.
+
+pass:[ptest --live-check -VVVVV --save-graph tmp.graph --save-dotfile tmp.dot]
+
+The tool uses the same library as the live cluster to show what it
+would have done given the supplied input.  Its output, in addition to
+a significant amount of logging, is stored in two files, +tmp.graph+
+and +tmp.dot+; both are representations of the same thing -- the
+cluster's response to your changes.
+
+The graph file stores the complete transition, containing a list
+of all the actions, their parameters and their prerequisites.
+Because the transition graph is not terribly easy to read, the tool
+also generates a Graphviz dot-file representing the same information.
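+
+The dot-file can be rendered into an image for easier inspection with
+the standard Graphviz tools.  As a rough sketch -- assuming Graphviz is
+installed and the files were saved with the names used above (the
++tmp.png+ output name is only an example) -- one might run:
+
+[source,Bash]
+--------
+# dot -Tpng tmp.dot -o tmp.png   # render the transition graph to a PNG (output name is arbitrary)
+# dotty tmp.dot                  # or browse the same graph interactively
+--------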
+
+== Interpreting the Graphviz output ==
+ * Arrows indicate ordering dependencies
+ * Dashed arrows indicate dependencies that are not present in the transition graph
+ * Actions with a dashed border of any color do not form part of the transition graph
+ * Actions with a green border form part of the transition graph
+ * Actions with a red border are ones the cluster would like to execute but cannot run
+ * Actions with a blue border are ones the cluster does not feel need to be executed
+ * Actions with orange text are pseudo/pretend actions that the cluster uses to simplify the graph
+ * Actions with black text are sent to the LRM
+ * Resource actions have text of the form pass:[rsc]_pass:[action]_pass:[interval] pass:[node]
+ * Any action depending on an action with a red border will not be able to execute.
+ * Loops are _really_ bad. Please report them to the development team.
+
+=== Small Cluster Transition ===
+
+image::images/Policy-Engine-small.png["An example transition graph as represented by Graphviz",width="16cm",height="6cm",align="center"]
+
+In the above example, it appears that a new node, +node2+, has come
+online and that the cluster is checking to make sure +rsc1+, +rsc2+
+and +rsc3+ are not already running there (indicated by the
++*_monitor_0+ entries).  Once it did that, and assuming the resources
+were not active there, it would have liked to stop +rsc1+ and +rsc2+
+on +node1+ and move them to +node2+.  However, there appears to be
+some problem, and the cluster cannot or is not permitted to perform the
+stop actions, which implies it also cannot perform the start actions.
+For some reason the cluster does not want to start +rsc3+ anywhere.
+
+For information on the options supported by ptest, use
+pass:[ptest --help].
+
+=== Complex Cluster Transition ===
+
+image::images/Policy-Engine-big.png["Another, slightly more complex, transition graph that you're not expected to be able to read",width="16cm",height="20cm",align="center"]
+
+== Do I Need to Update the Configuration on all Cluster Nodes? ==
+
+No. Any changes are immediately synchronized to the other active
+members of the cluster.
+
+To reduce bandwidth, the cluster only broadcasts the incremental
+updates that result from your changes and uses MD5 checksums to ensure
+that each copy is completely consistent.
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Basics.xml b/doc/Pacemaker_Explained/en-US/Ch-Basics.xml
deleted file mode 100644
index 8d055aebe3..0000000000
--- a/doc/Pacemaker_Explained/en-US/Ch-Basics.xml
+++ /dev/null
@@ -1,299 +0,0 @@
- Configuration Basics
-
- Configuration Layout - The cluster is written using XML notation and divided into two main sections: configuration and status. - - The status section contains the history of each resource on each node and based on this data, the cluster can construct the complete current state of the cluster. - The authoritative source for the status section is the local resource manager (lrmd) process on each cluster node and the cluster will occasionally repopulate the entire section. - For this reason it is never written to disk and administrators are advised against modifying it in any way. - - - The configuration section contains the more traditional information like cluster options, lists of resources and indications of where they should be placed. - The configuration section is the primary focus of this document. - - The configuration section itself is divided into four parts: - - Configuration options (called crm_config) - Nodes - Resources - Resource relationships (called constraints) - - - An empty configuration - - - - - - - - - -]]> - - -
-
- The Current State of the Cluster - - Before one starts to configure a cluster, it is worth explaining how to view the finished product. - For this purpose we have created the crm_mon utility that will display the current state of an active cluster. - It can show the cluster status by node or by resource and can be used in either single-shot or dynamically-updating mode. - There are also modes for displaying a list of the operations performed (grouped by node and resource) as well as information about failures. - - Using this tool, you can examine the state of the cluster for irregularities and see how it responds when you cause or simulate failures. - Details on all the available options can be obtained using the crm_mon --help command. -
- Sample output from crm_mon - # crm_mon - ============ - Last updated: Fri Nov 23 15:26:13 2007 - Current DC: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec) - 3 Nodes configured. - 5 Resources configured. - ============ - - Node: sles-1 (1186dc9a-324d-425a-966e-d757e693dc86): online - 192.168.100.181 (heartbeat::ocf:IPaddr): Started sles-1 - 192.168.100.182 (heartbeat:IPaddr): Started sles-1 - 192.168.100.183 (heartbeat::ocf:IPaddr): Started sles-1 - rsc_sles-1 (heartbeat::ocf:IPaddr): Started sles-1 - child_DoFencing:2 (stonith:external/vmware): Started sles-1 - Node: sles-2 (02fb99a8-e30e-482f-b3ad-0fb3ce27d088): standby - Node: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec): online - rsc_sles-2 (heartbeat::ocf:IPaddr): Started sles-3 - rsc_sles-3 (heartbeat::ocf:IPaddr): Started sles-3 - child_DoFencing:0 (stonith:external/vmware): Started sles-3 -
-
- Sample output from crm_mon -n - # crm_mon -n - ============ - Last updated: Fri Nov 23 15:26:13 2007 - Current DC: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec) - 3 Nodes configured. - 5 Resources configured. - ============ - - Node: sles-1 (1186dc9a-324d-425a-966e-d757e693dc86): online - Node: sles-2 (02fb99a8-e30e-482f-b3ad-0fb3ce27d088): standby - Node: sles-3 (2298606a-6a8c-499a-9d25-76242f7006ec): online - - Resource Group: group-1 - 192.168.100.181 (heartbeat::ocf:IPaddr): Started sles-1 - 192.168.100.182 (heartbeat:IPaddr): Started sles-1 - 192.168.100.183 (heartbeat::ocf:IPaddr): Started sles-1 - rsc_sles-1 (heartbeat::ocf:IPaddr): Started sles-1 - rsc_sles-2 (heartbeat::ocf:IPaddr): Started sles-3 - rsc_sles-3 (heartbeat::ocf:IPaddr): Started sles-3 - Clone Set: DoFencing - child_DoFencing:0 (stonith:external/vmware): Started sles-3 - child_DoFencing:1 (stonith:external/vmware): Stopped - child_DoFencing:2 (stonith:external/vmware): Started sles-1 -
- - The DC (Designated Controller) node is where all the decisions are made and if the current DC fails a new one is elected from the remaining cluster nodes. - The choice of DC is of no significance to an administrator beyond the fact that its logs will generally be more interesting. - -
-
- How Should the Configuration be Updated? - There are three basic rules for updating the cluster configuration: - - Rule 1 - Never edit the cib.xml file manually. Ever. I'm not making this up. - Rule 2 - Read Rule 1 again. - Rule 3 - The cluster will notice if you ignored rules 1 & 2 and refuse to use the configuration. - - Now that it is clear how NOT to update the configuration, we can begin to explain how you should. - - The most powerful tool for modifying the configuration is the cibadmin command which talks to a running cluster. - With cibadmin, the user can query, add, remove, update or replace any part of the configuration; all changes take effect immediately, so there is no need to perform a reload-like operation. - - The simplest way of using cibadmin is to use it to save the current configuration to a temporary file, edit that file with your favorite text or XML editor and then upload the revised configuration. -
- Safely using an editor to modify the cluster configuration - cibadmin --query > tmp.xml - vi tmp.xml - cibadmin --replace --xml-file tmp.xml - -
- - Some of the better XML editors can make use of a Relax NG schema to help make sure any changes you make are valid. - The schema describing the configuration can normally be found in /usr/lib/heartbeat/pacemaker.rng on most systems. - - If you only wanted to modify the resources section, you could instead do -
- Safely using an editor to modify a subsection of the cluster configuration - cibadmin --query --obj_type resources > tmp.xml - vi tmp.xml - cibadmin --replace --obj_type resources --xml-file tmp.xml - -
- to avoid modifying any other part of the configuration. -
-
- Quickly Deleting Part of the Configuration - Identify the object you wish to delete. Eg. do - - - Next identify the resource's tag name and id (in this case we'll choose primitive and child_DoFencing). - Then simply execute: - - cibadmin --delete --crm_xml ‘<primitive id="child_DoFencing"/>' -
-
- Updating the Configuration Without Using XML - Some common tasks can also be performed with one of the higher level tools that avoid the need to read or edit XML. - To enable stonith for example, one could run: - crm_attribute --attr-name stonith-enabled --attr-value true - Or, to see if somenode is allowed to run resources, there is: - crm_standby --get-value --node-uname somenode - Or, to find the current location of my-test-rsc, one can use: - crm_resource --locate --resource my-test-rsc -
-
- Making Configuration Changes in a Sandbox - - Often it is desirable to preview the effects of a series of changes before updating the configuration atomically. - For this purpose we have created crm_shadow which creates a "shadow" copy of the configuration and arranges for all the command line tools to use it. - - - To begin, simply invoke crm_shadow and give it the name of a configuration to create - Shadow copies are identified with a name, making it possible to have more than one. - ; be sure to follow the simple on-screen instructions. - Read the above carefully, failure to do so could result in you destroying the cluster's active configuration! - -
- Creating and displaying the active sandbox - # crm_shadow --create test - Setting up shadow instance - Type Ctrl-D to exit the crm_shadow shell - shadow[test]: - shadow[test] # crm_shadow --which - test -
- - From this point on, all cluster commands will automatically use the shadow copy instead of talking to the cluster's active configuration. - Once you have finished experimenting, you can either commit the changes, or discard them as shown below. - Again, be sure to follow the on-screen instructions carefully. - - For a full list of crm_shadow options and commands, invoke it with the --help option. - - Using a sandbox to make multiple changes atomically - shadow[test] # crm_failcount -G -r rsc_c001n01 - name=fail-count-rsc_c001n01 value=0 - shadow[test] # crm_standby -v on -n c001n02 - shadow[test] # crm_standby -G -n c001n02 - name=c001n02 scope=nodes value=on - shadow[test] # cibadmin --erase --force - shadow[test] # cibadmin --query - - - - - - - - - -]]> - shadow[test] # crm_shadow --delete test --force - Now type Ctrl-D to exit the crm_shadow shell - shadow[test] # exit - # crm_shadow --which - No shadow instance provided - # cibadmin -Q - - - - - - -]]> - - Making changes in a sandbox and verifying the real configuration is untouched - -
-
- Testing Your Configuration Changes - - We saw previously how to make a series of changes to a "shadow" copy of the configuration. - Before loading the changes back into the cluster (eg. crm_shadow --commit mytest --force), it is often advisable to simulate the effect of the changes with ptest, eg. - - ptest --live-check -VVVVV --save-graph tmp.graph --save-dotfile tmp.dot - - The tool uses the same library as the live cluster to show what it would have done given the supplied input. - It's output, in addition to a significant amount of logging, is stored in two files tmp.graph and tmp.dot, both are representations of the same thing -- the cluster's response to your changes. - In the graph file is stored the complete transition, containing a list of all the actions, their parameters and their pre-requisites. - Because the transition graph is not terribly easy to read, the tool also generates a Graphviz dot-file representing the same information. - - -
- Small Cluster Transition - - - - - An example transition graph as represented by Graphviz - -
-
- - Interpreting the Graphviz output - Arrows indicate ordering dependencies - Dashed-arrows indicate dependencies that are not present in the transition graph - Actions with a dashed border of any color do not form part of the transition graph - Actions with a green border form part of the transition graph - Actions with a red border are ones the cluster would like to execute but cannot run - Actions with a blue border are ones the cluster does not feel need to be executed - Actions with orange text are pseudo/pretend actions that the cluster uses to simplify the graph - Actions with black text are sent to the LRM - Resource actions have text of the form rsc_action_interval node - Any action depending on an action with a red border will not be able to execute. - Loops are really bad. Please report them to the development team. - - - In the above example, it appears that a new node, node2, has come online and that the cluster is checking to make sure rsc1, rsc2 and rsc3 are not already running there (Indicated by the *_monitor_0 entries). - Once it did that, and assuming the resources were not active there, it would have liked to stop rsc1 and rsc2 on node1 and move them to node2. - However, there appears to be some problem and the cluster cannot or is not permitted to perform the stop actions which implies it also cannot perform the start actions. - For some reason the cluster does not want to start rsc3 anywhere. - - For information on the options supported by ptest, use ptest --help. - -
- Complex Cluster Transition - - - - - Another, slightly more complex, transition graph that you're not expected to be able to read - -
-
-
-
- Do I Need to Update the Configuration on all Cluster Nodes? - No. Any changes are immediately synchronized to the other active members of the cluster. - To reduce bandwidth, the cluster only broadcasts the incremental updates that result from your changes and uses MD5 checksums to ensure that each copy is completely consistent. -
-
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Options.txt b/doc/Pacemaker_Explained/en-US/Ch-Options.txt new file mode 100644 index 0000000000..79ed0ecb9e --- /dev/null +++ b/doc/Pacemaker_Explained/en-US/Ch-Options.txt @@ -0,0 +1,276 @@ += Cluster Options = + +== Special Options == + +indexterm:[Special Cluster Options] +indexterm:[Cluster Options,Special Options] + +The reason for these fields to be placed at the top level instead of +with the rest of cluster options is simply a matter of parsing. These +options are used by the configuration database which is, by design, +mostly ignorant of the content it holds. So the decision was made to +place them in an easy to find location. + +== Configuration Version == + +indexterm:[Configuration Version, Cluster Option] +indexterm:[Cluster Options,Configuration Version] + +When a node joins the cluster, the cluster will perform a check to see +who has the best configuration based on the fields below. It then +asks the node with the highest (+admin_epoch+, +epoch+, +num_updates+) +tuple to replace the configuration on all the nodes - which makes +setting them, and setting them correctly, very important. + +.Configuration Version Properties +[width="95%",cols="1m,5<",options="header",align="center"] +|========================================================= +|Field |Description + +| admin_epoch | +indexterm:[admin_epoch Cluster Option] +indexterm:[Cluster Options,admin_epoch] +Never modified by the cluster. Use this to make the configurations on +any inactive nodes obsolete. + +_Never set this value to zero_, in such cases the cluster cannot tell +the difference between your configuration and the "empty" one used +when nothing is found on disk. + +| epoch | +indexterm:[epoch Cluster Option] +indexterm:[Cluster Options,epoch] +Incremented every time the configuration is updated (usually by the admin) + +| num_updates | +indexterm:[num_updates Cluster Option] +indexterm:[Cluster Options,num_updates] +Incremented every time the configuration or status is updated (usually by the cluster) + +|========================================================= + +== Other Fields == +.Properties Controlling Validation +[width="95%",cols="1m,5<",options="header",align="center"] +|========================================================= +|Field |Description + +| validate-with | +indexterm:[validate-with Cluster Option] +indexterm:[Cluster Options,validate-with] +Determines the type of validation being done on the configuration. If +set to "none", the cluster will not verify that updates conform to the +DTD (nor reject ones that don't). This option can be useful when +operating a mixed version cluster during an upgrade. + +|========================================================= + +== Fields Maintained by the Cluster == + +.Properties Maintained by the Cluster +[width="95%",cols="1m,5<",options="header",align="center"] +|========================================================= +|Field |Description + +|crm-debug-origin | +indexterm:[crm-debug-origin Cluster Fields] +indexterm:[Cluster Fields,crm-debug-origin] +Indicates where the last update came from. Informational purposes only. + +|cib-last-written | +indexterm:[cib-last-written Cluster Fields] +indexterm:[Cluster Fields,cib-last-written] +Indicates when the configuration was last written to disk. Informational purposes only. + +|dc-uuid | +indexterm:[dc-uuid Cluster Fields] +indexterm:[Cluster Fields,dc-uuid] +Indicates which cluster node is the current leader. 
Used by the
+cluster when placing resources and determining the order of some
+events.
+
+|have-quorum |
+indexterm:[have-quorum Cluster Fields]
+indexterm:[Cluster Fields,have-quorum]
+Indicates if the cluster has quorum. If false, this may mean that the
+cluster cannot start resources or fence other nodes. See
++no-quorum-policy+ below.
+
+|=========================================================
+
+Note that although these fields can be written to by the admin, in
+most cases the cluster will overwrite any values specified by the
+admin with the "correct" ones. To change the +admin_epoch+, for
+example, one would use:
+
+pass:[cibadmin --modify --crm_xml '<cib admin_epoch="42"/>']
+
+A complete set of fields will look something like this:
+
+.An example of the fields set for a cib object
+[source,XML]
+-------
+
+-------
+
+== Cluster Options ==
+
+Cluster options, as you might expect, control how the cluster behaves
+when confronted with certain situations.
+
+They are grouped into sets and, in advanced configurations, there may
+be more than one.
+footnote:[This will be described later, in the section where we will show how to have the cluster use
+different sets of options during working hours (when downtime is
+usually to be avoided at all costs) than it does during the weekends
+(when resources can be moved to their preferred hosts without
+bothering end users).]
+For now we will describe the simple case where each option is present at most once.
+
+== Available Cluster Options ==
+.Cluster Options
+[width="95%",cols="5m,2m,13",options="header",align="center"]
+|=========================================================
+|Option |Default |Description
+
+| batch-limit | 30 |
+indexterm:[batch-limit Cluster Options]
+indexterm:[Cluster Options,batch-limit]
+The number of jobs that the TE is allowed to execute in parallel. The
+"correct" value will depend on the speed and load of your network and
+cluster nodes.
+
+| no-quorum-policy | stop |
+indexterm:[no-quorum-policy Cluster Options]
+indexterm:[Cluster Options,no-quorum-policy]
+What to do when the cluster does not have quorum.  Allowed values:
+
+ * ignore - continue all resource management
+ * freeze - continue resource management, but don't recover resources from nodes not in the affected partition
+ * stop - stop all resources in the affected cluster partition
+ * suicide - fence all nodes in the affected cluster partition
+
+| symmetric-cluster | TRUE |
+indexterm:[symmetric-cluster Cluster Options]
+indexterm:[Cluster Options,symmetric-cluster]
+Can all resources run on any node by default?
+
+| stonith-enabled | TRUE |
+indexterm:[stonith-enabled Cluster Options]
+indexterm:[Cluster Options,stonith-enabled]
+Should failed nodes and nodes with resources that can't be stopped be
+shot? If you value your data, set up a STONITH device and enable this.
+
+If true, or unset, the cluster will refuse to start resources unless
+one or more STONITH resources have also been configured.
+
+| stonith-action | reboot |
+indexterm:[stonith-action Cluster Options]
+indexterm:[Cluster Options,stonith-action]
+Action to send to STONITH device. Allowed values: reboot, poweroff.
+
+| cluster-delay | 60s |
+indexterm:[cluster-delay Cluster Options]
+indexterm:[Cluster Options,cluster-delay]
+Round trip delay over the network (excluding action execution). The
+"correct" value will depend on the speed and load of your network and
+cluster nodes.
+
+| stop-orphan-resources | TRUE |
+indexterm:[stop-orphan-resources Cluster Options]
+indexterm:[Cluster Options,stop-orphan-resources]
+Should deleted resources be stopped?
+
+| stop-orphan-actions | TRUE |
+indexterm:[stop-orphan-actions Cluster Options]
+indexterm:[Cluster Options,stop-orphan-actions]
+Should deleted actions be cancelled?
+
+| start-failure-is-fatal | TRUE |
+indexterm:[start-failure-is-fatal Cluster Options]
+indexterm:[Cluster Options,start-failure-is-fatal]
+When set to FALSE, the cluster will instead use the resource's
++failcount+ and the value of +resource-failure-stickiness+.
+
+| pe-error-series-max | -1 (all) |
+indexterm:[pe-error-series-max Cluster Options]
+indexterm:[Cluster Options,pe-error-series-max]
+The number of PE inputs resulting in ERRORs to save. Used when reporting problems.
+
+| pe-warn-series-max | -1 (all) |
+indexterm:[pe-warn-series-max Cluster Options]
+indexterm:[Cluster Options,pe-warn-series-max]
+The number of PE inputs resulting in WARNINGs to save. Used when reporting problems.
+
+| pe-input-series-max | -1 (all) |
+indexterm:[pe-input-series-max Cluster Options]
+indexterm:[Cluster Options,pe-input-series-max]
+The number of "normal" PE inputs to save. Used when reporting problems.
+
+|=========================================================
+
+You can always obtain an up-to-date list of cluster options, including
+their default values, by running the pass:[pengine
+metadata] command.
+
+== Querying and Setting Cluster Options ==
+
+indexterm:[Querying Cluster Options]
+indexterm:[Setting Cluster Options]
+indexterm:[Cluster Options,Querying]
+indexterm:[Cluster Options,Setting]
+
+Cluster options can be queried and modified using the
+pass:[crm_attribute] tool.  To get the current
+value of +cluster-delay+, simply use:
+
+pass:[crm_attribute --attr-name cluster-delay --get-value]
+
+which is more simply written as
+
+pass:[crm_attribute --get-value -n cluster-delay]
+
+If a value is found, you'll see a result like this:
+=======
+pass:[# crm_attribute --get-value -n cluster-delay]
+
+ name=cluster-delay value=60s
+=======
+
+However, if no value is found, the tool will display an error:
+=======
+pass:[# crm_attribute --get-value -n clusta-deway]
+
+ name=clusta-deway value=(null)
+ Error performing operation: The object/attribute does not exist
+=======
+
+To use a different value, e.g. +30s+, simply run:
+
+pass:[crm_attribute --attr-name cluster-delay --attr-value 30s]
+
+To go back to the cluster's default value, you can delete the value, for example with this command:
+
+pass:[crm_attribute --attr-name cluster-delay --delete-attr]
+
+== When Options are Listed More Than Once ==
+
+If you ever see something like the following, it means that the option you're modifying is present more than once.
+
+.Deleting an option that is listed twice
+=======
+pass:[# crm_attribute --attr-name batch-limit --delete-attr]
+
+ Multiple attributes match name=batch-limit in crm_config:
+ Value: 50 (set=cib-bootstrap-options, id=cib-bootstrap-options-batch-limit)
+ Value: 100 (set=custom, id=custom-batch-limit)
+ Please choose from one of the matches above and supply the 'id' with --attr-id
+=======
+
+In such cases, follow the on-screen instructions to perform the
+requested action.  To determine which value is currently being used by
+the cluster, please refer to the section on .
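+
+As an illustration only -- reusing the ids shown in the sample output
+above, and the +--attr-id+ switch that the tool itself suggests --
+removing the duplicate that lives in the +custom+ set might look like
+this:
+
+[source,Bash]
+--------
+# crm_attribute --attr-name batch-limit --delete-attr --attr-id custom-batch-limit  # id taken from the output above
+# crm_attribute --attr-name batch-limit --get-value                                 # check which value remains
+--------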
diff --git a/doc/Pacemaker_Explained/en-US/Ch-Options.xml b/doc/Pacemaker_Explained/en-US/Ch-Options.xml deleted file mode 100644 index ed6da14608..0000000000 --- a/doc/Pacemaker_Explained/en-US/Ch-Options.xml +++ /dev/null @@ -1,305 +0,0 @@ - - Cluster Options -
- <indexterm significance="preferred"><primary>Special Cluster Options</primary></indexterm> - <indexterm significance="preferred"><primary>Cluster Options</primary><secondary>Special Options</secondary></indexterm> - Special Options - - The reason for these fields to be placed at the top level instead of with the rest of cluster options is simply a matter of parsing. - These options are used by the configuration database which is, by design, mostly ignorant of the content it holds. - So the decision was made to place them in an easy to find location. - -
- - <indexterm significance="preferred"><primary>Configuration Version, Cluster Option</primary></indexterm> - <indexterm significance="preferred"><primary>Cluster Options</primary><secondary>Configuration Version</secondary></indexterm> - Configuration Version - - When a node joins the cluster, the cluster will perform a check to see who has the best configuration based on the fields below. - It then asks the node with the highest (admin_epoch, epoch, num_updates) tuple to replace the configuration on all the nodes - which makes setting them, and setting them correctly, very important. - - - Configuration Version Properties - - - - - - Field - Description - - - - - admin_epoch Cluster Option - Cluster Optionsadmin_epoch - admin_epoch - - Never modified by the cluster. Use this to make the configurations on any inactive nodes obsolete. - Never set this value to zero, in such cases the cluster cannot tell the difference between your configuration and the "empty" one used when nothing is found on disk. - - - - epoch Cluster Option - Cluster Optionsepoch - epoch - Incremented every time the configuration is updated (usually by the admin) - - - num_updates Cluster Option - Cluster Optionsnum_updates - num_updates - Incremented every time the configuration or status is updated (usually by the cluster) - - - -
-
-
- Other Fields - - Properties Controlling Validation - - - - - - Field - Description - - - - - validate-with Cluster Option - Cluster Optionsvalidate-with - validate-with - - Determines the type of validation being done on the configuration. - If set to "none", the cluster will not verify that updates conform to the DTD (nor reject ones that don't). This option can be useful when operating a mixed version cluster during an upgrade. - - - - -
-
-
- Fields Maintained by the Cluster - - Properties Maintained by the Cluster - - - - - - Field - Description - - - - - crm-debug-origin Cluster Fields - Cluster Fieldscrm-debug-origin - crm-debug-origin - Indicates where the last update came from. Informational purposes only. - - - cib-last-written Cluster Fields - Cluster Fieldscib-last-written - cib-last-written - Indicates when the configuration was last written to disk. Informational purposes only. - - - dc-uuid Cluster Fields - Cluster Fieldsdc-uuid - dc-uuid - Indicates which cluster node is the current leader. Used by the cluster when placing resources and determining the order of some events. - - - have-quorum Cluster Fields - Cluster Fieldshave-quorum - have-quorum - Indicates if the cluster has quorum. If false, this may mean that the cluster cannot start resources or fence other nodes. See no-quorum-policy below. - - - -
- - Note that although these fields can be written to by the admin, in most cases the cluster will overwrite any values specified by the admin with the "correct" ones. - To change the admin_epoch, for example, one would use: - - cibadmin --modify --crm_xml ‘<cib admin_epoch="42"/>' - A complete set of fields will look something like this: - - An example of the fields set for a cib object - ]]> - - -
-
-
- Cluster Options - Cluster options, as you might expect, control how the cluster behaves when confronted with certain situations. - They are grouped into sets and, in advanced configurations, there may be more than one - This will be described later in the section on where we will show how to have the cluster use different sets of options during working hours (when downtime is usually to be avoided at all costs) than it does during the weekends (when resources can be moved to the their preferred hosts without bothering end users) - . For now we will describe the simple case where each option is present at most once. -
- Available Cluster Options - - Cluster Options - - - - - - - Option - Default - Description - - - - - batch-limit Cluster Options - Cluster Optionsbatch-limit - batch-limit - 30 - The number of jobs that the TE is allowed to execute in parallel. The "correct" value will depend on the speed and load of your network and cluster nodes. - - - no-quorum-policy Cluster Options - Cluster Optionsno-quorum-policy - no-quorum-policy - stop - - What to do when the cluster does not have quorum. - Allowed values: - - ignore - continue all resource management - freeze - continue resource management, but don't recover resources from nodes not in the affected partition - stop - stop all resources in the affected cluster partition - suicide - fence all nodes in the affected cluster partition - - - - - symmetric-cluster Cluster Options - Cluster Optionssymmetric-cluster - symmetric-cluster - TRUE - Can all resources run on any node by default? - - - stonith-enabled Cluster Options - Cluster Optionsstonith-enabled - stonith-enabled - TRUE - - Should failed nodes and nodes with resources that can't be stopped be shot? If you value your data, set up a STONITH device and enable this. - If true, or unset, the cluster will refuse to start resources unless one or more STONITH resources have been configured also. - - - - stonith-action Cluster Options - Cluster Optionsstonith-action - stonith-action - reboot - Action to send to STONITH device. Allowed values: reboot, poweroff. - - - cluster-delay Cluster Options - Cluster Optionscluster-delay - cluster-delay - 60s - Round trip delay over the network (excluding action execution). The "correct" value will depend on the speed and load of your network and cluster nodes. - - - stop-orphan-resources Cluster Options - Cluster Optionsstop-orphan-resources - stop-orphan-resources - TRUE - Should deleted resources be stopped? - - - stop-orphan-actions Cluster Options - Cluster Optionsstop-orphan-actions - stop-orphan-actions - TRUE - Should deleted actions be cancelled? - - - start-failure-is-fatal Cluster Options - Cluster Optionsstart-failure-is-fatal - start-failure-is-fatal - TRUE - When set to FALSE, the cluster will instead use the resource's failcount and value for resource-failure-stickiness. - - - pe-error-series-max Cluster Options - Cluster Optionspe-error-series-max - pe-error-series-max - -1 (all) - The number of PE inputs resulting in ERRORs to save. Used when reporting problems. - - - pe-warn-series-max Cluster Options - Cluster Optionspe-warn-series-max - pe-warn-series-max - -1 (all) - The number of PE inputs resulting in WARNINGs to save. Used when reporting problems. - - - pe-input-series-max Cluster Options - Cluster Optionspe-input-series-max - pe-input-series-max - -1 (all) - The number of "normal" PE inputs to save. Used when reporting problems. - - - -
- You can always obtain an up-to-date list of cluster options, including their default values, by running the pengine metadata command. -
-
- - <indexterm significance="preferred"><primary>Querying Cluster Options</primary></indexterm> - <indexterm significance="preferred"><primary>Setting Cluster Options</primary></indexterm> - <indexterm significance="preferred"><primary>Cluster Options</primary><secondary>Querying</secondary></indexterm> - <indexterm significance="preferred"><primary>Cluster Options</primary><secondary>Setting</secondary></indexterm> - Querying and Setting Cluster Options - - Cluster options can be queried and modified using the crm_attribute tool. - To get the current value of cluster-delay, simply use: - - crm_attribute --attr-name cluster-delay --get-value - which is more simply written as - crm_attribute --get-value -n cluster-delay - If a value is found, you'll see a result like this: - # crm_attribute --get-value -n cluster-delay - name=cluster-delay value=60s - However, if no value is found, the tool will display an error: - # crm_attribute --get-value -n clusta-deway - name=clusta-deway value=(null) - Error performing operation: The object/attribute does not exist - To use a different value, eg. , simply run: - crm_attribute --attr-name cluster-delay --attr-value 30s - To go back to the cluster's default value you can delete the value, for example with this command: - crm_attribute --attr-name cluster-delay --delete-attr -
-
- When Options are Listed More Than Once - If you ever see something like the following, it means that the option you're modifying is present more than once. - - Deleting an option that is listed twice - # crm_attribute --attr-name batch-limit --delete-attr - Multiple attributes match name=batch-limit in crm_config: - Value: 50 (set=cib-bootstrap-options, id=cib-bootstrap-options-batch-limit) - Value: 100 (set=custom, id=custom-batch-limit) - Please choose from one of the matches above and supply the 'id' with --attr-id - - In such cases follow the on-screen instructions to perform the requested action. -To determine which value is currently being used by the cluster, please refer to the section on . -
-
-