diff --git a/doc/sphinx/Pacemaker_Explained/nodes.rst b/doc/sphinx/Pacemaker_Explained/nodes.rst index f19d7319f7..84069ea931 100644 --- a/doc/sphinx/Pacemaker_Explained/nodes.rst +++ b/doc/sphinx/Pacemaker_Explained/nodes.rst @@ -1,216 +1,247 @@ Cluster Nodes ------------- -.. Convert_to_RST: - - == Defining a Cluster Node == - - Each node in the cluster will have an entry in the nodes section - containing its UUID, uname, and type. - - .Example Corosync cluster node entry - ====== - [source,XML] - - ====== - - In normal circumstances, the admin should let the cluster populate - this information automatically from the communications and membership - data. - +Defining a Cluster Node +_______________________ + +Each cluster node will have an entry in the ``nodes`` section containing at +least an ID and a name. A cluster node's ID is defined by the cluster layer +(Corosync). + +.. topic:: **Example Corosync cluster node entry** + + .. code-block:: xml + + + +In normal circumstances, the admin should let the cluster populate this +information automatically from the cluster layer. + + .. _node_name: Where Pacemaker Gets the Node Name ################################## - -.. Convert_to_RST_2: - - Traditionally, Pacemaker required nodes to be referred to by the value - returned by `uname -n`. This can be problematic for services that - require the `uname -n` to be a specific value (e.g. for a licence - file). - - This requirement has been relaxed for clusters using Corosync 2.0 or later. - The name Pacemaker uses is: - - . The value stored in +corosync.conf+ under *ring0_addr* in the *nodelist*, if it does not contain an IP address; otherwise - . The value stored in +corosync.conf+ under *name* in the *nodelist*; otherwise - . The value of `uname -n` - - Pacemaker provides the `crm_node -n` command which displays the name - used by a running cluster. - - If a Corosync *nodelist* is used, `crm_node --name-for-id` pass:[number] is also - available to display the name used by the node with the corosync - *nodeid* of pass:[number], for example: `crm_node --name-for-id 2`. - + +The name that Pacemaker uses for a node in the configuration does not have to +be the same as its local hostname. Pacemaker uses the following for a Corosync +node's name, in order of most preferred first: + +* The value of ``name`` in the ``nodelist`` section of ``corosync.conf`` +* The value of ``ring0_addr`` in the ``nodelist`` section of ``corosync.conf`` +* The local hostname (value of ``uname -n``) + +If the cluster is running, the ``crm_node -n`` command will display the local +node's name as used by the cluster. + +If a Corosync ``nodelist`` is used, ``crm_node --name-for-id`` with a Corosync +node ID will display the name used by the node with the given Corosync +``nodeid``, for example: + +.. code-block:: none + + crm_node --name-for-id 2 + + +.. index:: + single: node; attribute + single: node attribute + .. _node_attributes: Node Attributes -############### - -.. Convert_to_RST_3: - - indexterm:[Node,attribute] - Pacemaker allows node-specific values to be specified using 'node attributes'. - A node attribute has a name, and may have a distinct value for each node. - - While certain node attributes have specific meanings to the cluster, they are - mainly intended to allow administrators and resource agents to track any - information desired. 
- - For example, an administrator might choose to define node attributes for how - much RAM and disk space each node has, which OS each uses, or which server room - rack each node is in. - - Users can configure <> that use node attributes to affect - where resources are placed. - - === Setting and querying node attributes === - - Node attributes can be set and queried using the `crm_attribute` and - `attrd_updater` commands, so that the user does not have to deal with XML - configuration directly. - - Here is an example of what XML configuration would be generated if an - administrator ran this command: - - .Result of using crm_attribute to specify which kernel pcmk-1 is running - ====== - ------- - # crm_attribute --type nodes --node pcmk-1 --name kernel --update $(uname -r) - ------- - [source,XML] - ------- - - - - - - ------- - ====== - - To read back the value that was just set: - ---- +_______________ + +Pacemaker allows node-specific values to be specified using *node attributes*. +A node attribute has a name, and may have a distinct value for each node. + +Node attributes come in two types, *permanent* and *transient*. Permanent node +attributes are kept within the ``node`` entry, and keep their values even if +the cluster restarts on a node. Transient node attributes are kept in the CIB's +``status`` section, and go away when the cluster stops on the node. + +While certain node attributes have specific meanings to the cluster, they are +mainly intended to allow administrators and resource agents to track any +information desired. + +For example, an administrator might choose to define node attributes for how +much RAM and disk space each node has, which OS each uses, or which server room +rack each node is in. + +Users can configure :ref:`rules` that use node attributes to affect where +resources are placed. + +Setting and querying node attributes +#################################### + +Node attributes can be set and queried using the ``crm_attribute`` and +``attrd_updater`` commands, so that the user does not have to deal with XML +configuration directly. + +Here is an example command to set a permanent node attribute, and the XML +configuration that would be generated: + +.. topic:: **Result of using crm_attribute to specify which kernel pcmk-1 is running** + + .. code-block:: none + + # crm_attribute --type nodes --node pcmk-1 --name kernel --update $(uname -r) + + .. code-block:: xml + + + + + + + +To read back the value that was just set: + +.. code-block:: none + # crm_attribute --type nodes --node pcmk-1 --name kernel --query scope=nodes name=kernel value=3.10.0-862.14.4.el7.x86_64 - ---- - - By specifying `--type nodes` the admin tells the cluster that this - attribute is persistent across reboots. There are also transient attributes - which are kept in the status section and are "forgotten" whenever the node - leaves the cluster. Administrators can use this section by specifying - `--type status`. - - === Special node attributes === - - Certain node attributes have special meaning to the cluster. - - Node attribute names beginning with # are considered reserved for these - special attributes. Some special attributes do not start with #, for - historical reasons. - - Certain special attributes are set automatically by the cluster, should never - be modified directly, and can be used only within <>; - these are listed under <>. 
- - For true/false values, the cluster considers a value of "1", "y", "yes", "on", - or "true" (case-insensitively) to be true, "0", "n", "no", "off", "false", or - unset to be false, and anything else to be an error. - - .Node attributes with special significance - [width="95%",cols="2m,<5",options="header",align="center"] - |==== - |Name |Description - - | fail-count-* - | Attributes whose names start with +fail-count-+ are managed by the cluster - to track how many times particular resource operations have failed on this - node. These should be queried and cleared via the `crm_failcount` or - `crm_resource --cleanup` commands rather than directly. - indexterm:[Node,attribute,fail-count-] - indexterm:[fail-count-,Node attribute] - - | last-failure-* - | Attributes whose names start with +last-failure-+ are managed by the cluster - to track when particular resource operations have most recently failed on - this node. These should be cleared via the `crm_failcount` or - `crm_resource --cleanup` commands rather than directly. - indexterm:[Node,attribute,last-failure-] - indexterm:[last-failure-,Node attribute] - - | maintenance - | Similar to the +maintenance-mode+ <>, but for - a single node. If true, resources will not be started or stopped on the node, - resources and individual clone instances running on the node will become - unmanaged, and any recurring operations for those will be cancelled. - indexterm:[Node,attribute,maintenance] - indexterm:[maintenance,Node attribute] - - | probe_complete - | This is managed by the cluster to detect when nodes need to be reprobed, and - should never be used directly. - indexterm:[Node,attribute,probe_complete] - indexterm:[probe_complete,Node attribute] - - | resource-discovery-enabled - | If the node is a remote node, fencing is enabled, and this attribute is - explicitly set to false (unset means true in this case), resource discovery - (probes) will not be done on this node. This is highly discouraged; the - +resource-discovery+ location constraint property is preferred for this - purpose. - indexterm:[Node,attribute,resource-discovery-enabled] - indexterm:[resource-discovery-enabled,Node attribute] - - | shutdown - | This is managed by the cluster to orchestrate the shutdown of a node, - and should never be used directly. - indexterm:[Node,attribute,shutdown] - indexterm:[shutdown,Node attribute] - - | site-name - | If set, this will be used as the value of the +#site-name+ node attribute - used in rules. (If not set, the value of the +cluster-name+ cluster option - will be used as +#site-name+ instead.) - indexterm:[Node,attribute,site-name] - indexterm:[site-name,Node attribute] - - | standby - | If true, the node is in standby mode. This is typically set and queried via - the `crm_standby` command rather than directly. - indexterm:[Node,attribute,standby] - indexterm:[standby,Node attribute] - - | terminate - | If the value is true or begins with any nonzero number, the node will be - fenced. This is typically set by tools rather than directly. - indexterm:[Node,attribute,terminate] - indexterm:[terminate,Node attribute] - - | #digests-* - | Attributes whose names start with +#digests-+ are managed by the cluster to - detect when <> needs to be redone, and should never be - used directly. - indexterm:[Node,attribute,#digests-] - indexterm:[#digests-,Node attribute] - - | #node-unfenced - | When the node was last unfenced (as seconds since the epoch). This is managed - by the cluster and should never be used directly. 
- indexterm:[Node,attribute,#node-unfenced] - indexterm:[#node-unfenced,Node attribute] - - |==== - - [WARNING] - ==== - Restarting pacemaker on a node that is in single-node maintenance mode will - likely lead to undesirable effects. If +maintenance+ is set as a transient - attribute, it will be erased when pacemaker is stopped, which will immediately - take the node out of maintenance mode and likely get it fenced. Even if - permanent, if pacemaker is restarted, any resources active on the node will - have their local history erased when the node rejoins, so the cluster will no - longer consider them running on the node and thus will consider them managed - again, leading them to be started elsewhere. This behavior might be improved - in a future release. - ==== + +The ``--type nodes`` indicates that this is a permanent node attribute; +``--type status`` would indicate a transient node attribute. + +Special node attributes +####################### + +Certain node attributes have special meaning to the cluster. + +Node attribute names beginning with ``#`` are considered reserved for these +special attributes. Some special attributes do not start with ``#``, for +historical reasons. + +Certain special attributes are set automatically by the cluster, should never +be modified directly, and can be used only within :ref:`rules`; these are +listed under +:ref:`built-in node attributes `. + +For true/false values, the cluster considers a value of "1", "y", "yes", "on", +or "true" (case-insensitively) to be true, "0", "n", "no", "off", "false", or +unset to be false, and anything else to be an error. + +.. table:: **Node attributes with special significance** + + +----------------------------+-----------------------------------------------------+ + | Name | Description | + +============================+=====================================================+ + | fail-count-* | .. index:: | + | | pair: node attribute; fail-count | + | | | + | | Attributes whose names start with | + | | ``fail-count-`` are managed by the cluster | + | | to track how many times particular resource | + | | operations have failed on this node. These | + | | should be queried and cleared via the | + | | ``crm_failcount`` or | + | | ``crm_resource --cleanup`` commands rather | + | | than directly. | + +----------------------------+-----------------------------------------------------+ + | last-failure-* | .. index:: | + | | pair: node attribute; last-failure | + | | | + | | Attributes whose names start with | + | | ``last-failure-`` are managed by the cluster | + | | to track when particular resource operations | + | | have most recently failed on this node. | + | | These should be cleared via the | + | | ``crm_failcount`` or | + | | ``crm_resource --cleanup`` commands rather | + | | than directly. | + +----------------------------+-----------------------------------------------------+ + | maintenance | .. index:: | + | | pair: node attribute; maintenance | + | | | + | | Similar to the ``maintenance-mode`` | + | | :ref:`cluster option `, but | + | | for a single node. If true, resources will | + | | not be started or stopped on the node, | + | | resources and individual clone instances | + | | running on the node will become unmanaged, | + | | and any recurring operations for those will | + | | be cancelled. | + | | | + | | .. warning:: | + | | Restarting pacemaker on a node that is in | + | | single-node maintenance mode will likely | + | | lead to undesirable effects. 
If | + | | ``maintenance`` is set as a transient | + | | attribute, it will be erased when | + | | Pacemaker is stopped, which will | + | | immediately take the node out of | + | | maintenance mode and likely get it | + | | fenced. Even if permanent, if Pacemaker | + | | is restarted, any resources active on the | + | | node will have their local history erased | + | | when the node rejoins, so the cluster | + | | will no longer consider them running on | + | | the node and thus will consider them | + | | managed again, leading them to be started | + | | elsewhere. This behavior might be | + | | improved in a future release. | + +----------------------------+-----------------------------------------------------+ + | probe_complete | .. index:: | + | | pair: node attribute; probe_complete | + | | | + | | This is managed by the cluster to detect | + | | when nodes need to be reprobed, and should | + | | never be used directly. | + +----------------------------+-----------------------------------------------------+ + | resource-discovery-enabled | .. index:: | + | | pair: node attribute; resource-discovery-enabled | + | | | + | | If the node is a remote node, fencing is enabled, | + | | and this attribute is explicitly set to false | + | | (unset means true in this case), resource discovery | + | | (probes) will not be done on this node. This is | + | | highly discouraged; the ``resource-discovery`` | + | | location constraint property is preferred for this | + | | purpose. | + +----------------------------+-----------------------------------------------------+ + | shutdown | .. index:: | + | | pair: node attribute; shutdown | + | | | + | | This is managed by the cluster to orchestrate the | + | | shutdown of a node, and should never be used | + | | directly. | + +----------------------------+-----------------------------------------------------+ + | site-name | .. index:: | + | | pair: node attribute; site-name | + | | | + | | If set, this will be used as the value of the | + | | ``#site-name`` node attribute used in rules. (If | + | | not set, the value of the ``cluster-name`` cluster | + | | option will be used as ``#site-name`` instead.) | + +----------------------------+-----------------------------------------------------+ + | standby | .. index:: | + | | pair: node attribute; standby | + | | | + | | If true, the node is in standby mode. This is | + | | typically set and queried via the ``crm_standby`` | + | | command rather than directly. | + +----------------------------+-----------------------------------------------------+ + | terminate | .. index:: | + | | pair: node attribute; terminate | + | | | + | | If the value is true or begins with any nonzero | + | | number, the node will be fenced. This is typically | + | | set by tools rather than directly. | + +----------------------------+-----------------------------------------------------+ + | #digests-* | .. index:: | + | | pair: node attribute; #digests | + | | | + | | Attributes whose names start with ``#digests-`` are | + | | managed by the cluster to detect when | + | | :ref:`unfencing` needs to be redone, and should | + | | never be used directly. | + +----------------------------+-----------------------------------------------------+ + | #node-unfenced | .. index:: | + | | pair: node attribute; #node-unfenced | + | | | + | | When the node was last unfenced (as seconds since | + | | the epoch). This is managed by the cluster and | + | | should never be used directly. 
| + +----------------------------+-----------------------------------------------------+ diff --git a/doc/sphinx/Pacemaker_Explained/reusing-configuration.rst b/doc/sphinx/Pacemaker_Explained/reusing-configuration.rst index 5df391b623..7256aaa4ba 100644 --- a/doc/sphinx/Pacemaker_Explained/reusing-configuration.rst +++ b/doc/sphinx/Pacemaker_Explained/reusing-configuration.rst @@ -1,415 +1,415 @@ Reusing Parts of the Configuration ---------------------------------- Pacemaker provides multiple ways to simplify the configuration XML by reusing parts of it in multiple places. Besides simplifying the XML, this also allows you to manipulate multiple configuration elements with a single reference. Reusing Resource Definitions ############################ If you want to create lots of resources with similar configurations, defining a *resource template* simplifies the task. Once defined, it can be referenced in primitives or in certain types of constraints. Configuring Resources with Templates ____________________________________ The primitives referencing the template will inherit all meta-attributes, instance attributes, utilization attributes and operations defined in the template. And you can define specific attributes and operations for any of the primitives. If any of these are defined in both the template and the primitive, the values defined in the primitive will take precedence over the ones defined in the template. Hence, resource templates help to reduce the amount of configuration work. If any changes are needed, they can be done to the template definition and will take effect globally in all resource definitions referencing that template. Resource templates have a syntax similar to that of primitives. .. topic:: Resource template for a migratable Xen virtual machine .. code-block:: xml Once you define a resource template, you can use it in primitives by specifying the ``template`` property. .. topic:: Xen primitive resource using a resource template .. code-block:: xml In the example above, the new primitive ``vm1`` will inherit everything from ``vm-template``. For example, the equivalent of the above two examples would be: .. topic:: Equivalent Xen primitive resource not using a resource template .. code-block:: xml If you want to overwrite some attributes or operations, add them to the particular primitive's definition. .. topic:: Xen resource overriding template values .. code-block:: xml In the example above, the new primitive ``vm2`` has special attribute values. Its ``monitor`` operation has a longer ``timeout`` and ``interval``, and the primitive has an additional ``stop`` operation. To see the resulting definition of a resource, run: .. code-block:: none # crm_resource --query-xml --resource vm2 To see the raw definition of a resource in the CIB, run: .. code-block:: none # crm_resource --query-xml-raw --resource vm2 Using Templates in Constraints ______________________________ A resource template can be referenced in the following types of constraints: - ``order`` constraints (see :ref:`s-resource-ordering`) - ``colocation`` constraints (see :ref:`s-resource-colocation`) - ``rsc_ticket`` constraints (for multi-site clusters as described in :ref:`ticket-constraints`) Resource templates referenced in constraints stand for all primitives which are derived from that template. This means, the constraint applies to all primitive resources referencing the resource template. 
Referencing resource templates in constraints is an alternative to resource sets and can simplify the cluster configuration considerably. For example, given the example templates earlier in this chapter: .. code-block:: xml would colocate all VMs with ``base-rsc`` and is the equivalent of the following constraint configuration: .. code-block:: xml .. note:: In a colocation constraint, only one template may be referenced from either ``rsc`` or ``with-rsc``; the other reference must be a regular resource. Using Templates in Resource Sets ________________________________ Resource templates can also be referenced in resource sets. For example, given the example templates earlier in this section, then: .. code-block:: xml is the equivalent of the following constraint using a sequential resource set: .. code-block:: xml Or, if the resources referencing the template can run in parallel, then: .. code-block:: xml is the equivalent of the following constraint configuration: .. code-block:: xml -.. _s-reusing-config-elements:: +.. _s-reusing-config-elements: Reusing Rules, Options and Sets of Operations ############################################# Sometimes a number of constraints need to use the same set of rules, and resources need to set the same options and parameters. To simplify this situation, you can refer to an existing object using an ``id-ref`` instead of an ``id``. So if for one resource you have .. code-block:: xml Then instead of duplicating the rule for all your other resources, you can instead specify: -.. topic:: Referencing rules from other constraints - -.. code-block:: xml - - - - +.. topic:: **Referencing rules from other constraints** + .. code-block:: xml + + + + + .. important:: The cluster will insist that the ``rule`` exists somewhere. Attempting to add a reference to a non-existing rule will cause a validation failure, as will attempting to remove a ``rule`` that is referenced elsewhere. The same principle applies for ``meta_attributes`` and ``instance_attributes`` as illustrated in the example below: .. topic:: Referencing attributes, options, and operations from other resources .. code-block:: xml ``id-ref`` can similarly be used with ``resource_set`` (in any constraint type), ``nvpair``, and ``operations``. Tagging Configuration Elements ############################## Pacemaker allows you to *tag* any configuration element that has an XML ID. The main purpose of tagging is to support higher-level user interface tools; Pacemaker itself only uses tags within constraints. Therefore, what you can do with tags mostly depends on the tools you use. Configuring Tags ________________ A tag is simply a named list of XML IDs. .. topic:: Tag referencing three resources .. code-block:: xml What you can do with this new tag depends on what your higher-level tools support. For example, a tool might allow you to enable or disable all of the tagged resources at once, or show the status of just the tagged resources. A single configuration element can be listed in any number of tags. Using Tags in Constraints and Resource Sets ___________________________________________ Pacemaker itself only uses tags in constraints. If you supply a tag name instead of a resource name in any constraint, the constraint will apply to all resources listed in that tag. .. topic:: Constraint using a tag .. code-block:: xml In the example above, assuming the ``all-vms`` tag is defined as in the previous example, the constraint will behave the same as: .. topic:: Equivalent constraints without tags .. 
code-block:: xml A tag may be used directly in the constraint, or indirectly by being listed in a :ref:`resource set ` used in the constraint. When used in a resource set, an expanded tag will honor the set's ``sequential`` property. Filtering With Tags ___________________ The ``crm_mon`` tool can be used to display lots of information about the state of the cluster. On large or complicated clusters, this can include a lot of information, which makes it difficult to find the one thing you are interested in. The ``--resource=`` and ``--node=`` command line options can be used to filter results. In their most basic usage, these options take a single resource or node name. However, they can also be supplied with a tag name to display several objects at once. For instance, given the following CIB section: .. code-block:: xml The following would be output for ``crm_mon --resource=inactive-rscs -r``: .. code-block:: none Cluster Summary: * Stack: corosync * Current DC: cluster02 (version 2.0.4-1.e97f9675f.git.el7-e97f9675f) - partition with quorum * Last updated: Tue Oct 20 16:09:01 2020 * Last change: Tue May 5 12:04:36 2020 by hacluster via crmd on cluster01 * 5 nodes configured * 27 resource instances configured (4 DISABLED) Node List: * Online: [ cluster01 cluster02 ] Full List of Resources: * Clone Set: inactive-clone [inactive-dhcpd] (disabled): * Stopped (disabled): [ cluster01 cluster02 ] * Resource Group: inactive-group (disabled): * inactive-dummy-1 (ocf::pacemaker:Dummy): Stopped (disabled) * inactive-dummy-2 (ocf::pacemaker:Dummy): Stopped (disabled)
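
Filtering by node works the same way. As a sketch (the tag name and the
``obj_ref`` IDs below are hypothetical, and must match the XML IDs of the
corresponding ``node`` entries), a tag grouping two cluster nodes could be
defined and then passed to the ``--node=`` option:

.. code-block:: xml

   <tags>
     <tag id="rack-1-nodes">
       <obj_ref id="1"/>
       <obj_ref id="2"/>
     </tag>
   </tags>

.. code-block:: none

   # crm_mon --node=rack-1-nodes

This would limit the status display to the nodes listed in the
``rack-1-nodes`` tag, rather than to a single named node.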