diff --git a/doc/sphinx/Pacemaker_Explained/multi-site-clusters.rst b/doc/sphinx/Pacemaker_Explained/multi-site-clusters.rst index 133e79096f..60421644ea 100644 --- a/doc/sphinx/Pacemaker_Explained/multi-site-clusters.rst +++ b/doc/sphinx/Pacemaker_Explained/multi-site-clusters.rst @@ -1,345 +1,342 @@ Multi-Site Clusters and Tickets ------------------------------- -.. Convert_to_RST: - - Apart from local clusters, Pacemaker also supports multi-site clusters. - That means you can have multiple, geographically dispersed sites, each with a - local cluster. Failover between these clusters can be coordinated - manually by the administrator, or automatically by a higher-level entity called - a 'Cluster Ticket Registry (CTR)'. - - == Challenges for Multi-Site Clusters == - - Typically, multi-site environments are too far apart to support - synchronous communication and data replication between the sites. - That leads to significant challenges: - - - How do we make sure that a cluster site is up and running? - - - How do we make sure that resources are only started once? - - - How do we make sure that quorum can be reached between the different - sites and a split-brain scenario avoided? - - - How do we manage failover between sites? - - - How do we deal with high latency in case of resources that need to be - stopped? - - In the following sections, learn how to meet these challenges. - - == Conceptual Overview == - - Multi-site clusters can be considered as “overlay” clusters where - each cluster site corresponds to a cluster node in a traditional cluster. - The overlay cluster can be managed by a CTR in order to - guarantee that any cluster resource will be active - on no more than one cluster site. This is achieved by using - 'tickets' that are treated as failover domain between cluster - sites, in case a site should be down. - - The following sections explain the individual components and mechanisms - that were introduced for multi-site clusters in more detail. - - === Ticket === - - Tickets are, essentially, cluster-wide attributes. A ticket grants the - right to run certain resources on a specific cluster site. Resources can - be bound to a certain ticket by +rsc_ticket+ constraints. Only if the - ticket is available at a site can the respective resources be started there. - Vice versa, if the ticket is revoked, the resources depending on that - ticket must be stopped. - - The ticket thus is similar to a 'site quorum', i.e. the permission to - manage/own resources associated with that site. (One can also think of the - current +have-quorum+ flag as a special, cluster-wide ticket that is granted in - case of node majority.) - - Tickets can be granted and revoked either manually by administrators - (which could be the default for classic enterprise clusters), or via - the automated CTR mechanism described below. - - A ticket can only be owned by one site at a time. Initially, none - of the sites has a ticket. Each ticket must be granted once by the cluster - administrator. - - The presence or absence of tickets for a site is stored in the CIB as a - cluster status. With regards to a certain ticket, there are only two states - for a site: +true+ (the site has the ticket) or +false+ (the site does - not have the ticket). The absence of a certain ticket (during the initial - state of the multi-site cluster) is the same as the value +false+. - - === Dead Man Dependency === - - A site can only activate resources safely if it can be sure that the - other site has deactivated them. 
However after a ticket is revoked, it can - take a long time until all resources depending on that ticket are stopped - "cleanly", especially in case of cascaded resources. To cut that process - short, the concept of a 'Dead Man Dependency' was introduced. - - If a dead man dependency is in force, if a ticket is revoked from a site, the - nodes that are hosting dependent resources are fenced. This considerably speeds - up the recovery process of the cluster and makes sure that resources can be - migrated more quickly. - - This can be configured by specifying a +loss-policy="fence"+ in - +rsc_ticket+ constraints. - - === Cluster Ticket Registry === - - A CTR is a coordinated group of network daemons that automatically handles - granting, revoking, and timing out tickets (instead of the administrator - revoking the ticket somewhere, waiting for everything to stop, and then - granting it on the desired site). - - Pacemaker does not implement its own CTR, but interoperates with external - software designed for that purpose (similar to how resource and fencing agents - are not directly part of pacemaker). - - Participating clusters run the CTR daemons, which connect to each other, exchange - information about their connectivity, and vote on which sites gets which - tickets. - - A ticket is granted to a site only once the CTR is sure that the ticket - has been relinquished by the previous owner, implemented via a timer in most - scenarios. If a site loses connection to its peers, its tickets time out and - recovery occurs. After the connection timeout plus the recovery timeout has - passed, the other sites are allowed to re-acquire the ticket and start the - resources again. - - This can also be thought of as a "quorum server", except that it is not - a single quorum ticket, but several. - - === Configuration Replication === - - As usual, the CIB is synchronized within each cluster, but it is 'not' synchronized - across cluster sites of a multi-site cluster. You have to configure the resources - that will be highly available across the multi-site cluster for every site - accordingly. +Apart from local clusters, Pacemaker also supports multi-site clusters. +That means you can have multiple, geographically dispersed sites, each with a +local cluster. Failover between these clusters can be coordinated +manually by the administrator, or automatically by a higher-level entity called +a *Cluster Ticket Registry (CTR)*. + +Challenges for Multi-Site Clusters +################################## + +Typically, multi-site environments are too far apart to support +synchronous communication and data replication between the sites. +That leads to significant challenges: + +- How do we make sure that a cluster site is up and running? + +- How do we make sure that resources are only started once? + +- How do we make sure that quorum can be reached between the different + sites and a split-brain scenario avoided? + +- How do we manage failover between sites? + +- How do we deal with high latency in case of resources that need to be + stopped? + +In the following sections, learn how to meet these challenges. + +Conceptual Overview +################### + +Multi-site clusters can be considered as “overlay” clusters where +each cluster site corresponds to a cluster node in a traditional cluster. +The overlay cluster can be managed by a CTR in order to +guarantee that any cluster resource will be active +on no more than one cluster site. 
This is achieved by using
+*tickets* that are treated as a failover domain between cluster
+sites, in case a site should be down.
+
+The following sections explain the individual components and mechanisms
+that were introduced for multi-site clusters in more detail.
+
+Ticket
+______
+
+Tickets are, essentially, cluster-wide attributes. A ticket grants the
+right to run certain resources on a specific cluster site. Resources can
+be bound to a certain ticket by ``rsc_ticket`` constraints. Only if the
+ticket is available at a site can the respective resources be started there.
+Conversely, if the ticket is revoked, the resources depending on that
+ticket must be stopped.
+
+The ticket is thus similar to a *site quorum*, i.e. the permission to
+manage/own resources associated with that site. (One can also think of the
+current ``have-quorum`` flag as a special, cluster-wide ticket that is
+granted in case of node majority.)
+
+Tickets can be granted and revoked either manually by administrators
+(which could be the default for classic enterprise clusters), or via
+the automated CTR mechanism described below.
+
+A ticket can only be owned by one site at a time. Initially, none
+of the sites has a ticket. Each ticket must be granted once by the cluster
+administrator.
+
+The presence or absence of tickets for a site is stored in the CIB as a
+cluster status. With regard to a certain ticket, there are only two states
+for a site: ``true`` (the site has the ticket) or ``false`` (the site does
+not have the ticket). The absence of a certain ticket (during the initial
+state of the multi-site cluster) is the same as the value ``false``.
+
+Dead Man Dependency
+___________________
+
+A site can only activate resources safely if it can be sure that the
+other site has deactivated them. However, after a ticket is revoked, it can
+take a long time until all resources depending on that ticket are stopped
+"cleanly", especially in case of cascaded resources. To cut that process
+short, the concept of a *Dead Man Dependency* was introduced.
+
+If a dead man dependency is in force and a ticket is revoked from a site, the
+nodes that are hosting dependent resources are fenced. This considerably speeds
+up the recovery process of the cluster and makes sure that resources can be
+migrated more quickly.
+
+This can be configured by specifying a ``loss-policy="fence"`` in
+``rsc_ticket`` constraints.
+
+Cluster Ticket Registry
+_______________________
+
+A CTR is a coordinated group of network daemons that automatically handles
+granting, revoking, and timing out tickets (instead of the administrator
+revoking the ticket somewhere, waiting for everything to stop, and then
+granting it on the desired site).
+
+Pacemaker does not implement its own CTR, but interoperates with external
+software designed for that purpose (similar to how resource and fencing agents
+are not directly part of Pacemaker).
+
+Participating clusters run the CTR daemons, which connect to each other,
+exchange information about their connectivity, and vote on which site gets
+which tickets.
+
+A ticket is granted to a site only once the CTR is sure that the ticket
+has been relinquished by the previous owner, implemented via a timer in most
+scenarios. If a site loses connection to its peers, its tickets time out and
+recovery occurs. After the connection timeout plus the recovery timeout has
+passed, the other sites are allowed to re-acquire the ticket and start the
+resources again.
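+For example, if the connection timeout were 60 seconds and the recovery
+timeout 120 seconds (purely illustrative values), a site that lost contact
+with the ticket holder would have to wait at least 180 seconds before it
+could take over the ticket and start the dependent resources.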
+ +This can also be thought of as a "quorum server", except that it is not +a single quorum ticket, but several. + +Configuration Replication +_________________________ + +As usual, the CIB is synchronized within each cluster, but it is *not* synchronized +across cluster sites of a multi-site cluster. You have to configure the resources +that will be highly available across the multi-site cluster for every site +accordingly. .. _ticket-constraints: Configuring Ticket Dependencies ############################### -.. Convert_to_RST_2: - - The `rsc_ticket` constraint lets you specify the resources depending on a certain - ticket. Together with the constraint, you can set a `loss-policy` that defines - what should happen to the respective resources if the ticket is revoked. - - The attribute `loss-policy` can have the following values: - - * +fence:+ Fence the nodes that are running the relevant resources. - - * +stop:+ Stop the relevant resources. - - * +freeze:+ Do nothing to the relevant resources. - - * +demote:+ Demote relevant resources that are running in master mode to slave mode. - - - .Constraint that fences node if +ticketA+ is revoked - ==== - [source,XML] - ------- - - ------- - ==== - - The example above creates a constraint with the ID +rsc1-req-ticketA+. It - defines that the resource +rsc1+ depends on +ticketA+ and that the node running - the resource should be fenced if +ticketA+ is revoked. - - If resource +rsc1+ were a promotable resource (i.e. it could run in master or - slave mode), you might want to configure that only master mode - depends on +ticketA+. With the following configuration, +rsc1+ will be - demoted to slave mode if +ticketA+ is revoked: - - .Constraint that demotes +rsc1+ if +ticketA+ is revoked - ==== - [source,XML] - ------- - - ------- - ==== - - You can create multiple `rsc_ticket` constraints to let multiple resources - depend on the same ticket. However, `rsc_ticket` also supports resource sets - (see <>), - so one can easily list all the resources in one `rsc_ticket` constraint instead. - - .Ticket constraint for multiple resources - ==== - [source,XML] - ------- - - - - - - - - - - - ------- - ==== - - In the example above, there are two resource sets, so we can list resources - with different roles in a single +rsc_ticket+ constraint. There's no dependency - between the two resource sets, and there's no dependency among the - resources within a resource set. Each of the resources just depends on - +ticketA+. - - Referencing resource templates in +rsc_ticket+ constraints, and even - referencing them within resource sets, is also supported. - - If you want other resources to depend on further tickets, create as many - constraints as necessary with +rsc_ticket+. - - - == Managing Multi-Site Clusters == - - === Granting and Revoking Tickets Manually === - - You can grant tickets to sites or revoke them from sites manually. - If you want to re-distribute a ticket, you should wait for - the dependent resources to stop cleanly at the previous site before you - grant the ticket to the new site. - - Use the `crm_ticket` command line tool to grant and revoke tickets. - - //// - These commands will actually just print a message telling the user that they - require '--force'. That is probably a good exercise rather than letting novice - users cut and paste '--force' here. 
-   ////
-
-   To grant a ticket to this site:
-   -------
-   # crm_ticket --ticket ticketA --grant
-   -------
-
-   To revoke a ticket from this site:
-   -------
-   # crm_ticket --ticket ticketA --revoke
-   -------
-
-   [IMPORTANT]
-   ====
-   If you are managing tickets manually, use the `crm_ticket` command with
+The **rsc_ticket** constraint lets you specify the resources depending on a certain
+ticket. Together with the constraint, you can set a **loss-policy** that defines
+what should happen to the respective resources if the ticket is revoked.
+
+The attribute **loss-policy** can have the following values:
+
+* ``fence``: Fence the nodes that are running the relevant resources.
+
+* ``stop``: Stop the relevant resources.
+
+* ``freeze``: Do nothing to the relevant resources.
+
+* ``demote``: Demote relevant resources that are running in master mode to slave mode.
+
+.. topic:: Constraint that fences node if ``ticketA`` is revoked
+
+   .. code-block:: xml
+
+      <rsc_ticket id="rsc1-req-ticketA" rsc="rsc1" ticket="ticketA" loss-policy="fence"/>
+
+The example above creates a constraint with the ID ``rsc1-req-ticketA``. It
+defines that the resource ``rsc1`` depends on ``ticketA`` and that the node running
+the resource should be fenced if ``ticketA`` is revoked.
+
+If resource ``rsc1`` were a promotable resource (i.e. it could run in master or
+slave mode), you might want to configure that only master mode
+depends on ``ticketA``. With the following configuration, ``rsc1`` will be
+demoted to slave mode if ``ticketA`` is revoked:
+
+.. topic:: Constraint that demotes ``rsc1`` if ``ticketA`` is revoked
+
+   .. code-block:: xml
+
+      <rsc_ticket id="rsc1-req-ticketA" rsc="rsc1" rsc-role="Master" ticket="ticketA" loss-policy="demote"/>
+
+You can create multiple **rsc_ticket** constraints to let multiple resources
+depend on the same ticket. However, **rsc_ticket** also supports resource sets
+(see :ref:`s-resource-sets`), so one can easily list all the resources in one
+**rsc_ticket** constraint instead.
+
+.. topic:: Ticket constraint for multiple resources
+
+   .. code-block:: xml
+
+      <rsc_ticket id="resources-dep-ticketA" ticket="ticketA" loss-policy="fence">
+        <resource_set id="resources-dep-ticketA-0" role="Started">
+          <resource_ref id="rsc1"/>
+          <resource_ref id="group1"/>
+          <resource_ref id="clone1"/>
+        </resource_set>
+        <resource_set id="resources-dep-ticketA-1" role="Master">
+          <resource_ref id="ms1"/>
+        </resource_set>
+      </rsc_ticket>
+
+In the example above, there are two resource sets, so we can list resources
+with different roles in a single ``rsc_ticket`` constraint. There's no dependency
+between the two resource sets, and there's no dependency among the
+resources within a resource set. Each of the resources just depends on
+``ticketA``.
+
+Referencing resource templates in ``rsc_ticket`` constraints, and even
+referencing them within resource sets, is also supported.
+
+If you want other resources to depend on further tickets, create as many
+constraints as necessary with ``rsc_ticket``.
+
+Managing Multi-Site Clusters
+############################
+
+Granting and Revoking Tickets Manually
+______________________________________
+
+You can grant tickets to sites or revoke them from sites manually.
+If you want to re-distribute a ticket, you should wait for
+the dependent resources to stop cleanly at the previous site before you
+grant the ticket to the new site.
+
+Use the **crm_ticket** command line tool to grant and revoke tickets.
+
+To grant a ticket to this site:
+
+   .. code-block:: none
+
+      # crm_ticket --ticket ticketA --grant
+
+To revoke a ticket from this site:
+
+   .. code-block:: none
+
+      # crm_ticket --ticket ticketA --revoke
+
+.. important::
+
+   If you are managing tickets manually, use the **crm_ticket** command with
    great care, because it cannot check whether the same ticket is already
    granted elsewhere.
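+
+Putting this together, a careful manual move of ``ticketA`` from the site that
+currently holds it to another site might look like the outline below (run each
+command on a node of the respective cluster; the two commands are the ones
+shown above, but the overall sequence is only a sketch, not a fixed procedure):
+
+   .. code-block:: none
+
+      # crm_ticket --ticket ticketA --revoke
+
+        (on the site that currently holds the ticket; then wait until all
+         dependent resources have stopped cleanly, for example by watching
+         "crm_mon --tickets" and the resource status)
+
+      # crm_ticket --ticket ticketA --grant
+
+        (on the site that should take over the ticket)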
- ==== - - - === Granting and Revoking Tickets via a Cluster Ticket Registry === - - We will use https://github.com/ClusterLabs/booth[Booth] here as an example of - software that can be used with pacemaker as a Cluster Ticket Registry. Booth - implements the - http://en.wikipedia.org/wiki/Raft_%28computer_science%29[Raft] - algorithm to guarantee the distributed consensus among different - cluster sites, and manages the ticket distribution (and thus the failover - process between sites). - - Each of the participating clusters and 'arbitrators' runs the Booth daemon - `boothd`. - - An 'arbitrator' is the multi-site equivalent of a quorum-only node in a local - cluster. If you have a setup with an even number of sites, - you need an additional instance to reach consensus about decisions such - as failover of resources across sites. In this case, add one or more - arbitrators running at additional sites. Arbitrators are single machines - that run a booth instance in a special mode. An arbitrator is especially - important for a two-site scenario, otherwise there is no way for one site - to distinguish between a network failure between it and the other site, and - a failure of the other site. - - The most common multi-site scenario is probably a multi-site cluster with two - sites and a single arbitrator on a third site. However, technically, there are - no limitations with regards to the number of sites and the number of - arbitrators involved. - - `Boothd` at each site connects to its peers running at the other sites and - exchanges connectivity details. Once a ticket is granted to a site, the - booth mechanism will manage the ticket automatically: If the site which - holds the ticket is out of service, the booth daemons will vote which - of the other sites will get the ticket. To protect against brief - connection failures, sites that lose the vote (either explicitly or - implicitly by being disconnected from the voting body) need to - relinquish the ticket after a time-out. Thus, it is made sure that a - ticket will only be re-distributed after it has been relinquished by the - previous site. The resources that depend on that ticket will fail over - to the new site holding the ticket. The nodes that have run the - resources before will be treated according to the `loss-policy` you set - within the `rsc_ticket` constraint. - - Before the booth can manage a certain ticket within the multi-site cluster, - you initially need to grant it to a site manually via the `booth` command-line - tool. After you have initially granted a ticket to a site, `boothd` - will take over and manage the ticket automatically. - - [IMPORTANT] - ==== - The `booth` command-line tool can be used to grant, list, or - revoke tickets and can be run on any machine where `boothd` is running. - If you are managing tickets via Booth, use only `booth` for manual - intervention, not `crm_ticket`. That ensures the same ticket + +Granting and Revoking Tickets via a Cluster Ticket Registry +___________________________________________________________ + +We will use `Booth `_ here as an example of +software that can be used with pacemaker as a Cluster Ticket Registry. Booth +implements the `Raft `_ +algorithm to guarantee the distributed consensus among different +cluster sites, and manages the ticket distribution (and thus the failover +process between sites). + +Each of the participating clusters and *arbitrators* runs the Booth daemon +**boothd**. 
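+
+All Booth instances are typically driven by one shared configuration file. For
+illustration only, a minimal ``booth.conf`` for two sites and one arbitrator
+might look like the sketch below; the addresses, port, and ``expire`` value are
+placeholders, and the Booth documentation is the authoritative reference for
+the file format:
+
+   .. code-block:: none
+
+      # Example booth.conf, shared by both sites and the arbitrator
+      transport = UDP
+      port = 9929
+      # one "site" entry per cluster site, one "arbitrator" entry per arbitrator
+      site = 192.168.201.100
+      site = 192.168.202.100
+      arbitrator = 192.168.203.100
+      # each ticket that Booth should manage, with how long a granted
+      # ticket stays valid before it must be renewed
+      ticket = "ticketA"
+          expire = 600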
+ +An *arbitrator* is the multi-site equivalent of a quorum-only node in a local +cluster. If you have a setup with an even number of sites, +you need an additional instance to reach consensus about decisions such +as failover of resources across sites. In this case, add one or more +arbitrators running at additional sites. Arbitrators are single machines +that run a booth instance in a special mode. An arbitrator is especially +important for a two-site scenario, otherwise there is no way for one site +to distinguish between a network failure between it and the other site, and +a failure of the other site. + +The most common multi-site scenario is probably a multi-site cluster with two +sites and a single arbitrator on a third site. However, technically, there are +no limitations with regards to the number of sites and the number of +arbitrators involved. + +**Boothd** at each site connects to its peers running at the other sites and +exchanges connectivity details. Once a ticket is granted to a site, the +booth mechanism will manage the ticket automatically: If the site which +holds the ticket is out of service, the booth daemons will vote which +of the other sites will get the ticket. To protect against brief +connection failures, sites that lose the vote (either explicitly or +implicitly by being disconnected from the voting body) need to +relinquish the ticket after a time-out. Thus, it is made sure that a +ticket will only be re-distributed after it has been relinquished by the +previous site. The resources that depend on that ticket will fail over +to the new site holding the ticket. The nodes that have run the +resources before will be treated according to the **loss-policy** you set +within the **rsc_ticket** constraint. + +Before the booth can manage a certain ticket within the multi-site cluster, +you initially need to grant it to a site manually via the **booth** command-line +tool. After you have initially granted a ticket to a site, **boothd** +will take over and manage the ticket automatically. + +.. important:: + + The **booth** command-line tool can be used to grant, list, or + revoke tickets and can be run on any machine where **boothd** is running. + If you are managing tickets via Booth, use only **booth** for manual + intervention, not **crm_ticket**. That ensures the same ticket will only be owned by one cluster site at a time. - ==== - - ==== Booth Requirements ==== - - * All clusters that will be part of the multi-site cluster must be based on - Pacemaker. - - * Booth must be installed on all cluster nodes and on all arbitrators that will - be part of the multi-site cluster. - - * Nodes belonging to the same cluster site should be synchronized via NTP. However, - time synchronization is not required between the individual cluster sites. - - === General Management of Tickets === - - Display the information of tickets: - ------- - # crm_ticket --info - ------- - - Or you can monitor them with: - ------- - # crm_mon --tickets - ------- - - Display the +rsc_ticket+ constraints that apply to a ticket: - ------- - # crm_ticket --ticket ticketA --constraints - ------- - - When you want to do maintenance or manual switch-over of a ticket, - revoking the ticket would trigger the loss policies. If - +loss-policy="fence"+, the dependent resources could not be gracefully - stopped/demoted, and other unrelated resources could even be affected. 
- - The proper way is making the ticket 'standby' first with: - ------- - # crm_ticket --ticket ticketA --standby - ------- - - Then the dependent resources will be stopped or demoted gracefully without - triggering the loss policies. - - If you have finished the maintenance and want to activate the ticket again, - you can run: - ------- - # crm_ticket --ticket ticketA --activate - ------- - - == For more information == - - * https://www.suse.com/documentation/sle-ha-geo-12/art_ha_geo_quick/data/art_ha_geo_quick.html[SUSE's Geo Clustering quick start] - - * https://github.com/ClusterLabs/booth[Booth] + +Booth Requirements +~~~~~~~~~~~~~~~~~~ + +* All clusters that will be part of the multi-site cluster must be based on + Pacemaker. + +* Booth must be installed on all cluster nodes and on all arbitrators that will + be part of the multi-site cluster. + +* Nodes belonging to the same cluster site should be synchronized via NTP. However, + time synchronization is not required between the individual cluster sites. + +General Management of Tickets +_____________________________ + +Display the information of tickets: + + .. code-block:: none + + # crm_ticket --info + +Or you can monitor them with: + + .. code-block:: none + + # crm_mon --tickets + +Display the ``rsc_ticket`` constraints that apply to a ticket: + + .. code-block:: none + + # crm_ticket --ticket ticketA --constraints + +When you want to do maintenance or manual switch-over of a ticket, +revoking the ticket would trigger the loss policies. If +``loss-policy="fence"``, the dependent resources could not be gracefully +stopped/demoted, and other unrelated resources could even be affected. + +The proper way is making the ticket *standby* first with: + + .. code-block:: none + + # crm_ticket --ticket ticketA --standby + +Then the dependent resources will be stopped or demoted gracefully without +triggering the loss policies. + +If you have finished the maintenance and want to activate the ticket again, +you can run: + + .. code-block:: none + + # crm_ticket --ticket ticketA --activate + +For more information +#################### + +* `SUSE's Geo Clustering quick start `_ + +* `Booth `_ diff --git a/doc/sphinx/Pacemaker_Explained/reusing-configuration.rst b/doc/sphinx/Pacemaker_Explained/reusing-configuration.rst index 45212e8d52..5df391b623 100644 --- a/doc/sphinx/Pacemaker_Explained/reusing-configuration.rst +++ b/doc/sphinx/Pacemaker_Explained/reusing-configuration.rst @@ -1,378 +1,415 @@ Reusing Parts of the Configuration ---------------------------------- -.. Convert_to_RST: - - Pacemaker provides multiple ways to simplify the configuration XML by reusing - parts of it in multiple places. - - Besides simplifying the XML, this also allows you to manipulate multiple - configuration elements with a single reference. - - == Reusing Resource Definitions == - - If you want to create lots of resources with similar configurations, defining a - 'resource template' simplifies the task. Once defined, it can be referenced in - primitives or in certain types of constraints. - - === Configuring Resources with Templates === - - The primitives referencing the template will inherit all meta-attributes, - instance attributes, utilization attributes and operations defined - in the template. And you can define specific attributes and operations for any - of the primitives. If any of these are defined in both the template and the - primitive, the values defined in the primitive will take precedence over the - ones defined in the template. 
- - Hence, resource templates help to reduce the amount of configuration work. - If any changes are needed, they can be done to the template definition and - will take effect globally in all resource definitions referencing that - template. - - Resource templates have a syntax similar to that of primitives. - - .Resource template for a migratable Xen virtual machine - ==== - [source,XML] - ---- - - ---- - ==== - - Once you define a resource template, you can use it in primitives by specifying the - +template+ property. - - .Xen primitive resource using a resource template - ==== - [source,XML] - ---- - - - - - - - ---- - ==== - - In the example above, the new primitive +vm1+ will inherit everything from +vm-template+. For - example, the equivalent of the above two examples would be: - - .Equivalent Xen primitive resource not using a resource template - ==== - [source,XML] - ---- - - - - - - - - - - - - - - - - - ---- - ==== - - If you want to overwrite some attributes or operations, add them to the - particular primitive's definition. - - .Xen resource overriding template values - ==== - [source,XML] - ---- - - - - - - - - - - - - - - - - - ---- - ==== - - In the example above, the new primitive +vm2+ has special - attribute values. Its +monitor+ operation has a longer +timeout+ and +interval+, and - the primitive has an additional +stop+ operation. - - To see the resulting definition of a resource, run: - - ---- +Pacemaker provides multiple ways to simplify the configuration XML by reusing +parts of it in multiple places. + +Besides simplifying the XML, this also allows you to manipulate multiple +configuration elements with a single reference. + +Reusing Resource Definitions +############################ + +If you want to create lots of resources with similar configurations, defining a +*resource template* simplifies the task. Once defined, it can be referenced in +primitives or in certain types of constraints. + +Configuring Resources with Templates +____________________________________ + +The primitives referencing the template will inherit all meta-attributes, +instance attributes, utilization attributes and operations defined +in the template. And you can define specific attributes and operations for any +of the primitives. If any of these are defined in both the template and the +primitive, the values defined in the primitive will take precedence over the +ones defined in the template. + +Hence, resource templates help to reduce the amount of configuration work. +If any changes are needed, they can be done to the template definition and +will take effect globally in all resource definitions referencing that +template. + +Resource templates have a syntax similar to that of primitives. + +.. topic:: Resource template for a migratable Xen virtual machine + + .. code-block:: xml + + + +Once you define a resource template, you can use it in primitives by specifying the +``template`` property. + +.. topic:: Xen primitive resource using a resource template + + .. code-block:: xml + + + + + + + + +In the example above, the new primitive ``vm1`` will inherit everything from ``vm-template``. For +example, the equivalent of the above two examples would be: + +.. topic:: Equivalent Xen primitive resource not using a resource template + + .. code-block:: xml + + + + + + + + + + + + + + + + + + +If you want to overwrite some attributes or operations, add them to the +particular primitive's definition. + +.. topic:: Xen resource overriding template values + + .. 
code-block:: xml + + + + + + + + + + + + + + + + + + +In the example above, the new primitive ``vm2`` has special attribute values. +Its ``monitor`` operation has a longer ``timeout`` and ``interval``, and +the primitive has an additional ``stop`` operation. + +To see the resulting definition of a resource, run: + +.. code-block:: none + # crm_resource --query-xml --resource vm2 - ---- - - To see the raw definition of a resource in the CIB, run: - - ---- + +To see the raw definition of a resource in the CIB, run: + +.. code-block:: none + # crm_resource --query-xml-raw --resource vm2 - ---- - - === Using Templates in Constraints === - - A resource template can be referenced in the following types of constraints: - - - +order+ constraints (see <>) - - +colocation+ constraints (see <>) - - +rsc_ticket+ constraints (for multi-site clusters as described in <>) - - Resource templates referenced in constraints stand for all primitives which are - derived from that template. This means, the constraint applies to all primitive - resources referencing the resource template. Referencing resource templates in - constraints is an alternative to resource sets and can simplify the cluster - configuration considerably. - - For example, given the example templates earlier in this chapter: - - [source,XML] + +Using Templates in Constraints +______________________________ + +A resource template can be referenced in the following types of constraints: + +- ``order`` constraints (see :ref:`s-resource-ordering`) +- ``colocation`` constraints (see :ref:`s-resource-colocation`) +- ``rsc_ticket`` constraints (for multi-site clusters as described in :ref:`ticket-constraints`) + +Resource templates referenced in constraints stand for all primitives which are +derived from that template. This means, the constraint applies to all primitive +resources referencing the resource template. Referencing resource templates in +constraints is an alternative to resource sets and can simplify the cluster +configuration considerably. + +For example, given the example templates earlier in this chapter: + +.. code-block:: xml + - - would colocate all VMs with +base-rsc+ and is the equivalent of the following constraint configuration: - - [source,XML] - ---- + +would colocate all VMs with ``base-rsc`` and is the equivalent of the following constraint configuration: + +.. code-block:: xml + - ---- - - [NOTE] - ====== + +.. note:: + In a colocation constraint, only one template may be referenced from either - `rsc` or `with-rsc`; the other reference must be a regular resource. - ====== - - === Using Templates in Resource Sets === - - Resource templates can also be referenced in resource sets. - - For example, given the example templates earlier in this section, then: - - [source,XML] - ---- + ``rsc`` or ``with-rsc``; the other reference must be a regular resource. + +Using Templates in Resource Sets +________________________________ + +Resource templates can also be referenced in resource sets. + +For example, given the example templates earlier in this section, then: + +.. code-block:: xml + - ---- - - is the equivalent of the following constraint using a sequential resource set: - - [source,XML] - ---- + +is the equivalent of the following constraint using a sequential resource set: + +.. code-block:: xml + - ---- - - Or, if the resources referencing the template can run in parallel, then: - - [source,XML] - ---- + +Or, if the resources referencing the template can run in parallel, then: + +.. 
code-block:: xml + - ---- - - is the equivalent of the following constraint configuration: - - [source,XML] - ---- + +is the equivalent of the following constraint configuration: + +.. code-block:: xml + - ---- - - [[s-reusing-config-elements]] - == Reusing Rules, Options and Sets of Operations == - - Sometimes a number of constraints need to use the same set of rules, - and resources need to set the same options and parameters. To - simplify this situation, you can refer to an existing object using an - +id-ref+ instead of an +id+. - - So if for one resource you have - - [source,XML] - ------ + +.. _s-reusing-config-elements:: + +Reusing Rules, Options and Sets of Operations +############################################# + +Sometimes a number of constraints need to use the same set of rules, +and resources need to set the same options and parameters. To +simplify this situation, you can refer to an existing object using an +``id-ref`` instead of an ``id``. + +So if for one resource you have + +.. code-block:: xml + - ------ - - Then instead of duplicating the rule for all your other resources, you can instead specify: - - .Referencing rules from other constraints - ===== - [source,XML] - ------- + +Then instead of duplicating the rule for all your other resources, you can instead specify: + +.. topic:: Referencing rules from other constraints + +.. code-block:: xml + - + - ------- - ===== - - [IMPORTANT] - =========== - The cluster will insist that the +rule+ exists somewhere. Attempting + +.. important:: + + The cluster will insist that the ``rule`` exists somewhere. Attempting to add a reference to a non-existing rule will cause a validation - failure, as will attempting to remove a +rule+ that is referenced + failure, as will attempting to remove a ``rule`` that is referenced elsewhere. - =========== - - The same principle applies for +meta_attributes+ and - +instance_attributes+ as illustrated in the example below: - - .Referencing attributes, options, and operations from other resources - ===== - [source,XML] - ------- - - - - - - - - - - - - - - - - - - - - - ------- - ===== - - +id-ref+ can similarly be used with +resource_set+ (in any constraint type), - +nvpair+, and +operations+. - - == Tagging Configuration Elements == - - Pacemaker allows you to 'tag' any configuration element that has an XML ID. - - The main purpose of tagging is to support higher-level user interface tools; - Pacemaker itself only uses tags within constraints. Therefore, what you can - do with tags mostly depends on the tools you use. - - === Configuring Tags === - - A tag is simply a named list of XML IDs. - - .Tag referencing three resources - ==== - [source,XML] - ---- + +The same principle applies for ``meta_attributes`` and +``instance_attributes`` as illustrated in the example below: + +.. topic:: Referencing attributes, options, and operations from other resources + + .. code-block:: xml + + + + + + + + + + + + + + + + + + + + + + +``id-ref`` can similarly be used with ``resource_set`` (in any constraint type), +``nvpair``, and ``operations``. + +Tagging Configuration Elements +############################## + +Pacemaker allows you to *tag* any configuration element that has an XML ID. + +The main purpose of tagging is to support higher-level user interface tools; +Pacemaker itself only uses tags within constraints. Therefore, what you can +do with tags mostly depends on the tools you use. + +Configuring Tags +________________ + +A tag is simply a named list of XML IDs. + +.. 
topic:: Tag referencing three resources + + .. code-block:: xml + + + + + + + + + +What you can do with this new tag depends on what your higher-level tools +support. For example, a tool might allow you to enable or disable all of +the tagged resources at once, or show the status of just the tagged +resources. + +A single configuration element can be listed in any number of tags. + +Using Tags in Constraints and Resource Sets +___________________________________________ + +Pacemaker itself only uses tags in constraints. If you supply a tag name +instead of a resource name in any constraint, the constraint will apply to +all resources listed in that tag. + +.. topic:: Constraint using a tag + + .. code-block:: xml + + + +In the example above, assuming the ``all-vms`` tag is defined as in the previous +example, the constraint will behave the same as: + +.. topic:: Equivalent constraints without tags + + .. code-block:: xml + + + + + +A tag may be used directly in the constraint, or indirectly by being +listed in a :ref:`resource set ` used in the constraint. +When used in a resource set, an expanded tag will honor the set's +``sequential`` property. + +Filtering With Tags +___________________ + +The ``crm_mon`` tool can be used to display lots of information about the +state of the cluster. On large or complicated clusters, this can include +a lot of information, which makes it difficult to find the one thing you +are interested in. The ``--resource=`` and ``--node=`` command line +options can be used to filter results. In their most basic usage, these +options take a single resource or node name. However, they can also +be supplied with a tag name to display several objects at once. + +For instance, given the following CIB section: + +.. code-block:: xml + + + + + + + + + + + + - - - - + + + - ---- - ==== - - What you can do with this new tag depends on what your higher-level tools - support. For example, a tool might allow you to enable or disable all of - the tagged resources at once, or show the status of just the tagged - resources. - - A single configuration element can be listed in any number of tags. - - === Using Tags in Constraints and Resource Sets === - - Pacemaker itself only uses tags in constraints. If you supply a tag name - instead of a resource name in any constraint, the constraint will apply to - all resources listed in that tag. - - .Constraint using a tag - ==== - [source,XML] - ---- - - ---- - ==== - - In the example above, assuming the +all-vms+ tag is defined as in the previous - example, the constraint will behave the same as: - - .Equivalent constraints without tags - ==== - [source,XML] - ---- - - - - ---- - ==== - - A tag may be used directly in the constraint, or indirectly by being - listed in a <> used in the constraint. - When used in a resource set, an expanded tag will honor the set's - +sequential+ property. + +The following would be output for ``crm_mon --resource=inactive-rscs -r``: + +.. 
code-block:: none + + Cluster Summary: + * Stack: corosync + * Current DC: cluster02 (version 2.0.4-1.e97f9675f.git.el7-e97f9675f) - partition with quorum + * Last updated: Tue Oct 20 16:09:01 2020 + * Last change: Tue May 5 12:04:36 2020 by hacluster via crmd on cluster01 + * 5 nodes configured + * 27 resource instances configured (4 DISABLED) + + Node List: + * Online: [ cluster01 cluster02 ] + + Full List of Resources: + * Clone Set: inactive-clone [inactive-dhcpd] (disabled): + * Stopped (disabled): [ cluster01 cluster02 ] + * Resource Group: inactive-group (disabled): + * inactive-dummy-1 (ocf::pacemaker:Dummy): Stopped (disabled) + * inactive-dummy-2 (ocf::pacemaker:Dummy): Stopped (disabled)
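+
+The ``--node=`` option can be used the same way: if the configuration also
+contained a tag whose members are node IDs (say, a hypothetical tag named
+``even-nodes``), the status of just those nodes could be displayed with:
+
+.. code-block:: none
+
+   # crm_mon --node=even-nodes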