diff --git a/doc/sphinx/Pacemaker_Explained/cluster-options.rst b/doc/sphinx/Pacemaker_Explained/cluster-options.rst
index 77bd7e65bc..15375f4ee0 100644
--- a/doc/sphinx/Pacemaker_Explained/cluster-options.rst
+++ b/doc/sphinx/Pacemaker_Explained/cluster-options.rst
@@ -1,921 +1,784 @@
Cluster-Wide Configuration
--------------------------
.. index::
pair: XML element; cib
pair: XML element; configuration
Configuration Layout
####################
The cluster is defined by the Cluster Information Base (CIB), which uses XML
notation. The simplest CIB, an empty one, looks like this:
.. topic:: An empty configuration
.. code-block:: xml
The empty configuration above contains the major sections that make up a CIB:
* ``cib``: The entire CIB is enclosed with a ``cib`` element. Certain
fundamental settings are defined as attributes of this element.
* ``configuration``: This section -- the primary focus of this document --
contains traditional configuration information such as what resources the
cluster serves and the relationships among them.
* ``crm_config``: cluster-wide configuration options
* ``nodes``: the machines that host the cluster
* ``resources``: the services run by the cluster
* ``constraints``: indications of how resources should be placed
* ``status``: This section contains the history of each resource on each
node. Based on this data, the cluster can construct the complete current
state of the cluster. The authoritative source for this section is the
local executor (pacemaker-execd process) on each cluster node, and the
cluster will occasionally repopulate the entire section. For this reason,
it is never written to disk, and administrators are advised against
modifying it in any way.
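
As a rough illustration of the layout described above, the following Python sketch parses a stripped-down stand-in for an empty CIB and lists the sections it finds. The XML literal is an assumption assembled from the section names in the list; a real CIB also carries attributes on the ``cib`` element, which are omitted here.

```python
import xml.etree.ElementTree as ET

# Stand-in for an empty CIB: element names come from the list above;
# the attributes a real CIB carries on <cib> are deliberately omitted.
EMPTY_CIB = """
<cib>
  <configuration>
    <crm_config/>
    <nodes/>
    <resources/>
    <constraints/>
  </configuration>
  <status/>
</cib>
"""


def list_sections(cib_xml):
    """Return (top-level sections, configuration subsections) of a CIB."""
    root = ET.fromstring(cib_xml)
    top = [child.tag for child in root]
    config = [child.tag for child in root.find("configuration")]
    return top, config


top_level, config_sections = list_sections(EMPTY_CIB)
print(top_level)
print(config_sections)
```
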
In this document, configuration settings will be described as properties or
options based on how they are defined in the CIB:
* Properties are XML attributes of an XML element.
* Options are name-value pairs expressed as ``nvpair`` child elements of an XML
element.
Normally, you will use command-line tools that abstract the XML, so the
distinction will be unimportant; both properties and options are cluster
settings you can tweak.
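
The distinction can be seen by reading both kinds of setting from a small fragment. This is a minimal sketch: the ``id`` values and the chosen settings are hypothetical, and the ``cluster_property_set`` wrapper is assumed from Pacemaker's schema rather than taken from this section.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the id values and settings are illustrative only.
CIB = """
<cib admin_epoch="0" epoch="5" num_updates="2">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="opt-maintenance-mode" name="maintenance-mode" value="false"/>
        <nvpair id="opt-batch-limit" name="batch-limit" value="10"/>
      </cluster_property_set>
    </crm_config>
  </configuration>
</cib>
"""

root = ET.fromstring(CIB)

# Properties: XML attributes of an element (here, of <cib>).
properties = dict(root.attrib)

# Options: nvpair child elements, each carrying a name and a value.
options = {nv.get("name"): nv.get("value") for nv in root.iter("nvpair")}

print(properties)
print(options)
```
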
-Configuration Value Types
-#########################
-
-Throughout this document, configuration values will be designated as having one
-of the following types:
-
-.. list-table:: **Configuration Value Types**
- :class: longtable
- :widths: 1 3
- :header-rows: 1
-
- * - Type
- - Description
- * - .. _boolean:
-
- .. index::
- pair: type; boolean
-
- boolean
- - Case-insensitive text value where ``1``, ``yes``, ``y``, ``on``,
- and ``true`` evaluate as true and ``0``, ``no``, ``n``, ``off``,
- ``false``, and unset evaluate as false
- * - .. _date_time:
-
- .. index::
- pair: type; date/time
-
- date/time
- - Textual timestamp like ``Sat Dec 21 11:47:45 2013``
- * - .. _duration:
-
- .. index::
- pair: type; duration
-
- duration
- - A time duration, specified either like a :ref:`timeout ` or an
- `ISO 8601 duration `_.
- A duration may be up to approximately 49 days but is intended for much
- smaller time periods.
- * - .. _enumeration:
-
- .. index::
- pair: type; enumeration
-
- enumeration
- - Text that must be one of a set of defined values (which will be listed
- in the description)
- * - .. _integer:
-
- .. index::
- pair: type; integer
-
- integer
- - 32-bit signed integer value (-2,147,483,648 to 2,147,483,647)
- * - .. _nonnegative_integer:
-
- .. index::
- pair: type; nonnegative integer
-
- nonnegative integer
- - 32-bit nonnegative integer value (0 to 2,147,483,647)
- * - .. _port:
-
- .. index::
- pair: type; port
-
- port
- - Integer TCP port number (0 to 65535)
- * - .. _score:
-
- .. index::
- pair: type; score
-
- score
- - A Pacemaker score can be an integer between -1,000,000 and 1,000,000, or
- a string alias: ``INFINITY`` or ``+INFINITY`` is equivalent to
- 1,000,000, ``-INFINITY`` is equivalent to -1,000,000, and ``red``,
- ``yellow``, and ``green`` are equivalent to integers as described in
- :ref:`node-health`.
- * - .. _text:
-
- .. index::
- pair: type; text
-
- text
- - A text string
- * - .. _timeout:
-
- .. index::
- pair: type; timeout
-
- timeout
- - A time duration, specified as a bare number (in which case it is
- considered to be in seconds) or a number with a unit (``ms`` or ``msec``
- for milliseconds, ``us`` or ``usec`` for microseconds, ``s`` or ``sec``
- for seconds, ``m`` or ``min`` for minutes, ``h`` or ``hr`` for hours)
- optionally with whitespace before and/or after the number.
- * - .. _version:
-
- .. index::
- pair: type; version
-
- version
- - Version number (any combination of alphanumeric characters, dots, and
- dashes, starting with a number).
-
-
-Scores
-______
-
-Scores are integral to how Pacemaker works. Practically everything from moving
-a resource to deciding which resource to stop in a degraded cluster is achieved
-by manipulating scores in some way.
-
-Scores are calculated per resource and node. Any node with a negative score for
-a resource can't run that resource. The cluster places a resource on the node
-with the highest score for it.
-
-Score addition and subtraction follow these rules:
-
-* Any value (including ``INFINITY``) - ``INFINITY`` = ``-INFINITY``
-* ``INFINITY`` + any value other than ``-INFINITY`` = ``INFINITY``
-
-.. note::
-
- What if you want to use a score higher than 1,000,000? Typically this possibility
- arises when someone wants to base the score on some external metric that might
- go above 1,000,000.
-
- The short answer is you can't.
-
- The long answer is it is sometimes possible work around this limitation
- creatively. You may be able to set the score to some computed value based on
- the external metric rather than use the metric directly. For nodes, you can
- store the metric as a node attribute, and query the attribute when computing
- the score (possibly as part of a custom resource agent).
-
CIB Properties
##############
Certain settings are defined by CIB properties (that is, attributes of the
``cib`` tag) rather than with the rest of the cluster configuration in the
``configuration`` section.
The reason is simply a matter of parsing. These options are used by the
configuration database which is, by design, mostly ignorant of the content it
holds. So the decision was made to place them in an easy-to-find location.
.. list-table:: **CIB Properties**
:class: longtable
:widths: 2 2 2 5
:header-rows: 1
* - Name
- Type
- Default
- Description
* - .. _admin_epoch:
.. index::
pair: admin_epoch; cib
admin_epoch
- :ref:`nonnegative integer <nonnegative_integer>`
- 0
- When a node joins the cluster, the cluster asks the node with the
highest (``admin_epoch``, ``epoch``, ``num_updates``) tuple to replace
the configuration on all the nodes -- which makes setting them correctly
very important. ``admin_epoch`` is never modified by the cluster; you
can use this to make the configurations on any inactive nodes obsolete.
* - .. _epoch:
.. index::
pair: epoch; cib
epoch
- :ref:`nonnegative integer <nonnegative_integer>`
- 0
- The cluster increments this every time the CIB's configuration section
is updated.
* - .. _num_updates:
.. index::
pair: num_updates; cib
num_updates
- :ref:`nonnegative integer <nonnegative_integer>`
- 0
- The cluster increments this every time the CIB's configuration or status
sections are updated, and resets it to 0 when epoch changes.
* - .. _validate_with:
.. index::
pair: validate-with; cib
validate-with
- :ref:`enumeration <enumeration>`
-
- Determines the type of XML validation that will be done on the
configuration. Allowed values are ``none`` (in which case the cluster
will not require that updates conform to expected syntax) and the base
names of schema files installed on the local machine (for example,
"pacemaker-3.9")
* - .. _remote_tls_port:
.. index::
pair: remote-tls-port; cib
remote-tls-port
- :ref:`port <port>`
-
- If set, the CIB manager will listen for anonymously encrypted remote
connections on this port, to allow CIB administration from hosts not in
the cluster. No key is used, so this should be used only on a protected
network where man-in-the-middle attacks can be avoided.
* - .. _remote_clear_port:
.. index::
pair: remote-clear-port; cib
remote-clear-port
- :ref:`port <port>`
-
- If set to a TCP port number, the CIB manager will listen for remote
connections on this port, to allow for CIB administration from hosts not
in the cluster. No encryption is used, so this should be used only on a
protected network.
* - .. _cib_last_written:
.. index::
pair: cib-last-written; cib
cib-last-written
- :ref:`date/time <date_time>`
-
- Indicates when the configuration was last written to disk. Maintained by
the cluster; for informational purposes only.
* - .. _have_quorum:
.. index::
pair: have-quorum; cib
have-quorum
- :ref:`boolean <boolean>`
-
- Indicates whether the cluster has quorum. If false, the cluster's
response is determined by ``no-quorum-policy`` (see below). Maintained
by the cluster.
* - .. _dc_uuid:
.. index::
pair: dc-uuid; cib
dc-uuid
- :ref:`text <text>`
-
- Node ID of the cluster's current designated controller (DC). Used and
maintained by the cluster.
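
The role of the (``admin_epoch``, ``epoch``, ``num_updates``) tuple can be sketched as follows, assuming the tuple is compared element by element from left to right. The node names and values are hypothetical, and the functions are illustrative helpers, not part of Pacemaker.

```python
def version_tuple(cib_attrs):
    """Build the (admin_epoch, epoch, num_updates) tuple; unset counts as 0."""
    return tuple(
        int(cib_attrs.get(name, 0))
        for name in ("admin_epoch", "epoch", "num_updates")
    )


def newest(node_cibs):
    """Return the name of the node whose CIB has the highest version tuple."""
    return max(node_cibs, key=lambda node: version_tuple(node_cibs[node]))


# Hypothetical nodes and values.
node_cibs = {
    "node1": {"admin_epoch": "0", "epoch": "42", "num_updates": "7"},
    "node2": {"admin_epoch": "1", "epoch": "3", "num_updates": "0"},
    "node3": {"admin_epoch": "0", "epoch": "42", "num_updates": "9"},
}
winner = newest(node_cibs)
print(winner)
```

Here ``node2`` is selected even though its ``epoch`` is lower, because ``admin_epoch`` is compared first. This is why raising ``admin_epoch`` by hand makes the configurations on other nodes obsolete.
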
.. _cluster_options:
Cluster Options
###############
Cluster options, as you might expect, control how the cluster behaves when
confronted with various situations.
They are grouped into sets within the ``crm_config`` section. In advanced
configurations, there may be more than one set. (This will be described later
in the chapter on :ref:`rules` where we will show how to have the cluster use
different sets of options during working hours than during weekends.) For now,
we will describe the simple case where each option is present at most once.
You can obtain an up-to-date list of cluster options, including their default
values, by running the ``man pacemaker-schedulerd`` and
``man pacemaker-controld`` commands.
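
For the simple case where each option appears at most once, the effective value of an option is either the configured ``nvpair`` value or the default. The sketch below illustrates that overlay. The ``id`` values are hypothetical, the ``cluster_property_set`` element name is assumed from Pacemaker's schema, and only a handful of defaults from the table in this section are included.

```python
import xml.etree.ElementTree as ET

# A few defaults copied from the Cluster Options table in this chapter;
# this is not the full list.
DEFAULTS = {
    "no-quorum-policy": "stop",
    "batch-limit": "0",
    "symmetric-cluster": "true",
    "stonith-timeout": "60s",
}


def effective_options(crm_config_xml):
    """Overlay explicitly configured nvpairs on top of the defaults."""
    section = ET.fromstring(crm_config_xml)
    configured = {
        nv.get("name"): nv.get("value") for nv in section.iter("nvpair")
    }
    return {**DEFAULTS, **configured}


# Hypothetical crm_config section with a single configured option.
CRM_CONFIG = """
<crm_config>
  <cluster_property_set id="example-options">
    <nvpair id="example-nqp" name="no-quorum-policy" value="freeze"/>
  </cluster_property_set>
</crm_config>
"""
opts = effective_options(CRM_CONFIG)
print(opts)
```
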
.. list-table:: **Cluster Options**
:class: longtable
:widths: 2 2 2 5
:header-rows: 1
* - Name
- Type
- Default
- Description
* - .. _cluster_name:
.. index::
pair: cluster option; cluster-name
cluster-name
- :ref:`text <text>`
-
- An (optional) name for the cluster as a whole. This is mostly for users'
convenience for use as desired in administration, but can be used in the
Pacemaker configuration in :ref:`rules` (as the ``#cluster-name``
:ref:`node attribute `). It may also
be used by higher-level tools when displaying cluster information, and
by certain resource agents (for example, the ``ocf:heartbeat:GFS2``
agent stores the cluster name in filesystem meta-data).
* - .. _dc_version:
.. index::
pair: cluster option; dc-version
dc-version
- :ref:`version <version>`
- *detected*
- Version of Pacemaker on the cluster's designated controller (DC).
Maintained by the cluster, and intended for diagnostic purposes.
* - .. _cluster_infrastructure:
.. index::
pair: cluster option; cluster-infrastructure
cluster-infrastructure
- :ref:`text <text>`
- *detected*
- The messaging layer with which Pacemaker is currently running.
Maintained by the cluster, and intended for informational and diagnostic
purposes.
* - .. _no_quorum_policy:
.. index::
pair: cluster option; no-quorum-policy
no-quorum-policy
- :ref:`enumeration <enumeration>`
- stop
- What to do when the cluster does not have quorum. Allowed values:
* ``ignore:`` continue all resource management
* ``freeze:`` continue resource management, but don't recover resources
from nodes not in the affected partition
* ``stop:`` stop all resources in the affected cluster partition
* ``demote:`` demote promotable resources and stop all other resources
in the affected cluster partition *(since 2.0.5)*
* ``suicide:`` fence all nodes in the affected cluster partition
* - .. _batch_limit:
.. index::
pair: cluster option; batch-limit
batch-limit
- :ref:`integer <integer>`
- 0
- The maximum number of actions that the cluster may execute in parallel
across all nodes. The ideal value will depend on the speed and load
of your network and cluster nodes. If zero, the cluster will impose a
dynamically calculated limit only when any node has high load. If -1,
the cluster will not impose any limit.
* - .. _migration_limit:
.. index::
pair: cluster option; migration-limit
migration-limit
- :ref:`integer <integer>`
- -1
- The number of :ref:`live migration ` actions that the
cluster is allowed to execute in parallel on a node. A value of -1 means
unlimited.
* - .. _symmetric_cluster:
.. index::
pair: cluster option; symmetric-cluster
symmetric-cluster
- :ref:`boolean <boolean>`
- true
- If true, resources can run on any node by default. If false, a resource
is allowed to run on a node only if a
:ref:`location constraint ` enables it.
* - .. _stop_all_resources:
.. index::
pair: cluster option; stop-all-resources
stop-all-resources
- :ref:`boolean <boolean>`
- false
- Whether all resources should be disallowed from running (can be useful
during maintenance or troubleshooting)
* - .. _stop_orphan_resources:
.. index::
pair: cluster option; stop-orphan-resources
stop-orphan-resources
- :ref:`boolean <boolean>`
- true
- Whether resources that have been deleted from the configuration should
be stopped. This value takes precedence over
:ref:`is-managed ` (that is, even unmanaged resources will
be stopped when orphaned if this value is ``true``).
* - .. _stop_orphan_actions:
.. index::
pair: cluster option; stop-orphan-actions
stop-orphan-actions
- :ref:`boolean <boolean>`
- true
- Whether recurring :ref:`operations ` that have been deleted
from the configuration should be cancelled
* - .. _start_failure_is_fatal:
.. index::
pair: cluster option; start-failure-is-fatal
start-failure-is-fatal
- :ref:`boolean <boolean>`
- true
- Whether a failure to start a resource on a particular node prevents
further start attempts on that node. If ``false``, the cluster will
decide whether the node is still eligible based on the resource's
current failure count and ``migration-threshold``.
* - .. _enable_startup_probes:
.. index::
pair: cluster option; enable-startup-probes
enable-startup-probes
- :ref:`boolean <boolean>`
- true
- Whether the cluster should check the pre-existing state of resources
when the cluster starts
* - .. _maintenance_mode:
.. index::
pair: cluster option; maintenance-mode
maintenance-mode
- :ref:`boolean <boolean>`
- false
- If true, the cluster will not start or stop any resource in the cluster,
and any recurring operations (except those specifying ``role`` as
``Stopped``) will be paused. If true, this overrides the
:ref:`maintenance ` node attribute,
:ref:`is-managed ` and :ref:`maintenance `
resource meta-attributes, and :ref:`enabled ` operation
meta-attribute.
* - .. _stonith_enabled:
.. index::
pair: cluster option; stonith-enabled
stonith-enabled
- :ref:`boolean <boolean>`
- true
- Whether the cluster is allowed to fence nodes (for example, failed nodes
and nodes with resources that can't be stopped).
If true, at least one fence device must be configured before resources
are allowed to run.
If false, unresponsive nodes are immediately assumed to be running no
resources, and resource recovery on online nodes starts without any
further protection (which can mean *data loss* if the unresponsive node
still accesses shared storage, for example). See also the
:ref:`requires ` resource meta-attribute.
* - .. _stonith_action:
.. index::
pair: cluster option; stonith-action
stonith-action
- :ref:`enumeration <enumeration>`
- reboot
- Action the cluster should send to the fence agent when a node must be
fenced. Allowed values are ``reboot``, ``off``, and (for legacy agents
only) ``poweroff``.
* - .. _stonith_timeout:
.. index::
pair: cluster option; stonith-timeout
stonith-timeout
- :ref:`duration <duration>`
- 60s
- How long to wait for ``on``, ``off``, and ``reboot`` fence actions to
complete by default.
* - .. _stonith_max_attempts:
.. index::
pair: cluster option; stonith-max-attempts
stonith-max-attempts
- :ref:`score <score>`
- 10
- How many times fencing can fail for a target before the cluster will no
longer immediately re-attempt it. Any value below 1 will be ignored, and
the default will be used instead.
* - .. _stonith_watchdog_timeout:
.. index::
pair: cluster option; stonith-watchdog-timeout
stonith-watchdog-timeout
- :ref:`timeout <timeout>`
- 0
- If nonzero, and the cluster detects ``have-watchdog`` as ``true``, then
watchdog-based self-fencing will be performed via SBD when fencing is
required, without requiring a fencing resource explicitly configured.
If this is set to a positive value, unseen nodes are assumed to
self-fence within this much time.
**Warning:** It must be ensured that this value is larger than the
``SBD_WATCHDOG_TIMEOUT`` environment variable on all nodes. Pacemaker
verifies the settings individually on all nodes and prevents startup or
shuts down if configured wrongly on the fly. It is strongly recommended
that ``SBD_WATCHDOG_TIMEOUT`` be set to the same value on all nodes.
If this is set to a negative value, and ``SBD_WATCHDOG_TIMEOUT`` is set,
twice that value will be used.
**Warning:** In this case, it is essential (and currently not verified
by Pacemaker) that ``SBD_WATCHDOG_TIMEOUT`` is set to the same value on
all nodes.
* - .. _concurrent-fencing:
.. index::
pair: cluster option; concurrent-fencing
concurrent-fencing
- :ref:`boolean <boolean>`
- false
- Whether the cluster is allowed to initiate multiple fence actions
concurrently. Fence actions initiated externally, such as via the
``stonith_admin`` tool or an application such as DLM, or by the fencer
itself such as recurring device monitors and ``status`` and ``list``
commands, are not limited by this option.
* - .. _fence_reaction:
.. index::
pair: cluster option; fence-reaction
fence-reaction
- :ref:`enumeration <enumeration>`
- stop
- How should a cluster node react if notified of its own fencing? A
cluster node may receive notification of its own fencing if fencing is
misconfigured, or if fabric fencing is in use that doesn't cut cluster
communication. Allowed values are ``stop`` to attempt to immediately
stop Pacemaker and stay stopped, or ``panic`` to attempt to immediately
reboot the local node, falling back to stop on failure. The default is
likely to be changed to ``panic`` in a future release. *(since 2.0.3)*
* - .. _priority_fencing_delay:
.. index::
pair: cluster option; priority-fencing-delay
priority-fencing-delay
- :ref:`duration <duration>`
- 0
- Apply this delay to any fencing targeting the lost nodes with the
highest total resource priority in case we don't have the majority of
the nodes in our cluster partition, so that the more significant nodes
potentially win any fencing match (especially meaningful in a
split-brain of a 2-node cluster). A promoted resource instance takes the
resource's priority plus 1 if the resource's priority is not 0. Any
static or random delays introduced by ``pcmk_delay_base`` and
``pcmk_delay_max`` configured for the corresponding fencing resources
will be added to this delay. This delay should be significantly greater
than (safely twice) the maximum delay from those parameters. *(since
2.0.4)*
* - .. _node_pending_timeout:
.. index::
pair: cluster option; node-pending-timeout
node-pending-timeout
- :ref:`duration <duration>`
- 0
- Fence nodes that do not join the controller process group within this
much time after joining the cluster, to allow the cluster to continue
managing resources. A value of 0 means pending nodes are never fenced; a
value of ``2h``, for example, means they are fenced after 2 hours.
*(since 2.1.7)*
* - .. _cluster_delay:
.. index::
pair: cluster option; cluster-delay
cluster-delay
- :ref:`duration <duration>`
- 60s
- If the DC requires an action to be executed on another node, it will
consider the action failed if it does not get a response from the other
node within this time (beyond the action's own timeout). The ideal value
will depend on the speed and load of your network and cluster nodes.
* - .. _dc_deadtime:
.. index::
pair: cluster option; dc-deadtime
dc-deadtime
- :ref:`duration <duration>`
- 20s
- How long to wait for a response from other nodes when electing a DC. The
ideal value will depend on the speed and load of your network and
cluster nodes.
* - .. _cluster_ipc_limit:
.. index::
pair: cluster option; cluster-ipc-limit
cluster-ipc-limit
- :ref:`nonnegative integer <nonnegative_integer>`
- 500
- The maximum IPC message backlog before one cluster daemon will
disconnect another. This is of use in large clusters, for which a good
value is the number of resources in the cluster multiplied by the number
of nodes. The default of 500 is also the minimum. Raise this if you see
"Evicting client" log messages for cluster daemon process IDs.
* - .. _pe_error_series_max:
.. index::
pair: cluster option; pe-error-series-max
pe-error-series-max
- :ref:`integer <integer>`
- -1
- The number of scheduler inputs resulting in errors to save. These inputs
can be helpful during troubleshooting and when reporting issues. A
negative value means save all inputs, and 0 means save none.
* - .. _pe_warn_series_max:
.. index::
pair: cluster option; pe-warn-series-max
pe-warn-series-max
- :ref:`integer <integer>`
- 5000
- The number of scheduler inputs resulting in warnings to save. These
inputs can be helpful during troubleshooting and when reporting issues.
A negative value means save all inputs, and 0 means save none.
* - .. _pe_input_series_max:
.. index::
pair: cluster option; pe-input-series-max
pe-input-series-max
- :ref:`integer <integer>`
- 4000
- The number of "normal" scheduler inputs to save. These inputs can be
helpful during troubleshooting and when reporting issues. A negative
value means save all inputs, and 0 means save none.
* - .. _enable_acl:
.. index::
pair: cluster option; enable-acl
enable-acl
- :ref:`boolean <boolean>`
- false
- Whether :ref:`access control lists ` should be used to authorize
CIB modifications
* - .. _placement_strategy:
.. index::
pair: cluster option; placement-strategy
placement-strategy
- :ref:`enumeration <enumeration>`
- default
- How the cluster should assign resources to nodes (see
:ref:`utilization`). Allowed values are ``default``, ``utilization``,
``balanced``, and ``minimal``.
* - .. _node_health_strategy:
.. index::
pair: cluster option; node-health-strategy
node-health-strategy
- :ref:`enumeration <enumeration>`
- none
- How the cluster should react to :ref:`node health `
attributes. Allowed values are ``none``, ``migrate-on-red``,
``only-green``, ``progressive``, and ``custom``.
* - .. _node_health_base:
.. index::
pair: cluster option; node-health-base
node-health-base
- :ref:`score <score>`
- 0
- The base health score assigned to a node. Only used when
``node-health-strategy`` is ``progressive``.
* - .. _node_health_green:
.. index::
pair: cluster option; node-health-green
node-health-green
- :ref:`score <score>`
- 0
- The score to use for a node health attribute whose value is ``green``.
Only used when ``node-health-strategy`` is ``progressive`` or
``custom``.
* - .. _node_health_yellow:
.. index::
pair: cluster option; node-health-yellow
node-health-yellow
- :ref:`score <score>`
- 0
- The score to use for a node health attribute whose value is ``yellow``.
Only used when ``node-health-strategy`` is ``progressive`` or
``custom``.
* - .. _node_health_red:
.. index::
pair: cluster option; node-health-red
node-health-red
- :ref:`score <score>`
- 0
- The score to use for a node health attribute whose value is ``red``.
Only used when ``node-health-strategy`` is ``progressive`` or
``custom``.
* - .. _cluster_recheck_interval:
.. index::
pair: cluster option; cluster-recheck-interval
cluster-recheck-interval
- :ref:`duration <duration>`
- 15min
- Pacemaker is primarily event-driven, and looks ahead to know when to
recheck the cluster for failure timeouts and most time-based rules
*(since 2.0.3)*. However, it will also recheck the cluster after this
amount of inactivity. This has two goals: rules with ``date_spec`` are
only guaranteed to be checked this often, and it also serves as a
fail-safe for some kinds of scheduler bugs. A value of 0 disables this
polling.
* - .. _shutdown_lock:
.. index::
pair: cluster option; shutdown-lock
shutdown-lock
- :ref:`boolean <boolean>`
- false
- The default of false allows active resources to be recovered elsewhere
when their node is cleanly shut down, which is what the vast majority of
users will want. However, some users prefer to make resources highly
available only for failures, with no recovery for clean shutdowns. If
this option is true, resources active on a node when it is cleanly shut
down are kept "locked" to that node (not allowed to run elsewhere) until
they start again on that node after it rejoins (or for at most
``shutdown-lock-limit``, if set). Stonith resources and Pacemaker Remote
connections are never locked. Clone and bundle instances and the
promoted role of promotable clones are currently never locked, though
support could be added in a future release. Locks may be manually
cleared using the ``--refresh`` option of ``crm_resource`` (both the
resource and node must be specified; this works with remote nodes if
their connection resource's ``target-role`` is set to ``Stopped``, but
not if Pacemaker Remote is stopped on the remote node without disabling
the connection resource). *(since 2.0.4)*
* - .. _shutdown_lock_limit:
.. index::
pair: cluster option; shutdown-lock-limit
shutdown-lock-limit
- :ref:`duration <duration>`
- 0
- If ``shutdown-lock`` is true, and this is set to a nonzero time
duration, locked resources will be allowed to start after this much time
has passed since the node shutdown was initiated, even if the node has
not rejoined. (This works with remote nodes only if their connection
resource's ``target-role`` is set to ``Stopped``.) *(since 2.0.4)*
* - .. _remove_after_stop:
.. index::
pair: cluster option; remove-after-stop
remove-after-stop
- :ref:`boolean <boolean>`
- false
- *Deprecated* Whether the cluster should remove resources from
Pacemaker's executor after they are stopped. Values other than the
default are, at best, poorly tested and potentially dangerous. This
option is deprecated and will be removed in a future release.
* - .. _startup_fencing:
.. index::
pair: cluster option; startup-fencing
startup-fencing
- :ref:`boolean <boolean>`
- true
- *Advanced Use Only:* Whether the cluster should fence unseen nodes at
start-up. Setting this to false is unsafe, because the unseen nodes
could be active and running resources but unreachable. ``dc-deadtime``
acts as a grace period before this fencing, since a DC must be elected
to schedule fencing.
* - .. _election_timeout:
.. index::
pair: cluster option; election-timeout
election-timeout
- :ref:`duration <duration>`
- 2min
- *Advanced Use Only:* If a winner is not declared within this much time
of starting an election, the node that initiated the election will
declare itself the winner.
* - .. _shutdown_escalation:
.. index::
pair: cluster option; shutdown-escalation
shutdown-escalation
- :ref:`duration <duration>`
- 20min
- *Advanced Use Only:* The controller will exit immediately if a shutdown
does not complete within this much time.
* - .. _join_integration_timeout:
.. index::
pair: cluster option; join-integration-timeout
join-integration-timeout
- :ref:`duration <duration>`
- 3min
- *Advanced Use Only:* If you need to adjust this value, it probably
indicates the presence of a bug.
* - .. _join_finalization_timeout:
.. index::
pair: cluster option; join-finalization-timeout
join-finalization-timeout
- :ref:`duration <duration>`
- 30min
- *Advanced Use Only:* If you need to adjust this value, it probably
indicates the presence of a bug.
* - .. _transition_delay:
.. index::
pair: cluster option; transition-delay
transition-delay
- :ref:`duration <duration>`
- 0s
- *Advanced Use Only:* Delay cluster recovery for the configured interval
to allow for additional or related events to occur. This can be useful
if your configuration is sensitive to the order in which ping updates
arrive. Enabling this option will slow down cluster recovery under all
conditions.
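
To make the ``stonith-watchdog-timeout`` rules from the table above concrete, here is a minimal sketch of how the effective value could be resolved. It assumes both values are already expressed in seconds, ignores the ``have-watchdog`` check, and raises an exception where Pacemaker itself would refuse to start or shut down. The function name and error handling are this sketch's own choices, not Pacemaker's implementation.

```python
def effective_watchdog_timeout(configured_s, sbd_watchdog_timeout_s=None):
    """Resolve stonith-watchdog-timeout (seconds) as the table describes.

    Returns 0 when watchdog self-fencing is disabled, otherwise the time
    within which unseen nodes are assumed to have self-fenced.
    """
    if configured_s == 0:
        return 0  # watchdog-based self-fencing is not used
    if configured_s < 0:
        if sbd_watchdog_timeout_s is None:
            # The table only covers the case where SBD_WATCHDOG_TIMEOUT is
            # set; raising here is a choice made for this sketch.
            raise ValueError("SBD_WATCHDOG_TIMEOUT is not set")
        return 2 * sbd_watchdog_timeout_s
    if sbd_watchdog_timeout_s is not None and configured_s <= sbd_watchdog_timeout_s:
        raise ValueError(
            "stonith-watchdog-timeout must be larger than SBD_WATCHDOG_TIMEOUT"
        )
    return configured_s


print(effective_watchdog_timeout(-1, 5))
print(effective_watchdog_timeout(10, 5))
```
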
diff --git a/doc/sphinx/Pacemaker_Explained/local-options.rst b/doc/sphinx/Pacemaker_Explained/local-options.rst
index b95051777f..8c32990a54 100644
--- a/doc/sphinx/Pacemaker_Explained/local-options.rst
+++ b/doc/sphinx/Pacemaker_Explained/local-options.rst
@@ -1,568 +1,709 @@
Host-Local Configuration
------------------------
.. index::
pair: XML element; configuration
.. note:: Directory and file paths below may differ on your system depending on
your Pacemaker build settings. Check your Pacemaker configuration
file to find the correct paths.
+Configuration Value Types
+#########################
+
+Throughout this document, configuration values will be designated as having one
+of the following types:
+
+.. list-table:: **Configuration Value Types**
+ :class: longtable
+ :widths: 1 3
+ :header-rows: 1
+
+ * - Type
+ - Description
+ * - .. _boolean:
+
+ .. index::
+ pair: type; boolean
+
+ boolean
+ - Case-insensitive text value where ``1``, ``yes``, ``y``, ``on``,
+ and ``true`` evaluate as true and ``0``, ``no``, ``n``, ``off``,
+ ``false``, and unset evaluate as false
+ * - .. _date_time:
+
+ .. index::
+ pair: type; date/time
+
+ date/time
+ - Textual timestamp like ``Sat Dec 21 11:47:45 2013``
+ * - .. _duration:
+
+ .. index::
+ pair: type; duration
+
+ duration
+ - A time duration, specified either like a :ref:`timeout <timeout>` or an
+ `ISO 8601 duration `_.
+ A duration may be up to approximately 49 days but is intended for much
+ smaller time periods.
+ * - .. _enumeration:
+
+ .. index::
+ pair: type; enumeration
+
+ enumeration
+ - Text that must be one of a set of defined values (which will be listed
+ in the description)
+ * - .. _integer:
+
+ .. index::
+ pair: type; integer
+
+ integer
+ - 32-bit signed integer value (-2,147,483,648 to 2,147,483,647)
+ * - .. _nonnegative_integer:
+
+ .. index::
+ pair: type; nonnegative integer
+
+ nonnegative integer
+ - 32-bit nonnegative integer value (0 to 2,147,483,647)
+ * - .. _port:
+
+ .. index::
+ pair: type; port
+
+ port
+ - Integer TCP port number (0 to 65535)
+ * - .. _score:
+
+ .. index::
+ pair: type; score
+
+ score
+ - A Pacemaker score can be an integer between -1,000,000 and 1,000,000, or
+ a string alias: ``INFINITY`` or ``+INFINITY`` is equivalent to
+ 1,000,000, ``-INFINITY`` is equivalent to -1,000,000, and ``red``,
+ ``yellow``, and ``green`` are equivalent to integers as described in
+ :ref:`node-health`.
+ * - .. _text:
+
+ .. index::
+ pair: type; text
+
+ text
+ - A text string
+ * - .. _timeout:
+
+ .. index::
+ pair: type; timeout
+
+ timeout
+ - A time duration, specified as a bare number (in which case it is
+ considered to be in seconds) or a number with a unit (``ms`` or ``msec``
+ for milliseconds, ``us`` or ``usec`` for microseconds, ``s`` or ``sec``
+ for seconds, ``m`` or ``min`` for minutes, ``h`` or ``hr`` for hours)
+ optionally with whitespace before and/or after the number.
+ * - .. _version:
+
+ .. index::
+ pair: type; version
+
+ version
+ - Version number (any combination of alphanumeric characters, dots, and
+ dashes, starting with a number).
+
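
The boolean and timeout rules in the table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Pacemaker's parser: it accepts only non-negative whole numbers for timeouts, returns microseconds so that every listed unit maps to an integer, and raises ``ValueError`` on anything the table does not describe.

```python
import re

TRUE_VALUES = {"1", "yes", "y", "on", "true"}
FALSE_VALUES = {"0", "no", "n", "off", "false"}


def parse_boolean(text):
    """Case-insensitive boolean; unset (None) evaluates as false."""
    if text is None:
        return False
    value = text.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Multiplier from each unit to microseconds; a bare number is in seconds.
UNIT_US = {
    "": 1_000_000, "s": 1_000_000, "sec": 1_000_000,
    "ms": 1_000, "msec": 1_000,
    "us": 1, "usec": 1,
    "m": 60_000_000, "min": 60_000_000,
    "h": 3_600_000_000, "hr": 3_600_000_000,
}


def parse_timeout_us(text):
    """Convert a timeout string to microseconds."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]*)\s*", text.lower())
    if match is None or match.group(2) not in UNIT_US:
        raise ValueError(f"not a timeout: {text!r}")
    return int(match.group(1)) * UNIT_US[match.group(2)]


print(parse_boolean("On"), parse_boolean(None))
print(parse_timeout_us("30"), parse_timeout_us("2 min"))
```
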
+
+Scores
+______
+
+Scores are integral to how Pacemaker works. Practically everything from moving
+a resource to deciding which resource to stop in a degraded cluster is achieved
+by manipulating scores in some way.
+
+Scores are calculated per resource and node. Any node with a negative score for
+a resource can't run that resource. The cluster places a resource on the node
+with the highest score for it.
+
+Score addition and subtraction follow these rules:
+
+* Any value (including ``INFINITY``) - ``INFINITY`` = ``-INFINITY``
+* ``INFINITY`` + any value other than ``-INFINITY`` = ``INFINITY``
+
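
The two rules above can be expressed as a small sketch. It assumes that ``-INFINITY`` takes precedence whenever it appears on either side of an addition, and that results outside the valid range are clamped; the second point is an assumption of this example rather than something stated here. The ``red``, ``yellow``, and ``green`` aliases are left out because their values are defined elsewhere.

```python
INFINITY = 1_000_000

ALIASES = {"INFINITY": INFINITY, "+INFINITY": INFINITY, "-INFINITY": -INFINITY}


def parse_score(text):
    """Parse an integer or INFINITY alias, clamped to the valid range."""
    value = ALIASES[text] if text in ALIASES else int(text)
    return max(-INFINITY, min(INFINITY, value))


def add_scores(a, b):
    """Add two scores; -INFINITY dominates, then INFINITY."""
    if a <= -INFINITY or b <= -INFINITY:
        return -INFINITY
    if a >= INFINITY or b >= INFINITY:
        return INFINITY
    return max(-INFINITY, min(INFINITY, a + b))


def subtract_scores(a, b):
    """Subtract b from a by adding its negation."""
    return add_scores(a, -b)


print(subtract_scores(INFINITY, INFINITY))
print(add_scores(INFINITY, 500))
```
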
+.. note::
+
+ What if you want to use a score higher than 1,000,000? Typically this possibility
+ arises when someone wants to base the score on some external metric that might
+ go above 1,000,000.
+
+ The short answer is you can't.
+
+ The long answer is it is sometimes possible to work around this limitation
+ creatively. You may be able to set the score to some computed value based on
+ the external metric rather than use the metric directly. For nodes, you can
+ store the metric as a node attribute, and query the attribute when computing
+ the score (possibly as part of a custom resource agent).
+
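
One way to compute a score from an external metric is to scale the metric into a fixed range. The sketch below is a hypothetical helper: the function name, the ``metric_max`` bound, and the target range of 0 to 1000 are all illustrative choices, not Pacemaker settings.

```python
def metric_to_score(metric, metric_max, score_max=1000):
    """Map a metric in [0, metric_max] onto an integer score in [0, score_max]."""
    if metric_max <= 0:
        raise ValueError("metric_max must be positive")
    # Clamp first so that out-of-range readings cannot exceed score_max.
    clamped = max(0, min(metric, metric_max))
    return round(clamped * score_max / metric_max)


# A metric of 5,000,000 out of a possible 10,000,000 lands mid-range.
print(metric_to_score(5_000_000, 10_000_000))
```

Scaling keeps the relative ordering of nodes while staying well inside the valid score range, at the cost of some resolution.
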
+
+Local Options
+#############
+
Pacemaker supports several host-local configuration options. These options can
be configured on each node in the main Pacemaker configuration file
(|PCMK_CONFIG_FILE|) in the format ``<NAME>="<VALUE>"``. They work by setting
environment variables when Pacemaker daemons start up.
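To make the file format concrete, here is a small Python sketch that reads ``NAME="VALUE"`` lines into a dictionary. It is not how Pacemaker or the init system loads the file. The function name ``read_local_options`` is hypothetical, and treating lines that start with ``#`` as comments is an assumption.

```python
def read_local_options(text):
    """Collect NAME="VALUE" assignments from configuration file text."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and (assumed) '#' comment lines.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment
        value = value.strip()
        # Drop one pair of surrounding double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        options[name.strip()] = value
    return options
```

Given the text ``PCMK_logfacility="daemon"``, the function returns a dictionary that maps ``PCMK_logfacility`` to ``daemon``.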
.. list-table:: **Local Options**
:class: longtable
:widths: 2 2 2 5
:header-rows: 1
* - Name
- Type
- Default
- Description
* - .. _cib_pam_service:
.. index::
pair: node option; CIB_pam_service
CIB_pam_service
- :ref:`text <text>`
- login
- PAM service to use for remote CIB client authentication (passed to
``pam_start``).
* - .. _pcmk_logfacility:
.. index::
pair: node option; PCMK_logfacility
PCMK_logfacility
- :ref:`enumeration <enumeration>`
- daemon
- Enable logging via the system log or journal, using the specified log
facility. Messages sent here are of value to all Pacemaker
administrators. This can be disabled using ``none``, but that is not
recommended. Allowed values:
* ``none``
* ``daemon``
* ``user``
* ``local0``
* ``local1``
* ``local2``
* ``local3``
* ``local4``
* ``local5``
* ``local6``
* ``local7``
* - .. _pcmk_logpriority:
.. index::
pair: node option; PCMK_logpriority
PCMK_logpriority
- :ref:`enumeration <enumeration>`
- notice
- Unless system logging is disabled using ``PCMK_logfacility=none``,
messages of the specified log severity and higher will be sent to the
system log. The default is appropriate for most installations. Allowed
values:
* ``emerg``
* ``alert``
* ``crit``
* ``error``
* ``warning``
* ``notice``
* ``info``
* ``debug``
* - .. _pcmk_logfile:
.. index::
pair: node option; PCMK_logfile
PCMK_logfile
- :ref:`text <text>`
- |PCMK_LOG_FILE|
- Unless set to ``none``, more detailed log messages will be sent to the
specified file (in addition to the system log, if enabled). These
messages may have extended information, and will include messages of info
severity. This log is of more use to developers and advanced system
administrators, and when reporting problems. Note: The default is
|PCMK_CONTAINER_LOG_FILE| (inside the container) for bundled container
nodes; this would typically be mapped to a different path on the host
running the container.
* - .. _pcmk_logfile_mode:
.. index::
pair: node option; PCMK_logfile_mode
PCMK_logfile_mode
- :ref:`text <text>`
- 0660
- Pacemaker will set the permissions on the detail log to this value (see
``chmod(1)``).
* - .. _pcmk_debug:
.. index::
pair: node option; PCMK_debug
PCMK_debug
- :ref:`enumeration <enumeration>`
- no
- Whether to send debug severity messages to the detail log. This may be
set for all subsystems (``yes`` or ``no``) or for specific
(comma-separated) subsystems. Allowed subsystems are:
* ``pacemakerd``
* ``pacemaker-attrd``
* ``pacemaker-based``
* ``pacemaker-controld``
* ``pacemaker-execd``
* ``pacemaker-fenced``
* ``pacemaker-schedulerd``
Example: ``PCMK_debug="pacemakerd,pacemaker-execd"``
* - .. _pcmk_stderr:
.. index::
pair: node option; PCMK_stderr
PCMK_stderr
- :ref:`boolean <boolean>`
- no
- *Advanced Use Only:* Whether to send daemon log messages to stderr. This
would be useful only during troubleshooting, when starting Pacemaker
manually on the command line.
Setting this option in the configuration file is pointless, since the
file is not read when starting Pacemaker manually. However, it can be set
directly as an environment variable on the command line.
* - .. _pcmk_trace_functions:
.. index::
pair: node option; PCMK_trace_functions
PCMK_trace_functions
- :ref:`text <text>`
-
- *Advanced Use Only:* Send debug and trace severity messages from these
(comma-separated) source code functions to the detail log.
Example:
``PCMK_trace_functions="func1,func2"``
* - .. _pcmk_trace_files:
.. index::
pair: node option; PCMK_trace_files
PCMK_trace_files
- :ref:`text <text>`
-
- *Advanced Use Only:* Send debug and trace severity messages from all
functions in these (comma-separated) source file names to the detail log.
Example: ``PCMK_trace_files="file1.c,file2.c"``
* - .. _pcmk_trace_formats:
.. index::
pair: node option; PCMK_trace_formats
PCMK_trace_formats
- :ref:`text <text>`
-
- *Advanced Use Only:* Send trace severity messages that are generated by
these (comma-separated) format strings in the source code to the detail
log.
Example: ``PCMK_trace_formats="Error: %s (%d)"``
* - .. _pcmk_trace_tags:
.. index::
pair: node option; PCMK_trace_tags
PCMK_trace_tags
- :ref:`text <text>`
-
- *Advanced Use Only:* Send debug and trace severity messages related to
these (comma-separated) resource IDs to the detail log.
Example: ``PCMK_trace_tags="client-ip,dbfs"``
* - .. _pcmk_blackbox:
.. index::
pair: node option; PCMK_blackbox
PCMK_blackbox
- :ref:`enumeration <enumeration>`
- no
- *Advanced Use Only:* Enable blackbox logging globally (``yes`` or ``no``)
or by subsystem. A blackbox contains a rolling buffer of all logs (of all
severities). Blackboxes are stored under |CRM_BLACKBOX_DIR| by default,
and their contents can be viewed using the ``qb-blackbox(8)`` command.
The blackbox recorder can be enabled at start using this variable, or at
runtime by sending a Pacemaker subsystem daemon process a ``SIGUSR1`` or
``SIGTRAP`` signal, and disabled by sending ``SIGUSR2`` (see
``kill(1)``). The blackbox will be written after a crash, assertion
failure, or ``SIGTRAP`` signal.
See :ref:`PCMK_debug <pcmk_debug>` for allowed subsystems.
Example:
``PCMK_blackbox="pacemakerd,pacemaker-execd"``
* - .. _pcmk_trace_blackbox:
.. index::
pair: node option; PCMK_trace_blackbox
PCMK_trace_blackbox
- :ref:`enumeration <enumeration>`
-
- *Advanced Use Only:* Write a blackbox whenever the message at the
specified function and line is logged. Multiple entries may be
comma-separated.
Example: ``PCMK_trace_blackbox="remote.c:144,remote.c:149"``
* - .. _pcmk_node_start_state:
.. index::
pair: node option; PCMK_node_start_state
PCMK_node_start_state
- :ref:`enumeration <enumeration>`
- default
- By default, the local host will join the cluster in an online or standby
state when Pacemaker first starts depending on whether it was previously
put into standby mode. If this variable is set to ``standby`` or
``online``, it will force the local host to join in the specified state.
* - .. _pcmk_node_action_limit:
.. index::
pair: node option; PCMK_node_action_limit
PCMK_node_action_limit
- :ref:`nonnegative integer <nonnegative_integer>`
-
- Specify the maximum number of jobs that can be scheduled on this node. If
set, this overrides the ``node-action-limit`` cluster property for this
node.
* - .. _pcmk_shutdown_delay:
.. index::
pair: node option; PCMK_shutdown_delay
PCMK_shutdown_delay
- :ref:`timeout <timeout>`
-
- Specify a delay before shutting down ``pacemakerd`` after shutting down
all other Pacemaker daemons.
* - .. _pcmk_fail_fast:
.. index::
pair: node option; PCMK_fail_fast
PCMK_fail_fast
- :ref:`boolean <boolean>`
- no
- By default, if a Pacemaker subsystem crashes, the main ``pacemakerd``
process will attempt to restart it. If this variable is set to ``yes``,
``pacemakerd`` will panic the local host instead.
* - .. _pcmk_panic_action:
.. index::
pair: node option; PCMK_panic_action
PCMK_panic_action
- :ref:`enumeration <enumeration>`
- reboot
- Pacemaker will panic the local host under certain conditions. By default,
this means rebooting the host. This variable can change that behavior: if
``crash``, trigger a kernel crash (useful if you want a kernel dump to
investigate); if ``sync-reboot`` or ``sync-crash``, synchronize
filesystems before rebooting the host or triggering a kernel crash. The
sync values are more likely to preserve log messages, but with the risk
that the host may be left active if the synchronization hangs.
* - .. _pcmk_authkey_location:
.. index::
pair: node option; PCMK_authkey_location
PCMK_authkey_location
- :ref:`text <text>`
- |PCMK_AUTHKEY_FILE|
- Use the contents of this file as the authorization key to use with
Pacemaker Remote connections. This file must be readable by Pacemaker
daemons (that is, it must allow read permissions to either the
|CRM_DAEMON_USER| user or the |CRM_DAEMON_GROUP| group), and its contents
must be identical on all nodes.
* - .. _pcmk_remote_address:
.. index::
pair: node option; PCMK_remote_address
PCMK_remote_address
- :ref:`text <text>`
-
- By default, if the Pacemaker Remote service is run on the local node, it
will listen for connections on all IP addresses. This may be set to one
address to listen on instead, as a resolvable hostname or as a numeric
IPv4 or IPv6 address. When resolving names or listening on all addresses,
IPv6 will be preferred if available. When listening on an IPv6 address,
IPv4 clients will be supported via IPv4-mapped IPv6 addresses.
Example: ``PCMK_remote_address="192.0.2.1"``
* - .. _pcmk_remote_port:
.. index::
pair: node option; PCMK_remote_port
PCMK_remote_port
- :ref:`port <port>`
- 3121
- Use this TCP port number for Pacemaker Remote node connections. This
value must be the same on all nodes.
* - .. _pcmk_remote_pid1:
.. index::
pair: node option; PCMK_remote_pid1
PCMK_remote_pid1
- :ref:`enumeration <enumeration>`
- default
- *Advanced Use Only:* When a bundle resource's ``run-command`` option is
left to default, Pacemaker Remote runs as PID 1 in the bundle's
containers. When it does so, it loads environment variables from the
container's |PCMK_INIT_ENV_FILE| and performs the PID 1 responsibility of
reaping dead subprocesses.
This option controls whether those actions are performed when Pacemaker
Remote is not running as PID 1. It is intended primarily for developer
testing but can be useful when ``run-command`` is set to a separate,
custom PID 1 process that launches Pacemaker Remote.
* ``full``: Pacemaker Remote loads environment variables from
|PCMK_INIT_ENV_FILE| and reaps dead subprocesses.
* ``vars``: Pacemaker Remote loads environment variables from
|PCMK_INIT_ENV_FILE| but does not reap dead subprocesses.
* ``default``: Pacemaker Remote performs neither action.
If Pacemaker Remote is running as PID 1, this option is ignored, and the
behavior is the same as for ``full``.
* - .. _pcmk_tls_priorities:
.. index::
pair: node option; PCMK_tls_priorities
PCMK_tls_priorities
- :ref:`text <text>`
- |PCMK_GNUTLS_PRIORITIES|
- *Advanced Use Only:* These GnuTLS cipher priorities will be used for TLS
connections (whether for Pacemaker Remote connections or remote CIB
access, when enabled). See:
https://gnutls.org/manual/html_node/Priority-Strings.html
Pacemaker will append ``":+ANON-DH"`` for remote CIB access and
``":+DHE-PSK:+PSK"`` for Pacemaker Remote connections, as they are
required for the respective functionality.
Example:
``PCMK_tls_priorities="SECURE128:+SECURE192"``
* - .. _pcmk_dh_min_bits:
.. index::
pair: node option; PCMK_dh_min_bits
PCMK_dh_min_bits
- :ref:`nonnegative integer <nonnegative_integer>`
- 0 (no minimum)
- *Advanced Use Only:* Set a lower bound on the bit length of the prime
number generated for Diffie-Hellman parameters needed by TLS connections.
The default is no minimum.
The server (Pacemaker Remote daemon, or CIB manager configured to accept
remote clients) will use this value to provide a floor for the value
recommended by the GnuTLS library. The library will only accept a limited
number of specific values, which vary by library version, so setting
these is recommended only when required for compatibility with specific
client versions.
Clients (connecting cluster nodes or remote CIB commands) will require
that the server use a prime of at least this size. This is recommended
only when the value must be lowered in order for the client's GnuTLS
library to accept a connection to an older server.
* - .. _pcmk_dh_max_bits:
.. index::
pair: node option; PCMK_dh_max_bits
PCMK_dh_max_bits
- :ref:`nonnegative integer <nonnegative_integer>`
- 0 (no maximum)
- *Advanced Use Only:* Set an upper bound on the bit length of the prime
number generated for Diffie-Hellman parameters needed by TLS connections.
The default is no maximum.
The server (Pacemaker Remote daemon, or CIB manager configured to accept
remote clients) will use this value to provide a ceiling for the value
recommended by the GnuTLS library. The library will only accept a limited
number of specific values, which vary by library version, so setting
these is recommended only when required for compatibility with specific
client versions.
Clients do not use ``PCMK_dh_max_bits``.
* - .. _pcmk_ipc_type:
.. index::
pair: node option; PCMK_ipc_type
PCMK_ipc_type
- :ref:`enumeration <enumeration>`
- shared-mem
- *Advanced Use Only:* Force use of a particular IPC method. Allowed values:
* ``shared-mem``
* ``socket``
* ``posix``
* ``sysv``
* - .. _pcmk_ipc_buffer:
.. index::
pair: node option; PCMK_ipc_buffer
PCMK_ipc_buffer
- :ref:`nonnegative integer <nonnegative_integer>`
- 131072
- *Advanced Use Only:* Specify an IPC buffer size in bytes. This can be
useful when connecting to large clusters that result in messages
exceeding the default size (which will also result in log messages
referencing this variable).
* - .. _pcmk_cluster_type:
.. index::
pair: node option; PCMK_cluster_type
PCMK_cluster_type
- :ref:`enumeration <enumeration>`
- corosync
- *Advanced Use Only:* Specify the cluster layer to be used. If unset,
Pacemaker will detect and use a supported cluster layer, if available.
Currently, ``"corosync"`` is the only supported cluster layer. If
multiple layers are supported in the future, this will allow overriding
Pacemaker's automatic detection to select a specific one.
* - .. _pcmk_schema_directory:
.. index::
pair: node option; PCMK_schema_directory
PCMK_schema_directory
- :ref:`text <text>`
- |CRM_SCHEMA_DIRECTORY|
- *Advanced Use Only:* Specify an alternate location for RNG schemas and
XSL transforms.
* - .. _pcmk_remote_schema_directory:
.. index::
pair: node option; PCMK_remote_schema_directory
PCMK_remote_schema_directory
- :ref:`text <text>`
- |PCMK__REMOTE_SCHEMA_DIR|
- *Advanced Use Only:* Specify an alternate location on Pacemaker Remote
nodes for storing newer RNG schemas and XSL transforms fetched from
the cluster.
* - .. _pcmk_valgrind_enabled:
.. index::
pair: node option; PCMK_valgrind_enabled
PCMK_valgrind_enabled
- :ref:`enumeration <enumeration>`
- no
- *Advanced Use Only:* Whether subsystem daemons should be run under
``valgrind``. Allowed values are the same as for ``PCMK_debug``.
* - .. _pcmk_callgrind_enabled:
.. index::
pair: node option; PCMK_callgrind_enabled
PCMK_callgrind_enabled
- :ref:`enumeration <enumeration>`
- no
- *Advanced Use Only:* Whether subsystem daemons should be run under
``valgrind`` with the ``callgrind`` tool enabled. Allowed values are the
same as for ``PCMK_debug``.
* - .. _sbd_sync_resource_startup:
.. index::
pair: node option; SBD_SYNC_RESOURCE_STARTUP
SBD_SYNC_RESOURCE_STARTUP
- :ref:`boolean <boolean>`
-
- If true, ``pacemakerd`` waits for a ping from ``sbd`` during startup
before starting other Pacemaker daemons, and during shutdown after
stopping other Pacemaker daemons but before exiting. Default value is set
based on the ``--with-sbd-sync-default`` configure script option.
* - .. _sbd_watchdog_timeout:
.. index::
pair: node option; SBD_WATCHDOG_TIMEOUT
SBD_WATCHDOG_TIMEOUT
- :ref:`duration <duration>`
-
- If the ``stonith-watchdog-timeout`` cluster property is set to a negative
or invalid value, use double this value as the default if positive, or
use 0 as the default otherwise. This value must be greater than the value
of ``stonith-watchdog-timeout`` if both are set.
* - .. _valgrind_opts:
.. index::
pair: node option; VALGRIND_OPTS
VALGRIND_OPTS
- :ref:`text <text>`
-
- *Advanced Use Only:* Pass these options to valgrind, when enabled (see
``valgrind(1)``). ``"--vgdb=no"`` should usually be specified because
``pacemaker-execd`` can lower privileges when executing commands, which
would otherwise leave a bunch of unremovable files in ``/tmp``.
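Several options in the table above (``PCMK_debug``, ``PCMK_blackbox``, ``PCMK_valgrind_enabled``, ``PCMK_callgrind_enabled``) take either ``yes``/``no`` or a comma-separated list of subsystems. The Python sketch below shows one way such a value might be checked before it is written to the configuration file. The helper name ``check_subsystem_option`` is hypothetical, and it accepts only the forms named in the table; Pacemaker itself may accept other boolean spellings.

```python
# Subsystem names as listed for PCMK_debug in the table above.
SUBSYSTEMS = {
    "pacemakerd",
    "pacemaker-attrd",
    "pacemaker-based",
    "pacemaker-controld",
    "pacemaker-execd",
    "pacemaker-fenced",
    "pacemaker-schedulerd",
}


def check_subsystem_option(value):
    """Return True if value is yes, no, or a list of known subsystems."""
    if value in ("yes", "no"):
        return True
    # An empty string splits into [""], which is not a known subsystem.
    return all(name in SUBSYSTEMS for name in value.split(","))
```

A value such as ``pacemakerd,pacemaker-execd`` passes the check, while a list containing an unknown name does not.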