No OneTemporary
Actions

Size

14 KB

Referenced Files

None

Subscribers

None

View Options

	diff --git a/doc/Pacemaker_Explained/en-US/Ch-Alerts.txt b/doc/Pacemaker_Explained/en-US/Ch-Alerts.txt
	index 297550a594..6185702525 100644
	--- a/doc/Pacemaker_Explained/en-US/Ch-Alerts.txt
	+++ b/doc/Pacemaker_Explained/en-US/Ch-Alerts.txt
	@@ -1,340 +1,407 @@
	= Alerts =

	////
	We prefer [[ch-alerts]], but older versions of asciidoc don't deal well
	with that construct for chapter headings
	////
	anchor:ch-alerts[Chapter 7, Alerts]
	indexterm:[Resource,Alerts]

	'Alerts' (available since Pacemaker 1.1.15) may be configured to take some
	external action when a cluster event occurs (node failure, resource starting or
	stopping, etc.).


	== Alert Agents ==

	As with resource agents, the cluster calls an external program (an
	'alert agent') to handle alerts. The cluster passes information about the event
	to the agent via environment variables. Agents can do anything
	desired with this information (send an e-mail, log to a file,
	update a monitoring system, etc.).


	.Simple alert configuration
	=====
	[source,XML]
	-----
	<configuration>
	<alerts>
	<alert id="my-alert" path="/path/to/my-script.sh" />
	</alerts>
	</configuration>
	-----
	=====

	In the example above, the cluster will call +my-script.sh+ for each event.

	Multiple alert agents may be configured; the cluster will call all of them for
	each event.

	Alert agents will be called only on cluster nodes. They will be called for
	events involving Pacemaker Remote nodes, but they will never be called _on_
	those nodes.

	== Alert Recipients ==

	Usually alerts are directed towards a recipient. Thus each alert may be additionally configured with one or more recipients.
	The cluster will call the agent separately for each recipient.

	.Alert configuration with recipient
	=====
	[source,XML]
	-----
	<configuration>
	<alerts>
	<alert id="my-alert" path="/path/to/my-script.sh">
	<recipient id="my-alert-recipient" value="some-address"/>
	</alert>
	</alerts>
	</configuration>
	-----
	=====

	In the above example, the cluster will call +my-script.sh+ for each event,
	passing the recipient +some-address+ as an environment variable.

	The recipient may be anything the alert agent can recognize --
	an IP address, an e-mail address, a file name, whatever the particular
	agent supports.


	== Alert Meta-Attributes ==

	As with resource agents, meta-attributes can be configured for alert agents
	to affect how Pacemaker calls them.

	.Meta-Attributes of an Alert

	[width="95%",cols="m,1,2<a",options="header",align="center"]
	\|=========================================================

	\|Meta-Attribute
	\|Default
	\|Description

	\|timestamp-format
	\|%H:%M:%S.%06N
	\|Format the cluster will use when sending the event's timestamp to the agent.
	This is a string as used with the `date(1)` command.
	indexterm:[Alert,Option,timestamp-format]

	\|timeout
	\|30s
	\|If the alert agent does not complete within this amount of time, it will be
	terminated.
	indexterm:[Alert,Option,timeout]

	\|=========================================================

	Meta-attributes can be configured per alert agent and/or per recipient.

	.Alert configuration with meta-attributes
	=====
	[source,XML]
	-----
	<configuration>
	<alerts>
	<alert id="my-alert" path="/path/to/my-script.sh">
	<meta_attributes id="my-alert-attributes">
	<nvpair id="my-alert-attributes-timeout" name="timeout"
	value="15s"/>
	</meta_attributes>
	<recipient id="my-alert-recipient1" value="someuser@example.com">
	<meta_attributes id="my-alert-recipient1-attributes">
	<nvpair id="my-alert-recipient1-timestamp-format"
	name="timestamp-format" value="%D %H:%M"/>
	</meta_attributes>
	</recipient>
	<recipient id="my-alert-recipient2" value="otheruser@example.com">
	<meta_attributes id="my-alert-recipient2-attributes">
	<nvpair id="my-alert-recipient2-timestamp-format"
	name="timestamp-format" value="%c"/>
	</meta_attributes>
	</recipient>
	</alert>
	</alerts>
	</configuration>
	-----
	=====

	In the above example, the +my-script.sh+ will get called twice for each event,
	with each call using a 15-second timeout. One call will be passed the recipient
	+someuser@example.com+ and a timestamp in the format +%D %H:%M+, while the
	other call will be passed the recipient +otheruser@example.com+ and a timestamp
	in the format +%c+.


	== Alert Instance Attributes ==

	As with resource agents, agent-specific configuration values may be configured
	as instance attributes. These will be passed to the agent as additional
	environment variables. The number, names and allowed values of these
	instance attributes are completely up to the particular agent.

	.Alert configuration with instance attributes
	=====
	[source,XML]
	-----
	<configuration>
	<alerts>
	<alert id="my-alert" path="/path/to/my-script.sh">
	<meta_attributes id="my-alert-attributes">
	<nvpair id="my-alert-attributes-timeout" name="timeout"
	value="15s"/>
	</meta_attributes>
	<instance_attributes id="my-alert-options">
	<nvpair id="my-alert-options-debug" name="debug" value="false"/>
	</instance_attributes>
	<recipient id="my-alert-recipient1" value="someuser@example.com"/>
	</alert>
	</alerts>
	</configuration>
	-----
	=====


	+== Alert Filters ==
	+
	+'Coming in 1.1.18'
	+
	+By default, an alert agent will be called for node events, fencing events, and
	+resource events. An agent may choose to ignore certain types of events, but
	+there is still the overhead of calling it for those events. To eliminate that
	+overhead, you may select which types of events the agent should receive.
	+
	+.Alert configuration to receive only node events and fencing events
	+=====
	+[source,XML]
	+-----
	+<configuration>
	+ <alerts>
	+ <alert id="my-alert" path="/path/to/my-script.sh">
	+ <select>
	+ <select_nodes />
	+ <select_fencing />
	+ </select>
	+ <recipient id="my-alert-recipient1" value="someuser@example.com"/>
	+ </alert>
	+ </alerts>
	+</configuration>
	+-----
	+=====
	+
	+The possible options within +<select>+ are +<select_nodes>+,
	++<select_fencing>+, +<select_resources>+, and +<select_attributes>+.
	+
	+With +<select_attributes>+ (the only event type not enabled by default), the
	+agent will receive alerts when a node attribute changes. If you wish the agent
	+to be called only when certain attributes change, you can configure that as well.
	+
	+.Alert configuration to be called when certain node attributes change
	+=====
	+[source,XML]
	+-----
	+<configuration>
	+ <alerts>
	+ <alert id="my-alert" path="/path/to/my-script.sh">
	+ <select>
	+ <select_attributes>
	+ <attribute id="alert-standby" name="standby" />
	+ <attribute id="alert-shutdown" name="shutdown" />
	+ </select_attributes>
	+ </select>
	+ <recipient id="my-alert-recipient1" value="someuser@example.com"/>
	+ </alert>
	+ </alerts>
	+</configuration>
	+-----
	+=====
	+
	+Node attribute alerts are currently considered experimental. Alerts may be
	+limited to attributes set via attrd_updater, and agents may be called multiple
	+times with the same attribute value.
	+
	+
	== Using the Sample Alert Agents ==

	Pacemaker provides several sample alert agents, installed in
	+/usr/share/pacemaker/alerts+ by default.

	While these sample scripts may be copied and used as-is, they are provided
	mainly as templates to be edited to suit your purposes.
	See their source code for the full set of instance attributes they support.

	.Sending cluster events as SNMP traps
	=====
	[source,XML]
	-----
	<configuration>
	<alerts>
	<alert id="snmp_alert" path="/path/to/alert_snmp.sh">
	<instance_attributes id="config_for_alert_snmp">
	<nvpair id="trap_node_states" name="trap_node_states" value="all"/>
	</instance_attributes>
	<meta_attributes id="config_for_timestamp">
	<nvpair id="ts_fmt" name="timestamp-format"
	value="%Y-%m-%d,%H:%M:%S.%01N"/>
	</meta_attributes>
	<recipient id="snmp_destination" value="192.168.1.2"/>
	</alert>
	</alerts>
	</configuration>
	-----
	=====

	.Sending cluster events as e-mails
	=====
	[source,XML]
	-----
	<configuration>
	<alerts>
	<alert id="smtp_alert" path="/path/to/alert_smtp.sh">
	<instance_attributes id="config_for_alert_smtp">
	<nvpair id="email_sender" name="email_sender"
	value="donotreply@example.com"/>
	</instance_attributes>
	<recipient id="smtp_destination" value="admin@example.com"/>
	</alert>
	</alerts>
	</configuration>
	-----
	=====

	-== Writing an Alert Agent ==

	+== Writing an Alert Agent ==

	.Environment variables passed to alert agents

	[width="95%",cols="m,2<a",options="header",align="center"]
	\|=========================================================

	\|Environment Variable
	\|Description

	\|CRM_alert_kind
	-\|The type of alert (+node+, +fencing+, or +resource+)
	+\|The type of alert (+node+, +fencing+, +resource+, or +attribute+)
	indexterm:[Environment Variable,CRM_alert_,kind]

	\|CRM_alert_version
	\|The version of Pacemaker sending the alert
	indexterm:[Environment Variable,CRM_alert_,version]

	\|CRM_alert_recipient
	\|The configured recipient
	indexterm:[Environment Variable,CRM_alert_,recipient]

	\|CRM_alert_node_sequence
	\|A sequence number increased whenever an alert is being issued on the
	local node, which can be used to reference the order in which alerts have been
	issued by Pacemaker. An alert for an event that happened later in time
	reliably has a higher sequence number than alerts for earlier events.
	Be aware that this number has no cluster-wide meaning.
	indexterm:[Environment Variable,CRM_alert_node_,sequence]

	\|CRM_alert_timestamp
	\|A timestamp created prior to executing the agent, in the format
	specified by the +timestamp-format+ meta-attribute. This allows the agent
	to have a reliable, high-precision time of when the event occurred,
	regardless of when the agent itself was invoked (which could potentially
	be delayed due to system load, etc.).
	indexterm:[Environment Variable,CRM_alert_,timestamp]

	\|CRM_alert_node
	\|Name of affected node
	indexterm:[Environment Variable,CRM_alert_,node]

	\|CRM_alert_desc
	\|Detail about event. For +node+ alerts, this is the node's current state
	(+member+ or +lost+). For +fencing+ alerts, this is a summary of the
	requested fencing operation, including origin, target, and fencing operation
	error code, if any. For +resource+ alerts, this is a readable string
	equivalent of +CRM_alert_status+.
	indexterm:[Environment Variable,CRM_alert_,desc]

	\|CRM_alert_nodeid
	\|ID of node whose status changed (provided with +node+ alerts only)
	indexterm:[Environment Variable,CRM_alert_,nodeid]

	\|CRM_alert_task
	\|The requested fencing or resource operation
	(provided with +fencing+ and +resource+ alerts only)
	indexterm:[Environment Variable,CRM_alert_,task]

	\|CRM_alert_rc
	\|The numerical return code of the fencing or resource operation
	(provided with +fencing+ and +resource+ alerts only)
	indexterm:[Environment Variable,CRM_alert_,rc]

	\|CRM_alert_rsc
	\|The name of the affected resource (+resource+ alerts only)
	indexterm:[Environment Variable,CRM_alert_,rsc]

	\|CRM_alert_interval
	\|The interval of the resource operation (+resource+ alerts only)
	indexterm:[Environment Variable,CRM_alert_,interval]

	\|CRM_alert_target_rc
	\|The expected numerical return code of the operation (+resource+ alerts only)
	indexterm:[Environment Variable,CRM_alert_,target_rc]

	\|CRM_alert_status
	\|A numerical code used by Pacemaker to represent the operation result
	(+resource+ alerts only)
	indexterm:[Environment Variable,CRM_alert_,status]

	+\|CRM_alert_attribute_name
	+\|The name of the node attribute that changed (+attribute+ alerts only)
	+ indexterm:[Environment Variable,CRM_alert_,attribute_name]
	+
	+\|CRM_alert_attribute_value
	+\|The new value of the node attribute that changed (+attribute+ alerts only)
	+ indexterm:[Environment Variable,CRM_alert_,attribute_value]
	+
	\|=========================================================

	Special concerns when writing alert agents:

	* Alert agents may be called with no recipient (if none is configured),
	so the agent must be able to handle this situation, even if it
	only exits in that case. (Users may modify the configuration in
	stages, and add a recipient later.)

	* If more than one recipient is configured for an alert, the alert agent will
	be called once per recipient. If an agent is not able to run concurrently, it
	should be configured with only a single recipient. The agent is free,
	however, to interpret the recipient as a list.

	* When a cluster event occurs, all alerts are fired off at the same time as
	separate processes. Depending on how many alerts and recipients are
	configured, and on what is done within the alert agents,
	a significant load burst may occur. The agent could be written to take
	this into consideration, for example by queueing resource-intensive actions
	into some other instance, instead of directly executing them.

	* Alert agents are run as the +hacluster+ user, which has a minimal set
	of permissions. If an agent requires additional privileges, it is
	recommended to configure +sudo+ to allow the agent to run the necessary
	commands as another user with the appropriate privileges.

	* As always, take care to validate and sanitize user-configured parameters,
	such as CRM_alert_timestamp (whose content is specified by the
	user-configured timestamp-format), CRM_alert_recipient, and all instance
	attributes. Mostly this is needed simply to protect against configuration
	errors, but if some user can modify the CIB without having hacluster-level
	access to the cluster nodes, it is a potential security concern as well, to
	avoid the possibility of code injection.

	[NOTE]
	=====
	The alerts interface is designed to be backward compatible with the external
	scripts interface used by the +ocf:pacemaker:ClusterMon+ resource, which is
	now deprecated. To preserve this compatibility, the environment variables
	passed to alert agents are available prepended with +CRM_notify_+
	as well as +CRM_alert_+. One break in compatibility is that ClusterMon ran
	external scripts as the +root+ user, while alert agents are run as the
	+hacluster+ user.
	=====

File Metadata

Mime Type: text/x-diff
Expires: Sat, Nov 23, 4:49 AM (9 h, 8 m)
Storage Engine: blob
Storage Format: Raw Data
Storage Handle: 1008733
Default Alt Text: (14 KB)

No OneTemporaryActions

View Options

File Metadata

Event Timeline

No OneTemporary
Actions