diff --git a/doc/Pacemaker_Explained/en-US/Ch-Alerts.txt b/doc/Pacemaker_Explained/en-US/Ch-Alerts.txt new file mode 100644 index 0000000000..79e7aef35f --- /dev/null +++ b/doc/Pacemaker_Explained/en-US/Ch-Alerts.txt @@ -0,0 +1,298 @@ += Receiving Alerts for Cluster Events = + +//// +We prefer [[ch-alerts]], but older versions of asciidoc don't deal well +with that construct for chapter headings +//// +anchor:ch-alerts[Chapter 7, Receiving Alerts for Cluster Events] +indexterm:[Resource,Alerts] + +A Pacemaker cluster is an event-driven system. In this context, an 'event' +might be a resource failure or a configuration change, among others. + + + + + + +[[s-alerts-configuration]] +== Configuring Alerts via Alert-Agents == + +As with resource-agents, an external program (alert-agent) is required to pass alerts generated from cluster events to a recipient (IP address, email address, URI). + +When triggered, the alert-agent is fed with dynamically filled environment +variables describing precisely the cluster event that occurred. By making +smart use of these variables in your alert-agent code, you can trigger +any action. + +It is possible to use multiple alert-agents at the same time. + +As with resource-agents, +meta-attributes+ can be used to configure how Pacemaker treats the alert-agent (formatting of environment variables, timeout handling, ...). + +If an alert-agent needs additional configuration, +instance-attributes+ can be added, again as with resource-agents; they are passed to the alert-agent as additional environment variables. + +Multiple recipients can be configured for each alert-agent. The alert-agent is called separately for each configured recipient. + +Instance- and meta-attributes can be configured globally per alert-agent, per recipient, or both. 
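The way an alert-agent consumes these environment variables can be sketched as follows. This is a minimal, hypothetical agent, not one of the shipped examples: the +CRM_alert_+ variable names are taken from the reference tables in this chapter, while the function names `format_alert` and `main`, the layout of the log line, and the choice to treat the recipient value as the path of a log file are assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical alert-agent sketch: append one line per cluster event to a file."""
import os
import sys


def format_alert(env):
    """Build a one-line summary from the CRM_alert_* variables found in env."""
    kind = env.get("CRM_alert_kind", "unknown")
    parts = [
        env.get("CRM_alert_timestamp", "-"),
        "seq=" + env.get("CRM_alert_node_sequence", "-"),
        "kind=" + kind,
        "node=" + env.get("CRM_alert_node", "-"),
    ]
    # Resource alerts additionally name the affected resource.
    if kind == "resource":
        parts.append("rsc=" + env.get("CRM_alert_rsc", "-"))
    # Resource and fencing alerts both carry an operation and its return code.
    if kind in ("resource", "fencing"):
        parts.append("task=" + env.get("CRM_alert_task", "-"))
        parts.append("rc=" + env.get("CRM_alert_rc", "-"))
    parts.append("desc=" + env.get("CRM_alert_desc", "-"))
    return " ".join(parts)


def main(env=None):
    """Append the formatted alert to the recipient; return a process exit code."""
    env = os.environ if env is None else env
    # Assumption: the recipient value is the path of the log file to append to.
    recipient = env.get("CRM_alert_recipient")
    if not recipient:
        return 1
    with open(recipient, "a") as log:
        log.write(format_alert(env) + "\n")
    return 0


# Only act when the alert variables are present, i.e. when called by the cluster.
if __name__ == "__main__" and "CRM_alert_kind" in os.environ:
    sys.exit(main())
```

Because the agent is called once per recipient, a sketch like this needs no loop over recipients; each invocation handles exactly one value of +CRM_alert_recipient+.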
+ +[NOTE] +===== +When multiple alert-agents and/or recipients are configured, each cluster event forks multiple processes at the same time: one for each combination of alert-agent and recipient. + +Because not all of these processes are scheduled right away, timestamps taken from within them would differ for a single cluster event, and they would be delayed relative to the event. + +Pacemaker therefore creates a microsecond-resolution timestamp whenever a cluster event occurs and passes it to the alert-agents. + +Pacemaker also passes a sequence number that is increased every time an alert-agent is called. Sequence numbers are valid only within one cluster node. An alert created for a later cluster event reliably has a higher sequence number than alerts for cluster events that happened earlier. +===== + +[NOTE] +===== +The interface is realized as a backward-compatible evolution of the interfaces previously provided with +ocf:pacemaker:ClusterMon+ and *integrated-notifications*. +To preserve script compatibility, the environment variables passed to the alert-agents are available prefixed with +CRM_notify+ (compatibility version) as well as with +CRM_alert+. They implement a superset of those previous features. +===== + +[WARNING] +===== +Although the interface is realized as a backward-compatible evolution of the interface previously provided with +ocf:pacemaker:ClusterMon+, there is still one pitfall. + ++ClusterMon+ is executed as a resource by lrmd and thus runs with root privileges, and so do the external scripts it calls. The alert-agents are currently forked by crmd and thus run as user hacluster. While running the alert-agents with reduced privileges is in general a security benefit, existing scripts might not be able to cope with not being executed as root. 
+ +Configuring +sudo+ accordingly for the alert-agent executable, or setting the setuid bit on it, might be a workaround. +===== + +[[s-alerts-examples]] +== Using the Example Alert-Agents == +There are several example alert-agents provided in the +.../extra/alerts+ directory of the Pacemaker source tree. + +.Simple Example logging Cluster Events to a File +===== +[source,XML] +----- + + + + + + + + + + + + + +----- +===== + +.Sending Cluster Events as SNMP Traps +===== +[source,XML] +----- + + + + + + + + + + + + + +----- +Alternatively, attributes can be added to the recipient section as well. +[source,XML] +----- + + + + + + + + + + + + + + +----- +===== + +.Sending Cluster Events as E-Mails +===== +[source,XML] +----- + + + + + + + + + + + + + +----- +===== + +[[s-alerts-reference]] +== Alerts - Reference == + +.Environment Variables Passed to the External Agent - Common + +[width="95%",cols="m,2>",options="header",align="center"] +|========================================================= + +|Environment Variable +|Description + +|CRM_alert_kind +|Indicates the type of alert. One of `node`, `fencing`, `resource` + indexterm:[Environment Variable,CRM_alert_,kind] + +|CRM_alert_version +|Indicates the version of Pacemaker sending the alert. + indexterm:[Environment Variable,CRM_alert_,version] + +|CRM_alert_recipient +|The value specified in the recipient section within an alert section + indexterm:[Environment Variable,CRM_alert_,recipient] + +|CRM_alert_node_sequence +| A sequence number increased whenever an alert is issued on the +local node; use it to determine the order in which alerts have been issued +by Pacemaker. Be aware that it has no cluster-wide meaning. 
+ indexterm:[Environment Variable,CRM_alert_node_,sequence] + +|CRM_alert_timestamp +| A timestamp created before the process that +executes the alert-agent is spawned; the format is configurable via a +format string as used with the `date` command, including the nanosecond part. + indexterm:[Environment Variable,CRM_alert_,timestamp] + +|========================================================= + +.Environment Variables - Additional for `node` alerts + +[width="95%",cols="m,2>",options="header",align="center"] +|========================================================= + +|Environment Variable +|Description + +|CRM_alert_node +| The name of the node whose status changed + indexterm:[Environment Variable,CRM_alert_,node] + +|CRM_alert_nodeid +| The ID of the node whose status changed + indexterm:[Environment Variable,CRM_alert_,nodeid] + +|CRM_alert_desc +| The current node state; one of `member` or `lost` + indexterm:[Environment Variable,CRM_alert_,desc] + +|========================================================= + +.Environment Variables - Additional for `fencing` alerts + +[width="95%",cols="m,2>",options="header",align="center"] +|========================================================= + +|Environment Variable +|Description + +|CRM_alert_node +| The name of the node for which the fencing operation was requested + indexterm:[Environment Variable,CRM_alert_,node] + +|CRM_alert_task +| The fencing operation that was requested + indexterm:[Environment Variable,CRM_alert_,task] + +|CRM_alert_rc +| The numerical return code of the operation + indexterm:[Environment Variable,CRM_alert_,rc] + +|CRM_alert_desc +| A summary of the requested fencing operation, including origin, target, +and fencing operation error code (if any) + indexterm:[Environment Variable,CRM_alert_,desc] + + +|========================================================= + +.Environment Variables - Additional for `resource` alerts + 
+[width="95%",cols="m,2>",options="header",align="center"] +|========================================================= + +|Environment Variable +|Description + +|CRM_alert_node +| The name of the node on which the status change happened + indexterm:[Environment Variable,CRM_alert_,node] + +|CRM_alert_rsc +| The name of the resource whose status changed + indexterm:[Environment Variable,CRM_alert_,rsc] + +|CRM_alert_task +| The operation that caused the status change + indexterm:[Environment Variable,CRM_alert_,task] + +|CRM_alert_interval +| The interval of a resource operation + indexterm:[Environment Variable,CRM_alert_,interval] + +|CRM_alert_rc +| The numerical return code of the operation + indexterm:[Environment Variable,CRM_alert_,rc] + +|CRM_alert_target_rc +| The expected numerical return code of the operation + indexterm:[Environment Variable,CRM_alert_,target_rc] + +|CRM_alert_status +| The numerical representation of the status of the operation + indexterm:[Environment Variable,CRM_alert_,status] + +|CRM_alert_desc +| The textual output relevant to the error code of the operation (if any) +that caused the status change + indexterm:[Environment Variable,CRM_alert_,desc] + +|========================================================= + + +.Meta-Attributes + +[width="95%",cols="m,2>",options="header",align="center"] +|========================================================= + +|Meta-Attribute +|Description + +|timestamp-format +| Format string, as used with the `date` command (including the nanosecond part), that defines the format in which the timestamp of a cluster event is passed to the alert-agent + indexterm:[meta-attribute,timestamp-format] + +|timeout +| Alert-agents are forked as separate processes. To prevent them from hogging system resources, they are monitored and terminated if they do not complete within the specified timeout. 
+ indexterm:[meta-attribute,timeout] + + +|========================================================= + diff --git a/doc/Pacemaker_Explained/en-US/Ch-Notifications.txt b/doc/Pacemaker_Explained/en-US/Ch-Notifications.txt deleted file mode 100644 index 134ab0c7b9..0000000000 --- a/doc/Pacemaker_Explained/en-US/Ch-Notifications.txt +++ /dev/null @@ -1,144 +0,0 @@ -= Receiving Notification for Cluster Events = - -//// -We prefer [[ch-notifications]], but older versions of asciidoc don't deal well -with that construct for chapter headings -//// -anchor:ch-notifications[Chapter 7, Receiving Notification for Cluster Events] -indexterm:[Resource,Notification] - -A Pacemaker cluster is an event-driven system. In this context, an 'event' -might be a resource failure or a configuration change, among others. - -The *ocf:pacemaker:ClusterMon* resource can monitor the cluster status and -trigger alerts on each cluster event. This resource runs `crm_mon` in the -background at regular (configurable) intervals and uses `crm_mon` capabilities -to trigger emails (SMTP), SNMP traps or external programs (via the -+extra_options+ parameter). - -[NOTE] -===== -Depending on your system settings and compilation settings, SNMP or email -alerts might be unavailable. Check the output of `crm_mon --help` to see whether these -options are available to you. In any case, executing an external agent will -always be available, and you can use this agent to send emails, SNMP traps -or whatever action you develop. -===== - -[[s-notification-snmp]] -== Configuring SNMP Notifications == -indexterm:[Resource,Notification,SNMP] - -Requires an IP to send SNMP traps to, and an SNMP community string. -The Pacemaker MIB is provided with the source, and is typically -installed in +/usr/share/snmp/mibs/PCMK-MIB.txt+. 
- -This example uses +snmphost.example.com+ as the SNMP IP and -+public+ as the community string: - -.Configuring ClusterMon to send SNMP traps -===== -[source,XML] ------ - - - - - - - - - ------ -===== - -[[s-notification-email]] -== Configuring Email Notifications == -indexterm:[Resource,Notification,SMTP,Email] - -Requires the recipient e-mail address. You can also optionally configure -the sender e-mail address, the hostname of the SMTP relay, and a prefix string -for the subject line. - -.Configuring ClusterMon to send email alerts -===== -[source,XML] ------ - - - - - - - - - ------ -===== - -[[s-notification-external]] -== Configuring Notifications via External-Agent == - -Requires a program (external-agent) to run when resource operations take -place, and an external-recipient (IP address, email address, URI). When -triggered, the external-agent is fed with dynamically filled environment -variables describing precisely the cluster event that occurred. By making -smart usage of these variables in your external-agent code, you can trigger -any action. - -.Configuring ClusterMon to execute an external-agent -===== -[source,XML] ------ - - - - - - - - - ------ -===== - -.Environment Variables Passed to the External Agent -[width="95%",cols="1m,2<",options="header",align="center"] -|========================================================= - -|Environment Variable -|Description - -|CRM_notify_recipient -| The static external-recipient from the resource definition. - indexterm:[Environment Variable,CRM_notify_recipient] - -|CRM_notify_node -| The node on which the status change happened. - indexterm:[Environment Variable,CRM_notify_node] - -|CRM_notify_rsc -| The name of the resource that changed the status. - indexterm:[Environment Variable,CRM_notify_rsc] - -|CRM_notify_task -| The operation that caused the status change. 
- indexterm:[Environment Variable,CRM_notify_task] - -|CRM_notify_desc -| The textual output relevant error code of the operation (if any) that caused the status change. - indexterm:[Environment Variable,CRM_notify_desc] - -|CRM_notify_rc -| The return code of the operation. - indexterm:[Environment Variable,CRM_notify_rc] - -|CRM_notify_target_rc -| The expected return code of the operation. - indexterm:[Environment Variable,CRM_notify_target_rc] - -|CRM_notify_status -| The numerical representation of the status of the operation. - indexterm:[Environment Variable,CRM_notify_target_rc] - -|========================================================= diff --git a/doc/Pacemaker_Explained/en-US/Pacemaker_Explained.xml b/doc/Pacemaker_Explained/en-US/Pacemaker_Explained.xml index fe054f3f05..991e002a3a 100644 --- a/doc/Pacemaker_Explained/en-US/Pacemaker_Explained.xml +++ b/doc/Pacemaker_Explained/en-US/Pacemaker_Explained.xml @@ -1,47 +1,47 @@ - + Further Reading Project Website: Project Documentation: SUSE High Availibility Guide: Heartbeat configuration: Corosync Configuration: