diff --git a/doc/shared/en-US/pacemaker-intro.txt b/doc/shared/en-US/pacemaker-intro.txt
deleted file mode 100644
index b2a81cb757..0000000000
--- a/doc/shared/en-US/pacemaker-intro.txt
+++ /dev/null
@@ -1,186 +0,0 @@
-:compat-mode: legacy
-== What Is 'Pacemaker'? ==
-
-*Pacemaker* is a high-availability 'cluster resource manager' -- software that
-runs on a set of hosts (a 'cluster' of 'nodes') in order to preserve integrity
-and minimize downtime of desired services ('resources').
-footnote:[
-'Cluster' is sometimes used in other contexts to refer to hosts grouped
-together for other purposes, such as high-performance computing (HPC), but
-Pacemaker is not intended for those purposes.
-]
-It is maintained by the https://www.ClusterLabs.org/[ClusterLabs] community.
-
-Pacemaker's key features include:
-
- * Detection of and recovery from node- and service-level failures
- * Ability to ensure data integrity by fencing faulty nodes
- * Support for one or more nodes per cluster
- * Support for multiple resource interface standards (anything that can be
-   scripted can be clustered)
- * Support (but no requirement) for shared storage
- * Support for practically any redundancy configuration (active/passive, N+1,
-   etc.)
- * Automatically replicated configuration that can be updated from any node
- * Ability to specify cluster-wide relationships between services,
-   such as ordering, colocation and anti-colocation
- * Support for advanced service types, such as 'clones' (services that need to
-   be active on multiple nodes), 'stateful resources' (clones that can run in
-   one of two modes), and containerized services
- * Unified, scriptable cluster management tools
-
-.Fencing
-[NOTE]
-====
-'Fencing', also known as 'STONITH' (an acronym for Shoot The Other Node In The
-Head), is the ability to ensure that it is not possible for a node to be
-running a service. This is accomplished via 'fence devices' such as
-intelligent power switches that cut power to the target, or intelligent
-network switches that cut the target's access to the local network.
-
-Pacemaker represents fence devices as a special class of resource.
-
-A cluster cannot safely recover from certain failure conditions, such as an
-unresponsive node, without fencing.
-====
-
-== Cluster Architecture ==
-
-At a high level, a cluster can be viewed as having these parts (which together
-are often referred to as the 'cluster stack'):
-
- * *Resources:* These are the reason for the cluster's being -- the services
-   that need to be kept highly available.
-
- * *Resource agents:* These are scripts or operating system components that
-   start, stop, and monitor resources, given a set of resource parameters.
-   These provide a uniform interface between Pacemaker and the managed
-   services.
-
- * *Fence agents:* These are scripts that execute node fencing actions,
-   given a target and fence device parameters.
-
- * *Cluster membership layer:* This component provides reliable
-   messaging, membership, and quorum information about the cluster.
-   Currently, Pacemaker supports http://www.corosync.org/[Corosync]
-   as this layer.
-
- * *Cluster resource manager:* Pacemaker provides the brain that processes
-   and reacts to events that occur in the cluster. These events may include
-   nodes joining or leaving the cluster; resource events caused by failures,
-   maintenance, or scheduled activities; and other administrative actions.
-   To achieve the desired availability, Pacemaker may start and stop resources
-   and fence nodes.
-
- * *Cluster tools:* These provide an interface for users to interact with the
-   cluster. Various command-line and graphical (GUI) interfaces are available.
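
To make the 'resources' and 'resource agents' items above concrete, here is a
minimal, hypothetical sketch of how a resource could be described in the
cluster configuration. The IDs and the address value are invented for
illustration; ocf:heartbeat:IPaddr2 is a commonly packaged resource agent, but
any agent following a supported standard is driven the same way.

[source,XML]
----
<!-- Hypothetical example: a floating IP address resource. Pacemaker itself
     knows nothing about IP addresses; it only asks the ocf:heartbeat:IPaddr2
     agent to start, stop, and monitor the service with these parameters. -->
<primitive id="my-ip" class="ocf" provider="heartbeat" type="IPaddr2">
  <instance_attributes id="my-ip-params">
    <nvpair id="my-ip-params-ip" name="ip" value="192.0.2.10"/>
  </instance_attributes>
  <operations>
    <op id="my-ip-monitor-30s" name="monitor" interval="30s"/>
  </operations>
</primitive>
----

A fence device is declared in much the same way, using class="stonith" and a
fence agent rather than a resource agent, which is what "a special class of
resource" refers to in the note above.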
-
-Most managed services are not, themselves, cluster-aware. However, many popular
-open-source cluster filesystems make use of a common 'Distributed Lock
-Manager' (DLM), which makes direct use of Corosync for its messaging and
-membership capabilities and Pacemaker for the ability to fence nodes.
-
-.Example Cluster Stack
-image::images/pcmk-stack.png["Example cluster stack",width="10cm",height="7.5cm",align="center"]
-
-== Pacemaker Architecture ==
-
-Pacemaker itself is composed of multiple daemons that work together:
-
- * pacemakerd
- * pacemaker-attrd
- * pacemaker-based
- * pacemaker-controld
- * pacemaker-execd
- * pacemaker-fenced
- * pacemaker-schedulerd
-
-.Internal Components
-image::images/pcmk-internals.png["Pacemaker software components",align="center",scaledwidth="65%"]
-
-The Pacemaker master process (pacemakerd) spawns all the other daemons, and
-respawns them if they unexpectedly exit.
-
-The 'Cluster Information Base' (CIB) is an
-https://en.wikipedia.org/wiki/XML[XML] representation of the cluster's
-configuration and the state of all nodes and resources. The 'CIB manager'
-(pacemaker-based) keeps the CIB synchronized across the cluster, and handles
-requests to modify it.
-
-The 'attribute manager' (pacemaker-attrd) maintains a database of attributes
-for all nodes, keeps it synchronized across the cluster, and handles requests
-to modify them. These attributes are usually recorded in the CIB.
-
-Given a snapshot of the CIB as input, the 'scheduler' (pacemaker-schedulerd)
-determines what actions are necessary to achieve the desired state of the
-cluster.
-
-The 'local executor' (pacemaker-execd) handles requests to execute
-resource agents on the local cluster node, and returns the result.
-
-The 'fencer' (pacemaker-fenced) handles requests to fence nodes. Given a target
-node, the fencer decides which cluster node(s) should execute which fencing
-device(s), calls the necessary fencing agents (either directly or via requests
-to the fencer peers on other nodes), and returns the result.
-
-The 'controller' (pacemaker-controld) is Pacemaker's coordinator,
-maintaining a consistent view of the cluster membership and orchestrating all
-the other components.
-
-Pacemaker centralizes cluster decision-making by electing one of the controller
-instances as the 'Designated Controller' ('DC'). Should the elected DC
-process (or the node it is on) fail, a new one is quickly established.
-The DC responds to cluster events by taking a current snapshot of the CIB,
-feeding it to the scheduler, then asking the executors (either directly on
-the local node, or via requests to controller peers on other nodes) and
-the fencer to execute any necessary actions.
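
As a rough sketch of what the CIB manager keeps synchronized, and of the
snapshot that is handed to the scheduler, the CIB has the following top-level
shape. This is an illustrative skeleton only: the attribute values below are
placeholders, and a real CIB contains considerably more detail.

[source,XML]
----
<!-- Trimmed-down, illustrative CIB skeleton; values are placeholders. -->
<cib validate-with="pacemaker-3.0" admin_epoch="0" epoch="5" num_updates="0">
  <configuration>
    <crm_config/>   <!-- cluster-wide options -->
    <nodes/>        <!-- known cluster nodes -->
    <resources/>    <!-- resource definitions -->
    <constraints/>  <!-- ordering, colocation, and location rules -->
  </configuration>
  <status/>         <!-- runtime state recorded by the cluster -->
</cib>
----

The configuration section describes the cluster the administrator wants, while
the status section records what the cluster has actually observed on each node.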
-
-.Old daemon names
-[NOTE]
-====
-The Pacemaker daemons were renamed in version 2.0. You may still find
-references to the old names, especially in documentation targeted to version
-1.1.
-
-[width="95%",cols="1,2",options="header",align="center"]
-|=========================================================
-| Old name | New name
-| attrd | pacemaker-attrd
-| cib | pacemaker-based
-| crmd | pacemaker-controld
-| lrmd | pacemaker-execd
-| stonithd | pacemaker-fenced
-| pacemaker_remoted | pacemaker-remoted
-|=========================================================
-
-====
-
-== Node Redundancy Designs ==
-
-Pacemaker supports practically any
-https://en.wikipedia.org/wiki/High-availability_cluster#Node_configurations[node redundancy configuration]
-including 'Active/Active', 'Active/Passive', 'N+1', 'N+M', 'N-to-1' and
-'N-to-N'.
-
-Active/passive clusters with two (or more) nodes using Pacemaker and
-https://en.wikipedia.org/wiki/Distributed_Replicated_Block_Device[DRBD] are
-a cost-effective high-availability solution for many situations. One of the
-nodes provides the desired services, and if it fails, the other node takes
-over.
-
-.Active/Passive Redundancy
-image::images/pcmk-active-passive.png["Active/Passive Redundancy",width="10cm",height="7.5cm",align="center"]
-
-Pacemaker also supports multiple nodes in a shared-failover design,
-reducing hardware costs by allowing several active/passive clusters to be
-combined and share a common backup node.
-
-.Shared Failover
-image::images/pcmk-shared-failover.png["Shared Failover",width="10cm",height="7.5cm",align="center"]
-
-When shared storage is available, every node can potentially be used for
-failover. Pacemaker can even run multiple copies of services to spread out the
-workload.
-
-.N to N Redundancy
-image::images/pcmk-active-active.png["N to N Redundancy",width="10cm",height="7.5cm",align="center"]
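
As a hypothetical illustration of the N-to-N case, a service can be wrapped in
a 'clone' so that Pacemaker keeps a copy of it running on every available node.
The IDs and agent below are invented for illustration; a real configuration
would also define agent parameters, monitor operations, and any clone options
that are needed.

[source,XML]
----
<!-- Illustrative only: run an instance of the web server on every node. -->
<clone id="web-clone">
  <primitive id="web" class="ocf" provider="heartbeat" type="apache"/>
</clone>
----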