diff --git a/doc/Pacemaker_Remote/en-US/Ch-Future.txt b/doc/Pacemaker_Remote/en-US/Ch-Future.txt deleted file mode 100644 index 43d136c4e8..0000000000 --- a/doc/Pacemaker_Remote/en-US/Ch-Future.txt +++ /dev/null @@ -1,17 +0,0 @@ -= Future Features = - -Basic KVM and Linux container integration was the first phase of development for pacemaker_remote and was completed for Pacemaker v1.1.10. Here are some planned features that expand upon this initial functionality. - -== Libvirt Sandbox Support == - -Once the libvirt-sandbox project is integrated with pacemaker_remote, we will gain the ability to preform per-resource linux container isolation with very little performance impact. This functionality will allow resources living on a single node to be isolated from one another. At that point CPU and memory limits could be set per-resource dynamically just using the cluster config. - -== Bare-metal Support == -+"This feature has already been introduced into Pacemaker's master github branch and is scheduled for Pacemaker v1.1.11"+ - -The pacemaker_remote daemon already has the ability to run on bare-metal hardware nodes, but the policy engine logic for integrating bare-metal nodes is not complete. There are some complications involved with understanding a bare-metal node's state that virtual nodes don't have. Once this logic is complete, pacemaker will be able to integrate bare-metal nodes in the same way virtual remote-nodes currently are. Some special considerations for fencing will need to be addressed. - -== KVM Migration Support == -+"This feature has already been introduced into Pacemaker's master github branch and is scheduled for Pacemaker v1.1.12"+ - -Pacemaker's policy engine is limited in its ability to perform live migrations of KVM resources when resource dependencies are involved. This limitation affects how resources living within a KVM remote-node are handled when a live migration takes place. Currently when a live migration is performed on a KVM remote-node, all the resources within that remote-node have to be stopped before the migration takes place and started once again after migration has finished. This policy engine limitation is fully explained in this bug report, http://bugs.clusterlabs.org/show_bug.cgi?id=5055#c3 diff --git a/doc/Pacemaker_Remote/en-US/Ch-Intro.txt b/doc/Pacemaker_Remote/en-US/Ch-Intro.txt index 777bb976a1..2ed206d8eb 100644 --- a/doc/Pacemaker_Remote/en-US/Ch-Intro.txt +++ b/doc/Pacemaker_Remote/en-US/Ch-Intro.txt @@ -1,85 +1,101 @@ = Extending High Availability Cluster into Virtual Nodes = == Overview == The recent addition of the +pacemaker_remote+ service supported by +Pacemaker version 1.1.10 and greater+ allows nodes not running the cluster stack (pacemaker+corosync) to integrate into the cluster and have the cluster manage their resources just as if they were a real cluster node. This means that pacemaker clusters are now capable of managing both launching virtual environments (KVM/LXC) as well as launching the resources that live within those virtual environments without requiring the virtual environments to run pacemaker or corosync. == Terms == +cluster-node+ - A node running the High Availability stack (pacemaker + corosync) +remote-node+ - A node running pacemaker_remote without the rest of the High Availability stack. There are two types of remote-nodes, container and baremetal. +container+ - A pacemaker resource that contains additional resources. For example, a KVM virtual machine resource that contains a webserver resource. +container remote-node+ - A virtual guest remote-node running the pacemaker_remote service. This describes a specific remote-node use case where a virtual guest resource managed by the cluster is both started by the cluster and integrated into the cluster as a remote-node. +baremetal+ - Term used to describe an environment that is not virtualized. +baremetal remote-node+ - A baremetal hardware node running pacemaker_remote. This describes a specific remote-node use case where a hardware node not running the High Availability stack is integrated into the cluster as a remote-node through the use of pacemaker_remote. +pacemaker_remote+ - A service daemon capable of performing remote application management within guest nodes (baremetal, kvm, and lxc) in both pacemaker cluster environments and standalone (non-cluster) environments. This service is an enhanced version of pacemaker's local resource manage daemon (LRMD) that is capable of managing and monitoring LSB, OCF, upstart, and systemd resources on a guest remotely. It also allows for most of pacemaker's cli tools (crm_mon, crm_resource, crm_master, crm_attribute, ect..) to work natively on remote-nodes. +LXC+ - A Linux Container defined by the libvirt-lxc Linux container driver. http://libvirt.org/drvlxc.html -== Version Info == - -This feature is in ongoing development. - -+Pacemaker v1.1.10+ - -* Initial pacemaker_remote daemon and integration support. -* Only supports pacemaker in KVM/LXC environments. -* pacemaker_remote daemon unit test suite. -* Known bugs include (These are likely resolved if you have received an 1.1.10.x point release): Errors when setting remote-node attributes, Failures when stopping orphaned (deleted from cib while running) remote-nodes, Fixes remote-node usage in asymmetric clusters. - -+Currently in Master github branch and scheduled for Pacemaker v1.1.11+ - -* Baremetal remote-node support. -* Improvements to scaling remote-node integration. Performance testing here included 16 cluster nodes running 64 remote-nodes living in LXC containers. As part of this testing, several performance enhancements were introduced into the integration code. -* CTS tests. RemoteLXC and RemoteBaremetal. These two CTS tests allow us to perform automated verification of pacemaker_remote integration. -* Fixes for known bugs in 1.1.10 release. +== Support in Pacemaker Versions == + +It is recommended to run Pacemaker 1.1.12 or later when using pacemaker_remote +due to important bug fixes. An overview of changes in pacemaker_remote +capability by version: + +.1.1.13 +* Support for maintenance mode +* Remote nodes can recover without being fenced when the cluster node + hosting their connection fails +* Running pacemaker_remote within LXC environments is deprecated due to + newly added Pacemaker support for isolated resources +* Bug fixes + +.1.1.12 +* Support for permanent node attributes +* Support for migration +* Bug fixes + +.1.1.11 +* Support for IPv6 +* Support for remote nodes +* Support for transient node attributes +* Support for clusters with mixed endian architectures +* Bug fixes + +.1.1.10 +* Bug fixes + +.1.1.9 +* Initial version to include pacemaker_remote +* Limited to guest nodes in KVM/LXC environments using only IPv4; + all nodes' architectures must have same endianness == Virtual Machine Use Case == The use of pacemaker_remote in virtual machines solves a deployment scenario that has traditionally been difficult to solve. +"I want a pacemaker cluster to manage virtual machine resources, but I also want pacemaker to be able to manage the resources that live within those virtual machines."+ In the past, users desiring this deployment had to make a decision. They would either have to sacrifice the ability of monitoring resources residing within virtual guests by running the cluster stack on the baremetal nodes, or run another cluster instance on the virtual guests where they potentially run into corosync scalability issues. There is a third scenario where the virtual guests run the cluster stack and join the same network as the baremetal nodes, but that can quickly hit issues with scalability as well. With the pacemaker_remote service we have a new option. * The baremetal cluster-nodes run the cluster stack (pacemaker+corosync). * The virtual remote-nodes run the pacemaker_remote service (nearly zero configuration required on the virtual machine side) * The cluster stack on the cluster-nodes launch the virtual machines and immediately connect to the pacemaker_remote service, allowing the virtual machines to integrate into the cluster just as if they were a real cluster-node. The key difference here between the virtual machine remote-nodes and the cluster-nodes is that the remote-nodes are not running the cluster stack. This means the remote nodes will never become the DC, and they do not take place in quorum. On the other hand this also means that remote-nodes are not bound to the scalability limits associated with the cluster stack either. +No 16 node corosync member limits+ to deal with. That isn't to say remote-nodes can scale indefinitely, but it is known that remote-nodes scale horizontally much further than cluster-nodes. Other than the quorum limitation, these remote-nodes behave just like cluster nodes in respects to resource management. The cluster is fully capable of managing and monitoring resources on each remote-node. You can build constraints against remote-nodes, put them in standby, or whatever else you'd expect to be able to do with normal cluster-nodes. They even show up in the crm_mon output as you would expect cluster-nodes to. To solidify the concept, below is an example deployment that is very similar to an actual deployment we test in our developer environment to verify remote-node scalability. * 16 cluster-nodes running corosync+pacemaker stack. * 64 pacemaker managed virtual machine resources running pacemaker_remote configured as remote-nodes. * 64 pacemaker managed webserver and database resources configured to run on the 64 remote-nodes. With this deployment you would have 64 webservers and databases running on 64 virtual machines on 16 hardware nodes all of which are managed and monitored by the same pacemaker deployment. It is known that pacemaker_remote can scale to these lengths and possibly much further depending on the specific scenario. == Baremetal remote-node Use Case == +"I want my traditional High Availability cluster to scale beyond the limits imposed by the corosync messaging layer."+ Ultimately the primary advantage of baremetal remote-nodes over traditional nodes running the Corosync+Pacemaker stack is scalability. There are likely some other use cases related to geographically distributed HA clusters that baremetal remote-nodes may serve a purpose in, but those use cases are not well understood at this point. The only limitations baremetal remote-nodes have that cluster-nodes do not is the ability to take place in cluster quorum, and the ability to execute fencing agents via stonith. That is not to say however that fencing of a baremetal node works any differently than that of a normal cluster-node. The Pacemaker policy engine understands how to fence baremetal remote-nodes. As long as a fencing device exists, the cluster is capable of ensuring baremetal nodes are fenced in the exact same way as normal cluster-nodes are fenced. == Linux Container Use Case == +I want to isolate and limit the system resources (cpu, memory, filesystem) a cluster resource can consume without using virtual machines.+ Using pacemaker_remote with Linux containers (libvirt-lxc) opens up some interesting possibilities for isolating resources in the cluster without the use of a hypervisor. We now have the ability to both define a contained environment with cpu and memory utilization limits and then assign resources to that contained environment all managed from within pacemaker. The LXC Walk-through section of this document outlines how pacemaker_remote can be used to bring Linux containers into the cluster as remote-nodes capable of executing resources. == Expanding the Cluster Stack == === Traditional HA Stack === image::images/pcmk-ha-cluster-stack.png["The Traditional Pacemaker Corosync HA Stack.",width="17cm",height="9cm",align="center"] === Remote-Node Enabled HA Stack Using Virtual guest nodes === image::images/pcmk-ha-remote-stack.png["Placing Pacemaker Remote into the Traditional HA Stack.",width="20cm",height="10cm",align="center"] diff --git a/doc/Pacemaker_Remote/en-US/Pacemaker_Remote.xml b/doc/Pacemaker_Remote/en-US/Pacemaker_Remote.xml index 9a5e119481..c02dd20f5d 100644 --- a/doc/Pacemaker_Remote/en-US/Pacemaker_Remote.xml +++ b/doc/Pacemaker_Remote/en-US/Pacemaker_Remote.xml @@ -1,18 +1,17 @@ %BOOK_ENTITIES; ]> -