diff --git a/doc/Pacemaker_Explained/en-US/Author_Group.xml b/doc/Pacemaker_Explained/en-US/Author_Group.xml index 5e02cde545..70fe29437e 100644 --- a/doc/Pacemaker_Explained/en-US/Author_Group.xml +++ b/doc/Pacemaker_Explained/en-US/Author_Group.xml @@ -1,17 +1,35 @@ AndrewBeekhof Red Hat Primary author andrew@beekhof.net PhilippMarek LINBit Style and formatting updates. Indexing. philipp.marek@linbit.com + + TanjaRoth + SUSE + Utilization chapter + troth@suse.com + + + YanGao + SUSE + Utilization chapter + ygao@suse.com + + + ThomasSchraitle + SUSE + Utilization chapter + toms@suse.com + diff --git a/doc/Pacemaker_Explained/en-US/Ch-Utilization.txt b/doc/Pacemaker_Explained/en-US/Ch-Utilization.txt new file mode 100644 index 0000000000..32d721dff5 --- /dev/null +++ b/doc/Pacemaker_Explained/en-US/Ch-Utilization.txt @@ -0,0 +1,216 @@ + += Utilization and Placement Strategy = + +== Background == + +Pacemaker decides where to place a resource according to the resource +allocation scores on every node. The resource will be allocated to the +node where the resource has the highest score. If the resource allocation +scores on all the nodes are equal, by the default placement strategy, +Pacemaker will choose a node with the least number of allocated resources +for balancing the load. If the number of resources on each node is equal, +the node with the lowest node id will be chosen to run the resource. + +Though resources are different. They may consume different amounts of the +capacities of the nodes. Actually, we cannot ideally balance the load just +according to the number of resources allocated to a node. Besides, If +resources are placed such that their combined requirements exceed the +provided capacity, they may fail to start completely or run with degraded +performance. + +To take these into account, Pacemaker allows you to specify the following +configurations: + + - The capacity a certain node provides. + + - The capacity a certain resource requires. + + - An overall strategy for placement of resources. + + +== Utilization attributes + +To configure the capacity a node provides and the resource's requirements, +use utilization attributes. You can name the utilization attributes +according to your preferences and define as many name/value pairs as your +configuration needs. However, the attribute's values must be integers. + +First, specify the capacities the nodes provide: + + + + + + + + + + + + + + + +Then, specify the capacities the resources require: + + + + + + + + + + + + + + + + + + + + + +A node is considered eligible for a resource if it has sufficient free +capacity to satisfy the resource's requirements. The nature of the required +or provided capacities is completely irrelevant for Pacemaker, it just makes +sure that all capacity requirements of a resource are satisfied before placing +a resource to a node. + + +== Placement Strategy + +After you have configured the capacities your nodes provide and the +capacities your resources require, you need to set the placement-strategy +in the global cluster options, otherwise the capacity configurations have +no effect. + +Four values are available for the placement-strategy: + + - default + + Utilization values are not taken into account at all, per default. +Resources are allocated according to allocation scores. If scores are equal, +resources are evenly distributed across nodes. + + - utilization + + Utilization values are taken into account when deciding whether a node +is considered eligible if it has sufficient free capacity to satisfy the +resource's requirements. However, load-balancing is still done based on the +number of resources allocated to a node. + + - balanced + + Utilization values are taken into account when deciding whether a node +is eligible to serve a resource; an attempt is made to spread the resources +evenly, optimizing resource performance. + + - minimal + + Utilization values are taken into account when deciding whether a node +is eligible to serve a resource; an attempt is made to concentrate the +resources on as few nodes as possible, thereby enabling possible power savings +on the remaining nodes. + + +Set placement-strategy with crm_attribute: +crm_attribute --attr-name placement-strategy --attr-value balanced + +Now Pacemaker will ensure the load from your resources will be +distributed "'evenly" throughout the cluster - without the need for +convoluted sets of colocation constraints. + + +== Allocation Details + +=== Which node is preferred to be chosen to get consumed first on allocating +resources? + +- The node that is most healthy (which has the highest node weight) gets consumed first. + +* If their weights are equal: + * If placement-strategy = "default | utilization" + - The node that has the least number of allocated resources gets consumed first. + * If their numbers of allocated resources are equal: + - The first node listed in cib gets consumed first. + + * If placement-strategy = "balanced": + - The node that has more free capacity gets consumed first. + * If the free capacities of the nodes are equal: + - The node that has the least number of allocated resources gets consumed first. + * If their numbers of allocated resources are equal: + - The first node listed in cib gets consumed first. + + * If placement-strategy = "minimal": + - The first node listed in cib gets consumed first. + + +==== Which node has more free capacity? + +This will be quite clear if we only define one type of "capacity". While if we +define multiple types of "capacity", for example: + +If nodeA has more free cpus, nodeB has more free memory, +-- their free capacities are equal. + +If nodeA has more free cpus, while nodeB has more free memory and storage, +-- nodeB has more free capacity. + + +=== Which resource is preferred to be chosen to get assigned first? + +- The resource that has the highest priority gets allocated first. + +- If their priorities are equal, check if they are already running. The +resource that has the highest score on the node where it's running gets allocated +first (to prevent resource shuffling). + +- If the scores above are equal or they are not running, the resource has + the highest score on the preferred node gets allocated first. + + +== Limitations + +This type of problem Pacemaker is dealing with here is known as the +"knapsack problem" and falls into the "NP-complete" category of computer +science problems - which is fancy way of saying "it takes a really long time +to solve". + +Clearly in a HA cluster, it's not acceptable to spend minutes, let alone hours +or days, finding an optional solution while services remain unavailable. + +So instead of trying to solve the problem completely, Pacemaker uses a +"best effort" algorithm for determining which node should host a particular +service. This means it arrives at a "solution" much faster than traditional +linear programming algorithms, but by doing so at the price of leaving some +services stopped. + +In the contrived example above: + + rsc-small would be allocated to node1 + rsc-medium would be allocated to node2 + rsc-large would remain inactive + +Which is not ideal. + + +== Strategies for Dealing with the Limitations + +- Ensure you have sufficient physical capacity. It might sounds obvious, but +if the physical capacity of your nodes is (close to) maxed out by the cluster +under normal conditions, then failover isn't going to go well. Even without +the Utilization feature, you'll start hitting timeouts and getting secondary +"failures". + +- Build some buffer into the capabilities advertised by the nodes. Advertise +slightly more resources than we physically have on the (usually valid) +assumption that a resource will not use 100% of the configured number of +cpu/memory/etc all the time. This practice is also known as "over commit". + +- Specify resource priorities. If the cluster is going to sacrifice services, +it should be the ones you care (comparatively) about the least. Ensure that +resource priorities are properly set so that your most important resources +are scheduled first. diff --git a/doc/Pacemaker_Explained/en-US/Pacemaker_Explained.xml b/doc/Pacemaker_Explained/en-US/Pacemaker_Explained.xml index 763f55871a..27da3ee8c6 100644 --- a/doc/Pacemaker_Explained/en-US/Pacemaker_Explained.xml +++ b/doc/Pacemaker_Explained/en-US/Pacemaker_Explained.xml @@ -1,48 +1,49 @@ Receiving Notification for Cluster Events
Configuring Email Notifications
Configuring SNMP Notifications
+ Further Reading Project Website Project Documentation A comprehensive guide to cluster commands has been written by Novell Heartbeat configuration: Corosync Configuration: