diff --git a/html/pacemaker/index.html b/html/pacemaker/index.html
index 5bb9787..3641585 100644
--- a/html/pacemaker/index.html
+++ b/html/pacemaker/index.html
@@ -1,199 +1,199 @@
ClusterLabs > Pacemaker
"The definitive open-source high-availability stack for the Linux platform builds upon the Pacemaker cluster resource manager." -- LINUX Journal, "Ahead of the Pack: the Pacemaker High-Availability Stack"

Features

Background

Black Duck Open Hub project report for pacemaker

Pacemaker has been around since 2004 and is primarily a collaborative effort between Red Hat and SUSE. However, we also receive considerable help and support from the folks at LinBit and the community in general.

The core Pacemaker team is made up of full-time developers from Australia, the Czech Republic, the USA, and Germany. Contributions to the code or documentation are always welcome.

Pacemaker ships with most modern Linux distributions and has been deployed in many critical environments, including Deutsche Flugsicherung GmbH (DFS), which uses Pacemaker to ensure its air traffic control systems are always available.

Currently Andrew Beekhof is the project lead for Pacemaker.

diff --git a/src/_config.yml b/src/_config.yml
index ef03b00..3a07a41 100644
--- a/src/_config.yml
+++ b/src/_config.yml
@@ -1,52 +1,52 @@
# Welcome to Jekyll!
#
# This config file is meant for settings that affect your whole blog, values
# which you are expected to set up once and rarely edit after that. If you find
# yourself editing this file very often, consider using Jekyll's data files
# feature for the data you need to update frequently.
#
# For technical reasons, this file is *NOT* reloaded automatically when you use
# 'bundle exec jekyll serve'. If you change this file, please restart the server process.

# Site settings
# These are used to personalize your new site. If you look in the HTML files,
# you will see them accessed via {{ site.title }}, {{ site.email }}, and so on.
# You can create any custom variable you would like, and they will be accessible
# in the templates via {{ site.myvariable }}.
title: ClusterLabs
email: andrew@beekhof.net
description: Community hub for open-source high-availability software
-url: http://www.clusterlabs.org/
+url: https://www.clusterlabs.org/
google_analytics: UA-8156370-1

# Build settings
theme: minima
destination: ../html
gems:
  - jekyll-assets
  - font-awesome-sass
include:
  - doc
  - pacemaker
  - polls
exclude:
  - Gemfile
  - Gemfile.lock
  - LICENSE.theme

# All content generated outside of jekyll, or not yet converted to jekyll,
# must be listed here, or jekyll will erase it when building the site.
# Though not documented as such, the values here function as prefix matches.
keep_files:
  - images
  - pacemaker/abi
  - pacemaker/doc
  - pacemaker/doxygen
  - pacemaker/global
  - pacemaker/man
  - Pictures
  - rpm-test
  - rpm-test-next
  - rpm-test-rhel

diff --git a/src/_includes/sidebar.html b/src/_includes/sidebar.html
index e633082..2a561f1 100644
--- a/src/_includes/sidebar.html
+++ b/src/_includes/sidebar.html
@@ -1,49 +1,49 @@

diff --git a/src/_layouts/home.html b/src/_layouts/home.html
index 914f684..fb480da 100644
--- a/src/_layouts/home.html
+++ b/src/_layouts/home.html
@@ -1,203 +1,213 @@
---
layout: clusterlabs
---

Quick Overview

{% img Deploy-small.png %}

Deploy

We support many deployment scenarios, from the simplest 2-node standby cluster to a 32-node active/active configuration. We can also dramatically reduce hardware costs by allowing several active/passive clusters to be combined and share a common backup node.

{% img Monitor-small.png %}

Monitor

We monitor the system for both hardware and software failures. In the event of a failure, we will automatically recover your application and make sure it is available from one of the remaining machines in the cluster.

{% img Recover-small.png %}

Recover

After a failure, we use advanced algorithms to quickly determine the optimum locations for services based on relative node preferences and/or requirements to run with other cluster services (we call these "constraints").
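As a sketch of what such constraints look like in practice with the pcs tool (the resource and node names here are purely illustrative):

# prefer to run my_svc on node1, with a score of 50
pcs constraint location my_svc prefers node1=50
# keep my_app on the same node as my_ip, and start my_ip first
pcs constraint colocation add my_app with my_ip INFINITY
pcs constraint order my_ip then my_app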

Why clusters

At its core, a cluster is a distributed finite state machine capable of co-ordinating the startup and recovery of inter-related services across a set of machines.

System HA is possible without a cluster manager, but using one saves you many headaches anyway.

Even a distributed and/or replicated application that is able to survive the failure of one or more components can benefit from a higher-level cluster:

While SysV init replacements like systemd can provide deterministic recovery of a complex stack of services, the recovery is limited to one machine and lacks the context of what is happening on other machines - context that is crucial to distinguish between a local failure, a clean startup, and recovery after a total site failure.

Features

The ClusterLabs stack, incorporating Corosync and Pacemaker, defines an open-source, high-availability cluster offering suitable for both small and large deployments.

Components

"The definitive open-source high-availability stack for the Linux platform builds upon the Pacemaker cluster resource manager."
-- LINUX Journal, "Ahead of the Pack: the Pacemaker High-Availability Stack"

A Pacemaker stack is built on five core components: Pacemaker itself, the Corosync messaging layer, the libQB library, resource agents, and fence agents.

We describe each of these in more detail, as well as other optional components such as CLIs and GUIs.

Background

Pacemaker has been around since 2004 and is primarily a collaborative effort between Red Hat and SUSE; however, we also receive considerable help and support from the folks at LinBit and the community in general.

"Pacemaker cluster stack is the state-of-the-art high availability and load balancing stack for the Linux platform."
-- OpenStack documentation

Corosync also began life in 2004 but was then part of the OpenAIS project. It is primarily a Red Hat initiative, with considerable help and support from the folks in the community.

The core ClusterLabs team is made up of full-time developers from Australia, Austria, Canada, China, the Czech Republic, England, Germany, Sweden, and the USA. Contributions to the code or documentation are always welcome.

The ClusterLabs stack ships with most modern enterprise distributions and has been deployed in many critical environments including Deutsche Flugsicherung GmbH (DFS) which uses Pacemaker to ensure its air traffic control systems are always available.

diff --git a/src/components.html b/src/components.html
index 22cbd30..e69c430 100644
--- a/src/components.html
+++ b/src/components.html
@@ -1,178 +1,179 @@
---
layout: pacemaker
title: Components
---

Core Components

Pacemaker

At its core, Pacemaker is a distributed finite state machine capable of co-ordinating the startup and recovery of inter-related services across a set of machines.

Pacemaker understands many different resource types (OCF, SYSV, systemd) and can accurately model the relationships between them (colocation, ordering).

It can even use technology such as Docker to automatically isolate the resources managed by the cluster.


Corosync

Corosync APIs provide membership (a list of peers), messaging (the ability to talk to processes on those peers), and quorum (do we have a majority) capabilities to projects such as Apache Qpid and Pacemaker.
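For a feel of what these APIs expose, the membership and quorum state of a running Corosync 2.x node can be inspected from the shell (output will vary with your cluster):

corosync-quorumtool -s            # quorum status and current member list
corosync-cmapctl | grep member    # raw membership keys in Corosync's database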


libQB

libqb is a library whose primary purpose is to provide high-performance, reusable client-server features. It provides high-performance logging, tracing, IPC, and polling.

The initial features of libqb come from the parts of Corosync that were thought to be useful to other projects.

Resource Agents

Resource agents are the abstraction that allows Pacemaker to manage services it knows nothing about. They contain the logic for what to do when the cluster wishes to start, stop or check the health of a service.

This particular set of agents conforms to the Open Cluster Framework (OCF) specification. A guide to writing agents is also available.
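Because the agent contract is just environment variables plus actions (start, stop, monitor, and so on), an OCF agent can be exercised outside the cluster. A minimal sketch using the Dummy agent shipped with Pacemaker (paths vary by distribution; ocf-tester comes with the resource-agents package):

ocf-tester -n test /usr/lib/ocf/resource.d/pacemaker/Dummy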

Fence Agents

Fence agents are the abstraction that allows Pacemaker to isolate badly behaving nodes. They achieve this by either powering off the node or disabling its access to the network and/or shared storage.

Many types of network power switches exist, and you will want to choose the one(s) that match your hardware. Please be aware that some (those that don't lose power when the machine goes down) are better than others.

Agents are generally expected to expose OCF-compliant metadata.
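With pcs, for example, you can list the fence agents installed on a node, inspect an agent's parameters, and configure a device. The agent and parameter names below are illustrative only and depend on your hardware and agent version:

pcs stonith list
pcs stonith describe fence_ipmilan
pcs stonith create my_ipmi fence_ipmilan pcmk_host_list=node1 ipaddr=10.0.0.1 login=admin passwd=secret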

OCF specification

The original documentation that sparked a lot of this work. Mostly we only use the "RA" specification. Efforts are underway to revive the process for updating and modernizing the spec.

Configuration Tools

Pacemaker's internal configuration format is XML, which is great for machines but terrible for humans.

The community's best minds have created GUIs and shells that hide the XML and allow the configuration to be viewed and updated in a more human-friendly format.

Command Line Interfaces (Shells)


crmsh

The original configuration shell for Pacemaker. Written and actively maintained by SUSE, it may be used as an interactive shell with tab completion, for single commands directly on the shell's command line, or as a batch-mode scripting tool. Documentation for crmsh can be found here.
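As a small taste of the syntax, configuring a hypothetical virtual IP resource might look like this (the address and names are examples only):

crm configure primitive virtual-ip ocf:heartbeat:IPaddr2 params ip=192.168.122.100 op monitor interval=30s
crm configure show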

pcs

An alternate vision for a full cluster lifecycle configuration shell and web based GUI. Handles everything from cluster installation through to resource configuration and status.

GUI Tools

pygui

The original GUI for Pacemaker, written in Python by IBM China. Mostly deprecated on SLES in favor of Hawk.


Hawk

Hawk is a web-based GUI for managing and monitoring Pacemaker HA clusters. It is generally intended to be run on every node in the cluster, so that you can just point your web browser at any node to access it. There is a usage guide at hawk-guide.readthedocs.io, and it is documented as part of the SUSE Linux Enterprise High Availability Extension documentation.

LCMC

The Linux Cluster Management Console (LCMC) is a GUI with an innovative approach for representing the status of and relationships between cluster services. It uses SSH to let you install, configure and manage clusters from your desktop.

pcs

An alternate vision for a full cluster lifecycle configuration shell and web based GUI. Handles everything from cluster installation through to resource configuration and status.

Striker

Striker is the user interface for the Anvil! (virtual) server platform and the ScanCore autonomous self-defence and alert system.

Other Add-ons

booth

The Booth cluster ticket manager extends Pacemaker to support geographically distributed clustering. It does this by managing the granting and revoking of 'tickets', which authorize one of the potentially geographically dispersed cluster sites to run certain resources.
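A booth deployment is driven by a short configuration file shared by the sites and the arbitrator. A minimal sketch, in which all addresses and the ticket name are hypothetical:

# /etc/booth/booth.conf
transport = UDP
port = 9929
arbitrator = 192.168.100.1
site = 192.168.101.1
site = 192.168.102.1
ticket = "ticketA"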

sbd

SBD provides a node fencing mechanism through the exchange of messages via shared block storage such as a SAN, iSCSI, or FCoE device. This isolates the fencing mechanism from changes in firmware versions or dependencies on specific firmware controllers, and it can be used as a STONITH mechanism in all configurations that have reliable shared storage. It can also be used as a pure watchdog-based fencing mechanism.
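For the shared-storage variant, the device is initialized once and can then be inspected. A sketch, assuming a dedicated (and expendable) shared block device:

sbd -d /dev/disk/by-id/MY-SHARED-DISK create    # writes SBD metadata (destructive!)
sbd -d /dev/disk/by-id/MY-SHARED-DISK list      # show the message slots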

diff --git a/src/corosync.html b/src/corosync.html
index b051e68..69878ca 100644
--- a/src/corosync.html
+++ b/src/corosync.html
@@ -1,56 +1,56 @@
---
layout: default
title: Corosync
---

Virtual synchrony

A closed process group communication model with virtual synchrony guarantees for creating replicated state machines.

Availability

A simple availability manager that restarts the application process when it has failed.

Information

A configuration and statistics in-memory database that provides the ability to set, retrieve, and receive change notifications of information.
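The corosync-cmapctl utility exposes this database from the shell; for example (exact key names vary between Corosync versions):

corosync-cmapctl | head          # dump the beginning of the key store
corosync-cmapctl -g totem.token  # retrieve a single key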

Quorum

A quorum system that notifies applications when quorum is achieved or lost.

diff --git a/src/faq.html b/src/faq.html
index 7d907bd..8ac5ed4 100644
--- a/src/faq.html
+++ b/src/faq.html
@@ -1,141 +1,141 @@
---
layout: default
title: FAQ
---

Frequently Asked Questions

Q: Where can I get Pacemaker?

A: Pacemaker ships as part of most modern distributions, so you can usually just launch your favorite package manager on:

If all else fails, you can try installing from source.

Q: Is there any documentation?

A: Yes. You can find the set relevant to your version in our documentation index.

Q: Where should I ask questions?

A: Often basic questions can be answered on irc, but sending them to the mailing list is always a good idea so that everyone can benefit from the answer.

Q: Do I need shared storage?

A: No. We can help manage it if you have some, but Pacemaker itself has no need for shared storage.

Q: Which cluster filesystems does Pacemaker support?

A: Pacemaker supports the popular OCFS2 and GFS2 filesystems. As you'd expect, you can use them on top of real disks or network block devices like DRBD.

Q: What kind of applications can I manage with Pacemaker?

A: Pacemaker is application agnostic, meaning anything that can be scripted can be made highly available, provided the script conforms to one of the supported standards: LSB, OCF, systemd, or Upstart.
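For example, the standard is simply part of the resource specification when the resource is created. With pcs, and purely illustrative service names:

pcs resource create web_server systemd:httpd op monitor interval=60s
pcs resource create my_ip ocf:heartbeat:IPaddr2 ip=192.168.122.100 op monitor interval=30s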

Q: Can I use Pacemaker with Heartbeat?

A: Yes. Pacemaker started off life as part of the Heartbeat project and continues to support it as an alternative to Corosync. See this documentation for more details

Q: Can I use Pacemaker with CMAN?

A: Yes. Pacemaker added support for CMAN v3 in version 1.1.5 to better integrate with distros that have traditionally shipped and/or supported the RHCS cluster stack instead of Pacemaker. This is particularly relevant for those looking to use GFS2 or OCFS2. See the documentation for more details

Q: Can I use Pacemaker with Corosync 1.x?

A: Yes. You will need to configure Corosync to load Pacemaker's custom plugin to provide the membership and quorum information we require. See the documentation for more details.

Q: Can I use Pacemaker with Corosync 2.x?

A: Yes. Pacemaker can obtain the membership and quorum information it requires directly from Corosync in this configuration. See the documentation for more details.

Q: Do I need a fencing device?

A: Yes. Fencing is the only 100% reliable way to ensure the integrity of your data and that applications are only active on one host. Although Pacemaker is technically able to function without fencing, there are good reasons SUSE and Red Hat will not support such a configuration.

Q: Do I need to know XML to configure Pacemaker?

A: No. Although Pacemaker uses XML as its native configuration format, there exist two CLIs and at least four GUIs that present the configuration in a human-friendly format.

Q: How do I synchronize the cluster configuration?

A: Any changes to Pacemaker's configuration are automatically replicated to other machines. The configuration is also versioned, so any offline machines will be updated when they return.
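You can observe the versioning on any node by querying the CIB directly; the epoch and num_updates fields in the first line increase with every change:

cibadmin --query | head -n 1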

Q: Should I choose pcs or crmsh?

A: Arguably the best advice is to use whichever one comes with your distro. This is the one that will be tailored to that environment, receive regular bugfixes, and feature in the documentation.

Of course, for years people have been side-loading all of Pacemaker onto enterprise distros that didn't ship it, so doing the same for just a configuration tool should be easy if your favorite distro does not ship your favorite tool.

Q: What if my question isn't here?

A: See the getting help section and let us know!

diff --git a/src/help.html b/src/help.html
index 5592b4d..f88cb84 100644
--- a/src/help.html
+++ b/src/help.html
@@ -1,166 +1,161 @@
---
layout: pacemaker
title: Help
---

Getting Help

You can stay up to date with the Pacemaker project by subscribing to our site updates feeds.

A good first step is always to check out the FAQ and documentation. Otherwise, many members of the community hang out on irc and are happy to answer questions. We are spread out over many timezones though (and have day jobs), so you may need to be patient when waiting for a reply.

Extended or complex issues might be better sent to the relevant mailing list(s) (you'll need to subscribe in order to send messages). Don't worry if you pick the wrong one; many of us are on multiple lists, and someone will suggest a more appropriate forum if necessary.

People new to the project, or Open Source generally, are encouraged to read Getting Answers by Mike Ash from Rogue Amoeba. It provides some very good tips on effective communication with groups such as this one. Following the advice it contains will greatly increase the chance of a quick and helpful reply.

Bugs and other problems can also be reported via Bugzilla.

Or if you already know the solution, submit a patch against our GitHub repository.

The development of most of the ClusterLabs-related projects takes place as part of the ClusterLabs organization at GitHub, and the source code and issue trackers for these projects can be found there.

Providing Help

If you find this project useful, you may want to consider supporting its future development. There are a number of ways to support the project (in no particular order):

Thank you for using Pacemaker

Professional Support

Does your company provide Pacemaker training or support? Let us know!

diff --git a/src/pacemaker/index.html b/src/pacemaker/index.html
index 19a68f3..008365c 100644
--- a/src/pacemaker/index.html
+++ b/src/pacemaker/index.html
@@ -1,88 +1,88 @@
---
layout: default
title: Pacemaker
---
"The definitive open-source high-availability stack for the Linux platform builds upon the Pacemaker cluster resource manager." -- LINUX Journal, "Ahead of the Pack: the Pacemaker High-Availability Stack"

Features

Background

Black Duck Open Hub project report for pacemaker

Pacemaker has been around since 2004 and is primarily a collaborative effort between Red Hat and SUSE. However, we also receive considerable help and support from the folks at LinBit and the community in general.

The core Pacemaker team is made up of full-time developers from Australia, the Czech Republic, the USA, and Germany. Contributions to the code or documentation are always welcome.

Pacemaker ships with most modern Linux distributions and has been deployed in many critical environments, including Deutsche Flugsicherung GmbH (DFS), which uses Pacemaker to ensure its air traffic control systems are always available.

Currently Andrew Beekhof is the project lead for Pacemaker.

diff --git a/src/quickstart-redhat-6.html b/src/quickstart-redhat-6.html
index e03d3c3..ae2934a 100644
--- a/src/quickstart-redhat-6.html
+++ b/src/quickstart-redhat-6.html
@@ -1,197 +1,199 @@
---
layout: pacemaker
title: RHEL 6 Quickstart
---
{% include quickstart-common.html %}

RHEL 6.4 onwards

Install

Pacemaker ships as part of the Red Hat High Availability Add-on. The easiest way to try it out on RHEL is to install it from the Scientific Linux or CentOS repositories.

If you are already running CentOS or Scientific Linux, you can skip this step. Otherwise, to teach the machine where to find the CentOS packages, run:

[ALL] # cat <<-EOF > /etc/yum.repos.d/centos.repo
[centos-6-base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
enabled=1
EOF

Next we use yum to install pacemaker and some other necessary packages we will need:

[ALL] # yum install pacemaker cman pcs ccs resource-agents

Configure Cluster Membership and Messaging

The supported stack on RHEL6 is based on CMAN, so that's what Pacemaker uses too.

We now create a CMAN cluster and populate it with some nodes. Note that the name cannot exceed 15 characters (we'll use 'pacemaker1').

[ONE] # ccs -f /etc/cluster/cluster.conf --createcluster pacemaker1
[ONE] # ccs -f /etc/cluster/cluster.conf --addnode node1
[ONE] # ccs -f /etc/cluster/cluster.conf --addnode node2

Next we need to teach CMAN how to send its fencing requests to Pacemaker. We do this regardless of whether or not fencing is enabled within Pacemaker.

[ONE] # ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk
[ONE] # ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect node1
[ONE] # ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect node2
[ONE] # ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk node1 pcmk-redirect port=node1
[ONE] # ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk node2 pcmk-redirect port=node2

Now copy /etc/cluster/cluster.conf to all the other nodes that will be part of the cluster.
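For example, assuming a second node named node2 that is reachable over SSH:

[ONE] # scp /etc/cluster/cluster.conf node2:/etc/cluster/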

Start the Cluster

CMAN was originally written for rgmanager and assumes the cluster should not start until the node has quorum, so before we try to start the cluster, we need to disable this behavior:

[ALL] # echo "CMAN_QUORUM_TIMEOUT=0" >> /etc/sysconfig/cman

Now, on each machine, run:

[ALL] # service cman start
[ALL] # service pacemaker start

A note for users of prior RHEL versions

The original cluster shell (crmsh) is no longer available on RHEL. To help people make the transition there is a quick reference guide for those wanting to know what the pcs equivalent is for various crmsh commands.

Set Cluster Options

With so many devices and possible topologies, it is nearly impossible to include Fencing in a document like this. For now we will disable it.

[ONE] # pcs property set stonith-enabled=false

One of the most common ways to deploy Pacemaker is in a 2-node configuration. However, quorum as a concept makes no sense in this scenario (because you only have it when more than half the nodes are available), so we'll disable it too.

[ONE] # pcs property set no-quorum-policy=ignore

For demonstration purposes, we will force the cluster to move services after a single failure:

[ONE] # pcs resource defaults migration-threshold=1

Add a Resource

Let's add a cluster service. We'll choose one that doesn't require any configuration and works everywhere, to make things easy. Here's the command:

[ONE] # pcs resource create my_first_svc Dummy op monitor interval=120s

"my_first_svc" is the name the service will be known as.

"ocf:pacemaker:Dummy" tells Pacemaker which script to use (Dummy - an agent that's useful as a template and for guides like this one), which namespace it is in (pacemaker) and what standard it conforms to (OCF).

"op monitor interval=120s" tells Pacemaker to check the health of this service every 2 minutes by calling the agent's monitor action.

You should now be able to see the service running using:

[ONE] # pcs status

or

[ONE] # crm_mon -1

Simulate a Service Failure

We can simulate an error by telling the service to stop directly (without telling the cluster):

[ONE] # crm_resource --resource my_first_svc --force-stop

If you now run crm_mon in interactive mode (the default), you should see (within the monitor interval of 2 minutes) the cluster notice that my_first_svc failed and move it to another node.
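Note that because we set migration-threshold=1 above, the service will not be allowed back onto the original node until its failure history is cleared, for example with:

[ONE] # pcs resource cleanup my_first_svc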

Next Steps

diff --git a/src/quickstart-redhat.html b/src/quickstart-redhat.html
index d11be19..19e9305 100644
--- a/src/quickstart-redhat.html
+++ b/src/quickstart-redhat.html
@@ -1,158 +1,160 @@
---
layout: pacemaker
title: RHEL 7 Quickstart
---
{% include quickstart-common.html %}

RHEL 7

Install

Pacemaker ships as part of the Red Hat High Availability Add-on. The easiest way to try it out on RHEL is to install it from the Scientific Linux or CentOS repositories.

If you are already running CentOS or Scientific Linux, you can skip this step. Otherwise, to teach the machine where to find the CentOS packages, run:

[ALL] # cat <<-EOF > /etc/yum.repos.d/centos.repo
[centos-7-base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
enabled=1
EOF

Next we use yum to install pacemaker and some other necessary packages we will need:

[ALL] # yum install pacemaker pcs resource-agents

Create the Cluster

The supported stack on RHEL7 is based on Corosync 2, so that's what Pacemaker uses too.

First we set up the authentication needed for pcs.

[ALL] # echo CHANGEME | passwd --stdin hacluster
[ONE] # pcs cluster auth node1 node2 -u hacluster -p CHANGEME --force

We now create a cluster and populate it with some nodes. Note that the name cannot exceed 15 characters (we'll use 'pacemaker1').

[ONE] # pcs cluster setup --force --name pacemaker1 node1 node2

Start the Cluster

[ONE] # pcs cluster start --all

Set Cluster Options

With so many devices and possible topologies, it is nearly impossible to include Fencing in a document like this. For now we will disable it.

[ONE] # pcs property set stonith-enabled=false

One of the most common ways to deploy Pacemaker is in a 2-node configuration. However, quorum as a concept makes no sense in this scenario (because you only have it when more than half the nodes are available), so we'll disable it too.

[ONE] # pcs property set no-quorum-policy=ignore

For demonstration purposes, we will force the cluster to move services after a single failure:

[ONE] # pcs resource defaults migration-threshold=1

Add a Resource

Let's add a cluster service. We'll choose one that doesn't require any configuration and works everywhere, to make things easy. Here's the command:

[ONE] # pcs resource create my_first_svc Dummy op monitor interval=120s

"my_first_svc" is the name the service will be known as.

"ocf:pacemaker:Dummy" tells Pacemaker which script to use (Dummy - an agent that's useful as a template and for guides like this one), which namespace it is in (pacemaker) and what standard it conforms to (OCF).

"op monitor interval=120s" tells Pacemaker to check the health of this service every 2 minutes by calling the agent's monitor action.

You should now be able to see the service running using:

[ONE] # pcs status

or

[ONE] # crm_mon -1

Simulate a Service Failure

We can simulate an error by telling the service to stop directly (without telling the cluster):

[ONE] # crm_resource --resource my_first_svc --force-stop

If you now run crm_mon in interactive mode (the default), you should see (within the monitor interval of 2 minutes) the cluster notice that my_first_svc failed and move it to another node.

Next Steps