diff --git a/.gitignore b/.gitignore
index abf43bc60b..ac6bfd4156 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,168 +1,171 @@
 # Common
 \#*
 .\#*
 GPATH
 GRTAGS
 GTAGS
 TAGS
 Makefile
 Makefile.in
 .deps
 .libs
 *.pc
 *.pyc
 *.bz2
 *.rpm
 *.la
 *.lo
 *.o
 *~
 *.gcda
 *.gcno

 # Autobuild
 aclocal.m4
 autoconf
 autoheader
 autom4te.cache/
 automake
 build.counter
 compile
 config.guess
 config.log
 config.status
 config.sub
 configure
 depcomp
 install-sh
 include/stamp-*
 libltdl.tar
 libtool
 libtool.m4
 ltdl.m4
 ltmain.sh
 missing
 py-compile
 m4/ltoptions.m4
 m4/ltsugar.m4
 m4/ltversion.m4
 m4/lt~obsolete.m4
 test-driver
 ylwrap

 # Configure targets
 Doxyfile
 coverage.sh
 cts/CTSvars.py
 cts/LSBDummy
 cts/benchmark/clubench
 cts/lxc_autogen.sh
 extra/logrotate/pacemaker
 include/config.h
 include/config.h.in
 include/crm_config.h
 lrmd/pacemaker_remote
+lrmd/pacemaker_remoted
 lrmd/pacemaker_remote.service
 mcp/pacemaker
 mcp/pacemaker.combined.upstart
 mcp/pacemaker.service
 mcp/pacemaker.upstart
 pengine/regression.core.sh
 publican.cfg
 shell/modules/help.py
 shell/modules/ra.py
 shell/modules/ui.py
 shell/modules/vars.py
 tools/cibsecret
 tools/coverage.sh
+tools/crm_error
 tools/crm_mon.upstart
 tools/crm_report
 tools/report.common
 lrmd/regression.py
 fencing/regression.py

 # Build targets
 *.7
 *.7.xml
 *.7.html
 *.8
 *.8.xml
 *.8.html
+attrd/attrd
 doc/*/en-US/images/*.png
 doc/*/tmp/**
 doc/*/publish
 cib/cib
 cib/cibmon
 cib/cibpipe
 crmd/atest
 crmd/crmd
 doc/Clusters_from_Scratch.txt
 doc/Pacemaker_Explained.txt
 doc/acls.html
 doc/crm_fencing.html
 fencing/stonith-test
 fencing/stonith_admin
 fencing/stonithd
 fencing/stonithd.xml
 lrmd/lrmd
 lrmd/lrmd_test
 mcp/pacemakerd
 pengine/pengine
 pengine/pengine.xml
 pengine/ptest
 shell/regression/testcases/confbasic-xml.filter
 scratch
-tools/attrd
 tools/attrd_updater
 tools/cibadmin
 tools/crm_attribute
 tools/crm_diff
 tools/crm_mon
 tools/crm_node
 tools/crm_resource
 tools/crm_shadow
 tools/crm_simulate
 tools/crm_uuid
 tools/crm_verify
 tools/crmadmin
 tools/iso8601
 tools/crm_ticket
 tools/report.collector.1
 xml/crm.dtd
-xml/pacemaker.rng
+xml/pacemaker*.rng
+xml/versions.rng
 extra/rgmanager/ccs2cib
 extra/rgmanager/ccs_flatten
 extra/rgmanager/disable_rgmanager
 doc/Clusters_from_Scratch.build
 doc/Clusters_from_Scratch/en-US/Ap-*.xml
 doc/Clusters_from_Scratch/en-US/Ch-*.xml
 doc/Pacemaker_Explained.build
 doc/Pacemaker_Explained/en-US/Ch-*.xml
 doc/Pacemaker_Explained/en-US/Ap-*.xml
 doc/Pacemaker_Remote.build
 doc/Pacemaker_Remote/en-US/Ch-*.xml
 lib/gnu/libgnu.a
 lib/gnu/stdalign.h
 *.coverity

 #Other
 mock
 HTML
-pacemaker.spec
+pacemaker*.spec
 pengine/.regression.failed.diff
 ClusterLabs-pacemaker-*.tar.gz
 coverity-*
 compat_reports
 .ABI-build
 abi_dumps
 logs
 *.patch
 *.diff
 *.sed
 *.orig
 *.rej
 *.swp
 pengine/test10/shadow.*
diff --git a/README.markdown b/README.markdown
index 8d57f0f842..e4f911b7e3 100644
--- a/README.markdown
+++ b/README.markdown
@@ -1,92 +1,94 @@
 # Pacemaker

 ## What is Pacemaker?
 Pacemaker is an advanced, scalable High-Availability cluster resource manager for Linux-HA (Heartbeat) and/or Corosync.

 It supports "n-node" clusters with significant capabilities for managing resources and dependencies.

 It will run scripts at initialization, when machines go up or down, when related resources fail and can be configured to periodically check resource health.

 ## For more information look at:
 * [Website](http://www.clusterlabs.org)
 * [Issues/Bugs](http://bugs.clusterlabs.org)
 * [Mailing list](http://oss.clusterlabs.org/mailman/listinfo/pacemaker).
 * [Documentation](http://www.clusterlabs.org/doc)

 ## User interfaces / shells
 There are multiple user interfaces for Pacemaker, both command line tools, graphical user interfaces and web frontends. The _crm shell_ used to be included in the Pacemaker source tree, but is now maintained as a separate project.

 This is not meant to be an exhaustive list:

 * _crmsh_: https://crmsh.github.io/
 * _pcs_: https://github.com/feist/pcs/
 * _LCMC_: http://lcmc.sourceforge.net/
 * _hawk_: https://github.com/ClusterLabs/hawk

 ## Build Dependencies
 * automake
 * autoconf
 * libtool-ltdl-devel
 * libuuid-devel
 * pkgconfig
 * python
 * glib2-devel
 * libxml2-devel
 * libxslt-devel
 * python-devel
 * gcc-c++
 * bzip2-devel
 * gnutls-devel
 * pam-devel
 * libqb-devel

 ## Cluster Stack Dependencies (Pick at least one)
 * clusterlib-devel (CMAN)
 * corosynclib-devel (Corosync)
 * heartbeat-devel (Heartbeat)

 ## Optional Build Dependencies
 * ncurses-devel
 * openssl-devel
 * libselinux-devel
+* systemd-devel
+* dbus-devel
 * cluster-glue-libs-devel (LHA style fencing agents)
 * libesmtp-devel (Email alerts)
 * lm_sensors-devel (SNMP alerts)
 * net-snmp-devel (SNMP alerts)
 * asciidoc (documentation)
 * help2man (documentation)
 * publican (documentation)
 * inkscape (documentation)
 * docbook-style-xsl (documentation)

 ## Source Control (GIT)

     git clone git://github.com/ClusterLabs/pacemaker.git

 [See Github](https://github.com/ClusterLabs/pacemaker)

 ## Installing from source

     $ ./autogen.sh
     $ ./configure
     $ make
     $ sudo make install

 ## How you can help
 If you find this project useful, you may want to consider supporting its future development.
 There are a number of ways to support the project.

 * Test and report issues.
 * Tick something off our [todo list](https://github.com/ClusterLabs/pacemaker/blob/master/TODO.markdown)
 * Help others on the [mailing list](http://oss.clusterlabs.org/mailman/listinfo/pacemaker).
 * Contribute documentation, examples and test cases.
 * Contribute patches.
 * Spread the word.
diff --git a/doc/Clusters_from_Scratch/en-US/Ch-Stonith.txt b/doc/Clusters_from_Scratch/en-US/Ch-Stonith.txt
index 230a208b52..0d67ecd90b 100644
--- a/doc/Clusters_from_Scratch/en-US/Ch-Stonith.txt
+++ b/doc/Clusters_from_Scratch/en-US/Ch-Stonith.txt
@@ -1,139 +1,140 @@
-[[_what_is_stonith]]
 = Configure STONITH =

+== What is STONITH? ==
+
 STONITH (Shoot The Other Node In The Head aka. fencing) protects your data from being corrupted by rogue nodes or unintended concurrent access.

 Just because a node is unresponsive doesn't mean it has stopped accessing your data. The only way to be 100% sure that your data is safe, is to use STONITH to ensure that the node is truly offline before allowing the data to be accessed from another node.

 STONITH also has a role to play in the event that a clustered service cannot be stopped. In this case, the cluster uses STONITH to force the whole node offline, thereby making it safe to start the service elsewhere.

 == Choose a STONITH Device ==

 It is crucial that your STONITH device can allow the cluster to differentiate between a node failure and a network failure.

 The biggest mistake people make in choosing a STONITH device is to use a remote power switch (such as many on-board IPMI controllers) that shares power with the node it controls. In such cases, the cluster cannot be sure if the node is really offline, or active and suffering from a network fault.

 Likewise, any device that relies on the machine being active (such as SSH-based "devices" used during testing) are inappropriate.

 == Configure the Cluster for STONITH ==

 . Configure the STONITH device itself to be able to fence your nodes and accept fencing requests.

 . Install the STONITH agent(s). To see what packages are available, run `yum search fence-agents fence-virt`. Be sure to install the package(s) on all cluster nodes.

 . Find the correct STONITH agent script: `pcs stonith list`

 . Find the parameters associated with the device: +pcs stonith describe pass:[agent_name]+

 . Create a local copy of the CIB: `pcs cluster cib stonith_cfg`

 . Create the fencing resource: +pcs -f stonith_cfg stonith create pass:[stonith_id stonith_device_type [stonith_device_options]]+

 . Enable STONITH in the cluster: `pcs -f stonith_cfg property set stonith-enabled=true`

 . If the device does not know how to fence nodes based on their uname, you may also need to set the special *pcmk_host_map* parameter. See `man stonithd` for details.

 . If the device does not support the *list* command, you may also need to set the special *pcmk_host_list* and/or *pcmk_host_check* parameters. See `man stonithd` for details.

 . If the device does not expect the victim to be specified with the *port* parameter, you may also need to set the special *pcmk_host_argument* parameter. See `man stonithd` for details.

 . Commit the new configuration: `pcs cluster cib-push stonith_cfg`

 . Once the STONITH resource is running, test it (you might want to stop the cluster on that machine first): +stonith_admin --reboot pass:[nodename]+

 == Example ==

 For this example, assume we have a chassis containing four nodes and an IPMI device active on 10.0.0.1. Following the steps above would go something like this:

 Step 1: Configure the IP address, authentication credentials, etc. in the IPMI device itself.

 Step 2: Install the *fence-agents-ipmilan* package on both nodes.

 Step 3: Choose the *fence_ipmilan* STONITH agent.

 Step 4: Obtain the agent's possible parameters:
 ----
 [root@pcmk-1 ~]# pcs stonith describe fence_ipmilan
 Stonith options for: fence_ipmilan
   ipport: TCP/UDP port to use for connection with device
   inet6_only: Forces agent to use IPv6 addresses only
   ipaddr (required): IP Address or Hostname
   passwd_script: Script to retrieve password
   method: Method to fence (onoff|cycle)
   inet4_only: Forces agent to use IPv4 addresses only
   passwd: Login password or passphrase
   lanplus: Use Lanplus to improve security of connection
   auth: IPMI Lan Auth type.
   cipher: Ciphersuite to use (same as ipmitool -C parameter)
   privlvl: Privilege level on IPMI device
   action (required): Fencing Action
   login: Login Name
   verbose: Verbose mode
   debug: Write debug information to given file
   version: Display version information and exit
   help: Display help and exit
   power_wait: Wait X seconds after issuing ON/OFF
   login_timeout: Wait X seconds for cmd prompt after login
   power_timeout: Test X seconds for status change after ON/OFF
   delay: Wait X seconds before fencing is started
   ipmitool_path: Path to ipmitool binary
   shell_timeout: Wait X seconds for cmd prompt after issuing command
   retry_on: Count of attempts to retry power on
   sudo: Use sudo (without password) when calling 3rd party sotfware.
   stonith-timeout: How long to wait for the STONITH action to complete per a stonith device.
   priority: The priority of the stonith resource. Devices are tried in order of highest priority to lowest.
   pcmk_host_map: A mapping of host names to ports numbers for devices that do not support host names.
   pcmk_host_list: A list of machines controlled by this device (Optional unless pcmk_host_check=static-list).
   pcmk_host_check: How to determine which machines are controlled by the device.
 ----

 Step 5: `pcs cluster cib stonith_cfg`

 Step 6: Here are example parameters for creating our STONITH resource:
 ----
 # pcs -f stonith_cfg stonith create ipmi-fencing fence_ipmilan \
       pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser \
       passwd=acd123 op monitor interval=60s
 # pcs -f stonith_cfg stonith
  ipmi-fencing (stonith:fence_ipmilan): Stopped
 ----

 Steps 7-10: Enable STONITH in the cluster:
 ----
 # pcs -f stonith_cfg property set stonith-enabled=true
 # pcs -f stonith_cfg property
 Cluster Properties:
  cluster-infrastructure: corosync
  cluster-name: mycluster
  dc-version: 1.1.12-a9c8177
  have-watchdog: false
  stonith-enabled: true
 ----

 Step 11: `pcs cluster cib-push stonith_cfg`
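
Steps 5 through 11 of the example can be collected into one small script. The following is only a sketch: the `run` helper, the `stonith_setup` function, and the `DRY_RUN` and `CIB_FILE` variables are hypothetical names introduced here for illustration, while the `pcs` invocations are the ones shown in the example. By default it only prints the commands so they can be reviewed before anything touches the cluster.

```shell
#!/bin/sh
# Sketch of steps 5-11 from the example; not an official tool.
# DRY_RUN, CIB_FILE, run and stonith_setup are illustrative names only.
set -eu

CIB_FILE="${CIB_FILE:-stonith_cfg}"   # local CIB copy, as in step 5
DRY_RUN="${DRY_RUN:-1}"               # 1 = print commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        # Print the command instead of running it (quoting is not preserved).
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

stonith_setup() {
    # Step 5: create a local copy of the CIB
    run pcs cluster cib "$CIB_FILE"
    # Step 6: create the fencing resource with the example parameters
    run pcs -f "$CIB_FILE" stonith create ipmi-fencing fence_ipmilan \
        pcmk_host_list="pcmk-1 pcmk-2" ipaddr=10.0.0.1 login=testuser \
        passwd=acd123 op monitor interval=60s
    # Steps 7-10: enable STONITH in the local copy
    run pcs -f "$CIB_FILE" property set stonith-enabled=true
    # Step 11: commit the new configuration
    run pcs cluster cib-push "$CIB_FILE"
}

stonith_setup
```

Running the script with `DRY_RUN=0` would execute the commands for real. The final test with `stonith_admin --reboot` is deliberately left out, since it reboots a node and is better run by hand.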