diff --git a/README.markdown b/README.markdown index 2ae7ae6326..480b098c9c 100644 --- a/README.markdown +++ b/README.markdown @@ -1,76 +1,76 @@ # Pacemaker ## What is Pacemaker? Pacemaker is an advanced, scalable high-availability cluster resource manager. It supports "N-node" clusters with significant capabilities for managing resources and dependencies. It will run scripts at initialization, when machines go up or down, when related resources fail, and can be configured to periodically check resource health. ## Who is Pacemaker? -Pacemaker is distributed by [ClusterLabs](http://www.clusterlabs.org). +Pacemaker is distributed by [ClusterLabs](https://www.clusterlabs.org/). Pacemaker was initially created by main architect and lead developer Andrew Beekhof, with the aid of project catalyst and advocate Lars Marowsky-Brée. Many, many developers have contributed significantly to the project since. The git log is the definitive record of their greatly appreciated contributions. The wider community of Pacemaker users is another essential aspect of the project's existence, especially the many users who participate in the mailing lists, blog about HA clustering, and otherwise actively make the project more useful. ## Where do I get Pacemaker? Pacemaker source code is distributed via [GitHub](https://github.com/ClusterLabs/pacemaker). From there, you can clone or download the repository to get the latest development code, or download one of the official [releases](https://github.com/ClusterLabs/pacemaker/releases). ## How do I install Pacemaker? See [INSTALL.md](https://github.com/ClusterLabs/pacemaker/blob/master/INSTALL.md). ## What higher-level interfaces to Pacemaker are available? There are multiple user interfaces for Pacemaker, including command-line tools, graphical user interfaces, and web frontends. The crm shell used to be included in the Pacemaker source tree, but is now a separate project. This is not an exhaustive list: * crmsh: https://github.com/ClusterLabs/crmsh * pcs: https://github.com/ClusterLabs/pcs * LCMC: http://lcmc.sourceforge.net/ * hawk: https://github.com/ClusterLabs/hawk * Striker: https://github.com/ClusterLabs/striker ### Can I convert some other cluster configuration to Pacemaker? [clufter](https://github.com/jnpkrn/clufter) is a general-purpose tool for converting one cluster representation format to another. Among other possibilities, it can convert from a cluster based on rgmanager with CMAN to one based on Pacemaker with Corosync. See its documentation for details. ## How can I help? See [CONTRIBUTING.md](https://github.com/ClusterLabs/pacemaker/blob/master/CONTRIBUTING.md). ## Where can I find more information about Pacemaker?
-* [ClusterLabs website](http://www.clusterlabs.org/) -* [Documentation](http://www.clusterlabs.org/doc/) -* [Issues/Bugs](http://bugs.clusterlabs.org/) -* Mailing lists for [users](http://oss.clusterlabs.org/mailman/listinfo/users) and [developers](http://oss.clusterlabs.org/mailman/listinfo/developers) -* #clusterlabs IRC channel on [freenode](http://freenode.net/) +* [ClusterLabs website](https://www.clusterlabs.org/) +* [Documentation](https://www.clusterlabs.org/pacemaker/doc/) +* [Issues/Bugs](https://bugs.clusterlabs.org/) +* [Mailing lists](https://wiki.clusterlabs.org/wiki/Mailing_lists) for users and developers +* [ClusterLabs IRC channel](https://wiki.clusterlabs.org/wiki/ClusterLabs_IRC_channel) diff --git a/doc/crm_fencing.txt b/doc/crm_fencing.txt index 22be35eb73..eb706c4bbe 100644 --- a/doc/crm_fencing.txt +++ b/doc/crm_fencing.txt @@ -1,439 +1,439 @@ Fencing and Stonith =================== Dejan Muhamedagic v0.9 Fencing is a very important concept in computer clusters for HA (High Availability). Unfortunately, given that fencing does not offer a visible service to users, it is often neglected. Fencing may be defined as a method to bring an HA cluster to a known state. But what is a "cluster state", after all? To answer that question, we have to look at what is in the cluster. == Introduction to HA clusters Any computer cluster may be loosely defined as a collection of cooperating computers or nodes. Nodes talk to each other over communication channels, which are typically standard network connections, such as Ethernet. The main purpose of an HA cluster is to manage user services. Typical examples of user services are an Apache web server or, say, a MySQL database. From the user's point of view, the services do some specific and hopefully useful work when ordered to do so. To the cluster, however, they are just things which may be started or stopped. This distinction is important, because the nature of the service is irrelevant to the cluster. In cluster lingo, the user services are known as resources. Every resource has a state attached, for instance: "resource r1 is started on node1". In an HA cluster, such a state implies that "resource r1 is stopped on all nodes but node1", because an HA cluster must make sure that every resource may be started on at most one node. A collection of resource states and node states is the cluster state. Every node must report every change that happens to resources. This applies only to running resources, because a node should not start resources unless told to do so by somebody. That somebody is the Cluster Resource Manager (CRM) in our case. So far so good. But what if, for whatever reason, we cannot establish with certainty the state of some node or resource? This is where fencing comes in. With fencing, even when the cluster doesn't know what is happening on some node, we can make sure that the node in question doesn't run any resources, or at least not certain important ones. If you wonder how this can happen, consider the many risks involved with computing: reckless people, power outages, natural disasters, rodents, thieves, software bugs, just to name a few. We are sure that your computer has failed unpredictably at least a few times. == Fencing There are two kinds of fencing: resource level and node level. Using resource-level fencing, the cluster can make sure that a node cannot access one or more resources. One typical example is a SAN, where a fencing operation changes rules on a SAN switch to deny access from a node.
The resource level fencing may be achieved using normal resources on which the resource we want to protect would depend. Such a resource would simply refuse to start on this node, and therefore resources which depend on it will be unrunnable on the same node as well. The node level fencing makes sure that a node does not run any resources at all. This is usually done in a very simple, yet brutal way: the node is simply reset using a power switch. This may ultimately be necessary because the node may not be responsive at all. The node level fencing is our primary subject below. == Node level fencing devices Before we get into the configuration details, you need to pick a fencing device for the node level fencing. There are quite a few to choose from. If you want to see the list of stonith devices which are supported, just run: stonith -L Stonith devices may be classified into five categories: - UPS (Uninterruptible Power Supply) - PDU (Power Distribution Unit) - Blade power control devices - Lights-out devices - Testing devices The choice depends mainly on your budget and the kind of hardware. For instance, if you're running a cluster on a set of blades, then the power control device in the blade enclosure is the only candidate for fencing. Of course, this device must be capable of managing single blade computers. The lights-out devices (IBM RSA, HP iLO, Dell DRAC) are becoming increasingly popular, and in the future they may even become standard equipment in off-the-shelf computers. They are, however, inferior to UPS devices, because they share a power supply with their host (a cluster node). If a node loses power, the device that is supposed to control it is just as useless. Even though this is obvious to us, the cluster manager is not in the know and will try to fence the node in vain. This will continue forever, because all other resource operations would wait for the fencing/stonith operation to succeed. The testing devices are used exclusively for testing purposes. They are usually more gentle on the hardware. Once the cluster goes into production, they must be replaced with real fencing devices. == STONITH (Shoot The Other Node In The Head) Stonith is our fencing implementation. It provides node level fencing. .NB The stonith and fencing terms are often used interchangeably here as well as in other texts. The stonith subsystem consists of two components: - pacemaker-fenced - stonith plugins === pacemaker-fenced pacemaker-fenced is a daemon which may be accessed by local processes or over the network. It accepts commands which correspond to fencing operations: reset, power-off, and power-on. It may also check the status of the fencing device. pacemaker-fenced runs on every node in the CRM HA cluster. The pacemaker-fenced instance running on the DC node receives a fencing request from the CRM. It is up to this and other pacemaker-fenced programs to carry out the desired fencing operation. === Stonith plugins For every supported fencing device there is a stonith plugin which is capable of controlling that device. A stonith plugin is the interface to the fencing device. All stonith plugins look the same to pacemaker-fenced, but are quite different on the other side, reflecting the nature of the fencing device. Some plugins support more than one device. A typical example is ipmilan (or external/ipmi), which implements the IPMI protocol and can control any device which supports this protocol.
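Once you have picked a device, it is usually worth testing the corresponding plugin by hand before involving the cluster. The following is only a sketch using the ipmilan plugin: the address and credentials are made up, and the exact command-line options vary between versions, so check stonith(8) before relying on it:

    $ stonith -t ipmilan hostname=node1 ipaddr=10.0.0.1 \
          login=admin password=secret -S

If the status check succeeds, the same parameter values can later be reused when the device is configured as a stonith resource in the CIB, as described in the next section.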
== CRM stonith configuration The fencing configuration consists of one or more stonith resources. A stonith resource is a resource of class stonith, and it is configured just like any other resource. The list of parameters (attributes) depends on and is specific to the stonith type. Use the stonith(1) program to see the list: $ stonith -t ibmhmc -n ipaddr $ stonith -t ipmilan -n hostname ipaddr port auth priv login password reset_method .NB It is easy to guess the class of a fencing device from the set of attribute names. A short help text is also available: $ stonith -t ibmhmc -h STONITH Device: ibmhmc - IBM Hardware Management Console (HMC) Use for IBM i5, p5, pSeries and OpenPower systems managed by HMC Optional parameter name managedsyspat is white-space delimited list of patterns used to match managed system names; if last character is '*', all names that begin with the pattern are matched Optional parameter name password is password for hscroot if passwordless ssh access to HMC has NOT been setup (to do so, it is necessary to create a public/private key pair with empty passphrase - see "Configure the OpenSSH client" in the redbook for more details) For more information see http://publib-b.boulder.ibm.com/redbooks.nsf/RedbookAbstracts/SG247038.html .You just said that there are pacemaker-fenced and stonith plugins. What's with these resources now? ************************** Resources of class stonith are just a representation of stonith plugins in the CIB. Well, a bit more: apart from the fencing operations, the stonith resources, just like any others, may be started, stopped and monitored. The start and stop operations are a bit of a misnomer: enable and disable would serve better, but it's too late to change that. So, these two are actually administrative operations and do not translate to any operation on the fencing device itself. Monitor, however, does translate to device status. ************************** A dummy stonith resource configuration, which may be used in some testing scenarios, is very simple: configure primitive st-null stonith:null \ params hostlist="node1 node2" clone fencing st-null commit .NB ************************** All configuration examples are in the crm configuration tool syntax. To apply them, put the sample in a text file, say sample.txt, and run: crm < sample.txt The configure and commit lines are omitted from further examples. ************************** An alternative configuration: primitive st-node1 stonith:null \ params hostlist="node1" primitive st-node2 stonith:null \ params hostlist="node2" location l-st-node1 st-node1 -inf: node1 location l-st-node2 st-node2 -inf: node2 This configuration is perfectly alright as far as the cluster software is concerned. The only difference from a real-world configuration is that no fencing operation takes place. A more realistic configuration, but still suitable only for testing, is the following external/ssh configuration: primitive st-ssh stonith:external/ssh \ params hostlist="node1 node2" clone fencing st-ssh This one can also reset nodes. As you can see, this configuration is remarkably similar to the first one, which features the null stonith device. .What is this clone thing? ************************** Clones are a CRM/Pacemaker feature. A clone is basically a shortcut: instead of defining _n_ identical, yet differently named resources, a single cloned resource suffices. By far the most common use of clones is with stonith resources if the stonith device is accessible from all nodes.
************************** The real device configuration is not much different, though some devices may require more attributes. For instance, an IBM RSA lights-out device might be configured like this: primitive st-ibmrsa-1 stonith:external/ibmrsa-telnet \ params nodename=node1 ipaddr=192.168.0.101 \ userid=USERID passwd=PASSW0RD primitive st-ibmrsa-2 stonith:external/ibmrsa-telnet \ params nodename=node2 ipaddr=192.168.0.102 \ userid=USERID passwd=PASSW0RD # st-ibmrsa-1 can run anywhere but on node1 location l-st-node1 st-ibmrsa-1 -inf: node1 # st-ibmrsa-2 can run anywhere but on node2 location l-st-node2 st-ibmrsa-2 -inf: node2 .Why those strange location constraints? ************************** There is always a certain probability that the stonith operation is going to fail. Hence, a stonith operation whose target node is also its executioner is not reliable. If the node is reset, then it cannot send the notification about the fencing operation outcome. ************************** If you haven't already guessed, configuration of a UPS kind of fencing device is remarkably similar to all we have already shown. All UPS devices employ the same mechanics for fencing. What is different, however, is how the device itself is accessed. Old UPS devices, those that were considered professional, used to have just a serial port, typically connected at 1200 baud using a special serial cable. Many new ones still come equipped with a serial port, but often they also sport a USB interface or an Ethernet interface. The kind of connection we may make use of depends on what the plugin supports. Let's see a few examples for the APC UPS equipment: $ stonith -t apcmaster -h STONITH Device: apcmaster - APC MasterSwitch (via telnet) NOTE: The APC MasterSwitch accepts only one (telnet) connection/session a time. When one session is active, subsequent attempts to connect to the MasterSwitch will fail. For more information see http://www.apc.com/ List of valid parameter names for apcmaster STONITH device: ipaddr login password $ stonith -t apcsmart -h STONITH Device: apcsmart - APC Smart UPS (via serial port - NOT USB!). Works with higher-end APC UPSes, like Back-UPS Pro, Smart-UPS, Matrix-UPS, etc. (Smart-UPS may have to be >= Smart-UPS 700?). See http://www.networkupstools.org/protocols/apcsmart.html for protocol compatibility details. For more information see http://www.apc.com/ List of valid parameter names for apcsmart STONITH device: ttydev hostlist The former plugin supports APC UPS devices with a network port and the telnet protocol. The latter plugin uses the APC SMART protocol over the serial line, which is supported by many different APC UPS product lines. .So, what do I use: clones, constraints, both? ************************** It depends. Depends on the nature of the fencing device. For example, if the device cannot serve more than one connection at a time, then clones won't do. Depends on how many hosts the device can manage. If it's only one, and that is always the case with lights-out devices, then again clones are right out. Depends also on the number of nodes in your cluster: the more nodes, the more desirable it is to use clones. Finally, it is also a matter of personal preference. In short: if clones are safe to use with your configuration and if they simplify the configuration, then make cloned stonith resources. ************************** The CRM configuration is left as an exercise to the reader.
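For readers who would like a concrete starting point for that exercise, here is one possible sketch using the apcmaster plugin shown above. The address and credentials are made up and must be replaced with the real ones; because the MasterSwitch accepts only one telnet session at a time, the resource is deliberately left uncloned:

    primitive st-apc stonith:apcmaster \
            params ipaddr=192.168.0.200 login=apc password=apc \
            op monitor interval="2h"

Whether location constraints or a clone would serve your device better depends on the considerations discussed in the sidebar above; the monitor interval anticipates the advice in the next section.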
== Monitoring the fencing devices Just like any other resource, the stonith class agents also support the monitor operation. Given that we have often seen the monitor operation either not configured or configured in a wrong way, we have decided to devote a section to the matter. Monitoring stonith resources, which actually means checking the status of the corresponding fencing devices, is strongly recommended. So strongly that a configuration without it should be considered invalid. On the one hand, though an indispensable part of an HA cluster, a fencing device, being the last line of defense, is used seldom. Very seldom and preferably never. On the other hand, for whatever reason, power management equipment is known to be rather fragile on the communication side. Some devices were known to give up if there was too much broadcast traffic on the wire. Some cannot handle more than ten or so connections per minute. Some get confused or depressed if two clients try to connect at the same time. Most cannot handle more than one session at a time. The bottom line: try not to exercise your fencing device too often. It may not like it. Use monitoring regularly, yet sparingly, say once every couple of hours. The probability that within those few hours there will be a need for a fencing operation and that the power switch would fail is usually low. == Odd plugins Apart from plugins which handle real devices, some stonith plugins are a bit out of line and deserve special attention. === external/kdumpcheck Sometimes, it may be important to get a kernel core dump. This plugin may be used to check if a dump is in progress. If that is the case, then it will return true, as if the node has been fenced, which is actually true given that it cannot run any resources at that time. kdumpcheck is typically used in concert with another, real, fencing device. See README_kdumpcheck.txt for more details. === external/sbd This is a self-fencing device. It reacts to a so-called "poison pill" which may be inserted into a shared disk. On shared storage connection loss, it also makes the node commit suicide. See http://www.linux-ha.org/wiki/SBD_Fencing for more details. === meatware Strange name and a simple concept. `meatware` requires help from a human to operate. Whenever invoked, `meatware` logs a CRIT severity message which should show up on the node's console. The operator should then make sure that the node is down and issue a `meatclient(8)` command to tell `meatware` that it's OK to tell the cluster that it may consider the node dead. See `README.meatware` for more information. === null This one is probably not of much importance to the general public. It is used in various testing scenarios. `null` is an imaginary device which always behaves and always claims that it has shot a node, but never does anything. Sort of a happy-go-lucky plugin. Do not use it unless you know what you are doing. === suicide `suicide` is a software-only device, which can reboot the node it is running on. It depends on the operating system, so it should be avoided whenever possible. But it is OK on one-node clusters. `suicide` and `null` are the only exceptions to the "don't shoot my host" rule. .What about that pacemaker-fenced? You forgot about it, eh? ************************** The pacemaker-fenced daemon, though it is really the master of ceremonies, requires no configuration itself. All configuration is stored in the CIB.
************************** == Resources http://www.linux-ha.org/wiki/STONITH -http://www.clusterlabs.org/doc/crm_fencing.html +https://www.clusterlabs.org/doc/crm_fencing.html -http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained +https://www.clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/ http://techthoughts.typepad.com/managing_computers/2007/10/split-brain-quo.html diff --git a/doc/sphinx/Pacemaker_Development/faq.rst b/doc/sphinx/Pacemaker_Development/faq.rst index 749c2e0c7c..729c244f6d 100644 --- a/doc/sphinx/Pacemaker_Development/faq.rst +++ b/doc/sphinx/Pacemaker_Development/faq.rst @@ -1,163 +1,163 @@ Frequently Asked Questions -------------------------- :Q: Who is this document intended for? :A: Anyone who wishes to read and/or edit the Pacemaker source code. Casual contributors should feel free to read just this FAQ, and consult other chapters as needed. ---- .. index:: single: download single: source code single: git single: git; GitHub :Q: Where is the source code for Pacemaker? :A: The `source code for Pacemaker `_ is kept on `GitHub `_, as are all software projects under the `ClusterLabs `_ umbrella. Pacemaker uses `Git `_ for source code management. If you are a Git newbie, the `gittutorial(7) man page `_ is an excellent starting point. If you're familiar with using Git from the command line, you can create a local copy of the Pacemaker source code with: **git clone https://github.com/ClusterLabs/pacemaker.git** ---- .. index:: single: git; branch :Q: What are the different Git branches and repositories used for? :A: * The `master branch `_ is the primary branch used for development. * The `2.1 branch `_ is the current release branch. Normally, it does not receive any changes, but during the release cycle for a new release, it will contain release candidates. During the release cycle, certain bug fixes will go to the 2.1 branch first (and be pulled into master later). * The `2.0 branch `_, `1.1 branch `_, and separate `1.0 repository `_ are frozen snapshots of earlier release series, no longer being developed. * Messages will be posted to the `developers@ClusterLabs.org `_ mailing list during the release cycle, with instructions about which branches to use when submitting requests. ---- :Q: How do I build from the source code? :A: See `INSTALL.md `_ in the main checkout directory. ---- :Q: What coding style should I follow? :A: You'll be mostly fine if you simply follow the example of existing code. When unsure, see the relevant chapter of this document for language-specific recommendations. Pacemaker has grown and evolved organically over many years, so you will see much code that doesn't conform to the current guidelines. We discourage making changes solely to bring code into conformance, as any change requires developer time for review and opens the possibility of adding bugs. However, new code should follow the guidelines, and it is fine to bring lines of older code into conformance when modifying that code for other reasons. ---- .. index:: single: git; commit message :Q: How should I format my Git commit messages? :A: An example is "Feature: scheduler: wobble the frizzle better". * The first part is the type of change, used to automatically generate the change log for the next release. 
Commit messages with the following will be included in the change log: * **Feature** for new features * **Fix** for bug fixes (**Bug** or **High** also work) * **API** for changes to the public API Everything else will *not* automatically be in the change log, and so doesn't really matter, but types commonly used include: * **Log** for changes to log messages or handling * **Doc** for changes to documentation or comments * **Test** for changes in CTS and regression tests * **Low**, **Med**, or **Mid** for bug fixes not significant enough for a change log entry * **Refactor** for refactoring-only code changes * **Build** for build process changes * The next part is the name of the component(s) being changed, for example, **controller** or **libcrmcommon** (it's more free-form, so don't sweat getting it exact). * The rest briefly describes the change. The git project recommends that the entire summary line stay under 50 characters, but more is fine if needed for clarity. * Except for the most simple and obvious of changes, the summary should be followed by a blank line and a longer explanation of *why* the change was made. ---- :Q: How can I test my changes? :A: Most importantly, Pacemaker has regression tests for most major components; these will automatically be run for any pull requests submitted through GitHub. Additionally, Pacemaker's Cluster Test Suite (CTS) can be used to set up a test cluster and run a wide variety of complex tests. This document will have more detail on testing in the future. ---- .. index:: license :Q: What is Pacemaker's license? :A: Except where noted otherwise in the file itself, the source code for all Pacemaker programs is licensed under version 2 or later of the GNU General Public License (`GPLv2+ `_), its headers and libraries under version 2.1 or later of the less restrictive GNU Lesser General Public License (`LGPLv2.1+ `_), its documentation under version 4.0 or later of the Creative Commons Attribution-ShareAlike International Public License (`CC-BY-SA-4.0 `_), and its init scripts under the `Revised BSD `_ license. If you find any deviations from this policy, or wish to inquire about alternate licensing arrangements, please e-mail the `developers@ClusterLabs.org `_ mailing list. Licensing issues are also discussed on the `ClusterLabs wiki `_. ---- :Q: How can I contribute my changes to the project? :A: Contributions of bug fixes or new features are very much appreciated! Patches can be submitted as `pull requests `_ via GitHub (the preferred method, due to its excellent `features `_), or e-mailed to the `developers@ClusterLabs.org `_ mailing list as an attachment in a format Git can import. Authors may only submit changes that they have the right to submit under the open source license indicated in the affected files. ---- .. index:: mailing list :Q: What if I still have questions? :A: Ask on the `developers@ClusterLabs.org `_ mailing list for development-related questions, or on the `users@ClusterLabs.org `_ mailing list for general questions about using Pacemaker. - Developers often also hang out on `freenode's `_ - #clusterlabs IRC channel. + Developers often also hang out on the + `ClusterLabs IRC channel <https://wiki.clusterlabs.org/wiki/ClusterLabs_IRC_channel>`_.
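To tie the commit-message and contribution answers together, a typical submission might look roughly like the sketch below. The branch name and commit summary are made up for illustration; the regression tests mentioned above run automatically on the resulting pull request.

    $ git clone https://github.com/ClusterLabs/pacemaker.git
    $ cd pacemaker
    $ git checkout -b my-scheduler-fix master
    # ... edit, build, and test the code ...
    $ git commit -a -m "Fix: scheduler: stop the frizzle from wobbling twice"
    $ git push origin my-scheduler-fix
    # then open a pull request against ClusterLabs/pacemaker on GitHub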
diff --git a/doc/sphinx/conf.py.in b/doc/sphinx/conf.py.in index d181147cd6..9da32d584d 100644 --- a/doc/sphinx/conf.py.in +++ b/doc/sphinx/conf.py.in @@ -1,316 +1,316 @@ """ Sphinx configuration for Pacemaker documentation """ __copyright__ = "Copyright 2020 the Pacemaker project contributors" __license__ = "GNU General Public License version 2 or later (GPLv2+) WITHOUT ANY WARRANTY" # This file is execfile()d with the current directory set to its containing dir. # # Note that not all possible configuration values are present in this # autogenerated file. # # All configuration values have a default; values that are commented out # serve to show the default. import datetime import os import sys # Variables that can be used later in this file authors = "the Pacemaker project contributors" year = datetime.datetime.now().year doc_license = "Creative Commons Attribution-ShareAlike International Public License" doc_license += " version 4.0 or later (CC-BY-SA v4.0+)" # rST markup to insert at beginning of every document; mainly used for # # .. || replace:: # # where occurrences of || in the rST will be substituted with rst_prolog=""" .. |CFS_DISTRO| replace:: CentOS Stream .. |CFS_DISTRO_VER| replace:: 8 .. |REMOTE_DISTRO| replace:: CentOS Stream .. |REMOTE_DISTRO_VER| replace:: 8 """ # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. #sys.path.insert(0, os.path.abspath('.')) # -- General configuration ----------------------------------------------------- # If your documentation needs a minimal Sphinx version, state it here. needs_sphinx = '1.0' # Add any Sphinx extension module names here, as strings. They can be extensions # coming with Sphinx (named 'sphinx.ext.*') or your custom ones. extensions = [] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix of source filenames. source_suffix = '.rst' # The encoding of source files. #source_encoding = 'utf-8-sig' # The master toctree document. master_doc = 'index' # General information about the project. project = '%BOOK_ID%' copyright = "2009-%s %s. Released under the terms of the %s" % (year, authors, doc_license) # The version info for the project you're documenting, acts as replacement for # |version| and |release|, also used in various other places throughout the # built documents. # # The full version, including alpha/beta/rc tags. release = '%VERSION%' # The short X.Y version. version = release.rsplit('.', 1)[0] # The language for content autogenerated by Sphinx. Refer to documentation # for a list of supported languages. #language = None # There are two options for replacing |today|: either, you set today to some # non-false value, then it is used: #today = '' # Else, today_fmt is used as the format for a strftime call. #today_fmt = '%B %d, %Y' # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. exclude_patterns = ['_build'] # The reST default role (used for this markup: `text`) to use for all documents. #default_role = None # If true, '()' will be appended to :func: etc. cross-reference text. #add_function_parentheses = True # If true, the current module name will be prepended to all description # unit titles (such as .. function::). 
#add_module_names = True # If true, sectionauthor and moduleauthor directives will be shown in the # output. They are ignored by default. #show_authors = False # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'vs' # A list of ignored prefixes for module index sorting. #modindex_common_prefix = [] # -- Options for HTML output --------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. html_theme = 'pyramid' # Theme options are theme-specific and customize the look and feel of a theme # further. For a list of options available for each theme, see the # documentation. #html_theme_options = {} # Add any paths that contain custom themes here, relative to this directory. #html_theme_path = [] html_style = 'pacemaker.css' # The name for this set of Sphinx documents. If None, it defaults to # " v documentation". html_title = "%BOOK_TITLE%" # A shorter title for the navigation bar. Default is the same as html_title. #html_short_title = None # The name of an image file (relative to this directory) to place at the top # of the sidebar. #html_logo = None # The name of an image file (within the static path) to use as favicon of the # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 # pixels large. #html_favicon = None # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = [ '%SRC_DIR%/_static' ] # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, # using the given strftime format. #html_last_updated_fmt = '%b %d, %Y' # If true, SmartyPants will be used to convert quotes and dashes to # typographically correct entities. #html_use_smartypants = True # Custom sidebar templates, maps document names to template names. #html_sidebars = {} # Additional templates that should be rendered to pages, maps page names to # template names. #html_additional_pages = {} # If false, no module index is generated. #html_domain_indices = True # If false, no index is generated. #html_use_index = True # If true, the index is split into individual pages for each letter. #html_split_index = False # If true, links to the reST sources are added to the pages. #html_show_sourcelink = True # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. #html_show_sphinx = True # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. #html_show_copyright = True # If true, an OpenSearch description file will be output, and all pages will # contain a tag referring to it. The value of this option must be the # base URL from which the finished HTML is served. #html_use_opensearch = '' # This is the file name suffix for HTML files (e.g. ".xhtml"). #html_file_suffix = None # Output file base name for HTML help builder. htmlhelp_basename = 'Pacemakerdoc' # -- Options for LaTeX output -------------------------------------------------- latex_engine = "xelatex" latex_elements = { # The paper size ('letterpaper' or 'a4paper'). #'papersize': 'letterpaper', # The font size ('10pt', '11pt' or '12pt'). #'pointsize': '10pt', # Additional stuff for the LaTeX preamble. #'preamble': '', } # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, author, documentclass [howto/manual]). 
latex_documents = [ ('index', '%BOOK_ID%.tex', '%BOOK_TITLE%', authors, 'manual'), ] # The name of an image file (relative to this directory) to place at the top of # the title page. #latex_logo = None # For "manual" documents, if this is true, then toplevel headings are parts, # not chapters. #latex_use_parts = False # If true, show page references after internal links. #latex_show_pagerefs = False # If true, show URL addresses after external links. #latex_show_urls = False # Documents to append as an appendix to all manuals. #latex_appendices = [] # If false, no module index is generated. #latex_domain_indices = True # -- Options for manual page output -------------------------------------------- # One entry per manual page. List of tuples # (source start file, name, description, authors, manual section). man_pages = [ ('index', '%BOOK_ID%', 'Part of the Pacemaker documentation set', [authors], 8) ] # If true, show URL addresses after external links. #man_show_urls = False # -- Options for Texinfo output ------------------------------------------------ # Grouping the document tree into Texinfo files. List of tuples # (source start file, target name, title, author, # dir menu entry, description, category) texinfo_documents = [ ('index', '%BOOK_ID%', '%BOOK_TITLE%', authors, '%BOOK_TITLE%', 'Pacemaker is an advanced, scalable high-availability cluster resource manager.', 'Miscellaneous'), ] # Documents to append as an appendix to all manuals. #texinfo_appendices = [] # If false, no module index is generated. #texinfo_domain_indices = True # How to display URL addresses: 'footnote', 'no', or 'inline'. #texinfo_show_urls = 'footnote' # -- Options for Epub output --------------------------------------------------- # Bibliographic Dublin Core info. epub_title = '%BOOK_TITLE%' epub_author = authors epub_publisher = 'ClusterLabs.org' epub_copyright = copyright # The language of the text. It defaults to the language option # or en if the language is not set. #epub_language = '' # The scheme of the identifier. Typical schemes are ISBN or URL. epub_scheme = 'URL' # The unique identifier of the text. This can be a ISBN number # or the project homepage. -epub_identifier = 'http://www.clusterlabs.org/pacemaker/doc/2.0/%BOOK_ID%/epub/%BOOK_ID%.epub' +epub_identifier = 'https://www.clusterlabs.org/pacemaker/doc/2.0/%BOOK_ID%/epub/%BOOK_ID%.epub' # A unique identification for the text. epub_uid = 'ClusterLabs.org-Pacemaker-%BOOK_ID%' # A tuple containing the cover image and cover page html template filenames. #epub_cover = () # HTML files that should be inserted before the pages created by sphinx. # The format is a list of tuples containing the path and title. #epub_pre_files = [] # HTML files that should be inserted after the pages created by sphinx. # The format is a list of tuples containing the path and title. #epub_post_files = [] # A list of files that should not be packed into the epub file. epub_exclude_files = [ '_static/doctools.js', '_static/jquery.js', '_static/searchtools.js', '_static/underscore.js', '_static/basic.css', '_static/websupport.js', 'search.html', ] # The depth of the table of contents in toc.ncx. #epub_tocdepth = 3 # Allow duplicate toc entries. #epub_tocdup = True diff --git a/extra/cluster-init b/extra/cluster-init index 52949f675c..aca74890f1 100755 --- a/extra/cluster-init +++ b/extra/cluster-init @@ -1,599 +1,599 @@ #!/bin/bash # # Copyright 2011-2021 the Pacemaker project contributors # # The version control history for this file may have further details. 
# # This source code is licensed under the GNU General Public License version 2 # or later (GPLv2+) WITHOUT ANY WARRANTY. # accept_defaults=0 do_raw=0 ETCHOSTS=0 pcmk_ver=11 nodelist=0 limit=0 pkgs="corosync xinetd nmap abrt-cli fence-agents perl-TimeDate gdb" transport="multicast" inaddr_any="no" INSTALL= cs_conf= fence_conf= rpm_repo= distro= dsh_group=0 if [ ! -z $cluster_name ]; then cluster=$cluster_name else cluster=dummy0 fi # Corosync Settings cs_port=666 # Settings that work great on nXX join=60 #token=3000 consensus=1500 # Official settings join=2000 token=5000 consensus=2500 # Testing join=1000 consensus=7500 do_debug=off function ip_for_node() { ping -c 1 $1 | grep "bytes from" | head -n 1 | sed -e 's/.*bytes from//' -e 's/: icmp.*//' | awk '{print $NF}' | sed 's:(::' | sed 's:)::' # if [ $do_raw = 1 ]; then # echo $1 # else # #host $1 | grep "has address" | head -n 1 | awk '{print $NF}' | sed 's:(::' | sed 's:)::' # fi } function id_for_node() { ip_for_node $* | tr '.' ' ' | awk '{print $4}' } function name_for_node() { echo $1 | awk -F. '{print $1}' } function helptext() { echo "cluster-init - Configure cluster communication for the infrastructures supported by Pacemaker" echo "" echo "-g, --group Specify the group to operate on/with" echo "-w, --host Specify a host to operate on/with. May be specified multiple times" echo "-r, --raw-ip Supplied nodes were listed as their IP addresses" echo "" echo "-c, --corosync configure for corosync" echo "-C, --nodelist configure for corosync with a node list" echo "-u, --unicast configure point-to-point communication instead of multicast" echo "" echo "-I, --install Install packages" echo "-R, --repo name Setup and update/install Pacemaker from the named clusterlabs.org repo" echo " Known values: rpm, rpm-test, rpm-next, rpm-test-next, rpm-test-rhel" echo "-D, --distro The distro within the --repo. Defaults to fedora-15" echo "" echo "-d, --debug Enable debug logging for the cluster" echo "-10 install stable-1.0 packages, implies: -p 0 -R rpm-test -I" echo "--hosts Copy the local /etc/hosts file to all nodes" echo "-e, --extra list Whitespace separated list of extra packages to install" echo "-l, --limit N Use the first N hosts from the named group" echo " Extra packages to install" exit $1 } host_input="" while true; do case "$1" in -g) cluster=$2; shift; shift;; -w|--host) for h in $2; do host_input="$host_input -w $h"; done shift; shift;; -w) host_input="$host_input -w $2" shift; shift;; -r|--raw-ip) do_raw=1; shift;; -D) distro=$2; shift; shift;; -d|--debug) do_debug=on; shift;; -R|--repo) rpm_repo=$2; shift; shift;; -I|--install) INSTALL=Yes; shift;; --hosts) ETCHOSTS=1; shift;; -c|--corosync) CTYPE=corosync; shift;; -C|--nodelist) CTYPE=corosync; nodelist=1; shift;; -u|--unicast) nodelist=1; transport=udpu; inaddr_any="yes"; shift;; -e|--extra) pkgs="$pkgs $2"; shift; shift;; -t|--test) rpm_repo=rpm-test-next; pkgs="$pkgs valgrind"; shift;; -l|--limit) limit=$2; shift; shift;; r*[0-9]) rhel=`echo $1 | sed -e s/rhel// -e s/-// -e s/r//` distro="rhel-$rhel"; pkgs="$pkgs qarsh-server"; case $rhel in 7) CTYPE=corosync;; esac shift ;; f*[0-9][0-9]) distro="fedora-`echo $1 | sed -e s/fedora// -e s/-// -e s/f//`"; CTYPE=corosync; shift ;; p0|10) pcmk_ver=10; rpm_repo="rpm-test"; install=1; shift;; -y|--yes|--defaults) accept_defaults=1; shift;; -x) set -x; shift;; -\?|--help) helptext 0; shift;; "") break;; *) echo "unknown option: $1"; exit 1;; esac done if [ ! 
-z $cluster ]; then host_input="-g $cluster" # use the last digit present in the variable (if any) dsh_group=`echo $cluster | sed 's/[^0-9][^0-9]*//g;s/.*\([0-9]\)$/\1/'` fi if [ -z $dsh_group ]; then dsh_group=1 fi if [ x = "x$host_input" -a x = "x$cluster" ]; then if [ -d $HOME/.dsh/group ]; then read -p "Please specify a dsh group you'd like to configure as a cluster? [] " -t 60 cluster else read -p "Please specify a whitespace delimetered list of nodes you'd like to configure as a cluster? [] " -t 60 host_list for h in $2; do host_input="$host_input -w $h"; done fi fi if [ -z "$host_input" ]; then echo "You didn't specify any nodes or groups to configure" exit 1 fi if [ $limit -gt 0 ]; then echo "Using only the first $limit hosts in $cluster group" host_list=`cluster-helper --list bullet $host_input | head -n $limit | tr '\n*' ' '` else host_list=`cluster-helper --list short $host_input` fi num_hosts=`echo $host_list | wc -w` if [ $num_hosts -gt 9 ]; then cs_port=66 fi for h in $host_list; do ping -c 1 -q $h if [ $? != 0 ]; then echo "Using long names..." host_list=`cluster-helper --list long $host_input` break fi done if [ -z $CTYPE ]; then echo "" read -p "Where should Pacemaker obtain membership and quorum from? [corosync] (corosync) " -t 60 CTYPE fi case $CTYPE in corosync) cs_conf=/etc/corosync/corosync.conf;; esac function get_defaults() { if [ -z $SSH ]; then SSH="No" fi if [ -z $SELINUX ]; then SELINUX="No" fi if [ -z $IPTABLES ]; then IPTABLES="Yes" fi if [ -z $DOMAIN ]; then DOMAIN="No" fi if [ -z $INSTALL ]; then INSTALL="Yes" fi if [ -z $DATE ]; then DATE="No" fi } get_defaults if [ $accept_defaults = 0 ]; then echo "" read -p "Shall I install an ssh key to cluster nodes? [$SSH] " -t 60 SSH echo "" echo "SELinux prevent many things, including password-less ssh logins" read -p "Shall I disable selinux? [$SELINUX] " -t 60 SELINUX echo "" echo "Incorrectly configured firewalls will prevent corosync from starting up" read -p "Shall I disable iptables? [$IPTABLES] " -t 60 IPTABLES if [ $pcmk_ver = 10 ]; then echo "" echo "Without a default domain, external/ssh fencing probably won't work because it can't find its peers" read -p "Shall I set one? [No] (domain.name) " -t 60 DOMAIN fi echo "" read -p "Shall I install/update the relevant packages? [$INSTALL] " -t 60 INSTALL case $INSTALL in [Yy][Ee][Ss]|[Yy]|"") if [ -z $rpm_repo ]; then echo "" read -p "Would you like to install packages from ClusterLabs.org? [No] (rpm, rpm-next, rpm-test-next) " -t 60 rpm_repo fi if [ ! -z $rpm_repo ]; then if [ -z $distro ]; then distro=fedora-18 echo "" read -p "Which distro are you installing for? [$distro] (eg. fedora-17, rhel-6) " -t 60 distro fi fi ;; esac echo "" read -p "Shall I sync the date/time? [$DATE] " -t 60 DATE fi get_defaults echo "" echo "Detecting possible fencing options" if [ -e /etc/cluster/fence_xvm.key ]; then echo "* Found fence_xvm" fence_conf=/etc/cluster/fence_xvm.key pkgs="$pkgs fence-virt" fi if [ ! -z ${OS_AUTH_URL} ]; then echo "* Found openstack credentials" fence_conf=/sbin/fence_openstack pkgs="$pkgs python-novaclient" fi echo "" echo "Beginning cluster configuration" echo "" case $SSH in [Yy][Ee][Ss]|[Yy]) for host in $host_list; do echo "Installing our ssh key on ${host}" ssh-copy-id root@${host} >/dev/null 2>&1 # Fix selinux labeling ssh -l root ${host} -- restorecon -R -v . 
done ;; esac case $DATE in [Yy][Ee][Ss]|[Yy]) for host in $host_list; do echo "Setting time on ${host}" scp /etc/localtime root@${host}:/etc now=`date +%s` ssh -l root ${host} -- date -s @$now echo "" done ;; esac REPO= if [ ! -z $rpm_repo ]; then REPO=$rpm_repo/$distro fi init=`mktemp` cat<<-END>$init verbose=0 pkgs="$pkgs" lhost=\`uname -n\` lshort=\`echo \$lhost | awk -F. '{print \$1}'\` log() { printf "%-10s \$*\n" "\$lshort:" 1>&2 } debug() { if [ \$verbose -gt 0 ]; then log "Debug: \$*" fi } info() { log "\$*" } warning() { log "WARN: \$*" } fatal() { log "ERROR: \$*" exit 1 } case $SELINUX in [Yy][Ee][Ss]|[Yy]) sed -i.sed "s/enforcing/disabled/g" /etc/selinux/config ;; esac case $IPTABLES in [Yy][Ee][Ss]|[Yy]|"") service iptables stop chkconfig iptables off service firewalld stop chkconfig firewalld off ;; esac case $DOMAIN in [Nn][Oo]|"") ;; *.*) if ! grep domain /etc/resolv.conf then sed -i.sed "s/nameserver/domain\ $DOMAIN\\\nnameserver/g" /etc/resolv.conf fi ;; *) echo "Unknown domain: $DOMAIN";; esac case $INSTALL in [Yy][Ee][Ss]|[Yy]|"") if [ ! -z $REPO ]; then info Configuring Clusterlabs repo: $REPO yum install -y wget rm -f /etc/yum.repos.d/clusterlabs.repo - wget -O /etc/yum.repos.d/clusterlabs.repo http://www.clusterlabs.org/$REPO/clusterlabs.repo &>/dev/null + wget -O /etc/yum.repos.d/clusterlabs.repo https://www.clusterlabs.org/$REPO/clusterlabs.repo &>/dev/null yum clean all fi info Installing cluster software if [ $pcmk_ver = 10 ]; then yum install -y $pkgs at service atd start systemctl enable atd.service yum install -y "pacemaker < 1.1" else yum install -y $pkgs pacemaker fi ;; esac info "Configuring services" chkconfig xinetd on service xinetd start &>/dev/null chkconfig corosync off &> /dev/null mkdir -p /etc/cluster info "Turning on core files" grep -q "unlimited" /etc/bashrc if [ $? = 1 ]; then sed -i.sed "s/bashrc/bashrc\\\nulimit\ -c\ unlimited/g" /etc/bashrc fi function patch_cs_config() { test $num_hosts != 2 two_node=$? priority="info" if [ $do_debug = 1 ]; then priority="debug" fi ssh -l root ${host} -- sed -i.sed "s/.*mcastaddr:.*/mcastaddr:\ 226.94.1.1/g" $cs_conf ssh -l root ${host} -- sed -i.sed "s/.*mcastport:.*/mcastport:\ $cs_port$dsh_group/g" $cs_conf ssh -l root ${host} -- sed -i.sed "s/.*bindnetaddr:.*/bindnetaddr:\ $ip/g" $cs_conf ssh -l root ${host} -- sed -i.sed "s/.*syslog_facility:.*/syslog_facility:\ daemon/g" $cs_conf ssh -l root ${host} -- sed -i.sed "s/.*logfile_priority:.*/logfile_priority:\ $priority/g" $cs_conf if [ ! -z $token ]; then ssh -l root ${host} -- sed -i.sed "s/.*token:.*/token:\ $token/g" $cs_conf fi if [ ! -z $consensus ]; then ssh -l root ${host} -- sed -i.sed "s/.*consensus:.*/consensus:\ $consensus/g" $cs_conf fi if [ ! -z $join ]; then ssh -l root ${host} -- sed -i.sed "s/^join:.*/join:\ $join/g" $cs_conf ssh -l root ${host} -- sed -i.sed "s/\\\Wjoin:.*/join:\ $join/g" $cs_conf fi ssh -l root ${host} -- grep -q "corosync_votequorum" $cs_conf 2>&1 > /dev/null if [ $? -eq 0 ]; then ssh -l root ${host} -- sed -i.sed "s/\\\Wexpected_votes:.*/expected_votes:\ $num_hosts/g" $cs_conf ssh -l root ${host} -- sed -i.sed "s/\\\Wtwo_node:.*/two_node:\ $two_node/g" $cs_conf else printf "%-10s Wrong quorum provider: installing $cs_conf for corosync instead\n" ${host} create_cs_config fi } function create_cs_config() { cs_tmp=/tmp/cs_conf.$$ test $num_hosts != 2 two_node=$? 
# Base config priority="info" if [ $do_debug = 1 ]; then priority="debug" fi cat <<-END >$cs_tmp # Please read the corosync.conf.5 manual page totem { version: 2 # cypto_cipher and crypto_hash: Used for mutual node authentication. # If you choose to enable this, then do remember to create a shared # secret with "corosync-keygen". crypto_cipher: none crypto_hash: none # Assign a fixed node id nodeid: $id # Disable encryption secauth: off transport: $transport inaddr_any: $inaddr_any # interface: define at least one interface to communicate # over. If you define more than one interface stanza, you must # also set rrp_mode. interface { # Rings must be consecutively numbered, starting at 0. ringnumber: 0 # This is normally the *network* address of the # interface to bind to. This ensures that you can use # identical instances of this configuration file # across all your cluster nodes, without having to # modify this option. bindnetaddr: $ip # However, if you have multiple physical network # interfaces configured for the same subnet, then the # network address alone is not sufficient to identify # the interface Corosync should bind to. In that case, # configure the *host* address of the interface # instead: # bindnetaddr: 192.168.1.1 # When selecting a multicast address, consider RFC # 2365 (which, among other things, specifies that # 239.255.x.x addresses are left to the discretion of # the network administrator). Do not reuse multicast # addresses across multiple Corosync clusters sharing # the same network. # Corosync uses the port you specify here for UDP # messaging, and also the immediately preceding # port. Thus if you set this to 5405, Corosync sends # messages over UDP ports 5405 and 5404. mcastport: $cs_port$dsh_group # Time-to-live for cluster communication packets. The # number of hops (routers) that this ring will allow # itself to pass. Note that multicast routing must be # specifically enabled on most network routers. ttl: 1 mcastaddr: 226.94.1.1 } } logging { debug: off fileline: off to_syslog: yes to_stderr: no syslog_facility: daemon timestamp: on to_logfile: yes logfile: /var/log/corosync.log logfile_priority: $priority } amf { mode: disabled } quorum { provider: corosync_votequorum expected_votes: $num_hosts votes: 1 two_node: $two_node wait_for_all: 0 last_man_standing: 0 auto_tie_breaker: 0 } END scp -q $cs_tmp root@${host}:$cs_conf rm -f $cs_tmp } for host in $host_list; do echo "" echo "" echo "* Configuring $host" cs_short_host=`name_for_node $host` ip=`ip_for_node $host` id=`id_for_node $host` echo $ip | grep -qis NXDOMAIN if [ $? = 0 ]; then echo "Couldn't find resolve $host to an IP address" exit 1 fi if [ `uname -n` = $host ]; then bash $init else cat $init | ssh -l root -T $host -- "cat > $init; bash $init" fi if [ "x$fence_conf" != x ]; then if [ -e $fence_conf ]; then scp $fence_conf root@${host}:$fence_conf fi fi if [ $ETCHOSTS = 1 ]; then scp /etc/hosts root@${host}:/etc/hosts fi if [ $pcmk_ver = 10 ]; then scp /etc/hosts root@${host}:/etc/hosts scp ~/.ssh/id_dsa.suse root@${host}:.ssh/id_dsa scp ~/.ssh/known_hosts root@${host}:.ssh/known_hosts fi ssh -l root ${host} -- grep -q "token:" $cs_conf 2>&1 > /dev/null new_config=$? new_config=1 if [ $new_config = 0 ]; then printf "%-10s Updating $cs_conf\n" ${host}: patch_cs_config else printf "%-10s Installing $cs_conf\n" ${host}: create_cs_config fi done
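For reference, a hypothetical invocation of this script, using only options handled by the parser above (the dsh group and distro names are placeholders), might look like:

    # Configure the hosts in dsh group "cluster1" for corosync with unicast
    # transport, installing packages from the clusterlabs.org "rpm-next" repo
    # built for fedora-18.
    ./cluster-init -g cluster1 -c -u -I -R rpm-next -D fedora-18

Running the script with -? prints the full option list from helptext().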