diff --git a/doc/Clusters_from_Scratch/en-US/Ch-Installation.txt b/doc/Clusters_from_Scratch/en-US/Ch-Installation.txt
index 0d85c1a78d..a7d15587a1 100644
--- a/doc/Clusters_from_Scratch/en-US/Ch-Installation.txt
+++ b/doc/Clusters_from_Scratch/en-US/Ch-Installation.txt
@@ -1,754 +1,754 @@
= Installation =

== OS Installation ==

Detailed instructions for installing Fedora are available at
http://docs.fedoraproject.org/install-guide/f13/ in a number of languages.
The abbreviated version is as follows...

Point your browser to http://fedoraproject.org/en/get-fedora-all,
locate the Install Media section and download the install DVD that
matches your hardware.

Burn the disk image to a DVD
footnote:[http://docs.fedoraproject.org/readme-burning-isos/en-US.html]
and boot from it. Or use the image to boot a virtual machine as I have
done here. After clicking through the welcome screen, select your
language and keyboard layout
footnote:[http://docs.fedoraproject.org/install-guide/f13/en-US/html/s1-langselection-x86.html]

.Installation: Good choice
-image::images/f-13.1-welcome.png[Welcome]
+image::images/f-13.1-welcome.png["Welcome",align="center"]

.Fedora Installation - Storage Devices
-image::images/f-13.2-devices.png[Storage Devices]
+image::images/f-13.2-devices.png["Storage Devices",align="center"]

Assign your machine a host name.
footnote:[http://docs.fedoraproject.org/install-guide/f13/en-US/html/sn-networkconfig-fedora.html]
I happen to control the clusterlabs.org domain name, so I will use that here.

.Fedora Installation - Hostname
-image::images/f-13.3-hostname.png[Hostname]
+image::images/f-13.3-hostname.png["Hostname",align="center"]

You will then be prompted to indicate the machine's physical location and
to supply a root password.
footnote:[http://docs.fedoraproject.org/install-guide/f13/en-US/html/sn-account_configuration.html]

Now select where you want Fedora installed.
footnote:[http://docs.fedoraproject.org/install-guide/f13/en-US/html/s1-diskpartsetup-x86.html]
As I don't care about any existing data, I will accept the default and
allow Fedora to use the complete drive. However, I want to reserve some
space for DRBD, so I'll check the Review and modify partitioning layout box.

.Fedora Installation - Installation Type
-image::images/f-13.4-partition-overview.png[Choose Install Type]
+image::images/f-13.4-partition-overview.png["Choose Install Type",align="center"]

By default, Fedora will give all the space to the / (aka. root) partition.
We'll take some back so we can use DRBD.

.Fedora Installation - Default Partitioning
-image::images/f-13.5-partition-default.png[Default Partitioning]
+image::images/f-13.5-partition-default.png["Default Partitioning",align="center"]

The finalized partition layout should look something like the diagram below.

[IMPORTANT]
===========
If you plan on following the DRBD or GFS2 portions of this guide, you
should reserve at least 1GB of space on each machine from which to create
a shared volume (it will later hold the website data).
===========
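Once the install has finished, you will be able to confirm from the
command line that the space you reserved really was left unallocated.
The commands below are only a sketch: they assume the space was left
inside the default LVM volume group rather than as raw disk space, and
the group name vg_pcmk1 is a guess based on the host name used in this
guide, so substitute whatever name your installation actually created.

[source,Bash]
----
# List the LVM volume groups; the VFree column shows how much space is
# still unallocated and therefore available for a DRBD volume later on.
vgs

# The same information in more detail, for a single (hypothetical) group.
vgdisplay vg_pcmk1 | grep -i free
----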
.Fedora Installation - Customize Partitioning
-image::images/f-13.6-partition-custom.png[Customize Partitioning]
+image::images/f-13.6-partition-custom.png["Customize Partitioning",align="center"]

.Fedora Installation - Bootloader
-image::images/f-13.7-bootloader.png[Unless you have a strong reason not to, accept the default bootloader location]
+image::images/f-13.7-bootloader.png["Unless you have a strong reason not to, accept the default bootloader location",align="center"]

Next choose which software should be installed. Change the selection to
Web Server since we plan on using Apache. Don't enable updates yet; we'll
do that (and install any extra software we need) later. After you click
next, Fedora will begin installing.

.Fedora Installation - Software
-image::images/f-13.8-software.png[Software selection]
+image::images/f-13.8-software.png["Software selection",align="center"]

Go grab something to drink; this may take a while.

.Fedora Installation - Installing
-image::images/f-13.9-installing.png[Installing]
+image::images/f-13.9-installing.png["Installing",align="center"]

.Fedora Installation - Installation Complete
-image::images/f-13.10-install-complete.png[Stage 1, completed]
+image::images/f-13.10-install-complete.png["Stage 1, completed",align="center"]

Once the node reboots, follow the on-screen instructions
footnote:[http://docs.fedoraproject.org/install-guide/f13/en-US/html/ch-firstboot.html]
to create a system user and configure the time.

.Fedora Installation - First Boot
-image::images/f-13.11-post-welcome.png[First boot]
+image::images/f-13.11-post-welcome.png["First boot",align="center"]

.Fedora Installation - Create Non-privileged User
-image::images/f-13.12-new-user.png[Creating a new user, take note of the password, you'll need it soon]
+image::images/f-13.12-new-user.png["Creating a new user, take note of the password, you'll need it soon",align="center"]

[NOTE]
=======
It is highly recommended to enable NTP on your cluster nodes. Doing so
ensures all nodes agree on the current time and makes reading log files
significantly easier.
=======

.Fedora Installation - Date and Time
-image::images/f-13.13-date-time.png[Date and time]
+image::images/f-13.13-date-time.png["Date and time",align="center"]

Click through the next screens until you reach the login window. Click on
the user you created and supply the password you indicated earlier.

.Fedora Installation - Customize Networking
-image::images/f-13.14-networking.png[Click here to configure networking]
+image::images/f-13.14-networking.png["Click here to configure networking",align="center"]

[IMPORTANT]
===========
Do not accept the default network settings. Cluster machines should never
obtain an IP address via DHCP. Here I will use the internal addresses for
the clusterlabs.org network.
===========
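If you prefer to set the address from the command line rather than
through the screens shown below, the interface configuration file can be
edited directly. The following is only a sketch: it assumes the interface
is named eth0 and reuses the 192.168.9.x addressing that appears later in
this chapter, while the gateway and DNS entries are placeholders for
whatever your own network uses.

[source,Bash]
----
# /etc/sysconfig/network-scripts/ifcfg-eth0
# Static addressing for a cluster node - note BOOTPROTO=none, not dhcp
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.168.9.41
NETMASK=255.255.255.0
# Placeholder values - substitute your own gateway and name server
GATEWAY=192.168.9.1
DNS1=192.168.9.1
----

After saving the file, running "service network restart" as root applies
the change.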
.Fedora Installation - Specify Network Preferences
-image::images/f-13.15-manual-networking.png[Specify network settings for your machine, never choose DHCP]
+image::images/f-13.15-manual-networking.png["Specify network settings for your machine, never choose DHCP",align="center"]

.Fedora Installation - Activate Networking
-image::images/f-13.16-networking-activate.png[Click the big green button to activate your changes]
+image::images/f-13.16-networking-activate.png["Click the big green button to activate your changes",align="center"]

.Fedora Installation - Bring up the Terminal
-image::images/f-13.17-terminal.png[Down to business, fire up the command line]
+image::images/f-13.17-terminal.png["Down to business, fire up the command line",align="center"]

[NOTE]
======
That was the last screenshot; from here on in we're going to be working
from the terminal.
======

== Cluster Software Installation ==

Go to the terminal window you just opened and switch to the super user
(aka. "root") account with the su command. You will need to supply the
password you entered earlier during the installation process.

[source,Bash]
----
[beekhof@pcmk-1 ~]$ su -
Password:
[root@pcmk-1 ~]#
----

[NOTE]
======
Note that the username (the text before the @ symbol) now indicates we're
running as the super user "root".
======

[source,Bash]
....
# ip addr
1: lo: mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:6f:e1:58 brd ff:ff:ff:ff:ff:ff
    inet 192.168.9.41/24 brd 192.168.9.255 scope global eth0
    inet6 ::20c:29ff:fe6f:e158/64 scope global dynamic
       valid_lft 2591667sec preferred_lft 604467sec
    inet6 2002:57ae:43fc:0:20c:29ff:fe6f:e158/64 scope global dynamic
       valid_lft 2591990sec preferred_lft 604790sec
    inet6 fe80::20c:29ff:fe6f:e158/64 scope link
       valid_lft forever preferred_lft forever
# ping -c 1 www.google.com
PING www.l.google.com (74.125.39.99) 56(84) bytes of data.
64 bytes from fx-in-f99.1e100.net (74.125.39.99): icmp_seq=1 ttl=56 time=16.7 ms
--- www.l.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 20ms
rtt min/avg/max/mdev = 16.713/16.713/16.713/0.000 ms
# /sbin/chkconfig network on
#
....

=== Security Shortcuts ===

To simplify this guide and focus on the aspects directly connected to
clustering, we will now disable the machine's firewall and SELinux. Both
of these actions create significant security issues and should not be
performed on machines that will be exposed to the outside world.

[IMPORTANT]
===========
TODO: Create an Appendix that deals with (at least) re-enabling the firewall.
===========

[source,Bash]
----
# sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config
# /sbin/chkconfig --del iptables
# service iptables stop
iptables: Flushing firewall rules: [ OK ]
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Unloading modules: [ OK ]
----

[NOTE]
================
You will need to reboot for the SELinux changes to take effect. Otherwise
you will see something like this when you start corosync:

May 4 19:30:54 pcmk-1 setroubleshoot: SELinux is preventing /usr/sbin/corosync "getattr" access on /.
For complete SELinux messages. run sealert -l 6e0d4384-638e-4d55-9aaf-7dac011f29c1
May 4 19:30:54 pcmk-1 setroubleshoot: SELinux is preventing /usr/sbin/corosync "getattr" access on /.
For complete SELinux messages. run sealert -l 6e0d4384-638e-4d55-9aaf-7dac011f29c1 ================ === Install the Cluster Software === Since version 12, Fedora comes with recent versions of everything you need, so simply fire up the shell and run: .... # sed -i.bak "s/enabled=0/enabled=1/g" /etc/yum.repos.d/fedora.repo # sed -i.bak "s/enabled=0/enabled=1/g" /etc/yum.repos.d/fedora-updates.repo # yum install -y pacemaker corosyncLoaded plugins: presto, refresh-packagekit fedora/metalink | 22 kB 00:00 fedora-debuginfo/metalink | 16 kB 00:00 fedora-debuginfo | 3.2 kB 00:00 fedora-debuginfo/primary_db | 1.4 MB 00:04 fedora-source/metalink | 22 kB 00:00 fedora-source | 3.2 kB 00:00 fedora-source/primary_db | 3.0 MB 00:05 updates/metalink | 26 kB 00:00 updates | 2.6 kB 00:00 updates/primary_db | 1.1 kB 00:00 updates-debuginfo/metalink | 18 kB 00:00 updates-debuginfo | 2.6 kB 00:00 updates-debuginfo/primary_db | 1.1 kB 00:00 updates-source/metalink | 25 kB 00:00 updates-source | 2.6 kB 00:00 updates-source/primary_db | 1.1 kB 00:00 Setting up Install Process Resolving Dependencies --> Running transaction check ---> Package corosync.x86_64 0:1.2.1-1.fc13 set to be updated --> Processing Dependency: corosynclib = 1.2.1-1.fc13 for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libquorum.so.4(COROSYNC_QUORUM_1.0)(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libvotequorum.so.4(COROSYNC_VOTEQUORUM_1.0)(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libcpg.so.4(COROSYNC_CPG_1.0)(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libconfdb.so.4(COROSYNC_CONFDB_1.0)(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libcfg.so.4(COROSYNC_CFG_0.82)(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libpload.so.4(COROSYNC_PLOAD_1.0)(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: liblogsys.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libconfdb.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libcoroipcc.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libcpg.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libquorum.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libcoroipcs.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libvotequorum.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libcfg.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libtotem_pg.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 --> Processing Dependency: libpload.so.4()(64bit) for package: corosync-1.2.1-1.fc13.x86_64 ---> Package pacemaker.x86_64 0:1.1.5-1.fc13 set to be updated --> Processing Dependency: heartbeat >= 3.0.0 for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: net-snmp >= 5.4 for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: resource-agents for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: cluster-glue for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libnetsnmp.so.20()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libcrmcluster.so.1()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libpengine.so.3()(64bit) for package: 
pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libnetsnmpagent.so.20()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libesmtp.so.5()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libstonithd.so.1()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libhbclient.so.1()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libpils.so.2()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libpe_status.so.2()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libnetsnmpmibs.so.20()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libnetsnmphelpers.so.20()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libcib.so.1()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libccmclient.so.1()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libstonith.so.1()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: liblrm.so.2()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libtransitioner.so.1()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libpe_rules.so.2()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libcrmcommon.so.2()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Processing Dependency: libplumb.so.2()(64bit) for package: pacemaker-1.1.5-1.fc13.x86_64 --> Running transaction check ---> Package cluster-glue.x86_64 0:1.0.2-1.fc13 set to be updated --> Processing Dependency: perl-TimeDate for package: cluster-glue-1.0.2-1.fc13.x86_64 --> Processing Dependency: libOpenIPMIutils.so.0()(64bit) for package: cluster-glue-1.0.2-1.fc13.x86_64 --> Processing Dependency: libOpenIPMIposix.so.0()(64bit) for package: cluster-glue-1.0.2-1.fc13.x86_64 --> Processing Dependency: libopenhpi.so.2()(64bit) for package: cluster-glue-1.0.2-1.fc13.x86_64 --> Processing Dependency: libOpenIPMI.so.0()(64bit) for package: cluster-glue-1.0.2-1.fc13.x86_64 ---> Package cluster-glue-libs.x86_64 0:1.0.2-1.fc13 set to be updated ---> Package corosynclib.x86_64 0:1.2.1-1.fc13 set to be updated --> Processing Dependency: librdmacm.so.1(RDMACM_1.0)(64bit) for package: corosynclib-1.2.1-1.fc13.x86_64 --> Processing Dependency: libibverbs.so.1(IBVERBS_1.0)(64bit) for package: corosynclib-1.2.1-1.fc13.x86_64 --> Processing Dependency: libibverbs.so.1(IBVERBS_1.1)(64bit) for package: corosynclib-1.2.1-1.fc13.x86_64 --> Processing Dependency: libibverbs.so.1()(64bit) for package: corosynclib-1.2.1-1.fc13.x86_64 --> Processing Dependency: librdmacm.so.1()(64bit) for package: corosynclib-1.2.1-1.fc13.x86_64 ---> Package heartbeat.x86_64 0:3.0.0-0.7.0daab7da36a8.hg.fc13 set to be updated --> Processing Dependency: PyXML for package: heartbeat-3.0.0-0.7.0daab7da36a8.hg.fc13.x86_64 ---> Package heartbeat-libs.x86_64 0:3.0.0-0.7.0daab7da36a8.hg.fc13 set to be updated ---> Package libesmtp.x86_64 0:1.0.4-12.fc12 set to be updated ---> Package net-snmp.x86_64 1:5.5-12.fc13 set to be updated --> Processing Dependency: libsensors.so.4()(64bit) for package: 1:net-snmp-5.5-12.fc13.x86_64 ---> Package net-snmp-libs.x86_64 1:5.5-12.fc13 set to be updated ---> Package pacemaker-libs.x86_64 0:1.1.5-1.fc13 set to be updated ---> Package resource-agents.x86_64 0:3.0.10-1.fc13 set to be updated --> Processing Dependency: libnet.so.1()(64bit) for package: 
resource-agents-3.0.10-1.fc13.x86_64 --> Running transaction check ---> Package OpenIPMI-libs.x86_64 0:2.0.16-8.fc13 set to be updated ---> Package PyXML.x86_64 0:0.8.4-17.fc13 set to be updated ---> Package libibverbs.x86_64 0:1.1.3-4.fc13 set to be updated --> Processing Dependency: libibverbs-driver for package: libibverbs-1.1.3-4.fc13.x86_64 ---> Package libnet.x86_64 0:1.1.4-3.fc12 set to be updated ---> Package librdmacm.x86_64 0:1.0.10-2.fc13 set to be updated ---> Package lm_sensors-libs.x86_64 0:3.1.2-2.fc13 set to be updated ---> Package openhpi-libs.x86_64 0:2.14.1-3.fc13 set to be updated ---> Package perl-TimeDate.noarch 1:1.20-1.fc13 set to be updated --> Running transaction check ---> Package libmlx4.x86_64 0:1.0.1-5.fc13 set to be updated --> Finished Dependency Resolution Dependencies Resolved ========================================================================================== Package Arch Version Repository Size ========================================================================================== Installing: corosync x86_64 1.2.1-1.fc13 fedora 136 k pacemaker x86_64 1.1.5-1.fc13 fedora 543 k Installing for dependencies: OpenIPMI-libs x86_64 2.0.16-8.fc13 fedora 474 k PyXML x86_64 0.8.4-17.fc13 fedora 906 k cluster-glue x86_64 1.0.2-1.fc13 fedora 230 k cluster-glue-libs x86_64 1.0.2-1.fc13 fedora 116 k corosynclib x86_64 1.2.1-1.fc13 fedora 145 k heartbeat x86_64 3.0.0-0.7.0daab7da36a8.hg.fc13 updates 172 k heartbeat-libs x86_64 3.0.0-0.7.0daab7da36a8.hg.fc13 updates 265 k libesmtp x86_64 1.0.4-12.fc12 fedora 54 k libibverbs x86_64 1.1.3-4.fc13 fedora 42 k libmlx4 x86_64 1.0.1-5.fc13 fedora 27 k libnet x86_64 1.1.4-3.fc12 fedora 49 k librdmacm x86_64 1.0.10-2.fc13 fedora 22 k lm_sensors-libs x86_64 3.1.2-2.fc13 fedora 37 k net-snmp x86_64 1:5.5-12.fc13 fedora 295 k net-snmp-libs x86_64 1:5.5-12.fc13 fedora 1.5 M openhpi-libs x86_64 2.14.1-3.fc13 fedora 135 k pacemaker-libs x86_64 1.1.5-1.fc13 fedora 264 k perl-TimeDate noarch 1:1.20-1.fc13 fedora 42 k resource-agents x86_64 3.0.10-1.fc13 fedora 357 k Transaction Summary ========================================================================================= Install 21 Package(s) Upgrade 0 Package(s) Total download size: 5.7 M Installed size: 20 M Downloading Packages: Setting up and reading Presto delta metadata updates-testing/prestodelta | 164 kB 00:00 fedora/prestodelta | 150 B 00:00 Processing delta metadata Package(s) data still to download: 5.7 M (1/21): OpenIPMI-libs-2.0.16-8.fc13.x86_64.rpm | 474 kB 00:00 (2/21): PyXML-0.8.4-17.fc13.x86_64.rpm | 906 kB 00:01 (3/21): cluster-glue-1.0.2-1.fc13.x86_64.rpm | 230 kB 00:00 (4/21): cluster-glue-libs-1.0.2-1.fc13.x86_64.rpm | 116 kB 00:00 (5/21): corosync-1.2.1-1.fc13.x86_64.rpm | 136 kB 00:00 (6/21): corosynclib-1.2.1-1.fc13.x86_64.rpm | 145 kB 00:00 (7/21): heartbeat-3.0.0-0.7.0daab7da36a8.hg.fc13.x86_64.rpm | 172 kB 00:00 (8/21): heartbeat-libs-3.0.0-0.7.0daab7da36a8.hg.fc13.x86_64.rpm | 265 kB 00:00 (9/21): libesmtp-1.0.4-12.fc12.x86_64.rpm | 54 kB 00:00 (10/21): libibverbs-1.1.3-4.fc13.x86_64.rpm | 42 kB 00:00 (11/21): libmlx4-1.0.1-5.fc13.x86_64.rpm | 27 kB 00:00 (12/21): libnet-1.1.4-3.fc12.x86_64.rpm | 49 kB 00:00 (13/21): librdmacm-1.0.10-2.fc13.x86_64.rpm | 22 kB 00:00 (14/21): lm_sensors-libs-3.1.2-2.fc13.x86_64.rpm | 37 kB 00:00 (15/21): net-snmp-5.5-12.fc13.x86_64.rpm | 295 kB 00:00 (16/21): net-snmp-libs-5.5-12.fc13.x86_64.rpm | 1.5 MB 00:01 (17/21): openhpi-libs-2.14.1-3.fc13.x86_64.rpm | 135 kB 00:00 (18/21): pacemaker-1.1.5-1.fc13.x86_64.rpm | 
543 kB 00:00
(19/21): pacemaker-libs-1.1.5-1.fc13.x86_64.rpm | 264 kB 00:00
(20/21): perl-TimeDate-1.20-1.fc13.noarch.rpm | 42 kB 00:00
(21/21): resource-agents-3.0.10-1.fc13.x86_64.rpm | 357 kB 00:00
Total 539 kB/s | 5.7 MB 00:10
warning: rpmts_HdrFromFdno: Header V3 RSA/SHA256 Signature, key ID e8e40fde: NOKEY
fedora/gpgkey | 3.2 kB 00:00 ...
Importing GPG key 0xE8E40FDE "Fedora (13) >/etc/corosync/service.d/pcmkservice { # Load the Pacemaker Cluster Resource Manager name: pacemaker ver: 1 } END
----

The final configuration should look something like the sample in
Appendix B, Sample Corosync Configuration.

[IMPORTANT]
===========
When run in version 1 mode, the plugin does not start the Pacemaker
daemons. Instead, it just sets up the quorum and messaging interfaces
needed by the rest of the stack. Starting the daemons occurs when the
Pacemaker init script is invoked. This resolves two long-standing issues:

.. Forking inside a multi-threaded process like Corosync causes all sorts
of pain. This has been problematic for Pacemaker as it needs a number of
daemons to be spawned.

.. Corosync was never designed for staggered shutdown - something
previously needed in order to prevent the cluster from leaving before
Pacemaker could stop all active resources.
===========

=== Propagate the Configuration ===

Now we need to copy the changes so far to the other node:

[source,Bash]
----
# for f in /etc/corosync/corosync.conf /etc/corosync/service.d/pcmk /etc/hosts; do scp $f pcmk-2:$f ; done
corosync.conf 100% 1528 1.5KB/s 00:00
hosts 100% 281 0.3KB/s 00:00
#
----

diff --git a/doc/Clusters_from_Scratch/en-US/Ch-Intro.txt b/doc/Clusters_from_Scratch/en-US/Ch-Intro.txt
index 90a1c0e2b0..c737d49346 100644
--- a/doc/Clusters_from_Scratch/en-US/Ch-Intro.txt
+++ b/doc/Clusters_from_Scratch/en-US/Ch-Intro.txt
@@ -1,155 +1,155 @@
= Read-Me-First =

== The Scope of this Document ==

Computer clusters can be used to provide highly available services or
resources. The redundancy of multiple machines is used to guard against
failures of many types.

This document will walk through the installation and setup of simple
clusters using the Fedora distribution, version 14.

The clusters described here will use Pacemaker and Corosync to provide
resource management and messaging. Required packages and modifications
to their configuration files are described along with the use of the
Pacemaker command-line tool for generating the XML used for cluster
control.

Pacemaker is a central component and provides the resource management
required in these systems. This management includes detecting and
recovering from the failure of various nodes, resources and services
under its control.

When more in-depth information is required, and for real-world usage,
please refer to the http://www.clusterlabs.org/doc/[Pacemaker Explained]
manual.

== What Is Pacemaker? ==

Pacemaker is a cluster resource manager. It achieves maximum availability
for your cluster services (aka. resources) by detecting and recovering
from node and resource-level failures by making use of the messaging and
membership capabilities provided by your preferred cluster infrastructure
(either Corosync or Heartbeat).
Pacemaker's key features include:

* Detection and recovery of node and service-level failures
* Storage agnostic, no requirement for shared storage
* Resource agnostic, anything that can be scripted can be clustered
* Supports STONITH for ensuring data integrity
* Supports large and small clusters
* Supports both quorate and resource-driven clusters
* Supports practically any redundancy configuration
* Automatically replicated configuration that can be updated from any node
* Ability to specify cluster-wide service ordering, colocation and anti-colocation
* Support for advanced service types
** Clones: for services which need to be active on multiple nodes
** Multi-state: for services with multiple modes (e.g. master/slave, primary/secondary)
* Unified, scriptable, cluster shell

== Pacemaker Architecture ==

At the highest level, the cluster is made up of three pieces:

* Non-cluster aware components (illustrated in green). These pieces
include the resources themselves, scripts that start, stop and monitor
them, and also a local daemon that masks the differences between the
different standards these scripts implement.

* Resource management. Pacemaker provides the brain (illustrated in blue)
that processes and reacts to events regarding the cluster. These events
include nodes joining or leaving the cluster; resource events caused by
failures, maintenance, scheduled activities; and other administrative
actions. Pacemaker will compute the ideal state of the cluster and plot a
path to achieve it after any of these events. This may include moving
resources, stopping nodes and even forcing them offline with remote power
switches.

* Low-level infrastructure. Corosync provides reliable messaging,
membership and quorum information about the cluster (illustrated in red).

.Conceptual Stack Overview
-image::images/pcmk-overview.png[Conceptual overview of the cluster stack]
+image::images/pcmk-overview.png["Conceptual overview of the cluster stack",align="center"]

When combined with Corosync, Pacemaker also supports popular open source
cluster filesystems.
footnote:[Even though Pacemaker also supports Heartbeat, the filesystems
need to use the stack for messaging and membership and Corosync seems to
be what they're standardizing on. Technically it would be possible for
them to support Heartbeat as well, however there seems little interest in
this.]

Due to recent standardization within the cluster filesystem community,
they make use of a common distributed lock manager, which relies on
Corosync for its messaging capabilities and on Pacemaker for its
membership (which nodes are up/down) and fencing services.

.The Pacemaker Stack
-image::images/pcmk-stack.png[The Pacemaker StackThe Pacemaker stack when running on Corosync]
+image::images/pcmk-stack.png["The Pacemaker stack when running on Corosync",align="center"]

=== Internal Components ===

Pacemaker itself is composed of four key components (illustrated below in
the same color scheme as the previous diagram):

* CIB (aka. Cluster Information Base)
* CRMd (aka. Cluster Resource Management daemon)
* PEngine (aka. PE or Policy Engine)
* STONITHd

.Internal Components
-image::images/pcmk-internals.png[Subsystems of a Pacemaker cluster running on Corosync]
+image::images/pcmk-internals.png["Subsystems of a Pacemaker cluster running on Corosync",align="center"]

The CIB uses XML to represent both the cluster's configuration and
current state of all resources in the cluster. The contents of the CIB
are automatically kept in sync across the entire cluster and are used by
the PEngine to compute the ideal state of the cluster and how it should
be achieved.
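To give a rough idea of what that XML looks like, here is a sketch of the
skeleton of a near-empty CIB. It is illustrative only: the exact
attributes and the schema named in validate-with depend on the Pacemaker
version installed, and a real CIB will contain considerably more detail.

[source,XML]
----
<cib admin_epoch="0" epoch="0" num_updates="0" validate-with="pacemaker-1.0">
  <configuration>
    <crm_config/>   <!-- cluster-wide option settings -->
    <nodes/>        <!-- the machines that make up the cluster -->
    <resources/>    <!-- the services the cluster manages -->
    <constraints/>  <!-- rules describing where and when resources may run -->
  </configuration>
  <status/>         <!-- runtime state, maintained by the cluster itself -->
</cib>
----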
The PEngine's list of instructions is then fed to the DC (Designated
Co-ordinator). Pacemaker centralizes all cluster decision making by
electing one of the CRMd instances to act as a master. Should the elected
CRMd process, or the node it is on, fail... a new one is quickly
established.

The DC carries out the PEngine's instructions in the required order by
passing them to either the LRMd (Local Resource Management daemon) or
CRMd peers on other nodes via the cluster messaging infrastructure (which
in turn passes them on to their LRMd process).

The peer nodes all report the results of their operations back to the DC
and, based on the expected and actual results, will either execute any
actions that needed to wait for the previous one to complete, or abort
processing and ask the PEngine to recalculate the ideal cluster state
based on the unexpected results.

In some cases, it may be necessary to power off nodes in order to protect
shared data or complete resource recovery. For this, Pacemaker comes with
STONITHd. STONITH is an acronym for Shoot-The-Other-Node-In-The-Head and
is usually implemented with a remote power switch. In Pacemaker, STONITH
devices are modeled as resources (and configured in the CIB) to enable
them to be easily monitored for failure; however, STONITHd takes care of
understanding the STONITH topology such that its clients simply request a
node be fenced and it does the rest.

== Types of Pacemaker Clusters ==

Pacemaker makes no assumptions about your environment; this allows it to
support practically any
http://en.wikipedia.org/wiki/High-availability_cluster#Node_configurations[redundancy configuration]
including Active/Active, Active/Passive, N+1, N+M, N-to-1 and N-to-N.

In this document we will focus on the setup of a highly available Apache
web server with an Active/Passive cluster using DRBD and Ext4 to store
data. Then, we will upgrade this cluster to Active/Active using GFS2.

.Active/Passive Redundancy
-image::images/pcmk-active-passive.png[Two-node Active/Passive clusters using Pacemaker and DRBD are a cost-effective solution for many High Availability situations.]
+image::images/pcmk-active-passive.png["Two-node Active/Passive clusters using Pacemaker and DRBD are a cost-effective solution for many High Availability situations",align="center"]

.N to N Redundancy
-image::images/pcmk-active-active.png[When shared storage is available, every node can potentially be used for failover. Pacemaker can even run multiple copies of services to spread out the workload.]
+image::images/pcmk-active-active.png["When shared storage is available, every node can potentially be used for failover. Pacemaker can even run multiple copies of services to spread out the workload",align="center"]