diff --git a/doc/Clusters_from_Scratch/en-US/Ch-Installation.txt b/doc/Clusters_from_Scratch/en-US/Ch-Installation.txt index 5d5cb62958..cf47602bc0 100644 --- a/doc/Clusters_from_Scratch/en-US/Ch-Installation.txt +++ b/doc/Clusters_from_Scratch/en-US/Ch-Installation.txt @@ -1,636 +1,639 @@ = Installation = == OS Installation == Detailed instructions for installing Fedora are available at http://docs.fedoraproject.org/en-US/Fedora/20/html/Installation_Guide/ in a number of languages. The abbreviated version is as follows... Point your browser to http://fedoraproject.org/en/get-fedora-all, locate the +Install Media+ section and download the install DVD that matches your hardware. Burn the disk image to a DVD footnote:[http://docs.fedoraproject.org/en-US/Fedora/20/html/Burning_ISO_images_to_disc/index.html] and boot from it, or use the image to boot a virtual machine. After clicking through the welcome screen, select your language, keyboard layout footnote:[http://docs.fedoraproject.org/en-US/Fedora/20/html/Installation_Guide/language-selection-x86.html] At this point you get a chance to tweak the default installation options. In the +Network Configuration+ section you'll want to: - Assign your machine a host name I happen to control the clusterlabs.org domain name, so I will use that here. - Assign a fixed IP address [IMPORTANT] =========== Do not accept the default network settings. Cluster machines should never obtain an IP address via DHCP. If you miss this step, this can easily be configured after installation. You will have to navigate to +system settings+ and select +network+. From there you can select what device to configure. =========== In the +Software Selection+ section (try saying that 10 times quickly), choose +Minimal Install+ so that we see everything that gets installed. Don't enable updates yet, we'll do that (and install any extra software we need) later. [IMPORTANT] =========== By default Fedora uses LVM for partitioning which allows us to dynamically change the amount of space allocated to a given partition. However, by default it also allocates all free space to the +/+ (aka. +root+) partition which cannot be dynamically _reduced_ in size (dynamic increases are fine by-the-way). So if you plan on following the DRBD or GFS2 portions of this guide, you should reserve at least 1Gb of space on each machine from which to create a shared volume. To do so, enter the +Installation Destination+ section where you are be given an opportunity to reduce the size of the +root+ partition (after chosing which hard drive you wish to install to). =========== It is highly recommended to enable NTP on your cluster nodes. Doing so ensures all nodes agree on the current time and makes reading log files significantly easier. You can do this in the +Date & Time+ section. footnote:[http://docs.fedoraproject.org/en-US/Fedora/20/html/Installation_Guide/s1-timezone-x86.html] Once the node reboots, you'll see a (possibly mangled) login prompt on the console. Login using +root+ and the password you created earlier. image::images/Console.png["Initial Console",align="center",scaledwidth="65%"] [NOTE] ====== From here on in we're going to be working exclusively from the terminal. ====== == Post Installation Tasks == === Networking === Check the machine has the static IP address you configured earlier [source,C] ----- # ip addr 1: lo: mtu 16436 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eth0: mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 52:54:00:d7:d6:08 brd ff:ff:ff:ff:ff:ff inet 192.168.122.101/24 brd 192.168.122.255 scope global eth0 inet6 fe80::5054:ff:fed7:d608/64 scope link valid_lft forever preferred_lft forever ----- -NOTE +[NOTE] ===== -Handy tip - If you ever need to change the node's IP address from the command line follow these instructions: +If you ever need to change the node's IP address from the command line follow these instructions: + +.... +# manually edit /etc/sysconfig/network-scripts/ifcfg-${device} +# nmcli dev disconnect ${device} +# nmcli con reload ${device} +# nmcli con up ${device} +.... -1. manually edit /etc/sysconfig/network-scripts/ifcfg-${device} -2. nmcli dev disconnect ${device} -3. nmcli con reload ${device} -4. nmcli con up ${device} +This makes +NetworkManager+ aware that a change was made on the config file. -This makes NetworkManager aware that a change was made on the config file. ===== Next, check the routes are ok: [source,C] ----- [root@pcmk-1 ~]# ip route default via 192.168.122.1 dev eth0 192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.101 ----- If there is no line beginning with +default via+, then you may need to add a line such as [source,Bash] GATEWAY=192.168.122.1 to '/etc/sysconfig/network' and restart the network. Now check for connectivity to the outside world. Start small by testing if we can read the gateway we configured. [source,C] ----- # ping -c 1 192.168.122.1 PING 192.168.122.1 (192.168.122.1) 56(84) bytes of data. 64 bytes from 192.168.122.1: icmp_req=1 ttl=64 time=0.249 ms ---- 192.168.122.1 ping statistics --- + --- 192.168.122.1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.249/0.249/0.249/0.000 ms ----- Now try something external, choose a location you know will be available. [source,C] ----- # ping -c 1 www.google.com PING www.l.google.com (173.194.72.106) 56(84) bytes of data. 64 bytes from tf-in-f106.1e100.net (173.194.72.106): icmp_req=1 ttl=41 time=167 ms ---- www.l.google.com ping statistics --- + --- www.l.google.com ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 167.618/167.618/167.618/0.000 ms ----- === Leaving the Console === The console isn't a very friendly place to work from, we will now switch to accessing the machine remotely via SSH where we can use copy&paste etc. First we check we can see the newly installed at all: [source,C] ----- beekhof@f16 ~ # ping -c 1 192.168.122.101 PING 192.168.122.101 (192.168.122.101) 56(84) bytes of data. 64 bytes from 192.168.122.101: icmp_req=1 ttl=64 time=1.01 ms --- 192.168.122.101 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 1.012/1.012/1.012/0.000 ms ----- Next we login via SSH [source,C] ----- beekhof@f16 ~ # ssh -l root 192.168.122.11 root@192.168.122.11's password: Last login: Fri Mar 30 19:41:19 2012 from 192.168.122.1 [root@pcmk-1 ~]# ----- === Security Shortcuts === To simplify this guide and focus on the aspects directly connected to clustering, we will now disable the machine's firewall and SELinux installation. [WARNING] =========== Both of these actions create significant security issues and should not be performed on machines that will be exposed to the outside world. =========== [IMPORTANT] =========== TODO: Create an Appendix that deals with (at least) re-enabling the firewall. =========== [source,C] ---- # setenforce 0 # sed -i.bak "s/SELINUX=enforcing/SELINUX=permissive/g" /etc/selinux/config # systemctl disable iptables.service # rm '/etc/systemd/system/basic.target.wants/iptables.service' # systemctl stop iptables.service ---- === Short Node Names === During installation, we filled in the machine's fully qualified domain name (FQDN) which can be rather long when it appears in cluster logs and status output. See for yourself how the machine identifies itself: (((Nodes, short name))) [source,C] ---- # uname -n pcmk-1.clusterlabs.org # dnsdomainname clusterlabs.org ---- (((Nodes, Domain name (Query)))) The output from the second command is fine, but we really don't need the domain name included in the basic host details. To address this, we need to use the +hostnamectl+ tool to strip off the domain name. [source,C] ---- # hostnamectl set-hostname $(uname -n | sed s/\\..*//)' ---- (((Nodes, Domain name (Remove from host name)))) Now check the machine is using the correct names [source,C] ---- # uname -n pcmk-1 # dnsdomainname clusterlabs.org ---- If it concerns you that the shell prompt has not been updated, simply log out and back in again. == Before You Continue == Repeat the Installation steps so far, so that you have two Fedora nodes ready to have the cluster software installed. For the purposes of this document, the additional node is called pcmk-2 with address 192.168.122.102. === Finalize Networking === Confirm that you can communicate between the two new nodes: [source,C] ---- # ping -c 3 192.168.122.102 PING 192.168.122.102 (192.168.122.102) 56(84) bytes of data. 64 bytes from 192.168.122.102: icmp_seq=1 ttl=64 time=0.343 ms 64 bytes from 192.168.122.102: icmp_seq=2 ttl=64 time=0.402 ms 64 bytes from 192.168.122.102: icmp_seq=3 ttl=64 time=0.558 ms --- 192.168.122.102 ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2000ms rtt min/avg/max/mdev = 0.343/0.434/0.558/0.092 ms ---- Now we need to make sure we can communicate with the machines by their name. If you have a DNS server, add additional entries for the two machines. Otherwise, you'll need to add the machines to '/etc/hosts' . Below are the entries for my cluster nodes: [source,C] ---- # grep pcmk /etc/hosts 192.168.122.101 pcmk-1.clusterlabs.org pcmk-1 192.168.122.102 pcmk-2.clusterlabs.org pcmk-2 ---- We can now verify the setup by again using ping: [source,C] ---- # ping -c 3 pcmk-2 PING pcmk-2.clusterlabs.org (192.168.122.101) 56(84) bytes of data. 64 bytes from pcmk-1.clusterlabs.org (192.168.122.101): icmp_seq=1 ttl=64 time=0.164 ms 64 bytes from pcmk-1.clusterlabs.org (192.168.122.101): icmp_seq=2 ttl=64 time=0.475 ms 64 bytes from pcmk-1.clusterlabs.org (192.168.122.101): icmp_seq=3 ttl=64 time=0.186 ms --- pcmk-2.clusterlabs.org ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2001ms rtt min/avg/max/mdev = 0.164/0.275/0.475/0.141 ms ---- === Configure SSH === SSH is a convenient and secure way to copy files and perform commands remotely. For the purposes of this guide, we will create a key without a password (using the -N option) so that we can perform remote actions without being prompted. (((SSH))) [WARNING] ========= Unprotected SSH keys, those without a password, are not recommended for servers exposed to the outside world. We use them here only to simplify the demo. ========= Create a new key and allow anyone with that key to log in: .Creating and Activating a new SSH Key [source,C] ---- # ssh-keygen -t dsa -f ~/.ssh/id_dsa -N "" Generating public/private dsa key pair. Your identification has been saved in /root/.ssh/id_dsa. Your public key has been saved in /root/.ssh/id_dsa.pub. The key fingerprint is: 91:09:5c:82:5a:6a:50:08:4e:b2:0c:62:de:cc:74:44 root@pcmk-1.clusterlabs.org The key's randomart image is: +--[ DSA 1024]----+ |==.ooEo.. | |X O + .o o | | * A + | | + . | | . S | | | | | | | | | +-----------------+ # cp .ssh/id_dsa.pub .ssh/authorized_keys ---- (((Creating and Activating a new SSH Key))) Install the key on the other nodes and test that you can now run commands remotely, without being prompted .Installing the SSH Key on Another Host [source,C] ---- # scp -r .ssh pcmk-2: The authenticity of host 'pcmk-2 (192.168.122.102)' can't be established. RSA key fingerprint is b1:2b:55:93:f1:d9:52:2b:0f:f2:8a:4e:ae:c6:7c:9a. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added 'pcmk-2,192.168.122.102' (RSA) to the list of known hosts.root@pcmk-2's password: id_dsa.pub 100% 616 0.6KB/s 00:00 id_dsa 100% 672 0.7KB/s 00:00 known_hosts 100% 400 0.4KB/s 00:00 authorized_keys 100% 616 0.6KB/s 00:00 # ssh pcmk-2 -- uname -n pcmk-2 # ---- == Cluster Software Installation == === Install the Cluster Software === Since version 12, Fedora comes with recent versions of everything you need, so simply fire up a shell on all your nodes and run: [source,C] ---- [ALL] # yum install -y pacemaker pcs ---- Now install the cluster software on the second node. ifdef::pcs[] === Install the Cluster Management Software === The pcs cli command coupled with the pcs daemon creates a cluster management system capable of managing all aspects of the cluster stack across all nodes from a single location. [source,C] ---- [ALL] # yum install -y pcs ---- Make sure to install the pcs packages on both nodes. endif::[] == Setup == ifdef::pcs[] === Enable pcs Daemon === Before the cluster can be configured, the pcs daemon must be started and enabled to boot on startup on each node. This daemon works with the pcs cli command to manage syncing the corosync configuration across all the nodes in the cluster. Start and enable the daemon by issuing the following commands on each node. [source,C] ---- # systemctl start pcsd.service # systemctl enable pcsd.service ---- Now we need a way for `pcs` to talk to itself on other nodes in the cluster. This is necessary in order to perform tasks such as syncing the corosync config, or starting/stopping the cluster on remote nodes While `pcs` can be used locally without setting up these user accounts, this tutorial will make use of these remote access commands, so we will set a password for the 'hacluster' user. Its probably best if password is consistent across all the nodes. As 'root', run: [source,C] ---- # passwd hacluster password: ---- Alternatively, to script this process or set the password on a different machine to the one you're logged into, you can use the `--stdin` option for `passwd`: [source,C] ---- # ssh pcmk-2 -- 'echo redhat1 | passwd --stdin hacluster' ---- endif::[] ifdef::crmsh[] === Preparation - Multicast === Choose a port number and http://en.wikipedia.org/wiki/Multicast[multi-cast] address. http://en.wikipedia.org/wiki/Multicast_address[] Be sure that the values you chose do not conflict with any existing clusters you might have. For this document, I have chosen port '4000' and used '239.255.1.1' as the multi-cast address. endif::[] === Notes on Multicast Address Assignment === There are several subtle points that often deserve consideration when choosing/assigning multicast addresses for corosync. footnote:[This information is borrowed from, the now defunct, http://web.archive.org/web/20101211210054/http://29west.com/docs/THPM/multicast-address-assignment.html] . Avoid '224.0.0.x' + Traffic to addresses of the form '224.0.0.x' is often flooded to all switch ports. This address range is reserved for link-local uses. Many routing protocols assume that all traffic within this range will be received by all routers on the network. Hence (at least all Cisco) switches flood traffic within this range. The flooding behavior overrides the normal selective forwarding behavior of a multicast-aware switch (e.g. IGMP snooping, CGMP, etc.). . Watch for '32:1' overlap + 32 non-contiguous IP multicast addresses are mapped onto each Ethernet multicast address. A receiver that joins a single IP multicast group implicitly joins 31 others due to this overlap. Of course, filtering in the operating system discards undesired multicast traffic from applications, but NIC bandwidth and CPU resources are nonetheless consumed discarding it. The overlap occurs in the 5 high-order bits, so it's best to use the 23 low-order bits to make distinct multicast streams unique. For example, IP multicast addresses in the range '239.0.0.0' to '239.127.255.255' all map to unique Ethernet multicast addresses. However, IP multicast address '239.128.0.0' maps to the same Ethernet multicast address as '239.0.0.0', '239.128.0.1' maps to the same Ethernet multicast address as '239.0.0.1', etc. . Avoid 'x.0.0.y' and 'x.128.0.y' + Combining the above two considerations, it's best to avoid using IP multicast addresses of the form 'x.0.0.y' and 'x.128.0.y' since they all map onto the range of Ethernet multicast addresses that are flooded to all switch ports. . Watch for address assignment conflicts + http://www.iana.org/[IANA] administers http://www.iana.org/assignments/multicast-addresses[Internet multicast addresses]. Potential conflicts with Internet multicast address assignments can be avoided by using http://www.ietf.org/rfc/rfc3180.txt[GLOP addressing] (http://en.wikipedia.org/wiki/Autonomous_system_%28Internet%29[AS] required) or http://www.ietf.org/rfc/rfc2365.txt[administratively scoped] addresses. Such addresses can be safely used on a network connected to the Internet without fear of conflict with multicast sources originating on the Internet. Administratively scoped addresses are roughly analogous to the unicast address space for http://www.ietf.org/rfc/rfc1918.txt[private internets]. Site-local multicast addresses are of the form '239.255.x.y', but can grow down to '239.252.x.y' if needed. Organization-local multicast addresses are of the form '239.192-251.x.y', but can grow down to '239.x.y.z' if needed. For a more detailed treatment (57 pages!), see http://www.cisco.com/en/US/tech/tk828/technologies_white_paper09186a00802d4643.shtml[Cisco's Guidelines for Enterprise IP Multicast Address Allocation] paper. === Configuring Corosync === ifdef::pcs[] In the past, at this point in the tutorial an explanation of how to configure and propagate corosync's /etc/corosync.conf file would be necessary. Using pcs with the pcs daemon greatly simplifies this process by generating 'corosync.conf' across all the nodes in the cluster with a single command. The only thing required to achieve this is to authenticate as the pcs user 'hacluster' on one of the nodes in the cluster, and then issue the 'pcs cluster setup' command with a list of all the node names in the cluster. [source,C] ---- # pcs cluster auth pcmk-1 pcmk-2 Username: hacluster Password: pcmk-1: Authorized pcmk-2: Authorized # pcs cluster setup --name mycluster pcmk-1 pcmk-2 pcmk-1: Succeeded pcmk-2: Succeeded ---- That's it. Corosync is configured across the cluster. If you received an authorization error for either of those commands, make sure you setup the 'hacluster' user account and password on every node in the cluster with the same password. endif::[] ifdef::crmsh[] [IMPORTANT] =========== The instructions below only apply for a machine with a single NIC. If you have a more complicated setup, you should edit the configuration manually. =========== [source,C] ---- # export ais_port=4000 # export ais_mcast=239.255.1.1 ---- Next we automatically determine the hosts address. By not using the full address, we make the configuration suitable to be copied to other nodes. [source,Bash] ---- export ais_addr=`ip addr | grep "inet " | tail -n 1 | awk '{print $4}' | sed s/255/0/g` ---- Display and verify the configuration options [source,Bash] ---- # env | grep ais_ ais_mcast=239.255.1.1 ais_port=4000 ais_addr=192.168.122.0 ---- Once you're happy with the chosen values, update the Corosync configuration [source,C] ---- # cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf # sed -i.bak "s/.*mcastaddr:.*/mcastaddr:\ $ais_mcast/g" /etc/corosync/corosync.conf # sed -i.bak "s/.*mcastport:.*/mcastport:\ $ais_port/g" /etc/corosync/corosync.conf # sed -i.bak "s/.*\tbindnetaddr:.*/bindnetaddr:\ $ais_addr/g" /etc/corosync/corosync.conf ---- Lastly, you'll need to enable quorum [source,Bash] ----- cat << END >> /etc/corosync/corosync.conf quorum { provider: corosync_votequorum expected_votes: 2 } END ----- endif::[] The final /etc/corosync.conf configuration on each node should look something like the sample in Appendix B, Sample Corosync Configuration. [IMPORTANT] =========== Pacemaker used to obtain membership and quorum from a custom Corosync plugin. This plugin also had the capability to start Pacemaker automatically when Corosync was started. Neither behavior is possible with Corosync 2.0 and beyond as support for plugins was removed. Instead, Pacemaker must be started as a separate service. Also, since Pacemaker made use of the plugin for message routing, a node using the plugin (Corosync prior to 2.0) cannot talk to one that isn't (Corosync 2.0+). Rolling upgrades between these versions are therefor not possible and an alternate strategy footnote:[http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ap-upgrade.html] must be used. =========== ifdef::crmsh[] === Propagate the Configuration === Now we need to copy the changes so far to the other node: [source,C] ---- # for f in /etc/corosync/corosync.conf /etc/hosts; do scp $f pcmk-2:$f ; done corosync.conf 100% 1528 1.5KB/s 00:00 hosts 100% 281 0.3KB/s 00:00 # ---- endif::[]