diff --git a/cts/README b/cts/README
index b2ff427de5..d4d564f7e1 100644
--- a/cts/README
+++ b/cts/README
@@ -1,138 +1,142 @@
 PACEMAKER CLUSTER TEST SUITE (CTS)
 
 Purpose
 -------
 CTS thoroughly exercises a pacemaker test cluster by running a randomized
 series of predefined tests on the cluster. CTS can be run against a
 pre-existing cluster configuration or (more typically) overwrite the
 existing configuration with a test configuration.
 
 
 Requirements
 ------------
 * Three or more machines (one test exerciser and two or more test
   cluster machines).
 
 * The test cluster machines should be on the same subnet and have
   journalling filesystems (ext3, ext4, xfs, etc.) for all of their
   filesystems other than /boot. You also need a number of free IP
   addresses on that subnet if you intend to test mutual IP address
   takeover.
 
 * The test exerciser machine doesn't need to be on the same subnet as the
   test cluster machines. Minimal demands are made on the exerciser
   machine - it just has to stay up during the tests.
 
 * It helps a lot in tracking problems if all machines' clocks are closely
   synchronized. NTP does this automatically, but you can do it by hand if
   you want.
 
 * The exerciser needs to be able to ssh over to the cluster nodes as root
   without a password challenge. Configure ssh accordingly (see the
   Mini-HOWTO at the end of this document for more details).
 
 * The exerciser needs to be able to resolve the machine names of the
   test cluster - either by DNS or by /etc/hosts.
-  
+
 
 Preparation
 -----------
 Install Pacemaker (including CTS) on all machines. These scripts are
 coordinated with particular versions of Pacemaker, so you need the same
 version of CTS as the rest of Pacemaker, and you need the same version
 of pacemaker and CTS on both the test exerciser and the test cluster
 machines.
 
 Configure cluster communications (Corosync, CMAN or Heartbeat) on the
 cluster machines and verify everything works.
 
 NOTE: Do not run the cluster on the test exerciser machine.
 
 NOTE: Wherever machine names are mentioned in these configuration files,
 they must match the machines' `uname -n` name. This may or may not match
 the machines' FQDN (fully qualified domain name) - it depends on how
-you (and your OS) have named the machines. 
+you (and your OS) have named the machines.
 
 
 Run CTS
 -------
 Now assuming you did all this, what you need to do is run CTSlab.py:
 
     python ./CTSlab.py [options] number-of-tests-to-run
 
 You must specify which nodes are part of the cluster with --nodes, e.g.:
 
-    --node "pcmk-1 pcmk-2 pcmk-3" 
+    --nodes "pcmk-1 pcmk-2 pcmk-3"
 
 Most people will want to save the output with --outputfile, e.g.:
 
-    --outputfile ~/cts.log 
+    --outputfile ~/cts.log
 
 Unless you want to test your pre-existing cluster configuration, you
 also want:
 
-    --clobber-cib 
-    --populate-resources 
-    --test-ip-base $IP    # e.g. --test-ip-base 192.168.9.100 
+    --clobber-cib
+    --populate-resources
+    --test-ip-base $IP    # e.g. --test-ip-base 192.168.9.100
 
 and configure some sort of fencing:
 
-    --stonith $TYPE    # e.g. "--stonith rhcs" to use fence_xvm or "--stonith lha" to use external/ssh 
+    --stonith $TYPE    # e.g. "--stonith rhcs" to use fence_xvm or "--stonith lha" to use external/ssh
"--stonith rhcs" to use fence_xvm or "--stonith lha" to use external/ssh A complete command line might look like: - - python ./CTSlab.py --nodes "pcmk-1 pcmk-2 pcmk-3" --outputfile ~/cts.log \ - --clobber-cib --populate-resources --test-ip-base 192.168.9.100 \ - --stonith rhcs 50 + + python ./CTSlab.py --nodes "pcmk-1 pcmk-2 pcmk-3" --outputfile ~/cts.log \ + --clobber-cib --populate-resources --test-ip-base 192.168.9.100 \ + --stonith rhcs 50 For more options, use the --help option. +NOTE: Perhaps more convenient way to compile a command line like above + is to use cluster_test script that, at least in the source repository, + sits in the same directory as this very file. + To extract the result of a particular test, run: - crm_report -T $test + crm_report -T $test Mini-HOWTO: Allow passwordless remote SSH connections ----------------------------------------------------- The CTS scripts run "ssh -l root" so you don't have to do any of your testing logged in as root on the test machine. Here is how to allow such connections without requiring a password to be entered each time: * On your test exerciser, create an SSH key if you do not already have one. Most commonly, SSH keys will be in your ~/.ssh directory, with the private key file not having an extension, and the public key file named the same with the extension ".pub" (for example, ~/.ssh/id_dsa.pub). If you don't already have a key, you can create one with: - ssh-keygen -t dsa + ssh-keygen -t dsa * From your test exerciser, authorize your SSH public key for root on all test machines (both the exerciser and the cluster test machines): - ssh-copy-id -i ~/.ssh/id_dsa.pub root@$MACHINE + ssh-copy-id -i ~/.ssh/id_dsa.pub root@$MACHINE You will probably have to provide your password, and possibly say "yes" to some questions about accepting the identity of the test machines. The above assumes you have a DSA SSH key in the specified location; if you have some other type of key (RSA, ECDSA, etc.), use its file name in the -i option above. If you have an old version of SSH that doesn't have ssh-copy-id, you can take the single line out of your public key file (e.g. ~/.ssh/identity.pub or ~/.ssh/id_dsa.pub) and manually add it to root's ~/.ssh/authorized_keys file on each test machine. * To test, try this command from the exerciser machine for each of your cluster machines, and for the exerciser machine itself. - ssh -l root $MACHINE + ssh -l root $MACHINE If this works without prompting for a password, you're in business. If not, look at the documentation for your version of ssh. diff --git a/cts/cluster_test b/cts/cluster_test index 27be87bad4..ff035710f5 100755 --- a/cts/cluster_test +++ b/cts/cluster_test @@ -1,184 +1,166 @@ #!/bin/bash -anyAsked=0 if [ -e ~/.cts ]; then . ~/.cts fi +anyAsked=0 -CTS_master=`uname -n` -CTS_numtests=$1 +[ $# -lt 1 ] || CTS_numtests=$1 +die() { echo "$@"; exit 1; } -if [ "x$CTS_master" = "x" ]; then +if [ -z "$CTS_asked_once" ]; then anyAsked=1 - printf "This script should only be executed on the test master.\n" - printf "The test master will remotely execute the actions required by the tests and should not be part of the cluster itself.\n" + echo "This script should only be executed on the test master." + echo "The test master will remotely execute the actions required by the tests and should not be part of the cluster itself." - read -p "Is this host intended to be the test master? 
(yN)" CTS_master - if [ "x$CTS_master" != "xy" ]; then - printf "This script must be executed on the test master\n" - exit 1 - fi + read -p "Is this host intended to be the test master? (yN) " doUnderstand + [ "$doUnderstand" = "y" ] \ + || die "This script must be executed on the test master" fi -if [ "x$CTS_node_list" = "x" ]; then +if [ -z "$CTS_node_list" ]; then anyAsked=1 read -p "Please list your cluster nodes (eg. node1 node2 node3): " CTS_node_list - else - printf "Beginning test of cluster: $CTS_node_list\n" + echo "Beginning test of cluster: $CTS_node_list" fi -if [ "x$CTS_stack" = "x" ]; then +if [ -z "$CTS_stack" ]; then anyAsked=1 read -p "Which cluster stack are you using? ([corosync], openais, or heartbeat): " CTS_stack - if [ -z $CTS_stack ]; then - CTS_stack=corosync - fi - + [ -n "$CTS_stack" ] || CTS_stack=corosync else - printf "Using the $CTS_stack cluster stack\n" + echo "Using the $CTS_stack cluster stack" fi -tmp=`echo ${CTS_node_list} | sed s/$HOSTNAME//` -if [ "x${CTS_node_list}" != "x${tmp}" ]; then - printf "This script must be executed on the test master and the test master cannot be part of the cluster\n" - exit 1 -fi +[ "${CTS_node_list}" = "${CTS_node_list/$HOSTNAME/}" ] \ + || die "This script must be executed on the test master and the test master cannot be part of the cluster" printf "+ Bootstraping ssh... " -if [ -z $SSH_AUTH_SOCK ]; then +if [ -z "$SSH_AUTH_SOCK" ]; then printf "\n + Initializing SSH " - agent_tmp=/tmp/.$$.ssh - ssh-agent > $agent_tmp - . $agent_tmp - rm $agent_tmp - printf " + Adding identities...\n" + eval "$(ssh-agent)" + echo " + Adding identities..." ssh-add rc=$? - if [ $rc != 0 ]; then - printf " -- No identities added\n" + if [ $rc -ne 0 ]; then + echo " -- No identities added" printf "\nThe ability to open key-based 'ssh' connections (as the user 'root') is required to use CTS.\n" - read -p " - Do you want this program to help you create one? (yN)" auto_fix - if [ "x$auto_fix" = "xy" ]; then + read -p " - Do you want this program to help you create one? (yN) " auto_fix + if [ "$auto_fix" = "y" ]; then ssh-keygen -t dsa ssh-add else - printf "Please run 'ssh-keygen -t dsa' to create a new key\n" - exit 1 + die "Please run 'ssh-keygen -t dsa' to create a new key" fi fi else - printf "OK\n" + echo "OK" fi test_ok=1 printf "+ Testing ssh configuration... " for n in $CTS_node_list; do - ssh -l root -o PasswordAuthentication=no -o ConnectTimeout=5 $n /bin/true + ssh -l root -o PasswordAuthentication=no -o ConnectTimeout=5 "$n" /bin/true rc=$? - if [ $rc != 0 ]; then - printf "\n - connection to $n failed" + if [ $rc -ne 0 ]; then + echo " - connection to $n failed" test_ok=0 fi done -if [ $test_ok = 0 ]; then - printf "\n\nThe ability to open key-based 'ssh' connections (as the user 'root') is required to use CTS.\n" - printf " Please install one of your SSH public keys to root's account on all cluster nodes\n" - - # todo - look for identities and guide the installation of one - - exit 1 +if [ $test_ok -eq 0 ]; then + printf "\nThe ability to open key-based 'ssh' connections (as the user 'root') is required to use CTS.\n" + + read -p " - Do you want this program to help you with such a setup? (yN) " auto_fix + if [ "$auto_fix" = "y" ]; then + # XXX are we picking the most suitable identity? 
+        privKey=$(ssh-add -L | head -n1 | cut -d" " -f3)
+        sshCopyIdOpts="-o User=root"
+        [ -z "$privKey" ] || sshCopyIdOpts+=" -i \"${privKey}.pub\""
+        for n in $CTS_node_list; do
+            eval "ssh-copy-id $sshCopyIdOpts \"${n}\"" \
+                || die "Attempt to 'ssh-copy-id $sshCopyIdOpts \"$n\"' failed"
+        done
+    else
+        die "Please install one of your SSH public keys to root's account on all cluster nodes"
+    fi
 fi
 
-printf "OK\n"
+echo "OK"
 
-if [ -z $CTS_logfile ]; then
+if [ -z "$CTS_logfile" ]; then
     anyAsked=1
     read -p " + Where does/should syslog store logs from remote hosts? (/var/log/messages) " CTS_logfile
-    if [ "x$CTS_logfile" = "x" ]; then
-        CTS_logfile=/var/log/messages
-    fi
+    [ -n "$CTS_logfile" ] || CTS_logfile=/var/log/messages
 fi
 
-if [ ! -e $CTS_logfile ]; then
-    printf "$CTS_logfile doesn't exist\n"
-    exit 1
-fi
+[ -e "$CTS_logfile" ] || die "$CTS_logfile doesn't exist"
 
-if [ -z $CTS_logfacility ]; then
+if [ -z "$CTS_logfacility" ]; then
     anyAsked=1
     read -p " + Which log facility does the cluster use? (daemon) " CTS_logfacility
-    if [ "x$CTS_logfacility" = "x" ]; then
-        CTS_logfacility=daemon
-    fi
+    [ -n "$CTS_logfacility" ] || CTS_logfacility=daemon
 fi
 
-if [ -z $CTS_boot ]; then
+if [ -z "$CTS_boot" ]; then
     read -p "+ Is the cluster software started automatically when a node boots? [yN] " CTS_boot
-    if [ -z $CTS_boot ]; then
+    if [ -z "$CTS_boot" ]; then
         CTS_boot=0
     else
         case $CTS_boot in
             1|y|Y) CTS_boot=1;;
             *) CTS_boot=0;;
         esac
     fi
 fi
 
-if [ -z $CTS_numtests ]; then
+if [ -z "$CTS_numtests" ]; then
     read -p "+ How many test iterations should be performed? (500) " CTS_numtests
-    if [ -z $CTS_numtests ]; then
-        CTS_numtests=500
-    fi
+    [ -n "$CTS_numtests" ] || CTS_numtests=500
 fi
 
-if [ -z $CTS_asked_once ]; then
+if [ -z "$CTS_asked_once" ]; then
     anyAsked=1
     read -p "+ What type of STONITH agent do you use? (none) " CTS_stonith
-    if [ "x$CTS_stonith" != "x" ]; then
-        read -p "+ List any STONITH agent parameters (eq. device_host=switch.power.com): " CTS_stonith_args
-    fi
-
-    if [ -z $CTS_adv ]; then
-        read -p "+ (Advanced) Any extra CTS parameters? (none) " CTS_adv
-    fi
-fi
-
-if [ $anyAsked = 1 ]; then
-    read -p "+ Save values to ~/.cts for next time? (yN) " doSave
-fi
-
-if [ "x$doSave" = "xy" ]; then
-    echo "# CTS Test data" > ~/.cts
-    echo CTS_master=\"$CTS_master\" >> ~/.cts
-    echo CTS_stack=\"$CTS_stack\" >> ~/.cts
-    echo CTS_node_list=\"$CTS_node_list\" >> ~/.cts
-    echo CTS_logfile=\"$CTS_logfile\" >> ~/.cts
-    echo CTS_logport=$CTS_logport >> ~/.cts
-    echo CTS_logfacility=$CTS_logfacility >> ~/.cts
-    echo CTS_asked_once=1 >> ~/.cts
-    echo CTS_adv=\"$CTS_adv\" >> ~/.cts
-    echo CTS_stonith=$CTS_stonith >> ~/.cts
-    echo CTS_stonith_args=\"$CTS_stonith_args\" >> ~/.cts
-    echo CTS_boot=\"$CTS_boot\" >> ~/.cts
+    [ -z "$CTS_stonith" ] \
+        || read -p "+ List any STONITH agent parameters (e.g. device_host=switch.power.com): " CTS_stonith_args
+    [ -n "$CTS_adv" ] \
+        || read -p "+ (Advanced) Any extra CTS parameters? (none) " CTS_adv
+fi
+
+[ $anyAsked -eq 0 ] \
+    || read -p "+ Save values to ~/.cts for next time? (yN) " doSave
(yN) " doSave + +if [ "$doSave" = "y" ]; then + cat > ~/.cts <<-EOF + # CTS Test data + CTS_stack="$CTS_stack" + CTS_node_list="$CTS_node_list" + CTS_logfile="$CTS_logfile" + CTS_logport="$CTS_logport" + CTS_logfacility="$CTS_logfacility" + CTS_asked_once=1 + CTS_adv="$CTS_adv" + CTS_stonith="$CTS_stonith" + CTS_stonith_args="$CTS_stonith_args" + CTS_boot="$CTS_boot" +EOF fi cts_extra="" -if [ "x$CTS_stonith" != "x" ]; then +if [ -n "$CTS_stonith" ]; then cts_extra="$cts_extra --stonith-type $CTS_stonith" - if [ "x$CTS_stonith_args" != "x" ]; then - cts_extra="$cts_extra --stonitha-params \"$CTS_stonith_args\"" - fi + [ -z "$CTS_stonith_args" ] \ + || cts_extra="$cts_extra --stonitha-params \"$CTS_stonith_args\"" else cts_extra="$cts_extra --stonith 0" - printf " - Testing a cluster without STONITH is like a blunt pencil... pointless\n" + echo " - Testing a cluster without STONITH is like a blunt pencil... pointless" fi -printf "\nAll set to go for $CTS_numtests iterations!\n" -if [ $anyAsked = 0 ]; then - printf "+ To use a different configuration, remove ~/.cts and re-run cts (or edit it manually).\n" -fi +printf "\nAll set to go for %d iterations!\n" "$CTS_numtests" +[ $anyAsked -ne 0 ] \ + || echo "+ To use a different configuration, remove ~/.cts and re-run cts (or edit it manually)." echo Now paste the following command into this shell: -echo python "`dirname "$0"`"/CTSlab.py -L $CTS_logfile --syslog-facility $CTS_logfacility --no-unsafe-tests --stack $CTS_stack $CTS_adv --at-boot $CTS_boot $cts_extra $CTS_numtests --nodes \"$CTS_node_list\" +echo "python `dirname "$0"`/CTSlab.py -L \"$CTS_logfile\" --syslog-facility \"$CTS_logfacility\" --no-unsafe-tests --stack \"$CTS_stack\" $CTS_adv --at-boot \"$CTS_boot\" $cts_extra \"$CTS_numtests\" --nodes \"$CTS_node_list\""