diff --git a/tools/README.hb_report b/tools/README.hb_report
index 043898184c..ed6fef4c96 100644
--- a/tools/README.hb_report
+++ b/tools/README.hb_report
@@ -1,297 +1,305 @@
 Heartbeat reporting
 ===================
 Dejan Muhamedagic
 v1.0
 
 `hb_report` is a utility to collect all information relevant to
 Heartbeat over the given period of time.
 
 Quick start
 -----------
 
 Run `hb_report` on one of the nodes or on the host which serves
 as a central log server. Run `hb_report` without parameters to
 see usage. A few examples:
 
 1. Last night during the backup there were several warnings
 encountered (logserver is the log host):
 +
 	logserver# hb_report -f 3:00 -t 4:00 /tmp/report
 +
 collects everything from all nodes from 3am to 4am last night.
 The files are stored in /tmp/report and compressed to a tarball
 /tmp/report.tar.gz.
 
 2. Just found a problem during testing:
 
 	node1# date : note the current time
 	node1# /etc/init.d/heartbeat start
 	node1# nasty_command_that_breaks_things
 	node1# sleep 120 : wait for the cluster to settle
 	node1# hb_report -f time /tmp/hb1
 
 Introduction
 ------------
 
 Managing clusters is cumbersome. Heartbeat v2 with its numerous
 configuration files and multi-node clusters just adds to the
 complexity. No wonder then that most problem reports were less
 than optimal. This is an attempt to rectify that situation and
 make life easier for both the users and the developers.
 
 On security
 -----------
 
 `hb_report` is a fairly complex program. As some of you are
-probably going to run it as root let us state a few important
+probably going to run it as `root` let us state a few important
 things you should keep in mind:
 
-1. Don't run `hb_report` as root! It is fairly simple to setup
+1. Don't run `hb_report` as `root`! It is fairly simple to setup
 things in such a way that root access is not needed. I won't go
 into details, just to stress that all information collected
 should be readable by accounts belonging to the haclient group.
 
 2. If you still have to run this as root, well, don't use the
 `-C` option.
 
 3. Of course, every possible precaution has been taken not to
 disturb processes, or touch or remove files out of the given
 destination directory. If you (by mistake) specify an existing
 directory, `hb_report` will bail out soon. If you specify a
-relative path, it won't work either. The final product of
-`hb_report` is a tarball. However, the destination directory is
-not removed on any node, unless the user specifies `-C`. If you're
-too lazy to cleanup the previous run, do yourself a favour and
-just supply a new destination directory. You've been warned. If
-you worry about the space used, just put all your directories
-under /tmp and setup a cronjob to remove those directories once a
-week:
+relative path, it won't work either.
+
+The final product of `hb_report` is a tarball. However, the
+destination directory is not removed on any node, unless the user
+specifies `-C`. If you're too lazy to cleanup the previous run,
+do yourself a favour and just supply a new destination directory.
+You've been warned. If you worry about the space used, just put
+all your directories under `/tmp` and setup a cronjob to remove
+those directories once a week:
 
 ..........
 	for d in /tmp/*; do
 		test -d $d || continue
 		test -f $d/description.txt ||
 			test -f $d/.env || continue
 		grep -qs 'By: hb_report' $d/description.txt ||
 			grep -qs '^UNIQUE_MSG=Mark' $d/.env || continue
 		rm -r $d
 	done
 ..........
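
If you take that advice, the loop above can be dropped into a small
script and run from cron. The schedule and the script path below are
only an illustration (an /etc/crontab style entry, Sundays at 4:30am):

..........
	# /usr/local/sbin/rm_hb_reports is a hypothetical script
	# containing the cleanup loop shown above
	30 4 * * 0  root  /usr/local/sbin/rm_hb_reports
..........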
 
 Mode of operation
 -----------------
 
 Cluster data collection is straightforward: just run the same
 procedure on all nodes and collect the reports. There is, apart
 from many small ones, one large complication: central syslog
 destination. So, in order to allow this to be fully automated, we
 should sometimes run the procedure on the log host too. Actually,
 if there is a log host, then the best way is to run `hb_report`
 there.
 
-We use ssh for the remote program invocation. Even though it is
+We use `ssh` for the remote program invocation. Even though it is
 possible to run `hb_report` without ssh by doing a more menial
 job, the overall user experience is much better if ssh works.
 Anyway, how else do you manage your cluster?
 
 Another ssh related point: In case your security policy proscribes
 loghost-to-cluster-over-ssh communications, then you'll have to
 copy the log file to one of the nodes and point `hb_report` to it.
 
 Prerequisites
 -------------
 
 1. ssh
 +
 This is not strictly required, but you won't regret having a
 password-less ssh. It is not too difficult to setup and will save
 you a lot of time. If you can't have it, for example because your
 security policy does not allow such a thing, or you just prefer
 menial work, then you will have to resort to the semi-manual
 semi-automated report generation. See below for instructions.
++
+If you need to supply a password for your passphrase/login, then
+please use the `-u` option.
 
 2. Times
 +
 In order to find files and messages in the given period and to
 parse the `-f` and `-t` options, `hb_report` uses perl and one of
 the `Date::Parse` or `Date::Manip` perl modules. Note that you need
-only one of these.
+only one of these. Furthermore, on nodes which have no logs and
+where you don't run `hb_report` directly, no date parsing is
+necessary. In other words, if you run this on a loghost then you
+don't need these perl modules on the cluster nodes.
 +
 On rpm based distributions, you can find `Date::Parse` in
 `perl-TimeDate` and on Debian and its derivatives in
 `libtimedate-perl`.
 
 3. Core dumps
 +
-To backtrace core dumps gdb is needed and the Heartbeat packages
+To backtrace core dumps `gdb` is needed and the Heartbeat packages
 with the debugging info. The debug info packages may be installed
 at the time the report is created. Let's hope that you will need
 this really seldom.
 
 What is in the report
 ---------------------
 
 1. Heartbeat related
 - heartbeat version/release information
 - heartbeat configuration (CIB, ha.cf, logd.cf)
 - heartbeat status (output from crm_mon, crm_verify, ccm_tool)
 - pengine transition graphs (if any)
 - backtraces of core dumps (if any)
 - heartbeat logs (if any)
 
 2. System related
 - general platform information (`uname`, `arch`, `distribution`)
-- system statistics (`uptime`, `top`, `ps`)
+- system statistics (`uptime`, `top`, `ps`, `netstat -i`, `arp`)
 
 3. User created :)
 - problem description (template to be edited)
 
 4. Generated
 - problem analysis (generated)
 
 It is preferred that Heartbeat is running at the time of the
 report, but it is not absolutely required. `hb_report` will also
 do a quick analysis of the collected information.
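
Once you have a report tarball in hand, a reasonable first look is
the analysis and the problem description; the tarball name and the
unpack location below are just examples:

	# tar xzf report.tar.gz
	# less report/description.txt report/analysis.txt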
 
 Times
 -----
 
 Specifying times can at times be a nuisance. That is why we have
 chosen to use one of the perl modules--they do allow certain
 freedom when talking dates. You can either read the instructions
 at the
 http://search.cpan.org/dist/TimeDate/lib/Date/Parse.pm#EXAMPLE_DATES[Date::Parse examples page],
 or just rely on common sense and try stuff like:
 
 	3:00		(today at 3am)
 	15:00		(today at 3pm)
 	2007/9/1 2pm	(September 1st at 2pm)
 
 `hb_report` will (probably) complain if it can't figure out what
 you mean. Try to delimit the event as close as possible in order
 to reduce the size of the report, but still leave a minute or two
 around for good measure. Note that `-f` is not an optional
 option. And don't forget to quote dates when they contain spaces.
 
 Should I send all this to the rest of Internet?
 -----------------------------------------------
 
 We make an effort to remove sensitive data from the Heartbeat
 configuration (CIB, ha.cf, and transition graphs). However, you
 _have_ to tell us what is sensitive! Use the `-p` option to
 specify additional regular expressions to match variable names
 which may contain information you don't want to leak. For example:
 
 	# hb_report -f 18:00 -p "user.*" -p "secret.*" /var/tmp/report
 
 We look by default for variable names matching "pass.*" and the
 stonith_host ha.cf directive. Logs and other files are not
 filtered. Please filter them yourself if necessary.
 
 Logs
 ----
 
 It may be tricky to find syslog logs. The scheme used is to log a
 unique message on all nodes and then look it up in the usual
 syslog locations. This procedure is not foolproof, in particular
 if the syslog files are in a non-standard directory. We look in
 /var/log /var/logs /var/syslog /var/adm /var/log/ha
 /var/log/cluster. In case we can't find the logs, please supply
 their location:
 
 	# hb_report -f 5pm -l /var/log/cluster1/ha-log -S /tmp/report_node1
 
 If you have different log locations on different nodes, well,
-perhaps you'd like to make them the same. Or read about the
-manual report collection.
+perhaps you'd like to make them the same and make life easier for
+everybody.
 
 The log files are collected from all hosts where found. In case
 your syslog is configured to log to both the log server and local
 files and `hb_report` is run on the log server you will end up
 with multiple logs with the same content.
 
 Files starting with "ha-" are preferred. In case syslog sends
 messages to more than one file, if one of them is named ha-log or
 ha-debug those will be favoured over syslog or messages. If there
 is no separate log for Heartbeat, possibly unrelated messages from
 other programs are included. We don't filter logs, just pick a
 segment for the period you specified.
 
 NB: Don't have a central log host? Read the CTS README and set
 one up.
 
 Manual report collection
 ------------------------
 
 So, your ssh doesn't work. In that case, you will have to run
 this procedure on all nodes. Use `-S` so that we don't bother
 with ssh:
 
 	# hb_report -f 5:20pm -t 5:30pm -S /tmp/report_node1
 
 If you also have a log host which is not in the cluster, then
 you'll have to copy the log to one of the nodes and tell us where
 it is:
 
 	# hb_report -f 5:20pm -t 5:30pm -l /var/tmp/ha-log -S /tmp/report_node1
 
 Furthermore, to prevent `hb_report` from asking you to edit the
 report to describe the problem on every node use `-D` on all but
 one:
 
 	# hb_report -f 5:20pm -t 5:30pm -DS /tmp/report_node1
 
 If you reconsider and want the ssh setup, take a look at the CTS
 README file for instructions.
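
If you stick with the manual procedure, the individual steps above
combine like this on, say, a three node cluster (host names are only
an illustration; afterwards bring the resulting directories to one
place for review by whatever transfer method your policy permits):

	node1# hb_report -f 5:20pm -t 5:30pm -S /tmp/report_node1
	node2# hb_report -f 5:20pm -t 5:30pm -DS /tmp/report_node2
	node3# hb_report -f 5:20pm -t 5:30pm -DS /tmp/report_node3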
 
 Analysis
 --------
 
 The point of analysis is to get out the most important
 information from probably several thousand lines worth of text.
 Perhaps this should more properly be named report review as it is
 rather simple, but let's pretend that we are doing something
 utterly sophisticated.
 
 The analysis consists of the following:
 
 - compare files coming from different nodes; if they are equal,
   make one copy in the top level directory, remove duplicates,
   and create soft links instead
 - print errors, warnings, and lines matching `-L` patterns from logs
 - report if there were coredumps and by whom
 - report crm_verify results
 
 The goods
 ---------
 
 1. Common
 +
 - ha-log (if found on the log host)
 - description.txt (template and user report)
 - analysis.txt
 
 2. Per node
 +
 - ha.cf
 - logd.cf
 - ha-log (if found)
 - cib.xml (`cibadmin -Ql` or `cp` if Heartbeat is not running)
 - ccm_tool.txt (`ccm_tool -p`)
 - crm_mon.txt (`crm_mon -1`)
 - crm_verify.txt (`crm_verify -V`)
 - pengine/ (only on DC, directory with pengine transitions)
 - sysinfo.txt (static info)
 - sysstats.txt (dynamic info)
 - backtraces.txt (if coredumps found)
 - DC (well...)
+- RUNNING or STOPPED
diff --git a/tools/hb_report.in b/tools/hb_report.in
index c02a3df378..f4ee7fbee9 100755
--- a/tools/hb_report.in
+++ b/tools/hb_report.in
@@ -1,608 +1,663 @@
 #!/bin/sh
 # Copyright (C) 2007 Dejan Muhamedagic
 #
 # This program is free software; you can redistribute it and/or
 # modify it under the terms of the GNU General Public
 # License as published by the Free Software Foundation; either
 # version 2.1 of the License, or (at your option) any later version.
 #
 # This software is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 # General Public License for more details.
 #
 # You should have received a copy of the GNU General Public
 # License along with this library; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 #
 
 . @sysconfdir@/ha.d/shellfuncs
 . $HA_NOARCHBIN/utillib.sh
 
 PROG=`basename $0`
 # FIXME: once this is part of the package!
 PROGDIR=`dirname $0`
 echo "$PROGDIR" | grep -qs '^/' || {
 	test -f @sbindir@/$PROG && PROGDIR=@sbindir@
 	test -f $HA_NOARCHBIN/$PROG && PROGDIR=$HA_NOARCHBIN
 }
 
 LOGD_CF=`findlogdcf @sysconfdir@ $HA_DIR`
 export LOGD_CF
 
-: ${SSH_OPTS="-T -o Batchmode=yes"}
+: ${SSH_OPTS="-T"}
 LOG_PATTERNS="CRIT: ERROR:"
 
 #
 # the instance where user runs hb_report is the master
 # the others are slaves
 #
 if [ x"$1" = x__slave ]; then
 	SLAVE=1
 fi
 #
 # if this is the master, allow ha.cf and logd.cf in the current dir
 # (because often the master is the log host)
 #
 if [ "$SLAVE" = "" ]; then
 	[ -f ha.cf ] && HA_CF=ha.cf
 	[ -f logd.cf ] && LOGD_CF=logd.cf
 fi
 
usage() { cat< $DESTDIR/.env"
+	ssh $ssh_opts $node "mkdir -p $DESTDIR; cat > $DESTDIR/.env"
 	done
 }
 start_remote_collectors() {
 	for node in `getnodes`; do
 		[ "$node" = "$WE" ] && continue
-		ssh $SSH_OPTS $SSH_USER@$node "$PROGDIR/hb_report __slave $DESTDIR" |
+		ssh $ssh_opts $node "$PROGDIR/hb_report __slave $DESTDIR" |
 			(cd $DESTDIR && tar xf -) &
 		SLAVEPIDS="$SLAVEPIDS $!"
 	done
 }
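
# To illustrate what start_remote_collectors() sets up, the transfer for
# one slave is roughly equivalent to running by hand (node2 and
# /tmp/report are only stand-ins for a real node name and destination):
#
#	ssh node2 "hb_report __slave /tmp/report" | (cd /tmp/report && tar xf -)
#
# i.e. each slave writes its per-node subdirectory as a tar stream to
# stdout and the master unpacks it under the same destination directory.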
 
 #
 # does ssh work?
 #
-findsshuser() {
-	for n in `getnodes`; do
-		[ "$node" = "$WE" ] && continue
-		trysshusers $n $TRY_SSH && break
-	done
+testsshuser() {
+	if [ "$2" ]; then
+		ssh -T -o Batchmode=yes $2@$1 true 2>/dev/null
+	else
+		ssh -T -o Batchmode=yes $1 true 2>/dev/null
+	fi
 }
-checkssh() {
-	for n in `getnodes`; do
-		[ "$node" = "$WE" ] && continue
-		checksshuser $n $SSH_USER || return 1
+findsshuser() {
+	for u in "" $TRY_SSH; do
+		rc=0
+		for n in `getnodes`; do
+			[ "$node" = "$WE" ] && continue
+			testsshuser $n $u || {
+				rc=1
+				break
+			}
+		done
+		if [ $rc -eq 0 ]; then
+			echo $u
+			return 0
+		fi
 	done
-	return 0
+	return 1
 }
 
 #
 # the usual stuff
 #
 getbacktraces() {
 	flist=`find_files $HA_VARLIB/cores $1 $2`
 	[ "$flist" ] && getbt $flist > $3
 }
 getpeinputs() {
 	n=`basename $3`
 	flist=$(
 		if [ -f $3/ha-log ]; then
 			grep " $n peng.*PEngine Input stored" $3/ha-log | awk '{print $NF}'
 		else
 			find_files $HA_VARLIB/pengine $1 $2
 		fi | sed "s,$HA_VARLIB/,,g"
 	)
 	[ "$flist" ] &&
 		(cd $HA_VARLIB && tar cf - $flist) | (cd $3 && tar xf -)
 }
 touch_DC_if_dc() {
 	dc=`crmadmin -D 2>/dev/null | awk '{print $NF}'`
 	if [ "$WE" = "$dc" ]; then
 		touch $1/DC
 	fi
 }
 
 #
 # some basic system info and stats
 #
 sys_info() {
 	echo "Heartbeat version: `hb_ver`"
 	crm_info
 	echo "Platform: `uname`"
 	echo "Kernel release: `uname -r`"
 	echo "Architecture: `arch`"
 	[ `uname` = Linux ] && echo "Distribution: `distro`"
 }
 sys_stats() {
 	set -x
 	uptime
 	ps axf
 	ps auxw
 	top -b -n 1
 	netstat -i
+	arp -an
 	set +x
 }
 
 #
 # replace sensitive info with '****'
 #
 sanitize() {
 	for f in $1/ha.cf $1/cib.xml $1/pengine/*; do
 		[ -f "$f" ] && sanitize_one $f
 	done
 }
 
 #
 # remove duplicates if files are same, make links instead
 #
 consolidate() {
 	for n in `getnodes`; do
 		if [ -f $1/$2 ]; then
 			rm $1/$n/$2
 		else
 			mv $1/$n/$2 $1
 		fi
 		ln -s ../$2 $1/$n
 	done
 }
 
 #
 # some basic analysis of the report
 #
 checkcrmvfy() {
 	for n in `getnodes`; do
 		if [ -s $1/$n/crm_verify.txt ]; then
 			echo "WARN: crm_verify reported warnings at $n:"
 			cat $1/$n/crm_verify.txt
 		fi
 	done
 }
 checkbacktraces() {
 	for n in `getnodes`; do
 		[ -s $1/$n/backtraces.txt ] && {
 			echo "WARN: coredumps found at $n:"
 			egrep 'Core was generated|Program terminated' \
 				$1/$n/backtraces.txt | sed 's/^/ /'
 		}
 	done
 }
 checklogs() {
 	logs=`find $1 -name ha-log`
 	[ "$logs" ] || return
 	pattfile=`maketempfile` ||
 		fatal "cannot create temporary files"
 	for p in $LOG_PATTERNS; do
 		echo "$p"
 	done > $pattfile
 	echo ""
 	echo "Log patterns:"
 	for n in `getnodes`; do
 		cat $logs | grep -f $pattfile
 	done
 	rm -f $pattfile
 }
 
 #
 # check if files have same content in the cluster
 #
 cibdiff() {
-	crm_diff -c -n $1 -o $2
+	d1=`dirname $1`
+	d2=`dirname $2`
+	if [ -f $d1/RUNNING -a -f $d2/RUNNING ] ||
+		[ -f $d1/STOPPED -a -f $d2/STOPPED ]; then
+		crm_diff -c -n $1 -o $2
+	else
+		echo "can't compare cibs from running and stopped systems"
+	fi
 }
 txtdiff() {
 	diff $1 $2
 }
 diffcheck() {
+	[ -f "$1" ] || {
+		echo "$1 does not exist"
+		return 1
+	}
+	[ -f "$2" ] || {
+		echo "$2 does not exist"
+		return 1
+	}
 	case `basename $1` in
 	ccm_tool.txt) txtdiff $1 $2;; # worddiff?
 	cib.xml) cibdiff $1 $2;;
 	ha.cf) txtdiff $1 $2;; # confdiff?
 	crm_mon.txt|sysinfo.txt) txtdiff $1 $2;;
 	esac
 }
 analyze_one() {
 	rc=0
 	node0=""
 	for n in `getnodes`; do
 		if [ "$node0" ]; then
 			diffcheck $1/$node0/$2 $1/$n/$2
 			rc=$((rc+$?))
 		else
 			node0=$n
 		fi
 	done
 	return $rc
 }
 analyze() {
 	flist="ccm_tool.txt cib.xml crm_mon.txt ha.cf sysinfo.txt"
 	for f in $flist; do
 		perl -e "printf \"Diff $f... \""
 		ls $1/*/$f >/dev/null 2>&1 || continue
 		if analyze_one $1 $f; then
 			echo "OK"
 			consolidate $1 $f
 		else
 			echo "varies"
 		fi
 	done
 	checkcrmvfy $1
 	checkbacktraces $1
 	checklogs $1
 }
 
 #
 # description template, editing, and other notes
 #
mktemplate() { cat<=100{exit 1}' || cat < $DESTDIR/$WE/ha-log
+	if [ "$NO_str2time" ]; then
+		warning "a log found; but we cannot slice it"
+		warning "please install the perl Date::Parse module"
 	else
-		cat > $DESTDIR/ha-log # we are log server, probably
+		dumplog $HA_LOG $FROM_TIME $TO_TIME |
+		if [ "$THIS_IS_NODE" ]; then
+			cat > $DESTDIR/$WE/ha-log
+		else
+			cat > $DESTDIR/ha-log # we are log server, probably
+		fi
+	fi
 	fi
 else
 	warning "could not find the log file on $WE"
 fi
 
 #
 # part 6: get all other info (config, stats, etc)
 #
 if [ "$THIS_IS_NODE" ]; then
 	getconfig $DESTDIR/$WE
 	getpeinputs $FROM_TIME $TO_TIME $DESTDIR/$WE
 	getbacktraces $FROM_TIME $TO_TIME $DESTDIR/$WE/backtraces.txt
 	touch_DC_if_dc $DESTDIR/$WE
 	sanitize $DESTDIR/$WE
 	sys_info > $DESTDIR/$WE/sysinfo.txt
 	sys_stats > $DESTDIR/$WE/sysstats.txt 2>&1
 fi
 
 #
 # part 7: endgame:
 #	slaves tar their results to stdout, the master waits
 #	for them, analyses results, asks the user to edit the
 #	problem description template, and prints final notes
 #
 if [ "$SLAVE" ]; then
 	(cd $DESTDIR && tar cf - $WE)
 else
 	wait $SLAVEPIDS
 	analyze $DESTDIR > $DESTDIR/analysis.txt
 	mktemplate > $DESTDIR/description.txt
 	[ "$NO_DESCRIPTION" ] || {
 		echo press enter to edit the problem description...
 		read junk
 		edittemplate $DESTDIR/description.txt
 	}
 	cd $DESTDIR/..
-	tar czf $DESTDIR.tar.gz $DESTDIR/
+	tar czf $DESTDIR.tar.gz `basename $DESTDIR`
 	finalword
 	checksize
 fi
 
 [ "$REMOVE_DEST" ] && rm -r $DESTDIR
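
A note on the last hunk above: since the script first does `cd
$DESTDIR/..`, archiving `basename $DESTDIR` instead of `$DESTDIR/`
presumably makes the tarball store relative member names, so for a
destination of /tmp/report it unpacks as, roughly:

	report/description.txt
	report/analysis.txt
	report/<node>/ha.cf

rather than the longer tmp/report/... paths the old command produced
(the per-node directory name follows $WE; the layout is the one listed
under "The goods").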
diff --git a/tools/utillib.sh b/tools/utillib.sh
index 05e259120a..2187624d9d 100644
--- a/tools/utillib.sh
+++ b/tools/utillib.sh
@@ -1,384 +1,354 @@
 # Copyright (C) 2007 Dejan Muhamedagic
 #
 # This program is free software; you can redistribute it and/or
 # modify it under the terms of the GNU General Public
 # License as published by the Free Software Foundation; either
 # version 2.1 of the License, or (at your option) any later version.
 #
 # This software is distributed in the hope that it will be useful,
 # but WITHOUT ANY WARRANTY; without even the implied warranty of
 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 # General Public License for more details.
 #
 # You should have received a copy of the GNU General Public
 # License along with this library; if not, write to the Free Software
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 #
 
 #
 # ha.cf/logd.cf parsing
 #
 getcfvar() {
 	[ -f $HA_CF ] || return
 	sed 's/#.*//' < $HA_CF |
 		grep -w "^$1" |
 		sed 's/^[^[:space:]]*[[:space:]]*//'
 }
 iscfvarset() {
 	test "`getcfvar \"$1\"`"
 }
 iscfvartrue() {
 	getcfvar "$1" |
 		egrep -qsi "^(true|y|yes|on|1)"
 }
 getnodes() {
 	getcfvar node
 }
 
-#
-# ssh
-#
-checksshuser() {
-	ssh -o Batchmode=yes $2@$1 true 2>/dev/null
-}
-trysshusers() {
-	n=$1
-	shift 1
-	for u; do
-		if checksshuser $n $u; then
-			echo $u
-			break
-		fi
-	done
-}
-
 #
 # logging
 #
 syslogmsg() {
 	severity=$1
 	shift 1
 	logtag=""
 	[ "$HA_LOGTAG" ] && logtag="-t $HA_LOGTAG"
 	logger -p ${HA_LOGFACILITY:-"daemon"}.$severity $logtag $*
 }
 
 #
 # find log destination
 #
 uselogd() {
 	iscfvartrue use_logd &&
 		return 0  # if use_logd true
 	iscfvarset logfacility ||
 		iscfvarset logfile ||
 		iscfvarset debugfile ||
 		return 0  # or none of the log options set
 	false
 }
 findlogdcf() {
 	for f in \
 		`which strings > /dev/null 2>&1 &&
 			strings $HA_BIN/ha_logd | grep 'logd\.cf'` \
 		`for d; do echo $d/logd.cf $d/ha_logd.cf; done`
 	do
 		if [ -f "$f" ]; then
 			echo $f
 			return 0
 		fi
 	done
 	return 1
 }
 getlogvars() {
 	savecf=$HA_CF
 	if uselogd; then
 		[ -f "$LOGD_CF" ] ||
 			fatal "could not find logd.cf or ha_logd.cf"
 		HA_CF=$LOGD_CF
 	fi
 	HA_LOGFACILITY=`getcfvar logfacility`
 	HA_LOGFILE=`getcfvar logfile`
 	HA_DEBUGFILE=`getcfvar debugfile`
 	HA_SYSLOGMSGFMT=""
 	iscfvartrue syslogmsgfmt &&
 		HA_SYSLOGMSGFMT=1
 	HA_CF=$savecf
 }
 findmsg() {
 	# this is tricky, we try a few directories
 	syslogdir="/var/log /var/logs /var/syslog /var/adm /var/log/ha /var/log/cluster"
 	favourites="ha-*"
 	mark=$1
 	log=""
 	for d in $syslogdir; do
 		[ -d $d ] || continue
 		log=`fgrep -l "$mark" $d/$favourites` && break
 		log=`fgrep -l "$mark" $d/*` && break
 	done 2>/dev/null
 	echo $log
 }
 
 #
 # print a segment of a log file
 #
 str2time() {
 	perl -e "\$time='$*';" -e '
 	eval "use Date::Parse";
 	if (!$@) {
 		print str2time($time);
 	} else {
 		eval "use Date::Manip";
 		if (!$@) {
 			print UnixDate(ParseDateString($time), "%s");
 		}
 	}
 	'
 }
 getstamp() {
 	if [ "$HA_SYSLOGMSGFMT" -o "$HA_LOGFACILITY" ]; then
 		awk '{print $1,$2,$3}'
 	else
 		awk '{print $2}' | sed 's/_/ /'
 	fi
 }
 linetime() {
 	l=`tail -n +$2 $1 | head -1 | getstamp`
 	str2time "$l"
 }
 findln_by_time() {
 	logf=$1
 	tm=$2
 	first=1
 	last=`wc -l < $logf`
 	while [ $first -le $last ]; do
 		mid=$(((last+first)/2))
 		tmid=`linetime $logf $mid`
 		if [ -z "$tmid" ]; then
 			warning "cannot extract time: $logf:$mid"
 			return
 		fi
 		if [ $tmid -gt $tm ]; then
 			last=$((mid-1))
 		elif [ $tmid -lt $tm ]; then
 			first=$((mid+1))
 		else
 			break
 		fi
 	done
 	echo $mid
 }
 dumplog() {
 	logf=$1
 	from_time=$2
 	to_time=$3
 	from_line=`findln_by_time $logf $from_time`
 	if [ -z "$from_line" ]; then
 		warning "couldn't find line for time $from_time; corrupt log file?"
 		return
 	fi
 	tail -n +$from_line $logf |
 	if [ "$to_time" != 0 ]; then
 		to_line=`findln_by_time $logf $to_time`
 		if [ -z "$to_line" ]; then
 			warning "couldn't find line for time $to_time; corrupt log file?"
 			return
 		fi
 		head -$((to_line-from_line+1))
 	else
 		cat
 	fi
 }
 
 #
 # find files newer than a and older than b
 #
 touchfile() {
 	t=`maketempfile` &&
 		perl -e "\$file=\"$t\"; \$tm=$1;" -e 'utime $tm, $tm, $file;' &&
 		echo $t
 }
 find_files() {
 	dir=$1
 	from_time=$2
 	to_time=$3
 	from_stamp=`touchfile $from_time`
 	findexp="-newer $from_stamp"
 	if [ "$to_time" -a "$to_time" -gt 0 ]; then
 		to_stamp=`touchfile $to_time`
 		findexp="$findexp ! -newer $to_stamp"
 	fi
 	find $dir -type f $findexp
 	rm -f $from_stamp $to_stamp
 }
 
 #
 # coredumps
 #
 findbinary() {
 	random_binary=`which cat 2>/dev/null` # suppose we are lucky
 	binary=`gdb $random_binary $1 < /dev/null 2>/dev/null |
 		grep 'Core was generated' | awk '{print $5}' |
 		sed "s/^.//;s/[.']*$//"`
 	[ x = x"$binary" ] && return
 	fullpath=`which $binary 2>/dev/null`
 	if [ x = x"$fullpath" ]; then
 		[ -x $HA_BIN/$binary ] && echo $HA_BIN/$binary
 	else
 		echo $fullpath
 	fi
 }
 getbt() {
 	which gdb > /dev/null 2>&1 || {
 		warning "please install gdb to get backtraces"
 		return
 	}
 	for corefile; do
 		absbinpath=`findbinary $corefile`
 		[ x = x"$absbinpath" ] && return 1
 		echo "====================== start backtrace ======================"
 		ls -l $corefile
 		gdb -batch -n -quiet -ex ${BT_OPTS:-"thread apply all bt full"} -ex quit \
 			$absbinpath $corefile 2>/dev/null
 		echo "======================= end backtrace ======================="
 	done
 }
 
 #
 # heartbeat configuration/status
 #
 iscrmrunning() {
 	crmadmin -D >/dev/null 2>&1
 }
 dumpstate() {
 	crm_mon -1 | grep -v '^Last upd' > $1/crm_mon.txt
 	cibadmin -Ql > $1/cib.xml
 	ccm_tool -p > $1/ccm_tool.txt 2>&1
 }
 getconfig() {
-	cp -p $HA_CF $1/
+	[ -f $HA_CF ] &&
+		cp -p $HA_CF $1/
 	[ -f $LOGD_CF ] &&
 		cp -p $LOGD_CF $1/
 	if iscrmrunning; then
 		dumpstate $1
+		touch $1/RUNNING
 	else
 		cp -p $HA_VARLIB/crm/cib.xml $1/ 2>/dev/null
+		touch $1/STOPPED
 	fi
 	[ -f "$1/cib.xml" ] &&
 		crm_verify -V -x $1/cib.xml >$1/crm_verify.txt 2>&1
 }
 
 #
 # remove values of sensitive attributes
 #
 # this is not proper xml parsing, but it will work under the
 # circumstances
 sanitize_xml_attrs() {
 	sed $(
 	for patt in $SANITIZE; do
 		echo "-e /name=\"$patt\"/s/value=\"[^\"]*\"/value=\"****\"/"
 	done
 	)
 }
 sanitize_hacf() {
 	awk '
 	$1=="stonith_host"{ for( i=5; i<=NF; i++ ) $i="****"; }
 	{print}
 	'
 }
 sanitize_one() {
 	file=$1
 	compress=""
 	echo $file | grep -qs 'gz$' && compress=gzip
 	echo $file | grep -qs 'bz2$' && compress=bzip2
 	if [ "$compress" ]; then
 		decompress="$compress -dc"
 	else
 		compress=cat
 		decompress=cat
 	fi
 	tmp=`maketempfile` && ref=`maketempfile` ||
 		fatal "cannot create temporary files"
 	touch -r $file $ref # save the mtime
 	if [ "`basename $file`" = ha.cf ]; then
 		sanitize_hacf
 	else
 		$decompress | sanitize_xml_attrs | $compress
 	fi < $file > $tmp
 	mv $tmp $file
 	touch -r $ref $file
 	rm -f $ref
 }
 
 #
 # keep the user posted
 #
 fatal() {
-	echo "ERROR: $*" >&2
+	echo "`uname -n`: ERROR: $*" >&2
 	exit 1
 }
 warning() {
-	echo "WARN: $*" >&2
+	echo "`uname -n`: WARN: $*" >&2
 }
 info() {
-	echo "INFO: $*" >&2
+	echo "`uname -n`: INFO: $*" >&2
 }
 pickfirst() {
 	for x; do
 		which $x >/dev/null 2>&1 && {
 			echo $x
 			return 0
 		}
 	done
 	return 1
 }
 
-#
-# run a command everywhere
-#
-forall() {
-	c="$*"
-	for n in `getnodes`; do
-		if [ "$n" = "`uname -n`" ]; then
-			$c
-		else
-			if [ "$SSH_USER" ]; then
-				echo $c | ssh $SSH_OPTS $SSH_USER@$n
-			fi
-		fi
-	done
-}
-
 #
 # get some system info
 #
 distro() {
 	which lsb_release >/dev/null 2>&1 && {
 		lsb_release -d
 		return
 	}
 	relf=`ls /etc/debian_version 2>/dev/null` ||
 		relf=`ls /etc/slackware-version 2>/dev/null` ||
 		relf=`ls -d /etc/*-release 2>/dev/null` && {
 		for f in $relf; do
 			test -f $f && {
 				echo "`ls $f` `cat $f`"
 				return
 			}
 		done
 	}
 	warning "no lsb_release no /etc/*-release no /etc/debian_version"
 }
 hb_ver() {
 	which dpkg > /dev/null 2>&1 && {
 		dpkg-query -f '${Version}' -W heartbeat 2>/dev/null ||
 			dpkg-query -f '${Version}' -W heartbeat-2
 		return
 	}
 	which rpm > /dev/null 2>&1 && {
 		rpm -q --qf '%{version}' heartbeat
 		return
 	}
 	# more packagers?
 }
 crm_info() {
 	$HA_BIN/crmd version 2>&1
 }
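
The log-slicing helpers above can also be exercised by hand, which is
useful when checking why a given period came out empty. A minimal
sketch, assuming a syslog-format ha-log under /var/log and the
Date::Parse perl module installed (the path and times are only
examples):

	. ./utillib.sh
	HA_LOGFACILITY=daemon	# make getstamp expect syslog-style timestamps
	FROM_TIME=`str2time "3:00"`
	TO_TIME=`str2time "4:00"`
	dumplog /var/log/ha-log $FROM_TIME $TO_TIME > /tmp/ha-log.slice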