diff --git a/docs/boothd.8.txt b/docs/boothd.8.txt index c3765c9..0e81e3a 100644 --- a/docs/boothd.8.txt +++ b/docs/boothd.8.txt @@ -1,379 +1,377 @@ BOOTHD(8) =========== :doctype: manpage NAME ---- boothd - The Booth Cluster Ticket Manager. SYNOPSIS -------- *boothd* 'daemon' ['-D'] [-c 'config'] *booth* ['client'] {'list'} [-S 'site'] ['-D'] [-c 'config'] *booth* ['client'] {'grant'|'revoke'} [-S 'site'] ['-D'] [-t] 'ticket' [-c 'config'] *booth* 'status' ['-D'] [-c 'config'] DESCRIPTION ----------- Booth manages tickets which authorizes one of the cluster sites located in geographically dispersed distances to run certain resources. It is designed to be an add-on to Pacemaker, which extends Pacemaker to support geographically distributed clustering. It is based on the RAFT protocol, see eg. for details. SHORT EXAMPLES -------------- --------------------- # boothd daemon # boothd client list # boothd client grant -t ticket-nfs # boothd client revoke -t ticket-nfs --------------------- OPTIONS ------- *-c*:: Configuration to use. + Can be a full path to a configuration file, or a short name; in the latter case, the directory '/etc/booth' and suffix '.conf' are added. Per default 'booth' is used, which results in the path '/etc/booth/booth.conf'. + The configuration name also determines the name of the PID file - for the defaults, '/var/run/booth/booth.pid'. *-D*:: Debug output/don't daemonize. Increases the debug output level; for 'boothd daemon', keeps the process in the foreground. *-h*, *--help*:: Give a short usage output. *-s*:: Site address. *-t*:: Ticket name. *-v*, *--version*:: Report version information. *-S*:: 'systemd' mode: don't fork. This is like '-D' but without the debug output. COMMANDS -------- Whether the binary is called as 'boothd' or 'booth' doesn't matter; the first argument determines the mode of operation. *'daemon'*:: Tells 'boothd' to serve a site. 
The locally configured interfaces are searched for an IP address that got defined in the configuration, so that Booth can operate in /arbitrator/ resp. /site/ mode. *'client'*:: Allows to list the ticket information (see also 'crm_ticket -L'), and to revoke or (initially) grant tickets to a site. + In this mode the configuration file is searched for an IP address that is locally reachable, ie. matches a configured subnet. This allows to run the client commands on another node in the same cluster, as long as the config file and the service IP is locally reachable. + Example: If the booth service IP is 192.168.55.200, and the local node has 192.168.55.15 configured on an interface, it knows which site it belongs to. + The client can also ask another site; use '-s' to tell where to connect to. *'status'*:: 'boothd' looks for the (locked) PID file and the UDP socket, prints some output to stdout (for use in shell scripts) and returns a OCF-compatible return code. With '-D', a human-readable message is printed to STDERR as well. CONFIGURATION FILE ------------------ A basic file looks like this: ----------------------- site="192.168.201.100" site="192.168.202.100" arbitrator="192.168.203.100" -ticket="I-want-a-pony" +ticket="ticket-db8" ----------------------- You can use comment lines, by starting them with a hash-sign (''#''). Whitespace at the start and end of the line, and around the ''='', are ignored. The following key/value pairs are defined: *'port'*:: The UDP/TCP port to use. Default is '9929'. *'transport'*:: The transport protocol to use for Raft exchanges. Currently only UDP is available. + -Please note that the client mode always uses TCP to talk to a daemon; Booth +The client mode always uses TCP to talk to a daemon; Booth will always bind and listen to *both* UDP and TCP ports. *'site'*, *'arbitrator'*:: Defines a Raft member with the given IP, which should be a service IP. 
+ You will need at least three members for normal operation; an odd number is preferred. *'ticket'*:: Registers a ticket. Multiple tickets can be handled in a single Booth instance. The next items modify per-ticket defaults. They are stored as defaults for further tickets, and are used as value for the last defined ticket (if any). *'expire'*:: The lease time for a ticket, in seconds. After that time the ticket can be revoked, and another site can get it. + Typically 'booth' will try to renew a held ticket after half the lease time. *'timeout'*:: After that time 'booth' will re-send packets if there was an insufficient number of replies. + -The default is '3'. +The default is '10'. *'weights'*:: A comma-separated list of integers that define the weight of individual Raft members, in the same order as the 'site' and 'arbitrator' lines. + -Default is '0' for all; this means that the ordering within the configuration -file defines a kind of priority for conflicting requests. +Default is '0' for all; this means that the order in the configuration +file defines priority for conflicting requests. *'acquire-after'*:: Setting this to a positive value will make 'booth' try to acquire a ticket that got lost. + Ie. if the site that _had_ the ticket is not reachable any more, then 'acquire-after' seconds after ticket expiration other sites will try to activate the ticket. (Only one will succeed, though.) + A typical delay might be 60 seconds. *'retries'*:: - Defines how often broadcast packets are sent out before the current - action (grant, revoke) is aborted. + Defines how many times to retry broadcasting packets before the + current operation (grant, revoke) is aborted. + -Default is 10; values lower than 3 are forbidden, and high values won't -make much sense, too. +Default is 10. Values lower than 3 are illegal. 
+ -Please note that this counts only for a single packet; if ticket *renewal* -runs into this limit (because the network was temporarily down), but the -ticket is still valid afterwards, a new renewal run will be started -automatically. +This counts only for a single broadcast; if ticket *renewal* runs +into this limit (because the network was temporarily down), but +the ticket is still valid afterwards, a new renewal run will be +started automatically. *'site-user'*, *'site-group'*, *'arbitrator-user'*, *'arbitrator-group'*:: These define the credentials 'boothd' will be running with. + On a (Pacemaker) site the booth process will have to call 'crm_ticket', so the default is to use 'hacluster':'haclient'; for an arbitrator this user and group might not exists, so that will default to 'nobody':'nobody'. *'before-acquire-handler'*:: - If set, this script/program will be called before 'boothd' tries to - acquire or renew a ticket. Only a clean exit will allow 'boothd' to - proceed; any other return value will cancel the operation. + If set, this command will be called before 'boothd' tries to + acquire or renew a ticket. On exit code other than 0, + 'boothd' cancels the operation. + -This makes it possible to check whether it makes sense to try -to acquire the ticket; eg. if a service in the +This makes it possible to check whether it is appropriate +to acquire the ticket. For instance, if a service in the dependency-chain has a failcount of 'INFINITY' on all -available nodes, the service will be unable to run - and so -another cluster (and not this one!) should try to start it. +available nodes, the service will be unable to run. In that case, +it is of no use to claim the ticket. + -Please assume that 'boothd' will wait synchronously for the result of that -call, so having that program return quickly would be an advantage. +'boothd' waits synchronously for the result of the handler, so make +sure that the program returns quickly. 
+ -Please see below for details about available environment variables. +See below for details about booth-specific environment variables. A more verbose example of a configuration file might be ----------------------- transport = udp port = 9930 # D-85774 site="192.168.201.100" # D-90409 site="::ffff:192.168.202.100" # A-1120 arbitrator="192.168.203.100" -ticket="I-want-a-pony" +ticket="ticket-db8" expire = 600 acquire-after = 60 timeout = 10 retries = 5 ----------------------- NOTES ----- -Please note that Booth tickets are not meant to be real-time - a reasonable -'expire' time might be 300 seconds (5 minutes). Due to possible delays on the -WAN connections it makes no sense to expect detection of problems and failover -within a few seconds. +Tickets are not meant to be moved around quickly--a reasonable +'expire' time might be 300 seconds (5 minutes). +'booth' works with both IPv4 and IPv6 addresses. -'booth' works with IPv6 addresses, too. +'booth' renews a ticket before it expires, to account for +possible transmission delays. - -'booth' will start to renew a ticket before it expires, to account -for transmission delays. - -This will happen so that (the bigger one of) half the 'expire' time, or -'timeout'*'retries'/2 seconds, will be left for the renewal. - -Of course, that means that with bad configuration values (eg. 'expire' 60 -seconds, 'timeout' 3 seconds, and 'retries' > 40) the ticket renewal -process will be started just after the ticket got acquired. +The renewal time is calculated as the larger of half the 'expire' +time and 'timeout'*'retries'/2. Hence, with small 'expire' values +(eg. 60 seconds) the ticket renewal process will be started just +after the ticket got acquired. HANDLERS -------- Currently, there's only one external handler defined (see the 'before-acquire-handler' configuration item above). 
-It gets the following data via the environment: +The following data is available as environment variables: *'BOOTH_TICKET':: The ticket name, as given in the configuration file. (See 'ticket' item above.) *'BOOTH_LOCAL':: - The local site specification, as defined in 'site'. + The local site name, as defined in 'site'. *'BOOTH_CONF_PATH':: The path to the active configuration file. *'BOOTH_CONF_NAME':: The configuration name, as used by the '-c' commandline argument. *'BOOTH_TICKET_EXPIRES':: - Timestamp for the ticket expiration (seconds since 1.1.1970), or '0'. + When the ticket expires (in seconds since 1.1.1970), or '0'. FILES ----- *'/etc/booth/booth.conf'*:: The default configuration file name. See also the '-c' argument. *'/var/run/booth/'*:: Directory that holds PID/lock files. See also the 'status' command. RAFT IMPLEMENTATION ------------------- -Basically, each Pacemaker ticket corresponds to a separate Raft cluster. +In essence, every ticket corresponds to a separate Raft cluster. -A ticket is granted _only_ to the Raft _Leader_, but a Leader needs not grant the ticket to Pacemaker. -To move a ticket, the Leader withdraws, and votes for the new Leader instead. - -So, the Raft "log" consists of -- nothing, more or less; there's no history to keep. +A ticket is granted _only_ to the Raft _Leader_, but a Leader +need not grant the ticket to Pacemaker. To move a ticket, the +Leader withdraws, and votes for the new Leader instead. SYSTEMD INTEGRATION ------------------- -The Booth sources (and, very likely, packages too) include a 'systemd' unit -file for 'boothd'. +The 'boothd' 'systemd' unit file should be distributed with booth. -So don't forget to install 'boothd' into 'systemd' after configuration! 
+The booth daemon for a site or an arbitrator may be started +through systemd: ----------- # systemctl enable booth@{configurationname}.service # systemctl start booth@{configurationname}.service ----------- +The configuration name is required for 'systemctl', even in case +of the default name 'booth'. + EXIT STATUS ----------- *0*:: Success. For the 'status' command: Daemon running. *1* (PCMK_OCF_UNKNOWN_ERROR):: General error code. *7* (PCMK_OCF_NOT_RUNNING):: No daemon process for that configuration active. BUGS ---- Probably. Please report them on GitHub: AUTHOR ------ 'boothd' was originally written (mostly) by Jiaju Zhang. Many people have contributed to it. -In 2013 Philipp Marek took over maintainership. +In 2013 Philipp Marek took over maintainership, followed by Dejan +Muhamedagic. RESOURCES --------- GitHub: Documentation: COPYING ------- Copyright (C) 2011 Jiaju Zhang Copyright (C) 2013-2014 Philipp Marek +Copyright (C) 2014 Dejan Muhamedagic + Free use of this software is granted under the terms of the GNU General Public License (GPL). // vim: set ft=asciidoc :
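
Note on the renewal arithmetic in NOTES: the rule (the larger of half the 'expire' time and 'timeout'*'retries'/2) can be sketched in shell as below. This is only an illustration under stated assumptions: the function name `renewal_margin` is hypothetical and not part of booth, and integer division is assumed.

```shell
#!/bin/sh
# Hypothetical helper, not part of booth: estimate how many seconds are
# reserved for ticket renewal, following the rule stated in NOTES.
renewal_margin() {
    expire=$1    # 'expire' from the configuration, in seconds
    timeout=$2   # 'timeout' from the configuration, in seconds
    retries=$3   # 'retries' from the configuration

    half_expire=$((expire / 2))
    retry_window=$((timeout * retries / 2))

    # Print the larger of the two values.
    if [ "$half_expire" -ge "$retry_window" ]; then
        echo "$half_expire"
    else
        echo "$retry_window"
    fi
}
```

With the values from the verbose configuration example (expire = 600, timeout = 10, retries = 5), `renewal_margin 600 10 5` prints 300. With 'expire' 60, 'timeout' 3 and 'retries' 41, it prints 61, which exceeds the lease itself; that is the case in which renewal would start right after the ticket is acquired.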