diff --git a/TODO b/TODO index c831948a..117665a0 100644 --- a/TODO +++ b/TODO @@ -1,97 +1,96 @@ -------------------------------------------------------- The Corosync Cluster Engine Topic Branches and Backlog -------------------------------------------------------- ---------------------------- Last Updated: May 2012 ---------------------------- -------------------------------------- Current priority list for Needle 2.1 -------------------------------------- * disallow binding to localhost (Honza) * don't rely on mcast loop (Honza) * Add support for LOG_TRACE and log received messages in this level (Honza) * Proper support for DNS (always return one address even if function is called multiple times) (Honza) * porting of qdisk to votequorum and eventually finalize qdevice API in votequorum. (Fabio) * Cleaner shutdown process -> free memory (Fabio) -------------------------------------- Current priority list for Needle 2.X -------------------------------------- * logsys glue layer removal * harden and finish ykd algorithm * implement topic-xmlschema * Modify totemsrp to allow dynamic definitions of the ring counts to allow a larger number of redundant rings then 2. * Investigate always-on flight recorder -* support more encryption methods (other than none/aes256) from nss * implement topic-rdmaud -------------------------------- Ideas for future releases (3.0+) -------------------------------- * topic-netmalloc * doxygenize include and lib directories. * re-evaluate using libtool to link libraries. * Support for clang as compiler (depends on libtool) * reorganize library/headers/code in the tree * change and simplify build defaults * libtotem cleanup/rewrite * Rewrite totem fragmentation layer * rewrite top level totempg interface * Split fragmentation layer in totem (ie: totempg talks to totemfrg talks to totemsrp) * Add a getopt and setopt feature to top level interface to allow runtime configuration of the interface * Improve cpg - opaque data in callbacks (client stores data about itself, every node can access them), permissions (read only/read write/ some application may disable listeners) - Probably implemented as extra user space library on top of normal cpg * Better statistic - histogram * totem multiring * load balancing over different speed links in RRP We use topic branches in our git repository to develop new disruptive features that define our future roadmap. This file describes the topic branches the developers have interest in investigating further. targets can be: needle2.1, needle2.X, or future (3.0). Once in a shipped version, please remove from the topic list. ------------------------------------------------------------------------------ topic-xmlschema ------------------------------------------------------------------------------ XML configuration for corosync exists, but imput file is not checked against XML schema. This topic is about implementing preferably RelaxNG schema of corosync configuration. ------------------------------------------------------------------------------ topic-onecrypt ------------------------------------------------------------------------------ Currently encryption code is located in totemudp.c, totemudpu.c, and iba has no encryption support. This topic merges the encryption code into a new file such as totemcrp.c and provides a mechanism for totemnet.c to register encrypt and decrypt functions with totem[udp|iba|udpu] and use them as requested by the configuration. ------------------------------------------------------------------------------ topic-netmalloc ------------------------------------------------------------------------------ The totemiba.c driver must allocate memory and assign it to a protection domain in order for an infiniband driver to transmit memory. In the current implementation, totemsrp.c also allocates these same frames. This results in an extra memcpy when transmitting with libibverbs technology. Memory copies are to be avoided. The simple solution is to have each network driver provide a memory allocation function. When totemsrp wants a free frame, it requests it from the network driver. ------------------------------------------------------------------------------ topic-rdmaud ------------------------------------------------------------------------------ Currently our RDMA code uses librdmacm to setup connections. We are not certain this extra library is needed, and may be able to use only ibverbs. If this is possible, the totem code may be more reliable, especially around failure conditions. diff --git a/conf/lenses/corosync.aug b/conf/lenses/corosync.aug index 98cd2686..1418c30b 100644 --- a/conf/lenses/corosync.aug +++ b/conf/lenses/corosync.aug @@ -1,181 +1,181 @@ (* Process /etc/corosync/corosync.conf *) (* The lens is based on the corosync.conf(5) man page *) module Corosync = autoload xfm let comment = Util.comment let empty = Util.empty let dels = Util.del_str let eol = Util.eol let ws = del /[ \t]+/ " " let wsc = del /:[ \t]+/ ": " let indent = del /[ \t]*/ "" (* We require that braces are always followed by a newline *) let obr = del /\{([ \t]*)\n/ "{\n" let cbr = del /[ \t]*}[ \t]*\n/ "}\n" let ikey (k:regexp) = indent . key k let section (n:regexp) (b:lens) = [ ikey n . ws . obr . (b|empty|comment)* . cbr ] let kv (k:regexp) (v:regexp) = [ ikey k . wsc . store v . eol ] (* FIXME: it would be much more concise to write *) (* [ key k . ws . (bare | quoted) ] *) (* but the typechecker trips over that *) let qstr (k:regexp) = let delq = del /['"]/ "\"" in let bare = del /["']?/ "" . store /[^"' \t\n]+/ . del /["']?/ "" in let quoted = delq . store /.*[ \t].*/ . delq in [ ikey k . wsc . bare . eol ] |[ ikey k . wsc . quoted . eol ] (* A integer subsection *) let interface = let setting = kv "ringnumber" Rx.integer |kv "mcastport" Rx.integer |kv "ttl" Rx.integer |qstr /bindnetaddr|mcastaddr/ in section "interface" setting (* The totem section *) let totem = let setting = kv "clear_node_high_bit" /yes|no/ |kv "rrp_mode" /none|active|passive/ |kv "vsftype" /none|ykd/ |kv "secauth" /on|off/ - |kv "crypto_type" /nss|aes256/ - |kv "crypto_cipher" /none|nss|aes256/ + |kv "crypto_type" /nss|aes256|aes192|aes128|3des/ + |kv "crypto_cipher" /none|nss|aes256|aes192|aes128|3des/ |kv "crypto_hash" /none|md5|sha1|sha256|sha384|sha512/ |kv "transport" /udp|iba/ |kv "version" Rx.integer |kv "nodeid" Rx.integer |kv "threads" Rx.integer |kv "netmtu" Rx.integer |kv "token" Rx.integer |kv "token_retransmit" Rx.integer |kv "hold" Rx.integer |kv "token_retransmits_before_loss_const" Rx.integer |kv "join" Rx.integer |kv "send_join" Rx.integer |kv "consensus" Rx.integer |kv "merge" Rx.integer |kv "downcheck" Rx.integer |kv "fail_to_recv_const" Rx.integer |kv "seqno_unchanged_const" Rx.integer |kv "heartbeat_failures_allowed" Rx.integer |kv "max_network_delay" Rx.integer |kv "max_messages" Rx.integer |kv "window_size" Rx.integer |kv "rrp_problem_count_timeout" Rx.integer |kv "rrp_problem_count_threshold" Rx.integer |kv "rrp_token_expired_timeout" Rx.integer |interface in section "totem" setting let common_logging = kv "to_syslog" /yes|no|on|off/ |kv "to_stderr" /yes|no|on|off/ |kv "to_logfile" /yes|no|on|off/ |kv "debug" /yes|no|on|off|trace/ |kv "logfile_priority" /alert|crit|debug|emerg|err|info|notice|warning/ |kv "syslog_priority" /alert|crit|debug|emerg|err|info|notice|warning/ |kv "syslog_facility" /daemon|local0|local1|local2|local3|local4|local5|local6|local7/ |qstr /logfile|tags/ (* A logger_subsys subsection *) let logger_subsys = let setting = qstr /subsys/ |common_logging in section "logger_subsys" setting (* The logging section *) let logging = let setting = kv "fileline" /yes|no|on|off/ |kv "function_name" /yes|no|on|off/ |kv "timestamp" /yes|no|on|off/ |common_logging |logger_subsys in section "logging" setting (* The resource section *) let common_resource = kv "max" Rx.decimal |kv "poll_period" Rx.integer |kv "recovery" /reboot|shutdown|watchdog|none/ let memory_used = let setting = common_resource in section "memory_used" setting let load_15min = let setting = common_resource in section "load_15min" setting let system = let setting = load_15min |memory_used in section "system" setting (* The resources section *) let resources = let setting = system in section "resources" setting (* The quorum section *) let quorum = let setting = qstr /provider/ |kv "expected_votes" Rx.integer |kv "votes" Rx.integer |kv "wait_for_all" Rx.integer |kv "last_man_standing" Rx.integer |kv "last_man_standing_window" Rx.integer |kv "auto_tie_breaker" Rx.integer |kv "two_node" Rx.integer in section "quorum" setting (* The service section *) let service = let setting = qstr /name|ver/ in section "service" setting (* The uidgid section *) let uidgid = let setting = qstr /uid|gid/ in section "uidgid" setting (* The node section *) let node = let setting = qstr /ring[0-9]_addr/ |kv "nodeid" Rx.integer |kv "quorum_votes" Rx.integer in section "node" setting (* The nodelist section *) let nodelist = let setting = node in section "nodelist" setting let lns = (comment|empty|totem|quorum|logging|resources|service|uidgid|nodelist)* let xfm = transform lns (incl "/etc/corosync/corosync.conf") diff --git a/exec/coroparse.c b/exec/coroparse.c index 100b7177..b9655c5f 100644 --- a/exec/coroparse.c +++ b/exec/coroparse.c @@ -1,1136 +1,1142 @@ /* * Copyright (c) 2006-2012 Red Hat, Inc. * * All rights reserved. * * Author: Patrick Caulfield (pcaulfie@redhat.com) * Jan Friesse (jfriesse@redhat.com) * * This software licensed under BSD license, the text of which follows: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * - Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * - Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * - Neither the name of the MontaVista Software, Inc. nor the names of its * contributors may be used to endorse or promote products derived from this * software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #include <config.h> #include <sys/types.h> #include <sys/uio.h> #include <sys/socket.h> #include <sys/stat.h> #include <sys/un.h> #include <netinet/in.h> #include <arpa/inet.h> #include <unistd.h> #include <fcntl.h> #include <stdlib.h> #include <stdio.h> #include <errno.h> #include <string.h> #include <dirent.h> #include <limits.h> #include <stddef.h> #include <grp.h> #include <pwd.h> #include <corosync/list.h> #include <qb/qbutil.h> #define LOGSYS_UTILS_ONLY 1 #include <corosync/logsys.h> #include <corosync/icmap.h> #include "main.h" #include "util.h" enum parser_cb_type { PARSER_CB_START, PARSER_CB_END, PARSER_CB_SECTION_START, PARSER_CB_SECTION_END, PARSER_CB_ITEM, }; typedef int (*parser_cb_f)(const char *path, char *key, char *value, enum parser_cb_type type, const char **error_string, void *user_data); enum main_cp_cb_data_state { MAIN_CP_CB_DATA_STATE_NORMAL, MAIN_CP_CB_DATA_STATE_TOTEM, MAIN_CP_CB_DATA_STATE_INTERFACE, MAIN_CP_CB_DATA_STATE_LOGGER_SUBSYS, MAIN_CP_CB_DATA_STATE_UIDGID, MAIN_CP_CB_DATA_STATE_LOGGING_DAEMON, MAIN_CP_CB_DATA_STATE_MEMBER, MAIN_CP_CB_DATA_STATE_QUORUM, MAIN_CP_CB_DATA_STATE_QDEVICE, MAIN_CP_CB_DATA_STATE_NODELIST, MAIN_CP_CB_DATA_STATE_NODELIST_NODE, MAIN_CP_CB_DATA_STATE_PLOAD }; struct key_value_list_item { char *key; char *value; struct list_head list; }; struct main_cp_cb_data { enum main_cp_cb_data_state state; int ringnumber; char *bindnetaddr; char *mcastaddr; char *broadcast; int mcastport; int ttl; struct list_head logger_subsys_items_head; char *subsys; char *logging_daemon_name; struct list_head member_items_head; int node_number; int ring0_addr_added; }; static int read_config_file_into_icmap( const char **error_string); static char error_string_response[512]; static int uid_determine (const char *req_user) { int pw_uid = 0; struct passwd passwd; struct passwd* pwdptr = &passwd; struct passwd* temp_pwd_pt; char *pwdbuffer; int pwdlinelen; pwdlinelen = sysconf (_SC_GETPW_R_SIZE_MAX); if (pwdlinelen == -1) { pwdlinelen = 256; } pwdbuffer = malloc (pwdlinelen); if ((getpwnam_r (req_user, pwdptr, pwdbuffer, pwdlinelen, &temp_pwd_pt)) != 0) { sprintf (error_string_response, "The '%s' user is not found in /etc/passwd, please read the documentation.", req_user); return (-1); } pw_uid = passwd.pw_uid; free (pwdbuffer); return pw_uid; } static int gid_determine (const char *req_group) { int corosync_gid = 0; struct group group; struct group * grpptr = &group; struct group * temp_grp_pt; char *grpbuffer; int grplinelen; grplinelen = sysconf (_SC_GETGR_R_SIZE_MAX); if (grplinelen == -1) { grplinelen = 256; } grpbuffer = malloc (grplinelen); if ((getgrnam_r (req_group, grpptr, grpbuffer, grplinelen, &temp_grp_pt)) != 0) { sprintf (error_string_response, "The '%s' group is not found in /etc/group, please read the documentation.", req_group); return (-1); } corosync_gid = group.gr_gid; free (grpbuffer); return corosync_gid; } static char *strchr_rs (const char *haystack, int byte) { const char *end_address = strchr (haystack, byte); if (end_address) { end_address += 1; /* skip past { or = */ while (*end_address == ' ' || *end_address == '\t') end_address++; } return ((char *) end_address); } int coroparse_configparse (const char **error_string) { if (read_config_file_into_icmap(error_string)) { return -1; } return 0; } static char *remove_whitespace(char *string) { char *start; char *end; start = string; while (*start == ' ' || *start == '\t') start++; end = start+(strlen(start))-1; while ((*end == ' ' || *end == '\t' || *end == ':' || *end == '{') && end > start) end--; if (end != start) *(end+1) = '\0'; return start; } static int parse_section(FILE *fp, char *path, const char **error_string, parser_cb_f parser_cb, void *user_data) { char line[512]; int i; char *loc; int ignore_line; char new_keyname[ICMAP_KEYNAME_MAXLEN]; if (strcmp(path, "") == 0) { parser_cb("", NULL, NULL, PARSER_CB_START, error_string, user_data); } while (fgets (line, sizeof (line), fp)) { if (strlen(line) > 0) { if (line[strlen(line) - 1] == '\n') line[strlen(line) - 1] = '\0'; if (strlen (line) > 0 && line[strlen(line) - 1] == '\r') line[strlen(line) - 1] = '\0'; } /* * Clear out white space and tabs */ for (i = strlen (line) - 1; i > -1; i--) { if (line[i] == '\t' || line[i] == ' ') { line[i] = '\0'; } else { break; } } ignore_line = 1; for (i = 0; i < strlen (line); i++) { if (line[i] != '\t' && line[i] != ' ') { if (line[i] != '#') ignore_line = 0; break; } } /* * Clear out comments and empty lines */ if (ignore_line) { continue; } /* New section ? */ if ((loc = strchr_rs (line, '{'))) { char *section = remove_whitespace(line); loc--; *loc = '\0'; strcpy(new_keyname, path); if (strcmp(path, "") != 0) { strcat(new_keyname, "."); } strcat(new_keyname, section); if (!parser_cb(new_keyname, NULL, NULL, PARSER_CB_SECTION_START, error_string, user_data)) { return -1; } if (parse_section(fp, new_keyname, error_string, parser_cb, user_data)) return -1; } /* New key/value */ if ((loc = strchr_rs (line, ':'))) { char *key; char *value; *(loc-1) = '\0'; key = remove_whitespace(line); value = remove_whitespace(loc); strcpy(new_keyname, path); if (strcmp(path, "") != 0) { strcat(new_keyname, "."); } strcat(new_keyname, key); if (!parser_cb(new_keyname, key, value, PARSER_CB_ITEM, error_string, user_data)) { return -1; } } if (strchr_rs (line, '}')) { if (!parser_cb(path, NULL, NULL, PARSER_CB_SECTION_END, error_string, user_data)) { return -1; } return 0; } } if (strcmp(path, "") != 0) { *error_string = "Missing closing brace"; return -1; } if (strcmp(path, "") == 0) { parser_cb("", NULL, NULL, PARSER_CB_END, error_string, user_data); } return 0; } static int safe_atoi(const char *str, int *res) { int val; char *endptr; errno = 0; val = strtol(str, &endptr, 10); if (errno == ERANGE) { return (-1); } if (endptr == str) { return (-1); } if (*endptr != '\0') { return (-1); } *res = val; return (0); } static int str_to_ull(const char *str, unsigned long long int *res) { unsigned long long int val; char *endptr; errno = 0; val = strtoull(str, &endptr, 10); if (errno == ERANGE) { return (-1); } if (endptr == str) { return (-1); } if (*endptr != '\0') { return (-1); } *res = val; return (0); } static int main_config_parser_cb(const char *path, char *key, char *value, enum parser_cb_type type, const char **error_string, void *user_data) { int i; unsigned long long int ull; int add_as_string; char key_name[ICMAP_KEYNAME_MAXLEN]; static char formated_err[256]; struct main_cp_cb_data *data = (struct main_cp_cb_data *)user_data; struct key_value_list_item *kv_item; struct list_head *iter, *iter_next; int uid, gid; switch (type) { case PARSER_CB_START: memset(data, 0, sizeof(struct main_cp_cb_data)); data->state = MAIN_CP_CB_DATA_STATE_NORMAL; break; case PARSER_CB_END: break; case PARSER_CB_ITEM: add_as_string = 1; switch (data->state) { case MAIN_CP_CB_DATA_STATE_NORMAL: break; case MAIN_CP_CB_DATA_STATE_PLOAD: if ((strcmp(path, "pload.count") == 0) || (strcmp(path, "pload.size") == 0)) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } icmap_set_uint32(path, i); add_as_string = 0; } break; case MAIN_CP_CB_DATA_STATE_QUORUM: if ((strcmp(path, "quorum.expected_votes") == 0) || (strcmp(path, "quorum.votes") == 0) || (strcmp(path, "quorum.last_man_standing_window") == 0) || (strcmp(path, "quorum.leaving_timeout") == 0)) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } icmap_set_uint32(path, i); add_as_string = 0; } if ((strcmp(path, "quorum.two_node") == 0) || (strcmp(path, "quorum.allow_downscale") == 0) || (strcmp(path, "quorum.wait_for_all") == 0) || (strcmp(path, "quorum.auto_tie_breaker") == 0) || (strcmp(path, "quorum.last_man_standing") == 0)) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } icmap_set_uint8(path, i); add_as_string = 0; } break; case MAIN_CP_CB_DATA_STATE_QDEVICE: if ((strcmp(path, "quorum.device.timeout") == 0) || (strcmp(path, "quorum.device.votes") == 0)) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } icmap_set_uint32(path, i); add_as_string = 0; } if ((strcmp(path, "quorum.device.master_wins") == 0)) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } icmap_set_uint8(path, i); add_as_string = 0; } case MAIN_CP_CB_DATA_STATE_TOTEM: if ((strcmp(path, "totem.version") == 0) || (strcmp(path, "totem.nodeid") == 0) || (strcmp(path, "totem.threads") == 0) || (strcmp(path, "totem.token") == 0) || (strcmp(path, "totem.token_retransmit") == 0) || (strcmp(path, "totem.hold") == 0) || (strcmp(path, "totem.token_retransmits_before_loss_const") == 0) || (strcmp(path, "totem.join") == 0) || (strcmp(path, "totem.send_join") == 0) || (strcmp(path, "totem.consensus") == 0) || (strcmp(path, "totem.merge") == 0) || (strcmp(path, "totem.downcheck") == 0) || (strcmp(path, "totem.fail_recv_const") == 0) || (strcmp(path, "totem.seqno_unchanged_const") == 0) || (strcmp(path, "totem.rrp_token_expired_timeout") == 0) || (strcmp(path, "totem.rrp_problem_count_timeout") == 0) || (strcmp(path, "totem.rrp_problem_count_threshold") == 0) || (strcmp(path, "totem.rrp_problem_count_mcast_threshold") == 0) || (strcmp(path, "totem.rrp_autorecovery_check_timeout") == 0) || (strcmp(path, "totem.heartbeat_failures_allowed") == 0) || (strcmp(path, "totem.max_network_delay") == 0) || (strcmp(path, "totem.window_size") == 0) || (strcmp(path, "totem.max_messages") == 0) || (strcmp(path, "totem.miss_count_const") == 0) || (strcmp(path, "totem.netmtu") == 0)) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } icmap_set_uint32(path, i); add_as_string = 0; } if (strcmp(path, "totem.config_version") == 0) { if (str_to_ull(value, &ull) != 0) { goto atoi_error; } icmap_set_uint64(path, ull); add_as_string = 0; } if (strcmp(path, "totem.crypto_type") == 0) { if ((strcmp(value, "nss") != 0) && - (strcmp(value, "aes256") != 0)) { + (strcmp(value, "aes256") != 0) && + (strcmp(value, "aes192") != 0) && + (strcmp(value, "aes128") != 0) && + (strcmp(value, "3des") != 0)) { *error_string = "Invalid crypto type"; return (0); } } if (strcmp(path, "totem.crypto_cipher") == 0) { if ((strcmp(value, "none") != 0) && - (strcmp(value, "aes256") != 0)) { + (strcmp(value, "aes256") != 0) && + (strcmp(value, "aes192") != 0) && + (strcmp(value, "aes128") != 0) && + (strcmp(value, "3des") != 0)) { *error_string = "Invalid cipher type"; return (0); } } if (strcmp(path, "totem.crypto_hash") == 0) { if ((strcmp(value, "none") != 0) && (strcmp(value, "md5") != 0) && (strcmp(value, "sha1") != 0) && (strcmp(value, "sha256") != 0) && (strcmp(value, "sha384") != 0) && (strcmp(value, "sha512") != 0)) { *error_string = "Invalid hash type"; return (0); } } break; case MAIN_CP_CB_DATA_STATE_INTERFACE: if (strcmp(path, "totem.interface.ringnumber") == 0) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } data->ringnumber = i; add_as_string = 0; } if (strcmp(path, "totem.interface.bindnetaddr") == 0) { data->bindnetaddr = strdup(value); add_as_string = 0; } if (strcmp(path, "totem.interface.mcastaddr") == 0) { data->mcastaddr = strdup(value); add_as_string = 0; } if (strcmp(path, "totem.interface.broadcast") == 0) { data->broadcast = strdup(value); add_as_string = 0; } if (strcmp(path, "totem.interface.mcastport") == 0) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } data->mcastport = i; if (data->mcastport < 0 || data->mcastport > 65535) { *error_string = "Invalid multicast port (should be 0..65535)"; return (0); }; add_as_string = 0; } if (strcmp(path, "totem.interface.ttl") == 0) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } data->ttl = i; if (data->ttl < 0 || data->ttl > 255) { *error_string = "Invalid TTL (should be 0..255)"; return (0); }; add_as_string = 0; } break; case MAIN_CP_CB_DATA_STATE_LOGGER_SUBSYS: if (strcmp(key, "subsys") == 0) { data->subsys = strdup(value); if (data->subsys == NULL) { *error_string = "Can't alloc memory"; return (0); } } else { kv_item = malloc(sizeof(*kv_item)); if (kv_item == NULL) { *error_string = "Can't alloc memory"; return (0); } memset(kv_item, 0, sizeof(*kv_item)); kv_item->key = strdup(key); kv_item->value = strdup(value); if (kv_item->key == NULL || kv_item->value == NULL) { free(kv_item); *error_string = "Can't alloc memory"; return (0); } list_init(&kv_item->list); list_add(&kv_item->list, &data->logger_subsys_items_head); } add_as_string = 0; break; case MAIN_CP_CB_DATA_STATE_LOGGING_DAEMON: if (strcmp(key, "subsys") == 0) { data->subsys = strdup(value); if (data->subsys == NULL) { *error_string = "Can't alloc memory"; return (0); } } else if (strcmp(key, "name") == 0) { data->logging_daemon_name = strdup(value); if (data->logging_daemon_name == NULL) { *error_string = "Can't alloc memory"; return (0); } } else { kv_item = malloc(sizeof(*kv_item)); if (kv_item == NULL) { *error_string = "Can't alloc memory"; return (0); } memset(kv_item, 0, sizeof(*kv_item)); kv_item->key = strdup(key); kv_item->value = strdup(value); if (kv_item->key == NULL || kv_item->value == NULL) { free(kv_item); *error_string = "Can't alloc memory"; return (0); } list_init(&kv_item->list); list_add(&kv_item->list, &data->logger_subsys_items_head); } add_as_string = 0; break; case MAIN_CP_CB_DATA_STATE_UIDGID: if (strcmp(key, "uid") == 0) { uid = uid_determine(value); if (uid == -1) { *error_string = error_string_response; return (0); } snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "uidgid.uid.%u", uid); icmap_set_uint8(key_name, 1); add_as_string = 0; } else if (strcmp(key, "gid") == 0) { gid = gid_determine(value); if (gid == -1) { *error_string = error_string_response; return (0); } snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "uidgid.gid.%u", gid); icmap_set_uint8(key_name, 1); add_as_string = 0; } else { *error_string = "uidgid: Only uid and gid are allowed items"; return (0); } break; case MAIN_CP_CB_DATA_STATE_MEMBER: if (strcmp(key, "memberaddr") != 0) { *error_string = "Only memberaddr is allowed in member section"; return (0); } kv_item = malloc(sizeof(*kv_item)); if (kv_item == NULL) { *error_string = "Can't alloc memory"; return (0); } memset(kv_item, 0, sizeof(*kv_item)); kv_item->key = strdup(key); kv_item->value = strdup(value); if (kv_item->key == NULL || kv_item->value == NULL) { free(kv_item); *error_string = "Can't alloc memory"; return (0); } list_init(&kv_item->list); list_add(&kv_item->list, &data->member_items_head); add_as_string = 0; break; case MAIN_CP_CB_DATA_STATE_NODELIST: break; case MAIN_CP_CB_DATA_STATE_NODELIST_NODE: snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "nodelist.node.%u.%s", data->node_number, key); if ((strcmp(key, "nodeid") == 0) || (strcmp(key, "quorum_votes") == 0)) { if (safe_atoi(value, &i) != 0) { goto atoi_error; } icmap_set_uint32(key_name, i); add_as_string = 0; } if (strcmp(key, "ring0_addr") == 0) { data->ring0_addr_added = 1; } if (add_as_string) { icmap_set_string(key_name, value); add_as_string = 0; } break; } if (add_as_string) { icmap_set_string(path, value); } break; case PARSER_CB_SECTION_START: if (strcmp(path, "totem.interface") == 0) { data->state = MAIN_CP_CB_DATA_STATE_INTERFACE; data->ringnumber = 0; data->mcastport = -1; data->ttl = -1; list_init(&data->member_items_head); }; if (strcmp(path, "totem") == 0) { data->state = MAIN_CP_CB_DATA_STATE_TOTEM; }; if (strcmp(path, "logging.logger_subsys") == 0) { data->state = MAIN_CP_CB_DATA_STATE_LOGGER_SUBSYS; list_init(&data->logger_subsys_items_head); data->subsys = NULL; } if (strcmp(path, "logging.logging_daemon") == 0) { data->state = MAIN_CP_CB_DATA_STATE_LOGGING_DAEMON; list_init(&data->logger_subsys_items_head); data->subsys = NULL; data->logging_daemon_name = NULL; } if (strcmp(path, "uidgid") == 0) { data->state = MAIN_CP_CB_DATA_STATE_UIDGID; } if (strcmp(path, "totem.interface.member") == 0) { data->state = MAIN_CP_CB_DATA_STATE_MEMBER; } if (strcmp(path, "quorum") == 0) { data->state = MAIN_CP_CB_DATA_STATE_QUORUM; } if (strcmp(path, "quorum.device") == 0) { data->state = MAIN_CP_CB_DATA_STATE_QDEVICE; } if (strcmp(path, "nodelist") == 0) { data->state = MAIN_CP_CB_DATA_STATE_NODELIST; data->node_number = 0; } if (strcmp(path, "nodelist.node") == 0) { data->state = MAIN_CP_CB_DATA_STATE_NODELIST_NODE; data->ring0_addr_added = 0; } break; case PARSER_CB_SECTION_END: switch (data->state) { case MAIN_CP_CB_DATA_STATE_NORMAL: break; case MAIN_CP_CB_DATA_STATE_PLOAD: data->state = MAIN_CP_CB_DATA_STATE_NORMAL; break; case MAIN_CP_CB_DATA_STATE_INTERFACE: /* * Create new interface section */ if (data->bindnetaddr != NULL) { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.bindnetaddr", data->ringnumber); icmap_set_string(key_name, data->bindnetaddr); free(data->bindnetaddr); data->bindnetaddr = NULL; } if (data->mcastaddr != NULL) { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.mcastaddr", data->ringnumber); icmap_set_string(key_name, data->mcastaddr); free(data->mcastaddr); data->mcastaddr = NULL; } if (data->broadcast != NULL) { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.broadcast", data->ringnumber); icmap_set_string(key_name, data->broadcast); free(data->broadcast); data->broadcast = NULL; } if (data->mcastport > -1) { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.mcastport", data->ringnumber); icmap_set_uint16(key_name, data->mcastport); } if (data->ttl > -1) { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.ttl", data->ringnumber); icmap_set_uint8(key_name, data->ttl); } i = 0; for (iter = data->member_items_head.next; iter != &data->member_items_head; iter = iter_next) { kv_item = list_entry(iter, struct key_value_list_item, list); snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.member.%u", data->ringnumber, i); icmap_set_string(key_name, kv_item->value); iter_next = iter->next; free(kv_item->value); free(kv_item->key); free(kv_item); i++; } data->state = MAIN_CP_CB_DATA_STATE_TOTEM; break; case MAIN_CP_CB_DATA_STATE_TOTEM: data->state = MAIN_CP_CB_DATA_STATE_NORMAL; break; case MAIN_CP_CB_DATA_STATE_LOGGER_SUBSYS: if (data->subsys == NULL) { *error_string = "No subsys key in logger_subsys directive"; return (0); } for (iter = data->logger_subsys_items_head.next; iter != &data->logger_subsys_items_head; iter = iter_next) { kv_item = list_entry(iter, struct key_value_list_item, list); snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.logger_subsys.%s.%s", data->subsys, kv_item->key); icmap_set_string(key_name, kv_item->value); iter_next = iter->next; free(kv_item->value); free(kv_item->key); free(kv_item); } snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.logger_subsys.%s.subsys", data->subsys); icmap_set_string(key_name, data->subsys); free(data->subsys); data->state = MAIN_CP_CB_DATA_STATE_NORMAL; break; case MAIN_CP_CB_DATA_STATE_LOGGING_DAEMON: if (data->logging_daemon_name == NULL) { *error_string = "No name key in logging_daemon directive"; return (0); } for (iter = data->logger_subsys_items_head.next; iter != &data->logger_subsys_items_head; iter = iter_next) { kv_item = list_entry(iter, struct key_value_list_item, list); if (data->subsys == NULL) { if (strcmp(data->logging_daemon_name, "corosync") == 0) { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.%s", kv_item->key); } else { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.logging_daemon.%s.%s", data->logging_daemon_name, kv_item->key); } } else { if (strcmp(data->logging_daemon_name, "corosync") == 0) { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.logger_subsys.%s.%s", data->subsys, kv_item->key); } else { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.logging_daemon.%s.%s.%s", data->logging_daemon_name, data->subsys, kv_item->key); } } icmap_set_string(key_name, kv_item->value); iter_next = iter->next; free(kv_item->value); free(kv_item->key); free(kv_item); } if (data->subsys == NULL) { if (strcmp(data->logging_daemon_name, "corosync") != 0) { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.logging_daemon.%s.name", data->logging_daemon_name); icmap_set_string(key_name, data->logging_daemon_name); } } else { if (strcmp(data->logging_daemon_name, "corosync") == 0) { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.logger_subsys.%s.subsys", data->subsys); icmap_set_string(key_name, data->subsys); } else { snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.logging_daemon.%s.%s.subsys", data->logging_daemon_name, data->subsys); icmap_set_string(key_name, data->subsys); snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "logging.logging_daemon.%s.%s.name", data->logging_daemon_name, data->subsys); icmap_set_string(key_name, data->logging_daemon_name); } } free(data->subsys); free(data->logging_daemon_name); data->state = MAIN_CP_CB_DATA_STATE_NORMAL; break; case MAIN_CP_CB_DATA_STATE_UIDGID: data->state = MAIN_CP_CB_DATA_STATE_UIDGID; break; case MAIN_CP_CB_DATA_STATE_MEMBER: data->state = MAIN_CP_CB_DATA_STATE_INTERFACE; break; case MAIN_CP_CB_DATA_STATE_QUORUM: data->state = MAIN_CP_CB_DATA_STATE_NORMAL; break; case MAIN_CP_CB_DATA_STATE_QDEVICE: data->state = MAIN_CP_CB_DATA_STATE_QUORUM; break; case MAIN_CP_CB_DATA_STATE_NODELIST: data->state = MAIN_CP_CB_DATA_STATE_NORMAL; break; case MAIN_CP_CB_DATA_STATE_NODELIST_NODE: if (!data->ring0_addr_added) { *error_string = "No ring0_addr specified for node"; return (0); } data->node_number++; data->state = MAIN_CP_CB_DATA_STATE_NODELIST; break; } break; } return (1); atoi_error: snprintf(formated_err, sizeof(formated_err), "Value of key \"%s\" must be integer, but \"%s\" was given", key, value); *error_string = formated_err; return (0); } static int uidgid_config_parser_cb(const char *path, char *key, char *value, enum parser_cb_type type, const char **error_string, void *user_data) { char key_name[ICMAP_KEYNAME_MAXLEN]; int uid, gid; switch (type) { case PARSER_CB_START: break; case PARSER_CB_END: break; case PARSER_CB_ITEM: if (strcmp(path, "uidgid.uid") == 0) { uid = uid_determine(value); if (uid == -1) { *error_string = error_string_response; return (0); } snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "uidgid.uid.%u", uid); icmap_set_uint8(key_name, 1); } else if (strcmp(path, "uidgid.gid") == 0) { gid = gid_determine(value); if (gid == -1) { *error_string = error_string_response; return (0); } snprintf(key_name, ICMAP_KEYNAME_MAXLEN, "uidgid.gid.%u", gid); icmap_set_uint8(key_name, 1); } else { *error_string = "uidgid: Only uid and gid are allowed items"; return (0); } break; case PARSER_CB_SECTION_START: if (strcmp(path, "uidgid") != 0) { *error_string = "uidgid: Can't add subsection different then uidgid"; return (0); }; break; case PARSER_CB_SECTION_END: break; } return (1); } static int read_uidgid_files_into_icmap( const char **error_string) { FILE *fp; const char *dirname; DIR *dp; struct dirent *dirent; struct dirent *entry; char filename[PATH_MAX + FILENAME_MAX + 1]; int res = 0; size_t len; int return_code; struct stat stat_buf; char key_name[ICMAP_KEYNAME_MAXLEN]; dirname = COROSYSCONFDIR "/uidgid.d"; dp = opendir (dirname); if (dp == NULL) return 0; len = offsetof(struct dirent, d_name) + FILENAME_MAX + 1; entry = malloc(len); if (entry == NULL) { res = 0; goto error_exit; } for (return_code = readdir_r(dp, entry, &dirent); dirent != NULL && return_code == 0; return_code = readdir_r(dp, entry, &dirent)) { snprintf(filename, sizeof (filename), "%s/%s", dirname, dirent->d_name); stat (filename, &stat_buf); if (S_ISREG(stat_buf.st_mode)) { fp = fopen (filename, "r"); if (fp == NULL) continue; key_name[0] = 0; res = parse_section(fp, key_name, error_string, uidgid_config_parser_cb, NULL); fclose (fp); if (res != 0) { goto error_exit; } } } error_exit: free (entry); closedir(dp); return res; } /* Read config file and load into icmap */ static int read_config_file_into_icmap( const char **error_string) { FILE *fp; const char *filename; char *error_reason = error_string_response; int res; char key_name[ICMAP_KEYNAME_MAXLEN]; struct main_cp_cb_data data; filename = getenv ("COROSYNC_MAIN_CONFIG_FILE"); if (!filename) filename = COROSYSCONFDIR "/corosync.conf"; fp = fopen (filename, "r"); if (fp == NULL) { char error_str[100]; const char *error_ptr = qb_strerror_r(errno, error_str, sizeof(error_str)); snprintf (error_reason, sizeof(error_string_response), "Can't read file %s reason = (%s)", filename, error_ptr); *error_string = error_reason; return -1; } key_name[0] = 0; res = parse_section(fp, key_name, error_string, main_config_parser_cb, &data); fclose(fp); if (res == 0) { res = read_uidgid_files_into_icmap(error_string); } if (res == 0) { snprintf (error_reason, sizeof(error_string_response), "Successfully read main configuration file '%s'.", filename); *error_string = error_reason; } return res; } diff --git a/exec/totemconfig.c b/exec/totemconfig.c index e1b9f80f..17d8e03b 100644 --- a/exec/totemconfig.c +++ b/exec/totemconfig.c @@ -1,1080 +1,1089 @@ /* * Copyright (c) 2002-2005 MontaVista Software, Inc. * Copyright (c) 2006-2012 Red Hat, Inc. * * All rights reserved. * * Author: Steven Dake (sdake@redhat.com) * Jan Friesse (jfriesse@redhat.com) * * This software licensed under BSD license, the text of which follows: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * - Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * - Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * - Neither the name of the MontaVista Software, Inc. nor the names of its * contributors may be used to endorse or promote products derived from this * software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #include <config.h> #include <stdio.h> #include <string.h> #include <stdlib.h> #include <errno.h> #include <unistd.h> #include <sys/socket.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <netinet/in.h> #include <arpa/inet.h> #include <sys/param.h> #include <corosync/swab.h> #include <corosync/list.h> #include <qb/qbdefs.h> #include <corosync/totem/totem.h> #include <corosync/config.h> #include <corosync/logsys.h> #include <corosync/icmap.h> #include "util.h" #include "totemconfig.h" #define TOKEN_RETRANSMITS_BEFORE_LOSS_CONST 4 #define TOKEN_TIMEOUT 1000 #define JOIN_TIMEOUT 50 #define MERGE_TIMEOUT 200 #define DOWNCHECK_TIMEOUT 1000 #define FAIL_TO_RECV_CONST 2500 #define SEQNO_UNCHANGED_CONST 30 #define MINIMUM_TIMEOUT (int)(1000/HZ)*3 #define MAX_NETWORK_DELAY 50 #define WINDOW_SIZE 50 #define MAX_MESSAGES 17 #define MISS_COUNT_CONST 5 #define RRP_PROBLEM_COUNT_TIMEOUT 2000 #define RRP_PROBLEM_COUNT_THRESHOLD_DEFAULT 10 #define RRP_PROBLEM_COUNT_THRESHOLD_MIN 2 #define RRP_AUTORECOVERY_CHECK_TIMEOUT 1000 #define DEFAULT_PORT 5405 static char error_string_response[512]; static void add_totem_config_notification(struct totem_config *totem_config); static void totem_volatile_config_read (struct totem_config *totem_config) { char *str; icmap_get_uint32("totem.token", &totem_config->token_timeout); icmap_get_uint32("totem.token_retransmit", &totem_config->token_retransmit_timeout); icmap_get_uint32("totem.hold", &totem_config->token_hold_timeout); icmap_get_uint32("totem.token_retransmits_before_loss_const", &totem_config->token_retransmits_before_loss_const); icmap_get_uint32("totem.join", &totem_config->join_timeout); icmap_get_uint32("totem.send_join", &totem_config->send_join_timeout); icmap_get_uint32("totem.consensus", &totem_config->consensus_timeout); icmap_get_uint32("totem.merge", &totem_config->merge_timeout); icmap_get_uint32("totem.downcheck", &totem_config->downcheck_timeout); icmap_get_uint32("totem.fail_recv_const", &totem_config->fail_to_recv_const); icmap_get_uint32("totem.seqno_unchanged_const", &totem_config->seqno_unchanged_const); icmap_get_uint32("totem.rrp_token_expired_timeout", &totem_config->rrp_token_expired_timeout); icmap_get_uint32("totem.rrp_problem_count_timeout", &totem_config->rrp_problem_count_timeout); icmap_get_uint32("totem.rrp_problem_count_threshold", &totem_config->rrp_problem_count_threshold); icmap_get_uint32("totem.rrp_problem_count_mcast_threshold", &totem_config->rrp_problem_count_mcast_threshold); icmap_get_uint32("totem.rrp_autorecovery_check_timeout", &totem_config->rrp_autorecovery_check_timeout); icmap_get_uint32("totem.heartbeat_failures_allowed", &totem_config->heartbeat_failures_allowed); icmap_get_uint32("totem.max_network_delay", &totem_config->max_network_delay); icmap_get_uint32("totem.window_size", &totem_config->window_size); icmap_get_uint32("totem.max_messages", &totem_config->max_messages); icmap_get_uint32("totem.miss_count_const", &totem_config->miss_count_const); if (icmap_get_string("totem.vsftype", &str) == CS_OK) { totem_config->vsf_type = str; } } static void totem_get_crypto(struct totem_config *totem_config) { char *str; const char *tmp_cipher; const char *tmp_hash; tmp_hash = "sha1"; tmp_cipher = "aes256"; if (icmap_get_string("totem.secauth", &str) == CS_OK) { if (strcmp (str, "off") == 0) { tmp_hash = "none"; tmp_cipher = "none"; } free(str); } if (icmap_get_string("totem.crypto_cipher", &str) == CS_OK) { if (strcmp(str, "none") == 0) { tmp_cipher = "none"; } if (strcmp(str, "aes256") == 0) { tmp_cipher = "aes256"; } + if (strcmp(str, "aes192") == 0) { + tmp_cipher = "aes192"; + } + if (strcmp(str, "aes128") == 0) { + tmp_cipher = "aes128"; + } + if (strcmp(str, "3des") == 0) { + tmp_cipher = "3des"; + } free(str); } if (icmap_get_string("totem.crypto_hash", &str) == CS_OK) { if (strcmp(str, "none") == 0) { tmp_hash = "none"; } if (strcmp(str, "md5") == 0) { tmp_hash = "md5"; } if (strcmp(str, "sha1") == 0) { tmp_hash = "sha1"; } if (strcmp(str, "sha256") == 0) { tmp_hash = "sha256"; } if (strcmp(str, "sha384") == 0) { tmp_hash = "sha384"; } if (strcmp(str, "sha512") == 0) { tmp_hash = "sha512"; } free(str); } free(totem_config->crypto_cipher_type); free(totem_config->crypto_hash_type); totem_config->crypto_cipher_type = strdup(tmp_cipher); totem_config->crypto_hash_type = strdup(tmp_hash); } static uint16_t generate_cluster_id (const char *cluster_name) { int i; int value = 0; for (i = 0; i < strlen(cluster_name); i++) { value <<= 1; value += cluster_name[i]; } return (value & 0xFFFF); } static int get_cluster_mcast_addr ( const char *cluster_name, const struct totem_ip_address *bindnet, unsigned int ringnumber, struct totem_ip_address *res) { uint16_t clusterid; char addr[INET6_ADDRSTRLEN + 1]; int err; if (cluster_name == NULL) { return (-1); } clusterid = generate_cluster_id(cluster_name) + ringnumber; memset (res, 0, sizeof(res)); switch (bindnet->family) { case AF_INET: snprintf(addr, sizeof(addr), "239.192.%d.%d", clusterid >> 8, clusterid % 0xFF); break; case AF_INET6: snprintf(addr, sizeof(addr), "ff15::%x", clusterid); break; default: /* * Unknown family */ return (-1); } err = totemip_parse (res, addr, 0); return (err); } static int find_local_node_in_nodelist(struct totem_config *totem_config) { icmap_iter_t iter; const char *iter_key; int res = 0; int node_pos; int local_node_pos = -1; struct totem_ip_address bind_addr; int interface_up, interface_num; char tmp_key[ICMAP_KEYNAME_MAXLEN]; char *node_addr_str; struct totem_ip_address node_addr; res = totemip_iface_check(&totem_config->interfaces[0].bindnet, &bind_addr, &interface_up, &interface_num, totem_config->clear_node_high_bit); if (res == -1) { return (-1); } iter = icmap_iter_init("nodelist.node."); while ((iter_key = icmap_iter_next(iter, NULL, NULL)) != NULL) { res = sscanf(iter_key, "nodelist.node.%u.%s", &node_pos, tmp_key); if (res != 2) { continue; } if (strcmp(tmp_key, "ring0_addr") != 0) { continue; } snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "nodelist.node.%u.ring0_addr", node_pos); if (icmap_get_string(tmp_key, &node_addr_str) != CS_OK) { continue; } res = totemip_parse (&node_addr, node_addr_str, 0); free(node_addr_str); if (res == -1) { continue ; } if (totemip_equal(&bind_addr, &node_addr)) { local_node_pos = node_pos; } } icmap_iter_finalize(iter); return (local_node_pos); } static void put_nodelist_members_to_config(struct totem_config *totem_config) { icmap_iter_t iter, iter2; const char *iter_key, *iter_key2; int res = 0; int node_pos; char tmp_key[ICMAP_KEYNAME_MAXLEN]; char tmp_key2[ICMAP_KEYNAME_MAXLEN]; char *node_addr_str; int member_count; unsigned int ringnumber = 0; iter = icmap_iter_init("nodelist.node."); while ((iter_key = icmap_iter_next(iter, NULL, NULL)) != NULL) { res = sscanf(iter_key, "nodelist.node.%u.%s", &node_pos, tmp_key); if (res != 2) { continue; } if (strcmp(tmp_key, "ring0_addr") != 0) { continue; } snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "nodelist.node.%u.", node_pos); iter2 = icmap_iter_init(tmp_key); while ((iter_key2 = icmap_iter_next(iter2, NULL, NULL)) != NULL) { res = sscanf(iter_key2, "nodelist.node.%u.ring%u%s", &node_pos, &ringnumber, tmp_key2); if (res != 3 || strcmp(tmp_key2, "_addr") != 0) { continue; } if (icmap_get_string(iter_key2, &node_addr_str) != CS_OK) { continue; } member_count = totem_config->interfaces[ringnumber].member_count; res = totemip_parse(&totem_config->interfaces[ringnumber].member_list[member_count], node_addr_str, 0); if (res != -1) { totem_config->interfaces[ringnumber].member_count++; } free(node_addr_str); } icmap_iter_finalize(iter2); } icmap_iter_finalize(iter); } static void config_convert_nodelist_to_interface(struct totem_config *totem_config) { icmap_iter_t iter; const char *iter_key; int res = 0; int node_pos; char tmp_key[ICMAP_KEYNAME_MAXLEN]; char tmp_key2[ICMAP_KEYNAME_MAXLEN]; char *node_addr_str; unsigned int ringnumber = 0; struct list_head addrs; struct list_head *list; struct totem_ip_if_address *if_addr; struct totem_ip_address node_addr; int node_found; if (totemip_getifaddrs(&addrs) == -1) { return ; } iter = icmap_iter_init("nodelist.node."); while ((iter_key = icmap_iter_next(iter, NULL, NULL)) != NULL) { res = sscanf(iter_key, "nodelist.node.%u.%s", &node_pos, tmp_key); if (res != 2) { continue; } if (strcmp(tmp_key, "ring0_addr") != 0) { continue; } if (icmap_get_string(iter_key, &node_addr_str) != CS_OK) { continue ; } if (totemip_parse(&node_addr, node_addr_str, 0) == -1) { free(node_addr_str); continue ; } free(node_addr_str); /* * Try to find node in if_addrs */ node_found = 0; for (list = addrs.next; list != &addrs; list = list->next) { if_addr = list_entry(list, struct totem_ip_if_address, list); if (totemip_equal(&node_addr, &if_addr->ip_addr)) { node_found = 1; break; } } if (node_found) { break ; } } icmap_iter_finalize(iter); if (node_found) { /* * We found node, so create interface section */ snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "nodelist.node.%u.", node_pos); iter = icmap_iter_init(tmp_key); while ((iter_key = icmap_iter_next(iter, NULL, NULL)) != NULL) { res = sscanf(iter_key, "nodelist.node.%u.ring%u%s", &node_pos, &ringnumber, tmp_key2); if (res != 3 || strcmp(tmp_key2, "_addr") != 0) { continue ; } if (icmap_get_string(iter_key, &node_addr_str) != CS_OK) { continue; } snprintf(tmp_key2, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.bindnetaddr", ringnumber); icmap_set_string(tmp_key2, node_addr_str); free(node_addr_str); } icmap_iter_finalize(iter); } } extern int totem_config_read ( struct totem_config *totem_config, const char **error_string, uint64_t *warnings) { int res = 0; char *str; unsigned int ringnumber = 0; int member_count = 0; icmap_iter_t iter, member_iter; const char *iter_key; const char *member_iter_key; char ringnumber_key[ICMAP_KEYNAME_MAXLEN]; char tmp_key[ICMAP_KEYNAME_MAXLEN]; uint8_t u8; uint16_t u16; char *cluster_name = NULL; int i; int local_node_pos; int nodeid_set; *warnings = 0; memset (totem_config, 0, sizeof (struct totem_config)); totem_config->interfaces = malloc (sizeof (struct totem_interface) * INTERFACE_MAX); if (totem_config->interfaces == 0) { *error_string = "Out of memory trying to allocate ethernet interface storage area"; return -1; } memset (totem_config->interfaces, 0, sizeof (struct totem_interface) * INTERFACE_MAX); strcpy (totem_config->rrp_mode, "none"); icmap_get_uint32("totem.version", (uint32_t *)&totem_config->version); totem_get_crypto(totem_config); if (icmap_get_string("totem.rrp_mode", &str) == CS_OK) { strcpy (totem_config->rrp_mode, str); free(str); } icmap_get_uint32("totem.nodeid", &totem_config->node_id); totem_config->clear_node_high_bit = 0; if (icmap_get_string("totem.clear_node_high_bit", &str) == CS_OK) { if (strcmp (str, "yes") == 0) { totem_config->clear_node_high_bit = 1; } free(str); } icmap_get_uint32("totem.threads", &totem_config->threads); icmap_get_uint32("totem.netmtu", &totem_config->net_mtu); icmap_get_string("totem.cluster_name", &cluster_name); /* * Get things that might change in the future */ totem_volatile_config_read(totem_config); if (icmap_get_string("totem.interface.0.bindnetaddr", &str) != CS_OK) { /* * We were not able to find ring 0 bindnet addr. Try to use nodelist informations */ config_convert_nodelist_to_interface(totem_config); } else { free(str); } iter = icmap_iter_init("totem.interface."); while ((iter_key = icmap_iter_next(iter, NULL, NULL)) != NULL) { res = sscanf(iter_key, "totem.interface.%[^.].%s", ringnumber_key, tmp_key); if (res != 2) { continue; } if (strcmp(tmp_key, "bindnetaddr") != 0) { continue; } member_count = 0; ringnumber = atoi(ringnumber_key); if (ringnumber >= INTERFACE_MAX) { snprintf (error_string_response, sizeof(error_string_response), "parse error in config: interface ring number %u is bigger then allowed maximum %u\n", ringnumber, INTERFACE_MAX - 1); *error_string = error_string_response; return -1; } /* * Get the bind net address */ if (icmap_get_string(iter_key, &str) == CS_OK) { res = totemip_parse (&totem_config->interfaces[ringnumber].bindnet, str, totem_config->interfaces[ringnumber].mcast_addr.family); free(str); } /* * Get interface multicast address */ snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.mcastaddr", ringnumber); if (icmap_get_string(tmp_key, &str) == CS_OK) { res = totemip_parse (&totem_config->interfaces[ringnumber].mcast_addr, str, 0); free(str); } else { /* * User not specified address -> autogenerate one from cluster_name key * (if available) */ res = get_cluster_mcast_addr (cluster_name, &totem_config->interfaces[ringnumber].bindnet, ringnumber, &totem_config->interfaces[ringnumber].mcast_addr); } totem_config->broadcast_use = 0; snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.broadcast", ringnumber); if (icmap_get_string(tmp_key, &str) == CS_OK) { if (strcmp (str, "yes") == 0) { totem_config->broadcast_use = 1; totemip_parse ( &totem_config->interfaces[ringnumber].mcast_addr, "255.255.255.255", 0); } free(str); } /* * Get mcast port */ snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.mcastport", ringnumber); if (icmap_get_uint16(tmp_key, &totem_config->interfaces[ringnumber].ip_port) != CS_OK) { if (totem_config->broadcast_use) { totem_config->interfaces[ringnumber].ip_port = DEFAULT_PORT + (2 * ringnumber); } else { totem_config->interfaces[ringnumber].ip_port = DEFAULT_PORT; } } /* * Get the TTL */ totem_config->interfaces[ringnumber].ttl = 1; snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.ttl", ringnumber); if (icmap_get_uint8(tmp_key, &u8) == CS_OK) { totem_config->interfaces[ringnumber].ttl = u8; } snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.member.", ringnumber); member_iter = icmap_iter_init(tmp_key); while ((member_iter_key = icmap_iter_next(member_iter, NULL, NULL)) != NULL) { if (member_count == 0) { if (icmap_get_string("nodelist.node.0.ring0_addr", &str) == CS_OK) { free(str); *warnings |= TOTEM_CONFIG_WARNING_MEMBERS_IGNORED; break; } else { *warnings |= TOTEM_CONFIG_WARNING_MEMBERS_DEPRECATED; } } if (icmap_get_string(member_iter_key, &str) == CS_OK) { res = totemip_parse (&totem_config->interfaces[ringnumber].member_list[member_count++], str, 0); } } icmap_iter_finalize(member_iter); totem_config->interfaces[ringnumber].member_count = member_count; totem_config->interface_count++; } icmap_iter_finalize(iter); /* * Store automatically generated items back to icmap */ for (i = 0; i < totem_config->interface_count; i++) { snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.mcastaddr", i); if (icmap_get_string(tmp_key, &str) == CS_OK) { free(str); } else { str = (char *)totemip_print(&totem_config->interfaces[i].mcast_addr); icmap_set_string(tmp_key, str); } snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "totem.interface.%u.mcastport", i); if (icmap_get_uint16(tmp_key, &u16) != CS_OK) { icmap_set_uint16(tmp_key, totem_config->interfaces[i].ip_port); } } totem_config->transport_number = TOTEM_TRANSPORT_UDP; if (icmap_get_string("totem.transport", &str) == CS_OK) { if (strcmp (str, "udpu") == 0) { totem_config->transport_number = TOTEM_TRANSPORT_UDPU; } if (strcmp (str, "iba") == 0) { totem_config->transport_number = TOTEM_TRANSPORT_RDMA; } free(str); } free(cluster_name); /* * Check existence of nodelist */ if (icmap_get_string("nodelist.node.0.ring0_addr", &str) == CS_OK) { free(str); /* * find local node */ local_node_pos = find_local_node_in_nodelist(totem_config); if (local_node_pos != -1) { icmap_set_uint32("nodelist.local_node_pos", local_node_pos); snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "nodelist.node.%u.nodeid", local_node_pos); nodeid_set = (totem_config->node_id != 0); if (icmap_get_uint32(tmp_key, &totem_config->node_id) == CS_OK && nodeid_set) { *warnings |= TOTEM_CONFIG_WARNING_TOTEM_NODEID_IGNORED; } /* * Make localnode ring0_addr read only, so we can be sure that local * node never changes. If rebinding to other IP would be in future * supported, this must be changed and handled properly! */ snprintf(tmp_key, ICMAP_KEYNAME_MAXLEN, "nodelist.node.%u.ring0_addr", local_node_pos); icmap_set_ro_access(tmp_key, 0, 1); icmap_set_ro_access("nodelist.local_node_pos", 0, 1); } put_nodelist_members_to_config(totem_config); } add_totem_config_notification(totem_config); return 0; } int totem_config_validate ( struct totem_config *totem_config, const char **error_string) { static char local_error_reason[512]; char parse_error[512]; const char *error_reason = local_error_reason; int i; unsigned int interface_max = INTERFACE_MAX; if (totem_config->interface_count == 0) { error_reason = "No interfaces defined"; goto parse_error; } for (i = 0; i < totem_config->interface_count; i++) { /* * Some error checking of parsed data to make sure its valid */ struct totem_ip_address null_addr; memset (&null_addr, 0, sizeof (struct totem_ip_address)); if ((totem_config->transport_number == 0) && memcmp (&totem_config->interfaces[i].mcast_addr, &null_addr, sizeof (struct totem_ip_address)) == 0) { error_reason = "No multicast address specified"; goto parse_error; } if (totem_config->interfaces[i].ip_port == 0) { error_reason = "No multicast port specified"; goto parse_error; } if (totem_config->interfaces[i].ttl > 255) { error_reason = "Invalid TTL (should be 0..255)"; goto parse_error; } if (totem_config->transport_number != TOTEM_TRANSPORT_UDP && totem_config->interfaces[i].ttl != 1) { error_reason = "Can only set ttl on multicast transport types"; goto parse_error; } if (totem_config->interfaces[i].mcast_addr.family == AF_INET6 && totem_config->node_id == 0) { error_reason = "An IPV6 network requires that a node ID be specified."; goto parse_error; } if (totem_config->broadcast_use == 0 && totem_config->transport_number == 0) { if (totem_config->interfaces[i].mcast_addr.family != totem_config->interfaces[i].bindnet.family) { error_reason = "Multicast address family does not match bind address family"; goto parse_error; } if (totem_config->interfaces[i].mcast_addr.family != totem_config->interfaces[i].bindnet.family) { error_reason = "Not all bind address belong to the same IP family"; goto parse_error; } if (totemip_is_mcast (&totem_config->interfaces[i].mcast_addr) != 0) { error_reason = "mcastaddr is not a correct multicast address."; goto parse_error; } } } if (totem_config->version != 2) { error_reason = "This totem parser can only parse version 2 configurations."; goto parse_error; } if (totem_config->token_retransmits_before_loss_const == 0) { totem_config->token_retransmits_before_loss_const = TOKEN_RETRANSMITS_BEFORE_LOSS_CONST; } /* * Setup timeout values that are not setup by user */ if (totem_config->token_timeout == 0) { totem_config->token_timeout = TOKEN_TIMEOUT; } if (totem_config->max_network_delay == 0) { totem_config->max_network_delay = MAX_NETWORK_DELAY; } if (totem_config->max_network_delay < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The max_network_delay parameter (%d ms) may not be less then (%d ms).", totem_config->max_network_delay, MINIMUM_TIMEOUT); goto parse_error; } if (totem_config->window_size == 0) { totem_config->window_size = WINDOW_SIZE; } if (totem_config->max_messages == 0) { totem_config->max_messages = MAX_MESSAGES; } if (totem_config->miss_count_const == 0) { totem_config->miss_count_const = MISS_COUNT_CONST; } if (totem_config->token_timeout < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The token timeout parameter (%d ms) may not be less then (%d ms).", totem_config->token_timeout, MINIMUM_TIMEOUT); goto parse_error; } if (totem_config->token_retransmit_timeout == 0) { totem_config->token_retransmit_timeout = (int)(totem_config->token_timeout / (totem_config->token_retransmits_before_loss_const + 0.2)); } if (totem_config->token_hold_timeout == 0) { totem_config->token_hold_timeout = (int)(totem_config->token_retransmit_timeout * 0.8 - (1000/HZ)); } if (totem_config->token_retransmit_timeout < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The token retransmit timeout parameter (%d ms) may not be less then (%d ms).", totem_config->token_retransmit_timeout, MINIMUM_TIMEOUT); goto parse_error; } if (totem_config->token_hold_timeout < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The token hold timeout parameter (%d ms) may not be less then (%d ms).", totem_config->token_hold_timeout, MINIMUM_TIMEOUT); goto parse_error; } if (totem_config->join_timeout == 0) { totem_config->join_timeout = JOIN_TIMEOUT; } if (totem_config->join_timeout < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The join timeout parameter (%d ms) may not be less then (%d ms).", totem_config->join_timeout, MINIMUM_TIMEOUT); goto parse_error; } if (totem_config->consensus_timeout == 0) { totem_config->consensus_timeout = (int)(float)(1.2 * totem_config->token_timeout); } if (totem_config->consensus_timeout < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The consensus timeout parameter (%d ms) may not be less then (%d ms).", totem_config->consensus_timeout, MINIMUM_TIMEOUT); goto parse_error; } if (totem_config->merge_timeout == 0) { totem_config->merge_timeout = MERGE_TIMEOUT; } if (totem_config->merge_timeout < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The merge timeout parameter (%d ms) may not be less then (%d ms).", totem_config->merge_timeout, MINIMUM_TIMEOUT); goto parse_error; } if (totem_config->downcheck_timeout == 0) { totem_config->downcheck_timeout = DOWNCHECK_TIMEOUT; } if (totem_config->downcheck_timeout < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The downcheck timeout parameter (%d ms) may not be less then (%d ms).", totem_config->downcheck_timeout, MINIMUM_TIMEOUT); goto parse_error; } /* * RRP values validation */ if (strcmp (totem_config->rrp_mode, "none") && strcmp (totem_config->rrp_mode, "active") && strcmp (totem_config->rrp_mode, "passive")) { snprintf (local_error_reason, sizeof(local_error_reason), "The RRP mode \"%s\" specified is invalid. It must be none, active, or passive.\n", totem_config->rrp_mode); goto parse_error; } if (totem_config->rrp_problem_count_timeout == 0) { totem_config->rrp_problem_count_timeout = RRP_PROBLEM_COUNT_TIMEOUT; } if (totem_config->rrp_problem_count_timeout < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The RRP problem count timeout parameter (%d ms) may not be less then (%d ms).", totem_config->rrp_problem_count_timeout, MINIMUM_TIMEOUT); goto parse_error; } if (totem_config->rrp_problem_count_threshold == 0) { totem_config->rrp_problem_count_threshold = RRP_PROBLEM_COUNT_THRESHOLD_DEFAULT; } if (totem_config->rrp_problem_count_mcast_threshold == 0) { totem_config->rrp_problem_count_mcast_threshold = totem_config->rrp_problem_count_threshold * 10; } if (totem_config->rrp_problem_count_threshold < RRP_PROBLEM_COUNT_THRESHOLD_MIN) { snprintf (local_error_reason, sizeof(local_error_reason), "The RRP problem count threshold (%d problem count) may not be less then (%d problem count).", totem_config->rrp_problem_count_threshold, RRP_PROBLEM_COUNT_THRESHOLD_MIN); goto parse_error; } if (totem_config->rrp_problem_count_mcast_threshold < RRP_PROBLEM_COUNT_THRESHOLD_MIN) { snprintf (local_error_reason, sizeof(local_error_reason), "The RRP multicast problem count threshold (%d problem count) may not be less then (%d problem count).", totem_config->rrp_problem_count_mcast_threshold, RRP_PROBLEM_COUNT_THRESHOLD_MIN); goto parse_error; } if (totem_config->rrp_token_expired_timeout == 0) { totem_config->rrp_token_expired_timeout = totem_config->token_retransmit_timeout; } if (totem_config->rrp_token_expired_timeout < MINIMUM_TIMEOUT) { snprintf (local_error_reason, sizeof(local_error_reason), "The RRP token expired timeout parameter (%d ms) may not be less then (%d ms).", totem_config->rrp_token_expired_timeout, MINIMUM_TIMEOUT); goto parse_error; } if (totem_config->rrp_autorecovery_check_timeout == 0) { totem_config->rrp_autorecovery_check_timeout = RRP_AUTORECOVERY_CHECK_TIMEOUT; } if (strcmp (totem_config->rrp_mode, "none") == 0) { interface_max = 1; } if (interface_max < totem_config->interface_count) { snprintf (parse_error, sizeof(parse_error), "%d is too many configured interfaces for the rrp_mode setting %s.", totem_config->interface_count, totem_config->rrp_mode); error_reason = parse_error; goto parse_error; } if (totem_config->fail_to_recv_const == 0) { totem_config->fail_to_recv_const = FAIL_TO_RECV_CONST; } if (totem_config->seqno_unchanged_const == 0) { totem_config->seqno_unchanged_const = SEQNO_UNCHANGED_CONST; } if (totem_config->net_mtu == 0) { totem_config->net_mtu = 1500; } if ((MESSAGE_QUEUE_MAX) < totem_config->max_messages) { snprintf (local_error_reason, sizeof(local_error_reason), "The max_messages parameter (%d messages) may not be greater then (%d messages).", totem_config->max_messages, MESSAGE_QUEUE_MAX); goto parse_error; } if (totem_config->threads > SEND_THREADS_MAX) { totem_config->threads = SEND_THREADS_MAX; } if (totem_config->net_mtu > FRAME_SIZE_MAX) { error_reason = "This net_mtu parameter is greater then the maximum frame size"; goto parse_error; } if (totem_config->vsf_type == NULL) { totem_config->vsf_type = "none"; } return (0); parse_error: snprintf (error_string_response, sizeof(error_string_response), "parse error in config: %s\n", error_reason); *error_string = error_string_response; return (-1); } static int read_keyfile ( const char *key_location, struct totem_config *totem_config, const char **error_string) { int fd; int res; ssize_t expected_key_len = sizeof (totem_config->private_key); int saved_errno; char error_str[100]; const char *error_ptr; fd = open (key_location, O_RDONLY); if (fd == -1) { error_ptr = qb_strerror_r(errno, error_str, sizeof(error_str)); snprintf (error_string_response, sizeof(error_string_response), "Could not open %s: %s\n", key_location, error_ptr); goto parse_error; } res = read (fd, totem_config->private_key, expected_key_len); saved_errno = errno; close (fd); if (res == -1) { error_ptr = qb_strerror_r (saved_errno, error_str, sizeof(error_str)); snprintf (error_string_response, sizeof(error_string_response), "Could not read %s: %s\n", key_location, error_ptr); goto parse_error; } totem_config->private_key_len = expected_key_len; if (res != expected_key_len) { snprintf (error_string_response, sizeof(error_string_response), "Could only read %d bits of 1024 bits from %s.\n", res * 8, key_location); goto parse_error; } return 0; parse_error: *error_string = error_string_response; return (-1); } int totem_config_keyread ( struct totem_config *totem_config, const char **error_string) { int got_key = 0; char *key_location = NULL; int res; size_t key_len; memset (totem_config->private_key, 0, 128); totem_config->private_key_len = 128; if (strcmp(totem_config->crypto_cipher_type, "none") == 0 && strcmp(totem_config->crypto_hash_type, "none") == 0) { return (0); } /* cmap may store the location of the key file */ if (icmap_get_string("totem.keyfile", &key_location) == CS_OK) { res = read_keyfile(key_location, totem_config, error_string); free(key_location); if (res) { goto key_error; } got_key = 1; } else { /* Or the key itself may be in the cmap */ if (icmap_get("totem.key", NULL, &key_len, NULL) == CS_OK) { if (key_len > sizeof (totem_config->private_key)) { sprintf(error_string_response, "key is too long"); goto key_error; } if (icmap_get("totem.key", totem_config->private_key, &key_len, NULL) == CS_OK) { totem_config->private_key_len = key_len; got_key = 1; } else { sprintf(error_string_response, "can't store private key"); goto key_error; } } } /* In desperation we read the default filename */ if (!got_key) { const char *filename = getenv("COROSYNC_TOTEM_AUTHKEY_FILE"); if (!filename) filename = COROSYSCONFDIR "/authkey"; res = read_keyfile(filename, totem_config, error_string); if (res) goto key_error; } return (0); key_error: *error_string = error_string_response; return (-1); } static void totem_change_notify( int32_t event, const char *key_name, struct icmap_notify_value new_val, struct icmap_notify_value old_val, void *user_data) { totem_volatile_config_read((struct totem_config *)user_data); } static void add_totem_config_notification(struct totem_config *totem_config) { icmap_track_t icmap_track; icmap_track_add("totem.", ICMAP_TRACK_ADD | ICMAP_TRACK_DELETE | ICMAP_TRACK_MODIFY | ICMAP_TRACK_PREFIX, totem_change_notify, totem_config, &icmap_track); } diff --git a/exec/totemcrypto.c b/exec/totemcrypto.c index dc6b863d..a9ea34d7 100644 --- a/exec/totemcrypto.c +++ b/exec/totemcrypto.c @@ -1,737 +1,755 @@ /* * Copyright (c) 2006-2012 Red Hat, Inc. * * All rights reserved. * * Author: Steven Dake (sdake@redhat.com) * Christine Caulfield (ccaulfie@redhat.com) * Jan Friesse (jfriesse@redhat.com) * Fabio M. Di Nitto (fdinitto@redhat.com) * * This software licensed under BSD license, the text of which follows: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions are met: * * - Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * - Redistributions in binary form must reproduce the above copyright notice, * this list of conditions and the following disclaimer in the documentation * and/or other materials provided with the distribution. * - Neither the name of the MontaVista Software, Inc. nor the names of its * contributors may be used to endorse or promote products derived from this * software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #include "config.h" #include <nss.h> #include <pk11pub.h> #include <pkcs11.h> #include <prerror.h> #include <blapit.h> #include <hasht.h> #define LOGSYS_UTILS_ONLY 1 #include <corosync/logsys.h> #include <corosync/totem/totem.h> #include "totemcrypto.h" /* * define onwire crypto header */ struct crypto_config_header { uint8_t crypto_cipher_type; uint8_t crypto_hash_type; uint8_t __pad0; uint8_t __pad1; } __attribute__((packed)); /* * crypto definitions and conversion tables */ #define SALT_SIZE 16 enum crypto_crypt_t { CRYPTO_CIPHER_TYPE_NONE = 0, - CRYPTO_CIPHER_TYPE_AES256 = 1 + CRYPTO_CIPHER_TYPE_AES256 = 1, + CRYPTO_CIPHER_TYPE_AES192 = 2, + CRYPTO_CIPHER_TYPE_AES128 = 3, + CRYPTO_CIPHER_TYPE_3DES = 4 }; CK_MECHANISM_TYPE cipher_to_nss[] = { 0, /* CRYPTO_CIPHER_TYPE_NONE */ - CKM_AES_CBC_PAD /* CRYPTO_CIPHER_TYPE_AES256 */ + CKM_AES_CBC_PAD, /* CRYPTO_CIPHER_TYPE_AES256 */ + CKM_AES_CBC_PAD, /* CRYPTO_CIPHER_TYPE_AES192 */ + CKM_AES_CBC_PAD, /* CRYPTO_CIPHER_TYPE_AES128 */ + CKM_DES3_CBC_PAD /* CRYPTO_CIPHER_TYPE_3DES */ }; size_t cipher_key_len[] = { - 0, /* CRYPTO_CIPHER_TYPE_NONE */ - 32, /* CRYPTO_CIPHER_TYPE_AES256 */ + 0, /* CRYPTO_CIPHER_TYPE_NONE */ + AES_256_KEY_LENGTH, /* CRYPTO_CIPHER_TYPE_AES256 */ + AES_192_KEY_LENGTH, /* CRYPTO_CIPHER_TYPE_AES192 */ + AES_128_KEY_LENGTH, /* CRYPTO_CIPHER_TYPE_AES128 */ + 24 /* CRYPTO_CIPHER_TYPE_3DES - no magic in nss headers */ }; size_t cypher_block_len[] = { - 0, /* CRYPTO_CIPHER_TYPE_NONE */ - AES_BLOCK_SIZE /* CRYPTO_CIPHER_TYPE_AES256 */ + 0, /* CRYPTO_CIPHER_TYPE_NONE */ + AES_BLOCK_SIZE, /* CRYPTO_CIPHER_TYPE_AES256 */ + AES_BLOCK_SIZE, /* CRYPTO_CIPHER_TYPE_AES192 */ + AES_BLOCK_SIZE, /* CRYPTO_CIPHER_TYPE_AES128 */ + 0 /* CRYPTO_CIPHER_TYPE_3DES */ }; /* * hash definitions and conversion tables */ enum crypto_hash_t { CRYPTO_HASH_TYPE_NONE = 0, CRYPTO_HASH_TYPE_MD5 = 1, CRYPTO_HASH_TYPE_SHA1 = 2, CRYPTO_HASH_TYPE_SHA256 = 3, CRYPTO_HASH_TYPE_SHA384 = 4, CRYPTO_HASH_TYPE_SHA512 = 5 }; CK_MECHANISM_TYPE hash_to_nss[] = { - 0, /* CRYPTO_HASH_TYPE_NONE */ + 0, /* CRYPTO_HASH_TYPE_NONE */ CKM_MD5_HMAC, /* CRYPTO_HASH_TYPE_MD5 */ CKM_SHA_1_HMAC, /* CRYPTO_HASH_TYPE_SHA1 */ CKM_SHA256_HMAC, /* CRYPTO_HASH_TYPE_SHA256 */ CKM_SHA384_HMAC, /* CRYPTO_HASH_TYPE_SHA384 */ CKM_SHA512_HMAC /* CRYPTO_HASH_TYPE_SHA512 */ }; size_t hash_len[] = { - 0, /* CRYPTO_HASH_TYPE_NONE */ + 0, /* CRYPTO_HASH_TYPE_NONE */ MD5_LENGTH, /* CRYPTO_HASH_TYPE_MD5 */ SHA1_LENGTH, /* CRYPTO_HASH_TYPE_SHA1 */ SHA256_LENGTH, /* CRYPTO_HASH_TYPE_SHA256 */ SHA384_LENGTH, /* CRYPTO_HASH_TYPE_SHA384 */ SHA512_LENGTH /* CRYPTO_HASH_TYPE_SHA512 */ }; size_t hash_block_len[] = { - 0, /* CRYPTO_HASH_TYPE_NONE */ + 0, /* CRYPTO_HASH_TYPE_NONE */ MD5_BLOCK_LENGTH, /* CRYPTO_HASH_TYPE_MD5 */ SHA1_BLOCK_LENGTH, /* CRYPTO_HASH_TYPE_SHA1 */ SHA256_BLOCK_LENGTH, /* CRYPTO_HASH_TYPE_SHA256 */ SHA384_BLOCK_LENGTH, /* CRYPTO_HASH_TYPE_SHA384 */ SHA512_BLOCK_LENGTH /* CRYPTO_HASH_TYPE_SHA512 */ }; struct crypto_instance { PK11SymKey *nss_sym_key; PK11SymKey *nss_sym_key_sign; unsigned char private_key[1024]; unsigned int private_key_len; enum crypto_crypt_t crypto_cipher_type; enum crypto_hash_t crypto_hash_type; unsigned int crypto_header_size; void (*log_printf_func) ( int level, int subsys, const char *function, const char *file, int line, const char *format, ...)__attribute__((format(printf, 6, 7))); int log_level_security; int log_level_notice; int log_level_error; int log_subsys_id; }; #define log_printf(level, format, args...) \ do { \ instance->log_printf_func ( \ level, instance->log_subsys_id, \ __FUNCTION__, __FILE__, __LINE__, \ (const char *)format, ##args); \ } while (0); /* * crypt/decrypt functions */ static int string_to_crypto_cipher_type(const char* crypto_cipher_type) { if (strcmp(crypto_cipher_type, "none") == 0) { return CRYPTO_CIPHER_TYPE_NONE; } else if (strcmp(crypto_cipher_type, "aes256") == 0) { return CRYPTO_CIPHER_TYPE_AES256; + } else if (strcmp(crypto_cipher_type, "aes192") == 0) { + return CRYPTO_CIPHER_TYPE_AES192; + } else if (strcmp(crypto_cipher_type, "aes128") == 0) { + return CRYPTO_CIPHER_TYPE_AES128; + } else if (strcmp(crypto_cipher_type, "3des") == 0) { + return CRYPTO_CIPHER_TYPE_3DES; } return CRYPTO_CIPHER_TYPE_AES256; } static int init_nss_crypto(struct crypto_instance *instance) { PK11SlotInfo* crypt_slot = NULL; SECItem crypt_param; if (!cipher_to_nss[instance->crypto_cipher_type]) { return 0; } crypt_param.type = siBuffer; crypt_param.data = instance->private_key; crypt_param.len = cipher_key_len[instance->crypto_cipher_type]; crypt_slot = PK11_GetBestSlot(cipher_to_nss[instance->crypto_cipher_type], NULL); if (crypt_slot == NULL) { log_printf(instance->log_level_security, "Unable to find security slot (err %d)", PR_GetError()); return -1; } instance->nss_sym_key = PK11_ImportSymKey(crypt_slot, cipher_to_nss[instance->crypto_cipher_type], PK11_OriginUnwrap, CKA_ENCRYPT|CKA_DECRYPT, &crypt_param, NULL); if (instance->nss_sym_key == NULL) { log_printf(instance->log_level_security, "Failure to import key into NSS (err %d)", PR_GetError()); return -1; } PK11_FreeSlot(crypt_slot); return 0; } static int encrypt_nss( struct crypto_instance *instance, const unsigned char *buf_in, const size_t buf_in_len, unsigned char *buf_out, size_t *buf_out_len) { PK11Context* crypt_context = NULL; SECItem crypt_param; SECItem *nss_sec_param = NULL; int tmp1_outlen = 0; unsigned int tmp2_outlen = 0; unsigned char *salt = buf_out; unsigned char *data = buf_out + SALT_SIZE; int err = -1; if (!cipher_to_nss[instance->crypto_cipher_type]) { memcpy(buf_out, buf_in, buf_in_len); *buf_out_len = buf_in_len; return 0; } if (PK11_GenerateRandom (salt, SALT_SIZE) != SECSuccess) { log_printf(instance->log_level_security, "Failure to generate a random number %d", PR_GetError()); goto out; } crypt_param.type = siBuffer; crypt_param.data = salt; crypt_param.len = SALT_SIZE; nss_sec_param = PK11_ParamFromIV (cipher_to_nss[instance->crypto_cipher_type], &crypt_param); if (nss_sec_param == NULL) { log_printf(instance->log_level_security, "Failure to set up PKCS11 param (err %d)", PR_GetError()); goto out; } /* * Create cipher context for encryption */ crypt_context = PK11_CreateContextBySymKey (cipher_to_nss[instance->crypto_cipher_type], CKA_ENCRYPT, instance->nss_sym_key, nss_sec_param); if (!crypt_context) { log_printf(instance->log_level_security, "PK11_CreateContext failed (encrypt) crypt_type=%d (err %d)", (int)cipher_to_nss[instance->crypto_cipher_type], PR_GetError()); goto out; } if (PK11_CipherOp(crypt_context, data, &tmp1_outlen, FRAME_SIZE_MAX - instance->crypto_header_size, (unsigned char *)buf_in, buf_in_len) != SECSuccess) { log_printf(instance->log_level_security, "PK11_CipherOp failed (encrypt) crypt_type=%d (err %d)", (int)cipher_to_nss[instance->crypto_cipher_type], PR_GetError()); goto out; } if (PK11_DigestFinal(crypt_context, data + tmp1_outlen, &tmp2_outlen, FRAME_SIZE_MAX - tmp1_outlen) != SECSuccess) { log_printf(instance->log_level_security, "PK11_DigestFinal failed (encrypt) crypt_type=%d (err %d)", (int)cipher_to_nss[instance->crypto_cipher_type], PR_GetError()); goto out; } *buf_out_len = tmp1_outlen + tmp2_outlen + SALT_SIZE; err = 0; out: if (crypt_context) { PK11_DestroyContext(crypt_context, PR_TRUE); } if (nss_sec_param) { SECITEM_FreeItem(nss_sec_param, PR_TRUE); } return err; } static int decrypt_nss ( struct crypto_instance *instance, unsigned char *buf, int *buf_len) { PK11Context* decrypt_context = NULL; SECItem decrypt_param; int tmp1_outlen = 0; unsigned int tmp2_outlen = 0; unsigned char *salt = buf; unsigned char *data = salt + SALT_SIZE; int datalen = *buf_len - SALT_SIZE; unsigned char outbuf[FRAME_SIZE_MAX]; int outbuf_len; int err = -1; if (!cipher_to_nss[instance->crypto_cipher_type]) { return 0; } /* Create cipher context for decryption */ decrypt_param.type = siBuffer; decrypt_param.data = salt; decrypt_param.len = SALT_SIZE; decrypt_context = PK11_CreateContextBySymKey(cipher_to_nss[instance->crypto_cipher_type], CKA_DECRYPT, instance->nss_sym_key, &decrypt_param); if (!decrypt_context) { log_printf(instance->log_level_security, "PK11_CreateContext (decrypt) failed (err %d)", PR_GetError()); goto out; } if (PK11_CipherOp(decrypt_context, outbuf, &tmp1_outlen, sizeof(outbuf), data, datalen) != SECSuccess) { log_printf(instance->log_level_security, "PK11_CipherOp (decrypt) failed (err %d)", PR_GetError()); goto out; } if (PK11_DigestFinal(decrypt_context, outbuf + tmp1_outlen, &tmp2_outlen, sizeof(outbuf) - tmp1_outlen) != SECSuccess) { log_printf(instance->log_level_security, "PK11_DigestFinal (decrypt) failed (err %d)", PR_GetError()); goto out; } outbuf_len = tmp1_outlen + tmp2_outlen; memset(buf, 0, *buf_len); memcpy(buf, outbuf, outbuf_len); *buf_len = outbuf_len; err = 0; out: if (decrypt_context) { PK11_DestroyContext(decrypt_context, PR_TRUE); } return err; } /* * hash/hmac/digest functions */ static int string_to_crypto_hash_type(const char* crypto_hash_type) { if (strcmp(crypto_hash_type, "none") == 0) { return CRYPTO_HASH_TYPE_NONE; } else if (strcmp(crypto_hash_type, "md5") == 0) { return CRYPTO_HASH_TYPE_MD5; } else if (strcmp(crypto_hash_type, "sha1") == 0) { return CRYPTO_HASH_TYPE_SHA1; } else if (strcmp(crypto_hash_type, "sha256") == 0) { return CRYPTO_HASH_TYPE_SHA256; } else if (strcmp(crypto_hash_type, "sha384") == 0) { return CRYPTO_HASH_TYPE_SHA384; } else if (strcmp(crypto_hash_type, "sha512") == 0) { return CRYPTO_HASH_TYPE_SHA512; } return CRYPTO_HASH_TYPE_SHA1; } static int init_nss_hash(struct crypto_instance *instance) { PK11SlotInfo* hash_slot = NULL; SECItem hash_param; if (!hash_to_nss[instance->crypto_hash_type]) { return 0; } hash_param.type = siBuffer; hash_param.data = 0; hash_param.len = 0; hash_slot = PK11_GetBestSlot(hash_to_nss[instance->crypto_hash_type], NULL); if (hash_slot == NULL) { log_printf(instance->log_level_security, "Unable to find security slot (err %d)", PR_GetError()); return -1; } instance->nss_sym_key_sign = PK11_ImportSymKey(hash_slot, hash_to_nss[instance->crypto_hash_type], PK11_OriginUnwrap, CKA_SIGN, &hash_param, NULL); if (instance->nss_sym_key_sign == NULL) { log_printf(instance->log_level_security, "Failure to import key into NSS (err %d)", PR_GetError()); return -1; } PK11_FreeSlot(hash_slot); return 0; } static int calculate_nss_hash( struct crypto_instance *instance, const unsigned char *buf, const size_t buf_len, unsigned char *hash) { PK11Context* hash_context = NULL; SECItem hash_param; unsigned int hash_tmp_outlen = 0; unsigned char hash_block[hash_block_len[instance->crypto_hash_type]]; int err = -1; /* Now do the digest */ hash_param.type = siBuffer; hash_param.data = 0; hash_param.len = 0; hash_context = PK11_CreateContextBySymKey(hash_to_nss[instance->crypto_hash_type], CKA_SIGN, instance->nss_sym_key_sign, &hash_param); if (!hash_context) { log_printf(instance->log_level_security, "PK11_CreateContext failed (hash) hash_type=%d (err %d)", (int)hash_to_nss[instance->crypto_hash_type], PR_GetError()); goto out; } if (PK11_DigestBegin(hash_context) != SECSuccess) { log_printf(instance->log_level_security, "PK11_DigestBegin failed (hash) hash_type=%d (err %d)", (int)hash_to_nss[instance->crypto_hash_type], PR_GetError()); goto out; } if (PK11_DigestOp(hash_context, buf, buf_len) != SECSuccess) { log_printf(instance->log_level_security, "PK11_DigestOp failed (hash) hash_type=%d (err %d)", (int)hash_to_nss[instance->crypto_hash_type], PR_GetError()); goto out; } if (PK11_DigestFinal(hash_context, hash_block, &hash_tmp_outlen, hash_block_len[instance->crypto_hash_type]) != SECSuccess) { log_printf(instance->log_level_security, "PK11_DigestFinale failed (hash) hash_type=%d (err %d)", (int)hash_to_nss[instance->crypto_hash_type], PR_GetError()); goto out; } memcpy(hash, hash_block, hash_len[instance->crypto_hash_type]); err = 0; out: if (hash_context) { PK11_DestroyContext(hash_context, PR_TRUE); } return err; } /* * global/glue nss functions */ static int init_nss_db(struct crypto_instance *instance) { if ((!cipher_to_nss[instance->crypto_cipher_type]) && (!hash_to_nss[instance->crypto_hash_type])) { return 0; } if (NSS_NoDB_Init(".") != SECSuccess) { log_printf(instance->log_level_security, "NSS DB initialization failed (err %d)", PR_GetError()); return -1; } return 0; } static int init_nss(struct crypto_instance *instance, const char *crypto_cipher_type, const char *crypto_hash_type) { log_printf(instance->log_level_notice, "Initializing transmit/receive security (NSS) crypto: %s hash: %s", crypto_cipher_type, crypto_hash_type); if (init_nss_db(instance) < 0) { return -1; } if (init_nss_crypto(instance) < 0) { return -1; } if (init_nss_hash(instance) < 0) { return -1; } return 0; } static int encrypt_and_sign_nss ( struct crypto_instance *instance, const unsigned char *buf_in, const size_t buf_in_len, unsigned char *buf_out, size_t *buf_out_len) { unsigned char *hash = buf_out; unsigned char *data = hash + hash_len[instance->crypto_hash_type]; if (encrypt_nss(instance, buf_in, buf_in_len, data, buf_out_len) < 0) { return -1; } if (hash_to_nss[instance->crypto_hash_type]) { if (calculate_nss_hash(instance, data, *buf_out_len, hash) < 0) { return -1; } *buf_out_len = *buf_out_len + hash_len[instance->crypto_hash_type]; } return 0; } static int authenticate_and_decrypt_nss ( struct crypto_instance *instance, unsigned char *buf, int *buf_len) { if (hash_to_nss[instance->crypto_hash_type]) { unsigned char tmp_hash[hash_len[instance->crypto_hash_type]]; unsigned char *hash = buf; unsigned char *data = hash + hash_len[instance->crypto_hash_type]; int datalen = *buf_len - hash_len[instance->crypto_hash_type]; if (calculate_nss_hash(instance, data, datalen, tmp_hash) < 0) { return -1; } if (memcmp(tmp_hash, hash, hash_len[instance->crypto_hash_type]) != 0) { log_printf(instance->log_level_error, "Digest does not match"); return -1; } memmove(buf, data, datalen); *buf_len = datalen; } if (decrypt_nss(instance, buf, buf_len) < 0) { return -1; } return 0; } /* * exported API */ size_t crypto_sec_header_size( const char *crypto_cipher_type, const char *crypto_hash_type) { int crypto_cipher = string_to_crypto_cipher_type(crypto_cipher_type); int crypto_hash = string_to_crypto_hash_type(crypto_hash_type); size_t hdr_size = 0; hdr_size = sizeof(struct crypto_config_header); if (crypto_hash) { hdr_size += hash_len[crypto_hash]; } if (crypto_cipher) { hdr_size += SALT_SIZE; hdr_size += cypher_block_len[crypto_cipher]; } return hdr_size; } int crypto_encrypt_and_sign ( struct crypto_instance *instance, const unsigned char *buf_in, const size_t buf_in_len, unsigned char *buf_out, size_t *buf_out_len) { struct crypto_config_header *cch = (struct crypto_config_header *)buf_out; int err; cch->crypto_cipher_type = instance->crypto_cipher_type; cch->crypto_hash_type = instance->crypto_hash_type; cch->__pad0 = 0; cch->__pad1 = 0; buf_out += sizeof(struct crypto_config_header); err = encrypt_and_sign_nss(instance, buf_in, buf_in_len, buf_out, buf_out_len); *buf_out_len = *buf_out_len + sizeof(struct crypto_config_header); return err; } int crypto_authenticate_and_decrypt (struct crypto_instance *instance, unsigned char *buf, int *buf_len) { struct crypto_config_header *cch = (struct crypto_config_header *)buf; /* * decode crypto config of incoming packets */ if (cch->crypto_cipher_type != instance->crypto_cipher_type) { log_printf(instance->log_level_security, "Incoming packet has different crypto type. Rejecting"); return -1; } if (cch->crypto_hash_type != instance->crypto_hash_type) { log_printf(instance->log_level_security, "Incoming packet has different hash type. Rejecting"); return -1; } if ((cch->__pad0 != 0) || (cch->__pad1 != 0)) { log_printf(instance->log_level_security, "Incoming packet appears to have features not supported by this version of corosync. Rejecting"); return -1; } /* * invalidate config header and kill it */ cch = NULL; *buf_len -= sizeof(struct crypto_config_header); memmove(buf, buf + sizeof(struct crypto_config_header), *buf_len); return authenticate_and_decrypt_nss(instance, buf, buf_len); } struct crypto_instance *crypto_init( const unsigned char *private_key, unsigned int private_key_len, const char *crypto_cipher_type, const char *crypto_hash_type, void (*log_printf_func) ( int level, int subsys, const char *function, const char *file, int line, const char *format, ...)__attribute__((format(printf, 6, 7))), int log_level_security, int log_level_notice, int log_level_error, int log_subsys_id) { struct crypto_instance *instance; instance = malloc(sizeof(*instance)); if (instance == NULL) { return (NULL); } memset(instance, 0, sizeof(struct crypto_instance)); memcpy(instance->private_key, private_key, private_key_len); instance->private_key_len = private_key_len; instance->crypto_cipher_type = string_to_crypto_cipher_type(crypto_cipher_type); instance->crypto_hash_type = string_to_crypto_hash_type(crypto_hash_type); instance->crypto_header_size = crypto_sec_header_size(crypto_cipher_type, crypto_hash_type); instance->log_printf_func = log_printf_func; instance->log_level_security = log_level_security; instance->log_level_notice = log_level_notice; instance->log_level_error = log_level_error; instance->log_subsys_id = log_subsys_id; if (init_nss(instance, crypto_cipher_type, crypto_hash_type) < 0) { free(instance); return(NULL); } return (instance); } diff --git a/man/corosync.conf.5 b/man/corosync.conf.5 index 34dd7b81..1ef9dc53 100644 --- a/man/corosync.conf.5 +++ b/man/corosync.conf.5 @@ -1,658 +1,658 @@ .\"/* .\" * Copyright (c) 2005 MontaVista Software, Inc. .\" * Copyright (c) 2006-2012 Red Hat, Inc. .\" * .\" * All rights reserved. .\" * .\" * Author: Steven Dake (sdake@redhat.com) .\" * .\" * This software licensed under BSD license, the text of which follows: .\" * .\" * Redistribution and use in source and binary forms, with or without .\" * modification, are permitted provided that the following conditions are met: .\" * .\" * - Redistributions of source code must retain the above copyright notice, .\" * this list of conditions and the following disclaimer. .\" * - Redistributions in binary form must reproduce the above copyright notice, .\" * this list of conditions and the following disclaimer in the documentation .\" * and/or other materials provided with the distribution. .\" * - Neither the name of the MontaVista Software, Inc. nor the names of its .\" * contributors may be used to endorse or promote products derived from this .\" * software without specific prior written permission. .\" * .\" * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" .\" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE .\" * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR .\" * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF .\" * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS .\" * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN .\" * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) .\" * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF .\" * THE POSSIBILITY OF SUCH DAMAGE. .\" */ .TH COROSYNC_CONF 5 2012-10-10 "corosync Man Page" "Corosync Cluster Engine Programmer's Manual" .SH NAME corosync.conf - corosync executive configuration file .SH SYNOPSIS /etc/corosync/corosync.conf .SH DESCRIPTION The corosync.conf instructs the corosync executive about various parameters needed to control the corosync executive. Empty lines and lines starting with # character are ignored. The configuration file consists of bracketed top level directives. The possible directive choices are: .TP totem { } This top level directive contains configuration options for the totem protocol. .TP logging { } This top level directive contains configuration options for logging. .TP quorum { } This top level directive contains configuration options for quorum. .TP nodelist { } This top level directive contains configuration options for nodes in cluster. .PP .PP Within the .B totem directive, an interface directive is required. There is also one configuration option which is required: .PP .PP Within the .B interface sub-directive of totem there are four parameters which are required. There is one parameter which is optional. .TP ringnumber This specifies the ring number for the interface. When using the redundant ring protocol, each interface should specify separate ring numbers to uniquely identify to the membership protocol which interface to use for which redundant ring. The ringnumber must start at 0. .TP bindnetaddr This specifies the network address the corosync executive should bind to. bindnetaddr should be an IP address configured on the system, or a network address. For example, if the local interface is 192.168.5.92 with netmask 255.255.255.0, you should set bindnetaddr to 192.168.5.92 or 192.168.5.0. If the local interface is 192.168.5.92 with netmask 255.255.255.192, set bindnetaddr to 192.168.5.92 or 192.168.5.64, and so forth. This may also be an IPV6 address, in which case IPV6 networking will be used. In this case, the exact address must be specified and there is no automatic selection of the network interface within a specific subnet as with IPv4. If IPv6 networking is used, the nodeid field in nodelist must be specified. .TP broadcast This is optional and can be set to yes. If it is set to yes, the broadcast address will be used for communication. If this option is set, mcastaddr should not be set. .TP mcastaddr This is the multicast address used by corosync executive. The default should work for most networks, but the network administrator should be queried about a multicast address to use. Avoid 224.x.x.x because this is a "config" multicast address. This may also be an IPV6 multicast address, in which case IPV6 networking will be used. If IPv6 networking is used, the nodeid field in nodelist must be specified. It's not needed to use this option if cluster_name option is used. If both options are used, mcastaddr has higher priority. .TP mcastport This specifies the UDP port number. It is possible to use the same multicast address on a network with the corosync services configured for different UDP ports. Please note corosync uses two UDP ports mcastport (for mcast receives) and mcastport - 1 (for mcast sends). If you have multiple clusters on the same network using the same mcastaddr please configure the mcastports with a gap. .TP ttl This specifies the Time To Live (TTL). If you run your cluster on a routed network then the default of "1" will be too small. This option provides a way to increase this up to 255. The valid range is 0..255. Note that this is only valid on multicast transport types. .PP .PP Within the .B totem directive, there are seven configuration options of which one is required, five are optional, and one is required when IPV6 is configured in the interface subdirective. The required directive controls the version of the totem configuration. The optional option unless using IPV6 directive controls identification of the processor. The optional options control secrecy and authentication, the redundant ring mode of operation and maximum network MTU field. .TP version This specifies the version of the configuration file. Currently the only valid version for this directive is 2. .PP clear_node_high_bit This configuration option is optional and is only relevant when no nodeid is specified. Some corosync clients require a signed 32 bit nodeid that is greater than zero however by default corosync uses all 32 bits of the IPv4 address space when generating a nodeid. Set this option to yes to force the high bit to be zero and therefor ensure the nodeid is a positive signed 32 bit integer. WARNING: The clusters behavior is undefined if this option is enabled on only a subset of the cluster (for example during a rolling upgrade). .TP crypto_hash This specifies which HMAC authentication should be used to authenticate all messages. Valid values are none (no authentication), md5, sha1, sha256, sha384 and sha512. The default is sha1. .TP crypto_cipher This specifies which cipher should be used to encrypt all messages. -Valid values are none (no encryption) and aes256. +Valid values are none (no encryption), aes256, aes192, aes128 and 3des. The default is aes256. .TP secauth This specifies that HMAC/SHA1 authentication should be used to authenticate all messages. It further specifies that all data should be encrypted with the nss library and aes256 encryption algorithm to protect data from eavesdropping. Enabling this option adds a encryption header to every message sent by totem which reduces total throughput. Also encryption and authentication consume extra CPU cycles in corosync. The default is on. WARNING: This parameter is deprecated. It's recomended to use combination of crypto_cipher and crypto_hash. .TP rrp_mode This specifies the mode of redundant ring, which may be none, active, or passive. Active replication offers slightly lower latency from transmit to delivery in faulty network environments but with less performance. Passive replication may nearly double the speed of the totem protocol if the protocol doesn't become cpu bound. The final option is none, in which case only one network interface will be used to operate the totem protocol. If only one interface directive is specified, none is automatically chosen. If multiple interface directives are specified, only active or passive may be chosen. .TP netmtu This specifies the network maximum transmit unit. To set this value beyond 1500, the regular frame MTU, requires ethernet devices that support large, or also called jumbo, frames. If any device in the network doesn't support large frames, the protocol will not operate properly. The hosts must also have their mtu size set from 1500 to whatever frame size is specified here. Please note while some NICs or switches claim large frame support, they support 9000 MTU as the maximum frame size including the IP header. Setting the netmtu and host MTUs to 9000 will cause totem to use the full 9000 bytes of the frame. Then Linux will add a 18 byte header moving the full frame size to 9018. As a result some hardware will not operate properly with this size of data. A netmtu of 8982 seems to work for the few large frame devices that have been tested. Some manufacturers claim large frame support when in fact they support frame sizes of 4500 bytes. When sending multicast traffic, if the network frequently reconfigures, chances are that some device in the network doesn't support large frames. Choose hardware carefully if intending to use large frame support. The default is 1500. .TP vsftype This directive controls the virtual synchrony filter type used to identify a primary component. The preferred choice is YKD dynamic linear voting, however, for clusters larger then 32 nodes YKD consumes alot of memory. For large scale clusters that are created by changing the MAX_PROCESSORS_COUNT #define in the C code totem.h file, the virtual synchrony filter "none" is recommended but then AMF and DLCK services (which are currently experimental) are not safe for use. The default is ykd. The vsftype can also be set to none. .TP transport This directive controls the transport mechanism used. If the interface to which corosync is binding is an RDMA interface such as RoCEE or Infiniband, the "iba" parameter may be specified. To avoid the use of multicast entirely, a unicast transport parameter "udpu" can be specified. This requires specifying the list of members in nodelist directive, that could potentially make up the membership before deployment. The default is udp. The transport type can also be set to udpu or iba. .TP cluster_name This specifies the name of cluster and it's used for automatic generating of multicast address. .TP config_version This specifies version of config file. This is converted to unsigned 64-bit int. By default it's 0. Option is used to prevent joining old nodes with not up-to-date configuration. If value is not 0, and node is going for first time (only for first time, join after split doesn't follow this rules) from single-node membership to multiple nodes membership, other nodes config_versions are collected. If current node config_version is not equal to highest of collected versions, corosync is terminated. Within the .B totem directive, there are several configuration options which are used to control the operation of the protocol. It is generally not recommended to change any of these values without proper guidance and sufficient testing. Some networks may require larger values if suffering from frequent reconfigurations. Some applications may require faster failure detection times which can be achieved by reducing the token timeout. .TP token This timeout specifies in milliseconds until a token loss is declared after not receiving a token. This is the time spent detecting a failure of a processor in the current configuration. Reforming a new configuration takes about 50 milliseconds in addition to this timeout. The default is 1000 milliseconds. .TP token_retransmit This timeout specifies in milliseconds after how long before receiving a token the token is retransmitted. This will be automatically calculated if token is modified. It is not recommended to alter this value without guidance from the corosync community. The default is 238 milliseconds. .TP hold This timeout specifies in milliseconds how long the token should be held by the representative when the protocol is under low utilization. It is not recommended to alter this value without guidance from the corosync community. The default is 180 milliseconds. .TP token_retransmits_before_loss_const This value identifies how many token retransmits should be attempted before forming a new configuration. If this value is set, retransmit and hold will be automatically calculated from retransmits_before_loss and token. The default is 4 retransmissions. .TP join This timeout specifies in milliseconds how long to wait for join messages in the membership protocol. The default is 50 milliseconds. .TP send_join This timeout specifies in milliseconds an upper range between 0 and send_join to wait before sending a join message. For configurations with less then 32 nodes, this parameter is not necessary. For larger rings, this parameter is necessary to ensure the NIC is not overflowed with join messages on formation of a new ring. A reasonable value for large rings (128 nodes) would be 80msec. Other timer values must also change if this value is changed. Seek advice from the corosync mailing list if trying to run larger configurations. The default is 0 milliseconds. .TP consensus This timeout specifies in milliseconds how long to wait for consensus to be achieved before starting a new round of membership configuration. The minimum value for consensus must be 1.2 * token. This value will be automatically calculated at 1.2 * token if the user doesn't specify a consensus value. For two node clusters, a consensus larger then the join timeout but less then token is safe. For three node or larger clusters, consensus should be larger then token. There is an increasing risk of odd membership changes, which stil guarantee virtual synchrony, as node count grows if consensus is less than token. The default is 1200 milliseconds. .TP merge This timeout specifies in milliseconds how long to wait before checking for a partition when no multicast traffic is being sent. If multicast traffic is being sent, the merge detection happens automatically as a function of the protocol. The default is 200 milliseconds. .TP downcheck This timeout specifies in milliseconds how long to wait before checking that a network interface is back up after it has been downed. The default is 1000 millseconds. .TP fail_recv_const This constant specifies how many rotations of the token without receiving any of the messages when messages should be received may occur before a new configuration is formed. The default is 2500 failures to receive a message. .TP seqno_unchanged_const This constant specifies how many rotations of the token without any multicast traffic should occur before the hold timer is started. The default is 30 rotations. .TP heartbeat_failures_allowed [HeartBeating mechanism] Configures the optional HeartBeating mechanism for faster failure detection. Keep in mind that engaging this mechanism in lossy networks could cause faulty loss declaration as the mechanism relies on the network for heartbeating. So as a rule of thumb use this mechanism if you require improved failure in low to medium utilized networks. This constant specifies the number of heartbeat failures the system should tolerate before declaring heartbeat failure e.g 3. Also if this value is not set or is 0 then the heartbeat mechanism is not engaged in the system and token rotation is the method of failure detection The default is 0 (disabled). .TP max_network_delay [HeartBeating mechanism] This constant specifies in milliseconds the approximate delay that your network takes to transport one packet from one machine to another. This value is to be set by system engineers and please dont change if not sure as this effects the failure detection mechanism using heartbeat. The default is 50 milliseconds. .TP window_size This constant specifies the maximum number of messages that may be sent on one token rotation. If all processors perform equally well, this value could be large (300), which would introduce higher latency from origination to delivery for very large rings. To reduce latency in large rings(16+), the defaults are a safe compromise. If 1 or more slow processor(s) are present among fast processors, window_size should be no larger then 256000 / netmtu to avoid overflow of the kernel receive buffers. The user is notified of this by the display of a retransmit list in the notification logs. There is no loss of data, but performance is reduced when these errors occur. The default is 50 messages. .TP max_messages This constant specifies the maximum number of messages that may be sent by one processor on receipt of the token. The max_messages parameter is limited to 256000 / netmtu to prevent overflow of the kernel transmit buffers. The default is 17 messages. .TP miss_count_const This constant defines the maximum number of times on receipt of a token a message is checked for retransmission before a retransmission occurs. This parameter is useful to modify for switches that delay multicast packets compared to unicast packets. The default setting works well for nearly all modern switches. The default is 5 messages. .TP rrp_problem_count_timeout This specifies the time in milliseconds to wait before decrementing the problem count by 1 for a particular ring to ensure a link is not marked faulty for transient network failures. The default is 2000 milliseconds. .TP rrp_problem_count_threshold This specifies the number of times a problem is detected with a link before setting the link faulty. Once a link is set faulty, no more data is transmitted upon it. Also, the problem counter is no longer decremented when the problem count timeout expires. A problem is detected whenever all tokens from the proceeding processor have not been received within the rrp_token_expired_timeout. The rrp_problem_count_threshold * rrp_token_expired_timeout should be atleast 50 milliseconds less then the token timeout, or a complete reconfiguration may occur. The default is 10 problem counts. .TP rrp_problem_count_mcast_threshold This specifies the number of times a problem is detected with multicast before setting the link faulty for passive rrp mode. This variable is unused in active rrp mode. The default is 10 times rrp_problem_count_threshold. .TP rrp_token_expired_timeout This specifies the time in milliseconds to increment the problem counter for the redundant ring protocol after not having received a token from all rings for a particular processor. This value will automatically be calculated from the token timeout and problem_count_threshold but may be overridden. It is not recommended to override this value without guidance from the corosync community. The default is 47 milliseconds. .TP rrp_autorecovery_check_timeout This specifies the time in milliseconds to check if the failed ring can be auto-recovered. The default is 1000 milliseconds. .PP Within the .B logging directive, there are several configuration options which are all optional. .PP The following 3 options are valid only for the top level logging directive: .TP timestamp This specifies that a timestamp is placed on all log messages. The default is off. .TP fileline This specifies that file and line should be printed. The default is off. .TP function_name This specifies that the code function name should be printed. The default is off. .PP The following options are valid both for top level logging directive and they can be overriden in logger_subsys entries. .TP to_stderr .TP to_logfile .TP to_syslog These specify the destination of logging output. Any combination of these options may be specified. Valid options are .B yes and .B no. The default is syslog and stderr. Please note, if you are using to_logfile and want to rotate the file, use logrotate(8) with the option .B copytruncate. eg. .IP .RS .ne 18 .nf .ta 4n 30n 33n /var/log/corosync.log { missingok compress notifempty daily rotate 7 copytruncate } .ta .fi .RE .IP .PP .TP logfile If the .B to_logfile directive is set to .B yes , this option specifies the pathname of the log file. No default. .TP logfile_priority This specifies the logfile priority for this particular subsystem. Ignored if debug is on. Possible values are: alert, crit, debug (same as debug = on), emerg, err, info, notice, warning. The default is: info. .TP syslog_facility This specifies the syslog facility type that will be used for any messages sent to syslog. options are daemon, local0, local1, local2, local3, local4, local5, local6 & local7. The default is daemon. .TP syslog_priority This specifies the syslog level for this particular subsystem. Ignored if debug is on. Possible values are: alert, crit, debug (same as debug = on), emerg, err, info, notice, warning. The default is: info. .TP debug This specifies whether debug output is logged for this particular logger. Also can contain value trace, what is highest level of debug informations. The default is off. .PP Within the .B logging directive, logger_subsys directives are optional. .PP Within the .B logger_subsys sub-directive, all of the above logging configuration options are valid and can be used to override the default settings. The subsys entry, described below, is mandatory to identify the subsystem. .TP subsys This specifies the subsystem identity (name) for which logging is specified. This is the name used by a service in the log_init () call. E.g. 'CPG'. This directive is required. .PP Within the .B quorum directive it is possible to specify the quorum algorithm to use with the .TP provider directive. At the time of writing only corosync_votequorum is supported. See votequorum(5) for configuration options. .PP Within the .B nodelist directive it is possible to specify specific informations about nodes in cluster. Directive can contain only .B node sub-directive, which specifies every node that should be a member of the membership, and where non-default options are needed. Every node must have at least ring0_addr field filled. For UDPU, every node that should be a member of the membership must be specified. Possible options are: .TP ringX_addr This specifies ip address of one of the nodes. X is ring number. .TP nodeid This configuration option is optional when using IPv4 and required when using IPv6. This is a 32 bit value specifying the node identifier delivered to the cluster membership service. If this is not specified with IPv4, the node id will be determined from the 32 bit IP address the system to which the system is bound with ring identifier of 0. The node identifier value of zero is reserved and should not be used. .SH "FILES" .TP /etc/corosync/corosync.conf The corosync executive configuration file. .SH "SEE ALSO" .BR corosync_overview (8), .BR votequorum (5), .BR logrotate (8) .PP