DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT! DRAFT!

NOTICE: Some ideas in this paper aren't yet well sorted. Some ideas
aren't complete. Some phrasings I'm not happy with yet. Some ideas
need further explanation. Most of the ideas presented are not final
yet. It is mostly a braindump. And did I mention yet that this is
still a DRAFT!!!!!! ?

Title:    The Executioner
Revision: $Id: executioner.txt,v 1.1 2003/08/08 18:19:47 lars Exp $
Author:   Lars Marowsky-Brée

Acknowledgements:
    David Brower, Oracle
    Alan Robertson, IBM

1. Summary

Every node runs an instance of the fencing daemon ("executioner").
This daemon knows which fencing devices are currently reachable and
which nodes can be fenced by them - i.e., the current topology of the
fencing mechanisms - and it will execute fencing requests on behalf of
the CRM and report success or failure.

A successful fencing operation shall imply that the target node of the
request can no longer access any shared resources in the cluster until
it has "properly rejoined the cluster".

2. Fencing topology information

(Note: modelled after the STONITH model by alanr in heartbeat)

2.1. Static configuration data

The mechanisms available for fencing need to be configured on each
node. Provision should be made so that this file can be the same on
all nodes (to ease configuration deployment). It shall be easy to
configure a device for a list of nodes or for all nodes.

TODO: Can this configuration also be stored in the CIB configuration
part? This would collide with the concept that every node is the
authoritative source of information about itself.

2.2. Runtime topology

Every device needs to support a low-latency "ping" operation, which
shall verify whether the device can currently be reached from the
local node; this shall preferably not be affected by concurrent
access.

The devices can either autodiscover their targets (as STONITH devices
do), or they have to provide a means of configuring this list; the
list shall only be queried from the device on explicit request by the
CRM. The CRM shall be allowed to assume that the same device can fence
the same set of nodes for all clients.

3. Interaction with the CRM

3.1. Policies

The CRM will be responsible for retrying failed commands; the
Executioner shall make exactly one attempt. In particular, it shall
not retry the request on another device; it is permissible to retry
the command on the same device if the failure looks intermittent.

3.2. Queries/Commands issued

3.2.1. Device reachable

The Executioner shall verify whether the given device is still
reachable from the local node at this point in time. The verification
shall be low-latency and lightweight and shall allow for concurrent
access from multiple nodes (where appropriate, i.e. for network
switches).

3.2.2. Targets fenceable via device Y

The node shall contact the device and return the list of nodes which
it can fence. The CRM will ensure that no other node in the partition
is accessing device Y at that time.

Results:

    0   Success; list of targets included
    1   Failed; device not reached
    2   Failed; device failed to return the list of targets
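To make the calling conventions of 3.2.1 and 3.2.2 more concrete, here
is a minimal C sketch. All names in it (exec_ping_rc, exec_targets_rc,
executioner_ping_device, executioner_query_targets) and the result
values for the ping are invented for illustration and are not the
actual heartbeat/STONITH API; a real daemon would drive the device's
STONITH plugin instead of returning canned answers.

    /* Sketch only; names and signatures are illustrative assumptions. */
    #include <stdio.h>

    /* 3.2.1: low-latency reachability check for a device. */
    typedef enum {
        EXEC_PING_OK     = 0,   /* device reachable from the local node */
        EXEC_PING_FAILED = 1    /* device not reachable right now       */
    } exec_ping_rc;

    /* 3.2.2: result codes for "targets fenceable via device Y". */
    typedef enum {
        EXEC_TARGETS_OK        = 0, /* success; list of targets included */
        EXEC_TARGETS_UNREACHED = 1, /* failed; device not reached        */
        EXEC_TARGETS_NO_LIST   = 2  /* failed; device returned no list   */
    } exec_targets_rc;

    /* Stubbed-out calls; a real Executioner would talk to the fencing
     * device here instead of returning canned answers. */
    static exec_ping_rc
    executioner_ping_device(const char *device_id)
    {
        (void)device_id;
        return EXEC_PING_OK;
    }

    static exec_targets_rc
    executioner_query_targets(const char *device_id,
                              const char **targets, int max, int *count)
    {
        static const char *dummy[] = { "node1", "node2" };
        int i;

        (void)device_id;
        *count = 0;
        for (i = 0; i < 2 && i < max; i++)
            targets[(*count)++] = dummy[i];
        return EXEC_TARGETS_OK;
    }

    int main(void)
    {
        const char *targets[16];
        int n, i;

        if (executioner_ping_device("power-switch-1") != EXEC_PING_OK) {
            fprintf(stderr, "device not reachable from this node\n");
            return 1;
        }
        if (executioner_query_targets("power-switch-1", targets, 16, &n)
            != EXEC_TARGETS_OK) {
            fprintf(stderr, "could not obtain target list\n");
            return 1;
        }
        for (i = 0; i < n; i++)
            printf("fenceable via power-switch-1: %s\n", targets[i]);
        return 0;
    }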
3.2.3. Fencing request to fence node X via device Y

This is a blocking, synchronous call. The CRM will ensure that no
other node in the partition is accessing device Y at that time. (This
is an issue for certain network power switches.)

The result code needs to distinguish between:

    0   Fencing request succeeded
    1   Failed: device could not be reached, potential network issue
    2   Failed: device tried, but failed to acknowledge success
    3   Failed: internal device failure

TODO: Is so much differentiation really necessary? Yes, it can help
identify quickly whether it is sensible to retry the fencing request
via another node to the same device, or whether the next fencing
device should be tried immediately; maybe that is overkill.
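The following C sketch illustrates the point of the TODO above: how a
caller such as the CRM might map the four result codes to a retry
decision. The type and function names (exec_fence_rc, crm_next_action,
crm_decide) and the decision table itself are assumptions made up for
illustration; this document does not prescribe such a policy, and the
CRM remains free to decide differently.

    /* Sketch only; names and the decision policy are assumptions. */
    #include <stdio.h>

    /* Result codes for "fence node X via device Y", as listed above. */
    typedef enum {
        EXEC_FENCE_OK             = 0, /* fencing request succeeded          */
        EXEC_FENCE_DEVICE_UNREACH = 1, /* device not reached, maybe network  */
        EXEC_FENCE_NO_CONFIRM     = 2, /* device tried, no confirmed success */
        EXEC_FENCE_DEVICE_ERROR   = 3  /* internal device failure            */
    } exec_fence_rc;

    /* What the CRM (not the Executioner) might do next with such a
     * result. This is where the differentiation pays off: "device not
     * reached" suggests retrying the same device from another node,
     * while a device-internal failure suggests moving on at once. */
    typedef enum {
        CRM_DONE,             /* fencing succeeded, stop              */
        CRM_RETRY_OTHER_NODE, /* same device, ask a different node    */
        CRM_TRY_NEXT_DEVICE   /* give up on this device, try the next */
    } crm_next_action;

    static crm_next_action
    crm_decide(exec_fence_rc rc)
    {
        switch (rc) {
        case EXEC_FENCE_OK:             return CRM_DONE;
        case EXEC_FENCE_DEVICE_UNREACH: return CRM_RETRY_OTHER_NODE;
        case EXEC_FENCE_NO_CONFIRM:     /* fall through */
        case EXEC_FENCE_DEVICE_ERROR:   return CRM_TRY_NEXT_DEVICE;
        }
        return CRM_TRY_NEXT_DEVICE;
    }

    int main(void)
    {
        /* Example: the device answered but did not confirm success
         * (rc 2), so the CRM would move on to the next device. */
        printf("decision for rc=2: %d\n",
               (int)crm_decide(EXEC_FENCE_NO_CONFIRM));
        return 0;
    }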