diff --git a/resource_agent/API/00 b/resource_agent/API/00 deleted file mode 100644 index 223445a..0000000 --- a/resource_agent/API/00 +++ /dev/null @@ -1,364 +0,0 @@ -This draft from an email to the OCF mailing list by Lars Marowsky-Bree -dated 3/14/2002 -============= -DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT - -0. Header - -Topic: Open Clustering Framework Resource Agent API -Editor: Lars Marowsky-Brée -Revision: $Id$ -URL: http://www.opencf.org/documents/standards/resource-agent-api.txt - -Copyright (c) 2002 by Lars Marowsky-Brée. This material may be distributed -only subject to the terms and conditions set forth in the Open Publication -License, v1.0 or later (the latest version is presently available at -http://www.opencontent.org/openpub/). - -TODO: Currently, OCF isn't a real organisation and thus can't be referenced as -a copyright holder; this may need to be changed. - -TODO: Reference a "style guide" document to explain where <>, "" etc have been -used and why. - -TODO: Just if you haven't noticed yet, this document is a draft for now. - -DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT - - -1. Abstract - -Resource Agents (RA) are the middle layer between the Resource Manager (RM) -and the actual resources being managed. They aim to integrate the resource -with the RM without any modifications to the actual resource provider itself, -by encapsulating it carefully and thus making it moveable between real nodes -in a cluster. - -The RAs are obviously very specific to the resource type they are -encapsulating, however there is no reason why they should be specific to a -particular RM. - - -1.1. Scope - -This document documents a common API for the RM to call the RAs so the pool of -available RAs can be shared by the different clustering solutions. 
- -It does NOT define any libraries or helper functions which RAs might share -with regard to common functionality like external command execution, cluster -logging et cetera, as these are NOT specific to RA and are defined in the -respective standards. - - -1.2. API version described - -This document currently describes version 1 of the API. - -The version numbering scheme used is a simple, unsigned integer number for -ease of use and to avoid any ambiguity. The version number is communicated to -the RA and will be increased if a not downwards compatible change was made. - - -2. Terms used in this document - -2.1. "Resource" - -A single physical or logical entity that provides a service to clients or -other resources. For example, a resource can be a single disk volume, a -particular network address, or an application such as a web server. A resource -is generally available for use over time on two or more nodes in a cluster, -although it usually can be allocated to only one node at any given time. - -Resources are identified by their name and their instance parameters. The name -is a special case of an instance parameter; the name/resource type combination -is required to be unique in the cluster. - -Besides the instance parameters, a resource may have dependencies on other -resources or capabilities provided by other resources. Common examples include -a dependency on an IP address being configured or a filesystem being mounted. - - -2.2. "Resource types" - -A resource type represents a set of resources which share a common set of -instance parameters and a common set of actions which can be performed on it. - - -2.3. "Resource agent" - -A RA provides the actions ("member functions") for a a given type of -resources; by providing the RA with the instance parameters, it is used to -control a specific resource. - -They are usually implemented as shell scripts, but the API described here does -not require this. 
- -Although this is somewhat similiar to SystemV init scripts as described by the -LSB, there are some differences explained below. - -2.4. "Instance parameters" - -Instance parameters are the attributes which uniquely identify a given -resource instance. It is recommended that the set of instance parameters for -any given type of resources to be as minimal as possible. - -An instance parameter has a given name and value. They are both case sensitive -and must satisfy the requirements of POSIX environment name/value -combinations. - - -2.5. "Resource group" - -This is a term from the RM world, but it is explained in brief here for -completeness. As explained above, a complex resource commonly has dependencies -on other resources required for proper operation; all dependencies required to -provide an actual service to the user are usually grouped into a "resource -group" which is handled as an atomic unit by the cluster, as it isn't possible -to move a resource without also moving its dependencies or only moving a -resource but not the resources which depend on it. - -While the resource grouping is still commonly implemented by manual -configuration, the information provided by the RAs should be sufficient for -the RM to build the dependency tree on its own as far as possible. - - -3. API - -3.1. Resource Agent actions - -A RA must be able to perform the following actions on a given resource on -request by the RM; additional actions may be supported by the script for -example for LSB compliance, however more actions may be officially defined in -the future. - -In general, a RA should not assume it is the only RA of its type running -because the RM might start several RA instances for multiple independant -resource instances in parallel. - - -- start - - This brings the resource online and makes it available for use. It should - NOT terminate before the resource has been fully started. 
- - It may try to implement recover actions for certain cases of startup - failures at its discretion to comply. - - "start" must succeed even if the resource instance is already running. - -- stop - - This stops the resource. After the "stop" command has completed, nothing - should remain active of the resource and it must be possible to start it - on the same node or another node. - - Only if this cannot be guaranteed should it report failure; stopping an - already stopped resource should succeed. - - The "stop" request by the RM includes the authorisation to bring down the - resource even by force as long data integrity is maintained; breaking - currently active transactions should be avoided, but the request to offline - the resource has higher precendence than this. - - The "stop" action should also perform clean-ups of artifacts like leftover - shared memory segments, semaphores, IPC message queues, lock files etc. - -- status - - Verifies whether a resource is working correctly. This should be - "light-weight" query as it is called by the RM fairly often to poll the - status of the resource. - - It is accepted practice to have additional instance parameters which are not - strictly required to identify the resource instance but are needed to - monitor it or customize of how intrusive this check is allowed to be. - - Note: An interface where the RA actively informs the RM of failures is - planned but not defined yet. - -- restart - - A special case of the "start" action, this should try to recover a resource - locally. If this is not supported, the RA should simply return failure. - - The meta-data query should reveal whether this action is supported or not. - - An example includes "recovering" an IP address by moving it to another - interface; this is much less costly than initiating a full resource group - failover to another node. - -- dependencies - - Reports the dependencies of the resource instance as far as the RA can - determine. 
- - TODO: Which format? How? - -- metadata - - Causes the RA to report its metadata. This action does not require the - instance parameters to be set, as it is used to retrieve the information - about which instance parameters exist etc in the first place. - - TODO: How? Format? - - -3.2. Calling the RA - -3.2.1. Paths - -If the RM has to control a resource type called , it will look -for a RA named in the following locations, listed in order of -precedence: - -1. RM specific paths - Note: While this is allowed, it should not be necessary; however, it - may be necessary for legacy RAs provided by the specific RM. - -2. /usr/ocf/resource.d/ - This is the primary location for OCF-compliant RAs; if installed here, - they are not required to be LSB-compatible too. - - All executables in here may be considered RAs and thus be - "auto-discovered" by the RM. - - TODO: Define /usr/ocf directory hierarchy further or refer to another - standard document doing so. - -3. /etc/init.d/ - If a RA is both OCF and LSB compliant, it may reside here; please - refer to - http://www.linuxbase.org/spec/refspecs/LSB_1.1.0/gLSB/sysinit.html for - more details on LSB compliance. - - As the LSB does not define the "metadata" action, the RM could try to - use this to find out whether a given script can double as a RA. - - -3.2.2. Execution syntax - -After the RM has identified the executable to call, it will be called in the -following format: - - /path/to/RA/ResourceType - -This convention has been chosen to make sure a non-OCF compliant LSB init -script will fail if called as a RA by error; please refer to the section about -Resource naming / instance parameters for further restrictions because of -this. - - -3.2.3. 
Parameter passing - -The instance parameters and some additional attributes are passed in via the -environment; this has been chosen because it does not reveal the parameters to -an unprivileged user on the same system and environment variables can be -easily accessed by all programming languages and shell scripts. - - -3.2.3.1. Syntax for instance parameters - -They are directly converted to environment variables; the name is prefixed -with "OCF_RESKEY_". - -The instance parameter "force" with the value "yes" thus becomes: - OCF_force=yes -in the environment. - - -3.2.3.2. Special parameters - -The entire environment variable namespace starting with OCF_ is considered to -be reserved. - -Currently, the following additional parameters are defined: - -OCF_ROOT - Referring to the root of the OCF directory hierarchy. - - Example: OCF_ROOT=/usr/ocf - -OCF_RA_VERSION - Version number of the OCF Resource Agent API. If the script does - not support this revision, it should report an error. - - This is an integer number and should only be bumbed when the API - undergoes a not downwards compatible change. - - Example: OCF_RA_VERSION=1 - - -3.3. Exit codes - -These exit codes were largely modelled after the LSB 1.1.0 spec for -compatibility. - -NOTE: However, the ranges "reserved for application use" by the LSB may be -used by the OCF in the future to report more fine-grained status or special -cases to the RM. - -3.3.1. "status" - -0 program is running or service is OK -1 program is dead and /var/run pid file exists -2 program is dead and /var/lock lock file exists -3 program is stopped -4 program or service status is unknown -5-99 reserved for future LSB use -100-149 reserved for distribution use -150-199 reserved for application use -200-254 reserved - -3.3.2. 
"start", "stop", "restart" - -1 generic or unspecified error (current practice) -2 invalid or excess argument(s) -3 unimplemented feature (for example, "reload") -4 user had insufficient privilege -5 program is not installed -6 program is not configured -7 program is not running -8-99 reserved for future LSB use -100-149 reserved for distribution use -150-199 reserved for application use -200-254 reserved - -3.3.3. "dependencies" - -0 dependencies were correctly reported -1 dependencies could not be determined - -Note that a "dependencies" query for a RA which does not support this in -general should report no dependencies and success. An error should only be -returned if the RA supports determining the dependencies automatically but -failed. - -3.3.4. "metadata" - -The metadata query should always report success; anything else is considered a -RA failure and the RM should assume that the executable in question is not OCF -compliant. - -0 Success. - - -3.4. Relation to the LSB - -It is required that the current LSB spec is fully supported by the system. - -The API tries to make it possible to have RA function both as a normal LSB -init script and a cluster-aware RA, but this is not required functionality. -The RAs could however use the helper functions defined for LSB init scripts. - - - -A. ChangeLog - -$Log$ -Revision 1.1 2003/06/12 12:16:07 alanr -Initial revision - - -DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT - -============= diff --git a/resource_agent/API/01 b/resource_agent/API/01 deleted file mode 100644 index 85d8647..0000000 --- a/resource_agent/API/01 +++ /dev/null @@ -1,408 +0,0 @@ -DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT - -0. Header - -Topic: Open Clustering Framework Resource Agent API -Editor: Lars Marowsky-Brée -Revision: $Id$ -URL: http://www.opencf.org/documents/standards/resource-agent-api.txt - -Copyright (c) 2002 by Lars Marowsky-Brée. 
This material may be distributed -only subject to the terms and conditions set forth in the Open Publication -License, v1.0 or later (the latest version is presently available at -http://www.opencontent.org/openpub/). - -TODO: Reference a "style guide" document to explain where <>, "" etc have been -used and why. - - -1. Abstract - -Resource Agents (RA) are the middle layer between the Resource Manager (RM) -and the actual resources being managed. They aim to integrate the resource -with the RM without any modifications to the actual resource provider itself, -by encapsulating it carefully and thus making it movable between real nodes -in a cluster. - -The RAs are obviously very specific to the resource type they are -encapsulating, however there is no reason why they should be specific to a -particular RM. - - -1.1. Scope - -This document documents a common API for the RM to call the RAs so the pool of -available RAs can be shared by the different clustering solutions. - -It does NOT define any libraries or helper functions which RAs might share -with regard to common functionality like external command execution, cluster -logging et cetera, as these are NOT specific to RA and are defined in the -respective standards. - - -1.2. API version described - -This document currently describes version 1.1 of the API. - - -2. Terms used in this document - -2.1. "Resource" - -A single physical or logical entity that provides a service to clients or -other resources. For example, a resource can be a single disk volume, a -particular network address, or an application such as a web server. A resource -is generally available for use over time on two or more nodes in a cluster, -although it usually can be allocated to only one node at any given time. - -Resources are identified by the combination of their type and name. The name -is a special case of an instance parameter; the name/resource type combination -is required to be unique in the cluster. 
- -A resource may also have instance parameters which provide additional -information required for Resource Agent to control the resource. - -A resource may have dependencies on other resources or capabilities provided -by other resources. Common examples include a dependency on an IP address -being configured or a file system being mounted. These are special cases of -instance parameters and are treated in much the same fashion. - - -2.2. "Resource types" - -A resource type represents a set of resources which share a common set of -instance parameters and a common set of actions which can be performed on -resource of the given type. - - -2.3. "Resource agent" - -A RA provides the actions ("member functions") for a a given type of -resources; by providing the RA with the instance parameters, it is used to -control a specific resource. - -They are usually implemented as shell scripts, but the API described here does -not require this. - -Although this is somewhat similar to SystemV init scripts as described by the -LSB, there are some differences explained below. - - -2.4. "Instance parameters" - -Instance parameters are the attributes which uniquely identify a given -resource instance. It is recommended that the set of instance parameters for -any given type of resources to be as minimal as possible. - -An instance parameter has a given name and value. They are both case sensitive -and must satisfy the requirements of POSIX environment name/value -combinations. - - -2.5. "Resource group" - -This is a term from the RM world, but it is explained in brief here for -completeness. 
As explained above, a complex resource commonly has dependencies -on other resources required for proper operation; all dependencies required to -provide an actual service to the user are usually grouped into a "resource -group" which is handled as an atomic unit by the cluster, as it isn't possible -to move a resource without also moving its dependencies or only moving a -resource but not the resources which depend on it. - -While the resource grouping is still commonly implemented by manual -configuration, the information provided by the RA meta data should be -sufficient for the RM to build the dependency tree to determine the ordering -of resource startup and shutdown. - - -3. API - -3.1. API Version Numbers - -The version number is of the form "x.y", where x and y are integer numbers -greater than zero. x is referred to as the "major" number, and y the "minor" -number. - -The major number must be increased if a _backwards incompatible_ change is -made to the API. A major number mismatch between the RA and the RM must be -reported as an error to the administrator. - -The minor number must be increased if _any_ change at all is made to the API. -If the major is increased, the minor number is reset to "1". The minor number -can be used by both sides to see whether a certain additional feature is -supported by the other party. - - -3.2. Paths - -The Resource Agents are located in subdirectories under "/usr/ocf/resource.d"; -the filename of the RA maps to the resource type provided and maybe a symlink -to the real location. - -The subdirectories allow the installation of multiple RAs for the same type, -but from different vendors or package versions: - - FailSafe -> FailSafe-1.1.0/ - FailSafe-1.0.4/ - FailSafe-1.1.0/ - heartbeat -> heartbeat-0.4.9.1/ - heartbeat-0.4.9.1/ - -How the RM decides on which of several RAs for a specific resource type -installed it calls is implementation specific. - - - -3.3. 
Execution syntax - -After the RM has identified the executable to call, it will be called in the -following format: - - /usr/ocf/resource.d/ - - -3.4. Resource Agent actions - -A RA must be able to perform the following actions on a given resource on -request by the RM; additional actions may be supported by the script for -example for LSB compliance, however more actions may be officially defined in -the future. - -In general, a RA should not assume it is the only RA of its type running -because the RM might start several RA instances for multiple independent -resource instances in parallel. - -_Mandatory_ actions must be supported; _not mandatory_ operations must be -advertised in the meta data if supported. If the RM tries to call a not -supported, not mandatory action, the RA should return an error. - - -3.4.1. start - - Mandatory. - - This brings the resource online and makes it available for use. It should - NOT terminate before the resource has been fully started. - - It may try to implement recover actions for certain cases of startup - failures. - - "start" must succeed even if the resource instance is already running. - -3.4.2. stop - - Mandatory. - - This stops the resource. After the "stop" command has completed, nothing - should remain active of the resource and it must be possible to start it - on the same node or another node. - - Only if this cannot be guaranteed should it report failure; stopping an - already stopped resource should succeed. - - The "stop" request by the RM includes the authorization to bring down the - resource even by force as long data integrity is maintained; breaking - currently active transactions should be avoided, but the request to offline - the resource has higher precedence than this. - - The "stop" action should also perform clean-ups of artifacts like leftover - shared memory segments, semaphores, IPC message queues, lock files etc. - -3.4.3. status - - Mandatory. - - Verifies whether a resource is working correctly. 
This should be - "light-weight" query as it is called by the RM fairly often to poll the - status of the resource. - - It is accepted practice to have additional instance parameters which are not - strictly required to identify the resource instance but are needed to - monitor it or customize of how intrusive this check is allowed to be. - - Note: An interface where the RA actively informs the RM of failures is - planned but not defined yet. - -3.4.4. restart - - Not mandatory. - - A special case of the "start" action, this should try to recover a resource - locally. - - If this is not fully supported, it should be mapped to a stop/start action - by the RM. - - An example includes "recovering" an IP address by moving it to another - interface; this is much less costly than initiating a full resource group - fail over to another node. - -3.4.5. reload - - Not mandatory. - - Reload the configuration file without breaking currently connected users. - -3.4.6. meta-data - - Mandatory. - - Returns the resource agent meta data via stdout. - -3.4.7. validate-all - - Not mandatory. - - Validate the instance parameters provided. - - Perform a syntax check and if possible, a semantical check on the instance - parameters. - - -3.5. Parameter passing - -The instance parameters and some additional attributes are passed in via the -environment; this has been chosen because it does not reveal the parameters to -an unprivileged user on the same system and environment variables can be -easily accessed by all programming languages and shell scripts. - - -3.5.1. Syntax for instance parameters - -They are directly converted to environment variables; the name is prefixed -with "OCF_RESKEY_". - -The instance parameter "force" with the value "yes" thus becomes: - OCF_RESKEY_force=yes -in the environment. - -See section 4. for a more formal explanation of instance parameters. - - -3.5.2. 
Special parameters - -The entire environment variable name space starting with OCF_ is considered to -be reserved. - -Currently, the following additional parameters are defined: - -OCF_RA_VERSION_MAJ -OCF_RA_VERSION_MIN - Version number of the OCF Resource Agent API. If the script does - not support this revision, it should report an error. - - See 3.1. for an explanation of the versioning scheme used. The version - number is split into two numbers for ease of use in shell scripts. - - These two may be used by the RA to determine whether it is run under - an OCF compliant RM. - - Example: OCF_RA_VERSION_MAJ=1 - OCF_RA_VERSION_MIN=1 - -OCF_ROOT - Referring to the root of the OCF directory hierarchy. - - Example: OCF_ROOT=/usr/ocf - - -3.6. Exit codes - -These exit codes were largely modeled after the LSB 1.1.0 spec for -compatibility. - -NOTE: The ranges "reserved for application use" by the LSB may be used by the -OCF in the future to report more fine-grained status or special cases to the -RM. - -3.6.1. "status" - -0 program is running or service is OK -1 program is dead and /var/run pid file exists -2 program is dead and /var/lock lock file exists -3 program is stopped -4 program or service status is unknown -5-99 reserved for future LSB use -100-149 reserved for distribution use -150-199 reserved for application use -200-254 reserved - -3.6.2. "start", "stop", "restart", "reload" - -0 No error -1 generic or unspecified error (current practice) -2 invalid or excess argument(s) -3 unimplemented feature (for example, "reload") -4 user had insufficient privilege -5 program is not installed -6 program is not configured -7 program is not running -8-99 reserved for future LSB use -100-149 reserved for distribution use -150-199 reserved for application use -200-254 reserved - -3.6.3. 
"validate-all" - -0 No error -1 Semantical error -2 Syntactical error in at least one of the fields -8-99 reserved for future LSB use -100-149 reserved for distribution use -150-199 reserved for application use -200-254 reserved - -3.6.4. "meta-data" - -0 No error -1-254 Hard Resource Agent failure - - -4. Relation to the LSB - -It is required that the current LSB spec is fully supported by the system. - -The API tries to make it possible to have RA function both as a normal LSB -init script and a cluster-aware RA, but this is not required functionality. -The RAs could however use the helper functions defined for LSB init scripts. - - -5. RA meta data - -5.1. Format - -We have the following requirements which are not fulfilled by the LSB way of -embedding meta data into the beginning of the init scripts: - -- Independent of the language the RA is actually written in, -- Extensible, -- Structured, -- Easy to parse from a variety of languages. - -This is why we use simple XML to describe the RA meta data. The DTD for this -API can be found at http://www.opencf.org/standards/ra-api-1.dtd. - -5.2. Semantics - -An example of a valid meta data output is provided in -ra-metadata-example.xml. - - -A. References - -Individual contributors, ordered by last name: - - Ragnar Kjørstad - Lars Marowsky-Brée - Alan Robertson - -TODO: List all of them. - - -B. Change Log - - - -DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT - - diff --git a/resource_agent/API/02 b/resource_agent/API/02 deleted file mode 100644 index 4c1ef91..0000000 --- a/resource_agent/API/02 +++ /dev/null @@ -1,427 +0,0 @@ -DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT - -0. Header - -Topic: Open Clustering Framework Resource Agent API -Editor: Lars Marowsky-Brée -Revision: $Id$ -URL: http://www.opencf.org/documents/standards/resource-agent-api.txt - -Copyright (c) 2002 by Lars Marowsky-Brée. 
This material may be distributed -only subject to the terms and conditions set forth in the Open Publication -License, v1.0 or later (the latest version is presently available at -http://www.opencontent.org/openpub/). - - - -1. Abstract - -Resource Agents (RA) are the middle layer between the Resource Manager (RM) -and the actual resources being managed. They aim to integrate the resource -with the RM without any modifications to the actual resource provider itself, -by encapsulating it carefully and thus making it movable between real nodes -in a cluster. - -The RAs are obviously very specific to the resource type they are -encapsulating, however there is no reason why they should be specific to a -particular RM. - -The API described in this document should be general enough that a compliant -Resource Agent can be used by all existing resource managers / fail-over -systems who chose to implement this API either exclusively or in addition to -their existing one. - - -1.1. Scope - -This document documents a common API for the RM to call the RAs so the pool of -available RAs can be shared by the different clustering solutions. - -It does NOT define any libraries or helper functions which RAs might share -with regard to common functionality like external command execution, cluster -logging et cetera, as these are NOT specific to RA and are defined in the -respective standards. - - -1.2. API version described - -This document currently describes version 1.1 of the API. - - -2. Terms used in this document - -2.1. "Resource" - -A single physical or logical entity that provides a service to clients or -other resources. For example, a resource can be a single disk volume, a -particular network address, or an application such as a web server. A resource -is generally available for use over time on two or more nodes in a cluster, -although it usually can be allocated to only one node at any given time. - -Resources are identified by the combination of their type and name. 
The name -is a special case of an instance parameter; the name/resource type combination -is required to be unique in the cluster. - -A resource may also have instance parameters which provide additional -information required for Resource Agent to control the resource. - -A resource may have dependencies on other resources or capabilities provided -by other resources. Common examples include a dependency on an IP address -being configured or a file system being mounted. These are special cases of -instance parameters and are treated in much the same fashion. - - -2.2. "Resource types" - -A resource type represents a set of resources which share a common set of -instance parameters and a common set of actions which can be performed on -resource of the given type. - - -2.3. "Resource agent" - -A RA provides the actions ("member functions") for a a given type of -resources; by providing the RA with the instance parameters, it is used to -control a specific resource. - -They are usually implemented as shell scripts, but the API described here does -not require this. - -Although this is somewhat similar to SystemV init scripts as described by the -LSB, there are some differences explained below. - - -2.4. "Instance parameters" - -Instance parameters are the attributes which uniquely identify a given -resource instance. It is recommended that the set of instance parameters for -any given type of resources to be as minimal as possible. - -An instance parameter has a given name and value. They are both case sensitive -and must satisfy the requirements of POSIX environment name/value -combinations. - - -2.5. "Resource group" - -This is a term from the RM world, but it is explained in brief here for -completeness. 
As explained above, a complex resource commonly has dependencies -on other resources required for proper operation; all dependencies required to -provide an actual service to the user are usually grouped into a "resource -group" which is handled as an atomic unit by the cluster, as it isn't possible -to move a resource without also moving its dependencies or only moving a -resource but not the resources which depend on it. - -While the resource grouping is still commonly implemented by manual -configuration, the information provided by the RA meta data should be -sufficient for the RM to build the dependency tree to determine the ordering -of resource startup and shutdown. - - -3. API - -3.1. API Version Numbers - -The version number is of the form "x.y", where x and y are integer numbers -greater than zero. x is referred to as the "major" number, and y the "minor" -number. - -The major number must be increased if a _backwards incompatible_ change is -made to the API. A major number mismatch between the RA and the RM must be -reported as an error to the administrator. - -The minor number must be increased if _any_ change at all is made to the API. -If the major is increased, the minor number is reset to "1". The minor number -can be used by both sides to see whether a certain additional feature is -supported by the other party. - - -3.2. Paths - -The Resource Agents are located in subdirectories under "/usr/ocf/resource.d"; -the filename of the RA maps to the resource type provided and maybe a symlink -to the real location. - -The subdirectories allow the installation of multiple RAs for the same type, -but from different vendors or package versions: - - FailSafe -> FailSafe-1.1.0/ - FailSafe-1.0.4/ - FailSafe-1.1.0/ - heartbeat -> heartbeat-0.4.9.1/ - heartbeat-0.4.9.1/ - -How the RM decides on which of several RAs for a specific resource type -installed it calls is implementation specific. - - - -3.3. 
Execution syntax - -After the RM has identified the executable to call, it will be called in the -following format: - - /usr/ocf/resource.d/... - - -3.4. Resource Agent actions - -A RA must be able to perform the following actions on a given resource on -request by the RM; additional actions may be supported by the script, for -example for LSB compliance. - -In general, a RA should not assume it is the only RA of its type running at -any given time because the RM might start several RA instances for multiple -independent resource instances in parallel. - -_Mandatory_ actions must be supported; _not mandatory_ operations must be -advertised in the meta data if supported. If the RM tries to call an -unsupported, not mandatory action, the RA should return an error. - - -3.4.1. start - - Mandatory. - - This brings the resource online and makes it available for use. It should - NOT terminate before the resource has been fully started. - - It may try to implement recovery actions for certain cases of startup - failures. - - "start" must succeed if the resource instance is already running. - -3.4.2. stop - - Mandatory. - - This stops the resource. After the "stop" command has completed, nothing - should remain active of the resource and it must be possible to start it - on the same node or another node. - - Only if this cannot be guaranteed should it report failure; stopping an - already stopped resource must succeed. - - The "stop" request by the RM includes the authorization to bring down the - resource even by force as long as data integrity is maintained; breaking - currently active transactions should be avoided, but the request to offline - the resource has higher precedence than this. - - The "stop" action should also perform clean-ups of artifacts like leftover - shared memory segments, semaphores, IPC message queues, lock files etc. - -3.4.3. monitor - - Mandatory. - - Verifies whether a resource is working correctly.
This should be a - "light-weight" query as it is called by the RM fairly often to poll the - status of the resource. - - It is accepted practice to have additional instance parameters which are not - strictly required to identify the resource instance but are needed to - monitor it or customize how intrusive this check is allowed to be. - -3.4.4. restart - - Not mandatory. - - A special case of the "start" action, this should try to recover a resource - locally. - - If this is not fully supported, it should be mapped to a stop/start action - by the RM. - - An example includes "recovering" an IP address by moving it to another - interface; this is much less costly than initiating a full resource group - fail over to another node. - -3.4.5. reload - - Not mandatory. - - Reload the configuration file without breaking currently connected users. - -3.4.6. meta-data - - Mandatory. - - Returns the resource agent meta data via stdout. - -3.4.7. validate-all - - Not mandatory. - - Validate the instance parameters provided. - - Perform a syntax check and if possible, a semantical check on the instance - parameters. - - -3.5. Parameter passing - -The instance parameters and some additional attributes are passed in via the -environment; this has been chosen because it does not reveal the parameters to -an unprivileged user on the same system and environment variables can be -easily accessed by all programming languages and shell scripts. - - -3.5.1. Syntax for instance parameters - -They are directly converted to environment variables; the name is prefixed -with "OCF_RESKEY_". - -The instance parameter "force" with the value "yes" thus becomes: - OCF_RESKEY_force=yes -in the environment. - -See section 4. for a more formal explanation of instance parameters. - - -3.5.2. Special parameters - -The entire environment variable name space starting with OCF_ is considered to -be reserved.
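As an illustration of the parameter passing described in 3.5.1., the
following is a minimal sketch of how a shell-based RA might read its instance
parameters from the environment. The helper names "ra_list_params" and
"ra_get_param" are hypothetical and not defined by this draft; only the
"OCF_RESKEY_" prefix is taken from the text above.

```shell
# Hypothetical helpers for a shell RA; these names are illustrative
# assumptions and are not part of the OCF draft.

# Print every instance parameter passed in by the RM as name=value
# lines, with the OCF_RESKEY_ prefix removed.
ra_list_params() {
    env | sed -n 's/^OCF_RESKEY_//p'
}

# Print the value of one instance parameter, or a default if it is
# unset or empty.
#   $1 = parameter name (must be a valid variable name suffix)
#   $2 = default value (optional)
ra_get_param() {
    eval "printf '%s\n' \"\${OCF_RESKEY_$1:-\$2}\""
}
```

With OCF_RESKEY_force=yes exported by the caller, "ra_get_param force no"
would print "yes", while a parameter that was not passed in falls back to the
supplied default.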
- -Currently, the following additional parameters are defined: - -OCF_RA_VERSION_MAJ -OCF_RA_VERSION_MIN - Version number of the OCF Resource Agent API. If the script does - not support this revision, it should report an error. - - See 3.1. for an explanation of the versioning scheme used. The version - number is split into two numbers for ease of use in shell scripts. - - These two may be used by the RA to determine whether it is run under - an OCF compliant RM. - - Example: OCF_RA_VERSION_MAJ=1 - OCF_RA_VERSION_MIN=1 - -OCF_ROOT - Refers to the root of the OCF directory hierarchy. - - Example: OCF_ROOT=/usr/ocf - -OCF_MONITOR_SKIP_QOS - If set, the monitor action may skip expensive checks regarding the - quality of the service. This effectively removes the distinction - between the fine-grained states 0-119 and can be used if the RM is - only interested in whether the resource is active at all - whether - partially or fully - or not. - - -3.6. Exit codes - -These exit codes were largely modeled after the LSB 1.1.0 spec for -compatibility, except for the "monitor" action. - -3.6.1. "monitor" - -OK for fail-over / restart concerns: - 0 - 29 running OK - 30 - 59 running with warning / minor QoS offset - -NOT OK wrt fail-over / restart: - 60 - 89 running with error / partially / QoS violation - 90 - 119 not running, but with error / not cleaned up - -120 - 149 not running at all (cleanly stopped) - -Hard errors: - -150 - 179 Execution error (called with wrong syntax etc) -180 - 255 Reserved - - -3.6.2. "start", "stop", "restart", "reload" - -0 No error, action succeeded completely -1 generic or unspecified error (current practice) -2 invalid or excess argument(s) -3 unimplemented feature (for example, "reload") -4 user had insufficient privilege -5 program is not installed -6 program is not configured -7 program is not running -8-99 reserved for future LSB use -100-149 reserved for distribution use -150-199 reserved for application use -200-254 reserved - - -3.6.3.
"validate-all" - -0 No error -1 Semantical error -2 Syntactical error in at least one of the fields -8-99 reserved for future LSB use -100-149 reserved for distribution use -150-199 reserved for application use -200-254 reserved - -3.6.4. "meta-data" - -0 No error -1-254 Hard Resource Agent failure - - -4. Relation to the LSB - -It is required that the current LSB spec is fully supported by the system. - -The API tries to make it possible to have RA function both as a normal LSB -init script and a cluster-aware RA, but this is not required functionality. -The RAs could however use the helper functions defined for LSB init scripts. - - -5. RA meta data - -5.1. Format - -We have the following requirements which are not fulfilled by the LSB way of -embedding meta data into the beginning of the init scripts: - -- Independent of the language the RA is actually written in, -- Extensible, -- Structured, -- Easy to parse from a variety of languages. - -This is why we use simple XML to describe the RA meta data. The DTD for this -API can be found at http://www.opencf.org/standards/ra-api-1.dtd. - -5.2. Semantics - -An example of a valid meta data output is provided in -ra-metadata-example.xml. - - -A. References - -Individual contributors, ordered by last name: - - Greg Freemyer - Ragnar Kjørstad - Lars Marowsky-Brée - Alan Robertson - -B. Change Log - - -C. To-do list - -Reference a "style guide" document to explain where <>, "" etc have been used -and why. - -Move the terminology definitions out into a separate document common to all -OCF work. - -Complete contributor list. - -An interface where the RA asynchronously informs the RM of failures is planned -but not defined yet. - - -DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT DRAFT -