The [[https://github.com/ClusterLabs/OCF-spec/blob/master/ra/1.1/resource-agent-api.md|OCF Resource Agent API 1.1]] standard released in 2021 offers new features for resource agents. This document describes how to update an existing OCF 1.0 compliant resource agent to be compatible with OCF 1.1.
== Meta-Data ==
=== DOCTYPE ===
Many resource agents have a line like this near the top of their meta-data:
```
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
```
The OCF 1.1 standard switched the schema document from DTD to RNG format, so there is no ra-api-1.dtd anymore. There is no DOCTYPE syntax for RNG, so the easiest solution is just to drop the (optional) DOCTYPE element entirely.
=== Version ===
The only required step for OCF 1.1 support is to update the `<version>` element in the top level of meta-data:
```
<version>1.1</version>
```
That's it! Everything else is optional.
Note that the `<version>` element lists the OCF standard that the agent supports; the `version` attribute inside the `<resource-agent>` element lists the version of the agent itself (and can be any value desired).
=== Description ===
In the OCF 1.0 standard, there was no place for a description of the agent itself. OCF 1.1 adopted the already-common practice of using `<longdesc>` and `<shortdesc>` elements in the top level of the meta-data for this purpose. If you don't already use them, add them if desired. Example:
```
<longdesc lang="en">
This is a long description of a theoretical resource agent that doesn't really exist. You could say whatever you want about its purpose here. The short description below is, well, a short description.
</longdesc>
<shortdesc lang="en">Super-duper resource agent that does everything</shortdesc>
```
=== Unique parameters ===
The `unique` attribute in parameters is now deprecated. You can keep it if you want to be compatible with older software that looks for it, but removing it is recommended.
Instead, add `unique-group` attributes for every set of parameters that should be unique for each instance of the resource agent. Here is an example (just the relevant portions) of an agent that requires an IP address and port combination that must be unique (the value "address" is arbitrary):
```
<parameters>
  <parameter name="ip" unique-group="address">
    ...
  </parameter>
  <parameter name="port" unique-group="address">
    ...
  </parameter>
  ...
</parameters>
```
=== Required parameters ===
Mark any required parameters (those the user must specify) with the new `required="1"` attribute.
=== Deprecated parameters ===
Mark any deprecated parameters with the new `<deprecated>` child element, which may optionally contain `<replaced-with>` child elements indicating parameters that should be used instead, and `<desc>` child elements explaining the deprecation for users (potentially with multiple translations). Example:
```
<parameter name="foo">
  <deprecated>
    <replaced-with name="mode"/>
    <desc lang="en">Don't use foo, it's bad.</desc>
    <desc lang="cs">Nepoužívej foo, sic to schytáš.</desc>
  </deprecated>
  <longdesc lang="en">
    Whether the example daemon should operate with foo factor
  </longdesc>
  <shortdesc lang="en">Foo factor</shortdesc>
  <content type="string" />
</parameter>
```
=== Enumerated parameter values ===
If you have any parameters that take specific values, you can now enumerate those values instead of allowing free-form text. Example:
```
<parameter name="mode">
  <longdesc lang="en">
    The mode the example daemon should operate in. Allowed values are "dry-run" and
    "live".
  </longdesc>
  <shortdesc lang="en">Run mode</shortdesc>
  <content type="select" default="live">
    <option value="dry-run" />
    <option value="live" />
  </content>
</parameter>
```
=== Reloadable parameters ===
OCF 1.1 adds the concept of reloadable parameters, which serves the purpose that Pacemaker previously used the now-deprecated `unique` attribute for.
If a parameter value can be changed without requiring a full stop and start of the service itself, mark the parameter with the new `reloadable="1"` attribute. This is //not// related to reloading the service itself, just the agent parameter values.
An example might be a web server agent that can use one of several clients to check the server status. The parameter that specifies the client can be changed without restarting the web server itself, so it could be marked as reloadable. The user can change the value of that parameter with no downtime for the web server.
If you mark any parameters as reloadable, you also have to implement a `reload-agent` action as described below, and advertise the action in meta-data.
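As a rough sketch of how the web server example might look in a shell-based agent, the function below prints meta-data with one reloadable parameter and the matching `reload-agent` action. The agent name `example`, the parameter name `status_client`, its default, and all timeouts are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Hypothetical meta-data action for an agent with one reloadable parameter.
# "example" and "status_client" are illustrative names, not from a real agent.
example_meta_data() {
    cat <<'END'
<?xml version="1.0"?>
<resource-agent name="example" version="0.1">
  <version>1.1</version>
  <longdesc lang="en">Example agent with a reloadable parameter</longdesc>
  <shortdesc lang="en">Example agent</shortdesc>
  <parameters>
    <parameter name="status_client" reloadable="1">
      <longdesc lang="en">Client program used to check the server status</longdesc>
      <shortdesc lang="en">Status client</shortdesc>
      <content type="string" default="curl" />
    </parameter>
  </parameters>
  <actions>
    <action name="start" timeout="20s" />
    <action name="stop" timeout="20s" />
    <action name="monitor" timeout="20s" interval="10s" />
    <action name="reload-agent" timeout="20s" />
    <action name="validate-all" timeout="20s" depth="0" />
    <action name="meta-data" timeout="5s" />
  </actions>
</resource-agent>
END
}
```

The point to notice is that `reloadable="1"` on the parameter and the `reload-agent` entry in `<actions>` go together.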
=== validate-all depth ===
Add `depth="0"` to the meta-data for the `validate-all` action. (Be //sure// that the action follows the OCF 1.1 usage of `OCF_CHECK_LEVEL` as described later!)
== Actions ==
=== notify ===
Pacemaker implemented an extension to OCF 1.0 for clone resources (which can run on multiple cluster nodes at the same time). These resource agents could optionally receive notifications before and after resource actions on any instance, via the `notify` action.
OCF 1.1 has adopted the `notify` action, but left its behavior undescribed. Continue using it or not as desired.
=== promote and demote ===
Another Pacemaker extension was promotable resources (clones whose instances can run in one of two modes). The `start` and `demote` actions bring an instance to the default mode, and the `promote` action brings the instance to the special mode. OCF 1.1 adopts these actions.
A major difference from the older Pacemaker implementation is that the role names are now `Unpromoted` and `Promoted` rather than `Slave` and `Master`, respectively. Newer versions of Pacemaker support both sets of names.
If your agent already implements promotable clones, update any mentions of the role names. The agent won't be able to support both the old and new names, because only one set can be advertised in monitor action meta-data. If you advertise the old names, advertise OCF 1.0 support; if you advertise the new names, advertise OCF 1.1 support.
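To make the pairing of role names and advertised OCF version concrete, here is a minimal sketch of a helper that prints the monitor action elements for whichever version the agent advertises. The function name, timeouts, and intervals are hypothetical. The sketch assumes the role is advertised through a `role` attribute on the monitor `<action>` elements.

```shell
#!/bin/sh
# Hypothetical helper: print monitor action meta-data for a promotable agent.
# $1 is the OCF version the agent advertises ("1.0" or "1.1").
monitor_actions() {
    case "$1" in
        1.0) promoted="Master";   unpromoted="Slave" ;;
        *)   promoted="Promoted"; unpromoted="Unpromoted" ;;
    esac
    cat <<END
<action name="monitor" timeout="20s" interval="10s" role="$promoted" />
<action name="monitor" timeout="20s" interval="11s" role="$unpromoted" />
END
}
```

Calling `monitor_actions 1.1` from within the agent's meta-data output would advertise the new names only.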
=== reload and reload-agent ===
The `reload` action previously had conflicting uses; most resource agents used it to reload the service itself, while Pacemaker used it to reload agent parameters.
In OCF 1.1, the `reload` action is now reserved for reloading the service itself. For example, if the service can re-read its configuration file after receiving a signal, the reload action can send that signal. This is equivalent to how init scripts and systemd unit files use reload.
The new `reload-agent` action is for making effective any changes in parameters marked `reloadable`. Many times this will be a no-op -- in the earlier example of a web server agent that has a reloadable parameter for which client to use to contact the web server, nothing special needs to be done if that parameter is changed (the agent will simply use the new value the next time it needs to contact the web server). A different example might be a database agent with a reloadable parameter for whether the database is in read-only or read/write mode; the agent might contact the database server with a client to change the mode, which would be much quicker (and have no downtime) compared to a full database restart.
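The split between the two actions could be sketched in a shell agent roughly as follows. All function names are hypothetical. The `reload` body only prints a message where a real agent would signal the service, and `reload-agent` is shown as the no-op case from the web server example.

```shell
#!/bin/sh
# Use the standard OCF numeric values if no include file has defined them.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_UNIMPLEMENTED:=3}"

# reload: ask the service itself to re-read its configuration.
# A real agent would look up the service PID and send it a signal here.
example_reload() {
    echo "reload: asking the service to re-read its configuration"
    return "$OCF_SUCCESS"
}

# reload-agent: make changed reloadable parameters effective.
# Here nothing needs doing, since the agent reads the new values on its next run.
example_reload_agent() {
    echo "reload-agent: nothing to do"
    return "$OCF_SUCCESS"
}

# Route the requested action to its handler.
example_dispatch() {
    case "$1" in
        reload)       example_reload ;;
        reload-agent) example_reload_agent ;;
        *)            return "$OCF_ERR_UNIMPLEMENTED" ;;
    esac
}
```

An agent's main `case "$1" in ...` block would normally handle these two actions alongside `start`, `stop`, and the rest.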
=== OCF_OUTPUT_FORMAT ===
In OCF 1.1, agents may optionally support displaying output in multiple formats. The desired format will be passed via the `OCF_OUTPUT_FORMAT` environment variable. The specific formats supported are left to the agent, as are the values used to identify them (it is recommended to use "text" for human-readable text and "xml" for XML, if supported).
Following existing practice, the `meta-data` action must default to using XML output, and all other actions must default to text. It is totally up to you whether to support anything else.
Mainly this is expected to be used for the `validate-all` action, to be able to return XML for better machine parsing. However, the XML schema has not been standardized, so this will be an area of experimentation in the near future.
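A minimal sketch of honoring `OCF_OUTPUT_FORMAT` in a shell agent might look like this. The `report_result` helper and the `<result>` element are invented for illustration (remember that no XML schema has been standardized), and the message is not XML-escaped.

```shell
#!/bin/sh
# Hypothetical helper: print a message in the requested output format.
# $1 is the message; the XML layout is made up for this example.
report_result() {
    case "${OCF_OUTPUT_FORMAT:-text}" in
        xml) printf '<result message="%s" />\n' "$1" ;;
        *)   printf '%s\n' "$1" ;;   # text is the default for non-meta-data actions
    esac
}
```

Unknown format values fall through to plain text here, which is just one possible design choice.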
=== OCF_CHECK_LEVEL ===
OCF 1.0 and 1.1 both support the `OCF_CHECK_LEVEL` environment variable for the `monitor` action, to determine the depth (service impact) of the check performed.
OCF 1.1 extends this to the `validate-all` action as well. If not specified or 0, only syntax and consistency checks should be done (for example, verifying that a parameter value is an integer if that's appropriate). If 10, the agent may additionally verify the suitability of the local host (for example, that a necessary directory exists).
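Here is one way a `validate-all` action could follow these two levels, as a sketch only. The parameters `port` and `datadir` are hypothetical. The exit statuses follow the OCF 1.1 meanings of `OCF_ERR_CONFIGURED` (6, internally invalid) and `OCF_ERR_ARGS` (2, invalid on the local host).

```shell
#!/bin/sh
# Use the standard OCF numeric values if no include file has defined them.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_ARGS:=2}"
: "${OCF_ERR_CONFIGURED:=6}"

# Hypothetical parameters: OCF_RESKEY_port (integer), OCF_RESKEY_datadir (directory).
example_validate_all() {
    # Level 0 (the default): syntax and consistency checks only.
    case "${OCF_RESKEY_port:-}" in
        ''|*[!0-9]*) return "$OCF_ERR_CONFIGURED" ;;   # empty or not an integer
    esac

    # Level 10: additionally check the suitability of the local host.
    if [ "${OCF_CHECK_LEVEL:-0}" -ge 10 ]; then
        [ -d "${OCF_RESKEY_datadir:-}" ] || return "$OCF_ERR_ARGS"
    fi

    return "$OCF_SUCCESS"
}
```

With this structure, a level 0 run never touches the local filesystem, so it can safely be run on any host.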
=== Exit statuses ===
The meaning of a couple of exit statuses has been clarified:
* OCF_ERR_ARGS (2): parameters are invalid in the context of the local host (such as a nonexistent configuration file)
* OCF_ERR_CONFIGURED (6): parameters are internally invalid (such as a string given where only an integer is allowed)
In addition, new exit statuses that were Pacemaker extensions have been adopted:
* OCF_RUNNING_PROMOTED (8): properly running in the promoted role
* OCF_FAILED_PROMOTED (9): failed in the promoted role
* OCF_RUNNING_DEGRADED (190): properly running but failure is more likely in the near term
* OCF_PROMOTED_DEGRADED (191): properly running in the promoted role but degraded
The symbolic names for these new statuses might or might not be defined by shell include files, so be aware of what includes you are using. If you want to maintain compatibility with older includes, you can define each symbol you need if it's not already defined, like:
```
: ${OCF_RUNNING_PROMOTED:=8}
```
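Extending that idea to all four new statuses, and showing how a monitor action might pick among them, could look roughly like this. The `monitor_status` helper and its arguments are hypothetical.

```shell
#!/bin/sh
# Fall back to the OCF 1.1 numeric values when the include files in use
# do not define the newer symbolic names.
: "${OCF_SUCCESS:=0}"
: "${OCF_RUNNING_PROMOTED:=8}"
: "${OCF_FAILED_PROMOTED:=9}"
: "${OCF_RUNNING_DEGRADED:=190}"
: "${OCF_PROMOTED_DEGRADED:=191}"

# Hypothetical helper: map a role and health state to a monitor exit status.
# $1 is "promoted" or "unpromoted"; $2 is "ok" or "degraded".
monitor_status() {
    case "$1:$2" in
        promoted:ok)         return "$OCF_RUNNING_PROMOTED" ;;
        promoted:degraded)   return "$OCF_PROMOTED_DEGRADED" ;;
        unpromoted:degraded) return "$OCF_RUNNING_DEGRADED" ;;
        *)                   return "$OCF_SUCCESS" ;;
    esac
}
```

Because `:=` only assigns when the variable is unset or empty, values already provided by an include file are left alone.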
== Pacemaker-specific changes for promotable clones ==
Pacemaker implements a number of extensions to the OCF standard. Pacemaker 2.1.0 and later make significant changes to these extensions with regard to promotable clones, so if you have an existing agent that supports promotable clones, these will affect you:
* Pacemaker now provides resource agents with new environment variables (in addition to the existing ones) for promotable clone notifications, with `master` replaced with `promoted` and `slave` replaced with `unpromoted`. For example, `OCF_RESKEY_CRM_meta_notify_unpromoted_resource` will be identical to `OCF_RESKEY_CRM_meta_notify_slave_resource`. Use the new names in your agent. If you want to stay compatible with older Pacemaker versions, put something like this all on one line near the top of your agent for each relevant variable the agent uses:
```
: ${OCF_RESKEY_CRM_meta_notify_unpromoted_resource:=$OCF_RESKEY_CRM_meta_notify_slave_resource}
```
* The `crm_master` command has been deprecated and replaced with a new `crm_attribute --promotion` option that defaults to `--lifetime=reboot` (example: `crm_master -l reboot -v 10` becomes `crm_attribute --promotion -v 10`). The old command will still work for now, but the new one should be used if available. The new option is available as of CRM feature set 3.9.0, which can be tested like:
```
# ocf_version_cmp returns 0 if the first version is lower, 1 if they are equal
ocf_version_cmp "3.9.0" "$OCF_RESKEY_crm_feature_set"
if [ $? -le 1 ]; then
    crm_attribute --promotion <<options...>>
else
    crm_master <<options...>>
fi
```
The ocf-shellfuncs include file from the resource-agents project might add some wrappers to simplify the above.