
Create command-line method of listing all possible meta-attributes, cluster options, and environment variables
Open, Normal, Public

Authored By: kgaillot, Jan 24 2023, 10:21 AM

Description

Pacemaker has three main types of options: environment variables (set in sysconfig or similar), cluster options (set in the CIB crm_config section), and object meta-attributes (set under the particular object or object defaults in the CIB).

The CIB manager, controller, and scheduler take a metadata command-line argument to output OCF-like metadata for the cluster options that they use. This has a couple of problems: all three daemons have to be run to see all options, and options that are used by more than one daemon appear in more than one set of metadata.

etc/sysconfig/pacemaker documents most but not all of the environment variables, and that documentation isn't available in a convenient OCF-like form from the command line.

Pacemaker accepts meta-attributes for alert, primitive, group, clone, bundle, operation, rsc_defaults, and op_defaults. Pacemaker Explained documents most but not all of them, and again that documentation isn't available in a parseable form.

crm_attribute would be a reasonable place to put new command-line options for this, maybe --list-options, --list-environment, and --list-meta [context], or maybe a single --list-options [type] argument. They should support both text and XML output, with the XML being OCF-like.

As part of this, we will likely need some sort of internal API for these that the rest of the code must use to access options (rather than directly access them via getenv() etc.), so that we can guarantee we have one comprehensive list. This could replace or evolve from pcmk__cluster_option_t. It might even be worthwhile to keep the OCF-like metadata in XML files to be used as the definitive list, with the sysconfig file, ReST files for the books, and C and Python data files created from it automatically via make targets. Alternatively, we could install the XML files, and C and Python APIs could parse them.
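
To illustrate the "install the XML files and let APIs parse them" alternative, here is a minimal Python sketch of what a consumer might do with OCF-like metadata. The sample document is hypothetical: its element layout follows the general OCF resource agent metadata shape, but the root name, descriptions, and the function name `parse_options` are assumptions made for illustration, not actual Pacemaker output or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real metadata may use different attributes and wording.
SAMPLE_METADATA = """\
<resource-agent name="cluster-options">
  <parameters>
    <parameter name="stonith-enabled">
      <shortdesc lang="en">Whether fencing is enabled</shortdesc>
      <content type="boolean" default="true"/>
    </parameter>
    <parameter name="cluster-recheck-interval">
      <shortdesc lang="en">How often to recheck the cluster</shortdesc>
      <content type="time" default="15min"/>
    </parameter>
  </parameters>
</resource-agent>
"""


def parse_options(xml_text):
    """Map each option name to its type, default, and short description."""
    root = ET.fromstring(xml_text)
    options = {}
    for param in root.iter("parameter"):
        content = param.find("content")
        options[param.get("name")] = {
            "type": content.get("type") if content is not None else None,
            "default": content.get("default") if content is not None else None,
            "shortdesc": param.findtext("shortdesc", default=""),
        }
    return options


options = parse_options(SAMPLE_METADATA)
```

A generator for the sysconfig file or the ReST sources could iterate over the same dictionary, which is the appeal of having one definitive list.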

Feel free to separate out subtasks for this.

See:

Event Timeline

kgaillot triaged this task as Normal priority. (Jan 24 2023, 10:21 AM)
kgaillot created this task.
kgaillot renamed this task from "Create command-line method of listing all possible meta-attributes" to "Create command-line method of listing all possible meta-attributes, cluster options, and environment variables". (Mar 28 2023, 10:33 AM)
kgaillot updated the task description.
kgaillot added a subtask: Restricted Maniphest Task. (Oct 2 2023, 5:41 PM)

It might even be worthwhile to keep the OCF-like metadata in XML files to be used as the definitive list, with the sysconfig file, ReST files for the books, and C and Python data files created from it automatically via make targets. Alternatively, we could install the XML files, and C and Python APIs could parse them.

I feel like internationalization might be easier if we store all the options (with their descriptions, etc.) in C files. Then we can generate XML on demand.

This may also reduce duplication: we can use bit flags for options that are applicable to more than one type of object. It looks like we could do this by referencing an XML entity. However, that strikes me as less intuitive, and it would make the XML files harder to parse. (Parsability by external tools seems like the main advantage of using a static XML file as the source of truth.)

we will likely need some sort of internal API for these that the rest of the code must use to access options (rather than directly access them via getenv() etc.), so that we can guarantee we have one comprehensive list.

This is reasonable as a stylistic choice and as a reminder to update the meta-data when we add an option. However, it doesn't seem either necessary or sufficient in order to guarantee we have one comprehensive list.

  • Not necessary: All that's required is to add new options to the meta-data. How we set them doesn't have to fundamentally change.
    • We still have to getenv() at daemon startup, before we can store env vars in an internal data structure for later gets and sets. This basically means we're storing the same option in two ways. It can also complicate environment inheritance by forked child processes, if we start setting internal options instead of env vars.
    • Meta-attributes can stay in their param hashes.
    • One exception: it will be simpler to give each daemon an identical config hash for cluster properties, instead of each daemon managing the subset that it cares about.
  • Not sufficient: Pacemaker shouldn't validate that a particular cluster property or meta-attribute is part of the list, since users can specify arbitrary attributes. (Higher-level tools like pcs may want to do so.) So nothing will "catch" if we get an option from a param hash or from the environment that isn't in the meta-data.
In T620#10726, @nrwahl2 wrote:

I feel like internationalization might be easier if we store all the options (with their descriptions, etc.) in C files. Then we can generate XML on demand.

This may also reduce duplication: we can use bit flags for options that are applicable to more than one type of object. It looks like we could do this by referencing an XML entity. However, that strikes me as less intuitive, and it would make the XML files harder to parse. (Parsability by external tools seems like the main advantage of using a static XML file as the source of truth.)

If we're using C as the starting point, we don't need XML, we just need tools that can parse the C code (we can require a particular style to make that easier) and generate sysconfig, ReST, and Python.

In T620#10727, @nrwahl2 wrote:

we will likely need some sort of internal API for these that the rest of the code must use to access options (rather than directly access them via getenv() etc.), so that we can guarantee we have one comprehensive list.

This is reasonable as a stylistic choice and as a reminder to update the meta-data when we add an option. However, it doesn't seem either necessary or sufficient in order to guarantee we have one comprehensive list.

Yep, I was thinking mainly of the metadata, to avoid having random undocumented options introduced here and there.
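
As a rough illustration of the accessor idea being discussed (sketched in Python for brevity, although the real code would be C), the function below only hands out values for names that appear in a single registry, while still reading through to the environment so nothing is stored twice. The registry contents, the descriptions, and the name `local_option` are hypothetical. As noted above, this catches undocumented names only for code that goes through the accessor; it does not validate what users configure.

```python
import os

# Hypothetical registry: the one list that documentation would be generated from.
# Descriptions are illustrative placeholders.
LOCAL_OPTIONS = {
    "PCMK_debug": "Whether to enable debug logging",
    "PCMK_logfile": "Where to write detail logs",
}


def local_option(name, environ=os.environ):
    """Return a documented local option's value, or None if it is unset.

    Raises KeyError for names missing from the registry, so a developer who
    adds a new option is reminded to document it.
    """
    if name not in LOCAL_OPTIONS:
        raise KeyError(f"undocumented local option: {name}")
    return environ.get(name)
```

Because the value is always fetched from the environment mapping, forked child processes keep inheriting options the usual way.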

nrwahl2 closed subtask Restricted Maniphest Task as Merged. (Jan 2 2024, 5:13 PM)
nrwahl2 closed subtask Restricted Maniphest Task as Merged. (Jan 3 2024, 1:54 PM)
kgaillot changed the visibility from "All Users" to "Public (No Login Required)".

If we're using C as the starting point, we don't need XML, we just need tools that can parse the C code (we can require a particular style to make that easier) and generate sysconfig, ReST, and Python.

Running a command that generates XML on demand (which is our main goal anyway), and then parsing that, is probably easier than parsing the C code.
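
A small sketch of that flow, in Python for brevity: an in-code option table is the starting point, and OCF-like XML is produced from it on demand. The table entries, descriptions, root element name, and the function name `build_metadata` are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-code table: (name, type, default, short description)
OPTION_TABLE = [
    ("stonith-enabled", "boolean", "true", "Whether fencing is enabled"),
    ("cluster-recheck-interval", "time", "15min", "How often to recheck the cluster"),
]


def build_metadata(name, table):
    """Serialize an option table as OCF-like metadata XML text."""
    root = ET.Element("resource-agent", name=name)
    params = ET.SubElement(root, "parameters")
    for opt_name, opt_type, default, shortdesc in table:
        param = ET.SubElement(params, "parameter", name=opt_name)
        ET.SubElement(param, "shortdesc", lang="en").text = shortdesc
        ET.SubElement(param, "content", type=opt_type, default=default)
    return ET.tostring(root, encoding="unicode")


xml_text = build_metadata("cluster-options", OPTION_TABLE)
```

Downstream generators (sysconfig, ReST, Python data) would then consume `xml_text` with any XML parser instead of needing to understand the C source.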


My plan is to get everything working for cluster options before moving on to other types of options. Then use this process as a basis for the rest.

The main wrinkle I foresee is defining which meta-attributes are valid for which types of resource. For one thing, I'll probably extend pcmk__opt_context to be used as a flag set to indicate which resource type(s) a meta-attribute is valid in, so that we don't duplicate options. Well, unless every composite type is a superset of primitive and the composite types have no overlapping meta-attributes.
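
The flag-set idea can be sketched as follows (Python for brevity; the real type would be a C flag group). `OptContext`, `Option`, and `options_for` are hypothetical names, and the applicability assigned to each meta-attribute here is illustrative rather than an authoritative statement of Pacemaker behavior.

```python
from dataclasses import dataclass
from enum import Flag, auto


class OptContext(Flag):
    """Hypothetical resource types in which a meta-attribute is valid."""
    PRIMITIVE = auto()
    GROUP = auto()
    CLONE = auto()
    BUNDLE = auto()


@dataclass(frozen=True)
class Option:
    name: str
    contexts: OptContext


# One entry per option, however many resource types it applies to.
# The context assignments below are illustrative only.
OPTIONS = [
    Option("priority", OptContext.PRIMITIVE | OptContext.GROUP),
    Option("resource-stickiness", OptContext.PRIMITIVE),
    Option("promotable", OptContext.CLONE),
]


def options_for(context):
    """List the names of options valid in the given context."""
    return [opt.name for opt in OPTIONS if opt.contexts & context]
```

With this layout, an option shared by several resource types carries one description and one table row, and a per-type listing is just a filter.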

On that note, I need to figure out what (everything?) gets inherited -- in other words, does every primitive meta-attribute make sense to set on a bundle or a clone? I haven't gone through them yet. Descriptions will have to account for this sort of thing too, unless we have a separate table for each resource type, which could get huge.

We still need to do environment variables (which we'll call local options), resource meta-attributes, alert meta-attributes, operation meta-attributes, and special stonith instance attributes.

There are some XML attributes that are important but don't fit neatly into this plan because they're not meta-attributes. For example, <alert path=X>, which is exclusively an XML attribute, and <op enabled=X timeout=Y interval=Z>, where all of these should be valid as either XML attrs or meta-attrs. I suppose we can leave the XML attrs to the schema.

Stonith instance attributes are a somewhat interesting case too. They're the only object-specific attributes we're likely to include that aren't meta-attributes. They're currently listed in the fencer metadata as you know. I suppose everything will be the same, and the longdesc/shortdesc and command line options will make clear that these are special instance attributes.

In T620#10892, @nrwahl2 wrote:

If we're using C as the starting point, we don't need XML, we just need tools that can parse the C code (we can require a particular style to make that easier) and generate sysconfig, ReST, and Python.

Running a command that generates XML on demand (which is our main goal anyway), and then parsing that, is probably easier than parsing the C code.

Yep, I wasn't thinking ...

My plan is to get everything working for cluster options before moving on to other types of options. Then use this process as a basis for the rest.

The main wrinkle I foresee is defining which meta-attributes are valid for which types of resource. For one thing, I'll probably extend pcmk__opt_context to be used as a flag set to indicate which resource type(s) a meta-attribute is valid in, so that we don't duplicate options. Well, unless every composite type is a superset of primitive and the composite types have no overlapping meta-attributes.

Sort of ...

On that note, I need to figure out what (everything?) gets inherited -- in other words, does every primitive meta-attribute make sense to set on a bundle or a clone? I haven't gone through them yet. Descriptions will have to account for this sort of thing too, unless we have a separate table for each resource type, which could get huge.

Everything set on a collective gets inherited by whatever's inside (though not vice versa). Only some primitive meta-attributes have meaning for the collective itself. For example, setting priority on a group applies to the group *and* is inherited by each member, but setting resource-stickiness on a group is inherited by each member without applying to the group itself (whose stickiness is the sum of its members').
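
As a worked illustration of the stickiness example, the sketch below models inheritance with a lookup chain and sums the members' values to get the group's stickiness. It is a simplification under stated assumptions: that a member's own value takes precedence over an inherited one, and that an unset stickiness counts as 0. The function names are hypothetical.

```python
from collections import ChainMap


def effective_meta(member_meta, group_meta):
    """Meta-attributes seen by a member: its own plus those inherited.

    Assumption: a value set on the member itself wins over the group's.
    """
    return ChainMap(member_meta, group_meta)


def group_stickiness(group_meta, members):
    """Sum of the members' effective stickiness (unset treated as 0 here)."""
    return sum(
        int(effective_meta(member, group_meta).get("resource-stickiness", 0))
        for member in members
    )


group_meta = {"priority": "5", "resource-stickiness": "10"}
members = [{}, {}, {"resource-stickiness": "1"}]
total = group_stickiness(group_meta, members)  # 10 + 10 + 1
```

Here the group's configured value of 10 never applies to the group as such; it only reaches the total through the two members that inherit it.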

I believe all existing collective-specific meta-attributes are mutually exclusive by collective type (unless you count the ones inherited from primitives), but I wouldn't rely on it.

So:

  • Some meta-attributes apply only to specific collective type(s).
  • Some meta-attributes apply to specific collective type(s) and primitives.
  • Any primitive meta-attribute can be set on a collective and will be inherited by its inner resources.
  • Users are free to set arbitrary meta-attributes on any resource and use them in rules.

For the purposes of showing lists, I would think we want to show the meta-attributes that can apply directly to the thing being requested (not just inherited).

We still need to do environment variables (which we'll call local options), resource meta-attributes, alert meta-attributes, operation meta-attributes, and special stonith instance attributes.

There are some XML attributes that are important but don't fit neatly into this plan because they're not meta-attributes. For example, <alert path=X>, which is exclusively an XML attribute, and <op enabled=X timeout=Y interval=Z>, where all of these should be valid as either XML attrs or meta-attrs. I suppose we can leave the XML attrs to the schema.

Yeah that's an unfortunate design. I think we should list only things that can be set in a meta-attributes block. I suppose someone might want to show XML-only attributes as well, but we could leave that as a future enhancement.

Stonith instance attributes are a somewhat interesting case too. They're the only object-specific attributes we're likely to include that aren't meta-attributes. They're currently listed in the fencer metadata as you know. I suppose everything will be the same, and the longdesc/shortdesc and command line options will make clear that these are special instance attributes.

Yep. FYI the issue is that the fencer just gets instance attributes as passed from the scheduler to the controller (and possibly to the executor) then to the fencer. It doesn't get meta-attributes. It does track the CIB, but it currently doesn't evaluate rules, which would be needed for meta-attributes (and it's possible to register a device via the IPC API without a CIB entry). But I think it would have been better to define them as meta-attributes and have the scheduler add those to the graph specially for fence devices.

For the purposes of showing lists, I would think we want to show the meta-attributes that can apply directly to the thing being requested (not just inherited).

This clearly excludes meta-attributes that are set solely by virtue of inheritance -- for example, a primitive may have the "promotable" meta-attribute set via inheritance, but it should never be set explicitly for the primitive, so it should NOT be included in the list.

However, I'm not sure whether you wanted to include meta-attributes that can be set on a collective for the sole purpose of being inherited by a primitive (sort of the opposite scenario compared to the one above).

  • On the one hand: it's cleaner and clearer not to include them, and to include only those meta-attributes that have a direct effect on the collective. We can simply ensure it's documented (maybe it already is) that primitives inherit all of their parents' meta-attributes. That implies that any primitive meta-attribute can be set on a collective for the purpose of inheritance.
  • On the other hand: excluding them offloads more work to external tools. For example, if pcs is validating meta-attributes for a bundle, then it needs to check the list of bundle meta-attributes, the list of primitive meta-attributes, and maybe the list of clone meta-attributes. That's not a deal-breaker, but it requires that extra knowledge of Pacemaker behavior be built into external tools.

But I think it would have been better to define them as meta-attributes and have the scheduler add those to the graph specially for fence devices.

Maybe worth a wishlist task for future release series. Low return on investment, besides being a clearer and more appropriate design. I presume it'd require an XSL transform too, that goes through all the instance attributes, grabs the "special" ones, and converts them to meta-attributes.

In T620#10902, @nrwahl2 wrote:

However, I'm not sure whether you wanted to include meta-attributes that can be set on a collective for the sole purpose of being inherited by a primitive (sort of the opposite scenario compared to the one above).

  • On the one hand: it's cleaner and clearer not to include them, and to include only those meta-attributes that have a direct effect on the collective. We can simply ensure it's documented (maybe it already is) that primitives inherit all of their parents' meta-attributes. That implies that any primitive meta-attribute can be set on a collective for the purpose of inheritance.
  • On the other hand: excluding them offloads more work to external tools. For example, if pcs is validating meta-attributes for a bundle, then it needs to check the list of bundle meta-attributes, the list of primitive meta-attributes, and maybe the list of clone meta-attributes. That's not a deal-breaker, but it requires that extra knowledge of Pacemaker behavior be built into external tools.

Don't include them. External tools are already aware of collective resources, and for groups and clones, the relevant primitives are explicitly configured. For implicit bundle resources and Pacemaker Remote connections, it's a little different, but those are worth special-casing in those tools.
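
For an external tool, the consequence could look roughly like this sketch: when checking a collective, it combines the collective's own list with the primitive list, since primitive meta-attributes may be set there for inheritance. The lists are abbreviated stand-ins for per-type listing output, and `unknown_meta` is a hypothetical helper. Because users may set arbitrary meta-attributes, a tool would presumably treat the result as warnings rather than errors.

```python
# Hypothetical, abbreviated per-type lists of directly applicable meta-attributes
DIRECT_META = {
    "primitive": {"priority", "resource-stickiness", "target-role"},
    "clone": {"promotable", "clone-max", "interleave"},
}


def unknown_meta(resource_type, configured):
    """Return configured names that are neither direct nor inheritable."""
    known = DIRECT_META[resource_type] | DIRECT_META["primitive"]
    return sorted(set(configured) - known)


suspicious = unknown_meta("clone", ["promotable", "resource-stickiness", "my-custom"])
```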

Do we want the "list cluster options" command to go in libpacemaker, or libcrmcommon?

We normally put endpoints like this in libpacemaker, so that's my default approach. I'm pretty sure that's possible to do in this case, but it will involve an extra step. The options array has to stay in libcrmcommon to support things like pcmk__validate_cluster_options() and pcmk__cluster_option(). We use these in libpe_status and libcib, so they can't go in libpacemaker.

To put pcmk__cluster_option_metadata() ("list cluster options") in libpacemaker, we'd need to pass an enum to pcmk__format_option_metadata() in libcrmcommon. The enum value would tell pcmk__format_option_metadata() to use the static cluster_options array.

Either that or have (basically) duplicate functions -- for example, pcmk__list_cluster_options() in libpacemaker calls pcmk__cluster_option_metadata() in libcrmcommon, which calls pcmk__format_option_metadata().

Future functions (like pcmk__local_option_metadata() for environment variables) would pass a different enum value that tells pcmk__format_option_metadata() to use a different options array.
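
The enum approach can be sketched like this (Python for brevity; names and table contents are hypothetical and only mirror the layering described above). The option arrays stay private to the lower layer, and higher-level wrappers select one by passing an enum value.

```python
from enum import Enum


class OptionSet(Enum):
    """Hypothetical selector for which option array to format."""
    CLUSTER = 1
    LOCAL = 2


# Private to the lower ("common") layer; entries are illustrative (name, type).
_OPTION_ARRAYS = {
    OptionSet.CLUSTER: [("stonith-enabled", "boolean")],
    OptionSet.LOCAL: [("PCMK_debug", "string")],
}


def format_option_metadata(option_set):
    """Lower layer: format whichever array the caller selected."""
    return [f"{name} ({opt_type})" for name, opt_type in _OPTION_ARRAYS[option_set]]


def list_cluster_options():
    """Higher-layer wrapper that never touches the array directly."""
    return format_option_metadata(OptionSet.CLUSTER)


def list_local_options():
    """A later addition only needs a new enum value and a new array."""
    return format_option_metadata(OptionSet.LOCAL)
```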

Thinking of overloading pcmk__opt_context...

In T620#10979, @nrwahl2 wrote:

Do we want the "list cluster options" command to go in libpacemaker, or libcrmcommon?

If it doesn't require anything outside libcrmcommon, I would put the bulk of it there. When we get to the UI (command-line options), the highest-level equivalents of that should be in libpacemaker.

We normally put endpoints like this in libpacemaker, so that's my default approach. I'm pretty sure that's possible to do in this case, but it will involve an extra step. The options array has to stay in libcrmcommon to support things like pcmk__validate_cluster_options() and pcmk__cluster_option(). We use these in libpe_status and libcib, so they can't go in libpacemaker.

To put pcmk__cluster_option_metadata() ("list cluster options") in libpacemaker, we'd need to pass an enum to pcmk__format_option_metadata() in libcrmcommon. The enum value would tell pcmk__format_option_metadata() to use the static cluster_options array.

Either that or have (basically) duplicate functions -- for example, pcmk__list_cluster_options() in libpacemaker calls pcmk__cluster_option_metadata() in libcrmcommon, which calls pcmk__format_option_metadata().

The libcrmcommon functions would do all the processing, and the libpacemaker functions would focus on output

Future functions (like pcmk__local_option_metadata() for environment variables) would pass a different enum value that tells pcmk__format_option_metadata() to use a different options array.

Thinking of overloading pcmk__opt_context...

If it doesn't require anything outside libcrmcommon, I would put the bulk of it there. When we get to the UI (command-line options), the highest-level equivalents of that should be in libpacemaker.
...
The libcrmcommon functions would do all the processing, and the libpacemaker functions would focus on output

That's basically the conundrum. Output is the only thing we're dealing with (there's no meaningful processing otherwise), but the output functions need access to an array that lives in libcrmcommon. There are many ways to approach this that would work. The question is which one's the cleanest and most in line with our existing code.

Edit: The message functions will have to go in libcrmcommon, unless we want to duplicate a lot of XML formatting logic. The deprecated cib_metadata() function in libcib has to call the XML message function. It can't access libpacemaker. At a compatibility break we can move the message functions to libpacemaker if we want... although libcrmcommon will still be simpler since that's where the array lives. (Granted, we could pass it out to libpacemaker as a const pointer.)

How do you feel about a new lightweight CLI tool called something like pcmk_option? I think I can get this into crm_attribute relatively easily. It's just that none of these are attributes.

It would make a ton of sense to list resource meta-attributes (and maybe operation meta-attributes) from crm_resource, and the special stonith instance attributes could go in either crm_resource or stonith_admin. But the rest of the option types (cluster, local/env, alert) don't have a natural home.

In T620#11067, @nrwahl2 wrote:

If it doesn't require anything outside libcrmcommon, I would put the bulk of it there. When we get to the UI (command-line options), the highest-level equivalents of that should be in libpacemaker.
...
The libcrmcommon functions would do all the processing, and the libpacemaker functions would focus on output

That's basically the conundrum. Output is the only thing we're dealing with (there's no meaningful processing otherwise), but the output functions need access to an array that lives in libcrmcommon. There are many ways to approach this that would work. The question is which one's the cleanest and most in line with our existing code.

libpacemaker relies on libcrmcommon, so there's no problem there. Basically the high-level functions, the ones that map directly to command-line usage with a private version taking an output object and a public version that always returns XML, should go in libpacemaker. Anything else can go in libcrmcommon. The high-level functions can be simple wrappers.

Edit: The message functions will have to go in libcrmcommon, unless we want to duplicate a lot of XML formatting logic. The deprecated cib_metadata() function in libcib has to call the XML message function. It can't access libpacemaker. At a compatibility break we can move the message functions to libpacemaker if we want... although libcrmcommon will still be simpler since that's where the array lives. (Granted, we could pass it out to libpacemaker as a const pointer.)

In T620#11095, @nrwahl2 wrote:

How do you feel about a new lightweight CLI tool called something like pcmk_option? I think I can get this into crm_attribute relatively easily. It's just that none of these are attributes.

It would make a ton of sense to list resource meta-attributes (and maybe operation meta-attributes) from crm_resource, and the special stonith instance attributes could go in either crm_resource or stonith_admin. But the rest of the option types (cluster, local/env, alert) don't have a natural home.

I think it would make sense if we were designing from scratch, but crm_attribute actually is the intended place for this. It has always been used to manage cluster options as well as node attributes.

I think it would make sense if we were designing from scratch, but crm_attribute actually is the intended place for this. It has always been used to manage cluster options as well as node attributes.

Okay, that makes sense for cluster options. I'm still not sure it makes sense to put local options there, or to put meta-attributes for alerts, resources, and ops there.

I'm not at that point yet, still finishing up cluster options, but the rest should move faster with the infrastructure in place and the approach settled on.

In T620#11130, @nrwahl2 wrote:

I think it would make sense if we were designing from scratch, but crm_attribute actually is the intended place for this. It has always been used to manage cluster options as well as node attributes.

Okay, that makes sense for cluster options. I'm still not sure it makes sense to put local options there, or to put meta-attributes for alerts, resources, and ops there.

I'm not at that point yet, still finishing up cluster options, but the rest should move faster with the infrastructure in place and the approach settled on.

Yeah, meta-attributes could make more sense in crm_resource, especially since it already has --meta and --operation options that could be overloaded.

Unlike the others, we don't have a tool to change local options. I could see a new one coming with T574, or using crm_attribute for that purpose.

kgaillot added projects: Restricted Project, Restricted Project. (Tue, Jan 30, 4:35 PM)
kgaillot updated the task description.