Fix: scheduler: process remote shutdowns correctly
588a7c6bcdef
Description

When unpacking node histories, the scheduler can make multiple passes through
the node_state entries, because the state of remote node connections (on other
nodes) must be known before the history of the remote node itself can be
unpacked.
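The multi-pass idea can be sketched roughly as follows. This is an illustration only, not the code in unpack.c: `struct node_state`, its fields, `unpack_node_histories()` and `passes_for_order()` are all hypothetical names, and "unpacking" is reduced to setting a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one node_state entry */
struct node_state {
    bool is_remote; /* remote/guest node, reached through a connection */
    int host;       /* index of the entry whose history holds the connection's
                     * state; -1 for cluster nodes */
    bool unpacked;  /* history already processed */
};

/* Repeat passes until a pass unpacks nothing new.
 * Returns the number of passes that made progress.
 */
static int unpack_node_histories(struct node_state *entries, size_t n)
{
    int passes = 0;
    bool changed = true;

    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            struct node_state *e = &entries[i];

            if (e->unpacked) {
                continue;
            }
            if (e->is_remote && !entries[e->host].unpacked) {
                continue; /* connection state not known yet, retry next pass */
            }
            e->unpacked = true;
            changed = true;
        }
        if (changed) {
            passes++;
        }
    }
    return passes;
}

/* Build a two-entry scenario: one cluster node hosting one remote node */
static int passes_for_order(bool remote_first)
{
    struct node_state entries[2];
    int cluster = remote_first ? 1 : 0;
    int remote = remote_first ? 0 : 1;

    entries[cluster] = (struct node_state) { false, -1, false };
    entries[remote] = (struct node_state) { true, cluster, false };
    return unpack_node_histories(entries, 2);
}
```

In this sketch, a remote node listed before the node hosting its connection needs a second pass, while the opposite order completes in one.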

When unpacking a remote or guest node's history, the scheduler also unpacks its
transient attributes. If the shutdown attribute has been set, the scheduler
marks the node as shutting down.

Previously, it would also set the remote connection's next role to stopped at
that point. However, if the connection's history on another node was processed
later during node history unpacking, and a probe had found the connection not
running, the next role was reset to unknown. The connection stop would then not
be scheduled, and the shutdown would hang until it timed out.

Now, set the remote connection's next role to stopped for shutdowns only after
all node histories have been unpacked.
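The ordering problem and the fix can be illustrated with a minimal sketch. It assumes simplified, hypothetical types and names (`struct node`, `struct connection`, `unpack_entry()`, `unpack_all()`, `simulate()`) and is not the actual change to unpack.c. The `defer_stop` flag switches between the old behavior (stop set while unpacking) and the new one (stop set after all entries are unpacked).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for scheduler types */
enum role { ROLE_UNKNOWN, ROLE_STOPPED, ROLE_STARTED };

struct connection {
    enum role next_role;
};

struct node {
    bool shutdown;           /* set from the transient shutdown attribute */
    struct connection *conn; /* remote connection; NULL for cluster nodes */
};

enum entry_kind { ENTRY_REMOTE_NODE_HISTORY, ENTRY_PROBE_NOT_RUNNING };

struct entry {
    enum entry_kind kind;
    struct node *node;  /* remote node this entry concerns */
    bool shutdown_attr; /* shutdown attribute present in this entry */
};

static void unpack_entry(const struct entry *e, bool stop_while_unpacking)
{
    assert(e != NULL && e->node != NULL && e->node->conn != NULL);

    switch (e->kind) {
    case ENTRY_REMOTE_NODE_HISTORY:
        if (e->shutdown_attr) {
            e->node->shutdown = true;
            if (stop_while_unpacking) { /* old behavior */
                e->node->conn->next_role = ROLE_STOPPED;
            }
        }
        break;
    case ENTRY_PROBE_NOT_RUNNING:
        /* Connection history processed on another node resets the role */
        e->node->conn->next_role = ROLE_UNKNOWN;
        break;
    }
}

static void unpack_all(const struct entry *entries, size_t n_entries,
                       struct node *nodes, size_t n_nodes, bool defer_stop)
{
    for (size_t i = 0; i < n_entries; i++) {
        unpack_entry(&entries[i], !defer_stop);
    }
    if (defer_stop) {
        /* New behavior: every history is known, so nothing can reset this */
        for (size_t i = 0; i < n_nodes; i++) {
            if (nodes[i].shutdown && nodes[i].conn != NULL) {
                nodes[i].conn->next_role = ROLE_STOPPED;
            }
        }
    }
}

/* Remote node history first, then a probe result for its connection */
static enum role simulate(bool defer_stop)
{
    struct connection conn = { ROLE_STARTED };
    struct node remote = { false, &conn };
    struct entry entries[] = {
        { ENTRY_REMOTE_NODE_HISTORY, &remote, true },
        { ENTRY_PROBE_NOT_RUNNING, &remote, false },
    };

    unpack_all(entries, 2, &remote, 1, defer_stop);
    return conn.next_role;
}
```

With the stop set during unpacking, the later probe entry leaves the next role at unknown; with the stop deferred to a final pass, the next role ends up stopped regardless of entry order.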

Details

Provenance

kgaillot

Authored on Jan 22 2021, 5:45 PM

Parents

rP2ae780b8746f: Log: scheduler: use new function to set a resource's next role

Changes (1)

Path

lib/pengine/unpack.c

rP588a7c6bcdef