
Fix: scheduler: don't send clone notifications to a stopped remote node

Description

Since b3f9a5bbb, we discard faked executor results when resource information is
unavailable. This has exposed pre-existing issues where clone notifications were
mistakenly scheduled for Pacemaker Remote nodes. Previously, the cluster node
that had hosted the Pacemaker Remote connection would fake the result, and the
transition would proceed. Now, if that cluster node does not happen to have the
resource information cached, no result is sent, so the action times out and
permanently blocks later actions in the transition.

This commit avoids one such situation, in which start and promote clone
notifications were scheduled for a clone instance on a Pacemaker Remote node
whose remote connection is stopping. That connection would already be stopped by
the time the notification was needed.
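
The check described above can be sketched roughly as follows. This is an
illustration only, not the code from this commit: the struct, the enums, and
the function name should_notify() are hypothetical stand-ins for the
scheduler's real data structures. Leaving stop and demote notifications
untouched is also an assumption, since the message only mentions start and
promote.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not Pacemaker's actual types */
enum conn_plan { CONN_STAYS_UP, CONN_STOPPING };

enum notify_task { NOTIFY_START, NOTIFY_PROMOTE, NOTIFY_STOP, NOTIFY_DEMOTE };

struct node {
    const char *name;
    bool is_remote;            /* true for a Pacemaker Remote node */
    enum conn_plan connection; /* what this transition does to its connection */
};

/* Return true if a notification for the given task may be scheduled for a
 * clone instance on the given node.
 */
static bool
should_notify(const struct node *node, enum notify_task task)
{
    if (node == NULL) {
        return false; /* instance is not assigned to any node */
    }

    /* A start or promote notification runs late in the transition. If the
     * node is a remote node whose connection is being stopped, nothing will
     * be left to execute the notification, so do not schedule it.
     */
    if ((task == NOTIFY_START || task == NOTIFY_PROMOTE)
        && node->is_remote && (node->connection == CONN_STOPPING)) {
        return false;
    }

    return true;
}
```

With this sketch, a start notification for an instance on a remote node whose
connection is stopping is skipped, while the same notification on a cluster
node, or on a remote node whose connection stays up, is still scheduled.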

This is slightly modified from a patch provided by Andrew Beekhof
<andrew@beekhof.net>.

RHBZ#1652752

Details

Provenance
kgaillot authored on Nov 26 2018, 4:45 PM
Parents
rP79467dde9542: Refactor: libcrmcommon,scheduler: move convert_const_pointer() out of library
