Fix: controller: Avoid election storm due to incompatible CIB

The DC accepts a joining node even if its local CIB manager will later
reject the joining node's CIB (for example, due to schema version
incompatibility). This can cause an election storm.

do_dc_join_finalize() calls cib_t:cmds:sync_from() against the joining
node, syncing its CIB across the cluster. However, the DC's CIB manager
may fail to apply the diff, and the DC won't notice until it receives an
error code via the sync callback.

With this commit, if a joining node has the highest CIB generation seen
so far, the DC verifies that it recognizes the node's schema name before
accepting the join request. That eliminates most of the cases in which
the DC would sync the CIB and then reject it itself.
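
As a rough illustration of the check described above (not the code in
this commit), the sketch below accepts a join request unless the node
would become the sync source and its schema name is unknown. All names
here (known_schemas, join_offer, accept_join) are hypothetical, the
schema list is only an example, and the generation is reduced to a
single integer, whereas a real CIB generation is the tuple admin_epoch,
epoch, num_updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of schema names the DC's CIB manager understands */
static const char *const known_schemas[] = {
    "pacemaker-3.8", "pacemaker-3.9", "pacemaker-3.10",
};

/* Return true if the given schema name appears in known_schemas */
static bool
schema_is_known(const char *name)
{
    if (name == NULL) {
        return false;
    }
    for (size_t i = 0; i < sizeof(known_schemas) / sizeof(known_schemas[0]);
         i++) {
        if (strcmp(known_schemas[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical, simplified view of what a joining node reports */
struct join_offer {
    const char *node;    /* node name */
    int generation;      /* simplified: one integer instead of a tuple */
    const char *schema;  /* schema name of the node's CIB */
};

/* Decide whether to accept a join request.
 *
 * Only a node whose generation exceeds the best seen so far would become
 * the sync source, so only then does its schema name need to be known.
 */
static bool
accept_join(const struct join_offer *offer, int max_generation)
{
    assert(offer != NULL);

    if (offer->generation <= max_generation) {
        return true;    /* its CIB would not be synced to the cluster */
    }
    return schema_is_known(offer->schema);
}
```

The point of the ordering is that a node with an unknown schema is
harmless as long as its CIB is not the one that would be synced.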

If the CIB sync still fails in the finalize step, we add the node to a
table of nodes whose CIB syncs have failed and trigger a new election,
this time nacking that node if it sends a join request. The node remains
in the failed sync table until it leaves the cluster (which should
happen after it's nacked).
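
The lifecycle of the failed sync table can be sketched as follows. This
is a minimal stand-alone model under stated assumptions, not the
controller's implementation: the fixed-size array and the function names
(record_failed_sync, should_nack_join, node_left_cluster) are
hypothetical and chosen only to show when a node enters and leaves the
table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_FAILED 16   /* hypothetical table capacity */
#define NAME_LEN   64   /* hypothetical max node name length, incl. NUL */

/* Names of nodes whose CIB sync failed; an empty string is a free slot */
static char failed_sync_nodes[MAX_FAILED][NAME_LEN];

/* Return true if the node is in the failed sync table */
static bool
should_nack_join(const char *node)
{
    assert(node != NULL);

    for (size_t i = 0; i < MAX_FAILED; i++) {
        if ((failed_sync_nodes[i][0] != '\0')
            && (strcmp(failed_sync_nodes[i], node) == 0)) {
            return true;
        }
    }
    return false;
}

/* Record a failed sync; return false if the name is empty or too long,
 * or if the table is full
 */
static bool
record_failed_sync(const char *node)
{
    assert(node != NULL);

    if ((node[0] == '\0') || (strlen(node) >= NAME_LEN)) {
        return false;
    }
    if (should_nack_join(node)) {
        return true;    /* already recorded */
    }
    for (size_t i = 0; i < MAX_FAILED; i++) {
        if (failed_sync_nodes[i][0] == '\0') {
            snprintf(failed_sync_nodes[i], NAME_LEN, "%s", node);
            return true;
        }
    }
    return false;
}

/* Drop the node from the table once it has left the cluster */
static void
node_left_cluster(const char *node)
{
    assert(node != NULL);

    for (size_t i = 0; i < MAX_FAILED; i++) {
        if (strcmp(failed_sync_nodes[i], node) == 0) {
            failed_sync_nodes[i][0] = '\0';
        }
    }
}
```

In this model, the sync callback would call record_failed_sync() on
error before the new election is triggered, and the join request handler
would consult should_nack_join() for each request.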

Closes T455
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>