A CPG client can sometimes lock up if the local node is in the downlist

Description

In a 10-node cluster where all nodes boot and start corosync at the same
time, corosync sometimes detects a node as leaving and then rejoining the
cluster during startup.

Occasionally the downlist that gets picked contains the local node. When the
local node sends leave events for the nodes in the downlist (including
itself), it sets its own cpd state to CPD_STATE_UNJOINED and clears
cpd->group_name. As a result, it no longer sends CPG events to the CPG
client.
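
The failure mode described above can be illustrated with a small model. This
is a sketch under stated assumptions, not corosync's actual code: only
CPD_STATE_UNJOINED, cpd and group_name come from the description, while
CPD_STATE_JOINED, struct cpd_model, deliver_event(), process_downlist() and
simulate() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the per-client state. CPD_STATE_JOINED is a
 * hypothetical name; the real enum has more states. */
enum cpd_state { CPD_STATE_UNJOINED, CPD_STATE_JOINED };

struct cpd_model {
    enum cpd_state state;
    char group_name[32];
    unsigned int events_delivered;
};

/* Hypothetical: an event reaches the client only while it is joined
 * to a named group. */
bool deliver_event(struct cpd_model *cpd)
{
    assert(cpd != NULL);
    if (cpd->state == CPD_STATE_UNJOINED || cpd->group_name[0] == '\0')
        return false;
    cpd->events_delivered++;
    return true;
}

/* Hypothetical: handle leave events for every node in the downlist.
 * If the local node is listed, the local client is marked unjoined and
 * its group name is cleared, as the description states. */
void process_downlist(struct cpd_model *cpd, unsigned int local_nodeid,
                      const unsigned int *downlist, size_t count)
{
    assert(cpd != NULL);
    for (size_t i = 0; i < count; i++) {
        if (downlist[i] == local_nodeid) {
            cpd->state = CPD_STATE_UNJOINED;
            memset(cpd->group_name, 0, sizeof(cpd->group_name));
        }
    }
}

/* Returns how many events the client receives after the downlist has
 * been processed (one delivery attempt is made). */
unsigned int simulate(bool local_in_downlist)
{
    struct cpd_model cpd = { CPD_STATE_JOINED, "demo_group", 0 };
    const unsigned int local_nodeid = 1;
    unsigned int downlist[2] = { 7, 9 };

    if (local_in_downlist)
        downlist[1] = local_nodeid;

    process_downlist(&cpd, local_nodeid, downlist, 2);
    deliver_event(&cpd);
    return cpd.events_delivered;
}
```

In this model, simulate(false) delivers the event, while simulate(true)
delivers nothing: once the local node appears in its own downlist, the client
is silently cut off from later events, which matches the lockup seen by the
CPG client.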

Reviewed-by: Jan Friesse <jfriesse@redhat.com>

Details

Provenance
Tim Beale <tim.beale@alliedtelesis.co.nz> authored on Aug 18 2011, 8:57 AM
jfriesse committed on Aug 18 2011, 8:57 AM
Parents
rC370d9bcecf27: Display ring-ID consistently in debug
