rabbitmq-cluster: fix regression in rmq_stop

Description

This regression was introduced in PR#1249 (cc23c55). The stop action
was modified to use rmq_app_running in order to check the service
status, which allows for the following sequence of events:

  • service is started, unclustered
  • stop_app is called
  • cluster_join is attempted and fails
  • stop is called

Because stop_app was called, rmq_app_running returns $OCF_NOT_RUNNING
and the stop action is a no-op, so the Erlang VM keeps running.

When the start action is attempted again, a new Erlang VM is launched,
but it fails to boot because the old one is still running and is
registered under the same name (rabbit@nodename).

This adds a new function, rmq_node_alive, which does a simple eval to
test whether the Erlang VM is up, independent of the rabbit app. The
stop action now uses rmq_node_alive to check the service status, so
even if stop_app was previously called, the Erlang VM will be stopped
properly.
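
A minimal sketch of the idea, not the actual patch. The real agent takes
its exit codes and helpers from ocf-shellfuncs; here the codes are
defined locally so the snippet stands alone. The RMQ_CTL variable, the
exact eval expression ('ok.'), and the rmq_stop_sketch function are
assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: OCF exit codes are defined locally so this runs without
# sourcing ocf-shellfuncs.
: "${OCF_SUCCESS:=0}"
: "${OCF_NOT_RUNNING:=7}"

# Assumed variable holding the name of the control command.
: "${RMQ_CTL:=rabbitmqctl}"

# Succeeds if the Erlang VM answers a trivial eval, whether or not the
# rabbit application inside it is running.
rmq_node_alive() {
    if $RMQ_CTL eval 'ok.' >/dev/null 2>&1; then
        return "$OCF_SUCCESS"
    fi
    return "$OCF_NOT_RUNNING"
}

# Hypothetical stop action built on the liveness check.
rmq_stop_sketch() {
    if ! rmq_node_alive; then
        # The VM is already gone, so there is nothing left to stop.
        return "$OCF_SUCCESS"
    fi
    # The VM is up (even if stop_app ran earlier), so shut it down.
    $RMQ_CTL stop >/dev/null 2>&1
}
```

The point of the design is that the check asks the VM itself rather
than the rabbit app, so a node left in the stop_app state is still
seen as alive and is shut down.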

Resolves: RHBZ#1639826

Details

Provenance
John Eckersberg <jeckersb@redhat.com>, authored on Nov 2 2018, 1:12 PM
Parents
rR7c750babaf33: Merge pull request #1257 from jnpkrn/metadata-cleanup