Medium: ocf-shellfuncs: improve locking (ocf_take_lock)

Description

This change improves locking in ocf_take_lock(). It uses mkdir(1),
which fails if the directory already exists, so two instances
cannot both create the same directory (named by the lock).
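
The mkdir-based approach can be sketched as follows. This is a
minimal illustration, not the actual ocf_take_lock() code: the
function names try_lock and release_lock and the pid file inside
the lock directory are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helpers; not the real ocf_take_lock() implementation.

# Try once to take the lock. mkdir(1) fails if the directory already
# exists, so at most one caller can succeed for a given lock name.
try_lock() {
    lockdir=$1
    if mkdir "$lockdir" 2>/dev/null; then
        # Assumption: record the owner's pid so that a later caller
        # can check whether the lock has gone stale.
        echo $$ > "$lockdir/pid"
        return 0
    fi
    return 1
}

# Release the lock by removing the directory and its pid file.
release_lock() {
    rm -rf "$1"
}
```

A caller would typically loop on try_lock with a short sleep until
it succeeds, and call release_lock when done.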

The major difficulty is to prevent a race when a stale lock is
discovered. If two processes try to remove the stale lock at
about the same time, the one which runs slightly later can remove
the lock which was just created by the one which ran slightly
earlier. The probability of this race is significantly reduced by
testing for a stale lock twice with a random sleep in between.
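
The double check described above might look roughly like this. It
is a sketch under stated assumptions: the lock directory is assumed
to hold a pid file, staleness is assumed to mean "the recorded pid
no longer exists", and the names lock_is_stale and
remove_stale_lock are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the real staleness test may differ.

# Succeeds if the lock looks stale, i.e. the pid recorded in the
# lock directory does not belong to a running process.
# Caveat: kill -0 also fails for processes owned by another user
# when the caller is not root.
lock_is_stale() {
    pid=$(cat "$1/pid" 2>/dev/null) || return 1
    [ -n "$pid" ] || return 1
    ! kill -0 "$pid" 2>/dev/null
}

# Remove the lock only if it is stale on two checks separated by a
# random sleep. A competitor that removed the stale lock and took a
# fresh one during the sleep makes the second check fail.
remove_stale_lock() {
    lockdir=$1
    lock_is_stale "$lockdir" || return 1
    # Sleep 0, 1 or 2 seconds, seeded with the process pid.
    sleep "$(awk -v seed=$$ 'BEGIN { srand(seed); print int(rand() * 3) }')"
    lock_is_stale "$lockdir" || return 1
    rm -rf "$lockdir"
}
```

As the description notes, a window remains between the second check
and the removal, so this narrows the race rather than closing it.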

Though this change does not exclude a race entirely, it makes it
extremely improbable. In addition, stale locks result only from
abnormal circumstances and occur seldom.

The function providing random numbers has been modified to use
either /dev/urandom or awk (with the process pid as the seed).
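
A function with this fallback behaviour could be written along the
following lines. The name random_number and the 0..65535 range are
assumptions for the example, not the names or ranges used in
ocf-shellfuncs.

```shell
#!/bin/sh
# Hypothetical helper; prints one pseudo-random integer in 0..65535.
random_number() {
    if [ -r /dev/urandom ]; then
        # Read two bytes and print them as one unsigned decimal
        # number, stripping the padding that od adds.
        od -An -N2 -tu2 /dev/urandom | tr -d ' \t'
    else
        # Fallback: awk's generator seeded with the process pid, so
        # concurrent processes get different sequences.
        awk -v seed=$$ 'BEGIN { srand(seed); print int(rand() * 65536) }'
    fi
}
```

Seeding with the pid matters in the fallback path: awk implementations
commonly seed from the time of day, which concurrent callers share.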

It was thoroughly tested both with and without stale lock
simulation, by running 64 instances of processes trying to get
the lock on a workstation with 4 CPUs.
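
A concurrency test in this spirit can be sketched as below. It is
not the author's test setup: stress_lock is a hypothetical name, the
workers use a bare mkdir lock rather than ocf_take_lock(), and no
stale lock is simulated. Each worker increments a shared counter
while holding the lock, so a lost update would show up as a final
count lower than the number of workers.

```shell
#!/bin/sh
# Hypothetical stress harness; prints the final counter value.
stress_lock() {
    n=$1
    dir=$(mktemp -d) || return 1
    echo 0 > "$dir/counter"
    i=0
    while [ "$i" -lt "$n" ]; do
        (
            # Spin until mkdir succeeds, i.e. this worker owns the lock.
            until mkdir "$dir/lock" 2>/dev/null; do :; done
            # Read-modify-write that is only safe under the lock.
            c=$(cat "$dir/counter")
            echo $((c + 1)) > "$dir/counter"
            rmdir "$dir/lock"
        ) &
        i=$((i + 1))
    done
    # Wait for all background workers before reading the result.
    wait
    cat "$dir/counter"
    rm -rf "$dir"
}
```

Running `stress_lock 64` mirrors the instance count mentioned above;
the printed value should equal the argument if the lock holds.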

Details

Provenance
Dejan Muhamedagic <dejan@hello-penguin.com> Authored on Jun 26 2017, 9:56 AM
Parents
rRf391fb65d8ee: Merge pull request #859 from vaLski/mysql_resource_fix
Branches
Unknown
Tags
Unknown
