Fix: controller: set timeout on scheduler responses
8ba3075f3f8d

Description

Previously, once the DC had successfully read the CIB and sent a calculation
request to the scheduler, the only further handling of that request was the
message handler for the scheduler's response.

This meant that if the scheduler accepted the request but was then unable to
reply (for example, because it was not getting enough CPU cycles), the
controller would never detect that anything was wrong, and the cluster would
be blocked.

Now, the controller sets a 2-minute timer after handing off the request to the
scheduler. If it doesn't get a response in that time, it exits and stays down:
if a node is elected DC but can't run the scheduler, we want to ensure it
doesn't interfere with further elections.
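
As an illustration only, the timer logic described above could be sketched in
Python as follows. This is not the controller's actual C code:
`SchedulerRequest`, `check_request`, and the returned action strings are
hypothetical names, and the clock is injectable purely so the behavior can be
exercised without waiting two minutes.

```python
import time

SCHEDULER_TIMEOUT_S = 120  # the 2-minute limit described above


class SchedulerRequest:
    """Tracks one outstanding calculation request (hypothetical helper)."""

    def __init__(self, timeout_s=SCHEDULER_TIMEOUT_S, clock=time.monotonic):
        self._clock = clock
        # The timer starts when the request is handed off to the scheduler.
        self._deadline = clock() + timeout_s
        self.answered = False

    def on_reply(self):
        # Called by the response handler; a reply makes the timer irrelevant.
        self.answered = True

    def timed_out(self):
        # True only if no reply arrived and the deadline has passed.
        return not self.answered and self._clock() >= self._deadline


def check_request(request):
    """Return the action the controller should take for this request."""
    if request.answered:
        return "continue"
    if request.timed_out():
        # Assumed label for "exit and stay down" so the node does not
        # take part in further elections.
        return "exit-and-stay-down"
    return "wait"
```

The point of the sketch is that the deadline is armed at hand-off rather than
inside the response handler, so a scheduler that never replies is still
detected.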