Low: libcrmcommon: Retry sending in crm_ipcs_flush_events.
1f50992fa101 (Unpublished)

Not On Permanent Ref: This commit is not an ancestor of any permanent ref.
This commit has been deleted in the repository: it is no longer reachable from any branch, tag, or ref.

Description

If qb_ipcs_event_sendv returns -EAGAIN, we need to retry sending
immediately without letting that return value propagate up. If we don't
do this, what happens is:

  • qb_ipcs_event_sendv returns -EAGAIN, which ends the loop in crm_ipcs_flush_events, and that value is returned.
  • pcmk__ipc_send_iov then gets that error code when it handles events and returns EAGAIN.
  • The caller (likely pcmk__ipc_send_xml, but could be other functions) gets EAGAIN and proceeds to send the next event, thinking that the event it just tried to send succeeded but was part of a split-up IPC message.

The end result is the message doesn't actually get sent and we proceed
to sending the next one. The message may still be sent later, but the
other end of the IPC connection will have moved on and won't know what
to do with it when it arrives.
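
The failure mode above can be illustrated with a toy model. This is only a sketch and not the Pacemaker or libqb code: toy_link, toy_send, and send_all are hypothetical names, and the "caller" simply ignores -EAGAIN rather than interpreting it as part of a multipart message. It shows how a chunk goes missing when the caller moves on, and how retrying in place keeps the chunks in order.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define N_CHUNKS 3

/* Hypothetical stand-in for one end of an IPC connection. */
struct toy_link {
    int busy_on;              /* chunk at which the link reports -EAGAIN once */
    int delivered[N_CHUNKS];  /* chunk ids in the order the peer received them */
    size_t n_delivered;
};

/* Hypothetical transport: returns 0 on success or a negative errno. */
static int
toy_send(struct toy_link *link, int chunk)
{
    if (chunk == link->busy_on) {
        link->busy_on = -1;   /* the link is only busy for one attempt */
        return -EAGAIN;
    }
    link->delivered[link->n_delivered++] = chunk;
    return 0;
}

/* Send every chunk. With retry == false, -EAGAIN is not acted on and the
 * loop moves to the next chunk, which is the behavior the commit fixes.
 * With retry == true, the same chunk is resent until it goes through
 * (unbounded here, as in the commit message's FIXME).
 */
static size_t
send_all(struct toy_link *link, bool retry)
{
    for (int chunk = 0; chunk < N_CHUNKS; chunk++) {
        int rc = toy_send(link, chunk);

        while (retry && (rc == -EAGAIN)) {
            rc = toy_send(link, chunk);
        }
    }
    return link->n_delivered;
}
```

With busy_on set to 1, calling send_all(&link, false) leaves the peer with chunks 0 and 2 only, while send_all(&link, true) on a fresh link delivers 0, 1, 2 in order.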

FIXME: This should probably have some timeout or maximum number of
retries, though I'm not sure what we'd do once that expired.
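
One possible shape for the bounded retry mentioned in the FIXME is sketched below. This is not what the commit does: send_fn is a hypothetical stand-in for a function such as qb_ipcs_event_sendv (bytes sent on success, negative errno on failure), MAX_SEND_RETRIES is an assumed limit, and fake_conn and fake_sendv exist only to exercise the loop. What to do once the limit is reached is left open, as in the commit message; the sketch just returns -EAGAIN to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical sender signature: bytes sent on success, negative errno
 * on failure. Assumed for illustration only.
 */
typedef ssize_t (*send_fn)(void *conn, const struct iovec *iov,
                           size_t iov_len);

#define MAX_SEND_RETRIES 5  /* assumed limit, not taken from the commit */

/* Call sender until it returns something other than -EAGAIN or the attempt
 * limit is reached. The number of attempts made is stored in *attempts
 * when that pointer is non-NULL.
 */
static ssize_t
send_with_retry(send_fn sender, void *conn, const struct iovec *iov,
                size_t iov_len, unsigned int *attempts)
{
    ssize_t rc = -EAGAIN;
    unsigned int n = 0;

    assert(sender != NULL);

    while (n < MAX_SEND_RETRIES) {
        n++;
        rc = sender(conn, iov, iov_len);
        if (rc != -EAGAIN) {
            break;          /* success, or an error that retrying won't fix */
        }
    }
    if (attempts != NULL) {
        *attempts = n;
    }
    return rc;
}

/* Fake connection that reports -EAGAIN for the first busy_for attempts. */
struct fake_conn {
    int busy_for;
};

static ssize_t
fake_sendv(void *conn, const struct iovec *iov, size_t iov_len)
{
    struct fake_conn *c = conn;
    ssize_t total = 0;

    if (c->busy_for > 0) {
        c->busy_for--;
        return -EAGAIN;
    }
    for (size_t i = 0; i < iov_len; i++) {
        total += (ssize_t) iov[i].iov_len;
    }
    return total;
}
```

A real implementation would likely also want a short delay or a poll on the connection between attempts, so that the loop does not spin while the peer's buffer is full.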

Details

Provenance
clumens authored on Thu, Apr 17, 2:01 PM
Parents
rP05b1e2fb30e2: Feature: daemons: Convert schedulerd to support multipart IPC messages.
Branches
Unknown
Tags
Unknown