Low: libcrmcommon: Fix a bug in processing multiple IPC messages.

Description

This bug only occurs when processing multiple messages synchronously,
and only happens rarely in my testing.

The failure occurs when the server has sent an ACK, which the client
has processed, but the server has not yet sent the actual reply. In
this case, the client's read returns -EAGAIN. Before the previous
patch, which made crm_ipc_read return that code,
pcmk__send_ipc_request would instead see -ENOMSG, conclude that it had
reached the end of the reply, and return. The client would then shut
down the connection, and the server would send the rest of the reply
with no one listening for it.

With this patch, -EAGAIN is handled as a separate return code where we
need to loop and try reading again. We may need to retry the read
several times before it completes. However, looping revealed a second
bug.
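The retry described above can be sketched roughly as follows. This is a
minimal illustration, not the code from the patch: fake_conn, fake_read,
read_one_reply, and the max_tries bound are all hypothetical names
invented here, and a scripted array stands in for the real IPC
connection.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an IPC connection: a scripted list of read
 * results. A positive value is a reply length; a negative value is an
 * errno-style code. This is not Pacemaker's real data structure. */
struct fake_conn {
    const int *results;
    size_t n_results;
    size_t next;
};

/* Hypothetical read function: returns the next scripted result, or
 * -ENOMSG once the script is exhausted. */
static int
fake_read(struct fake_conn *conn)
{
    if (conn->next >= conn->n_results) {
        return -ENOMSG;
    }
    return conn->results[conn->next++];
}

/* Read one reply, retrying for as long as the read reports -EAGAIN.
 * max_tries bounds the loop in this sketch; if every attempt reports
 * -EAGAIN, that code is returned to the caller. */
static int
read_one_reply(struct fake_conn *conn, int max_tries)
{
    int rc = -EAGAIN;

    assert(conn != NULL);
    for (int i = 0; (i < max_tries) && (rc == -EAGAIN); i++) {
        rc = fake_read(conn);
    }
    return rc;
}
```

With a script of two -EAGAIN results followed by a reply, the function
returns the reply instead of giving up after the first read.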

This second bug is due to the fact that we were using -EAGAIN as the
signal to break out of the loop controlled by the more variable. Now
that -EAGAIN means we should try again, there's no way out of that loop
if we ever have to read a second message. Subsequent reads will also
return -EAGAIN because the server has nothing for us, but we don't know
that we aren't expecting anything else. So the code will loop
infinitely.

This is fixed by setting the more variable, which can be done by
inspecting the return value of do_dispatch_ipc_read. Once we've read
the last expected message, more will be false and the loop will break.
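As a rough sketch of that termination logic, assuming a dispatch step
that can tell the caller whether further messages are expected: the
names fake_server, fake_dispatch_read, read_all_messages, last_id, and
max_reads are hypothetical, and the way more is derived here is an
assumption for illustration, not the actual return convention of
do_dispatch_ipc_read.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scripted server: each entry is either -EAGAIN (nothing
 * ready yet) or a positive message ID. */
struct fake_server {
    const int *script;
    size_t len;
    size_t pos;
    int last_id;    /* ID of the final expected message (assumed known) */
};

/* Hypothetical dispatch step. Returns -EAGAIN when nothing is ready,
 * otherwise 0, setting *more to false once the last expected message
 * has been read. */
static int
fake_dispatch_read(struct fake_server *srv, bool *more)
{
    int item = -EAGAIN;    /* an exhausted script looks like "not ready" */

    if (srv->pos < srv->len) {
        item = srv->script[srv->pos++];
    }
    if (item == -EAGAIN) {
        return -EAGAIN;
    }
    *more = (item != srv->last_id);
    return 0;
}

/* Read until the last expected message arrives. -EAGAIN only means
 * "retry"; the loop ends because more becomes false. max_reads bounds
 * the sketch so that a missing final message yields -ETIMEDOUT rather
 * than an endless loop. Returns the number of messages read. */
static int
read_all_messages(struct fake_server *srv, int max_reads)
{
    bool more = true;
    int n_msgs = 0;

    assert(srv != NULL);
    for (int i = 0; more && (i < max_reads); i++) {
        int rc = fake_dispatch_read(srv, &more);

        if (rc == -EAGAIN) {
            continue;    /* not ready yet; try the read again */
        }
        n_msgs++;
    }
    return more ? -ETIMEDOUT : n_msgs;
}
```

If the final message never arrives, every later read reports -EAGAIN
and only the max_reads bound stops this sketch, which mirrors the
infinite loop described above when nothing ever clears more.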

Details

Provenance
clumens authored on Apr 21 2022, 1:59 PM
Parents
rPa4d43c4fef27: Refactor: libcrmcommon: Move the guts of dispatch_ipc_data out on its own.