This shows up occasionally on ctslab runs:
    Jul 30 16:19:22 Running test Stonithd (rhel9-ctslab-1) [ 28]
    Jul 30 16:19:58 no partition details for rhel9-ctslab-3
    Jul 30 16:20:14 Warning: core file on rhel9-ctslab-3: Wed 2025-07-30 16:19:47 EDT 73686 189 189 SIGABRT present /usr/libexec/pacemaker/pacemaker-controld 526.9K
    Jul 30 16:20:14 Warning: core file on rhel9-ctslab-3: Wed 2025-07-30 16:19:58 EDT 54218 189 189 SIGABRT present /usr/libexec/pacemaker/pacemaker-controld 523.0K
    Jul 30 16:20:14 Warning: core file on rhel9-ctslab-3: Wed 2025-07-30 16:19:58 EDT 73987 0 0 SIGABRT none /usr/sbin/crm_node -
    Jul 30 16:20:14 Audit FileAudit FAILED.
    Jul 30 16:20:21 BadNews: 2025-07-30T16:19:46-0400 rhel9-ctslab-3 pacemaker-controld[54218]: error: crm_glib_handler: Forked child [73686] to record non-fatal assertion at logging.c:83 : Source ID 2307737477 was not found when attempting to remove it
    Jul 30 16:20:21 BadNews: 2025-07-30T16:19:47-0400 rhel9-ctslab-3 pacemaker-controld[54218]: crit: GLib: Source ID 2307737477 was not found when attempting to remove it
    Jul 30 16:20:23 BadNews: 2025-07-30T16:19:58-0400 rhel9-ctslab-3 pacemakerd[54211]: error: pacemaker-controld[54218] terminated with signal 6 (Aborted)
Looking at one of the coredumps, we see:
    #0 0x00007f6b6dc8b94c in __pthread_kill_implementation () from /lib64/libc.so.6
    Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-8.el9.x86_64 corosynclib-3.1.8-1.el9.x86_64 dbus-libs-1.12.20-8.el9.x86_64 glib2-2.68.4-14.el9.x86_64 glibc-2.34-100.el9.x86_64 gnutls-3.8.3-1.el9.x86_64 libcap-2.48-9.el9_2.x86_64 libffi-3.4.2-8.el9.x86_64 libgcc-11.4.1-3.el9.x86_64 libgcrypt-1.10.0-10.el9_2.x86_64 libgpg-error-1.42-5.el9.x86_64 libidn2-2.3.0-7.el9.x86_64 libqb-2.0.8-1.el9.x86_64 libtasn1-4.16.0-8.el9_1.x86_64 libunistring-0.9.10-15.el9.x86_64 libuuid-2.37.4-18.el9.x86_64 libxml2-2.9.13-5.el9_3.x86_64 libxslt-1.1.34-9.el9.x86_64 libzstd-1.5.1-2.el9.x86_64 lz4-libs-1.9.3-5.el9.x86_64 nettle-3.9.1-1.el9.x86_64 p11-kit-0.25.3-2.el9.x86_64 sssd-client-2.9.4-2.el9.x86_64 systemd-libs-252-32.el9_4.x86_64 xz-libs-5.2.5-8.el9_0.x86_64 zlib-1.2.11-40.el9.x86_64
    (gdb) bt
    #0 0x00007f6b6dc8b94c in __pthread_kill_implementation () from /lib64/libc.so.6
    #1 0x00007f6b6dc3e646 in raise () from /lib64/libc.so.6
    #2 0x00007f6b6dc287f3 in abort () from /lib64/libc.so.6
    #3 0x00007f6b6e2e1a4c in fail_assert_as (assert_condition=<optimized out>, line=<optimized out>, function=<optimized out>, file=<optimized out>)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/lib/common/results.c:189
    #4 crm_abort (file=<optimized out>, function=<optimized out>, line=<optimized out>, assert_condition=<optimized out>, do_core=<optimized out>, do_fork=<optimized out>)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/lib/common/results.c:221
    #5 0x00007f6b6e2fcdef in crm_glib_handler (log_domain=0x7f6b6e1ca071 "GLib", flags=<optimized out>, message=0x563dea542b90 "Source ID 2307737477 was not found when attempting to remove it", user_data=<optimized out>)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/lib/common/logging.c:83
    #6 0x00007f6b6e1766fa in g_logv () from /lib64/libglib-2.0.so.0
    #7 0x00007f6b6e1769e3 in g_log () from /lib64/libglib-2.0.so.0
    #8 0x00007f6b6e170145 in g_source_remove () from /lib64/libglib-2.0.so.0
    #9 0x00007f6b6e2fe986 in mainloop_timer_stop (t=0x563dea525030)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/lib/common/mainloop.c:1350
    #10 mainloop_timer_stop (t=t@entry=0x563dea525030)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/lib/common/mainloop.c:1346
    #11 0x00007f6b6e2fea41 in mainloop_timer_del (t=0x563dea525030)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/lib/common/mainloop.c:1395
    #12 0x0000563de9927933 in sleep_timer (data=<optimized out>)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/daemons/controld/controld_schedulerd.c:454
    #13 0x00007f6b6e2fe7d5 in mainloop_timer_cb (user_data=0x563dea4f6710)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/lib/common/mainloop.c:1312
    #14 0x00007f6b6e1717a1 in g_timeout_dispatch () from /lib64/libglib-2.0.so.0
    #15 0x00007f6b6e170f4f in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
    #16 0x00007f6b6e1c6268 in g_main_context_iterate.constprop () from /lib64/libglib-2.0.so.0
    #17 0x00007f6b6e1705a3 in g_main_loop_run () from /lib64/libglib-2.0.so.0
    #18 0x0000563de9903c89 in main (argc=<optimized out>, argv=0x7ffc29068998)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/daemons/controld/pacemaker-controld.c:201
Up at frame #9:
    #9 0x00007f6b6e2fe986 in mainloop_timer_stop (t=0x563dea525030)
        at /usr/src/debug/pacemaker-3.0.0-5631.821786b7f5.git.el9.x86_64/lib/common/mainloop.c:1350
    1350 g_source_remove(t->id);
    (gdb) p t
    $1 = (mainloop_timer_t *) 0x563dea525030
Perhaps we are removing the source somewhere without resetting the stored ID or freeing the timer, so that a later call tries to remove the same source again? That's just a quick hunch.
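To make that hunch concrete, here is a toy sketch in plain C of the general failure pattern: a timer that keeps a stale source ID after the source is gone will trigger a "not found" complaint on the second removal. Everything prefixed with `toy_` is hypothetical and only stands in for the GLib source registry and the mainloop timer; it is not the GLib or Pacemaker code, and it makes no claim about what `mainloop_timer_stop()` actually does or where the real bug is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a source registry (NOT the GLib API). */
#define TOY_MAX_SOURCES 8
static unsigned int toy_sources[TOY_MAX_SOURCES]; /* 0 marks a free slot */
static unsigned int toy_next_id = 1;
static unsigned int toy_warnings = 0; /* counts "source ID not found" events */

/* Register a source and return its nonzero ID (0 if the table is full). */
static unsigned int toy_source_add(void)
{
    for (size_t i = 0; i < TOY_MAX_SOURCES; i++) {
        if (toy_sources[i] == 0) {
            toy_sources[i] = toy_next_id++;
            return toy_sources[i];
        }
    }
    return 0;
}

/* Remove a source by ID; an unknown ID is counted as a warning. */
static bool toy_source_remove(unsigned int id)
{
    for (size_t i = 0; i < TOY_MAX_SOURCES; i++) {
        if (id != 0 && toy_sources[i] == id) {
            toy_sources[i] = 0;
            return true;
        }
    }
    toy_warnings++;
    return false;
}

/* Hypothetical timer that remembers the ID of its source. */
struct toy_timer {
    unsigned int id;
};

static void toy_timer_start(struct toy_timer *t)
{
    assert(t != NULL);
    t->id = toy_source_add();
}

/* Problematic variant: removes the source but leaves the stale ID behind. */
static void toy_timer_stop_stale(struct toy_timer *t)
{
    assert(t != NULL);
    if (t->id != 0) {
        toy_source_remove(t->id);
    }
}

/* Defensive variant: resets the ID so a second stop is a no-op. */
static void toy_timer_stop_clearing(struct toy_timer *t)
{
    assert(t != NULL);
    if (t->id != 0) {
        toy_source_remove(t->id);
        t->id = 0;
    }
}
```

Calling `toy_timer_stop_stale()` twice on the same timer bumps `toy_warnings` on the second call, while calling `toy_timer_stop_clearing()` twice does not. A stale ID can also arise in other ways this sketch does not model, for example when the timer struct itself has already been freed.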
I think it's a consequence of https://github.com/ClusterLabs/pacemaker/pull/3865 (luckily not in a release yet), but I haven't had time to look into it.