[PMTUd] add dynamic pong timeout when using crypto
problem originally reported by proxmox community, users
observed that under pressure the MTU would flap back and forth
between 2 values due to other node response timeout.
implement a dynamic timeout multiplier when using crypto that
should solve the problem in a more flexible fashion.
When a timeout hits, those new logs will show:
[knet]: [info] host: host: 1 (passive) best link: 0 (pri: 0)
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (4) for host 1 link: 0
[knet]: [info] pmtud: PMTUD link change for host: 1 link: 0 from 469 to 65429
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
[knet]: [info] pmtud: Global data MTU changed to: 65429
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (8) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (16) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (32) for host 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (64) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Increasing PMTUd response timeout multiplier to (128) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
and when the latency reduces and it is safe to be more responsive again:
[knet]: [debug] pmtud: Starting PMTUD for host: 1 link: 0
[knet]: [debug] pmtud: Decreasing PMTUd response timeout multiplier to (64) for host 1 link: 0
[knet]: [debug] pmtud: PMTUD completed for host: 1 link: 0 current link mtu: 65429
....
testing this patch on normal hosts is a bit challenging tho.
Patch was tested by hardcoding a super low timeout here:
diff --git a/libknet/threads_pmtud.c b/libknet/threads_pmtud.c
index 4f0ba0f..5e2b89b 100644
- a/libknet/threads_pmtud.c
+++ b/libknet/threads_pmtud.c
@@ -261,7 +271,8 @@ retry:
/* * crypto, under pressure, is a royal PITA */
- pong_timeout_adj_tmp = dst_link->pong_timeout_adj * 2;
+ //pong_timeout_adj_tmp = dst_link->pong_timeout_adj * dst_link->pmtud_crypto_timeout_multiplier;
+ pong_timeout_adj_tmp = 30 * dst_link->pmtud_crypto_timeout_multiplier;
} else { pong_timeout_adj_tmp = dst_link->pong_timeout_adj; }
and using a long running version of api_knet_send_crypto_test with a short PMTUd setfreq (10 sec).
Signed-off-by: Fabio M. Di Nitto <fdinitto@redhat.com>