mirror of
http://git.haproxy.org/git/haproxy.git
synced 2026-02-11 17:32:43 +02:00
Since commit5afcb686b("MAJOR: connection: purge idle conn by last usage") in 2.9-dev4, the test on conn->toremove_list added to conn_get_idle_flag() in 2.8 by commit3a7b539b1("BUG/MEDIUM: connection: Preserve flags when a conn is removed from an idle list") becomes misleading. Indeed, now both toremove_list and idle_list are shared by a union since the presence in these lists is mutually exclusive. However, in conn_get_idle_flag() we check for the presence in the toremove_list to decide whether or not to delete the connection from the tree. This test now fails because instead it sees the presence in the idle or safe list via the union, and concludes the element must not be removed. Thus the element remains in the tree and can be found later after the connection is released, causing crashes that Tristan reported in issue #2292. The following config is sufficient to reproduce it with 2 threads: defaults mode http timeout client 5s timeout server 5s timeout connect 1s listen front bind :8001 server next 127.0.0.1:8002 frontend next bind :8002 timeout http-keep-alive 1 http-request redirect location / Sending traffic with a few concurrent connections and some short timeouts suffices to instantly crash it after ~10k reqs: $ h2load -t 4 -c 16 -n 10000 -m 1 -w 1 http://0:8001/ With Amaury we analyzed the conditions in which the function is called in order to figure a better condition for the test and concluded that ->toremove_list is never filled there so we can safely remove that part from the test and just move the flag retrieval back to what it was prior to the 2.8 patch above. Note that the patch is not reverted though, as the parts that would drop the unexpected flags removal are unchanged. This patch must NOT be backported. The code in 2.8 works correctly, it's only the change in 2.9 that makes it misbehave.