BUG/MEDIUM: server: fix race on servers_list during server deletion

Each server is inserted in a global list named servers_list on
new_server(). This list is then only used to finalize servers
initialization after parsing.

On dynamic server creation, there is no issue as new_server() is under
thread isolation. However, when a server is deleted after its refcount
reached zero, srv_drop() removes it from servers_list without lock
protection. In the longterm, this can cause list corruption and crashes,
especially if multiple adjacent servers are removed in parallel.

To fix this, convert servers_list to a mt_list. This should not impact
performance as servers_list is not used during runtime outside of server
creation/deletion.

This should fix github issue #2733. Thanks to Chris Staite who first
found the issue here.

This must be backported up to 2.6.
This commit is contained in:
Amaury Denoyelle
2024-10-23 18:18:48 +02:00
parent 116178563c
commit 7a02fcaf20
4 changed files with 8 additions and 7 deletions

View File

@@ -41,7 +41,7 @@
__decl_thread(extern HA_SPINLOCK_T idle_conn_srv_lock);
extern struct idle_conns idle_conns[MAX_THREADS];
extern struct task *idle_conn_task;
extern struct list servers_list;
extern struct mt_list servers_list;
extern struct dict server_key_dict;
int srv_downtime(const struct server *s);