Commit Graph

20540 Commits

Willy Tarreau
ecffaa6d5a MINOR: net_helper: extend the ip.fp output with an option presence mask
Emeric suggested that it's sometimes convenient to instantly know if a
client has advertised support for window scaling or timestamps for
example. While the info is present in the TCP options output, it's hard
to extract since it respects the options order.

So here we're extending the 56-bit fingerprint with 8 extra bits that
indicate the presence of options 2..8, and any option above 9 for the
last bit. In practice this is sufficient since higher options are not
commonly used. Also TCP option 5 is normally not sent on the SYN (SACK,
only SACK_perm is sent), and echo options 6 & 7 are no longer used
(replaced with timestamps). These fields might be repurposed in the
future if some more meaningful options are to be mapped (e.g. MPTCP,
TFO cookie, auth).
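For illustration only, a minimal sketch (hypothetical names, not the actual net_helper code; the exact cutoff for the last bit is an assumption) of how such a presence mask can be built from the option kinds seen on a SYN:

  #include <stdint.h>

  /* Sketch: bits 0..6 map to TCP option kinds 2..8, bit 7 covers any
   * higher option kind.
   */
  static uint8_t tcp_opt_presence_mask(const uint8_t *kinds, int nb)
  {
      uint8_t mask = 0;

      for (int i = 0; i < nb; i++) {
          uint8_t k = kinds[i];

          if (k >= 2 && k <= 8)
              mask |= 1 << (k - 2);
          else if (k > 8)
              mask |= 1 << 7;  /* "any higher option" bit */
      }
      return mask;
  }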
2026-02-09 09:18:04 +01:00
Amaury Denoyelle
a1db464c3e BUG/MINOR: proxy: fix null dereference in "add backend" handler
When a backend is created at runtime, the new proxy instance is inserted
at the end of proxies_list. This operation is buggy if the list is
empty: the code causes a null dereference which will lead to a crash.
The compiler reports it with the following warning:

  CC      src/proxy.o
src/proxy.c: In function 'cli_parse_add_backend':
src/proxy.c:4933:36: warning: null pointer dereference [-Wnull-dereference]
 4933 |                 proxies_list->next = px;
      |                 ~~~~~~~~~~~~~~~~~~~^~~~

This patch fixes this issue. Note that in reality it cannot occur at
this moment as proxies_list cannot be empty (haproxy requires at least
one frontend to start, and the list also always contains internal
proxies).
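
For illustration, a minimal sketch (hypothetical, not the actual proxy.c code) of a tail insertion that handles the empty-list case instead of blindly dereferencing the head pointer:

  #include <stddef.h>

  struct proxy {
      struct proxy *next;
  };

  /* Sketch: append <px> at the end of a singly linked list. */
  static void proxy_append(struct proxy **head, struct proxy *px)
  {
      struct proxy *last = *head;

      px->next = NULL;
      if (!last) {              /* empty list: px becomes the head */
          *head = px;
          return;
      }
      while (last->next)
          last = last->next;
      last->next = px;
  }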

No need to backport.
2026-02-06 21:35:12 +01:00
Amaury Denoyelle
5dff6e439d BUG/MINOR: proxy: fix clang build error on "add backend" handler
This patch fixes the following compilation error:
  src/proxy.c:4954:12: error: format string is not a string literal
        (potentially insecure) [-Werror,-Wformat-security]
   4954 |         ha_notice(msg);
        |                   ^~~
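
The conventional fix for this class of warning, shown here as a sketch rather than the exact change from the patch, is to pass the message through an explicit format string:

  /* Passing a variable as the format string trips -Wformat-security: */
  ha_notice(msg);        /* rejected with -Werror,-Wformat-security */

  /* Using an explicit "%s" format is the usual fix: */
  ha_notice("%s", msg);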

No need to backport.
2026-02-06 21:17:18 +01:00
Amaury Denoyelle
5753c14e84 MINOR: proxy: assign dynamic proxy ID
Implement proxy ID generation for dynamic backends. This is performed
through the already existing function proxy_get_next_id().

As an optimization, the lookup is performed starting from a global
variable <dynpx_next_id>. It is initialized to the greatest ID assigned
after parsing, and updated each time a backend instance is created. When
backend deletion is implemented, it could be lowered to the newly
available slot.
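
Schematically (a sketch; apart from proxy_get_next_id(), which the commit names, the identifiers and signature below are assumptions):

  /* Sketch: remember where the last lookup ended so the next dynamic
   * backend does not rescan IDs from the beginning.
   */
  static unsigned int dynpx_next_id;  /* greatest ID assigned after parsing */

  static unsigned int assign_dynamic_backend_id(void)
  {
      /* proxy_get_next_id() is assumed here to return the first free
       * ID at or after the given hint.
       */
      unsigned int id = proxy_get_next_id(dynpx_next_id);

      dynpx_next_id = id + 1;  /* start the next lookup after it */
      return id;
  }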
2026-02-06 17:28:27 +01:00
Amaury Denoyelle
3115eb82a6 MEDIUM: proxy: implement dynamic backend creation
Implement the required operations for the "add backend" handler. This
requires a new proxy allocation, a settings copy from the specified
default instance, and proxy config finalization. All handlers registered
via REGISTER_POST_PROXY_CHECK() are also called on the newly created
instance.

If no error was encountered, the newly created proxy is finally
attached to the proxies list.
2026-02-06 17:28:27 +01:00
Amaury Denoyelle
07195a1af4 MINOR: proxy: check default proxy compatibility on "add backend"
This commit completes the "add backend" handler with some checks performed
on the specified default proxy instance. These are additional checks,
outside of the already existing inheritance rules, specific to dynamic
backends.

For now, a default proxy is considered incompatible if it is not in
TCP or HTTP mode. Also, a default proxy is rejected if it references HTTP
errors. This limitation may be lifted in the future, when HTTP errors
are partially reworked.
2026-02-06 17:28:26 +01:00
Amaury Denoyelle
a603811aac MINOR: proxy: parse guid on dynamic backend creation
Define an extra optional GUID argument for the "add backend" command. This
can be useful as it is not possible to define a GUID via a default proxy
instance.
2026-02-06 17:28:04 +01:00
Amaury Denoyelle
e152913327 MINOR: proxy: parse mode on dynamic backend creation
Add an optional "mode" argument to the "add backend" CLI command. This
argument allows specifying whether the backend is in TCP or HTTP mode.

By default, it is mandatory, unless the inherited default proxy already
explicitly specifies the mode. To differentiate whether TCP mode is implicit
or explicit, a new proxy flag PR_FL_DEF_EXPLICIT_MODE is defined. It is
set for every defaults instance which explicitly defines its mode.
2026-02-06 17:27:50 +01:00
Amaury Denoyelle
7ac5088c50 MINOR: proxy: define "add backend" handler
Define a basic CLI handler for "add backend".

For now, this handler only parses the name argument and
returns an error if a duplicate already exists. It runs under thread
isolation, to guarantee thread safety during the proxy creation.

This feature is considered in development. The CLI command requires
experimental-mode to be set.
2026-02-06 17:26:55 +01:00
Amaury Denoyelle
817003aa31 MINOR: backend: add function to check support for dynamic servers
Move the backend compatibility checks performed during 'add server' into a
dedicated function, be_supports_dynamic_srv(). This should simplify the
addition of future restrictions.

This function will be reused when implementing backend creation at
runtime.
2026-02-06 14:35:19 +01:00
Amaury Denoyelle
dc6cf224dd MINOR: proxy: refactor mode parsing
Define a new utility function str_to_proxy_mode() which converts a
string into the corresponding proxy mode if possible. This new
function is used for the parsing of the "mode" proxy configuration keyword.

This function will be reused for the dynamic backend implementation, in order
to parse a similar "mode" argument via a CLI handler.
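
A minimal sketch of what such a helper can look like (the actual signature and enum in proxy.c may differ):

  #include <string.h>

  /* Subset redefined here for illustration only. */
  enum pr_mode_sketch { PR_MODE_TCP_SK, PR_MODE_HTTP_SK };

  /* Sketch: map a mode string to the proxy mode, or fail. */
  static int str_to_proxy_mode_sketch(const char *str, enum pr_mode_sketch *mode)
  {
      if (strcmp(str, "tcp") == 0)
          *mode = PR_MODE_TCP_SK;
      else if (strcmp(str, "http") == 0)
          *mode = PR_MODE_HTTP_SK;
      else
          return 0;  /* unknown mode */
      return 1;
  }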
2026-02-06 14:35:18 +01:00
Amaury Denoyelle
87ea407cce MINOR: proxy: refactor proxy inheritance of a defaults section
If a proxy references a defaults instance, some checks must be
performed to ensure that inheritance will be compatible. The refcount of the
defaults instance may also be incremented if some settings cannot be
copied. This operation is performed when parsing a new proxy or defaults
section which references a defaults section, either implicitly or explicitly.

This patch extracts this code into a dedicated function named
proxy_ref_defaults(). This in turn may call defaults_px_ref()
(previously called proxy_ref_defaults()) to increment its refcount.

The objective of this patch is to be able to reuse defaults inheritance
validation for dynamic backends created at runtime, outside of the
parsing code.
2026-02-06 14:35:18 +01:00
Amaury Denoyelle
a8bc83bea5 MINOR: cfgparse: move proxy post-init in a dedicated function
A lot of proxy initialization code is delayed to the post-parsing stage,
as it depends on the configuration being fully parsed. This is performed via a
loop on proxies_list.

Extract this code in a dedicated function proxy_finalize(). This patch
will be useful for dynamic backends creation.

Note that for the moment the code has been extracted as-is. With each
new feature, some init code was added there. This has become a giant
loop with no real ordering. A future patch may provide some cleanup in
order to reorganize this.
2026-02-06 14:35:18 +01:00
Amaury Denoyelle
2c8ad11b73 MINOR: cfgparse: validate defaults proxies separately
Default proxies validation occurs during post-parsing. The objective is
to report any tcp/http-rules which could not behave as expected.

Previously, this was performed while looping over the standard proxies list,
whenever such a proxy referenced a defaults instance. This was enough as
only referenced named defaults were kept after parsing. However, this is
not the case anymore in the context of dynamic backend creation at
runtime.

As such, this patch now performs validation on every named defaults section
outside of the standard proxies list loop. This should not cause any
behavior difference, as defaults are validated without using the proxies
which rely on them.

Along with this change, the PR_FL_READY proxy flag is now removed. It
was only really needed for defaults, to avoid validating the same instance
multiple times. With the validation of defaults in their own loop, it is
now redundant.
2026-02-06 14:35:18 +01:00
Egor Shestakov
2a07dc9c24 BUG/MINOR: startup: handle a possible strdup() failure
Fix unhandled strdup() failure when initializing global.log_tag.

The bug was introduced with the UAF fix for the global progname pointer in
351ae5dbe. So it must be backported as far as 3.1.
2026-02-06 10:50:31 +01:00
Egor Shestakov
9dd7cf769e BUG/MINOR: startup: fix allocation error message of progname string
Initially, when init_early was introduced, the progname string was a local
variable used for temporary storage of log_tag. Now it's global and
sufficiently detached from log_tag. Thus, in the past it made sense to
report that the log_tag allocation failed, but not anymore.

Must be backported since the progname string became global, that is
v3.1-dev9-96-g49772c55e.
2026-02-06 10:50:31 +01:00
Olivier Houchard
bf7a2808fc BUG/MEDIUM: threads: Differ checking the max threads per group number
Defer checking the max threads per group number until we're done
parsing the configuration file, as it may be set after a "thread-groups"
directive. Otherwise the default value of 64 will be used, even if there
is a max-threads-per-group directive.

This should be backported to 3.3.
2026-02-06 03:01:50 +01:00
Olivier Houchard
9766211cf0 BUG/MINOR: threads: Initialize maxthrpertgroup earlier.
Give global.maxthrpertgroup its default value at global creation,
instead of later when we're trying to detect the thread count.
It is used when verifying the configuration file validity, and if it was
not set in the config file, in a few corner cases, the value of 0 would
be used, which would then reject perfectly fine configuration files.

This should be backported to 3.3.
2026-02-06 03:01:36 +01:00
Aperence
143f5a5c0d BUG/MINOR: config: Fix setting of alt_proto
This patch fixes the bug presented in issue #3254
(https://github.com/haproxy/haproxy/issues/3254), which
occurred on FreeBSD when using a stream socket in a
nameserver section. This bug occurred due to an incorrect
reset of alt_proto for a stream socket when the default
socket is created as a datagram socket. This patch fixes
this bug by doing a late assignment to alt_proto when
a datagram socket is requested, leaving only the modification
of alt_proto done by mptcp. Documentation has also been
added to clarify the use of the alt_proto variable.
2026-02-04 14:54:20 +01:00
Willy Tarreau
b6bdb2553b MEDIUM: backend: make "balance random" consider req rate when loads are equal
As reported by Damien Claisse and Cédric Paillet, the "random" LB
algorithm can become particularly unfair with large numbers of servers
having few connections. It's indeed fairly common to see many servers
with zero connection in a thousand-server large farm, and in this case
the P2C algo consisting in checking the servers' loads doesn't help at
all and is basically similar to random(1). In this case, we only rely
on the distribution of server IDs in the random space to pick the best
server, but it's possible to observe huge discrepancies.

An attempt to model the problem clearly shows that with 1600 servers
with weight 10, for 1 million requests, the lowest loaded ones will
take 300 req while the most loaded ones will get 780, with most of
the values between 520 and 700.

In addition, only the lowest 28 bits of server IDs are used for
the key calculation, which means that node keys are more deterministic.
Setting random keys in the lowest 28 bits only better packs values
with min around 530 and max around 710, with values mostly between
550 and 680.

This can only be compensated by increasing weights and draws without
being a perfect fix either. At 4 draws, the min is around 560 and the
max around 670, with most values between 590 and 650.

This patch takes another approach to this problem: when servers are
tied regarding their loads, instead of arbitrarily taking the second one,
we now compare their current request rates, which is updated all the
time and smoothed over one second, and we pick the server with the
lowest request rate. Now with 2 draws, the curve is mostly flat, with
the min at 580 and the max at 628, and almost all values between 611
and 625. And 4 draws exclusively gives values from 614 to 624.
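
A condensed sketch of that draw logic (hypothetical types; the real implementation lives in the LB code and uses the servers' smoothed request-rate counters):

  /* Sketch: power-of-two-choices with a request-rate tie break. */
  struct srv_sketch {
      unsigned int cur_load;  /* e.g. active connections */
      unsigned int req_rate;  /* requests/s smoothed over one second */
  };

  static struct srv_sketch *pick_best(struct srv_sketch *a, struct srv_sketch *b)
  {
      if (a->cur_load != b->cur_load)
          return a->cur_load < b->cur_load ? a : b;
      /* loads are tied: prefer the lower smoothed request rate */
      return a->req_rate <= b->req_rate ? a : b;
  }

With more draws, the winner of each comparison is simply kept and compared against the next randomly picked server.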

Other points will need to be addressed separately (bits of server ID,
maybe refine the hash algorithm), but these ones would affect how
caches are selected, and cannot be changed without an extra option.
For random however we can perform a change without impacting anyone.

This should be backported, probably only to 3.3 since it's where the
"random" algo became the default.
2026-02-04 14:54:16 +01:00
Willy Tarreau
cddeea58cd BUG/MINOR: cpu-topo: count cores not cpus to distinguish core types
The per-cpu capacity of a cluster was taken into account since 3.2 with
commit 6c88e27cf4 ("MEDIUM: cpu-topo: change "performance" to consider
per-core capacity").

In cpu_policy_performance() and cpu_policy_efficiency(), we're trying
to figure out which cores have more capacity than others by comparing their
cluster's average capacity. However, contrary to what the comment says,
we're not averaging per core but per cpu, which makes a difference for
CPUs mixing SMT with non-SMT cores on the same SoC, such as intel's 14th
gen CPUs. Indeed, on a machine where cpufreq is not enabled, all CPUs
can be reported with a capacity of 1024, resulting in a big cluster of
16*1024, and 4 small clusters of 4*1024 each, giving an average of 1024
per CPU, making it impossible to distinguish one from the other. In this
situation, both "cpu-policy performance" and "cpu-policy efficiency"
enable all cores.

But this is wrong: what needs to be taken into account in the division is
the number of *cores*, not *cpus*, which allows distinguishing big from
little clusters. This was not noticeable on the ARM machines the commit
above aimed at fixing because there, the number of CPUs equals the number
of cores. And on an x86 machine with cpu_freq enabled, the frequencies
continue to help spotting which ones are big/little.

By using nb_cores instead of nb_cpus in the comparison and in the avg_capa
compare function, it properly works again on x86 without affecting other
machines with 1 CPU per core.
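
Schematically, the cluster comparison now divides by the core count rather than the CPU count (a sketch, not the actual cpu-topo code):

  /* Sketch: average capacity used to rank clusters. Dividing by nb_cpus
   * hides big/little differences on SMT + non-SMT mixes (16*1024/16 cpus
   * equals 4*1024/4 cpus); dividing by nb_cores keeps them apart.
   */
  struct cluster_sketch {
      unsigned int tot_capa;  /* sum of per-cpu capacities */
      unsigned int nb_cores;
      unsigned int nb_cpus;   /* what was used before */
  };

  static unsigned int cluster_avg_capa(const struct cluster_sketch *cl)
  {
      return cl->tot_capa / cl->nb_cores;  /* was: / cl->nb_cpus */
  }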

This can be backported to 3.2.
2026-02-04 08:49:18 +01:00
Olivier Houchard
3674afe8a0 BUG/MEDIUM: threads: Atomically set TH_FL_SLEEPING and clr FL_NOTIFIED
When we're about to enter polling, atomically set TH_FL_SLEEPING and
remove TH_FL_NOTIFIED, instead of doing it in sequence. Otherwise,
another thread may see that both the TH_FL_SLEEPING and the
TH_FL_NOTIFIED bits are set, and not wake up the thread when it should
be doing that.
This prevents a bug where a thread is sleeping while it should be
handling a new connection, which can happen if there are very few
incoming connections. This is easy to reproduce when using only two
threads and injecting with only one connection: the connection may then
never be handled.
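
One way to perform such a combined update atomically is a compare-and-swap loop, sketched here with C11 atomics (HAProxy uses its own HA_ATOMIC_* wrappers, and the actual patch may do it differently):

  #include <stdatomic.h>

  #define TH_FL_SLEEPING  0x01U  /* values are illustrative */
  #define TH_FL_NOTIFIED  0x02U

  /* Set SLEEPING and clear NOTIFIED in one atomic operation, so another
   * thread can never observe both bits set at the same time.
   */
  static void th_enter_polling(_Atomic unsigned int *flags)
  {
      unsigned int old = atomic_load(flags), newf;

      do {
          newf = (old | TH_FL_SLEEPING) & ~TH_FL_NOTIFIED;
      } while (!atomic_compare_exchange_weak(flags, &old, newf));
  }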

This should be backported up to 2.8.
2026-02-04 07:13:06 +01:00
Hyeonggeun Oh
2527d9dcd1 MEDIUM: tcpcheck: add post-80 option for mysql-check to support MySQL 8.x
This patch adds a new 'post-80' option that sets the
CLIENT_PLUGIN_AUTH (0x00080000) capability flag
and explicitly specifies mysql_native_password as
the authentication plugin in the handshake response.

This patch also adds documentation for the post-80 option,
which supports MySQL 8.x and its new default auth
plugin, caching_sha2_password.

MySQL 8.0 changed the default authentication plugin from
mysql_native_password to caching_sha2_password.
The current mysql-check implementation only supports pre-41
and post-41 client auth protocols, which lack the CLIENT_PLUGIN_AUTH
capability flag. When HAProxy sends a post-41 authentication
packet to a MySQL 8.x server, the server responds with error 1251:
"Client does not support authentication protocol requested by server".

The new client capabilities for post-80 are:
- CLIENT_PROTOCOL_41 (0x00000200)
- CLIENT_SECURE_CONNECTION (0x00008000)
- CLIENT_PLUGIN_AUTH (0x00080000)

Usage example:
backend mysql_servers
	option mysql-check user haproxy post-80
	server db1 192.168.1.10:3306 check

The health check user must be created with mysql_native_password:
CREATE USER 'haproxy'@'%' IDENTIFIED WITH mysql_native_password BY '';

This addresses https://github.com/haproxy/haproxy/issues/2934.
2026-02-03 07:36:53 +01:00
Olivier Houchard
f26562bcb7 MINOR: quic: Fix build with USE_QUIC_OPENSSL_COMPAT
Commit fa094d0b61 changed the msg callback
args, but forgot to fix quic_tls_msg_callback() accordingly, so do that,
and remove the unused struct connection parameter.
2026-02-03 04:05:34 +01:00
Christopher Faulet
abc1947e19 BUG/MEDIUM: applet: Fix test on shut flags for legacy applets
A regression was introduced in the commit 0ea601127 ("BUG/MAJOR: applet: Don't
call I/O handler if the applet was shut"). The test on shut flags for legacy
applets is inverted.

It should be harmless on 3.4 and 3.3 because all applets were converted. But
this fix is mandatory for 3.2 and older.

The patch must be backported as far as 3.0 with the commit above.
2026-01-30 09:55:18 +01:00
William Lallemand
23e8ed6ea6 MEDIUM: ssl: porting to X509_STORE_get1_objects() for OpenSSL 4.0
OpenSSL 4.0 is deprecating X509_STORE_get0_objects().

Every occurrence of X509_STORE_get0_objects() was first replaced by
X509_STORE_get1_objects().
This changes the ref count of the STACK_OF(X509_OBJECT) everywhere, and
requires it to be freed with sk_X509_OBJECT_pop_free(objs, X509_OBJECT_free)
each time.

X509_STORE_get1_objects() is not available in AWS-LC, OpenSSL < 3.2,
LibreSSL and WolfSSL, so we still need to be compatible with get0.
To achieve this, two macros were added, X509_STORE_getX_objects() and
sk_X509_OBJECT_popX_free(); these macros will use either the get0 or the
get1 function depending on availability. In the case of get0,
sk_X509_OBJECT_popX_free() will just do nothing instead of trying to
free.
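
A sketch of what such compatibility macros can look like (the feature-test guard below is hypothetical; the real condition checks the various libraries and versions):

  /* Sketch: unify get0/get1 behind an X suffix. With get1 the returned
   * stack must be freed; with get0 it is owned by the store.
   */
  #ifdef HAVE_X509_STORE_GET1_OBJECTS   /* hypothetical feature test */
  #define X509_STORE_getX_objects(st)      X509_STORE_get1_objects(st)
  #define sk_X509_OBJECT_popX_free(s, f)   sk_X509_OBJECT_pop_free((s), (f))
  #else
  #define X509_STORE_getX_objects(st)      X509_STORE_get0_objects(st)
  #define sk_X509_OBJECT_popX_free(s, f)   do { } while (0)
  #endif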

Don't backport that unless really needed if we want to be compatible
with OpenSSL 4.0. It changes all the refcounts.
2026-01-29 17:08:41 +01:00
Amaury Denoyelle
fa094d0b61 MEDIUM: ssl: remove connection from msg callback args
SSL msg callbacks are used for notification about sent/received SSL
messages. Such callbacks are registered via
ssl_sock_register_msg_callback().

Prior to this patch, the connection was passed as the first argument of these
callbacks. However, most of them do not use it. Worse, this may lead to
confusion as the connection can be NULL in a QUIC context.

This patch cleans this by removing connection argument. As an
alternative, connection can be retrieved in callbacks if needed using
ssl_sock_get_conn() but the code must be ready to deal with potential
NULL instances. As an example, heartbeat parsing callback has been
adjusted in this manner.
2026-01-29 11:14:09 +01:00
Amaury Denoyelle
869a997a68 BUG/MEDIUM: ssl: fix msg callbacks on QUIC connections
With the QUIC backend implementation, SSL code has been adjusted in several
places when accessing the connection instance. Indeed, with QUIC usage, the SSL
context is tied to the quic_conn, and code may be executed prior to/after
connection instantiation. For example, on the frontend side, the connection is
only created after QUIC handshake completion.

The following patch tried to fix unsafe accesses to connection. In
particular, msg callbacks are not called anymore if connection is NULL.

  fab7da0fd0
  BUG/MEDIUM: quic-be/ssl_sock: TLS callback called without connection

However, most msg callbacks do not need to use the connection instance.
The only occurrence where it is accessed is for heartbeat message
parsing, which is the only crash case that was solved. The above fix is too
restrictive as it completely prevents execution of these callbacks when
connection is unset. This breaks several features with QUIC, such as SSL
key logging or samples based on ClientHello capture.

The current patch reverts the above one. Thus, this restores the invocation
of msg callbacks for QUIC during the whole low-level connection
lifetime. This requires a small adjustment in the heartbeat parsing callback
to prevent access to a NULL connection.

The issue on ClientHello capture was mentioned in github issue #2495.

This must be backported up to 3.3.
2026-01-29 11:14:09 +01:00
Willy Tarreau
48d9c90ff2 BUG/MINOR: config/ssl: fix spelling of "expose-experimental-directives"
The help message for "ktls" mentions "expose-experimental-directive"
without the final 's', which is particularly annoying when copy-pasting
the directive from the error message directly into the config.

This should be backported to 3.3.
2026-01-29 11:07:55 +01:00
Willy Tarreau
35d63cc3c7 MEDIUM: h1: strictly verify quoting in chunk extensions
As reported by Ben Kallus in the following thread:

   https://www.mail-archive.com/haproxy@formilux.org/msg46471.html

there exist some agents which mistakenly accept CRLF inside quoted
chunk extensions, making it possible to fool them by injecting one
extra chunk they won't see for example, or making them miss the end
of the body depending on how it's done. Haproxy, like most other
agents nowadays, doesn't care at all about chunk extensions and just
drops them, in agreement with the spec.

However, as discussed, since chunk extensions are basically never used
except for attacks, and since the cost of just matching quote pairs and
checking backslash escape consistency remains relatively
low, it can make sense to add such a check to abort the message parsing
when this situation is encountered. Note that it has to be done at two
places, because there is a fast path and a slow path for chunk parsing.

Also note that it *will* cause transfers using improperly formatted chunk
extensions to fail, but since these are really not used, and that the
likelihood of them being used but improperly quoted certainly is much
lower than the risk of crossing a broken parser on the client's request
path or on the server's response path, we consider the risk as
acceptable. The test is not subject to the configurable parser exceptions
and it's very unlikely that it will ever be needed.
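
For illustration, a standalone sketch of this kind of strict quoting check (not the actual h1 parser, which works on the raw buffer in both its fast and slow paths):

  /* Sketch: verify that quotes in a chunk-extension span are balanced,
   * that backslash escapes are consistent, and that no CR/LF appears
   * inside a quoted string. Returns 1 if the extension looks sane.
   */
  static int chunk_ext_quoting_ok(const char *p, const char *end)
  {
      int in_quotes = 0;

      for (; p < end; p++) {
          if (in_quotes) {
              if (*p == '\\') {
                  if (p + 1 >= end || p[1] == '\r' || p[1] == '\n')
                      return 0;  /* dangling escape or escaped CR/LF */
                  p++;           /* skip the escaped character */
                  continue;
              }
              if (*p == '\r' || *p == '\n')
                  return 0;      /* CR/LF inside a quoted string */
              if (*p == '"')
                  in_quotes = 0;
          }
          else if (*p == '"')
              in_quotes = 1;
      }
      return !in_quotes;         /* quotes must be closed */
  }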

Since this is done in 3.4 which will be LTS, this patch will have to be
backported to 3.3 so that any unlikely trouble gets a chance to be
detected before users upgrade to 3.4.

Thanks to Ben for the discussion, and to Rajat Raghav for sparking it
in the first place even though the original report was mistaken.

Cc: Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu>
Cc: Rajat Raghav <xclow3n@gmail.com>
Cc: Christopher Faulet <cfaulet@haproxy.com>
2026-01-28 18:54:23 +01:00
Willy Tarreau
a79a67b52f OPTIM: server: get rid of the last use of _ha_barrier_full()
The code in srv_add_to_idle_list() has its roots in 2.0 with commit
9ea5d361ae ("MEDIUM: servers: Reorganize the way idle connections are
cleaned."). At this era we didn't yet have the current set of atomic
load/store operations and we used to perform loads using volatile casts
after a barrier. It turns out that this function has kept this schema
over the years, resulting in a big mfence stalling all the pipeline
in the function:

       |     static __inline void
       |     __ha_barrier_full(void)
       |     {
       |     __asm __volatile("mfence" ::: "memory");
 27.08 |       mfence
       |     if ((volatile void *)srv->idle_node.node.leaf_p == NULL) {
  0.84 |       cmpq    $0x0,0x158(%r15)
  0.74 |       je      35f
       |     return 1;

Switching these for a pair of atomic loads got rid of this and brought
0.5 to 3% extra performance depending on the tests due to variations
elsewhere, but it has never been below 0.5%. Note that the second load
doesn't need to be atomic since it's protected by the lock, but it's
cleaner from an API and code review perspective. That's also why it's
relaxed.
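
Schematically, using GCC built-ins rather than the exact HAProxy macros and memory orders:

  /* Before (sketch): a full fence followed by a volatile read */
  __ha_barrier_full();
  empty = ((volatile void *)srv->idle_node.node.leaf_p == NULL);

  /* After (sketch): one atomic load, no mfence stalling the pipeline */
  empty = (__atomic_load_n(&srv->idle_node.node.leaf_p, __ATOMIC_ACQUIRE) == NULL);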

This was the last user of _ha_barrier_full(), let's try not to
reintroduce it now!
2026-01-28 16:07:27 +00:00
William Lallemand
bbab0ac4d0 BUG/MINOR: ssl: fix error message of tune.ssl.certificate-compression
tune.ssl.certificate-compression expects 'auto', not 'on'.

Could be backported if the previous patch is backported.
2026-01-27 16:25:11 +01:00
William Lallemand
6995fe60c3 MINOR: ssl: allow to disable certificate compression
This option allows disabling certificate compression (RFC 8879)
when using OpenSSL >= 3.2.0.

This feature is known to permit some denial of service by causing extra
memory allocations of approximately 22MiB and extra CPU work per
connection with OpenSSL versions affected by CVE-2025-66199.
( https://openssl-library.org/news/vulnerabilities/index.html#CVE-2025-66199 )

Setting this to "off" mitigates the problem.

Must be backported to every stable branch.
2026-01-27 16:10:41 +01:00
Christopher Faulet
0ea601127e BUG/MAJOR: applet: Don't call I/O handler if the applet was shut
In 3.0, it was stated that an applet could not be woken up after it was shut down.
So the corresponding test in the applets I/O handler was removed. However,
it seems it may happen, especially when outgoing data are blocked on the
opposite side. But it is really unexpected because the "release" callback
function was already called and the appctx context was most probably
released.

Strangely, it was never detected by any applet till now. But the Prometheus
exporter was never updated and was still testing the shutdown. But when it
was refactored to use the new applet API in 3.3, the test was removed, and
this introduced a regression leading to a crash because a server object could
be corrupted. The conditions to hit the bug are not really clear, however.

So, now, to avoid any issue with all other applets, the test is performed in
task_process_applet(). The I/O handler is no longer called if the applet is
already shut.

The same is performed for applets still relying on the old API.

An amazing thanks to @idl0r for his invaluable help on this issue !

This patch should fix the issue #3244. It should first be backported to 3.3
and then slowly as far as 3.0.
2026-01-27 16:00:23 +01:00
William Lallemand
0ebef67132 MINOR: ssl: display libssl errors on private key loading
Display a more precise error message from the libssl when a private key
can't be loaded correctly.
2026-01-26 14:19:19 +01:00
Remi Tricot-Le Breton
9b1faee4c9 BUG/MINOR: ssl: Encrypted keys could not be loaded when given alongside certificate
The SSL passphrase callback function was only called when loading
private keys from a dedicated file (separate from the corresponding
certificate) but not when both the certificate and the key were in the
same file.
We can now load them properly, regardless of how they are provided.
A flag had to be added in the 'passphrase_cb_data' structure because in
the 'ssl_sock_load_pem_into_ckch' function, when calling
'PEM_read_bio_PrivateKey' there might be no private key in the PEM file
which would mean that the callback never gets called (and cannot set the
'passphrase_idx' to -1).

This patch can be backported to 3.3.
2026-01-26 14:09:13 +01:00
Remi Tricot-Le Breton
d2ccc19fde BUG/MINOR: ssl: Properly manage alloc failures in SSL passphrase callback
Some error paths in 'ssl_sock_passwd_cb' (allocation failures) did not
set the 'passphrase_idx' to -1 which is the way for the caller to know
not to call the callback again so in some memory contention contexts we
could end up calling the callback 'infinitely' (or until memory is
finally available).

This patch must be backported to 3.3.
2026-01-26 14:08:50 +01:00
Willy Tarreau
1a3252e956 MEDIUM: pools: better check for size rounding overflow on registration
Certain object sizes cannot be controlled at declaration time because
the resulting object size may be slightly extended (tag, caller),
aligned and rounded up, or even doubled depending on pool settings
(e.g. if backup is used).

This patch addresses this by enlarging the type in the pool registration
to 64-bit so that no info is lost from the declaration, and extra checks
for overflows can be performed during registration after various rounding
steps. This allows catching issues such as these and reporting a
suitable error:

  global
      tune.http.logurilen 2147483647

  frontend
      capture request header name len 2147483647
      http-request capture src len 2147483647
      tcp-request content capture src len 2147483647
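
Schematically, keeping the declared size in a 64-bit field lets the registration code detect overflows after rounding; a sketch with assumed rounding and limits:

  #include <stdint.h>
  #include <limits.h>

  /* Sketch: round the declared size up as pool creation would, in 64 bits,
   * and reject it if the result no longer fits the pools' native type.
   * <align> is assumed to be non-zero.
   */
  static int pool_size_ok(uint64_t declared, uint64_t extra, uint64_t align)
  {
      uint64_t sz = declared + extra;             /* tag, caller, ... */

      sz = (sz + align - 1) / align * align;      /* alignment rounding */
      return sz <= UINT_MAX;                      /* or whatever limit applies */
  }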
2026-01-26 11:54:14 +01:00
Willy Tarreau
e9e4821db5 BUG/MINOR: stick-tables: abort startup on stk_ctr pool creation failure
Since 3.3 with commit 945aa0ea82 ("MINOR: initcalls: Add a new initcall
stage, STG_INIT_2"), stkt_late_init() calls stkt_create_stk_ctr_pool()
but doesn't check its return value, so if the pool creation fails, the
process still starts, which is not correct. This patch adds a check for
the return value to make sure we fail to start in this case. This was
not an issue before 3.3 because the function was called as a post-check
handler which did check for errors in the returned values.
2026-01-26 11:45:49 +01:00
Willy Tarreau
4e7c07736a BUG/MINOR: config: check capture pool creations for failures
A few capture pools can fail to be created, for example in case of too large values.
These include the req_uri, capture, and caphdr pools, and may be triggered
with "tune.http.logurilen 2147483647" in the global section, or one of
these in a frontend:

  capture request header name len 2147483647
  http-request capture src len 2147483647
  tcp-request content capture src len 2147483647

These seem to be the only occurrences where create_pool()'s return value
is assigned without being checked, so let's add the proper check for
errors there. This can be backported as a hardening measure though the
risks and impacts are extremely low.
2026-01-26 11:45:49 +01:00
Christopher Faulet
c267d24f57 BUG/MINOR: proto_tcp: Properly report support for HAVE_TCP_MD5SIG feature
The condition to report support for the HAVE_TCP_MD5SIG feature was inverted. It
is only an issue for the reg-test related to this feature.

This patch must be backported to 3.3.
2026-01-23 11:40:54 +01:00
Christopher Faulet
a3e9a04435 BUG/MEDIUM: mux-h1: Skip UNUSED htx block when formating the start line
UNUSED blocks were not properly handled when the H1 multiplexer was
formatting the start line of a request or a response. UNUSED was ignored but
not removed from the HTX message. So the mux can loop infinitely on such a
block.

It could be seen as a major issue but in fact it happens only in a very
specific case of response processing (at least I think so): the server
must send an interim message (a 100-continue for instance) with the final
response. HAProxy must receive both at the same time and the final response must
be intercepted (via an http-response return action for instance). In that
case, the interim message is forwarded and the server's final response is
removed and replaced by a proxy error message.

Now UNUSED htx blocks are properly skipped and removed.

This patch must be backported as far as 3.0.
2026-01-23 11:40:54 +01:00
Aurelien DARRAGON
a66b4881d7 BUG/MINOR: hlua: consume error object if ignored after a failing lua_pcall()
We frequently use lua_pcall() to provide safe alternative functions
(encapsulated helpers) that prevent the process from crashing in case
of Lua error when Lua is executed from an unsafe environment.

However, some of those safe helpers don't handle errors properly. In case
of error, the Lua API will always put an error object on top of the stack
as stated in the documentation. This error object can be used to retrieve
more info about the error. But in some cases when we ignore it, we should
still consume it to prevent the stack from being altered with an extra
object when returning from the helper function.
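
The pattern described above, in a minimal form using the standard Lua C API (the hlua.c helpers wrap this differently):

  #include <lua.h>

  /* Sketch: when the error object returned by lua_pcall() is ignored,
   * it must still be popped so the caller's stack isn't left with an
   * extra value on top.
   */
  static int safe_call_sketch(lua_State *L, int nargs)
  {
      if (lua_pcall(L, nargs, 0, 0) != LUA_OK) {
          lua_pop(L, 1);  /* consume the error object */
          return 0;
      }
      return 1;
  }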

It should be backported to all stable versions. If the patch doesn't apply
automatically, all that's needed is to check for lua_pcall() in hlua.c
and for other cases than 'LUA_OK', make sure that the error object is popped
from the stack before the function returns.
2026-01-23 11:23:37 +01:00
Aurelien DARRAGON
9e9083d0e2 BUG/MEDIUM: hlua: fix invalid lua_pcall() usage in hlua_traceback()
Since commit 365ee28 ("BUG/MINOR: hlua: prevent LJMP in hlua_traceback()")
we now use lua_pcall() to protect sensitive parts of hlua_traceback()
function, and this to prevent Lua from crashing the process in case of
unexpected Lua error.

This is still relevant, but an error was made, as lua_pcall() was given
an nresults argument of '1' while the _hlua_traceback() internal function
doesn't push any result on the stack. Because of this, it seems the Lua
API still tries to push a garbage object on top of the stack before
returning. This may cause functions that leverage hlua_traceback() in
the middle of stack manipulation to end up with a corrupted stack when
continuing after the hlua_traceback().

There doesn't seem to be many places where this could be a problem, as
this was discovered using the reproducer documented in f535d3e
("BUG/MEDIUM: debug: only dump Lua state when panicking"). Indeed, when
hlua_traceback() was used from the signal handler while the thread was
previously executing Lua, when returning to Lua after the handler the
Lua stack would be corrupted.

To fix the issue, we emphasize the fact that the _hlua_traceback()
function doesn't push anything on the stack and returns 0, thus lua_pcall()
is given an 'nresults' argument of 0 to prevent anything from being pushed after
the execution, preserving the original stack state.

This should be backported to all stable versions (because 365ee28 was
backported there)
2026-01-23 11:23:31 +01:00
Amaury Denoyelle
b52c60d366 MEDIUM: proxy: implement persistent named defaults
This patch changes the handling of named defaults sections. Prior to
this patch, all unreferenced defaults proxies were removed during post
parsing. Now, by default, these sections are kept after post-parsing and
only purged on deinit. The objective is to allow reusing them as base
configurations for dynamic backends.

To implement this, the refcount of every still addressable named section is
incremented by one after parsing. This ensures that they won't be
removed even if referencing proxies are removed at runtime. This is done
via the new function proxy_ref_all_defaults().

To ensure defaults instances are still properly removed on deinit, the
inverse operation is performed: the refcount is decremented by one on every
defaults section via proxy_unref_all_defaults().

The original behavior can still be obtained with the new global keyword
tune.defaults.purge. This is useful for users with configurations containing a
large number of defaults sections who are not interested in dynamic backend
creation.
2026-01-22 18:06:42 +01:00
Amaury Denoyelle
116983ad94 MEDIUM: cfgparse: do not store unnamed defaults in name tree
Defaults sections are indexed by their name in the defproxy_by_name tree. For
named sections, there is no duplicate: if two instances have the same
name, the older one is removed from the tree. However, this was not the
case for unnamed defaults, which were all stored unconditionally in
defproxy_by_name.

This commit introduces a new approach for unnamed defaults. Now, these
instances are never inserted in the defproxy_by_name tree. Indeed, this
is not needed as no tree lookup is performed with empty names. This may
slightly optimize config parsing with a huge number of named and unnamed
defaults sections, as the unnamed ones won't fill up the tree needlessly.

However, the defproxy_by_name tree is also used to purge unreferenced
defaults instances, both on post-parsing and deinit. Thus, a new approach
is needed for unnamed sections cleanup. Now, each time a new defaults is
parsed, if the previous instance is unnamed, it is freed unless
referenced by a proxy. When config parsing ends, a similar operation
is performed to ensure the last unnamed defaults section won't stay in
memory. To implement this, the last_defproxy static variable is now made
global. Unnamed sections which cannot be removed because proxies still
reference them will still be removed when such proxies are freed
themselves, at runtime or on deinit.
2026-01-22 17:57:16 +01:00
Amaury Denoyelle
848e0cd052 MINOR: proxy: simplify defaults proxies list storage
Defaults proxy instances are stored in a global name tree. When there
is a name conflict and the older entry cannot be simply discarded as it
is already referenced, the older entry is instead removed from the name
tree and inserted into the orphaned list.

The purpose of the orphaned list was to guarantee that any remaining
unreferenced defaults are purged either on post-parsing or deinit.
However, this is in fact completely useless. Indeed, on post-parsing,
orphaned entries are always referenced. On deinit instead, defaults are
already freed along with the cleanup of all frontend/backend instances,
thanks to their refcounting.

This patch streamlines this by removing the orphaned list. Instead, a
defaults section is inserted into a new global defaults_list during
their whole lifetime. This is not strictly necessary but it ensures that
defaults instances can still be accessed easily in the future if needed
even if not present in the name tree. On deinit, a BUG_ON() is added to
ensure that defaults_list is indeed emptied.

Another benefit of this patch is to simplify the defaults deletion
procedure. The orphaned simple list is replaced by a proper doubly linked
list implementation, so a single LIST_DELETE() is now performed. This
will be notably useful as defaults may be removed at runtime in the
future if backend deletion at runtime is implemented.
2026-01-22 17:57:09 +01:00
Amaury Denoyelle
434e979046 MINOR: proxy: refactor defaults proxies API
This patch renames functions which deal with defaults sections. A common
"defaults_px_" prefix is defined. This serves as a marker to identify
functions which can only be used with proxies having the defaults capability.
New BUG_ON() checks are enforced to ensure this is valid.

Also, the older proxy_unref_or_destroy_defaults() is renamed
defaults_px_detach().
2026-01-22 17:55:47 +01:00
Amaury Denoyelle
6c0ea1fe73 MINOR: proxy: remove proxy_preset_defaults()
The purpose of the proxy_preset_defaults() function has evolved over time.
Originally, it was only used to initialize defaults proxy instances.
Over time, it was extended so that all proxies use it. Its objective
is to initialize settings to common default values.

To remove the confusion, this function is now removed. Its content is
integrated directly into init_new_proxy().
2026-01-22 16:20:25 +01:00
Willy Tarreau
f535d3e031 BUG/MEDIUM: debug: only dump Lua state when panicking
For a long time, we've tried to show the Lua state and backtrace when
dumping threads so as to be able to figure out if (and which) Lua code was
misbehaving, e.g. by performing expensive library calls. Since 3.1 with
commit 365ee28510 ("BUG/MINOR: hlua: prevent LJMP in hlua_traceback()"),
it appears that the approach is more fragile (though that fix addressed
a real issue about out-of-memory), and it's possible to occasionally
observe crashes or CPU loops with "show threads" while running Lua
heavily. While users of "show threads" are rare, the watchdog warnings,
which were also enabled on 3.1, also trigger these issues, which is
even more of a concern.

This patch goes the simple way to address this for now: since the purpose
of the Lua backtrace was to help locate Lua call places upon a panic,
let's only call the backtrace on panic but not in other situations. After
a panic we obviously don't care that the Lua stack might be corrupted
since it's never going to be resumed anyway. This may be relaxed in the
future if a solution is found to reliably produce harmless Lua backtraces.

The commit above was backported to all stable branches, so this patch
will be needed everywhere. However, TAINTED_PANIC only appeared in 2.8,
and given the rarity of this bug before 3.1, it's probably not needed
to make any extra effort to go beyond 2.8.

It's easy enough to test whether a version is subject to this issue
by running the following Lua code:

  local function stress(txn)
          for _, backend in pairs(core.backends) do
                  for _, server in pairs(backend.servers) do
                          local stats = server:get_stats()
                  end
          end
  end

  core.register_fetches("stress", stress)

in the following config file:

  global
        stats socket /tmp/haproxy.stat level admin mode 666
        tune.lua.bool-sample-conversion normal
        lua-load-per-thread "stress.lua"

  listen stress
        bind :8001
        mode http
        timeout client 5s
        timeout server 5s
        timeout connect 5s
        http-request return status 200 content-type text/plain lf-string %[lua.stress()]
        server s1 127.0.0.1:8000

and stressing port 8001 with 100+ connections requesting / in loop, then
issuing "show threads" on the CLI using socat in loops as well. Normally
it instantly segfaults (sometimes during the first "show").
2026-01-22 15:47:42 +01:00