Some users with very large numbers of connections have been facing
extremely long malloc_trim() calls on reload that managed to trigger
the watchdog! That's a bit counter-productive. It's even possible
that some implementations are not perfectly reliable or that their
trimming time grows quadratically with the memory used. Instead of
constantly trying to work around these issues, let's offer an option
to disable this mechanism, since nobody had been complaining in the
past, and this was only meant to be an improvement.
This should be backported to 2.4 where trimming on reload started to
appear.
Commit e81248c0c ("BUG/MINOR: pool: always align pool_heads to 64 bytes")
added a free of the allocated pool in pool_destroy() using ha_free(), but
this introduced a subtle bug: ha_free() resets the pointer to NULL after
freeing it, and since that pointer is stored inside the pool structure
itself, the write lands in memory that has just been freed.
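As an illustration only (names simplified, not the exact pool_destroy()
code), the faulty pattern and its fix look like this:

    #include <stdlib.h>

    /* hypothetical: the base pointer lives inside the block being freed */
    struct pool_head_like {
        void *base_addr;          /* address returned by malloc() */
        /* ... rest of the pool head ... */
    };

    static void destroy_buggy(struct pool_head_like *pool)
    {
        /* what ha_free(&pool->base_addr) boils down to: free() releases
         * the block containing 'pool', then the NULL store writes into
         * memory that was just freed */
        free(pool->base_addr);
        pool->base_addr = NULL;   /* use-after-free */
    }

    static void destroy_fixed(struct pool_head_like *pool)
    {
        void *base = pool->base_addr;   /* copy the pointer out first */
        free(base);                     /* no further access to *pool */
    }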
This will need to be backported wherever the patch above is backported.
This is the pool equivalent of commit 97ea9c49f ("BUG/MEDIUM: fd: always
align fdtab[] to 64 bytes"). After a careful code review, it turns out that
the pool heads are the only other structures allocated with malloc/calloc
that claim an alignment larger than what the allocator can offer. While no
issue was reported on them (no memset() is performed and no field type is
large), this is a problem waiting to happen, so better fix it. In addition,
it's relatively easy to do by storing the allocation address inside the
pool_head itself and using it at free() time. Finally, threads might
benefit from the fact that the caches will really be aligned and that
there will be no false sharing.
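A minimal sketch of the technique, assuming a simplified head structure
(the real pool_head stores more than just this pointer):

    #include <stdint.h>
    #include <stdlib.h>

    #define HEAD_ALIGN 64

    struct aligned_head {
        void *base_addr;   /* what malloc() returned, used by free() */
        /* ... aligned contents ... */
    };

    static struct aligned_head *head_create(size_t payload)
    {
        void *base = malloc(sizeof(struct aligned_head) + payload + HEAD_ALIGN - 1);
        struct aligned_head *h;

        if (!base)
            return NULL;
        /* round up to the next 64-byte boundary */
        h = (struct aligned_head *)(((uintptr_t)base + HEAD_ALIGN - 1)
                                    & ~(uintptr_t)(HEAD_ALIGN - 1));
        h->base_addr = base;   /* remembered for release time */
        return h;
    }

    static void head_destroy(struct aligned_head *h)
    {
        if (h) {
            void *base = h->base_addr;   /* copy out before freeing */
            free(base);
        }
    }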
This should be backported to all versions where it applies easily.
Because appctx is now an endpoint of the conn-stream, there is no reason to
still have the stream-interface as appctx owner. Thus, the conn-stream is
now the appctx owner.
The 9 currently available debugging options may now be checked, set, or
cleared using -dM. The directive now takes a comma-delimited list of
options after the optional poisoning byte. With "help", the list of
available options is displayed with a short description and their current
status.
The management doc was updated.
The new function pool_parse_debugging() is now dedicated to parsing the
options of -dM. For now it only handles the optional memory poisoning byte,
but the function may already return an informative message to be printed
(for help, a warning or an error). This way we'll be able to reuse it for
the settings needed by the configurable debugging options.
Now -dM will set POOL_DBG_POISON for consistency with the rest of the
pool debugging options. As such now we only check for the new flag,
which allows the default value to be preset.
This option used to allow storing a marker at the end of the area, which
was used as a canary and a detection against wrong freeing while the object
is in use, and as a pointer to the last pool_free() caller when back in the
cache. Now that we can compute the offsets at runtime, let's check it at
run time and continue the code simplification.
This option used to allow storing a pointer to the caller of the last
pool_alloc() or pool_free() at the end of the area. Now that we can
compute the offsets at runtime, let's check it at run time and continue
the code simplification. In __pool_alloc() we now always calculate the
return address (which is quite cheap), and the POOL_DEBUG_TRACE_CALLER()
calls are conditioned on the status of the debugging option.
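A hedged sketch of the pattern (the flag bit, the layout and the helper
below are assumptions; only the "capture cheaply, store conditionally"
idea comes from the description above):

    #include <stddef.h>

    static unsigned int pool_debugging_sk;   /* runtime option bits */
    #define DBG_CALLER_SK 0x1u               /* hypothetical bit */

    /* store the caller in the tracer slot located right after the
     * visible area of <size> bytes, when the option is enabled */
    static void *tag_caller(void *area, size_t size)
    {
        const void *caller = __builtin_return_address(0);  /* quite cheap */

        if (pool_debugging_sk & DBG_CALLER_SK)
            *(const void **)((char *)area + size) = caller;
        return area;
    }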
This macro is build-time dependent and is almost unused, except in a few
places where it cannot easily be avoided. Now that we store the distinction
between pool->size and pool->alloc_sz, we don't need to maintain it and we
can instead compute it on the fly when creating a pool. This is what this
patch does. The variables are for now pretty static, but this is sufficient
to kill the macro and will allow them to be set more dynamically.
The allocated size is the visible size plus the extra storage. Since
for now we can store up to two extra elements (mark and tracer), it's
convenient because now we know that the mark is always stored at
->size, and the tracer is always before ->alloc_sz.
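Roughly (field names from the text; the rounding details of the real code
are not shown):

    #include <stddef.h>

    struct pool_head_sk {
        size_t size;       /* visible (requested) size */
        size_t alloc_sz;   /* visible size + extra storage */
    };

    static void set_sizes_sk(struct pool_head_sk *p, size_t size,
                             int with_mark, int with_tracer)
    {
        size_t extra = 0;

        if (with_mark)
            extra += sizeof(void *);   /* canary / pool pointer at ->size */
        if (with_tracer)
            extra += sizeof(void *);   /* last caller, just before ->alloc_sz */

        p->size     = size;
        p->alloc_sz = size + extra;
    }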
Like previous patches, this replaces the build-time code paths that were
conditioned by CONFIG_HAP_POOLS with runtime paths conditioned by
!POOL_DBG_NO_CACHE. One trivial test had to be added in the hot path in
__pool_alloc() to refrain from calling pool_get_from_cache(), and another
one in __pool_free() to avoid calling pool_put_to_cache().
All cache-specific functions were instrumented with a BUG_ON() to make
sure we never call them with the cache disabled. Additionally the cache[]
array is not initialized (it remains NULL) so that we can later drop it
if not needed. It's particularly huge and should be turned into a
dynamically allocated per-thread area where all the objects are located.
This will solve the memory usage issue and will improve locality, or
even help better deal with NUMA machines once each thread uses its own
arena.
There were very few functions left that were specific to global pools,
and even the checks they used to participate in are not directly on the
most critical path, so they can suffer an extra "if".
What's done now is that pool_releasable() always returns 0 when global
pools are disabled (as before) so that pool_evict_last_items()
never tries to place evicted objects there. As such there will never be
any object in the free list. However pool_refill_local_from_shared() is
bypassed when global pools are disabled so that we even avoid the atomic
loads from this function.
The default global setting is still adjusted based on the original
CONFIG_NO_GLOBAL_POOLS that is set depending on threads and the allocator.
Overall the executable only grew by 1.1kB by keeping this code enabled,
and the code is simplified and will later support runtime options.
The test to decide whether or not to enforce integrity checks on cached
objects is now enabled at runtime and conditioned by this new debugging
flag. While previously it was not a concern to inflate the code size by
keeping the two functions static, they were moved to pool.c to limit the
impact. In pool_get_from_cache(), the fast code path remains fast by
having both flags tested at once to open a slower branch when either
POOL_DBG_COLD_FIRST or POOL_DBG_INTEGRITY is set.
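A hedged sketch of that test (bit values and helpers are made up; only the
single combined branch reflects the text):

    #include <stddef.h>

    #define POOL_DBG_COLD_FIRST_SK 0x04u
    #define POOL_DBG_INTEGRITY_SK  0x08u

    static unsigned int pool_debugging_sk;

    static void *pick_hottest(void)           { return NULL; /* stub */ }
    static void *pick_coldest_and_check(void) { return NULL; /* stub */ }

    static void *get_from_cache_sk(void)
    {
        /* one test opens the slower branch for either option */
        if (pool_debugging_sk & (POOL_DBG_COLD_FIRST_SK | POOL_DBG_INTEGRITY_SK))
            return pick_coldest_and_check();
        return pick_hottest();   /* fast path */
    }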
When enabling pools integrity checks, we usually prefer to allocate cold
objects first in order to maximize the time the objects spend in the
cache. In order to make this configurable at runtime, let's introduce
a new debugging flag to control this allocation order. It is currently
preset by the DEBUG_POOL_INTEGRITY build-time setting.
This test used to appear at a single location in create_pool(), to either
enable a check on the pool name or unconditionally merge similarly
sized pools.
This patch introduces POOL_DBG_DONT_MERGE and conditions the test on
this new runtime flag, that is preset according to the aforementioned
debugging option.
The fail-alloc test used to be enabled/disabled at build time using
the DEBUG_FAIL_ALLOC macro, but it happens that the test is quite cheap
and that it can be enabled as one of the pool_debugging options.
This patch thus introduces the first POOL_DBG_FAIL_ALLOC option, whose
default value depends on DEBUG_FAIL_ALLOC. The mem_should_fail() function
is now always built, but it was made static since it's never used outside.
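A hedged sketch of the runtime gating (the bit value and the probability
logic below are assumptions; only the idea that the test is cheap and
always built comes from the text):

    #include <stdlib.h>

    #define POOL_DBG_FAIL_ALLOC_SK 0x01u
    static unsigned int pool_debugging_sk;
    static int mem_fail_rate_sk;                  /* percent, 0 = never */

    static int mem_should_fail_sk(void)
    {
        return mem_fail_rate_sk && (rand() % 100) < mem_fail_rate_sk;
    }

    static void *pool_alloc_sk(size_t sz)
    {
        if ((pool_debugging_sk & POOL_DBG_FAIL_ALLOC_SK) && mem_should_fail_sk())
            return NULL;                          /* simulated failure */
        return malloc(sz);
    }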
This read-mostly variable will be used at runtime to enable/disable
certain pool-debugging features and will be set by the command-line
parser. A future option -dP will take a number of debugging features
as arguments to configure this variable's contents.
The poisoning performed on pool_free() used to help a little bit with
use-after-free detection, but usually did more harm than good in that
it was never possible to perform post-mortem analysis on released
objects once poisoning was enabled on allocation. Now that there is
a dedicated DEBUG_POOL_INTEGRITY, let's get rid of this annoyance,
which is not even documented in the management manual.
The mem_poison_byte, mem_fail_rate, using_default_allocator and the
pools list are all only set once at boot time and never changed later,
while they're heavily used at run time. Let's optimize their usage from
all threads by marking them read-mostly so that they reside in a shared
cache line.
Along recent evolutions of the pools, we've lost the ability to reliably
detect double-frees because while in the past the same pointer was being
used to chain the objects in the cache and to store the pool's address,
since 2.0 they're different so the pool's address is never overwritten on
free() and a double-free will rarely be detected.
This patch sets the caller's return address there. It can never be equal
to a pool's address and will help guess what the previous call path was.
It will not work on exotic architectures nor with very old compilers but
these are not the environments where we're trying to get detailed bug
reports, and this is not done by default anyway so we don't care about
this limitation. Note that depending on the inlining status of the
function, the result may differ but that's no big deal either.
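A hedged sketch of the resulting check (simplified layout; the real code
uses the pool's computed offsets):

    #include <stdio.h>
    #include <stdlib.h>

    struct mini_pool { size_t size; };

    /* last word of the object: pool pointer while in use, last freer after */
    #define OWNER_SLOT(pool, item) ((const void **)((char *)(item) + (pool)->size))

    static void mini_free(struct mini_pool *pool, void *item)
    {
        const void *caller = __builtin_return_address(0);

        if (*OWNER_SLOT(pool, item) != pool) {
            /* double free or wrong pool: the slot points at the last freer */
            fprintf(stderr, "double free? previously freed from %p\n",
                    *OWNER_SLOT(pool, item));
            abort();
        }
        *OWNER_SLOT(pool, item) = caller;   /* remember the call path */
        /* ... then put the item back into the cache ... */
    }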
A test placing a double free of an appctx inside the release handler
itself successfully reported the problem during appctx_free() and showed
that the return address was in stream_int_shutw_applet() (this one calls
the release handler).
During global eviction we're visiting nodes from the LRU tail and we
determine their pool cache head and their pool. In order to make sure
we never mess up, let's add backwards pointers to the thread number
and the pool into the pool_cache_head. It's 64-byte aligned anyway so
we're not wasting space, and it helps for debugging and will stop
memory corruption as early as possible.
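A hedged sketch of the added fields (the real structure has more members;
the names below are assumptions):

    struct pool_head;   /* opaque here */

    struct pool_cache_head_sk {
        /* list head of cached objects, counters, ... */
        struct pool_head *pool;   /* owning pool, for cross-checking */
        unsigned int      tid;    /* owning thread number */
    } __attribute__((aligned(64)));

During eviction a consistency check such as BUG_ON(ph->tid != tid ||
ph->pool != pool) then catches any mix-up immediately.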
When refilling caches from the shared cache, it's pointless to set the
pointer to the local pool since it may be overwritten immediately after
by the LIST_INSERT(). This is a leftover from the pre-2.4 code in fact.
It didn't hurt, though.
When destroying a pool (e.g. at exit or when resizing buffers), it's
important to try to free all their local objects otherwise we can leave
some in the cache. This is particularly visible when changing "bufsize",
because "show pools" will then show two "trash" pools, one of which
contains a single object in cache (which is fortunately not reachable).
In all cases this happens while single-threaded so that's easy to do,
we just have to do it on the current thread.
The easiest way to do this is to pass an extra argument to function
pool_evict_from_local_cache() to force a full flush instead of a
partial one.
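A hedged sketch of the new argument (types and thresholds simplified):

    struct cache_sk { unsigned int count, watermark; };

    static void evict_one(struct cache_sk *c) { c->count--; /* drop oldest */ }

    static void evict_from_local_cache_sk(struct cache_sk *c, int full)
    {
        /* full=1 (pool_destroy time): flush everything from this thread;
         * full=0 (normal operation): only trim down to the watermark */
        while (c->count > (full ? 0u : c->watermark))
            evict_one(c);
    }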
This can probably be backported to about all branches where this
applies, but at least 2.4 needs it.
With the introduction of DEBUG_POOL_TRACING in 2.6-dev with commit
add43fa43 ("DEBUG: pools: add new build option DEBUG_POOL_TRACING"), small
pools might be too short to store both the pool_cache_item struct and the
caller location, resulting in memory corruption and crashes when this debug
option is used.
What happens here is that the size is calculated by assuming
that the POOL_EXTRA part is only used while the object is in use, but this
is not true anymore for the caller's pointer which must absolutely be placed
after the pool_cache_item.
This patch makes sure that the caller part will always start after the
pool_cache_item and that the allocation will always be sufficient. This is
only tagged medium because the debug option is new and unlikely to be used
unless requested by a developer.
No backport is needed.
When squashing commit add43fa43 ("DEBUG: pools: add new build option
DEBUG_POOL_TRACING") I managed to break the build and to fail to detect
it even after the rebase and a full rebuild :-(
This new option, when set, will cause the callers of pool_alloc() and
pool_free() to be recorded into an extra area in the pool that is expected
to be helpful for later inspection (e.g. in core dumps). For example it
may help figure that an object was released to a pool with some sub-fields
not yet released or that a use-after-free happened after releasing it,
with an immediate indication about the exact line of code that released
it (possibly an error path).
This only works with the per-thread cache, and even objects refilled from
the shared pool directly into the thread-local cache will have a NULL
there. That's not an issue since these objects have not yet been freed.
It's worth noting that pool_alloc_nocache() continues not to set any
caller pointer (e.g. when the cache is empty) because that would require
a possibly undesirable API change.
The extra cost is minimal (one pointer per object) and this complements
DEBUG_POOL_INTEGRITY well.
This adds a caller to pool_put_to_cache() and pool_get_from_cache()
which will optionally be used to pass a pointer to their callers. For
now it's not used, only the API is extended to support this pointer.
The pool_alloc() function was already a wrapper to __pool_alloc() which
was also inlined but took a set of flags. The latter was uninlined and
moved to pool.c, and pool_alloc()/pool_zalloc() were turned into macros so
that they can more easily evolve to support debugging options.
The number of call places made this code grow over time and doing only
this change saved ~1% of the whole executable's size.
The pool_free() function has become a bit big over time due to the
extra consistency checks. It used to remain inline only to deal
cleanly with the NULL pointer free that's quite present on some
structures (e.g. in stream_free()).
Here we're splitting the function in two:
- __pool_free() does the inner block without the pointer test and
becomes a function;
- pool_free() is now a macro that only checks the pointer and calls
__pool_free() if needed.
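A hedged sketch of the split (the stub below stands in for the real cache
insertion and consistency checks):

    #include <stdlib.h>

    struct pool_head_sk { int dummy; };

    static void __pool_free_sk(struct pool_head_sk *pool, void *ptr)
    {
        (void)pool;
        free(ptr);   /* placeholder for the out-of-line inner block */
    }

    #define pool_free_sk(pool, ptr)                \
        do {                                       \
            void *__p = (void *)(ptr);             \
            if (__p != NULL)                       \
                __pool_free_sk((pool), __p);       \
        } while (0)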
The use of a macro versus an inline function is only motivated by an
easier instrumentation of the code later.
With this change, the code size reduces by ~1%, which means that at
this point all pool_free() call places used to represent more than
1% of the total code size.
When enabled, objects picked from the cache are checked for corruption
by comparing their contents against a pattern that was placed when they
were inserted into the cache. Objects are also allocated in the reverse
order, from the oldest one to the most recent, so as to maximize the
ability to detect such a corruption. The goal is to detect writes after
free (or possibly hardware memory corruptions). Contrary to DEBUG_UAF
this cannot detect reads after free, but may possibly detect later
corruptions and will not consume extra memory. The CPU usage will
increase a bit due to the cost of filling/checking the area and the
preference for the cold cache over the hot one, though not as much as
with DEBUG_UAF. This option is meant to be usable in production.
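A hedged sketch of the mechanism (the actual pattern derivation differs;
only the fill-on-put / verify-on-get principle is taken from the text):

    #include <stdlib.h>

    static void cache_put_fill(void *area, size_t sz)
    {
        size_t *w = area;

        for (size_t i = 0; i < sz / sizeof(*w); i++)
            w[i] = (size_t)area + i;       /* per-object pattern */
    }

    static void cache_get_check(const void *area, size_t sz)
    {
        const size_t *w = area;

        for (size_t i = 0; i < sz / sizeof(*w); i++)
            if (w[i] != (size_t)area + i)
                abort();                   /* corrupted while cached */
    }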
With this patch pool_evict_last_items() builds clusters of up to
CONFIG_HAP_POOL_CLUSTER_SIZE entries so that accesses to the shared
pools are reduced by CONFIG_HAP_POOL_CLUSTER_SIZE and the inter-
thread contention is reduced by as much.
Since the previous patch we can forcefully evict multiple objects from the
local cache, even when evicting based on the LRU entries. Let's define
a compile-time configurable setting to batch the releasing of objects. For
now we set this value to 8 items per round.
This is marked medium because eviction from the LRU will slightly change
in order to group the last items that are freed within a single cache
instead of accurately scanning only the oldest ones exactly in their
order of appearance. But this is required in order to evolve towards
batched removals.
We currently have two functions to evict cold objects from local caches:
pool_evict_from_local_cache() to evict from a single cache, and
pool_evict_from_local_caches() to evict oldest objects from all caches.
The new function pool_evict_last_items() focuses on scanning oldest
objects from a pool and releasing a predefined number of them, either
to the shared pool or to the system. For now they're evicted one at a
time, but the next step will consist in creating clusters.
In order to support batched allocations and releases, we'll need to
prepare chains of items linked together and that can be atomically
attached and detached at once. For this we implement a "down" pointer
in each pool_item that points to the other items belonging to the same
group. For now it's always NULL, though the freeing functions already
check it when trying to release everything.
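A hedged sketch of the representation (field names follow the text, the
rest is simplified; the real shared list is manipulated with atomic
operations):

    struct pool_item_sk {
        struct pool_item_sk *next;   /* next cluster in the shared free list */
        struct pool_item_sk *down;   /* other items of the same cluster */
    };

    /* attach a whole pre-built cluster to a shared list in one operation */
    static void push_cluster(struct pool_item_sk **head, struct pool_item_sk *clu)
    {
        clu->next = *head;   /* done with a CAS loop in the real code */
        *head = clu;
    }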
In pool_evict_from_local_cache() we used to check for room left in the
pool for each and every object. Now we compute the value before entering
the loop, keep in a local list what has to be released, and call
the OS-specific functions for the other ones.
It should already save some cycles since it's not needed anymore to
recheck for the pool's filling status. But the main expected benefit
comes from the ability to pre-construct a list of all releasable
objects, that will later help with grouping them.
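A hedged sketch of the reworked loop (all names simplified; the real code
works on the pool cache lists and uses pool_releasable()):

    struct item_sk { struct item_sk *next; };

    static void evict_batch_sk(struct item_sk *oldest, unsigned int count,
                               unsigned int releasable /* computed once */)
    {
        struct item_sk *to_share = NULL;

        while (count-- && oldest) {
            struct item_sk *it = oldest;

            oldest = it->next;            /* walk from the oldest entries */
            if (releasable) {
                releasable--;
                it->next = to_share;      /* pre-construct the group */
                to_share = it;
            }
            /* else: hand the object back to the OS (pool_free_nocache()) */
        }
        /* <to_share> can now be pushed to the shared pool, later in clusters */
    }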
In order to support batch allocation from/to shared pools, we'll have to
support a specific representation for pool objects. The new pool_item
structure will be used for this. For now it only contains a "next"
pointer that matches exactly the current storage model. The few functions
that deal with the shared pool entries were adapted to use the new type.
There is no functionality difference at this point.
Instead of letting pool_put_to_shared_cache() pass the object to the
underlying OS layer when there's no more room, let's have the caller
check if the pool is full and either call pool_put_to_shared_cache()
or call pool_free_nocache().
Doing this noticeably simplifies the code as this function now only has
to deal with a pool and an item and only for cases where there are
local caches and shared caches. As the code was simplified and the
calls more isolated, the function was moved to pool.c.
Note that it's only called from pool_evict_from_local_cache{,s}() and
that a part of its logic might very well move there when dealing with
batches.
One of the thread scaling challenges nowadays for the pools is the
contention on the shared caches. There's never any situation where we
have a shared cache and no local cache anymore, so we can technically
afford to transfer objects from the shared cache to the local cache
before returning them to the user via the regular path. This adds a
little bit more work per object per miss, but will permit batch
processing later.
This patch simply moves pool_get_from_shared_cache() to pool.c under
the new name pool_refill_local_from_shared(), and this function does
not return anything but it places the allocated object at the head of
the local cache.
The POOL_LINK macro is now only used for debugging, and it still requires
ifdefs around it, which needlessly complicates the code. Let's replace it
and the calling code with a new pair of macros: POOL_DEBUG_SET_MARK()
and POOL_DEBUG_CHECK_MARK(), that respectively store and check the pool
pointer in the extra location at the end of the pool. This removes 4
pairs of ifdefs in the middle of the code.
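A hedged sketch of what the two macros amount to (the real ones compile
out when the tag is not enabled and rely on the pool's stored offsets):

    #include <stdlib.h>

    struct pool_sk { size_t size; };

    #define POOL_DEBUG_SET_MARK_SK(pool, item)                           \
        (*(void **)((char *)(item) + (pool)->size) = (void *)(pool))

    #define POOL_DEBUG_CHECK_MARK_SK(pool, item)                         \
        do {                                                             \
            if (*(void **)((char *)(item) + (pool)->size) != (pool))     \
                abort();   /* freed to the wrong pool or overflowed */   \
        } while (0)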
This practice relying on POOL_LINK() dates from the era when there were
no pool caches, but given that the structures are a bit more complex now
and that pool caches do not make use of this feature, it is totally
useless since released elements have already been overwritten, and yet
it complicates the architecture and prevents simplifications
and optimizations. Let's just get rid of this feature. The pointer to
the origin pool is preserved though, as it helps detect incorrect frees
and serves as a canary for overflows.
For an unknown reason, despite the comment stating that we were evicting
oldest objects first from the local caches, due to the use of LIST_NEXT,
the newest were evicted, since pool_put_to_cache() uses LIST_INSERT().
Some tests on 16 threads show that evicting the oldest objects instead can
improve performance by 0.5-1%, especially when using shared pools.
During 2.4-dev, support for malloc_trim() was implemented to ease
release of memory in a stopping process. This was found to be quite
effective and later backported to 2.3.7.
Then it was found that sometimes malloc_trim() could take a huge time
to complete if it was competing with other threads still allocating and
releasing memory, reason why it was decided in 2.5-dev to move
malloc_trim() under the thread isolation that was already in place in
the shared pool version of pool_gc() (this was commit 26ed1835).
However, other instances of pool_gc() that used to call malloc_trim()
were not updated since they were not using thread isolation. Currently
we have two other such instances, one for when there is absolutely no
pool and one for when there are only thread-local pools.
Christian Ruppert reported in GH issue #1490 that he's sometimes seeing
an old process die upon reload when upgrading from 2.3 to 2.4, and
that this happens inside malloc_trim(). The problem is that since
2.4-dev11 with commit 0bae07592 we detect modern libc that provide a
faster thread-aware allocator and do not maintain shared pools anymore.
As such we're using again the simpler pool_gc() implementations that do
not use thread isolation around the malloc_trim() call.
All this code was cleaned up recently and the call moved to a new
function trim_all_pools(). This patch implements explicit thread isolation
inside that function so that callers do not have to care about this
anymore. The thread isolation is conditional so that this doesn't affect
the one already in place in the larger version of pool_gc(). This way it
will solve the problem for all callers.
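A hedged sketch of the conditional isolation (the helpers below are
stand-ins for HAProxy's thread isolation primitives):

    static int thread_is_isolated_sk;   /* stand-in for the isolation state */

    static void thread_isolate_sk(void) { thread_is_isolated_sk = 1; }
    static void thread_release_sk(void) { thread_is_isolated_sk = 0; }

    static void trim_all_pools_sk(void)
    {
        int already = thread_is_isolated_sk;   /* under pool_gc() already? */

        if (!already)
            thread_isolate_sk();

        /* malloc_trim(0) or the allocator-specific purge goes here */

        if (!already)
            thread_release_sk();
    }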
This patch must be backported as far as 2.3. It may possibly require
some adaptations. If trim_all_pools() is not present, copy-pasting the
tests in each version of pool_gc() will have the same effect.
Thanks to Christian for his detailed report and his testing.
Apple libmalloc has its own notion of memory arenas, exposed as
malloc_zone objects with a rich API offering callbacks for various
allocation strategies, but here we just use the defaults. In
trim_all_pools(), we advise each zone to purge as much memory as
possible ("greedy" mode).
The build broke on Windows and MacOS after commit ed232148a ("MEDIUM:
pool: refactor malloc_trim/glibc and jemalloc api addition detections."),
because the extern+attribute(weak) combination doesn't result in a really
weak symbol and it causes an undefined symbol at link time.
Let's reserve this detection to ELF platforms. The runtime detection using
dladdr() remains in use where it is available.
No backport needed, this is purely 2.6.
When haproxy is built with DEBUG_UAF=1, some particularly slow
allocation functions are used for each pool, and it was not uncommon
to see the watchdog trigger during performance tests. For this reason
the allocation functions were surrounded by a pair of thread_harmless
calls to mention that the function was waiting in slow syscalls. The
problem is that this also releases functions blocked in thread_isolate()
which can then start their work.
In order to protect against the accidental removal of a shared resource
in this situation, thread_isolate_full() was added in 2.5-dev4 with commit
ba3ab7907 ("MEDIUM: servers: make the server deletion code run under full
thread isolation") for functions which want to be totally protected
while manipulating some data.
But this is not sufficient, because there are still places where we
can allocate/free (thus sleep) under a lock, such as in long call
chains involving the release of an idle connection. In this case, if
one thread asks for isolation, one thread might hang in
pool_alloc_area_uaf() with a lock held (for example the conns_lock
when coming from conn_backend_get()->h1_takeover()->task_new()), with
another thread blocked on a lock waiting for that one to release it,
both keeping their bit clear in the thread_harmless mask, preventing
the first thread from being released, thus causing a deadlock.
In addition to this, it was already seen that the "show fd" CLI handler
could wake up during a pool_free_area_uaf() with an incompletely
released memory area while deleting a file descriptor, and be fooled into
showing bad pointers, or during a pool_alloc() on another thread that
was in the process of registering a freshly allocated connection to a
new file descriptor.
One solution could consist in replacing all thread_isolate() calls by
thread_isolate_full() but then that makes thread_isolate() useless
and only shifts the problem by one slot.
A better approach could possibly consist in having a way to mark that
a thread is entering an extremely slow section. Such sections would
be timed so that this is not abused, and the bit would be used to
make the watchdog more patient. This would be acceptable as this would
only affect debugging.
The approach used here for now consists in removing the harmless bits
around the UAF allocator, thus essentially undoing commit 85b2cae63
("MINOR: pools: make the thread harmless during the mmap/munmap
syscalls").
This is marked as minor because nobody is expected to be running with
DEBUG_UAF outside of development or serious debugging, so this issue
cannot affect regular users. It must be backported to stable branches
that have thread_harmless_now() around the mmap() call.