Commit Graph

46 Commits

Willy Tarreau
598cf3f22e MAJOR: threads: change thread_isolate to support inter-group synchronization
thread_isolate() and thread_isolate_full() were relying on a set of thread
masks for all threads in different states (rdv, harmless, idle). This cannot
work anymore when the number of threads increases beyond LONGBITS so we need
to change the mechanism.

What is done here is to have a counter of requesters and the number of the
current isolated thread. Threads which want to isolate themselves increment
the request counter and wait for all threads to be marked harmless (or idle)
by scanning all groups and watching the respective masks. This is possible
because threads cannot escape once they discover this counter, unless they
also want to isolate and possibly pass first. Once all threads are harmless,
the requesting thread tries to self-assign the isolated thread number, and
if it fails it loops back to checking all threads. If it wins it's guaranteed
to be alone, and can drop its harmless bit, so that other competing threads
go back to the loop waiting for all threads to be harmless. The benefit of
proceeding this way is that there's very little write contention on the
thread number (none during work), hence no cache line moves between caches,
thus frozen threads do not slow down the isolated one.

Once it's done, the isolated thread resets the thread number (hence lets
another thread take the place) and decrements the requester count, thus
possibly releasing all harmless threads.

With this change there's no more need for any global mask to synchronize
any thread, and we only need to loop over a number of groups to check
64 threads at a time per iteration. As such, tinfo's threads_want_rdv
could be dropped.
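
As an illustration, a minimal C11 sketch of the scheme (helper names
like all_harmless() and set_harmless() are hypothetical stand-ins for
the real group-scanning code):

  #include <stdatomic.h>
  #include <stdbool.h>
  #include <sched.h>

  static atomic_uint rdv_requests;    /* threads requesting isolation */
  static atomic_uint isolated_thread; /* winner's ID + 1, 0 = nobody */

  extern unsigned my_id(void);        /* hypothetical: our thread ID */
  extern bool all_harmless(void);     /* hypothetical: scan all groups */
  extern void set_harmless(bool on);  /* hypothetical: flip our bit */

  void sketch_thread_isolate(void)
  {
      atomic_fetch_add(&rdv_requests, 1);
      set_harmless(true);                 /* we must count ourselves too */

      while (1) {
          while (!all_harmless())         /* check every group's mask */
              sched_yield();

          unsigned expected = 0;          /* try to self-assign the slot */
          if (atomic_compare_exchange_strong(&isolated_thread, &expected,
                                             my_id() + 1))
              break;                      /* won: we are now alone */
          /* lost: loop back and re-check all threads */
      }
      set_harmless(false);                /* losers now wait for us */
  }

  void sketch_thread_release(void)
  {
      atomic_store(&isolated_thread, 0);  /* let another requester win */
      atomic_fetch_sub(&rdv_requests, 1); /* possibly free all threads */
  }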

This was tested with 64 threads spread into 2 groups, running 64 tasks
(from the debug dev command), 20 "show sess" (thread_isolate()), 20
"add server blah/blah" (thread_isolate()), and 20 "del server blah/blah"
(thread_isolate_full()). The load remained very low (limited by external
socat forks) and no stuck or starved threads were found.
2022-07-01 19:15:15 +02:00
Willy Tarreau
03f9b35114 MEDIUM: tinfo: add a dynamic thread-group context
The thread group info is not sufficient to represent a thread group's
current state as it's read-only. We also need something comparable to
the thread context to represent the aggregate state of the threads in
that group. This patch introduces ha_tgroup_ctx[] and tg_ctx for this.
It's indexed on the group id and must be cache-line aligned. The thread
masks that were global and that do not need to remain global were moved
there (want_rdv, harmless, idle).

Given that all the masks placed there now become group-specific, the
associated thread mask (tid_bit) now switches to the thread's local
bit (ltid_bit). Both are the same for nbtgroups 1 but will differ for
other values.

There's also a tg_ctx pointer in the thread so that it can be reached
from other threads.
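
As a rough sketch, the new per-group context might look like this (the
field set comes from the text above, the exact layout is illustrative):

  #include <stdint.h>

  #define MAX_TGROUPS 16                   /* illustrative limit */

  struct tgroup_ctx {
      volatile uint64_t threads_want_rdv;  /* ltid bits asking for rdv */
      volatile uint64_t threads_harmless;  /* ltid bits of harmless ones */
      volatile uint64_t threads_idle;      /* ltid bits of idle ones */
  } __attribute__((aligned(64)));          /* one cache line per group */

  struct tgroup_ctx ha_tgroup_ctx[MAX_TGROUPS];

  /* each thread keeps a pointer to its own group's context so that
   * other threads can reach it */
  static __thread struct tgroup_ctx *tg_ctx;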
2022-07-01 19:15:15 +02:00
Willy Tarreau
22b2a24eb2 CLEANUP: thread: remove thread_sync_release() and thread_sync_mask
This function was added in 2.0 when reworking the thread isolation
mechanism to make it more reliable. However it is fundamentally
incompatible with the full isolation mechanism provided by
thread_isolate_full() since that one will wait for all threads to
become idle while the former will wait for all threads to finish
waiting, causing a deadlock.

Given that it's not used, let's just drop it entirely before it gets
used by accident.
2022-07-01 19:15:15 +02:00
Willy Tarreau
cce203aae5 MINOR: thread: add a new all_tgroups_mask variable to know about active tgroups
In order to kill all_threads_mask we'll need to have an equivalent for
the thread groups. The all_tgroups_mask does just this: it keeps one
bit set per enabled group.
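
As a sketch, assuming 1-based group IDs as elsewhere:

  static unsigned long all_tgroups_mask;

  static void sketch_enable_tgroup(unsigned int gid)
  {
      all_tgroups_mask |= 1UL << (gid - 1);  /* groups 1 and 2 -> 0x3 */
  }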
2022-07-01 19:15:15 +02:00
Willy Tarreau
66ad98a772 MINOR: tinfo: add the tgid to the thread_info struct
At several places we're dereferencing the thread group just to catch
the group number, and this will become even more frequent once we start
to use per-group contexts. Let's just add the tgid in the thread_info
struct to make this easier.
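
For example (a sketch; surrounding fields omitted):

  struct tgroup_info;                  /* opaque here */

  struct thread_info {
      const struct tgroup_info *tg;    /* the thread's group */
      unsigned int tgid;               /* copy of the group's ID */
      /* ... */
  };

  /* callers can then read ti->tgid instead of dereferencing ti->tg */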
2022-07-01 19:15:14 +02:00
Willy Tarreau
252754c745 MINOR: tinfo: make tid temporarily still reflect global ID
For now we still set tid_bit to (1UL << tid) because FDs will not
work with more than one group without this, but once FDs start to
adopt local masks this must change to thr->ltid_bit.
2022-07-01 19:14:42 +02:00
Willy Tarreau
1a85a958dd MINOR: tinfo: remove the global thread ID bit (tid_bit)
Each thread has its own local thread id and its own global thread id,
in addition to the masks corresponding to each. Once the global thread
ID can go beyond 64 it will not be possible to have a global thread ID
bit anymore, so better start to remove it and use only the local one
from the struct thread_info.
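
In other words, the per-thread IDs end up looking like this (sketch):

  struct thread_info {
      unsigned int tid;        /* global thread ID, may exceed 64 */
      unsigned int ltid;       /* local thread ID within the group */
      unsigned long ltid_bit;  /* 1UL << ltid, always fits in a long */
      /* no global tid_bit anymore: a single word cannot cover more
       * than LONGBITS threads */
  };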
2022-06-14 10:44:38 +02:00
Willy Tarreau
627def9e50 MINOR: threads: add a new function to resolve config groups and masks
In the configuration we sometimes omit the thread group number to
designate a global thread number range, and sometimes mention the group
and designate IDs within that group. The operation is more complex than
it seems due to the need to check for ranges spanning multiple groups,
to determine groups from thread bit masks, and to remap bit masks
between local and global numbering.

This patch adds a function to perform this operation: it takes a group and
mask on input and updates them on output. It's designed to be used by "bind"
lines but will likely be usable at other places if needed.

For situations where specified threads do not exist in the group, we have
the choice in the code between silently fixing the thread set or failing
with a message. For now the better option seems to return an error, but if
it turns out to be an issue we can easily change that in the future. Note
that it should only happen with "x/even" when group x only has one thread.
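
One possible shape for such a resolver (a hedged sketch, not the exact
prototype):

  /* Resolves a configured group/mask pair: igid == 0 means imask holds
   * global thread numbers, otherwise imask holds IDs local to group
   * igid. On success the resolved group and local mask are stored in
   * <ogid>/<omask> and > 0 is returned; on error (e.g. threads not
   * present in the group) <err> is filled and <= 0 is returned. */
  int thread_resolve_group_mask(unsigned int igid, unsigned long imask,
                                unsigned int *ogid, unsigned long *omask,
                                char **err);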
2021-10-08 17:22:26 +02:00
Willy Tarreau
b90935c908 MINOR: threads: add the current group ID in thread-local "tgid" variable
This is the equivalent of "tid" for ease of access. In the future if we
make th_cfg a pure thread-local array (not a pointer), it may make sense
to move it there.
2021-10-08 17:22:26 +02:00
Willy Tarreau
43ab05b3da MEDIUM: threads: replace ha_set_tid() with ha_set_thread()
ha_set_tid() was randomly used either to explicitly set thread 0 or to
set any possibly incomplete thread during boot. Let's replace it with
a pointer to a valid thread or NULL for any thread. This allows us to
check that the designated threads are always valid, to ignore thread 0's
mapping when setting it to NULL, and to always use group 0 with
it during boot.

The initialization code is also cleaner, as we don't pass ugly casts
of a thread ID to a pointer anymore.
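
A greatly simplified sketch of the resulting setter (names illustrative):

  struct thread_info { unsigned int tid; /* ... */ };

  extern struct thread_info ha_thread_info[];
  static __thread const struct thread_info *ti;
  static __thread unsigned int tid;

  void sketch_ha_set_thread(const struct thread_info *thr)
  {
      if (thr) {
          ti  = thr;                 /* a valid, fully set up thread */
          tid = thr->tid;
      } else {
          ti  = &ha_thread_info[0];  /* boot: thread 0 of group 0 */
          tid = 0;
      }
  }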
2021-10-08 17:22:26 +02:00
Willy Tarreau
e6806ebecc MEDIUM: threads: automatically assign threads to groups
This takes care of unassigned thread groups and places unassigned
threads there, in a more or less balanced way. Too sparse allocations
may still fail though. For now with a maximum group number fixed to 1
nothing can really fail.
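
The balancing can be pictured like this (an illustrative sketch, not
the actual code):

  #include <stdio.h>

  static void sketch_assign_threads(int nbthread, int nbtgroups)
  {
      int base  = nbthread / nbtgroups;
      int extra = nbthread % nbtgroups;  /* first groups get one more */

      for (int g = 0, t = 0; g < nbtgroups; g++) {
          int count = base + (g < extra);
          printf("group %d: threads %d..%d\n", g + 1, t, t + count - 1);
          t += count;
      }
  }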
2021-10-08 17:22:26 +02:00
Willy Tarreau
fc69e410e6 MINOR: threads: make tg point to the current thread's group
A the "tg" thread-local variable now always points to the current
thread group. It's pre-initializd to the first one during boot and is
set to point to the thread's one by ha_set_tid(). This last one takes
care of checking whether the thread group was assigned or not because
it may be called during boot before threads are initialized.
2021-10-08 17:22:26 +02:00
Willy Tarreau
1a9c922b53 REORG: thread/sched: move the task_per_thread stuff to thread_ctx
The scheduler contains a lot of stuff that is thread-local and not
exclusively tied to the scheduler. Other parts (namely thread_info)
contain similar thread-local context that ought to be merged with
it but that is even less related to the scheduler. However moving
more data into this structure isn't possible since task.h is high
level and cannot be included everywhere (e.g. activity) without
causing include loops.

In the end, it appears that the task_per_thread represents most of
the per-thread context defined with generic types and should simply
move to tinfo.h so that everyone can use them.

The struct was renamed to thread_ctx and the variable "sched" was
renamed to "th_ctx". "sched" used to be initialized manually from
run_thread_poll_loop(); now it's initialized by ha_set_tid() just
like ti, tid, tid_bit.

The memset() in init_task() was removed in favor of a bss initialization
of the array, so that other subsystems can put their stuff in this array.

Since the tasklet array has TL_CLASSES elements, the TL_* definitions
were moved there as well, but it's not a problem.

The vast majority of the change in this patch is caused by the
renaming of the structures.
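
Roughly, the resulting layout (a sketch with placeholder types):

  enum { TL_URGENT, TL_NORMAL, TL_BULK, TL_HEAVY, TL_CLASSES };

  struct sketch_list { struct sketch_list *n, *p; };

  struct thread_ctx {
      struct sketch_list tasklets[TL_CLASSES]; /* per-class queues */
      /* ... the other former task_per_thread fields ... */
  } __attribute__((aligned(64)));

  /* in .bss, hence zeroed at startup: no memset() in init_task() */
  struct thread_ctx ha_thread_ctx[64];

  static __thread struct thread_ctx *th_ctx;  /* set by ha_set_tid() */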
2021-10-08 17:22:26 +02:00
Willy Tarreau
aa992761d8 CLEANUP: thread: uninline ha_tkill/ha_tkillall/ha_cpu_relax()
These ones are rarely used, or used only to waste CPU cycles while
waiting, and are the last ones requiring system includes in thread.h.
Let's uninline them
and move them to thread.c.
2021-10-07 01:41:15 +02:00
Willy Tarreau
4eeb88363c REORG: thread: move ha_get_pthread_id() to thread.c
It's the last function which directly accesses the pthread_t, so let's
move it to thread.c and leave a static inline for the non-threaded case.
2021-10-07 01:41:14 +02:00
Willy Tarreau
d10385ac4b REORG: thread: move the thread init/affinity/stop to thread.c
haproxy.c still has to deal with pthread-specific low-level stuff that
is OS-dependent. We should not have to deal with this there, and we do
not need to access pthread anywhere else.

Let's move these 3 functions to thread.c and keep empty inline ones for
when threads are disabled.
2021-10-07 01:41:14 +02:00
Willy Tarreau
407ef893e7 REORG: thread: uninline the lock-debugging code
The lock-debugging code in thread.h has no reason to be inlined. The
functions are quite fat and perform a lot of operations so there's no
saving keeping them inlined. Worse, most of them are in fact not
inlined, resulting in a significantly bigger executable.

This patch moves all this part from thread.h to thread.c. The functions
are still exported in thread.h of course. This results in ~166kB less
code:

     text    data     bss     dec     hex filename
  3165938   99424  897376 4162738  3f84b2 haproxy-before
  2991987   99424  897376 3988787  3cdd33 haproxy-after

In addition the build time with thread debugging enabled has shrunk
from 19.2 to 17.7s thanks to much less code to be parsed in thread.h,
which is included virtually everywhere.
2021-10-07 01:36:51 +02:00
Frédéric Lécaille
9fccace8b0 MINOR: quic: Add a lock for RX packets
We must protect the tree storing the QUIC packets received by the dgram
I/O handler from concurrent accesses, as these packets are also parsed
by the xprt task.
2021-09-23 15:27:25 +02:00
Willy Tarreau
81a76f4827 REORG: threads: move ha_get_pthread_id() to tinfo.h
This solely manipulates the thread_info struct, so it ought to be in
tinfo.h, not in thread.h.
2021-09-17 16:08:34 +02:00
Tim Duesterhus
8f1669b10f CLEANUP: Remove prototype for non-existent thread_get_default_count()
This is the only location of `thread_get_default_count` within the codebase.
2021-09-15 11:07:18 +02:00
Tim Duesterhus
992007ec78 CLEANUP: tree-wide: fix prototypes for functions taking no arguments.
"f(void)" is the correct and preferred form for a function taking no
argument, while some places use the older "f()". These were reported
by clang's -Wmissing-prototypes, for example:

  src/cpuset.c:111:5: warning: no previous prototype for function 'ha_cpuset_size' [-Wmissing-prototypes]
  int ha_cpuset_size()
  include/haproxy/cpuset.h:42:5: note: this declaration is not a prototype; add 'void' to make it a prototype for a zero-parameter function
  int ha_cpuset_size();
      ^
                     void

This aggregate patch fixes this for the following functions:

   ha_backtrace_to_stderr(), ha_cpuset_size(), ha_panic(), ha_random64(),
   ha_thread_dump_all_to_trash(), get_exec_path(), check_config_validity(),
   mworker_child_nb(), mworker_cli_proxy_(create|stop)(),
   mworker_cleantasks(), mworker_cleanlisteners(), mworker_ext_launch_all(),
   mworker_reload(), mworker_(env|proc_list)_to_(proc_list|env)(),
   mworker_(un|)block_signals(), proxy_adjust_all_maxconn(),
   proxy_destroy_all_defaults(), get_tainted(),
   pool_total_(allocated|used)(), thread_isolate(_full|)(),
   thread(_sync|)_release(), thread_harmless_till_end(),
   thread_cpu_mask_forced(), dequeue_all_listeners(), next_timer_expiry(),
   wake_expired_tasks(), process_runnable_tasks(), init_acl(),
   init_buffer(), (de|)init_log_buffers(), (de|)init_pollers(),
   fork_poller(), pool_destroy_all(), pool_evict_from_local_caches(),
   pool_total_failures(), dump_pools_to_trash(), cfg_run_diagnostics(),
   tv_init_(process|thread)_date(), __signal_process_queue(),
   deinit_signals(), haproxy_unblock_signals()
2021-09-15 11:07:18 +02:00
Willy Tarreau
8ab9419394 BUILD: threads: fix -Wundef for _POSIX_PRIORITY_SCHEDULING on libmusl
Building with an old musl-based toolchain reported this warning:

  include/haproxy/thread.h: In function 'ha_thread_relax':
  include/haproxy/thread.h:256:5: warning: "_POSIX_PRIORITY_SCHEDULING" is not defined [-Wundef]
   #if _POSIX_PRIORITY_SCHEDULING
       ^

There were indeed two "#if" instead of "#ifdef" for this macro; let's
fix them.
2021-09-15 10:32:12 +02:00
Willy Tarreau
88d1c5d3fb MEDIUM: threads: add a stronger thread_isolate_full() call
The current principle of running under isolation was made to access
sensitive data while being certain that no other thread was using them
in parallel, without necessarily having to place locks everywhere. The
main use case are "show sess" and "show fd" which run over long chains
of pointers.

The thread_isolate() call relies on the "harmless" bit that indicates
for a given thread that it's not currently doing such sensitive things,
which is advertised using thread_harmless_now() and which ends using
thread_harmless_end(), which also waits for possibly concurrent threads
to complete their work if they took this opportunity for starting
something tricky.

As some system calls were notoriously slow (e.g. mmap()), a bunch of
thread_harmless_now() / thread_harmless_end() were placed around them
to let waiting threads do their work while such other threads were not
able to modify memory contents.

But this is not sufficient for performing memory modifications. One such
example is the server deletion code. By modifying memory, it not only
requires that other threads are not playing with it, but also that they
are not in the process of touching it. The fact that a pool_alloc() or pool_free()
on some structure may call thread_harmless_now() and let another thread
start to release the same object's memory is not acceptable.

This patch introduces the concept of "idle threads". Threads entering
the polling loop are idle, as well as those that are waiting for all
others to become idle via the new function thread_isolate_full(). Once
thread_isolate_full() is granted, the thread is not idle anymore, and
it is released using thread_release() just like regular isolation. Its
users have to keep in mind that across this call nothing is granted as
another thread might have performed shared memory modifications. But
such users are extremely rare and are actually expecting this from their
peers as well.
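
Schematically, full isolation is the same loop as plain isolation but
waits for the idle state rather than merely harmless (hypothetical
helper names again):

  #include <stdatomic.h>
  #include <stdbool.h>
  #include <sched.h>

  static atomic_uint isolation_owner;  /* winner's ID + 1, 0 = free */
  extern unsigned my_id(void);
  extern bool all_idle(void);          /* poll loop, or waiting here */
  extern void set_idle(bool on);

  void sketch_thread_isolate_full(void)
  {
      set_idle(true);                  /* we wait, so we are idle too */
      while (1) {
          while (!all_idle())          /* harmless is not enough here */
              sched_yield();
          unsigned expected = 0;
          if (atomic_compare_exchange_strong(&isolation_owner, &expected,
                                             my_id() + 1))
              break;
      }
      set_idle(false);                 /* granted: now working, alone */
  }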

Note that in case of backport, this patch depends on previous patch:
  MINOR: threads: make thread_release() not wait for other ones to complete
2021-08-04 14:49:36 +02:00
Willy Tarreau
16fbdda3c3 MEDIUM: queue: use a dedicated lock for the queues (v2)
Till now whenever a server or proxy's queue was touched, this server
or proxy's lock was taken. Not only does this require distinct code
paths, but it also causes unnecessary contention with other uses of
these locks.

This patch adds a lock inside the "queue" structure that will be used
the same way by the server and the proxy queuing code. The server used
to use a spinlock and the proxy an rwlock, though the queue only used
it for locked writes. This new version uses a spinlock since we don't
need the read lock part here. Tests have not shown any benefit or cost
in using this one versus the rwlock so we could change later if needed.

The lower contention on the locks increases the performance from 362k
to 374k req/s on 16 threads with 20 servers and leastconn. The gain
with roundrobin even increases by 9%.

This is tagged medium because the lock is changed, but no other part of
the code touches the queues, with or without locking, so this should
remain invisible.
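
Schematically, the queue now carries its own lock (placeholder types;
the real code uses HAProxy's HA_SPINLOCK_T):

  #include <pthread.h>

  struct sketch_queue {
      pthread_spinlock_t lock;  /* protects the pendconns below */
      void *pendconns;          /* pending connections */
      unsigned int length;      /* number of entries */
  };

  /* both server and proxy queuing code now take q->lock instead of
   * the server spinlock or the proxy rwlock */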
2021-06-24 10:52:31 +02:00
Willy Tarreau
3f70fb9ea2 Revert "MEDIUM: queue: use a dedicated lock for the queues"
This reverts commit fcb8bf8650.

The recent changes since 5304669e1 ("MEDIUM: queue: make
pendconn_process_next_strm() only return the pendconn") opened a tiny
race condition between stream_free() and process_srv_queue(), as the
pendconn is accessed outside of the lock, possibly while it's being freed. A
different approach is required.
2021-06-24 07:26:28 +02:00
Willy Tarreau
fcb8bf8650 MEDIUM: queue: use a dedicated lock for the queues
Till now whenever a server or proxy's queue was touched, this server
or proxy's lock was taken. Not only does this require distinct code
paths, but it also causes unnecessary contention with other uses of
these locks.

This patch adds a lock inside the "queue" structure that will be used
the same way by the server and the proxy queuing code. The server used
to use a spinlock and the proxy an rwlock, though the queue only used
it for locked writes. This new version uses a spinlock since we don't
need the read lock part here. Tests have not shown any benefit or cost
in using this one versus the rwlock so we could change later if needed.

The lower contention on the locks increases the performance from 491k
to 507k req/s on 16 threads with 20 servers and leastconn. The gain
with roundrobin even increases by 6%.

The performance profile changes from this:
  13.03%  haproxy             [.] fwlc_srv_reposition
   8.08%  haproxy             [.] fwlc_get_next_server
   3.62%  haproxy             [.] process_srv_queue
   1.78%  haproxy             [.] pendconn_dequeue
   1.74%  haproxy             [.] pendconn_add

to this:
  11.95%  haproxy             [.] fwlc_srv_reposition
   7.57%  haproxy             [.] fwlc_get_next_server
   3.51%  haproxy             [.] process_srv_queue
   1.74%  haproxy             [.] pendconn_dequeue
   1.70%  haproxy             [.] pendconn_add

At this point the differences are mostly measurement noise.

This is tagged medium because the lock is changed, but no other part of
the code touches the queues, with or without locking, so this should
remain invisible.
2021-06-22 18:43:56 +02:00
Willy Tarreau
9e467af804 BUG/MEDIUM: shctx: use at least thread-based locking on USE_PRIVATE_CACHE
Since threads were introduced in 1.8, the USE_PRIVATE_CACHE mode of the
shctx was not updated to use locks. Originally it was meant to disable
sharing between processes, so it removed the lock/unlock instructions.
But with threads enabled, it's not possible to work like this anymore.

It's easy to see that once built with private cache and threads enabled,
sending violent SSL traffic to the process instantly makes it die.
The HTTP cache is very likely affected as well.

This patch addresses this by falling back to our native spinlocks when
USE_PRIVATE_CACHE is used. In practice we could use them also for other
modes and remove all older implementations, but this patch aims at keeping
the changes very low and easy to backport. A new SHCTX_LOCK label was
added to help with debugging, but OTHER_LOCK might be usable as well
for backports.

An even lighter approach for backports may consist in always declaring
the lock (or reusing "waiters"), and calling pl_take_s() for the lock()
and pl_drop_s() for the unlock() operation. This could even be used in
all modes (process and threads), even when thread support is disabled.
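
That lighter approach would boil down to something like this (a sketch
of the suggestion above, not committed code):

  /* always declare the lock (or reuse "waiters") and map the shctx
   * operations straight to the plock spinlock primitives: */
  #define shctx_lock(shctx)    pl_take_s(&(shctx)->lock)
  #define shctx_unlock(shctx)  pl_drop_s(&(shctx)->lock)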

Subsequent patches will further clean up this area.

This patch must be backported to all supported versions since 1.8.
2021-06-15 16:52:07 +02:00
Willy Tarreau
8715dec6f9 MEDIUM: pools: remove the locked pools implementation
Now that the modified lockless variant does not need a DWCAS anymore,
there's no reason to keep the much slower locked version, so let's
just get rid of it.
2021-06-10 17:46:50 +02:00
Willy Tarreau
29c460bc07 REORG: threads: move all_thread_mask() to thread.h
It was declared in global.h, forcing plenty of source files to include
it only for this while it's only based on definitions from thread.h.
2021-05-08 12:26:10 +02:00
Amaury Denoyelle
4c9efdecf5 MINOR: thread: implement the detection of forced cpu affinity
Create a function thread_cpu_mask_forced. Its purpose is to report if a
restrictive cpu mask is active for the current process, for example due
to a taskset invocation. It is currently only implemented for the Linux
platform.
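
On Linux this can be done by comparing the process' affinity mask with
the number of online CPUs; a sketch:

  #define _GNU_SOURCE
  #include <sched.h>
  #include <unistd.h>

  /* returns non-zero if fewer CPUs are allowed than the machine has,
   * e.g. when started under taskset (sketch, Linux only) */
  int sketch_thread_cpu_mask_forced(void)
  {
      cpu_set_t set;

      if (sched_getaffinity(0, sizeof(set), &set) != 0)
          return 0;
      return CPU_COUNT(&set) < sysconf(_SC_NPROCESSORS_ONLN);
  }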
2021-04-23 16:06:49 +02:00
Christopher Faulet
76b44195c9 MINOR: threads: Only consider running threads to end a thread harmless period
When a thread ends its harmless period, we must only consider running
threads when testing the threads_want_rdv_mask mask. To do so, we
reintroduce the all_threads_mask mask in the bitwise operation (it was
removed to fix a deadlock).

Note that for now it is useless because there is no way to stop threads or
to have threads reserved for another task. But it is safer this way to avoid
bugs in the future.
2021-04-17 11:14:58 +02:00
Christopher Faulet
f63a185500 BUG/MEDIUM: threads: Ignore current thread to end its harmless period
A previous patch was pushed to fix a deadlock when an isolated thread ends
its harmless period (a9a9e9aac ["BUG/MEDIUM: thread: Fix a deadlock if an
isolated thread is marked as harmless"]). But, unfortunately, the fix is
incomplete. The same must be done in the outer loop, in the
thread_harmless_end() function. The current thread must be ignored when
the threads_want_rdv_mask mask is tested.
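
The tested condition thus presumably takes a shape like this (a sketch,
not the literal diff):

  /* the current thread's bit must be masked out while waiting */
  while (threads_want_rdv_mask & all_threads_mask & ~tid_bit)
      ha_thread_relax();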

This patch must also be backported as far as 2.0.
2021-04-17 11:14:58 +02:00
Willy Tarreau
4781b1521a CLEANUP: atomic/tree-wide: replace single increments/decrements with inc/dec
This patch replaces roughly all occurrences of an HA_ATOMIC_ADD(&foo, 1)
or HA_ATOMIC_SUB(&foo, 1) with the equivalent HA_ATOMIC_INC(&foo) and
HA_ATOMIC_DEC(&foo) respectively. These are 507 changes over 45 files.
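
The conversion is mechanical; inside the HAProxy tree it looks like:

  #include <haproxy/atomic.h>

  static unsigned int jobs;

  static void sketch(void)
  {
      HA_ATOMIC_ADD(&jobs, 1);  /* old form */
      HA_ATOMIC_INC(&jobs);     /* new equivalent */

      HA_ATOMIC_SUB(&jobs, 1);  /* old form */
      HA_ATOMIC_DEC(&jobs);     /* new equivalent */
  }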
2021-04-07 18:18:37 +02:00
Willy Tarreau
49de68520e MEDIUM: streams: do not use the streams lock anymore
The lock was still used exclusively to deal with the concurrency between
the "show sess" release handler and a stream_new() or stream_free() on
another thread. All other accesses made by "show sess" are already done
under thread isolation. The release handler only needs to unlink its
node when stopping in the middle of a dump (error, timeout etc). Let's
just isolate the thread to deal with this case so that it's compatible
with the dump conditions, and remove all remaining locking on the streams.

This effectively kills the streams lock. The measured gain here is around
1.6% with 4 threads (374krps -> 380k).
2021-02-24 13:54:50 +01:00
Willy Tarreau
ccea3c54f4 DEBUG: thread: add 5 extra lock labels for statistics and debugging
Since OTHER_LOCK is commonly used it's become much more difficult to
profile lock contention by temporarily changing a lock label. Let's
add DEBUG1..5 to serve only for debugging. These ones must not be
used in committed code. We could decide to only define them when
DEBUG_THREAD is set but that would complicate attempts at measuring
performance with debugging turned off.
2021-02-18 10:06:45 +01:00
Amaury Denoyelle
5c7086f6b0 MEDIUM: connection: protect idle conn lists with locks
This is a preparation work for connection reuse with sni/proxy
protocol/specific src-dst addresses.

Protect every access to the idle conn lists with a lock. This is not
strictly needed yet because accesses to the lists are made with atomic
operations. However, to be able to reuse connections with specific
parameters, the list storage will be converted to eb-trees. As this
structure has no atomic operations, it is mandatory to protect it
with a lock.

For this, the takeover lock is reused. Its role was to protect
connections during takeover. As it is now extended to general idle conns
usage, it is renamed to idle_conns_lock. A new lock label named
IDLE_CONNS_LOCK is also instantiated to isolate its impact on performance.
2021-02-12 12:33:04 +01:00
William Lallemand
7b41654495 MINOR: ssl: add SSL_SERVER_LOCK label in threads.h
Amaury reported that the commit 3ce6eed ("MEDIUM: ssl: add a rwlock for
SSL server session cache") introduced a warning during compilation:

    include/haproxy/thread.h|411 col 2| warning: enumeration value 'SSL_SERVER_LOCK' not handled in switch [-Wswitch]

This patch fixes the issue by adding the right entry to the switch block.

Must be backported where 3ce6eed is backported. (2.4 only for now)
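
The kind of fix involved, sketched on a minimal label-to-string switch:

  enum lock_label { OTHER_LOCK, SSL_SERVER_LOCK /* , ... */ };

  static const char *sketch_lock_label(enum lock_label label)
  {
      switch (label) {
      case OTHER_LOCK:      return "OTHER";
      case SSL_SERVER_LOCK: return "SSL_SERVER";  /* the added entry */
      }
      return "UNKNOWN";  /* not reached once all values are handled */
  }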
2021-02-10 16:17:19 +01:00
William Lallemand
3ce6eedb37 MEDIUM: ssl: add a rwlock for SSL server session cache
When adding the server side support for certificate update over the CLI
we encountered a design problem with the SSL session cache which was not
locked.

Indeed, once a certificate is updated we need to flush the cache, but we
also need to ensure that the cache is not used during the update.
To prevent the use of the cache during an update, this patch introduces a
rwlock for the SSL server session cache.

In the SSL session part, this patch only locks in read, even though it
writes. The reason is that in the session part there is one cache
storage per thread, so it is not a problem to write to the cache from
several threads. The problem only arises when trying to write to the
cache from the CLI (which could run on any thread) while a session is
trying to access the cache. So there is a write lock in the CLI part to
prevent simultaneous access by a session and the CLI.

This patch also removes the thread_isolate attempt, which was eating too
much CPU time and was not protecting against the use of a freed pointer
in the session.
2021-02-09 09:43:44 +01:00
Willy Tarreau
de785f04e1 MINOR: threads/debug: only report lock stats for used operations
In addition to the previous simplification, most locks don't use the
seek or read lock (e.g. spinlocks etc) so let's split the dump into
distinct operations (write/seek/read) and only report those which
were used. Now the output size is roughly divided by 5 compared
to previous ones.
2020-10-22 17:32:28 +02:00
Willy Tarreau
23d3b00bdd MINOR: threads/debug: only report used lock stats
The lock stats are very verbose and more than half of them are used in
a typical test, making it hard to spot the sought values. Let's simply
report "not used" for those which have not been called at all.
2020-10-22 17:32:28 +02:00
Willy Tarreau
61f799b8da MINOR: threads: add the transitions to/from the seek state
Since our locks are based on progressive locks, we support the upgradable
seek lock that is compatible with readers and upgradable to a write lock.
The main purpose is to take it while seeking down a tree for modification
while other threads may seek the same tree for an input (e.g. compute the
next event date).

The newly supported operations are:

  HA_RWLOCK_SKLOCK(lbl,l)         pl_take_s(l)      /* N --> S */
  HA_RWLOCK_SKTOWR(lbl,l)         pl_stow(l)        /* S --> W */
  HA_RWLOCK_WRTOSK(lbl,l)         pl_wtos(l)        /* W --> S */
  HA_RWLOCK_SKTORD(lbl,l)         pl_stor(l)        /* S --> R */
  HA_RWLOCK_WRTORD(lbl,l)         pl_wtor(l)        /* W --> R */
  HA_RWLOCK_SKUNLOCK(lbl,l)       pl_drop_s(l)      /* S --> N */
  HA_RWLOCK_TRYSKLOCK(lbl,l)      (!pl_try_s(l))    /* N -?> S */
  HA_RWLOCK_TRYRDTOSK(lbl,l)      (!pl_try_rtos(l)) /* R -?> S */

Existing code paths are left unaffected so this patch doesn't affect
any running code.
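
A typical use of the upgradable seek lock then looks like this
(illustrative; any rwlock-protected tree would do):

  #include <haproxy/thread.h>

  extern HA_RWLOCK_T tree_lock;  /* some shared tree's lock */

  static void sketch_tree_update(void)
  {
      HA_RWLOCK_SKLOCK(OTHER_LOCK, &tree_lock);  /* readers may proceed */
      /* ... walk down the tree to the modification point ... */
      HA_RWLOCK_SKTOWR(OTHER_LOCK, &tree_lock);  /* upgrade to write */
      /* ... link/unlink the node ... */
      HA_RWLOCK_WRUNLOCK(OTHER_LOCK, &tree_lock);
  }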
2020-10-16 16:53:46 +02:00
Willy Tarreau
8d5360ca7f MINOR: threads: augment rwlock debugging stats to report seek lock stats
We currently use only read and write lock operations with rwlocks, but
ours also support upgradable seek locks for which we do not report any
stats. Let's add them now when DEBUG_THREAD is enabled.
2020-10-16 16:51:49 +02:00
Willy Tarreau
e4d1505c83 REORG: includes: create tinfo.h for the thread_info struct
The thread_info struct is convenient to store various per-thread info
without having to resort to a painful thread_local storage which is
slow and painful to initialize.

The problem is, by having this one in thread.h it's very difficult to
add more entries there because everyone already includes thread.h so
conversely thread.h cannot reference certain types.

There's no point in having this there; instead let's create a new pair
of files, tinfo{,-t}.h, which declare the structure. This way it will
become possible to extend them with other includes and have certain
files store their own types there.
2020-06-29 09:57:23 +02:00
Willy Tarreau
db57a142c3 BUILD: thread: add parenthesis around values of locking macros
clang just failed on fd.c with this error:

  src/fd.c:491:9: error: logical not is only applied to the left hand side of this comparison [-Werror,-Wlogical-not-parentheses]
          while (HA_SPIN_TRYLOCK(OTHER_LOCK, &log_lock) != 0) {
                 ^                                      ~~
That's because this expands to this:

          while (!pl_try_s(&log_lock) != 0) {

Let's just add parenthesis in the TRYLOCK macros to avoid this.
This may need to be backported if commit df187875d ("BUG/MEDIUM: log:
don't hold the log lock during writev() on a file descriptor") is
backported as well as it seems to be the first one to trigger it.
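
I.e. the trylock macros now expand to a parenthesized expression,
something like:

  /* before: #define HA_SPIN_TRYLOCK(lbl, l)  !pl_try_s(l) */
  #define HA_SPIN_TRYLOCK(lbl, l)  (!pl_try_s(l))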
2020-06-12 11:46:44 +02:00
Willy Tarreau
6784c99463 CLEANUP: include: make atomic.h part of the base API
Atomic ops are used about everywhere, let's make them part of the base
API by including atomic.h in api.h.
2020-06-11 10:18:59 +02:00
Willy Tarreau
3f567e4949 REORG: include: split hathreads into haproxy/thread.h and haproxy/thread-t.h
This splits the hathreads.h file into types+macros and functions. Given
that most users of this file used to include it only to get the definition
of THREAD_LOCAL and MAXTHREADS, the bare minimum was placed into thread-t.h
(i.e. types and macros).

All the thread management was left to haproxy/thread.h. It's worth noting
the drop of the trailing "s" in the name, to remove the permanent confusion
that arises between this one and the system implementation (no "s") and the
makefile's option (no "s").

For consistency, src/hathreads.c was also renamed thread.c.

A number of files were updated to only include thread-t which is the one
they really needed.

Some future improvements are possible like replacing empty inlined
functions with macros for the thread-less case, as building at -O0 disables
inlining and causes these ones to be emitted. But this really is cosmetic.
2020-06-11 10:18:56 +02:00