cumulative numbers of http request and http errors of counters tracked at
the session level and their rates can now be updated at the session level
thanks to two new functions. These functions are not used for now, but it
will be called to keep tracked counters up-to-date if an error occurs before
the stream creation.
Baptiste reported a new crash affecting 2.3 which can be triggered
when using H2 on the backend, with http-reuse always and with a tens
of clients doing close only. There are a few combined cases which cause
this to happen, but each time the issue is the same, an already freed
session is dereferenced in session_unown_conn().
Two cases were identified to cause this:
- a connection referencing a session as its owner, which is detached
from the session's list and is destroyed after this session ends.
The test on conn->owner before calling session_unown_conn() is not
sufficent as the pointer is not null but is not valid anymore.
- a connection that never goes idle and that gets killed form the
mux, where session_free() is called first, then conn_free() calls
session_unown_conn() which scans the just freed session for older
connections. This one is only triggered with DEBUG_UAF
The reason for this session to be present here is that it's needed during
the connection setup, to be passed to conn_install_mux_be() to mux->init()
as the owning session, but it's never deleted aftrewards. Furthermore, even
conn_session_free() doesn't delete this pointer after freeing the session
that lies there. Both do definitely result in a use-after-free that's more
easily triggered under DEBUG_UAF.
This patch makes sure that the owner is always deleted after detaching
or killing the session. However it is currently not possible to clear
the owner right after a synchronous init because the proxy protocol
apparently needs it (a reg test checks this), and if we leave it past
the connection setup with the session not attached anywhere, it's hard
to catch the right moment to detach it. This means that the session may
remain in conn->owner as long as the connection has never been added to
nor removed from the session's idle list. Given that this patch needs to
remain simple enough to be backported, instead it adds a workaround in
session_unown_conn() to detect that the element is already not attached
anywhere.
This fix absolutely requires previous patch "CLEANUP: connection: do not
use conn->owner when the session is known" otherwise the situation will
be even worse, as some places used to rely on conn->owner instead of the
session.
The fix could theorically be backported as far as 1.8. However, the code
in this area has significantly changed along versions and there are more
risks of breaking working stuff than fixing real issues there. The issue
was really woken up in two steps during 2.3-dev when slightly reworking
the idle conns with commit 08016ab82 ("MEDIUM: connection: Add private
connections synchronously in session server list") and when adding
support for storing used H2 connections in the session and adding the
necessary call to session_unown_conn() in the muxes. But the same test
managed to crash 2.2 when built in DEBUG_UAF and patched like this,
proving that we used to already leave dangling pointers behind us:
| diff --git a/include/haproxy/connection.h b/include/haproxy/connection.h
| index f8f235c1a..dd30b5f80 100644
| --- a/include/haproxy/connection.h
| +++ b/include/haproxy/connection.h
| @@ -458,6 +458,10 @@ static inline void conn_free(struct connection *conn)
| sess->idle_conns--;
| session_unown_conn(sess, conn);
| }
| + else {
| + struct session *sess = conn->owner;
| + BUG_ON(sess && sess->origin != &conn->obj_type);
| + }
|
| sockaddr_free(&conn->src);
| sockaddr_free(&conn->dst);
It's uncertain whether an existing code path there can lead to dereferencing
conn->owner when it's bad, though certain suspicious memory corruption bugs
make one think it's a likely candidate. The patch should not be hard to
adapt there.
Backports to 2.1 and older are left to the appreciation of the person
doing the backport.
A reproducer consists in this:
global
nbthread 1
listen l
bind :9000
mode http
http-reuse always
server s 127.0.0.1:8999 proto h2
frontend f
bind :8999 proto h2
mode http
http-request return status 200
Then this will make it crash within 2-3 seconds:
$ h1load -e -r 1 -c 10 http://0:9000/
If it does not, it might be that DEBUG_UAF was not used (it's harder then)
and it might be useful to restart.
At a few places we used to rely on conn->owner to retrieve the session
while the session is already known. This is not correct because at some
of these points the reason the connection's owner was still the session
(instead of NULL) is a mistake. At one place a comparison is even made
between the session and conn->owner assuming it's valid without checking
if it's NULL. Let's clean this up to use the session all the time.
Note that this will be needed for a forthcoming fix and will have to be
backported.
Till now we would keep a per-thread queue of pending incoming connections
for which we would store:
- the listener
- the accepted FD
- the source address
- the source address' length
And these elements were first used in session_accept_fd() running on the
target thread to allocate a connection and duplicate them again. Doing
this induces various problems. The first one is that session_accept_fd()
may only run on file descriptors and cannot be reused for QUIC. The second
issue is that it induces lots of memory copies and that the listerner
queue thrashes a lot of cache, consuming 64 bytes per entry.
This patch changes this by allocating the connection before queueing it,
and by only placing the connection's pointer into the queue. Indeed, the
first two calls used to initialize the connection already store all the
information above, which can be retrieved from the connection pointer
alone. So we just have to pop one pointer from the target thread, and
pass it to session_accept_fd() which only needs the FD for the final
settings.
This starts to make the accept path a bit more transport-agnostic, and
saves memory and CPU cycles at the same time (1% connection rate increase
was noticed with 4 threads). Thanks to dividing the accept-queue entry
size from 64 to 8 bytes, its size could be increased from 256 to 1024
connections while still dividing the overall size by two. No single
queue full condition was met.
One minor drawback is that connection may be allocated from one thread's
pool to be used into another one. But this already happens a lot with
connection reuse so there is really nothing new here.
The session_get_conn() must now be used to look for an available connection
matching a specific target for a given session. This simplifies a bit the
connect_server() function.
When a connection is marked as private, it is now added in the session server
list. We don't wait a stream is detached from the mux to do so. When the
connection is created, this happens after the mux creation. Otherwise, it is
performed when the connection is marked as private.
To allow that, when a connection is created, the session is systematically set
as the connectin owner. Thus, a backend connection has always a owner during its
creation. And a private connection has always a owner until its death.
Note that outside the detach() callback, if the call to session_add_conn()
failed, the error is ignored. In this situation, we retry to add the connection
into the session server list in the detach() callback. If this fails at this
step, the multiplexer is destroyed and the connection is closed.
extern struct dict server_name_dict was moved from the type file to the
main file. A handful of inlined functions were moved at the bottom of
the file. Call places were updated to use server-t.h when relevant, or
to simply drop the entry when not needed.