MEDIUM: tcp-act/backend: support for set-bc-{mark,tos} actions

The set-bc-{mark,tos} actions are similar to set-fc-{mark,tos}, but they set
the mark/tos on packets sent from haproxy to the server. They act on the
whole backend/srv connection, from connect() to connection teardown, so they
may only be used before the connection to the server is instantiated. This
makes them relevant only for request-oriented rules such as tcp-request or
http-request rules. For now their use is limited to content request rules,
because tos and mark information is stored directly within the stream, which
requires that the stream already exists.

Stream flags are used in combination with dedicated stream struct members
to pass 'tos' and 'mark' information so that they are correctly considered
during the stream connection assignment logic (prior to actually connecting
to the server).
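
As a rough illustration of the hand-off described above, the following C sketch copies the per-stream settings onto a connection object before the socket is created. The flag values are taken from this commit; `demo_stream`, `demo_conn` and `demo_propagate_bc_opts()` are hypothetical, trimmed-down stand-ins and not HAProxy's actual types or functions.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as they appear in this commit. */
#define SF_BC_MARK     0x01000000 /* stream: mark requested for backend conn */
#define SF_BC_TOS      0x02000000 /* stream: tos requested for backend conn */
#define CO_FL_OPT_MARK 0x00000010 /* connection: mark sockopt must be applied */
#define CO_FL_OPT_TOS  0x00000020 /* connection: tos sockopt must be applied */

/* Hypothetical stand-ins holding only the fields relevant here. */
struct demo_stream {
	int flags;
	uint32_t bc_mark;
	uint8_t bc_tos;
};

struct demo_conn {
	unsigned int flags;
	uint32_t mark;
	uint8_t tos;
};

/* Copies mark/tos from the stream to the connection, but only for the
 * values that a rule actually set on the stream.
 */
void demo_propagate_bc_opts(const struct demo_stream *s, struct demo_conn *c)
{
	if (s->flags & SF_BC_MARK) {
		c->mark = s->bc_mark;
		c->flags |= CO_FL_OPT_MARK;
	}
	if (s->flags & SF_BC_TOS) {
		c->tos = s->bc_tos;
		c->flags |= CO_FL_OPT_TOS;
	}
}
```

Keeping a separate "is set" flag next to each value lets a rule request a mark or tos of 0 explicitly, which would otherwise be indistinguishable from "nothing requested".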

'tos' and 'mark' fd sockopts are taken into account in conn hash
parameters for connection reuse mechanism.
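
To make the hash input concrete, here is a minimal C sketch of the bit layout this commit uses for the mark/tos prehash: the mark in bits 0 to 31, the tos in bits 32 to 39, and two presence bits at positions 40 and 41. The function name `demo_pack_mark_tos()` and its `has_mark`/`has_tos` parameters are hypothetical; in the commit itself the same operations are done inline in connect_server() using the stream flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of HAProxy): packs the optional mark and tos
 * values into one 64-bit word:
 *   bits  0..31 : mark value
 *   bits 32..39 : tos value
 *   bit  40     : mark is set
 *   bit  41     : tos is set
 */
uint64_t demo_pack_mark_tos(int has_mark, uint32_t mark,
                            int has_tos, uint8_t tos)
{
	uint64_t prehash = 0;

	if (has_mark) {
		prehash |= mark;
		prehash |= 1ULL << 40;
	}
	if (has_tos) {
		prehash |= (uint64_t)tos << 32;
		prehash |= 1ULL << 41;
	}
	return prehash;
}
```

With mark = 3 and tos = 7 both set, this yields 0x30700000003. A stream with neither value set yields 0, which is why the hash computation can skip the parameter entirely in that case.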

The documentation was updated accordingly.
Darlelet committed Feb 1, 2024
1 parent b4ee7b0 commit 42a97d9
Showing 9 changed files with 203 additions and 39 deletions.
45 changes: 42 additions & 3 deletions doc/configuration.txt
@@ -7513,9 +7513,10 @@ http-reuse { never | safe | aggressive | always }

When http connection sharing is enabled, a great care is taken to respect the
connection properties and compatibility. Indeed, some properties are specific
and it is not possibly to reuse it blindly. Those are the SSL SNI, source
and destination address and proxy protocol block. A connection is reused only
if it shares the same set of properties with the request.
and it is not possible to reuse it blindly. Those are the SSL SNI, source
and destination address, proxy protocol block as well as tos and mark
sockopts. A connection is reused only if it shares the same set of properties
with the request.

Also note that connections with certain bogus authentication schemes (relying
on the connection) like NTLM are marked private and never shared.
@@ -13791,6 +13792,8 @@ sc-set-gpt X X X X X X X
sc-set-gpt0 X X X X X X X
send-spoe-group - - X X X X -
set-bandwidth-limit - - X X X X -
set-bc-mark - - X - X - -
set-bc-tos - - X - X - -
set-dst X X X - X - -
set-dst-port X X X - X - -
set-fc-mark X X X X X X -
@@ -14701,6 +14704,42 @@ set-bandwidth-limit <name> [limit {<expr> | <size>}] [period {<expr> | <time>}]
See section 9.7 about bandwidth limitation filter setup.


set-bc-mark { <mark> | <expr> }
Usable in: TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft
- | - | X | - | X | - | -

This is used to set the Netfilter/IPFW MARK on the backend connection (all
packets sent to the server) to the value passed in <mark> or <expr> on
platforms which support it. This value is an unsigned 32 bit value which can
be matched by netfilter/ipfw and by the routing table or monitoring the
packets through DTrace. <mark> can be expressed both in decimal or hexadecimal
format (prefixed by "0x"). Alternatively, <expr> can be used: it is a standard
HAProxy expression formed by a sample-fetch followed by some converters which
must resolve to integer type. This action can be useful to force certain
packets to take a different route (for example a cheaper network path for bulk
downloads). This works on Linux kernels 2.6.32 and above and requires admin
privileges, as well as on FreeBSD and OpenBSD. The mark will be set for the whole
duration of the backend/server connection (from connect to close).


set-bc-tos { <tos> | <expr> }
Usable in: TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft
- | - | X | - | X | - | -

This is used to set the TOS or DSCP field value on the backend connection
(all packets sent to the server) to the value passed in <tos> or <expr> on
platforms which support this. This value represents the whole 8 bits of the
IP TOS field. Note that only the 6 higher bits are used in DSCP or TOS, and
the two lower bits are always 0. Alternatively, <expr> can be used: it is a
standard HAProxy expression formed by a sample-fetch followed by some
converters which must resolve to integer type. This action can be used to
adjust some routing behavior on inner routers based on some information from
the request. The tos will be set for the whole duration of the backend/server
connection (from connect to close).

See RFC 2474, 2597, 3260 and 4594 for more information.


set-dst <expr>
Usable in: TCP RqCon| RqSes| RqCnt| RsCnt| HTTP Req| Res| Aft
X | X | X | - | X | - | -
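
The set-bc-tos description above takes the full 8-bit TOS byte, with the DSCP codepoint sitting in the six upper bits. As a small worked example, the C sketch below converts between a DSCP codepoint and the TOS byte value that would be passed as `<tos>`. Both helper names are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a DSCP codepoint (0..63) occupies the six upper bits
 * of the TOS byte, so the byte value is the codepoint shifted left by two.
 */
uint8_t demo_dscp_to_tos(uint8_t dscp)
{
	return (uint8_t)((dscp & 0x3f) << 2);
}

/* Hypothetical helper: recovers the DSCP codepoint from a TOS byte by
 * dropping the two lower bits.
 */
uint8_t demo_tos_to_dscp(uint8_t tos)
{
	return (uint8_t)(tos >> 2);
}
```

For instance, Expedited Forwarding (DSCP 46) corresponds to a TOS byte of 0xb8, and AF11 (DSCP 10) corresponds to 0x28.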
27 changes: 17 additions & 10 deletions include/haproxy/connection-t.h
@@ -88,8 +88,10 @@ enum {
CO_FL_REVERSED = 0x00000004, /* connection has been reversed to backend / reversed and accepted on frontend */
CO_FL_ACT_REVERSING = 0x00000008, /* connection has been reversed to frontend but not yet accepted */

/* unused : 0x00000010 */
/* unused : 0x00000020 */
CO_FL_OPT_MARK = 0x00000010, /* connection has a special sockopt mark */

CO_FL_OPT_TOS = 0x00000020, /* connection has a special sockopt tos */

/* unused : 0x00000040, 0x00000080 */

/* These flags indicate whether the Control and Transport layers are initialized */
@@ -172,13 +174,14 @@ static forceinline char *conn_show_flags(char *buf, size_t len, const char *deli
_(0);
/* flags */
_(CO_FL_SAFE_LIST, _(CO_FL_IDLE_LIST, _(CO_FL_CTRL_READY,
_(CO_FL_REVERSED, _(CO_FL_ACT_REVERSING, _(CO_FL_XPRT_READY,
_(CO_FL_WANT_DRAIN, _(CO_FL_WAIT_ROOM, _(CO_FL_EARLY_SSL_HS, _(CO_FL_EARLY_DATA,
_(CO_FL_SOCKS4_SEND, _(CO_FL_SOCKS4_RECV, _(CO_FL_SOCK_RD_SH, _(CO_FL_SOCK_WR_SH,
_(CO_FL_ERROR, _(CO_FL_FDLESS, _(CO_FL_WAIT_L4_CONN, _(CO_FL_WAIT_L6_CONN,
_(CO_FL_SEND_PROXY, _(CO_FL_ACCEPT_PROXY, _(CO_FL_ACCEPT_CIP, _(CO_FL_SSL_WAIT_HS,
_(CO_FL_PRIVATE, _(CO_FL_RCVD_PROXY, _(CO_FL_SESS_IDLE, _(CO_FL_XPRT_TRACKED
))))))))))))))))))))))))));
_(CO_FL_REVERSED, _(CO_FL_ACT_REVERSING, _(CO_FL_OPT_MARK, _(CO_FL_OPT_TOS,
_(CO_FL_XPRT_READY, _(CO_FL_WANT_DRAIN, _(CO_FL_WAIT_ROOM, _(CO_FL_EARLY_SSL_HS,
_(CO_FL_EARLY_DATA, _(CO_FL_SOCKS4_SEND, _(CO_FL_SOCKS4_RECV, _(CO_FL_SOCK_RD_SH,
_(CO_FL_SOCK_WR_SH, _(CO_FL_ERROR, _(CO_FL_FDLESS, _(CO_FL_WAIT_L4_CONN,
_(CO_FL_WAIT_L6_CONN, _(CO_FL_SEND_PROXY, _(CO_FL_ACCEPT_PROXY, _(CO_FL_ACCEPT_CIP,
_(CO_FL_SSL_WAIT_HS, _(CO_FL_PRIVATE, _(CO_FL_RCVD_PROXY, _(CO_FL_SESS_IDLE,
_(CO_FL_XPRT_TRACKED
))))))))))))))))))))))))))));
/* epilogue */
_(~0U);
return buf;
@@ -497,8 +500,9 @@ enum conn_hash_params_t {
CONN_HASH_PARAMS_TYPE_SRC_ADDR = 0x8,
CONN_HASH_PARAMS_TYPE_SRC_PORT = 0x10,
CONN_HASH_PARAMS_TYPE_PROXY = 0x20,
CONN_HASH_PARAMS_TYPE_MARK_TOS = 0x40,
};
#define CONN_HASH_PARAMS_TYPE_COUNT 6
#define CONN_HASH_PARAMS_TYPE_COUNT 7

#define CONN_HASH_PAYLOAD_LEN \
(((sizeof(((struct conn_hash_node *)0)->node.key)) * 8) - CONN_HASH_PARAMS_TYPE_COUNT)
@@ -513,6 +517,7 @@ enum conn_hash_params_t {
struct conn_hash_params {
uint64_t sni_prehash;
uint64_t proxy_prehash;
uint64_t mark_tos_prehash;
void *target;
struct sockaddr_storage *src_addr;
struct sockaddr_storage *dst_addr;
@@ -581,6 +586,8 @@ struct connection {
enum obj_type *target; /* Listener for active reverse, server for passive. */
struct buffer name; /* Only used for passive reverse. Used as SNI when connection added to server idle pool. */
} reverse;
uint32_t mark; /* set network mark, if CO_FL_OPT_MARK is set */
uint8_t tos; /* set ip tos, if CO_FL_OPT_TOS is set */
};

/* node for backend connection in the idle trees for http-reuse
23 changes: 3 additions & 20 deletions include/haproxy/connection.h
@@ -26,6 +26,7 @@

#include <haproxy/api.h>
#include <haproxy/buf.h>
#include <haproxy/sock.h>
#include <haproxy/connection-t.h>
#include <haproxy/stconn-t.h>
#include <haproxy/fd.h>
@@ -420,19 +421,7 @@ static inline void conn_set_tos(const struct connection *conn, int tos)
if (!conn || !conn_ctrl_ready(conn) || (conn->flags & CO_FL_FDLESS))
return;

#ifdef IP_TOS
if (conn->src->ss_family == AF_INET)
setsockopt(conn->handle.fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));
#endif
#ifdef IPV6_TCLASS
if (conn->src->ss_family == AF_INET6) {
if (IN6_IS_ADDR_V4MAPPED(&((struct sockaddr_in6 *)conn->src)->sin6_addr))
/* v4-mapped addresses need IP_TOS */
setsockopt(conn->handle.fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));
else
setsockopt(conn->handle.fd, IPPROTO_IPV6, IPV6_TCLASS, &tos, sizeof(tos));
}
#endif
sock_set_tos(conn->handle.fd, conn->src, tos);
}

/* Sets the netfilter mark on the connection's socket. The connection is tested
@@ -443,13 +432,7 @@ static inline void conn_set_mark(const struct connection *conn, int mark)
if (!conn || !conn_ctrl_ready(conn) || (conn->flags & CO_FL_FDLESS))
return;

#if defined(SO_MARK)
setsockopt(conn->handle.fd, SOL_SOCKET, SO_MARK, &mark, sizeof(mark));
#elif defined(SO_USER_COOKIE)
setsockopt(conn->handle.fd, SOL_SOCKET, SO_USER_COOKIE, &mark, sizeof(mark));
#elif defined(SO_RTABLE)
setsockopt(conn->handle.fd, SOL_SOCKET, SO_RTABLE, &mark, sizeof(mark));
#endif
sock_set_mark(conn->handle.fd, mark);
}

/* Sets adjust the TCP quick-ack feature on the connection's socket. The
29 changes: 29 additions & 0 deletions include/haproxy/sock.h
@@ -51,6 +51,35 @@ int sock_check_events(struct connection *conn, int event_type);
void sock_ignore_events(struct connection *conn, int event_type);
int _sock_supports_reuseport(const struct proto_fam *fam, int type, int protocol);

/* Sets tos sockopt on socket depending on addr target family */
static inline void sock_set_tos(int fd, struct sockaddr_storage *addr, int tos)
{
#ifdef IP_TOS
if (addr->ss_family == AF_INET)
setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));
#endif
#ifdef IPV6_TCLASS
if (addr->ss_family == AF_INET6) {
if (IN6_IS_ADDR_V4MAPPED(&((struct sockaddr_in6 *)addr)->sin6_addr))
/* v4-mapped addresses need IP_TOS */
setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos));
else
setsockopt(fd, IPPROTO_IPV6, IPV6_TCLASS, &tos, sizeof(tos));
}
#endif
}

/* Sets mark sockopt on socket */
static inline void sock_set_mark(int fd, int mark)
{
#if defined(SO_MARK)
setsockopt(fd, SOL_SOCKET, SO_MARK, &mark, sizeof(mark));
#elif defined(SO_USER_COOKIE)
setsockopt(fd, SOL_SOCKET, SO_USER_COOKIE, &mark, sizeof(mark));
#elif defined(SO_RTABLE)
setsockopt(fd, SOL_SOCKET, SO_RTABLE, &mark, sizeof(mark));
#endif
}

#endif /* _HAPROXY_SOCK_H */
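
The address-family logic in sock_set_tos() above can be sketched in isolation as follows. This is a simplified illustration, not the helper itself: it leaves out the `#ifdef IP_TOS` / `#ifdef IPV6_TCLASS` guards and does not call setsockopt(), it only reports which option would be chosen so that the decision can be exercised without a socket. The enum and the function name `demo_pick_tos_opt()` are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical result type: which socket option the tos value would go to. */
enum demo_tos_opt {
	DEMO_TOS_NONE,        /* family not handled, nothing is set */
	DEMO_TOS_IP_TOS,      /* IPPROTO_IP / IP_TOS */
	DEMO_TOS_IPV6_TCLASS  /* IPPROTO_IPV6 / IPV6_TCLASS */
};

/* Mirrors the family checks of the diff: plain IPv4 and v4-mapped IPv6
 * addresses use IP_TOS, any other IPv6 address uses IPV6_TCLASS.
 */
enum demo_tos_opt demo_pick_tos_opt(const struct sockaddr_storage *addr)
{
	if (addr->ss_family == AF_INET)
		return DEMO_TOS_IP_TOS;

	if (addr->ss_family == AF_INET6) {
		const struct sockaddr_in6 *in6 = (const struct sockaddr_in6 *)addr;

		if (IN6_IS_ADDR_V4MAPPED(&in6->sin6_addr))
			return DEMO_TOS_IP_TOS;
		return DEMO_TOS_IPV6_TCLASS;
	}
	return DEMO_TOS_NONE;
}
```

Note that in this commit sock_create_server_socket() passes the destination address (conn->dst) to the helper, since the source address may not be known yet when the socket is created.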

7 changes: 6 additions & 1 deletion include/haproxy/stream-t.h
@@ -86,6 +86,8 @@
#define SF_SRV_REUSED_ANTICIPATED 0x00200000 /* the connection was reused but the mux is not ready yet */
#define SF_WEBSOCKET 0x00400000 /* websocket stream */ // TODO: must be removed
#define SF_SRC_ADDR 0x00800000 /* get the source ip/port with getsockname */
#define SF_BC_MARK 0x01000000 /* need to set specific mark on backend/srv conn upon connect */
#define SF_BC_TOS 0x02000000 /* need to set specific tos on backend/srv conn upon connect */

/* This function is used to report flags in debugging tools. Please reflect
* below any single-bit flag addition above in the same order via the
@@ -100,7 +102,7 @@ static forceinline char *strm_show_flags(char *buf, size_t len, const char *deli
_(0);
/* flags & enums */
_(SF_IGNORE_PRST, _(SF_SRV_REUSED, _(SF_SRV_REUSED_ANTICIPATED,
_(SF_WEBSOCKET, _(SF_SRC_ADDR)))));
_(SF_WEBSOCKET, _(SF_SRC_ADDR, _(SF_BC_MARK, _(SF_BC_TOS)))))));

_e(SF_FINST_MASK, SF_FINST_R, _e(SF_FINST_MASK, SF_FINST_C,
_e(SF_FINST_MASK, SF_FINST_H, _e(SF_FINST_MASK, SF_FINST_D,
@@ -209,6 +211,9 @@ struct stream {

int flags; /* some flags describing the stream */
unsigned int uniq_id; /* unique ID used for the traces */
uint32_t bc_mark; /* set mark on back conn if SF_BC_MARK is set */
uint8_t bc_tos; /* set tos on back conn if SF_BC_TOS is set */
/* 3 unused bytes here */
enum obj_type *target; /* target to use for this stream */

struct session *sess; /* the session this stream is attached to */
42 changes: 42 additions & 0 deletions src/backend.c
@@ -1430,6 +1430,36 @@ int connect_server(struct stream *s)
}
}

/* 6. Custom mark, tos? */
if (s->flags & (SF_BC_MARK | SF_BC_TOS)) {
/* mark: 32bits, tos: 8bits = 40bits
* last 2 bits are there to indicate if mark and/or tos are set
* total: 42bits:
*
* 63==== (unused) ====42 39----32 31-----------------------------0
* 0000000000000000000000 11 00000111 00000000000000000000000000000011
* ^^ ^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* || | |
* / \ \ \
* / \ \ \
* tos? mark? \ mark value (32bits)
* tos value (8bits)
* ie: in the above example:
* - mark is set, mark = 3
* - tos is set, tos = 7
*/
if (s->flags & SF_BC_MARK) {
hash_params.mark_tos_prehash |= s->bc_mark;
/* 41st bit: mark set */
hash_params.mark_tos_prehash |= 1ULL << 40;
}
if (s->flags & SF_BC_TOS) {
hash_params.mark_tos_prehash |= (uint64_t)s->bc_tos << 32;
/* 42nd bit: tos set */
hash_params.mark_tos_prehash |= 1ULL << 41;
}
}

hash = conn_calculate_hash(&hash_params);

/* first, search for a matching connection in the session's idle conns */
@@ -1617,6 +1647,18 @@ int connect_server(struct stream *s)
srv_conn->src = bind_addr;
bind_addr = NULL;

/* mark? */
if (s->flags & SF_BC_MARK) {
srv_conn->mark = s->bc_mark;
srv_conn->flags |= CO_FL_OPT_MARK;
}

/* tos? */
if (s->flags & SF_BC_TOS) {
srv_conn->tos = s->bc_tos;
srv_conn->flags |= CO_FL_OPT_TOS;
}

srv_conn->hash_node->node.key = hash;
}
}
6 changes: 6 additions & 0 deletions src/connection.c
@@ -2651,6 +2651,12 @@ uint64_t conn_calculate_hash(const struct conn_hash_params *params)
&hash_flags, CONN_HASH_PARAMS_TYPE_PROXY);
}

if (params->mark_tos_prehash) {
conn_hash_update(&hash,
&params->mark_tos_prehash, sizeof(params->mark_tos_prehash),
&hash_flags, CONN_HASH_PARAMS_TYPE_MARK_TOS);
}

return conn_hash_digest(&hash, hash_flags);
}

17 changes: 14 additions & 3 deletions src/sock.c
@@ -197,12 +197,14 @@ struct connection *sock_accept_conn(struct listener *l, int *status)

/* Create a socket to connect to the server in conn->dst (which MUST be valid),
* using the configured namespace if needed, or the one passed by the proxy
* protocol if required to do so. It ultimately calls socket() or socketat()
* and returns the FD or error code.
* protocol if required to do so. It then calls socket() or socketat(). On
* success, checks if mark or tos sockopts need to be set on the file handle.
* Returns the FD or error code.
*/
int sock_create_server_socket(struct connection *conn)
{
const struct netns_entry *ns = NULL;
int sock;

#ifdef USE_NS
if (objt_server(conn->target)) {
@@ -212,7 +214,16 @@ int sock_create_server_socket(struct connection *conn)
ns = __objt_server(conn->target)->netns;
}
#endif
return my_socketat(ns, conn->dst->ss_family, SOCK_STREAM, 0);
sock = my_socketat(ns, conn->dst->ss_family, SOCK_STREAM, 0);
if (sock == -1)
goto end;
if (conn->flags & CO_FL_OPT_MARK)
sock_set_mark(sock, conn->mark);
if (conn->flags & CO_FL_OPT_TOS)
sock_set_tos(sock, conn->dst, conn->tos);

end:
return sock;
}

/* Enables receiving on receiver <rx> once already bound. */