linux/net
Eric Dumazet 4e45337556 ipv6: avoid overflows in ip6_datagram_send_ctl()
Yiming Qian reported :
<quote>
 I believe I found a locally triggerable kernel bug in the IPv6 sendmsg
 ancillary-data path that can panic the kernel via `skb_under_panic()`
 (local DoS).

 The core issue is a mismatch between:

 - a 16-bit length accumulator (`struct ipv6_txoptions::opt_flen`, type
 `__u16`) and
 - a pointer to the *last* provided destination-options header (`opt->dst1opt`)

 when multiple `IPV6_DSTOPTS` control messages (cmsgs) are provided.

 - `include/net/ipv6.h`:
   - `struct ipv6_txoptions::opt_flen` is `__u16` (wrap possible).
 (lines 291-307, especially 298)
 - `net/ipv6/datagram.c:ip6_datagram_send_ctl()`:
   - Accepts repeated `IPV6_DSTOPTS` and accumulates into `opt_flen`
 without rejecting duplicates. (lines 909-933)
 - `net/ipv6/ip6_output.c:__ip6_append_data()`:
   - Uses `opt->opt_flen + opt->opt_nflen` to compute header
 sizes/headroom decisions. (lines 1448-1466, especially 1463-1465)
 - `net/ipv6/ip6_output.c:__ip6_make_skb()`:
   - Calls `ipv6_push_frag_opts()` if `opt->opt_flen` is non-zero.
 (lines 1930-1934)
 - `net/ipv6/exthdrs.c:ipv6_push_frag_opts()` / `ipv6_push_exthdr()`:
   - Push size comes from `ipv6_optlen(opt->dst1opt)` (based on the
 pointed-to header). (lines 1179-1185 and 1206-1211)

 1. `opt_flen` is a 16-bit accumulator:

 - `include/net/ipv6.h:298` defines `__u16 opt_flen; /* after fragment hdr */`.

 2. `ip6_datagram_send_ctl()` accepts *repeated* `IPV6_DSTOPTS` cmsgs
 and increments `opt_flen` each time:

 - In `net/ipv6/datagram.c:909-933`, for `IPV6_DSTOPTS`:
   - It computes `len = ((hdr->hdrlen + 1) << 3);`
   - It checks `CAP_NET_RAW` using `ns_capable(net->user_ns,
 CAP_NET_RAW)`. (line 922)
   - Then it does:
     - `opt->opt_flen += len;` (line 927)
     - `opt->dst1opt = hdr;` (line 928)

 There is no duplicate rejection here (unlike the legacy
 `IPV6_2292DSTOPTS` path which rejects duplicates at
 `net/ipv6/datagram.c:901-904`).

 If enough large `IPV6_DSTOPTS` cmsgs are provided, `opt_flen` wraps
 while `dst1opt` still points to a large (2048-byte)
 destination-options header.

 In the attached PoC (`poc.c`):

 - 32 cmsgs with `hdrlen=255` => `len = (255+1)*8 = 2048`
 - 1 cmsg with `hdrlen=0` => `len = 8`
 - Total increment: `32*2048 + 8 = 65544`, so `(__u16)opt_flen == 8`
 - The last cmsg is 2048 bytes, so `dst1opt` points to a 2048-byte header.

 3. The transmit path sizes headers using the wrapped `opt_flen`:

- In `net/ipv6/ip6_output.c:1463-1465`:
  - `headersize = sizeof(struct ipv6hdr) + (opt ? opt->opt_flen +
 opt->opt_nflen : 0) + ...;`

 With wrapped `opt_flen`, `headersize`/headroom decisions underestimate
 what will be pushed later.

 4. When building the final skb, the actual push length comes from
 `dst1opt` and is not limited by wrapped `opt_flen`:

 - In `net/ipv6/ip6_output.c:1930-1934`:
   - `if (opt->opt_flen) proto = ipv6_push_frag_opts(skb, opt, proto);`
 - In `net/ipv6/exthdrs.c:1206-1211`, `ipv6_push_frag_opts()` pushes
 `dst1opt` via `ipv6_push_exthdr()`.
 - In `net/ipv6/exthdrs.c:1179-1184`, `ipv6_push_exthdr()` does:
   - `skb_push(skb, ipv6_optlen(opt));`
   - `memcpy(h, opt, ipv6_optlen(opt));`

 With insufficient headroom, `skb_push()` underflows and triggers
 `skb_under_panic()` -> `BUG()`:

 - `net/core/skbuff.c:2669-2675` (`skb_push()` calls `skb_under_panic()`)
 - `net/core/skbuff.c:207-214` (`skb_panic()` ends in `BUG()`)

 - The `IPV6_DSTOPTS` cmsg path requires `CAP_NET_RAW` in the target
 netns user namespace (`ns_capable(net->user_ns, CAP_NET_RAW)`).
 - Root (or any task with `CAP_NET_RAW`) can trigger this without user
 namespaces.
 - An unprivileged `uid=1000` user can trigger this if unprivileged
 user namespaces are enabled and it can create a userns+netns to obtain
 namespaced `CAP_NET_RAW` (the attached PoC does this).

 - Local denial of service: kernel BUG/panic (system crash).
 - Reproducible with a small userspace PoC.
</quote>

This patch does not reject duplicated options, as this might break
some user applications.

Instead, it makes sure to adjust opt_flen and opt_nflen to correctly
reflect the size of the current option headers, preventing the overflows
and the potential for panics.

This applies to IPV6_DSTOPTS, IPV6_HOPOPTS, and IPV6_RTHDR.

Specifically:

When a new IPV6_DSTOPTS is processed, the length of the old opt->dst1opt
is subtracted from opt->opt_flen before adding the new length.

When a new IPV6_HOPOPTS is processed, the length of the old opt->dst0opt
is subtracted from opt->opt_nflen.

When a new Routing Header (IPV6_RTHDR or IPV6_2292RTHDR) is processed,
the length of the old opt->srcrt is subtracted from opt->opt_nflen.

In the special case within IPV6_2292RTHDR handling where dst1opt is moved
to dst0opt, the length of the old opt->dst0opt is subtracted from
opt->opt_nflen before the new one is added.

Fixes: 333fad5364 ("[IPV6]: Support several new sockopt / ancillary data in Advanced API (RFC3542).")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Closes: https://lore.kernel.org/netdev/CAL_bE8JNzawgr5OX5m+3jnQDHry2XxhQT5=jThW1zDPtUikRYA@mail.gmail.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260401154721.3740056-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-02 08:25:22 -07:00
..
6lowpan net: replace ND_PRINTK with dynamic debug 2025-07-10 15:27:32 -07:00
9p Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
802 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
8021q Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
appletalk Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
atm atm: lec: fix use-after-free in sock_def_readable() 2026-03-14 08:05:47 -07:00
ax25 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
batman-adv Here is a batman-adv bugfix: 2026-03-18 17:41:00 -07:00
bluetooth Bluetooth: hci_sync: fix stack buffer overflow in hci_le_big_create_sync 2026-04-01 16:48:28 -04:00
bpf Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
bridge bridge: mrp: reject zero test interval to avoid OOM panic 2026-03-31 16:11:24 +02:00
caif Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
can can: isotp: fix tx.buf use-after-free in isotp_sendmsg() 2026-03-19 17:16:02 +01:00
ceph libceph: Fix potential out-of-bounds access in ceph_handle_auth_reply() 2026-03-11 10:18:56 +01:00
core net: use skb_header_pointer() for TCPv4 GSO frag_off check 2026-03-30 17:35:21 -07:00
dcb Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
devlink Convert 'alloc_flex' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
dns_resolver net/dns_resolver: use credential guards in dns_query() 2025-11-04 12:36:51 +01:00
dsa Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ethernet bonding: prevent potential infinite loop in bond_header_parse() 2026-03-16 19:29:45 -07:00
ethtool Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
handshake treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
hsr net: hsr: fix VLAN add unwind on slave errors 2026-04-02 08:23:49 -07:00
ieee802154 Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
ife
ipv4 ipsec-2026-03-23 2026-03-24 15:16:28 +01:00
ipv6 ipv6: avoid overflows in ip6_datagram_send_ctl() 2026-04-02 08:25:22 -07:00
iucv Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
kcm kcm: fix zero-frag skb in frag_list on partial sendmsg error 2026-02-23 17:26:55 -08:00
key ipsec-2026-03-23 2026-03-24 15:16:28 +01:00
l2tp Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
l3mdev net: fib_rules: Fix iif / oif matching on L3 master device 2025-04-15 17:54:56 -07:00
lapb treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
llc treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
mac80211 wifi: mac80211: always free skb on ieee80211_tx_prepare_skb() failure 2026-03-18 09:09:58 +01:00
mac802154 bonding: prevent potential infinite loop in bond_header_parse() 2026-03-16 19:29:45 -07:00
mctp mctp: route: hold key->lock in mctp_flow_prepare_output() 2026-03-10 11:38:36 +01:00
mpls mpls: add seqcount to protect the platform_label{,s} pair 2026-03-26 18:32:14 -07:00
mptcp mptcp: fix soft lockup in mptcp_recvmsg() 2026-03-31 18:58:37 -07:00
ncsi net: ncsi: fix skb leak in error paths 2026-03-06 17:34:48 -08:00
netfilter netfilter: nf_tables: reject immediate NF_QUEUE verdict 2026-04-01 11:55:30 +02:00
netlabel Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
netlink Convert more 'alloc_obj' cases to default GFP_KERNEL arguments 2026-02-21 20:03:00 -08:00
netrom Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
nfc nfc: nci: fix circular locking dependency in nci_close_device 2026-03-19 16:56:18 -07:00
nsh
openvswitch openvswitch: validate MPLS set/set_masked payload length 2026-03-20 18:37:31 -07:00
packet net: fix fanout UAF in packet_release() via NETDEV_UP race 2026-03-23 17:07:19 -07:00
phonet bonding: prevent potential infinite loop in bond_header_parse() 2026-03-16 19:29:45 -07:00
psample treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
psp Including fixes from IPsec, Bluetooth and netfilter 2026-02-26 08:00:13 -08:00
qrtr net: qrtr: replace qrtr_tx_flow radix_tree with xarray to fix memory leak 2026-03-26 20:22:38 -07:00
rds rds: ib: reject FRMR registration before IB connection is established 2026-04-01 17:52:40 -07:00
rfkill Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
rose net/rose: fix NULL pointer dereference in rose_transmit_link on reconnect 2026-03-12 19:23:59 -07:00
rxrpc rxrpc, afs: Fix missing error pointer check after rxrpc_kernel_lookup_peer() 2026-03-06 17:49:52 -08:00
sched net/sched: cls_flow: fix NULL pointer dereference on shared blocks 2026-04-02 15:08:42 +02:00
sctp Convert 'alloc_obj' family to use the new default GFP_KERNEL argument 2026-02-21 17:09:51 -08:00
shaper net: shaper: protect from late creation of hierarchy 2026-03-19 13:47:15 +01:00
smc net/smc: fix double-free of smc_spd_priv when tee() duplicates splice pipe buffer 2026-03-20 18:59:30 -07:00
strparser Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-11-13 12:35:38 -08:00
sunrpc nfsd-7.0 fixes: 2026-03-18 14:27:11 -07:00
switchdev treewide: Replace kmalloc with kmalloc_obj for non-scalar types 2026-02-21 01:02:28 -08:00
tipc tipc: fix divide-by-zero in tipc_sk_filter_connect() 2026-03-11 18:56:28 -07:00
tls tls: Purge async_hold in tls_decrypt_async_wait() 2026-03-26 09:55:53 +01:00
unix af_unix: Give up GC if MSG_PEEK intervened. 2026-03-12 13:37:18 -07:00
vmw_vsock vsock: initialize child_ns_mode_locked in vsock_net_init() 2026-04-02 08:18:56 -07:00
wireless wifi: cfg80211: cancel pmsr_free_wk in cfg80211_pmsr_wdev_down 2026-03-06 12:41:59 +01:00
x25 net/x25: Fix overflow when accumulating packets 2026-04-02 13:36:08 +02:00
xdp xsk: Fix zero-copy AF_XDP fragment drop 2026-02-28 08:55:11 -08:00
xfrm ipsec-2026-03-23 2026-03-24 15:16:28 +01:00
Kconfig net: Kconfig: discourage drop_monitor enablement 2025-10-17 16:29:26 -07:00
Kconfig.debug
Makefile psp: base PSP device support 2025-09-18 12:32:06 +02:00
compat.c socket: Unify getsockname and getpeername implementation 2025-11-26 13:45:23 -07:00
devres.c
socket.c net: Drop the lock in skb_may_tx_timestamp() 2026-02-24 11:27:29 +01:00
sysctl_net.c