Commit Graph

82752 Commits

Author SHA1 Message Date
Vladimir Oltean dfca045cd4 net: dsa: fix off-by-one in maximum bridge ID determination
Prior to the blamed commit, the bridge_num range was from
0 to ds->max_num_bridges - 1. After the commit, it is from
1 to ds->max_num_bridges.

So this check:
	if (bridge_num >= max)
		return 0;
must be updated to:
	if (bridge_num > max)
		return 0;

in order to allow the last bridge_num value (==max) to be used.

This is easiest visible when a driver sets ds->max_num_bridges=1.
The observed behaviour is that even the first created bridge triggers
the netlink extack "Range of offloadable bridges exceeded" warning, and
is handled in software rather than being offloaded.

Fixes: 3f9bb0301d ("net: dsa: make dp->bridge_num one-based")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20260120211039.3228999-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-21 19:52:29 -08:00
Jeongjun Park ba1096c315 netrom: fix double-free in nr_route_frame()
In nr_route_frame(), old_skb is immediately freed without checking if
nr_neigh->ax25 pointer is NULL. Therefore, if nr_neigh->ax25 is NULL,
the caller function will free old_skb again, causing a double-free bug.

Therefore, to prevent this, we need to modify it to check whether
nr_neigh->ax25 is NULL before freeing old_skb.

Cc: <stable@vger.kernel.org>
Reported-by: syzbot+999115c3bf275797dc27@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69694d6f.050a0220.58bed.0029.GAE@google.com/
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Link: https://patch.msgid.link/20260119063359.10604-1-aha310510@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 19:15:40 -08:00
Eric Dumazet 9a063f96d8 ipv6: annotate data-race in ndisc_router_discovery()
syzbot found that ndisc_router_discovery() could read and write
in6_dev->ra_mtu without holding a lock [1]

This looks fine, IFLA_INET6_RA_MTU is best effort.

Add READ_ONCE()/WRITE_ONCE() to document the race.

Note that we might also reject illegal MTU values
(mtu < IPV6_MIN_MTU || mtu > skb->dev->mtu) in a future patch.

[1]
BUG: KCSAN: data-race in ndisc_router_discovery / ndisc_router_discovery

read to 0xffff888119809c20 of 4 bytes by task 25817 on cpu 1:
  ndisc_router_discovery+0x151d/0x1c90 net/ipv6/ndisc.c:1558
  ndisc_rcv+0x2ad/0x3d0 net/ipv6/ndisc.c:1841
  icmpv6_rcv+0xe5a/0x12f0 net/ipv6/icmp.c:989
  ip6_protocol_deliver_rcu+0xb2a/0x10d0 net/ipv6/ip6_input.c:438
  ip6_input_finish+0xf0/0x1d0 net/ipv6/ip6_input.c:489
  NF_HOOK include/linux/netfilter.h:318 [inline]
  ip6_input+0x5e/0x140 net/ipv6/ip6_input.c:500
  ip6_mc_input+0x27c/0x470 net/ipv6/ip6_input.c:590
  dst_input include/net/dst.h:474 [inline]
  ip6_rcv_finish+0x336/0x340 net/ipv6/ip6_input.c:79
...

write to 0xffff888119809c20 of 4 bytes by task 25816 on cpu 0:
  ndisc_router_discovery+0x155a/0x1c90 net/ipv6/ndisc.c:1559
  ndisc_rcv+0x2ad/0x3d0 net/ipv6/ndisc.c:1841
  icmpv6_rcv+0xe5a/0x12f0 net/ipv6/icmp.c:989
  ip6_protocol_deliver_rcu+0xb2a/0x10d0 net/ipv6/ip6_input.c:438
  ip6_input_finish+0xf0/0x1d0 net/ipv6/ip6_input.c:489
  NF_HOOK include/linux/netfilter.h:318 [inline]
  ip6_input+0x5e/0x140 net/ipv6/ip6_input.c:500
  ip6_mc_input+0x27c/0x470 net/ipv6/ip6_input.c:590
  dst_input include/net/dst.h:474 [inline]
  ip6_rcv_finish+0x336/0x340 net/ipv6/ip6_input.c:79
...

value changed: 0x00000000 -> 0xe5400659

Fixes: 49b99da2c9 ("ipv6: add IFLA_INET6_RA_MTU to expose mtu value")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rocco Yue <rocco.yue@mediatek.com>
Link: https://patch.msgid.link/20260118152941.2563857-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20 18:37:45 -08:00
Jamal Hadi Salim d837fbee92 net/sched: qfq: Use cl_is_active to determine whether class is active in qfq_rm_from_ag
This is more of a preventive patch to make the code more consistent and
to prevent possible exploits that employ child qlen manipulations on qfq.
use cl_is_active instead of relying on the child qdisc's qlen to determine
class activation.

Fixes: 462dbc9101 ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260114160243.913069-3-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 12:06:41 -08:00
Jamal Hadi Salim 50da4b9d07 net/sched: Enforce that teql can only be used as root qdisc
Design intent of teql is that it is only supposed to be used as root qdisc.
We need to check for that constraint.

Although not important, I will describe the scenario that unearthed this
issue for the curious.

GangMin Kim <km.kim1503@gmail.com> managed to concot a scenario as follows:

ROOT qdisc 1:0 (QFQ)
  ├── class 1:1 (weight=15, lmax=16384) netem with delay 6.4s
  └── class 1:2 (weight=1, lmax=1514) teql

GangMin sends a packet which is enqueued to 1:1 (netem).
Any invocation of dequeue by QFQ from this class will not return a packet
until after 6.4s. In the meantime, a second packet is sent and it lands on
1:2. teql's enqueue will return success and this will activate class 1:2.
Main issue is that teql only updates the parent visible qlen (sch->q.qlen)
at dequeue. Since QFQ will only call dequeue if peek succeeds (and teql's
peek always returns NULL), dequeue will never be called and thus the qlen
will remain as 0. With that in mind, when GangMin updates 1:2's lmax value,
the qfq_change_class calls qfq_deact_rm_from_agg. Since the child qdisc's
qlen was not incremented, qfq fails to deactivate the class, but still
frees its pointers from the aggregate. So when the first packet is
rescheduled after 6.4 seconds (netem's delay), a dangling pointer is
accessed causing GangMin's causing a UAF.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: GangMin Kim <km.kim1503@gmail.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260114160243.913069-2-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 12:06:41 -08:00
David Howells 2c28769a51 rxrpc: Fix recvmsg() unconditional requeue
If rxrpc_recvmsg() fails because MSG_DONTWAIT was specified but the call at
the front of the recvmsg queue already has its mutex locked, it requeues
the call - whether or not the call is already queued.  The call may be on
the queue because MSG_PEEK was also passed and so the call was not dequeued
or because the I/O thread requeued it.

The unconditional requeue may then corrupt the recvmsg queue, leading to
things like UAFs or refcount underruns.

Fix this by only requeuing the call if it isn't already on the queue - and
moving it to the front if it is already queued.  If we don't queue it, we
have to put the ref we obtained by dequeuing it.

Also, MSG_PEEK doesn't dequeue the call so shouldn't call
rxrpc_notify_socket() for the call if we didn't use up all the data on the
queue, so fix that also.

Fixes: 540b1c48c3 ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
Reported-by: Faith <faith@zellic.io>
Reported-by: Pumpkin Chang <pumpkin@devco.re>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Marc Dionne <marc.dionne@auristor.com>
cc: Nir Ohfeld <niro@wiz.io>
cc: Willy Tarreau <w@1wt.eu>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org
cc: stable@kernel.org
Link: https://patch.msgid.link/95163.1768428203@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 10:07:06 -08:00
Eric Dumazet 7a29f6bf60 l2tp: avoid one data-race in l2tp_tunnel_del_work()
We should read sk->sk_socket only when dealing with kernel sockets.

syzbot reported the following data-race:

BUG: KCSAN: data-race in l2tp_tunnel_del_work / sk_common_release

write to 0xffff88811c182b20 of 8 bytes by task 5365 on cpu 0:
  sk_set_socket include/net/sock.h:2092 [inline]
  sock_orphan include/net/sock.h:2118 [inline]
  sk_common_release+0xae/0x230 net/core/sock.c:4003
  udp_lib_close+0x15/0x20 include/net/udp.h:325
  inet_release+0xce/0xf0 net/ipv4/af_inet.c:437
  __sock_release net/socket.c:662 [inline]
  sock_close+0x6b/0x150 net/socket.c:1455
  __fput+0x29b/0x650 fs/file_table.c:468
  ____fput+0x1c/0x30 fs/file_table.c:496
  task_work_run+0x131/0x1a0 kernel/task_work.c:233
  resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
  __exit_to_user_mode_loop kernel/entry/common.c:44 [inline]
  exit_to_user_mode_loop+0x1fe/0x740 kernel/entry/common.c:75
  __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
  syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
  syscall_exit_to_user_mode_work include/linux/entry-common.h:159 [inline]
  syscall_exit_to_user_mode include/linux/entry-common.h:194 [inline]
  do_syscall_64+0x1e1/0x2b0 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff88811c182b20 of 8 bytes by task 827 on cpu 1:
  l2tp_tunnel_del_work+0x2f/0x1a0 net/l2tp/l2tp_core.c:1418
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0x4ce/0x9d0 kernel/workqueue.c:3340
  worker_thread+0x582/0x770 kernel/workqueue.c:3421
  kthread+0x489/0x510 kernel/kthread.c:463
  ret_from_fork+0x149/0x290 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

value changed: 0xffff88811b818000 -> 0x0000000000000000

Fixes: d00fa9adc5 ("l2tp: fix races with tunnel socket close")
Reported-by: syzbot+7312e82745f7fa2526db@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6968b029.050a0220.58bed.0016.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20260115092139.3066180-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-19 09:55:41 -08:00
Kuniyuki Iwashima 7a9bc9e3f4 fou: Don't allow 0 for FOU_ATTR_IPPROTO.
fou_udp_recv() has the same problem mentioned in the previous
patch.

If FOU_ATTR_IPPROTO is set to 0, skb is not freed by
fou_udp_recv() nor "resubmit"-ted in ip_protocol_deliver_rcu().

Let's forbid 0 for FOU_ATTR_IPPROTO.

Fixes: 23461551c0 ("fou: Support for foo-over-udp RX path")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260115172533.693652-4-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 16:00:24 -08:00
Kuniyuki Iwashima 9a56796ad2 gue: Fix skb memleak with inner IP protocol 0.
syzbot reported skb memleak below. [0]

The repro generated a GUE packet with its inner protocol 0.

gue_udp_recv() returns -guehdr->proto_ctype for "resubmit"
in ip_protocol_deliver_rcu(), but this only works with
non-zero protocol number.

Let's drop such packets.

Note that 0 is a valid number (IPv6 Hop-by-Hop Option).

I think it is not practical to encap HOPOPT in GUE, so once
someone starts to complain, we could pass down a resubmit
flag pointer to distinguish two zeros from the upper layer:

  * no error
  * resubmit HOPOPT

[0]
BUG: memory leak
unreferenced object 0xffff888109695a00 (size 240):
  comm "syz.0.17", pid 6088, jiffies 4294943096
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 40 c2 10 81 88 ff ff 00 00 00 00 00 00 00 00  .@..............
  backtrace (crc a84b336f):
    kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline]
    slab_post_alloc_hook mm/slub.c:4958 [inline]
    slab_alloc_node mm/slub.c:5263 [inline]
    kmem_cache_alloc_noprof+0x3b4/0x590 mm/slub.c:5270
    __build_skb+0x23/0x60 net/core/skbuff.c:474
    build_skb+0x20/0x190 net/core/skbuff.c:490
    __tun_build_skb drivers/net/tun.c:1541 [inline]
    tun_build_skb+0x4a1/0xa40 drivers/net/tun.c:1636
    tun_get_user+0xc12/0x2030 drivers/net/tun.c:1770
    tun_chr_write_iter+0x71/0x120 drivers/net/tun.c:1999
    new_sync_write fs/read_write.c:593 [inline]
    vfs_write+0x45d/0x710 fs/read_write.c:686
    ksys_write+0xa7/0x170 fs/read_write.c:738
    do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
    do_syscall_64+0xa4/0xf80 arch/x86/entry/syscall_64.c:94
    entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 37dd024779 ("gue: Receive side for Generic UDP Encapsulation")
Reported-by: syzbot+4d8c7d16b0e95c0d0f0d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6965534b.050a0220.38aacd.0001.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260115172533.693652-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 16:00:23 -08:00
Xin Long a80c9d945a sctp: move SCTP_CMD_ASSOC_SHKEY right after SCTP_CMD_PEER_INIT
A null-ptr-deref was reported in the SCTP transmit path when SCTP-AUTH key
initialization fails:

  ==================================================================
  KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
  CPU: 0 PID: 16 Comm: ksoftirqd/0 Tainted: G W 6.6.0 #2
  RIP: 0010:sctp_packet_bundle_auth net/sctp/output.c:264 [inline]
  RIP: 0010:sctp_packet_append_chunk+0xb36/0x1260 net/sctp/output.c:401
  Call Trace:

  sctp_packet_transmit_chunk+0x31/0x250 net/sctp/output.c:189
  sctp_outq_flush_data+0xa29/0x26d0 net/sctp/outqueue.c:1111
  sctp_outq_flush+0xc80/0x1240 net/sctp/outqueue.c:1217
  sctp_cmd_interpreter.isra.0+0x19a5/0x62c0 net/sctp/sm_sideeffect.c:1787
  sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
  sctp_do_sm+0x1a3/0x670 net/sctp/sm_sideeffect.c:1169
  sctp_assoc_bh_rcv+0x33e/0x640 net/sctp/associola.c:1052
  sctp_inq_push+0x1dd/0x280 net/sctp/inqueue.c:88
  sctp_rcv+0x11ae/0x3100 net/sctp/input.c:243
  sctp6_rcv+0x3d/0x60 net/sctp/ipv6.c:1127

The issue is triggered when sctp_auth_asoc_init_active_key() fails in
sctp_sf_do_5_1C_ack() while processing an INIT_ACK. In this case, the
command sequence is currently:

- SCTP_CMD_PEER_INIT
- SCTP_CMD_TIMER_STOP (T1_INIT)
- SCTP_CMD_TIMER_START (T1_COOKIE)
- SCTP_CMD_NEW_STATE (COOKIE_ECHOED)
- SCTP_CMD_ASSOC_SHKEY
- SCTP_CMD_GEN_COOKIE_ECHO

If SCTP_CMD_ASSOC_SHKEY fails, asoc->shkey remains NULL, while
asoc->peer.auth_capable and asoc->peer.peer_chunks have already been set by
SCTP_CMD_PEER_INIT. This allows a DATA chunk with auth = 1 and shkey = NULL
to be queued by sctp_datamsg_from_user().

Since command interpretation stops on failure, no COOKIE_ECHO should been
sent via SCTP_CMD_GEN_COOKIE_ECHO. However, the T1_COOKIE timer has already
been started, and it may enqueue a COOKIE_ECHO into the outqueue later. As
a result, the DATA chunk can be transmitted together with the COOKIE_ECHO
in sctp_outq_flush_data(), leading to the observed issue.

Similar to the other places where it calls sctp_auth_asoc_init_active_key()
right after sctp_process_init(), this patch moves the SCTP_CMD_ASSOC_SHKEY
immediately after SCTP_CMD_PEER_INIT, before stopping T1_INIT and starting
T1_COOKIE. This ensures that if shared key generation fails, authenticated
DATA cannot be sent. It also allows the T1_INIT timer to retransmit INIT,
giving the client another chance to process INIT_ACK and retry key setup.

Fixes: 730fc3d05c ("[SCTP]: Implete SCTP-AUTH parameter processing")
Reported-by: Zhen Chen <chenzhen126@huawei.com>
Tested-by: Zhen Chen <chenzhen126@huawei.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/44881224b375aa8853f5e19b4055a1a56d895813.1768324226.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17 15:10:34 -08:00
Kuniyuki Iwashima 4d10edfd14 l2tp: Fix memleak in l2tp_udp_encap_recv().
syzbot reported memleak of struct l2tp_session, l2tp_tunnel,
sock, etc. [0]

The cited commit moved down the validation of the protocol
version in l2tp_udp_encap_recv().

The new place requires an extra error handling to avoid the
memleak.

Let's call l2tp_session_put() there.

[0]:
BUG: memory leak
unreferenced object 0xffff88810a290200 (size 512):
  comm "syz.0.17", pid 6086, jiffies 4294944299
  hex dump (first 32 bytes):
    7d eb 04 0c 00 00 00 00 01 00 00 00 00 00 00 00  }...............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc babb6a4f):
    kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline]
    slab_post_alloc_hook mm/slub.c:4958 [inline]
    slab_alloc_node mm/slub.c:5263 [inline]
    __do_kmalloc_node mm/slub.c:5656 [inline]
    __kmalloc_noprof+0x3e0/0x660 mm/slub.c:5669
    kmalloc_noprof include/linux/slab.h:961 [inline]
    kzalloc_noprof include/linux/slab.h:1094 [inline]
    l2tp_session_create+0x3a/0x3b0 net/l2tp/l2tp_core.c:1778
    pppol2tp_connect+0x48b/0x920 net/l2tp/l2tp_ppp.c:755
    __sys_connect_file+0x7a/0xb0 net/socket.c:2089
    __sys_connect+0xde/0x110 net/socket.c:2108
    __do_sys_connect net/socket.c:2114 [inline]
    __se_sys_connect net/socket.c:2111 [inline]
    __x64_sys_connect+0x1c/0x30 net/socket.c:2111
    do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
    do_syscall_64+0xa4/0xf80 arch/x86/entry/syscall_64.c:94
    entry_SYSCALL_64_after_hwframe+0x77/0x7f

Fixes: 364798056f ("l2tp: Support different protocol versions with same IP/port quadruple")
Reported-by: syzbot+2c42ea4485b29beb0643@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/696693f2.a70a0220.245e30.0001.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20260113185446.2533333-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15 19:56:45 -08:00
Eric Dumazet 220d89df1d net: add skb->data_len and (skb>end - skb->tail) to skb_dump()
While working on a syzbot report, I found that skb_dump()
is lacking two important parts :

- skb->data_len.

- (skb>end - skb->tail) tailroom is zero if skb is not linear.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112172621.4188700-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15 19:49:47 -08:00
Michal Luczaj 0386bd321d vsock/virtio: Coalesce only linear skb
vsock/virtio common tries to coalesce buffers in rx queue: if a linear skb
(with a spare tail room) is followed by a small skb (length limited by
GOOD_COPY_LEN = 128), an attempt is made to join them.

Since the introduction of MSG_ZEROCOPY support, assumption that a small skb
will always be linear is incorrect. In the zerocopy case, data is lost and
the linear skb is appended with uninitialized kernel memory.

Of all 3 supported virtio-based transports, only loopback-transport is
affected. G2H virtio-transport rx queue operates on explicitly linear skbs;
see virtio_vsock_alloc_linear_skb() in virtio_vsock_rx_fill(). H2G
vhost-transport may allocate non-linear skbs, but only for sizes that are
not considered for coalescence; see PAGE_ALLOC_COSTLY_ORDER in
virtio_vsock_alloc_skb().

Ensure only linear skbs are coalesced. Note that skb_tailroom(last_skb) > 0
guarantees last_skb is linear.

Fixes: 581512a6dc ("vsock/virtio: MSG_ZEROCOPY flag support")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260113-vsock-recv-coalescence-v2-1-552b17837cf4@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15 19:44:44 -08:00
Linus Torvalds 9e995c573b Including fixes from bluetooth, can and IPsec.
Current release - regressions:
 
   - net: add net.core.qdisc_max_burst
 
   - can: propagate CAN device capabilities via ml_priv
 
 Previous releases - regressions:
 
   - dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list()
 
   - ipv6: fix use-after-free in inet6_addr_del().
 
   - xfrm: fix inner mode lookup in tunnel mode GSO segmentation
 
   - ip_tunnel: spread netdev_lockdep_set_classes()
 
   - ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv()
 
   - bluetooth: hci_sync: enable PA sync lost event
 
   - eth: virtio-net:
     - fix the deadlock when disabling rx NAPI
     - fix misalignment bug in struct virtnet_info
 
 Previous releases - always broken:
 
   - ipv4: ip_gre: make ipgre_header() robust
 
   - can: fix SSP_SRC in cases when bit-rate is higher than 1 MBit.
 
   - eth: mlx5e: profile change fix
 
   - eth: octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback
 
   - eth: macvlan: fix possible UAF in macvlan_forward_source()
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmlo9uMSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOk4iAP/i3I0frORQbcM/NI8sqjQsc7w84SbBB9
 K3FyfA+pfmS+sa5eZBan/eEe3tSQM5xn6XBflZXVxTC2QBxfiH3OVPWGconA7WRk
 KIizSl340vtMhBMlUlXyox8BG7nn+pxjI3BqB5aTHWvEolS9eUgxtJJEDC1MnRzw
 ZkarPXmNTLK6pDWAZ/CYjjBVnaUOt1V2ceRFfRU1SK/ROG7ajpfVLrSXZ4gICc/8
 +br2g4p8hZx6sDYTY4lV/fMMzbbu3n8rRSoqgy8UDAfDJgglifvm4Ocu+EW5mLLG
 IlNobJghtCUcbQN4X8GPg+6cW1yA94qeyHZSHD0Zg3lN7Hfa92WnhVwoqqjCgETC
 ppXnj0zRZN81TeIQx9UIJLXxBZOt+3SSpvdaYdruKg2Sx0rsKraWLjXlResiehzw
 DFZb893SFkJgPEVCcndzGi2I5/MCX3skNQq8+Od73Xcjbl35qiXafGa856R2Z5Ne
 nSEE6BdnYvQAuCwCJ01nFJLQw3ot1H1TO2hF9QEDQ7K42zp2Yj5EKVcYQLAzlAAk
 rzb3D9VIpEpSaXTIpJwa+fZUnpe649J+Rgg71fdj4FJUoNqYErvHQ6QzblmhdThr
 efdpHg1PN3EvpVao27eSiDLl7DGOL8kDmXDpvIhFctKMVhw8mFhdFDmWenGTRQaT
 u2hT+oC19rEG
 =zUPN
 -----END PGP SIGNATURE-----

Merge tag 'net-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth, can and IPsec.

  Current release - regressions:

   - net: add net.core.qdisc_max_burst

   - can: propagate CAN device capabilities via ml_priv

  Previous releases - regressions:

   - dst: fix races in rt6_uncached_list_del() and
     rt_del_uncached_list()

   - ipv6: fix use-after-free in inet6_addr_del().

   - xfrm: fix inner mode lookup in tunnel mode GSO segmentation

   - ip_tunnel: spread netdev_lockdep_set_classes()

   - ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv()

   - bluetooth: hci_sync: enable PA sync lost event

   - eth: virtio-net:
      - fix the deadlock when disabling rx NAPI
      - fix misalignment bug in struct virtnet_info

  Previous releases - always broken:

   - ipv4: ip_gre: make ipgre_header() robust

   - can: fix SSP_SRC in cases when bit-rate is higher than 1 MBit.

   - eth:
      - mlx5e: profile change fix
      - octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback
      - macvlan: fix possible UAF in macvlan_forward_source()"

* tag 'net-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  virtio_net: Fix misalignment bug in struct virtnet_info
  net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
  can: raw: instantly reject disabled CAN frames
  can: propagate CAN device capabilities via ml_priv
  Revert "can: raw: instantly reject unsupported CAN frames"
  net/sched: sch_qfq: do not free existing class in qfq_change_class()
  selftests: drv-net: fix RPS mask handling for high CPU numbers
  selftests: drv-net: fix RPS mask handling in toeplitz test
  ipv6: Fix use-after-free in inet6_addr_del().
  dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list()
  net: hv_netvsc: reject RSS hash key programming without RX indirection table
  tools: ynl: render event op docs correctly
  net: add net.core.qdisc_max_burst
  net: airoha: Fix typo in airoha_ppe_setup_tc_block_cb definition
  net: phy: motorcomm: fix duplex setting error for phy leds
  net: octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback
  net/mlx5e: Restore destroying state bit after profile cleanup
  net/mlx5e: Pass netdev to mlx5e_destroy_netdev instead of priv
  net/mlx5e: Don't store mlx5e_priv in mlx5e_dev devlink priv
  net/mlx5e: Fix crash on profile change rollback failure
  ...
2026-01-15 10:11:11 -08:00
Paolo Abeni 851822aec1 linux-can-fixes-for-6.19-20260115
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEn/sM2K9nqF/8FWzzDHRl3/mQkZwFAmloq5ETHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAMdGXf+ZCRnJdjCACd/tKNnhH0VjjAexLT+pm68MuI+MOB
 He/DKc81hzpJ8FVaC9aTDVYsv5A+EeL+CiSSdO1oYXgzt3hT0MEztgbINAUICOi6
 SfosYrkGR9s1xJvabxWRjn2XrdZtzTFNy0IupDnLiPRXqsFOfWdkF1VQI41aG22i
 ZSgIFQ77Ga3tpWWmoRBimIDBEj+xjx6xfHV6F1hPZ/hEzixnG6ZKMXJjXCIBo90q
 wD1Z2rzu1KvCfIZexdax5L5BzotH3axZ0ci9/RKlTWxkyTY3s+ic27a+E+uZZhgn
 7WwGbMvmDVCiCH23mpX51V5PlkpFEun7rxiwAmF4IcXU3t5uSkL41IXB
 =KHQB
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-6.19-20260115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2026-01-15

this is a pull request of 4 patches for net/main, it super-seeds the
"can 2026-01-14" pull request. The dev refcount leak in patch #3 is
fixed.

The first 3 patches are by Oliver Hartkopp and revert the approach to
instantly reject unsupported CAN frames introduced in
net-next-for-v6.19 and replace it by placing the needed data into the
CAN specific ml_priv.

The last patch is by Tetsuo Handa and fixes a J1939 refcount leak for
j1939_session in session deactivation upon receiving the second RTS.

linux-can-fixes-for-6.19-20260115

* tag 'linux-can-fixes-for-6.19-20260115' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
  can: raw: instantly reject disabled CAN frames
  can: propagate CAN device capabilities via ml_priv
  Revert "can: raw: instantly reject unsupported CAN frames"
====================

Link: https://patch.msgid.link/20260115090603.1124860-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-15 13:13:01 +01:00
Paolo Abeni 5ce234a8fe ipsec-2026-01-14
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH7ZpcWbFyOOp6OJbrB3Eaf9PW7cFAmlnh0AACgkQrB3Eaf9P
 W7fhlQ//ZsSZ1+7UlmSmuzfdS+0erXDqNuIvmX1GudWmUfSWu988+efsCgZcIRSh
 B7YPHjO75mNRQLOxGI4f61JZUwX+h+p/1vbqW1z3/AycPxi3UGBS2P+QaPCK7q+3
 P7Xgch0WCV7acZDMBOSRZb2j3Iu72u2G0AXGfPYJHTOyIIwaBX+c77nvqw+4t2Si
 N31OSQICwdocGSRyFpAb1InklRrW7fixiwXctC0q3MBahBXU2RqjXp/jLm8xY2Nx
 WJupmMng7cxz4vgsKY+tY+2EAvbuAwHEcBK4rjNKdm4E2m0nNpohAoiV3xYnkmoN
 Q4I+Yr9xpBGf2+Eoq7MwuLl1y5A+k5H8PgbnuO5zMP7grzOzCWka6Gx7fDXKuMCY
 t5c/dVG2beQI8MlV60PbUb+UuIhhMUAyKalm5OdQ1uk7mv/RhYO0Pg6g+sIT9uk9
 TvT0UGZOI/Af0uhwYLp7Vl9HicFMYtPhBlEWO+JcUgZXXbqUBnNDml5KG7wcOXh2
 Mj7GHSAfwtl/rNdiVPbo6cPezhuj5mY/QOAPl9/qr6BG7Tusf8TcAnRV5jphYfwV
 Dbo9qjIg5LmtscyfhB5CYk+X8e6Ca1X89lgT1JjW5yyO63yJ2vsORsDC+iakGTEl
 H4pQwIcObseM9X7klH9K2MZYUNhe17p88qQ+atGxUXExXHqil0U=
 =Y9BD
 -----END PGP SIGNATURE-----

Merge tag 'ipsec-2026-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2026-01-14

1) Fix inner mode lookup in tunnel mode GSO segmentation.
   The protocol was taken from the wrong field.

2) Set ipv4 no_pmtu_disc flag only on output SAs. The
   insertation of input SAs can fail if no_pmtu_disc
   is set.

Please pull or let me know if there are problems.

ipsec-2026-01-14

* tag 'ipsec-2026-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: set ipv4 no_pmtu_disc flag only on output sa when direction is set
  xfrm: Fix inner mode lookup in tunnel mode GSO segmentation
====================

Link: https://patch.msgid.link/20260114121817.1106134-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-15 12:46:12 +01:00
Tetsuo Handa 1809c82aa0 net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
Since j1939_session_deactivate_activate_next() in j1939_tp_rxtimer() is
called only when the timer is enabled, we need to call
j1939_session_deactivate_activate_next() if we cancelled the timer.
Otherwise, refcount for j1939_session leaks, which will later appear as

| unregister_netdevice: waiting for vcan0 to become free. Usage count = 2.

problem.

Reported-by: syzbot <syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Link: https://patch.msgid.link/b1212653-8fa1-44e1-be9d-12f950fb3a07@I-love.SAKURA.ne.jp
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2026-01-15 09:52:39 +01:00
Oliver Hartkopp faba5860fc can: raw: instantly reject disabled CAN frames
For real CAN interfaces the CAN_CTRLMODE_FD and CAN_CTRLMODE_XL control
modes indicate whether an interface can handle those CAN FD/XL frames.

In the case a CAN XL interface is configured in CANXL-only mode with
disabled error-signalling neither CAN CC nor CAN FD frames can be sent.

The checks are now performed on CAN_RAW sockets to give an instant feedback
to the user when writing unsupported CAN frames to the interface or when
the CAN interface is in read-only mode.

Fixes: 1a620a7238 ("can: raw: instantly reject unsupported CAN frames")
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Vincent Mailhol <mailhol@kernel.org>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20260109144135.8495-4-socketcan@hartkopp.net
[mkl: fix dev reference leak]
Link: https://lore.kernel.org/all/0636c732-2e71-4633-8005-dfa85e1da445@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2026-01-15 09:52:04 +01:00
Oliver Hartkopp 4650ff58a1 Revert "can: raw: instantly reject unsupported CAN frames"
This reverts commit 1a620a7238

and its follow-up fixes for the introduced dependency issues.

commit 1a620a7238 ("can: raw: instantly reject unsupported CAN frames")
commit cb2dc6d286 ("can: Kconfig: select CAN driver infrastructure by default")
commit 6abd4577bc ("can: fix build dependency")
commit 5a5aff6338 ("can: fix build dependency")

The entire problem was caused by the requirement that a new network layer
feature needed to know about the protocol capabilities of the CAN devices.
Instead of accessing CAN device internal data structures which caused the
dependency problems a better approach has been developed which makes use of
CAN specific ml_priv data which is accessible from both sides.

Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Vincent Mailhol <mailhol@kernel.org>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20260109144135.8495-2-socketcan@hartkopp.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2026-01-15 09:52:04 +01:00
Linus Torvalds c537e12dae bpf-fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmlm+x8ACgkQ6rmadz2v
 bTrC5Q/+NSCjBpHAinVL/09pahh0YJKczxWnpHG0rs6hSSZ88o+1uY+z3PQ3lRX9
 cHSwjIicRvwpX+DosGcXESqspbvo5MLx5DcB0WBSCnLodJ5MDap0iWltBCrGzqme
 mpeuZiQMSpsAGQLjICiywxcCndoaKEMSRZFmjlzNVJPbYkbxbZCQvwUQqg+Z4rQX
 t5zBJjb0X3Oaz3TcSiiW+yulZPoejFyF4P+4eTmFy+mAqq5D7CpMXrrjVTf78mSf
 IqMPbY/PssTExzna2d/ZKtxmKQ+HF6lN7+JaTuqbQsqQtxwh4eXsswezY/UKX7j+
 FdAuwz6W7ybO8UC3i4fvA1jFR60Hi5bBBoSXu+0hb/UGQQYXc+gRtB8DxqeUEgeb
 oVByGwrBWLRSLuBbQVPl3PRTNgwpBn41DkyIpXS6DK6thzdAqZ8zM15vFTYbWanF
 W/skdubkgIdhREunXMpLfZD5h0b/pXIYxzA59x//hZ+W8qvTuaabICja18x3SnqY
 hytoIQMo5d1UytyngfcHvXe79DCi5xFAqqVj0AqYR+CDdt5n9vrXgavX76hrYeHe
 /yYUHZ0h5A564zMXITMmUWDWvgmvTs1Q2hiV+rmtWPJ1bpOpPFxUosIWEBl4edA3
 vKlGoh/CtU9NJp/E/7nYdUV1vXlLJaareGQ/UXpFBl4NYsnNe4w=
 =Vt4X
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull bpf fixes from Alexei Starovoitov:

 - Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK in riscv JIT (Menglong
   Dong)

 - Fix reference count leak in bpf_prog_test_run_xdp() (Tetsuo Handa)

 - Fix metadata size check in bpf_test_run() (Toke Høiland-Jørgensen)

 - Check that BPF insn array is not allowed as a map for const strings
   (Deepanshu Kartikey)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix reference count leak in bpf_prog_test_run_xdp()
  bpf: Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str()
  selftests/bpf: Update xdp_context_test_run test to check maximum metadata size
  bpf, test_run: Subtract size of xdp_frame from allowed metadata size
  riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK
2026-01-13 21:21:13 -08:00
Eric Dumazet 3879cffd9d net/sched: sch_qfq: do not free existing class in qfq_change_class()
Fixes qfq_change_class() error case.

cl->qdisc and cl should only be freed if a new class and qdisc
were allocated, or we risk various UAF.

Fixes: 462dbc9101 ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
Reported-by: syzbot+07f3f38f723c335f106d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6965351d.050a0220.eaf7.00c5.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260112175656.17605-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 19:36:56 -08:00
Kuniyuki Iwashima ddf96c393a ipv6: Fix use-after-free in inet6_addr_del().
syzbot reported use-after-free of inet6_ifaddr in
inet6_addr_del(). [0]

The cited commit accidentally moved ipv6_del_addr() for
mngtmpaddr before reading its ifp->flags for temporary
addresses in inet6_addr_del().

Let's move ipv6_del_addr() down to fix the UAF.

[0]:
BUG: KASAN: slab-use-after-free in inet6_addr_del.constprop.0+0x67a/0x6b0 net/ipv6/addrconf.c:3117
Read of size 4 at addr ffff88807b89c86c by task syz.3.1618/9593

CPU: 0 UID: 0 PID: 9593 Comm: syz.3.1618 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0xcd/0x630 mm/kasan/report.c:482
 kasan_report+0xe0/0x110 mm/kasan/report.c:595
 inet6_addr_del.constprop.0+0x67a/0x6b0 net/ipv6/addrconf.c:3117
 addrconf_del_ifaddr+0x11e/0x190 net/ipv6/addrconf.c:3181
 inet6_ioctl+0x1e5/0x2b0 net/ipv6/af_inet6.c:582
 sock_do_ioctl+0x118/0x280 net/socket.c:1254
 sock_ioctl+0x227/0x6b0 net/socket.c:1375
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f164cf8f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f164de64038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f164d1e5fa0 RCX: 00007f164cf8f749
RDX: 0000200000000000 RSI: 0000000000008936 RDI: 0000000000000003
RBP: 00007f164d013f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f164d1e6038 R14: 00007f164d1e5fa0 R15: 00007ffde15c8288
 </TASK>

Allocated by task 9593:
 kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
 kasan_save_track+0x14/0x30 mm/kasan/common.c:77
 poison_kmalloc_redzone mm/kasan/common.c:397 [inline]
 __kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:414
 kmalloc_noprof include/linux/slab.h:957 [inline]
 kzalloc_noprof include/linux/slab.h:1094 [inline]
 ipv6_add_addr+0x4e3/0x2010 net/ipv6/addrconf.c:1120
 inet6_addr_add+0x256/0x9b0 net/ipv6/addrconf.c:3050
 addrconf_add_ifaddr+0x1fc/0x450 net/ipv6/addrconf.c:3160
 inet6_ioctl+0x103/0x2b0 net/ipv6/af_inet6.c:580
 sock_do_ioctl+0x118/0x280 net/socket.c:1254
 sock_ioctl+0x227/0x6b0 net/socket.c:1375
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:597 [inline]
 __se_sys_ioctl fs/ioctl.c:583 [inline]
 __x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 6099:
 kasan_save_stack+0x33/0x60 mm/kasan/common.c:56
 kasan_save_track+0x14/0x30 mm/kasan/common.c:77
 kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:584
 poison_slab_object mm/kasan/common.c:252 [inline]
 __kasan_slab_free+0x5f/0x80 mm/kasan/common.c:284
 kasan_slab_free include/linux/kasan.h:234 [inline]
 slab_free_hook mm/slub.c:2540 [inline]
 slab_free_freelist_hook mm/slub.c:2569 [inline]
 slab_free_bulk mm/slub.c:6696 [inline]
 kmem_cache_free_bulk mm/slub.c:7383 [inline]
 kmem_cache_free_bulk+0x2bf/0x680 mm/slub.c:7362
 kfree_bulk include/linux/slab.h:830 [inline]
 kvfree_rcu_bulk+0x1b7/0x1e0 mm/slab_common.c:1523
 kvfree_rcu_drain_ready mm/slab_common.c:1728 [inline]
 kfree_rcu_monitor+0x1d0/0x2f0 mm/slab_common.c:1801
 process_one_work+0x9ba/0x1b20 kernel/workqueue.c:3257
 process_scheduled_works kernel/workqueue.c:3340 [inline]
 worker_thread+0x6c8/0xf10 kernel/workqueue.c:3421
 kthread+0x3c5/0x780 kernel/kthread.c:463
 ret_from_fork+0x983/0xb10 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

Fixes: 00b5b7aab9 ("net/ipv6: delete temporary address if mngtmpaddr is removed or unmanaged")
Reported-by: syzbot+72e610f4f1a930ca9d8a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/696598e9.050a0220.3be5c5.0009.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260113010538.2019411-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 19:09:11 -08:00
Eric Dumazet 9a6f0c4d57 dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list()
syzbot was able to crash the kernel in rt6_uncached_list_flush_dev()
in an interesting way [1]

Crash happens in list_del_init()/INIT_LIST_HEAD() while writing
list->prev, while the prior write on list->next went well.

static inline void INIT_LIST_HEAD(struct list_head *list)
{
	WRITE_ONCE(list->next, list); // This went well
	WRITE_ONCE(list->prev, list); // Crash, @list has been freed.
}

Issue here is that rt6_uncached_list_del() did not attempt to lock
ul->lock, as list_empty(&rt->dst.rt_uncached) returned
true because the WRITE_ONCE(list->next, list) happened on the other CPU.

We might use list_del_init_careful() and list_empty_careful(),
or make sure rt6_uncached_list_del() always grabs the spinlock
whenever rt->dst.rt_uncached_list has been set.

A similar fix is neeed for IPv4.

[1]

 BUG: KASAN: slab-use-after-free in INIT_LIST_HEAD include/linux/list.h:46 [inline]
 BUG: KASAN: slab-use-after-free in list_del_init include/linux/list.h:296 [inline]
 BUG: KASAN: slab-use-after-free in rt6_uncached_list_flush_dev net/ipv6/route.c:191 [inline]
 BUG: KASAN: slab-use-after-free in rt6_disable_ip+0x633/0x730 net/ipv6/route.c:5020
Write of size 8 at addr ffff8880294cfa78 by task kworker/u8:14/3450

CPU: 0 UID: 0 PID: 3450 Comm: kworker/u8:14 Tainted: G             L      syzkaller #0 PREEMPT_{RT,(full)}
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Workqueue: netns cleanup_net
Call Trace:
 <TASK>
  dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
  print_address_description mm/kasan/report.c:378 [inline]
  print_report+0xca/0x240 mm/kasan/report.c:482
  kasan_report+0x118/0x150 mm/kasan/report.c:595
  INIT_LIST_HEAD include/linux/list.h:46 [inline]
  list_del_init include/linux/list.h:296 [inline]
  rt6_uncached_list_flush_dev net/ipv6/route.c:191 [inline]
  rt6_disable_ip+0x633/0x730 net/ipv6/route.c:5020
  addrconf_ifdown+0x143/0x18a0 net/ipv6/addrconf.c:3853
 addrconf_notify+0x1bc/0x1050 net/ipv6/addrconf.c:-1
  notifier_call_chain+0x19d/0x3a0 kernel/notifier.c:85
  call_netdevice_notifiers_extack net/core/dev.c:2268 [inline]
  call_netdevice_notifiers net/core/dev.c:2282 [inline]
  netif_close_many+0x29c/0x410 net/core/dev.c:1785
  unregister_netdevice_many_notify+0xb50/0x2330 net/core/dev.c:12353
  ops_exit_rtnl_list net/core/net_namespace.c:187 [inline]
  ops_undo_list+0x3dc/0x990 net/core/net_namespace.c:248
  cleanup_net+0x4de/0x7b0 net/core/net_namespace.c:696
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
 </TASK>

Allocated by task 803:
  kasan_save_stack mm/kasan/common.c:57 [inline]
  kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
  unpoison_slab_object mm/kasan/common.c:340 [inline]
  __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366
  kasan_slab_alloc include/linux/kasan.h:253 [inline]
  slab_post_alloc_hook mm/slub.c:4953 [inline]
  slab_alloc_node mm/slub.c:5263 [inline]
  kmem_cache_alloc_noprof+0x18d/0x6c0 mm/slub.c:5270
  dst_alloc+0x105/0x170 net/core/dst.c:89
  ip6_dst_alloc net/ipv6/route.c:342 [inline]
  icmp6_dst_alloc+0x75/0x460 net/ipv6/route.c:3333
  mld_sendpack+0x683/0xe60 net/ipv6/mcast.c:1844
  mld_send_cr net/ipv6/mcast.c:2154 [inline]
  mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

Freed by task 20:
  kasan_save_stack mm/kasan/common.c:57 [inline]
  kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
  kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:584
  poison_slab_object mm/kasan/common.c:253 [inline]
  __kasan_slab_free+0x5c/0x80 mm/kasan/common.c:285
  kasan_slab_free include/linux/kasan.h:235 [inline]
  slab_free_hook mm/slub.c:2540 [inline]
  slab_free mm/slub.c:6670 [inline]
  kmem_cache_free+0x18f/0x8d0 mm/slub.c:6781
  dst_destroy+0x235/0x350 net/core/dst.c:121
  rcu_do_batch kernel/rcu/tree.c:2605 [inline]
  rcu_core kernel/rcu/tree.c:2857 [inline]
  rcu_cpu_kthread+0xba5/0x1af0 kernel/rcu/tree.c:2945
  smpboot_thread_fn+0x542/0xa60 kernel/smpboot.c:160
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

Last potentially related work creation:
  kasan_save_stack+0x3e/0x60 mm/kasan/common.c:57
  kasan_record_aux_stack+0xbd/0xd0 mm/kasan/generic.c:556
  __call_rcu_common kernel/rcu/tree.c:3119 [inline]
  call_rcu+0xee/0x890 kernel/rcu/tree.c:3239
  refdst_drop include/net/dst.h:266 [inline]
  skb_dst_drop include/net/dst.h:278 [inline]
  skb_release_head_state+0x71/0x360 net/core/skbuff.c:1156
  skb_release_all net/core/skbuff.c:1180 [inline]
  __kfree_skb net/core/skbuff.c:1196 [inline]
  sk_skb_reason_drop+0xe9/0x170 net/core/skbuff.c:1234
  kfree_skb_reason include/linux/skbuff.h:1322 [inline]
  tcf_kfree_skb_list include/net/sch_generic.h:1127 [inline]
  __dev_xmit_skb net/core/dev.c:4260 [inline]
  __dev_queue_xmit+0x26aa/0x3210 net/core/dev.c:4785
  NF_HOOK_COND include/linux/netfilter.h:307 [inline]
  ip6_output+0x340/0x550 net/ipv6/ip6_output.c:247
  NF_HOOK+0x9e/0x380 include/linux/netfilter.h:318
  mld_sendpack+0x8d4/0xe60 net/ipv6/mcast.c:1855
  mld_send_cr net/ipv6/mcast.c:2154 [inline]
  mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

The buggy address belongs to the object at ffff8880294cfa00
 which belongs to the cache ip6_dst_cache of size 232
The buggy address is located 120 bytes inside of
 freed 232-byte region [ffff8880294cfa00, ffff8880294cfae8)

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x294cf
memcg:ffff88803536b781
flags: 0x80000000000000(node=0|zone=1)
page_type: f5(slab)
raw: 0080000000000000 ffff88802ff1c8c0 ffffea0000bf2bc0 dead000000000006
raw: 0000000000000000 00000000800c000c 00000000f5000000 ffff88803536b781
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52820(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 9, tgid 9 (kworker/0:0), ts 91119585830, free_ts 91088628818
  set_page_owner include/linux/page_owner.h:32 [inline]
  post_alloc_hook+0x234/0x290 mm/page_alloc.c:1857
  prep_new_page mm/page_alloc.c:1865 [inline]
  get_page_from_freelist+0x28c0/0x2960 mm/page_alloc.c:3915
  __alloc_frozen_pages_noprof+0x181/0x370 mm/page_alloc.c:5210
  alloc_pages_mpol+0xd1/0x380 mm/mempolicy.c:2486
  alloc_slab_page mm/slub.c:3075 [inline]
  allocate_slab+0x86/0x3b0 mm/slub.c:3248
  new_slab mm/slub.c:3302 [inline]
  ___slab_alloc+0xb10/0x13e0 mm/slub.c:4656
  __slab_alloc+0xc6/0x1f0 mm/slub.c:4779
  __slab_alloc_node mm/slub.c:4855 [inline]
  slab_alloc_node mm/slub.c:5251 [inline]
  kmem_cache_alloc_noprof+0x101/0x6c0 mm/slub.c:5270
  dst_alloc+0x105/0x170 net/core/dst.c:89
  ip6_dst_alloc net/ipv6/route.c:342 [inline]
  icmp6_dst_alloc+0x75/0x460 net/ipv6/route.c:3333
  mld_sendpack+0x683/0xe60 net/ipv6/mcast.c:1844
  mld_send_cr net/ipv6/mcast.c:2154 [inline]
  mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
page last free pid 5859 tgid 5859 stack trace:
  reset_page_owner include/linux/page_owner.h:25 [inline]
  free_pages_prepare mm/page_alloc.c:1406 [inline]
  __free_frozen_pages+0xfe1/0x1170 mm/page_alloc.c:2943
  discard_slab mm/slub.c:3346 [inline]
  __put_partials+0x149/0x170 mm/slub.c:3886
  __slab_free+0x2af/0x330 mm/slub.c:5952
  qlink_free mm/kasan/quarantine.c:163 [inline]
  qlist_free_all+0x97/0x100 mm/kasan/quarantine.c:179
  kasan_quarantine_reduce+0x148/0x160 mm/kasan/quarantine.c:286
  __kasan_slab_alloc+0x22/0x80 mm/kasan/common.c:350
  kasan_slab_alloc include/linux/kasan.h:253 [inline]
  slab_post_alloc_hook mm/slub.c:4953 [inline]
  slab_alloc_node mm/slub.c:5263 [inline]
  kmem_cache_alloc_noprof+0x18d/0x6c0 mm/slub.c:5270
  getname_flags+0xb8/0x540 fs/namei.c:146
  getname include/linux/fs.h:2498 [inline]
  do_sys_openat2+0xbc/0x200 fs/open.c:1426
  do_sys_open fs/open.c:1436 [inline]
  __do_sys_openat fs/open.c:1452 [inline]
  __se_sys_openat fs/open.c:1447 [inline]
  __x64_sys_openat+0x138/0x170 fs/open.c:1447
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xec/0xf80 arch/x86/entry/syscall_64.c:94

Fixes: 8d0b94afdc ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister")
Fixes: 78df76a065 ("ipv4: take rt_uncached_lock only if needed")
Reported-by: syzbot+179fc225724092b8b2b2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6964cdf2.050a0220.eaf7.009d.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260112103825.3810713-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 19:08:18 -08:00
Eric Dumazet ffe4ccd359 net: add net.core.qdisc_max_burst
In blamed commit, I added a check against the temporary queue
built in __dev_xmit_skb(). Idea was to drop packets early,
before any spinlock was acquired.

if (unlikely(defer_count > READ_ONCE(q->limit))) {
	kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP);
	return NET_XMIT_DROP;
}

It turned out that HTB Qdisc has a zero q->limit.
HTB limits packets on a per-class basis.
Some of our tests became flaky.

Add a new sysctl : net.core.qdisc_max_burst to control
how many packets can be stored in the temporary lockless queue.

Also add a new QDISC_BURST_DROP drop reason to better diagnose
future issues.

Thanks Neal !

Fixes: 100dfa74ca ("net: dev_queue_xmit() llist adoption")
Reported-and-bisected-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20260107104159.3669285-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-01-13 10:12:11 +01:00
Tetsuo Handa ec69daabe4 bpf: Fix reference count leak in bpf_prog_test_run_xdp()
syzbot is reporting

  unregister_netdevice: waiting for sit0 to become free. Usage count = 2

problem. A debug printk() patch found that a refcount is obtained at
xdp_convert_md_to_buff() from bpf_prog_test_run_xdp().

According to commit ec94670fcb ("bpf: Support specifying ingress via
xdp_md context in BPF_PROG_TEST_RUN"), the refcount obtained by
xdp_convert_md_to_buff() will be released by xdp_convert_buff_to_md().

Therefore, we can consider that the error handling path introduced by
commit 1c19499825 ("bpf: introduce frags support to
bpf_prog_test_run_xdp()") forgot to call xdp_convert_buff_to_md().

Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Fixes: 1c19499825 ("bpf: introduce frags support to bpf_prog_test_run_xdp()")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/af090e53-9d9b-4412-8acb-957733b3975c@I-love.SAKURA.ne.jp
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-12 16:37:40 -08:00
Jakub Kicinski c8a49a2f91 bluetooth pull request for net:
- hci_sync: enable PA Sync Lost event
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCgA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmlhcH4ZHGx1aXoudm9u
 LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKRXEEACNvWdD3Gf4yWEHFV3u9Q3s
 7JaMyixXC1/CEuAx/Ck8O2jlZDIpuABOtHdNKZCTl7QoFrkU2ajyirvHgDgkzt65
 ahYLL7IrIF2glUM6IW0VesQOa8oKzB9nyfu6fX+iTWQYnKPIV8hAVng70bUVQar1
 5YCjPEredwsD300WmudiURzTCbi2UbUgjXp5clmcrtG5TNVDYjjPAav1xOYzmpZQ
 KfxRBtJPNUWDuHFgI7MdrjiQzA0JyQmIuI587b3npn14FwUQoPHozCqIZm8rODkK
 CI0owwg6T1t8JMS6/XHYTs2S0L6wjI0p7MQvZ6lQRm7jAYo3EcKLm6cbjzdJRt/x
 5S3E/QSUo53ndhXAaQKsifSeNva9fhIUEADQlAjfvrRGI9GpHoiJDebNRhWHApgF
 j3P9Ry/B3sY/s3UiEDVXujhPUH4Kw3X+ClNWubYzKKEqrLof3ISM2bYNA9CNa388
 vRhNTNYOP3ACLUju/Jr3nAPM9VOPLCyubhFMnaooNIwEnXomwUxM14EvBPenlyn9
 tyfsVF8tACKz64P9TfaoN2zgBbZLzY3uV1XgxWAV2ku08hJmLuaqXpm+Iq8W6TVO
 D/NZYDROqdN4WbmFGPX1RuAYZCPuIi4Ifzhc+Vc/GXlCtJE4TIQHYt2jXhuBfFg/
 4Av06y5+qbSn+VKwB0iqOA==
 =kymk
 -----END PGP SIGNATURE-----

Merge tag 'for-net-2026-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - hci_sync: enable PA Sync Lost event

* tag 'for-net-2026-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: hci_sync: enable PA Sync Lost event
====================

Link: https://patch.msgid.link/20260109211949.236218-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-10 14:32:01 -08:00
Eric Dumazet e67c577d89 ipv4: ip_gre: make ipgre_header() robust
Analog to commit db5b4e39c4 ("ip6_gre: make ip6gre_header() robust")

Over the years, syzbot found many ways to crash the kernel
in ipgre_header() [1].

This involves team or bonding drivers ability to dynamically
change their dev->needed_headroom and/or dev->hard_header_len

In this particular crash mld_newpack() allocated an skb
with a too small reserve/headroom, and by the time mld_sendpack()
was called, syzbot managed to attach an ipgre device.

[1]
skbuff: skb_under_panic: text:ffffffff89ea3cb7 len:2030915468 put:2030915372 head:ffff888058b43000 data:ffff887fdfa6e194 tail:0x120 end:0x6c0 dev:team0
 kernel BUG at net/core/skbuff.c:213 !
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 1322 Comm: kworker/1:9 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
Workqueue: mld mld_ifc_work
 RIP: 0010:skb_panic+0x157/0x160 net/core/skbuff.c:213
Call Trace:
 <TASK>
  skb_under_panic net/core/skbuff.c:223 [inline]
  skb_push+0xc3/0xe0 net/core/skbuff.c:2641
  ipgre_header+0x67/0x290 net/ipv4/ip_gre.c:897
  dev_hard_header include/linux/netdevice.h:3436 [inline]
  neigh_connected_output+0x286/0x460 net/core/neighbour.c:1618
  NF_HOOK_COND include/linux/netfilter.h:307 [inline]
  ip6_output+0x340/0x550 net/ipv6/ip6_output.c:247
  NF_HOOK+0x9e/0x380 include/linux/netfilter.h:318
  mld_sendpack+0x8d4/0xe60 net/ipv6/mcast.c:1855
  mld_send_cr net/ipv6/mcast.c:2154 [inline]
  mld_ifc_work+0x83e/0xd60 net/ipv6/mcast.c:2693
  process_one_work kernel/workqueue.c:3257 [inline]
  process_scheduled_works+0xad1/0x1770 kernel/workqueue.c:3340
  worker_thread+0x8a0/0xda0 kernel/workqueue.c:3421
  kthread+0x711/0x8a0 kernel/kthread.c:463
  ret_from_fork+0x510/0xa50 arch/x86/kernel/process.c:158
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246

Fixes: c544193214 ("GRE: Refactor GRE tunneling code.")
Reported-by: syzbot+7c134e1c3aa3283790b9@syzkaller.appspotmail.com
Closes: https://www.spinics.net/lists/netdev/msg1147302.html
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260108190214.1667040-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-10 12:06:22 -08:00
Eric Dumazet eb74c19fe1 net: update netdev_lock_{type,name}
Add missing entries in netdev_lock_type[] and netdev_lock_name[] :

CAN, MCTP, RAWIP, CAIF, IP6GRE, 6LOWPAN, NETLINK, VSOCKMON,
IEEE802154_MONITOR.

Also add a WARN_ONCE() in netdev_lock_pos() to help future bug hunting
next time a protocol is added without updating these arrays.

Fixes: 1a33e10e4a ("net: partially revert dynamic lockdep key changes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260108093244.830280-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-09 18:14:32 -08:00
Eric Dumazet 81c734dae2 ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv()
Blamed commit did not take care of VLAN encapsulations
as spotted by syzbot [1].

Use skb_vlan_inet_prepare() instead of pskb_inet_may_pull().

[1]
 BUG: KMSAN: uninit-value in __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline]
 BUG: KMSAN: uninit-value in INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline]
 BUG: KMSAN: uninit-value in IP6_ECN_decapsulate+0x7a8/0x1fa0 include/net/inet_ecn.h:321
  __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline]
  INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline]
  IP6_ECN_decapsulate+0x7a8/0x1fa0 include/net/inet_ecn.h:321
  ip6ip6_dscp_ecn_decapsulate+0x16f/0x1b0 net/ipv6/ip6_tunnel.c:729
  __ip6_tnl_rcv+0xed9/0x1b50 net/ipv6/ip6_tunnel.c:860
  ip6_tnl_rcv+0xc3/0x100 net/ipv6/ip6_tunnel.c:903
 gre_rcv+0x1529/0x1b90 net/ipv6/ip6_gre.c:-1
  ip6_protocol_deliver_rcu+0x1c89/0x2c60 net/ipv6/ip6_input.c:438
  ip6_input_finish+0x1f4/0x4a0 net/ipv6/ip6_input.c:489
  NF_HOOK include/linux/netfilter.h:318 [inline]
  ip6_input+0x9c/0x330 net/ipv6/ip6_input.c:500
  ip6_mc_input+0x7ca/0xc10 net/ipv6/ip6_input.c:590
  dst_input include/net/dst.h:474 [inline]
  ip6_rcv_finish+0x958/0x990 net/ipv6/ip6_input.c:79
  NF_HOOK include/linux/netfilter.h:318 [inline]
  ipv6_rcv+0xf1/0x3c0 net/ipv6/ip6_input.c:311
  __netif_receive_skb_one_core net/core/dev.c:6139 [inline]
  __netif_receive_skb+0x1df/0xac0 net/core/dev.c:6252
  netif_receive_skb_internal net/core/dev.c:6338 [inline]
  netif_receive_skb+0x57/0x630 net/core/dev.c:6397
  tun_rx_batched+0x1df/0x980 drivers/net/tun.c:1485
  tun_get_user+0x5c0e/0x6c60 drivers/net/tun.c:1953
  tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1999
  new_sync_write fs/read_write.c:593 [inline]
  vfs_write+0xbe2/0x15d0 fs/read_write.c:686
  ksys_write fs/read_write.c:738 [inline]
  __do_sys_write fs/read_write.c:749 [inline]
  __se_sys_write fs/read_write.c:746 [inline]
  __x64_sys_write+0x1fb/0x4d0 fs/read_write.c:746
  x64_sys_call+0x30ab/0x3e70 arch/x86/include/generated/asm/syscalls_64.h:2
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xd3/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Uninit was created at:
  slab_post_alloc_hook mm/slub.c:4960 [inline]
  slab_alloc_node mm/slub.c:5263 [inline]
  kmem_cache_alloc_node_noprof+0x9e7/0x17a0 mm/slub.c:5315
  kmalloc_reserve+0x13c/0x4b0 net/core/skbuff.c:586
  __alloc_skb+0x805/0x1040 net/core/skbuff.c:690
  alloc_skb include/linux/skbuff.h:1383 [inline]
  alloc_skb_with_frags+0xc5/0xa60 net/core/skbuff.c:6712
  sock_alloc_send_pskb+0xacc/0xc60 net/core/sock.c:2995
  tun_alloc_skb drivers/net/tun.c:1461 [inline]
  tun_get_user+0x1142/0x6c60 drivers/net/tun.c:1794
  tun_chr_write_iter+0x3e9/0x5c0 drivers/net/tun.c:1999
  new_sync_write fs/read_write.c:593 [inline]
  vfs_write+0xbe2/0x15d0 fs/read_write.c:686
  ksys_write fs/read_write.c:738 [inline]
  __do_sys_write fs/read_write.c:749 [inline]
  __se_sys_write fs/read_write.c:746 [inline]
  __x64_sys_write+0x1fb/0x4d0 fs/read_write.c:746
  x64_sys_call+0x30ab/0x3e70 arch/x86/include/generated/asm/syscalls_64.h:2
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0xd3/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

CPU: 0 UID: 0 PID: 6465 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(none)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025

Fixes: 8d975c15c0 ("ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()")
Reported-by: syzbot+d4dda070f833dc5dc89a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695e88b2.050a0220.1c677c.036d.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260107163109.4188620-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-09 18:14:08 -08:00
Eric Dumazet b25a0b4a21 net: bridge: annotate data-races around fdb->{updated,used}
fdb->updated and fdb->used are read and written locklessly.

Add READ_ONCE()/WRITE_ONCE() annotations.

Fixes: 31cbc39b63 ("net: bridge: add option to allow activity notifications for any fdb entries")
Reported-by: syzbot+bfab43087ad57222ce96@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695e3d74.050a0220.1c677c.035f.GAE@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260108093806.834459-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-09 17:36:21 -08:00
Linus Torvalds 4621c338d3 A bunch of libceph fixes split evenly between memory safety and
implementation correctness issues (all marked for stable) and a change
 in maintainers for CephFS: Slava and Alex have formally taken over
 Xiubo's role.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmlhPnYTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi+xRB/4wW8+zp9w+AzK023uL93A9iU4yPM+/
 XYRSIhnR0VWdGTwQsSiQmuwT9RVW5ElH1o0Zzt9BBM9XW8BTIyDnCcdd4yYT+fRc
 ZCG5JUXF0rRgSXYWTpHfEUg5H0wAFCruhhv51vfAuxe5+AFh+7J2/Ct2SraIMt7r
 brShR7vqxUgaBp4TINdsEZBSNBhEIUkPbulZxDDq4+uFN5Cl1ZgESm4QOdxqZFJ8
 lkvPRRjh7rpSIUfULSyH6UVvS/FsSjXrL1rhszlGbgwc297ox+UTk+dOfko8gdmL
 /l92y6Si5CdCxEwLGHvW4lW7qWj9ba9g8v9DCy1BL6dE+CbY1HSjgtJB
 =bROV
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-6.19-rc5' of https://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A bunch of libceph fixes split evenly between memory safety and
  implementation correctness issues (all marked for stable) and a change
  in maintainers for CephFS: Slava and Alex have formally taken over
  Xiubo's role"

* tag 'ceph-for-6.19-rc5' of https://github.com/ceph/ceph-client:
  libceph: make calc_target() set t->paused, not just clear it
  libceph: reset sparse-read state in osd_fault()
  libceph: return the handler error from mon_handle_auth_done()
  libceph: make free_choose_arg_map() resilient to partial allocation
  ceph: update co-maintainers list in MAINTAINERS
  libceph: replace overzealous BUG_ON in osdmap_apply_incremental()
  libceph: prevent potential out-of-bounds reads in handle_auth_done()
2026-01-09 15:05:19 -10:00
Yang Li ab749bfe6a Bluetooth: hci_sync: enable PA Sync Lost event
Enable the PA Sync Lost event mask to ensure PA sync loss is properly
reported and handled.

Fixes: 485e0626e5 ("Bluetooth: hci_event: Fix not handling PA Sync Lost event")
Signed-off-by: Yang Li <yang.li@amlogic.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2026-01-09 16:03:57 -05:00
Eric Dumazet 872ac785e7 ipv4: ip_tunnel: spread netdev_lockdep_set_classes()
Inspired by yet another syzbot report.

IPv6 tunnels call netdev_lockdep_set_classes() for each tunnel type,
while IPv4 currently centralizes netdev_lockdep_set_classes() call from
ip_tunnel_init().

Make ip_tunnel_init() a macro, so that we have different lockdep
classes per tunnel type.

Fixes: 0bef512012 ("net: add netdev_lockdep_set_classes() to virtual drivers")
Reported-by: syzbot+1240b33467289f5ab50b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695d439f.050a0220.1c677c.0347.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260106172426.1760721-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 18:02:35 -08:00
Eric Dumazet c92510f5e3 arp: do not assume dev_hard_header() does not change skb->head
arp_create() is the only dev_hard_header() caller
making assumption about skb->head being unchanged.

A recent commit broke this assumption.

Initialize @arp pointer after dev_hard_header() call.

Fixes: db5b4e39c4 ("ip6_gre: make ip6gre_header() robust")
Reported-by: syzbot+58b44a770a1585795351@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260107212250.384552-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 09:04:24 -08:00
Jakub Kicinski 804809ae40 Couple of fixes:
- mac80211:
    - long-standing injection bug due to chanctx rework
    - more recent interface iteration issue
    - collect statistics before removing stations
  - hwsim:
    - fix NAN frequency typo (potential NULL ptr deref)
    - fix locking of radio lock (needs softirqs disabled)
  - wext:
    - ancient issue with compat and events copying some
      uninitialized stack data to userspace
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmlfuIgACgkQ10qiO8sP
 aABFOQ/9Fgkpncx7+iA7Als75L4DkVz94k/PE+VCN8gmBQp7PwQLEHyLhm+EgK1d
 sSdRcgJAJ6P7sdyZ4foZlZszAEPdLLfb0JaYLEQ6h4+g6cVmaFzOSE2/6mUDE2OO
 SlYxWjF1wnzpL5InVE54UBGvfLEZyoO80xqQhSW5J6v2pBsovDsve1KMtT8a4myi
 zId7tpCpOTGuP9jtPJZWeBXZlLLIkvpTOyCAPQ57WJKCVflwxHVTpCFMEoDLbT8v
 KV6af8zns57Tu+VTLAD4CmBTXyWsVnOsLmRdB7a2S5X6zYoqiep0JMOnmVPcLwmm
 WhxIWgoquGHFKJxbCNp7tyG641Pa39ZfeBErqfZY0FTUgxVjj7tLdvwm1gxg5SAW
 gjv2TlnyrPhRhcD3ocjbudn7H+76rqPSZC7FwGhrmPi0CfrEgfYmS0S+g3Jng0x0
 KrX1ej4/dSR6KTjvmZ62skLPq6HzRQVbOPuZhcYq9Gqt9yMQ57W3kHoDbiwaAwYi
 0iTZo7W8JcyN3cbWVNKf4ZHDSywL+YNGVuY0o/umNQk0A+SXvk73HrhDVH2sj9h2
 haP7+uOpN90YM1c+Zjfe3uxL0M3MPquGBojJZVDzMApRNczd/pSOvuIO7txhqy9B
 H9rjlHNyxYtsj4CZhML/GUSKYX551aO6WjyzgAxq/gSIfYxcb0k=
 =KKD4
 -----END PGP SIGNATURE-----

Merge tag 'wireless-2026-01-08' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
Couple of fixes:
 - mac80211:
   - long-standing injection bug due to chanctx rework
   - more recent interface iteration issue
   - collect statistics before removing stations
 - hwsim:
   - fix NAN frequency typo (potential NULL ptr deref)
   - fix locking of radio lock (needs softirqs disabled)
 - wext:
   - ancient issue with compat and events copying some
     uninitialized stack data to userspace

* tag 'wireless-2026-01-08' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: mac80211: collect station statistics earlier when disconnect
  wifi: mac80211: restore non-chanctx injection behaviour
  wifi: mac80211_hwsim: disable BHs for hwsim_radio_lock
  wifi: mac80211: don't iterate not running interfaces
  wifi: mac80211_hwsim: fix typo in frequency notification
  wifi: avoid kernel-infoleak from struct iw_point
====================

Link: https://patch.msgid.link/20260108140141.139687-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 08:49:24 -08:00
Willem de Bruijn 7d11e047ed net: do not write to msg_get_inq in callee
NULL pointer dereference fix.

msg_get_inq is an input field from caller to callee. Don't set it in
the callee, as the caller may not clear it on struct reuse.

This is a kernel-internal variant of msghdr only, and the only user
does reinitialize the field. So this is not critical for that reason.
But it is more robust to avoid the write, and slightly simpler code.
And it fixes a bug, see below.

Callers set msg_get_inq to request the input queue length to be
returned in msg_inq. This is equivalent to but independent from the
SO_INQ request to return that same info as a cmsg (tp->recvmsg_inq).
To reduce branching in the hot path the second also sets the msg_inq.
That is WAI.

This is a fix to commit 4d1442979e ("af_unix: don't post cmsg for
SO_INQ unless explicitly asked for"), which fixed the inverse.

Also avoid NULL pointer dereference in unix_stream_read_generic if
state->msg is NULL and msg->msg_get_inq is written. A NULL state->msg
can happen when splicing as of commit 2b514574f7 ("net: af_unix:
implement splice for stream af_unix sockets").

Also collapse two branches using a bitwise or.

Cc: stable@vger.kernel.org
Fixes: 4d1442979e ("af_unix: don't post cmsg for SO_INQ unless explicitly asked for")
Link: https://lore.kernel.org/netdev/willemdebruijn.kernel.24d8030f7a3de@gmail.com/
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260106150626.3944363-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 08:45:13 -08:00
Xiang Mei c1d73b1480 net/sched: sch_qfq: Fix NULL deref when deactivating inactive aggregate in qfq_reset
`qfq_class->leaf_qdisc->q.qlen > 0` does not imply that the class
itself is active.

Two qfq_class objects may point to the same leaf_qdisc. This happens
when:

1. one QFQ qdisc is attached to the dev as the root qdisc, and

2. another QFQ qdisc is temporarily referenced (e.g., via qdisc_get()
/ qdisc_put()) and is pending to be destroyed, as in function
tc_new_tfilter.

When packets are enqueued through the root QFQ qdisc, the shared
leaf_qdisc->q.qlen increases. At the same time, the second QFQ
qdisc triggers qdisc_put and qdisc_destroy: the qdisc enters
qfq_reset() with its own q->q.qlen == 0, but its class's leaf
qdisc->q.qlen > 0. Therefore, the qfq_reset would wrongly deactivate
an inactive aggregate and trigger a null-deref in qfq_deactivate_agg:

[    0.903172] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    0.903571] #PF: supervisor write access in kernel mode
[    0.903860] #PF: error_code(0x0002) - not-present page
[    0.904177] PGD 10299b067 P4D 10299b067 PUD 10299c067 PMD 0
[    0.904502] Oops: Oops: 0002 [#1] SMP NOPTI
[    0.904737] CPU: 0 UID: 0 PID: 135 Comm: exploit Not tainted 6.19.0-rc3+ #2 NONE
[    0.905157] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
[    0.905754] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:992 (discriminator 2) include/linux/list.h:1006 (discriminator 2) net/sched/sch_qfq.c:1367 (discriminator 2) net/sched/sch_qfq.c:1393 (discriminator 2))
[    0.906046] Code: 0f 84 4d 01 00 00 48 89 70 18 8b 4b 10 48 c7 c2 ff ff ff ff 48 8b 78 08 48 d3 e2 48 21 f2 48 2b 13 48 8b 30 48 d3 ea 8b 4b 18 0

Code starting with the faulting instruction
===========================================
   0:	0f 84 4d 01 00 00    	je     0x153
   6:	48 89 70 18          	mov    %rsi,0x18(%rax)
   a:	8b 4b 10             	mov    0x10(%rbx),%ecx
   d:	48 c7 c2 ff ff ff ff 	mov    $0xffffffffffffffff,%rdx
  14:	48 8b 78 08          	mov    0x8(%rax),%rdi
  18:	48 d3 e2             	shl    %cl,%rdx
  1b:	48 21 f2             	and    %rsi,%rdx
  1e:	48 2b 13             	sub    (%rbx),%rdx
  21:	48 8b 30             	mov    (%rax),%rsi
  24:	48 d3 ea             	shr    %cl,%rdx
  27:	8b 4b 18             	mov    0x18(%rbx),%ecx
	...
[    0.907095] RSP: 0018:ffffc900004a39a0 EFLAGS: 00010246
[    0.907368] RAX: ffff8881043a0880 RBX: ffff888102953340 RCX: 0000000000000000
[    0.907723] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    0.908100] RBP: ffff888102952180 R08: 0000000000000000 R09: 0000000000000000
[    0.908451] R10: ffff8881043a0000 R11: 0000000000000000 R12: ffff888102952000
[    0.908804] R13: ffff888102952180 R14: ffff8881043a0ad8 R15: ffff8881043a0880
[    0.909179] FS:  000000002a1a0380(0000) GS:ffff888196d8d000(0000) knlGS:0000000000000000
[    0.909572] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.909857] CR2: 0000000000000000 CR3: 0000000102993002 CR4: 0000000000772ef0
[    0.910247] PKRU: 55555554
[    0.910391] Call Trace:
[    0.910527]  <TASK>
[    0.910638]  qfq_reset_qdisc (net/sched/sch_qfq.c:357 net/sched/sch_qfq.c:1485)
[    0.910826]  qdisc_reset (include/linux/skbuff.h:2195 include/linux/skbuff.h:2501 include/linux/skbuff.h:3424 include/linux/skbuff.h:3430 net/sched/sch_generic.c:1036)
[    0.911040]  __qdisc_destroy (net/sched/sch_generic.c:1076)
[    0.911236]  tc_new_tfilter (net/sched/cls_api.c:2447)
[    0.911447]  rtnetlink_rcv_msg (net/core/rtnetlink.c:6958)
[    0.911663]  ? __pfx_rtnetlink_rcv_msg (net/core/rtnetlink.c:6861)
[    0.911894]  netlink_rcv_skb (net/netlink/af_netlink.c:2550)
[    0.912100]  netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344)
[    0.912296]  ? __alloc_skb (net/core/skbuff.c:706)
[    0.912484]  netlink_sendmsg (net/netlink/af_netlink.c:1894)
[    0.912682]  sock_write_iter (net/socket.c:727 (discriminator 1) net/socket.c:742 (discriminator 1) net/socket.c:1195 (discriminator 1))
[    0.912880]  vfs_write (fs/read_write.c:593 fs/read_write.c:686)
[    0.913077]  ksys_write (fs/read_write.c:738)
[    0.913252]  do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
[    0.913438]  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131)
[    0.913687] RIP: 0033:0x424c34
[    0.913844] Code: 89 02 48 c7 c0 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 80 3d 2d 44 09 00 00 74 13 b8 01 00 00 00 0f 05 9

Code starting with the faulting instruction
===========================================
   0:	89 02                	mov    %eax,(%rdx)
   2:	48 c7 c0 ff ff ff ff 	mov    $0xffffffffffffffff,%rax
   9:	eb bd                	jmp    0xffffffffffffffc8
   b:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
  12:	00 00 00
  15:	90                   	nop
  16:	f3 0f 1e fa          	endbr64
  1a:	80 3d 2d 44 09 00 00 	cmpb   $0x0,0x9442d(%rip)        # 0x9444e
  21:	74 13                	je     0x36
  23:	b8 01 00 00 00       	mov    $0x1,%eax
  28:	0f 05                	syscall
  2a:	09                   	.byte 0x9
[    0.914807] RSP: 002b:00007ffea1938b78 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[    0.915197] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 0000000000424c34
[    0.915556] RDX: 000000000000003c RSI: 000000002af378c0 RDI: 0000000000000003
[    0.915912] RBP: 00007ffea1938bc0 R08: 00000000004b8820 R09: 0000000000000000
[    0.916297] R10: 0000000000000001 R11: 0000000000000202 R12: 00007ffea1938d28
[    0.916652] R13: 00007ffea1938d38 R14: 00000000004b3828 R15: 0000000000000001
[    0.917039]  </TASK>
[    0.917158] Modules linked in:
[    0.917316] CR2: 0000000000000000
[    0.917484] ---[ end trace 0000000000000000 ]---
[    0.917717] RIP: 0010:qfq_deactivate_agg (include/linux/list.h:992 (discriminator 2) include/linux/list.h:1006 (discriminator 2) net/sched/sch_qfq.c:1367 (discriminator 2) net/sched/sch_qfq.c:1393 (discriminator 2))
[    0.917978] Code: 0f 84 4d 01 00 00 48 89 70 18 8b 4b 10 48 c7 c2 ff ff ff ff 48 8b 78 08 48 d3 e2 48 21 f2 48 2b 13 48 8b 30 48 d3 ea 8b 4b 18 0

Code starting with the faulting instruction
===========================================
   0:	0f 84 4d 01 00 00    	je     0x153
   6:	48 89 70 18          	mov    %rsi,0x18(%rax)
   a:	8b 4b 10             	mov    0x10(%rbx),%ecx
   d:	48 c7 c2 ff ff ff ff 	mov    $0xffffffffffffffff,%rdx
  14:	48 8b 78 08          	mov    0x8(%rax),%rdi
  18:	48 d3 e2             	shl    %cl,%rdx
  1b:	48 21 f2             	and    %rsi,%rdx
  1e:	48 2b 13             	sub    (%rbx),%rdx
  21:	48 8b 30             	mov    (%rax),%rsi
  24:	48 d3 ea             	shr    %cl,%rdx
  27:	8b 4b 18             	mov    0x18(%rbx),%ecx
	...
[    0.918902] RSP: 0018:ffffc900004a39a0 EFLAGS: 00010246
[    0.919198] RAX: ffff8881043a0880 RBX: ffff888102953340 RCX: 0000000000000000
[    0.919559] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[    0.919908] RBP: ffff888102952180 R08: 0000000000000000 R09: 0000000000000000
[    0.920289] R10: ffff8881043a0000 R11: 0000000000000000 R12: ffff888102952000
[    0.920648] R13: ffff888102952180 R14: ffff8881043a0ad8 R15: ffff8881043a0880
[    0.921014] FS:  000000002a1a0380(0000) GS:ffff888196d8d000(0000) knlGS:0000000000000000
[    0.921424] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.921710] CR2: 0000000000000000 CR3: 0000000102993002 CR4: 0000000000772ef0
[    0.922097] PKRU: 55555554
[    0.922240] Kernel panic - not syncing: Fatal exception
[    0.922590] Kernel Offset: disabled

Fixes: 0545a30377 ("pkt_sched: QFQ - quick fair queue scheduler")
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Link: https://patch.msgid.link/20260106034100.1780779-1-xmei5@asu.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-08 08:22:28 -08:00
Baochen Qiang a203dbeeca wifi: mac80211: collect station statistics earlier when disconnect
In __sta_info_destroy_part2(), station statistics are requested after the
IEEE80211_STA_NONE -> IEEE80211_STA_NOTEXIST transition. This is
problematic because the driver may be unable to handle the request due to
the STA being in the NOTEXIST state (i.e. if the driver destroys the
underlying data when transitioning to NOTEXIST).

Move the statistics collection to before the state transition to avoid
this issue.

Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Link: https://patch.msgid.link/20251222-mac80211-move-station-stats-collection-earlier-v1-1-12cd4e42c633@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:33:11 +01:00
Johannes Berg d594cc6f2c wifi: mac80211: restore non-chanctx injection behaviour
During the transition to use channel contexts throughout, the
ability to do injection while in monitor mode concurrent with
another interface was lost, since the (virtual) monitor won't
have a chanctx assigned in this scenario.

It's harder to fix drivers that actually transitioned to using
channel contexts themselves, such as mt76, but it's easy to do
those that are (still) just using the emulation. Do that.

Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218763
Reported-and-tested-by: Oscar Alfonso Diaz <oscar.alfonso.diaz@gmail.com>
Fixes: 0a44dfc070 ("wifi: mac80211: simplify non-chanctx drivers")
Link: https://patch.msgid.link/20251216105242.18366-2-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:33:10 +01:00
Miri Korenblit c0d82ba961 wifi: mac80211: don't iterate not running interfaces
for_each_chanctx_user_* was introdcued as a replacement for
for_each_sdata_link, which visits also other chanctx users that are not
link.
for_each_sdata_link skips not running interfaces, do the same for
for_each_chanctx_user_*

Fixes: 1ce954c98b ("wifi: mac80211: add and use chanctx usage iteration")
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20260107143736.55c084e2a976.I38b7b904a135dadca339321923b501b2c2c5c8c0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:33:10 +01:00
Eric Dumazet 21cbf883d0 wifi: avoid kernel-infoleak from struct iw_point
struct iw_point has a 32bit hole on 64bit arches.

struct iw_point {
  void __user   *pointer;       /* Pointer to the data  (in user space) */
  __u16         length;         /* number of fields or size in bytes */
  __u16         flags;          /* Optional params */
};

Make sure to zero the structure to avoid disclosing 32bits of kernel data
to user space.

Fixes: 87de87d5e4 ("wext: Dispatch and handle compat ioctls entirely in net/wireless/wext.c")
Reported-by: syzbot+bfc7323743ca6dbcc3d3@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695f83f3.050a0220.1c677c.0392.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260108101927.857582-1-edumazet@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-08 13:33:05 +01:00
Shivani Gupta adb25a46dc net/sched: act_api: avoid dereferencing ERR_PTR in tcf_idrinfo_destroy
syzbot reported a crash in tc_act_in_hw() during netns teardown where
tcf_idrinfo_destroy() passed an ERR_PTR(-EBUSY) value as a tc_action
pointer, leading to an invalid dereference.

Guard against ERR_PTR entries when iterating the action IDR so teardown
does not call tc_act_in_hw() on an error pointer.

Fixes: 84a7d6797e ("net/sched: acp_api: no longer acquire RTNL in tc_action_net_exit()")
Link: https://syzkaller.appspot.com/bug?extid=8f1c492ffa4644ff3826
Reported-by: syzbot+8f1c492ffa4644ff3826@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8f1c492ffa4644ff3826
Signed-off-by: Shivani Gupta <shivani07g@gmail.com>
Link: https://patch.msgid.link/20260105005905.243423-1-shivani07g@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:27:18 -08:00
Eric Dumazet e5c8eda39a udp: call skb_orphan() before skb_attempt_defer_free()
Standard UDP receive path does not use skb->destructor.

But skmsg layer does use it, since it calls skb_set_owner_sk_safe()
from udp_read_skb().

This then triggers this warning in skb_attempt_defer_free():

    DEBUG_NET_WARN_ON_ONCE(skb->destructor);

We must call skb_orphan() to fix this issue.

Fixes: 6471658dc6 ("udp: use skb_attempt_defer_free()")
Reported-by: syzbot+3e68572cf2286ce5ebe9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/695b83bd.050a0220.1c9965.002b.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260105093630.1976085-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-06 17:05:17 -08:00
Toke Høiland-Jørgensen e558cca217 bpf, test_run: Subtract size of xdp_frame from allowed metadata size
The xdp_frame structure takes up part of the XDP frame headroom,
limiting the size of the metadata. However, in bpf_test_run, we don't
take this into account, which makes it possible for userspace to supply
a metadata size that is too large (taking up the entire headroom).

If userspace supplies such a large metadata size in live packet mode,
the xdp_update_frame_from_buff() call in xdp_test_run_init_page() call
will fail, after which packet transmission proceeds with an
uninitialised frame structure, leading to the usual Bad Stuff.

The commit in the Fixes tag fixed a related bug where the second check
in xdp_update_frame_from_buff() could fail, but did not add any
additional constraints on the metadata size. Complete the fix by adding
an additional check on the metadata size. Reorder the checks slightly to
make the logic clearer and add a comment.

Link: https://lore.kernel.org/r/fa2be179-bad7-4ee3-8668-4903d1853461@hust.edu.cn
Fixes: b6f1f780b3 ("bpf, test_run: Fix packet size check for live packet mode")
Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reviewed-by: Amery Hung <ameryhung@gmail.com>
Link: https://lore.kernel.org/r/20260105114747.1358750-1-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-06 11:41:11 -08:00
Mohammad Heib 238e03d046 net: fix memory leak in skb_segment_list for GRO packets
When skb_segment_list() is called during packet forwarding, it handles
packets that were aggregated by the GRO engine.

Historically, the segmentation logic in skb_segment_list assumes that
individual segments are split from a parent SKB and may need to carry
their own socket memory accounting. Accordingly, the code transfers
truesize from the parent to the newly created segments.

Prior to commit ed4cccef64 ("gro: fix ownership transfer"), this
truesize subtraction in skb_segment_list() was valid because fragments
still carry a reference to the original socket.

However, commit ed4cccef64 ("gro: fix ownership transfer") changed
this behavior by ensuring that fraglist entries are explicitly
orphaned (skb->sk = NULL) to prevent illegal orphaning later in the
stack. This change meant that the entire socket memory charge remained
with the head SKB, but the corresponding accounting logic in
skb_segment_list() was never updated.

As a result, the current code unconditionally adds each fragment's
truesize to delta_truesize and subtracts it from the parent SKB. Since
the fragments are no longer charged to the socket, this subtraction
results in an effective under-count of memory when the head is freed.
This causes sk_wmem_alloc to remain non-zero, preventing socket
destruction and leading to a persistent memory leak.

The leak can be observed via KMEMLEAK when tearing down the networking
environment:

unreferenced object 0xffff8881e6eb9100 (size 2048):
  comm "ping", pid 6720, jiffies 4295492526
  backtrace:
    kmem_cache_alloc_noprof+0x5c6/0x800
    sk_prot_alloc+0x5b/0x220
    sk_alloc+0x35/0xa00
    inet6_create.part.0+0x303/0x10d0
    __sock_create+0x248/0x640
    __sys_socket+0x11b/0x1d0

Since skb_segment_list() is exclusively used for SKB_GSO_FRAGLIST
packets constructed by GRO, the truesize adjustment is removed.

The call to skb_release_head_state() must be preserved. As documented in
commit cf673ed0e0 ("net: fix fraglist segmentation reference count
leak"), it is still required to correctly drop references to SKB
extensions that may be overwritten during __copy_skb_header().

Fixes: ed4cccef64 ("gro: fix ownership transfer")
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260104213101.352887-1-mheib@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 17:01:28 -08:00
Jamal Hadi Salim 9892353726 net/sched: act_mirred: Fix leak when redirecting to self on egress
Whenever a mirred redirect to self on egress happens, mirred allocates a
new skb (skb_to_send). The loop to self check was done after that
allocation, but was not freeing the newly allocated skb, causing a leak.

Fix this by moving the if-statement to before the allocation of the new
skb.

The issue was found by running the accompanying tdc test in 2/2
with config kmemleak enabled.
After a few minutes the kmemleak thread ran and reported the leak coming from
mirred.

Fixes: 1d856251a0 ("net/sched: act_mirred: fix loop detection")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260101135608.253079-2-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:23:42 -08:00
Michal Luczaj ce5e612dd4 vsock: Make accept()ed sockets use custom setsockopt()
SO_ZEROCOPY handling in vsock_connectible_setsockopt() does not get called
on accept()ed sockets due to a missing flag. Flip it.

Fixes: e0718bd82e ("vsock: enable setting SO_ZEROCOPY")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://patch.msgid.link/20251229-vsock-child-sock-custom-sockopt-v2-1-64778d6c4f88@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-05 16:14:50 -08:00
Ilya Dryomov c0fe2994f9 libceph: make calc_target() set t->paused, not just clear it
Currently calc_target() clears t->paused if the request shouldn't be
paused anymore, but doesn't ever set t->paused even though it's able to
determine when the request should be paused.  Setting t->paused is left
to __submit_request() which is fine for regular requests but doesn't
work for linger requests -- since __submit_request() doesn't operate
on linger requests, there is nowhere for lreq->t.paused to be set.
One consequence of this is that watches don't get reestablished on
paused -> unpaused transitions in cases where requests have been paused
long enough for the (paused) unwatch request to time out and for the
subsequent (re)watch request to enter the paused state.  On top of the
watch not getting reestablished, rbd_reregister_watch() gets stuck with
rbd_dev->watch_mutex held:

  rbd_register_watch
    __rbd_register_watch
      ceph_osdc_watch
        linger_reg_commit_wait

It's waiting for lreq->reg_commit_wait to be completed, but for that to
happen the respective request needs to end up on need_resend_linger list
and be kicked when requests are unpaused.  There is no chance for that
if the request in question is never marked paused in the first place.

The fact that rbd_dev->watch_mutex remains taken out forever then
prevents the image from getting unmapped -- "rbd unmap" would inevitably
hang in D state on an attempt to grab the mutex.

Cc: stable@vger.kernel.org
Reported-by: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
2026-01-06 00:39:43 +01:00
Sam Edwards 11194b416e libceph: reset sparse-read state in osd_fault()
When a fault occurs, the connection is abandoned, reestablished, and any
pending operations are retried. The OSD client tracks the progress of a
sparse-read reply using a separate state machine, largely independent of
the messenger's state.

If a connection is lost mid-payload or the sparse-read state machine
returns an error, the sparse-read state is not reset. The OSD client
will then interpret the beginning of a new reply as the continuation of
the old one. If this makes the sparse-read machinery enter a failure
state, it may never recover, producing loops like:

  libceph:  [0] got 0 extents
  libceph: data len 142248331 != extent len 0
  libceph: osd0 (1)...:6801 socket error on read
  libceph: data len 142248331 != extent len 0
  libceph: osd0 (1)...:6801 socket error on read

Therefore, reset the sparse-read state in osd_fault(), ensuring retries
start from a clean state.

Cc: stable@vger.kernel.org
Fixes: f628d79997 ("libceph: add sparse read support to OSD client")
Signed-off-by: Sam Edwards <CFSworks@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2026-01-05 22:46:43 +01:00
Ilya Dryomov e84b48d31b libceph: return the handler error from mon_handle_auth_done()
Currently any error from ceph_auth_handle_reply_done() is propagated
via finish_auth() but isn't returned from mon_handle_auth_done().  This
results in higher layers learning that (despite the monitor considering
us to be successfully authenticated) something went wrong in the
authentication phase and reacting accordingly, but msgr2 still trying
to proceed with establishing the session in the background.  In the
case of secure mode this can trigger a WARN in setup_crypto() and later
lead to a NULL pointer dereference inside of prepare_auth_signature().

Cc: stable@vger.kernel.org
Fixes: cd1a677cad ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
2026-01-05 22:46:43 +01:00