The MANA hardware supports a maximum of 30 scatter-gather entries (SGEs)
per TX WQE. Exceeding this limit can cause TX failures.
Add ndo_features_check() callback to validate SKB layout before
transmission. For GSO SKBs that would exceed the hardware SGE limit, clear
NETIF_F_GSO_MASK to enforce software segmentation in the stack.
Add a fallback in mana_start_xmit() to linearize non-GSO SKBs that still
exceed the SGE limit.
Also, Add ethtool counter for SKBs linearized
Co-developed-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1763464269-10431-2-git-send-email-gargaditya@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Report standard counter stats->rx_missed_errors
using hc_rx_discards_no_wqe from the hardware.
Add a global workqueue to periodically run
mana_query_gf_stats every 2 seconds to get the latest
info in eth_stats and define a driver capability flag
to notify hardware of the periodic queries.
To avoid repeated failures and log flooding, the workqueue
is not rescheduled if mana_query_gf_stats fails on HWC timeout
error and the stats are reset to 0. Other errors are transient
which will not need a VF reset for recovery.
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1763120599-6331-3-git-send-email-ernis@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move hardware counter (HC) statistics from mana_port_context to
mana_context to enable sharing stats across multiple network ports
on the same MANA VF. Previously, each network port queried
hardware counters independently using MANA_QUERY_GF_STAT command
(GF = Generic Function stats from GDMA hardware), resulting in
redundant queries when multiple ports existed on the same device.
Isolate hardware counter stats by introducing mana_ethtool_hc_stats
in mana_context and update the code to ensure all stats are properly
reported via ethtool -S <interface>, maintaining consistency with
previous behavior.
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1763120599-6331-2-git-send-email-ernis@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit bf40785fa4 ("sctp: Use HMAC-SHA1 and HMAC-SHA256 library for chunk
authentication") removed the implementation but leave declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Link: https://patch.msgid.link/20251113114501.32905-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since 93e86b3bc8 ("net: dsa: Remove legacy probing support")
this struct has no user any longer.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/4053a98f-052f-4dc1-a3d4-ed9b3d3cc7cb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cross-merge networking fixes after downstream PR (net-6.18-rc6).
No conflicts, adjacent changes in:
drivers/net/phy/micrel.c
96a9178a29 ("net: phy: micrel: lan8814 fix reset of the QSGMII interface")
61b7ade9ba ("net: phy: micrel: Add support for non PTP SKUs for lan8814")
and a trivial one in tools/testing/selftests/drivers/net/Makefile.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- split ieee80211.h file, it's way too big
- mac80211: initial chanctx work towards NAN
- mac80211: MU-MIMO sniffer improvements
- ath12k: statistics improvements
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmkUdDsACgkQ10qiO8sP
aABeDhAAnBtrADnRQYo5BMbV6/7JsJL3GV6Zj5/Nrq6si8TE8vquf4kevUbISpgk
Mi2tQ0UxFTW6LckT3LTOxzQSZzaPANSPO9AQm/q9/BLtAsdPpgc8yHQTkZlkiatF
dS+WZFSZpF8hisKmkYCDvnggaqipnJUvnwtIY6xxSZftn3J4h6B9Wp+XjyCLtDxC
2DvztdQJ3oqYBFsSpk6J0gA0/4lF+jlVmZ+DPpVSYlCRJivFAqcpwdx3vdh4Pib3
dUNgh/MJuPv01RIA783TCsHBnOKPxuD5OfusQzkXdj33yX2bcPL6M2s57FgnFf8q
l4B9+R/Q7/8ohp+qMOd4S+SteFa/7WlbJ+5UjJ71y7xScBKOaRZMf6wKKqSZozP1
zOB4AxlC7COrL3tsljC0Vun9CgBL4Ov/XBe7G2WTOUIrZ2KOU328/3atndbLeVJg
knwsiNdKJCJJpTkO3zHzaYfDhDghSaINj1fl67hZV7s3Jj7u4lAD8HfV/9CKoMRd
X1ltgB84u/nFc2aL2fGQbQg7NJLVqIvyx6iyss6K58nNwQMf0ZFUJOihgkSMCRK3
t4qQrXVdorWnRvioA/roHICkGBZdZw53Jz+0EltRxsjTfmzkki6EbXeGhWtRzlSo
5Bx1L4vtK7143nMlD2H/JuwDopD1fOxAM8L3sTlqJE8nu/3plPA=
=N5Tm
-----END PGP SIGNATURE-----
Merge tag 'wireless-next-2025-11-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
More -next material, notably:
- split ieee80211.h file, it's way too big
- mac80211: initial chanctx work towards NAN
- mac80211: MU-MIMO sniffer improvements
- ath12k: statistics improvements
* tag 'wireless-next-2025-11-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (26 commits)
wifi: cw1200: Fix potential memory leak in cw1200_bh_rx_helper()
wifi: mac80211: make monitor link info check more specific
wifi: mac80211: track MU-MIMO configuration on disabled interfaces
wifi: cfg80211/mac80211: Add fallback mechanism for INDOOR_SP connection
wifi: cfg80211/mac80211: clean up duplicate ap_power handling
wifi: cfg80211: use a C99 initializer in wiphy_register
wifi: cfg80211: fix doc of struct key_params
wifi: mac80211: remove unnecessary vlan NULL check
wifi: mac80211: pass frame type to element parsing
wifi: mac80211: remove "disabling VHT" message
wifi: mac80211: add and use chanctx usage iteration
wifi: mac80211: simplify ieee80211_recalc_chanctx_min_def() API
wifi: mac80211: remove chanctx to link back-references
wifi: mac80211: make link iteration safe for 'break'
wifi: mac80211: fix EHT typo
wifi: cfg80211: fix EHT typo
wifi: ieee80211: split NAN definitions out
wifi: ieee80211: split P2P definitions out
wifi: ieee80211: split S1G definitions out
wifi: ieee80211: split EHT definitions out
...
====================
Link: https://patch.msgid.link/20251112115126.16223-4-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This handles PA Sync Lost event which previously was assumed to be
handled with BIG Sync Lost but their lifetime are not the same thus why
there are 2 different events to inform when each sync is lost.
Fixes: b2a5f2e1c1 ("Bluetooth: hci_event: Add support for handling LE BIG Sync Lost event")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Implement fallback to LPI mode when SP mode is not permitted
by regulatory constraints for INDOOR_SP connections.
Limit fallback mechanism to client mode.
Signed-off-by: Pagadala Yesu Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20251110140806.8b43201a34ae.I37fc7bb5892eb9d044d619802e8f2095fde6b296@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Move duplicated ap_power type handling code to an inline
function in cfg80211.
Signed-off-by: Pagadala Yesu Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20251110140806.959948da1cb5.I893b5168329fb3232f249c182a35c99804112da6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Since Eric proposed an idea about adding indirect call wrappers for
UDP and managed to see a huge improvement[1], the same situation can
also be applied in xsk scenario.
This patch adds an indirect call for xsk and helps current copy mode
improve the performance by around 1% stably which was observed with
IXGBE at 10Gb/sec loaded. If the throughput grows, the positive effect
will be magnified. I applied this patch on top of batch xmit series[2],
and was able to see <5% improvement from our internal application
which is a little bit unstable though.
Use INDIRECT wrappers to keep xsk_destruct_skb static as it used to
be when the mitigation config is off.
Be aware of the freeing path that can be very hot since the frequency
can reach around 2,000,000 times per second with the xdpsock test.
[1]: https://lore.kernel.org/netdev/20251006193103.2684156-2-edumazet@google.com/
[2]: https://lore.kernel.org/all/20251021131209.41491-1-kerneljasonxing@gmail.com/
Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20251031103328.95468-1-kerneljasonxing@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCaRJJFwAKCRAraS+Z+3y5
E8eZAQDWQ3D76HOlLK8tBAQ8aSpxwsfr7fpheiMSCEI5r7O5sQEA5LuHrBvEb/Cr
zfU7DxhEQEoVeJMTSia2hEJKWhoNpQ0=
=n2Ap
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Martin KaFai Lau says:
====================
pull-request: bpf-next 2025-11-10
We've added 19 non-merge commits during the last 3 day(s) which contain
a total of 22 files changed, 1345 insertions(+), 197 deletions(-).
The main changes are:
1) Preserve skb metadata after a TC BPF program has changed the skb,
from Jakub Sitnicki.
This allows a TC program at the end of a TC filter chain to still see
the skb metadata, even if another TC program at the front of the chain
has changed the skb using BPF helpers.
2) Initial af_smc bpf_struct_ops support to control the smc specific
syn/synack options, from D. Wythe.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
bpf/selftests: Add selftest for bpf_smc_hs_ctrl
net/smc: bpf: Introduce generic hook for handshake flow
bpf: Export necessary symbols for modules with struct_ops
selftests/bpf: Cover skb metadata access after bpf_skb_change_proto
selftests/bpf: Cover skb metadata access after change_head/tail helper
selftests/bpf: Cover skb metadata access after bpf_skb_adjust_room
selftests/bpf: Cover skb metadata access after vlan push/pop helper
selftests/bpf: Expect unclone to preserve skb metadata
selftests/bpf: Dump skb metadata on verification failure
selftests/bpf: Verify skb metadata in BPF instead of userspace
bpf: Make bpf_skb_change_head helper metadata-safe
bpf: Make bpf_skb_change_proto helper metadata-safe
bpf: Make bpf_skb_adjust_room metadata-safe
bpf: Make bpf_skb_vlan_push helper metadata-safe
bpf: Make bpf_skb_vlan_pop helper metadata-safe
vlan: Make vlan_remove_tag return nothing
bpf: Unclone skb head on bpf_dynptr_write to skb metadata
net: Preserve metadata on pskb_expand_head
net: Helper to move packet data and metadata after skb_push/pull
====================
Link: https://patch.msgid.link/20251110232427.3929291-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The introduction of IPPROTO_SMC enables eBPF programs to determine
whether to use SMC based on the context of socket creation, such as
network namespaces, PID and comm name, etc.
As a subsequent enhancement, to introduce a new generic hook that
allows decisions on whether to use SMC or not at runtime, including
but not limited to local/remote IP address or ports.
User can write their own implememtion via bpf_struct_ops now to choose
whether to use SMC or not before TCP 3rd handshake to be comleted.
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Link: https://patch.msgid.link/20251107035632.115950-3-alibuda@linux.alibaba.com
The seq in struct key_params is for many ciphers, including CCMP, GCMP,
CMAC, GMAC. In addition to get_key(), it is also used when setting keys.
Signed-off-by: Chien Wong <m@xv97.com>
Link: https://patch.msgid.link/20251107142332.181308-1-m@xv97.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-11-06 (i40, ice, iavf)
Mohammad Heib introduces a new devlink parameter, max_mac_per_vf, for
controlling the maximum number of MAC address filters allowed by a VF. This
allows administrators to control the VF behavior in a more nuanced manner.
Aleksandr and Przemek add support for Receive Side Scaling of GTP to iAVF
for VFs running on E800 series ice hardware. This improves performance and
scalability for virtualized network functions in 5G and LTE deployments.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
iavf: add RSS support for GTP protocol via ethtool
ice: Extend PTYPE bitmap coverage for GTP encapsulated flows
ice: improve TCAM priority handling for RSS profiles
ice: implement GTP RSS context tracking and configuration
ice: add virtchnl definitions and static data for GTP RSS
ice: add flow parsing for GTP and new protocol field support
i40e: support generic devlink param "max_mac_per_vf"
devlink: Add new "max_mac_per_vf" generic device param
====================
Link: https://patch.msgid.link/20251106225321.1609605-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Provide a driver api for reporting device statistics required by the
"Implementation Requirements" section of the PSP Architecture
Specification. Use a warning to ensure drivers report stats required
by the spec.
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20251106002608.1578518-4-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Track and report stats common to all psp devices from the core. A
'stale-event' is when the core marks the rx state of an active
psp_assoc as incapable of authenticating psp encapsulated data.
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20251106002608.1578518-2-daniel.zahka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
TCP SACK compression has been added in 2018 in commit
5d9f4262b7 ("tcp: add SACK compression").
It is working great for WAN flows (with large RTT).
Wifi in particular gets a significant boost _when_ ACK are suppressed.
Add a new sysctl so that we can tune the very conservative 5 % value
that has been used so far in this formula, so that small RTT flows
can benefit from this feature.
delay = min ( 5 % of RTT, 1 ms)
This patch adds new tcp_comp_sack_rtt_percent sysctl
to ease experiments and tuning.
Given that we cap the delay to 1ms (tcp_comp_sack_delay_ns sysctl),
set the default value to 33 %.
Quoting Neal Cardwell ( https://lore.kernel.org/netdev/CADVnQymZ1tFnEA1Q=vtECs0=Db7zHQ8=+WCQtnhHFVbEOzjVnQ@mail.gmail.com/ )
The rationale for 33% is basically to try to facilitate pipelining,
where there are always at least 3 ACKs and 3 GSO/TSO skbs per SRTT, so
that the path can maintain a budget for 3 full-sized GSO/TSO skbs "in
flight" at all times:
+ 1 skb in the qdisc waiting to be sent by the NIC next
+ 1 skb being sent by the NIC (being serialized by the NIC out onto the wire)
+ 1 skb being received and aggregated by the receiver machine's
aggregation mechanism (some combination of LRO, GRO, and sack
compression)
Note that this is basically the same magic number (3) and the same
rationales as:
(a) tcp_tso_should_defer() ensuring that we defer sending data for no
longer than cwnd/tcp_tso_win_divisor (where tcp_tso_win_divisor = 3),
and
(b) bbr_quantization_budget() ensuring that cwnd is at least 3 GSO/TSO
skbs to maintain pipelining and full throughput at low RTTs
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20251106115236.3450026-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since commit 54a378f434 ("tcp: add the ability to control
max RTO"), TFO SYN+ACK RTO is capped by the TFO full sk's
inet_csk(sk)->icsk_rto_max.
The value is inherited from the parent listener.
Let's apply the same cap to non-TFO SYN+ACK.
Note that req->rsk_listener is always non-NULL when we call
tcp_reqsk_timeout() in reqsk_timer_handler() or tcp_check_req().
It could be NULL for SYN cookie req, but we do not use
req->timeout then.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251106003357.273403-6-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
reqsk_timeout() is always called with @timeout being TCP_RTO_MAX.
Let's remove the arg.
As a prep for the next patch, reqsk_timeout() is moved to tcp.h
and renamed to tcp_reqsk_timeout().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251106003357.273403-5-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
inet_csk_reqsk_queue_hash_add() is no longer shared by DCCP.
We do not need to pass req->timeout down to reqsk_queue_hash_req().
Let's move tcp_timeout_init() from tcp_conn_request() to
reqsk_queue_hash_req().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251106003357.273403-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since DCCP has been removed, we do not need to use
request_sock_ops.syn_ack_timeout().
Let's call tcp_syn_ack_timeout() directly.
Now other function pointers of request_sock_ops are
protocol-dependent.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20251106003357.273403-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move NETDEV_XDP_ACT_ZC into xdp_sock_drv.h header such that external code
can reuse it, and rename it into more generic NETDEV_XDP_ACT_XSK.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: David Wei <dw@davidwei.uk>
Signed-off-by: David Wei <dw@davidwei.uk>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20251031212103.310683-7-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for a new DSA tagging protocol driver for the MaxLinear
GSW1xx switch family. The GSW1xx switches use a proprietary 8-byte
special tag inserted between the source MAC address and the EtherType
field to indicate the source and destination ports for frames
traversing the CPU port.
Implement the tag handling logic to insert the special tag on transmit
and parse it on receive.
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Tested-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://patch.msgid.link/0e973ebfd9433c30c96f50670da9e9449a0d98f2.1762170107.git.daniel@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a new device generic parameter to controls the maximum
number of MAC filters allowed per VF.
For example, to limit a VF to 3 MAC addresses:
$ devlink dev param set pci/0000:3b:00.0 name max_mac_per_vf \
value 3 \
cmode runtime
Signed-off-by: Mohammad Heib <mheib@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
- Introduce __nocfi_generic for arm32 Clang (Nathan Chancellor)
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRSPkdeREjth1dHnSE2KwveOeQkuwUCaQz8RwAKCRA2KwveOeQk
u51dAP9YhHIttRev7rdRwDWwrlhO+i2fps0vq8G9S6keSndr+AEAv8BtQexexPhI
1oSNLeB/kVUVp5dQWjni48IgMxaG4A8=
=zFGT
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
"This is a work-around for a (now fixed) corner case in the arm32 build
with Clang KCFI enabled.
- Introduce __nocfi_generic for arm32 Clang (Nathan Chancellor)"
* tag 'hardening-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
libeth: xdp: Disable generic kCFI pass for libeth_xdp_tx_xmit_bulk()
ARM: Select ARCH_USES_CFI_GENERIC_LLVM_PASS
compiler_types: Introduce __nocfi_generic
Cross-merge networking fixes after downstream PR (net-6.18-rc5).
Conflicts:
drivers/net/wireless/ath/ath12k/mac.c
9222582ec5 ("Revert "wifi: ath12k: Fix missing station power save configuration"")
6917e268c4 ("wifi: ath12k: Defer vdev bring-up until CSA finalize to avoid stale beacon")
https://lore.kernel.org/11cece9f7e36c12efd732baa5718239b1bf8c950.camel@sipsolutions.net
Adjacent changes:
drivers/net/ethernet/intel/Kconfig
b1d16f7c00 ("libie: depend on DEBUG_FS when building LIBIE_FWLOG")
93f53db9f9 ("ice: switch to Page Pool")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Convert struct proto pre_connect(), connect(), bind(), and bind_add()
callback function prototypes from struct sockaddr to struct sockaddr_unsized.
This does not change per-implementation use of sockaddr for passing around
an arbitrarily sized sockaddr struct. Those will be addressed in future
patches.
Additionally removes the no longer referenced struct sockaddr from
include/net/inet_common.h.
No binary changes expected.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-5-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Update all struct proto_ops connect() callback function prototypes from
"struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the
compiler about object sizes. Calls into struct proto handlers gain casts
that will be removed in the struct proto conversion patch.
No binary changes expected.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-3-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Update all struct proto_ops bind() callback function prototypes from
"struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the
compiler about object sizes. Calls into struct proto handlers gain casts
that will be removed in the struct proto conversion patch.
No binary changes expected.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-2-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Split cq_lock into two smaller locks: cq_prod_lock and
cq_cached_prod_lock
- Avoid disabling/enabling interrupts in the hot xmit path
In either xsk_cq_cancel_locked() or xsk_cq_reserve_locked() function,
the race condition is only between multiple xsks sharing the same
pool. They are all in the process context rather than interrupt context,
so now the small lock named cq_cached_prod_lock can be used without
handling interrupts.
While cq_cached_prod_lock ensures the exclusive modification of
@cached_prod, cq_prod_lock in xsk_cq_submit_addr_locked() only cares
about @producer and corresponding @desc. Both of them don't necessarily
be consistent with @cached_prod protected by cq_cached_prod_lock.
That's the reason why the previous big lock can be split into two
smaller ones. Please note that SPSC rule is all about the global state
of producer and consumer that can affect both layers instead of local
or cached ones.
Frequently disabling and enabling interrupt are very time consuming
in some cases, especially in a per-descriptor granularity, which now
can be avoided after this optimization, even when the pool is shared by
multiple xsks.
With this patch, the performance number[1] could go from 1,872,565 pps
to 1,961,009 pps. It's a minor rise of around 5%.
[1]: taskset -c 1 ./xdpsock -i enp2s0f1 -q 0 -t -S -s 64
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20251030000646.18859-3-kerneljasonxing@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
MPLS (re)uses RTNL to protect net->mpls.platform_label,
but the lock does not need to be RTNL at all.
Let's protect net->mpls.platform_label with a dedicated
per-netns mutex.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20251029173344.2934622-13-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rcu_dereference_rtnl() does not clearly tell whether the caller
is under RCU or RTNL.
Let's add in6_dev_rcu() to make it easy to remove __in6_dev_get()
in the future.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20251029173344.2934622-5-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Handle the NIC hardware link state events received from the HW
channel, then set the proper link state accordingly.
And, add a feature bit, GDMA_DRV_CAP_FLAG_1_HW_VPORT_LINK_AWARE,
to inform the NIC hardware this handler exists.
Our MANA NIC only sends out the link state down/up messages
when we need to let the VM rerun DHCP client and change IP
address. So, add netif_carrier_on() in the probe(), let the NIC
show the right initial state in /sys/class/net/ethX/operstate.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1761770601-16920-1-git-send-email-haiyangz@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- ath10k: revert a patch that had caused issues on some devices
- cfg80211/mac80211: use hrtimers for some things where the
precise timing matters
- zd1211rw: fix a long-standing potential leak
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpeA8sTs3M8SN2hR410qiO8sPaAAFAmkDQiwACgkQ10qiO8sP
aAD8mg//RrRLyJIrVzC/c5TtiVT/ziwyMPMe6l6aRBSN+H/15hX+QfsDt29x+BgX
CyN9io7Tix3Szshos3XPSrWjKrRy2JYDDGQbVW1npxzAs/E4O5rhETm3SWymHkJh
geF3W9NDbF7OQIzm6jBp7/i5Im97LU7Y6c9Pd6FxPucKBvFahvxqyDvrD2MAa6/j
RDr9de7l7X02xymU8Z2h4qNjXQ1fbLWio1UMr+xPxPvPmXsR4aYfGb3xESryHS1y
a2Xnui1Apzv0Bxf0j0CrsqPoZmSSMhhfB5NMclTsZluPZySRmFygoPt5fjoeaw/Q
8Ga4nkzUW2ZMZ9CJiB8qFX8dSQ+k658AyVY0XAQkZFey4e0kPwwOG/f1xBoAeIW5
VcvLbxIZ2Ao3K2qAg5f2bfPyjcp1t+I4iFb/WKzTzX5EcuHcxB5IfoqYfIAkLfgP
ada9iCJzrVv0+aBCbpKvDm+BAnV/H1sQVRtPQKH09+P8Dylj/klVRYemBMzG9zH2
zSrkllJrESKUI7Y0fDszt4G9HmWuX9+y5eKhHrLwYkUutqIJsdkW9hbe+EQGUIke
bOL5nH5hhQDaOa5hpoTQRUKYZb+4tbttKCa9Bii+1M1NgPlQq22Ud4wBdGWqvfqF
dFHx7vmbVAqrTl8RzXtpLj/UVbgVNhGJAIbD6IAc3DyZC4BvZJs=
=3oN9
-----END PGP SIGNATURE-----
Merge tag 'wireless-2025-10-30' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Couple of new fixes:
- ath10k: revert a patch that had caused issues on some devices
- cfg80211/mac80211: use hrtimers for some things where the
precise timing matters
- zd1211rw: fix a long-standing potential leak
* tag 'wireless-2025-10-30' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: zd1211rw: fix potential memory leak in __zd_usb_enable_rx()
wifi: mac80211: use wiphy_hrtimer_work for csa.switch_work
wifi: mac80211: use wiphy_hrtimer_work for ml_reconf_work
wifi: mac80211: use wiphy_hrtimer_work for ttlm_work
wifi: cfg80211: add an hrtimer based delayed work item
Revert "wifi: ath10k: avoid unnecessary wait for service ready message"
====================
Link: https://patch.msgid.link/20251030104919.12871-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the parse_adv_monitor_pattern() function, the value of
the 'length' variable is currently limited to HCI_MAX_EXT_AD_LENGTH(251).
The size of the 'value' array in the mgmt_adv_pattern structure is 31.
If the value of 'pattern[i].length' is set in the user space
and exceeds 31, the 'patterns[i].value' array can be accessed
out of bound when copied.
Increasing the size of the 'value' array in
the 'mgmt_adv_pattern' structure will break the userspace.
Considering this, and to avoid OOB access revert the limits for 'offset'
and 'length' back to the value of HCI_MAX_AD_LENGTH.
Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE.
Fixes: db08722fc7 ("Bluetooth: hci_core: Fix missing instances using HCI_MAX_AD_LENGTH")
Cc: stable@vger.kernel.org
Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
-----BEGIN PGP SIGNATURE-----
iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmkDUZINHGZ3QHN0cmxl
bi5kZQAKCRBwkajZrV/2AK4uD/9NS2Q+uqGlyAnsKrmetRBuhCMRdJaHnrniD5+N
xnsPrVpyL3EdioOehoXZNkbJ2JXTz/xl4bqZ4Zx4Z6vCVpkws2et5aU9lvm39n9s
yIhSkthuI0buRYrTzsHO21bHBPQU4gpqOXvbcFotm3QOj2wF2NaNthrDhyWUjtiY
tPmkSEWu/mjbjMEt1VU3ec3rb3fbe4i6tMW+Y4bTwH7dAopk7Rhw26NexGVUY8fO
+BzenYi3XTktOrevYnlAJXmsavJjFkSsRffe76PtewNNY2WTCY+st341IXmOVASL
OzL+9qvY3xIyevClUKx/WEKPGZae+V0Buxq6U6SijDLgbDF19A7R6zMId/ILsexm
zuAnAdhXdP1gRl4cHAs/AQGD72gqXpVaIRVJE7NsRz3+rvOXzvYRvBhIVQAdvg6v
pnR8PQg30jHPmPPRruarwH6O4x4jGptX3ASQtIf9Fii37vbDhZmJIt3j2OCaEuFP
09ImQSwjvizD+aXvf+GtsWD/ZS2PNhE6m0jpJKI26grngODSpKPfNA54uwHmzXDM
+1Zy+b6AxFjLi5HoAu+Cq7a+ktFKYVCRsunfovzVdvM+63AXoP9e2aRqWnoJUxvb
sB4eAccoDE1W0CDDktz36bVt92ZTz4zmPO7HoT/oNyvkH8ye6tYpOPfXVWrUirkQ
MNFqmw==
=woNE
-----END PGP SIGNATURE-----
Merge tag 'nf-next-25-10-30' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Florian Westphal says:
====================
netfilter: updates for net-next
1) Convert nf_tables 'nft_set_iter' usage to use C99 struct
initialization, from Fernando Fernandez Mancera.
2) Disallow nf_conntrack_max=0. This was an (undocumented)
historic inheritance from ip_conntrack (ipv4 only nf_conntrack
predecessor). Doing so will simplify future changes to make
this pernet-tuneable.
3) Fix a typo in conntrack.h comment, from Weibiao Tu.
* tag 'nf-next-25-10-30' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: fix typo in nf_conntrack_l4proto.h comment
netfilter: conntrack: disable 0 value for conntrack_max setting
netfilter: nf_tables: use C99 struct initializer for nft_set_iter
====================
Link: https://patch.msgid.link/20251030121954.29175-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Think SMC_WR_BUF_CNT_SEND := SMC_WR_BUF_CNT used in send context and
SMC_WR_BUF_CNT_RECV := 3 * SMC_WR_BUF_CNT used in recv context. Those
get replaced with lgr->max_send_wr and lgr->max_recv_wr respective.
Please note that although with the default sysctl values
qp_attr.cap.max_send_wr == qp_attr.cap.max_recv_wr is maintained but
can not be assumed to be generally true any more. I see no downside to
that, but my confidence level is rather modest.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Sidraya Jayagond <sidraya@linux.ibm.com>
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Tested-by: Mahanta Jambigi <mjambigi@linux.ibm.com>
Link: https://patch.msgid.link/20251027224856.2970019-2-pasic@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In the comment for nf_conntrack_l4proto.h, the word "nfnetink" was
incorrectly spelled. It has been corrected to "nfnetlink".
Fixes a typo to enhance readability and ensure consistency.
Signed-off-by: caivive (Weibiao Tu) <cavivie@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
mac80211 already reports some basic information in the radiotap header
with the known fields declared by the driver. However, drivers may want
to report more accurate information and in that case the full VHT
radiotap structure needs to be provided.
Add a new RX_FLAG_RADIOTAP_VHT which is set when the VHT information
should be pulled from the skb. Update the code to fill in the VHT fields
to only do so when requested by the driver or if the information has not
yet been set. This way the driver can fully control the information if
it chooses so.
Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20251027142118.0bad1c307a21.I2cf285c20a822698039603f2af00ed9c548f2ee0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>