linux/drivers
Dragos Tatulea 24cf78c738 net/mlx5e: SHAMPO, Switch to header memcpy
Previously the HW-GRO code was using a separate page_pool for the header
buffer. The pages of the header buffer were replenished via UMR. This
mechanism has some drawbacks:
- Reference counting on the page_pool page frags is not cheap.
- UMRs have HW overhead for updating and also for access. Especially for
  the KLM type which was previously used.
- UMR code for headers is complex.

This patch switches to using a static memory area (static MTT MKEY) for
the header buffer and does a header memcpy. This happens only once per
GRO session. The SKB is allocated from the per-cpu NAPI SKB cache.

Performance numbers for x86:
+---------------------------------------------------------+
| Test                | Baseline   | Header Copy | Change |
|---------------------+------------+-------------+--------|
| iperf3 oncpu        |  59.5 Gbps |  64.00 Gbps |   7 %  |
| iperf3 offcpu       | 102.5 Gbps | 104.20 Gbps |   2 %  |
| kperf oncpu         | 115.0 Gbps | 130.00 Gbps |  12 %  |
| XDP_DROP (skb mode) |   3.9 Mpps |   3.9 Mpps  |   0 %  |
+---------------------------------------------------------+

Notes on test:
- System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
- oncpu: NAPI and application running on same CPU
- offcpu: NAPI and application running on different CPUs
- MTU: 1500
- iperf3 tests are single stream, 60s with IPv6 (for slightly larger
  headers)
- kperf version [1]

[1] git://git.kernel.dk/kperf.git

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260204200345.1724098-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-05 18:36:06 -08:00
accel accel/amdxdna: Block running under a hypervisor 2025-12-15 13:00:03 -06:00
accessibility
acpi ACPI: PM: s2idle: Add module parameter for LPS0 constraints checking 2026-01-13 23:10:25 +01:00
amba soc: driver updates for 6.19 2025-12-05 17:29:04 -08:00
android rust_binder: remove spin_lock() in rust_shrink_free_page() 2025-12-29 11:34:16 +01:00
ata ata: libata: Print features also for ATAPI devices 2026-01-13 22:00:02 +09:00
atm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2026-01-08 11:38:33 -08:00
auxdisplay
base Driver core fixes for 6.19-rc7 2026-01-24 10:13:22 -08:00
bcma
block block-6.19-20260130 2026-01-30 13:18:32 -08:00
bluetooth Bluetooth: hci_qca: Enable HFP hardware offload for WCN6855 and WCN7850 2026-01-29 13:37:44 -05:00
bus bus: simple-pm-bus: Probe the Layerscape SCFG node 2026-01-27 16:33:32 +01:00
cache cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent 2025-11-21 18:42:02 +00:00
cdrom
cdx
char Char/Misc/IIO driver updates for 6.19-rc1 2025-12-06 18:34:24 -08:00
clk clk: Add devm_clk_bulk_get_optional_enable() helper 2026-01-21 18:57:07 -08:00
clocksource riscv: clocksource: Fix stimecmp update hazard on RV32 2026-01-14 17:42:46 -07:00
comedi comedi: dmm32at: serialize use of paged registers 2026-01-16 16:43:51 +01:00
connector
counter counter: 104-quad-8: Fix incorrect return value in IRQ handler 2025-12-22 20:03:23 +09:00
cpufreq CPUFreq fixes for 6.19 2026-01-27 14:40:29 +01:00
cpuidle soc: driver updates for 6.19 2025-12-05 17:29:04 -08:00
crypto crypto/ccp: Allow multiple streams on the same root bridge 2026-01-30 14:27:53 -08:00
cxl cxl: Check for invalid addresses returned from translation functions on errors 2026-01-13 08:30:40 -07:00
dax drivers/dax: add some missing kerneldoc comment fields for struct dev_dax 2026-01-14 22:16:26 -08:00
dca
devfreq PM / devfreq: Fix typo in DFSO_DOWNDIFFERENTIAL macro name 2025-11-26 13:58:59 +09:00
dibs dibs: Remove KMSG_COMPONENT macro 2025-11-27 18:11:43 -08:00
dio
dma dmaengine: apple-admac: Add "apple,t8103-admac" compatible 2026-01-11 22:12:49 +05:30
dma-buf VFIO updates for v6.19-rc1 2025-12-04 18:42:48 -08:00
dpll drivers: Add support for DPLL reference count tracking 2026-02-05 15:57:46 +01:00
edac EDAC/x38: Fix a resource leak in x38_probe1() 2026-01-04 08:35:39 +01:00
eisa
extcon
firewire firewire: core: fix race condition against transaction list 2026-01-29 08:03:55 +09:00
firmware mm: rename cpu_bitmap field to flexible_array 2026-01-19 12:30:00 -08:00
fpga fpga: altera-cvp: Use pci_find_vsec_capability() when probing FPGA device 2025-11-10 15:03:13 +08:00
fsi
fwctl
gnss gnss: ubx: add support for the safeboot gpio 2025-11-20 16:44:04 +01:00
gpib staging: gpib: Clean-up commented-out code 2025-11-26 14:28:19 +01:00
gpio gpiolib: acpi: Fix potential out-of-boundary left shift 2026-01-28 15:24:09 +01:00
gpu Rust fixes for v6.19 2026-01-30 16:15:59 -08:00
greybus greybus: gb-beagleplay: Fix timeout handling in bootloader functions 2025-11-26 14:40:59 +01:00
hid hid-for-linus-2026010801 2026-01-08 07:44:48 -08:00
hsi
hte
hv mshv: handle gpa intercepts for arm64 2026-01-15 07:29:14 +00:00
hwmon hwmon: (ltc4282): Fix reset_history file permissions 2025-12-19 08:44:22 -08:00
hwspinlock
hwtracing intel_th: rename error label 2026-01-16 16:42:41 +01:00
i2c i2c-host-fixes for v6.19-rc7 2026-01-24 12:56:53 +01:00
i3c i3c: adi: Fix confusing cleanup.h syntax 2025-12-12 23:59:39 +01:00
idle
iio iio: dac: ad3552r-hs: fix out-of-bound write in ad3552r_hs_write_data_source 2026-01-11 13:25:15 +00:00
infiniband net/mlx5: Fix 1600G link mode enum naming 2026-02-05 18:29:04 -08:00
input Input updates for v6.19-rc6 2026-01-25 09:42:25 -08:00
interconnect interconnect: debugfs: initialize src_node and dst_node to empty strings 2026-01-12 01:58:36 +02:00
iommu IOMMU Fixes for Linux v6.19-rc7 2026-01-31 09:40:13 -08:00
ipack
irqchip irqchip/ls-extirq: Convert to a platform driver to make it work again 2026-01-27 16:33:32 +01:00
isdn mISDN: annotate data-race around dev->work 2026-01-20 18:37:41 -08:00
leds leds: led-class: Only Add LED to leds_list when it is fully ready 2026-01-20 16:02:01 +00:00
macintosh soc: driver updates for 6.19 2025-12-05 17:29:04 -08:00
mailbox mailbox: th1520: fix clock imbalance on probe failure 2025-11-28 09:47:44 -06:00
mcb
md block-6.19-20260130 2026-01-30 13:18:32 -08:00
media [GIT PULL for v6.19-rc6] media fixes 2026-01-14 08:18:01 -08:00
memory soc: driver updates for 6.19 2025-12-05 17:29:04 -08:00
memstick
message scsi: message: fusion: Add WQ_PERCPU to alloc_workqueue() users 2025-11-12 21:28:26 -05:00
mfd MFD for v6.19 2025-12-04 15:18:33 -08:00
misc mei: trace: treat reg parameter as string 2026-01-16 16:43:47 +01:00
mmc Another fairly large set of changes, notably: 2026-01-29 19:17:43 -08:00
most most: usb: fix double free on late probe failure 2025-11-09 11:15:20 +09:00
mtd A single late MTD fix, which reverts a fix that turned out to be 2026-01-29 14:08:36 -08:00
mux mux: mmio: Fix IS_ERR() vs NULL check in probe() 2026-01-16 16:42:08 +01:00
net net/mlx5e: SHAMPO, Switch to header memcpy 2026-02-05 18:36:06 -08:00
nfc Revert "nfc/nci: Add the inconsistency check between the input data length and count" 2026-01-17 18:02:50 -08:00
ntb ntb: transport: Fix uninitialized mutex 2026-01-17 11:57:39 -05:00
nubus
nvdimm NVDIMM changes for 6.19 2025-12-06 09:32:25 -08:00
nvme block-6.19-20260130 2026-01-30 13:18:32 -08:00
nvmem Char/Misc/IIO driver updates for 6.19-rc1 2025-12-06 18:34:24 -08:00
of dma-mapping fixes for Linux 6.19 2026-01-30 13:15:04 -08:00
opp OPP: Initialize scope-based pointers inline 2025-10-23 11:58:05 +05:30
parisc parisc: Set valid bit in high byte of 64‑bit physical address 2025-12-19 13:56:17 +01:00
parport
pci tsm fixes for 6.19 2026-02-04 15:15:54 -08:00
pcmcia
peci Char/Misc/IIO driver updates for 6.19-rc1 2025-12-06 18:34:24 -08:00
perf arm64 updates for 6.19: 2025-12-02 17:03:55 -08:00
phy Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2026-01-22 20:14:36 -08:00
pinctrl gpio fixes for v6.19-rc8 2026-01-30 11:58:27 -08:00
platform platform/x86/intel/tpmi/plr: Make the file domain<n>/status writeable 2026-01-29 14:38:40 +02:00
pmdomain pmdomain providers: 2026-01-23 13:12:49 -08:00
pnp PNP: Fix ISAPNP to generate uevents to auto-load modules 2025-11-18 17:35:36 +01:00
power soc: driver updates for 6.19 2025-12-05 17:29:04 -08:00
powercap powercap: intel_rapl: Fix possible recursive lock warning 2025-12-17 17:24:28 +01:00
pps printk changes for 6.19 2025-12-03 12:42:36 -08:00
ps3
ptp drivers: Add support for DPLL reference count tracking 2026-02-05 15:57:46 +01:00
pwm pwm: max7360: Populate missing .sizeof_wfhw in max7360_pwm_ops 2026-01-19 18:31:05 +01:00
rapidio
ras EFI updates for v6.19: 2025-12-04 17:10:08 -08:00
regulator regulator: fp9931: Add missing memory allocation check 2026-01-19 14:42:02 +00:00
remoteproc remoteproc: qcom_q6v5_wcss: use optional reset for wcss_q6_bcr_reset 2025-11-29 15:20:23 -06:00
resctrl arm_mpam: Use non-atomic bitops when modifying feature bitmap 2026-01-16 12:04:20 +00:00
reset This pull request is entirely SoC clk drivers, not for lack of trying to modify 2025-12-08 09:38:52 +09:00
rpmsg rpmsg: glink: remove duplicate code for rpmsg device remove 2025-11-26 10:16:10 -06:00
rtc RTC for 6.19 2025-12-13 17:09:06 +12:00
s390 s390/ap: Fix wrong APQN fill calculation 2026-01-20 14:33:42 +01:00
sbus
scsi scsi: be2iscsi: Fix a memory leak in beiscsi_boot_get_sinfo() 2026-01-23 22:39:07 -05:00
sh syscore: Pass context data to callbacks 2025-11-14 10:01:52 +01:00
siox
slimbus slimbus: core: clean up of_slim_get_device() 2026-01-16 16:43:05 +01:00
soc Qualcomm driver fix for v6.19 2026-01-29 10:02:11 +01:00
soundwire soundwire fix for 6.19 2026-01-18 12:29:12 -08:00
spi spi: intel-pci: Add support for Nova Lake SPI serial flash 2026-01-15 14:21:29 +00:00
spmi
ssb
staging Staging driver updates for 6.19-rc1 2025-12-06 18:52:00 -08:00
target scsi: firewire: sbp-target: Fix overflow in sbp_make_tpg() 2026-01-23 22:41:21 -05:00
tc
tee QCOMTEE fixes2 for v6.18 2025-11-21 21:27:20 +01:00
thermal thermal: core: Fix typo and indentation in comments 2025-12-15 12:47:39 +01:00
thunderbolt USB/Thunderbolt changes for 6.19-rc1 2025-12-06 18:42:12 -08:00
tty serial: Fix not set tty->port race condition 2026-01-23 17:23:09 +01:00
ufs scsi: ufs: amd-versal2: Fix PHY initialization in HCE enable notify 2026-01-23 22:43:44 -05:00
uio uio: pci_sva: correct '-ENODEV' check logic 2026-01-16 16:43:43 +01:00
usb xhci: sideband: don't dereference freed ring when removing sideband endpoint 2026-01-16 12:19:37 +01:00
vdpa Significant patch series in this merge are as follows: 2025-12-05 13:52:43 -08:00
vfio vfio: Prevent from pinned DMABUF importers to attach to VFIO DMABUF 2026-01-23 08:47:48 -07:00
vhost vsock: add netns support to virtio transports 2026-01-27 10:45:38 +01:00
video fbdev fixes & enhancements for 6.19-rc1: 2025-12-06 15:41:26 -08:00
virt coco/tsm: Remove unused variable tsm_rwsem 2026-01-23 13:09:51 -08:00
virtio virtio: clean up features qword/dword terms 2025-11-27 02:03:07 -05:00
w1 w1: fix redundant counter decrement in w1_attach_slave_device() 2025-12-28 11:52:10 +01:00
watchdog linux-watchdog 6.19-rc1 tag 2025-12-06 10:00:49 -08:00
xen SCSI fixes on 20260125 2026-01-25 12:06:15 -08:00
zorro
Kconfig Staging driver updates for 6.19-rc1 2025-12-06 18:52:00 -08:00
Makefile Staging driver updates for 6.19-rc1 2025-12-06 18:52:00 -08:00