linux/include/uapi
Nicolin Chen c279e83953 iommu: Introduce pci_dev_reset_iommu_prepare/done()
PCIe permits a device to ignore ATS invalidation TLPs while processing a
reset. This creates a problem visible to the OS where an ATS invalidation
command will time out. E.g. an SVA domain will have no coordination with a
reset event and can racily issue ATS invalidations to a resetting device.

The OS should do something to mitigate this as we do not want production
systems to be reporting critical ATS failures, especially in a hypervisor
environment. Broadly, the OS could arrange to ignore the timeouts, block page
table mutations to prevent invalidations, or disable and block ATS.

The PCIe r6.0, sec 10.3.1 IMPLEMENTATION NOTE recommends that SW disable
and block ATS before initiating a Function Level Reset. It also mentions
that other reset methods could have the same vulnerability.

Provide a callback from the PCI subsystem that will enclose the reset and
have the iommu core temporarily change all the attached RID/PASID domains
to group->blocking_domain so that the IOMMU hardware would fence any
incoming ATS queries. IOMMU drivers should also synchronously stop issuing
new ATS invalidations and wait for all ATS invalidations to complete. This
can avoid any ATS invalidation timeouts.

However, if a domain attachment/replacement happens during an ongoing
reset, ATS routines may be re-activated between the two function calls.
So, introduce a new resetting_domain in the iommu_group structure to
reject any concurrent attach_dev/set_dev_pasid call during a reset, out of
concern for compatibility failures. Since this changes the behavior of an
attach operation, update the uAPI accordingly.
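
The prepare/done bracket and the attach rejection described above can be
pictured with a small userspace model. This is only a sketch under stated
assumptions, not the kernel implementation: struct group, hw_domain,
reset_prepare(), reset_done() and attach() are hypothetical stand-ins, and
the -EBUSY return value is an assumption, since the text above only says
that concurrent attach calls are rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's real structures. */
struct domain {
	const char *name;
};

struct group {
	struct domain *domain;           /* domain the group logically owns */
	struct domain *hw_domain;        /* what the (modelled) hardware uses */
	struct domain *blocking_domain;  /* fences translation and ATS */
	struct domain *resetting_domain; /* non-NULL while a reset is in flight */
};

/* Model of the "prepare" step: park the hardware on the blocking domain. */
static int reset_prepare(struct group *g)
{
	if (g->resetting_domain)
		return -EBUSY; /* assumption: nested resets are refused */
	g->resetting_domain = g->blocking_domain;
	g->hw_domain = g->blocking_domain;
	return 0;
}

/* Model of the "done" step: restore the domain the group still owns. */
static void reset_done(struct group *g)
{
	assert(g->resetting_domain != NULL);
	g->hw_domain = g->domain;
	g->resetting_domain = NULL;
}

/* Model of an attach: rejected while a reset is in flight. */
static int attach(struct group *g, struct domain *new_domain)
{
	if (g->resetting_domain)
		return -EBUSY; /* assumption: the errno is illustrative only */
	g->domain = new_domain;
	g->hw_domain = new_domain;
	return 0;
}
```

In this model an attach attempted between reset_prepare() and reset_done()
fails and leaves group->domain untouched, so reset_done() restores exactly
the domain that was attached before the reset began.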

Note that there are two corner cases:
 1. Devices in the same iommu_group
    Since an attachment is always per iommu_group, this means that any
    sibling devices in the iommu_group cannot change domain, to prevent
    race conditions.
 2. An SR-IOV PF that is being reset while its VF is not
    In such a case, the VF itself is already broken. So, there is no point
    in preventing the PF from going through the iommu reset.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2026-01-10 10:26:44 +01:00
..
asm-generic vfs-6.19-rc1.folio 2025-12-01 10:26:38 -08:00
cxl fwctl/cxl: Fix uuid_t usage in uapi 2025-04-11 20:45:43 -03:00
drm drm/xe: Limit num_syncs to prevent oversized allocations 2025-12-18 18:10:34 +01:00
fwctl pds_fwctl: add rpc and query support 2025-03-21 20:57:55 -03:00
linux iommu: Introduce pci_dev_reset_iommu_prepare/done() 2026-01-10 10:26:44 +01:00
misc Char/Misc/IIO/Binder changes for 6.18-rc1 2025-10-04 16:26:32 -07:00
mtd ubi: Expose interface for detailed erase counters 2025-01-18 15:32:32 +01:00
rdma RDMA/irdma: Fix irdma_alloc_ucontext_resp padding 2025-12-16 21:38:45 -04:00
regulator regulator: uapi: Use UAPI integer type 2025-12-22 09:00:42 +00:00
scsi scsi: fc: Avoid -Wflex-array-member-not-at-end warnings 2025-08-30 21:42:19 -04:00
sound ALSA: uapi: Fix typo in asound.h comment 2025-12-08 15:27:48 +01:00
video
xen xen/privcmd: Add new syscall to get gsi from dev 2024-09-25 09:54:55 +02:00
Kbuild