linux/tools/testing/selftests/kvm
Paolo Bonzini 12abeb81c8 KVM x86 CET virtualization support for 6.18
Add support for virtualizing Control-flow Enforcement Technology (CET) on
 Intel (Shadow Stacks and Indirect Branch Tracking) and AMD (Shadow Stacks).
 
 CET is comprised of two distinct features, Shadow Stacks (SHSTK) and Indirect
 Branch Tracking (IBT), that can be utilized by software to help provide
 Control-flow integrity (CFI).  SHSTK defends against backward-edge attacks
 (a.k.a. Return-oriented programming (ROP)), while IBT defends against
 forward-edge attacks (a.k.a. CALL/JMP-oriented programming (COP/JOP)).
 
 Attackers commonly use ROP and COP/JOP methodologies to redirect the control
 flow to unauthorized targets in order to execute small snippets of code,
 a.k.a. gadgets, of the attacker's choice.  By chaining together several gadgets,
 an attacker can perform arbitrary operations and circumvent the system's
 defenses.
 
 SHSTK defends against backward-edge attacks, which execute gadgets by modifying
 the stack to branch to the attacker's target via RET, by providing a second
 stack that is used exclusively to track control transfer operations.  The
 shadow stack is separate from the data/normal stack, and can be enabled
 independently in user and kernel mode.
 
 When SHSTK is enabled, CALL instructions push the return address on both the
 data and shadow stack. RET then pops the return address from both stacks and
 compares the addresses.  If the return addresses from the two stacks do not
 match, the CPU generates a Control Protection (#CP) exception.
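
The CALL/RET behavior described above can be sketched as a toy model in C. This is only an illustration, not KVM or hardware code: `struct toy_cpu`, `toy_call()` and `toy_ret()` are hypothetical names, and a `false` return from `toy_ret()` stands in for the #CP exception.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_STACK_DEPTH 16

/* Toy model: a data stack and a separate shadow stack of return addresses. */
struct toy_cpu {
	uint64_t data_stack[TOY_STACK_DEPTH];
	uint64_t shadow_stack[TOY_STACK_DEPTH];
	size_t data_top;
	size_t shadow_top;
};

/* CALL: push the return address on both stacks; false if a stack is full. */
static bool toy_call(struct toy_cpu *cpu, uint64_t ret_addr)
{
	if (cpu->data_top == TOY_STACK_DEPTH ||
	    cpu->shadow_top == TOY_STACK_DEPTH)
		return false;

	cpu->data_stack[cpu->data_top++] = ret_addr;
	cpu->shadow_stack[cpu->shadow_top++] = ret_addr;
	return true;
}

/*
 * RET: pop both stacks and compare the two return addresses.  On a match,
 * store the address in *target and return true.  A mismatch (or an empty
 * stack) returns false, which models the CPU raising #CP.
 */
static bool toy_ret(struct toy_cpu *cpu, uint64_t *target)
{
	uint64_t from_data, from_shadow;

	if (cpu->data_top == 0 || cpu->shadow_top == 0)
		return false;

	from_data = cpu->data_stack[--cpu->data_top];
	from_shadow = cpu->shadow_stack[--cpu->shadow_top];
	if (from_data != from_shadow)
		return false;

	*target = from_data;
	return true;
}
```

Overwriting an entry in `data_stack` between `toy_call()` and `toy_ret()` mimics a ROP-style stack smash: the shadow copy is untouched, so the comparison fails.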
 
 IBT defends against forward-edge attacks, which branch to gadgets by executing
 indirect CALL and JMP instructions with attacker-controlled register or memory
 state, by requiring the target of indirect branches to start with a special
 marker instruction, ENDBRANCH.  If an indirect branch is executed and the next
 instruction is not an ENDBRANCH, the CPU generates a #CP.  Note, ENDBRANCH
 behaves as a NOP if IBT is disabled or unsupported.
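
A minimal sketch of the IBT rule, assuming a simplified trace of instructions in execution order (so the entry after an indirect branch is the branch target). The enum values and `toy_ibt_first_fault()` are hypothetical names for illustration; real hardware tracks this with a per-mode state machine.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_insn { TOY_ENDBRANCH, TOY_INDIRECT_BRANCH, TOY_OTHER };

/*
 * Walk a trace of executed instructions.  Returns the index of the first
 * instruction that would raise #CP, or -1 if the trace is clean.  With IBT
 * disabled, ENDBRANCH is just a NOP and nothing ever faults.
 */
static long toy_ibt_first_fault(const enum toy_insn *insns, size_t n,
				bool ibt_enabled)
{
	bool wait_for_endbranch = false;

	for (size_t i = 0; i < n; i++) {
		if (ibt_enabled && wait_for_endbranch &&
		    insns[i] != TOY_ENDBRANCH)
			return (long)i;

		/* An indirect CALL/JMP arms the tracker for the next insn. */
		wait_for_endbranch = (insns[i] == TOY_INDIRECT_BRANCH);
	}
	return -1;
}
```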
 
 From a virtualization perspective, CET presents several problems.  While SHSTK
 and IBT have two layers of enabling, a global control in the form of a CR4 bit,
 and a per-feature control in user and kernel (supervisor) MSRs (U_CET and S_CET
 respectively), the {S,U}_CET MSRs can be context switched via XSAVES/XRSTORS.
 Practically speaking, intercepting and emulating XSAVES/XRSTORS is not a viable
 option due to complexity, and outright disallowing use of XSTATE to context
 switch SHSTK/IBT state would render the features unusable to most guests.
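
The two layers of enabling can be sketched as follows. The bit positions (CR4.CET as bit 23, SH_STK_EN as bit 0 and ENDBR_EN as bit 2 of the U_CET/S_CET MSRs) are taken from the Intel SDM as best understood here and should be treated as assumptions; the `TOY_*` macros and helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout; check against the SDM before relying on it. */
#define TOY_CR4_CET		(1ULL << 23)	/* global CET enable */
#define TOY_CET_SHSTK_EN	(1ULL << 0)	/* per-mode SHSTK enable */
#define TOY_CET_ENDBR_EN	(1ULL << 2)	/* per-mode IBT enable */

/*
 * cet_msr is the value of U_CET when running in user mode, or S_CET when
 * running in supervisor mode.  A feature is active only if both the global
 * CR4 bit and the per-feature MSR bit are set.
 */
static bool toy_shstk_active(uint64_t cr4, uint64_t cet_msr)
{
	return (cr4 & TOY_CR4_CET) && (cet_msr & TOY_CET_SHSTK_EN);
}

static bool toy_ibt_active(uint64_t cr4, uint64_t cet_msr)
{
	return (cr4 & TOY_CR4_CET) && (cet_msr & TOY_CET_ENDBR_EN);
}
```

Because the MSR value can be loaded by XRSTORS without a VM-Exit, a hypervisor that only controls CR4.CET cannot separately gate the two per-feature bits, which is the hole described above.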
 
 To limit the overall complexity without sacrificing performance or usability,
 simply ignore the potential virtualization hole, but ensure that all paths in
 KVM treat SHSTK/IBT as usable by the guest if the feature is supported in
 hardware, and the guest has access to at least one of SHSTK or IBT.  I.e. allow
 userspace to advertise one of SHSTK or IBT if both are supported in hardware,
 even though doing so would allow a misbehaving guest to use the unadvertised
 feature.
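
One way to read the policy above is as a predicate: a feature is treated as usable by the guest if hardware supports it and the guest was given at least one of SHSTK or IBT. The sketch below is a hypothetical illustration of that rule, not KVM's actual code; `struct toy_cet_caps` and `toy_guest_may_use()` are invented names, and the guest's advertised features are assumed to be a subset of hardware's.

```c
#include <stdbool.h>

enum toy_cet_feature { TOY_SHSTK, TOY_IBT };

struct toy_cet_caps {
	bool shstk;
	bool ibt;
};

/*
 * hw:    features supported by hardware.
 * guest: features advertised to the guest by userspace.
 *
 * The guest can context switch {S,U}_CET via XSAVES/XRSTORS, so once it has
 * access to either feature, treat every hardware-supported feature as usable.
 */
static bool toy_guest_may_use(struct toy_cet_caps hw,
			      struct toy_cet_caps guest,
			      enum toy_cet_feature feature)
{
	bool hw_has = (feature == TOY_SHSTK) ? hw.shstk : hw.ibt;
	bool guest_has_any = guest.shstk || guest.ibt;

	return hw_has && guest_has_any;
}
```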
 
 Fully emulating SHSTK and IBT would also require significant complexity, e.g.
 to track and update branch state for IBT, and shadow stack state for SHSTK.
 Given that emulating large swaths of the guest code stream isn't necessary on
 modern CPUs, punt on emulating instructions that meaningfully impact or consume
 SHSTK or IBT.  However, instead of doing nothing, explicitly reject emulation
 of such instructions so that KVM's emulator can't be abused to circumvent CET.
 Disable support for SHSTK and IBT if KVM is configured such that emulation of
 arbitrary guest instructions may be required, specifically if Unrestricted
 Guest (Intel only) is disabled, or if KVM will emulate a guest.MAXPHYADDR that
 is smaller than host.MAXPHYADDR.
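
The "explicitly reject emulation" approach can be sketched as a pre-check in an emulator. Everything here is hypothetical (the flag names, the result enum and `toy_check_cet()`); it only illustrates the idea of refusing to emulate rather than silently skipping the CET side effects.

```c
#include <stdbool.h>

/* Hypothetical per-instruction flags set by an instruction decoder. */
#define TOY_INSN_TOUCHES_SHSTK	(1u << 0)	/* e.g. CALL, RET */
#define TOY_INSN_TOUCHES_IBT	(1u << 1)	/* e.g. indirect CALL/JMP */

enum toy_emul_result { TOY_EMUL_CONTINUE, TOY_EMUL_UNHANDLEABLE };

/*
 * Refuse to emulate an instruction whose shadow stack or branch tracking
 * side effects the emulator does not model, but only when the guest has the
 * corresponding feature enabled.
 */
static enum toy_emul_result toy_check_cet(bool shstk_enabled, bool ibt_enabled,
					  unsigned int insn_flags)
{
	if (shstk_enabled && (insn_flags & TOY_INSN_TOUCHES_SHSTK))
		return TOY_EMUL_UNHANDLEABLE;
	if (ibt_enabled && (insn_flags & TOY_INSN_TOUCHES_IBT))
		return TOY_EMUL_UNHANDLEABLE;

	return TOY_EMUL_CONTINUE;
}
```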
 
 Lastly, disable SHSTK support if shadow paging is enabled, as the protections
 for the shadow stack are novel (shadow stacks require Writable=0,Dirty=1, so
 that they can't be directly modified by software), i.e. would require
 non-trivial support in the Shadow MMU.
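
The Writable=0,Dirty=1 encoding can be illustrated with a check on a leaf page table entry. The bit positions are the standard x86 ones (Present is bit 0, R/W is bit 1, Dirty is bit 6); the macro and function names are hypothetical, and the sketch ignores everything else a real MMU must consider.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PTE_PRESENT		(1ULL << 0)
#define TOY_PTE_WRITABLE	(1ULL << 1)
#define TOY_PTE_DIRTY		(1ULL << 6)

/*
 * A shadow stack page is mapped read-only yet dirty, a combination that
 * ordinary stores cannot produce, so software cannot write it directly.
 */
static bool toy_pte_is_shadow_stack(uint64_t pte)
{
	return (pte & TOY_PTE_PRESENT) &&
	       !(pte & TOY_PTE_WRITABLE) &&
	       (pte & TOY_PTE_DIRTY);
}
```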
 
 Note, AMD CPUs currently only support SHSTK.  Explicitly disable IBT support
 on AMD so that KVM doesn't over-advertise in the event that AMD CPUs add IBT
 and virtualizing IBT in SVM requires KVM modifications.
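
Taken together, the configuration-dependent restrictions in this message amount to masking the hardware capabilities. The sketch below is a hypothetical summary of those rules as stated here, not KVM's implementation; the struct fields and `toy_supported_caps()` are invented names.

```c
#include <stdbool.h>

struct toy_caps {
	bool shstk;
	bool ibt;
};

struct toy_kvm_config {
	bool is_amd;
	bool unrestricted_guest;	/* only meaningful on Intel */
	bool smaller_maxphyaddr;	/* guest.MAXPHYADDR < host.MAXPHYADDR */
	bool shadow_paging;
};

/* Compute which CET features could be offered to guests under this config. */
static struct toy_caps toy_supported_caps(struct toy_caps hw,
					  struct toy_kvm_config cfg)
{
	struct toy_caps caps = hw;

	/* AMD: SHSTK only, never advertise IBT. */
	if (cfg.is_amd)
		caps.ibt = false;

	/* Emulation of arbitrary guest instructions may be required. */
	if ((!cfg.is_amd && !cfg.unrestricted_guest) ||
	    cfg.smaller_maxphyaddr) {
		caps.shstk = false;
		caps.ibt = false;
	}

	/* The Shadow MMU does not handle shadow stack protections. */
	if (cfg.shadow_paging)
		caps.shstk = false;

	return caps;
}
```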

Merge tag 'kvm-x86-cet-6.18' of https://github.com/kvm-x86/linux into HEAD

2025-09-30 13:37:14 -04:00
arm64 Merge branch kvm-arm64/selftests-6.18 into kvmarm-master/next 2025-09-24 19:35:50 +01:00
include KVM x86 CET virtualization support for 6.18 2025-09-30 13:37:14 -04:00
lib KVM selftests changes for 6.18 2025-09-30 13:23:54 -04:00
riscv KVM: riscv: selftests: Add SBI FWFT to get-reg-list test 2025-09-16 10:54:24 +05:30
s390 KVM selftests changes for 6.18 2025-09-30 13:23:54 -04:00
x86 KVM x86 changes for 6.18 2025-09-30 13:36:41 -04:00
.gitignore KVM: selftests: Provide empty 'all' and 'clean' targets for unsupported ARCHs 2024-12-18 14:15:03 -08:00
Makefile KVM: selftests: Add supported test cases for LoongArch 2025-05-20 20:20:26 +08:00
Makefile.kvm KVM x86 CET virtualization support for 6.18 2025-09-30 13:37:14 -04:00
access_tracking_perf_test.c KVM: riscv: selftests: Add missing headers for new testcases 2025-09-16 10:53:55 +05:30
arch_timer.c KVM: selftests: Convert arch_timer tests to common helpers to pin task 2025-07-09 09:33:42 -07:00
coalesced_io_test.c KVM: selftests: Add a test for coalesced MMIO (and PIO on x86) 2024-08-29 19:38:33 -07:00
config KVM: selftests: Add CONFIG_EVENTFD for irqfd selftest 2025-07-10 06:20:20 -07:00
demand_paging_test.c KVM selftests treewide updates for 6.10: 2024-05-12 03:18:11 -04:00
dirty_log_perf_test.c KVM: arm64: selftests: Create a VGICv3 for 'default' VMs 2025-09-24 19:23:32 +01:00
dirty_log_test.c KVM: arm64: selftests: Create a VGICv3 for 'default' VMs 2025-09-24 19:23:32 +01:00
get-reg-list.c KVM: arm64: selftests: Provide helper for getting default vCPU target 2025-09-24 19:23:32 +01:00
guest_memfd_test.c KVM: selftests: Add guest_memfd testcase to fault-in on !mmap()'d memory 2025-08-27 04:41:34 -04:00
guest_print_test.c KVM: selftests: Open code vcpu_run() equivalent in guest_printf test 2024-08-29 16:25:06 -07:00
hardware_disable_test.c KVM: selftests: Remove unused macro in the hardware disable test 2024-10-30 13:51:46 -07:00
irqfd_test.c KVM: selftests: Add a KVM_IRQFD test to verify uniqueness requirements 2025-06-23 09:51:01 -07:00
kvm_binary_stats_test.c KVM: selftests: Define _GNU_SOURCE for all selftests code 2024-04-29 12:49:10 -07:00
kvm_create_max_vcpus.c KVM: selftests: Adjust number of files rlimit for all "standard" VMs 2025-02-14 07:02:12 -08:00
kvm_page_table_test.c Revert "kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h" 2024-04-29 12:54:13 -07:00
memslot_modification_stress_test.c KVM: riscv: selftests: Add missing headers for new testcases 2025-09-16 10:53:55 +05:30
memslot_perf_test.c KVM: riscv: selftests: Add missing headers for new testcases 2025-09-16 10:53:55 +05:30
mmu_stress_test.c KVM: selftests: Ensure all vCPUs hit -EFAULT during initial RO stage 2025-03-03 07:37:28 -08:00
pre_fault_memory_test.c KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY 2024-07-12 11:18:27 -04:00
rseq_test.c KVM: selftests: Add option to rseq test to override /dev/cpu_dma_latency 2025-04-04 07:07:39 -04:00
set_memory_region_test.c KVM: selftests: Add supported test cases for LoongArch 2025-05-20 20:20:26 +08:00
settings selftests: kvm: Raise the default timeout to 120 seconds 2021-02-09 08:17:08 -05:00
steal_time.c KVM: arm64: selftests: Select SMCCC conduit based on current EL 2025-09-24 19:23:32 +01:00
system_counter_offset_test.c KVM: selftests: Remove redundant newlines 2024-01-29 08:39:14 -08:00