diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index c07671a45331b..6278157cd1912 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5611,6 +5611,14 @@ replacement properties are not found. See the Kconfig entry for RISCV_ISA_FALLBACK. + riscv_nousercfi= + all Disable user CFI ABI to userspace even if cpu extension + are available. + bcfi Disable user backward CFI ABI to userspace even if + the shadow stack extension is available. + fcfi Disable user forward CFI ABI to userspace even if the + landing pad extension is available. + ro [KNL] Mount root device read-only on boot rodata= [KNL] diff --git a/Documentation/arch/riscv/hwprobe.rst b/Documentation/arch/riscv/hwprobe.rst index f065083957c11..d5be264af45ca 100644 --- a/Documentation/arch/riscv/hwprobe.rst +++ b/Documentation/arch/riscv/hwprobe.rst @@ -67,7 +67,7 @@ The following keys are defined: programs (it may still be executed in userspace via a kernel-controlled mechanism such as the vDSO). -* :c:macro:`RISCV_HWPROBE_KEY_IMA_EXT_0`: A bitmask containing the extensions +* :c:macro:`RISCV_HWPROBE_KEY_IMA_EXT_0`: A bitmask containing extensions that are compatible with the :c:macro:`RISCV_HWPROBE_BASE_BEHAVIOR_IMA`: base system behavior. @@ -339,6 +339,10 @@ The following keys are defined: * :c:macro:`RISCV_HWPROBE_KEY_ZICBOP_BLOCK_SIZE`: An unsigned int which represents the size of the Zicbop block in bytes. +* :c:macro:`RISCV_HWPROBE_KEY_IMA_EXT_1`: A bitmask containing additional + extensions that are compatible with the + :c:macro:`RISCV_HWPROBE_BASE_BEHAVIOR_IMA`: base system behavior. + * :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF`: An enum value describing the performance of misaligned vector accesses on the selected set of processors. diff --git a/Documentation/arch/riscv/index.rst b/Documentation/arch/riscv/index.rst index 4dab0cb4b900f..ed36e3019931c 100644 --- a/Documentation/arch/riscv/index.rst +++ b/Documentation/arch/riscv/index.rst @@ -13,6 +13,8 @@ RISC-V architecture patch-acceptance uabi vector + zicfilp + zicfiss features diff --git a/Documentation/arch/riscv/zicfilp.rst b/Documentation/arch/riscv/zicfilp.rst new file mode 100644 index 0000000000000..ab7d8e62ddaff --- /dev/null +++ b/Documentation/arch/riscv/zicfilp.rst @@ -0,0 +1,137 @@ +.. SPDX-License-Identifier: GPL-2.0 + +:Author: Deepak Gupta +:Date: 12 January 2024 + +==================================================== +Tracking indirect control transfers on RISC-V Linux +==================================================== + +This document briefly describes the interface provided to userspace by Linux +to enable indirect branch tracking for user mode applications on RISC-V. + +1. Feature Overview +-------------------- + +Memory corruption issues usually result in crashes. However, in the +hands of a creative adversary, these can result in a variety of +security issues. + +Some of those security issues can be code re-use attacks, where an +adversary can use corrupt function pointers, chaining them together to +perform jump oriented programming (JOP) or call oriented programming +(COP) and thus compromise control flow integrity (CFI) of the program. + +Function pointers live in read-write memory and thus are susceptible +to corruption. This can allow an adversary to control the program +counter (PC) value. On RISC-V, the zicfilp extension enforces a +restriction on such indirect control transfers: + +- Indirect control transfers must land on a landing pad instruction ``lpad``. + There are two exceptions to this rule: + + - rs1 = x1 or rs1 = x5, i.e. a return from a function and returns are + protected using shadow stack (see zicfiss.rst) + + - rs1 = x7. On RISC-V, the compiler usually does the following to reach a + function which is beyond the offset of possible J-type instruction:: + + auipc x7, + jalr (x7) + + This form of indirect control transfer is immutable and doesn't + rely on memory. Thus rs1=x7 is exempted from tracking and + these are considered software guarded jumps. + +The ``lpad`` instruction is a pseudo-op of ``auipc rd, `` +with ``rd=x0``. This is a HINT op. The ``lpad`` instruction must be +aligned on a 4 byte boundary. It compares the 20 bit immediate with +x7. If ``imm_20bit`` == 0, the CPU doesn't perform any comparison with +``x7``. If ``imm_20bit`` != 0, then ``imm_20bit`` must match ``x7`` +else CPU will raise ``software check exception`` (``cause=18``) with +``*tval = 2``. + +The compiler can generate a hash over function signatures and set them +up (truncated to 20 bits) in x7 at callsites. Function prologues can +have ``lpad`` instructions encoded with the same function hash. This +further reduces the number of valid program counter addresses a call +site can reach. + +2. ELF and psABI +----------------- + +The toolchain sets up :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_FCFI` for +property :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_AND` in the notes +section of the object file. + +3. Linux enabling +------------------ + +User space programs can have multiple shared objects loaded in their +address spaces. It's a difficult task to make sure all the +dependencies have been compiled with indirect branch support. Thus +it's left to the dynamic loader to enable indirect branch tracking for +the program. + +4. prctl() enabling +-------------------- + +Per-task indirect branch tracking state can be monitored and +controlled via the :c:macro:`PR_GET_CFI` and :c:macro:`PR_SET_CFI` +``prctl()` arguments (respectively), by supplying +:c:macro:`PR_CFI_BRANCH_LANDING_PADS` as the second argument. These +are architecture-agnostic, and will return -EINVAL if the underlying +functionality is not supported. + +* prctl(:c:macro:`PR_SET_CFI`, :c:macro:`PR_CFI_BRANCH_LANDING_PADS`, unsigned long arg) + +arg is a bitmask. + +If :c:macro:`PR_CFI_ENABLE` is set in arg, and the CPU supports +``zicfilp``, then the kernel will enable indirect branch tracking for +the task. The dynamic loader can issue this ``prctl()`` once it has +determined that all the objects loaded in the address space support +indirect branch tracking. + +Indirect branch tracking state can also be locked once enabled. This +prevents the task from subsequently disabling it. This is done by +setting the bit :c:macro:`PR_CFI_LOCK` in arg. Either indirect branch +tracking must already be enabled for the task, or the bit +:c:macro:`PR_CFI_ENABLE` must also be set in arg. This is intended +for environments that wish to run with a strict security posture that +do not wish to load objects without ``zicfilp`` support. + +Indirect branch tracking can also be disabled for the task, assuming +that it has not previously been enabled and locked. If there is a +``dlopen()`` to an object which wasn't compiled with ``zicfilp``, the +dynamic loader can issue this ``prctl()`` with arg set to +:c:macro:`PR_CFI_DISABLE`. Disabling indirect branch tracking for the +task is not possible if it has previously been enabled and locked. + + +* prctl(:c:macro:`PR_GET_CFI`, :c:macro:`PR_CFI_BRANCH_LANDING_PADS`, unsigned long * arg) + +Returns the current status of indirect branch tracking into a bitmask +stored into the memory location pointed to by arg. The bitmask will +have the :c:macro:`PR_CFI_ENABLE` bit set if indirect branch tracking +is currently enabled for the task, and if it is locked, will +additionally have the :c:macro:`PR_CFI_LOCK` bit set. If indirect +branch tracking is currently disabled for the task, the +:c:macro:`PR_CFI_DISABLE` bit will be set. + + +5. violations related to indirect branch tracking +-------------------------------------------------- + +Pertaining to indirect branch tracking, the CPU raises a software +check exception in the following conditions: + +- missing ``lpad`` after indirect call / jmp +- ``lpad`` not on 4 byte boundary +- ``imm_20bit`` embedded in ``lpad`` instruction doesn't match with ``x7`` + +In all 3 cases, ``*tval = 2`` is captured and software check exception is +raised (``cause=18``). + +The kernel will treat this as :c:macro:`SIGSEGV` with code = +:c:macro:`SEGV_CPERR` and follow the normal course of signal delivery. diff --git a/Documentation/arch/riscv/zicfiss.rst b/Documentation/arch/riscv/zicfiss.rst new file mode 100644 index 0000000000000..4d5f7addc26d8 --- /dev/null +++ b/Documentation/arch/riscv/zicfiss.rst @@ -0,0 +1,194 @@ +.. SPDX-License-Identifier: GPL-2.0 + +:Author: Deepak Gupta +:Date: 12 January 2024 + +========================================================= +Shadow stack to protect function returns on RISC-V Linux +========================================================= + +This document briefly describes the interface provided to userspace by Linux +to enable shadow stacks for user mode applications on RISC-V. + +1. Feature Overview +-------------------- + +Memory corruption issues usually result in crashes. However, in the +hands of a creative adversary, these issues can result in a variety of +security problems. + +Some of those security issues can be code re-use attacks on programs +where an adversary can use corrupt return addresses present on the +stack. chaining them together to perform return oriented programming +(ROP) and thus compromising the control flow integrity (CFI) of the +program. + +Return addresses live on the stack in read-write memory. Therefore +they are susceptible to corruption, which allows an adversary to +control the program counter. On RISC-V, the ``zicfiss`` extension +provides an alternate stack (the "shadow stack") on which return +addresses can be safely placed in the prologue of the function and +retrieved in the epilogue. The ``zicfiss`` extension makes the +following changes: + +- PTE encodings for shadow stack virtual memory + An earlier reserved encoding in first stage translation i.e. + PTE.R=0, PTE.W=1, PTE.X=0 becomes the PTE encoding for shadow stack pages. + +- The ``sspush x1/x5`` instruction pushes (stores) ``x1/x5`` to shadow stack. + +- The ``sspopchk x1/x5`` instruction pops (loads) from shadow stack and compares + with ``x1/x5`` and if not equal, the CPU raises a ``software check exception`` + with ``*tval = 3`` + +The compiler toolchain ensures that function prologues have ``sspush +x1/x5`` to save the return address on shadow stack in addition to the +regular stack. Similarly, function epilogues have ``ld x5, +offset(x2)`` followed by ``sspopchk x5`` to ensure that a popped value +from the regular stack matches with the popped value from the shadow +stack. + +2. Shadow stack protections and linux memory manager +----------------------------------------------------- + +As mentioned earlier, shadow stacks get new page table encodings that +have some special properties assigned to them, along with instructions +that operate on the shadow stacks: + +- Regular stores to shadow stack memory raise store access faults. This + protects shadow stack memory from stray writes. + +- Regular loads from shadow stack memory are allowed. This allows + stack trace utilities or backtrace functions to read the true call + stack and ensure that it has not been tampered with. + +- Only shadow stack instructions can generate shadow stack loads or + shadow stack stores. + +- Shadow stack loads and stores on read-only memory raise AMO/store + page faults. Thus both ``sspush x1/x5`` and ``sspopchk x1/x5`` will + raise AMO/store page fault. This simplies COW handling in kernel + during fork(). The kernel can convert shadow stack pages into + read-only memory (as it does for regular read-write memory). As + soon as subsequent ``sspush`` or ``sspopchk`` instructions in + userspace are encountered, the kernel can perform COW. + +- Shadow stack loads and stores on read-write or read-write-execute + memory raise an access fault. This is a fatal condition because + shadow stack loads and stores should never be operating on + read-write or read-write-execute memory. + +3. ELF and psABI +----------------- + +The toolchain sets up :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_BCFI` for +property :c:macro:`GNU_PROPERTY_RISCV_FEATURE_1_AND` in the notes +section of the object file. + +4. Linux enabling +------------------ + +User space programs can have multiple shared objects loaded in their +address space. It's a difficult task to make sure all the +dependencies have been compiled with shadow stack support. Thus +it's left to the dynamic loader to enable shadow stacks for the +program. + +5. prctl() enabling +-------------------- + +:c:macro:`PR_SET_SHADOW_STACK_STATUS` / :c:macro:`PR_GET_SHADOW_STACK_STATUS` / +:c:macro:`PR_LOCK_SHADOW_STACK_STATUS` are three prctls added to manage shadow +stack enabling for tasks. These prctls are architecture-agnostic and return +-EINVAL if not implemented. + +* prctl(PR_SET_SHADOW_STACK_STATUS, unsigned long arg) + +If arg = :c:macro:`PR_SHADOW_STACK_ENABLE` and if CPU supports +``zicfiss`` then the kernel will enable shadow stacks for the task. +The dynamic loader can issue this :c:macro:`prctl` once it has +determined that all the objects loaded in address space have support +for shadow stacks. Additionally, if there is a :c:macro:`dlopen` to +an object which wasn't compiled with ``zicfiss``, the dynamic loader +can issue this prctl with arg set to 0 (i.e. +:c:macro:`PR_SHADOW_STACK_ENABLE` being clear) + +* prctl(PR_GET_SHADOW_STACK_STATUS, unsigned long * arg) + +Returns the current status of indirect branch tracking. If enabled +it'll return :c:macro:`PR_SHADOW_STACK_ENABLE`. + +* prctl(PR_LOCK_SHADOW_STACK_STATUS, unsigned long arg) + +Locks the current status of shadow stack enabling on the +task. Userspace may want to run with a strict security posture and +wouldn't want loading of objects without ``zicfiss`` support. In this +case userspace can use this prctl to disallow disabling of shadow +stacks on the current task. + +5. violations related to returns with shadow stack enabled +----------------------------------------------------------- + +Pertaining to shadow stacks, the CPU raises a ``software check +exception`` upon executing ``sspopchk x1/x5`` if ``x1/x5`` doesn't +match the top of shadow stack. If a mismatch happens, then the CPU +sets ``*tval = 3`` and raises the exception. + +The Linux kernel will treat this as a :c:macro:`SIGSEGV` with code = +:c:macro:`SEGV_CPERR` and follow the normal course of signal delivery. + +6. Shadow stack tokens +----------------------- + +Regular stores on shadow stacks are not allowed and thus can't be +tampered with via arbitrary stray writes. However, one method of +pivoting / switching to a shadow stack is simply writing to the CSR +``CSR_SSP``. This will change the active shadow stack for the +program. Writes to ``CSR_SSP`` in the program should be mostly +limited to context switches, stack unwinds, or longjmp or similar +mechanisms (like context switching of Green Threads) in languages like +Go and Rust. CSR_SSP writes can be problematic because an attacker can +use memory corruption bugs and leverage context switching routines to +pivot to any shadow stack. Shadow stack tokens can help mitigate this +problem by making sure that: + +- When software is switching away from a shadow stack, the shadow + stack pointer should be saved on the shadow stack itself (this is + called the ``shadow stack token``). + +- When software is switching to a shadow stack, it should read the + ``shadow stack token`` from the shadow stack pointer and verify that + the ``shadow stack token`` itself is a pointer to the shadow stack + itself. + +- Once the token verification is done, software can perform the write + to ``CSR_SSP`` to switch shadow stacks. + +Here "software" could refer to the user mode task runtime itself, +managing various contexts as part of a single thread. Or "software" +could refer to the kernel, when the kernel has to deliver a signal to +a user task and must save the shadow stack pointer. The kernel can +perform similar procedure itself by saving a token on the user mode +task's shadow stack. This way, whenever :c:macro:`sigreturn` happens, +the kernel can read and verify the token and then switch to the shadow +stack. Using this mechanism, the kernel helps the user task so that +any corruption issue in the user task is not exploited by adversaries +arbitrarily using :c:macro:`sigreturn`. Adversaries will have to make +sure that there is a valid ``shadow stack token`` in addition to +invoking :c:macro:`sigreturn`. + +7. Signal shadow stack +----------------------- +The following structure has been added to sigcontext for RISC-V:: + + struct __sc_riscv_cfi_state { + unsigned long ss_ptr; + }; + +As part of signal delivery, the shadow stack token is saved on the +current shadow stack itself. The updated pointer is saved away in the +:c:macro:`ss_ptr` field in :c:macro:`__sc_riscv_cfi_state` under +:c:macro:`sigcontext`. The existing shadow stack allocation is used +for signal delivery. During :c:macro:`sigreturn`, kernel will obtain +:c:macro:`ss_ptr` from :c:macro:`sigcontext`, verify the saved +token on the shadow stack, and switch the shadow stack. diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml index 6d4e0fb8b7651..cbdfa7a19cf72 100644 --- a/Documentation/devicetree/bindings/riscv/extensions.yaml +++ b/Documentation/devicetree/bindings/riscv/extensions.yaml @@ -217,6 +217,20 @@ properties: The standard Zicboz extension for cache-block zeroing as ratified in commit 3dd606f ("Create cmobase-v1.0.pdf") of riscv-CMOs. + - const: zicfilp + description: | + The standard Zicfilp extension for enforcing forward edge + control-flow integrity as ratified in commit 3f8e450 ("merge + pull request #227 from ved-rivos/0709") of riscv-cfi + github repo. + + - const: zicfiss + description: | + The standard Zicfiss extension for enforcing backward edge + control-flow integrity as ratified in commit 3f8e450 ("merge + pull request #227 from ved-rivos/0709") of riscv-cfi + github repo. + - const: zicntr description: The standard Zicntr extension for base counters and timers, as diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl index ad37569d05072..6e8479c96e65e 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -492,3 +492,4 @@ 560 common set_mempolicy_home_node sys_ni_syscall 561 common cachestat sys_cachestat 562 common fchmodat2 sys_fchmodat2 +563 common map_shadow_stack sys_map_shadow_stack diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index c572d6c3dee0f..6d494dfbf5e4a 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -466,3 +466,4 @@ 450 common set_mempolicy_home_node sys_set_mempolicy_home_node 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 +453 common map_shadow_stack sys_map_shadow_stack diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h index bd77253b62e01..6a28fb91b85de 100644 --- a/arch/arm64/include/asm/unistd.h +++ b/arch/arm64/include/asm/unistd.h @@ -39,7 +39,7 @@ #define __ARM_NR_compat_set_tls (__ARM_NR_COMPAT_BASE + 5) #define __ARM_NR_COMPAT_END (__ARM_NR_COMPAT_BASE + 0x800) -#define __NR_compat_syscalls 453 +#define __NR_compat_syscalls 454 #endif #define __ARCH_WANT_SYS_CLONE diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h index 545a4a7b5371c..be62efee4b000 100644 --- a/arch/arm64/include/asm/unistd32.h +++ b/arch/arm64/include/asm/unistd32.h @@ -911,6 +911,8 @@ __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node) __SYSCALL(__NR_cachestat, sys_cachestat) #define __NR_fchmodat2 452 __SYSCALL(__NR_fchmodat2, sys_fchmodat2) +#define __NR_map_shadow_stack 453 +__SYSCALL(__NR_map_shadow_stack, sys_map_shadow_stack) /* * Please add new compat syscalls above this comment and update diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl index 259ceb125367d..bee2d2f7f82c5 100644 --- a/arch/m68k/kernel/syscalls/syscall.tbl +++ b/arch/m68k/kernel/syscalls/syscall.tbl @@ -452,3 +452,4 @@ 450 common set_mempolicy_home_node sys_set_mempolicy_home_node 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 +453 common map_shadow_stack sys_map_shadow_stack diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl index a3798c2637fd1..09eda7ed91b07 100644 --- a/arch/microblaze/kernel/syscalls/syscall.tbl +++ b/arch/microblaze/kernel/syscalls/syscall.tbl @@ -458,3 +458,4 @@ 450 common set_mempolicy_home_node sys_set_mempolicy_home_node 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 +453 common map_shadow_stack sys_map_shadow_stack diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl index 4a296124604a1..a7f556729ee9b 100644 --- a/arch/mips/kernel/syscalls/syscall_n32.tbl +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl @@ -391,3 +391,4 @@ 450 n32 set_mempolicy_home_node sys_set_mempolicy_home_node 451 n32 cachestat sys_cachestat 452 n32 fchmodat2 sys_fchmodat2 +453 n32 map_shadow_stack sys_map_shadow_stack diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl index cb5e757f6621c..aa9ed6a7cb485 100644 --- a/arch/mips/kernel/syscalls/syscall_n64.tbl +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl @@ -367,3 +367,4 @@ 450 common set_mempolicy_home_node sys_set_mempolicy_home_node 451 n64 cachestat sys_cachestat 452 n64 fchmodat2 sys_fchmodat2 +453 n64 map_shadow_stack sys_map_shadow_stack diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl index 1ee62a861380a..6af28b85fb962 100644 --- a/arch/mips/kernel/syscalls/syscall_o32.tbl +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -440,3 +440,4 @@ 450 o32 set_mempolicy_home_node sys_set_mempolicy_home_node 451 o32 cachestat sys_cachestat 452 o32 fchmodat2 sys_fchmodat2 +453 o32 map_shadow_stack sys_map_shadow_stack diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl index 73f560e309573..f12719e88472b 100644 --- a/arch/parisc/kernel/syscalls/syscall.tbl +++ b/arch/parisc/kernel/syscalls/syscall.tbl @@ -451,3 +451,4 @@ 450 common set_mempolicy_home_node sys_set_mempolicy_home_node 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 +453 common map_shadow_stack sys_map_shadow_stack diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl index 40f6751271d3d..34fd0de630443 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ -543,3 +543,4 @@ 450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 +453 common map_shadow_stack sys_ni_syscall diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index b22d5b3907460..82502b6092508 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -188,6 +188,7 @@ config RISCV select TRACE_IRQFLAGS_SUPPORT select UACCESS_MEMCPY if !MMU select USER_STACKTRACE_SUPPORT + select VDSO_GETRANDOM if HAVE_GENERIC_VDSO select ZONE_DMA32 if 64BIT select ARCH_SUPPORTS_MEMORY_FAILURE @@ -1106,6 +1107,28 @@ config RANDOMIZE_BASE If unsure, say N. +config RISCV_USER_CFI + def_bool y + bool "riscv userspace control flow integrity" + depends on 64BIT && MMU && \ + $(cc-option,-mabi=lp64 -march=rv64ima_zicfiss_zicfilp -fcf-protection=full) + depends on RISCV_ALTERNATIVE + select RISCV_SBI + select ARCH_HAS_USER_SHADOW_STACK + select ARCH_USES_HIGH_VMA_FLAGS + select DYNAMIC_SIGFRAME + help + Provides CPU-assisted control flow integrity to userspace tasks. + Control flow integrity is provided by implementing shadow stack for + backward edge and indirect branch tracking for forward edge. + Shadow stack protection is a hardware feature that detects function + return address corruption. This helps mitigate ROP attacks. + Indirect branch tracking enforces that all indirect branches must land + on a landing pad instruction else CPU will fault. This mitigates against + JOP / COP attacks. Applications must be enabled to use it, and old userspace + does not get protection "for free". + default y. + endmenu # "Kernel features" menu "Boot options" diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile index ad5e294cae01b..b683e272b256a 100644 --- a/arch/riscv/Makefile +++ b/arch/riscv/Makefile @@ -62,6 +62,9 @@ riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZACAS) := $(riscv-march-y)_zacas # Check if the toolchain supports Zabha riscv-march-$(CONFIG_TOOLCHAIN_HAS_ZABHA) := $(riscv-march-y)_zabha +KBUILD_BASE_ISA = -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)fd([^v_]*)v?/\1\2/') +export KBUILD_BASE_ISA + # XUANTIE ISA riscv-march-$(CONFIG_XUANTIE_ISA) := $(riscv-march-y)_xtheadc @@ -70,7 +73,7 @@ riscv-march-$(CONFIG_XUANTIE_ISA) := $(riscv-march-y)_xtheadc ifeq ($(CONFIG_TOOLCHAIN_HAS_XTHEADC)$(call gcc-max-version, 100400),yy) KBUILD_CFLAGS += -march=$(subst fd,,$(riscv-march-y)) else - KBUILD_CFLAGS += -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)fd([^v_]*)v?/\1\2/') + KBUILD_CFLAGS += $(KBUILD_BASE_ISA) endif KBUILD_AFLAGS += -march=$(riscv-march-y) @@ -135,6 +138,8 @@ ifeq ($(CONFIG_MMU),y) prepare: vdso_prepare vdso_prepare: prepare0 $(Q)$(MAKE) $(build)=arch/riscv/kernel/vdso include/generated/vdso-offsets.h + $(if $(CONFIG_RISCV_USER_CFI),$(Q)$(MAKE) \ + $(build)=arch/riscv/kernel/vdso_cfi include/generated/vdso-cfi-offsets.h) $(if $(CONFIG_COMPAT),$(Q)$(MAKE) \ $(build)=arch/riscv/kernel/compat_vdso include/generated/compat_vdso-offsets.h) @@ -142,6 +147,7 @@ endif endif vdso-install-y += arch/riscv/kernel/vdso/vdso.so.dbg +vdso-install-$(CONFIG_RISCV_USER_CFI) += arch/riscv/kernel/vdso_cfi/vdso-cfi.so.dbg vdso-install-$(CONFIG_COMPAT) += arch/riscv/kernel/compat_vdso/compat_vdso.so.dbg:../compat_vdso/compat_vdso.so ifneq ($(CONFIG_XIP_KERNEL),y) diff --git a/arch/riscv/configs/hardening.config b/arch/riscv/configs/hardening.config new file mode 100644 index 0000000000000..089f4cee82f4d --- /dev/null +++ b/arch/riscv/configs/hardening.config @@ -0,0 +1,4 @@ +# RISCV specific kernel hardening options + +# Enable control flow integrity support for usermode. +CONFIG_RISCV_USER_CFI=y diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h index 5d10edde6d179..41ec5cdec3673 100644 --- a/arch/riscv/include/asm/asm-prototypes.h +++ b/arch/riscv/include/asm/asm-prototypes.h @@ -51,7 +51,10 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_u); DECLARE_DO_ERROR_INFO(do_trap_ecall_s); DECLARE_DO_ERROR_INFO(do_trap_ecall_m); DECLARE_DO_ERROR_INFO(do_trap_break); +DECLARE_DO_ERROR_INFO(do_trap_software_check); +asmlinkage void ret_from_fork_kernel(void *fn_arg, int (*fn)(void *), struct pt_regs *regs); +asmlinkage void ret_from_fork_user(struct pt_regs *regs); asmlinkage void handle_bad_stack(struct pt_regs *regs); asmlinkage void do_page_fault(struct pt_regs *regs); asmlinkage void do_irq(struct pt_regs *regs); diff --git a/arch/riscv/include/asm/assembler.h b/arch/riscv/include/asm/assembler.h index 44b1457d3e956..4bdd8b8e69cb6 100644 --- a/arch/riscv/include/asm/assembler.h +++ b/arch/riscv/include/asm/assembler.h @@ -80,3 +80,47 @@ .endm #endif /* __ASM_ASSEMBLER_H */ + +#if defined(VDSO_CFI) && (__riscv_xlen == 64) +.macro vdso_lpad, label = 0 +lpad \label +.endm +#else +.macro vdso_lpad, label = 0 +.endm +#endif + +/* + * This macro emits a program property note section identifying + * architecture features which require special handling, mainly for + * use in assembly files included in the VDSO. + */ +#define NT_GNU_PROPERTY_TYPE_0 5 +#define GNU_PROPERTY_RISCV_FEATURE_1_AND 0xc0000000 + +#define GNU_PROPERTY_RISCV_FEATURE_1_ZICFILP BIT(0) +#define GNU_PROPERTY_RISCV_FEATURE_1_ZICFISS BIT(1) + +#if defined(VDSO_CFI) && (__riscv_xlen == 64) +#define GNU_PROPERTY_RISCV_FEATURE_1_DEFAULT \ + (GNU_PROPERTY_RISCV_FEATURE_1_ZICFILP | GNU_PROPERTY_RISCV_FEATURE_1_ZICFISS) +#endif + +#ifdef GNU_PROPERTY_RISCV_FEATURE_1_DEFAULT +.macro emit_riscv_feature_1_and, feat = GNU_PROPERTY_RISCV_FEATURE_1_DEFAULT + .pushsection .note.gnu.property, "a" + .p2align 3 + .word 4 + .word 16 + .word NT_GNU_PROPERTY_TYPE_0 + .asciz "GNU" + .word GNU_PROPERTY_RISCV_FEATURE_1_AND + .word 4 + .word \feat + .word 0 + .popsection +.endm +#else +.macro emit_riscv_feature_1_and, feat = 0 +.endm +#endif diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h index 48676a49055c0..f874238e098ae 100644 --- a/arch/riscv/include/asm/cpufeature.h +++ b/arch/riscv/include/asm/cpufeature.h @@ -147,4 +147,16 @@ static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsi DECLARE_STATIC_KEY_FALSE(fast_misaligned_access_speed_key); +static inline bool cpu_supports_shadow_stack(void) +{ + return (IS_ENABLED(CONFIG_RISCV_USER_CFI) && + riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICFISS)); +} + +static inline bool cpu_supports_indirect_br_lp_instr(void) +{ + return (IS_ENABLED(CONFIG_RISCV_USER_CFI) && + riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICFILP)); +} + #endif diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index e0e6ab0e3d763..7791600b39647 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -18,6 +18,15 @@ #define SR_MPP _AC(0x00001800, UL) /* Previously Machine */ #define SR_SUM _AC(0x00040000, UL) /* Supervisor User Memory Access */ +/* zicfilp landing pad status bit */ +#define SR_SPELP _AC(0x00800000, UL) +#define SR_MPELP _AC(0x020000000000, UL) +#ifdef CONFIG_RISCV_M_MODE +#define SR_ELP SR_MPELP +#else +#define SR_ELP SR_SPELP +#endif + #define SR_FS _AC(0x00006000, UL) /* Floating-point Status */ #define SR_FS_OFF _AC(0x00000000, UL) #define SR_FS_INITIAL _AC(0x00002000, UL) @@ -208,6 +217,8 @@ #define ENVCFG_ADUE (_AC(1, ULL) << 61) #define ENVCFG_CBZE (_AC(1, UL) << 7) #define ENVCFG_CBCFE (_AC(1, UL) << 6) +#define ENVCFG_LPE (_AC(1, UL) << 2) +#define ENVCFG_SSE (_AC(1, UL) << 3) #define ENVCFG_CBIE_SHIFT 4 #define ENVCFG_CBIE (_AC(0x3, UL) << ENVCFG_CBIE_SHIFT) #define ENVCFG_CBIE_ILL _AC(0x0, UL) @@ -311,6 +322,18 @@ #define CSR_STIMECMP 0x14D #define CSR_STIMECMPH 0x15D +/* zicfiss user mode csr. CSR_SSP holds current shadow stack pointer */ +#define CSR_SSP 0x011 + +/* xtheadvector symbolic CSR names */ +#define CSR_VXSAT 0x9 +#define CSR_VXRM 0xa + +/* xtheadvector CSR masks */ +#define CSR_VXRM_MASK 3 +#define CSR_VXRM_SHIFT 1 +#define CSR_VXSAT_MASK 1 + /* Supervisor-Level Window to Indirectly Accessed Registers (AIA) */ #define CSR_SISELECT 0x150 #define CSR_SIREG 0x151 diff --git a/arch/riscv/include/asm/entry-common.h b/arch/riscv/include/asm/entry-common.h index b28ccc6cdeea4..34ed149af5d1b 100644 --- a/arch/riscv/include/asm/entry-common.h +++ b/arch/riscv/include/asm/entry-common.h @@ -40,4 +40,6 @@ static inline int handle_misaligned_store(struct pt_regs *regs) } #endif +bool handle_user_cfi_violation(struct pt_regs *regs); + #endif /* _ASM_RISCV_ENTRY_COMMON_H */ diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h index 1d7769d905413..93ca6c75e3a22 100644 --- a/arch/riscv/include/asm/hwcap.h +++ b/arch/riscv/include/asm/hwcap.h @@ -108,6 +108,10 @@ #define RISCV_ISA_EXT_ZALRSC 99 #define RISCV_ISA_EXT_ZICBOP 100 #define RISCV_ISA_EXT_ZALASR 101 +#define RISCV_ISA_EXT_ZILSD 102 +#define RISCV_ISA_EXT_ZCLSD 103 +#define RISCV_ISA_EXT_ZICFILP 104 +#define RISCV_ISA_EXT_ZICFISS 105 #define RISCV_ISA_EXT_XLINUXENVCFG 127 diff --git a/arch/riscv/include/asm/hwprobe.h b/arch/riscv/include/asm/hwprobe.h index 2f278c395af90..9b04377c0f98f 100644 --- a/arch/riscv/include/asm/hwprobe.h +++ b/arch/riscv/include/asm/hwprobe.h @@ -8,7 +8,7 @@ #include -#define RISCV_HWPROBE_MAX_KEY 15 +#define RISCV_HWPROBE_MAX_KEY 16 static inline bool riscv_hwprobe_key_is_valid(__s64 key) { @@ -20,6 +20,7 @@ static inline bool hwprobe_key_is_bitmask(__s64 key) switch (key) { case RISCV_HWPROBE_KEY_BASE_BEHAVIOR: case RISCV_HWPROBE_KEY_IMA_EXT_0: + case RISCV_HWPROBE_KEY_IMA_EXT_1: case RISCV_HWPROBE_KEY_CPUPERF_0: case RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0: case RISCV_HWPROBE_KEY_VENDOR_EXT_MIPS_0: diff --git a/arch/riscv/include/asm/mman.h b/arch/riscv/include/asm/mman.h new file mode 100644 index 0000000000000..0ad1d19832ebc --- /dev/null +++ b/arch/riscv/include/asm/mman.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __ASM_MMAN_H__ +#define __ASM_MMAN_H__ + +#include +#include +#include +#include + +static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot, + unsigned long pkey __always_unused) +{ + unsigned long ret = 0; + + /* + * If PROT_WRITE was specified, force it to VM_READ | VM_WRITE. + * Only VM_WRITE means shadow stack. + */ + if (prot & PROT_WRITE) + ret = (VM_READ | VM_WRITE); + return ret; +} + +#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey) + +#endif /* ! __ASM_MMAN_H__ */ diff --git a/arch/riscv/include/asm/mmu_context.h b/arch/riscv/include/asm/mmu_context.h index 7030837adc1a2..c71c48ee54312 100644 --- a/arch/riscv/include/asm/mmu_context.h +++ b/arch/riscv/include/asm/mmu_context.h @@ -35,6 +35,21 @@ static inline int init_new_context(struct task_struct *tsk, DECLARE_STATIC_KEY_FALSE(use_asid_allocator); +#ifdef CONFIG_RISCV_ISA_SUPM +#define mm_untag_mask mm_untag_mask +static inline unsigned long mm_untag_mask(struct mm_struct *mm) +{ + return -1UL >> mm->context.pmlen; +} +#endif + +#define deactivate_mm deactivate_mm +static inline void deactivate_mm(struct task_struct *tsk, + struct mm_struct *mm) +{ + shstk_release(tsk); +} + #include #endif /* _ASM_RISCV_MMU_CONTEXT_H */ diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 6edb47d8ee231..d3b906a4bdb67 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -198,6 +198,7 @@ extern struct pt_alloc_ops pt_ops __meminitdata; #define PAGE_READ_EXEC __pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC) #define PAGE_WRITE_EXEC __pgprot(_PAGE_BASE | _PAGE_READ | \ _PAGE_EXEC | _PAGE_WRITE) +#define PAGE_SHADOWSTACK __pgprot(_PAGE_BASE | _PAGE_WRITE) #define PAGE_COPY PAGE_READ #define PAGE_COPY_EXEC PAGE_READ_EXEC @@ -426,16 +427,25 @@ static inline int pte_devmap(pte_t pte) static inline pte_t pte_wrprotect(pte_t pte) { - return __pte(pte_val(pte) & ~(_PAGE_WRITE)); + return __pte((pte_val(pte) & ~(_PAGE_WRITE)) | (_PAGE_READ)); } /* static inline pte_t pte_mkread(pte_t pte) */ +struct vm_area_struct; +pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma); +#define pte_mkwrite pte_mkwrite + static inline pte_t pte_mkwrite_novma(pte_t pte) { return __pte(pte_val(pte) | _PAGE_WRITE); } +static inline pte_t pte_mkwrite_shstk(pte_t pte) +{ + return __pte((pte_val(pte) & ~(_PAGE_LEAF)) | _PAGE_WRITE); +} + /* static inline pte_t pte_mkexec(pte_t pte) */ static inline pte_t pte_mkdirty(pte_t pte) @@ -626,7 +636,15 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep) { - atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep); + pte_t read_pte = READ_ONCE(*ptep); + /* + * ptep_set_wrprotect can be called for shadow stack ranges too. + * shadow stack memory is XWR = 010 and thus clearing _PAGE_WRITE will lead to + * encoding 000b which is wrong encoding with V = 1. This should lead to page fault + * but we dont want this wrong configuration to be set in page tables. + */ + atomic_long_set((atomic_long_t *)ptep, + ((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ)); } #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH @@ -778,11 +796,19 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd) return pte_pmd(pte_mkyoung(pmd_pte(pmd))); } +pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma); +#define pmd_mkwrite pmd_mkwrite + static inline pmd_t pmd_mkwrite_novma(pmd_t pmd) { return pte_pmd(pte_mkwrite_novma(pmd_pte(pmd))); } +static inline pmd_t pmd_mkwrite_shstk(pmd_t pte) +{ + return __pmd((pmd_val(pte) & ~(_PAGE_LEAF)) | _PAGE_WRITE); +} + static inline pmd_t pmd_wrprotect(pmd_t pmd) { return pte_pmd(pte_wrprotect(pmd_pte(pmd))); diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h index f5e0a127351fd..44ddcdb979940 100644 --- a/arch/riscv/include/asm/processor.h +++ b/arch/riscv/include/asm/processor.h @@ -16,6 +16,7 @@ #include #include #include +#include #define arch_get_mmap_end(addr, len, flags) \ ({ \ diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h index 59357bb8ffb34..f5d5e54172ea4 100644 --- a/arch/riscv/include/asm/thread_info.h +++ b/arch/riscv/include/asm/thread_info.h @@ -69,6 +69,9 @@ struct thread_info { */ unsigned long a0, a1, a2; #endif +#ifdef CONFIG_RISCV_USER_CFI + struct cfi_state user_cfi_state; +#endif }; /* diff --git a/arch/riscv/include/asm/usercfi.h b/arch/riscv/include/asm/usercfi.h new file mode 100644 index 0000000000000..f56966edbf5c6 --- /dev/null +++ b/arch/riscv/include/asm/usercfi.h @@ -0,0 +1,97 @@ +/* SPDX-License-Identifier: GPL-2.0 + * Copyright (C) 2024 Rivos, Inc. + * Deepak Gupta + */ +#ifndef _ASM_RISCV_USERCFI_H +#define _ASM_RISCV_USERCFI_H + +#define CMDLINE_DISABLE_RISCV_USERCFI_FCFI 1 +#define CMDLINE_DISABLE_RISCV_USERCFI_BCFI 2 +#define CMDLINE_DISABLE_RISCV_USERCFI 3 + +#ifndef __ASSEMBLER__ +#include +#include +#include + +struct task_struct; +struct kernel_clone_args; + +extern unsigned long riscv_nousercfi; + +#ifdef CONFIG_RISCV_USER_CFI +struct cfi_state { + unsigned long ubcfi_en : 1; /* Enable for backward cfi. */ + unsigned long ubcfi_locked : 1; + unsigned long ufcfi_en : 1; /* Enable for forward cfi. Note that ELP goes in sstatus */ + unsigned long ufcfi_locked : 1; + unsigned long user_shdw_stk; /* Current user shadow stack pointer */ + unsigned long shdw_stk_base; /* Base address of shadow stack */ + unsigned long shdw_stk_size; /* size of shadow stack */ +}; + +unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, + const struct kernel_clone_args *args); +void shstk_release(struct task_struct *tsk); +void set_shstk_base(struct task_struct *task, unsigned long shstk_addr, unsigned long size); +unsigned long get_shstk_base(struct task_struct *task, unsigned long *size); +void set_active_shstk(struct task_struct *task, unsigned long shstk_addr); +bool is_shstk_enabled(struct task_struct *task); +bool is_shstk_locked(struct task_struct *task); +bool is_shstk_allocated(struct task_struct *task); +void set_shstk_lock(struct task_struct *task, bool lock); +void set_shstk_status(struct task_struct *task, bool enable); +unsigned long get_active_shstk(struct task_struct *task); +int restore_user_shstk(struct task_struct *tsk, unsigned long shstk_ptr); +int save_user_shstk(struct task_struct *tsk, unsigned long *saved_shstk_ptr); +bool is_indir_lp_enabled(struct task_struct *task); +bool is_indir_lp_locked(struct task_struct *task); +void set_indir_lp_status(struct task_struct *task, bool enable); +void set_indir_lp_lock(struct task_struct *task, bool lock); + +#define PR_SHADOW_STACK_SUPPORTED_STATUS_MASK (PR_SHADOW_STACK_ENABLE) + +#else + +#define shstk_alloc_thread_stack(tsk, args) 0 + +#define shstk_release(tsk) + +#define get_shstk_base(task, size) 0UL + +#define set_shstk_base(task, shstk_addr, size) do {} while (0) + +#define set_active_shstk(task, shstk_addr) do {} while (0) + +#define is_shstk_enabled(task) false + +#define is_shstk_locked(task) false + +#define is_shstk_allocated(task) false + +#define set_shstk_lock(task, lock) do {} while (0) + +#define set_shstk_status(task, enable) do {} while (0) + +#define is_indir_lp_enabled(task) false + +#define is_indir_lp_locked(task) false + +#define set_indir_lp_status(task, enable) do {} while (0) + +#define set_indir_lp_lock(task, lock) do {} while (0) + +#define restore_user_shstk(tsk, shstk_ptr) -EINVAL + +#define save_user_shstk(tsk, saved_shstk_ptr) -EINVAL + +#define get_active_shstk(task) 0UL + +#endif /* CONFIG_RISCV_USER_CFI */ + +bool is_user_shstk_enabled(void); +bool is_user_lpad_enabled(void); + +#endif /* __ASSEMBLER__ */ + +#endif /* _ASM_RISCV_USERCFI_H */ diff --git a/arch/riscv/include/asm/vdso.h b/arch/riscv/include/asm/vdso.h index f891478829a52..aa6355e18b60f 100644 --- a/arch/riscv/include/asm/vdso.h +++ b/arch/riscv/include/asm/vdso.h @@ -18,9 +18,19 @@ #ifndef __ASSEMBLY__ #include +#ifdef CONFIG_RISCV_USER_CFI +#include +#endif +#ifdef CONFIG_RISCV_USER_CFI #define VDSO_SYMBOL(base, name) \ - (void __user *)((unsigned long)(base) + __vdso_##name##_offset) + (riscv_has_extension_unlikely(RISCV_ISA_EXT_ZIMOP) ? \ + (void __user *)((unsigned long)(base) + __vdso_##name##_cfi_offset) : \ + (void __user *)((unsigned long)(base) + __vdso_##name##_offset)) +#else +#define VDSO_SYMBOL(base, name) \ + ((void __user *)((unsigned long)(base) + __vdso_##name##_offset)) +#endif #ifdef CONFIG_COMPAT #include @@ -33,6 +43,7 @@ extern char compat_vdso_start[], compat_vdso_end[]; #endif /* CONFIG_COMPAT */ extern char vdso_start[], vdso_end[]; +extern char vdso_cfi_start[], vdso_cfi_end[]; #endif /* !__ASSEMBLY__ */ diff --git a/arch/riscv/include/asm/vdso/getrandom.h b/arch/riscv/include/asm/vdso/getrandom.h new file mode 100644 index 0000000000000..8dc92441702a9 --- /dev/null +++ b/arch/riscv/include/asm/vdso/getrandom.h @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright (C) 2025 Xi Ruoyao . All Rights Reserved. + */ +#ifndef __ASM_VDSO_GETRANDOM_H +#define __ASM_VDSO_GETRANDOM_H + +#ifndef __ASSEMBLY__ + +#include + +static __always_inline ssize_t getrandom_syscall(void *_buffer, size_t _len, unsigned int _flags) +{ + register long ret asm("a0"); + register long nr asm("a7") = __NR_getrandom; + register void *buffer asm("a0") = _buffer; + register size_t len asm("a1") = _len; + register unsigned int flags asm("a2") = _flags; + + asm volatile ("ecall\n" + : "+r" (ret) + : "r" (nr), "r" (buffer), "r" (len), "r" (flags) + : "memory"); + + return ret; +} + +#endif /* !__ASSEMBLY__ */ + +#endif /* __ASM_VDSO_GETRANDOM_H */ diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h index a914f912627ed..12315d05ba860 100644 --- a/arch/riscv/include/asm/vector.h +++ b/arch/riscv/include/asm/vector.h @@ -386,6 +386,9 @@ static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; } #define riscv_v_thread_free(tsk) do {} while (0) #define riscv_v_setup_ctx_cache() do {} while (0) #define riscv_v_thread_alloc(tsk) do {} while (0) +#define get_cpu_vector_context() do {} while (0) +#define put_cpu_vector_context() do {} while (0) +#define riscv_v_vstate_set_restore(task, regs) do {} while (0) #endif /* CONFIG_RISCV_ISA_V */ diff --git a/arch/riscv/include/uapi/asm/hwprobe.h b/arch/riscv/include/uapi/asm/hwprobe.h index 3a53e5c92b6d0..8635f0f6946ac 100644 --- a/arch/riscv/include/uapi/asm/hwprobe.h +++ b/arch/riscv/include/uapi/asm/hwprobe.h @@ -84,6 +84,10 @@ struct riscv_hwprobe { #define RISCV_HWPROBE_EXT_ZABHA (1ULL << 58) #define RISCV_HWPROBE_EXT_ZALASR (1ULL << 59) #define RISCV_HWPROBE_EXT_ZICBOP (1ULL << 60) +#define RISCV_HWPROBE_EXT_ZILSD (1ULL << 61) +#define RISCV_HWPROBE_EXT_ZCLSD (1ULL << 62) +#define RISCV_HWPROBE_EXT_ZICFILP (1ULL << 63) + #define RISCV_HWPROBE_KEY_CPUPERF_0 5 #define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0) #define RISCV_HWPROBE_MISALIGNED_EMULATED (1 << 0) @@ -110,6 +114,9 @@ struct riscv_hwprobe { #define RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0 13 #define RISCV_HWPROBE_KEY_VENDOR_EXT_MIPS_0 14 #define RISCV_HWPROBE_KEY_ZICBOP_BLOCK_SIZE 15 +#define RISCV_HWPROBE_KEY_IMA_EXT_1 16 +#define RISCV_HWPROBE_EXT_ZICFISS (1ULL << 0) + /* Increase RISCV_HWPROBE_MAX_KEY when adding items. */ /* Flags */ diff --git a/arch/riscv/include/uapi/asm/ptrace.h b/arch/riscv/include/uapi/asm/ptrace.h index a38268b19c3d3..a252ed2dffe76 100644 --- a/arch/riscv/include/uapi/asm/ptrace.h +++ b/arch/riscv/include/uapi/asm/ptrace.h @@ -9,6 +9,7 @@ #ifndef __ASSEMBLY__ #include +#include #define PTRACE_GETFDPIC 33 @@ -127,6 +128,42 @@ struct __riscv_v_regset_state { */ #define RISCV_MAX_VLENB (8192) +struct __sc_riscv_cfi_state { + unsigned long ss_ptr; /* shadow stack pointer */ +}; + +#define PTRACE_CFI_BRANCH_LANDING_PAD_EN_BIT 0 +#define PTRACE_CFI_BRANCH_LANDING_PAD_LOCK_BIT 1 +#define PTRACE_CFI_BRANCH_EXPECTED_LANDING_PAD_BIT 2 +#define PTRACE_CFI_SHADOW_STACK_EN_BIT 3 +#define PTRACE_CFI_SHADOW_STACK_LOCK_BIT 4 +#define PTRACE_CFI_SHADOW_STACK_PTR_BIT 5 + +#define PTRACE_CFI_BRANCH_LANDING_PAD_EN_STATE _BITUL(PTRACE_CFI_BRANCH_LANDING_PAD_EN_BIT) +#define PTRACE_CFI_BRANCH_LANDING_PAD_LOCK_STATE \ + _BITUL(PTRACE_CFI_BRANCH_LANDING_PAD_LOCK_BIT) +#define PTRACE_CFI_BRANCH_EXPECTED_LANDING_PAD_STATE \ + _BITUL(PTRACE_CFI_BRANCH_EXPECTED_LANDING_PAD_BIT) +#define PTRACE_CFI_SHADOW_STACK_EN_STATE _BITUL(PTRACE_CFI_SHADOW_STACK_EN_BIT) +#define PTRACE_CFI_SHADOW_STACK_LOCK_STATE _BITUL(PTRACE_CFI_SHADOW_STACK_LOCK_BIT) +#define PTRACE_CFI_SHADOW_STACK_PTR_STATE _BITUL(PTRACE_CFI_SHADOW_STACK_PTR_BIT) + +#define PTRACE_CFI_STATE_INVALID_MASK ~(PTRACE_CFI_BRANCH_LANDING_PAD_EN_STATE | \ + PTRACE_CFI_BRANCH_LANDING_PAD_LOCK_STATE | \ + PTRACE_CFI_BRANCH_EXPECTED_LANDING_PAD_STATE | \ + PTRACE_CFI_SHADOW_STACK_EN_STATE | \ + PTRACE_CFI_SHADOW_STACK_LOCK_STATE | \ + PTRACE_CFI_SHADOW_STACK_PTR_STATE) + +struct __cfi_status { + __u64 cfi_state; +}; + +struct user_cfi_state { + struct __cfi_status cfi_status; + __u64 shstk_ptr; +}; + #endif /* __ASSEMBLY__ */ #endif /* _UAPI_ASM_RISCV_PTRACE_H */ diff --git a/arch/riscv/include/uapi/asm/sigcontext.h b/arch/riscv/include/uapi/asm/sigcontext.h index cd4f175dc8376..f37e4beffe036 100644 --- a/arch/riscv/include/uapi/asm/sigcontext.h +++ b/arch/riscv/include/uapi/asm/sigcontext.h @@ -10,6 +10,7 @@ /* The Magic number for signal context frame header. */ #define RISCV_V_MAGIC 0x53465457 +#define RISCV_ZICFISS_MAGIC 0x9487 #define END_MAGIC 0x0 /* The size of END signal context header. */ diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile index 127ff892c4704..c9ba8d4c1ea6b 100644 --- a/arch/riscv/kernel/Makefile +++ b/arch/riscv/kernel/Makefile @@ -66,6 +66,7 @@ obj-y += vendor_extensions.o obj-y += vendor_extensions/ obj-y += probes/ obj-$(CONFIG_MMU) += vdso.o vdso/ +obj-$(CONFIG_RISCV_USER_CFI) += vdso_cfi/ obj-$(CONFIG_RISCV_MISALIGNED) += traps_misaligned.o obj-$(CONFIG_RISCV_MISALIGNED) += unaligned_access_speed.o @@ -117,3 +118,6 @@ obj-$(CONFIG_COMPAT) += compat_vdso/ obj-$(CONFIG_64BIT) += pi/ obj-$(CONFIG_ACPI) += acpi.o obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o + +obj-$(CONFIG_GENERIC_CPU_VULNERABILITIES) += bugs.o +obj-$(CONFIG_RISCV_USER_CFI) += usercfi.o diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c index 2e490845eff13..6e8aacc637626 100644 --- a/arch/riscv/kernel/asm-offsets.c +++ b/arch/riscv/kernel/asm-offsets.c @@ -44,6 +44,10 @@ void asm_offsets(void) OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp); OFFSET(TASK_TI_CPU_NUM, task_struct, thread_info.cpu); +#ifdef CONFIG_RISCV_USER_CFI + OFFSET(TASK_TI_CFI_STATE, task_struct, thread_info.user_cfi_state); + OFFSET(TASK_TI_USER_SSP, task_struct, thread_info.user_cfi_state.user_shdw_stk); +#endif OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]); OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]); OFFSET(TASK_THREAD_F2, task_struct, thread.fstate.f[2]); @@ -506,4 +510,10 @@ void asm_offsets(void) #define ASM_MAX_CPUS NR_CPUS DEFINE(ASM_NR_CPUS, ASM_MAX_CPUS); #endif +#ifdef CONFIG_RISCV_SBI + DEFINE(SBI_EXT_FWFT, SBI_EXT_FWFT); + DEFINE(SBI_EXT_FWFT_SET, SBI_EXT_FWFT_SET); + DEFINE(SBI_FWFT_SHADOW_STACK, SBI_FWFT_SHADOW_STACK); + DEFINE(SBI_FWFT_SET_FLAG_LOCK, SBI_FWFT_SET_FLAG_LOCK); +#endif } diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index f2f1fdcd7d689..386ac4e6faa94 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -25,6 +25,7 @@ #include #include #include +#include #define NUM_ALPHA_EXTS ('z' - 'a' + 1) @@ -200,6 +201,26 @@ static int riscv_ext_svadu_validate(const struct riscv_isa_ext_data *data, return 0; } +static int riscv_cfilp_validate(const struct riscv_isa_ext_data *data, + const unsigned long *isa_bitmap) +{ + if (!IS_ENABLED(CONFIG_RISCV_USER_CFI) || + (riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_FCFI)) + return -EINVAL; + + return 0; +} + +static int riscv_cfiss_validate(const struct riscv_isa_ext_data *data, + const unsigned long *isa_bitmap) +{ + if (!IS_ENABLED(CONFIG_RISCV_USER_CFI) || + (riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_BCFI)) + return -EINVAL; + + return 0; +} + static const unsigned int riscv_a_exts[] = { RISCV_ISA_EXT_ZAAMO, RISCV_ISA_EXT_ZALRSC, @@ -389,6 +410,10 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = { __RISCV_ISA_EXT_SUPERSET_VALIDATE(zicboz, RISCV_ISA_EXT_ZICBOZ, riscv_xlinuxenvcfg_exts, riscv_ext_zicboz_validate), __RISCV_ISA_EXT_DATA(ziccrse, RISCV_ISA_EXT_ZICCRSE), + __RISCV_ISA_EXT_SUPERSET_VALIDATE(zicfilp, RISCV_ISA_EXT_ZICFILP, riscv_xlinuxenvcfg_exts, + riscv_cfilp_validate), + __RISCV_ISA_EXT_SUPERSET_VALIDATE(zicfiss, RISCV_ISA_EXT_ZICFISS, riscv_xlinuxenvcfg_exts, + riscv_cfiss_validate), __RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR), __RISCV_ISA_EXT_DATA(zicond, RISCV_ISA_EXT_ZICOND), __RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR), diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 6a498239bf10f..0052b22914815 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -90,6 +90,35 @@ _new_vmalloc_restore_context_a0: REG_L a0, TASK_TI_A0(tp) .endm +/* + * If previous mode was U, capture shadow stack pointer and save it away + * Zero CSR_SSP at the same time for sanitization. + */ +.macro save_userssp tmp, status + ALTERNATIVE("nops(4)", + __stringify( \ + andi \tmp, \status, SR_SPP; \ + bnez \tmp, skip_ssp_save; \ + csrrw \tmp, CSR_SSP, x0; \ + REG_S \tmp, TASK_TI_USER_SSP(tp); \ + skip_ssp_save:), + 0, + RISCV_ISA_EXT_ZICFISS, + CONFIG_RISCV_USER_CFI) +.endm + +.macro restore_userssp tmp, status + ALTERNATIVE("nops(4)", + __stringify( \ + andi \tmp, \status, SR_SPP; \ + bnez \tmp, skip_ssp_restore; \ + REG_L \tmp, TASK_TI_USER_SSP(tp); \ + csrw CSR_SSP, \tmp; \ + skip_ssp_restore:), + 0, + RISCV_ISA_EXT_ZICFISS, + CONFIG_RISCV_USER_CFI) +.endm SYM_CODE_START(handle_exception) /* @@ -143,9 +172,14 @@ SYM_CODE_START(handle_exception) * or vector in kernel space. */ li t0, SR_SUM | SR_FS_VS +#ifdef CONFIG_64BIT + li t1, SR_ELP + or t0, t0, t1 +#endif REG_L s0, TASK_TI_USER_SP(tp) csrrc s1, CSR_STATUS, t0 + save_userssp s2, s1 csrr s2, CSR_EPC csrr s3, CSR_TVAL csrr s4, CSR_CAUSE @@ -234,6 +268,7 @@ SYM_CODE_START_NOALIGN(ret_from_exception) call riscv_v_context_nesting_end #endif REG_L a0, PT_STATUS(sp) + restore_userssp s3, a0 /* * The current load reservation is effectively part of the processor's * state, in the sense that load reservations cannot be shared between @@ -311,17 +346,21 @@ SYM_CODE_END(handle_kernel_stack_overflow) ASM_NOKPROBE(handle_kernel_stack_overflow) #endif -SYM_CODE_START(ret_from_fork) +SYM_CODE_START(ret_from_fork_kernel_asm) + call schedule_tail + move a0, s1 /* fn_arg */ + move a1, s0 /* fn */ + move a2, sp /* pt_regs */ + call ret_from_fork_kernel + j ret_from_exception +SYM_CODE_END(ret_from_fork_kernel_asm) + +SYM_CODE_START(ret_from_fork_user_asm) call schedule_tail - beqz s0, 1f /* not from kernel thread */ - /* Call fn(arg) */ - move a0, s1 - jalr s0 -1: move a0, sp /* pt_regs */ - call syscall_exit_to_user_mode + call ret_from_fork_user j ret_from_exception -SYM_CODE_END(ret_from_fork) +SYM_CODE_END(ret_from_fork_user_asm) /* * Integer register context switch @@ -406,6 +445,9 @@ SYM_DATA_START_LOCAL(excp_vect_table) RISCV_PTR do_page_fault /* load page fault */ RISCV_PTR do_trap_unknown RISCV_PTR do_page_fault /* store page fault */ + RISCV_PTR do_trap_unknown /* cause=16 */ + RISCV_PTR do_trap_unknown /* cause=17 */ + RISCV_PTR do_trap_software_check /* cause=18 is sw check exception */ SYM_DATA_END_LABEL(excp_vect_table, SYM_L_LOCAL, excp_vect_table_end) #ifndef CONFIG_MMU diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index 8eec82355f92f..7b4beec9fc5b6 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -15,6 +15,7 @@ #include #include #include +#include #include "efi-header.S" __HEAD @@ -170,6 +171,19 @@ secondary_start_sbi: call relocate_enable_mmu #endif call .Lsetup_trap_vector +#if defined(CONFIG_RISCV_SBI) && defined(CONFIG_RISCV_USER_CFI) + li a7, SBI_EXT_FWFT + li a6, SBI_EXT_FWFT_SET + li a0, SBI_FWFT_SHADOW_STACK + li a1, 1 /* enable supervisor to access shadow stack access */ + li a2, SBI_FWFT_SET_FLAG_LOCK + ecall + beqz a0, 1f + la a1, riscv_nousercfi + li a0, CMDLINE_DISABLE_RISCV_USERCFI_BCFI + REG_S a0, (a1) +1: +#endif tail smp_callin #endif /* CONFIG_SMP */ @@ -320,6 +334,19 @@ SYM_CODE_START(_start_kernel) la tp, init_task la sp, init_thread_union + THREAD_SIZE addi sp, sp, -PT_SIZE_ON_STACK +#if defined(CONFIG_RISCV_SBI) && defined(CONFIG_RISCV_USER_CFI) + li a7, SBI_EXT_FWFT + li a6, SBI_EXT_FWFT_SET + li a0, SBI_FWFT_SHADOW_STACK + li a1, 1 /* enable supervisor to access shadow stack access */ + li a2, SBI_FWFT_SET_FLAG_LOCK + ecall + beqz a0, 1f + la a1, riscv_nousercfi + li a0, CMDLINE_DISABLE_RISCV_USERCFI_BCFI + REG_S a0, (a1) +1: +#endif #ifdef CONFIG_KASAN call kasan_early_init diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index b8c2eeb57b945..e734f1c0358cb 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -15,7 +15,9 @@ #include #include #include +#include +#include #include #include #include @@ -26,6 +28,8 @@ #include #include #include +#include +#include #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK) #include @@ -33,7 +37,8 @@ unsigned long __stack_chk_guard __read_mostly; EXPORT_SYMBOL(__stack_chk_guard); #endif -extern asmlinkage void ret_from_fork(void); +extern asmlinkage void ret_from_fork_kernel_asm(void); +extern asmlinkage void ret_from_fork_user_asm(void); void arch_cpu_idle(void) { @@ -86,8 +91,8 @@ void __show_regs(struct pt_regs *regs) regs->s8, regs->s9, regs->s10); pr_cont(" s11: " REG_FMT " t3 : " REG_FMT " t4 : " REG_FMT "\n", regs->s11, regs->t3, regs->t4); - pr_cont(" t5 : " REG_FMT " t6 : " REG_FMT "\n", - regs->t5, regs->t6); + pr_cont(" t5 : " REG_FMT " t6 : " REG_FMT " ssp : " REG_FMT "\n", + regs->t5, regs->t6, get_active_shstk(current)); pr_cont("status: " REG_FMT " badaddr: " REG_FMT " cause: " REG_FMT "\n", regs->status, regs->badaddr, regs->cause); @@ -142,6 +147,21 @@ void start_thread(struct pt_regs *regs, unsigned long pc, regs->epc = pc; regs->sp = sp; + /* + * clear shadow stack state on exec. + * libc will set it later via prctl. + */ + set_shstk_lock(current, false); + set_shstk_status(current, false); + set_shstk_base(current, 0, 0); + set_active_shstk(current, 0); + /* + * disable indirect branch tracking on exec. + * libc will enable it later via prctl. + */ + set_indir_lp_lock(current, false); + set_indir_lp_status(current, false); + #ifdef CONFIG_64BIT regs->status &= ~SR_UXL; @@ -192,11 +212,24 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) return 0; } +asmlinkage void ret_from_fork_kernel(void *fn_arg, int (*fn)(void *), struct pt_regs *regs) +{ + fn(fn_arg); + + syscall_exit_to_user_mode(regs); +} + +asmlinkage void ret_from_fork_user(struct pt_regs *regs) +{ + syscall_exit_to_user_mode(regs); +} + int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) { unsigned long clone_flags = args->flags; unsigned long usp = args->stack; unsigned long tls = args->tls; + unsigned long ssp = 0; struct pt_regs *childregs = task_pt_regs(p); memset(&p->thread.s, 0, sizeof(p->thread.s)); @@ -210,21 +243,29 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) p->thread.s[0] = (unsigned long)args->fn; p->thread.s[1] = (unsigned long)args->fn_arg; + p->thread.ra = (unsigned long)ret_from_fork_kernel_asm; } else { + /* allocate new shadow stack if needed. In case of CLONE_VM we have to */ + ssp = shstk_alloc_thread_stack(p, args); + if (IS_ERR_VALUE(ssp)) + return PTR_ERR((void *)ssp); + *childregs = *(current_pt_regs()); /* Turn off status.VS */ riscv_v_vstate_off(childregs); if (usp) /* User fork */ childregs->sp = usp; + /* if needed, set new ssp */ + if (ssp) + set_active_shstk(p, ssp); if (clone_flags & CLONE_SETTLS) childregs->tp = tls; childregs->a0 = 0; /* Return value of fork() */ - p->thread.s[0] = 0; + p->thread.ra = (unsigned long)ret_from_fork_user_asm; } p->thread.riscv_v_flags = 0; if (has_vector()) riscv_v_thread_alloc(p); - p->thread.ra = (unsigned long)ret_from_fork; p->thread.sp = (unsigned long)childregs; /* kernel sp */ return 0; } diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c index e8515aa9d80bf..5087d3a1dce8d 100644 --- a/arch/riscv/kernel/ptrace.c +++ b/arch/riscv/kernel/ptrace.c @@ -19,6 +19,7 @@ #include #include #include +#include enum riscv_regset { REGSET_X, @@ -28,6 +29,9 @@ enum riscv_regset { #ifdef CONFIG_RISCV_ISA_V REGSET_V, #endif +#ifdef CONFIG_RISCV_USER_CFI + REGSET_CFI, +#endif }; static int riscv_gpr_get(struct task_struct *target, @@ -152,6 +156,87 @@ static int riscv_vr_set(struct task_struct *target, } #endif +#ifdef CONFIG_RISCV_USER_CFI +static int riscv_cfi_get(struct task_struct *target, + const struct user_regset *regset, + struct membuf to) +{ + struct user_cfi_state user_cfi; + struct pt_regs *regs; + + memset(&user_cfi, 0, sizeof(user_cfi)); + regs = task_pt_regs(target); + + if (is_indir_lp_enabled(target)) { + user_cfi.cfi_status.cfi_state |= PTRACE_CFI_BRANCH_LANDING_PAD_EN_STATE; + user_cfi.cfi_status.cfi_state |= is_indir_lp_locked(target) ? + PTRACE_CFI_BRANCH_LANDING_PAD_LOCK_STATE : 0; + user_cfi.cfi_status.cfi_state |= (regs->status & SR_ELP) ? + PTRACE_CFI_BRANCH_EXPECTED_LANDING_PAD_STATE : 0; + } + + if (is_shstk_enabled(target)) { + user_cfi.cfi_status.cfi_state |= (PTRACE_CFI_SHADOW_STACK_EN_STATE | + PTRACE_CFI_SHADOW_STACK_PTR_STATE); + user_cfi.cfi_status.cfi_state |= is_shstk_locked(target) ? + PTRACE_CFI_SHADOW_STACK_LOCK_STATE : 0; + user_cfi.shstk_ptr = get_active_shstk(target); + } + + return membuf_write(&to, &user_cfi, sizeof(user_cfi)); +} + +/* + * Does it make sense to allow enable / disable of cfi via ptrace? + * We don't allow enable / disable / locking control via ptrace for now. + * Setting the shadow stack pointer is allowed. GDB might use it to unwind or + * some other fixup. Similarly gdb might want to suppress elp and may want + * to reset elp state. + */ +static int riscv_cfi_set(struct task_struct *target, + const struct user_regset *regset, + unsigned int pos, unsigned int count, + const void *kbuf, const void __user *ubuf) +{ + int ret; + struct user_cfi_state user_cfi; + struct pt_regs *regs; + + regs = task_pt_regs(target); + + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &user_cfi, 0, -1); + if (ret) + return ret; + + /* + * Not allowing enabling or locking shadow stack or landing pad + * There is no disabling of shadow stack or landing pad via ptrace + * rsvd field should be set to zero so that if those fields are needed in future + */ + if ((user_cfi.cfi_status.cfi_state & + (PTRACE_CFI_BRANCH_LANDING_PAD_EN_STATE | PTRACE_CFI_BRANCH_LANDING_PAD_LOCK_STATE | + PTRACE_CFI_SHADOW_STACK_EN_STATE | PTRACE_CFI_SHADOW_STACK_LOCK_STATE)) || + (user_cfi.cfi_status.cfi_state & PTRACE_CFI_STATE_INVALID_MASK)) + return -EINVAL; + + /* If lpad is enabled on target and ptrace requests to set / clear elp, do that */ + if (is_indir_lp_enabled(target)) { + if (user_cfi.cfi_status.cfi_state & + PTRACE_CFI_BRANCH_EXPECTED_LANDING_PAD_STATE) /* set elp state */ + regs->status |= SR_ELP; + else + regs->status &= ~SR_ELP; /* clear elp state */ + } + + /* If shadow stack enabled on target, set new shadow stack pointer */ + if (is_shstk_enabled(target) && + (user_cfi.cfi_status.cfi_state & PTRACE_CFI_SHADOW_STACK_PTR_STATE)) + set_active_shstk(target, user_cfi.shstk_ptr); + + return 0; +} +#endif + static const struct user_regset riscv_user_regset[] = { [REGSET_X] = { .core_note_type = NT_PRSTATUS, @@ -182,6 +267,16 @@ static const struct user_regset riscv_user_regset[] = { .set = riscv_vr_set, }, #endif +#ifdef CONFIG_RISCV_USER_CFI + [REGSET_CFI] = { + .core_note_type = NT_RISCV_USER_CFI, + .align = sizeof(__u64), + .n = sizeof(struct user_cfi_state) / sizeof(__u64), + .size = sizeof(__u64), + .regset_get = riscv_cfi_get, + .set = riscv_cfi_set, + }, +#endif }; static const struct user_regset_view riscv_user_native_view = { diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c index 2ada1fd16bc0d..bf6a499bc27da 100644 --- a/arch/riscv/kernel/signal.c +++ b/arch/riscv/kernel/signal.c @@ -22,11 +22,13 @@ #include #include #include +#include unsigned long signal_minsigstksz __ro_after_init; extern u32 __user_rt_sigreturn[2]; static size_t riscv_v_sc_size __ro_after_init; +static size_t riscv_zicfiss_sc_size __ro_after_init; #define DEBUG_SIG 0 @@ -68,18 +70,19 @@ static long save_fp_state(struct pt_regs *regs, #define restore_fp_state(task, regs) (0) #endif -#ifdef CONFIG_RISCV_ISA_V - -static long save_v_state(struct pt_regs *regs, void __user **sc_vec) +static long save_v_state(struct pt_regs *regs, void __user *sc_vec) { - struct __riscv_ctx_hdr __user *hdr; struct __sc_riscv_v_state __user *state; void __user *datap; long err; - hdr = *sc_vec; - /* Place state to the user's signal context space after the hdr */ - state = (struct __sc_riscv_v_state __user *)(hdr + 1); + if (!IS_ENABLED(CONFIG_RISCV_ISA_V) || + !((has_vector() || has_xtheadvector()) && + riscv_v_vstate_query(regs))) + return 0; + + /* Place state to the user's signal context space */ + state = (struct __sc_riscv_v_state __user *)sc_vec; /* Point datap right after the end of __sc_riscv_v_state */ datap = state + 1; @@ -97,15 +100,11 @@ static long save_v_state(struct pt_regs *regs, void __user **sc_vec) err |= __put_user((__force void *)datap, &state->v_state.datap); /* Copy the whole vector content to user space datap. */ err |= __copy_to_user(datap, current->thread.vstate.datap, riscv_v_vsize); - /* Copy magic to the user space after saving all vector conetext */ - err |= __put_user(RISCV_V_MAGIC, &hdr->magic); - err |= __put_user(riscv_v_sc_size, &hdr->size); if (unlikely(err)) - return err; + return -EFAULT; - /* Only progress the sv_vec if everything has done successfully */ - *sc_vec += riscv_v_sc_size; - return 0; + /* Only return the size if everything has done successfully */ + return riscv_v_sc_size; } /* @@ -142,10 +141,80 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec) */ return copy_from_user(current->thread.vstate.datap, datap, riscv_v_vsize); } -#else -#define save_v_state(task, regs) (0) -#define __restore_v_state(task, regs) (0) -#endif + +static long save_cfiss_state(struct pt_regs *regs, void __user *sc_cfi) +{ + struct __sc_riscv_cfi_state __user *state = sc_cfi; + unsigned long ss_ptr = 0; + long err = 0; + + if (!is_shstk_enabled(current)) + return 0; + + /* + * Save a pointer to the shadow stack itself on shadow stack as a form of token. + * A token on the shadow stack gives the following properties: + * - Safe save and restore for shadow stack switching. Any save of a shadow stack + * must have saved a token on the shadow stack. Similarly any restore of shadow + * stack must check the token before restore. Since writing to the shadow stack with + * address of the shadow stack itself is not easily allowed, a restore without a save + * is quite difficult for an attacker to perform. + * - A natural break. A token in shadow stack provides a natural break in shadow stack + * So a single linear range can be bucketed into different shadow stack segments. Any + * sspopchk will detect the condition and fault to kernel as a sw check exception. + */ + err |= save_user_shstk(current, &ss_ptr); + err |= __put_user(ss_ptr, &state->ss_ptr); + if (unlikely(err)) + return -EFAULT; + + return riscv_zicfiss_sc_size; +} + +static long __restore_cfiss_state(struct pt_regs *regs, void __user *sc_cfi) +{ + struct __sc_riscv_cfi_state __user *state = sc_cfi; + unsigned long ss_ptr = 0; + long err; + + /* + * Restore shadow stack as a form of token stored on the shadow stack itself as a safe + * way to restore. + * A token on the shadow stack gives the following properties: + * - Safe save and restore for shadow stack switching. Any save of shadow stack + * must have saved a token on shadow stack. Similarly any restore of shadow + * stack must check the token before restore. Since writing to a shadow stack with + * the address of shadow stack itself is not easily allowed, a restore without a save + * is quite difficult for an attacker to perform. + * - A natural break. A token in the shadow stack provides a natural break in shadow stack + * So a single linear range can be bucketed into different shadow stack segments. + * sspopchk will detect the condition and fault to kernel as a sw check exception. + */ + err = __copy_from_user(&ss_ptr, &state->ss_ptr, sizeof(unsigned long)); + + if (unlikely(err)) + return err; + + return restore_user_shstk(current, ss_ptr); +} + +struct arch_ext_priv { + __u32 magic; + long (*save)(struct pt_regs *regs, void __user *sc_vec); +}; + +struct arch_ext_priv arch_ext_list[] = { + { + .magic = RISCV_V_MAGIC, + .save = &save_v_state, + }, + { + .magic = RISCV_ZICFISS_MAGIC, + .save = &save_cfiss_state, + }, +}; + +const size_t nr_arch_exts = ARRAY_SIZE(arch_ext_list); static long restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc) @@ -195,6 +264,12 @@ static long restore_sigcontext(struct pt_regs *regs, err = __restore_v_state(regs, sc_ext_ptr); break; + case RISCV_ZICFISS_MAGIC: + if (!is_shstk_enabled(current) || size != riscv_zicfiss_sc_size) + return -EINVAL; + + err = __restore_cfiss_state(regs, sc_ext_ptr); + break; default: return -EINVAL; } @@ -216,6 +291,16 @@ static size_t get_rt_frame_size(bool cal_all) total_context_size += riscv_v_sc_size; } + if (is_shstk_enabled(current)) + total_context_size += riscv_zicfiss_sc_size; + + /* + * Preserved a __riscv_ctx_hdr for END signal context header if an + * extension uses __riscv_extra_ext_header + */ + if (total_context_size) + total_context_size += sizeof(struct __riscv_ctx_hdr); + frame_size += total_context_size; frame_size = round_up(frame_size, 16); @@ -270,7 +355,8 @@ static long setup_sigcontext(struct rt_sigframe __user *frame, { struct sigcontext __user *sc = &frame->uc.uc_mcontext; struct __riscv_ctx_hdr __user *sc_ext_ptr = &sc->sc_extdesc.hdr; - long err; + struct arch_ext_priv *arch_ext; + long err, i, ext_size; /* sc_regs is structured the same as the start of pt_regs */ err = __copy_to_user(&sc->sc_regs, regs, sizeof(sc->sc_regs)); @@ -278,8 +364,20 @@ static long setup_sigcontext(struct rt_sigframe __user *frame, if (has_fpu()) err |= save_fp_state(regs, &sc->sc_fpregs); /* Save the vector state. */ - if (has_vector() && riscv_v_vstate_query(regs)) - err |= save_v_state(regs, (void __user **)&sc_ext_ptr); + for (i = 0; i < nr_arch_exts; i++) { + arch_ext = &arch_ext_list[i]; + if (!arch_ext->save) + continue; + + ext_size = arch_ext->save(regs, sc_ext_ptr + 1); + if (ext_size <= 0) { + err |= ext_size; + } else { + err |= __put_user(arch_ext->magic, &sc_ext_ptr->magic); + err |= __put_user(ext_size, &sc_ext_ptr->size); + sc_ext_ptr = (void *)sc_ext_ptr + ext_size; + } + } /* Write zero to fp-reserved space and check it on restore_sigcontext */ err |= __put_user(0, &sc->sc_extdesc.reserved); /* And put END __riscv_ctx_hdr at the end. */ @@ -339,6 +437,11 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set, #ifdef CONFIG_MMU regs->ra = (unsigned long)VDSO_SYMBOL( current->mm->context.vdso, rt_sigreturn); + + /* if bcfi is enabled x1 (ra) and x5 (t0) must match. not sure if we need this? */ + if (is_shstk_enabled(current)) + regs->t0 = regs->ra; + #else /* * For the nommu case we don't have a VDSO. Instead we push two @@ -467,6 +570,9 @@ void __init init_rt_signal_env(void) { riscv_v_sc_size = sizeof(struct __riscv_ctx_hdr) + sizeof(struct __sc_riscv_v_state) + riscv_v_vsize; + + riscv_zicfiss_sc_size = sizeof(struct __riscv_ctx_hdr) + + sizeof(struct __sc_riscv_cfi_state); /* * Determine the stack space required for guaranteed signal delivery. * The signal_minsigstksz will be populated into the AT_MINSIGSTKSZ entry diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c index 329f0b6b28298..ecfe61680aaa3 100644 --- a/arch/riscv/kernel/sys_hwprobe.c +++ b/arch/riscv/kernel/sys_hwprobe.c @@ -21,6 +21,14 @@ #include +#define EXT_KEY(isa_arg, ext, pv, missing) \ + do { \ + if (__riscv_isa_extension_available(isa_arg, RISCV_ISA_EXT_##ext)) \ + pv |= RISCV_HWPROBE_EXT_##ext; \ + else \ + missing |= RISCV_HWPROBE_EXT_##ext; \ + } while (false) + static void hwprobe_arch_id(struct riscv_hwprobe *pair, const struct cpumask *cpus) { @@ -84,86 +92,106 @@ static void hwprobe_isa_ext0(struct riscv_hwprobe *pair, for_each_cpu(cpu, cpus) { struct riscv_isainfo *isainfo = &hart_isa[cpu]; -#define EXT_KEY(ext) \ - do { \ - if (__riscv_isa_extension_available(isainfo->isa, RISCV_ISA_EXT_##ext)) \ - pair->value |= RISCV_HWPROBE_EXT_##ext; \ - else \ - missing |= RISCV_HWPROBE_EXT_##ext; \ - } while (false) - /* * Only use EXT_KEY() for extensions which can be exposed to userspace, * regardless of the kernel's configuration, as no other checks, besides * presence in the hart_isa bitmap, are made. */ - EXT_KEY(ZAAMO); - EXT_KEY(ZABHA); - EXT_KEY(ZACAS); - EXT_KEY(ZALASR); - EXT_KEY(ZALRSC); - EXT_KEY(ZAWRS); - EXT_KEY(ZBA); - EXT_KEY(ZBB); - EXT_KEY(ZBC); - EXT_KEY(ZBKB); - EXT_KEY(ZBKC); - EXT_KEY(ZBKX); - EXT_KEY(ZBS); - EXT_KEY(ZCA); - EXT_KEY(ZCB); - EXT_KEY(ZCMOP); - EXT_KEY(ZICBOM); - EXT_KEY(ZICBOP); - EXT_KEY(ZICBOZ); - EXT_KEY(ZICNTR); - EXT_KEY(ZICOND); - EXT_KEY(ZIHINTNTL); - EXT_KEY(ZIHINTPAUSE); - EXT_KEY(ZIHPM); - EXT_KEY(ZIMOP); - EXT_KEY(ZKND); - EXT_KEY(ZKNE); - EXT_KEY(ZKNH); - EXT_KEY(ZKSED); - EXT_KEY(ZKSH); - EXT_KEY(ZKT); - EXT_KEY(ZTSO); + EXT_KEY(isainfo->isa, ZAAMO, pair->value, missing); + EXT_KEY(isainfo->isa, ZABHA, pair->value, missing); + EXT_KEY(isainfo->isa, ZACAS, pair->value, missing); + EXT_KEY(isainfo->isa, ZALASR, pair->value, missing); + EXT_KEY(isainfo->isa, ZALRSC, pair->value, missing); + EXT_KEY(isainfo->isa, ZAWRS, pair->value, missing); + EXT_KEY(isainfo->isa, ZBA, pair->value, missing); + EXT_KEY(isainfo->isa, ZBB, pair->value, missing); + EXT_KEY(isainfo->isa, ZBC, pair->value, missing); + EXT_KEY(isainfo->isa, ZBKB, pair->value, missing); + EXT_KEY(isainfo->isa, ZBKC, pair->value, missing); + EXT_KEY(isainfo->isa, ZBKX, pair->value, missing); + EXT_KEY(isainfo->isa, ZBS, pair->value, missing); + EXT_KEY(isainfo->isa, ZCA, pair->value, missing); + EXT_KEY(isainfo->isa, ZCB, pair->value, missing); + EXT_KEY(isainfo->isa, ZCLSD, pair->value, missing); + EXT_KEY(isainfo->isa, ZCMOP, pair->value, missing); + EXT_KEY(isainfo->isa, ZICBOM, pair->value, missing); + EXT_KEY(isainfo->isa, ZICBOP, pair->value, missing); + EXT_KEY(isainfo->isa, ZICBOZ, pair->value, missing); + EXT_KEY(isainfo->isa, ZICFILP, pair->value, missing); + EXT_KEY(isainfo->isa, ZICNTR, pair->value, missing); + EXT_KEY(isainfo->isa, ZICOND, pair->value, missing); + EXT_KEY(isainfo->isa, ZIHINTNTL, pair->value, missing); + EXT_KEY(isainfo->isa, ZIHINTPAUSE, pair->value, missing); + EXT_KEY(isainfo->isa, ZIHPM, pair->value, missing); + EXT_KEY(isainfo->isa, ZILSD, pair->value, missing); + EXT_KEY(isainfo->isa, ZIMOP, pair->value, missing); + EXT_KEY(isainfo->isa, ZKND, pair->value, missing); + EXT_KEY(isainfo->isa, ZKNE, pair->value, missing); + EXT_KEY(isainfo->isa, ZKNH, pair->value, missing); + EXT_KEY(isainfo->isa, ZKSED, pair->value, missing); + EXT_KEY(isainfo->isa, ZKSH, pair->value, missing); + EXT_KEY(isainfo->isa, ZKT, pair->value, missing); + EXT_KEY(isainfo->isa, ZTSO, pair->value, missing); if (has_vector()) { - EXT_KEY(ZVBB); - EXT_KEY(ZVBC); - EXT_KEY(ZVE32F); - EXT_KEY(ZVE32X); - EXT_KEY(ZVE64D); - EXT_KEY(ZVE64F); - EXT_KEY(ZVE64X); - EXT_KEY(ZVFBFMIN); - EXT_KEY(ZVFBFWMA); - EXT_KEY(ZVFH); - EXT_KEY(ZVFHMIN); - EXT_KEY(ZVKB); - EXT_KEY(ZVKG); - EXT_KEY(ZVKNED); - EXT_KEY(ZVKNHA); - EXT_KEY(ZVKNHB); - EXT_KEY(ZVKSED); - EXT_KEY(ZVKSH); - EXT_KEY(ZVKT); + EXT_KEY(isainfo->isa, ZVBB, pair->value, missing); + EXT_KEY(isainfo->isa, ZVBC, pair->value, missing); + EXT_KEY(isainfo->isa, ZVE32F, pair->value, missing); + EXT_KEY(isainfo->isa, ZVE32X, pair->value, missing); + EXT_KEY(isainfo->isa, ZVE64D, pair->value, missing); + EXT_KEY(isainfo->isa, ZVE64F, pair->value, missing); + EXT_KEY(isainfo->isa, ZVE64X, pair->value, missing); + EXT_KEY(isainfo->isa, ZVFBFMIN, pair->value, missing); + EXT_KEY(isainfo->isa, ZVFBFWMA, pair->value, missing); + EXT_KEY(isainfo->isa, ZVFH, pair->value, missing); + EXT_KEY(isainfo->isa, ZVFHMIN, pair->value, missing); + EXT_KEY(isainfo->isa, ZVKB, pair->value, missing); + EXT_KEY(isainfo->isa, ZVKG, pair->value, missing); + EXT_KEY(isainfo->isa, ZVKNED, pair->value, missing); + EXT_KEY(isainfo->isa, ZVKNHA, pair->value, missing); + EXT_KEY(isainfo->isa, ZVKNHB, pair->value, missing); + EXT_KEY(isainfo->isa, ZVKSED, pair->value, missing); + EXT_KEY(isainfo->isa, ZVKSH, pair->value, missing); + EXT_KEY(isainfo->isa, ZVKT, pair->value, missing); } - if (has_fpu()) { - EXT_KEY(ZCD); - EXT_KEY(ZCF); - EXT_KEY(ZFA); - EXT_KEY(ZFBFMIN); - EXT_KEY(ZFH); - EXT_KEY(ZFHMIN); - } + EXT_KEY(isainfo->isa, ZCD, pair->value, missing); + EXT_KEY(isainfo->isa, ZCF, pair->value, missing); + EXT_KEY(isainfo->isa, ZFA, pair->value, missing); + EXT_KEY(isainfo->isa, ZFBFMIN, pair->value, missing); + EXT_KEY(isainfo->isa, ZFH, pair->value, missing); + EXT_KEY(isainfo->isa, ZFHMIN, pair->value, missing); if (IS_ENABLED(CONFIG_RISCV_ISA_SUPM)) - EXT_KEY(SUPM); -#undef EXT_KEY + EXT_KEY(isainfo->isa, SUPM, pair->value, missing); + } + + /* Now turn off reporting features if any CPU is missing it. */ + pair->value &= ~missing; +} + +static void hwprobe_isa_ext1(struct riscv_hwprobe *pair, + const struct cpumask *cpus) +{ + int cpu; + u64 missing = 0; + + pair->value = 0; + + /* + * Loop through and record extensions that 1) anyone has, and 2) anyone + * doesn't have. + */ + for_each_cpu(cpu, cpus) { + struct riscv_isainfo *isainfo = &hart_isa[cpu]; + + /* + * Only use EXT_KEY() for extensions which can be + * exposed to userspace, regardless of the kernel's + * configuration, as no other checks, besides presence + * in the hart_isa bitmap, are made. + */ + EXT_KEY(isainfo->isa, ZICFISS, pair->value, missing); } /* Now turn off reporting features if any CPU is missing it. */ @@ -274,6 +302,10 @@ static void hwprobe_one_pair(struct riscv_hwprobe *pair, hwprobe_isa_ext0(pair, cpus); break; + case RISCV_HWPROBE_KEY_IMA_EXT_1: + hwprobe_isa_ext1(pair, cpus); + break; + case RISCV_HWPROBE_KEY_CPUPERF_0: case RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF: pair->value = hwprobe_misaligned(cpus); diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c index f1c1416a9f1e5..061dab55d491b 100644 --- a/arch/riscv/kernel/sys_riscv.c +++ b/arch/riscv/kernel/sys_riscv.c @@ -17,6 +17,15 @@ static long riscv_sys_mmap(unsigned long addr, unsigned long len, if (unlikely(offset & (~PAGE_MASK >> page_shift_offset))) return -EINVAL; + /* + * If PROT_WRITE is specified then extend that to PROT_READ + * protection_map[VM_WRITE] is now going to select shadow stack encodings. + * So specifying PROT_WRITE actually should select protection_map [VM_WRITE | VM_READ] + * If user wants to create shadow stack then they should use `map_shadow_stack` syscall. + */ + if (unlikely((prot & PROT_WRITE) && !(prot & PROT_READ))) + prot |= PROT_READ; + return ksys_mmap_pgoff(addr, len, prot, flags, fd, offset >> (PAGE_SHIFT - page_shift_offset)); } diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 52eaef389a4b9..e4c6569d1e79d 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -336,6 +336,60 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs) } +#define CFI_TVAL_FCFI_CODE 2 +#define CFI_TVAL_BCFI_CODE 3 +/* handle cfi violations */ +bool handle_user_cfi_violation(struct pt_regs *regs) +{ + unsigned long tval = csr_read(CSR_TVAL); + bool is_fcfi = (tval == CFI_TVAL_FCFI_CODE && cpu_supports_indirect_br_lp_instr()); + bool is_bcfi = (tval == CFI_TVAL_BCFI_CODE && cpu_supports_shadow_stack()); + + /* + * Handle uprobe event first. The probe point can be a valid target + * of indirect jumps or calls, in this case, forward cfi violation + * will be triggered instead of breakpoint exception. Clear ELP flag + * on sstatus image as well to avoid recurring fault. + */ + if (is_fcfi && probe_breakpoint_handler(regs)) { + regs->status &= ~SR_ELP; + return true; + } + + if (is_fcfi || is_bcfi) { + do_trap_error(regs, SIGSEGV, SEGV_CPERR, regs->epc, + "Oops - control flow violation"); + return true; + } + + return false; +} + +/* + * software check exception is defined with risc-v cfi spec. Software check + * exception is raised when: + * a) An indirect branch doesn't land on 4 byte aligned PC or `lpad` + * instruction or `label` value programmed in `lpad` instr doesn't + * match with value setup in `x7`. reported code in `xtval` is 2. + * b) `sspopchk` instruction finds a mismatch between top of shadow stack (ssp) + * and x1/x5. reported code in `xtval` is 3. + */ +asmlinkage __visible __trap_section void do_trap_software_check(struct pt_regs *regs) +{ + if (user_mode(regs)) { + irqentry_enter_from_user_mode(regs); + + /* not a cfi violation, then merge into flow of unknown trap handler */ + if (!handle_user_cfi_violation(regs)) + do_trap_unknown(regs); + + irqentry_exit_to_user_mode(regs); + } else { + /* sw check exception coming from kernel is a bug in kernel */ + die(regs, "Kernel BUG"); + } +} + #ifdef CONFIG_MMU asmlinkage __visible noinstr void do_page_fault(struct pt_regs *regs) { diff --git a/arch/riscv/kernel/usercfi.c b/arch/riscv/kernel/usercfi.c new file mode 100644 index 0000000000000..2c535737511dc --- /dev/null +++ b/arch/riscv/kernel/usercfi.c @@ -0,0 +1,541 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024 Rivos, Inc. + * Deepak Gupta + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +unsigned long riscv_nousercfi __read_mostly; + +#define SHSTK_ENTRY_SIZE sizeof(void *) + +bool is_shstk_enabled(struct task_struct *task) +{ + return task->thread_info.user_cfi_state.ubcfi_en; +} + +bool is_shstk_allocated(struct task_struct *task) +{ + return task->thread_info.user_cfi_state.shdw_stk_base; +} + +bool is_shstk_locked(struct task_struct *task) +{ + return task->thread_info.user_cfi_state.ubcfi_locked; +} + +void set_shstk_base(struct task_struct *task, unsigned long shstk_addr, unsigned long size) +{ + task->thread_info.user_cfi_state.shdw_stk_base = shstk_addr; + task->thread_info.user_cfi_state.shdw_stk_size = size; +} + +unsigned long get_shstk_base(struct task_struct *task, unsigned long *size) +{ + if (size) + *size = task->thread_info.user_cfi_state.shdw_stk_size; + return task->thread_info.user_cfi_state.shdw_stk_base; +} + +void set_active_shstk(struct task_struct *task, unsigned long shstk_addr) +{ + task->thread_info.user_cfi_state.user_shdw_stk = shstk_addr; +} + +unsigned long get_active_shstk(struct task_struct *task) +{ + return task->thread_info.user_cfi_state.user_shdw_stk; +} + +void set_shstk_status(struct task_struct *task, bool enable) +{ + if (!is_user_shstk_enabled()) + return; + + task->thread_info.user_cfi_state.ubcfi_en = enable ? 1 : 0; + + if (enable) + task->thread.envcfg |= ENVCFG_SSE; + else + task->thread.envcfg &= ~ENVCFG_SSE; + + csr_write(CSR_ENVCFG, task->thread.envcfg); +} + +void set_shstk_lock(struct task_struct *task, bool lock) +{ + task->thread_info.user_cfi_state.ubcfi_locked = lock; +} + +bool is_indir_lp_enabled(struct task_struct *task) +{ + return task->thread_info.user_cfi_state.ufcfi_en; +} + +bool is_indir_lp_locked(struct task_struct *task) +{ + return task->thread_info.user_cfi_state.ufcfi_locked; +} + +void set_indir_lp_status(struct task_struct *task, bool enable) +{ + if (!is_user_lpad_enabled()) + return; + + task->thread_info.user_cfi_state.ufcfi_en = enable ? 1 : 0; + + if (enable) + task->thread.envcfg |= ENVCFG_LPE; + else + task->thread.envcfg &= ~ENVCFG_LPE; + + csr_write(CSR_ENVCFG, task->thread.envcfg); +} + +void set_indir_lp_lock(struct task_struct *task, bool lock) +{ + task->thread_info.user_cfi_state.ufcfi_locked = lock; +} +/* + * If size is 0, then to be compatible with regular stack we want it to be as big as + * regular stack. Else PAGE_ALIGN it and return back + */ +static unsigned long calc_shstk_size(unsigned long size) +{ + if (size) + return PAGE_ALIGN(size); + + return PAGE_ALIGN(min_t(unsigned long long, rlimit(RLIMIT_STACK), SZ_4G)); +} + +/* + * Writes on shadow stack can either be `sspush` or `ssamoswap`. `sspush` can happen + * implicitly on current shadow stack pointed to by CSR_SSP. `ssamoswap` takes pointer to + * shadow stack. To keep it simple, we plan to use `ssamoswap` to perform writes on shadow + * stack. + */ +static noinline unsigned long amo_user_shstk(unsigned long __user *addr, unsigned long val) +{ + /* + * Never expect -1 on shadow stack. Expect return addresses and zero + */ + unsigned long swap = -1; + + __enable_user_access(); + asm goto(".option push\n" + ".option arch, +zicfiss\n" + "1: ssamoswap.d %[swap], %[val], %[addr]\n" + _ASM_EXTABLE(1b, %l[fault]) + ".option pop\n" + : [swap] "=r" (swap), [addr] "+A" (*(__force unsigned long *)addr) + : [val] "r" (val) + : "memory" + : fault + ); + __disable_user_access(); + return swap; +fault: + __disable_user_access(); + return -1; +} + +/* + * Create a restore token on the shadow stack. A token is always XLEN wide + * and aligned to XLEN. + */ +static int create_rstor_token(unsigned long ssp, unsigned long *token_addr) +{ + unsigned long addr; + + /* Token must be aligned */ + if (!IS_ALIGNED(ssp, SHSTK_ENTRY_SIZE)) + return -EINVAL; + + /* On RISC-V we're constructing token to be function of address itself */ + addr = ssp - SHSTK_ENTRY_SIZE; + + if (amo_user_shstk((unsigned long __user *)addr, (unsigned long)ssp) == -1) + return -EFAULT; + + if (token_addr) + *token_addr = addr; + + return 0; +} + +/* + * Save user shadow stack pointer on the shadow stack itself and return a pointer to saved location. + * Returns -EFAULT if unsuccessful. + */ +int save_user_shstk(struct task_struct *tsk, unsigned long *saved_shstk_ptr) +{ + unsigned long ss_ptr = 0; + unsigned long token_loc = 0; + int ret = 0; + + if (!saved_shstk_ptr) + return -EINVAL; + + ss_ptr = get_active_shstk(tsk); + ret = create_rstor_token(ss_ptr, &token_loc); + + if (!ret) { + *saved_shstk_ptr = token_loc; + set_active_shstk(tsk, token_loc); + } + + return ret; +} + +/* + * Restores the user shadow stack pointer from the token on the shadow stack for task 'tsk'. + * Returns -EFAULT if unsuccessful. + */ +int restore_user_shstk(struct task_struct *tsk, unsigned long shstk_ptr) +{ + unsigned long token = 0; + + token = amo_user_shstk((unsigned long __user *)shstk_ptr, 0); + + if (token == -1) + return -EFAULT; + + /* invalid token, return EINVAL */ + if ((token - shstk_ptr) != SHSTK_ENTRY_SIZE) { + pr_info_ratelimited("%s[%d]: bad restore token in %s: pc=%p sp=%p, token=%p, shstk_ptr=%p\n", + tsk->comm, task_pid_nr(tsk), __func__, + (void *)(task_pt_regs(tsk)->epc), + (void *)(task_pt_regs(tsk)->sp), + (void *)token, (void *)shstk_ptr); + return -EINVAL; + } + + /* all checks passed, set active shstk and return success */ + set_active_shstk(tsk, token); + return 0; +} + +static unsigned long allocate_shadow_stack(unsigned long addr, unsigned long size, + unsigned long token_offset, bool set_tok) +{ + int flags = MAP_ANONYMOUS | MAP_PRIVATE; + struct mm_struct *mm = current->mm; + unsigned long populate; + + if (addr) + flags |= MAP_FIXED_NOREPLACE; + + mmap_write_lock(mm); + addr = do_mmap(NULL, addr, size, PROT_READ, flags, + VM_SHADOW_STACK | VM_WRITE, 0, &populate, NULL); + mmap_write_unlock(mm); + + if (!set_tok || IS_ERR_VALUE(addr)) + goto out; + + if (create_rstor_token(addr + token_offset, NULL)) { + vm_munmap(addr, size); + return -EINVAL; + } + +out: + return addr; +} + +SYSCALL_DEFINE3(map_shadow_stack, unsigned long, addr, unsigned long, size, unsigned int, flags) +{ + bool set_tok = flags & SHADOW_STACK_SET_TOKEN; + unsigned long aligned_size = 0; + + if (!is_user_shstk_enabled()) + return -EOPNOTSUPP; + + /* Anything other than set token should result in invalid param */ + if (flags & ~SHADOW_STACK_SET_TOKEN) + return -EINVAL; + + /* + * Unlike other architectures, on RISC-V, SSP pointer is held in CSR_SSP and is an available + * CSR in all modes. CSR accesses are performed using 12bit index programmed in instruction + * itself. This provides static property on register programming and writes to CSR can't + * be unintentional from programmer's perspective. As long as programmer has guarded areas + * which perform writes to CSR_SSP properly, shadow stack pivoting is not possible. Since + * CSR_SSP is writable by user mode, it itself can setup a shadow stack token subsequent + * to allocation. Although in order to provide portablity with other architectures (because + * `map_shadow_stack` is arch agnostic syscall), RISC-V will follow expectation of a token + * flag in flags and if provided in flags, will setup a token at the base. + */ + + /* If there isn't space for a token */ + if (set_tok && size < SHSTK_ENTRY_SIZE) + return -ENOSPC; + + if (addr && (addr & (PAGE_SIZE - 1))) + return -EINVAL; + + aligned_size = PAGE_ALIGN(size); + if (aligned_size < size) + return -EOVERFLOW; + + return allocate_shadow_stack(addr, aligned_size, size, set_tok); +} + +/* + * This gets called during clone/clone3/fork. And is needed to allocate a shadow stack for + * cases where CLONE_VM is specified and thus a different stack is specified by user. We + * thus need a separate shadow stack too. How a separate shadow stack is specified by + * user is still being debated. Once that's settled, remove this part of the comment. + * This function simply returns 0 if shadow stacks are not supported or if separate shadow + * stack allocation is not needed (like in case of !CLONE_VM) + */ +unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, + const struct kernel_clone_args *args) +{ + unsigned long addr, size; + + /* If shadow stack is not supported, return 0 */ + if (!is_user_shstk_enabled()) + return 0; + + /* + * If shadow stack is not enabled on the new thread, skip any + * switch to a new shadow stack. + */ + if (!is_shstk_enabled(tsk)) + return 0; + + /* + * For CLONE_VFORK the child will share the parents shadow stack. + * Set base = 0 and size = 0, this is special means to track this state + * so the freeing logic run for child knows to leave it alone. + */ + if (args->flags & CLONE_VFORK) { + set_shstk_base(tsk, 0, 0); + return 0; + } + + /* + * For !CLONE_VM the child will use a copy of the parents shadow + * stack. + */ + if (!(args->flags & CLONE_VM)) + return 0; + + /* + * reaching here means, CLONE_VM was specified and thus a separate shadow + * stack is needed for new cloned thread. Note: below allocation is happening + * using current mm. + */ + size = calc_shstk_size(args->stack_size); + addr = allocate_shadow_stack(0, size, 0, false); + if (IS_ERR_VALUE(addr)) + return addr; + + set_shstk_base(tsk, addr, size); + + return addr + size; +} + +void shstk_release(struct task_struct *tsk) +{ + unsigned long base = 0, size = 0; + /* If shadow stack is not supported or not enabled, nothing to release */ + if (!is_user_shstk_enabled() || !is_shstk_enabled(tsk)) + return; + + /* + * When fork() with CLONE_VM fails, the child (tsk) already has a + * shadow stack allocated, and exit_thread() calls this function to + * free it. In this case the parent (current) and the child share + * the same mm struct. Move forward only when they're same. + */ + if (!tsk->mm || tsk->mm != current->mm) + return; + + /* + * We know shadow stack is enabled but if base is NULL, then + * this task is not managing its own shadow stack (CLONE_VFORK). So + * skip freeing it. + */ + base = get_shstk_base(tsk, &size); + if (!base) + return; + + vm_munmap(base, size); + set_shstk_base(tsk, 0, 0); +} + +int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status) +{ + unsigned long bcfi_status = 0; + + if (!is_user_shstk_enabled()) + return -EINVAL; + + /* this means shadow stack is enabled on the task */ + bcfi_status |= (is_shstk_enabled(t) ? PR_SHADOW_STACK_ENABLE : 0); + + return copy_to_user(status, &bcfi_status, sizeof(bcfi_status)) ? -EFAULT : 0; +} + +int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status) +{ + unsigned long size = 0, addr = 0; + bool enable_shstk = false; + + if (!is_user_shstk_enabled()) + return -EINVAL; + + /* Reject unknown flags */ + if (status & ~PR_SHADOW_STACK_SUPPORTED_STATUS_MASK) + return -EINVAL; + + /* bcfi status is locked and further can't be modified by user */ + if (is_shstk_locked(t)) + return -EINVAL; + + enable_shstk = status & PR_SHADOW_STACK_ENABLE; + /* Request is to enable shadow stack and shadow stack is not enabled already */ + if (enable_shstk && !is_shstk_enabled(t)) { + /* shadow stack was allocated and enable request again + * no need to support such usecase and return EINVAL. + */ + if (is_shstk_allocated(t)) + return -EINVAL; + + size = calc_shstk_size(0); + addr = allocate_shadow_stack(0, size, 0, false); + if (IS_ERR_VALUE(addr)) + return -ENOMEM; + set_shstk_base(t, addr, size); + set_active_shstk(t, addr + size); + } + + /* + * If a request to disable shadow stack happens, let's go ahead and release it + * Although, if CLONE_VFORKed child did this, then in that case we will end up + * not releasing the shadow stack (because it might be needed in parent). Although + * we will disable it for VFORKed child. And if VFORKed child tries to enable again + * then in that case, it'll get entirely new shadow stack because following condition + * are true + * - shadow stack was not enabled for vforked child + * - shadow stack base was anyways pointing to 0 + * This shouldn't be a big issue because we want parent to have availability of shadow + * stack whenever VFORKed child releases resources via exit or exec but at the same + * time we want VFORKed child to break away and establish new shadow stack if it desires + * + */ + if (!enable_shstk) + shstk_release(t); + + set_shstk_status(t, enable_shstk); + return 0; +} + +int arch_lock_shadow_stack_status(struct task_struct *task, + unsigned long arg) +{ + /* If shtstk not supported or not enabled on task, nothing to lock here */ + if (!is_user_shstk_enabled() || + !is_shstk_enabled(task) || arg != 0) + return -EINVAL; + + set_shstk_lock(task, true); + + return 0; +} + +int arch_prctl_get_branch_landing_pad_state(struct task_struct *t, + unsigned long __user *state) +{ + unsigned long fcfi_status = 0; + + if (!is_user_lpad_enabled()) + return -EINVAL; + + fcfi_status = (is_indir_lp_enabled(t) ? PR_CFI_ENABLE : PR_CFI_DISABLE); + fcfi_status |= (is_indir_lp_locked(t) ? PR_CFI_LOCK : 0); + + return copy_to_user(state, &fcfi_status, sizeof(fcfi_status)) ? -EFAULT : 0; +} + +int arch_prctl_set_branch_landing_pad_state(struct task_struct *t, unsigned long state) +{ + if (!is_user_lpad_enabled()) + return -EINVAL; + + /* indirect branch tracking is locked and further can't be modified by user */ + if (is_indir_lp_locked(t)) + return -EINVAL; + + if (!(state & (PR_CFI_ENABLE | PR_CFI_DISABLE))) + return -EINVAL; + + if (state & PR_CFI_ENABLE && state & PR_CFI_DISABLE) + return -EINVAL; + + set_indir_lp_status(t, !!(state & PR_CFI_ENABLE)); + + return 0; +} + +int arch_prctl_lock_branch_landing_pad_state(struct task_struct *task) +{ + /* + * If indirect branch tracking is not supported or not enabled on task, + * nothing to lock here + */ + if (!is_user_lpad_enabled() || + !is_indir_lp_enabled(task)) + return -EINVAL; + + set_indir_lp_lock(task, true); + + return 0; +} + +bool is_user_shstk_enabled(void) +{ + return (cpu_supports_shadow_stack() && + !(riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_BCFI)); +} + +bool is_user_lpad_enabled(void) +{ + return (cpu_supports_indirect_br_lp_instr() && + !(riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_FCFI)); +} + +static int __init setup_global_riscv_enable(char *str) +{ + if (strcmp(str, "all") == 0) + riscv_nousercfi = CMDLINE_DISABLE_RISCV_USERCFI; + + if (strcmp(str, "fcfi") == 0) + riscv_nousercfi |= CMDLINE_DISABLE_RISCV_USERCFI_FCFI; + + if (strcmp(str, "bcfi") == 0) + riscv_nousercfi |= CMDLINE_DISABLE_RISCV_USERCFI_BCFI; + + if (riscv_nousercfi) + pr_info("RISC-V user CFI disabled via cmdline - shadow stack status : %s, landing pad status : %s\n", + (riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_BCFI) ? "disabled" : + "enabled", (riscv_nousercfi & CMDLINE_DISABLE_RISCV_USERCFI_FCFI) ? + "disabled" : "enabled"); + + return 1; +} + +__setup("riscv_nousercfi=", setup_global_riscv_enable); diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c index 2cf76218a5bd0..e211ddcc02b9c 100644 --- a/arch/riscv/kernel/vdso.c +++ b/arch/riscv/kernel/vdso.c @@ -203,6 +203,13 @@ static struct __vdso_info compat_vdso_info __ro_after_init = { static int __init vdso_init(void) { + /* Hart implements zimop, expose cfi compiled vdso */ + if (IS_ENABLED(CONFIG_RISCV_USER_CFI) && + riscv_has_extension_unlikely(RISCV_ISA_EXT_ZIMOP)) { + vdso_info.vdso_code_start = vdso_cfi_start; + vdso_info.vdso_code_end = vdso_cfi_end; + } + __vdso_init(&vdso_info); #ifdef CONFIG_COMPAT __vdso_init(&compat_vdso_info); diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile index d58f32a2035b1..2dc8f13628c77 100644 --- a/arch/riscv/kernel/vdso/Makefile +++ b/arch/riscv/kernel/vdso/Makefile @@ -13,31 +13,65 @@ vdso-syms += flush_icache vdso-syms += hwprobe vdso-syms += sys_hwprobe +ifdef CONFIG_VDSO_GETRANDOM +vdso-syms += getrandom +endif + +ifdef VDSO_CFI_BUILD +CFI_MARCH = _zicfilp_zicfiss +CFI_FULL = -fcf-protection=full +CFI_SUFFIX = -cfi +OFFSET_SUFFIX = _cfi +ccflags-y += -DVDSO_CFI=1 +asflags-y += -DVDSO_CFI=1 +endif + # Files to link into the vdso obj-vdso = $(patsubst %, %.o, $(vdso-syms)) note.o +ifdef CONFIG_VDSO_GETRANDOM +obj-vdso += vgetrandom-chacha.o +endif + ccflags-y := -fno-stack-protector ccflags-y += -DDISABLE_BRANCH_PROFILING ccflags-y += -fno-builtin +ccflags-y += $(KBUILD_BASE_ISA)$(CFI_MARCH) +ccflags-y += $(CFI_FULL) +asflags-y += $(KBUILD_BASE_ISA)$(CFI_MARCH) +asflags-y += $(CFI_FULL) ifneq ($(c-gettimeofday-y),) CFLAGS_vgettimeofday.o += -fPIC -include $(c-gettimeofday-y) endif +ifneq ($(c-getrandom-y),) + CFLAGS_getrandom.o += -fPIC -include $(c-getrandom-y) +endif + CFLAGS_hwprobe.o += -fPIC # Build rules -targets := $(obj-vdso) vdso.so vdso.so.dbg vdso.lds +vdso_offsets := vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),)-offsets.h +vdso_o := vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),).o +vdso_so := vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),).so +vdso_so_dbg := vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),).so.dbg +vdso_lds := vdso.lds + +targets := $(obj-vdso) $(vdso_so) $(vdso_so_dbg) $(vdso_lds) + obj-vdso := $(addprefix $(obj)/, $(obj-vdso)) -obj-y += vdso.o -CPPFLAGS_vdso.lds += -P -C -U$(ARCH) +obj-y += vdso$(if $(VDSO_CFI_BUILD),$(CFI_SUFFIX),).o +CPPFLAGS_$(vdso_lds) += -P -C -U$(ARCH) ifneq ($(filter vgettimeofday, $(vdso-syms)),) -CPPFLAGS_vdso.lds += -DHAS_VGETTIMEOFDAY +CPPFLAGS_$(vdso_lds) += -DHAS_VGETTIMEOFDAY endif # Disable -pg to prevent insert call site -CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) +CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) +CFLAGS_REMOVE_getrandom.o = $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) +CFLAGS_REMOVE_hwprobe.o = $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) # Disable profiling and instrumentation for VDSO code GCOV_PROFILE := n @@ -46,12 +80,12 @@ KASAN_SANITIZE := n UBSAN_SANITIZE := n # Force dependency -$(obj)/vdso.o: $(obj)/vdso.so +$(obj)/$(vdso_o): $(obj)/$(vdso_so) # link rule for the .so file, .lds has to be first -$(obj)/vdso.so.dbg: $(obj)/vdso.lds $(obj-vdso) FORCE - $(call if_changed,vdsold) -LDFLAGS_vdso.so.dbg = -shared -S -soname=linux-vdso.so.1 \ +$(obj)/$(vdso_so_dbg): $(obj)/$(vdso_lds) $(obj-vdso) FORCE + $(call if_changed,vdsold_and_check) +LDFLAGS_$(vdso_so_dbg) = -shared -soname=linux-vdso.so.1 \ --build-id=sha1 --hash-style=both --eh-frame-hdr # strip rule for the .so file @@ -62,15 +96,16 @@ $(obj)/%.so: $(obj)/%.so.dbg FORCE # Generate VDSO offsets using helper script gen-vdsosym := $(srctree)/$(src)/gen_vdso_offsets.sh quiet_cmd_vdsosym = VDSOSYM $@ - cmd_vdsosym = $(NM) $< | $(gen-vdsosym) | LC_ALL=C sort > $@ + cmd_vdsosym = $(NM) $< | $(gen-vdsosym) $(OFFSET_SUFFIX) | LC_ALL=C sort > $@ -include/generated/vdso-offsets.h: $(obj)/vdso.so.dbg FORCE +include/generated/$(vdso_offsets): $(obj)/$(vdso_so_dbg) FORCE $(call if_changed,vdsosym) # actual build commands # The DSO images are built using a special linker script # Make sure only to export the intended __vdso_xxx symbol offsets. -quiet_cmd_vdsold = VDSOLD $@ - cmd_vdsold = $(LD) $(ld_flags) -T $(filter-out FORCE,$^) -o $@.tmp && \ +quiet_cmd_vdsold_and_check = VDSOLD $@ + cmd_vdsold_and_check = $(LD) $(CFI_FULL) $(ld_flags) -T $(filter-out FORCE,$^) -o $@.tmp && \ $(OBJCOPY) $(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@ && \ - rm $@.tmp + rm $@.tmp && \ + $(cmd_vdso_check) diff --git a/arch/riscv/kernel/vdso/flush_icache.S b/arch/riscv/kernel/vdso/flush_icache.S index 8f884227e8bca..e4c56970905e9 100644 --- a/arch/riscv/kernel/vdso/flush_icache.S +++ b/arch/riscv/kernel/vdso/flush_icache.S @@ -5,11 +5,13 @@ #include #include +#include .text /* int __vdso_flush_icache(void *start, void *end, unsigned long flags); */ SYM_FUNC_START(__vdso_flush_icache) .cfi_startproc + vdso_lpad #ifdef CONFIG_SMP li a7, __NR_riscv_flush_icache ecall @@ -20,3 +22,5 @@ SYM_FUNC_START(__vdso_flush_icache) ret .cfi_endproc SYM_FUNC_END(__vdso_flush_icache) + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso/gen_vdso_offsets.sh b/arch/riscv/kernel/vdso/gen_vdso_offsets.sh index c2e5613f34951..bd5d5afaaa147 100755 --- a/arch/riscv/kernel/vdso/gen_vdso_offsets.sh +++ b/arch/riscv/kernel/vdso/gen_vdso_offsets.sh @@ -2,4 +2,6 @@ # SPDX-License-Identifier: GPL-2.0 LC_ALL=C -sed -n -e 's/^[0]\+\(0[0-9a-fA-F]*\) . \(__vdso_[a-zA-Z0-9_]*\)$/\#define \2_offset\t0x\1/p' +SUFFIX=${1:-""} +sed -n -e \ +'s/^[0]\+\(0[0-9a-fA-F]*\) . \(__vdso_[a-zA-Z0-9_]*\)$/\#define \2'$SUFFIX'_offset\t0x\1/p' diff --git a/arch/riscv/kernel/vdso/getcpu.S b/arch/riscv/kernel/vdso/getcpu.S index 9c1bd531907f2..5c1ecc4e14654 100644 --- a/arch/riscv/kernel/vdso/getcpu.S +++ b/arch/riscv/kernel/vdso/getcpu.S @@ -5,14 +5,18 @@ #include #include +#include .text /* int __vdso_getcpu(unsigned *cpu, unsigned *node, void *unused); */ SYM_FUNC_START(__vdso_getcpu) .cfi_startproc + vdso_lpad /* For now, just do the syscall. */ li a7, __NR_getcpu ecall ret .cfi_endproc SYM_FUNC_END(__vdso_getcpu) + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso/getrandom.c b/arch/riscv/kernel/vdso/getrandom.c new file mode 100644 index 0000000000000..f21922e8cebd3 --- /dev/null +++ b/arch/riscv/kernel/vdso/getrandom.c @@ -0,0 +1,10 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2025 Xi Ruoyao . All Rights Reserved. + */ +#include + +ssize_t __vdso_getrandom(void *buffer, size_t len, unsigned int flags, void *opaque_state, size_t opaque_len) +{ + return __cvdso_getrandom(buffer, len, flags, opaque_state, opaque_len); +} diff --git a/arch/riscv/kernel/vdso/note.S b/arch/riscv/kernel/vdso/note.S index 2a956c9422111..3d92cc956b95b 100644 --- a/arch/riscv/kernel/vdso/note.S +++ b/arch/riscv/kernel/vdso/note.S @@ -6,7 +6,10 @@ #include #include +#include ELFNOTE_START(Linux, 0, "a") .long LINUX_VERSION_CODE ELFNOTE_END + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso/rt_sigreturn.S b/arch/riscv/kernel/vdso/rt_sigreturn.S index 3dc022aa8931a..e82987dc37394 100644 --- a/arch/riscv/kernel/vdso/rt_sigreturn.S +++ b/arch/riscv/kernel/vdso/rt_sigreturn.S @@ -5,12 +5,16 @@ #include #include +#include .text SYM_FUNC_START(__vdso_rt_sigreturn) .cfi_startproc .cfi_signal_frame + vdso_lpad li a7, __NR_rt_sigreturn ecall .cfi_endproc SYM_FUNC_END(__vdso_rt_sigreturn) + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso/sys_hwprobe.S b/arch/riscv/kernel/vdso/sys_hwprobe.S index 77e57f8305216..f1694451a60c0 100644 --- a/arch/riscv/kernel/vdso/sys_hwprobe.S +++ b/arch/riscv/kernel/vdso/sys_hwprobe.S @@ -3,13 +3,17 @@ #include #include +#include .text SYM_FUNC_START(riscv_hwprobe) .cfi_startproc + vdso_lpad li a7, __NR_riscv_hwprobe ecall ret .cfi_endproc SYM_FUNC_END(riscv_hwprobe) + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso/vdso.lds.S b/arch/riscv/kernel/vdso/vdso.lds.S index 82ce64900f3d7..4fe895347abdb 100644 --- a/arch/riscv/kernel/vdso/vdso.lds.S +++ b/arch/riscv/kernel/vdso/vdso.lds.S @@ -84,6 +84,9 @@ VERSION __vdso_flush_icache; #ifndef COMPAT_VDSO __vdso_riscv_hwprobe; +#endif +#if defined(CONFIG_VDSO_GETRANDOM) && !defined(COMPAT_VDSO) + __vdso_getrandom; #endif local: *; }; diff --git a/arch/riscv/kernel/vdso/vgetrandom-chacha.S b/arch/riscv/kernel/vdso/vgetrandom-chacha.S new file mode 100644 index 0000000000000..916ab30a88f77 --- /dev/null +++ b/arch/riscv/kernel/vdso/vgetrandom-chacha.S @@ -0,0 +1,252 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2025 Xi Ruoyao . All Rights Reserved. + * + * Based on arch/loongarch/vdso/vgetrandom-chacha.S. + */ + +#include +#include +#include + +.text + +.macro ROTRI rd rs imm + slliw t0, \rs, 32 - \imm + srliw \rd, \rs, \imm + or \rd, \rd, t0 +.endm + +.macro OP_4REG op d0 d1 d2 d3 s0 s1 s2 s3 + \op \d0, \d0, \s0 + \op \d1, \d1, \s1 + \op \d2, \d2, \s2 + \op \d3, \d3, \s3 +.endm + +/* + * a0: output bytes + * a1: 32-byte key input + * a2: 8-byte counter input/output + * a3: number of 64-byte blocks to write to output + */ +SYM_FUNC_START(__arch_chacha20_blocks_nostack) + +#define output a0 +#define key a1 +#define counter a2 +#define nblocks a3 +#define i a4 +#define state0 s0 +#define state1 s1 +#define state2 s2 +#define state3 s3 +#define state4 s4 +#define state5 s5 +#define state6 s6 +#define state7 s7 +#define state8 s8 +#define state9 s9 +#define state10 s10 +#define state11 s11 +#define state12 a5 +#define state13 a6 +#define state14 a7 +#define state15 t1 +#define cnt t2 +#define copy0 t3 +#define copy1 t4 +#define copy2 t5 +#define copy3 t6 + +/* Packs to be used with OP_4REG */ +#define line0 state0, state1, state2, state3 +#define line1 state4, state5, state6, state7 +#define line2 state8, state9, state10, state11 +#define line3 state12, state13, state14, state15 + +#define line1_perm state5, state6, state7, state4 +#define line2_perm state10, state11, state8, state9 +#define line3_perm state15, state12, state13, state14 + +#define copy copy0, copy1, copy2, copy3 + +#define _16 16, 16, 16, 16 +#define _20 20, 20, 20, 20 +#define _24 24, 24, 24, 24 +#define _25 25, 25, 25, 25 + vdso_lpad + /* + * The ABI requires s0-s9 saved. + * This does not violate the stack-less requirement: no sensitive data + * is spilled onto the stack. + */ + addi sp, sp, -12*SZREG + REG_S s0, (sp) + REG_S s1, SZREG(sp) + REG_S s2, 2*SZREG(sp) + REG_S s3, 3*SZREG(sp) + REG_S s4, 4*SZREG(sp) + REG_S s5, 5*SZREG(sp) + REG_S s6, 6*SZREG(sp) + REG_S s7, 7*SZREG(sp) + REG_S s8, 8*SZREG(sp) + REG_S s9, 9*SZREG(sp) + REG_S s10, 10*SZREG(sp) + REG_S s11, 11*SZREG(sp) + + ld cnt, (counter) + + li copy0, 0x61707865 + li copy1, 0x3320646e + li copy2, 0x79622d32 + li copy3, 0x6b206574 + +.Lblock: + /* state[0,1,2,3] = "expand 32-byte k" */ + mv state0, copy0 + mv state1, copy1 + mv state2, copy2 + mv state3, copy3 + + /* state[4,5,..,11] = key */ + lw state4, (key) + lw state5, 4(key) + lw state6, 8(key) + lw state7, 12(key) + lw state8, 16(key) + lw state9, 20(key) + lw state10, 24(key) + lw state11, 28(key) + + /* state[12,13] = counter */ + mv state12, cnt + srli state13, cnt, 32 + + /* state[14,15] = 0 */ + mv state14, zero + mv state15, zero + + li i, 10 +.Lpermute: + /* odd round */ + OP_4REG addw line0, line1 + OP_4REG xor line3, line0 + OP_4REG ROTRI line3, _16 + + OP_4REG addw line2, line3 + OP_4REG xor line1, line2 + OP_4REG ROTRI line1, _20 + + OP_4REG addw line0, line1 + OP_4REG xor line3, line0 + OP_4REG ROTRI line3, _24 + + OP_4REG addw line2, line3 + OP_4REG xor line1, line2 + OP_4REG ROTRI line1, _25 + + /* even round */ + OP_4REG addw line0, line1_perm + OP_4REG xor line3_perm, line0 + OP_4REG ROTRI line3_perm, _16 + + OP_4REG addw line2_perm, line3_perm + OP_4REG xor line1_perm, line2_perm + OP_4REG ROTRI line1_perm, _20 + + OP_4REG addw line0, line1_perm + OP_4REG xor line3_perm, line0 + OP_4REG ROTRI line3_perm, _24 + + OP_4REG addw line2_perm, line3_perm + OP_4REG xor line1_perm, line2_perm + OP_4REG ROTRI line1_perm, _25 + + addi i, i, -1 + bnez i, .Lpermute + + /* output[0,1,2,3] = copy[0,1,2,3] + state[0,1,2,3] */ + OP_4REG addw line0, copy + sw state0, (output) + sw state1, 4(output) + sw state2, 8(output) + sw state3, 12(output) + + /* from now on state[0,1,2,3] are scratch registers */ + + /* state[0,1,2,3] = lo(key) */ + lw state0, (key) + lw state1, 4(key) + lw state2, 8(key) + lw state3, 12(key) + + /* output[4,5,6,7] = state[0,1,2,3] + state[4,5,6,7] */ + OP_4REG addw line1, line0 + sw state4, 16(output) + sw state5, 20(output) + sw state6, 24(output) + sw state7, 28(output) + + /* state[0,1,2,3] = hi(key) */ + lw state0, 16(key) + lw state1, 20(key) + lw state2, 24(key) + lw state3, 28(key) + + /* output[8,9,10,11] = tmp[0,1,2,3] + state[8,9,10,11] */ + OP_4REG addw line2, line0 + sw state8, 32(output) + sw state9, 36(output) + sw state10, 40(output) + sw state11, 44(output) + + /* output[12,13,14,15] = state[12,13,14,15] + [cnt_lo, cnt_hi, 0, 0] */ + addw state12, state12, cnt + srli state0, cnt, 32 + addw state13, state13, state0 + sw state12, 48(output) + sw state13, 52(output) + sw state14, 56(output) + sw state15, 60(output) + + /* ++counter */ + addi cnt, cnt, 1 + + /* output += 64 */ + addi output, output, 64 + /* --nblocks */ + addi nblocks, nblocks, -1 + bnez nblocks, .Lblock + + /* counter = [cnt_lo, cnt_hi] */ + sd cnt, (counter) + + /* Zero out the potentially sensitive regs, in case nothing uses these + * again. As at now copy[0,1,2,3] just contains "expand 32-byte k" and + * state[0,...,11] are s0-s11 those we'll restore in the epilogue, we + * only need to zero state[12,...,15]. + */ + mv state12, zero + mv state13, zero + mv state14, zero + mv state15, zero + + REG_L s0, (sp) + REG_L s1, SZREG(sp) + REG_L s2, 2*SZREG(sp) + REG_L s3, 3*SZREG(sp) + REG_L s4, 4*SZREG(sp) + REG_L s5, 5*SZREG(sp) + REG_L s6, 6*SZREG(sp) + REG_L s7, 7*SZREG(sp) + REG_L s8, 8*SZREG(sp) + REG_L s9, 9*SZREG(sp) + REG_L s10, 10*SZREG(sp) + REG_L s11, 11*SZREG(sp) + addi sp, sp, 12*SZREG + + ret +SYM_FUNC_END(__arch_chacha20_blocks_nostack) + +emit_riscv_feature_1_and diff --git a/arch/riscv/kernel/vdso_cfi/Makefile b/arch/riscv/kernel/vdso_cfi/Makefile new file mode 100644 index 0000000000000..8ebd190782b08 --- /dev/null +++ b/arch/riscv/kernel/vdso_cfi/Makefile @@ -0,0 +1,25 @@ +# SPDX-License-Identifier: GPL-2.0-only +# RISC-V VDSO CFI Makefile +# This Makefile builds the VDSO with CFI support when CONFIG_RISCV_USER_CFI is enabled + +# setting VDSO_CFI_BUILD triggers build for vdso differently +VDSO_CFI_BUILD := 1 + +# Set the source directory to the main vdso directory +src := $(srctree)/arch/riscv/kernel/vdso + +# Copy all .S and .c files from vdso directory to vdso_cfi object build directory +vdso_c_sources := $(wildcard $(src)/*.c) +vdso_S_sources := $(wildcard $(src)/*.S) +vdso_c_objects := $(addprefix $(obj)/, $(notdir $(vdso_c_sources))) +vdso_S_objects := $(addprefix $(obj)/, $(notdir $(vdso_S_sources))) + +$(vdso_S_objects): $(obj)/%.S: $(src)/%.S + $(Q)cp $< $@ + +$(vdso_c_objects): $(obj)/%.c: $(src)/%.c + $(Q)cp $< $@ + +# Include the main VDSO Makefile which contains all the build rules and sources +# The VDSO_CFI_BUILD variable will be passed to it to enable CFI compilation +include $(src)/Makefile diff --git a/arch/riscv/kernel/vdso_cfi/vdso-cfi.S b/arch/riscv/kernel/vdso_cfi/vdso-cfi.S new file mode 100644 index 0000000000000..d426f6accb352 --- /dev/null +++ b/arch/riscv/kernel/vdso_cfi/vdso-cfi.S @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Copyright 2025 Rivos, Inc + */ + +#define vdso_start vdso_cfi_start +#define vdso_end vdso_cfi_end + +#define __VDSO_PATH "arch/riscv/kernel/vdso_cfi/vdso-cfi.so" + +#include "../vdso/vdso.S" diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 2f9761e1bd788..4d063038312d3 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -458,7 +458,7 @@ pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); static const pgprot_t protection_map[16] = { [VM_NONE] = PAGE_NONE, [VM_READ] = PAGE_READ, - [VM_WRITE] = PAGE_COPY, + [VM_WRITE] = PAGE_SHADOWSTACK, [VM_WRITE | VM_READ] = PAGE_COPY, [VM_EXEC] = PAGE_EXEC, [VM_EXEC | VM_READ] = PAGE_READ_EXEC, diff --git a/arch/riscv/mm/pgtable.c b/arch/riscv/mm/pgtable.c index 645d38cf89fb2..00383a9da5c93 100644 --- a/arch/riscv/mm/pgtable.c +++ b/arch/riscv/mm/pgtable.c @@ -157,3 +157,19 @@ pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, return pmd; } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ + +pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma) +{ + if (vma->vm_flags & VM_SHADOW_STACK) + return pte_mkwrite_shstk(pte); + + return pte_mkwrite_novma(pte); +} + +pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma) +{ + if (vma->vm_flags & VM_SHADOW_STACK) + return pmd_mkwrite_shstk(pmd); + + return pmd_mkwrite_novma(pmd); +} diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl index 51cc3616d5f98..59c4c009a7d8e 100644 --- a/arch/s390/kernel/syscalls/syscall.tbl +++ b/arch/s390/kernel/syscalls/syscall.tbl @@ -455,3 +455,4 @@ 450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node 451 common cachestat sys_cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 sys_fchmodat2 +453 common map_shadow_stack sys_map_shadow_stack sys_map_shadow_stack diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl index 7e1ceb2ba5721..1711499fd3eee 100644 --- a/arch/sh/kernel/syscalls/syscall.tbl +++ b/arch/sh/kernel/syscalls/syscall.tbl @@ -456,3 +456,4 @@ 450 common set_mempolicy_home_node sys_set_mempolicy_home_node 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 +453 common map_shadow_stack sys_map_shadow_stack diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl index d0f535230ad8b..bcabfed3c1cec 100644 --- a/arch/sparc/kernel/syscalls/syscall.tbl +++ b/arch/sparc/kernel/syscalls/syscall.tbl @@ -498,3 +498,4 @@ 450 common set_mempolicy_home_node sys_set_mempolicy_home_node 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 +453 common map_shadow_stack sys_map_shadow_stack diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl index 38db5ef2329f3..48a81f59d0e7b 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -457,3 +457,4 @@ 450 i386 set_mempolicy_home_node sys_set_mempolicy_home_node 451 i386 cachestat sys_cachestat 452 i386 fchmodat2 sys_fchmodat2 +453 i386 map_shadow_stack sys_map_shadow_stack diff --git a/arch/x86/include/uapi/asm/mman.h b/arch/x86/include/uapi/asm/mman.h index 46cdc941f9586..ac1e6277212b9 100644 --- a/arch/x86/include/uapi/asm/mman.h +++ b/arch/x86/include/uapi/asm/mman.h @@ -5,9 +5,6 @@ #define MAP_32BIT 0x40 /* only give out 32bit addresses */ #define MAP_ABOVE4G 0x80 /* only map above 4GB */ -/* Flags for map_shadow_stack(2) */ -#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */ - #include #endif /* _ASM_X86_MMAN_H */ diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl index fc1a4f3c81d9b..94e6bcc2bec70 100644 --- a/arch/xtensa/kernel/syscalls/syscall.tbl +++ b/arch/xtensa/kernel/syscalls/syscall.tbl @@ -423,3 +423,4 @@ 450 common set_mempolicy_home_node sys_set_mempolicy_home_node 451 common cachestat sys_cachestat 452 common fchmodat2 sys_fchmodat2 +453 common map_shadow_stack sys_map_shadow_stack diff --git a/include/linux/cpu.h b/include/linux/cpu.h index 1e3d1b2ac4c1d..37a4ee2a0f747 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -232,4 +232,8 @@ static inline bool cpu_mitigations_auto_nosmt(void) } #endif +int arch_prctl_get_branch_landing_pad_state(struct task_struct *t, unsigned long __user *state); +int arch_prctl_set_branch_landing_pad_state(struct task_struct *t, unsigned long state); +int arch_prctl_lock_branch_landing_pad_state(struct task_struct *t); + #endif /* _LINUX_CPU_H_ */ diff --git a/include/linux/mm.h b/include/linux/mm.h index e7d62a5446e9b..c62368f62beda 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -345,7 +345,7 @@ extern unsigned int kobjsize(const void *objp); #endif #endif /* CONFIG_ARCH_HAS_PKEYS */ -#ifdef CONFIG_X86_USER_SHADOW_STACK +#if defined(CONFIG_X86_USER_SHADOW_STACK) || defined(CONFIG_RISCV_USER_CFI) /* * VM_SHADOW_STACK should not be set with VM_SHARED because of lack of * support core mm. @@ -4179,4 +4179,8 @@ static inline void accept_memory(phys_addr_t start, phys_addr_t end) #endif +int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status); +int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status); +int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status); + #endif /* _LINUX_MM_H */ diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h index 57e8195d0b538..5e3d61ddbd8c1 100644 --- a/include/uapi/asm-generic/mman.h +++ b/include/uapi/asm-generic/mman.h @@ -19,4 +19,8 @@ #define MCL_FUTURE 2 /* lock all future mappings */ #define MCL_ONFAULT 4 /* lock all pages that are faulted in */ +#define SHADOW_STACK_SET_TOKEN (1ULL << 0) /* Set up a restore token in the shadow stack */ +#define SHADOW_STACK_SET_MARKER (1ULL << 1) /* Set up a top of stack marker in the shadow stack */ + + #endif /* __ASM_GENERIC_MMAN_H */ diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index 05c412c582399..0a5366d469f56 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -823,8 +823,11 @@ __SYSCALL(__NR_cachestat, sys_cachestat) #define __NR_fchmodat2 452 __SYSCALL(__NR_fchmodat2, sys_fchmodat2) +#define __NR_map_shadow_stack 453 +__SYSCALL(__NR_map_shadow_stack, sys_map_shadow_stack) + #undef __NR_syscalls -#define __NR_syscalls 453 +#define __NR_syscalls 454 /* * 32 bit systems traditionally used different diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h index 9b731976ce2fd..6b4c44ace911b 100644 --- a/include/uapi/linux/elf.h +++ b/include/uapi/linux/elf.h @@ -447,6 +447,8 @@ typedef struct elf64_shdr { #define NT_MIPS_MSA 0x802 /* MIPS SIMD registers */ #define NT_RISCV_CSR 0x900 /* RISC-V Control and Status Registers */ #define NT_RISCV_VECTOR 0x901 /* RISC-V vector registers */ +#define NN_RISCV_USER_CFI "LINUX" +#define NT_RISCV_USER_CFI 0x903 /* RISC-V shadow stack state */ #define NT_LOONGARCH_CPUCFG 0xa00 /* LoongArch CPU config registers */ #define NT_LOONGARCH_CSR 0xa01 /* LoongArch control and status registers */ #define NT_LOONGARCH_LSX 0xa02 /* LoongArch Loongson SIMD Extension registers */ diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index 370ed14b1ae09..906a46c4be4f7 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -306,4 +306,46 @@ struct prctl_mm_map { # define PR_RISCV_V_VSTATE_CTRL_NEXT_MASK 0xc # define PR_RISCV_V_VSTATE_CTRL_MASK 0x1f +/* + * Get the current shadow stack configuration for the current thread, + * this will be the value configured via PR_SET_SHADOW_STACK_STATUS. + */ +#define PR_GET_SHADOW_STACK_STATUS 74 + +/* + * Set the current shadow stack configuration. Enabling the shadow + * stack will cause a shadow stack to be allocated for the thread. + */ +#define PR_SET_SHADOW_STACK_STATUS 75 +# define PR_SHADOW_STACK_ENABLE (1UL << 0) +# define PR_SHADOW_STACK_WRITE (1UL << 1) +# define PR_SHADOW_STACK_PUSH (1UL << 2) + +/* + * Prevent further changes to the specified shadow stack + * configuration. All bits may be locked via this call, including + * undefined bits. + */ +#define PR_LOCK_SHADOW_STACK_STATUS 76 + +/* + * Get or set the control flow integrity (CFI) configuration for the + * current thread. + * + * Some per-thread control flow integrity settings are not yet + * controlled through this prctl(); see for example + * PR_{GET,SET,LOCK}_SHADOW_STACK_STATUS + */ +#define PR_GET_CFI 80 +#define PR_SET_CFI 81 +/* + * Forward-edge CFI variants (excluding ARM64 BTI, which has its own + * prctl()s). + */ +#define PR_CFI_BRANCH_LANDING_PADS 0 +/* Return and control values for PR_{GET,SET}_CFI */ +# define PR_CFI_ENABLE _BITUL(0) +# define PR_CFI_DISABLE _BITUL(1) +# define PR_CFI_LOCK _BITUL(2) + #endif /* _LINUX_PRCTL_H */ diff --git a/kernel/sys.c b/kernel/sys.c index 47cb10a16b009..a49678f561e99 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -2320,6 +2320,37 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which, return -EINVAL; } +int __weak arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *status) +{ + return -EINVAL; +} + +int __weak arch_set_shadow_stack_status(struct task_struct *t, unsigned long status) +{ + return -EINVAL; +} + +int __weak arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status) +{ + return -EINVAL; +} + +int __weak arch_prctl_get_branch_landing_pad_state(struct task_struct *t, + unsigned long __user *state) +{ + return -EINVAL; +} + +int __weak arch_prctl_set_branch_landing_pad_state(struct task_struct *t, unsigned long state) +{ + return -EINVAL; +} + +int __weak arch_prctl_lock_branch_landing_pad_state(struct task_struct *t) +{ + return -EINVAL; +} + #define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE) #ifdef CONFIG_ANON_VMA_NAME @@ -2767,6 +2798,39 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, case PR_RISCV_V_GET_CONTROL: error = RISCV_V_GET_CONTROL(); break; + case PR_GET_SHADOW_STACK_STATUS: + if (arg3 || arg4 || arg5) + return -EINVAL; + error = arch_get_shadow_stack_status(me, (unsigned long __user *)arg2); + break; + case PR_SET_SHADOW_STACK_STATUS: + if (arg3 || arg4 || arg5) + return -EINVAL; + error = arch_set_shadow_stack_status(me, arg2); + break; + case PR_LOCK_SHADOW_STACK_STATUS: + if (arg3 || arg4 || arg5) + return -EINVAL; + error = arch_lock_shadow_stack_status(me, arg2); + break; + case PR_GET_CFI: + if (arg2 != PR_CFI_BRANCH_LANDING_PADS) + return -EINVAL; + if (arg4 || arg5) + return -EINVAL; + error = arch_prctl_get_branch_landing_pad_state(me, (unsigned long __user *)arg3); + break; + case PR_SET_CFI: + if (arg2 != PR_CFI_BRANCH_LANDING_PADS) + return -EINVAL; + if (arg4 || arg5) + return -EINVAL; + error = arch_prctl_set_branch_landing_pad_state(me, arg3); + if (error) + break; + if (arg3 & PR_CFI_LOCK && !(arg3 & PR_CFI_DISABLE)) + error = arch_prctl_lock_branch_landing_pad_state(me); + break; default: error = -EINVAL; break; diff --git a/lib/vdso/Makefile b/lib/vdso/Makefile index 9f031eafc4651..cedbf15f80874 100644 --- a/lib/vdso/Makefile +++ b/lib/vdso/Makefile @@ -4,6 +4,7 @@ GENERIC_VDSO_MK_PATH := $(abspath $(lastword $(MAKEFILE_LIST))) GENERIC_VDSO_DIR := $(dir $(GENERIC_VDSO_MK_PATH)) c-gettimeofday-$(CONFIG_GENERIC_GETTIMEOFDAY) := $(addprefix $(GENERIC_VDSO_DIR), gettimeofday.c) +c-getrandom-$(CONFIG_VDSO_GETRANDOM) := $(addprefix $(GENERIC_VDSO_DIR), getrandom.c) # This cmd checks that the vdso library does not contain dynamic relocations. # It has to be called after the linking of the vdso library and requires it diff --git a/scripts/Makefile.build b/scripts/Makefile.build index b897c78067112..f3d1237d2d968 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -112,13 +112,13 @@ endif quiet_cmd_cc_s_c = CC $(quiet_modtag) $@ cmd_cc_s_c = $(CC) $(filter-out $(DEBUG_CFLAGS) $(CC_FLAGS_LTO), $(c_flags)) -fverbose-asm -S -o $@ $< -$(obj)/%.s: $(src)/%.c FORCE +$(obj)/%.s: $(obj)/%.c FORCE $(call if_changed_dep,cc_s_c) quiet_cmd_cpp_i_c = CPP $(quiet_modtag) $@ cmd_cpp_i_c = $(CPP) $(c_flags) -o $@ $< -$(obj)/%.i: $(src)/%.c FORCE +$(obj)/%.i: $(obj)/%.c FORCE $(call if_changed_dep,cpp_i_c) genksyms = scripts/genksyms/genksyms \ @@ -132,7 +132,7 @@ cmd_gensymtypes_c = $(CPP) -D__GENKSYMS__ $(c_flags) $< | $(genksyms) quiet_cmd_cc_symtypes_c = SYM $(quiet_modtag) $@ cmd_cc_symtypes_c = $(call cmd_gensymtypes_c,true,$@) >/dev/null -$(obj)/%.symtypes : $(src)/%.c FORCE +$(obj)/%.symtypes : $(obj)/%.c FORCE $(call cmd,cc_symtypes_c) # LLVM assembly @@ -140,7 +140,7 @@ $(obj)/%.symtypes : $(src)/%.c FORCE quiet_cmd_cc_ll_c = CC $(quiet_modtag) $@ cmd_cc_ll_c = $(CC) $(c_flags) -emit-llvm -S -fno-discard-value-names -o $@ $< -$(obj)/%.ll: $(src)/%.c FORCE +$(obj)/%.ll: $(obj)/%.c FORCE $(call if_changed_dep,cc_ll_c) # C (.c) files @@ -239,7 +239,7 @@ define rule_as_o_S endef # Built-in and composite module parts -$(obj)/%.o: $(src)/%.c $(recordmcount_source) FORCE +$(obj)/%.o: $(obj)/%.c $(recordmcount_source) FORCE $(call if_changed_rule,cc_o_c) $(call cmd,force_checksrc) @@ -256,7 +256,7 @@ quiet_cmd_cc_lst_c = MKLST $@ $(CONFIG_SHELL) $(srctree)/scripts/makelst $*.o \ System.map $(OBJDUMP) > $@ -$(obj)/%.lst: $(src)/%.c FORCE +$(obj)/%.lst: $(obj)/%.c FORCE $(call if_changed_dep,cc_lst_c) # Compile Rust sources (.rs) @@ -288,28 +288,28 @@ rust_common_cmd = \ quiet_cmd_rustc_o_rs = $(RUSTC_OR_CLIPPY_QUIET) $(quiet_modtag) $@ cmd_rustc_o_rs = $(rust_common_cmd) --emit=obj=$@ $< -$(obj)/%.o: $(src)/%.rs FORCE - $(call if_changed_dep,rustc_o_rs) +$(obj)/%.o: $(obj)/%.rs FORCE + +$(call if_changed_dep,rustc_o_rs) quiet_cmd_rustc_rsi_rs = $(RUSTC_OR_CLIPPY_QUIET) $(quiet_modtag) $@ cmd_rustc_rsi_rs = \ $(rust_common_cmd) -Zunpretty=expanded $< >$@; \ command -v $(RUSTFMT) >/dev/null && $(RUSTFMT) --config-path $(srctree)/.rustfmt.toml $@ -$(obj)/%.rsi: $(src)/%.rs FORCE - $(call if_changed_dep,rustc_rsi_rs) +$(obj)/%.rsi: $(obj)/%.rs FORCE + +$(call if_changed_dep,rustc_rsi_rs) quiet_cmd_rustc_s_rs = $(RUSTC_OR_CLIPPY_QUIET) $(quiet_modtag) $@ cmd_rustc_s_rs = $(rust_common_cmd) --emit=asm=$@ $< -$(obj)/%.s: $(src)/%.rs FORCE - $(call if_changed_dep,rustc_s_rs) +$(obj)/%.s: $(obj)/%.rs FORCE + +$(call if_changed_dep,rustc_s_rs) quiet_cmd_rustc_ll_rs = $(RUSTC_OR_CLIPPY_QUIET) $(quiet_modtag) $@ cmd_rustc_ll_rs = $(rust_common_cmd) --emit=llvm-ir=$@ $< -$(obj)/%.ll: $(src)/%.rs FORCE - $(call if_changed_dep,rustc_ll_rs) +$(obj)/%.ll: $(obj)/%.rs FORCE + +$(call if_changed_dep,rustc_ll_rs) # Compile assembler sources (.S) # --------------------------------------------------------------------------- @@ -334,14 +334,14 @@ cmd_gensymtypes_S = \ quiet_cmd_cc_symtypes_S = SYM $(quiet_modtag) $@ cmd_cc_symtypes_S = $(call cmd_gensymtypes_S,true,$@) >/dev/null -$(obj)/%.symtypes : $(src)/%.S FORCE +$(obj)/%.symtypes : $(obj)/%.S FORCE $(call cmd,cc_symtypes_S) quiet_cmd_cpp_s_S = CPP $(quiet_modtag) $@ cmd_cpp_s_S = $(CPP) $(a_flags) -o $@ $< -$(obj)/%.s: $(src)/%.S FORCE +$(obj)/%.s: $(obj)/%.S FORCE $(call if_changed_dep,cpp_s_S) quiet_cmd_as_o_S = AS $(quiet_modtag) $@ @@ -356,7 +356,7 @@ cmd_gen_symversions_S = $(call gen_symversions,S) endif -$(obj)/%.o: $(src)/%.S FORCE +$(obj)/%.o: $(obj)/%.S FORCE $(call if_changed_rule,as_o_S) targets += $(filter-out $(subdir-builtin), $(real-obj-y)) diff --git a/scripts/Makefile.host b/scripts/Makefile.host index 8f7f842b54f9e..d92a1403857f4 100644 --- a/scripts/Makefile.host +++ b/scripts/Makefile.host @@ -110,7 +110,7 @@ endif quiet_cmd_host-csingle = HOSTCC $@ cmd_host-csingle = $(HOSTCC) $(hostc_flags) $(KBUILD_HOSTLDFLAGS) -o $@ $< \ $(KBUILD_HOSTLDLIBS) $(HOSTLDLIBS_$(target-stem)) -$(host-csingle): $(obj)/%: $(src)/%.c FORCE +$(host-csingle): $(obj)/%: $(obj)/%.c FORCE $(call if_changed_dep,host-csingle) # Link an executable based on list of .o files, all plain c @@ -127,7 +127,7 @@ $(call multi_depend, $(host-cmulti), , -objs) # host-cobjs -> .o quiet_cmd_host-cobjs = HOSTCC $@ cmd_host-cobjs = $(HOSTCC) $(hostc_flags) -c -o $@ $< -$(host-cobjs): $(obj)/%.o: $(src)/%.c FORCE +$(host-cobjs): $(obj)/%.o: $(obj)/%.c FORCE $(call if_changed_dep,host-cobjs) # Link an executable based on list of .o files, a mixture of .c and .cc diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib index 4aecfb0a0ef6a..bb9fa969cf992 100644 --- a/scripts/Makefile.lib +++ b/scripts/Makefile.lib @@ -419,7 +419,7 @@ quiet_cmd_dtb = $(quiet_cmd_dtc) cmd_dtb = $(cmd_dtc) endif -$(obj)/%.dtb: $(src)/%.dts $(DTC) $(DT_TMP_SCHEMA) FORCE +$(obj)/%.dtb: $(obj)/%.dts $(DTC) $(DT_TMP_SCHEMA) FORCE $(call if_changed_dep,dtb) $(obj)/%.dtbo: $(src)/%.dtso $(DTC) FORCE diff --git a/tools/testing/selftests/riscv/Makefile b/tools/testing/selftests/riscv/Makefile index b9c3da3fd7aaa..5976a0adec76c 100644 --- a/tools/testing/selftests/riscv/Makefile +++ b/tools/testing/selftests/riscv/Makefile @@ -5,7 +5,7 @@ ARCH ?= $(shell uname -m 2>/dev/null || echo not) ifneq (,$(filter $(ARCH),riscv)) -RISCV_SUBTARGETS ?= hwprobe vector mm sse +RISCV_SUBTARGETS ?= hwprobe vector mm sse cfi else RISCV_SUBTARGETS := endif diff --git a/tools/testing/selftests/riscv/cfi/.gitignore b/tools/testing/selftests/riscv/cfi/.gitignore new file mode 100644 index 0000000000000..c1faf7ca43465 --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/.gitignore @@ -0,0 +1,2 @@ +cfitests +shadowstack diff --git a/tools/testing/selftests/riscv/cfi/Makefile b/tools/testing/selftests/riscv/cfi/Makefile new file mode 100644 index 0000000000000..93b4738c0e2e8 --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/Makefile @@ -0,0 +1,25 @@ +# SPDX-License-Identifier: GPL-2.0 + +CFLAGS += $(KHDR_INCLUDES) +CFLAGS += -I$(top_srcdir)/tools/include + +CFLAGS += -march=rv64gc_zicfilp_zicfiss -fcf-protection=full + +# Check for zicfi* extensions needs cross compiler +# which is not set until lib.mk is included +ifeq ($(LLVM)$(CC),cc) +CC := $(CROSS_COMPILE)gcc +endif + + +ifeq ($(shell $(CC) $(CFLAGS) -nostdlib -xc /dev/null -o /dev/null > /dev/null 2>&1; echo $$?),0) +TEST_GEN_PROGS := cfitests + +$(OUTPUT)/cfitests: cfitests.c shadowstack.c + $(CC) -o$@ $(CFLAGS) $(LDFLAGS) $^ +else + +$(shell echo "Toolchain doesn't support CFI, skipping CFI kselftest." >&2) +endif + +include ../../lib.mk diff --git a/tools/testing/selftests/riscv/cfi/cfi_rv_test.h b/tools/testing/selftests/riscv/cfi/cfi_rv_test.h new file mode 100644 index 0000000000000..1c8043f2b778b --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/cfi_rv_test.h @@ -0,0 +1,82 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef SELFTEST_RISCV_CFI_H +#define SELFTEST_RISCV_CFI_H +#include +#include +#include "shadowstack.h" + +#define CHILD_EXIT_CODE_SSWRITE 10 +#define CHILD_EXIT_CODE_SIG_TEST 11 + +#define my_syscall5(num, arg1, arg2, arg3, arg4, arg5) \ +({ \ + register long _num __asm__ ("a7") = (num); \ + register long _arg1 __asm__ ("a0") = (long)(arg1); \ + register long _arg2 __asm__ ("a1") = (long)(arg2); \ + register long _arg3 __asm__ ("a2") = (long)(arg3); \ + register long _arg4 __asm__ ("a3") = (long)(arg4); \ + register long _arg5 __asm__ ("a4") = (long)(arg5); \ + \ + __asm__ volatile( \ + "ecall\n" \ + : "+r" \ + (_arg1) \ + : "r"(_arg2), "r"(_arg3), "r"(_arg4), "r"(_arg5), \ + "r"(_num) \ + : "memory", "cc" \ + ); \ + _arg1; \ +}) + +#define my_syscall3(num, arg1, arg2, arg3) \ +({ \ + register long _num __asm__ ("a7") = (num); \ + register long _arg1 __asm__ ("a0") = (long)(arg1); \ + register long _arg2 __asm__ ("a1") = (long)(arg2); \ + register long _arg3 __asm__ ("a2") = (long)(arg3); \ + \ + __asm__ volatile( \ + "ecall\n" \ + : "+r" (_arg1) \ + : "r"(_arg2), "r"(_arg3), \ + "r"(_num) \ + : "memory", "cc" \ + ); \ + _arg1; \ +}) + +#ifndef __NR_prctl +#define __NR_prctl 167 +#endif + +#ifndef __NR_map_shadow_stack +#define __NR_map_shadow_stack 453 +#endif + +#define CSR_SSP 0x011 + +#ifdef __ASSEMBLY__ +#define __ASM_STR(x) x +#else +#define __ASM_STR(x) #x +#endif + +#define csr_read(csr) \ +({ \ + register unsigned long __v; \ + __asm__ __volatile__ ("csrr %0, " __ASM_STR(csr) \ + : "=r" (__v) : \ + : "memory"); \ + __v; \ +}) + +#define csr_write(csr, val) \ +({ \ + unsigned long __v = (unsigned long)(val); \ + __asm__ __volatile__ ("csrw " __ASM_STR(csr) ", %0" \ + : : "rK" (__v) \ + : "memory"); \ +}) + +#endif diff --git a/tools/testing/selftests/riscv/cfi/cfitests.c b/tools/testing/selftests/riscv/cfi/cfitests.c new file mode 100644 index 0000000000000..39d097b6881ff --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/cfitests.c @@ -0,0 +1,174 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "../../kselftest.h" +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "cfi_rv_test.h" + +/* do not optimize cfi related test functions */ +#pragma GCC push_options +#pragma GCC optimize("O0") + +void sigsegv_handler(int signum, siginfo_t *si, void *uc) +{ + struct ucontext *ctx = (struct ucontext *)uc; + + if (si->si_code == SEGV_CPERR) { + ksft_print_msg("Control flow violation happened somewhere\n"); + ksft_print_msg("PC where violation happened %lx\n", ctx->uc_mcontext.gregs[0]); + exit(-1); + } + + /* all other cases are expected to be of shadow stack write case */ + exit(CHILD_EXIT_CODE_SSWRITE); +} + +bool register_signal_handler(void) +{ + struct sigaction sa = {}; + + sa.sa_sigaction = sigsegv_handler; + sa.sa_flags = SA_SIGINFO; + if (sigaction(SIGSEGV, &sa, NULL)) { + ksft_print_msg("Registering signal handler for landing pad violation failed\n"); + return false; + } + + return true; +} + +long ptrace(int request, pid_t pid, void *addr, void *data); + +bool cfi_ptrace_test(void) +{ + pid_t pid; + int status, ret = 0; + unsigned long ptrace_test_num = 0, total_ptrace_tests = 2; + + struct user_cfi_state cfi_reg; + struct iovec iov; + + pid = fork(); + + if (pid == -1) { + ksft_exit_fail_msg("%s: fork failed\n", __func__); + exit(1); + } + + if (pid == 0) { + /* allow to be traced */ + ptrace(PTRACE_TRACEME, 0, NULL, NULL); + raise(SIGSTOP); + asm volatile ("la a5, 1f\n" + "jalr a5\n" + "nop\n" + "nop\n" + "1: nop\n" + : : : "a5"); + exit(11); + /* child shouldn't go beyond here */ + } + + /* parent's code goes here */ + iov.iov_base = &cfi_reg; + iov.iov_len = sizeof(cfi_reg); + + while (ptrace_test_num < total_ptrace_tests) { + memset(&cfi_reg, 0, sizeof(cfi_reg)); + waitpid(pid, &status, 0); + if (WIFSTOPPED(status)) { + errno = 0; + ret = ptrace(PTRACE_GETREGSET, pid, (void *)NT_RISCV_USER_CFI, &iov); + if (ret == -1 && errno) + ksft_exit_fail_msg("%s: PTRACE_GETREGSET failed\n", __func__); + } else { + ksft_exit_fail_msg("%s: child didn't stop, failed\n", __func__); + } + + switch (ptrace_test_num) { +#define CFI_ENABLE_MASK (PTRACE_CFI_BRANCH_LANDING_PAD_EN_STATE | \ + PTRACE_CFI_SHADOW_STACK_EN_STATE | \ + PTRACE_CFI_SHADOW_STACK_PTR_STATE) + case 0: + if ((cfi_reg.cfi_status.cfi_state & CFI_ENABLE_MASK) != CFI_ENABLE_MASK) + ksft_exit_fail_msg("%s: ptrace_getregset failed, %llu\n", __func__, + cfi_reg.cfi_status.cfi_state); + if (!cfi_reg.shstk_ptr) + ksft_exit_fail_msg("%s: NULL shadow stack pointer, test failed\n", + __func__); + break; + case 1: + if (!(cfi_reg.cfi_status.cfi_state & + PTRACE_CFI_BRANCH_EXPECTED_LANDING_PAD_STATE)) + ksft_exit_fail_msg("%s: elp must have been set\n", __func__); + /* clear elp state. not interested in anything else */ + cfi_reg.cfi_status.cfi_state = 0; + + ret = ptrace(PTRACE_SETREGSET, pid, (void *)NT_RISCV_USER_CFI, &iov); + if (ret == -1 && errno) + ksft_exit_fail_msg("%s: PTRACE_GETREGSET failed\n", __func__); + break; + default: + ksft_exit_fail_msg("%s: unreachable switch case\n", __func__); + break; + } + ptrace(PTRACE_CONT, pid, NULL, NULL); + ptrace_test_num++; + } + + waitpid(pid, &status, 0); + if (WEXITSTATUS(status) != 11) + ksft_print_msg("%s, bad return code from child\n", __func__); + + ksft_print_msg("%s, ptrace test succeeded\n", __func__); + return true; +} + +int main(int argc, char *argv[]) +{ + int ret = 0; + unsigned long lpad_status = 0, ss_status = 0; + + ksft_print_header(); + + ksft_print_msg("Starting risc-v tests\n"); + + /* + * Landing pad test. Not a lot of kernel changes to support landing + * pads for user mode except lighting up a bit in senvcfg via a prctl. + * Enable landing pad support throughout the execution of the test binary. + */ + ret = my_syscall5(__NR_prctl, PR_GET_CFI, PR_CFI_BRANCH_LANDING_PADS, &lpad_status, 0, 0); + if (ret) + ksft_exit_fail_msg("Get landing pad status failed with %d\n", ret); + + if (!(lpad_status & PR_CFI_ENABLE)) + ksft_exit_fail_msg("Landing pad is not enabled, should be enabled via glibc\n"); + + ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0); + if (ret) + ksft_exit_fail_msg("Get shadow stack failed with %d\n", ret); + + if (!(ss_status & PR_SHADOW_STACK_ENABLE)) + ksft_exit_fail_msg("Shadow stack is not enabled, should be enabled via glibc\n"); + + if (!register_signal_handler()) + ksft_exit_fail_msg("Registering signal handler for SIGSEGV failed\n"); + + ksft_print_msg("Landing pad and shadow stack are enabled for binary\n"); + cfi_ptrace_test(); + + execute_shadow_stack_tests(); + + return 0; +} + +#pragma GCC pop_options diff --git a/tools/testing/selftests/riscv/cfi/shadowstack.c b/tools/testing/selftests/riscv/cfi/shadowstack.c new file mode 100644 index 0000000000000..f8eed8260a125 --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/shadowstack.c @@ -0,0 +1,385 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "../../kselftest.h" +#include +#include +#include +#include +#include +#include "shadowstack.h" +#include "cfi_rv_test.h" + +static struct shadow_stack_tests shstk_tests[] = { + { "shstk fork test\n", shadow_stack_fork_test }, + { "map shadow stack syscall\n", shadow_stack_map_test }, + { "shadow stack gup tests\n", shadow_stack_gup_tests }, + { "shadow stack signal tests\n", shadow_stack_signal_test}, + { "memory protections of shadow stack memory\n", shadow_stack_protection_test } +}; + +#define RISCV_SHADOW_STACK_TESTS ARRAY_SIZE(shstk_tests) + +/* do not optimize shadow stack related test functions */ +#pragma GCC push_options +#pragma GCC optimize("O0") + +void zar(void) +{ + unsigned long ssp = 0; + + ssp = csr_read(CSR_SSP); + ksft_print_msg("Spewing out shadow stack ptr: %lx\n" + " This is to ensure shadow stack is indeed enabled and working\n", + ssp); +} + +void bar(void) +{ + zar(); +} + +void foo(void) +{ + bar(); +} + +void zar_child(void) +{ + unsigned long ssp = 0; + + ssp = csr_read(CSR_SSP); + ksft_print_msg("Spewing out shadow stack ptr: %lx\n" + " This is to ensure shadow stack is indeed enabled and working\n", + ssp); +} + +void bar_child(void) +{ + zar_child(); +} + +void foo_child(void) +{ + bar_child(); +} + +typedef void (call_func_ptr)(void); +/* + * call couple of functions to test push/pop. + */ +int shadow_stack_call_tests(call_func_ptr fn_ptr, bool parent) +{ + ksft_print_msg("dummy calls for sspush and sspopchk in context of %s\n", + parent ? "parent" : "child"); + + (fn_ptr)(); + + return 0; +} + +/* forks a thread, and ensure shadow stacks fork out */ +bool shadow_stack_fork_test(unsigned long test_num, void *ctx) +{ + int pid = 0, child_status = 0, parent_pid = 0, ret = 0; + unsigned long ss_status = 0; + + ksft_print_msg("Exercising shadow stack fork test\n"); + + ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0); + if (ret) { + ksft_exit_skip("Shadow stack get status prctl failed with errorcode %d\n", ret); + return false; + } + + if (!(ss_status & PR_SHADOW_STACK_ENABLE)) + ksft_exit_skip("Shadow stack is not enabled, should be enabled via glibc\n"); + + parent_pid = getpid(); + pid = fork(); + + if (pid) { + ksft_print_msg("Parent pid %d and child pid %d\n", parent_pid, pid); + shadow_stack_call_tests(&foo, true); + } else { + shadow_stack_call_tests(&foo_child, false); + } + + if (pid) { + ksft_print_msg("Waiting on child to finish\n"); + wait(&child_status); + } else { + /* exit child gracefully */ + exit(0); + } + + if (pid && WIFSIGNALED(child_status)) { + ksft_print_msg("Child faulted, fork test failed\n"); + return false; + } + + return true; +} + +/* exercise 'map_shadow_stack', pivot to it and call some functions to ensure it works */ +#define SHADOW_STACK_ALLOC_SIZE 4096 +bool shadow_stack_map_test(unsigned long test_num, void *ctx) +{ + unsigned long shdw_addr; + int ret = 0; + + ksft_print_msg("Exercising shadow stack map test\n"); + + shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0); + + if (((long)shdw_addr) <= 0) { + ksft_print_msg("map_shadow_stack failed with error code %d\n", + (int)shdw_addr); + return false; + } + + ret = munmap((void *)shdw_addr, SHADOW_STACK_ALLOC_SIZE); + + if (ret) { + ksft_print_msg("munmap failed with error code %d\n", ret); + return false; + } + + return true; +} + +/* + * shadow stack protection tests. map a shadow stack and + * validate all memory protections work on it + */ +bool shadow_stack_protection_test(unsigned long test_num, void *ctx) +{ + unsigned long shdw_addr; + unsigned long *write_addr = NULL; + int ret = 0, pid = 0, child_status = 0; + + ksft_print_msg("Exercising shadow stack protection test (WPT)\n"); + + shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0); + + if (((long)shdw_addr) <= 0) { + ksft_print_msg("map_shadow_stack failed with error code %d\n", + (int)shdw_addr); + return false; + } + + write_addr = (unsigned long *)shdw_addr; + pid = fork(); + + /* no child was created, return false */ + if (pid == -1) + return false; + + /* + * try to perform a store from child on shadow stack memory + * it should result in SIGSEGV + */ + if (!pid) { + /* below write must lead to SIGSEGV */ + *write_addr = 0xdeadbeef; + } else { + wait(&child_status); + } + + /* test fail, if 0xdeadbeef present on shadow stack address */ + if (*write_addr == 0xdeadbeef) { + ksft_print_msg("Shadow stack WPT failed\n"); + return false; + } + + /* if child reached here, then fail */ + if (!pid) { + ksft_print_msg("Shadow stack WPT failed: child reached unreachable state\n"); + return false; + } + + /* if child exited via signal handler but not for write on ss */ + if (WIFEXITED(child_status) && + WEXITSTATUS(child_status) != CHILD_EXIT_CODE_SSWRITE) { + ksft_print_msg("Shadow stack WPT failed: child wasn't signaled for write\n"); + return false; + } + + ret = munmap(write_addr, SHADOW_STACK_ALLOC_SIZE); + if (ret) { + ksft_print_msg("Shadow stack WPT failed: munmap failed, error code %d\n", + ret); + return false; + } + + return true; +} + +#define SS_MAGIC_WRITE_VAL 0xbeefdead + +int gup_tests(int mem_fd, unsigned long *shdw_addr) +{ + unsigned long val = 0; + + lseek(mem_fd, (unsigned long)shdw_addr, SEEK_SET); + if (read(mem_fd, &val, sizeof(val)) < 0) { + ksft_print_msg("Reading shadow stack mem via gup failed\n"); + return 1; + } + + val = SS_MAGIC_WRITE_VAL; + lseek(mem_fd, (unsigned long)shdw_addr, SEEK_SET); + if (write(mem_fd, &val, sizeof(val)) < 0) { + ksft_print_msg("Writing shadow stack mem via gup failed\n"); + return 1; + } + + if (*shdw_addr != SS_MAGIC_WRITE_VAL) { + ksft_print_msg("GUP write to shadow stack memory failed\n"); + return 1; + } + + return 0; +} + +bool shadow_stack_gup_tests(unsigned long test_num, void *ctx) +{ + unsigned long shdw_addr = 0; + unsigned long *write_addr = NULL; + int fd = 0; + bool ret = false; + + ksft_print_msg("Exercising shadow stack gup tests\n"); + shdw_addr = my_syscall3(__NR_map_shadow_stack, NULL, SHADOW_STACK_ALLOC_SIZE, 0); + + if (((long)shdw_addr) <= 0) { + ksft_print_msg("map_shadow_stack failed with error code %d\n", (int)shdw_addr); + return false; + } + + write_addr = (unsigned long *)shdw_addr; + + fd = open("/proc/self/mem", O_RDWR); + if (fd == -1) + return false; + + if (gup_tests(fd, write_addr)) { + ksft_print_msg("gup tests failed\n"); + goto out; + } + + ret = true; +out: + if (shdw_addr && munmap(write_addr, SHADOW_STACK_ALLOC_SIZE)) { + ksft_print_msg("munmap failed with error code %d\n", ret); + ret = false; + } + + return ret; +} + +volatile bool break_loop; + +void sigusr1_handler(int signo) +{ + break_loop = true; +} + +bool sigusr1_signal_test(void) +{ + struct sigaction sa = {}; + + sa.sa_handler = sigusr1_handler; + sa.sa_flags = 0; + sigemptyset(&sa.sa_mask); + if (sigaction(SIGUSR1, &sa, NULL)) { + ksft_print_msg("Registering signal handler for SIGUSR1 failed\n"); + return false; + } + + return true; +} + +/* + * shadow stack signal test. shadow stack must be enabled. + * register a signal, fork another thread which is waiting + * on signal. Send a signal from parent to child, verify + * that signal was received by child. If not test fails + */ +bool shadow_stack_signal_test(unsigned long test_num, void *ctx) +{ + int pid = 0, child_status = 0, ret = 0; + unsigned long ss_status = 0; + + ksft_print_msg("Exercising shadow stack signal test\n"); + + ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &ss_status, 0, 0, 0); + if (ret) { + ksft_print_msg("Shadow stack get status prctl failed with errorcode %d\n", ret); + return false; + } + + if (!(ss_status & PR_SHADOW_STACK_ENABLE)) + ksft_print_msg("Shadow stack is not enabled, should be enabled via glibc\n"); + + /* this should be caught by signal handler and do an exit */ + if (!sigusr1_signal_test()) { + ksft_print_msg("Registering sigusr1 handler failed\n"); + exit(-1); + } + + pid = fork(); + + if (pid == -1) { + ksft_print_msg("Signal test: fork failed\n"); + goto out; + } + + if (pid == 0) { + while (!break_loop) + sleep(1); + + exit(11); + /* child shouldn't go beyond here */ + } + + /* send SIGUSR1 to child */ + kill(pid, SIGUSR1); + wait(&child_status); + +out: + + return (WIFEXITED(child_status) && + WEXITSTATUS(child_status) == 11); +} + +int execute_shadow_stack_tests(void) +{ + int ret = 0; + unsigned long test_count = 0; + unsigned long shstk_status = 0; + bool test_pass = false; + + ksft_print_msg("Executing RISC-V shadow stack self tests\n"); + ksft_set_plan(RISCV_SHADOW_STACK_TESTS); + + ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS, &shstk_status, 0, 0, 0); + + if (ret != 0) + ksft_exit_fail_msg("Get shadow stack status failed with %d\n", ret); + + /* + * If we are here that means get shadow stack status succeeded and + * thus shadow stack support is baked in the kernel. + */ + while (test_count < RISCV_SHADOW_STACK_TESTS) { + test_pass = (*shstk_tests[test_count].t_func)(test_count, NULL); + ksft_test_result(test_pass, shstk_tests[test_count].name); + test_count++; + } + + ksft_finished(); + + return 0; +} + +#pragma GCC pop_options diff --git a/tools/testing/selftests/riscv/cfi/shadowstack.h b/tools/testing/selftests/riscv/cfi/shadowstack.h new file mode 100644 index 0000000000000..943a3685905f5 --- /dev/null +++ b/tools/testing/selftests/riscv/cfi/shadowstack.h @@ -0,0 +1,27 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#ifndef SELFTEST_SHADOWSTACK_TEST_H +#define SELFTEST_SHADOWSTACK_TEST_H +#include +#include + +/* + * A CFI test returns true for success or false for fail. + * Takes a test number to index into array, and a void pointer. + */ +typedef bool (*shstk_test_func)(unsigned long test_num, void *); + +struct shadow_stack_tests { + char *name; + shstk_test_func t_func; +}; + +bool shadow_stack_fork_test(unsigned long test_num, void *ctx); +bool shadow_stack_map_test(unsigned long test_num, void *ctx); +bool shadow_stack_protection_test(unsigned long test_num, void *ctx); +bool shadow_stack_gup_tests(unsigned long test_num, void *ctx); +bool shadow_stack_signal_test(unsigned long test_num, void *ctx); + +int execute_shadow_stack_tests(void); + +#endif diff --git a/tools/testing/selftests/riscv/hwprobe/which-cpus.c b/tools/testing/selftests/riscv/hwprobe/which-cpus.c new file mode 100644 index 0000000000000..587feb198c049 --- /dev/null +++ b/tools/testing/selftests/riscv/hwprobe/which-cpus.c @@ -0,0 +1,162 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2023 Ventana Micro Systems Inc. + * + * Test the RISCV_HWPROBE_WHICH_CPUS flag of hwprobe. Also provides a command + * line interface to get the cpu list for arbitrary hwprobe pairs. + */ +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include + +#include "hwprobe.h" +#include "kselftest.h" + +static void help(void) +{ + printf("\n" + "which-cpus: [-h] [ [ ...]]\n\n" + " Without parameters, tests the RISCV_HWPROBE_WHICH_CPUS flag of hwprobe.\n" + " With parameters, where each parameter is a hwprobe pair written as\n" + " , outputs the cpulist for cpus which all match the given set\n" + " of pairs. 'key' and 'value' should be in numeric form, e.g. 4=0x3b\n"); +} + +static void print_cpulist(cpu_set_t *cpus) +{ + int start = 0, end = 0; + + if (!CPU_COUNT(cpus)) { + printf("cpus: None\n"); + return; + } + + printf("cpus:"); + for (int i = 0, c = 0; i < CPU_COUNT(cpus); i++, c++) { + if (start != end && !CPU_ISSET(c, cpus)) + printf("-%d", end); + + while (!CPU_ISSET(c, cpus)) + ++c; + + if (i != 0 && c == end + 1) { + end = c; + continue; + } + + printf("%c%d", i == 0 ? ' ' : ',', c); + start = end = c; + } + if (start != end) + printf("-%d", end); + printf("\n"); +} + +static void do_which_cpus(int argc, char **argv, cpu_set_t *cpus) +{ + struct riscv_hwprobe *pairs; + int nr_pairs = argc - 1; + char *start, *end; + int rc; + + pairs = malloc(nr_pairs * sizeof(struct riscv_hwprobe)); + assert(pairs); + + for (int i = 0; i < nr_pairs; i++) { + start = argv[i + 1]; + pairs[i].key = strtol(start, &end, 0); + assert(end != start && *end == '='); + start = end + 1; + pairs[i].value = strtoul(start, &end, 0); + assert(end != start && *end == '\0'); + } + + rc = riscv_hwprobe(pairs, nr_pairs, sizeof(cpu_set_t), (unsigned long *)cpus, RISCV_HWPROBE_WHICH_CPUS); + assert(rc == 0); + print_cpulist(cpus); + free(pairs); +} + +int main(int argc, char **argv) +{ + struct riscv_hwprobe pairs[3]; + cpu_set_t cpus_aff, cpus; + __u64 ext0_all, ext1_all; + long rc; + + rc = sched_getaffinity(0, sizeof(cpu_set_t), &cpus_aff); + assert(rc == 0); + + if (argc > 1) { + if (!strcmp(argv[1], "-h")) + help(); + else + do_which_cpus(argc, argv, &cpus_aff); + return 0; + } + + ksft_print_header(); + ksft_set_plan(7); + + pairs[0] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR, }; + rc = riscv_hwprobe(pairs, 1, 0, NULL, 0); + assert(rc == 0 && pairs[0].key == RISCV_HWPROBE_KEY_BASE_BEHAVIOR && + pairs[0].value == RISCV_HWPROBE_BASE_BEHAVIOR_IMA); + + pairs[0] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_IMA_EXT_0, }; + rc = riscv_hwprobe(pairs, 1, 0, NULL, 0); + assert(rc == 0 && pairs[0].key == RISCV_HWPROBE_KEY_IMA_EXT_0); + ext0_all = pairs[0].value; + + pairs[0] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_IMA_EXT_1, }; + rc = riscv_hwprobe(pairs, 1, 0, NULL, 0); + assert(rc == 0 && pairs[0].key == RISCV_HWPROBE_KEY_IMA_EXT_1); + ext1_all = pairs[0].value; + + pairs[0] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR, .value = RISCV_HWPROBE_BASE_BEHAVIOR_IMA, }; + CPU_ZERO(&cpus); + rc = riscv_hwprobe(pairs, 1, 0, (unsigned long *)&cpus, RISCV_HWPROBE_WHICH_CPUS); + ksft_test_result(rc == -EINVAL, "no cpusetsize\n"); + + pairs[0] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR, .value = RISCV_HWPROBE_BASE_BEHAVIOR_IMA, }; + rc = riscv_hwprobe(pairs, 1, sizeof(cpu_set_t), NULL, RISCV_HWPROBE_WHICH_CPUS); + ksft_test_result(rc == -EINVAL, "NULL cpus\n"); + + pairs[0] = (struct riscv_hwprobe){ .key = 0xbadc0de, }; + CPU_ZERO(&cpus); + rc = riscv_hwprobe(pairs, 1, sizeof(cpu_set_t), (unsigned long *)&cpus, RISCV_HWPROBE_WHICH_CPUS); + ksft_test_result(rc == 0 && CPU_COUNT(&cpus) == 0, "unknown key\n"); + + pairs[0] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR, .value = RISCV_HWPROBE_BASE_BEHAVIOR_IMA, }; + pairs[1] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR, .value = RISCV_HWPROBE_BASE_BEHAVIOR_IMA, }; + CPU_ZERO(&cpus); + rc = riscv_hwprobe(pairs, 2, sizeof(cpu_set_t), (unsigned long *)&cpus, RISCV_HWPROBE_WHICH_CPUS); + ksft_test_result(rc == 0, "duplicate keys\n"); + + pairs[0] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR, .value = RISCV_HWPROBE_BASE_BEHAVIOR_IMA, }; + pairs[1] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_IMA_EXT_0, .value = ext0_all, }; + pairs[2] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_IMA_EXT_1, .value = ext1_all, }; + CPU_ZERO(&cpus); + rc = riscv_hwprobe(pairs, 3, sizeof(cpu_set_t), (unsigned long *)&cpus, RISCV_HWPROBE_WHICH_CPUS); + ksft_test_result(rc == 0 && CPU_COUNT(&cpus) == sysconf(_SC_NPROCESSORS_ONLN), "set all cpus\n"); + + pairs[0] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR, .value = RISCV_HWPROBE_BASE_BEHAVIOR_IMA, }; + pairs[1] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_IMA_EXT_0, .value = ext0_all, }; + pairs[2] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_IMA_EXT_1, .value = ext1_all, }; + memcpy(&cpus, &cpus_aff, sizeof(cpu_set_t)); + rc = riscv_hwprobe(pairs, 3, sizeof(cpu_set_t), (unsigned long *)&cpus, RISCV_HWPROBE_WHICH_CPUS); + ksft_test_result(rc == 0 && CPU_EQUAL(&cpus, &cpus_aff), "set all affinity cpus\n"); + + pairs[0] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_BASE_BEHAVIOR, .value = RISCV_HWPROBE_BASE_BEHAVIOR_IMA, }; + pairs[1] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_IMA_EXT_0, .value = ~ext0_all, }; + pairs[2] = (struct riscv_hwprobe){ .key = RISCV_HWPROBE_KEY_IMA_EXT_1, .value = ~ext1_all, }; + memcpy(&cpus, &cpus_aff, sizeof(cpu_set_t)); + rc = riscv_hwprobe(pairs, 3, sizeof(cpu_set_t), (unsigned long *)&cpus, RISCV_HWPROBE_WHICH_CPUS); + ksft_test_result(rc == 0 && CPU_COUNT(&cpus) == 0, "clear all cpus\n"); + + ksft_finished(); +} diff --git a/tools/testing/selftests/vDSO/vgetrandom-chacha.S b/tools/testing/selftests/vDSO/vgetrandom-chacha.S new file mode 100644 index 0000000000000..a4a82e1c28a90 --- /dev/null +++ b/tools/testing/selftests/vDSO/vgetrandom-chacha.S @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Jason A. Donenfeld . All Rights Reserved. + */ + +#define __ASSEMBLY__ + +#if defined(__aarch64__) +#include "../../../../arch/arm64/kernel/vdso/vgetrandom-chacha.S" +#elif defined(__loongarch__) +#include "../../../../arch/loongarch/vdso/vgetrandom-chacha.S" +#elif defined(__powerpc__) || defined(__powerpc64__) +#include "../../../../arch/powerpc/kernel/vdso/vgetrandom-chacha.S" +#elif defined(__riscv) && __riscv_xlen == 64 +#include "../../../../arch/riscv/kernel/vdso/vgetrandom-chacha.S" +#elif defined(__s390x__) +#include "../../../../arch/s390/kernel/vdso64/vgetrandom-chacha.S" +#elif defined(__x86_64__) +#include "../../../../arch/x86/entry/vdso/vgetrandom-chacha.S" +#endif