4893 Commits

Author SHA1 Message Date
Linus Torvalds
c537e12dae Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:

 - Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK in riscv JIT (Menglong
   Dong)

 - Fix reference count leak in bpf_prog_test_run_xdp() (Tetsuo Handa)

 - Fix metadata size check in bpf_test_run() (Toke Høiland-Jørgensen)

 - Check that BPF insn array is not allowed as a map for const strings
   (Deepanshu Kartikey)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix reference count leak in bpf_prog_test_run_xdp()
  bpf: Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str()
  selftests/bpf: Update xdp_context_test_run test to check maximum metadata size
  bpf, test_run: Subtract size of xdp_frame from allowed metadata size
  riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK
2026-01-13 21:21:13 -08:00
Martin Kaiser
b0d7f5f0c9 riscv: trace: fix snapshot deadlock with sbi ecall
If sbi_ecall.c's functions are traceable,

echo "__sbi_ecall:snapshot" > /sys/kernel/tracing/set_ftrace_filter

may get the kernel into a deadlock.

(Functions in sbi_ecall.c are excluded from tracing if
CONFIG_RISCV_ALTERNATIVE_EARLY is set.)

__sbi_ecall triggers a snapshot of the ringbuffer. The snapshot code
raises an IPI interrupt, which results in another call to __sbi_ecall
and another snapshot...

All it takes to get into this endless loop is one initial __sbi_ecall.
On RISC-V systems without SSTC extension, the clock events in
timer-riscv.c issue periodic sbi ecalls, making the problem easy to
trigger.

Always exclude the sbi_ecall.c functions from tracing to fix the
potential deadlock.

sbi ecalls can easiliy be logged via trace events, excluding ecall
functions from function tracing is not a big limitation.

Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Link: https://patch.msgid.link/20251223135043.1336524-1-martin@kaiser.cx
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-07 13:25:56 -07:00
Yunhui Cui
957afeb99b riscv: remove irqflags.h inclusion in asm/bitops.h
The arch/riscv/include/asm/bitops.h does not functionally require
including /linux/irqflags.h. Additionally, adding
arch/riscv/include/asm/percpu.h causes a circular inclusion:
kernel/bounds.c
->include/linux/log2.h
->include/linux/bitops.h
->arch/riscv/include/asm/bitops.h
->include/linux/irqflags.h
->include/linux/find.h
->return val ? __ffs(val) : size;
->arch/riscv/include/asm/bitops.h

The compilation log is as follows:
CC      kernel/bounds.s
In file included from ./include/linux/bitmap.h:11,
               from ./include/linux/cpumask.h:12,
               from ./arch/riscv/include/asm/processor.h:55,
               from ./arch/riscv/include/asm/thread_info.h:42,
               from ./include/linux/thread_info.h:60,
               from ./include/asm-generic/preempt.h:5,
               from ./arch/riscv/include/generated/asm/preempt.h:1,
               from ./include/linux/preempt.h:79,
               from ./arch/riscv/include/asm/percpu.h:8,
               from ./include/linux/irqflags.h:19,
               from ./arch/riscv/include/asm/bitops.h:14,
               from ./include/linux/bitops.h:68,
               from ./include/linux/log2.h:12,
               from kernel/bounds.c:13:
./include/linux/find.h: In function 'find_next_bit':
./include/linux/find.h:66:30: error: implicit declaration of function '__ffs' [-Wimplicit-function-declaration]
   66 |                 return val ? __ffs(val) : size;
      |                              ^~~~~

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Acked-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Link: https://patch.msgid.link/20251216014721.42262-2-cuiyunhui@bytedance.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-07 13:16:38 -07:00
Ben Dooks
2ca5bb54bd riscv: cpu_ops_sbi: smp_processor_id() returns int, not unsigned int
The print in sbi_cpu_stop() assumes smp_processor_id() returns an
unsigned int, when it is actually an int. Fix the format string to
avoid mismatch type warnings in rht pr_crit().

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Link: https://patch.msgid.link/20260102145839.657864-1-ben.dooks@codethink.co.uk
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-07 13:03:16 -07:00
Lukas Bulwahn
003c03a4b4 riscv: configs: Clean up references to non-existing configs
- Drop 'CONFIG_I2C_COMPAT is not set' (removed in commit 7e722083fc
    ("i2c: Remove I2C_COMPAT config symbol and related code"))
  - Drop 'CONFIG_SCHED_DEBUG is not set' (removed in commit b52173065e
    ("sched/debug: Remove CONFIG_SCHED_DEBUG"))

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Link: https://patch.msgid.link/20260107092425.24737-1-lukas.bulwahn@redhat.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-07 12:49:06 -07:00
Menglong Dong
8f3e00af8e riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK
The usage of BPF_TRAMP_F_ORIG_STACK in __arch_prepare_bpf_trampoline() is
wrong, and it should be BPF_TRAMP_F_CALL_ORIG, which caused crash as
Andreas reported:

  Insufficient stack space to handle exception!
  Task stack:     [0xff20000000010000..0xff20000000014000]
  Overflow stack: [0xff600000ffdad070..0xff600000ffdae070]
  CPU: 1 UID: 0 PID: 1 Comm: systemd Not tainted 6.18.0-rc5+ #15 PREEMPT(voluntary)
  Hardware name: riscv-virtio qemu/qemu, BIOS 2025.10 10/01/2025
  epc : copy_from_kernel_nofault+0xa/0x198
   ra : bpf_probe_read_kernel+0x20/0x60
  epc : ffffffff802b732a ra : ffffffff801e6070 sp : ff2000000000ffe0
   gp : ffffffff82262ed0 tp : 0000000000000000 t0 : ffffffff80022320
   t1 : ffffffff801e6056 t2 : 0000000000000000 s0 : ff20000000010040
   s1 : 0000000000000008 a0 : ff20000000010050 a1 : ff60000083b3d320
   a2 : 0000000000000008 a3 : 0000000000000097 a4 : 0000000000000000
   a5 : 0000000000000000 a6 : 0000000000000021 a7 : 0000000000000003
   s2 : ff20000000010050 s3 : ff6000008459fc18 s4 : ff60000083b3d340
   s5 : ff20000000010060 s6 : 0000000000000000 s7 : ff20000000013aa8
   s8 : 0000000000000000 s9 : 0000000000008000 s10: 000000000058dcb0
   s11: 000000000058dca7 t3 : 000000006925116d t4 : ff6000008090f026
   t5 : 00007fff9b0cbaa8 t6 : 0000000000000016
  status: 0000000200000120 badaddr: 0000000000000000 cause: 8000000000000005
  Kernel panic - not syncing: Kernel stack overflow
  CPU: 1 UID: 0 PID: 1 Comm: systemd Not tainted 6.18.0-rc5+ #15 PREEMPT(voluntary)
  Hardware name: riscv-virtio qemu/qemu, BIOS 2025.10 10/01/2025
  Call Trace:
  [<ffffffff8001a1f8>] dump_backtrace+0x28/0x38
  [<ffffffff80002502>] show_stack+0x3a/0x50
  [<ffffffff800122be>] dump_stack_lvl+0x56/0x80
  [<ffffffff80012300>] dump_stack+0x18/0x22
  [<ffffffff80002abe>] vpanic+0xf6/0x328
  [<ffffffff80002d2e>] panic+0x3e/0x40
  [<ffffffff80019ef0>] handle_bad_stack+0x98/0xa0
  [<ffffffff801e6070>] bpf_probe_read_kernel+0x20/0x60

Just fix it.

Fixes: 47c9214dcb ("bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME")
Link: https://lore.kernel.org/bpf/20251219142948.204312-1-dongml2@chinatelecom.cn
Closes: https://lore.kernel.org/bpf/874ipnkfvt.fsf@igel.home/
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-06 11:40:48 -08:00
Soham Metha
938d79ec2b riscv: kexec_image: Fix dead link to boot-image-header.rst
Fix the reference to 'boot-image-header.rst', which was moved to
'Documentation/arch/riscv/' in commit 'ed843ae947f8'
("docs: move riscv under arch").

Signed-off-by: Soham Metha <sohammetha01@gmail.com>
Link: https://patch.msgid.link/20251203194355.63265-1-sohammetha01@gmail.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-05 17:54:11 -07:00
Guo Ren (Alibaba DAMO Academy)
5e5be092ff riscv: pgtable: Cleanup useless VA_USER_XXX definitions
These marcos are not used after commit b5b4287acc ("riscv: mm: Use
hint address in mmap if available"). Cleanup VA_USER_XXX definitions
in asm/pgtable.h.

Fixes: b5b4287acc ("riscv: mm: Use hint address in mmap if available")
Signed-off-by: Guo Ren (Alibaba DAMO Academy) <guoren@kernel.org>
Reviewed-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://patch.msgid.link/20251201005850.702569-1-guoren@kernel.org
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-05 17:49:18 -07:00
Guodong Xu
8632180daf riscv: cpufeature: Fix Zk bundled extension missing Zknh
The Zk extension is a bundle consisting of Zkn, Zkr, and Zkt. The Zkn
extension itself is a bundle consisting of Zbkb, Zbkc, Zbkx, Zknd, Zkne,
and Zknh.

The current implementation of riscv_zk_bundled_exts manually listed
the dependencies but missed RISCV_ISA_EXT_ZKNH.

Fix this by introducing a RISCV_ISA_EXT_ZKN macro that lists the Zkn
components and using it in both riscv_zk_bundled_exts and
riscv_zkn_bundled_exts.

This adds the missing Zknh extension to Zk and reduces code duplication.

Fixes: 0d8295ed97 ("riscv: add ISA extension parsing for scalar crypto")
Link: https://patch.msgid.link/20231114141256.126749-4-cleger@rivosinc.com/
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://patch.msgid.link/20251223-zk-missing-zknh-v1-1-b627c990ee1a@riscstar.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-05 17:40:44 -07:00
Jiakai Xu
641ecc8900 riscv: fix KUnit test_kprobes crash when building with Clang
Clang misinterprets the placement of test_kprobes_addresses and
test_kprobes_functions arrays when they are not explicitly assigned
to a data section. This can lead to kmalloc_array() allocation
errors and KUnit failures.

When testing the Clang-compiled code in QEMU, this warning was emitted:

WARNING: CPU: 1 PID: 3000 at mm/page_alloc.c:5159 __alloc_frozen_pages_noprof+0xe6/0x2fc mm/page_alloc.c:5159

Further investigation revealed that the test_kprobes_addresses array
appeared to have over 100,000 elements, including invalid addresses;
whereas, according to test-kprobes-asm.S, test_kprobes_addresses
should only have 25 elements.

When compiling the kernel with GCC, the kernel boots correctly.

This patch fixes the issue by adding .section .rodata to explicitly
place arrays in the read-only data segment.

For detailed debug and analysis, see:
https://github.com/j1akai/temp/blob/main/20251113/readme.md

v1 -> v2:
- Drop changes to .align, and .globl.

Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn>
Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Link: https://patch.msgid.link/738dd4e2.ff73.19a7cd7b4d5.Coremail.xujiakai2025@iscas.ac.cn
Link: https://github.com/llvm/llvm-project/issues/168308
Link: https://patch.msgid.link/20251226032317.1523764-1-jiakaiPeanut@gmail.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-30 20:10:52 -07:00
Lukas Gerlach
25fd7ee7bf riscv: Sanitize syscall table indexing under speculation
The syscall number is a user-controlled value used to index into the
syscall table. Use array_index_nospec() to clamp this value after the
bounds check to prevent speculative out-of-bounds access and subsequent
data leakage via cache side channels.

Signed-off-by: Lukas Gerlach <lukas.gerlach@cispa.de>
Link: https://patch.msgid.link/20251218191332.35849-3-lukas.gerlach@cispa.de
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-30 19:57:55 -07:00
Vivian Wang
66562b66dc riscv: boot: Always make Image from vmlinux, not vmlinux.unstripped
Since commit 4b47a3aefb ("kbuild: Restore pattern to avoid stripping
.rela.dyn from vmlinux") vmlinux has .rel*.dyn preserved. Therefore, use
vmlinux to produce Image, not vmlinux.unstripped.

Doing so fixes booting a RELOCATABLE=y Image with kexec. The problem is
caused by this chain of events:

- Since commit 3e86e4d74c ("kbuild: keep .modinfo section in
  vmlinux.unstripped"), vmlinux.unstripped gets a .modinfo section.
- The .modinfo section has SHF_ALLOC, so it ends up in Image, at the end
  of it.
- The Image header's image_size field does not expect to include
  .modinfo and does not account for it, since it should not be in Image.
- If .modinfo is large enough, the file size of Image ends up larger
  than image_size, which eventually leads to it failing
  sanity_check_segment_list().

Using vmlinux instead of vmlinux.unstripped means that the unexpected
.modinfo section is gone from Image, fixing the file size problem.

Cc: stable@vger.kernel.org
Fixes: 3e86e4d74c ("kbuild: keep .modinfo section in vmlinux.unstripped")
Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Han Gao <gaohan@iscas.ac.cn>
Link: https://patch.msgid.link/20251230-riscv-vmlinux-not-unstripped-v1-1-15f49df880df@iscas.ac.cn
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-30 19:57:55 -07:00
Himanshu Chauhan
5efaf92da4 riscv: Add SBI debug trigger extension and function ids
Debug trigger extension is an SBI extension to support native debugging
in S-mode and VS-mode. This patch adds the extension and the function
IDs defined by the extension.

Signed-off-by: Himanshu Chauhan <hchauhan@ventanamicro.com>
Link: https://patch.msgid.link/20250710125231.653967-2-hchauhan@ventanamicro.com
[pjw@kernel.org: updated to apply]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19 00:22:30 -07:00
Zongmin Zhou
f02dd25472 riscv/atomic.h: use RISCV_FULL_BARRIER in _arch_atomic* function.
Replace the same code with the pre-defined macro
RISCV_FULL_BARRIER to simplify the code.

Signed-off-by: Zongmin Zhou <zhouzongmin@kylinos.cn>
Link: https://patch.msgid.link/20251120095831.64211-1-min_halo@163.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19 00:22:30 -07:00
Pincheng Wang
6118ebed3b riscv: hwprobe: export Zilsd and Zclsd ISA extensions
Export Zilsd and Zclsd ISA extensions through hwprobe.

Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Link: https://patch.msgid.link/20250826162939.1494021-4-pincheng.plct@isrc.iscas.ac.cn
[pjw@kernel.org: fixed whitespace; updated to apply]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19 00:22:30 -07:00
Pincheng Wang
3f0cbfb8a1 riscv: add ISA extension parsing for Zilsd and Zclsd
Add parsing for Zilsd and Zclsd ISA extensions which were ratified in
commit f88abf1 ("Integrating load/store pair for RV32 with the
main manual") of the riscv-isa-manual.

Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Link: https://patch.msgid.link/20250826162939.1494021-3-pincheng.plct@isrc.iscas.ac.cn
[pjw@kernel.org: cleaned up checkpatch issues, whitespace; updated to apply]
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19 00:18:34 -07:00
Paul Walmsley
e0e51a0de0 riscv: mm: use xchg() on non-atomic_long_t variables, not atomic_long_xchg()
Let's not call atomic_long_xchg() on something that's not an
atomic_long_t, and just use xchg() instead.  Continues the cleanup
from commit 546e42c8c6 ("riscv: Use an atomic xchg in
pudp_huge_get_and_clear()"),

Cc: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19 00:18:33 -07:00
Paul Walmsley
425cc087fb riscv: mm: ptep_get_and_clear(): avoid atomic ops when !CONFIG_SMP
When !CONFIG_SMP, there's no need for atomic operations in
ptep_get_and_clear(), so, similar to x86, let's not use atomics in
this case.

Cc: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19 00:18:33 -07:00
Paul Walmsley
1e6084d5c4 riscv: mm: pmdp_huge_get_and_clear(): avoid atomic ops when !CONFIG_SMP
When !CONFIG_SMP, there's no need for atomic operations in
pmdp_huge_get_and_clear(), so, similar to what x86 does, let's not use
atomics in this case.  See also commit 546e42c8c6 ("riscv: Use an
atomic xchg in pudp_huge_get_and_clear()").

Cc: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19 00:18:33 -07:00
Andy Chiu
818d78ba1b riscv: signal: abstract header saving for setup_sigcontext
The function save_v_state() served two purposes. First, it saved
extension context into the signal stack. Then, it constructed the
extension header if there was no fault. The second part is independent
of the extension itself. As a result, we can pull that part out, so
future extensions may reuse it. This patch adds arch_ext_list and makes
setup_sigcontext() go through all possible extensions' save() callback.
The callback returns a positive value indicating the size of the
successfully saved extension. Then the kernel proceeds to construct the
header for that extension. The kernel skips an extension if it does
not exist, or if the saving fails for some reasons. The error code is
propagated out on the later case.

This patch does not introduce any functional changes.

Signed-off-by: Andy Chiu <andybnac@gmail.com>
Link: https://patch.msgid.link/20251112-v5_user_cfi_series-v23-16-b55691eacf4f@rivosinc.com
Signed-off-by: Paul Walmsley <pjw@kernel.org>
2025-12-19 00:18:33 -07:00
Eric Biggers
1cd5bb6e9e lib/crypto: riscv: Depend on RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
Replace the RISCV_ISA_V dependency of the RISC-V crypto code with
RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS, which implies RISCV_ISA_V as
well as vector unaligned accesses being efficient.

This is necessary because this code assumes that vector unaligned
accesses are supported and are efficient.  (It does so to avoid having
to use lots of extra vsetvli instructions to switch the element width
back and forth between 8 and either 32 or 64.)

This was omitted from the code originally just because the RISC-V kernel
support for detecting this feature didn't exist yet.  Support has now
been added, but it's fragmented into per-CPU runtime detection, a
command-line parameter, and a kconfig option.  The kconfig option is the
only reasonable way to do it, though, so let's just rely on that.

Fixes: eb24af5d7a ("crypto: riscv - add vector crypto accelerated AES-{ECB,CBC,CTR,XTS}")
Fixes: bb54668837 ("crypto: riscv - add vector crypto accelerated ChaCha20")
Fixes: 600a3853df ("crypto: riscv - add vector crypto accelerated GHASH")
Fixes: 8c8e40470f ("crypto: riscv - add vector crypto accelerated SHA-{256,224}")
Fixes: b3415925a0 ("crypto: riscv - add vector crypto accelerated SHA-{512,384}")
Fixes: 563a5255af ("crypto: riscv - add vector crypto accelerated SM3")
Fixes: b8d06352bb ("crypto: riscv - add vector crypto accelerated SM4")
Cc: stable@vger.kernel.org
Reported-by: Vivian Wang <wangruikang@iscas.ac.cn>
Closes: https://lore.kernel.org/r/b3cfcdac-0337-4db0-a611-258f2868855f@iscas.ac.cn/
Reviewed-by: Jerry Shih <jerry.shih@sifive.com>
Link: https://lore.kernel.org/r/20251206213750.81474-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-12-09 15:10:21 -08:00
Linus Torvalds
edf602a17b Merge tag 'tty-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
 "Here is the big set of tty/serial driver changes for 6.19-rc1. Nothing
  major at all, just small constant churn to make the tty layer
  "cleaner" as well as serial driver updates and even a new test added!
  Included in here are:

   - More tty/serial cleanups from Jiri

   - tty tiocsti test added to hopefully ensure we don't regress in this
     area again

   - sc16is7xx driver updates

   - imx serial driver updates

   - 8250 driver updates

   - new hardware device ids added

   - other minor serial/tty driver cleanups and tweaks

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (60 commits)
  serial: sh-sci: Fix deadlock during RSCI FIFO overrun error
  dt-bindings: serial: rsci: Drop "uart-has-rtscts: false"
  LoongArch: dts: Add uart new compatible string
  serial: 8250: Add Loongson uart driver support
  dt-bindings: serial: 8250: Add Loongson uart compatible
  serial: 8250: add driver for KEBA UART
  serial: Keep rs485 settings for devices without firmware node
  serial: qcom-geni: Enable Serial on SA8255p Qualcomm platforms
  serial: qcom-geni: Enable PM runtime for serial driver
  serial: sprd: Return -EPROBE_DEFER when uart clock is not ready
  tty: serial: samsung: Declare earlycon for Exynos850
  serial: icom: Convert PCIBIOS_* return codes to errnos
  serial: 8250-of: Fix style issues in 8250_of.c
  serial: add support of CPCI cards
  serial: mux: Fix kernel doc for mux_poll()
  tty: replace use of system_unbound_wq with system_dfl_wq
  serial: 8250_platform: simplify IRQF_SHARED handling
  serial: 8250: make share_irqs local to 8250_platform
  serial: 8250: move skip_txen_test to core
  serial: drop SERIAL_8250_DEPRECATED_OPTIONS
  ...
2025-12-06 18:38:19 -08:00
Linus Torvalds
66a1025f7f Merge tag 'soc-newsoc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull new SoC families update from Arnd Bergmann:
 "These three new families of SoC are split out into a separate branch
  because they touch multiple parts of the source tree and are better
  left separate for the initial merge.

   - Black Sesame Technologies C1200 is an automotive SoC using
     Cortex-A78 CPU cores

   - Anlogic dr1v90 (not to be confused with Amlogic) is an FPGA
     platform using a single nuclei ux900 RISC-V core

   - Tenstorrent Blackhole is a Neural Processing Unit using custom
     "Tensix" cores for computation offload managed by Linux running on
     SiFive X280 RISC-V cores.

  Support for all three is rather rudimentary at the moment and will get
  improved as device drivers are merged through other tree"

* tag 'soc-newsoc-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
  MAINTAINERS: add Black Sesame Technologies (BST) ARM SoC support
  arm64: defconfig: enable BST platform support
  arm64: dts: bst: add support for Black Sesame Technologies C1200 CDCU1.0 board
  arm64: Kconfig: add ARCH_BST for Black Sesame Technologies SoCs
  dt-bindings: arm: add Black Sesame Technologies (bst) SoC
  dt-bindings: vendor-prefixes: Add Black Sesame Technologies Co., Ltd.
  MAINTAINERS: Setup support for Anlogic tree
  riscv: defconfig: Enable Anlogic SoC
  riscv: dts: anlogic: Add Milianke MLKPAI FS01 board
  riscv: dts: Add initial Anlogic DR1V90 SoC device tree
  riscv: Add Anlogic SoC famly Kconfig support
  dt-bindings: serial: snps-dw-apb-uart: Add Anlogic DR1V90 uart
  dt-bindings: timer: Add Anlogic DR1V90 ACLINT MTIMER
  dt-bindings: riscv: Add Anlogic DR1V90
  dt-bindings: riscv: Add Nuclei UX900 compatibles
  dt-bindings: vendor-prefixes: Add Anlogic, Milianke and Nuclei
  riscv: defconfig: Enable Tenstorrent SoCs
  riscv: Kconfig.socs: Add ARCH_TENSTORRENT for Tenstorrent SoCs
  riscv: dts: Add Tenstorrent Blackhole SoC PCIe cards
  dt-bindings: interrupt-controller: Add Tenstorrent Blackhole compatible
  ...
2025-12-05 17:27:12 -08:00
Linus Torvalds
0cac5ce06e Merge tag 'soc-dt-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC devicetree updates from Arnd Bergmann:
 "Three new SoCs got added in existing arm64 chip families:

   - Renesas R-Car X5H (R8A78000) is a new generation of automotive
     SoCs, based on 16 Cortex-A720 (Armv9.2) cores, which makes the the
     currently highest-perforance embedded SoC.

   - TI AM62L is a new variant of the AM62 family of industrial SoCs,
     this one comes without a GPU.

   - Qualcomm MSM8937 (Snapdragon 430) is an older mobile phone chip
     based on Cortex-A53, and closely related to MSM8917 (Snapdragn
     425), which we already support.

  In addition, there are a good number of newly supported machines
  across SoC families:

   - Two Aspeed AST2600 (Cortex-A7) based BMC setups for large servers

   - Mobile Phones and tables based on Mediatek MT6582, Nvidia Tegra124,
     Qualcomm MSM8937 and Qualcomm MSM8939,

   - Two Laptops based on Qualcomm SoCs: one using the older sdm850, the
     other using x1p42100.

   - One Router based on Rockchips RK3568

   - 24 variants of the Enclustra Mercury system-on-module, all based on
     32-bit Intel/Altera SocFPGA chips, plus two boards using 64-bit
     SocFPGA Agilex chips..

   - 30 industrial/embedded boards and single-board computers, using
     various chips from NXP, Rockchips, Mediatek, TI, Amlogic, Qualcomm,
     Spacemit, and Starfive.

  In total there are 783 commits here, the majority of these improving
  hardware support and cleaning up devicetree files across the tree,
  with the majority of the changes going into the Qualcomm, NXP, Renesas
  and Rockchips platforms"

* tag 'soc-dt-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (782 commits)
  arm64: dts: mediatek: mt8195: Fix address range for JPEG decoder core 1
  ARM: dts: samsung: exynos4412-midas: turn off SDIO WLAN chip during system suspend
  ARM: dts: samsung: exynos4210-trats: turn off SDIO WLAN chip during system suspend
  ARM: dts: samsung: exynos4210-i9100: turn off SDIO WLAN chip during system suspend
  ARM: dts: samsung: universal_c210: turn off SDIO WLAN chip during system suspend
  arm64: dts: amlogic: meson-g12b: Fix L2 cache reference for S922X CPUs
  arm64: dts: Add gpio_intc node for Amlogic S7D SoCs
  arm64: dts: Add gpio_intc node for Amlogic S7 SoCs
  arm64: dts: Add gpio_intc node for Amlogic S6 SoCs
  arm64: dts: amlogic: s7d: add ao secure node
  arm64: dts: amlogic: s7: add ao secure node
  arm64: dts: amlogic: s6: add ao secure node
  arm64: dts: amlogic: Fix the register name of the 'DBI' region
  dts: arm64: amlogic: add a5 pinctrl node
  arm64: dts: amlogic: s7d: add power domain controller node
  arm64: dts: amlogic: s7: add power domain controller node
  arm64: dts: amlogic: s6: add power domain controller node
  dts: arm64: amlogic: Add ISP related nodes for C3
  arm64: dts: meson: add initial device-tree for Tanix TX9 Pro
  dt-bindings: arm: amlogic: add support for Tanix TX9 Pro
  ...
2025-12-05 17:24:29 -08:00
Linus Torvalds
b4c6c76e40 Merge tag 'soc-defconfig-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC defconfig updates from Arnd Bergmann:
 "As usual, a number of newly added drivers get enabled in the arm64
  defconfig, in addition to minor housekeeping work on defconfig files
  for arm32, arm64 and riscv"

* tag 'soc-defconfig-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
  arm64: defconfig: enable Exynos ACPM clocks
  arm64: defconfig: Remove the redundant SCHED_MC/SCHED_SMT
  ARM: multi_v7_defconfig: Enable TI PRU Ethernet driver
  arm64: defconfig: enable i.MX AIPSTZ driver
  ARM: mxs_defconfig: enable sound drivers for imx28-amarula-rmm
  arm64: defconfig: Enable i.MX95 drivers for pinctrl, Ethernet and PCIe
  arm64: defconfig: enable rockchip camera interface
  ARM: tegra: Enable EXT4 for Tegra
  arm64: defconfig: Enable NVIDIA VRS PSEQ RTC
  arm64: defconfig: Enable SX150x GPIO expander driver
  riscv: defconfig: enable SPI_FSL_QUADSPI as a module
  ARM: at91: at91_dt_defconfig: set MMC_SPI to module
  arm64: defconfig: Build NSS clock controller driver for IPQ5424
  arm64: defconfig: Enable SCSI UFS Crypto and Block Inline encryption drivers
  arm64: defconfig: Add M31 eUSB2 PHY config
  arm64: defconfig: Enable configs for Fairphone 3, 4, 5 smartphones
  arm64: defconfig: Enable two Novatek display panels for MTP8750 and Tianma
  arm64: defconfig: Enable RZ/T2H / RZ/N2H ADC driver
  ARM: shmobile: defconfig: Refresh for v6.18-rc1
  arm64: defconfig: Enable DW HDMI QP CEC support
  ...
2025-12-05 17:22:09 -08:00
Linus Torvalds
51d90a15fe Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
 "ARM:

   - Support for userspace handling of synchronous external aborts
     (SEAs), allowing the VMM to potentially handle the abort in a
     non-fatal manner

   - Large rework of the VGIC's list register handling with the goal of
     supporting more active/pending IRQs than available list registers
     in hardware. In addition, the VGIC now supports EOImode==1 style
     deactivations for IRQs which may occur on a separate vCPU than the
     one that acked the IRQ

   - Support for FEAT_XNX (user / privileged execute permissions) and
     FEAT_HAF (hardware update to the Access Flag) in the software page
     table walkers and shadow MMU

   - Allow page table destruction to reschedule, fixing long
     need_resched latencies observed when destroying a large VM

   - Minor fixes to KVM and selftests

  Loongarch:

   - Get VM PMU capability from HW GCFG register

   - Add AVEC basic support

   - Use 64-bit register definition for EIOINTC

   - Add KVM timer test cases for tools/selftests

  RISC/V:

   - SBI message passing (MPXY) support for KVM guest

   - Give a new, more specific error subcode for the case when in-kernel
     AIA virtualization fails to allocate IMSIC VS-file

   - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually
     in small chunks

   - Fix guest page fault within HLV* instructions

   - Flush VS-stage TLB after VCPU migration for Andes cores

  s390:

   - Always allocate ESCA (Extended System Control Area), instead of
     starting with the basic SCA and converting to ESCA with the
     addition of the 65th vCPU. The price is increased number of exits
     (and worse performance) on z10 and earlier processor; ESCA was
     introduced by z114/z196 in 2010

   - VIRT_XFER_TO_GUEST_WORK support

   - Operation exception forwarding support

   - Cleanups

  x86:

   - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO
     SPTE caching is disabled, as there can't be any relevant SPTEs to
     zap

   - Relocate a misplaced export

   - Fix an async #PF bug where KVM would clear the completion queue
     when the guest transitioned in and out of paging mode, e.g. when
     handling an SMI and then returning to paged mode via RSM

   - Leave KVM's user-return notifier registered even when disabling
     virtualization, as long as kvm.ko is loaded. On reboot/shutdown,
     keeping the notifier registered is ok; the kernel does not use the
     MSRs and the callback will run cleanly and restore host MSRs if the
     CPU manages to return to userspace before the system goes down

   - Use the checked version of {get,put}_user()

   - Fix a long-lurking bug where KVM's lack of catch-up logic for
     periodic APIC timers can result in a hard lockup in the host

   - Revert the periodic kvmclock sync logic now that KVM doesn't use a
     clocksource that's subject to NTP corrections

   - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the
     latter behind CONFIG_CPU_MITIGATIONS

   - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast
     path; the only reason they were handled in the fast path was to
     paper of a bug in the core #MC code, and that has long since been
     fixed

   - Add emulator support for AVX MOV instructions, to play nice with
     emulated devices whose guest drivers like to access PCI BARs with
     large multi-byte instructions

  x86 (AMD):

   - Fix a few missing "VMCB dirty" bugs

   - Fix the worst of KVM's lack of EFER.LMSLE emulation

   - Add AVIC support for addressing 4k vCPUs in x2AVIC mode

   - Fix incorrect handling of selective CR0 writes when checking
     intercepts during emulation of L2 instructions

   - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32]
     on VMRUN and #VMEXIT

   - Fix a bug where KVM corrupt the guest code stream when re-injecting
     a soft interrupt if the guest patched the underlying code after the
     VM-Exit, e.g. when Linux patches code with a temporary INT3

   - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits
     to userspace, and extend KVM "support" to all policy bits that
     don't require any actual support from KVM

  x86 (Intel):

   - Use the root role from kvm_mmu_page to construct EPTPs instead of
     the current vCPU state, partly as worthwhile cleanup, but mostly to
     pave the way for tracking per-root TLB flushes, and elide EPT
     flushes on pCPU migration if the root is clean from a previous
     flush

   - Add a few missing nested consistency checks

   - Rip out support for doing "early" consistency checks via hardware
     as the functionality hasn't been used in years and is no longer
     useful in general; replace it with an off-by-default module param
     to WARN if hardware fails a check that KVM does not perform

   - Fix a currently-benign bug where KVM would drop the guest's
     SPEC_CTRL[63:32] on VM-Enter

   - Misc cleanups

   - Overhaul the TDX code to address systemic races where KVM (acting
     on behalf of userspace) could inadvertantly trigger lock contention
     in the TDX-Module; KVM was either working around these in weird,
     ugly ways, or was simply oblivious to them (though even Yan's
     devilish selftests could only break individual VMs, not the host
     kernel)

   - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a
     TDX vCPU, if creating said vCPU failed partway through

   - Fix a few sparse warnings (bad annotation, 0 != NULL)

   - Use struct_size() to simplify copying TDX capabilities to userspace

   - Fix a bug where TDX would effectively corrupt user-return MSR
     values if the TDX Module rejects VP.ENTER and thus doesn't clobber
     host MSRs as expected

  Selftests:

   - Fix a math goof in mmu_stress_test when running on a single-CPU
     system/VM

   - Forcefully override ARCH from x86_64 to x86 to play nice with
     specifying ARCH=x86_64 on the command line

   - Extend a bunch of nested VMX to validate nested SVM as well

   - Add support for LA57 in the core VM_MODE_xxx macro, and add a test
     to verify KVM can save/restore nested VMX state when L1 is using
     5-level paging, but L2 is not

   - Clean up the guest paging code in anticipation of sharing the core
     logic for nested EPT and nested NPT

  guest_memfd:

   - Add NUMA mempolicy support for guest_memfd, and clean up a variety
     of rough edges in guest_memfd along the way

   - Define a CLASS to automatically handle get+put when grabbing a
     guest_memfd from a memslot to make it harder to leak references

   - Enhance KVM selftests to make it easer to develop and debug
     selftests like those added for guest_memfd NUMA support, e.g. where
     test and/or KVM bugs often result in hard-to-debug SIGBUS errors

   - Misc cleanups

  Generic:

   - Use the recently-added WQ_PERCPU when creating the per-CPU
     workqueue for irqfd cleanup

   - Fix a goof in the dirty ring documentation

   - Fix choice of target for directed yield across different calls to
     kvm_vcpu_on_spin(); the function was always starting from the first
     vCPU instead of continuing the round-robin search"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits)
  KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS
  KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2
  KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX}
  KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"
  KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot()
  KVM: arm64: Add endian casting to kvm_swap_s[12]_desc()
  KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n
  KVM: arm64: selftests: Add test for AT emulation
  KVM: arm64: nv: Expose hardware access flag management to NV guests
  KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW
  KVM: arm64: Implement HW access flag management in stage-1 SW PTW
  KVM: arm64: Propagate PTW errors up to AT emulation
  KVM: arm64: Add helper for swapping guest descriptor
  KVM: arm64: nv: Use pgtable definitions in stage-2 walk
  KVM: arm64: Handle endianness in read helper for emulated PTW
  KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW
  KVM: arm64: Call helper for reading descriptors directly
  KVM: arm64: nv: Advertise support for FEAT_XNX
  KVM: arm64: Teach ptdump about FEAT_XNX permissions
  KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions
  ...
2025-12-05 17:01:20 -08:00
Linus Torvalds
07025b51c1 Merge tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:

 - Enable parallel hotplug for RISC-V

 - Optimize vector regset allocation for ptrace()

 - Add a kernel selftest for the vector ptrace interface

 - Enable the userspace RAID6 test to build and run using RISC-V vectors

 - Add initial support for the Zalasr RISC-V ratified ISA extension

 - For the Zicbop RISC-V ratified ISA extension to userspace, expose
   hardware and kernel support to userspace and add a kselftest for
   Zicbop

 - Convert open-coded instances of 'asm goto's that are controlled by
   runtime ALTERNATIVEs to use riscv_has_extension_{un,}likely(),
   following arm64's alternative_has_cap_{un,}likely()

 - Remove an unnecessary mask in the GFP flags used in some calls to
   pagetable_alloc()

* tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  selftests/riscv: Add Zicbop prefetch test
  riscv: hwprobe: Expose Zicbop extension and its block size
  riscv: Introduce Zalasr instructions
  riscv: hwprobe: Export Zalasr extension
  dt-bindings: riscv: Add Zalasr ISA extension description
  riscv: Add ISA extension parsing for Zalasr
  selftests: riscv: Add test for the Vector ptrace interface
  riscv: ptrace: Optimize the allocation of vector regset
  raid6: test: Add support for RISC-V
  raid6: riscv: Allow code to be compiled in userspace
  raid6: riscv: Prevent compiler from breaking inline vector assembly code
  riscv: cmpxchg: Use riscv_has_extension_likely
  riscv: bitops: Use riscv_has_extension_likely
  riscv: hweight: Use riscv_has_extension_likely
  riscv: checksum: Use riscv_has_extension_likely
  riscv: pgtable: Use riscv_has_extension_unlikely
  riscv: Remove __GFP_HIGHMEM masking
  RISC-V: Enable HOTPLUG_PARALLEL for secondary CPUs
2025-12-05 16:26:57 -08:00
Linus Torvalds
7203ca412f Merge tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:

  "__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki)
     Rework the vmalloc() code to support non-blocking allocations
     (GFP_ATOIC, GFP_NOWAIT)

  "ksm: fix exec/fork inheritance" (xu xin)
     Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not
     inherited across fork/exec

  "mm/zswap: misc cleanup of code and documentations" (SeongJae Park)
     Some light maintenance work on the zswap code

  "mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira)
     Enhance the /sys/kernel/debug/page_owner debug feature by adding
     unique identifiers to differentiate the various stack traces so
     that userspace monitoring tools can better match stack traces over
     time

  "mm/page_alloc: pcp->batch cleanups" (Joshua Hahn)
     Minor alterations to the page allocator's per-cpu-pages feature

  "Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra)
     Address a scalability issue in userfaultfd's UFFDIO_MOVE operation

  "kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov)

  "drivers/base/node: fold node register and unregister functions" (Donet Tom)
     Clean up the NUMA node handling code a little

  "mm: some optimizations for prot numa" (Kefeng Wang)
     Cleanups and small optimizations to the NUMA allocation hinting
     code

  "mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn)
     Address long lock hold times at boot on large machines. These were
     causing (harmless) softlockup warnings

  "optimize the logic for handling dirty file folios during reclaim" (Baolin Wang)
     Remove some now-unnecessary work from page reclaim

  "mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park)
     Enhance the DAMOS auto-tuning feature

  "mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan)
     Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace
     configuration

  "expand mmap_prepare functionality, port more users" (Lorenzo Stoakes)
     Enhance the new(ish) file_operations.mmap_prepare() method and port
     additional callsites from the old ->mmap() over to ->mmap_prepare()

  "Fix stale IOTLB entries for kernel address space" (Lu Baolu)
     Fix a bug (and possible security issue on non-x86) in the IOMMU
     code. In some situations the IOMMU could be left hanging onto a
     stale kernel pagetable entry

  "mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang)
     Clean up and optimize the folio splitting code

  "mm, swap: misc cleanup and bugfix" (Kairui Song)
     Some cleanups and a minor fix in the swap discard code

  "mm/damon: misc documentation fixups" (SeongJae Park)

  "mm/damon: support pin-point targets removal" (SeongJae Park)
     Permit userspace to remove a specific monitoring target in the
     middle of the current targets list

  "mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo)
     A couple of cleanups related to mm header file inclusion

  "mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He)
     improve the selection of swap devices for NUMA machines

  "mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista)
     Change the memory block labels from macros to enums so they will
     appear in kernel debug info

  "ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes)
     Address an inefficiency when KSM unmerges an address range

  "mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park)
     Fix leaks and unhandled malloc() failures in DAMON userspace unit
     tests

  "some cleanups for pageout()" (Baolin Wang)
     Clean up a couple of minor things in the page scanner's
     writeback-for-eviction code

  "mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu)
     Move hugetlb's sysfs/sysctl handling code into a new file

  "introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes)
     Make the VMA guard regions available in /proc/pid/smaps and
     improves the mergeability of guarded VMAs

  "mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes)
     Reduce mmap lock contention for callers performing VMA guard region
     operations

  "vma_start_write_killable" (Matthew Wilcox)
     Start work on permitting applications to be killed when they are
     waiting on a read_lock on the VMA lock

  "mm/damon/tests: add more tests for online parameters commit" (SeongJae Park)
     Add additional userspace testing of DAMON's "commit" feature

  "mm/damon: misc cleanups" (SeongJae Park)

  "make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes)
     Address the possible loss of a VMA's VM_SOFTDIRTY flag when that
     VMA is merged with another

  "mm: support device-private THP" (Balbir Singh)
     Introduce support for Transparent Huge Page (THP) migration in zone
     device-private memory

  "Optimize folio split in memory failure" (Zi Yan)

  "mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang)
     Some more cleanups in the folio splitting code

  "mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes)
     Clean up our handling of pagetable leaf entries by introducing the
     concept of 'software leaf entries', of type softleaf_t

  "reparent the THP split queue" (Muchun Song)
     Reparent the THP split queue to its parent memcg. This is in
     preparation for addressing the long-standing "dying memcg" problem,
     wherein dead memcg's linger for too long, consuming memory
     resources

  "unify PMD scan results and remove redundant cleanup" (Wei Yang)
     A little cleanup in the hugepage collapse code

  "zram: introduce writeback bio batching" (Sergey Senozhatsky)
     Improve zram writeback efficiency by introducing batched bio
     writeback support

  "memcg: cleanup the memcg stats interfaces" (Shakeel Butt)
     Clean up our handling of the interrupt safety of some memcg stats

  "make vmalloc gfp flags usage more apparent" (Vishal Moola)
     Clean up vmalloc's handling of incoming GFP flags

  "mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang)
     Teach soft dirty and userfaultfd write protect tracking to use
     RISC-V's Svrsw60t59b extension

  "mm: swap: small fixes and comment cleanups" (Youngjun Park)
     Fix a small bug and clean up some of the swap code

  "initial work on making VMA flags a bitmap" (Lorenzo Stoakes)
     Start work on converting the vma struct's flags to a bitmap, so we
     stop running out of them, especially on 32-bit

  "mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park)
     Address a possible bug in the swap discard code and clean things
     up a little

[ This merge also reverts commit ebb9aeb980 ("vfio/nvgrace-gpu:
  register device memory for poison handling") because it looks
  broken to me, I've asked for clarification   - Linus ]

* tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm: fix vma_start_write_killable() signal handling
  mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate
  mm/swapfile: fix list iteration when next node is removed during discard
  fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling
  mm/kfence: add reboot notifier to disable KFENCE on shutdown
  memcg: remove inc/dec_lruvec_kmem_state helpers
  selftests/mm/uffd: initialize char variable to Null
  mm: fix DEBUG_RODATA_TEST indentation in Kconfig
  mm: introduce VMA flags bitmap type
  tools/testing/vma: eliminate dependency on vma->__vm_flags
  mm: simplify and rename mm flags function for clarity
  mm: declare VMA flags by bit
  zram: fix a spelling mistake
  mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
  mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
  pagemap: update BUDDY flag documentation
  mm: swap: remove scan_swap_map_slots() references from comments
  mm: swap: change swap_alloc_slow() to void
  mm, swap: remove redundant comment for read_swap_cache_async
  mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational
  ...
2025-12-05 13:52:43 -08:00
Linus Torvalds
015e7b0b0e Merge tag 'bpf-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:

 - Convert selftests/bpf/test_tc_edt and test_tc_tunnel from .sh to
   test_progs runner (Alexis Lothoré)

 - Convert selftests/bpf/test_xsk to test_progs runner (Bastien
   Curutchet)

 - Replace bpf memory allocator with kmalloc_nolock() in
   bpf_local_storage (Amery Hung), and in bpf streams and range tree
   (Puranjay Mohan)

 - Introduce support for indirect jumps in BPF verifier and x86 JIT
   (Anton Protopopov) and arm64 JIT (Puranjay Mohan)

 - Remove runqslower bpf tool (Hoyeon Lee)

 - Fix corner cases in the verifier to close several syzbot reports
   (Eduard Zingerman, KaFai Wan)

 - Several improvements in deadlock detection in rqspinlock (Kumar
   Kartikeya Dwivedi)

 - Implement "jmp" mode for BPF trampoline and corresponding
   DYNAMIC_FTRACE_WITH_JMP. It improves "fexit" program type performance
   from 80 M/s to 136 M/s. With Steven's Ack. (Menglong Dong)

 - Add ability to test non-linear skbs in BPF_PROG_TEST_RUN (Paul
   Chaignon)

 - Do not let BPF_PROG_TEST_RUN emit invalid GSO types to stack (Daniel
   Borkmann)

 - Generalize buildid reader into bpf_dynptr (Mykyta Yatsenko)

 - Optimize bpf_map_update_elem() for map-in-map types (Ritesh
   Oedayrajsingh Varma)

 - Introduce overwrite mode for BPF ring buffer (Xu Kuohai)

* tag 'bpf-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (169 commits)
  bpf: optimize bpf_map_update_elem() for map-in-map types
  bpf: make kprobe_multi_link_prog_run always_inline
  selftests/bpf: do not hardcode target rate in test_tc_edt BPF program
  selftests/bpf: remove test_tc_edt.sh
  selftests/bpf: integrate test_tc_edt into test_progs
  selftests/bpf: rename test_tc_edt.bpf.c section to expose program type
  selftests/bpf: Add success stats to rqspinlock stress test
  rqspinlock: Precede non-head waiter queueing with AA check
  rqspinlock: Disable spinning for trylock fallback
  rqspinlock: Use trylock fallback when per-CPU rqnode is busy
  rqspinlock: Perform AA checks immediately
  rqspinlock: Enclose lock/unlock within lock entry acquisitions
  bpf: Remove runqslower tool
  selftests/bpf: Remove usage of lsm/file_alloc_security in selftest
  bpf: Disable file_alloc_security hook
  bpf: check for insn arrays in check_ptr_alignment
  bpf: force BPF_F_RDONLY_PROG on insn array creation
  bpf: Fix exclusive map memory leak
  selftests/bpf: Make CS length configurable for rqspinlock stress test
  selftests/bpf: Add lock wait time stats to rqspinlock stress test
  ...
2025-12-03 16:54:54 -08:00
Paolo Bonzini
63a9b0bc65 Merge tag 'kvm-riscv-6.19-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.19

- SBI MPXY support for KVM guest
- New KVM_EXIT_FAIL_ENTRY_NO_VSFILE for the case when in-kernel
  AIA virtualization fails to allocate IMSIC VS-file
- Support enabling dirty log gradually in small chunks
- Fix guest page fault within HLV* instructions
- Flush VS-stage TLB after VCPU migration for Andes cores
2025-12-02 18:35:25 +01:00
Linus Torvalds
1dce50698a Merge tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scoped user access updates from Thomas Gleixner:
 "Scoped user mode access and related changes:

   - Implement the missing u64 user access function on ARM when
     CONFIG_CPU_SPECTRE=n.

     This makes it possible to access a 64bit value in generic code with
     [unsafe_]get_user(). All other architectures and ARM variants
     provide the relevant accessors already.

   - Ensure that ASM GOTO jump label usage in the user mode access
     helpers always goes through a local C scope label indirection
     inside the helpers.

     This is required because compilers are not supporting that a ASM
     GOTO target leaves a auto cleanup scope. GCC silently fails to emit
     the cleanup invocation and CLANG fails the build.

     [ Editor's note: gcc-16 will have fixed the code generation issue
       in commit f68fe3ddda4 ("eh: Invoke cleanups/destructors in asm
       goto jumps [PR122835]"). But we obviously have to deal with clang
       and older versions of gcc, so.. - Linus ]

     This provides generic wrapper macros and the conversion of affected
     architecture code to use them.

   - Scoped user mode access with auto cleanup

     Access to user mode memory can be required in hot code paths, but
     if it has to be done with user controlled pointers, the access is
     shielded with a speculation barrier, so that the CPU cannot
     speculate around the address range check. Those speculation
     barriers impact performance quite significantly.

     This cost can be avoided by "masking" the provided pointer so it is
     guaranteed to be in the valid user memory access range and
     otherwise to point to a guaranteed unpopulated address space. This
     has to be done without branches so it creates an address dependency
     for the access, which the CPU cannot speculate ahead.

     This results in repeating and error prone programming patterns:

       	    if (can_do_masked_user_access())
                      from = masked_user_read_access_begin((from));
              else if (!user_read_access_begin(from, sizeof(*from)))
                      return -EFAULT;
              unsafe_get_user(val, from, Efault);
              user_read_access_end();
              return 0;
        Efault:
              user_read_access_end();
              return -EFAULT;

      which can be replaced with scopes and automatic cleanup:

              scoped_user_read_access(from, Efault)
                      unsafe_get_user(val, from, Efault);
              return 0;
         Efault:
              return -EFAULT;

   - Convert code which implements the above pattern over to
     scope_user.*.access(). This also corrects a couple of imbalanced
     masked_*_begin() instances which are harmless on most
     architectures, but prevent PowerPC from implementing the masking
     optimization.

   - Add a missing speculation barrier in copy_from_user_iter()"

* tag 'core-uaccess-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lib/strn*,uaccess: Use masked_user_{read/write}_access_begin when required
  scm: Convert put_cmsg() to scoped user access
  iov_iter: Add missing speculation barrier to copy_from_user_iter()
  iov_iter: Convert copy_from_user_iter() to masked user access
  select: Convert to scoped user access
  x86/futex: Convert to scoped user access
  futex: Convert to get/put_user_inline()
  uaccess: Provide put/get_user_inline()
  uaccess: Provide scoped user access regions
  arm64: uaccess: Use unsafe wrappers for ASM GOTO
  s390/uaccess: Use unsafe wrappers for ASM GOTO
  riscv/uaccess: Use unsafe wrappers for ASM GOTO
  powerpc/uaccess: Use unsafe wrappers for ASM GOTO
  x86/uaccess: Use unsafe wrappers for ASM GOTO
  uaccess: Provide ASM GOTO safe wrappers for unsafe_*_user()
  ARM: uaccess: Implement missing __get_user_asm_dword()
2025-12-02 08:01:39 -08:00
Linus Torvalds
4a26e7032d Merge tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull bug handling infrastructure updates from Ingo Molnar:
 "Core updates:

   - Improve WARN(), which has vararg printf like arguments, to work
     with the x86 #UD based WARN-optimizing infrastructure by hiding the
     format in the bug_table and replacing this first argument with the
     address of the bug-table entry, while making the actual function
     that's called a UD1 instruction (Peter Zijlstra)

   - Introduce the CONFIG_DEBUG_BUGVERBOSE_DETAILED Kconfig switch (Ingo
     Molnar, s390 support by Heiko Carstens)

  Fixes and cleanups:

   - bugs/s390: Remove private WARN_ON() implementation (Heiko Carstens)

   - <asm/bugs.h>: Make i386 use GENERIC_BUG_RELATIVE_POINTERS (Peter
     Zijlstra)"

* tag 'core-bugs-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  x86/bugs: Make i386 use GENERIC_BUG_RELATIVE_POINTERS
  x86/bug: Fix BUG_FORMAT vs KASLR
  x86_64/bug: Inline the UD1
  x86/bug: Implement WARN_ONCE()
  x86_64/bug: Implement __WARN_printf()
  x86/bug: Use BUG_FORMAT for DEBUG_BUGVERBOSE_DETAILED
  x86/bug: Add BUG_FORMAT basics
  bug: Allow architectures to provide __WARN_printf()
  bug: Implement WARN_ON() using __WARN_FLAGS()
  bug: Add report_bug_entry()
  bug: Add BUG_FORMAT_ARGS infrastructure
  bug: Clean up CONFIG_GENERIC_BUG_RELATIVE_POINTERS
  bug: Add BUG_FORMAT infrastructure
  x86: Rework __bug_table helpers
  bugs/s390: Remove private WARN_ON() implementation
  bugs/core: Reorganize fields in the first line of WARNING output, add ->comm[] output
  bugs/sh: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output
  bugs/parisc: Concatenate 'cond_str' with '__FILE__' in __WARN_FLAGS(), to extend WARN_ON/BUG_ON output
  bugs/riscv: Concatenate 'cond_str' with '__FILE__' in __BUG_FLAGS(), to extend WARN_ON/BUG_ON output
  bugs/riscv: Pass in 'cond_str' to __BUG_FLAGS()
  ...
2025-12-01 21:33:01 -08:00
Linus Torvalds
7fa0d7744c Merge tag 'soc-fixes-6.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
 "A few last minute fixes came in this week:

   - interrupt and gpio numbers in foud separate i.MX8 specific
     devicetree files were wrong

   - The vector length property in the C906 CPU description used the
     wrong unit

   - Two bugs with uninitialized stack variables in the tee subsystem

   - Alexander Stein now maintains additional devicetree files"

* tag 'soc-fixes-6.18-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  riscv: dts: allwinner: d1: fix vlenb property
  MAINTAINERS: Add entry for TQ-Systems AM335 device trees
  tee: qcomtee: initialize result before use in release worker
  arm64: dts: imx8qm-mek: fix mux-controller select/enable-gpios polarity
  tee: qcomtee: fix uninitialized pointers with free attribute
  ARM: dts: nxp: imx6ul: correct SAI3 interrupt line
  arm64: dts: imx8dxl-ss-conn: swap interrupts number of eqos
  arm64: dts: imx8dxl: Correct pcie-ep interrupt number
2025-11-28 09:57:31 -08:00
Arnd Bergmann
3ecfcf34f0 Merge tag 'sunxi-fixes-for-6.18' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes
Allwinner fixes for 6.18

Just one fix to correct the "thead,vlenb" property for the RISC-V based
D1 SoC family.

* tag 'sunxi-fixes-for-6.18' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  riscv: dts: allwinner: d1: fix vlenb property
2025-11-28 17:37:13 +01:00
Arnd Bergmann
00de4ef9d3 Merge tag 'riscv-config-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/defconfig
RISC-V config for v6.19

Spacemit:
The Spacemit k1 wants the freescale qspi driver enabled as a module
as they appear to be rather similar IPs.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-config-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: defconfig: enable SPI_FSL_QUADSPI as a module

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-11-27 23:03:34 +01:00
Arnd Bergmann
3aa9940035 Merge tag 'riscv-dt-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt
RISC-V Devicetrees for v6.19

MAINTAINERS:
There's some re-jigging of things to reduce duplication, by moving me
into the StarFive entry and my tree into the Microchip one. The
other platforms that I look after (SiFive and Canaan) are marked as Odd
Fixes to better represent their status. Nothing functionally changes.

Microchip:
Add adc and mmc nodes for the Beagle-V Fire.

SiFive:
Add pwm fans to the unmatched board.

StarFive:
Add the Orange PI RV board, another VisionFive 2 derived SBC. This
required moving a mmc related nodes out of the common file, into
<board>.dts. Yet more things moved out of the common file when the
VisionFive 2 Lite boards were added, which use the JH7110S SoC instead of
the JH7110. The difference here between SoCs is just temperature and
frequency ranges, but the boards differ enough that the pool of common
nodes decreases a little further.  There's an eMMC and an SD variant
here, that are different SKUs, bringing the total new StarFive boards to
three.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-dt-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: add Orange Pi RV
  dt-bindings: riscv: starfive: add xunlong,orangepi-rv
  riscv: dts: starfive: Add VisionFive 2 Lite eMMC board device tree
  riscv: dts: starfive: Add VisionFive 2 Lite board device tree
  riscv: dts: starfive: Add common board dtsi for VisionFive 2 Lite variants
  riscv: dts: starfive: jh7110-common: Move out some nodes to the board dts
  dt-bindings: riscv: Add StarFive JH7110S SoC and VisionFive 2 Lite board
  MAINTAINERS: degrade RISC-V MISC SOC SUPPORT to Odd Fixes
  MAINTAINERS: add tree to RISC-V Microchip entry
  MAINTAINERS: remove patchwork from RISC-V MISC SOC SUPPORT
  MAINTAINERS: add Conor to StarFive entry
  riscv: dts: sifive: unmatched: Add PWM controlled fans
  riscv: dts: microchip: enable qspi adc/mmc-spi-slot on BeagleV Fire
  dts: starfive: jh7110-common: split out mmc0 reset pins from common into boards

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-11-27 22:49:32 +01:00
Paolo Bonzini
de8e8ebb1a Merge tag 'kvm-x86-tdx-6.19' of https://github.com/kvm-x86/linux into HEAD
KVM TDX changes for 6.19:

 - Overhaul the TDX code to address systemic races where KVM (acting on behalf
   of userspace) could inadvertantly trigger lock contention in the TDX-Module,
   which KVM was either working around in weird, ugly ways, or was simply
   oblivious to (as proven by Yan tripping several KVM_BUG_ON()s with clever
   selftests).

 - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a vCPU if
   creating said vCPU failed partway through.

 - Fix a few sparse warnings (bad annotation, 0 != NULL).

 - Use struct_size() to simplify copying capabilities to userspace.
2025-11-26 09:36:37 +01:00
Icenowy Zheng
5b70764e10 riscv: dts: starfive: add Orange Pi RV
Orange Pi RV is a SBC based on the StarFive VisionFive 2 board.

Orange Pi RV features:

- StarFive JH7110 SoC
- GbE port connected to JH7110 GMAC0 via YT8531 PHY
- 4x USB ports via VL805 PCIe USB controller connected to JH7110 pcie0
- M.2 M-key slot connected to JH7110 pcie1
- HDMI video output
- 3.5mm audio output
- Ampak AP6256 SDIO Wi-Fi/Bluetooth module on mmc0
- microSD slot on mmc1
- SPI NOR flash memory
- 24c02 EEPROM (read only by default)

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Signed-off-by: E Shattow <e@freeshell.de>
[conor: amend comment to say what's missing]
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-11-25 22:20:54 +00:00
Hal Feng
ae264ae124 riscv: dts: starfive: Add VisionFive 2 Lite eMMC board device tree
VisionFive 2 Lite eMMC board uses a non-removable onboard 64GiB eMMC
instead of the MicroSD slot.

Acked-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Tested-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-11-25 22:16:00 +00:00
Hal Feng
900b32fd60 riscv: dts: starfive: Add VisionFive 2 Lite board device tree
VisionFive 2 Lite is a mini SBC based on the StarFive JH7110S SoC.

Board features:
- JH7110S SoC
- 4/8 GiB LPDDR4 DRAM
- AXP15060 PMIC
- 40 pin GPIO header
- 1x USB 3.0 host port
- 3x USB 2.0 host port
- 1x M.2 M-Key (size: 2242)
- 1x MicroSD slot (optional non-removable 64GiB eMMC)
- 1x QSPI Flash
- 1x I2C EEPROM
- 1x 1Gbps Ethernet port
- SDIO-based Wi-Fi & UART-based Bluetooth
- 1x HDMI port
- 1x 2-lane DSI
- 1x 2-lane CSI

VisionFive 2 Lite schematics: https://doc-en.rvspace.org/VisionFive2Lite/PDF/VF2_LITE_V1.10_TF_20250818_SCH.pdf
VisionFive 2 Lite Quick Start Guide: https://doc-en.rvspace.org/VisionFive2Lite/VisionFive2LiteQSG/index.html
More documents: https://doc-en.rvspace.org/Doc_Center/visionfive_2_lite.html

Acked-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Tested-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-11-25 22:16:00 +00:00
Hal Feng
2ad6d71a0d riscv: dts: starfive: Add common board dtsi for VisionFive 2 Lite variants
Add a common board dtsi for use by VisionFive 2 Lite and
VisionFive 2 Lite eMMC.

Acked-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Tested-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-11-25 22:16:00 +00:00
Hal Feng
84853940a7 riscv: dts: starfive: jh7110-common: Move out some nodes to the board dts
Some node in this file are not used by the upcoming VisionFive 2 Lite
board. Move them to the board dts to prepare for adding the new
VisionFive 2 Lite device tree.

Tested-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-11-25 22:16:00 +00:00
Chunyan Zhang
c64da3950c riscv: mm: add userfaultfd write-protect support
The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59
for software, this patch uses bit 60 for uffd-wp tracking

Additionally for tracking the uffd-wp state as a PTE swap bit, we borrow
bit 4 which is not involved into swap entry computation.

Link: https://lkml.kernel.org/r/20251113072806.795029-6-zhangchunyan@iscas.ac.cn
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Conor Dooley <conor.dooley@microchip.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Deepak Gupta <debug@rivosinc.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24 15:08:55 -08:00
Chunyan Zhang
2a3ebad4db riscv: mm: add soft-dirty page tracking support
The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59
for software, this patch uses bit 59 for soft-dirty.

To add swap PTE soft-dirty tracking, we borrow bit 3 which is available
for swap PTEs on RISC-V systems.

Link: https://lkml.kernel.org/r/20251113072806.795029-5-zhangchunyan@iscas.ac.cn
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Conor Dooley <conor.dooley@microchip.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24 15:08:55 -08:00
Chunyan Zhang
59f6acb4be riscv: add RISC-V Svrsw60t59b extension support
The Svrsw60t59b extension allows to free the PTE reserved bits 60 and 59
for software to use.

Link: https://lkml.kernel.org/r/20251113072806.795029-4-zhangchunyan@iscas.ac.cn
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Conor Dooley <conor.dooley@microchip.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24 15:08:55 -08:00
Menglong Dong
ae4a3160d1 bpf: specify the old and new poke_type for bpf_arch_text_poke
In the origin logic, the bpf_arch_text_poke() assume that the old and new
instructions have the same opcode. However, they can have different opcode
if we want to replace a "call" insn with a "jmp" insn.

Therefore, add the new function parameter "old_t" along with the "new_t",
which are used to indicate the old and new poke type. Meanwhile, adjust
the implement of bpf_arch_text_poke() for all the archs.

"BPF_MOD_NOP" is added to make the code more readable. In
bpf_arch_text_poke(), we still check if the new and old address is NULL to
determine if nop insn should be used, which I think is more safe.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Link: https://lore.kernel.org/r/20251118123639.688444-6-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-24 09:47:03 -08:00
Menglong Dong
47c9214dcb bpf: fix the usage of BPF_TRAMP_F_SKIP_FRAME
Some places calculate the origin_call by checking if
BPF_TRAMP_F_SKIP_FRAME is set. However, it should use
BPF_TRAMP_F_ORIG_STACK for this propose. Just fix them.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/r/20251118123639.688444-4-dongml2@chinatelecom.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-24 09:47:03 -08:00
Hui Min Mina Chou
3239c52fd2 RISC-V: KVM: Flush VS-stage TLB after VCPU migration for Andes cores
Most implementations cache the combined result of two-stage translation,
but some, like Andes cores, use split TLBs that store VS-stage and
G-stage entries separately.

On such systems, when a VCPU migrates to another CPU, an additional
HFENCE.VVMA is required to avoid using stale VS-stage entries, which
could otherwise cause guest faults.

Introduce a static key to identify CPUs with split two-stage TLBs.
When enabled, KVM issues an extra HFENCE.VVMA on VCPU migration to
prevent stale VS-stage mappings.

Signed-off-by: Hui Min Mina Chou <minachou@andestech.com>
Signed-off-by: Ben Zong-You Xie <ben717@andestech.com>
Reviewed-by: Radim Krčmář <rkrcmar@ventanamicro.com>
Reviewed-by: Nutty Liu <nutty.liu@hotmail.com>
Link: https://lore.kernel.org/r/20251117084555.157642-1-minachou@andestech.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-11-24 09:55:36 +05:30
Fangyu Yu
974555d6e4 RISC-V: KVM: Fix guest page fault within HLV* instructions
When executing HLV* instructions at the HS mode, a guest page fault
may occur when a g-stage page table migration between triggering the
virtual instruction exception and executing the HLV* instruction.

This may be a corner case, and one simpler way to handle this is to
re-execute the instruction where the virtual  instruction exception
occurred, and the guest page fault will be automatically handled.

Fixes: b91f0e4cb8 ("RISC-V: KVM: Factor-out instruction emulation into separate sources")
Signed-off-by: Fangyu Yu <fangyu.yu@linux.alibaba.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251121133543.46822-1-fangyu.yu@linux.alibaba.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-11-24 09:55:36 +05:30
Dong Yang
df60cb2e67 KVM: riscv: Support enabling dirty log gradually in small chunks
There is already support of enabling dirty log gradually in small chunks
for x86 in commit 3c9bd4006b ("KVM: x86: enable dirty log gradually in
small chunks") and c862626 ("KVM: arm64: Support enabling dirty log
gradually in small chunks"). This adds support for riscv.

x86 and arm64 writes protect both huge pages and normal pages now, so
riscv protect also protects both huge pages and normal pages.

On a nested virtualization setup (RISC-V KVM running inside a QEMU VM
on an [Intel® Core™ i5-12500H] host), I did some tests with a 2G Linux
VM using different backing page sizes. The time taken for
memory_global_dirty_log_start in the L2 QEMU is listed below:

Page Size      Before    After Optimization
  4K            4490.23ms         31.94ms
  2M             48.97ms          45.46ms
  1G             28.40ms          30.93ms

Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Signed-off-by: Dong Yang <dayss1224@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20251103062825.9084-1-dayss1224@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2025-11-24 09:55:36 +05:30