[Tracking] Overview of 6.6 patches merged into the 6.12 series #11

MingcongBai · 2024-11-12T09:32:39Z

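This issue tracks the porting status of vendor patches introduced from the 6.6 branch. After this port, the patches will be rebased onto the latest rolling-stable branches via cherry-pick.

Indented items below are dependencies of the top-level item above them.

Current code base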

6.12.9

Peripheral support (GPU)

drm: Support JMGPU JM9100 kernel#102
- [Jemoic] [DeepinKernel SIG] drm/mwv207: add parentheses to evaluate the bitwise operator first kernel#494
add glenfly arise1 series GPU DRM kernel driver(V33) kernel#333
[Deepin-Kernel-SIG] [linux 6.6-y] [Backport] [Aspeed] [wxiat] drm/ast: Fix io access error when resuming from sleep kernel#603 (to be investigated)

Peripheral support (network adapters)

Peripheral support (input)

~~HID: multitouch: Add quirk for HONOR MagicBook Art 14 touchpad kernel#431~~ (merged upstream in 6.12)

Peripheral support (storage)

Platform support (Phytium)

Platform support (Kunpeng)

[Deepin-Kernel-SIG] [linux-6.6.y] [deepin] workaround 920 desktop cpufreq kernel#409

Platform support (AArch64 generic)

Platform support (LoongArch)

Platform support (Hygon)

Platform support (Zhaoxin)

Platform support (Intel)

arch: x86: configs: enable DRM_ACCEL_IVPU, EDAC_IEH kernel#297
- [Intel-SIG] [Meteor Lake] Sync 6.6 EDAC driver patches to Deepin 6.6 kernel#79
~~[Deepin Kernel SIG] x86/cpu: Clarify the error message when BIOS does not support SGX kernel#423~~ (merged upstream in 6.12)
deepin: x86: config: Update deepin_x86_desktop_defconfig to better support Intel devices kernel#542
~~[Deepin-Kernel-SIG] [linux 6.6-y] [Upstream] platform/x86: intel-uncore-freq: Add additional client processors kernel#570~~ (merged upstream in 6.8-rc1)

Platform support (other x86 vendors)

(linux-6.6.y) driver:crypto:add support for montage Mont-TSSE kernel#239

Platform support (x86 generic)

Platform support (RISC-V generic)

~~[RISCV] fix: Use WRITE_ONCE() when setting page table entries kernel#398~~ (merged upstream in 6.9)
~~[RISCV] feat: add kernel fpu to support rdna graphics card kernel#400~~ (merged upstream in 6.10; the patch content differs, pending testing)
~~riscv: dts: starfive: visionfive 2: Remove non-existing I2S hardware kernel#416~~ (merged upstream in 6.10: e0503d47e93d)
[RISCV] [linux-6.6.y] add xuantie erreta kernel#418
~~[Deepin Kernel SIG] [Debian] binder: Export close_fd_get_file and can_nice kernel#552~~ (upstream removed the relevant symbols in 6.8, commit a88c955fcfb49727d0ed86b47410f6555a8e69e4)

Platform support (Sunway)

~~[WIP] [SW64] Add sw64 architecture basic support kernel#439~~ (out of support scope)

Platform support (MIPS)

Platform support (generic)

Release management


Disable strict aliasing, as has been done in the kernel proper for decades (literally since before git history) to fix issues where gcc will optimize away loads in code that looks 100% correct, but is _technically_ undefined behavior, and thus can be thrown away by the compiler.

E.g. arm64's vPMU counter access test casts a uint64_t (unsigned long) pointer to a u64 (unsigned long long) pointer when setting PMCR.N via u64p_replace_bits(), which gcc-13 detects and optimizes away, i.e. ignores the result and uses the original PMCR.

The issue is most easily observed by making set_pmcr_n() noinline and wrapping the call with printf(), e.g. sans comments, for this code:

    printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);
    set_pmcr_n(&pmcr, pmcr_n);
    printf("orig = %lx, next = %lx, want = %lu\n", pmcr_orig, pmcr, pmcr_n);

gcc-13 generates:

    0000000000401c90 <set_pmcr_n>:
      401c90: f9400002  ldr x2, [x0]
      401c94: b3751022  bfi x2, x1, #11, #5
      401c98: f9000002  str x2, [x0]
      401c9c: d65f03c0  ret

    0000000000402660 <test_create_vpmu_vm_with_pmcr_n>:
      402724: aa1403e3  mov x3, x20
      402728: aa1503e2  mov x2, x21
      40272c: aa1603e0  mov x0, x22
      402730: aa1503e1  mov x1, x21
      402734: 940060ff  bl 41ab30 <_IO_printf>
      402738: aa1403e1  mov x1, x20
      40273c: 910183e0  add x0, sp, #0x60
      402740: 97fffd54  bl 401c90 <set_pmcr_n>
      402744: aa1403e3  mov x3, x20
      402748: aa1503e2  mov x2, x21
      40274c: aa1503e1  mov x1, x21
      402750: aa1603e0  mov x0, x22
      402754: 940060f7  bl 41ab30 <_IO_printf>

with the value stored in [sp + 0x60] ignored by both printf() above and in the test proper, resulting in a false failure due to vcpu_set_reg() simply storing the original value, not the intended value.

    $ ./vpmu_counter_access
    Random seed: 0x6b8b4567
    orig = 3040, next = 3040, want = 0
    orig = 3040, next = 3040, want = 0
    ==== Test Assertion Failure ====
      aarch64/vpmu_counter_access.c:505: pmcr_n == get_pmcr_n(pmcr)
      pid=71578 tid=71578 errno=9 - Bad file descriptor
         1 0x400673: run_access_test at vpmu_counter_access.c:522
         2  (inlined by) main at vpmu_counter_access.c:643
         3 0x4132d7: __libc_start_call_main at libc-start.o:0
         4 0x413653: __libc_start_main at ??:0
         5 0x40106f: _start at ??:0
      Failed to update PMCR.N to 0 (received: 6)

Somewhat bizarrely, gcc-11 also exhibits the same behavior, but only if set_pmcr_n() is marked noinline, whereas gcc-13 fails even if set_pmcr_n() is inlined in its sole caller.

Cc: [email protected]
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116912
Signed-off-by: Sean Christopherson <[email protected]>
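As an illustration of the hazard described in the commit message above, the following is a minimal sketch of a field update that stays well-defined even when strict aliasing is enabled. The helper names `get_pmcr_n_safe()` and `set_pmcr_n_safe()` are hypothetical and are not the selftest's code; the field position (5 bits starting at bit 11) is an assumption read off the `bfi x2, x1, #11, #5` instruction in the disassembly. Instead of casting a `uint64_t *` to an `unsigned long long *`, the sketch copies the value through `memcpy()`.

```c
#include <stdint.h>
#include <string.h>

/* Assumed field layout, taken from "bfi x2, x1, #11, #5" in the disassembly. */
#define PMCR_N_SHIFT 11
#define PMCR_N_MASK  (0x1fULL << PMCR_N_SHIFT)

/* The memcpy() round trip below only makes sense if both types are 64 bits wide. */
_Static_assert(sizeof(unsigned long long) == sizeof(uint64_t),
	       "unsigned long long must be 64 bits wide");

/* Hypothetical helper: extract PMCR.N from a register value. */
static inline uint64_t get_pmcr_n_safe(uint64_t pmcr)
{
	return (pmcr & PMCR_N_MASK) >> PMCR_N_SHIFT;
}

/*
 * Hypothetical helper: replace PMCR.N in *pmcr.
 * The value is moved in and out with memcpy() rather than through a
 * pointer of a different type, so the compiler cannot assume that the
 * store is unrelated to later loads of *pmcr.
 */
static inline void set_pmcr_n_safe(uint64_t *pmcr, uint64_t pmcr_n)
{
	unsigned long long tmp;

	memcpy(&tmp, pmcr, sizeof(tmp));
	tmp = (tmp & ~PMCR_N_MASK) | ((pmcr_n << PMCR_N_SHIFT) & PMCR_N_MASK);
	memcpy(pmcr, &tmp, sizeof(tmp));
}
```

The commit itself takes the other route and disables strict aliasing for the build (gcc's `-fno-strict-aliasing`), which fixes every such cast at once rather than one call site at a time.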

opsiff · 2024-11-19T03:15:17Z

deepin-community/kernel#461 has already been merged in 6.12, so it is no longer needed.

Avenger-285714 · 2024-11-24T02:57:03Z

deepin-community/kernel#348 has been merged upstream, and linux-stable has picked it up for 4.19/5.4/5.10/5.15/6.1/6.6 :-)

[ Upstream commit 58f038e ]

During the update procedure, when overwrite element in a pre-allocated htab, the freeing of old_element is protected by the bucket lock. The reason why the bucket lock is necessary is that the old_element has already been stashed in htab->extra_elems after alloc_htab_elem() returns. If freeing the old_element after the bucket lock is unlocked, the stashed element may be reused by concurrent update procedure and the freeing of old_element will run concurrently with the reuse of the old_element. However, the invocation of check_and_free_fields() may acquire a spin-lock which violates the lockdep rule because its caller has already held a raw-spin-lock (bucket lock). The following warning will be reported when such race happens:

    BUG: scheduling while atomic: test_progs/676/0x00000003
    3 locks held by test_progs/676:
    #0: ffffffff864b0240 (rcu_read_lock_trace){....}-{0:0}, at: bpf_prog_test_run_syscall+0x2c0/0x830
    #1: ffff88810e961188 (&htab->lockdep_key){....}-{2:2}, at: htab_map_update_elem+0x306/0x1500
    #2: ffff8881f4eac1b8 (&base->softirq_expiry_lock){....}-{2:2}, at: hrtimer_cancel_wait_running+0xe9/0x1b0
    Modules linked in: bpf_testmod(O)
    Preemption disabled at:
    [<ffffffff817837a3>] htab_map_update_elem+0x293/0x1500
    CPU: 0 UID: 0 PID: 676 Comm: test_progs Tainted: G ... 6.12.0+ #11
    Tainted: [W]=WARN, [O]=OOT_MODULE
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)...
    Call Trace:
    <TASK>
    dump_stack_lvl+0x57/0x70
    dump_stack+0x10/0x20
    __schedule_bug+0x120/0x170
    __schedule+0x300c/0x4800
    schedule_rtlock+0x37/0x60
    rtlock_slowlock_locked+0x6d9/0x54c0
    rt_spin_lock+0x168/0x230
    hrtimer_cancel_wait_running+0xe9/0x1b0
    hrtimer_cancel+0x24/0x30
    bpf_timer_delete_work+0x1d/0x40
    bpf_timer_cancel_and_free+0x5e/0x80
    bpf_obj_free_fields+0x262/0x4a0
    check_and_free_fields+0x1d0/0x280
    htab_map_update_elem+0x7fc/0x1500
    bpf_prog_9f90bc20768e0cb9_overwrite_cb+0x3f/0x43
    bpf_prog_ea601c4649694dbd_overwrite_timer+0x5d/0x7e
    bpf_prog_test_run_syscall+0x322/0x830
    __sys_bpf+0x135d/0x3ca0
    __x64_sys_bpf+0x75/0xb0
    x64_sys_call+0x1b5/0xa10
    do_syscall_64+0x3b/0xc0
    entry_SYSCALL_64_after_hwframe+0x4b/0x53
    ...
    </TASK>

It seems feasible to break the reuse and refill of per-cpu extra_elems into two independent parts: reuse the per-cpu extra_elems with bucket lock being held and refill the old_element as per-cpu extra_elems after the bucket lock is unlocked. However, it will make the concurrent overwrite procedures on the same CPU return unexpected -E2BIG error when the map is full.

Therefore, the patch fixes the lock problem by breaking the cancelling of bpf_timer into two steps for PREEMPT_RT:

1) use hrtimer_try_to_cancel() and check its return value
2) if the timer is running, use hrtimer_cancel() through a kworker to cancel it again

Considering that the current implementation of hrtimer_cancel() will try to acquire a being held softirq_expiry_lock when the current timer is running, these steps above are reasonable. However, it also has downside. When the timer is running, the cancelling of the timer is delayed when releasing the last map uref. The delay is also fixable (e.g., break the cancelling of bpf timer into two parts: one part in locked scope, another one in unlocked scope), it can be revised later if necessary.

It is a bit hard to decide the right fix tag. One reason is that the problem depends on PREEMPT_RT which is enabled in v6.12. Considering the softirq_expiry_lock lock exists since v5.4 and bpf_timer is introduced in v5.15, the bpf_timer commit is used in the fixes tag and an extra depends-on tag is added to state the dependency on PREEMPT_RT.

Fixes: b00628b ("bpf: Introduce bpf timers.")
Depends-on: v6.12+ with PREEMPT_RT enabled
Reported-by: Sebastian Andrzej Siewior <[email protected]>
Closes: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Hou Tao <[email protected]>
Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
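The two-step cancellation described in the commit message above can be pictured with a small user-space toy model. This is only a sketch under stated assumptions, not the kernel patch: every `toy_*` name is hypothetical, there is no real locking or work queue, and the only detail borrowed from the kernel is the return convention of `hrtimer_try_to_cancel()` (0 if the timer was not active, 1 if it was active and got removed, -1 if its callback is currently running).

```c
#include <stdbool.h>

/* Toy stand-in for a timer; this is NOT the kernel's struct hrtimer. */
enum toy_state { TOY_INACTIVE, TOY_QUEUED, TOY_RUNNING };

struct toy_timer {
	enum toy_state state;
	bool cancel_deferred;	/* true once cancellation was handed to a worker */
};

/*
 * Models the return convention of hrtimer_try_to_cancel():
 *   0  the timer was not active,
 *   1  the timer was queued and has been removed,
 *  -1  the callback is running right now and cannot be stopped here.
 * It never waits, so it is safe to call with a raw lock held.
 */
static int toy_try_to_cancel(struct toy_timer *t)
{
	switch (t->state) {
	case TOY_RUNNING:
		return -1;
	case TOY_QUEUED:
		t->state = TOY_INACTIVE;
		return 1;
	default:
		return 0;
	}
}

/* Step 1: runs in the locked scope and only uses the non-blocking call. */
static void toy_cancel_and_free(struct toy_timer *t)
{
	if (toy_try_to_cancel(t) < 0)
		t->cancel_deferred = true;	/* leave the waiting to step 2 */
}

/*
 * Step 2: runs later in worker context, where blocking is allowed.
 * In the kernel this is where a blocking hrtimer_cancel() would wait
 * for the callback to finish; the toy model just marks it finished.
 */
static void toy_deferred_cancel_work(struct toy_timer *t)
{
	if (!t->cancel_deferred)
		return;
	t->state = TOY_INACTIVE;
	t->cancel_deferred = false;
}
```

The model also shows the downside the commit message mentions: when the callback is running, the timer stays active after step 1 and is only fully cancelled once the deferred work has run.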

[ Upstream commit c7b87ce ]

libtraceevent parses and returns an array of argument fields, sometimes larger than RAW_SYSCALL_ARGS_NUM (6) because it includes "__syscall_nr", idx will traverse to index 6 (7th element) whereas sc->fmt->arg holds 6 elements max, creating an out-of-bounds access. This runtime error is found by UBsan. The error message:

    $ sudo UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -a --max-events=1
    builtin-trace.c:1966:35: runtime error: index 6 out of bounds for type 'syscall_arg_fmt [6]'
        #0 0x5c04956be5fe in syscall__alloc_arg_fmts /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:1966
        #1 0x5c04956c0510 in trace__read_syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2110
        #2 0x5c04956c372b in trace__syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2436
        #3 0x5c04956d2f39 in trace__init_syscalls_bpf_prog_array_maps /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:3897
        #4 0x5c04956d6d25 in trace__run /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:4335
        #5 0x5c04956e112e in cmd_trace /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:5502
        #6 0x5c04956eda7d in run_builtin /home/howard/hw/linux-perf/tools/perf/perf.c:351
        #7 0x5c04956ee0a8 in handle_internal_command /home/howard/hw/linux-perf/tools/perf/perf.c:404
        #8 0x5c04956ee37f in run_argv /home/howard/hw/linux-perf/tools/perf/perf.c:448
        #9 0x5c04956ee8e9 in main /home/howard/hw/linux-perf/tools/perf/perf.c:556
        #10 0x79eb3622a3b7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
        #11 0x79eb3622a47a in __libc_start_main_impl ../csu/libc-start.c:360
        #12 0x5c04955422d4 in _start (/home/howard/hw/linux-perf/tools/perf/perf+0x4e02d4) (BuildId: 5b6cab2d59e96a4341741765ad6914a4d784dbc6)

         0.000 ( 0.014 ms): Chrome_ChildIO/117244 write(fd: 238, buf: !, count: 1) = 1

Fixes: 5e58fcf ("perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint")
Signed-off-by: Howard Chu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
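The class of bug described in the commit message above (walking a field list that can be longer than a fixed-size array) can be sketched in isolation as follows. This is an illustrative toy, not the upstream diff: `struct toy_field`, `struct toy_arg_fmt` and `toy_fill_arg_fmts()` are hypothetical names, and only the constant `RAW_SYSCALL_ARGS_NUM` with its value 6 comes from the commit message. The point is simply that the loop condition bounds the index by the array capacity as well as by the end of the list.

```c
#include <stddef.h>

/* Array capacity quoted in the commit message. */
#define RAW_SYSCALL_ARGS_NUM 6

/* Toy stand-in for a node in a parsed field list (hypothetical type). */
struct toy_field {
	const char *name;
	const struct toy_field *next;
};

/* Toy stand-in for a per-argument format slot (hypothetical type). */
struct toy_arg_fmt {
	const char *name;
};

/*
 * Copy field names into a fixed-size array.
 * The walk stops at the end of the list OR when the array is full,
 * whichever comes first, so a seventh field is never written to
 * fmts[6]. Returns the number of slots filled.
 */
static size_t toy_fill_arg_fmts(struct toy_arg_fmt fmts[RAW_SYSCALL_ARGS_NUM],
				const struct toy_field *field)
{
	size_t idx = 0;

	for (; field != NULL && idx < RAW_SYSCALL_ARGS_NUM; field = field->next)
		fmts[idx++].name = field->name;

	return idx;
}
```

With a seven-element list, as in the report, the function fills six slots and ignores the extra field instead of indexing past the end of the array.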

MingcongBai pinned this issue Nov 12, 2024
