Commit Graph

1397189 Commits

Author SHA1 Message Date
Greg Kroah-Hartman
85612fa562 Merge v6.12.67
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:52 +01:00
Greg Kroah-Hartman
abf529abd6 Linux 6.12.67
Link: https://lore.kernel.org/r/20260121181411.452263583@linuxfoundation.org
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Brett A C Sheffield <bacs@librecast.net>
Tested-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Tested-by: Mark Brown <broonie@kernel.org>
Tested-by: Brett Mastbergen <bmastbergen@ciq.com>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
v6.12.67
2026-01-23 11:18:52 +01:00
Bruno Faccini
7c734ad868 mm/fake-numa: handle cases with no SRAT info
commit 4c80187001 upstream.

Handle more gracefully the cases where no SRAT information is available,
such as in VMs with no NUMA support, and allow fake-numa configuration to
complete successfully in these cases.

Link: https://lkml.kernel.org/r/20250127171623.1523171-1-bfaccini@nvidia.com
Fixes: 63db8170bf ("mm/fake-numa: allow later numa node hotplug")
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hyeonggon Yoo <hyeonggon.yoo@sk.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:52 +01:00
Vlastimil Babka
df63d31e9a mm/page_alloc: prevent pcp corruption with SMP=n
[ Upstream commit 038a102535 ]

The kernel test robot has reported:

 BUG: spinlock trylock failure on UP on CPU#0, kcompactd0/28
  lock: 0xffff888807e35ef0, .magic: dead4ead, .owner: kcompactd0/28, .owner_cpu: 0
 CPU: 0 UID: 0 PID: 28 Comm: kcompactd0 Not tainted 6.18.0-rc5-00127-ga06157804399 #1 PREEMPT  8cc09ef94dcec767faa911515ce9e609c45db470
 Call Trace:
  <IRQ>
  __dump_stack (lib/dump_stack.c:95)
  dump_stack_lvl (lib/dump_stack.c:123)
  dump_stack (lib/dump_stack.c:130)
  spin_dump (kernel/locking/spinlock_debug.c:71)
  do_raw_spin_trylock (kernel/locking/spinlock_debug.c:?)
  _raw_spin_trylock (include/linux/spinlock_api_smp.h:89 kernel/locking/spinlock.c:138)
  __free_frozen_pages (mm/page_alloc.c:2973)
  ___free_pages (mm/page_alloc.c:5295)
  __free_pages (mm/page_alloc.c:5334)
  tlb_remove_table_rcu (include/linux/mm.h:? include/linux/mm.h:3122 include/asm-generic/tlb.h:220 mm/mmu_gather.c:227 mm/mmu_gather.c:290)
  ? __cfi_tlb_remove_table_rcu (mm/mmu_gather.c:289)
  ? rcu_core (kernel/rcu/tree.c:?)
  rcu_core (include/linux/rcupdate.h:341 kernel/rcu/tree.c:2607 kernel/rcu/tree.c:2861)
  rcu_core_si (kernel/rcu/tree.c:2879)
  handle_softirqs (arch/x86/include/asm/jump_label.h:36 include/trace/events/irq.h:142 kernel/softirq.c:623)
  __irq_exit_rcu (arch/x86/include/asm/jump_label.h:36 kernel/softirq.c:725)
  irq_exit_rcu (kernel/softirq.c:741)
  sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052)
  </IRQ>
  <TASK>
 RIP: 0010:_raw_spin_unlock_irqrestore (arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194)
  free_pcppages_bulk (mm/page_alloc.c:1494)
  drain_pages_zone (include/linux/spinlock.h:391 mm/page_alloc.c:2632)
  __drain_all_pages (mm/page_alloc.c:2731)
  drain_all_pages (mm/page_alloc.c:2747)
  kcompactd (mm/compaction.c:3115)
  kthread (kernel/kthread.c:465)
  ? __cfi_kcompactd (mm/compaction.c:3166)
  ? __cfi_kthread (kernel/kthread.c:412)
  ret_from_fork (arch/x86/kernel/process.c:164)
  ? __cfi_kthread (kernel/kthread.c:412)
  ret_from_fork_asm (arch/x86/entry/entry_64.S:255)
  </TASK>

Matthew has analyzed the report and identified that in drain_pages_zone()
we are in a section protected by spin_lock(&pcp->lock) and then get an
interrupt that attempts spin_trylock() on the same lock.  The code is
designed to work this way without disabling IRQs and occasionally fail the
trylock with a fallback.  However, the SMP=n spinlock implementation
assumes spin_trylock() will always succeed, and thus it's normally a
no-op.  Here the enabled lock debugging catches the problem, but otherwise
it could cause a corruption of the pcp structure.

The problem has been introduced by commit 5749077415 ("mm/page_alloc:
leave IRQs enabled for per-cpu page allocations").  The pcp locking scheme
recognizes the need for disabling IRQs to prevent nesting spin_trylock()
sections on SMP=n, but the need to prevent the nesting in spin_lock() has
not been recognized.  Fix it by introducing local wrappers that change the
spin_lock() to spin_lock_irqsave() with SMP=n and use them in all places
that do spin_lock(&pcp->lock).
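
As an illustration only (a sketch of the wrapper idea, not the exact
upstream hunk; the pcp_ prefix follows the note below):

  #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)
  /* IRQs can stay enabled: a spin_trylock() nesting in from IRQ context
   * may legitimately fail and take the fallback path. */
  #define pcp_spin_lock_irqsave(pcp, flags) \
          do { (flags) = 0; spin_lock(&(pcp)->lock); } while (0)
  #define pcp_spin_unlock_irqrestore(pcp, flags) \
          do { (void)(flags); spin_unlock(&(pcp)->lock); } while (0)
  #else
  /* On SMP=n spin_trylock() always "succeeds", so the only way to keep an
   * IRQ from nesting inside this section is to disable IRQs. */
  #define pcp_spin_lock_irqsave(pcp, flags) \
          spin_lock_irqsave(&(pcp)->lock, flags)
  #define pcp_spin_unlock_irqrestore(pcp, flags) \
          spin_unlock_irqrestore(&(pcp)->lock, flags)
  #endif

Callers that previously did a bare spin_lock(&pcp->lock) switch to these
wrappers.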

[vbabka@suse.cz: add pcp_ prefix to the spin_lock_irqsave wrappers, per Steven]
Link: https://lkml.kernel.org/r/20260105-fix-pcp-up-v1-1-5579662d2071@suse.cz
Fixes: 5749077415 ("mm/page_alloc: leave IRQs enabled for per-cpu page allocations")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202512101320.e2f2dd6f-lkp@intel.com
Analyzed-by: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/all/aUW05pyc9nZkvY-1@casper.infradead.org/
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:52 +01:00
Joshua Hahn
22056349e8 mm/page_alloc: batch page freeing in decay_pcp_high
[ Upstream commit fc4b909c36 ]

It is possible for pcp->count - pcp->high to exceed pcp->batch by a lot.
When this happens, we should perform batching to ensure that
free_pcppages_bulk isn't called with too many pages to free at once,
starving out other threads that need the pcp or zone lock.

Since we are still only freeing the difference between the initial
pcp->count and pcp->high values, there should be no change to how many
pages are freed.
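
A rough sketch of the batching loop (illustrative; the locking shape
follows the existing decay_pcp_high() from memory and may differ):

  int to_free = pcp->count - pcp->high;

  while (to_free > 0) {
          int batch = min(to_free, pcp->batch);

          spin_lock(&pcp->lock);
          free_pcppages_bulk(zone, batch, pcp, 0);
          spin_unlock(&pcp->lock);

          to_free -= batch;
  }

The total number of pages freed is unchanged; only the lock hold times
shrink.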

Link: https://lkml.kernel.org/r/20251014145011.3427205-3-joshua.hahnjy@gmail.com
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Suggested-by: Chris Mason <clm@fb.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Co-developed-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Michal Hocko <mhocko@suse.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 038a102535 ("mm/page_alloc: prevent pcp corruption with SMP=n")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:52 +01:00
Joshua Hahn
48273ed85f mm/page_alloc/vmstat: simplify refresh_cpu_vm_stats change detection
[ Upstream commit 0acc67c403 ]

Patch series "mm/page_alloc: Batch callers of free_pcppages_bulk", v5.

Motivation & Approach
=====================

While testing workloads with high sustained memory pressure on large
machines in the Meta fleet (1 TB memory, 316 CPUs), we saw an unexpectedly
high number of softlockups.  Further investigation showed that the zone
lock in free_pcppages_bulk was being held for a long time, and that
free_pcppages_bulk was called to free 2k+ pages over 100 times just during
boot.

This causes starvation in other processes for the zone lock, which can
lead to the system stalling as multiple threads cannot make progress
without the locks.  We can see these issues manifesting as warnings:

[ 4512.591979] rcu: INFO: rcu_sched self-detected stall on CPU
[ 4512.604370] rcu:     20-....: (9312 ticks this GP) idle=a654/1/0x4000000000000000 softirq=309340/309344 fqs=5426
[ 4512.626401] rcu:              hardirqs   softirqs   csw/system
[ 4512.638793] rcu:      number:        0        145            0
[ 4512.651177] rcu:     cputime:       30      10410          174   ==> 10558(ms)
[ 4512.666657] rcu:     (t=21077 jiffies g=783665 q=1242213 ncpus=316)

While these warnings don't indicate a crash or a kernel panic, they do
point to the underlying issue of lock contention.  To prevent starvation
in both locks, batch the freeing of pages using pcp->batch.

Because free_pcppages_bulk is called with the pcp lock and acquires the
zone lock, relinquishing and reacquiring the locks are only effective when
both of them are broken together (unless the system was built with queued
spinlocks).  Thus, instead of modifying free_pcppages_bulk to break both
locks, batch the freeing from its callers instead.

A similar fix has been implemented in the Meta fleet, and we have seen
significantly fewer softlockups.

Testing
=======
The following are a few synthetic benchmarks, made on three machines. The
first is a large machine with 754GiB memory and 316 processors.
The second is a relatively smaller machine with 251GiB memory and 176
processors. The third and final is the smallest of the three, which has 62GiB
memory and 36 processors.

On all machines, I kick off a kernel build with -j$(nproc).
Negative delta is better (faster compilation).

Large machine (754GiB memory, 316 processors)
make -j$(nproc)
+------------+---------------+-----------+
| Metric (s) | Variation (%) | Delta(%)  |
+------------+---------------+-----------+
| real       |        0.8070 |  - 1.4865 |
| user       |        0.2823 |  + 0.4081 |
| sys        |        5.0267 |  -11.8737 |
+------------+---------------+-----------+

Medium machine (251GiB memory, 176 processors)
make -j$(nproc)
+------------+---------------+----------+
| Metric (s) | Variation (%) | Delta(%) |
+------------+---------------+----------+
| real       |        0.2806 |  +0.0351 |
| user       |        0.0994 |  +0.3170 |
| sys        |        0.6229 |  -0.6277 |
+------------+---------------+----------+

Small machine (62GiB memory, 36 processors)
make -j$(nproc)
+------------+---------------+----------+
| Metric (s) | Variation (%) | Delta(%) |
+------------+---------------+----------+
| real       |        0.1503 |  -2.6585 |
| user       |        0.0431 |  -2.2984 |
| sys        |        0.1870 |  -3.2013 |
+------------+---------------+----------+

Here, variation is the coefficient of variation, i.e.  standard deviation
/ mean.

Based on these results, it seems that the degree to which this reduces
lock contention varies.  For the largest and smallest machines that I ran
the tests on, there is quite a significant reduction.  There are also some
performance increases visible from userspace.

Interestingly, the performance gains don't scale with the size of the
machine, but rather there seems to be a dip in the gain there is for the
medium-sized machine.  One possible theory is that because the high
watermark depends on both memory and the number of local CPUs, what
impacts zone contention the most is not these individual values, but
rather the ratio of mem:processors.

This patch (of 5):

Currently, refresh_cpu_vm_stats returns an int, indicating how many
changes were made during its updates.  Using this information, callers
like vmstat_update can heuristically determine if more work will be done
in the future.

However, all of refresh_cpu_vm_stats's callers either (a) ignore the
result, only caring about performing the updates, or (b) only care about
whether changes were made, but not *how many* changes were made.

Simplify the code by returning a bool instead to indicate if updates
were made.

In addition, simplify fold_diff and decay_pcp_high to return a bool
for the same reason.
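
For illustration, a hypothetical helper in the spirit of the change (not
the actual fold_diff()/refresh_cpu_vm_stats() code): the callers only need
a yes/no answer, so counting the changes adds nothing.

  static bool any_counter_changed(const int *diff, int nr)
  {
          int i;

          for (i = 0; i < nr; i++) {
                  if (diff[i])
                          return true;    /* "changes were made" */
          }
          return false;
  }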

Link: https://lkml.kernel.org/r/20251014145011.3427205-1-joshua.hahnjy@gmail.com
Link: https://lkml.kernel.org/r/20251014145011.3427205-2-joshua.hahnjy@gmail.com
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Chris Mason <clm@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: 038a102535 ("mm/page_alloc: prevent pcp corruption with SMP=n")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:52 +01:00
Zhen Ni
ce358252a9 dmaengine: fsl-edma: Fix clk leak on alloc_chan_resources failure
[ Upstream commit b18cd8b210 ]

When fsl_edma_alloc_chan_resources() fails after clk_prepare_enable(),
the error paths only free IRQs and destroy the TCD pool, but forget to
call clk_disable_unprepare(). This causes the channel clock to remain
enabled, leaking power and resources.

Fix it by disabling the channel clock in the error unwind path.
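
A generic sketch of the fix (helper and argument names are illustrative,
not the driver's actual code): anything enabled before the failing step
must be undone on the error path.

  static int chan_enable_resources(struct device *dev, struct clk *clk,
                                   int irq, irq_handler_t handler, void *data)
  {
          int ret;

          ret = clk_prepare_enable(clk);
          if (ret)
                  return ret;

          ret = request_irq(irq, handler, 0, dev_name(dev), data);
          if (ret)
                  goto err_clk;

          return 0;

  err_clk:
          clk_disable_unprepare(clk);     /* previously missing on this path */
          return ret;
  }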

Fixes: d8d4355861 ("dmaengine: fsl-edma: add i.MX8ULP edma support")
Cc: stable@vger.kernel.org
Suggested-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20251014090522.827726-1-zhen.ni@easystack.cn
Signed-off-by: Vinod Koul <vkoul@kernel.org>
[ Different error handling scheme ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:51 +01:00
Wentao Liang
027d42b97e phy: rockchip: inno-usb2: Fix a double free bug in rockchip_usb2phy_probe()
[ Upstream commit e07dea3de5 ]

The for_each_available_child_of_node() loop calls of_node_put() to
release child_np on each successful iteration. After breaking from the
loop, child_np has already been released, yet the code jumps to the
put_child label and calls of_node_put() again if
devm_request_threaded_irq() fails. This causes a double-free bug.

Fix by returning directly to avoid the duplicate of_node_put().
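
A simplified sketch of the fixed flow (identifiers illustrative): once the
child loop has already dropped child_np, a failing interrupt request must
return directly instead of jumping to a label that puts the node again.

  ret = devm_request_threaded_irq(dev, irq, NULL, muxed_irq_thread,
                                  IRQF_ONESHOT, "rockchip-usb2phy", rphy);
  if (ret) {
          dev_err(dev, "failed to request muxed irq\n");
          return ret;     /* was: goto put_child; -> double of_node_put() */
  }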

Fixes: ed2b5a8e6b ("phy: phy-rockchip-inno-usb2: support muxed interrupts")
Cc: stable@vger.kernel.org
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20260109154626.2452034-1-vulab@iscas.ac.cn
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:51 +01:00
Dragan Simic
10f0711448 phy: phy-rockchip-inno-usb2: Use dev_err_probe() in the probe path
[ Upstream commit 4045252085 ]

Improve error handling in the probe path by using dev_err_probe()
instead of dev_err(), where appropriate.
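
A typical shape of such a conversion (the clock lookup is only an example,
not a specific hunk from this patch):

  clk = devm_clk_get(dev, "phyclk");
  if (IS_ERR(clk))
          return dev_err_probe(dev, PTR_ERR(clk),
                               "failed to get phy clock\n");

dev_err_probe() logs real errors, records the reason for -EPROBE_DEFER
without spamming the log, and returns the error code, which shortens the
error paths.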

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/d4ccd9fc278fb46ea868406bf77811ee507f0e4e.1725524803.git.dsimic@manjaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Stable-dep-of: e07dea3de5 ("phy: rockchip: inno-usb2: Fix a double free bug in rockchip_usb2phy_probe()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:51 +01:00
Ben Dooks
c5b13f0b10 mm: numa,memblock: include <asm/numa.h> for 'numa_nodes_parsed'
[ Upstream commit f46c26f1bc ]

The 'numa_nodes_parsed' is defined in <asm/numa.h> but this file
is not included in mm/numa_memblks.c (build x86_64) so add this
to the includes to fix the following sparse warning:

mm/numa_memblks.c:13:12: warning: symbol 'numa_nodes_parsed' was not declared. Should it be static?

Link: https://lkml.kernel.org/r/20260108101539.229192-1-ben.dooks@codethink.co.uk
Fixes: 8748270821 ("mm: introduce numa_memblks")
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:51 +01:00
Bruno Faccini
a76f5cafcc mm/fake-numa: allow later numa node hotplug
[ Upstream commit 63db8170bf ]

The current fake-numa implementation prevents new NUMA nodes from being
hot-plugged later by drivers.  A common symptom of this limitation is the
"node <X> was absent from the node_possible_map" message with the
associated warning in mm/memory_hotplug.c: add_memory_resource().

This comes from the lack of remapping in both the pxm_to_node_map[] and
node_to_pxm_map[] tables to take fake-numa nodes into account, which
triggers collisions with the original, physical-nodes-only mapping that
had been determined from the BIOS tables.

This patch fixes this by doing the necessary node-id translation in both
the pxm_to_node_map[] and node_to_pxm_map[] tables.  The node_distance[]
table has also been fixed accordingly.

Details:

When trying to use the fake-numa feature on our system, where new NUMA
nodes are being "hot-plugged" upon driver load, this fails with the
following type of message and warning with a stack trace:

node 8 was absent from the node_possible_map WARNING: CPU: 61 PID: 4259 at
mm/memory_hotplug.c:1506 add_memory_resource+0x3dc/0x418

This issue prevents the use of the fake-NUMA debug feature with the
system's full configuration, when it has proven to be sometimes extremely
useful for performance testing of multi-tasked, memory-bound applications,
as it enables better isolation of processes/ranks compared to fat NUMA
nodes.

Usual numactl output after the driver has "hot-plugged"/unveiled some
new NUMA nodes, with and without memory:
$ numactl --hardware
available: 9 nodes (0-8)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71
node 0 size: 490037 MB
node 0 free: 484432 MB
node 1 cpus:
node 1 size: 97280 MB
node 1 free: 97279 MB
node 2 cpus:
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus:
node 3 size: 0 MB
node 3 free: 0 MB
node 4 cpus:
node 4 size: 0 MB
node 4 free: 0 MB
node 5 cpus:
node 5 size: 0 MB
node 5 free: 0 MB
node 6 cpus:
node 6 size: 0 MB
node 6 free: 0 MB
node 7 cpus:
node 7 size: 0 MB
node 7 free: 0 MB
node 8 cpus:
node 8 size: 0 MB
node 8 free: 0 MB
node distances:
node   0   1   2   3   4   5   6   7   8
  0:  10  80  80  80  80  80  80  80  80
  1:  80  10  255  255  255  255  255  255  255
  2:  80  255  10  255  255  255  255  255  255
  3:  80  255  255  10  255  255  255  255  255
  4:  80  255  255  255  10  255  255  255  255
  5:  80  255  255  255  255  10  255  255  255
  6:  80  255  255  255  255  255  10  255  255
  7:  80  255  255  255  255  255  255  10  255
  8:  80  255  255  255  255  255  255  255  10

With the recent M. Rapoport set of fake-numa patches in mm-everything
and using the numa=fake=4 boot parameter:
$ numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71
node 0 size: 122518 MB
node 0 free: 117141 MB
node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71
node 1 size: 219911 MB
node 1 free: 219751 MB
node 2 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71
node 2 size: 122599 MB
node 2 free: 122541 MB
node 3 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71
node 3 size: 122479 MB
node 3 free: 122408 MB
node distances:
node   0   1   2   3
  0:  10  10  10  10
  1:  10  10  10  10
  2:  10  10  10  10
  3:  10  10  10  10

With the recent M. Rapoport set of fake-numa patches in mm-everything,
this patch on top, and using the numa=fake=4 boot parameter:
# numactl --hardware
available: 12 nodes (0-11)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71
node 0 size: 122518 MB
node 0 free: 116429 MB
node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71
node 1 size: 122631 MB
node 1 free: 122576 MB
node 2 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71
node 2 size: 122599 MB
node 2 free: 122544 MB
node 3 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71
node 3 size: 122479 MB
node 3 free: 122419 MB
node 4 cpus:
node 4 size: 97280 MB
node 4 free: 97279 MB
node 5 cpus:
node 5 size: 0 MB
node 5 free: 0 MB
node 6 cpus:
node 6 size: 0 MB
node 6 free: 0 MB
node 7 cpus:
node 7 size: 0 MB
node 7 free: 0 MB
node 8 cpus:
node 8 size: 0 MB
node 8 free: 0 MB
node 9 cpus:
node 9 size: 0 MB
node 9 free: 0 MB
node 10 cpus:
node 10 size: 0 MB
node 10 free: 0 MB
node 11 cpus:
node 11 size: 0 MB
node 11 free: 0 MB
node distances:
node   0   1   2   3   4   5   6   7   8   9  10  11
  0:  10  10  10  10  80  80  80  80  80  80  80  80
  1:  10  10  10  10  80  80  80  80  80  80  80  80
  2:  10  10  10  10  80  80  80  80  80  80  80  80
  3:  10  10  10  10  80  80  80  80  80  80  80  80
  4:  80  80  80  80  10  255  255  255  255  255  255  255
  5:  80  80  80  80  255  10  255  255  255  255  255  255
  6:  80  80  80  80  255  255  10  255  255  255  255  255
  7:  80  80  80  80  255  255  255  10  255  255  255  255
  8:  80  80  80  80  255  255  255  255  10  255  255  255
  9:  80  80  80  80  255  255  255  255  255  10  255  255
 10:  80  80  80  80  255  255  255  255  255  255  10  255
 11:  80  80  80  80  255  255  255  255  255  255  255  10

Link: https://lkml.kernel.org/r/20250106120659.359610-2-bfaccini@nvidia.com
Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: f46c26f1bc ("mm: numa,memblock: include <asm/numa.h> for 'numa_nodes_parsed'")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:51 +01:00
Ryan Roberts
d1beb4dd8b mm: kmsan: fix poisoning of high-order non-compound pages
[ Upstream commit 4795d205d7 ]

kmsan_free_page() is called by the page allocator's free_pages_prepare()
during page freeing.  Its job is to poison all the memory covered by the
page.  It can be called with an order-0 page, a compound high-order page
or a non-compound high-order page.  But page_size() only works for order-0
and compound pages.  For a non-compound high-order page it will
incorrectly return PAGE_SIZE.

The implication is that the tail pages of a high-order non-compound page
do not get poisoned at free, so any invalid access while they are free
could go unnoticed.  It looks like the pages will be poisoned again at
allocation time, so that would bookend the window.

Fix this by using the order parameter to calculate the size.
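
The essence of the fix, reconstructed from memory (the surrounding
kmsan_free_page() plumbing and the exact poison flags may differ), is the
size expression:

  kmsan_internal_poison_memory(page_address(page),
                               PAGE_SIZE << order, /* was: page_size(page) */
                               GFP_KERNEL,
                               KMSAN_POISON_CHECK | KMSAN_POISON_FREE);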

Link: https://lkml.kernel.org/r/20260104134348.3544298-1-ryan.roberts@arm.com
Fixes: b073d7f8ae ("mm: kmsan: maintain KMSAN metadata for page operations")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Adjust context ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:51 +01:00
Paul Chaignon
2d402c6cc9 selftests/bpf: Test invalid narrower ctx load
commit ba578b87fe upstream.

This patch adds selftests to cover invalid narrower loads on the
context. These used to cause kernel warnings before the previous patch.
To trigger the warning, the load had to be aligned, read an affected
context field (e.g., skb->sk), and not start at the beginning of the
field.

The nine new cases all fail without the previous patch.

Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/44cd83ea9c6868079943f0a436c6efa850528cc1.1753194596.git.paul.chaignon@gmail.com
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:51 +01:00
Paul Chaignon
058a0da4f6 bpf: Reject narrower access to pointer ctx fields
commit e09299225d upstream.

The following BPF program, simplified from a syzkaller repro, causes a
kernel warning:

    r0 = *(u8 *)(r1 + 169);
    exit;

With pointer field sk being at offset 168 in __sk_buff. This access is
detected as a narrower read in bpf_skb_is_valid_access because it
doesn't match offsetof(struct __sk_buff, sk). It is therefore allowed
and later proceeds to bpf_convert_ctx_access. Note that for the
"is_narrower_load" case in the convert_ctx_accesses(), the insn->off
is aligned, so the cnt may not be 0 because it matches the
offsetof(struct __sk_buff, sk) in the bpf_convert_ctx_access. However,
the target_size stays 0 and the verifier errors with a kernel warning:

    verifier bug: error during ctx access conversion(1)

This patch fixes that to return a proper "invalid bpf_context access
off=X size=Y" error on the load instruction.

The same issue affects multiple other fields in context structures that
allow narrow access. Some other non-affected fields (for sk_msg,
sk_lookup, and sockopt) were also changed to use bpf_ctx_range_ptr for
consistency.

Note this syzkaller crash was reported in the "Closes" link below, which
used to be about a different bug, fixed in
commit fce7bd8e38 ("bpf/verifier: Handle BPF_LOAD_ACQ instructions
in insn_def_regno()"). Because syzbot somehow confused the two bugs,
the new crash and repro didn't get reported to the mailing list.

Fixes: f96da09473 ("bpf: simplify narrower ctx access")
Fixes: 0df1a55afa ("bpf: Warn on internal verifier errors")
Reported-by: syzbot+0ef84a7bdf5301d4cbec@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=0ef84a7bdf5301d4cbec
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://patch.msgid.link/3b8dcee67ff4296903351a974ddd9c4dca768b64.1753194596.git.paul.chaignon@gmail.com
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:50 +01:00
SeongJae Park
16236b0b4a mm/damon/sysfs-scheme: cleanup access_pattern subdirs on scheme dir setup failure
commit 392b3d9d59 upstream.

When a DAMOS-scheme DAMON sysfs directory setup fails after the
access_pattern/ directory has been set up, the subdirectories of the
access_pattern/ directory are not cleaned up.  As a result, the DAMON
sysfs interface is nearly broken until the system reboots, and the memory
for the unremoved directory is leaked.

Clean up the directories on such failures.
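
Roughly, the setup path now unwinds what it created (the helper names
follow DAMON's sysfs naming convention but are written from memory, and
the "later step" is a hypothetical stand-in):

  err = damon_sysfs_access_pattern_add_dirs(access_pattern);
  if (err)
          goto put_access_pattern;

  err = setup_rest_of_scheme_dir(scheme);  /* hypothetical later step */
  if (err)
          goto rm_access_pattern_dirs;     /* previously missing unwind */

  return 0;

  rm_access_pattern_dirs:
          damon_sysfs_access_pattern_rm_dirs(access_pattern);
  put_access_pattern:
          kobject_put(&access_pattern->kobj);
          return err;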

Link: https://lkml.kernel.org/r/20251225023043.18579-5-sj@kernel.org
Fixes: 9bbb820a5b ("mm/damon/sysfs: support DAMOS quotas")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: chongjiapeng <jiapeng.chong@linux.alibaba.com>
Cc: <stable@vger.kernel.org> # 5.18.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:50 +01:00
SeongJae Park
b831557a0d mm/damon/sysfs-scheme: cleanup quotas subdirs on scheme dir setup failure
commit dc7e1d75fd upstream.

When a DAMOS-scheme DAMON sysfs directory setup fails after the quotas/
directory has been set up, the subdirectories of the quotas/ directory are
not cleaned up.  As a result, the DAMON sysfs interface is nearly broken
until the system reboots, and the memory for the unremoved directory is
leaked.

Clean up the directories on such failures.

Link: https://lkml.kernel.org/r/20251225023043.18579-4-sj@kernel.org
Fixes: 1b32234ab0 ("mm/damon/sysfs: support DAMOS watermarks")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: chongjiapeng <jiapeng.chong@linux.alibaba.com>
Cc: <stable@vger.kernel.org> # 5.18.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:50 +01:00
Brian Foster
5ee8005f72 xfs: set max_agbno to allow sparse alloc of last full inode chunk
commit c360004c01 upstream.

Sparse inode cluster allocation sets min/max agbno values to avoid
allocating an inode cluster that might map to an invalid inode
chunk. For example, we can't have an inode record mapped to agbno 0
or that extends past the end of a runt AG of misaligned size.

The initial calculation of max_agbno is unnecessarily conservative,
however. This has triggered a corner case allocation failure where a
small runt AG (i.e. 2063 blocks) is mostly full save for an extent
to the EOFS boundary: [2050,13]. max_agbno is set to 2048 in this
case, which happens to be the offset of the last possible valid
inode chunk in the AG. In practice, we should be able to allocate
the 4-block cluster at agbno 2052 to map to the parent inode record
at agbno 2048, but the max_agbno value precludes it.

Note that this can result in filesystem shutdown via dirty trans
cancel on stable kernels prior to commit 9eb775968b ("xfs: walk
all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags") because
the tail AG selection by the allocator sets t_highest_agno on the
transaction. If the inode allocator spins around and finds an inode
chunk with free inodes in an earlier AG, the subsequent dir name
creation path may still fail to allocate due to the AG restriction
and cancel.

To avoid this problem, update the max_agbno calculation to the agbno
prior to the last chunk aligned agbno in the AG. This is not
necessarily the last valid allocation target for a sparse chunk, but
since inode chunks (i.e. records) are chunk aligned and sparse
allocs are cluster sized/aligned, this allows the sb_spino_align
alignment restriction to take over and round down the max effective
agbno to within the last valid inode chunk in the AG.

Note that even though the allocator improvements in the
aforementioned commit seem to avoid this particular dirty trans
cancel situation, the max_agbno logic improvement still applies as
we should be able to allocate from an AG that has been appropriately
selected. The more important target for this patch however are
older/stable kernels prior to this allocator rework/improvement.

Cc: stable@vger.kernel.org # v4.2
Fixes: 56d1115c9b ("xfs: allocate sparse inode chunks on full chunk allocation failure")
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:50 +01:00
Robbie Ko
8b0bb145d3 btrfs: fix deadlock in wait_current_trans() due to ignored transaction type
commit 5037b34282 upstream.

When wait_current_trans() is called during start_transaction(), it
currently waits for a blocked transaction without considering whether
the given transaction type actually needs to wait for that particular
transaction state. The btrfs_blocked_trans_types[] array already defines
which transaction types should wait for which transaction states, but
this check was missing in wait_current_trans().

This can lead to a deadlock scenario involving two transactions and
pending ordered extents:

  1. Transaction A is in TRANS_STATE_COMMIT_DOING state

  2. A worker processing an ordered extent calls start_transaction()
     with TRANS_JOIN

  3. join_transaction() returns -EBUSY because Transaction A is in
     TRANS_STATE_COMMIT_DOING

  4. Transaction A moves to TRANS_STATE_UNBLOCKED and completes

  5. A new Transaction B is created (TRANS_STATE_RUNNING)

  6. The ordered extent from step 2 is added to Transaction B's
     pending ordered extents

  7. Transaction B immediately starts commit by another task and
     enters TRANS_STATE_COMMIT_START

  8. The worker finally reaches wait_current_trans(), sees Transaction B
     in TRANS_STATE_COMMIT_START (a blocked state), and waits
     unconditionally

  9. However, TRANS_JOIN should NOT wait for TRANS_STATE_COMMIT_START
     according to btrfs_blocked_trans_types[]

  10. Transaction B is waiting for pending ordered extents to complete

  11. Deadlock: Transaction B waits for ordered extent, ordered extent
      waits for Transaction B

This can be illustrated by the following call stacks:
  CPU0                              CPU1
                                    btrfs_finish_ordered_io()
                                      start_transaction(TRANS_JOIN)
                                        join_transaction()
                                          # -EBUSY (Transaction A is
                                          # TRANS_STATE_COMMIT_DOING)
  # Transaction A completes
  # Transaction B created
  # ordered extent added to
  # Transaction B's pending list
  btrfs_commit_transaction()
    # Transaction B enters
    # TRANS_STATE_COMMIT_START
    # waiting for pending ordered
    # extents
                                        wait_current_trans()
                                          # waits for Transaction B
                                          # (should not wait!)

Task bstore_kv_sync in btrfs_commit_transaction waiting for ordered
extents:

  __schedule+0x2e7/0x8a0
  schedule+0x64/0xe0
  btrfs_commit_transaction+0xbf7/0xda0 [btrfs]
  btrfs_sync_file+0x342/0x4d0 [btrfs]
  __x64_sys_fdatasync+0x4b/0x80
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Task kworker in wait_current_trans waiting for transaction commit:

  Workqueue: btrfs-syno_nocow btrfs_work_helper [btrfs]
  __schedule+0x2e7/0x8a0
  schedule+0x64/0xe0
  wait_current_trans+0xb0/0x110 [btrfs]
  start_transaction+0x346/0x5b0 [btrfs]
  btrfs_finish_ordered_io.isra.0+0x49b/0x9c0 [btrfs]
  btrfs_work_helper+0xe8/0x350 [btrfs]
  process_one_work+0x1d3/0x3c0
  worker_thread+0x4d/0x3e0
  kthread+0x12d/0x150
  ret_from_fork+0x1f/0x30

Fix this by passing the transaction type to wait_current_trans() and
checking btrfs_blocked_trans_types[cur_trans->state] against the given
type before deciding to wait. This ensures that transaction types which
are allowed to join during certain blocked states will not unnecessarily
wait and cause deadlocks.
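
In spirit, the added check is (simplified, not the exact hunk):

  /* Only wait if this join type must wait for the current transaction
   * state, per btrfs_blocked_trans_types[]. */
  if (!(btrfs_blocked_trans_types[cur_trans->state] & type))
          return;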

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Robbie Ko <robbieko@synology.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Cc: Motiejus Jakštys <motiejus@jakstys.lt>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:50 +01:00
Nathan Chancellor
68f7f10156 HID: intel-ish-hid: Fix -Wcast-function-type-strict in devm_ishtp_alloc_workqueue()
commit 3644f44117 upstream.

Clang warns (or errors with CONFIG_WERROR=y / W=e):

  drivers/hid/intel-ish-hid/ipc/ipc.c:935:36: error: cast from 'void (*)(struct workqueue_struct *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
    935 |         if (devm_add_action_or_reset(dev, (void (*)(void *))destroy_workqueue,
        |                                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  include/linux/device/devres.h:168:34: note: expanded from macro 'devm_add_action_or_reset'
    168 |         __devm_add_action_or_ireset(dev, action, data, #action)
        |                                         ^~~~~~

This warning is pointing out that a kernel control flow integrity (kCFI /
CONFIG_CFI=y) violation will occur due to this function cast when
destroy_workqueue() is indirectly called via devm_action_release(),
because the prototype of destroy_workqueue() does not match the
prototype of (*action)().

Use a local function with the correct prototype to wrap
destroy_workqueue() to resolve the warning and CFI violation.
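
The wrapper is essentially (its name here is illustrative):

  static void ishtp_free_workqueue(void *wq)
  {
          destroy_workqueue(wq);
  }

  /* registered without any function-pointer cast: */
  ret = devm_add_action_or_reset(dev, ishtp_free_workqueue, wq);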

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202510190103.qTZvfdjj-lkp@intel.com/
Closes: https://github.com/ClangBuiltLinux/linux/issues/2139
Fixes: 0d30dae38f ("HID: intel-ish-hid: Use dedicated unbound workqueues to prevent resume blocking")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Lixu <lixu.zhang@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:50 +01:00
Zhang Lixu
e79b03d386 HID: intel-ish-hid: Use dedicated unbound workqueues to prevent resume blocking
commit 0d30dae38f upstream.

During suspend/resume tests with S2IDLE, some ISH functional failures were
observed because of a delay in executing the ISH resume handler. Here
schedule_work() is used from the resume handler to do the actual work.
schedule_work() uses system_wq, which is a per-CPU workqueue. Although
the queuing is not bound to a CPU, it prefers the local CPU of the caller,
unless prohibited.

Users of this work queue are not supposed to queue long running work.
But in practice, there are scenarios where long running work items are
queued on other unbound workqueues, occupying the CPU. As a result, the
ISH resume handler may not get a chance to execute in a timely manner.

In one scenario, one of the ish_resume_handler() executions was delayed
nearly 1 second because another work item on an unbound workqueue occupied
the same CPU. This delay causes ISH functionality failures.

A similar issue was previously observed where the ISH HID driver timed out
while getting the HID descriptor during S4 resume in the recovery kernel,
likely caused by the same workqueue contention problem.

Create dedicated unbound workqueues for all ISH operations to allow work
items to execute on any available CPU, eliminating CPU-specific bottlenecks
and improving resume reliability under varying system loads. Also, ISH has
three different components: a bus driver which implements the ISH protocols,
a PCI interface layer, and an HID interface. Use one dedicated workqueue for
all of them.
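
For illustration, the allocation boils down to (names are placeholders):

  ishtp_wq = alloc_workqueue("ishtp_wq", WQ_UNBOUND, 0);
  if (!ishtp_wq)
          return -ENOMEM;

  /* the resume path then queues onto it instead of system_wq: */
  queue_work(ishtp_wq, &resume_work);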

Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:50 +01:00
Johan Hovold
23133e0470 dmaengine: ti: k3-udma: fix device leak on udma lookup
commit 430f7803b6 upstream.

Make sure to drop the reference taken when looking up the UDMA platform
device.

Note that holding a reference to a platform device does not prevent its
driver data from going away so there is no point in keeping the
reference after the lookup helper returns.
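
The fixed lookup pattern, simplified (this is the general shape used
across this series, not a verbatim hunk):

  pdev = of_find_device_by_node(np);      /* takes a device reference */
  if (!pdev)
          return ERR_PTR(-EPROBE_DEFER);

  ud = platform_get_drvdata(pdev);
  put_device(&pdev->dev);                 /* drop it right away */

  if (!ud)
          return ERR_PTR(-EPROBE_DEFER);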

Fixes: d702419134 ("dmaengine: ti: k3-udma: Add glue layer for non DMAengine users")
Fixes: 1438cde8fe ("dmaengine: ti: k3-udma: add missing put_device() call in of_xudma_dev_get()")
Cc: stable@vger.kernel.org	# 5.6: 1438cde8fe
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20251117161258.10679-17-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:50 +01:00
Johan Hovold
f810132e82 dmaengine: ti: dma-crossbar: fix device leak on am335x route allocation
commit 4fc17b1c6d upstream.

Make sure to drop the reference taken when looking up the crossbar
platform device during am335x route allocation.

Fixes: 42dbdcc6bf ("dmaengine: ti-dma-crossbar: Add support for crossbar on AM33xx/AM43xx")
Cc: stable@vger.kernel.org	# 4.4
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20251117161258.10679-15-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:49 +01:00
Johan Hovold
e50b9bf91d dmaengine: ti: dma-crossbar: fix device leak on dra7x route allocation
commit dc7e44db01 upstream.

Make sure to drop the reference taken when looking up the crossbar
platform device during dra7x route allocation.

Note that commit 615a4bfc42 ("dmaengine: ti: Add missing put_device in
ti_dra7_xbar_route_allocate") fixed the leak in the error paths but the
reference is still leaking on successful allocation.

Fixes: a074ae38f8 ("dmaengine: Add driver for TI DMA crossbar on DRA7x")
Fixes: 615a4bfc42 ("dmaengine: ti: Add missing put_device in ti_dra7_xbar_route_allocate")
Cc: stable@vger.kernel.org	# 4.2: 615a4bfc42
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20251117161258.10679-14-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:49 +01:00
Johan Hovold
f45cafe3b6 dmaengine: stm32: dmamux: fix OF node leak on route allocation failure
commit b1b590a590 upstream.

Make sure to drop the reference taken to the DMA master OF node also on
late route allocation failures.

Fixes: df7e762db5 ("dmaengine: Add STM32 DMAMUX driver")
Cc: stable@vger.kernel.org      # 4.15
Cc: Pierre-Yves MORDRET <pierre-yves.mordret@foss.st.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://patch.msgid.link/20251117161258.10679-12-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:49 +01:00
Johan Hovold
2fb10259d4 dmaengine: stm32: dmamux: fix device leak on route allocation
commit dd6e494388 upstream.

Make sure to drop the reference taken when looking up the DMA mux
platform device during route allocation.

Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.

Fixes: df7e762db5 ("dmaengine: Add STM32 DMAMUX driver")
Cc: stable@vger.kernel.org	# 4.15
Cc: Pierre-Yves MORDRET <pierre-yves.mordret@foss.st.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Amelie Delaunay <amelie.delaunay@foss.st.com>
Link: https://patch.msgid.link/20251117161258.10679-11-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:49 +01:00
Biju Das
9969db4816 dmaengine: sh: rz-dmac: Fix rz_dmac_terminate_all()
commit 747213b08a upstream.

After audio full-duplex testing, playing back the recorded file reveals a
few playback frames from the previous run. The rz_dmac_terminate_all() does
not reset all the hardware descriptors queued previously, leading to the
wrong descriptor being picked up during the next DMA transfer. Fix the
above issue by resetting all the descriptor headers for a channel in
rz_dmac_terminate_all() as rz_dmac_lmdesc_recycle() points to the proper
descriptor header filled by the rz_dmac_prepare_descs_for_slave_sg().

Cc: stable@kernel.org
Fixes: 5000d37042 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC")
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20251113195052.564338-1-biju.das.jz@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:49 +01:00
Miaoqian Lin
01b1d78139 dmaengine: qcom: gpi: Fix memory leak in gpi_peripheral_config()
commit 3f747004bb upstream.

Fix a memory leak in gpi_peripheral_config() where the original memory
pointed to by gchan->config could be lost if krealloc() fails.

The issue occurs when:
1. gchan->config points to previously allocated memory
2. krealloc() fails and returns NULL
3. The function directly assigns NULL to gchan->config, losing the
   reference to the original memory
4. The original memory becomes unreachable and cannot be freed

Fix this by using a temporary variable to hold the krealloc() result
and only updating gchan->config when the allocation succeeds.
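
The fix amounts to (field names abbreviated; the GFP flag follows the
existing call site from memory):

  void *new_cfg;

  new_cfg = krealloc(gchan->config, cfg_len, GFP_NOWAIT);
  if (!new_cfg)
          return -ENOMEM;         /* gchan->config still owns the old buffer */

  gchan->config = new_cfg;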

Found via static analysis and code review.

Fixes: 5d0c3533a1 ("dmaengine: qcom: Add GPI dma driver")
Cc: stable@vger.kernel.org
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Link: https://patch.msgid.link/20251029123421.91973-1-linmq006@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:49 +01:00
Johan Hovold
618a822991 dmaengine: lpc32xx-dmamux: fix device leak on route allocation
commit d9847e6d1d upstream.

Make sure to drop the reference taken when looking up the DMA mux
platform device during route allocation.

Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.

Fixes: 5d318b5959 ("dmaengine: Add dma router for pl08x in LPC32XX SoC")
Cc: stable@vger.kernel.org	# 6.12
Cc: Piotr Wojtaszczyk <piotr.wojtaszczyk@timesys.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Link: https://patch.msgid.link/20251117161258.10679-9-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:49 +01:00
Johan Hovold
992eb8055a dmaengine: lpc18xx-dmamux: fix device leak on route allocation
commit d4d63059de upstream.

Make sure to drop the reference taken when looking up the DMA mux
platform device during route allocation.

Note that holding a reference to a device does not prevent its driver
data from going away so there is no point in keeping the reference.

Fixes: e5f4ae84be ("dmaengine: add driver for lpc18xx dmamux")
Cc: stable@vger.kernel.org	# 4.3
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Link: https://patch.msgid.link/20251117161258.10679-8-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:49 +01:00
Johan Hovold
0c97ff108f dmaengine: idxd: fix device leaks on compat bind and unbind
commit 799900f017 upstream.

Make sure to drop the reference taken when looking up the idxd device as
part of the compat bind and unbind sysfs interface.

Fixes: 6e7f3ee97b ("dmaengine: idxd: move dsa_drv support to compatible mode")
Cc: stable@vger.kernel.org	# 5.15
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20251117161258.10679-7-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:48 +01:00
Johan Hovold
8f7a391211 dmaengine: dw: dmamux: fix OF node leak on route allocation failure
commit ec25e60f9f upstream.

Make sure to drop the reference taken to the DMA master OF node also on
late route allocation failures.

Fixes: 134d9c52fc ("dmaengine: dw: dmamux: Introduce RZN1 DMA router support")
Cc: stable@vger.kernel.org	# 5.19
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://patch.msgid.link/20251117161258.10679-6-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:48 +01:00
Johan Hovold
db6f1d6d31 dmaengine: bcm-sba-raid: fix device leak on probe
commit 7c3a46ebf1 upstream.

Make sure to drop the reference taken when looking up the mailbox device
during probe on probe failures and on driver unbind.

Fixes: 743e1c8ffe ("dmaengine: Add Broadcom SBA RAID driver")
Cc: stable@vger.kernel.org	# 4.13
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20251117161258.10679-4-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:48 +01:00
Johan Hovold
6a86cf2c09 dmaengine: at_hdmac: fix device leak on of_dma_xlate()
commit b9074b2d7a upstream.

Make sure to drop the reference taken when looking up the DMA platform
device during of_dma_xlate() when releasing channel resources.

Note that commit 3832b78b3e ("dmaengine: at_hdmac: add missing
put_device() call in at_dma_xlate()") fixed the leak in a couple of
error paths but the reference is still leaking on successful allocation.

Fixes: bbe89c8e3d ("at_hdmac: move to generic DMA binding")
Fixes: 3832b78b3e ("dmaengine: at_hdmac: add missing put_device() call in at_dma_xlate()")
Cc: stable@vger.kernel.org	# 3.10: 3832b78b3e
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20251117161258.10679-2-johan@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:48 +01:00
Janne Grunau
aca18ac333 dmaengine: apple-admac: Add "apple,t8103-admac" compatible
commit 76cba1e60b upstream.

After discussion with the devicetree maintainers we agreed to not extend
lists with the generic compatible "apple,admac" anymore [1]. Use
"apple,t8103-admac" as base compatible as it is the SoC the driver and
bindings were written for.

[1]: https://lore.kernel.org/asahi/12ab93b7-1fc2-4ce0-926e-c8141cfe81bf@kernel.org/

Fixes: b127315d9a ("dmaengine: apple-admac: Add Apple ADMAC driver")
Cc: stable@vger.kernel.org
Reviewed-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: Janne Grunau <j@jannau.net>
Link: https://patch.msgid.link/20251231-apple-admac-t8103-base-compat-v1-1-ec24a3708f76@jannau.net
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:48 +01:00
Binbin Zhou
5319234215 LoongArch: dts: loongson-2k2000: Add default interrupt controller address cells
commit e65df3f77e upstream.

Add missing address-cells 0 to the Local I/O, Extend I/O and PCH-PIC
Interrupt Controller node to silence W=1 warning:

  loongson-2k2000.dtsi:364.5-49: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map:
    Missing property '#address-cells' in node /bus@10000000/interrupt-controller@10000000, using 0 as fallback

Value '0' is correct because:
1. The LIO/EIO/PCH interrupt controller does not have children,
2. The interrupt-map property (in the PCI node) consists of five components,
   and the fourth component, "parent unit address", whose size is defined by
   the '#address-cells' of the node pointed to by the interrupt-parent
   component, is not used (=0)

Cc: stable@vger.kernel.org
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:48 +01:00
Binbin Zhou
4df476a336 LoongArch: dts: loongson-2k1000: Fix i2c-gpio node names
commit 14ea5a3625 upstream.

The binding wants the node to be named "i2c-number", but those are named
"i2c-gpio-number" instead.

Thus rename those to i2c-0, i2c-1 to adhere to the binding and suppress
dtbs_check warnings.

Cc: stable@vger.kernel.org
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:48 +01:00
Binbin Zhou
9cce27181e LoongArch: dts: loongson-2k1000: Add default interrupt controller address cells
commit 81e8cb7e50 upstream.

Add missing address-cells 0 to the Local I/O interrupt controller node
to silence W=1 warning:

  loongson-2k1000.dtsi:498.5-55: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@9,0:interrupt-map:
    Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe01440, using 0 as fallback

Value '0' is correct because:
1. The Local I/O interrupt controller does not have children,
2. The interrupt-map property (in the PCI node) consists of five components,
   and the fourth component, "parent unit address", whose size is defined by
   the '#address-cells' of the node pointed to by the interrupt-parent
   component, is not used (=0)

Cc: stable@vger.kernel.org
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:48 +01:00
Binbin Zhou
94b010200a LoongArch: dts: loongson-2k0500: Add default interrupt controller address cells
commit c4461754e6 upstream.

Add missing address-cells 0 to the Local I/O and Extend I/O interrupt
controller node to silence W=1 warning:

  loongson-2k0500.dtsi:513.5-51: Warning (interrupt_map): /bus@10000000/pcie@1a000000/pcie@0,0:interrupt-map:
    Missing property '#address-cells' in node /bus@10000000/interrupt-controller@1fe11600, using 0 as fallback

Value '0' is correct because:
1. The Local I/O & Extend I/O interrupt controllers do not have children,
2. The interrupt-map property (in the PCI node) consists of five components,
   and the fourth component, "parent unit address", whose size is defined by
   the '#address-cells' of the node pointed to by the interrupt-parent
   component, is not used (=0)

Cc: stable@vger.kernel.org
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:48 +01:00
Haoxiang Li
ef4af7597f drm/vmwgfx: Fix an error return check in vmw_compat_shader_add()
commit bf72b4b7bb upstream.

In vmw_compat_shader_add(), the return value check of vmw_shader_alloc()
is incorrect. Fix the check of the returned pointer 'res'.

Found by code review and compiled on ubuntu 20.04.
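
The corrected pattern is, in outline (the allocator's argument list is
left out):

  res = vmw_shader_alloc( /* arguments unchanged */ );
  if (IS_ERR(res))
          return PTR_ERR(res);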

Fixes: 18e4a4669c ("drm/vmwgfx: Fix compat shader namespace")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://patch.msgid.link/20251224091105.1569464-1-lihaoxiang@isrc.iscas.ac.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:47 +01:00
Marek Vasut
04218cd68d drm/panel-simple: fix connector type for DataImage SCF0700C48GGU18 panel
commit 6ab3d4353b upstream.

The connector type for the DataImage SCF0700C48GGU18 panel is missing and
devm_drm_panel_bridge_add() requires connector type to be set. This leads
to a warning and a backtrace in the kernel log, and the panel does not work:
"
WARNING: CPU: 3 PID: 38 at drivers/gpu/drm/bridge/panel.c:379 devm_drm_of_get_bridge+0xac/0xb8
"
The warning is triggered by a check for a valid connector type in
devm_drm_panel_bridge_add(). If no valid connector type is set for a
panel, the warning is printed and the panel is not added. Fill in the
missing connector type to fix the warning and make the panel operational
once again.
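
Since this is a parallel-RGB panel, the fix boils down to adding a
connector_type to its panel_desc. A sketch, assuming DPI is the right type
and with the descriptor name and remaining fields abridged per the driver's
convention:

  	static const struct panel_desc dataimage_scf0700c48ggu18 = {
  		/* ... existing timing, size and bus_format fields ... */
  		.connector_type = DRM_MODE_CONNECTOR_DPI,	/* previously missing */
  	};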

Cc: stable@vger.kernel.org
Fixes: 97ceb1fb08 ("drm/panel: simple: Add support for DataImage SCF0700C48GGU18")
Signed-off-by: Marek Vasut <marex@nabladev.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20260110152750.73848-1-marex@nabladev.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:47 +01:00
Lyude Paul
6a4e619c42 drm/nouveau/disp/nv50-: Set lock_core in curs507a_prepare
commit 9e9bc6be0f upstream.

For a while, I've been seeing a strange issue where some (usually not all)
of the display DMA channels will suddenly hang, particularly when there is
a visible cursor on the screen that is being frequently updated, and
especially when said cursor happens to go between two screens. While this
brings back lovely memories of fixing Intel Skylake bugs, I would quite
like to fix it :).

It turns out the problem that's happening here is that we're managing to
reach nv50_head_flush_set() in our atomic commit path without actually
holding nv50_disp->mutex. This means that cursor updates happening in
parallel (along with any other atomic updates that need to use the core
channel) will race with each other, which eventually causes us to corrupt
the pushbuffer - leading to a plethora of GSP errors, usually:

  nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 00000218 00102680 00000004 00800003
  nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 0000021c 00040509 00000004 00000001
  nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 00000000 00000000 00000001 00000001

The reason this happens is that we normally decide whether we need to set
nv50_atom->lock_core at the end of nv50_head_atomic_check(). However,
curs507a_prepare() is called from the fb_prepare callback, which runs
after the atomic check phase. As a result, a commit can touch the core
channel and yet never grab nv50_disp->mutex.

So, fix this by making sure that we set nv50_atom->lock_core in
curs507a_prepare().
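
A minimal sketch of the shape of the fix (the function's arguments and the
path to the atomic state are assumptions, not the literal nouveau change):

  	static void
  	curs507a_prepare(struct nv50_wndw *wndw, struct nv50_head_atom *asyh,
  			 struct nv50_wndw_atom *asyw)
  	{
  		/* ... existing cursor state programming ... */

  		/*
  		 * This runs from fb_prepare, after nv50_head_atomic_check()
  		 * has already made the lock_core decision, so force it here:
  		 * the commit must take nv50_disp->mutex before touching the
  		 * core channel.
  		 */
  		nv50_atom(asyh->state.state)->lock_core = true;
  	}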

Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 1590700d94 ("drm/nouveau/kms/nv50-: split each resource type into their own source files")
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://patch.msgid.link/20251219215344.170852-2-lyude@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:47 +01:00
Haoxiang Li
819c417a30 drm/amdkfd: fix a memory leak in device_queue_manager_init()
commit 80614c5098 upstream.

If dqm->ops.initialize() fails, call deallocate_hiq_sdma_mqd()
to release the memory allocated by allocate_hiq_sdma_mqd().
Move deallocate_hiq_sdma_mqd() up in the file so that it is
visible at its new call site.
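
Roughly, the tail of device_queue_manager_init() ends up looking like this
(a sketch; the labels and surrounding code are assumptions):

  	if (allocate_hiq_sdma_mqd(dqm))
  		goto out_free;

  	if (!dqm->ops.initialize(dqm))
  		return dqm;

  	/* initialize() failed: undo allocate_hiq_sdma_mqd() first */
  	deallocate_hiq_sdma_mqd(dqm);
  out_free:
  	kfree(dqm);
  	return NULL;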

Fixes: 11614c36bc ("drm/amdkfd: Allocate MQD trunk for HIQ and SDMA")
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b7cccc8286bb9919a0952c812872da1dcfe9d390)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:47 +01:00
Mario Limonciello (AMD)
8140ac7c55 drm/amd: Clean up kfd node on surprise disconnect
commit 28695ca09d upstream.

When an eGPU is unplugged the KFD topology should also be destroyed
for that GPU. This never happens because the fini_sw callbacks never
get to run. Run them manually before calling amdgpu_device_ip_fini_early()
when a device has already been disconnected.

This location is intentionally chosen to make sure that the kfd locking
refcount doesn't get incremented unintentionally.

Cc: kent.russell@amd.com
Closes: https://community.frame.work/t/amd-egpu-on-linux/8691/33
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6a23e7b4332c10f8b56c33a9c5431b52ecff9aab)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:47 +01:00
Mario Limonciello
ae5b1d291c drm/amd/display: Bump the HDMI clock to 340MHz
commit fee5007765 upstream.

[Why]
DP-HDMI dongles can exceed bandwidth requirements on high resolution
monitors. This can lead to pruning of the high resolution modes.

HDMI 1.3 bumped the clock to 340MHz, but the display code never matched it.

[How]
Set the default to the (DVI) 165MHz limit. Once an HDMI display is
identified, update it to 340MHz.
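
As a rough illustration of the [How] above (illustrative only; the
identifiers below are invented, the real change lives in the DC link code):

  	enum {
  		DVI_MAX_TMDS_CLOCK_KHZ    = 165000,
  		HDMI13_MAX_TMDS_CLOCK_KHZ = 340000,
  	};

  	static unsigned int max_tmds_clock_khz(bool sink_is_hdmi)
  	{
  		/* start from the conservative single-link DVI limit ... */
  		unsigned int max_khz = DVI_MAX_TMDS_CLOCK_KHZ;

  		/* ... and bump it once the sink is identified as HDMI (1.3+) */
  		if (sink_is_hdmi)
  			max_khz = HDMI13_MAX_TMDS_CLOCK_KHZ;

  		return max_khz;
  	}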

Reported-by: Dianne Skoll <dianne@skoll.ca>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4780
Reviewed-by: Chris Park <chris.park@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ac1e65d8ade46c09fb184579b81acadf36dcb91e)
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:47 +01:00
Lisa Robinson
df7a49b328 LoongArch: Fix PMU counter allocation for mixed-type event groups
commit a91f86e270 upstream.

When validating a perf event group, validate_group() unconditionally
attempts to allocate hardware PMU counters for the leader, sibling
events and the new event being added.

This is incorrect for mixed-type groups. If a PERF_TYPE_SOFTWARE event
is part of the group, the current code still tries to allocate a hardware
PMU counter for it, which can wrongly consume hardware PMU resources and
cause spurious allocation failures.

Fix this by only allocating PMU counters for hardware events during group
validation, and skipping software events.

A trimmed down reproducer is as simple as this:

  #include <stdio.h>
  #include <assert.h>
  #include <unistd.h>
  #include <string.h>
  #include <sys/syscall.h>
  #include <linux/perf_event.h>

  int main (int argc, char *argv[])
  {
  	struct perf_event_attr attr = { 0 };
  	int fds[5];

  	attr.disabled = 1;
  	attr.exclude_kernel = 1;
  	attr.exclude_hv = 1;
  	attr.read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
  		PERF_FORMAT_TOTAL_TIME_RUNNING | PERF_FORMAT_ID | PERF_FORMAT_GROUP;
  	attr.size = sizeof (attr);

  	attr.type = PERF_TYPE_SOFTWARE;
  	attr.config = PERF_COUNT_SW_DUMMY;
  	fds[0] = syscall (SYS_perf_event_open, &attr, 0, -1, -1, 0);
  	assert (fds[0] >= 0);

  	attr.type = PERF_TYPE_HARDWARE;
  	attr.config = PERF_COUNT_HW_CPU_CYCLES;
  	fds[1] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
  	assert (fds[1] >= 0);

  	attr.type = PERF_TYPE_HARDWARE;
  	attr.config = PERF_COUNT_HW_INSTRUCTIONS;
  	fds[2] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
  	assert (fds[2] >= 0);

  	attr.type = PERF_TYPE_HARDWARE;
  	attr.config = PERF_COUNT_HW_BRANCH_MISSES;
  	fds[3] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
  	assert (fds[3] >= 0);

  	attr.type = PERF_TYPE_HARDWARE;
  	attr.config = PERF_COUNT_HW_CACHE_REFERENCES;
  	fds[4] = syscall (SYS_perf_event_open, &attr, 0, -1, fds[0], 0);
  	assert (fds[4] >= 0);

  	printf ("PASSED\n");

  	return 0;
  }
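
The shape of the fix in the group-validation path is roughly the following
(a sketch; only the added check is shown, the LoongArch-specific counter
allocation around it and the return convention are assumed):

  	/*
  	 * Software events never consume a hardware PMU counter, so they
  	 * must not take part in counter allocation during group validation;
  	 * treat them as always valid.
  	 */
  	if (is_software_event(event))
  		return 1;
  	/* ... only hardware events go through counter allocation ... */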

Cc: stable@vger.kernel.org
Fixes: b37042b2bb ("LoongArch: Add perf events support")
Signed-off-by: Lisa Robinson <lisa@bytefly.space>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:47 +01:00
SeongJae Park
5651c0c391 mm/damon/sysfs: cleanup attrs subdirs on context dir setup failure
commit 9814cc832b upstream.

When setup of a DAMON sysfs context directory fails after the attrs/
directory has been set up, the subdirectories of the attrs/ directory are
not cleaned up.  As a result, the DAMON sysfs interface is nearly broken
until the system reboots, and the memory for the unremoved directories is
leaked.

Clean up the directories on such failures.
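
The fix boils down to an extra cleanup call on the context-directory error
path, roughly like this (a sketch; the helper names are assumptions based on
the commit subject, not verified against the DAMON sysfs code):

  	err = damon_sysfs_attrs_add_dirs(attrs);	/* assumed helper */
  	if (err)
  		goto put_attrs_out;
  	/* ... later setup of the context directory fails ... */
  	damon_sysfs_attrs_rm_dirs(attrs);		/* newly added cleanup */
  put_attrs_out:
  	kobject_put(&attrs->kobj);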

Link: https://lkml.kernel.org/r/20251225023043.18579-3-sj@kernel.org
Fixes: c951cd3b89 ("mm/damon: implement a minimal stub for sysfs-based DAMON interface")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: chongjiapeng <jiapeng.chong@linux.alibaba.com>
Cc: <stable@vger.kernel.org> # 5.18.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:47 +01:00
Aboorva Devarajan
654fa76032 mm/page_alloc: make percpu_pagelist_high_fraction reads lock-free
commit b9efe36b5e upstream.

When page isolation loops indefinitely during memory offline, reading
/proc/sys/vm/percpu_pagelist_high_fraction blocks on pcp_batch_high_lock,
causing hung task warnings.

Make procfs reads lock-free, since percpu_pagelist_high_fraction is a
simple integer with naturally atomic reads; writers still serialize via
the mutex.

This prevents hung task warnings when reading the procfs file during
long-running memory offline operations.
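
In the sysctl handler this amounts to taking pcp_batch_high_lock only on
writes, roughly (a sketch; the handler body is abridged and details are
assumed):

  	if (!write)
  		/*
  		 * A plain int read is naturally atomic; do not block on
  		 * pcp_batch_high_lock, which can be held for a long time
  		 * during memory offline.
  		 */
  		return proc_dointvec_minmax(table, write, buffer, length, ppos);

  	mutex_lock(&pcp_batch_high_lock);
  	/* ... existing write path, unchanged ... */
  	mutex_unlock(&pcp_batch_high_lock);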

[akpm@linux-foundation.org: add comment, per Michal]
  Link: https://lkml.kernel.org/r/aS_y9AuJQFydLEXo@tiehlicka
Link: https://lkml.kernel.org/r/20251201060009.1420792-1-aboorvad@linux.ibm.com
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:46 +01:00
Pavel Butsykin
550c228cb0 mm/zswap: fix error pointer free in zswap_cpu_comp_prepare()
commit 590b13669b upstream.

crypto_alloc_acomp_node() may return ERR_PTR(), but the fail path checks
only for NULL and can pass an error pointer to crypto_free_acomp().  Use
IS_ERR_OR_NULL() to only free valid acomp instances.
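
The cleanup path then becomes (sketch; surrounding context abridged):

  	/* only free a tfm that was actually allocated, never an ERR_PTR() */
  	if (!IS_ERR_OR_NULL(acomp))
  		crypto_free_acomp(acomp);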

Link: https://lkml.kernel.org/r/20251231074638.2564302-1-pbutsykin@cloudlinux.com
Fixes: 779b9955f6 ("mm: zswap: move allocations during CPU init outside the lock")
Signed-off-by: Pavel Butsykin <pbutsykin@cloudlinux.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Acked-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:46 +01:00
Nilay Shroff
a705886ac8 nvme: fix PCIe subsystem reset controller state transition
commit 0edb475ac0 upstream.

The commit d2fe192348 (“nvme: only allow entering LIVE from CONNECTING
state”) disallows controller state transitions directly from RESETTING
to LIVE. However, the NVMe PCIe subsystem reset path relies on this
transition to recover the controller on PowerPC (PPC) systems.

On PPC systems, issuing a subsystem reset causes a temporary loss of
communication with the NVMe adapter. A subsequent PCIe MMIO read then
triggers EEH recovery, which restores the PCIe link and brings the
controller back online. For EEH recovery to proceed correctly, the
controller must transition back to the LIVE state.

Due to the changes introduced by commit d2fe192348 (“nvme: only allow
entering LIVE from CONNECTING state”), the controller can no longer
transition directly from RESETTING to LIVE. As a result, EEH recovery
exits prematurely, leaving the controller stuck in the RESETTING state.

Fix this by explicitly transitioning the controller state from RESETTING
to CONNECTING and then to LIVE. This satisfies the updated state
transition rules and allows the controller to be successfully recovered
on PPC systems following a PCIe subsystem reset.
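
In the recovery path this looks roughly like the following (a sketch; where
exactly it is placed and how errors are handled are assumptions):

  	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING) ||
  	    !nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_LIVE)) {
  		dev_warn(dev->ctrl.device,
  			 "failed to mark controller CONNECTING -> LIVE\n");
  		return;
  	}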

Cc: stable@vger.kernel.org
Fixes: d2fe192348 ("nvme: only allow entering LIVE from CONNECTING state")
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:46 +01:00
Xiaochen Shen
05cea40d33 x86/resctrl: Fix memory bandwidth counter width for Hygon
commit 7517e899e1 upstream.

The memory bandwidth calculation relies on reading the hardware counter
and measuring the delta between samples. To ensure accurate measurement,
the software reads the counter frequently enough to prevent it from
rolling over twice between reads.

The default Memory Bandwidth Monitoring (MBM) counter width is 24 bits.
Hygon CPUs provide a 32-bit wide counter, but they do not support the
MBM capability CPUID leaf (0xF.[ECX=1]:EAX) that reports the width offset
(from 24 bits).

Consequently, the kernel falls back to the 24-bit default counter width,
which causes incorrect overflow handling on Hygon CPUs.

Fix this by explicitly setting the counter width offset to 8 bits (resulting
in a 32-bit total counter width) for Hygon CPUs.
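
A sketch of the Hygon-side change (the hook and the constant name are
assumptions; the cpuinfo field is the one resctrl_cpu_detect() consumes):

  	#define MBM_CNTR_WIDTH_OFFSET_HYGON 8	/* 24 (default) + 8 = 32 bits */

  	if (cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) ||
  	    cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL))
  		c->x86_cache_mbm_width_offset = MBM_CNTR_WIDTH_OFFSET_HYGON;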

Fixes: d8df126349 ("x86/cpu/hygon: Add missing resctrl_cpu_detect() in bsp_init helper")
Signed-off-by: Xiaochen Shen <shenxiaochen@open-hieco.net>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251209062650.1536952-3-shenxiaochen@open-hieco.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-23 11:18:46 +01:00