linux

Author	SHA1	Message	Date
Nirjhar Roy (IBM)	a65fd81207	xfs: Fix xfs_grow_last_rtg() The last rtg should be able to grow when the size of the last is less than (and not equal to) sb_rgextents. xfs_growfs with realtime groups fails without this patch. The reason is that, xfs_growfs_rtg() tries to grow the last rt group even when the last rt group is at its maximal size i.e, sb_rgextents. It fails with the following messages: XFS (loop0): Internal error block >= mp->m_rsumblocks at line 253 of file fs/xfs/libxfs/xfs_rtbitmap.c. Caller xfs_rtsummary_read_buf+0x20/0x80 XFS (loop0): Corruption detected. Unmount and run xfs_repair XFS (loop0): Internal error xfs_trans_cancel at line 976 of file fs/xfs/xfs_trans.c. Caller xfs_growfs_rt_bmblock+0x402/0x450 XFS (loop0): Corruption of in-memory data (0x8) detected at xfs_trans_cancel+0x10a/0x1f0 (fs/xfs/xfs_trans.c:977). Shutting down filesystem. XFS (loop0): Please unmount the filesystem and rectify the problem(s) Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-01-13 10:40:45 +01:00
Christoph Hellwig	df7ec7226f	xfs: improve the assert at the top of xfs_log_cover Move each condition into a separate assert so that we can see which one triggered. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-01-13 10:36:23 +01:00
Christoph Hellwig	baed03efe2	xfs: fix an overly long line in xfs_rtgroup_calc_geometry Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-01-13 10:34:29 +01:00
Christoph Hellwig	e0aea42a32	xfs: mark __xfs_rtgroup_extents static __xfs_rtgroup_extents is not used outside of xfs_rtgroup.c, so mark it static. Move it and xfs_rtgroup_extents up in the file to avoid forward declarations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-01-13 10:34:29 +01:00
Nirjhar Roy (IBM)	6b2d155366	xfs: Fix the return value of xfs_rtcopy_summary() xfs_rtcopy_summary() should return the appropriate error code instead of always returning 0. The caller of this function which is xfs_growfs_rt_bmblock() is already handling the error. Fixes: `e94b53ff69` ("xfs: cache last bitmap block in realtime allocator") Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org # v6.7 Signed-off-by: Carlos Maiolino <cem@kernel.org>	2026-01-13 10:32:12 +01:00
Eric Dumazet	ffe4ccd359	net: add net.core.qdisc_max_burst In blamed commit, I added a check against the temporary queue built in __dev_xmit_skb(). Idea was to drop packets early, before any spinlock was acquired. if (unlikely(defer_count > READ_ONCE(q->limit))) { kfree_skb_reason(skb, SKB_DROP_REASON_QDISC_DROP); return NET_XMIT_DROP; } It turned out that HTB Qdisc has a zero q->limit. HTB limits packets on a per-class basis. Some of our tests became flaky. Add a new sysctl : net.core.qdisc_max_burst to control how many packets can be stored in the temporary lockless queue. Also add a new QDISC_BURST_DROP drop reason to better diagnose future issues. Thanks Neal ! Fixes: `100dfa74ca` ("net: dev_queue_xmit() llist adoption") Reported-and-bisected-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20260107104159.3669285-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2026-01-13 10:12:11 +01:00
Ludovic Desroches	9380dc33cd	drm/panel: simple: restore connector_type fallback The switch from devm_kzalloc() + drm_panel_init() to devm_drm_panel_alloc() introduced a regression. Several panel descriptors do not set connector_type. For those panels, panel_simple_probe() used to compute a connector type (currently DPI as a fallback) and pass that value to drm_panel_init(). After the conversion to devm_drm_panel_alloc(), the call unconditionally used desc->connector_type instead, ignoring the computed fallback and potentially passing DRM_MODE_CONNECTOR_Unknown, which drm_panel_bridge_add() does not allow. Move the connector_type validation / fallback logic before the devm_drm_panel_alloc() call and pass the computed connector_type to devm_drm_panel_alloc(), so panels without an explicit connector_type once again get the DPI default. Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com> Fixes: `de04bb0089` ("drm/panel/panel-simple: Use the new allocation in place of devm_kzalloc()") Cc: stable@vger.kernel.org Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://lore.kernel.org/stable/20251126-lcd_panel_connector_type_fix-v2-1-c15835d1f7cb%40microchip.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patch.msgid.link/20251218-lcd_panel_connector_type_fix-v3-1-ddcea6d8d7ef@microchip.com	2026-01-13 10:07:40 +01:00
Marek Vasut	6ab3d4353b	drm/panel-simple: fix connector type for DataImage SCF0700C48GGU18 panel The connector type for the DataImage SCF0700C48GGU18 panel is missing and devm_drm_panel_bridge_add() requires connector type to be set. This leads to a warning and a backtrace in the kernel log and panel does not work: " WARNING: CPU: 3 PID: 38 at drivers/gpu/drm/bridge/panel.c:379 devm_drm_of_get_bridge+0xac/0xb8 " The warning is triggered by a check for valid connector type in devm_drm_panel_bridge_add(). If there is no valid connector type set for a panel, the warning is printed and panel is not added. Fill in the missing connector type to fix the warning and make the panel operational once again. Cc: stable@vger.kernel.org Fixes: `97ceb1fb08` ("drm/panel: simple: Add support for DataImage SCF0700C48GGU18") Signed-off-by: Marek Vasut <marex@nabladev.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patch.msgid.link/20260110152750.73848-1-marex@nabladev.com	2026-01-13 10:06:37 +01:00
Luo Haiyang	f2edf797da	irqchip/riscv-imsic: Revert "Remove redundant irq_data lookups" Commit c475c0b71314 ("irqchip/riscv-imsic: Remove redundant irq_data lookups") leads to a NULL pointer dereference in imsic_msi_update_msg(): virtio_blk virtio1: 8/0/0 default/read/poll queues Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Current kworker/u32:2 pgtable: 4K pagesize, 48-bit VAs, pgdp=0x0000000081c33000 [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 CPU: 5 UID: 0 PID: 75 Comm: kworker/u32:2 Not tainted 6.19.0-rc4-next-20260109 #1 NONE epc : 0x0 ra : imsic_irq_set_affinity+0x110/0x130 The irq_data argument of imsic_irq_set_affinity() is associated with the imsic domain and not with the top-level MSI domain. As a consequence the code dereferences the wrong interrupt chip, which has the irq_write_msi_msg() callback not populated. Signed-off-by: Luo Haiyang <luo.haiyang@zte.com.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260113111930821RrC26avITHWSFCN0bYbgI@zte.com.cn	2026-01-13 09:51:46 +01:00
Lorenzo Bianconi	dfdf774656	net: airoha: Fix typo in airoha_ppe_setup_tc_block_cb definition Fix typo in airoha_ppe_dev_setup_tc_block_cb routine definition when CONFIG_NET_AIROHA is not enabled. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202601090517.Fj6v501r-lkp@intel.com/ Fixes: `f45fc18b6d` ("net: airoha: Add airoha_ppe_dev struct definition") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20260109-airoha_ppe_dev_setup_tc_block_cb-typo-v1-1-282e8834a9f9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-12 19:17:24 -08:00
Jijie Shao	e02f2a0f1f	net: phy: motorcomm: fix duplex setting error for phy leds fix duplex setting error for phy leds Fixes: `355b82c54c` ("net: phy: motorcomm: Add support for PHY LEDs on YT8521") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260108071409.2750607-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-12 18:01:09 -08:00
Tetsuo Handa	ec69daabe4	bpf: Fix reference count leak in bpf_prog_test_run_xdp() syzbot is reporting unregister_netdevice: waiting for sit0 to become free. Usage count = 2 problem. A debug printk() patch found that a refcount is obtained at xdp_convert_md_to_buff() from bpf_prog_test_run_xdp(). According to commit `ec94670fcb` ("bpf: Support specifying ingress via xdp_md context in BPF_PROG_TEST_RUN"), the refcount obtained by xdp_convert_md_to_buff() will be released by xdp_convert_buff_to_md(). Therefore, we can consider that the error handling path introduced by commit `1c19499825` ("bpf: introduce frags support to bpf_prog_test_run_xdp()") forgot to call xdp_convert_buff_to_md(). Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84 Fixes: `1c19499825` ("bpf: introduce frags support to bpf_prog_test_run_xdp()") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/af090e53-9d9b-4412-8acb-957733b3975c@I-love.SAKURA.ne.jp Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-01-12 16:37:40 -08:00
sheetal	f34b32745e	ASoC: tegra: Revert fix for uninitialized flat cache warning in tegra210_ahub Commit `4d4021b0bb` ("ASoC: tegra: Fix uninitialized flat cache warning in tegra210_ahub") attempted to fix the uninitialized flat cache warning that is observed for the Tegra210 AHUB driver. However, the change broke various audio tests because an -EBUSY error is returned when accessing registers from cache before they are read from hardware. Revert this change for now, until a proper fix is available. Fixes: `4d4021b0bb` ("ASoC: tegra: Fix uninitialized flat cache warning in tegra210_ahub") Signed-off-by: sheetal <sheetal@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Link: https://patch.msgid.link/20251217132524.2844499-1-sheetal@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org>	2026-01-12 20:32:03 +00:00
Linus Torvalds	b71e635fee	Merge tag 'cgroup-for-6.19-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: - Fix -Wflex-array-member-not-at-end warnings in cgroup_root * tag 'cgroup-for-6.19-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Eliminate cgrp_ancestor_storage in cgroup_root	2026-01-12 09:56:17 -10:00
Rafael J. Wysocki	8f334e3522	ACPI: PM: s2idle: Add missing checks to acpi_s2idle_begin_lps0() Commit `32ece31db4` ("ACPI: PM: s2idle: Only retrieve constraints when needed"), that attempted to avoid useless evaluation of LPS0 _DSM Function 1 in lps0_device_attach(), forgot to add checks for lps0_device_handle and sleep_no_lps0 to acpi_s2idle_begin_lps0() where they should be done before calling lpi_device_get_constraints() or lpi_device_get_constraints_amd(). Add the missing checks. Fixes: `32ece31db4` ("ACPI: PM: s2idle: Only retrieve constraints when needed") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Link: https://patch.msgid.link/2818730.mvXUDI8C0e@rafael.j.wysocki	2026-01-12 19:33:29 +01:00
Kery Qi	f93fc5d12d	net: octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback octep_vf_request_irqs() requests MSI-X queue IRQs with dev_id set to ioq_vector. If request_irq() fails part-way, the rollback loop calls free_irq() with dev_id set to 'oct', which does not match the original dev_id and may leave the irqaction registered. This can keep IRQ handlers alive while ioq_vector is later freed during unwind/teardown, leading to a use-after-free or crash when an interrupt fires. Fix the error path to free IRQs with the same ioq_vector dev_id used during request_irq(). Fixes: `1cd3b40797` ("octeon_ep_vf: add Tx/Rx processing and interrupt support") Signed-off-by: Kery Qi <qikeyu2017@gmail.com> Link: https://patch.msgid.link/20260108164256.1749-2-qikeyu2017@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-12 09:04:52 -08:00
Anna Schumaker	803e18641f	NFS: Don't immediately return directory delegations when disabled The function nfs_inode_evict_delegation() immediately and synchronously returns a delegation when called. This means we can't call it from nfs4_have_delegation(), since that function could be called under a lock. Instead we should mark the delegation for return and let the state manager handle it for us. Fixes: `b6d2a520f4` ("NFS: Add a module option to disable directory delegations") Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2026-01-12 11:50:22 -05:00
Boqun Feng	05f66cf5e7	PCI: Provide pci_free_irq_vectors() stub `473b9f3317` ("rust: pci: fix build failure when CONFIG_PCI_MSI is disabled") fixed a build error by providing Rust helpers when CONFIG_PCI_MSI is not set. However the Rust helpers rely on pci_free_irq_vectors(), which is only available when CONFIG_PCI=y. When CONFIG_PCI is not set, there is already a stub for pci_alloc_irq_vectors(). Add a similar stub for pci_free_irq_vectors(). Fixes: `473b9f3317` ("rust: pci: fix build failure when CONFIG_PCI_MSI is disabled") Reported-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Closes: https://lore.kernel.org/rust-for-linux/20251209014312.575940-1-fujita.tomonori@gmail.com/ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512220740.4Kexm4dW-lkp@intel.com/ Reported-by: Liang Jie <liangjie@lixiang.com> Closes: https://lore.kernel.org/rust-for-linux/20251222034415.1384223-1-buaajxlj@163.com/ Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Drew Fustini <fustini@kernel.org> Reviewed-by: David Gow <davidgow@google.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20251226113938.52145-1-boqun.feng@gmail.com	2026-01-12 10:45:31 -06:00
Günther Noack	6abbb8703a	landlock: Clarify documentation for the IOCTL access right Move the description of the LANDLOCK_ACCESS_FS_IOCTL_DEV access right together with the file access rights. This group of access rights applies to files (in this case device files), and they can be added to file or directory inodes using landlock_add_rule(2). The check for that works the same for all file access rights, including LANDLOCK_ACCESS_FS_IOCTL_DEV. Invoking ioctl(2) on directory FDs can not currently be restricted with Landlock. Having it grouped separately in the documentation is a remnant from earlier revisions of the LANDLOCK_ACCESS_FS_IOCTL_DEV patch set. Link: https://lore.kernel.org/all/20260108.Thaex5ruach2@digikod.net/ Signed-off-by: Günther Noack <gnoack3000@gmail.com> Link: https://lore.kernel.org/r/20260111175203.6545-2-gnoack3000@gmail.com Signed-off-by: Mickaël Salaün <mic@digikod.net>	2026-01-12 17:07:21 +01:00
Li Ming	d4026a4462	cxl/hdm: Fix potential infinite loop in __cxl_dpa_reserve() __cxl_dpa_reserve() checks whether the new resource range is included in one of the partitions of the cxl memory device. cxlds->nr_partitions represents how many partitions the cxl memory device has. If the driver cannot find a partition that includes the new resource range, the loop never terminates. [ dj: Removed incorrect fixes tag ] Fixes: `991d98f17d` ("cxl: Make cxl_dpa_alloc() DPA partition number agnostic") Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260112120526.530232-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>	2026-01-12 08:59:16 -07:00
Jiasheng Jiang	a11224a016	btrfs: fix memory leaks in create_space_info() error paths In create_space_info(), the 'space_info' object is allocated at the beginning of the function. However, there are two error paths where the function returns an error code without freeing the allocated memory: 1. When create_space_info_sub_group() fails in zoned mode. 2. When btrfs_sysfs_add_space_info_type() fails. In both cases, 'space_info' has not yet been added to the fs_info->space_info list, resulting in a memory leak. Fix this by adding an error handling label to kfree(space_info) before returning. Fixes: `2be12ef79f` ("btrfs: Separate space_info create/update") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-01-12 16:21:55 +01:00
Filipe Manana	8826807749	btrfs: invalidate pages instead of truncate after reflinking Qu reported that generic/164 often fails because the read operations get zeroes when it expects to either get all bytes with a value of 0x61 or 0x62. The issue stems from truncating the pages from the page cache instead of invalidating, as truncating can zero page contents. This zeroing is not just in case the range is not page sized (as it's commented in truncate_inode_pages_range()) but also in case we are using large folios, they need to be split and the splitting fails. Stealing Qu's comment in the thread linked below: "We can have the following case: 0 4K 8K 12K 16K \| \| \| \| \| \|<---- Extent A ----->\|<----- Extent B ------>\| The page size is still 4K, but the folio we got is 16K. Then if we remap the range for [8K, 16K), then truncate_inode_pages_range() will get the large folio 0 sized 16K, then call truncate_inode_partial_folio(). Which later calls folio_zero_range() for the [8K, 16K) range first, then tries to split the folio into smaller ones to properly drop them from the cache. But if splitting failed (e.g. racing with other operations holding the filemap lock), the partially zeroed large folio will be kept, resulting the range [8K, 16K) being zeroed meanwhile the folio is still a 16K sized large one." So instead of truncating, invalidate the page cache range with a call to filemap_invalidate_inode(), which besides not doing any zeroing also ensures that while it's invalidating folios, no new folios are added. This helps ensure that buffered reads that happen while a reflink operation is in progress always get either the whole old data (the one before the reflink) or the whole new data, which is what generic/164 expects. Link: https://lore.kernel.org/linux-btrfs/7fb9b44f-9680-4c22-a47f-6648cb109ddf@suse.com/ Reported-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-01-12 16:21:55 +01:00
Qu Wenruo	64dd1caf88	btrfs: update the Kconfig string for CONFIG_BTRFS_EXPERIMENTAL The following new features are missing: - Async checksum - Shutdown ioctl and auto-degradation - Larger block size support Which is dependent on larger folios. Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2026-01-12 16:21:55 +01:00
Andreas Gruenbacher	469d71512d	Revert "gfs2: Fix use of bio_chain" This reverts commit `8a157e0a0a`. That commit incorrectly assumed that the bio_chain() arguments were swapped in gfs2. However, gfs2 intentionally constructs bio chains so that the first bio's bi_end_io callback is invoked when all bios in the chain have completed, unlike bio chains where the last bio's callback is invoked. Fixes: `8a157e0a0a` ("gfs2: Fix use of bio_chain") Cc: stable@vger.kernel.org Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2026-01-12 14:58:32 +01:00
Rob Herring (Arm)	70d95c5d20	ASoC: dt-bindings: rockchip-spdif: Allow "port" node Add a "port" node entry for Rockchip S/PDIF binding. It's already in use and a common property for DAIs. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260108224938.1320809-1-robh@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>	2026-01-12 11:20:06 +00:00
Rob Herring (Arm)	f66e7da2a6	ASoC: dt-bindings: realtek,rt5640: Allow 7 for realtek,jack-detect-source The driver accepts and uses a value of 7 for realtek,jack-detect-source. What exactly it means isn't clear though. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260108215307.1138515-2-robh@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>	2026-01-12 11:20:05 +00:00
Rob Herring (Arm)	101b982654	ASoC: dt-bindings: realtek,rt5640: Add missing properties/node The RT5640 has an MCLK pin and several users already define a clocks entry. A 'port' node is also in use and a common node for codecs. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20260108215307.1138515-1-robh@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>	2026-01-12 11:20:04 +00:00
Ben Dooks	81d0223832	drm/i915/guc: make 'guc_hw_reg_state' static as it isn't exported The guc_hw_reg_state array is not exported, so make it static. Fixes the following sparse warning: drivers/gpu/drm/i915/i915_gpu_error.c:692:3: warning: symbol 'guc_hw_reg_state' was not declared. Should it be static? Fixes: `ba391a102e` ("drm/i915/guc: Include the GuC registers in the error state") Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/20260108201202.59250-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 701c47493328a8173996e7590733be3493af572f) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2026-01-12 13:10:36 +02:00
Bartosz Golaszewski	471e998c0e	gpiolib: remove redundant callback check The presence of the .get_direction() callback is already checked in gpiochip_get_direction(). Remove the duplicated check which also returns the wrong error code to user-space. Fixes: `e623c4303e` ("gpiolib: sanitize the return value of gpio_chip::get_direction()") Reported-by: Michael Walle <mwalle@kernel.org> Closes: https://lore.kernel.org/all/DFJAFK3DTBOZ.3G2P3A5IH34GF@kernel.org/ Link: https://lore.kernel.org/r/20260109105557.20024-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>	2026-01-12 09:35:04 +01:00
Bartosz Golaszewski	c187900187	gpio: davinci: implement .get_direction() It's strongly recommended for GPIO drivers to always implement the .get_direction() callback - even for fixed-direction controllers. GPIO core will even emit a warning if the callback is missing, when users try to read the direction of a pin. Implement .get_direction() for gpio-davinci. Reported-by: Michael Walle <mwalle@kernel.org> Closes: https://lore.kernel.org/all/DFJAFK3DTBOZ.3G2P3A5IH34GF@kernel.org/ Reviewed-by: Linus Walleij <linusw@kernel.org> Fixes: `a060b8c511` ("gpiolib: implement low-level, shared GPIO support") Tested-by: Michael Walle <mwalle@kernel.org> # on sa67 Link: https://lore.kernel.org/r/20260109130832.27326-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>	2026-01-12 09:34:26 +01:00
Linus Torvalds	0f61b1860c	Linux 6.19-rc5 v6.19-rc5	2026-01-11 17:03:14 -10:00
Linus Torvalds	7143203341	Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library fixes from Eric Biggers: - A couple more fixes for the lib/crypto KUnit tests - Fix missing MMU protection for the AES S-box * tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crypto: aes: Fix missing MMU protection for AES S-box MAINTAINERS: add test vector generation scripts to "CRYPTO LIBRARY" lib/crypto: tests: Fix syntax error for old python versions lib/crypto: tests: polyval_kunit: Increase iterations for preparekey in IRQs	2026-01-11 15:07:56 -10:00
Linus Torvalds	9c7ef209cd	Merge tag 'char-misc-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small char/misc driver fixes for some reported issues. Included in here is: - much reported rust_binder fix - counter driver fixes - new device ids for the mei driver All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: rust_binder: remove spin_lock() in rust_shrink_free_page() mei: me: add nova lake point S DID counter: 104-quad-8: Fix incorrect return value in IRQ handler counter: interrupt-cnt: Drop IRQF_NO_THREAD flag	2026-01-11 07:27:44 -10:00
Linus Torvalds	316a94cb63	Merge tag 'x86-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Ingo Molnar: "Disable GCOV instrumentation in the SEV noinstr.c collection of SEV noinstr methods, to further robustify the code" * tag 'x86-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Disable GCOV on noinstr object	2026-01-11 07:19:43 -10:00
Linus Torvalds	fac4bdbaca	Merge tag 'sched-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Ingo Molnar: "Fix a crash in sched_mm_cid_after_execve()" * tag 'sched-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/mm_cid: Prevent NULL mm dereference in sched_mm_cid_after_execve()	2026-01-11 07:11:53 -10:00
Linus Torvalds	fe948326e9	Merge tag 'perf-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fix from Ingo Molnar: "Fix perf swevent hrtimer deinit regression" * tag 'perf-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Ensure swevent hrtimer is properly destroyed	2026-01-11 06:55:27 -10:00
Janne Grunau	76cba1e60b	dmaengine: apple-admac: Add "apple,t8103-admac" compatible After discussion with the devicetree maintainers we agreed to not extend lists with the generic compatible "apple,admac" anymore [1]. Use "apple,t8103-admac" as base compatible as it is the SoC the driver and bindings were written for. [1]: https://lore.kernel.org/asahi/12ab93b7-1fc2-4ce0-926e-c8141cfe81bf@kernel.org/ Fixes: `b127315d9a` ("dmaengine: apple-admac: Add Apple ADMAC driver") Cc: stable@vger.kernel.org Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Janne Grunau <j@jannau.net> Link: https://patch.msgid.link/20251231-apple-admac-t8103-base-compat-v1-1-ec24a3708f76@jannau.net Signed-off-by: Vinod Koul <vkoul@kernel.org>	2026-01-11 22:12:49 +05:30
Haotian Zhang	2e1136acf8	dmaengine: omap-dma: fix dma_pool resource leak in error paths The dma_pool created by dma_pool_create() is not destroyed when dma_async_device_register() or of_dma_controller_register() fails, causing a resource leak in the probe error paths. Add dma_pool_destroy() in both error paths to properly release the allocated dma_pool resource. Fixes: `7bedaa5537` ("dmaengine: add OMAP DMA engine driver") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Link: https://patch.msgid.link/20251103073018.643-1-vulab@iscas.ac.cn Signed-off-by: Vinod Koul <vkoul@kernel.org>	2026-01-11 22:12:44 +05:30
Miaoqian Lin	3f747004bb	dmaengine: qcom: gpi: Fix memory leak in gpi_peripheral_config() Fix a memory leak in gpi_peripheral_config() where the original memory pointed to by gchan->config could be lost if krealloc() fails. The issue occurs when: 1. gchan->config points to previously allocated memory 2. krealloc() fails and returns NULL 3. The function directly assigns NULL to gchan->config, losing the reference to the original memory 4. The original memory becomes unreachable and cannot be freed Fix this by using a temporary variable to hold the krealloc() result and only updating gchan->config when the allocation succeeds. Found via static analysis and code review. Fixes: `5d0c3533a1` ("dmaengine: qcom: Add GPI dma driver") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://patch.msgid.link/20251029123421.91973-1-linmq006@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>	2026-01-11 22:12:38 +05:30
Linus Torvalds	88730166f3	Merge tag 'irq-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc irqchip fixes from Ingo Molnar: - Fix an endianness bug in the gic-v5 irqchip driver - Revert a broken commit from the riscv-imsic irqchip driver * tag 'irq-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "irqchip/riscv-imsic: Embed the vector array in lpriv" irqchip/gic-v5: Fix gicv5_its_map_event() ITTE read endianness	2026-01-11 06:36:20 -10:00
Thomas Gleixner	2e4b28c48f	treewide: Update email address In a vain attempt to consolidate the email zoo switch everything to the kernel.org account. Signed-off-by: Thomas Gleixner <tglx@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-01-11 06:09:11 -10:00
Cristian Ciocaltea	db8061bbb9	drm/rockchip: dw_hdmi_qp: Switch to gpiod_set_value_cansleep() Since commit `20cf2aed89` ("gpio: rockchip: mark the GPIO controller as sleeping"), the Rockchip GPIO chip operations potentially sleep, hence the kernel complains when trying to make use of the non-sleeping API: [ 16.653343] WARNING: drivers/gpio/gpiolib.c:3902 at gpiod_set_value+0xd0/0x108, CPU#5: kworker/5:1/93 ... [ 16.678470] Hardware name: Radxa ROCK 5B (DT) [ 16.682374] Workqueue: events dw_hdmi_qp_rk3588_hpd_work [rockchipdrm] ... [ 16.729314] Call trace: [ 16.731846] gpiod_set_value+0xd0/0x108 (P) [ 16.734548] dw_hdmi_qp_rockchip_encoder_enable+0xbc/0x3a8 [rockchipdrm] [ 16.737487] drm_atomic_helper_commit_encoder_bridge_enable+0x314/0x380 [drm_kms_helper] [ 16.740555] drm_atomic_helper_commit_tail_rpm+0xa4/0x100 [drm_kms_helper] [ 16.743501] commit_tail+0x1e0/0x2c0 [drm_kms_helper] [ 16.746290] drm_atomic_helper_commit+0x274/0x2b8 [drm_kms_helper] [ 16.749178] drm_atomic_commit+0x1f0/0x248 [drm] [ 16.752000] drm_client_modeset_commit_atomic+0x490/0x5d0 [drm] [ 16.754954] drm_client_modeset_commit_locked+0xf4/0x400 [drm] [ 16.757911] drm_client_modeset_commit+0x50/0x80 [drm] [ 16.760791] __drm_fb_helper_restore_fbdev_mode_unlocked+0x9c/0x170 [drm_kms_helper] [ 16.763843] drm_fb_helper_hotplug_event+0x340/0x368 [drm_kms_helper] [ 16.766780] drm_fbdev_client_hotplug+0x64/0x1d0 [drm_client_lib] [ 16.769634] drm_client_hotplug+0x178/0x240 [drm] [ 16.772455] drm_client_dev_hotplug+0x170/0x1c0 [drm] [ 16.775303] drm_connector_helper_hpd_irq_event+0xa4/0x178 [drm_kms_helper] [ 16.778248] dw_hdmi_qp_rk3588_hpd_work+0x44/0xb8 [rockchipdrm] [ 16.781080] process_one_work+0xc3c/0x1658 [ 16.783719] worker_thread+0xa24/0xc40 [ 16.786333] kthread+0x3b4/0x3d8 [ 16.788889] ret_from_fork+0x10/0x20 Since gpiod_set_value() is called from a context that can sleep, switch to its *_cansleep() variant and get rid of the issue. Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patch.msgid.link/20260110-dw-hdmi-qp-cansleep-v1-1-1ce937c5b201@collabora.com	2026-01-11 14:36:21 +01:00
Linus Torvalds	755bc1335e	Merge tag 'riscv-for-linus-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: "Notable changes include a fix to close one common microarchitectural attack vector for out-of-order cores. Another patch exposed an omission in my boot test coverage, which is currently missing relocatable kernels. Otherwise, the fixes seem to be settling down for us. - Fix CONFIG_RELOCATABLE=y boots by building Image files from vmlinux, rather than vmlinux.unstripped, now that the .modinfo section is included in vmlinux.unstripped - Prevent branch predictor poisoning microarchitectural attacks that use the syscall index as a vector by using array_index_nospec() to clamp the index after the bounds check (as x86 and ARM64 already do) - Fix a crash in test_kprobes when building with Clang - Fix a deadlock possible when tracing is enabled for SBI ecalls - Fix the definition of the Zk standard RISC-V ISA extension bundle, which was missing the Zknh extension - A few other miscellaneous non-functional cleanups, removing unused macros, fixing an out-of-date path in code comments, resolving a compile-time warning for a type mismatch in a pr_crit(), and removing an unnecessary header file inclusion" * tag 'riscv-for-linus-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: trace: fix snapshot deadlock with sbi ecall riscv: remove irqflags.h inclusion in asm/bitops.h riscv: cpu_ops_sbi: smp_processor_id() returns int, not unsigned int riscv: configs: Clean up references to non-existing configs riscv: kexec_image: Fix dead link to boot-image-header.rst riscv: pgtable: Cleanup useless VA_USER_XXX definitions riscv: cpufeature: Fix Zk bundled extension missing Zknh riscv: fix KUnit test_kprobes crash when building with Clang riscv: Sanitize syscall table indexing under speculation riscv: boot: Always make Image from vmlinux, not vmlinux.unstripped	2026-01-10 15:54:41 -10:00
Linus Torvalds	0fa27899e0	Merge tag 'driver-core-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core fixes from Danilo Krummrich: - Fix swapped example values for the `family` and `machine` attributes in the sysfs SoC bus ABI documentation - Fix Rust build and intra-doc issues when optional subsystems (CONFIG_PCI, CONFIG_AUXILIARY_BUS, CONFIG_PRINTK) are disabled - Fix typos and incorrect safety comments in Rust PCI, DMA, and device ID documentation * tag 'driver-core-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: rust: device: Remove explicit import of CStrExt rust: pci: fix typos in Bar struct's comments rust: device: fix broken intra-doc links rust: dma: fix broken intra-doc links rust: driver: fix broken intra-doc links to example driver types rust: device_id: replace incorrect word in safety documentation rust: dma: remove incorrect safety documentation docs: ABI: sysfs-devices-soc: Fix swapped sample values	2026-01-10 15:04:04 -10:00
Linus Torvalds	b061fcffe3	Merge tag 'linux_kselftest-fixes-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fix from Shuah Khan: "Fix tracing test_multiple_writes stalls when buffer_size_kb is less than 12KB" * tag 'linux_kselftest-fixes-6.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/tracing: Fix test_multiple_writes stall	2026-01-10 14:57:55 -10:00
Jakub Kicinski	16ce6e6fa9	Merge branch 'mlx5e-profile-change-fix' Saeed Mahameed says: ==================== mlx5e profile change fix This series fixes a crash in mlx5e due to profile change error flow. ==================== Link: https://patch.msgid.link/20260108212657.25090-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-10 15:21:14 -08:00
Saeed Mahameed	5629f8859d	net/mlx5e: Restore destroying state bit after profile cleanup Profile rollback can fail in mlx5e_netdev_change_profile() and we will end up with invalid mlx5e_priv memset to 0, we must maintain the 'destroying' bit in order to gracefully shutdown even if the profile/priv are not valid. This patch maintains the previous state of the 'destroying' state of mlx5e_priv after priv cleanup, to allow the remove flow to cleanup common resources from mlx5_core to avoid FW fatal errors as seen below: $ devlink dev eswitch set pci/0000:00:03.0 mode switchdev Error: mlx5_core: Failed setting eswitch to offloads. dmesg: mlx5_core 0000:00:03.0 enp0s3np0: failed to rollback to orig profile, ... $ devlink dev reload pci/0000:00:03.0 mlx5_core 0000:00:03.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:00:03.0: poll_health:803:(pid 519): Fatal error 3 detected mlx5_core 0000:00:03.0: firmware version: 28.41.1000 mlx5_core 0000:00:03.0: 0.000 Gb/s available PCIe bandwidth (Unknown x255 link) mlx5_core 0000:00:03.0: mlx5_function_enable:1200:(pid 519): enable hca failed mlx5_core 0000:00:03.0: mlx5_function_enable:1200:(pid 519): enable hca failed mlx5_core 0000:00:03.0: mlx5_health_try_recover:340:(pid 141): handling bad device here mlx5_core 0000:00:03.0: mlx5_handle_bad_state:285:(pid 141): Expected to see disabled NIC but it is full driver mlx5_core 0000:00:03.0: mlx5_error_sw_reset:236:(pid 141): start mlx5_core 0000:00:03.0: NIC IFC still 0 after 4000ms. Fixes: `c4d7eb5768` ("net/mxl5e: Add change profile method") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260108212657.25090-5-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-10 15:21:11 -08:00
Saeed Mahameed	4ef8512e14	net/mlx5e: Pass netdev to mlx5e_destroy_netdev instead of priv mlx5e_priv is an unstable structure that can be memset(0) if profile attaching fails. Pass netdev to mlx5e_destroy_netdev() to guarantee it will work on a valid netdev. On mlx5e_remove: Check validity of priv->profile, before attempting to cleanup any resources that might be not there. This fixes a kernel oops in mlx5e_remove when switchdev mode fails due to change profile failure. $ devlink dev eswitch set pci/0000:00:03.0 mode switchdev Error: mlx5_core: Failed setting eswitch to offloads. dmesg: workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12 workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12 $ devlink dev reload pci/0000:00:03.0 ==> oops BUG: kernel NULL pointer dereference, address: 0000000000000370 PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 15 UID: 0 PID: 520 Comm: devlink Not tainted 6.18.0-rc5+ #115 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:mlx5e_dcbnl_dscp_app+0x23/0x100 RSP: 0018:ffffc9000083f8b8 EFLAGS: 00010286 RAX: ffff8881126fc380 RBX: ffff8881015ac400 RCX: ffffffff826ffc45 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8881035109c0 RBP: ffff8881035109c0 R08: ffff888101e3e838 R09: ffff888100264e10 R10: ffffc9000083f898 R11: ffffc9000083f8a0 R12: ffff888101b921a0 R13: ffff888101b921a0 R14: ffff8881015ac9a0 R15: ffff8881015ac400 FS: 00007f789a3c8740(0000) GS:ffff88856aa59000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 
0000000000000370 CR3: 000000010b6c0001 CR4: 0000000000370ef0 Call Trace: <TASK> mlx5e_remove+0x57/0x110 device_release_driver_internal+0x19c/0x200 bus_remove_device+0xc6/0x130 device_del+0x160/0x3d0 ? devl_param_driverinit_value_get+0x2d/0x90 mlx5_detach_device+0x89/0xe0 mlx5_unload_one_devl_locked+0x3a/0x70 mlx5_devlink_reload_down+0xc8/0x220 devlink_reload+0x7d/0x260 devlink_nl_reload_doit+0x45b/0x5a0 genl_family_rcv_msg_doit+0xe8/0x140 Fixes: `c4d7eb5768` ("net/mxl5e: Add change profile method") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260108212657.25090-4-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-10 15:21:10 -08:00
Saeed Mahameed	123eda2e5b	net/mlx5e: Don't store mlx5e_priv in mlx5e_dev devlink priv mlx5e_priv is an unstable structure that can be memset(0) if profile attaching fails, mlx5e_priv in mlx5e_dev devlink private is used to reference the netdev and mdev associated with that struct. Instead, store netdev directly into mlx5e_dev and get mdev from the containing mlx5_adev aux device structure. This fixes a kernel oops in mlx5e_remove when switchdev mode fails due to change profile failure. $ devlink dev eswitch set pci/0000:00:03.0 mode switchdev Error: mlx5_core: Failed setting eswitch to offloads. dmesg: workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12 workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12 $ devlink dev reload pci/0000:00:03.0 ==> oops BUG: kernel NULL pointer dereference, address: 0000000000000520 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 3 UID: 0 PID: 521 Comm: devlink Not tainted 6.18.0-rc5+ #117 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:mlx5e_remove+0x68/0x130 RSP: 0018:ffffc900034838f0 EFLAGS: 00010246 RAX: ffff88810283c380 RBX: ffff888101874400 RCX: ffffffff826ffc45 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000 RBP: ffff888102d789c0 R08: ffff8881007137f0 R09: ffff888100264e10 R10: ffffc90003483898 R11: ffffc900034838a0 R12: ffff888100d261a0 R13: ffff888100d261a0 R14: ffff8881018749a0 R15: ffff888101874400 FS: 00007f8565fea740(0000) 
GS:ffff88856a759000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000520 CR3: 000000010b11a004 CR4: 0000000000370ef0 Call Trace: <TASK> device_release_driver_internal+0x19c/0x200 bus_remove_device+0xc6/0x130 device_del+0x160/0x3d0 ? devl_param_driverinit_value_get+0x2d/0x90 mlx5_detach_device+0x89/0xe0 mlx5_unload_one_devl_locked+0x3a/0x70 mlx5_devlink_reload_down+0xc8/0x220 devlink_reload+0x7d/0x260 devlink_nl_reload_doit+0x45b/0x5a0 genl_family_rcv_msg_doit+0xe8/0x140 Fixes: `ee75f1fc44` ("net/mlx5e: Create separate devlink instance for ethernet auxiliary device") Fixes: `c4d7eb5768` ("net/mxl5e: Add change profile method") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Link: https://patch.msgid.link/20260108212657.25090-3-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-10 15:21:10 -08:00
Saeed Mahameed	4dadc4077e	net/mlx5e: Fix crash on profile change rollback failure mlx5e_netdev_change_profile can fail to attach a new profile and can fail to rollback to old profile, in such case, we could end up with a dangling netdev with a fully reset netdev_priv. A retry to change profile, e.g. another attempt to call mlx5e_netdev_change_profile via switchdev mode change, will crash trying to access the now NULL priv->mdev. This fix allows mlx5e_netdev_change_profile() to handle previous failures and an empty priv, by not assuming priv is valid. Pass netdev and mdev to all flows requiring mlx5e_netdev_change_profile() and avoid passing priv. In mlx5e_netdev_change_profile() check if current priv is valid, and if not, just attach the new profile without trying to access the old one. This fixes the following oops, when enabling switchdev mode for the 2nd time after first time failure: ## Enabling switchdev mode first time: mlx5_core 0012:03:00.1: E-Switch: Supported tc chains and prios offload workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: new profile init failed, -12 workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR mlx5_core 0012:03:00.1: mlx5e_netdev_init_profile:6214:(pid 37199): mlx5e_priv_init failed, err=-12 mlx5_core 0012:03:00.1 gpu3rdma1: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12 ^^^^^^^^ mlx5_core 0000:00:03.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) ## retry: Enabling switchdev mode 2nd time: mlx5_core 0000:00:03.0: E-Switch: Supported tc chains and prios offload BUG: kernel NULL pointer dereference, address: 0000000000000038 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 13 UID: 0 PID: 520 Comm: devlink Not 
tainted 6.18.0-rc4+ #91 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:mlx5e_detach_netdev+0x3c/0x90 Code: 50 00 00 f0 80 4f 78 02 48 8b bf e8 07 00 00 48 85 ff 74 16 48 8b 73 78 48 d1 ee 83 e6 01 83 f6 01 40 0f b6 f6 e8 c4 42 00 00 <48> 8b 45 38 48 85 c0 74 08 48 89 df e8 cc 47 40 1e 48 8b bb f0 07 RSP: 0018:ffffc90000673890 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8881036a89c0 RCX: 0000000000000000 RDX: ffff888113f63800 RSI: ffffffff822fe720 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000002dcd R09: 0000000000000000 R10: ffffc900006738e8 R11: 00000000ffffffff R12: 0000000000000000 R13: 0000000000000000 R14: ffff8881036a89c0 R15: 0000000000000000 FS: 00007fdfb8384740(0000) GS:ffff88856a9d6000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000038 CR3: 0000000112ae0005 CR4: 0000000000370ef0 Call Trace: <TASK> mlx5e_netdev_change_profile+0x45/0xb0 mlx5e_vport_rep_load+0x27b/0x2d0 mlx5_esw_offloads_rep_load+0x72/0xf0 esw_offloads_enable+0x5d0/0x970 mlx5_eswitch_enable_locked+0x349/0x430 ? is_mp_supported+0x57/0xb0 mlx5_devlink_eswitch_mode_set+0x26b/0x430 devlink_nl_eswitch_set_doit+0x6f/0xf0 genl_family_rcv_msg_doit+0xe8/0x140 genl_rcv_msg+0x18b/0x290 ? __pfx_devlink_nl_pre_doit+0x10/0x10 ? __pfx_devlink_nl_eswitch_set_doit+0x10/0x10 ? __pfx_devlink_nl_post_doit+0x10/0x10 ? __pfx_genl_rcv_msg+0x10/0x10 netlink_rcv_skb+0x52/0x100 genl_rcv+0x28/0x40 netlink_unicast+0x282/0x3e0 ? __alloc_skb+0xd6/0x190 netlink_sendmsg+0x1f7/0x430 __sys_sendto+0x213/0x220 ? 
__sys_recvmsg+0x6a/0xd0 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x50/0x1f0 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fdfb8495047 Fixes: `c4d7eb5768` ("net/mxl5e: Add change profile method") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260108212657.25090-2-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2026-01-10 15:21:09 -08:00

1413527 Commits