Skip processing hugepage kernel arguments (hugepagesz, hugepages, and
default_hugepagesz) when hugepages are not supported by the architecture.
Some architectures may need to disable hugepages based on conditions
discovered during kernel boot. The hugepages_supported() helper allows
architecture code to advertise whether hugepages are supported.
Currently, normal hugepage allocation is guarded by hugepages_supported(),
but gigantic hugepages are allocated regardless of this check. This
causes problems on powerpc for fadump (firmware-assisted dump).
In the fadump scenario, a production kernel crash
causes the system to boot into a special kernel whose sole purpose is to
collect the memory dump and reboot. Features such as hugepages are not
required in this environment and should be disabled.
For example, when the fadump kernel boots with the following kernel
arguments:
    default_hugepagesz=1GB hugepagesz=1GB hugepages=200
Before this patch, the kernel prints the following logs:
    HugeTLB: allocating 200 of page size 1.00 GiB failed. Only allocated 58 hugepages.
    HugeTLB support is disabled!
    HugeTLB: huge pages not supported, ignoring associated command-line parameters
    hugetlbfs: disabling because there are no supported hugepage sizes
Even though the logs state that HugeTLB support is disabled, gigantic
hugepages are still allocated. This causes the fadump kernel to run out
of memory during boot.
After this patch is applied, the kernel prints the following logs for
the same command line:
    HugeTLB: hugepages unsupported, ignoring default_hugepagesz=1GB cmdline
    HugeTLB: hugepages unsupported, ignoring hugepagesz=1GB cmdline
    HugeTLB: hugepages unsupported, ignoring hugepages=200 cmdline
    HugeTLB support is disabled!
    hugetlbfs: disabling because there are no supported hugepage sizes
To fix the issue, gigantic hugepage allocation should be guarded by
hugepages_supported().
Previously, two approaches were proposed to bring gigantic hugepage
allocation under hugepages_supported():
[1] Check hugepages_supported() in the generic code before allocating
gigantic hugepages
[2] Make arch_hugetlb_valid_size() return false for all hugetlb sizes
Approach [2] has two minor issues:
1. It prints misleading logs about invalid hugepage sizes
2. The kernel still processes hugepage kernel arguments unnecessarily
To control gigantic hugepage allocation, skip processing hugepage kernel
arguments (default_hugepagesz, hugepagesz and hugepages) when
hugepages_supported() returns false.
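The guard can be pictured with a small userspace model. This is only a sketch under stated assumptions: hugepages_supported_model(), handle_hugepage_arg() and requested_pages are hypothetical names, not the functions in mm/hugetlb.c. It shows the idea that a handler returns before recording anything when the architecture reports no support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the architecture's hugepages_supported(). */
static bool arch_supports_hugepages;

/* Number of pages the modelled boot code would later try to allocate. */
static unsigned long requested_pages;

static bool hugepages_supported_model(void)
{
	return arch_supports_hugepages;
}

/*
 * Model of a hugepage command-line handler: the value is parsed only if
 * hugepages are supported, otherwise the argument is logged and ignored.
 * Returns true if the argument was consumed.
 */
static bool handle_hugepage_arg(const char *name, const char *value)
{
	if (!hugepages_supported_model()) {
		printf("HugeTLB: hugepages unsupported, ignoring %s=%s cmdline\n",
		       name, value);
		return false;
	}
	requested_pages = strtoul(value, NULL, 10);
	return true;
}
```

Because nothing is recorded in the unsupported case, no later allocation step has a request to act on.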
Note for backporting: This fix is a partial reversion of the commit
mentioned in the Fixes tag and is only valid once the change referenced by
the Depends-on tag is present. When backporting this patch, the commit
mentioned in the Depends-on tag must be included first.
Link: https://lore.kernel.org/all/20250121150419.1342794-1-sourabhjain@linux.ibm.com/ [1]
Link: https://lore.kernel.org/all/20250128043358.163372-1-sourabhjain@linux.ibm.com/ [2]
Link: https://lkml.kernel.org/r/20251224115524.1272010-1-sourabhjain@linux.ibm.com
Fixes: c2833a5bf7 ("hugetlbfs: fix changes to command line processing")
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Depends-on: 2354ad252b ("powerpc/mm: Update default hugetlb size early")
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If the previous kernel enabled KHO but did not call kho_finalize() (e.g.,
CONFIG_LIVEUPDATE=n or userspace skipped the finalization step), the
'preserved-memory-map' property in the FDT remains empty/zero.
Previously, kho_populate() would succeed regardless of the memory map's
state, reserving the incoming scratch regions in memblock. However,
kho_memory_init() would later fail to deserialize the empty map. By that
time, the scratch regions were already registered, leading to partial
initialization and subsequent list corruption (freeing scratch area twice)
during kho_init().
Move the validation of the preserved memory map earlier into
kho_populate(). If the memory map is empty/NULL:
1. Abort kho_populate() immediately with -ENOENT.
2. Do not register or reserve the incoming scratch memory, allowing the new
kernel to reclaim those pages as standard free memory.
3. Leave the global 'kho_in' state uninitialized.
Consequently, kho_memory_init() sees no active KHO context
(kho_in.mem_chunks_phys is 0) and falls back to kho_reserve_scratch(),
allocating fresh scratch memory as if it were a standard cold boot.
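The resulting control flow can be modelled in userspace as follows. This is a sketch, not the kernel code: struct kho_in_model, kho_populate_model(), kho_memory_init_model() and the two flags are hypothetical simplifications of the state described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified model of the incoming KHO state. */
struct kho_in_model {
	uint64_t mem_chunks_phys;	/* 0 means "no active KHO context" */
	bool scratch_reserved;		/* incoming scratch registered */
};

static struct kho_in_model kho_in;
static bool fresh_scratch_allocated;

/* Model of kho_populate(): validate the memory map before touching state. */
static int kho_populate_model(uint64_t preserved_memory_map)
{
	if (!preserved_memory_map)
		return -ENOENT;	/* nothing reserved, kho_in left untouched */

	kho_in.mem_chunks_phys = preserved_memory_map;
	kho_in.scratch_reserved = true;
	return 0;
}

/* Model of kho_memory_init(): cold-boot path when there is no KHO context. */
static void kho_memory_init_model(void)
{
	if (!kho_in.mem_chunks_phys)
		fresh_scratch_allocated = true;	/* kho_reserve_scratch() stand-in */
}
```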
Link: https://lkml.kernel.org/r/20251223140140.2090337-1-pasha.tatashin@soleen.com
Fixes: de51999e68 ("kho: allow memory preservation state updates after finalization")
Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reported-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Closes: https://lore.kernel.org/all/20251218215613.GA17304@ranerica-svr.sc.intel.com
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Merge patches that Pengutronix have been carrying in their tree for a
while and were upstreamed by Sascha Hauer together with some new
features that are going into the next release.
Remove the newly introduced zoned statistics from sysfs. sysfs can only
show a single page, so the output would be truncated on a busy
filesystem.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[Why&How]
Right now, the HDMI HPD filter is enabled by default at 1500ms.
We want to disable it by default, as most modern displays with HDMI do
not require it for DPMS mode.
The HPD filter can instead be enabled through a driver parameter with a
custom delay value in ms (up to 5000ms).
Fixes: c918e75e1e ("drm/amd/display: Add an HPD filter for HDMI")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4859
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6a681cd9034587fe3550868bacfbd639d1c6891f)
The user mode queue keeps a pointer to the most recent fence in
userq->last_fence. This pointer holds an extra dma_fence reference.
When the queue is destroyed, we free the fence driver and its xarray,
but we forgot to drop the last_fence reference.
Because of the missing dma_fence_put(), the last fence object can stay
alive when the driver unloads. This leaves an allocated object in the
amdgpu_userq_fence slab cache, which is visible during driver unload as:
  BUG amdgpu_userq_fence: Objects remaining on __kmem_cache_shutdown()
  kmem_cache_destroy amdgpu_userq_fence: Slab cache still has objects
  Call Trace:
    kmem_cache_destroy
    amdgpu_userq_fence_slab_fini
    amdgpu_exit
    __do_sys_delete_module
Fix this by putting userq->last_fence and clearing the pointer during
amdgpu_userq_fence_driver_free().
This makes sure the fence reference is released and the slab cache is
empty when the module exits.
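The reference handling can be illustrated with a userspace model. All names here (fence_model, userq_model, fence_put_model(), live_objects) are hypothetical stand-ins and not the amdgpu or dma_fence API. The sketch only shows why dropping the extra reference at teardown leaves the modelled cache empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for a dma_fence. */
struct fence_model {
	int refcount;
};

/* Hypothetical queue that holds an extra reference in last_fence. */
struct userq_model {
	struct fence_model *last_fence;
};

static int live_objects;	/* objects the modelled slab cache still holds */

/* Drop one reference; the object leaves the cache on the final put. */
static void fence_put_model(struct fence_model *f)
{
	if (f && --f->refcount == 0)
		live_objects--;
}

/* Teardown as described above: drop the reference, then clear the pointer. */
static void userq_fence_driver_free_model(struct userq_model *q)
{
	fence_put_model(q->last_fence);
	q->last_fence = NULL;
}
```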
v2: Update to only release userq->last_fence with dma_fence_put()
(Christian)
Fixes: edc762a51c ("drm/amdgpu/userq: move some code around")
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8e051e38a8d45caf6a866d4ff842105b577953bb)
Each queue of the process is removed individually, so there is no need
to suspend the whole MES. Suspending MES also stops kernel mode queues,
causing unnecessary timeouts when running mixed workloads.
Fixes: 079ae5118e ("drm/amdkfd: fix suspend/resume all calls in mes based eviction path")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4765
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3fd20580b96a6e9da65b94ac3b58ee288239b731)
This reverts commit 820b3d376e8a102c6aeab737ec6edebbbb710e04.
It’s better to validate VM TLB flushes in the flush‑TLB backend
rather than in the generic VM layer.
Reverting this patch depends on
commit fa7c231fc2b0 ("drm/amdgpu: validate the flush_gpu_tlb_pasid()")
being present in the tree.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9163fe4d790fb4e16d6b0e23f55b43cddd3d4a65)
Validate flush_gpu_tlb_pasid() availability before flushing tlb.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f4db9913e4d3dabe9ff3ea6178f2c1bc286012b8)
Resolve the issue of incorrect type definitions potentially causing calculation errors.
Fixes: 54f7f3ca98 ("drm/amdgpu/swm14: Update power limit logic")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e3a03d0ae16d6b56e893cce8e52b44140e1ed985)
Internal backlight levels are initialised from ACPI but the values
are sometimes out of sync with the levels in effect until there has
been a read from hardware (e.g., triggered by reading from sysfs).
This means that the first drm_commit can cause the levels to be set
to a different value than the actual starting one, which results in
a sudden change in brightness.
This path shows the problem (when the values are out of sync):
  amdgpu_dm_atomic_commit_tail()
    -> amdgpu_dm_commit_streams()
      -> amdgpu_dm_backlight_set_level(..., dm->brightness[n])
This patch calls the backlight ops get_brightness explicitly
at the end of backlight registration to make sure dm->brightness[n]
is in sync with the actual hardware levels.
Fixes: 2fe87f54ab ("drm/amd/display: Set default brightness according to ACPI")
Signed-off-by: Vivek Das Mohapatra <vivek@collabora.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 318b1c36d82a0cd2b06a4bb43272fa6f1bc8adc1)
Cc: stable@vger.kernel.org
[Why]
The PSR message was moved in commit 4321742c39 ("drm/amd/display:
Move PSR support message into amdgpu_dm"). This message however shows
for every single link without showing which link is which. This can
send a confusing message to the user.
[How]
Add link name into the message.
Fixes: 4321742c39 ("drm/amd/display: Move PSR support message into amdgpu_dm")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 99f77f6229c0766b980ae05affcf9f742d97de6a)
If dqm->ops.initialize() fails, call deallocate_hiq_sdma_mqd()
to release the memory allocated by allocate_hiq_sdma_mqd().
Move deallocate_hiq_sdma_mqd() up to ensure proper function
visibility at the point of use.
Fixes: 11614c36bc ("drm/amdkfd: Allocate MQD trunk for HIQ and SDMA")
Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b7cccc8286bb9919a0952c812872da1dcfe9d390)
Cc: stable@vger.kernel.org
These IOCTLs shouldn't be called when userqs are not
enabled. Make sure they are enabled before executing
the IOCTLs.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d967509651601cddce7ff2a9f09479f3636f684d)
Cc: stable@vger.kernel.org
This reverts commit 22a36e660d once,
which was merged twice due to an incorrect backmerge resolution.
Fixes: ce0478b02e ("Merge tag 'v6.18-rc6' into drm-next")
Signed-off-by: Peter Colberg <pcolberg@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 38a0f4cf8c6147fd10baa206ab349f8ff724e391)
When an eGPU is unplugged the KFD topology should also be destroyed
for that GPU. This never happens because the fini_sw callbacks never
get to run. Run them manually before calling amdgpu_device_ip_fini_early()
when a device has already been disconnected.
This location is intentionally chosen to make sure that the kfd locking
refcount doesn't get incremented unintentionally.
Cc: kent.russell@amd.com
Closes: https://community.frame.work/t/amd-egpu-on-linux/8691/33
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6a23e7b4332c10f8b56c33a9c5431b52ecff9aab)
Cc: stable@vger.kernel.org
When the driver does not support atomic modesetting, the framebuffer is
in plane->fb rather than plane->state->fb.
Fixes: fe151ed7af ("drm/amdgpu: add generic display panic helper code")
Signed-off-by: Lu Yao <yaolu@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2f2a72de673513247cd6fae14e53f6c40c5841ef)
Fix a copy&paste error: the update should have been an assignment instead of an OR,
otherwise MTYPE_UC 0x3 cannot be updated to MTYPE_RW 0x1.
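The arithmetic behind this can be checked with a short sketch. The two values come from the text above; the helper names update_mtype_or() and update_mtype_assign() are hypothetical, and the real code operates on a field inside the PTE flags rather than on a bare integer.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the text above. */
#define MTYPE_UC 0x3u
#define MTYPE_RW 0x1u

/* Buggy update: OR can only set bits, so 0x3 | 0x1 stays 0x3. */
static uint32_t update_mtype_or(uint32_t current, uint32_t wanted)
{
	return current | wanted;
}

/* Intended update: plain assignment replaces the old value. */
static uint32_t update_mtype_assign(uint32_t current, uint32_t wanted)
{
	(void)current;	/* the old value is deliberately discarded */
	return wanted;
}
```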
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fc1366016abe4103c0f0fac882811aea961ef213)
Cc: stable@vger.kernel.org
Merge series from Richard Fitzgerald <rf@opensource.cirrus.com>:
This series fixes a problem with soc_sdw_utils.c calling the wrong
codec init callbacks, because it assumed that the DAI name could be
used to uniquely identify the codec. This isn't the case, especially
on SDCA which is a generic driver for many parts.
The first patch is needed to add a missing export to SoundWire core.
Pull SCSI fixes from James Bottomley:
"Only one core change (and one in doc only) the rest are drivers.
The one core fix is for some inline encrypting drives that can't
handle encryption requests on non-data commands (like error handling
ones); it saves the request level encryption parameters in the eh_save
structure so they can be cleared for error handling and restored after
it is completed"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: host: mediatek: Make read-only array scale_us static const
scsi: bfa: Update outdated comment
scsi: mpt3sas: Update maintainer list
scsi: ufs: core: Configure MCQ after link startup
scsi: core: Fix error handler encryption support
scsi: core: Correct documentation for scsi_test_unit_ready()
scsi: ufs: dt-bindings: Fix several grammar errors
With IORING_SETUP_DEFER_TASKRUN, task work is queued to ctx->work_llist
(local work) rather than the fallback list. During io_ring_exit_work(),
io_move_task_work_from_local() was called once before the cancel loop,
moving work from work_llist to fallback_llist.
However, task work can be added to work_llist during the cancel loop
itself. There are two cases:
1) io_kill_timeouts() is called from io_uring_try_cancel_requests() to
cancel pending timeouts, and it adds task work via io_req_queue_tw_complete()
for each cancelled timeout.
2) URING_CMD requests like ublk can be completed via
io_uring_cmd_complete_in_task() from ublk_queue_rq() during canceling,
given that the ublk request queue is only quiesced when canceling the first uring_cmd.
Since io_allowed_defer_tw_run() returns false in io_ring_exit_work()
(kworker != submitter_task), io_run_local_work() is never invoked,
and the work_llist entries are never processed. This causes
io_uring_try_cancel_requests() to loop indefinitely, resulting in
100% CPU usage in kworker threads.
Fix this by moving io_move_task_work_from_local() inside the cancel
loop, ensuring any work on work_llist is moved to fallback before
each cancel attempt.
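The difference between the two placements can be seen in a toy model. This is a sketch under loud assumptions: struct ring_model and its counters are hypothetical stand-ins for the two lists, and max_passes exists only so that the modelled old behaviour terminates instead of spinning.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters standing in for work_llist and the fallback list. */
struct ring_model {
	int local_work;		/* local task work, not runnable by the kworker */
	int fallback_work;	/* runnable by the exit kworker */
	int pending_timeouts;	/* requests the cancel pass still has to kill */
	int completed;
};

static void move_task_work_from_local(struct ring_model *r)
{
	r->fallback_work += r->local_work;
	r->local_work = 0;
}

/* One cancel pass: killing a timeout queues task work on the local list. */
static bool try_cancel_requests(struct ring_model *r)
{
	bool found = r->pending_timeouts > 0 || r->local_work > 0 ||
		     r->fallback_work > 0;

	r->local_work += r->pending_timeouts;
	r->pending_timeouts = 0;
	r->completed += r->fallback_work;	/* only fallback work is run */
	r->fallback_work = 0;
	return found;
}

/* Returns the number of passes, or -1 if the loop never settles. */
static int ring_exit_work(struct ring_model *r, bool move_inside_loop,
			  int max_passes)
{
	int passes = 0;

	if (!move_inside_loop)
		move_task_work_from_local(r);	/* old behaviour: once, up front */
	do {
		if (passes++ >= max_passes)
			return -1;
		if (move_inside_loop)
			move_task_work_from_local(r);
	} while (try_cancel_requests(r));
	return passes;
}
```

With the move done only once up front, work queued by the first cancel pass stays on the local list and the loop never settles. With the move inside the loop, the same work reaches the fallback list on the next pass and completes.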
Cc: stable@vger.kernel.org
Fixes: c0e0d6ba25 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull bitmap fix from Yury Norov:
"Fix Rust build for architectures implementing their own find_bit() ops
(arm and m68k)"
* tag 'bitmap-for-6.19-rc5' of https://github.com/norov/linux:
rust: bitops: fix missing _find_* functions on 32-bit ARM
Pull media fixes from Mauro Carvalho Chehab:
- ov02c10: some fixes related to preserving bayer pattern and
horizontal control
- ipu-bridge: Add quirks for some Dell XPS laptops with inverted
sensors
- mali-c55: Fix version identifier logic
- rzg2l-cru: csi-2: fix RZ/V2H input sizes on some variants
* tag 'media/v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: ov02c10: Remove unnecessary hflip and vflip pointers
media: ipu-bridge: Add DMI quirk for Dell XPS laptops with upside down sensors
media: ov02c10: Fix the horizontal flip control
media: ov02c10: Adjust x-win/y-win when changing flipping to preserve bayer-pattern
media: ov02c10: Fix bayer-pattern change after default vflip change
media: rzg2l-cru: csi-2: Support RZ/V2H input sizes
media: uapi: mali-c55-config: Remove version identifier
media: mali-c55: Remove duplicated version check
media: Documentation: mali-c55: Use v4l2-isp version identifier
The commit d2fe192348 (“nvme: only allow entering LIVE from CONNECTING
state”) disallows controller state transitions directly from RESETTING
to LIVE. However, the NVMe PCIe subsystem reset path relies on this
transition to recover the controller on PowerPC (PPC) systems.
On PPC systems, issuing a subsystem reset causes a temporary loss of
communication with the NVMe adapter. A subsequent PCIe MMIO read then
triggers EEH recovery, which restores the PCIe link and brings the
controller back online. For EEH recovery to proceed correctly, the
controller must transition back to the LIVE state.
Due to the changes introduced by commit d2fe192348 (“nvme: only allow
entering LIVE from CONNECTING state”), the controller can no longer
transition directly from RESETTING to LIVE. As a result, EEH recovery
exits prematurely, leaving the controller stuck in the RESETTING state.
Fix this by explicitly transitioning the controller state from RESETTING
to CONNECTING and then to LIVE. This satisfies the updated state
transition rules and allows the controller to be successfully recovered
on PPC systems following a PCIe subsystem reset.
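The transition rule can be sketched as a tiny state machine. This is a simplified model, not the NVMe core: the enum and both helpers are hypothetical, and only the three states named above are modelled, while the real controller has more states and more allowed transitions.

```c
#include <assert.h>
#include <stdbool.h>

enum ctrl_state_model { ST_LIVE, ST_RESETTING, ST_CONNECTING };

/* Simplified rule from the text: LIVE may only be entered from CONNECTING. */
static bool change_state(enum ctrl_state_model *cur, enum ctrl_state_model next)
{
	bool ok;

	switch (next) {
	case ST_LIVE:
		ok = (*cur == ST_CONNECTING);
		break;
	case ST_CONNECTING:
		ok = (*cur == ST_RESETTING);
		break;
	case ST_RESETTING:
		ok = (*cur == ST_LIVE);
		break;
	default:
		ok = false;
		break;
	}
	if (ok)
		*cur = next;
	return ok;
}

/* Recovery path after the fix: RESETTING -> CONNECTING -> LIVE. */
static bool recover_after_subsystem_reset(enum ctrl_state_model *cur)
{
	return change_state(cur, ST_CONNECTING) && change_state(cur, ST_LIVE);
}
```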
Cc: stable@vger.kernel.org
Fixes: d2fe192348 ("nvme: only allow entering LIVE from CONNECTING state")
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
vb2_dma_sg_alloc() cannot allocate buffers that respect device DMA
limits [1][2]. Affected devices always hit the following error: "swiotlb
buffer is full (sz: 393216 bytes), total 65536 (slots), used 2358 (slots)"
and the UVC gadget function does not work at all.
videobuf2-dma-sg.c has no formal fix for this issue so far. For the UVC
gadget, the videobuf2 subsystem does not dma_map() the big vmalloc()ed
buffer when allocating the video buffers, but it does so for buffers
returned by dma_sg. So the issue only happens with vb2_dma_sg_alloc().
To work around the issue, retry vb2_reqbufs() with vb2_vmalloc_memops
if it fails to allocate buffers with vb2_dma_sg_memops.
With vmalloc()ed buffers, the UVC gadget allocates a small buffer for
each usb_request to do the DMA transfer, and the uvc driver then copies
data from the big buffer to the small buffers.
Link[1]: https://lore.kernel.org/linux-media/20230828075420.2009568-1-anle.pan@nxp.com/
Link[2]: https://lore.kernel.org/linux-media/20230914145812.12851-1-hui.fang@nxp.com/
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260113-uvc-gadget-fix-patch-v2-4-62950ef5bcb5@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the USB specification:
    For full-/high-speed isochronous endpoints, the bInterval value is
    used as the exponent for a 2^(bInterval-1) value.
To correctly convert bInterval to interval_duration:
    interval_duration = 2^(bInterval-1) * frame_interval
Because the unit of video->interval is 100ns, add a comment to
make it clear.
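The conversion can be written out as a small helper. This is a sketch, assuming the base intervals are expressed in 100ns units (1 ms frame for full-speed, 125 us microframe for high-speed); the function and macro names are hypothetical and not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Base intervals in 100 ns units, the unit of video->interval. */
#define FRAME_INTERVAL_FS 10000u	/* 1 ms frame, full-speed */
#define FRAME_INTERVAL_HS 1250u		/* 125 us microframe, high-speed */

/*
 * interval_duration = 2^(bInterval - 1) * frame_interval
 * bInterval is expected to be in the range 1..16 for full-/high-speed
 * isochronous endpoints.
 */
static uint32_t interval_duration(uint8_t bInterval, uint32_t frame_interval)
{
	assert(bInterval >= 1 && bInterval <= 16);
	return (UINT32_C(1) << (bInterval - 1)) * frame_interval;
}
```

For example, a high-speed endpoint with bInterval = 4 yields 2^3 * 1250 = 10000 units of 100ns, which is 1 ms.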
Fixes: 48dbe73117 ("usb: gadget: uvc: set req_size and n_requests based on the frame interval")
Cc: stable@vger.kernel.org
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://patch.msgid.link/20260113-uvc-gadget-fix-patch-v2-2-62950ef5bcb5@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current req_payload_size calculation has two issues:
(1) When req_payload_size is calculated for the first time for all the
buffers, reqs_per_frame = 0 is the divisor of DIV_ROUND_UP(), so
the result is undefined.
This happens because VIDIOC_STREAMON is always executed after
VIDIOC_QBUF, so video->reqs_per_frame is 0 until VIDIOC_STREAMON
is run.
(2) buf->req_payload_size may be bigger than max_req_size.
Take the YUYV pixel format as an example:
If bInterval = 1, video->interval = 666666, high-speed:
video->reqs_per_frame = 666666 / 1250 = 534
720p: buf->req_payload_size = 1843200 / 534 = 3452
1080p: buf->req_payload_size = 4147200 / 534 = 7766
With such a req_payload_size, the controller can't run normally.
To fix the above issues, assign max_req_size to buf->req_payload_size when
video->reqs_per_frame = 0, and limit buf->req_payload_size to
video->req_size if it is larger than video->req_size. Since max_req_size
is used in many places, add it to struct uvc_video and set the value once
the endpoint is enabled.
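Both rules can be expressed in one helper. This is only a sketch: req_payload_size() as a free function and its parameter list are hypothetical, and DIV_ROUND_UP is redefined locally so that the example compiles outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: pick the per-request payload size.
 * Fall back to max_req_size while reqs_per_frame is still 0 (buffer queued
 * before stream-on), and never return more than req_size.
 */
static uint32_t req_payload_size(uint32_t frame_bytes, uint32_t reqs_per_frame,
				 uint32_t req_size, uint32_t max_req_size)
{
	uint32_t size;

	if (reqs_per_frame == 0)
		return max_req_size;	/* avoid dividing by zero */

	size = DIV_ROUND_UP(frame_bytes, reqs_per_frame);
	return size > req_size ? req_size : size;
}
```

Assuming an illustrative limit of 3072 bytes for both req_size and max_req_size, the 720p case above (1843200 bytes over 534 requests) is clamped to 3072 instead of 3452.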
Fixes: 98ad032915 ("usb: gadget: uvc: set req_length based on payload by nreqs instead of req_size")
Cc: stable@vger.kernel.org
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://patch.msgid.link/20260113-uvc-gadget-fix-patch-v2-1-62950ef5bcb5@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ignore USB role switches if dwc3-apple is already in the desired state.
The USB-C port controller on M2 and M1/M2 Pro/Max/Ultra devices issues
additional interrupts which result in USB role switches to the already
active role.
Ignore these USB role switches to ensure the USB-C port controller and
dwc3-apple are always in a consistent state. This matches the behaviour
in __dwc3_set_mode() in core.c.
Fixes detecting USB 2.0 and 3.x devices on the affected systems. The
reset caused by the additional role switch appears to leave the USB
devices in a state which prevents detection when the phy and dwc3 are
brought back up again.
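The early-return check can be sketched in a few lines. The types and the set_role_model() helper below are hypothetical and not the dwc3-apple code; the resets counter stands in for the teardown and bring-up that a real role switch performs.

```c
#include <assert.h>
#include <stdbool.h>

enum usb_role_model { ROLE_NONE, ROLE_HOST, ROLE_DEVICE };

/* Hypothetical glue-layer state. */
struct dwc3_apple_model {
	enum usb_role_model role;
	int resets;	/* how often the controller was torn down and restarted */
};

/* Returns true only if the role actually changed. */
static bool set_role_model(struct dwc3_apple_model *d, enum usb_role_model role)
{
	if (d->role == role)
		return false;	/* already in the desired state: ignore */

	d->resets++;		/* a real switch resets phy and dwc3 */
	d->role = role;
	return true;
}
```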
Fixes: 0ec946d32e ("usb: dwc3: Add Apple Silicon DWC3 glue layer driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Janne Grunau <j@jannau.net>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Reviewed-by: Sven Peter <sven@kernel.org>
Tested-by: Sven Peter <sven@kernel.org> # M1 mac mini and macbook air
Link: https://patch.msgid.link/20260109-apple-dwc3-role-switch-v1-1-11623b0f6222@jannau.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When some wake IRQs are disabled in the device tree, the corresponding
interrupt entries are removed from DT. In such cases, the driver
currently calls platform_get_irq(), which returns -ENXIO and logs
an error like:
tegra-xusb 3610000.usb: error -ENXIO: IRQ index 2 not found
However, not all wake IRQs are mandatory. The hardware can operate
normally even if some wake sources are not defined in DT. To avoid this
false alarm and handle missing wake IRQs gracefully, use
platform_get_irq_optional() instead of platform_get_irq().
Fixes: 5df186e2ef ("usb: xhci: tegra: Support USB wakeup function for Tegra234")
Cc: stable <stable@kernel.org>
Signed-off-by: Wayne Chang <waynec@nvidia.com>
Signed-off-by: Wei-Cheng Chen <weichengc@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260112145653.95691-1-weichengc@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 9beeee6584 ("USB: EHCI: log a warning if ehci-hcd is not
loaded first") said that ehci-hcd should be loaded before ohci-hcd and
uhci-hcd. However, commit 05c92da0c5 ("usb: ohci/uhci - add soft
dependencies on ehci_pci") only makes ohci-pci/uhci-pci depend on
ehci-pci, which is not enough, and we may still see the warnings in the
boot log. To eliminate the warnings we should make ohci-hcd/uhci-hcd
depend on ehci-hcd. But Alan said that the warning introduced by
9beeee6584 is bogus; we only need the soft dependencies at the PCI level
rather than the HCD level.
However, there is another necessary soft dependency, between
ohci-platform/uhci-platform and ehci-platform, which is added by this
patch. The boot logs are below.
1. ohci-platform loaded before ehci-platform:
ohci-platform 1f058000.usb: Generic Platform OHCI controller
ohci-platform 1f058000.usb: new USB bus registered, assigned bus number 1
ohci-platform 1f058000.usb: irq 28, io mem 0x1f058000
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 4 ports detected
Warning! ehci_hcd should always be loaded before uhci_hcd and ohci_hcd, not after
usb 1-4: new low-speed USB device number 2 using ohci-platform
ehci-platform 1f050000.usb: EHCI Host Controller
ehci-platform 1f050000.usb: new USB bus registered, assigned bus number 2
ehci-platform 1f050000.usb: irq 29, io mem 0x1f050000
ehci-platform 1f050000.usb: USB 2.0 started, EHCI 1.00
usb 1-4: device descriptor read/all, error -62
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 4 ports detected
usb 1-4: new low-speed USB device number 3 using ohci-platform
input: YSPRINGTECH USB OPTICAL MOUSE as /devices/platform/bus@10000000/1f058000.usb/usb1/1-4/1-4:1.0/0003:10C4:8105.0001/input/input0
hid-generic 0003:10C4:8105.0001: input,hidraw0: USB HID v1.11 Mouse [YSPRINGTECH USB OPTICAL MOUSE] on usb-1f058000.usb-4/input0
2. ehci-platform loaded before ohci-platform:
ehci-platform 1f050000.usb: EHCI Host Controller
ehci-platform 1f050000.usb: new USB bus registered, assigned bus number 1
ehci-platform 1f050000.usb: irq 28, io mem 0x1f050000
ehci-platform 1f050000.usb: USB 2.0 started, EHCI 1.00
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 4 ports detected
ohci-platform 1f058000.usb: Generic Platform OHCI controller
ohci-platform 1f058000.usb: new USB bus registered, assigned bus number 2
ohci-platform 1f058000.usb: irq 29, io mem 0x1f058000
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 4 ports detected
usb 2-4: new low-speed USB device number 2 using ohci-platform
input: YSPRINGTECH USB OPTICAL MOUSE as /devices/platform/bus@10000000/1f058000.usb/usb2/2-4/2-4:1.0/0003:10C4:8105.0001/input/input0
hid-generic 0003:10C4:8105.0001: input,hidraw0: USB HID v1.11 Mouse [YSPRINGTECH USB OPTICAL MOUSE] on usb-1f058000.usb-4/input0
In the latter case, there is no re-connection for USB-1.0/1.1 devices,
which is expected.
Cc: stable <stable@kernel.org>
Reported-by: Shengwen Xiao <atzlinux@sina.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://patch.msgid.link/20260112084802.1995923-1-chenhuacai@loongson.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that the upstream code has been getting broader test coverage by our
users we occasionally see issues with USB2 devices plugged in during boot.
Before Linux is running, the USB2 PHY has usually been running in device
mode and it turns out that sometimes host->device or device->host
transitions don't work.
The root cause: If the role inside the USB2 PHY is re-configured when it
has already been powered on or when dwc3 has already enabled the ULPI
interface, the new configuration sometimes doesn't take effect until dwc3
is reset again. Fix this rare issue by configuring the role much earlier.
Note that the USB3 PHY does not suffer from this issue and actually
requires dwc3 to be up before the correct role can be configured there.
Reported-by: James Calligeros <jcalligeros99@gmail.com>
Reported-by: Janne Grunau <j@jannau.net>
Fixes: 0ec946d32e ("usb: dwc3: Add Apple Silicon DWC3 glue layer driver")
Cc: stable <stable@kernel.org>
Tested-by: Janne Grunau <j@jannau.net>
Reviewed-by: Janne Grunau <j@jannau.net>
Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Signed-off-by: Sven Peter <sven@kernel.org>
Link: https://patch.msgid.link/20260109-dwc3-apple-usb2phy-fix-v2-1-ab6b041e3b26@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
for_each_available_child_of_node() calls of_node_put() to release
child_np on each successful loop iteration. After leaving the loop
with child_np already released, the code jumps to the put_child label
and calls of_node_put() again if devm_request_threaded_irq() fails.
This causes a double free bug.
Fix this by returning directly to avoid the duplicate of_node_put().
Fixes: ed2b5a8e6b ("phy: phy-rockchip-inno-usb2: support muxed interrupts")
Cc: stable@vger.kernel.org
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20260109154626.2452034-1-vulab@iscas.ac.cn
Signed-off-by: Vinod Koul <vkoul@kernel.org>