linux

Commit Graph

Author	SHA1	Message	Date
Dave Airlie	4d33ed640f	Driver Changes: - SR-IOV fixes for GT reset and TLB invalidation - Fix memory copy direction during migration - Fix alignment check on migration - Fix MOCS and page fault init order to correctly account for topology -----BEGIN PGP SIGNATURE----- iQJNBAABCgA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmh5u3IZHGx1Y2FzLmRl bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqUyViEAChO4zJ11v29mILYOJwYCir hZ0gUvbDVd3/uYMDTFbZnva6fA5vDwRFRXrlS2VYzWmYcnyg7/sLc5s1pc0u6ir+ 9WSm9z+msF4GDqe4wIpKALpA8Jxo6iIimmRHqhY+Ak24h4fA+OrDSNHUEaHwcCbK oX2uszmHPj1N00aEwtXdP2S09GKqONpGD0948iZ/vjrXojfe7+IhiHjtCMQftZ5x NBqTotbJgXo4bK3PWqpL3jEQ0qkl+mVnLb8OWbRSrowupxxF6BTvrJRpAo4OgV9o 866kfK99OE1OLAihn+0gCqBVjFfrjdu1R8LqlL/8m7ThJEFf+AQzMGy6Bnt25+MQ XCytRnSDn7GO0T/HIJ5psYSTkUcQzKfRfSP+SwLMDo2yfaqcaD/TC70pzt9MKgDI ZLcZ8C8pUcICUzqqUQKtOkwZWimb6SVa+3s9dm5s93giL25QKdnCdTeCmURSGnEf wlK729MsiBODHfdfLpCZbE4D1Kjxw/eATUtqIabRNMUC6W28e93svAG2AsJlcwEV 382SaQiDSZULPEwl4BycVXlZgSOqAsR3YJhKf+CcHYMA8YKUWNkmFRpg+gjQhMNe zR4VNSMxzHboD9KkzSmdpwhJ3ucJwlU/XutVuNvPduZ3WHV3x1PDG2hkiLza3VMK w8jZH43uviE3QJ9li/W6Mg== =mpnW -----END PGP SIGNATURE----- Merge tag 'drm-xe-fixes-2025-07-17' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - SR-IOV fixes for GT reset and TLB invalidation - Fix memory copy direction during migration - Fix alignment check on migration - Fix MOCS and page fault init order to correctly account for topology Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/6jworkgupwstm4v7aohbuzod3dyz4u7pyfhshr5ifgf2xisgj3@cm5em5yupjiu	2025-07-18 14:04:06 +10:00
Dave Airlie	4399e3d84d	Mediatek DRM Fixes - 20250718 1. Add wait_event_timeout when disabling plane 2. only announce AFBC if really supported 3. mtk_dpi: Reorder output formats on MT8195/88 -----BEGIN PGP SIGNATURE----- iQJMBAABCgA2FiEEACwLKSDmq+9RDv5P4cpzo8lZTiQFAmh5hc0YHGNodW5rdWFu Zy5odUBrZXJuZWwub3JnAAoJEOHKc6PJWU4k2ywP/RWEbDdxzM6zRZMSRJYcwqyy YFU9f0HgjahMdCF84BhffdmkxY/gbvx1t7I0OLbdKf3dPHZF3D61SOhs1Cy+RGPw 5IHzWzzBSjDM2jtCe+P7pM6M6fVzuxGnyf06apkGEFDLy5e4yJ2YWE3SHpWMQezy xjtjLakDd4/70WMszKqfsaF9RrhZCi0sPcbSMRH1vxsTSnZCH7dN6cn3LCnhgDc6 VejUd7yRnKrYFQPELXKL5WSqZsClPOCT4tqjpIiz3eSahvbAVrtuxAL1gM6lC8kC U5rHjq0p4b8eZnNkLRvurxpOn2ggoDh/Fn6iWblLWkTmfMTIU8KR1Xf/n3mNVvEU 6X/lzjqvp4s8l3n7fE1KtEFtVZlKXxuXeLLHYRPA+EwQ2cyxJjv0GInBE6YAcHSJ 8tzwirAN1C3i83U9y/JrgP2VgTPdE3uye5KizYjac1c1fObIAvVncKZ9lXggE+xP H+Vsi2cHqND8LqEmeC/qxPnRC5PPeeie3KMpj95XhT6fDbky0jO0qdLplrwlRHHN v6irQG4rsouCkWSqYmXpeWs0KmD6ZQELBGEWI50Q/WwcQLUkjQGDn8M2Ut2mZFyG O3V09gHU9Ym7eB/C/WkgHEwJuZK+h6JjMwPHMedlf0aGK5aW+mUoMZSEJm/91iKM hgbaI774q9D8f/+Xf7Pp =zlcI -----END PGP SIGNATURE----- Merge tag 'mediatek-drm-fixes-20250718' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20250718 1. Add wait_event_timeout when disabling plane 2. only announce AFBC if really supported 3. mtk_dpi: Reorder output formats on MT8195/88 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/20250717232916.12372-1-chunkuang.hu@kernel.org	2025-07-18 12:03:00 +10:00
Dave Airlie	8d2ad05666	amd-drm-fixes-6.16-2025-07-17: amdgpu: - Fix a DC memory leak - DCN 4.0.1 degamma LUT fix - Fix reset counter handling for soft recovery - GC 8 fix radeon: - Drop console locks when suspending/resuming -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCaHkvsgAKCRC93/aFa7yZ 2GlzAQCij+Q4vyCBUXfq/xtVvMO3xp/3l7x1yAXknOanuLxAkwD/ZbiI0hMvBZNO 2Nl4hqBQMgOBFf9N12vkqLPNwY8IjAw= =lQzq -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.16-2025-07-17' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.16-2025-07-17: amdgpu: - Fix a DC memory leak - DCN 4.0.1 degamma LUT fix - Fix reset counter handling for soft recovery - GC 8 fix radeon: - Drop console locks when suspending/resuming Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250717171935.642380-1-alexander.deucher@amd.com	2025-07-18 11:46:16 +10:00
Dave Airlie	fbefd8adda	- DP AUX DPCD address fix (Imre) -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmh5EB8ACgkQ+mJfZA7r E8prqggAvQa4hfOhb2+QeZCs36uOTx/VafOfleB+imrMLwF9hRBraf/OEkHXhT9t +hlMXbvk46RHo+an9GzCRaRxwKouf2bOFcDgsQoUIIiA/jaha2uXrNBqxM++Hkd4 pWz3RSNlhKtJVVM2JG/jKeoUqcy8W5UWlSmBl3YLwBfXNMKISLlw4+4+RprAcs4U hEMsW+ikCQJwaECAC9IBu+s+ue1SqkugdIkpVXSnND3cvm0ErzP/PPIFzY8F/g7+ +Do5ZCxVJSjdhzCoeojdIWDGfakkLgu0uhBslhRwtCtGPCC9OfMQwgbbpcRTMqDL IEMhwWVVAtgRTupxb/ilRPO7nvfoCg== =SZSN -----END PGP SIGNATURE----- Merge tag 'drm-intel-fixes-2025-07-17' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - DP AUX DPCD address fix (Imre) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aHkQmRhelb4Fzqau@intel.com	2025-07-18 09:59:58 +10:00
Dave Airlie	cbc3fa8288	Merge tag 'drm-misc-fixes-2025-07-16' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.16 final?: - nouveau ioctl validation fix. - panfrost scheduler bug. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/ee784a3a-30b4-489a-8503-b1be3b09268c@linux.intel.com	2025-07-18 09:42:22 +10:00
Louis-Alexis Eyraud	5ceed7a6d3	drm/mediatek: mtk_dpi: Reorder output formats on MT8195/88 Reorder output format arrays in both MT8195 DPI and DP_INTF block configuration by decreasing preference order instead of alphanumeric one, as expected by the atomic_get_output_bus_fmts callback function of drm_bridge controls, so the RGB ones are used first during the bus format negotiation process. Fixes: `20fa6a8fc5` ("drm/mediatek: mtk_dpi: Allow additional output formats on MT8195/88") Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250606-mtk_dpi-mt8195-fix-wrong-color-v1-1-47988101b798@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-07-17 23:19:05 +00:00
Icenowy Zheng	8d121a82fa	drm/mediatek: only announce AFBC if really supported Currently even the SoC's OVL does not declare the support of AFBC, AFBC is still announced to the userspace within the IN_FORMATS blob, which breaks modern Wayland compositors like KWin Wayland and others. Gate passing modifiers to drm_universal_plane_init() behind querying the driver of the hardware block for AFBC support. Fixes: `c410fa9b07` ("drm/mediatek: Add AFBC support to Mediatek DRM driver") Signed-off-by: Icenowy Zheng <uwu@icenowy.me> Reviewed-by: CK Hu <ck.hu@medaitek.com> Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250531121140.387661-1-uwu@icenowy.me/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-07-17 23:19:05 +00:00
Jason-JH Lin	d208261e9f	drm/mediatek: Add wait_event_timeout when disabling plane Our hardware registers are set through GCE, not by the CPU. DRM might assume the hardware is disabled immediately after calling atomic_disable() of drm_plane, but it is only truly disabled after the GCE IRQ is triggered. Additionally, the cursor plane in DRM uses async_commit, so DRM will not wait for vblank and will free the buffer immediately after calling atomic_disable(). To prevent the framebuffer from being freed before the layer disable settings are configured into the hardware, which can cause an IOMMU fault error, a wait_event_timeout has been added to wait for the ddp_cmdq_cb() callback,indicating that the GCE IRQ has been triggered. Fixes: `2f965be7f9` ("drm/mediatek: apply CMDQ control flow") Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20250624113223.443274-1-jason-jh.lin@mediatek.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-07-17 23:18:53 +00:00
Michal Wajdeczko	5c244eeca5	drm/xe/pf: Resend PF provisioning after GT reset If we reload the GuC due to suspend/resume or GT reset then we have to resend not only any VFs provisioning data, but also PF configuration, like scheduling parameters (EQ, PT), as otherwise GuC will continue to use default values. Fixes: `411220808c` ("drm/xe/pf: Restart VFs provisioning after GT reset") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-3-michal.wajdeczko@intel.com (cherry picked from commit `1c38dd6afa`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-17 09:51:51 -03:00
Michal Wajdeczko	81dccec448	drm/xe/pf: Prepare to stop SR-IOV support prior GT reset As part of the resume or GT reset, the PF driver schedules work which is then used to complete restarting of the SR-IOV support, including resending to the GuC configurations of provisioned VFs. However, in case of short delay between those two actions, which could be seen by triggering a GT reset on the suspened device: $ echo 1 > /sys/kernel/debug/dri/0000:00:02.0/gt0/force_reset this PF worker might be still busy, which lead to errors due to just stopped or disabled GuC CTB communication: [ ] xe 0000:00:02.0: [drm:xe_gt_resume [xe]] GT0: resumed [ ] xe 0000:00:02.0: [drm] GT0: trying reset from force_reset_show [xe] [ ] xe 0000:00:02.0: [drm] GT0: reset queued [ ] xe 0000:00:02.0: [drm] GT0: reset started [ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel stopped [ ] xe 0000:00:02.0: [drm:guc_ct_send_recv [xe]] GT0: H2G request 0x5503 canceled! [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 12 config KLVs (-ECANCELED) [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 configuration (-ECANCELED) [ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 12 config KLVs (-ENODEV) [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 configuration (-ENODEV) [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push 2 of 2 VFs configurations [ ] xe 0000:00:02.0: [drm:pf_worker_restart_func [xe]] GT0: PF: restart completed While this VFs reprovisioning will be successful during next spin of the worker, to avoid those errors, make sure to cancel restart worker if we are about to trigger next reset. Fixes: `411220808c` ("drm/xe/pf: Restart VFs provisioning after GT reset") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-2-michal.wajdeczko@intel.com (cherry picked from commit `9f50b729dd`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-17 09:51:51 -03:00
Lucas De Marchi	057a7d66f9	drm/xe/migrate: Fix alignment check The check would fail if the address is unaligned, but not when accounting the offset. Instead of `buf \| offset` it should have been `buf + offset`. To make it more readable and also drop the uintptr_t, just use the IS_ALIGNED() macro. Fixes: `270172f64b` ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250710-migrate-aligned-v1-1-44003ef3c078@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `81e139db69`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-17 09:51:51 -03:00
Matthew Brost	3155ac8925	drm/xe: Move page fault init after topology init We need the topology to determine GT page fault queue size, move page fault init after topology init. Cc: stable@vger.kernel.org Fixes: `3338e4f90c` ("drm/xe: Use topology to determine page fault queue size") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250710191208.1040215-1-matthew.brost@intel.com (cherry picked from commit `beb72acb5b`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-17 09:51:27 -03:00
Balasubramani Vivekanandan	2a58b21ade	drm/xe/mocs: Initialize MOCS index early MOCS uc_index is used even before it is initialized in the following callstack guc_prepare_xfer() __xe_guc_upload() xe_guc_min_load_for_hwconfig() xe_uc_init_hwconfig() xe_gt_init_hwconfig() Do MOCS index initialization earlier in the device probe. Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Ravi Kumar Vodapalli <ravi.kumar.vodapalli@intel.com> Link: https://lore.kernel.org/r/20250520142445.2792824-1-balasubramani.vivekanandan@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit `241cc827c0`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-17 09:51:12 -03:00
Matthew Auld	1381c231c1	drm/xe/migrate: fix copy direction in access_memory After we do the modification on the host side, ensure we write the result back to VRAM and not the other way around, otherwise the modification will be lost if treated like a read. Fixes: `270172f64b` ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710134128.800756-2-matthew.auld@intel.com (cherry picked from commit `c12fe703ca`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-17 09:45:19 -03:00
Tejas Upadhyay	fd25fa90ed	drm/xe: Dont skip TLB invalidations on VF Skipping TLB invalidations on VF causing unrecoverable faults. Probable reason for skipping TLB invalidations on SRIOV could be lack of support for instruction MI_FLUSH_DW_STORE_INDEX. Add back TLB flush with some additional handling. Helps in resolving, [ 704.913454] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]] ASID: 0 VFID: 0 PDATA: 0x0d92 Faulted Address: 0x0000000002fa0000 FaultType: 0 AccessType: 1 FaultLevel: 0 EngineClass: 3 bcs EngineInstance: 8 [ 704.913551] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22 V2: - Use Xmas tree (MichalW) Suggested-by: Matthew Brost <matthew.brost@intel.com> Fixes: `97515d0b3e` ("drm/xe/vf: Don't emit access to Global HWSP if VF") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250710045945.1023840-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> (cherry picked from commit `b528e896fa`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-17 09:45:19 -03:00
Eeli Haapalainen	8326193401	drm/amdgpu/gfx8: reset compute ring wptr on the GPU on resume Commit `42cdf6f687` ("drm/amdgpu/gfx8: always restore kcq MQDs") made the ring pointer always to be reset on resume from suspend. This caused compute rings to fail since the reset was done without also resetting it for the firmware. Reset wptr on the GPU to avoid a disconnect between the driver and firmware wptr. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3911 Fixes: `42cdf6f687` ("drm/amdgpu/gfx8: always restore kcq MQDs") Signed-off-by: Eeli Haapalainen <eeli.haapalainen@protonmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `2becafc319`) Cc: stable@vger.kernel.org	2025-07-16 16:50:45 -04:00
Lijo Lazar	86790e300d	drm/amdgpu: Increase reset counter only on success Increment the reset counter only if soft recovery succeeded. This is consistent with a ring hard reset behaviour where counter gets incremented only if hard reset succeeded. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `25c314aa3e`) Cc: stable@vger.kernel.org	2025-07-16 16:49:04 -04:00
Thomas Zimmermann	4c1a0fc2ae	drm/radeon: Do not hold console lock during resume The function radeon_resume_kms() acquires the console lock. It is inconsistent, as it depends on the notify_client argument. That lock then covers a number of suspend operations that are unrelated to the console. Remove the calls to console_lock() and console_unlock() from the radeon function. The console lock is only required by DRM's fbdev emulation, which acquires it as necessary. Also fixes a possible circular dependency between the console lock and the client-list mutex, where the mutex is supposed to be taken first. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `fff8e05044`)	2025-07-16 16:48:34 -04:00
Thomas Zimmermann	5dd0b96118	drm/radeon: Do not hold console lock while suspending clients The radeon driver holds the console lock while suspending in-kernel DRM clients. This creates a circular dependency with the client-list mutex, which is supposed to be acquired first. Reported when combining radeon with another DRM driver. Therefore, do not take the console lock in radeon, but let the fbdev DRM client acquire the lock when needed. This is what all other DRM drivers so. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reported-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Closes: https://lore.kernel.org/dri-devel/0a087cfd-bd4c-48f1-aa2f-4a3b12593935@oss.qualcomm.com/ Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `612ec7c69d`)	2025-07-16 16:48:24 -04:00
Melissa Wen	97a0f2b5f4	drm/amd/display: Disable CRTC degamma LUT for DCN401 In DCN401 pre-blending degamma LUT isn't affecting cursor as in previous DCN version. As this is not the behavior close to what is expected for CRTC degamma LUT, disable CRTC degamma LUT property in this HW. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4176 --- When enabling HDR on KDE, it takes the first CRTC 1D LUT available and apply a color transformation (Gamma 2.2 -> PQ). AMD driver usually advertises a CRTC degamma LUT as the first CRTC 1D LUT, but it's actually applied pre-blending. In previous HW version, it seems to work fine because the 1D LUT was applied to cursor too, but DCN401 presents a different behavior and the 1D LUT isn't affecting the hardware cursor. To address the wrong gamma on cursor with HDR (see the link), I came up with this patch that disables CRTC degamma LUT in this hw, since it presents a different behavior than others. With this KDE sees CRTC regamma LUT as the first post-blending 1D LUT available. This is actually more consistent with AMD color pipeline. It was tested by the reporter, since I don't have the HW available for local testing and debugging. Melissa --- Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `340231cdce`) Cc: stable@vger.kernel.org	2025-07-16 16:47:43 -04:00
Clayton King	b2ee9fa0fe	drm/amd/display: Free memory allocation [WHY] Free memory to avoid memory leak Reviewed-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Clayton King <clayton.king@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `fa699acb8e`) Cc: stable@vger.kernel.org	2025-07-16 16:46:50 -04:00
Imre Deak	d34d6feaf4	drm/dp: Change AUX DPCD probe address from LANE0_1_STATUS to TRAINING_PATTERN_SET Commit `a3ef3c2da6` ("drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS") stopped using the DPCD_REV register for DPCD probing, since this results in link training failures at least when using an Intel Barlow Ridge TBT hub at UHBR link rates (the DP_INTRA_HOP_AUX_REPLY_INDICATION never getting cleared after the failed link training). Since accessing DPCD_REV during link training is prohibited by the DP Standard, LANE0_1_STATUS (0x202) was used instead, as it falls within the Standard's valid register address range (0x102-0x106, 0x202-0x207, 0x200c-0x200f, 0x2216) and it fixed the link training on the above TBT hub. However, reading the LANE0_1_STATUS register also has a side-effect at least on a Novatek eDP panel, as reported on the Closes: link below, resulting in screen flickering on that panel. One clear side-effect when doing the 1-byte probe reads from LANE0_1_STATUS during link training before reading out the full 6 byte link status starting at the same address is that the panel will report the link training as completed with voltage swing 0. This is different from the normal, flicker-free scenario when no DPCD probing is done, the panel reporting the link training complete with voltage swing 2. Using the TRAINING_PATTERN_SET register for DPCD probing doesn't have the above side-effect, the panel will link train with voltage swing 2 as expected and it will stay flicker-free. This register is also in the above valid register range and is unlikely to have a side-effect as that of LANE0_1_STATUS: Reading LANE0_1_STATUS is part of the link training CR/EQ sequences and so it may cause a state change in the sink - even if inadvertently as I suspect in the case of the above Novatek panel. As opposed to this, reading TRAINING_PATTERN_SET is not part of the link training sequence (it must be only written once at the beginning of the CR/EQ sequences), so it's unlikely to cause any state change in the sink. As a side-note, this Novatek panel also lacks support for TPS3, while claiming support for HBR2, which violates the DP Standard (the Standard mandating TPS3 for HBR2). Besides the Novatek panel (PSR 1), which this change fixes, I also verified the change on a Samsung (PSR 1) and an Analogix (PSR 2) eDP panel as well as on the Intel Barlow Ridge TBT hub. Note that in the drm-tip tree (targeting the v6.17 kernel version) the i915 and xe drivers keep DPCD probing enabled only for the panel known to require this (HP ZR24w), hence those drivers in drm-tip are not affected by the above problem. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Fixes: `a3ef3c2da6` ("drm/dp: Change AUX DPCD probe address from DPCD_REV to LANE0_1_STATUS") Reported-and-tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14558 Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250708212331.112898-1-imre.deak@intel.com (cherry picked from commit `bba9aa4165`) [Imre: Rebased on drm-intel-fixes] Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo: Changed original commit hash to match with the one propagated in fixes]	2025-07-15 09:37:57 -04:00
Philipp Stanner	cb345f954e	drm/panfrost: Fix scheduler workqueue bug When the GPU scheduler was ported to using a struct for its initialization parameters, it was overlooked that panfrost creates a distinct workqueue for timeout handling. The pointer to this new workqueue is not initialized to the struct, resulting in NULL being passed to the scheduler, which then uses the system_wq for timeout handling. Set the correct workqueue to the init args struct. Cc: stable@vger.kernel.org # 6.15+ Fixes: `796a9f55a8` ("drm/sched: Use struct for drm_sched_init() params") Reported-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Closes: https://lore.kernel.org/dri-devel/b5d0921c-7cbf-4d55-aa47-c35cd7861c02@igalia.com/ Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250709102957.100849-2-phasta@kernel.org	2025-07-14 16:49:10 +01:00
Arnd Bergmann	e5478166df	drm/nouveau: check ioctl command codes better nouveau_drm_ioctl() only checks the _IOC_NR() bits in the DRM_NOUVEAU_NVIF command, but ignores the type and direction bits, so any command with '7' in the low eight bits gets passed into nouveau_abi16_ioctl() instead of drm_ioctl(). Check for all the bits except the size that is handled inside of the handler. Fixes: `27111a23d0` ("drm/nouveau: expose the full object/event interfaces to userspace") Signed-off-by: Arnd Bergmann <arnd@arndb.de> [ Fix up two checkpatch warnings and a typo. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250711072458.2665325-1-arnd@kernel.org	2025-07-11 20:04:32 +02:00
Simona Vetter	b7dc79a633	Merge tag 'drm-misc-fixes-2025-07-10' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.16-rc6 or final: - Fix nouveau fail on debugfs errors. - Magic 50 ms to fix nouveau suspend. - Call rust destructor on drm device release. - Fix DMA api error handling in tegra/nvdec. - Fix PVR device reset. - Habanalabs maintainer update. - Small memory leak fix when nouveau acpi init fails. - Do not attempt to bind to any PCI device with AGP capability. - Make FB's acquire handles on backing object, same as i915/xe already does. - Fix race in drm_gem_handle_create_tail. Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e522cdc7-1787-48f2-97e5-0f94783970ab@linux.intel.com	2025-07-11 14:11:19 +02:00
Simona Vetter	14e85fabee	Driver Changes: - Clear LMTT page to avoid leaking data from one VF to another - Align PF queue size to power of 2 - Disable Indirect Ring State to avoid intermittent issues on context switch: feature is not currently needed, so can be disabled for now. - Fix compression handling when the BO pages are very fragmented - Restore display pm on error path - Fix runtime pm handling in xe devcoredump - Fix xe_pm_set_vram_threshold() doc - Recommend new minor versions of GuC firmware - Drop some workarounds on VF - Do not use verbose GuC logging by default: it should be only for debugging -----BEGIN PGP SIGNATURE----- iQJNBAABCgA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmhwnQgZHGx1Y2FzLmRl bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU3/QD/9TtgEyP+On1K1T8pYBY6I2 RuhAQ5OUs/7H4A0boO/+ZZRCqF8nuEfHy7FqWSYnO1IlRvmFfvjWyMJsJmMCcj2X t1ZMbrT6DiSAGGxf8F+euRPAKCCRltqLJ2dfGDIBerW1CMFpA5lepNSjrMyGpQ93 R9IfUzW8h8yexW7xGfUjgV2MCs/14oNQW79c5LFjkfVU+8ILHP1a8EZMeWmR310R NqDfBqvHhKxBQguhbGzIYUOdKTBDr7McZ9A8fZ9nzp4GYIb/j6AbKxlyZgDIgM2b ahGWeLqqM7PnNvGs0r+vdESuVDkzo9tYw6MfHRHhLGmq+kYSm9w95p9SKsVNtRft K/w3SXjIS4e0hPpqBEOC5ANfSvValzBwltJDFLd9dG+fPEAarbi2AAYQjnoUf73n EK4DQ1K44T2kYlf27UQxaXE2LYvO2h6nv1iJ9FvAuaUznP6Za6zeVQqfMyvLaBXg gpAkvJG5QY6y7II9lRaCMNR8tb6IJwOgdvEynHgKsmwv3iHUfmdqRfTx5Eb1ECe8 rk9J0R4SIuYkhu1qUxJx4qrBX386v3ERiof8fjNitoTi6ITcgBoPiG9QxbmS1L1f Qhv8HM7ebPQgc8Eyy8dc1ZrknFqGe0/MggQDHmKgMU3t6jisiBWpJ/qJv4gAqwMk 551lBdaKfDKGSrZ8hkUqaA== =HUb3 -----END PGP SIGNATURE----- Merge tag 'drm-xe-fixes-2025-07-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Clear LMTT page to avoid leaking data from one VF to another - Align PF queue size to power of 2 - Disable Indirect Ring State to avoid intermittent issues on context switch: feature is not currently needed, so can be disabled for now. - Fix compression handling when the BO pages are very fragmented - Restore display pm on error path - Fix runtime pm handling in xe devcoredump - Fix xe_pm_set_vram_threshold() doc - Recommend new minor versions of GuC firmware - Drop some workarounds on VF - Do not use verbose GuC logging by default: it should be only for debugging Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/s6jyd24mimbzb4vxtgc5vupvbyqplfep2c6eupue7znnlbhuxy@lmvzexfzhrnn	2025-07-11 11:35:39 +02:00
Lucas De Marchi	74806f69b8	drm/xe/guc: Default log level to non-verbose Currently xe sets the guc log level to a verbose level since it's useful to debug hangs and general development. However the verbose level may already be too much and affect performance. Michal Mrozek did some tests with the L0 compute stack for submission latency with ULLS disabled. Below are the normalized numbers with log level 3 (the current default) as baseline for each test: Test \ Log Level 3 0 1 2 ----------------------------------------------------------- ------ ------ ------ ------ BestWalkerNthCommandListSubmission(CmdListCount=2) 1.00 0.63 0.63 0.96 BestWalkerNthSubmission(KernelCount=2) 1.00 0.62 0.63 0.96 BestWalkerNthSubmissionImmediate(KernelCount=2) 1.00 0.58 0.58 0.85 BestWalkerSubmission 1.00 0.62 0.62 0.96 BestWalkerSubmissionImmediate 1.00 0.63 0.62 0.96 BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=2) 1.00 0.58 0.58 0.86 BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=4) 1.00 0.70 0.70 0.83 BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=8) 1.00 0.53 0.52 0.78 Log level 2 is the first "verbose level" for GuC, where the biggest difference happens. Keep log level 3 for CONFIG_DRM_XE_DEBUG, but switch to 1, i.e. GUC_LOG_LEVEL_NON_VERBOSE, for "normal" builds. Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250613-guc-log-level-v2-1-cb84a63e49fe@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `a37128ba61`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-10 20:59:39 -07:00
Michal Wajdeczko	7a10175a42	drm/xe/bmg: Don't use WA 16023588340 and 22019338487 on VF These workarounds are not applicable for use by the VFs. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Tested-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Signed-off-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Link: https://lore.kernel.org/r/20250710103040.375610-2-jakub1.kolakowski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `1d2e2503e5`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-10 20:59:39 -07:00
Julia Filipchuk	8c01880509	drm/xe/guc: Recommend GuC v70.46.2 for BMG, LNL, DG2 UAPI compatibility version 1.22.2 Resolves various bugs. Recommend newer version. Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250626182805.1701096-13-daniele.ceraolospurio@intel.com (cherry picked from commit `0b64addcae`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-10 20:59:39 -07:00
Shuicheng Lin	0539c5eaf8	drm/xe/pm: Correct comment of xe_pm_set_vram_threshold() The parameter threshold is with size in MiB, not in bits. Correct it to avoid any confusion. v2: s/mb/MiB, s/vram/VRAM, fix return section. (Michal) Fixes: `30c399529f` ("drm/xe: Document Xe PM component") Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250708021450.3602087-2-shuicheng.lin@intel.com Reviewed-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `0efec05001`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-10 20:59:38 -07:00
Shuicheng Lin	253a174c06	drm/xe: Release runtime pm for error path of xe_devcoredump_read() xe_pm_runtime_put() is missed to be called for the error path in xe_devcoredump_read(). Add function description comments for xe_devcoredump_read() to help understand it. v2: more detail function comments and refine goto logic (Matt) Fixes: `c4a2e5f865` ("drm/xe: Add devcoredump chunking") Cc: stable@vger.kernel.org Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250707004911.3502904-6-shuicheng.lin@intel.com (cherry picked from commit `017ef1228d`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-10 20:59:38 -07:00
Shuicheng Lin	6d33df611a	drm/xe/pm: Restore display pm if there is error after display suspend xe_bo_evict_all() is called after xe_display_pm_suspend(). So if there is error with xe_bo_evict_all(), display pm should be restored. Fixes: `51462211f4` ("drm/xe/pxp: add PXP PM support") Fixes: `cb8f81c175` ("drm/xe/display: Make display suspend/resume work on discrete") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://lore.kernel.org/r/20250708035424.3608190-2-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `83dcee1785`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-10 20:59:38 -07:00
Hans de Goede	e778689390	drm/i915/bios: Apply vlv_fixup_mipi_sequences() to v2 mipi-sequences too It turns out that the fixup from vlv_fixup_mipi_sequences() is necessary for some DSI panel's with version 2 mipi-sequences too. Specifically the Acer Iconia One 8 A1-840 (not to be confused with the A1-840FHD which is different) has the following sequences: BDB block 53 (1284 bytes) - MIPI sequence block: Sequence block version v2 Panel 0 * Sequence 2 - MIPI_SEQ_INIT_OTP GPIO index 9, source 0, set 0 (0x00) Delay: 50000 us GPIO index 9, source 0, set 1 (0x01) Delay: 6000 us GPIO index 9, source 0, set 0 (0x00) Delay: 6000 us GPIO index 9, source 0, set 1 (0x01) Delay: 25000 us Send DCS: Port A, VC 0, LP, Type 39, Length 5, Data ff aa 55 a5 80 Send DCS: Port A, VC 0, LP, Type 39, Length 3, Data 6f 11 00 ... Send DCS: Port A, VC 0, LP, Type 05, Length 1, Data 29 Delay: 120000 us Sequence 4 - MIPI_SEQ_DISPLAY_OFF Send DCS: Port A, VC 0, LP, Type 05, Length 1, Data 28 Delay: 105000 us Send DCS: Port A, VC 0, LP, Type 05, Length 2, Data 10 00 Delay: 10000 us Sequence 5 - MIPI_SEQ_ASSERT_RESET Delay: 10000 us GPIO index 9, source 0, set 0 (0x00) Notice how there is no MIPI_SEQ_DEASSERT_RESET, instead the deassert is done at the beginning of MIPI_SEQ_INIT_OTP, which is exactly what the fixup from vlv_fixup_mipi_sequences() fixes up. Extend it to also apply to v2 sequences, this fixes the panel not working on the Acer Iconia One 8 A1-840. Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14605 Signed-off-by: Hans de Goede <hansg@kernel.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20250703143824.7121-1-hansg@kernel.org Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `11895f3759`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-07-10 11:35:20 -04:00
Simona Vetter	bd46cece51	drm/gem: Fix race in drm_gem_handle_create_tail() Object creation is a careful dance where we must guarantee that the object is fully constructed before it is visible to other threads, and GEM buffer objects are no difference. Final publishing happens by calling drm_gem_handle_create(). After that the only allowed thing to do is call drm_gem_object_put() because a concurrent call to the GEM_CLOSE ioctl with a correctly guessed id (which is trivial since we have a linear allocator) can already tear down the object again. Luckily most drivers get this right, the very few exceptions I've pinged the relevant maintainers for. Unfortunately we also need drm_gem_handle_create() when creating additional handles for an already existing object (e.g. GETFB ioctl or the various bo import ioctl), and hence we cannot have a drm_gem_handle_create_and_put() as the only exported function to stop these issues from happening. Now unfortunately the implementation of drm_gem_handle_create() isn't living up to standards: It does correctly finishe object initialization at the global level, and hence is safe against a concurrent tear down. But it also sets up the file-private aspects of the handle, and that part goes wrong: We fully register the object in the drm_file.object_idr before calling drm_vma_node_allow() or obj->funcs->open, which opens up races against concurrent removal of that handle in drm_gem_handle_delete(). Fix this with the usual two-stage approach of first reserving the handle id, and then only registering the object after we've completed the file-private setup. Jacek reported this with a testcase of concurrently calling GEM_CLOSE on a freshly-created object (which also destroys the object), but it should be possible to hit this with just additional handles created through import or GETFB without completed destroying the underlying object with the concurrent GEM_CLOSE ioctl calls. Note that the close-side of this race was fixed in `f6cd7daecf` ("drm: Release driver references to handle before making it available again"), which means a cool 9 years have passed until someone noticed that we need to make this symmetry or there's still gaps left :-/ Without the 2-stage close approach we'd still have a race, therefore that's an integral part of this bugfix. More importantly, this means we can have NULL pointers behind allocated id in our drm_file.object_idr. We need to check for that now: - drm_gem_handle_delete() checks for ERR_OR_NULL already - drm_gem.c:object_lookup() also chekcs for NULL - drm_gem_release() should never be called if there's another thread still existing that could call into an IOCTL that creates a new handle, so cannot race. For paranoia I added a NULL check to drm_gem_object_release_handle() though. - most drivers (etnaviv, i915, msm) are find because they use idr_find(), which maps both ENOENT and NULL to NULL. - drivers using idr_for_each_entry() should also be fine, because idr_get_next does filter out NULL entries and continues the iteration. - The same holds for drm_show_memory_stats(). v2: Use drm_WARN_ON (Thomas) Reported-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Tested-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: stable@vger.kernel.org Cc: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Signed-off-by: Simona Vetter <simona.vetter@intel.com> Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20250707151814.603897-1-simona.vetter@ffwll.ch	2025-07-09 15:53:34 +02:00
Thomas Zimmermann	f6bfc9afc7	drm/framebuffer: Acquire internal references on GEM handles Acquire GEM handles in drm_framebuffer_init() and release them in the corresponding drm_framebuffer_cleanup(). Ties the handle's lifetime to the framebuffer. Not all GEM buffer objects have GEM handles. If not set, no refcounting takes place. This is the case for some fbdev emulation. This is not a problem as these GEM objects do not use dma-bufs and drivers will not release them while fbdev emulation is running. Framebuffer flags keep a bit per color plane of which the framebuffer holds a GEM handle reference. As all drivers use drm_framebuffer_init(), they will now all hold dma-buf references as fixed in commit `5307dce878` ("drm/gem: Acquire references on GEM handles for framebuffers"). In the GEM framebuffer helpers, restore the original ref counting on buffer objects. As the helpers for handle refcounting are now no longer called from outside the DRM core, unexport the symbols. v3: - don't mix internal flags with mode flags (Christian) v2: - track framebuffer handle refs by flag - drop gma500 cleanup (Christian) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: `5307dce878` ("drm/gem: Acquire references on GEM handles for framebuffers") Reported-by: Bert Karwatzki <spasswolf@web.de> Closes: https://lore.kernel.org/dri-devel/20250703115915.3096-1-spasswolf@web.de/ Tested-by: Bert Karwatzki <spasswolf@web.de> Tested-by: Mario Limonciello <superm1@kernel.org> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Anusha Srivatsa <asrivats@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: <stable@vger.kernel.org> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250707131224.249496-1-tzimmermann@suse.de	2025-07-09 14:03:28 +02:00
Matthew Auld	fee58ca135	drm/xe/bmg: fix compressed VRAM handling There looks to be an issue in our compression handling when the BO pages are very fragmented, where we choose to skip the identity map and instead fall back to emitting the PTEs by hand when migrating memory, such that we can hopefully do more work per blit operation. However in such a case we need to ensure the src PTEs are correctly tagged with a compression enabled PAT index on dgpu xe2+, otherwise the copy will simply treat the src memory as uncompressed, leading to corruption if the memory was compressed by the user. To fix this pass along use_comp_pat into emit_pte() on the src side, to indicate that compression should be considered. v2 (Jonathan): tweak the commit message Fixes: `523f191cc0` ("drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Akshata Jahagirdar <akshata.jahagirdar@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250701103949.83116-2-matthew.auld@intel.com (cherry picked from commit `f7a2fd776e`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-07 20:57:17 -07:00
Matthew Brost	daa099fed5	Revert "drm/xe/xe2: Enable Indirect Ring State support for Xe2" This reverts commit `fe0154cf82`. Seeing some unexplained random failures during LRC context switches with indirect ring state enabled. The failures were always there, but the repro rate increased with the addition of WA BB as a separate BO. Commit `3a1edef8f4` ("drm/xe: Make WA BB part of LRC BO") helped to reduce the issues in the context switches, but didn't eliminate them completely. Indirect ring state is not required for any current features, so disable for now until failures can be root caused. Cc: stable@vger.kernel.org Fixes: `fe0154cf82` ("drm/xe/xe2: Enable Indirect Ring State support for Xe2") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250702035846.3178344-1-matthew.brost@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `03d85ab36b`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-07 20:57:17 -07:00
Matthew Brost	c9a95dbe06	drm/xe: Allocate PF queue size on pow2 boundary CIRC_SPACE does not work unless the size argument is a power of 2, allocate PF queue size on power of 2 boundary. Cc: stable@vger.kernel.org Fixes: `3338e4f90c` ("drm/xe: Use topology to determine page fault queue size") Fixes: `29582e0ea7` ("drm/xe: Add page queue multiplier") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://lore.kernel.org/r/20250702213511.3226167-1-matthew.brost@intel.com (cherry picked from commit `491b978312`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-07 20:57:17 -07:00
Michal Wajdeczko	705a412a36	drm/xe/pf: Clear all LMTT pages on alloc Our LMEM buffer objects are not cleared by default on alloc and during VF provisioning we only setup LMTT PTEs for the actually provisioned LMEM range. But beyond that valid range we might leave some stale data that could either point to some other VFs allocations or even to the PF pages. Explicitly clear all new LMTT page to avoid the risk that a malicious VF would try to exploit that gap. While around add asserts to catch any undesired PTE overwrites and low-level debug traces to track LMTT PT life-cycle. Fixes: `b1d2040582` ("drm/xe/pf: Introduce Local Memory Translation Table") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250701220052.1612-1-michal.wajdeczko@intel.com (cherry picked from commit `3fae6918a3`) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-07-07 20:57:07 -07:00
Ben Skeggs	d133036a0b	drm/nouveau/gsp: fix potential leak of memory used during acpi init If any of the ACPI calls fail, memory allocated for the input buffer would be leaked. Fix failure paths to free allocated memory. Also add checks to ensure the allocations succeeded in the first place. Reported-by: Danilo Krummrich <dakr@kernel.org> Fixes: `176fdcbddf` ("drm/nouveau/gsp/r535: add support for booting GSP-RM") Signed-off-by: Ben Skeggs <bskeggs@nvidia.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250617040036.2932-1-bskeggs@nvidia.com	2025-07-07 16:32:44 +02:00
Tamir Duberstein	3d44147494	rust: drm: remove unnecessary imports `kernel::str::CStr` is included in the prelude. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250704-cstr-include-drm-v1-1-a279dfc4d753@gmail.com	2025-07-05 13:01:59 +02:00
Alessio Belle	d38376b3ee	drm/imagination: Fix kernel crash when hard resetting the GPU The GPU hard reset sequence calls pm_runtime_force_suspend() and pm_runtime_force_resume(), which according to their documentation should only be used during system-wide PM transitions to sleep states. The main issue though is that depending on some internal runtime PM state as seen by pm_runtime_force_suspend() (whether the usage count is <= 1), pm_runtime_force_resume() might not resume the device unless needed. If that happens, the runtime PM resume callback pvr_power_device_resume() is not called, the GPU clocks are not re-enabled, and the kernel crashes on the next attempt to access GPU registers as part of the power-on sequence. Replace calls to pm_runtime_force_suspend() and pm_runtime_force_resume() with direct calls to the driver's runtime PM callbacks, pvr_power_device_suspend() and pvr_power_device_resume(), to ensure clocks are re-enabled and avoid the kernel crash. Fixes: `cc1aeedb98` ("drm/imagination: Implement firmware infrastructure and META FW support") Signed-off-by: Alessio Belle <alessio.belle@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://lore.kernel.org/r/20250624-fix-kernel-crash-gpu-hard-reset-v1-1-6d24810d72a6@imgtec.com Cc: stable@vger.kernel.org Signed-off-by: Matt Coster <matt.coster@imgtec.com>	2025-07-04 16:32:10 +01:00
Mikko Perttunen	44306a684c	drm/tegra: nvdec: Fix dma_alloc_coherent error check Check for NULL return value with dma_alloc_coherent, in line with Robin's fix for vic.c in 'drm/tegra: vic: Fix DMA API misuse'. Fixes: `46f226c93d` ("drm/tegra: Add NVDEC driver") Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20250702-nvdec-dma-error-check-v1-1-c388b402c53a@nvidia.com	2025-07-04 11:15:07 +02:00
Dave Airlie	da8d8e9001	Driver Changes: - Fix chunking the PTE updates and overflowing the maximum number of dwords with with MI_STORE_DATA_IMM (Jia Yao) - Move WA BB to the LRC BO to mitigate hangs on context switch (Matthew Brost) - Fix frequency/flush WAs for BMG (Vinay / Lucas) - Fix kconfig prompt title and description (Lucas) - Do not require kunit (Harry Austen / Lucas) - Extend 14018094691 WA to BMG (Daniele) - Fix wedging the device on signal (Matthew Brost) -----BEGIN PGP SIGNATURE----- iQJNBAABCgA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmhnDlcZHGx1Y2FzLmRl bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU5eDD/9DlIpRAjb5mj60DeOQTr4c yJHum2Jj6CJZzRETiwwtOrjFm1ZdRT7YfpRyDCYmT+Pv2fliJqGbp8ozuxRKJtmA 4f3P/9aldNAyiEDF1KiDe0rdPsmk67dv48BOTCRribJMaND+jO8kQ3xB95x/hw+z LWLgEKnneXnneslKT18Vn62h7QQBiyB1K2ucbDMgfz0UdT4HQVlabx5yOzrxub31 O1oE/ISCzIM10CZU6EBSN6gqPNpUHoBuTb3UAIXhs3AKCR8QBwkC2s/UsHYn0TFg 2A4zWSMJBVIuC4N7bXqX8Xh4MJJOjt8JAPw73/oLy3CFgT8JpgAYsby6ye01IbAz kcEE5FEo6wuJkGK59nyIaLFhRASm14+y2FtMahj/HJlhG0gpLizalBzIX75wKYz4 62qjOo6zTZjl5kavBzZCi1eu8tNy/pqzh3ZCVKPmil28usLf4HlViwy/gQZtk/3i ZZ9zDgBh1JGe3SLcGcaTBXgBIYHykPbE+k6+2l8cSNN6tD/sSliGC8gYrQzNP/FQ 6Jj9Repd+rukG0KLtFv82Ab8Ip6YmZQWfskLov9Hr5u2fl0qOkjMjMe33cun2+fS KGI5cgzzNg4yh/PEX7AkdmlwEio+h5IwrB+VuyjJ44/uMBQNvWgK/6RyJ7U7lSgS /3l8T7CDujVIiLFBEfSk9A== =/+WM -----END PGP SIGNATURE----- Merge tag 'drm-xe-fixes-2025-07-03' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix chunking the PTE updates and overflowing the maximum number of dwords with with MI_STORE_DATA_IMM (Jia Yao) - Move WA BB to the LRC BO to mitigate hangs on context switch (Matthew Brost) - Fix frequency/flush WAs for BMG (Vinay / Lucas) - Fix kconfig prompt title and description (Lucas) - Do not require kunit (Harry Austen / Lucas) - Extend 14018094691 WA to BMG (Daniele) - Fix wedging the device on signal (Matthew Brost) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/o5662wz6nrlf6xt5sjgxq5oe6qoujefzywuwblm3m626hreifv@foqayqydd6ig	2025-07-04 10:01:53 +10:00
Dave Airlie	8f954c435f	- Fixed raw pointer leakage and unsafe behavior in printk() . Switch from %pK to %p for pointer formatting, as %p is now safer and prevents issues like raw pointer leakage and acquiring sleeping locks in atomic contexts. -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEEoxi+6c5pRPV/gdXcxWAb7Og/+bYFAmhhA60ACgkQxWAb7Og/ +bbwOQv5Ad+grXRw2Ef6mwk0NFq3uLdnvXrvVj6pjdvt1cLSY4vbEKzojgnTMbv3 rpu3zVGINhZGdC8JWD+PWU9wKqRUvUjYQWPKO83XsGJXYtm/CqGtOtTR+8fxW+dA B2dXWrctDQIyizB55nzFIgL1bGFegvX7aRom/GVEgN9yU7yd6xm+bP2SyJUy/N/I xMXZukSGPhiethSKRF+2MbuDEZ7KpLiX9SDARo8OWqwAEMsRDD3wuCKP9S7Y0Gri UWuEeo45l1aNzHRFa+ZH55p1uRDv6ojFHu1EAug14Y7Y9zPm25/LKMCsM09fyg7l /Hnxcu5/O2mJyghKXistTuYtMDpRRKSYQ6Zhh94Vnh6uBMie0Hi6AwsvS+5MHdCs rzuzhl2Z2litG0VS/wwQmki2YA155etwNpMFdc2zKHMmFfAyW9KbAxK7ojpIYxf+ RgNIFOpEGamB7FesNTk8SuGHznP04A/QNTtJoDr2fTa7noyReAgtqk1TCY4wWhIY CHAdbVAC =hSha -----END PGP SIGNATURE----- Merge tag 'samsung-dsim-fixes-for-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes - Fixed raw pointer leakage and unsafe behavior in printk() . Switch from %pK to %p for pointer formatting, as %p is now safer and prevents issues like raw pointer leakage and acquiring sleeping locks in atomic contexts. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Inki Dae <inki.dae@samsung.com> Link: https://lore.kernel.org/r/20250629091742.29956-1-inki.dae@samsung.com	2025-07-04 09:40:20 +10:00
Dave Airlie	ac2ad73e75	Fixups - Fixed raw pointer leakage and unsafe behavior in printk() . Switch from %pK to %p for pointer formatting, as %p is now safer and prevents issues like raw pointer leakage and acquiring sleeping locks in atomic contexts. - Fixed kernel panic during boot . A NULL pointer dereference issue occasionally occurred when the vblank interrupt handler was called before the DRM driver was fully initialized during boot. So this patch fixes the issue by adding a check in the interrupt handler to ensure the DRM driver is properly initialized. - Fixed a lockup issue on Samsung Peach-Pit/Pi Chromebooks . The issue occurred after commit `c9b1150a68` changed the call order of CRTC enable/disable and bridge pre_enable/post_disable methods, causing fimd_dp_clock_enable() to be called before the FIMD device was activated. To fix this, runtime PM guards were added to fimd_dp_clock_enable() to ensure proper operation even when CRTC is not enabled. -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEEoxi+6c5pRPV/gdXcxWAb7Og/+bYFAmhg968ACgkQxWAb7Og/ +bbg7gv/RneHwOTWolCay8C1+6GQWzboB29z48ZciM2n88Rvm3HzF7ov9PdzOuzN aqB91tcgX9qBVriylSCIef2uc7KlyC70q4f8mxBUdPujlvBnxAf2pbjszr30REc5 IqfETAuI4wnPb6HHuaAyq8762Vkj+chhlv+UK30u+q0K14Krl76mPxWl/drLONiL YUxrzgZNyM4QzaeFUmLczcBiqj4yr+Du6ovboTOXpWlV/8f6Sem47VyyAKWAV1lk h2AvfFQ+Dm3E1B/tnVp81CR4vNSsbtx7Lnv360XUzHhMh2TaZdrMBsg+H3Lhhne3 e7/IdOpQvOIdmx+6p7Mt9FrjYURx5eJtK80elUI/CFe4mImsKJJwb0YKK99AY9MM KZ4FarVC1fZoqNgwssLR7Aum8gnPII7QY/R3I9nXo4he9+P8I+Ta/DGpinIa6rjZ g2tOL7yPyCkpHqS+VBUxiGJZwEIAjfEE8qkyMgL5AA48tQ7Nz3W7PlGJnSKWozcY 3TdkX16X =jqfC -----END PGP SIGNATURE----- Merge tag 'exynos-drm-fixes-for-v6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes Fixups - Fixed raw pointer leakage and unsafe behavior in printk() . Switch from %pK to %p for pointer formatting, as %p is now safer and prevents issues like raw pointer leakage and acquiring sleeping locks in atomic contexts. - Fixed kernel panic during boot . A NULL pointer dereference issue occasionally occurred when the vblank interrupt handler was called before the DRM driver was fully initialized during boot. So this patch fixes the issue by adding a check in the interrupt handler to ensure the DRM driver is properly initialized. - Fixed a lockup issue on Samsung Peach-Pit/Pi Chromebooks . The issue occurred after commit `c9b1150a68` changed the call order of CRTC enable/disable and bridge pre_enable/post_disable methods, causing fimd_dp_clock_enable() to be called before the FIMD device was activated. To fix this, runtime PM guards were added to fimd_dp_clock_enable() to ensure proper operation even when CRTC is not enabled. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Inki Dae <inki.dae@samsung.com> Link: https://lore.kernel.org/r/20250629083554.28628-1-inki.dae@samsung.com	2025-07-04 09:38:01 +10:00
Dave Airlie	afd30ace71	Merge tag 'drm-intel-fixes-2025-07-03' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Make mei interrupt top half irq disabled to fix RT builds - Fix timeline left held on VMA alloc error - Fix NULL pointer deref in vlv_dphy_param_init() - Fix selftest mock_request() to avoid NULL deref Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://lore.kernel.org/r/aGYVPAA4KvsZqDFx@jlahtine-mobl	2025-07-04 09:26:57 +10:00
Dave Airlie	b91e11ec5c	Merge tag 'drm-misc-fixes-2025-07-03' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.16-rc5: - Replace simple panel lookup hack with proper fix. - nullpointer deref in vesadrm fix. - fix dma_resv_wait_timeout. - fix error handling in ttm_buffer_object_transfer. - bridge fixes. - Fix vmwgfx accidentally allocating encrypted memory. - Fix race in spsc_queue_push() - Add refcount on backing GEM objects during fb creation. - Fix v3d irq's being enabled during gpu reset. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/a7461418-08dc-4b7c-b2fa-264155f66d5e@linux.intel.com	2025-07-04 09:06:57 +10:00
Dave Airlie	e79d0ba605	nouveau/gsp: add a 50ms delay between fbsr and driver unload rpcs This fixes a bunch of command hangs after runtime suspend/resume. This fixes a regression caused by code movement in the commit below, the commit seems to just change timings enough to cause this to happen now, and adding the sleep seems to avoid it. I've spent some time trying to root cause it to no great avail, it seems like a bug on the firmware side, but it could be a bug in our rpc handling that I can't find. Either way, we should land the workaround to fix the problem, while we continue to work out the root cause. Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: Ben Skeggs <bskeggs@nvidia.com> Cc: Danilo Krummrich <dakr@kernel.org> Fixes: `c21b039715` ("drm/nouveau/gsp: add hals for fbsr.suspend/resume()") Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250702232707.175679-1-airlied@gmail.com	2025-07-04 00:22:12 +02:00
Aaron Thompson	78f88067d5	drm/nouveau: Do not fail module init on debugfs errors If CONFIG_DEBUG_FS is enabled, nouveau_drm_init() returns an error if it fails to create the "nouveau" directory in debugfs. One case where that will happen is when debugfs access is restricted by CONFIG_DEBUG_FS_ALLOW_NONE or by the boot parameter debugfs=off, which cause the debugfs APIs to return -EPERM. So just ignore errors from debugfs. Note that nouveau_debugfs_root may be an error now, but that is a standard pattern for debugfs. From include/linux/debugfs.h: "NOTE: it's expected that most callers should _ignore_ the errors returned by this function. Other debugfs functions handle the fact that the "dentry" passed to them could be an error and they don't crash in that case. Drivers should generally work fine even if debugfs fails to init anyway." Fixes: `97118a1816` ("drm/nouveau: create module debugfs root") Cc: stable@vger.kernel.org Signed-off-by: Aaron Thompson <dev@aaront.org> Acked-by: Timur Tabi <ttabi@nvidia.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://lore.kernel.org/r/20250703211949.9916-1-dev@aaront.org	2025-07-03 23:56:33 +02:00

1 2 3 4 5 ...

115300 Commits