git.itanic.dy.fi Git - linux-stable/commitdiff
drm/amdkfd: Reset GPU on queue preemption failure
author Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Tue, 26 Mar 2024 19:32:46 +0000 (15:32 -0400)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 17 Apr 2024 09:23:37 +0000 (11:23 +0200)
commit 8bdfb4ea95ca738d33ef71376c21eba20130f2eb upstream.

Currently, with F32 HWS, a GPU reset is triggered only when unmap queue
fails.

However, if a compute queue doesn't respond to the preemption request in
time, unmap returns without any error. In this case, only the preemption
error is logged and no reset is triggered. Trigger a GPU reset in this
case as well.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c

index c0e71543389a9b7c410a8072332004eb7dea4f08..c0ae1a97498b5ff52249e272bc2e957150436864 100644
@@ -1997,6 +1997,7 @@ static int unmap_queues_cpsch(struct device_queue_manager *dqm,
                dev_err(dev, "HIQ MQD's queue_doorbell_id0 is not 0, Queue preemption time out\n");
                while (halt_if_hws_hang)
                        schedule();
+               kfd_hws_hang(dqm);
                return -ETIME;
        }
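
The control flow this hunk changes can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the driver code: `struct mock_dqm`, `mock_hws_hang()` and `mock_unmap_queues()` are hypothetical stand-ins for `struct device_queue_manager`, `kfd_hws_hang()` and `unmap_queues_cpsch()`, and the reset is modeled as a simple counter.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct device_queue_manager. */
struct mock_dqm {
	bool unmap_fails;        /* the unmap request itself reports an error */
	bool queue_still_active; /* the queue ignored the preemption request */
	int resets_requested;    /* how many times a reset was requested */
};

/* Stand-in for kfd_hws_hang(); this model only records the request. */
static void mock_hws_hang(struct mock_dqm *dqm)
{
	dqm->resets_requested++;
}

/* Simplified model of the two failure paths described in the commit message. */
static int mock_unmap_queues(struct mock_dqm *dqm)
{
	if (dqm->unmap_fails) {
		/* Path that already requested a reset before this patch. */
		mock_hws_hang(dqm);
		return -ETIME;
	}

	if (dqm->queue_still_active) {
		/*
		 * Unmap reported no error, but the queue was not preempted.
		 * Before the patch this path only logged the error and
		 * returned; the added call requests a reset here too.
		 */
		mock_hws_hang(dqm);
		return -ETIME;
	}

	return 0;
}
```

In this model a caller sees the same `-ETIME` return value on the preemption timeout path before and after the change; the difference is that a reset is now requested before returning.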