Bug: code devices become unavailable for computation

My code devices work normally in most cases, but as their number increases, they occasionally enter a state where computation is unavailable. In this state, every code device stops working, and restarting the project does not help. The only way to recover is to use ‘Disable device’ to disable all devices and then restart the project.


Unfortunately, OpenCL drivers are not particularly robust, and crashes within the GPU kernel code can happen with kernels that either misbehave or are simply too demanding. OpenCL doesn’t provide a clean way for a process to tear down and restart the driver: once a true crash has happened, you have to kill the process itself.

One solution I’ve investigated is to spawn child processes and perform all operations within them. This is nontrivial to implement, though, so it’s not an imminent solution.
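To give a rough idea of what the child-process approach means, here is a minimal sketch in Python. It is not the application's actual code: `run_isolated`, `GOOD_JOB` and `CRASHING_JOB` are hypothetical names, and plain Python snippets stand in for the real OpenCL work. The point is only that a crash inside the child leaves the parent process alive, so the next job can start in a fresh process.

```python
import subprocess
import sys
from typing import Tuple


def run_isolated(worker_source: str, timeout_s: float = 30.0) -> Tuple[bool, str]:
    """Run worker_source in a fresh Python process and return (ok, output).

    A crash or hang in the child cannot take the parent down with it;
    the parent only sees a non-zero exit code or a timeout.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", worker_source],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return False, "timed out"
    if proc.returncode != 0:
        return False, f"child exited with code {proc.returncode}"
    return True, proc.stdout.strip()


# Hypothetical stand-ins for GPU jobs (assumptions, not real kernels).
GOOD_JOB = "print(6 * 7)"
# Simulates a hard failure: the child terminates immediately with exit code 1.
CRASHING_JOB = "import os; os._exit(1)"
```

The hard part in a real application is everything this sketch leaves out: moving buffers and results between processes, and keeping per-job startup cost low.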
