I continued spending time on VC5 in the last two weeks.
First, I’ve ported the driver over to the AMDGPU scheduler. Prior to this, vc4 and vc5’s render jobs get queued to the HW in the order that the GL clients submit them to the kernel. OpenGL requires that jobs within a client effectively happen in that order (though we do some clever rescheduling in userspace to reduce overhead of some render-to-texture workloads due to us being a tiler). However, having submission order to the kernel dictate submission order to the HW means that a single busy client (imagine a crypto miner) will starve your desktop workload, since the desktop has to wait behind all of the bulk-work jobs the other client has submitted.
With the AMDGPU scheduler, each client gets its own serial run queue, and the scheduler picks between them as jobs in the run queues become ready. It also gives us easy support for in-fences on your jobs, one of the requirements for Android. All of this is with a bit less vc5 driver code than I had for my own, inferior scheduler.
Currently I’m making it most of the way through piglit and conformance test runs, before something goes wrong around the time of a GPU reset and the kernel crashes. In the process, I’ve improved the documentation on the scheduler’s API, and hopefully this encourages other drivers to pick it up.
Second, I’ve been working on debugging some issues that may be TLB flushing bugs. On the piglit “longprim” test, we go through overflow memory quickly, and allocating overflow memory involves updating PTEs and then having the GPU read from those in very short order. I see a lot of GPU segfaults on non-writable PTEs where the new overflow BO was allocated just after the last one (so maybe the lookups that happened near the end of the last one pre-fetched some PTEs from our space?). The confusing part is that I keep getting write errors far past where I would have expected any previous PTE lookups to have gone. Yet, outside of this case and maybe a couple of others within piglit and the CTS, we seem to be completely fine at PTE updates.
On the VC4 front, I wrote some docs for what I think the steps are for people that want to connect new DSI panels to Raspberry Pi. I reviewed Stefan’s patches for using the CTM for color correction on Android (promising, except I’m concerned it applies at the wrong stage of the DRM display pipeline), and some of Boris’s work on async updates (simplifying our cursor and async pageflip path). I also reviewed an Intel patch that’s necessary for a core DRM change we want for our SAND display support, and a Mesa patch fixing a regression with the new modifiers code.