2017-08-07: copy performance, VC5 scheduling, xserver CI

The biggest news this week is that I’ve resolved the window movement performance on Raspbian. After all my work on improving vc4 performance for overlapping CopyArea to parity with SW rendering, the end-user experience was still awful. It turned out that openbox just had a really naive event loop for window movement, so each pixel of motion of the mouse that got reported resulted in a window movement, even if there were 100 of those queued to the window manager in a stream.

I first fixed openbox to consume all the motion events it could and then set the window to the last position it knew of. This didn’t quite work, because thanks to it was only seeing one motion event read in at a time, and queuing up a reposition in between each one.

Next, I made openbox do a round-trip to the server after each copy, so that we get all the motion collected at once per copy that happens on the server side. Now, window dragging keeps up with the mouse as fast as you can wiggle it.

I also worked on backporting the NEON assembly for texture upload/download to the ARMv6 build. Jonas Pfeil started on this, but I want to rework the detection logic a bit to match what Mesa does for other asm support. That should land next week, so that we can have NEON on Raspbian.

On the VC5 front, the next big step is QPU instruction scheduling. So far I’ve removed a duplicate list of QPU instructions, and spent time thinking about how I want to structure the input to the scheduler (it would be good to have instructions reading uniforms be able to be reordered, but we can’t do that if I have the ldunif signal separate from the usage like I do today).

Finally, when I was reviewing daniels’s X11 Present modifiers protocol, I complained that he was representing 64-bit values as pairs of 32-bit values, when we (finally!) recently started using 64-bit values on the wire and in our code. He replied that he had tried that, but the SYNC extension was #defining CARD64 to an XSyncValue and there were conflicts over CARD64. CARD* is X11’s custom not-quite-stdint.h types for sized unsigned values, and XSyncValue is a struct with 2 32-bit values to represent a signed 64-bit value.

My solution was to just remove the CARD64 usage in the SYNC extension, packing and unpacking int64_ts from the wire protocol and then using normal 64-bit math. On 32-bit processors this should expand to the same code as before, but on 64-bit we will probably get actual 64-bit math that we didn’t before (unless your compiler is pretty clever). It also cleaned up some gross code for making XSyncValues of small values like “4” and “0”.

However, since I modified the code, I wanted to run the tests, and I found no tests. Now that we have the meson build system and simple-xinit, it was trivial to write some XCB touch-testing of the protocol and verify that my changes at least didn’t regress the protocol I’ve tested. Hopefully these tests can be a model for future automated protocol testing under “make check”. I didn’t port the test to automake, because automake makes testing awful.

Also this week:

  • Fixed a kernel crashing regression when vc4 fails to load completely since the BO labeling debug work.
  • Started merging a bunch of previous vc4 kernel patches thanks to review from Boris
  • Wrote a piglit test for the overlapping copy extension
  • Fixed up xserver’s Travis CI to run tests that start an X Server.