This week was spent almost entirely on the VC5 QPU instruction scheduler. This is what packs together the ADD and MUL instruction components and the signal flags (LDUNIF, LDVPM, LDTMU, THRSW, etc.) into the final sequence of 64-bit instructions.
I based it on the VC4 scheduler, which in turn is based on i965’s. Being the 5th of these I’ve worked on, it would sure be nice if it felt like less copy and paste, but it’s almost all very machine-dependent code and I haven’t come up with a way to reduce the duplication.
The initial results are great:
instructions in affected programs: 116269 -> 71677 (-38.35%)
but I’ve still got a handful of regressions to fix.
Next up for scheduling is to fill the thrsw and branch delay slots. I also need to pack more than 2 things into one QPU instruction – right we pick two of ADD, MUL, LDUNIF, and LDVPM, but we could do an ADD, MUL, LDVPM, and LDUNIF all together. That should be a small change from here.
Other bits this week:
- Fixed a leak of the EDID data from the HDMI connector (~256 bytes per xrandr -q invocation)
- Reviewed modifier blob support from daniels for better atomic-modesetting weston on vc4
- Pushed the RCL walk order DRM support.
- Reworked patches for turning on vc4 NEON support in Raspbian builds, to fix Android builds.