I'm not sure the GPU is even needed for an operation like this. It's basically modifying a few dozen bytes in the framebuffer every second. It would be interesting to disable all graphics acceleration, so the CPU does all the work, and see what difference that makes to the test.
I would be much more interested in being able to tell both the GPU and the CPU to stay in low-power mode while performing this operation.
It does not need to scale up to high performance to:
- Read a piece of memory
- Increment it by one and apply a modulo
- Display a section of a texture on a section of the screen.
You will very likely spend more time passing memory around than doing anything else, and to be honest, if it happens every second I would hope the data stays in cache so you wouldn’t even have to touch main memory.
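To make the three steps above concrete, here is a minimal CPU-only sketch in Python. Everything in it is an assumption for illustration: the 8x8 glyph size, the one-byte-per-pixel format, the 64x16 framebuffer, and the names `make_atlas`, `blit_digit` and `tick` are all made up, and a real clock would draw actual glyphs rather than placeholder values.

```python
# Hypothetical sizes: an atlas holding digits 0-9 side by side in 8x8 cells,
# one byte per pixel, and a small 64x16 framebuffer.
GLYPH_W, GLYPH_H = 8, 8
ATLAS_W = GLYPH_W * 10
FB_W, FB_H = 64, 16

def make_atlas():
    """Build a stand-in atlas: every pixel of digit d's cell holds d + 1."""
    atlas = bytearray(ATLAS_W * GLYPH_H)
    for y in range(GLYPH_H):
        for x in range(ATLAS_W):
            atlas[y * ATLAS_W + x] = x // GLYPH_W + 1
    return atlas

def blit_digit(fb, atlas, digit, dst_x, dst_y):
    """Copy one glyph cell from the atlas into the framebuffer.

    Returns the number of bytes written.
    """
    written = 0
    for row in range(GLYPH_H):
        src = row * ATLAS_W + digit * GLYPH_W
        dst = (dst_y + row) * FB_W + dst_x
        fb[dst:dst + GLYPH_W] = atlas[src:src + GLYPH_W]
        written += GLYPH_W
    return written

def tick(state, fb, atlas):
    """Advance the seconds counter and redraw its two digits."""
    # Read a piece of memory, increment it by one, apply a modulo.
    state["seconds"] = (state["seconds"] + 1) % 60
    tens, ones = divmod(state["seconds"], 10)
    # Display a section of a texture on a section of the screen.
    written = blit_digit(fb, atlas, tens, 0, 0)
    written += blit_digit(fb, atlas, ones, GLYPH_W, 0)
    return written

fb = bytearray(FB_W * FB_H)
atlas = make_atlas()
state = {"seconds": 58}
for _ in range(3):
    touched = tick(state, fb, atlas)
print(state["seconds"], touched)
```

With these made-up sizes each tick writes 128 bytes, and both the atlas and the touched region are small enough to fit in cache on pretty much anything. Redrawing only the digits that actually changed would cut that further.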
Yeah, some mobile GPUs actually have hardware compositors that can do this. They can even support moving windows around with pretty low overhead.
But the software support to take advantage of it isn't really there. There isn't a standard API to access such functionality, so the hardware compositors end up unused, and the vendors don't really put much effort into improving them.
But with proper software support and a hardware compositor with enough flexibility, you could easily put the clock in its own texture and update it with very low power consumption.
Actually, desktop GPUs already have a single hardware sprite that gets used for moving the mouse cursor around with very little overhead (and lower latency).