Neat! Parallelizing a single frame across multiple machines was something I'd wanted to try back when I was working on RenderMan. It could do this back in the REYES days via netrender, but that capability was lost with the move to path tracing on the RIS architecture.
Could you go into a bit more detail on how the work is distributed? Is it per tile (or some other screen-space division like macro-tiles or scan-lines)? Per sample pass? (Surely it's not scene distribution like the old Kilauea renderer from Square!) Dynamic or static scheduling? Sorry, so many questions. :-)
My knowledge is probably outdated at this point (the now open source code is probably a better reference than my memory!) but at the time it was exactly as you described. Each workstation loaded the scene independently, work was distributed as screen-space tiles, and final assembly of the tiles was done on the client. I can’t remember if we implemented a work-stealing queue to load balance the tiles or not… my brain may be inventing details on that part. :)
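For anyone who hasn't run into this pattern, here's a rough sketch of what pull-based (dynamic) tile scheduling looks like, with threads standing in for workstations and the shared counter standing in for the network tile queue. All the names and the "shading" are illustrative, not from the actual codebase:

```cpp
#include <algorithm>
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

struct Tile { int x0, y0, x1, y1; };

int main() {
    const int width = 1920, height = 1080, tileSize = 64;

    // Split the frame into screen-space tiles.
    std::vector<Tile> tiles;
    for (int y = 0; y < height; y += tileSize)
        for (int x = 0; x < width; x += tileSize)
            tiles.push_back({x, y, std::min(x + tileSize, width),
                                   std::min(y + tileSize, height)});

    // Framebuffer assembled on the "client" as tiles come back.
    std::vector<float> framebuffer(width * height, 0.0f);

    // Shared work index: idle workers pull the next unclaimed tile,
    // which is what makes the scheduling dynamic rather than static.
    std::atomic<size_t> next{0};

    auto worker = [&](int id) {
        for (;;) {
            size_t i = next.fetch_add(1);
            if (i >= tiles.size()) return;
            const Tile& t = tiles[i];
            // Placeholder "shading": mark the tile with the worker id.
            for (int y = t.y0; y < t.y1; ++y)
                for (int x = t.x0; x < t.x1; ++x)
                    framebuffer[y * width + x] = float(id + 1);
        }
    };

    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) workers.emplace_back(worker, i);
    for (auto& w : workers) w.join();

    std::printf("rendered %zu tiles\n", tiles.size());
}
```

Fast workers naturally end up with more tiles, which is most of the load balancing you need as long as tiles are small relative to the frame.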
I built a scene distribution renderer similar to Kilauea for my master's thesis in school, except with a feed-forward shader design that exploited the linearity of the color space so that results of computations were never sent back up the call stack… kind of neat, but yeah, all sorts of reasons why that kind of design would not work well under production workloads. And RAM has gotten so stinking cheap!
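To make the feed-forward idea concrete, here's a toy single-machine sketch of what I mean by never returning results up the call stack: because transport is linear, each bounce's contribution can be weighted by the running throughput and deposited straight into the pixel as you go, instead of being accumulated on the way back out of a recursion. Everything here (the fake emission and BRDF values, the fixed bounce count) is illustrative, not my thesis code:

```cpp
#include <cstdio>

struct Vec3 { float r, g, b; };
Vec3 operator*(Vec3 a, Vec3 b) { return {a.r * b.r, a.g * b.g, a.b * b.b}; }
Vec3 operator+(Vec3 a, Vec3 b) { return {a.r + b.r, a.g + b.g, a.b + b.b}; }

int main() {
    const int spp = 64, maxBounces = 8;
    Vec3 pixel{0, 0, 0};

    for (int s = 0; s < spp; ++s) {
        // Throughput is carried *forward* along the path; nothing is returned.
        Vec3 throughput{1, 1, 1};
        for (int bounce = 0; bounce < maxBounces; ++bounce) {
            Vec3 emitted{0.1f, 0.1f, 0.1f};       // stand-in for light found at this vertex
            pixel = pixel + throughput * emitted; // deposit the contribution immediately
            Vec3 brdf{0.7f, 0.6f, 0.5f};          // stand-in for BRDF * cos / pdf
            throughput = throughput * brdf;       // attenuate for the next path segment
        }
    }

    pixel = {pixel.r / spp, pixel.g / spp, pixel.b / spp};
    std::printf("pixel = (%f, %f, %f)\n", pixel.r, pixel.g, pixel.b);
}
```

In the distributed version the appeal was that a machine holding part of the scene could just forward the ray plus its accumulated throughput to whichever machine owned the next geometry, and splat contributions directly, rather than holding state waiting for a reply.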