
this works quite well, and does not necessarily require any NN/machine learning. see the YouTube video for this paper: https://www.disneyresearch.com/publication/scenespace/ tl;dr: a simple brute-force weighted average of samples from many frames, combined with a noisy/low-quality depth-from-motion estimate, can be used to de-noise, increase resolution, and otherwise manipulate video footage. very cool paper with great results from a simple technique.
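
to make the tl;dr concrete, here is a toy sketch of that weighted-average idea (my reading of the paper, not its actual code; the falloff function and the `samples` shape are my inventions):

```javascript
// Denoise one pixel by averaging the samples that many frames project onto it,
// weighting each sample by how much we trust its depth-from-motion estimate.
// `samples` is a hypothetical array of { value, depthError } entries gathered
// by reprojecting neighbouring frames into the current view.
function denoisePixel(samples) {
  let weighted = 0;
  let totalWeight = 0;
  for (const s of samples) {
    // trust falls off with depth error; the exact falloff here is made up
    const w = 1 / (1 + s.depthError * s.depthError);
    weighted += w * s.value;
    totalWeight += w;
  }
  return totalWeight > 0 ? weighted / totalWeight : 0;
}
```

the whole trick is that noise averages out across frames while the true signal reinforces itself, so even a rough depth estimate is enough to decide which samples to trust.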


LittleBigPlanet, our PS3 game, was (is) called 'ps3test1'. the sequel, LBP2, is... also called ps3test1. that project really was our first attempt to bring up a devkit, probably with a rotating cube.

the project, and compiled output, on every platform, is called 'pc.elf' (or .vcproj or .exe or whatever) SIGH

there's an inverse correlation between awesome-ness of directory name and chance-of-shipping, in my experience.


> there's an inverse correlation between awesome-ness of directory name and chance-of-shipping, in my experience.

Sackboy's law? It's absolutely true.

Thanks for sharing that about LBP: it's great to hear that my nominative inertia is shared by such high company.


I'm just gonna leave this here: http://hulkholden.github.com/n64js/ is a fully functioning N64 emulator in JavaScript, which dynamically recompiles MIPS asm to JS so that your browser can JIT it into native code. the dev blog is full of amazing JS perf knowledge too: http://n64js.blogspot.co.uk/ (there are videos there too, if you don't have any ROMs to hand)

source code is on GitHub. full disclosure: I work with Paul; he's very clever.
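
for anyone curious what "dynamically recompiles MIPS asm to JS" means in practice, here is a toy sketch of the principle (this is NOT n64js's actual code; the three-address instruction format is made up):

```javascript
// Toy dynamic recompiler: decode guest "instructions", emit a string of JS,
// and compile it with the Function constructor. The host engine can then JIT
// the whole translated block into native code.
function recompileBlock(instrs) {
  let body = "";
  for (const op of instrs) {
    switch (op.kind) {
      case "addu": body += `regs[${op.rd}] = (regs[${op.rs}] + regs[${op.rt}]) >>> 0;\n`; break;
      case "subu": body += `regs[${op.rd}] = (regs[${op.rs}] - regs[${op.rt}]) >>> 0;\n`; break;
      default: throw new Error("unhandled op: " + op.kind);
    }
  }
  return new Function("regs", body); // a callable block of translated code
}

const block = recompileBlock([
  { kind: "addu", rd: 1, rs: 2, rt: 3 },
  { kind: "subu", rd: 4, rs: 1, rt: 3 },
]);
const regs = new Uint32Array(32);
regs[2] = 40; regs[3] = 2;
block(regs); // regs[1] = 42, regs[4] = 40
```

the payoff is that after the first pass through a hot block, the guest code runs as one compiled JS function instead of an instruction-at-a-time interpreter loop.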


I tried writing an ARM -> JS recompiler, but my initial tests showed that the slow games tended to re-copy the portions of code that were slow back into their working RAM, causing me to invalidate the recompiled code. Maybe I could get clever and detect when I don't need to invalidate the recompiled code, but I have a feeling that that would actually slow it down overall. Haven't tried this yet, though, so who can say?

What I did observe is that the games that DON'T invalidate the recompiled code tended to run ~20% slower, so I'm not really sure what I was doing wrong there. It mostly seems to be that the code my recompiler is outputting is less well optimized by the JIT, and not that the recompilation is too slow. It's also possible that my recompilation is screwing over the garbage collector, as I have a somewhat verbose intermediate representation that I created in the hopes that I could optimize the recompiled code. I don't have much experience with compilers though, and again, I never got around to this.
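
one cheap version of the "maybe I don't need to invalidate" idea is to remember the exact guest bytes each compiled block was built from, and on a write-back only invalidate if the bytes actually changed. a sketch (all names here are made up, and a real implementation would track overlapping ranges, not exact addresses):

```javascript
// addr -> { bytes: Uint8Array, compiled: Function }
const codeCache = new Map();

function onGuestWrite(addr, newBytes) {
  const entry = codeCache.get(addr);
  if (!entry) return false; // nothing compiled here, nothing to do
  const same =
    entry.bytes.length === newBytes.length &&
    entry.bytes.every((b, i) => b === newBytes[i]);
  if (!same) codeCache.delete(addr); // genuine self-modification: recompile later
  return same; // true = the compiled block survives
}
```

the byte comparison is itself a per-write cost, which is exactly the "might actually slow it down overall" worry above.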

I've been pondering pushing the branch that I did this work on, but I was hoping to wait until I got it to perform better before pushing it.


surprised all the comments here are about whether he achieved compression or not. Point = Missed!

the entire thing was an ingenious trolling exercise, as he says, to 'out trick the trickster'. his goal was to exploit the loose wording of the challenge, and prove a point (possibly winning $5k in the process); I found it an entertaining story along those lines.

it proves nothing (new) about compression, or the lack of it; he even states that the consensus at the time was 'unanimous' that no compression had taken place. that's not where the interest in this link/story lies.

thanks for the original link, OP.


reminds me of Bob Jenkins' (of hash function fame) comments on his resume [1]:

IBM (1988) ... The existing code tended to shrink when I edited it. I wrote a total of minus 5000 lines of code that summer.

Oracle (1989-2006) Oracle's code (C and PL/SQL) is very good. It usually didn't shrink when I edited it.

[1] http://burtleburtle.net/bob/other/resume3.html


I dunno about hurt blah blah blah, but could you explain how exactly the 'modern API choices' provide 'GPU-raytracing almost out-of-the-box'? if you search the OpenGL/DirectX/WebGL docs for ray tracing, you will find no mention of it, and for good reason: they don't provide it 'out of the box'.

The ray tracing going on in this demo has nothing to do with the APIs he is using; indeed, that demo could be implemented, pixel perfect, on any platform with floating point arithmetic. it just uses the 'advanced api' to display the resulting image on the screen...

now, the point you make is still a valid one, in that we have more APIs than before, but this demo is not an example of using them. Better 'grey area 4k' examples that would support your point are the ones that, for example, loaded the General MIDI file that ships with Windows and used it as a source of samples (that was a trend a few years ago), or the ones that used D3DX's truetype font->extruded mesh routine to create almost all their geometry out of extruded Arial letters. In both cases these are such ingenious hacks that, despite being in the 'grey area' and definitely (ab)using data available in the host OS, they represent such crazed creativity you can't help but admire them.

either way, such tricks were not used here; you're just seeing the result of a very large number of multiply-add instructions, artfully composed; and that's nothing to do with 'demo producers making more and more API choices'


I understand why this always comes up, but in this particular case it's particularly 'un-useful'. I can tell you exactly what API calls he makes: on boot he makes API calls the equivalent of PlayGigantic3MinuteWav(), InitOpenGL(), Compile2Shaders(), and then each frame he calls DrawFullScreenQuad() (twice, once to an offscreen texture, once to the screen), and Flip().

All the magic of the demo is in the synth (pure x86 fpu code) for the music, and the sphere tracer / camera choices / colors / fractal equation, for which precisely 0 'API' is used.

Put another way, the really amazing thing these days is that we have hardware (CPU, GPU) capable of so many FLOPS that you can do brute-force things like sphere tracing complex fractals. the goal of the demoscene, seen in this light, is to expose the wonder of all that power in visually interesting or arty ways.
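
for the curious, sphere tracing itself fits in a dozen lines. here it is with a plain sphere as the distance field instead of the demo's fractal (and in JS rather than a shader, purely for illustration):

```javascript
// Signed distance from point p to a sphere of given radius at the origin.
function sdSphere(p, radius) {
  return Math.hypot(p[0], p[1], p[2]) - radius;
}

// March the ray forward by the distance-field value each step. Because the
// SDF is a lower bound on the distance to the nearest surface, a step of
// size d can never overshoot it.
function sphereTrace(origin, dir, maxSteps = 64, epsilon = 1e-4) {
  let t = 0;
  for (let i = 0; i < maxSteps; i++) {
    const p = [origin[0] + dir[0] * t, origin[1] + dir[1] * t, origin[2] + dir[2] * t];
    const d = sdSphere(p, 1.0); // swap in a fractal distance estimator for the demo's look
    if (d < epsilon) return t;  // hit: distance along the ray
    t += d;                     // safe step
  }
  return -1; // miss
}
```

run that once per pixel with a camera ray, and you have the skeleton of the demo's shader: all the visual richness comes from the distance function, not the API.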

Seriously, you couldn't have picked a worse example of someone 'leaning on an api'. :)


it would be interesting to use a colour gradient (say from yellow to red) to indicate how long something survived before it got deleted. so you could see immediately the different kinds of mistake: transitory ones, or stuff that took you a while to realise.
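
a hypothetical sketch of that gradient (the 10-minute cap and the yellow-means-transitory choice are both arbitrary):

```javascript
// Map how long a deleted piece of text survived onto a yellow -> red gradient:
// yellow for mistakes caught instantly, red for ones that lingered.
function survivalColour(lifetimeSecs, maxSecs = 600) {
  const t = Math.min(lifetimeSecs / maxSecs, 1); // 0 = instant delete, 1 = long-lived
  const green = Math.round(255 * (1 - t));       // (255,255,0) fades to (255,0,0)
  return `rgb(255, ${green}, 0)`;
}
```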


done! paste this [1] into Shadertoy (and I de-lurked on HN after 3 years to do this; who knew). nice effect, but it pains me that a multicore CPU implementation can be SO SLOW. modern PCs are fast, you know? not just the GPU... oh well.

[1] https://gist.github.com/f448ba84e94c61ab5924


Thank you! While modern CPUs are definitely fast, they are not as fast as GPUs for code like this. Dynamic & realtime FTW.


<ramble> true, true! and I apologise for sounding whingey before, I do not mean to rag on you or the OP (I know nothing about what is good/idiomatic haskell and how that relates to efficient haskell). but it still feels damn slow, multiple seconds to make that image!

to put money where my (gut's?) mouth is, the dumb transliteration of my webgl shader to C++, compiled by MSVC in release mode on my win32 machine, takes 100ms to compute a frame at 800x600, on a single core, with precisely no tuning or effort.

with #pragma omp magic, equivalent in pain to the OP's point about almost-free-parallelisation in ghc, I imagine that would drop to around 20ms on 8 cores. and if I used an SSE vector class, probably another 2x, but that could legitimately be disallowed as overly complex.

my point being, you're right, GPUs stomp over CPUs for this kind of work! but my gut told me that this image should not take long for 'even' a CPU to produce; 10 or 20ms without effort, sub millisecond with effort (bytes rather than floats, asm, etc)

maybe I'm just lamenting the abuse of our modern CPUs, which are fantastically fast machines, even for stuff that they are not designed to excel at, like this. </ramble>


An optimized Haskell version running in real time, by Ben Lippmeier, http://www.youtube.com/watch?v=v_0Yyl19fiI


"Embarrassingly parallel floating point operations" "for the win". The things that GPUs are better at than CPUs are very poorly modeled by the words "dynamic" or "realtime". They are happy to do long-term batch computations (and getting happier about it), and there are plenty of dynamic real-time things they are bad at, because they involve lots of branching.

