Hacker News

Is it possible to do audio processing using a GPU?

I don't have enough working knowledge or mathematics to fully understand the 3D pipeline, but as I understand it, the GPU is essentially a huge linear algebra/matrix engine, and audio processing would seem to fall into that category. My reason for asking is that, at least in the guitar world, I've seen issues related to CPU load when multiple VST plugins are wired together. Some guitarists use fairly complicated rigs to get their tone, and when you have multiple amps, speaker cabinet simulation, and effects running, you need a pretty powerful machine to do everything in real time without under-running buffers.

There are specific hardware solutions for this, but I've always wondered if you could just write shaders or something, dump the audio stream to the GPU, and get back the transformed audio stream with minimal CPU burden. Any insight here would be very helpful to me.



The last time I checked, the problem with GPUs and realtime audio is that although the GPU is extremely powerful and the bandwidth is fairly large, the round-trip latency to and from the CPU is too high for what we typically consider "low latency audio".

Put differently, the GPU is plenty powerful to do the processing you're thinking of, but it can't get the results back to the CPU in the required time.
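To put rough numbers on "the required time": the entire round trip has to fit inside the time one audio buffer takes to play out. A quick back-of-the-envelope sketch (buffer sizes and the 48 kHz rate are typical values I'm assuming, not figures from this thread):

```python
# Per-block deadline for realtime audio: the whole round trip
# (CPU -> GPU -> compute -> GPU -> CPU -> audio interface) must
# finish before the current buffer runs dry.
def block_deadline_ms(buffer_frames, sample_rate_hz):
    return 1000.0 * buffer_frames / sample_rate_hz

for frames in (64, 128, 256):
    print(frames, round(block_deadline_ms(frames, 48_000), 2), "ms")
# 64 frames at 48 kHz leaves only ~1.33 ms for everything.
```

At small buffer sizes the deadline is on the order of a millisecond, which is why even a modest fixed transfer cost matters so much.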

For video, there is no roundtrip: we send a "program" (and maybe some data) from the CPU to the GPU; the GPU renders the result directly into the video framebuffer.

For audio, the GPU has no access to the audio buffer used by the audio interface, so the data has to come back from the GPU before it can be delivered to the audio interface.

I would be happy to hear that this has changed.

None of this prevents incredibly powerful offline processing of audio on a GPU, or using the GPU in scenarios where low latency doesn't matter. Your friends with guitars and pedal boards, however, are not in one of those scenarios.


I think the latency overhead can be low enough these days. The per-block cost is: launch the kernel, copy data from CPU to GPU, read from global memory, do some compute, write to global memory, copy the data back to the CPU, and wait for completion on the CPU side (which might copy the data into some other buffer), or use a non-blocking DMA write instead. That's on the order of 10-100 us over PCIe. But there is a tradeoff between the unit of work you give the GPU, the efficiency of the compute (and the working-set size you need to load from global memory to do it), and the number of individual kernel launches needed to produce small pieces of output. There are also tricks involving atomics (or, in CUDA specifically, cooperative thread groups) that allow for persistent kernels which are always producing data and periodically ingesting commands from the CPU, so the CPU doesn't constantly need to tell the GPU what to do.
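The tradeoff described above can be made concrete: the fixed launch/copy overhead is the same regardless of block size, so it eats a large fraction of a small block's realtime budget and a tiny fraction of a big one's. A rough model, using the parent's 10-100 us overhead figure (everything else here is illustrative):

```python
def overhead_fraction(buffer_frames, sample_rate_hz, roundtrip_us):
    """Fraction of the per-block realtime budget consumed by fixed
    launch/copy overhead, before any actual DSP work is done."""
    budget_us = 1e6 * buffer_frames / sample_rate_hz
    return roundtrip_us / budget_us

# 32-frame blocks at 48 kHz with 100 us of fixed overhead:
# 15% of the budget is gone before computing anything.
print(f"{overhead_fraction(32, 48_000, 100):.0%}")
# The same overhead on 512-frame blocks is under 1% -- but larger
# blocks mean more latency, which is the whole tension here.
```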



Both those links describe audio processing on a GPU. Neither of them addresses (from a quick skim) the round-trip issue that occurs when doing low-latency realtime audio.


Neither application is designed to run while the GPU is maxed out with graphics rendering, is it?


For the unpredictable part of the input, you can't have the best latency and throughput at the same time. If you want realtime in a variable load scenario, you have to cap the GPU usage.

That's how gamers do it when they want the lowest possible latency, anyway: something like "find the lowest frame rate your game runs at, and cap it to 80% of that".
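That "80% of the worst case" rule is just arithmetic, but sketching it makes the intent clear: leave the GPU guaranteed idle time every frame so a latency-sensitive task always has headroom (the numbers below are illustrative):

```python
def fps_cap(lowest_observed_fps, headroom=0.8):
    """Cap the frame rate below the worst-case observed rate so the
    GPU always has idle time left over (the '80%' rule above)."""
    return lowest_observed_fps * headroom

# If the game never dips below 90 fps, cap it at 72 fps.
print(fps_cap(90))
```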


Thank-you! That explains a lot!


We have HDMI audio; is that a start? I assume that means the video card is registering as a sound card as well. No idea if any processing happens on the GPU, though...


100% unrelated. HDMI is an output from your video interface.


But that means the audio stream is at least passing through the video card, no? Is it only passing through post-mix?


The GPU is not involved in computing the audio stream. The audio side of HDMI appears to the OS as an audio device, and the CPU generates and delivers the audio data to it. It's essentially a passthrough to the HDMI endpoint.


I would like to have the ability to use a GPU buffer for HDMI audio output. It would enable a couple of interesting applications.


> Is it possible to do audio processing using a GPU?

This thread fascinates me. A sibling comment addresses why it can't be done in realtime (barring some pedestrian-sounding hardware improvements), but the question of "how" fascinates me. I know a little bit about audio, I'm a classical musician with some experience with big halls, I know some physics, and I wrote a raytracer in JS back in the early aughts. Can we raytrace sound? It's a wave, it's got a spectrum, different materials reflect, refract, and absorb various parts of the spectrum differently... sounds pretty similar! The main differences I see are that we usually only have a handful of "single pixel cameras" (I expect to lose a bit of parallelism to that) and, more problematically, that we can't treat the speed of sound as infinite. I can imagine a cool hack to preprocess rays for a static scene, but my imagined approach breaks down entirely if the mic moves erratically.
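The finite speed of sound is exactly why each traced path contributes a delay as well as an attenuation: unlike light, the arrival time of every reflection is audible. A minimal sketch of that bookkeeping (343 m/s is the standard figure for dry air at roughly 20 degrees C; the path length is illustrative):

```python
SPEED_OF_SOUND_M_S = 343.0  # dry air, ~20 degrees C

def path_delay_samples(path_length_m, sample_rate_hz=48_000):
    """Delay, in audio samples, contributed by one traced ray path.
    A 34.3 m reflection path arrives 0.1 s after emission, i.e.
    4800 samples late at 48 kHz -- clearly not negligible."""
    return path_length_m / SPEED_OF_SOUND_M_S * sample_rate_hz

print(round(path_delay_samples(34.3)))  # 4800
```

A sound renderer therefore accumulates each path's contribution into a delayed, attenuated tap rather than summing everything at one instant the way a light raytracer can.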


We can already do this (without GPUs, and quite likely on them too). It is done, for example, by a certain class of reverb algorithm that more or less literally "raytraces" the sound in a room. It's not the most efficient way to implement a reverb (convolution algorithms are better for that), but it does allow a large degree of control that convolution does not.
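For the convolution approach mentioned above, the whole reverb reduces to convolving the dry signal with an impulse response (measured in a real room, or produced by a ray tracer). A minimal numpy sketch; the impulse response here is just synthetic exponentially decaying noise for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
sr = 48_000

# Fake impulse response: exponentially decaying noise, ~0.5 s tail.
t = np.arange(sr // 2) / sr
ir = rng.standard_normal(len(t)) * np.exp(-6.0 * t)

# Dry signal: a single click (unit impulse), so the wet output is
# exactly the impulse response itself -- easy to sanity-check.
dry = np.zeros(sr // 4)
dry[0] = 1.0

# Convolution reverb: wet = dry convolved with the impulse response.
wet = np.convolve(dry, ir)
print(wet.shape)  # len(dry) + len(ir) - 1 samples
```

Real implementations use FFT-based (and partitioned) convolution rather than direct `np.convolve`, since impulse responses of several seconds would be far too slow to convolve directly in realtime.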

In addition, you may also consider physical modelling synthesis to be an example of this. Look up Pianoteq. It is a physically modelled piano that literally does all the math for the sound of a hammer of a particular hardness striking a string of a given size and tension at a particular speed, then that sound generating resonances within the body of the piano and the other strings, and then travelling out into the room and interacting with microphone placement, lid opening, etc. etc.

Most (not all) people agree that Pianoteq is the best available synthetic piano. It's all math, no samples.

There are similar physical modelling synth engines for many other kinds of sounds, notably drums. They are not particularly popular because they use more CPU than samples and can be complicated to tweak. They also really benefit from a more sophisticated control surface/instrument than most people have access to (i.e. playing a physically modelled drum from a MIDI keyboard doesn't really let you get into the nuances of the drum).
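The textbook toy version of physical modelling is the Karplus-Strong plucked string: a delay line filled with noise, fed back through a gentle two-sample average. To be clear, this is nothing like Pianoteq's model; it's just the simplest self-contained illustration of "all math, no samples":

```python
import random
from collections import deque

def karplus_strong(freq_hz, seconds, sample_rate=48_000, seed=0):
    """Karplus-Strong plucked string: a noise burst circulating in a
    delay line, with a damped two-sample average acting as a lossy
    low-pass filter in the feedback loop."""
    rng = random.Random(seed)
    n = int(sample_rate / freq_hz)                    # delay length sets pitch
    buf = deque(rng.uniform(-1, 1) for _ in range(n)) # excitation: white noise
    out = []
    for _ in range(int(seconds * sample_rate)):
        out.append(buf[0])
        # Feedback: slightly damped average of the two oldest samples.
        buf.append(0.996 * 0.5 * (buf[0] + buf[1]))
        buf.popleft()
    return out

samples = karplus_strong(220.0, 0.5)  # roughly A3 for half a second
```

The initial noise burst decays into a pitched, string-like tone as the low-pass loop repeatedly smooths it, which is the whole trick.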


Yes, sound can be raytraced, in realtime on current GPUs even. Using a ray model for sound waves is called geometric acoustics and works reasonably well in most cases, although it has a few shortcomings, mostly when dealing with low frequencies (room modes, diffraction).

The Oculus Audio SDK contains an acoustic ray tracer for that. See https://www.oculus.com/blog/simulating-dynamic-soundscapes-a... for a short writeup on what is actually in there under the hood.

Disclaimer: I've worked on that ray tracing engine at FRL for a while.


I think it's too pessimistic to say it can't be done in real time on current GPUs. I think it should be possible to do reliable low latency readback with modern explicit graphics APIs. The harder problem is scheduling the audio work at high priority so it always finishes before the deadline, but this too has improved recently because VR compositors also have strict deadlines they must meet.


I remember hearing that Nvidia was working on an audio spatialiser using their RTX hardware/expertise, but I can't recall hearing about it after 2018 or so, which is when the project I wanted audio spatialisation for ended.


There are a number of clever audio processing algorithms implemented on GPUs already. But that's not the issue at hand, which is doing so in a low-latency realtime context. GPUs are insanely powerful, and if you're just doing playback of material from disk, the tiny bit of extra latency they may cause isn't a problem, so go for whatever they can do. But if you're using the computer for synthesis or FX processing, you need better than GPUs can offer in terms of latency (for now).




