Role of GPUs in audio production software

I know of a TV show music composer who needs memory to load ALL THE INSTRUMENTS (:grin:) and only enough graphics power to drive four 4K monitors. His current Intel Mac Pro’s PCIe slots are all loaded up, but he’s got no reason to move to Apple Silicon because the Intel Mac Pro just works, and even if he did move over, he’d have less RAM, which would make loading sound libraries more difficult.

Following up on the question of hardware performance for audio vs. video work, can anyone enlighten me on the role of GPUs in audio production software? To be more specific, I’m using Logic Pro and considering buying either an M2 Mini or M2 Studio to replace my creaky cMP 5,1 (running Big Sur via OpenCore Legacy Patcher). Most of what I’ve read indicates that for audio work I should focus on CPU cores and not worry at all about how many GPU cores a machine has (GPU core count seems to be the primary differentiator among the M2/Pro/Max processors).

Is this still a safe assumption? I vaguely remember a time when there was a lot of talk of GPUs taking over general computation duties (was that the OpenCL era? I’m not sure), but I haven’t read anything to that effect in a long time – until just recently, when I read about gpu.audio, which sounds like they’re doing exactly that, at least with respect to audio plug-ins and the like.

Is it also safe to assume that I should get as much RAM as I can afford? I’m not loading “ALL THE INSTRUMENTS :grin:” for movie soundtracks, just fooling around in a home hobby studio. Most of the stuff I’ve been doing lately runs okay on my single-processor cMP with 32GB RAM and just fine on my M1 14" MBP with the base processor and only 16GB RAM, but I want to be as future-proof as possible and am wondering how to allocate my budget among CPU, GPU, and RAM.

Can’t say myself; it might be a good question for Apple’s Logic Pro folks (although other audio developers might have a more objective view… ;-) ).

I’m pretty sure my M1 Ultra has all I need for my audio use, mostly Notion and iReal Pro (I have Studio One, but haven’t gotten it set up yet). Going from a 2013 Mac Pro to an M1 Ultra Studio, I still haven’t hit anything that seems to remotely challenge this thing.

I can’t give you specific answers, but here are some general comments that you might find relevant:

GPUs (and the Neural Engine, Apple Silicon’s NPU) are used for more than just graphics and ML processing. They are very powerful coprocessors for any application that requires massively parallel floating-point math.

Within audio, although they may not help much for basic operations (e.g. boost/cut volume, mixing, etc.), I would expect them to be used for operations that require massive amounts of floating-point math, like Fourier transforms, which are at the core of anything operating in the frequency domain of your audio (e.g. equalization, spatial audio). I would expect some kinds of operations to use the NPU as well.
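To make that concrete, here’s a minimal sketch in Python with NumPy (purely illustrative, not how Logic Pro or any real plug-in implements it) of a frequency-domain operation: a crude EQ boost done by transforming to the frequency domain, scaling a band of bins, and transforming back. The function name and parameters are my own invention.

```python
import numpy as np

def boost_band(samples, sample_rate, lo_hz, hi_hz, gain_db):
    """Apply a crude brick-wall EQ boost between lo_hz and hi_hz."""
    spectrum = np.fft.rfft(samples)                  # time -> frequency domain
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    spectrum[band] *= 10 ** (gain_db / 20)           # dB -> linear gain
    return np.fft.irfft(spectrum, n=len(samples))    # back to time domain

# One second of a 440 Hz tone at CD sample rate, boosted by 6 dB.
sr = 44_100
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
boosted = boost_band(tone, sr, 300, 600, 6.0)
```

The FFT and the per-bin multiply are exactly the kind of large, regular floating-point workload that a GPU (or the Neural Engine) is built to chew through in parallel.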

Whether Logic Pro or some other app actually does this, and whether the operations you need to perform would take advantage of it, are questions I can’t answer.

WRT RAM, I would make that assumption. If you have enough RAM to hold your entire original audio clip, the latest version, and temporary working storage, then the app doesn’t need to touch the file system as you work. Working entirely in memory is significantly faster than keeping some of your project on the file system, but the actual amount of RAM needed will depend heavily on the size of your projects.

For reference, CD-quality audio is 44,100 samples per second, times 2 channels (stereo), times 2 bytes per sample (16-bit). That’s a data rate of 176,400 bytes per second, about 10.5MB per minute, or about 635MB per hour. So you’d want at least 1.5GB to hold a one-hour project in memory (plus all the RAM you need for the OS, the app, and whatever else you’re running at the time).

If you’re working, for example, on 16-channel audio (16 mono tracks or 8 stereo tracks) at 96 kHz with 24-bit samples - common parameters for pro audio recording - then you’re looking at 96,000 * 3 * 16 = 4.6MB per second, 276MB per minute, or 16.6GB per hour. A one-hour project at this rate will require more than 16GB just to hold all the audio, and will probably want at least 32GB to actively work on it.
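If it’s helpful, here’s the same arithmetic as a small Python sketch you can adapt to your own projects. The byte counts are exact; the 2x working-set multiplier is just my assumption for holding an original plus an edited copy.

```python
def audio_bytes(sample_rate, channels, bytes_per_sample, seconds):
    """Raw size of uncompressed PCM audio."""
    return sample_rate * channels * bytes_per_sample * seconds

cd_hour = audio_bytes(44_100, 2, 2, 3600)    # 16-bit stereo, 1 hour
pro_hour = audio_bytes(96_000, 16, 3, 3600)  # 24-bit, 16 channels, 1 hour

GB = 1_000_000_000
print(f"CD-quality hour:    {cd_hour / GB:.2f} GB")   # ~0.64 GB
print(f"Pro 16-track hour:  {pro_hour / GB:.1f} GB")  # ~16.6 GB
print(f"Working set (2x):   {pro_hour * 2 / GB:.0f} GB")  # ~33 GB
```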

If you’ve been getting satisfactory results with 16GB of RAM for your projects, then that’s probably all you need. But if you think you’re going to work on bigger projects in the future, you might want more RAM. I’d do the above math based on the sample rate, number of channels, bit depth, and clip length of the projects you work on.

If you find that you don’t have (or can’t afford) that much RAM, don’t worry about it too much. Any decent audio editor will swap audio to/from storage as you work. It won’t perform as well as if it can all fit in RAM, but it should still get the job done.


I don’t have an Intel Mac handy to make a comparison, but Activity Monitor in macOS Ventura on an Apple Silicon Mac has “% GPU” and “GPU Time” columns; you might try using your audio software and various plug-ins and see what activity shows up there.

For CPU use, there are some operations performed on audio that are not practically parallelizable, so all that matters is the speed of operations in a single thread on a single core. Newer generations of chips can perform operations faster than older generations, and within a generation there can be variations in clock speed, though for Apple Silicon the clock speeds haven’t changed much. All the M1s have the same base clock speed, 3.2GHz; the M2 and M2 Pro are 3.49GHz, and the M2 Max is 3.68GHz.
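As an illustration of why some operations resist parallelism, here’s a small sketch (Python, purely illustrative; the filter is a textbook example, not anything a specific DAW does) of a recursive one-pole low-pass filter. Each output sample depends on the previous output sample, so the loop is inherently sequential and only single-core speed helps.

```python
def one_pole_lowpass(samples, alpha=0.1):
    """Simple one-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out = []
    y = 0.0
    for x in samples:
        y = y + alpha * (x - y)  # y[n] needs y[n-1] -- the loop can't be split
        out.append(y)
    return out
```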

Googling, I just learned about the Multithreading setting in Logic Pro for Mac. When using live input or recording multiple audio tracks, all Digital Signal Processing (DSP) on the input is done on a single thread/core unless this setting is changed. This was added in Logic Pro 10.2.1, in 2016, and the support article makes it sound like multithreading inputs is not always the better choice.


This doesn’t surprise me, but I would expect that when working on a multi-track project, the software should be able to process each track (or for stereo tracks, each pair of channels) in parallel.

For example, if there’s an operation that can’t be processed in parallel and my project has only one track (or two channels that form a stereo track), then I may be out of luck. But if my project has 12 tracks and I want to apply that operation to the whole project, I would expect to be able to run all 12 in parallel (up to the limit of my CPU cores, of course), since the output generated for one track should have nothing to do with the output generated for the next track (until it’s time to mix everything down to stereo, of course, but that would be a separate operation).
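Here’s a sketch of that per-track pattern (Python, with a placeholder effect standing in for real DSP; `apply_effect` and the dummy tracks are my own illustration, not how any particular DAW is implemented): each track is an independent task, processed on up to as many cores as are available, and only the final mixdown combines them.

```python
from concurrent.futures import ProcessPoolExecutor

def apply_effect(track):
    """Placeholder per-track DSP; each call touches only its own track."""
    return [sample * 0.5 for sample in track]  # e.g. a -6 dB gain cut

if __name__ == "__main__":
    tracks = [[float(i)] * 1000 for i in range(12)]  # 12 dummy mono tracks
    with ProcessPoolExecutor() as pool:              # defaults to CPU core count
        processed = list(pool.map(apply_effect, tracks))
    # Tracks are independent until the final mixdown, which is a
    # separate (serial) summing step.
    mixdown = [sum(samples) for samples in zip(*processed)]
```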

Maybe Logic doesn’t do this, but I would expect high-end software like Pro Tools to use every trick in the book to maximize throughput.

“the output generated for one track should have nothing to do with the output generated for the next track”

I’m going to wave my hands here because it’s been a long time since I’ve looked at the details of musical acoustics, but I have vague memories that there are circumstances where that’s not true, even disregarding the travesty of recording instruments separately in the first place. Acoustics is plain weird in many ways, as you might expect when one of the reasons live music sounds better than a recording is that in a live performance there are a lot of heads and shoulders of players and audience members slightly moving around. (Arthur Benade, “Fundamentals of Musical Acoustics”, in the room acoustics section.) Whether modern music generation and recording ever bothers with such subtleties is another matter.


There are completely different kinds of recordings that have different requirements.

Making a stereo (or surround) recording of a live concert is one thing. For such a recording, all of your tracks contain different “views” of a single performance. And yes, applying effects to one without the others will probably not have the desired result. You’ll want effects processors designed to work on all the tracks that belong to that single multi-channel source.

Laying down tracks (possibly at completely different times and from different locations) for later mixing and post-processing is quite different. Although there are some effects you’ll want to apply after the mix (to stereo or surround), which must act on those (post-mix) tracks together, there are also going to be plenty that you will apply to individual (pre-mix) tracks, and those are going to be independent, because those tracks don’t actually share any sound.

I’m not going to voice an opinion about what is “better” because that very term has no meaning outside of how close the result comes to what the composer/performer has in mind, and that’s nothing but opinion.
