
DEVELOPER RESOURCE: Unity Audio tutorial. Setting up "Sends" and "Duck" FX across groups.

Writer: Edward Ray

In this short tutorial, I'll walk you through setting up sends and ducking in Unity. I've met a surprising number of developers who know little about the basics of audio integration, yet, for whatever rea$on (ahem), they also don’t want to hand off the implementation to someone else. Fair enough! That’s where I come in, doing my best to guide them through the process.

Without fail, though, the same questions come up. A classic: "How many dB should I set XYZ to?" It’s not that simple. These things require hands-on testing and experimentation. Audio isn’t a "set it and forget it" kind of deal: you can’t just twiddle some knobs, throw in some levels and call it a casserole.

One of the easiest ways to get your game audio working properly is through ducking. In simple terms, ducking is when the volume of one audio source controls the volume of another. A practical example: a weapon fires, or a "reward" sound effect plays and the music momentarily lowers in volume. This prevents a chaotic mess of sound and instead creates a dynamic, responsive soundscape that breathes with the gameplay.

Here’s how to set it up.

1. Place Send FX instances across the desired groups. These are the tracks that will be "turning down" the volume on a receiving track.

2. We then place a Duck Volume FX instance on the track we want to dynamically reduce the volume of. In the example below, it is called "OST".

3. We then route the two Send sources (Gunfire and Melodic SFX groups) to the Duck Volume "receive" on the OST group. We do this by right-clicking on the Send instance, then selecting the destination.


4. Same again for Melodic SFX.

To clarify one thing before we continue: what cannot be determined through an instructional guide is how to set any of these parameters. You need to test this in the project file itself. For now, it’s important to realise that the SEND LEVEL on these two groups (Gunfire & Melodic SFX) is what determines how the receiving DUCK FX instance behaves.

5. We can control how much the music is ducked by adjusting the "Send Level" within the "Send" FX instances on the respective groups. This in turn determines how much the OST group will be ducked by the track with the "Send" FX instance on it. So if you want, for instance, "gunfire" to have a greater impact on the volume reduction on OST, you would increase its Send level.

If you're just trying to have the level of the receiving track turn down dynamically without doing anything else, all you need to do then is adjust the Threshold and Makeup Gain on the receiving track.

1. Set the send level on the SENDING TRACK, e.g. Gunfire, as above.

2. On the RECEIVING TRACK: set the Threshold as appropriate (determine this through testing) and then set Makeup Gain to a negative value. The two parameters are outlined below:

If you want to learn more about the various parameters (and I strongly suggest you do so), continue reading below:



DUCKING FX PARAMETERS "What does it all mean?"


In this section, I'll provide more context on the various parameters available in Unity's Ducking system, which is actually very similar to an audio compressor. Many of its parameters are set up identically to a compressor, though if there are any differences, I will point these out.

Threshold: This determines how high the AFFECTING track's level needs to be before Ducking starts to be activated on the receiving track, i.e. how "loud" would Gunfire need to be before it starts to make the music turn down?

In practice, in the image above, we are determining how high we would need to set the Send level on Gunfire before Ducking "starts to work". The Threshold and Send Level are directly related to one another.

Send level on Gunfire.
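The link between Send level and Threshold can be sketched as simple dB arithmetic. This is only an illustration, not Unity's actual implementation: it assumes the level arriving at the Duck Volume effect is the source group's level plus the Send level (both in dB). The function name and the example levels are hypothetical.

```python
def overshoot_db(source_level_db: float, send_level_db: float, threshold_db: float) -> float:
    """How far the sent signal sits above the Threshold, in dB (0.0 if it is below)."""
    # Assumption: a Send level of -10 dB simply attenuates the source by 10 dB.
    sidechain_level_db = source_level_db + send_level_db
    return max(0.0, sidechain_level_db - threshold_db)


# Hypothetical gunfire peaking at -6 dB, with the Duck Volume Threshold at -30 dB.
print(overshoot_db(-6.0, -40.0, -30.0))  # Send too low: nothing crosses the Threshold
print(overshoot_db(-6.0, -10.0, -30.0))  # Send raised: the signal is now over the Threshold
```

Raising the Send level or lowering the Threshold both increase the overshoot, which is why the two settings have to be tuned together.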

One thing of note: The threshold is not necessarily a hard cutoff. In other words, it is not (always) the case that volume is ducked to a fixed level the moment your Send level exceeds the Threshold. You have some flexibility in how much the volume is reduced once the threshold is passed, depending on "how far over the line you step" (i.e. how much higher your Send level is vs. your Threshold). All of this to say: setting the Threshold below the Send level can have a gradual effect or a sudden one, depending on the "Knee" setting. The Knee is the "steepness" of the "gain reduction curve". More detail on this at the end of the article.

Ratio: Essentially, this is the "amount of reduction" being applied to the signal once our Threshold is passed. The Ratio determines: if we put in a certain amount, how much will we get back out? This chart illustrates the concept well.

If we exceed the Threshold by 10 dB with a ratio of 2:1, our signal will be reduced to 5 dB above the threshold. In practice, this is a fairly subtle effect. I tend to use a setting of 1.5:1 or 2:1 when trying to make elements "glue" together in a music mix. If we exceed the Threshold by 10 dB with a ratio setting of 10:1, our signal will be reduced to 1 dB above the threshold. This has a very noticeable effect.
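The ratio arithmetic above can be checked with a few lines of Python. This is a minimal sketch of the standard hard-knee compressor curve, not code taken from Unity. The function names are made up for illustration.

```python
def level_above_threshold_db(overshoot_db: float, ratio: float) -> float:
    """Level above the threshold after compression, for a hard-knee curve."""
    if overshoot_db <= 0.0:
        return overshoot_db  # below the threshold: the signal is left alone
    return overshoot_db / ratio


def gain_reduction_db(overshoot_db: float, ratio: float) -> float:
    """How many dB were taken away from the signal."""
    return overshoot_db - level_above_threshold_db(overshoot_db, ratio)


for ratio in (2.0, 10.0):
    above = level_above_threshold_db(10.0, ratio)
    removed = gain_reduction_db(10.0, ratio)
    print(f"{ratio:g}:1 -> {above:g} dB above the threshold ({removed:g} dB removed)")
```

At 2:1 half of the overshoot survives; at 10:1 only a tenth does, which is why the higher ratio is so much more audible.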


Context in practice: You will likely need higher ratios when ducking music so that it remains audible under, for instance, gunfire. This prevents only the loudest parts of the mix from "poking out". An example would be snare drum transients being the only thing audible under a barrage of gunfire.

However, one thing to keep in mind is that unlike a typical compressor, where ratio values are expressed as 1:1, 2:1, 3:1, 4:1, etc., Unity expresses these as a percentage: 100% = 1:1, 200% = 2:1, 300% = 3:1, 400% = 4:1, etc. You will most likely be experimenting with values of 700% or 800% initially; these equate to compression ratios of 7:1 and 8:1 respectively.

Attack time: This is how quickly the Ducking effect starts to "clamp down" on our signal. In other words, how soon after the Send level exceeds the Threshold would we like the Ducking to happen? Start your experiments with this set as fast as possible. A longer attack time in this example would result in a delay between a gunshot and the music going lower. It may well be that a slight delay "feels" more natural: there will be an attack time that just "feels right". How does Attack Time affect an audio signal? A visualisation:

Compression applied to an audio file, in Reaper. In this instance of compression, I have left the Attack Time long enough so that we still hear the original signal's transient (i.e. the compression takes long enough to "kick in" such that we hear the original signal's "attack").

Release time: After we've finished exceeding the Threshold, how quickly does the signal return to baseline? This is a bit harder to represent visually, though if you think of Attack as "How long it takes to grab the signal" then Release is exactly what it sounds like: "How long before the duck / compressor lets go of the signal".
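Since Release is harder to picture, here is a rough sketch of how Attack and Release shape the gain reduction over time. It uses a generic one-pole smoother, which is a common textbook approach and not necessarily what Unity does internally. Treating the Attack and Release times as exponential time constants is an assumption, and the function name is hypothetical.

```python
import math


def smooth_gain_reduction(targets_db, attack_ms, release_ms, step_ms=1.0):
    """Smooth a series of target gain reductions (dB, larger = more ducking)."""
    # Assumption: attack_ms and release_ms are time constants greater than zero.
    attack_coeff = math.exp(-step_ms / attack_ms)
    release_coeff = math.exp(-step_ms / release_ms)
    current = 0.0
    smoothed = []
    for target in targets_db:
        # Ducking deeper follows the Attack time; letting go follows the Release time.
        coeff = attack_coeff if target > current else release_coeff
        current = coeff * current + (1.0 - coeff) * target
        smoothed.append(current)
    return smoothed


# A hypothetical gunshot asks for 9 dB of ducking for 50 ms, then stops.
targets = [9.0] * 50 + [0.0] * 200
fast = smooth_gain_reduction(targets, attack_ms=5.0, release_ms=100.0)
slow = smooth_gain_reduction(targets, attack_ms=50.0, release_ms=100.0)
print(f"10 ms in: fast attack {fast[9]:.1f} dB, slow attack {slow[9]:.1f} dB")
print(f"100 ms after the shot ends: {fast[149]:.1f} dB of ducking still applied")
```

With the slower attack the music has barely moved 10 ms after the shot, and with a 100 ms release some ducking is still in place well after the shot has ended.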


Make-up Gain: Compressors are used to "even out" a signal. They do this by taking the "loudest parts" of a signal and squashing them down, so that these are more similar in level to the "quietest parts" of a signal. This ultimately enables us to make everything louder overall. The idea is that having squashed the overall signal down, we can then increase the output of this squashed signal, making everything uniformly louder. I have demonstrated this below with some screenshots from Reaper:

We start with this signal. It's a drum kit recorded onto a stereo track. Notice how "uneven" it is.

After applying compression, we end up with the below:

The signal is now compressed. We have made the loudest parts quieter, so it's quieter overall, because they have been squashed to be as quiet as the snares.

Notice how much smaller the waveform is overall: It is lower in amplitude: It is quieter. In fact, it's so quiet, it's almost difficult to see... So hang on, we were meant to be making it louder, right? Correct - that's where "makeup gain" comes into play. If I adjust my makeup gain on the compressor, we end up with this:

We made the loudest parts quieter to "better match" the quiet parts, which gave us the headroom to make everything louder as a whole. We still have that very prominent "transient" due to the Attack settings.
If I do this even more dramatically (compress more heavily), notice how the waveform flattens out. Disclaimer: This is a very heavy amount of compression. In practice, you'd rarely compress something this dramatically.
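The same "squash, then turn up" idea can be shown with numbers instead of waveforms. The sketch below is illustrative only: the peak levels, threshold and ratio are invented values, not measurements from the Reaper session, and attack and release are ignored.

```python
def compress_db(level_db: float, threshold_db: float, ratio: float) -> float:
    """Hard-knee compression of a single peak level, in dB."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


# Hypothetical drum peaks: loud kicks and quieter snares.
peaks_db = [-2.0, -14.0, -4.0, -16.0]
compressed = [compress_db(p, threshold_db=-16.0, ratio=4.0) for p in peaks_db]

# Makeup gain: bring the loudest compressed peak back up to the original peak.
makeup_db = max(peaks_db) - max(compressed)
louder = [c + makeup_db for c in compressed]

print("original   :", peaks_db)
print("compressed :", compressed)
print("with makeup:", louder)
```

The gap between the loudest and quietest hits shrinks from 14 dB to 3.5 dB, and after makeup gain the quiet hits end up much louder than they started.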

For the purpose of Ducking, you generally will not need to increase the makeup gain. After all, this is what makes the "receiving" signal louder again after "squashing" it. Unless, of course, your other settings are such that the heavily Ducked / Compressed signal sounds good, but you need to bring the level back up a touch. This is also a perfectly valid approach. You need to check what sounds best!

Knee: We referenced this earlier. With hard knee compression, the gain reduction is applied immediately when the signal level exceeds the threshold. With soft knee compression, the onset of gain reduction occurs gradually. Don't worry about this too much. As you get more advanced in your use of compression / ducking, feel free to research this further if you're interested.
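For the curious, the difference between a hard and a soft knee can be sketched with the quadratic soft-knee curve commonly used in compressor design. Whether Unity's Knee parameter follows this exact curve is an assumption on my part. Here the knee width is in dB and a width of 0 means a hard knee.

```python
def knee_gain_reduction_db(overshoot_db: float, ratio: float, knee_db: float) -> float:
    """Gain reduction in dB for a level overshoot_db above (or below) the threshold."""
    slope = 1.0 - 1.0 / ratio
    half_knee = knee_db / 2.0
    if overshoot_db <= -half_knee:
        return 0.0  # well below the threshold: nothing happens
    if overshoot_db >= half_knee:
        return slope * overshoot_db  # past the knee: the full ratio applies
    # Inside the knee: reduction fades in smoothly around the threshold.
    return slope * (overshoot_db + half_knee) ** 2 / (2.0 * knee_db)


for overshoot in (-5.0, 0.0, 5.0, 10.0):
    hard = knee_gain_reduction_db(overshoot, ratio=4.0, knee_db=0.0)
    soft = knee_gain_reduction_db(overshoot, ratio=4.0, knee_db=10.0)
    print(f"{overshoot:+.0f} dB vs threshold: hard {hard:.2f} dB, soft {soft:.2f} dB")
```

With the soft knee, a little reduction already happens right at the threshold, and the two curves meet again once the signal is past the knee.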


Sidechain Mix: This allows us to blend the unaffected and affected signals. At 100%, we are listening to only the Ducked signal. At 0%, the opposite occurs. At 50%, we are hearing half of the ducked signal and half of the unaffected signal.

Why this is useful: Using "half the amount of compression" is not the same as setting the sidechain to 50%. Often you will want to experiment with a number of different variations to find the one that sounds best. In the initial stages of getting to grips with this, I recommend finding settings that work for the other parameters, leaving this setting alone.

How all this relates to Send level: The Send level is the catalyst for all of the behaviours that are enacted by tweaking the parameters above. In other words: the Send level determines the behaviour of everything else. A higher Send level will cause these parameters to have more of an impact on the Ducking. In turn, a lower Send level has the opposite effect.
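To tie the pieces together, here is a rough end-to-end sketch of Send level, Threshold, Ratio (as a percentage) and Sidechain Mix. It is an illustration built on assumptions, not Unity's code: it assumes a hard knee, ignores attack, release and makeup gain, and models Sidechain Mix as a linear blend, following the description above. All function names are hypothetical.

```python
def duck_gain_db(source_level_db, send_level_db, threshold_db, ratio_percent):
    """Gain applied to the receiving group (0 dB or negative)."""
    ratio = ratio_percent / 100.0  # 800% corresponds to 8:1
    overshoot = (source_level_db + send_level_db) - threshold_db
    if overshoot <= 0.0:
        return 0.0
    return -(overshoot - overshoot / ratio)


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


def apply_sidechain_mix(dry_sample, gain_db, mix_percent):
    """Blend the ducked (wet) and unaffected (dry) signal."""
    mix = mix_percent / 100.0
    wet_sample = dry_sample * db_to_linear(gain_db)
    return mix * wet_sample + (1.0 - mix) * dry_sample


# Hypothetical gunfire peaking at -6 dB, Threshold at -30 dB, Ratio at 800%.
for send in (-40.0, -20.0, -10.0):
    print(f"Send {send:g} dB -> music gain {duck_gain_db(-6.0, send, -30.0, 800.0):g} dB")

full = duck_gain_db(-6.0, -10.0, -30.0, 800.0)
print("50% mix of full ducking:", apply_sidechain_mix(1.0, full, 50.0))
print("100% mix of half the dB:", apply_sidechain_mix(1.0, full / 2.0, 100.0))
```

The loop shows a higher Send level producing deeper ducking. The last two lines print different values, which is the point made above: a 50% mix is not the same thing as half the amount of compression.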

Blimey - that was a lot of information. I promise you this is worth familiarising yourself with, though. If you have any questions, please be sure to let me know. If I can update this article to be even more helpful, I'm happy to do that. Good luck!

 
 
 



© 2024 by Edward Ray
 
