Solving the Question(s)
Is 48 kHz enough?
This depends, but the short answer is no. Making 48 kHz sound like what you would achieve with 96 kHz or higher requires significant audio processing overhead. If you can confidently say that every stage of your audio production pipeline does the necessary processing for 48 kHz playback, then you can set your playback frequency to 48 kHz.
Will switching to 96 kHz (or higher) fix the problems?
Yes, for the most part. The problems won’t be gone completely, but they will be reduced to the point that they no longer matter, which is especially important when recording real-world instruments and vocals. A studio performance captured at a 192 kHz sample rate will sound very different from one captured at 48 kHz.
What sample rate should I pick?
This depends on what you actually want to do:
- If you only intend to capture game audio with nothing else, then 48 kHz will be perfectly fine, as most games mix their audio to 48 kHz or even 44.1 kHz.
- For human voices, such as commentary and singing, you will want to switch to 96 kHz. This covers the majority of frequencies that humans can produce, and also covers a large number of instruments.
- Lastly, there are some instruments that don’t sound good at 96 kHz, for which 192 kHz is required, for example cymbals and bells.
However, there is a problem with this. If your pipeline involves a naive downsampler, which is common in popular media production software such as streaming apps, you gain none of the benefits of the higher sampling rate. In the worst case this can even cause new artifacts to appear.
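To illustrate the worst case, here is a minimal sketch of what a naive downsampler does. It assumes "naive" means keeping every second sample with no filtering beforehand; the helper names `tone` and `naive_decimate` are made up for this example and do not come from any particular application.

```python
import math

def tone(freq_hz, sample_rate, count):
    """Return `count` samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

def naive_decimate(samples, factor):
    """Keep every `factor`-th sample, with no low-pass filter beforehand."""
    return samples[::factor]

# A 30 kHz tone fits into a 96 kHz stream, but lies above the 24 kHz
# limit of a 48 kHz stream.
source = tone(30_000, 96_000, 960)
decimated = naive_decimate(source, 2)  # nominally a 48 kHz stream now

# Sample for sample, the result matches an inverted 18 kHz tone
# (48 kHz - 30 kHz): content that was never in the source.
alias = tone(18_000, 48_000, len(decimated))
worst_error = max(abs(d + a) for d, a in zip(decimated, alias))
print(f"largest deviation from an inverted 18 kHz tone: {worst_error:.2e}")
```

The 30 kHz tone does not disappear; it folds down to 18 kHz, which is one way new artifacts can appear after downsampling.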
What is the correct way to downsample?
This is the hard part, and I have no real answer for it. A reduced sample rate simply cannot cover all the frequencies that higher sample rates can. Even the best downsampling and filtering can only do so much, and will struggle with certain frequencies where artifacts are simply unavoidable.
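For reference, the usual textbook approach is to low-pass filter first and only then drop samples. The sketch below is one possible version of that, not a definitive answer to the question: the Hamming-windowed sinc filter, the 101 taps and the 20 kHz cutoff are all assumptions picked for illustration, and real resamplers are considerably more elaborate.

```python
import math

SOURCE_RATE = 96_000  # Hz, assumed input rate for this sketch

def tone(freq_hz, count):
    """Unit-amplitude sine wave sampled at SOURCE_RATE."""
    return [math.sin(2 * math.pi * freq_hz * n / SOURCE_RATE) for n in range(count)]

def lowpass_taps(num_taps, cutoff_hz):
    """Hamming-windowed sinc low-pass filter, normalized to unity gain at DC."""
    fc = cutoff_hz / SOURCE_RATE  # cutoff in cycles per sample
    mid = (num_taps - 1) // 2     # centre tap (num_taps is assumed odd)
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def filter_and_halve(samples, taps):
    """Apply the filter, keeping only every second fully overlapped output."""
    last_start = len(samples) - len(taps)
    return [
        sum(t * samples[start + i] for i, t in enumerate(taps))
        for start in range(0, last_start + 1, 2)
    ]

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

taps = lowpass_taps(101, 20_000)
kept = rms(filter_and_halve(tone(1_000, 4_900), taps))      # well below the cutoff
removed = rms(filter_and_halve(tone(30_000, 4_900), taps))  # above the 24 kHz limit
print(f"1 kHz RMS after downsampling:  {kept:.4f}")
print(f"30 kHz RMS after downsampling: {removed:.6f}")
```

In this setup the 1 kHz tone passes almost unchanged, while the 30 kHz tone is strongly attenuated instead of folding back into the audible range. What the filter cannot do is preserve the content above the new limit, which is the trade-off described above.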
The majority of the frequencies above 9.6 kHz are problematic at 48 kHz, and simply can’t be represented correctly. For example, the 19.2 kHz frequency is nearly impossible to represent accurately at 48 kHz, but is fine at 96 kHz.
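One way to read these figures is to count how many samples land on each cycle of a tone. That this is the reasoning behind the 9.6 kHz threshold is an assumption, and `samples_per_cycle` is a hypothetical helper used only to show the arithmetic.

```python
def samples_per_cycle(sample_rate_hz, tone_hz):
    """How many samples cover one full cycle of a tone."""
    return sample_rate_hz / tone_hz

for rate, freq in [(48_000, 9_600), (48_000, 19_200), (96_000, 19_200)]:
    print(f"{freq / 1000:g} kHz tone at {rate / 1000:g} kHz: "
          f"{samples_per_cycle(rate, freq):g} samples per cycle")
```

A 9.6 kHz tone gets 5 samples per cycle at 48 kHz, while a 19.2 kHz tone gets only 2.5. At 96 kHz the 19.2 kHz tone is back to 5 samples per cycle.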
What about supersampling D/A and A/D converters?
Higher-priced audio devices have started using supersampling D/A and A/D converters, which usually have a data resolution of 48, 96 or 192 kHz, and an internal resolution in the MHz range. Since this is usually not listed in the spec sheet, it is impossible to tell whether you have one without an oscilloscope.
Their quality is defined by their resampling algorithm, and high-quality resampling algorithms can make 48 kHz sound nearly indistinguishable from 96 kHz, at least for the majority of frequencies. If you can confidently say that you have one of these, then you will be “fine” at a 48 kHz sampling rate: the majority of audio frequencies will be reproduced with only minor artifacts.
So there you have it: the answer to the age-old question “Is 48 kHz enough?” is “No”. The minimum necessary to accurately reproduce most real-world audio is 96 kHz, and some things even need 192 kHz or higher to be correctly reproduced.
And thanks to technological advances, we might in the future see 96 kHz become the new “X is enough”. Chips have gotten smaller and more efficient, audio capture and playback devices have improved, and even our mobile phones are starting to support higher sample rates.
With all that said, there isn’t anything left to talk about. If you think I made a mistake, or just know better, do feel free to contact me.