The question has been asked many, many times on the web: should I record and mix at a high samplerate (such as 192kHz), or is a lower samplerate (such as 44.1kHz or 48kHz) sufficient?
As with many internet answers, the subject is convoluted with quite a bit of misunderstanding and, as a consequence, bad advice, even from experts (and also from “experts”, if you catch my meaning). In the thread that set me on this path, a customer was asking a representative of a well-known internet music equipment reseller — with the titles “PC Specialist, Forum Moderator, Technical Support Expert” — for advice as to whether working at a higher samplerate was worth the additional gear expense and computer resources. The representative basically said that it was not, and then quipped that there is literally no audible difference unless you are an animal with ultrasonic hearing. That representative was then backed by two audio engineers who were similarly dismissive, insisting that rates as low as 44.1kHz would be just as good and suggesting that the only difference was snake-oil and placebo effect. How unfortunate.
While there IS merit to the notion that there is practically no audibly-discernible difference between a 44.1kHz file and a 192kHz file in-and-of-itself, that simply does not address the full issue, which is not one of listening but of processing. I will not address all the arguments here, since they have been covered ad nauseam and mostly do not apply to this discussion. I wanted to focus instead on one particular aspect that has been mostly overlooked: digital signal processing. With the exception of DAWs/hosts that oversample effects busses internally (on the fly or at offline mix-down), DAWs/hosts that always run their engine at one fixed samplerate, and plugins that resample internally, DSP happens at the samplerate of the imported audio track. That being the case, the higher the samplerate, the lower the aliasing artifacts will be (disregarding resampling artifacts, etc… which are grounds for a quite-different discussion).
Any non-linear process will produce aliasing as the signal folds back over the Nyquist frequency, and with that being the case, if we move Nyquist upward (via upsampling or higher-samplerate source material) before mixing down to ‘typical’ physical-media sample rates, we get a cleaner signal. This effect is most evident in clipping algorithms, but it also applies to signal generation, dynamics processing, and sometimes — to an extent — filtering (the Nyquist point has several other ramifications for filtering too, such as Q warping, but that is beside the point).
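To make the fold-back arithmetic concrete, here is a minimal sketch (the 10kHz fundamental and odd-harmonic series are assumed example values of my own, not measurements): a component that lands above Nyquist wraps around the sample rate and reflects back into the band, where it is no longer harmonically related to anything.

```python
def folded_freq(f, sr):
    """Frequency at which a component `f` actually appears when sampled at `sr`."""
    nyq = sr / 2
    f = f % sr                        # components wrap around the sample rate...
    return f if f <= nyq else sr - f  # ...and reflect off the Nyquist frequency

# Odd harmonics of an assumed 10kHz fundamental, as a symmetrical
# non-linearity would generate them:
for k in (3, 5, 7):
    h = k * 10_000
    print(f"{h}Hz -> {folded_freq(h, 48_000):.0f}Hz at 48kHz, "
          f"{folded_freq(h, 192_000):.0f}Hz at 192kHz")
```

At 48kHz the 5th harmonic (50kHz) lands at a dissonant 2kHz, well below the fundamental, while at 192kHz all three harmonics remain at their true (and removable) ultrasonic positions.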
Unlike the rampant anecdotal rhetoric and ‘trust me, I’m an expert’-isms that dominate cyber-space, examples and illustrations from a real-world scenario are actually useful. To that end, I generated identical pure sine tones at 192kHz and 48kHz and processed them with a simple peak compressor algorithm, with the project rate set to the native rate of each respective file, to observe their aliasing behavior. The settings for the peak compressor were fairly common for fast-transient control: peak detector, 1ms attack, 10ms release, 4:1 ratio, hard knee. (The algorithm is symmetrical, which was chosen to suppress even-order harmonics and bring out an odd-order harmonic series as a reference for which harmonics are ‘musically related’ to the fundamental.) Equipment used: Audacity for signal generation, Logic X for hosting, Voxengo SPAN for analysis, generic AudioUnit peak compressor plugin for processing.
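For reference, a feed-forward peak compressor of the kind described can be sketched in a few lines. This is my own generic sketch, not the actual AudioUnit plugin's algorithm, and the -20dB threshold is an assumed value since the settings above do not specify one:

```python
import numpy as np

def peak_compress(x, sr, threshold_db=-20.0, ratio=4.0,
                  attack_ms=1.0, release_ms=10.0):
    """Hard-knee peak compressor: envelope follower + static gain curve."""
    a_att = np.exp(-1.0 / (sr * attack_ms / 1000.0))   # 1ms attack coefficient
    a_rel = np.exp(-1.0 / (sr * release_ms / 1000.0))  # 10ms release coefficient
    thr = 10.0 ** (threshold_db / 20.0)
    env = 0.0
    gain = np.ones_like(x)
    for n, s in enumerate(np.abs(x)):                  # peak (rectified) detector
        coeff = a_att if s > env else a_rel
        env = coeff * env + (1.0 - coeff) * s
        if env > thr:
            # above threshold, output level rises at 1/ratio of the input level
            gain[n] = thr * (env / thr) ** (1.0 / ratio) / env
    return x * gain
```

Note that the per-sample gain modulation is itself a non-linear operation on the signal, which is exactly where the sidebands that alias come from.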
As you can see, the difference is not subtle. There are artifacts in the 48kHz signal that are nearly 60dB higher than in the 192kHz signal, and that come within a few dozen dB of the peak of the original fundamental. Also bear in mind three things:
1. those artifacts are NOT harmonic distortion: they are additional dissonant frequencies that cannot be removed
2. this is just a typical example with a compressor: the result could be much more pronounced in the case of hard limiting, clipping, saturation, etc
3. one is likely to cascade several effects in a chain on each track, which compounds the effect and adds additional dissonant content with each successive instance of non-linear DSP
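Point 3 can be demonstrated numerically. In this sketch (my own illustration with assumed values: a 10kHz tone at 48kHz driven twice through a hard clipper, standing in for any cascade of non-linear plugins), the second stage measurably raises the folded products that the first stage created:

```python
import numpy as np

sr = 48_000
t = np.arange(sr) / sr
x = 0.8 * np.sin(2 * np.pi * 10_000 * t)   # assumed 10kHz test tone

def drive(y):
    """One saturation stage: gain into a hard clipper (a stand-in for any
    non-linear plugin in the chain)."""
    return np.clip(1.5 * y, -0.5, 0.5)

one_stage = drive(x)
two_stage = drive(one_stage)               # the same stage cascaded

def amp_at(y, freq):
    """Normalized spectral magnitude at `freq` (1-second signal: 1Hz bins)."""
    return np.abs(np.fft.rfft(y))[freq] / len(y)

# At 48kHz the odd harmonics of 10kHz fold to 18k, 2k, 22k, 6k, 14k...
alias_bins = (18_000, 2_000, 22_000, 6_000, 14_000)
for y, label in ((one_stage, "one stage"), (two_stage, "two stages")):
    floor = np.sqrt(sum(amp_at(y, f) ** 2 for f in alias_bins))
    print(f"{label}: alias level {20 * np.log10(floor):.1f} dBFS")
```

Each added stage squares the waveform up a little further, so every folded product gets louder even though no stage ran any hotter than the last.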
For the sake of completeness, here is a more complex signal consisting of four sines at different amplitudes, across a range of 18dB, to show the effect of aliasing and intermodulation at 48kHz compared to 192kHz. The image is zoomed and cropped to show more detail across a few octaves, and to show the 16-bit dynamic-range floor. In this example, the effect is much more pronounced, with dissonant aliases appearing less than a dozen dB down relative to their closest natural harmonic:
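A rough numeric analogue of this multi-tone comparison can be sketched as well. This is again my own illustration with assumed values, a hard clipper standing in for the compressor, and tones deliberately placed on a 5kHz grid: every legitimate harmonic and intermodulation product then also lands on that grid, so any in-band energy OFF the grid at 48kHz can only be fold-back aliasing.

```python
import numpy as np

# four tones spanning 18dB on a 5kHz grid (assumed values)
TONES = [(5_000, 0.0), (10_000, -6.0), (15_000, -12.0), (20_000, -18.0)]

def alias_energy(sr):
    """Energy at off-grid bins below 24kHz after clipping the four-tone signal."""
    t = np.arange(sr) / sr                           # one second: 1Hz bins
    x = sum(0.4 * 10 ** (db / 20.0) * np.sin(2 * np.pi * f * t)
            for f, db in TONES)
    y = np.clip(x, -0.3, 0.3)                        # the non-linear stage
    spec = np.abs(np.fft.rfft(y)) / len(y)
    freqs = np.fft.rfftfreq(len(y), 1.0 / sr).astype(int)
    mask = (freqs <= 24_000) & (freqs % 5_000 != 0)  # in-band, off-grid bins
    return np.sum(spec[mask] ** 2)

e48, e192 = alias_energy(48_000), alias_energy(192_000)
print(f"alias energy, 48k vs 192k: {10 * np.log10(e48 / e192):.1f} dB worse")
```

At 192kHz the clipping products above 24kHz simply sit above 24kHz; at 48kHz they have nowhere to go but back down between the intended partials.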
So working at the highest samplerate you can, with respect to your gear, storage, and processing power, is definitely worth it. HOWEVER, it IS true that working at a high samplerate chews through hard-disk space, RAM, and CPU, as well as increasing the time it takes to back up projects or render audio. So here are a few alternatives, in order of what I consider to be the best options, taking into consideration safety (can you easily undo it?), resources, time, space, and the effect of resampling artifacts or other degradation:
1. BEST OPTION: Use whatever reasonable samplerate you want, and then for your non-linear DSP apply **ONLY** well-designed plugins that minimize aliasing internally, via either their algorithms or internal resampling. There are some great, clean, affordable plugins out there; you just have to dig through a lot of marketing nonsense and ill-informed reviews to find them sometimes. A few developers off the top of my head that consistently turn out good plugins: Klanghelm, Voxengo, Airwindows, vladg, Valhalla, etc (NOTE: let me also say that fancy-schmancy famous expensive plugins are NOT automatically great… a great many out there disregard some of what we discussed here, and some truly awful ones cost a ton and are inexplicably popular). If you have the cash and the plugins you need actually exist… be a pal, take this option, and support the hard-working DSP programmers of the world. Also note that if you are on a budget and happen to work on a Mac, many of Apple’s built-in plugs have algorithms that are quite good (even if the UIs are a bit clunky and glitchy)… test them yourself and see.
2. Use a DAW/host that increases the samplerate of its processing during offline mix-down, so that you get the lower resource usage of the lower samplerate while working, but still get the benefit of a higher samplerate at mix-down (NOTE: be sure your plugins support that mechanism).
3. Use a DAW/host that increases the samplerate of its internal DSP and 3rd-party plugins on the fly (NOTE: again, be sure your plugins support that mechanism).
4. Record at a high samplerate, then apply your plugins and make liberal use of track-freezing (if your host supports it). You will still have the problem of storage space and mixdown/backup time, though.
5. Worst option, but still one to consider: work destructively by staging your project (this is risky, non-reversible, and time-consuming, but still applicable if you are stuck with lower-quality source material or limited hardware/software resources). Do basic edits, as well as any linear DSP that is not affected by these issues, then print that at a higher samplerate and apply your other effects. Another example of this, more common in the dark old days of weak CPUs and small drives, would be to mix to sub-busses/stems, print those at a higher samplerate, apply your group effects, and then mix down at that higher samplerate to master the mix at the best quality possible.
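Options 1 through 3 all come down to the same internal trick: raise the rate, apply the non-linearity there, band-limit, and come back down. A bare-bones sketch of it (my own, with an assumed 4x factor and a hard clip as the non-linearity; FFT-based resampling is used for brevity, where real plugins use polyphase filters):

```python
import numpy as np

def oversampled_clip(x, factor=4, ceiling=0.5):
    """Clip `x` at `factor` times its samplerate, then return to the original
    rate. Assumes an even-length signal. FFT resampling = ideal band-limiting."""
    n = len(x)
    X = np.fft.rfft(x)
    # upsample: zero-pad the spectrum (irfft pads the missing high bins)
    up = np.fft.irfft(X, n * factor) * factor
    up = np.clip(up, -ceiling, ceiling)        # non-linear stage at the high rate
    # downsample: keep only the original band, discarding the harmonics that
    # would otherwise fold over the original Nyquist
    U = np.fft.rfft(up)[: n // 2 + 1]
    return np.fft.irfft(U, n) / factor
```

This does not remove aliasing entirely (products beyond the oversampled Nyquist still fold), but it pushes the alias floor down dramatically compared to clipping at the base rate.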
While this example is incomplete and only represents one particular aspect of the larger argument, I hope it sheds a little light on the subject, as well as providing a few possible solutions.