As iTunes enters the Russian market, many technical questions arise about mastering audio tracks for distribution via iTunes. Some of these questions are answered on Apple's official site, but many aspects are not covered there. Let's investigate...
The target audio encoding format of iTunes is AAC, and Apple recommends using 96 kHz / 24-bit source files where possible, which means the material has to be sample-rate converted down to 44.1 kHz during encoding.
Apple's afconvert utility offers several sample-rate-converter quality modes, including norm and bats. Here are the results in norm mode:
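For reference, a typical two-step command chain looks like this (a sketch based on Apple's documented Mastered for iTunes workflow; the file names are placeholders):

```shell
# Step 1: sample-rate convert to a 44.1 kHz float CAF intermediate,
# choosing the SRC complexity (norm or bats) explicitly:
afconvert source.wav intermediate.caf -d LEF32@44100 -f caff \
  --soundcheck-generate --src-complexity bats -r 127

# Step 2: encode the intermediate to 256 kbps iTunes Plus AAC:
afconvert intermediate.caf -d aac -f m4af -u pgcm 2 \
  --soundcheck-read -b 256000 -q 127 -s 2 output.m4a
```

Swapping `--src-complexity bats` for `--src-complexity norm` in step 1 selects the faster, lower-quality converter compared below.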
CONCLUSION: not bad, but not exactly the super quality declared by Apple; the results are rather average compared to modern professional audio editors.
The second mode is bats and here are the results:
Here everything looks much better: all the results are on par with the best existing sample-rate converters.
CONCLUSION: there is no difference whether you are converting the sampling rate with an external high-quality converter or with Apple’s recommended converter in bats mode.
The next issue is signal levels.
Apple recommends keeping the maximum peak level of the source file (including all intersample peaks) at −1 dBFS. Can this level be exceeded? Is 1 dB of headroom enough to prevent clipping during the subsequent conversion?
To answer this question one can use another Apple utility, afclip, which measures signal levels, including intersample peaks.
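To see what "intersample peaks" means in practice, here is a minimal self-contained sketch (not afclip's actual algorithm) that estimates the true peak by windowed-sinc interpolation between samples. The test signal is a sine at a quarter of the sampling rate with a 45° phase offset: every sample lands at ±0.707 (−3 dBFS), yet the continuous waveform reaches full scale between the samples.

```python
import math

def true_peak(samples, oversample=8, taps=32):
    """Estimate the inter-sample (true) peak via Hann-windowed sinc interpolation."""
    n = len(samples)
    peak = 0.0
    for i in range(n * oversample):
        t = i / oversample                 # fractional sample position
        acc = 0.0
        for k in range(max(0, int(t) - taps), min(n, int(t) + taps + 1)):
            d = t - k
            if abs(d) >= taps:
                continue                   # outside the window support
            if d == 0.0:
                acc += samples[k]          # exact sample point
            else:
                w = 0.5 * (1.0 + math.cos(math.pi * d / taps))  # Hann window
                acc += samples[k] * w * math.sin(math.pi * d) / (math.pi * d)
        peak = max(peak, abs(acc))
    return peak

# Sine at fs/4 with a 45-degree phase: all samples at +-0.707, true peak near 1.0
sig = [math.sin(math.pi * k / 2 + math.pi / 4) for k in range(200)]
sample_peak = max(abs(s) for s in sig)
print(round(20 * math.log10(sample_peak), 2))     # about -3.01 dBFS
print(round(20 * math.log10(true_peak(sig)), 2))  # close to 0 dBFS
```

A plain sample-peak meter would report −3 dBFS for this file, while the reconstructed analog waveform actually touches 0 dBFS; this gap is exactly what afclip-style true-peak measurement is for.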
We take three source files at 44.1 kHz for this experiment: the first has hard clipping at 0 dBFS; the second has been limited so that its true peak level is −0.5 dBFS (the "true peak" level includes intersample peaks); and the third has been limited so that its true peak level is −1 dBFS (any limiter with intersample-peak detection could be used for that).
The analysis of the source files produces the expected results: the first file shows thousands of clipped intersample peaks, while the second and third show none. However, after converting them to AAC and back, the picture is different. The first file looks even worse (as expected). The second file now shows thousands of clipped samples, which indicates that 0.5 dB of headroom (even when intersample peaks have been limited) is not sufficient. The third file performed up to Apple's standards: no clipping was detected and the audio signal was preserved in the best possible way.
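As a toy illustration of where such clipped-sample counts come from (a sketch, not afclip's actual detection logic), take one second of a sine driven 2 dB over full scale and hard-clipped at 0 dBFS:

```python
import math

def hard_clip(x, ceiling=1.0):
    # Brick-wall clipping at the given ceiling
    return max(-ceiling, min(ceiling, x))

SR = 44100
drive = 10 ** (2 / 20)  # +2 dB over full scale, about 1.26x
sig = [hard_clip(drive * math.sin(2 * math.pi * 440 * n / SR)) for n in range(SR)]

# Count samples pinned at full scale -- a crude stand-in for a clip report
clipped = sum(1 for s in sig if abs(s) >= 1.0)
print(clipped)  # thousands of clipped samples in one second of audio
```

Even a couple of decibels of overdrive pins a large fraction of all samples to full scale, which is why the first test file reports clipping by the thousands.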
CONCLUSION: Apple's recommendation on signal peak levels is valid: 1 dB of headroom is both necessary and sufficient to prevent clipping in musical material.
CONCLUSION: The process of adapting existing published CDs for iTunes distribution should include limiting the signal to a −1 dBTP (true peak) level and saving it to a lossless file ready for AAC encoding.
Now let's switch from adapting already-published material to mastering new material for iTunes, without the need for CD publishing.
What is the difference? The main one is that compression and limiting "just for loudness" no longer make sense. The loudness of tracks played in iTunes is controlled by the built-in Sound Check function, which automatically matches the loudness of all tracks in a playlist.
So, our first clipped file has simply been attenuated by 8 dB, but the clipping distortion is still there. We made a copy of it attenuated by a further 10 dB, and it was automatically turned up by 2 dB. Sound Check does not just normalize peak or RMS levels to match loudness: it uses a more elaborate subjective measure based on the Fletcher-Munson equal-loudness curves. If we insert two tones at 50 and 5000 Hz with the same peak level of −10 dBFS into our playlist, their resulting output levels will be −1 and −20.4 dBFS respectively.
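The mechanics can be sketched as a simple playback-gain rule (the target value and the loudness figures below are hypothetical; Apple's actual loudness measure is more elaborate, as the 50/5000 Hz tone experiment shows):

```python
TARGET = -16.0  # hypothetical reference loudness in dB, not Apple's real constant

def playback_gain(measured_loudness_db):
    # Sound Check only adjusts playback gain; the file itself is untouched,
    # so any clipping distortion baked into the master remains audible.
    return TARGET - measured_loudness_db

print(playback_gain(-8.0))   # loud clipped master: gain of -8 dB
print(playback_gain(-18.0))  # the same master 10 dB quieter: gain of +2 dB
```

The key point is that the gain is applied at playback time: making the master 10 dB quieter simply shifts the applied gain by 10 dB, while the perceived loudness stays the same.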
CONCLUSION: Compression and limiting no longer make sense for increasing loudness; they should be used only for artistic purposes, such as shaping the overall dynamics of a track.
The abovementioned recommendation also applies to other audio compression formats, not just AAC.