Upsampling vs. Oversampling for Digital Audio
Vast amounts of marketing effort are spent touting the latest and greatest technological advancements in home audio. We are all aware of the over-inflated, often baseless claims that companies tend to make when advertising new products. The latest A/V receiver and A/V processor offerings prominently advertise super-high sampling rates and wide bit words for processing digital audio signals, and the ability to upsample to these extreme rates is a main selling point for many A/V receivers. So does upsampling to higher rates really provide better sound? This article briefly looks at why upsampling is used and whether it really is the answer to better sound.
Basics of Sampling - Oversampling and Upsampling
To start, we must review the system as a whole and define the terminology. The two operations we will consider are upsampling and oversampling. In a purely mathematical context they are similar operations. In practical implementations, however, oversampling refers to running the A/D or D/A converter at a higher sampling rate than strictly needed, thus increasing the rate of the signal, while upsampling is a conversion from one sample rate to another, arbitrary rate performed in the digital domain. Oversampling in the ADC has been around for quite a long time, while upsampling of audio as a simple rate conversion is relatively new. The figure below shows an example of oversampling: in one case we sample the analog waveform at some frequency fs, and in the other case at twice that rate. The resulting data samples are shown to the right in the figure. In both cases we assume we are sampling at or above twice the bandwidth of the signal (the Nyquist rate) to prevent any aliasing.
Figure 1 : Oversampling Operation
A rate conversion in the middle of the digital processing chain, or upsampling, looks slightly different depending on how it is implemented. The simplest implementation zero-stuffs the original sample stream to increase the sample rate. Other implementations create the additional samples by taking a weighted average of the samples at the original rate. In almost all cases, the upsampling process also includes an interpolation filter to remove the images of the original signal.
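The zero-stuffing approach described above can be sketched in a few lines of Python. This is a minimal illustration, not a production resampler: the 4x factor, the 50 Hz test tone, and the 33-tap Hamming-windowed interpolation filter are all illustrative choices of mine.

```python
import numpy as np
from scipy import signal

def upsample_zero_stuff(x, L, numtaps=33):
    """Raise the sample rate by an integer factor L: insert L-1 zeros
    between samples, then lowpass-filter to remove the spectral images."""
    stuffed = np.zeros(len(x) * L)
    stuffed[::L] = x                       # zero stuffing
    # Linear-phase FIR interpolation filter: cutoff at the original
    # Nyquist frequency (1/L of the new one); gain L restores amplitude.
    h = signal.firwin(numtaps, 1.0 / L) * L
    return signal.lfilter(h, 1.0, stuffed), (numtaps - 1) // 2

# A 50 Hz tone sampled at 1 kHz, upsampled 4x to 4 kHz
fs, L = 1000, 4
x = np.sin(2 * np.pi * 50 * np.arange(200) / fs)
y, delay = upsample_zero_stuff(x, L)
```

After the filter's group delay of `delay` samples, `y` closely tracks the same tone sampled directly on the 4 kHz grid; the images of the original 1 kHz spectrum have been filtered away.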
A simple block diagram of a processing chain is shown below. The analysis filter sits before the ADC and isolates our signal of interest before we sample it. Oversampling is performed at the ADC, and the signal is then sent to the digital processing chain that does the filtering and any DSP operations. The upsampler (if any) sits somewhere among the DSP and filtering stages. Like the ADC, the DAC also oversamples the signal. As we will see, oversampling at the ADC and DAC makes the design of all our filters much simpler.
Figure 2 : Basic Signal Processing Chain
Effects on Frequency Response
To get a better idea of what happens to our original signal when we upsample or oversample, we need to look at it in the frequency domain. That will show the benefits of moving to a higher sample rate. The figure below shows the result of taking our standard audio spectrum, which ranges from 20 Hz to 20 kHz, and increasing its rate by 8x.
Figure 3 : Oversampling/Upsampling Effects on Frequency
Whether we upsample or oversample, the effect on the spectrum of our audio signal is similar. Instead of occupying almost the entire bandwidth all the way up to fs/2 (22.05 kHz), the signal now occupies only 1/8th of it. This simplifies every filter in the chain: the analysis filter at the head of the processing chain, the internal digital filters, and the reconstruction filter at the end. Both upsampling and oversampling give us this benefit. Without increasing the sample rate, we would need a very sharp filter that cuts off just past 20 kHz and is 80-100 dB down at 22 kHz. Such a filter is not only difficult and expensive to implement, but may sacrifice some of the audible spectrum in its rolloff. At the increased rate, the filter can roll off gently well past 22 kHz; as long as it has reached its stopband attenuation by 176.4 kHz, the image created by the sampling process is easily removed. The analog filter after the D/A converter is responsible for removing the audio signal's image as well as the frequency spurs caused by the DAC's integration steps. An analog filter with a smooth rolloff will have nicer phase characteristics as well.
To demonstrate these benefits, let's look at two analog filters: one that must operate at Nyquist, and one that can operate at 64x Nyquist. The filter operating at Nyquist must have a very sharp cutoff and a high order. The figure below shows a 10th-order Bessel lowpass filter. The rolloff is very sharp, and the corresponding phase response becomes nonlinear toward the higher frequencies. Such phase variations are undesirable in our audio signal. The plots below are a function of radians per second rather than hertz and are on a logarithmic scale. On that scale, our audio signal occupies 125.66 rad/s up to 138,544 rad/s, which roughly corresponds to 10^2 and 10^5 on the plots.
Figure 4 : Analog Filter at Nyquist
Now let's look at a filter that operates on our bandlimited audio signal at 64x Nyquist. This is a 3rd-order Bessel lowpass filter that cuts off well beyond our audio spectrum. The rolloff is much gentler, but the phase response is notably better: it is linear over almost the entire audio spectrum, from 0 rad/s up to 138,544 rad/s.
Figure 5 : Analog Filter at 64x Nyquist
The phase response of the analog reconstruction filter after the DAC depends on the type of filter used and on how much the DAC oversamples. For a given analog filter structure, more oversampling allows a more linear phase response over the audio spectrum. The DAC's oversampling to a higher rate allows for a reasonable analog filter design that gives us linear phase. The key point is that oversampling in the DAC and oversampling in the ADC are both important parts of the processing, and both have been used for a very long time.
Upsampling would give us the same frequency-response benefits we have just covered, but we can achieve the same effects by sufficiently oversampling our signal at both the ADC and the DAC. Upsampling has no effect on our digital filter design problem, since all our digital filters are FIR (finite impulse response) and already have linear phase. By sufficiently oversampling at the ADC, we can design a very simple, linear-phase digital filter that handles our audio signal without problems. Much misinformation surrounds upsampling, including claims that it is necessary to allow such a desirable digital filter. In fact, it is the oversampling at the ADC that takes care of this, not upsampling. To demonstrate, let's take our audio signal oversampled at 8x Nyquist and design a digital filter for it.
Below we have a symmetric digital FIR filter. The plot is in normalized frequency, where 1 = 176.4 kHz; our audio signal extends from approximately 0.00011338 up to 0.125. The passband of the filter is smooth over this frequency range, and the phase response is linear. Just past 22 kHz, the response of the filter is down only 6 dB, and it falls below -120 dB soon after.
Figure 6 : Digital Filter at 8x Nyquist
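A filter with these properties is easy to design with scipy's windowed-sinc routine. The article does not publish its coefficients, so the tap count (401) and Kaiser window (beta = 12, roughly -117 dB sidelobes) below are illustrative guesses of mine that merely reproduce the behavior described: symmetric taps (hence exactly linear phase), about -6 dB at 22.05 kHz, and a deep stopband shortly after.

```python
import numpy as np
from scipy import signal

fs = 176_400          # 8x rate; the audio band tops out at 22.05 kHz
# Symmetric linear-phase FIR lowpass with its -6 dB point at 22.05 kHz
h = signal.firwin(401, 22_050, window=("kaiser", 12.0), fs=fs)

# Symmetric taps guarantee exactly linear phase
assert np.allclose(h, h[::-1])

# Magnitude response on a fine frequency grid (Hz)
w, H = signal.freqz(h, worN=8192, fs=fs)
mag_db = 20 * np.log10(np.abs(H) + 1e-12)
```

Inspecting `mag_db` shows a flat passband through the audio range, roughly -6 dB right at 22.05 kHz, and well under -100 dB a few kilohertz later, matching the figure's description.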
Jitter
Jitter is a timing error caused by inaccuracies of a system clock relative to the data stream. In an ADC, for example, jitter causes the analog waveform to be sampled either too early or too late relative to the previous sample, making the sample's level incorrect. Logically, a high-frequency, high-amplitude signal is more affected by jitter than a lower-frequency, smaller-amplitude one. One major claim made by proponents of upsampling, or sample rate conversion, is that it reduces the effects of this jitter. It is very important to note that the increase in rate itself is not responsible for the reduction. In fact, the jitter introduced by an inaccurate ADC clock cannot be removed completely, since it is already embedded in our digital samples. However, by upsampling to another rate with a clock that is asynchronous to the original, the incorrectly sampled data can be somewhat corrected: the effects of the sampling jitter are 'spread out' over a wider spectrum. Jitter appears in our system as an increase in the noise floor of the audio spectrum. By going through this rate increase, we spread the jitter over more samples, interpolate, and then filter once again. One popular ASIC that performs such a function is the AD1896 from Analog Devices. Note that if not implemented correctly, this whole upsampling process can actually yield poorer results. Also, with a clock that is accurate relative to our input data's frequency, we can greatly reduce the effects of jitter at the head end. How much jitter is actually audible is religiously debated, and the psychoacoustics behind it are beyond the scope of this article.
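The claim that higher-frequency signals suffer more from a given amount of jitter is easy to verify numerically. The sketch below wobbles the sample instants of a sine wave with Gaussian timing error; the 2 ns RMS jitter figure and the 44.1 kHz rate are illustrative assumptions of mine, not values from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
fs, n = 44_100, 1 << 15
t = np.arange(n) / fs
jitter = rng.normal(0.0, 2e-9, n)       # hypothetical 2 ns RMS clock jitter

def snr_with_jitter(f0):
    """SNR (dB) of a unit sine sampled at instants t + jitter
    instead of the ideal instants t."""
    clean = np.sin(2 * np.pi * f0 * t)
    wobbled = np.sin(2 * np.pi * f0 * (t + jitter))
    noise = wobbled - clean             # error caused purely by timing
    return 10 * np.log10(np.mean(clean**2) / np.mean(noise**2))
```

For small jitter the error behaves like the signal's slew rate times the timing error, so the SNR follows approximately -20*log10(2*pi*f0*sigma): here a 20 kHz tone loses roughly 26 dB of SNR relative to a 1 kHz tone under the same clock.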
Oversampling DACs and Bits
Oversampling is widely used in the DAC. As we have seen, oversampling at the DAC is advantageous to the design of the analog reconstruction filter that must follow it. With a high sample rate out of the DAC, we can use a very simple, gentle analog filter to reconstruct our analog signal. This is important because such a filter is not only cheap in hardware, but also has a nicely linear phase response over the passband.
Another reason for oversampling is to reduce the effects of quantization noise. By oversampling, we spread the quantization noise over a larger bandwidth while our signal of interest stays in the same band. Our filter then cuts out the out-of-band quantization noise while keeping the original signal, thereby increasing the SNR. For each factor of four by which we oversample, the in-band noise drops by about 6 dB, and 6 dB represents approximately one bit of resolution. By oversampling, we can therefore theoretically drop one bit from the converter for every 4x increase in sample rate.
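The 6 dB per 4x figure can be checked with a small simulation. The sketch below runs a dithered quantizer at the base rate and at 4x, then measures the error power that lands inside the audio band; the 8-bit word, -6 dBFS 997 Hz test tone, and TPDF dither are illustrative choices of mine (the dither whitens the quantization error so the spreading argument applies cleanly).

```python
import numpy as np

rng = np.random.default_rng(1)

def inband_noise_db(osr, bits=8, n=1 << 16, band=20_000.0):
    """In-band quantization-error power (dB) of a dithered 'bits'-bit
    quantizer running at osr times the 44.1 kHz base rate."""
    fs = 44_100 * osr
    t = np.arange(n) / fs
    x = 0.5 * np.sin(2 * np.pi * 997.0 * t)     # -6 dBFS test tone
    q = 2.0 ** (1 - bits)                       # step size for +-1 full scale
    dither = (rng.random(n) - rng.random(n)) * q   # TPDF dither
    xq = np.round((x + dither) / q) * q            # quantize
    e = xq - x                                     # total error signal
    spec = np.abs(np.fft.rfft(e)) ** 2 / n
    f = np.fft.rfftfreq(n, 1 / fs)
    return 10 * np.log10(np.sum(spec[f <= band])) # error power below 20 kHz

gain_db = inband_noise_db(1) - inband_noise_db(4)  # benefit of 4x oversampling
```

The total error power is the same in both runs; only its distribution changes. At 4x the same noise is spread over four times the bandwidth, so the portion below 20 kHz drops by close to 10*log10(4), about 6 dB.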
The number of bits is another consideration. Does carrying extra bits increase the amount of information in our signal? Unfortunately, once we have sampled the signal, nothing can be done to increase the amount of information we have to work with. What carrying more bits does is prevent the loss of information. DSP algorithms and filters require additions, multiplications, and other math operations; if we carry more bits in the results of these operations, we lose less information when bits are chopped off, since every truncation of a result adds noise to the signal. By balancing the number of bits we carry in our computations against the amount we oversample, we can reduce the effect of this truncation in word length. Note that many products claim 24-bit word lengths yet only process internally at 20 bits.
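The value of a wide internal word can be shown with a toy fixed-point model. Below, a 64-tap dot product (the core of an FIR filter) is computed two ways: rounding every product back to 16-bit (Q15) precision, as a narrow datapath would, versus keeping a wide accumulator and rounding only the final result. The Q15 format and the uniform random data are illustrative assumptions of mine.

```python
import numpy as np

rng = np.random.default_rng(2)
Q = 2.0 ** -15                        # hypothetical Q15 (16-bit) step size

def q15(v):
    """Round a value to the nearest Q15 fractional level."""
    return np.round(v / Q) * Q

def accumulation_errors(trials=200, taps=64):
    """Compare rounding after every multiply vs one wide accumulator."""
    e_narrow, e_wide = [], []
    for _ in range(trials):
        c = q15(rng.uniform(-0.1, 0.1, taps))      # quantized coefficients
        x = q15(rng.uniform(-1.0, 1.0, taps))      # quantized samples
        ref = np.dot(c, x)                         # full-precision accumulate
        e_wide.append(q15(ref) - ref)              # truncate once, at the end
        e_narrow.append(np.sum(q15(c * x)) - ref)  # truncate every product
    rms = lambda e: float(np.sqrt(np.mean(np.square(e))))
    return rms(e_narrow), rms(e_wide)

rms_narrow, rms_wide = accumulation_errors()
```

Each rounding contributes independent noise of variance about q^2/12, so rounding all 64 products accumulates roughly 64 times the noise power of a single final rounding, an RMS penalty of about 8x. This is exactly why carrying wide intermediate words matters even when the output word is only 16 or 24 bits.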
What Does This All Mean - Will it Sound Better?
So the question remains whether upsampling or oversampling actually makes music sound 'better', and how much of it we need. We have seen the main motivation behind oversampling: it allows us to use simpler digital and analog filters and helps us with quantization noise. The effects of upsampling are greatly debated. While it is true that upsampling helps attenuate the jitter caused by sampling errors and an inaccurate clock, whether that jitter is audible is a point of contention. There is no doubt that the wide bit words and super-high sampling rates touted by the latest products are largely marketing. Oversampling has been around for a very long time and has been used extensively in audio products, not only improving sound quality through 'better' filtering but also making those same products much cheaper. Upsampling, on the other hand, is newer and hotly debated, and its effects are no doubt overstated. By carefully designing the sampler, ADC, digital processing path, and oversampling DAC, upsampling and asynchronous rate conversion can, in my opinion, be avoided.
The Purists Point of View
There are basically two points of view on upsampling and oversampling. The audio 'purists' want no additional processing on their signal: whatever comes in from the source should come out as analog. They speak of zero-oversampling DACs and the like that are completely filter-free in both the analog and digital domains. Some may argue this extreme is the purest, since it avoids any digital artifacts; its quality relies on human perception, the argument being that the human ear itself acts as a brickwall filter above 20 kHz. Whenever we get into debates about human perception, the math and theory go out the window. Does it sound better without all the digital processing and filtering, even with the image of the signal sitting just past fs/2? The energy past 22.05 kHz is still present, and you are still sending it to the speaker's tweeter. How will the tweeter react to such out-of-band frequencies? Furthermore, sending a signal that is not bandlimited could cause stability problems with wide-bandwidth amplifiers that have a high unity-gain crossing, and the overall system's signal-to-noise ratio will be adversely affected as well. The DAC will also introduce frequency spurs all over the place; if we don't filter them at all, what will their presence do to the sound? It's a complicated problem, and such a minimalist approach could introduce more nonlinearities and negative effects than the digital processing ever would.
About the Author
Nauman Uppal received his Bachelor's Degree in Electrical Engineering from the University of Maryland, College Park in 1998 and went on to complete a Master's Degree in 2000 at the University of Maryland. His focus in graduate school was communications and signal processing. Nauman's work experience includes 2.5 years as an ASIC designer for PMC-Sierra plus some work for Nortel Networks while he was in school. He has held his current position for 1.5 years, designing digital communication systems using FPGAs.