2.3 Time (Space) Sampling

In digital recording, the signal amplitudes are sampled at discrete (fixed), equally spaced times or distances (as in Figures 2.2b and 2.2d). This seemingly simple step is deceptive because you can completely destroy your experiment with the wrong choice of sampling interval. If the sampling interval is too small, you will record much more data than necessary, resulting in excessive storage requirements and redundant data. If the sampling interval is too large, it’s much worse. In fact, the recorded data may be totally worthless since the entire spectral (frequency) content of the signal can be distorted.

SAGE SAYS:

Notice that the sampling theorem does not actually say how you can reconstruct the values of the continuous function at the intervening points between the digitized values. It doesn't seem possible that the digitized signal is equivalent to the continuous one; fear not, this will be explained in Appendix A.

 

These "phony data" are the result of the proper data being aliased. The sampling interval must be chosen to avoid or minimize aliasing. How is this done? In Appendix A we will add rigor to the answer to this question; for now, let’s simply state the all-important sampling theorem and use it to answer the question. The sampling theorem requires that you know the highest frequency, fmax, that is in the data to be digitized. Technically, we require that the frequency content of the signal be zero above fmax. The sampling theorem states that:

A continuous signal s(t) can be reconstructed exactly from its digitized samples if the sampling interval, Δt (or Δx in distance), is chosen so that

$$\Delta t < \frac{1}{2 f_{\mathrm{max}}} \tag{2.3.1}$$

This can be rephrased to say that to avoid aliasing, there must be more than two samples per cycle for the highest frequency present in the signal. A special name, the Nyquist frequency, is given to the frequency 1/(2Δt) (or 1/(2Δx)).
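As a quick check of the rule just stated, here is a minimal Python sketch. The function names nyquist_frequency and satisfies_sampling_theorem are our own illustrative choices, not part of any library, and the 4 Hz limit and the two sampling intervals are values picked only for illustration.

```python
def nyquist_frequency(dt):
    """Nyquist frequency 1/(2*dt) for a sampling interval dt (seconds -> Hz)."""
    return 1.0 / (2.0 * dt)


def satisfies_sampling_theorem(dt, f_max):
    """True if dt gives more than two samples per cycle of f_max."""
    return dt < 1.0 / (2.0 * f_max)


# Illustrative values only: a signal assumed to have no content above 4 Hz.
f_max = 4.0
for dt in (0.2, 0.1):
    ok = satisfies_sampling_theorem(dt, f_max)
    print(f"dt = {dt} s, Nyquist = {nyquist_frequency(dt):.2f} Hz, adequate: {ok}")
```

With these numbers, a 0.2 s interval fails the test (its Nyquist frequency of 2.5 Hz lies below 4 Hz), while a 0.1 s interval passes (Nyquist frequency of 5 Hz).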

We wanted to get the sampling theorem directly up front in this section; now we will discuss it in detail and illustrate its consequences. First, let us say that we will deal with a time domain signal s(t) which is continuous (the signal could just as well be in space, say s(x) or even s(x,y), the latter representing data covering an area). Assume that s(t) is digitized so that

$$s_k = s(k\Delta t), \qquad k = \ldots, -2, -1, 0, 1, 2, \ldots \tag{2.3.2}$$

This is, of course, an impossible task since we can’t possibly begin sampling at −∞ and continue to +∞. We’ll address this problem shortly, but for now, let’s imagine that we have such a digitized sequence. Since we’re in the imagining mood, let’s imagine a very special function that isn’t actually a function at all. It’s called a delta function, or Dirac delta function, and is expressed by the symbol δ(t). (Paul Dirac was a British physicist who was awarded the Nobel Prize in Physics in 1933. He introduced the delta function into quantum physics in the late 1920s.)

Dirac Delta Function, the Dirac Comb, and Aliasing

A true function such as s(t) has a definite value, s(t0), at any time t = t0. The delta function, δ(t), is defined to be non-zero only where its argument is equal to zero. For example, δ(t) is non-zero only where t = 0, δ(t - t0) is non-zero only where t - t0 = 0 or at t = t0, etc. So, a delta function can be viewed as existing only at a point and, therefore, it has a width that approaches zero. However, the integral of a delta function (i.e., its area) is nicely defined to have a value of 1,

$$\int_{-\infty}^{\infty} \delta(t)\, dt = 1 \tag{2.3.3}$$

This is obviously very strange since for the area of a function which approaches zero width to be 1, the amplitude would have to approach infinity. As such, the delta function is a mathematical abstraction and not a true function. In geophysics, we often call it an impulse function or simply a spike. It has many practical uses for something so abstract, many more than you would guess. In the context of sampling geophysical signals, there is a property that we will employ, namely the "sifting property" of the delta function. This is expressed as

$$\int_{-\infty}^{\infty} s(t)\, \delta(t - t_0)\, dt = s(t_0) \tag{2.3.4}$$
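The sifting property can be checked numerically if we stand in for the delta function with a very narrow, unit-area pulse. The sketch below is only an illustration under that assumption: the Gaussian pulse, its width, the test signal, and the function names are our own choices, not something defined in this text.

```python
import math


def narrow_gaussian(t, t0, width):
    """Unit-area Gaussian centered on t0; it approaches a delta as width -> 0."""
    return math.exp(-0.5 * ((t - t0) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))


def sift(signal, t0, width=1e-3, half_span=8.0, n=4001):
    """Riemann-sum approximation of the integral of signal(t) * delta(t - t0)."""
    start = t0 - half_span * width            # integrate over t0 +/- 8 widths
    dt = 2.0 * half_span * width / (n - 1)
    total = 0.0
    for i in range(n):
        t = start + i * dt
        total += signal(t) * narrow_gaussian(t, t0, width)
    return total * dt


def signal(t):
    """Illustrative continuous signal: a 4 Hz sine wave."""
    return math.sin(2.0 * math.pi * 4.0 * t)


t0 = 0.3
approx = sift(signal, t0)   # value "grabbed" by the narrow pulse
exact = signal(t0)          # value the sifting property predicts
print(f"sifted value = {approx:.5f}, s(t0) = {exact:.5f}")
```

The two numbers should agree more and more closely as the pulse width is reduced, which is the sense in which the delta function is a limiting abstraction.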

This property, along with equation 2.3.3 above, means that we can think of sifting or "grabbing" a single value of a continuous function at any time t0 as the multiplication of s(t) by a delta function located at t0 (followed by integration). The act of digitizing a continuous signal into discrete values can be viewed (imagined) as a multiplication of the signal by a string of delta functions spaced Δt apart (Figures 2.3 and 2.4). The string of delta functions so constructed is called a Dirac comb (expressed using the symbol III, pronounced shah). It looks like a comb you use in your hair except that it has an infinite number of teeth, Δt apart, each approaching zero width! It is expressed mathematically as

$$\mathrm{III}(t) = \sum_{k=-\infty}^{\infty} \delta(t - k\Delta t) \tag{2.3.5}$$

Using the Dirac comb, we can now cleanly express digitizing as the multiplication of the continuous signal by a Dirac comb,

$$s(t)\, \mathrm{III}(t) = \sum_{k=-\infty}^{\infty} s(k\Delta t)\, \delta(t - k\Delta t) \tag{2.3.6}$$

Theoretically, the sampled result is a string of delta functions each of whose area equals the value of the continuous signal at the digitized points, kΔt. Practically, we can view the digitized result as a series of spikes whose amplitudes are equal to the continuous signal at discrete points, kΔt. This view corresponds to viewing the Dirac comb as having teeth, each with unit amplitude, separated by Δt, even though this is formally incorrect.
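In practice a record is finite, so the "spike amplitude" view of digitizing amounts to evaluating the signal at t = kΔt for a limited range of k. Here is a minimal sketch, assuming the continuous signal is available as a Python function; the helper name digitize and the 1 Hz cosine are hypothetical choices made only for illustration.

```python
import math


def digitize(signal, dt, n_samples, t_start=0.0):
    """Return the sample times t_start + k*dt and the signal values at those times."""
    times = [t_start + k * dt for k in range(n_samples)]
    values = [signal(t) for t in times]
    return times, values


# Illustrative continuous signal: a 1 Hz cosine, sampled every 0.25 s.
times, values = digitize(lambda t: math.cos(2.0 * math.pi * t), dt=0.25, n_samples=5)
for t, v in zip(times, values):
    print(f"t = {t:.2f} s, s(t) = {v:+.3f}")
```

Each printed value plays the role of one tooth of the comb scaled by the signal at that instant.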

As mentioned above, the choice of the digitizing or sampling interval, Δt, has enormous consequences. This is because the frequency components that make up a signal are forever altered (aliased) by digitization when the interval is too large. We chose to defer a complete description of exactly what happens here to Appendix A. In Section 3 we discuss the frequency domain or spectral representation of digitizing a signal. A picture that is often used to illustrate aliasing is shown in Figure 2.3. Here a pure sine wave of 4 Hz is sampled improperly by a Dirac comb to yield an aliased 1 Hz signal. This is a big mistake. Figure 2.4 also illustrates digitizing as a multiplication by a Dirac comb and what the wrong choice of sampling interval does to our LANL SAGE magnetic signal. The sequence begins with 100 equally spaced points and proceeds to 50, 25, 11, and 6 points. Straight lines joining the digitized values are for aid in visualization only. The latter two results illustrate very serious aliasing.

Figure 2.3. Example of undersampling (aliasing) in the time domain, where the actual 4 Hz signal is improperly sampled, yielding a 1 Hz signal. The sampling process (digitizing) is shown as a multiplication (x) by a Dirac comb with Δt = 0.2 s spacing between "teeth."

Figure 2.4. Comparison of continuous LANL magnetic data with sampled data using different sampling intervals. The sampling process is shown as a multiplication (x) by Dirac combs with different Δx spacing between "teeth."
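The Figure 2.3 example can be reproduced with a few lines of Python. This is a sketch under our own assumptions (sine waves with zero phase at t = 0, eleven samples, and a helper we call sample_sine); it compares the samples of a 4 Hz sine taken every 0.2 s with those of a 1 Hz sine taken at the same times.

```python
import math


def sample_sine(freq_hz, dt, n_samples):
    """Samples of sin(2*pi*freq_hz*t) taken at t = k*dt, k = 0 .. n_samples-1."""
    return [math.sin(2.0 * math.pi * freq_hz * k * dt) for k in range(n_samples)]


dt = 0.2                                   # sampling interval in seconds
true_samples = sample_sine(4.0, dt, 11)    # the actual 4 Hz signal
alias_samples = sample_sine(1.0, dt, 11)   # its 1 Hz impostor

# For zero-phase sines the 4 Hz samples equal the 1 Hz samples with the sign
# reversed, so the sum of corresponding samples should vanish.
max_mismatch = max(abs(a + b) for a, b in zip(true_samples, alias_samples))
print(f"largest |s_4Hz + s_1Hz| over the samples: {max_mismatch:.2e}")
```

With this choice of phase the two sample sets agree apart from a reversed sign, so nothing in the digitized values distinguishes the 4 Hz signal from a 1 Hz one of opposite polarity.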

The sampling interval, Δt, defines a sampling frequency, fs = 1/Δt. Of more fundamental importance in the sampling theorem expressed by equation 2.3.1 is a frequency whose period is 2Δt. This frequency is called the Nyquist frequency, fN (Harry Nyquist was an American physicist and engineer who established the principles of digital sampling in a 1928 paper on telegraph theory):

$$f_N = \frac{1}{2\Delta t} = \frac{f_s}{2} \tag{2.3.7}$$

Figure 2.5. Contributions to aliased frequency domain spectra can be obtained by folding the "proper," correct amplitude spectrum at multiples of the Nyquist (folding) frequency, fN, 2fN,...

The Nyquist frequency is the highest frequency preserved in the digitized spectrum. This statement alone does not convey any alarm since you might not be interested in frequencies above fN. However, a totally unsuspected consequence of selecting a sampling interval which is too large (i.e., fN = 1/(2Δt) is too small) is that frequencies above fN contaminate the proper spectrum at frequencies equal to and below the Nyquist frequency. This is called aliasing. Figure 2.3 clearly illustrates how a low frequency (1 Hz) sinusoid has become an impostor (an alias) of the correct, high frequency, 4 Hz signal. Understanding this insidious behavior will rely on our description of the digitizing process as that of the continuous signal s(t) multiplied by the Dirac comb. This is not intuitive, at least not to anyone we know. Also not intuitive is why the Nyquist frequency is called the folding frequency. This part we will leave here without proof; by the end of Section 4 on Filtering everything will be perfectly clear (we hope).

Figure 2.5 illustrates a way to visualize aliasing by considering folds in the proper continuous (unsampled) amplitude spectrum at positions equal to fN, 2fN, 3fN, … The aliased spectrum is obtained by first folding the correct spectrum at fN and laying it back onto the interval from fN to 0. Spectral amplitudes from fN to 2fN add directly to the "proper" ones in the fN to 0 interval. Next, a fold is placed at 2fN and the spectrum from 2fN to 3fN is laid back onto the interval 0 to fN, and its amplitude is added onto the already aliased spectrum. Folding and adding proceeds over and over again, producing a grossly aliased final spectrum unless there is no signal content above fN. That is, no aliasing occurs if the sampling theorem is satisfied. Can you visualize how a 4 Hz spectral line is folded back at the Nyquist frequency, fN = 1/(2Δt) = 1/(2 × 0.2) = 2.5 Hz, to yield the 1 Hz alias presented in Figure 2.3? Check it out in Figure 2.6. The complete process is rigorously described in Appendix A.
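The folding recipe reduces to a short calculation. The following sketch, with a function name (alias_frequency) of our own choosing, assumes a single real spectral line at frequency f and returns where it lands in the interval 0 to fN after repeated folding. It says nothing about amplitudes, only about where a frequency ends up.

```python
def alias_frequency(f, dt):
    """Frequency (Hz) at which a spectral line at f appears after sampling at dt."""
    fs = 1.0 / dt               # sampling frequency
    f_nyquist = fs / 2.0        # folding (Nyquist) frequency
    f_mod = abs(f) % fs         # the fold pattern repeats every fs = 2*fN
    # Lines in the upper half of each fs-wide band are mirrored back about fN.
    return f_mod if f_mod <= f_nyquist else fs - f_mod


dt = 0.2  # sampling interval of the Figure 2.3 example, in seconds
for f in (1.0, 2.0, 4.0, 6.0, 9.0):
    print(f"{f:.1f} Hz appears at {alias_frequency(f, dt):.1f} Hz")
```

For f = 4 Hz and Δt = 0.2 s this gives 1 Hz, the alias of Figure 2.3, while any line at or below fN = 2.5 Hz is returned unchanged.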

From what we've presented so far, you could not possibly understand why aliasing can be described by the "folding process" shown in Figures 2.5 and 2.6. To understand this and the sampling theorem (and a host of other issues in digital geophysical analysis) we must explore the representation of signals in the frequency domain. This leads us to Spectral Analysis, the subject of Section 3.