2.3 Time (Space) Sampling
In digital recording, the signal amplitudes are sampled at discrete (fixed), equally spaced times or distances (as in Figures 2.2b and 2.2d). This simple step is very deceiving because you can completely destroy your experiment with the wrong choice of sampling interval. If the sampling interval is too small, you will record much more data than necessary, resulting in excessive storage requirements and redundant data. If the sampling interval is too large, it's much worse. In fact, the recorded data may be totally worthless since the entire spectral (frequency) content of the signal can be distorted.
These "phony data" are the result of the proper data being aliased. The sampling interval must be chosen to avoid or minimize aliasing. How is this done? In Appendix A we will add rigor to the answer to this question; for now, let's simply state the all-important sampling theorem and use it to answer the question. The sampling theorem requires that you know the highest frequency, fmax, that is in the data to be digitized. Technically, we require that the frequency content in the signal is zero above fmax. The sampling theorem states that:
A continuous signal s(t) can be constructed exactly from its digitized samples if the sampling interval, Δt (or Δx in distance), is chosen so that
$$\Delta t < \frac{1}{2 f_{max}} \qquad (2.3.1)$$
This can be rephrased to say that to avoid aliasing, there must be more than two samples per cycle for the highest frequency present in the signal. A special name, the Nyquist frequency, is given to the frequency 1/(2Δt) (or 1/(2Δx)).
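The rule "more than two samples per cycle" is easy to check before you record anything. The following is a minimal sketch, not a prescribed procedure: the function names are our own, and it reads the sampling theorem as the strict condition fmax < 1/(2Δt). The numbers (a 4 Hz signal and Δt = 0.2 s) are those of the Figure 2.3 example.

```python
def nyquist_frequency(dt):
    """Return the Nyquist frequency 1/(2*dt) for a sampling interval dt."""
    if dt <= 0:
        raise ValueError("sampling interval must be positive")
    return 1.0 / (2.0 * dt)


def satisfies_sampling_theorem(dt, f_max):
    """True if f_max gets more than two samples per cycle at interval dt."""
    return f_max < nyquist_frequency(dt)


# A 4 Hz signal sampled every 0.2 s: the Nyquist frequency is only 2.5 Hz.
print(nyquist_frequency(0.2))
print(satisfies_sampling_theorem(0.2, 4.0))  # undersampled
print(satisfies_sampling_theorem(0.1, 4.0))  # Nyquist is now 5 Hz
```

Halving the sampling interval from 0.2 s to 0.1 s raises the Nyquist frequency to 5 Hz, which is enough for a signal with no content above 4 Hz.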
We wanted to state the sampling theorem up front in this section; now we will discuss it in detail and illustrate its consequences. First, let us say that we will deal with a time domain signal s(t) which is continuous (the signal could just as well be in space, say s(x) or even s(x,y), the latter representing data covering an area). Assume that s(t) is digitized so that
$$s(t) \rightarrow s(k\Delta t), \qquad k = 0, \pm 1, \pm 2, \ldots, \pm\infty \qquad (2.3.2)$$
This is, of course, an impossible task since we can't possibly begin sampling at -infinity and continue to +infinity. We'll address this problem shortly, but for now, let's imagine that we have such a digitized sequence. Since we're in the imagining mood, let's imagine a very special function that isn't actually a function at all. It's called a delta function or Dirac delta function, expressed by the symbol δ(t). (Paul Dirac was a British physicist who was awarded the Nobel Prize in Physics in 1933. He introduced the delta function into quantum physics in the late 1920s.)
Dirac Delta Function, the Dirac Comb, and Aliasing
True functions such as s(t) have a value at any time t0, given by s(t0). The delta function, δ(t), is defined to be non-zero only where its argument is equal to zero. For example, δ(t) is non-zero only where t = 0, δ(t - t0) is non-zero only where t - t0 = 0 or at t = t0, etc. So, a delta function can be viewed to exist only at a point and, therefore, it has a width that approaches zero. However, the integral of a delta function (i.e., its area) is nicely defined to have a value of 1,
$$\int_{-\infty}^{\infty} \delta(t)\,dt = 1 \qquad (2.3.3)$$
This is obviously very strange since, for the area of a function which approaches zero width to be 1, the amplitude would have to approach infinity. As such, the delta function is a mathematical abstraction and not a true function. In geophysics, we often call it an impulse function or simply a spike. It has many practical uses for something so abstract, many more than you would guess. In the context of sampling geophysical signals, there is a property that we will employ, namely the "sifting property" of the delta function. This is expressed as
$$\int_{-\infty}^{\infty} s(t)\,\delta(t - t_0)\,dt = s(t_0) \qquad (2.3.4)$$
This property, along with equation 2.3.3 above, means that we can think of sifting or "grabbing" a single value of a continuous function at any time t0 as the multiplication of s(t) by a delta function located at t0. The act of digitizing a continuous signal into discrete values can be viewed (imagined) as a multiplication of s(t) by a string of delta functions spaced Δt apart (Figures 2.3 and 2.4). The string of delta functions so constructed is called a Dirac comb (expressed using the symbol III, pronounced shah). It looks like a comb you use in your hair except that it has an infinite number of teeth, Δt apart, each approaching zero width! It is expressed mathematically as
$$\mathrm{III}(t) = \sum_{k=-\infty}^{\infty} \delta(t - k\Delta t) \qquad (2.3.5)$$
Using the Dirac comb, we can now cleanly express digitizing as the multiplication of the continuous signal by a Dirac comb,
$$s(t)\,\mathrm{III}(t) = \sum_{k=-\infty}^{\infty} s(k\Delta t)\,\delta(t - k\Delta t) \qquad (2.3.6)$$
Theoretically, the sampled result is a string of delta functions each of whose area equals the value of the continuous signal at the digitized points, kΔt. Practically, we can view the digitized result as a series of spikes whose amplitudes are equal to the continuous signal at discrete points, kΔt. This view corresponds to viewing the Dirac comb as having teeth, each with unit amplitude, separated by Δt, even though this is formally incorrect.
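In practice, the "series of spikes" view amounts to nothing more than evaluating the signal at the times kΔt. The sketch below illustrates this for a finite number of samples, since we cannot run k from -infinity to +infinity. The function names and the 1 Hz cosine test signal are our own assumptions, chosen only for illustration.

```python
import math


def sample_signal(s, dt, n_samples, t_start=0.0):
    """Return (times, values) of s(t) at t_start + k*dt for k = 0..n_samples-1.

    Each value is what one tooth of the comb "sifts" out of the signal.
    """
    times = [t_start + k * dt for k in range(n_samples)]
    values = [s(t) for t in times]
    return times, values


def cosine_1hz(t):
    """Hypothetical test signal: a 1 Hz cosine of unit amplitude."""
    return math.cos(2.0 * math.pi * t)


times, values = sample_signal(cosine_1hz, dt=0.25, n_samples=5)
for t, v in zip(times, values):
    print(f"t = {t:.2f} s, s = {v:+.3f}")
```

With Δt = 0.25 s there are four samples per cycle of the 1 Hz cosine, so the samples step through its peak, zero crossing, trough, zero crossing, and peak.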
As mentioned above, the choice of the digitizing or sampling interval, Δt, has enormous consequences. This is because the frequency components that make up a signal are forever altered (aliased) by digitization. We chose to defer a complete description of exactly what happens here to Appendix A. In Section 3 we discuss the frequency domain or spectral representation of digitizing a signal. A picture that is often used to illustrate aliasing is shown in Figure 2.3. Here a pure sine wave of 4 Hz is sampled improperly by a Dirac comb to yield an aliased 1 Hz signal. This is a big mistake. Figure 2.4 also illustrates digitizing as a multiplication by a Dirac comb and what the wrong choice of sampling interval does to our LANL SAGE magnetic signal. The sequence begins with 100 equally spaced points and proceeds to 50, 25, 11, and 6 points. Straight lines joining the digitized values are for aid in visualization only. The latter two results illustrate very serious aliasing.
Figure 2.3. Example of undersampling (aliasing) in the time domain, where the actual 4 Hz signal is improperly sampled, yielding a 1 Hz signal. The sampling process (digitizing) is shown as a multiplication (x) by a Dirac comb with Δt = 0.2 s spacing between "teeth."
Figure 2.4. Comparison of continuous LANL magnetic data with sampled data using different sampling intervals. The sampling process is shown as a multiplication (x) by Dirac combs with different Δx spacing between "teeth."
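You can convince yourself numerically that the samples in the Figure 2.3 example cannot tell a 4 Hz signal from a 1 Hz one. The sketch below assumes sine waves of unit amplitude starting at zero phase and an arbitrary record of 11 samples; neither assumption comes from the figure itself. For that choice of phase the 1 Hz alias appears with reversed polarity, so we check that the two sets of samples sum to zero.

```python
import math

DT = 0.2        # sampling interval in seconds, as in Figure 2.3
N_SAMPLES = 11  # assumed record length (2 s); not taken from the figure


def sampled_sine(freq_hz, dt, n_samples):
    """Samples of sin(2*pi*f*t) at t = k*dt for k = 0..n_samples-1."""
    return [math.sin(2.0 * math.pi * freq_hz * k * dt) for k in range(n_samples)]


true_samples = sampled_sine(4.0, DT, N_SAMPLES)   # the actual 4 Hz signal
alias_samples = sampled_sine(1.0, DT, N_SAMPLES)  # the 1 Hz impostor

# With dt = 1/5 s: sin(2*pi*4*k/5) = sin(2*pi*k - 2*pi*k/5) = -sin(2*pi*1*k/5),
# so the 4 Hz samples equal the 1 Hz samples with the sign flipped.
max_mismatch = max(abs(a + b) for a, b in zip(true_samples, alias_samples))
print(f"largest |s_4Hz + s_1Hz| over the samples: {max_mismatch:.2e}")
```

The mismatch is at the level of floating-point rounding. Once the signal has been digitized, no amount of processing can tell you which of the two frequencies was really there.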
The sampling interval, Δt, defines a sampling frequency, fs = 1/Δt. Of more fundamental importance from the sampling theorem expressed by equation 2.3.1 is a frequency whose period is 2Δt. This frequency is called the Nyquist frequency, fN. (Harry Nyquist was an American physicist and engineer who established the principles of digital sampling in a 1928 paper on telegraph theory.)
$$f_N = \frac{1}{2\Delta t} \qquad (2.3.7)$$
The Nyquist frequency is the highest frequency preserved in the digitized spectrum. This statement does not convey any alarm since you might not be interested in frequencies above fN. However, a totally unsuspected consequence of selecting a sampling interval which is too large (i.e., fN = 1/(2Δt) is too small) is that frequencies above fN contaminate the proper spectrum at frequencies equal to and below the Nyquist frequency. This is called aliasing. Figure 2.3 clearly illustrates how a low frequency (1 Hz) sinusoid has become an impostor (an alias) of the correct, high frequency, 4 Hz signal. Understanding this insidious behavior will rely on our description of the digitizing process as that of the continuous signal s(t) multiplied by the Dirac comb. This is not intuitive, at least not to anyone we know. Also not intuitive is why the Nyquist frequency is called the folding frequency. This part we will leave here without proof; by the end of Section 4 on Filtering everything will be perfectly clear (we hope).
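For readers who want to see the contamination in a spectrum right away, the sketch below computes the amplitude spectrum of the undersampled 4 Hz sine of Figure 2.3 with a direct (slow) discrete Fourier transform. It borrows ahead from the spectral ideas of Section 3. The record length of 50 samples and the function name are our own assumptions.

```python
import cmath
import math


def dft_amplitudes(samples):
    """Amplitude spectrum |S_m| for m = 0..N//2 using a direct (slow) DFT."""
    n = len(samples)
    amps = []
    for m in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * m * k / n)
                    for k, x in enumerate(samples))
        amps.append(abs(total))
    return amps


dt = 0.2  # sampling interval in seconds, as in Figure 2.3
n = 50    # assumed record length: 50 samples spanning 10 s
samples = [math.sin(2.0 * math.pi * 4.0 * k * dt) for k in range(n)]

amps = dft_amplitudes(samples)
freqs = [m / (n * dt) for m in range(n // 2 + 1)]  # 0 Hz up to the Nyquist frequency
peak_freq = freqs[amps.index(max(amps))]
print(f"Nyquist frequency: {freqs[-1]:.2f} Hz, spectral peak at {peak_freq:.2f} Hz")
```

The spectrum only extends to fN = 2.5 Hz, so the 4 Hz energy has nowhere proper to go. It shows up as a peak at 1 Hz instead.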
Figure 2.5 illustrates a way to visualize aliasing by considering folds in the proper continuous (unsampled) amplitude spectrum at positions equal to fN, 2fN, 3fN, and so on. The aliased spectrum is obtained by first folding the correct spectrum at fN and laying it back onto the interval from fN to 0. Spectral amplitudes from fN to 2fN add directly to the "proper" ones in the fN to 0 interval. Next, a fold is placed at 2fN and the spectrum from 2fN to 3fN is laid back onto the interval 0 to fN and its amplitude is added onto the already aliased spectrum. Folding and adding proceeds over and over again, producing a grossly aliased final spectrum unless there is no signal content above fN. That is, no aliasing occurs if the sampling theorem is satisfied. Can you visualize how a 4 Hz spectral line is folded back at the Nyquist frequency, fN = 1/(2Δt) = 1/(2 x 0.2) = 2.5 Hz, to yield the 1 Hz alias presented in Figure 2.3? Check it out in Figure 2.6. The complete process is rigorously described in Appendix A.
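The folding recipe can be written as a few lines of arithmetic that tell you where any single frequency ends up after sampling. This is a sketch under our own naming, not the rigorous treatment of Appendix A. It tracks only the location of a spectral line, not how the folded amplitudes add.

```python
def aliased_frequency(f, dt):
    """Fold frequency f (Hz) into the band 0..fN for sampling interval dt (s)."""
    if dt <= 0 or f < 0:
        raise ValueError("dt must be positive and f non-negative")
    f_nyquist = 1.0 / (2.0 * dt)
    # The folded pattern repeats every 2*fN, so reduce f modulo 2*fN first.
    f_reduced = f % (2.0 * f_nyquist)
    # Frequencies in the upper half of that period fold back about fN.
    if f_reduced > f_nyquist:
        return 2.0 * f_nyquist - f_reduced
    return f_reduced


# Sampling interval of Figure 2.3: fN = 2.5 Hz.
for f in (1.0, 2.5, 4.0, 6.0, 9.0):
    print(f"{f:4.1f} Hz -> {aliased_frequency(f, 0.2):.1f} Hz")
```

With Δt = 0.2 s, spectral lines at 4, 6, and 9 Hz all land on 1 Hz, while 1 Hz and 2.5 Hz stay where they are. That is the 4 Hz to 1 Hz fold of Figure 2.6, together with two more impostors you would get from higher folds.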
From what we've presented so far, you could not possibly understand why aliasing can be described by the "folding process" shown in Figures 2.5 and 2.6. To understand this and the sampling theorem (and a host of other issues in digital geophysical analysis) we must explore the representation of signals in the frequency domain. This leads us to Spectral Analysis, the subject of Section 3.