4.6 Correlation

If, prior to the sliding, multiplication, and integration in convolution (equation 4.2.1a), we do not flip either of the functions, the process produces the cross-correlation function or integral,

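$$\phi_{sh}(\tau) = \int_{-\infty}^{\infty} s(t)\,h(t+\tau)\,dt \tag{4.6.1a}$$

$$\phi_{sh}(\tau) = s(\tau) \star h(\tau) \tag{4.6.1b}$$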

SAGE SAYS:

An italicized phi (φ) is used here for the correlation operation to distinguish it from the plain φ used for phase throughout earlier sections. Both forms of phi are used in the same equation later in this section, so be prepared.

Here, the positive sign in the integrand says that h(t) is not reversed but is simply shifted. The time shift, τ, is called the lag time, and the cross-correlation result is usually plotted as a function of the lag time. A common convention is to plot the lag time as positive when the second function is moved to the left with respect to the first function. The lag time is zero when the functions are shifted so that their values at t = 0 coincide, whether the functions are two-sided (± time or space) or one-sided, usually indexed from zero. Whereas the shorthand notation with an asterisk (*) is used for convolution (equation 4.2.1b), the five-pointed star (⋆), or pentagram, is often used as above to denote correlation. The correlation process yielding φ_sh(τ) is summarized as:
  1. Shift h(t) by the lag time τ, giving h(t + τ).
  2. Multiply the shifted h(t + τ) by s(t), obtaining s(t)h(t + τ).
  3. Integrate s(t)h(t + τ) over t to find the area under the product, which is the value of the cross-correlation at the single lag time τ.
  4. Repeat steps 1-3 for all positive, zero, and negative values of the lag time τ where overlap occurs.
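
The four steps above can be sketched for sampled signals, where the integral becomes a sum over samples with unit spacing. This is only a minimal illustration: the function name cross_correlate, the test sequences, and the convention that both sequences start at index 0 are assumptions made for this sketch, not part of the text.

```python
def cross_correlate(s, h):
    """Return {lag: value} with value = sum over t of s[t] * h[t + lag].

    Assumptions of this sketch: both sequences are one-sided, their first
    sample sits at index 0, and the sampling interval is one unit.
    """
    result = {}
    # Step 4: visit every lag at which the two sequences overlap.
    for lag in range(-(len(s) - 1), len(h)):
        total = 0.0
        for t, s_value in enumerate(s):
            shifted_index = t + lag            # Step 1: shift h by the lag
            if 0 <= shifted_index < len(h):    # h is zero outside its samples
                total += s_value * h[shifted_index]  # Step 2: multiply
        result[lag] = total                    # Step 3: sum (discrete "area")
    return result


if __name__ == "__main__":
    phi = cross_correlate([1.0, 2.0], [0.0, 1.0, 3.0])
    for lag, value in sorted(phi.items()):
        print(lag, value)
```

Returning a dictionary keyed by lag keeps the negative, zero, and positive lags explicit instead of hiding them in list positions.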

The cross-correlation operation is not commutative (s ⋆ h ≠ h ⋆ s), unlike the convolution operation, where s * h = h * s. In fact,

$$\phi_{sh}(\tau) = \phi_{hs}(-\tau) \tag{4.6.2}$$

So, interchanging the order of the correlation operation reverses the cross-correlation result in lag time. The correlation can be written as a convolution by convolving the time-reverse of the first function with the second, i.e.,

$$\phi_{sh}(\tau) = s(-\tau) * h(\tau). \tag{4.6.3}$$

From this we realize that if s(t) is an even function, cross-correlation and convolution are identical operations, since flipping an even function prior to convolution has no effect. If only h(t) is even, the same holds for the correlation taken in the opposite order, h ⋆ s.
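
Both relations (equations 4.6.2 and 4.6.3) follow from a change of variable in the defining integral. The derivation below is a sketch that assumes the definition implied by the four steps listed earlier, with s(t) held fixed and h(t) shifted by the lag time; u is a dummy variable introduced here.

```latex
\begin{align*}
\phi_{sh}(\tau) &= \int_{-\infty}^{\infty} s(t)\,h(t+\tau)\,dt
  && \text{(definition)}\\
 &= \int_{-\infty}^{\infty} h(u)\,s(u-\tau)\,du = \phi_{hs}(-\tau)
  && (u = t+\tau)\\[4pt]
\phi_{sh}(\tau) &= \int_{-\infty}^{\infty} s(-u)\,h(\tau-u)\,du = s(-\tau) * h(\tau)
  && (u = -t)
\end{align*}
```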

The cross-correlation is largest at the lag where the shifted (lagged) version of one function is most similar to the other function. There, the products of the two functions are mostly positive. Where the products are a mix of positive and negative values, integration yields smaller cross-correlation values. For perfectly uncorrelated random functions, the cross-correlation is zero.

A special case of the cross-correlation function is when both functions in the integrand of equation 4.6.1a are identical, i.e.,

$$\phi_{ss}(\tau) = \int_{-\infty}^{\infty} s(t)\,s(t+\tau)\,dt \tag{4.6.4}$$

This defines the correlation of a signal with itself, called the autocorrelation, which is an even function about τ = 0,

$$\phi_{ss}(\tau) = \phi_{ss}(-\tau) \tag{4.6.5}$$

The autocorrelation can also be written as a convolution of a function and its time-reversed version, yielding

$$\phi_{ss}(\tau) = s(-\tau) * s(\tau) \tag{4.6.6}$$

The cross-correlation and autocorrelation functions are often normalized so that they are equal to one at the origin, τ = 0. In this case the non-normalized definitions above (equations 4.6.1a and 4.6.4) define what are called the cross-covariance and autocovariance functions, respectively. The terms cross-correlation and autocorrelation are then reserved for the normalized functions.
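
The text does not spell out the normalizing factor, so the sketch below assumes one common choice: the autocovariance is divided by its zero-lag value, and the cross-covariance is divided by the square root of the product of the two zero-lag autocovariances. With that assumed scaling the autocorrelation is exactly one at zero lag, and the cross-correlation cannot exceed one in magnitude. All function names are hypothetical.

```python
import math


def correlate(a, b):
    """List of sum_t a[t] * b[t + lag] for lag = -(len(a) - 1) .. len(b) - 1."""
    out = []
    for lag in range(-(len(a) - 1), len(b)):
        out.append(sum(a[t] * b[t + lag]
                       for t in range(len(a)) if 0 <= t + lag < len(b)))
    return out


def zero_lag_energy(x):
    """Value of the non-normalized autocorrelation of x at lag 0."""
    return correlate(x, x)[len(x) - 1]   # lag 0 sits at index len(x) - 1


def normalized_autocorrelation(s):
    """Autocovariance scaled so that the value at lag 0 is one."""
    energy = zero_lag_energy(s)
    return [value / energy for value in correlate(s, s)]


def normalized_cross_correlation(s, h):
    """Cross-covariance scaled by sqrt(phi_ss(0) * phi_hh(0)) (assumed choice)."""
    scale = math.sqrt(zero_lag_energy(s) * zero_lag_energy(h))
    return [value / scale for value in correlate(s, h)]


if __name__ == "__main__":
    print(normalized_autocorrelation([2.0, -2.0, 1.0]))
    print(normalized_cross_correlation([2.0, -2.0, 1.0], [1.0, 3.0, 0.5, -1.0]))
```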

Discrete Correlation

When both s(t) and h(t) are digital functions having unit sampling interval, the cross-correlation and autocorrelation integrals become summations:

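$$\phi_{sh}(\tau) = \sum_{t} s_t\,h_{t+\tau} \tag{4.6.7}$$

and

$$\phi_{ss}(\tau) = \sum_{t} s_t\,s_{t+\tau} \tag{4.6.8}$$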

respectively. Figure 4.9 illustrates the discrete cross-correlation process between the same two one-sided functions, s_t = [2, -2, 1] and h_t = [1, 3, 1/2, -1], that were convolved together in Section 4.3, Discrete Convolution. Just for fun, we’ve also done the same cross-correlation operation in Figure 4.10 by first reversing s_t and convolving with h_t to confirm equation 4.6.3. Figure 4.11 displays the symmetric (even) output of the autocorrelation process using s_t.


Figure 4.9. Action cross-correlation of (2, -2, 1) and (1, 3, 1/2, -1) yields (1, 1, -3.5, 4, 3, -2).


Figure 4.10. Action convolution calculates cross-correlation of (2, -2, 1) and (1, 3, 1/2, -1) yielding (1, 1, -3.5, 4, 3, -2).


Figure 4.11. Action autocorrelation of (2, -2, 1) and (2, -2, 1) yields (2, -6, 9, -6, 2).
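
The numbers in Figures 4.9 to 4.11 can be checked with a few lines of code. The sketch below is an illustration under stated assumptions rather than the procedure used to draw the figures: the helper names correlate and convolve are hypothetical, both sequences are taken to start at index 0, and outputs are listed from the most negative lag to the most positive.

```python
def correlate(a, b):
    """List of sum_t a[t] * b[t + lag] for lag = -(len(a) - 1) .. len(b) - 1."""
    out = []
    for lag in range(-(len(a) - 1), len(b)):
        out.append(sum(a[t] * b[t + lag]
                       for t in range(len(a)) if 0 <= t + lag < len(b)))
    return out


def convolve(a, b):
    """Ordinary discrete convolution of two one-sided sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_value in enumerate(a):
        for j, b_value in enumerate(b):
            out[i + j] += a_value * b_value
    return out


s = [2.0, -2.0, 1.0]
h = [1.0, 3.0, 0.5, -1.0]

phi_sh = correlate(s, h)                    # lags -2 .. 3 (Figure 4.9)
phi_via_convolution = convolve(s[::-1], h)  # reversed s convolved with h (Figure 4.10)
phi_hs = correlate(h, s)                    # lags -3 .. 2, opposite order
phi_ss = correlate(s, s)                    # lags -2 .. 2 (Figure 4.11)

print(phi_sh)               # [1.0, 1.0, -3.5, 4.0, 3.0, -2.0]
print(phi_via_convolution)  # [1.0, 1.0, -3.5, 4.0, 3.0, -2.0]
print(phi_hs)               # [-2.0, 3.0, 4.0, -3.5, 1.0, 1.0]
print(phi_ss)               # [2.0, -6.0, 9.0, -6.0, 2.0]
```

Swapping the order of the two sequences returns the same values in reverse order, and the autocorrelation is symmetric about its zero-lag value of 9.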

Uses of Cross- and Autocorrelation

Cross-correlation is used to measure the similarity between two signals, to detect a known signal in a noisy one, or to search for cyclic data. One application is finding the time for a known signal to pass through a system, provided the signal is not grossly distorted during the transit. The lag time at the maximum value of the cross-correlation is the time shift caused by the system. This application of cross-correlation is at the heart of processing vibroseis data. Vibroseis is the most extensively used land seismic exploration technique. We’ll expand on this application in Section 5 where we discuss SAGE vibroseis data.
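
The delay-measurement idea can be sketched with a toy example. Everything specific here is invented for illustration: the short "known" signal, the 3-sample delay, the 0.5 amplitude scaling, and the helper name cross_correlate. Real vibroseis sweeps are far longer and the recordings are noisy.

```python
def cross_correlate(s, h):
    """Return {lag: sum over t of s[t] * h[t + lag]} for all overlapping lags."""
    return {
        lag: sum(s[t] * h[t + lag]
                 for t in range(len(s)) if 0 <= t + lag < len(h))
        for lag in range(-(len(s) - 1), len(h))
    }


# Hypothetical known input signal (unit sampling interval).
known = [0.0, 1.0, -2.0, 1.0, 0.0]

# Hypothetical system output: the same shape, delayed by 3 samples and
# scaled by 0.5, followed by two trailing zeros.
delay = 3
recorded = [0.0] * delay + [0.5 * x for x in known] + [0.0] * 2

phi = cross_correlate(known, recorded)

# The lag at which the cross-correlation peaks estimates the time shift.
best_lag = max(phi, key=phi.get)
print(best_lag)  # 3
```

The amplitude scaling changes the height of the peak but not its position, which is why the lag of the maximum is the quantity of interest.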

The autocorrelation function is used in geophysics to find hidden periodicities such as occur in multipath reflection seismic records, to compute power spectra, and to perform a "defiltering" operation called deconvolution. It also appears in vibroseis processing.

Cross-Correlation and Autocorrelation in the Frequency Domain

The convolution theorem in Section 4.4 can be used to find the cross-correlation in the frequency domain, Φ_SH(f). Namely,

$$\phi_{sh}(\tau) = s(-\tau) * h(\tau) \tag{4.6.9a}$$

yields

$$\Phi_{SH}(f) = S^{*}(f)\,H(f). \tag{4.6.9b}$$

The superscript * in equation 4.6.9b indicates the complex conjugate. As before, this means that the sign of the imaginary part of S(f) is reversed while the real part is unchanged. The Fourier transform of the cross-correlation (equation 4.6.9b) is called the cross energy density spectrum, the cross power, or the cross spectrum. From equation 4.6.9b we realize that the phase spectrum of the cross power is the difference between the respective phase spectra, φ_H(f) - φ_S(f). The amplitude spectrum is the product of the two amplitude spectra.
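
A small numerical check of the cross-correlation theorem is sketched below, using a direct (slow) discrete Fourier transform written with the standard library only. Several things are assumptions of this sketch rather than statements from the text: the forward transform uses a negative exponent, the sequences are zero-padded to the full output length so that the circular lags do not overlap, and the helper names dft and idft are hypothetical.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform (forward, negative exponent)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft, including the 1/n scaling."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


s = [2.0, -2.0, 1.0]
h = [1.0, 3.0, 0.5, -1.0]
n = len(s) + len(h) - 1               # 6 samples holds all six lags

S = dft(s + [0.0] * (n - len(s)))
H = dft(h + [0.0] * (n - len(h)))

# Cross spectrum: conjugate of S times H.
cross_spectrum = [S_k.conjugate() * H_k for S_k, H_k in zip(S, H)]
circular = [value.real for value in idft(cross_spectrum)]

# Indices 0..3 hold lags 0..3; the last two indices hold lags -2 and -1.
phi_sh = circular[-(len(s) - 1):] + circular[:len(h)]
print([round(value, 6) for value in phi_sh])
```

Up to rounding error, the reordered output agrees with the time-domain result (1, 1, -3.5, 4, 3, -2), and the magnitude of each cross-spectrum sample equals the product of the two amplitude spectra at that frequency.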

By analogy to the convolution theorem in Section 4.4, the relationship between equations 4.6.9a and 4.6.9b is called the cross-correlation theorem. The counterpart for the autocorrelation is called the autocorrelation theorem,

$$\Phi_{SS}(f) = S^{*}(f)\,S(f) = |S(f)|^{2} \tag{4.6.10}$$

The latter formula shows that the autocorrelation function and the energy density spectrum (also called the autopower or power spectrum) from Section 3.2 form the Fourier transform pair

$$\phi_{ss}(\tau) \;\Longleftrightarrow\; |S(f)|^{2} \tag{4.6.11}$$

where the phase spectrum is zero. Therefore, all signals with the same amplitude spectrum but different phase spectra have identical autocorrelation functions. Unless the function is zero-phase (i.e., it is an even function in time or space), there is a loss of information in the autocorrelation function, since it does not "carry" phase information. In these cases, the original function cannot be recovered from its autocorrelation.
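
The loss of phase information can be illustrated with a signal and its time reverse. For a real signal, time reversal conjugates the spectrum, so the amplitude spectrum is unchanged while the phase spectrum changes sign. The sketch below, with a hypothetical helper named autocorrelate, shows that the two different sequences share one autocorrelation.

```python
def autocorrelate(s):
    """List of sum_t s[t] * s[t + lag] for lag = -(len(s) - 1) .. len(s) - 1."""
    n = len(s)
    return [sum(s[t] * s[t + lag] for t in range(n) if 0 <= t + lag < n)
            for lag in range(-(n - 1), n)]


signal = [2.0, -2.0, 1.0]
reversed_signal = signal[::-1]     # [1.0, -2.0, 2.0], a different waveform

print(autocorrelate(signal))           # [2.0, -6.0, 9.0, -6.0, 2.0]
print(autocorrelate(reversed_signal))  # [2.0, -6.0, 9.0, -6.0, 2.0]
```

Given only the autocorrelation (2, -6, 9, -6, 2), there is no way to tell which of the two sequences produced it.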