## Correlation

Correlation is the foundation on which the whole structure of digital communications is built. In fact, correlation is the heart of a digital communication system, not only for data detection but for parameter estimation of various kinds as well. Throughout, we will find recurring reminders of this fact.

As a start, recall from the article on the Discrete Fourier Transform that each DFT output is just a sum of term-by-term products between an input signal and a cosine/sine wave, which is actually a computation of correlation. Later, we will learn that to detect the transmitted bits at the receiver, correlation is utilized to select the most likely candidate. Moreover, estimates of the timing, frequency and phase of a received signal are extracted through judicious application of correlation for synchronization, as are channel estimates for equalization. Don't worry if you did not understand the last sentence, as we will have plenty of opportunity on this website to learn about these topics.

By definition, correlation is a measure of similarity between two signals. In our everyday life, we recognize something by running in our heads its correlation with what we know. Correlation plays such a vital and deep role in diverse areas of our life, be it science, sports, economics, business, marketing, criminology or psychology, that a complete book can be devoted to this topic.

"The world is full of obvious things which nobody by any chance ever observes."

Holmes to Watson – The Hound of the Baskervilles

For all of Sherlock Holmes’ inferences, his next step after observation was always correlation. For example, he accurately described Dr James Mortimer’s dog through correlating some observations with templates in his mind:

" and the marks of his teeth are very plainly visible. The dog’s jaw, as shown in the space between these marks, is too broad in my opinion for a terrier and not broad enough for a mastiff. It may have been — yes, by Jove, it is a curly-haired spaniel."

Sherlock Holmes – The Hound of the Baskervilles

As in the case of convolution, we start with real signals and the case of complex signals will be discussed later.

## Correlation of Real Signals

The objective of correlation between two signals is to measure the degree to which those two signals are similar to each other. Mathematically, the correlation between two signals $s[n]$ and $h[n]$ is defined as

$$r_{sh}[n] = \sum_{m=-\infty}^{\infty} s[m]\, h[m-n] \qquad (1)$$

where the above sum is computed for each $n$ from $-\infty$ to $+\infty$, and the subscripts $s$ and $h$ represent the order in which the signals appear on the right hand side. We denote the correlation operation by "$\star$" as

$$r_{sh}[n] = s[n] \star h[n] \qquad (2)$$

Note that, unlike convolution,

$$s[n] \star h[n] \neq h[n] \star s[n] \qquad (3)$$

This can be verified by plugging $r_{hs}[n]$ into Eq (1), which yields $r_{hs}[n] = \sum_{m} h[m]\, s[m-n]$, and hence $r_{hs}[n] \neq r_{sh}[n]$ in general.

Nevertheless, it can be deduced that Eq (1) is equivalent to

$$r_{sh}[n] = \sum_{m=-\infty}^{\infty} s[m+n]\, h[m] \qquad (4)$$

Now we can say that

$$r_{hs}[n] = r_{sh}[-n]$$

In terms of conveying information, there is not much difference and one is just a flipped version of the other.
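To see Eqs (1)–(4) in action, here is a small numerical sketch (the two signals are made-up examples, not from this article):

```python
# Compute r_sh[n] and r_hs[n] directly from the definition
# r_sh[n] = sum_m s[m] h[m - n], for illustrative signals s and h.
import numpy as np

s = np.array([1.0, 2.0, 3.0])
h = np.array([2.0, -1.0, 1.0])

def corr(s, h):
    """Correlation r_sh[n] for lags n = -(len(h)-1), ..., len(s)-1."""
    N, M = len(s), len(h)
    out = []
    for n in range(-(M - 1), N):
        acc = 0.0
        for m in range(N):
            if 0 <= m - n < M:
                acc += s[m] * h[m - n]
        out.append(acc)
    return np.array(out)

r_sh = corr(s, h)
r_hs = corr(h, s)

# Unlike convolution, correlation is not commutative; instead r_hs[n] = r_sh[-n],
# i.e., one result is just the flipped version of the other.
assert np.allclose(r_hs, r_sh[::-1])
```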

Correlation computation

Comparing Eq (1) with the convolution equation, it is evident that

$$r_{sh}[n] = s[n] * h[-n] \qquad (5)$$

Therefore, from the viewpoint of the conventional method, computing the correlation between two signals is very similar to their convolution, except that there is no flipping of one signal. This is because the time reversal $h[-n]$ flips the signal once, and convolution flips it again, hence bringing the original signal back.

From the viewpoint of the intuitive method, it is clear that a negative sign with the NOW instant, $n = 0$, turns the future into the past and the past into the future. Consequently, the last sample of the original signal arrives first, since it has become the farthest past.

Except for this difference, the correlation of real signals is very similar to convolution, and the discussion on convolution accordingly applies here as well. For complex signals, there is another remarkable difference between the two, which we discuss next.
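The equivalence in Eq (5) can be checked numerically — a sketch with illustrative signals:

```python
# Correlation equals convolution with one signal flipped: convolution itself
# flips the second signal, so the two flips cancel.
import numpy as np

s = np.array([1.0, 2.0, 3.0])
h = np.array([2.0, -1.0, 1.0])

corr_via_conv = np.convolve(s, h[::-1])          # flip h, then convolve
corr_direct   = np.correlate(s, h, mode="full")  # NumPy's sliding correlation

assert np.allclose(corr_via_conv, corr_direct)
```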

An example of correlation between the same two signals as in the convolution example is shown in the Figure below, where the result $r_{sh}[n]$ is shown for each $n$.

## Correlation of Complex Signals

Correlation between two complex signals $s[n]$ and $h[n]$ can be understood by writing Eq (1) in $IQ$ form. However, another difference from convolution is that one signal is conjugated:

$$r_{sh}[n] = \sum_{m=-\infty}^{\infty} s[m]\, h^*[m-n] \qquad (6)$$

where the conjugate of a signal was defined in the article on complex numbers. The above equation can be decomposed as

$$r_{sh}[n] = r_{sh,I}[n] + j\, r_{sh,Q}[n] \qquad (7)$$

The actual computations can be written as

$$r_{sh,I}[n] = \sum_{m=-\infty}^{\infty} \left\{ s_I[m]\, h_I[m-n] + s_Q[m]\, h_Q[m-n] \right\}$$
$$r_{sh,Q}[n] = \sum_{m=-\infty}^{\infty} \left\{ s_Q[m]\, h_I[m-n] - s_I[m]\, h_Q[m-n] \right\} \qquad (8)$$

Due to the identity $\cos A\cos B + \sin A\sin B = \cos(A-B)$, the positive sign in the $I$ term indicates that the phases of the two aligned-axes terms are actually getting subtracted. Obviously, the identity applies in the above equations only if the magnitude can be extracted as a common term, but the concept of phase alignment still holds. Similarly, the identity $\sin A\cos B - \cos A\sin B = \sin(A-B)$ implies that the phases of the two cross-axes terms are also getting subtracted in the $Q$ expression. Hence, a complex correlation can be described as a process that

• computes four real correlations: $s_I \star h_I$, $s_Q \star h_Q$, $s_Q \star h_I$ and $s_I \star h_Q$,
• sums the aligned-axes correlations ($s_I \star h_I + s_Q \star h_Q$), thereby phase anti-aligning (subtracting the phases of) the two terms, to obtain the $I$ component, and
• subtracts the cross-axes correlations ($s_Q \star h_I - s_I \star h_Q$) to obtain the $Q$ component.
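The three-step description above can be verified with a short sketch (the $I$ and $Q$ sequences are arbitrary illustrative values; NumPy's `correlate` conjugates its second argument, matching Eq (6)):

```python
# A complex correlation assembled from the four real correlations of Eq (8).
import numpy as np

sI = np.array([1.0, 0.5, -1.0]); sQ = np.array([0.0, 1.0, 0.5])
hI = np.array([0.5, -0.5, 1.0]); hQ = np.array([1.0, 0.0, -0.5])
s = sI + 1j * sQ
h = hI + 1j * hQ

# np.correlate conjugates the second argument, matching Eq (6)
r = np.correlate(s, h, mode="full")

# The four real correlations, combined per Eq (8)
rI = np.correlate(sI, hI, "full") + np.correlate(sQ, hQ, "full")
rQ = np.correlate(sQ, hI, "full") - np.correlate(sI, hQ, "full")

assert np.allclose(r, rI + 1j * rQ)
```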

Now it can be inferred why a conjugate was required in the definition of complex correlation but not in complex convolution. The purpose of correlation is to extract the degree of similarity between two signals, and whenever $h[n]$ is close to $s[n]$, the phase subtraction aligns all the product terms in the same direction, thus maximizing the correlation output.

## Correlation and Frequency Domain

Just like convolution, there is an interesting interpretation of correlation in the frequency domain. As before, the DFT works with circular shifts only, due to the way both the time and frequency domain sequences are defined within a range of $N$ samples, $0 \le n, k \le N-1$.

As always, we utilize the definition of the DFT, applying it to Eq (1). The derivation is similar to that for convolution.

Circular correlation between two signals in the time domain is equivalent to multiplication of the DFT of the first signal with the conjugate of the DFT of the second signal in the frequency domain:

$$r_{sh}[n] \quad\xrightarrow{\textrm{DFT}}\quad S[k]\, H^*[k] \qquad (9)$$

and the relation between $r_{sh}[n]$ and $S[k]\,H^*[k]$ can be established through the DFT definition.
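A quick numerical check of Eq (9), using random illustrative signals and the FFT as the DFT:

```python
# Circular correlation in time equals S[k] * conj(H[k]) in frequency.
import numpy as np

rng = np.random.default_rng(0)
N = 8
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
h = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Direct circular correlation: r[n] = sum_m s[m] conj(h[(m - n) mod N])
r = np.array([sum(s[m] * np.conj(h[(m - n) % N]) for m in range(N))
              for n in range(N)])

# Frequency-domain route
R = np.fft.fft(s) * np.conj(np.fft.fft(h))
assert np.allclose(r, np.fft.ifft(R))
```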

## Cross and Auto-Correlation

The correlation discussed above between two different signals is naturally called cross-correlation. When a signal is correlated with itself, it is called auto-correlation. It is defined by setting $h[n] = s[n]$ in Eq (6) as

$$r_{ss}[n] = \sum_{m=-\infty}^{\infty} s[m]\, s^*[m-n] \qquad (10)$$

An interesting fact to note is that

$$r_{ss}[0] = \sum_{m=-\infty}^{\infty} \left| s[m] \right|^2 = E_s \qquad (11)$$

which is the energy of the signal $s[n]$.

Remember that another signal can have a large amount of energy, such that the result of its cross-correlation with $s[n]$ can be greater than the auto-correlation of $s[n]$, which intuitively should not happen. Normalized correlation is defined in Eq (12) in a way that the maximum value can only occur for the correlation of a signal with itself:

$$\rho_{sh}[n] = \frac{r_{sh}[n]}{\sqrt{E_s\, E_h}} \qquad (12)$$

In this case, regardless of the energy in the other signal, the normalized cross-correlation between two different signals cannot be greater than the normalized auto-correlation of a signal, due to both energies appearing in the denominator.
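A small sketch of why the normalization in Eq (12) matters, with illustrative signals where the second signal carries much more energy:

```python
# A high-energy signal can out-score the auto-correlation in raw correlation,
# but never after normalizing by both signal energies (Eq 12).
import numpy as np

s = np.array([1.0, -1.0, 1.0])
g = 10.0 * np.array([1.0, 1.0, -1.0])   # much more energy than s

raw_auto  = np.correlate(s, s, "full").max()
raw_cross = np.abs(np.correlate(s, g, "full")).max()
assert raw_cross > raw_auto             # raw cross-correlation "wins" unfairly

Es, Eg = np.sum(s**2), np.sum(g**2)
norm_auto  = raw_auto / np.sqrt(Es * Es)
norm_cross = raw_cross / np.sqrt(Es * Eg)
assert norm_auto >= norm_cross          # normalization restores the bound
```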

## Spectral Density

Taking the DFT of the auto-correlation of a signal and utilizing Eq (9) with $h[n] = s[n]$, we get

$$r_{ss}[n] \quad\xrightarrow{\textrm{DFT}}\quad S[k]\, S^*[k] = \left| S[k] \right|^2 \qquad (13)$$

The expression $|S[k]|^2$ is called the spectral density, because the Parseval relation in the article on DFT Examples relates the signal energy in the time domain to that in the frequency domain:

$$E_s = \sum_{n=0}^{N-1} \left| s[n] \right|^2 = \frac{1}{N}\sum_{k=0}^{N-1} \left| S[k] \right|^2$$

Thus, the energy of a signal can be obtained by summing the energy in each frequency bin (up to a normalizing constant $1/N$). Accordingly, $|S[k]|^2$ can be termed the energy per spectral bin, or spectral density.

From the above discussion, there are two ways to find the spectral density of a signal:

1. Take the magnitude squared of the DFT of a signal.
2. Take the DFT of the signal auto-correlation.
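Both routes can be confirmed numerically (a sketch with a random illustrative signal; the auto-correlation here is circular, as Eq (9) requires):

```python
# |S[k]|^2 equals the DFT of the circular auto-correlation (Eq 13),
# and Parseval ties the total energy in both domains together.
import numpy as np

rng = np.random.default_rng(1)
N = 16
s = rng.standard_normal(N)

S = np.fft.fft(s)
density_1 = np.abs(S) ** 2                      # route 1: |DFT|^2

# Route 2: circular auto-correlation, then its DFT
r_ss = np.array([sum(s[m] * s[(m - n) % N] for m in range(N))
                 for n in range(N)])
density_2 = np.fft.fft(r_ss)

assert np.allclose(density_1, density_2)

# Parseval: time-domain energy equals spectral energy up to 1/N
assert np.isclose(np.sum(s**2), np.sum(density_1) / N)
```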


## Linear Systems

A linear system implies that if two inputs are scaled and summed together to form a new input, the new output of the system is also a scaled sum of their individual outputs.

[Scaling] For scaling to hold, if

$$s[n] \quad\rightarrow\quad r[n]$$

then

$$\alpha\, s[n] \quad\rightarrow\quad \alpha\, r[n]$$

where $\alpha$ is any scalar.

[Addition] When two such inputs are added together, the output should be the sum of their individual outputs:

$$s_1[n] + s_2[n] \quad\rightarrow\quad r_1[n] + r_2[n]$$

A linear system combines the above two properties as

$$\alpha_1\, s_1[n] + \alpha_2\, s_2[n] \quad\rightarrow\quad \alpha_1\, r_1[n] + \alpha_2\, r_2[n] \qquad (1)$$

Below, we discuss examples of a linear and a non-linear system.

Example

Consider a system

$$r[n] = 2\, s[n]$$

(the gain of $2$ is an illustrative choice). The output of this system, as a response to an input $s_1[n] = \cos(2\pi \frac{k_1}{N}n)$, is

$$r_1[n] = 2\cos\left(2\pi\frac{k_1}{N}n\right)$$

Similarly, the response to a different signal $s_2[n] = \cos(2\pi \frac{k_2}{N}n)$ is

$$r_2[n] = 2\cos\left(2\pi\frac{k_2}{N}n\right)$$

When this system is given the input $s_1[n] + s_2[n]$, the output is

$$r[n] = 2\cos\left(2\pi\frac{k_1}{N}n\right) + 2\cos\left(2\pi\frac{k_2}{N}n\right) = r_1[n] + r_2[n]$$

Hence, it is a linear system.

On the other hand, when the same input is given to another system

$$r[n] = s^2[n]$$

and using the identity $\cos A \cos B = \frac{1}{2}\left\{\cos(A-B) + \cos(A+B)\right\}$, the output is

$$r[n] = 1 + \frac{1}{2}\cos\left(2\pi\frac{2k_1}{N}n\right) + \frac{1}{2}\cos\left(2\pi\frac{2k_2}{N}n\right) + \cos\left(2\pi\frac{k_2-k_1}{N}n\right) + \cos\left(2\pi\frac{k_1+k_2}{N}n\right)$$

Clearly, it is a non-linear system.

From the above example, it is also clear that the input sinusoids do not interact with each other in a linear system; hence, the output frequencies are the same as the input frequencies. In a non-linear system, however, the input sinusoids interact with each other to produce frequencies that were not present in either of the input signals, as shown in the Figure below. Note that the input actually consists of only two frequencies at bins $k_1$ and $k_2$, but the output of the system is composed of other frequencies as well.
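A numerical sketch of this interaction, assuming the squaring system and two illustrative tone bins $k_1 = 5$, $k_2 = 9$ chosen for the demo:

```python
# Two input tones at bins k1, k2 pass through r[n] = s[n]^2 and new
# frequencies appear at 0, k2-k1, 2*k1, k1+k2 and 2*k2.
import numpy as np

N, k1, k2 = 64, 5, 9
n = np.arange(N)
s = np.cos(2 * np.pi * k1 * n / N) + np.cos(2 * np.pi * k2 * n / N)
r = s ** 2                                   # the squaring (non-linear) system

S = np.abs(np.fft.fft(s))
R = np.abs(np.fft.fft(r))
in_bins = {k for k in range(N // 2 + 1) if S[k] > 1e-6}
out_bins = {k for k in range(N // 2 + 1) if R[k] > 1e-6}

assert in_bins == {k1, k2}
assert out_bins == {0, k2 - k1, 2 * k1, k1 + k2, 2 * k2}
```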

The Discrete Fourier Transform, DFT, is a linear operation as it is evident from DFT definition that any scaling and addition of two or more input signals will result in a DFT output that is a scaled and summed version of their individual DFT outputs.
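This linearity is easy to confirm numerically (illustrative random inputs and scalars):

```python
# The DFT is linear: scaling and summing inputs scales and sums their DFTs.
import numpy as np

rng = np.random.default_rng(2)
x, y = rng.standard_normal(8), rng.standard_normal(8)
a, b = 3.0, -0.5

assert np.allclose(np.fft.fft(a * x + b * y),
                   a * np.fft.fft(x) + b * np.fft.fft(y))
```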

## The Concept of Phase

We have explained what the Discrete Fourier Transform (DFT) is. Also, we have covered that the concept of frequency is related to the rotational speed of a complex sinusoid. Subsequently, a frequency axis was defined, first in the continuous domain and then in the discrete domain. The tools to view the spectrum of a signal in frequency domain are $IQ$ plots and magnitude-phase plots. We defined the magnitude and phase of a complex number before. Similar definitions hold for complex signals:

$$|S[k]| = \sqrt{S_I^2[k] + S_Q^2[k]} \qquad (1)$$

and the phase is defined as the four-quadrant inverse tangent of $S_Q[k]/S_I[k]$, with $S_I[k]$ and $S_Q[k]$ in place of the $I$ and $Q$ components of a complex number, respectively.

Magnitude-phase plots are usually used more than $IQ$ plots because a magnitude plot shows the strength of a complex sinusoid at each frequency in the whole spectrum. In focusing on the magnitude plot, however, it is easy to miss the great deal of information provided by the phase plot. Although phase is relatively unimportant in some particular areas, such as most audio applications, due to the relative insensitivity of the human ear to phase, it plays a significant role in wireless communications, as we will see in later chapters.

As an example of what happens when phase information is neglected, this is how my daughter writes some English letters:

• L
• V
• a reversed Z

All her symbols above are almost correct. Nevertheless, their phase is distorted leading to incorrect results. In addition, further confusion can develop if the relationship of phase with time domain is not clearly understood.

Phase in frequency domain has a special relationship with the initial sample of a signal in time domain. Intuitively, a waveform is about how a signal changes in the time $IQ$-plane, while its first samples on the $I$ and $Q$ axes indicate the starting point of that waveform. On the other hand, frequency is about the rotational speed of the complex sinusoids constructing that signal (the higher the frequency, the farther it lies on the frequency axis, where it sits without rotating), and phase indicates their orientation in the frequency $IQ$-plane. Naturally, this orientation of each such complex sinusoid depends on where its initial sample is.

Example

As an example, consider this Figure. In time domain, when the starting sample of a cosine is changed by $1/4$ of a period, it becomes a sine wave. Correspondingly, in frequency domain, the cosine changes its phase by $1/4$ of $2\pi$ (to be exact, by $-\pi/2$ and $+\pi/2$ on the positive and negative frequency axes, respectively). Mathematically, remember that $\cos(\theta - \pi/2) = \sin\theta$.

Consider the inphase component of a complex sinusoid shifted by $m$ samples:

$$\cos\left(2\pi\frac{k}{N}(n - m)\right) = \cos\left(2\pi\frac{k}{N}n - 2\pi\frac{k}{N}m\right)$$

Since $2\pi km/N$ is a constant above, it can be seen as the phase shift incurred by a delay of $m$ samples. This result is known as the shifting property of the DFT, which holds true for circular shifts in time. It states that a (circular) time shift of an input signal results in a corresponding phase shift at each frequency of its DFT.

Effect of time shift on DFT – Magnitude and Phase: In light of the discussion above, the DFT of $s[n-m]$ has its magnitude unchanged. However, its phase is rotated by $-2\pi km/N$ at each bin $k$. Denoting the rotated DFT by $S_m[k]$,

$$|S_m[k]| = |S[k]|, \qquad \angle S_m[k] = \angle S[k] - 2\pi\frac{k}{N}m \qquad (2)$$

Effect of time shift on DFT – I and Q: It is straightforward to prove through the DFT definition that the DFT of $s[n-m]$ is given by

$$S_{m,I}[k] = S_I[k]\cos\left(2\pi\frac{k}{N}m\right) + S_Q[k]\sin\left(2\pi\frac{k}{N}m\right)$$
$$S_{m,Q}[k] = S_Q[k]\cos\left(2\pi\frac{k}{N}m\right) - S_I[k]\sin\left(2\pi\frac{k}{N}m\right) \qquad (3)$$

As a verification, comparing with the rotation equation for complex numbers, Eq (3) is nothing but a rotation of the complex number $S[k]$ by the angle $-2\pi km/N$ for each $k$, $k = 0, 1, \ldots, N-1$.

The converse of the above argument is also true. A phase shift at a discrete frequency bin of the DFT informs us about the (circular) time shift of that sinusoid. The conclusions from above are summarized in the note below.

Time shift ↔ Phase shift

If a signal is circularly right shifted in time by $m$ samples (i.e., samples are moved $m$ places to the right, with elements that fall off at one end of the sequence appearing at the other end), then the magnitude of its DFT remains unchanged, while the phase of its DFT gets rotated by $-2\pi km/N$ for each $k$, $k = 0, 1, \ldots, N-1$. Similarly, a circular left shift in time by $m$ samples incurs a phase rotation of $+2\pi km/N$ for all $k$.

$$s[n \mp m] \quad\xrightarrow{\textrm{DFT}}\quad S[k]\, e^{\mp j 2\pi \frac{k}{N} m} \qquad (4)$$

Therefore, a time delay (going back in time from NOW, or a waveform shift to the right) rotates the original spectral phase in the negative direction (clockwise). On the other hand, a time advance (future travel from NOW, or a waveform shift to the left) rotates the original spectral phase in the positive direction (anticlockwise). In terms of the intuitive method of time shifting, this makes perfect sense: traveling into the past should decrease the DFT phase, and vice versa.
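Eq (4) and the rotation directions above can be verified with a short sketch (the signal, shift amount, and length are illustrative):

```python
# A circular right shift by m samples leaves |S[k]| unchanged and rotates
# each phase by -2*pi*k*m/N, per Eq (4).
import numpy as np

rng = np.random.default_rng(3)
N, m = 16, 3
s = rng.standard_normal(N)
k = np.arange(N)

S  = np.fft.fft(s)
Sm = np.fft.fft(np.roll(s, m))   # np.roll(s, m) is a circular right shift

assert np.allclose(np.abs(Sm), np.abs(S))                     # magnitude intact
assert np.allclose(Sm, S * np.exp(-2j * np.pi * k * m / N))   # phase rotated
```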

In the examples discussed above, the amount of phase shift has a linear relation with the frequency index $k$, as evident from the term $2\pi km/N$. This is the concept of linear phase, where the phase of each complex sinusoid is directly proportional to the frequency of that sinusoid. Intuitively, if sinusoids with different frequencies get delayed by the same number of samples, then they naturally end up with different phases at the end of that common delay.

The magnitude and direction of rotation for these frequencies are symbolically shown in the Figure below. Remember that this is a frequency $IQ$-plane, unlike the time $IQ$-plane.

Example

The Figure below shows a unit impulse signal and its DFT, along with its circularly time shifted version and the corresponding DFT. This DFT is computed in a later article.

Note that the phase shift is different for each frequency bin according to Eq (4), namely $-2\pi km/N$ for $k = 0$ to $N-1$.

The phase rotations are illustrated in the figure. Also observe that for a right shift, the phase rotations are clockwise for positive $k$ and anticlockwise for negative $k$.

The understanding of phase above cannot be overemphasized. It is relatively straightforward to see magnitude plots and diagnose the behavior of signals and systems. However, true insights can only be developed through grasping the implications of phase rotations.


## A Digital Signal

We have talked about obtaining a discrete-time signal through sampling the time-axis and obtaining a discrete frequency set through sampling the frequency axis. The same concept can be applied to the amplitude-axis, where the signal amplitude can be sampled to take only a finite set of discrete values. This discrete-time discrete-valued signal is called a digital signal, as opposed to an analog signal that is continuous in time and continuous in amplitude.

The above Figure shows how a digital signal having amplitudes over a fixed set of values can be obtained through slicing the underlying continuous amplitudes. For example, an amplitude of 2.2 can be rounded to 2, 1.4 to 1 and so on depending on the desired resolution.

Computers can only work with digital signals, because discrete-time signals — though defined only for finite values on the time-axis — can have infinite values on the amplitude-axis. Just as computer memory is finite and can store only a known number of time values, its width is also finite (e.g., 8 bits) and can store only a fixed number of amplitudes (e.g., for an 8-bit wide memory, we can have $2^8 = 256$ values for amplitudes).
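A sketch of this amplitude slicing, assuming an illustrative signal and a full-scale range chosen for the demo:

```python
# Quantize a discrete-time, continuous-valued signal to 2^8 = 256 amplitude
# levels, as an 8-bit wide memory would require.
import numpy as np

t = np.arange(0, 1, 1 / 50)                 # discrete time: 50 samples
x = 2.2 * np.sin(2 * np.pi * 3 * t)         # discrete-time, continuous-valued

bits = 8
levels = 2 ** bits                          # 256 amplitude values
x_min, x_max = -4.0, 4.0                    # assumed full-scale range
step = (x_max - x_min) / levels
x_digital = np.round((x - x_min) / step) * step + x_min   # slice amplitudes

assert len(np.unique(x_digital)) <= levels  # finite amplitude set
assert np.max(np.abs(x - x_digital)) <= step / 2  # rounding error bound
```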

## Pulse Amplitude Modulation (PAM)

In the article on modulation – from numbers to signals, we said that Pulse Amplitude Modulation (PAM) is an amplitude scaling of the pulse according to the symbol value. What happens when this process of scaling the pulse amplitude by symbols is repeated for every symbol during each interval of duration $T_M$? Clearly, a series of bits (1010 in our initial example) can be transmitted by choosing a rectangular pulse and scaling it with the appropriate symbols.

Our next step is forming a cumulative waveform from these individual symbol-scaled pulses. Remember from the article on transforming a signal that the mathematical expression for a signal delayed by an amount $mT_M$, or $mL$ samples, is given as $p[n - mL]$, where $T_M$ is the symbol duration and $L$ is samples/symbol defined as $L = T_M/T_S$. Since the same pulse is scaled by the symbol value during each $T_M$,

• At time instant $0$, the output is $a[0]\, p[n]$.
• At time instant $T_M$, the output is $a[1]\, p[n-L]$.
• At time instant $2T_M$, the output is $a[2]\, p[n-2L]$.
• At time instant $3T_M$, the output is $a[3]\, p[n-3L]$.

And so on. Finally, their addition gives the expression for a general PAM waveform:

$$s[n] = \sum_{m} a[m]\, p[n - mL] \qquad (1)$$

After digital to analog conversion (DAC), the continuous-time signal can be expressed as

$$s(t) = \sum_{m} a[m]\, p(t - mT_M) \qquad (2)$$

As an example, a $2$-PAM waveform is illustrated in the Figure below, with the red dashed curve being the underlying continuous-time signal.

In a similar manner to $2$-PAM, a $4$-PAM waveform based on symbols $\pm A$ and $\pm 3A$ can be constructed by scaling the pulse amplitude by four different symbol values during each $T_M$, as illustrated in the Figure below.
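The construction in Eq (1) can be sketched for the 1010 example (the rectangular pulse and the bit-to-symbol mapping $1 \to +1$, $0 \to -1$ are illustrative choices):

```python
# A 2-PAM waveform from upsampled symbols convolved with a rectangular pulse,
# i.e., s[n] = sum_m a[m] p[n - mL] from Eq (1).
import numpy as np

a = np.array([1.0, -1.0, 1.0, -1.0])   # symbols for bits 1010
L = 8                                  # samples per symbol, L = T_M / T_S
p = np.ones(L)                         # rectangular pulse shape

# Upsample: insert L-1 zeros between symbols, then superpose scaled pulses
v = np.zeros(len(a) * L)
v[::L] = a
s = np.convolve(v, p)[:len(a) * L]

# Each symbol interval holds a constant amplitude equal to its symbol
assert np.allclose(s[:L], 1.0) and np.allclose(s[L:2*L], -1.0)
```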

## Constellation Diagram

Just like a constellation of stars, a constellation diagram shows the actual symbol values representing a set of bits. We have already encountered constellation diagrams before (e.g., in the article on a simple communication system). A general constellation diagram for M-PAM is shown in Figure below.

## Average Symbol Energy

The average symbol energy in a constellation is given by the average of all individual symbol energies. For $M = 2$ with symbols $\pm A$,

$$E_M = \frac{1}{2}\left\{(-A)^2 + (+A)^2\right\} = A^2$$

And for $M = 4$ with symbols $\pm A$ and $\pm 3A$,

$$E_M = \frac{1}{4}\left\{(-3A)^2 + (-A)^2 + (+A)^2 + (+3A)^2\right\} = 5A^2$$

For a general $M$,

$$E_M = \frac{2A^2}{M}\left\{1^2 + 3^2 + \cdots + (M-1)^2\right\}$$

The term in the brackets is the sum of squares of the first $M/2$ odd integers. Using the formula $1^2 + 3^2 + \cdots + (2K-1)^2 = K(4K^2 - 1)/3$ with $K = M/2$, we get

$$E_M = \frac{M^2 - 1}{3} A^2 \qquad (3)$$
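Eq (3) can be checked numerically by averaging the symbol energies directly:

```python
# Average energy of the M-PAM symbols ±A, ±3A, ..., ±(M-1)A matches
# (M^2 - 1)/3 * A^2 from Eq (3).
import numpy as np

A = 1.0
for M in (2, 4, 8, 16):
    symbols = A * np.arange(-(M - 1), M, 2)       # ±A, ±3A, ..., ±(M-1)A
    E_avg = np.mean(symbols ** 2)
    assert np.isclose(E_avg, (M ** 2 - 1) / 3 * A ** 2)
```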

The main purpose of this text is to find the answer to the following question: which operations are necessary at the Tx and Rx sides such that we can detect the transmitted symbol with the highest probability? To explore the answer, we start with a simple PAM modulator and detector.

## PAM Modulator

At this stage, we are ready to build a conceptual PAM modulator. The block diagram is drawn in the Figure below, in which the Tx signal is generated in the following way.

• Every $T_b$ seconds, a new bit arrives at the input, forming a serial bit stream.
• A serial-to-parallel (S/P) converter collects $\log_2 M$ such bits every $T_M$ seconds, which are used as an address into a Look-Up Table (LUT) that stores the symbol values specified by the constellation.
• To produce a PAM waveform, the symbol sequence is converted to a discrete-time impulse train through upsampling by $L$, where $L$ is samples/symbol, defined as the ratio of symbol time to sample time, $L = T_M/T_S$, or equivalently of sample rate to symbol rate.
• As explained in the post on sample rate conversion, upsampling inserts $L-1$ zeros between consecutive symbols, after which the intermediate samples can be raised from the dead with the help of a lowpass filter that suppresses all the spectral replicas except the primary one. We will see in the next section that a proper pulse shaping filter is itself a lowpass filter, and hence an extra lowpass filter is not actually required.
• The generated discrete-time signal is converted to a continuous-time signal by a DAC.

The mathematical derivation for the PAM modulator was shown in Eq (1) and Eq (2).

## PAM Detector

The received signal is the same as the transmitted signal but corrupted by additive white Gaussian noise (AWGN). The symbols are detected through the following steps, illustrated in the Figure above.

• Through an analog to digital converter (ADC), the received signal is sampled at a rate of $1/T_S$ samples/s to produce a sequence of $T_S$-spaced samples.
• Next, this sequence is processed through a matched filter $p[-n]$ at the Rx side to generate $z[n]$. As discussed earlier, the output of the matched filter is a continuous correlation of the symbol-scaled pulse shape with an unscaled and time-reversed pulse shape.
• This output is downsampled by $L$ at the optimal sampling instants

$$n = mL \qquad (4)$$

to produce $T_M$-spaced numbers $z[m]$ back from the signal.

• The minimum distance decision rule is employed to find the symbol estimates $\hat{a}[m]$.

Take a special note of Eq (4). It will be employed over and over again.

Notice that a symbol is the basic building block of a digital communication system. Consequently, symbol time $T_M$ is the basic unit of measurement along the time axis of such a system. While the Figure on correlator outputs depicts sampling the output just once at the optimum instant $T_M$, the same holds for all integer multiples of the symbol time as well, i.e., the output is sampled just once for every symbol duration, at the optimum instants $T_M, 2T_M, 3T_M, \ldots$

Key samples

Carefully examine the key samples at $T_M, 2T_M, 3T_M, \ldots$ These are the samples we are looking for in the waveform for detection purposes. Even when the waveform has suffered all the distortions the real world has to offer, locating these samples and mapping them back to the constellation is a beautiful process, the actual details of which we will encounter throughout this text.

Let us discuss the mathematical details of this process. The Tx signal in Eq (1) is expressed as (the reason for using a different variable $i$ instead of $m$ will shortly become clear)

$$s[n] = \sum_{i} a[i]\, p[n - iL]$$

In the noiseless case, this signal is input to the matched filter and the output is written as

$$z[n] = \sum_{i} a[i]\, r_p[n - iL] \qquad (5)$$

where $r_p[n] = p[n] \star p[n]$ comes into play from the definition of the auto-correlation function. To generate symbol decisions, $T_M$-spaced samples of the matched filter output are required at $n = mL$. Downsampling the matched filter output generates

$$z[m] = z[n]\Big|_{n = mL} = \sum_{i} a[i]\, r_p[(m-i)L] = a[m]\, r_p[0] \qquad (6)$$

This is because for a rectangular pulse shape, the matched filter output is triangular, with the maximum $r_p[0]$ occurring at $n = mL$ and zeros at the neighboring symbol locations (since we wanted to denote our current symbol by $m$, we opted for the variable $i$ at the start of this derivation); see the Figure on the correlation of a rectangular pulse.
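The chain in Eqs (5)–(6) can be sketched end to end for a noiseless 2-PAM signal (all signal values are illustrative; note that with a causal implementation the triangular peaks land at $n = mL + L - 1$ rather than at $mL$):

```python
# Matched-filter a noiseless 2-PAM waveform with a rectangular pulse,
# downsample at the peaks, then decide by minimum distance.
import numpy as np

a = np.array([1.0, -1.0, -1.0, 1.0])      # transmitted symbols
L = 8
p = np.ones(L)                            # rectangular pulse

v = np.zeros(len(a) * L); v[::L] = a
s = np.convolve(v, p)[:len(a) * L]        # Tx waveform, Eq (1)

z = np.convolve(s, p[::-1])               # matched filter = correlate with pulse
Ep = np.sum(p ** 2)                       # pulse energy r_p[0]

# Triangular peaks a[m]*Ep occur at n = mL + (L-1) in this causal indexing;
# downsample there and normalize (Eq 6)
z_symbols = z[L - 1::L][:len(a)] / Ep

# Minimum-distance decisions against the 2-PAM constellation {-1, +1}
constellation = np.array([-1.0, 1.0])
a_hat = constellation[np.argmin(np.abs(z_symbols[:, None] - constellation),
                                axis=1)]
assert np.allclose(a_hat, a)
```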

Observe that the system shown in the PAM system block diagram is a multirate system. In the PAM detector, for example, the ADC and the matched filter operate at the sample rate $1/T_S$. After the output of the matched filter is downsampled by $L$, the symbol decisions are made at the symbol rate $1/T_M$. Furthermore, there are some hidden assumptions in the PAM detector:

[Resampling] The ADC in general does not produce an integer number of samples per symbol, i.e., $T_M/T_S$ is not an integer. As we will see later, a resampling system is required in the Rx chain that changes the sample rate from the ADC rate to a rate that is an integer multiple of the symbol rate.

[Symbol Timing Synchronization] The peak sample at the end of each symbol duration is not known in advance at the Rx, and in fact does not necessarily coincide with a generated sample either. This is because the ADC just samples the incoming continuous waveform without any information about the symbol boundaries. This is the symbol timing synchronization problem, which we will learn about later.
