A Beginner’s Guide to OFDM

OFDM slices the spectrum just like bread

In recent years, high data rate wireless communication has often been considered synonymous with Orthogonal Frequency Division Multiplexing (OFDM). OFDM is a special case of multi-carrier communication, as opposed to a conventional single-carrier system.

The concepts on which OFDM is based are so simple that almost everyone in the wireless community is a technical expert in this subject. However, I have always felt the absence of a really simple guide on how OFDM works, one that can prove useful for technical people who do not want to deal with too many technicalities, such as DSP experts outside communications, computer programmers, ham radio enthusiasts and the like. So here it is.

OFDM is the technology behind many high speed systems such as WiFi (IEEE 802.11a, g, n, ac), WiMAX (IEEE 802.16) and 4G mobile communications (LTE). A close cousin, Discrete Multi-tone (DMT), is used in ADSL and powerline communication systems. Therefore, it seems imperative to have a signal level understanding of how OFDM works. We start with a short introduction to a wireless channel.

A Wireless Channel

A transmitted signal undergoes two major kinds of fading in a wireless channel:

  1. Large scale fading, that arises from regular power decay with distance as well as shadowing caused by buildings and other obstacles affecting the wave propagation.
  2. Small scale fading, that arises from constructive and destructive interference of multi-path components and results in fast amplitude variations at the receiver. An example of these multi-path components is shown in Figure 1 below where many multi-path components arrive at the Rx of a hiker after being reflected from the nearby surfaces such as the aeroplane, houses, trees and the mountains.

Figure 1: Multi-path components arriving at the Rx

In Figure 1, the direct path to the hiker is shown by the bold line. Assume that the second path (say, from the aeroplane) arrives 1 \mu s after the direct path and the third path arrives another 1 \mu s later, i.e., a total of 2 \mu s after the direct path. We will use these numbers in our examples below.

In the discussion that follows, we avoid technical terms such as delay spread, frequency-selective fading, frequency flat fading and so on. Also, we ignore the RF carrier in the subsequent discussion to highlight the important concepts relevant to the current discussion. Finally, we will consider binary modulation to avoid using the term symbols and stick to bits instead.

Digital Modulation

Digital modulation is concerned with mapping of the bits (1s and 0s) to a property of the signal suitable for transmission. For example, consider a rectangular pulse shape. Amplitude modulation maps 0s and 1s to \pm 1 as

    \[0 \rightarrow -1\]

    \[1 \rightarrow +1\]

that correspondingly alters the amplitude of the rectangular pulse shape, as shown in Figure 2 below.

Figure 2: Amplitude modulation

Here, a bit stream 01001010 is mapped to the sequence -1,+1,-1,-1,+1,-1,+1,-1. In this simple system, all the Rx has to do is compare the amplitude of the received signal level (within each bit time) to a threshold (say, 0) and decide in favor of +1 or -1 (and consequently, 1 or 0).
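The mapping and the threshold decision described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a noiseless channel and one sample per bit; the function names modulate and detect are made up for this example.

```python
def modulate(bits):
    """Map each bit to an amplitude: 0 -> -1, 1 -> +1."""
    return [2 * b - 1 for b in bits]

def detect(samples, threshold=0.0):
    """Decide 1 if the sample is above the threshold, otherwise 0."""
    return [1 if s > threshold else 0 for s in samples]

bits = [0, 1, 0, 0, 1, 0, 1, 0]     # the bit stream 01001010 from the text
symbols = modulate(bits)
print(symbols)                      # [-1, 1, -1, -1, 1, -1, 1, -1]
print(detect(symbols) == bits)      # True
```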

A Low Data Rate Signal

Suppose it is the early 2000s and our hiker in Figure 1 only wants to check his email and/or messages (probably driven by what is available) and requires a data rate of only 100 kbps. That translates to a bit time (also known as bit duration) of

    \[1/100,000 = 10 \mu s\]

as drawn in Figure 3.

Figure 3: A low rate 100 kbps signal with a bit duration equal to 10 \mu s

As described above, the first and second multi-path components arrive 1 \mu s and 2 \mu s after the direct path, respectively (ignoring the carrier). This is shown in Figure 4.

Figure 4: First and second multipath components arriving after 1 and 2 \mu s after the direct path, respectively (carrier wave not shown)

Since nature adds the signals at the antenna, the Rx sees the sum of these three paths: effectively the same signal delayed by different amounts, each with its own attenuation and carrier phase shift (not drawn in the figure). Those phase shifts depend on the path delay and the carrier frequency. Observe that 2 \mu s is 20 \% of the bit duration of 10 \mu s. The implication is that the multi-path components will distort the Rx signal but there will not be much interference among the bits themselves. That is, bit 1 will interfere slightly with bit 2 through its last path, but not with any bit farther away.

This is a situation that can be handled without much effort in terms of computational resources. We claim that the wireless channel does not pose a significant problem to Rx processing time in this low data rate scenario. It is not necessary to understand this in the context of this article but the disadvantage here is that in case of destructive interference, there is no other way to recover except providing diversity to the Tx signal. Diversity is a signal replica in some form whether in time, frequency, space, etc.

Moving towards High Data Rate

Fast forward a decade, say to the early 2010s. Escaping the fast pace of life, our hiker wants to sit on top of that peaceful mountain ... to watch YouTube videos. In fact, he considers having access to it anywhere a basic necessity of life, like water and electricity. Assume that a data rate of 10 Mbps is needed for this purpose, which translates to a bit duration of 0.1 \mu s as shown in Figure 5.

Figure 5: A high rate 10 Mbps signal with a bit duration equal to 0.1 \mu s

The main point is that the environment is still the same and does not care about our data rates! Multi-path components previously arriving 1 and 2 \mu s after the direct path will still arrive 1 and 2 \mu s after the direct path. This is illustrated in Figure 6 (again ignoring the carrier). The different path lengths will translate into different attenuations and phase shifts resulting in constructive and destructive interference throughout the signal span.

Figure 6: First and second multipath components still arrive after 1 and 2 \mu s after the direct path, respectively (carrier wave not shown)

What does this imply for the high rate transmission? Notice that in this case, each bit interferes with tens of later bits through the late arriving paths, a phenomenon known as Inter-Symbol Interference (ISI). In most cases of interest, this ISI can extend to hundreds or even thousands of bits. We can conclude that the same channel that was harmless for low rate communication has become harsh for high rate communication!
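To put numbers on this comparison, the short sketch below computes how many bit durations the latest path (2 \mu s in our running example) spills over at the two data rates. The helper names are invented for illustration.

```python
def bit_duration_us(rate_bps):
    """Bit duration in microseconds for a given bit rate in bits per second."""
    return 1e6 / rate_bps

def bits_spanned(delay_us, rate_bps):
    """Number of bit durations covered by a path delay."""
    return delay_us / bit_duration_us(rate_bps)

max_delay_us = 2.0   # latest path in the running example

for rate_bps in (100e3, 10e6):
    span = bits_spanned(max_delay_us, rate_bps)
    print(f"{rate_bps / 1e3:>8.0f} kbps: bit duration {bit_duration_us(rate_bps):.1f} us, "
          f"last path spills over {span:.1f} bit durations")
```

At 100 kbps the last path covers only a fifth of one bit, while at 10 Mbps the same 2 \mu s delay covers about 20 bits.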

A solution must be devised for this problem.

Solution – Equalizer

It turns out that a solution for this kind of problem was devised by Robert Lucky at Bell Labs in 1964: an adaptive equalizer. An equalizer is a filter that mitigates the effects of channel fading on the Rx signal and removes the Inter-Symbol Interference (ISI). Its input is the distorted waveform (sum of the original signal and its multi-path components) and the output is the clean desired bit stream as shown in Figure 7.

Figure 7: Equalizer input is a distorted waveform and its output is a clean bit stream

Don’t think of it as a physical device. Just like everything physical got transformed into digital logic in the history of communications (leading to software defined radios), an equalizer sits as mere lines of code in a microprocessor.

An equalizer has its advantages and disadvantages. Needless to say, it considerably improves the bit error rate and is consequently fundamental to making the system functional. Technically, it exploits the frequency diversity available in a broad spectrum. On the other hand, for a high rate system, it is the most demanding and resource intensive component of a conventional wireless Rx. While the exact numbers depend on the requirements and implementation, it can consume even 75% of the processing resources! In high speed communication, it is impractical to be busy processing a bit in a certain window of time while many future bits are arriving at the Rx. That will result in the buffer filling faster than it is emptied.

We can conclude that low data rate communication requires a relatively simple Rx processor while high rate communication requires a heavy duty Rx processor. That kind of processing demands high power consumption and is hence not feasible for battery operated devices like our smart phones.

Can there be a technique to achieve fast communication with a simpler equalizer? The answer is yes and the technique is OFDM.

OFDM in Time Domain

In time domain, OFDM breaks one serial fast bit stream into many parallel slow bit streams.
Then, these parallel slow bit streams are multiplied with orthogonal sinusoids, where orthogonality between two sinusoids is defined with a summation over a certain time interval as

    \[\sum \limits _{n=0} ^{N-1} \cos 2\pi k f_0 n \cdot \cos 2\pi k' f_0 n = 0\]

when k \neq k'.

This process is illustrated in detail in Figure 8 with the help of an example:

Figure 8: An OFDM example in time domain. See below for details

For a bit stream b[k], the example in Figure 8 traces the following steps:

I. Assume that the bit duration is T and there are 9 bits to be sent. In a high speed communication system, it will take 9T seconds to transmit all the bits. So we break them down to 9 parallel bit streams each with a duration of 9T seconds.

II. Assume a fundamental frequency of f_0 = 1/(9T) with 9 samples within this duration (that will be one complete period). Then, this sinusoid will be orthogonal to 8 other sinusoids with frequencies 0f_0, 2f_0, 3f_0, \cdots, 8f_0. This set of 9 sinusoids is shown in Figure 8. We will call them subcarriers.

III. Next, we multiply each such sinusoid with +1 or -1 (depending on the bit) to scale their amplitude accordingly.

    \[b[k]\cdot \cos 2\pi \frac{k}{9} n\]

IV. Finally, all these amplitude scaled sinusoids are added together to generate the desired signal. Notice that this composite signal has a duration of 9T seconds (same duration as the original bit stream) but contains the information from all 9 bits.

For a bit stream b[k] mapped to a non-binary modulation scheme X[k] (having I and Q components) and N sinusoids (instead of 9), this sequence of steps can be carried out using the inverse Discrete Fourier Transform (iDFT) defined as

    \[s[n] = \frac{1}{N} \sum \limits _{k=0} ^{N-1} X[k] e^{j2\pi \frac{k}{N}n}\]

The question is what we have achieved so far. For each such individual signal, the multi-path components arrive in the not too distant future (just like the low rate stream above). The multi-path components for individually modulated subcarriers as well as for the composite signal are shown in Figure 9.

Figure 9: Multi-path components for individually modulated subcarriers as well as the composite signal. First and second path respectively shown up and down for clarity

Hence, the equalizer design is easy, since the paths are less spread relative to the symbol duration and consequently interfere less with future symbols, provided that we find a way to separate the subcarriers at the Rx. Separating them at the Rx is easy: we can correlate (multiply sample by sample and sum) the composite signal with just one subcarrier, say with frequency 6f_0. Owing to the orthogonality property, the contributions from all other 8 subcarriers cancel out to zero, while the contribution from the subcarrier with frequency 6f_0 pops out, scaled in amplitude by our modulation signal. The same procedure can be repeated for all other subcarriers.

Essentially, this is an operation of Discrete Fourier Transform (DFT) as

    \[X[k] = \sum \limits _{n=0} ^{N-1} s[n] e^{-j2\pi \frac{k}{N}n}\]

which, for each k, will generate our modulated data.
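The modulation and correlation steps can be tried out with a small pure-Python sketch. Note the assumption: it uses the complex exponentials of the iDFT and DFT expressions above rather than the real cosines drawn in Figure 8, and the 9 example bits are arbitrary. The function names are chosen for this illustration only.

```python
import cmath

def idft(X):
    """Inverse DFT: s[n] = (1/N) * sum_k X[k] * exp(+j*2*pi*k*n/N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def correlate_with_subcarrier(s, k):
    """Multiply sample by sample with the conjugated subcarrier k and sum."""
    N = len(s)
    return sum(s[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))

def dft(s):
    """DFT: the correlation above repeated for every subcarrier index k."""
    return [correlate_with_subcarrier(s, k) for k in range(len(s))]

bits = [0, 1, 0, 0, 1, 0, 1, 0, 1]   # 9 example bits, one per subcarrier
X = [2 * b - 1 for b in bits]        # 0 -> -1, 1 -> +1
s = idft(X)                          # composite time-domain signal

# Correlating with subcarrier 6 alone returns the value it carries
print(round(correlate_with_subcarrier(s, 6).real))

recovered = [1 if v.real > 0 else 0 for v in dft(s)]
print(recovered == bits)
```

Correlating with a single subcarrier returns only the value carried by that subcarrier, and repeating the correlation for every k recovers the full bit stream.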

OFDM in Frequency Domain

The wireless channel shown in Figure 1 has an impulse response derived from the contribution of each multi-path component. It also has a frequency response; assume that it has a shape as in Figure 10.

Figure 10: Frequency response of the wireless channel

With respect to frequency domain, the signal at the Rx is a product of the spectra of the Tx signal and the wireless channel. Thus, the channel will allow some frequencies to pass through unharmed while suppressing some others. We will come to this point later.

Now remember that time and frequency have an inverse relationship. A signal wide in time domain has a narrow frequency span and vice versa. Although this relationship can be derived, we can just look at the Fourier transform of a rectangular signal: a sinc signal. The wider the rectangle, the closer the sinc's first zero-crossing is to the origin. Therefore, a low rate signal, being wide in time domain, has a narrow spectral representation. On the other hand, a fast rate signal exhibiting rapid changes in time has a wide spectrum. While random data generates a random signal that in turn is defined in terms of power spectral density, the underlying concept is still true and is drawn in Figure 11.

Figure 11: Spectral contents of a signal depend on its variations in time

As mentioned earlier, their interaction with the channel is through multiplication of spectral responses. The low data rate signal needs less manipulation by the Rx to get the original data back. Essentially for this kind of signal, as seen in Figure 12, the channel acts just as a single multiplier that can be equalized through estimating that number (known as channel estimation) and dividing the Rx signal by that number. So the equalization reduces to a single division operation.

There can be a question of how to recover when the narrowband signal appears within a deep channel fade. In that case, nothing but diversity (a signal replica in some form whether in time, frequency, space, etc.) can recover the signal which is out of scope of this article.

Figure 12: Signal bandwidth plays a central role in determining how it is treated by the channel

On the other hand, a high data rate signal needs a lot of Rx processing to equalize the channel. Figure 12 illustrates how different portions of the signal spectrum are treated differently by the channel, so a complex algorithm needs to be implemented for signal recovery. There is an advantage in this situation in the form of available ‘frequency diversity’, which prevents the whole signal spectrum from going down in a deep fade (as opposed to the low data rate case above). An equalizer inherently exploits that same frequency diversity, a topic that is outside the scope of this article. However, remember that we cannot afford a computationally expensive equalizer and instead need a simpler one.

To solve this problem, what OFDM does in frequency domain is fairly simple. It just segments the available bandwidth into many parallel, almost flat channels by utilizing those sinusoidal subcarriers. Hence, equalization for each narrow slice requires just a single division operation, reducing the computational load of the equalizer to a total of N divisions. This is illustrated in Figure 13.


Figure 13: In frequency domain, OFDM slices the spectrum through using the subcarriers; now each spectral segment can be processed individually
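The "one division per subcarrier" idea can be sketched as follows. This is an idealized illustration, not a receiver design: it assumes no noise, a channel perfectly known at the Rx, and a channel that acts as a circular convolution on each OFDM symbol (which is what the cyclic prefix mentioned in the Remarks is meant to arrange). The path gains in h are hypothetical.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def circular_convolve(s, h):
    """Channel model: each path adds a delayed, scaled copy that wraps around."""
    N = len(s)
    return [sum(h[m] * s[(n - m) % N] for m in range(len(h))) for n in range(N)]

X = [1, -1, 1, 1, -1, 1, -1, -1]         # example +/-1 data, one value per subcarrier
h = [1.0, 0.5, 0.2]                      # hypothetical gains: direct path and two echoes

s = idft(X)                              # OFDM modulation
r = circular_convolve(s, h)              # received signal (no noise)

H = dft(h + [0.0] * (len(X) - len(h)))   # channel response, one number per subcarrier
R = dft(r)
X_hat = [R[k] / H[k] for k in range(len(X))]   # equalization: one division per subcarrier

decisions = [1 if v.real > 0 else -1 for v in X_hat]
print(decisions == X)
```

Even though the echoes smear the time-domain samples, each subcarrier sees only a single complex multiplier H[k], so a single division undoes the channel on that slice.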

The situation is very similar to handling a loaf of bread. Each time a person wanted to eat bread, they had to take a knife and cut a piece for themselves. Then came sliced bread in 1928 and changed everything. Processing each individual slice got much easier; you could put jam, butter or cheese on different slices. See Figure 14 below.

Figure 14: Just like a whole loaf needs to be sliced for eating convenience, OFDM slices the spectrum for communication convenience

It was difficult to process a whole loaf before that invention. Similarly, it is difficult to process the collective spectrum for communication purposes. By slicing the spectrum, OFDM not only made it easier to equalize the wireless channel but also made it possible to send different modulation signals on different subcarriers (e.g., subcarriers experiencing a ‘good’ channel can be used to transmit a higher-order modulation signal that translates into more bits within the same time). On a lighter note, now we have a formal proof that OFDM is the best thing since sliced bread.

Summary

  • In time domain, OFDM converts one serial fast bit stream into many parallel slow bit streams.
  • In frequency domain, OFDM segments one wide spectrum into many narrow spectra.

Remarks

The two remarks below address questions some readers might have; everyone else can skip them.

1. A question that can be asked at this stage is why the OFDM figure in articles and textbooks looks different from Figure 13. The reason is that a sinusoidal subcarrier has a spectral shape that is a single impulse. However, when that sinusoid is limited in time, it is equivalent to multiplying it with a rectangular signal. Since the Fourier transform of a rectangle is a sinc signal, and multiplication in time domain is convolution in frequency domain, we get a sinc signal for each subcarrier, shifted in frequency according to the frequency of that subcarrier. The actual OFDM spectrum, though still sliced, is drawn in the figure below.

2. How does OFDM eliminate the ISI arising from the neighboring symbols (like the multi-path in above figures)? For this purpose, a gap can be left between two subsequent symbols so that multi-path components of the first do not interfere with the second. However, for numerous reasons, something known as a cyclic prefix is actually used.


Minimum Shift Keying (MSK) – A Tutorial

MSK as a special case of both non-linear and linear modulation schemes

Minimum Shift Keying (MSK) is one of the most spectrally efficient modulation schemes available. Due to its constant envelope, it is resilient to non-linear distortion, and a Gaussian-filtered variant of it (GMSK) was therefore chosen as the modulation technique for the GSM cell phone standard.

MSK is a special case of Continuous-Phase Frequency Shift Keying (CPFSK) which is a special case of a general class of modulation schemes known as Continuous-Phase Modulation (CPM). It is worth noting that CPM (and hence CPFSK) is a non-linear modulation and hence by extension MSK is a non-linear modulation as well. Nevertheless, it can also be cast as a linear modulation scheme, namely Offset Quadrature Phase Shift Keying (OQPSK), which is a special case of Phase Shift Keying (PSK). MSK is thus a borderline case, and these relationships are illustrated in Figure 1 below.

Figure 1: MSK as a special case of both non-linear and linear modulation schemes

At this point, you may be wondering: how can a modulation be both non-linear and linear? As we see later in this tutorial, MSK is originally a non-linear modulation, but a certain depiction of its actual digital symbols, known as pseudo-symbols, turns it into an OQPSK representation.

Modulation is a simple topic to understand but owing to the above description, MSK can sometimes be an intimidating concept. Here, our purpose is to present it in an uncomplicated manner by building it through the fundamentals.

The starting point is one of the simplest digital modulations possible: FSK.

Binary Frequency Shift Keying (BFSK)

In Frequency Shift Keying (FSK), digital information is transmitted by changing the frequency of a carrier signal. Naturally, Binary FSK (BFSK) is the simplest form of FSK where the two bits 0 and 1 correspond to two distinct carrier frequencies F_0 and F_1 to be sent over the air. The bits can be translated into symbols through the relations

    \[0 \quad \rightarrow \quad -1\]

    \[1 \quad \rightarrow \quad +1\]

This enables us to write the frequencies F_i with i \in \{0,1\} as

    \[F_i = F_c +  (-1)^{i+1} \cdot \Delta_F = F_c \pm \Delta_F\]

where F_c is the nominal carrier frequency and \Delta_F is the peak frequency deviation from this carrier frequency. Consequently,

    \begin{equation*}s(t) = A \cos 2\pi F_i t = A \cos\Big[2\pi \big\{F_c \pm \Delta_F\big\} t\Big]\quad ---- \quad \text{Eq (1)}\end{equation*}

where

    \[0\le t \le T_b\]

Figure 2 below displays a BFSK waveform for a random stream of data at a rate of R_b = 1/T_b. Note that we are not distinguishing between a bit period and a symbol period because both are the same for a binary modulation technique.

Figure 2: A Binary Frequency Shift Keying (BFSK) waveform


As is evident from Figure 2 above, the phase at the bit boundaries is, in general, discontinuous.

Minimum Frequency Spacing

Although any two distinct frequencies F_0 and F_1 can be used for communication purpose, it greatly helps in receiver design if the two distinct signals are orthogonal to each other, i.e.,

    \[\int \limits _0 ^{T_b} s_1(t) s_0(t) ~dt = 0\]

A question that arises at this stage is the following: how close can the two frequencies F_0 and F_1 be? Or in other words, what is the smallest possible value of \Delta_F? The reason for asking this question is spectral efficiency. The closer the two frequencies, the more channels are available for other users in the same spectrum.

For a BFSK case,

    \begin{equation*} \int \limits _0 ^{T_b} \cos 2 \pi F_1 t \cos 2 \pi F_0 t ~dt = 0\end{equation*}

or

    \begin{align*} \int \limits _0 ^{T_b} \cos 2 \pi (F_1 + F_0 )t ~ dt + \int \limits _0 ^{T_b} \cos 2 \pi (F_1 - F_0 )t ~ dt  &= 0 \\ \frac{\sin 2\pi (F_1+F_0)t}{2\pi (F_1+F_0)}\bigg |_{0}^{T_b} + \frac{\sin 2\pi (F_1-F_0)t}{2\pi (F_1-F_0)}\bigg|_{0}^{T_b}&= 0 \\ \frac{\sin 2\pi (F_1+F_0)T_b}{2\pi (F_1+F_0)} + \frac{\sin 2\pi (F_1-F_0)T_b}{2\pi (F_1-F_0)}&= 0 \end{align*}

Since F_1+F_0 = 2F_c \gg 1 appears in the denominator while -1 \le \sin x \le 1, the first term is approximately zero and we can write

    \begin{equation*} \sin 2\pi (F_1-F_0) T_b = 0 \end{equation*}

which is true for (remember \sin k\pi = 0 for integer k)

    \[2\pi (F_1-F_0) T_b = k\pi\]

From here, the orthogonality condition can be concluded as

    \[F_1-F_0 = \frac{k}{2T_b}\]

This also yields the minimum frequency separation for k=1.

    \[F_1-F_0 = \frac{1}{2T_b} = \frac{R_b}{2}\]

for orthogonal signaling. Thus, the peak frequency deviation \Delta_F can be computed as

    \[\Delta_F = \frac{F_1-F_0}{2} = \frac{1}{4T_b} = \frac{R_b}{4}\]
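The orthogonality condition can be checked numerically. The sketch below approximates the integral with a midpoint rule for three tone spacings. The values are assumptions made for illustration: T_b is normalized to 1 and the carrier is set to F_c = 10/T_b so that the sum-frequency term vanishes exactly rather than approximately.

```python
import math

def inner_product(F1, F0, Tb, steps=100_000):
    """Midpoint-rule approximation of the integral of
    cos(2*pi*F1*t) * cos(2*pi*F0*t) over [0, Tb]."""
    dt = Tb / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += math.cos(2 * math.pi * F1 * t) * math.cos(2 * math.pi * F0 * t)
    return total * dt

Tb = 1.0            # bit duration (normalized)
Fc = 10.0 / Tb      # example carrier: an integer number of cycles per bit

for spacing in (1 / (4 * Tb), 1 / (2 * Tb), 1 / Tb):
    F0, F1 = Fc - spacing / 2, Fc + spacing / 2
    print(f"F1 - F0 = {spacing:.2f}/Tb -> inner product = {inner_product(F1, F0, Tb):+.4f}")
```

Under these assumptions the inner product vanishes for spacings of 1/(2T_b) and 1/T_b but not for 1/(4T_b), in line with k=1 being the smallest spacing that gives orthogonal tones.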

MSK as Continuous-Phase FSK

From the above information, a BFSK signal with minimum tone spacing can be constructed by replacing \Delta_F by R_b/4 in Eq (1) as

    \begin{align*}s(t) &= A \cos \Big[2\pi \big\{F_c \pm \frac{R_b}{4}\big\} t\Big], \\ &= A \cos \Big[2\pi F_c t  \pm 2\pi\frac{R_b}{4} t\Big], \quad 0\le t \le T_b \end{align*}

This is a CP-BFSK signal with minimum tone spacing defined over a single bit interval 0 \le t \le T_b. There are two more steps to construct an actual MSK waveform.

  1. In a real communication system, the signal is constructed by transmitting a sequence of bits b_n in succession, where n is the bit index. As we have seen earlier, bits b_n \in \{0,1\} are converted to symbols a_n.

        \[a_n \in \{-1,+1\}\]

    Then, in the interval nT_b \le t \le (n+1)T_b, the above signal can be written as

    \begin{align*} s(t) &= A \cos \Big[2\pi F_c t +  2\pi\frac{a_n R_b}{4} (t-nT_b)\Big], \quad nT_b \le t \le (n+1)T_b \end{align*}

    where n is the bit index within a long bit stream and the second term indicates the underlying baseband message.

  2. Observe in the above equation that the phase continuity is not necessarily maintained from one symbol to the next. To ensure phase continuity, we must add a phase component for each symbol as

        \begin{align*}s(t) &= A \cos \Big[2\pi F_c t +  2\pi\frac{a_n R_b}{4} (t-nT_b) + \theta_n\Big] \\ & \qquad \qquad nT_b \le t \le (n+1)T_b \quad ---- \quad \text{Eq (2)}\end{align*}

    For this purpose, the phase at both sides of t=(n+1)T_b must be equal, as illustrated in Figure 3.

    Figure 3: Phase on both sides of t=(n+1)T_b must be equal to maintain continuity

    Thus, at the instant t=(n+1)T_b, the following equation must be satisfied.

        \begin{align*} 2\pi \frac{a_n R_b}{4} (t-nT_b) + \theta_n \bigg |_{t=(n+1)T_b} &= 2\pi  \frac{a_{n+1} R_b}{4} (t-(n+1)T_b) + \theta_{n+1} \bigg |_{t=(n+1)T_b} \end{align*}

    which can be written as

        \[\theta_{n+1} = \theta_{n} + a_n \frac{\pi}{2}\]

    Another way to write the above recursive relation is

        \begin{equation*}\boxed{\theta_{n} = \theta_{n-1} + a_{n-1} \frac{\pi}{2}}\quad ---- \quad \text{Eq (3)}\end{equation*}

    Assume that \theta_0 = 0. Then,

        \[\theta_1 = a_0 \frac{\pi}{2}\]

        \[\theta_2 = \theta_1 + a_1 \frac{\pi}{2} = \frac{\pi}{2} (a_0+a_1)\]

    In general,

        \[\theta_n = \frac{\pi}{2} \sum \limits_{i=0}^{n-1} a_i \quad ---- \quad \text{Eq (4)}\]

When the phase follows this rule during each bit/symbol interval, the phase continuity is ensured and the resulting waveform is shown for an example sequence in Figure 4.

Figure 4: MSK as Continuous-Phase Binary FSK

Notice how Figure 4 differs from Figure 2 in phase continuity. Below, we plot \theta_n as a function of time in Figure 5 to see how it evolves. One can observe that it indeed changes values in steps of \pi/2 depending on the last data bit.

Figure 5: \theta_n evolving with time
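The recursion of Eq (3), its closed form in Eq (4), and the resulting phase continuity can be verified with a short sketch. The symbol sequence is an arbitrary example, T_b is normalized to 1, and the carrier term 2\pi F_c t is left out because it is continuous on its own.

```python
import math

def theta_sequence(a):
    """Phase offsets from theta_n = theta_{n-1} + a_{n-1} * pi/2 with theta_0 = 0."""
    theta = [0.0]
    for n in range(1, len(a)):
        theta.append(theta[-1] + a[n - 1] * math.pi / 2)
    return theta

def baseband_phase(t, n, a, theta, Tb=1.0):
    """Phase in excess of the carrier during bit n (Eq (2) without 2*pi*Fc*t)."""
    Rb = 1.0 / Tb
    return 2 * math.pi * a[n] * Rb / 4 * (t - n * Tb) + theta[n]

a = [+1, -1, -1, +1, +1, +1, -1]    # example symbols
theta = theta_sequence(a)

# Closed form of Eq (4): theta_n = (pi/2) * sum of all earlier symbols
closed_form = [math.pi / 2 * sum(a[:n]) for n in range(len(a))]
print(all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(theta, closed_form)))

# Continuity: phase at the end of bit n equals phase at the start of bit n+1
for n in range(len(a) - 1):
    boundary = (n + 1) * 1.0
    left = baseband_phase(boundary, n, a, theta)
    right = baseband_phase(boundary, n + 1, a, theta)
    assert math.isclose(left, right, abs_tol=1e-12)
print("phase is continuous at every bit boundary")
```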

MSK as Offset QPSK


First, write Eq (2) assuming A=1 as

    \begin{align*}s(t) &= \cos \Big[2\pi F_c t + 2\pi \frac{a_n R_b}{4} (t-nT_b) + \theta_n\Big] \\ &= \cos \Big[2\pi F_c t + 2\pi \frac{a_n R_b}{4} t + \underbrace{\theta_n - na_n \frac{\pi}{2}}_{\Theta_n}\Big]\quad ---- \quad \text{Eq (5)} \end{align*}

Using Eq (3), we can further manipulate \Theta_n as

    \begin{align*} \Theta_n &= \theta_n - na_n \frac{\pi}{2} \\ &= \theta_{n-1} + a_{n-1}\frac{\pi}{2} - na_n \frac{\pi}{2} \\ &= \theta_{n-1} + a_{n-1}\Big(+1-n+n\Big)\frac{\pi}{2} - na_n \frac{\pi}{2} \\ &= \theta_{n-1} - (n-1)a_{n-1}\frac{\pi}{2} + n \frac{\pi}{2} \Big(a_{n-1} - a_n\Big)\\ &= \Theta_{n-1} + n \frac{\pi}{2} \Big(a_{n-1} - a_n\Big)\quad ---- \quad \text{Eq (6)} \end{align*}

Notice that \Theta_{n} = \Theta_{n-1} when

  • a_n = a_{n-1} because the second term in Eq (6) becomes zero.
  • n is even because the second term in Eq (6) is a multiple of 2\pi (remember that a_{n-1}-a_n is \pm 2 when not zero).

We conclude that \Theta_n can only change when a_n \neq a_{n-1} and n is odd. In that case, it will always change by an odd multiple of \pm \pi. In summary, considering modulo-2\pi operations,

    \[\Theta_n = \begin{cases}             \Theta_{n-1}\pm \pi   &  a_n \neq a_{n-1}, n~\text{odd} \\ \Theta_{n-1} & \text{otherwise} \end{cases}\quad ---- \quad \text{Eq (7)}\]
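Eq (7) can be checked by brute force on a random symbol sequence. To avoid floating point issues, the sketch below keeps all phases as integers in units of \pi/2, so a change of \pi corresponds to 2 and the modulo-2\pi operation becomes modulo 4. The sequence length and the seed are arbitrary choices.

```python
import math
import random

random.seed(0)                                       # reproducible example data
a = [random.choice((-1, +1)) for _ in range(20)]     # example symbols

# theta_n from Eq (4) and Theta_n = theta_n - n*a_n*pi/2, both in units of pi/2
theta_units = [sum(a[:n]) for n in range(len(a))]
Theta_units = [theta_units[n] - n * a[n] for n in range(len(a))]

for n in range(1, len(a)):
    step = (Theta_units[n] - Theta_units[n - 1]) % 4  # change modulo 2*pi
    if a[n] != a[n - 1] and n % 2 == 1:
        assert step == 2                              # a jump of pi
    else:
        assert step == 0                              # no change

cos_Theta = [round(math.cos(u * math.pi / 2)) for u in Theta_units]
print(cos_Theta)                                      # every entry is +1 or -1
```

Since \Theta_n only ever moves by multiples of \pi starting from \Theta_0 = 0, its cosine is always \pm 1 and its sine is always zero, which is the property used in the expansion that follows.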

Next, we use the following identities

    \begin{align*} \cos (\alpha\pm \beta) &= \cos \alpha\cos \beta \mp \sin \alpha \sin \beta \\ \sin (\alpha\pm \beta) &= \sin \alpha\cos \beta \pm \cos \alpha \sin \beta \\ \cos (-\alpha) &= \cos \alpha \\ \sin (-\alpha) &= - \sin \alpha \end{align*}

to open Eq (5).

    \begin{align*} s(t) &= \cos \Big[2\pi F_c t + 2\pi \frac{a_n R_b}{4} t + \Theta_n \Big] \\ &= \cos 2\pi F_c t \cdot \underbrace{\cos \Big(2\pi \frac{a_nR_b}{4}t  + \Theta_n\Big)}_{\text{Term 1}} - \\ &\qquad \qquad \qquad \sin 2\pi F_c t\cdot \underbrace{\sin \Big(2\pi  \frac{a_nR_b}{4}t  + \Theta_n\Big)}_{\text{Term 2}} \end{align*}

Now we process both terms one by one.

    \begin{align*} \text{Term 1} &= \cos \Big(2\pi \frac{a_nR_b}{4}t  + \Theta_n \Big) \\ &= \cos \Big(2\pi \frac{a_nR_b}{4}t\Big)  \cos \Theta_n - \sin \Big(2\pi \frac{a_nR_b}{4}t\Big) \sin \Theta_n \\ &= \cos \Big( 2\pi  \frac{a_nR_b}{4}t\Big)  \cos \Theta_n = d_{I,n} \cos 2\pi \frac{R_b}{4}t\\ \end{align*}

because \Theta_0 = 0 implies \sin \Theta_n = 0, see Eq (7). Furthermore, a_n \in \{-1,+1\} and \cos (\pm \alpha) = \cos \alpha. We have also defined

    \[d_{I,n} = \cos \Theta_n\]

Similarly, Term 2 can be expanded as

    \begin{align*} \text{Term 2} &= \sin \Big(2\pi  \frac{a_nR_b}{4}t  + \Theta_n \Big) \\ &= \sin \Big(2\pi \frac{a_nR_b}{4}t \Big) \cos \Theta_n + \cos\Big(2\pi \frac{a_nR_b}{4}t\Big) \sin \Theta_n \\ &= \cos \Theta_n \cdot a_n \sin 2\pi \frac{R_b}{4}t = -d_{Q,n} \sin 2\pi \frac{R_b}{4}t  \end{align*}

since \sin \Theta_n = 0 as before and \sin (-\alpha) = - \sin (\alpha). Here, we have defined d_{Q,n} as

    \[d_{Q,n} = -a_n \cdot \cos \Theta_n = -a_n \cdot d_{I,n}\]

Finally, plugging both Term 1 and 2 into s(t) for the n-th bit period,

    \[s(t) = d_{I,n} \cos 2\pi \frac{R_b}{4}t \cdot  \cos 2\pi F_c t + d_{Q,n} \sin 2\pi \frac{R_b}{4}t  \cdot \sin 2\pi F_c t\]

Using the identity \sin \alpha = \cos (\alpha - \pi/2), the above equation can be revised as

    \begin{align*} s(t) &= d_{I,n} \cos 2\pi \frac{R_b}{4}t \cdot  \cos 2\pi F_c t + d_{Q,n} \cos \Big(2\pi \frac{R_b}{4}t - \frac{\pi}{2}  \Big) \cdot \sin 2\pi F_c t \\ &= d_{I,n} \cos 2\pi \frac{R_b}{4}t \cdot  \cos 2\pi F_c t + \\&\qquad \qquad \qquad d_{Q,n} \cos 2\pi \frac{R_b}{4}(t-T_b) \cdot \sin 2\pi F_c t\quad ---- \quad \text{Eq (8)}\\ &= d_{I,n} p(t) \cdot  \cos 2\pi F_c t + d_{Q,n} p\Big(t-\frac{1}{2}\cdot 2T_b\Big) \cdot \sin 2\pi F_c t \end{align*}

where p(t) is a half-sinusoidal pulse shape of period 4T_b.

The above expression resembles an OQPSK waveform if the bit rate for d_{I,n} and d_{Q,n} is R_b/2 or bit period is 2T_b, since the time offset in the \sin term must be half the bit period for OQPSK.
So we have to check if d_{I,n} and d_{Q,n} change values every other symbol.

From the definition, d_{I,n} = \cos \Theta_n, and Eq (7) tells us that \Theta_n can only change values for odd n. Hence, d_{I,n} is indeed an R_b/2 rate stream.

On the other hand, d_{Q,n}=-a_n \cdot d_{I,n}. Again, Eq (7) says that d_{I,n} can only change when a_n changes but that means that -a_n \cdot d_{I,n} stays the same. Therefore, d_{Q,n} can only change for even n and when a_n changes values. Consequently, d_{Q,n} is also an R_b/2 rate stream.

Since d_{I,n} changes values for odd n while d_{Q,n} does the same for even n, d_{Q,n} is offset with respect to d_{I,n} by T_b seconds, the same amount as \cos 2\pi \frac{R_b}{4}t in Eq (8). Summing up everything so far, MSK can indeed be represented as an OQPSK waveform. The data rate is the same as in CP-BFSK format since two bits are being transmitted in two bit periods here as well.

This representation is illustrated in Figure 6.

Figure 6: MSK as Offset QPSK (panels: in-phase symbols and quadrature symbols)

Compare Figure 6 with Figure 4. I did not choose separate blue and red colors in this figure so as not to confuse the d_n with the a_n (that could suggest that d_{I,n} consists of the odd a_n and d_{Q,n} of the even a_n, which is not correct).

Observe from Figure 6 that d_{I,n} can change values only every 2T_b, at odd multiples of T_b, while d_{Q,n} can change values only every 2T_b, at even multiples of T_b. Due to this offset behavior, at every multiple of T_b, either the I or the Q waveform is zero while the other reaches its peak magnitude. This is how the phase remains continuous during symbol transitions.
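As a numerical sanity check of the whole derivation, the sketch below evaluates the CP-FSK form of Eq (2) and the OQPSK form of Eq (8) at a few instants within each bit and compares them. It also lists where d_{I,n} and d_{Q,n} change. The carrier frequency, the symbol sequence and the sampling instants are arbitrary choices for illustration, with A = 1 and T_b normalized to 1.

```python
import math

Tb, Fc = 1.0, 4.0                       # normalized bit time, example carrier frequency
Rb = 1.0 / Tb
a = [+1, +1, -1, +1, -1, -1, -1, +1]    # example symbols

theta = [math.pi / 2 * sum(a[:n]) for n in range(len(a))]            # Eq (4)
Theta = [theta[n] - n * a[n] * math.pi / 2 for n in range(len(a))]
d_I = [round(math.cos(Th)) for Th in Theta]                          # d_{I,n} = cos(Theta_n)
d_Q = [-a[n] * d_I[n] for n in range(len(a))]                        # d_{Q,n} = -a_n d_{I,n}

def cpfsk(t, n):
    """Eq (2) with A = 1."""
    return math.cos(2 * math.pi * Fc * t
                    + 2 * math.pi * a[n] * Rb / 4 * (t - n * Tb) + theta[n])

def oqpsk(t, n):
    """Eq (8): half-sinusoid pulses on the I and Q carriers."""
    return (d_I[n] * math.cos(2 * math.pi * Rb / 4 * t) * math.cos(2 * math.pi * Fc * t)
            + d_Q[n] * math.cos(2 * math.pi * Rb / 4 * (t - Tb)) * math.sin(2 * math.pi * Fc * t))

max_error = max(abs(cpfsk((n + f) * Tb, n) - oqpsk((n + f) * Tb, n))
                for n in range(len(a)) for f in (0.0, 0.25, 0.5, 0.75))
print(f"largest difference between the two forms: {max_error:.2e}")

I_changes = [n for n in range(1, len(a)) if d_I[n] != d_I[n - 1]]
Q_changes = [n for n in range(1, len(a)) if d_Q[n] != d_Q[n - 1]]
print("d_I changes at n =", I_changes)   # odd indices only
print("d_Q changes at n =", Q_changes)   # even indices only
```

The two forms agree up to floating point rounding, and the change instants of the two pseudo-symbol streams interleave as expected for an offset scheme.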

Figure 7 draws \Theta_n as a supplement to above findings.

Figure 7: \Theta_n evolving with time

On a final note, observe from some equations (e.g., (2) and (5)) that the continuous phase has two parts, one of which arises due to the delay of the n-th symbol. This information can be used to refine a phase estimate.

We are also clear now why MSK can act both like a linear and a non-linear modulation. In reality, MSK is a non-linear modulation scheme (see Eq(2)) for a_n. Pseudo-symbols d_n themselves are non-linear functions of information bits. So it is only from d_n viewpoint that MSK can be seen as a linear modulation scheme.



A Generalization: From MSK to Continuous-Phase FSK (CP-FSK)

After understanding MSK, it can be expanded into a general modulation scheme known as Continuous-Phase Frequency Shift Keying (CP-FSK). As we see in the next section, CP-FSK is a special case of Continuous-Phase Modulation (CPM), which is a class of non-linear digital modulation schemes in which the phase of the signal is constrained to be continuous from one symbol to the next. As with MSK, the most attractive feature of such a signal is that it has a constant envelope as a result of the amplitude being independent of the modulating information. Consequently, a CPM signal can be amplified without distortion by a non-linear amplifier operating near the saturation point allowing low cost and more efficiency as compared to a linear amplifier.

Usually CP-FSK (and hence CPM) is not a straightforward concept to master. However, starting from MSK, its basics can easily be understood by building on the same expressions. For this purpose,

  • Remember that \Delta_F = R_b/4.
  • Modulation symbols can carry more than 1 information bit. For example, 00, 01, 11 and 10 can be sent through four symbols a_n \in \{-3,-1,+1,+3\}. In general, a_n is a sequence from the alphabet \{\pm 1, \pm 3, \cdots,\pm (M-1)\}. Thus, we call a_n symbols and replace T_b with T_M (a symbol time) from here onwards. The symbol rate R_M is then 1/T_M.

Now let us start with rewriting Eq (2) and Eq (4) and then substituting the former into the latter.

    \[\theta_n = \frac{\pi}{2} \sum \limits_{i=0}^{n-1} a_i\]

    \begin{align*} s(t) &= A \cos \Big[2\pi F_c t +  2\pi\frac{a_n R_M}{4} (t-nT_M) + \theta_n\Big] \\  &=A \cos \Big[2\pi F_c t +  2\pi \frac{1}{2}\cdot \frac{t-nT_M}{2T_M}a_n + \frac{\pi}{2} \sum \limits_{i=0}^{n-1} a_i\Big] \\ &= A \cos \Big[2\pi F_c t +  2\pi h a_n \cdot q(t-nT_M) + \pi h \sum \limits_{i=0}^{n-1} a_i\Big]  \end{align*}

where we have defined the following terms.

    \begin{align*} h &= \frac{1}{2} \\ q(t) &= \begin{cases} 0 & t \le 0 \\  \frac{t}{2T_M} & 0 \le t \le T_M \\  \frac{1}{2} & t \ge T_M \end{cases} \end{align*}

It turns out that h is a general concept known as the modulation index, which in the case of MSK is equal to 1/2. In general, h is defined as

    \[h = 2 \frac{\Delta_F}{R_M}\]

which describes the frequency deviation relative to the symbol rate: 2\Delta_F is the spacing between adjacent tones, and h is that spacing as a fraction of R_M. With the definition of h, we can write

    \[s(t) = A \cos \Big[2\pi F_c t +  2\pi h \Big( a_n q(t-nT_M)+ \frac{1}{2} \sum \limits_{i=0}^{n-1} a_i\Big)\Big]\]

The above equation is true for the interval nT_M \le t \le (n+1)T_M. Considering from its definition that q(t) is 1/2 after a symbol interval, we can also write s(t) as

    \begin{align*}s(t) &= A \cos \Big[2\pi F_c t  + 2\pi h  \sum \limits _{i=0}^{n} a_i q(t - iT_M) \Big]\quad ---- \quad \text{Eq (9)}\\ &= A \cos \Big[2\pi F_c t  + 2\pi h \int _{-\infty}^{t} \Big( \sum \limits _{i=0}^n a_i  g(u - iT_M)\Big) du \Big] \end{align*}

where g(t) is defined as the derivative of q(t) as

    \[q(t) = \int \limits _{0}^{t} g(u) du\]

Notice that in the present case with q(t) defined as above, its derivative g(t) is a rectangular pulse shape. Consequently, \sum \limits _i a_i g(u - iT_M) is the standard baseband Pulse Amplitude Modulated (PAM) waveform with rectangular pulse shape and this is how its discontinuities are transformed into a continuous-phase signal.

It is refreshing to conclude that the starting point for a continuous-phase modulated signal is a standard pulse amplitude waveform, in which the discontinuities are smoothed out by the integral operation.
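The integral view can be checked numerically. The following is a minimal sketch, assuming time is normalized so that T_M = 1 and using a hypothetical helper `cpfsk_phase` with a simulation parameter `sps` (samples per symbol), neither of which comes from the derivation above. It integrates a rectangular PAM waveform and compares the phase at symbol boundaries with \pi h \sum a_i.

```python
import numpy as np

def cpfsk_phase(symbols, h=0.5, sps=100):
    """Excess phase 2*pi*h * (running integral of a rectangular PAM waveform).

    Time is normalized so that T_M = 1; sps is a simulation choice.
    """
    symbols = np.asarray(symbols, dtype=float)
    # g(t) = 1/(2 T_M) on 0 <= t < T_M, so the PAM waveform equals a_i / 2 per symbol
    pam = np.repeat(symbols, sps) / 2.0
    dt = 1.0 / sps
    # Running integral (rectangle rule), starting from zero phase at t = 0
    integral = np.concatenate(([0.0], np.cumsum(pam) * dt))
    return 2 * np.pi * h * integral

a = np.array([+1, -1, -1, +1, +1, +1])   # binary symbols, as in MSK
h, sps = 0.5, 100
phase = cpfsk_phase(a, h=h, sps=sps)

# Phase at the symbol boundaries versus pi*h times the running symbol sum
boundaries = phase[::sps]
expected = np.pi * h * np.concatenate(([0.0], np.cumsum(a)))

# Largest phase step between adjacent samples; it shrinks as sps grows
max_step = np.max(np.abs(np.diff(phase)))
```

The PAM waveform `pam` jumps at every symbol change, yet `phase` moves by at most \pi h / sps between adjacent samples, which is the smoothing effect of the integral described above.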

More Generalization: From Continuous-Phase FSK (CP-FSK) to Continuous-Phase Modulation (CPM)

Now referring to Eq (9), the expression for a CPM signal can be written as

    \[s(t) = A \cos \Big[2\pi F_c t  + 2\pi  \sum \limits _{i=0}^{n} h_i \cdot a_i q(t - iT_M) \Big]\qquad nT_M \le t \le (n+1)T_M\]

Here, h_i is a sequence of modulation indices, each of which is a ratio of two relatively prime integers.

    \[h = \frac{k}{p}\]

This is the case of multi-h CPM. When all h_i=h, the modulation index is the same for all symbols. This is the category we saw in CP-FSK and MSK above.

Finally, q(t) is a waveform shape known as the phase response of the modulator which is normalized as

    \[q(t) = \begin{cases}            0   &  t \le 0 \\ \int \limits _{0}^{t} g(u) du & 0 \le t \le LT_M \\            \frac{1}{2}  & t \ge LT_M            \end{cases}\]

where L is the length of a pulse g(t) in symbols. Since angular frequency is the rate of change of phase, this g(t) is the derivative of q(t) and known as the frequency response of the modulator.

    \[g(t) = \frac{dq(t)}{dt}\]

For any pulse shape g(t), L=1 results in full response CPM, while the other case L>1 is the partial response CPM. Commonly used pulse shapes are rectangular (as used in the case of MSK), raised cosine and Gaussian.
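As a rough illustration of the pulse shapes, the sketch below builds q(t) numerically from g(t) for a rectangular and a raised cosine pulse of length L symbols. The closed forms used for g(t) are the usual textbook expressions and are an assumption here, as are the helper names `frequency_pulse` and `phase_response` and the parameter `sps`. Time is normalized so that the symbol time is 1.

```python
import numpy as np

def frequency_pulse(shape, L, sps):
    """Samples of g(t) on 0 <= t < L (symbol time = 1), with area 1/2."""
    t = (np.arange(L * sps) + 0.5) / sps       # midpoint of each sample interval
    if shape == "rect":
        return np.full_like(t, 1.0 / (2 * L))
    if shape == "raised_cosine":
        return (1.0 - np.cos(2 * np.pi * t / L)) / (2 * L)
    raise ValueError("unknown pulse shape")

def phase_response(g, sps):
    """q(t) as the running integral of g(t), with q(0) = 0."""
    return np.concatenate(([0.0], np.cumsum(g) / sps))

results = {}
for shape in ("rect", "raised_cosine"):
    for L in (1, 3):                           # full response and partial response
        q = phase_response(frequency_pulse(shape, L, sps=200), sps=200)
        results[(shape, L)] = q
        print(f"{shape:13s} L={L}: q(0) = {q[0]:.3f}, q(L) = {q[-1]:.3f}")
```

In every case q(t) starts at 0 and settles at 1/2 after L symbols; only the path in between depends on the pulse shape.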

We can see that CPM is in fact a very large class of modulation schemes owing to different pulse shapes g(t), modulation indices h_i and the modulation alphabet size M. That is both a blessing and a curse: a blessing due to the remarkable variety of signals it produces, all yielding excellent spectral and power efficiencies, and a curse due to the high receiver complexity. By virtue of Moore’s law, this is becoming less of an issue, thanks to powerful baseband processors in the modern age.

References

[1] Mengali and D’Andrea, Synchronization Techniques for Digital Receivers, 1997.

[2] John Proakis, Digital Communications, 4th ed, 2001.

Basics of Synchronization


In every digital communication system, the Tx has the easier role of signal generation while the Rx has the tougher job of figuring out the intended message, much like solving a puzzle posed by someone else. Estimating and compensating for the frequency, phase and timing offsets between Tx and Rx oscillators is one such challenge. The solution depends on many factors, such as whether some part of the data is known (called a ‘training sequence’) and whether the synchronizer needs to be one-shot or continuously updating.

Known Data Availability


Depending on the availability of known data, synchronization theory in digital systems is largely based on the following three approaches.

  • Data-aided: To help the Rx in many systems, the Tx inserts symbols already agreed upon with the Rx within the message such that the Rx can acquire unknown parameters through knowledge of this ‘data’. This is shown in the figure below. Performing synchronization using this training is known as data-aided synchronization. Most widespread wireless communication systems in today’s world such as LTE and WiFi implement algorithms based upon this approach.

    Known training sequence (a preamble) is prepended, or training can also be inserted periodically within the message

    One problem with data-aided synchronization strategy is the waste of resources. The power and time spent on transmitting training sequence could have been used for sending more data: the spectral efficiency of the system is reduced by a non-negligible factor. Assuming a training length of N_{\textmd{Train}} and message length of N_{\textmd{Data}}, the spectral efficiency decreases by a factor of

    (1)   \begin{equation*}             \frac{N_{{\textmd{Data}}}}{N_\textmd{Train}+N_{{\textmd{Data}}}}         \end{equation*}

  • Decision-directed: To avoid this penalty, alternative techniques need to be adopted. Extending the above idea, once the Rx starts demodulating the signal and making decisions, it can use those decisions as known data in order to successfully track the changes in nuisance parameters, such as slowly changing carrier phase offset. This technique is known as decision-directed synchronization. It is evident that decision-directed approach can work well only when the detector decisions are correct such as in high SNR case. Otherwise, a wrong decision leads to a poor estimate, a poor estimate leads to a wrong decision in the next cycle, and the chain continues in the form of error propagation.
  • Non-data-aided or Blind: In other situations, however, neither a preamble nor the decisions can be used. Here, some particular characteristics of the incoming signal can be employed to estimate the unknown parameters. This is known as non-data-aided or blind synchronization technique. Adopting a non-data-aided synchronization approach can retain the spectral efficiency but its convergence is slow because a large amount of data needs to be processed to average out the effects of noise and find a reliable estimate.
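To get a feel for the training overhead in Eq (1) of the data-aided approach, here is a small sketch. The helper name `training_efficiency` and the packet sizes are hypothetical, chosen only for illustration.

```python
def training_efficiency(n_train, n_data):
    """Fraction of transmitted symbols that carry data, as in Eq (1)."""
    if n_train < 0 or n_data <= 0:
        raise ValueError("need n_train >= 0 and n_data > 0")
    return n_data / (n_train + n_data)

# Hypothetical packet sizes: a fixed 64-symbol preamble and growing messages
for n_train, n_data in [(64, 256), (64, 1024), (64, 4096)]:
    eff = training_efficiency(n_train, n_data)
    print(f"N_Train = {n_train}, N_Data = {n_data}: factor = {eff:.3f}")
```

With a 64-symbol preamble, a 256-symbol message keeps only 80% of the spectral efficiency, while longer messages amortize the same preamble much better.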

The benefits, drawbacks and conditions for these synchronization approaches are summarized in Table below.

Benefits and drawbacks of data-aided, decision-directed and non-data-aided synchronization approaches

Feedback or Feed-forward


Irrespective of the data knowledge, synchronization blocks can be implemented in one of the following two manners:

  • Feed-Forward, Open Loop, or Batch Processing: There are many applications in data communications where the transmission occurs in a start and stop manner with periods of inactivity in between. This is known as burst mode communication. Here, the Tx forms a complete packet by inserting a sequence of known symbols — also called a preamble or a training sequence — before the actual message symbols as shown in the above Figure. In burst-mode, samples of the received signal are processed to establish a direct one-shot estimate of the target parameter through batch processing. Signal processing to establish the expression for the estimate is based on an algorithm derived from the mathematical structure of the Rx signal. Once this parameter is determined, it is corrected from the Rx signal without feedback to any previous block. In case of phase synchronization for example, the phase estimate can be used to de-rotate all data in that burst.
  • Feedback, Closed Loop or Recursive: Many other communication links work in continuous mode where the signal is transmitted either at all times or for a long duration. Here, fast acquisition is not as important and the objective is to lock onto the target parameter within a reasonable time after the arrival of the received signal. So an estimate of the error signal (for example e_{\theta}= \theta_{\Delta} - \hat \theta_{\Delta}) is derived which forms the basis of a corrective signal that is fed back to a compensation unit. A Phase Locked Loop (PLL) can be employed for this purpose with some modifications discussed later. Feedback acquisition can work blindly, in a decision-directed manner or can also take help from training inserted periodically within the message as shown in Figure above. This category of processing has an inherent ability to automatically track slowly varying parameter changes.
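The two implementation styles can be contrasted on a toy phase synchronization problem. This is only a sketch under stated assumptions: the one-shot estimator (the angle of the correlation between received and known symbols) and the first-order decision-directed loop are common textbook choices, not algorithms derived in this article, and the names, loop gain, noise level and preamble length are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def qam4_symbols(n):
    """Random 4-QAM symbols with a_I, a_Q in {-1, +1}."""
    return rng.choice([-1.0, 1.0], n) + 1j * rng.choice([-1.0, 1.0], n)

def feedforward_phase_estimate(z_train, a_train):
    """One-shot, data-aided estimate from a known preamble (batch processing)."""
    return np.angle(np.sum(z_train * np.conj(a_train)))

def feedback_phase_tracker(z, gain=0.1):
    """First-order decision-directed loop; returns the estimate after each symbol."""
    theta_hat = 0.0
    history = np.empty(len(z))
    for m, sample in enumerate(z):
        y = sample * np.exp(-1j * theta_hat)                # compensate with current estimate
        decision = np.sign(y.real) + 1j * np.sign(y.imag)   # nearest 4-QAM symbol
        error = np.angle(y * np.conj(decision))             # residual phase error
        theta_hat += gain * error                           # feed the correction back
        history[m] = theta_hat
    return history

theta = np.deg2rad(20.0)                  # true phase offset (illustrative value)
a = qam4_symbols(400)
noise = 0.05 * (rng.standard_normal(400) + 1j * rng.standard_normal(400))
z = a * np.exp(1j * theta) + noise        # symbol-spaced matched filter outputs

theta_ff = feedforward_phase_estimate(z[:32], a[:32])   # first 32 symbols act as preamble
theta_fb = feedback_phase_tracker(z)[-1]                 # loop state after 400 symbols

print(f"feed-forward: {np.rad2deg(theta_ff):.2f} deg, feedback: {np.rad2deg(theta_fb):.2f} deg")
```

The feed-forward estimate is available as soon as the preamble has been processed and is then applied to the whole burst, whereas the loop approaches the same value gradually and keeps adjusting afterwards, which is what lets it follow slow changes.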

In summary, there are 3 \times 2=6 possible ways to implement a synchronizer depending on the knowledge of data and the loop being closed or open. Algorithms can be designed for many, though not all, of these topologies, and we will see examples in other articles.

What is Carrier Phase Offset and How It Affects the Symbol Detection


In case of Quadrature Amplitude Modulation (QAM) and other passband modulation schemes, Rx has no information about carrier phase of the Tx oscillator. To see the effect of the carrier phase offset, consider that a transmitted passband signal consists of two PAM waveforms in I and Q arms denoted by v_I(t) and v_Q(t) respectively and combined as

(1)   \begin{equation*}         s(t) = v_I(t) \sqrt{2} \cos 2\pi F_C t  - v_Q(t) \sqrt{2}\sin  2\pi F_C t     \end{equation*}

Here, F_C is the carrier frequency and v_I(t) and v_Q(t) are the continuous versions of sampled signals v_I(nT_S) and v_Q(nT_S) given by

(2)   \begin{equation*}       \begin{aligned}         v_I(nT_S)\: &= \sum _{i} a_I[i] p(nT_S-iT_M) \\         v_Q(nT_S) &= \sum _{i} a_Q[i] p(nT_S-iT_M)       \end{aligned}     \end{equation*}

In the above equation, a_I[m] and a_Q[m] are the inphase and quadrature components of the m^{th} symbol, p(nT_S) are the samples of a square-root Nyquist pulse with support -LG \le n \le LG and T_S and T_M are the sample time and symbol time, respectively.

In the absence of noise, the received signal for a passband waveform is the same as the transmitted signal except a carrier phase mismatch, i.e., we ignore every other distortion in the received signal except the phase offset \theta_{\Delta}.

After bandlimiting the incoming signal through a bandpass filter, it is sampled by the ADC operating at F_S=1/T_S samples/second to produce

    \begin{align*}         r(nT_S) &= v_I(nT_S) \sqrt{2}\cos \left(2\pi F_C nT_S + \theta_{\Delta}\right) - v_Q(nT_S)\sqrt{2} \sin\left( 2\pi F_C nT_S + \theta_{\Delta}\right) \\                 &= v_I(nT_S) \sqrt{2}\cos \left(2\pi \frac{k_C}{N}n + \theta_{\Delta}\right) - v_Q(nT_S) \sqrt{2}\sin \left(2\pi \frac{k_C}{N}n + \theta_{\Delta}\right)     \end{align*}

where the relation F/F_S = k/N is used and k_C corresponds to F_C, and the angle \theta_{\Delta} is the phase difference between the incoming carrier and the local oscillator at the Rx.

To produce a complex baseband signal from the received signal, the samples of this waveform are input to a mixer which multiplies them with discrete-time quadrature sinusoids \sqrt{2}\cos 2\pi (k_C/N)n in the I arm and -\sqrt{2} \sin 2\pi (k_C/N)n in the Q arm. We continue the derivation for the I part; the derivation for Q is very similar and is left to the reader as an exercise. Using the identities \cos(A)\cos(B) = 0.5 \left( \cos(A+B) + \cos(A-B) \right) and \sin(A)\cos(B) = 0.5 \left( \sin(A+B) + \sin(A-B) \right),

    \begin{equation*}         \begin{aligned}             x_I(nT_S) &= r(nT_S) \cdot \sqrt{2}\cos 2\pi \frac{k_C}{N}n \: \\                   &= v_I(nT_S)\left\{\cos\theta_{\Delta}  + \underbrace{\cos \left(2\pi \frac{2k_C}{N}n + \theta_{\Delta}\right)}_{\text{Double frequency term}} \right\} - \\                    & \qquad  \qquad \quad v_Q(nT_S) \left\{\sin \theta_{\Delta} + \underbrace{\sin \left(2\pi \frac{2k_C}{N}n + \theta_{\Delta} \right)}_{\text{Double frequency term}} \right\}         \end{aligned}     \end{equation*}

The matched filter output is written as

    \begin{equation*}         \begin{aligned}             z_I(nT_S) &= x_I(nT_S) * p(-nT_S) \\                       &= \left(v_I(nT_S) \cos \theta_{\Delta} - v_Q(nT_S) \sin\theta_{\Delta} + \right. \\                       &\hspace{.6in}\left.\text{Double frequency terms}\right)* p(-nT_S)         \end{aligned}     \end{equation*}

The double frequency terms in the above equation are filtered out by the matched filter h(nT_S) = p(-nT_S), which also acts as a lowpass filter due to its spectrum limitation in the range -0.5 R_M \le F \le +0.5R_M, where R_M is the symbol rate. Writing the definitions of v_I(nT_S) and v_Q(nT_S) from Eq (2),

    \begin{equation*}         \begin{aligned}             z_I(nT_S) = \sum_i \Big\{ a_I[i]\cos \theta_{\Delta} - a_Q[i]\sin\theta_{\Delta} \Big\} r_p(iT_M - nT_S)         \end{aligned}     \end{equation*}

where r_p(nT_S) comes into play from the definition of auto-correlation function. To generate symbol decisions, T_M-spaced samples of the matched filter output are required at n = mL = mT_M/T_S. Downsampling the matched filter output generates

    \begin{equation*}         \begin{aligned}             z_I(mT_M) &= z_I(nT_S) \bigg| _{n = mL = mT_M/T_S} \\                       &= \sum \limits _i \Big\{ a_I[i]\cos \theta_{\Delta} - a_Q[i]\sin \theta_{\Delta}\Big\} r_p(iT_M - mT_M)         \end{aligned}     \end{equation*}

For a square-root Nyquist pulse that satisfies the no-ISI criterion, r_p(iT_M - mT_M) is zero except for i = m. Thus,

    \begin{equation*}       \begin{aligned}         z_I(mT_M) = a_I[m] \cos \theta_{\Delta} - a_Q[m]\sin\theta_{\Delta}       \end{aligned}     \end{equation*}

A similar derivation for Q arm yields the final expression for the symbol-spaced samples in the presence of phase offset \theta_{\Delta}.

(3)   \begin{equation*}       \begin{aligned}         z_I(mT_M) = a_I[m] \cos \theta_{\Delta} - a_Q[m]\sin\theta_{\Delta}       \\         z_Q(mT_M) = a_I[m] \sin \theta_{\Delta} + a_Q[m]\cos\theta_{\Delta}       \end{aligned}     \end{equation*}

From the phase rotation rule of complex numbers, we know that this expression is nothing but counterclockwise rotation by an angle \theta_{\Delta}. In polar form, this expression can be written as

(4)   \begin{equation*}         \begin{aligned}           |z(mT_M)| &= \sqrt{a_I^2[m] + a_Q^2[m]} \\           \measuredangle z(mT_M) &= \measuredangle \Big(a_Q[m],a_I[m]\Big) + \theta_{\Delta}         \end{aligned}     \end{equation*}

In conclusion, a mismatch of \theta_{\Delta} between incoming carrier and Rx oscillator rotates the desired outputs a_I[m] and a_Q[m] on the constellation plane by an angle \theta_{\Delta}. This is drawn in the scatter plot of Figure below for a 4-QAM constellation. Keep in mind that the blue circles are not one but several symbols mapped over one another due to the similar phase shift.

Scatter plot for a 4-QAM constellation in the presence of carrier phase offset and no noise
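The whole chain from Eq (1) to Eq (4) can be reproduced in a few lines. The sketch below makes two simplifying assumptions that are not part of the derivation above: a unit-energy rectangular pulse stands in for the square-root Nyquist pulse, and the carrier is placed at a quarter of the sample rate so that the double frequency terms sum to exactly zero over each symbol (with a Raised Cosine pulse they would only be strongly attenuated). All numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

L = 8                        # samples per symbol, T_M / T_S
M = 16                       # number of symbols
f_c = 0.25                   # k_C / N in cycles per sample
theta = np.deg2rad(30.0)     # carrier phase offset theta_Delta

# Unit-energy rectangular pulse: a simple stand-in for a square-root Nyquist pulse
p = np.ones(L) / np.sqrt(L)

a_I = rng.choice([-1.0, 1.0], M)
a_Q = rng.choice([-1.0, 1.0], M)

# PAM waveforms of Eq (2): every symbol is shaped by the pulse p
v_I = np.kron(a_I, p)
v_Q = np.kron(a_Q, p)

# Sampled received signal with the unknown phase offset
n = np.arange(M * L)
r = (v_I * np.sqrt(2) * np.cos(2 * np.pi * f_c * n + theta)
     - v_Q * np.sqrt(2) * np.sin(2 * np.pi * f_c * n + theta))

# Mixer: the local quadrature sinusoids know nothing about theta
x_I = r * np.sqrt(2) * np.cos(2 * np.pi * f_c * n)
x_Q = -r * np.sqrt(2) * np.sin(2 * np.pi * f_c * n)

# Matched filter p(-n), then one sample per symbol at maximum overlap
z_I = np.convolve(x_I, p[::-1])[L - 1::L]
z_Q = np.convolve(x_Q, p[::-1])[L - 1::L]

# Prediction of Eq (3)
pred_I = a_I * np.cos(theta) - a_Q * np.sin(theta)
pred_Q = a_I * np.sin(theta) + a_Q * np.cos(theta)

# Complex view: a counterclockwise rotation by theta, as in Eq (4)
a = a_I + 1j * a_Q
z = z_I + 1j * z_Q
angle_shift = np.angle(z * np.conj(a))

print(np.allclose(z_I, pred_I), np.allclose(z_Q, pred_Q))
print(np.rad2deg(angle_shift).round(3))
```

Every symbol keeps its magnitude and is turned by the same 30 degrees, which is why all samples of a given symbol value land on one rotated constellation point in the noiseless scatter plot.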

Cross-talk


In Eq (3), start with \theta_{\Delta}=0 and observe that the I and Q outputs are a_I[m] and a_Q[m], respectively. This implies that signals in I and Q arms are completely independent of each other. Gradually increasing \theta_{\Delta} has two effects:

  • Since \cos \theta_{\Delta} < \cos 0 = 1, amplitude of a_I[m] in z_I(mT_M) reduces. The same phenomenon happens with a_Q[m] in z_Q(mT_M).
  • Since \sin \theta_{\Delta} > \sin 0 = 0, interference of a_Q[m] in z_I(mT_M) increases as well as that of a_I[m] in z_Q(mT_M).

This interference between I and Q components is known as cross-talk. Cross-talk increases with \theta_{\Delta} until for a 90^\circ difference, a_I[m] appears at Q output and -a_Q[m] at I output.

The effect of this cross-talk on a Raised Cosine shaped 4-QAM waveform with excess bandwidth \alpha=0.5 is shown in Figure below for a phase difference \theta_{\Delta} = 30^\circ. Observe the first sample: it is (-1,+1) in quadrant II. After phase rotation, I part moved towards left thus increasing its amplitude and Q moved downwards reducing its amplitude. This is evident through the first samples in the Figure. A similar argument holds for all other symbols.

Effect of 30 degrees phase rotation on a time domain 4-QAM waveform for a Raised Cosine filter with excess bandwidth 0.5. Observe how the samples at optimal locations move away from the ideal symbol amplitudes
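The numbers behind this example follow directly from Eq (3). A minimal sketch, using a hypothetical helper `rotate` and the first symbol (-1,+1) discussed above:

```python
import numpy as np

def rotate(a_I, a_Q, theta_deg):
    """Symbol-spaced outputs of Eq (3) for a phase offset given in degrees."""
    th = np.deg2rad(theta_deg)
    return (a_I * np.cos(th) - a_Q * np.sin(th),
            a_I * np.sin(th) + a_Q * np.cos(th))

# First symbol from the text: (-1, +1) in quadrant II
for theta_deg in (0, 15, 30, 45, 90):
    z_I, z_Q = rotate(-1.0, +1.0, theta_deg)
    print(f"theta = {theta_deg:2d} deg: z_I = {z_I:+.3f}, z_Q = {z_Q:+.3f}")
```

For 30 degrees this gives z_I of about -1.366 and z_Q of about +0.366: the I part has grown in magnitude while the Q part has shrunk. At 90 degrees the outputs are (-a_Q, a_I) = (-1, -1), the complete swap described in the cross-talk discussion.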

Can phase rotation be beneficial?


Something really interesting has happened in the figure above. Notice that although the amplitude has decreased for some symbols, it has risen for some other symbols as well. This is the outcome of a circular rotation. While it is good to have some symbols with a little extra protection against noise, remember that it has come at the cost of reduced amplitudes for other symbols, making them much more vulnerable to noise and other impairments. The overall effect is negative, just like strengthening your right arm in exchange for significantly weakening the left is dangerous for your body.

What was discussed above can be extended to the whole symbol stream. The cumulative effect of a phase offset is straightforward to see in a scatter plot. There will be clouds of samples from downsampled matched filter output around the original constellation.

Since the scatter plot is different than a raw time domain waveform, we employ the eye diagram to examine the effects of carrier phase offset (say, on an oscilloscope). First, start with a BPSK modulation scheme and remember that there is no Q channel in this case and consequently no cross-talk. However, the effect of phase rotation is a reduction in signal amplitude which can be observed by plugging a_Q = 0 in Eq (3) and only focusing on I arm.

(5)   \begin{equation*}         \begin{aligned}             z_I(mT_M) = a_I[m] \cos \theta_{\Delta}         \end{aligned}     \end{equation*}

Since \cos \theta_\Delta always lies between -1 and 1, the amplitude of the I signal gets reduced accordingly with the rest of the energy rising in the Q arm. From Eq (3) with a_Q = 0, the Q arm signal is written as

    \begin{equation*}         \begin{aligned}             z_Q(mT_M) = a_I[m] \sin \theta_{\Delta}         \end{aligned}     \end{equation*}

With a phase offset of 45^\circ, the I branch loses half of its energy with the remaining half going in the Q arm. This is drawn in Figure below. In fact for a 90^\circ phase rotation, the I contribution actually reaches zero and all the energy of the signal appears across the Q branch. Due to this reason, we will see later that the Q arm is still employed for BPSK signals — not for data detection but helping in the phase synchronization procedure.

Eye diagrams of a BPSK signal for 0 and 45 degrees phase rotations and a Raised Cosine filter with excess bandwidth 0.5. Observe a reduction in I amplitude in proportion to the energy rising in the Q arm
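The energy split quoted above can be verified in one line from Eq (5) and the Q arm expression, since squaring and adding the two outputs gives

    \begin{align*} z_I^2(mT_M) + z_Q^2(mT_M) &= a_I^2[m] \left( \cos^2 \theta_{\Delta} + \sin^2 \theta_{\Delta} \right) = a_I^2[m] \\ \theta_{\Delta} = 45^\circ: \quad z_I^2(mT_M) &= z_Q^2(mT_M) = \frac{1}{2} a_I^2[m] \\ \theta_{\Delta} = 90^\circ: \quad z_I^2(mT_M) &= 0, \quad z_Q^2(mT_M) = a_I^2[m] \end{align*}

So the total symbol energy is preserved for any offset and is only redistributed between the two arms.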

Next, we turn our focus towards QAM and observe the amplitude change and cross-talk between I and Q branches for three different phase offsets, 15^{\circ}, 30^{\circ} and 45^{\circ}. The figure below illustrates the I channel for these phase offsets in a noiseless case and a 4-QAM signal. A similar diagram holds for the Q arm as well and is not drawn here. The optimal sampling instants are still visible due to zero noise but the eye diagram looks more like that of a 4-PAM signal than that of a single arm of a 4-QAM signal, due to the cross-talk from the Q arm. It is also evident that I and Q affect each other in equal proportions.

Eye diagrams for I arm of a 4-QAM signal for 15, 30 and 45 degrees phase offsets and a Raised Cosine filter with excess bandwidth 0.5. A similar eye diagram exists for Q arm as well

The reason there are only two eyes for a 45^\circ phase offset is that \cos \theta_\Delta and \sin \theta_\Delta in Eq (3) become equal, so a_I and a_Q cancel each other for half of the symbol combinations in both I and Q arms of the output. In terms of the scatter plot, a rotation of 45^\circ shifts the constellation points onto the real and imaginary axes, so for the I plot shown here, the output at the sampling instant is either zero or a single positive or negative value (\pm\sqrt{2} for a_I, a_Q = \pm 1).
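The number of eyes can be predicted by listing the distinct I arm values at the sampling instants. A short sketch, assuming a_I, a_Q = \pm 1 and a hypothetical helper `i_arm_levels`:

```python
import numpy as np
from itertools import product

def i_arm_levels(theta_deg):
    """Distinct I-arm values at the sampling instants for 4-QAM, from Eq (3)."""
    th = np.deg2rad(theta_deg)
    levels = {round(float(a_I * np.cos(th) - a_Q * np.sin(th)), 6)
              for a_I, a_Q in product((-1.0, 1.0), repeat=2)}
    return sorted(levels)

for theta_deg in (15, 30, 45):
    levels = i_arm_levels(theta_deg)
    print(f"theta = {theta_deg} deg: {len(levels)} levels -> {levels}")
```

Offsets of 15 and 30 degrees produce four levels, hence the 4-PAM-like eye diagram with three eyes, while 45 degrees collapses them to three levels and leaves only two eyes.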

The Master Algorithm


Recently, I was reading the book The Master Algorithm by Pedro Domingos — a Professor at the University of Washington in machine learning.

According to the description of his book,

The Master Algorithm in Machine Learning


A spell-binding quest for the one algorithm capable of deriving all knowledge from data, including a cure for cancer.

Society is changing, one learning algorithm at a time, from search engines to online dating, personalized medicine to predicting the stock market. But learning algorithms are not just about Big Data – these algorithms take raw data and make it useful by creating more algorithms. This is something new under the sun: a technology that builds itself. In The Master Algorithm, Pedro Domingos reveals how machine learning is remaking business, politics, science and war. And he takes us on an awe-inspiring quest to find ‘The Master Algorithm’ – a universal learner capable of deriving all knowledge from data.

So I thought about a master algorithm that exists for digital communication systems. I could not find one in general but for linear modulations under AWGN channels, correlation is certainly the master algorithm. Correlation solves the detection problem by choosing the most probable hypothesis. It solves the receiver design problem by leading to numerous timing, frequency and phase synchronization algorithms (which are largely variations of a single concept derived through correlation). It also leads to channel estimation and equalization techniques to combat channel distortions.

Due to these reasons, I called it the heart of digital communication systems in a previous article on correlation.

As an example, during our discussion on demodulation, we derived the expression for a matched filter with the process of correlation in the following manner.

Since the maximum overlap is the only sample within a symbol time T_M that is required out of L = T_M/T_S samples, let us write

(1)   \begin{align*}         z(T_M) &= z(nT_S) \bigg| _{n = L = T_M/T_S} \nonumber \\                 &= \sum \limits _i r(iT_S) p(iT_S-nT_S) \bigg| _{n = L = T_M/T_S} \nonumber \\                 &= \sum \limits _i r(iT_S) p\Big(-(nT_S - iT_S)\Big) \bigg| _{n = L = T_M/T_S} \nonumber \\                 &= r(nT_S) * p(-nT_S) \bigg| _{n = L = T_M/T_S}        \end{align*}

We concluded that the process of correlation can be implemented as convolution with a filter whose impulse response h(nT_S) is a flipped version of the actual pulse shape. That filter is called the matched filter. For details, see the article on demodulation.
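This equivalence is easy to confirm numerically. In the sketch below, the pulse values are hypothetical and deliberately asymmetric so that the flip matters, and `correlate_direct` is an illustrative helper that evaluates the correlation sum term by term.

```python
import numpy as np

# Hypothetical asymmetric pulse, so that flipping it visibly matters
p = np.array([0.1, 0.3, 0.6, 1.0, 0.8, 0.4, 0.2, 0.05])
p = p / np.sqrt(np.sum(p ** 2))           # normalize to unit energy

r = -1.0 * p                              # noiseless received symbol carrying a = -1

def correlate_direct(r, p):
    """z[n] = sum_i r[i] p[i - n] for every lag n with some overlap."""
    lags = range(-(len(p) - 1), len(r))
    return np.array([
        sum(r[i] * p[i - n] for i in range(len(r)) if 0 <= i - n < len(p))
        for n in lags
    ])

z_corr = correlate_direct(r, p)           # correlation, straight from the definition
z_conv = np.convolve(r, p[::-1])          # convolution with the flipped pulse p(-n)

peak = int(np.argmax(np.abs(z_conv)))     # index of maximum overlap
print(np.allclose(z_corr, z_conv), peak, round(float(z_conv[peak]), 6))
```

Both computations agree sample for sample, and the sample at maximum overlap returns the transmitted amplitude -1 because the pulse has unit energy.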

Similarly, during the discussion on timing, phase and frequency synchronization for Volume 2 of the book, we will derive the relevant estimators through this process as well, demonstrating that correlation is indeed the master algorithm in this domain.