
From my book Wireless Communications – From the Ground Up

Carrier Phase Synchronization through the Correlation Principle


In another article on correlation, we said that correlation is the Master Algorithm for a digital communication system, not only for data detection but for parameter estimation of various kinds as well. We applied it to derive the matched filter for optimal detection. Here, we apply the principle of maximum correlation to solve our phase synchronization problem. The discussion in this context usually considers QPSK modulation, but the extension to higher-order modulation schemes is straightforward. A similar framework is used to estimate and compensate for other distortions as well, such as carrier frequency offset and symbol timing offset, examples of which we will see later.

Correlation with what?


To implement correlation, one signal is obviously the Rx signal r(nT_S). It is correlated with a clean version of what is expected at the Rx: the expected signal template. That is the most logical route to detect the signal and estimate the desired parameters. For example, in the case of a carrier phase offset, the Rx signal r(nT_S) is correlated with a perfect Tx signal rotated in phase by an unknown offset. This procedure then yields an algorithm to estimate that phase offset.

In the presence of noise, the received and sampled signal is

(1)   \begin{equation*}     \begin{aligned}         r(nT_S) &= s(nT_S) + \textmd{noise} \\                 &= v_I(nT_S) \sqrt{2}\cos \left(2\pi \frac{k_C}{N}n + \theta_\Delta\right) - \\&\hspace{1in} v_Q(nT_S) \sqrt{2}\sin \left(2\pi \frac{k_C}{N}n + \theta_\Delta\right) + \textmd{noise}     \end{aligned}     \end{equation*}

where k_C (in a DFT of size N) corresponds to the carrier frequency F_C, and v_I(nT_S) and v_Q(nT_S) are the inphase and quadrature waveforms (each time we define the inphase and quadrature waveforms, there is a reason behind choosing the index i or the index m that becomes clear as we progress through that particular derivation).

    \begin{equation*}       \begin{aligned}         v_I(nT_S)\: &= \sum _{m} a_I[m] p(nT_S-mT_M) \\         v_Q(nT_S) &= \sum _{m} a_Q[m] p(nT_S-mT_M)       \end{aligned}     \end{equation*}

The correlation between r(nT_S) and its expected template s(nT_S) is defined as

(2)   \begin{align*}         \textmd{corr}[j] &= r(nT_S) ~\heartsuit~ s(nT_S)\nonumber \\                   &= \sum \limits _{n = -\infty} ^{\infty} r(nT_S) s(nT_S-j)      \end{align*}

We face three problems in implementing the above relation.

  • [1. Observation interval] The above sum is computed from -\infty to +\infty, which implies that we could theoretically continue computing this correlation with every received sample r(nT_S). However, the useful component of the received signal r(nT_S) is given by a span of N_{0} symbols while every sample outside this interval is noise only. The time duration of such N_0 symbols is

    (3)   \begin{equation*}                 T_{0} = N_{0} T_M             \end{equation*}

    Thus, the finite observation interval is given by

        \begin{equation*}               0 \quad \rightarrow \quad \underbrace{T_{0}}_{\textmd{seconds}} = \underbrace{N_0}_{\textmd{symbols}} \cdot T_M = \underbrace{N_{0}\cdot L}_{\textmd{samples}} \cdot T_S             \end{equation*}

    where L is samples/symbol, or T_M/T_S. Therefore, the observation window consists of n=0,1,\cdots,LN_0-1 samples. It is understood in this calculation that the group delay arising from pulse shaping and matched filtering on either side of the transmission sequence has been taken care of by the Rx.

  • [2. Start of frame] The second problem is to determine the starting point of the useful component in a sampled signal, as many of its initial samples are noise only. It is the job of a frame synchronization unit to determine the symbol level boundaries of the Rx signal, while the symbol timing synchronization block establishes the optimal sampling instant within a symbol. This will be explained in detail in another article but Figure below illustrates this alignment problem for a visual clarification.
  • [3. Distortion at edges] Due to the pulse shape spreading beyond a single symbol and actually extending to G symbols in each direction, there is a distortion introduced at the edges of the observation interval T_0 in the correlation result. However, it can be ignored if T_0 is sufficiently long.

Observation window that determines the start of correlation

We also assume downconversion with perfect frequency recovery, i.e., the Tx and Rx oscillators have the same frequency F_C, in order to focus on the phase recovery problem. After perfect downconversion, our expected signal template is the complex sequence x(nT_S), with which the correlation of r(nT_S) needs to be implemented. The overall setup is drawn in the Figure below.

Block diagram for phase recovery

Continuing from the article on the effect of a phase offset and ignoring the double frequency terms (the higher frequency terms produced by the downconversion process are filtered either by an analog filter in the RF frontend or by a digital filter in the digital frontend; furthermore, the matched filter also suppresses such terms, since it acts as a lowpass filter owing to the pulse shape originally designed to restrict the spectral content of the wireless signal), the downconverted complex signal is expressed as

(4)   \begin{equation*}       \begin{aligned}         x_I(nT_S)\: &= v_I(nT_S) \cos \theta_\Delta - v_Q(nT_S) \sin\theta_\Delta       \\         x_Q(nT_S) &= v_Q(nT_S) \cos \theta_\Delta + v_I(nT_S) \sin\theta_\Delta       \end{aligned}     \end{equation*}

Next, the correlation between the Rx signal r(nT_S) and our expected template signal x(nT_S) is

(5)   \begin{align*}         \textmd{corr}[j] &= r(nT_S) ~\heartsuit~ x(nT_S)\nonumber \\                   &= \sum \limits _{n = 0} ^{LN_0-1} r(nT_S) x^*(nT_S-j)      \end{align*}

where the conjugate arises because the signals are complex. Having determined j=0 through the frame synchronization block, we can write Eq (5) as

    \begin{equation*}         \textmd{corr}[0] = \sum \limits _{n = 0} ^{LN_0-1} r(nT_S) x^*(nT_S)     \end{equation*}

Using the complex multiplication rule, the actual computations can be written as

(6)   \begin{equation*}       \begin{aligned}         \textmd{corr}_I[0]\: &= \sum \limits _{n = 0} ^{LN_0-1} r_I(nT_S) x_I(nT_S) + \sum \limits _{n=0} ^{LN_0-1} r_Q(nT_S) x_Q(nT_S) \\         \textmd{corr}_Q[0] &= \sum \limits _{n = 0} ^{LN_0-1} r_Q(nT_S) x_I(nT_S) - \sum \limits _{n=0} ^{LN_0-1} r_I(nT_S) x_Q(nT_S)       \end{aligned}     \end{equation*}
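As a minimal sketch of these computations (in Python, assuming r and x are NumPy arrays of complex baseband samples that are already aligned so that j=0), Eq (6) collapses into a single complex correlation:

import numpy as np

def corr_at_zero(r, x):
    # Eq (6) in complex form: corr[0] = sum over n of r(nT_S) * conj(x(nT_S));
    # the real part is corr_I[0] and the imaginary part is corr_Q[0]
    c = np.sum(r * np.conj(x))
    return c.real, c.imag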

Recall that two complex signals are most similar when the inphase part of their correlation at time zero is maximum. Consequently, we ignore the Q component of Eq (6) and plug in the definition of x(nT_S) from Eq (4).

    \begin{align*}         \begin{aligned}             \textmd{corr}_I[0]\: &= \sum \limits _{n = 0} ^{LN_0-1} r_I(nT_S) \Big(v_I(nT_S) \cos \theta_\Delta - v_Q(nT_S) \sin\theta_\Delta\Big) + \\&\hspace{1in}             \sum \limits _{n=0} ^{LN_0-1} r_Q(nT_S) \Big(v_Q(nT_S) \cos \theta_\Delta + v_I(nT_S) \sin\theta_\Delta\Big)         \end{aligned}     \end{align*}

Substituting the expressions for v_I(nT_S) and v_Q(nT_S) yields

(7)   \begin{equation*}         \begin{aligned}             \textmd{corr}_I[0]\: &= \sum \limits _{n = 0} ^{LN_0-1} r_I(nT_S) \Big(\sum _{m} a_I[m] p(nT_S-mT_M) \cos \theta_\Delta -  \\&\hspace{1.2in}\sum _{m} a_Q[m] p(nT_S-mT_M) \sin\theta_\Delta\Big) \nonumber\\&\hspace{0.2in}+             \sum \limits _{n=0} ^{LN_0-1} r_Q(nT_S) \Big(\sum _{m} a_Q[m] p(nT_S-mT_M) \cos \theta_\Delta +\\&\hspace{1.2in} \sum _{m} a_I[m] p(nT_S-mT_M) \sin\theta_\Delta\Big)         \end{aligned}     \end{equation*}

Ignoring the distortion at the edges of the summation, the above relation can be simplified by recalling that p(nT_S) are the samples of a square-root Nyquist pulse with support -LG \le n \le LG samples.

    \begin{equation*}         \begin{aligned}             \textmd{corr}_I[0]\: &= \sum \limits _{m = 0} ^{N_0-1}\Bigg\{ a_I[m] \underbrace{\sum \limits _{n=(m-G)L} ^{(m+G)L} r_I(nT_S) p(nT_S-mT_M)}_{\textmd{Inphase matched filter output}} \cos \theta_\Delta -  \\&\hspace{1.2in} a_Q[m] \underbrace{\sum \limits _{n=(m-G)L} ^{(m+G)L} r_I(nT_S) p(nT_S-mT_M)}_{\textmd{Inphase matched filter output}} \sin\theta_\Delta \\&\hspace{0.2in}+             a_Q[m]\underbrace{\sum \limits _{n=(m-G)L} ^{(m+G)L} r_Q(nT_S) p(nT_S-mT_M)}_{\textmd{Quadrature matched filter output}} \cos \theta_\Delta +\\&\hspace{1.1in} a_I[m] \underbrace{\sum \limits _{n=(m-G)L} ^{(m+G)L} r_Q(nT_S) p(nT_S-mT_M)}_{\textmd{Quadrature matched filter output}} \sin\theta_\Delta\Bigg\}         \end{aligned}     \end{equation*}

Notice that after the received signal r(nT_S) is downconverted to baseband, it is correlated with the same pulse shape as at the Tx, p(nT_S). As we saw in the details of the matched filter, another way of implementing this correlation is to filter the signal with the flipped version p(-nT_S) (the convolution operation in a filter flips p(-nT_S) back to p(nT_S), and the operation becomes correlation). This is known as matched filtering. Clearly, after summing over the sample index n, we are left with a signal of symbol rate spaced samples at times mT_M.

Owing to the above description, we can identify the following terms as matched filter outputs, one in the inphase and the other in the quadrature arm.

(8)   \begin{equation*}           \begin{aligned}                 z_I(mT_M)\: &= \sum \limits _{n=(m-G)L} ^{(m+G)L} r_I(nT_S) p(nT_S-mT_M)\\                 z_Q(mT_M) &= \sum \limits _{n=(m-G)L} ^{(m+G)L} r_Q(nT_S) p(nT_S-mT_M)         \end{aligned}     \end{equation*}

Now the expression for the inphase part of the correlation simplifies to

(9)   \begin{equation*}     \begin{aligned}          \textmd{corr}_I[0] &= \sum \limits _{m = 0} ^{N_0-1} a_I[m] \Bigg\{z_I(mT_M) \cos \theta_\Delta +z_Q(mT_M) \sin \theta_\Delta\Bigg\} \\ &\hspace{.1in}+\sum \limits _{m = 0} ^{N_0-1} a_Q[m] \Bigg\{z_Q(mT_M) \cos \theta_\Delta - z_I(mT_M) \sin \theta_\Delta\Bigg\}     \end{aligned}     \end{equation*}

The above expression is important. As we proceed, Eq (9) will form the basis of many types of phase estimators, in both feedforward and feedback schemes.

A question at this stage is: starting from correlation, what have we achieved so far? While not obvious as of now, we are on our way to solving the fundamental problem of the synchronization process. In a few other articles, we will discuss data-aided, decision-directed and non-data-aided techniques for phase synchronization.
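As a hedged preview of the data-aided case, observe that Eq (9) has the form C\cos\theta_\Delta + S\sin\theta_\Delta with C = \sum (a_I z_I + a_Q z_Q) and S = \sum (a_I z_Q - a_Q z_I), which is maximized by choosing the phase as the four-quadrant angle of C + jS. A minimal Python sketch, under the assumption that the symbols a[m] are known pilots:

import numpy as np

def phase_estimate(a, z):
    # a: known complex symbols a_I[m] + 1j*a_Q[m]
    # z: matched filter outputs z_I(mT_M) + 1j*z_Q(mT_M) from Eq (8)
    # conj(a) * z has real part a_I*z_I + a_Q*z_Q and imaginary part
    # a_I*z_Q - a_Q*z_I, so its angle maximizes corr_I[0] in Eq (9)
    return np.angle(np.sum(np.conj(a) * z))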

Discrete-Time Integrators


An integrator is a very important filter that proves useful in the implementation of many blocks of a communication receiver. In the continuous-time case, an integrator finds the area under the curve of a signal amplitude. A discrete-time system deals with just the signal samples, and hence a discrete-time integrator serves the purpose of collecting a running sum of past samples of an input signal. On an infinitesimally small scale, this is the same as computing the area under the curve of a signal sampled at an extremely high rate. For the following discussion, we assume a normalized sample time T_S = 1.

For an input s[n] and output r[n], there are three main methods to implement a discrete-time integrator.

[Forward difference] The forward difference integrator is realized through the following equation.

(1)   \begin{align*}                 r[n] &= \sum _{i=-\infty}^{n} s[i] \nonumber \\                      &= \sum _{i=-\infty}^{n-1} s[i] + s[n] \nonumber \\                      &= r[n-1] + s[n]              \end{align*}

For obvious reasons, the running sum r[n-1] is a common component in all types of integrators; the differentiating factor is the term added to r[n-1] as a replacement for the area under the curve. For a forward difference integrator, this additional term is the current input s[n]. This is drawn in the Figure below. Notice that the forward difference integrator computes its output after the arrival of the current sample, i.e., at time n.

A discrete-time integrator implemented through a forward difference and a backward difference technique

The block diagram shown in the figure is used to implement a forward difference integrator in loop filter portion of a Phase Locked Loop (PLL) that is employed in carrier phase and symbol timing synchronization systems of a digital communication system.

[Backward difference] The backward difference integrator is realized through the following equation.

(2)   \begin{equation*}                 r[n] = r[n-1] + s[n-1]             \end{equation*}

Here, the term added to r[n-1] is the previous input s[n-1]. This is also illustrated in the Figure above. In contrast to the forward difference case, the backward difference integrator can compute its output from the previous sample alone, i.e., the input at time n-1. This minor dissimilarity plays a huge role in the performance analysis of the actual discrete-time system in which the integrator is employed (refer to any DSP text to understand the role of poles and zeros of a system).

The block diagram shown in the figure is used to implement a backward difference integrator in a Numerically Controlled Oscillator (NCO) of a Phase Locked Loop (PLL).

[Average of a backward and forward difference] A third integrator can be implemented as

    \begin{equation*}             r[n] = r[n-1] + \frac{1}{2}\left(s[n-1]+s[n]\right)         \end{equation*}
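A minimal sketch of all three integrators in Python (with the normalized sample time T_S = 1 assumed above):

import numpy as np

def integrate(s, method="forward"):
    # Running-sum integrators; the previous output r[n-1] is common to all three
    r = np.zeros(len(s))
    prev_r, prev_s = 0.0, 0.0
    for n, x in enumerate(s):
        if method == "forward":      # r[n] = r[n-1] + s[n]
            r[n] = prev_r + x
        elif method == "backward":   # r[n] = r[n-1] + s[n-1]
            r[n] = prev_r + prev_s
        else:                        # average: r[n] = r[n-1] + (s[n-1] + s[n])/2
            r[n] = prev_r + 0.5 * (prev_s + x)
        prev_r, prev_s = r[n], x
    return r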

Why is an integrator a lowpass filter?


An integrator maintains a running sum of past samples. In time domain, the summation of a large number of values tends towards a mean value. When some numbers are small, some are large and some are in between, their running sum naturally pulls large variations towards the middle, thus smoothing them out. Large variations represent high frequencies and smoothing out such variations is the function of a lowpass filter.

To verify this fact in the frequency domain, first consider that when an impulse is given as input to an integrator, it forms a running sum of the impulse, which is a unit step signal. Hence, its impulse response is a unit step. The unit step is very wide in the time domain, so its spectral content must be concentrated in a narrow region of low frequencies. Consequently, the integrator suppresses the high frequency terms and passes only the low frequency content. We can say that it acts as a lowpass filter.
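This can also be checked numerically. The forward difference integrator r[n] = r[n-1] + s[n] has the transfer function H(z) = 1/(1 - z^{-1}), and a short sketch (assuming SciPy is available) shows its magnitude response falling off with frequency:

import numpy as np
from scipy.signal import freqz

# H(z) = 1 / (1 - z^{-1}); the pole at z = 1 gives infinite gain at DC,
# so we evaluate from slightly above w = 0
w = np.linspace(0.01, np.pi, 512)
w, H = freqz(b=[1.0], a=[1.0, -1.0], worN=w)
magnitude = np.abs(H)   # largest near w = 0, decreasing towards w = pi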

Phase Locked Loop (PLL) in a Software Defined Radio (SDR)


IBM Watson and Google DeepMind are the most complex computers that, some believe, will try to run the world in the distant future. A PLL, on the other hand, is the simplest computer that actually runs so much of the world as a fundamental component of intelligent electronic circuits. The PLL was invented by the French engineer Henri de Bellescize in 1932, when he published his first implementation in the French journal L’Onde Electrique.

A Phase Locked Loop (PLL) is a device used to synchronize a periodic waveform with a reference periodic waveform. In essence, it is an automatic control system, an example of which is the cruise control in a car that maintains a constant speed around a given setpoint. Although a PLL can be used for a variety of applications, it is enough for our purpose to treat it as a device that tracks the phase and frequency of an incoming sinusoid.

In a PLL, a control mechanism adjusts the input signal to an oscillator according to a derived phase error such that the eventual phase error converges towards zero. We say that the phase of the output signal is locked to the phase of the input reference signal, and hence the device is called a Phase Locked Loop. In this text, we will focus on discrete-time PLLs.

PLL design and analysis


From a functional perspective, a PLL is the most important block in a digital communication system and hence it requires careful mathematical understanding and design. Usually this is done through the Laplace transform in the continuous-time case and the z-transform in the discrete-time case. However, for the sake of simplicity, we treat just one transform, namely the Discrete Fourier Transform (DFT), in our articles.

Therefore, in regard to PLL design and analysis, we will take some key results from the literature without deriving them, owing to our limitation of not covering the Laplace and z-transforms. It should also be remembered that the design and analysis of a PLL becomes mathematically intractable beyond an initial assumption of linearity, and extensive computer simulations are needed in any case for its implementation in a particular application.

Let us start with a block diagram of a basic PLL shown in Figure below.

Phase error detector, loop filter and Numerically Controlled Oscillator (NCO) in a Phase Locked Loop (PLL)

Assume that the discrete-time sinusoidal input to the PLL is given as

    \begin{equation*}         \textmd{input} = A \cos \left( 2\pi \frac{k}{N}n + \theta[n]\right)     \end{equation*}

The PLL is designed in a way that the output is

    \begin{equation*}         \textmd{output} = \cos \left( 2\pi \frac{k}{N}n + \hat\theta[n]\right)     \end{equation*}

where \hat{\theta}[n] should be as close to \theta[n] as possible after acquisition. This phase difference \theta[n]-\hat\theta[n] is called phase error and is denoted by \theta_e[n].

    \begin{equation*}       \theta_e[n] = \theta[n] - \hat \theta[n]     \end{equation*}

The phase error \theta_e[n] computed at time n is drawn in Figure below for a continuous-time signal.

Phase difference between the PLL input and output for a continuous-time signal

The roles played by each block in the PLL are as follows.

[Phase error detector] A phase error detector determines the phase difference between a reference input waveform and a locally generated waveform and generates a corresponding signal denoted as e_D[n].
[Loop filter] A loop filter sets the dynamic performance limits of a PLL. Moreover, it helps filter out noise and irrelevant frequency components generated in the phase error detector. Its output signal is denoted as e_F[n].
[Numerically controlled oscillator (NCO)] An NCO generates a local discrete-time discrete-valued waveform with a phase as close to the phase of the reference signal as possible. The amount of the phase adjustment during each step is determined by the loop filter output.

As a first step towards our understanding, assume that \theta[n] is zero and so the frequency of the input signal is 2\pi k/N. Consequently,

  • the NCO operates at the same frequency as well and the phase error \theta_e is zero,
  • then the phase error detector output e_D[n] must ideally be zero,
  • that leads to loop filter output e_F[n] to be zero.

However, if \theta_e was not zero at the start,

  • the phase error detector would develop a non-zero output signal e_D[n] which would rise or fall depending on \theta_e,
  • the loop filter subsequently would generate a finite signal e_F[n], and
  • this would cause the NCO to change its phase in such a way as to turn \theta_e towards zero again.

Let us find out how this control mechanism adapts in the direction opposite to the input phase changes.

Phase error detector


The phase error detector is a device which outputs some function f\{\cdot\} of the difference between the phase \theta[n] of the PLL input and the phase \hat \theta[n] of the PLL output. So the phase error detector output is written as

(1)   \begin{equation*}                 e_D[n] = f \left\{\theta[n]-\hat \theta[n]\right\} = f\left\{\theta_e[n]\right\}             \end{equation*}

The function f\{\cdot\} is in general non-linear due to the fact that the phase \theta[n] is embedded within the incoming sinusoid and is not directly accessible.

A phase equivalent representation of such a PLL can be drawn by taking into account the phases of all sinusoids and tracking the operations on those phases through the loop. This is illustrated in Figure below.

A phase equivalent PLL

As mentioned before, a PLL is a non-linear device due to the phase error detector not having direct access to the sinusoidal phases. Although in reality the output is in general a non-linear function f(\cdot) of the phase difference between the input and output sinusoids, a vast majority of the PLLs in locked condition can be approximated as linear due to the following reason.

In equilibrium, the loop has to keep adjusting the control signal e_F[n] such that the output \hat \theta[n] of the NCO is almost equal to the input phase \theta[n]. So during a proper operation, the phase error \theta_e[n] should go to zero.

    \begin{equation*}         \theta_e[n] = \theta[n] - \hat \theta[n] \rightarrow 0     \end{equation*}

To make this happen, what should be the shape of the curve at the phase error detector output e_D[n]=f(\theta_e[n])?

For finding the answer, first assume that \theta_e[n] is positive and see what can make it go to zero.

    \begin{align*}       \theta_e[n] > 0 & \implies \theta[n] - \hat \theta[n] > 0 \\                       & \implies \theta[n] > \hat \theta[n] \\                       & \therefore ~~~~ \hat \theta [n]~ \text{should increase} \\                       & \implies e_F[n]~ > 0 \\                       & \implies e_D[n]~ > 0 \\                       & \implies f(\theta_e[n]) > 0     \end{align*}

Similarly, when \theta_e[n]<0, it can be concluded that e_D[n]=f(\theta_e[n]) should be negative as well. This leads to the symbolic phase error detector input/output relationship drawn in the Figure below for the phase error \theta_e and the average phase error detector output \textmd{Mean}\{e_D[n]\} \equiv \overline{e_D}. This kind of relationship is called an S-curve due to its shape resembling the English letter “S”. In the articles on synchronization, we learn more about this shape and name.

S-curve of the average phase error detector output

Under steady state conditions, \theta_e[n] hovers around the origin and hence e_D[n]=f(\theta_e[n]) also stays within the region indicated by the red ellipse in the Figure above. An extended typical S-curve is also drawn, where one can observe that the PLL has the ability to pull back even a larger error \theta_e[n]: an increase in e_D[n] with larger \theta_e[n] increases \hat \theta[n], which subsequently pulls \theta_e[n] = \theta[n]-\hat\theta[n] back towards zero. However, the steering force depends on the magnitude of e_D[n], which is different outside the linear region. It can be deduced that in the linear region of operation (a straight line relationship),

  • a positive slope around zero produces a stable lock point, and
  • a negative slope around zero does not generate a stable lock point.

Within the small linear operating range, the PLL can be analyzed using linear system techniques. Around this region for small \theta_e, the non-linear operation f(\cdot) can be approximated as

    \begin{equation*}         f\left\{\theta_e\right\} \approx K_D \cdot \theta_e     \end{equation*}

where K_D is the slope of the line known as the gain of the phase error detector.

The phase equivalent loop for this linear model is shown in Figure below, where the phase error detector now consists of just an adder and a multiplier: the difference between the input phase and the output phase is simply scaled by the gain K_D.

A linear model of a PLL

As we will find in later chapters, the phase error detector is the most versatile block in a PLL, leading to an extremely wide range of PLL designs. On the other hand, depending on the application, there are set rules for choosing a loop filter and an NCO which simplify the process to some extent. In this text, our main purpose in employing PLLs is to build phase and timing synchronization modules rather than to investigate PLL theory in depth. Therefore, we will devise several different kinds of phase error detectors while using the same loop filter and NCO for each PLL.



Proportional + Integrator (PI) Loop Filter


The loop filter in a PLL performs two main tasks.

  1. The primary task of a loop filter is to deliver a suitable control signal to the NCO and establish the dynamic performance of the loop. Most PLL applications require a loop filter that is capable of not only driving a phase offset between the input and output sinusoids to zero, but also tracking frequency offsets within a reasonable range.
  2. A secondary task is to suppress the noise and high frequency signal components.

For this purpose, a Proportional + Integrator (PI) loop filter is most commonly used in PLL design. As the name suggests, a PI filter has a proportional and an integrator component:

[Proportional] The proportional term is a simple gain of K_P. To the filter output, it contributes a signal that is proportional to the filter input as

    \begin{equation*}             e_{F,1}[n] = K_P \cdot e_D[n]       \end{equation*}

[Integrator] The integrator term is an ideal integrator with a gain of K_i. To the filter output, it contributes a signal that is proportional to the integral of the input signal. Or in discrete-time,

    \begin{equation*}             e_{F,2}[n] = e_{F,2}[n-1] + K_i \cdot e_D[n]       \end{equation*}

It can be deduced that it executes forward difference integration to accumulate its input. The accumulation component is necessary to drive the steady-state error at the PLL output to zero in the presence of a frequency offset.

Combining the proportional and integrator components leads to the loop filter output e_F[n].

    \begin{equation*}         e_F[n] = e_{F,1}[n] + e_{F,2}[n]     \end{equation*}
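In code, the loop filter reduces to a few lines of state update. A minimal Python sketch of the two components:

class PILoopFilter:
    # Proportional + Integrator loop filter: e_F[n] = e_F1[n] + e_F2[n]
    def __init__(self, K_P, K_i):
        self.K_P = K_P
        self.K_i = K_i
        self.integrator = 0.0                     # e_F2[n-1], the running sum

    def step(self, e_D):
        self.integrator += self.K_i * e_D         # e_F2[n] = e_F2[n-1] + K_i*e_D[n]
        return self.K_P * e_D + self.integrator   # e_F1[n] + e_F2[n]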

When a PI filter is incorporated in the linear PLL model, we get the discrete-time PLL block diagram drawn in Figure below. The notation D_1 represents a delay of one sample time.

A discrete-time PLL with a Proportional + Integrator (PI) loop filter

For the sake of completeness, it is important to know that

  • a PLL without a loop filter (understandably known as a first order PLL) is also used in some applications where noise is not a primary concern [1], and
  • a higher order loop filter can suppress spurs, but increasing the order also increases the phase shift of such filters, thus making them prone to instability.

Numerically Controlled Oscillator (NCO)


The signal e_F[n] forms the input as a control signal that sets the phase of an oscillator. The name controlled oscillator arises from the fact that its phase depends on the amplitude of the input control signal. Some examples of controlled oscillators are the Voltage Controlled Oscillator (VCO) and the Numerically Controlled Oscillator (NCO).

The oscillation frequency of a Voltage Controlled Oscillator (VCO) is controlled by its voltage input and hence it is an integral part of analog PLLs. As more and more functionality of the transceiver shifts towards the digital domain, the analog PLLs are seldom employed for waveform synchronization.

A Numerically Controlled Oscillator (NCO) creates a discrete-time as well as discrete-valued (i.e., digital) representation of a waveform whose phase is steered by the digital representation of a number at its input. In wireless communication devices, an NCO plays a central role in creating a digital version of a PLL for synchronization purposes. It has two main components:

[Phase Accumulator] The NCO adjusts its output phase \hat{\theta}[n] based on its input signal e_F[n] as

    \begin{equation*}     \hat \theta[n] = K_0 \sum _{i=-\infty}^{n-1} e_F[i] \end{equation*}

where K_0 is a constant of proportionality known as the oscillator gain. From this expression, we can see that an NCO acts as a phase accumulator. Note that the output can also be modified as

(2)   \begin{align*}     \hat \theta[n]  &= K_0 \sum _{i=-\infty}^{n-1} e_F[i] = K_0 \sum _{i=-\infty}^{n-2} e_F[i] + K_0\cdot e_F[n-1] \nonumber \\                     &= \hat \theta[n-1] + K_0\cdot e_F[n-1] ~~~\textmd{mod}~ 2\pi  \end{align*}

It can be deduced that an NCO executes backward difference integration to accumulate its input. Unlike the analog VCO, the gain K_0 of the phase accumulator can be easily set to a fixed value, say 1.

[Look-Up Table (LUT)] In embedded wireless devices, the phase update \hat \theta[n] from the integrator serves as an index into a Look-Up Table (LUT) which stores the numeric values of a desired sampled waveform (such as a sine and a cosine). So the output can be computed as

    \begin{equation*}           \begin{aligned}             s_I[n]\: &= \cos \left( \hat \theta[n]\right)\\             s_Q[n] &= \sin \left( \hat \theta[n]\right)           \end{aligned}     \end{equation*}

Naturally, the size of the lookup table determines the memory requirements as well as the amount of quantization on \hat \theta[n], hence leading to a tradeoff between memory consumption and waveform approximation error. In most applications, a finer estimate is required to reduce this phase error noise; it can be generated through interpolation between the stored samples, so a change in the LUT size is not needed.
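A minimal Python sketch of such an NCO, assuming K_0 = 1 and a hypothetical LUT of 256 entries (interpolation between LUT entries is omitted for brevity):

import numpy as np

class NCO:
    def __init__(self, lut_size=256):
        # One period of cosine and sine stored as the LUT
        idx = np.arange(lut_size)
        self.cos_lut = np.cos(2 * np.pi * idx / lut_size)
        self.sin_lut = np.sin(2 * np.pi * idx / lut_size)
        self.lut_size = lut_size
        self.theta_hat = 0.0

    def step(self, e_F):
        # Output first: the quantized current phase indexes the LUT
        i = int(self.theta_hat / (2 * np.pi) * self.lut_size) % self.lut_size
        s_I, s_Q = self.cos_lut[i], self.sin_lut[i]
        # Backward difference update: e_F affects the phase of the NEXT output
        self.theta_hat = (self.theta_hat + e_F) % (2 * np.pi)
        return s_I, s_Q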

With the inner workings of NCO available, a complete block diagram of a phase equivalent model of a PLL is now drawn in Figure below. The notation D_1 represents a delay of one sample time.

A discrete-time PLL with a PI loop filter and an NCO consisting of a phase accumulator and a Look-Up Table (LUT)

As stated earlier, it is easier to establish the kind of loop filter and NCO according to the desired PLL performance and then search for a suitable phase error detector. For the purpose of carrier phase synchronization, we will continue to use a PI loop filter and an NCO for all different phase error detectors. For symbol timing synchronization, the loop filter will remain the same while a discrete-time interpolator will be employed instead of an NCO due to the nature of the underlying problem.

Designing a PLL


For the synchronization setup in this text, the PLL response is determined by two parameters: damping factor \zeta and natural frequency \omega_n, which are taken from the standard control system terminology for a second order system. A description of \zeta and \omega_n is as follows.

[Damping factor \zeta:] Imagine dropping a ball on the ground. After hitting the ground, the ball bounces up some distance and repeats damped oscillations before finally settling at equilibrium. Similarly, a PLL phase acquisition process exhibits an oscillatory behaviour in the beginning which can be controlled by the damping factor.

For a given input signal, a PLL behaves differently for different values of \zeta. This is illustrated in the Figure below for a unit step phase input (when the input is a unit impulse, the output is the impulse response, and when the input is a unit step, the output is known as the step response).

  • When \zeta < 1, the loop response exhibits damped oscillations in the form of overshoots and undershoots, and the system is termed underdamped.
  • When \zeta > 1, the loop response is a sum of decaying exponentials, the oscillatory behaviour disappears for large \zeta, and the system is overdamped.
  • Finally, when \zeta = 1, the response lies between damped oscillations and decaying exponentials, and the PLL is termed critically damped.

Step response of the PLL with a PI loop filter for different values of the damping factor

[Natural frequency \omega_n:] We will shortly see that the PLL in tracking mode acts as a lowpass filter. In this role, the natural frequency \omega_n can be considered a coarse measure of the loop bandwidth.

PLL as a Lowpass Filter


The purpose of employing a PLL in a communications receiver is to track an incoming waveform in phase and frequency. This input signal is inherently corrupted by additive noise. In such a setup, a receiver locked in phase should reproduce this original signal adequately while removing as much noise as possible. The Rx uses a VCO or an NCO with a frequency close to that expected in the signal for this purpose. Through the loop filter, the PLL averages the phase error detector output over a length of time and keeps tuning its oscillator based on this average.

If the input signal has a stable frequency, this long term average produces very accurate phase tracking, thus eliminating a significant amount of noise. In such a scenario, the input to the PLL is a noisy signal while the output is a clean version of the input. We can say that when operating as a linear tracking system, a PLL is a filter that passes signal and rejects noise.

Passband of a PLL


Having established the filtering operation of a PLL, we need to find out what kind of a filter a PLL is. For this purpose, consider the fact that within the linear region of operation, the PLL output phase closely follows the input phase for small and slow phase deviations. On the other hand, it loses lock for large and quick input variations. This points to a lowpass frequency response.

Frequency response of the PLL with a PI loop filter

The above Figure shows the frequency response of a PLL with PI loop filter: it is indeed a lowpass filter. Before we think of it having a sharp transition bandwidth, remember that the frequency axis is also drawn on a logarithmic scale. Moreover, the frequency scale is normalized to natural frequency \omega_n which makes the curve valid for all such PLLs.

The figure also reveals that the spectrum of this PLL as a lowpass filter is approximately flat between zero and \omega_n. This implies that the PLL should be able to track phase and frequency variations in the reference signal as long as these variations remain roughly below \omega_n.

By the same token, the bandwidth of this lowpass system varies with \omega_n. However, a better definition of the bandwidth is needed because the loop frequency response strongly depends on \zeta for the same \omega_n. Therefore, a bandwidth measure known as the equivalent noise bandwidth B_n (see Ref. [1]) is used that is related to natural frequency \omega_n and damping factor \zeta as

(3)   \begin{equation*}         B_n = \frac{\omega_n}{2} \left( \zeta + \frac{1}{4\zeta}\right)     \end{equation*}

for a PI loop filter.

Computing the Loop Constants


Designing a PLL in a software defined radio starts with defining the noise bandwidth B_n and damping coefficient \zeta.

[Loop noise bandwidth B_n:] As we will see in the Example below, there is a tradeoff involved between choosing

  • a small noise bandwidth that filters out most of the noise (and by extension the frequencies falling in the stopband), and
  • a large noise bandwidth that can track fast phase variations, i.e., higher frequencies (and by extension allowing more noise to enter through the loop).

These two objectives cannot both be achieved simultaneously. However, a software radio based approach allows some relaxation, as explained later in this article. For most communication receivers, a B_n value between 1\% and 5\% of the sample rate suffices for rejecting the noise and tracking the input phase.

[Damping factor \zeta:] On the other hand, a large \zeta results in no overshoots but a long convergence time, while a small \zeta exhibits relatively fast convergence but damped oscillations. A good balance between the two is achieved with 1/\sqrt{2} \approx 0.707 as a frequently used value. Typical values of \zeta range from 0.5 to 2 in practical applications.

Next, having already chosen a PI loop filter, there are four constants that need to be determined: K_0, K_D, K_P and K_i.

[K_0:] In a discrete-time system, the NCO gain K_0 can be easily fixed to a suitable value, say 1.

[K_D:] The phase error detector gain K_D is computed according to the structure and resulting expression of the phase error detector, some examples of which we will see in later examples below. Due to this dependence on the nature of phase error detector, K_D can be treated as a fixed given parameter around which the rest of the loop is designed.

[K_P, K_i:] With K_0 and K_D established, the PI filter coefficients K_P and K_i remain the two unknowns in the two equations below and hence can easily be computed. Under the assumption that the loop noise bandwidth is small compared to the sample rate

    \begin{equation*}         B_n \ll F_S,     \end{equation*}

control systems theory gives the following relationships between the loop constants and the loop parameters; see Ref. [2].

(4)   \begin{equation*}         \begin{aligned}             K_P &\approx \frac {1}{K_D K_0}\cdot \frac{4\zeta }{\zeta + \displaystyle\frac{1}{4\zeta}} \cdot \frac{B_n}{F_S}\\             &\\             K_i &\approx \frac {1}{K_D K_0}\cdot \frac{4}{\left(\zeta + \displaystyle\frac{1}{4\zeta}\right)^2} \cdot \left(\frac{B_n}{F_S}\right)^2         \end{aligned}      \end{equation*}

Notice that the PLL noise bandwidth is specified relative to the sample rate F_S. However, the symbol rate R_M is a more appropriate parameter in digital communication systems, and hence the noise bandwidth can be specified relative to R_M. Using the relation F_S = L\cdot R_M, where L is the number of samples/symbol, the above equations can be modified as

(5)   \begin{equation*}             \begin{aligned}             K_P &\approx \frac {1}{K_DK_0L}\cdot \frac{4\zeta }{\zeta + \displaystyle\frac{1}{4\zeta}} \cdot \left(\frac{B_n}{R_M}\right)\\             &\\             K_i &\approx \frac{1}{K_DK_0L^2}\cdot \frac{4}{\left(\zeta + \displaystyle\frac{1}{4\zeta}\right)^2} \cdot \left(\frac{B_n}{R_M}\right)^2             \end{aligned}         \end{equation*}

These are the equations we will use for computing the values for loop constants in specific PLL applications.
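As a quick sanity check, Eq (4) is straightforward to code. The following Python sketch reproduces the gains used in Example 1 below (Eq (5) follows by replacing B_n/F_S with (1/L)\cdot B_n/R_M):

def loop_constants(zeta, Bn_over_Fs, K_D, K_0=1.0):
    # Eq (4): PI loop filter gains under the assumption Bn << Fs
    denom = zeta + 1.0 / (4.0 * zeta)
    K_P = (4.0 * zeta / denom) * Bn_over_Fs / (K_D * K_0)
    K_i = (4.0 / denom ** 2) * Bn_over_Fs ** 2 / (K_D * K_0)
    return K_P, K_i

# Example 1 below uses zeta = 0.707, Bn/Fs = 0.05 and K_D = 0.5:
# loop_constants(0.707, 0.05, 0.5) returns approximately (0.2667, 0.0178)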

In summary, a software PLL whose code runs in a wireless device can be designed through the procedure outlined in Figure below.

Design procedure for a software PLL with a PI loop filter

Next, we cover a few examples to demonstrate the phase tracking capability of a PLL and how different parameters influence its performance.



Example 1


Assume that a PLL has to be designed such that it locks to a real sinusoid with discrete frequency k/N = 1/15 cycles/sample. Thus, the incoming sinusoid can be written as

    \begin{equation*}         r[n] = A \cos \left( 2\pi \frac{1}{15} n + \theta[n]\right)     \end{equation*}

where \theta[n] can be a slowly changing phase. Here, we set a constant phase angle \theta[n] = \pi that needs to be tracked. Such a large phase difference will enable us to clearly observe the PLL convergence process.

The block diagram for such a system is drawn in Figure below. The NCO has one complex output, or two real outputs with an inphase and a quadrature component.

    \begin{equation*}           \begin{aligned}             s_I[n]\: &= \cos \left(2\pi\frac{k}{N}n + \hat \theta[n] \right) \\             s_Q[n] &= -\sin \left(2\pi\frac{k}{N}n + \hat \theta[n] \right)           \end{aligned}     \end{equation*}

A discrete-time PLL with a phase error detector that computes the product between the input sinusoid and quadrature output of the NCO

Phase Error Detector


The phase error detector is a simple multiplier that forms the product between the input sinusoid and the quadrature component of the NCO output.

    \begin{align*}     e_D[n]      &= -\sin \left(2\pi\frac{k}{N}n + \hat \theta[n] \right) \cdot A \cos \left( 2\pi \frac kN n + \theta[n]\right) \\                 &= \frac A2 \sin \Big( \theta[n] - \hat \theta[n]\Big) - \frac A2 \sin \left(2\pi \frac {2\cdot k}{N} n + \theta[n] + \hat \theta[n] \right) \\                 &= \frac A2 \sin \Big( \theta_e[n]\Big) + \text{double frequency term}     \end{align*}

where the identity \cos(A)\sin(B) = \frac{1}{2} \left\{ \sin(A+B) - \sin(A-B) \right\} has been used. The second term involving 2\pi (2\cdot k/N) is the double frequency term that is filtered out by the loop filter. Therefore, the loop tracks the first term only, given by

    \begin{equation*}         e_D[n] \approx \frac A2 \sin \Big( \theta_e[n] \Big)     \end{equation*}

The S-curve is the sine of \theta_e which can be approximated using the identity \sin A \approx A for small A. For this reason, the phase error detector output is approximately linear for steady state operation around the origin.

    \begin{align*}         \overline{e_D} &= \frac A2 \sin \theta_e \\          &\approx \frac A2 \theta_e \quad \text{for small}~\theta_e     \end{align*}

This S-curve is plotted in Figure below.

S-curve corresponding to the product phase error detector

Loop Constants


From the above equation, the phase error detector gain K_D is clearly seen to be

    \begin{equation*}             K_D = \frac A2         \end{equation*}

and hence is a function of the sinusoid amplitude at the PLL input. Remember from Eq (4) that the loop filter gains K_P and K_i involve K_D in their expressions. If the input signal level is not controlled, the loop filter will have incorrect coefficients and the design will not perform as expected. In a wireless receiver, this amplitude is maintained at a predetermined value by an Automatic Gain Control (AGC).

For the purpose of this example, we assume that A is fixed to 1 and hence

    \begin{equation*}       K_D=0.5     \end{equation*}

Next, we design a PLL with damping factor

    \begin{equation*}         \zeta = 1/\sqrt{2} \approx 0.707     \end{equation*}

and loop noise bandwidth B_n = 5\% of the sample rate F_S, or

    \begin{equation*}         B_n/F_S = 0.05     \end{equation*}

Thus, plugging these parameters in Eq (4), we get

    \begin{align*}       K_P &= 0.2667 \\       K_i &= 0.0178     \end{align*}

After setting the rest of the parameters at the start of this example, the PLL can be simulated through any programming loop in a software routine that computes the sample-by-sample product of the loop input and the quadrature loop output. A runnable version of this routine, sketched here in Python, is shown next.


import numpy as np

# Loop constants computed above
K_0, K_D = 1.0, 0.5
K_P, K_i = 0.2667, 0.0178
k_N = 1 / 15                    # discrete frequency k/N, cycles/sample
theta = np.pi                   # input phase offset to be tracked

theta_hat = 0.0                 # NCO phase estimate
integrator_out = 0.0            # loop filter integrator state

for n in range(200):            # repeat for each sample
    # [NCO] outputs for time n from the phase accumulated so far
    cos_out = np.cos(2 * np.pi * k_N * n + theta_hat)    # inphase output
    sin_out = -np.sin(2 * np.pi * k_N * n + theta_hat)   # quadrature output (negative)

    # [Phase error detector] product of input sinusoid with quadrature output sinusoid
    sample_in = np.cos(2 * np.pi * k_N * n + theta)      # input with A = 1
    e_D = sample_in * sin_out

    # [Loop filter] adding previous integrator output, then computing loop filter output
    integrator_out = integrator_out + K_i * e_D
    e_F = K_P * e_D + integrator_out

    # [NCO] update theta_hat for the next sample
    theta_hat = theta_hat + K_0 * e_F

Phase Error Detector Output e_D[n]


We start with the plot shown in Figure below displaying e_D[n] that contains the following two components.

  • A slowly varying average component hidden in e_D[n] is shown in red. This can be thought of as the true phase error to which the loop responds by converging to the incoming phase. Observe that this error stays positive for the first 27 samples, then goes negative before settling to zero at around the 70 sample mark. Therefore, as we will see later, the NCO output should overcome the initial phase difference of \pi and track the input sinusoid after around 70 samples.
  • A constant amplitude sinusoid with double frequency 2 \cdot 2\pi (k/N) = 2\pi (2/15). Observe in the Figure that every 15 samples, there are 2 complete oscillations of this sinusoid riding on the average error curve. This is more clearly visible towards the end of the curve, where steady state is seen to be reached. Since A/2=0.5, the double frequency term varies between approximately -0.5 and +0.5, resulting in a peak-to-peak amplitude of 1.

Phase error detector output

Loop Filter Output e_F[n]


Figure below displays the error signal e_F[n] at the output of the loop filter. Trajectories similar to those in e_D[n] are visible. However, the amplitude of the double frequency term has been reduced from a peak-to-peak value of 1 in e_D[n] to a peak-to-peak value of 0.3. This behaviour reinforces the lowpass characteristic of a PLL. Recall that the input sinusoid has a discrete frequency k/N=1/15, chosen for better visualization of the error convergence. Had we chosen a higher frequency, the attenuation of the double frequency term would have been different.

Loop filter output

Phase Estimate \hat \theta[n]


The phase estimate \hat \theta[n] is plotted in the Figure below. Just like e_D[n] approaching zero after 70 samples, \hat \theta[n] is seen to approach \pi, the initial phase difference between the input and PLL output sinusoids. The oscillations due to the double frequency term still remain.

Observe that the phase estimate does not directly converge to the actual value of \pi. Instead, its average value exhibits oscillatory behaviour by going beyond \pi, then returning and oscillating around this value, which would have been clearer had the figure been extended to display more samples. This is due to the value chosen for the damping factor \zeta = 0.707.

Just like dropping a ball on the ground, the phase estimate \hat \theta[n] repeats damped oscillations before finally settling at the equilibrium.

Phase estimate at the output of the NCO

PLL Output Sinusoid


Finally, the output inphase sinusoid \cos \left( 2\pi (k/N) n + \hat \theta[n]\right) is shown along with the input sinusoid in the Figure below. Initially, there is a phase difference of \pi between them, but gradually the PLL compensates for this difference and locks onto the input sinusoid, tracking it successfully afterwards. This happens after around 70 samples, where e_D[n] was seen to approach zero.

Interestingly, the slight oscillatory behaviour shown by \hat \theta[n] can be spotted here as well after convergence has been reached, where the red dashed curve slightly leads and then slightly lags behind the input blue curve.

Input and output sinusoids

Similar to the above example, different PLLs can be designed for different phase detectors, different values of damping factor \zeta and loop noise bandwidth B_n and the results can be plotted to see how each parameter value affects the PLL behaviour. In the next example, we implement a PLL based on complex signal processing.

Example 2


The PLL now has to be designed such that it locks to a complex sinusoid with discrete frequency k/N = 1/15 cycles/sample. Thus, the input sinusoid can be written as

    \begin{equation*}           \begin{aligned}             r_I[n]\: &= A\cos \left(2\pi\frac{k}{N}n + \theta[n] \right) \\             r_Q[n] &= A\sin \left(2\pi\frac{k}{N}n + \theta[n] \right)           \end{aligned}     \end{equation*}

The block diagram for such a system is drawn in Figure below. The NCO also has a complex output, or two real outputs with an inphase and a quadrature component, written as

    \begin{equation*}           \begin{aligned}             s_I[n]\: &= \cos \left(2\pi\frac{k}{N}n + \hat \theta[n] \right) \\             s_Q[n] &= -\sin \left(2\pi\frac{k}{N}n + \hat \theta[n] \right)           \end{aligned}     \end{equation*}

A discrete-time PLL with complex input and output

Here, the phase detector first computes the product of the input and output complex sinusoids. Although the block diagram shows a simple product operator, remember that a complex product actually implements 4 real multiplications and 2 real additions. Using the trigonometric identities as before, the double frequency terms simply cancel out and the result of the product is

    \begin{equation*}           \begin{aligned}             \{r[n]\cdot s[n]\}_I\: &= \cos \left(\theta[n] - \hat \theta[n] \right) = \cos \theta_e[n] \\             \{r[n]\cdot s[n]\}_Q &= \sin \left( \theta[n] - \hat \theta[n] \right) = \sin \theta_e[n]           \end{aligned}     \end{equation*}

Note the difference in complex signal processing: the double frequency terms actually cancel out instead of being filtered out by the loop filter. Furthermore, the phase of the above complex signal is precisely the error signal which needs to be extracted to form e_D[n].

Accordingly, a four-quadrant inverse tangent, defined in the article on complex numbers, is used to compute the phase of this complex signal at the multiplier output. Therefore, the output of the phase detector is simply

    \begin{equation*}         e_D[n] = \measuredangle \big(r[n]\cdot s[n]\big) = \theta_e[n]     \end{equation*}

for which the corresponding S-curve is drawn in Figure below.

    \begin{equation*}         \overline{e_D} = \theta_e     \end{equation*}

Notice that the S-curve is linear over the whole range -\pi \le \theta_e \le \pi and its expression implies that the phase detector gain K_D =1.

S curve for the four-quadrant inverse tangent phase error detector
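In Python, this phase error detector is a one-liner; a minimal sketch (with r and s the complex input and NCO output samples at time n):

import numpy as np

def atan2_phase_detector(r, s):
    # With s[n] = exp(-j(2*pi*k/N*n + theta_hat)), the product r*s removes
    # the carrier (the double frequency terms cancel), leaving exp(j*theta_e);
    # the four-quadrant angle is amplitude independent, hence K_D = 1
    return np.angle(r * s)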

Now we design three PLLs with different damping factors \zeta and loop noise bandwidth normalized with sample rate B_n/F_S as follows.

[Case 1] \zeta = 1/2 and B_n/F_S = 0.05
[Case 2] \zeta = 3 and B_n/F_S = 0.05
[Case 3] \zeta = 3 and B_n/F_S = 0.01

Loop filter coefficients K_P and K_i can be found by plugging these values in Eq (4). Next, the PLL can be simulated as in the previous example and the phase detector output e_D[n] as well as the phase estimate \hat \theta[n] are plotted for these three PLLs in Figures below.

Phase error detector output for different damping factors and loop noise bandwidths

Output phase estimate for different damping factors and loop noise bandwidths

Here, some comments about acquisition and locking behaviour of a PLL are in order.

Comments on Locking and Acquisition


A complete study of PLL design and performance involves an in-depth mathematical formulation, including solutions to non-linear equations. Just as we took some key results for PLL design without deriving them, we next comment on some key parameters that govern its performance. We start with the parameters that specify the frequency range within which a PLL can be operated. The details of most of what follows are nicely explained in Ref. [1].

[Hold range] There is a critical value of the frequency offset between the input and output waveforms beyond which the slightest disturbance causes the PLL to lose phase tracking forever. This range, in which a PLL can statically maintain phase tracking, is known as the hold range F_H.

[Pull-in range] When the frequency offset of the reference signal in an unlocked state reduces below another critical value, the buildup of the average phase error starts decelerating, leading to an eventual system lock. This value is known as the pull-in range F_P. The pull-in range is significantly smaller than the hold range. Even though the pull-in process itself is relatively slow, the PLL will always become locked for an offset within this range.

[Lock range] Obtaining a locked state within a short time duration is desirable in most applications. If the frequency offset reduces below yet another value, called the lock range F_L, the PLL becomes locked within a single beat note between the reference and the output frequencies. The lock range is much smaller than the pull-in range; on the up side, however, the lock-in process itself is much faster than the pull-in process.

Remember that the lock actually implies that for each cycle of the input, there is one and only one cycle of NCO output. Even with a phase lock, both steady phase errors and fluctuating phase errors can be present. In practical applications, the operating frequency range of a PLL is normally restricted to the lock range.

In summary, the hold range and the lock range are the largest and the smallest, respectively, while the pull-in range lies somewhere within the boundaries set by them. Thus, the following inequality holds.

    \begin{equation*}     F_H > F_P > F_L \end{equation*}

Next, we describe two other important quantities which determine the suitability of a PLL for an application.

[Acquisition time] A PLL requires a finite amount of time, called the acquisition time, to successfully adjust to the incoming signal and reduce the phase error to zero. The acquisition time is the sum of the times to achieve frequency lock and phase lock. It is inversely proportional to B_n because a PLL with a large noise bandwidth lets more frequencies through its passband, consequently becoming able to track rapid variations and lock quicker than a PLL with a narrower noise bandwidth. In the above Figures, this can be noticed in a comparison between cases 2 and 3 with the same \zeta=3 but different loop noise bandwidths. As a general rule, for a frequency offset F_\Delta,

(6)   \begin{equation*}             T_{\textmd{acq}} \propto \frac{F_\Delta^2}{B_n^3}         \end{equation*}

[Tracking error] The performance of a PLL is determined by the tracking error, which is the power of the phase error signal. For a PLL in tracking mode (i.e., during linear operation), the noise power at the PLL input, for AWGN with power spectral density N_0 (defined in the article on AWGN) and loop noise bandwidth B_n, is

    \begin{equation*}             P_w = N_0 \cdot B_n         \end{equation*}

For a sinusoidal input with power P_s at the PLL input, the ratio P_s/P_w is the Signal-to-Noise Ratio (SNR). The expression for the tracking error \rho_{\theta_e} is

(7)   \begin{equation*}             \rho_{\theta_e} = \frac{N_0 B_n}{P_s} = \frac{P_w}{P_s} = \frac{1}{P_s/P_w}         \end{equation*}

Therefore, the tracking error in the presence of AWGN is inversely proportional to SNR and consequently directly proportional to B_n. It makes sense that a wider bandwidth allows a larger amount of noise at the PLL output, thus increasing the tracking error.

A Fast Lock Technique


From Eq (6) above, it is evident that selecting a large B_n in PLL design results in faster acquisition, since the acquisition time is inversely proportional to a power of B_n. On the other hand, Eq (7) shows that a narrow B_n generates less tracking error, since the tracking error is directly proportional to B_n. In conclusion, a good PLL design balances the opposing criteria of fast acquisition and reduced tracking error.

In the world of hardware radio, PLL designers had to balance these two performance criteria by finding an acceptable compromise. The realm of software radio offers a better solution due to our ability to change the code on the fly which is explained below.

In an unlocked state, the PLL noise bandwidth B_n is made large so that the loop can achieve a fast lock. In parallel, an algorithm known as a lock detector is run, which generates a binary output depending on whether the PLL has acquired lock or not. Once lock is detected, and since the loop constants are reconfigurable in this scenario, their values are changed such that the PLL bandwidth B_n is reduced to a smaller value.
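A minimal sketch of this bandwidth switching, reusing the loop_constants() helper from the design section above (the lock detector itself is application specific and only assumed here; the bandwidth values are illustrative):

def select_loop_gains(locked, K_D, zeta=0.707):
    # Wide bandwidth while acquiring, narrow bandwidth once the lock
    # detector reports a lock
    Bn_over_Fs = 0.01 if locked else 0.05
    return loop_constants(zeta, Bn_over_Fs, K_D)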

In other articles, we discuss carrier and timing synchronization procedures in a communications receiver. These blocks incorporate a PLL as an integral component.

References


[1] R. Best, Phase Locked Loops – Design, Simulation and Applications (6th Edition), McGraw Hill, 2007.
[2] M. Rice, Digital Communications – A Discrete-Time Approach, Prentice Hall, 2009.

What is Carrier Frequency Offset (CFO) and How It Distorts the Rx Symbols


In Physics, frequency in units of Hz is defined as the number of cycles per unit time. Angular frequency is the rate of change of phase of a sinusoidal waveform with units of radians/second.

    \begin{equation*}         2\pi f = \frac{\Delta \theta}{\Delta t}     \end{equation*}

where \Delta\theta and \Delta t are the changes in phase and time, respectively. A Carrier Frequency Offset (CFO) usually arises due to two reasons:

  1. [Frequency mismatch between the Tx and Rx oscillators] No two devices are the same and there is always some difference between the manufacturer’s nominal specification and the real frequency of that device. Moreover, this actual frequency keeps changing (slightly) with temperature, pressure, age, and some other factors.
  2. [Doppler effect] A moving Tx or Rx, or any kind of movement around the channel, creates a Doppler shift that appears as a carrier frequency offset as well. We will learn more about it during the discussion of a wireless channel in another post.

To see the effect of the carrier frequency offset F_\Delta, consider again a received passband signal consisting of two PAM waveforms in I and Q arms.

(1)   \begin{align*}         r(t) &= v_I(t) \sqrt{2} \cos \Big[2\pi (F_C+F_\Delta) t + \theta_\Delta\Big] - v_Q(t) \sqrt{2}\sin \Big[ 2\pi (F_C+F_\Delta) t + \theta_\Delta \Big]\nonumber\\         &= v_I(t) \sqrt{2} \cos \Big[2\pi F_Ct + 2\pi F_\Delta t + \theta_\Delta\Big] - \nonumber \\ &\hspace{2in}v_Q(t) \sqrt{2}\sin \Big[ 2\pi F_C t+ 2\pi F_\Delta t + \theta_\Delta\Big]     \end{align*}

Here, the carrier offset can be seen as the term 2\pi F_\Delta t, which changes with time. Let us look into how to find F_{\Delta,\textmd{max}}, the maximum value the CFO can take.

The accuracy of local oscillators in communication receivers is specified in ppm (parts per million). 1 ppm is just what it says: 1 part out of 10^6. To get a feel for how big or small this number is, note that 10^6 seconds translate into

    \begin{equation*}             \frac{10^6~ \textmd{seconds}}{24~ \textmd{hours/day} \times 3600 ~\textmd{seconds/hour}} \approx 11.5~ \textmd{days}         \end{equation*}

Hence, 1 ppm is equivalent to a deviation of 1 second every 11.5 days. This might sound entirely harmless, but for typical wireless communication systems operating at several GHz of carrier frequency F_C and several MHz of symbol rate R_M, it is one of the major sources of signal distortion.

The ppm rating of the oscillator crystal indicates how much its frequency may deviate from the nominal value. Consider an example of a wireless system operating at 2.4 GHz with \pm 20 ppm crystals, which is a more or less standard rating. The maximum deviation of the carrier frequency at the Tx or the Rx can then be

    \begin{equation*}             \pm \frac{20}{10^6} \times 2.4 \times 10^9 = \pm 48~\textmd{kHz}         \end{equation*}

However, in the worst-case scenario, the Tx can be 20 ppm above (or below) the nominal frequency while the Rx is 20 ppm below (or above) it, resulting in an overall difference of 40 ppm between the two. So the worst-case CFO due to local oscillator mismatch in this example can be

    \begin{equation*}             \pm 2 \times 48 = \pm 96~\textmd{kHz}         \end{equation*}

Keep in mind that this calculation covers the nominal rating alone; the actual frequency may deviate further depending on environmental factors, mainly temperature and aging. Finally, movement anywhere in the channel (whether by the Tx, the Rx, or some other object in the environment) adds a Doppler shift, which can be up to several hundred Hz. Although this Doppler shift is much smaller than the oscillator-generated mismatch, it severely distorts the Rx signal due to the changes it causes in the channel.
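The ppm arithmetic above is easy to script. A minimal Python sketch with the same illustrative numbers (a 2.4 GHz carrier and \pm 20 ppm crystals at both ends):

    F_C = 2.4e9                       # nominal carrier frequency, Hz
    ppm = 20                          # crystal accuracy rating on each side
    dev_one_side = ppm / 1e6 * F_C    # +/- 48 kHz at the Tx or the Rx alone
    cfo_worst = 2 * dev_one_side      # +/- 96 kHz when Tx and Rx err in opposite directions
    print(dev_one_side, cfo_worst)    # 48000.0 96000.0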

Now we turn our attention to the Figure below to differentiate between three possibilities in which F_\Delta can dictate the receiver design.

Three different cases for carrier frequency offset

Case 1: CFO > F_{\Delta,\textmd{max}}


It was illustrated in another article that the input signal at the Rx is filtered by an analog prefilter to remove the out-of-band noise. Ideally, the frequency response G(F) of this prefilter should be flat within the frequency range

    \begin{equation*}         |F| \le B + F_{\Delta,\textmd{max}}     \end{equation*}

so that the incoming signal can pass through in an undistorted manner (we are assuming a flat wireless channel here as well; actual wireless channels will be discussed later). The passband width of this prefilter is designed according to the maximum CFO F_{\Delta,\textmd{max}} expected at the Rx.

However, if the CFO is greater than F_{\Delta,\textmd{max}}, then much of the intended signal is filtered out by the analog prefilter, and the sampled Rx signal no longer resembles the Tx signal as a linear function of the Tx data. No amount of signal processing can then recover the signal. Since this is an outcome of poor system design, the only remedy is to redesign the system (particularly the analog frontend) with more accurate estimates of, and margins for, the CFO and other such random distortions.

Case 2: 15\% of R_M < CFO < F_{\Delta,\textmd{max}}


Since CFO < F_{\Delta,\textmd{max}}, the Rx signal lies within the passband of the analog prefilter G(F) and suffers no distortion there. However, remember from a previous post that to maximize the SNR, the Rx signal must be passed through a matched filter. That is not possible in this case: with the CFO greater than 15\% of the symbol rate R_M, the spectrum of the Rx signal does not sufficiently overlap with the matched filter response, and applying the matched filter would remove a significant portion of the incoming signal energy.

Since the Rx signal cannot be matched filtered without distortion, and it cannot be downsampled to 1 sample/symbol without matched filtering, it is easy to deduce that more than 1 sample/symbol (say, L) is required to track the frequency offset.

Another route to understanding the rationale behind the factor of 15\% of the symbol rate R_M is to evaluate the relation

    \begin{equation*}         F_{\Delta} = 0.15 R_M \quad \Rightarrow \quad T_\Delta \approx 6.67~T_M     \end{equation*}

For CFO > 15\% of R_M, an offset cycle completes in fewer than 6.67 symbol durations, which induces a phase rotation of more than 0.15 \times 360^\circ = 54^\circ across a single symbol. Therefore, the signal cannot be downsampled to 1 sample/symbol for frequency tracking.
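A two-line check of this boundary in Python (the numbers are just the boundary case itself):

    F0 = 0.15                    # CFO normalized by the symbol rate
    symbols_per_cycle = 1 / F0   # ~6.67 symbols for one full offset cycle
    deg_per_symbol = 360 * F0    # 54 degrees of rotation per symbol
    print(symbols_per_cycle, deg_per_symbol)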

Case 3: CFO < 15\% of R_M


When the CFO is less than 15\% of R_M, an offset cycle takes more than 6.67 symbol durations to complete, and hence the rotation across one symbol, although still significant, is small enough to be tracked at symbol rate, i.e., at 1 sample/symbol. In other words, matched filtering keeps most of the signal intact, and as a result the symbol boundaries can be marked first (a problem known as symbol timing synchronization, which we discuss in another article) before the carrier frequency is estimated from the ISI-free symbol-spaced samples.


In the absence of noise, the sampled version of this mismatch in Eq (1) becomes 2\pi F_\Delta nT_S. Eq (1) is then very similar to the phase rotation equation, and hence we can directly use that result with the proper substitution. The expression for the symbol-spaced samples in the presence of a CFO F_{\Delta} is obtained after replacing the sample time index n with the symbol time index m.

    \begin{equation*}       \begin{aligned}         z_I(mT_M) &= a_I[m] \cos 2\pi F_\Delta mT_M - a_Q[m]\sin 2\pi F_\Delta mT_M       \\         z_Q(mT_M) &= a_I[m] \sin 2\pi F_\Delta mT_M + a_Q[m]\cos 2\pi F_\Delta mT_M       \end{aligned}     \end{equation*}

In the above equation,

    \begin{equation*}         2\pi F_\Delta mT_M = 2\pi \frac{F_\Delta}{R_M} m = 2\pi F_0 m     \end{equation*}

where F_0 is defined as the normalized Carrier Frequency Offset (nCFO): the CFO normalized by the symbol rate.

    \begin{equation*}           F_0 = \frac{F_\Delta}{R_M}     \end{equation*}

This normalization is very important, as we saw in the last subsection. For instance, the worst-case 96 kHz CFO computed earlier amounts to F_0 = 0.096 at a symbol rate of, say, 1 MHz, which places it under Case 3. The resulting expression takes the form

    \begin{equation*}       \begin{aligned}         z_I(mT_M) = a_I[m] \cos 2\pi F_0m - a_Q[m]\sin 2\pi F_0m       \\         z_Q(mT_M) = a_I[m] \sin 2\pi F_0m + a_Q[m]\cos 2\pi F_0m       \end{aligned}     \end{equation*}

In polar form, this expression can be written as

    \begin{equation*}         \begin{aligned}           |z(mT_M)| &= \sqrt{a_I^2[m] + a_Q^2[m]} \\           \measuredangle z(mT_M) &= \measuredangle \Big(a_Q[m],a_I[m]\Big) + 2\pi F_0 m         \end{aligned}     \end{equation*}

Notice from the above equation that a carrier frequency offset F_\Delta keeps the magnitude unchanged but continually rotates the desired outputs a_I[m] and a_Q[m] on the constellation plane. This is drawn for symbol-spaced samples in the scatter plots of the Figures below for a 4-QAM and a 16-QAM constellation.

A 4-QAM and a 16-QAM constellation spinning in circles due to a carrier frequency offset (CFO)

For this reason, a Carrier Frequency Offset (CFO) F_{\Delta} in the received signal spins the constellation in a circle (or multiple circles for higher-order modulations). This is a natural outcome, since angular frequency is defined as the rate of change of phase.
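This spinning is easy to reproduce numerically. A minimal Python sketch of the symbol-spaced samples above for 4-QAM, where the nCFO value and the symbol count are arbitrary choices:

    import numpy as np

    rng = np.random.default_rng(0)
    M_sym = 500
    a_I = rng.choice([-1.0, 1.0], M_sym)    # inphase 4-QAM symbols
    a_Q = rng.choice([-1.0, 1.0], M_sym)    # quadrature 4-QAM symbols
    F0 = 0.002                              # normalized CFO, F_Delta/R_M
    m = np.arange(M_sym)
    # z_I + j z_Q = (a_I + j a_Q) exp(j 2 pi F0 m): fixed magnitude, ramping phase
    z = (a_I + 1j * a_Q) * np.exp(1j * 2 * np.pi * F0 * m)
    # a scatter plot of (z.real, z.imag) traces a circle through the four points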


The above results are summarized in the Table below.

Carrier frequency offset (CFO) classification

    Case   CFO range                                   Consequence at the Rx
    1      CFO > F_{\Delta,\textmd{max}}               Signal largely removed by the analog prefilter; redesign the analog frontend
    2      0.15 R_M < CFO < F_{\Delta,\textmd{max}}    Matched filtering not yet possible; estimate the CFO at L > 1 samples/symbol
    3      CFO < 0.15 R_M                              Matched filter and downsample; track the CFO at 1 sample/symbol

Moving Average Filter

Coefficients of a moving average filter in time domain

The most commonly used filter in DSP applications is the moving average filter. In today’s world of extremely fast microprocessor clock speeds, it may seem strange that an application would demand the simplest of operations. But that is exactly the case for most applications in embedded systems, which run on limited battery power and consequently host small microcontrollers. For noise reduction, a moving average filter can be implemented with a few adders and delay elements. For lowpass filtering, an excellent frequency response and substantial suppression of stopband sidelobes matter less than having basic filtering functionality at minimal cost, and this is where a moving average filter is most helpful.

As the name implies, a length-M moving average filter averages M input signal samples to generate one output sample. Mathematically, it can be written as

    \begin{align*}       r[n] &= \frac{1}{M} \sum \limits_{m=0}^{M-1} s[n-m]\\            &= \sum \limits_{m=0}^{M-1} \frac{1}{M} \cdot s[n-m]  \\            &= \sum \limits_{m=0}^{M-1} h[m] s[n-m]     \end{align*}

where as usual, s[n] is the input signal, r[n] is the system output and h[n] is the impulse response of the moving average filter given by

    \begin{equation*}         h[n] = \frac{1}{M}, \qquad n = 0,1,\cdots,M-1     \end{equation*}
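As a minimal Python sketch of this definition (the function name is mine; np.convolve evaluates the same sum r[n] = \sum_m h[m] s[n-m]):

    import numpy as np

    def moving_average(s, M):
        h = np.ones(M) / M           # h[n] = 1/M for n = 0, 1, ..., M-1
        return np.convolve(s, h)     # r[n] = sum over m of h[m] s[n-m]

    s = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    print(moving_average(s, 3))      # edge outputs average fewer (zero-padded) samples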

The above discussion leads us to the following observations.

Impulse response From the above equation, the impulse response of a moving average filter can be seen to be a rectangular pulse, which is drawn in the Figure below. In terms of filtering, a constant amplitude implies that the filter treats all input samples with equal significance. This results in a smoothing operation on a noisy input [1].

Coefficients of a moving average filter in time domain

To see an example of noise reduction in the time domain, consider the pulse in the Figure below, to which random noise has been added. Observe that a longer filter performs a better smoothing action on the input signal. However, the longer length translates into wider transition edges.

A noisy signal and its smoothed version for two different filter lengths
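A minimal sketch reproducing this kind of experiment, assuming a rectangular pulse in white Gaussian noise; the pulse shape, the noise level, and the two filter lengths (matching the M = 5 and M = 11 compared later) are assumptions of mine:

    import numpy as np

    rng = np.random.default_rng(0)
    s = np.zeros(200)
    s[50:150] = 1.0                              # clean rectangular pulse
    noisy = s + 0.2 * rng.standard_normal(200)   # pulse buried in random noise

    for M in (5, 11):
        h = np.ones(M) / M
        smoothed = np.convolve(noisy, h, mode="same")
        # longer M -> smoother plateau, but wider transition edges
        print(M, smoothed[95:100].round(2))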

Magnitude response As a consequence of its rectangular impulse response, the frequency response of a moving average filter is the same as the frequency response of a rectangular pulse, i.e., a sinc-like function (in discrete time, the aliased or periodic sinc). The magnitude response of such a filter is drawn in the Figure below. It is far from the brickwall response of an ideal lowpass filter, but it still has a shape that passes low frequencies with some distortion and attenuates high frequencies to some extent.

Frequency response of a moving average filter

Due to this insufficient suppression of sidelobes, it is arguably the worst lowpass filter one can design. On the other hand, as described earlier, its simplicity of implementation makes it useful for many applications requiring high sample rates, including wireless communication systems. In such conditions, several moving average filters are cascaded in series such that each subsequent stage provides further suppression of the higher frequencies in the input signal.

Due to the inverse relationship between the time and frequency domains, a longer filter in the time domain produces a narrower mainlobe in the frequency response. This can be observed in the impulse and frequency responses for filter lengths M=5 and M=11.
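This narrowing can be verified with the closed-form magnitude of a length-M moving average, |H(\Omega)| = |\sin(M\Omega/2)/(M\sin(\Omega/2))|, i.e., the aliased sinc mentioned above; the frequency grid and the -3 dB criterion in this Python sketch are arbitrary choices:

    import numpy as np

    w = np.linspace(1e-6, np.pi, 512)   # frequency grid (0 excluded to avoid 0/0)
    for M in (5, 11):
        H = np.abs(np.sin(M * w / 2) / (M * np.sin(w / 2)))   # aliased-sinc magnitude
        edge = w[np.argmax(H < 1 / np.sqrt(2))]               # rough -3 dB edge
        print(f"M = {M:2d}: -3 dB edge near {edge:.3f} rad/sample")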

Phase response As described in the article on FIR filters, a moving average filter delays the output signal by (M-1)/2 samples relative to the input (an integer number of samples only for odd M). This delay manifests itself in the form of a linear phase in the frequency domain.

Convolution Notice from the above equations that the filter output is the convolution of the input signal with the impulse response of the moving average filter, a rectangular pulse.

References

[1] S. W. Smith, The Scientist and Engineer’s Guide to Digital Signal Processing (1st Edition), California Technical Publishing, 1997.