Array Signal Processing Basics
Array Signal Processing Basics
Outline
Goal: The objective of this course is to introduce traditional methods of array signal processing for
multivariate observations, including spatial filtering (beamforming), direction of arrival estimation and
multiple-input multiple-output (MIMO) communication systems.
The use of transmitters and/or receiver with multiple antennas is nowadays widespread in modern
wireless communications, localization and radar systems. The trend of increasing the number of
antenna in any device will likely continue in the next years. After this course, the student will have
the understanding of the fundamental concepts of multi-antenna signal processing and the capability
to apply them to the design of future telecommunications and positioning systems.
Course Syllabus
2. Spatial filtering.
• Space-time filtering and beamforming.
• Design of spatial reference beamformers.
• Capon beamformer. Direction of arrival estimation.
• Design of time reference beamformers.
• Adaptive filtering: LMS and RLS.
Bibliography
• S. Kay, Fundamentals of statistical signal processing. Estimation theory, vol. I, Prentice-Hall, 1993.
• S. Kay, Fundamentals of statistical signal processing. Detection theory, vol. II, Prentice-Hall, 1998.
• Don H. Johnson, Dan E. Dudgeon, Array Signal Processing, Concepts and Techniques, Prentice
Hall, 1993.
• S. Haykin, Array signal processing, Prentice Hall, Englewood Cliffs, NJ, 1985.
• E. Larsson, P. Stoica, Space-time block coding for wireless communications, Cambridge University
Press, UK, 2003.
• A. Lee Swindlehurst et al., Applications of Array Signal Processing. In: R. Chellappa and S.
Theodoridis, editors, Academic Press Library in Signal Processing. Vol 3, Array and Statistical Signal
Processing. Online: [Link]
• H. Van Trees, Optimum Array Processing, part IV of Detection, Estimation and Modulation Theory,
New York, Wiley 2002.
Evaluation
The final evaluation of the second part of the course will take into account:
• Two Exams (1 and 2).
• Case Studies ( ).
• The grade () of the second part will be the result of the following formula
= max [05 · 1 + 05 · 2 033 · 1 + 033 · 2 + 034 · ]
• There will be an optional second chance exam for those that wish to replace 1 and 2. If this
second chance exam is taken, the final grade will be computed as
= max( 066 · + 034 · )
Introduction
• The objective of this chapter will be to present the basics of array signal processing, giving special
emphasis on the signal model.
• First, a brief overview of the baseband signal model an the analytic (complex envelope) signal
representation will be given.
• Secondly, we will focus on the far-field narrowband approximation and we will present the corre-
sponding array signal model, which will be used throughout the course.
• Finally, we will study the relationship between the array spatial configuration, the direction of arrival
(DoA) and the spatial covariance matrix, which will be the most important objects that will be
considered in this course.
Let () denote a real-valued continuous-time signal that has a bandpass spectrum centered at
= 2, that is Z ∞
( ) = () e−2 j
−∞
Noting that () is real-valued, we can ensure that
Z ∞
∗( ) = () e2 j = (− )
−∞
We define the complex baseband signal associated to a real-valued bandpass () centered at as
() = () e−2 j
namely
( ) = ( + )
Observe that the signal () is generally complex, lowpass and has a spectrum centered at zero.
It turns out that () is a very convenient way of representing the original (), because (using
∗( ) = (− ))
∙Z ∞ ¸
£ ¤
Re () e2 j = Re [()] = Re ( ) e2 j
∙Z ∞ −∞ ¸ Z ∞ Z ∞
= 2 Re ( ) e2 j = ( ) e2 j + ∗( ) e−2 j
Z ∞ 0 0 0
We define the in-phase and quadrature components of the narrowband signal as the real and imaginary
parts of the complex baseband representation, namely
() = Re [()] () = Im [()]
Observe that we can represent the original real-valued signal () as
£ ¤ h i
2 j j arg(()) 2 j
() = Re () e = Re |()| e e
= |()| cos (2 + arg (()))
and therefore a phase shift of the original real-valued () corresponds to a shift in the complex angle
of (). We can also write
() = () cos (2) − () sin (2)
In signal processing, we generally work with the complex baseband signal () instead of (), and
this will also be the case in array signal processing applications.
Let us assume that a signal with associated complex baseband waveform () is received by a set of
sensors. Let denote the delay between the source and the th sensor, so that
h i £ ¤
2 j (− ) 2 j
( − ) = Re ( − ) e = Re () e
where
() = ( − ) e−2 j
is the complex baseband signal associated to the th sensor.
Let y() denote a column vector containing the I/Q signal of the sensors of the array. If the sensors
are isotropic, we can express
⎡ ⎤ ⎡ −2 j 1
⎤
1() ( − 1) e
.
y() = ⎣ .. ⎦ = ⎣ ... ⎦
() ( − ) e−2 j
There exists a direct relationship among the delays , the array geometric configuration (position of
the sensors) and the position of the source.
Next, we examine this dependence for different array configurations under the assumption is far away
from the array, so that the wavefront can be considered to be planar.
sin
+1 − =
Chapter 1: Introduction to Array Processing 16/57
Univsersitat Autònoma de Barcelona
In general, we can compute the delay of the waveform at a certain element of the array by computing
the scalar product between the pointing vector of the waveform and the position vector of the sensor
(with respect to a phase reference), that is = 1 k r where kkk = 1.
A Uniform Planar Array consists of a number of elements ( × ) arranged along a rectangular grid
with constant interelement separation (, ). The Direction of Arrival (DoA) is determined by two
angles: elevation () and azimuth ().
We can take the first sensor as phase reference, and consider the array element in the (, )th position
of the grid. The pointing vector can be expressed as
Narrowband approximation
Recall that the complex baseband signal at the output of the sensor can be expressed as
() = ( − ( )) e−2 j ()
The narrowband signal model follows from assuming that the group delay is the same in all sensors,
so that
( − ( )) ≈ ( − 1 ( )), = 1
This implies that we can write
⎡ ⎤
⎡ ⎤ 1
1() ⎢ e−2 j ( 2()− 1()) ⎥
. ⎢ ⎥
y() = ⎣ .. ⎦ = ( − 1 ( )) e −2 j 1 ()
⎢ .
.. ⎥ = ()a ( )
| {z }⎣ ⎦
() Signal at the reference sensor −2 j ( ()− ())
e 1
| {z }
Steering Vector
We can examine the conditions under which the narrowband approximation holds true.
Consider the Fourier transform of the identity () = ( − ) e−2 j
∙Z ∞ ¸
−2 j −2 j
£ −2 j
¤ −2 j
( ) = ( − ) e e = ( ) e e
−∞
Now, assume that the bandwidth of ( ) is so small that we can approximate
( ) e−2 j ( − 1) ≈ ( )
and therefore () ≈ ( − 1) e−2 j . In order for the above approximation to hold, we need
( − 1) ¿ 1 for ∈ [− 2 2] where is the bandwidth of the signal. A necessary
condition for this is that µ ¶
¿ 1 ⇐⇒ ¿ 1
It is important to point out that the narrowband approximation depends on both the received signal
and the dimensions of the array (in wavelengths). For the same signal, the narrowband approximation
may or may not be valid depending on the dimensions of the array.
• Satellite communications/radar
20 MHz
∼ = 10−3 =⇒ 1000 for wideband.
20 GHz
• Mobile communications:
20 MHz
∼ = 10−2 =⇒ 100 for wideband.
2 GHz
• Microphones:
20 kHz
∼ = 2 =⇒ 2 for wideband.
10 kHz
…
…
Assuming that all the sensors have perfect (distortionless) receivers with sampling at every seconds,
we can write the received signal vector (snapshot) as
y[] = y() = []a ( )
where [] is the transmitted (source) signal up to a delay and phase rotation and a ( ) is the
steering vector. We have seen above that when the sensors are isotropic, the entries of a ( )
are complex exponentials with argument given by the product between the carrier frequency and the
interelement delay of the waveform:
⎡ ⎤ ⎡ ⎤
1 1
⎢ e−2 j ( 2()− 1()) ⎥ ⎢ e−2 j sin ⎥
⎢ ⎥ ⎢ ⎥
a ( ) = ⎢ ... ⎥ =⇒ (for ULAs) a () = ⎢ ... ⎥
⎣ ⎦ ⎣ ⎦
e−2 j ( ()− 1()) e−2 j (−1) sin
Note that the steering vector for ULAs a () is reminiscent of the Fourier basis (spatial frequency).
Spatial frequency
Non-idealities
• Background noise: The thermal noise in the different receiver chains originate an additional
noise term, so that
y () = () a ( ) + n()
and where typically the components of n() do not depend on the direction of arrival (spatially
white noise).
• Non-isotropic array elements: We can easily introduce the effect of having non-isotropic
receiving element, having a spatial response ( ). Assuming that all the elements have the same
radiation diagram, we can write
y () = () ( )a ( )
Hence, in practice ( ) can be included in the definition of (), which was defined as
() = ( − 1 ( )) e−2 j 1()
Non-idealities (ii)
• Lack of Calibration: The different receiver chains do not have exactly the same components,
and therefore each one presents a different response in the spatial domain.
• Mutual coupling: The radiating elements and the receiver chains present a mutual coupling that
is difficult to characterize and compensate.
X
y[] = []a ( ) + n[]
=1
where we have assumed that the th source is received from DoA ( ). This can also be more
compactly expressed as
y[] = Ax[] + n[]
where
£ ¤
A = a (1 1) · · · a ( )
£ ¤
x[] = 1[] · · · []
• The noise n[] is typically considered Gaussian distributed, with zero mean and covariance
£ ¤
E n[]n [] = R −
Observe that typically the noise is assumed uncorrelated (white) in the time domain. We say that
the noise is spatially white when
R = 2I
where 2 0 is the noise power.
• The signals can be assumed deterministic or random. In this second case, they are typically assumed
to be also uncorrelated in the time domain, but some inter-source correlation may exist.
£
¤
E x[]x [] = R −
Correlation appears, for instance, when the same source is received through multiple directions of
arrival.
Coherent sources
In the extreme case where two signals become proportional to one another, their correlation matrix
R becomes singular. In this situation, we say that the two sources are coherent.
⎛ ⎞
y[] = 1[] ⎝a (1 1) + a (2 2)⎠ + n[]
| {z }
Equivalent Steering
Since noise and signal are always statistically independent, we can easily compute the covariance
matrix of the observations as
£ ¤ h i
R = E y[]y [] = E (Ax[] + n[]) (Ax[] + n[])
£ ¡ ¢¤
= E (Ax[] + n[]) x []A + n []
£ ¤ £ ¤ £ ¤ £ ¤
= AE x[]x [] A + AE x[]n [] + E n[]x [] A + E n[]n []
| {z } | {z }
=0 =0
= ARA + R
When the noise is spatially white
R = ARA + 2I
and, if the sources are uncorrelated,
⎡ ⎤
1 X
R = ⎣ ... ⎦ =⇒ R = a ( ) a ( ) + 2I
=1
In practice, however, nothing is known (R,R, ,( )) and we are unable to evaluate
R = ARA + R
This matrix is in practice estimated using the sample covariance matrix. If we have snapshots
(sampled vectors) given by y [1],. . . ,y [], the estimation is given by
1 X
R̂ = y [] y []
=1
If is sufficiently high, we will have
£ ¤
R̂ ' E y[]y [] = R
In non-static scenarios (e.g. moving sources) other estimations are preferred, such as
R̂+1 = R̂ + (1 − ) y [ + 1] y [ + 1]
where ∈ (0 1) is a forgetting factor.
Wideband processing
In some applications (e.g. in sonar), it is not feasible to work on the narrowband model assumption.
In these applications, typically subband stacking is employed.
FFT
Subband 1
Subband 2
FFT
Subband L
FFT
Bibliography
[1] R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1985.
[2] J.R. Magnus and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and
Econometrics, Wiley and sons., 1999.
Outline
Goal: To introduce the basic tools used in multiple antenna processing. Review the notation,
properties and basic algebraic operations from matrix theory as well as fundamental aspects of
optimization theory.
Contents:
[Link] and some matrix algebra.
[Link] and eigenvectors.
[Link] value decomposition and relationship with the eigenvalues/eigenvectors.
[Link] random vectors and mutidimensional Gaussian distribution. Circular symmetry.
Appendix
Matrix notation
• Matrices will be denoted as upper-case boldface letters (A).
• Column vectors will be denoted as lower-case boldface letters (a).
⎡ ⎤ ⎡ ⎤
11 · · · 1 1
A = ⎣ .. . . .. .
.. ⎦ a = ⎣ ... ⎦
1 · · ·
• We define the trace of a squared matrix as the sum of the diagonal entries of the matrix, namely
P
tr [A] = =1 .
• We will denote as det [A] the determinant of the matrix. For the 2 × 2 case, we will have
∙ ¸
det 11 12 = 1122 − 2112
21 22
• The Frobenius norm of a matrix is defined as
rX
kAk = | |2
• We have (AB) = B A
• Rotational property of the trace: tr [AB] = tr [BA] (A and B not necessarily square).
• Rotational property of the determinant: det [I + AB] = det [I + BA] (A and B not necessarily
square).
• The column (respectively row) rank of a matrix is the number of columns (respectively rows) that
are linearly independent. A fundamental theorem in algebra states that the row rank of a matrix
coincides with its column rank. We will generally denote it as rank (A).
• A square matrix is full-rank if and only if its determinant is different from zero.
• Assume that there exist a complex value and a complex column vector e such that, for a square
A,
Ae =e
We say that e is an eigenvector of A and its associated eigenvalue.
Remark: Thanks to the fundamental theorem of the algebra, we know that the eigenvalues always
exist (they may have multiplicity higher than two). The multiplicity of the root of the polynomial
() is referred to as the algebraic multiplicity of the corresponding eigenvalue.
√
Remark: In the complex field, ± · are the two branches of the complex square root.
Chapter 1: Introduction to Array Processing 42/57
Univsersitat Autònoma de Barcelona
We recall that eigenvectors of a square matrix are column vectors that, after being left-multiplied by
the matrix, remain proportional to the original, i.e. e is an eigenvector of the matrix A if there exists
a scalar (eigenvalue) such that
Ae =e
Observe that we can re-write this equation as
[A−I] e = 0
and this is telling us that the matrix [A−I] is rank-deficient. Therefore, det [A−I] = 0 and
must be an eigenvalue of A. We say that e is an eigenvector of A associated with the eigenvalue .
Remark: An eigenvector is always associated with an eigenvalue, but an eigenvalue may be associated
with multiple eigenvectors. The geometric multiplicity of an eigenvalue is equal to the dimension of
the subspace spanned by the associated eigenvectors. The geometric multiplicity is always lower than
or equal to the arithmetic multiplicity.
Diagonalizable matrices
We say that a square matrix A is diagonalizable if there exists a diagonal matrix Λ and an orthogonal
matrix E (EE = E E = I) such that
A = EΛE
In this case, the diagonal entries of Λ contain the eigenvalues of A, and the columns of E contain the
associated eigenvectors. Furthermore, the geometric and arithmetic multiplicities of all eigenvalues
coincide.
Let 1,2 denote the two eigenvalues of a 2 × 2 Hermitian matrix A, and let e1, e2 denote the two
corresponding eigenvectors. Observing that
∙ ¸
11 − 1 12
e1 = 0
12 22 − 1
we can readily write ∙ ¸
1 −12
e1 = q
11 − 1
212 + (11 − 1)2
and equivalently for e2 (replacing 1 with 2). This implies that we are able to write (if E = [e1 e2])
∙ ¸
1
A=E E =1e1e
1 + 2 e2 e2
2
When the matrix is not Hermitian, it may not have an eigendecomposition (or, if it is diagonalizable,
it may have complex eigenvalues). In these case, we can alternatively use the singular value
decomposition (SVD).
Given an × matrix A, we say that the pair (u v) is a pair of left and right singular vectors
associated with the positive singular value if
Av = u
It can be seen that every matrix can be expressed as
min()
X
A = UΣV = uv
=1
where the matrices U and V are orthogonal and contain as columns the left and right singular vectors
u and v respectively.
This can also be more compactly expressed by defining the complex derivatives
µ ¶ µ ¶
1 1
= −j = +j
w 2 x y w∗ 2 x y
For the case where the constraint function is complex, we can separate between real and imaginary
constraints, namely (w) = (w) + j (w) and therefore (w) = 0 iff (w) = (w) = 0.
Defining two Lagrange multipliers, namely and we can construct the Lagrangian as
∗ ∗ (w) + ∗(w)
L(w ) = (w) − (w) − (w) = (w) − Re [ (w)] = (w) −
2
where = + j . The optimality conditions can be formulated as
L(w ) L(w ) L(w ) L(w )
= =0 = =0
x y
or, in terms of the complex derivative
L(w ) L(w )
= 0 = 0
w∗ ∗
Consider the cost function (w) = w Aw, where A is Hermitian positive definite, subject to the
complex constraint c w = 1. In this case, we can formulate the Lagrangian as
∗
¡ ¢ ¡ ¢
c w − 1 + w c − 1
L(w ) = w Aw −
2
Therefore, the optimality constraints can be expressed as
L(w ) c −1
= Aw − = 0 =⇒ w = A c
w∗ 2 2
L(w )
∗ = c w − 1 = 0
from where we obtain
1
c w = c A−1c = 1 =⇒ wopt = −1 A−1c
2 c A c
Summary
Bibliography
[1] R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1985.
[2] J.R. Magnus and H. Neudecker, Matrix Differential Calculus with Applications in Statistics and
Econometrics, Wiley and sons., 1999.