In signal processing, the output of the matched filter is given by correlating a known delayed signal, or template, with an unknown signal to detect the presence of the template in the unknown signal. This is equivalent to convolving the unknown signal with a conjugated time-reversed version of the template. The matched filter is the optimal linear filter for maximizing the signal-to-noise ratio (SNR) in the presence of additive stochastic noise.
Matched filters are commonly used in radar, in which a known signal is sent out, and the reflected signal is examined for common elements of the outgoing signal. Pulse compression is an example of matched filtering. The matched filter is so called because its impulse response is matched to the input pulse signals. Two-dimensional matched filters are commonly used in image processing, e.g., to improve the SNR of X-ray observations. Additional applications of note are in seismology and gravitational-wave astronomy.
Matched filtering is a demodulation technique that uses linear time-invariant (LTI) filters to maximize the SNR.
It was originally also known as a North filter.
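The equivalence stated in the opening paragraph, between correlating with the template and convolving with its conjugated time reversal, can be checked numerically. The following is a minimal sketch using NumPy; the chirp template, the noise level and the delay of 100 samples are illustrative assumptions, not values from any particular application.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical complex template (a short linear chirp) and its hidden position.
n = np.arange(32)
template = np.exp(1j * np.pi * 0.01 * n**2)
true_delay = 100

# Unknown signal: the template buried in weak complex white Gaussian noise.
x = 0.1 * (rng.standard_normal(256) + 1j * rng.standard_normal(256))
x[true_delay:true_delay + template.size] += template

# Route 1: correlate the unknown signal with the template.
# np.correlate conjugates its second argument.
corr = np.correlate(x, template, mode="valid")

# Route 2: convolve with the conjugated, time-reversed template.
h = np.conj(template[::-1])
conv = np.convolve(x, h, mode="valid")

# The magnitude of the output peaks where the template is located.
estimated_delay = int(np.argmax(np.abs(corr)))
print(estimated_delay, np.allclose(corr, conv))
```

Both routes produce the same output sequence, and the location of its largest magnitude serves as the estimate of the delay.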
Derivation
Derivation via matrix algebra
The following section derives the matched filter for a discrete-time system. The derivation for a continuous-time system is similar, with summations replaced with integrals.
The matched filter is the linear filter, <math>h</math>, that maximizes the output signal-to-noise ratio. The output of a linear filter is the convolution
:<math>\ y[n] = \sum_{k=-\infty}^{\infty} h[n-k] x[k],</math>
where <math>x[k]</math> is the input as a function of the independent variable <math>k</math>, and <math>y[n]</math> is the filtered output. Though we most often express filters as the impulse response of convolution systems, as above (see LTI system theory), it is easiest to think of the matched filter in the context of the inner product, which we will see shortly.
We can derive the linear filter that maximizes output signal-to-noise ratio by invoking a geometric argument. The intuition behind the matched filter relies on correlating the received signal (a vector) with a filter (another vector) that is parallel with the signal, maximizing the inner product. This enhances the signal. When we consider the additive stochastic noise, we have the additional challenge of minimizing the output due to noise by choosing a filter that is orthogonal to the noise.
Let us formally define the problem. We seek a filter, <math>h</math>, such that we maximize the output signal-to-noise ratio, where the output is the inner product of the filter and the observed signal <math>x</math>.
Our observed signal consists of the desirable signal <math>s</math> and additive noise <math>v</math>:
:<math>\ x=s+v.\,</math>
Let us define the auto-correlation matrix of the noise, reminding ourselves that this matrix has Hermitian symmetry, a property that will become useful in the derivation:
:<math>\ R_v=E\{ vv^\mathrm{H} \}\,</math>
where <math>v^\mathrm{H}</math> denotes the conjugate transpose of <math>v</math>, and <math>E</math> denotes expectation (note that in case the noise <math>v</math> has zero-mean, its auto-correlation matrix <math>R_v</math> is equal to its covariance matrix).
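In practice the expectation is rarely available in closed form, and the auto-correlation matrix is often approximated by averaging over noise realizations. The sketch below assumes zero-mean noise generated as <math>v = Aw</math> from unit-variance white noise <math>w</math>, so that the true matrix is <math>AA^\mathrm{H}</math>; the mixing matrix `A` and the number of trials are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
N, trials = 4, 200_000

# Assumed noise model: v = A w, with w unit-variance complex white noise,
# so that the true auto-correlation matrix is R_v = A A^H.
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R_true = A @ A.conj().T

# Each column of V is one noise realization v.
w = (rng.standard_normal((N, trials)) + 1j * rng.standard_normal((N, trials))) / np.sqrt(2)
V = A @ w

# Sample estimate of E{v v^H}: average the outer products over all realizations.
R_hat = V @ V.conj().T / trials

print(np.allclose(R_hat, R_hat.conj().T))                      # Hermitian symmetry
print(np.max(np.abs(R_hat - R_true)) / np.max(np.abs(R_true)))  # small relative error
```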
Let us call our output, <math>y</math>, the inner product of our filter and the observed signal such that
:<math>\ y = \sum_{k=-\infty}^{\infty} h^*[k] x[k] = h^\mathrm{H}x = h^\mathrm{H}s + h^\mathrm{H}v = y_s + y_v.</math>
We now define the signal-to-noise ratio, which is our objective function, to be the ratio of the power of the output due to the desired signal to the power of the output due to the noise:
:<math>\mathrm{SNR} = \frac{|y_s|^2}{E\{|y_v|^2\}}.</math>
We rewrite the above:
:<math>\mathrm{SNR} = \frac{|h^\mathrm{H}s|^2}{E\{|h^\mathrm{H}v|^2\}}.</math>
We wish to maximize this quantity by choosing <math>h</math>. Expanding the denominator of our objective function, we have
:<math>\ E\{ |h^\mathrm{H}v|^2 \} = E\{ (h^\mathrm{H}v){(h^\mathrm{H}v)}^\mathrm{H} \} = h^\mathrm{H} E\{vv^\mathrm{H}\} h = h^\mathrm{H}R_vh.\,</math>
Now, our <math>\mathrm{SNR}</math> becomes
:<math>\mathrm{SNR} = \frac{ |h^\mathrm{H}s|^2 }{ h^\mathrm{H}R_vh }.</math>
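The closed-form SNR above is straightforward to evaluate for any candidate filter. The sketch below is one possible NumPy transcription; the function name `output_snr`, the example signal and the white-noise variance of 0.5 are assumptions made for illustration.

```python
import numpy as np

def output_snr(h, s, R_v):
    """Return |h^H s|^2 / (h^H R_v h) for filter h, signal s, noise matrix R_v."""
    signal_power = np.abs(np.vdot(h, s)) ** 2    # np.vdot conjugates its first argument
    noise_power = np.real(np.vdot(h, R_v @ h))   # real-valued for Hermitian R_v
    return signal_power / noise_power

# Illustrative case: white noise with variance 0.5, so R_v = 0.5 * I.
s = np.array([1.0, 2.0, 3.0, 4.0])
R_v = 0.5 * np.eye(4)

snr_matched = output_snr(s, s, R_v)         # filter parallel to the signal
snr_flat = output_snr(np.ones(4), s, R_v)   # plain averaging filter

print(snr_matched)  # 60.0
print(snr_flat)     # 50.0
```

In this white-noise case the filter parallel to the signal gives the larger SNR, in line with the geometric intuition described earlier.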
We will rewrite this expression with some matrix manipulation. The reason for this seemingly counterproductive measure will become evident shortly. Exploiting the Hermitian symmetry of the auto-correlation matrix <math>R_v</math>, we can write
:<math>\mathrm{SNR} = \frac{ | {(R_v^{1/2}h)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
{ {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) }.</math>
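As a check that this rewriting leaves the SNR unchanged, the numerator and denominator can be reduced to their earlier forms. This step assumes, beyond Hermitian symmetry, that <math>R_v</math> is positive definite, so that the Hermitian square root <math>R_v^{1/2}</math> and its inverse <math>R_v^{-1/2}</math> exist:
:<math>{(R_v^{1/2}h)}^\mathrm{H} (R_v^{-1/2}s) = h^\mathrm{H} R_v^{1/2} R_v^{-1/2} s = h^\mathrm{H} s, \qquad {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) = h^\mathrm{H} R_v^{1/2} R_v^{1/2} h = h^\mathrm{H} R_v h.</math>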
We would like to find an upper bound on this expression. To do so, we first recognize a form of the Cauchy–Schwarz inequality:
:<math>\ |a^\mathrm{H}b|^2 \leq (a^\mathrm{H}a)(b^\mathrm{H}b),\, </math>
which is to say that the square of the inner product of two vectors can only be as large as the product of the individual inner products of the vectors. This concept returns to the intuition behind the matched filter: this upper bound is achieved when the two vectors <math>a</math> and <math>b</math> are parallel. We resume our derivation by expressing the upper bound on our <math>\mathrm{SNR}</math> in light of the geometric inequality above:
:<math>\mathrm{SNR} = \frac{ | {(R_v^{1/2}h)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
{ {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) }
\leq
\frac{ \left[
{(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h)
\right]
\left[
{(R_v^{-1/2}s)}^\mathrm{H} (R_v^{-1/2}s)
\right] }
{ {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) }.
</math>
Our valiant matrix manipulation has now paid off. We see that the expression for our upper bound can be greatly simplified:
:<math>\mathrm{SNR} = \frac{ | {(R_v^{1/2}h)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
{ {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) }
\leq s^\mathrm{H} R_v^{-1} s.
</math>
We can achieve this upper bound if we choose
:<math>\ R_v^{1/2}h = \alpha R_v^{-1/2}s</math>
where <math>\alpha</math> is an arbitrary nonzero real number. To verify this, we plug into our expression for the output <math>\mathrm{SNR}</math>:
:<math>\mathrm{SNR} = \frac{ | {(R_v^{1/2}h)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
{ {(R_v^{1/2}h)}^\mathrm{H} (R_v^{1/2}h) }
= \frac{ \alpha^2 | {(R_v^{-1/2}s)}^\mathrm{H} (R_v^{-1/2}s) |^2 }
{ \alpha^2 {(R_v^{-1/2}s)}^\mathrm{H} (R_v^{-1/2}s) }
= \frac{ | s^\mathrm{H} R_v^{-1} s |^2 }
{ s^\mathrm{H} R_v^{-1} s }
= s^\mathrm{H} R_v^{-1} s.
</math>
Thus, our optimal matched filter is
:<math>\ h = \alpha R_v^{-1}s.</math>
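The optimality of this filter can be illustrated numerically. The sketch below builds an arbitrary Hermitian positive-definite <math>R_v</math> and a random signal (both purely illustrative assumptions), then compares the SNR of <math>h = R_v^{-1}s</math> with the bound <math>s^\mathrm{H} R_v^{-1} s</math> and with the SNR of randomly drawn filters.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 8

# Illustrative signal and coloured-noise matrix (Hermitian positive definite).
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R_v = A @ A.conj().T + N * np.eye(N)

def snr(h):
    """Output SNR |h^H s|^2 / (h^H R_v h) for the signal and noise defined above."""
    return np.abs(np.vdot(h, s)) ** 2 / np.real(np.vdot(h, R_v @ h))

# Matched filter with alpha = 1; solving avoids forming the inverse explicitly.
h_opt = np.linalg.solve(R_v, s)
bound = np.real(np.vdot(s, h_opt))   # s^H R_v^{-1} s

# Randomly drawn filters should never exceed the bound.
random_snrs = [snr(rng.standard_normal(N) + 1j * rng.standard_normal(N))
               for _ in range(1000)]

print(np.isclose(snr(h_opt), bound))         # the bound is attained
print(np.isclose(snr(-3.7 * h_opt), bound))  # the scale alpha does not matter
print(max(random_snrs) <= bound)
```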
We often choose to normalize the expected value of the power of the filter output due to the noise to unity. That is, we constrain
:<math>\ E\{ |y_v|^2 \} = 1.\,</math>
This constraint implies a value of <math>\alpha</math>, for which we can solve:
:<math>\ E\{ |y_v|^2 \} = \alpha^2 s^\mathrm{H} R_v^{-1} s = 1,</math>
yielding
:<math>\ \alpha = \frac{1}{\sqrt{s^\mathrm{H} R_v^{-1} s}},</math>
giving us our normalized filter,
:<math>\ h = \frac{1}{\sqrt{s^\mathrm{H} R_v^{-1} s}} R_v^{-1}s.</math>
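A possible implementation of the normalized filter is sketched below. The function name `normalized_matched_filter`, the example signal and the exponentially decaying noise correlation <math>0.5^{|i-j|}</math> are assumptions chosen for illustration.

```python
import numpy as np

def normalized_matched_filter(s, R_v):
    """Return h = R_v^{-1} s / sqrt(s^H R_v^{-1} s), giving unit output noise power."""
    w = np.linalg.solve(R_v, s)                    # R_v^{-1} s without an explicit inverse
    alpha = 1.0 / np.sqrt(np.real(np.vdot(s, w)))  # 1 / sqrt(s^H R_v^{-1} s)
    return alpha * w

# Illustrative real-valued example with exponentially correlated noise.
s = np.array([1.0, 2.0, 0.0, -1.0])
idx = np.arange(s.size)
R_v = 0.5 ** np.abs(idx[:, None] - idx[None, :])

h = normalized_matched_filter(s, R_v)

noise_power = h @ R_v @ h      # E{|y_v|^2}, expected to equal 1
signal_power = (h @ s) ** 2    # |y_s|^2, expected to equal s^H R_v^{-1} s

print(np.isclose(noise_power, 1.0))
print(np.isclose(signal_power, s @ np.linalg.solve(R_v, s)))
```

With the noise power fixed at one, the squared signal output is itself the SNR.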
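If we care to write the impulse response of the filter for the convolution system, it is the complex-conjugate time reversal of <math>h</math>; for white noise, where <math>R_v</math> is proportional to the identity matrix, this is simply the complex-conjugate time reversal of the signal <math>s</math>, up to scaling.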
Though we have derived the matched filter in discrete time, we can extend the concept to continuous-time systems if we replace <math>R_v</math> with the continuous-time autocorrelation function of the noise, assuming a continuous signal <math>s(t)</math>, continuous noise <math>v(t)</math>, and a continuous filter <math>h(t)</math>.
Derivation via Lagrangian
Alternatively, we may solve for the matched filter by solving our maximization problem with a Lagrangian. Again, the matched filter endeavors to maximize the output signal-to-noise ratio (<math>\mathrm{SNR}</math>) of a filtered deterministic signal in stochastic additive noise. The observed sequence, again, is
:<math>\ x = s + v,\,</math>
with the noise auto-correlation matrix,
:<math>\ R_v = E\{vv^\mathrm{H}\}.\, </math>
The signal-to-noise ratio is
:<math>\mathrm{SNR} = \frac{|y_s|^2}{ E\{|y_v|^2\} },</math>
where <math>y_s = h^\mathrm{H} s</math> and <math>y_v = h^\mathrm{H} v</math>.
Evaluating the expression in the numerator, we have
:<math>\ |y_s|^2 = {y_s}^\mathrm{H} y_s = h^\mathrm{H} s s^\mathrm{H} h,\, </math>
and in the denominator,
:<math>\ E\{|y_v|^2\} = E\{ {y_v}^\mathrm{H} y_v \} = E\{ h^\mathrm{H} v v^\mathrm{H} h \} = h^\mathrm{H} R_v h.\,</math>
The signal-to-noise ratio becomes
:<math>\mathrm{SNR} = \frac{h^\mathrm{H} s s^\mathrm{H} h}{ h^\mathrm{H} R_v h }.</math>
If we now constrain the denominator to be 1, the problem of maximizing <math>\mathrm{SNR}</math> is reduced to maximizing the numerator. We can then formulate the problem using a Lagrange multiplier:
:<math>\ h^\mathrm{H} R_v h = 1 </math>
:<math>\ \mathcal{L} = h^\mathrm{H} s s^\mathrm{H} h + \lambda (1 - h^\mathrm{H} R_v h ) </math>
:<math>\ \nabla_{h^*} \mathcal{L} = s s^\mathrm{H} h - \lambda R_v h = 0 </math>
:<math>\ (s s^\mathrm{H}) h = \lambda R_v h </math>
which we recognize as a generalized eigenvalue problem. Left-multiplying by <math>h^\mathrm{H}</math> gives
:<math>\ h^\mathrm{H} (s s^\mathrm{H}) h = \lambda h^\mathrm{H} R_v h. </math>
Since <math>s s^\mathrm{H}</math> is of unit rank, it has only one nonzero eigenvalue. It can be shown that this eigenvalue equals
:<math>\ \lambda_{\max} = s^\mathrm{H} R_v^{-1} s, </math>
yielding the following optimal matched filter
:<math>\ h = \frac{1}{\sqrt{s^\mathrm{H} R_v^{-1} s}} R_v^{-1} s. </math>
This is the same result found in the previous subsection.
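The agreement between the two derivations can also be checked numerically. The sketch below is one way to do so with NumPy alone: it whitens the generalized eigenvalue problem with a Cholesky factor <math>R_v = LL^\mathrm{H}</math> (substituting <math>h = L^{-\mathrm{H}}u</math>), solves the resulting ordinary Hermitian eigenproblem, and compares the result with the closed form. The test signal and noise matrix are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 6

# Illustrative signal and Hermitian positive-definite noise matrix.
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R_v = A @ A.conj().T + N * np.eye(N)

# Whitening: with R_v = L L^H and h = L^{-H} u, the problem (s s^H) h = lambda R_v h
# becomes the ordinary Hermitian eigenproblem (L^{-1} s)(L^{-1} s)^H u = lambda u.
L = np.linalg.cholesky(R_v)
s_white = np.linalg.solve(L, s)
M = np.outer(s_white, s_white.conj())

eigvals, eigvecs = np.linalg.eigh(M)     # eigenvalues in ascending order
lam_max = eigvals[-1]
h_eig = np.linalg.solve(L.conj().T, eigvecs[:, -1])   # satisfies h^H R_v h = 1

# Closed-form result from the text.
w = np.linalg.solve(R_v, s)
lam_closed = np.real(np.vdot(s, w))      # s^H R_v^{-1} s
h_closed = w / np.sqrt(lam_closed)

# The two filters agree up to a unit-modulus phase factor.
overlap = np.abs(np.vdot(h_closed, R_v @ h_eig))
print(np.isclose(lam_max, lam_closed), np.isclose(overlap, 1.0))
```

The eigenvector is only determined up to a phase, which is why the comparison uses the magnitude of the <math>R_v</math>-weighted inner product rather than element-wise equality.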
Interpretation as a least-squares estimator
Derivation
Matched filtering can also be interpreted as a least-squares estimator for the optimal location and scaling of a given model or template. Once again, let the observed sequence be defined as
:<math>\ x_k = s_k + v_k,\,</math>
where <math>v_k</math> is uncorrelated zero mean noise. The signal <math>s_k</math> is assumed to be a scaled and shifted version of a known model sequence <math>f_k</math>:
:<math>\ s_k = \mu_0\cdot f_{k-j_0}</math>
We want to find optimal estimates <math>j^*</math> and <math>\mu^*</math> for the unknown shift <math>j_0</math> and scaling <math>\mu_0</math> by minimizing the least-squares residual between the observed sequence <math>x_k</math> and a "probing sequence" <math>h_{j-k}</math>:
:<math>\ j^*,\mu^* = \arg\min_{j,\mu} \sum_k \left(x_k - \mu\cdot h_{j-k}\right)^2</math>
The appropriate <math>h_{j-k}</math> will later turn out to be the matched filter, but is as yet unspecified. Expanding <math>x_k</math> and the square within the sum yields
:<math>\ j^*,\mu^* = \arg\min_{j,\mu}\left[ \sum_k (s_k+v_k)^2 + \mu^2\sum_k h_{j-k}^2 - 2\mu\sum_k s_k h_{j-k} - 2\mu\sum_k v_k h_{j-k}\right]. </math>
The first term in brackets is a constant (since the observed signal is given) and has no influence on the optimal solution. The last term has constant expected value because the noise is uncorrelated and has zero mean. We can therefore drop both terms from the optimization. After reversing the sign, we obtain the equivalent optimization problem
:<math>\ j^*,\mu^* = \arg\max_{j,\mu}\left[ 2\mu\sum_k s_k h_{j-k} - \mu^2\sum_k h_{j-k}^2\right]. </math>
Setting the derivative w.r.t. <math>\mu</math> to zero gives an analytic solution for <math>\mu^*</math>:
:<math>\ \mu^* = \frac{\sum_k s_k h_{j-k}}{\sum_k h_{j-k}^2}.</math>
Inserting this into our objective function yields a reduced maximization problem for just <math>j^*</math>:
:<math>\ j^* = \arg\max_j\frac{\left(\sum_k s_k h_{j-k}\right)^2}{\sum_k h_{j-k}^2}. </math>
The numerator can be upper-bounded by means of the Cauchy–Schwarz inequality:
:<math>\ \frac{\left(\sum_k s_k h_{j-k}\right)^2}{\sum_k h_{j-k}^2} \le \frac{\sum_k s_k^2 \cdot \sum_k h_{j-k}^2}{\sum_k h_{j-k}^2} = \sum_k s_k^2 = \text{constant}. </math>
The optimization problem assumes its maximum when equality holds in this expression. According to the properties of the Cauchy–Schwarz inequality, this is only possible when
:<math>\ h_{j-k}=\nu \cdot s_k = \kappa\cdot f_{k-j_0}</math>
for arbitrary non-zero constants <math>\nu</math> or <math>\kappa</math>, and the optimal solution is obtained at <math>j^*=j_0</math> as desired. Thus, our "probing sequence" <math>h_{j-k}</math> must be proportional to the signal model <math>f_{k-j_0}</math>, and the convenient choice <math>\kappa=1</math> yields the matched filter
:<math>\ h_{k}=f_{-k}.</math>
Note that the filter is the mirrored signal model. This ensures that the operation <math>\sum_k x_k h_{j-k}</math> to be applied in order to find the optimum is indeed the convolution between the observed sequence <math>x_k</math> and the matched filter <math>h_k</math>. The filtered sequence assumes its maximum at the position where the observed sequence <math>x_k</math> best matches (in a least-squares sense) the signal model <math>f_k</math>.
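The least-squares procedure can be sketched in a few lines. In the example below, the model sequence `f` (a windowed cosine), the shift of 57 samples, the scaling of 2 and the noise level are all illustrative assumptions; the model is stored on indices 0 to 20, so the mirrored filter is simply the reversed array, and a positive scaling is assumed when locating the peak.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical model sequence f_k: a cosine under a Gaussian window.
k = np.arange(21)
f = np.exp(-0.5 * ((k - 10) / 3.0) ** 2) * np.cos(2 * np.pi * 0.2 * (k - 10))

# Observed sequence x_k = mu0 * f_{k - j0} + v_k with uncorrelated zero-mean noise.
j0, mu0 = 57, 2.0
x = 0.1 * rng.standard_normal(200)
x[j0:j0 + f.size] += mu0 * f

# Matched filter: the mirrored model, h_k = f_{-k}.
h = f[::-1]

# Convolving x with h gives y[j] = sum_k x_{j+k} f_k for every candidate shift j.
y = np.convolve(x, h, mode="valid")

# For a positive scaling the best shift maximizes y (use np.abs(y) if the sign
# is unknown); the scaling then follows from the analytic solution for mu.
j_hat = int(np.argmax(y))
mu_hat = y[j_hat] / np.sum(f ** 2)

print(j_hat, mu_hat)
```

The estimated shift is read off from the peak of the filtered sequence, and the estimated scaling is the peak value divided by the energy of the model sequence.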
Implications
The matched filter may be derived in a variety of ways.
If the transmitted signal possessed no unknown parameters (like time-of-arrival, amplitude,...), then the matched filter would, according to the Neyman–Pearson lemma, minimize the error probability. However, since the exact signal generally is determined by unknown parameters that effectively are estimated (or fitted) in the filtering process, the matched filter constitutes a generalized maximum likelihood (test-) statistic. The filtered time series may then be interpreted as (proportional to) the profile likelihood, the maximized conditional likelihood as a function of the ("arrival") time parameter.
This implies in particular that the error probability (in the sense of Neyman and Pearson, i.e., concerning maximization of the detection probability for a given false-alarm probability) is not necessarily optimal.
What is commonly referred to as the signal-to-noise ratio (SNR), which is supposed to be maximized by a matched filter, in this context corresponds to <math>\sqrt{2\log(\mathcal{L})}</math>, where <math>\mathcal{L}</math> is the (conditionally) maximized likelihood ratio.
The first observation of gravitational waves was based on large-scale filtering of each detector's output for signals resembling the expected shape, followed by screening for coincident and coherent triggers between both instruments. False-alarm rates, and with them the statistical significance of the detection, were then assessed using resampling methods. Inference on the astrophysical source parameters was completed using Bayesian methods based on parameterized theoretical models for the signal waveform and (again) on the Whittle likelihood.
Seismology
Matched filters find use in seismology to detect similar earthquake or other seismic signals, often using multicomponent and/or multichannel empirically determined templates. Matched filtering applications in seismology include the generation of large event catalogues to study earthquake seismicity and volcanic activity, and in the global detection of nuclear explosions.
Biology
Animals living in relatively static environments have relatively fixed features of the environment to perceive. This allows the evolution of filters that match the expected signal with the highest signal-to-noise ratio, i.e., matched filters. Perceiving the world "through such a 'matched filter' severely limits the amount of information the brain can pick up from the outside world, but it frees the brain from the need to perform more intricate computations to extract the information finally needed for fulfilling a particular task."
See also
- Periodogram
- Filtered backprojection (Radon transform)
- Digital filter
- Statistical signal processing
- Whittle likelihood
- Profile likelihood
- Detection theory
- Multiple comparisons problem
- Channel capacity
- Noisy-channel coding theorem
- Spectral density estimation
- Least mean squares (LMS) filter
- Wiener filter
- MUltiple SIgnal Classification (MUSIC), a popular parametric superresolution method
- SAMV
