
Advanced Non-Linear Image Filtering Techniques

The document discusses various image processing techniques, focusing on neighborhood operators, non-linear filtering methods like median and bilateral filtering, and their applications in enhancing image quality. It also covers binary image processing, morphological operations, distance transforms, and connected components analysis. Additionally, it explains Fourier transforms and their significance in analyzing frequency characteristics, image filtering, compression, and feature extraction, including the use of the Discrete Cosine Transform in JPEG compression.

Uploaded by

Suhas P R
Copyright
© All Rights Reserved

Module 2

1. More neighborhood operators


Linear filters can perform a wide variety of image transformations. However, non-linear filters,
such as edge-preserving median or bilateral filters, can sometimes perform even better. Other
examples of neighborhood operators include morphological operators that operate on binary
images, as well as semi-global operators that compute distance transforms and find connected
components in binary images.

1.1. Non-linear filtering
The filters we have looked at so far have all been linear, i.e., their response to a sum of two
signals is the same as the sum of the individual responses. This is equivalent to saying that each
output pixel is a weighted summation of some number of input pixels. Linear filters are easier
to compose and are amenable to frequency response analysis. In many cases, however, better
performance can be obtained by using a non-linear combination of neighboring pixels.
1.1.1. Median filtering
When an image is corrupted by shot noise rather than Gaussian noise, i.e., when it occasionally
contains very large outlier values, a better filter to use is the median filter, which selects the
median value from each pixel’s neighborhood. Median values can be computed in expected linear
time using a randomized select algorithm (Cormen 2001). Incremental variants have also been
developed, as well as a constant time algorithm that is independent of window size. Since the
shot noise value usually lies well outside the true values in the neighborhood, the median filter
is able to filter away such bad pixels. One downside of the median filter, in addition to its
moderate computational cost, is that because it selects only one input pixel value to replace
each output pixel, it is not as efficient at averaging away regular Gaussian noise. A better choice
may be the α-trimmed mean, which averages together all of the pixels except for the α
fraction that are the smallest and the largest.
Another possibility is to compute a weighted median, in which each pixel is used a number of
times depending on its distance from the center. This turns out to be equivalent to minimizing
the weighted objective function

$$\sum_{k,l} w(k, l)\, |f(i + k, j + l) - g(i, j)|^p,$$

where g(i, j) is the desired output value and p = 1 for the weighted median. The value p = 2
is the usual weighted mean, which is equivalent to correlation (3.12) after normalizing by the
sum of the weights. The weighted mean also has deep connections to other methods in robust
statistics such as influence functions. Non-linear smoothing has another, perhaps even more
important property, especially as shot noise is rare in today’s cameras. Such filtering is more
edge preserving, i.e., it has less tendency to soften edges while filtering away high-frequency
noise.
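
The difference between averaging and median selection on shot noise can be illustrated with a minimal NumPy sketch. The function names, the edge-replicating border handling, and the choice to trim α/2 of the samples from each end of the sorted neighborhood are assumptions made for illustration; they are not taken from the notes.

```python
import numpy as np

def neighborhoods(image, radius):
    """Stack each pixel's (2r+1) x (2r+1) neighborhood along a new last axis."""
    padded = np.pad(image, radius, mode="edge")  # assumption: replicate borders
    h, w = image.shape
    size = 2 * radius + 1
    shifted = [padded[dy:dy + h, dx:dx + w]
               for dy in range(size) for dx in range(size)]
    return np.stack(shifted, axis=-1)

def mean_filter(image, radius=1):
    return neighborhoods(image, radius).mean(axis=-1)

def median_filter(image, radius=1):
    return np.median(neighborhoods(image, radius), axis=-1)

def alpha_trimmed_mean(image, radius=1, alpha=0.25):
    values = np.sort(neighborhoods(image, radius), axis=-1)
    n = values.shape[-1]
    trim = int(alpha * n / 2)  # assumption: drop alpha/2 of the samples per end
    return values[..., trim:n - trim].mean(axis=-1)

# A flat gray image corrupted by a single shot-noise pixel.
noisy = np.full((8, 8), 100.0)
noisy[4, 4] = 255.0

print(mean_filter(noisy)[4, 4])         # the outlier is smeared into the average
print(median_filter(noisy)[4, 4])       # the outlier is rejected
print(alpha_trimmed_mean(noisy)[4, 4])  # the outlier is trimmed before averaging
```

This brute-force version sorts every neighborhood, so it does not reflect the faster selection algorithms mentioned above.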

Yashasvi B N, Dept. of CSE Computer Vision


Consider a noisy image. In order to remove most of the noise, the Gaussian filter is forced to
smooth away high-frequency detail, which is most noticeable near strong edges. Median
filtering does better but, as mentioned before, does not do as well at smoothing away from
discontinuities. While we could try to use the α-trimmed mean or weighted median, these
techniques still have a tendency to round sharp corners, since the majority of pixels in the
smoothing area come from the background distribution.

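1.2. Bilateral filtering

In the bilateral filter, the output pixel value depends on a weighted combination of
neighboring pixel values:

$$g(i, j) = \frac{\sum_{k,l} f(k, l)\, w(i, j, k, l)}{\sum_{k,l} w(i, j, k, l)}.$$

The weighting coefficient w(i, j, k, l) is the product of a domain kernel, which depends on
the spatial distance between pixels (i, j) and (k, l), and a data-dependent range kernel,
which depends on the difference between the values f(i, j) and f(k, l).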
Since bilateral filtering is quite slow compared to regular separable filtering, a number
of acceleration techniques have been developed. In particular, the bilateral grid, which
subsamples the higher dimensional color/position space on a uniform grid, continues to
be widely used, including in the application of the bilateral solver. An even faster
implementation of bilateral filtering can be obtained using the permutohedral lattice
approach.
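
As a rough illustration of the weighted combination used by the bilateral filter, the following brute-force NumPy sketch multiplies a spatial (domain) weight by an intensity (range) weight for every neighbor. Gaussian kernels for both weights, the parameter names `sigma_d` and `sigma_r`, and the replicated borders are assumptions; none of the acceleration techniques mentioned above are used.

```python
import numpy as np

def bilateral_filter(f, radius=2, sigma_d=2.0, sigma_r=0.1):
    """Brute-force bilateral filter for a 2D float image (no acceleration)."""
    h, w = f.shape
    padded = np.pad(f, radius, mode="edge")  # assumption: replicate borders
    numerator = np.zeros((h, w))
    denominator = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            neighbor = padded[radius + dy:radius + dy + h,
                              radius + dx:radius + dx + w]
            # Domain kernel: depends only on the spatial offset.
            domain = np.exp(-(dx * dx + dy * dy) / (2 * sigma_d ** 2))
            # Range kernel: depends on the difference in pixel values.
            range_ = np.exp(-(neighbor - f) ** 2 / (2 * sigma_r ** 2))
            weight = domain * range_
            numerator += weight * neighbor
            denominator += weight
    return numerator / denominator

# A noisy vertical step edge: dark left half, bright right half.
rng = np.random.default_rng(0)
clean = np.zeros((16, 16))
clean[:, 8:] = 1.0
noisy = clean + rng.normal(0.0, 0.02, clean.shape)
smoothed = bilateral_filter(noisy)

print(noisy[:, :8].std(), smoothed[:, :8].std())  # noise level before and after
```

Because the range weight is close to zero for neighbors on the other side of the step, each half is smoothed almost independently and the edge stays sharp.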

1.2.1. Iterated adaptive smoothing and anisotropic diffusion


Bilateral (and other) filters can also be applied in an iterative fashion, especially if an
appearance more like a “cartoon” is desired. When iterated filtering is applied, a much
smaller neighborhood can often be used.
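
A minimal sketch of iterated filtering with a small neighborhood is given below. It repeatedly averages each pixel with its four immediate (N4) neighbors, weighting each neighbor by how similar its value is. The Gaussian weight, the parameter `sigma_r`, and the iteration count are assumptions chosen for illustration.

```python
import numpy as np

def adaptive_smoothing_step(f, sigma_r=0.1):
    """One pass of range-weighted averaging over the N4 neighborhood."""
    h, w = f.shape
    padded = np.pad(f, 1, mode="edge")
    numerator = f.copy()             # the center pixel gets weight 1
    denominator = np.ones((h, w))
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        neighbor = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        weight = np.exp(-(neighbor - f) ** 2 / (2 * sigma_r ** 2))
        numerator += weight * neighbor
        denominator += weight
    return numerator / denominator

def iterated_smoothing(f, iterations=20, sigma_r=0.1):
    g = np.asarray(f, dtype=float)
    for _ in range(iterations):
        g = adaptive_smoothing_step(g, sigma_r)
    return g

# A noisy vertical step edge.
rng = np.random.default_rng(0)
noisy = np.zeros((16, 16))
noisy[:, 8:] = 1.0
noisy += rng.normal(0.0, 0.02, noisy.shape)
result = iterated_smoothing(noisy)

print(noisy[:, :8].std(), result[:, :8].std())
```

With enough iterations each side of the edge becomes nearly flat, which is the "cartoon" look described above.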

Since its original introduction, anisotropic diffusion has been extended and applied to
a wide range of problems. It has also been shown to be closely related to other adaptive
smoothing techniques as well as Bayesian regularization with a non-linear smoothness
term that can be derived from image statistics.
In its general form, the range kernel r(i, j, k, l) = r(‖f(i, j) − f(k, l)‖), which is usually
called the gain or edge-stopping function, or diffusion coefficient, can be any
monotonically increasing function with r′(x) → 0 as x → ∞. Black, Sapiro et al. (1998)
show how anisotropic diffusion is equivalent to minimizing a robust penalty function
on the image gradients.
They also extend the diffusion neighborhood from N4 to N8, which allows them to
create a diffusion operator that is both rotationally invariant and incorporates
information about the eigenvalues of the local structure tensor.
Note that, without a bias term towards the original image, anisotropic diffusion and
iterative adaptive smoothing converge to a constant image. Unless a small number of
iterations is used (e.g., for speed), it is usually preferable to formulate the smoothing
problem as a joint minimization of a smoothness term and a data fidelity term, which
introduces such a bias in a principled manner.

1.2.2. Guided image filtering


It is also possible to use a different guide image to adaptively filter a noisy input. An
example of this is using a flash image, which has strong edges but poor color, to
adaptively filter a low-light non-flash color image, which has large amounts of noise.
Guided Image Filtering is a technique used to enhance an image adaptively based on
a different image.
This is useful when the input image is noisy, blurry, or lacks details, and we want to
improve it by referencing a clearer image.



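Within each local window, the filtered output is modeled as an affine transformation of
the guide image:

$$g(i, j) = a_k\, I(i, j) + b_k \quad \text{for all pixels } (i, j) \text{ in window } \omega_k,$$

where I(i, j) denotes the guide image and the coefficients a_k and b_k are constant within
the window ω_k and are fit to the noisy input by regularized least squares.

Instead of just taking the predicted value of the filtered pixel g(i, j) from the window
centered on that pixel, an average across all windows that cover the pixel is used. The
resulting algorithm consists of a series of local mean image and image moment filters,
a per-pixel linear system solve (which reduces to a division if the guide image is scalar),
and another set of filtering steps.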
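
The sequence of steps just described (local means and moments, a per-pixel division, and a final averaging of the coefficients) can be sketched in NumPy for a scalar guide image. The helper `box_mean`, the regularization parameter `eps`, and the replicated borders are assumptions made for this sketch.

```python
import numpy as np

def box_mean(image, radius):
    """Local mean over a (2r+1) x (2r+1) window with replicated borders."""
    h, w = image.shape
    size = 2 * radius + 1
    padded = np.pad(image, radius, mode="edge")
    total = np.zeros((h, w))
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w]
    return total / size ** 2

def guided_filter(guide, f, radius=2, eps=1e-2):
    """Filter the input f using a scalar guide image; eps regularizes the fit."""
    mean_guide = box_mean(guide, radius)
    mean_f = box_mean(f, radius)
    var_guide = box_mean(guide * guide, radius) - mean_guide ** 2
    cov = box_mean(guide * f, radius) - mean_guide * mean_f
    a = cov / (var_guide + eps)      # per-window linear solve (a division)
    b = mean_f - a * mean_guide
    # Average the affine coefficients over all windows covering each pixel.
    return box_mean(a, radius) * guide + box_mean(b, radius)

# A clean guide with a strong edge and a noisy input of the same scene.
rng = np.random.default_rng(0)
guide = np.zeros((16, 16))
guide[:, 8:] = 1.0
noisy = guide + rng.normal(0.0, 0.1, guide.shape)
output = guided_filter(guide, noisy)

print(np.abs(noisy - guide).mean(), np.abs(output - guide).mean())
```

In flat regions of the guide the slope `a` is close to zero and the output is a local mean of the input, while near guide edges the slope follows the edge, so the edge is not blurred.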

1.3. Binary image processing


While non-linear filters are often used to enhance grayscale and color images, they are
also used extensively to process binary images. Such images often occur after a
thresholding operation, e.g., converting a scanned grayscale document into a binary
image for further processing, such as optical character recognition.

1.3.1. Morphology
The most common binary image operations are called morphological operations,
because they change the shape of the underlying binary objects.
To perform such an operation, we first convolve the binary image with a binary
structuring element and then select a binary output value depending on the thresholded
result of the convolution.
The structuring element can be any shape, from a simple 3 × 3 box filter, to more
complicated disc structures. It can even correspond to a particular shape that is being
sought for in the image.
Let c be the number of 1s inside the structuring element at each pixel (the result of the
convolution) and S be the size of the structuring element (its number of pixels).
The standard operations used in binary morphology include:
• dilation: the output is 1 if c ≥ 1;
• erosion: the output is 1 if c = S;
• opening: erosion followed by dilation;
• closing: dilation followed by erosion.
Dilation grows (thickens) objects consisting of 1s, while erosion shrinks (thins) them.
The opening and closing operations tend to leave large regions and smooth boundaries
unaffected, while removing small objects or holes and smoothing boundaries.
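
The "convolve, then threshold" view of morphology can be sketched in a few lines of NumPy. The function names and the zero padding outside the image are assumptions; with zero padding, erosion also removes object pixels that touch the image border.

```python
import numpy as np

def count_ones(f, s):
    """Number of 1s of f under the structuring element s centered at each pixel."""
    ry, rx = s.shape[0] // 2, s.shape[1] // 2
    padded = np.pad(f, ((ry, ry), (rx, rx)), mode="constant")  # zeros outside
    h, w = f.shape
    c = np.zeros((h, w), dtype=int)
    for dy in range(s.shape[0]):
        for dx in range(s.shape[1]):
            if s[dy, dx]:
                c += padded[dy:dy + h, dx:dx + w]
    return c

def dilate(f, s):
    return (count_ones(f, s) >= 1).astype(int)

def erode(f, s):
    return (count_ones(f, s) >= s.sum()).astype(int)

def opening(f, s):
    return dilate(erode(f, s), s)

def closing(f, s):
    return erode(dilate(f, s), s)

box = np.ones((3, 3), dtype=int)           # 3 x 3 box structuring element

square = np.zeros((9, 9), dtype=int)
square[2:7, 2:7] = 1                       # a 5 x 5 object

speck = square.copy()
speck[0, 8] = 1                            # an isolated noise pixel
holed = square.copy()
holed[4, 4] = 0                            # a one-pixel hole

print(opening(speck, box))                 # the speck is removed
print(closing(holed, box))                 # the hole is filled
```

For a symmetric structuring element such as the box used here, counting by correlation and by convolution give the same result.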

1.3.2. Distance transforms

The distance transform is useful in quickly precomputing the distance to a curve or set
of points using a two-pass raster algorithm. It has many applications, including level
sets, fast chamfer matching (binary image alignment), feathering in image stitching
and blending, and nearest point alignment.
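
A two-pass raster algorithm of this kind can be sketched as follows for the city block (L1) distance, where each non-zero pixel receives its distance to the nearest zero pixel. The function name and the use of `h + w` as a stand-in for infinity are assumptions made for this sketch.

```python
import numpy as np

def city_block_distance(b):
    """L1 distance from every pixel to the nearest 0 pixel of the binary image b."""
    h, w = b.shape
    far = h + w                      # larger than any L1 distance inside the image
    d = np.where(b != 0, far, 0).astype(int)
    # Forward pass: propagate distances from the top and left neighbors.
    for i in range(h):
        for j in range(w):
            if i > 0:
                d[i, j] = min(d[i, j], d[i - 1, j] + 1)
            if j > 0:
                d[i, j] = min(d[i, j], d[i, j - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbors.
    for i in range(h - 1, -1, -1):
        for j in range(w - 1, -1, -1):
            if i < h - 1:
                d[i, j] = min(d[i, j], d[i + 1, j] + 1)
            if j < w - 1:
                d[i, j] = min(d[i, j], d[i, j + 1] + 1)
    return d

b = np.zeros((7, 7), dtype=int)
b[1:6, 1:6] = 1                      # a 5 x 5 object surrounded by background
d = city_block_distance(b)
print(d)
```

Each pixel is visited only twice, regardless of how far it is from the background. If the image contains no zero pixel at all, every entry simply stays at the placeholder value `h + w`.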



A useful extension of the basic distance transform is the signed distance transform,
which computes distances to boundary pixels for all the pixels. The simplest way to
create this is to compute the distance transforms for both the original binary image and
its complement and to negate one of them before combining. Because such distance
fields tend to be smooth, it is possible to store them more compactly using a spline
defined over a quadtree or octree data structure. Such precomputed signed distance
transforms can be extremely useful in efficiently aligning and merging 2D curves and
3D surfaces, especially if the vectorial version of the distance transform, i.e., a pointer
from each pixel or voxel to the nearest boundary or surface element, is stored and
interpolated. Signed distance fields are also an essential component of level set
evolution, where they are called characteristic functions.
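
The "two distance transforms, one negated" construction can be sketched as below. For brevity the sketch uses a brute-force Euclidean distance that is only practical for small images; the sign convention (negative inside the object) and the function names are assumptions.

```python
import numpy as np

def distance_to_zero(b):
    """Brute-force Euclidean distance from every pixel to the nearest 0 pixel."""
    zeros = np.argwhere(b == 0)           # assumes b contains at least one 0
    h, w = b.shape
    d = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            d[i, j] = np.sqrt(((zeros - (i, j)) ** 2).sum(axis=1)).min()
    return d

def signed_distance(b):
    inside = distance_to_zero(b)          # positive inside the object
    outside = distance_to_zero(1 - b)     # positive outside the object
    return outside - inside               # assumption: negative inside

b = np.zeros((5, 5), dtype=int)
b[1:4, 1:4] = 1                           # a 3 x 3 object
sd = signed_distance(b)
print(sd)
```

With this pixel-based construction no pixel has a signed distance of exactly zero: the values jump from -1 just inside the object to +1 just outside it.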
1.3.3. Connected components
Another useful semi-global image operation is finding connected components, which are
defined as regions of adjacent pixels that have the same input value or label. Pixels are said to
be N4 adjacent if they are immediately horizontally or vertically adjacent, and N8 if they can
also be diagonally adjacent. Both variants of connected components are widely used in a variety
of applications, such as finding individual letters in a scanned document or finding objects (say,
cells) in a thresholded image and computing their area statistics.

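Once a binary or multi-valued image has been segmented into its connected components, it is
often useful to compute the area statistics for each individual region R. Such statistics include:
• the area (number of pixels);
• the perimeter (number of boundary pixels);
• the centroid (average x and y values);
• the second moments,

$$M = \sum_{(x, y) \in R} \begin{bmatrix} x - \bar{x} \\ y - \bar{y} \end{bmatrix} \begin{bmatrix} x - \bar{x} & y - \bar{y} \end{bmatrix},$$

from which the major and minor axis orientation and lengths can be computed using
eigenvalue analysis.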


These statistics can then be used for further processing, e.g., for sorting the regions by the area
size (to consider the largest regions first) or for preliminary matching of regions in different
images.
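
Connected component labeling and per-region statistics can be sketched with a breadth-first flood fill. The function names and the convention that every value (including the background) gets its own components are assumptions; the statistics computed here are the area, the centroid, and the second-moment matrix.

```python
from collections import deque
import numpy as np

def connected_components(image, connectivity=4):
    """Label regions of adjacent pixels with equal values; returns (labels, count)."""
    h, w = image.shape
    labels = np.zeros((h, w), dtype=int)          # 0 means "not yet labeled"
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # N4 neighbors
    if connectivity == 8:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    count = 0
    for si in range(h):
        for sj in range(w):
            if labels[si, sj]:
                continue
            count += 1
            labels[si, sj] = count
            queue = deque([(si, sj)])
            while queue:
                i, j = queue.popleft()
                for di, dj in offsets:
                    ni, nj = i + di, j + dj
                    if (0 <= ni < h and 0 <= nj < w and not labels[ni, nj]
                            and image[ni, nj] == image[si, sj]):
                        labels[ni, nj] = count
                        queue.append((ni, nj))
    return labels, count

def region_statistics(labels, label):
    """Area, centroid (x, y), and 2 x 2 second-moment matrix of one region."""
    ys, xs = np.nonzero(labels == label)
    area = len(xs)
    centroid = (xs.mean(), ys.mean())
    d = np.stack([xs - centroid[0], ys - centroid[1]])
    return area, centroid, d @ d.T

image = np.array([[1, 1, 0, 0, 0],
                  [1, 1, 0, 0, 0],
                  [0, 0, 1, 0, 0],
                  [0, 0, 0, 0, 0]])
labels4, n4 = connected_components(image, connectivity=4)
labels8, n8 = connected_components(image, connectivity=8)
area, centroid, moments = region_statistics(labels4, 1)
print(n4, n8, area, centroid)
```

In this example the lone pixel at row 2, column 2 is a separate component under N4 adjacency but joins the 2 × 2 block under N8 adjacency, because the two touch only diagonally.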

2. Fourier transforms

Fourier analysis is a mathematical tool that helps us study the frequency characteristics
of signals and images. In image processing, it allows us to analyze how different filters
affect various frequency components, such as high, medium, and low frequencies.
When we apply a filter to an image or signal, it modifies the frequency content. To
analyze what a filter does to different frequency components, we pass a sinusoidal wave
of a known frequency through the filter and observe its effect.

A sinusoidal wave can be represented as:

$$s(x) = \sin(2\pi f x + \phi_i) = \sin(\omega x + \phi_i),$$

where f is the frequency, ω = 2πf is the angular frequency, and φ_i is the phase.

If we convolve this sinusoidal signal s(x) with a filter h(x), the output is another
sinusoidal signal with the same frequency but a different magnitude and phase:

$$o(x) = h(x) * s(x) = A \sin(\omega x + \phi_o).$$

The Fourier transform expresses how a filter or signal behaves across different
frequencies. The Fourier transform of a filter h(x) is given by:

$$H(\omega) = \mathcal{F}\{h(x)\} = A e^{j\phi}.$$

This means that the filter’s effect on a frequency component is described by its
magnitude A and phase shift φ = φ_o − φ_i.

The Fourier transform is often represented using a notation like:

$$h(x) \overset{\mathcal{F}}{\leftrightarrow} H(\omega).$$
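
The claim that a filtered sinusoid keeps its frequency and only changes in magnitude and phase can be checked numerically. The sketch below uses a periodic discrete signal and circular convolution so that the result is exact; the signal length, the frequency index, and the kernel [0.25, 0.5, 0.25] are arbitrary choices for illustration.

```python
import numpy as np

N = 64                                   # signal length (one period of the grid)
k = 5                                    # integer number of cycles over N samples
omega = 2 * np.pi * k / N
phi_in = 0.3                             # input phase
x = np.arange(N)
s = np.sin(omega * x + phi_in)

h = np.array([0.25, 0.5, 0.25])          # a small smoothing kernel

# Circular convolution: o[x] = sum_m h[m] * s[x - m].
o = sum(h[m] * np.roll(s, m) for m in range(len(h)))

# The kernel's Fourier transform at the same frequency gives A and phi.
H = np.fft.fft(h, n=N)[k]
A, phi = np.abs(H), np.angle(H)
predicted = A * np.sin(omega * x + phi_in + phi)

print(A, phi)
print(np.max(np.abs(o - predicted)))     # difference between the two signals
```

For this particular kernel the magnitude works out to (1 + cos ω)/2 and the phase shift to −ω, a one-sample delay caused by indexing the kernel from zero.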


The discrete Fourier transform (DFT) of a length-N signal has a computational complexity
of O(N^2), which is inefficient for large images or signals. The Fast Fourier Transform (FFT)
is an optimized algorithm that reduces the complexity to O(N log N), making it much faster.

The FFT works by:


1. Breaking the computation into smaller parts (divide and conquer).
2. Using a series of small 2×2 transforms, known as "butterfly operations."
3. Rearranging data in a way that minimizes redundant calculations.
Most modern programming libraries, such as NumPy in Python, provide built-in FFT
functions for efficient computation.
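
The divide-and-conquer idea can be sketched with a recursive radix-2 FFT and compared against a direct O(N^2) DFT and NumPy's built-in routine. The function names are hypothetical, the input length is assumed to be a power of two, and the unnormalized convention of `np.fft.fft` is used.

```python
import numpy as np

def naive_dft(x):
    """Direct O(N^2) DFT: multiply by the full N x N matrix of complex exponentials."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n) @ x

def fft_radix2(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    if n == 1:
        return x
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(x[0::2])           # transform of the even-indexed samples
    odd = fft_radix2(x[1::2])            # transform of the odd-indexed samples
    twiddled = np.exp(-2j * np.pi * np.arange(n // 2) / n) * odd
    # Butterfly: combine the two half-size transforms.
    return np.concatenate([even + twiddled, even - twiddled])

rng = np.random.default_rng(0)
signal = rng.normal(size=64)
print(np.max(np.abs(fft_radix2(signal) - np.fft.fft(signal))))
```

Library implementations are iterative and heavily optimized, so this recursive version is only meant to show the structure of the algorithm.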

Fourier Transform of Common Filters


The Fourier transform allows us to analyze how different filters affect an image’s
frequency components. Here are some examples:
1. Moving Average Filter
A moving average (box) filter smooths an image, but its frequency response has side
lobes, so it does not uniformly suppress high frequencies, causing ringing artifacts.
2. Gaussian Filter
A Gaussian filter is often used for blurring. Its Fourier transform is itself a Gaussian,
so it attenuates high frequencies smoothly, without the side lobes of the moving
average filter.
3. Sobel Edge Detector
The Sobel filter accentuates the intermediate frequencies associated with edges but
attenuates the very highest frequencies, which means it responds weakly to the
finest details.
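
These frequency-domain behaviors can be inspected numerically by taking the FFT of one-dimensional versions of each kernel. The kernel sizes, the Gaussian width of one pixel, and the use of the central difference [-1, 0, 1]/2 as a stand-in for the derivative part of the Sobel operator are assumptions made for this sketch.

```python
import numpy as np

N = 256                                  # number of frequency samples (assumed)
freqs = np.fft.rfftfreq(N)               # 0 ... 0.5 cycles per pixel

def magnitude_response(kernel):
    """Magnitude of the kernel's Fourier transform on the grid `freqs`."""
    return np.abs(np.fft.rfft(kernel, n=N))

box = np.ones(5) / 5                     # 5-tap moving average

x = np.arange(-6, 7)
gaussian = np.exp(-x ** 2 / 2.0)         # sigma = 1 pixel
gaussian /= gaussian.sum()

derivative = np.array([-1.0, 0.0, 1.0]) / 2   # derivative part of Sobel

box_mag = magnitude_response(box)
gauss_mag = magnitude_response(gaussian)
deriv_mag = magnitude_response(derivative)

# The box response dips and rises again (side lobes).
print(np.any(np.diff(box_mag) > 0))
# The derivative response peaks at an intermediate frequency.
print(freqs[np.argmax(deriv_mag)])
```

Plotting the three magnitude arrays against `freqs` shows the oscillating response of the box filter, the smooth roll-off of the Gaussian, and the band-pass shape of the derivative filter.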



Applications of Fourier Transform in Image Processing
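1. Image Filtering: For large kernels, applying filters in the frequency domain is often more
efficient than in the spatial domain, since convolution becomes a per-frequency multiplication.
2. Compression: Fourier-related transforms such as the DCT are used in JPEG compression,
which discards or coarsely quantizes less important frequency components.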
3. Feature Extraction: Many computer vision algorithms analyze frequency
characteristics of images.
4. Image Enhancement: Frequency-based techniques help in noise removal and
sharpening.
5. Pattern Recognition: Fourier descriptors are used to recognize shapes in images.
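
The first application in the list, filtering in the frequency domain, relies on the fact that convolution in the spatial domain corresponds to multiplication of Fourier transforms. The sketch below checks this for circular (wrap-around) convolution; the function names and the 3 × 3 averaging kernel are illustrative assumptions.

```python
import numpy as np

def filter_spatial(image, kernel):
    """Circular 2D convolution computed directly from shifted copies of the image."""
    out = np.zeros(image.shape)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * np.roll(image, (dy, dx), axis=(0, 1))
    return out

def filter_frequency(image, kernel):
    """The same filter applied as a multiplication of Fourier transforms."""
    H = np.fft.fft2(kernel, s=image.shape)   # zero-pad the kernel to image size
    return np.real(np.fft.ifft2(np.fft.fft2(image) * H))

rng = np.random.default_rng(0)
image = rng.random((32, 32))
kernel = np.ones((3, 3)) / 9

difference = filter_spatial(image, kernel) - filter_frequency(image, kernel)
print(np.max(np.abs(difference)))
```

The cost of the spatial version grows with the kernel size, while the cost of the frequency-domain version does not, which is why the latter tends to pay off for large kernels. For non-periodic images the input usually has to be padded first to avoid wrap-around effects.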

Fourier analysis provides a powerful way to study the frequency characteristics of
images and filters. The Fourier transform breaks down an image into its frequency
components, while the Fast Fourier Transform (FFT) provides an efficient way to
compute it. By analyzing the Fourier transform of filters, we can understand their
effects on different frequency ranges, helping us design better image processing
techniques.

2.1. Two-Dimensional Fourier Transforms


The formulas and insights developed for one-dimensional signals and their transforms
translate directly to two-dimensional images. Instead of specifying a horizontal or
vertical frequency ω_x or ω_y, we can create an oriented sinusoid of frequency
(ω_x, ω_y):

$$s(x, y) = \sin(\omega_x x + \omega_y y).$$
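
An oriented two-dimensional sinusoid shows up in the 2D Fourier transform as a pair of peaks at its frequency and at the mirrored frequency. The following NumPy sketch builds such a sinusoid and locates the peaks; the image size and the integer frequency indices `kx` and `ky` are arbitrary choices for illustration.

```python
import numpy as np

N = 32
kx, ky = 3, 5                            # cycles across the image in x and y
y, x = np.mgrid[0:N, 0:N]
omega_x = 2 * np.pi * kx / N
omega_y = 2 * np.pi * ky / N
s = np.sin(omega_x * x + omega_y * y)    # oriented sinusoid

S = np.abs(np.fft.fft2(s))
# fft2 indexes its output as [vertical frequency, horizontal frequency].
peaks = np.argwhere(S > 0.5 * S.max())
print(peaks)
```

The two peaks sit at indices (ky, kx) and (N − ky, N − kx), reflecting the symmetric spectrum of a real-valued image.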


2.1.1. Wiener Filtering
The Fourier transform is not only useful for analyzing the frequency characteristics of
a filter kernel or image, but also for studying the frequency spectrum of a whole class
of images.


2.1.2. Discrete Cosine Transform (DCT)

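The Discrete Cosine Transform (DCT) is a variant of the Fourier transform particularly
well-suited for compressing images in a block-wise manner. The one-dimensional DCT
is computed by taking the dot product of each N-wide block of pixels with a set of cosines
of different frequencies:

$$F(k) = \sum_{i=0}^{N-1} \cos\left(\frac{\pi}{N}\left(i + \frac{1}{2}\right) k\right) f(i),$$

where k is the coefficient (frequency) index and the 1/2-pixel offset is used to make the
basis coefficients symmetric.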
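
The dot product of a pixel block with cosines of different frequencies can be written as a matrix product. The sketch below assumes the common unnormalized form in which basis function k is cos(π/N · (i + 1/2) · k); the function names are hypothetical.

```python
import numpy as np

def dct_basis(N):
    """Rows are the N cosine basis functions sampled at i + 1/2, i = 0..N-1."""
    i = np.arange(N)
    k = np.arange(N)[:, None]
    return np.cos(np.pi / N * (i + 0.5) * k)

def dct_1d(block):
    """Dot product of the block with each cosine basis function."""
    block = np.asarray(block, dtype=float)
    return dct_basis(len(block)) @ block

B = dct_basis(8)
constant = np.full(8, 3.0)
ramp = np.arange(8.0)

print(np.round(dct_1d(constant), 6))     # only the first (DC) coefficient is non-zero
print(np.round(dct_1d(ramp), 3))         # energy concentrated in low frequencies
```

The rows of the basis matrix are mutually orthogonal, which is what allows the block to be reconstructed exactly from its coefficients.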


Some discrete cosine basis functions illustrate this concept. The first basis function (a
constant function) encodes the average DC value in the pixel block, while the second
captures a low-frequency variation (a slightly curvy slope).
The DCT closely approximates the optimal Karhunen–Loève transform (KLT) for
natural images over small patches. The KLT, obtained through Principal Component
Analysis (PCA), optimally decorrelates signals assuming the signal is well-described
by its spectrum, making it theoretically ideal for compression.

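The two-dimensional version of the DCT follows a similar formulation:

$$F(k, l) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \cos\left(\frac{\pi}{N}\left(i + \frac{1}{2}\right) k\right) \cos\left(\frac{\pi}{N}\left(j + \frac{1}{2}\right) l\right) f(i, j).$$
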
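
Because the two-dimensional DCT is separable, it can be computed by applying the one-dimensional cosine basis along the rows and then along the columns of a block. The sketch below assumes square N × N blocks and the unnormalized basis cos(π/N · (i + 1/2) · k); the inverse rescales each coefficient by the squared norm of its basis function. All function names are hypothetical.

```python
import numpy as np

def dct_basis(N):
    i = np.arange(N)
    k = np.arange(N)[:, None]
    return np.cos(np.pi / N * (i + 0.5) * k)

def dct_2d(block):
    B = dct_basis(block.shape[0])        # assumes a square N x N block
    return B @ block @ B.T               # transform rows, then columns

def idct_2d(coeffs):
    N = coeffs.shape[0]
    B = dct_basis(N)
    scale = np.full(N, 2.0 / N)
    scale[0] = 1.0 / N                   # the constant basis has squared norm N
    inverse = B.T * scale                # inverse of the basis matrix
    return inverse @ coeffs @ inverse.T

block = np.add.outer(np.arange(8.0), np.arange(8.0))   # a smooth 8 x 8 ramp
F = dct_2d(block)
restored = idct_2d(F)

# Fraction of the (unnormalized) coefficient energy in the 2 x 2 low-frequency corner.
print(np.sum(F[:2, :2] ** 2) / np.sum(F ** 2))
print(np.max(np.abs(restored - block)))
```

For smooth blocks most of the coefficient energy ends up in the low-frequency corner, which is what block-based compression schemes exploit when they quantize the remaining coefficients coarsely.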
Applications of DCT in Image and Video Compression
The DCT is widely used in image and video compression standards such as JPEG.
However, newer methods like wavelet transforms and overlapped DCT variants
have been integrated into the JPEG 2000 and JPEG XR standards.
These newer techniques suffer less from blocking artifacts, which occur due to
independent transformation and quantization of pixel blocks (typically 8 × 8).

Two-dimensional Fourier transforms extend the principles of one-dimensional analysis
to images, enabling frequency-based filtering and analysis. The Wiener filter, derived
from statistical models, provides optimal restoration under Gaussian noise assumptions
but has largely been replaced by modern non-linear techniques. The Discrete Cosine
Transform (DCT) remains a cornerstone of image compression, effectively
decorrelating image data and enabling efficient encoding in standards like JPEG.
Despite advancements in wavelet-based compression methods, the DCT remains
widely used due to its balance between computational efficiency and compression
performance.


2.1.3. Application: Sharpening, Blur, and Noise Removal

A common application of image processing is the enhancement of images through
sharpening and noise removal operations, which require neighborhood processing.
Traditionally, these operations were performed using linear filtering. However,
modern techniques now favor non-linear filters, such as the weighted median or
bilateral filter, anisotropic diffusion, or non-local means. Additionally, variational
methods, especially those using non-quadratic (robust) norms, such as the L1 norm
(known as total variation), are widely used. Recently, deep neural networks have
taken over the field of image denoising.

Linear vs. Non-Linear Filtering


Linear filters, such as Gaussian smoothing or Laplacian sharpening, work by
convolving the image with a predefined kernel. Non-linear methods, however, adapt
based on local pixel statistics, preserving important structures while reducing noise.
• Weighted Median Filter: Uses local pixel values with adaptive weights to preserve
edges while reducing noise.
• Bilateral Filter: Smooths while maintaining edges by weighting pixels based on spatial
and intensity differences.
• Anisotropic Diffusion: Reduces noise while enhancing edges by allowing controlled
blurring along certain directions.
• Non-Local Means: Estimates the true intensity of a pixel by averaging over similar
patches from the entire image.
• Variational Methods: Solve an optimization problem to enhance image quality while
minimizing distortion.

Advanced Similarity Metrics
More recent approaches use neural perceptual similarity metrics to assess image
quality:
• Feature-based Perceptual Metrics: Evaluate image similarity using deep feature
representations (Johnson, Alahi, and Fei-Fei 2016; Dosovitskiy and Brox 2016; Zhang,
Isola et al. 2018; Tariq, Tursun et al. 2020; Czolbe, Krause et al. 2020).
• Texture Preservation Metrics: Unlike PSNR or L1 metrics that encourage smooth
results, these metrics retain texture similarity (Cho, Joshi et al. 2012).
No-Reference Image Quality Assessment
When a clean reference image is not available, no-reference image quality assessment
methods are used:
• NIQE (Mittal, Moorthy, and Bovik 2012): Uses natural image statistics to estimate
quality.
• Deep Learning-Based Assessments (Talebi and Milanfar 2018): Train neural
networks to predict perceived image quality without a reference.
