Digital Image Processing

Rick
7 min read · Apr 1, 2023


Light and the Electromagnetic Spectrum

The EM spectrum can be expressed in terms of wavelength, frequency, or energy. Wavelength (λ) and frequency (ν) are related by

λ = c / ν … (1)

where c is the speed of light, c ≈ 2.998 × 10^8 m/s.

The energy of the various components of the EM spectrum is given by the expression

E = h ν … (2)

where h is Planck’s constant. The units of wavelength are meters, frequency is measured in Hz, and a common unit for energy is the electron-volt (eV).

Electromagnetic waves can be visualized as propagating sinusoidal waves of wavelength λ, or they can be thought of as a stream of massless particles, each traveling in a wavelike pattern and moving at the speed of light. Each of these particles contains a bundle of energy called a photon.
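As a quick numeric check (a hypothetical example, not from the original notes), we can plug a visible wavelength into equations (1) and (2) and compute the frequency and the energy of a single photon:

```python
# Hypothetical example: frequency and photon energy of green light (~0.55 um).
h = 6.626e-34   # Planck's constant, J*s
c = 2.998e8     # speed of light, m/s

lam = 0.55e-6                 # wavelength in meters (550 nm, green)
nu = c / lam                  # eq. (1): frequency in Hz
E_joules = h * nu             # eq. (2): photon energy in joules
E_ev = E_joules / 1.602e-19   # convert joules to electron-volts

print(f"frequency ~ {nu:.3e} Hz")        # ~5.45e14 Hz
print(f"photon energy ~ {E_ev:.2f} eV")  # ~2.25 eV
```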

From eq. (2), energy is proportional to frequency, and from eq. (1), higher frequency means shorter wavelength; so the shorter the wavelength, the higher the energy. The visible band of the EM spectrum spans from approximately 0.43 μm (violet) to about 0.79 μm (red). Light that is void of color is called monochromatic (or achromatic) light. The only attribute monochromatic light has is intensity, which ranges from black through shades of gray up to white; in this context, intensity and gray level are used interchangeably. Chromatic light, in addition to frequency, has three other quantities that describe it:

Radiance.- the amount of energy that flows from the light source, measured in watts (W)

Luminance.- measured in lumens (lm), this is the amount of energy an observer perceives from a light source.

Brightness.- a subjective descriptor of light perception; it is practically impossible to measure.

Image Sensing and Acquisition

Most of the images we want to generate require two main elements:

  • A source of illumination
  • A sensing device that captures the energy reflected from, or transmitted through, the objects in the scene (x-ray machines, for example, work by transmission).

Image Acquisition Using a Single Sensing Element

A common single-element sensor is a photodiode. To generate a 2-D image with a single sensor, we have to displace it along both the x and y axes. There are several ways to do this; one common arrangement is described below.

To move the sensor along the x axis, it is attached with an adapter to a threaded rod; spinning the rod displaces the sensor from left to right. To cover the y axis, the image is mounted on a roller. When the sensor finishes a left-to-right pass, the roller turns a small step and the sensor starts again, sampling the image from left to right until a full 2-D image has been captured.
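A minimal sketch of this scanning idea (my own illustration with made-up function names, not the author's actual setup): treat the scene as a continuous function and sample it one point at a time, left to right, advancing one row after each pass.

```python
import numpy as np

def scene(x, y):
    """Stand-in for the continuous scene being scanned (hypothetical)."""
    return np.sin(4 * np.pi * x) * np.cos(2 * np.pi * y)

def scan_with_single_sensor(rows, cols):
    """Build a 2-D image by sampling one point at a time, row by row."""
    image = np.zeros((rows, cols))
    for r in range(rows):            # the roller advances one step per pass
        for c in range(cols):        # the sensor moves left to right
            image[r, c] = scene(c / cols, r / rows)
    return image

img = scan_with_single_sensor(64, 64)
print(img.shape)  # (64, 64)
```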

Image Acquisition Using Sensor Strips

A sensor strip works just like a typical scanner, for example the big photocopy machines in a library, a college, or an office. Because the strip already covers one full dimension, it only needs to move along one axis to capture the whole image.

Image Acquisition Using Sensor Arrays

This arrangement is more intuitive to understand: sensing elements organized in a 2-D array are what most digital cameras use, and electromagnetic and ultrasonic sensors are frequently arranged this way as well. A typical array sensor in these cameras is a CCD, short for charge-coupled device. The response of each sensor element is the integral of the light energy projected onto its surface, a property used in astronomical and other applications requiring low-noise images. A 2-D array sensor normally works as follows.

Light from a source is reflected off a scene or an object, collected by imaging optics, and focused onto an image plane, where the sensor array gathers the incoming energy.
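To picture the "response is the integral of the projected light energy" remark, here is a deliberately simplified sketch (hypothetical model and values; a real CCD is far more involved) in which each element accumulates incoming energy over the exposure:

```python
import numpy as np

rng = np.random.default_rng(0)

def incoming_energy(shape=(32, 32)):
    """Hypothetical instantaneous light energy falling on the array."""
    return 0.5 + 0.05 * rng.standard_normal(shape)

# Each element's response approximates the integral of the energy projected
# onto it during the exposure: here, a discrete sum over many time steps.
exposure_steps = 100
response = sum(incoming_energy() for _ in range(exposure_steps))

# Longer integration raises the signal faster than the noise, which is why
# long exposures help in low-noise applications such as astronomy.
print(response.mean(), response.std())
```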

A Simple Image Formation Model

As stated before, images can be denoted as functions of the form f(x,y). The value of f at spatial coordinates (x,y) is a scalar quantity whose meaning is determined by the source of the image, and these values are proportional to the energy radiated by a physical source (e.g., electromagnetic waves). Because of this physical origin, f(x,y) cannot take negative values and must remain finite.

Note: image intensities can become negative as a result of interpretation or processing; for example, negative velocities can be used to represent direction (up versus down, or left versus right).

0 ≤ f(x,y) < ∞

The function f(x,y) is characterized by two components: (1) the amount of source illumination incident on the scene, called illumination and denoted i(x,y), and (2) the amount of light reflected by the objects in the scene, called reflectance and denoted r(x,y). The two functions combine as a product to form f(x,y):

f(x,y) = i(x,y) r(x,y)

where

0 ≤ i(x,y) < ∞ and 0 ≤ r(x,y) ≤ 1,

with r(x,y) = 0 meaning total absorption and r(x,y) = 1 meaning total reflectance.
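Here is a small sketch of this model with made-up arrays (the names illumination and reflectance are mine, chosen to mirror i(x,y) and r(x,y)):

```python
import numpy as np

rows, cols = 4, 4

# i(x,y): source illumination incident on the scene, nonnegative and finite
illumination = np.full((rows, cols), 90.0)
illumination[:, 2:] = 40.0          # pretend half the scene is lit more dimly

# r(x,y): reflectance of the objects, constrained to [0, 1]
reflectance = np.random.default_rng(1).random((rows, cols))

# f(x,y) = i(x,y) * r(x,y)
f = illumination * reflectance
assert (f >= 0).all() and np.isfinite(f).all()
print(f)
```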

Image Sampling and Quantization

As seen before, there are many ways to acquire images; what we are missing is how to turn that continuous data into digital data. From other courses like digital circuits or signals and systems, we know that the everyday world consists of analog signals, and the process of converting that information for a computer is called discretization or digitization. For images, digitizing the coordinate values is called sampling, and digitizing the amplitude (intensity) values is called quantization.
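A minimal sketch of both steps on a 1-D signal (purely illustrative; the sample count and bit depth are arbitrary):

```python
import numpy as np

# A "continuous" signal, approximated on a very fine grid
t_fine = np.linspace(0.0, 1.0, 10_000)
signal = 0.5 + 0.5 * np.sin(2 * np.pi * 3 * t_fine)

# Sampling: keep only a small number of equally spaced coordinate values
num_samples = 32
idx = np.linspace(0, len(t_fine) - 1, num_samples).astype(int)
samples = signal[idx]

# Quantization: map each amplitude to one of 2**bits discrete levels
bits = 3
levels = 2 ** bits
quantized = np.round(samples * (levels - 1)) / (levels - 1)

print(samples[:5])
print(quantized[:5])
```

The same two operations applied along both spatial axes, plus quantization of the intensities, turn a continuous scene into a digital image.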

Representing Digital Images

One way to represent a digital image is to plot the function as a surface, with two axes determining the spatial location and a third axis representing the values of f(x,y).

In computer processing, however, the most common representation is via matrices.

Every element of such a matrix is called a picture element, pel, or pixel. When working with images as matrices, the convention is that the origin of the image is located at the top-left corner. This has to do with how monitors work: they sweep an image starting at the top left and moving to the right, row by row.
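As a tiny illustration of the matrix convention (the values are made up), note that indexing is image[row, column] with (0, 0) at the top-left corner:

```python
import numpy as np

# A 3x4 grayscale "image": each entry is a pixel intensity in [0, 255]
image = np.array([
    [  0,  50, 100, 150],
    [ 25,  75, 125, 175],
    [ 50, 100, 150, 200],
], dtype=np.uint8)

print(image.shape)   # (3, 4): 3 rows, 4 columns
print(image[0, 0])   # 0   -> origin, top-left pixel
print(image[2, 3])   # 200 -> bottom-right pixel
```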

Spatial and Intensity Resolution

Spatial resolution is a measure of the smallest discernible detail in an image. Quantitatively, spatial resolution can be described as:

  • line pairs per unit distance
  • dots(pixels) per unit distance

Quick exercise

  • How many line pairs fit in 1 mm if a single line is 0.1 mm wide?
  • Solution: a line pair is a line plus a space of the same width, so its width is 2W = 0.2 mm; 1 mm / 0.2 mm = 5 line pairs.

To be meaningful, measures of spatial resolution must be stated with respect to spatial units. For instance, saying that an image has a resolution of 1024×1024 pixels is not a meaningful statement on its own; in other words, we need to know how much physical space a single pixel represents.

Intensity resolution, similarly, refers to the smallest discernible change in intensity level. The number of intensity levels is normally an integer power of two; the most common choice is 8 bits (256 levels), 16 bits is used in some specific applications, and 10, 12, and 32 bits are less common. An experiment published in 1965 studied the relationship between spatial resolution, intensity resolution, and perceived image quality. The study concluded that images with little detail need more spatial and intensity resolution to be perceived as good quality, while images with lots of detail, such as a picture of a crowd at a football match, do not need as much spatial or intensity resolution to be perceived as good quality.
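A quick way to see intensity resolution in action is to requantize an 8-bit image to fewer bits. This is a sketch with a synthetic gradient image (not from the article):

```python
import numpy as np

# Synthetic 8-bit image: a horizontal gradient from 0 to 255
image = np.tile(np.arange(256, dtype=np.uint8), (64, 1))

def reduce_bits(img, bits):
    """Requantize an 8-bit image to 2**bits intensity levels."""
    shift = 8 - bits
    return ((img >> shift) << shift).astype(np.uint8)

for b in (8, 4, 2, 1):
    print(b, "bits ->", len(np.unique(reduce_bits(image, b))), "levels")
# 8 bits -> 256 levels, 4 -> 16, 2 -> 4, 1 -> 2
```

With few enough bits, smooth regions start to show false contouring, which is the visible symptom of insufficient intensity resolution.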

Image Interpolation

Interpolation is used in tasks like zooming, shrinking, and geometrically correcting images. Interpolation is the process of using known data to estimate values at unknown locations. To assign an intensity value to any point in the resized grid, we can look for the nearest pixel in the original image and copy its intensity. This method is called nearest neighbor interpolation. The approach is simple and useful, but it is prone to artifacts such as the distortion of straight edges. A more suitable approach is bilinear interpolation, in which we use the 4 nearest neighbors to estimate the intensity at a given location.

Bilinear interpolation

Under bilinear interpolation, the estimated value is a weighted average of the 4 surrounding pixels. Bicubic interpolation is the next level up in complexity, using the 16 nearest neighbors of a point.
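To make the contrast concrete, here is a rough sketch of nearest-neighbor and bilinear resizing written directly with NumPy (bicubic is omitted, but it follows the same idea with 16 neighbors):

```python
import numpy as np

def resize_nearest(img, new_h, new_w):
    """Nearest neighbor: copy the intensity of the closest original pixel."""
    h, w = img.shape
    rows = np.minimum((np.arange(new_h) * h / new_h).round().astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) * w / new_w).round().astype(int), w - 1)
    return img[rows[:, None], cols]

def resize_bilinear(img, new_h, new_w):
    """Bilinear: weighted average of the 4 nearest original pixels."""
    h, w = img.shape
    r = np.linspace(0, h - 1, new_h)[:, None]   # fractional row coordinates
    c = np.linspace(0, w - 1, new_w)[None, :]   # fractional column coordinates
    r0, c0 = np.floor(r).astype(int), np.floor(c).astype(int)
    r1, c1 = np.minimum(r0 + 1, h - 1), np.minimum(c0 + 1, w - 1)
    dr, dc = r - r0, c - c0
    top = img[r0, c0] * (1 - dc) + img[r0, c1] * dc
    bottom = img[r1, c0] * (1 - dc) + img[r1, c1] * dc
    return top * (1 - dr) + bottom * dr

img = np.arange(16, dtype=float).reshape(4, 4)
print(resize_nearest(img, 8, 8).shape)   # (8, 8)
print(resize_bilinear(img, 8, 8).shape)  # (8, 8)
```

In practice you would reach for library routines (e.g., the resize functions in OpenCV or Pillow), but the hand-rolled version shows where the blockiness of nearest neighbor and the smoothing of bilinear come from.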
