Exploring the Digital Canvas: Neighborhood Estimation, Affine Transformations and More
A couple of weeks ago I took my first digital image processing exam of the semester. Since most of the class failed some of the questions, our teacher said he might reuse some of the same topics for the midterm, which for me is next week. So, in an effort to make myself study, I'm going to redo the whole exam and even write some code to expand on some topics.
Question #1
a) What is the minimum number of bits we should use if we want to have 65535 grayscale levels?
> If you work with binary numbers a lot, you might remember that 2¹⁶ is 65536, so 16 bits are enough for 65535 levels (15 bits only give 2¹⁵ = 32768). We could also calculate the base-2 logarithm, log₂(65535) ≈ 15.99998, and round up to 16 bits.
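The same check can be sketched in Python. The helper name `min_bits` is made up for this post, and it assumes at least 2 levels; it uses integer arithmetic so there is no floating-point rounding to worry about.

```python
def min_bits(levels):
    """Smallest n such that 2**n >= levels (assumes levels >= 2)."""
    # (levels - 1) is the largest value we must be able to store,
    # and bit_length() tells us how many bits that value needs.
    return (levels - 1).bit_length()


print(min_bits(65535))  # 16
print(min_bits(65536))  # 16
print(min_bits(65537))  # 17
```

One level more than 65536 already pushes us to 17 bits, which is why 16 is the minimum here.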
b) List 3 membranes in the human eye
- Cornea.- The transparent membrane that covers the anterior surface of the eye; it helps direct light to form an image.
- Sclera.- It’s the area that covers most of the eyeball
- Retina.- This membrane is made up of many light-sensitive receptors that interact with light to help form images:
  - Cones: most of us have 6 to 7 million cones, which let us see red, green, and blue light; these combine to form other colors too.
  - Rods: they serve a similar purpose to the cones but allow us to see in the dark to a certain degree.
c) Explain the differences between Global Shutter and Rolling Shutter.
Question #2
We have the following situation: we want to take advantage of our camera, which has a 25 mm lens and a ⅔-inch sensor array (approx. 8.8 x 6.6 mm). If the object is 150 mm tall, what is our working distance?
For this we make use of our optics formulas.
In this case we need “a”, the object or working distance.
a ≈ 451.1 mm, or about 45.1 cm, so we would take the image of this object at a distance of approx. 45 cm.
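As a sanity check, here is a small Python sketch that reproduces that number. It rests on two assumptions of mine: the thin-lens equation 1/f = 1/a + 1/b, and the 150 mm object filling the 8.8 mm side of the sensor, so the magnification is 8.8/150. The function name `working_distance` is only for illustration.

```python
def working_distance(focal_length_mm, object_size_mm, image_size_mm):
    """Object distance a from the thin-lens equation 1/f = 1/a + 1/b.

    With magnification m = image_size / object_size = b / a,
    substituting b = m * a gives a = f * (1 + 1/m).
    """
    magnification = image_size_mm / object_size_mm
    return focal_length_mm * (1 + 1 / magnification)


# Assumed: the 150 mm object is imaged onto the 8.8 mm side of the sensor.
a = working_distance(25, 150, 8.8)
print(f"{a:.2f} mm")  # 451.14 mm
```

Using the 6.6 mm side instead would give a longer distance, so the choice of sensor dimension matters.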
Question #3
List similarities between the human eye and an image acquisition device
- Cameras, computer displays, and most things built on pixels or sensors usually work with three main colors: red, green, and blue. The human eye also perceives light as red, green, and blue because of our color cones.
- We tend to squint to control the amount of light that enters our eyes, and via lenses or shutters we can control how much light enters the sensor on a camera.
Question #4
Based on these pixel coordinates, calculate the following
- The Euclidean distance between C and B
- The Chessboard distance between B and A
- The City Block distance between D and C
The Euclidean distance between C and B
- C(x,y) = (17,2)
- B(u,v) = (11,17)
- √((17–11)² + (2–17)²) = √(36 + 225) = √261 ≈ 16.155
The Chessboard distance between B and A
- A(x,y) = (2,2)
- B(u,v) = (11,17)
- max( |2–11| , |2–17| ) = 15
The City block distance between D and C
- C(x,y) = (17,2)
- D(u,v) = (7,8)
- |17–7| + |2–8| = 16
We can write a little bit of code for this, and because we are extra around these parts, we are going to write MATLAB code AND Python code.
MATLAB Code
The code is written this way because of a requirement: I had to add a description and create my own functions, with a parameter that selects the Euclidean, City Block, or Chessboard method. GitHub link here. I'll try to update this GitHub to include much more in MATLAB, Python, and even C++, since OpenCV also supports C++.
The first function is Calc_D, which receives the following parameters:
- The input matrix on which to calculate the distance
- The origin coordinates
- The desired method
Depending on the method, we call one of the different functions:
if strcmpi(method,'Eu')       % calls for the Euclidean method
    D = EucliDis(x,y,A);
elseif strcmpi(method,'D4')   % calls for the D4 (City Block) method
    D = D4_dist(x,y,A);
elseif strcmpi(method,'D8')   % calls for the D8 (Chessboard) method
    D = D8_dist(x,y,A);
else                          % unknown method: stop instead of leaving D undefined
    error('Invalid method in function Calc_D')
end
Every one of these is actually quite simple: we know the formula, and we just iterate over all of the pixel coordinates in the matrix, taking the input origin coordinates as the reference point.
D4 or City Block distance
function B = D4_dist(x,y,A)
    %% Variables
    [a,b] = size(A); % Dimensions of the input matrix
    B = zeros(a,b);  % Blank matrix for output
    %% Distance Calculation (D4 method "City Block Distance")
    for u = 1:a
        for v = 1:b
            B(u,v) = abs(x-u) + abs(y-v); % Formula for the D4 method
        end
    end
end
Chessboard distance
function B = D8_dist(x,y,A)
    %% Variables
    [a,b] = size(A); % Dimensions of the input matrix
    B = zeros(a,b);  % Blank matrix for output
    %% Distance Calculation (D8 method "Chessboard Distance")
    for u = 1:a % scan the matrix
        for v = 1:b
            B(u,v) = max(abs(x-u), abs(y-v)); % Formula for the D8 method
        end
    end
end
Euclidean distance
function B = EucliDis(x,y,A)
    %% Variables
    [a,b] = size(A); % Dimensions of the input matrix
    B = zeros(a,b);  % Blank matrix for the output
    %% Distance Calculation (Euclidean method)
    for u = 1:a % scan the input matrix
        for v = 1:b
            B(u,v) = round(sqrt((x-u)^2 + (y-v)^2), 2); % Euclidean formula, rounded to 2 decimals
        end
    end
end
For Python there are a few changes, which you can see in detail in the GitHub code.
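To give an idea of what the Python side can look like, here is a minimal sketch. It is not the code from the GitHub repository: the names `pixel_distance` and `distance_map` are made up for this post, it uses plain lists instead of NumPy, and the map uses 0-based indices (unlike the 1-based MATLAB loops above).

```python
import math


def pixel_distance(p, q, method):
    """Distance between pixels p = (x, y) and q = (u, v).

    method: 'Eu' (Euclidean), 'D4' (City Block) or 'D8' (Chessboard).
    """
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    key = method.lower()  # case-insensitive, like strcmpi
    if key == "eu":
        return math.hypot(dx, dy)
    if key == "d4":
        return dx + dy
    if key == "d8":
        return max(dx, dy)
    raise ValueError(f"Invalid method: {method!r}")


def distance_map(rows, cols, origin, method):
    """Distance from origin to every pixel of a rows x cols grid (0-based)."""
    return [[pixel_distance(origin, (u, v), method) for v in range(cols)]
            for u in range(rows)]


# The three exam results from Question #4:
print(round(pixel_distance((17, 2), (11, 17), "Eu"), 3))  # 16.155
print(pixel_distance((2, 2), (11, 17), "D8"))             # 15
print(pixel_distance((17, 2), (7, 8), "D4"))              # 16
```

Raising an error for an unknown method keeps a typo in the method name from silently returning nothing.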
Question #5
We need to estimate the number of neighborhoods, assuming 8-connectivity.
The first thing we do is scan the image line by line; the first blank cell that we encounter becomes neighborhood 1. For each pixel we also have to check whether it has a neighbor that is already labeled, assuming 8-connectivity. If it has none, that is a new neighborhood and our neighborhood count increases.
Soon enough we encounter a situation where “two neighborhoods meet”: the highlighted pixel has neighbors in both neighborhood 2 and neighborhood 4. During the initial scan it wasn't obvious to the computer that these belonged together, but this pixel connects them under 8-connectivity, so the conclusion is that they are the same neighborhood, and since 2 came first, the overall neighborhood is now 2.
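The scanning idea above can be sketched in Python as a two-pass labelling with a small union-find table. This is my own illustration rather than the exam's reference solution: the function name `label_components` and the test grid are made up, and I assume that nonzero cells are the ones being labeled.

```python
def label_components(image):
    """Two-pass labelling with 8-connectivity; nonzero cells are foreground."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = {}  # union-find table: provisional label -> parent label

    def find(label):
        while parent[label] != label:
            label = parent[label]
        return label

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # the label that came first wins

    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            # Neighbors already visited in a line-by-line scan: W, NW, N, NE.
            seen = []
            for dr, dc in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc]:
                    seen.append(labels[rr][cc])
            if not seen:  # no labeled neighbor: start a new neighborhood
                parent[next_label] = next_label
                labels[r][c] = next_label
                next_label += 1
            else:         # join the smallest label and note that the others meet
                labels[r][c] = min(seen)
                for other in seen:
                    union(labels[r][c], other)

    # Second pass: replace every provisional label by its final one.
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                labels[r][c] = find(labels[r][c])
    count = len({lab for row in labels for lab in row if lab})
    return labels, count


grid = [
    [1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
]
labels, count = label_components(grid)
print(count)  # 2
```

In this grid the two vertical bars start out as neighborhoods 1 and 2, and the pixel in the bottom row touches both diagonally, so they merge into neighborhood 1. With 4-connectivity they would stay separate.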
the whole result comes to this
Question #6
Now we need to identify the affine transformation that produces the desired output below.
Either from observation or by looking at the pixel coordinates, we can say that the image has not only moved but has also been stretched and compressed.
In the first image the resolution was 10 x 20 pixels, and then it changes to 20 x 10; the image also moves up by 5 pixels and to the right by 5 pixels.
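To see how a scaling and a translation combine into a single matrix, here is a rough Python sketch in homogeneous coordinates. The numbers are my reading of the description, not values taken from the original figure: I assume 10 x 20 means width x height, so the x scale is 2 and the y scale is 0.5, and I treat both 5-pixel shifts as positive offsets, which depends on the axis convention. All function names are made up for illustration.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def apply(matrix, x, y):
    """Map the point (x, y) through a 3x3 homogeneous matrix."""
    col = (x, y, 1)
    new_x, new_y, _ = (sum(row[k] * col[k] for k in range(3)) for row in matrix)
    return new_x, new_y


# Assumed values, read off the text rather than the original figure.
S = scaling(2, 0.5)     # 10 -> 20 in x, 20 -> 10 in y
T = translation(5, 5)   # shift by 5 pixels along each axis
M = matmul3(T, S)       # scale first, then translate

for row in M:
    print(row)
print(apply(M, 10, 20))  # (25, 15.0)
```

The order matters: `matmul3(S, T)` would scale the translation as well, which gives a different result.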
The total affine transformation matrix would be: