ICA: Numpy Arrays and Image Arithmetic

Author

Susan Eileen Fox

Published

September 26, 2025

Overview

In this activity, we will take a closer look at images as matrices and what we can do with them in that context. You will practice with basic Numpy tools operating on both image arrays and smaller practice arrays. We will look at how to use simple arithmetic operations on matrices to change colors, brightness, and more. We will examine how to access the channels of an image, and will create regions of interest, subsections of images that we want to focus on.

The Github repository for this assignment will contain a starter code file, activ9.py. Put your code in this file, as directed by the TODO comments.

Numpy operations

Creating arrays

The list of functions below allows us to create Numpy arrays from scratch. With this, we can make a blank canvas to use with the OpenCV drawing functions, create synthetic images, or just make a black and white mask or frame to apply to an image.

Function Description
array Takes a sequence data type (list, string, tuple, or Numpy array), and builds a Numpy array with the same shape. Optional input allows us to specify the type of data for the array.
zeros Takes in a tuple giving dimensions, and an optional input for the type of data, and it makes an array with the given dimensions and type, all filled with zeros.
ones Similar to zeros but it fills the array with the value 1.
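A quick sketch of these three functions, using small arrays with the uint8 type that OpenCV images use:

```python
import numpy as np

# Build an array from a nested list; the dtype is inferred unless given.
a = np.array([[1, 2], [3, 4]], dtype=np.uint8)

# A 3-row, 4-column array of zeros: note the tuple is (rows, columns).
z = np.zeros((3, 4), dtype=np.uint8)

# An array of ones; multiplying fills it with any constant value.
o = np.ones((3, 4), dtype=np.uint8) * 255

print(a.shape, z.shape, o[0, 0])
```

A color image canvas would use a three-dimensional shape such as (height, width, 3), one layer per channel.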

Try this to hand in: Create a function blackCanvas that takes in two inputs, the width and height for the canvas, and it creates a new color Numpy array all filled with black, with the given width and height. It should return the new image array. In your main script, try several test calls to this function with different sizes of images, and draw a square centered in each.

Try this to hand in: Create a function rgbStripes that takes in two inputs, the width and height for the new image. It should create a new image array of the given size. Then it should fill the first 1/3 of rows with red, the second 1/3 of rows with green, and the remaining rows with blue. There are at least four different ways of filling in the values (iterate over each row and column with nested for loops, iterate over each row and use slicing to assign the whole row at once, use slicing to select the whole region to be red, or green, or blue, or build a list the shape of the image and then convert it to a Numpy array). Pick the approach that suits you the best!

Image Arithmetic

Remember that both Numpy and OpenCV provide tools for doing arithmetic on images. This means we can add or subtract a constant amount to every value in an image, changing the overall brightness of the image. We can also add or subtract images from each other, so long as they are the same shape: corresponding values in the image arrays are added or subtracted from each other.
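The two libraries handle overflow differently, which matters when we add a constant to brighten an image. Here is a small sketch with a tiny made-up array standing in for an image; the clip call emulates the saturating behavior of cv2.add:

```python
import numpy as np

# A tiny "image" of uint8 values, like an OpenCV frame.
img = np.array([[100, 200], [50, 250]], dtype=np.uint8)

# Plain Numpy addition wraps around (modulo 256): 250 + 30 -> 24.
wrapped = img + np.uint8(30)

# Converting up, adding, then clipping emulates cv2.add's saturation:
# 250 + 30 caps at 255 instead of wrapping.
saturated = np.clip(img.astype(np.int16) + 30, 0, 255).astype(np.uint8)

print(wrapped[1, 1], saturated[1, 1])
```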

Subtracting images

We can perform a simple kind of motion detection on frames from a video, by subtracting one frame from its predecessor. The Numpy module tells Python how to interpret the minus sign when applied to Numpy arrays. OpenCV has two functions for taking the difference: subtract and absdiff. The first one just does subtraction, the second one computes the absolute value of the difference for each corresponding value in the arrays.
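The three kinds of subtraction can be compared on tiny arrays. This sketch emulates the OpenCV behavior with plain NumPy so the clipping is explicit; on real images you would simply call cv2.subtract(f1, f2) and cv2.absdiff(f1, f2):

```python
import numpy as np

f1 = np.array([[10, 200]], dtype=np.uint8)
f2 = np.array([[30, 120]], dtype=np.uint8)

# Plain Numpy subtraction wraps around: 10 - 30 -> 236 (256 - 20).
np_diff = f1 - f2

# cv2.subtract clips negative results to 0 (emulated with int16 + clip).
sub = np.clip(f1.astype(np.int16) - f2.astype(np.int16), 0, 255).astype(np.uint8)

# cv2.absdiff takes the absolute value of the difference instead.
absd = np.abs(f1.astype(np.int16) - f2.astype(np.int16)).astype(np.uint8)

print(np_diff, sub, absd)
```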

Try this to hand in: In the script section of the starter code file, add a script to experiment with subtracting two frames from a video (this video is one we will work with later, it shows Prof. Fox holding an orange ball and moving it around in the air):

  • Start by reading in frame1.jpg and frame2.jpg
  • Use imshow to display the two images; notice that frame 1 is slightly different from frame 2
  • Define a new image diff1 to be the result of calculating frame 1 minus frame 2, using the minus sign (-)
  • Define another new image diff2 to hold the result of OpenCV’s subtract function applied to frame 1 and frame 2
  • Define a third new image diff3 to hold the result of OpenCV’s absdiff function applied to frame 1 and frame 2
  • Use imshow to display all three diff images
  • Examine the results. Based on what you know about how Numpy and OpenCV handle arithmetic, and how subtract and absdiff differ from each other, explain why the three difference images look the way they do, and how they are different. Write this explanation in the starter code file, as a comment

Blending images

We can use image arithmetic to blend two images together, using the addWeighted function from OpenCV. To blend two images, we want to look at corresponding pixels, and average their two red values to make a new red value, and do similarly with green and blue channel values. (It is kind of amazing that this works, actually!)

Resizing images

In order to blend two images, we need to make them the same size and shape. We could do that using the Numpy slicing operators, but here we will take a look at a function that lets you resize an image either by scaling it to a specific size, or by scaling it by given factors in the x and y directions.

cv2.resize(<img>, (<wid>, <hgt>), fx=?, fy=?)

The resize function takes in the image to be resized, and a tuple giving the new size as width followed by height, and it returns a new image of the new size. However, if we set the width and height in the tuple to be zero, then we can provide optional inputs that give the new size as a factor of the original. The optional input fx specifies the factor in the x dimension, and fy in the y dimension. If we set fx to 0.5, for example, then the new image will have a width one half the size of the original’s width.

The examples below illustrate different ways of calling resize.

Examples Meaning
cv2.resize(src, (100, 100)) Returns a new stretched/squashed image that is 100 x 100 pixels
cv2.resize(src, (0, 0), fx = 2, fy = 2) Returns a new image twice the size of the original
cv2.resize(src, (0, 0), fx = 0.5, fy = 1.0) Returns a new image whose width is half the original size

Try this to hand in: In the script section of the activity code file, read in three images from SampleImages (any ones you choose). Use the resize command to change the second and third images to match the size of the first. Be sure to imshow the images so that you can check your work. Call these images img1, img2, and img3.

Blending with arithmetic

Examine the code fragment below (also reproduced in your activity code file).

blendImg1 = cv2.add(img1, img2)
cv2.imshow("Blend by adding", blendImg1)

Try this fragment on the resized images you created in the previous section, and observe the results. If we just add the two images, the result is too bright, and would have tons of overflow artifacts if we used Numpy addition. We want to average the two image values, not just add them. But consider this: if we first add the images, and then divide by 2 (the way we typically think about computing an average) the result will be distorted. Even OpenCV’s addition operator avoids overflow by capping the values at 255 when they would have added up to more than 255. That means that adding and then dividing by 2 will produce a different result than dividing each original image by 2 and then adding.

Try the code below on the images you resized, and compare blend1 and blend2.

sumImg = cv2.add(img1, img2)
blend1 = cv2.divide(sumImg, 2)

divImg1 = cv2.divide(img1, 2)
divImg2 = cv2.divide(img2, 2)
blend2 = cv2.add(divImg1, divImg2)

We can also use Numpy commands to compute the average more easily, so long as we remember to divide first, and then add, to avoid overflow artifacts. Try this:

avgImg = 0.5 * img1 + 0.5 * img2
blend3 = avgImg.astype(np.uint8)

Weighted averages

A normal average (add the two numbers and divide by 2) weights both pixels/images equally: 50% from one image and 50% from another. The previous example multiplied each image by 0.5: by shifting from division to multiplication we can see that we are really multiplying each image by a weight, the percentage of the final image that should come from each image.

We can change the percentages, by changing the weights. Just make sure they are each between 0.0 and 1.0 and that they add up to 1.0, so that the resulting image has the same brightness as the originals.

Try varying the weights for the example above, and examining the resulting blend.
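As a sketch of unequal weights (with a made-up weight of 0.25, and tiny constant arrays standing in for images):

```python
import numpy as np

# Two tiny "images": one all 100s, one all 200s.
img1 = np.full((2, 2), 100, dtype=np.uint8)
img2 = np.full((2, 2), 200, dtype=np.uint8)

# Weights must each lie in [0.0, 1.0] and sum to 1.0 to keep
# the result's brightness comparable to the originals.
w = 0.25
blended = (w * img1 + (1 - w) * img2).astype(np.uint8)

# 0.25 * 100 + 0.75 * 200 = 175 at every pixel.
print(blended[0, 0])
```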

The addWeighted function

OpenCV actually provides a function, addWeighted, to perform a weighted average of two images.

blend4 = cv2.addWeighted(img1, 0.5, img2, 0.5, 0)   

The addWeighted function has 5 required inputs: the first image to blend, the weight to multiply the first image by, the second image to blend, the weight for the second image, and a constant to add to the result. In other words, the function computes this mathematical formula:

\[newIm = \alpha \cdot img1 + \beta \cdot img2 + \gamma\]

Try this as an alternative to the Numpy arithmetic version, and compare the results.

Try this to hand in: Create a function phaseBlend that takes in two images presumed to be the same size. This function should include the following steps:

  • Set up one weight value, w, to be 0.0 (w is an accumulator variable for this loop)
  • Repeat with a for loop, enough times for w to reach 1.0 (experiment or calculate the number of repetitions)
  • Inside the for loop, use w and 1 - w as the weights, and blend the two input images, assigning the result to a variable
  • Also in the loop, imshow the blended result, and include a waitKey
  • Finally, in the loop, add a small amount to w to change the weight for next time (use 0.1 or 0.05, or similar)
  • Optional extension: Instead of trying to time the loop so it stops just when w reaches 1.0, we could reverse the direction of the blend and start reducing w each time (until it gets back to 0.0). To do this, we need another variable, deltaW, to hold the amount to change w by each time. It stays at 0.1 or 0.05 until w reaches 1.0, and then it should change to -0.1 or -0.05.

Accessing channels

The split and merge functions allow us to pull apart an image’s channels, and put them back together again.

Try this to hand in: Examine the code below, also found in your starter file.

flowerIm = cv2.imread("SampleImages/wildcolumbine.jpg")
(blueChan, greenChan, redChan) = cv2.split(flowerIm)
cv2.imshow("Original", flowerIm)
cv2.imshow("Blue channel alone", blueChan)
cv2.imshow("Green channel alone", greenChan)
cv2.imshow("Red channel alone", redChan)
cv2.waitKey()

When you run this code, what happens, and why? Discuss with classmates, preceptor, or instructor if you aren’t sure why the channels appear the way they do when displayed.

Add a call to merge to your code file, right after these lines. Merge takes in a tuple of three channels, and treats them as the blue, green, and red channels of the image. Try these variations:

  • Make an exact copy of the original, by calling merge and passing it a tuple containing the channel images in the original order (blue, then green, then red). (Be sure to imshow the result.)
  • Use zeros to make a copy of the red channel that is all filled with zeros. Call merge again, but replace the red channel with the new blank one. How does it differ? What if you make a white image the same shape as the red channel, and then use that in merge?
  • Call merge a third time, and use the original three channels, but put them in a different order. What does the result look like?

Can you explain the results you see? If not, discuss with a neighbor or teammate, or with preceptor or faculty member.

Try this to hand in: Create a function, colorShuffle, that takes in one image as an input parameter. It will return a new image that has the three color channels randomly shuffled to a new order!

Do this:

  • Add an import statement at the top of the file to import the random module
  • Inside the function, use split to separate the three channels from each other
  • Define a variable to hold a list with the three channel arrays in it
  • Call random.shuffle and pass it the list from the previous step (this will change the list to a new ordering, try printing the list before and after to see how shuffle works)
  • Pass merge the shuffled list to get the new image
  • Return the image

Regions of Interest

A region of interest is a section of an image that we want to focus on. Often it is the result of running computer vision algorithms to determine where something interesting is in the image, but it could also be human-designed.

At its core, we make a region of interest using Numpy’s array slicing operator. We will practice the slicing operator here, on small arrays and on images.

Key idea: Remember that Numpy often avoids copying the data in an array. When we use slicing to access a portion of an array, Numpy provides a view of the original data, limiting the indices we can see, rather than making a copy. When this happens, changes to the original array show up in the view array, and vice versa. (By contrast, the astype method always returns a new copy of the data, converted to the new type.)
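A small demonstration of the difference between a view and a copy (the values here are arbitrary):

```python
import numpy as np

arr = np.zeros((3, 3), dtype=np.uint8)

# Slicing returns a view: no data is copied.
view = arr[0:2, 0:2]
view[0, 0] = 99

# The change made through the view shows up in the original array.
print(arr[0, 0])

# An explicit copy, by contrast, is independent of the original.
sub = arr[0:2, 0:2].copy()
sub[1, 1] = 50
print(arr[1, 1])
```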

Consider the small 2d array shown in the code below.

arr1 = np.array([[2, 4, 6, 8], [3, 6, 9, 12], [4, 8, 12, 16]])
print(arr1)
[[ 2  4  6  8]
 [ 3  6  9 12]
 [ 4  8 12 16]]

If we want to access an individual element of the array, we can put its row and column indices inside square brackets, separated by commas:

print("last of second row:", arr1[1, 3])
print("second of last row:", arr1[2, 1])
last of second row: 12
second of last row: 8

If we want a subarray, we extend this notation to use slicing operators for the row or column we are selecting from. Here are a few examples:

print("middle values:", arr1[1, 1:3])
print("third and fourth columns:")
print(arr1[:, 2:4])
middle values: [6 9]
third and fourth columns:
[[ 6  8]
 [ 9 12]
 [12 16]]

Now you try some examples, putting your code in the script section indicated by a TODO comment.

  • Access just the value 16 from this array
  • Access the first column of the array
  • Select the last two elements from the first two rows (giving a 2x2 matrix)
  • Select values from every other row, and every other column, starting with the 2 at [0, 0]

When working with images, we typically use slicing in two ways: to select the color at a specific pixel, or to select a rectangular region of the picture. The code sample below illustrates two ways to select the color channels from a specific pixel location. It also draws a tiny circle at that location on the image and displays it.

img = cv2.imread("SampleImages/antiqueTractors.jpg")
col1 = img[150, 325, :]
col2 = img[150, 325]
print(col1, col2)
cv2.circle(img, (325, 150), 2, (255, 255, 255))
cv2.imshow("Image", img)
cv2.waitKey()

Notice that this is one of the places where Numpy and OpenCV’s different ordering of rows and columns comes into play: Numpy orders the location as (row, column), but when we specify the location for drawing the circle, we give it as (x, y).

To access a region of an image, we select the range of rows, then the range of columns, and then place a colon (:) for the channel dimension, to indicate we want all three channels.

steer = img[150:225, 410:460, :]
cv2.rectangle(img, (410, 150), (460, 225), (255, 255, 255))
cv2.imshow("Image", img)
cv2.imshow("ROI", steer)
cv2.waitKey()

Try these to hand in:

  • Create an interactive function, colorShow, to let the user input coordinates, and show the resulting color in a small image window.
    • This function will take one input, the image we are going to sample colors from
    • Inside the function, display the input image and wait for a key press
    • Next, create a small blank image, colorDisplay, that is 100 by 100 pixels in size.
    • Add a for loop to repeat some number of times (like 5 or 10 times)
    • Inside the loop, ask the user to input first the x coordinate, and then ask them to input the y coordinate (remember to convert the string the user enters to an integer)
    • Use slicing to pull out the color at that (x, y) coordinate (remember to switch the ordering when slicing to y first, then x)
    • Draw a circle with radius 2 centered on the (x, y) coordinate the user entered
    • Display the image (it’s okay if old circles still show up)
    • Set the color at every pixel of colorDisplay to be the selected color
    • Display colorDisplay in its own window
    • Add a waitKey at the end
  • Create a function centerCrop that takes in an image as its input parameter. The function should make and return an ROI that is 200x200 pixels, centered in the image.
    • Get the height and width of the image, and from that calculate the center row and center column
    • Define an ROI that extends 100 to either side of the center column and 100 to either side of the center row
    • Display the ROI and the original, and use waitKey to pause
    • Return the ROI

What to hand in

Put all of your function definitions for this activity into a single file to be submitted. Make sure you format your code appropriately:

  • At the top of the file is a triple-quoted string describing the file
  • Next you include all import statements
  • Next you have your function definitions, visually separated by blank lines, and maybe comments with dashed or other visual horizontal lines
  • Each function should have a triple-quoted descriptive comment right after the def line
  • All calls to all functions should be in a script at the bottom of the file, ideally inside an if __name__ ... block

Use commit and push to copy your code to Github to submit this work.