CS180 Project 4

Part 1: Shoot the Pictures

In this section, I took several sets of images for image mosaics and rectifications. For creating image mosaics, I took two images from different angles (left first and then right) using my phone. To ensure similar exposure through the set, I made sure to use exposure lock on my phone (press and hold the focus box until it shows AE lock). Unfortunately, I only used exposure locks for daytime images. My nightime image set was taken without the exposure lock causing the mosaiced result to have a very obvious seem due to the exposure difference. For rectification, I took images of an quadrilateral object from an angle, so that the object in appeared slanted and distorted in the image. Then, I used a series of operations to rectify the image, or change the perspective such that the object appears square/rectangular in the image. Here are some examples of the photos taken:

Images for Image Mosaic

Images for Rectification

Part 2: Recover Homographies"

Here we want to recover the homographies between the images. To achieve this effect, we need to first define a set of correspondences between the images. Then, we can use the correspondences to calculate the homographic transformation matrix.

Imagine we have 1 pair of correspondence between the images. Here is how we can find the homograhica transformation matrix. \[ \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \\ \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \\ \end{bmatrix} = \begin{bmatrix} wx' \\ wy' \\ w \\ \end{bmatrix} \] Notice that the last line of the matrix multiplication results in \[ \begin{align*} \begin{bmatrix} g & h & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \\ \end{bmatrix} &= \begin{bmatrix} w \\ \end{bmatrix} \\ \begin{bmatrix} gx & hy & 1 \\ \end{bmatrix} &= \begin{bmatrix} w \\ \end{bmatrix} \\ gx + hy + 1 &= w \end{align*} \] Then, this system of equations becomes \[ \begin{align*} \begin{bmatrix} ax & by & c \\ dx & ey & f \\ gx & hy & 1 \end{bmatrix} &= \begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} \\[10pt] \begin{bmatrix} ax & by & c \\ dx & ey & f \\ gx & hy & 1 \end{bmatrix} &= \begin{bmatrix} (gx + hy + 1)x' \\ (gx + hy + 1)y' \\ (gx + hy + 1) \end{bmatrix} \\[10pt] \begin{bmatrix} ax & by & c \\ dx & ey & f \\ gx & hy & 1 \end{bmatrix} &= \begin{bmatrix} (gxx' + hyx' + x') \\ (gxy' + hyy' + y') \\ (gx + hy + 1) \end{bmatrix} \\[10pt] \begin{bmatrix} ax & by & c \\ dx & ey & f \end{bmatrix} &= \begin{bmatrix} (gxx' + hyx' + x') \\ (gxy' + hyy' + y') \end{bmatrix} \\[10pt] \begin{bmatrix} ax & by & c & 0 & 0 & 0 & -gxx' & -hyx' \\ 0 & 0 & 0 & dx & ey & f & -gxy' & -hyy' \end{bmatrix} &= \begin{bmatrix} x' \\ y' \end{bmatrix} \\[10pt] \begin{bmatrix} x & y & 1 & 0 & 0 & 0 & -xx' & -yx' \\ 0 & 0 & 0 & x & y & 1 & -xy' & -yy' \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix} &= \begin{bmatrix} x' \\ y' \end{bmatrix} \\[10pt] \end{align*} \] This is thus the system of equations that if solved with linear solver, will give us the elements in the transformation matrix. However, since there are 8 unknowns, we need to construct 8 equations at minimum to solve the system. Each pair of points can give us two equations, so we need at least four correspondences to solve the system. We simply stack each pair of equations on top of each other to create a more determined matrix. \[ \begin{align*} \begin{bmatrix} x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1x_1' & -y_1x_1' \\ 0 & 0 & 0 & x_1 & y_1 & 1 & -x_1y_1' & -y_1y_1' \\ x_2 & y_2 & 1 & 0 & 0 & 0 & -x_2x_2' & -y_2x_2' \\ 0 & 0 & 0 & x_2 & y_2 & 1 & -x_2y_2' & -y_2y_2' \\ x_3 & y_3 & 1 & 0 & 0 & 0 & -x_3x_3' & -y_3x_3' \\ 0 & 0 & 0 & x_3 & y_3 & 1 & -x_3y_3' & -y_3y_3' \\ x_4 & y_4 & 1 & 0 & 0 & 0 & -x_4x_4' & -y_4x_4' \\ 0 & 0 & 0 & x_4 & y_4 & 1 & -x_4y_4' & -y_4y_4' \\ \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix} &= \begin{bmatrix} x_1' \\ y_1' \\ x_2' \\ y_2' \\ x_3' \\ y_3' \\ x_4' \\ y_4' \end{bmatrix} \\[10pt] \end{align*} \] However, to improve numerical stability, we may need to create more than 8 correspondences to form an overdetermined linear system. One down side is that we rarely have an exact solution, so to find a solution, we need to form this as a least squares optimization problem and solve the system this way.

Part 3: Warp the Images

To warp the images according to the homography matrix found using the process described in part 2, we need to follow the following steps with some caution:

Part 4: Image Rectification

To rectify the images, we need to define the four corners of the rectangular object and find the correspondences between these points and four corners of a true rectangle. Then we find the homography matrix and warp the images.

Book with original corners
Bag with original corners
Magnet with original corners
Book with new corners
Bag with new corners
Magnet with new corners
Book rectified
Bag rectified
Magnet rectified

Part 5: Blend the images into a mosaic

Here is where the fun begins. I can finaly use the sets of images displayed in part 1 to create three beautiful image mosaics. I first warped each image into the perspective of the second image using the warpImage function from part 3. Then I found the distance transform of each image's polygon (D1 and D2). Using D1 > D2 and D1 < D2, I found the blending mask for the first and second image. I attempted to blend the two images together just using alpha blending. However, the results turned out to have strong artificats at the seams. Therefore, I decided to use a Laplacian stack for blending using the masks described above. Here are the results.

Building 1 with points
Street corner 1 with points
House 1 with points
Building 2 with points
Street corner 2 with points
House 2 with points
Building Mask 1
Street Corner Mask 1
House Mask 1
Building Mask 2
Street Corner Mask 2
House Mask 2
Building Blended
Street Corner Blended
House Blended

Here is a set of extra images.

Car 1 with points
Car 2 with points
Car Mask 1
Car Mask 2
Car Blended

Here is a failed attempt to blend the images together. The main reason for this was because the differences in exposure between the left and right images, which caused a obvious seam the in the resulting image.

Street 1 with points
Street 2 with points
Street Mask 1
Street Mask 2
Street Blended

Part 6: Conclusion

From this part of the project, I learned that using exposure lock is extremely important for blending images, especially for images taken at night. In addition, while it was extremely difficult to warp the images correctly, the debugging process was phenomenal and reminded me the power of pen and paper debugging.

Autostitching

In this part of the project, I'm going to stictch together a mosaic using automatic approaches.

Part 7: Adaptive Non-Maximal Suppression and Harris Interest Point Detector

I first found harris corners using the provided code. Then, I used the adaptive non-maximal suppression algorithm specified in the paper At first, I implemented the algorithm naively with two for loops. Unfortunately, on my local architecture, it took around 10 mins to process one image. It forced me to think of ways to optimize the process. Using dist2 and a vector outer product, I successfully remove the need for any for loop and achieved a dramatic performance improvement (from 10 mins to under a second for one image).

Harris Corner Detection

Left Image with Harris Corners
Right Image with Harris Corners

Adaptive Non-Maximal Suppression

Left Image with Harris Corners (n = 200)
Right Image with Harris Corners (n = 200)

Look! We have reduced so many gibberish points from the images!

Part 8: Feature Descriptor Extraction

Now that we have a manageable number of corners, we can extract the feature descriptors for them. First, for each point, I found the neighboring 20 pixels in each direction (40 by 40 where the corner point is in the center). For corners that are too close to the edge, I just discarded them, but this was not necessary because the harris corner function only consider points at least 20 pixels away from any edge. After identifying the interesting windows, I rescaled them to 8 x 8 windows using Gaussian blur. All of these were done in a 3-channel color space. Here are some sample results for the car photo.

Feature Descriptor 01
Feature Descriptor 02
Feature Descriptor 03

Part 9: Feature Matching

Let's match feature desciptors together! For each feature in image 1, we will find the nearest and second nearest neighboring feature in image 2. The distance criterion is the Euclidean distance. After finding the top two nearest neighbors, we can find a truely similar pair by using the Lowe ratio test, where we only keep the pair if \[ \frac{\text{dist to nn1}}{\text{dist to nn2}} < \text{threshold} \] After iterating through all the feature descriptors in image 1 and automaticaly in image 2, we have all the valid pairing in our hands.

Matched Features

Part 10: RANSAC

To find a good homography matrix, we need to use RANSAC. For n number of iterations (default to 1000), we conduct the following operations.

This way, we will have the largest set of points (inliers) that aggree with one homography. To find that homography H, we use the least square solver, as the number of inliers are most likely to be larger than 4. From here, we can use this H to transform the images using the process described in first half of the project.

Here are the pairs of inliers after RANSAC.

Matched Features

Part 11: Auto Stitching Results

Here are some successful results using auto stitching.

Auto Stitched Car Image
Manually Stitched Car Image
Auto Stitched Corner Image
Manually Stitched Corner Image
Auto Stitched Plushies Image
Manually Stitched Plushies Image

For the plushie example, here are the original images.

Original Plushie Image 1
Original Plushie Image 2

Here are some failed results using auto stitching.

The overlap between the two images is probably too small.
The auto stitching hates skies!

Part 12: Conclusion

This project taught me a lot about the need to vectorize operations. I could not have finished project without vectorizing the ANMS algorithm. I also learned how powerful RANSAC is. Originally, I mistakenly used two sets of independent samples from image 1 features and image 2 features to find inliers. This meant that I just used random points to compute the homographies. Somehow RANSAC still produced relatively reasonable results. I am fully impressed by how powerful RANSAC is.