Project 4 - Owen Gozali

Photo Mosaics

Part A: Manual Mosaics

Introduction

In this project I will be stitching together images taken from the same center of projection, similar to constructing a panorama from scratch.

Part 1: Taking Images

I took a variety of images throughout the week leading up to this project. Here are the ones I ended up using:

Image 1

View of house in northside Berkeley (I)

Image 1

View of house in northside Berkeley (II)

Image 1

View of classroom while I was bored (I)

Image 1

View of classroom while I was bored (II)

Image 1

View of Berkeley Way West 8th Floor (I)

Image 1

View of Berkeley Way West 8th Floor (II)

Image 1

Picture of random sign

Image 1

Picture of random LCD display

Part 2: Recovering Homographies

Similar to the previous project, we select point correspondences between the images so that we can warp them onto one another.

Image 1

Point correspondences for first shot

Image 1

Point correspondences for second shot

The problem is that we can no longer use a triangle-based warp to align the images; we need something more powerful: a homography. A homography is another image warping method, but it has 8 degrees of freedom instead of the 6 an affine warp has. Every entry of the 3x3 warp matrix is free except the bottom-right entry, which is fixed to 1. We can derive a more straightforward matrix of coefficients that we can solve for using least squares:

Image 1
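In code, setting up and solving that system might look something like this minimal NumPy sketch (the function name compute_homography and the (x, y) point format are my own illustrative choices, not necessarily the exact code used here):

```python
import numpy as np

def compute_homography(pts, pts_prime):
    """Solve for H such that [x', y', 1]^T ~ H [x, y, 1]^T, with H[2, 2] fixed to 1.

    pts, pts_prime: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(pts, pts_prime):
        # Each correspondence contributes two linear equations in the 8 unknowns.
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])

    # Overdetermined when N > 4, so solve in the least-squares sense.
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```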

Once we solve for our homography matrix, we can warp the image as we please.
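Here is a rough sketch of that warp via inverse mapping, assuming SciPy is available for bilinear interpolation (it ignores the bookkeeping needed to translate the output canvas so the whole warped image fits, which a real implementation would need):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_image(im, H, out_shape):
    """Inverse-warp image `im` (rows x cols x channels) onto a canvas of shape `out_shape`."""
    H_out, W_out = out_shape
    # Homogeneous (x, y, 1) coordinates of every output pixel.
    xs, ys = np.meshgrid(np.arange(W_out), np.arange(H_out))
    out_coords = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])

    # Map each output pixel back into the source image with H^-1.
    src = np.linalg.inv(H) @ out_coords
    src_x, src_y = src[0] / src[2], src[1] / src[2]

    # Bilinearly sample each channel; pixels mapping outside the source stay 0.
    warped = np.zeros((H_out, W_out, im.shape[2]))
    for c in range(im.shape[2]):
        warped[..., c] = map_coordinates(
            im[..., c], [src_y, src_x], order=1, cval=0.0
        ).reshape(H_out, W_out)
    return warped
```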

Image 1

Right shot of lab

Image 1

Right shot of lab warped to align with left shot

Part 3: Rectifying Images

A neat thing we can do with our homographies is to rectify images, i.e. map square or rectangular surfaces into a flat, front-on projection. This can be done by computing the homography using the surface's corners as the (x, y) points and a set of flat coordinates such as [(0, 0), (0, 100), (100, 0), (100, 100)] as the (x', y') points.
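For instance, a rectification might look something like this (the file name, corner coordinates, and output size are made up for illustration, and compute_homography / warp_image refer to the sketches above):

```python
import numpy as np
import skimage.io as skio

# The angled photo of the TV screen (path and clicked corners are example values).
photo = skio.imread('tv_angle.jpg') / 255.0
screen_corners = np.array([[412, 288], [978, 301], [968, 672], [405, 655]])  # (x, y)

# Where those corners should land: an axis-aligned rectangle.
target_corners = np.array([[0, 0], [400, 0], [400, 225], [0, 225]])

H = compute_homography(screen_corners, target_corners)
rectified = warp_image(photo, H, out_shape=(226, 401))
```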

Here are some examples of successful rectifications.

Image 1

Image of TV screen from an angle

Image 1

Rectified TV screen

Image 1

Cropped rectified TV screen

Image 1

Image of sign from an angle

Image 1

Rectified signage

Image 1

Cropped rectified signage

Part 4: Blending Images into Mosaics

Using the homography and some clever alignment math, we can stitch together images so they look like they were taken from the same perspective. I added an alpha channel to each image to mark where there is actual image content versus empty padding, and then used the alpha channels to blend the images together wherever they both have content.
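Here is a rough sketch of the alpha-based blend, assuming both warped images have already been placed onto a shared canvas as RGBA arrays whose alpha channel is 1 where real content exists (names and the simple alpha-weighted averaging are illustrative):

```python
import numpy as np

def alpha_blend(im1, im2):
    """Blend two same-sized RGBA canvases.

    Where only one image has content (alpha == 1), its pixels are used directly;
    where both overlap, the colors are weighted by their alpha values.
    """
    a1, a2 = im1[..., 3:], im2[..., 3:]
    total = a1 + a2
    w1 = np.divide(a1, total, out=np.zeros_like(a1), where=total > 0)
    w2 = np.divide(a2, total, out=np.zeros_like(a2), where=total > 0)

    blended = np.zeros_like(im1)
    blended[..., :3] = w1 * im1[..., :3] + w2 * im2[..., :3]
    blended[..., 3:] = np.clip(total, 0, 1)  # any covered pixel becomes opaque
    return blended
```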

Image 1

View of house in northside Berkeley (I)

Image 1

View of house in northside Berkeley (II)

Image 1

Blended view of house

Image 1

View of classroom while I was bored (I)

Image 1

View of classroom while I was bored (II)

Image 1

Blended view of classroom

Image 1

View of Berkeley Way West 8th Floor (I)

Image 1

View of Berkeley Way West 8th Floor (II)

Image 1

Blended view of BWW 8th Floor

Part B: Autostitching

Introduction

Manual stitching works fine, but we can streamline the process significantly by automating the correspondence annotation. We can use some novel techniques to automatically detect features and calculate which ones match between images.

Detecting Corners

We can start by detecting corners using the Harris corner detector. We're interested in corners specifically because they are easy to match between images: the local patch around a corner changes noticeably when shifted in any direction, so correspondences are less ambiguous. Using this corner detector, we can find many corner candidates in the image. Let's try this on an image of an intersection near my house.

Image 1

Left View

Image 1

Left View with Harris Corners

Image 1

Right View

Image 1

Right View with Harris Corners
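Here is a minimal sketch of this corner-detection step, assuming skimage's Harris implementation (the sigma and min_distance values are placeholders, not necessarily what I used):

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import corner_harris, peak_local_max

def get_harris_corners(im, min_distance=3):
    """Return (corner_strength_map, corner_coordinates) for an RGB image."""
    gray = rgb2gray(im)
    # Per-pixel Harris response; larger values are more corner-like.
    h = corner_harris(gray, sigma=1)
    # Local maxima of the response become our corner candidates.
    coords = peak_local_max(h, min_distance=min_distance)  # (N, 2) array of (row, col)
    return h, coords
```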

Adaptive Non-Maximal Suppression

This is great for giving us a good number of corner candidates; however, the result is much too dense to plug into the next steps of the pipeline, because we end up with on the order of 3-4 thousand points. We typically only need 500 or so for the next step of the process, so we need a way to downsample these points.

One straightforward method is to sample randomly, but that could leave us with corner points that have low "corner strength" values as computed by the Harris detector. We could instead choose the 500 points with the highest corner strength, but that tends to concentrate points in certain areas of the image, which isn't desirable because we want a good distribution of points throughout. A solution is Adaptive Non-Maximal Suppression (ANMS), which takes both distance and corner strength into account when selecting an optimal subset of points. This is the result of running ANMS on our 3k+ original corner candidates.

Image 1

Left View with Harris Corners

Image 1

Harris Corners Narrowed Down with ANMS

Image 1

Right View with Harris Corners

Image 1

Harris Corners Narrowed Down with ANMS
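Here is a sketch of the ANMS selection described above, based on the suppression-radius idea (the robustness constant and the simple O(N^2) loop are my own choices for clarity):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    coords: (N, 2) corner coordinates; strengths: (N,) Harris responses at those corners.
    Keeps the n_keep corners with the largest suppression radius, i.e. the distance
    to the nearest corner that is significantly stronger.
    """
    n = len(coords)
    radii = np.full(n, np.inf)
    for i in range(n):
        # Corners whose (scaled) strength dominates corner i can suppress it.
        stronger = strengths * c_robust > strengths[i]
        if np.any(stronger):
            dists = np.linalg.norm(coords[stronger] - coords[i], axis=1)
            radii[i] = dists.min()
    keep = np.argsort(radii)[::-1][:n_keep]
    return coords[keep]
```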

Feature Matching

Once we have a suitable set of corners, we can start matching points between images to see which corners in image 1 correspond to which corners in image 2. We do this by taking a 40x40 window around each point, applying a Gaussian blur and downsampling it to an 8x8 patch, and then finding the two corners in the other image whose 8x8 patches look the most similar. We then discard candidates that aren't significantly more similar to their closest neighbor than to their second-closest neighbor (Lowe's ratio test). Doing this eliminates points that don't have a good match, since they are probably outside of the overlapping region.
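Here is a sketch of the descriptor extraction and ratio-test matching (the blur sigma and the ratio threshold are illustrative values, not necessarily the ones I used):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptors(gray, coords, window=40, out_size=8, sigma=2):
    """Cut a 40x40 window around each corner, blur it, and downsample to 8x8."""
    half, step = window // 2, window // out_size
    descs, kept = [], []
    for r, c in coords:
        if r - half < 0 or c - half < 0 or r + half > gray.shape[0] or c + half > gray.shape[1]:
            continue  # skip corners too close to the border
        patch = gaussian_filter(gray[r - half:r + half, c - half:c + half], sigma)
        patch = patch[::step, ::step]                          # 8x8 after subsampling
        patch = (patch - patch.mean()) / (patch.std() + 1e-8)  # bias/gain normalization
        descs.append(patch.ravel())
        kept.append((r, c))
    return np.array(descs), np.array(kept)

def match_descriptors(d1, d2, ratio=0.6):
    """Return (i, j) index pairs whose nearest neighbor passes the ratio test."""
    matches = []
    for i, d in enumerate(d1):
        dists = np.linalg.norm(d2 - d, axis=1)
        nn1, nn2 = np.argsort(dists)[:2]
        if dists[nn1] < ratio * dists[nn2]:
            matches.append((i, nn1))
    return matches
```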

Here is what it looks like when we apply this procedure to match up the points:

Image 1

Left Image Points

Image 1

Right Image Points

Image 1

Left Correspondences

Image 1

Right Correspondences

RANSAC

However, there may still be some pesky outliers remaining. These would significantly affect the computed homography, because we compute it with least squares. We can eliminate these outliers with the following process: randomly sample 4 point pairs, use them to compute a homography exactly, then apply that homography to the remaining points and check how far each lands from the point it is supposed to be paired with; the ones within a threshold are labeled "inliers". Repeat this many times and keep the homography with the largest number of inliers. This maximal set of inliers becomes the final set of correspondences we use to compute the homography for stitching.
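Here is a sketch of that RANSAC loop, reusing the hypothetical compute_homography from Part A (the iteration count and pixel threshold are placeholder values):

```python
import numpy as np

def ransac_homography(pts1, pts2, n_iters=2000, threshold=2.0):
    """Robustly estimate a homography from matched (x, y) points in two images."""
    n = len(pts1)
    best_inliers = np.array([], dtype=int)
    for _ in range(n_iters):
        # 1. Pick 4 random correspondences and compute an exact homography from them.
        sample = np.random.choice(n, 4, replace=False)
        H = compute_homography(pts1[sample], pts2[sample])

        # 2. Project all points from image 1 into image 2 with this candidate H.
        homog = np.column_stack([pts1, np.ones(n)]) @ H.T
        proj = homog[:, :2] / homog[:, 2:]

        # 3. Points that land close to their partners are inliers.
        errors = np.linalg.norm(proj - pts2, axis=1)
        inliers = np.where(errors < threshold)[0]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers

    # Recompute the final homography from the maximal inlier set via least squares.
    return compute_homography(pts1[best_inliers], pts2[best_inliers]), best_inliers
```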

Image 1

Initial Left Correspondences

Image 1

Initial Right Correspondences

Image 1

Left with Outliers Removed via RANSAC

Image 1

Right with Outliers Removed via RANSAC

The rest of the procedure is identical to the manual case, and we can now generate some nice, complex mosaics without needing to manually annotate the points.

Image 1

Left View of Intersection

Image 1

Right View of Intersection

Image 1

Intersection Stitched Together

Here are some other images I stitched together automatically!

Houses view from Part A

Image 1

Houses Left View

Image 1

Houses Middle View

Image 1

Houses Right View

Image 1

Houses Stitched

Local herbs store

Image 1

Herb Store Left View

Image 1

Herb Store Middle View

Image 1

Herb Store Right View

Image 1

Herb Store Stitched

Berkeley Fire Department

Image 1

Fire Station View 1

Image 1

Fire Station View 2

Image 1

Fire Station View 3

Image 1

Fire Station View 4

Image 1

Fire Station View 5

Image 1

Fire Station Stitched

Takeaways

I think the biggest thing I learnt from this project is that I don't need to worry about always implementing the most efficient solution from scratch. Sometimes it's better to get a basic version working and then iterate on it afterwards. I learnt this while doing the feature matching, because I was stuck on designing a way to do the nearest-neighbor search efficiently without needing to Gaussian-blur the patches every time. Eventually I just implemented the straightforward version, added caching with a dictionary, and it worked just fine.