Most of what I've done so far regarding my thesis research has been reading and playing with other people's code. I started off playing with matching code written by Rob Hess of Oregon State University. He seems to have created his own implementation of SIFT (rather than using the inventor's freely available binary) and used the keypoint descriptors to find matches between two images. You can also have the program try to find an appropriate transformation between the two images using RANSAC methods. I started looking at this implementation because it should be easy to use as a base for working with SURF descriptors instead, as well as for writing several matching strategies to test.
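For anyone curious what that kind of pipeline looks like in code, here is a minimal sketch of the same idea using OpenCV's Python bindings rather than Rob Hess's C library. This isn't his implementation; the file names and the ratio-test threshold are my own placeholders, and it just shows the general SIFT-match-then-RANSAC flow:

```python
import cv2
import numpy as np

# Hypothetical file names standing in for the two photos discussed below.
img1 = cv2.imread("camera_photo.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("panorama_face.jpg", cv2.IMREAD_GRAYSCALE)

# Detect SIFT keypoints and compute descriptors in both images.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Match each descriptor to its two nearest neighbours and keep only matches
# that pass Lowe's ratio test (the 0.75 threshold is my choice here).
matcher = cv2.BFMatcher(cv2.NORM_L2)
good = []
for pair in matcher.knnMatch(des1, des2, k=2):
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

# Try to recover a transformation (here a homography) between the two views
# with RANSAC, analogous to the transformation step in the matching program.
H, mask = None, None
if len(good) >= 4:
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

if H is not None:
    print(len(good), "good matches,", int(mask.sum()), "RANSAC inliers")
else:
    print("No transformation found")
```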
I was rather confused when two images that seemed very similar were not producing the kind of matches I would have expected (and no transformation was found between them). Both images were of the same building, with a change in viewpoint that isn't particularly significant.
This image was taken with my digital camera a couple of weeks ago:
This is the portion of the panoramic image that contains the same building. I don't know when it was taken. The panoramic images in this case are stored as cube maps unfolded onto the plane. This is one face of that cube.
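Just to illustrate how a face could be pulled out of such a panorama, here is a small sketch that assumes the six faces are stored side by side in one horizontal strip. The actual layout of these panoramas may well differ, and the file name and face index are made up:

```python
import cv2

# Assumed layout: six square cube faces side by side in a single strip image.
def cube_face(strip_path, face_index):
    strip = cv2.imread(strip_path)
    face_size = strip.shape[1] // 6          # width of one face
    x0 = face_index * face_size
    return strip[:, x0:x0 + face_size]

# Hypothetical file name and face index for the face containing the building.
face = cube_face("panorama_cube_strip.jpg", 2)
cv2.imwrite("panorama_face.jpg", face)
```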
What do you notice that's different about these images? The illumination isn't quite the same, but that is not a factor when using the more sophisticated keypoint detectors and descriptors. Look closely at the windows of the building. They appear distorted in one image compared to the other.
That's apparently because the focal lengths of my digital camera (6mm) and the camera used to capture the panoramic images (~2mm) are very different, causing a perspective distortion. Or so my profs surmised when we last met. This would mean that one of the images would have to be modified to undo the distortion. It also means that I will be using only images from my digital camera to test the matching code I'm working on (at least for now).
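If a transformation could be estimated between a pair of images, one rough way to compensate for that kind of perspective difference would be to warp one image into the other's frame with the homography, continuing the OpenCV sketch from earlier. That doesn't help for this particular pair, since no transformation was found, but it gives an idea of the sort of modification that would be involved:

```python
# Continuing the earlier sketch: if a homography H mapping img1 into img2's
# frame had been found, warping img1 with it would roughly undo the
# viewpoint/perspective difference before comparing the images.
if H is not None:
    h2, w2 = img2.shape[:2]
    warped = cv2.warpPerspective(img1, H, (w2, h2))
    cv2.imwrite("camera_photo_warped.jpg", warped)  # placeholder output name
```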