Monday, November 10, 2008
If you've never taken the time to check out Photosynth, do it now. Click your way to the Sphinx example and watch what happens when you drag your mouse around the main window. You will see a set of points that look an awful lot like some pyramids and the Sphinx. You can zoom in and start seeing some photographs overlaid onto the point cloud. What's so impressive about this, you ask? Well, all this data was obtained from photographs alone, reconstructed into this 3D navigable wonder you see before you. Pretty cool.
There are times when you might have found yourself a little shutter-happy when visiting some exotic, far-off land. Yet you probably still don't have enough photos of the exact same thing to recreate something like the Sphinx model. Fortunately, photo-sharing sites like Flickr help solve that problem! Starting with tags and geo-coding, and double-checking by matching the photos against each other, we can find all the photos we need. That's exactly what researchers at the University of Washington did for the Community Photo Collections project, which became the basis of Photosynth!
From the site:
With the recent rise in popularity of Internet photo sharing sites like Flickr and Google Images, community photo collections (CPCs) have emerged as a powerful new type of image dataset for computer vision and computer graphics research. With billions of such photos now online, these collections should enable huge opportunities in 3D reconstruction, visualization, image-based rendering, recognition, and other research areas. The challenge is that these collections have extreme variability, having been taken by numerous photographers from myriad viewpoints with varying lighting and appearance, and often with significant occlusions and clutter. Our research seeks to develop robust algorithms that operate successfully on such image sets to solve problems in computer vision and computer graphics.
For those with a few extra minutes (sixty-one extra minutes, to be exact), you can get a pretty good sense of the technology behind the project in this Google Tech Talk. The presentation isn't too technical, and uses some great demos to show what's going on. For the geeks out there, there is structure-from-motion source code available, too!
A lot of what's going on behind the scenes is related to what I'm trying to do with my thesis. Both need to find matches between two photos (or between a panorama and a photo, in my case) and recover the geometry relating them. For applications like Photosynth, this geometry is used to obtain a 3D reconstruction of the scene (that's what all those points were in the Sphinx example). I might instead use the geometry, together with the known configuration of the panorama, to add virtual objects to the photograph, like historical buildings or geographical information.
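To give a flavour of the "recover the geometry" step: once you have point matches between two photos, you can estimate the fundamental matrix that relates them. Below is a minimal sketch of the classic normalized 8-point algorithm using only numpy and synthetic correspondences (the scene, camera motion, and point counts are made-up numbers for illustration; real systems like Photosynth use detected features such as SIFT plus robust estimation like RANSAC, not clean synthetic matches).

```python
import numpy as np

def normalize(pts):
    """Translate the centroid to the origin and scale the mean distance to sqrt(2)."""
    centroid = pts.mean(axis=0)
    d = np.linalg.norm(pts - centroid, axis=1).mean()
    s = np.sqrt(2) / d
    T = np.array([[s, 0, -s * centroid[0]],
                  [0, s, -s * centroid[1]],
                  [0, 0, 1]])
    pts_h = np.column_stack([pts, np.ones(len(pts))])
    return (T @ pts_h.T).T, T

def eight_point(x1, x2):
    """Estimate the fundamental matrix F with the normalized 8-point algorithm."""
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # Each correspondence contributes one row of A f = 0,
    # derived from the epipolar constraint x2^T F x1 = 0.
    A = np.column_stack([
        p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
        p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
        p1[:, 0], p1[:, 1], np.ones(len(p1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2: a valid fundamental matrix is singular.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    return T2.T @ F @ T1  # undo the normalization

# Synthetic scene: random 3D points in front of two cameras.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (40, 3)) + [0, 0, 6]
theta = 0.1  # small rotation about the y axis between the two views
R = np.array([[np.cos(theta), 0, np.sin(theta)],
              [0, 1, 0],
              [-np.sin(theta), 0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.0])

x1 = X[:, :2] / X[:, 2:]            # camera 1 at the origin
Xc = X @ R.T + t                    # points in camera 2's frame
x2 = Xc[:, :2] / Xc[:, 2:]

F = eight_point(x1, x2)
h1 = np.column_stack([x1, np.ones(len(x1))])
h2 = np.column_stack([x2, np.ones(len(x2))])
# Epipolar residuals x2^T F x1 should be near zero for true matches.
residuals = np.abs(np.sum(h2 * (h1 @ F.T), axis=1))
print("max epipolar residual:", residuals.max())
```

With noiseless synthetic matches the residuals are essentially zero; with real photos, the same constraint is what lets you reject bad matches and then triangulate the good ones into a point cloud like the one in the Sphinx demo.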