Friday, June 13, 2008
The long anticipated trailer for "Gail's Masters Thesis" has finally arrived. The final version should be released sometime next summer, with many updates on the production's progress before then.
That's right, after many "maybe it will be about..." posts, I finally know exactly what it's going to be!
It's definitely related to my mapping idea, but not in an entirely direct way. Instead of developing a full, working system (for now), I will be researching one particular way to deal with the relative inaccuracy of GPS coordinates available on consumer devices. If a device's location were known exactly, it would be easy to pull up the geospatial data centered around that point and augment it onto the camera image using projection theory. Alas, even a meter or two of error would make for an unfortunate-looking augmentation, with roads and tags not lining up the way they should.
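To see how badly even a small GPS error hurts, here's a toy pinhole-camera projection in NumPy. The numbers (focal length, image size, distances) are all hypothetical, just to illustrate the scale of the misalignment:

```python
import numpy as np

# Intrinsics for a hypothetical phone camera: ~800 px focal length,
# principal point at the center of a 640x480 image.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

def project(point_world, camera_center):
    """Project a world point into the image of a camera at camera_center
    (identity rotation, for simplicity)."""
    p = point_world - camera_center   # world -> camera coordinates
    uvw = K @ p                       # apply intrinsics
    return uvw[:2] / uvw[2]           # perspective divide

# A road marking 10 m in front of the camera's true position.
landmark = np.array([0.0, 0.0, 10.0])
true_center = np.array([0.0, 0.0, 0.0])
gps_center = np.array([2.0, 0.0, 0.0])  # GPS reading is off by just 2 m sideways

print(project(landmark, true_center))   # [320. 240.] -- the image center
print(project(landmark, gps_center))    # [160. 240.] -- shifted 160 px left
```

A two-meter error moves the overlay a quarter of the image width off target, which is exactly the "roads not lining up" problem.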
My first thought was to actually do some image analysis to fine-tune the GPS coordinates. Say you know the orientation and tilt of a device. You also know where a building or road is supposed to fall on the image, based on the assumed camera center. Perhaps you can actually find the nearby roads and buildings in the image, and use that knowledge to correct the disparity between the GPS reading and the actual location. This seems to have potential, but it might be somewhat difficult to get the real camera center, given that there's no way to know at what height the device is being held.
For now, I will be looking into another method. It relies on having 360 degree panoramic images, such as those found in Google's Street View, available for many locations. This may have seemed crazy even just a few years ago, but look how fast Google has managed to capture such data!
The main idea is to first use the rough GPS location to narrow down the search space. We would then look through panorama images in the general vicinity until a match is found between part of a panorama and the image taken by the device's camera. Since the camera position of the panorama is known, a transformation between it and the device's camera can be computed from the two images. The resulting device camera position can then be used to augment the device camera's actual image.
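The "compute a transformation from the two images" step can be sketched in miniature. Assuming we already have matched point pairs between the panorama and the device image, here's a homography estimated by the standard DLT method on synthetic matches. A real system would estimate full relative pose and use something like RANSAC to reject bad matches, so treat this purely as an illustration of the idea:

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate H such that dst ~ H @ src (homogeneous), via the
    direct linear transform: stack two equations per correspondence
    and take the null vector of the resulting system."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1,  0,  0,  0, u * x, u * y, u])
        A.append([ 0,  0,  0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # fix the overall scale

# Synthetic matches: four corners of a facade, scaled and shifted
# between the panorama view and the device view.
src = [(0, 0), (100, 0), (100, 100), (0, 100)]
H_true = np.array([[1.2, 0.0, 30.0],
                   [0.0, 1.2, 10.0],
                   [0.0, 0.0,  1.0]])
dst = []
for x, y in src:
    u, v, w = H_true @ np.array([x, y, 1.0])
    dst.append((u / w, v / w))

H = estimate_homography(src, dst)
print(np.round(H, 3))   # recovers H_true up to numerical noise
```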
The matching between real images and the panoramas is the main tricky part here. What is the best way to represent the panoramic image data? As a cube, like in the NAVIRE project? Or perhaps a cylinder, to avoid the seams between cube faces? And what is the best way to represent feature points? SIFT or SURF?
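Whichever descriptor wins out, the matching itself usually comes down to nearest-neighbour search with Lowe's ratio test. Here's a sketch using random 128-dimensional vectors as stand-ins for real SIFT descriptors (the sizes, noise level, and ratio threshold are all just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """For each descriptor in desc_a, find its nearest neighbour in desc_b
    and keep the match only if it is clearly better than the second-best."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, j1))
    return matches

# Panorama descriptors, and device-image descriptors that are noisy
# copies of the first five (the rest of the panorama is unrelated clutter).
pano = rng.normal(size=(20, 128))
device = pano[:5] + 0.05 * rng.normal(size=(5, 128))

matches = ratio_test_matches(device, pano)
print(matches)   # [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
```

The ratio test is what keeps ambiguous features (which match several panorama patches about equally well) from producing false correspondences.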
So many questions.
If you don't have much of a clue what I'm talking about, don't worry! I'm sure it will become clearer as time goes on. Just stay tuned to this blog, and I'll keep you as up to date as I can.