Saturday, February 16, 2008
If you haven't checked it out before, now is the time to marvel at how cool Google's Street View maps are. Somebody from Google takes lots of video images as he drives down most of the streets in a city like New York. The images are then augmented with lines and text to indicate what various roads are called and what compass direction they go. Finally, they become available in the neat little interactive interface seen below.
Now, Google isn't the only one to come up with this type of system. I mentioned in a post last month that the University of Ottawa worked on a similar system they called NAVIRE (Virtual Navigation in Image-Based Representations of Real World Environments). In the Google version, each area between the arrows you click to proceed is static. But with NAVIRE, you can look around this area with 360 degrees of rotational freedom before clicking on to the next area. (Edit: It turns out you can do this in Google too!)
Both of these systems use images that are captured, processed, and stored in a database so that users can access them later on. I imagine a system that can take any video and augment it in real time. Then a user could walk around with his GPS-equipped PDA or smart phone, aim the built-in camera at an intersection he's unsure of, and immediately get his bearings based on the information displayed over the roads he's looking at. While he's at it, he might also like to see interesting buildings around him labeled.
Why would anybody want such a thing when GPS systems can already pinpoint your location on a map and orient the map the same way you are standing? Think of it this way: GPS systems usually have maps that are similar to those found in map books; that is, they are completely rendered from a bird's eye point of view. Some systems, particularly those for driving, are able to show an artificial oblique view that is rendered from a fixed viewpoint. My hypothesis is that the divide these interfaces create between the digital and real worlds makes them less natural than an augmented reality interface would be.
To look at it more deeply, consider what happens subconsciously when looking at a bird's eye view map. You must first ensure that you are oriented in some way that allows you to associate the lines representing the roads with the real-world counterparts. Then you essentially re-project the flat information from the map onto the 3D world, taking into account the translation between the distances on the map and those between the real roads. You also have to remember the names of the roads that you are transferring into the real world. If there are several you need to locate, you may need to look back and forth between the real world and the map until you get it all straight.
The augmented version would take care of all this for you. You would just aim your camera where you need information, and the information would be displayed for you right on top of the real world. No going back and forth, no remembering, no confusion. Sounds pretty good to me!
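To make the idea a little more concrete, here is a minimal sketch of the simplest piece of such a system: deciding where along the width of the camera image a label should go. It assumes the device can report its GPS position and a compass heading, and that the camera behaves like an ideal pinhole camera with a known horizontal field of view. The function names, the 60 degree field of view, and the 640 pixel image width are all my own placeholders, not taken from any real device or from the projects discussed in this post.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from true north (0 <= bearing < 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def label_x(user_lat, user_lon, heading_deg, target_lat, target_lon,
            fov_deg=60.0, image_width=640):
    """Horizontal pixel position for a label on the target, or None
    if the target lies outside the camera's horizontal field of view.

    heading_deg is the direction the camera points (hypothetical
    compass reading); fov_deg and image_width are assumed values.
    """
    bearing = bearing_deg(user_lat, user_lon, target_lat, target_lon)
    # Signed angle between camera direction and target, in [-180, 180).
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    if abs(offset) > fov_deg / 2.0:
        return None
    # Pinhole model: focal length in pixels from the field of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(fov_deg / 2.0))
    return image_width / 2.0 + focal_px * math.tan(math.radians(offset))
```

A target straight ahead lands in the middle of the image, one to the right of the heading lands right of center, and one behind the user returns None. A real system would need far more than this, since GPS and compass readings are noisy and the roads themselves would have to be found in the video, which is where the computer vision comes in.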
There has been some work related to this idea done in the past. For example, a Master's student from the University of Bremen focussed his thesis on a navigation system for pedestrians. He theorized that the visual cues given by a mixed reality system would make following a walking route much easier. However, instead of augmenting actual video in real time, his system used a database of prepared augmented photos. Arrows indicated where the user was to proceed next, and the photo changed with the GPS location.
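I don't know how the thesis actually chose which photo to show, but one simple way to change the photo with the GPS location would be to pick the prepared photo taken closest to the user's current position. The sketch below is only my guess at such a lookup, not the Bremen system's method. The photo records and their "lat", "lon", and "file" keys are a hypothetical layout I made up for illustration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_photo(photos, lat, lon):
    """Return the photo record whose capture position is closest
    to (lat, lon). Each record is a dict with 'lat' and 'lon' keys
    (a hypothetical layout, for illustration only)."""
    return min(photos,
               key=lambda p: haversine_m(lat, lon, p["lat"], p["lon"]))


# Two made-up photo records roughly 100 m apart.
photos = [
    {"file": "corner_a.jpg", "lat": 53.1060, "lon": 8.8520},
    {"file": "corner_b.jpg", "lat": 53.1069, "lon": 8.8520},
]
current = nearest_photo(photos, 53.1061, 8.8520)
```

Scanning every photo like this is fine for a single walking route, though a city-sized database would want a spatial index instead.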
Meanwhile, the folks at Nokia are researching something closer to what I described above. Their MARA (Mobile Augmented Reality Applications) project has created a prototype application that displays information about "virtual objects" such as particular buildings and even people. You can see some demonstration videos on the project page.
The Nokia research is quite similar to what I envision. However, they don't augment roads, and I find the tagging a bit jarring visually, so there are improvements to be made. Aside from this project, I have not seen anything closer to what I want to do.
I have decided to focus the research for my Master's, and possibly a PhD if I continue, on this street view mobile augmented reality mapping application. It is a topic that contains some interesting computer vision and graphics research, and can easily be branched out to include topics like context-aware computing, usability, distributed computing, and so on. Although the first step may be research that would not require the actual development of the whole system, I do hope to eventually develop a complete prototype. Watch this space for more to come as I head down what I expect will be an exciting and interesting path.