Monday, January 14, 2008
From what I understand, the seminar given to the School of Computer Science last week by Gerhard Roth (entitled Past, Current and Future Research in Computational Video with Applications to Gaming) was part of a competition for a faculty position required for the new game development program. I believe the idea was to have candidates share the kind of research they do and how it would be useful to game development. Dr. Roth was the first to present and he discussed his involvement in some really interesting computer vision research over the past decade or so.
Dr. Roth was part of the computational video group at the NRC (National Research Council Canada) before its funding was cut, and he teaches a computer vision course to fourth year computer science students here at Carleton (which I took during my undergrad).
One of the research projects Dr. Roth was involved with is the NAVIRE project. Very similar in nature to Google Street View, this project was a collaborative effort between the University of Ottawa and the NRC. The team used panoramic cameras to capture surroundings from the viewpoint of a person traveling along a road. The captured data was transformed into cubes, which could then be viewed with specialized software. Dr. Roth claims that NAVIRE solved several problems that Google's system did not, though it is not immediately clear to me what these are. Sifting through some of their publications might shed some light on this.
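To make the cube idea a bit more concrete, here's a rough sketch of how one face of a cube map can be sampled from an equirectangular panorama. This is my own toy version, not NAVIRE's actual code; the face layout, nearest-neighbour sampling, and function name are all assumptions on my part.

```python
import numpy as np

def cube_face(pano, face, size=256):
    """Sample one face of a cube map from an equirectangular panorama.

    pano: H x W x 3 image array; face: one of 'front', 'back',
    'left', 'right', 'up', 'down'. A minimal nearest-neighbour
    sketch with no interpolation or seam handling.
    """
    h, w = pano.shape[:2]
    # Pixel grid spanning [-1, 1] across the face.
    u, v = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    ones = np.ones_like(u)
    # 3D direction vector for every pixel on each face of the unit cube.
    dirs = {
        'front': (u, -v, ones),
        'back':  (-u, -v, -ones),
        'right': (ones, -v, -u),
        'left':  (-ones, -v, u),
        'up':    (u, ones, v),
        'down':  (u, -ones, -v),
    }
    x, y, z = dirs[face]
    # Direction -> longitude/latitude -> panorama pixel coordinates.
    lon = np.arctan2(x, z)                              # [-pi, pi]
    lat = np.arcsin(y / np.sqrt(x*x + y*y + z*z))       # [-pi/2, pi/2]
    px = ((lon / np.pi + 1) / 2 * (w - 1)).astype(int)
    py = ((0.5 - lat / np.pi) * (h - 1)).astype(int)
    return pano[py.clip(0, h - 1), px.clip(0, w - 1)]
```

Once all six faces are generated this way, a viewer just textures them onto a cube around the camera, which is presumably what NAVIRE's specialized software does.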
Another interesting project described in Dr. Roth's presentation related to a technology that should also interest Google: image retrieval and classification. For the former, the user selects a subimage, probably a particular object to search for, and the system retrieves all images containing a region strongly similar to that query. To do this, varying lighting conditions have to be accounted for, as well as differences in viewpoint and scale; occlusions must also be considered. For the latter, images are assigned to a predefined set of semantic categories, such as car, dog, umbrella, and so on. After these categories are trained using a control set of images, the system can give the probability that a new image belongs to a particular category. Test results are decent with only 100 training images per category.
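As a toy illustration of the classification side, here's a nearest-centroid classifier over colour histograms. This is emphatically not the method from the presentation (which I'd expect uses far richer features); the histogram representation and the distance-to-probability conversion are stand-ins of my own.

```python
import numpy as np

def histogram(img, bins=8):
    """Flattened per-channel colour histogram, normalised to sum to 1."""
    h = np.concatenate([
        np.histogram(img[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]).astype(float)
    return h / h.sum()

def train(categories):
    """categories: dict mapping label -> list of H x W x 3 images.
    Returns one mean histogram (a centroid) per label."""
    return {label: np.mean([histogram(im) for im in imgs], axis=0)
            for label, imgs in categories.items()}

def classify(model, img):
    """Turn distances to each centroid into per-label probabilities."""
    h = histogram(img)
    scores = {label: np.exp(-np.linalg.norm(h - c))
              for label, c in model.items()}
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}
```

Even a crude scheme like this shows the overall shape of the pipeline: train once per category, then score any new image against every category at once.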
There were more projects mentioned in the presentation, but I'd like to move to those that interest me the most. I've discussed the concept of tangible user interfaces more than once (here and here), so by now it should be obvious how excited the topic makes me. Watching Dr. Roth talk about his experiences with them (under the mostly equivalent name of perceptual user interfaces) reminded me of this, and even gave me a whole new idea for a thesis! (I'll save a post on this new idea for another day, maybe after I talk to my supervisor about it).
In particular, Dr. Roth has been heavily involved with something called augmented reality. This should not be confused with virtual reality: the former takes a live video feed of the real world and adds interesting computed imagery to it, while virtual reality tends not to combine real video with the virtual. The image below shows a good example of how augmented reality could be used.
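At its simplest, the "adds computed imagery to it" step is just an alpha composite of a rendered layer over the camera frame. The real work in AR is tracking the camera pose so the virtual layer lines up with the scene, but the final blend can be sketched like this (my own minimal version, not anything from the talk):

```python
import numpy as np

def overlay(frame, virtual, alpha_mask):
    """Composite a rendered virtual layer onto a live video frame.

    frame, virtual: H x W x 3 uint8 images; alpha_mask: H x W floats
    in [0, 1], 1 where the virtual object is opaque and 0 where the
    camera image should show through.
    """
    a = alpha_mask[..., None]
    out = frame.astype(float) * (1 - a) + virtual.astype(float) * a
    return out.astype(np.uint8)
```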
It's not hard to imagine that a sizable investment would be necessary for this system, given that each user requires a head-mounted display unit. The alternative is to use consumer devices with front-mounted cameras, like cell phones and tablet PCs, as seen in this image.
It may not be as immersive as a system using headsets might be, but the potential for gaming is much more realistic with devices that everybody already owns. And that's where the seminar concluded: Dr. Roth believes that with an investment in a few inexpensive tablet computers upgraded with inertial sensors, some really interesting collaborative games could be developed. I'd be up for that!