Friday, August 12, 2011

Giving Computers a Human-Scale Understanding of Space

Computer vision and Human-Computer Interaction are just about to hit their stride. Within the past four years, the real-time/robotics computer vision research community has advanced by leaps and bounds - much of it out of the Active Vision Group at Oxford and the Robot Vision Group at Imperial College London. One of the first pieces of work that really started to impress was PTAM (Parallel Tracking and Mapping) by Georg Klein:



Fully markerless augmented reality had long been a dream of many. But few people actually knew how to do it. PTAM was the first system to show real promise of handling the rough conditions of real-time motion from a handheld camera.

Also from Oxford, Gabe Sibley and Christopher Mei started demonstrating RSLAM (Relative Simultaneous Localization and Mapping), which provides fairly robust real-time tracking over large spaces. The following video uses a head-mounted stereo camera rig:



Just in the past couple of weeks, some new projects done with the help of Richard Newcombe show what happens when you combine this tracking ability with either a depth camera like the Kinect, or dense reconstruction from a standard RGB camera. These projects are called KinectFusion (a Microsoft Research Cambridge project) and DTAM (Dense Tracking and Mapping), respectively.
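To give a rough flavor of what the "fusion" in KinectFusion means: each incoming depth frame is folded into a single volumetric model using the tracked camera pose, so sensor noise averages out over time. The toy sketch below is only a 1D illustration of that running-average idea with made-up numbers - it is not code from, or the actual algorithm of, any of the systems shown here.

import numpy as np

# Toy 1D illustration of depth fusion (an assumption/simplification, not KinectFusion's code).
# Each depth reading contributes a truncated signed distance to voxels along one
# camera ray; voxels keep a weighted running average, so noise cancels out.

voxel_z = np.linspace(0.0, 2.0, 21)   # voxel centers along the ray, in meters
tsdf = np.zeros_like(voxel_z)         # fused signed distance per voxel
weight = np.zeros_like(voxel_z)       # how many readings each voxel has absorbed
trunc = 0.1                           # truncation distance, in meters

def integrate(depth):
    """Fold one noisy depth reading into the volume."""
    global tsdf, weight
    sd = np.clip(depth - voxel_z, -trunc, trunc)   # signed distance to the observed surface
    tsdf = (tsdf * weight + sd) / (weight + 1.0)
    weight = weight + 1.0

for depth in [1.02, 0.97, 1.01]:      # three noisy observations of a surface ~1 m away
    integrate(depth)

# The fused surface sits where the signed distance crosses zero (about 1.0 m here).
print(voxel_z[np.argmin(np.abs(tsdf))])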



The following video uses a normal RGB camera (not a Kinect camera):


It's important to remember that no external tracking system is used - only the information coming from the camera itself. It's also worth pointing out that the full 6DOF pose of the camera is recovered precisely. So what you can do with this data reaches well beyond AR games. It gives computers a human-scale understanding of space.
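To make "6DOF" concrete: the recovered pose is three degrees of rotation plus three of translation, and once you have it you can place anything the camera sees into world coordinates. The snippet below is only a minimal illustration of that idea with made-up numbers; it is not taken from any of the systems above.

import numpy as np

# A 6DOF camera pose = 3 rotation DOF + 3 translation DOF.
# Values are illustrative only.

yaw = np.deg2rad(30.0)                          # rotation about the vertical (z) axis
R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
              [np.sin(yaw),  np.cos(yaw), 0.0],
              [0.0,          0.0,         1.0]])  # 3x3 rotation matrix
t = np.array([0.5, 0.0, 1.2])                   # camera position in the world, in meters

# With the pose known, a 3D point measured in the camera frame can be placed
# in the world frame - this is what lets a map of the room be built up over time.
point_in_camera = np.array([0.1, -0.2, 2.0])
point_in_world = R @ point_in_camera + t
print(point_in_world)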

This is pretty exciting stuff. It'll take a little while before these algorithms become robust enough to graduate from lab demo to major commercial product. I usually like to say that "people will beat the crap out of whatever you make, and quickly gravitate to the failure cases". But as this work evolves and people begin to build useful applications/software on top of it, it'll be an exciting next few years.

4 comments:

Вовка Соловьёв said...

Awesome! I believe the days when I'll be able to shoot virtual zombies in the streets of my town are very near.

zesing said...

Are any of these algorithms available to download and try?

luis said...

Amazing. Simply amazing.
Do you know if there is code out there ready to be played with, or would it require going through tons of academic papers?

zesing said...

Hi,
Just found that the source code for PTAM can be downloaded at http://www.robots.ox.ac.uk/~gk/PTAM/ (haven't tried it yet, though).