Tuesday, November 9, 2010

A depth detector and a camera might let a computer see better than human eyes

Since the Kinect for Xbox 360 was released recently, I came across a discussion on a geek computer website about how it works. It has a built-in depth detector, and it produces a special kind of image that carries depth data from that detector.
This gave me the idea that a computer might see the world better by combining such a depth detector with a digital camera.

We human beings, and most of the large animals on this planet, use two eyes to see the world. The two eyes endlessly send images of the same scene, from two slightly different angles, to the brain, where they are translated into 3D information about objects; that is how we know how far away an object is, besides its colour and shape. This works very well for us, because our brains have become very good at it through millions of years of evolution and practice, but this seemingly effortless act requires an enormous amount of computing power if we try to replicate it on a computer.
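To get a feel for why, here is a minimal sketch (in Python) of the brute-force approach: a naive block-matching search for the disparity between the two views. Everything in it, the function name, the block size, the disparity range, is made up for illustration, not taken from any real library; the point is the three nested loops that make this kind of matching so expensive.

```python
import numpy as np

def disparity_map(left, right, block=5, max_disp=32):
    """Brute-force stereo matching: for each pixel of the left image,
    slide a small block along the same row of the right image and keep
    the horizontal shift (disparity) with the smallest difference."""
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = np.inf, 0
            for d in range(min(max_disp, x - half) + 1):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1]
                cost = np.sum(np.abs(patch.astype(np.int32) - cand))
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Depth then follows from triangulation:
#   depth = focal_length * baseline / disparity   (for disparity > 0)
# so a bigger shift between the two views means a closer object.
```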

I believe we human beings and animals developed this two-eye vision for a reason. We usually don't have the ability to probe our surroundings actively; most of the time we just receive information like light, sound and smell passively. So if we want depth information, we need views from different angles, so we can find the slight differences between the images: the more obvious the difference, the closer the object. In this way we can tell which object is near and which is far. A contrasting example is the dolphin, which lives in the sea, where eyesight is of limited use, especially in deep water. Dolphins have a sonar system: when they cannot see clearly, they send sound signals into the water and judge whether something is ahead of them, and how far away it is, from the returning echoes. Yes, this is a depth detector!
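Just to make the dolphin's trick concrete, here is a tiny sketch of active ranging: measure the round-trip time of the signal, and the distance is half of speed times time. The numbers below are made up for illustration.

```python
SPEED_OF_SOUND_IN_WATER = 1500.0  # metres per second, approximate

def echo_distance(round_trip_seconds, speed=SPEED_OF_SOUND_IN_WATER):
    """The signal travels out and back, so the one-way distance
    to the reflecting object is half the round trip."""
    return speed * round_trip_seconds / 2.0

print(echo_distance(0.04))  # a 40 ms echo means roughly 30 metres away
```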
So when we want computers to understand the 3D world, we don't need to rely on heavy-duty computing to compare every pair of frames in the enormous stream of images coming from two cameras at different angles. That would be a disaster, and who knows how many years it would take to produce chips powerful enough for the job. Instead, we just need programs that analyze the shapes and colours of objects in the images from a single camera, plus the depth information from the depth detector. It's very easy for a computer to send out an infrared signal to detect what's ahead of it, right?
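Here is a hedged sketch of what that combination buys us. Assume we already have a colour image and an aligned depth map as NumPy arrays of the same height and width (the names and the margin value below are hypothetical): picking out the nearest object becomes a single threshold on the depth values, instead of a frame-by-frame matching problem.

```python
import numpy as np

def nearest_object_mask(depth, margin=0.3):
    """Mark every pixel within `margin` metres of the closest point.
    A depth reading of 0 is treated as "no measurement"."""
    valid = depth > 0
    nearest = depth[valid].min()
    return valid & (depth <= nearest + margin)

def cut_out(colour, depth, margin=0.3):
    """Black out everything except the nearest object in the scene."""
    mask = nearest_object_mask(depth, margin)
    return np.where(mask[..., None], colour, 0)  # broadcast over channels
```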
Of course, this is just an assumption. We might do things differently on computers and robots, because maybe there are better solutions for them that don't suit us. But I still believe Mother Nature does not evolve things for no reason. Very likely, at the end of the day, we'll discover that the two-eye solution is still the best choice under all circumstances. Anyway, if you send out a detecting signal, you expose yourself to your enemy! HAHA
