Stereo cameras help self-driving cars ‘see’
Laser sensors currently used to detect 3-D objects in an autonomous car’s path are highly accurate. They’re also bulky, expensive and energy-inefficient. Light detection and ranging (LiDAR) sensors are affixed to cars’ roofs, but they increase wind drag, which is a disadvantage for electric cars.
Cornell researchers have discovered a simpler method. They use two inexpensive cameras on either side of the windshield to detect objects with accuracy comparable to LiDAR’s and at a much lower cost. The researchers found that analyzing the captured images from a bird’s-eye view, rather than the more traditional frontal view, more than tripled the accuracy, making stereo cameras a viable and low-cost alternative to LiDAR.
“One of the essential problems in self-driving cars is to identify objects around them. Obviously that’s crucial for a car to navigate its environment,” said Kilian Weinberger, associate professor of computer science, in a press release. He is also the author of the paper “Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving.”
“The common belief is that you couldn’t make self-driving cars without LiDARs,” Weinberger said. “We’ve shown, at least in principle, that it’s possible.”
LiDAR sensors use lasers to create 3-D point maps of their surroundings, measuring objects’ distance by timing how long a laser pulse, traveling at the speed of light, takes to return. Stereo cameras, which rely on two perspectives to establish depth, as human eyes do, seemed promising. However, their accuracy in object detection has been low, and the conventional wisdom was that they were too imprecise.
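The two ranging principles described above can be written down in a few lines. What follows is a minimal sketch, not code from the paper: it assumes an idealized time-of-flight measurement for LiDAR and a rectified pinhole stereo pair for the cameras, and the function names and sample numbers are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def lidar_range_m(round_trip_time_s: float) -> float:
    """Distance to an object from the round-trip time of a laser pulse.

    The pulse travels to the object and back, so the one-way distance
    is half of (speed of light x elapsed time).
    """
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2.0


def stereo_depth_m(disparity_px: float, focal_length_px: float, baseline_m: float) -> float:
    """Depth of a point seen by two rectified cameras.

    disparity_px is the horizontal shift of the point between the left
    and right images; baseline_m is the distance between the cameras.
    Assumed model: depth = focal_length * baseline / disparity.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    print(lidar_range_m(2e-7))              # pulse returns after 200 nanoseconds
    print(stereo_depth_m(10.0, 700.0, 0.5))  # 10-pixel shift, cameras 0.5 m apart
```

Because depth is inversely proportional to disparity in this model, a fixed one-pixel matching error costs little for nearby objects and much more for distant ones, which is one reason stereo ranging was long regarded as imprecise.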
The researchers then took a closer look at the data from stereo cameras and found that it was nearly as precise as LiDAR data. The gap in accuracy, they discovered, emerged when the stereo cameras’ data was being analyzed.
The data captured by cameras or sensors for self-driving cars is analyzed using convolutional neural networks, which identify objects in images by applying filters that recognize patterns associated with them. These networks are good at identifying objects in standard color photographs, but they can distort the 3-D information if it’s represented from the front. When Yan Wang, the paper’s first author, and colleagues switched the representation from a frontal perspective to a point cloud observed from a bird’s-eye view, the accuracy more than tripled.
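The change of representation can be illustrated with a short sketch. This is not the researchers’ implementation: it assumes a simple pinhole camera model, a per-pixel depth map already estimated from the stereo pair, and hypothetical function names, grid ranges and cell sizes.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x right, y down, z forward), in meters


def depth_map_to_points(depth: Sequence[Sequence[float]],
                        focal_px: float, cx: float, cy: float) -> List[Point]:
    """Back-project every pixel with a valid depth into a 3-D point."""
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels with no usable depth estimate
                continue
            x = (u - cx) * z / focal_px
            y = (v - cy) * z / focal_px
            points.append((x, y, z))
    return points


def birds_eye_grid(points: Sequence[Point],
                   x_range: Tuple[float, float],
                   z_range: Tuple[float, float],
                   cell_m: float) -> List[List[int]]:
    """Count points per ground-plane cell, ignoring height (a top-down view)."""
    n_cols = int((x_range[1] - x_range[0]) / cell_m)
    n_rows = int((z_range[1] - z_range[0]) / cell_m)
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, _y, z in points:
        col = int((x - x_range[0]) // cell_m)
        row = int((z - z_range[0]) // cell_m)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] += 1
    return grid


if __name__ == "__main__":
    # A tiny, made-up 2x2 depth map (meters); 0.0 marks a missing estimate.
    depth = [[10.0, 0.0],
             [10.0, 20.0]]
    cloud = depth_map_to_points(depth, focal_px=2.0, cx=0.5, cy=0.5)
    grid = birds_eye_grid(cloud, x_range=(-10.0, 10.0), z_range=(0.0, 40.0), cell_m=5.0)
    for row in grid:
        print(row)
```

In the top-down grid, each point is placed by its distance and sideways position rather than by where it falls in the image, so objects at different distances occupy different cells instead of overlapping as neighboring pixels in a frontal view.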
“When you have camera images, it’s so, so, so tempting to look at the frontal view, because that’s what the camera sees,” Weinberger said. “But there also lies the problem, because if you see objects from the front then the way they’re processed actually deforms them, and you blur objects into the background and deform their shapes.”
Weinberger said stereo cameras could potentially be used as the primary way of identifying objects in lower-cost cars, or as a backup method in higher-end cars that are also equipped with LiDAR.
“The self-driving car industry has been reluctant to move away from LiDAR, even with the high costs, given its excellent range accuracy – which is essential for safety around the car,” said Mark Campbell, a co-author of the paper, in a press release. “The dramatic improvement of range detection and accuracy, with the bird’s-eye representation of camera data, has the potential to revolutionize the industry.”
The results have implications beyond self-driving cars, said co-author Bharath Hariharan, assistant professor of computer science, in a press release.
“There is a tendency in current practice to feed the data as-is to complex machine learning algorithms under the assumption that these algorithms can always extract the relevant information,” Hariharan said. “Our results suggest that this is not necessarily true, and that we should give some thought to how the data is represented.”
Chris Vavra, production editor, Control Engineering, CFE Media.
See a video with Kilian Weinberger, associate professor of computer science at Cornell University, discussing the group’s research.