Unlabeled data aids self-driving cars
Researchers at Carnegie Mellon University (CMU) have found a way to unlock a mountain of autonomous driving data to improve self-driving cars.
A self-driving car must accurately track the movement of pedestrians, bicycles and other vehicles around it.
With safety in mind, a new method now in development may make training those tracking systems more effective. Generally speaking, the more road and traffic data available for training tracking systems, the better the results. Researchers at Carnegie Mellon University (CMU) have found a way to unlock a mountain of autonomous driving data for this purpose.
“Our method is much more robust than previous methods because we can train on much larger datasets,” said Himangi Mittal, a research intern working with David Held, assistant professor in CMU’s Robotics Institute.
In the past, state-of-the-art methods for training such a system required the use of labeled datasets: sensor data annotated to track each 3D point over time. Manually labeling these datasets is laborious and expensive, so, not surprisingly, little labeled data exists. As a result, scene flow training, in which the system learns to estimate the motion of each 3D point in a scene, is often performed with simulated data, which is less effective, and then fine-tuned with the small amount of labeled real-world data that exists.
Mittal, Held and robotics Ph.D. student Brian Okorn took a different approach, using unlabeled data to perform scene flow training. Because unlabeled data is relatively easy to generate by mounting a lidar on a car and driving around, there’s no shortage of it.
The key to their approach was to develop a way for the system to detect its own errors in scene flow. At each instant, the system tries to predict where each 3D point is going and how fast it’s moving. In the next instant, it measures the distance between the point’s predicted location and the actual location of the point nearest that predicted location. This distance forms one type of error to be minimized.
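The first type of error can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' implementation: the function name `nearest_neighbor_error`, the tuple-based point format, the toy scene and the choice to average the distances are all assumptions made for this example.

```python
import math


def nearest_neighbor_error(points_t, predicted_flow, points_t1):
    """Mean distance from each predicted location to the closest actual point.

    points_t:       list of (x, y, z) points at the current instant
    predicted_flow: list of (dx, dy, dz) predicted motions, one per point
    points_t1:      list of (x, y, z) points observed at the next instant
    """
    total = 0.0
    for point, flow in zip(points_t, predicted_flow):
        # Where the system predicts this point will be at the next instant.
        predicted = tuple(p + f for p, f in zip(point, flow))
        # Distance to the actual point nearest that predicted location.
        total += min(math.dist(predicted, actual) for actual in points_t1)
    return total / len(points_t)


# Hypothetical toy scene: two points that each move 1 unit along x.
points_t = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
points_t1 = [(1.0, 0.0, 0.0), (6.0, 0.0, 0.0)]

good_flow = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # matches the real motion
zero_flow = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]  # predicts no motion

print(nearest_neighbor_error(points_t, good_flow, points_t1))  # 0.0
print(nearest_neighbor_error(points_t, zero_flow, points_t1))  # 1.0
```

In this toy case, the prediction that matches the real motion scores zero, while a prediction of no motion is penalized. No labels are involved: the only inputs are the two point clouds and the system's own guess.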
The system then reverses the process, starting with the predicted point location and working backward to map back to where the point originated. It then measures the distance between this back-mapped position and the actual origination point, and that distance forms the second type of error.
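The second type of error can be sketched the same way. Again, this is a simplified illustration under stated assumptions rather than the researchers' code: the name `cycle_error`, the idea of passing the backward motion in as a ready-made list, and the toy values are all hypothetical. In a real system, the backward motion would come from running the trained model in the reverse direction.

```python
import math


def cycle_error(points_t, forward_flow, backward_flow):
    """Mean distance between each original point and its round-trip position.

    points_t:      list of (x, y, z) points at the current instant
    forward_flow:  list of (dx, dy, dz) predicted forward motions
    backward_flow: list of (dx, dy, dz) predicted motions back in time,
                   one per forward-predicted point
    """
    total = 0.0
    for point, fwd, bwd in zip(points_t, forward_flow, backward_flow):
        # Step forward to the predicted location.
        predicted = tuple(p + f for p, f in zip(point, fwd))
        # Step backward from the predicted location.
        returned = tuple(q + b for q, b in zip(predicted, bwd))
        # A consistent system should land back where the point started.
        total += math.dist(returned, point)
    return total / len(points_t)


# Hypothetical toy values: two points, each predicted to move 1 unit along x.
points_t = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
forward_flow = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]

consistent_back = [(-1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
inconsistent_back = [(-1.0, 0.0, 0.0), (-1.0, 2.0, 0.0)]

print(cycle_error(points_t, forward_flow, consistent_back))    # 0.0
print(cycle_error(points_t, forward_flow, inconsistent_back))  # 1.0
```

Note that in this sketch a system predicting no motion in either direction would also score a cycle error of zero, which suggests why this error is not relied on alone but is paired with the distance-to-nearest-point error.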
The system then works to correct those errors.
“It turns out that to eliminate both of those errors, the system actually needs to learn to do the right thing, without ever being told what the right thing is,” Held said.
As convoluted as that might sound, Okorn found it worked well.
The researchers calculated that scene flow accuracy was only 25% when the system was trained on synthetic data alone. When the system was then fine-tuned with a small amount of real-world labeled data, accuracy increased to 31%. When they added a large amount of unlabeled data to train the system using their approach, scene flow accuracy jumped to 46%.