Deep learning leading to second machine vision revolution
Insiders say deep learning is bringing about a second machine vision revolution, enabling designers to meet part specifications—and therefore develop successful machine vision solutions—that were not feasible before.
"We’ve been seriously investing in this technology the last five years, but it’s only in the last two to three years that it’s been viable," said Andy Long, CEO of automation integrator Cyth Systems. "But the acceleration in the demand for deep learning during the last 18 months is staggering."
Unlike many significant machine vision technologies, such as smart cameras and 3-D sensors, whose adoption was driven by engineers interested in the technology, deep learning is often championed from the C-level suite, with interest driven as much by capability as by technology. "Executives are saying, we need to be invested in this technology and see what it can do," Long said.
A new way to do machine vision
Deep learning machine vision software essentially allows machines to learn from data representations—in this case, images that have already been tagged by human inspectors—rather than from task-specific algorithms. Using a software-based neural network, deep learning programs learn much like children do—eventually learning to recognize "good" from "bad" after seeing thousands of images that have been tagged as good or bad.
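The learn-from-tagged-examples idea can be sketched in miniature. The following is a hypothetical illustration only (a single-neuron classifier on synthetic feature vectors, not a real convolutional network or any vendor's product): each "image" is a flattened feature vector, human inspectors have supplied a good/bad tag, and the model learns the distinction purely from those tagged examples rather than from a hand-coded rule.

```python
import numpy as np

# Hypothetical illustration: each "image" is flattened to a feature vector,
# and inspectors have tagged each one good (0) or bad (1).
rng = np.random.default_rng(0)
n, d = 200, 64                             # 200 tagged images, 64 features each
good = rng.normal(0.0, 1.0, (n // 2, d))
bad = rng.normal(0.8, 1.0, (n // 2, d))    # defects shift the feature statistics
X = np.vstack([good, bad])
y = np.array([0] * (n // 2) + [1] * (n // 2))

# A single-neuron classifier trained by gradient descent -- the simplest
# possible stand-in for the neural networks the article describes.
w, b = np.zeros(d), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of "bad"
    w -= 0.1 * (X.T @ (p - y)) / n
    b -= 0.1 * float(np.mean(p - y))

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
accuracy = float(np.mean(preds == y))
print(f"training accuracy: {accuracy:.2f}")
```

Nothing in the loop encodes what a defect looks like; the decision boundary comes entirely from the tagged examples, which is the core contrast with task-specific algorithms.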
"This reminds me of where the machine vision market was 30 years ago," said John Petry, director of marketing, vision software, at Cognex Corporation. "Today, all of our customers are familiar with traditional machine vision in some capacity. They can pick up a machine vision tool, quickly learn how an alignment tool works, and solve an application. But with deep learning, we’re having technical discussions with the full automation team about where to use it, how to train the system, how to evaluate samples and defects, how fast a deep learning system can run, and if management can trust the results. These are the types of conversations we used to have 30 years ago."
Despite the amazing progress in deep learning, software providers like Cognex and MVTec Software GmbH are quick to point out that the technology isn’t suitable for every machine vision application. For example, MVTec’s initial deep learning algorithm, released in November 2016, focused on optical character recognition applications. The ability of deep learning algorithms to learn new fonts, account for skewed text and changes in 3-D perspective, and much more made OCR a primary target—so much so that both companies now offer pretrained OCR neural networks.
Training, testing, and deep learning
As stated earlier, deep learning in machine vision is based on software analyzing a "supervised" data set to learn what constitutes a good or bad part, grouping, or assembly. Traditional machine vision software analyzing two images—one of a scratch and another of a scribed line—has no way of knowing which image contains a defect and which contains a design. Deep learning software learns to differentiate between scratches and designs by reviewing thousands of images of each, together with the good/bad tags attached to those images. While machine vision integrators have accumulated huge image libraries, many of those images are the property of the customer and cannot be used to train a new neural network as part of a deep learning solution.
"There are publicly available data sets today through Caffe and TensorFlow and other open-source deep learning programs, but most are not available for commercial projects," said MVTec’s Hiltner. "As part of our offering, we’re providing pretrained networks that are optimized for a number of common industrial machine vision applications. By using our pretrained networks, customers can refine their application using a relatively small set of tagged images instead of tens or hundreds of thousands of images."
Cognex doesn’t offer large libraries of trained neural networks beyond the OCR tool. Instead, Petry explained, its software breaks down the process into smaller pieces, each requiring only 20 to 50 image sets. "This lets us run on commercial CPUs and GPUs, and you can train the system in five minutes versus hours. One of the biggest benefits to deep learning is that an engineer can determine if an application can be solved in minutes rather than spending weeks trying to solve a problem only to determine at the very end that it is impossible with today’s technology," Petry said.
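The pretrain-then-refine workflow both vendors describe can be sketched abstractly. This is a hypothetical sketch (not Cognex’s or MVTec’s actual software): a frozen feature extractor stands in for a network already trained on a large generic image set, and only a small classifier head is refined on a few dozen customer-tagged images, which is what makes minutes-long feasibility checks possible.

```python
import numpy as np

# Hypothetical sketch of pretrain-then-refine; all names are illustrative.
rng = np.random.default_rng(1)
d_in, d_feat = 256, 16

# Frozen "pretrained" weights -- never updated during refinement.
W_pre = rng.normal(size=(d_in, d_feat)) / np.sqrt(d_in)

def extract_features(images):
    """Fixed feature extractor: random projection + ReLU, standing in
    for the convolutional layers of a pretrained network."""
    return np.maximum(images @ W_pre, 0.0)

# Only ~40 customer images, tagged good (0) or bad (1).
n = 40
good = rng.normal(0.0, 1.0, (n // 2, d_in))
bad = rng.normal(1.0, 1.0, (n // 2, d_in))   # synthetic defect signature
X = np.vstack([good, bad])
y = np.array([0] * (n // 2) + [1] * (n // 2))

# Refine only the small head (logistic regression) on the fixed features --
# a fraction of the work of training a full network from scratch.
F = extract_features(X)
w, b = np.zeros(d_feat), 0.0
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
    w -= 0.2 * (F.T @ (p - y)) / n
    b -= 0.2 * float(np.mean(p - y))

preds = (1.0 / (1.0 + np.exp(-(F @ w + b))) > 0.5).astype(int)
accuracy = float(np.mean(preds == y))
print(f"small-sample refinement accuracy: {accuracy:.2f}")
```

Because only the small head is trained, an engineer can tell within minutes whether the tagged samples carry enough signal to be worth a full project, which is the feasibility benefit Petry describes.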
Experienced machine vision integrators are developing processes for helping customers evaluate deep learning and generate a viable data set for their applications. "When we don’t have enough control over the part or can’t set sufficient boundaries around the specification, that’s when we consider using deep learning," said Steve Wardell, director of imaging at ATS Automation.
To develop a data set that represents the production line without interfering too much with existing production, ATS suggested a hybrid approach to candidate applications. Instead of manual inspectors evaluating the actual products as they come off the line, ATS inserts a camera and monitor between the inspector and the product. The inspector looks at products and tags them appropriately. The tagged images can be fed into a deep learning program to check the efficacy of a proposed solution.
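The capture-and-tag loop in this hybrid approach amounts to archiving each inspected frame alongside the inspector’s verdict so the pair can later feed a training run. A minimal sketch, assuming a simple file-plus-CSV-manifest layout (the function name, directory layout, and binary good/bad tag are all illustrative, not ATS’s actual system):

```python
import csv
import time
from pathlib import Path

def record_inspection(image_bytes: bytes, tag: str, out_dir: Path) -> Path:
    """Save one captured frame and append its tag to a manifest CSV,
    building a supervised data set as inspection proceeds."""
    assert tag in {"good", "bad"}, "inspectors supply a binary verdict"
    out_dir.mkdir(parents=True, exist_ok=True)
    name = f"part_{time.time_ns()}.raw"          # unique per captured frame
    (out_dir / name).write_bytes(image_bytes)
    with open(out_dir / "labels.csv", "a", newline="") as f:
        csv.writer(f).writerow([name, tag])
    return out_dir / name

# Usage: simulate two inspected parts coming off the line.
dataset = Path("tagged_dataset")
record_inspection(b"\x00" * 16, "good", dataset)
record_inspection(b"\xff" * 16, "bad", dataset)
```

Because the camera and monitor sit between inspector and product, this archive accumulates as a by-product of normal inspection, with no extra labeling effort.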
"A lot of these projects are from our life sciences and pharmaceutical clients," Wardell said. "These industries have a lot of regulation and validation requirements. We feel this hybrid approach is one way to really allow for the level of process validation that these industries require. Even if the deep learning software isn’t successful, we’re able to provide the customer with production data that’s invaluable to them, leading to process improvement."
Cyth Systems uses its deep learning platform to capture images from production environments and to send those tagged data sets to the cloud for off-line processing. "We believe that the inspectors today are the people who should be training the next-generation machine vision system," Long said. "What we’re really talking about here is the democratization of machine vision. We designed Neural Vision so the user never needs to know about heterogeneous computational platforms. The only thing they need to know is: That’s my part. I need it to look this way, not that way, and divert it or not."
The goal, according to Long, is: "We’re removing the golden handcuffs that have constrained machine vision growth in the past. Right now, there are too many skills needed to program a machine vision system. We’re working so that you won’t need those skills at all. You don’t need to understand machine vision terminology. For me, technology is the driver, and the technology is accelerating faster than ever before. It’s a very exciting time."
Winn Hardin is contributing editor for AIA. This article originally appeared in Vision Online. AIA is a part of the Association for Advancing Automation (A3), a CFE Media content partner. Edited by Chris Vavra, production editor, CFE Media, firstname.lastname@example.org.