How should control engineers use machine vision?
Machine vision systems have passed the point where the additional information they provide is worth the additional cost compared to point sensors.
Dear Ask Control Engineering: How should control engineers use machine vision?
In the past, engineers have shied away from using machine vision for control purposes because the technology was too complicated and expensive to be justifiable. To take a useful picture, one had to get the lighting just right, use the right image-processing algorithms to extract the information required, and finally put that information to use. It was a whole lot more fiscally responsible to slap down, say, a proximity sensor and use the time and money saved for something else.
Machine vision was relegated to inspection and gauging applications, where nothing else would do.
Those days are on their way out, however. Vision system manufacturers like Cognex, National Instruments, and a few others have worked hard to reduce the cost and complexity of both the equipment and the software needed to make it work. While these offerings are still more complicated to use and more expensive to deploy than point sensors, they are either about to pass or have already passed the point where the additional information they provide is worth the additional cost.
It’s important to remember that all vision systems work by shrinking data sets. To start with, a 12-bit monochrome image from a 1.5-megapixel (Mpx) image sensor contains 18 megabits (Mb) of information. That’s way too much information for making control decisions. Getting the image is thus only the first step.
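The 18 Mb figure is simple arithmetic: pixel count times bits per pixel. Here is a minimal Python sketch, assuming that "1.5 Mpx" means exactly 1,500,000 pixels and that pixels are stored unpadded (a camera that pads each 12-bit pixel to 16 bits would produce a larger total). The function name is made up for illustration.

```python
def raw_image_bits(pixels: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed, unpadded monochrome image, in bits."""
    return pixels * bits_per_pixel


PIXELS = 1_500_000     # assumption: 1.5 Mpx taken as exactly 1.5 million pixels
BITS_PER_PIXEL = 12    # 12-bit monochrome, as in the text

bits = raw_image_bits(PIXELS, BITS_PER_PIXEL)
print(f"{bits / 1e6:.0f} Mb, or {bits / 8 / 1e6:.2f} MB")
```

Under those assumptions, every frame hands the processor 18 million bits, or 2.25 megabytes, to sift through.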
Machine vision systems provide control-related data by shrinking data sets from immense to quite modest.
The next step is image analysis, and for that we need a fast, dedicated computer. Most machine-vision camera manufacturers now offer units with image-processing computers built in. These so-called “smart cameras” run built-in image-processing software that makes extracting the useful information from the image relatively quick and painless. The manufacturers have been working on this stuff for 20 to 30 years, after all.
Image analysis starts by throwing away all irrelevant data. For example, the first step is often “thresholding,” where the 12 bits expressing the light level in a monochrome pixel collapse into one bit (black or white) to produce silhouettes of objects in the scene. Similarly, the system will throw out all information relating to extraneous objects that are visible in the scene, but don’t bear on the decisions to be made.
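Thresholding is easy to picture in code. The sketch below uses plain Python lists; the 4x4 patch of pixel values and the mid-scale threshold are invented for illustration, and a real smart camera would do this in its own firmware with a threshold tuned to the lighting.

```python
def threshold(image, level):
    """Collapse each grayscale pixel to one bit: 1 if brighter than level, else 0."""
    return [[1 if px > level else 0 for px in row] for row in image]


# Hypothetical 4x4 patch of 12-bit values (0..4095): a bright part on a dark belt.
patch = [
    [120,  135,  110, 140],
    [130, 3900, 3850, 125],
    [115, 3875, 3920, 150],
    [140,  128,  133, 118],
]

LEVEL = 2048  # assumption: mid-scale threshold; real systems tune this value

silhouette = threshold(patch, LEVEL)
for row in silhouette:
    print("".join("#" if bit else "." for bit in row))
```

The patch goes in as 16 pixels of 12 bits each (192 bits) and comes out as 16 bits, with the bright object standing out as a 2x2 block of ones.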
The next step is to boil automated measurements of relevant objects’ silhouettes down to a few numbers that have real useful value. For example, the only things we might care about relating to a circular object are its diameter, the position of its center, and maybe how far out of round it is. In a two-dimensional view, that probably amounts to four numbers. A complex task at high precision might yield double-precision (64-bit) numbers relating to around 10 features. That amounts to something like 640 bits, reducing the data set by a factor of about 30,000.
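To make the "four numbers" concrete, here is one way a silhouette of a round part could be reduced to center x, center y, diameter, and out-of-roundness. This is a sketch, not how any particular vision package does it: the function name is hypothetical, the diameter is taken as that of a circle with the same area, and out-of-roundness is defined here as the spread between the farthest and nearest edge pixel from the centroid (other definitions exist).

```python
import math


def circle_features(mask):
    """Reduce a binary silhouette to (center_x, center_y, diameter, out_of_round)."""
    height, width = len(mask), len(mask[0])
    pts = [(x, y) for y, row in enumerate(mask) for x, bit in enumerate(row) if bit]
    if not pts:
        raise ValueError("no object pixels in mask")

    area = len(pts)
    cx = sum(x for x, _ in pts) / area
    cy = sum(y for _, y in pts) / area

    # Diameter of the circle whose area equals the silhouette's pixel count.
    diameter = 2.0 * math.sqrt(area / math.pi)

    def is_edge(x, y):
        # An edge pixel has at least one 4-neighbor that is background
        # or lies outside the image.
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if not (0 <= nx < width and 0 <= ny < height) or not mask[ny][nx]:
                return True
        return False

    radii = [math.hypot(x - cx, y - cy) for x, y in pts if is_edge(x, y)]
    out_of_round = max(radii) - min(radii)
    return cx, cy, diameter, out_of_round


# Synthetic test silhouette: a disk of radius 10 centered at (20, 20).
disk = [[1 if (x - 20) ** 2 + (y - 20) ** 2 <= 100 else 0 for x in range(41)]
        for y in range(41)]

cx, cy, diameter, out_of_round = circle_features(disk)
print(f"center=({cx:.1f}, {cy:.1f})  diameter={diameter:.2f}  "
      f"out_of_round={out_of_round:.2f}")
```

For the synthetic disk, the center comes back at (20, 20), the diameter lands close to 20 pixels, and the out-of-roundness is about one pixel, which is just the staircase effect of drawing a circle on a square grid. The 1,681 pixels of the mask have become four Python floats, which are 64-bit doubles.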
That reduced data set contains all the relevant information the system controller needs to make a decision. The controller’s output to actuators will finally amount to 1 to 10 bits. That’s another data-set reduction by a factor of roughly 64 to 640, depending on how many output bits the controller needs.
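Dividing the bit counts quoted above gives the size of each reduction step. The sketch below assumes 1.5 Mpx means exactly 1,500,000 pixels and takes the "around 10 features" as exactly 10; the function name is made up for illustration.

```python
def reduction_factor(bits_in: int, bits_out: int) -> float:
    """How many times smaller the output data set is than the input."""
    return bits_in / bits_out


RAW_IMAGE_BITS = 12 * 1_500_000   # 12-bit pixels, assumed 1.5 million of them
FEATURE_BITS = 10 * 64            # assumed exactly 10 double-precision features

print(f"image -> features: {reduction_factor(RAW_IMAGE_BITS, FEATURE_BITS):,.0f}x")
for output_bits in (1, 10):       # the controller's 1 to 10 output bits
    step = reduction_factor(FEATURE_BITS, output_bits)
    total = reduction_factor(RAW_IMAGE_BITS, output_bits)
    print(f"features -> {output_bits:2d} bit(s): {step:,.0f}x  "
          f"(image -> output: {total:,.0f}x)")
```

With these inputs, the first step works out to 28,125 (the "about 30,000" in the text), and the second to 640 for a one-bit output or 64 for a ten-bit output. End to end, the image shrinks by a factor of 1.8 million to 18 million.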
Whether it was all worthwhile depends on the value of those final few bits. If all we want is the one bit provided by a proximity sensor, there’s no way to economically justify using even a very inexpensive vision sensor.
What makes the vision sensor justifiable are the other nine bits that the point sensor just can’t provide. To see what I mean, imagine an application in your brother-in-law’s factory, where you’re automating the final packing of carburetors arriving on a conveyor belt from a workcell where your Aunt Gladys attached the throttle return spring.
You could sense the arrival of the next carburetor by laying a laser beam across the conveyor, which would send a trigger pulse when the carburetor blocks the beam. That would give you your one bit. But, all that bit says is that something blocked the beam.
Let’s instead use a vision system built around a low-end image sensor with a resolution of a few kilopixels and an inexpensive 8-bit processor. Such a system would be quite capable of providing the one bit that says “something arrived.” It could also provide extra bits that say: “The thing that arrived is a carburetor, and that carburetor is the correct model to put in the box, rather than the neighbor’s cat taking a bath on the conveyor belt. Oh, by the way, the throttle return spring is attached correctly.”
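Those extra bits might reach the controller as a handful of flags. The sketch below is purely illustrative: the flag names, the `action` function, and the wait/pack/reject choices are hypothetical and not taken from any real vision product or PLC.

```python
from enum import IntFlag


class Inspection(IntFlag):
    """Hypothetical result bits reported by the vision sensor."""
    PRESENT = 1            # something arrived (all a beam sensor can say)
    IS_CARBURETOR = 2      # the thing is a carburetor, not the neighbor's cat
    CORRECT_MODEL = 4      # it is the model that belongs in this box
    SPRING_ATTACHED = 8    # the throttle return spring is attached correctly


READY_TO_PACK = (Inspection.PRESENT | Inspection.IS_CARBURETOR
                 | Inspection.CORRECT_MODEL | Inspection.SPRING_ATTACHED)


def action(result: Inspection) -> str:
    """Map the inspection bits to a packing-station command."""
    if not (result & Inspection.PRESENT):
        return "wait"      # nothing on the belt yet
    if result == READY_TO_PACK:
        return "pack"      # every check passed
    return "reject"        # something arrived, but it failed a check


print(action(Inspection(0)))                                  # empty belt
print(action(Inspection.PRESENT))                             # the cat
print(action(READY_TO_PACK & ~Inspection.SPRING_ATTACHED))    # missing spring
print(action(READY_TO_PACK))                                  # good carburetor
```

A beam sensor can only ever set the first flag, so it cannot tell the cat from a good carburetor. The other three flags are what the vision sensor adds.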
Try doing that with a proximity sensor!
Posted by Ask Control Engineering on October 22, 2007