Active Vision System for 3D Product Inspection

Learn how to construct three-dimensional vision applications by reviewing triangulation equations and measurement procedures.
By Janusz Kowal and Andrzej Sioma, AGH University of Science and Technology, November 1, 2009

The most evident limitation of a two-dimensional vision system is that it can only account for movement and rotation in the X-Y plane. This has led to growing demand for, and use of, high-quality three-dimensional (3D) processing technology. Engineers exploring 3D inspection can now choose among many different technologies for creating 3D images.

To create such an application successfully for industrial use, the most important task is to achieve high measurement resolution in all three (x, y, z) directions. Depending on the shape and size of the object and the distance between the object and the vision sensor, different methods of three-dimensional measurement, and different types of active vision systems, can be used.


A 3D object model


3D image acquisition


A vision system consists of a set of devices that convert light and measurement data into information about the spatial and material attributes of the scene. These devices include photosensitive sensors (a camera with a lens that focuses images on a photosensitive element, or the eye) and computational mechanisms (a computer) that allow information to be collected from the sensors.

Humans can easily distinguish objects they see and are also able to recognize them in various light conditions and from different frames of reference. However, in the case of vision systems, the proper interpretation of a pixel value is a very difficult task. Additionally, the brightness of pixels (and thus of the image received) depends on several factors:

  • Scene geometry – the shape and position of the object in the field of vision (when the shape or the position is changed, the image changes as well);

  • Lighting and material properties of the object – for example, an apple made of bronze looks different from a real apple; and

  • Dynamics of the environment.

Active vision systems

A typical process of three-dimensional analysis for a vision system requires a well-thought-out position for the object to be viewed, changes in the relative position of the object and the sensor, scanning, saving the collected data in a common coordinate system, and combining numerous pictures into one three-dimensional model.

An active vision system is composed of two devices: a camera and a laser projector. The laser emits a light pattern that is reflected by the object surface, then observed and captured by the camera. Using specific light patterns called “structured light,” such as lines, grids or other shapes, one can scan scenes.

The term structured light is defined as the projection of simple or encoded light patterns (e.g., points, line grids or complex shapes) onto the illuminated object. The most important advantage of using structured light is that surface features in the images are defined in detail. As a result, both the detection and extraction of image features are simplified and represented very precisely.

Triangulation with active vision is based on the projection of a light pattern. The light line is directed at the object within the camera’s field of view. Before measuring, the laser plane must be calibrated. It should be directed so that the light line on the background surface is parallel to the x-axis and perpendicular to the y-axis.

As shown in the triangulation diagram (right), the lens is located in the center of the coordinate system, at the focal length ( f ) from the image plane (CCD or CMOS sensor). Baseline b is the line between the laser and the camera.

The laser plane forms a known angle θ with the baseline (b). The baseline length is calculated for the laser oriented so that the beam reflected from the studied object is positioned in the centre of the X’Y’ matrix.

Baseline b can then be calculated from this equation:

b = z tan θ


The point reflected from the object surface is projected onto the image plane at (x’, y’). The 3D point (x, y, z) can now be calculated. The known variables are b, f, θ, x’ and y’. The distance between the object and the camera can now be calculated from the following relationships:

Equation 2

Equation 3


With these relationships determined, real-world coordinates can be calculated as follows:

Equation 4

Equation 5
Equation 6
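
As a rough illustration of how a pinhole camera model and a calibrated laser plane combine into a 3D point, the Python sketch below works under stated assumptions and does not reproduce the article's own equations. It assumes the camera sits at the origin looking along the z-axis, the laser is offset by b along the x-axis, and θ is measured so that b = z tan θ holds at the image centre (that is, the laser plane is x = b − z tan θ). The function name `triangulate` and the sign conventions are hypothetical.

```python
import math

def triangulate(x_img, y_img, b, f, theta):
    """Return the 3D point (x, y, z) seen at image coordinates (x_img, y_img).

    Assumed geometry (an illustration, not the article's equations):
      - pinhole camera at the origin, optical axis along +z, focal length f
      - laser offset by baseline b along the x-axis
      - laser plane x = b - z * tan(theta), so b = z * tan(theta) at x_img = 0
    """
    denom = x_img + f * math.tan(theta)
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the laser plane")
    z = b * f / denom          # intersect the viewing ray with the laser plane
    x = x_img * z / f          # back-project through the pinhole
    y = y_img * z / f
    return x, y, z

# Example: baseline 100 mm, focal length 10 mm, laser plane at 45 degrees
point = triangulate(x_img=10.0, y_img=2.0, b=100.0, f=10.0, theta=math.radians(45))
```

A point imaged at the centre of the sensor (x_img = 0) returns z = b / tan θ, which is the same relationship as b = z tan θ given earlier.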


Measurement procedure

The first step in this measurement procedure is to set up the 3D camera before capturing 3D images or profiles. The following need to be set in order to retrieve images from the camera:

  • profile triggering and rate;

  • field-of-view (FOV) and resolution along the x-axis;

  • 3D image triggering;

  • length of y-axis; and

  • resolution along the y-axis for 3D images.


Triangulation with active vision is based on the projection of a light pattern.


In addition, it may be necessary to adjust measurement settings in order to improve the quality of the images.

When these procedures are completed, the size of the image banks can be adjusted to fit the specified FOV. Once the settings are applied, the device starts capturing images and constructs a 3D model from the acquired data.

During image acquisition, some data are inevitably lost. To help fill these gaps, a “Fill Missing Data” procedure should be implemented and executed to improve the 3D model. This procedure fills in missing data in a selected part of a 3D image or profile by interpolating the height values from the surrounding areas. Only the pixels within a defined region of interest are affected in the output, but values outside this region are used for interpolation.

Typically, this procedure is used for preparing 3D images for use with procedures that treat missing data as a height value. Though this procedure is mainly designed to fill small areas of missing data, it can be used effectively for areas of any size.
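
The idea behind such a fill step can be sketched in a few lines of Python for a single profile. This is a minimal illustration, not the camera software's procedure: the function name `fill_missing`, the use of `None` to mark missing heights, and the choice of linear interpolation are all assumptions.

```python
def fill_missing(profile, roi_start, roi_width):
    """Fill None entries inside the region of interest of a 1D height profile.

    Only indices in [roi_start, roi_start + roi_width) are modified, but valid
    values outside that region may serve as interpolation neighbours.
    Linear interpolation is an assumption made for this sketch.
    """
    filled = list(profile)
    n = len(profile)
    for i in range(max(0, roi_start), min(n, roi_start + roi_width)):
        if profile[i] is not None:
            continue
        # nearest valid neighbour on each side, searched over the whole profile
        left = next((j for j in range(i - 1, -1, -1) if profile[j] is not None), None)
        right = next((j for j in range(i + 1, n) if profile[j] is not None), None)
        if left is not None and right is not None:
            t = (i - left) / (right - left)
            filled[i] = profile[left] + t * (profile[right] - profile[left])
        elif left is not None:
            filled[i] = profile[left]      # no valid value to the right
        elif right is not None:
            filled[i] = profile[right]     # no valid value to the left
    return filled

heights = [1.0, None, None, 4.0, None, 6.0]
repaired = fill_missing(heights, roi_start=1, roi_width=2)
```

In this example the gap at indices 1 and 2 is filled from its neighbours at indices 0 and 3, while the gap at index 4 lies outside the region of interest and stays untouched.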

After a 3D model has been acquired, a profile can be extracted from a 3D image along the line that is specified by the pixel coordinates of the start and end points of the line. The actual location of the last point in the profile may be slightly different from the specified end point due to the sampling distance.

The distance between the last point in the profile and the end point of the line will be at most half the sampling distance. However, if the end point is at the image border, the distance is at most one sampling distance.
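
One way such behaviour can arise is shown in the Python sketch below, which samples a height image along a line at a fixed sampling distance. It is an assumption-laden illustration rather than the camera software's algorithm: the function name `extract_profile`, the nearest-pixel lookup, and the rule of dropping a final sample that would fall outside the image are all hypothetical.

```python
import math

def _nearest(v):
    """Round half up to the nearest pixel index."""
    return int(math.floor(v + 0.5))

def extract_profile(image, start, end, step=1.0):
    """Sample heights from `image` (a list of rows) along the line start -> end.

    start and end are (x, y) pixel coordinates; step is the sampling distance.
    Nearest-pixel lookup is used for brevity (an assumption of this sketch).
    """
    (x0, y0), (x1, y1) = start, end
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        return [image[_nearest(y0)][_nearest(x0)]]
    ux, uy = (x1 - x0) / length, (y1 - y0) / length
    rows, cols = len(image), len(image[0])

    def pixel(k):
        return _nearest(x0 + k * step * ux), _nearest(y0 + k * step * uy)

    # Rounding places the last sample within half a step of the end point.
    n = round(length / step)
    # If that sample falls outside the image, step back (up to one step short).
    while n > 0 and not (0 <= pixel(n)[0] < cols and 0 <= pixel(n)[1] < rows):
        n -= 1
    return [image[pixel(k)[1]][pixel(k)[0]] for k in range(n + 1)]

image = [[0, 1, 2, 3, 4],
         [10, 11, 12, 13, 14]]
full_row = extract_profile(image, (0, 0), (4, 0), step=1.0)
coarse = extract_profile(image, (0, 0), (4, 0), step=1.5)
```

With a step of 1.5 the end point sits on the image border, so the last sample ends up one pixel short of it, which is within one sampling distance.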

After a 3D model has been acquired, a profile can be extracted.


Filtering and analysis

After a profile has been extracted, selected parts of the profile must be filtered to remove from the model the disturbances caused by changing lighting conditions. The resulting profile is stored in a new image bank for profile measurement. Either a mean or a median filter can be used.

With a mean filter, every height value in the resulting profile is a weighted average of the neighboring pixels around the corresponding height value in the original profile. The result of the mean filter is a smoothed profile.

With a median filter, every height value in the resulting bank becomes the median of neighboring pixels around the corresponding height value in the original profile. The result of the median filter is a profile where single outliers or noise are removed but edges are preserved. The size of the neighborhood is defined by the dimension of the filtering matrix.
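
The difference between the two filters is easy to see on a small profile. The Python sketch below is a minimal illustration with assumed details: the function name `filter_profile` is hypothetical, the mean uses equal weights rather than a general weighted average, and the window is simply truncated at the ends of the profile.

```python
from statistics import mean, median

def filter_profile(profile, size=3, kind="median"):
    """Apply a sliding-window mean or median filter to a 1D height profile.

    size is the (odd) neighbourhood width. The window is truncated at the
    profile ends and the mean is unweighted; both are assumptions.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number")
    reducer = {"mean": mean, "median": median}[kind]
    half = size // 2
    result = []
    for i in range(len(profile)):
        window = profile[max(0, i - half): i + half + 1]
        result.append(reducer(window))
    return result

# A flat profile with one outlier (9) followed by a step edge from 1 to 5
profile = [1, 1, 9, 1, 1, 5, 5, 5]
smoothed = filter_profile(profile, size=3, kind="mean")
despiked = filter_profile(profile, size=3, kind="median")
```

Here the median filter removes the single outlier and keeps the step from 1 to 5 sharp, whereas the mean filter spreads the outlier over its neighbours and softens the edge.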

Region-of-interest profile

The next step is to define a region-of-interest profile that specifies the intervals of the profile where the measurement procedures can be applied without affecting the profile outside the intervals. The region of interest is created by specifying the x-coordinate where it starts (X start pixel), and the region of interest width.

Next, the selected part of the profile is analyzed along with its characteristic points, such as local maxima, minima, and inflection points. The section of the profile to be analyzed is specified with a defined region of interest.
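
A simple way to locate such characteristic points inside a region of interest is sketched below in Python. The function name `local_extrema` and the strict neighbour comparison are assumptions made for illustration; inflection points are left out to keep the example short.

```python
def local_extrema(profile, x_start, width):
    """Return (maxima, minima) as index lists within the region of interest.

    The region covers indices [x_start, x_start + width). A point counts as an
    extremum when it is strictly above or below both direct neighbours, which
    is a simplifying assumption (plateaus are ignored).
    """
    stop = min(len(profile) - 1, x_start + width)
    maxima, minima = [], []
    for i in range(max(1, x_start), stop):
        prev, cur, nxt = profile[i - 1], profile[i], profile[i + 1]
        if cur > prev and cur > nxt:
            maxima.append(i)
        elif cur < prev and cur < nxt:
            minima.append(i)
    return maxima, minima

profile = [0, 2, 1, 3, 0, 4, 5, 2]
maxima, minima = local_extrema(profile, x_start=2, width=4)
```

Peaks at indices 1 and 6 exist in the full profile but are not reported, because they lie outside the region of interest that starts at pixel 2 and spans four pixels.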

It should be noted that the active vision method encounters difficulties with some materials. It performs best for materials with a uniform reflectance. Materials such as glass produce multiple reflections over the surface depth zone, degrading or prohibiting an accurate range measurement. Shiny surfaces are characterized by direct reflections.

Depending on the illumination and observation geometry, dropouts may occur in surface reflections due to their insufficient energy. Multiple reflections from a shiny surface in corner regions can also produce wild measurements. In general, active triangulation-based range cameras are capable of very precise (100


Author Information
Janusz Kowal and Andrzej Sioma, contributors to Control Engineering Polska, are with AGH University of Science and Technology, Poland.