Machine Vision System
Machine vision can be defined as a means of simulating the image recognition and
analysis capabilities of the human vision system with electronic and electromechanical
techniques.
A machine vision system enables the identification and orientation of a work part
within the field of vision, and has far-reaching applications: it not only facilitates
automated inspection but also has wide-ranging uses in robotic systems. Machine
vision involves the acquisition of image data of an object of interest, followed by
the processing and interpretation of those data by a computer program for useful applications.
A machine vision system performs three main functions:
1. Image acquisition and digitization
2. Image processing and analysis
3. Image interpretation
The primary task in a vision system is to capture a 2D or 3D image of the work part. A 2D image
captures either the top view or a side elevation of the work part, which would be adequate to carry out
simple inspection tasks. While the 2D image is captured using a single camera, the 3D image requires
at least two cameras positioned at different locations. The work part is placed on a flat surface and
illuminated by suitable lighting, which provides good contrast between the object and the background.
The camera is focused on the work part and a sharp image is obtained. The image comprises a matrix
of discrete picture elements popularly referred to as pixels. Each pixel has a value that is proportional
to the light intensity of that portion of the scene. The intensity value for each pixel is converted to its
equivalent digital value by an analog-to-digital converter (ADC).
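As a rough illustration of this digitization step, the following Python/NumPy sketch quantizes hypothetical analog intensity levels (0.0 dark to 1.0 bright) into 8-bit pixel values, as an ADC would; the 4 × 4 scene is made up for illustration.

    import numpy as np

    # Hypothetical analog intensities, 0.0 (dark) to 1.0 (bright), for a 4 x 4 scene.
    analog = np.random.rand(4, 4)

    # An 8-bit ADC maps each analog level to one of 256 discrete values (0-255).
    digital = np.round(analog * 255).astype(np.uint8)

    print(digital)  # the digitized frame, as it would sit in the frame buffer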
Fig. 3.47 Vision system (a) Object and background (b) Matrix of pixels
This digitized frame of the image is referred to as the frame buffer. While Fig. 3.47(a) illustrates
the object kept in the scene of vision against a background, Fig. 3.47(b) shows the division of the scene
into a number of discrete spaces called pixels. The choice of camera and proper lighting of the scene are
important to obtain a sharp image, having a good contrast with the background. Two types of cameras
are used in machine vision applications, namely vidicon cameras and solid-state cameras. Vidicon
cameras are analog cameras, quite similar to those used in conventional television.
The image of the work part is focused onto a photoconductive surface, which is scanned at a
frequency of 25–30 scans per second by an electron beam. The scanning is done in a systematic manner,
covering the entire area of the screen in a single scan. Different locations on the photoconductive
surface, called pixels, have different voltage levels corresponding to the light intensity striking those
areas. The electron beam reads the status of each pixel and stores it in the memory. Solid-state cameras
are more advanced and function in digital mode. The image is focused onto a matrix of equally spaced
photosensitive elements called pixels. An electrical charge is generated in each element depending on
the intensity of light striking the element. The charge is accumulated in a storage device. The status of
every pixel, comprising either the grey scale or the colour code, is thus stored in the frame buffer. Solid-
state cameras have become more popular because they adopt more rugged and sophisticated technology
and generate much sharper images. Charge-coupled-device (CCD) cameras have become the standard
accessories in modern vision systems.
The frame buffer stores the status of each and every pixel. A number of techniques are available
to analyse the image data. However, the information available in the frame buffer needs to be refined
and processed to facilitate further analysis. The most popular technique for image processing is called
segmentation. Segmentation involves two stages: thresholding and edge detection.
Thresholding converts each pixel value into either of the two values, white or black, depending
on whether the intensity of light exceeds a given threshold value. This type of vision system is called a
binary vision system. If necessary, it is possible to store different shades of grey in an image, popularly
called the grey-scale system. If the computer has a larger main memory and a faster processor, an
individual pixel can also store colour information. For the sake of simplicity, let us assume that we will
be content with a binary vision system. Now the entire frame of the image will comprise a large number
of pixels, each having a binary state, either 0 or 1. Typical pixel arrays are 128 × 128, 256 × 256, 512 ×
512, etc.
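A minimal thresholding sketch in Python with NumPy; the threshold value of 128 and the tiny sample frame are arbitrary choices for illustration.

    import numpy as np

    def to_binary(frame, threshold=128):
        """Map each grey-scale pixel (0-255) to 1 (white) if its intensity
        exceeds the threshold, else 0 (black)."""
        return (frame > threshold).astype(np.uint8)

    frame = np.array([[10,  20, 200],
                      [15, 220, 210],
                      [12,  18,  25]], dtype=np.uint8)
    print(to_binary(frame))
    # [[0 0 1]
    #  [0 1 1]
    #  [0 0 0]]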
Edge detection is performed to distinguish the image of the object from its surroundings.
Computer programs identify the contrast in light intensity between pixels bordering the
image of the object and thereby resolve the boundary of the object.
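One simple way to realize this on a binary image, sketched below, is to flag every pixel whose value differs from that of a horizontal or vertical neighbour; the flagged transitions trace the object boundary. This is an illustrative scheme, not a specific production algorithm.

    import numpy as np

    def edge_pixels(binary):
        """Flag pixels where the value changes between horizontal or
        vertical neighbours; the transitions trace the object boundary."""
        edges = np.zeros(binary.shape, dtype=bool)
        edges[:, :-1] |= binary[:, :-1] != binary[:, 1:]   # left/right contrast
        edges[:-1, :] |= binary[:-1, :] != binary[1:, :]   # up/down contrast
        return edges.astype(np.uint8)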
In order to identify the work part, the pattern in the pixel matrix needs to be compared with the
templates of known objects. Since the pixel density is quite high, one-to-one matching at the pixel level
within a short time duration demands high computing power and memory. An easier solution to this
problem is to resort to a technique known as feature extraction. In this technique, an object is defined
by means of its features such as length, width, diameter, perimeter, and aspect ratio. The aforementioned
techniques—thresholding and edge detection—enable the determination of an object’s area and
boundaries.
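The sketch below is a minimal illustration of feature extraction from a binary image; it assumes a single object is present, uses the bounding box for length and width, and estimates the perimeter by counting object pixels that touch the background.

    import numpy as np

    def extract_features(binary):
        """Compute simple shape features of the single object assumed to
        be present in a binary image (object pixels = 1)."""
        rows, cols = np.nonzero(binary)            # coordinates of object pixels
        area = int(binary.sum())                   # number of object pixels
        length = int(rows.max() - rows.min() + 1)  # bounding-box height
        width = int(cols.max() - cols.min() + 1)   # bounding-box width
        # rough perimeter: object pixels with at least one background neighbour
        padded = np.pad(binary, 1)
        interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                    padded[1:-1, :-2] & padded[1:-1, 2:])
        perimeter = int(((binary == 1) & (interior == 0)).sum())
        return {"area": area, "length": length, "width": width,
                "aspect_ratio": length / width, "perimeter": perimeter}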
Once the features have been extracted, the task of identifying the object becomes simpler, since
the computer program has to match the extracted features with the features of templates already stored
in the memory. This matching task is popularly referred to as template matching. Whenever a match
occurs, an object can be identified and further analysis can be carried out. This interpretation function
that is used to recognize the object is known as pattern recognition. Needless to say, in order to
facilitate pattern recognition, we need to create templates or a database containing features of the known
objects. Many computer algorithms have been developed for template matching and pattern recognition.
In order to eliminate the possibility of wrong identification when two objects have closely resembling
features, feature weighting is resorted to. In this technique, several features are combined into a single
measure by assigning a weight to each feature according to its relative importance in identifying the
object. This adds a further dimension to the process of assigning scores to features and minimizes
the chance of wrong identification of an object.
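A sketch of feature weighting in template matching; every part name, feature value, and weight below is invented purely for illustration. Each feature's deviation from the template is scaled by its weight, and the template with the lowest combined score is taken as the match.

    # Hypothetical feature templates for two known parts; the weights reflect
    # an assumed relative importance of each feature.
    templates = {
        "bracket": {"area": 520, "perimeter": 96, "aspect_ratio": 1.3},
        "flange":  {"area": 540, "perimeter": 88, "aspect_ratio": 1.0},
    }
    weights = {"area": 0.2, "perimeter": 0.3, "aspect_ratio": 0.5}

    def match_score(measured, template):
        """Weighted sum of relative feature deviations; lower is better."""
        return sum(w * abs(measured[f] - template[f]) / template[f]
                   for f, w in weights.items())

    measured = {"area": 524, "perimeter": 95, "aspect_ratio": 1.28}
    best = min(templates, key=lambda name: match_score(measured, templates[name]))
    print(best)   # -> bracket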
Once the object is identified, the vision system should direct the inspection station to carry out
the necessary action. In a flexible inspection environment, the work-cell controller should generate the
actuation signals to the transfer machine to transfer the work part from machining stations to the
inspection station and vice versa. Clamping, declamping, gripping, etc., of the work parts are done
through actuation signals generated by the work-cell controller.
The schematic diagram of a typical vision system is shown. The system involves
image acquisition and image processing. Acquisition requires appropriate lighting, a
camera, and the storage of the digital image. Image processing involves manipulating
the digital image to simplify and reduce the number of data points. Measurements can
be carried out at any angle along the three reference axes x, y, and z without contacting
the part. The measured values are then compared with the specified tolerances stored
in the memory of the computer.
[source: https://www.roboticstomorrow.com/article/2019/12/what-is-machine-vision/14548]
The main advantages of a vision system are the reduction of tooling and fixture costs,
the elimination of the need for precise part location for handling robots, and the
integrated automation of dimensional verification and defect detection.
3.7.3 Principle
The four elements of a machine vision system and their schematic arrangement are
shown.
[source: https://what-when-how.com/metrology/principle-of-working-metrology/]
(i) Image Formation. Front lighting is used when certain key features on the surface
of the object are to be inspected. If a three-dimensional feature is being inspected,
side lighting or structured lighting may be required. The proper orientation and
fixturing of the part also deserve full attention. An image sensor such as a vidicon,
CCD, or CID camera is used to generate
the electronic signal representing the image. The image sensor collects light from the
scene through a lens and using a photosensitive target, converts it into electronic signal.
Most image sensors generate signals representing two-dimensional arrays (scans of the
entire image).
[source: https://what-when-how.com/metrology/principle-of-working-metrology/]
The vidicon camera used in closed-circuit television systems can also be used for
machine vision systems. In it, an image is formed by focusing the incoming light through
a series of lenses onto the photoconductive faceplate of the vidicon tube. An electron
beam within the tube scans the photoconductive surface and produces an analog output
voltage proportional to the variations in light intensity for each scan line of the original
scene. The vidicon camera provides a great deal of information about a scene at very fast
speeds. However, vidicon cameras tend to distort the image due to their construction, are
subject to image burn-in on the photoconductive surface, and are susceptible to damage
by shock and vibration.
Solid-state cameras are commonly used in machine vision systems. They employ
charge-coupled device (CCD) or charge-injection device (CID) image sensors, which
contain a matrix or linear array of small, accurately spaced photosensitive elements
fabricated on silicon chips using integrated-circuit technology. Each detector converts
the light falling on it, through the camera lens, into an analog electrical signal
corresponding to the light intensity. The entire image is thus broken down into an array
of individual picture elements (pixels).
[source: https://what-when-how.com/metrology/principle-of-working-metrology/]
Typical matrix-array solid-state cameras may have 256 × 256 detector elements per
array. Solid-state cameras are smaller and more rugged, and their sensors do not wear
out with use. They exhibit less image distortion because of the accurate placement of the
photodetectors. CCD and CID sensors differ primarily in how the voltages are extracted
from them.
(ii) Image Processing. The series of voltage levels available at the detectors,
representing light intensities over the area of the image, needs processing for
presentation to the microcomputer in a format suitable for analysis. A camera may
typically form an image 30 times per second, i.e., at 33-millisecond intervals. At each
time interval the entire image has to be captured and frozen for processing by an image
processor. An analog-to-digital converter is used to convert the analog voltage of each
detector into a digital value.
If the voltage level of each pixel is assigned either a 0 or a 1 depending on some
threshold value, the system is called a binary system. A grey-scale system, on the other
hand, assigns up to 256 different values to each pixel depending on intensity. Thus, in
addition to black and white, many different shades of grey can be distinguished. This
permits comparison of objects on the basis of surface characteristics such as texture,
colour, and orientation, all of which produce subtle variations in light intensity
distributions. Grey-scale systems are used in applications requiring a higher degree of
image refinement. For simple inspection tasks, silhouette images are adequate and a
binary system may serve the purpose. It may be appreciated that a grey-scale system
requires huge storage and processing capability, because a 256 × 256 pixel image array
with up to 256 different pixel values requires 65,536 eight-bit storage locations for
analysis, at a speed of 30 images per second. The data processing requirements can
thus be visualised.
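For instance, a single 256 × 256 frame at one byte (8 bits) per pixel occupies
256 × 256 = 65,536 bytes, and at 30 frames per second the raw data rate is
65,536 × 30 ≈ 1.97 MB per second.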
It is, therefore, essential that some means be used to reduce the amount of data to be
processed. Various techniques used for this purpose include the following:
(a) Windowing. This technique is used to concentrate the processing on the desired area
of interest, ignoring the other, uninteresting parts of the image. An electronic mask is
created around the small area of the image to be studied, so that only the pixels that are
not blocked out are analysed by the computer.
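A minimal windowing sketch; the window position and size below are arbitrary.

    import numpy as np

    def window(frame, top, left, height, width):
        """Return only the region of interest; pixels outside this
        electronic mask are ignored in subsequent processing."""
        return frame[top:top + height, left:left + width]

    frame = np.random.randint(0, 256, size=(256, 256), dtype=np.uint8)
    roi = window(frame, top=40, left=60, height=32, width=32)
    print(roi.shape)   # (32, 32): only 1024 of the 65,536 pixels are analysed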
(b) Image Restoration. This involves preparation of the image in a more suitable form
during the pre-processing stage by removing the degradation it has suffered. The image
may be degraded (blurring of lines and boundaries, poor contrast between image regions,
presence of background noise, etc.) due to motion of the camera or object during image
formation, poor illumination or placement, variation in sensor response, poor contrast
on the surface, and so on.
The quality may be improved (i) by improving the contrast through constant brightness
addition, (ii) by increasing the relative contrast between high- and low-intensity elements
by making light pixels lighter and dark pixels darker (contrast stretching), or (iii) by
Fourier-domain processing.
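A sketch of contrast stretching, item (ii) above: the intensity range of the image is linearly rescaled to the full 0 to 255 span, making light pixels lighter and dark pixels darker.

    import numpy as np

    def stretch_contrast(frame):
        """Linearly rescale intensities so the darkest pixel maps to 0
        and the brightest to 255."""
        lo, hi = int(frame.min()), int(frame.max())
        if hi == lo:                  # flat image: nothing to stretch
            return frame.copy()
        scaled = (frame.astype(np.float64) - lo) * 255.0 / (hi - lo)
        return scaled.astype(np.uint8)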
Other techniques to reduce processing are edge detection and run-length encoding. In
the former, the edges are found and defined, and rather than storing the entire image,
only the edges are stored. In run-length encoding, each line of the image is scanned,
and transition points from black to white or vice versa are noted, along with the number
of pixels between transitions. These data are then stored instead of the original image
and serve as the starting point for image analysis.
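A minimal run-length encoding sketch for one scan line of a binary image; only the first pixel value and the run lengths between transitions are stored.

    def run_length_encode(row):
        """Encode a scan line as (first pixel value, run lengths)."""
        runs = []
        current, count = row[0], 0
        for pixel in row:
            if pixel == current:
                count += 1
            else:
                runs.append(count)
                current, count = pixel, 1
        runs.append(count)
        return row[0], runs

    print(run_length_encode([0, 0, 0, 1, 1, 0, 0, 0, 0]))
    # -> (0, [3, 2, 4]): nine pixels stored as four numbers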
(iii) Image Analysis. The digital image of the object is analysed in the central processing
unit of the system to draw conclusions and make decisions. Analysis is done by describing
and measuring the properties of several image features, which may belong either to regions
of the image or to the image as a whole. The process of image interpretation starts with
the analysis of simple features; more complicated features are then added to define the
image completely. Analysis is carried out to describe the position of the object, its
geometric configuration, the distribution of light intensity over its visible surface,
and so on.
Three important tasks performed by machine vision systems are measuring the
distance of an object from a vision system camera, determining object orientation, and
defining object position.
Usually, parts tend to have distinct shapes that can be recognized on the basis of
elementary features. For complex three-dimensional objects, additional geometric
properties need to be determined, including descriptions of various image segments
(a process known as feature extraction). In this method, the boundary locations are
determined, the image is segmented into distinct regions, and their geometric
properties are determined. These image regions are then organised into a structure
describing their relationships.
The most commonly used methods of interpreting images are feature weighting (in
which several image features are measured to interpret an image, a simple factor-weighting
method being used to consider the relative contribution of each feature analysed) and
template matching (in which a mask is electronically generated to match a standard
image of an object). In actual practice, several known parts are presented to the machine
for analysis. The part features are stored and updated as each part is presented, until the
machine is familiar with the part. The actual parts are then studied by comparison with
this stored model of a standard part.
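A sketch of how the stored model might be updated as each known part is presented, here using a simple running average; the feature names and values are invented.

    def update_model(model, measured, n_seen):
        """Fold a newly presented part's features into the stored model
        by running average (n_seen = parts already averaged in)."""
        for feature, value in measured.items():
            old = model.get(feature, value)
            model[feature] = (old * n_seen + value) / (n_seen + 1)
        return model

    model = {}
    for i, part in enumerate([{"area": 520}, {"area": 524}, {"area": 518}]):
        model = update_model(model, part, i)
    print(model)   # {'area': 520.66...}: the model of the standard part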
Similarly, mathematical models of the expected images are created. For complex
shapes, the machine is taught by allowing it to analyse a sample part. Standard image-
processing software is available for calculating basic image features and comparing
them with the stored models.
[source: https://what-when-how.com/metrology/how-machine-vision-system-functions-metrology/]
Machine vision can be used to replace human vision for welding, for machining
(maintaining the relationship between tool and workpiece), and for the assembly and
analysis of parts.
➢ Machine vision is frequently used for printed circuit board inspection to ensure
minimum conductor width and spacing between conductors. It is also used for weld
seam tracking, robot guidance and control, inspection of microelectronic devices
and tooling, on-line inspection in machining operations, and the monitoring of
assemblies and high-speed packaging equipment.
➢ It enables recognition of an object from its image. Vision systems are designed to
have strong geometric feature interpretation capabilities and to work with part-handling
equipment.
Machine vision systems are used for various applications such as part identification,
safety monitoring, and visual guidance and navigation. However, by far their biggest
application is in automated inspection, which is best suited to mass production, where
100% inspection of components is sought. The inspection task can be carried out in
either on-line or off-line mode. The following are some of the important applications
of machine vision systems in inspection:
Inspection of dimensional accuracy
Work parts, either stationary or moving on a conveyor system, are inspected for
dimensional accuracy. A simpler task is to employ gauges fitted as end effectors of a
transfer machine or robot in order to carry out gauging, quite similar to a human
operator. A more complicated task is the measurement of actual dimensions to
ascertain dimensional accuracy. This calls for systems with high resolution and good
lighting of the scene, which provides a shadow-free image.
Identification of surface defects
Defects on the surface, such as scratch marks, tool marks, pores, and blowholes, can
be easily identified. These defects reveal themselves as changes in the reflected light,
and the system can be programmed to identify such defects.
Verification of holes
This involves two aspects. First, the number of holes can easily be counted. Second,
the location of each hole with respect to a datum can be inspected for accuracy.
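Both aspects can be illustrated with a short sketch on a binary image (object pixels = 1): background regions that do not touch the image border are holes, so counting them gives the hole count, and each region's centroid gives the hole location relative to the image datum. The 4-connected flood fill below is one simple way to do this, not a specific production algorithm.

    import numpy as np

    def find_holes(binary):
        """Count holes (background regions fully enclosed by the object) and
        return the centroid of each, measured from the image datum (0, 0)."""
        h, w = binary.shape
        visited = np.zeros((h, w), dtype=bool)
        holes = []
        for r in range(h):
            for c in range(w):
                if binary[r, c] == 0 and not visited[r, c]:
                    stack, region, touches_border = [(r, c)], [], False
                    visited[r, c] = True
                    while stack:
                        y, x = stack.pop()
                        region.append((y, x))
                        if y in (0, h - 1) or x in (0, w - 1):
                            touches_border = True
                        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny, nx] == 0
                                    and not visited[ny, nx]):
                                visited[ny, nx] = True
                                stack.append((ny, nx))
                    if not touches_border:   # enclosed background = a hole
                        ys, xs = zip(*region)
                        holes.append((sum(ys) / len(ys), sum(xs) / len(xs)))
        return holes

    img = np.ones((5, 7), dtype=np.uint8)
    img[2, 2] = 0          # one-pixel hole
    img[2, 4:6] = 0        # two-pixel hole
    print(find_holes(img))
    # -> [(2.0, 2.0), (2.0, 4.5)]: two holes and their (row, col) centroids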