MVS Notes Unit-I
Definition: The field of machine vision involves using computers to process and analyze images and videos to
extract information about the world.
Related fields: Machine vision is closely related to and overlaps several other fields.
Image processing: Usually stops short of understanding the image. Often employed to improve an image for human consumption. For example, noise may be smoothed out, or contrast equalized. Note, however, that if a process helps a human see the image better, it probably also helps the machine see it better.
Optics and sensors: How is the image captured? Can we model the capture process, to help remove noise?
Colorimetry: How should color be modeled? RGB is just one possibility. Are three values enough? How many
bits?
Pattern recognition: Once a piece of an image is recognized, how is it classified? For example, we might want to find the vessels in an image of the retina. Finding the vessels is a machine vision problem. Once they are found, we can measure their properties (tortuosity, color, size) and try to classify the vessels as normal or abnormal.
Artificial intelligence: Once we classify the vessels, can we reason about the overall health of the subject?
Algorithms and architecture: How long does all this take? Lots of pixels!
Medical imaging: One of the biggest applications of machine vision is in medicine. The bottom line is automation, with repeatable performance at a known level, which is exactly what medicine demands.
Robotics: What about a machine that can move? In order to move, we would prefer the machine be able to see, for obvious reasons.
Although we will concentrate on machine vision, it is important to understand that all these fields overlap. In
fact, part of being a graduate student is beginning to understand how knowledge may be modeled as "spheres of
influence". As you investigate a subject, you should be able to discuss your problem within the larger body of
knowledge.
Marr's theory (reconstruction): David Marr was a famous MIT researcher who proposed the first formal
framework for vision. He advocated geometric interpretation of each image individually, according to three
steps:
Image → primal sketch → 2.5-D sketch → 3-D model representation
This paradigm remains popular, even though many of its subproblems have been shown to be ill-posed (for example, a single image simply does not have a unique 3D interpretation).
Active vision: Active vision proponents consider exploration and interpretation of an image to be intertwined.
In this case, an important purpose of the interpretive process is to decide where to look next (to aid in further
interpretation).
Purposive vision: In the purposive vision framework, the task controls the interpretation. For example, it may
not be important to understand all the contents of an image, but only that portion necessary to accomplish the
task.
Qualitative vision: Abandoning geometry, qualitative interpretation seeks only to develop a relational model of
the information in an image (the door is next to the walls and floor).
1D VISION SYSTEMS: 1D vision analyzes a digital signal one line at a time instead of looking at a whole
picture at once, such as assessing the variance between the most recent group of ten acquired lines and an earlier
group. This technique commonly detects and classifies defects on materials manufactured in a continuous
process, such as paper, metals, plastics, and other non-woven sheet or roll goods.
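A minimal NumPy sketch of this idea: compare the variance of the most recent group of ten acquired lines against the group acquired just before it, and flag a defect when they diverge. The window size, threshold, and simulated data below are illustrative assumptions, not values from any real system.

import numpy as np

def detect_line_defect(lines, window=10, threshold=50.0):
    # lines: 2D array with one acquired scan line per row
    recent = lines[-window:]                # newest group of lines
    baseline = lines[-2 * window:-window]   # the group acquired before it
    # A large jump in variance suggests a defect entered the field of view
    return abs(np.var(recent) - np.var(baseline)) > threshold

# Example: a clean web of material with one bright streak in the newest lines
web = np.random.normal(128, 2, size=(100, 2048))
web[-5:, 1000:1100] += 100                  # simulated defect
print(detect_line_defect(web))              # True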
2D VISION SYSTEMS: The most common inspection cameras perform area scans, capturing 2D snapshots at various resolutions. Another type of 2D machine vision, line scan, builds a 2D image line by line.
AREA SCAN VS LINE SCAN: In certain applications, line scan systems have specific advantages over area
scan systems. For example, inspecting round or cylindrical parts may require multiple area scan cameras to
cover the entire part surface. However, rotating the part in front of a single line scan camera captures the entire
surface by unwrapping the image. Line scan systems fit more easily into tight spaces, for instance when the camera must peek through rollers on a conveyor to view the bottom of a part. Line scan systems can also
generally provide much higher resolution than traditional cameras. Since line scan systems require parts in
motion to build the image, they are often well-suited for products in continuous motion.
3D SYSTEMS: 3D machine vision systems typically comprise multiple cameras or one or more laser
displacement sensors. Multi-camera 3D vision in robotic guidance applications provides the robot with part
orientation information. These systems involve multiple cameras mounted at different locations, "triangulating" on the object's position in 3-D space.
In contrast, 3D laser-displacement sensor applications typically include surface inspection and volume
measurement, producing 3D results with as few as a single camera. A height map is generated from the displacement of the reflected laser's location on the object. The object or the camera must be moved to scan the entire product, much as in line scanning. With a calibrated offset laser, displacement sensors can measure parameters such as surface height and planarity with accuracy within 20 µm. A typical application is a 3D laser displacement sensor inspecting brake pad surfaces for defects.
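A simplified sketch of how a laser displacement sensor recovers height by triangulation: the laser line shifts sideways in the image in proportion to surface height. The geometry below (a laser sheet at angle theta to the camera axis, a calibrated mm-per-pixel scale) is an assumed idealization for illustration; real sensors are factory-calibrated.

import numpy as np

def height_from_displacement(pixel_shift, mm_per_pixel, theta_deg):
    # pixel_shift : observed lateral shift of the laser line (pixels)
    # mm_per_pixel: calibrated image scale (mm per pixel)
    # theta_deg   : angle between the laser sheet and the camera axis
    d_mm = pixel_shift * mm_per_pixel
    return d_mm / np.tan(np.radians(theta_deg))

# One profile per motion step; stacking the profiles yields the height map
shifts = np.array([0.0, 1.5, 3.0, 1.5, 0.0])            # one scan line
print(height_from_displacement(shifts, 0.02, 30.0))     # heights in mm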
Human Vision v Machine Vision (and what the operational benefits really are)
We get asked this all the time: what are the differences between using a human inspector and what we can expect from an intelligent vision system? Advances in artificial intelligence, in particular deep learning, are enabling computers to learn for themselves, so the gap continues to shrink. But it is still safe to say that vision systems work from logic, do not get tired, and do not have an "off" day. And of course, some production processes are too high-speed for an operator to inspect (such as medical device and pharmaceutical manufacturing).
Some of the key characteristics in a comparison between human and machine vision can be seen in the table below:

Characteristic   Human vision                              Machine vision
Consistency      Subject to fatigue and "off" days         Works from logic; repeatable
Speed            Too slow for some high-speed processes    Suited to high-speed inspection
Endurance        Tires over long shifts                    Runs continuously without tiring
Learning         Learns and adapts naturally               Deep learning is narrowing this gap
The key components of a machine vision system are lighting, lenses, the image sensor, vision processing, and communications.
1. Lighting
The most important factor in achieving successful machine vision results is lighting. Machine vision systems generate images by analyzing the light reflected from an object, not the object itself. A lighting technique is the placement of a light source relative to the part and the camera. A specific lighting technique can enhance an image by eliminating some features while emphasizing others; for example, silhouetting a part obscures surface details so that its edges can be measured.
Backlighting: Backlighting enhances an object’s outline for applications that only require external
or edge measurements. Backlighting aids in detecting shapes and improves the accuracy of
dimensional measurements.
Axial diffuse lighting: Axial diffuse lighting couples light into the optical path from the side (coaxially). A semitransparent mirror, illuminated from the side, casts light downwards on the part. The part reflects the light back to the camera through the semitransparent mirror, resulting in an image that is very evenly illuminated and uniform in appearance.
Structured light: Structured light is the projection of a light pattern (a plane, grid, or more complex shape) onto an object at a known angle. It can be useful for contrast-independent surface inspections, dimensional data acquisition, and volume calculations.
Dark-field illumination: Surface defects are more easily revealed with directional lighting,
including dark-field and brightfield illumination. For low-contrast applications, dark-field
illumination is usually preferred. Specular light is reflected away from the camera in dark-field
illumination, while diffused light from surface texture and elevation changes is reflected into the
camera.
Bright field illumination: High-contrast applications benefit from brightfield illumination. Highly
directional light sources, such as quartz halogen and high-pressure sodium, may, on the other hand,
produce sharp shadows and do not provide uniform illumination across the entire field of view. As a result, hot spots and specular reflections on shiny or reflective surfaces may necessitate a more diffused light source to provide even brightfield illumination.
Diffused dome lighting: Diffused dome lighting provides the most uniform illumination of important features while masking irregularities that may otherwise clutter the scene.
Strobe lighting: In high-speed applications, strobe lighting is used to freeze moving objects for
examination. Blurring can also be avoided by using a strobe light.
2. Lenses
The image is captured by the lens and delivered to the camera’s image sensor. The lenses’ optical quality and
price vary, and the captured image’s quality and resolution are determined by the lens used. On most vision
system cameras, there are two types of lenses: interchangeable and fixed lenses. The most common
interchangeable lens mounts are C-mounts and CS-mounts. Using the right lens and extension combination will
yield the best image. A standalone vision system with a fixed (non-interchangeable) lens typically uses autofocus, either a mechanically adjusted lens or a liquid lens that can focus on the part automatically. Autofocus lenses usually have a fixed field of view at a given distance.
3. Image sensor
The ability of the camera to capture a properly illuminated image of the inspected object is dependent not only
on the lens but also on the image sensor. To convert light (photons) into electrical signals (electrons), image sensors typically use charge-coupled device (CCD) or complementary metal-oxide-semiconductor (CMOS) technology. The image sensor's primary function is to capture light and convert it to a digital image while balancing noise, sensitivity, and dynamic range. The image is made up of pixels.
Low light creates dark pixels, while bright pixels are created by bright light. It’s critical to ensure the camera has
the correct sensor resolution for the job. The higher the resolution, the more detail and accurate measurements
an image will have. Part size, inspection tolerances, and other parameters will dictate the required resolution.
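A back-of-the-envelope sizing sketch for choosing sensor resolution. The rule of thumb used here, several pixels across the smallest feature to be resolved, is a common guideline; the specific numbers are assumptions for illustration.

def required_pixels(fov_mm, smallest_feature_mm, pixels_per_feature=4):
    # Minimum pixels along one axis needed to resolve the smallest feature
    return round(fov_mm / smallest_feature_mm * pixels_per_feature)

# Example: 100 mm field of view, 0.2 mm defects, 4 pixels per defect
print(required_pixels(100, 0.2))   # 2000 pixels along that axis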
4. Vision processing
Vision processing is the extraction of information from a digital image; it can happen either externally in a PC-based system or internally in a standalone vision system. The software consists of several processing steps.
The sensor is first used to obtain an image. Pre-processing may be required in some cases to optimize the image
and ensure that all of the necessary features are visible. The software then locates the unique features, performs
measurements, and compares them to the specification.
Finally, a decision is reached, and the outcomes are shared. While many physical components of a machine
vision system (such as lighting) have similar specifications, the algorithms distinguish them. When comparing
solutions, they should be at the top of the priority list. Vision software configures camera parameters, makes the
pass-fail decision, communicates with the factory floor, and supports HMI development, depending on the
system or application.
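A minimal OpenCV sketch of the pipeline just described: acquire, pre-process, locate the part, measure, compare to a specification, and decide. The file name, calibration factor, and the 50 mm specification are illustrative assumptions.

import cv2

image = cv2.imread("part.png", cv2.IMREAD_GRAYSCALE)    # 1. acquire
blurred = cv2.GaussianBlur(image, (5, 5), 0)            # 2. pre-process
_, binary = cv2.threshold(blurred, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)  # 3. locate
part = max(contours, key=cv2.contourArea)
x, y, w, h = cv2.boundingRect(part)                     # 4. measure
width_mm = w * 0.05            # assumed calibration: 0.05 mm per pixel
passed = abs(width_mm - 50.0) <= 0.2                    # 5. compare
print("PASS" if passed else "FAIL")                     # 6. decide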
5. Communications
Because vision systems frequently use a variety of off-the-shelf components, these components must quickly
and easily coordinate and connect to other machine elements. Typically, this is accomplished by sending
discrete I/O signals or data over a serial connection to a device that logs or uses the information. Discrete I/O points can be connected to a programmable logic controller (PLC), which can use the data to control a work cell or an indicator such as a stack light, or directly to a solenoid that triggers a reject mechanism.
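Communication methods vary widely (discrete I/O, serial, fieldbus protocols); the sketch below only illustrates the idea of reporting a pass/fail decision to a logging device over plain TCP. The host address, port, and message format are assumptions.

import socket

def report_result(passed, host="192.168.0.10", port=5000):
    message = b"PASS\n" if passed else b"FAIL\n"
    with socket.create_connection((host, port), timeout=2.0) as conn:
        conn.sendall(message)

# report_result(True)   # called after the vision system reaches a decision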
Hardware’s Software’s and algorithms of a machine vision system
Subheading Details
Machine vision hardware consists of cameras, lenses, lighting equipment, and sensors.
Cameras, particularly industrial-grade, provide high resolutions and frame rates, with
interfaces like USB, GigE, or Camera Link enabling high-speed data transfer. Lighting plays
Machine a crucial role in enhancing image quality, with options like LED or laser-based systems
Vision tailored to tasks. Lenses are chosen based on focal length and other optical properties, with
Hardware specialized types like telecentric lenses used to minimize perspective distortion. Filters may
also be applied to control the light wavelength, improving contrast. Frame grabbers are
essential for real-time data capture without loss. All these components ensure machine vision
systems can meet the demands of tasks such as sorting, inspection, and robotics.
Software handles image acquisition, processing, and analysis, interfacing with hardware
through acquisition libraries. Image processing algorithms include techniques like filtering,
edge detection, feature extraction, and thresholding. More advanced methods such as blob
Machine
analysis and contour detection enable object identification. Analysis frameworks can
Vision
integrate AI/ML algorithms, such as CNNs, for pattern recognition and object tracking.
Software
Platforms like OpenCV, HALCON, and MATLAB provide comprehensive libraries and tools
for system calibration and distortion correction. Cloud-based solutions offer remote
monitoring and updates, making machine vision software highly adaptable.
Algorithms in machine vision transform visual data into actionable information. Basic
algorithms include image processing methods like convolution, filtering, and morphological
operations, used to enhance images for further analysis. Convolutional Neural Networks
(CNNs) are widely employed for object recognition and classification, learning hierarchical
Machine
features from raw data. Other specialized algorithms, like Optical Character Recognition
Vision
(OCR), allow the reading of text in noisy images. Algorithms for motion detection, object
Algorithms
tracking, and 3D reconstruction enable applications like robotic control. Real-time decision-
making algorithms, including reinforcement learning, are crucial for tasks like autonomous
vehicle navigation. These algorithms continue to evolve, with AI and ML driving
performance improvements.
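To make the basic algorithm classes above concrete, here is a short sketch of a convolution (a sharpening kernel applied with OpenCV's filter2D) followed by a morphological opening that removes small specks. The input file name is an assumption; the kernel values are standard textbook choices.

import cv2
import numpy as np

image = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)   # assumed input

# Convolution: a 3x3 sharpening kernel applied across the image
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype=np.float32)
sharpened = cv2.filter2D(image, -1, sharpen)

# Morphological opening: erosion then dilation, removing small noise blobs
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
opened = cv2.morphologyEx(sharpened, cv2.MORPH_OPEN, kernel)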
Image Acquisition: This is the first step in the machine vision pipeline, where cameras or sensors capture the
image data. The quality of the image acquisition process depends on several factors, including camera
resolution, lighting, and lens selection. Good image acquisition ensures that the data captured is accurate and
represents the scene or object of interest without excessive noise or distortion.
Image Pre-processing: After acquisition, pre-processing is performed to enhance the image for further analysis.
This step includes operations such as:
Image Filtering: Reduces noise and enhances important features like edges or textures.
Image Smoothing: Helps in reducing high-frequency noise to make patterns and objects clearer.
Image Enhancement: Increases contrast or sharpens edges to improve the visibility of critical details.
Normalization: Adjusts the range of pixel intensities to ensure uniform lighting and shading, which is crucial
for consistency in analysis.
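A sketch of the four pre-processing steps above using common OpenCV operations; the input file and parameter values are illustrative assumptions.

import cv2

image = cv2.imread("raw.png", cv2.IMREAD_GRAYSCALE)       # assumed input

filtered = cv2.medianBlur(image, 3)                       # filtering
smoothed = cv2.GaussianBlur(filtered, (5, 5), 0)          # smoothing
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(smoothed)                          # enhancement
normalized = cv2.normalize(enhanced, None, 0, 255,
                           cv2.NORM_MINMAX)               # normalization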
Feature Extraction: After pre-processing, the image is analyzed to identify key features, such as edges,
corners, or blobs. Feature extraction is a critical function in machine vision because it identifies important parts
of an image that will be used for classification, object detection, or pattern recognition. For instance, edge
detection algorithms like Sobel or Canny identify the boundaries between objects, while corner detectors like
Harris or Shi-Tomasi detect points where edges intersect.
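A short sketch of the extractors named above: Canny for edges and Harris for corners. Parameter values are typical but arbitrary, and the input file is assumed.

import cv2
import numpy as np

gray = cv2.imread("part.png", cv2.IMREAD_GRAYSCALE)     # assumed input

edges = cv2.Canny(gray, 100, 200)                       # binary edge map

response = cv2.cornerHarris(np.float32(gray), blockSize=2,
                            ksize=3, k=0.04)
corners = response > 0.01 * response.max()              # strong corners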
Segmentation: In many machine vision applications, an image is divided into meaningful regions or segments,
which represent different objects or parts of objects. Segmentation techniques include thresholding, region
growing, and clustering. For example, thresholding separates objects from the background based on pixel intensity, while clustering groups similar pixels into regions that may correspond to different objects.
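A minimal segmentation sketch: Otsu thresholding separates objects from the background, then connected-components labeling turns the mask into distinct regions. The input file is an assumption.

import cv2

gray = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)    # assumed input
_, mask = cv2.threshold(gray, 0, 255,
                        cv2.THRESH_BINARY + cv2.THRESH_OTSU)
count, labels = cv2.connectedComponents(mask)
print(count - 1, "segmented regions")   # label 0 is the background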
Resolution: Resolution is the amount of detail that an image holds and is typically defined by the number of
pixels in the image (width × height). High-resolution images contain more detail, making it easier to identify
small objects or intricate patterns. However, higher resolution also requires more computational power to
process, so balancing resolution and computational efficiency is crucial in machine vision applications.
Contrast: Contrast refers to the difference in intensity between objects or regions in an image. High contrast
improves the ability to distinguish between different objects or features, making it easier for algorithms to detect
edges, boundaries, and other critical features. Low contrast, on the other hand, makes it difficult to differentiate
between similar regions, leading to poor performance in tasks like object recognition.
Brightness: Brightness is the overall intensity or luminance of the image. Proper brightness ensures that objects
are clearly visible and can be processed by the system. Underexposed images (too dark) and overexposed
images (too bright) can lead to information loss, making it difficult to identify features or objects. Machine
vision systems often include dynamic lighting control to ensure optimal brightness levels.
Sharpness: Sharpness defines how clear and well-defined the edges and details in an image are. Blurry images,
often caused by motion, incorrect focus, or poor lens quality, can reduce the accuracy of edge detection and
object recognition algorithms. Techniques like deblurring or using higher-quality lenses are employed to
maintain image sharpness.
Noise: Noise refers to random variations in pixel intensity, which can obscure important features or add
irrelevant information to the image. Noise is introduced during image acquisition due to factors like sensor
imperfections, low light conditions, or electronic interference. Various image processing techniques, such as
smoothing filters (Gaussian or median filters), are used to reduce noise without affecting important details in the
image.
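A small demonstration of the filters mentioned above on simulated noise: the median filter suppresses salt-and-pepper outliers, while the Gaussian filter suits broadband sensor noise. The noise levels are made up for illustration.

import cv2
import numpy as np

image = np.full((100, 100), 128, dtype=np.uint8)   # flat gray test image
noisy = image.copy()
coords = np.random.rand(100, 100)
noisy[coords < 0.02] = 0          # pepper noise
noisy[coords > 0.98] = 255        # salt noise

median_clean = cv2.medianBlur(noisy, 3)            # removes the outliers
gauss_clean = cv2.GaussianBlur(noisy, (5, 5), 0)   # smears them instead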
Color Information: In certain applications, the color of an object is an important characteristic used for
identification or sorting. Machine vision systems use color spaces like RGB or HSV to represent color
information. Different color models are used depending on the task, with HSV often preferred for object
detection tasks because it separates color information (hue) from brightness, making it easier to handle
variations in lighting.
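A sketch of HSV-based color detection, illustrating why HSV is often preferred: the hue band changes little as lighting varies. The hue/saturation/value limits below (a red band) are illustrative, not universal, and the input file is assumed.

import cv2
import numpy as np

bgr = cv2.imread("parts.png")                     # assumed input
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)

lower = np.array([0, 120, 70])                    # hue, saturation, value
upper = np.array([10, 255, 255])
red_mask = cv2.inRange(hsv, lower, upper)         # 255 where pixel is red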
Dynamic Range: Dynamic range refers to the range of intensity values that a camera or sensor can capture,
from the darkest to the brightest parts of an image. A higher dynamic range allows machine vision systems to
capture details in both very dark and very bright areas of an image. This is particularly important in scenarios
where the lighting conditions are challenging, such as outdoor environments or industrial settings with mixed
lighting.
Applications of Machine Vision System
1. Defect Detection
2. Dimension Measurement
3. Surface Finish Inspection
4. Part Sorting and Classification
5. Assembly Verification
Defect Detection is a critical application of machine vision that involves identifying flaws, imperfections, or
anomalies in products. It is used to ensure product quality, reduce waste, and improve overall manufacturing
efficiency.
Common defects: scratches, cracks, dents, discoloration, and contamination.
Dimension measurement is a critical application of machine vision that involves accurately measuring the size,
shape, and dimensions of objects. It is used in various industries to ensure quality control, improve
manufacturing processes, and reduce waste.
Key Measurements:
Length: The distance between two points along a straight line.
Width: The distance between two opposite sides of an object.
Height: The vertical distance from the base to the top of an object.
Diameter: The length of a straight line passing through the center of a circle and connecting two points on its boundary.
Angle: The measure of the space between two intersecting lines.
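A measurement sketch following the list above: fit a rotated rectangle to the largest contour and convert pixel lengths to millimetres. The calibration factor and input file are assumed to come from a prior calibration step.

import cv2

MM_PER_PIXEL = 0.05                                # assumed calibration

gray = cv2.imread("part.png", cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(gray, 0, 255,
                        cv2.THRESH_BINARY + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
part = max(contours, key=cv2.contourArea)

(cx, cy), (w, h), angle = cv2.minAreaRect(part)    # centre, size, angle
print("length:", max(w, h) * MM_PER_PIXEL, "mm")
print("width: ", min(w, h) * MM_PER_PIXEL, "mm")
print("angle: ", angle, "degrees")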
Surface Finish Inspection in Machine Vision
Surface finish inspection is a critical aspect of quality control in manufacturing, ensuring that products meet
specific standards for smoothness, roughness, and texture. Machine vision systems can automate this process,
providing accurate and consistent results.
Key Parameters:
Roughness: The unevenness or texture of a surface.
Smoothness: The absence of roughness or unevenness.
Texture: The pattern or appearance of a surface.
Defects: Scratches, pits, marks, or other imperfections.
Part Sorting and Classification in Machine Vision
Part sorting and classification is a critical application of machine vision that involves categorizing objects
based on their appearance or characteristics. It is used in various industries to automate tasks, improve
efficiency, and ensure product quality.
Key parameters: shape, size, color, surface texture, and orientation of the parts.
Assembly verification is a critical application of machine vision that involves ensuring that parts are assembled
correctly and according to specified standards. It is used to prevent defects, reduce rework, and improve product
quality.
Key parameters: presence or absence of components, correct component position and orientation, and proper fit of mating parts.
Image Segmentation
Image segmentation is a fundamental task in computer vision that involves partitioning a digital image into
multiple segments or regions. These segments, also known as image objects, are groups of pixels that share
common characteristics, such as color, intensity, or texture. By dividing an image into meaningful components,
segmentation simplifies image analysis and makes it easier to identify objects, boundaries, and structures within the image. In short, image segmentation is the operation of partitioning an image into a collection of connected sets of pixels.
Detect the three basic types of gray-level discontinuities: points, lines, and edges.
• Use image sharpening techniques:
– First-order derivatives produce thicker edges.
– Second-order derivatives have a strong response to fine detail, such as thin lines and isolated points, as well as to noise.
– The Laplacian operation.
• Detection can be done by running a mask through the image.
Point Detection
Steps for point detection:
1. Apply a Laplacian filter to the image to obtain R(x, y).
2. Create a binary image by thresholding |R(x, y)|.
This is used to detect isolated spots in an image. The gray level of an isolated point will be very different from its neighbours. Detection can be accomplished using the following 3×3 mask:
-1 -1 -1
-1  8 -1
-1 -1 -1
The output of the mask operation is usually thresholded. We say that an isolated point has been detected at (x, y) if |R(x, y)| ≥ T, where T is a nonnegative threshold.
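The two steps above in OpenCV: convolve with the Laplacian mask, then threshold |R(x, y)|. Choosing T as a fraction of the maximum response is one common heuristic; the input file is assumed.

import cv2
import numpy as np

gray = cv2.imread("surface.png",
                  cv2.IMREAD_GRAYSCALE).astype(np.float32)

laplacian_mask = np.array([[-1, -1, -1],
                           [-1,  8, -1],
                           [-1, -1, -1]], dtype=np.float32)
R = cv2.filter2D(gray, -1, laplacian_mask)

T = 0.9 * np.abs(R).max()                         # threshold heuristic
points = (np.abs(R) >= T).astype(np.uint8) * 255  # isolated-point mask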
Line Detection
A special mask is needed to detect a special type of line. For example, the horizontal mask has a high response when a line passes through the middle row of the mask. The four standard masks are shown below.
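The four classical 3×3 line-detection masks; each responds most strongly to a one-pixel-wide line in its named direction:

import numpy as np

horizontal = np.array([[-1, -1, -1],
                       [ 2,  2,  2],
                       [-1, -1, -1]])
plus_45    = np.array([[-1, -1,  2],
                       [-1,  2, -1],
                       [ 2, -1, -1]])
vertical   = np.array([[-1,  2, -1],
                       [-1,  2, -1],
                       [-1,  2, -1]])
minus_45   = np.array([[ 2, -1, -1],
                       [-1,  2, -1],
                       [-1, -1,  2]])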
Edge Detection
Edge detection is the approach for segmenting images based on abrupt changes in intensity.
What is an edge? An edge is a set of connected pixels that lie on the boundary between two regions.
An edge is a "local" concept, whereas a region boundary, owing to the way it is defined, is a more global idea.
Edge models:
1. Step edge (ideal edge)
2. Ramp edge (thick edge)
3. Roof edge
Thresholding
When the noise level is small, the method works; otherwise it may not. Noise reduction has to be done before choosing the threshold value.
Region-Based Segmentation
Two basic approaches:
1. Region Growing
2. Region splitting and merging
Region Growing
Start with a set of "seed" points; append to each seed those neighbouring pixels that have similar properties, such as specific ranges of gray level.
Region Splitting and Merging
Iteratively divide a region into smaller regions until every region satisfies the homogeneity predicate (is TRUE); then merge adjacent regions as long as the resulting merged region is still TRUE.
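A small sketch of the region-growing idea described above: breadth-first expansion from a seed pixel, absorbing 4-connected neighbours whose gray level lies within a tolerance of the seed's. The tolerance value is illustrative.

from collections import deque
import numpy as np

def region_grow(img, seed, tol=10):
    h, w = img.shape
    region = np.zeros((h, w), dtype=bool)
    seed_val = int(img[seed])
    queue = deque([seed])
    region[seed] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not region[ny, nx]
                    and abs(int(img[ny, nx]) - seed_val) <= tol):
                region[ny, nx] = True
                queue.append((ny, nx))
    return region   # boolean mask of the grown region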
Data Reduction
Data reduction is the process of reducing the amount of data in an image while preserving essential information.
This is often necessary to improve processing efficiency and reduce storage requirements.
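Two common data-reduction steps sketched with OpenCV: cropping to a region of interest and pyramid downsampling. Both shrink the data while preserving the features of interest; the ROI coordinates and input file are assumptions.

import cv2

image = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)   # assumed input

roi = image[100:400, 200:600]    # keep only the region of interest
reduced = cv2.pyrDown(roi)       # halve each dimension (Gaussian pyramid)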
Important Questions
3-mark questions:
1. What is a machine vision system?
2. Define: machine vision.
3. List out the basic components of a machine vision system.
4. Define: human visual system.
5. State the differences between the human visual system and the active vision system.
6. List out the types of machine vision systems.
7. Differentiate between hardware and software.
8. What is an image function?
9. What characteristics are used in a machine vision system?
10. What is segmentation?
11. What is data reduction?
12. What is feature extraction?
13. Define: edge detection.
14. State the application of machine vision in the inspection of parts.

8-mark questions:
1. Briefly explain the image characteristics.
2. Briefly explain the image function.
3. Briefly explain an application of machine vision.
4. Explain an application of machine vision such as the inspection of parts.
5. Briefly explain the hardware and algorithms.
6. Briefly explain data reduction and edge detection.
7. Briefly explain image recognition and decisions in a machine vision system.
8. Explain the human visual system.
9. Briefly explain the active vision system.
10. Briefly explain the machine vision components.

15-mark questions:
1. Explain the machine vision components.
2. Briefly explain the image function and characteristics.
3. Explain: (i) machine learning, (ii) data reduction, (iii) feature extraction, (iv) edge detection.
4. Briefly explain an application of machine vision such as the inspection of parts.
5. Briefly explain image recognition and decisions.