Embedded Vision - Mercury Learning and Information - E-Book

Description

Learn the fundamentals and applications of embedded vision systems, covering various industries and practical examples.

Key Features

  • Comprehensive guide to embedded vision systems
  • Detailed coverage of design and implementation across industries
  • Practical real-time examples for hands-on learning

Book Description

Embedded vision integrates computer vision into machines using algorithms to interpret images or videos. This book serves as an introductory guide for designing vision-enabled embedded products, with applications in AI, machine learning, industrial, medical, automotive, and more. It covers hardware architecture, software algorithms, applications, and advancements in cameras, processors, and sensors.

The course begins with an overview of embedded vision, followed by industrial and medical vision applications. It then delves into video analytics, digital image processing, and camera-image sensors. Subsequent chapters cover embedded vision processors, computer vision, and AI integration. The final chapter presents real-time vision-based examples.

Understanding these concepts is vital for developing advanced vision-enabled machines. This book takes readers from the basics to advanced topics, blending theoretical knowledge with practical applications. It is an essential resource for mastering embedded vision technology across various industries.

What you will learn

  • Understand the basics of embedded vision systems
  • Design and implement vision processors
  • Apply digital image processing techniques
  • Utilize AI in vision systems
  • Develop real-time vision applications
  • Integrate vision sensors and cameras effectively

Who this book is for

The ideal audience for this book includes engineers, developers, and researchers working in the field of embedded vision systems. A basic understanding of computer vision and digital image processing is recommended.

The e-book can be read in any app that supports the following formats:

EPUB
MOBI

Page count: 887

Publication year: 2024




EMBEDDED VISION

An Introduction

S. R. Vijayalakshmi, PhD
S. Muruganand, PhD

Copyright ©2020 by MERCURY LEARNING AND INFORMATION LLC. All rights reserved.

Original title and copyright: Embedded Vision. Copyright ©2019 by Overseas Press India Private Limited. All rights reserved.

This publication, portions of it, or any accompanying software may not be reproduced in any way, stored in a retrieval system of any type, or transmitted by any means, media, electronic display or mechanical display, including, but not limited to, photocopy, recording, Internet postings, or scanning, without prior permission in writing from the publisher.

Publisher: David Pallai
MERCURY LEARNING AND INFORMATION
22841 Quicksilver Drive
Dulles, VA 20166
[email protected]
(800) 232-0223

S. R. Vijayalakshmi and S. Muruganand. Embedded Vision: An Introduction.
ISBN: 978-1-68392-457-9

The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a means to distinguish their products. All brand names and product names mentioned in this book are trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of service marks or trademarks, etc. is not an attempt to infringe on the property of others.

Library of Congress Control Number: 2019937247

This book is printed on acid-free paper in the United States of America.

Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc.

For additional information, please contact the Customer Service Dept. at (800) 232-0223 (toll free).

All of our titles are available in digital format at academiccourseware.com and other digital vendors. Companion disc files for this title are available by contacting [email protected]. The sole obligation of MERCURY LEARNING AND INFORMATION to the purchaser is to replace the disc, based on defective materials or faulty workmanship, but not based on the operation or functionality of the product.

Contents

Preface

Chapter 1 Embedded Vision

1.1 Introduction to Embedded Vision

1.2 Design of an Embedded Vision System

Characteristics of Embedded Vision System Boards Versus Standard Vision System Boards

Benefits of Embedded Vision System Boards

Processors for Embedded Vision

High Performance Embedded CPU

Application Specific Standard Product (ASSP) in Combination with a CPU

General Purpose CPUs

Graphics Processing Units with CPU

Digital Signal Processors with Accelerator(s) and a CPU

Field Programmable Gate Arrays (FPGAs) with a CPU

Mobile “Application Processor”

Cameras/Image Sensors for Embedded Vision

Other Semiconductor Devices for Embedded Vision

Memory

Networking and Bus Interfaces

1.3 Components in a Typical Vision System

Vision Processing Algorithms

Embedded Vision Challenges

1.4 Applications for Embedded Vision

Swimming Pool Safety System

Object Detection

Video Surveillance

Gesture Recognition

Simultaneous Localization and Mapping (SLAM)

Automatic Driver Assistance System (ADAS)

Game Controller

Face Recognition for Advertising Research

Mobile Phone Skin Cancer Detection

Gesture Recognition for Car Safety

Industrial Applications for Embedded Vision

Medical Applications for Embedded Vision

Automotive Applications for Embedded Vision

Security Applications for Embedded Vision

Consumer Applications for Embedded Vision

Machine Learning in Embedded Vision Applications

1.5 Industrial Automation and Embedded Vision: A Powerful Combination

Inventory Tracking

Automated Assembly

Automated Inspection

Workplace Safety

Depth Sensing

1.6 Development Tools for Embedded Vision

Both General Purpose and Vendor Specific Tools

Personal Computers

OpenCV

Heterogeneous Software Development in an Integrated Development Environment

Summary

Reference

Learning Outcomes

Further Reading

Chapter 2 Industrial Vision

2.1 Introduction to Industrial Vision Systems

PC-Based Vision Systems

Industrial Cameras

High-Speed Industrial Cameras

Smart Cameras

2.2 Classification of Industrial Vision Applications

Dimensional Quality

Surface Quality

Structural Quality

Operational Quality

2.3 3D Industrial Vision

Automated Inspection

Robotic Guidance

3D Imaging

3D Imaging Methods

3D Inspection

3D Processing

3D Robot Vision

High-Speed Imaging

High-Speed Cameras

Line Scan Imaging

Capture and Storage

High-Speed Inspection for Product Defects

Labels and Marking

Web Inspection

High-Speed Troubleshooting

Line Scan Technology

Contact Image Sensors

Lenses

Image Processing

Line Scan Inspection

Tracking and Traceability

Serialization

Direct Part Marking

Product Conformity

Systems Integration Challenges

2.4 Industrial Vision Measurement

Character Recognition, Code Reading, and Verification

Making Measurements

Pattern Matching

3D Pattern Matching

Preparing for Measurement

Industrial Control

Development Approaches and Environments

Development Software Tools for Industrial Vision Systems

Image Processing and Analysis Tools

Summary

References

Learning Outcomes

Further Reading

Chapter 3 Medical Vision

3.1 Introduction to Medical Vision

Advantages of Digital Processing for Medical Applications

Digital Image Processing Requirements for Medical Applications

Advanced Digital Image Processing Techniques in Medical Vision

Image Processing Systems for Medical Applications

Stereoscopic Endoscope

3.2 From Images to Information in Medical Vision

Magnifying Minute Variations

Gesture and Security Enhancements

3.3 Mathematics, Algorithms in Medical Imaging

Artificial Intelligence (AI)

Computer-Aided Diagnostic Processing

Vision Algorithms for Biomedical

Real-Time Radiography

Image Compression Technique for Telemedicine

Region of Interest

Structure Sensitive Adaptive Contrast Enhancement Methods

LSPIHT Algorithm for ECG Data Compression and Transmission

Retrieval of Medical Images in a PACs

Digital Signature Realization Process of DICOM Medical Images

Convolutional Neural Networks (CNNs) in Medical Image Analysis

Deep Learning and Big Data

3.4 Machine Learning in Medical Image Analysis

Convolutional Neural Networks

Convolution Layer

Rectified Linear Unit (ReLU) Layer

Pooling Layer

Fully Connected Layer

Feature Computation

Feature Selection

Training and Testing: The Learning Process

Example of Machine Learning with Use of Cross Validation

Summary

References

Learning Outcomes

Further Reading

Chapter 4 Video Analytics

4.1 Definition of Video Analytics

Applications of Video Analytics

Image Analysis Software

Security Center Integration

Video Analytics for Perimeter Detection

Video Analytics for People Counting

Traffic Monitoring

Auto Tracking Cameras for Facial Recognition

Left Object Detection

4.2 Video Analytics Algorithms

Algorithm Example: Lens Distortion Correction

Dense Optical Flow Algorithm

Camera Performance Affecting Video Analytics

Video Imaging Techniques

4.3 Machine Learning in Embedded Vision Applications

Types of Machine-Learning Algorithms

Implementing Embedded Vision and Machine Learning

Embedded Computers Make Inroads to Vision Applications

4.4 Examples for Machine Learning

1. Convolutional Neural Networks for Autonomous Cars

2. CNN Technology Enablers

3. Smart Fashion AI Architecture

4. Teaching Computers to Recognize Cats

Summary

References

Learning Outcomes

Further Reading

Chapter 5 Digital Image Processing

5.1 Image Processing Concepts for Vision Systems

Image

Signal

Systems

5.2 Image Manipulations

Image Sharpening and Restoration

Histograms

Transformation

Edge Detection

Vertical Direction

Horizontal Direction

Sobel Operator

Robinson Compass Mask

Kirsch Compass Mask

Laplacian Operator

Positive Laplacian Operator

Negative Laplacian Operator

5.3 Analyzing an Image

Color Spaces

JPEG Compression

Pattern Matching

Template Matching

Template Matching Approaches

Motion Tracking and Occlusion Handling

Template-Matching Techniques

Advanced Methods

Advantage

Enhancing The Accuracy of Template Matching

5.4 Image-Processing Steps for Vision System

Scanning and Image Digitalization

Image Preprocessing

Image Segmentation On Object

Description of Objects

Classification of Objects

Summary

References

Learning Outcomes

Further Reading

Chapter 6 Camera—Image Sensor

6.1 History of Photography

Image Formation on Cameras

Image Formation on Analog Cameras

Image Formation on Digital Cameras

Camera Types and Their Advantages: Analog Versus Digital Cameras

Interlaced Versus Progressive Scan Cameras

Area Scan Versus Line Scan Cameras

Time Delay and Integration (TDI) Versus Traditional Line Scan Cameras

Camera Mechanism

Perspective Transformation

Pixel

6.2 Camera Sensor for Embedded Vision Applications

Charge Coupled Device (CCD) Sensor Construction

Complementary Metal Oxide Semiconductor (CMOS) Sensor Construction

Sensor Features

Electronic Shutter

Sensor Taps

Spectral Properties of Monochrome and Color Cameras

Camera Resolution for Improved Imaging System Performance

6.3 Zooming, Camera Interface, and Selection

Optical Zoom

Digital Zoom

Spatial Resolution

Gray-Level Resolution

Capture Boards

FireWire IEEE 1394/IIDC DCAM Standard

Camera Link

GigE Vision Standard

USB—Universal Serial Bus

CoaXPress

Camera Software

Camera and Lens Selection for a Vision Project

6.4 Thermal-Imaging Camera

Summary

References

Learning Outcomes

Further Reading

Chapter 7 Embedded Vision Processors and Sensors

7.1 Vision Processors

Hardware Platforms for Embedded Vision, Image Processing, and Deep Learning

Requirements of Computers for Embedded Vision Application

Processor Configuration Selection

7.2 Embedded Vision Processors

Convolution Neural Networks (CNN) Engine

Cluster Shared Memory

Streaming Transfer Unit

Bus Interface

Complete Suite of Development Tools

Intel Movidius Myriad X Vision Processors

Matrox RadientPro CL

Single-Board Computer Raspberry Pi

Nvidia Jetson TX1

Nvidia Jetson TK1

BeagleBoard: BeagleBone Black

Orange Pi

ODROID-C2

Banana Pi

CEVA–XM4 Imaging and Vision Processor

MAX10 FPGA

Vision DSPs for Imaging and Vision

Vision Q6 DSP Features and Benefits

Vision P6 DSP Features and Benefits

Vision P5 DSP Features and Benefits

VFPU

7.3 Sensors for Applications

Sensors for Industrial Applications

Sensors for Aviation and Aerospace

Sensors for the Automobile Industry

Agricultural Sensors

Smart Sensors

7.4 MEMS

NEMS

Biosensors

Medical Sensors

Nuclear Sensors

Sensors for Deep-Sea Applications

Sensors for Security Applications

Selection Criteria for Sensor

Summary

References

Learning Outcomes

Further Reading

Chapter 8 Computer Vision

8.1 Embedded Vision and Other Technologies

Robot Vision

Signal Processing

Image Processing

Pattern Recognition and Machine Learning

Machine Vision

Computer Graphics

Artificial Intelligence

Color Processing

Video Processing

Computer Vision Versus Machine Vision

Computer Vision Versus Image Processing

The Difference Between Computer Vision, Image Processing, and Machine Learning

8.2 Tasks and Algorithms in Computer Vision

Image Acquisition

Image Processing

Image Analysis and Understanding

Algorithms

Feature Extraction

Feature Extraction Algorithms

Image Classification

Object Detection

Object Tracking

Semantic Segmentation

Instance Segmentation

Object Recognition Algorithms

SIFT: Scale Invariant Feature Transforms Algorithm

SURF: Speed up Robust Features Algorithm

ORB: Oriented Fast and Rotated Brief Algorithm

Optical Flow and Point Tracking

Commercial Computer Vision Software Providers

8.3 Applications of Computer Vision

Packages and Frameworks for Computer Vision

8.4 Robotic Vision

Mars Path Finder

Cobots Versus Industrial Robots

Machine Learning in Robots

Sensors in Robotic Vision

Artificial Intelligence Robots

Robotic Vision Testing in the Automotive Industry

8.5 Robotic Testing in the Aviation Industry

Robotic Testing in the Electronics Industry

The Use of Drones and Robots in Agriculture

Underwater Robots

Autonomous Security Robots

Summary

References

Learning Outcomes

Further Reading

Chapter 9 Artificial Intelligence for Embedded Vision

9.1 Embedded Vision-based Artificial Intelligence

AI-Based Solution for Personalized Styling and Shopping

AI Learning Algorithms

Algorithm Implementation Options

AI Embedded in Cameras

9.2 Artificial Vision

AI for Industries

9.3 3D-Imaging Technologies: Stereo Vision, Structured Light, Laser Triangulation, and ToF

1. Stereo Vision

2. Structured Light

3. Laser Triangulation

4. Time-of-Flight Camera for 3D Imaging

Theory of Operation

Working of ToF

Comparison of 3D-Imaging Technologies

Structured-Light Versus ToF

Applications of ToF 3D-Imaging Technology

Gesture Applications

Non-Gesture Applications

Time of Flight Sensor Advantages

9.4 Safety and Security Considerations in Embedded Vision Applications

Architecture Case Study

Choosing Embedded Vision Software

Summary

References

Learning Outcomes

Further Readings

Chapter 10 Vision-Based Real-Time Examples

10.1 Algorithms for Embedded Vision

Three Classes

Local Operators

Global Transformations

10.2 Methods and Models in Vision Systems

1. Shapes and Shape Models

2. Active Shape Model (ASM)

3. Clustering Algorithms

4. Thinning Morphological Operation

5. Hough Transform (HT)

10.3 Real-Time Examples

1. Embedded-Vision-Based Measurement

2. Defect Detection on Hardwood Logs Using Laser Scanning

3. Reconstruction of Monocular Fiberscopic Images

4. Vision Technologies for Empty Bottle Inspection Systems

5. Unmanned Rotorcraft for Ground Target Following Using Embedded Vision

6. Automatic Axle-Lifting System Design

7. Object Tracking Using an Address Event Vision Sensor

8. Using FPGA as an SoC Processor in ADAS Design

9. Diagnostic Imaging

10. Electronic Pill

10.4 Research and Development in Vision Systems

Robotic Vision

Stereo Vision

Vision Measurement

Industrial Vision

Automobile Industry

Medical Vision

Embedded Vision System

Summary

References

Learning Outcomes

Further Readings

Appendix

Embedded Vision Glossary

Index

Preface

Embedded Vision (EV) is an emerging technology in the electronics industry. It provides visual intelligence to automated embedded products by combining embedded systems and computer vision, integrating a camera with a processing board. Embedded vision integrates computer vision into machines that use algorithms to decode meaning from observed images or video. It has a wide range of potential applications: industrial, medical, automotive (including driverless cars), drones, smartphones, aerospace, defense, agriculture, consumer, surveillance, robotics, and security. It combines the algorithms of the computer vision field with the hardware and software requirements of the embedded systems field to give visual capability to end products.

This book is an essential guide for anyone interested in designing machines that can see and sense, and in building vision-enabled embedded products. It covers a large number of topics encountered in hardware architecture, software algorithms, applications, and advancements in cameras, processors, and sensors in the field of embedded vision. Embedded vision systems are built for special applications, whereas PC-based systems are usually intended for general image processing.

Chapter 1 discusses introductory points, the design of an embedded vision system, characteristics of an embedded vision system board, processors and cameras for embedded vision, components in a typical vision system and embedded vision challenges. Application areas of embedded vision are analyzed. Development tools for embedded vision are also introduced in this chapter.

Chapter 2 discusses industrial vision. PC-based vision systems, industrial cameras, high-speed industrial cameras, and smart cameras are discussed in this chapter. Industrial vision applications are classified as dimensional quality, surface quality, structural quality, and operational quality inspection. 3D imaging methods, 3D inspection, 3D processing, 3D robot vision, and capture and storage are analyzed under the heading of 3D industrial vision. 3D pattern matching, development approaches, development software tools, and image processing and analysis tools are also discussed.

Chapter 3 covers medical vision techniques. Image processing systems for medical applications, from images to information in medical vision, mathematics and algorithms in medical imaging, and machine learning in medical image analysis are discussed. The stereoscopic endoscope, CT, ultrasonic imaging systems, MRI, X-ray, PACS, CIA, FIA, ophthalmology, indocyanine green, automatic classification of cancerous cells, facial recognition to determine pain level, automatic detection of patient activity, peripheral vein imaging, the stereoscopic microscope, CAD processing, radiography, and telemedicine are covered. CNNs, machine learning, deep learning, and big data are also discussed.

Chapter 4 discusses video analytics. Definitions, applications, and algorithms of video analytics and video imaging are covered. Different types of machine learning algorithms are described, with examples such as CNNs for autonomous cars, a smart fashion AI architecture, and teaching computers to recognize cats.

Chapter 5 discusses digital image processing. The image processing concept, image manipulations, image analysis, and the image processing steps for an embedded vision system are covered. Image sharpening, histograms, image transformation, image enhancement, convolution, blurring, and edge detection are a few of the image manipulations discussed in this chapter. The frequency domain, transformations, filters, color spaces, JPEG compression, pattern matching, and template matching used to analyze images are discussed.
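The convolution-based edge detection mentioned above can be previewed with a minimal sketch. This is a pure-Python illustration of the Sobel operator; `convolve2d` is a helper written for this example (a real system would use an optimized image-processing library), and the 5x5 test frame is synthetic.

```python
# Minimal illustration of convolution-based edge detection with the Sobel
# operator. Pure Python, no third-party libraries; images are lists of rows.

def convolve2d(image, kernel):
    """Valid-mode 2D correlation of a grayscale image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * kernel[j][i]
            row.append(acc)
        out.append(row)
    return out

# Sobel kernels for horizontal (x) and vertical (y) intensity gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

if __name__ == "__main__":
    # A synthetic 5x5 frame with a vertical dark-to-bright step edge.
    frame = [[0, 0, 255, 255, 255]] * 5
    gx = convolve2d(frame, SOBEL_X)
    gy = convolve2d(frame, SOBEL_Y)
    print(gx[1][1], gy[1][1])  # strong horizontal gradient, zero vertical
```

The step edge produces a large response from `SOBEL_X` and none from `SOBEL_Y`, which is exactly how the vertical/horizontal edge-detection directions listed in the Chapter 5 contents separate edge orientations.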

Chapter 6 covers the history of photography, camera sensors for embedded vision applications, zooming, camera interfaces, and camera selection for vision projects.

Chapter 7 covers embedded vision processors and sensors. This chapter deals with the vision processor selection, embedded vision processor boards, different sensors based on the applications, and MEMS. The options for processor configuration, embedded vision processor boards and sensors suited for different applications are thoroughly discussed.

Chapter 8 discusses computer vision. This chapter compares various existing technologies with embedded vision. Tasks and algorithms in computer vision such as feature extraction, image classification, object detection, object tracking, semantic segmentation, instance segmentation, object recognition algorithms, optical flow, and point tracking are discussed. Commercial computer vision software providers are listed and the applications of computer vision and robotic vision are discussed.

Chapter 9 discusses the use of artificial intelligence in embedded vision. Embedded-vision-based artificial intelligence, artificial vision, 3D imaging technologies, and safety and security considerations in EV applications are covered. AI-based solutions for personalized styling and shopping, AI embedded in cameras, and algorithm implementation options are discussed. Stereo vision, structured light, laser triangulation, and time-of-flight techniques for 3D imaging are compared.

Chapter 10 discusses vision based, real time examples. Algorithms for embedded vision, and methods and models in vision systems are covered. Recent research, applications, and developments in the field of embedded vision systems are analyzed.

CHAPTER 1

EMBEDDED VISION

Overview

Embedded vision is the integration of vision in machines that use algorithms to decode meaning from observing images or videos. Embedded vision systems use embedded boards, sensors, cameras, and algorithms to extract information. Application areas are many, and include automobiles, medical, industry, domestic, and security systems.

Learning Objectives

After reading this chapter, the reader will be able to

■ differentiate between embedded vision and computer vision,

■ define embedded vision system,

■ understand embedded vision system design requirements,

■ understand application areas of embedded vision, and

■ use development tools for embedded vision.

1.1 INTRODUCTION TO EMBEDDED VISION

Embedded vision refers to the practical use of computer vision in machines that understand their environment through visual means. Computer vision is the use of digital processing and intelligent algorithms to interpret meaning from images or video. Due to the emergence of very powerful, low-cost, and energy efficient processors, it has become possible to incorporate practical computer vision capabilities into embedded systems, mobile devices, PCs, and the cloud. Embedded vision is the integration of computer vision in machines that use algorithms to decode meaning from observing pixel patterns in images or video. The computer vision field is developing rapidly, along with advances in silicon and, more recently, purpose designed embedded vision processors.

Embedded vision is the extraction of meaning from visual inputs, creating “machines that see and understand.” Embedded vision is now spreading into a very wide range of applications, including automotive driver assistance, digital signage, entertainment, healthcare, and education. Embedded vision is developing across numerous fields including autonomous medical care, agriculture technology, search and rescue, and repair in conditions dangerous to humans. Applications include autonomous machines of many types such as embedded systems, driverless cars, drones, smart phones, and rescue and bomb disarming robots. The term embedded vision refers to the use of computer vision technology in embedded systems. Stated another way, embedded vision refers to embedded systems that extract meaning from visual inputs. Similar to the way that wireless communication has become pervasive over the past 10 years, embedded vision technology will be very widely deployed in the next 10 years.

Computer (or machine) vision is the field of research that studies the acquisition, processing, analysis, and understanding of real-world visual information. It is a discipline that was established in the 1960s, but has made recent rapid advances due to improvements both in algorithms and in available computing technology. Embedded systems are computer systems with dedicated functions that are embedded within other devices, and are typically constrained by cost and power consumption. Some examples of devices using embedded systems include mobile phones, set top boxes, automobiles, and home appliances. Embedded vision is an innovative technology in which computer vision algorithms are incorporated into embedded devices to create practical and widely deployable applications using visual data. This field is rapidly expanding into emerging high-volume consumer applications such as home surveillance, games, automotive safety, smart glasses, and augmented reality.

With the emergence of increasingly capable processors, it’s becoming practical to incorporate computer vision capabilities into a wide range of embedded systems, enabling them to analyze their environments via video inputs. Products like game controllers and driver assistance systems are raising awareness of the incredible potential of embedded vision technology. As a result, many embedded system designers are beginning to think about implementing embedded vision capabilities. It’s clear that embedded vision technology can bring huge value to a vast range of applications. Two examples are vision-based driver assistance systems, intended to help prevent motor vehicle accidents, and swimming pool safety systems, which help prevent swimmers from drowning.
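As an illustration of how such systems extract meaning from video, the sketch below implements frame differencing, one of the simplest building blocks behind surveillance- and safety-style motion detection. It is pure Python on tiny synthetic frames; the function names and threshold values are choices made for this example, not taken from any particular product.

```python
# Sketch of one of the simplest ways a vision system extracts meaning from
# pixels: frame differencing, which flags regions that changed between two
# frames. This is the first step in many motion/intrusion detectors; a real
# system would run it on live camera frames rather than synthetic ones.

def motion_mask(prev, curr, threshold=30):
    """Return a binary mask: 1 where pixel brightness changed noticeably."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
        for prow, crow in zip(prev, curr)
    ]

def motion_detected(prev, curr, threshold=30, min_pixels=2):
    """Declare motion when enough pixels changed between the two frames."""
    mask = motion_mask(prev, curr, threshold)
    return sum(map(sum, mask)) >= min_pixels

if __name__ == "__main__":
    background = [[10, 10, 10, 10]] * 4
    # An "object" (bright blob) enters the lower-right corner of the frame.
    frame = [[10, 10, 10, 10],
             [10, 10, 10, 10],
             [10, 10, 200, 200],
             [10, 10, 200, 200]]
    print(motion_detected(background, frame))  # True
```

Real products layer far more on top of this (background modeling, object classification, tracking), but the principle of turning raw pixel changes into a decision is the same.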

The term embedded vision implies a hybrid of two technologies, embedded systems and computer vision. An embedded system is a microprocessor-based system that isn’t a general-purpose computer, whereas computer vision refers to the use of digital processing and intelligent algorithms to interpret meaning from images or video. Most commonly defined, an embedded vision system is any microprocessor-based system with image sensor functionality that isn’t a standard personal computer. Tablets and smart phones fall into this category, as well as more unusual devices such as advanced medical diagnosis instruments and robots with object recognition capabilities. So, to put it simply, embedded vision refers to machines that understand their environment through visual means.

Embedded vision processors are now being developed by electronics companies to make computer vision lower in cost, lower in power, and ready for smaller, more mobile devices. Embedded devices and coprocessing chips can be connected to neural networks or neural-net processors to add efficient computer vision to machine learning. Two main trends drive embedded vision: the miniaturization of PCs and of cameras, and the possibility of producing vision systems affordably for highly specific applications. Systems of this kind are referred to as embedded vision systems.

Visual inputs are the richest source of sensor information. For more than 50 years, scientists have tried to understand imaging and have developed algorithms allowing computers to see with computer vision applications. The first real commercial applications, referred to as machine vision, analyzed fast-moving objects to inspect products and detect defects. Due to improving processing power, lower power consumption, better image sensors, and better computer algorithms, vision has been elevated to a much higher level. Combining embedded systems with computer vision results in embedded vision systems. Embedded vision blocks are shown in Figures 1.1a and 1.1b.

Initially, embedded vision technology was found in complex, expensive systems, for example a surgical robot for hair transplantation or quality control inspection systems for manufacturing. Like wireless communication, embedded vision requires lots of processing power, particularly as applications increasingly adopt high-resolution cameras and make use of multiple cameras. Providing that processing power at a cost low enough to enable mass adoption is a big challenge. This challenge is multiplied by the fact that embedded vision applications require a high degree of programmability. In wireless applications algorithms don’t vary dramatically from one cell phone handset to another, but in embedded vision applications there are great opportunities to get better results and enable valuable features through unique algorithms.

FIGURE 1.1A. Embedded vision system blocks.

FIGURE 1.1B. Embedded vision block diagram.

With embedded vision, the industry is entering a “virtuous circle” of the sort that has characterized many other digital signal processing application domains. Although there are few chips dedicated to embedded vision applications today, these applications are increasingly adopting high performance, cost effective processing chips developed for other applications, including DSPs, CPUs, FPGAs, and GPUs. As these chips continue to deliver more programmable performance per watt, they will enable the creation of more high volume embedded vision products. Those high-volume applications, in turn, will attract more attention from silicon providers, who will deliver even better performance, efficiency, and programmability.

1.2 DESIGN OF AN EMBEDDED VISION SYSTEM

An embedded vision system consists, for example, of a camera (a so-called board-level camera) connected to a processing board, as shown in Figure 1.2. Processing boards take over the tasks of the PC in the classic machine vision setup. As processing boards are much cheaper than classic industrial PCs, vision systems can become smaller and more cost effective. The interfaces for embedded vision systems are primarily USB or LVDS (low-voltage differential signaling).

As with embedded systems, popular single-board computers (SBCs), such as the Raspberry Pi, are available on the market for embedded vision product development. As Figure 1.3 shows, the Raspberry Pi is a minicomputer with established interfaces that offers a similar range of features as a classic PC or laptop. Embedded vision solutions can also be implemented with so-called systems on modules (SoMs) or computers on modules (CoMs). These modules represent a computing unit. To adapt the desired interfaces to the respective application, an individual carrier board is needed. This is connected to the SoM via specific connectors and can be designed and manufactured relatively simply. The SoMs or CoMs (or the entire system) are cost effective on the one hand, since they are available off the shelf, while on the other hand they can be individually customized through the carrier board. For large manufactured quantities, individual processing boards are a good idea.
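To make the camera-plus-processing-board structure concrete, here is a minimal sketch of the acquire-and-process loop such a system runs. The camera below is a stub that yields synthetic frames; on real hardware the frames would come from a USB or LVDS camera through a driver stack (for example V4L2 or an OpenCV `VideoCapture`). All names and the trivial "algorithm" are illustrative choices for this sketch.

```python
# Structural sketch of the capture-and-process loop an embedded vision
# system runs on a processing board: acquire a frame, run a vision
# algorithm, collect the result. The camera is a stub generator that
# yields synthetic grayscale frames as lists of pixel rows.

def stub_camera(num_frames=3, width=4, height=3):
    """Hypothetical camera: yields synthetic frames in place of hardware."""
    for n in range(num_frames):
        yield [[(n * 40 + x * 10) % 256 for x in range(width)]
               for _ in range(height)]

def mean_brightness(frame):
    """A trivial stand-in for a vision algorithm: average pixel intensity."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)

def run_pipeline(camera):
    """Acquire -> process -> record, frame by frame."""
    results = []
    for frame in camera:
        results.append(mean_brightness(frame))
    return results

if __name__ == "__main__":
    print(run_pipeline(stub_camera()))  # one brightness value per frame
```

Whether the processing board is an SBC, a SoM on a carrier board, or a custom design, the software keeps this same shape; only the frame source and the algorithm grow in sophistication.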

FIGURE 1.2. Design of embedded vision system.

FIGURE 1.3. Embedded system boards.

All modules, single-board computers, and SoMs are based on a system on chip (SoC). This is a component on which the processor(s), controllers, memory modules, power management, and other components are integrated on a single chip. Due to these efficient components, the SoCs, embedded vision systems have only recently become available in such a small size and at such a low cost.

Embedded vision is the technology of choice for many applications, and accordingly the design requirements are widely diversified. Two interface technologies are commonly offered for embedded vision systems: USB3 Vision for easy integration and LVDS for a lean system design. USB 3.0 is the right interface for a simple plug-and-play camera connection and is ideal for connecting cameras to single-board computers. It allows stable data transfer with a bandwidth of up to 350 MB/s. An LVDS-based interface allows a direct camera connection to processing boards and thus also to on-board logic modules such as FPGAs (field programmable gate arrays) or comparable components. This enables a lean system design and benefits from a direct board-to-board connection and data transfer. The interface is therefore ideal for connecting to a SoM on a carrier/adapter board or to an individually developed processor unit. It allows stable, reliable data transfer with a bandwidth of up to 252 MB/s.
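As a rough feasibility check, these bandwidth figures translate directly into achievable frame rates. A minimal sketch of the arithmetic (the 350 MB/s and 252 MB/s figures are the ones quoted above; protocol overhead is ignored, and the 1080p monochrome format is an illustrative choice):

```python
def max_frame_rate(width, height, bytes_per_pixel, bandwidth_mb_s):
    """Upper bound on raw frames per second over a video link,
    ignoring protocol overhead and blanking intervals."""
    frame_bytes = width * height * bytes_per_pixel
    return (bandwidth_mb_s * 1_000_000) / frame_bytes

# 8-bit monochrome 1920x1080 stream over USB 3.0 (~350 MB/s usable)
usb_fps = max_frame_rate(1920, 1080, 1, 350)
# The same stream over the LVDS link (~252 MB/s)
lvds_fps = max_frame_rate(1920, 1080, 1, 252)
print(round(usb_fps), round(lvds_fps))  # 169 122
```

At these rates either link comfortably carries a 60 fps 1080p monochrome stream, while color formats or higher resolutions eat into the headroom quickly.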

Characteristics of Embedded Vision System Boards versus Standard Vision System Boards

Most of the previously mentioned single board computers and SoMs do not include the x86 family processors common in standard PCs. Rather, the CPUs are often based on the ARM architecture. The open source Linux operating system is widely used as an operating system in the world of ARM processors. For Linux, there are a large number of open source application programs, as well as numerous freely available program libraries. Increasingly, however, x86-based single board computers are also spreading. A consistently important criterion for the computer is the space available for the embedded system.

For the software developer, program development for an embedded system differs from that for a standard PC. As a rule, the target system does not provide a user interface suitable for programming. The developer must either connect to the embedded system via an appropriate interface, if available (e.g., a network interface), or develop the software on a standard PC and then transfer it to the target system. When developing the software, it should be noted that the hardware concept of an embedded system is oriented to a specific application and thus differs significantly from the universally usable PC. However, the boundary between embedded and desktop computer systems is sometimes difficult to draw. Consider the mobile phone, which on the one hand has many features of an embedded system (ARM-based, single-board construction), but on the other hand can cope with very different tasks and is therefore a universal computer.

Benefits of Embedded Vision System Boards

A single-board computer is often a good choice. It is a standard product: a small, compact computer that is easy to use, which is also helpful for developers who have had little to do with embedded vision. However, a single-board computer contains unused components and thus generally does not allow the leanest system configuration, so it is best suited to small and medium quantities. The leanest setup is obtained with a customized system, though at the cost of higher integration effort; a customized system is therefore suitable for large unit numbers. At a glance, the benefits of embedded vision system boards are:

■ Lean system design

■ Light weight

■ Cost-effective, because there is no unnecessary hardware

■ Lower manufacturing costs

■ Lower energy consumption

■ Small footprint

Processors for Embedded Vision

This technology category includes any device that executes vision algorithms or vision system control software. The applications represent distinctly different types of processor architectures for embedded vision, and each has advantages and trade-offs that depend on the workload. For this reason, many devices combine multiple processor types into a heterogeneous computing environment, often integrated into a single semiconductor component. In addition, a processor can be accelerated by dedicated hardware that improves performance on computer vision algorithms.

Vision algorithms typically require high compute performance. And, of course, embedded systems of all kinds are usually required to fit into tight cost and power consumption envelopes. In other digital signal processing application domains, such as digital wireless communications, chip designers achieve this challenging combination of high performance, low cost, and low power by using specialized coprocessors and accelerators to implement the most demanding processing tasks in the application. These coprocessors and accelerators are typically not programmable by the chip user, however. This trade-off is often acceptable in wireless applications, where standards mean that there is strong commonality among algorithms used by different equipment designers.

In vision applications, however, there are no standards constraining the choice of algorithms. On the contrary, there are often many approaches to choose from to solve a particular vision problem. Therefore, vision algorithms are very diverse, and tend to change fairly rapidly over time. As a result, the use of nonprogrammable accelerators and coprocessors is less attractive for vision applications compared to applications like digital wireless and compression centric consumer video equipment. Achieving the combination of high performance, low cost, low power, and programmability is challenging. Special purpose hardware typically achieves high performance at low cost, but with little programmability. General purpose CPUs provide programmability, but with weak performance, poor cost, or energy efficiency.

Demanding embedded vision applications most often use a combination of processing elements, which might include, for example:

■ A general purpose CPU for heuristics, complex decision making, network access, user interface, storage management, and overall control

■ A high-performance DSP-oriented processor for real time, moderate rate processing with moderately complex algorithms

■ One or more highly parallel engines for pixel rate processing with simple algorithms

While any processor can in theory be used for embedded vision, the most promising types today are:

■ High-performance embedded CPU

■ Application specific standard product (ASSP) in combination with a CPU

■ Graphics processing unit (GPU) with a CPU

■ DSP processor with accelerator(s) and a CPU

■ Field programmable gate array (FPGA) with a CPU

■ Mobile “application processor”

High Performance Embedded CPU

In many cases, embedded CPUs cannot provide enough performance, or cannot do so at acceptable price and power consumption levels, to implement demanding vision algorithms. Often, memory bandwidth is a key performance bottleneck, since vision algorithms typically use large amounts of memory bandwidth and don't tend to repeatedly access the same data. The memory systems of embedded CPUs are not designed for these kinds of data flows. However, like most types of processors, embedded CPUs become more powerful over time, and in some cases can provide adequate performance. There are some compelling reasons to run vision algorithms on a CPU when possible. First, most embedded systems need a CPU for a variety of functions. If the required vision functionality can be implemented using that CPU, then the complexity of the system is reduced relative to a multiprocessor solution.

In addition, most vision algorithms are initially developed on PCs using general purpose CPUs and their associated software development tools. Similarities between PC CPUs and embedded CPUs (and their associated tools) mean that it is typically easier to create embedded implementations of vision algorithms on embedded CPUs compared to other kinds of embedded vision processors. In addition, embedded CPUs typically are the easiest to use compared to other kinds of embedded vision processors, due to their relatively straightforward architectures, sophisticated tools, and other application development infrastructure, such as operating systems. An example of an embedded CPU is the Intel Atom E660T.

Application Specific Standard Product (ASSP) in Combination with a CPU

Application specific standard products (ASSPs) are specialized, highly integrated chips tailored for specific applications or application sets. ASSPs may incorporate a CPU, or use a separate CPU chip. By virtue of specialization, ASSPs typically deliver superior cost and energy efficiency compared with other types of processing solutions. Among other techniques, ASSPs deliver this efficiency through the use of specialized coprocessors and accelerators. Because ASSPs are by definition focused on a specific application, they are usually provided with extensive application software.

The specialization that enables ASSPs to achieve strong efficiency, however, also leads to their key limitation: lack of flexibility. An ASSP designed for one application is typically not suitable for another application, even one that is related to the target application. ASSPs use unique architectures, and this can make programming them more difficult than with other kinds of processors. Indeed, some ASSPs are not user programmable.

Another consideration is risk. ASSPs often are delivered by small suppliers, and this may increase the risk that there will be difficulty in supplying the chip, or in delivering successor products that enable system designers to upgrade their designs without having to start from scratch. An example of a vision-oriented ASSP is the PrimeSense PS1080-A2, used in the Microsoft Kinect.

General Purpose CPUs

While computer vision algorithms can run on most general purpose CPUs, desktop processors may not meet the design constraints of some systems. However, x86 processors and system boards can leverage the PC infrastructure for low-cost hardware and broadly supported software development tools. Several vendors also offer devices that integrate a RISC CPU core. A general purpose CPU is best suited for heuristics, complex decision making, network access, user interface, storage management, and overall control. A general purpose CPU may be paired with a vision-specialized device for better performance on pixel-level processing.

Graphics Processing Units with CPU

High-performance GPUs deliver massive amounts of parallel computing potential, and graphics processors can be used to accelerate the portions of the computer vision pipeline that perform parallel processing on pixel data. While general purpose GPUs (GPGPUs) have primarily been used for high-performance computing (HPC), even mobile graphics processors and integrated graphics cores are gaining GPGPU capability while meeting the power constraints of a wider range of vision applications. In designs that require 3D processing in addition to embedded vision, a GPU will already be part of the system and can be used to assist a general purpose CPU with many computer vision algorithms. Many examples exist of x86-based embedded systems with discrete GPGPUs.

Graphics processing units (GPUs), intended mainly for 3D graphics, are increasingly capable of being used for other functions, including vision applications. The GPUs used in personal computers today are explicitly intended to be programmable to perform functions other than 3D graphics. Such GPUs are termed “general purpose GPUs” or “GPGPUs.” GPUs have massive parallel processing horsepower. They are ubiquitous in personal computers. GPU software development tools are readily and freely available, and getting started with GPGPU programming is not terribly complex. For these reasons, GPUs are often the parallel processing engines of first resort of computer vision algorithm developers who develop their algorithms on PCs, and then may need to accelerate execution of their algorithms for simulation or prototyping purposes.
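The pixel-level data parallelism that makes GPUs attractive can be illustrated even without GPU hardware: a luminance conversion applies the same independent arithmetic to every pixel, exactly the pattern that maps onto thousands of GPU threads. A sketch using NumPy's vectorized operations as a stand-in for the parallel hardware (the Rec. 601 luma weights are a standard choice, not something specified in the text):

```python
import numpy as np

def to_luma(rgb):
    """Per-pixel RGB -> luma. Each output pixel depends only on its
    own input pixel, so all pixels could be computed in parallel."""
    weights = np.array([0.299, 0.587, 0.114])  # Rec. 601 luma weights
    return (rgb @ weights).astype(np.uint8)

frame = np.zeros((480, 640, 3), dtype=np.uint8)
frame[:, :, 1] = 255  # pure green test frame
luma = to_luma(frame)
print(luma.shape, int(luma[0, 0]))  # (480, 640) 149
```

On a GPGPU the same kernel would be launched once per pixel; the absence of any dependency between output pixels is what makes this stage embarrassingly parallel.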

GPUs are tightly integrated with general purpose CPUs, sometimes on the same chip. However, one limitation of GPU chips is the limited variety of CPUs with which they are currently integrated, and only a limited number of CPU operating systems support such integration. Today there are low-cost, low-power GPUs, designed for products like smart phones and tablets. However, these GPUs are generally not GPGPUs, and therefore using them for applications other than 3D graphics is very challenging. An example of a GPGPU used in personal computers is the NVIDIA GT240.

Digital Signal Processors with Accelerator(s) and a CPU

DSPs are very efficient for processing streaming data, since the bus and memory architecture are optimized to process high-speed data as it traverses the system. This architecture makes DSPs an excellent solution for processing image pixel data as it streams from a sensor source. Many DSPs for vision have been enhanced with coprocessors that are optimized for processing video inputs and accelerating computer vision algorithms. The specialized nature of DSPs makes these devices inefficient for processing general purpose software workloads, so DSPs are usually paired with a RISC processor to create a heterogeneous computing environment that offers the best of both worlds.

Digital signal processors (“DSP processors” or “DSPs”) are microprocessors specialized for signal processing algorithms and applications. This specialization typically makes DSPs more efficient than general purpose CPUs for the kinds of signal processing tasks that are at the heart of vision applications. In addition, DSPs are relatively mature and easy to use compared to other kinds of parallel processors. Unfortunately, while DSPs do deliver higher performance and efficiency than general purpose CPUs on vision algorithms, they often fail to deliver sufficient performance for demanding algorithms. For this reason, DSPs are often supplemented with one or more coprocessors. A typical DSP chip for vision applications therefore comprises a CPU, a DSP, and multiple coprocessors. This heterogeneous combination can yield excellent performance and efficiency, but can also be difficult to program. Indeed, DSP vendors typically do not enable users to program the coprocessors; rather, the coprocessors run software function libraries developed by the chip supplier. An example of a DSP targeting video applications is the Texas Instruments DM8168.

Field Programmable Gate Arrays (FPGAs) with a CPU

Instead of incurring the high cost and long lead times for a custom ASIC to accelerate computer vision systems, designers can implement an FPGA to offer a reprogrammable solution for hardware acceleration. With millions of programmable gates, hundreds of I/O pins, and compute performance in the trillions of multiply accumulates/sec (tera-MACs), high-end FPGAs offer the potential for highest performance in a vision system. Unlike a CPU, which has to use time slice or multi-thread tasks as they compete for compute resources, an FPGA has the advantage of being able to simultaneously accelerate multiple portions of a computer vision pipeline. Since the parallel nature of FPGAs offers so much advantage for accelerating computer vision, many of the algorithms are available as optimized libraries from semiconductor vendors. These computer vision libraries also include preconfigured interface blocks for connecting to other vision devices, such as IP cameras.

Field programmable gate arrays (FPGAs) are flexible logic chips that can be reconfigured at the gate and block levels. This flexibility enables the user to craft computation structures that are tailored to the application at hand. It also allows selection of I/O interfaces and on-chip peripherals matched to the application requirements. The ability to customize compute structures, coupled with the massive amount of resources available in modern FPGAs, yields high performance coupled with good cost and energy efficiency. However, using FPGAs is essentially a hardware design function, rather than a software development activity. FPGA design is typically performed using hardware description languages (Verilog or VHDL) at the register transfer level (RTL), a very low level of abstraction. This makes FPGA design time consuming and expensive, compared to using the other types of processors discussed here.

However, using FPGAs is getting easier, due to several factors. First, so-called "IP block" libraries (libraries of reusable FPGA design components) are becoming increasingly capable. In some cases, these libraries directly address vision algorithms. In other cases, they enable supporting functionality, such as video I/O ports or line buffers. Second, FPGA suppliers and their partners increasingly offer reference designs: reusable system designs incorporating FPGAs and targeting specific applications. Third, high-level synthesis tools, which enable designers to implement vision and other algorithms in FPGAs using high-level languages, are increasingly effective. Relatively low-performance CPUs can be implemented by users in the FPGA, and in a few cases high-performance CPUs are integrated into FPGAs by the manufacturer. An example FPGA that can be used for vision applications is the Xilinx Spartan-6 LX150T.

Mobile “Application Processor”

A mobile “application processor” is a highly integrated system-on-chip, typically designed primarily for smart phones but used for other applications. Application processors typically comprise a high-performance CPU core and a constellation of specialized coprocessors, which may include a DSP, a GPU, a video processing unit (VPU), a 2D graphics processor, an image acquisition processor, and so on. These chips are specifically designed for battery-powered applications, and therefore place a premium on energy efficiency. In addition, because of the growing importance of and activity surrounding smart phone and tablet applications, mobile application processors often have strong software development infrastructure, including low-cost development boards, Linux and Android ports, and so on. However, as with the DSP processors discussed in the previous section, the specialized coprocessors found in application processors are usually not user programmable, which limits their utility for vision applications. An example of a mobile application processor is the Freescale i.MX53.

Cameras/Image Sensors for Embedded Vision

While analog cameras are still used in many vision systems, this section focuses on digital image sensors, usually either a CCD or CMOS sensor array, that operate with visible light. However, this definition shouldn't constrain the technology analysis, since many vision systems can also sense other types of energy (IR, sonar, etc.).

The camera housing has become the entire chassis for a vision system, leading to the emergence of “smart cameras” with all of the electronics integrated. By most definitions, a smart camera supports computer vision, since the camera is capable of extracting application specific information. However, as both wired and wireless networks get faster and cheaper, there still may be reasons to transmit pixel data to a central location for storage or extra processing.

A classic example is cloud computing using the camera on a smart phone. The smart phone could be considered a “smart camera” as well, but sending data to a cloud-based computer may reduce the processing performance required on the mobile device, lowering cost, power, weight, and so on. For a dedicated smart camera, some vendors have created chips that integrate all of the required features.

Until recently, many people would have pictured a camera for computer vision as the outdoor security camera shown in Figure 1.4. There are countless vendors supplying these products, and many more supplying indoor cameras for industrial applications. There are simple USB cameras for PCs, and billions of cameras embedded in the world's mobile phones. The speed and quality of these cameras have risen dramatically, with 10+ megapixel sensors supported by sophisticated image-processing hardware.

FIGURE 1.4. Outdoor fixed security camera.

Another important factor for cameras is the rapid adoption of 3D imaging using stereo optics, time-of-flight, and structured-light technologies. Trendsetting cell phones now offer this technology, as do the most recent generation of game consoles. Look again at the picture of the outdoor camera and consider how much change is about to happen to computer vision markets as new camera technologies become pervasive.

Charge coupled device (CCD) image sensors have some advantages over CMOS image sensors, mainly because the electronic shutter of CCDs traditionally offers better image quality with higher dynamic range and resolution. However, CMOS sensors now account for more than 90% of the market, heavily influenced by camera phones and driven by the technology's lower cost, better integration, and speed.

Other Semiconductor Devices for Embedded Vision

Embedded vision applications involve more than just programmable devices and image sensors; they also require other components for creating a complete system. Most applications require data communications of pixels and/or metadata, and many designs interface directly to the user. Some computer vision systems also connect to mechanical devices, such as robots or industrial control systems.

The list of devices in this "other" category includes a wide range of standard products. In addition, some system designers may incorporate programmable logic devices or ASICs. In many vision systems, power, space, and cost constraints require high levels of integration with the programmable device, often into a system-on-a-chip (SoC) device. Sensors for external parameters and environmental measurements are discussed under separate chapter headings.

Memory

Processors can integrate megabytes' worth of SRAM and DRAM, so many designs will not require off-chip memory. However, computer vision algorithms for embedded vision often require multiple frames of sensor data to track objects. Off-chip memory devices can store gigabytes of data, although accessing external memory can add hundreds of cycles of latency. Systems with a 3D-graphics subsystem will usually already include substantial amounts of external memory to store the frame buffer, textures, Z buffer, and so on. Sometimes this graphics memory is held in a dedicated, fast memory bank that uses specialized DRAMs.
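The storage arithmetic is straightforward. A sketch for a hypothetical tracker that keeps several full frames resident (the frame format and frame count are illustrative assumptions, not values from the text):

```python
def frame_buffer_bytes(width, height, bytes_per_pixel, num_frames):
    """Total storage needed to hold num_frames uncompressed frames."""
    return width * height * bytes_per_pixel * num_frames

# Four 1080p RGB frames kept for multi-frame object tracking
total = frame_buffer_bytes(1920, 1080, 3, 4)
print(total, round(total / 2**20, 1))  # 24883200 bytes, about 23.7 MiB
```

Roughly 24 MB for just four frames already exceeds typical on-chip SRAM budgets, which is why such designs fall back on external DRAM despite the added latency.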

Some vision implementations store video data locally, in order to reduce the amount of information that needs to be sent to a centralized system. For a solid state, nonvolatile memory storage system, the storage density is driven by the size of flash memory chips. Latest generation NAND chip fabrication technologies allow extremely large, fast and low-power storage in a vision system.

Networking and Bus Interfaces

Mainstream computer networking and bus technology has finally started to catch up to the needs of computer vision to support simultaneous digital video streams. With economies of scale, more vision systems will use standard buses like PCI and PCI Express. For networking, Gigabit Ethernet (GbE) and 10GbE interfaces offer sufficient bandwidth even for multiple high-definition video streams. However, the trade association for Machine Vision (AIA) continues to promote Camera Link, and many camera and frame grabber manufacturers use this interface.

1.3 COMPONENTS IN A TYPICAL VISION SYSTEM

Although applications of embedded vision technologies vary, a typical computer vision system uses more or less the same sequence of distinct steps to process and analyze the image data. These are referred to as a vision pipeline, which typically contains the steps shown in Figure 1.5.

FIGURE 1.5. Vision pipeline.

At the start of the pipeline, it is common to see algorithms with simple data-level parallelism and regular computations. However, in the middle region, the data-level parallelism and the data structures themselves are both more complex, and the computation is less regular and more control-oriented. At the end of the pipeline, the algorithms are more general purpose in nature. Here are the pipelines for two specific application examples: Figure 1.6 shows a vision pipeline for a video surveillance application.

Figure 1.7 shows a vision pipeline for a pedestrian detection application. Note that both pipelines construct an image pyramid and have an object detection function in the center.

FIGURE 1.6. Vision pipeline for video surveillance application.

FIGURE 1.7. Vision pipeline for pedestrian detection application
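The image pyramid that both pipelines construct can be sketched in a few lines: the frame is repeatedly smoothed and downsampled by two so that detectors can search for objects at multiple scales. A minimal NumPy version (2x2 block averaging stands in for the Gaussian filtering a production pipeline would use):

```python
import numpy as np

def build_pyramid(image, levels):
    """Return a list of images, each half the width and height of the previous."""
    pyramid = [image]
    for _ in range(levels - 1):
        img = pyramid[-1]
        h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
        # Average each 2x2 block: a crude stand-in for Gaussian smoothing
        down = img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(down)
    return pyramid

levels = build_pyramid(np.ones((480, 640)), 4)
print([lvl.shape for lvl in levels])
# [(480, 640), (240, 320), (120, 160), (60, 80)]
```

A detector then runs at a fixed window size over every level, so a single template effectively matches objects at several apparent sizes.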

Vision Processing Algorithms

Vision algorithms typically require high computing performance. And unlike many other applications, where standards mean that there is strong commonality among algorithms used by different equipment designers, no such standards that constrain algorithm choice exist in vision applications. On the contrary, there are often many approaches to choose from to solve a particular vision problem. Therefore, vision algorithms are very diverse, and tend to change fairly rapidly over time. And, of course, industrial automation systems are usually required to fit into tight cost and power consumption envelopes.

The rapidly expanding use of vision technology in industrial automation is part of a much larger trend. From consumer electronics to automotive safety systems, today we see vision technology (Figure 1.8) enabling a wide range of products that are more intelligent and responsive than before, and thus more valuable to users. We use the term “embedded vision” to refer to this growing practical use of vision technology in embedded systems, mobile devices, special purpose PCs, and the cloud, with industrial automation being one showcase application.

FIGURE 1.8. Vision technology.

Embedded Vision Challenges

Although the rapid progress of technology has made available very powerful microprocessor architectures, implementing a computer vision algorithm on embedded hardware/software platforms remains a very challenging task. Some specific challenges encountered by embedded vision systems include:

  i. Power consumption: Vision applications for mobile platforms are constrained by battery capacity, leading to power requirements of less than one watt. Using more power means more battery weight, a problem for mobile and airborne systems (e.g., drones). More power also means higher heat dissipation, leading to more expensive packaging, complex cooling systems, and faster aging of components.

 ii. Computational requirements: Computer vision applications have extremely high computational requirements. Constructing a typical image pyramid for a VGA frame (640x480) requires 10–15 million instructions per frame. Multiply this by 30 frames per second, and a processor capable of 300–450 MIPS is needed just to handle this preliminary processing step, let alone the more advanced recognition tasks required later in the pipeline. State-of-the-art, low-cost camera technology today can provide 1080p or 4K video at up to 120 frames per second. A vision system using such a camera requires compute power ranging from a few giga operations per second (GOPS) to several hundred GOPS.

iii. Memory usage: The various vision processing tasks require large buffers to store processed image data in various formats, and high bandwidth to move this data from memory and between computational units. The on-chip memory size and interconnect have a significant impact on the cost and performance of a vision application on an embedded platform.

iv. Fixed-point algorithm development: Most published computer vision algorithms were developed for the computational model of Intel-based workstations where, since the advent of the Intel Pentium in 1993, the cost of double-precision operations is roughly identical to that of integer or single-precision operations. However, 64-80 bit hardware floating-point units massively increase silicon area and power consumption, and software emulation libraries for floating point run slowly. For this reason, algorithms typically need to be refined to use more efficient fixed-point arithmetic, based on integer types and operands combined with data shifts to align the radix point.
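The estimate in item (ii) above can be reproduced directly:

```python
def required_mips(instructions_per_frame, frames_per_second):
    """Sustained MIPS needed for one processing stage."""
    return instructions_per_frame * frames_per_second / 1_000_000

# 10-15 million instructions per VGA frame, at 30 frames per second
low, high = required_mips(10_000_000, 30), required_mips(15_000_000, 30)
print(low, high)  # 300.0 450.0
```

The same arithmetic scaled to a 4K sensor at 120 fps (roughly 27x the pixel rate of VGA at 30 fps) is what pushes the requirement into the GOPS range quoted above.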
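The fixed-point refinement described in item (iv) usually means scaling coefficients into an integer Q-format and shifting after each multiply-accumulate to realign the radix point. A minimal Q15 sketch (the luma weights are merely an illustrative coefficient set, not from the text):

```python
Q = 15  # Q15 fixed point: stored integer = value * 2**15

def to_q15(x):
    """Scale a fractional coefficient into a Q15 integer."""
    return round(x * (1 << Q))

# Floating-point reference: y = 0.299*r + 0.587*g + 0.114*b
COEFFS = [to_q15(c) for c in (0.299, 0.587, 0.114)]

def luma_fixed(r, g, b):
    """Integer-only luma: multiply-accumulate, then shift to realign."""
    acc = COEFFS[0] * r + COEFFS[1] * g + COEFFS[2] * b
    return acc >> Q  # drop the 15 fractional bits

print(luma_fixed(255, 255, 255))  # 255
```

The same multiply-accumulate-then-shift pattern maps directly onto integer DSP instructions, avoiding floating-point hardware entirely.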

Besides the previously discussed challenges, an embedded vision developer should keep in mind the dynamic nature of the market. The market is changing on an ongoing basis, including the applications and use cases, the underlying vision algorithms, the programming models, and the supporting hardware architectures.

Currently, there is a need for standardized vision kernels, algorithm libraries, and programming models. At this time, there are no fully established standards for vision software with efficient hardware implementations, though there are a number of likely candidates. OpenCV is a good starting point for reference algorithms and their test benches. The Khronos Group is developing emerging standards, such as OpenVX, focused on embedded systems. OpenCL is a software framework to tie together massively parallel heterogeneous computation units.

1.4 APPLICATIONS FOR EMBEDDED VISION

The emergence of practical embedded vision technology creates vast opportunities for innovation in electronic systems and associated software. In many cases, existing products can be transformed through the addition of vision capabilities. One example of this is the addition of vision capabilities to surveillance cameras, allowing the camera to monitor a scene for certain kinds of events, and alert an operator when such an event occurs. In other cases, practical embedded vision enables the creation of new types of products, such as surgical robots and swimming pool safety systems that monitor swimmers in the water. Some specific applications that use embedded vision include object detection, video surveillance, gesture recognition, Simultaneous Localization and Mapping (SLAM), and Advanced Driver Assistance Systems (ADAS). Let’s take a closer look at each one.

Swimming Pool Safety System

While there are bigger markets for vision products, swimming pool safety (as shown in Figure 1.9) is one of those applications that truly shows the positive impact technological progress can have on society. Every parent will instantly appreciate the extra layer of safety provided by machines that see and understand whether a swimmer is in distress. When tragedies can happen in minutes, a vision system shows the true potential of this technology: never becoming distracted or complacent while performing the duties of a digital lifeguard.

FIGURE 1.9. Pool graphic system

Object Detection

Object detection is at the heart of virtually all computer vision systems. Of all the visual tasks we might ask a computer to perform, the task of analyzing a scene and recognizing all of the constituent objects remains the most challenging. Furthermore, detected objects can be used as inputs for object recognition tasks, such as instance or class recognition, which can find a specific face, a car model, a unique pedestrian, and so on. Applications include face detection and recognition, pedestrian detection, and factory automation. Even vision applications that are not specifically performing object detection often have some sort of detection step in the processing flow. For example, a movement tracking algorithm uses “corner detection” to identify easily recognizable points in an image and then looks for them in subsequent frames.
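The corner detection step mentioned above can be sketched with a simplified Harris-style response: a pixel is corner-like when the image gradient varies strongly in both directions within a small window. A toy NumPy version (the window size, the constant k, and the wrap-around border handling are simplifications for illustration):

```python
import numpy as np

def box_sum(a, r=1):
    """Sum over a (2r+1)x(2r+1) neighborhood via shifted copies."""
    out = np.zeros_like(a)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
    return out

def harris_score(img, k=0.04):
    """Simplified Harris corner response: large where the gradient
    varies strongly in both directions inside the window."""
    iy, ix = np.gradient(img.astype(float))
    sxx, syy, sxy = box_sum(ix * ix), box_sum(iy * iy), box_sum(ix * iy)
    return sxx * syy - sxy * sxy - k * (sxx + syy) ** 2

# A bright square on a dark background: corners score higher than
# edge midpoints or the flat interior
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
score = harris_score(img)
```

A tracker would keep the strongest responses as feature points and search for them again in the next frame.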

Video Surveillance

Growing numbers of IP cameras and the need for surveillance cameras with better video quality are driving the global demand for video surveillance systems. Sending HD-resolution images from millions of IP cameras to the cloud would require prohibitive network bandwidth. So intelligence in the camera is needed to filter the video and transmit only the relevant cases (e.g., only frames in which a pedestrian is detected, with the background subtracted) for further analysis.
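Such in-camera filtering can be as simple as background subtraction against a running-average model, transmitting a frame only when enough pixels differ from the learned background. A hedged NumPy sketch (the learning rate and thresholds are illustrative values, not from the text):

```python
import numpy as np

class MotionFilter:
    """Transmit a frame only if it differs enough from a running-average
    background model (a toy stand-in for in-camera video analytics)."""

    def __init__(self, shape, alpha=0.05, pixel_thresh=25, area_thresh=0.01):
        self.bg = np.zeros(shape, dtype=float)   # background estimate
        self.alpha = alpha                       # background learning rate
        self.pixel_thresh = pixel_thresh         # per-pixel difference
        self.area_thresh = area_thresh           # fraction of changed pixels

    def should_transmit(self, frame):
        diff = np.abs(frame.astype(float) - self.bg)
        moving = np.mean(diff > self.pixel_thresh)
        # Update the model slowly so static scenery is absorbed
        self.bg = (1 - self.alpha) * self.bg + self.alpha * frame
        return moving > self.area_thresh

f = MotionFilter((120, 160))
static = np.full((120, 160), 100, dtype=np.uint8)
for _ in range(200):          # let the model learn the empty scene
    f.should_transmit(static)
scene_with_person = static.copy()
scene_with_person[40:80, 60:90] = 220   # a bright intruder region
print(f.should_transmit(static), f.should_transmit(scene_with_person))
```

Only frames flagged by such a filter would be compressed and uploaded, cutting the camera's network bandwidth to a small fraction of the raw stream.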

Gesture Recognition

Candidates for control by vision-based gesture recognition include automotive infotainment and industrial applications, where touch screens are either dangerous or impractical. For consumer electronics, gaming, and virtual reality, gesture recognition can provide a more direct interface for machine interaction.

Simultaneous Localization and Mapping (SLAM)

SLAM is the capability of a mobile system to build a map of the area it is navigating while simultaneously tracking its own position within that map. It has applications in self-driving cars, robot vacuum cleaners, augmented reality games, virtual reality applications, and planetary rovers.

Advanced Driver Assistance Systems (ADAS)

ADAS enhances and automates vehicle systems to improve safety and driving by, for example, detecting lanes, other cars, road signs, pedestrians, cyclists, or animals in the path of a car. Vision-based automotive safety systems, shown in Figure 1.10, are an emerging high-volume embedded vision application. A few automakers, such as Volvo, have begun to install vision-based safety systems in certain models. These systems perform a variety of functions, including warning the driver (and, in some cases, applying the brakes) when a forward collision is imminent or a pedestrian is in danger of being struck.

Another emerging high-volume embedded vision application is the “smart” surveillance camera, a camera with the ability to detect certain kinds of activity. For example, the Archerfish Solo, a consumer-oriented smart surveillance camera, can be programmed to detect people, vehicles, or other motion in user-selected regions of the camera’s field of view.

FIGURE 1.10. Mobileye driver assistance system.

Game Controller

In recent decades, embedded computer vision systems have been deployed in applications such as target tracking for missiles and automated inspection in manufacturing plants. Now, as lower-cost, lower-power, and higher-performance processors emerge, embedded vision is beginning to appear in high-volume applications. Perhaps the most visible of these is the Microsoft Kinect, a peripheral for the Xbox 360 game console that uses embedded vision to let users control video games simply by gesturing and moving their bodies. The success of the Kinect for the Xbox 360 (Figure 1.11), which was subsequently extended to support PCs, along with the vision support in successor consoles from both Microsoft and Sony, demonstrates that people want to control their machines using natural gestures and speech. Practical computer vision technology has finally evolved to make this possible in a range of products that extends well beyond gaming. The Microsoft Kinect, with 3D motion capture, facial recognition, and voice recognition, was one of the fastest-selling consumer electronics devices. Such highly visible applications are creating consumer expectations for systems with visual intelligence, while increasingly powerful, low-cost, energy-efficient processors and sensors are making the widespread use of embedded vision practical.

FIGURE 1.11. Microsoft Kinect for Xbox 360, a gesture-based game controller.

Face Recognition for Advertising Research

An innovative technology tracks the facial responses of Internet users while they view content online, allowing companies to monitor users’ real-time reactions to their advertisements.

Mobile Phone Skin Cancer Detection

A smartphone application detects signs of skin cancer in moles on the human body. The application allows a person to photograph a mole with the smartphone and receive an instant analysis of its status. Using a complex algorithm, the application tells the person whether or not the mole is suspicious and advises whether treatment should be sought. The application also helps the person find an appropriate dermatologist in the immediate vicinity. Other revolutionary medical applications utilizing embedded vision include an iPhone app that reads heart rate and a device that assists the blind by using a camera to interpret real objects and communicate them to the user as auditory indications.

Gesture Recognition for Car Safety

In the automotive industry, a new system incorporates gesture and face recognition to reduce distractions while driving. While the use of face recognition for security purposes has been well documented, this system goes further, interpreting nods, winks, and hand movements to execute specific functions within the car. For example, a wink turns the car radio on or off, and tilting the head left or right turns the volume down or up. Since many road accidents result from drivers trying to multitask, this application could potentially save many lives.
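The mapping from recognized gestures to car functions described above can be sketched as a simple dispatch table. The gesture labels and the `CarControls` interface below are hypothetical, standing in for whatever the recognition front end and vehicle interface actually provide.

```python
class CarControls:
    """Hypothetical stand-in for the vehicle's control interface."""

    def __init__(self):
        self.radio_on = False
        self.volume = 5

    def toggle_radio(self):
        self.radio_on = not self.radio_on

    def volume_up(self):
        self.volume = min(10, self.volume + 1)

    def volume_down(self):
        self.volume = max(0, self.volume - 1)

def handle_gesture(car, gesture):
    """Dispatch a recognized gesture label to the matching car function;
    unknown labels are ignored rather than guessed at."""
    actions = {
        "wink": car.toggle_radio,
        "tilt_right": car.volume_up,
        "tilt_left": car.volume_down,
    }
    action = actions.get(gesture)
    if action is not None:
        action()

car = CarControls()
handle_gesture(car, "wink")        # radio toggles on
handle_gesture(car, "tilt_right")  # volume 5 -> 6
```

Keeping the mapping in one table makes it easy to audit which gestures can trigger which functions, which matters in a safety-critical cabin environment.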

Industrial Applications for Embedded Vision

Vision-processing-based products have established themselves in a number of industrial applications, the most prominent being factory automation, where the application is commonly referred to as machine vision. The primary factory automation sectors are:

■ Automotive—motor vehicle and related component manufacturing

■ Chemical and Pharmaceutical—chemical and pharmaceutical manufacturing plants and related industries

■ Packaging—packaging machinery, packaging manufacturers, and dedicated packaging companies not aligned to any one industry

■ Robotics—guidance of robots and robotic machines

■ Semiconductors and Electronics—semiconductor machinery makers, semiconductor device manufacturers, electronic equipment manufacturing, and assembly facilities

The primary embedded vision products used in factory automation applications are:

■ Smart Sensors—A single unit designed to perform a single machine vision task. Smart sensors require little or no configuration and have limited on-board processing. Frequently, a lens and lighting are also incorporated into the unit.

■ Smart Cameras—A single unit that incorporates a machine vision camera, a processor, and I/O in a compact enclosure. Smart cameras are configurable and so can be used for a number of different applications. Most have the facility to change lenses and are also available with built-in LED lighting.

■ Compact Vision System—A complete machine vision system, not based on a PC, consisting of one or more cameras and a processor module. Some products incorporate an LCD screen as part of the processor module, which obviates the need to connect the device to a monitor for setup. The principal feature that distinguishes compact vision systems (CVS) from smart cameras is their ability to take information from a number of cameras, which can be more cost-effective when an application requires multiple images.

■ Machine Vision Cameras (MV Cameras)—These are devices that convert an optical image into an analogue or digital signal. This may be stored in random access memory, but not processed, within the device.

■ Frame Grabbers—This is a device (usually a PCB card) for interfacing the video output from a camera with a PC or other control device. Frame grabbers are sometimes called video capture boards or cards. They vary from being a simple interface to a more complex device that can handle many functions including triggering, exposure rates, shutter speeds, and complex signal processing.

■ Machine Vision Lighting—Any device used to light a scene being viewed by a machine vision camera or sensor. Only devices designed and marketed for use in machine vision applications in an industrial automation environment are considered here.

■ Machine Vision Lenses—This category includes all lenses used in a machine vision application, whether sold with a camera or as a spare or additional part.

■ Machine Vision Software—This category includes all software that is sold as a product in its own right and is designed specifically for machine vision applications. It is split into:

● Library Software—allows users to develop their own MV system architecture. There are many different types, some offering great flexibility. They are often called SDKs (Software Development Kits).

● System Software—which is designed for a particular application. Some are very comprehensive and require little or no set up.

Medical Applications for Embedded Vision

Embedded vision and video analysis have the potential to become primary diagnostic and treatment tools in hospitals and clinics, increasing the efficiency and accuracy of radiologists and clinicians. The high quality and definition of the output from scanners and X-ray machines make it ideal for automatic analysis, whether for tumor and anomaly detection or for monitoring changes over time, as in dental care or cancer screening. Other applications include motion analysis systems, which are used for gait analysis in injury rehabilitation and physical therapy. Video analytics can also be used in hospitals to monitor medical staff, ensuring that all rules and procedures are properly followed.

For example, video analytics can ensure that doctors “scrub in” properly before surgery and that patients are visited at the proper intervals. Medical imaging devices, including CT, MRI, mammography, and X-ray machines, embedded with computer vision technology and connected to medical images taken earlier in a patient’s life, will give doctors very powerful tools for detecting rapidly advancing diseases in a fraction of the time currently required. Computer-aided detection or computer-aided diagnosis (CAD) software is also being used in early-stage deployments to assist doctors in analyzing medical images by highlighting potential problem areas.

Automotive Applications for Embedded Vision