Object Detection

What Is Object Detection?

Object detection is a computer vision technique for locating instances of objects in images or videos. Object detection algorithms typically leverage machine learning or deep learning to produce meaningful results. When humans look at images or videos, we can recognize and locate objects of interest within a matter of moments. The goal of object detection is to replicate this intelligence using a computer.

Why Object Detection Is Important

Object detection is a key technology in advanced driver assistance systems (ADAS), where it enables cars to detect driving lanes and pedestrians to improve road safety. It is also an essential component in applications such as visual inspection, robotics, medical imaging, video surveillance, and content-based image retrieval.

Screenshot showing labels applied to a photo of vehicles on a highway, identified using object detection.

Using object detection to identify and locate vehicles.

How Object Detection Works

Object Detection Using Deep Learning

You can use a variety of techniques to perform object detection. Popular deep learning–based approaches using convolutional neural networks (CNNs), such as YOLO, SSD, or R-CNN, automatically learn to detect objects within images.

You can choose from two key approaches to get started with object detection using deep learning:

  • Use pretrained object detectors. Several deep learning object detectors are trained on large data sets and can detect common objects such as people, vehicles, or image text without requiring further training.
  • Create and train a custom object detector. To tailor an object detector to your specific needs, you can use transfer learning. This approach enables you to build on a pretrained network, refining it further for your application. This method can provide faster results than training from scratch because the object detectors have already been trained on thousands, or even millions, of images.

An image depicting a street scene with a car approaching a stop sign. Using object detection, the sign is labeled by the pretrained model and includes a confidence level.

Detecting a stop sign using a pretrained R-CNN.
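
A detector such as the one pictured typically reports a label, a bounding box, and a confidence level for each candidate object, and low-confidence candidates are usually discarded before display. The following Python sketch illustrates that post-processing step only. The `Detection` type, the `filter_detections` helper, the 0.5 threshold, and the sample values are all assumptions made for illustration; they are not output from any real model or part of any MATLAB API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """One hypothetical detector output: a labeled box with a confidence score."""
    label: str
    box: Tuple[float, float, float, float]  # (x, y, width, height) in pixels
    score: float  # confidence in [0, 1]


def filter_detections(detections: List[Detection], min_score: float = 0.5) -> List[Detection]:
    """Keep detections at or above min_score, highest confidence first."""
    kept = [d for d in detections if d.score >= min_score]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Made-up detector output, for illustration only.
raw = [
    Detection("stop sign", (410.0, 120.0, 64.0, 64.0), 0.97),
    Detection("stop sign", (30.0, 200.0, 20.0, 22.0), 0.12),
    Detection("car", (150.0, 260.0, 220.0, 140.0), 0.81),
]

for det in filter_detections(raw, min_score=0.5):
    print(f"{det.label}: {det.score:.2f} at {det.box}")
```

Raising the threshold trades missed objects for fewer false alarms, so a suitable value depends on the application.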

Whether you use a pretrained object detector or create a custom one, you will need to decide what type of object detection network you prefer.

Images of a circuit board, highway scene, and boats on open water with objects detected and labeled.

Detecting small circuit board features, vehicles, and objects in a region of interest (ROI) using a pretrained YOLOX network.

Object Detection Using Machine Learning

Machine learning techniques are also commonly used for object detection, and they take a different approach from deep learning. Common machine learning techniques include:

  • Aggregate channel features (ACF)
  • Support vector machine (SVM) classification using histogram of oriented gradients (HOG) features
  • The Viola-Jones algorithm for human face or upper body detection

Tracking pedestrians using an ACF object detection algorithm.

As with deep learning–based approaches, you can choose to start with a pretrained object detector or create a custom object detector to suit your application. When using machine learning, you need to manually select the identifying features for an object, whereas a deep learning–based workflow learns the features automatically.
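
To make the idea of a manually selected feature concrete, the Python sketch below computes a simplified HOG-style descriptor for a single image cell: gradient orientations are binned into a histogram weighted by gradient magnitude. This is an illustration under stated assumptions, not the full HOG method. It omits the bin interpolation and block normalization that complete implementations use, and the function name `cell_hog` and the 9-bin default are choices made for this example.

```python
import math


def cell_hog(cell, num_bins=9):
    """Return an L2-normalized histogram of unsigned gradient orientations.

    cell: 2D list of grayscale intensities (rows of equal length).
    """
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * num_bins
    bin_width = 180.0 / num_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences approximate the image gradient.
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against rounding up to exactly 180.
            hist[int(angle // bin_width) % num_bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# A 6x6 cell containing a vertical edge: dark left half, bright right half.
vertical_edge = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
print(cell_hog(vertical_edge))
```

In a classical pipeline, descriptors like this one are computed over many cells, concatenated, and passed to a classifier such as an SVM.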

Machine Learning vs. Deep Learning for Object Detection

The best approach for object detection depends on your application and the problem you’re trying to solve. When choosing between machine learning and deep learning, consider whether you have a powerful GPU and a large set of labeled training images. If you lack either one, a machine learning approach might be the better choice. Deep learning techniques tend to work better when you have more images, and GPUs decrease the time needed to train the model.

Other Object Detection Methods

In addition to deep learning– and machine learning–based object detection, several other common techniques may be applicable depending on your application:

  • Image segmentation and blob analysis, which uses simple object properties such as size, shape, or color
  • Instance segmentation, a technique that predicts pixel-by-pixel segmentation masks of the precise shape and area of each object
  • Keypoint detection, a technique that predicts specific points of interest on the object 
  • Feature-based object detection, which uses feature extraction, matching, and RANSAC to estimate the location of an object

A desktop cluttered with various objects; a staple remover box is identified using object detection.

Object detection in MATLAB. The staple remover is detected in a cluttered scene using point feature matching.
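
Of the methods listed above, blob analysis is the simplest to show in a few lines. The Python sketch below assumes the image has already been segmented into a binary mask (for example, by thresholding), then groups 4-connected foreground pixels into blobs and filters them by area. The function name `find_blobs`, the `min_area` parameter, and the sample mask are assumptions made for this illustration.

```python
from collections import deque


def find_blobs(mask, min_area=1):
    """Label 4-connected foreground regions in a binary mask.

    Returns a list of dicts holding the area and bounding box
    (row_min, col_min, row_max, col_max) of each blob with area >= min_area.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            area = 0
            r_min, c_min, r_max, c_max = r0, c0, r0, c0
            while queue:
                r, c = queue.popleft()
                area += 1
                r_min, r_max = min(r_min, r), max(r_max, r)
                c_min, c_max = min(c_min, c), max(c_max, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if area >= min_area:
                blobs.append({"area": area, "bbox": (r_min, c_min, r_max, c_max)})
    return blobs


# Made-up binary mask: 1 marks foreground pixels.
mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
for blob in find_blobs(mask, min_area=4):
    print(blob)
```

Filtering on area discards isolated noise pixels; other simple properties such as aspect ratio or mean color could be applied in the same way.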

Object Detection with MATLAB

With just a few lines of MATLAB® code, you can build machine learning and deep learning models for object detection without having to be an expert.

Automatically Label Training Images with Apps

MATLAB provides interactive apps to both prepare training data and customize convolutional neural networks. Labeling the training images for object detectors is tedious, and collecting enough training data to create an accurate object detector can take a significant amount of time. The Image Labeler app lets you interactively label objects within a collection of images and provides built-in algorithms to automatically label your ground-truth data. For automated driving applications, you can use the Ground Truth Labeler app, and for video processing workflows, you can use the Video Labeler app.

Interactively Create Object Detection Algorithms and Interoperate Between Frameworks

Customizing an existing CNN or creating one from scratch can be prone to architectural problems that can waste valuable training time. The Deep Network Designer app enables you to interactively build, edit, and visualize deep learning networks while also providing an analysis tool to check for architectural issues before training the network.

With MATLAB, you can interoperate with networks and network architectures from frameworks like TensorFlow™-Keras, PyTorch®, and Caffe2 using ONNX™ (Open Neural Network Exchange) import and export capabilities.

Diagram showing interoperability, enabled by ONNX, between MATLAB and frameworks including TensorFlow, Caffe2, PyTorch, MXNet, Core ML, Chainer, and Cognitive Toolkit.

Import from and export to TensorFlow, PyTorch, and ONNX models.

Automatically Generate Optimized Code for Deployment

After creating your algorithms with MATLAB, you can leverage automated workflows to generate TensorRT or CUDA® code with GPU Coder™ to perform hardware-in-the-loop testing. The generated code can be integrated with existing projects and used to verify object detection algorithms on desktop GPUs or embedded GPUs such as the NVIDIA® Jetson™ or NVIDIA Drive platform.