Computer Vision Toolbox

Design and test computer vision systems

Computer Vision Toolbox provides algorithms and apps for designing and testing computer vision systems. You can perform visual inspection, object detection and tracking, as well as feature detection, extraction, and matching. You can automate calibration workflows for single, fisheye, stereo, and multi-camera configurations. For 3D vision, the toolbox supports stereo vision, point cloud processing, structure from motion, and real-time visual and point cloud SLAM. Computer vision apps enable team-based ground truth labeling with automation, as well as camera calibration.

The toolbox provides a variety of AI techniques including pretrained convolutional neural networks (CNNs), vision transformers, and vision-language models. Use the out-of-the-box models for tasks like image classification, object detection, segmentation, pose estimation, captioning, and optical character recognition (OCR), or further customize them through transfer learning.

You can generate C and C++ code, code for GPU execution, and hardware description language (HDL) code.

Image and Video Ground Truth Labeling

Automate labeling for object detection, semantic segmentation, instance segmentation, and scene classification using the Video Labeler and Image Labeler apps.

Automatically Label Ground Truth Using Segment Anything Model

Pedestrians, cars, and buses labeled using instance segmentation.

Deep Learning and Machine Learning

Train machine learning models and deep learning networks, or use pretrained networks, for object detection and segmentation. Evaluate the performance of these networks and deploy them by generating C/C++ or CUDA® code.

Segment objects using SOLOv2 instance segmentation network

Original pill image and the same image with anomaly markings.

Automated Visual Inspection

Use the Automated Visual Inspection Library to automatically identify anomalies or defects as part of a manufacturing quality assurance process.

Count Objects Using CounTR Model

Multiple fisheye images of a checkerboard used to calibrate a camera using the Camera Calibrator app.

Camera Calibration

Estimate intrinsic, extrinsic, and lens distortion parameters for monocular cameras, stereo camera pairs, or multi-camera systems using the Camera Calibrator app, Stereo Camera Calibrator app, or built-in functions.
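To make the three parameter groups concrete, here is a minimal sketch of how they combine when a 3D point is projected into an image. It is written in plain Python for illustration and is not the toolbox's API: the function name `project_point`, its arguments, and the two-coefficient radial distortion model are assumptions chosen for this example.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy, k1=0.0, k2=0.0):
    """Project a 3D world point to pixel coordinates (illustrative pinhole model).

    rotation (3x3 nested list) and translation (length 3) are the extrinsics,
    fx, fy, cx, cy are the intrinsics, and k1, k2 are radial distortion
    coefficients. All names here are hypothetical, chosen for this sketch.
    """
    # Extrinsics: move the point from world coordinates into the camera frame.
    xc, yc, zc = (
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")

    # Perspective division gives normalized image coordinates.
    x, y = xc / zc, yc / zc

    # Lens distortion: scale the normalized coordinates radially.
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2

    # Intrinsics: convert to pixel coordinates.
    return fx * x * radial + cx, fy * y * radial + cy


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(project_point((1.0, 0.0, 2.0), identity, (0, 0, 0), 100, 100, 50, 40, k1=0.1))
```

Calibration solves the inverse problem: given many observations of a known pattern, it estimates the parameters that make projections like this one agree with the detected image points.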

Estimate Pose of Moving Camera Mounted on a Robot

A dense scene reconstruction created by applying visual SLAM to data from an RGB-D camera.

Visual SLAM and 3D Vision

Extract the 3D structure of a scene from multiple 2D views. Estimate camera position and orientation with respect to its surroundings. Refine pose estimates using bundle adjustment and pose graph optimization.

Performant Monocular Visual-Inertial SLAM

Lidar and 3D Point Cloud Processing

Segment, cluster, downsample, denoise, and register lidar or 3D point cloud data, and fit geometric shapes to it. Lidar Toolbox provides additional functionality to design, analyze, and test lidar processing systems.
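As one illustration of what downsampling a point cloud can mean, the sketch below averages all points that fall into the same cubic voxel. This is a generic voxel-grid approach written in plain Python, not the toolbox's implementation; the function name `voxel_downsample` and its parameters are hypothetical.

```python
import math


def voxel_downsample(points, voxel_size):
    """Replace all points inside each cubic voxel with their centroid.

    points is an iterable of (x, y, z) tuples; voxel_size is the voxel edge
    length in the same units. Illustrative sketch, not a toolbox function.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")

    bins = {}  # voxel index -> [sum_x, sum_y, sum_z, count]
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        acc = bins.setdefault(key, [0.0, 0.0, 0.0, 0])
        acc[0] += x
        acc[1] += y
        acc[2] += z
        acc[3] += 1

    # One centroid per occupied voxel, in order of first occurrence.
    return [(sx / n, sy / n, sz / n) for sx, sy, sz, n in bins.values()]


cloud = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.5, 1.5, 1.5)]
print(voxel_downsample(cloud, voxel_size=1.0))
```

A larger voxel size gives a sparser cloud, which speeds up later steps such as registration at the cost of fine detail.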

Build a Map from Lidar Data Using SLAM

Two side-by-side images of a box and of the same box within a larger scene, with lines connecting individual matching features in the images.

Feature Detection, Extraction, and Matching

Detect, extract, and match features such as blobs, edges, and corners across multiple images. Use the matched features for registration, object classification, or in complex workflows such as SLAM.
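The matching step can be sketched as a nearest-neighbor search over feature descriptors, with a ratio test that rejects ambiguous matches. The example below is plain Python for illustration only and does not reproduce the toolbox's matching function: the name `match_features`, the `max_ratio` parameter, and its default value are assumptions.

```python
import math


def match_features(desc_a, desc_b, max_ratio=0.6):
    """Match descriptors in desc_a to their nearest neighbors in desc_b.

    Each descriptor is a sequence of numbers of equal length. A match is kept
    only if the best distance is at most max_ratio times the second-best
    distance (a ratio test). Returns a list of (index_a, index_b) pairs.
    Illustrative sketch with hypothetical names.
    """
    matches = []
    for i, a in enumerate(desc_a):
        # Distances from this descriptor to every candidate, closest first.
        ranked = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        if not ranked:
            continue
        best_dist, best_j = ranked[0]
        # With a single candidate there is nothing to compare against.
        if len(ranked) == 1 or best_dist <= max_ratio * ranked[1][0]:
            matches.append((i, best_j))
    return matches


first = [(0.0, 0.0), (10.0, 10.0), (5.0, 5.0)]
second = [(0.0, 0.1), (10.0, 10.2), (5.0, 4.0), (5.0, 6.0)]
print(match_features(first, second))
```

In the usage example, the third descriptor in `first` is equally close to two candidates, so the ratio test discards it rather than guessing.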

Object Detection in a Cluttered Scene Using Point Feature Matching

Multiple pedestrians detected within the area of interest in a car dashcam video.

Multi-Object Tracking and Motion Estimation

Estimate motion and track multiple objects in video and image sequences.
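A core step in tracking multiple objects is deciding which new detection belongs to which existing track. The sketch below does this greedily by bounding-box overlap (intersection over union). It is a deliberately simplified illustration in plain Python, not DeepSORT and not the toolbox's tracker; the function names, the `(x, y, width, height)` box layout, and the `min_iou` threshold are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def associate(tracks, detections, min_iou=0.3):
    """Greedily pair track boxes with detection boxes by highest overlap.

    Returns a sorted list of (track_index, detection_index) pairs. Tracks or
    detections left unpaired are simply omitted. Hypothetical helper.
    """
    candidates = [
        (iou(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    ]
    # Consider the strongest overlaps first and drop weak ones.
    candidates = sorted(
        (c for c in candidates if c[0] >= min_iou),
        key=lambda c: c[0],
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for _, ti, di in candidates:
        if ti in used_tracks or di in used_dets:
            continue
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((ti, di))
    return sorted(matches)


previous = [(0, 0, 10, 10), (100, 100, 10, 10)]
current = [(101, 100, 10, 10), (1, 0, 10, 10)]
print(associate(previous, current))
```

Production trackers typically add a motion model to predict where each track should be and, in approaches such as DeepSORT, appearance features to keep identities stable through occlusions.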

Multi-Object Tracking with DeepSORT

Code Generation and Third-Party Support

Generate code from your computer vision algorithms for rapid prototyping, deployment, and verification. Integrate OpenCV-based projects and functions into MATLAB and Simulink.

Code Generation for Detect Defects on Printed Circuit Boards Using YOLOX Network

Caterpillar Uses Big Data, Data Analytics, and Machine and Deep Learning to Build Ground-Truth for Training, Validation, and Deploying Classifiers

“We can access machine learning capabilities with a few lines of MATLAB code. Then, using code generation, engineers can deploy their trained classifier into the machine without manual intervention or delays in the process.”
