Ground Truth

What Is Ground Truth?

Ground truth is the term for real-world data used to train AI models and to test their outputs. Ground truth data is required for many AI applications, including automated driving and audio or speech recognition.

Ground truth data is essential for two stages in AI algorithm development:

  1. Model training: Ground truth data is used as training data, where the algorithm learns which features and solutions are appropriate for the specific application
  2. Model testing: Ground truth data is used as test data, where the trained algorithm is tested for model accuracy
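The two stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data set, the helper names (split_ground_truth, fit_threshold, accuracy), and the threshold "model" are all hypothetical and are not part of any MATLAB workflow.

```python
import random


def split_ground_truth(examples, test_fraction=0.2, seed=0):
    """Shuffle labeled examples and split them into (train, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train):
    """Stand-in for model training: midpoint between the two class means."""
    lows = [x for x, label in train if label == "low"]
    highs = [x for x, label in train if label == "high"]
    return (sum(lows) / len(lows) + sum(highs) / len(highs)) / 2


def accuracy(predict, test):
    """Fraction of test examples whose prediction matches the ground truth."""
    correct = sum(1 for x, label in test if predict(x) == label)
    return correct / len(test)


# Hypothetical ground truth: (feature, label) pairs.
examples = [(x, "high" if x >= 5 else "low") for x in range(10)]

train, test = split_ground_truth(examples)      # stage 1 data / stage 2 data
threshold = fit_threshold(train)                # model training
score = accuracy(lambda x: "high" if x >= threshold else "low", test)  # model testing
print(f"threshold={threshold:.2f}, test accuracy={score:.2f}")
```

Keeping the test examples out of the training step is what makes the reported accuracy a meaningful check against the ground truth.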

Ground truth data can come in many forms, including image data, signal data, and text data (Figure 1). Obtaining ground truth data manually can be time-consuming. MATLAB® can expedite the process through labeler apps for image, signal, audio, and lidar applications.

Figure 1. Ground truth data in the form of signal data (top left), image data (top right), and text (bottom).

How to Obtain Ground Truth Data

Ground truth labeling is required to generate ground truth data. Labeling is the process of assigning labels to raw data that characterize what the data means. The labeled output is required to train a supervised learning model, and more accurate labels generally result in a more accurate model. Manual labeling can be time-consuming because many AI models require thousands or millions of labeled examples to produce accurate results.

The following labeler apps from MATLAB provide options to fully automate or semi-automate the labeling process, reducing the time required by manual labeling.

Image Labeling

Image Labeler helps you label regions of interest in images, including pixel labels for semantic segmentation and bounding boxes for object detection workflows.

Figure 2. Labeling images using the Image Labeler app.
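To illustrate how a labeled bounding box serves as ground truth for object detection, the sketch below compares a predicted box with a labeled one using intersection over union (IoU), a common overlap measure. The (x, y, width, height) box layout and the example coordinates are assumptions made for this illustration, not the Image Labeler export format.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


ground_truth_box = (10, 10, 20, 20)  # hypothetical labeled box
predicted_box = (20, 10, 20, 20)     # hypothetical detector output

overlap = iou(ground_truth_box, predicted_box)
print(f"IoU = {overlap:.3f}")
```

An IoU of 1 means the prediction matches the labeled box exactly, and 0 means the boxes do not overlap at all.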

Signal Labeling

Using Signal Labeler, you can explore signal data and label attributes, regions of interest, and points, using visualization and custom functions.

Figure 3. Labeling signals using the Signal Labeler app.
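As a rough sketch of what a region-of-interest label means for a sampled signal, the code below expands labeled time intervals into one label per sample, which is a form many training pipelines expect. The function name, the (start, end, name) tuple layout, and the "event" label are hypothetical and are not taken from Signal Labeler.

```python
def roi_to_sample_labels(n_samples, sample_rate, rois, background="none"):
    """Expand (start_seconds, end_seconds, name) regions into per-sample labels."""
    labels = [background] * n_samples
    for start_s, end_s, name in rois:
        # Convert times to sample indices and clamp them to the signal length.
        first = max(0, int(round(start_s * sample_rate)))
        last = min(n_samples, int(round(end_s * sample_rate)))
        for i in range(first, last):
            labels[i] = name
    return labels


# Hypothetical 1-second signal sampled at 10 Hz with one labeled region.
sample_labels = roi_to_sample_labels(
    n_samples=10, sample_rate=10, rois=[(0.2, 0.5, "event")]
)
print(sample_labels)
```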

Lidar Labeling

Lidar Labeler lets you create bounding boxes around 3D objects and provides automation techniques for clustering, ground plane removal, and tracking of point cloud data.

Figure 4. Labeling lidar point clouds using the Lidar Labeler app.
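The sketch below gives a simplified picture of two ideas mentioned above: removing ground points and selecting the points that fall inside a labeled 3D box. It assumes a flat ground at a known height and an axis-aligned box, which are deliberate simplifications. It is not the algorithm used by Lidar Labeler, and the function names and sample points are hypothetical.

```python
def remove_ground(points, ground_z=0.0, tolerance=0.2):
    """Keep points that lie more than `tolerance` above an assumed flat ground."""
    return [p for p in points if p[2] > ground_z + tolerance]


def points_in_box(points, box_min, box_max):
    """Return points inside an axis-aligned box given by its min and max corners."""
    return [
        p for p in points
        if all(lo <= c <= hi for c, lo, hi in zip(p, box_min, box_max))
    ]


# Hypothetical point cloud as (x, y, z) tuples in meters.
cloud = [
    (0.0, 0.0, 0.05),  # ground return
    (1.0, 1.0, 0.10),  # ground return
    (2.0, 2.0, 1.00),  # object
    (2.5, 2.0, 1.50),  # object
    (8.0, 8.0, 1.00),  # something else, outside the labeled box
]

above_ground = remove_ground(cloud)
labeled_points = points_in_box(above_ground, (1.5, 1.5, 0.5), (3.0, 3.0, 2.0))
print(len(above_ground), len(labeled_points))
```

Real scenes usually need a fitted ground plane and oriented boxes, so treat this only as a way to see how a 3D label selects a subset of the point cloud.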

See also: deep learning, convolutional neural network