Convert Ground Truth Labeling Data for Object Re-Identification

This example shows how to convert a groundTruth object to the re-identification training data format.

Overview

Re-identification (ReID) plays a vital role in visual object tracking because it lets a tracker recover the identity of an object after a temporary occlusion or after the object leaves and re-enters the camera's field of view. Both situations complicate consistent tracking in real-world scenarios. To train a ReID network created using the reidentificationNetwork object, you must process the ground truth data so that the training data consists only of the image regions inside the ground truth bounding boxes, which in this example contain people. These cropped images must be labeled consistently, so that every crop of the same object carries the same ID. This example converts a fully labeled ground truth video to the required ReID training format.

Load Ground Truth Labeling Data

To convert ground truth data into a format usable for training a ReID network, ensure that the groundTruth object has the required format. Each labeled object must have a rectangular region of interest (ROI) and a numeric attribute that stores the object ID. To learn how to label data for object tracking and generate the ground truth data, see the Automate Ground Truth Labeling for Object Tracking and Re-Identification example. In this example, the ROI label is named Person.

Download the video containing the ground truth data, and load the groundTruth object.

helperDownloadLabelVideo();
Downloading Pedestrian Tracking Video (90 MB)
load("groundTruth.mat","gTruth");
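As a quick check of the format requirement described above, you can inspect the label definitions and the label data of the first frame. This is a minimal sketch that assumes gTruth is the object loaded above and that the rectangle label is named Person, as in this example. The name of the ID attribute is not stated here, so the sketch lists the field names instead of assuming one.

```matlab
% Show the label names, types, and any attribute definitions.
disp(gTruth.LabelDefinitions)

% Label data for the first frame of the Person label.
% With attributes defined, each entry is expected to be a struct array
% holding a Position field plus one field per attribute.
firstFrameLabels = gTruth.LabelData.Person{1};
if isstruct(firstFrameLabels)
    disp(fieldnames(firstFrameLabels))
else
    disp("The first frame does not contain struct label data.")
end
```

If the listed fields do not include a position and a numeric ID attribute, relabel or re-export the data before continuing.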

Convert Ground Truth for Object Re-Identification

After you fully label and export the ground truth data from a labeler, use the objectDetectorTrainingData function to directly create an imageDatastore and boxLabelDatastore.

Process the ground truth for training and store the input images for the network to use. Use the helperCropImagesWithGroundtruth helper function to crop all of the labeled objects out of the video frames using the groundTruth object. The function also resizes the cropped images to 256-by-128 pixels and writes them to one folder per object ID under the root directory trainingDataFolder. If trainingDataFolder already exists, the code skips this step.

trainingDataFolder = fullfile("trainingData");
imageFrameWriteLoc = fullfile("videoFrames");
dataSize = [256 128];
if ~isfolder(trainingDataFolder)
    helperCropImagesWithGroundtruth(gTruth,trainingDataFolder,imageFrameWriteLoc,dataSize);
end
Write images extracted for training to folder: 
    videoFrames

Writing 150 images extracted from PedestrianLabelingVideo.avi...Completed.

Cleaning up videoFrames directory.

Done.
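To illustrate the crop-and-resize operation that the helper applies to each bounding box, the following sketch runs imcrop and imresize on a synthetic frame. The frame size and the box coordinates are made up for illustration and do not come from the video.

```matlab
% Synthetic 480-by-640 RGB frame (hypothetical, for illustration only).
I = zeros(480,640,3,"uint8");

% Bounding box in [x y width height] format, as used by rectangle ROIs.
roi = [100 50 60 180];

% imcrop includes both end pixels, so this crop is 181-by-61 pixels.
cropped = imcrop(I,roi);

% Resize every crop to the same [height width] regardless of box shape.
resized = imresize(cropped,[256 128]);
disp(size(resized))
```

Because imresize receives an explicit output size, crops with different aspect ratios are stretched to the common 256-by-128 size rather than padded.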

Load the cropped and organized training images into an ImageDatastore object. To use all of the data in trainingDataFolder, specify the IncludeSubfolders name-value argument as true. To use the corresponding folder names as the training data labels, specify the LabelSource name-value argument as "foldernames".

imds = imageDatastore(trainingDataFolder,IncludeSubfolders=true,LabelSource="foldernames");
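Because each folder name becomes a label, counting the images per label shows how many crops each identity contributes. This is a minimal sketch that assumes imds is the datastore created above.

```matlab
% Count the number of cropped images stored under each object ID.
labelCounts = countEachLabel(imds);
disp(labelCounts)
```

Identities with very few images may be underrepresented during training, so this table is a convenient place to spot them.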

Display a set of image frames from the training data using the montage function.

rng(0)
previewImages = cell(2,4);
for i = 1:4
    previewIdx = randi(numel(imds.Files));
    previewImages{1,i} = readimage(imds,previewIdx);
    previewImages{2,i} = imds.Labels(previewIdx);
end
montage(previewImages(1,:),Size=[1 4],ThumbnailSize=dataSize)

[Figure: montage of four cropped and resized training images in a single row.]

Display the labels for each image from left to right.

strcat("ID = ",string(previewImages(2,:)))
ans = 1×4 string
    "ID = 7"    "ID = 8"    "ID = 1"    "ID = 8"

To verify the accuracy of the labels, inspect the images in the corresponding ID folder in trainingDataFolder and confirm that they all show the same person.
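One way to perform this check is to display every image stored for a single ID. The sketch below assumes the trainingDataFolder and dataSize variables from the earlier steps. The value of idToCheck is a hypothetical choice, so replace it with any ID folder name that exists in your data.

```matlab
% Hypothetical ID to review; use any folder name under trainingDataFolder.
idToCheck = "8";
idFolder = fullfile(trainingDataFolder,idToCheck);

if isfolder(idFolder)
    % Load only the images for this ID and show them together.
    imdsID = imageDatastore(idFolder);
    montage(imdsID,ThumbnailSize=dataSize)
else
    disp("No folder found for ID " + idToCheck)
end
```

If the montage shows more than one person, the ID attribute was assigned inconsistently in the labeling session and the ground truth needs correction.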

Next Steps

After you convert the ground truth labeling data to the format described above, you can use it to train a ReID network with the trainReidentificationNetwork function. To learn how to configure, train, and evaluate a ReID network, see the Reidentify People Throughout a Video Sequence Using ReID Network example.

Supporting Functions

helperDownloadLabelVideo

Download the pedestrian labeling video.

function helperDownloadLabelVideo
videoURL = "https://ssd.mathworks.com/supportfiles/vision/data/PedestrianLabelingVideo.avi";
if ~exist("PedestrianLabelingVideo.avi","file")
    disp("Downloading Pedestrian Tracking Video (90 MB)")
    websave("PedestrianLabelingVideo.avi",videoURL);
end
end

helperCropImagesWithGroundtruth

Crop all source images in the ground truth data gTruth using the bounding box labels stored in gTruth. Store the cropped images in subdirectories of dataFolder, one per object ID.

function helperCropImagesWithGroundtruth(gTruth,dataFolder,imageFrameWriteLoc,dataSize)
% Use objectDetectorTrainingData to convert the groundTruth data into an
% imageDatastore and a boxLabelDatastore.
if ~isfolder(imageFrameWriteLoc)
    mkdir(imageFrameWriteLoc)
end

[imds,blds] = objectDetectorTrainingData(gTruth,SamplingFactor=1,WriteLocation=imageFrameWriteLoc);

combinedTrainingDs = combine(imds,blds);
labelData = timetable2table(gTruth.LabelData);
writeall(combinedTrainingDs,imageFrameWriteLoc,WriteFcn=@(data,info,format)helperWriteCroppedData(data,info,format,labelData,dataFolder,dataSize))

% Remove the video frame images.
fprintf(1,"\nCleaning up %s directory.\n",imageFrameWriteLoc);
rmdir(imageFrameWriteLoc,"s")
fprintf(1,"\nDone.\n");
end

helperWriteCroppedData

Crop, resize, and store image ROIs from a combined datastore.

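function helperWriteCroppedData(data,info,~,labelData,dataFolder,dataSize)
% data{1,1} is the video frame and data{1,2} holds its bounding boxes.
num = 1; % Counter for the crops written from this frame.
imageIdx = info.ReadInfo{1,2}.CurrentIndex;
frame = num2str(imageIdx);
% Column 2 of labelData holds the ROI label structs for each frame.
imageLabelData = struct2table(labelData{imageIdx,2}{:});
% Column 2 of the struct table is expected to hold the object ID attribute.
attributeIDs = imageLabelData{:,2};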
for i = 1:size(data{1,2},1)
    personID = string(attributeIDs(i));
    personIDFolder = fullfile(dataFolder,personID);
    if ~isfolder(personIDFolder)
        mkdir(personIDFolder)
    end
    imgPath = fullfile(personIDFolder,strcat(frame,"_",num2str(num,'%02.f'),".jpg"));
    roi = data{1,2}(i,:);
    croppedImage = imcrop(data{1,1},roi);
    if ~isempty(croppedImage)
        resizedImg = imresize(croppedImage,dataSize);
        imwrite(resizedImg,imgPath);
        num = num + 1;
    end
end
end
