Aerial Lidar Semantic Segmentation Using PointNet++ Deep Learning

This example shows how to train a PointNet++ deep learning network to perform semantic segmentation on aerial lidar data.

Lidar data acquired from airborne laser scanning systems is used in applications such as topographic mapping, city modeling, biomass measurement, and disaster management. Extracting meaningful information from this data requires semantic segmentation, a process where each point in the point cloud is assigned a unique class label.

In this example, you train a PointNet++ network to perform semantic segmentation by using the Dayton Annotated Lidar Earth Scan (DALES) dataset [1]. The dataset contains scenes of dense, labeled aerial lidar data from urban, suburban, rural, and commercial settings. The dataset provides semantic segmentation labels for eight classes: buildings, cars, trucks, poles, power lines, fences, ground, and vegetation.

Load DALES Data

The DALES dataset contains 40 scenes of aerial lidar data. Of the 40 scenes, 29 are used for training and the remaining 11 are used for testing. Each point in the data has a class label. Follow the instructions on the DALES website to download the dataset to the folder specified by the dataFolder variable. Create folders to store the training and test data.

dataFolder = fullfile(tempdir,'DALES');
trainDataFolder = fullfile(dataFolder,'dales_las','train');
testDataFolder = fullfile(dataFolder,'dales_las','test');

Preview a point cloud from the training data.

lasReader = lasFileReader(fullfile(trainDataFolder,'5080_54435.las'));
[pc,attr] = readPointCloud(lasReader,'Attributes','Classification');
labels = attr.Classification;

% Select only labeled data.
pc = select(pc,labels~=0);
labels = labels(labels~=0);
classNames = [
    "ground"
    "vegetation"
    "cars"
    "trucks"
    "powerlines"
    "fences"
    "poles"
    "buildings"
    ];
figure;
ax = pcshow(pc.Location,labels);
helperLabelColorbar(ax,classNames);
title("Point Cloud with Overlaid Semantic Labels");

Preprocess Data

Each point cloud in the DALES dataset covers an area of 500-by-500 meters, which is much larger than the typical area covered by terrestrial lidar point clouds. For memory-efficient processing, divide the point cloud into small, non-overlapping blocks by using a blockedPointCloud (Lidar Toolbox) object.

Define the block dimensions using the blocksize variable. As the size of each point cloud in the dataset varies, set the z-dimension of the block to Inf to avoid creating blocks along the z-axis.

blocksize = [51 51 Inf];
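
As a rough illustration of what this block size implies, the following sketch counts the blocks needed to cover one 500-by-500 meter tile and maps a few sample points to block indices. It assumes a regular grid anchored at the tile's minimum corner; how blockedPointCloud actually places block origins and handles partial blocks at the edges is not shown in this example, and the variable names are hypothetical.

```matlab
% Tile extent and x-y block size in meters (values taken from this example).
tileSize = [500 500];
blockSizeXY = [51 51];

% Number of blocks needed to cover the tile along x and y.
% The z-dimension is Inf, so there is a single block layer along z.
numBlocksXY = ceil(tileSize ./ blockSizeXY);
numBlocks = prod(numBlocksXY);

% Map sample points (x-y offsets from the tile origin, hypothetical values)
% to 1-based block indices on the assumed grid.
xy = [0 0; 50.9 10; 51 51; 499.9 499.9];
blockIdx = floor(xy ./ blockSizeXY) + 1;

disp(numBlocksXY)
disp(blockIdx)
```

Under these assumptions, each tile yields up to 10-by-10 blocks. Blocks in the last row and column cover only part of the nominal 51-meter extent.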

Create a matlab.io.datastore.FileSet object to collect all the point cloud files in the training data.

fs = matlab.io.datastore.FileSet(trainDataFolder);

Create a blockedPointCloud (Lidar Toolbox) object using the FileSet object.

bpc = blockedPointCloud(fs,blocksize);

Note: Processing can take some time. The code suspends MATLAB® execution until processing is complete.

Use the helperCalculateClassWeights helper function, attached to this example as a supporting file, to calculate the point distribution across all the classes in the training dataset.

numClasses = numel(classNames);
[weights,maxLabel,maxWeight] = helperCalculateClassWeights(fs,numClasses);
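
The helperCalculateClassWeights source is not reproduced in this example, so its exact weighting scheme is not known here. As a hedged sketch of one common approach, median frequency balancing, the following code derives class weights from per-class point counts. The counts and variable names are hypothetical and are not DALES statistics.

```matlab
% Hypothetical per-class point counts (illustrative numbers only).
pointCounts = [5000 3000 100 50 20 80 10 1740];

% Fraction of all points that belong to each class.
classFrequency = pointCounts ./ sum(pointCounts);

% Median frequency balancing (an assumption, not necessarily what the
% helper does): rare classes get weights above 1, common classes below 1.
exampleWeights = median(classFrequency) ./ classFrequency;

% Index of the most frequent class.
[~,mostFrequentLabel] = max(pointCounts);

disp(exampleWeights)
```

With a scheme like this, a class that contributes few points receives a proportionally larger weight in the loss.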

Create Datastore Object for Training

Create a blockedPointCloudDatastore (Lidar Toolbox) object from the blocked point cloud, bpc, to train the network.

ldsTrain = blockedPointCloudDatastore(bpc,MinPoints=10);

Specify label IDs from 1 to the number of classes.

labelIDs = 1 : numClasses;

Preview and display the point cloud.

ptcld = preview(ldsTrain);
figure;
pcshow(ptcld.Location);
title("Cropped Point Cloud");

For faster training, set a fixed number of points per block.

numPoints = 8192;

Transform the data to make it compatible with the input layer of the network by using the helperTransformToTrainData function, defined at the end of this example. The function applies these steps:

  • Extract the point cloud and the respective labels.

  • Downsample the point cloud and its labels to a specified number of points, numPoints.

  • Normalize the point clouds to the range [0 1].

  • Convert the point cloud and the corresponding labels to make them compatible with the input layer of the network.

ldsTransformed = transform(ldsTrain,@(x,info) helperTransformToTrainData(x, ...
    numPoints,info,labelIDs,classNames),'IncludeInfo',true);
read(ldsTransformed)
ans=1×2 cell array
    {8192×3 double}    {8192×1 categorical}
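
The downsampling and normalization steps can be sketched in base MATLAB as follows. This is a minimal illustration, assuming random sampling without replacement and per-axis min-max scaling. The helperDownsamplePoints and helperNormalizePointCloud supporting files are not shown in this example and may use different strategies, for instance when a block has fewer points than numPoints. All names and values below are hypothetical.

```matlab
rng(0);                 % make the random sampling reproducible
numPointsDemo = 4;      % small stand-in for numPoints

% Toy block: six points (x,y,z) with one label per point.
xyz = [0 0 0; 10 0 2; 0 20 4; 10 20 6; 5 10 3; 2 4 1];
pointLabels = [1; 1; 2; 2; 8; 1];

% Downsample: choose numPointsDemo rows and keep the labels aligned.
idx = randperm(size(xyz,1),numPointsDemo);
xyzSparse = xyz(idx,:);
labelsSparse = pointLabels(idx);

% Normalize each coordinate axis to the range [0 1].
minXYZ = min(xyzSparse,[],1);
rangeXYZ = max(xyzSparse,[],1) - minXYZ;
rangeXYZ(rangeXYZ == 0) = 1;   % guard against division by zero
xyzNormalized = (xyzSparse - minXYZ) ./ rangeXYZ;

disp(size(xyzNormalized))
```

The result has the same layout as the datastore output above: a numPoints-by-3 coordinate matrix with one label per row.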

Define PointNet++ Model

PointNet++ is a popular neural network used for semantic segmentation of unorganized lidar point clouds. Semantic segmentation associates each point in a 3-D point cloud with a class label, such as car, truck, ground, or vegetation. For more information, see Get Started with PointNet++ (Lidar Toolbox).

Define the PointNet++ architecture using the pointnetplusNetwork function.

net = pointnetplusNetwork(numPoints,3,numClasses);

To handle the class imbalance in the DALES dataset, use a weighted cross-entropy loss function. The crossentropy function scales the loss term of each class by its weight, so misclassifying a point that belongs to a class with a higher weight is penalized more.

% Define the loss function.
lossfun = @(Y,T) mean(mean(sum(crossentropy(Y,T,weights,'WeightsFormat','UC','Reduction','none'),3),1),4);
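
To make the effect of class weights concrete, the following base MATLAB sketch computes unweighted and weighted cross-entropy by hand for three points and two classes. It illustrates the idea only and does not reproduce the exact reduction over dimensions used by lossfun. The probabilities and weights are hypothetical.

```matlab
% Predicted class probabilities for 3 points over 2 classes (rows sum to 1).
probs = [0.9 0.1;
         0.6 0.4;
         0.2 0.8];

% True class index of each point.
trueClass = [1; 2; 2];

% Hypothetical class weights: class 2 is treated as the rarer class.
classWeights = [0.5; 2];

% Probability that the network assigned to the true class of each point.
numPts = size(probs,1);
pTrue = probs(sub2ind(size(probs),(1:numPts)',trueClass));

% Per-point cross-entropy, without and with class weights.
lossPerPoint = -log(pTrue);
weightedLossPerPoint = classWeights(trueClass) .* lossPerPoint;

meanLoss = mean(lossPerPoint);
meanWeightedLoss = mean(weightedLossPerPoint);
fprintf("Unweighted: %.4f, weighted: %.4f\n",meanLoss,meanWeightedLoss);
```

The second point, a poorly predicted member of the higher-weight class, dominates the weighted loss, which is the behavior the weights are meant to produce.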

Specify Training Options

Use the Adam optimization algorithm to train the network. Use the trainingOptions function to specify the hyperparameters.

Train the network using a CPU or GPU. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. For more information, see GPU Computing Requirements (Parallel Computing Toolbox). To automatically detect if you have a GPU available, set executionEnvironment to "auto". If you do not have a GPU, or do not want to use one for training, set executionEnvironment to "cpu". To ensure the use of a GPU for training, set executionEnvironment to "gpu".

executionEnvironment = "auto";
if canUseParallelPool
    dispatchInBackground = true;
else
    dispatchInBackground = false;
end

learningRate = 0.0005;
l2Regularization = 0.01;
numEpochs = 20;
miniBatchSize = 16;
learnRateDropFactor = 0.1;
learnRateDropPeriod = 10;
gradientDecayFactor = 0.9;
squaredGradientDecayFactor = 0.999;

options = trainingOptions('adam', ...
    'InitialLearnRate',learningRate, ...
    'L2Regularization',l2Regularization, ...
    'MaxEpochs',numEpochs, ...
    'MiniBatchSize',miniBatchSize, ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropFactor',learnRateDropFactor, ...
    'LearnRateDropPeriod',learnRateDropPeriod, ...
    'GradientDecayFactor',gradientDecayFactor, ...
    'SquaredGradientDecayFactor',squaredGradientDecayFactor, ...
    'ExecutionEnvironment',executionEnvironment, ...
    'DispatchInBackground',dispatchInBackground, ...
    'Plots','training-progress');
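
With the piecewise schedule, the learning rate is multiplied by learnRateDropFactor every learnRateDropPeriod epochs. The sketch below tabulates the resulting rate per epoch for the values used above, assuming the first drop takes effect at the start of epoch 11. The variable learnRatePerEpoch is introduced here for illustration only.

```matlab
learningRate = 0.0005;
learnRateDropFactor = 0.1;
learnRateDropPeriod = 10;
numEpochs = 20;

% Learning rate in effect during each epoch under the assumed schedule.
epochs = 1:numEpochs;
learnRatePerEpoch = learningRate .* ...
    learnRateDropFactor .^ floor((epochs - 1) ./ learnRateDropPeriod);

disp([epochs' learnRatePerEpoch'])
```

Under this assumption, epochs 1 to 10 train at 5e-4 and epochs 11 to 20 at 5e-5.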

Note: Reduce the miniBatchSize value to control memory usage when training.
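
As a rough guide to how memory scales with the mini-batch size, the following sketch estimates the size of the input data alone for one mini-batch. It assumes single-precision storage (4 bytes per value), which is an assumption rather than something this example states. Intermediate activations inside the network typically need far more memory than the input, but they also grow in proportion to the mini-batch size.

```matlab
miniBatchSize = 16;
numPoints = 8192;
numChannels = 3;        % x, y, and z coordinates
bytesPerValue = 4;      % assumption: single precision

% Memory taken by the input coordinates of one mini-batch.
inputBytes = miniBatchSize * numPoints * numChannels * bytesPerValue;
inputMiB = inputBytes / 2^20;

fprintf("Input per mini-batch: %.2f MiB\n",inputMiB);
```

Halving miniBatchSize halves this figure and, roughly, the activation memory as well.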

Train Model

To train the network, set the doTraining variable to true. Otherwise, load a pretrained network. You can train the network on a CPU or a GPU. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. For more information, see GPU Computing Requirements (Parallel Computing Toolbox).

doTraining = false;
if doTraining
    % Train the network on the ldsTransformed datastore using 
    % the trainnet function.
    [net,info] = trainnet(ldsTransformed,net,lossfun,options);
else
    % Load the pretrained network.
    load('pointnetplusNetworkTrained','net');
end

Segment Aerial Point Cloud

To perform segmentation on the test point cloud, first create a blockedPointCloud (Lidar Toolbox) object, then create a blockedPointCloudDatastore (Lidar Toolbox) object.

Apply the same transformations used on the training data to the test data:

  • Extract the point cloud and the respective labels.

  • Downsample the point cloud to a specified number, numPoints.

  • Normalize the point clouds to the range [0 1].

  • Convert the point cloud to make it compatible with the input layer of the network.

tbpc = blockedPointCloud(fullfile(testDataFolder,'5080_54470.las'),blocksize);
tbpcds = blockedPointCloudDatastore(tbpc);

Define numNearestNeighbors and radius to find the nearest points in the downsampled point cloud for each point in the dense point cloud and to perform interpolation effectively.

numNearestNeighbors = 20;
radius = 0.05;
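
The helperInterpolate supporting file is not shown in this example. As a hedged sketch of how labels might be propagated from the sparse to the dense point cloud, the following base MATLAB code assigns each dense point the majority label of its nearest sparse neighbors within a radius, and falls back to a default label when no neighbor is in range. The majority vote, the fallback rule, and all names and values are assumptions. Because the example passes normalized point clouds to the helper, the radius here is also expressed in normalized units.

```matlab
% Sparse points with predicted labels, and dense points to be labeled.
sparseXYZ = [0 0 0; 0.01 0 0; 0.02 0 0; 1 1 1];
sparseLabels = [1; 1; 2; 8];
denseXYZ = [0.005 0 0; 1 1 1.01; 5 5 5];

k = 3;                  % stand-in for numNearestNeighbors
searchRadius = 0.05;    % stand-in for radius
fallbackLabel = 1;      % hypothetical default when no neighbor is in range

numDense = size(denseXYZ,1);
denseLabels = zeros(numDense,1);
for i = 1:numDense
    % Euclidean distance from dense point i to every sparse point.
    d = sqrt(sum((sparseXYZ - denseXYZ(i,:)).^2,2));

    % Take the k nearest sparse points, then keep those within the radius.
    [~,order] = sort(d);
    nearest = order(1:min(k,numel(order)));
    nearest = nearest(d(nearest) <= searchRadius);

    if isempty(nearest)
        denseLabels(i) = fallbackLabel;
    else
        % Majority vote among the retained neighbors.
        denseLabels(i) = mode(sparseLabels(nearest));
    end
end

disp(denseLabels')
```

This brute-force search is quadratic in the number of points and is meant only to show the logic. A spatial index would be needed for blocks of realistic size.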

Initialize an empty placeholder for the predictions.

labelsDensePred = [];

Perform inference on this test point cloud to compute prediction labels. Interpolate the prediction labels to obtain labels for the dense point cloud. Iterate over all the non-overlapping blocks and predict the labels using the predict function.

while hasdata(tbpcds)

    % Read the block along with block information.
    ptCloudDense = read(tbpcds);

    % Use the helperDownsamplePoints function, attached to this example as a
    % supporting file, to extract a downsampled point cloud from the
    % dense point cloud.
    ptCloudSparse = helperDownsamplePoints(ptCloudDense{1},[],numPoints);
                       
    % Use the helperNormalizePointCloud function, attached to this example as
    % a supporting file, to normalize the point cloud between 0 and 1.
    ptCloudSparseNormalized = helperNormalizePointCloud(ptCloudSparse);
    ptCloudDenseNormalized = helperNormalizePointCloud(ptCloudDense{1});
    
    % Use the helperTransformToTestData function, defined at the end of this
    % example, to wrap the point cloud in a cell array and extract its point
    % locations so that the data is compatible with the input layer of the
    % network.
    ptCloudSparseForPrediction = helperTransformToTestData(ptCloudSparseNormalized);
    
    % Get the output predictions.
    labelsSparsePred = predict(net,ptCloudSparseForPrediction{1,1});
    [~,labelsSparsePred] = max(labelsSparsePred,[],3);
    
    % Use the helperInterpolate function, attached to this example as a
    % supporting file, to calculate labels for the dense point cloud,
    % using the sparse point cloud and labels predicted on the sparse point cloud.
    interpolatedLabels = helperInterpolate(ptCloudDenseNormalized, ...
        ptCloudSparseNormalized,labelsSparsePred,numNearestNeighbors, ...
        radius,maxLabel,numClasses);  
    
    % Concatenate the predicted labels from the blocks.
    labelsDensePred = vertcat(labelsDensePred,interpolatedLabels);
end
Starting parallel pool (parpool) using the 'Processes' profile ...
Connected to parallel pool with 32 workers.

For better visualization, display only the last block inferred from the point cloud data.

figure;
ax = pcshow(ptCloudDense{1}.Location,interpolatedLabels);
axis off;
helperLabelColorbar(ax,classNames);
title("Point Cloud Overlaid with Detected Semantic Labels");

Evaluate Network

To perform evaluation on the test data, get the labels from the test point cloud. The labels for the test data are already predicted in the previous step. Hence, iterate over the non-overlapping blocks of the point cloud and extract the ground truth labels.

Initialize the placeholders for target labels.

labelsDenseTarget = [];

Loop over the blocked point cloud datastore and get the ground truth labels.

reset(tbpcds);

while hasdata(tbpcds)    
    % Read the block along with block information.
    [~,infoDense] = read(tbpcds);

    % Extract the labels from the block information.
    labelsDense = infoDense.PointAttributes.Classification;
    
    % Concatenate the target labels from the blocks.
    labelsDenseTarget = vertcat(labelsDenseTarget,labelsDense);
end

Use the evaluateSemanticSegmentation (Computer Vision Toolbox) function to compute the semantic segmentation metrics from the test set results. The predicted and target labels were computed previously and are stored in the labelsDensePred and labelsDenseTarget variables, respectively.

confusionMatrix = segmentationConfusionMatrix(double(labelsDensePred), ...
    double(labelsDenseTarget),'Classes',1:numClasses);
metrics = evaluateSemanticSegmentation({confusionMatrix},classNames,'Verbose',false);

You can measure the amount of overlap per class using the intersection-over-union (IoU) metric.
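
For reference, the IoU of a class is TP / (TP + FP + FN), where TP, FP, and FN are the true positive, false positive, and false negative point counts for that class. The following sketch derives per-class IoU and accuracy from a small confusion matrix in base MATLAB. The matrix is hypothetical, and the code assumes that rows index the true class and columns index the predicted class.

```matlab
% Hypothetical 3-class confusion matrix (illustrative counts only).
% Rows: true class. Columns: predicted class.
confMat = [50  2  0;
            5 30  5;
            0  3  5];

truePositives = diag(confMat);
falseNegatives = sum(confMat,2) - truePositives;    % row sums minus diagonal
falsePositives = sum(confMat,1)' - truePositives;   % column sums minus diagonal

% Per-class metrics.
classIoU = truePositives ./ ...
    (truePositives + falsePositives + falseNegatives);
classAccuracy = truePositives ./ sum(confMat,2);

% Fraction of all points labeled correctly.
globalAccuracy = sum(truePositives) / sum(confMat(:));

disp([classAccuracy classIoU])
```

Because false positives count against IoU but not against per-class accuracy, the IoU of a class is never higher than its accuracy.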

The evaluateSemanticSegmentation (Computer Vision Toolbox) function returns metrics for the entire data set, for individual classes, and for each test image. To see the metrics at the data set level, use the metrics.DataSetMetrics property.

metrics.DataSetMetrics
ans=1×4 table
    GlobalAccuracy    MeanAccuracy    MeanIoU    WeightedIoU
    ______________    ____________    _______    ___________

       0.93648          0.66492       0.5344       0.89048  

The data set metrics provide a high-level overview of network performance. To see the impact each class has on the overall performance, inspect the metrics for each class using the metrics.ClassMetrics property.

metrics.ClassMetrics
ans=8×2 table
                  Accuracy      IoU   
                  ________    ________

    ground        0.99254        0.943
    vegetation    0.85796      0.83182
    cars          0.57798      0.40791
    trucks        0.15883     0.056435
    powerlines    0.75769      0.67357
    fences        0.50396      0.24061
    poles         0.53049      0.22384
    buildings     0.93989      0.89801
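
The MeanAccuracy and MeanIoU values in the data set metrics are the unweighted averages of these per-class values, which the following check reproduces from the table above. Because every class counts equally in these averages, a few poorly segmented classes pull the means well below the global accuracy.

```matlab
% Per-class values copied from the class metrics table, in table order.
classAccuracy = [0.99254 0.85796 0.57798 0.15883 ...
    0.75769 0.50396 0.53049 0.93989];
classIoU = [0.943 0.83182 0.40791 0.056435 ...
    0.67357 0.24061 0.22384 0.89801];

% Unweighted averages over the eight classes.
meanAccuracy = mean(classAccuracy);
meanIoU = mean(classIoU);

fprintf("MeanAccuracy: %.5f, MeanIoU: %.4f\n",meanAccuracy,meanIoU);
```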

Although the overall network performance is good, the low IoU values for classes such as trucks (0.056), poles (0.22), and fences (0.24) indicate that more training data is required for better performance on these classes.

Supporting Functions

The helperLabelColorbar function adds a colorbar to the specified axes. The colorbar is formatted to display each class name next to its color.

function helperLabelColorbar(ax,classNames)
% Colormap for the original classes.
cmap = [[0 0 255];
    [0 255 0];
    [255 192 203];
    [255 255 0];
    [255 0 255];
    [255 165 0];
    [139 0 150];
    [255 0 0]];
cmap = cmap./255;
cmap = cmap(1:numel(classNames),:);
colormap(ax,cmap);

% Add colorbar to current figure.
c = colorbar(ax);
c.Color = 'w';

% Center tick labels and use class names for tick marks.
numClasses = size(classNames,1);
c.Ticks = 1:1:numClasses;
c.TickLabels = classNames;

% Remove tick mark.
c.TickLength = 0;
end

The helperTransformToTrainData function performs this set of transforms on the input data:

  • Extract the point cloud and the respective labels.

  • Downsample the point cloud and its labels to a specified number of points, numPoints.

  • Normalize the point clouds to the range [0 1].

  • Convert the point cloud and the corresponding labels to make them compatible with the input layer of the network.

function [cellout,dataout] = helperTransformToTrainData(data,numPoints,info,...
    labelIDs,classNames)
if ~iscell(data)
    data = {data};
end
numObservations = size(data,1);
cellout = cell(numObservations,2);
dataout = cell(numObservations,2);
for i = 1:numObservations 
    classification = info.PointAttributes(i).Classification;

    % Remove labels with zero value.
    pc = data{i,1};
    pc = select(pc,(classification ~= 0));
    classification = classification(classification ~= 0);

    % Use the helperDownsamplePoints function, attached to this example as a
    % supporting file, to extract a downsampled point cloud and its labels
    % from the dense point cloud.
    [ptCloudOut,labelsOut] = helperDownsamplePoints(pc, ...
    classification,numPoints);

    % Make the spatial extent of the dense point cloud and the sparse point
    % cloud the same.
    limits = [ptCloudOut.XLimits;ptCloudOut.YLimits;...
                    ptCloudOut.ZLimits];
    ptCloudSparseLocation = ptCloudOut.Location;
    ptCloudSparseLocation(1:2,:) = limits(:,1:2)';
    ptCloudSparseUpdated = pointCloud(ptCloudSparseLocation, ...
        'Intensity',ptCloudOut.Intensity, ...
        'Color',ptCloudOut.Color, ...
        'Normal',ptCloudOut.Normal);

    % Use the helperNormalizePointCloud function, attached to this example as
    % a supporting file, to normalize the point cloud between 0 and 1.    
    ptCloudOutSparse = helperNormalizePointCloud( ...
        ptCloudSparseUpdated);
    cellout{i,1} = ptCloudOutSparse.Location;

    % Permuted output.
    cellout{i,2} = permute(categorical(labelsOut,labelIDs,classNames),[1 3 2]);

    % General output.
    dataout{i,1} = ptCloudOutSparse;
    dataout{i,2} = labelsOut;
end
end
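
The label conversion in this function relies on the three-argument form of categorical, which maps each numeric label ID to a class name. The following standalone sketch shows that mapping on a few hypothetical labels, and also shows that an ID outside labelIDs, such as 0, becomes an undefined category.

```matlab
classNames = ["ground"; "vegetation"; "cars"; "trucks"; ...
    "powerlines"; "fences"; "poles"; "buildings"];
labelIDs = 1:numel(classNames);

% Hypothetical numeric labels for four points.
numericLabels = [1; 2; 8; 4];

% Map each label ID to the class name at the same position in classNames.
categoricalLabels = categorical(numericLabels,labelIDs,classNames);
disp(categoricalLabels')

% An ID that is not listed in labelIDs maps to <undefined>.
withUnlabeled = categorical([0; 1],labelIDs,classNames);
disp(isundefined(withUnlabeled)')
```

This is one reason the function removes points with a label of 0 before the conversion: they would otherwise reach the loss function as undefined targets.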

The helperTransformToTestData function wraps the point cloud in a cell array and replaces each point cloud with its matrix of point locations to make the data compatible with the input layer of the network.

function data = helperTransformToTestData(data)
if ~iscell(data)
    data = {data};
end
numObservations = size(data,1);
for i = 1:numObservations
    tmp = data{i,1}.Location;
    data{i,1} = tmp;
end
end

References

[1] Varney, Nina, Vijayan K. Asari, and Quinn Graehling. "DALES: A Large-Scale Aerial LiDAR dataset for Semantic Segmentation." ArXiv:2004.11985 [Cs, Stat], April 14, 2020. https://arxiv.org/abs/2004.11985.

[2] Qi, Charles R., Li Yi, Hao Su, and Leonidas J. Guibas. "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space." ArXiv:1706.02413 [Cs], June 7, 2017. https://arxiv.org/abs/1706.02413.