validate

Quantize and validate a deep neural network

Since R2020a

Description

valResults = validate(quantObj,valData) quantizes the weights, biases, and activations in the convolution layers of the network specified by the dlquantizer object, quantObj, and then validates the quantized network using the data specified by valData.

valResults = validate(quantObj,valData,quantOpts) quantizes and validates the network with additional options specified by quantOpts.

This function requires Deep Learning Toolbox Model Quantization Library. To learn about the products required to quantize a deep neural network, see Quantization Workflow Prerequisites.

Examples

This example shows how to quantize learnable parameters in the convolution layers of a neural network for a GPU target and explore the behavior of the quantized network. In this example, you quantize the squeezenet neural network after retraining it to classify new images, as described in the Train Deep Learning Network to Classify New Images example. Quantization reduces the memory required for the network by approximately 75% without affecting the accuracy of the network.

Load the pretrained network. net is the output network of the Train Deep Learning Network to Classify New Images example.

load squeezenetmerch
net
net = 
  DAGNetwork with properties:

         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}

Define calibration and validation data to use for quantization.

The calibration data is used to collect the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network and the dynamic ranges of the activations in all layers of the network. For the best quantization results, the calibration data must be representative of inputs to the network.

The validation data is used to test the network after quantization to understand the effects of the limited range and precision of the quantized convolution layers in the network.

In this example, use the images in the MerchData data set. Define an augmentedImageDatastore object to resize the data for the network. Then, split the data into calibration and validation data sets.

unzip('MerchData.zip');
imds = imageDatastore('MerchData', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
[calData, valData] = splitEachLabel(imds, 0.7, 'randomized');
aug_calData = augmentedImageDatastore([227 227], calData);
aug_valData = augmentedImageDatastore([227 227], valData);

Create a dlquantizer object and specify the network to quantize.

dlquantObj = dlquantizer(net);

Specify the GPU target.

quantOpts = dlquantizationOptions('Target','gpu');

Use the calibrate function to exercise the network with sample inputs and collect range information. The calibrate function exercises the network and collects the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network and the dynamic ranges of the activations in all layers of the network. The function returns a table. Each row of the table contains range information for a learnable parameter of the optimized network.

calResults = calibrate(dlquantObj, aug_calData)
calResults=121×5 table
        Optimized Layer Name         Network Layer Name     Learnables / Activations    MinValue     MaxValue
    ____________________________    ____________________    ________________________    _________    ________

    {'conv1_Weights'           }    {'conv1'           }           "Weights"             -0.91985     0.88489
    {'conv1_Bias'              }    {'conv1'           }           "Bias"                -0.07925     0.26343
    {'fire2-squeeze1x1_Weights'}    {'fire2-squeeze1x1'}           "Weights"                -1.38      1.2477
    {'fire2-squeeze1x1_Bias'   }    {'fire2-squeeze1x1'}           "Bias"                -0.11641     0.24273
    {'fire2-expand1x1_Weights' }    {'fire2-expand1x1' }           "Weights"              -0.7406     0.90982
    {'fire2-expand1x1_Bias'    }    {'fire2-expand1x1' }           "Bias"               -0.060056     0.14602
    {'fire2-expand3x3_Weights' }    {'fire2-expand3x3' }           "Weights"             -0.74397     0.66905
    {'fire2-expand3x3_Bias'    }    {'fire2-expand3x3' }           "Bias"               -0.051778    0.074239
    {'fire3-squeeze1x1_Weights'}    {'fire3-squeeze1x1'}           "Weights"              -0.7712     0.68917
    {'fire3-squeeze1x1_Bias'   }    {'fire3-squeeze1x1'}           "Bias"                -0.10138     0.32675
    {'fire3-expand1x1_Weights' }    {'fire3-expand1x1' }           "Weights"             -0.72035      0.9743
    {'fire3-expand1x1_Bias'    }    {'fire3-expand1x1' }           "Bias"               -0.067029     0.30425
    {'fire3-expand3x3_Weights' }    {'fire3-expand3x3' }           "Weights"             -0.61443      0.7741
    {'fire3-expand3x3_Bias'    }    {'fire3-expand3x3' }           "Bias"               -0.053613     0.10329
    {'fire4-squeeze1x1_Weights'}    {'fire4-squeeze1x1'}           "Weights"              -0.7422      1.0877
    {'fire4-squeeze1x1_Bias'   }    {'fire4-squeeze1x1'}           "Bias"                -0.10885     0.13881
      ⋮

Use the validate function to quantize the learnable parameters in the convolution layers of the network and exercise the network. The function uses the metric function defined in the dlquantizationOptions object to compare the results of the network before and after quantization.

valResults = validate(dlquantObj, aug_valData, quantOpts)
valResults = struct with fields:
       NumSamples: 20
    MetricResults: [1×1 struct]
       Statistics: [2×2 table]

Examine the validation output to see the performance of the quantized network.

valResults.MetricResults.Result
ans=2×2 table
    NetworkImplementation    MetricOutput
    _____________________    ____________

     {'Floating-Point'}           1      
     {'Quantized'     }           1      

valResults.Statistics
ans=2×2 table
    NetworkImplementation    LearnableParameterMemory(bytes)
    _____________________    _______________________________

     {'Floating-Point'}                2.9003e+06           
     {'Quantized'     }                7.3393e+05           

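In this example, quantization reduced the memory required for the learnable parameters of the network by approximately 75%. The accuracy of the network was not affected: the metric output is 1 for both the floating-point and the quantized implementations.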
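The reduction figure can be checked directly from the LearnableParameterMemory(bytes) values in the Statistics table. The following is a minimal sketch of that arithmetic; the variable names are illustrative and are not part of the validate output.

```matlab
% Values copied from the valResults.Statistics table above.
floatBytes = 2.9003e+06;   % 'Floating-Point' row
quantBytes = 7.3393e+05;   % 'Quantized' row

% Relative memory reduction in percent, and the compression ratio.
reductionPct = 100 * (1 - quantBytes / floatBytes);
compressionRatio = floatBytes / quantBytes;

fprintf('Memory reduction: %.1f%%\n', reductionPct);      % prints 74.7%
fprintf('Compression ratio: %.2f\n', compressionRatio);
```

The ratio is close to 4, which is roughly what replacing 32-bit values with 8-bit values would suggest. That interpretation assumes the floating-point parameters are stored as single precision, which this example does not state explicitly.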

The weights, biases, and activations of the convolution layers of the network specified in the dlquantizer object now use scaled 8-bit integer data types.
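To give a feel for what a scaled 8-bit integer representation means, the following sketch quantizes a few values using the conv1_Weights range from the calibration table. It is a generic illustration only: the signed 8-bit word with a power-of-two scale factor is an assumption, and the exact scheme that the library applies may differ. All variable names are hypothetical.

```matlab
% Range of conv1_Weights, copied from the calibration table above.
minVal = -0.91985;
maxVal =  0.88489;

% Assumption: signed 8-bit word with a power-of-two scale factor.
wordLength = 8;
maxAbs     = max(abs(minVal), abs(maxVal));
intBits    = ceil(log2(maxAbs));           % integer bits needed to cover the range
fracLength = wordLength - 1 - intBits;     % remaining bits hold the fraction
scale      = 2^(-fracLength);              % value of one integer step

% Quantize: scale, round to nearest, and saturate to the int8 range.
quantize = @(x) min(max(round(x / scale), -128), 127);

x           = [minVal 0.5 maxVal];
q           = quantize(x);                 % stored integers: [-118 64 113]
dequantized = q * scale;                   % values the integers represent
maxError    = max(abs(dequantized - x));   % no larger than scale/2 here
```

With this range the sketch selects a fraction length of 7, so each integer step is 1/128 and the rounding error for in-range values is at most 1/256.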

Reduce the memory footprint of a deep neural network by quantizing the weights, biases, and activations of convolution layers to 8-bit scaled integer data types. This example shows how to use Deep Learning Toolbox Model Quantization Library and Deep Learning HDL Toolbox to deploy the int8 network to a target FPGA board.

For this example, you need:

  • Deep Learning Toolbox™

  • Deep Learning HDL Toolbox™

  • Deep Learning Toolbox Model Quantization Library

  • Deep Learning HDL Toolbox Support Package for Xilinx® FPGA and SoC Devices

  • MATLAB Coder Interface for Deep Learning

Load Pretrained Network

Load the pretrained LogoNet network and analyze the network architecture.

snet = getLogoNetwork;
deepNetworkDesigner(snet);

Set random number generator for reproducibility.

rng(0);

Load Data

This example uses the logos_dataset data set. The data set consists of 320 images. Each image is 227-by-227 in size and has three color channels (RGB). Create an imageDatastore object for the images, and then split it evenly into calibration and validation data sets.

curDir = pwd;
unzip("logos_dataset.zip");
imageData = imageDatastore(fullfile(curDir,'logos_dataset'),...
'IncludeSubfolders',true,'FileExtensions','.JPG','LabelSource','foldernames');
[calibrationData, validationData] = splitEachLabel(imageData, 0.5,'randomized');

Generate Calibration Result File for the Network

Create a dlquantizer (Deep Learning HDL Toolbox) object and specify the network to quantize. Specify the execution environment as FPGA.

dlQuantObj = dlquantizer(snet,'ExecutionEnvironment',"FPGA");

Use the calibrate (Deep Learning HDL Toolbox) function to exercise the network with sample inputs and collect the range information. The calibrate function collects the dynamic ranges of the weights and biases. The calibrate function returns a table. Each row of the table contains range information for a learnable parameter of the quantized network.

calibrate(dlQuantObj,calibrationData)
ans=35×5 table
        Optimized Layer Name        Network Layer Name    Learnables / Activations     MinValue       MaxValue 
    ____________________________    __________________    ________________________    ___________    __________

    {'conv_1_Weights'          }      {'conv_1'    }           "Weights"                -0.048978      0.039352
    {'conv_1_Bias'             }      {'conv_1'    }           "Bias"                     0.99996        1.0028
    {'conv_2_Weights'          }      {'conv_2'    }           "Weights"                -0.055518      0.061901
    {'conv_2_Bias'             }      {'conv_2'    }           "Bias"                 -0.00061171       0.00227
    {'conv_3_Weights'          }      {'conv_3'    }           "Weights"                -0.045942      0.046927
    {'conv_3_Bias'             }      {'conv_3'    }           "Bias"                  -0.0013998     0.0015218
    {'conv_4_Weights'          }      {'conv_4'    }           "Weights"                -0.045967         0.051
    {'conv_4_Bias'             }      {'conv_4'    }           "Bias"                    -0.00164     0.0037892
    {'fc_1_Weights'            }      {'fc_1'      }           "Weights"                -0.051394      0.054344
    {'fc_1_Bias'               }      {'fc_1'      }           "Bias"                 -0.00052319    0.00084454
    {'fc_2_Weights'            }      {'fc_2'      }           "Weights"                 -0.05016      0.051557
    {'fc_2_Bias'               }      {'fc_2'      }           "Bias"                  -0.0017564     0.0018502
    {'fc_3_Weights'            }      {'fc_3'      }           "Weights"                -0.050706       0.04678
    {'fc_3_Bias'               }      {'fc_3'      }           "Bias"                    -0.02951      0.024855
    {'imageinput'              }      {'imageinput'}           "Activations"                    0           255
    {'imageinput_normalization'}      {'imageinput'}           "Activations"              -139.34        198.72
      ⋮

Create Target Object

Create a target object with a custom name for your target device and an interface to connect your target device to the host computer. Interface options are JTAG and Ethernet. To use JTAG, install Xilinx Vivado® Design Suite 2022.1. To set the Xilinx Vivado toolpath, enter:

hdlsetuptoolpath('ToolName', 'Xilinx Vivado', 'ToolPath', 'C:\Xilinx\Vivado\2022.1\bin\vivado.bat');

To create the target object, enter:

hTarget = dlhdl.Target('Xilinx','Interface','Ethernet','IPAddress','10.10.10.15');

Alternatively, you can also use the JTAG interface.

% hTarget = dlhdl.Target('Xilinx', 'Interface', 'JTAG');

Create dlquantizationOptions Object

Create a dlquantizationOptions object. Specify the target bitstream and target board interface. The default metric function is a Top-1 accuracy metric function.

options_FPGA = dlquantizationOptions('Bitstream','zcu102_int8','Target',hTarget);
options_emulation = dlquantizationOptions('Target','host');

To use a custom metric function, specify the metric function in the dlquantizationOptions object.

options_FPGA = dlquantizationOptions('MetricFcn',{@(x)hComputeAccuracy(x,snet,validationData)},'Bitstream','zcu102_int8','Target',hTarget);
options_emulation = dlquantizationOptions('MetricFcn',{@(x)hComputeAccuracy(x,snet,validationData)});
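The helper hComputeAccuracy is not listed in this section. As a rough illustration of the kind of computation a Top-1 accuracy metric performs, the following sketch compares the highest-scoring class for each observation against the true class. The score matrix layout (one row per observation, one column per class) and all variable names are assumptions made for this sketch, not the helper's actual implementation.

```matlab
% Hypothetical prediction scores: 4 observations (rows), 3 classes (columns).
scores = [0.7 0.2 0.1;
          0.1 0.8 0.1;
          0.3 0.3 0.4;
          0.6 0.3 0.1];

% Hypothetical ground-truth class indices, one per observation.
trueIdx = [1; 2; 3; 2];

% Top-1 prediction: index of the largest score in each row.
[~, predIdx] = max(scores, [], 2);

% Fraction of observations whose top prediction matches the ground truth.
accuracy = mean(predIdx == trueIdx);   % 3 of 4 correct, so 0.75
```

A metric function passed through 'MetricFcn' would wrap a computation like this and return the scalar result, so that validate can report it for both the original and the quantized network.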

Validate Quantized Network

Use the validate function to quantize the learnable parameters in the convolution layers of the network. The validate function simulates the quantized network in MATLAB. The validate function uses the metric function defined in the dlquantizationOptions object to compare the results of the single-data-type network object to the results of the quantized network object.

prediction_emulation = dlQuantObj.validate(validationData,options_emulation)
prediction_emulation = struct with fields:
       NumSamples: 160
    MetricResults: [1×1 struct]
       Statistics: []

For validation on an FPGA, the validate function:

  • Programs the FPGA board by using the output of the compile method and the programming file

  • Downloads the network weights and biases

  • Compares the performance of the network before and after quantization

prediction_FPGA = dlQuantObj.validate(validationData,options_FPGA)
### Compiling network for Deep Learning FPGA prototyping ...
### Targeting FPGA bitstream zcu102_int8.
### The network includes the following layers:
     1   'imageinput'    Image Input             227×227×3 images with 'zerocenter' normalization and 'randfliplr' augmentations  (SW Layer)
     2   'conv_1'        2-D Convolution         96 5×5×3 convolutions with stride [1  1] and padding [0  0  0  0]                (HW Layer)
     3   'relu_1'        ReLU                    ReLU                                                                             (HW Layer)
     4   'maxpool_1'     2-D Max Pooling         3×3 max pooling with stride [2  2] and padding [0  0  0  0]                      (HW Layer)
     5   'conv_2'        2-D Convolution         128 3×3×96 convolutions with stride [1  1] and padding [0  0  0  0]              (HW Layer)
     6   'relu_2'        ReLU                    ReLU                                                                             (HW Layer)
     7   'maxpool_2'     2-D Max Pooling         3×3 max pooling with stride [2  2] and padding [0  0  0  0]                      (HW Layer)
     8   'conv_3'        2-D Convolution         384 3×3×128 convolutions with stride [1  1] and padding [0  0  0  0]             (HW Layer)
     9   'relu_3'        ReLU                    ReLU                                                                             (HW Layer)
    10   'maxpool_3'     2-D Max Pooling         3×3 max pooling with stride [2  2] and padding [0  0  0  0]                      (HW Layer)
    11   'conv_4'        2-D Convolution         128 3×3×384 convolutions with stride [2  2] and padding [0  0  0  0]             (HW Layer)
    12   'relu_4'        ReLU                    ReLU                                                                             (HW Layer)
    13   'maxpool_4'     2-D Max Pooling         3×3 max pooling with stride [2  2] and padding [0  0  0  0]                      (HW Layer)
    14   'fc_1'          Fully Connected         2048 fully connected layer                                                       (HW Layer)
    15   'relu_5'        ReLU                    ReLU                                                                             (HW Layer)
    16   'fc_2'          Fully Connected         2048 fully connected layer                                                       (HW Layer)
    17   'relu_6'        ReLU                    ReLU                                                                             (HW Layer)
    18   'fc_3'          Fully Connected         32 fully connected layer                                                         (HW Layer)
    19   'softmax'       Softmax                 softmax                                                                          (SW Layer)
    20   'classoutput'   Classification Output   crossentropyex with 'adidas' and 31 other classes                                (SW Layer)
                                                                                                                                
### Notice: The layer 'imageinput' with type 'nnet.cnn.layer.ImageInputLayer' is implemented in software.
### Notice: The layer 'softmax' with type 'nnet.cnn.layer.SoftmaxLayer' is implemented in software.
### Notice: The layer 'classoutput' with type 'nnet.cnn.layer.ClassificationOutputLayer' is implemented in software.
### Compiling layer group: conv_1>>relu_4 ...
### Compiling layer group: conv_1>>relu_4 ... complete.
### Compiling layer group: maxpool_4 ...
### Compiling layer group: maxpool_4 ... complete.
### Compiling layer group: fc_1>>fc_3 ...
### Compiling layer group: fc_1>>fc_3 ... complete.

### Allocating external memory buffers:

          offset_name          offset_address    allocated_space 
    _______________________    ______________    ________________

    "InputDataOffset"           "0x00000000"     "11.9 MB"       
    "OutputResultOffset"        "0x00be0000"     "128.0 kB"      
    "SchedulerDataOffset"       "0x00c00000"     "128.0 kB"      
    "SystemBufferOffset"        "0x00c20000"     "9.9 MB"        
    "InstructionDataOffset"     "0x01600000"     "4.6 MB"        
    "ConvWeightDataOffset"      "0x01aa0000"     "8.2 MB"        
    "FCWeightDataOffset"        "0x022e0000"     "10.4 MB"       
    "EndOffset"                 "0x02d40000"     "Total: 45.2 MB"

### Network compilation complete.

### FPGA bitstream programming has been skipped as the same bitstream is already loaded on the target FPGA.
### Deep learning network programming has been skipped as the same network is already loaded on the target FPGA.
### Finished writing input activations.
### Running single input activation.
### Finished writing input activations.
### Running single input activation.
      ⋮
### Notice: The layer 'imageinput' of type 'ImageInputLayer' is split into an image input layer 'imageinput' and an addition layer 'imageinput_norm' for normalization on hardware.
### The network includes the following layers:
     1   'imageinput'    Image Input             227×227×3 images with 'zerocenter' normalization and 'randfliplr' augmentations  (SW Layer)
     2   'conv_1'        2-D Convolution         96 5×5×3 convolutions with stride [1  1] and padding [0  0  0  0]                (HW Layer)
     3   'relu_1'        ReLU                    ReLU                                                                             (HW Layer)
     4   'maxpool_1'     2-D Max Pooling         3×3 max pooling with stride [2  2] and padding [0  0  0  0]                      (HW Layer)
     5   'conv_2'        2-D Convolution         128 3×3×96 convolutions with stride [1  1] and padding [0  0  0  0]              (HW Layer)
     6   'relu_2'        ReLU                    ReLU                                                                             (HW Layer)
     7   'maxpool_2'     2-D Max Pooling         3×3 max pooling with stride [2  2] and padding [0  0  0  0]                      (HW Layer)
     8   'conv_3'        2-D Convolution         384 3×3×128 convolutions with stride [1  1] and padding [0  0  0  0]             (HW Layer)
     9   'relu_3'        ReLU                    ReLU                                                                             (HW Layer)
    10   'maxpool_3'     2-D Max Pooling         3×3 max pooling with stride [2  2] and padding [0  0  0  0]                      (HW Layer)
    11   'conv_4'        2-D Convolution         128 3×3×384 convolutions with stride [2  2] and padding [0  0  0  0]             (HW Layer)
    12   'relu_4'        ReLU                    ReLU                                                                             (HW Layer)
    13   'maxpool_4'     2-D Max Pooling         3×3 max pooling with stride [2  2] and padding [0  0  0  0]                      (HW Layer)
    14   'fc_1'          Fully Connected         2048 fully connected layer                                                       (HW Layer)
    15   'relu_5'        ReLU                    ReLU                                                                             (HW Layer)
    16   'fc_2'          Fully Connected         2048 fully connected layer                                                       (HW Layer)
    17   'relu_6'        ReLU                    ReLU                                                                             (HW Layer)
    18   'fc_3'          Fully Connected         32 fully connected layer                                                         (HW Layer)
    19   'softmax'       Softmax                 softmax                                                                          (SW Layer)
    20   'classoutput'   Classification Output   crossentropyex with 'adidas' and 31 other classes                                (SW Layer)
                                                                                                                                
### Notice: The layer 'softmax' with type 'nnet.cnn.layer.SoftmaxLayer' is implemented in software.
### Notice: The layer 'classoutput' with type 'nnet.cnn.layer.ClassificationOutputLayer' is implemented in software.


              Deep Learning Processor Estimator Performance Results

                   LastFrameLatency(cycles)   LastFrameLatency(seconds)       FramesNum      Total Latency     Frames/s
                         -------------             -------------              ---------        ---------       ---------
Network                   39136574                  0.17789                       1           39136574              5.6
    imageinput_norm         216472                  0.00098 
    conv_1                 6832680                  0.03106 
    maxpool_1              3705912                  0.01685 
    conv_2                10454501                  0.04752 
    maxpool_2              1173810                  0.00534 
    conv_3                 9364533                  0.04257 
    maxpool_3              1229970                  0.00559 
    conv_4                 1759348                  0.00800 
    maxpool_4                24450                  0.00011 
    fc_1                   2651288                  0.01205 
    fc_2                   1696632                  0.00771 
    fc_3                     26978                  0.00012 
 * The clock frequency of the DL processor is: 220MHz


### Finished writing input activations.
### Running single input activation.
prediction_FPGA = struct with fields:
       NumSamples: 160
    MetricResults: [1×1 struct]
       Statistics: [2×7 table]

View Performance of Quantized Neural Network

Display the accuracy of the quantized network.

prediction_emulation.MetricResults.Result
ans=2×2 table
    NetworkImplementation    MetricOutput
    _____________________    ____________

     {'Floating-Point'}         0.9875   
     {'Quantized'     }         0.9875   

prediction_FPGA.MetricResults.Result
ans=2×2 table
    NetworkImplementation    MetricOutput
    _____________________    ____________

     {'Floating-Point'}         0.9875   
     {'Quantized'     }         0.9875   

Display the performance of the quantized network in frames per second.

prediction_FPGA.Statistics
ans=2×7 table
    NetworkImplementation    FramesPerSecond    Number of Threads (Convolution)    Number of Threads (Fully Connected)    LUT Utilization (%)    BlockRAM Utilization (%)    DSP Utilization (%)
    _____________________    _______________    _______________________________    ___________________________________    ___________________    ________________________    ___________________

     {'Floating-Point'}          5.6213                       16                                    4                           93.198                    63.925                   15.595       
     {'Quantized'     }          19.433                       64                                   16                            62.31                     50.11                   32.103       
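
The estimator output and the Statistics table above can be cross-checked with a little arithmetic. The sketch below uses only numbers printed in this example: it converts the reported cycle count to seconds at the stated 220 MHz clock and compares the two frame rates. The variable names are illustrative and are not part of any toolbox API; the estimator latency is assumed to describe the floating-point implementation because it reproduces that row's frame rate.

```matlab
% Values copied from the estimator output and the Statistics table above.
latencyCycles = 39136574;   % LastFrameLatency(cycles) for the whole network
clockHz       = 220e6;      % DL processor clock frequency (220 MHz)
fpsFloat      = 5.6213;     % FramesPerSecond, 'Floating-Point' row
fpsQuantized  = 19.433;     % FramesPerSecond, 'Quantized' row

% Latency in seconds is the cycle count divided by the clock frequency.
latencySeconds = latencyCycles / clockHz;   % about 0.17789 s

% With one frame in flight, the frame rate is the reciprocal of the latency.
fpsFromLatency = 1 / latencySeconds;        % about 5.62 frames/s

% Ratio of the quantized frame rate to the floating-point frame rate.
speedup = fpsQuantized / fpsFloat;          % about 3.46

fprintf('Latency: %.5f s, frame rate: %.4f frames/s\n', latencySeconds, fpsFromLatency);
fprintf('Quantized speedup over floating point: %.2fx\n', speedup);
```

In this run the quantized implementation is therefore roughly three and a half times faster than the floating-point one, while the accuracy tables above show the same metric output for both.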

This example shows how to quantize and validate a neural network for a CPU target. The workflow is similar to that of the other execution environments, but before validating you must establish a raspi connection and specify it as the target by using dlquantizationOptions.

First, load your network. This example uses the pretrained network squeezenet.

load squeezenetmerch
net
net = 
  DAGNetwork with properties:

         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}

Then define your calibration and validation data, aug_calData and aug_valData, respectively.

unzip('MerchData.zip');
imds = imageDatastore('MerchData', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
[calData, valData] = splitEachLabel(imds, 0.7, 'randomized');
aug_calData = augmentedImageDatastore([227 227],calData);
aug_valData = augmentedImageDatastore([227 227],valData);

Create the dlquantizer object and specify a CPU execution environment.

dq = dlquantizer(net,'ExecutionEnvironment','CPU')
dq = 
  dlquantizer with properties:

           NetworkObject: [1×1 DAGNetwork]
    ExecutionEnvironment: 'CPU'

Calibrate the network.

calResults = calibrate(dq,aug_calData,'UseGPU','off')
calResults=122×5 table
        Optimized Layer Name         Network Layer Name     Learnables / Activations    MinValue     MaxValue
    ____________________________    ____________________    ________________________    _________    ________

    {'conv1_Weights'           }    {'conv1'           }           "Weights"             -0.91985     0.88489
    {'conv1_Bias'              }    {'conv1'           }           "Bias"                -0.07925     0.26343
    {'fire2-squeeze1x1_Weights'}    {'fire2-squeeze1x1'}           "Weights"                -1.38      1.2477
    {'fire2-squeeze1x1_Bias'   }    {'fire2-squeeze1x1'}           "Bias"                -0.11641     0.24273
    {'fire2-expand1x1_Weights' }    {'fire2-expand1x1' }           "Weights"              -0.7406     0.90982
    {'fire2-expand1x1_Bias'    }    {'fire2-expand1x1' }           "Bias"               -0.060056     0.14602
    {'fire2-expand3x3_Weights' }    {'fire2-expand3x3' }           "Weights"             -0.74397     0.66905
    {'fire2-expand3x3_Bias'    }    {'fire2-expand3x3' }           "Bias"               -0.051778    0.074239
    {'fire3-squeeze1x1_Weights'}    {'fire3-squeeze1x1'}           "Weights"              -0.7712     0.68917
    {'fire3-squeeze1x1_Bias'   }    {'fire3-squeeze1x1'}           "Bias"                -0.10138     0.32675
    {'fire3-expand1x1_Weights' }    {'fire3-expand1x1' }           "Weights"             -0.72035      0.9743
    {'fire3-expand1x1_Bias'    }    {'fire3-expand1x1' }           "Bias"               -0.067029     0.30425
    {'fire3-expand3x3_Weights' }    {'fire3-expand3x3' }           "Weights"             -0.61443      0.7741
    {'fire3-expand3x3_Bias'    }    {'fire3-expand3x3' }           "Bias"               -0.053613     0.10329
    {'fire4-squeeze1x1_Weights'}    {'fire4-squeeze1x1'}           "Weights"              -0.7422      1.0877
    {'fire4-squeeze1x1_Bias'   }    {'fire4-squeeze1x1'}           "Bias"                -0.10885     0.13881
      ⋮

Use the MATLAB Support Package for Raspberry Pi Hardware function, raspi, to create a connection to the Raspberry Pi. In the following code, replace:

  • raspiname with the name or address of your Raspberry Pi

  • username with your user name

  • password with your password

% r = raspi('raspiname','username','password')

For example,

r = raspi('gpucoder-raspberrypi-7','pi','matlab')
r = 
  raspi with properties:

         DeviceAddress: 'gpucoder-raspberrypi-7'      
                  Port: 18734                         
             BoardName: 'Raspberry Pi 3 Model B+'     
         AvailableLEDs: {'led0'}                      
  AvailableDigitalPins: [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27]
  AvailableSPIChannels: {}                            
     AvailableI2CBuses: {}                            
      AvailableWebcams: {}                            
           I2CBusSpeed:                               
AvailableCANInterfaces: {}                            

  Supported peripherals

Specify the raspi object as the target for the quantized network.

opts = dlquantizationOptions('Target',r)
opts = 
  dlquantizationOptions with properties:

    MetricFcn: {}
    Bitstream: ''
       Target: [1×1 raspi]

Validate the quantized network with the validate function.

valResults = validate(dq,aug_valData,opts)
### Starting application: 'codegen\lib\validate_predict_int8\pil\validate_predict_int8.elf'
    To terminate execution: clear validate_predict_int8_pil
### Launching application validate_predict_int8.elf...
### Host application produced the following standard output (stdout) and standard error (stderr) messages:
valResults = struct with fields:
       NumSamples: 20
    MetricResults: [1×1 struct]
       Statistics: []

Examine the validation output to see the performance of the quantized network.

valResults.MetricResults.Result
ans=2×2 table
    NetworkImplementation    MetricOutput
    _____________________    ____________

     {'Floating-Point'}          0.95    
     {'Quantized'     }          0.95    

This example shows how to quantize a yolov3ObjectDetector (Computer Vision Toolbox) object using preprocessed calibration and validation data.

First, download a pretrained YOLO v3 object detector.

detector = downloadPretrainedNetwork();

This example uses a small labeled data set in which each image contains one or two labeled instances of a vehicle. Many of these images come from the Caltech Cars 1999 and 2001 data sets, created by Pietro Perona and used with permission.

Unzip the vehicle images and load the vehicle ground truth data.

unzip vehicleDatasetImages.zip
data = load('vehicleDatasetGroundTruth.mat');
vehicleDataset = data.vehicleDataset;

Add the full path to the local vehicle data folder.

vehicleDataset.imageFilename = fullfile(pwd, vehicleDataset.imageFilename);

Create an imageDatastore for loading the images and a boxLabelDatastore (Computer Vision Toolbox) for the ground truth bounding boxes.

imds = imageDatastore(vehicleDataset.imageFilename);
blds = boxLabelDatastore(vehicleDataset(:,2));

Use the combine function to combine the two datastores into a CombinedDatastore.

combinedDS = combine(imds, blds);

Split the data into calibration and validation data.

calData = combinedDS.subset(1:32);
valData = combinedDS.subset(33:64);

Use the preprocess (Computer Vision Toolbox) method of the yolov3ObjectDetector (Computer Vision Toolbox) object with the transform function to prepare the data for calibration and validation.

The transform function returns a TransformedDatastore object.

processedCalData = transform(calData, @(data)preprocess(detector,data));
processedValData = transform(valData, @(data)preprocess(detector,data));

Create the dlquantizer object.

dq = dlquantizer(detector, 'ExecutionEnvironment', 'MATLAB');

Calibrate the network.

calResults = calibrate(dq, processedCalData,'UseGPU','off')
calResults=135×5 table
        Optimized Layer Name         Network Layer Name     Learnables / Activations    MinValue     MaxValue
    ____________________________    ____________________    ________________________    _________    ________

    {'conv1_Weights'           }    {'conv1'           }           "Weights"             -0.92189     0.85687
    {'conv1_Bias'              }    {'conv1'           }           "Bias"               -0.096271     0.26628
    {'fire2-squeeze1x1_Weights'}    {'fire2-squeeze1x1'}           "Weights"              -1.3751      1.2444
    {'fire2-squeeze1x1_Bias'   }    {'fire2-squeeze1x1'}           "Bias"                -0.12068     0.23104
    {'fire2-expand1x1_Weights' }    {'fire2-expand1x1' }           "Weights"             -0.75275     0.91615
    {'fire2-expand1x1_Bias'    }    {'fire2-expand1x1' }           "Bias"               -0.059252     0.14035
    {'fire2-expand3x3_Weights' }    {'fire2-expand3x3' }           "Weights"             -0.75271      0.6774
    {'fire2-expand3x3_Bias'    }    {'fire2-expand3x3' }           "Bias"               -0.062214    0.088242
    {'fire3-squeeze1x1_Weights'}    {'fire3-squeeze1x1'}           "Weights"              -0.7586     0.68772
    {'fire3-squeeze1x1_Bias'   }    {'fire3-squeeze1x1'}           "Bias"                -0.10206     0.31645
    {'fire3-expand1x1_Weights' }    {'fire3-expand1x1' }           "Weights"             -0.71566     0.97678
    {'fire3-expand1x1_Bias'    }    {'fire3-expand1x1' }           "Bias"               -0.069313     0.32881
    {'fire3-expand3x3_Weights' }    {'fire3-expand3x3' }           "Weights"             -0.60079     0.77642
    {'fire3-expand3x3_Bias'    }    {'fire3-expand3x3' }           "Bias"               -0.058045     0.11229
    {'fire4-squeeze1x1_Weights'}    {'fire4-squeeze1x1'}           "Weights"               -0.738      1.0805
    {'fire4-squeeze1x1_Bias'   }    {'fire4-squeeze1x1'}           "Bias"                -0.11189     0.13698
      ⋮

Validate the quantized network with the validate function.

valResults = validate(dq, processedValData)
valResults = struct with fields:
       NumSamples: 32
    MetricResults: [1×1 struct]
       Statistics: []

function detector = downloadPretrainedNetwork()
   % Download the archive that contains the pretrained YOLO v3 detector.
   pretrainedURL = 'https://ssd.mathworks.com/supportfiles/vision/data/yolov3SqueezeNetVehicleExample_21aSPKG.zip';
   websave('yolov3SqueezeNetVehicleExample_21aSPKG.zip', pretrainedURL);

   % Extract the MAT-file into the current folder.
   unzip('yolov3SqueezeNetVehicleExample_21aSPKG.zip');

   % Load the MAT-file and return the detector stored in it.
   pretrained = load("yolov3SqueezeNetVehicleExample_21aSPKG.mat");
   detector = pretrained.detector;
end

Input Arguments

quantObj: Network to quantize, specified as a dlquantizer object.

valData: Data to use for validation of the quantized network, specified as an imageDatastore object, an augmentedImageDatastore object, a pixelLabelImageDatastore (Computer Vision Toolbox) object, a CombinedDatastore object, or a TransformedDatastore object.

You must preprocess the data used for validation of a quantized yolov3ObjectDetector (Computer Vision Toolbox) object using the preprocess (Computer Vision Toolbox) function. For an example of using preprocessed data for validation of a yolov3ObjectDetector, see Quantize YOLO v3 Object Detector.

validate accepts a CombinedDatastore or TransformedDatastore object as input data for validating quantized yolov3ObjectDetector and yolov4ObjectDetector objects. The CombinedDatastore and TransformedDatastore used for validation must contain an imageDatastore or augmentedImageDatastore as the first datastore and a boxLabelDatastore as the second datastore. For more information on valid datastores, see Prepare Data for Quantizing Networks.

quantOpts: Options for quantizing the network, specified as a dlquantizationOptions object.

Output Arguments

valResults: Performance of the quantized network, returned as a struct. The struct contains these fields.

  • NumSamples — The number of sample inputs used to validate the network, specified by valData.

  • MetricResults — Struct containing results of the metric function defined in the dlquantizationOptions object. When more than one metric function is specified in the dlquantizationOptions object, MetricResults is an array of structs.

    MetricResults contains these fields:

    Field             Description
    MetricFunction    Metric function used to determine the performance of the quantized network, specified in the dlquantizationOptions object.
    Result            Table indicating the results of the metric function before and after quantization. The first row in the table, 'Floating-Point', contains information for the original floating-point implementation. The second row, 'Quantized', contains information for the quantized implementation. The output of the metric function is displayed in the MetricOutput column.

  • Statistics — Table indicating the learnable parameter memory used, in bytes, by the original floating-point implementation of the network and the quantized implementation.

    When the ExecutionEnvironment for the dlquantizer object is set to FPGA, the Statistics table instead reports these values for the original floating-point and quantized network implementations:

    • Frames per second

    • Number of convolution threads

    • Number of fully connected threads

    • Lookup table (LUT) resource utilization percentage

    • Block RAM resource utilization percentage

    • DSP resource utilization percentage

    The Statistics table will be empty when the Target property of dlquantizationOptions is set to 'host'.
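
The field descriptions above suggest a defensive way to read the result: loop over MetricResults in case several metric functions were specified, and check whether Statistics is empty before displaying it. The following is a minimal sketch under stated assumptions. The struct is a hand-built stand-in filled with the values from the Raspberry Pi example, so the snippet runs without the toolbox, and the empty MetricFunction value is a placeholder rather than what validate actually returns.

```matlab
% Stand-in for the output of validate (assumption: built by hand here).
resultTable = table({'Floating-Point'; 'Quantized'}, [0.95; 0.95], ...
    'VariableNames', {'NetworkImplementation', 'MetricOutput'});
valResults = struct('NumSamples', 20, ...
    'MetricResults', struct('MetricFunction', [], 'Result', resultTable), ...
    'Statistics', []);

fprintf('Validated on %d samples.\n', valResults.NumSamples);

% MetricResults can be a struct array, so do not assume a single element.
for k = 1:numel(valResults.MetricResults)
    r = valResults.MetricResults(k).Result;
    isFloat = strcmp(r.NetworkImplementation, 'Floating-Point');
    isQuant = strcmp(r.NetworkImplementation, 'Quantized');
    % Difference in metric output between the quantized and original network.
    change = r.MetricOutput(isQuant) - r.MetricOutput(isFloat);
    fprintf('Metric %d: change after quantization = %+.4f\n', k, change);
end

% Statistics can be empty, so guard before displaying it.
if isempty(valResults.Statistics)
    disp('No Statistics table was returned.');
else
    disp(valResults.Statistics);
end
```

Selecting rows by the NetworkImplementation label rather than by position keeps the code readable and does not depend on row order.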

Limitations

  • Validation on target hardware for CPU, FPGA, and GPU execution environments is not supported in MATLAB® Online™. For FPGA and GPU execution environments, validation can be performed through emulation on the MATLAB Online host. GPU validation can also be performed if GPU support has been added to your MATLAB Online Server™ cluster. For more information on GPU support for MATLAB Online, see Configure GPU Support in MATLAB Online Server (MATLAB Online Server).

Algorithms

The validate function determines the default metric function to use for the validation based on the type of network that is being quantized.

Type of Network               Metric Function
Classification                Top-1 Accuracy: accuracy of the network
Object Detection              Average Precision: average precision over all detection results. See evaluateDetectionPrecision (Computer Vision Toolbox).
Regression                    MSE: mean squared error of the network
Semantic Segmentation         evaluateSemanticSegmentation (Computer Vision Toolbox): evaluate semantic segmentation data set against ground truth
Single Shot Detector (SSD)    WeightedIOU: average IoU of each class, weighted by the number of pixels in that class
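
To make the first and third metrics concrete, here is a small worked sketch of how top-1 accuracy and mean squared error are commonly computed. It illustrates the definitions only and is not the toolbox's internal implementation; the scores, labels, and variable names are made-up toy values.

```matlab
% Toy classification data (hypothetical): one row of class scores per
% observation, three classes, plus the true class index for each row.
scores = [0.7 0.2 0.1;
          0.1 0.8 0.1;
          0.3 0.3 0.4;
          0.6 0.3 0.1];
trueIdx = [1; 2; 3; 2];

% Top-1 accuracy: fraction of observations whose highest-scoring class
% equals the true class. Here rows 1 to 3 are correct and row 4 is not.
[~, predIdx] = max(scores, [], 2);
top1 = mean(predIdx == trueIdx);        % 3 of 4 correct, so 0.75

% Toy regression data (hypothetical): targets and predictions.
yTrue = [1.0; 2.0; 3.0];
yPred = [1.5; 2.0; 2.0];

% Mean squared error: average of the squared differences.
mseValue = mean((yPred - yTrue).^2);    % (0.25 + 0 + 1) / 3

fprintf('Top-1 accuracy: %.2f\n', top1);
fprintf('MSE: %.4f\n', mseValue);
```

When the default metric is not suitable, the MetricFcn property of dlquantizationOptions, shown as an empty cell array in the options display earlier on this page, is where a user-defined metric would be supplied.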

Version History

Introduced in R2020a