# dlaccelerate

Accelerate deep learning function for custom training loops

## Description

Use `dlaccelerate` to speed up deep learning function evaluation for custom training loops.

The returned `AcceleratedFunction` object caches the traces of calls to the underlying function and reuses the cached result when the same input pattern reoccurs.

Try using `dlaccelerate` for function calls that:

- are long-running
- have `dlarray` objects, structures of `dlarray` objects, or `dlnetwork` objects as inputs
- do not have side effects like writing to files or displaying output

Invoke the accelerated function as you would invoke the underlying function. Note that the accelerated function is not a function handle.
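As a minimal sketch of this calling convention, the script below accelerates a small loss function and evaluates it through `dlfeval` exactly as the original function would be evaluated. The function `lossAndGrad` and the values of `w` are hypothetical and serve only as illustration.

```
% Accelerate a hypothetical loss function (defined at the end of the script).
accfun = dlaccelerate(@lossAndGrad);

% Call the accelerated function through dlfeval, as you would call lossAndGrad.
w = dlarray(single([1; 2; 3]));
[loss,grad] = dlfeval(accfun,w);

function [loss,grad] = lossAndGrad(w)
% Sum of squares and its gradient with respect to w.
loss = sum(w.^2,"all");
grad = dlgradient(loss,w);
end
```

Because `accfun` is an `AcceleratedFunction` object rather than a function handle, code that explicitly tests for a function handle treats it differently from `@lossAndGrad`.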

**Note**

When using the `dlfeval` function, the software automatically accelerates the `forward` and `predict` functions for `dlnetwork` input. If you accelerate a deep learning function where the majority of the computation takes place in calls to the `forward` or `predict` functions for `dlnetwork` input, then you might not see an improvement in training time.

For more information, see Deep Learning Function Acceleration for Custom Training Loops.

`accfun = dlaccelerate(fun)` creates an `AcceleratedFunction` object that retains the underlying traces of the specified function handle `fun`.

**Caution**

An `AcceleratedFunction` object is not aware of updates to the underlying function. If you modify the function associated with the accelerated function, then clear the cache using the `clearCache` object function or, alternatively, use the command `clear functions`.
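A minimal sketch of that workflow, assuming a hypothetical function `scaleInput` that you edit between calls:

```
accfun = dlaccelerate(@scaleInput);
x = dlarray(ones(2,2,"single"));
y = accfun(x);          % caches a trace of the current scaleInput

% ... suppose the body of scaleInput is edited at this point (hypothetical) ...

clearCache(accfun)      % discard traces recorded from the old definition
occ = accfun.Occupancy; % the cache is empty again

function y = scaleInput(x)
% Hypothetical function whose definition might change during development.
y = 2*x;
end
```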

## Examples

### Accelerate Model Gradients Function

Load the `dlnetwork` object and class names from the MAT file `dlnetDigits.mat`.

```
s = load("dlnetDigits.mat");
net = s.net;
classNames = s.classNames;
```

Accelerate the model loss function `modelLoss` listed at the end of the example.

```
fun = @modelLoss;
accfun = dlaccelerate(fun);
```

Clear any previously cached traces of the accelerated function using the `clearCache` function.

```
clearCache(accfun)
```

View the properties of the accelerated function. Because the cache is empty, the `Occupancy` property is 0.

```
accfun
```

```
accfun = 
  AcceleratedFunction with properties:

          Function: @modelLoss
           Enabled: 1
         CacheSize: 50
           HitRate: 0
         Occupancy: 0
         CheckMode: 'none'
    CheckTolerance: 1.0000e-04
```

The returned `AcceleratedFunction` object stores the traces of underlying function calls and reuses the cached result when the same input pattern reoccurs. To use the accelerated function in a custom training loop, replace calls to the model loss function with calls to the accelerated function. You can invoke the accelerated function as you would invoke the underlying function. Note that the accelerated function is not a function handle.

Evaluate the accelerated model loss function with random data using the `dlfeval` function.

```
X = rand(28,28,1,128,"single");
X = dlarray(X,"SSCB");
T = categorical(classNames(randi(10,[128 1])));
T = onehotencode(T,2)';
T = dlarray(T,"CB");

[loss,gradients,state] = dlfeval(accfun,net,X,T);
```

View the `Occupancy` property of the accelerated function. Because the function has been evaluated, the cache is nonempty.

```
accfun.Occupancy
```

```
ans = 2
```

**Model Loss Function**

The `modelLoss` function takes a `dlnetwork` object `net` and a mini-batch of input data `X` with corresponding target labels `T`, and returns the loss, the gradients of the loss with respect to the learnable parameters in `net`, and the network state. To compute the gradients, use the `dlgradient` function.

```
function [loss,gradients,state] = modelLoss(net,X,T)

[Y,state] = forward(net,X);
loss = crossentropy(Y,T);
gradients = dlgradient(loss,net.Learnables);

end
```

### Clear Cache of Accelerated Function

Load the `dlnetwork` object and class names from the MAT file `dlnetDigits.mat`.

```
s = load("dlnetDigits.mat");
net = s.net;
classNames = s.classNames;
```

Accelerate the model loss function `modelLoss` listed at the end of the example.

```
fun = @modelLoss;
accfun = dlaccelerate(fun);
```

Clear any previously cached traces of the accelerated function using the `clearCache` function.

```
clearCache(accfun)
```

View the properties of the accelerated function. Because the cache is empty, the `Occupancy` property is 0.

```
accfun
```

```
accfun = 
  AcceleratedFunction with properties:

          Function: @modelLoss
           Enabled: 1
         CacheSize: 50
           HitRate: 0
         Occupancy: 0
         CheckMode: 'none'
    CheckTolerance: 1.0000e-04
```

The returned `AcceleratedFunction` object stores the traces of underlying function calls and reuses the cached result when the same input pattern reoccurs. To use the accelerated function in a custom training loop, replace calls to the model loss function with calls to the accelerated function. You can invoke the accelerated function as you would invoke the underlying function. Note that the accelerated function is not a function handle.

Evaluate the accelerated model loss function with random data using the `dlfeval` function.

```
X = rand(28,28,1,128,"single");
X = dlarray(X,"SSCB");
T = categorical(classNames(randi(10,[128 1])));
T = onehotencode(T,2)';
T = dlarray(T,"CB");

[loss,gradients,state] = dlfeval(accfun,net,X,T);
```

View the `Occupancy` property of the accelerated function. Because the function has been evaluated, the cache is nonempty.

```
accfun.Occupancy
```

```
ans = 2
```

Clear the cache using the `clearCache` function.

```
clearCache(accfun)
```

View the `Occupancy` property of the accelerated function. Because the cache has been cleared, the cache is empty.

```
accfun.Occupancy
```

```
ans = 0
```

**Model Loss Function**

The `modelLoss` function takes a `dlnetwork` object `net` and a mini-batch of input data `X` with corresponding target labels `T`, and returns the loss, the gradients of the loss with respect to the learnable parameters in `net`, and the network state. To compute the gradients, use the `dlgradient` function.

```
function [loss,gradients,state] = modelLoss(net,X,T)

[Y,state] = forward(net,X);
loss = crossentropy(Y,T);
gradients = dlgradient(loss,net.Learnables);

end
```

### Check Accelerated Deep Learning Function Outputs

This example shows how to check that the outputs of accelerated functions match the outputs of the underlying function.

In some cases, the outputs of accelerated functions differ from the outputs of the underlying function. For example, you must take care when accelerating functions that use random number generation, such as a function that generates random noise to add to the network input. When caching the trace of a function that generates random numbers that are not `dlarray` objects, the accelerated function caches the resulting random numbers in the trace. When reusing the trace, the accelerated function uses the cached random values. The accelerated function does not generate new random values.

To check that the outputs of the accelerated function match the outputs of the underlying function, use the `CheckMode` property of the accelerated function. When the `CheckMode` property of the accelerated function is `'tolerance'` and the outputs differ by more than a specified tolerance, the accelerated function throws a warning.

Accelerate the function `myUnsupportedFun`, listed at the end of the example, using the `dlaccelerate` function. The function `myUnsupportedFun` generates random noise and adds it to the input. This function does not support acceleration because the function generates random numbers that are not `dlarray` objects.

```
accfun = dlaccelerate(@myUnsupportedFun)
```

```
accfun = 
  AcceleratedFunction with properties:

          Function: @myUnsupportedFun
           Enabled: 1
         CacheSize: 50
           HitRate: 0
         Occupancy: 0
         CheckMode: 'none'
    CheckTolerance: 1.0000e-04
```

Clear any previously cached traces using the `clearCache` function.

```
clearCache(accfun)
```

To check that the outputs of reused cached traces match the outputs of the underlying function, set the `CheckMode` property to `'tolerance'`.

```
accfun.CheckMode = 'tolerance'
```

```
accfun = 
  AcceleratedFunction with properties:

          Function: @myUnsupportedFun
           Enabled: 1
         CacheSize: 50
           HitRate: 0
         Occupancy: 0
         CheckMode: 'tolerance'
    CheckTolerance: 1.0000e-04
```

Evaluate the accelerated function with an array of ones as input, specified as a `dlarray` object.

```
dlX = dlarray(ones(3,3));
dlY = accfun(dlX)
```

```
dlY = 
  3×3 dlarray

    1.8147    1.9134    1.2785
    1.9058    1.6324    1.5469
    1.1270    1.0975    1.9575
```

Evaluate the accelerated function again with the same input. Because the accelerated function reuses the cached random noise values instead of generating new random values, the outputs of the reused trace differ from the outputs of the underlying function. When the `CheckMode` property of the accelerated function is `'tolerance'` and the outputs differ, the accelerated function throws a warning.

```
dlY = accfun(dlX)
```

```
Warning: Accelerated outputs differ from underlying function outputs.

dlY = 
  3×3 dlarray

    1.8147    1.9134    1.2785
    1.9058    1.6324    1.5469
    1.1270    1.0975    1.9575
```

Random number generation using the `'like'` option of the `rand` function with a `dlarray` object supports acceleration. To use random number generation in an accelerated function, ensure that the function uses the `rand` function with the `'like'` option set to a traced `dlarray` object (a `dlarray` object that depends on an input `dlarray` object).

Accelerate the function `mySupportedFun`, listed at the end of the example. The function `mySupportedFun` adds noise to the input by generating noise using the `'like'` option with a traced `dlarray` object.

```
accfun2 = dlaccelerate(@mySupportedFun);
```

Clear any previously cached traces using the `clearCache` function.

```
clearCache(accfun2)
```

To check that the outputs of reused cached traces match the outputs of the underlying function, set the `CheckMode` property to `'tolerance'`.

```
accfun2.CheckMode = 'tolerance';
```

Evaluate the accelerated function twice with the same input as before. Because the outputs of the reused cache match the outputs of the underlying function, the accelerated function does not throw a warning.

```
dlY = accfun2(dlX)
```

```
dlY = 
  3×3 dlarray

    1.7922    1.0357    1.6787
    1.9595    1.8491    1.7577
    1.6557    1.9340    1.7431
```

```
dlY = accfun2(dlX)
```

```
dlY = 
  3×3 dlarray

    1.3922    1.7060    1.0462
    1.6555    1.0318    1.0971
    1.1712    1.2769    1.8235
```

Checking that the outputs match requires extra processing and increases the time required for function evaluation. After checking the outputs, set the `CheckMode` property to `'none'`.

```
accfun.CheckMode = 'none';
accfun2.CheckMode = 'none';
```

**Example Functions**

The function `myUnsupportedFun` generates random noise and adds it to the input. This function does not support acceleration because the function generates random numbers that are not `dlarray` objects.

```
function out = myUnsupportedFun(dlX)

sz = size(dlX);
noise = rand(sz);
out = dlX + noise;

end
```

The function `mySupportedFun` adds noise to the input by generating noise using the `'like'` option with a traced `dlarray` object.

```
function out = mySupportedFun(dlX)

sz = size(dlX);
noise = rand(sz,'like',dlX);
out = dlX + noise;

end
```

## Input Arguments

**`fun`** (function handle): Deep learning function

Deep learning function to accelerate, specified as a function handle.

To learn more about developing deep learning functions for acceleration, see Deep Learning Function Acceleration for Custom Training Loops.

**Example:** `@modelLoss`

**Data Types:** `function_handle`

## Output Arguments

**`accfun`** (`AcceleratedFunction` object): Accelerated deep learning function

Accelerated deep learning function, returned as an `AcceleratedFunction` object.

## More About

### Acceleration Considerations

Because of the nature of caching traces, not all functions support acceleration.

The caching process can cache values that you might expect to change or that depend on external factors. You must take care when accelerating functions that:

- have inputs with random or frequently changing values
- have outputs with frequently changing values
- generate random numbers
- use `if` statements and `while` loops with conditions that depend on the values of `dlarray` objects
- have inputs that are handles or that depend on handles
- read data from external sources (for example, by using a datastore or a `minibatchqueue` object)

Because the caching process requires extra computation, acceleration can lead to longer running code in some cases. This scenario can happen when the software spends time creating new caches that do not get reused often. For example, when you pass multiple mini-batches of different sequence lengths to the function, the software triggers a new trace for each unique sequence length.
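The following sketch illustrates how inputs of different sizes fill the cache. The function `sumSquares` and the chosen sizes are hypothetical, and the exact occupancy values depend on the `CacheSize` property, so no specific numbers are claimed here.

```
accfun = dlaccelerate(@sumSquares);
clearCache(accfun)

% Each distinct input size is expected to trigger a new trace,
% so the occupancy is expected to grow with every iteration.
for len = [10 20 30]
    x = dlarray(rand(8,len,"single"));
    y = accfun(x);
    fprintf("length %d: occupancy %g\n",len,accfun.Occupancy);
end

function y = sumSquares(x)
% Hypothetical deep learning function used only for illustration.
y = sum(x.^2,"all");
end
```

If the occupancy keeps rising while the `HitRate` property stays low, the traces are rarely reused and acceleration is unlikely to pay off for that function.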

Accelerated functions can do the following only when calculating a new trace:

- modify the global state, such as the random number stream or global variables
- use file input or output
- display data using graphics or the command line display
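As a rough illustration of the display case, the sketch below uses a hypothetical function `doubleWithMessage` that prints a message. Under the behavior described above, the message is expected to appear when the trace is first computed and not when the cached trace is reused.

```
accfun = dlaccelerate(@doubleWithMessage);
clearCache(accfun)

x = dlarray(ones(2,2,"single"));
y1 = accfun(x);   % new trace: the message is expected to print
y2 = accfun(x);   % reused trace: the message is not expected to print again

function y = doubleWithMessage(x)
% Side effect that runs only while the function body is actually executed.
fprintf("Evaluating doubleWithMessage\n");
y = 2*x;
end
```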

When using accelerated functions in parallel, such as when using a `parfor` loop, each worker maintains its own cache. The cache is not transferred to the host.

Functions and custom layers used in accelerated functions must also support acceleration.

For more information, see Deep Learning Function Acceleration for Custom Training Loops.

**`dlode45` Does Not Support Acceleration When `GradientMode` Is `"direct"`**

The `dlaccelerate` function does not support accelerating the `dlode45` function when the `GradientMode` option is `"direct"`. To accelerate the code that calls the `dlode45` function, set the `GradientMode` option to `"adjoint"` or accelerate parts of your code that do not call the `dlode45` function with the `GradientMode` option set to `"direct"`.
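A minimal sketch of the first option, assuming a hypothetical linear ODE with parameters `theta.A`; the functions `odeModel` and `odefun`, the time span, and all numeric values are illustrative only.

```
% Hypothetical parameters and initial state for a small linear ODE.
theta.A = dlarray(single([-0.5 0; 0 -0.2]));
Y0 = dlarray(single([1; 1]));

accfun = dlaccelerate(@odeModel);
Y = accfun(Y0,theta);

function Y = odeModel(Y0,theta)
tspan = [0 1];
% Use "adjoint" so that the call to dlode45 can be part of accelerated code.
Y = dlode45(@odefun,tspan,Y0,theta,GradientMode="adjoint");
end

function dy = odefun(~,y,theta)
% Right-hand side of the hypothetical ODE: dy/dt = A*y.
dy = theta.A*y;
end
```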

## Version History

**Introduced in R2021a**
