Why does layerNormalizationLayer in Deep Learning Toolbox include T dimension into the batch?

John Smith on 13 Mar 2023

Answered: John Smith on 24 Mar 2023

Hello,

While implementing a Vision Transformer (ViT) in MATLAB, I found that layerNormalizationLayer includes the T dimension in the statistics it calculates for each sample in the batch. This is problematic when implementing a transformer, because tokens correspond to the T dimension and reference implementations calculate the statistics separately for each token.
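To make the difference concrete, here is a minimal sketch in base MATLAB that computes both kinds of statistics by hand. The sizes, the channel-by-batch-by-time layout, and the value of epsilon are illustrative assumptions, not taken from the toolbox internals.

```matlab
% Illustrative sizes: C channels (embedding), B samples, T tokens.
C = 4; B = 2; T = 3;
rng(0);
X = randn(C, B, T);            % assumed layout: channel x batch x time
epsilon = 1e-5;                % assumed small constant for numerical stability

% (a) One mean/variance per sample, pooled over C and T
%     (the behaviour described in the question).
muSample  = mean(X, [1 3]);            % size 1 x B
varSample = var(X, 1, [1 3]);          % population variance
Ysample   = (X - muSample) ./ sqrt(varSample + epsilon);

% (b) One mean/variance per token, computed over C only
%     (what reference transformer implementations do).
muToken  = mean(X, 1);                 % size 1 x B x T
varToken = var(X, 1, 1);
Ytoken   = (X - muToken) ./ sqrt(varToken + epsilon);

% Each token of Ytoken has zero mean across channels; Ysample generally does not.
disp(max(abs(mean(Ytoken, 1)), [], 'all'));
disp(max(abs(mean(Ysample, 1)), [], 'all'));
```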

Thx

Accepted Answer

John Smith on 24 Mar 2023

It seems MathWorks has listened and changed the behavior of layerNormalizationLayer in R2023a:

https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.layernormalizationlayer.html

Starting in R2023a, by default, the layer normalizes sequence data over the channel and spatial dimensions. In previous versions, the software normalizes over all dimensions except for the batch dimension (the spatial, time, and channel dimensions). Normalization over the channel and spatial dimensions is usually better suited for this type of data. To reproduce the previous behavior, set OperationDimension to "batch-excluded".
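A short sketch of how the two behaviours can be selected, assuming R2023a or later with Deep Learning Toolbox. The sizes and variable names are illustrative, and the use of "channel-only" for per-token statistics on channel-by-batch-by-time data is my reading of the documentation rather than something stated in this thread.

```matlab
% Requires Deep Learning Toolbox, R2023a or later (OperationDimension option).
C = 4; B = 2; T = 3;                       % illustrative sizes
X = dlarray(randn(C, B, T), "CBT");        % channel x batch x time
offset = zeros(C, 1);                      % identity affine parameters
scale  = ones(C, 1);

% Per-token statistics: normalize over the channel dimension only.
Ytoken = layernorm(X, offset, scale, OperationDimension="channel-only");

% Pre-R2023a behaviour: normalize over everything except the batch dimension.
Yold = layernorm(X, offset, scale, OperationDimension="batch-excluded");

% The same choice expressed as a layer for use in a network.
layer = layerNormalizationLayer(OperationDimension="channel-only");

% Mean across channels for each (batch, time) pair; close to zero for Ytoken.
muToken = mean(extractdata(Ytoken), 1);
disp(max(abs(muToken(:))));
```

With "batch-excluded", only the mean pooled over channels and time is zero for each sample, so individual tokens can still have non-zero means.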

More Answers (1)

Matt J on 13 Mar 2023

Perhaps you can fold your T dimension into the C dimension and use a groupNormalizationLayer instead, with the groups defined so that different T belong to different groups.
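One way this workaround might look with the functional groupnorm API is sketched below. It assumes that groupnorm forms groups from contiguous channels, and the sizes and the reshape scheme are illustrative. Note that the learnable offset and scale then have C*T elements (one per folded channel) instead of C, so this is not parameter-equivalent to a per-token layer normalization.

```matlab
% Requires Deep Learning Toolbox (groupnorm).
C = 4; B = 2; T = 3;                       % illustrative sizes
X = randn(C, B, T);                        % channel x batch x time

% Fold T into C: reorder to C x T x B, then flatten to (C*T) x B so that
% the C channels belonging to one token are contiguous.
Xfold = reshape(permute(X, [1 3 2]), C*T, B);
dlX = dlarray(Xfold, "CB");

% T groups of C channels each: one group per token (assumes contiguous grouping).
offset = zeros(C*T, 1);
scale  = ones(C*T, 1);
Y = groupnorm(dlX, T, offset, scale);

% Unfold back to channel x batch x time.
Yunfold = permute(reshape(extractdata(Y), C, T, B), [1 3 2]);

% Each token should now have approximately zero mean across its channels.
disp(max(abs(mean(Yunfold, 1)), [], 'all'));
```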

7 Comments
John Smith on 15 Mar 2023

Perhaps lamenting would cause someone from MathWorks to take notice and add the capability to the code base. Sigh ...

Matt J on 15 Mar 2023

That happens sometimes, but usually you have to submit a formal enhancement request.

Products

Deep Learning Toolbox

Release

R2022b
