Neural network training fails when target values are small. Mapminmax issue?

When I try to train a network with very small target values, training stops at epoch 0 (i.e., it never really begins) because the gradient is already below the minimum. I understand that very small targets could imply a very small gradient, but the mapminmax function is active, and it should map the targets into [-1,1], avoiding exactly this kind of problem. So what's going on?
Here's some code:
First I define a really small sine wave:
in = [0:0.1:10];
out = sin(in)/1e10;
then I create and configure a network
net = fitnet([15]);
net = configure(net,in,out);
The mapminmax function seems to be active and properly configured:
net.outputs{1,2}.processSettings{1,2}
ans =
name: 'mapminmax'
xrows: 1
xmax: 9.9957e-11
xmin: -9.9992e-11
xrange: 1.9995e-10
yrows: 1
ymax: 1
ymin: -1
yrange: 2
no_change: 0
gain: 1.0003e+10
xoffset: -9.9992e-11
but the training fails (it stops at epoch 0):
[net,tr] = train(net,in,out);
tr.stop
ans =
Minimum gradient reached.
tr.num_epochs
ans =
0
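To see what trips the stop condition, you can compare the initial gradient against the minimum-gradient threshold; a minimal sketch (assuming the R2014a-era Neural Network Toolbox API from the question):

```matlab
% Reproduce the failure and inspect why training stops at epoch 0.
in  = 0:0.1:10;
out = sin(in)/1e10;            % targets on the order of 1e-10

net = fitnet(15);
net = configure(net, in, out);
[net, tr] = train(net, in, out);

% The performance (MSE) and its gradient are computed on the
% ORIGINAL target scale, so both come out astronomically small:
fprintf('initial gradient:   %g\n', tr.gradient(1));
fprintf('min_grad threshold: %g\n', net.trainParam.min_grad);  % 1e-7 by default
% gradient < min_grad  =>  "Minimum gradient reached." at epoch 0
```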
The learning completely failed, as the plot of the network output (omitted here) showed.
But if I manually use mapminmax, everything works well:
net = configure(net,in,mapminmax(out,-1,1));
[net,tr] = train(net,in,mapminmax(out,-1,1));
tr.stop
ans =
Minimum gradient reached.
tr.num_epochs
ans =
377
And the network actually learned the sine function (again, output plot omitted here).
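For reference, when normalizing targets by hand like this, you also need mapminmax's settings struct to map the network output back to the original target scale; a minimal sketch:

```matlab
% Manual normalization workaround: keep the mapminmax settings struct
% so the network output can be mapped back to the original scale.
in  = 0:0.1:10;
out = sin(in)/1e10;

[outN, ps] = mapminmax(out, -1, 1);   % ps stores the gain/offset

net = fitnet(15);
net = configure(net, in, outN);
[net, tr] = train(net, in, outN);

yN = net(in);                          % output on the [-1,1] scale
y  = mapminmax('reverse', yN, ps);     % back to the 1e-10 scale
```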
Any ideas?

 Accepted Answer

You have to change the defaults for BOTH the MSE goal AND the Minimum gradient. They are on the scale of the UNNORMALIZED data. For simple problems I tend to use the average BIASED target variance estimate to get
MSEgoal = mean(var(target',1))/100
MinGrad = MSEgoal/100
On more serious problems I consider BOTH the UNBIASED mean target variance estimate for O-dimensional targets AND the loss of degrees of freedom, because the same data is used BOTH to estimate the Nw unknown weights AND to estimate the performance:
MSEgoal = 0.01*max(0,Ndof)*mean(var(target',0))/Ntrneq
where
Ntrneq = Ntrn*O % number of training equations
Ndof = Ntrneq - Nw % number of degrees of freedom
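For this thread's example, a sketch of how those formulas might be applied (assuming `in`/`out` and `net = fitnet(15)` from the question; `numWeightElements` supplies Nw):

```matlab
[O, Ntrn] = size(out);              % O outputs, Ntrn training samples
Nw     = net.numWeightElements;     % total number of weights and biases
Ntrneq = Ntrn*O;                    % number of training equations
Ndof   = Ntrneq - Nw;               % degrees of freedom

MSEgoal = 0.01*max(0,Ndof)*mean(var(out',0))/Ntrneq;
MinGrad = MSEgoal/100;

net.trainParam.goal     = MSEgoal;  % scale the MSE goal to the data
net.trainParam.min_grad = MinGrad;  % and the minimum-gradient stop
[net, tr] = train(net, in, out);
```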
For details search both the NEWSGROUP and ANSWERS using
greg MSEgoal MinGrad
Hope this helps
Thank you for formally accepting my answer
Greg

4 Comments

Hi Greg, thank you so much for your reply.
I clearly understand your point. I have already tried tuning min_grad with some good results, but what I'm trying to understand is whether this is the major bug it seems to be or not.
When you say "mse goal and gradient goal are on the scale of the unnormalized data," do you mean that mapminmax parameters and settings affect only the simulation of the net and not the training? That would look to me like a really major issue in the Neural Network Toolbox.
According to the documentation, the implementation of mapminmax should fully automate the pre/post-processing stages and make the "magnitude" of inputs and targets completely irrelevant. Since the network always maps inputs and targets into a proper interval ([-1,1] if the hyperbolic tangent is used as activation function), it should not be able to distinguish, for example, these two problems:
1.
inputs = [0:0.1:10];
targets = sin(inputs);
2.
inputs = [0:0.1:10];
targets = sin(inputs)/1e6;
Am I right, or am I missing something?
I made 3 experiments. If necessary I can post the code.
1. I trained two nets on the same data with identical initialization and no early stopping. The only difference between the nets is that the first has mapminmax active and configured on both input and output layers, while the second has no pre/post-processing functions. The training resulted in different weights and biases. This should prove that mapminmax affects the training in some way.
2. Simulating a network with mapminmax (parameters retrieved from the net itself) by manually computing the correct linear combination of processed inputs with weights and biases yields exactly the output produced by calling the "sim" function. So mapminmax affects the simulation of the net in the correct (and expected) way.
3. I defined a parametric problem like this
eps = 1;
in = 0:0.1:10;
out = sin(in)*eps;
Then I trained two networks. The first one (Net1) had mapminmax active and was trained in the regular way:
net = configure(net,in,out);
net = train(net,in,out);
The second one (Net2) had no active processing functions, and I manually pre-processed inputs and targets in both training and simulation.
Then I computed for both the quantity
C = (sqrt(MSE)/mean(targets))*100
which represents the error of the net as a percentage of the mean of the targets.
I repeated the experiment gradually lowering eps. It turned out that:
a. for eps = 1, C(Net1) and C(Net2) are really close, substantially the same (over many runs);
b. as eps decreases, C(Net2) remains constant while C(Net1) skyrockets. For instance:
  • eps = 1 => C(Net1) = C(Net2) = 0.01%
  • eps = 0.001 => C(Net2) = 0.01% and C(Net1) = 36%
This should prove that the way mapminmax affects the training is not correct. Manually preprocessing inputs and targets makes the net insensitive to the magnitude of the numbers, as it should be.
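Since I offered to post the code, here is a condensed sketch of experiment 3 (hypothetical reconstruction; the actual runs averaged many initializations, and C follows my definition above with mean(targets) in the denominator):

```matlab
% Experiment 3: compare relative error C for a net trained with
% mapminmax active (Net1) vs. manual pre/post-processing (Net2).
for eps = [1 0.1 0.01 0.001]
    in  = 0:0.1:10;
    out = sin(in)*eps;

    % Net1: default processing (mapminmax active on input and output)
    net1 = configure(fitnet(15), in, out);
    net1 = train(net1, in, out);
    C1 = 100*sqrt(mean((net1(in) - out).^2))/mean(out);

    % Net2: no processing functions; targets normalized by hand
    net2 = fitnet(15);
    net2.inputs{1}.processFcns  = {};
    net2.outputs{2}.processFcns = {};
    [outN, ps] = mapminmax(out, -1, 1);
    net2 = configure(net2, in, outN);
    net2 = train(net2, in, outN);
    y2 = mapminmax('reverse', net2(in), ps);   % back to original scale
    C2 = 100*sqrt(mean((y2 - out).^2))/mean(out);

    fprintf('eps = %g: C(Net1) = %.2f%%, C(Net2) = %.2f%%\n', eps, C1, C2);
end
```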
Now, granted that tuning training parameters such as the gradient and MSE goals is certainly useful, is this mapminmax issue the major bug it looks like to me, or am I missing something?
You are right.
THIS IS A BUG.
I alerted MATLAB before, and whatever fixed value they had for MSEgoal was changed to 0. I don't recall whether they changed MinGrad or not.
Regardless, the use of my own MSEgoal and MinGrad was prompted because of the dissatisfaction with the MATLAB defaults.
By the way, if you are using NARNET or NARXNET the values should probably be scaled with 0.005 or 0.001 instead of 0.01 because closing the loop requires that openloop performance be exceptionally good.
Greg
Thank you very much for your help. By the way, in R2014a the default MSE goal and min_grad are 0 and 1e-7. I'll try your formula for these parameters. Thank you again!


More Answers (0)
Asked on 28 Aug 2016. Last commented by bah on 24 Oct 2022.
