Applying z-score before scaling to [0,1]?

Hello,
I'm currently using a neural network to classify a dataset. Of course, before classification either the data points or the features should be normalized. The neural network toolbox I'm using requires all values to be in the range [0,1].
Does it make sense to first apply the z-score and then scale to the range [0,1]?
Second, should I normalize along the feature vectors or along the data points (whether applying the z-score or scaling to the range [0,1])?

Accepted Answer

Greg Heath
Greg Heath on 18 Jul 2015
Edited: Walter Roberson on 18 Jul 2015
It is well known (e.g., see the comp.ai.neural-nets FAQ) that the most efficient MLP nets are those which have
  • 1. Bipolar sigmoid hidden node transfer functions, e.g., TANSIG ( == TANH ), NOT LOGSIG!
  • 2. Bipolar scaled input variables. For example:
      • a. Normalized to [-1,1] via MAPMINMAX (MATLAB's default)
      • b. Standardized to zero-mean/unit-variance via MAPSTD or ZSCORE
  • 3. However, the initial weight assignments should ensure that initial hidden node outputs are in the linear region of the sigmoid.
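As a rough illustration of the two bipolar input scalings in item 2, here is a minimal sketch in base MATLAB. The data matrix X is made up for the example, with features in rows and samples in columns; the toolbox functions named in the comments should give comparable results but are not called here.

```matlab
% Hypothetical data: 2 features (rows) by 5 samples (columns).
X = [1 2 3 4 5; 10 30 20 50 40];

% (a) Map each row to [-1,1]; mapminmax(X) does this by default.
xmin = min(X, [], 2);
xmax = max(X, [], 2);
Xn = 2 * bsxfun(@rdivide, bsxfun(@minus, X, xmin), xmax - xmin) - 1;

% (b) Standardize each row to zero mean and unit variance;
%     compare mapstd(X) or zscore(X, 0, 2).
mu = mean(X, 2);
sigma = std(X, 0, 2);
Xs = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);
```

Both versions are centered around zero, which is the point of item 2. A constant row would make the denominators zero, so this sketch assumes every feature actually varies.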
Before training I always use the functions MINMAX (NOT mapminmax), ZSCORE and PLOT to eliminate or modify outliers and incorrect data.
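One way such a screening pass might look is sketched below. The sample values and the cutoff of 3 standard deviations are assumptions made for illustration, not part of the answer above.

```matlab
% Hypothetical data: one feature (row) with a suspicious last value.
x = [2.1 1.9 2.0 2.2 1.8 2.0 2.1 1.9 2.0 2.1 1.9 25.0];

% Range check; the toolbox function minmax(x) returns [min max] per row.
range_x = [min(x) max(x)];

% Standardize, then flag samples far from the mean.
z = (x - mean(x)) / std(x);
threshold = 3;                      % assumed cutoff, chosen for this example
suspect = find(abs(z) > threshold); % indices of candidate outliers

% Visual check of the standardized values.
plot(z, 'o');
```

Whether a flagged value is removed, clipped, or corrected depends on the data; the sketch only locates candidates.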
Even though I prefer standardization, I accept MATLAB's [-1,1] default, which I assume is taken into account by MATLAB's default weight initialization. (I guess I should check this ... I've been burned by other logical assumptions).
BOTTOM LINE: Always use centered inputs and tansig hidden layer functions for MLPs. [If you don't, people may point at you and laugh (:>( ].
Hope this helps.
Thank you for formally accepting my answer
Greg
P.S. If you are going to use MATLAB's attempt at Radial Basis Function nets, you should first look up some of my RBF posts.
  1 Comment
Greg Heath
Greg Heath on 19 Jul 2015
Currently puzzled by MATLAB's approach to choosing efficient random initial weights. When I figure it out I will start a new thread that references my 1 Mar 2004 comp.ai.neural-nets thread "Nonsaturating Initial Weights".


More Answers (1)

Walter Roberson
Walter Roberson on 17 Jul 2015
Normalize per feature, not per sample.
I do not know what you mean by "apply z-score". If you mean normalize to mean 0 and standard deviation 1, then whether you need to do that before you scale to [0,1] depends upon how you intend to scale to [0,1]. If your approach is (x - min(x)) ./ (max(x) - min(x)), then there is no need to normalize to mean 0 and standard deviation 1 first.
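The reason is that min-max scaling is unaffected by a prior shift and positive rescaling: with z = (x - mean(x)) / std(x), the mean and standard deviation cancel in (z - min(z)) ./ (max(z) - min(z)). A small check, using made-up values for x, might look like this:

```matlab
x = [3 7 1 9 4];                      % hypothetical feature values

% Min-max scaling to [0,1], as in the formula above.
scale01 = @(v) (v - min(v)) ./ (max(v) - min(v));

direct = scale01(x);                  % scale the raw values
z = (x - mean(x)) ./ std(x);          % z-score first ...
viaZ = scale01(z);                    % ... then scale to [0,1]

maxDiff = max(abs(direct - viaZ));    % should be at rounding-error level
```

The two results agree up to floating-point rounding, so the z-score step adds nothing when the final range is fixed by min and max.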
  2 Comments
Sepp
Sepp on 17 Jul 2015
Thank you Walter.
By normalizing per feature, do you mean that I normalize each feature vector separately?
Yes, by z-score I mean zero mean and unit standard deviation.
Walter Roberson
Walter Roberson on 17 Jul 2015
In the notation of http://www.mathworks.com/help/nnet/ref/train.html, each element of the first input X is an Ni-by-Q matrix, where Ni is the number of inputs and the Q columns are the samples. Normalize across the rows, treating each "input" separately.
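For an Ni-by-Q matrix, per-row scaling to [0,1] could be sketched as follows. The matrix X and the variable names are illustrative only; in the toolbox, mapminmax(X, 0, 1) should produce the same mapping.

```matlab
% Hypothetical input: Ni = 3 inputs (rows), Q = 4 samples (columns).
X = [1 2 3 4; 100 300 200 400; -5 0 5 10];

% Per-input (per-row) scaling to [0,1]: min and max taken along dimension 2.
rowMin = min(X, [], 2);
rowMax = max(X, [], 2);
X01 = bsxfun(@rdivide, bsxfun(@minus, X, rowMin), rowMax - rowMin);
```

If new data is scaled later, it would normally reuse the rowMin and rowMax computed from the training data rather than its own extremes.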

