Imprecision problem: pinv(H) is not equal to pinv(H'*H)*H'

12 views (last 30 days)
Sherry X
Sherry X on 19 Feb 2019
Answered: Sherry X on 4 Mar 2019
I'm testing the function with single hidden layer feedforward neural networks (SLFNs) with 20 neurons, by extreme machine learning (ELM).
With a SLFN, in the output layer, the output weight(OW) can be described by
after adding regularized parameter γ(regularized ELM), which
with
.
But when I try to calculate and , I find a huge difference between these two when neurons number is over 5 (under 5, they are equal or almost the same).
For example, when H is `10*10` matrix, , ,
H= [0.736251410036783 0.499731137079796 0.450233920602169 0.296610970576716 0.369359425954153 0.505556211442208 0.502934880027889 0.364904559142718 0.253349959726753 0.298697900877265;
0.724064281864009 0.521667364351399 0.435944895257239 0.337878535128756 0.364906002569385 0.496504064726699 0.492798607017131 0.390656915261343 0.289981152837390 0.307212326718916;
0.711534656474153 0.543520341487420 0.421761457948049 0.381771374416867 0.360475582262355 0.487454209236671 0.482668250979627 0.417033287703137 0.329570921359082 0.315860145366824;
0.698672860220896 0.565207057974387 0.407705930918082 0.427683127210120 0.356068794706095 0.478412571446765 0.472552121296395 0.443893207685379 0.371735862991355 0.324637323886021;
0.685491077062637 0.586647027111176 0.393799811411985 0.474875155650945 0.351686254239637 0.469385056318048 0.462458480695760 0.471085139463084 0.415948455902421 0.333539494486324;
0.672003357663056 0.607763454504209 0.380063647372632 0.522520267708374 0.347328559602877 0.460377531907542 0.452395518357816 0.498449772544129 0.461556360076788 0.342561958147251;
0.658225608290477 0.628484290731116 0.366516925684188 0.569759064961507 0.342996293691614 0.451395814182317 0.442371323528726 0.525823695636816 0.507817005881821 0.351699689941632;
0.644175558300583 0.648743139215935 0.353177974096445 0.615761051907079 0.338690023332811 0.442445652121229 0.432393859824045 0.553043275759248 0.553944175102542 0.360947346089454;
0.629872705346690 0.668479997764613 0.340063877672496 0.659781468051379 0.334410299080102 0.433532713184646 0.422470940392161 0.579948548513999 0.599160649563718 0.370299272759337;
0.615338237874436 0.687641820315375 0.327190410302607 0.701205860709835 0.330157655029498 0.424662569229062 0.412610204098877 0.606386924575225 0.642749594844498 0.379749516620049];
T=[-0.806458764562879 -0.251682808380338 -0.834815868451399 -0.750626822371170 0.877733363571576 1 -0.626938984683970 -0.767558933097629 -0.921811074815239 -1]';
There is a huge difference between and , where
OW1= [-19780274164.6438 -3619388884.32672 -76363206688.3469 16455234.9229156 -135982025652.153 -93890161354.8417 283696409214.039 193801203.735488 -18829106.6110445 19064848675.0189]'.
OW2 = [-4803.39093243484 3567.08623820149 668.037919243849 5975.10699147077 1709.31211566970 -1328.53407325092 -1844.57938928594 -22511.9388736373 -2377.63048959478 31688.5125271114]';
I also find that if I round H , , and return the same answer. So I guess one of the reason might be the float calculation issue inside the matlab.
But since is large, any small change of H may result in large difference in the inverse of H. I think the function may not be a good option to test. With large ,the numerical imprecision will affect the accuracy of inverse.
Back to my question, in my test, I use 1000 training samples , with noise between , and test samples are noise free. 20 neurons are selected. The can give reasonable results for training, while the performance for is worse. Then I try to increase the precision of by , there's no improvement.
One more comment, when I limit the , can return the reasonable result, and highest accuracy when .
Does anyone know how to solve this?
  1 Comment
Sherry X
Sherry X on 19 Feb 2019
In the above calcuation, I normalized X and Y to . The strange thing is that if there's no normalization of X and Y, regularized ELM can have similar result to the ELM.

Sign in to comment.

Accepted Answer

Sherry X
Sherry X on 4 Mar 2019
After some research, the answer is that ELM is very sentive to scaling and activation function.
Please refer to this paper for details: https://dl.acm.org/citation.cfm?id=2797143.2797161
And paper: https://ieeexplore.ieee.org/document/8533625 demonstrated a noval algorithm to improve the perforamance of ELM for scaling.

More Answers (2)

Matt J
Matt J on 19 Feb 2019
Seems to me the obvious solution is not to push gamma to infinity. That removes the regularization whose purpose is precisely to avoid the numerical ill-conditioning you describe.
  4 Comments
Matt J
Matt J on 19 Feb 2019
I can't speak to the paper, but the whole purpose of having a regularization term I/gamma is so that the matrix inversion inv(H'*H+I/gamma) becomes well-conditioned. Pushing gamma too large defeats that.

Sign in to comment.


BERGHOUT Tarek
BERGHOUT Tarek on 20 Feb 2019
in ELM you should alwayes scale your inputs batween (-1,1) fro both versions, do that and let me know about the results
  1 Comment
Sherry X
Sherry X on 22 Feb 2019
I first scaled the input to [-1,1], the regularized ELM performs bad accuracy due to the high condition number. But the strange thing is that if I don't scale the input, the result shows better or equal performance as the ELM.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!