Hello everyone,
I got this data and I want to create a script that plots a gaussian curve that fits the major peak only. I have seen many examples.However, they involve superimposing normal distribution funcitons or are for multiple peaks , so the fitting i get isnt correct.

 Accepted Answer

Star Strider
Star Strider on 22 Feb 2021
Try this:
D1 = readmatrix('Test1.xls');
x = D1(:,1);
y = D1(:,2);
gausfcn = @(b,x) b(1).*exp(-(x-b(2)).^2/b(3));
[maxy,idx] = max(y);
Lv = y >= 1.5; % Restrict Region Of Fit To Region Of Symmetry In ‘y’
B = fminsearch(@(b)norm(y(Lv) - gausfcn(b,x(Lv))), [maxy, x(idx), 1/12.5]);
figure
plot(x, y)
hold on
plot(x, gausfcn(B,x), '-r')
hold off
grid
text(x(idx), 0.4, sprintf('$y = %5.2f \\cdot e^{\\frac{-(x-%7.2f)^2}{%7.2f}}$', B), 'Interpreter','latex', 'HorizontalAlignment','center', 'FontSize',12)
producing:
.

10 Comments

Thor
Thor on 2 Mar 2021
Edited: Thor on 2 Mar 2021
Thanks alot , It really helped me.
is it possible to use a two term gaussian to improve the fitting?
As always, my pleasure!
I am not certain what you want to do. You can esaily change ‘gausfcn’ to add more terms, however I am not certain that would change anything. Note that you would also have to add more elements to the initial estimates vector (the ‘[maxy, x(idx), 1/12.5]’ vector in my code) to accommodate the additional parameters.
I experimented a bit, and that is likely as good as it is possible to fit it. I could not improve on it by adding an extra Gaussian, an extra parameter, subtracting the minimum (so that the minimum of ‘y’ was 0), or scaling the data.
If use two gauss functions like:
Root of Mean Square Error (RMSE): 0.0374577846951396
Sum of Squared Residual: 1.68510584675519
Correlation Coef. (R): 0.996929357645245
R-Square: 0.993868144134962
Parameter Best Estimate
---------- -------------
y0 1.04642449510959
a -103.207734371165
w -66.6539510640855
xc 561.700349327043
a1 -155.41253198475
w1 170.545453417549
xc1 759.446293260026
while, if take the combination of a gauss and a logistic function like:
the result will be much better
Root of Mean Square Error (RMSE): 0.0295077743515416
Sum of Squared Residual: 1.04572120536498
Correlation Coef. (R): 0.998095572650852
R-Square: 0.996194772145232
Parameter Best Estimate
---------- -------------
y0 1.04811642871182
a 93.8876146479961
w 62.2800066689394
xc 559.314760704218
a1 -0.697827989172453
x0 658.964375254954
p -34.4465710821952
Thor
Thor on 2 Mar 2021
THanks for your contribution, The fitting i am looking for is shown below.
Thor
Thor on 2 Mar 2021
Edited: Thor on 2 Mar 2021
Mr @Star Strider, could you show me how to make the code account for more terms in the function ?Many thanks.
There are two ways to combine them, adding them or multiplying them.
I let the ga (genetic algorithm) function have a go at this, since the parameters are difficult to estimate otherwise.
For the additive function:
gausfcn2a = @(b,x) b(1).*exp(-(x-b(2)).^2/b(3)) + b(4).*exp(-(x-b(5)).^2/b(6));
two parameter sets were:
B(1) = 1.82832
B(2) = 532.81617
B(3) = 1865.81457
B(4) = 1.57228
B(5) = 592.02143
B(6) = 1739.77333
and:
B(1) = 1.44994
B(2) = 592.98443
B(3) = 2069.30812
B(4) = 1.74877
B(5) = 533.39050
B(6) = 2223.33610
For the multiplicative function:
gausfcn2m = @(b,x) b(1).*exp(-(x-b(2)).^2/b(3)) .* b(4).*exp(-(x-b(5)).^2/b(6));
two parameter sets were:
B(1) = 42.76331
B(2) = 468.93401
B(3) = 2709.89032
B(4) = 25.28253
B(5) = 649.40457
B(6) = 2746.58184
and:
B(1) = 42.41765
B(2) = 468.55038
B(3) = 2697.30963
B(4) = 27.28298
B(5) = 649.53395
B(6) = 2731.39634
The additive version (‘gausfcn2a’) appears to provide the best results. You can use the parameter sets as initial estimates if you want to use other functions to improve on the fit. If you have the Global Optimization Toolbox, I will post the code I used for this so you can use the ga function to experiment with it. It converged relatively quickly, in about two minutes on my Ryzen 9 3900 machine.
Thor
Thor on 16 Mar 2021
Hi , sorry for the late reply. Please, could you post the code? I think my university offers the pack.
Also, I am having a little issue trying to obtain a nice fitting for unsymetrical peaks.
My code (that I already posted) should work for the data in that plot image. Choose the version of ‘gausfcn’ that works best with those data.
Thor
Thor on 16 Mar 2021
Hi @Star Strider, I was referring to this
'' If you have the Global Optimization Toolbox, I will post the code I used for this so you can use the ga function to experiment with it ''
Sure!
I ran it again to be certain that it works.
D1 = readmatrix('Test1.xls');
x = D1(:,1);
y = D1(:,2);
[maxy,idx] = max(y);
Lv = y >= 1.08; % Restrict Region Of Fit To Region Of Symmetry In ‘y’
gausfcn2a = @(b,x) b(1).*exp(-(x-b(2)).^2/b(3)) + b(4).*exp(-(x-b(5)).^2/b(6));
% gausfcn2m = @(b,x) b(1).*exp(-(x-b(2)).^2/b(3)) .* b(4).*exp(-(x-b(5)).^2/b(6));
ftns = @(b)norm(y(Lv) - gausfcn2a(b,x(Lv)))
PopSz = 50;
Parms = 6;
opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3+[maxy, x(idx), 1/12.5, maxy, x(idx), rand], 'MaxGenerations',2E5, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)
[B,fval,exitflag,output] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],opts)
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)
GA_Time = etime(t1,t0)
DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');
fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)
fprintf(1,'\tRate Constants:\n')
for k1 = 1:length(B)
fprintf(1, '\t\tB(%d) = %8.5f\n', k1, B(k1))
end
figure
plot(x, y)
hold on
plot(x, gausfcn2a(B,x), '-r')
hold off
grid
% text(x(idx), 0.4, sprintf('$y = %5.2f \\cdot e^{\\frac{-(x-%7.2f)^2}{%7.2f}}$', B), 'Interpreter','latex', 'HorizontalAlignment','center', 'FontSize',12)
% text(x(idx), 0.2, sprintf('$y = %5.2f \\cdot e^{(\\frac{-(x-%7.2f)}{%7.2f})^2}$', B(1:2),sqrt(B(3))), 'Interpreter','latex', 'HorizontalAlignment','center', 'FontSize',12)
Experiment with it to get the result you want. It may be necessary to tweak the tolerances to produce a slightly better result than this code produces. You can use any of the objective function variations in the ‘ftns’ function, or create your own, to get a suitable result.
Adding a bit of documentation —
PopSz = population size (rows of the 'InitialPopulationMatrix')
Parms = Number of Parameters in the objective function (columns of the 'InitialPopulationMatrix')
This vector: ‘[maxy, x(idx), 1/12.5, maxy, x(idx), rand]’ scales the columns of the 'InitialPopulationMatrix' to different magnitudes. That makes it easier for the ga algorithm to converge. Change that (or eliminate it entirely) as necessary.
If you have other questions, post them and I will do my best to provide appropriate replies.

Sign in to comment.

More Answers (0)

Categories

Find more on MATLAB in Help Center and File Exchange

Asked:

on 22 Feb 2021

Edited:

on 14 May 2021

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!