Need some help getting a better fit curve than what I'm getting.

29 views (last 30 days)
Hi all,
This is my curve fitting attempt, part 2 on here!
So I wrote a program, and got help not too long ago on getting it to run, and actually getting curves to fit the data. That has been extremely helpful, but now I'm having a new issue, and was hoping that someone could help push me in the right direction.
I'm having trouble finding the right function to fit the data. Two of the curves, I'm presently running exponential fits, and the third I'm running a linear fit. This is where I'm coming into issues. In running the exponentials, I'm either missing my high energy data completely, or it bows way out in the middle, and doesn't really "fit" the data too well anyway. And the linear curve seems OK, but it misses a few very important pieces by a large margin.
So my question is, is there something in the code which may be causing the issues? Or is there a better way to fit the curves? Is linear not the right choice for the set that I did model with it?
Here's the necessary codes:
This one is JUST data points. The name is HeliumData_Voyager.
xdata1 = [0.0021859258,0.0035637526,0.0049688127,0.007943902,0.012700333,...
0.020705573,0.032462176,0.04615445,0.059510704,0.098936856,0.19612777,...
0.2899725,0.381267,0.5013045,12.45768,12.954449,15.1476755,14.378171,...
17.712223,19.658836,20.982672,22.395653,30.620895,24.856985,27.95077,...
33.1117,35.805107];
ydata1 = [2.17084,1.5341274,1.6444274,2.4091163,2.025231,2.673546,2.7679887,...
2.5823257,2.3269181,1.8893886,1.3352263,0.8803091,0.5803841,0.42464474,...
4.8200923E-4,3.8242337E-4,2.194412E-4,2.9647112E-4,1.8235153E-4,...
1.3813264E-4,1.0463649E-4,8.496173E-5,4.1460968E-5,6.144804E-5,...
4.875252E-5,3.2894895E-5,2.5501544E-5];
This one is also simply data points, and is called ElectronData_Voyager:
xdata2 = [0.008410999,0.014524744,0.025501791,0.03670282,0.23314714,0.3414813,...
0.43070924,0.5616051,0.7526595,0.93877333,1.107716,1.2646372,1.4279159,...
1.6480435,1.8814303,2.1475997,2.478777,2.7978826,3.3011177,3.7263978,...
4.253224,4.8009634,5.6017394,6.3934293,7.2964,8.419811,9.504132,10.848248,...
12.244799,14.128927,16.484188,18.606281,21.236778,24.23614,28.278603,...
31.569468,36.830494];
ydata2 = [338.37698,157.8941,73.66187,40.503326,0.030230608,0.03733925,...
0.037233148,0.039170094,0.035036482,0.03136671,0.023474017,0.020294663,...
0.017548302,0.013136245,0.010955717,0.008202303,0.006364977,0.00412716,...
0.0028742207,0.0020027386,0.0013953064,9.3788357E-4,6.078926E-4,...
4.08552E-4,2.55516E-4,1.6563607E-4,1.1133565E-4,8.040895E-5,5.213854E-5,...
3.1451833E-5,1.89703E-5,1.2300668E-5,8.569866E-6,5.3597532E-6,3.4739433E-6,...
2.335398E-6,1.3588274E-6];
This one is also just data points, and is called ProtonData_Voyager:
xdata3 = [0.0021387795,0.0035343652,0.0050304034,0.008200727,0.012835551,...
0.02121094,0.033198774,0.045365524,0.05792296,0.073956355,0.13079768,...
0.15604351,0.18117562,0.20196046,0.21909945,0.27225408,0.33830434,...
0.40911844,0.5083727,0.5188311,0.5515148,0.58031857,0.6295662,0.6829932,...
0.73344815,0.77965164,0.83724713,0.8899895,0.955736,1.0368427,1.1134378,...
1.2079275,1.2840208,1.3649076,1.4807379,1.5740169,1.6562227,1.7075928,...
1.7967749,1.8525044,1.9893551,2.1581779,2.2025764,2.365288,2.4386508,...
2.5660138,2.7000282,2.9591372,3.1136832,3.3098297,3.4826913,3.702083,...
3.8954308,4.0988765,4.269257,4.492226,4.6315594,4.923324,5.1804533,...
5.451011,5.7357,6.09702,6.415447,6.9598823,7.249188,7.550519,8.191279,...
8.707289,9.06923,9.739204,10.14404,10.565703,10.893414,11.698147,12.309103,...
13.084514,13.628406,14.340173,15.877169,16.70638,17.578901,18.877514,...
20.066702,21.11472,22.44484,23.858751,25.104816,26.686293,28.08003,...
30.154396,31.729261,33.728046,35.85274,38.111282,40.5121];
ydata3 = [18.474272,22.686558,20.240086,23.210253,24.854992,22.174679,...
22.686558,25.428741,26.616282,24.854992,17.650005,18.474272,15.746664,...
14.048576,11.974394,10.206451,9.105808,8.503246,6.768185,2.1014574,...
1.9623967,1.8325382,1.770868,1.6536837,1.5709128,1.4175926,1.3935355,...
1.3013206,1.1945851,1.0417166,0.95627403,0.8778396,0.79216295,0.72718906,...
0.66754436,0.61279184,0.5722413,0.52530557,0.48221955,0.45030943,0.3994634,...
0.37302953,0.33662206,0.28856367,0.26946843,0.24736638,0.22322357,0.20143707,...
0.1881073,0.16686743,0.15318082,0.13588463,0.118495755,0.11065449,0.09649427,...
0.084146105,0.07724436,0.06621645,0.05873971,0.052107196,0.047021564,0.040308453,...
0.035150267,0.03171962,0.025830137,0.022913564,0.01898125,0.01655226,0.014434102,...
0.012163411,0.010789997,0.009409224,0.008490889,0.007794459,0.00668167,0.005826631,...
0.004994782,0.0042816936,0.003486698,0.0030405128,0.0024339526,0.002051057,0.0018194647,...
0.0016418861,0.0014317774,0.0012273673,0.0010703037,9.658427E-4,8.000895E-4,...
6.515343E-4,5.4903864E-4,4.7065422E-4,4.034605E-4,3.3422056E-4,2.86505E-4];
This is the actual body of the program:
clc
clear all
%All three below simply call the data files which we are to use.
HeliumData_Voyager;
ElectronData_Voyager;
ProtonData_Voyager;
%f1 = @(b,x) exp(b(1)) .* (x + b(2)).^b(3) + b(4);%Function for HeliumData
f1 = @(b,x) b(1) .* exp(-b(2).*(x + b(3))) + b(4);
%f2 = @(a,x) exp(a(1)) .* (x + a(2)).^a(3) + a(4);%Function for ElectronData
%f2 = @(b,x) b(1) .* exp(-b(2).*(x + b(3))) + b(4);
f3 = @(c,x) c(1) .* exp(-c(2).*x); %Function for ProtonData
xdc1 = linspace(min(xdata1), max(xdata1), 150);% Continuous ‘xdata’ for each
%xdc2 = linspace(min(xdata2), max(xdata2), 300);
xdc3 = linspace(min(xdata2), max(xdata2), 1000);
b0 = [10; 0.1; 0.1; 0.1];%initial estimate vectors
%a0 = [10; 0.1; 0.1; 0.1];
c0 = [10; 0.1; 0.1; 0.1];
B1 = nlinfit(xdata1, ydata1, f1, b0);%non-linear regression
%A1 = nlinfit(xdata2, ydata2, f2, a0);
C1 = nlinfit(xdata3, ydata3, f3, c0);
%This is a test fit for linear regression for the Electron Data
p = polyfit(log10(xdata2),log10(ydata2),1);
f = polyval(p,log10(xdata2));
%below, we're simply graphing the thing
figure(1)
loglog(xdata1, ydata1, 'bp') %HeliumData
hold on
loglog(xdc1, f1(B1,xdc1))%Best fit for Helium
axis([.001,100,.00001,20])
title('Low and High Energy Cosmic Ray Spectra for Helium')
xlabel('Energy (GeV)')
ylabel('Flux particle m^{-2}s^{-1}sr^{-1}MeV')
hold off
grid
figure(2)
%loglog(xdata2, ydata2, 'o')%ElectronData
loglog(xdata2,ydata2,'.b')
hold on
loglog(xdata2,10.^f,'--r')
%loglog(xdc2, f2(A1,xdc2))%Best fit for Electron
axis([.001,50,.0001,1000])
title('Low and High Energy Cosmic Ray Spectra for Electron')
xlabel('Energy (GeV)')
ylabel('Flux particle m^{-2}s^{-1}sr^{-1}MeV')
hold off
grid
figure(3)
loglog(xdata3, ydata3, 'x')%ProtonData
hold on
loglog(xdc3, f3(C1,xdc3))%Best fit for Proton
axis([.001,100,.0001,100])
title('Low and High Energy Cosmic Ray Spectra for Proton')
xlabel('Energy (GeV)')
ylabel('Flux particle m^{-2}s^{-1}sr^{-1}MeV')
hold off
grid
You'll notice in the body of the program, there are several of my attempts which are commented out. These are only a few of the attempts I've tried in the past several days. If you observe, the helium data has the worst fit of the high energy parts, the electron isn't too bad, and proton is pretty close. I've attached the three graphs so you can see what I'm getting.
So is it something in my code, or is it something in the choice of the curve fitting equations? And if it's the second, do you guys have any suggestions which would be better and/or yield a smoother curve?
Thanks!
  6 Comments
CAM
CAM on 3 Aug 2015
First, let me throw some data at you. This is the updated version, and I'll only send along the helium set.
xdata1 = [0.0021859258,0.0035637526,0.0049688127,0.007943902,0.012700333,...
0.020705573,0.032462176,0.04615445,0.059510704,0.098936856,0.19612777,...
0.2899725,0.381267,0.5013045,12.45768,12.954449,15.1476755,14.378171,...
17.712223,19.658836,20.982672,22.395653,30.620895,24.856985,27.95077,...
33.1117,35.805107];
ydata1 = [2.17084,1.5341274,1.6444274,2.4091163,2.025231,2.673546,2.7679887,...
2.5823257,2.3269181,1.8893886,1.3352263,0.8803091,0.5803841,0.42464474,...
4.8200923E-4,3.8242337E-4,2.194412E-4,2.9647112E-4,1.8235153E-4,...
1.3813264E-4,1.0463649E-4,8.496173E-5,4.1460968E-5,6.144804E-5,...
4.875252E-5,3.2894895E-5,2.5501544E-5];
That can let you play with some things. I have an update to my code that I'm hoping you guys can take a look at. Let me upload that here as well.
CAM
CAM on 3 Aug 2015
OK, so here's what I've done as a change. First, I've modified the helium data. At the suggestion of my advisor, I've tried to fix the first and last exponents as the slope of the line at those places (so what I did was essentially create 4 codes within the program to generate a line at 4 different places). Then I use this to fix the exponent to that value (and in the initial guess). Finally, I graphed the inverse power law function.
What you'll see below is simply the fit for Helium. I did this so as to avoid confusion with the other ones for now, and the Helium Data seems to be giving the largest issue, so if I can figure it out for Helium, I believe I can easily transpose that over for both Proton and Electron data as necessary later.
In what I've most recently done, the fit looks promising, but also still isn't quite right. It should be able to be somewhat predictive of the middle values, which should fall right in line and not be so.....off...in the middle (and subsequently the end).
Here's the code:
%% This will be the program for Helium
clc
clear all
%All three below simply call the data files which we are to use.
HeliumData_Voyager;
%The below is generating vectors for the different segments of the
%graph. So in essence, taking 4 pieces, using this information to find the
%slope of the graph at that place, and then using that slope of that
%segment in the initial guess vector b0.
xvec1 = [xdata1(1), xdata1(2), xdata1(3), xdata1(4), xdata1(5)];
xvec2 = [xdata1(6),xdata1(7),xdata1(8),xdata1(9),xdata1(10)];
xvec3 = [xdata1(11),xdata1(12),xdata1(13),xdata1(14)];
xvec4 = [xdata1(15),xdata1(16),xdata1(17),xdata1(18),...
xdata1(19),xdata1(20),xdata1(21),xdata1(22),xdata1(23),xdata1(24),xdata1(25),...
xdata1(26),xdata1(27)];
yvec1 = [ydata1(1),ydata1(2),ydata1(3),ydata1(4),ydata1(5)];
yvec2 = [ydata1(6),ydata1(7),ydata1(8),ydata1(9),ydata1(10)];
yvec3 = [ydata1(11),ydata1(12),ydata1(13),ydata1(14)];
yvec4 = [ydata1(15),ydata1(16),ydata1(17),ydata1(18),ydata1(19),...
ydata1(20),ydata1(21),ydata1(22),ydata1(23),ydata1(24),...
ydata1(25),ydata1(26),ydata1(27)];
%This is finding the line of best fit of the 4 segments of the data above,
%to be used in the nonlinear best fit line below.
pfit = polyfit(log10(xvec1),log10(yvec1),1);
ffit = polyval(pfit,log10(xvec1));
gfit = polyfit(log10(xvec2),log10(yvec2),1);
hfit = polyval(gfit,log10(xvec2));
qfit = polyfit(log10(xvec3),log10(yvec3),1);
xfit = polyval(qfit,log10(xvec3));
yfit = polyfit(log10(xvec4),log10(yvec4),1);
zfit = polyval(yfit,log10(xvec4));
slope1 = 10^pfit(1);
slope2 = 10^gfit(1);
slope3 = 10^qfit(1);
slope4 = 10^yfit(1);
%Now we get into actually solving the function. The idea is to use an
%inverse power law. Iniitially, we are trying to hold the first power and
%the fourth power constant, and allow the others to float freely.
f1 = @(b,x) 1\(b(1).* x.^slope1+ b(3).* x.^b(4) + b(5).*x.^b(6)+ b(7).*x.^slope4);
xdc1 = linspace(min(xdata1), max(xdata1), 150);% Continuous ‘xdata’ for each
%This is the initial estimate vector.
b0 = [1 ;slope1; 10; slope2; 1; slope3; .1; slope4];
B1 = nlinfit(xdata1, ydata1, f1, b0);%non-linear regression
%below, we're simply graphing the thing
figure(1)
loglog(xdata1, ydata1, 'bp') %HeliumData
hold on
loglog(xdc1, f1(B1,xdc1))%Best fit for Helium
axis([.001,100,.00001,20])
title('Low and High Energy Cosmic Ray Spectra for Helium')
xlabel('Energy (GeV)')
ylabel('Flux particle m^{-2}s^{-1}sr^{-1}MeV')
hold off
grid
I should note that in my previous comment, you will find the data for HeliumData_Voyager.
Now here's something I notice: the choice for b0 can DRASTICALLY change this fit. So is there a way I can make a very good guess for b0? I'm basically trying random numbers, and that seems to not do so well (and rand seems to give me errors, so I have just been trying to select numbers based on multiples of 10, however this is proving to be an ineffective method).
Does it look like I'm on the right track? Or is there a better way to accomplish a good fit than what I'm trying above? I'm getting closer each iteration, but am stuck slightly here now. My most recent graph is attached. It's about the closest I've been, so I know we're on the right track!
Also, thank you guys so much for your help!

Sign in to comment.

Accepted Answer

Star Strider
Star Strider on 28 Jul 2015
I’m not certain how much help I can be on this. My undergraduate physics and physical chemistry are almost phlogiston era.
There seem to be definite relations between mass and energy in your plots. With the electrons, the high- and low-energy plots seem to be defined by one curve, but the others’ high- and low-energy plots have different characteristics, increasingly disparate as the mass increases.
This is more a physics than MATLAB problem. Since I don’t understand the physics well enough to attempt a model, I would either fit the high- and low-energy regions separately to an appropriate model, and look for parameter similarities between them, or fit them to a black-body radiation curve or something similar.
  11 Comments
CAM
CAM on 12 Aug 2015
Oh I completely understand what you mean. Often times, without full comprehension of something, it can be hard to offer much assistance. That's why in a lot of ways, I'm thankful just to bounce ideas around. You are correct that the particle mass affects the data tremendously. I'm not directly dealing with that in my results, but indirectly (we're doing a flux as a function of kinetic energy, and obviously the more massive the particle, the greater change in the flux as energy is increased).
This is an extremely useful trick! I appreciate you telling me this! Thanks! I'll simplify the code to utilize that!
Star Strider
Star Strider on 12 Aug 2015
My pleasure!
If you can direct me to a free article that discusses the physics of what you’re studying, it would help with future Answers. It’s not part of my current physics background.

Sign in to comment.

More Answers (2)

CAM
CAM on 6 Aug 2015
I truly appreciate you putting the time into it. I've gotten fairly close from all of your help, but it's the last little push that's now keeping me from making it the rest of the way. What I think may be best is to revamp the code and see if there's a way to model it without adding any slopes.

Taro Ichimura
Taro Ichimura on 15 Jul 2016
Disregarding the physical aspects, if your objective is to get ideal baseline you could check if you can apply the algorithm from Lieber & Mahadevan-Jansen 2003, called ModPoly. The fitting is such that it should not "cut" through your data line but instead be kept below.
%MOD-POLY iterative polynomial fitting procedure.
%Literature:
% Lieber CA, Mahadevan-Jansen A, Automated method for subtraction of
% fluorescence from biological Raman spectra, Applied spectroscopy, 57
% (2003) 11, 1363-1367
% Default values: 200 iterations, 5th order polynomial
iter = 200;
order = 5;
[m,n] = size(Spectra); %specifiy array size of your dataset
Spectra2 = zeros(m,n);
warning off MATLAB:polyfit:RepeatedPointsOrRescale
param.P = cell(m,1);
param.S = cell(m,1);
h_importspec = waitbar(0,'Baseline correction...');
for nr = 1:m %do all spectra in matrix
waitbar(nr/m, h_importspec);
ThisSpectrum = Spectra(nr,:);
for i = 1:iter % iterations to calculate baseline value
[P,S] = polyfit(xaxis2,ThisSpectrum,order);
Spectra2(nr,:) = polyval (P, xaxis2);
ind = find (Spectra2(nr,:) > ThisSpectrum);
Spectra2(nr,ind) = ThisSpectrum(1,ind);
ThisSpectrum = Spectra2(nr,:);
end
back = polyval (P, xaxis2);
%plot the spectra during iteration (cost time)
subplot(5, 2, [5 6])
plot(xaxis2,Spectra2)
title('Baseline estimation')
plot (xaxis2, Spectra(nr,:), 'b',xaxis2, Spectra2(nr,:), 'r');
title('Baseline estimation')
legend('Original spectrum', 'Baseline estimation');
Spectra2(nr,:) = Spectra(nr,:)-back; %subtract baseline from original spectra
So in this code, you obtain transformed spectral data in Spectra2 matrix.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!