Why am I getting the same R-squared values?
Hello, I have a small problem: when I use fitlm() I keep getting the same R-squared values for different models. I first thought my original models were the wrong ones, but the same thing happens with well-known models, as shown in the code below. I am sure the R-squared values should not be identical for all the models, so I suspect there is something wrong with my code. Please check and assist. I have attached sample data below, and it is also available as a Google Sheet here: https://docs.google.com/spreadsheets/d/10rK3bswrKyG_BiLRNHtIDBF6vEKo-zdzDTtdPu7KRd8/edit?usp=sharing
%Wind_data =25*rand(100000,1);
A=input('enter wind speed matrix\n')
 %second phase of filtration
nrows= numel(A);
ncols=1;
for i=1:nrows
    for j=1:ncols
        if A(i,j)<0
            A(i,j)=0.1;
        elseif A(i,j)>25
            A(i,j)=25;
        elseif A(i,j)==0
            A(i,j)=0.5;
        else
        end
    end 
end
Wind_data=A;
Param_rayl=raylfit(sort(Wind_data),0.05);
Ray_pdf = raylpdf(sort(Wind_data),Param_rayl);
Ray_cdf = raylcdf(sort(Wind_data),Param_rayl);
figure
%plot(sort(Wind_data),Ray_pdf)
%figure
%plot(sort(Wind_data),Ray_cdf)
%Inverse Weibull
[Params_weibull]=wblfit(Wind_data);
Weibull_inv=wblinv(sort(Wind_data),Params_weibull(1),Params_weibull(2));%note: wblinv is the inverse Weibull cdf and expects probabilities in [0,1], not wind speeds
figure
%plot(sort(Wind_data),Weibull_inv);
%gamma probability density distribution
[Param_gamma]=gamfit(sort(Wind_data))%parameter estimation
Gamma_pdf=gampdf(sort(Wind_data),Param_gamma(1),Param_gamma(2));%gamma pdf
Gamma_cdf=gamcdf(sort(Wind_data),Param_gamma(1),Param_gamma(2));%gamma cdf
%figure
%plot(sort(Wind_data),Gamma_pdf);
%figure
%plot(sort(Wind_data),Gamma_cdf);
%extreme value distribution
[Params_evpdf]=evfit(sort(Wind_data));
Gumbel_evpdf=evpdf(sort(Wind_data),Params_evpdf(1),Params_evpdf(2));
Gumbel_evcdf=evcdf(sort(Wind_data),Params_evpdf(1),Params_evpdf(2));
%figure
%plot(sort(Wind_data),Gumbel_evpdf);
%figure
%plot(sort(Wind_data),Gumbel_evcdf);
Combined_cdfs=[Ray_cdf Gamma_cdf Gumbel_evcdf];
Empc=ecdf(Wind_data);
%weibull cumulative
%group the Estc1
figure
Hs_ray=histogram(Ray_cdf,numel(Empc));
binEdges_ray = Hs_ray.BinEdges;
x1 = binEdges_ray(1:end-1) + Hs_ray.BinWidth/2;
R1=fitlm(Empc,x1')
%group the Estc_gamma
figure
Hs_gamma=histogram(Gamma_cdf,numel(Empc));
binEdges_gamma = Hs_gamma.BinEdges;
x2 = binEdges_gamma(1:end-1) + Hs_gamma.BinWidth/2;
R2=fitlm(Empc,x2')
%group the Estc_gumbel
figure
Hs_gumbell=histogram(Gumbel_evcdf,numel(Empc));
binEdges_gumbell = Hs_gumbell.BinEdges;
x3 = binEdges_gumbell(1:end-1) + Hs_gumbell.BinWidth/2;
R3=fitlm(Empc,x3')
%trial visualization
figure
cdfplot(Wind_data)
hold on
plot(sort(Wind_data),Ray_cdf)
plot(sort(Wind_data),Gamma_cdf)
plot(sort(Wind_data),Gumbel_evcdf);
hold off
legend('real','Ray','Gama','Gumbel')
6 Comments
  Torsten on 6 Oct 2024 (Edited: Torsten on 6 Oct 2024)
You don't need to give the path on your computer. You only have to use the file name under which you saved the data here in MATLAB Online ("sorted data for importig.csv"). You also have to remove the rows where -9998 appears; I guess you don't want that value treated as a tremendous negative wind speed in the fitting process.
  dpb on 6 Oct 2024
L=readlines('sorted data for importig.csv');
L(1:5)
tW=readtable('sorted data for importig.csv','readvariablenames',1);
head(tW)
tW.Properties.VariableNames={'Date','WindSpeed'};
[height(tW) any(tW.WindSpeed==-9998) nnz(tW.WindSpeed==-9998) nnz(tW.WindSpeed>=0)]
all(isfinite(tW.WindSpeed))
tW=tW(tW.WindSpeed>=0,:);       % keep only valid data
WS=sort(tW.WindSpeed);
Param_rayl=raylfit(WS,0.05);
Ray_pdf = raylpdf(WS,Param_rayl);
Ray_cdf = raylcdf(WS,Param_rayl);
[Params_weibull]=wblfit(WS);
Wei_pdf=wblpdf(WS,Params_weibull(1),Params_weibull(2));
histogram(WS,'Normalization','pdf');
line(WS,Ray_pdf,'linestyle','-','color','r')
line(WS,Wei_pdf,'linestyle','-','color','b')
xlim([0 10])
legend('Rayleigh','Weibull')
xlabel('Windspeed'), ylabel('P(WS)')
As to the original question: R-squared is not an appropriate measure for testing how well a distribution fits; see <NIST> for a comparison of test statistics for continuous distributions.
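To make both points concrete, here is a minimal sketch; it is not code from the thread. It assumes the Statistics and Machine Learning Toolbox and uses synthetic Weibull data (the variable `WS` and the parameters 6 and 2.2 are made up for illustration, not taken from the poster's file). Part 1 shows one likely reason the three R-squared values in the question were identical: the bin centres returned by histogram() are evenly spaced, so x1, x2 and x3 are linear rescalings of one another, and R-squared does not change when the response is rescaled linearly. Part 2 compares the candidate distributions with the Kolmogorov-Smirnov statistic instead, which is one of several possible choices.

```matlab
% Synthetic stand-in for the wind speeds (hypothetical values, not the real data)
rng(0);                                   % reproducible sample
WS = wblrnd(6, 2.2, 2000, 1);

% Part 1: R-squared is unchanged by a linear rescaling of the response
F  = ecdf(WS);                            % empirical CDF values
n  = numel(F);
c1 = linspace(0.01, 0.99, n)';            % evenly spaced "bin centres"
c2 = 0.2 + 0.5*c1;                        % different range, still evenly spaced
m1 = fitlm(F, c1);
m2 = fitlm(F, c2);
r1 = m1.Rsquared.Ordinary;
r2 = m2.Rsquared.Ordinary;
fprintf('R-squared: %.6f vs %.6f\n', r1, r2);   % same value up to rounding

% Part 2: compare fitted distributions with the Kolmogorov-Smirnov statistic
names  = {'Rayleigh', 'Weibull', 'Gamma', 'ExtremeValue'};
ksstat = zeros(numel(names), 1);
pval   = zeros(numel(names), 1);
for k = 1:numel(names)
    pd = fitdist(WS, names{k});           % maximum-likelihood fit
    [~, pval(k), ksstat(k)] = kstest(WS, 'CDF', pd);
end
results = table(names(:), ksstat, pval, ...
    'VariableNames', {'Distribution', 'KSstat', 'pValue'});
disp(sortrows(results, 'KSstat'));        % smaller KSstat = closer fit
```

A smaller KS statistic means the fitted CDF stays closer to the empirical CDF. The p-values should be read with caution, because the parameters were estimated from the same sample that is being tested, which makes the standard KS p-value too optimistic.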
Answers (0)