How to calculate the R-squared error of various probability distributions used in the Probability Distribution app?

Answers (1)

Harsha Vardhan on 5 Jan 2024
Edited: Harsha Vardhan on 5 Jan 2024
Hi,
I understand that you want to calculate the error of fitting probability distributions in the ‘Distribution Fitter’ App.
Please see the following documentation pages on calculating the error.
  1. https://www.mathworks.com/help/stats/coefficient-of-determination-r-squared.html
  2. https://www.mathworks.com/help/matlab/data_analysis/linear-regression.html
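From the above links, the formula for the error (R-squared) can be written as:
R^2 = 1 - SSresid / SStotal
Here SSresid is the sum of squared differences between the empirical values and the fitted values, and SStotal is the sum of squared differences between the empirical values and their mean.
You can obtain the empirical and fitted values by first fitting your data with a chosen distribution in the ‘Distribution Fitter’ App. Then, select 'File' and then 'Generate Code'. This will provide you with a script containing variables for both the empirical data curve and the fitting curve.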
By executing this script, you'll be able to access the data necessary to calculate the error.
I fitted a curve on the data from the example. Check the image below for the fit.
Following the generation of the code for these curves, as previously described, you can examine the variables that contain the data for these curves. Below is the generated code for your reference.
function pd1 = createFit(MPG)
%CREATEFIT Create plot of datasets and fits
% PD1 = CREATEFIT(MPG)
% Creates a plot, similar to the plot in the main distribution fitter
% window, using the data that you provide as input. You can
% apply this function to the same data you used with distributionFitter
% or with different data. You may want to edit the function to
% customize the code and this help message.
%
% Number of datasets: 1
% Number of fits: 1
%
% See also FITDIST.
% This function was automatically generated on 05-Jan-2024 11:42:47
% Output fitted probability distribution: PD1
% Data from dataset "MPG data":
% Y = MPG
% Force all inputs to be column vectors
MPG = MPG(:);
% Prepare figure
clf;
hold on;
LegHandles = []; LegText = {};
% --- Plot data originally in dataset "MPG data"
[CdfY,CdfX] = ecdf(MPG,'Function','cdf'); % compute empirical function
hLine = stairs(CdfX,CdfY,'Color',[0.333333 0 0.666667],'LineStyle','-', 'LineWidth',1);
xlabel('Data');
ylabel('Cumulative probability')
LegHandles(end+1) = hLine;
LegText{end+1} = 'MPG data';
% Create grid where function will be computed
XLim = get(gca,'XLim');
XLim = XLim + [-1 1] * 0.01 * diff(XLim);
XGrid = linspace(XLim(1),XLim(2),100);
% --- Create fit "My fit 1"
% Fit this distribution to get parameter values
% To use parameter estimates from the original fit:
% pd1 = ProbDistUnivParam('normal',[ 23.71808510638, 8.035726178665])
pd1 = fitdist(MPG, 'normal');
YPlot = cdf(pd1,XGrid);
hLine = plot(XGrid,YPlot,'Color',[1 0 0],...
'LineStyle','-', 'LineWidth',2,...
'Marker','none', 'MarkerSize',6);
LegHandles(end+1) = hLine;
LegText{end+1} = 'My fit 1';
% Adjust figure
box on;
hold off;
% Create legend from accumulated handles and labels
hLegend = legend(LegHandles,LegText,'Orientation', 'vertical', 'FontSize', 9, 'Location', 'northwest');
set(hLegend,'Interpreter','none');
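Inside the above function, 'CdfY' holds the empirical CDF values (at the points 'CdfX') and 'YPlot' holds the fitted CDF values (on the 100-point 'XGrid'). These are local variables, so return them from the function or pause at a breakpoint to inspect them. Because the two curves are evaluated on different x grids, evaluate the fitted CDF at the empirical points, for example with cdf(pd1,CdfX), before calculating the error.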
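As a rough sketch of that calculation, the script below computes R-squared between the empirical CDF and the fitted normal CDF at the same x values. It assumes the data is the MPG variable from the carsmall sample dataset (an assumption, since the original example data is not named here), and the names xEmp, yEmp, yFit, SSresid, SStotal and Rsq are illustrative rather than part of the generated code.

```matlab
% Sketch only: R-squared of a fitted CDF against the empirical CDF.
% Assumption: MPG comes from the carsmall sample dataset that ships with
% Statistics and Machine Learning Toolbox; substitute your own data vector.
load carsmall
MPG = MPG(~isnan(MPG));                 % drop missing values explicitly
MPG = MPG(:);                           % force a column vector

[CdfY, CdfX] = ecdf(MPG);               % empirical CDF at the unique data values
pd1 = fitdist(MPG, 'normal');           % same distribution as in the generated code

% ecdf repeats the smallest x value as its first point (where CdfY is 0),
% so skip that first point to avoid counting the same x twice.
xEmp = CdfX(2:end);
yEmp = CdfY(2:end);
yFit = cdf(pd1, xEmp);                  % fitted CDF at the same x values

SSresid = sum((yEmp - yFit).^2);        % residual sum of squares
SStotal = sum((yEmp - mean(yEmp)).^2);  % total sum of squares
Rsq = 1 - SSresid / SStotal;            % coefficient of determination
fprintf('R-squared of the CDF fit: %.4f\n', Rsq);
```

To compare several distributions, the same steps could be repeated with a different distribution name passed to fitdist. Note that this R-squared is computed on the CDF curves, so it measures how closely the fitted CDF tracks the empirical CDF rather than being a formal goodness-of-fit test.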
Hope this helps in solving your query.