Finding the common area between the distribution of empirical data and log normal distribution

1 view (last 30 days)
Hello,
I have plotted a ks density plot for my data. In addition I have plotted normally used distributions above it I want to see how common area they have between them.
fullfile = [pathname, filename];
sampledata = readtable(fullfile);
index_m=sampledata.Type=="M" ;
data_m = sampledata.CycleTime(index_m,:);
index_r=sampledata.Type=="R";
data_r = sampledata.CycleTime(index_r,:);
% Fit a log-normal distribution to the data
mu_ln = mean(log(data_m));
sigma_ln = std(log(data_m));
% Fit a log-logistic distribution to the data
paramLogLogistic = fitdist(data_m, 'loglogistic');
alpha = paramLogLogistic.ParameterValues(1);
beta = paramLogLogistic.ParameterValues(2);
% Generate a range of values for the x-axis
x = linspace(0.3, 7, 1000);
% Compute the log-normal PDF
pdf_ln = lognpdf(x, mu_ln, sigma_ln);
% Compute the log-logistic PDF
pdf_loglogistic = pdf(paramLogLogistic, x); % Renamed from "pdf" to "pdf_loglogistic"
xticks_values = [0 1 2 3 4 5 6 7]; % Adjust as needed
yticks_values = [0 0.2 0.4 0.6 0.8 1]; % Adjust as needed
% Create a KS density plot
figure;
subplot(2,1,1)
ksdensity(data_m);
hold on;
plot(x, pdf_ln, 'r', 'LineWidth', 2);
plot(x, pdf_loglogistic, 'g', 'LineWidth', 2);
title('M');
xlabel('Cycletime (mins)');
ylabel('Density');
legend('Empirical Data', 'Log-Normal Distribution', 'Log-Logistic Distribution');
xlim([0 7]);
xticks(xticks_values);
yticks(yticks_values);
hold off;
index_r=sampledata.Type=="R";
data_r = sampledata.CycleTime(index_r,:);
% Fit a log-normal distribution to the data
mu_ln = mean(log(data_r));
sigma_ln = std(log(data_r));
% Fit a log-logistic distribution to the data
paramLogLogistic = fitdist(data_r, 'loglogistic');
alpha = paramLogLogistic.ParameterValues(1);
beta = paramLogLogistic.ParameterValues(2);
% Generate a range of values for the x-axis
x = linspace(0.3, 7, 1000);
% Compute the log-normal PDF
pdf_ln = lognpdf(x, mu_ln, sigma_ln);
% Compute the log-logistic PDF
pdf_loglogistic = pdf(paramLogLogistic, x); % Renamed from "pdf" to "pdf_loglogistic"
subplot(2,1,2)
ksdensity(data_r);
hold on;
plot(x, pdf_ln, 'r', 'LineWidth', 2);
plot(x, pdf_loglogistic, 'g', 'LineWidth', 2);
title('R');
xlabel('Cycletime (mins)');
ylabel('Density');
legend('Empirical Data', 'Log-Normal Distribution', 'Log-Logistic Distribution');
xlim([0 7]);
xticks(xticks_values);
yticks(yticks_values);
hold off;
  2 Comments
dpb
dpb on 8 Sep 2023
Please format your code with Code button (LH icon in CODE section) and attach necessary datafile(s) to be able to see/run your code.
Then, what, specifically, is the problem you're having?
Muhammad Tariq
Muhammad Tariq on 11 Oct 2023
I got the answer from Balaji. I do not have problem with the code. Just wanted to see the difference between the area of two distributions.
Thankyou!

Sign in to comment.

Accepted Answer

Balaji
Balaji on 22 Sep 2023
Hi Muhammad
I understand that you are trying to find the are between two functions that you have defined. I suggest you use the ‘trapz’ function for the same.
You can refer to the following code:
% Find the absolute difference between the two curves
diff_curve = abs(pdf_ln - pdf_loglogistic);
% Calculate the area between the curves using the trapz function
area = trapz(x, diff_curve);
For more information on ‘trapz’ function isuggest you refer to the following link:
https://www.mathworks.com/help/matlab/ref/trapz.html
Hope this helps
Thanks
Balaji

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!