From a pdf to a histogram.

Hi. I have the two parameters of a lognormal distribution, so I can plot the pdf. To convert the density into a histogram, I should calculate the integral under the curve over bins of a given width, right? I know mu and sigma of the pdf and nothing else. How can I extract the values of the density associated with bins of width 100? My x ranges from 0 to approximately 250000, and I cannot compute 2500 integrals by hand to build the histogram.
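One way to avoid integrating by hand: the probability mass in a bin [a, b) is F(b) - F(a), where F is the lognormal CDF, so differencing logncdf at the bin edges gives every bin in one call. A minimal sketch, assuming the Statistics and Machine Learning Toolbox is available; the mu and sigma values are placeholders, not taken from the question.

```matlab
% Placeholder parameters (assumed for illustration only)
mu    = 9.5;
sigma = 0.8;

% Bin edges of width 100 covering 0 to 250000
edges = 0:100:250000;

% Probability mass per bin: F(right edge) - F(left edge)
binProb = diff(logncdf(edges, mu, sigma));

% Average density over each bin (mass divided by bin width)
binDensity = binProb / 100;

% Mass beyond the last edge, not captured by any bin
tailProb = 1 - logncdf(edges(end), mu, sigma);

% Histogram-like plot, one bar per bin, positioned at the bin centres
bar(edges(1:end-1) + 50, binProb, 1);
```

binProb has one entry per bin (2500 here), and sum(binProb) + tailProb should equal 1 up to rounding, which is a quick sanity check on the chosen range.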

4 Comments

The histogram is an approximation of the pdf. Why settle for that?
Alfonso Russo on 8 Aug 2017
Edited: Alfonso Russo on 8 Aug 2017
Because by "discretising" the pdf I can sum bins coming from different lognormals. My overall goal is to combine many lognormals that describe the income distribution in several countries. For each country I know only the two distributional parameters mu and sigma, so I can plot the pdf; I have no other data. One strategy would be to estimate a mixture of the distributions as the "population-weighted sum of the subgroup densities". The problem is that I have no idea how to sum pdfs. My supervisor suggested "discretising" the distributions so that I can sum the densities associated with, say, the bin 0-99.9$ over all the countries in Europe and obtain the total European density of people with annual income between 0 and 99.9$. Repeating this procedure for every bin, I will obtain a distribution for Europe as a whole. Does that make sense?
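The bin-by-bin procedure described here could be sketched as follows. All parameter values and population sizes are hypothetical, chosen only to make the example run, and logncdf is assumed to be available from the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical parameters for three countries (illustrative values only)
mu    = [9.2, 9.8, 10.1];
sigma = [0.7, 0.9, 0.6];
pop   = [60e6, 80e6, 10e6];      % hypothetical population sizes

w     = pop / sum(pop);          % population weights, sum to 1
edges = 0:100:250000;            % common bin edges for every country

% One row of bin probabilities per country
binProb = zeros(numel(mu), numel(edges) - 1);
for i = 1:numel(mu)
    binProb(i, :) = diff(logncdf(edges, mu(i), sigma(i)));
end

% Population-weighted sum across countries, bin by bin
aggProb = w * binProb;           % 1-by-2500 row vector

% Expected number of people per bin in the pooled population
aggCount = aggProb * sum(pop);
```

Using the same edges for every country is what makes the bins addable; the weights must sum to 1 for aggProb to remain a probability distribution.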
"The sum of two pdf's is their convolution."
No; if two independent variables, x and y, have pdf's p(x) and q(y), then the convolution of p and q is the pdf of the sum (x + y). The (weighted and normalized) sum of two pdf's is a mixture.
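A small numerical illustration of the difference, using the known lognormal mean exp(mu + sigma^2/2). The parameters and weights are assumed values for illustration, and lognpdf requires the Statistics and Machine Learning Toolbox.

```matlab
% Illustrative parameters (assumed, not from the thread)
m1 = 0;   s1 = 0.5;
m2 = 1;   s2 = 0.3;
w1 = 0.6; w2 = 0.4;              % mixture weights, sum to 1

% Means of the two lognormals: exp(mu + sigma^2/2)
mean1 = exp(m1 + s1^2/2);
mean2 = exp(m2 + s2^2/2);

% Mixture: the mean is the weighted average of the component means
mixMean = w1*mean1 + w2*mean2;

% Convolution (pdf of X + Y for independent X, Y): the means add
sumMean = mean1 + mean2;

% The mixture density is a pointwise weighted sum and integrates to 1
x       = linspace(1e-6, 60, 200001);
mixPdf  = w1*lognpdf(x, m1, s1) + w2*lognpdf(x, m2, s2);
mixArea = trapz(x, mixPdf);      % numerically close to 1
```

The two operations describe different random variables, so mixMean and sumMean differ; for pooling populations it is the mixture that applies.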
José-Luis on 9 Aug 2017
Edited: José-Luis on 9 Aug 2017
"The sum of two pdf's is their convolution".
Erroneous comment was mine.
Pawel is right. I don't know how one goes about mixing two pdf's but there is some literature on it.
That reference is pretty old so there might be newer (hopefully better) methods. Some fitting is required.

Answers (3)

Hi,
You can use the function lognrnd(mu,sigma) to draw samples from the lognormal distribution without computing the integral. Then you can use hist(x,nbins) to create the histogram. Here is an example:
hist(lognrnd(0,1,[3000,1]),100) % lognormal histogram with mu=0, sigma=1, 3000 samples and 100 bins
David

2 Comments

That only plots the histogram. I need the values associated with each bin, since I have several distributions that I need to add together. Is there a way to extract the values associated with each bin, so that I can add up the 0-100 bins coming from different lognormals?
Moreover, if my lognormal described a random variable X across, say, 322000000 observations, and I wanted to specify only the bin width and not the number of bins (so that MATLAB shows how many bins are created), how should I modify the code?
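For the sampling approach, histcounts returns the bin values instead of only plotting them, and its 'BinWidth' option lets MATLAB decide the number of bins. A hedged sketch: the parameters and the sample size are placeholders, and the counts are rescaled to the population size rather than drawing 322000000 samples.

```matlab
rng(0);                           % fixed seed so the sketch is reproducible
mu = 9.5; sigma = 0.8;            % placeholder parameters (assumed)
nSamples = 1e5;                   % far fewer than 322e6, to keep memory small

X = lognrnd(mu, sigma, [nSamples, 1]);

% Specify only the bin width; histcounts chooses how many bins are needed
[counts, edges] = histcounts(X, 'BinWidth', 100);

nBins = numel(counts);            % number of bins MATLAB created

% Scale the sample counts up to a population of 322e6
popCounts = counts / nSamples * 322e6;
```

Because the edges depend on the sampled range, counts from different distributions only line up if the same edges vector is passed explicitly to each histcounts call.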

Sean de Wolski on 7 Aug 2017
Look at the 'Normalization' property of histcounts.

2 Comments

It does not work. I need exactly the densities associated with approximately 2500 bins of width 100.
Alfonso Russo on 8 Aug 2017
Edited: Alfonso Russo on 8 Aug 2017
There is something that might work, since histcounts accepts the 'Normalization' property with the value 'countdensity'. How can I define my variable X as coming from a lognormal with given mu and sigma and divide it into bins of width 100, so that I can use histcounts(X, 'Normalization', 'countdensity')?
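A sketch of how the normalization options relate, assuming X is obtained by sampling with lognrnd (placeholder parameters). With 'countdensity' each value is the count divided by the bin width; with 'pdf' it is additionally divided by the number of samples, which is the scale comparable to lognpdf.

```matlab
rng(0);                           % fixed seed so the sketch is reproducible
mu = 9.5; sigma = 0.8;            % placeholder parameters (assumed)
X = lognrnd(mu, sigma, [1e5, 1]);

% Raw counts with bins of width 100; edges are returned for reuse
[counts, edges] = histcounts(X, 'BinWidth', 100);

% Same bins, different normalizations
cdens  = histcounts(X, edges, 'Normalization', 'countdensity');  % counts / width
pdfEst = histcounts(X, edges, 'Normalization', 'pdf');           % counts / (n * width)

% Exact density at the bin centres, for comparison with pdfEst
centres = edges(1:end-1) + diff(edges)/2;
pdfTrue = lognpdf(centres, mu, sigma);
```

pdfEst fluctuates around pdfTrue because it comes from a finite sample; differencing the CDF at the edges gives the exact bin values without sampling noise.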

I don't understand the problem you have.
The probability P that a person in Europe has income x is
P(income=x)=sum_{i=1}^{N} P(income=x|person comes from country i)*P(person comes from country i)
where N is the number of countries you want to take into consideration.
Thus the aggregated probability density function is
d_aggregated(x)=sum_{i=1}^{N} w_i * d_i(x)
where w_i is the number of people in country i divided by the total number of people in the countries under consideration, and d_i(x) is the pdf of the incomes in country i.
Best wishes
Torsten.
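The formula for d_aggregated(x) above could be evaluated for N countries along these lines. The parameter vectors and population sizes are hypothetical, and lognpdf is assumed to be available from the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical inputs for N = 3 countries (illustrative values only)
mu    = [9.2, 9.8, 10.1];
sigma = [0.7, 0.9, 0.6];
pop   = [60e6, 80e6, 10e6];

w = pop / sum(pop);               % w_i = population share of country i

x    = linspace(0, 250000, 5001); % income grid
dAgg = zeros(size(x));
for i = 1:numel(mu)
    dAgg = dAgg + w(i) * lognpdf(x, mu(i), sigma(i));   % adds w_i * d_i(x)
end

plot(x, dAgg);
xlabel('income');
ylabel('aggregated density');

% Area under the curve on the grid; slightly below 1 because of the cut-off tail
areaOnGrid = trapz(x, dAgg);
```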

2 Comments

Imagine that I have two lognormals L1 and L2 with parameters (m1,s1) and (m2,s2) and weights w1=0.6 and w2=0.4. To obtain the aggregated PDF I should type in MATLAB:
d_aggregated = @(x) 0.6 ./ (s1*x*sqrt(2*pi)) .* exp(-(log(x) - m1).^2 / (2*s1^2)) + 0.4 ./ (s2*x*sqrt(2*pi)) .* exp(-(log(x) - m2).^2 / (2*s2^2));
% obtaining the aggregated PDF.
Is this correct? Once it is calculated, is there a way to plot it, such as plot(d_aggregated(x))?
m1 = ...;
m2 = ...;
s1 = ...;
s2 = ...;
x = 0:0.02:10;
y = 0.6*lognpdf(x,m1,s1)+0.4*lognpdf(x,m2,s2);
plot(x,y)
Best wishes
Torsten.
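A hand-written lognormal density can be checked against lognpdf as in the sketch below. The values of m1, s1, m2 and s2 are placeholders, since the thread leaves them unspecified; manualPdf is a hypothetical helper name. Note the element-wise operators and that the division by 2*s^2 belongs inside exp().

```matlab
m1 = 0;   s1 = 0.5;               % placeholder values (not given in the thread)
m2 = 1;   s2 = 0.3;
x  = 0.02:0.02:10;                % start above 0 to avoid Inf * 0 in the manual formula

% Hand-written lognormal density (hypothetical helper)
manualPdf = @(x, m, s) 1 ./ (s * x * sqrt(2*pi)) .* exp(-(log(x) - m).^2 / (2*s^2));

% Weighted sum of the two densities, computed both ways
yManual  = 0.6*manualPdf(x, m1, s1) + 0.4*manualPdf(x, m2, s2);
yBuiltin = 0.6*lognpdf(x, m1, s1)   + 0.4*lognpdf(x, m2, s2);

% Largest pointwise discrepancy; expected to be at rounding level
maxDiff = max(abs(yManual - yBuiltin));
```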

Asked: 7 Aug 2017
Commented: 9 Aug 2017