
MATLAB provides built-in functions to generate random numbers with a uniform or Gaussian (normal) distribution. My question is: if I have a discrete distribution or histogram, how can I generate random numbers that follow that distribution (provided the number of samples I generate is large enough)?

Please post here if anyone knows of a good method of doing this.

Thanks, David

Jonathan Epperl
on 27 Oct 2012

Since nobody has any suggestions, here's one. Say your discrete distribution is an N-by-2 matrix PD, with the discrete values in the first column and the probabilities of the corresponding values in the second -- so sum(PD(:,2))==1.

Then map the probabilities to the unit interval and use rand. Here is what I mean by that:

% These are your values and the corresponding probabilities:
PD = [
    1.0000 0.1000
    2.0000 0.3000
    3.0000 0.4000
    4.0000 0.2000];
% Then turn the probabilities into a cumulative distribution
D = cumsum(PD(:,2));
% D = [0.1000 0.4000 0.8000 1.0000]'

Now for every r generated by rand, if it lies between D(i) and D(i+1), then it corresponds to the outcome PD(i+1,1), with the obvious extension at i==0 (an r below D(1) gives PD(1,1)). Here's one way you could do that, even though I'm sure there are better ones:

R = rand(100,1); % Your trials
p = @(r) find(r<D,1,'first'); % find the first index i such that r<D(i)
% These are the results of the random trials
% (row indices into PD, which here coincide with the values 1..4)
rR = arrayfun(p,R);
% Check whether the distribution looks right:
hist(rR,1:4)
% It does: roughly 10% are 1, 30% are 2, and so on
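The same mapping can also be done without arrayfun. The following is only a sketch, assuming a release that provides discretize (R2015a or later); the names N, edges, idx, samples and freq are illustrative and not part of the answer above.

```matlab
% Vectorized inverse-CDF sampling from a discrete distribution (sketch).
% Assumption: discretize is available (R2015a or later).
PD = [1 0.1; 2 0.3; 3 0.4; 4 0.2];   % [value, probability], as in the example above
N  = 1e5;                            % number of samples (arbitrary choice)

edges = [0; cumsum(PD(:,2))];        % bin edges on the unit interval
edges(end) = 1;                      % guard against round-off in the last edge

R       = rand(N, 1);                % uniform draws in (0,1)
idx     = discretize(R, edges);      % bin index of each draw
samples = PD(idx, 1);                % map indices back to the discrete values

% Empirical frequencies, in the row order of PD
freq = accumarray(idx, 1, [size(PD,1) 1]) / N;
disp([PD(:,1), PD(:,2), freq]);      % value, target probability, observed frequency
```

Mapping through PD(idx,1) matters as soon as the values are not simply 1..N. If the Statistics and Machine Learning Toolbox is installed, randsample with weights is another option.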

If you want more help you should post a minimal example of the form in which you have the discrete distribution.
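If the distribution is available as a histogram, that is, as bin values with raw counts rather than probabilities, one possible form looks like the sketch below. The variables centers and counts are hypothetical placeholder data, and the comparison-based lookup is just one way to write "first index with r<D(i)" that also tolerates empty bins.

```matlab
% Sampling from a histogram given as bin centers and raw counts (sketch).
centers = [10 20 30 40 50];          % hypothetical bin centers
counts  = [ 5  0 20 15 10];          % hypothetical counts; empty bins are allowed
N       = 1e5;                       % number of samples (arbitrary choice)

D = cumsum(counts(:));               % cumulative counts
D = D / D(end);                      % normalize so that D(end) is exactly 1

R = rand(N, 1);                      % uniform draws in (0,1)
% The first index i with r < D(i) equals 1 + the number of entries with r >= D(i).
% bsxfun keeps this runnable on releases without implicit expansion.
idx     = sum(bsxfun(@ge, R, D.'), 2) + 1;
samples = centers(idx);              % bins with zero count are never selected
```

This builds an N-by-K logical matrix, so for very large N or many bins a loop over chunks of R may be preferable.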

Aasheesh Dixit
on 8 Jun 2020

One change is required: the comparison has to use the cumulative vector D.

p = @(r) find(r<D,1,'first'); % find the first index i such that r<D(i)

Image Analyst
on 28 Oct 2012

Theron FARRELL
on 30 Apr 2019 (edited on 30 Apr 2019)

Hi there,

I use this naive function to generate artificial outliers for machine learning. I hope it will be of some help in your case.

function [Out_Data, Out_PDF, CHist] = Complement_PDF(Hist, Data_Num, p)
% Generate a 1D vector of data whose PDF is the complementary PDF of the
% input histogram. Note that the larger Data_Num is, the more closely
% Out_PDF will resemble CHist.
% Input
%   Hist:     PDF/histogram of data
%   Data_Num: Desired number of data points to be generated
%   p:        Precision, given as the number of digits after the decimal point
% Output
%   Out_Data: Generated data as per the complementary PDF
%   Out_PDF:  The complementary PDF as per Out_Data (as bin counts)
%   CHist:    The complementary PDF as per Hist
% Example
%   Hist = [1, 6, 7, 100, 0, 0, 0, 2, 3, 5];
%   Data_Num = 100000;
%   p = 3;
Hist = Hist/sum(Hist);
CHist = 1 - Hist;
CHist = CHist/sum(CHist);
CDF_CHist = cumsum(CHist);
% Round the CDF to p decimal places
CDF_CHist = double(int32(CDF_CHist*10^p))/10^p;
Out_Data = zeros(1, Data_Num);
Out_PDF = zeros(1, length(CDF_CHist));
for i = 1:Data_Num
    % Generate a uniformly distributed variable, rounded to p decimal places
    x = double(int32(rand*10^p))/10^p;
    % Inversely index CDF
    Out_Data(i) = Inverse_CDF(x, CDF_CHist);
    % Recover the bin index; round rather than floor, because
    % (ind/n)*n can land just below ind in floating point
    temp = round(Out_Data(i) * length(CDF_CHist));
    Out_PDF(temp) = Out_PDF(temp) + 1;
end
figure;
subplot 221, bar(Hist);
subplot 222, bar(CHist);
subplot 223, plot(CDF_CHist);
subplot 224, bar(Out_PDF);
end

function [y] = Inverse_CDF(x, CDF_CHist)
% Return the normalized index ind/n of the CDF interval that contains x
CDF_CHist_Ext = [0, CDF_CHist];
y = 1; % fallback: last bin, used when no interval matches (e.g. x == 1)
for ind = 1:length(CDF_CHist)
    if (x >= CDF_CHist_Ext(ind)) && (x < CDF_CHist_Ext(ind+1))
        y = ind/length(CDF_CHist);
        break;
    end
end
end
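For comparison, the complementary-histogram idea can be sketched without the loop and without the precision parameter p. This is not the function above, only a compact variant under the assumption that discretize is available (R2015a or later) and that no single bin holds all of the mass, so the bin edges are strictly increasing. It reuses the example histogram from the help text and returns normalized frequencies rather than counts.

```matlab
% Compact sketch: sample bin indices from the complement of a histogram.
% Assumption: discretize is available (R2015a or later).
Hist     = [1, 6, 7, 100, 0, 0, 0, 2, 3, 5];   % example histogram from the help text
Data_Num = 1e5;                                % number of samples (arbitrary choice)

Hist  = Hist / sum(Hist);                      % normalize to a probability vector
CHist = (1 - Hist) / sum(1 - Hist);            % complementary probabilities, renormalized

edges = [0, cumsum(CHist)];                    % bin edges on the unit interval
edges(end) = 1;                                % guard against round-off in the last edge
idx = discretize(rand(1, Data_Num), edges);    % sampled bin indices, 1..numel(CHist)

% Empirical frequencies of the sampled indices, as a row vector like CHist
Out_PDF = accumarray(idx(:), 1, [numel(CHist) 1]).' / Data_Num;
bar([CHist; Out_PDF].');                       % target vs. observed, side by side
```

Because the uniform draws are not rounded, the observed frequencies are not biased by the choice of p; they differ from CHist only by sampling noise.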
