# Fit Custom Distributions

This example shows how to fit a custom distribution to univariate data by using the `mle` function.

You can use the `mle` function to compute maximum likelihood parameter estimates and to estimate their precision for built-in distributions and custom distributions. To fit a custom distribution, you need to define a function for the custom distribution in a file or by using an anonymous function. In the simplest cases, you can write code to compute the probability density function (pdf) or logarithm of pdf for the distribution that you want to fit, and then call `mle` to fit the distribution. This example covers the following cases using the pdf or logarithm of pdf:

• Fitting a distribution for truncated data

• Fitting a mixture of two distributions

• Fitting a weighted distribution

• Finding accurate confidence intervals of parameter estimates for small-sized samples using parameter transformation

Note that you can use the TruncationBounds name-value argument of `mle` for truncated data instead of defining a custom function. Also, for a mixture of two normal distributions, you can use the `fitgmdist` function. This example uses the `mle` function and a custom function for these cases.

### Fit Zero-Truncated Poisson Distribution

Count data is often modeled using a Poisson distribution, and you can use the `poissfit` or `fitdist` function to fit a Poisson distribution. However, in some situations, counts that are zero are not recorded in the data, so fitting a Poisson distribution is not straightforward because of the missing zeros. In this case, fit a Poisson distribution to zero-truncated data by using the `mle` function and a custom distribution function.

First, generate some random Poisson data.

```rng(18,'twister') % For reproducibility lambda = 1.75; n = 75; x1 = poissrnd(lambda,n,1);```

Next, remove all the zeros from the data to simulate the truncation.

`x1 = x1(x1 > 0);`

Check the number of samples in `x1` after truncation.

`length(x1)`
```ans = 65 ```

Plot a histogram of the simulated data.

`histogram(x1,0:1:max(x1)+1)`

The data looks like a Poisson distribution except it contains no zeros. You can use a custom distribution that is identical to a Poisson distribution on the positive integers, but has no probability at zero. By using a custom distribution, you can estimate the Poisson parameter `lambda` while accounting for the missing zeros.

You need to define the zero-truncated Poisson distribution by its probability mass function (pmf). Create an anonymous function to compute the probability for each point in `x1`, given a value for the Poisson distribution's mean parameter `lambda`. The pmf for a zero-truncated Poisson distribution is the Poisson pmf normalized so that it sums to one. With zero truncation, the normalization is `1–Probability(x1<0)`.

`pf_truncpoiss = @(x1,lambda) poisspdf(x1,lambda)./(1-poisscdf(0,lambda));`

For simplicity, assume that all the `x1` values given to this function are positive integers, with no checks. For error checking or a more complicated distribution that takes more than a single line of code, you must define the function in a separate file.

Find a reasonable rough first guess for the parameter `lambda`. In this case, use the sample mean.

`start = mean(x1)`
```start = 2.2154 ```

Provide `mle` with the data, custom pmf function, initial parameter value, and lower bound of the parameter. Because the mean parameter of the Poisson distribution must be positive, you also need to specify a lower bound for `lambda`. The `mle` function returns the maximum likelihood estimate of `lambda`, and optionally, the approximate 95% confidence intervals for the parameters.

```[lambdaHat,lambdaCI] = mle(x1,'pdf',pf_truncpoiss,'Start',start, ... 'LowerBound',0)```
```lambdaHat = 1.8760 ```
```lambdaCI = 2×1 1.4990 2.2530 ```

The parameter estimate is smaller than the sample mean. The maximum likelihood estimate accounts for the zeros not present in the data.

Alternatively, you can specify the truncation bounds by using the TruncationBounds name-value argument.

```[lambdaHat2,lambdaCI2] = mle(x1,'Distribution','Poisson', ... 'TruncationBounds',[0 Inf])```
```lambdaHat2 = 1.8760 ```
```lambdaCI2 = 2×1 1.4990 2.2530 ```

You can also compute a standard error estimate for `lambda` by using the large-sample variance approximation returned by `mlecov`.

```avar = mlecov(lambdaHat,x1,'pdf',pf_truncpoiss); stderr = sqrt(avar)```
```stderr = 0.1923 ```

To visually check the fit, plot the fitted pmf against a normalized histogram of the raw data.

```histogram(x1,'Normalization','pdf') xgrid = min(x1):max(x1); pmfgrid = pf_truncpoiss(xgrid,lambdaHat); hold on plot(xgrid,pmfgrid,'-') xlabel('x1') ylabel('Probability') legend('Sample Data','Fitted pmf','Location','best') hold off```

### Fit Upper-Truncated Normal Distribution

Continuous data can sometimes be truncated. For example, observations larger than some fixed value might not be recorded because of limitations in data collection.

In this case, simulate data from a truncated normal distribution. First, generate some random normal data.

```n = 500; mu = 1; sigma = 3; rng('default') % For reproducibility x2 = normrnd(mu,sigma,n,1);```

Next, remove any observations that fall beyond the truncation point `xTrunc`. Assume that `xTrunc` is a known value that you do not need to estimate.

```xTrunc = 4; x2 = x2(x2 < xTrunc);```

Check the number of samples in `x2` after truncation.

`length(x2)`
```ans = 430 ```

Create a histogram of the simulated data.

`histogram(x2)`

Fit the simulated data with a custom distribution that is identical to a normal distribution for `x2 < xTrunc`, but has zero probability above `xTrunc`. By using a custom distribution, you can estimate the normal parameters `mu` and `sigma` while accounting for the missing tail.

Define the truncated normal distribution by its pdf. Create an anonymous function to compute the probability density value for each point in x, given values for the parameters `mu` and `sigma`. With the truncation point fixed and known, the pdf for a truncated normal distribution is the pdf truncated and then normalized so that it integrates to one. The normalization is the cdf evaluated at `xTrunc`. For simplicity, assume that all `x2` values are less than `xTrunc`, without checking.

```pdf_truncnorm = @(x2,mu,sigma) ... normpdf(x2,mu,sigma)./normcdf(xTrunc,mu,sigma);```

Because you do not need to estimate the truncation point `xTrunc`, it is not included with the input distribution parameters of the custom pdf function. `xTrunc` is also not part of the data vector input argument. An anonymous function can access variables in the workspace, so you do not have to pass `xTrunc` to the anonymous function as an additional argument.

Provide a rough starting guess for the parameter estimates. In this case, because the truncation is not extreme, use the sample mean and standard deviation.

`start = [mean(x2),std(x2)]`
```start = 1×2 0.1585 2.4125 ```

Provide `mle` with the data, custom pdf function, initial parameter values, and lower bounds of the parameters. Because `sigma` must be positive, you also need to specify lower parameter bounds. `mle` returns the maximum likelihood estimates of `mu` and `sigma` as a single vector, as well as a matrix of approximate 95% confidence intervals for the two parameters.

```[paramEsts,paramCIs] = mle(x2,'pdf',pdf_truncnorm,'Start',start, ... 'LowerBound',[-Inf 0])```
```paramEsts = 1×2 1.1298 3.0884 ```
```paramCIs = 2×2 0.5713 2.7160 1.6882 3.4607 ```

The estimates of `mu` and `sigma` are larger than the sample mean and standard deviation. The model fit accounts for the missing upper tail of the distribution.

Alternatively, you can specify the truncation bounds by using the TruncationBounds name-value argument.

```[paramEsts2,paramCIs2] = mle(x2,'Distribution','Normal', ... 'TruncationBounds',[-Inf xTrunc])```
```paramEsts2 = 1×2 1.1297 3.0884 ```
```paramCIs2 = 2×2 0.5713 2.7160 1.6882 3.4607 ```

You can compute an approximate covariance matrix for the parameter estimates using `mlecov`. The approximation typically works well for large samples, and you can approximate the standard errors by the square roots of the diagonal elements.

`acov = mlecov(paramEsts,x2,'pdf',pdf_truncnorm)`
```acov = 2×2 0.0812 0.0402 0.0402 0.0361 ```
`stderr = sqrt(diag(acov))`
```stderr = 2×1 0.2849 0.1900 ```

To visually check the fit, plot the fitted pdf against a normalized histogram of the raw data.

```histogram(x2,'Normalization','pdf') xgrid = linspace(min(x2),max(x2)); pdfgrid = pdf_truncnorm(xgrid,paramEsts(1),paramEsts(2)); hold on plot(xgrid,pdfgrid,'-') xlabel('x2') ylabel('Probability Density') legend('Sample Data','Fitted pdf','Location','best') hold off```

### Fit Mixture of Two Normal Distributions

Some data sets exhibit bimodality, or even multimodality, and fitting a standard distribution to such data is usually not appropriate. However, a mixture of simple unimodal distributions can often model such data very well.

In this case, fit a mixture of two normal distributions to simulated data. Consider simulated data with the following constructive definition:

• First, flip a biased coin.

• If the coin lands on heads, pick a value at random from a normal distribution with mean ${\mu }_{1}$ and standard deviation ${\sigma }_{1}$.

• If the coin lands on tails, pick a value at random from a normal distribution with mean ${\mu }_{2}$ and standard deviation ${\sigma }_{2}$.

Generate a data set from a mixture of Student's t distributions instead of using the same model that you are fitting. By using different distributions, similar to a technique used in a Monte-Carlo simulation, you can test how robust a fitting method is to departures from the assumptions of the model being fit.

```rng(10) % For reproducibility x3 = [trnd(20,1,50) trnd(4,1,100)+3]; histogram(x3)```

Define the model to fit by creating an anonymous function that computes the probability density. The pdf for a mixture of two normal distributions is a weighted sum of the pdfs of the two normal components, weighted by the mixture probability. The anonymous function takes six inputs: a vector of data at which to evaluate the pdf and five distribution parameters. Each component has parameters for its mean and standard deviation.

```pdf_normmixture = @(x3,p,mu1,mu2,sigma1,sigma2) ... p*normpdf(x3,mu1,sigma1) + (1-p)*normpdf(x3,mu2,sigma2);```

You also need an initial guess for the parameters. Defining a starting point becomes more important as the number of model parameters increases. Here, start with an equal mixture (`p` = 0.5) of normal distributions, centered at the two quartiles of the data, with equal standard deviations. The starting value for the standard deviation comes from the formula for the variance of a mixture in terms of the mean and variance of each component.

```pStart = .5; muStart = quantile(x3,[.25 .75])```
```muStart = 1×2 0.3351 3.3046 ```
`sigmaStart = sqrt(var(x3) - .25*diff(muStart).^2)`
```sigmaStart = 1.1602 ```
`start = [pStart muStart sigmaStart sigmaStart];`

Specify bounds of zero and one for the mixing probability, and lower bounds of zero for the standard deviations. Set the remaining elements of the bounds vectors to `+Inf` or `–Inf`, to indicate no restrictions.

```lb = [0 -Inf -Inf 0 0]; ub = [1 Inf Inf Inf Inf]; paramEsts = mle(x3,'pdf',pdf_normmixture,'Start',start, ... 'LowerBound',lb,'UpperBound',ub)```
```Warning: Maximum likelihood estimation did not converge. Iteration limit exceeded. ```
```paramEsts = 1×5 0.3273 -0.2263 2.9914 0.9067 1.2059 ```

The warning message indicates that the function does not converge with the default iteration settings. Display the default options.

`statset('mlecustom')`
```ans = struct with fields: Display: 'off' MaxFunEvals: 400 MaxIter: 200 TolBnd: 1.0000e-06 TolFun: 1.0000e-06 TolTypeFun: [] TolX: 1.0000e-06 TolTypeX: [] GradObj: 'off' Jacobian: [] DerivStep: 6.0555e-06 FunValCheck: 'on' Robust: [] RobustWgtFun: [] WgtFun: [] Tune: [] UseParallel: [] UseSubstreams: [] Streams: {} OutputFcn: [] ```

The default maximum number of iterations for custom distributions is 200. Override the default to increase the number of iterations, using an options structure created with the `statset` function. Also, increase the maximum function evaluations.

```options = statset('MaxIter',300,'MaxFunEvals',600); paramEsts = mle(x3,'pdf',pdf_normmixture,'Start',start, ... 'LowerBound',lb,'UpperBound',ub,'Options',options)```
```paramEsts = 1×5 0.3273 -0.2263 2.9914 0.9067 1.2059 ```

The final iterations to convergence are significant only in the last few digits of the result. However, a best practice is to always make sure that convergence is reached.

To visually check the fit, plot the fitted density against a probability histogram of the raw data.

```histogram(x3,'Normalization','pdf') hold on xgrid = linspace(1.1*min(x3),1.1*max(x3),200); pdfgrid = pdf_normmixture(xgrid, ... paramEsts(1),paramEsts(2),paramEsts(3),paramEsts(4),paramEsts(5)); plot(xgrid,pdfgrid,'-') hold off xlabel('x3') ylabel('Probability Density') legend('Sample Data','Fitted pdf','Location','best')```

Alternatively, for a mixture of normal distributions, you can use the `fitgmdist` function. The estimates can be different due to initial estimates and settings of the iterative algorithm.

`Mdl = fitgmdist(x3',2)`
```Mdl = Gaussian mixture distribution with 2 components in 1 dimensions Component 1: Mixing proportion: 0.329180 Mean: -0.2200 Component 2: Mixing proportion: 0.670820 Mean: 2.9975 ```
`Mdl.Sigma`
```ans = ans(:,:,1) = 0.8274 ans(:,:,2) = 1.4437 ```

### Fit Weighted Normal Distribution to Data with Unequal Precisions

Assume that you have 10 data points, where each point is actually the average of anywhere from one to eight observations. The original observations are not available, but the number of observations for each data point is known. The precision of each point depends on its corresponding number of observations. You need to estimate the mean and standard deviation of the raw data.

```x4 = [0.25 -1.24 1.38 1.39 -1.43 2.79 3.52 0.92 1.44 1.26]'; m = [8 2 1 3 8 4 2 5 2 4]';```

The variance of each data point is inversely proportional to its corresponding number of observations, so use `1/m` to weight the variance of each data point in a maximum likelihood fit.

`w = 1./m;`

In this model, you can define the distribution by its pdf. However, using a logarithm of pdf is more suitable, because the normal pdf has the form

``` c .* exp(-0.5 .* z.^2), ```

and `mle` takes the log of the pdf to compute the loglikelihood. So, instead, create a function that computes the logarithm of pdf directly.

The logarithm of pdf function must compute the logarithm of the probability density for each point in `x`, given normal distribution parameters `mu` and `sigma`. It also needs to account for the different variance weights. Define a function named `helper_logpdf_wn1` in a separate file `helper_logpdf_wn1.m`.

```function logy = helper_logpdf_wn1(x,m,mu,sigma) %HELPER_LOGPDF_WN1 Logarithm of pdf for a weight normal distribution % This function supports only the example Fit Custom Distributions % (customdist1demo.mlx) and might change in a future release. v = sigma.^2 ./ m; logy = -(x-mu).^2 ./ (2.*v) - .5.*log(2.*pi.*v); end ```

Provide a rough first guess for the parameter estimates. In this case, use the unweighted sample mean and standard deviation.

`start = [mean(x4),std(x4)]`
```start = 1×2 1.0280 1.5490 ```

Because `sigma` must be positive, you need to specify lower parameter bounds.

```[paramEsts1,paramCIs1] = mle(x4,'logpdf', ... @(x,mu,sigma)helper_logpdf_wn1(x,m,mu,sigma), ... 'Start',start,'LowerBound',[-Inf,0])```
```paramEsts1 = 1×2 0.6244 2.8823 ```
```paramCIs1 = 2×2 -0.2802 1.6191 1.5290 4.1456 ```

The estimate of `mu` is less than two-thirds of the estimate of the sample mean. The estimate is influenced by the most reliable data points, that is, the points based on the largest number of raw observations. In this data set, those points tend to pull the estimate down from the unweighted sample mean.

### Fit Normal Distribution Using Parameter Transformation

The `mle` function computes confidence intervals for the parameters using a large-sample normal approximation for the distribution of the estimators if an exact method is not available. For small sample sizes, you can improve the normal approximation by transforming one or more parameters. In this case, transform the scale parameter of a normal distribution to its logarithm.

First, define a new log pdf function named `helper_logpdf_wn2` that uses a transformed parameter for `sigma`.

```function logy = helper_logpdf_wn2(x,m,mu,logsigma) %HELPER_LOGPDF_WN2 Logarithm of pdf for a weight normal distribution with % log(sigma) parameterization % This function supports only the example Fit Custom Distributions % (customdist1demo.mlx) and might change in a future release. v = exp(logsigma).^2 ./ m; logy = -(x-mu).^2 ./ (2.*v) - .5.*log(2.*pi.*v); end ```

Use the same starting point transformed to the new parameterization for `sigma`, that is, the log of the sample standard deviation.

`start = [mean(x4),log(std(x4))]`
```start = 1×2 1.0280 0.4376 ```

Because `sigma` can be any positive value, `log(sigma)` is unbounded, and you do not need to specify lower or upper bounds.

```[paramEsts2,paramCIs2] = mle(x4,'logpdf', ... @(x,mu,sigma)helper_logpdf_wn2(x,m,mu,sigma), ... 'Start',start)```
```paramEsts2 = 1×2 0.6244 1.0586 ```
```paramCIs2 = 2×2 -0.2802 0.6203 1.5290 1.4969 ```

Because the parameterization uses `log(sigma)`, you have to transform back to the original scale to get an estimate and confidence interval for `sigma`.

`sigmaHat = exp(paramEsts2(2))`
```sigmaHat = 2.8823 ```
`sigmaCI = exp(paramCIs2(:,2))`
```sigmaCI = 2×1 1.8596 4.4677 ```

The estimates for both `mu` and `sigma` are the same as in the first fit, because maximum likelihood estimates are invariant to parameterization. The confidence interval for `sigma` is slightly different from `paramCIs1(:,2)`.