## Geometric Distribution

### Overview

The geometric distribution models the number of failures before the first success in a series of independent trials, where each trial results in either success or failure, and the probability of success in any individual trial is constant. For example, if you toss a coin repeatedly, the geometric distribution models the number of tails observed before the first heads. The geometric distribution is discrete, existing only on the nonnegative integers.

### Parameters

The geometric distribution uses the following parameter.

| Parameter | Description |
| --- | --- |
| $0\le p\le 1$ | Probability of success |

### Probability Distribution Function

#### Definition

The probability distribution function (pdf) of the geometric distribution is

`$y=f\left(x|p\right)=p{\left(1-p\right)}^{x}\,;\quad x=0,1,2,\dots,$`

where p is the probability of success, and x is the number of failures before the first success. The result y is the probability of observing exactly x failures before the first success, when the probability of success in any given trial is p. For discrete distributions, the probability distribution function is also known as the probability mass function (pmf).
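As a quick check of the definition, the following sketch evaluates the formula directly and compares it with `geopdf`. The value of p and the variable names `yFormula`, `yToolbox`, and `maxDiff` are illustrative assumptions, not part of the toolbox.

```matlab
p = 0.25;                  % probability of success (example value)
x = 0:10;                  % number of failures before the first success
yFormula = p*(1-p).^x;     % pdf evaluated directly from the definition
yToolbox = geopdf(x,p);    % pdf computed by geopdf
maxDiff = max(abs(yFormula - yToolbox))   % expected to be at or near zero
```

At x equal to 0 the pdf reduces to p, the probability that the very first trial is a success.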

#### Plot

This plot shows how changing the value of the probability parameter p alters the shape of the pdf. Use `geopdf` to compute the pdf at x equal to 1 through 10, for three different values of p. Then plot all three pdfs on the same figure for a visual comparison.

```
x = 1:10;
y1 = geopdf(x,0.1);   % For p = 0.1
y2 = geopdf(x,0.25);  % For p = 0.25
y3 = geopdf(x,0.75);  % For p = 0.75
figure;
plot(x,y1,'kd')
hold on
plot(x,y2,'ro')
plot(x,y3,'b+')
legend({'p = 0.1','p = 0.25','p = 0.75'})
hold off
```

In this plot, the value of y is the probability of observing exactly x failures before the first success. When the probability of success p is large, y decreases rapidly as x increases, and the probability of observing a large number of failures before a success quickly becomes small. When p is small, y decreases slowly as x increases. The probability of observing a large number of failures before a success still decreases as x grows, but at a much slower rate.

#### Random Number Generation

A random number generated from a geometric distribution represents the number of failures observed before a success in a single experiment, given the probability of success p for each independent trial. Use `geornd` to generate random numbers from the geometric distribution. For example, the following generates a random number from a geometric distribution with probability of success p equal to 0.1.

```p = 0.1; r = geornd(p)```
```r = 1 ```

The returned random number represents the number of failures observed before a success in a series of independent trials.
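To see how individual draws relate to the distribution as a whole, the sketch below generates many random numbers at once and compares their average with the theoretical mean (1 - p)/p. The sample size of 10,000 and the variable names are arbitrary choices for illustration, and the sample mean varies from run to run unless the generator is reset.

```matlab
rng('default');               % reset the generator so the run is repeatable
p = 0.1;                      % probability of success
r = geornd(p,10000,1);        % 10,000 independent draws in a column vector
sampleMean = mean(r);         % average number of failures before a success
theoreticalMean = (1-p)/p;    % mean of the geometric distribution
```

With a sample this large, `sampleMean` should land close to `theoreticalMean`, although it will not match it exactly.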

#### Relationship to Other Distributions

The geometric distribution is a special case of the negative binomial distribution, with the specified number of successes parameter r equal to 1.
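This relationship can be checked numerically. The following sketch, with illustrative variable names and an arbitrary value of p, compares `geopdf` with `nbinpdf` evaluated with the number-of-successes parameter set to 1.

```matlab
p = 0.25;                   % probability of success (example value)
x = 0:10;                   % number of failures
yGeo = geopdf(x,p);         % geometric pdf
yNbin = nbinpdf(x,1,p);     % negative binomial pdf with r = 1
maxDiff = max(abs(yGeo - yNbin))   % expected to be at or near zero
```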

### Cumulative Distribution Function

#### Definition

The cumulative distribution function (cdf) of the geometric distribution is

`$y=F\left(x|p\right)=1-{\left(1-p\right)}^{x+1}\,;\quad x=0,1,2,\dots,$`

where p is the probability of success, and x is the number of failures before the first success. The result y is the probability of observing at most x failures before the first success, when the probability of success in any given trial is p.
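The closed form follows from summing the pdf values p(1 - p)^k for k from 0 to x, which is a finite geometric series. The sketch below, using illustrative variable names and an arbitrary value of p, compares that running sum with the closed form and with `geocdf`.

```matlab
p = 0.25;                          % probability of success (example value)
x = 0:10;                          % number of failures
yCumsum = cumsum(geopdf(x,p));     % running sum of pdf values from 0 up to x
yFormula = 1 - (1-p).^(x+1);       % closed-form cdf from the definition
yToolbox = geocdf(x,p);            % cdf computed by geocdf
maxDiff = max(abs([yCumsum - yToolbox, yFormula - yToolbox]))
```

The running sum starts at x equal to 0, so it matches the cdf only when the vector `x` begins at 0.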

#### Plot

This plot shows how changing the value of the parameter p alters the shape of the cdf. Use `geocdf` to compute the cdf at x equal to 1 through 10, for three different values of p. Then plot all three cdfs on the same figure for a visual comparison.

```
x = 1:10;
y1 = geocdf(x,0.1);   % For p = 0.1
y2 = geocdf(x,0.25);  % For p = 0.25
y3 = geocdf(x,0.75);  % For p = 0.75
figure;
plot(x,y1,'kd')
hold on
plot(x,y2,'ro')
plot(x,y3,'b+')
legend({'p = 0.1','p = 0.25','p = 0.75'})
hold off
```

In this plot, the value of y is the probability of observing at most x failures before the first success. When the probability of success p is large, y increases rapidly as x increases. The probability of having observed a success quickly becomes very high, even after a small number of trials. When p is small, y increases slowly as x increases. The probability of having observed a success still increases with the number of trials, but at a much slower rate.

#### Inverse cdf

The inverse cdf of a geometric distribution returns, for a given probability y, the smallest integer x such that the probability of observing at most x failures before the first success is at least y. Use `geoinv` to compute the inverse cdf of the geometric distribution. For example, the following returns the smallest integer x such that the geometric cdf evaluated at x is greater than or equal to 0.1, when the probability of success for each independent trial p is 0.03.

```y = 0.1; p = 0.03; x = geoinv(y,p)```
```x = 3 ```
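Inverting the cdf formula gives a closed-form expression that can serve as a cross-check. The sketch below assumes 0 < y < 1 and 0 < p < 1. The variable names are illustrative, and the closed form is a hand derivation rather than a description of how `geoinv` is implemented. Floating-point rounding can affect the closed form when the ratio of logarithms falls exactly on an integer.

```matlab
y = 0.1;                                          % target cumulative probability
p = 0.03;                                         % probability of success
xFormula = max(0, ceil(log(1-y)/log(1-p)) - 1);   % smallest integer x with cdf >= y
xToolbox = geoinv(y,p);                           % value returned by geoinv
cdfAtX = geocdf(xToolbox,p);                      % at least y
cdfBelow = geocdf(xToolbox-1,p);                  % less than y
```

Here `cdfAtX` is the first cdf value that reaches 0.1, and `cdfBelow` confirms that the next smaller integer falls short of it.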

### Mean and Variance

The mean of the geometric distribution is

`$\text{mean}=\frac{1-p}{p},$`

and the variance of the geometric distribution is

`$\mathrm{var}=\frac{1-p}{{p}^{2}},$`

where p is the probability of success.

Use `geostat` to compute the mean and variance of a geometric distribution. For example, the following computes the mean m and variance v of a geometric distribution with probability parameter p equal to 0.25.

```p = 0.25; [m,v] = geostat(p)```
```m = 3 ```
```v = 12 ```
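These values can be approximated from the pdf itself by summing over a long but finite range of x. The sketch below truncates the sums at x equal to 500, an arbitrary cutoff chosen because the remaining tail probability is negligible for p equal to 0.25. The variable names are illustrative.

```matlab
p = 0.25;                                 % probability of success
x = 0:500;                                % truncated support (tail is negligible here)
f = geopdf(x,p);                          % pdf values on the truncated support
mSeries = sum(x.*f);                      % approximates (1-p)/p
vSeries = sum((x - mSeries).^2.*f);       % approximates (1-p)/p^2
[m,v] = geostat(p);                       % values reported by geostat
```

For small values of p the tail decays slowly, so a larger cutoff would be needed to reach the same accuracy.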

### Example

#### Compute Geometric Distribution Probabilities

Suppose the probability of a five-year-old car battery not starting in cold weather is 0.03. What is the probability of the car starting for 25 consecutive days during a long cold snap?

Model the scenario using a geometric distribution, where "failure" means the car starts, and "success" means the car does not start. Determine the probability of observing 25 failures (the car starts) without observing a single success (the car does not start). The probability of success for each trial (the car not starting on any single attempt) is p equal to 0.03.

Compute the cumulative distribution function (cdf) for x equal to 24. Because x counts the failures before the first success, the cdf at x equal to 24 is the probability that the first success (the car not starting) occurs within the first 25 trials.

```x = 24; p = 0.03; psuccess = geocdf(x,p);```

To determine the probability of not observing a success in the first 25 trials, that is, the probability that the car starts on every one of the 25 attempts, subtract this result from 1.

`pfail = 1 - psuccess`
```pfail = 0.4670 ```

The returned result `pfail = 0.4670` is the probability that the car will start every day for 25 days in a row during a cold snap. It equals (1 - p)^25 with p equal to 0.03.

The cdf plot shows that, as the number of failures allowed before the first success (`x`) increases, the cumulative probability of having observed a success (`y`) also increases. In this example, it means that the more times you attempt to start the car, the greater the probability that it does not start on at least one of those occasions.

```figure; x = 0:25; y = geocdf(x,0.03); stairs(x,y)``` 