For each uncertain input in a simulation, you define the possible values
with a probability distribution. The type of distribution you select depends
on the conditions surrounding the input. A simulation calculates numerous
scenarios of a model by repeatedly picking values from the probability
distribution for the uncertain inputs and using those values to calculate
the model.
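The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own implementation: the profit model, the input names (units_sold, unit_margin), and every parameter value are hypothetical.

```python
import random
import statistics

def simulate_profit(trials=10_000, seed=42):
    """Run a toy simulation of a hypothetical profit model."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    results = []
    for _ in range(trials):
        # Pick one value for each uncertain input from its distribution.
        units_sold = rng.triangular(800, 1200, 1000)  # low, high, mode
        unit_margin = rng.gauss(5.0, 0.5)             # mean, standard deviation
        # Calculate the model for this scenario.
        results.append(units_sold * unit_margin)
    return results

profits = simulate_profit()
print(f"mean profit: {statistics.mean(profits):.0f}")
print(f"std dev:     {statistics.stdev(profits):.0f}")
```

Each pass through the loop is one scenario. The spread of the collected results reflects the uncertainty in the inputs.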
To select the correct probability distribution:
1. Evaluate the input in question. List everything you know about the
conditions surrounding this input. For example, you can gather
valuable information about the uncertain input from historical data.
2. Review the descriptions of the probability distributions. This
appendix describes each distribution in detail, outlining the
conditions underlying the distribution. As you review the
descriptions, look for a distribution that features the conditions you
have listed for this input.
3. Select the distribution that characterizes this input, where the
conditions of the distribution match those of the input.
Normal
The Normal distribution describes many phenomena, such as returns on
equity or assets, inflation rates, and currency fluctuations.
Decision-makers can use it to describe uncertain inputs such as the
inflation rate or periodic returns on assets.
Parameters
Mean
Standard Deviation
Note
Of the values of a normal distribution, approximately 68% are within 1
standard deviation on either side of the mean. The standard deviation is
the square root of the average squared distance of values from the mean.
Conditions
Use the normal distribution under these conditions:
The mean value is the most likely value.
The distribution is symmetrical about the mean.
A value is more likely to be close to the mean than far from it.
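The 68% figure in the note can be checked with a short sketch. The mean and standard deviation below are hypothetical values for an inflation rate. The check uses Python's standard random and statistics modules, not the simulation tool itself.

```python
import random
import statistics

rng = random.Random(1)
mean, sd = 3.0, 0.5  # hypothetical inflation rate in percent

# Empirical share of sampled values within 1 standard deviation of the mean.
samples = [rng.gauss(mean, sd) for _ in range(100_000)]
within_one_sd = sum(1 for x in samples if abs(x - mean) <= sd) / len(samples)

# The same share computed from the cumulative distribution function.
dist = statistics.NormalDist(mean, sd)
exact_share = dist.cdf(mean + sd) - dist.cdf(mean - sd)

print(f"sampled share: {within_one_sd:.3f}")
print(f"exact share:   {exact_share:.3f}")
```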
Triangular
The Triangular distribution describes situations where you know the
minimum, maximum, and most likely values. In the simulation, values at
the minimum and maximum effectively never occur, because the probability
density falls to zero at both ends of the range.
It is useful with limited data in situations such as sales estimates,
inventory numbers, and marketing costs. For example, you could describe
the number of cars sold per week when past sales show the minimum,
maximum, and usual number of cars sold.
Parameters
Minimum
Likeliest
Maximum
Conditions
Use the triangular distribution under these conditions:
Minimum and maximum are fixed.
It has a most likely value in this range, which forms a triangle with
the minimum and maximum.
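As a rough illustration of the cars-sold example, the following sketch draws values with Python's standard random module. The figures 10, 25, and 60 are made up, and sample_triangular is a hypothetical helper. Note that random.triangular takes its arguments in the order low, high, mode, which differs from the Minimum, Likeliest, Maximum order used here.

```python
import random
import statistics

def sample_triangular(minimum, likeliest, maximum, n, seed=0):
    """Draw n values from a triangular distribution."""
    rng = random.Random(seed)
    # random.triangular expects (low, high, mode), so reorder the parameters.
    return [rng.triangular(minimum, maximum, likeliest) for _ in range(n)]

cars = sample_triangular(10, 25, 60, n=50_000)

# The mean of a triangular distribution is the average of its three parameters.
expected_mean = (10 + 25 + 60) / 3
print(f"sample mean:   {statistics.mean(cars):.2f}")
print(f"expected mean: {expected_mean:.2f}")
```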
Uniform
The Uniform distribution describes situations where you know the
minimum and maximum values and all values are equally likely to occur.
Parameters
Minimum
Maximum
Conditions
Use the uniform distribution under these conditions:
Minimum is fixed.
Maximum is fixed.
All values in range are equally likely to occur.
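One way to see the "equally likely" condition is to count how many sampled values land in equal-width bins. This is a small sketch with hypothetical bounds of 2 and 10. Each of the four bins should hold roughly a quarter of the samples.

```python
import random

rng = random.Random(7)
minimum, maximum = 2.0, 10.0  # hypothetical bounds
samples = [rng.uniform(minimum, maximum) for _ in range(40_000)]

# Count how many samples land in each of four equal-width bins.
bins = [0, 0, 0, 0]
width = (maximum - minimum) / len(bins)
for x in samples:
    # Clamp the index so a value equal to the maximum falls in the last bin.
    index = min(int((x - minimum) / width), len(bins) - 1)
    bins[index] += 1

shares = [count / len(samples) for count in bins]
print([f"{share:.3f}" for share in shares])
```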
Lognormal
The Lognormal distribution describes many situations where values are
positively skewed (where most of the values occur near the minimum
value) such as asset and security prices. Such quantities exhibit this trend
because values cannot fall below zero but can increase without limit.
Parameters
Location
Mean
Standard Deviation
Note
If you have historical data available with which to define a lognormal
distribution, it is important to calculate the mean and standard deviation
of the logarithms of the data and then enter these log parameters.
Calculating the mean and standard deviation directly on the raw data
does not give you the correct lognormal distribution.
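The calculation in this note can be sketched as follows. The historical data here is synthetic, and the sketch assumes a location of 0; with a nonzero location you would subtract it from each value before taking logarithms. lognormal_log_parameters is a hypothetical helper, not part of any tool.

```python
import math
import random
import statistics

def lognormal_log_parameters(data):
    """Return the mean and standard deviation of the natural logs of the data."""
    logs = [math.log(x) for x in data]
    return statistics.mean(logs), statistics.stdev(logs)

# Synthetic stand-in for historical data (log mean 4.0, log std dev 0.3).
rng = random.Random(3)
history = [rng.lognormvariate(4.0, 0.3) for _ in range(5_000)]

log_mean, log_sd = lognormal_log_parameters(history)
raw_mean, raw_sd = statistics.mean(history), statistics.stdev(history)

print(f"log parameters: mean={log_mean:.2f}, std dev={log_sd:.2f}")
print(f"raw statistics: mean={raw_mean:.2f}, std dev={raw_sd:.2f}")
```

The log parameters recover the values used to generate the data, while the raw mean and standard deviation are on a different scale entirely.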
Conditions
Use the lognormal distribution under these conditions:
The upper limit is unbounded, but the uncertain input cannot fall
below the value of the location parameter.
Distribution is positively skewed, with most values near the lower
limit.
The natural logarithm of the uncertain input is normally distributed.
BetaPERT
The BetaPERT distribution is commonly used in project risk analysis to
assign probabilities to task durations and costs. It is also sometimes
used as a smoother alternative to the triangular distribution.
It describes a situation where you know the minimum, maximum, and
most likely values to occur. It is useful with limited data. For example, you
could describe the number of cars sold per week when past sales show
the minimum, maximum, and usual number of cars sold.
Parameters
Minimum
Likeliest
Maximum
Conditions
Use the betaPERT distribution under these conditions:
Minimum and maximum are fixed.
It has a most likely value in this range, which forms a triangle with
the minimum and maximum; betaPERT forms a smoothed curve on
the underlying triangle.
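A betaPERT value can be drawn by rescaling a beta distribution. The sketch below assumes the common PERT parameterization with a shape weight of 4, which gives a mean of (Minimum + 4 x Likeliest + Maximum) / 6. A given tool may use a different formula. sample_betapert is a hypothetical helper, and the figures 10, 25, and 60 are made up.

```python
import random
import statistics

def sample_betapert(minimum, likeliest, maximum, rng, shape=4.0):
    """Draw one betaPERT value (assumed standard PERT form, shape weight 4)."""
    span = maximum - minimum
    alpha = 1.0 + shape * (likeliest - minimum) / span
    beta = 1.0 + shape * (maximum - likeliest) / span
    # Rescale a beta variate from [0, 1] to [minimum, maximum].
    return minimum + span * rng.betavariate(alpha, beta)

rng = random.Random(9)
values = [sample_betapert(10, 25, 60, rng) for _ in range(50_000)]

pert_mean = (10 + 4 * 25 + 60) / 6
print(f"sample mean:   {statistics.mean(values):.2f}")
print(f"expected mean: {pert_mean:.2f}")
```

Compared with a triangular distribution on the same three values, this form puts more weight near the likeliest value and less in the tails.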
Yes-No
The Yes-No distribution describes situations that can have only one of two
values: for example, yes or no, success or failure, or true or false.
Parameters
Probability of Yes
Conditions
Use the Yes-No distribution under these conditions:
For each trial, only 2 outcomes are possible, such as success or
failure; the random input can have only one of two values, for
example, 0 and 1.
The mean is p, the probability of Yes (0 < p < 1).
Trials are independent.
The probability is the same from trial to trial.
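These conditions can be illustrated with a short sketch. The probability of 0.3 is hypothetical, and sample_yes_no is a hypothetical helper that encodes Yes as 1 and No as 0.

```python
import random

def sample_yes_no(p_yes, rng):
    """Return 1 (Yes) with probability p_yes, otherwise 0 (No)."""
    return 1 if rng.random() < p_yes else 0

rng = random.Random(11)
p_yes = 0.3  # hypothetical probability of Yes
# Each trial is drawn independently with the same probability.
trials = [sample_yes_no(p_yes, rng) for _ in range(50_000)]

share_yes = sum(trials) / len(trials)
print(f"share of Yes outcomes: {share_yes:.3f}")
```

Because the outcomes are coded 0 and 1, the sample mean is the share of Yes outcomes, which settles near p.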
Examples of Real-World Scenarios
Here are some real-world scenarios where common distributions are used,
including the Poisson and Exponential distributions, which are not
described above:
Normal Distribution: Modeling portfolio returns, inflation rates, or
measurement errors in a physical experiment.
Poisson Distribution: Modeling the number of defects in a
manufacturing process, the number of arrivals at a service center,
or the number of claims received by an insurance company.
Uniform Distribution: Modeling the arrival time of a customer within a
known window, the processing time of a task with known bounds, or a
parameter known only to lie between two limits.
Exponential Distribution: Modeling the time to failure of a
component, the time between arrivals at a service center, or the
time between claims received by an insurance company.
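The Poisson and Exponential examples describe the same kind of process from two angles: when the times between arrivals are exponentially distributed, the number of arrivals per period follows a Poisson distribution. The sketch below illustrates this with a hypothetical rate of 4 arrivals per period. arrivals_per_period is a hypothetical helper.

```python
import random
import statistics

def arrivals_per_period(rate, periods, rng):
    """Count arrivals in each unit-length period, given exponential gaps."""
    counts = [0] * periods
    clock = rng.expovariate(rate)  # time of the first arrival
    while clock < periods:
        counts[int(clock)] += 1         # period in which this arrival falls
        clock += rng.expovariate(rate)  # exponential gap to the next arrival
    return counts

rng = random.Random(5)
counts = arrivals_per_period(rate=4.0, periods=20_000, rng=rng)

# For a Poisson distribution, the mean and the variance both equal the rate.
print(f"mean count per period: {statistics.mean(counts):.2f}")
print(f"variance of counts:    {statistics.pvariance(counts):.2f}")
```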
Choosing the Right Probability Distribution
Choosing the right probability distribution is critical in simulation
modeling, as it can significantly impact the accuracy and reliability of the
results. Here, we'll discuss the factors to consider when selecting a
distribution, methods for determining the best distribution, and common
pitfalls to avoid.
Factors to Consider When Selecting a Distribution
When selecting a probability distribution, several factors should be
considered, including:
Data: The availability and quality of data can significantly impact
the choice of distribution. If data is available, it can be used to
estimate the parameters of a distribution or to select the best
distribution using goodness-of-fit tests.
Context: The context in which the distribution will be used is also
important. For example, if the distribution is being used to model a
physical phenomenon, the underlying physics may dictate the
choice of distribution.
Properties of the Distribution: The properties of the distribution,
such as its mean, variance, and shape, should also be considered.
For example, if the distribution is being used to model a count
variable, a discrete distribution may be more appropriate.
Methods for Determining the Best Distribution
Several methods can be used to determine the best probability
distribution for a given scenario, including:
Goodness-of-Fit Tests: Goodness-of-fit tests, such as the
Kolmogorov-Smirnov test or the chi-squared test, can be used to
compare the fit of different distributions to a dataset.
Visual Inspection: Visual inspection of the data, such as using
histograms or Q-Q plots, can also be used to identify the best
distribution.
Information Criteria: Information criteria, such as the Akaike
information criterion (AIC) or the Bayesian information criterion
(BIC), can be used to compare the fit of different distributions.
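As one illustration of these methods, the following sketch compares a normal fit and a lognormal fit using AIC, computed as 2k - 2 ln L, where k is the number of fitted parameters and L is the maximized likelihood. The dataset is synthetic and the helper names are hypothetical. A statistics library would normally do this work, and the sketch uses only Python's standard library to stay self-contained.

```python
import math
import random
import statistics

def normal_log_likelihood(values, mu, sigma):
    """Sum of normal log-densities of the values."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (v - mu) ** 2 / (2 * sigma ** 2)
        for v in values
    )

def aic_normal(data):
    # Maximum-likelihood estimates use the population standard deviation.
    mu, sigma = statistics.mean(data), statistics.pstdev(data)
    return 2 * 2 - 2 * normal_log_likelihood(data, mu, sigma)

def aic_lognormal(data):
    logs = [math.log(x) for x in data]
    mu, sigma = statistics.mean(logs), statistics.pstdev(logs)
    # Lognormal log-density = normal log-density of log(x) minus log(x).
    log_likelihood = normal_log_likelihood(logs, mu, sigma) - sum(logs)
    return 2 * 2 - 2 * log_likelihood

# Synthetic, positively skewed data standing in for historical observations.
rng = random.Random(21)
data = [rng.lognormvariate(4.0, 0.5) for _ in range(2_000)]

print(f"AIC, normal fit:    {aic_normal(data):.1f}")
print(f"AIC, lognormal fit: {aic_lognormal(data):.1f}")
```

The lower AIC indicates the better fit. For this skewed dataset, the lognormal fit scores lower than the normal fit.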
Common Pitfalls to Avoid
When selecting a probability distribution, there are several common
pitfalls to avoid, including:
Assuming a Distribution Without Evidence: Choosing a distribution
out of habit, without checking it against the data or the known
conditions of the input, can lead to inaccurate or misleading results.
Ignoring the Context: Ignoring the context in which the
distribution will be used can lead to the selection of an inappropriate
distribution.
Failing to Validate the Distribution: Failing to validate the
distribution using goodness-of-fit tests or other methods can lead to
inaccurate or misleading results.
[Flowchart: the process of choosing the right probability distribution]
Conclusion
Probability distributions play a critical role in simulation modeling, allowing
us to quantify uncertainty and make informed decisions. By understanding
the different types of probability distributions, their characteristics, and
their applications, we can select the right distribution for a given scenario
and generate more accurate and reliable results.