Introduction to Simulation Concepts

This document introduces simulation as a tool for modeling and analyzing system behavior across various domains. It covers fundamental concepts such as system definitions, components, types of systems, and the steps involved in conducting a simulation study. Additionally, it discusses the advantages and disadvantages of simulation, along with its applications in fields like manufacturing, healthcare, and logistics.

Uploaded by

vishu.waghmare
Copyright
© © All Rights Reserved

1

INTRODUCTION TO SIMULATION
Unit Structure:
1.0 Objective
1.1 Introduction
1.2 System and System Environment
1.2.1 Definition of a System
1.2.1.1 Components and Structure of a System
1.2.1.2 Examples of Systems in Various Domains
1.2.2 System Environment and Boundaries
1.2.2.1 Interaction Between System and Environment
1.2.2.2 Defining System Boundaries
1.2.2.3 Importance of Boundary Selection
1.2.3 Components of a System
1.2.3.1 Entities
1.2.3.2 Attributes
1.2.3.3 Activities
1.2.3.4 State
1.2.4 Types of Systems
1.2.4.1 Deterministic Systems
Definition and Examples
Characteristics and Applications
1.2.4.2 Stochastic Systems
Definition and Examples
Characteristics and Applications
1.2.4.3 Continuous Systems
Definition and Examples
Characteristics and Applications
1.2.4.4 Discrete Systems
Definition and Examples
Characteristics and Applications
1.3 Types of Models
1.3.1 Physical Models
1.3.1.1 Definition and Examples
1.3.1.2 Use in Simulation Studies
1.3.2 Mathematical Models
1.3.2.1 Definition and Types
1.3.2.2 Formulating Mathematical Models
1.3.2.3 Applications in Simulation
1.3.3 Computer Simulation Models
1.3.3.1 Definition and Characteristics
1.3.3.2 Building Computer Simulation Models
1.3.3.3 Examples of Simulation Software
1.4 Steps in Simulation Study
1.4.1 Problem Formulation
1.4.1.1 Defining Objectives
1.4.1.2 Identifying Key Questions and Hypotheses
1.4.2 Setting Objectives and Overall Project Plan
1.4.2.1 Establishing Goals
1.4.2.2 Planning Resources and Timeline
1.4.3 Model Conceptualization
1.4.3.1 Developing the Conceptual Model
1.4.3.2 Identifying Key Components and Relationships
1.4.4 Data Collection
1.4.4.1 Identifying Data Requirements
1.4.4.2 Sources of Data
1.4.4.3 Data Validation and Cleaning
1.4.5 Model Translation
1.4.5.1 Converting Conceptual Model to Computational Model
1.4.5.2 Selection of Simulation Software
1.4.5.3 Coding and Implementation
1.4.6 Verification and Validation
1.4.6.1 Techniques for Model Verification
1.4.6.2 Techniques for Model Validation
1.4.6.3 Ensuring Accuracy and Credibility
1.4.7 Experimentation and Analysis
1.4.7.1 Designing Simulation Experiments
1.4.7.2 Analyzing Simulation Outputs
1.4.7.3 Interpreting Results and Drawing Conclusions
1.4.8 Documentation and Reporting
1.4.8.1 Documenting the Simulation Study
1.4.8.2 Preparing Reports and Presentations
1.5 Advantages and Disadvantages of Simulation
1.5.1 Advantages of Simulation
1.5.1.1 Flexibility and Scalability
1.5.1.2 Ability to Model Complex Systems
1.5.1.3 Insight into System Behavior and Performance
1.5.1.4 Risk-Free Experimentation
1.5.2 Disadvantages of Simulation
1.5.2.1 High Cost and Time-Consuming
1.5.2.2 Complexity in Model Building
1.5.2.3 Challenges in Data Collection and Validation
1.5.2.4 Interpretation of Results
1.6 Applications of Simulation
1.6.1 Manufacturing and Production Systems
1.6.1.1 Optimizing Production Processes
1.6.1.2 Analyzing Bottlenecks and Improving Efficiency
1.6.2 Healthcare Systems
1.6.2.1 Simulating Patient Flow
1.6.2.2 Resource Allocation and Management
1.6.3 Transportation and Logistics
1.6.3.1 Traffic Simulation
1.6.3.2 Supply Chain Management
1.6.4 Business Process Modeling
1.7 Summary
1.8 Questions for Practice
1.9 References

1.0 OBJECTIVE
Simulation is a powerful tool used across various industries to model,
analyze, and predict system behavior. This chapter aims to introduce the
fundamental concepts of simulation, different types of systems and
models, and the steps involved in conducting a simulation study.
Additionally, it explores the advantages, disadvantages, and applications
of simulation.

1.1 INTRODUCTION
Simulation refers to the process of creating a model that represents a real-
world system and experimenting with it to observe outcomes under
different conditions. It is widely used in fields such as manufacturing,
healthcare, logistics, business process modeling, and more. The primary
purpose of simulation is to gain insights into system behavior without
affecting the real system.
Some key benefits of simulation include:
Risk-free testing and analysis
Ability to study complex systems
Cost-effectiveness compared to real-world experiments
Enhanced decision-making through predictive insights

1.2 SYSTEM AND SYSTEM ENVIRONMENT


1.2.1 Definition of a System
A system is defined as a collection of interrelated components that work
together to achieve a specific objective. Systems can be found in various
domains such as engineering, economics, healthcare, and logistics.

1.2.1.1 Components and Structure of a System


A system is made up of various components, each playing a vital role in its
operation. Understanding these components helps in designing simulations
and analyzing system behavior effectively. The primary components of a
system are:

1. Entities
Entities are the fundamental elements that exist within a system. These can
be physical objects or abstract concepts.
Example: In a hospital system, entities include patients, doctors, and
nurses. In a transportation system, vehicles, passengers, and roads are
entities.

2. Attributes
Attributes define the characteristics of an entity. Each entity has a set of
attributes that determine its state and behavior.

Example: A patient entity may have attributes such as age, illness severity,
and waiting time. A vehicle entity may have attributes like speed, fuel
level, and passenger count.

3. Activities
Activities refer to the tasks or processes that take place within the system.
These define how entities interact and evolve over time.
Example: In a manufacturing system, activities include assembling,
quality control, and packaging. In a banking system, activities include
account transactions and loan processing.

4. Events
Events are specific occurrences that trigger changes within the system.
They mark the start or end of an activity.
Example: In a hospital, a patient arriving at the emergency room is an
event. In a queue system, a customer reaching the service desk is an event.

5. State
The state of a system is a snapshot of its conditions at a specific moment.
It is determined by the attributes of entities at that time.
Example: In a factory, the state could be defined by the number of
machines in operation, the number of products completed, and the level of
raw materials available.

6. Inputs and Outputs


Inputs are elements introduced into the system that affect its operation,
while outputs are the results produced by the system.
Example: In a restaurant, inputs include ingredients and customer orders,
while outputs include prepared meals and customer satisfaction ratings.

7. Feedback Mechanism
A feedback mechanism allows a system to self-regulate by adjusting
operations based on past performance or external changes.
Example: In an inventory management system, feedback mechanisms help
adjust stock levels based on demand fluctuations.

1.2.1.2 Examples of Systems in Various Domains


Manufacturing: Assembly lines, inventory management systems
Healthcare: Hospital patient flow, emergency room operations
Transportation: Traffic flow, airline scheduling
Business: Customer service processes, financial markets

1.2.2 System Environment and Boundaries

1.2.2.1 Interaction Between System and Environment
A system does not operate in isolation; it interacts with its external
environment. The interaction occurs through inputs, constraints, and
feedback mechanisms, which influence system performance and
outcomes.
Inputs: Elements introduced into the system, such as raw materials in
manufacturing or patient arrivals in a hospital.
Constraints: Limitations that affect how a system operates, like budget
restrictions in a business or space limitations in a warehouse.
Feedback Mechanisms: Adjustments made based on past performance,
such as modifying staffing levels based on patient influx.

1.2.2.2 Defining System Boundaries


System boundaries determine what is included and excluded from the
study. Clearly defining boundaries is crucial for accurate modeling and
analysis.
Internal Components: These are the primary elements directly involved
in system processes, such as machines in a factory.
External Influences: These include external factors affecting the system,
like market demand or government regulations.
Interfaces: Points where the system interacts with its environment, such
as customer touchpoints in a retail system.

1.2.2.3 Importance of Boundary Selection


Selecting appropriate system boundaries ensures that the most relevant
factors are considered while avoiding unnecessary complexities.
Narrow Boundaries: Focusing only on a system's internal components
can miss important external influences.
Broad Boundaries: Including too many external factors can make the
model overly complex and difficult to analyze.
Balanced Approach: Identifying the right level of detail allows for
meaningful insights and practical decision-making.

1.2.3 Components of a System


1.2.3.1 Entities
Entities are the distinct objects or elements that exist within a system and
interact with each other.

Examples: Machines in a factory, employees in an office, students in a
school.
Role in Simulation: Understanding entities is key to defining interactions
and behaviors in a system model.

1.2.3.2 Attributes
Attributes define the properties or characteristics of entities.
Examples: A machine’s processing speed, a student’s grades, a patient’s
health condition.
Importance: Attributes influence system states and determine how
entities behave over time.

1.2.3.3 Activities
Activities refer to the tasks or operations performed within a system.
Examples: Manufacturing steps in production, transactions in a banking
system.
Relevance: Activities define the flow and interactions between entities,
making them critical for process analysis.

1.2.3.4 State
The state represents the system’s condition at any given moment,
influenced by the attributes of its entities.
Examples: Inventory levels in a warehouse, the number of patients in an
emergency room.
Use in Simulation: System states help track performance and predict
future behavior under different conditions.
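The relationship between entities, attributes, activities, and state can be sketched in a few lines of Python. This is only an illustration for the hospital example: the `Patient` class, its fields, and the helper functions `system_state` and `start_treatment` are hypothetical names, not part of any simulation library.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """An entity; each field is one of its attributes."""
    name: str
    severity: int                 # assumed scale: 1 (mild) to 5 (critical)
    waiting_minutes: float = 0.0
    in_treatment: bool = False


def system_state(patients):
    """Snapshot of the system, derived from the attributes of its entities."""
    waiting = [p for p in patients if not p.in_treatment]
    return {
        "patients_waiting": len(waiting),
        "patients_in_treatment": len(patients) - len(waiting),
        "longest_wait": max((p.waiting_minutes for p in waiting), default=0.0),
    }


def start_treatment(patient):
    """An activity: it changes an attribute and therefore the system state."""
    patient.in_treatment = True


patients = [Patient("A", 3, 12.0), Patient("B", 5, 4.0)]
before = system_state(patients)
start_treatment(patients[1])
after = system_state(patients)
print(before)
print(after)
```

Comparing `before` and `after` shows how a single activity moves the system from one state to another.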

1.2.4 Types of Systems


A system can be classified based on how it behaves and how its state
changes over time. Broadly, systems fall into Deterministic, Stochastic,
Continuous, and Discrete categories. Understanding these classifications is
crucial for modeling and simulation.

1.2.4.1 Deterministic Systems


Definition:
A deterministic system is a system where the outcome is precisely
determined by the initial conditions and system parameters. There is no
randomness involved, meaning the same set of inputs will always produce
the same outputs.

Characteristics:
• No randomness or probability involved.
• Future states of the system are completely predictable.
• Governed by fixed rules, formulas, or equations.
• Ideal for controlled environments where variability is minimal.

Examples:
Mechanical Systems: The movement of a pendulum under gravity
follows a deterministic path if there are no external disturbances.
Chemical Reactions: Given the same reactants, temperature, and
pressure, a chemical reaction proceeds in a predictable way.
Manufacturing Assembly Lines: A fully automated assembly line
produces identical products with minimal variation.
Mathematical Equations: Newton’s laws of motion describe the motion
of objects in a deterministic manner.

Applications:
• Engineering and Robotics: Designing machinery and automation systems that require precision.
• Physics and Chemistry: Modeling physical and chemical processes under controlled conditions.
• Traffic Flow Models: Predicting the movement of vehicles when external factors are controlled.
• Computational Algorithms: Encryption and coding systems rely on deterministic logic.

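The defining property of a deterministic system, that the same inputs always give the same outputs, can be illustrated with the pendulum example. The sketch below assumes the small-angle approximation, in which the angle follows theta(t) = theta0 * cos(sqrt(g / length) * t); the function name and parameter values are illustrative.

```python
import math


def pendulum_angle(theta0, length, t, g=9.81):
    """Angle (radians) of an ideal pendulum at time t, small-angle approximation."""
    return theta0 * math.cos(math.sqrt(g / length) * t)


# Two evaluations with identical inputs give identical results.
first = pendulum_angle(0.1, 1.0, 2.0)
second = pendulum_angle(0.1, 1.0, 2.0)
print(first == second)  # True
```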
1.2.4.2 Stochastic Systems
Definition:
A stochastic system is one that involves elements of randomness, where
outcomes are not entirely predictable. Such systems incorporate
probability and uncertainty in their behavior.

Characteristics:
• Outcomes are probabilistic rather than fixed.
• The same input can produce different outputs.
• Future states are influenced by random variables.
• Requires statistical and probabilistic methods for modeling.

Examples:
• Weather Systems: The weather is influenced by numerous factors and follows probabilistic patterns.
• Stock Market: Prices of stocks fluctuate based on economic conditions, investor sentiment, and other unpredictable factors.
• Customer Arrivals in a Store: The number of customers arriving at a retail store varies randomly.
• Biological Processes: The spread of diseases or genetic variations follows stochastic models.

Applications:
• Finance and Economics: Risk assessment, portfolio management, and economic forecasting.
• Healthcare and Epidemiology: Predicting disease outbreaks and patient flow in hospitals.
• Network Traffic and Communications: Managing unpredictable data flow in telecommunications.
• AI and Machine Learning: Probabilistic models like Hidden Markov Models (HMM) and Bayesian Networks.
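The customer-arrival example can be sketched as follows. The code assumes, purely for illustration, that the gaps between arrivals are exponentially distributed; the function name `arrivals_in_period` and the rate of 20 customers per hour are hypothetical. The input (the arrival rate) is the same in every run, yet the count differs from run to run because each run draws different random numbers.

```python
import random


def arrivals_in_period(rate_per_hour, hours, rng):
    """Count arrivals in a period when inter-arrival gaps are exponential."""
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(rate_per_hour)  # time until the next customer
        if t > hours:
            return count
        count += 1


# Same input, five different random streams (one seed per run).
counts = [arrivals_in_period(20, 1.0, random.Random(seed)) for seed in range(5)]
print(counts)
```

Fixing the seed makes a single run reproducible, which is useful when debugging a stochastic model.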

1.2.4.3 Continuous Systems

Definition:
A continuous system is one in which changes occur smoothly over time,
without abrupt jumps or discrete transitions. These systems are described
using differential equations.

Characteristics:
• Time progresses continuously rather than in steps.
• System variables change smoothly over time.
• Typically described using mathematical equations or differential equations.
• Requires calculus-based methods for analysis.

Examples:
• Temperature Change: The temperature of a room gradually increases or decreases over time.
• Fluid Flow: The movement of water through pipes follows a continuous process.
• Projectile Motion: A ball thrown in the air follows a smooth parabolic trajectory.
• Electrical Circuits: The flow of current in an electrical circuit is continuous.

Applications:
• Physics and Engineering: Modeling mechanical systems, aerodynamics, and energy distribution.
• Climate Science: Predicting changes in climate variables like temperature and atmospheric pressure.
• Biomedical Sciences: Blood circulation and drug diffusion in the body.
• Control Systems: Designing smooth responses in automated systems, like cruise control in cars.
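A continuous system is usually simulated by stepping its differential equation forward in small time increments. The sketch below takes the temperature example and assumes Newton's law of cooling, dT/dt = -k (T - T_env), solved with the simple Euler method; the function name, the cooling constant, and the temperatures are illustrative values, not data from the text.

```python
import math


def simulate_cooling(start_temp, env_temp, k, t_end, dt=0.01):
    """Euler integration of dT/dt = -k * (T - env_temp) from 0 to t_end."""
    temp = start_temp
    for _ in range(round(t_end / dt)):
        temp += -k * (temp - env_temp) * dt  # small change over one time step
    return temp


approx = simulate_cooling(90.0, 20.0, 0.1, 10.0)
exact = 20.0 + (90.0 - 20.0) * math.exp(-0.1 * 10.0)  # analytical solution
print(approx, exact)
```

A smaller `dt` brings the numerical result closer to the analytical solution at the cost of more computation.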

1.2.4.4 Discrete Systems

Definition:
A discrete system is one where changes occur at distinct points in time
rather than continuously. These systems are characterized by sudden
transitions between states.

Characteristics:
• Changes happen in fixed time intervals or at distinct events.
• Can be modeled using discrete mathematics, logic, or event-based systems.
• Often used in digital and computational environments.

Examples:
• Queue Systems: The arrival and departure of customers at a bank occur at discrete times.
• Digital Clocks: A digital clock updates time at fixed intervals (e.g., every second).
• Computer Programs: Execution of commands in software follows discrete steps.
• Traffic Lights: The state of a traffic signal changes at defined time intervals.

Applications:
• Operations Research: Modeling queuing systems, logistics, and supply chain operations.
• Computing and Digital Systems: Designing algorithms and data structures in computer science.
• Discrete Event Simulation: Used in manufacturing and process optimization.
• Telecommunications: Packet-based data transmission follows discrete steps.
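In a discrete system the state changes only when an event occurs. A minimal sketch of this idea is an event list processed in time order, here using Python's standard `heapq` module. The function `run_events` and the event times are hypothetical and chosen only to illustrate the bank-queue example.

```python
import heapq


def run_events(events):
    """Process (time, kind) events in time order; state changes only at events."""
    heap = list(events)
    heapq.heapify(heap)          # earliest event is always popped first
    queue_length = 0
    history = []
    while heap:
        time, kind = heapq.heappop(heap)
        if kind == "arrival":
            queue_length += 1
        elif kind == "departure" and queue_length > 0:
            queue_length -= 1
        history.append((time, queue_length))
    return history


history = run_events([
    (2.5, "arrival"), (1.0, "arrival"), (4.2, "departure"), (3.0, "departure"),
])
print(history)  # [(1.0, 1), (2.5, 2), (3.0, 1), (4.2, 0)]
```

Between two consecutive event times nothing changes, so the simulation clock can jump directly from one event to the next.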
Comparison of System Types

| Type of System | Nature | Predictability | Modeling Approach | Common Applications |
|---|---|---|---|---|
| Deterministic | Fixed and predictable | Highly predictable | Equations and fixed rules | Engineering, manufacturing, physics |
| Stochastic | Includes randomness | Probabilistic outcomes | Statistical and probabilistic methods | Finance, healthcare, weather forecasting |
| Continuous | Smooth and uninterrupted | Predictable with calculus | Differential equations | Physics, engineering, climate modeling |
| Discrete | Changes occur at intervals | Event-driven | Discrete mathematics, logic | Computing, telecommunications, queuing systems |

1.3 TYPES OF MODELS IN SIMULATION


Simulation models help in studying complex systems by representing
them in a simplified way. These models can be physical, mathematical, or
computer-based, each with distinct characteristics and applications.

1.3.1 Physical Models


1.3.1.1 Definition and Examples
A physical model is a tangible representation of a system, often scaled
down or up, to study its behavior under various conditions. These models
help visualize and analyze real-world scenarios.

Examples:
• Wind Tunnel Models: Used in aerospace engineering to study aircraft aerodynamics.
• Architectural Models: Miniature versions of buildings to visualize structures before construction.
• Human Organ Models: Used in medical training for surgery practice.
• Mechanical Prototypes: A working version of a machine or device to test its functionality.

1.3.1.2 Use in Simulation Studies

Physical models support simulation studies by:
Allowing hands-on experimentation before full-scale production.
Helping to validate theoretical models by comparing real-world and
predicted behaviors.
Assisting in safety testing for vehicles, buildings, and medical procedures.
Serving as educational tools for students in engineering, medicine, and
physics.

1.3.2 Mathematical Models


1.3.2.1 Definition and Types
A mathematical model is a representation of a system using mathematical
equations, formulas, or logical expressions to describe its behavior. These
models help predict outcomes and optimize performance.

Types of Mathematical Models:


1. Deterministic Models: No randomness involved; results are fixed
(e.g., Newton’s Laws of Motion).
2. Stochastic Models: Incorporate probability and uncertainty (e.g.,
Markov Chains, Monte Carlo simulations).
3. Linear Models: Relationships between variables are linear (e.g.,
Linear Regression, Simple Economic Models).
4. Nonlinear Models: Involves complex relationships (e.g., Climate
Change Models, Chaos Theory).
5. Dynamic Models: Change over time (e.g., Population Growth
Models, Predator-Prey Models).
6. Static Models: Describe a system at a fixed point in time (e.g.,
Equilibrium Models in Economics).

1.3.2.2 Formulating Mathematical Models

Building a mathematical model involves:
• Defining the Problem: Identifying system components and interactions.
• Identifying Variables: Determining input (independent) and output (dependent) variables.
• Establishing Relationships: Using equations, functions, or probability distributions.
• Validating the Model: Comparing predictions with real-world data.
• Refining the Model: Adjusting equations or parameters for better accuracy.

Example:
A traffic flow model can be formulated as follows. Let $Q$ be the flow
rate (vehicles/hour). Then

$$Q = v \times k$$

where $v$ is the velocity and $k$ is the vehicle density.
A non-linear function can model congestion effects as density increases.
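The traffic flow relationship, flow rate Q as velocity times vehicle density, can be turned into a small computational model. The text does not say which non-linear function to use for congestion; the sketch below assumes the Greenshields model, in which speed falls linearly from a free-flow speed to zero at the jam density. The parameter values (100 km/h, 120 vehicles/km) are illustrative.

```python
def speed(density, free_speed=100.0, jam_density=120.0):
    """Greenshields assumption: speed drops linearly as density increases."""
    return free_speed * (1.0 - density / jam_density)


def flow(density, free_speed=100.0, jam_density=120.0):
    """Flow rate Q (vehicles/hour) = velocity (km/h) * density (vehicles/km)."""
    return speed(density, free_speed, jam_density) * density


for k in (30, 60, 90, 120):
    print(k, flow(k))
```

Under this assumption flow first rises with density, peaks at half the jam density, and then falls back to zero as the road becomes fully congested.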

1.3.2.3 Applications in Simulation

Mathematical models are widely used in:
• Finance: Risk assessment, stock market predictions.
• Engineering: Structural analysis, fluid dynamics.
• Healthcare: Disease spread modeling (e.g., COVID-19 simulations).
• Economics: Market forecasting, demand-supply analysis.
• Environmental Science: Climate change models, pollution control.

1.3.3 Computer Simulation Models


1.3.3.1 Definition and Characteristics
A computer simulation model is a digital representation of a real-world
system that allows experimentation under different conditions. These
models are built using algorithms and computational techniques.

Characteristics:
• Based on mathematical and logical models.
• Allows real-time scenario testing.
• Can handle complex, large-scale systems.
• Uses graphical visualization and data analysis tools.
• Runs on high-performance computing environments.

1.3.3.2 Building Computer Simulation Models


Steps involved in developing a computer simulation model:
1. Define Objectives: Identify what needs to be analyzed.
2. Collect Data: Gather input parameters from real-world observations.
3. Choose a Modeling Approach: Discrete-event, agent-based, or
system dynamics simulation.
4. Develop the Model: Implement equations and algorithms in
programming languages like Python, R, or MATLAB.
5. Validate and Test: Compare outputs with real-world data to ensure
accuracy.
6. Run Simulations: Experiment with different scenarios and interpret
results.
Example:
A supply chain simulation can be developed to analyze inventory
management:
Inputs: Order quantity, demand rate, supplier lead time.
Process: Simulate stock levels, delays, and costs.
Output: Determine optimal order policies to reduce costs.
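The supply chain example can be sketched as a day-by-day inventory simulation. Everything beyond the three inputs named above is an assumption made for illustration: the reorder-point policy, the uniformly distributed daily demand, the cost figures, and the function name `simulate_inventory`.

```python
import random


def simulate_inventory(order_qty, reorder_point, mean_demand, lead_time,
                       days=365, holding_cost=1.0, order_cost=50.0,
                       stockout_cost=10.0, seed=0):
    """Simulate stock levels, delays, and costs for one ordering policy."""
    rng = random.Random(seed)
    stock = order_qty
    pending = []                 # arrival days of orders not yet delivered
    total_cost = 0.0
    stockout_days = 0
    for day in range(days):
        # Receive every order whose lead time has elapsed.
        stock += order_qty * sum(1 for d in pending if d <= day)
        pending = [d for d in pending if d > day]
        # Assumed demand: integer, uniform on 0..2*mean_demand.
        demand = rng.randint(0, 2 * mean_demand)
        sold = min(stock, demand)
        unmet = demand - sold
        stock -= sold
        if unmet:
            stockout_days += 1
        total_cost += holding_cost * stock + stockout_cost * unmet
        # Reorder when stock on hand plus stock on order is low.
        if stock + order_qty * len(pending) <= reorder_point:
            pending.append(day + lead_time)
            total_cost += order_cost
    return {"total_cost": total_cost, "stockout_days": stockout_days,
            "final_stock": stock}


result = simulate_inventory(order_qty=100, reorder_point=40,
                            mean_demand=10, lead_time=3)
print(result)
```

Running the function for several values of `order_qty` and `reorder_point` and comparing `total_cost` is one simple way to search for a lower-cost order policy.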

1.3.3.3 Examples of Simulation Software

Several software tools are used for computer simulations:

General-Purpose Simulation Software:
1. MATLAB/Simulink: Used in engineering and control systems.
2. AnyLogic: Supports discrete-event, agent-based, and system dynamics
modeling.
3. Arena: Used in industrial process simulation.

Physics and Engineering Simulations:
1. ANSYS: Finite Element Analysis (FEA) for mechanical and fluid
dynamics.
2. COMSOL Multiphysics: Used for heat transfer, electromagnetics,
and structural simulations.

Financial and Economic Simulations:
1. Risk Solver (Excel Add-on): Monte Carlo simulations for financial
risk.
2. Python Libraries (SciPy, NumPy, Pandas): Used for stochastic
modeling.

Healthcare and Epidemiology Simulations:
1. EpiSimS: Used for disease spread modeling.
2. GAMA Platform: Agent-based modeling for healthcare research.

Artificial Intelligence and Machine Learning Simulations:
1. TensorFlow & PyTorch: Neural network training simulations.
2. OpenAI Gym: Reinforcement learning environments for AI
development.
Comparison of Model Types

| Model Type | Nature | Key Characteristics | Examples | Applications |
|---|---|---|---|---|
| Physical Model | Tangible, real-world | Scaled versions, used for testing | Wind tunnel, architectural model | Engineering, medicine, education |
| Mathematical Model | Abstract, equation-based | Deterministic/Stochastic, linear/nonlinear | Population growth, economic models | Finance, physics, engineering |
| Computer Simulation Model | Digital, software-based | Handles complex, large-scale scenarios | Climate models, AI training simulations | AI, manufacturing, healthcare |

1.4 STEPS IN A SIMULATION STUDY


A simulation study follows a structured approach to ensure the model
accurately represents the real-world system and provides valuable insights.
The key steps include problem formulation, model development,
validation, experimentation, and reporting.

1.4.1 Problem Formulation

1.4.1.1 Defining Objectives
The first step in a simulation study is to clearly define the objectives of the
study.
Identify the purpose of the simulation (e.g., optimizing inventory,
improving traffic flow).
Specify key performance indicators (KPIs) to measure success.
Determine the scope and constraints of the system.

Example:
For a hospital emergency department simulation, objectives may include:
Reducing patient waiting time.
Optimizing the number of doctors and nurses on duty.

1.4.1.2 Identifying Key Questions and Hypotheses

Formulating Key Questions:
• What factors impact the system’s performance?
• How will changes in resource allocation affect system outcomes?


Developing Hypotheses:
"Increasing the number of nurses will reduce patient waiting time by
20%."
"Reducing machine downtime will increase production efficiency by
15%."

1.4.2 Setting Objectives and Overall Project Plan


1.4.2.1 Establishing Goals
Convert objectives into measurable goals.
Prioritize goals based on feasibility and impact.
Ensure alignment with business or research objectives.
Example:
For a supply chain simulation, goals may include:
Reducing transportation costs by 10%.
Maintaining an inventory level that prevents stockouts.

1.4.2.2 Planning Resources and Timeline
Identify the team members required (data analysts, developers, domain
experts).
Allocate budget, software, and hardware resources.
Define a timeline with milestones for each phase.

1.4.3 Model Conceptualization


1.4.3.1 Developing the Conceptual Model
A conceptual model provides a high-level view of the system, describing:
System components (e.g., entities, processes, resources).
Interactions and dependencies (e.g., how patients move through a
hospital).
Example:
For a manufacturing plant, components include:
Machines
Workers
Raw materials
Production process

1.4.3.2 Identifying Key Components and Relationships


Identify input variables (e.g., arrival rate of customers).
Define output measures (e.g., average service time).
Establish relationships (e.g., queue length depends on processing time).
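As an illustration of such a relationship, the dependence of queue length on processing time has a closed form for the simplest textbook queue. The sketch assumes a single-server M/M/1 queue (Poisson arrivals, exponential service times), for which the mean number waiting is Lq = rho^2 / (1 - rho) with rho = arrival rate * mean service time. A real system may not satisfy these assumptions; the function name is hypothetical.

```python
def mm1_queue_length(arrival_rate, mean_service_time):
    """Mean number of customers waiting in an M/M/1 queue."""
    rho = arrival_rate * mean_service_time   # server utilization
    if rho >= 1.0:
        raise ValueError("queue is unstable: utilization must be below 1")
    return rho ** 2 / (1.0 - rho)


# Input variable: 0.5 arrivals per minute. Output measure: mean queue length.
for service_time in (1.0, 1.6):
    print(service_time, mm1_queue_length(0.5, service_time))
```

Raising the mean service time from 1.0 to 1.6 minutes increases the mean queue length from about 0.5 to about 3.2, which shows how sharply queues grow as utilization approaches 1.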

1.4.4 Data Collection


1.4.4.1 Identifying Data Requirements
• What historical data is needed?
• What real-time data should be collected?
• Are survey-based or expert-driven estimates required?

1.4.4.2 Sources of Data


Internal data: Company records, databases.
External data: Government reports, market research.
Simulated data: When real-world data is unavailable.

1.4.4.3 Data Validation and Cleaning
Check for missing values, duplicates, and inconsistencies.
Ensure data accuracy by cross-referencing multiple sources.
Transform data into a suitable format for simulation.
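The checks listed above can be sketched with plain Python. The record layout (dictionaries with `arrival_time` and `service_time` fields) and the function name `clean_records` are assumptions made for the example; in practice a library such as pandas is often used for the same purpose.

```python
def clean_records(records, required=("arrival_time", "service_time")):
    """Drop incomplete, malformed, negative, and duplicate records."""
    seen = set()
    cleaned = []
    for rec in records:
        # Missing values.
        if any(rec.get(field) is None for field in required):
            continue
        # Transform text into numbers; skip entries that cannot be converted.
        try:
            row = tuple(float(rec[field]) for field in required)
        except (TypeError, ValueError):
            continue
        # Inconsistencies (negative times) and duplicates.
        if any(value < 0 for value in row) or row in seen:
            continue
        seen.add(row)
        cleaned.append(dict(zip(required, row)))
    return cleaned


raw = [
    {"arrival_time": "1.5", "service_time": "2"},
    {"arrival_time": "1.5", "service_time": "2"},     # duplicate
    {"arrival_time": None, "service_time": "3"},      # missing value
    {"arrival_time": "abc", "service_time": "1"},     # malformed
    {"arrival_time": "3", "service_time": "-1"},      # inconsistent
]
cleaned = clean_records(raw)
print(cleaned)  # [{'arrival_time': 1.5, 'service_time': 2.0}]
```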

1.4.5 Model Translation


1.4.5.1 Converting Conceptual Model to Computational Model
Convert the conceptual model into a mathematical or algorithmic
representation.
Define equations, rules, and constraints governing the system.

1.4.5.2 Selection of Simulation Software

Choose the appropriate tool based on:
• System complexity (simple vs. large-scale).
• Computational requirements (real-time vs. batch processing).


Examples: AnyLogic, Arena, Simulink, Python (SimPy), MATLAB.

1.4.5.3 Coding and Implementation


Develop the model using programming languages or simulation software.
Ensure scalability and flexibility for future modifications.
Example:
A traffic simulation can be implemented using:
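One possible implementation, given here only as a minimal sketch in plain Python, is the Nagel-Schreckenberg cellular automaton on a circular single-lane road. The choice of this model, the function names `step` and `simulate_traffic`, and all parameter values are assumptions made for illustration.

```python
import random


def step(positions, speeds, road_length, v_max, p_slow, rng):
    """Advance every car by one time step (Nagel-Schreckenberg rules)."""
    n = len(positions)
    order = sorted(range(n), key=lambda i: positions[i])
    new_speeds = speeds[:]
    for idx, i in enumerate(order):
        ahead = order[(idx + 1) % n]
        # Empty cells between this car and the car in front (ring road).
        gap = (positions[ahead] - positions[i] - 1) % road_length
        v = min(speeds[i] + 1, v_max)          # 1. accelerate
        v = min(v, gap)                        # 2. brake to avoid a collision
        if v > 0 and rng.random() < p_slow:    # 3. random slowdown
            v -= 1
        new_speeds[i] = v
    new_positions = [(positions[i] + new_speeds[i]) % road_length
                     for i in range(n)]        # 4. move
    return new_positions, new_speeds


def simulate_traffic(n_cars=10, road_length=50, v_max=5, p_slow=0.3,
                     steps=50, seed=1):
    """Run the automaton from evenly spaced, stationary cars."""
    rng = random.Random(seed)
    positions = [i * road_length // n_cars for i in range(n_cars)]
    speeds = [0] * n_cars
    for _ in range(steps):
        positions, speeds = step(positions, speeds, road_length,
                                 v_max, p_slow, rng)
    return positions, speeds


positions, speeds = simulate_traffic()
print("mean speed (cells per step):", sum(speeds) / len(speeds))
```

Increasing `n_cars` on the same road raises the density, and the mean speed reported by the model can be compared across densities to study congestion.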

1.4.6 Verification and Validation

1.4.6.1 Techniques for Model Verification
Verification checks that the model has been implemented correctly, that
is, the program is free from errors and behaves as the conceptual model
specifies.
Debugging code.
Step-by-step execution and print statements.
Comparing outputs with expected results.

1.4.6.2 Techniques for Model Validation


Validation checks if the model accurately represents the real system.
Compare simulation results with real-world data.
Perform sensitivity analysis (how changes in input affect output).
Use expert opinions for validation.
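Sensitivity analysis can be sketched as a one-at-a-time experiment: each input is increased by a fixed percentage while the others are held constant, and the relative change in the output is recorded. The stand-in model below is the M/M/1 mean time in system, W = 1 / (service rate - arrival rate), chosen only because it is simple; the function names and rates are hypothetical.

```python
def mean_time_in_system(arrival_rate, service_rate):
    """Stand-in model: M/M/1 mean time in system, W = 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def sensitivity(model, base_inputs, change=0.10):
    """Relative output change when each input alone is raised by `change`."""
    base = model(**base_inputs)
    effects = {}
    for name, value in base_inputs.items():
        bumped = dict(base_inputs, **{name: value * (1 + change)})
        effects[name] = (model(**bumped) - base) / base
    return effects


effects = sensitivity(mean_time_in_system,
                      {"arrival_rate": 8.0, "service_rate": 10.0})
print(effects)
```

In this example a 10% rise in the arrival rate lengthens the time in system by about 67%, while a 10% rise in the service rate shortens it by about 33%, so the output is more sensitive to the arrival rate.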

1.4.6.3 Ensuring Accuracy and Credibility


Document assumptions and limitations.
Perform multiple test runs to check consistency.
Involve domain experts to validate outputs.

1.4.7 Experimentation and Analysis


1.4.7.1 Designing Simulation Experiments
Define different scenarios and test cases.
Use Monte Carlo methods to test variability.
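A Monte Carlo experiment repeats the same scenario many times with independent random numbers and summarizes the spread of the results. The sketch below is an assumption-laden illustration: `monthly_demand` is a hypothetical model with uniform daily demand, and the 95% interval uses the normal approximation (the factor 1.96).

```python
import random
import statistics


def replicate(model, n_runs, base_seed=0):
    """Run `model` n_runs times; return the mean and a 95% half-width."""
    results = [model(random.Random(base_seed + i)) for i in range(n_runs)]
    mean = statistics.mean(results)
    half_width = 1.96 * statistics.stdev(results) / n_runs ** 0.5
    return mean, half_width


def monthly_demand(rng, days=30):
    """Hypothetical scenario: daily demand uniform on 5..15 units."""
    return sum(rng.randint(5, 15) for _ in range(days))


mean, half_width = replicate(monthly_demand, n_runs=200)
print(f"estimated monthly demand: {mean:.1f} +/- {half_width:.1f}")
```

More replications narrow the interval, which is how the number of runs in an experiment is usually chosen.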

1.4.7.2 Analyzing Simulation Outputs


Use statistical methods to summarize results.
Generate graphs and tables for better visualization.
Example:
For a bank queue simulation, outputs may include:
Average waiting time per customer.
Utilization rate of tellers.
Probability of long queues forming.
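These outputs can be computed from a small single-teller simulation. The sketch assumes exponential inter-arrival and service times and uses the share of customers who wait longer than a threshold as a simple proxy for long queues forming; the function name, rates, and threshold are illustrative.

```python
import random


def simulate_bank(n_customers, arrival_rate, service_rate,
                  long_wait=5.0, seed=0):
    """Single-teller queue, first come first served."""
    rng = random.Random(seed)
    arrival = 0.0            # arrival time of the current customer
    teller_free_at = 0.0     # time at which the teller next becomes idle
    waits = []
    busy_time = 0.0
    for _ in range(n_customers):
        arrival += rng.expovariate(arrival_rate)
        start = max(arrival, teller_free_at)   # wait if the teller is busy
        service = rng.expovariate(service_rate)
        teller_free_at = start + service
        waits.append(start - arrival)
        busy_time += service
    return {
        "avg_wait": sum(waits) / n_customers,
        "utilization": busy_time / teller_free_at,
        "p_long_wait": sum(w > long_wait for w in waits) / n_customers,
    }


outputs = simulate_bank(n_customers=1000, arrival_rate=1.0, service_rate=1.25)
print(outputs)
```

Repeating the run with different seeds, or with a second teller modeled separately, would give the variability needed for a statistical summary.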

1.4.7.3 Interpreting Results and Drawing Conclusions

Identify bottlenecks and performance improvements.
Make data-driven recommendations.

1.4.8 Documentation and Reporting
1.4.8.1 Documenting the Simulation Study
Provide detailed model descriptions (assumptions, logic, and constraints).
Include input data sources and validation reports.

1.4.8.2 Preparing Reports and Presentations


Use graphs, charts, and infographics for clear communication.
Present key findings and business recommendations.
Create an executive summary for stakeholders.
Example Presentation Slide Contents:
Objective & Problem Definition
Model Overview
Simulation Scenarios & Assumptions
Results & Key Insights
Recommendations & Next Steps
Summary of Steps in a Simulation Study

| Step | Key Actions |
|---|---|
| Problem Formulation | Define objectives, key questions, and hypotheses. |
| Setting Objectives & Planning | Set goals, allocate resources, and create a timeline. |
| Model Conceptualization | Identify system components and relationships. |
| Data Collection | Identify sources, clean, and validate data. |
| Model Translation | Convert conceptual models into computational ones. |
| Verification & Validation | Ensure model accuracy through testing. |
| Experimentation & Analysis | Run simulations, analyze outputs, and interpret results. |
| Documentation & Reporting | Create detailed reports and presentations. |

1.5 ADVANTAGES AND DISADVANTAGES OF SIMULATION

1.5.1 Advantages of Simulation

1.5.1.1 Flexibility and Scalability
Simulation models can be adapted to different scenarios and conditions.
They can be scaled to accommodate large and small systems without
requiring major changes.
Adjustments can be made in real time to test new policies, strategies, or
external influences.

1.5.1.2 Ability to Model Complex Systems


Simulation allows the representation of systems that have multiple
interdependent variables and dynamic interactions.
It is especially useful for systems that are too complex to analyze with
traditional analytical (closed-form) mathematical models.
Example: Manufacturing plants, traffic systems, and economic models
benefit from simulation.

1.5.1.3 Insight into System Behavior and Performance


Helps identify system inefficiencies, bottlenecks, and areas of
improvement.
Provides a better understanding of how various components interact and
affect overall performance.
Example: In supply chain management, it helps analyze the impact of
different inventory levels on costs and delivery times.

1.5.1.4 Risk-Free Experimentation


Simulation allows organizations to test different scenarios without real-
world risks.
Reduces potential losses by identifying failures before implementation.
Example: Flight simulators are used to train pilots without endangering
lives.

1.5.2 Disadvantages of Simulation


[Link] High Cost and Time-Consuming
Developing a realistic simulation model requires significant time and
effort.

High costs are involved in acquiring simulation software, training
personnel, and running the models.
Example: Building a detailed city-wide traffic simulation requires
extensive data and computational power.

[Link] Complexity in Model Building


Creating a simulation model requires expertise in the domain and
knowledge of simulation techniques.
A poorly designed model may lead to inaccurate or misleading results.
Example: Weather simulation models must incorporate thousands of
variables for accuracy.

[Link] Challenges in Data Collection and Validation


Requires accurate and high-quality data for effective modeling.
Gathering data from multiple sources can be challenging, and errors in
data can affect results.
Example: In healthcare simulations, incorrect patient data can lead to
incorrect predictions about hospital resource needs.

[Link] Interpretation of Results


The usefulness of a simulation depends on how well its results are
interpreted.
Some simulations generate large amounts of data, making it difficult to
extract meaningful insights.
Example: Financial market simulations may produce complex outputs that
require expert interpretation to make investment decisions.

1.6 APPLICATIONS OF SIMULATION


1.6.1 Manufacturing and Production Systems
[Link] Optimizing Production Processes
Simulations help in designing efficient production lines, reducing waste,
and increasing productivity.
Example: An automobile factory can use simulation to test different
assembly line layouts before making changes.

[Link] Analyzing Bottlenecks and Improving Efficiency


Helps identify points where delays occur in the production process and
suggests solutions.

Example: A bottleneck in a food processing plant can be resolved by
modifying the workflow based on simulation results.
1.6.2 Healthcare Systems
[Link] Simulating Patient Flow
Used in hospitals to analyze patient admission rates, wait times, and bed
availability.
Helps optimize hospital operations to reduce patient waiting time.
Example: Emergency room simulations help in resource planning during
peak hours.

[Link] Resource Allocation and Management


Determines the best way to allocate doctors, nurses, and medical
equipment.
Example: During a pandemic, simulations can predict the number of
ventilators and ICU beds needed in different regions.

1.6.3 Transportation and Logistics


[Link] Traffic Simulation
Models traffic flow, congestion patterns, and road design impacts.
Helps in urban planning and infrastructure development.
Example: Traffic light timing can be optimized using simulation to reduce
congestion in a city.

[Link] Supply Chain Management


Optimizes inventory levels, delivery schedules, and warehouse operations.
Reduces transportation costs and improves delivery efficiency.
Example: An e-commerce company can simulate different supply chain
strategies to minimize delivery delays.

1.6.4 Business Process Modeling


Simulates workflows and business operations to improve efficiency.
Helps in decision-making regarding process changes, employee allocation,
and automation.
Example: A bank can use simulation to improve customer service by
optimizing teller allocation based on customer demand.

1.7 SUMMARY

• A system is a collection of components that interact within a defined
  environment. System boundaries define what is included and excluded.
• Components of a System: Systems consist of entities, attributes,
  activities, and states, which define their behavior and functionality.
• Types of Systems: Systems can be deterministic (predictable output)
  or stochastic (random behavior). They can also be continuous (smooth
  changes) or discrete (event-based).
• Types of Models: Models can be physical (scaled representations),
  mathematical (equations), or computer simulations (digital models
  using software).
• Simulation Study Steps: Includes problem formulation, setting
  objectives, data collection, model building, verification & validation,
  experimentation, and reporting.
• Advantages of Simulation: Provides flexibility, models complex
  systems, gives insight into system performance, and allows risk-free
  testing.
• Disadvantages of Simulation: Can be costly and time-consuming,
  requires complex modeling, and poses challenges in data validation and
  result interpretation.
• Applications in Manufacturing: Used to optimize production
  processes and identify bottlenecks.
• Applications in Healthcare: Helps in patient flow simulation and
  resource allocation.
• Applications in Transportation: Used for traffic simulation and
  supply chain management.
• Applications in Business: Supports business process modeling to
  improve efficiency and decision-making.
• Model Validation & Verification: Ensures that the model accurately
  represents the real-world system and functions correctly.
• Data Collection & Cleaning: Essential for building an accurate
  simulation model, requiring reliable sources and validation techniques.
• Simulation Software: Tools like AnyLogic, Arena, and MATLAB are
  used for building and testing simulation models.
• Simulation in Decision Making: Helps organizations evaluate
  scenarios before implementing changes in real-world systems.

1.8 QUESTIONS FOR PRACTICE
1. Define a system and explain its components with examples.
2. What are the key differences between deterministic and stochastic
systems?
3. How do system boundaries affect simulation modeling?
4. Explain the various types of models used in simulation.
5. What are the major steps involved in conducting a simulation study?
6. Discuss the advantages and disadvantages of using simulation.
7. How is simulation applied in manufacturing and healthcare industries?
8. What techniques are used for model verification and validation?
9. How does traffic simulation contribute to transportation planning?
10. What are the key challenges faced in data collection and validation for
simulation models?

2
GENERAL PRINCIPLES OF SIMULATION
Unit Structure :
2.0 Objective
2.1 Introduction
2.2 Concepts of Discrete Event Simulation
2.2.1 Overview of Discrete Event Simulation
[Link] Definition and Key Concepts
[Link] Examples of Discrete Event Simulation Applications
2.2.2 Event Scheduling/Time Advance Algorithm
[Link] Description of the Time Advance Mechanism
[Link] Event Scheduling Approach
[Link] Time Increment Approach
2.2.3 World Views in Simulation
[Link] Event Scheduling World View
Characteristics and Examples
Implementation Details
[Link] Activity Scanning World View
Characteristics and Examples
Implementation Details
[Link] Process Interaction World View
Characteristics and Examples
Implementation Details
2.2.4 Simulation Clock
[Link] Definition and Role of Simulation Clock
[Link] Updating the Simulation Clock
2.3 List Processing
2.3.1 Introduction to List Processing
[Link] Importance of List Processing in Simulation
[Link] Types of Lists Used in Simulation
2.3.2 Managing Events with Event Lists
[Link] Definition and Structure of Event Lists
[Link] Types of Events (Future, Current, Past)

2.3.3 Future Event List (FEL)
[Link] Purpose and Structure
[Link] Insertion and Deletion Operations
[Link] Handling Multiple Events at the Same Time
2.3.4 Current Event List
[Link] Purpose and Structure
[Link] Processing Events in the Current Event List
2.3.5 Past Event List
[Link] Purpose and Structure
[Link] Archiving and Retrieving Events
2.4 State Variables and System State
2.4.1 Definition of State Variables
[Link] Role of State Variables in Simulation
[Link] Examples of State Variables in Different Systems
2.4.2 System State Representation
[Link] Definition and Components
[Link] State Space Diagrams
2.4.3 Updating System State
[Link] State Transition Mechanisms
[Link] Event Handling and State Update Procedures
2.5 Statistical Accumulation in Simulation
2.5.1 Importance of Statistical Accumulation
[Link] Role in Performance Measurement
[Link] Types of Accumulated Statistics
2.5.2 Data Collection Methods
[Link] Real-Time Data Collection
[Link] Batch Data Collection
2.5.3 Performance Metrics
[Link] Throughput
[Link] Utilization
[Link] Response Time
[Link] Queue Length
[Link] Waiting Time

2.6 Random Number Generation
2.6.1 Role of Random Numbers in Simulation
[Link] Generating Random Events
[Link] Stochastic Processes in Simulation
2.6.2 Methods for Generating Random Numbers
[Link] Linear Congruential Generator (LCG)
[Link] Other Random Number Generation Techniques
2.6.3 Testing Random Number Generators
[Link] Uniformity Tests
[Link] Independence Tests
2.7 Random Variate Generation
2.7.1 Transformations of Uniform Random Variables
[Link] Inverse Transform Method
[Link] Acceptance-Rejection Method
2.7.2 Generating Random Variates for Common Distributions
[Link] Exponential Distribution
[Link] Normal Distribution
[Link] Poisson Distribution
[Link] Uniform Distribution
2.8 Simulation Software
2.8.1 Overview of Simulation Software Tools
[Link] Popular Simulation Software Packages
[Link] Criteria for Selecting Simulation Software
2.8.2 Building Simulation Models with Software
[Link] Steps in Model Development
[Link] Input Data Management
[Link] Model Verification and Validation
2.8.3 Case Studies Using Simulation Software
[Link] Examples of Simulation Studies
[Link] Analysis of Simulation Results
2.9 Summary
2.10 Questions for Practice
2.11 References

2.0 OBJECTIVE
This chapter aims to provide a foundational understanding of discrete
event simulation, including event scheduling, list processing, system state
representation, statistical accumulation, and random number generation.
Additionally, it covers simulation software and methodologies for
developing effective simulation models.

2.1 INTRODUCTION
Simulation is a widely applied technique in system modeling, enabling the
study of complex systems by imitating their operations over time. It is
used in various fields such as manufacturing, healthcare, logistics, and
computer networks. Discrete event simulation (DES) models the system as
a sequence of distinct events occurring at specific points in time, allowing
for a detailed analysis of system behavior.

2.2 CONCEPTS OF DISCRETE EVENT SIMULATION


2.2.1 Overview of Discrete Event Simulation
Discrete Event Simulation (DES) is a technique used to model systems
where changes occur at discrete points in time. Instead of continuously
updating the state of the system, changes happen only when specific
events occur. The model tracks these events chronologically and simulates
their effects on the system.
[Link] Definition and Key Concepts
Entities: Objects that pass through the system and interact with its
components.
Example: Customers in a bank, patients in a hospital, or packets in a
network.
Events: Changes in system state that occur at a specific point in time.
Example: A customer arriving at a queue, a machine breaking down, or a
product being completed in an assembly line.
Attributes: Characteristics of entities that help define their state.
Example: The priority of a customer, the size of a packet, or the remaining
processing time of a task.
Queues: Holding areas where entities wait before receiving service.
Example: A waiting line in a supermarket, a queue of jobs in a computer
processor.
Resources: Essential system components required for processing entities.
Example: Bank tellers, hospital beds, or service counters at a fast-food
restaurant.
[Link] Examples of Discrete Event Simulation Applications
DES is widely used in various fields to optimize processes and improve
efficiency. Some common applications include:
Manufacturing: Simulating production lines to reduce bottlenecks and
enhance efficiency.
Healthcare: Managing hospital resources like doctors, beds, and
appointment scheduling.
Logistics: Optimizing warehouse operations and delivery scheduling.
Computer Networks: Evaluating network performance and load
balancing in packet-switched networks.
Retail: Improving customer service strategies by simulating store layouts
and checkout processes.

2.2.2 Event Scheduling/Time Advance Algorithm


In DES, time does not advance in fixed intervals like in continuous
simulation. Instead, it jumps from one event to the next in a non-uniform
manner.

[Link] Description of the Time Advance Mechanism


The simulation clock keeps track of time within the system.
Events are arranged in chronological order in a queue known as the Future
Event List (FEL).
The system does not change continuously; it updates only when an event
occurs.

[Link] Event Scheduling Approach


All future events are stored in a Future Event List (FEL).
The simulation picks the next event from this list and updates the system
state.
Example: In an airport check-in simulation, events like passenger arrival,
luggage check-in, and security clearance occur sequentially.

[Link] Time Increment Approach


The simulation clock moves in fixed time intervals rather than jumping to
the next event.
At each time step, the system state is updated, regardless of whether an
event has occurred.
Example: A temperature monitoring system that updates every second,
even if there is no change.
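
The difference between the two time-advance approaches can be illustrated with a minimal sketch. The function names, the sample event times, and the step size are illustrative assumptions and do not come from the text above.

```python
import heapq


def next_event_advance(event_times):
    """Event scheduling approach: the clock jumps from one event to the next."""
    fel = list(event_times)          # future event list holding only timestamps
    heapq.heapify(fel)
    visited = []
    while fel:
        clock = heapq.heappop(fel)   # clock jumps straight to the next event
        visited.append(clock)
    return visited


def fixed_increment_advance(event_times, step, end_time):
    """Time increment approach: the clock moves in equal steps.

    Returns one (clock, events_handled) pair per step, even when no
    event fell inside that step.
    """
    pending = sorted(event_times)
    visited = []
    i = 0
    n_steps = int(round(end_time / step))
    for k in range(1, n_steps + 1):
        clock = k * step
        due = []
        while i < len(pending) and pending[i] <= clock:
            due.append(pending[i])   # event is only noticed at the step boundary
            i += 1
        visited.append((clock, due))
    return visited
```

With three events in a ten-unit horizon, the first function visits three clock values, while the second visits ten, most of them with nothing to do.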

2.2.3 World Views in Simulation
Different world views are used to model a discrete event simulation.

[Link] Event Scheduling World View


Focus: The occurrence of events in chronological order.
Examples: Queueing systems (e.g., ATM lines, call centers), network
traffic models.
Implementation: A list of future events is maintained, and the simulation
executes them sequentially.

[Link] Activity Scanning World View


Focus: Periodically scans the system conditions to determine which events
should be executed.
Examples: Inventory management systems, automated warehouse
operations.
Implementation: Uses a cyclic approach to check system state changes at
fixed time intervals.

[Link] Process Interaction World View


Focus: The movement and interactions of entities in the system.
Examples: Banking systems (customers interacting with tellers), customer
service centers, transportation systems.
Implementation: Entities follow predefined process flows and interact with
system resources.

2.2.4 Simulation Clock


The simulation clock is responsible for managing the passage of time in
DES.

[Link] Definition and Role of Simulation Clock


The simulation clock tracks time progression within the model.
It determines when each event will occur and the order in which events are
processed.

[Link] Updating the Simulation Clock


The clock jumps forward to the time of the next event rather than
progressing continuously.
Ensures events occur in the correct chronological order for accurate
system simulation.

2.3 LIST PROCESSING
In Discrete Event Simulation (DES), events drive the system, and list
processing is a crucial technique for managing these events efficiently.
Event lists help in organizing and executing events in the correct order,
ensuring the simulation runs smoothly.

2.3.1 Introduction to List Processing


Importance of List Processing

List processing is essential for:
• Efficient event scheduling and execution: ensures events are
  processed in chronological order.
• Optimizing simulation performance: reduces computational
  overhead by maintaining a structured approach to event management.
• Managing system state changes: allows proper handling of system
  updates when events occur.
Types of Lists Used in Simulation
There are three main types of event lists used in simulation:

1. Future Event List (FEL):


Stores events scheduled to occur in the future.
Events are sorted in ascending order of time.
When the simulation clock advances, the next event is fetched from
FEL.
Example: A hospital simulation where future events include scheduled
patient check-ups or surgeries.
2. Current Event List:
Contains events that are actively being processed at the current
simulation time.
The event at the top of the FEL becomes the current event when it is
time to execute it.
Example: A manufacturing plant where the current event is a machine
completing a product.

3. Past Event List:


Stores already executed events for record-keeping or debugging
purposes.
Helps in tracking system history and analyzing performance trends.
Example: A traffic simulation where past events include vehicles that
have completed their journey.
2.3.2 Managing Events with Event Lists
[Link] Definition and Structure of Event Lists
An event list is a data structure used to store and manage scheduled
events. It follows:
Priority-based ordering: Events are stored based on their scheduled
execution time.
Efficient retrieval: The earliest event is always at the top and executed
first.
Dynamic updating: New events can be inserted, modified, or removed
during simulation.
Common Data Structures for Event Lists
To efficiently manage event lists, different data structures are used:

1. Linked List:
Each event is stored as a node, linked to the next event in order of
execution time.
Efficient insertion and deletion once the position is known, but finding
the correct position for a new event requires a linear search.

2. Heap (Priority Queue):


A binary heap is commonly used to store events.
The event with the earliest time is at the root and is processed first.
Insertion and removal of the earliest event each take O(log n) time, which
scales better than a sorted linked list when the event list is long.

3. Sorted Array:
Events are stored in an array sorted by time.
Fast access to the next event but slow insertion for new events.
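
As a rough illustration of the heap-based option, the sketch below builds a small future event list on Python's standard `heapq` module. The class name `FutureEventList`, its method names, and the tie-breaking counter are assumptions made for this example, not a prescribed design.

```python
import heapq
import itertools


class FutureEventList:
    """Minimal heap-backed event list ordered by scheduled time."""

    def __init__(self):
        self._heap = []
        # The counter breaks ties so that events scheduled for the same
        # time are returned in the order in which they were inserted.
        self._counter = itertools.count()

    def schedule(self, time, name):
        """Insert an event; O(log n)."""
        heapq.heappush(self._heap, (time, next(self._counter), name))

    def pop_next(self):
        """Remove and return the earliest event as (time, name); O(log n)."""
        time, _, name = heapq.heappop(self._heap)
        return time, name

    def __len__(self):
        return len(self._heap)
```

Events can be scheduled in any order; `pop_next` always returns the one with the smallest time stamp.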

[Link] Types of Events


Events in simulation models are categorized based on their execution
status.

Future Events:
Definition: Events that are scheduled to occur at a later time.
Stored in: Future Event List (FEL).
Execution: Processed when their scheduled time arrives.

Example:
A customer's expected arrival at a bank in 10 minutes.
A factory machine breakdown scheduled after 2 hours.

Current Events:
Definition: Events that are being processed at the current simulation time.
Stored in: Current Event List.
Execution: These are removed from the FEL and executed immediately.

Example:
A patient currently being treated in a hospital.
A vehicle currently passing a traffic signal.

Past Events:
Definition: Events that have already been executed.
Stored in: Past Event List for analysis or debugging.
Usage: Helps in tracking system performance and auditing event history.

Example:
Completed orders in an e-commerce warehouse.
Finished customer service interactions in a call center.

2.4 STATE VARIABLES AND SYSTEM STATE


In Discrete Event Simulation (DES), the system state represents the
overall condition of the system at any given time. State variables track
these conditions and change dynamically as events occur. Understanding
state variables and system state is essential for modeling real-world
scenarios accurately.

2.4.1 Definition of State Variables


What are State Variables?
State variables are variables that describe the current status of a system
and determine how it evolves over time. These variables:
Capture system conditions at a given time.
Change dynamically as events occur in the system.
Are used to compute performance metrics, such as average waiting time or
system utilization.

[Link] Role of State Variables in Simulation
State variables serve several important roles:
Track the status of entities and resources in the system.
Update the system state when an event occurs.
Help compute key performance metrics, such as response time,
throughput, and utilization.
For example, in a queue-based system, the number of entities in a queue is
a state variable that changes whenever a new entity arrives or is served.

[Link] Examples of State Variables in Different Systems


Different types of systems use different state variables. Here are some
examples:

System                  State Variable Example
Banking System          Number of customers in queue
Manufacturing System    Number of machines currently in operation
Traffic System          Number of cars at a signal
Hospital System         Number of patients waiting for treatment
Inventory System        Quantity of stock available in a warehouse

Each of these state variables changes based on events, such as customer
arrivals, machine breakdowns, or restocking events.

2.4.2 System State Representation


[Link] Definition and Components
A system state represents all relevant information about a system at a
specific point in time. It consists of:
Queues: Track waiting entities (e.g., customers, jobs).
Resources: Define system capacity (e.g., servers, machines).
Entity Attributes: Describe specific properties of each entity (e.g.,
remaining processing time).
For example, in a restaurant simulation, the system state could include:
Number of customers in queue (Queue).
Number of available tables (Resource).
Order status for each customer (Entity attribute).

[Link] State Space Diagrams
A State Space Diagram is a graphical representation of all possible system
states and the transitions between them.
Nodes represent different states.
Arrows indicate possible transitions between states.
Events cause transitions from one state to another.
Example: Bank Queue System

Suppose a bank has two service counters and customers arrive at
random times. A simple state space diagram could be:

State 0 <-> State 1 <-> State 2 <-> ...
(arrival: move right; service completion: move left)

Where:
State 0: No customers in queue.
State 1: One customer waiting.
State 2: Two customers waiting, and so on.
If a service counter becomes free, a transition occurs to a lower queue
state.

2.4.3 Updating System State


[Link] State Transition Mechanisms
State transitions occur due to event execution. The main state transition
mechanisms are:
Arrival Events: Increase the queue size (e.g., a customer arrives at a bank).
Departure Events: Reduce the queue size (e.g., a customer is served and
leaves).
Resource Allocation Events: Change system capacity (e.g., a machine
breaks down).
Process Completion Events: Modify entity attributes (e.g., a
manufacturing job is completed).
Example: Customer Queue at a Bank
State before event: 3 customers in queue.

Event: One customer gets served and leaves.
State after event: 2 customers in queue.
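
The arrival and departure transitions can be sketched as two small functions acting on a state object. The `BankState` class, its field names, and the rule that a waiting customer immediately takes a freed teller are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BankState:
    in_queue: int = 0       # customers waiting (state variable)
    free_tellers: int = 2   # idle service counters (state variable)


def arrive(state):
    """Arrival event: take a free teller if one exists, otherwise join the queue."""
    if state.free_tellers > 0:
        state.free_tellers -= 1
    else:
        state.in_queue += 1


def depart(state):
    """Departure event: the freed teller serves the next waiting customer, if any."""
    if state.in_queue > 0:
        state.in_queue -= 1      # head of the queue moves into service
    else:
        state.free_tellers += 1  # nobody waiting, so the teller becomes idle
```

Starting from three waiting customers and no free tellers, one call to `depart` leaves two customers in the queue, matching the example above.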

[Link] Event Handling and State Update Procedures


Every time an event occurs, the system state must be updated properly to
reflect the new conditions. The key steps in event handling and state
updates are:
1. Retrieve the next event from the Future Event List (FEL).
2. Advance the simulation clock to the event’s scheduled time.
3. Update system state by modifying state variables.
4. Schedule new events based on system logic.
5. Store the executed event in the Past Event List for tracking.
Example: Inventory System Simulation
Before event: Warehouse has 50 units of stock.
Event: A customer purchases 10 units.
After event: Warehouse now has 40 units of stock.
If the stock drops below a threshold, a new event (restocking order) might
be scheduled.
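
The event-handling steps and the inventory example can be combined into one short sketch. The function name `run_inventory`, the event tuple layout, and the parameters `reorder_point`, `order_size`, and `lead_time` are hypothetical; unmet demand is simply lost in this simplified version.

```python
import heapq


def run_inventory(initial_stock, purchases, reorder_point, order_size, lead_time):
    """Process purchase events and schedule restocking when stock runs low.

    purchases: list of (time, quantity) pairs.
    Returns the past event list as (time, kind, stock_after_event) tuples.
    """
    # Each FEL entry is (time, sequence number, kind, quantity).
    fel = [(t, i, "purchase", qty) for i, (t, qty) in enumerate(purchases)]
    heapq.heapify(fel)
    seq = len(fel)
    clock = 0.0
    stock = initial_stock
    on_order = False
    past_events = []

    while fel:
        time, _, kind, qty = heapq.heappop(fel)   # retrieve the next event
        clock = time                               # advance the clock
        if kind == "purchase":                     # update the system state
            stock -= min(qty, stock)
        else:
            stock += qty
            on_order = False
        if stock < reorder_point and not on_order:  # schedule a new event
            heapq.heappush(fel, (clock + lead_time, seq, "restock", order_size))
            seq += 1
            on_order = True
        past_events.append((clock, kind, stock))   # record in the past event list
    return past_events
```

Starting with 50 units, a purchase of 10 units leaves 40; a later purchase that pushes stock below the reorder point triggers a restock event `lead_time` units after it.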

2.5 STATISTICAL ACCUMULATION IN SIMULATION


Statistical accumulation in simulation refers to the collection and analysis
of performance-related data throughout the simulation process. It is crucial
for evaluating system efficiency, identifying bottlenecks, and optimizing
performance.

2.5.1 Importance of Statistical Accumulation


[Link] Role in Performance Measurement
Statistical accumulation helps in quantifying system performance by
tracking key metrics over time. It allows for:
Evaluating system efficiency by analyzing throughput, waiting times, and
utilization.
Identifying bottlenecks where delays or inefficiencies occur.
Comparing different scenarios or system configurations to determine the
best approach.
Making data-driven decisions for improving system operations.

For example, in a customer service center simulation, accumulating
statistics like average wait time and service rate helps determine if
additional staff is needed.

[Link] Types of Accumulated Statistics


There are two main types of accumulated statistics in simulation:
Time-Persistent Statistics
Measure quantities that have a value at every instant of simulated time.
Examples: Queue length over time, resource utilization percentage.
Computed using time-weighted averages.
Event-Based Statistics
Measure discrete events that occur at specific points in time.
Examples: Number of arrivals, departures, service completions.
Computed using count-based methods.
Example: In a manufacturing system, tracking the number of completed
products (event-based) and average machine utilization (time-persistent)
helps improve efficiency.
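
A time-weighted average differs from a plain count because each value is weighted by how long it persisted. The sketch below shows one way to compute it; the function name and the `(time, value)` input format are assumptions for illustration.

```python
def time_weighted_average(changes, end_time):
    """Average of a piecewise-constant quantity over [start, end_time].

    changes: list of (time, value) pairs sorted by time; each value holds
    from its time stamp until the next change (or until end_time).
    """
    start_time = changes[0][0]
    area = 0.0
    boundaries = changes[1:] + [(end_time, None)]
    for (t0, value), (t1, _) in zip(changes, boundaries):
        area += value * (t1 - t0)   # value weighted by its duration
    return area / (end_time - start_time)


def event_count(event_times, start, end):
    """Event-based statistic: number of events observed in [start, end]."""
    return sum(1 for t in event_times if start <= t <= end)
```

For a queue that holds 0 customers for 2 time units, 1 for 3, 2 for 1, and 1 for 4, the time-weighted average over 10 units is 0.9, whereas the simple mean of the four recorded values would be 1.0.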

2.5.2 Data Collection Methods


[Link] Real-Time Data Collection
Definition: Data is collected continuously as the simulation runs.
Used for:
Monitoring system performance in real-time.
Adjusting parameters dynamically based on observed trends.
Creating live dashboards for decision-making.
Example: In a traffic simulation, real-time vehicle count at an intersection
is recorded to adjust signal timings.

[Link] Batch Data Collection


Definition: Data is stored in batches and processed later for analysis.
Used for:
Post-simulation analysis to evaluate overall system performance.
Running multiple scenarios to compare results.
Reducing computational load by processing data after execution.

Example: In a retail store simulation, batch collection of daily sales
transactions helps in demand forecasting.
2.5.3 Performance Metrics
Performance metrics are key indicators used to evaluate the efficiency of a
system during a simulation.

[Link] Throughput
Definition: The number of entities (customers, products, tasks) processed
per unit time.
Formula:
$$\text{Throughput} = \frac{\text{Number of entities processed}}{\text{Total time}}$$
Example: In a manufacturing plant simulation, throughput could be
measured as products produced per hour.

[Link] Utilization
Definition: The percentage of time a resource (server, machine, worker) is
busy.
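Formula:
$$\text{Utilization} = \frac{\text{Time the resource is busy}}{\text{Total available time}} \times 100\%$$
Example: In a cloud computing system, server utilization tracks how
efficiently computing resources are used.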
[Link] Response Time
Definition: The total time taken for an entity to be processed from arrival
to completion.
Formula:
$$\text{Response Time} = \text{Completion time} - \text{Arrival time}$$
Example: In an online ordering system, response time includes order
processing and delivery time.

[Link] Queue Length


Definition: The average number of entities waiting in a queue.

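Formula:
$$\bar{Q} = \frac{1}{T}\int_0^T Q(t)\,dt$$
where Q(t) is the number of entities in the queue at time t and T is the
total simulation time.
Example: In a hospital simulation, queue length measures the number of
patients waiting for consultation.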

[Link] Waiting Time


Definition: The time an entity spends waiting before being processed.
Formula:
$$\text{Waiting Time} = \text{Service start time} - \text{Arrival time}$$
Example: In a call center, waiting time represents the time a customer
spends on hold before speaking to an agent.
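
Several of these metrics can be computed directly from per-entity time stamps. The sketch below assumes a single resource and a hypothetical record format `(arrival, service_start, departure)`; utilization is returned as a fraction rather than a percentage.

```python
def performance_metrics(records, total_time):
    """Compute basic metrics from (arrival, service_start, departure) records.

    Assumes a single server, so busy time is the sum of service durations.
    """
    n = len(records)
    busy_time = sum(dep - start for _, start, dep in records)
    return {
        # entities processed per unit time
        "throughput": n / total_time,
        # fraction of the run during which the server was busy
        "utilization": busy_time / total_time,
        # time from arrival until service begins, averaged over entities
        "avg_waiting_time": sum(start - arr for arr, start, _ in records) / n,
        # time from arrival until completion, averaged over entities
        "avg_response_time": sum(dep - arr for arr, _, dep in records) / n,
    }
```

Response time for each entity equals its waiting time plus its service time, so the two averages differ by the mean service duration.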

2.6 RANDOM NUMBER GENERATION


Random number generation plays a crucial role in simulation, enabling the
modeling of uncertainty and variability in real-world systems. It provides
the foundation for simulating stochastic processes, generating random
events, and producing realistic outcomes in simulations.

2.6.1 Role of Random Numbers in Simulation


Random numbers are essential in simulation to introduce variability and
randomness, making the model behavior more realistic.

[Link] Generating Random Events


Many simulation models require random events, such as customer arrivals,
machine breakdowns, or service times.
Random numbers help determine when events occur and what outcomes
they produce.
Example: In a bank queue simulation, random numbers determine when
the next customer arrives and how long they will take for service.

[Link] Stochastic Processes in Simulation


A stochastic process is a system that evolves over time with some level of
randomness.
Simulations of real-world phenomena often rely on probability
distributions to model arrival rates, failure rates, service times, etc.
Example: In weather modeling, random numbers help simulate
temperature fluctuations, wind speed variations, and precipitation levels.

2.6.2 Methods for Generating Random Numbers
[Link] Linear Congruential Generator (LCG)
One of the most commonly used pseudo-random number generators
(PRNGs).
Uses a mathematical formula to generate a sequence of numbers that
appear random.
Formula:
$$X_{n+1} = (a X_n + c) \bmod m$$
Where:
X_n = Current random number (X_0 is the seed)
X_{n+1} = Next random number
a = Multiplier
c = Increment
m = Modulus
Properties:
Fast and efficient.
Produces deterministic sequences (same sequence if initialized with the
same seed).
Must carefully choose parameters to ensure good randomness.
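
A minimal sketch of a linear congruential generator follows. The function names are illustrative; the default constants (a = 1664525, c = 1013904223, m = 2^32) are one commonly cited parameter set and are used here only as an example choice.

```python
def lcg_sequence(seed, a, c, m, n):
    """Return the first n integers produced by X_{k+1} = (a * X_k + c) mod m."""
    values = []
    x = seed
    for _ in range(n):
        x = (a * x + c) % m
        values.append(x)
    return values


def lcg_uniforms(seed, n, a=1664525, c=1013904223, m=2**32):
    """Scale the integer sequence to floats in [0, 1)."""
    return [x / m for x in lcg_sequence(seed, a, c, m, n)]
```

With the toy parameters a = 5, c = 3, m = 16 and seed 7, the sequence starts 6, 1, 8 and visits all 16 residues before repeating. Calling either function twice with the same seed reproduces the same sequence, which is what makes simulation runs repeatable.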

[Link] Other Random Number Generation Techniques


Mid-Square Method
Squares a number and takes the middle digits as the next random number.
Simple but not widely used due to short cycles.
Lagged Fibonacci Generator
Generates numbers using a recursive formula based on previous values.
Can offer a longer period than an LCG when its lags are chosen well.
Mersenne Twister
A widely used PRNG known for its long period and high-quality
randomness.
Used in many programming languages (Python, MATLAB, R).
Hardware Random Number Generators
Use physical processes like electrical noise to generate true random
numbers.
Used in cryptography and security applications.

2.6.3 Testing Random Number Generators


Ensuring that a random number generator produces high-quality random
numbers is crucial for accurate simulation results. Two key properties to
test are uniformity and independence.

[Link] Uniformity Tests


Check whether the generated numbers are evenly distributed across the
expected range.
Methods:
Chi-Square Test
Compares the expected vs. observed frequencies of numbers in different
intervals.
If the difference is too large, the generator may not be uniform.
Kolmogorov-Smirnov Test
Compares the empirical distribution of generated numbers with a uniform
distribution.
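
The chi-square comparison of observed and expected frequencies can be sketched as follows. The function name and the choice of 10 equal-width intervals are assumptions; 16.919 is the standard tabulated critical value for 9 degrees of freedom at the 5% significance level.

```python
def chi_square_statistic(samples, bins=10):
    """Chi-square statistic for samples in [0, 1) against a uniform distribution."""
    observed = [0] * bins
    for u in samples:
        index = min(int(u * bins), bins - 1)   # guard against u == 1.0
        observed[index] += 1
    expected = len(samples) / bins             # equal count expected per interval
    return sum((o - expected) ** 2 / expected for o in observed)


# Critical value for bins - 1 = 9 degrees of freedom at alpha = 0.05.
CHI2_CRITICAL_9DF_5PCT = 16.919


def looks_uniform(samples):
    """Return True if uniformity is not rejected at the 5% level (10 bins)."""
    return chi_square_statistic(samples, bins=10) <= CHI2_CRITICAL_9DF_5PCT
```

A perfectly even spread gives a statistic of 0, while samples concentrated in one interval give a very large value and are rejected.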

[Link] Independence Tests


Ensures that generated numbers do not follow a predictable pattern.
Methods:
Autocorrelation Test
Measures how dependent a random number is on previous numbers in the
sequence.
If the numbers are significantly correlated, the generator fails the
independence requirement.
Runs Test (Up and Down Test)
Checks if numbers appear in a random order or show an
increasing/decreasing pattern.
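
As a simple illustration of the autocorrelation idea, the sketch below computes the ordinary sample autocorrelation at a given lag. This is a simplified stand-in for the formal test statistic, and the function name is an assumption.

```python
def lag_autocorrelation(samples, lag=1):
    """Sample autocorrelation of a sequence at the given lag.

    Values near 0 are consistent with independence; values near +1 or -1
    indicate that each number strongly depends on an earlier one.
    """
    n = len(samples)
    mean = sum(samples) / n
    variance_sum = sum((x - mean) ** 2 for x in samples)
    covariance_sum = sum(
        (samples[i] - mean) * (samples[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum
```

A strictly alternating sequence such as 0, 1, 0, 1, ... has a lag-1 autocorrelation close to -1, which exposes a pattern that a uniformity test alone would not detect.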

2.7 RANDOM VARIATE GENERATION


Random variate generation is the process of creating random samples from
a specific probability distribution. Since most random number generators
produce uniformly distributed numbers, special techniques are required to
generate non-uniform random variates for distributions like exponential,
normal, and Poisson.
2.7.1 Transformations of Uniform Random Variables
Most pseudo-random number generators (PRNGs) generate numbers
uniformly between 0 and 1. These numbers must be transformed to follow
a desired probability distribution.

[Link] Inverse Transform Method


The Inverse Transform Method is a widely used technique for generating
random variates from any probability distribution whose cumulative
distribution function (CDF) can be inverted.
Steps:
1. Generate a uniform random number U ∼ U(0,1).
2. Apply the inverse CDF of the desired distribution:
$$X = F^{-1}(U)$$
where $F^{-1}$ is the inverse of the cumulative distribution function F.
The generated value X follows the target distribution.

[Link] Acceptance-Rejection Method


The Acceptance-Rejection Method is used when the inverse transform
method is difficult to apply.
Steps:
1. Generate a candidate sample Y from an easy-to-sample distribution
(called the proposal distribution).
2. Generate a uniform random number U ∼ U(0,1).
3. Accept Y if:
$$U \le \frac{f(Y)}{c\,g(Y)}$$
where:
f(Y) is the desired probability density function (PDF).
g(Y) is the proposal distribution's PDF.
c is a constant ensuring $f(y) \le c\,g(y)$ for all y.
4. If the condition is not met, reject Y and repeat the process.
Example: Generating Normal Variates
The Normal distribution does not have a simple inverse CDF.
A common approach is to use the Acceptance-Rejection Method with an
easier proposal distribution (such as a Cauchy or Exponential distribution).
The Box-Muller Transform is another method to generate normal variates.

2.7.2 Generating Random Variates for Common Distributions

2.7.2.1 Exponential Distribution
Used for modeling inter-arrival times in queueing systems and failure
times in reliability analysis.
Formula (Inverse Transform Method):
X = −ln(U) / λ

where U ∼ U(0,1) and λ is the rate parameter.
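
The formula follows from inverting the exponential CDF. A short derivation, using the same symbols as above:

```latex
\begin{align*}
F(x) &= 1 - e^{-\lambda x}, \qquad x \ge 0 \\
U = 1 - e^{-\lambda X} \;&\Longrightarrow\; X = -\frac{\ln(1 - U)}{\lambda} \\
1 - U \sim U(0,1) \;&\Longrightarrow\; X = -\frac{\ln U}{\lambda}
\end{align*}
```

The last step uses the fact that 1 − U is uniform on (0,1) whenever U is, so either form produces exponential variates.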

2.7.2.2 Normal Distribution


Used in natural phenomena, finance, and machine learning.
Box-Muller Transform (Common Method)
Given two independent uniform random numbers U_1 and U_2 on (0,1),
compute:
Z_1 = \sqrt{-2 \ln U_1} \cos(2\pi U_2)
Z_2 = \sqrt{-2 \ln U_1} \sin(2\pi U_2)
Both Z_1 and Z_2 follow a standard normal distribution N(0,1) and are
independent of each other.
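
A minimal sketch of the Box-Muller transform in Python, using the standard form Z1 = sqrt(−2 ln U1) cos(2π U2) and Z2 = sqrt(−2 ln U1) sin(2π U2). The function name and seed are assumptions; U1 is shifted to (0, 1] so that the logarithm is always defined.

```python
import math
import random

def box_muller(rng):
    """Return two independent N(0,1) variates from two uniforms."""
    u1 = 1.0 - rng.random()  # in (0, 1], avoids log(0)
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)

rng = random.Random(42)  # arbitrary seed
zs = []
for _ in range(10_000):
    zs.extend(box_muller(rng))

mean = sum(zs) / len(zs)
var = sum((z - mean) ** 2 for z in zs) / (len(zs) - 1)
print("sample mean:", mean, "sample variance:", var)
```

A variate with mean μ and standard deviation σ is then obtained as μ + σZ.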

2.7.2.3 Poisson Distribution


Used for modeling count-based events like customer arrivals and network
packet transmissions.

If X ∼ Poisson(λ), then the probability mass function (PMF) is:
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}
where k = 0, 1, 2, …
Algorithm (Using Exponential Waiting Times):
1. Generate a sequence of interarrival times from an exponential
   distribution with rate λ.
2. Count the number of arrivals within a fixed time period of length 1;
   this count follows a Poisson(λ) distribution.
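
The algorithm above can be sketched as follows. The function name, the rate of 4 arrivals per unit time, and the seed are assumptions for illustration; interarrival times are drawn with the inverse transform method.

```python
import math
import random

def poisson_variate(lam, rng):
    """Count exponential interarrival times that fit in one unit of time."""
    elapsed = 0.0
    count = 0
    while True:
        # exponential interarrival time with rate lam
        elapsed += -math.log(1.0 - rng.random()) / lam
        if elapsed > 1.0:
            return count
        count += 1

rng = random.Random(3)  # arbitrary seed
lam = 4.0
draws = [poisson_variate(lam, rng) for _ in range(20_000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print("sample mean:", mean, "sample variance:", var, "lambda:", lam)
```

For a Poisson distribution the mean and the variance both equal λ, which gives a quick sanity check on the generated counts.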

2.7.2.4 Uniform Distribution


A continuous uniform distribution on [a,b] has a simple inverse CDF:
X = a + (b − a)U
where U ∼ U(0,1)

2.8 SIMULATION SOFTWARE


Simulation software allows users to model, analyze, and optimize complex
systems in various domains, such as manufacturing, healthcare, finance,
and logistics. These tools help visualize real-world processes, perform
what-if analysis, and improve decision-making.

2.8.1 Overview of Simulation Software Tools


Simulation tools vary in their capabilities, ease of use, and application
domains. Some are designed for discrete-event simulation (DES), while
others focus on system dynamics or agent-based modeling.

2.8.1.1 Popular Simulation Software Packages

Software                                  Type                                        Application
AnyLogic                                  Hybrid (DES, Agent-Based, System Dynamics)  Supply Chain, Healthcare, Business Processes
ARENA                                     Discrete-Event Simulation                   Manufacturing, Logistics, Call Centers
SIMUL8                                    Discrete-Event Simulation                   Industrial & Business Process Simulation
MATLAB Simulink                           System Dynamics                             Control Systems, Engineering, Robotics
FlexSim                                   3D Simulation                               Manufacturing, Warehousing, Healthcare
GPSS (General Purpose Simulation System)  Discrete-Event Simulation                   Military, Logistics, Industrial Engineering
NS-3 (Network Simulator)                  Event-Driven Network Simulation             Wireless Networks, IoT, 5G
NetLogo                                   Agent-Based Modeling                        Social Science, Biology, Economics
Python (SimPy, PySCeS, SciPy)             Custom Simulations                          General-Purpose, Research

2.8.1.2 Criteria for Selecting Simulation Software
When choosing a simulation tool, consider:
Type of Simulation Needed – Discrete-event, system dynamics, or agent-
based?
Ease of Use – Does it have a graphical interface, or is coding required?
Scalability – Can it handle large and complex systems?
Integration – Can it work with external databases, APIs, or analytics tools?
Performance – Speed, memory usage, and computational efficiency.
Visualization and Reporting – Graphical output, dashboards, and data
export options.
Cost and Licensing – Open-source vs. commercial software.

2.8.2 Building Simulation Models with Software


Developing a simulation model involves multiple steps, from defining
objectives to validating results.

2.8.2.1 Steps in Model Development


Problem Definition – Identify the system to be modeled and key
objectives.
System Conceptualization – Define system components, interactions, and
constraints.
Data Collection – Gather input data, including process times, arrival rates,
and resource availability.
Model Formulation – Choose the appropriate simulation method and
software.
Model Implementation – Develop the simulation using software tools.
Verification and Validation – Ensure model accuracy through testing.
Experimentation and Analysis – Run simulations, adjust parameters, and
evaluate results.
Decision Making & Reporting – Present findings and make
recommendations.

2.8.2.2 Input Data Management


Simulation requires high-quality input data for realistic results. Sources
include:
Historical Data – From databases, logs, and reports.

Real-Time Data – Sensor readings, IoT devices, and APIs.
Synthetic Data – Generated using random variate generation methods.
Expert Estimates – When real-world data is unavailable.
Data should be cleaned, pre-processed, and stored efficiently.

2.8.2.3 Model Verification and Validation


Verification ensures the simulation model runs correctly and follows the
intended logic.
Example: Checking if customer arrivals follow the specified Poisson
distribution.
Validation ensures the model accurately represents the real-world system.
Example: Comparing simulation output with real-world observations.

Common techniques for validation:

• Face Validation – Experts review model assumptions.

• Comparison with Historical Data – Matching outputs with real-world
metrics.

• Sensitivity Analysis – Testing model behavior under different
conditions.

2.8.3 Case Studies Using Simulation Software


2.8.3.1 Examples of Simulation Studies

• Healthcare Simulation (ARENA, SIMUL8, AnyLogic)
Objective: Reduce patient wait times in emergency rooms.
Solution: Simulated staffing levels, patient arrivals, and treatment
durations.
Outcome: Optimized scheduling to improve service efficiency.

• Manufacturing Process Optimization (FlexSim, SIMUL8)
Objective: Increase production efficiency.
Solution: Simulated assembly lines, identified bottlenecks, and optimized
machine utilization.
Outcome: 20% reduction in production cycle time.

• Traffic Flow Simulation (MATLAB Simulink, AnyLogic)
Objective: Improve traffic management at an intersection.
Solution: Simulated vehicle arrivals, signal timings, and pedestrian
crossings.
Outcome: Optimized signal timing, reducing congestion by 30%.

2.8.3.2 Analysis of Simulation Results


After running simulations, results are analyzed using statistical methods:

• Key Performance Metrics:
Throughput – Number of processed entities per unit time.
Utilization – Percentage of time a resource is busy.
Queue Length – Average number of entities in a queue.
Response Time – Time taken to complete a process.

• Visualization Tools:
Histograms & Box Plots – Show distributions of system performance.
Time Series Charts – Track performance over time.
Heatmaps – Identify congestion points in traffic and logistics systems.
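
As a rough sketch of how the four performance metrics can be computed, the function below processes a first-come, first-served single-server queue from given arrival and service times. The function name, the dictionary keys, and the three-customer input are hypothetical; utilization is reported as a fraction of time rather than a percentage, and the observation window runs from time 0 to the last departure.

```python
def single_server_metrics(arrival_times, service_times):
    """FIFO single-server queue; returns the four metrics as a dict."""
    server_free_at = 0.0
    total_wait = 0.0       # time spent waiting in the queue
    total_response = 0.0   # waiting time plus service time
    for arrival, service in zip(arrival_times, service_times):
        start = max(arrival, server_free_at)
        server_free_at = start + service
        total_wait += start - arrival
        total_response += server_free_at - arrival
    n = len(arrival_times)
    horizon = server_free_at  # time of the last departure
    return {
        "throughput": n / horizon,
        "utilization": sum(service_times) / horizon,
        "avg_queue_length": total_wait / horizon,  # time-average number waiting
        "avg_response_time": total_response / n,
    }

# Hypothetical input: arrivals at t = 0, 1, 2, each needing 2 time units
metrics = single_server_metrics([0.0, 1.0, 2.0], [2.0, 2.0, 2.0])
print(metrics)
```

The average queue length uses the fact that the area under the queue-length curve equals the sum of the individual waiting times.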

2.9 SUMMARY
• Discrete Event Simulation (DES) – A modeling technique where
events occur at discrete points in time, commonly used in queueing
systems, manufacturing, and logistics.

• Event Scheduling Algorithm – Uses time advance mechanisms
(event scheduling, time increment) to progress simulation time based
on upcoming events.

• Simulation World Views – Three main approaches: Event
Scheduling, Activity Scanning, and Process Interaction, each with
different event-handling strategies.

• Simulation Clock – Tracks and updates simulation time, advancing
from one event to the next.

• List Processing in Simulation – Involves managing the Future Event
List (FEL), Current Event List, and Past Event List to handle event
execution efficiently.

• System State Representation – Uses state variables to describe
system conditions and state space diagrams to visualize transitions.

• Statistical Accumulation – Measures system performance using
metrics like throughput, utilization, response time, queue length, and
waiting time.

• Random Number & Variate Generation – Essential for stochastic
simulation, with methods like the Linear Congruential Generator (LCG)
and the Inverse Transform Method.

• Simulation Software Tools – Popular tools include ARENA, SimPy,
AnyLogic, and MATLAB Simulink, used for modeling, analyzing, and
validating simulations.

• Model Development & Validation – Requires steps like defining
objectives, input data management, model verification, and sensitivity
analysis to ensure accuracy.

2.10 QUESTIONS FOR PRACTICE


1) What is Discrete Event Simulation (DES)? Give an example of its
application.
2) Explain the difference between the Event Scheduling, Activity
Scanning, and Process Interaction world views.
3) Describe the role of the simulation clock and how it updates in a DES.
4) What are the different types of event lists used in simulation? How is
the Future Event List (FEL) managed?
5) How does the Time Advance Algorithm work in event-based
simulation?
6) Define system state variables and explain their importance in
simulation models.
7) What are the main performance metrics used in simulation? Provide a
brief description of each.
8) Explain how the Linear Congruential Generator (LCG) produces
random numbers.
9) What is the Inverse Transform Method? How is it used to generate
random variates?
10) List and describe common probability distributions used in simulation
(e.g., Exponential, Normal, Poisson).
11) What are the key factors in selecting a simulation software package?
12) How do model verification and validation ensure the accuracy of a
simulation?

3
STATISTICAL MODELS IN SIMULATION
Unit Structure :
3.0 Objective
3.1 Introduction
3.2 Useful Statistical Models
3.2.1 Overview of Statistical Models
3.2.1.1 Definition and Importance in Simulation
3.2.1.2 Role in Representing Real-World Systems
3.2.2 Applications in Simulation
3.2.2.1 Examples of Statistical Models in Different Domains
3.2.2.2 Case Studies Demonstrating Applications
3.3 Discrete Distributions
3.3.1 Bernoulli Distribution
3.3.1.1 Definition and Properties
3.3.1.2 Examples and Applications
3.3.1.3 Generating Random Variates
3.3.2 Binomial Distribution
3.3.2.1 Definition and Properties
3.3.2.2 Relationship to Bernoulli Distribution
3.3.2.3 Examples and Applications
3.3.2.4 Generating Random Variates
3.3.3 Geometric Distribution
3.3.3.1 Definition and Properties
3.3.3.2 Memoryless Property
3.3.3.3 Examples and Applications
3.3.3.4 Generating Random Variates
3.3.4 Poisson Distribution
3.3.4.1 Definition and Properties
3.3.4.2 Relationship to Binomial Distribution
3.3.4.3 Examples and Applications
3.3.4.4 Generating Random Variates
3.4 Continuous Distributions
3.4.1 Uniform Distribution
3.4.1.1 Definition and Properties
3.4.1.2 Examples and Applications
3.4.1.3 Generating Random Variates
3.4.2 Exponential Distribution
3.4.2.1 Definition and Properties
3.4.2.2 Memoryless Property
3.4.2.3 Relationship to Poisson Process
3.4.2.4 Examples and Applications
3.4.2.5 Generating Random Variates
3.4.3 Normal Distribution
3.4.3.1 Definition and Properties
3.4.3.2 Central Limit Theorem
3.4.3.3 Examples and Applications
3.4.3.4 Generating Random Variates
3.4.5 Gamma Distribution
3.4.5.1 Definition and Properties
3.4.5.2 Relationship to Exponential and Chi-Squared Distributions
3.4.5.3 Examples and Applications
3.4.5.4 Generating Random Variates
3.5 Poisson Process
3.5.1 Definition and Characteristics
3.5.1.1 Key Properties of Poisson Process
3.5.1.2 Homogeneous vs. Non-Homogeneous Poisson Process
3.5.2 Applications of Poisson Process
3.5.2.1 Modeling Arrival Processes
3.5.2.2 Case Studies and Example
3.6 Empirical Distributions
3.6.1 Definition and Use
3.6.1.1 Construction of Empirical Distributions
3.6.1.2 Comparing Empirical and Theoretical Distributions
3.6.2 Fitting Empirical Distributions
3.6.2.1 Methods for Fitting Empirical Data
3.6.2.2 Goodness of Fit Tests
3.6.2.3 Examples and Applications
3.7 Statistical Inference in Simulation
3.7.1 Point Estimation
3.7.1.1 Definition and Methods
3.7.1.2 Properties of Estimators (Bias, Consistency, Efficiency)
3.7.1.3 Methods of Moments and Maximum Likelihood Estimation
3.7.2 Interval Estimation
3.7.2.1 Confidence Intervals
3.7.2.2 Methods for Constructing Confidence Intervals
3.7.2.3 Applications in Simulation Output Analysis
3.7.3 Hypothesis Testing
3.7.3.1 Formulating and Testing Hypotheses
3.7.3.2 Type I and Type II Errors
3.7.3.3 Commonly Used Tests (Z-test, t-test, Chi-Square test)
3.7.3.4 Applications in Simulation Studies
3.8 Correlation and Regression Analysis
3.8.1 Correlation Analysis
3.8.1.1 Definition and Measurement (Pearson, Spearman)
3.8.1.2 Interpretation and Applications
3.8.1.3 Detecting and Handling Multicollinearity
3.8.2 Regression Analysis
3.8.2.1 Simple Linear Regression
3.8.2.2 Multiple Linear Regression
3.8.2.3 Assumptions and Diagnostics
3.8.2.4 Applications in Predictive Modeling
3.9 Summary
3.10 Questions for Practice
3.11 References
3.0 OBJECTIVE
The objective of this chapter is to provide an understanding of statistical
models used in simulation. It explores their definitions, significance,
applications in different domains, and case studies demonstrating their
practical utility.
3.1 INTRODUCTION
Statistical models play a crucial role in simulation by providing
mathematical representations of real-world systems. These models help in
analyzing uncertainties, predicting behaviors, and making data-driven
decisions. They are widely used in engineering, finance, healthcare,
manufacturing, and other industries to model complex systems and
optimize operations.

3.2 USEFUL STATISTICAL MODELS


Statistical models in simulation help in describing variability,
dependencies, and randomness in different processes. They can be
classified based on their purpose and characteristics, such as probability
distributions, regression models, time series models, and Bayesian models.

3.2.1 Overview of Statistical Models


3.2.1.1 Definition and Importance in Simulation
A statistical model is a mathematical framework that represents observed
data using probability distributions and statistical relationships. These
models are essential in simulation as they:
Provide a structured way to handle uncertainties.
Enable predictive analysis and decision-making.
Improve the accuracy and reliability of simulation outcomes.
Support optimization and risk assessment in various applications.

3.2.1.2 Role in Representing Real-World Systems


Statistical models help in approximating real-world systems by
incorporating randomness and stochastic processes. Some key roles
include:
Modeling Demand and Supply: Used in business and economics to
simulate market trends.
Healthcare Predictions: Analyzing patient data and predicting disease
progression.
Manufacturing and Production: Simulating defect rates and optimizing
processes.
Financial Forecasting: Estimating risks and returns in investments.

3.2.2 Applications in Simulation


3.2.2.1 Examples of Statistical Models in Different Domains

Poisson Process in Call Centers: Used to model the arrival rate of
customer calls.
Markov Chains in Healthcare: Predicting disease progression and
treatment outcomes.
Regression Models in Economics: Forecasting GDP growth and inflation
rates.
Bayesian Models in Artificial Intelligence: Used in decision-making
algorithms.
Monte Carlo Simulation in Finance: Estimating stock price movements
and risk factors.

3.2.2.2 Case Studies Demonstrating Applications


Queueing System in Retail: A supermarket optimized its checkout
process using an M/M/1 queue model, reducing customer wait times by
25%.
Reliability Testing in Manufacturing: A car manufacturer used Weibull
distribution models to predict component failures, improving warranty
management.
Predictive Maintenance in Aviation: Airlines implemented time series
models to forecast engine maintenance schedules, reducing unexpected
failures.
Epidemiological Studies in Public Health: Simulation models helped in
predicting the spread of infectious diseases and optimizing vaccination
strategies.
Investment Risk Analysis in Finance: Hedge funds used Monte Carlo
simulations to analyze potential portfolio risks and returns under different
market scenarios.

3.3 DISCRETE DISTRIBUTIONS


3.3.1 Bernoulli Distribution
3.3.1.1 Definition and Properties
The Bernoulli distribution is a discrete probability distribution that
represents a single trial of a binary experiment with two possible
outcomes: success or failure. The probability of success is denoted by the
parameter p, and the probability of failure is its complement, 1 − p. This
distribution is commonly used in modeling binary events in probability and
statistics.

3.3.1.2 Examples and Applications
Coin toss outcomes (heads or tails)
Success or failure in quality control checks
Customer purchasing behavior (buy or not buy)
Medical tests indicating presence or absence of a condition

3.3.1.3 Generating Random Variates


Random variates following a Bernoulli distribution can be generated by
drawing a uniform random number U ∼ U(0,1) and returning a success if
U is less than the success probability p, and a failure otherwise.

3.3.2 Binomial Distribution


3.3.2.1 Definition and Properties
The Binomial distribution models the number of successes in a fixed
number of independent Bernoulli trials. Each trial has the same probability
of success. The distribution is characterized by two parameters: the
number of trials n and the probability of success p in each trial.

3.3.2.2 Relationship to Bernoulli Distribution


The Binomial distribution is an extension of the Bernoulli distribution,
where multiple independent Bernoulli trials are performed. When there is
only one trial, the Binomial distribution reduces to a Bernoulli
distribution.

3.3.2.3 Examples and Applications


Number of defective items in a batch of products
Number of correct answers in a multiple-choice test
Number of customers making a purchase in a group of shoppers
Number of heads appearing in multiple coin tosses

3.3.2.4 Generating Random Variates


Random variates can be generated by simulating multiple Bernoulli trials
and counting the number of successes.
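
A minimal sketch of both generators: a Bernoulli variate compares one uniform random number with p, and a Binomial variate counts successes over n such trials. The function names, the parameters n = 10 and p = 0.3, and the seed are assumptions for illustration.

```python
import random

def bernoulli(p, rng):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if rng.random() < p else 0

def binomial(n, p, rng):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(bernoulli(p, rng) for _ in range(n))

rng = random.Random(1)  # arbitrary seed
draws = [binomial(10, 0.3, rng) for _ in range(10_000)]
print("sample mean:", sum(draws) / len(draws), "expected n*p:", 10 * 0.3)
```

This direct approach costs n uniform numbers per variate, which is fine for small n but slow when n is large.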

3.3.3 Geometric Distribution


3.3.3.1 Definition and Properties
The Geometric distribution represents the number of trials required to
achieve the first success in a sequence of independent Bernoulli trials. It
describes waiting times for success in repeated experiments.

3.3.3.2 Memoryless Property

The Geometric distribution has the memoryless property: given that no
success has occurred so far, the number of additional trials needed has
the same distribution regardless of how many trials have already
occurred. This is a key feature distinguishing it from other discrete
distributions.

3.3.3.3 Examples and Applications


Number of attempts before the first heads in a coin toss
Number of calls made before reaching a successful sale
Waiting time for the first defective item in a production line
Number of trials before the first positive response in clinical testing

3.3.3.4 Generating Random Variates


Random variates following a Geometric distribution can be generated by
performing independent Bernoulli trials until the first success occurs.
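
The procedure can be sketched as a loop that repeats Bernoulli trials until the first success. The function name, p = 0.25, and the seed are illustrative assumptions; the count includes the successful trial, so the expected value is 1/p.

```python
import random

def geometric(p, rng):
    """Number of trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:  # failure with probability 1 - p
        trials += 1
    return trials

rng = random.Random(8)  # arbitrary seed
draws = [geometric(0.25, rng) for _ in range(20_000)]
print("sample mean:", sum(draws) / len(draws), "expected 1/p:", 1 / 0.25)
```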

3.3.4 Poisson Distribution


3.3.4.1 Definition and Properties
The Poisson distribution models the number of occurrences of an event in
a fixed interval of time or space, assuming that events occur independently
and at a constant average rate.

3.3.4.2 Relationship to Binomial Distribution


The Poisson distribution is related to the Binomial distribution in that it
can be derived as a limiting case when the number of trials n is very
large and the probability of success p is very small, while the expected
number of successes np = λ is held constant.
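
The limit can be checked numerically. The sketch below compares the Binomial(n, λ/n) and Poisson(λ) probabilities for k = 0 to 10 as n grows; the value λ = 3 and the chosen values of n are assumptions for illustration.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 3.0
for n in (10, 100, 10_000):
    p = lam / n  # keeps the expected number of successes n * p equal to lam
    gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
    print(f"n={n:>6}  max |binomial - poisson| for k<=10: {gap:.6f}")
```

The printed gap shrinks as n increases, which is the behavior the limiting argument predicts.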

3.3.4.3 Examples and Applications

• Number of customer arrivals at a store in a given time period
• Number of phone calls received at a call center per hour
• Number of defects in a length of manufactured fabric
• Frequency of natural disasters in a region over a year

3.3.4.4 Generating Random Variates


Random variates following a Poisson distribution can be generated using
methods based on random counting processes, such as simulating event
occurrences within a given time frame.

3.4 CONTINUOUS DISTRIBUTIONS

Continuous probability distributions describe outcomes that take values
over a continuous range. These distributions are crucial in statistical
modeling and real-world applications, where events are measured rather
than counted. The following sections cover essential continuous
distributions, their properties, and applications.

3.4.1 Uniform Distribution


3.4.1.1 Definition and Properties
The uniform distribution is one of the simplest continuous probability
distributions. It represents a situation where all outcomes within a
specified range are equally likely. This distribution is widely used in
simulations and probability modeling due to its straightforward nature.

3.4.1.2 Examples and Applications


The uniform distribution finds applications in areas such as:
Random number generation for simulations and cryptography.
Equal probability selection in lotteries or sampling.
Modeling uncertainty when no prior knowledge favors any specific
outcome within a range.

3.4.1.3 Generating Random Variates


Generating random values from a uniform distribution is fundamental in
computational statistics. These values serve as the foundation for
generating other probability distributions through transformation methods.

3.4.2 Exponential Distribution


3.4.2.1 Definition and Properties
The exponential distribution models the time between occurrences of
random events in a process that follows a constant average rate. It is
particularly useful in scenarios where events happen independently over
time.

3.4.2.2 Memoryless Property


A unique characteristic of the exponential distribution is its memoryless
property, which implies that the remaining waiting time until the next
event does not depend on how much time has already elapsed. This makes
it ideal for modeling waiting times in random processes.
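
The property can be verified directly from the exponential survival function P(X > x) = e^{−λx}, where λ denotes the rate parameter and s, t ≥ 0:

```latex
P(X > s + t \mid X > s)
  = \frac{P(X > s + t)}{P(X > s)}
  = \frac{e^{-\lambda (s + t)}}{e^{-\lambda s}}
  = e^{-\lambda t}
  = P(X > t)
```

Having already waited s time units therefore leaves the distribution of the remaining wait unchanged.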

3.4.2.3 Relationship to Poisson Process


The exponential distribution plays a fundamental role in the Poisson
process, as it represents the time intervals between successive events.
Understanding this connection is key to studying arrival patterns in
various fields.

3.4.2.4 Examples and Applications
Common applications of the exponential distribution include:
Modeling system failures and reliability analysis.
Predicting arrival times in queuing systems, such as customer service
lines.
Analyzing lifetimes of electronic components and biological processes.

3.4.2.5 Generating Random Variates


Random variates from an exponential distribution are generated for use in
simulations, queueing models, and survival analysis. These values are
derived from uniform random numbers using transformation techniques.

3.4.3 Normal Distribution


3.4.3.1 Definition and Properties
The normal distribution, also known as the Gaussian distribution, is one of
the most widely used probability distributions in statistics. It is
characterized by its symmetric bell-shaped curve, where values closer to
the mean are more likely to occur.

3.4.3.2 Central Limit Theorem
The central limit theorem states that the sum (or average) of a large
number of independent, identically distributed random variables with
finite variance tends to follow a normal distribution, regardless of the
original distribution of the variables. This principle underpins many
statistical methods.
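
The theorem can be illustrated with a small experiment: averages of 30 uniform random numbers are far from uniform and behave approximately like a normal variable. The sample size, number of replications, and seed are assumptions chosen for the example. A U(0,1) variable has variance 1/12, so the average of 30 of them has standard deviation sqrt(1/(12 × 30)).

```python
import math
import random
import statistics

rng = random.Random(99)  # arbitrary seed
sample_size = 30         # numbers averaged per replication
replications = 5_000

means = [
    statistics.fmean([rng.random() for _ in range(sample_size)])
    for _ in range(replications)
]

grand_mean = statistics.fmean(means)
spread = statistics.stdev(means)
expected_spread = math.sqrt(1 / 12 / sample_size)
# Share of averages within one standard deviation of 0.5
within_one_sd = sum(abs(m - 0.5) <= expected_spread for m in means) / replications

print("mean of averages:", grand_mean)
print("std dev of averages:", spread, "expected:", expected_spread)
print("share within one std dev:", within_one_sd)
```

For a normal distribution about 68% of values fall within one standard deviation of the mean, so a share close to that figure is consistent with the theorem.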

3.4.3.3 Examples and Applications


The normal distribution appears in numerous real-world contexts,
including:
Height, weight, and IQ distributions in populations.
Measurement errors and financial market fluctuations.
Signal processing and natural phenomena like temperature variations.

3.4.3.4 Generating Random Variates


Simulating normal distribution values is critical in statistical modeling,
hypothesis testing, and machine learning. Various techniques, such as
transformation methods, are employed to obtain normal variates.

3.4.5 Gamma Distribution

3.4.5.1 Definition and Properties
The gamma distribution is a flexible family of probability distributions
used to model waiting times and lifetimes of processes. It extends the
exponential distribution by considering the total time for multiple
independent events to occur.

3.4.5.2 Relationship to Exponential and Chi-Squared Distributions


The gamma distribution generalizes the exponential distribution, where
waiting times are accumulated over multiple stages. It is also closely
related to the chi-squared distribution, which is widely used in hypothesis
testing and variance estimation.

3.4.5.3 Examples and Applications


The gamma distribution is useful in:
Modeling insurance claims and risk assessment.
Analyzing rainfall patterns and hydrological studies.
Reliability engineering for predicting system failures over time.

3.4.5.4 Generating Random Variates


Gamma-distributed random values are crucial in simulations and Bayesian
statistics. These values can be generated using transformation-based
techniques.
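
One simple transformation-based sketch covers the special case of an integer shape parameter (the Erlang distribution): a gamma variate with shape k and rate λ is the sum of k independent exponential variates with rate λ. The function name, the parameters, and the seed are assumptions; non-integer shapes need a different technique, which is not shown here.

```python
import math
import random

def erlang(shape_k, rate, rng):
    """Sum of shape_k independent exponential(rate) variates."""
    return sum(-math.log(1.0 - rng.random()) / rate for _ in range(shape_k))

rng = random.Random(21)  # arbitrary seed
xs = [erlang(3, 2.0, rng) for _ in range(20_000)]
mean = sum(xs) / len(xs)
print("sample mean:", mean, "theoretical mean k/rate:", 3 / 2.0)
```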

3.5 POISSON PROCESS


3.5.1 Definition and Characteristics
The Poisson process describes the occurrence of random events over time
or space, assuming a constant average rate and independence between
events. It serves as the foundation for modeling many real-world arrival
processes.

3.5.1.1 Key Properties of Poisson Process


Some fundamental characteristics of a Poisson process include:
Events occur randomly but at an average fixed rate.
The number of events in disjoint time intervals is independent.
The distribution of event counts in a given time follows a Poisson
distribution.

3.5.1.2 Homogeneous vs. Non-Homogeneous Poisson Process


A homogeneous Poisson process has a constant event rate over time.
A non-homogeneous Poisson process allows the event rate to vary over
time, making it suitable for applications where intensity changes
dynamically.
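
One common way to simulate a non-homogeneous Poisson process is thinning: generate candidate events from a homogeneous process at a rate that bounds the true intensity, then keep each candidate at time t with probability rate(t) divided by that bound. The sketch below is an illustration under assumed names and an assumed intensity that rises linearly from 0 to 5 over the interval [0, 10].

```python
import math
import random

def nhpp_arrivals(rate_fn, rate_max, horizon, rng):
    """Arrival times on [0, horizon] by thinning a rate_max homogeneous process."""
    t = 0.0
    arrivals = []
    while True:
        # next candidate from the homogeneous process with rate rate_max
        t += -math.log(1.0 - rng.random()) / rate_max
        if t > horizon:
            return arrivals
        # keep the candidate with probability rate_fn(t) / rate_max
        if rng.random() <= rate_fn(t) / rate_max:
            arrivals.append(t)

def ramp_rate(t):
    return 0.5 * t  # hypothetical intensity, never exceeds 5 on [0, 10]

rng = random.Random(5)  # arbitrary seed
counts = [len(nhpp_arrivals(ramp_rate, 5.0, 10.0, rng)) for _ in range(500)]
print("average arrivals per run:", sum(counts) / len(counts))
```

The expected number of arrivals equals the integral of the intensity over the interval, which is 25 for this ramp. Setting the intensity to a constant recovers the homogeneous case.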

3.5.2 Applications of Poisson Process


3.5.2.1 Modeling Arrival Processes
The Poisson process is widely used in:
Telecommunications for modeling call arrivals and network traffic.
Retail and service industries to predict customer inflows.
Epidemiology for tracking the occurrence of rare diseases.

3.5.2.2 Case Studies and Example


This section includes practical case studies illustrating the use of the
Poisson process in various domains, highlighting its significance in
stochastic modeling.

3.6 EMPIRICAL DISTRIBUTIONS


3.6.1 Definition and Use
Empirical distributions are constructed from observed data rather than
theoretical models. They are used to approximate probability distributions
when the underlying distribution is unknown.

3.6.1.1 Construction of Empirical Distributions


Empirical distributions are built using observed data points, often
represented through histograms or cumulative distribution functions.
These visualizations provide insights into data patterns.

3.6.1.2 Comparing Empirical and Theoretical Distributions


Comparing empirical distributions with theoretical ones helps assess how
well a given probability model fits the data. This is crucial in hypothesis
testing and statistical modeling.

3.6.2 Fitting Empirical Distributions


3.6.2.1 Methods for Fitting Empirical Data
Various techniques are used to fit empirical data, including graphical
methods, parameter estimation, and statistical modeling approaches.

3.6.2.2 Goodness of Fit Tests

Statistical tests, such as the Kolmogorov-Smirnov test and the
Chi-Squared test, evaluate how well a theoretical distribution aligns with
empirical data.
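
As an illustration, the Kolmogorov-Smirnov statistic is the largest vertical gap between the empirical CDF of the data and the candidate theoretical CDF. The sketch below computes it with the standard library only; the function names, the exponential candidate, the synthetic data, and the seed are assumptions. The value 1.36/sqrt(n) is the usual large-sample approximation to the 5% critical value.

```python
import math
import random

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of sample and the given cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # the empirical CDF jumps from i/n to (i+1)/n at x
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def exponential_cdf(x, rate=1.0):
    return 1.0 - math.exp(-rate * x)

rng = random.Random(11)  # arbitrary seed
data = [rng.expovariate(1.0) for _ in range(1000)]  # synthetic stand-in for observations
d = ks_statistic(data, exponential_cdf)
critical = 1.36 / math.sqrt(len(data))  # approximate 5% critical value, large n

print("KS statistic:", d, "critical value:", critical)
```

A statistic below the critical value means the data give no evidence against the candidate distribution at that level; it does not prove the fit is correct.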

3.6.2.3 Examples and Applications


Empirical distributions are applied in diverse fields such as:
Financial risk assessment and stock market analysis.
Medical studies to analyze patient response distributions.
Environmental science for studying temperature and rainfall patterns.

3.7 STATISTICAL INFERENCE IN SIMULATION


Statistical inference in simulation is essential for drawing conclusions
about a population based on sampled data. It involves estimating
parameters, constructing confidence intervals, and testing hypotheses to
validate findings in simulated environments.

3.7.1 Point Estimation


Point estimation is the process of using sample data to estimate an
unknown parameter of a population. It provides a single best guess for an
unknown quantity based on observed data.

3.7.1.1 Definition and Methods


Point estimation involves computing a single value from the sample to
represent a population parameter. Common methods include the method of
moments and maximum likelihood estimation, which derive estimates
based on different statistical principles.

3.7.1.2 Properties of Estimators (Bias, Consistency, Efficiency)


A good estimator possesses desirable properties:
Bias: Measures the difference between an estimator’s expected value and
the true population parameter.
Consistency: Ensures that as sample size increases, the estimator
converges to the true parameter.
Efficiency: Compares the variance of different estimators, favoring the
one with the smallest variance.

3.7.1.3 Methods of Moments and Maximum Likelihood Estimation


The method of moments derives parameter estimates by equating sample
moments to population moments. Maximum likelihood estimation (MLE)
finds parameter values that maximize the likelihood of observing the given
sample data.
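
As a worked example, consider estimating the rate λ of an exponential distribution from observations x_1, …, x_n (the symbols here are introduced for illustration). Maximum likelihood works with the log-likelihood:

```latex
\begin{align*}
L(\lambda) &= \prod_{i=1}^{n} \lambda e^{-\lambda x_i}
            = \lambda^{n} e^{-\lambda \sum_{i=1}^{n} x_i} \\
\ell(\lambda) &= \ln L(\lambda) = n \ln \lambda - \lambda \sum_{i=1}^{n} x_i \\
\frac{d\ell}{d\lambda} &= \frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0
  \;\Longrightarrow\; \hat{\lambda}_{\mathrm{MLE}} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}}
\end{align*}
```

The method of moments equates the population mean E[X] = 1/λ to the sample mean, which gives the same estimate, 1/x̄, in this case. The two methods do not agree for every distribution.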

3.7.2 Interval Estimation

Interval estimation provides a range of values within which the population
parameter is expected to lie, offering a measure of confidence in the
estimate.

3.7.2.1 Confidence Intervals


A confidence interval expresses the degree of uncertainty around a point
estimate. It is derived from sample statistics by a procedure that, over
repeated samples, covers the true population parameter with a specified
probability (the confidence level).

3.7.2.2 Methods for Constructing Confidence Intervals


Confidence intervals are constructed using sampling distributions. The
choice of method depends on sample size, population characteristics, and
desired confidence level.

3.7.2.3 Applications in Simulation Output Analysis


In simulation, confidence intervals assess the precision of performance
metrics, ensuring results are statistically reliable before making decisions
based on the simulation model.
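
A minimal sketch of a confidence interval for a performance metric estimated from independent replications. It uses the normal quantile, which is an assumption suited to a large number of replications; with only a few replications a Student t quantile gives a wider and more appropriate interval. The function name and the replication outputs are hypothetical.

```python
import math
import statistics

def confidence_interval(values, level=0.95):
    """Normal-approximation interval for the mean of independent replications."""
    n = len(values)
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * std_err, mean + z * std_err

# Hypothetical average waiting times (minutes) from ten replications
waits = [4.2, 3.9, 4.6, 4.1, 4.4, 3.8, 4.3, 4.0, 4.5, 4.2]
low, high = confidence_interval(waits)
print(f"95% interval for mean wait: [{low:.3f}, {high:.3f}]")
```

Running more replications shrinks the standard error and therefore narrows the interval.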

3.7.3 Hypothesis Testing


Hypothesis testing is a framework for making decisions about a population
based on sample data. It assesses whether observed differences are
statistically significant or due to random variation.

3.7.3.1 Formulating and Testing Hypotheses


A hypothesis test involves stating a null hypothesis (no effect or
difference) and an alternative hypothesis (presence of effect or difference).
Sample data is analyzed to determine whether there is sufficient evidence
to reject the null hypothesis.

3.7.3.2 Type I and Type II Errors


Type I error: Incorrectly rejecting a true null hypothesis (false positive).
Type II error: Failing to reject a false null hypothesis (false negative).

3.7.3.3 Commonly Used Tests (Z-test, t-test, Chi-Square test)


Statistical tests evaluate hypotheses based on sample data:
1. Z-test: Used for large samples when population variance is known.
2. t-test: Suitable for small samples when population variance is
unknown.
3. Chi-Square test: Evaluates categorical data and tests goodness-of-fit
or independence.

3.7.3.4 Applications in Simulation Studies

Hypothesis testing in simulation verifies assumptions, compares different
simulation models, and determines the significance of experimental
results.

3.8 CORRELATION AND REGRESSION ANALYSIS


Correlation and regression analysis help quantify relationships between
variables. Correlation measures association, while regression models
relationships for predictive purposes.

3.8.1 Correlation Analysis


Correlation analysis assesses the strength and direction of relationships
between variables.

3.8.1.1 Definition and Measurement (Pearson, Spearman)


Pearson correlation: Measures linear relationships between continuous
variables.
Spearman correlation: Evaluates monotonic relationships, applicable for
ordinal data or non-linear associations.
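
The difference between the two coefficients can be seen in a small sketch.
The helper names pearson, ranks, and spearman are hypothetical, and the data
(y equal to the cube of x) is made up to give a monotonic but non-linear
relationship.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic, not linear
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(round(r_pearson, 3), round(r_spearman, 3))
```

Because y increases whenever x increases, the Spearman coefficient reaches its
maximum, while the Pearson coefficient stays below 1 since the points do not
lie on a straight line.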

[Link] Interpretation and Applications


A correlation coefficient indicates the degree of association between
variables. Strong correlations may suggest potential causation, but
correlation does not imply causation.

[Link] Detecting and Handling Multicollinearity


Multicollinearity occurs when predictor variables in a regression model
are highly correlated, leading to unreliable coefficient estimates. It can
be detected with the variance inflation factor (VIF) and addressed with
variable selection techniques.

3.8.2 Regression Analysis


Regression analysis models relationships between dependent and
independent variables to make predictions.

[Link] Simple Linear Regression


Simple linear regression models the relationship between a dependent
variable and a single independent variable using a straight-line equation.
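
A minimal least-squares sketch of the straight-line fit y = b0 + b1 x follows.
The function name fit_line and the data (arrival rate versus mean wait) are
hypothetical and serve only to show the arithmetic.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical observations, e.g. from four simulation runs.
arrival_rate = [2, 4, 6, 8]
mean_wait = [1.1, 1.9, 3.2, 3.8]

b0, b1 = fit_line(arrival_rate, mean_wait)
print(f"mean_wait ~ {b0:.2f} + {b1:.2f} * arrival_rate")
```

The slope b1 estimates the change in the dependent variable per unit change in
the independent variable, and b0 is the predicted value when x is zero.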

[Link] Multiple Linear Regression


Multiple linear regression extends simple regression to include multiple
independent variables, allowing for more complex modeling.

[Link] Assumptions and Diagnostics
Regression models rely on assumptions such as linearity, independence,
homoscedasticity, and normality of residuals. Diagnostic tests assess the
validity of these assumptions.

[Link] Applications in Predictive Modeling


Regression analysis is widely used for forecasting, risk assessment, and
decision-making in various domains, including business, healthcare, and
engineering.

3.9 SUMMARY
• Statistical models play a crucial role in simulation, representing real-
  world processes mathematically.
• Discrete distributions like Bernoulli, Binomial, Geometric, and Poisson
  model count-based random phenomena.
• Continuous distributions such as Uniform, Exponential, Normal, and Gamma
  describe variables that can take any value in a continuous range.
• The Poisson process models random events occurring over time or space,
  distinguishing between homogeneous and non-homogeneous cases.
• Empirical distributions allow analyzing observed data and comparing it
  with theoretical distributions through goodness-of-fit tests.
• Point estimation techniques, including MLE and MOM, help estimate
  unknown parameters from sample data.
• Interval estimation provides confidence intervals to quantify
  uncertainty in estimated parameters.
• Hypothesis testing assesses statistical claims, using tests like the
  Z-test, t-test, and Chi-Square test.
• Correlation analysis determines relationships between variables, while
  regression analysis models dependencies and predictions.
• Understanding statistical inference methods ensures robust analysis
  and decision-making in simulation studies.

3.10 QUESTIONS FOR PRACTICE


1. What are statistical models, and why are they important in
simulations?
2. How does a Bernoulli distribution differ from a Binomial distribution?
3. Explain the memoryless property of the Geometric and Exponential
distributions.
4. What are the key characteristics of a Poisson process?
5. How does the Central Limit Theorem influence the Normal
distribution?
6. Describe the relationship between the Exponential and Gamma
distributions.
7. What methods are used to construct empirical distributions, and why
are they useful?
8. Compare the Maximum Likelihood Estimation (MLE) and Method of
Moments (MOM).
9. What is multicollinearity, and how does it affect regression analysis?
10. Explain the assumptions behind simple linear regression and how they
are tested.


4
QUEUEING MODELS
Unit Structure :
4.0 Objective
4.1 Introduction
4.2 Characteristics of Queueing Systems
4.2.1 Components of a Queueing System
[Link] Arrival Process
[Link] Service Process
[Link] Queue Discipline (FIFO, LIFO, Priority)
[Link] Number of Servers
[Link] System Capacity
[Link] Population Size
4.2.2 Types of Queueing Systems
[Link] Single-Server Queueing Systems
[Link] Multi-Server Queueing Systems
[Link] Networks of Queues
4.2.3 Queueing Disciplines
[Link] First-In-First-Out (FIFO)
[Link] Last-In-First-Out (LIFO)
[Link] Priority Queues
[Link] Round Robin
4.2.4 Common Assumptions in Queueing Theory
[Link] Arrival and Service Rates
[Link] Interarrival and Service Time Distributions
[Link] Independence Assumptions
4.3 Queueing Notations
4.3.1 Kendall’s Notation
[Link] Explanation of A/B/C/K/N/D
[Link] Common Notations and Their Meanings
[Link] Examples of Queueing Models Using Kendall’s Notation
4.3.2 Extensions and Variations of Kendall’s Notation

[Link] Notations for Priority Queues
[Link] Notations for Batch Arrivals and Services
4.4 Long Run Measures of Performance of Queueing Systems
4.4.1 Average Number in System (L)
[Link] Derivation and Interpretation
[Link] Relationship with Other Performance Measures
4.4.2 Average Time in System (W)
[Link] Derivation and Interpretation
[Link] Little’s Law
4.4.3 Average Number in Queue (Lq)
[Link] Derivation and Interpretation
4.4.4 Average Time in Queue (Wq)
4.4.5 Utilization Factor (ρ)
4.4.6 Probability of n Customers in System (Pn)
4.5 Steady State Behavior of Infinite Population Markovian Models
4.5.1 M/M/1 Queue
[Link] Model Assumptions
[Link] Steady State Solutions
[Link] Performance Metrics
4.5.2 M/M/c Queue
[Link] Model Assumptions
[Link] Steady State Solutions
[Link] Performance Metrics
4.5.3 M/M/∞ Queue
[Link] Model Assumptions
[Link] Steady State Solutions
[Link] Performance Metrics
4.5.4 M/M/1 with Balking and Reneging
[Link] Model Assumptions
[Link] Steady State Solutions
[Link] Performance Metrics
4.6 Steady State Behavior of Finite Population Models
4.6.1 M/M/1/K Queue
[Link] Model Assumptions
[Link] Steady State Solutions
[Link] Performance Metrics
4.6.2 M/M/c/K Queue
[Link] Model Assumptions
[Link] Steady State Solutions
[Link] Performance Metrics
4.6.3 M/M/c/K/N Queue
[Link] Model Assumptions
[Link] Steady State Solutions
[Link] Performance Metrics
4.7 Networks of Queues
4.7.1 Open Queueing Networks
[Link] Characteristics and Assumptions
[Link] Examples and Applications
[Link] Steady State Solutions
4.7.2 Closed Queueing Networks
[Link] Characteristics and Assumptions
[Link] Examples and Applications
[Link] Steady State Solutions
4.7.3 Jackson Networks
[Link] Definition and Characteristics
[Link] Jackson’s Theorem
[Link] Performance Metrics and Analysis
4.7.4 Gordon-Newell Networks
[Link] Definition and Characteristics
[Link] Steady State Solutions
[Link] Performance Metrics and Analysis
4.8 Applications of Queueing Theory
4.8.1 Telecommunications
[Link] Call Centers and Telephony Systems
[Link] Internet Traffic and Data Packets
4.8.2 Manufacturing and Production
[Link] Assembly Lines and Workstations
[Link] Inventory and Supply Chain Management
4.8.3 Healthcare Systems
[Link] Patient Flow and Hospital Management
[Link] Scheduling and Resource Allocation
4.8.4 Transportation and Logistics
[Link] Traffic Flow and Control
[Link] Airport and Seaport Operations
4.9 Summary
4.10 Questions for Practice
4.11 References

4.0 OBJECTIVE
The objective of this chapter is to introduce the fundamental concepts of
queueing theory, which is used to analyze systems involving waiting lines.
Students will understand the key characteristics of queueing systems,
various queueing models, their applications, and performance measures.
By the end of this chapter, students should be able to analyze and solve
basic queueing problems using theoretical models.

4.1 INTRODUCTION
Queueing models are mathematical representations of systems that involve
waiting lines. They are widely used in various fields, including
telecommunications, manufacturing, transportation, and service industries.
The primary purpose of queueing models is to evaluate system
performance, optimize resource allocation, and minimize waiting times for
customers or tasks.

4.2 CHARACTERISTICS OF QUEUEING SYSTEMS


Queueing systems consist of several key components and can be classified
based on various factors.

4.2.1 Components of a Queueing System


A queueing system typically consists of the following components:

[Link] Arrival Process


Describes how entities (customers, packets, jobs, etc.) arrive at the system.
Can be deterministic or stochastic.
Commonly modeled using the Poisson process.

[Link] Service Process


Represents the mechanism by which servers process entities.
Can follow different distributions (exponential, deterministic, etc.).
Influences system performance and congestion levels.

[Link] Queue Discipline (FIFO, LIFO, Priority)
First-In-First-Out (FIFO): The first entity to arrive is the first to be served.
Last-In-First-Out (LIFO): The most recent arrival is served first.
Priority Queues: Entities are served based on predefined priority levels.
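
The three disciplines above differ only in which waiting entity is picked
next. A minimal sketch, assuming three hypothetical entities A, B, and C that
arrive in that order, and assuming that a lower number means a higher
priority:

```python
import heapq
from collections import deque

# (name, priority) in order of arrival; lower number = higher priority (assumed).
arrivals = [("A", 2), ("B", 1), ("C", 3)]


def fifo_order(items):
    """Serve from the front of the line: first in, first out."""
    queue = deque(items)
    return [queue.popleft()[0] for _ in range(len(items))]


def lifo_order(items):
    """Serve the most recent arrival first: last in, first out."""
    stack = list(items)
    return [stack.pop()[0] for _ in range(len(items))]


def priority_order(items):
    """Serve by priority; ties are broken by arrival order."""
    heap = [(priority, seq, name) for seq, (name, priority) in enumerate(items)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(items))]


print("FIFO:    ", fifo_order(arrivals))
print("LIFO:    ", lifo_order(arrivals))
print("Priority:", priority_order(arrivals))
```

The arrival sequence number stored in each heap entry keeps entities of equal
priority in first-come order, which is a common design choice rather than a
requirement of priority queues in general.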

[Link] Number of Servers


Determines how many entities can be served simultaneously.
A single-server queue has one service point, while a multi-server queue
has multiple service stations.

[Link] System Capacity


Refers to the maximum number of entities allowed in the system (waiting
plus in service).
Can be finite or infinite.

[Link] Population Size


The total number of potential customers who may arrive at the system.
Finite population models account for a limited number of sources, whereas
infinite population models assume an unlimited number of arrivals.

4.2.2 Types of Queueing Systems


Queueing systems can be categorized based on their structure and number
of service stations.

[Link] Single-Server Queueing Systems


Have only one server handling all arrivals.
Example: A single ATM machine serving customers.

[Link] Multi-Server Queueing Systems


Have multiple servers working in parallel.
Example: A bank counter with multiple tellers.

[Link] Networks of Queues


Consist of interconnected queues where entities move from one queue to
another.
Example: A production assembly line with multiple processing stations.

4.2.3 Queueing Disciplines


Queueing disciplines dictate the order in which entities are served.

[Link] First-In-First-Out (FIFO)
Entities are served in the order they arrive.
Common in customer service systems like banks and supermarkets.

[Link] Last-In-First-Out (LIFO)


The most recent arrival is served first.
Used in stack-based systems like computer memory allocation.

[Link] Priority Queues


Entities with higher priority are served before others.
Common in emergency rooms and network traffic management.

[Link] Round Robin


Each entity gets served for a fixed time slice before moving to the end of
the queue.
Common in CPU scheduling in operating systems.
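
A minimal sketch of round-robin service follows. The function name
round_robin, the job names, and the service requirements are hypothetical;
all jobs are assumed to be present at time zero.

```python
from collections import deque


def round_robin(service_needs, quantum):
    """Return (job, completion_time) pairs in order of completion.

    service_needs maps each job name to its total required service time.
    """
    queue = deque(service_needs.items())
    clock = 0
    finished = []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        clock += run
        if remaining > run:
            # Not done yet: rejoin at the end of the queue.
            queue.append((name, remaining - run))
        else:
            finished.append((name, clock))
    return finished


jobs = {"J1": 5, "J2": 2, "J3": 4}
schedule = round_robin(jobs, quantum=2)
print(schedule)
```

Short jobs finish early under this discipline even when they arrive behind
long ones, which is why it is popular for time-sharing.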

4.2.4 Common Assumptions in Queueing Theory


Several assumptions are commonly made when analyzing queueing
models:
[Link] Arrival and Service Rates
The arrival rate (λ) and service rate (μ) determine system performance.
Interarrival and service times are usually assumed to be exponentially
distributed, with means 1/λ and 1/μ, for tractability.

[Link] Interarrival and Service Time Distributions


The time between consecutive arrivals and the time required for service
can follow different distributions.
The most common assumption is an exponential distribution.

[Link] Independence Assumptions


Arrivals and service times are often assumed to be independent of each
other.
Helps simplify analysis and modeling.

4.3 QUEUEING NOTATIONS


Queueing notations provide a standardized way to represent different
queueing models. These notations help describe the structure of a
queueing system concisely, making it easier to analyze and compare
different systems.
4.3.1 Kendall’s Notation
Kendall’s notation is the most widely used convention for describing
queueing systems. It is represented as:
A/B/C/K/N/D
where:
A: Arrival process (distribution of interarrival times)
B: Service time distribution
C: Number of servers
K: System capacity (maximum number of customers in the system,
including those in service and in the queue)
N: Population size (total number of potential customers)
D: Queue discipline (order in which customers are served)
If K, N, or D are not specified, it is assumed that:
K = ∞ (infinite capacity)
N = ∞ (infinite population)
D = FIFO (First-In-First-Out)
[Link] Explanation of A/B/C/K/N/D
A (Arrival Process):
M: Markovian (Poisson process with exponential interarrival times)
D: Deterministic (fixed interarrival times)
G: General (any arbitrary distribution)
B (Service Time Distribution):
M: Markovian (exponentially distributed service times)
D: Deterministic (constant service time)
G: General (any arbitrary distribution)
C (Number of Servers):
Number of parallel servers available to serve customers.
K (System Capacity):
Maximum number of customers that can be present in the system.
N (Population Size):
The total number of customers that can potentially enter the system.
D (Queue Discipline):
FIFO: First-In-First-Out (default discipline)
LIFO: Last-In-First-Out
SIRO: Service in Random Order
Priority: Customers with higher priority are served first

[Link] Common Notations and Their Meanings

Notation   Meaning
M/M/1      Single-server queue with Poisson arrivals and exponential
           service times
M/M/c      Multi-server queue with Poisson arrivals and exponential
           service times
M/M/1/K    Single-server queue with finite capacity K
M/M/c/K    Multi-server queue with finite capacity K
M/G/1      Single-server queue with Poisson arrivals and general service
           time distribution
G/G/1      Single-server queue with general arrival and service time
           distribution

[Link] Examples of Queueing Models Using Kendall’s Notation


Bank Teller Queue (M/M/1):
Customers arrive at a bank following a Poisson process.
A single teller serves them one at a time with exponential service time.
The system has infinite capacity.
Call Center (M/M/c):
Calls arrive at a call center following a Poisson process.
Multiple operators (c servers) handle calls with exponentially distributed
service times.
Infinite queue capacity.
Limited Waiting Room (M/M/1/K):
Customers arrive at a clinic that can hold at most K patients, including
the one being seen by the doctor.
If the clinic is full, arriving customers leave.
A single doctor (server) provides exponentially distributed service times.
4.3.2 Extensions and Variations of Kendall’s Notation
Kendall’s notation has been extended to incorporate additional features
such as priority queues, batch arrivals, and batch services.

[Link] Notations for Priority Queues


Priority queues handle customers based on different priority levels rather
than the default FIFO rule.
M/M/1 with Priority: Single-server queue where customers with higher
priority get served first.
M/M/c with Preemptive Priority: Multi-server queue where higher-priority
customers can interrupt service of lower-priority customers.
M/M/1 with Non-Preemptive Priority: Higher-priority customers are
served first, but ongoing services are not interrupted.

[Link] Notations for Batch Arrivals and Services


Some queueing systems involve customers arriving in groups or being
served in batches.
M[X]/M/1: Batches of customers arrive according to a Poisson process,
where X is the (possibly random) batch size.
M/M[Y]/1: A single server processes batches of size Y at a time.
M[X]/M[Y]/c: A multi-server system where customers arrive in batches
and are also served in batches.

4.4 LONG-RUN MEASURES OF PERFORMANCE OF QUEUEING SYSTEMS
Queueing systems are analyzed using several long-run performance
measures to evaluate their efficiency. These measures help in
understanding customer wait times, queue lengths, and overall system
utilization.

4.4.1 Average Number in System (L)


[Link] Derivation and Interpretation
The average number of customers in the system, denoted as L, includes
both customers in the queue and those receiving service.
It is calculated using Little’s Law: L = λW, where:
λ is the average arrival rate
W is the average time a customer spends in the system
Interpretation: A higher value of L indicates congestion, which may
require increasing service capacity.
[Link] Relationship with Other Performance Measures
L is related to the queue length (Lq) and the average number in service:
L = Lq + λ/μ. For a single server this becomes L = Lq + ρ, where ρ is the
fraction of time the server is busy.

4.4.2 Average Time in System (W)


[Link] Derivation and Interpretation
The time a customer spends in the system (waiting + service time), given
by:
W = Wq + 1/μ
where:
Wq is the average time spent in the queue
μ is the service rate

[Link] Little’s Law


Little’s Law relates L, W, and λ: L= λW
It is applicable to a wide range of queueing models and helps estimate
system performance.
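
A short worked example of these relationships, as a sketch: the function name
littles_law_measures and the numbers (10 arrivals per hour, 0.25 hours in the
system, service rate 12 per hour) are hypothetical.

```python
def littles_law_measures(lam, W, mu):
    """Derive L, Wq and Lq from the arrival rate, time in system and service rate."""
    L = lam * W          # Little's Law for the whole system
    Wq = W - 1.0 / mu    # time in queue = time in system - mean service time
    Lq = lam * Wq        # Little's Law applied to the queue alone
    return L, Wq, Lq


L, Wq, Lq = littles_law_measures(lam=10.0, W=0.25, mu=12.0)
print(f"L = {L:.3f}, Wq = {Wq:.3f} h, Lq = {Lq:.3f}")
```

Note that L minus Lq equals λ/μ, the average number of customers in service,
which is consistent with W = Wq + 1/μ.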

4.4.3 Average Number in Queue (Lq)


[Link] Derivation and Interpretation
The expected number of customers waiting in the queue before service.
For an M/M/1 system: Lq = ρ² / (1 − ρ) = λ² / (μ(μ − λ)), where ρ = λ/μ.

4.4.4 Average Time in Queue (Wq)


Definition: The expected time a customer spends waiting before service
begins.
Formula: Wq = Lq/λ
Interpretation: Measures customer wait time and service efficiency.

4.4.5 Utilization Factor (ρ)


Definition: The proportion of time the server is busy.
Formula: ρ = λ / (cμ), where c is the number of servers and μ is the service
rate.

Interpretation: Utilization close to 1 indicates a heavily loaded system
with long wait times; if ρ ≥ 1, the queue grows without bound.

4.4.6 Probability of n Customers in System (Pn)


Definition: The probability that exactly n customers are in the system.
Calculation: Depends on the queueing model (e.g., M/M/1, M/M/c).
Interpretation: Helps assess congestion and system stability.

4.5 STEADY STATE BEHAVIOR OF INFINITE POPULATION MARKOVIAN MODELS
Markovian queueing models describe systems where customers (or jobs)
arrive, wait for service if necessary, get served, and then leave. These
models assume Poisson arrivals and exponential service times, meaning
that interarrival and service times follow an exponential distribution.
Steady-state analysis examines the long-run behavior of these queueing
models, assuming that arrival and service rates do not change over time. It
helps derive key performance metrics such as average queue length,
waiting time, and system utilization.

4.5.1 M/M/1 Queue


The M/M/1 queue is the simplest Markovian queueing system, consisting
of:
One server
Poisson arrival process with rate λ (mean interarrival time 1/λ)
Exponential service time with rate μ (mean service time 1/μ)
Infinite queue capacity and infinite population (i.e., no limit on how many
customers can arrive)

[Link] Model Assumptions


Arrivals follow a Poisson process with rate λ.
Service times follow an exponential distribution with rate μ.
Single server providing service to customers one at a time.
First-Come, First-Served (FCFS) queue discipline.
Infinite queue capacity (no restriction on how many customers can wait).
The system reaches a steady-state, meaning arrival and departure rates
remain constant over time.

[Link] Steady State Solutions
A steady state exists only if the arrival rate is less than the service
rate, i.e., λ < μ (equivalently ρ = λ/μ < 1). The probability of having n
customers in the system is:
Pn = (1 − ρ)ρ^n, n = 0, 1, 2, ...

[Link] Performance Metrics


System utilization: ρ = λ/μ (fraction of time the server is busy)
Average number of customers in the system:
L = ρ / (1 − ρ) = λ / (μ − λ)
Average number of customers in the queue:
Lq = ρ² / (1 − ρ) = λ² / (μ(μ − λ))
Average time a customer spends in the system (waiting + service time):
W = 1 / (μ − λ)
Average time a customer spends waiting in queue:
Wq = ρ / (μ − λ) = λ / (μ(μ − λ))
4.5.2 M/M/c Queue


The M/M/c queue extends the M/M/1 model by having c servers instead of
just one.

[Link] Model Assumptions


Poisson arrivals with rate λ.
Exponential service times with rate μ
c servers, all working simultaneously.
FCFS discipline.
Infinite queue capacity.
Steady-state condition: λ<cμ (arrival rate should be less than total service
capacity).

[Link] Steady State Solutions


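Define traffic intensity as:
ρ = λ / (cμ)
With a = λ/μ, the steady-state probabilities are:
P0 = [ Σ (a^n / n!) for n = 0 to c − 1, plus a^c / (c! (1 − ρ)) ]^(−1)
Pn = (a^n / n!) P0 for 0 ≤ n < c
Pn = (a^n / (c! c^(n − c))) P0 for n ≥ c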

[Link] Performance Metrics
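
The M/M/c measures follow from the probability that an arriving customer has
to wait (the Erlang C formula), after which Lq = P(wait) ρ / (1 − ρ),
Wq = Lq/λ, W = Wq + 1/μ and L = λW. The sketch below assumes these standard
results; the function name mmc_metrics and the example rates are
hypothetical.

```python
from math import factorial


def mmc_metrics(lam, mu, c):
    """Steady-state measures of an M/M/c queue; requires lam < c * mu."""
    a = lam / mu   # offered load
    rho = a / c    # utilization per server
    if rho >= 1.0:
        raise ValueError("no steady state: need lam < c * mu")
    tail = a ** c / (factorial(c) * (1.0 - rho))
    p0 = 1.0 / (sum(a ** n / factorial(n) for n in range(c)) + tail)
    p_wait = tail * p0              # Erlang C: probability an arrival must wait
    Lq = p_wait * rho / (1.0 - rho)
    Wq = Lq / lam
    W = Wq + 1.0 / mu
    L = lam * W
    return {"rho": rho, "P0": p0, "P_wait": p_wait, "Lq": Lq, "Wq": Wq, "W": W, "L": L}


two_servers = mmc_metrics(lam=3.0, mu=2.0, c=2)
print(two_servers)
```

With c = 1 the function reduces to the M/M/1 results, which gives a quick way
to check it.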

4.5.3 M/M/∞ Queue


In this model, there are infinite servers, meaning no waiting in the queue.

[Link] Model Assumptions


Poisson arrivals with rate λ.
Exponential service times with rate μ.
Infinite number of servers (every arrival gets immediate service).
No waiting time since service is always available.

[Link] Steady State Solutions


The probability of having n customers in the system is Poisson with mean
λ/μ:
Pn = e^(−λ/μ) (λ/μ)^n / n!, n = 0, 1, 2, ...

[Link] Performance Metrics
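
Because nobody waits, the measures are simple: L = λ/μ, W = 1/μ, and
Lq = Wq = 0. A minimal sketch under these standard results, with hypothetical
function names and rates:

```python
from math import exp, factorial


def mminf_prob(n, lam, mu):
    """Probability of n customers in an M/M/inf system (Poisson with mean lam/mu)."""
    a = lam / mu
    return exp(-a) * a ** n / factorial(n)


def mminf_metrics(lam, mu):
    """Steady-state measures of an M/M/inf system."""
    return {"L": lam / mu, "W": 1.0 / mu, "Lq": 0.0, "Wq": 0.0}


probs = [mminf_prob(n, lam=6.0, mu=2.0) for n in range(60)]
mean_in_system = sum(n * p for n, p in enumerate(probs))
print(mminf_metrics(6.0, 2.0), round(mean_in_system, 6))
```

The mean computed from the probabilities matches L = λ/μ, and a steady state
exists for any positive λ and μ since no queue can build up.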

4.5.4 M/M/1 with Balking and Reneging


This extends the M/M/1 queue by considering customer behavior:
Balking: Customers may refuse to enter the queue if it is too long.
Reneging: Customers may leave the queue after waiting too long.

[Link] Model Assumptions


Same assumptions as M/M/1.
Balking probability depends on queue length.
Reneging rate θ is the rate, per unit time, at which a waiting customer
abandons the queue.

[Link] Steady State Solutions

[Link] Performance Metrics

4.6 STEADY STATE BEHAVIOR OF FINITE POPULATION MODELS
Finite population models are queueing systems where the number of
potential customers is limited. These models are particularly useful in
scenarios such as manufacturing systems, hospital beds, and computer
networks where the population size imposes constraints on arrivals.
Unlike infinite population models, where an unlimited number of
customers can arrive, finite models have a maximum system capacity (K)
or a limited number of sources (N). The system reaches a steady state
when the arrival and service processes balance over time.
4.6.1 M/M/1/K Queue
This model represents a single-server queue with a finite system capacity
(K), meaning that at most K customers (including the one in service) can
be in the system at any given time. If an arriving customer finds the
system full, they are blocked (lost customers).

[Link] Model Assumptions


Arrivals follow a Poisson process (random arrival times).
Service times are exponentially distributed (random service duration).
Single server available to serve customers one at a time.
System capacity is limited to K customers (no new arrivals if the system is
full).
First-Come, First-Served (FCFS) discipline.
Steady state is reached when arrival and departure rates stabilize over
time.

[Link] Steady State Solutions
The probability of having different numbers of customers in the system
depends on the arrival and service rates, as well as the maximum allowed
capacity.
If the system is full, additional arrivals are blocked and lost (this is called
a loss system).
Because the capacity is finite, the number in the system cannot grow
without bound, so a steady state exists for any arrival and service rates,
even when λ ≥ μ.

[Link] Performance Metrics


System utilization: Measures how much time the server is busy.
Blocking probability: The likelihood that an arriving customer finds the
system full and cannot enter.
Average number of customers in the system: Reflects how busy the system
is on average.
Average waiting time: How long a customer spends in the system (waiting
+ service).
Effective arrival rate: The number of customers that successfully enter the
system (excluding blocked customers).
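
The metrics listed above can be computed from the standard M/M/1/K state
probabilities, Pn = (1 − ρ)ρ^n / (1 − ρ^(K+1)) for ρ ≠ 1 and Pn = 1/(K + 1)
for ρ = 1, with ρ = λ/μ. The sketch below assumes these formulas; the function
name mm1k_metrics and the example rates are hypothetical.

```python
def mm1k_metrics(lam, mu, K):
    """Steady-state measures of an M/M/1/K queue (K = capacity incl. service)."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        probs = [1.0 / (K + 1)] * (K + 1)
    else:
        norm = (1.0 - rho) / (1.0 - rho ** (K + 1))
        probs = [norm * rho ** n for n in range(K + 1)]
    p_block = probs[K]                  # arrival finds the system full
    lam_eff = lam * (1.0 - p_block)     # rate of customers actually admitted
    L = sum(n * p for n, p in enumerate(probs))
    return {
        "probs": probs,
        "utilization": 1.0 - probs[0],
        "P_block": p_block,
        "lam_eff": lam_eff,
        "L": L,
        "W": L / lam_eff,               # Little's Law with the effective rate
    }


overloaded = mm1k_metrics(lam=2.0, mu=1.0, K=2)
print(overloaded)
```

The example deliberately uses λ > μ: the model still has a steady state, but
more than half of the arrivals are blocked.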

4.6.2 M/M/c/K Queue


This model extends M/M/1/K by having c servers instead of one. It is
useful for systems where multiple customers can be served
simultaneously, such as call centers, hospitals, or bank tellers.

[Link] Model Assumptions


Arrivals follow a Poisson process (random arrival times).
Service times are exponentially distributed (random service durations).
c servers work in parallel to serve arriving customers.
System capacity is limited to K customers (including those in service and
those waiting).
First-Come, First-Served (FCFS) discipline is followed.
If the system is full, new arrivals are blocked.

[Link] Steady State Solutions


The number of customers in the system varies based on arrival and service
rates, the number of servers, and the system capacity.
The probability of finding a certain number of customers in the system
depends on how many servers are busy and how many are waiting.
A key measure is the probability of blocking, where an arriving customer
cannot enter because the system is full.

[Link] Performance Metrics


System utilization: The fraction of time the servers are busy.
Blocking probability: The chance that a new customer is denied entry.
Average number of customers in the system: Indicates the level of
congestion.
Average waiting time: Time spent in the queue before service.
Effective arrival rate: Number of customers that successfully enter the
system.

4.6.3 M/M/c/K/N Queue


This is a multi-server, finite-population queue with a finite capacity (K)
and a limited customer base (N). Unlike previous models, the total number
of potential customers is restricted, meaning that arrivals depend on how
many customers are in the system.

[Link] Model Assumptions


A finite population (N) exists, meaning that only a limited number of
customers can arrive at the system.
Each customer outside the system generates arrivals at a constant rate, so
the total arrival rate decreases as more customers enter the system (fewer
customers remain in the external population).
Service times are exponentially distributed.
c servers are available to provide service simultaneously.
The system has a finite capacity (K), meaning that if the system is full,
additional customers cannot enter.

[Link] Steady State Solutions


The system behaves differently from infinite-population models because
the arrival rate depends on the number of customers not currently in the
system.
The probability of different system states is calculated based on the finite
population and available servers.
Blocking can occur, but the effective arrival rate changes dynamically as
customers leave or enter the system.
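
One way to obtain these state probabilities is to treat the model as a
birth-death process: with n customers in the system, the arrival rate is
(N − n)λ while there is room, and the service rate is min(n, c)μ. The sketch
below assumes that λ is the arrival rate per customer outside the system; the
function name finite_source_probs and the example values are hypothetical.

```python
def finite_source_probs(lam, mu, c, K, N):
    """State probabilities for c servers, capacity K and population N.

    lam is the arrival rate of each customer currently outside the system.
    """
    top = min(K, N)                      # largest reachable number in system
    weights = [1.0]
    for n in range(1, top + 1):
        birth = (N - (n - 1)) * lam      # arrival rate in state n - 1
        death = min(n, c) * mu           # service completion rate in state n
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    return [w / total for w in weights]


probs_fs = finite_source_probs(lam=1.0, mu=1.0, c=1, K=2, N=2)
L_fs = sum(n * p for n, p in enumerate(probs_fs))
# Effective arrival rate: arrivals are only admitted below the top state.
lam_eff_fs = sum((2 - n) * 1.0 * p for n, p in enumerate(probs_fs) if n < 2)
print(probs_fs, L_fs, lam_eff_fs)
```

In steady state the effective arrival rate equals the rate of service
completions, μ times the average number of busy servers, which can be used to
check the result.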

[Link] Performance Metrics
System utilization: Measures how busy the servers are.
Blocking probability: Likelihood that an arriving customer cannot enter
due to system capacity.
Average number of customers in the system: Reflects the typical
workload.
Average waiting time: Time customers spend in the queue.
Effective arrival rate: The actual number of arrivals that get served (varies
due to the finite population).

4.7 NETWORKS OF QUEUES


Queueing networks consist of multiple interconnected queues where
customers move from one queue to another. These networks help model
real-world systems like communication networks, production lines, and
transportation systems. They are broadly classified into open and closed
queueing networks.

4.7.1 Open Queueing Networks


An open queueing network consists of multiple queues where customers
enter the system from an external source, move through various service
stations, and eventually leave the system.

[Link] Characteristics and Assumptions


Customers arrive from outside the system following a Poisson process.
They move through different queues in a predefined or probabilistic
manner.
After receiving service, customers either move to another queue or exit the
system.
Service times at each station are usually exponentially distributed.
Servers operate independently, and queues may have different numbers of
servers.

[Link] Examples and Applications


Computer networks: Data packets move between routers and servers
before exiting the system.
Manufacturing systems: Parts move between workstations before
becoming a finished product.
Call centers: Calls are routed through different agents and departments
before resolution.

[Link] Steady State Solutions
The system reaches a steady state when the arrival and service rates
stabilize across all queues.
The probability of a certain number of customers at each queue depends
on service rates and transition probabilities.
Performance metrics include system utilization, average queue length, and
average waiting time at different service points.

4.7.2 Closed Queueing Networks


A closed queueing network has a fixed number of customers who
continuously cycle through different service stations without entering or
leaving the system.

[Link] Characteristics and Assumptions


No external arrivals or departures; customers circulate within the system.
The number of customers in the system remains constant.
Customers move between service stations based on predefined
probabilities.
Each queue has its own service rate and number of servers.

[Link] Examples and Applications


Computer systems: A fixed number of jobs are processed in a system,
moving between CPU, memory, and I/O devices.
Production lines: A limited number of workpieces circulate through
different machines.
Hospital systems: A limited number of patients move between
consultation, testing, and treatment stations.

[Link] Steady State Solutions


The probability of customers at each queue is analyzed using state
probabilities.
Performance metrics include throughput (number of completed tasks per
unit time), queue length, and system utilization.
Since the number of customers is fixed, their movement between queues
affects queue lengths dynamically.

4.7.3 Jackson Networks


A Jackson Network is a special type of queueing network where queues
interact in a way that allows for simplified analysis.

[Link] Definition and Characteristics
Customers arrive and move between queues based on predefined
probabilities.
Each queue follows M/M/c behavior, meaning Poisson arrivals,
exponential service times, and multiple servers.
The network can be either open or closed.
The system can be analyzed as a collection of independent queues.

[Link] Jackson’s Theorem


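Jackson’s Theorem applies to an open queueing network with Poisson
external arrivals, exponential service times, probabilistic routing,
unlimited queue capacity, and utilization below 1 at every node. Under
these conditions, the steady-state joint distribution of queue lengths is
the product of the individual M/M/c distributions, so the queues can be
analyzed independently.
The arrival rate at each node is determined from the traffic equations
λi = γi + Σj λj pji, where γi is the external arrival rate at node i and
pji is the probability that a customer leaving node j goes to node i.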
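
The arrival rate at node i combines its external arrival rate with the flow
routed to it from every other node. The sketch below solves these traffic
equations by simple fixed-point iteration and then treats each node as an
independent M/M/1 queue. The function name, the two-node network, its routing
probabilities, and its service rates are all hypothetical.

```python
def solve_traffic_equations(gamma, P, tol=1e-12, max_iter=10_000):
    """Solve lam[i] = gamma[i] + sum_j lam[j] * P[j][i] by fixed-point iteration.

    gamma[i] is the external arrival rate at node i; P[j][i] is the
    probability that a customer leaving node j goes next to node i.
    """
    n = len(gamma)
    lam = list(gamma)
    for _ in range(max_iter):
        new = [gamma[i] + sum(lam[j] * P[j][i] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(new, lam)) < tol:
            return new
        lam = new
    raise RuntimeError("traffic equations did not converge")


# Node 0 receives 1 customer per unit time from outside and sends all output
# to node 1; node 1 sends half of its output back to node 0, the rest exits.
gamma = [1.0, 0.0]
P = [[0.0, 1.0],
     [0.5, 0.0]]
mu = [3.0, 4.0]

rates = solve_traffic_equations(gamma, P)
node_L = [(r / m) / (1.0 - r / m) for r, m in zip(rates, mu)]  # M/M/1 per node
total_L = sum(node_L)
total_W = total_L / sum(gamma)  # Little's Law for the whole network
print(rates, node_L, total_L, total_W)
```

Iteration is used here only to keep the example dependency-free; for larger
networks the same linear system would normally be solved directly.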

[Link] Performance Metrics and Analysis


The system performance is determined by analyzing each queue
separately.
Metrics include queue length, waiting time, and server utilization.
In steady state, the throughput of each queue equals its total arrival
rate, and the throughput of an open network equals the total external
arrival rate.

4.7.4 Gordon-Newell Networks


A Gordon-Newell Network is a type of closed queueing network where
customers continuously circulate among queues.

[Link] Definition and Characteristics


A fixed number of customers move through different service stations.
Each station has one or more servers with exponentially distributed
service times, and customers are routed between stations according to
fixed probabilities.
Since there are no external arrivals, queue occupancy depends on the
movement of customers already inside.

[Link] Steady State Solutions


The steady-state probabilities depend on the number of customers at each
queue.
Since the system has a fixed population, as one queue gets busier, others
must have fewer customers.
The key performance measure is the throughput, which depends on how
quickly customers circulate through the network.

[Link] Performance Metrics and Analysis
System throughput: The rate at which customers complete service cycles.
Queue occupancy: The number of customers in each queue at any time.
Server utilization: How busy each service point is over time.
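
These measures can be computed without enumerating every network state by
using Mean Value Analysis (MVA), a standard technique for closed
product-form networks. It is offered here as an illustrative alternative, not
as the method of the text above. The sketch assumes single-server stations;
the function name mva, the visit ratios, and the service rates are
hypothetical.

```python
def mva(visit_ratios, service_rates, population):
    """Mean Value Analysis for a closed network of single-server stations.

    Returns (throughput, mean queue lengths, utilizations), where throughput
    is measured at a station whose visit ratio is 1.
    """
    m = len(visit_ratios)
    L = [0.0] * m
    X = 0.0
    for k in range(1, population + 1):
        # An arriving customer sees the network as it is with k - 1 customers.
        W = [(1.0 + L[i]) / service_rates[i] for i in range(m)]
        X = k / sum(visit_ratios[i] * W[i] for i in range(m))
        L = [X * visit_ratios[i] * W[i] for i in range(m)]
    utilization = [X * visit_ratios[i] / service_rates[i] for i in range(m)]
    return X, L, utilization


# Two identical stations visited in a cycle by 2 circulating customers.
X, L_mva, U = mva(visit_ratios=[1.0, 1.0], service_rates=[1.0, 1.0], population=2)
print(X, L_mva, U)
```

The queue lengths always sum to the fixed population, reflecting the point
that when one queue gets busier the others must hold fewer customers.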

4.8 APPLICATIONS OF QUEUEING THEORY


Queueing theory has broad applications across various industries, helping
optimize efficiency and resource allocation.

4.8.1 Telecommunications
[Link] Call Centers and Telephony Systems
Call centers model customer interactions as queueing systems with
multiple servers (agents).
Helps optimize staffing to minimize customer waiting time while reducing
costs.

[Link] Internet Traffic and Data Packets


Data packets in networks are queued at routers and servers.
Helps optimize bandwidth allocation, latency reduction, and network
congestion control.

4.8.2 Manufacturing and Production


4.8.2.1 Assembly Lines and Workstations
Queueing models help balance workloads across production stations.
This reduces bottlenecks and improves efficiency in production processes.

4.8.2.2 Inventory and Supply Chain Management
Queueing models help in managing supply chain logistics by optimizing
storage and reducing delays in product delivery.

4.8.3 Healthcare Systems


4.8.3.1 Patient Flow and Hospital Management
Hospitals use queueing models to optimize patient flow, reduce wait times,
and allocate resources like beds and staff.

4.8.3.2 Scheduling and Resource Allocation

Queueing theory helps optimize the scheduling of surgeries, doctor
appointments, and emergency room operations.

4.8.4 Transportation and Logistics

4.8.4.1 Traffic Flow and Control
Queueing analysis helps design traffic signals, manage toll booths, and
reduce congestion by modeling vehicle movement as a queueing system.

4.8.4.2 Airport and Seaport Operations

Airports use queueing models to optimize aircraft takeoff and landing
schedules, baggage handling, and customs processing.
Seaports use queueing models to manage ship docking, cargo loading, and
unloading operations efficiently.

4.9 SUMMARY
● Queueing Systems involve the study of waiting lines, analyzing arrival
and service processes to optimize resource utilization.
● Components of Queueing Systems include arrival process, service
mechanism, queue discipline, number of servers, system capacity, and
population size.
● Queueing Disciplines determine how customers are served, including
FIFO (First-In-First-Out), LIFO (Last-In-First-Out), Priority Queues,
and Round Robin.
● Kendall's Notation (A/B/C/K/N/D) is used to describe queueing
models, specifying arrival and service distributions, number of servers,
system capacity, population size, and queue discipline.
● Performance Metrics such as average number in the system (L),
average time in the system (W), utilization factor (ρ), and probability
of n customers (Pn) help evaluate queue efficiency.
● Markovian Queueing Models (M/M/1, M/M/c, etc.) assume Poisson
arrivals and exponential service times, providing steady-state solutions
for performance analysis.
● Finite-Capacity Models (M/M/1/K, M/M/c/K, etc.) limit the number of
customers allowed in the system, affecting queueing behavior and
resource allocation.
● Queueing Networks can be open, closed, or structured as Jackson or
Gordon-Newell networks, modeling complex real-world systems.
● Applications of Queueing Theory span telecommunications,
manufacturing, healthcare, and transportation, optimizing operations in
industries like call centers, hospitals, and traffic management.
● Little's Law (L = λW) establishes a fundamental relationship between
the average number of customers, the arrival rate, and the average time
spent in a stable queueing system.
4.10 QUESTIONS FOR PRACTICE

1. Define a queueing system and its main components.


2. Explain the differences between FIFO, LIFO, and Priority Queues.
3. What are the key assumptions in queueing theory regarding arrival and
service rates?
4. Describe Kendall’s Notation and provide an example.
5. Derive and interpret the formula for the average number of customers
in the system (L).
6. Explain the significance of the utilization factor (ρ) in queueing
models.
7. Compare M/M/1 and M/M/c queueing models in terms of their
assumptions and steady-state behavior.
8. What is the impact of system capacity (K) on finite population
queueing models?
9. How does Little’s Law relate to performance analysis in queueing
systems?
10. Discuss an application of queueing theory in healthcare or
telecommunications.

5
RANDOM NUMBER GENERATION
Unit Structure :
5.0 Objective
5.1 Properties of random numbers
5.2 Generation of pseudo random numbers
5.3 Techniques for generating random numbers
5.4 Tests for random numbers
5.5 Summary
5.6 Exercise

5.0 OBJECTIVE
This chapter describes the generation of random numbers and their
subsequent testing for randomness.

The objectives of this chapter are:

● To understand that random numbers are a necessary basic ingredient
in the simulation of almost all discrete systems. Most computer
languages have a subroutine, object, or function that will generate a
random number.
● To see how simulation languages use random numbers to generate
event times and other random variables.
● To show how random numbers are used to generate a random variable
with any desired probability distribution.

5.1 PROPERTIES OF RANDOM NUMBERS


● A sequence of random numbers, R1, R2, ..., must have two important
statistical properties:
○ uniformity and
○ independence.
● Each random number Ri must be an independent sample drawn from a
continuous uniform distribution between 0 and 1; that is, the pdf is
given by

f(x) = 1 for 0 ≤ x ≤ 1, and f(x) = 0 otherwise.

● The expected value of each Ri is given by

E(R) = \int_0^1 x dx = 1/2

● and the variance is

V(R) = \int_0^1 x^2 dx - [E(R)]^2 = 1/3 - 1/4 = 1/12

● Some consequences of the uniformity and independence properties
are the following:
○ If the interval [0, 1] is divided into n classes, or subintervals of
equal length, the expected number of observations in each interval
is N/n, where N is the total number of observations.
○ The probability of observing a value in a particular interval is
independent of the previous values drawn.
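
The properties above can be checked informally on a generated sample. The following is a minimal sketch, assuming Python's standard `random` module as the source of numbers; the function name, seed, and sample size are illustrative choices. It compares the sample mean and variance with 1/2 and 1/12 and counts the observations in n equal classes, which should each hold about N/n values.

```python
import random


def summarize_sample(sample, n_classes):
    """Return (mean, variance, class_counts) for numbers in [0, 1]."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((r - mean) ** 2 for r in sample) / (n - 1)
    counts = [0] * n_classes
    for r in sample:
        # Class k covers [k/n_classes, (k+1)/n_classes); r == 1.0 is
        # placed in the last class.
        counts[min(int(r * n_classes), n_classes - 1)] += 1
    return mean, variance, counts


if __name__ == "__main__":
    rng = random.Random(12345)          # illustrative seed
    sample = [rng.random() for _ in range(10_000)]
    mean, variance, counts = summarize_sample(sample, n_classes=10)
    print("mean    :", round(mean, 4), "(ideal 0.5)")
    print("variance:", round(variance, 4), "(ideal", round(1 / 12, 4), ")")
    print("counts  :", counts, "(ideal", len(sample) // 10, "per class)")
```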

5.2 GENERATION OF PSEUDO RANDOM NUMBERS


● "Pseudo" means false, so false random numbers are being generated.
● In this instance, "pseudo" is used to imply that the very act of
generating random numbers by a known method removes the potential
for true randomness.
● If the method is known, the set of random numbers can be replicated.
● Then an argument can be made that the numbers are not truly random.
● The goal of any generation scheme, however, is to produce a sequence
of numbers between 0 and 1 that simulates, or imitates, the ideal
properties of uniform distribution and independence as closely as
possible.
● To be sure, in the generation of pseudo-random numbers, certain
problems or errors can occur.
● These errors, or departures from ideal randomness, are all related to
the properties stated previously.
● Some examples of such departures include the following:
○ The generated numbers might not be uniformly distributed.
○ The generated numbers might be discrete-valued instead of
continuous-valued.
○ The mean of the generated numbers might be too high or too low.
○ The variance of the generated numbers might be too high or too low.
○ There might be dependence. The following are examples:
(a) autocorrelation between numbers;
(b) numbers successively higher or lower than adjacent numbers;
(c) several numbers above the mean followed by several numbers
below the mean.
● Departures from uniformity and independence for a particular
generation scheme often can be detected by statistical tests such as
those described in Section 5.4.
● If such departures are detected, the generation scheme should be
dropped in favor of an acceptable generator.
● Generators that pass these tests, and even more stringent ones, have
been developed; thus, there is no excuse for using a generator that has
been found to be defective.
● Usually, random numbers are generated by a digital computer, as part
of the simulation.
● There are numerous methods that can be used to generate the values.
● Before we describe some of these methods, or routines, there are a
number of important considerations that we should mention:

1. The routine should be fast.


○ Individual computations are inexpensive, but simulation could require
many millions of random numbers.
○ The total cost can be managed by selecting a computationally efficient
method of random number generation.
2. The routine should be portable to different computers and ideally,
to different programming languages.
○ This is desirable so that the simulation program will produce the same
results wherever it is executed.

3. The routine should have a sufficiently long cycle.


○ The cycle length, or period, represents the length of the random
number sequence before previous numbers begin to repeat themselves
in an earlier order.

○ Thus, if 10,000 events are to be generated, the period should be many
times that long.
○ A special case of cycling is degenerating.
○ A routine degenerates when the same random numbers appear
repeatedly.
○ Such an occurrence is certainly unacceptable.
○ This can happen rapidly with some methods.

4. The random numbers should be replicable.


○ Given the starting point (or conditions), it should be possible to
generate the same set of random numbers, completely independent of
the system that is being simulated.
○ This is helpful for debugging purposes and is a means of facilitating
comparisons between systems.
○ For the same reasons, it should be possible to easily specify different
starting points, widely separated, within the sequence.

5. The generated random numbers should closely approximate the
ideal statistical properties of uniformity and independence.
● Inventing techniques that seem to generate random numbers is easy;
inventing techniques that really do produce sequences that appear to be
independent, uniformly distributed random numbers is incredibly
difficult.
● There is now a vast literature and rich theory on the topic, and many
hours of testing have been devoted to establishing the properties of
various generators.
● Even when a technique is known to be theoretically sound, it is seldom
easy to implement it in a way that will be fast and portable.
● The goal of this chapter is to make the reader aware of the central
issues in random-number generation, to enhance understanding and to
show some of the techniques that are used by those working in this
area.
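
The replicability consideration listed above (the same starting point must reproduce the same numbers) can be illustrated with a short sketch. It assumes Python's standard `random.Random` class, which is not a linear congruential generator; the helper name and the seeds are hypothetical.

```python
import random


def make_sequence(seed, count):
    """Return `count` pseudo-random numbers produced from a given seed."""
    rng = random.Random(seed)   # private generator, isolated from other code
    return [rng.random() for _ in range(count)]


if __name__ == "__main__":
    run_a = make_sequence(seed=2024, count=5)
    run_b = make_sequence(seed=2024, count=5)   # same starting point
    run_c = make_sequence(seed=2025, count=5)   # different starting point
    print("same seed reproduces the run  :", run_a == run_b)
    print("different seed changes the run:", run_a != run_c)
```

Keeping a separate generator object per purpose, rather than one shared global generator, is what makes a debugging run repeatable even when other parts of the program also draw random numbers.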

5.3 TECHNIQUES FOR GENERATING RANDOM NUMBERS
The linear congruential method is the most widely used technique for
generating random numbers, so we describe it in detail.
We also report an extension of this method that yields sequences with a
longer period.
Many other methods have been proposed, and they are reviewed in
Bratley, Fox, and Schrage [1996], Law and Kelton [2000], and Ripley
[1987].

5.3.1 Linear Congruential Method:


● The linear congruential method, initially proposed by Lehmer [1951],
produces a sequence of integers X1, X2, ... between 0 and m - 1 by
following a recursive relationship:

X_{i+1} = (a X_i + c) mod m,  i = 0, 1, 2, ...

● The initial value X0 is called the seed, a is called the multiplier, c is the
increment, and m is the modulus.
● If c ≠ 0 in this recurrence, then the form is called the mixed
congruential method.
● When c = 0, the form is known as the multiplicative congruential
method.
● The selection of the values for a, c, m, and X0 drastically affects the
statistical properties and the cycle length.
● Variations of this recurrence are quite common in the computer
generation of random numbers.
● An example will illustrate how this technique operates.

Example 1:
Use the linear congruential method to generate a sequence of random
numbers with:
X0 = 27, a = 17, c = 43, and m = 100.
Here, the integer values generated will all be between 0 and 99 because
of the value of the modulus.
Also, notice that random integers are being generated rather than random
numbers.
These random integers should appear to be uniformly distributed on the
integers 0 to 99.
Random numbers between 0 and 1 can be generated by

R_i = X_i / m,  i = 1, 2, ...

The sequence of X_i and subsequent R_i values is computed as follows:

X0 = 27

X1 = (17 × 27 + 43) mod 100 = 502 mod 100 = 2,  R1 = 2/100 = 0.02

X2 = (17 × 2 + 43) mod 100 = 77 mod 100 = 77,  R2 = 77/100 = 0.77

X3 = (17 × 77 + 43) mod 100 = 1352 mod 100 = 52,  R3 = 52/100 = 0.52

Recall that a = b mod m provided that (b - a) is divisible by m with no
remainder.
Thus, X1 = 502 mod 100, but 502/100 equals 5 with a remainder of 2, so
that X1 = 2.
In other words, (502 - 2) is evenly divisible by m = 100, so X1 = 502
"reduces" to X1 = 2 mod 100.
The ultimate test of the linear congruential method, as of any generation
scheme, is how closely the generated numbers R1, R2, ... approximate
uniformity and independence.
There are, however, several secondary properties that must be considered.
These include maximum density and maximum period.
First, notice that the numbers generated by R_i = X_i / m assume values only
from the set
I = {0, 1/m, 2/m, ..., (m - 1)/m}, because each X_i is an integer in the set
{0, 1, 2, ..., m - 1}.
Thus, each R_i is discrete on I, instead of continuous on the interval [0, 1].
This approximation appears to be of little consequence if the modulus m is
a very large integer. (Values such as m = 2^31 - 1 and m = 2^48 are in
common use in generators appearing in many simulation languages.)
By maximum density is meant that the values assumed by R_i, i = 1, 2, ...,
leave no large gaps on [0, 1].
Second, to help achieve maximum density, and to avoid cycling (i.e.,
recurrence of the same sequence of generated numbers) in practical
applications, the generator should have the largest possible period.
Maximal period can be achieved by the proper choice of a, c, m, and X0
[Fishman, 1978; Law and Kelton, 2000].
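
The linear congruential recurrence and the numbers of Example 1 can be reproduced with a few lines of code. This is a minimal sketch; the generator function name is an illustrative choice, and the parameters are those of Example 1 (X0 = 27, a = 17, c = 43, m = 100).

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield X1, X2, ... from the recurrence X_{i+1} = (a*X_i + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


if __name__ == "__main__":
    m = 100
    for i, x in enumerate(islice(lcg(seed=27, a=17, c=43, m=m), 3), start=1):
        # Every R_i is a multiple of 1/m, so the output is discrete.
        print(f"X{i} = {x:2d}   R{i} = {x / m:.2f}")
```

With m = 100 the only possible outputs are 0.00, 0.01, ..., 0.99, which makes the density limitation discussed above easy to see.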

● For m a power of 2, say m = 2^b, and c ≠ 0, the longest possible period
is P = m = 2^b, which is achieved whenever c is relatively prime to m
(that is, the greatest common factor of c and m is 1) and a = 1 + 4k,
where k is an integer.
● For m a power of 2, say m = 2^b, and c = 0, the longest possible period
is P = m/4 = 2^(b-2), which is achieved if the seed X0 is odd and if the
multiplier, a, is given by a = 3 + 8k or a = 5 + 8k, for some k = 0, 1, ....
● For m a prime number and c = 0, the longest possible period is P = m -
1, which is achieved whenever the multiplier, a, has the property that
the smallest integer k such that a^k - 1 is divisible by m is k = m - 1.

Example 2
Using the multiplicative congruential method, find the period of the
generator for a = 13, m = 2^6 = 64 and X0 = 1, 2, 3, and 4.
The solution is given in the following table.

Period Determination Using Various Seeds

X0 = 1: 1, 13, 41, 21, 17, 29, 57, 37, 33, 45, 9, 53, 49, 61, 25, 5, 1 (period 16)
X0 = 2: 2, 26, 18, 42, 34, 58, 50, 10, 2 (period 8)
X0 = 3: 3, 39, 59, 63, 51, 23, 43, 47, 35, 7, 27, 31, 19, 55, 11, 15, 3 (period 16)
X0 = 4: 4, 52, 36, 20, 4 (period 4)

When the seed is 1 or 3, the sequence has period 16.
However, a period of length eight is achieved when the seed is 2 and a
period of length four occurs when the seed is 4.
In this Example 2, m = 2^6 = 64 and c = 0.
The maximal period is therefore P = m/4 = 16.
Notice that this period is achieved by using odd seeds,
X0 = 1 and X0 = 3; even seeds, X0 = 2 and X0 = 4, yield the periods eight
and four, respectively, both less than the maximum.
Notice that a = 13 is of the form 5 + 8k with k = 1, as is required to achieve
maximal period.
When X0 = 1, the generated sequence assumes values from the set {1, 5,
9, 13, ..., 53, 57, 61}.
The "gaps" in the sequence of generated random numbers, R_i, are quite
large (i.e., the gap is 5/64 - 1/64 or 0.0625).
Such a gap gives rise to concern about the density of the generated
sequence.
The generator in Example 2 is not viable for any application: its period is
too short, and its density is insufficient.
However, the example shows the importance of properly choosing a, c, m,
and X0.
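
The periods in Example 2 can be confirmed by brute force. The sketch below, with an illustrative function name, iterates the recurrence until a value repeats and reports the length of the cycle; it assumes the modulus is small enough that storing every visited value is practical.

```python
def lcg_period(a, c, m, seed):
    """Return the cycle length of X_{i+1} = (a*X_i + c) mod m from `seed`."""
    first_seen = {}            # value -> step at which it first appeared
    x, step = seed, 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + c) % m
        step += 1
    # The cycle runs from the first appearance of x to its repetition.
    return step - first_seen[x]


if __name__ == "__main__":
    for seed in (1, 2, 3, 4):
        print("seed", seed, "-> period", lcg_period(a=13, c=0, m=64, seed=seed))
```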
Speed and efficiency in using the generator on a digital computer is also a
selection consideration. Speed and efficiency are aided by use of a
modulus, m, which is either a power of 2 or close to a power of 2.
Since most digital computers use a binary representation of numbers, the
modulo, or remaindering, operation of the recurrence can be conducted
efficiently when the modulus is a power of 2 (i.e., m = 2^b).
After ordinary arithmetic yields a value for aX_i + c, X_{i+1} is obtained by
dropping the leftmost binary digits in aX_i + c and then using only the b
rightmost binary digits.
The same operation can be pictured in decimal, using m = 10^b, because
most human beings think in decimal representation: with m = 10^2 = 100,
the value 1352 reduces to its two rightmost digits, 52.
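
The shortcut for a power-of-2 modulus can be checked directly. In this small sketch (the helper name is an assumption), keeping the b rightmost binary digits is done with a bit mask and compared against the ordinary remainder operation.

```python
def mod_pow2(value, b):
    """Remainder of a non-negative integer modulo 2**b via a bit mask."""
    return value & ((1 << b) - 1)   # keeps only the b rightmost bits


if __name__ == "__main__":
    value = 13 * 61                 # a * X_i for a = 13, X_i = 61
    print(value % 64, mod_pow2(value, 6))   # both give the same remainder
    # Decimal analogy: m = 10**2 keeps the two rightmost decimal digits.
    print(1352 % 10**2, int(str(1352)[-2:]))
```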


Combined Linear Congruential Generators:


● As computing power has increased, the complexity of the systems that
we are able to simulate has also increased.
● A random-number generator with period 2^31 - 1 ≈ 2 × 10^9, such as the
popular generators that use the modulus m = 2^31 - 1, is no longer
adequate for all applications.
● Examples include the simulation of highly reliable systems, in which
hundreds of thousands of elementary events must be simulated to
observe even a single failure event, and the simulation of complex
computer networks, in which thousands of users are executing
hundreds of programs.
● An area of current research is the deriving of generators with
substantially longer periods.
● One fruitful approach is to combine two or more multiplicative
congruential generators in such a way that the combined generator has
good statistical properties and a longer period.
The following result from L'Ecuyer [1988] suggests how this can be done:
● If W_{i,1}, W_{i,2}, ..., W_{i,k} are any independent, discrete-valued random
variables (not necessarily identically distributed), but one of them, say
W_{i,1}, is uniformly distributed on the integers from 0 to m1 - 2, then

W_i = ( \sum_{j=1}^{k} W_{i,j} ) mod (m1 - 1)

is uniformly distributed on the integers from 0 to m1 - 2.
● To see how this result can be used to form combined generators, let
X_{i,1}, X_{i,2}, ..., X_{i,k} be the ith output from k different multiplicative
congruential generators, where the jth generator has prime modulus m_j
and the multiplier a_j is chosen so that the period is m_j - 1.
● Then the jth generator is producing integers X_{i,j} that are approximately
uniformly distributed on the integers from 1 to m_j - 1, and W_{i,j} = X_{i,j} - 1
is approximately uniformly distributed on the integers from 0 to m_j - 2.
● L'Ecuyer [1988] therefore suggests combined generators of the form

X_i = ( \sum_{j=1}^{k} (-1)^{j-1} X_{i,j} ) mod (m1 - 1)

with

R_i = X_i / m1 if X_i > 0, and R_i = (m1 - 1) / m1 if X_i = 0.

● Notice that the "(-1)^{j-1}" coefficient implicitly performs the subtraction
X_{i,1} - 1; for example, if k = 2, then
(-1)^0 (X_{i,1} - 1) + (-1)^1 (X_{i,2} - 1) = \sum_{j=1}^{2} (-1)^{j-1} X_{i,j}.
● The maximum possible period for such a generator is

P = (m1 - 1)(m2 - 1) ... (mk - 1) / 2^{k-1}

● which is achieved by the generator described in the next example.

Example:
For 32-bit computers, L'Ecuyer [1988] suggests combining k = 2
generators with m1 = 2,147,483,563,
a1 = 40,014, m2 = 2,147,483,399 and a2 = 40,692.

This leads to the following algorithm:

Step 1.
Select seed X_{1,0} in the range [1, 2,147,483,562] for the first generator, and
seed X_{2,0} in the range [1, 2,147,483,398] for the second.
Set j = 0.

Step 2.
Evaluate each individual generator.
X_{1,j+1} = 40,014 X_{1,j} mod 2,147,483,563
X_{2,j+1} = 40,692 X_{2,j} mod 2,147,483,399

Step 3.
Set X_{j+1} = (X_{1,j+1} - X_{2,j+1}) mod 2,147,483,562

Step 4.
Return
R_{j+1} = X_{j+1} / 2,147,483,563 if X_{j+1} > 0, and
R_{j+1} = 2,147,483,562 / 2,147,483,563 if X_{j+1} = 0.

Step 5.
Set j = j + 1 and go to Step 2.
This combined generator has period (m1 - 1)(m2 - 1)/2 ≈ 2 × 10^18.
Perhaps surprisingly, even such a long period might not be adequate for all
applications.
See L'Ecuyer [1996, 1999] and L'Ecuyer et al. [2002] for combined
generators with periods as long as 2^191 ≈ 3 × 10^57.
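
The five steps of the example can be sketched as a generator function. The constants are those quoted in the example; the function name and the seeds are illustrative assumptions, and Python's arbitrary-precision integers stand in for the careful 32-bit arithmetic that a production implementation would need.

```python
from itertools import islice

M1, A1 = 2_147_483_563, 40_014
M2, A2 = 2_147_483_399, 40_692


def combined_lcg(seed1, seed2):
    """Yield R1, R2, ... from the two-generator combination (Steps 1 to 5)."""
    if not (1 <= seed1 <= M1 - 1 and 1 <= seed2 <= M2 - 1):
        raise ValueError("seeds must lie in [1, m1 - 1] and [1, m2 - 1]")
    x1, x2 = seed1, seed2
    while True:
        x1 = (A1 * x1) % M1                  # Step 2, first generator
        x2 = (A2 * x2) % M2                  # Step 2, second generator
        x = (x1 - x2) % (M1 - 1)             # Step 3; result is never negative
        yield x / M1 if x > 0 else (M1 - 1) / M1   # Step 4


if __name__ == "__main__":
    for r in islice(combined_lcg(seed1=12345, seed2=67890), 5):
        print(round(r, 6))
```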

Random Number Streams


● The seed for a linear congruential random-number generator (seeds, in
the case of a combined linear congruential generator) is the integer
value X0 that initializes the random-number sequence.
● Since the sequence of integers X0, X1, ..., XP, X1, ... produced by a
generator repeats, any value in the sequence could be used to "seed"
the generator.
● For a linear congruential generator, a random-number stream is
nothing more than a convenient way to refer to a starting seed taken
from the sequence (for a combined generator, starting seeds for all of
the basic generators are required); typically these starting seeds are far
apart in the sequence.
● For instance, if the streams are b values apart, then stream i could be
defined by starting seed S_i = X_{b(i-1)} for i = 1, 2, ...
● Values of b = 100,000 were common in older generators, but far larger
values of b are in use in modern combined linear congruential
generators.
(See, for instance, L'Ecuyer et al. [2002] for the implementation of such a
generator.)
● Thus, a single random-number generator with k streams acts like k
distinct virtual random-number generators, provided that the current
value of seed for each stream is maintained.
● Spacing the starting seeds b values apart is one way to create streams
that are widely separated in the random-number sequence.
● When two or more alternative systems are compared via simulation,
there are advantages to dedicating portions of the pseudorandom number
sequence to the same purpose in each of the simulated systems.
● For instance, in comparing the efficiency of several queueing systems,
a fairer comparison will be achieved if all of the simulated systems
experience exactly the same sequence of customer arrivals.
● Such synchronization can be achieved by assigning a specific stream
to generate arrivals in each of the queueing simulations.
● If the starting seeds for the streams are spaced far enough apart, then
this has the same effect as having a distinct random-number generator
whose only purpose is to generate customer arrivals.
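
For a multiplicative congruential generator, a starting seed that lies b values ahead can be computed without stepping through the sequence, because X_n = (a^n X0) mod m. The sketch below uses this identity to obtain stream seeds b values apart; the function name is hypothetical, and the small parameters a = 13, m = 64 from Example 2 are used only so the result can be checked by hand.

```python
def stream_seed(x0, a, m, b, i):
    """Starting seed of stream i (i = 1, 2, ...) for X_{n+1} = a*X_n mod m.

    Stream i starts b*(i - 1) values after X0, so its seed is
    (a**(b*(i - 1)) * x0) mod m, computed with modular exponentiation.
    """
    if i < 1:
        raise ValueError("stream index starts at 1")
    return (pow(a, b * (i - 1), m) * x0) % m


if __name__ == "__main__":
    # With a = 13, m = 64, X0 = 1 the sequence is 1, 13, 41, 21, 17, ...
    for i in (1, 2, 3):
        print("stream", i, "starts at", stream_seed(x0=1, a=13, m=64, b=4, i=i))
```

Dedicating one such stream to customer arrivals in every simulated system is what gives each system exactly the same arrival sequence.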

5.4 TESTS FOR RANDOM NUMBERS


To check on whether these desirable properties have been achieved, a
number of tests can be performed.
(Fortunately, the appropriate tests have already been conducted for most
commercial simulation software.)
The tests can be placed in two categories, according to the properties of
interest: uniformity and independence.

A brief description of two types of tests is given in this chapter:


1. Frequency test.
● Uses the Kolmogorov-Smirnov or the chi-square test to compare the
distribution of the set of numbers generated to a uniform distribution.

2. Autocorrelation test.
● Tests the correlation between numbers and compares the sample
correlation to the expected correlation, zero.
● In testing for uniformity, the hypotheses are as follows:
H0: Ri ~ U[0, 1]
H1: Ri ≁ U[0, 1]
● The null hypothesis, H0, reads that the numbers are distributed
uniformly on the interval [0, 1].
● Failure to reject the null hypothesis means that evidence of
nonuniformity has not been detected by this test.
● This does not imply that further testing of the generator for uniformity
is unnecessary.
● In testing for independence, the hypotheses are as follows:
H0: Ri ~ independently
H1: Ri ≁ independently
● This null hypothesis, H0, reads that the numbers are independent.
● Failure to reject the null hypothesis means that evidence of
dependence has not been detected by this test.
● This does not imply that further testing of the generator for
independence is unnecessary.
● For each test, a level of significance α must be stated.
● The level α is the probability of rejecting the null hypothesis when the
null hypothesis is true:
α = P(reject H0 | H0 true)
● The decision maker sets the value of α for any test. Frequently, α is set
to 0.01 or 0.05.
● If several tests are conducted on the same set of numbers, the
probability of rejecting the null hypothesis on at least one test, by
chance alone [i.e., making a Type I (α) error], increases.
● Say that α = 0.05 and that five different tests are conducted on a
sequence of numbers.
● The probability of rejecting the null hypothesis on at least one test, by
chance alone, could be as large as 5 × 0.05 = 0.25.
● Similarly, if one test is conducted on many sets of numbers from a
generator, the probability of rejecting the null hypothesis on at least
one test by chance alone [i.e., making a Type I (α) error], increases as
more sets of numbers are tested.
● For instance, if 100 sets of numbers were subjected to the test, with α =
0.05, it would be expected that about five of those tests would lead to
rejection by chance alone.
● If the number of rejections in 100 tests is close to 100α, then there is
no compelling reason to discard the generator.
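
The arithmetic behind these statements can be made explicit. In the sketch below, the upper bound k × α holds for any set of k tests, while the exact expression 1 - (1 - α)^k additionally assumes that the tests are independent of one another; that independence and the function names are assumptions made for illustration.

```python
def rejection_upper_bound(alpha, n_tests):
    """Upper bound on P(at least one false rejection) for any n_tests tests."""
    return min(1.0, alpha * n_tests)


def rejection_prob_independent(alpha, n_tests):
    """P(at least one false rejection) if the tests are independent."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    print("bound for 5 tests      :", rejection_upper_bound(0.05, 5))
    print("independent, 5 tests   :", round(rejection_prob_independent(0.05, 5), 4))
    print("expected rejections/100:", 100 * 0.05)
```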
● If one of the well-known simulation languages or random-number
generators is used, it is probably unnecessary to apply the tests
described in this section. However, random number generators
frequently are added to software that is not specifically developed for
simulation, such as spreadsheet programs, symbolic/numerical
calculators, and programming languages.
● If the generator that is at hand is not explicitly known or documented,
then the tests in this chapter should be applied to many samples of
numbers from the generator.
● Some additional tests that are commonly used, but are not covered
here, are Good's serial test for sampling numbers [1953, 1967], the
median-spectrum test [Cox and Lewis, 1966; Durbin, 1967], the runs
test [Law and Kelton, 2000], and a variance heterogeneity test [Cox and
Lewis, 1966].
● Even if a set of numbers passes all the tests, there is no guarantee of
randomness; it is always possible that some underlying pattern has
gone undetected.
● In this chapter, we emphasize empirical tests that are applied to actual
sequences of numbers produced by a generator.
● Because of the extremely long period of modern pseudo-random-
number generators, it is no longer possible to apply these tests to a
significant portion of the period of such generators.
● The tests can be used as a check if one encounters a generator with
completely unknown properties (perhaps one that is undocumented
and buried deep in a software package), but they cannot be used to
establish the quality of a generator throughout its period.
● Fortunately, there are also families of theoretical tests that evaluate the
choices for m, a, and c without actually generating any numbers, the
most common being the spectral test.
● Many of these tests assess how k-tuples of random numbers fill up a k-
dimensional unit cube.
● These tests are beyond the scope of this book; see, for instance, Ripley
[1987].
● In the examples of tests that follow, the hypotheses are not restated.
● The hypotheses are as indicated in the foregoing paragraphs.
● Although few simulation analysts will need to perform these tests,
every simulation user should be aware of the qualities of a good
random-number generator.

Frequency Tests:
A basic test that should always be performed to validate a new generator is
the test of uniformity.
Two different methods of testing are available:
1. the Kolmogorov-Smirnov test and
2. the chi-square test
Both of these tests measure the degree of agreement between the
distribution of a sample of generated random numbers and the theoretical
uniform distribution.
Both tests are based on the null hypothesis of no significant difference
between the sample distribution and the theoretical distribution.
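
The chi-square test is named above but not worked through in this chapter, so the following is only a hedged sketch of its usual form: the interval [0, 1] is split into n equal classes, and the statistic is the sum over classes of (O_i - E_i)^2 / E_i, where O_i is the observed count and E_i = N/n is the expected count. The function name is an assumption, and the critical value (chi-square with n - 1 degrees of freedom) must be looked up separately.

```python
def chi_square_uniformity(sample, n_classes):
    """Chi-square statistic for uniformity of numbers in [0, 1]."""
    expected = len(sample) / n_classes          # E_i = N / n for every class
    observed = [0] * n_classes
    for r in sample:
        observed[min(int(r * n_classes), n_classes - 1)] += 1
    return sum((o - expected) ** 2 / expected for o in observed)


if __name__ == "__main__":
    even = [(k + 0.5) / 10 for k in range(10)]   # one value in each class
    lumped = [0.05] * 10                         # all values in one class
    print("evenly spread:", chi_square_uniformity(even, 10))
    print("all in one   :", chi_square_uniformity(lumped, 10))
```

A large statistic signals a departure from uniformity; with 10 classes and α = 0.05, the tabulated critical value is about 16.9.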

I. The Kolmogorov-Smirnov test.

This test compares the continuous cdf, F(x), of the uniform distribution
with the empirical cdf, S_N(x), of the sample of N observations.
By definition,
F(x) = x, 0 ≤ x ≤ 1
If the sample from the random-number generator is R1, R2, ..., RN, then the
empirical cdf, S_N(x), is defined by

S_N(x) = (number of R1, R2, ..., RN that are ≤ x) / N

As N becomes larger, S_N(x) should become a better approximation to F(x),
provided that the null hypothesis is true.
The cdf of an empirical distribution is a step function with jumps at each
observed value.
The Kolmogorov-Smirnov test is based on the largest absolute deviation
between F(x) and S_N(x) over
the range of the random variable; that is, it is based on the statistic
D = max |F(x) - S_N(x)|
The sampling distribution of D is known; it is tabulated as a function of N.
For testing against a uniform cdf, the test procedure follows these steps:

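Step 1:
Rank the data from smallest to largest.
Let R(i) denote the ith smallest observation, so that
R(1) ≤ R(2) ≤ ... ≤ R(N)

Step 2:
Compute
D+ = max over 1 ≤ i ≤ N of { i/N - R(i) }
D- = max over 1 ≤ i ≤ N of { R(i) - (i - 1)/N }

Step 3:
Compute D = max(D+, D-)

Step 4:
Locate the critical value D_α for the specified significance level α and
the given sample size N in a table of Kolmogorov-Smirnov critical values.

Step 5:
If the sample statistic D is greater than the critical value D_α, the null
hypothesis that the data are a sample from a uniform distribution is
rejected.
If D ≤ D_α, conclude that no difference has been detected between the true
distribution of {R1, R2, ..., RN} and the uniform distribution.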
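
Steps 1 to 3 of the procedure translate directly into code. This is a minimal sketch with an illustrative function name and five made-up sample values; the critical value D_α is not computed here and would still have to be taken from a table.

```python
def ks_statistic(sample):
    """Return (D_plus, D_minus, D) for a test against the uniform cdf F(x) = x."""
    data = sorted(sample)                       # Step 1: rank the data
    n = len(data)
    # enumerate() counts from 0, so the rank i of the text is index + 1.
    d_plus = max((idx + 1) / n - r for idx, r in enumerate(data))   # Step 2
    d_minus = max(r - idx / n for idx, r in enumerate(data))        # Step 2
    return d_plus, d_minus, max(d_plus, d_minus)                    # Step 3


if __name__ == "__main__":
    numbers = [0.44, 0.81, 0.14, 0.05, 0.93]    # illustrative values
    d_plus, d_minus, d = ks_statistic(numbers)
    print("D+ =", round(d_plus, 2), " D- =", round(d_minus, 2), " D =", round(d, 2))
```

The computed D is then compared with D_α for the chosen α and N = 5, as in Steps 4 and 5.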

Tests for Autocorrelation:


● The tests for autocorrelation are concerned with the dependence
between numbers in a sequence.
● As an example, consider the following sequence of numbers:

● From a visual inspection, these numbers appear random, and they
would probably pass all the tests presented to this point.
● However, an examination of the 5th, 10th, 15th (every five numbers
beginning with the fifth), and so on, indicates a very large number in
that position.
● Now, 30 numbers is a rather small sample size on which to reject a
random-number generator, but the notion is that numbers in the
sequence might be related.
● In this particular section, a method for discovering whether such a
relationship exists is described.
● The relationship would not have to be all high numbers.
● It is possible to have all low numbers in the locations being examined,
or the numbers could alternate from very high to very low.
● The test to be described shortly requires the computation of the
autocorrelation between every m numbers (m is also known as the lag),
starting with the ith number.
● Thus, the autocorrelation ρ_im between the following numbers would be
of interest:
R_i, R_{i+m}, R_{i+2m}, ..., R_{i+(M+1)m}.
● The value M is the largest integer such that i + (M + 1)m ≤ N, where N
is the total number of values in the sequence. (Thus, a subsequence of
length M + 2 is being tested.)

[Figure: Failure to reject hypothesis]

● A nonzero autocorrelation implies a lack of independence, so the
following two-tailed test is appropriate:

H0: ρ_im = 0
H1: ρ_im ≠ 0

● For large values of M, the distribution of the estimator of ρ_im,
denoted ρ̂_im, is approximately normal if the values R_i,
R_{i+m}, R_{i+2m}, ..., R_{i+(M+1)m} are uncorrelated.
● Then the test statistic can be formed as follows:

Z0 = (ρ̂_im - 0) / σ(ρ̂_im)

● which is distributed normally with a mean of zero and a variance of 1,
under the assumption of independence, for large M.
● The formula for ρ̂_im, in a slightly different form, and the standard
deviation of the estimator, σ(ρ̂_im), are given by Schmidt and Taylor [1970]
as follows:

ρ̂_im = (1 / (M + 1)) [ \sum_{k=0}^{M} R_{i+km} R_{i+(k+1)m} ] - 0.25

and

σ(ρ̂_im) = \sqrt{13M + 7} / (12(M + 1))

● After computing Z0, do not reject the null hypothesis of independence
if -z_{α/2} ≤ Z0 ≤ z_{α/2}, where α is the level of significance and z_{α/2} is
obtained from a table of the standard normal distribution.
● If ρ_im > 0, the subsequence is said to exhibit positive autocorrelation.
● In this case, successive values at lag m have a higher probability than
expected of being close in value (i.e., high random numbers in the
subsequence followed by high, and low followed by low).
● On the other hand, if ρ_im < 0, the subsequence is exhibiting negative
autocorrelation, which means that low random numbers tend to be
followed by high ones, and vice versa.
● The desired property, independence (which implies zero
autocorrelation), means that there is no discernible relationship of the
nature discussed here between successive random numbers at lag m.

● Therefore, the hypothesis of independence cannot be rejected on the
basis of this test.
● It can be observed that this test is not very sensitive for small values of
M, particularly when the numbers being tested are on the low side.
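
The autocorrelation test can be sketched in code using the usual Schmidt and Taylor form of the estimator, ρ̂_im = (1/(M + 1)) Σ R_{i+km} R_{i+(k+1)m} - 0.25 with standard deviation √(13M + 7) / (12(M + 1)). The function name and the data in the demonstration are illustrative assumptions; positions are 1-based as in the text, and the critical value z_{α/2} is taken from Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def autocorrelation_test(numbers, i, m, alpha=0.05):
    """Return (rho_hat, sigma, z0, independent) for start i and lag m."""
    n = len(numbers)
    big_m = (n - i) // m - 1        # largest M with i + (M + 1) * m <= N
    if big_m < 0:
        raise ValueError("sequence too short for this start and lag")
    # R_{i+km} is numbers[i - 1 + k*m] because Python lists are 0-indexed.
    total = sum(numbers[i - 1 + k * m] * numbers[i - 1 + (k + 1) * m]
                for k in range(big_m + 1))
    rho_hat = total / (big_m + 1) - 0.25
    sigma = math.sqrt(13 * big_m + 7) / (12 * (big_m + 1))
    z0 = rho_hat / sigma
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Do not reject independence when -z_crit <= z0 <= z_crit.
    return rho_hat, sigma, z0, abs(z0) <= z_crit


if __name__ == "__main__":
    data = [0.5] * 30                            # illustrative filler values
    for pos, value in zip((3, 8, 13, 18, 23, 28),
                          (0.23, 0.28, 0.33, 0.27, 0.05, 0.36)):
        data[pos - 1] = value
    rho_hat, sigma, z0, ok = autocorrelation_test(data, i=3, m=5)
    print(round(rho_hat, 4), round(sigma, 4), round(z0, 3), ok)
```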

5.5 SUMMARY
● This chapter described the generation of random numbers and the
subsequent testing of the generated numbers for uniformity and
independence.
● Random numbers are used to generate random variates.
● Of the many types of random-number generators available, ones based
on the linear congruential method are the most widely used, but they
are being replaced by combined linear congruential generators.
● Of the many types of statistical tests that are used in testing random-
number generators, two different types are described: one testing for
uniformity, and one testing for independence.
● The simulation analyst might never work directly with a random-
number generator or with the testing of random numbers from a
generator.
● Most computers and simulation languages have routines that generate
a random number, or streams of random numbers, for the asking.
● But even generators that have been used for years, some of which are
still in use, have been found to be inadequate.
● So this chapter calls the simulation analyst's attention to such
possibilities, with a warning to investigate and confirm that the
generator has been tested thoroughly.
● Some researchers have attained sophisticated expertise in developing
methods for generating and testing random numbers and the
subsequent application of these methods.
● This chapter provides only a basic introduction to the subject matter;
more depth and breadth are required for the reader to become a
specialist in the area.
● The bible is Knuth [1998]; see also the reviews in Bratley, Fox, and
Schrage [1996], Law and Kelton [2000], L'Ecuyer [1998], and
Ripley [1987].
● One final caution is due. Even if generated numbers pass all the tests
(those covered in this chapter and those mentioned in the chapter),
some underlying pattern might have gone undetected without the
generator's having been rejected as faulty.
5.5 EXERCISE
Answer the following:
1. Write a short note on Properties of random numbers
2. Write a short note on Generation of pseudo random numbers
3. Write a short note on Techniques for generating random numbers
4. Write a short note on Tests for random numbers
5. Generate random numbers using the multiplicative congruential method
with X0 = 5, a = 11, and m = 64.
6. Generate four-digit random numbers by the linear congruential method
with X0 = 21, a = 34, and c = 7.

Solved Examples:


6
RANDOM-VARIATE GENERATION
Unit Structure :
6.0 Objective
6.1 Inverse transform technique
6.1.1 Exponential Distribution
6.1.2 Uniform Distribution
6.1.3 Weibull Distribution
6.1.4 Triangular Distribution
6.1.5 Empirical Continuous Distributions
6.1.6 Continuous Distributions without a Closed­Form Inverse
6.1.7 Discrete Distributions
6.2 Acceptance rejection techniques
6.2.1 Poisson Distribution
6.2.2 Nonstationary Poisson Process
6.2.3 Gamma Distribution
6.3 Convolution method
6.4 Summary
6.5 Exercise

6.0 OBJECTIVE
The objectives of this chapter are:
● To deal with procedures for sampling from a variety of widely used
continuous and discrete distributions.
● To explain and illustrate some widely used techniques for generating
random variates, not to give a state-of-the-art survey of the most
efficient techniques.
● To discuss the inverse-transform technique and, more briefly, the
acceptance-rejection technique and special properties.

6.1 INVERSE TRANSFORM TECHNIQUE
● The inverse­transform technique can be used to sample from the
exponential, the uniform, the Weibull, the triangular distributions and
from empirical distributions.
● Additionally, it is the underlying principle for sampling from a wide
variety of discrete distributions.
● The technique will be explained in detail for the exponential
distribution and then applied to other distributions.
● Computationally, it is the most straightforward, but not always the
most efficient, technique.

6.1.1 Exponential Distribution


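The exponential distribution has the probability density function (pdf)

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

and the cumulative distribution function (cdf)

$$F(x) = \begin{cases} 1 - e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

The parameter $\lambda$ can be interpreted as the mean number of occurrences
per time unit.
For example, if interarrival times $X_1, X_2, X_3, \ldots$ had an exponential
distribution with rate $\lambda$, then $\lambda$ could be interpreted as the mean
number of arrivals per time unit, or the arrival rate.
Notice that, for any i,

$$E(X_i) = \frac{1}{\lambda}$$

and so $1/\lambda$ is the mean interarrival time.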

The goal here is to develop a procedure for generating values X1, X2, X3,
… that have an exponential distribution.
The inverse-transform technique can be utilized, at least in principle, for
any distribution, but it is most useful when the cdf, F(x), is of a form so
simple that its inverse, $F^{-1}$, can be computed easily.

One step-by-step procedure for the inverse-transform technique is as
follows:

Step 1:
Compute the cdf of the desired random variable X.
For the exponential distribution, the cdf is $F(x) = 1 - e^{-\lambda x}$, $x \ge 0$.

Step 2:
Set F(X) = R on the range of X.
For the exponential distribution, it becomes $1 - e^{-\lambda X} = R$ on the
range $x \ge 0$.
X is a random variable with the exponential distribution, so
$1 - e^{-\lambda X}$ is also a random variable, here called R.
R has a uniform distribution over the interval [0, 1].

Step 3:
Solve the equation F(X) = R for X in terms of R.
For the exponential distribution, the solution proceeds as follows:

$$1 - e^{-\lambda X} = R$$
$$e^{-\lambda X} = 1 - R$$
$$-\lambda X = \ln(1 - R)$$
$$X = -\frac{1}{\lambda}\ln(1 - R)$$

The last equation is called a random-variate generator for the exponential
distribution.
In general, it is written as $X = F^{-1}(R)$.
Generating a sequence of values is accomplished through Step 4.

Step 4:
Generate (as needed) uniform random numbers $R_1, R_2, R_3, \ldots$ and
compute the desired random variates by

$$X_i = F^{-1}(R_i)$$

For the exponential case, $F^{-1}(R) = (-1/\lambda)\ln(1 - R)$ by the generator
derived in Step 3, so

$$X_i = -\frac{1}{\lambda}\ln(1 - R_i)$$

for i = 1, 2, 3, … One simplification that is usually employed is to
replace $1 - R_i$ by $R_i$ to yield

$$X_i = -\frac{1}{\lambda}\ln R_i$$

This alternative is justified by the fact that both $R_i$ and $1 - R_i$ are
uniformly distributed on [0, 1].
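
The four steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name `exponential_variate` is hypothetical, and Python's standard `random` module stands in for the uniform random-number generator. The sketch keeps the form ln(1 - R) rather than ln R, because `random()` can return exactly 0 and ln 0 is undefined.

```python
import math
import random


def exponential_variate(lam, rng=random):
    """Return one exponential variate with rate lam via X = -(1/lam) ln(1 - R)."""
    if lam <= 0:
        raise ValueError("rate lam must be positive")
    r = rng.random()  # R is uniform on [0, 1), so 1 - R lies in (0, 1]
    return -math.log(1.0 - r) / lam


# Generate 200 variates with rate 1 and look at the sample mean,
# which should be near the theoretical mean 1/lam = 1.
rng = random.Random(12345)
sample = [exponential_variate(1.0, rng) for _ in range(200)]
print(sum(sample) / len(sample))
```

Passing a seeded `random.Random` instance makes a run reproducible, which helps when comparing simulation results.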

Example:
Figure: (a) empirical histogram of 200 uniform random numbers;
(b) empirical histogram of 200 exponential variates;
(c) theoretical uniform density on [0, 1];
(d) theoretical exponential density with mean 1.


6.1.2 Uniform Distribution


Consider a random variable X that is uniformly distributed on the interval
[a,b].
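A reasonable guess for generating X is given by
X = a + (b - a)R
The pdf of X is given by

$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$

The derivation of the generator X = a + (b - a)R follows Steps 1 through 3.

Step 1:
The cdf is given by

$$F(x) = \begin{cases} 0, & x < a \\ \dfrac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases}$$

Step 2:
Set F(X) = (X - a) / (b - a) = R.

Step 3:
Solving for X in terms of R yields
X = a + (b - a)R,
which agrees with the generator guessed above.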

6.1.3 Weibull Distribution


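The Weibull distribution was introduced as a model for time to failure for
machines or electronic components.
When the location parameter $\nu$ is set to 0, its pdf is given by

$$f(x) = \begin{cases} \dfrac{\beta}{\alpha^{\beta}} x^{\beta - 1} e^{-(x/\alpha)^{\beta}}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

where $\alpha > 0$ is the scale parameter and $\beta > 0$ is the shape parameter.

Step 1:
The cdf is given by $F(X) = 1 - e^{-(X/\alpha)^{\beta}}$, $X \ge 0$.

Step 2:
Set $F(X) = 1 - e^{-(X/\alpha)^{\beta}} = R$.

Step 3:
Solving for X in terms of R yields

$$X = \alpha[-\ln(1 - R)]^{1/\beta}$$

Comparing this with the exponential generator of Section 6.1.1 shows that,
if X is a Weibull variate, then $X^{\beta}$ is an exponential variate with
mean $\alpha^{\beta}$.
Conversely, if Y is an exponential variate with mean $\mu$, then $Y^{1/\beta}$
is a Weibull variate with shape parameter $\beta$ and scale parameter
$\alpha = \mu^{1/\beta}$.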
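
As a minimal sketch, the Weibull generator can be coded directly from the inverse of the cdf, X = α[-ln(1 - R)]^(1/β), with location parameter 0. The function name `weibull_variate` and the argument names `alpha` (scale) and `beta` (shape) are assumptions made for this illustration.

```python
import math
import random


def weibull_variate(alpha, beta, rng=random):
    """Return one Weibull variate (location 0) with scale alpha and shape beta."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    r = rng.random()  # uniform on [0, 1), so 1 - r is never 0
    return alpha * (-math.log(1.0 - r)) ** (1.0 / beta)


# With beta = 1 the Weibull reduces to an exponential with mean alpha.
rng = random.Random(2024)
print([round(weibull_variate(1.0, 2.0, rng), 4) for _ in range(5)])
```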

6.1.4 Triangular Distribution
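Consider a random variable X that has pdf

$$f(x) = \begin{cases} x, & 0 \le x \le 1 \\ 2 - x, & 1 < x \le 2 \\ 0, & \text{otherwise} \end{cases}$$

This distribution is called a triangular distribution with endpoints (0, 2)
and mode at 1.
Its cdf is given by

$$F(x) = \begin{cases} 0, & x \le 0 \\ \dfrac{x^{2}}{2}, & 0 < x \le 1 \\ 1 - \dfrac{(2 - x)^{2}}{2}, & 1 < x \le 2 \\ 1, & x > 2 \end{cases}$$

Figure: Density function for a particular triangular distribution.

For $0 \le X \le 1$,

$$R = \frac{X^{2}}{2}$$

and for $1 \le X \le 2$,

$$R = 1 - \frac{(2 - X)^{2}}{2}$$

$0 \le X \le 1$ implies that $0 \le R \le 1/2$, in which case $X = \sqrt{2R}$.

$1 \le X \le 2$ implies that $1/2 \le R \le 1$, in which case
$X = 2 - \sqrt{2(1 - R)}$.

Thus, X is generated by

$$X = \begin{cases} \sqrt{2R}, & 0 \le R \le 1/2 \\ 2 - \sqrt{2(1 - R)}, & 1/2 < R \le 1 \end{cases}$$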

Notice that, if the pdf and cdf of the random variable X come in parts
(i.e., require different formulas over different parts of the range of X),
then the application of the inverse-transform technique for generating X
will result in separate formulas over different parts of the range of R.

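The two-branch generator for this triangular distribution, X = sqrt(2R) for 0 ≤ R ≤ 1/2 and X = 2 - sqrt(2(1 - R)) for 1/2 < R ≤ 1, can be sketched as follows. The function name `triangular_variate` is hypothetical, and the sketch covers only the particular case with endpoints (0, 2) and mode 1.

```python
import math
import random


def triangular_variate(rng=random):
    """Triangular variate on (0, 2) with mode 1, by a two-branch inverse transform."""
    r = rng.random()
    if r <= 0.5:
        # Lower branch: invert R = X^2 / 2.
        return math.sqrt(2.0 * r)
    # Upper branch: invert R = 1 - (2 - X)^2 / 2.
    return 2.0 - math.sqrt(2.0 * (1.0 - r))


rng = random.Random(7)
values = [triangular_variate(rng) for _ in range(1000)]
print(min(values), max(values), sum(values) / len(values))
```

The branch is chosen by the value of R, not of X, which is exactly the "separate formulas over different parts of the range of R" noted in the text.
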
6.1.5 Empirical Continuous Distributions
If the modeler has been unable to find a theoretical distribution that
provides a good model for the input data, then it may be necessary to use
the empirical distribution of the data.
One possibility is to simply resample the observed data itself.
This is known as using the empirical distribution and it makes particularly
good sense when the input process is known to take on a finite number of
values.
On the other hand, if the data are drawn from what is believed to be a
continuous­valued input process, then it makes sense to interpolate
between the observed data points to fill in the gaps.
An empirical continuous distribution is a distribution function that
estimates the cumulative distribution function that generated a sample of
data. It's used when a theoretical distribution doesn't provide a good model
for the data.
● If the modeler has been unable to find a theoretical distribution that
provides a good model for the input data, it may be necessary to use
the empirical distribution of the data.
● A typical way of resolving this difficulty is through "curve fitting".
● Steps involved:
○ Collect empirical data and group them accordingly.
○ Tabulate the frequency and cumulative frequency.
○ Treat the cumulative frequency as a function of the empirical data,
i.e., F(x) = r.
○ Establish a relation between x and r using linear interpolation
for each of the intervals.
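
The interpolation steps above might be coded as in the following sketch. Everything named here is an assumption for illustration: the function `empirical_variate`, and the interval endpoints and cumulative frequencies in the usage lines, which are invented data rather than values from the text.

```python
import bisect
import random


def empirical_variate(breakpoints, cum_probs, rng=random):
    """Sample from a piecewise-linear empirical cdf.

    breakpoints: x_0 < x_1 < ... < x_n, the interval endpoints.
    cum_probs:   F(x_0) = 0 <= F(x_1) <= ... <= F(x_n) = 1.
    """
    r = rng.random()
    # Locate the interval i with cum_probs[i] <= r < cum_probs[i + 1].
    i = bisect.bisect_right(cum_probs, r) - 1
    i = min(i, len(breakpoints) - 2)
    x_lo, x_hi = breakpoints[i], breakpoints[i + 1]
    c_lo, c_hi = cum_probs[i], cum_probs[i + 1]
    # Linear interpolation between the two endpoints of the interval.
    return x_lo + (x_hi - x_lo) * (r - c_lo) / (c_hi - c_lo)


# Hypothetical grouped data: four intervals of width 0.5.
points = [0.0, 0.5, 1.0, 1.5, 2.0]
cum = [0.0, 0.31, 0.41, 0.66, 1.0]
rng = random.Random(99)
print([round(empirical_variate(points, cum, rng), 3) for _ in range(5)])
```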

6.1.6 Continuous Distributions without a Closed-Form Inverse:


● A number of useful continuous distributions do not have a closed form
expression for their cdf or its inverse; examples include the normal,
gamma, and beta distributions.
● For this reason, it is often stated that the inverse­transform technique
for random­variate generation is not available for these distributions.

● It can, in effect, become available if we are willing to approximate the
inverse cdf, or numerically integrate and search the cdf.
● Although this approach sounds inaccurate, notice that even a closed­
form inverse requires approximation in order to evaluate it on a
computer.
● For example, generating exponentially distributed random variates via
the inverse cdf $X = F^{-1}(R) = -\ln(1 - R)/\lambda$ requires a numerical
approximation for the logarithm function.
● Thus, there is no essential difference between using an approximate
inverse cdf and approximately evaluating a closed­form inverse.
● The problem with using approximate inverse cdfs is that some of them
are computationally slow to evaluate.
● To illustrate the idea, consider a simple approximation to the inverse
cdf of the standard normal distribution, proposed by Schmeiser [1979]:

$$X = F^{-1}(R) \approx \frac{R^{0.135} - (1 - R)^{0.135}}{0.1975}$$

● This approximation gives at least one-decimal-place accuracy for
$0.0013499 \le R \le 0.9986501$.
● The following table compares the approximation with exact values (to
four decimal places) obtained by numerical integration for several
values of R.

● Much more accurate approximations exist that are only slightly more
complicated.
● A good source of these approximations for a number of distributions
is Bratley, Fox, and Schrage [1996].
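
The approximation usually quoted for Schmeiser [1979] is X ≈ [R^0.135 - (1 - R)^0.135] / 0.1975. The sketch below evaluates it next to the inverse cdf provided by `statistics.NormalDist` in the Python standard library. The function name `schmeiser_inverse_normal` and the choice of R values are assumptions for illustration, and the printed numbers are computed by the code rather than copied from the text's table.

```python
from statistics import NormalDist


def schmeiser_inverse_normal(r):
    """Approximate the standard normal inverse cdf at r, for 0 < r < 1."""
    return (r ** 0.135 - (1.0 - r) ** 0.135) / 0.1975


exact = NormalDist()  # standard normal: mean 0, standard deviation 1
for r in (0.01, 0.10, 0.25, 0.50, 0.75, 0.90, 0.99):
    approx = schmeiser_inverse_normal(r)
    print(r, round(approx, 4), round(exact.inv_cdf(r), 4))
```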

6.1.7 Discrete Distributions

All discrete distributions can be generated via the inverse­transform
technique, either numerically through a table­lookup procedure or, in
some cases, algebraically, the final generation scheme being in terms of a
formula.
Other techniques are sometimes used for certain distributions, such as the
convolution technique for the binomial distribution.
Some of these methods are discussed in later sections.
This subsection gives examples covering both empirical distributions and
two of the standard discrete distributions, the (discrete) uniform and
geometric.
Highly efficient table-lookup procedures for these and other distributions
are found in Bratley, Fox, and Schrage [1996] and in Ripley [1987].
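
A table-lookup generator for a discrete distribution can be sketched as follows. This is an illustration under stated assumptions, not one of the efficient procedures cited above: the helper names `make_table_lookup` and `discrete_uniform_variate` are hypothetical, and the pmf in the usage lines is invented.

```python
import bisect
import itertools
import math
import random


def make_table_lookup(values, probs):
    """Return a sampler for a discrete distribution given its pmf."""
    cdf = list(itertools.accumulate(probs))
    cdf[-1] = 1.0  # guard against floating-point round-off in the last entry

    def sample(rng=random):
        r = rng.random()
        # Index of the first cdf entry that is strictly greater than r.
        return values[bisect.bisect_right(cdf, r)]

    return sample


def discrete_uniform_variate(k, rng=random):
    """Discrete uniform on {1, ..., k}: r in [0, 1) maps to floor(k r) + 1."""
    return min(k, math.floor(k * rng.random()) + 1)


# Hypothetical pmf: P(X=0) = 0.50, P(X=1) = 0.30, P(X=2) = 0.20.
sample_x = make_table_lookup([0, 1, 2], [0.50, 0.30, 0.20])
rng = random.Random(42)
print([sample_x(rng) for _ in range(10)])
print([discrete_uniform_variate(10, rng) for _ in range(10)])
```

The binary search costs O(log n) per variate; for small tables a linear scan is just as reasonable.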

Example:

6.2 ACCEPTANCE REJECTION TECHNIQUES


Suppose that an analyst needed to devise a method for generating random
variates X uniformly distributed between 1/4 and 1.
One way to proceed would be to follow these steps:

Step 1:
Generate a random number R.

Step 2a:
If $R \ge 1/4$, accept X = R, then go to Step 3.

Step 2b:
If R < 1/4, reject R, and return to Step 1.

Step 3:
If another uniform random variate on [1/4, 1] is needed, repeat the
procedure beginning at Step 1.
If not, stop.
● Each time Step 1 is executed, a new random number R must be
generated.
● Step 2a is an “acceptance” and Step 2b is a “rejection” in this
acceptance­rejection technique.
● To summarize the technique, random variates (R) with some distribution
(here uniform on [0, 1]) are generated until some condition ($R \ge 1/4$)
is satisfied.
● When the condition is finally satisfied, the desired random variate, X
(here uniform on [1/4, 1]), can be computed (X = R).
● This procedure can be shown to be correct by recognizing that the
accepted values of R are conditioned values; that is, R itself does not
have the desired distribution, but R conditioned on the event
$\{R \ge 1/4\}$ does have the desired distribution.

● To show this, take $1/4 \le a < b \le 1$; then

$$P(a < R \le b \mid 1/4 \le R \le 1) = \frac{P(a < R \le b)}{P(1/4 \le R \le 1)} = \frac{b - a}{3/4}$$

● which is the correct probability for a uniform distribution on [1/4, 1].

● The equation above says that the probability distribution of R, given
that R is between 1/4 and 1 (all other values of R are thrown out), is
the desired distribution.

● Therefore, if $1/4 \le R \le 1$, set X = R.

● The efficiency of an acceptance-rejection technique depends heavily
on being able to minimize the number of rejections.
● In this example, the probability of a rejection is P(R < 1/4) = 1/4, so
that the number of rejections is a geometrically distributed random
variable with probability of "success" being
p = 3/4 and mean number of rejections (1/p - 1) = 4/3 - 1 = 1/3.
● The mean number of random numbers R required to generate one
variate X is one more than the mean number of rejections; hence, it is
4/3 ≈ 1.33.
● In other words, to generate 1000 values of X would require
approximately 1333 random numbers R.
● In the present situation, an alternative procedure exists for generating a
uniform variate on [1/4, 1], namely the inverse-transform generator
X = a + (b - a)R of Section 6.1.2, which reduces to X = 1/4 + (3/4)R.

● Whether the acceptance-rejection technique or an alternative
procedure, such as the inverse-transform technique, is more efficient
depends on several considerations.
● The computer being used, the skills of the programmer and the relative
inefficiency of generating the additional (rejected) random numbers
needed by acceptance­rejection should be compared to the
computations required by the alternative procedure.
● In practice, concern with generation efficiency is left to specialists
who conduct extensive tests comparing alternative methods (i.e., until
a simulation model begins to require excessive computer runtime due
to the generator being used).
● For the uniform distribution on [1/4, 1], the inverse-transform
technique, X = 1/4 + (3/4)R, is undoubtedly much easier to apply and
more efficient than the acceptance-rejection technique.
● The main purpose of this example was to explain and motivate the
basic concept of the acceptance-rejection technique.
● However, for some important distributions, such as the normal, gamma,
and beta, the inverse cdf does not exist in closed form, and therefore
the inverse-transform technique is difficult to apply.
● More advanced techniques for these distributions are summarized by
Bratley, Fox, and Schrage [1996], Fishman [1978], and Law and Kelton
[2000].
● In the following subsections, the acceptance­rejection technique is
illustrated for the generation of random variates for the Poisson,
nonstationary Poisson, and gamma distributions.
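
The introductory example (uniform variates on [1/4, 1] by rejection) can be sketched in Python, counting how many random numbers R each variate consumes. The function name `uniform_quarter_to_one` is hypothetical, and the standard `random` module stands in for the random-number generator.

```python
import random


def uniform_quarter_to_one(rng=random):
    """Return (X, draws): X uniform on [1/4, 1] and the count of R values used."""
    draws = 0
    while True:
        r = rng.random()   # Step 1: generate a random number R
        draws += 1
        if r >= 0.25:      # Step 2a: accept X = R
            return r, draws
        # Step 2b: R < 1/4 is rejected, so loop back to Step 1


rng = random.Random(2023)
results = [uniform_quarter_to_one(rng) for _ in range(1000)]
total_draws = sum(draws for _, draws in results)
# The text's analysis predicts roughly 1333 random numbers for 1000 variates.
print(total_draws)
```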

6.2.1 Poisson Distribution:


● A Poisson random variable, N, with mean $\alpha > 0$ has pmf

$$p(n) = P(N = n) = \frac{e^{-\alpha}\alpha^{n}}{n!}, \quad n = 0, 1, 2, \ldots$$

● More important, however, is that N can be interpreted as the number of
arrivals from a Poisson arrival process in one unit of time.
● Interarrival times, $A_1, A_2, \ldots$, of successive customers are
exponentially distributed with rate $\alpha$; in addition, an exponential
variate can be generated by the inverse-transform generator of Section
6.1.1.
● Thus there is a relationship between the discrete Poisson distribution
and the continuous exponential distribution: N = n if and only if

$$A_1 + A_2 + \cdots + A_n \le 1 < A_1 + \cdots + A_n + A_{n+1}$$

● The relation N = n says there were exactly n arrivals during one unit of
time.
● The inequality says that the nth arrival occurred before time 1 while
the (n + 1)st arrival occurred after time 1.
● Both statements are equivalent.
● For efficient generation, the inequality is usually simplified by first
using $A_i = (-1/\alpha)\ln R_i$ to obtain

$$\sum_{i=1}^{n} -\frac{1}{\alpha}\ln R_i \le 1 < \sum_{i=1}^{n+1} -\frac{1}{\alpha}\ln R_i$$

● Next multiply through by $-\alpha$, which reverses the sign of the
inequality, and use the fact that a sum of logarithms is the logarithm of
a product, to get

$$\ln\prod_{i=1}^{n} R_i = \sum_{i=1}^{n}\ln R_i \ge -\alpha > \sum_{i=1}^{n+1}\ln R_i = \ln\prod_{i=1}^{n+1} R_i$$

● Finally, use the relation $e^{\ln x} = x$ for any positive number x to obtain

$$\prod_{i=1}^{n} R_i \ge e^{-\alpha} > \prod_{i=1}^{n+1} R_i$$

which is equivalent to the first inequality.


The procedure for generating a Poisson random variate, N, is given
by the following steps:
Step 1:
Set n = 0, P = 1.

Step 2:
Generate a random number $R_{n+1}$, and replace P by $P \cdot R_{n+1}$.

Step 3:
If $P < e^{-\alpha}$, then accept N = n.
Otherwise, reject the current n, increase n by one, and return to Step 2.


● Notice that, upon completion of Step 2, P is equal to the rightmost
expression in the product relation
$\prod_{i=1}^{n} R_i \ge e^{-\alpha} > \prod_{i=1}^{n+1} R_i$.
● The basic idea of a rejection technique is again exhibited; if
$P \ge e^{-\alpha}$ in Step 3, then n is rejected and the generation process
must proceed through at least one more trial.
● How many random numbers will be required, on the average, to
generate one Poisson variate, N?
● If N = n, then n + 1 random numbers are required, so the average
number is given by

$$E(N + 1) = \alpha + 1$$

● which is quite large if the mean, $\alpha$, of the Poisson distribution is large.
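
The three-step Poisson procedure can be sketched as below, writing `alpha` for the Poisson mean. The function name `poisson_variate` is hypothetical. Since each variate needs about alpha + 1 uniform numbers, and exp(-alpha) underflows for very large alpha, this sketch is meant only for small to moderate means.

```python
import math
import random


def poisson_variate(alpha, rng=random):
    """Return one Poisson variate with mean alpha by multiplying uniforms."""
    if alpha < 0:
        raise ValueError("mean alpha must be nonnegative")
    threshold = math.exp(-alpha)
    n = 0                      # Step 1: n = 0, P = 1
    p = 1.0
    while True:
        p *= rng.random()      # Step 2: replace P by P * R_{n+1}
        if p < threshold:      # Step 3: accept N = n
            return n
        n += 1                 # otherwise reject n and return to Step 2


rng = random.Random(314)
counts = [poisson_variate(2.0, rng) for _ in range(1000)]
print(sum(counts) / len(counts))
```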

Example:

6.2.2 Nonstationary Poisson Process


Another type of acceptance-rejection method (which is also called
"thinning") can be used to generate interarrival times from a
nonstationary Poisson process (NSPP) with arrival rate $\lambda(t)$,
$0 \le t \le T$.

An NSPP is an arrival process with an arrival rate that varies with time.
Consider, for instance, the arrival-rate function given in the following
table, which changes every hour.

The idea behind thinning is to generate a stationary Poisson arrival process
at the fastest rate (1/5 customer per minute in the example), but "accept"
or admit only a portion of the arrivals, thinning out just enough to get the
desired time-varying rate.
Next we give the generic algorithm, which generates Ti as the time of the
ith arrival.

Step 1:
Let $\lambda^{*} = \max_{0 \le t \le T} \lambda(t)$ be the maximum of the arrival
rate function, and set t = 0 and i = 1.

Step 2:
Generate E from the exponential distribution with rate $\lambda^{*}$ and let
t = t + E (this is the arrival time of the stationary Poisson process).

Step 3:
Generate a random number R from the U(0, 1) distribution.
If $R \le \lambda(t)/\lambda^{*}$, then $T_i = t$ and i = i + 1.

Step 4:
Go to Step 2.

The thinning algorithm can be inefficient if there are large differences
between the typical and the maximum arrival rate.
However, thinning has the advantage that it works for any integrable
arrival rate function, not just a piecewise-constant function as in this
example.
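
A sketch of the thinning steps in Python follows. Several details are assumptions made for illustration: the function names, the explicit stopping rule (generation ends once t passes the horizon T, which the steps above leave implicit), and the hourly rates in `hourly_rate`, which are invented and only chosen so that the maximum rate is 1/5 per minute.

```python
import math
import random


def thinning_arrivals(rate, rate_max, horizon, rng=random):
    """Return NSPP arrival times on [0, horizon] generated by thinning.

    rate:     function t -> lambda(t), with 0 <= rate(t) <= rate_max.
    rate_max: lambda*, an upper bound on the arrival rate.
    """
    arrivals = []
    t = 0.0
    while True:
        # Step 2: next arrival of the stationary process with rate lambda*.
        t += -math.log(1.0 - rng.random()) / rate_max
        if t > horizon:
            return arrivals
        # Step 3: accept this arrival with probability lambda(t) / lambda*.
        if rng.random() <= rate(t) / rate_max:
            arrivals.append(t)


def hourly_rate(t):
    """Hypothetical piecewise-constant rate (arrivals per minute), t in minutes."""
    rates = [1 / 15, 1 / 12, 1 / 7, 1 / 5, 1 / 8, 1 / 10]
    return rates[min(int(t // 60), len(rates) - 1)]


arrival_times = thinning_arrivals(hourly_rate, 1 / 5, 360.0, random.Random(2024))
print(len(arrival_times))
```

If interarrival times rather than arrival times are needed, they are the successive differences of the returned list.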

Example:


6.2.3 Gamma Distribution


● Several acceptance-rejection techniques for generating gamma random
variates have been developed. (See Bratley, Fox, and Schrage [1996];
Fishman [1978]; and Law and Kelton [2000].)
● One of the more efficient is by Cheng [1977]; the mean number of
trials is between 1.13 and 1.47 for any value of the shape parameter
$\beta \ge 1$.
● If the shape parameter $\beta$ is an integer, say $\beta = k$, one possibility
is to use the convolution technique, because the Erlang distribution is a
special case of the more general gamma distribution.
● On the other hand, the acceptance-rejection technique described here
would be a highly efficient method for the Erlang distribution,
especially if $\beta = k$ were large.
● The routine generates gamma random variates with scale parameter
$\theta$ and shape parameter $\beta$, that is, with mean $1/\theta$ and variance
$1/(\beta\theta^{2})$.

The steps are as follows:


Step 1:
Compute $a = 1/(2\beta - 1)^{1/2}$, $b = \beta - \ln 4$.

Step 2:
Generate $R_1$ and $R_2$.
Set $V = R_1/(1 - R_1)$.

Step 3:
Compute $X = \beta V^{a}$.

Step 4a:
If $X > b + (\beta a + 1)\ln V - \ln(R_1^{2} R_2)$, reject X and return to Step 2.

Step 4b:
If $X \le b + (\beta a + 1)\ln V - \ln(R_1^{2} R_2)$, use X as the desired
variate.

The generated variates from Step 4b will have mean and variance both
equal to $\beta$.
If it is desired to have mean $1/\theta$ and variance $1/(\beta\theta^{2})$, then
include Step 5.
(Step 5. Replace X by $X/(\beta\theta)$.)
The basic idea of all acceptance-rejection methods is again illustrated
here, but the proof of this example is beyond the scope of this book.

In Step 3, $X = \beta V^{a} = \beta[R_1/(1 - R_1)]^{a}$ is not gamma distributed,
but rejection of certain values of X in Step 4a guarantees that the accepted
values in Step 4b do have the gamma distribution.
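
The gamma routine can be sketched as below. The sketch assumes the usual statement of Cheng's steps: a = 1/(2β - 1)^(1/2), b = β - ln 4, V = R1/(1 - R1), X = βV^a, with X accepted when X ≤ b + (βa + 1) ln V - ln(R1² R2). The function name `gamma_variate`, the optional `theta` argument, and the guards against taking the logarithm of zero are additions made for this illustration.

```python
import math
import random


def gamma_variate(beta, theta=None, rng=random):
    """Gamma variate with shape beta >= 1 by acceptance-rejection.

    Without theta the result has mean and variance beta; with theta it is
    rescaled (Step 5) to have mean 1/theta.
    """
    if beta < 1:
        raise ValueError("this sketch assumes shape parameter beta >= 1")
    a = 1.0 / math.sqrt(2.0 * beta - 1.0)   # Step 1
    b = beta - math.log(4.0)
    while True:
        r1 = rng.random()                   # Step 2
        r2 = 1.0 - rng.random()             # in (0, 1], so log(r2) is defined
        if r1 == 0.0:
            continue                        # V = 0 would make log(V) undefined
        v = r1 / (1.0 - r1)
        x = beta * v ** a                   # Step 3
        bound = b + (beta * a + 1.0) * math.log(v) - math.log(r1 * r1 * r2)
        if x <= bound:
            break                           # Step 4b: accept X
        # Step 4a: otherwise reject X and return to Step 2
    if theta is not None:
        x /= beta * theta                   # Step 5
    return x


rng = random.Random(1977)
xs = [gamma_variate(2.5, rng=rng) for _ in range(1000)]
print(sum(xs) / len(xs))
```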
Example:

6.3 CONVOLUTION METHOD

● The probability distribution of a sum of two or more independent
random variables is called a convolution of the distributions of the
original variables.
● The convolution method thus refers to adding together two or more
random variables to obtain a new random variable with the desired
distribution.
● This technique can be applied to obtain Erlang variates and binomial
variates.
● What is important is not the cdf of the desired random variable, but
rather its relation to other variates more easily generated.
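
As a minimal sketch of the convolution method, an Erlang variate can be formed as a sum of independent exponential variates and a binomial variate as a sum of independent Bernoulli trials. The function names and the parameterisation (k stages, each exponential with the same `rate`) are assumptions for this illustration; other texts parameterise the Erlang distribution differently.

```python
import math
import random


def erlang_variate(k, rate, rng=random):
    """Sum of k independent exponential variates, each with the given rate."""
    product = 1.0
    for _ in range(k):
        product *= 1.0 - rng.random()   # uniform on (0, 1], so the log is defined
    # The sum of -ln(R_i)/rate equals -ln(product of the R_i)/rate.
    return -math.log(product) / rate


def binomial_variate(n, p, rng=random):
    """Sum of n independent Bernoulli(p) indicators."""
    return sum(1 for _ in range(n) if rng.random() < p)


rng = random.Random(606)
print(erlang_variate(3, 2.0, rng), binomial_variate(10, 0.3, rng))
```

Taking one logarithm of the product instead of k separate logarithms is a common saving, and it relies on the same identity used in the Poisson derivation.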
6.4 SUMMARY
● The basic principles of random­variate generation via the inverse­
transform technique, the acceptance­rejection technique, and special
properties have been introduced and illustrated by examples.
● Methods for generating many of the important continuous and discrete
distributions, plus all empirical distributions, have been given.
● See Schmeiser [1980] for an excellent survey; for a state-of-the-art
treatment, the reader is referred to Devroye [1986] or Dagpunar
[1988].
6.5 EXERCISE
Answer the following:
1. Develop a random­variate generator for X with pdf

2. Develop a generation scheme for the triangular distribution with pdf

Generate 10 values of the random variate, compute the sample mean, and
compare it to the true mean of the distribution.
3. Develop the triangular random-variate generator with range (0, 12) and
mode 5.
4. Generate 10 values from a beta distribution on the interval [0, 1] with
parameters $\beta_1 = 1.47$ and $\beta_2 = 2.16$.
5. Next transform them to be on the interval [-10, 20].

7
INPUT MODELING
Unit Structure :
7.0 Objective
7.1 Data Collection
7.2 Identifying the Distribution of data
7.2.1 Histograms
7.2.2 Selecting the Family of Distributions
7.2.3 Quantile­Quantile Plots
7.3 Parameter estimation
7.3.1 Preliminary Statistics: Sample Mean and Sample Variance
7.3.2 Suggested Estimators
7.4 Goodness of fit tests
7.4.1 Chi­Square Test
7.4.2 Chi­Square Test with Equal Probabilities
7.4.3 Kolmogorov-Smirnov Goodness-of-Fit Test
7.4.4 p ­ Values and "Best Fits"
7.5 Selecting input models without data
7.6 Multivariate and Time series input models
7.6.1 Covariance and Correlation
7.6.2 Multivariate Input Models
7.6.3 Time Series Input Models
7.7 Summary
7.8 Exercise

7.0 OBJECTIVE
The objective of this chapter is:
● To discuss methods for selecting families of input distributions when
data are available.
● To discuss how a specific distribution within a family is specified by
estimating its parameters.
● To take up the case in which data is unavailable.
● To estimate the parameters of the distribution.

Input models provide the driving force for a simulation model. In the
simulation of a queueing system, typical input models are the distributions
of time between arrivals and of service times.
For an inventory­system simulation, input models include the distributions
of demand and of lead time.
For the simulation of a reliability system, the distribution of time to failure
of a component is an example of an input model.
In real­world simulation applications, however, coming up with
appropriate distributions for input data is a major task from the standpoint
of time and resource requirements.
There are four steps in the development of a useful model of input
data:
1. Collect data from the real system of interest.
● This often requires a substantial time and resource commitment.
● Unfortunately, in some situations it is not possible to collect data (for
example, when time is extremely limited, when the input process does
not yet exist, or when laws or rules prohibit the collection of data).
● When data are not available, expert opinion and knowledge of the
process must be used to make educated guesses.
2. Identify a probability distribution to represent the input process.
● When data is available, this step typically begins with the development
of a frequency distribution, or histogram, of the data.
● Given the frequency distribution and a structural knowledge of the
process, a family of distributions is chosen.
● Fortunately, as was described in Chapter 5, several well­known
distributions often provide good approximations in practice.

3. Choose parameters that determine a specific instance of the
distribution family.
● When data are available, these parameters may be estimated from the
data.
4. Evaluate the chosen distribution and the associated parameters for
goodness of fit.
● Goodness of fit may be evaluated informally, via graphical methods,
or formally, via statistical tests.
● The chi-square and the Kolmogorov-Smirnov tests are standard
goodness-of-fit tests.
● If not satisfied that the chosen distribution is a good approximation of
the data, then the analyst returns to the second step, chooses a different
family of distributions, and repeats the procedure.
● If several iterations of this procedure fail to yield a fit between an
assumed distributional form and the collected data, the empirical form
of the distribution may be used.

7.1 DATA COLLECTION


● Problems are found at the end of each chapter, as exercises for the
reader, in textbooks about mathematics, physics, chemistry, and other
technical subjects.
● Years and years of working through these problems could give the
reader the impression that data is readily available.
● Nothing could be further from the truth.
● Data collection is one of the biggest tasks in solving a real problem.
● It is one of the most important and difficult problems in simulation.
● And, even when data are available, they have rarely been recorded in a
form that is directly useful for simulation input modeling.
● "GIGO," or "garbage­in­garbage­out," is a basic concept in computer
science, and it applies equally in the area of discrete­system
simulation.
● Even when the model structure is valid, if the input data are
inaccurately collected, inappropriately analyzed, or not representative
of the environment, the simulation output data will be misleading and
possibly damaging or costly when used for policy or decision making.

7.2 IDENTIFYING THE DISTRIBUTION OF DATA


7.2.1 Histograms
A frequency distribution or histogram is useful in identifying the shape of
a distribution.
A histogram is constructed as follows:
1. Divide the range of the data into intervals. (Intervals are usually of
equal width; however, unequal widths may be used if the heights of the
frequencies are adjusted.)
2. Label the horizontal axis to conform to the intervals selected.
3. Find the frequency of occurrences within each interval.
4. Label the vertical axis so that the total occurrences can be plotted for
each interval.
5. Plot the frequencies on the vertical axis.

● The number of class intervals depends on the number of observations
and on the amount of scatter or dispersion in the data.
● Hines, Montgomery, Goldsman, and Borror [2002] state that
choosing the number of class intervals approximately equal to the
square root of the sample size often works well in practice.
● If the intervals are too wide, the histogram will be coarse, or blocky,
and its shape and other details will not show well.
● If the intervals are too narrow, the histogram will be ragged and will
not smooth the data.
● Examples of ragged, coarse, and appropriate histograms of the same
data are shown in the following Figure. Modern data-analysis software
often allows the interval sizes to be changed easily and interactively
until a good choice is found.

● The histogram for continuous data corresponds to the probability
density function of a theoretical distribution.
● If continuous, a line drawn through the center point of each class
interval frequency should result in a shape like that of a pdf.
● Histograms for discrete data, where there are a large number of data
points, should have a cell for each value in the range of the data.
● However, if there are few data points, it could be necessary to combine
adjacent cells to eliminate the ragged appearance of the histogram.
● If the histogram is associated with discrete data, it should look like a
probability mass function.
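
The construction steps and the square-root rule of thumb for the number of class intervals can be sketched as follows. The function name `histogram_counts` and the use of equal-width intervals spanning the observed range are assumptions for this illustration; in practice the interval count would be adjusted interactively, as noted above.

```python
import math


def histogram_counts(data, num_intervals=None):
    """Return (edges, counts) for equal-width class intervals."""
    n = len(data)
    if num_intervals is None:
        # Rule of thumb: about the square root of the sample size.
        num_intervals = max(1, round(math.sqrt(n)))
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_intervals or 1.0  # fall back if all values are equal
    counts = [0] * num_intervals
    for x in data:
        # The largest observation is placed in the last interval.
        i = min(int((x - lo) / width), num_intervals - 1)
        counts[i] += 1
    edges = [lo + j * width for j in range(num_intervals + 1)]
    return edges, counts


edges, counts = histogram_counts([float(v) for v in range(100)])
print(len(counts), sum(counts))
```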

Example: Discrete Data


The number of vehicles arriving at the northwest corner of an intersection
in a 5-minute period between 7:00 A.M. and 7:05 A.M. was monitored for
five workdays over a 20-week period.
The following table shows the resulting data.

The first entry in the table indicates that there were 12 five-minute
periods during which zero vehicles arrived, 10 periods during which one
vehicle arrived, and so on.
The number of automobiles is a discrete variable, and there are ample
data, so the histogram may have a cell for each possible value in the range
of the data.
The resulting histogram is shown in the following figure:

7.2.2 Selecting the Family of Distributions

● The purpose of preparing a histogram is to infer a known pdf or pmf.
● A family of distributions is selected on the basis of what might arise in
the context being investigated along with the shape of the histogram.
● Thus, if interarrival-time data have been collected, and the histogram
has a shape similar to the exponential pdf, the assumption of an
exponential distribution would be warranted.
● Similarly, if measurements of the weights of pallets of freight are
being made, and the histogram appears symmetric about the mean, the
assumption of a normal distribution would be warranted.
● The exponential, normal, and Poisson distributions are frequently
encountered and are not difficult to analyze from a computational
standpoint.
● Although more difficult to analyze, the beta, gamma, and Weibull
distributions provide a wide array of shapes and should not be
overlooked during modeling of an underlying probabilistic process.
● Perhaps an exponential distribution was assumed, but it was found not
to fit the data.
● The next step would be to examine where the lack of fit occurred.
● If the lack of fit was in one of the tails of the distribution, perhaps a
gamma or Weibull distribution would fit the data more adequately.
● There are literally hundreds of probability distributions that have been
created; many were created with some specific physical process in
mind.
● One aid to selecting distributions is to use the physical basis of the
distributions as a guide.

● Here are some examples:


○ Binomial:
■ Models the number of successes in n trials, when the trials are
independent with common success probability, p; for example, the
number of defective computer chips found in a lot of n chips.

○ Negative Binomial (includes the geometric distribution):


■ Models the number of trials required to achieve k successes; for
example, the number of computer chips that we must inspect to
find 4 defective chips.

○ Poisson:
■ Models the number of independent events that occur in a fixed
amount of time or space; for example, the number of customers
that arrive at a store during 1 hour, or the number of defects found
in 30 square meters of sheet metal.
○ Normal:
■ Models the distribution of a process that can be thought of as the
sum of a number of component processes; for example, a time to
assemble a product that is the sum of the times required for each
assembly operation.
■ Notice that the normal distribution admits negative values, which
could be impossible for process times.
○ Lognormal:
■ Models the distribution of a process that can be thought of as the
product of (meaning to multiply together) a number of component
processes; for example, the rate of return on an investment, when
interest is compounded, is the product of the returns for a number
of periods.
○ Exponential:
■ Models the time between independent events, or a process time
that is memoryless (knowing how much time has passed gives no
information about how much additional time will pass before the
process is complete); for example, the times between the arrivals
from a large population of potential customers who act
independently of each other.
■ The exponential is a highly variable distribution; it is sometimes
overused, because it often leads to mathematically tractable
models.
■ Recall that, if the time between events is exponentially distributed,
then the number of events in a fixed period of time is Poisson.
○ Gamma:
■ An extremely flexible distribution used to model nonnegative
random variables.
■ The gamma can be shifted away from 0 by adding a constant.
○ Beta:
■ An extremely flexible distribution used to model bounded (fixed
upper and lower limits) random variables.
■ The beta can be shifted away from 0 by adding a constant and can
be given a range larger than [0, 1] by multiplying by a constant.
○ Erlang:
■ Models processes that can be viewed as the sum of several
exponentially distributed processes.
■ For example, a computer network fails when a computer and two
backup computers fail, and each has a time to failure that is
exponentially distributed.
■ The Erlang is a special case of the gamma.

○ Weibull:
■ Models the time to failure for components; for example, the time
to failure for a disk drive.
■ The exponential is a special case of the Weibull.

○ Discrete or Continuous Uniform:


■ Models complete uncertainty: All outcomes are equally likely.
■ This distribution often is used inappropriately, when there is no
data.

○ Triangular:
■ Models a process for which only the minimum, most likely, and
maximum values of the distribution are known; for example, the
minimum, most likely, and maximum time required to test a product.
■ This model is often a marked improvement over a uniform
distribution.

○ Empirical:
■ Resamples from the actual data collected; often used when no
theoretical distribution seems appropriate.
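
The link between the exponential and Poisson entries above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function name `count_events`, the rate of 3 events per time unit, the number of replications, and the seed are all illustrative assumptions. It draws exponential gaps between events and counts how many events fall in one time unit; if the gaps are exponential, the counts should behave like a Poisson variable, whose mean and variance both equal the rate.

```python
import random

def count_events(rate, horizon, rng):
    """Count events in [0, horizon] when gaps between events are exponential(rate)."""
    t, n = 0.0, 0
    while True:
        t += rng.expovariate(rate)   # time of the next event
        if t > horizon:
            return n
        n += 1

rng = random.Random(42)              # fixed seed (assumption) for repeatability
rate, horizon = 3.0, 1.0             # hypothetical: 3 events per time unit
counts = [count_events(rate, horizon, rng) for _ in range(20000)]

count_mean = sum(counts) / len(counts)
count_var = sum((c - count_mean) ** 2 for c in counts) / (len(counts) - 1)
print(f"mean of counts     = {count_mean:.3f}")
print(f"variance of counts = {count_var:.3f}")
```

Both printed values should be close to the rate, which is the signature of a Poisson count.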

7.2.3 Quantile-Quantile Plots


● The construction of histograms and the recognition of a distributional
shape, as discussed earlier, are necessary ingredients for selecting a
family of distributions to represent a sample of data.
● However, a histogram is not as useful for evaluating the fit of the
chosen distribution.
● When there is a small number of data points, say 30 or fewer, a
histogram can be rather ragged.
● Further, our perception of the fit depends on the widths of the
histogram intervals.

● But, even if the intervals are chosen well, grouping data into cells
makes it difficult to compare a histogram to a continuous probability
density function.
● A quantile-quantile (q-q) plot is a useful tool for evaluating
distribution fit, one that does not suffer from these problems.
● If X is a random variable with cdf F, then the q-quantile of X is that
value $\gamma$ such that

$F(\gamma) = P(X \le \gamma) = q$, for $0 < q < 1$.

● When F has an inverse, we write $\gamma = F^{-1}(q)$.

● Now let $\{x_i, i = 1, 2, \ldots, n\}$ be a sample of data from X.
● Order the observations from the smallest to the largest and denote these
as $\{y_j, j = 1, 2, \ldots, n\}$, where $y_1 \le y_2 \le \cdots \le y_n$.
● Let j denote the ranking or order number.
● Therefore, j = 1 for the smallest and j = n for the largest.
● The q-q plot is based on the fact that $y_j$ is an estimate of the
$(j - 1/2)/n$ quantile of X.
● In other words, $y_j$ is approximately $F^{-1}\left(\frac{j - 1/2}{n}\right)$.

● Now suppose that we have chosen a distribution with cdf F as a
possible representation of the distribution of X.
● If F is a member of an appropriate family of distributions, then a plot
of $y_j$ versus $F^{-1}((j - 1/2)/n)$ will be approximately a straight line.
● If F is from an appropriate family of distributions and also has
appropriate parameter values, then the line will have slope 1.
● On the other hand, if the assumed distribution is inappropriate, the
points will deviate from a straight line, usually in a systematic manner.
● The decision about whether to reject some hypothesized model is
subjective.
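
The construction above can be sketched in a few lines of Python. This is an illustrative sketch only: the sample values are hypothetical, the helper name `qq_points` is an assumption, and a normal distribution fitted by the sample mean and standard deviation is used as the candidate F. Each pair is $(F^{-1}((j - 1/2)/n), y_j)$; plotting the pairs and judging straightness is left to the analyst.

```python
from statistics import NormalDist, mean, stdev

def qq_points(data, inv_cdf):
    """Return (F^-1((j - 1/2)/n), y_j) pairs for j = 1..n, with y_j the ordered data."""
    y = sorted(data)
    n = len(y)
    return [(inv_cdf((j - 0.5) / n), y[j - 1]) for j in range(1, n + 1)]

# Hypothetical sample (not from the notes)
sample = [99.8, 100.2, 99.6, 100.4, 100.1, 99.9,
          100.3, 99.7, 100.0, 100.5, 99.5, 100.2]

# Candidate distribution: normal with parameters estimated from the sample
fitted = NormalDist(mean(sample), stdev(sample))
points = qq_points(sample, fitted.inv_cdf)

for x, y in points:
    print(f"{x:9.3f} {y:9.3f}")   # theoretical quantile, observed value
```

If the candidate family and its parameters are appropriate, the two printed columns should track each other closely, that is, fall near a line of slope 1.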

Example: (worked q-q plot example; figure not recovered)

7.3 PARAMETER ESTIMATION


Estimators for many useful distributions are described in this section. In
addition, many software packages, some of them integrated into
simulation languages, are now available to compute these estimates.

7.3.1 Preliminary Statistics: Sample Mean and Sample Variance


● In a number of instances, the sample mean, or the sample mean and
sample variance, are used to estimate the parameters of a hypothesized
distribution.
● In the following paragraphs, three sets of equations are given for
computing the sample mean and sample variance.
● Equations (7.1) and (7.2) can be used when discrete or continuous raw
data are available.
● Equations (7.3) and (7.4) are used when the data are discrete and have
been grouped in a frequency distribution.
● Equations (7.5) and (7.6) are used when the data are discrete or
continuous and have been placed in class intervals.
● Equations (7.5) and (7.6) are approximations and should be used only
when the raw data are unavailable.
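
The three cases can be sketched in Python as follows. The formulas used are the standard textbook forms and are assumed here to correspond to Equations (7.1) through (7.6): the mean is a (frequency-weighted) average, and the variance is the (frequency-weighted) sum of squares minus $n\bar{X}^2$, divided by $n - 1$. For class intervals the class midpoints stand in for the values, which is why that case is only an approximation. Function names and data are illustrative.

```python
def raw_stats(xs):
    """Sample mean and variance from raw data."""
    n = len(xs)
    mean = sum(xs) / n
    var = (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)
    return mean, var

def grouped_stats(values, freqs):
    """Sample mean and variance from (value, frequency) pairs.

    For discrete data, values are the distinct observed values.
    For class intervals, values are the class midpoints (approximation).
    """
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    var = (sum(f * x * x for x, f in zip(values, freqs)) - n * mean ** 2) / (n - 1)
    return mean, var

# Hypothetical discrete data, once as raw values and once as a frequency table
raw = [0, 1, 1, 2, 2, 2, 3]
mean_raw, var_raw = raw_stats(raw)
mean_grp, var_grp = grouped_stats([0, 1, 2, 3], [1, 2, 3, 1])
print(mean_raw, var_raw)
print(mean_grp, var_grp)

# Hypothetical class intervals [0,3), [3,6), [6,9) with midpoints 1.5, 4.5, 7.5
mean_cls, var_cls = grouped_stats([1.5, 4.5, 7.5], [5, 10, 5])
print(mean_cls, var_cls)
```

The raw and grouped results agree for discrete data because the frequency table loses no information; the class-interval result is exact only if every observation sits at its class midpoint.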


Example: Grouped Data


Example: Continuous Data in Class Intervals


7.3.2 Suggested Estimators


● Numerical estimates of the distribution parameters are needed to
reduce the family of distributions to a specific distribution and to test
the resulting hypothesis.
● The following table contains suggested estimators for distributions often
used in simulation.
● Except for an adjustment to remove bias in the estimate of $\sigma^2$ for the
normal distribution, these estimators are the maximum-likelihood
estimators based on the raw data.
● If the data are in class intervals, these estimators must be modified.
● The reader is referred to Fishman [1973] and Law and Kelton [2000]
for parameter estimates for the uniform, binomial, and negative
binomial distributions.
● The triangular distribution is usually employed when no data are
available, with the parameters obtained from educated guesses for the
minimum, most likely, and maximum possible values; the uniform
distribution may also be used in this way if only minimum and
maximum values are available.
● Examples of the use of the estimators are given in the following
paragraphs.
● The reader should keep in mind that a parameter is an unknown
constant, but the estimator is a statistic (or random variable), because it
depends on the sample values.
● To distinguish the two clearly here, if, say, a parameter is denoted by
$\alpha$, the estimator will be denoted by $\hat{\alpha}$.
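
As a small illustration of estimators that are functions of the sample mean and sample variance, the sketch below uses the standard results for three distributions: the Poisson mean is estimated by $\bar{X}$, the exponential rate by $1/\bar{X}$, and the normal parameters by $\bar{X}$ and the bias-corrected $S^2$. The function names and dictionary keys are assumptions made for this sketch, and the data are hypothetical.

```python
from statistics import mean, variance

def fit_poisson(xs):
    """Estimate the Poisson mean by the sample mean."""
    return {"alpha": mean(xs)}

def fit_exponential(xs):
    """Estimate the exponential rate by 1 / (sample mean)."""
    return {"lam": 1.0 / mean(xs)}

def fit_normal(xs):
    """Estimate mu by the sample mean and sigma^2 by the unbiased sample variance."""
    return {"mu": mean(xs), "sigma2": variance(xs)}

# Hypothetical data
arrivals_per_hour = [2, 4, 3, 5, 1]
service_times = [1.0, 2.0, 3.0]
assembly_times = [9.5, 10.5, 10.0, 10.2, 9.8]

print(fit_poisson(arrivals_per_hour))
print(fit_exponential(service_times))
print(fit_normal(assembly_times))
```

Each returned value is a statistic: rerunning the study with a new sample would give a different estimate of the same unknown parameter.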

7.4 GOODNESS-OF-FIT TESTS


● Goodness-of-fit tests provide helpful guidance for evaluating the
suitability of a potential input model; however, there is no single
correct distribution in a real application, so you should not be a slave
to the verdict of such a test.
● It is especially important to understand the effect of sample size.
● If very little data are available, then a goodness-of-fit test is unlikely to
reject any candidate distribution; but if a lot of data are available, then
a goodness-of-fit test will likely reject all candidate distributions.
● Therefore, failing to reject a candidate distribution should be taken as
one piece of evidence in favor of that choice, and rejecting an input
model as only one piece of evidence against the choice.

7.4.1 Chi-Square Test


● One procedure for testing the hypothesis that a random sample of size
n of the random variable X follows a specific distributional form is the
chi-square goodness-of-fit test.
● This test formalizes the intuitive idea of comparing the histogram of
the data to the shape of the candidate density or mass function.
● The test is valid for large sample sizes and for both discrete and
continuous distributional assumptions when parameters are estimated
by maximum likelihood.
● The test procedure begins by arranging the n observations into a set of
k class intervals or cells.
● The test statistic is given by

$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$

● where $O_i$ is the observed frequency in the ith class interval and $E_i$ is
the expected frequency in that class interval.
● The expected frequency for each class interval is computed as $E_i = n p_i$,
where $p_i$ is the theoretical, hypothesized probability associated with
the ith class interval.
● It can be shown that $\chi_0^2$ approximately follows the chi-square
distribution with $k - s - 1$ degrees of freedom, where s represents the
number of parameters of the hypothesized distribution estimated by
the sample statistics.

● The hypotheses are the following:


○ H0: The random variable, X, conforms to the distributional assumption
with the parameter(s) given by the parameter estimate(s).
○ H1: The random variable X does not conform.
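
The procedure can be sketched in Python for a Poisson hypothesis. Everything numeric here is hypothetical: the observed counts, the cell layout (0, 1, 2, 3, and "4 or more"), and the estimate `alpha_hat = 2.0`, which is assumed to have been computed from the raw data. The critical value 7.81 is the usual table value of the chi-square distribution with 3 degrees of freedom at the 0.05 level.

```python
import math

def poisson_pmf(x, alpha):
    """P(X = x) for a Poisson variable with mean alpha."""
    return math.exp(-alpha) * alpha ** x / math.factorial(x)

def chi_square_statistic(observed, expected):
    """Sum over cells of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [12, 28, 30, 18, 12]      # hypothetical counts for cells 0, 1, 2, 3, >= 4
n = sum(observed)
alpha_hat = 2.0                      # assumed estimate of the Poisson mean

probs = [poisson_pmf(x, alpha_hat) for x in range(4)]
probs.append(1.0 - sum(probs))       # last cell collects the upper tail
expected = [n * p for p in probs]    # E_i = n * p_i

chi2 = chi_square_statistic(observed, expected)
k, s = len(observed), 1              # k cells, s = 1 estimated parameter
dof = k - s - 1
critical = 7.81                      # chi-square table value, 3 df, level 0.05

print(f"chi-square = {chi2:.3f}, degrees of freedom = {dof}")
print("reject H0" if chi2 > critical else "do not reject H0")
```

With these illustrative numbers the statistic is well below the critical value, so H0 would not be rejected at the 0.05 level.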

7.4.2 Chi-Square Test with Equal Probabilities


● If a continuous distributional assumption is being tested, class intervals
that are equal in probability rather than equal in width of interval
should be used.
● This has been recommended by a number of authors [Mann and Wald,
1942; Gumbel, 1943; Law and Kelton, 2000; Stuart, Ord, and Arnold,
1998].
● It should be noted that the procedure is not applicable to data collected
in class intervals, where the raw data have been discarded or lost.
● Unfortunately, there is as yet no method for figuring out the
probability associated with each interval that maximizes the power for
a test of a given size.
● The power of a test is defined as the probability of rejecting a false
hypothesis.
● However, if using equal probabilities, then $p_i = 1/k$. We recommend

$E_i = n p_i \ge 5$

● so substituting for $p_i$ yields

$\frac{n}{k} \ge 5$

● and solving for k yields

$k \le \frac{n}{5}$

● The equation above was used in coming up with the recommendations for
the maximum number of class intervals in the following table:

● If the assumed distribution is normal, exponential, or Weibull, the
method described in this section is straightforward.
● If the assumed distribution is gamma (but not Erlang) or certain other
distributions, then the computation of endpoints for class intervals is
complex and could require numerical integration of the density
function.
● Statistical-analysis software is very helpful in such cases.
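
For the exponential case the endpoints have a closed form, which the following Python sketch illustrates. Setting $F(a_i) = 1 - e^{-\lambda a_i} = i/k$ and solving gives $a_i = -\ln(1 - i/k)/\lambda$. The function names, the sample size of 50, and the rate 0.5 are assumptions made for illustration; the cap on the number of intervals assumes the common guideline that each expected frequency be at least 5.

```python
import math

def max_intervals(n):
    """Largest k with n/k >= 5 (assumed guideline: expected frequency at least 5)."""
    return n // 5

def exponential_endpoints(lam, k):
    """Endpoints a_0 < a_1 < ... < a_k of k equal-probability cells for exponential(lam)."""
    inner = [-math.log(1.0 - i / k) / lam for i in range(1, k)]
    return [0.0] + inner + [math.inf]

def exponential_cdf(x, lam):
    return 1.0 - math.exp(-lam * x)

n, lam_hat = 50, 0.5                 # hypothetical sample size and estimated rate
k = min(8, max_intervals(n))         # use 8 cells, within the n/5 limit
endpoints = exponential_endpoints(lam_hat, k)

# Each cell should carry probability 1/k under the fitted distribution
cell_probs = [exponential_cdf(b, lam_hat) - exponential_cdf(a, lam_hat)
              for a, b in zip(endpoints, endpoints[1:])]

for a, b, p in zip(endpoints, endpoints[1:], cell_probs):
    print(f"[{a:7.3f}, {b:7.3f})  p = {p:.4f}")
```

The cells get wider toward the tail, since equal probability rather than equal width is being enforced.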

7.4.3 Kolmogorov-Smirnov Goodness-of-Fit Test


● The chi-square goodness-of-fit test can accommodate the estimation of
parameters from the data, with a resultant decrease in the degrees of
freedom (one for each parameter estimated).
● The chi-square test requires that the data be placed in class
intervals; in the case of a continuous distributional assumption, this
grouping is arbitrary.
● Changing the number of classes and the interval width affects the
value of the calculated and tabulated chi-square.
● A hypothesis could be accepted when the data are grouped one way,
but rejected when they are grouped another way.
● Also, the distribution of the chi-square test statistic is known only
approximately, and the power of the test is sometimes rather low.
● As a result of these considerations, goodness-of-fit tests other than the
chi-square are desired.
● The Kolmogorov-Smirnov test formalizes the idea behind examining a
q-q plot.
● The Kolmogorov-Smirnov test was presented previously to test for the
uniformity of numbers.
● Both of these uses fall into the category of testing for goodness of fit.
● Any continuous distributional assumption can be tested for goodness
of fit.
● The Kolmogorov-Smirnov test is particularly useful when sample sizes
are small and when no parameters have been estimated from the data.
● The exact value of the significance level $\alpha$ can be worked out in
some instances, as is discussed at the end of this section.
● The Kolmogorov-Smirnov test does not require any special tables when
an exponential distribution is assumed.
● The following example indicates how the test is applied in this
instance.
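
The Kolmogorov-Smirnov statistic itself is simple to compute. The sketch below is a generic illustration, not the worked example from these notes: it computes $D = \max(D^+, D^-)$ for a fully specified continuous cdf, with $D^+ = \max_i (i/n - F(y_i))$ and $D^- = \max_i (F(y_i) - (i-1)/n)$ over the ordered data. The function name, the data sets, and the hypothesized rate of 1.0 are assumptions.

```python
import math

def ks_statistic(data, cdf):
    """Largest gap between the empirical cdf of the data and the hypothesized cdf."""
    ys = sorted(data)
    n = len(ys)
    d_plus = max((i + 1) / n - cdf(y) for i, y in enumerate(ys))
    d_minus = max(cdf(y) - i / n for i, y in enumerate(ys))
    return max(d_plus, d_minus)

# Check against a uniform(0, 1) hypothesis with five numbers
uniform_check = ks_statistic([0.44, 0.81, 0.14, 0.05, 0.93], lambda u: u)
print(f"D (uniform) = {uniform_check:.2f}")

# Hypothetical interarrival times tested against exponential with rate 1.0
times = [0.2, 1.5, 0.7, 3.1, 0.4, 1.0, 2.2, 0.1]
d_exp = ks_statistic(times, lambda x: 1.0 - math.exp(-1.0 * x))
print(f"D (exponential) = {d_exp:.3f}")
```

The computed D is then compared with a critical value for the chosen significance level and sample size; that lookup is left out here because the appropriate table depends on whether parameters were estimated from the data.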

Example: Kolmogorov-Smirnov Test for Exponential Distribution

7.4.4 p-Values and "Best Fits"
● To apply a goodness-of-fit test, a significance level must be chosen.
● Recall that the significance level is the probability of falsely rejecting
H0: the random variable conforms to the distributional assumption.
● The traditional significance levels are 0.1, 0.05, and 0.01.
● Prior to the availability of high-speed computing, having a small set of
standard values made it possible to produce tables of useful critical
values.
● Now most statistical software computes critical values as needed,
rather than storing them in tables.
● Thus, the analyst can employ a different level of significance, say
0.07.
● However, rather than require a prespecified significance level, many
software packages compute a p-value for the test statistic.
● The p-value is the significance level at which one would just reject H0
for the given value of the test statistic.
● Therefore, a large p-value tends to indicate a good fit (we would have
to accept a large chance of error in order to reject), while a small
p-value suggests a poor fit (to accept we would have to insist on almost
no risk).
● The p-value can be viewed as a measure of fit, with larger values being
better.
● This suggests that we could fit every distribution at our disposal,
compute a test statistic for each fit, and then choose the distribution
that yields the largest p-value.
● We know of no input modeling software that implements this specific
algorithm, but many such packages do include a "best fit" option, in
which the software recommends an input model to the user after
evaluating all feasible models.
● The software might also take into account other factors, such as
whether the data are discrete or continuous, bounded or unbounded,
but, in the end, some summary measure of fit, like the p-value, is used
to rank the distributions.

7.5 SELECTING INPUT MODELS WITHOUT DATA


● Unfortunately, it is often necessary in practice to develop a simulation
model, perhaps for demonstration purposes or a preliminary study,
before any process data are available.
● In this case, the modeler must be resourceful in choosing input models
and must carefully check the sensitivity of results to the chosen
models.
● There are a number of ways to obtain information about a process
even if data are not available:
○ Engineering data:
■ Often a product or process has performance ratings provided by
the manufacturer (for example, the mean time to failure of a disk
drive is 10000 hours; a laser printer has a rated output in pages/minute;
the cutting speed of a tool is 1 cm/second; etc.).
■ Company rules might specify time or production standards.
■ These values provide a starting point for input modeling by fixing
a central value.

○ Expert opinion:
■ Talk to people who are experienced with the process or similar
processes.
■ Often, they can provide optimistic, pessimistic, and most-likely
times.
■ They might also be able to say whether the process is nearly
constant or highly variable, and they might be able to define the
source of variability.

○ Physical or conventional limitations:


■ Most real processes have physical limits on performance; for
example, computer data entry cannot be faster than a person can
type.
■ Because of company policies, there could be upper limits on how
long a process may take.
■ Do not ignore obvious limits or bounds that narrow the range of
the input process.

○ The nature of the process:


■ The description of the distributions can be used to justify a
particular choice even when no data are available.
● When data are not available, the uniform, triangular, and beta
distributions are often used as input models.
● The uniform can be a poor choice, because the upper and lower
bounds are rarely just as likely as the central values in real processes.

● If, in addition to upper and lower bounds, a most-likely value can be
given, then the triangular distribution can be used.
● The triangular distribution places much of its probability near the
most-likely value, and much less near the extremes.
● If a beta distribution is used, then be sure to plot the density function
of the selected distribution; the beta can take unusual shapes.
● A useful refinement is obtained when a minimum, a maximum, and
one or more "breakpoints" can be given.
● A breakpoint is an intermediate value together with a probability of
being less than or equal to that value.
● The following example illustrates how breakpoints are used.
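
One common way to turn a minimum, a maximum, and a few breakpoints into an input model is to join the points into a piecewise-linear cdf and sample from it by the inverse-transform method. The Python sketch below shows that idea under stated assumptions: the breakpoint values and probabilities are hypothetical, and the function name is illustrative rather than taken from these notes.

```python
import random
from bisect import bisect_right

# (value, P(X <= value)) pairs: minimum, two breakpoints, maximum (all hypothetical)
breakpoints = [(0.0, 0.0), (4.0, 0.25), (8.0, 0.80), (12.0, 1.0)]

def sample_from_breakpoints(points, u):
    """Invert the piecewise-linear cdf through the given points at probability u."""
    values = [v for v, _ in points]
    probs = [p for _, p in points]
    i = bisect_right(probs, u) - 1       # segment whose probability range contains u
    i = min(i, len(points) - 2)          # keep u = 1.0 inside the last segment
    frac = (u - probs[i]) / (probs[i + 1] - probs[i])
    return values[i] + frac * (values[i + 1] - values[i])

rng = random.Random(11)                  # fixed seed (assumption) for repeatability
draws = [sample_from_breakpoints(breakpoints, rng.random()) for _ in range(10000)]

share_below_4 = sum(d <= 4.0 for d in draws) / len(draws)
print(f"share of draws <= 4.0: {share_below_4:.3f}")   # should be near 0.25
```

Within each segment the values are uniform, so adding more breakpoints lets the expert's knowledge shape the distribution more finely.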

7.6 MULTIVARIATE AND TIME SERIES INPUT MODELS
The random variables presented were considered to be independent of any
other variables within the context of the problem.
However, variables may be related, and, if the variables appear in a
simulation model as inputs, the relationship should be investigated and
taken into consideration.

Example:
An inventory simulation includes the lead time and annual demand for
industrial robots. An increase in demand results in an increase in lead
time: The final assembly of the robots must be made according to the
specifications of the purchaser. Therefore, rather than treat lead time and
demand as independent random variables, a multivariate input model
should be developed.
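
A first check on whether two inputs are related is the sample covariance and correlation. The Python sketch below uses illustrative lead-time and demand figures (ten paired observations, treated here as hypothetical) and hand-written helpers so that it runs on any recent Python version; the function names are assumptions.

```python
def sample_cov(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation: covariance scaled by both standard deviations."""
    sx = sample_cov(xs, xs) ** 0.5
    sy = sample_cov(ys, ys) ** 0.5
    return sample_cov(xs, ys) / (sx * sy)

# Illustrative paired observations (lead time in months, annual demand in units)
lead_time = [6.5, 4.3, 6.9, 6.0, 6.9, 6.9, 5.8, 7.3, 4.5, 6.3]
demand = [103, 83, 116, 97, 112, 104, 106, 109, 92, 96]

cov = sample_cov(lead_time, demand)
rho = sample_corr(lead_time, demand)
print(f"covariance  = {cov:.2f}")
print(f"correlation = {rho:.2f}")
```

A correlation this far from zero suggests that sampling lead time and demand independently would misrepresent the system.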

7.6.1 Covariance and Correlation

7.6.2 Multivariate Input Models

7.6.3 Time Series Input Models

7.7 SUMMARY
● Input-data collection and analysis require major time and resource
commitments in a discrete-event simulation project.
● However, regardless of the validity or sophistication of the simulation
model, unreliable inputs can lead to outputs whose subsequent
interpretation could result in faulty recommendations.
● This chapter discussed four steps in the development of models of
input data: collecting the raw data, identifying the underlying
statistical distribution, estimating the parameters, and testing for
goodness of fit.
● Once the data have been collected, a statistical model should be
hypothesized.
● Constructing a histogram is very useful at this point if sufficient data
are available.
● A distribution based on the underlying process and on the shape of the
histogram can usually be selected for further investigation.
● The investigation proceeds with the estimation of parameters for the
hypothesized distribution.
● Suggested estimators were given for distributions used often in
simulation.
● In a number of instances, these are functions of the sample mean and
sample variance.
● The last step in the process is the testing of the distributional
hypothesis. The q-q plot is a useful graphical method for assessing
fit.

● The Kolmogorov-Smirnov, chi-square, and Anderson-Darling
goodness-of-fit tests can be applied to many distributional assumptions.
● When a distributional assumption is rejected, another distribution is
tried.
● When all else fails, the empirical distribution could be used in the
model.
● Unfortunately, in some situations, a simulation study must be
undertaken when there is not time or resources to collect data on which
to base input models.
● When this happens, the analyst must use any available information,
such as manufacturer specifications and expert opinion, to construct the
input models.
● When input models are derived without the benefit of data, it is
particularly important to examine the sensitivity of the results to the
models chosen.

7.8 EXERCISE
Answer the following:

1. Draw the pdf of the normal distribution with $\mu = 6$, $\sigma = 3$.

2. Draw the pmf of the Poisson distribution with $\alpha$ = 3, 5, and 6.

3. Draw the exponential pdf with $\lambda = 0.5$. On the same sheet, draw the
exponential pdf with $\lambda = 1.5$.

4. On one figure, draw the pdfs of the Erlang distribution where $\theta = 2$ and
k = 1, 2, 4, and 8.
5. The following data are available on the processing time at a machine
(in minutes):
0.64, 0.59, 1.1, 3.3, 0.54, 0.04, 0.45, 0.25, 4.4, 2.7, 2.4, 1.1, 3.6,
0.61, 0.20, 1.0, 0.27, 1.7, 0.04, 0.34. Develop an input model for
the processing time.



8
VERIFICATION AND VALIDATION OF SIMULATION MODEL & OUTPUT ANALYSIS FOR A SINGLE MODEL
Unit Structure :
8.0 Objective
8.1 Model building, Verification, and Validation
8.2 Verification of simulation models
8.3 Calibration and Validation of models
8.4 Output Analysis for a Single Model
8.5 Types of simulations with respect to output analysis
8.6 Stochastic nature of output data
8.7 Measure of performance and their estimation
8.8 Output analysis of terminating simulations
8.9 Output analysis for steady state simulation
8.10 Summary
8.11 Exercise

8.0 OBJECTIVE
The objectives of this chapter are:
● To predict the performance of a system or to compare the performance
of two or more alternative system designs.
● To describe methods that have been recommended and used in the
verification and validation process.
● To understand relationships where validation is the process by which
model users gain confidence that output analysis is making valid
inferences about the real system under study.
● To understand that output analysis is the examination of data generated
by a simulation.
● To understand that the purpose of output analysis is either to predict the
performance of a system or to compare the performance of two or
more alternative system designs.

Conceptually, the verification and validation process consists of the
following components:

1. Verification is concerned with building the model correctly.


● It proceeds by the comparison of the conceptual model to the computer
representation that implements that conception.
● It asks the questions:
○ Is the model implemented correctly in the simulation software?
○ Are the input parameters and logical structure of the model
represented correctly?
2. Validation is concerned with building the correct model.
● It attempts to confirm that a model is an accurate representation of the
real system.
● Validation is usually achieved through the calibration of the model, an
iterative process of comparing the model to actual system behavior and
using the discrepancies between the two, and the insights gained, to
improve the model.
● This process is repeated until model accuracy is judged to be
acceptable.

8.1 MODEL BUILDING, VERIFICATION, AND VALIDATION
● The first step in model building consists of observing the real system
and the interactions among its various components and of collecting
data on its behavior.
● But observation alone seldom yields sufficient understanding of
system behavior.
● Persons familiar with the system, or any subsystem, should be
questioned to take advantage of their special knowledge.
● Operators, technicians, repair and maintenance personnel, engineers,
supervisors, and managers understand certain aspects of the system
that might be unfamiliar to others.
● As model development proceeds, new questions may arise, and the
model developers will return to this step of learning true system
structure and behavior.
● The second step in model building is the construction of a conceptual
model: a collection of assumptions about the components and the
structure of the system, plus hypotheses about the values of model
input parameters.
● As is illustrated by the following figure, conceptual validation is the
comparison of the real system to the conceptual model.

● The third step is the implementation of an operational model, usually
by using simulation software and incorporating the assumptions of the
conceptual model into the worldview and concepts of the simulation
software.
● In actuality, model building is not a linear process with three steps.
● Instead, the model builder will return to each of these steps many
times while building, verifying, and validating the model.
● The figure above depicts the ongoing model building process, in which
the need for verification and validation causes continual comparison of
the real system to the conceptual model and to the operational model
and induces repeated modification of the model to improve its
accuracy.

8.2 VERIFICATION OF SIMULATION MODELS


● The purpose of model verification is to assure that the conceptual
model is reflected accurately in the operational model.
● The conceptual model quite often involves some degree of abstraction
about system operations or some amount of simplification of actual
operations.
● Verification asks the following question:
○ Is the conceptual model (assumptions about system components
and system structure, parameter values, abstractions, and
simplifications) accurately represented by the operational model?
Many common-sense suggestions can be given for use in the
verification process:
1. Have the operational model checked by someone other than its
developer, preferably an expert in the simulation software being used.
2. Make a flow diagram that includes each logically possible action a
system can take when an event occurs, and follow the model logic for
each action for each event type.
3. Closely examine the model output for reasonableness under a variety
of settings of the input parameters. Have the implemented model
display a wide variety of output statistics, and examine all of them
closely.
4. Have the operational model print the input parameters at the end of the
simulation, to be sure that these parameter values have not been
changed inadvertently.
5. Make the operational model as self-documenting as possible. Give a
precise definition of every variable used and a general description of
the purpose of each submodel, procedure (or major section of code),
component, or other model subdivision.
6. If the operational model is animated, verify that what is seen in the
animation imitates the actual system. Examples of errors that can be
observed through animation are automated guided vehicles (AGVs)
that pass through one another on a unidirectional path or at an
intersection and entities that disappear (unintentionally) during a
simulation.
7. The Interactive Run Controller (IRC) or debugger is an essential
component of successful simulation model building. Even the best of
simulation analysts makes mistakes or commits logical errors when
building a model. The IRC assists in finding and correcting those
errors in the following ways:
(a) The simulation can be monitored as it progresses. This can be
accomplished by advancing the simulation until a desired time has
elapsed, then displaying model information at that time.
Another possibility is to advance the simulation until a particular
condition is in effect, and then display information.
(b) Attention can be focused on a particular entity, line of code, or
procedure. For instance, every time that an entity enters a specified
procedure, the simulation will pause so that information can be
gathered. As another example, every time that a specified entity
becomes active, the simulation will pause.
(c) Values of selected model components can be observed. When the
simulation has paused, the current value or status of variables,
attributes, queues, resources, counters, and so on can be observed.
(d) The simulation can be temporarily suspended, or paused, not only to
view information, but also to reassign values or redirect entities.
8. Graphical interfaces are recommended for accomplishing verification
and validation [Bortscheller and Saulnier, 1992]. The graphical
representation of the model is essentially a form of
self-documentation. It simplifies the task of understanding the model.
● Two sets of statistics that can give a quick indication of model
reasonableness are current contents and total count.
● These statistics apply to any system having items of some kind
flowing through it, whether these items be called customers,
transactions, inventory, or vehicles.
● Current contents refers to the number of items in each component of
the system at a given time.
● Total count refers to the total number of items that have entered each
component of the system by a given time.
● In some simulation software, these statistics are kept automatically and
can be displayed at any point in simulation time.
● In other simulation software, simple counters might have to be added
to the operational model and displayed at appropriate times.
● If the current contents in some portion of the system are high, this
condition indicates that a large number of entities are delayed.
● If the output is displayed for successively longer simulation run times
and the current contents tend to grow in a more or less linear fashion,
it is highly likely that a queue is unstable and that the server(s) will fall
further behind as time continues.
● This possibly indicates that the number of servers is too small or that a
service time is misspecified.
● On the other hand, if the total count for some subsystem is zero, this
indicates that no items entered that subsystem, again a highly suspect
occurrence.
● Another possibility is that the current count and total count are equal to
one.
● This could indicate that an entity has captured a resource, but never
freed that resource.
● Careful evaluation of these statistics for various run lengths can aid in
the detection of mistakes in model logic and data misspecifications.

● Checking for output reasonableness will usually fail to detect the more
subtle errors, but it is one of the quickest ways to discover gross errors.
● To aid in error detection, it is best for the model developer to forecast
a reasonable range for the value of selected output statistics before
making a run of the model.
● Such a forecast reduces the possibility of rationalizing a discrepancy
and failing to investigate the cause of unusual output.
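
The use of current contents and total count as a reasonableness check can be illustrated with a small single-server queue written in plain Python. This is a sketch under stated assumptions, not any particular simulation package: the function name, the exponential interarrival and service times, the rates, and the seed are all hypothetical. The arrival rate is deliberately set above the service rate so that the unstable behavior described above shows up in the counters.

```python
import random

def run_queue(arrival_rate, service_rate, run_length, seed=1):
    """Simulate a single-server queue and return (current contents, total count)."""
    rng = random.Random(seed)
    next_arrival = rng.expovariate(arrival_rate)
    next_departure = float("inf")      # no one in service yet
    contents = 0                       # items currently in queue or in service
    total_count = 0                    # items that have entered so far
    while True:
        t = min(next_arrival, next_departure)
        if t > run_length:
            return contents, total_count
        if next_arrival <= next_departure:          # arrival event
            contents += 1
            total_count += 1
            next_arrival = t + rng.expovariate(arrival_rate)
            if contents == 1:                       # server was idle: start service
                next_departure = t + rng.expovariate(service_rate)
        else:                                       # departure event
            contents -= 1
            if contents > 0:
                next_departure = t + rng.expovariate(service_rate)
            else:
                next_departure = float("inf")

# Arrival rate 1.2 exceeds service rate 1.0, so the queue is unstable (assumed values)
results = {T: run_queue(1.2, 1.0, T) for T in (1000, 2000, 4000)}
for T, (contents, total) in results.items():
    print(f"run length {T:5d}: current contents = {contents:4d}, total count = {total:5d}")
```

Current contents that keep growing roughly in proportion to the run length are the warning sign: the server is falling further behind as time continues.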

8.3 CALIBRATION AND VALIDATION OF MODELS


● Calibration is the iterative process of comparing the model to the real
system, making adjustments (or even major changes) to the model,
comparing the revised model to reality, making additional adjustments,
comparing again, and so on.
● The following figure shows the relationship of model calibration to the
overall validation process.

● As an aid in the validation process, Naylor and Finger [1967]
formulated a three-step approach that has been widely followed:
1. Build a model that has high face validity.
2. Validate model assumptions.
3. Compare the model input-output transformations to
corresponding input-output transformations for the real system.
● The next five subsections investigate these three steps in detail.

8.3.1 Face Validity


● The first goal of the simulation modeler is to construct a model that
appears reasonable on its face to model users and others who are
knowledgeable about the real system being simulated.
● The potential users of a model should be involved in model construction
from its conceptualization to its implementation, to ensure that a high
degree of realism is built into the model through reasonable
assumptions regarding system structure and through reliable data.
● Potential users and knowledgeable persons can also evaluate model
output for reasonableness and can aid in identifying model
deficiencies.
● Thus, the users can be involved in the calibration process as the model
is improved iteratively by the insights gained from identification of the
initial model deficiencies.
● Another advantage of user involvement is the increase in the model's
perceived validity, or credibility, without which the manager would not
be willing to trust simulation results as a basis for decision making.
● Sensitivity analysis can also be used to check a model's face validity.
● The model user is asked whether the model behaves in the expected
way when one or more input variables is changed.
● For example, in most queueing systems, if the arrival rate of customers
were to increase, it would be expected that utilizations of servers,
lengths of lines, and delays would tend to increase.
● From experience and from observations on the real system, the model
user and model builder would probably have some notion at least of
the direction of change in model output when an input variable is
increased or decreased.
● For most large-scale simulation models, there are many input
variables and thus many possible sensitivity tests.
● The model builder must attempt to choose the most critical input
variables for testing if it is too expensive or time consuming to vary
all input variables.
● If real system data are available for at least two settings of the input
parameters, objective scientific sensitivity tests can be conducted via
appropriate statistical techniques.
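
The sensitivity check described above can be sketched in code. The following is only an illustration under stated assumptions: it uses a single-server first-come-first-served queue with exponential interarrival and service times, which is not a model given in the text, and the function name mean_delay and all parameter values are hypothetical. Both runs use the same seed, so the only thing that changes between them is the arrival rate.

```python
import random


def mean_delay(arrival_rate, service_rate, n_customers, seed):
    """Average delay in queue for a single-server FIFO queue (illustrative model)."""
    rng = random.Random(seed)
    delay = 0.0   # assumption: the first customer finds the system empty
    total = 0.0
    for _ in range(n_customers):
        total += delay
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # Lindley recursion: delay in queue of the next customer
        delay = max(0.0, delay + service - interarrival)
    return total / n_customers


if __name__ == "__main__":
    low = mean_delay(arrival_rate=0.5, service_rate=1.0, n_customers=5000, seed=1)
    high = mean_delay(arrival_rate=0.9, service_rate=1.0, n_customers=5000, seed=1)
    # Face-validity expectation: a higher arrival rate should not reduce delays.
    print("mean delay at arrival rate 0.5:", low)
    print("mean delay at arrival rate 0.9:", high)
    print("direction as expected:", high >= low)
```

If the direction of change contradicted what the model user expects from the real system, that would be a reason to inspect the model logic before any formal statistical validation.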
8.3.2 Validation of Model Assumptions
● Model assumptions fall into two classes:
○ Structural assumptions and
○ Data assumptions
● Structural assumptions involve questions of how the system operates
and usually involve simplification and abstractions of reality.
● For example, consider the customer queueing and service facility in
the bank.
● Customers can form one line, or there can be an individual line for
each teller.
● If there are many lines, customers could be served strictly on a
first-come-first-served basis, or some customers could change lines if
one line is moving faster.
● The number of tellers could be fixed or variable.
● These structural assumptions should be verified by actual observation
during appropriate time periods and by discussions with managers and
tellers regarding bank policies and actual implementation of these
policies.
● Data assumptions should be based on the collection of reliable data
and correct statistical analysis of the data.
● For example, in the bank study mentioned, data were collected on
1. interarrival times of customers during several 2-hour periods of
peak loading ("rush-hour" traffic);
2. interarrival times during a slack period;
3. service times for commercial accounts;
4. service times for personal accounts.
● The reliability of the data was verified by consultation with bank
managers, who identified typical rush hours and typical slack times.
● When combining two or more data sets collected at different times,
data reliability can be further enhanced by objective statistical tests for
homogeneity of data.
● (Do two data sets {Xi} and {Yi} on service times for personal
accounts, collected at two different times, come from the same parent
population? If so, the two sets can be combined.)
● Additional tests might be required to test for correlation in the data.
● As soon as the analyst is assured of dealing with a random sample,
statistical analysis can begin.
● Whether done manually or by special-purpose software, the analysis
of input data from a random sample consists of three steps:
1. Identify an appropriate probability distribution.
2. Estimate the parameters of the hypothesized distribution.
3. Validate the assumed statistical model by goodness-of-fit test,
such as the chi-square or Kolmogorov-Smirnov test, and by
graphical methods.
● The use of goodness-of-fit tests is an important part of the validation
of data assumptions.
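
The three steps of input data analysis listed above can be sketched as follows. This is a minimal illustration, not the procedure used in the bank study: it assumes an exponential distribution as the hypothesized model, uses six equal-probability cells for the chi-square test, and the names chi_square_exponential and CHI2_CRIT_05_DF4 are hypothetical. With k = 6 cells and s = 1 estimated parameter, the test has k - s - 1 = 4 degrees of freedom, for which the 0.05 critical value is 9.488.

```python
import math
import random

# Chi-square critical value for alpha = 0.05 and 4 degrees of freedom.
CHI2_CRIT_05_DF4 = 9.488


def chi_square_exponential(data, k=6):
    """Fit an exponential distribution and return (rate estimate, chi-square statistic)."""
    n = len(data)
    # Step 1 (assumed here): hypothesize an exponential distribution.
    # Step 2: estimate its parameter; the maximum-likelihood rate is 1 / sample mean.
    rate = n / sum(data)
    # Step 3: chi-square test with k equal-probability cells.
    # Cell edges solve F(x) = 1 - exp(-rate * x) = i / k for i = 1, ..., k - 1.
    edges = [-math.log(1.0 - i / k) / rate for i in range(1, k)]
    observed = [0] * k
    for x in data:
        cell = sum(1 for e in edges if x > e)   # number of edges below x
        observed[cell] += 1
    expected = n / k
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return rate, stat


if __name__ == "__main__":
    rng = random.Random(7)
    sample = [rng.expovariate(2.0) for _ in range(1000)]   # synthetic data for illustration
    rate, stat = chi_square_exponential(sample)
    print("estimated rate:", rate)
    print("chi-square statistic:", stat)
    print("reject exponential at 0.05 level:", stat > CHI2_CRIT_05_DF4)
```

A statistic above the critical value is evidence against the hypothesized distribution; a graphical check such as a histogram overlay would normally accompany the test.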
8.3.3 Validating Input-Output Transformations
● In this phase of the validation process, the model is viewed as an
input-output transformation.
● That is, the model accepts the values of input parameters and
transforms these inputs into output measures of performance. It is this
correspondence that is being validated.
● Instead of validating the model input-output transformation by
predicting the future, the modeler may use past historical data that
have been reserved for validation purposes; that is, if one data set has
been used to develop and calibrate the model, it is recommended that a
separate data set be used as the final validation test. Thus accurate
"prediction of the past" may replace prediction of the future for the
purpose of validating the model.
● A necessary condition for the validation of input-output
transformations is that some version of the system under study exists,
so that system data under at least one set of input conditions can be
collected to compare to model predictions.
● If the system is in the planning stage and no system operating data can
be collected, complete input-output validation is not possible.
● Validation increases the modeler's confidence that the model of the
existing system is accurate.
● Changes in the computerized representation of the system, ranging
from relatively minor to relatively major, include:
1. Minor changes of single numerical parameters such as speed of the
machine, arrival rate of the customer etc.
2. Minor changes of the form of a statistical distribution such as
distribution of service time or a time to failure of a machine.
3. Major changes in the logical structure of a subsystem such as change
in queue discipline for waiting-line model, or a change in the
scheduling rule for a job shop model.
4. Major changes involving a different design for the new system such as
a computerized inventory control system replacing a non computerized
system.

8.3.4 Input-Output Validation: Using Historical Input Data


● When using artificially generated data as input data, the modeler
expects the model to produce event patterns that are compatible with,
but not identical to, the event patterns that occurred in the real system
during the period of data collection.
● Thus, in the bank model, artificial input data {X1n, X2n, n = 1, 2, ...} for
interarrival and service times were generated, and replicates of the
output data Y2 were compared to what was observed in the real
system.
● An alternative to generating input data is to use the actual historical
record, {An, Sn, n = 1, 2...}, to drive simulation models and then to
compare model output to system data.
● To implement this technique for the bank model, the data A1, A2, ...
and S1, S2, ... would have to be entered into the model in arrays, or
stored in a file to be read as the need arose.
● To conduct a validation test using historical input data, it is important
that all input data (An, Sn, ...) and all the system response data, such as
average delay (Z2), be collected during the same time period.
● Otherwise, comparison of model responses to system responses, such
as the comparison of average delay in the model (Y2) to that in the
system (Z2), could be misleading.
● Responses (Y2 and Z2) depend on the inputs (An and Sn) as well
as on the structure of the system, or model.
● Implementation of this technique could be difficult for a large
system because of the need for simultaneous data collection of all
input variables and those response variables of primary interest.
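
Driving a model with the historical record can be sketched as below. The sketch assumes a single-server first-come-first-served queue that starts empty and idle, which is a simplification of the bank model; the function names and the sample numbers are hypothetical and are not data from the study. The lists play the role of the recorded interarrival times An and service times Sn, and the result plays the role of the model's average delay Y2, to be compared with the observed system value Z2.

```python
def trace_driven_average_delay(interarrivals, services):
    """Average delay in queue when the model is driven by recorded An and Sn values."""
    if not services or len(interarrivals) != len(services):
        raise ValueError("need one interarrival time and one service time per customer")
    delay = 0.0   # assumption: the first customer finds the system empty and idle
    total = 0.0
    for n in range(len(services)):
        total += delay
        if n + 1 < len(services):
            # Lindley recursion: delay of customer n + 1 from the recorded data
            delay = max(0.0, delay + services[n] - interarrivals[n + 1])
    return total / len(services)


def relative_difference(model_value, system_value):
    """Relative gap between a model response and the observed system response."""
    return abs(model_value - system_value) / abs(system_value)


if __name__ == "__main__":
    recorded_interarrivals = [1.0, 1.0, 1.0, 5.0]   # hypothetical record of An
    recorded_services = [3.0, 2.0, 1.0, 1.0]        # hypothetical record of Sn
    y2 = trace_driven_average_delay(recorded_interarrivals, recorded_services)
    z2 = 1.0                                        # hypothetical observed average delay
    print("model average delay Y2:", y2)
    print("relative difference from Z2:", relative_difference(y2, z2))
```

Because the inputs are identical to those seen by the real system, any remaining gap between Y2 and Z2 points to the model structure rather than to sampling variation in the inputs.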

8.3.5 Input-Output Validation: Using a Turing Test


● In addition to statistical tests, or when no statistical test is readily
applicable, persons knowledgeable about system behavior can be used
to compare model output to system output.
● For example, suppose that five reports of system performance over
five different days are prepared, and simulation outputs are used to
produce five "fake" reports.
● The ten reports should all be in exactly the same format and should
contain information of the type that managers and engineers have
previously seen on the system.
● The ten reports are randomly shuffled and given to the engineers, who
are asked to decide which reports are fake and which are real.
● If an engineer identifies a substantial number of fake reports, the model
builder questions the engineer and uses the information gained to
improve the model.
● If the engineer cannot distinguish between fake and real reports with
any consistency, the modeler will conclude that this test provides no
evidence of model inadequacy.
● This type of validation test is called the TURING TEST.

8.4 OUTPUT ANALYSIS FOR A SINGLE MODEL

Output analysis is the examination of data generated by a simulation.
Its purpose is either to predict the performance of a system or to compare
the performance of two or more alternative system designs. This section
deals with the analysis of a single system.

8.5 TYPES OF SIMULATIONS WITH RESPECT TO OUTPUT ANALYSIS
● In the analysis of simulation output data, a distinction is made between
terminating or transient simulations and steady-state simulations.
● A terminating simulation is one that runs for some duration of time TE,
where E is a specified event (or set of events) that stops the simulation.
● Such a simulated system “opens” at time 0 under well-specified initial
conditions and “closes” at the stopping time TE.
● The next four examples are terminating simulations.
● Note that in the bank model example the stopping time TE = 480
minutes is known, but in some examples the stopping time TE is
generally unpredictable in advance; in fact, TE is probably the output
variable of interest, as it represents the total time until the system
breaks down.
● One goal of the simulation might be to estimate E(TE), the mean time
to system failure.

8.6 STOCHASTIC NATURE OF OUTPUT DATA


Consider one run of a simulation model over a period of time [0, TE].
● Because some of the model input variables are random variables, it
follows that the model output variables are random variables.
● Three examples are now given to illustrate the nature of the output
data from stochastic simulations and to give a preliminary discussion
of several important properties of these data.

● Do not be concerned if some of these properties and the associated
terminology are not entirely clear on a first reading.

8.7 MEASURES OF PERFORMANCE AND THEIR ESTIMATION
Consider a set of output values for the same measure, Y1, Y2, Y3, ..., Yn
(e.g. delays from n different runs, or waiting times from n different runs).
We want to have
● a point estimate to approximate the true value of the measure, and
● an interval estimate to outline the range where the true value lies.

8.7.1 Point Estimation

The point estimator of $\theta$ based on the data $\{Y_1, \dots, Y_n\}$ is defined by

$$\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} Y_i$$

The point estimator $\hat{\theta}$ is said to be unbiased for $\theta$ if

$$E(\hat{\theta}) = \theta$$

In general,

$$E(\hat{\theta}) \neq \theta$$

i.e. there is a drift, or bias, $E(\hat{\theta}) - \theta$.

For continuous data, the point estimator of $\phi$ based on the data
$\{Y(t),\ 0 \le t \le T_E\}$, where $T_E$ is the simulation run length, is defined by

$$\hat{\phi} = \frac{1}{T_E} \int_0^{T_E} Y(t)\,dt$$

and is called a time average of $Y(t)$ over $[0, T_E]$.

In general,

$$E(\hat{\phi}) = \phi + b$$

If $b = 0$, $\hat{\phi}$ is said to be unbiased for $\phi$.

Another performance measure that can be estimated, by a point or an
interval estimate, is a quantile or percentile.
Quantiles describe the level of performance that can be delivered with a
given probability p.
Assume Y represents the delay in queue that a customer experiences; then
the 0.85 quantile (or 85th percentile) of Y is the value $\theta$ such that
$\Pr(Y \le \theta) = p$, where $p = 0.85$.
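
The point estimators discussed in this subsection (a sample mean for discrete output, a time average for continuous output, and a quantile) can be sketched as follows. This is an illustration under assumptions: the continuous output Y(t) is taken to be piecewise constant and recorded as a list of (time, value) change points starting at time 0, the quantile is estimated by a simple order statistic, and all function names are hypothetical.

```python
import math


def sample_mean(values):
    """Point estimate for discrete-time output: the average of Y1, ..., Yn."""
    return sum(values) / len(values)


def time_average(change_points, t_end):
    """Point estimate for continuous-time output: time average of Y(t) over [0, t_end].

    change_points is a list of (time, value) pairs sorted by time, starting at time 0;
    Y(t) is assumed to hold each value until the next change point.
    """
    total = 0.0
    following = change_points[1:] + [(t_end, None)]
    for (t, y), (t_next, _) in zip(change_points, following):
        total += y * (t_next - t)   # area under Y(t) on [t, t_next)
    return total / t_end


def empirical_quantile(values, p):
    """Smallest observed value with at least a fraction p of the data at or below it."""
    ordered = sorted(values)
    index = max(0, math.ceil(p * len(ordered)) - 1)
    return ordered[index]


if __name__ == "__main__":
    delays = [2.0, 4.0, 6.0]                           # hypothetical delays
    queue_length = [(0.0, 0), (2.0, 1), (5.0, 3)]      # hypothetical queue-length history
    print("sample mean:", sample_mean(delays))
    print("time average:", time_average(queue_length, t_end=10.0))
    print("0.85 quantile:", empirical_quantile(range(1, 101), 0.85))
```

Other quantile conventions (for example, interpolating between order statistics) exist; the order-statistic form is used here only because it is the simplest to state.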

8.7.2 Interval Estimation


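Valid interval estimation typically requires a method of estimating the
variance of the point estimator $\hat{\theta}$ or $\hat{\phi}$.

Let $\sigma^2(\hat{\theta}) = V(\hat{\theta})$ represent the true variance of a point
estimator $\hat{\theta}$, and let $\hat{\sigma}^2(\hat{\theta})$ represent an estimator of
$\sigma^2(\hat{\theta})$ based on the data $\{Y_1, \dots, Y_n\}$.

Suppose that

$$E[\hat{\sigma}^2(\hat{\theta})] = B\,\sigma^2(\hat{\theta})$$

where B is called the bias in the variance estimator.

It is desirable to have

$$B = 1$$

in which case $\hat{\sigma}^2(\hat{\theta})$ is said to be an unbiased estimator of the
variance $\sigma^2(\hat{\theta})$.

If it is an unbiased estimator, the statistic

$$t = \frac{\hat{\theta} - \theta}{\hat{\sigma}(\hat{\theta})}$$

is approximately t-distributed with some number of degrees of freedom f.

An approximate $100(1-\alpha)\%$ confidence interval for $\theta$ is given by

$$\hat{\theta} - t_{\alpha/2,f}\,\hat{\sigma}(\hat{\theta}) \le \theta \le \hat{\theta} + t_{\alpha/2,f}\,\hat{\sigma}(\hat{\theta})$$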

This relation involves three quantities: the estimator of the mean, the
estimator of the variance, and the degrees of freedom. How are these
values determined? The estimator of the mean is calculated as above, as
a point estimator.

The estimator of the variance and the degrees of freedom depend on which
of two cases applies.

If the $Y_i$'s are statistically independent observations, then use

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \hat{\theta})^2$$

to calculate

$$\hat{\sigma}^2(\hat{\theta}) = \frac{S^2}{n}$$

with f = n - 1 degrees of freedom.
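
For the independent case, the confidence-interval calculation can be sketched as below. The sketch assumes the caller supplies the t critical value for n - 1 degrees of freedom from a table (for example, 2.262 for a 95% interval with 9 degrees of freedom); the function name and the sample values are hypothetical.

```python
import math
import statistics


def confidence_interval(observations, t_crit):
    """Approximate confidence interval for the mean of independent observations.

    t_crit is the t critical value for len(observations) - 1 degrees of freedom,
    looked up by the caller for the desired confidence level.
    """
    n = len(observations)
    theta_hat = statistics.mean(observations)   # point estimator of the mean
    s2 = statistics.variance(observations)      # sample variance with divisor n - 1
    half_width = t_crit * math.sqrt(s2 / n)     # t times the estimated standard error
    return theta_hat - half_width, theta_hat + half_width


if __name__ == "__main__":
    # Hypothetical averages from 10 independent replications.
    averages = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
    lower, upper = confidence_interval(averages, t_crit=2.262)
    print("95% confidence interval:", (lower, upper))
```

This form is valid only when the observations are independent, for instance one summary value per replication; it should not be applied directly to autocorrelated observations from within a single run.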

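If the $Y_i$'s are not statistically independent, then the above estimator
of the variance is biased. The $Y_i$'s then form an autocorrelated sequence,
sometimes called a time series. In this case,

$$\sigma^2(\hat{\theta}) = V(\hat{\theta}) = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \operatorname{cov}(Y_i, Y_j)$$

i.e. one needs to calculate the covariance for every possible pair of
observations, which is too expensive. If the simulation is long enough to
have passed the transient phase, the output is approximately covariance
stationary. That is, $Y_{i+k}$ depends on $Y_i$ in the same way as $Y_{j+k}$
depends on $Y_j$, for any i and j.

For a covariance-stationary time series $\{Y_i\}$, define the lag-k
autocovariance by

$$\gamma_k = \operatorname{cov}(Y_1, Y_{1+k}) = \operatorname{cov}(Y_i, Y_{i+k})$$

For k = 0, $\gamma_0$ becomes the population variance

$$\gamma_0 = \operatorname{cov}(Y_i, Y_i) = \operatorname{var}(Y_i) = \sigma^2$$
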
The lag-k autocorrelation is the correlation between any two observations
k apart,

$$\rho_k = \frac{\gamma_k}{\sigma^2}$$

and has the property

$$-1 \le \rho_k \le 1, \quad k = 1, 2, \dots$$

If a time series is covariance stationary, then the calculation of the
variance of the sample mean can be substantially simplified:

$$V(\hat{\theta}) = \frac{\sigma^2}{n} \left[ 1 + 2 \sum_{k=1}^{n-1} \left( 1 - \frac{k}{n} \right) \rho_k \right]$$

So all that is needed is the covariance between one sample and every
other sample, not between every sample and every other sample.
The discussion of why autocorrelation makes it difficult to estimate
$\sigma^2(\hat{\theta})$ is skipped.
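
The effect of autocorrelation on the variance of the sample mean can be sketched numerically. The code below is an illustration under assumptions: the sample autocorrelation uses the common estimator with divisor n, the variance function takes the lag-k autocorrelations as a caller-supplied function, and the names and numbers are hypothetical.

```python
def sample_autocorrelation(y, k):
    """Sample lag-k autocorrelation of the sequence y (divisor-n estimator)."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y) / n
    ck = sum((y[i] - mean) * (y[i + k] - mean) for i in range(n - k)) / n
    return ck / c0


def variance_of_mean(sigma2, rho, n):
    """Variance of the sample mean of n covariance-stationary observations.

    sigma2 is the population variance; rho(k) returns the lag-k autocorrelation.
    """
    correction = 1.0 + 2.0 * sum((1.0 - k / n) * rho(k) for k in range(1, n))
    return sigma2 / n * correction


if __name__ == "__main__":
    n = 100
    independent = variance_of_mean(4.0, lambda k: 0.0, n)
    # Hypothetical positively autocorrelated output with rho_k = 0.8 ** k.
    correlated = variance_of_mean(4.0, lambda k: 0.8 ** k, n)
    print("variance of mean, independent data:", independent)
    print("variance of mean, autocorrelated data:", correlated)
    print("lag-1 autocorrelation of an alternating series:",
          sample_autocorrelation([1, -1, 1, -1, 1, -1], 1))
```

With positive autocorrelation the bracketed factor exceeds 1, so treating the observations as independent would understate the variance and give a confidence interval that is too narrow.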

8.8 OUTPUT ANALYSIS OF TERMINATING SIMULATIONS
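A terminating simulation runs over a simulated time interval [0, TE].
A common goal is to estimate

$$\theta = E\left( \frac{1}{n} \sum_{i=1}^{n} Y_i \right) \quad \text{for discrete output}$$

$$\phi = E\left( \frac{1}{T_E} \int_0^{T_E} Y(t)\,dt \right) \quad \text{for continuous output } Y(t),\ 0 \le t \le T_E$$

In general, independent replications are used, each run using a different
random number stream and independently chosen initial conditions.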
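
Independent replications of a terminating simulation can be sketched as follows. This is only an illustration under assumptions: the model is a single-server queue with exponential times that opens empty at time 0 and closes at TE = 480 minutes, a distinct seed per replication stands in for a distinct random number stream, and the function names, rates, and seeds are hypothetical.

```python
import random


def one_replication(seed, t_end=480.0, arrival_rate=0.8, service_rate=1.0):
    """Average delay in queue of customers arriving in [0, t_end] for one run."""
    rng = random.Random(seed)                 # separate generator for this replication
    clock = rng.expovariate(arrival_rate)     # arrival time of the first customer
    delay = 0.0                               # the system opens empty and idle
    total = 0.0
    count = 0
    while clock <= t_end:
        total += delay
        count += 1
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        delay = max(0.0, delay + service - interarrival)
        clock += interarrival
    return total / count if count else 0.0


def replicate(n_reps, base_seed=100):
    """Run n_reps replications, each with its own seed, and return the averages."""
    return [one_replication(base_seed + r) for r in range(n_reps)]


if __name__ == "__main__":
    averages = replicate(5)
    print("within-replication average delays:", averages)
    print("across-replication point estimate:", sum(averages) / len(averages))
```

Each replication contributes one summary value, so the values in the returned list can be treated as independent observations when estimating precision.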

8.9 OUTPUT ANALYSIS FOR STEADY STATE SIMULATION
● Consider a single run of a simulation model whose purpose is to
estimate a steady-state, or long-run, characteristic of the system.
● Suppose that the single run produces observations Y1, Y2 ,..., which,
generally, are samples of an autocorrelated time series.

● The steady-state (or long-run) measure of performance, $\theta$, is defined by

$$\theta = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} Y_i$$

with probability 1, where the value of $\theta$ is independent of the initial
conditions.
● (The phrase "with probability 1" means that essentially all simulations
of the model, using different random numbers, will produce a series Yi,
i = 1, 2, ..., whose sample average converges to $\theta$.)

● For example, if Yi was the time customer i spent talking to an operator,
then $\theta$ would be the long-run average time a customer spends talking
to an operator, and, because $\theta$ is defined as a limit, it is independent of
the call center's conditions at time 0.
● Similarly, the steady-state performance for a continuous-time output
measure $\{Y(t),\ t \ge 0\}$, such as the number of customers in the call
center's hold queue, is defined as

$$\phi = \lim_{T_E \to \infty} \frac{1}{T_E} \int_0^{T_E} Y(t)\,dt$$

with probability 1.

● Of course, the simulation analyst could decide to stop the simulation
after some number of observations (say, n) have been collected; or the
simulation analyst could decide to simulate for some length of time TE
that determines n (although n may vary from run to run).
● The sample size n (or TE ) is a design choice; it is not inherently
determined by the nature of the problem.
● The simulation analyst will choose simulation run length (n or TE )
with several considerations in mind:
1. Any bias in the point estimator that is due to artificial or arbitrary
initial conditions. (The bias can be severe if run length is too short,
but generally it decreases as run length increases.)
2. The desired precision of the point estimator, as measured by the
standard error or confidence interval half-width.
3. Budget constraints on computer resources.
● The next subsection discusses initialization bias and the following
subsections outline two methods of estimating point-estimator
variability.
● For clarity of presentation, we discuss only estimation of $\theta$ from a
discrete-time output process.
● Thus, when discussing one replication (or run), the notation
Y1, Y2, Y3, ... will be used; if several replications have been made, the
output data for replication r will be denoted by Yr1, Yr2, Yr3, ...

8.9.1 Initialization Bias in Steady-State Simulations

● There are several methods of reducing the point-estimator bias caused
by using artificial and unrealistic initial conditions in a steady-state
simulation.
● The first method is to initialize the simulation in a state that is more
representative of long-run conditions.
● This method is sometimes called intelligent initialization.
● Examples include
1. setting the inventory levels, number of backorders, and number of
items on order and their arrival dates in an inventory simulation;
2. placing customers in queue and in service in a queueing
simulation;
3. having some components failed or degraded in a reliability
simulation.
● There are at least two ways to specify the initial conditions
intelligently.
● If the system exists, collect data on it and use these data to specify
more nearly typical initial conditions.
● This method sometimes requires a large data-collection effort.
● In addition, if the system being modeled does not exist (for example, if
it is a variant of an existing system), this method is impossible to
implement.
● Nevertheless, it is recommended that simulation analysts use any
available data on existing systems to help initialize the simulation, as
this will usually be better than assuming the system to be "completely
stocked," "empty and idle," or "brand new" at time 0.
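
The effect of intelligent initialization can be sketched with a small experiment. The sketch rests on several assumptions: the model is a single-server queue with exponential interarrival and service times, the initial condition is expressed as the delay faced by the first customer (a stand-in for placing customers in queue and in service), and the function name, rates, and run length are hypothetical. For this particular queue, with arrival rate 0.9 and service rate 1.0, the long-run mean delay in queue is 0.9 / (1.0 * (1.0 - 0.9)) = 9.

```python
import random


def average_delay(initial_delay, n_customers, seed, arrival_rate=0.9, service_rate=1.0):
    """Average delay in queue over a short run that starts from a chosen initial state."""
    rng = random.Random(seed)
    delay = initial_delay   # delay faced by the first customer (the initial condition)
    total = 0.0
    for _ in range(n_customers):
        total += delay
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        delay = max(0.0, delay + service - interarrival)
    return total / n_customers


if __name__ == "__main__":
    # Same random numbers in both runs; only the initial condition differs.
    empty_start = average_delay(initial_delay=0.0, n_customers=200, seed=3)
    typical_start = average_delay(initial_delay=9.0, n_customers=200, seed=3)
    print("short-run estimate, empty and idle start:", empty_start)
    print("short-run estimate, start near long-run level:", typical_start)
```

The gap between the two estimates on a short run illustrates how much the initial condition alone can move the point estimator; the gap shrinks as the run length grows.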

8.10 SUMMARY
Validation of simulation models is of great importance.
● Decisions are made on the basis of simulation results; thus, the
accuracy of these results should be subject to question and
investigation.
● Quite often, simulations appear realistic on the surface because
simulation models, unlike analytic models, can incorporate any level
of detail about the real system.
● To avoid being "fooled" by this apparent realism, it is best to compare
system data to model data and to make the comparison by using a wide
variety of techniques, including an objective statistical test, if at all
possible.
● This chapter emphasized the idea that a stochastic discrete-event
simulation is a statistical experiment.
● Therefore, before a sound conclusion can be drawn on the basis of
simulation-generated output data, a proper statistical analysis is
required.
● The purpose of the simulation experiment is to obtain estimates of the
performance measures of the system under study.
● The purpose of statistical analysis is to acquire some assurance that
these estimates are sufficiently precise for the proposed use of the
model.
● The statistical precision of point estimators can be measured by a
standard-error estimate or by a confidence interval.

8.11 EXERCISE
Answer the following:
1. Explain with a neat diagram verification of simulation models. 10 M
2. Describe with a neat diagram iterative process of calibrating a model.
Which are three steps that aid in the validation process? 10 M
3. Explain with a neat diagram model building, verification and
validation process 10 M
4. Describe the three steps approach to validation by Naylor and Finger.
10 M
5. Explain with a neat diagram model building, verification and
validation. 10 M
6. Write short note on Optimization via simulation. 5 M
7. A store selling Mother's Day cards must decide 6 months in advance
on the number of cards to stock. Reordering is not allowed. Cards cost
$0.45 and sell for $1.25. Any cards not sold by Mother's Day go on sale
for $0.50 for 2 weeks. However, sales of the remaining cards are
probabilistic in nature, according to the following distribution:
32% of the time, all cards remaining get sold.
40% of the time, 80% of all cards remaining are sold.
28% of the time, 60% of all cards remaining are sold.
Any cards left after 2 weeks are sold for $0.25. The card-shop owner is
not sure how many cards can be sold, but thinks it is somewhere between
200 and 400. Suppose that the card-shop owner decides to order 300 cards.
Estimate the expected total profit with an error of at most $5.00. (Hint:
Make three or four initial replications. Use these data to estimate the total
sample size needed. Each replication consists of one Mother's Day.)
