Module 3
---------------------------------
Time Series
❖ Time Series
❖ Time Series Analysis
❖ Autocorrelation
❖ Autocorrelation Function (ACF)
❖ ARIMA Model
❖ Box Jenkins Methodology
TIME SERIES
A time series is a sequence of data points that occur in successive order over some period of time
❖ It is a sequence of data that shows how a phenomenon or thing changes over time
❖ Instead of recording data points intermittently or randomly, time series analysts record data points at
consistent intervals over a set period of time
❖ These intervals can range from milliseconds to years, depending on the context
❖ Time series data is often used to identify trends, seasonal patterns, and cyclic behaviors in a variety of
domains
❖ Time is a crucial variable because it shows how the data changes over the course of the observations, as
well as the final results
❖ It provides an additional source of information and a set order of dependencies between the data
TIME SERIES ANALYSIS
❖ Time series analysis is the process of analyzing time-ordered
data to extract meaningful insights, patterns, and trends over
time
❖ It focuses on understanding the underlying structure and
characteristics of the data, as well as modeling it to make
predictions or inform decisions.
❖ It helps organizations understand the underlying causes of
trends or systemic patterns over time
❖ The analysis shows how variables change over time
❖ Time series analysis typically requires a large number of data
points to ensure consistency and reliability
[Figure: Daily Closing Stock Prices]
TIME SERIES ANALYSIS
Key Goals of Time Series Analysis
❖ Understand Data Behavior:
Identify trends, seasonality, and cyclic patterns.
Detect outliers or anomalies.
❖ Forecast Future Values:
Predict future outcomes based on historical data (e.g., sales, stock prices).
❖ Model Relationships:
Explore relationships between variables in time series datasets.
❖ Control and Optimize Systems:
Improve performance in systems such as industrial processes, economic systems, or
traffic management.
TIME SERIES ANALYSIS
Time Series Analysis Types:
❖ Classification: Identifies and assigns categories to the data.
❖ Curve fitting: Plots the data along a curve to study the relationships of variables within the data.
❖ Descriptive analysis: Identifies patterns in time series data, like trends, cycles, or seasonal variation.
❖ Explanative analysis: Attempts to understand the data and the relationships within it, as well as cause
and effect.
❖ Exploratory analysis: Highlights the main characteristics of the time series data, usually in a visual format.
❖ Forecasting: Predicts future data. This type is based on historical trends. It uses the historical data as a
model for future data, predicting scenarios that could happen along future plot points.
❖ Intervention analysis: Studies how an event can change the data.
❖ Segmentation: Splits the data into segments to show the underlying properties of the source information.
COMPONENTS OF TIME SERIES ANALYSIS
TIME SERIES ANALYSIS
When not to use Time Series Analysis:
APPLICATIONS OF TIME SERIES ANALYSIS
Applications of Time Series Analysis:
❖ Business and Finance:
Forecasting sales, stock prices, and economic indicators
Risk management and investment strategy
❖ Health and Medicine:
Monitoring patient vitals (e.g., heart rate, EEG signals)
Epidemic modeling and disease progression tracking
APPLICATIONS OF TIME SERIES ANALYSIS
Applications of Time Series Analysis:
❖ Weather and Environment:
Predicting temperature, rainfall, or natural disasters
Analyzing climate change trends
❖ Operations and Logistics:
Optimizing supply chain management
Traffic flow and demand forecasting
STATIONARITY OF TIME SERIES
❖ Many time series models (e.g., ARMA) assume that the data is stationary, i.e., that its mean and variance do not change over time
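As a rough illustration of what stationarity means in practice, the sketch below compares the mean and variance of the first and second half of a series. It is only an informal check, not a formal statistical test. The helper name `split_half_stats` and both sample series are made up for illustration.

```python
from statistics import mean, pvariance

def split_half_stats(series):
    """Return (mean, variance) for the first half and for the second half."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

# Hypothetical data: one series with a steady upward trend, one without.
trending = list(range(100))                    # 0, 1, 2, ..., 99
flat = [50 + (-1) ** t for t in range(100)]    # alternates 51, 49, 51, ...

print(split_half_stats(trending))  # the two half-means differ widely
print(split_half_stats(flat))      # same mean and variance in both halves
```

A large gap between the two halves, as in the trending series, is a hint that the series is not stationary and may need differencing or a transformation.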
STATIONARITY OF TIME SERIES
AUTOCORRELATION
❖ Autocorrelation in time series is a statistical method that measures the relationship between a
variable's current value and its past values
❖ It compares the series with a lagged copy of itself and measures the degree of similarity between the two
❖ An autocorrelation of +1 indicates a perfect positive correlation, while an autocorrelation of -1
indicates a perfect negative correlation
❖ Autocorrelation can help identify patterns, trends, and relationships in time series data
❖ It can be used to analyze data in fields like econometrics, signal processing, and demand prediction
❖ In trading, it can help technical analysts identify patterns in stock prices
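The idea of comparing a series with a lagged copy of itself can be sketched in a few lines. The function below uses the common sample autocorrelation estimator; the function name and the sample data are illustrative assumptions, not taken from the slides.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (0 <= lag < len(series))."""
    n = len(series)
    mean = sum(series) / n
    # Denominator: total variation of the series around its mean.
    denom = sum((x - mean) ** 2 for x in series)
    # Numerator: co-variation between the series and its lagged copy.
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom

# Hypothetical series that flips sign at every step.
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
print(autocorrelation(alternating, 0))  # 1.0
print(autocorrelation(alternating, 1))  # -0.875
print(autocorrelation(alternating, 2))  # 0.75
```

The lag-1 value is strongly negative because every value is followed by its opposite. It does not reach exactly -1 because a finite sample of 8 points gives only 7 lagged pairs.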
AUTOCORRELATION FUNCTION (ACF)
AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODEL
❖ An autoregressive integrated moving average, or ARIMA, is a statistical analysis model
that uses time series data to either better understand the data set or to predict future
trends
❖ A statistical model is autoregressive if it predicts future values based on past values
❖ An ARIMA model might seek to predict a stock's future prices based on its past
performance or forecast a company's earnings based on past periods
AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODEL
Components of ARIMA:
❖ Autoregressive (AR):
A model that uses the dependency between an observation and a specified
number of previous observations (lags)
Represented by the parameter p: the number of lagged terms to include
❖ Integrated (I):
Represents the differencing of raw observations to make the time series
stationary (i.e., removing trends; seasonal patterns call for seasonal differencing)
Represented by d: the number of times differencing is applied
AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) MODEL
Components of ARIMA:
❖ Moving Average (MA):
A model that expresses an observation as a linear combination of
the current and previous forecast errors (residuals), rather than
of the previous observations themselves
Represented by q: the number of lagged forecast errors to
include
The ARIMA model is typically denoted as ARIMA(p, d, q)
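In equation form, ARIMA(p, d, q) is often written as shown below. The symbols are not defined in the slides and are introduced here as assumptions: $y_t$ is the observation at time $t$, $B$ is the backshift operator ($B y_t = y_{t-1}$), $\phi_i$ are the AR coefficients, $\theta_j$ are the MA coefficients, $c$ is a constant, and $\varepsilon_t$ is a white-noise error term.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^i\right) (1 - B)^d \, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^j\right) \varepsilon_t
```

The factor $(1 - B)^d$ is the differencing step. With $d = 0$ it disappears and the model reduces to ARMA(p, q).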
MOVING AVERAGE
Comparison between ARMA and ARIMA Models
In what situation is the ARMA model appropriate?
Let’s understand the difference using a simple numerical example.
Example 1: ARMA Model (Stationary Data)
Suppose we have a stationary time series (no trend):
The values fluctuate around a constant mean (~50).
There is no upward or downward trend → data is already stationary.
Example 2: ARIMA Model (Non-Stationary Data)
Now suppose we have trending data:
This shows a clear upward trend → non-stationary.
After differencing (subtracting each value from the one that follows it), the data is constant → stationary.
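The two cases can be reproduced with a small sketch. The numbers below are hypothetical stand-ins chosen to match the descriptions (a series fluctuating around 50 and a series with a steady upward trend). They are not the data from the original slides.

```python
def difference(series):
    """First-order differencing: each value minus the one before it."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

# Hypothetical values, not the data from the original slides.
stationary = [50, 52, 49, 51, 50, 48, 51, 49]   # fluctuates around ~50
trending = [10, 12, 14, 16, 18, 20]             # rises by 2 at every step

print(sum(stationary) / len(stationary))  # 50.0
print(difference(trending))               # [2, 2, 2, 2, 2]
```

The first series needs no differencing, so an ARMA model (d = 0) would be a candidate. The second becomes constant after one difference, which corresponds to d = 1 in an ARIMA model.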
BOX JENKINS METHODOLOGY
❖ The Box-Jenkins Methodology is a systematic approach
to identifying, estimating, and validating ARIMA
(Autoregressive Integrated Moving Average) models for
time series data
❖ It was introduced by George Box and Gwilym Jenkins in
the 1970s and is widely used for forecasting
BOX JENKINS METHODOLOGY
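❖ The methodology is iterative and consists of three major stages (model identification, parameter estimation, and diagnostic checking), followed by forecasting with the validated model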
BOX JENKINS METHODOLOGY
1. Model Identification:
❖ This stage involves understanding the underlying properties of the time series to suggest a suitable model
form
❖ It typically includes:
Data Visualization and Preprocessing:
Plotting the Data: Examine time plots for trends, seasonality, and outliers
Stabilizing Variance: Apply transformations (e.g., logarithmic or Box-Cox) if the variance is nonconstant
Stationarity Check:
Visual Inspection and Statistical Tests: A stationary series has constant mean and variance over time.
Use time plots and statistical tests (e.g., the Augmented Dickey-Fuller or KPSS test) to assess stationarity
Differencing: If the series is non-stationary, apply differencing until stationarity is achieved. The number
of differences taken becomes the value of d
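The preprocessing steps above (stabilize the variance, then difference until the trend is gone) can be sketched as follows. The series is a hypothetical one that grows by 10% per step, and the `difference` helper is an illustrative name, not part of any library.

```python
import math

def difference(series, d=1):
    """Apply first-order differencing d times."""
    for _ in range(d):
        series = [curr - prev for prev, curr in zip(series, series[1:])]
    return series

# Hypothetical series growing by 10% per step: the step size grows with the level.
growth = [100 * 1.1 ** t for t in range(8)]

log_growth = [math.log(x) for x in growth]   # log transform makes the steps equal in size
log_diff = difference(log_growth, d=1)       # one difference removes the trend

print([round(x, 4) for x in log_diff])  # every entry is close to log(1.1), about 0.0953
```

Here a single difference of the log series is enough, so d = 1 would be the tentative choice. A series that still trends after one difference can be passed through `difference` again with a larger `d`.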
BOX JENKINS METHODOLOGY
1. Model Identification:
Analyzing the ACF and PACF:
Autocorrelation Function (ACF): Helps in identifying the potential order
of the MA component. A sharp cutoff in the ACF after lag q suggests an MA(q) process
Partial Autocorrelation Function (PACF): Assists in identifying the order
of the AR component. A sharp cutoff in the PACF after lag p suggests an AR(p) process
❖ The output of this stage is a tentative model specification with suggested
values for p, d, and q
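The cutoff behaviour can be illustrated with a simulated AR(1) series. The sketch below computes the sample ACF and derives the PACF from it with the Durbin-Levinson recursion. The coefficient 0.7, the seed, and the series length are arbitrary assumptions chosen for the demonstration.

```python
import random

def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    """Partial autocorrelations for lags 0..max_lag (Durbin-Levinson recursion)."""
    rho = acf(series, max_lag)
    result = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result

# Simulate a hypothetical AR(1) process: y_t = 0.7 * y_{t-1} + noise.
random.seed(0)
series = [0.0]
for _ in range(1999):
    series.append(0.7 * series[-1] + random.gauss(0, 1))

print([round(r, 2) for r in acf(series, 4)])   # decays gradually
print([round(r, 2) for r in pacf(series, 4)])  # spike at lag 1, then close to zero
```

A PACF that is large at lag 1 and near zero afterwards, combined with a slowly decaying ACF, points to p = 1 and q = 0 as a tentative specification.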
BOX JENKINS METHODOLOGY
2. Parameter Estimation:
❖ Once a candidate model is identified, the next step is to estimate its parameters
Estimation Techniques:
Maximum Likelihood Estimation (MLE): Commonly used to obtain parameter estimates that
maximize the likelihood of observing the given data
Least Squares: Sometimes used for simpler models
Software Implementation:
Statistical packages (such as R’s forecast or Python’s statsmodels) automate much of the estimation
process
Parameter Significance:
After estimation, check whether the estimated parameters are statistically significant and conform
to theoretical expectations (e.g., stationarity and invertibility conditions)
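For a simple model, least squares estimation can be written out by hand. The sketch below fits an AR(1) model by regressing each value on the previous one. It assumes the model form y_t = c + phi * y_{t-1} + e_t, and the function name and test series are illustrative. Real projects would normally rely on a statistical package for this step.

```python
def fit_ar1_least_squares(series):
    """Estimate c and phi in y_t = c + phi * y_{t-1} + e_t by ordinary least squares."""
    x = series[:-1]   # lagged values y_{t-1}
    y = series[1:]    # current values y_t
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

# Hypothetical noise-free series generated with c = 2 and phi = 0.5.
series = [0.0]
for _ in range(20):
    series.append(2 + 0.5 * series[-1])

c_hat, phi_hat = fit_ar1_least_squares(series)
print(round(c_hat, 6), round(phi_hat, 6))  # 2.0 0.5
print(abs(phi_hat) < 1)                    # True: stationarity condition for AR(1)
```

Because the series contains no noise, the estimates recover the generating values. With real data the estimates carry sampling error, which is why the significance check matters.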
BOX JENKINS METHODOLOGY
3. Diagnostic Checking (Model Validation):
This stage is critical to ensure that the chosen model adequately captures
the dynamics of the time series
Residual Analysis:
White Noise Check: The residuals (errors) should behave like white
noise—i.e., they should be random, with no autocorrelation remaining
ACF of Residuals: Plot the ACF of the residuals to confirm that no
significant lags remain
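A numerical counterpart to the residual ACF plot is to flag lags whose autocorrelation falls outside the approximate 95% band of ±1.96/√n. The sketch below assumes that band and uses made-up residual series. The helper names are illustrative.

```python
import math
import random

def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

def significant_lags(residuals, max_lag):
    """Lags whose residual autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(residuals))
    r = acf(residuals, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]

# Hypothetical residuals from a badly mis-specified model: they alternate in sign.
patterned = [(-1) ** t for t in range(100)]
print(significant_lags(patterned, 5))   # [1, 2, 3, 4, 5]

# Hypothetical residuals that are pure random noise.
random.seed(1)
noise = [random.gauss(0, 1) for _ in range(200)]
print(significant_lags(noise, 10))      # typically few or no lags are flagged
```

Even for true white noise, about one lag in twenty is expected to fall outside the band by chance, so an isolated flagged lag is not necessarily a problem.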
BOX JENKINS METHODOLOGY
3. Diagnostic Checking (Model Validation):
Statistical Tests:
Ljung-Box Q-test: Used to check for overall randomness in the residuals. A
failure to reject the null hypothesis (of no autocorrelation) is desirable
Model Re-specification:
If diagnostics reveal patterns in the residuals (indicating that the model is
mis-specified), return to the identification step and refine the model. This might
involve adjusting the orders (p, d, q), incorporating seasonal components, or
re-examining data transformations
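The Ljung-Box statistic can be computed directly from the residual autocorrelations. The sketch below is a minimal version under stated assumptions: it tests 10 lags and compares Q with 18.307, the 5% critical value of a chi-square distribution with 10 degrees of freedom. For residuals from a fitted ARMA(p, q) model, the degrees of freedom are usually reduced by p + q, which this sketch does not do.

```python
import random

def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [x - mean for x in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total

CHI2_95_DF10 = 18.307  # 5% critical value, chi-square with 10 degrees of freedom

# Hypothetical residuals with an obvious pattern (alternating sign).
patterned = [(-1) ** t for t in range(100)]
q_patterned = ljung_box_q(patterned, 10)
print(q_patterned > CHI2_95_DF10)  # True: reject "no autocorrelation", re-specify the model

# Hypothetical residuals that are pure random noise.
random.seed(2)
noise = [random.gauss(0, 1) for _ in range(500)]
print(ljung_box_q(noise, 10))      # compare with the critical value above
```

A Q value below the critical value means the null hypothesis of no autocorrelation is not rejected, which is the desirable outcome for model residuals.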
BOX JENKINS METHODOLOGY
4. Forecasting:
❖ Once a model has been satisfactorily validated, it can be used for forecasting
❖ Forecast Generation:
Use the estimated model to produce forecasts for future time points
❖ Forecast Intervals:
Along with point forecasts, generate prediction intervals to quantify the
uncertainty in the predictions
❖ Model Monitoring:
Continuously compare forecasts with actual outcomes. If significant deviations
are observed, the model may need re-estimation or re-specification
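For an AR(1) model, point forecasts and intervals have a simple closed form, which the sketch below works through. It assumes the model y_t = c + phi * y_{t-1} + e_t with known parameters and error standard deviation `sigma`, and it ignores the uncertainty in the parameter estimates. The function name and the input values are hypothetical.

```python
import math

def forecast_ar1(last_value, c, phi, sigma, steps, z=1.96):
    """Point forecasts with approximate 95% intervals for an AR(1) model."""
    forecasts = []
    point = last_value
    variance = 0.0
    for h in range(1, steps + 1):
        point = c + phi * point                          # recursive point forecast
        variance += sigma ** 2 * phi ** (2 * (h - 1))    # error variance grows with the horizon
        half_width = z * math.sqrt(variance)
        forecasts.append((point, point - half_width, point + half_width))
    return forecasts

# Hypothetical fitted model and last observation.
for point, lower, upper in forecast_ar1(last_value=8, c=0, phi=0.5, sigma=1, steps=3):
    print(round(point, 2), round(lower, 2), round(upper, 2))
```

The point forecasts move toward the long-run mean of the process while the intervals widen, which reflects growing uncertainty further into the future.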