Key Metrics for Time Series Forecasting
Ans
When building and evaluating time series forecasting models, it’s crucial to use
appropriate error metrics to measure the accuracy of the predictions. These metrics
help us understand how well a model performs and guide us in choosing the best
model for our forecasting needs. Commonly used error metrics include:
MAE (Mean Absolute Error) is the average of the absolute differences between the
actual and predicted values. It gives an idea of how much the predictions deviate
from the actual values on average.
Pros:
Cons:
MSE (Mean Squared Error) is the average of the squared differences between the
actual and predicted values. It emphasizes larger errors due to the squaring operation.
Pros:
Cons:
• Sensitive to outliers.
RMSE (Root Mean Squared Error) is the square root of MSE. It provides an error
metric on the same scale as the original data.
Pros:
Cons:
• Sensitive to outliers.
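As a rough illustration of the three metrics described so far, the following sketch computes MAE, MSE, and RMSE in plain Python. The function names and the sample numbers are made up for this example; they do not come from any particular library or dataset.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean Squared Error: average of (actual - predicted)^2."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: square root of MSE, in the units of the data."""
    return math.sqrt(mse(actual, predicted))

# Hypothetical actual and predicted values; the last forecast misses by 10.
actual = [100, 110, 120, 130]
predicted = [102, 108, 121, 140]

print("MAE :", mae(actual, predicted))   # 3.75
print("MSE :", mse(actual, predicted))   # 27.25
print("RMSE:", rmse(actual, predicted))  # about 5.22
```

The single large miss (130 vs. 140) pulls RMSE well above MAE, which is the sensitivity to outliers listed under the cons of MSE and RMSE.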
MAPE (Mean Absolute Percentage Error) is the average of the absolute percentage
errors between the actual and predicted values. It expresses the error as a percentage.
Pros:
Cons:
Pros:
• Scale-independent.
Cons:
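MASE (Mean Absolute Scaled Error) divides the mean absolute error of the forecast
by the in-sample mean absolute error of a naive forecast method. This makes the
forecast accuracy easier to interpret: a value below 1 means the forecast beats the
naive method on average.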
Pros:
• Scale-independent.
Cons:
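The scaling idea behind MASE can be sketched as follows. This is a minimal version that assumes a non-seasonal series and uses the one-step naive forecast (each value predicted by the previous one) as the benchmark; the function name and the numbers are hypothetical.

```python
def mase(train, actual, predicted):
    """Mean Absolute Scaled Error with a one-step naive benchmark."""
    # In-sample MAE of the naive forecast: |y[t] - y[t-1]| averaged over the training data.
    naive_mae = sum(abs(train[i] - train[i - 1]) for i in range(1, len(train))) / (len(train) - 1)
    # MAE of the forecast being evaluated.
    forecast_mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    return forecast_mae / naive_mae

train = [10, 12, 11, 13, 14]   # hypothetical history used to fit the model
actual = [15, 16]              # hypothetical held-out values
predicted = [14, 17]           # hypothetical forecasts for the held-out values

print(mase(train, actual, predicted))  # below 1: better than the naive method
```

The benchmark error is zero for a constant training series, in which case this ratio is undefined.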
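WAPE (Weighted Absolute Percentage Error) is a variant of the Mean Absolute
Percentage Error (MAPE) in which the total absolute error is divided by the total of
the actual values, so each error is effectively weighted by the size of its actual
value. It provides a more balanced view of forecast accuracy, especially useful
when dealing with data with varying scales.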
Pros:
• Scale-independent.
• Provides a more balanced view of forecast accuracy for datasets with varying
scales.
Cons:
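The difference between MAPE and WAPE is easiest to see on data with very different magnitudes. The sketch below uses made-up numbers and hypothetical function names; it assumes no actual value is zero, since MAPE is undefined in that case.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def wape(actual, predicted):
    """Weighted Absolute Percentage Error: total absolute error over total actuals, in percent."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return 100 * total_error / sum(abs(a) for a in actual)

# One tiny item forecast badly, one large item forecast perfectly.
actual = [1, 100]
predicted = [2, 100]

print("MAPE:", mape(actual, predicted))  # 50.0, dominated by the tiny item
print("WAPE:", wape(actual, predicted))  # about 0.99
```

MAPE treats the 100% miss on the small item as half of the story, while WAPE weights it by its small share of the total volume.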
Q Question: Discuss the working of ARIMA (Autoregressive Integrated
Moving Average) models
ANS
1. Components:
• Autoregressive (AR):
This component uses past values of the time series to predict future values,
assuming that the current value is correlated with its previous values.
• Integrated (I):
• Data Analysis:
ARIMA models analyze time series data to identify patterns and trends.
• Model Building:
The model is built by combining the AR, I, and MA components, with the
order of each component denoted as (p, d, q), where 'p' is the order of the AR
component, 'd' is the order of integration, and 'q' is the order of the MA
component.
• Forecasting:
• Stationarity:
The "I" component applies differencing to make the time series data stationary,
which is a key requirement for ARIMA models.
3. Applications:
• Finance: Predicting stock prices, interest rates, and other financial data.
• Stationarity:
Ensuring that the time series data is stationary is crucial for the effectiveness of
ARIMA models.
• Model Selection:
Choosing the appropriate values for 'p', 'd', and 'q' is important for building an
accurate model.
• Model Evaluation:
Evaluating the performance of the model using metrics like Mean Absolute
Error (MAE) or Root Mean Squared Error (RMSE) is important for ensuring
that the model is accurate.
ARIMA is widely used in finance, economics, and the environmental sciences
because it can capture many patterns in past observations and project them
forward. From predicting stock prices and forecasting weather patterns to
estimating consumer demand, ARIMA is a practical way to produce actionable
forecasts.
By using ARIMA, we are able to both analyze and forecast time series data in a
way that accounts for autocorrelation and trends (seasonal patterns require the
seasonal extension, SARIMA). This gives a clearer view of the underlying
dynamics for making informed decisions.
Autoregressive (AR) part
The Autoregressive (AR) component predicts the current value of the series from
its own past values. It works like a regression model in which the regressors are
the lags of the time series' own past values.
Integrated (I) part
The Integrated (I) part involves differencing the time series so that it becomes
stationary, which means that the mean and variance remain constant over time.
Basically, we subtract the previous observation from the current one so that
trends are eliminated (differencing at the seasonal lag removes seasonality in
the same way). This step is necessary because it helps the model fit the signal
in the data and not the noise.
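Differencing is simple enough to sketch directly. The helper below is a hypothetical function written for this example; it shows that one round of differencing flattens a linear trend and two rounds flatten a quadratic one, which is what the order 'd' in ARIMA(p, d, q) counts.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

# A series with a linear trend: one difference leaves a constant.
linear = [3, 5, 7, 9, 11]
print(difference(linear))                  # [2, 2, 2, 2]

# A series with a quadratic trend: two differences are needed (d = 2).
quadratic = [t * t for t in range(6)]      # [0, 1, 4, 9, 16, 25]
print(difference(difference(quadratic)))   # [2, 2, 2, 2]
```

Passing the seasonal period as `lag` (for example 12 for monthly data) gives seasonal differencing.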
Moving Average (MA) part
The Moving Average (MA) part models the current observation as a function of past
forecast errors (residuals) and estimates their impact on the latest observation.
This is particularly useful for capturing short-term changes in the data or random
shocks. Including the MA part gives valuable information about the behavior of the
series, which in turn allows us to forecast with greater accuracy.
How to Build an ARIMA Model
Data collection
The first step is to tee up an appropriate dataset and prepare our environment.
Find a dataset
Collect or search for a dataset from data source platforms. You want one that has
historical data over time.
Data preprocessing
Our dataset is pretty clean, but in other contexts, we would have to handle indexing
issues, which is important in time series forecasting. For example, since we are
forecasting the closing value of a stock on a particular exchange, we have to
consider that the stock market is not open on weekends.
Create a time plot
While ARIMA models can deal with non-stationarity up to a point, they cannot
effectively account for time-varying variance. In other words, for an ARIMA
model to really work, the data has to be stationary.
A close look at the graph, however, reveals that the fluctuations are not totally
arbitrary: a part of these fluctuations shows steady behavior and can be related
to time.
This part is the systematic part of the time series, and the remaining part is
non-systematic or irregular.
The systematic part is further divided into the following broad categories:
(i) secular trend (T), (ii) seasonal variation (S), and (iii) cyclical variation (C).
The non-systematic part is also called (iv) irregular variation (I).
The secular trend is the long term pattern of a time series. The secular trend can be
positive or negative depending on whether the time series exhibits an increasing
long term pattern or a decreasing long term pattern. The secular trend shows a
smooth and regular long term movement of the time series.
The secular trend does not include short term fluctuations, but only consists of a
steady movement over a long period of time. It is the movement that the series
would take if there were no seasonal, cyclical or irregular variations. It is the effect
of factors that are more or less constant for a long time or that change very
gradually and slowly over time. If a time series does not show an increasing or
decreasing pattern, then the series is stationary around the mean.
There are ups and downs in the graph, but the time series shows an upward trend
in the long run.
The above graph shows a downward trend.
Many time series related to financial, economic, and business activities consist of
monthly or quarterly data. It is observed very often that these time series exhibit
seasonal variation in the sense that similar patterns are repeated from year to year.
Seasonal variation is the component of a time series that involves patterns of
change within a year that repeat from year to year.
Warm clothes and woolen products have a market during the winter season.
Fans, coolers, cold drinks and ice creams are in great demand during summer.
Umbrellas and raincoats are in great demand during the rainy season.
Different festivals are associated with different commodities and every festival
season is associated with an increase in demand for related commodities.
For example, clothes and firecrackers are in great demand during Diwali. Most of
the seasonal variations in demand reflect changes in climatic conditions or customs
and habits of people.
All the above examples have one year as the period of seasonal variation.
However, the period of seasonal variation can be a month, a week, a day, or even
an hour, depending on the nature of available data.
For example, cash withdrawals in a bank show seasonal variation among the days
of a month, the number of books borrowed by readers from a library shows
seasonal variation according to days of a week, passenger traffic at
a railway station has seasonal variation during hours of a day, and the temperature
recorded in a city exhibits seasonal variation over hours of a day, in addition to
seasonal variation with changing seasons in a year.
Seasonal variation is measured with the help of seasonal indices, which are useful
for short term forecasting. Such short term forecasts are useful for a departmental
store in planning its inventory according to months of a year.
A bank manager can use such short term forecasts in managing cash flow on
different days of a week or a month.
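One simple way to compute seasonal indices is the method of simple averages: average each season across years and express it as a percentage of the overall average. The sketch below assumes the series has no strong trend (otherwise the trend should be removed first); the function name and the quarterly sales figures are made up for illustration.

```python
def seasonal_indices(values, period):
    """Seasonal index per season = 100 * (season average) / (overall average)."""
    n_cycles = len(values) // period
    used = values[: n_cycles * period]          # keep only complete cycles
    overall_mean = sum(used) / len(used)
    indices = []
    for s in range(period):
        season_values = used[s::period]         # every value belonging to season s
        season_mean = sum(season_values) / len(season_values)
        indices.append(100 * season_mean / overall_mean)
    return indices

# Hypothetical quarterly sales for two years (Q1, Q2, Q3, Q4, Q1, ...).
sales = [18, 10, 30, 42, 22, 10, 30, 38]
print(seasonal_indices(sales, period=4))  # [80.0, 40.0, 120.0, 160.0]
```

An index of 160 for the fourth quarter says that sales in that quarter run about 60% above the average quarter, which is the kind of figure a store can use for inventory planning.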
The figure shows a pattern that is repeated year after year. The values are lowest
(in the year) in the second quarter and highest (in the year) in the fourth quarter
of every year. Although the overall graph of the time series in Figure 4.3 shows an
increasing trend, the seasonal variation within every year is very clearly visible
in the graph.
Cyclical variations are observed in almost all time series related to economic or
business activities, where a cycle is known as a business cycle or trade cycle.
Recurring ups and downs in a business are the main causes of cyclical variation. A
typical business cycle consists of the following four phases: (i) prosperity, (ii)
recession, (iii) depression, (iv) recovery.
Figure depicts these four phases of a business cycle, where every phase changes to
the next phase gradually in the order mentioned above.
Cyclical variations can consist of a period of 5 years, 10 years, or even longer
duration. The period often changes from one cycle to another. Cyclical variation
may be attributed to internal organizational factors such as purchase and inventory
policies or external factors such as financial market conditions and government
policies.
Irregular Variation (I)
Irregular variations are unexpected variations in time series caused by unforeseen
events that can include natural disasters like floods or famines, political events like
strikes or agitations, or international events like wars or other conflicts.
As the name suggests, irregular variations do not follow any patterns and are,
therefore, totally unpredictable. For this reason, irregular variations are also known
as unexplained or unaccounted variations.
ANS
Ans
The smoothing parameter, called alpha (a), controls the rate at which the
influence of observations at prior time steps decays exponentially.
• Large values mean that the model pays attention mainly to the most recent
past observations, whereas smaller values mean more of the history is taken
into account when making a prediction.
• In addition to the alpha parameter for controlling the smoothing factor for
the level, a smoothing factor is added to control the decay of the influence of
the change in a trend, called beta (b).
• The method supports trends that change in different ways: additive or
multiplicative, depending on whether the trend is linear or exponential,
respectively.
• In addition to the alpha and beta smoothing factors, a new parameter is
added called gamma (g), which controls the influence on the seasonal
component.
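To make the roles of alpha and beta concrete, here is a minimal sketch of exponential smoothing with an additive trend (commonly known as Holt's linear trend method). The initialization (first value as the level, first difference as the trend) is one common convention and is an assumption of this example, as are the function name and the data. The seasonal component controlled by gamma is left out to keep the sketch short.

```python
def holt_linear(series, alpha, beta, horizon):
    """Forecast `horizon` steps ahead with level (alpha) and additive trend (beta) smoothing."""
    # Assumed initialization; needs at least two observations.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        # alpha blends the new observation with the previous one-step projection.
        level = alpha * y + (1 - alpha) * (previous_level + trend)
        # beta blends the latest change in level with the previous trend estimate.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # The h-step forecast extends the last level along the last trend.
    return [level + h * trend for h in range(1, horizon + 1)]

series = [10, 12, 14, 16]   # hypothetical data with a steady increase of 2 per step
print(holt_linear(series, alpha=0.5, beta=0.5, horizon=2))  # [18.0, 20.0]
```

With alpha and beta close to 1 the level and trend follow the most recent observations; with values close to 0 they change slowly and reflect more of the history.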
Ans
In time series forecasting, the moving average method smooths data by calculating
the average of values over a specific period, helping to identify underlying trends
and patterns while reducing short-term fluctuations.
Here's a more detailed explanation:
• What it is:
• How it works:
• You choose a period (e.g., 3-month, 7-day) and calculate the average of
the data points within that period.
• As new data becomes available, you "slide" the window forward,
dropping the oldest data point and adding the newest, then recalculating
the average.
• This process creates a new series of averages, which is the moving
average.
• Why it's useful:
• Simple Moving Average (SMA): The average of the values over the last n
periods.
• Weighted Moving Average (WMA): Assigns different weights to data
points, with more recent data points having higher weights.
• Exponential Moving Average (EMA): A type of weighted moving
average in which the weights decrease exponentially with the age of
the data, so the most recent data points count the most.
• Limitations:
• Lag: Moving averages lag behind the actual data, as they are based on
past values.
• Sensitivity to Period Length: The choice of the moving average period
can significantly impact the results, and a longer period can smooth out
too much detail, while a shorter period may not smooth out enough.
• Not Suitable for All Data: Moving averages are best suited for data that
has a clear underlying trend or cycle, and may not be effective for data
that is highly random or irregular.
• Applications:
• Stock Market Analysis: Moving averages are commonly used by traders
and investors to identify trends and potential trading opportunities.
• Sales Forecasting: Businesses can use moving averages to forecast future
sales based on past sales data.
• Other Time Series Data: Moving averages can be applied to a wide range
of time series data, such as weather patterns, economic indicators, and
website traffic.
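The three types of moving average described above can be sketched in a few lines of plain Python. The function names, the weights, and the price series are made up for this example; the EMA is started from the first observation, which is one common convention and not the only one.

```python
def sma(series, window):
    """Simple moving average: unweighted mean of each sliding window."""
    return [sum(series[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(series))]

def wma(series, weights):
    """Weighted moving average; `weights` run from the oldest to the newest point in the window."""
    w = len(weights)
    total = sum(weights)
    return [sum(x * wt for x, wt in zip(series[i - w + 1 : i + 1], weights)) / total
            for i in range(w - 1, len(series))]

def ema(series, alpha):
    """Exponential moving average, started from the first observation."""
    out = [series[0]]
    for x in series[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

prices = [10, 12, 14, 16, 18]      # hypothetical, steadily rising series
print(sma(prices, 3))              # [12.0, 14.0, 16.0]
print(wma(prices, [1, 2, 3]))      # last value is 100 / 6, about 16.67
print(ema(prices, 0.5))            # [10, 11.0, 12.5, 14.25, 16.125]
```

On this rising series every average ends below the latest value of 18, which is the lag mentioned under the limitations; the weighted versions lag less because they lean on recent points.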
Ans
A process is a series of connected activities that transform one or more inputs into
one or more outputs. All work activities are performed in processes, and
forecasting is no exception.
1. Problem definition
2. Data collection
3. Data analysis
4. Model selection and fitting
5. Model validation
6. Forecasting model deployment
Problem definition
Data collection consists of obtaining the relevant history for the variable(s) that
are to be forecast, including historical information on potential predictor variables.
The key here is "relevant"; often information collection and storage methods and
systems change over time and not all historical data is useful for the current
problem. Often it is necessary to deal with missing values of some variables,
potential outliers, or other data-related problems that have occurred in the past.
During this phase it is also useful to begin planning how the data collection and
storage issues in the future will be handled so that the reliability and integrity of
the data will be preserved.
Data analysis
Model selection and fitting consists of choosing one or more forecasting models
and fitting the model to the data. By fitting, we mean estimating the unknown
model parameters, usually by the method of least squares.
Model validation
Forecasting model deployment involves getting the model and the resulting
forecasts in use by the customer. It is important to ensure that the customer
understands how to use the model and that generating timely forecasts from the
model becomes as routine as possible. Model maintenance, including making sure
that data sources and other required information will continue to be available to the
customer, is also an important issue that impacts the timeliness and ultimate
usefulness of forecasts.
Ans
Key Concepts:
• Weighted Average:
Exponential smoothing calculates a weighted average of past data points,
meaning that each data point contributes to the forecast based on a specific
weight.
• Exponential Decay:
• Smoothing Effect:
1. Initialization: Start with an initial forecast (or estimate) for the first time
period.
2. Calculate the Forecast: For each subsequent time period, calculate the
forecast by taking a weighted average of the previous forecast and the actual
value from the previous period.
3. Update the Forecast: Use the calculated forecast as the new forecast for the
next time period.
4. Repeat: Continue this process for all time periods in the time series.
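The four steps above can be sketched as a short loop. This is a minimal version of simple exponential smoothing; using the first actual value as the initial forecast is an assumption (other initializations are possible), and the function name and demand figures are hypothetical.

```python
def exponential_smoothing(actuals, alpha, initial_forecast=None):
    """Return the forecasts for periods 0..n, where n is the first period after the data."""
    # Step 1: initialization (assumed here: start from the first actual value).
    forecast = actuals[0] if initial_forecast is None else initial_forecast
    forecasts = [forecast]
    for actual in actuals:
        # Step 2: weighted average of the previous actual and the previous forecast.
        forecast = alpha * actual + (1 - alpha) * forecast
        # Steps 3 and 4: the result becomes the forecast for the next period; repeat.
        forecasts.append(forecast)
    return forecasts

demand = [100, 110, 120]   # hypothetical actual values for periods 0, 1, 2
print(exponential_smoothing(demand, alpha=0.5))  # [100, 100.0, 105.0, 112.5]
```

The last element (112.5) is the forecast for the period after the data ends. Because each forecast folds in the previous one, the weight on an observation shrinks by a factor of (1 - alpha) for every period it ages.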
Advantages:
Limitations:
• Limited for Complex Patterns: It may not be suitable for time series with
complex patterns, such as multiple seasonalities or non-linear trends.
Time Series forecasting can be further classified into four broad categories of
techniques:
Smoothing methods are a class of time series forecasting techniques that aim to
remove noise and make predictions by averaging the past values of a time series.
There are two commonly used smoothing methods:
1. Average Method: This method takes the average of all past values to make
predictions. For example, if we have a time series y with n values, the average
method would predict the next value as the average of all n values.
2. Moving Average Smoothing: This method is similar to the average method,
but instead of using all past values, it uses a sliding window of a fixed size to
average only the most recent values. For example, if the window size is m,
the moving average method would predict the next value as the average of the
last m values.
1. Linear Regression: Time series analysis can also be performed using linear
regression, which is a supervised machine learning method used to model the
relationship between a dependent variable and one or more independent
variables. In time series analysis, the dependent variable is the time series
data, and the independent variable is time.
2. ARIMA Model: ARIMA is a commonly used method for time series analysis
and forecasting, which stands for Autoregressive Integrated Moving Average.
It is a type of regression-based model that uses past values of the time series
to make predictions.
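For the linear regression case, with time as the only independent variable, the least-squares fit has a closed form. The sketch below is a minimal illustration with a hypothetical function name and made-up data; it indexes time as 0, 1, 2, ... and extrapolates the fitted line one step ahead.

```python
def fit_linear_trend(series):
    """Least-squares fit of y = intercept + slope * t, with t = 0, 1, ..., n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    return intercept, slope

series = [5, 7, 9, 11]                      # hypothetical observations at t = 0..3
intercept, slope = fit_linear_trend(series)
next_value = intercept + slope * len(series)
print(intercept, slope, next_value)         # 5.0 2.0 13.0
```

A straight-line fit like this captures only the trend; it says nothing about seasonality or autocorrelation, which is where models such as ARIMA come in.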
Ans
Autoregressive (AR) models in time series forecasting predict future values based
on the linear combination of past values, assuming current values depend on
previous ones. They are denoted as AR(p), where 'p' represents the order (number
of past values used).
• Core Concept:
AR models are a type of statistical model that uses past data points to predict
future values in a time series.
• Linear Relationship:
The model assumes that the current value of a time series is a linear function of
its past values.
• AR(p) Notation:
The "p" in AR(p) indicates the order of the model, which is the number of
lagged values (past time steps) used to predict the current value. For example,
AR(1) uses the previous value, AR(2) uses the previous two values, and so on.
• Mathematical Representation:
Yt = c + φ1Yt-1 + φ2Yt-2 + ... + φpYt-p + εt
where:
• c: A constant term.
• φ1, φ2, ..., φp: The autoregressive coefficients, which determine the
influence of past values on the current value.
• Yt-1, Yt-2, ..., Yt-p: The values of the time series at previous time steps.
• εt: A white noise term, representing random shocks or errors.
• Applications:
• Limitations:
AR models are most effective when the time series data exhibits a strong
dependence on its past values and is stationary (meaning its statistical
properties, like mean and variance, do not change over time).
• Advanced Models:
More complex models like ARIMA (Autoregressive Integrated Moving
Average) and ARMAX (Autoregressive Moving Average with eXogenous
variables) build upon the AR model by incorporating additional factors like
trends, seasonality, and external variables.
Predicting future values is fundamental when working with time-series data, which
records observations over time. Autoregressive (AR) models are among the most
foundational tools for this, using past data points to forecast future outcomes.
These models are essential for analysts working with time-series data in areas like
finance, economics, and forecasting, as they provide a first step towards more
advanced predictive approaches.
Understanding autoregressive models is critical, as they form the basis for more
sophisticated techniques like ARIMA (autoregressive integrated moving average),
which incorporates additional complexities. By mastering AR models, analysts can
tackle time-series problems more effectively, building a strong foundation for
tackling real-world scenarios.
Let’s consider a simple time series with 100 entries (starting at t=0 and ending at
t=99). The goal of an AR model is to predict the value at the next time step, t=100,
by using previous data points.
Suppose we want to predict the value at t=100. An autoregressive model will look
at a specific number of previous data points to make this prediction. For instance,
if we decide to use the three most recent entries (known as a lag of 3), we would
use the data points at t=97, t=98, and t=99 to predict the value at t=100:
X(100) = C0 + C1X(99) + C2X(98) + C3X(97)
where:
• C0, C1, C2, C3 are coefficients that the model will learn from the data.
• The coefficients are determined using a multilinear regression fit on the
previous data points: X(t) = C0 + C1X(t−1) + C2X(t−2) + C3X(t−3)
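The regression fit described above can be sketched without any external library by solving the least-squares normal equations directly. Everything below is illustrative: the helper names are hypothetical, the 100-point series is simulated with arbitrary coefficients and a fixed random seed, and in practice a statistics library would normally be used instead of a hand-written solver.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_ar(series, p):
    """Least-squares estimate of [C0, C1, ..., Cp] in X(t) = C0 + C1 X(t-1) + ... + Cp X(t-p)."""
    rows = [[1.0] + [series[t - k] for k in range(1, p + 1)] for t in range(p, len(series))]
    targets = [series[t] for t in range(p, len(series))]
    # Normal equations: (R^T R) c = R^T y
    rtr = [[sum(r[i] * r[j] for r in rows) for j in range(p + 1)] for i in range(p + 1)]
    rty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p + 1)]
    return solve(rtr, rty)

def predict_next(series, coeffs):
    """One-step forecast from the last p values; coeffs[k] multiplies the value k steps back."""
    p = len(coeffs) - 1
    return coeffs[0] + sum(coeffs[k] * series[-k] for k in range(1, p + 1))

# Simulated series with 100 entries (t = 0 .. 99); the generating coefficients are arbitrary.
random.seed(0)
x = [0.0, 0.0, 0.0]
for _ in range(97):
    x.append(1.0 + 0.5 * x[-1] - 0.3 * x[-2] + 0.2 * x[-3] + random.gauss(0, 1))

coeffs = fit_ar(x, 3)                 # estimated C0, C1, C2, C3
x_100 = predict_next(x, coeffs)       # forecast for t = 100 from X(99), X(98), X(97)
print(coeffs, x_100)
```

With only 100 noisy points the estimated coefficients will be close to, but not equal to, the values used in the simulation.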
Autoregressive models are not limited to predicting just the next data point; they
can also project further into the future. For example, after predicting X(100), the
model can use this predicted value, along with other past values, to forecast X(101)
and continue this process for subsequent points. This technique is known
as recursive forecasting.
To illustrate:
X(101) = C0 + C1X(100) + C2X(99) + C3X(98), where X(100) is the predicted value.
However, there’s a catch. Since the model uses predicted values for future
predictions, compounding errors occur. The further into the future you predict,
the more the errors accumulate, causing predictions to become less accurate over
time. For example, the prediction for X(101) is based on X(100), which may
already have some prediction error, leading to an amplified error for X(101).
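Recursive forecasting amounts to appending each prediction to the history and predicting again. The sketch below assumes the coefficients have already been estimated; the values used here are arbitrary placeholders chosen to make the arithmetic easy to follow, and the function name is hypothetical.

```python
def recursive_forecast(history, coeffs, steps):
    """Forecast `steps` values ahead, feeding each prediction back in as an input."""
    p = len(coeffs) - 1
    values = list(history)
    forecasts = []
    for _ in range(steps):
        # coeffs[0] is the constant; coeffs[k] multiplies the value k steps back.
        nxt = coeffs[0] + sum(coeffs[k] * values[-k] for k in range(1, p + 1))
        forecasts.append(nxt)
        values.append(nxt)   # later steps are built on predicted values, not observed ones
    return forecasts

history = [10.0, 12.0, 14.0]        # the three most recent observations (oldest first)
coeffs = [1.0, 0.5, 0.25, 0.0]      # placeholder C0, C1, C2, C3
print(recursive_forecast(history, coeffs, steps=3))  # [11.0, 10.0, 8.75]
```

The second forecast already depends on the first, and the third on both, so any error in an early prediction is carried into every later one.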
Lag correlation refers to the correlation between a time series value at a specific
time t, denoted as X(t), and a previous series value at a lagged time t-k, where k is
the lag. For instance, X(t) and X(t-1) are lagged by 1, and their correlation is
called lag 1 correlation.
The lag correlation can tell us how well past values of the time series can predict
future values. High lag correlation (close to 1) suggests that an autoregressive
model might be appropriate because there’s a significant relationship between the
current and previous data points.
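Lag correlation can be computed as the ordinary (Pearson) correlation between the series and a copy of itself shifted by k steps. The sketch below is one straightforward way to do that; the function name and sample series are made up, and k must be at least 1 and small enough to leave several overlapping points.

```python
def lag_correlation(series, k):
    """Pearson correlation between X(t) and X(t-k) over the overlapping part of the series."""
    current = series[k:]      # X(t)   for t = k .. n-1
    lagged = series[:-k]      # X(t-k) for the same t
    mean_c = sum(current) / len(current)
    mean_l = sum(lagged) / len(lagged)
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(current, lagged))
    norm_c = sum((c - mean_c) ** 2 for c in current) ** 0.5
    norm_l = sum((l - mean_l) ** 2 for l in lagged) ** 0.5
    return cov / (norm_c * norm_l)

trending = [1, 2, 3, 4, 5, 6]            # each value closely follows the previous one
alternating = [1, -1, 1, -1, 1, -1]      # each value is the opposite of the previous one

print(lag_correlation(trending, 1))      # close to +1
print(lag_correlation(alternating, 1))   # close to -1
print(lag_correlation(alternating, 2))   # close to +1
```

A lag correlation near -1 is just as informative for an autoregressive model as one near +1; it is values near 0 that suggest past values carry little predictive information.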
Q Define Time Series. Write the Objectives, examples and uses of time series
analysis.
Ans
Time series analysis involves the use of statistical methods to analyze time
series data in order to extract meaningful statistics and understand important
characteristics of the observed data.
Analysis of time series data requires maintaining records of values of the variable
over time.
Some examples from day-to-day life may give a better idea of time series.
By simple observation of such a series, one can understand the nature of changes
that have taken place in values of the variable during the course of time.
Further, by applying appropriate technique of analysis to the series, one can study
the general tendency of the variable in addition to seasonal changes, cyclical
changes, and irregular or accidental changes in values of the variable.
Analysis of a time series reveals the nature of changes in the value of a variable
during the course of time.
This can be useful in forecasting the future values of the variable. Thus, with the
help of observations on an appropriate time series, future plans can be made
relating to certain matters like purchase, production, sales, etc.
This is how a planned economy makes plans for the future development on the basis
of time series analysis of the relevant data.
Evaluation of the actual performances in comparison with predetermined targets is
necessary to judge efficiency of the work.
For example, the achievements of Five-Year Plans are evaluated by determining
the annual rate of growth in the gross national product. Similarly, the national
policy of controlling inflation and price rises is evaluated with the help of different
price indices. All these are made possible by analysis of time series of the relevant
variables.
A time series itself provides a scientific basis for making comparisons between two
or more related sets of data.
Note that data are arranged chronologically in such a series, and the effects of
its various components are gradually isolated, analyzed, and interpreted.
Ans
Let Xt denote the value of the variable at time t. The time series is denoted by the
collection of values, {Xt, t = 0, 1, ... , T} where T is the total duration of
observation.
There are two standard mathematical models for time series based on the four
components mentioned earlier, namely, secular trend (T), seasonal variation (S),
cyclical variation (C), and irregular variation (I).
Additive Model
The additive model assumes that the value Xt at time t is the sum of the four
components at time t.
Thus,
Xt = Tt + St + Ct + It
The additive model assumes that the four components of the time series are
independent of one another.
It is also important to remember that all the four components in the additive model
must be measured in the same unit of measurement.
The magnitude of the seasonal variation does not depend on the value of the time
series in the additive model. In other words, the magnitude of the seasonal
variation does not change as the series goes up or down.
Multiplicative Model
The multiplicative model assumes that the value Xt at time t is the
multiplication of the four components at time t. That is,
Xt = Tt × St × Ct × It
The multiplicative model does not assume independence of the four components of
the series and is, therefore, more realistic.
Values of the trend are expressed in units of measurements and other components
are expressed as percentage or relative values, and hence are free from units of
measurements.
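The practical difference between the two models shows up in how large the seasonal swing is as the trend grows. The sketch below builds both kinds of series from the same trend, using made-up numbers; to keep it short, the cyclical and irregular components are set to their neutral values (0 in the additive model, 1 in the multiplicative model), which is a simplification for this example.

```python
trend = [100, 200, 300, 400]              # Tt, in units of the series
seasonal_additive = [10, -10, 10, -10]    # St in the same units as the trend
seasonal_ratio = [1.1, 0.9, 1.1, 0.9]     # St as a relative value (110%, 90%, ...)

# Additive model: Xt = Tt + St (+ Ct + It, taken as 0 here)
additive = [t + s for t, s in zip(trend, seasonal_additive)]
# Multiplicative model: Xt = Tt x St (x Ct x It, taken as 1 here)
multiplicative = [t * s for t, s in zip(trend, seasonal_ratio)]

# Size of the seasonal swing around the trend in each model.
additive_swing = [abs(x - t) for x, t in zip(additive, trend)]
multiplicative_swing = [abs(x - t) for x, t in zip(multiplicative, trend)]

print(additive_swing)                                  # constant: 10 at every level
print([round(s, 6) for s in multiplicative_swing])     # grows with the trend: 10, 20, 30, 40
```

If a time plot shows seasonal swings that widen as the level rises, the multiplicative form is usually the more natural choice.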
➢ Additive model
➢ Multiplicative model
➢ Pseudo-additive models
• Additive model
• The additive model assumes the observed time series is the sum of
components:
Xt = Tt + St + Ct + It
• Additive models are used when the magnitudes of the seasonal and
residual values are independent of the trend.
• Multiplicative model
• The multiplicative model assumes the observed time series is a product of its
components:
Xt = Tt × St × Ct × It
• These are used if the magnitudes of seasonal and residual values fluctuate
with the trend.
• Pseudo-additive models
Q Question: Discuss Applications of Forecasting
Ans
Ans
are also instrumental in the strategic planning decisions made by business
organizations and financial institutions.
Ans
Simple Average Method
In this method of forecasting, all the future values are equal to the average of the
past values i.e. the historical data.
What will happen in the future is the average of everything that has happened until
now.
Simple Average Method to forecast the demand for 2023
Moving Average Method
In this method of forecasting, the future values are equal to the average of the past
values over a defined period of time.
For example, we can use a 2-point moving average, a 3-point moving average, etc.
In a 2-point moving average, we consider the past 2 data points; in a 3-point
moving average, we consider the past 3 data points, and so on.
Weighted Moving Average Method
This method of forecasting puts more weight on recent data and less on past data.
The weighted average is calculated by multiplying each data point by its associated
weight, totaling the values, and dividing by the sum of the weights (unless the
weights already sum to 1). The weights used in a weighted moving average are
typically based on the time period being analyzed and the specific requirements
of the analysis.
Naive Method
In this method of forecasting, the future values are set to be values of the last
observation. When data is not enough to create a predictive model, this method is
used to supplement forecasts for the near future.
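The four simple methods described in this answer (simple average, moving average, weighted moving average, and naive) each reduce to a one-line calculation on the history. The sketch below uses hypothetical function names, made-up yearly demand figures, and arbitrary weights; it is meant only to show how the forecasts differ on the same data.

```python
def simple_average_forecast(history):
    """Forecast = average of all past values."""
    return sum(history) / len(history)

def moving_average_forecast(history, window):
    """Forecast = average of the last `window` values."""
    return sum(history[-window:]) / window

def weighted_moving_average_forecast(history, weights):
    """Forecast = weighted average of the last len(weights) values (weights: oldest to newest)."""
    recent = history[-len(weights):]
    return sum(x * w for x, w in zip(recent, weights)) / sum(weights)

def naive_forecast(history):
    """Forecast = the last observed value."""
    return history[-1]

demand = [100, 120, 110, 130]   # hypothetical demand for four consecutive years

print(simple_average_forecast(demand))                      # 115.0
print(moving_average_forecast(demand, window=2))            # 120.0
print(weighted_moving_average_forecast(demand, [1, 2, 3]))  # 730 / 6, about 121.67
print(naive_forecast(demand))                               # 130
```

The forecasts move closer to the latest observation as the method puts more weight on recent data, with the naive method at the extreme.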
Naive Method to forecast the demand for 2023
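Exponential Smoothing Method
Ft+1 = αAt + (1 − α)Ft
Where,
Ft+1 = Forecast for the next period
Ft = Forecast for the current period
α = Smoothing constant; 0 ≤ α ≤ 1
At = Actual value in the current period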