Understanding Linear Time Series Models
Understanding Linear Time Series Models
Basic concepts
• Stationarity:
µ = E(rt)
26
• Sample mean and sample variance are used to estimate the mean
and variance of returns.
1 T 1 X T
r̄ = rt & Var(rt) = (rt − r̄)2
X
T t=1 T − 1 t=1
• Test Ho : µ = 0 vs Ha : µ 6= 0. Compute
r̄ r̄
t= =r
std(r̄) Var(rt)/T
Compare t ratio with N (0, 1) dist.
Decision rule: Reject Ho of zero mean if |t| > Zα/2 or p-value
is less than α.
• Lag-k autocovariance:
`=1 T − `
29
[13,] 0.001143822
[14,] -0.057881530
[15,] -0.056303333
[16,] -0.021747045
> x2=pacf(ibm,[Link]=15)
> names(x2)
[1] "acf" "type" "[Link]" "lag" "series" "snames"
> x2$acf
, , 1
[,1]
[1,] 0.0742503857
[2,] 0.0050305627
[3,] -0.0251217508
[4,] -0.0025929619
[5,] -0.0061669432
... (editted)
[13,] -0.0562812983
[14,] -0.0467710983
[15,] -0.0124448122
> [Link](ibm,lag=5,type="Ljung")
Box-Ljung test
data: ibm
X-squared = 5.4474, df = 5, p-value = 0.3638
> [Link](log(ibm+1),lag=5,type="Ljung")
Box-Ljung test
data: log(ibm + 1)
X-squared = 5.7731, df = 5, p-value = 0.3289
Splus demonstration
> ibm=scan(file=’[Link]’) % Load data
> autocorTest(ibm,lag=5) % Perform Q(5) test
Test Statistics:
30
Test Stat 5.4474
[Link] 0.3638
Test Statistics:
Test Stat 5.7731
[Link] 0.3289
AUTOCORRELATIONS
1- 10 .07 .01 -.02 -.01 -.01 -.01 -.00 .07 .05 .04 % ACF
ST.E. .03 .03 .03 .03 .03 .03 .03 .03 .03 .03 % Stan. error
Q 4.8 4.9 5.4 5.4 5.4 5.5 5.5 10.2 12.6 14.1 % Ljung-Box Q
--
p=1-cdfc(5.4,5) % Calculate p-value
--
print p % Print p-value
.369
31
Back-shift (lag) operator
A useful notation in TS analysis.
B (or L) means time shift! Brt is the value of the series at time
t − 1.
Suppose that the daily log returns are
Day 1 2 3 4
rt 0.017 −0.005 −0.014 0.021
Answer the following questions:
• r2 =
• Br3 =
• B 2 r5 =
32
• The return is decomposed into two parts as
rt = µt + at
= E(rt|Ft−1) + σtt
• a model for µt
33
• the predictiable part is a linear function of Ft−1
i=0
White noise: iid sequence (with finite variance), which is the build-
ing block of linear TS models.
White noise is not predictable, but has zero mean and finite variance.
4. seasonal models
34
Example Quarterly growth rate of U.S. real gross national product
(GNP), seasonally adjusted, from the second quarter of 1947 to the
first quarter of 1991.
An AR(3) model for the data is
where {at} denotes a white noise with variance σa2. Given rn, rn−1 & rn−2,
we can predict rn+1 as
35
Example: Monthly simple return of CRSP equal-weighted index
• Stationarity condition
• Forecasting
36
Lecture Notes of Bus 41202 (Spring 2007)
Analysis of Financial Time Series
Ruey S. Tsay
rt = 0.005 + 0.2rt−1 + at
σa2
4. Variance: Var(rt) = 1−φ21
.
r̂n(1) = φ0 + φ1rn
1
Thus, an+1 is the un-predictable part of rn+1. It is the shock
at time n + 1!
(c) Variance of 1-step ahead forecast error:
r̂n(2) = φ0 + φ1r̂n(1)
Var[en(2)] = (1 + φ21)σa2
AR(2) model:
ρ` = φ1ρ`−1 + φ2ρ`−1, ` ≥ 2.
Building an AR model
• Order specification
• Model checking:
1. Residual: obs minus the fit, i.e. 1-step ahead forecast errors
at each time point.
2. Residual should be close to white noise if the model is ade-
quate. Use Ljung-Box statistics of residuals, but degrees of
freedom is m − g, where g is the number of AR coefficients
used in the model.
4
Example: US GNP growth rate series revisited.
R demonstration:
> setwd("C:/teaching/bs41202")
> library(fSeries)
> da=[Link]("[Link]")
> x=da[,1]
> plot(x,type=’l’) % Plot not shown in this handout.
> title(main=’Growth rate of U.S. GNP: 1947-1991’) % title of plot.
> acf(x,[Link]=12) % Compute ACF (not shown in this handout)
> pacf(x,[Link]=12) % Compute PACF (not shown in this handout)
> [Link](x,lag=10,type=’Ljung’)
Box-Ljung test
data: x
X-squared = 43.2345, df = 10, p-value = 4.515e-06
Call:
ar(x = x, method = "mle")
Coefficients:
1 2 3 % An AR(3) is specified.
0.3480 0.1793 -0.1423
> names(m1)
[1] "order" "ar" "[Link]" "[Link]" "aic"
[6] "[Link]" "[Link]" "partialacf" "resid" "method"
[11] "series" "frequency" "call" "[Link]"
Box-Ljung test
data: m1$resid
X-squared = 7.0808, df = 10, p-value = 0.7178
5
Call:
arima(x = x, order = c(3, 0, 0))
Coefficients:
ar1 ar2 ar3 intercept % Fitted model is
0.3480 0.1793 -0.1423 0.0077 % y(t)=0.348y(t-1)+0.179y(t-2)
s.e. 0.0745 0.0778 0.0745 0.0012 % -0.142y(t-3)+a(t),
% where y(t) = x(t)-0.0077
> [Link](m2$residuals,lag=10,type=’Ljung’)
Box-Ljung test
data: m2$residuals
X-squared = 7.0169, df = 10, p-value = 0.7239
>
> p1=c(1,-m2$coef[1:3]) % Further analysis of the fitted model.
> roots=polyroot(p1)
> roots
[1] 1.590253+1.063882e+00i -1.920152-3.530887e-17i 1.590253-1.063882e+00i
> Mod(roots)
[1] 1.913308 1.920152 1.913308
> k=2*pi/acos(1.590253/1.913308)
> k
[1] 10.65638
6
$se
Time Series:
Start = 177
End = 184
Frequency = 1
[1] 0.009709322 0.010280510 0.010686305 0.010688994
[5] 0.010689733 0.010694771 0.010695511 0.010696190
S-Plus demonstration
> module(finmetrics)
> gnp=scan(file=’[Link]’)
> plot(gnp,type=’l’)
> acf(gnp,[Link]=12)
Call: acf(x = gnp, [Link] = 12) % Plot not shown in the handout.
Autocorrelation matrix:
lag gnp
1 0 1.0000
2 1 0.3769
3 2 0.2539
4 3 0.0125
5 4 -0.0859
6 5 -0.1071
7 6 -0.0575
8 7 -0.0182
9 8 -0.0772
10 9 -0.0702
11 10 0.0104
12 11 -0.0230
13 12 -0.0967
> acf(gnp,[Link]=12,type=’partial’) % Compute PACF
Call: acf(x = gnp, [Link] = 12, type = "partial")
7
10 10 0.0981
11 11 -0.0370
12 12 -0.1533
> ord=ar(gnp,[Link]=10) % Perform order selection via AIC
> ord$aic
[1] 27.5691310 2.6081086 1.5895550 0.0000000 0.2734771 2.2034466
[7] 4.0171066 5.9916210 5.8264833 7.5230025 7.8223499
> ord$order
[1] 3
ARIMA order: 3 0 0
Value Std. Error t-value % No intercept because the program assumes it is zero.
ar(1) 0.45420 0.07597 5.9780
ar(2) 0.26680 0.08095 3.2960
ar(3) -0.03817 0.07597 -0.5024
Variance-Covariance Matrix:
ar(1) ar(2) ar(3)
ar(1) 0.005771926 -0.002566306 -0.001441892
ar(2) -0.002566306 0.006552753 -0.002566306
ar(3) -0.001441892 -0.002566306 0.005771926
ARIMA order: 3 0 0
8
ar(2) 0.1809 0.07863 2.301 % x(t)=0.351x(t-1)+0.181x(t-2)-0.144x(t-3)+a(t).
ar(3) -0.1443 0.07523 -1.919
Variance-Covariance Matrix:
ar(1) ar(2) ar(3)
ar(1) 0.0056599161 -0.001877448 -0.0007529176
ar(2) -0.0018774480 0.006182526 -0.0018774480
ar(3) -0.0007529176 -0.001877448 0.0056599161
> m1$model$ar
[1] 0.3509107 0.1809056 -0.1443412
>
> [Link](m1) % Model checking, plots not shown.
$[Link]:
[1] 0.009779314 0.010363943 0.010782026 0.010784985 0.010785783
[6] 0.010791060 0.010791857 0.010792592
9
Another example: Quarterly U.S. unemployment rate from 1948
to 2006. Focus on the change in umeployment rate.
• Form: rt = µ + at − θat−1
• Autocovariance:
10
Changes in quarterly unemployment rates: 1948−2006
0.4
0.2
qq
0.0
−0.2
0 50 100 150
Index
11
1. 1-step ahead: r̂n(1) = µ − θan. Why? Because at time n,
an is known, but an+1 is not.
2. 1-step ahead forecast error: en(1) = an+1 with variance σa2.
3. Multi-step ahead: r̂n(`) = µ for ` > 1.
Thus, for an MA(1) model, the multi-step ahead forecasts
are just the mean of the series. Why? Because the model
has memory of 1 time period.
4. Multi-step ahead forecast error:
• Invertibility:
12
MA(2) model
Building an MA model
13
• Forecast: use the residuals as {at} (which can be obtained from
the data and fitted parameters) to perform forecasts.
Call:
arima(x = vw, order = c(0, 0, 1))
Coefficients:
ma1 intercept
0.1465 0.0396 % The model is vw(t) = 0.0396+a(t)+0.1465a(t-1).
s.e. 0.0099 0.0100
$se
Time Series:
Start = 10195
End = 10199
Frequency = 1
[1] 0.8823290 0.8917523 0.8917523 0.8917523 0.8917523
S-Plus demonstration
> vw=d6202[,3] % Identify the vw-index returns.
> lnvw=log(1+vw) % compute log returns.
14
> acf(lnvw,[Link]=10) % ACF plot i snot shown in this handout.
Call: acf(x = lnvw, [Link] = 10)
Autocorrelation matrix:
lag lnvw
1 0 1.0000
2 1 0.1402
3 2 -0.0120
4 3 -0.0027
5 4 0.0029
6 5 0.0075
7 6 -0.0149
8 7 -0.0066
9 8 -0.0034
10 9 -0.0085
11 10 -0.0074
> length(lnvw)
[1] 10194
> x1=rep(1,10194) % Create a constant to handle non-zero mean
> m1=[Link](lnvw,xreg=x1,model=list(order=c(0,0,1)))
> summary(m1)
Call: [Link](x = lnvw, model = list(order = c(0, 0, 1)), xreg = x1)
Method: Maximum Likelihood with likelihood conditional on 0 observati
ons
ARIMA order: 0 0 1
Variance-Covariance Matrix:
ma(1)
ma(1) 0.00009599039
> [Link](lnvw,model=m1$model,6)
$mean:
[1] 0.0001581654 0.0000000000 0.0000000000 0.0000000000 0.0000000000
[6] 0.0000000000 % Need to add the constant 0.000396 to the forecast.
15
$[Link]:
[1] 0.008830056 0.008924361 0.008924361 0.008924361 0.008924361
[6] 0.008924361
rt = φ1rt−1 + φ0 + at − θ1at−1.
ρ1 = φ1 − [θ1σa2/Var(rt)] 6= φ1.
rt = φ0 + at + π1rt−1 + π2rt−2 + · · ·
rt = µ + at + ψ1at−1 + ψ2at−2 + · · ·
Unit-root Nonstationarity
Random walk
• Form pt = pt−1 + at
i=0 i=0
• Nonstationary
• Strong memory
differencing
• MA model: mean
• Continuously compounded?
pt = φ1pt−1 + et
pt = φ0 + φ1pt−1 + et.
i=1
20
The t-ratio of the OLS estimate of β is the ADF unit-root test sta-
tistic. Again, the statistic has a non-standard limiting distribution.
Example: Consider the log series of U.S. quaterly real GDP series
from 1947.I to [Link]. (Federal Reserve Bank of St. Louis).
R demonstration
> help(UnitrootTests) % See the tests available
Title:
Augmented Dickey-Fuller Test
Test Results:
PARAMETER:
Lag Order: 4
STATISTIC:
Dickey-Fuller: -1.1199
P VALUE:
0.6397 % Cannot reject a unit root.
Call:
ar(x = x)
Coefficients:
1 2 3 4
0.3021 0.1311 -0.0856 -0.1060
Title:
Augmented Dickey-Fuller Test
Test Results:
PARAMETER:
Lag Order: 5
STATISTIC:
21
Dickey-Fuller: -1.1339
P VALUE:
0.6345
S-Plus demonstration
> da=[Link]("[Link]")
> dim(da)
[1] 236 4
> plot(da[,4],type=’l’)
> module(finmetrics)
> gdp=log(da[,4])
> plot(gdp,type=’l’)
> ord=ar(x)
> ord
$order:
[1] 4
> adf=unitroot(gdp,trend=’c’,lags=5,method=’adf’)
> adf
Coefficients:
lag1 lag2 lag3 lag4 lag5 constant
-0.0012 0.2954 0.1358 -0.0864 -0.1108 0.0168
22
Lecture Note: Analysis of Financial Time Series
Spring 2007, Ruey S. Tsay
or
(1 − B)(1 − B 4)rt = (1 − θ1B)(1 − θ4B 4)at
••
•
•
••
• •• ••
•• • • ••
•• •• • • •
•• ••• •••
• • •••
• • • • • •
9.8
• ••
•• • •• •• ••
• • •• ••
• ••
••• • • •• ••• • •
•
• • • • ••
•• •• • •• • •
• •• ••• •
•• ••• •• • •• •••• ••
• • • • ••
9.6
• • •• • • ••
• • • • ••
• ••
•• • •• • • • ••
•
• ••• • •• • • • •
• •• • • • •• •••• •
• •• •
• • • • ••••• •••• •
••
• •
•
•• • •
• •
9.4
• • •
•• • •
• •• • • •• •• ••••
• • • •
• • • • • •• ••
• •••• • • • • •• ••
• ••• •
••• ••• •
•
Figure 1: Time plot of electricity demand of an industrial sector: 15th day of each month
from1972 to 1993.
2
1 2
y
0
Figure 3: Time plot of quarterly logged earnings of Johnson and Johnson: 1960-1980
0 20 40 60 80
Figure 4: Forecast plot for the quarterly earnings of Johnson and Johnson. Data: 1960-1980,
Forecasts: 1981-82.
> library(fSeries)
> setwd("C:/teaching/bs41202")
> x=ts(scan("[Link]"),frequency=4,start=c(1960,1)) % Load data into a time series object.
> plot(x,type=’l’) % Plot data with calendar time
> y=log(x) % Natural log transformation
> plot(y,type=’l’) % plot data
> points(y) % put circles on data points.
> par(mfcol=c(2,1)) % two plots per page
> acf(y,[Link]=16)
> y1=[Link](y) % Creates a sequence of data in R
> acf(y1,[Link]=16)
> dy1=diff(y1) % regular difference
> acf(dy1,[Link]=16)
> sdy1=diff(dy1,4) % seasonal difference
> acf(sdy1,[Link]=12)
> m1=arima(y1,order=c(0,1,1),seasonal=list(order=c(0,1,1),period=4)) % Airline
% model in R.
> m1
Call:
arima(x = y1, order = c(0, 1, 1), seasonal = list(order = c(0, 1, 1), period = 4))
4
Coefficients:
ma1 sma1
-0.6809 -0.3146 % The fitted model is (1-B^4)(1-B)R(t) =
s.e. 0.0982 0.1070 % (1-0.68B)(1-0.31B^4)a(t), var[a(t)] = 0.00793.
5
> acf(diff(y),[Link]=16)
Autocorrelation matrix:
lag y1
1 0 1.0000
2 1 0.9566
3 2 0.9260 % Indicates 1st difference is needed.
4 3 0.8978
5 4 0.8723
6 5 0.8285
....
17 16 0.4578
> acf(diff(y1),[Link]=16)
> dy1=diff(y1)
> sdy1=diff(dy1,4)
> acf(sdy1,[Link]=12)
> tra=mean(sdy1)/sqrt(var(sdy1)/length(sdy1)) % Compute t-ratio of the mean.
> tra
[1] 0.3101582
> summary(m1)
Call: [Link](x = y1, model = air)
Method: Maximum Likelihood with likelihood conditional on 5 observations
Model component 2
ARIMA order: 0 1 1
Period: 4
Variance-Covariance Matrix:
ma(1) ma(4)
6
ma(1) 0.007364341 -0.002665339
ma(4) -0.002665339 0.012370336
$[Link]:
[1] 0.08905406 0.09347891 0.09770357 0.10175299 0.13548748 0.14370536
[7] 0.15147807 0.15887096
7
Regression Models with Time Series Errors
Example. U.S. weekly interest rate data: 1-year and 3-year con-
stant maturity rates. Data are shown in Figure 5.
R Demonstration: output edited.
> library(fSeries)
> setwd("C:/teaching/bs41202")
> da=[Link]("[Link]") % load the data
> r1=da[,1] % 1-year rate
> r3=da[,2] % 3-year rate
> plot(r1,type=’l’) % Plot the data
> lines(1:1967,r3,lty=2)
> plot(r1,r3) % scatter plot of the two series
Residuals:
Min 1Q Median 3Q Max
-1.812147 -0.402280 0.003097 0.402588 1.338746
Coefficients:
8
15
percent
105
Figure 5: Time plots of U.S. weekly interest rates: 1-year constant maturity rate (solid line)
and 3-year rate (dashed line).
> acf(m1$residuals)
> c3=diff(r3)
> c1=diff(r1)
> plot(c1,c3)
Residuals:
Min 1Q Median 3Q Max
9
-0.3806040 -0.0333840 -0.0005428 0.0343681 0.4741822
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0002475 0.0015380 0.161 0.872
c1 0.7810590 0.0074651 104.628 <2e-16 ***
---
Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
> acf(m2$residuals)
> plot(m2$residuals,type=’l’)
Call:
arima(x = c3, order = c(0, 0, 1), xreg = c1)
Coefficients:
ma1 intercept c1 % Fitted model is
0.2115 0.0002 0.7824 % c3 = 0.0002+0.782c1 + a(t)+0.212a(t-1)
s.e. 0.0224 0.0018 0.0077 % with var[a(t)] = 0.00446.
Call:
arima(x = c3, order = c(1, 0, 0), xreg = c1)
Coefficients:
ar1 intercept c1 % Fitted model is
0.1922 0.0003 0.7829 % c3 = 0.0003 + 0.783c1 + a(t),
s.e. 0.0221 0.0019 0.0077 % a(t) = 0.192a(t-1)+e(t).
S-Plus Demonstration
> module(finmetrics)
10
> da=[Link]("[Link]")
> dim(da)
[1] 1967 3
> r3=da[,2]
> r1=da[,1]
> plot(r1,type=’l’) % plot the data
> lines(1:1967,r3,lty=2)
> plot(r1,r3)
>
> m1=OLS(r3~r1) % Least-square regression
> summary(m1)
Call:
OLS(formula = r3 ~ r1)
Residuals:
Min 1Q Median 3Q Max
-1.8121 -0.4023 0.0031 0.4026 1.3387
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 0.9107 0.0323 28.2380 0.0000 % Fitted model is
r1 0.9239 0.0044 210.5084 0.0000 % r3=0.911+0.924r1 + e
Regression Diagnostics:
Residual Diagnostics:
Stat P-Value
Jarque-Bera 9.0032 0.0111
Ljung-Box 42303.0824 0.0000
11
> summary(m2)
Call:
OLS(formula = c3 ~ c1)
Residuals:
Min 1Q Median 3Q Max
-0.3806 -0.0334 -0.0005 0.0344 0.4742
Coefficients:
Value Std. Error t value Pr(>|t|)
(Intercept) 0.0002 0.0015 0.1609 0.8722 % c3 = 0.002+0.781c1+e
c1 0.7811 0.0075 104.6283 0.0000
Regression Diagnostics:
R-Squared 0.8479
Adjusted R-Squared 0.8478
Durbin-Watson Stat 1.6158
Residual Diagnostics:
Stat P-Value
Jarque-Bera 1508.0683 0.0000
Ljung-Box 230.5767 0.0000
ARIMA order: 0 0 1
12
Optimizer has converged
Convergence Type: relative function convergence
AIC: -5059.6702
> [Link](m3) & model checking
Long-memory models
13
For an AR(1) with coefficient φ1, the speed of mean reverting as
measured by half-life is
ln(0.5)
k= .
ln(φ1)
For an MA(q) model, forecasts revert to the mean in q steps.
• Make proper use of regression models with time series errors, e.g.
regression with AR(1) residuals
Perform a joint estimation instead of using any two-step proce-
dure, e.g. Cochrane-Orcutt (1949).
Example: Is there a Friday effect on asset returns?
If a daily market index is used, serial correlation may exist.
See Exercise 8 of Chapter 2.
14