15时间序列预测

The purpose of this in-class lab15 is to use R to practice with time series forecasting. The lab15 should be completed in your group. To get credit, upload your .R script to the appropriate place on Canvas.

15.1 For starters

Open up a new R script (named ICL15_XYZ.R, where XYZ are your initials) and add the usual “preamble” to the top:

# Add names of group members HERE
library(tidyverse)
library(wooldridge)
library(broom)
library(magrittr)
library(stargazer)
library(zoo)
library(dynlm)
library(pdfetch)
library(tseries)   # You may need to install this package
library(lubridate) # You may need to install this package
library(forecast)  # You will likely have to install this one

15.2.1 Load the data

First we’ll look at the return on a 3-month treasury bill over the period of 1960q1–1990q4.

Second, we’ll read in Google’s and Apple’s stock price data from January 3, 2005 until October 31, 2018.

df1 <- as_tibble(intqrt)
df1 %>% skimr::skim()

Data summary
Name	Piped data
Number of rows	124
Number of columns	23
_______________________
Column type frequency:
numeric	23
________________________
Group variables	None

Variable type: numeric

skim_variable	n_missing	complete_rate	mean	sd	p0	p25	p50	p75	p100	hist
r3	0	1.00	6.45	3.02	2.19	4.31	5.72	8.10	16.46	▇▇▃▁▁
r6	0	1.00	6.75	3.11	2.56	4.47	6.12	8.38	17.35	▇▇▃▁▁
r12	0	1.00	6.93	3.07	2.93	4.65	6.46	8.60	17.13	▇▆▃▁▁
p3	0	1.00	9842.27	72.40	9605.80	9802.11	9859.40	9893.63	9945.70	▁▁▅▇▇
p6	0	1.00	9676.41	143.05	9203.76	9598.68	9704.11	9781.85	9873.96	▁▁▃▇▇
hy6	1	0.99	1.71	0.90	0.67	1.12	1.57	2.11	5.77	▇▅▁▁▁
hy3	0	1.00	1.61	0.75	0.55	1.08	1.43	2.02	4.10	▇▇▃▁▁
spr63	0	1.00	0.30	0.27	-0.60	0.11	0.29	0.43	1.54	▁▇▇▁▁
hy3_1	1	0.99	1.61	0.76	0.55	1.07	1.42	2.02	4.10	▇▇▃▁▁
hy6_1	2	0.98	1.71	0.90	0.67	1.11	1.55	2.09	5.77	▇▅▁▁▁
spr63_1	1	0.99	0.31	0.27	-0.60	0.11	0.29	0.44	1.54	▁▇▇▁▁
hy6hy3_1	1	0.99	0.11	0.34	-0.97	-0.04	0.06	0.21	1.82	▁▇▃▁▁
cr3	1	0.99	0.04	1.25	-6.63	-0.28	0.15	0.60	4.99	▁▁▇▅▁
r3_1	1	0.99	6.44	3.03	2.19	4.27	5.70	8.11	16.46	▇▇▃▁▁
chy6	2	0.98	0.01	0.74	-3.85	-0.22	0.01	0.25	3.05	▁▁▇▂▁
chy3	1	0.99	0.01	0.31	-1.65	-0.07	0.04	0.15	1.24	▁▁▇▅▁
chy6_1	3	0.98	0.01	0.74	-3.85	-0.18	0.01	0.25	3.05	▁▁▇▂▁
chy3_1	2	0.98	0.01	0.31	-1.65	-0.06	0.04	0.15	1.24	▁▁▇▅▁
cr6	1	0.99	0.04	1.28	-6.74	-0.44	0.13	0.57	4.65	▁▁▇▇▁
cr6_1	2	0.98	0.04	1.29	-6.74	-0.43	0.14	0.58	4.65	▁▁▇▇▁
cr3_1	2	0.98	0.04	1.26	-6.63	-0.25	0.16	0.60	4.99	▁▁▇▅▁
r6_1	1	0.99	6.74	3.12	2.56	4.42	6.10	8.39	17.35	▇▇▃▁▁
cspr63	1	0.99	0.00	0.31	-1.05	-0.13	0.01	0.11	1.49	▁▅▇▁▁

df1 %<>% 
  dplyr::mutate(quarter = seq(yq('1960:Q1'),yq('1990:Q4'), by = 'quarters')) # create quarter
df1 %<>% 
  dplyr::select(r3,quarter)

df2 <- pdfetch_YAHOO(c("goog","aapl"), 
                     fields = c("adjclose"), 
                     from = ymd("2005-01-01"),
                     to = ymd("2018-11-01"),
                     interval = "1d") %>% 
  as_tibble
df2 %<>% 
  dplyr::mutate(date=rownames(df2), date=ymd(date)) # create date variable

15.2.2 Declare as time series objects

df1.ts <- df1 %>% 
  dplyr::select(r3) %>% 
  zoo(order.by=df1$quarter)

df2.ts <- df2 %>% 
  select(goog,aapl) %>% 
  dplyr::zoo(order.by=df2$date)

15.3 Plot time series data

Let’s have a look at the 3-month T-bill return for the US over the period 1960–1990:

autoplot(df1.ts) + xlab("Year") + ylab("T-bill return")

And now the Google adjusted closing price:

autoplot(df2.ts) + xlab("Year") + ylab("Price")

15.4 Testing for a unit root

Let’s test for a unit root in each of the time series. The way to do this is the Augmented Dickey-Fuller (ADF) test, which is available as adf.test() in the tseries package.

The function tests

\[ H_0: \text{Unit Root} \\ H_a: \text{Stationary} \]

adf.test(df1.ts$r3, k=1)

## 
##  Augmented Dickey-Fuller Test
## 
## data:  df1.ts$r3
## Dickey-Fuller = -2.7878, Lag order = 1, p-value = 0.2492
## alternative hypothesis: stationary

pp.test(df1.ts$r3)

## 
##  Phillips-Perron Unit Root Test
## 
## data:  df1.ts$r3
## Dickey-Fuller Z(alpha) = -17.53, Truncation lag parameter = 4, p-value
## = 0.1027
## alternative hypothesis: stationary

adf.test(df2.ts$goog, k=1)
adf.test(df2.ts$aapl, k=1)

Which of these time series has a unit root, according to the ADF test? Explain what the consequences are of analyzing a time series that contains a unit root.

15.4.1 Estimating AR(1) models

To alternatively examine the unit root, we can estimate AR(1) models for each series:

est.tbill <- dynlm(r3 ~ L(r3,1), data=df1.ts)
stargazer(est.tbill,type="text")

## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                                 r3             
## -----------------------------------------------
## L(r3, 1)                     0.909***          
##                               (0.037)          
##                                                
## Constant                      0.625**          
##                               (0.261)          
##                                                
## -----------------------------------------------
## Observations                    123            
## R2                             0.836           
## Adjusted R2                    0.834           
## Residual Std. Error      1.228 (df = 121)      
## F Statistic          614.595*** (df = 1; 121)  
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

est.goog  <- dynlm(goog ~ L(goog,1), data=df2.ts)
stargazer(est.goog,type="text")

est.aapl  <- dynlm(aapl ~ L(aapl,1), data=df2.ts)
stargazer(est.aapl,type="text")

Are the \(R^2\) values from these estimates meaningful?

15.5 Forecasting

Now let’s use our time series data to forecast future stock prices. First, we should create a shortened version of the time series so we can compare our forecast to actual data:

df2.short    <- df2 %>% 
  dplyr::filter(date < as.Date("2018-10-01"))

df2.ts.short <- df2.short %>% 
  dplyr::select(goog,aapl) %>%
  zoo(order.by=df2.short$date)

15.5.1 Estimating simple AR models

We can use the Arima function to estimate basic AR(1) models on the differenced stock prices.

simple.goog <- Arima(df2.ts.short$goog,order=c(1,1,0))
simple.aapl <- Arima(df2.ts.short$aapl,order=c(1,1,0))

This is the same thing as estimating

\[ \Delta goog_t = \rho \Delta goog_{t-1} + u_t \]

15.5.2 Estimating ARIMA models

We can also use the auto.arima function to allow the computer to choose the best ARIMA model:

auto.goog <- auto.arima(df2.ts.short$goog)
auto.aapl <- auto.arima(df2.ts.short$aapl)

15.5.3 Plotting forecasts

We can compare the 90-day-ahead forecasts of each model by looking at their plots:

autoplot(forecast(simple.goog, h=90))
autoplot(forecast(  auto.goog, h=90))

autoplot(forecast(simple.aapl, h=90))
autoplot(forecast(  auto.aapl, h=90))