The purpose of this in-class lab15 is to use R to practice with time series forecasting. The lab15 should be completed in your group. To get credit, upload your .R script to the appropriate place on Canvas.
Open up a new R script (named ICL15_XYZ.R
, where XYZ
are your initials) and add the usual “preamble” to the top:
# Add names of group members HERE
library(tidyverse)
library(wooldridge)
library(broom)
library(magrittr)
library(stargazer)
library(zoo)
library(dynlm)
library(pdfetch)
library(tseries) # You may need to install this package
library(lubridate) # You may need to install this package
library(forecast) # You will likely have to install this one
First we’ll look at the return on a 3-month treasury bill over the period of 1960q1–1990q4.
Second, we’ll read in Google’s and Apple’s stock price data from January 3, 2005 until October 31, 2018.
df1 <- as_tibble(intqrt)
df1 %>% skimr::skim()
Name | Piped data |
Number of rows | 124 |
Number of columns | 23 |
_______________________ | |
Column type frequency: | |
numeric | 23 |
________________________ | |
Group variables | None |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
r3 | 0 | 1.00 | 6.45 | 3.02 | 2.19 | 4.31 | 5.72 | 8.10 | 16.46 | ▇▇▃▁▁ |
r6 | 0 | 1.00 | 6.75 | 3.11 | 2.56 | 4.47 | 6.12 | 8.38 | 17.35 | ▇▇▃▁▁ |
r12 | 0 | 1.00 | 6.93 | 3.07 | 2.93 | 4.65 | 6.46 | 8.60 | 17.13 | ▇▆▃▁▁ |
p3 | 0 | 1.00 | 9842.27 | 72.40 | 9605.80 | 9802.11 | 9859.40 | 9893.63 | 9945.70 | ▁▁▅▇▇ |
p6 | 0 | 1.00 | 9676.41 | 143.05 | 9203.76 | 9598.68 | 9704.11 | 9781.85 | 9873.96 | ▁▁▃▇▇ |
hy6 | 1 | 0.99 | 1.71 | 0.90 | 0.67 | 1.12 | 1.57 | 2.11 | 5.77 | ▇▅▁▁▁ |
hy3 | 0 | 1.00 | 1.61 | 0.75 | 0.55 | 1.08 | 1.43 | 2.02 | 4.10 | ▇▇▃▁▁ |
spr63 | 0 | 1.00 | 0.30 | 0.27 | -0.60 | 0.11 | 0.29 | 0.43 | 1.54 | ▁▇▇▁▁ |
hy3_1 | 1 | 0.99 | 1.61 | 0.76 | 0.55 | 1.07 | 1.42 | 2.02 | 4.10 | ▇▇▃▁▁ |
hy6_1 | 2 | 0.98 | 1.71 | 0.90 | 0.67 | 1.11 | 1.55 | 2.09 | 5.77 | ▇▅▁▁▁ |
spr63_1 | 1 | 0.99 | 0.31 | 0.27 | -0.60 | 0.11 | 0.29 | 0.44 | 1.54 | ▁▇▇▁▁ |
hy6hy3_1 | 1 | 0.99 | 0.11 | 0.34 | -0.97 | -0.04 | 0.06 | 0.21 | 1.82 | ▁▇▃▁▁ |
cr3 | 1 | 0.99 | 0.04 | 1.25 | -6.63 | -0.28 | 0.15 | 0.60 | 4.99 | ▁▁▇▅▁ |
r3_1 | 1 | 0.99 | 6.44 | 3.03 | 2.19 | 4.27 | 5.70 | 8.11 | 16.46 | ▇▇▃▁▁ |
chy6 | 2 | 0.98 | 0.01 | 0.74 | -3.85 | -0.22 | 0.01 | 0.25 | 3.05 | ▁▁▇▂▁ |
chy3 | 1 | 0.99 | 0.01 | 0.31 | -1.65 | -0.07 | 0.04 | 0.15 | 1.24 | ▁▁▇▅▁ |
chy6_1 | 3 | 0.98 | 0.01 | 0.74 | -3.85 | -0.18 | 0.01 | 0.25 | 3.05 | ▁▁▇▂▁ |
chy3_1 | 2 | 0.98 | 0.01 | 0.31 | -1.65 | -0.06 | 0.04 | 0.15 | 1.24 | ▁▁▇▅▁ |
cr6 | 1 | 0.99 | 0.04 | 1.28 | -6.74 | -0.44 | 0.13 | 0.57 | 4.65 | ▁▁▇▇▁ |
cr6_1 | 2 | 0.98 | 0.04 | 1.29 | -6.74 | -0.43 | 0.14 | 0.58 | 4.65 | ▁▁▇▇▁ |
cr3_1 | 2 | 0.98 | 0.04 | 1.26 | -6.63 | -0.25 | 0.16 | 0.60 | 4.99 | ▁▁▇▅▁ |
r6_1 | 1 | 0.99 | 6.74 | 3.12 | 2.56 | 4.42 | 6.10 | 8.39 | 17.35 | ▇▇▃▁▁ |
cspr63 | 1 | 0.99 | 0.00 | 0.31 | -1.05 | -0.13 | 0.01 | 0.11 | 1.49 | ▁▅▇▁▁ |
df1 %<>%
dplyr::mutate(quarter = seq(yq('1960:Q1'),yq('1990:Q4'), by = 'quarters')) # create quarter
df1 %<>%
dplyr::select(r3,quarter)
df2 <- pdfetch_YAHOO(c("goog","aapl"),
fields = c("adjclose"),
from = ymd("2005-01-01"),
to = ymd("2018-11-01"),
interval = "1d") %>%
as_tibble
df2 %<>%
dplyr::mutate(date=rownames(df2), date=ymd(date)) # create date variable
df1.ts <- df1 %>%
dplyr::select(r3) %>%
zoo(order.by=df1$quarter)
df2.ts <- df2 %>%
select(goog,aapl) %>%
dplyr::zoo(order.by=df2$date)
Let’s have a look at the 3-month T-bill return for the US over the period 1960–1990:
autoplot(df1.ts) + xlab("Year") + ylab("T-bill return")
And now the Google adjusted closing price:
autoplot(df2.ts) + xlab("Year") + ylab("Price")
Let’s test for a unit root in each of the time series. The way to do this is the Augmented Dickey-Fuller (ADF) test, which is available as adf.test()
in the tseries
package.
The function tests
\[ H_0: \text{Unit Root} \\ H_a: \text{Stationary} \]
adf.test(df1.ts$r3, k=1)
##
## Augmented Dickey-Fuller Test
##
## data: df1.ts$r3
## Dickey-Fuller = -2.7878, Lag order = 1, p-value = 0.2492
## alternative hypothesis: stationary
pp.test(df1.ts$r3)
##
## Phillips-Perron Unit Root Test
##
## data: df1.ts$r3
## Dickey-Fuller Z(alpha) = -17.53, Truncation lag parameter = 4, p-value
## = 0.1027
## alternative hypothesis: stationary
adf.test(df2.ts$goog, k=1)
adf.test(df2.ts$aapl, k=1)
To alternatively examine the unit root, we can estimate AR(1) models for each series:
est.tbill <- dynlm(r3 ~ L(r3,1), data=df1.ts)
stargazer(est.tbill,type="text")
##
## ===============================================
## Dependent variable:
## ---------------------------
## r3
## -----------------------------------------------
## L(r3, 1) 0.909***
## (0.037)
##
## Constant 0.625**
## (0.261)
##
## -----------------------------------------------
## Observations 123
## R2 0.836
## Adjusted R2 0.834
## Residual Std. Error 1.228 (df = 121)
## F Statistic 614.595*** (df = 1; 121)
## ===============================================
## Note: *p<0.1; **p<0.05; ***p<0.01
est.goog <- dynlm(goog ~ L(goog,1), data=df2.ts)
stargazer(est.goog,type="text")
est.aapl <- dynlm(aapl ~ L(aapl,1), data=df2.ts)
stargazer(est.aapl,type="text")
Now let’s use our time series data to forecast future stock prices. First, we should create a shortened version of the time series so we can compare our forecast to actual data:
df2.short <- df2 %>%
dplyr::filter(date < as.Date("2018-10-01"))
df2.ts.short <- df2.short %>%
dplyr::select(goog,aapl) %>%
zoo(order.by=df2.short$date)
We can use the Arima
function to estimate basic AR(1) models on the differenced stock prices.
simple.goog <- Arima(df2.ts.short$goog,order=c(1,1,0))
simple.aapl <- Arima(df2.ts.short$aapl,order=c(1,1,0))
This is the same thing as estimating
\[ \Delta goog_t = \rho \Delta goog_{t-1} + u_t \]
We can also use the auto.arima
function to allow the computer to choose the best ARIMA model:
auto.goog <- auto.arima(df2.ts.short$goog)
auto.aapl <- auto.arima(df2.ts.short$aapl)
We can compare the 90-day-ahead forecasts of each model by looking at their plots:
autoplot(forecast(simple.goog, h=90))
autoplot(forecast( auto.goog, h=90))
autoplot(forecast(simple.aapl, h=90))
autoplot(forecast( auto.aapl, h=90))