12处理时间序列数据和校正序列自相关

The purpose of this in-class lab12 is to use R to practice estimating time series regression models with standard errors corrected for heteroskedasticity and serial correlation (HAC). The lab11 should be completed in your group. To get credit, upload your .R script to the appropriate place on Canvas.

12.1 For starters

Load the usual packages, as well as the new ones installed in Lab 11.¹

Open up a new R script (named ICL12_XYZ.R, where XYZ are your initials) and add the usual “preamble” to the top:

# Add names of group members HERE
library(tidyverse)
library(wooldridge)
library(broom)
library(car)
library(pdfetch)
library(zoo)
library(dynlm)
library(lmtest)
library(sandwich)
library(magrittr)

12.1.1 Load the data

We’re going to use data on US macroeconomic indicators. The wooldridge data set is called phillips.

df <- as_tibble(phillips)
df %>% glimpse()

## Observations: 56
## Variables: 7
## $ year   <int> 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, ...
## $ unem   <dbl> 3.8, 5.9, 5.3, 3.3, 3.0, 2.9, 5.5, 4.4, 4.1, 4.3, 6.8, 5.5, ...
## $ inf    <dbl> 8.1, -1.2, 1.3, 7.9, 1.9, 0.8, 0.7, -0.4, 1.5, 3.3, 2.8, 0.7...
## $ inf_1  <dbl> NA, 8.1, -1.2, 1.3, 7.9, 1.9, 0.8, 0.7, -0.4, 1.5, 3.3, 2.8,...
## $ unem_1 <dbl> NA, 3.8, 5.9, 5.3, 3.3, 3.0, 2.9, 5.5, 4.4, 4.1, 4.3, 6.8, 5...
## $ cinf   <dbl> NA, -9.3000002, 2.5000000, 6.6000004, -6.0000000, -1.0999999...
## $ cunem  <dbl> NA, 2.1000001, -0.5999999, -2.0000002, -0.3000000, -0.099999...

12.1.2 Declare `df` as time series data

df.ts <- zoo(df, order.by=df$year)

Now it will be easy to include lags of various variables into our regression models.

12.2 Plot time series data

Let’s have a look at the inflation rate and unemployment for the US over the postwar period (1948–2003):

ggplot(df.ts) + 
  geom_line(aes(year, inf)) + 
  geom_line(aes(year, unem), color="red")

The negative correlation between the two led economist William Phillips to conclude that governments could increase their inflation rate to reduce the unemployment rate. This is known as the “Phillips Curve.”

12.3 Determinants of the inflation rate

Now let’s estimate the Phillips Curve:

\[ inf_{t} = \beta_0 + \beta_1 unemp_t + u_t \]

where \(inf\) is the inflation rate and \(unem\) is the unemployment rate.

library(tseries)
adf.test(df.ts$inf)  # alternative hypothesis: stationary

## 
##  Augmented Dickey-Fuller Test
## 
## data:  df.ts$inf
## Dickey-Fuller = -2.1213, Lag order = 3, p-value = 0.5257
## alternative hypothesis: stationary

adf.test(df.ts$unem) # alternative hypothesis: stationary

## 
##  Augmented Dickey-Fuller Test
## 
## data:  df.ts$unem
## Dickey-Fuller = -2.2056, Lag order = 3, p-value = 0.4917
## alternative hypothesis: stationary

est <- dynlm(inf ~ unem, data=df.ts)

Test for AR(1) serial correlation in this time series:

dynlm(resid(est) ~ L(resid(est))) %>% coeftest()

## 
## t test of coefficients:
## 
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -0.11181    0.31799 -0.3516   0.7265    
## L(resid(est))  0.57247    0.10835  5.2833 2.43e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

dynlm(resid(est) ~ L(resid(est))) %>% broom::tidy()

## # A tibble: 2 x 5
##   term          estimate std.error statistic    p.value
##   <chr>            <dbl>     <dbl>     <dbl>      <dbl>
## 1 (Intercept)     -0.112     0.318    -0.352 0.727     
## 2 L(resid(est))    0.572     0.108     5.28  0.00000243

Equivalently, you can use the bgtest() function in the lmtest package:

bgtest(est)

## 
##  Breusch-Godfrey test for serial correlation of order up to 1
## 
## data:  est
## LM test = 20.888, df = 1, p-value = 4.87e-06

Interpret the coefficient on unem in the previous regression. What does it tell you about the idea that inflation and unemployment positively covary?

12.4 Correcting for Serial Correlation

Now let’s compute HAC (Heteroskedasticity and Autocorrelation Consistent) standard errors. To do so, we’ll use the NeweyWest option in the coeftest() function of the lmtest package.²

coeftest(est) # re-display baseline results

## 
## t test of coefficients:
## 
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  1.05357    1.54796  0.6806  0.49902  
## unem         0.50238    0.26556  1.8918  0.06389 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

coeftest(est, vcov=NeweyWest)

## 
## t test of coefficients:
## 
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.05357    1.56089  0.6750   0.5026
## unem         0.50238    0.36794  1.3654   0.1778

How does your interpretation of the effect of unemployment on inflation change, using the Newey-West standard errors?

12.4.1 Another way to correct for serial correlation

Another way to get rid of serial correlation is to difference the data. In this case, we will estimate the following regression:

\[ \Delta inf_{t} = \beta_0 + \beta_1 unemp_t + u_t \]

where \(\Delta inf_{t} = inf_{t}-inf_{t-1}\). Aside from addressing serial correlation, the differenced model also accounts for people’s inflationary expectations.

est.diff <- dynlm(d(inf) ~ unem, data = df.ts)

Now perform a Breusch-Godfrey test on the differenced model. Is there a serial correlation problem?

bgtest(est.diff)

## 
##  Breusch-Godfrey test for serial correlation of order up to 1
## 
## data:  est.diff
## LM test = 0.061273, df = 1, p-value = 0.8045

Compute the Newey-West SEs on the difference model. Are they much different from the baseline model?

coeftest(est.diff)

## 
## t test of coefficients:
## 
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  2.82820    1.22487  2.3090  0.02487 *
## unem        -0.51765    0.20904 -2.4763  0.01650 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

coeftest(est.diff, vcov=NeweyWest)

## 
## t test of coefficients:
## 
##             Estimate Std. Error t value  Pr(>|t|)    
## (Intercept)  2.82820    0.96385  2.9343 0.0049321 ** 
## unem        -0.51765    0.14717 -3.5173 0.0009031 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What do you conclude about the effect of unemployment on the change in inflation?

You may need to install the sandwich package.↩
NeweyWest comes from the sandwich package.↩

12处理时间序列数据和校正序列自相关

LJJ

2020/3/29

12.1 For starters

12.1.1 Load the data

12.1.2 Declare `df` as time series data

12.2 Plot time series data

12.3 Determinants of the inflation rate

12.4 Correcting for Serial Correlation

12.4.1 Another way to correct for serial correlation

12处理时间序列数据和校正序列自相关

LJJ

2020/3/29

12.1 For starters

12.1.1 Load the data

12.1.2 Declare df as time series data

12.2 Plot time series data

12.3 Determinants of the inflation rate

12.4 Correcting for Serial Correlation

12.4.1 Another way to correct for serial correlation

12.1.2 Declare `df` as time series data