The purpose of this in-class lab is to practice conducting joint hypothesis tests of regression parameters in R. We will do this using t-tests and F-tests. The lab should be completed in your group. To get credit, upload your .R script to the appropriate place on Canvas.
Open up a new R script (named ICL8_XYZ.R, where XYZ are your initials) and add the usual “preamble” to the top:
# Add names of group members HERE
library(tidyverse)
library(broom)
library(wooldridge)
library(car)
library(magrittr)
We’ll use a data set on earnings and ability, called htv. The data set contains a sample of 1,230 workers.
df <- as_tibble(htv)
skimr::skim(df)
| | |
|---|---|
| Name | df |
| Number of rows | 1230 |
| Number of columns | 23 |
| Column type frequency: numeric | 23 |
| Group variables | None |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
wage | 0 | 1 | 13.29 | 9.08 | 1.02 | 7.94 | 11.54 | 16.03 | 91.31 | ▇▁▁▁▁ |
abil | 0 | 1 | 1.80 | 2.18 | -5.63 | 0.57 | 2.15 | 3.46 | 6.26 | ▁▂▅▇▃ |
educ | 0 | 1 | 13.04 | 2.35 | 6.00 | 12.00 | 12.00 | 15.00 | 20.00 | ▁▂▇▃▁ |
ne | 0 | 1 | 0.21 | 0.41 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▂ |
nc | 0 | 1 | 0.37 | 0.48 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 | ▇▁▁▁▅ |
west | 0 | 1 | 0.17 | 0.37 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▂ |
south | 0 | 1 | 0.25 | 0.43 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▂ |
exper | 0 | 1 | 10.74 | 3.11 | 1.00 | 9.00 | 11.00 | 13.00 | 21.00 | ▁▆▇▃▁ |
motheduc | 0 | 1 | 12.18 | 2.28 | 0.00 | 12.00 | 12.00 | 12.00 | 20.00 | ▁▁▇▂▁ |
fatheduc | 0 | 1 | 12.45 | 3.26 | 0.00 | 11.00 | 12.00 | 14.00 | 20.00 | ▁▂▇▃▂ |
brkhme14 | 0 | 1 | 0.17 | 0.37 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▂ |
sibs | 0 | 1 | 2.95 | 1.86 | 0.00 | 2.00 | 3.00 | 4.00 | 13.00 | ▇▇▁▁▁ |
urban | 0 | 1 | 0.82 | 0.39 | 0.00 | 1.00 | 1.00 | 1.00 | 1.00 | ▂▁▁▁▇ |
ne18 | 0 | 1 | 0.23 | 0.42 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▂ |
nc18 | 0 | 1 | 0.41 | 0.49 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 | ▇▁▁▁▆ |
south18 | 0 | 1 | 0.21 | 0.41 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▂ |
west18 | 0 | 1 | 0.15 | 0.35 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | ▇▁▁▁▂ |
urban18 | 0 | 1 | 0.87 | 0.33 | 0.00 | 1.00 | 1.00 | 1.00 | 1.00 | ▁▁▁▁▇ |
tuit17 | 0 | 1 | 8.57 | 4.13 | 0.00 | 6.07 | 8.73 | 11.16 | 18.17 | ▂▅▇▅▁ |
tuit18 | 0 | 1 | 8.59 | 4.05 | 0.00 | 6.13 | 8.83 | 11.16 | 18.17 | ▂▅▇▅▁ |
lwage | 0 | 1 | 2.41 | 0.59 | 0.02 | 2.07 | 2.45 | 2.77 | 4.51 | ▁▂▇▃▁ |
expersq | 0 | 1 | 125.06 | 67.17 | 1.00 | 81.00 | 121.00 | 169.00 | 441.00 | ▆▇▃▁▁ |
ctuit | 0 | 1 | 0.02 | 0.80 | -7.62 | -0.07 | 0.00 | 0.00 | 9.17 | ▁▁▇▁▁ |
Check out what’s in the data by typing:
glimpse(df)
## Observations: 1,230
## Variables: 23
## $ wage <dbl> 12.019231, 8.912656, 15.514334, 13.333333, 11.070110, 17.4...
## $ abil <dbl> 5.0277381, 2.0371704, 2.4758952, 3.6092398, 2.6365459, 3.4...
## $ educ <int> 15, 13, 15, 15, 13, 18, 13, 12, 13, 12, 12, 12, 17, 13, 17...
## $ ne <int> 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1...
## $ nc <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ west <int> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0...
## $ south <int> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ exper <int> 9, 8, 11, 6, 15, 8, 13, 14, 9, 9, 13, 14, 4, 8, 7, 10, 10,...
## $ motheduc <int> 12, 12, 12, 12, 12, 12, 13, 12, 10, 14, 9, 12, 17, 16, 16,...
## $ fatheduc <int> 12, 10, 16, 12, 15, 12, 12, 12, 12, 12, 10, 16, 16, 16, 18...
## $ brkhme14 <int> 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ sibs <int> 1, 4, 2, 1, 2, 2, 5, 4, 3, 1, 2, 1, 1, 3, 2, 2, 1, 1, 2, 2...
## $ urban <int> 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1...
## $ ne18 <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0...
## $ nc18 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ south18 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ west18 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1...
## $ urban18 <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ tuit17 <dbl> 7.582914, 8.595144, 7.311346, 9.499537, 7.311346, 7.311346...
## $ tuit18 <dbl> 7.260242, 9.499537, 7.311346, 10.162070, 7.311346, 7.31134...
## $ lwage <dbl> 2.486508, 2.187472, 2.741764, 2.590267, 2.404249, 2.861201...
## $ expersq <int> 81, 64, 121, 36, 225, 64, 169, 196, 81, 81, 169, 196, 16, ...
## $ ctuit <dbl> -0.32267141, 0.90439224, 0.00000000, 0.66253376, 0.0000000...
The main variables we’re interested in are: wages, education, ability, parental education, and region of residence (ne, nc, west, and south).
Let’s start by creating a factor variable from the four regional dummies. Borrowing code from lab 6, we have:
df %<>% mutate(region = case_when(ne==1 ~ "Northeast",
                                  nc==1 ~ "NorthCentral",
                                  west==1 ~ "West",
                                  south==1 ~ "South")) %>%
  mutate(region = factor(region))
or, equivalently,
df %>%
  mutate(region = case_when(ne == 1 ~ "Northeast",
                            nc == 1 ~ "NorthCentral",
                            west == 1 ~ "West",
                            south == 1 ~ "South")) %>%
  mutate(region = factor(region)) -> df
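To verify the new factor (a quick check that need not go in your script), you can tabulate it:
df %>% count(region)   # one row per region; the counts should match the four dummies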
Estimate the following regression model:
\[ educ = \beta_0 + \beta_1 motheduc + \beta_2 fatheduc + \beta_3 abil + \beta_4 abil^2 + \beta_5 region + u \]
Note that \(abil\) is in standard deviation units. You will need to use the mutate() function to create \(abil^2\); call it abilsq. \(region\) represents the factor variable you created above.1
df %<>%
  mutate(abilsq = abil^2)
df %>% glimpse()
## Observations: 1,230
## Variables: 25
## $ wage <dbl> 12.019231, 8.912656, 15.514334, 13.333333, 11.070110, 17.4...
## $ abil <dbl> 5.0277381, 2.0371704, 2.4758952, 3.6092398, 2.6365459, 3.4...
## $ educ <int> 15, 13, 15, 15, 13, 18, 13, 12, 13, 12, 12, 12, 17, 13, 17...
## $ ne <int> 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1...
## $ nc <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ west <int> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0...
## $ south <int> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ exper <int> 9, 8, 11, 6, 15, 8, 13, 14, 9, 9, 13, 14, 4, 8, 7, 10, 10,...
## $ motheduc <int> 12, 12, 12, 12, 12, 12, 13, 12, 10, 14, 9, 12, 17, 16, 16,...
## $ fatheduc <int> 12, 10, 16, 12, 15, 12, 12, 12, 12, 12, 10, 16, 16, 16, 18...
## $ brkhme14 <int> 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ sibs <int> 1, 4, 2, 1, 2, 2, 5, 4, 3, 1, 2, 1, 1, 3, 2, 2, 1, 1, 2, 2...
## $ urban <int> 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1...
## $ ne18 <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0...
## $ nc18 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ south18 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0...
## $ west18 <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1...
## $ urban18 <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ tuit17 <dbl> 7.582914, 8.595144, 7.311346, 9.499537, 7.311346, 7.311346...
## $ tuit18 <dbl> 7.260242, 9.499537, 7.311346, 10.162070, 7.311346, 7.31134...
## $ lwage <dbl> 2.486508, 2.187472, 2.741764, 2.590267, 2.404249, 2.861201...
## $ expersq <int> 81, 64, 121, 36, 225, 64, 169, 196, 81, 81, 169, 196, 16, ...
## $ ctuit <dbl> -0.32267141, 0.90439224, 0.00000000, 0.66253376, 0.0000000...
## $ region <fct> West, Northeast, Northeast, Northeast, Northeast, Northeas...
## $ abilsq <dbl> 25.2781503, 4.1500633, 6.1300569, 13.0266121, 6.9513743, 1...
est <- lm(educ ~ motheduc + fatheduc + abil + abilsq + region, data=df)
summary(est)
##
## Call:
## lm(formula = educ ~ motheduc + fatheduc + abil + abilsq + region,
## data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.1942 -1.1431 -0.1343 1.0218 6.9029
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.228830 0.292800 28.104 < 2e-16 ***
## motheduc 0.188975 0.028254 6.688 3.42e-11 ***
## fatheduc 0.107275 0.019666 5.455 5.93e-08 ***
## abil 0.400004 0.030384 13.165 < 2e-16 ***
## abilsq 0.050585 0.008317 6.082 1.59e-09 ***
## regionNortheast 0.187376 0.136853 1.369 0.171
## regionSouth -0.014308 0.130779 -0.109 0.913
## regionWest 0.075615 0.148390 0.510 0.610
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.758 on 1222 degrees of freedom
## Multiple R-squared: 0.4454, Adjusted R-squared: 0.4423
## F-statistic: 140.2 on 7 and 1222 DF, p-value: < 2.2e-16
tidy(est)
## # A tibble: 8 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 8.23 0.293 28.1 1.85e-134
## 2 motheduc 0.189 0.0283 6.69 3.42e- 11
## 3 fatheduc 0.107 0.0197 5.45 5.93e- 8
## 4 abil 0.400 0.0304 13.2 4.12e- 37
## 5 abilsq 0.0506 0.00832 6.08 1.59e- 9
## 6 regionNortheast 0.187 0.137 1.37 1.71e- 1
## 7 regionSouth -0.0143 0.131 -0.109 9.13e- 1
## 8 regionWest 0.0756 0.148 0.510 6.10e- 1
Now test whether mother’s and father’s education have the same effect on educational attainment:
\[ H_0: \beta_1=\beta_2 \\ H_a: \beta_1 \neq \beta_2 \]
To do this, you will need to obtain \(se(\hat{\beta}_1 - \hat{\beta}_2)\). Luckily, R will do this for you with the linearHypothesis() function in the car package:
linearHypothesis(est, "motheduc = fatheduc")
## Linear hypothesis test
##
## Hypothesis:
## motheduc - fatheduc = 0
##
## Model 1: restricted model
## Model 2: educ ~ motheduc + fatheduc + abil + abilsq + region
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 1223 3789.6
## 2 1222 3777.9 1 11.656 3.7701 0.0524 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The resulting p-value is that of an F-test, but a t-test would give an identical answer, since this is a single linear restriction (see @wooldridge, pp. 125-126).
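As an aside, here is a minimal sketch of that equivalent t-test done “by hand” using the coefficient variance-covariance matrix (it assumes the est object from above and need not go in your script):
b <- coef(est)
V <- vcov(est)
# se of the difference: sqrt(Var(b1) + Var(b2) - 2*Cov(b1,b2))
se.diff <- sqrt(V["motheduc", "motheduc"] + V["fatheduc", "fatheduc"] -
                2 * V["motheduc", "fatheduc"])
tstat <- (b["motheduc"] - b["fatheduc"]) / se.diff
2 * pt(-abs(tstat), df = df.residual(est))   # two-sided p-value; tstat^2 equals the F above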
The p-values from the previous regression suggest that the three region dummies may not help explain education. We can test this jointly:
\[ H_0: \text{all region dummies}=0 \\ H_a: \text{at least one region dummy}\neq 0 \]
The code to do this again uses the linearHypothesis() function. The syntax is to enclose each component hypothesis in quotes and then wrap them in c(), which is how R creates vectors.
linearHypothesis(est, c("regionNortheast=0", "regionSouth=0", "regionWest=0"))
## Linear hypothesis test
##
## Hypothesis:
## regionNortheast = 0
## regionSouth = 0
## regionWest = 0
##
## Model 1: restricted model
## Model 2: educ ~ motheduc + fatheduc + abil + abilsq + region
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 1225 3785.2
## 2 1222 3777.9 3 7.3408 0.7915 0.4987
or, more simply,
linearHypothesis(est, matchCoefs(est,"region"))
## Linear hypothesis test
##
## Hypothesis:
## regionNortheast = 0
## regionSouth = 0
## regionWest = 0
##
## Model 1: restricted model
## Model 2: educ ~ motheduc + fatheduc + abil + abilsq + region
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 1225 3785.2
## 2 1222 3777.9 3 7.3408 0.7915 0.4987
Alternatively, you can perform the F-test as follows (no need to put this in your R script; I’m just showing you how to do it “by hand”):
est.restrict <- lm(educ ~ motheduc + fatheduc + abil + abilsq, data=df) # restricted model: drop the region dummies
Fstat.numerator <- (deviance(est.restrict)-deviance(est))/3             # (RSS_r - RSS_ur)/q, with q = 3 restrictions
Fstat.denominator <- deviance(est)/1222                                 # RSS_ur/(n - k - 1) degrees of freedom
Fstat <- Fstat.numerator/Fstat.denominator
p.value <- 1-pf(Fstat,3,1222)                                           # upper-tail probability of the F(3, 1222) distribution
This gives the exact same answer as the linearHypothesis() code.
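Equivalently, the built-in anova() function reproduces the same F-test by comparing the restricted and unrestricted models (again, just for illustration):
anova(est.restrict, est)   # same F-statistic and p-value as linearHypothesis()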
The remaining examples, adapted from the linearHypothesis() help page, illustrate further options. The first set uses the car package’s Davis data set on measured and self-reported weight:
Davis %>% head()
## sex weight height repwt repht
## 1 M 77 182 77 180
## 2 F 58 161 51 159
## 3 F 53 161 54 158
## 4 M 68 177 70 175
## 5 F 59 157 59 155
## 6 M 76 170 76 165
mod.davis <- lm(weight ~ repwt, data=Davis)
summary(mod.davis)
##
## Call:
## lm(formula = weight ~ repwt, data = Davis)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.048 -1.868 -0.728 0.601 108.705
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.3363 3.0369 1.757 0.0806 .
## repwt 0.9278 0.0453 20.484 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.419 on 181 degrees of freedom
## (17 observations deleted due to missingness)
## Multiple R-squared: 0.6986, Adjusted R-squared: 0.697
## F-statistic: 419.6 on 1 and 181 DF, p-value: < 2.2e-16
## the following are equivalent:
# linearHypothesis(mod.davis, diag(2), c(0,1))
linearHypothesis(mod.davis, c("(Intercept) = 0", "repwt = 1"))
## Linear hypothesis test
##
## Hypothesis:
## (Intercept) = 0
## repwt = 1
##
## Model 1: restricted model
## Model 2: weight ~ repwt
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 183 13074
## 2 181 12828 2 245.97 1.7353 0.1793
# linearHypothesis(mod.davis, c("(Intercept)", "repwt"), c(0,1))
# linearHypothesis(mod.davis, c("(Intercept)", "repwt = 1"))
## use asymptotic Chi-squared statistic
linearHypothesis(mod.davis, c("(Intercept) = 0", "repwt = 1"), test = "Chisq")
## Linear hypothesis test
##
## Hypothesis:
## (Intercept) = 0
## repwt = 1
##
## Model 1: restricted model
## Model 2: weight ~ repwt
##
## Res.Df RSS Df Sum of Sq Chisq Pr(>Chisq)
## 1 183 13074
## 2 181 12828 2 245.97 3.4706 0.1763
## the following are equivalent:
## use HC3 standard errors via white.adjust option
linearHypothesis(mod.davis, c("(Intercept) = 0", "repwt = 1"),
white.adjust = TRUE)
## Linear hypothesis test
##
## Hypothesis:
## (Intercept) = 0
## repwt = 1
##
## Model 1: restricted model
## Model 2: weight ~ repwt
##
## Note: Coefficient covariance matrix supplied.
##
## Res.Df Df F Pr(>F)
## 1 183
## 2 181 2 3.3896 0.03588 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## covariance matrix *function*
linearHypothesis(mod.davis, c("(Intercept) = 0", "repwt = 1"), vcov = hccm)
## Linear hypothesis test
##
## Hypothesis:
## (Intercept) = 0
## repwt = 1
##
## Model 1: restricted model
## Model 2: weight ~ repwt
##
## Note: Coefficient covariance matrix supplied.
##
## Res.Df Df F Pr(>F)
## 1 183
## 2 181 2 3.3896 0.03588 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## covariance matrix *estimate*
linearHypothesis(mod.davis, c("(Intercept) = 0", "repwt = 1"),
vcov = hccm(mod.davis, type = "hc3"))
## Linear hypothesis test
##
## Hypothesis:
## (Intercept) = 0
## repwt = 1
##
## Model 1: restricted model
## Model 2: weight ~ repwt
##
## Note: Coefficient covariance matrix supplied.
##
## Res.Df Df F Pr(>F)
## 1 183
## 2 181 2 3.3896 0.03588 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The next examples use the car package’s Duncan occupational-prestige data:
mod.duncan <- lm(prestige ~ income + education, data=Duncan)
## the following are all equivalent:
# linearHypothesis(mod.duncan, "1*income - 1*education = 0")
linearHypothesis(mod.duncan, "income = education")
## Linear hypothesis test
##
## Hypothesis:
## income - education = 0
##
## Model 1: restricted model
## Model 2: prestige ~ income + education
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 43 7518.9
## 2 42 7506.7 1 12.195 0.0682 0.7952
# linearHypothesis(mod.duncan, "income - education")
# linearHypothesis(mod.duncan, "1income - 1education = 0")
# linearHypothesis(mod.duncan, "0 = 1*income - 1*education")
# linearHypothesis(mod.duncan, "income-education=0")
# linearHypothesis(mod.duncan, "1*income - 1*education + 1 = 1")
# linearHypothesis(mod.duncan, "2income = 2*education")
mod.duncan.2 <- lm(prestige ~ type*(income + education), data=Duncan)
coefs <- names(coef(mod.duncan.2))
summary(mod.duncan.2)
##
## Call:
## lm(formula = prestige ~ type * (income + education), data = Duncan)
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.2629 -5.5337 -0.2431 5.1065 22.5198
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.95054 6.79402 -0.581 0.5645
## typeprof 32.00781 14.10923 2.269 0.0294 *
## typewc -7.04320 20.63835 -0.341 0.7349
## income 0.78341 0.13074 5.992 7.12e-07 ***
## education 0.31962 0.27979 1.142 0.2608
## typeprof:income -0.36914 0.20388 -1.811 0.0786 .
## typewc:income -0.36031 0.25957 -1.388 0.1736
## typeprof:education 0.01859 0.31837 0.058 0.9538
## typewc:education 0.10677 0.36216 0.295 0.7698
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.647 on 36 degrees of freedom
## Multiple R-squared: 0.9233, Adjusted R-squared: 0.9063
## F-statistic: 54.17 on 8 and 36 DF, p-value: < 2.2e-16
## test against the null model (i.e., only the intercept is not set to 0)
linearHypothesis(mod.duncan.2, coefs[-1])
## Linear hypothesis test
##
## Hypothesis:
## typeprof = 0
## typewc = 0
## income = 0
## education = 0
## typeprof:income = 0
## typewc:income = 0
## typeprof:education = 0
## typewc:education = 0
##
## Model 1: restricted model
## Model 2: prestige ~ type * (income + education)
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 44 43688
## 2 36 3351 8 40337 54.174 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## test all interaction coefficients equal to 0
# linearHypothesis(mod.duncan.2, coefs[grep(":", coefs)], verbose=TRUE)
linearHypothesis(mod.duncan.2, matchCoefs(mod.duncan.2, ":"), verbose=TRUE)
##
## Hypothesis matrix:
## (Intercept) typeprof typewc income education typeprof:income
## typeprof:income 0 0 0 0 0 1
## typewc:income 0 0 0 0 0 0
## typeprof:education 0 0 0 0 0 0
## typewc:education 0 0 0 0 0 0
## typewc:income typeprof:education typewc:education
## typeprof:income 0 0 0
## typewc:income 1 0 0
## typeprof:education 0 1 0
## typewc:education 0 0 1
##
## Right-hand-side vector:
## [1] 0 0 0 0
##
## Estimated linear function (hypothesis.matrix %*% coef - rhs)
## typeprof:income typewc:income typeprof:education typewc:education
## -0.36914256 -0.36030837 0.01859107 0.10677092
##
##
## Estimated variance/covariance matrix for linear function
## typeprof:income typewc:income typeprof:education
## typeprof:income 0.04156710 0.017091995 -0.02462562
## typewc:income 0.01709200 0.067378054 -0.01508772
## typeprof:education -0.02462562 -0.015087722 0.10135862
## typewc:education -0.01508772 -0.009442361 0.07828368
## typewc:education
## typeprof:income -0.015087722
## typewc:income -0.009442361
## typeprof:education 0.078283679
## typewc:education 0.131161899
## Linear hypothesis test
##
## Hypothesis:
## typeprof:income = 0
## typewc:income = 0
## typeprof:education = 0
## typewc:education = 0
##
## Model 1: restricted model
## Model 2: prestige ~ type * (income + education)
##
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 40 3798.0
## 2 36 3350.6 4 447.31 1.2015 0.327
Here the notation \(\beta_5 region\) is not quite right. More precisely, it should be written \(\beta_5 region.NE + \beta_6 region.S + \beta_7 region.W\), where each \(region.X\) variable is a dummy. As written above, \(\beta_5 region\) implies that \(\beta_5\) is a vector, not a scalar.↩
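To see this expansion directly, note that the design matrix R builds for the regression contains one dummy column per non-baseline region (a quick illustration, assuming the est object from above):
head(model.matrix(est)[, c("regionNortheast", "regionSouth", "regionWest")])   # NorthCentral is the omitted baseline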