The purpose of in-class lab 7 is to practice conducting hypothesis tests about regression parameters in R. The lab should be completed in your group. To get credit, upload your .R script to the appropriate place on Canvas.

7.1 For starters

Open up a new R script (named ICL7_XYZ.R, where XYZ are your initials) and add the usual “preamble” to the top:

# Add names of group members HERE
library(tidyverse)
library(broom)
library(wooldridge)
library(magrittr)

7.1.1 Load the data

We’ll use a new data set on Research and Development (R&D) expenditures, called rdchem. The data set contains information on 32 companies in the chemical industry.

df <- as_tibble(rdchem)

Check out what’s in the data by typing

glimpse(df)
## Observations: 32
## Variables: 8
## $ rd       <dbl> 430.6, 59.0, 23.5, 3.5, 1.7, 8.4, 2.5, 39.9, 1136.0, 1428....
## $ sales    <dbl> 4570.2, 2830.0, 596.8, 133.6, 42.0, 390.0, 93.9, 907.9, 19...
## $ profits  <dbl> 186.9, 467.0, 107.4, -4.3, 8.0, 47.3, 0.9, 77.4, 2563.0, 4...
## $ rdintens <dbl> 9.421906, 2.084806, 3.937668, 2.619760, 4.047619, 2.153846...
## $ profmarg <dbl> 4.0895362, 16.5017662, 17.9959793, -3.2185628, 19.0476189,...
## $ salessq  <dbl> 2.088673e+07, 8.008900e+06, 3.561702e+05, 1.784896e+04, 1....
## $ lsales   <dbl> 8.427312, 7.948032, 6.391582, 4.894850, 3.737670, 5.966147...
## $ lrd      <dbl> 6.0651798, 4.0775375, 3.1570003, 1.2527629, 0.5306283, 2.1...

The main variables are measures of R&D, profits, sales, and profits as a percentage of sales (profmarg, i.e. profit margin).
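
Several columns are transformations of others, which is easy to confirm before modeling. The following is a minimal sketch in base R that re-enters the first four rows shown in the glimpse(df) output (rather than using the full df) and checks that profmarg is profits as a percentage of sales, that salessq is sales squared, and that lsales is the natural log of sales. The names first4 and tol are hypothetical, and the tolerance is an assumption to absorb rounding in the printed values.

```r
# First four rows, copied from the glimpse(df) output above
first4 <- data.frame(
  sales    = c(4570.2, 2830.0, 596.8, 133.6),
  profits  = c(186.9, 467.0, 107.4, -4.3),
  profmarg = c(4.0895362, 16.5017662, 17.9959793, -3.2185628),
  salessq  = c(2.088673e+07, 8.008900e+06, 3.561702e+05, 1.784896e+04),
  lsales   = c(8.427312, 7.948032, 6.391582, 4.894850)
)

# Tolerance (assumed) allows for rounding in the printed values
tol <- 1e-5

# profmarg: profits as a percentage of sales
profmarg_ok <- isTRUE(all.equal(first4$profmarg,
                                100 * first4$profits / first4$sales,
                                tolerance = tol))

# salessq: sales squared
salessq_ok <- isTRUE(all.equal(first4$salessq, first4$sales^2, tolerance = tol))

# lsales: natural log of sales
lsales_ok <- isTRUE(all.equal(first4$lsales, log(first4$sales), tolerance = tol))

print(c(profmarg = profmarg_ok, salessq = salessq_ok, lsales = lsales_ok))
```

The same comparisons can be run on the full data by replacing first4 with df.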

7.2 Regression and Hypothesis Testing

Estimate the following regression model:

\[ rdintens = \beta_0 + \beta_1 \log(sales) + \beta_2 profmarg + u \]

Note that the variable \(\log(sales)\) already exists in df as lsales. \(rdintens\) is in percentage units, so a value of 2.6 means that the company’s total R&D expenditures are 2.6% of its sales.
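
Because sales enters in logs while \(rdintens\) is measured in levels (percentage points), the slope is read through the usual level-log approximation, which is accurate for small proportional changes and holds \(profmarg\) and \(u\) fixed:

\[ \Delta rdintens = \beta_1 \, \Delta \log(sales) \approx \frac{\beta_1}{100} \, \%\Delta sales \]

So a 1% increase in sales is associated with a change of roughly \(\beta_1 / 100\) percentage points in \(rdintens\).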

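# Optional: browse the data interactively (requires the DT package to be installed)
df %>% DT::datatable()
# Estimate the model by OLS
est <- lm(rdintens ~ lsales + profmarg, data=df)
# Coefficient table as a tibble (one row per term)
tidy(est)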
## # A tibble: 3 x 5
##   term        estimate std.error statistic p.value
##   <chr>          <dbl>     <dbl>     <dbl>   <dbl>
## 1 (Intercept)   0.472     1.68       0.282   0.780
## 2 lsales        0.321     0.216      1.49    0.147
## 3 profmarg      0.0500    0.0458     1.09    0.283
summary(est)
## 
## Call:
## lm(formula = rdintens ~ lsales + profmarg, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.3016 -1.2707 -0.6895  0.8785  6.0369 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.47225    1.67606   0.282    0.780
## lsales       0.32135    0.21557   1.491    0.147
## profmarg     0.05004    0.04578   1.093    0.283
## 
## Residual standard error: 1.839 on 29 degrees of freedom
## Multiple R-squared:  0.09847,    Adjusted R-squared:  0.0363 
## F-statistic: 1.584 on 2 and 29 DF,  p-value: 0.2224
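
The t value and Pr(>|t|) columns (statistic and p.value in tidy(est)) can be rebuilt by hand, which makes explicit what each test uses. This is a minimal sketch in base R, assuming the rounded estimates and standard errors printed by summary(est) and its 29 residual degrees of freedom; the object names are illustrative, and the results match the printed output only up to rounding.

```r
# Estimates and standard errors copied from the summary(est) output above
estimate  <- c(`(Intercept)` = 0.47225, lsales = 0.32135, profmarg = 0.05004)
std_error <- c(`(Intercept)` = 1.67606, lsales = 0.21557, profmarg = 0.04578)

# Residual degrees of freedom: 32 observations minus 3 estimated parameters
df_resid <- 29

# t statistic for H0: coefficient = 0
t_stat <- estimate / std_error

# Two-sided p-value from the t distribution with df_resid degrees of freedom
p_two_sided <- 2 * pt(abs(t_stat), df = df_resid, lower.tail = FALSE)

# Side-by-side table, rounded like the printed output
print(round(cbind(t_stat, p_two_sided), 3))
```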

Answer the following questions:

  1. Interpret the coefficient on lsales. If \(sales\) increases by 10%, what is the estimated percentage point change in \(rdintens\)?
  2. Is this an economically significant relationship?
  3. Using the output of tidy(est), test at the 10% level whether sales affects R&D intensity. In other words, test:

\[ \begin{aligned} H_0&: \beta_1 = 0 \\ H_a&: \beta_1 \neq 0 \end{aligned} \]

  4. Does your answer to (3) change if you instead consider a one-sided alternative? i.e. \[ H_a: \beta_1 > 0 \]
  5. Now consider the \(\beta_2\) parameter. Is there a statistically significant effect of profit margin on R&D intensity?
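
The rejection rules behind the two-sided and one-sided tests in the questions above can be sketched with critical values from the t distribution. This is one way to set up the comparison, not the only one; it assumes the rounded t statistics printed by tidy(est), 29 residual degrees of freedom, and a 10% significance level, and all object names are hypothetical.

```r
# t statistics copied from the tidy(est) output above
t_lsales   <- 1.49
t_profmarg <- 1.09
df_resid   <- 29    # 32 observations minus 3 estimated parameters
alpha      <- 0.10  # significance level

# Two-sided test: reject H0 if |t| exceeds the 1 - alpha/2 quantile
crit_two_sided <- qt(1 - alpha / 2, df = df_resid)

# One-sided test (Ha: beta > 0): reject H0 if t exceeds the 1 - alpha quantile
crit_one_sided <- qt(1 - alpha, df = df_resid)

reject_two_sided_lsales   <- abs(t_lsales) > crit_two_sided
reject_one_sided_lsales   <- t_lsales > crit_one_sided
reject_two_sided_profmarg <- abs(t_profmarg) > crit_two_sided

# One-sided p-value for Ha: beta_1 > 0 is the upper-tail probability
p_one_sided_lsales <- pt(t_lsales, df = df_resid, lower.tail = FALSE)

print(c(crit_two_sided = crit_two_sided, crit_one_sided = crit_one_sided))
print(c(lsales_two_sided   = reject_two_sided_lsales,
        lsales_one_sided   = reject_one_sided_lsales,
        profmarg_two_sided = reject_two_sided_profmarg))
```

When the t statistic is positive, the one-sided p-value for a "greater than" alternative is half the two-sided p-value, so comparing it with alpha gives the same decision as the critical-value rule.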