First, calculate the coefficients (slope beta and intercept) by hand for the model in which the dependent variable horsepower (hp) is regressed on the independent variable miles per gallon (mpg), using the dataset mtcars.
## [1] -8.829731
## [1] 324.0823
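One way to obtain the two values printed above is the closed-form least-squares solution for a single predictor. This is a minimal sketch, not necessarily the approach used in the lecture; the variable names x, y, beta and intercept are chosen here for illustration.

```r
# Simple linear regression hp = intercept + beta * mpg, computed by hand
x <- mtcars$mpg   # independent variable
y <- mtcars$hp    # dependent variable

# Slope: covariance of x and y divided by the variance of x
beta <- cov(x, y) / var(x)

# Intercept: the least-squares line passes through the point of means
intercept <- mean(y) - beta * mean(x)

beta
intercept
```

The negative slope means that cars with a higher mpg tend to have lower horsepower.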
Plot the two variables. Add the line based on the estimates that you calculated above.
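A possible sketch of the plot with base graphics follows. The estimates are recomputed so that the block runs on its own; the axis labels are illustrative choices.

```r
x <- mtcars$mpg
y <- mtcars$hp

# Hand-calculated least-squares estimates
beta <- cov(x, y) / var(x)
intercept <- mean(y) - beta * mean(x)

# Scatterplot of the two variables
plot(x, y, xlab = "Miles per gallon (mpg)", ylab = "Horsepower (hp)")

# abline(a, b) draws the line y = a + b * x
abline(a = intercept, b = beta)
```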
Calculate the fit (R-squared) of the model above. Check your result with the lm function; see ?lm for its documentation.
## [1] 0.6024373
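The value above can be reproduced, for example, as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes this definition of fit and then compares the hand-calculated numbers with lm; the names ss_res, ss_tot and fit are illustrative.

```r
x <- mtcars$mpg
y <- mtcars$hp
beta <- cov(x, y) / var(x)
intercept <- mean(y) - beta * mean(x)

# Fitted values and sums of squares
fitted_y <- intercept + beta * x
ss_res <- sum((y - fitted_y)^2)   # unexplained variation
ss_tot <- sum((y - mean(y))^2)    # total variation around the mean

r_squared <- 1 - ss_res / ss_tot
r_squared

# Check against lm
fit <- lm(hp ~ mpg, data = mtcars)
coef(fit)                 # should agree with intercept and beta
summary(fit)$r.squared    # should agree with r_squared
```

With a single predictor, the same number also equals the squared correlation, cor(x, y)^2.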
Dummy variables have two levels, like gender. To test differences between men and women in an interval variable, you can use a linear model or a t-test. In this exercise you will do both to see how these solutions compare. Try to interpret the coefficients as discussed during the lecture. Use the dataset juul (see last lecture). First, conduct a linear regression to explain igf1 (insulin-like growth factor) by sex.
##
## Call:
## lm(formula = igf1 ~ factor(sex), data = juul)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -343.10 -135.10  -23.89  117.90  604.11
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)   310.887      7.722  40.262  < 2e-16 ***
## factor(sex)2   57.214     10.605   5.395 8.54e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 168.5 on 1011 degrees of freedom
## (326 observations deleted due to missingness)
## Multiple R-squared: 0.02798, Adjusted R-squared: 0.02702
## F-statistic: 29.1 on 1 and 1011 DF, p-value: 8.54e-08
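A call that matches the formula shown in the output above is sketched below. It assumes that juul comes from the ISwR package, which the exercise does not state. The group means are added as an aid for interpreting the coefficients: with a dummy predictor, the intercept is the mean of the reference group (sex = 1) and the slope is the difference between the two group means.

```r
library(ISwR)  # assumption: juul is the dataset shipped with ISwR

# Regression of igf1 on sex, treated as a two-level factor
fit_sex <- lm(igf1 ~ factor(sex), data = juul)
summary(fit_sex)

# Mean igf1 per sex, ignoring missing igf1 values
group_means <- tapply(juul$igf1, juul$sex, mean, na.rm = TRUE)
group_means

# Intercept = mean of group 1; slope = mean of group 2 minus mean of group 1
coef(fit_sex)
group_means[2] - group_means[1]
```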
Then compare with t.test. Use var.equal=TRUE.
##
## Two Sample t-test
##
## data: juul$igf1 by juul$sex
## t = -5.3949, df = 1011, p-value = 8.54e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -78.02476 -36.40325
## sample estimates:
## mean in group 1 mean in group 2
##        310.8866        368.1006
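The comparison can be sketched as follows, again assuming juul is loaded from the ISwR package. With var.equal=TRUE the two-sample t-test pools the variances, which is the same assumption the linear model makes, so the t statistic (apart from its sign) and the p-value should coincide with those of the sex coefficient. The object names tt, lm_t and lm_p are illustrative.

```r
library(ISwR)  # assumption: juul is the dataset shipped with ISwR

# Two-sample t-test with pooled variance
tt <- t.test(igf1 ~ sex, data = juul, var.equal = TRUE)
tt

# Same comparison expressed as a linear model
fit_sex <- lm(igf1 ~ factor(sex), data = juul)
lm_t <- summary(fit_sex)$coefficients[2, "t value"]
lm_p <- summary(fit_sex)$coefficients[2, "Pr(>|t|)"]

# The sign differs because t.test reports group 1 minus group 2,
# while the lm coefficient is group 2 minus group 1
c(t_test = unname(tt$statistic), lm = lm_t)
c(t_test = tt$p.value, lm = lm_p)
```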
Relate age to igf1 (both interval variables), plot the variables, add the regression line. What do you see?
##
## Call:
## lm(formula = igf1 ~ age, data = juul)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -334.78 -134.38  -25.29  121.52  570.16
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 360.8602     9.0219  39.998  < 2e-16 ***
## age          -1.1918     0.4408  -2.704  0.00697 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 170.3 on 1011 degrees of freedom
## (326 observations deleted due to missingness)
## Multiple R-squared: 0.00718, Adjusted R-squared: 0.006198
## F-statistic: 7.311 on 1 and 1011 DF, p-value: 0.006967
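A minimal sketch of this last step is given below, assuming once more that juul comes from the ISwR package; the axis labels are illustrative. Passing the fitted model to abline draws the regression line directly from its coefficients.

```r
library(ISwR)  # assumption: juul is the dataset shipped with ISwR

# Regression of igf1 on age
fit_age <- lm(igf1 ~ age, data = juul)
summary(fit_age)

# Scatterplot with the fitted regression line
plot(igf1 ~ age, data = juul, xlab = "Age (years)", ylab = "igf1")
abline(fit_age)
```

When looking at the plot, compare the shape of the point cloud with the straight line: the R-squared of about 0.007 in the output above indicates that a single line explains very little of the variation in igf1.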