7 R-Squared
The coefficient of determination, often referred to as $R^2$, is the proportion of the variance in the outcome variable that is explained by the model.
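To begin, we must first define variance. Broadly, variance is a measure of spread around the mean. Most textbooks provide the following equation for variance:
$$ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} $$
Total sum of squares:
$$ SS_{total} = \sum_{i=1}^{n} (y_i - \bar{y})^2 $$
Regression sum of squares:
$$ SS_{regression} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 $$
Error sum of squares:
$$ SS_{error} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 $$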
Thinking of variance as a measure of variation is generally approachable for students, and interestingly, Wright (1921) used the word variation instead of variance in his seminal paper. The key for students is in the numerator, where the mean is subtracted from each observation (i.e., $x_i - \bar{x}$).
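To make the numerator concrete, here is a minimal sketch in base R that computes a sample variance by hand and compares it with the built-in `var()`. The vector `x` is hypothetical data invented for this illustration, and the sketch assumes the sample (n - 1) form of the variance.

```r
# Hypothetical data, used only for illustration
x <- c(2, 4, 4, 5, 7, 8)

deviations     <- x - mean(x)        # the mean subtracted from each observation
sum_of_squares <- sum(deviations^2)  # the numerator: sum of squared deviations

# Sample variance: divide by n - 1
variance_by_hand <- sum_of_squares / (length(x) - 1)
variance_by_hand  # 4.8

# Should agree with R's built-in sample variance
all.equal(variance_by_hand, var(x))
```

Squaring the deviations keeps positive and negative differences from cancelling, which is why the numerator is always non-negative.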
It turns out there are several ways of calculating $R^2$.
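As a rough sketch of that point, the base R code below computes the total, regression, and error sums of squares for a fitted linear model and then obtains $R^2$ three different ways. The data are simulated here with `rnorm()` purely for illustration (they are not the data produced by `VisualStats::simulate()`), and the variable names are assumptions of this sketch.

```r
set.seed(42)

# Hypothetical simulated data for illustration only
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 5 + 2.5 * x1 - 1.5 * x2 + rnorm(n, sd = 2)

fit   <- lm(y ~ x1 + x2)
y_hat <- fitted(fit)

ss_total      <- sum((y - mean(y))^2)      # total sum of squares
ss_regression <- sum((y_hat - mean(y))^2)  # regression sum of squares
ss_error      <- sum((y - y_hat)^2)        # error sum of squares

# For least squares with an intercept, the total splits into the two parts
all.equal(ss_total, ss_regression + ss_error)

# Three equivalent ways of calculating R-squared
r2_a <- ss_regression / ss_total   # explained share of the total
r2_b <- 1 - ss_error / ss_total    # one minus the unexplained share
r2_c <- cor(y, y_hat)^2            # squared correlation of observed and fitted

# All three should match the value reported by summary()
c(r2_a, r2_b, r2_c, summary(fit)$r.squared)
```

The equivalence of the three calculations depends on the model being fit by ordinary least squares with an intercept; without an intercept the decomposition of the total sum of squares no longer holds in this form.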
7.0.1 Example 2
library(VisualStats)
set.seed(42)
df <- VisualStats::simulate(n = 100, r_squared = .7)
formu <- y ~ x1 + x2
lm(formu, df) |> summary()
Call:
lm(formula = formu, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-6.0818 -1.5623 -0.1948  1.5037  5.9495

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.0042     0.2405   20.80  < 2e-16 ***
x1            2.6608     0.2310   11.52  < 2e-16 ***
x2           -1.7987     0.2661   -6.76 1.04e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.393 on 97 degrees of freedom
Multiple R-squared:  0.6416,  Adjusted R-squared:  0.6342
F-statistic: 86.82 on 2 and 97 DF,  p-value: < 2.2e-16
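The last two lines of the output above can be checked by hand. The sketch below starts from the rounded Multiple R-squared and recovers the adjusted R-squared and the F-statistic using the standard formulas. It assumes n = 100 (from the `simulate(n = 100, ...)` call, consistent with 97 residual degrees of freedom plus three estimated coefficients) and k = 2 predictors. Because the input is rounded to four decimals, the results are only expected to agree to the printed precision.

```r
# Values copied from the summary() output above
r2 <- 0.6416  # Multiple R-squared (rounded)
n  <- 100     # observations: 97 residual df + 3 estimated coefficients
k  <- 2       # predictors: x1 and x2

# Adjusted R-squared penalizes for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Overall F-statistic expressed in terms of R-squared
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

round(adj_r2, 4)  # 0.6342
round(f_stat, 2)  # 86.82
```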
r_squared_vis(df, formu,
plot_total_variance = FALSE,
plot_error_variance = FALSE,
plot_regression_variance = FALSE,
plot_all_variances = FALSE,
plot_residuals_squared = FALSE,
plot_residuals = FALSE)
# Left panel: scatter plot with the total variance drawn
p1 <- r_squared_vis(df, formu,
                    plot_total_variance = TRUE,
                    plot_error_variance = FALSE,
                    plot_regression_variance = FALSE,
                    plot_all_variances = FALSE,
                    plot_residuals_squared = FALSE,
                    plot_residuals = FALSE) +
  ggplot2::ylim(c(-20,20)) +
  ggplot2::xlim(c(-20,20)) + ggplot2::ggtitle('')
# Right panel: sample variance of y on its own
p2 <- variance_vis(df$y,
                   sample_variance_col = '#999999',
                   plot_sample_variance = TRUE,
                   plot_population_variance = FALSE,
                   variance_position = 'middle',
                   point_size = 1) +
  ggplot2::ylim(c(0,20))
# Place the two plots side by side
cowplot::plot_grid(p1, p2)
r_squared_vis(df, formu,
plot_total_variance = FALSE,
plot_error_variance = FALSE,
plot_regression_variance = FALSE,
plot_all_variances = FALSE,
plot_residuals_squared = TRUE,
plot_residuals = TRUE)
r_squared_vis(df, formu,
plot_total_variance = FALSE,
plot_error_variance = TRUE,
plot_regression_variance = FALSE,
plot_all_variances = FALSE,
plot_residuals_squared = TRUE,
plot_residuals = TRUE)
r_squared_vis(df, formu,
plot_total_variance = TRUE,
plot_error_variance = TRUE,
plot_regression_variance = FALSE,
plot_all_variances = FALSE,
plot_residuals_squared = FALSE,
plot_residuals = FALSE)