
Model Complexity

  • Model complexity is, in some sense, a measure of how flexible a model is: that is, how well it can fit a wide variety of data.

  • A model with high complexity can fit a wider variety of data than a model with low complexity.

  • Model complexity is difficult to define formally, but a useful proxy is the number of parameters or predictors in the model.

Model Complexity in Multiple Regression

\[
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix}
1 & x_{1,1} & x_{2,1} & \ldots & x_{k,1} \\
1 & x_{1,2} & x_{2,2} & \ldots & x_{k,2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{1,n} & x_{2,n} & \ldots & x_{k,n}
\end{bmatrix}
\begin{bmatrix} \beta_{0} \\ \beta_{1} \\ \vdots \\ \beta_{k} \end{bmatrix}
+
\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}
\]

  • Small \(k\) means low complexity.

  • Large \(k\) means high complexity. (The sketch below shows this design matrix built in R.)
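
For concreteness, here is a minimal sketch of how R builds exactly this design matrix, with a leading column of ones for the intercept. The data frame and variable names (df, x1, x2) are hypothetical, chosen only for illustration:

# Hypothetical example: design matrix for a model with k = 2 predictors
set.seed(1)
df <- data.frame(x1 = rnorm(5), x2 = rnorm(5))
model.matrix(~ x1 + x2, data = df)  # column of 1s, then x1 and x2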

Model Complexity in Multiple Regression

  • Suppose you have data:
##                 X            Y            Z
##             <num>        <num>        <num>
##   1:  0.001706658  0.713187956  0.230769792
##   2:  0.325315669 -0.569182023 -0.341489535
##   3:  1.405358927 -1.912556564  3.984701959
##   4:  0.197137112  1.822779132 -0.424192313
##   5: -0.872225146  0.606706645 -1.627040375
##   6:  0.430882545  0.852189433 -0.140128704
##   7:  0.407751489  0.017388663 -1.079794258
##   8: -0.155792865 -2.383252909 -1.296175629
##   9:  2.015174963 -0.748073006  3.445457775
##  10: -0.314628492  0.885843216 -1.231192082
##  11:  0.083916396 -0.375766724  0.347826382
##  12: -1.391507864 -1.090750657 -3.525416807
##  13: -1.620325498  2.551827566 -3.380881898
##  14: -0.844384998  0.908607703 -1.231303749
##  15:  0.277037624 -0.068320379  0.673355292
##  16: -0.038001284 -0.250386790  0.014757951
##  17:  1.253108308  0.260696619  2.794130227
##  18: -0.542248089 -0.304030335 -2.535906013
##  19:  0.647602746 -0.357148029  0.555237421
##  20: -1.459077056 -0.574480900 -3.222088378
##  21: -1.045397566  2.923651492 -2.618591982
##  22: -0.002269584  0.668440241 -0.793398140
##  23:  0.558733483 -0.855575358  0.316028465
##  24: -0.384065523  1.470057804 -2.796674325
##  25:  1.196933486  0.893954660  4.086245412
##  26: -0.581246185 -1.221541899 -3.040686097
##  27:  1.089855123 -0.462508118  3.076009478
##  28:  0.549741649 -0.062315323  0.002646412
##  29:  2.191900683 -0.950812911  3.022397989
##  30: -0.539347976 -0.455972244 -1.336649024
##  31: -0.644919791  0.187598272 -1.178768861
##  32:  0.846548932 -1.239180507  1.191387329
##  33: -1.531114734 -0.310221805 -1.851997393
##  34:  0.502742235 -1.001632794  1.359662998
##  35:  1.558582158 -1.486267868  2.111436805
##  36:  0.184959099 -0.954976923 -0.458694050
##  37: -0.259046879 -0.002225506  0.864163650
##  38: -0.423178136 -0.002372927 -0.426130900
##  39: -0.543420955 -1.145142149 -1.873982409
##  40:  0.040186883 -1.076394790  0.356457562
##  41:  0.531895119  0.080835776 -1.727165830
##  42: -0.100611928 -0.123233258 -1.304535929
##  43: -0.205246301  0.266731757  0.898857189
##  44: -2.844908351  1.495989949 -3.147530478
##  45: -0.866334963  1.540676869 -0.686039203
##  46:  0.687900543 -0.514061912  3.091315201
##  47:  2.214485888 -0.338710081  5.211647281
##  48: -0.276580079  0.015022039 -2.373227845
##  49: -1.051117613 -0.783865004  0.731302453
##  50: -0.746483675  1.137617431 -0.865963532
##  51:  0.558486301  0.837829427 -1.023520745
##  52:  0.763090314  1.116178083  0.996336697
##  53:  1.362097079  0.387532037  1.869844800
##  54: -0.327783150  0.461723681 -0.753402033
##  55: -0.858580656 -0.439367042 -0.384999650
##  56:  0.633531666  0.760014785  1.637235944
##  57:  0.726574297  0.475276459  2.135034774
##  58: -0.451229948 -2.052439791  0.333332514
##  59: -0.384002496  0.799148215 -2.442831970
##  60:  0.192990608 -0.450369903 -0.122999301
##  61:  1.170312240  0.998014661  5.198337709
##  62: -0.479256219 -0.598445300 -1.078503337
##  63:  0.890637162 -1.130453977  1.924306789
##  64: -1.193689998 -0.258643701 -3.146457931
##  65:  0.323332639  0.259047635  0.369055232
##  66: -1.468719011 -1.010913102 -2.572816996
##  67: -1.451363181 -0.146579863 -1.007326358
##  68:  0.208154160  0.337038096  0.948164157
##  69: -0.055424617 -0.364918685  1.201448887
##  70:  0.957888563  0.870020644  1.788586817
##  71: -1.060002895  0.018014802 -2.035972224
##  72:  1.011805654 -0.111728594 -0.784053434
##  73:  0.875437981  0.837864404  2.653860483
##  74: -1.924106890 -0.021017601 -3.153781928
##  75:  0.919486219  1.773911731  1.327182026
##  76:  1.970351734 -0.245290571  5.717633249
##  77:  1.389805652 -0.541092641  3.419869409
##  78: -0.285525136 -1.032483168 -0.090490675
##  79:  0.646441584 -2.480666726 -0.585682044
##  80:  1.329011972  0.258603544  3.327587141
##  81:  0.132428434  1.766302980 -0.027673002
##  82:  1.130462813 -0.485519992  1.579354986
##  83:  0.098787873  1.078463366 -1.654249567
##  84: -1.058566595 -1.082326955 -1.890293107
##  85:  1.715393529  0.443195423  4.246935237
##  86: -0.671379623 -0.852058688 -2.859163574
##  87: -0.971594867 -1.576657454 -2.401586329
##  88: -0.065803183 -0.109085256 -1.233285905
##  89: -0.543737154 -1.758358676 -1.797776562
##  90: -1.825884457 -0.490940869 -3.325418545
##  91: -0.545577670  0.882259368 -1.568707694
##  92: -0.633665231  0.733615030 -0.978821389
##  93: -0.905107669 -1.820140555 -2.186250840
##  94: -1.876828489  1.830590248 -3.634225551
##  95: -0.642196746  0.157227046 -1.421257332
##  96:  0.763808446 -0.496418179  0.707441376
##  97:  0.267016353  2.389316325 -0.279590556
##  98:  0.263301514  0.019960427  0.206321751
##  99: -0.726350992 -0.262248716  0.140564358
## 100:  1.464210640 -0.461711811  3.135614580
##                 X            Y            Z
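
The generating code is not shown, but data with this structure could be simulated along the following lines. This is a hypothetical sketch, not the actual source: the seed and the true coefficient on X are assumptions.

library(data.table)
set.seed(42)                           # hypothetical seed
n <- 100
d <- data.table(X = rnorm(n), Y = rnorm(n))
d[, Z := 2 * X + rnorm(n)]             # assumed form: Z depends on X, not on Y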

Model Complexity in Multiple Regression

[Scatter plots of \(Z\) against \(X\) and of \(Z\) against \(Y\)]

  • Consider a low complexity model that includes only \(X\) as a predictor of \(Z\).

  • Consider a high complexity model that includes \(X\) and \(Y\) as predictors of \(Z\).

fm_low  <- lm(Z ~ X,     data = d)  # low complexity: X only
fm_high <- lm(Z ~ X + Y, data = d)  # high complexity: X and Y

  • Based on the scatter plots above, we expect \(Z\) to depend strongly on \(X\) and not at all on \(Y\).

  • We therefore expect both the low and high complexity models to fit the data well.

  • This is because the low complexity model isn't missing anything important.

Multiple Regression: Low Complexity

## 
## Call:
## lm(formula = Z ~ X, data = d)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6185 -0.6888 -0.1027  0.5749  3.1159 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.1010     0.1125  -0.898    0.371    
## X             1.8657     0.1131  16.499   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.125 on 98 degrees of freedom
## Multiple R-squared:  0.7353, Adjusted R-squared:  0.7326 
## F-statistic: 272.2 on 1 and 98 DF,  p-value: < 2.2e-16

Multiple Regression: High Complexity

## 
## Call:
## lm(formula = Z ~ X + Y, data = d)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6242 -0.6823 -0.1061  0.5812  3.0773 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.10006    0.11307  -0.885    0.378    
## X            1.86951    0.11430  16.356   <2e-16 ***
## Y            0.03326    0.10976   0.303    0.763    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.13 on 97 degrees of freedom
## Multiple R-squared:  0.7355, Adjusted R-squared:  0.7301 
## F-statistic: 134.9 on 2 and 97 DF,  p-value: < 2.2e-16
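
Because the two models are nested, they can also be compared directly with an F test. Adding \(Y\) should not significantly improve the fit; this is a quick check using the fits from above:

anova(fm_low, fm_high)  # F test for whether adding Y improves the fit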

Penalizing complexity with adjusted \(R^2\)

  • The adjusted \(R^2\) applies a penalty for adding predictors:

\[ R^2_{\text{adj}} = 1 - \left(1 - R^2\right) \frac{n - 1}{n - p - 1} \]

  • \(R^2\) is the ordinary coefficient of determination.

  • \(n\) is the number of observations.

  • \(p\) is the number of predictors.

  • The factor \(\frac{n - 1}{n - p - 1}\) is the penalty: it grows as the number of predictors \(p\) approaches the sample size \(n\), as the check below illustrates.
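
As a sanity check, the formula can be applied by hand to the high complexity fit and compared with what summary() reports (using the fits from above):

r2 <- summary(fm_high)$r.squared          # ordinary R-squared
n  <- nrow(d)                             # 100 observations
p  <- 2                                   # predictors: X and Y
1 - (1 - r2) * (n - 1) / (n - p - 1)      # adjusted R-squared by hand
summary(fm_high)$adj.r.squared            # same value, reported by summary()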

How does Adjusted \(R^2\) change with increasing complexity?
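
One way to see the pattern is a short simulation: keep adding pure-noise predictors to the model and track ordinary versus adjusted \(R^2\). This is a hedged sketch; the seed and the number of noise predictors are arbitrary choices:

set.seed(7)                                   # arbitrary seed
r2 <- adj <- numeric(20)
for (k in 1:20) {
  junk   <- replicate(k, rnorm(nrow(d)))      # k uninformative predictors
  fit    <- lm(d$Z ~ d$X + junk)
  r2[k]  <- summary(fit)$r.squared
  adj[k] <- summary(fit)$adj.r.squared
}
plot(1:20, r2, type = "l", xlab = "noise predictors added", ylab = "R-squared")
lines(1:20, adj, lty = 2)                     # adjusted R-squared stays flat or falls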

Identifying the important predictors
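
A simple first pass at identifying the important predictors is to inspect the coefficient table of the full fit: \(X\) has a tiny p-value while \(Y\) does not (a sketch using fm_high from above):

coef(summary(fm_high))  # t statistics and p-values for each predictor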

Adjusted \(R^2\): Strengths and Weaknesses

  • Adjusted \(R^2\) partially corrects for adding extra predictors by applying a penalty for model complexity.

  • The penalty is relatively mild, and adjusted \(R^2\) can still increase when adding uninformative predictors, especially with small sample sizes.

  • Adjusted \(R^2\) is specific to linear regression and may not generalize to other model types.

  • More robust model selection tools (like AIC and BIC) incorporate likelihood-based comparisons and work across a wider range of models, as sketched below.
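
Both criteria are one-liners in R and, since \(Y\) contributes nothing here, they should favour the simpler model (a quick sketch using the fits from above):

AIC(fm_low, fm_high)  # lower is better; penalty of 2 per parameter
BIC(fm_low, fm_high)  # stronger penalty, log(n) per parameter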