2024

Introduction to Variability

  • Variability refers to how spread out or closely clustered a set of data is.

  • Common measures include the range, variance, and standard deviation.

Sample Range

  • The sample range is the simplest measure of variability.

  • It is calculated as the difference between the maximum and minimum values in a set of observations.

  • For a sample \(\boldsymbol{x}\), the range is:

\[ \text{Range} = \text{Max}(\boldsymbol{x}) - \text{Min}(\boldsymbol{x}) \]

Sample Range Example

  • Consider a sample \(\boldsymbol{x} = 9, 15, 24, 3, 18\)

  • The sample range is: \[ \text{Range} = 24 - 3 = 21 \]

The range function in R

x <- c(9, 15, 24, 3, 18)
range(x)
## [1]  3 24
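
Note that `range()` returns the minimum and maximum, not their difference. A minimal sketch of getting the sample range as a single number, using base R's `diff()` (the variable name `sample_range` is only illustrative):

```r
x <- c(9, 15, 24, 3, 18)

# range(x) gives c(min, max); diff() subtracts the first from the second
sample_range <- diff(range(x))
sample_range
## [1] 21

# Equivalent, written out explicitly
max(x) - min(x)
## [1] 21
```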

Sample Variance

  • The **sample variance** quantifies the average squared deviation of a set of observations from their mean (dividing by \(n-1\) rather than \(n\)).

  • It is denoted by \(s^2\).

\[ \begin{align} s^2 &= \frac{1}{n-1}( (x_1 - \overline{\boldsymbol{x}})^2 + (x_2 - \overline{\boldsymbol{x}})^2 + \ldots + (x_n - \overline{\boldsymbol{x}})^2 )\\ &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \overline{\boldsymbol{x}})^2 \end{align} \]

Sample Variance Example

  • Given the same sample \(\boldsymbol{x} = 9, 15, 24, 3, 18\), with sample mean \(\overline{\boldsymbol{x}} = 69/5 = 13.8\):

\[ \begin{align} s^2 &= \frac{1}{5-1}( (9-13.8)^2 + (15-13.8)^2 + (24-13.8)^2 + (3-13.8)^2 + (18-13.8)^2)\\ &= \frac{1}{4}(23.04 + 1.44 + 104.04 + 116.64 + 17.64)\\ &= \frac{1}{4}(262.8)\\ &= 65.7 \end{align} \]

The var function in R

x <- c(9, 15, 24, 3, 18)
var(x)
## [1] 65.7
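
To see what `var()` is doing, the formula for \(s^2\) can be spelled out step by step. This is a sketch only; the intermediate names `n`, `x_bar`, `devs`, and `s2` are illustrative and not part of base R:

```r
x <- c(9, 15, 24, 3, 18)

n     <- length(x)               # number of observations
x_bar <- mean(x)                 # sample mean
devs  <- x - x_bar               # deviations from the mean
s2    <- sum(devs^2) / (n - 1)   # sum of squared deviations over n - 1

# Compare with the built-in, allowing for floating-point rounding
all.equal(s2, var(x))
## [1] TRUE
```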

Sample Standard Deviation

  • The sample standard deviation is the square root of the sample variance.

  • It measures the typical distance of data points from their mean, in the same units as the data.

  • Formula: \(s = \sqrt{s^2}\).

Sample Standard Deviation Example

  • Using the same sample \(\boldsymbol{x} = 9, 15, 24, 3, 18\), the sample standard deviation is:

\[ s = \sqrt{65.7} \approx 8.11 \]

The sd function in R

sd(x)
## [1] 8.105554
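
A short sketch checking that `sd()` agrees with the square root of `var()`, and illustrating why the standard deviation is often easier to interpret: rescaling the data by a factor rescales the standard deviation by the same factor, but the variance by its square. The factor of 100 is an arbitrary choice for illustration:

```r
x <- c(9, 15, 24, 3, 18)

# Standard deviation as the square root of the variance
s <- sqrt(var(x))
all.equal(s, sd(x))
## [1] TRUE

# Rescaling the data by 100 (e.g. metres to centimetres)
all.equal(sd(100 * x), 100 * sd(x))       # sd scales by 100
## [1] TRUE
all.equal(var(100 * x), 100^2 * var(x))   # variance scales by 100^2
## [1] TRUE
```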

Visualizing Variability: Histogram
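
One way to see variability is to draw a histogram and mark the mean and one standard deviation on either side of it. The sketch below uses base R graphics on a simulated sample; the sample `y`, the seed, and the colors are assumptions made purely for illustration:

```r
set.seed(1)                           # assumed seed, for reproducibility only
y <- rnorm(200, mean = 50, sd = 10)   # hypothetical sample of 200 values

# Draw the histogram; hist() also returns the bin counts invisibly
h <- hist(y, main = "Histogram of y", xlab = "y")

# Solid line at the sample mean
abline(v = mean(y), col = "red", lwd = 2)

# Dashed lines one standard deviation below and above the mean
abline(v = mean(y) + c(-1, 1) * sd(y), col = "blue", lty = 2)
```

A wider gap between the dashed lines corresponds to a more spread-out sample.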

Sample Skewness

  • Sample skewness measures the asymmetry of the distribution of data.

  • Positive skewness indicates a longer right tail, negative skewness a longer left tail, and a value near zero a roughly symmetric distribution.

Sample Skewness Example
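
Base R has no built-in skewness function, so the sketch below computes it by hand. It assumes the moment-based definition \(m_3 / m_2^{3/2}\), where \(m_k\) is the \(k\)-th central moment with divisor \(n\); other definitions apply small-sample corrections and give slightly different values. The function name `sample_skewness` is illustrative:

```r
# Moment-based sample skewness: m3 / m2^(3/2)
# (assumed definition; central moments use divisor n)
sample_skewness <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^(3 / 2)
}

x <- c(9, 15, 24, 3, 18)
sample_skewness(x)                    # slightly negative: mild left skew

# One large value on the right pulls the skewness positive
sample_skewness(c(1, 2, 3, 4, 100))
```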

Sample Kurtosis

  • Sample kurtosis measures the “tailedness” of the distribution.

  • It is usually reported as excess kurtosis (kurtosis minus 3), so that the normal distribution has a value of zero.

Sample Kurtosis Example

  • Leptokurtic: Distributions with positive excess kurtosis (sharper peak and fatter tails than the normal distribution). Example: Student's t-distribution with few degrees of freedom.

  • Platykurtic: Distributions with negative excess kurtosis (flatter peak and thinner tails than the normal distribution). Example: Uniform distribution.

  • Mesokurtic: Distributions with zero excess kurtosis, similar to the normal distribution. The standard normal distribution itself is an example.
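
The three cases can be checked numerically. The sketch below assumes the moment-based definition of excess kurtosis, \(m_4 / m_2^2 - 3\), with central moments using divisor \(n\); the function name `excess_kurtosis`, the seed, the sample size, and the choice of 5 degrees of freedom are all illustrative assumptions:

```r
# Moment-based excess kurtosis: m4 / m2^2 - 3 (assumed definition)
excess_kurtosis <- function(v) {
  d <- v - mean(v)
  mean(d^4) / mean(d^2)^2 - 3
}

x <- c(9, 15, 24, 3, 18)
excess_kurtosis(x)               # negative: platykurtic

set.seed(1)                      # assumed seed, for reproducibility only
n <- 1e5                         # large sample so estimates are stable

excess_kurtosis(runif(n))        # platykurtic: close to -1.2
excess_kurtosis(rnorm(n))        # mesokurtic: close to 0
excess_kurtosis(rt(n, df = 5))   # leptokurtic: clearly positive
```

Estimates for heavy-tailed distributions such as the t-distribution vary a lot from sample to sample, so only the sign should be relied on here.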

Sample Kurtosis Example Visual

Question

Consider the following histogram of a sample of data:

Which lines show the standard deviation, variance, range, and mean of the sample?

Correct Answer: The cyan line shows the mean, the magenta line shows the standard deviation, the yellow line shows the variance, and the maroon lines show the range.

Question

Consider the following histogram of a sample of data:

Which lines show the standard deviation, variance, range, and mean of the sample?

Correct Answer: The cyan line shows the mean, the limegreen line shows the standard deviation, the orange line shows the variance, and the deepskyblue lines show the range.