2024

Introduction to Central Tendency

  • Measures of central tendency describe where the center of a set of observations lies, i.e., the value around which most of the data cluster.

  • Common measures include the sample mean, sample median, and sample mode.

Sample Mean

  • The sample mean is denoted by \(\overline{\boldsymbol{x}}\).

  • It is the average of all observations in a sample.

\[ \begin{align} \overline{\boldsymbol{x}} &= \frac{x_1 + x_2 + \ldots + x_{n}}{n}\\ &= \frac{1}{n} \sum_{i=1}^{n} x_{i} \end{align} \]

Sample Mean Example

  • Sample \(\boldsymbol{x} = 55, 35, 23, 44, 31\)

  • The sample mean is: \[ \begin{align} \overline{\boldsymbol{x}} &= \frac{55 + 35 + 23 + 44 + 31}{5}\\ &= \frac{188}{5}\\ &= 37.6 \end{align} \]

The mean function in R

# define a vector of observations
x <- c(55, 35, 23, 44, 31)

# calculate the mean
mean(x)
## [1] 37.6
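
As a quick check, the value returned by `mean()` can be reproduced directly from the definition of the sample mean, using `sum()` for the numerator and `length()` for the sample size. This is only a sketch; the name `manual_mean` is introduced here for illustration and is not part of the original slides.

```r
# define a vector of observations
x <- c(55, 35, 23, 44, 31)

# sum of the observations divided by the sample size n
manual_mean <- sum(x) / length(x)
manual_mean
## [1] 37.6

# agrees with the built-in function
all.equal(manual_mean, mean(x))
## [1] TRUE
```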

Sample Mean visualisation: One sample

Sample Mean visualisation: Two samples

Sample Median

  • The sample median is denoted by \(\widetilde{\boldsymbol{x}}\).

  • It is the middle value of a dataset when ordered from least to greatest.

  • For an even number of observations, it’s the average of the two middle numbers.

Sample Median Example

  • Sample \(\boldsymbol{x} = 55, 35, 23, 44, 31\)

  • Ordered sample \(\boldsymbol{x} = 23, 31, 35, 44, 55\)

  • The sample median is: \[ \begin{align} \widetilde{\boldsymbol{x}} &= 35 \end{align} \]

Sample Median Example with Even Observations

  • Sample \(\boldsymbol{x} = 55, 35, 23, 44\)

  • Ordered sample \(\boldsymbol{x} = 23, 35, 44, 55\)

  • The sample median is: \[ \begin{align} \widetilde{\boldsymbol{x}} &= \frac{35 + 44}{2}\\ &= 39.5 \end{align} \]

The median function in R

# define a vector of observations
x <- c(55, 35, 23, 44, 31)

# calculate the median
median(x)
## [1] 35
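
The even-sized example can be checked the same way. The sketch below first calls `median()` on the four-observation sample, then repeats the calculation by hand by sorting the values and averaging the two middle ones. The names `y`, `y_sorted`, and `n` are illustrative and do not appear in the original slides.

```r
# even number of observations: the median is the average of the two middle values
y <- c(55, 35, 23, 44)
median(y)
## [1] 39.5

# the same result by hand: sort, then average the two middle values
y_sorted <- sort(y)
n <- length(y_sorted)
(y_sorted[n / 2] + y_sorted[n / 2 + 1]) / 2
## [1] 39.5
```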

Sample Median visualisation: One sample

Sample Median visualisation: Two samples

Means vs Medians With Skewed Distribution
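
A small numeric illustration may help here. In a right-skewed sample, the long right tail typically pulls the mean above the median. The values in `skewed` below are hypothetical, chosen only to show the effect; they are not data from the original slides.

```r
# a small right-skewed sample (hypothetical values chosen for illustration)
skewed <- c(1, 2, 2, 3, 3, 3, 4, 5, 9, 20)

# the few large values pull the mean upwards
mean(skewed)
## [1] 5.2

# the median depends only on the middle of the ordered sample
median(skewed)
## [1] 3
```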

Central Tendency and Outliers
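
One way to see how the two measures react to an outlier is to add a single extreme value to the example sample and recompute both. The value 500 and the name `x_outlier` are assumptions made for this sketch, not part of the original slides.

```r
# the example sample, and the same sample with one extreme value added
x <- c(55, 35, 23, 44, 31)
x_outlier <- c(x, 500)   # 500 is a hypothetical outlier

# the mean moves a long way towards the outlier
mean(x)
## [1] 37.6
mean(x_outlier)
## [1] 114.6667

# the median barely changes
median(x)
## [1] 35
median(x_outlier)
## [1] 39.5
```

In this example the mean roughly triples while the median shifts only from 35 to 39.5, which is why the median is usually described as the more robust of the two.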

Sample Mode

  • The sample mode is the most frequently occurring value in the dataset.

  • A dataset may have one mode, more than one mode, or no mode at all.

Sample Mode Example

  • Sample \(\boldsymbol{x} = 55, 35, 23, 44, 31, 55, 55\)

  • The sample mode is: \[ \begin{align} \text{mode}(\boldsymbol{x}) &= 55 \end{align} \]

Sample Mode Example with Multiple Modes

  • Sample \(\boldsymbol{x} = 55, 35, 23, 44, 31, 55, 55, 35, 35\)

  • The sample mode is: \[ \begin{align} \text{mode}(\boldsymbol{x}) &= 55, 35 \end{align} \]

Sample Mode Example with No Mode

  • Sample \(\boldsymbol{x} = 55, 35, 23, 44, 31\)

  • The sample mode is: \[ \begin{align} \text{mode}(\boldsymbol{x}) &= \text{No mode} \end{align} \]
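
Base R has no built-in function for the statistical mode (the function `mode()` reports an object's storage mode instead). The three examples above can be reproduced with a small helper built on `table()`. The helper `sample_modes()` is hypothetical, written for this sketch. It assumes the convention used in these slides, extended so that a sample has no mode whenever all distinct values occur equally often, and it returns an empty vector in that case.

```r
# hypothetical helper: returns every mode, or an empty vector when there is no mode
sample_modes <- function(x) {
  counts <- table(x)                    # frequency of each distinct value
  values <- as.numeric(names(counts))   # the distinct values, in increasing order
  freq <- as.vector(counts)             # their frequencies as a plain vector

  # assumption: if all distinct values are equally frequent, report no mode
  if (length(freq) > 1 && all(freq == freq[1])) {
    return(numeric(0))
  }
  values[freq == max(freq)]
}

# one mode
sample_modes(c(55, 35, 23, 44, 31, 55, 55))
## [1] 55

# two modes (returned in increasing order)
sample_modes(c(55, 35, 23, 44, 31, 55, 55, 35, 35))
## [1] 35 55

# no mode: every value occurs exactly once
sample_modes(c(55, 35, 23, 44, 31))
## numeric(0)
```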

Sample Mode visualisation: One sample

Sample Mode visualisation: Two samples

Question

Consider the following histogram of a sample of data:

Which of the following statements is true?

  • A) The red line is the mean and the blue line is the median.
  • B) The blue line is the mean and the red line is the median.
  • C) The red line is the mean and the median is not shown.
  • D) The blue line is the mean and the median is not shown.
Correct Answer: A) The red line is the mean and the blue line is the median.

Question

Consider the following histogram of a sample of data:

Which of the following statements is true?

  • A) The mean is greater than the median.
  • B) The mean is less than the median.
  • C) The mean is about equal to the median.
  • D) The mean and median cannot be calculated in this sample.
Correct Answer: C) The mean is about equal to the median.