Of the mean, median, and mode, which measure of center is most affected by outliers? Explain your reasoning and include an example with at least 10 data points.

1 answer

The measure of center that is most affected by outliers is the mean. The mean is calculated by summing all the data points and then dividing by the number of points, which means that extremely high or low values can significantly alter the average.
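
The size of that effect can be written down directly. As a sketch (the symbols \(n\), \(\bar{x}\), and \(x_{\text{new}}\) are notation introduced here for illustration): if \(n\) data points have mean \(\bar{x}\) and one more value \(x_{\text{new}}\) is added, the new mean is

\[ \bar{x}_{\text{new}} = \frac{n\bar{x} + x_{\text{new}}}{n + 1} = \bar{x} + \frac{x_{\text{new}} - \bar{x}}{n + 1} \]

The mean therefore shifts in proportion to how far the added value lies from the old mean: the more extreme the outlier, the larger the shift.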

Example:

Consider the following set of 10 data points representing the ages of individuals:

  • Ages: 22, 25, 27, 29, 30, 30, 31, 32, 28, 100 (where 100 is an outlier)

Calculating the measures of center:

  1. Mean: \[ \text{Mean} = \frac{22 + 25 + 27 + 29 + 30 + 30 + 31 + 32 + 28 + 100}{10} = \frac{354}{10} = 35.4 \]

  2. Median: To find the median, first sort the data points:

    • Sorted Ages: 22, 25, 27, 28, 29, 30, 30, 31, 32, 100

    With 10 values, the median is the average of the 5th and 6th values (29 and 30): \[ \text{Median} = \frac{29 + 30}{2} = \frac{59}{2} = 29.5 \]

  3. Mode: The mode is the value that appears most frequently in the data, which in this case is 30 (it appears twice). \[ \text{Mode} = 30 \]

Summary of Results

  • Mean = 35.4
  • Median = 29.5
  • Mode = 30

Analysis:

In this example, the mean (35.4) is noticeably higher than both the median (29.5) and the mode (30) because of the outlier (100). The outlier has pulled the mean upward, above every value in the dataset except the outlier itself.

The median and mode, however, are much less affected by the outlier. The median depends only on the order of the values, not on how large the extreme ones are: replacing 100 with 1,000 would leave it at 29.5. Dropping the outlier entirely leaves nine values whose mean is 254/9 ≈ 28.2 and whose median is 29, so the median moves by only 0.5. The mode is still 30, because the outlier occurs only once and so cannot become the most frequent value.
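
The comparison can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the helper name `centers` and the variable names are chosen here for illustration and are not part of the original answer.

```python
from statistics import mean, median, mode

# Ages from the example; 100 is the outlier.
ages = [22, 25, 27, 29, 30, 30, 31, 32, 28, 100]

# The same data with the outlier dropped, for comparison.
ages_without_outlier = [a for a in ages if a != 100]


def centers(data):
    """Return (mean, median, mode) for a list of numbers."""
    return mean(data), median(data), mode(data)


with_outlier = centers(ages)
without_outlier = centers(ages_without_outlier)

print("with outlier:   ", with_outlier)
print("without outlier:", without_outlier)
```

Running it shows the mean shifting by about 7 years when the outlier is removed, while the median moves by 0.5 and the mode stays at 30.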

Therefore, the mean is the measure of center that is most affected by outliers, as shown in this example where the presence of a single outlier led to a substantial deviation from the other measures of center.