The measure of central tendency most affected by outliers is the mean. This is because the mean is calculated by adding all the data points together and dividing by the total number of points, so every value contributes to the result and a single extreme value can pull it far from the typical values.
Example:
Consider the following set of 10 data points representing the ages of people in a small group:
- Ages: 22, 23, 24, 25, 26, 27, 28, 29, 30, 100
Calculate the Mean:
- Mean = (22 + 23 + 24 + 25 + 26 + 27 + 28 + 29 + 30 + 100) / 10 = 334 / 10 = 33.4
Calculate the Median: To find the median, first, list the data in order:
- 22, 23, 24, 25, 26, 27, 28, 29, 30, 100.
Because there are 10 values (an even count), the median is the average of the 5th and 6th numbers in this ordered list:
- Median = (26 + 27) / 2 = 26.5
Calculate the Mode: There are no repeated ages, so there is no mode in this set.
Summary of Results:
- Mean: 33.4
- Median: 26.5
- Mode: None
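The three measures for this data set can be checked with a short Python sketch. It relies on the standard-library `statistics` module; the variable names and the `find_modes` helper (which reports no mode when every value occurs exactly once) are illustrative assumptions, not part of the original example.

```python
from collections import Counter
from statistics import mean, median

# Ages from the example, including the outlier (100).
ages = [22, 23, 24, 25, 26, 27, 28, 29, 30, 100]


def find_modes(values):
    """Return the most frequent values, or an empty list if nothing repeats.

    Hypothetical helper: treats a data set with no repeated value as
    having no mode, matching the convention used in the example.
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return [value for value, count in counts.items() if count == highest]


age_mean = mean(ages)        # sum of the ages divided by how many there are
age_median = median(ages)    # average of the two middle values (even count)
age_modes = find_modes(ages)

print(f"Mean: {age_mean}")
print(f"Median: {age_median}")
print(f"Mode: {age_modes if age_modes else 'None'}")
```

The helper is used instead of `statistics.mode` because, on data with no repeats, that function returns the first value rather than signalling that no mode exists.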
In the example above, you can see that the outlier (100) pulls the mean up to 33.4, which is higher than nine of the ten ages. In contrast, the median (26.5) is much less affected by the outlier because it is based solely on the middle values of the sorted data, which are much closer to the other ages in the set.
This clearly illustrates how the mean can be greatly influenced by outliers, while the median remains more stable in the presence of extreme values.
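One way to make this sensitivity concrete is to recompute both measures with the outlier left out and compare. The following is a minimal sketch under that assumption: dropping the value 100 is done here only for comparison and is not a step in the original example, and the variable names are illustrative.

```python
from statistics import mean, median

ages = [22, 23, 24, 25, 26, 27, 28, 29, 30, 100]

# Assumption for illustration: treat 100 as the outlier and drop it.
ages_without_outlier = [age for age in ages if age != 100]

# How far each measure moves when the outlier is included.
mean_shift = mean(ages) - mean(ages_without_outlier)
median_shift = median(ages) - median(ages_without_outlier)

print(f"Mean shift caused by the outlier:   {mean_shift:.1f}")
print(f"Median shift caused by the outlier: {median_shift:.1f}")
```

A larger shift indicates a measure that is more sensitive to the extreme value.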