Calculate the five-number summary of the skewed data set 15, 16, 16, 17, 18, 18, 19, 29. When constructing a box plot on a number line, which explains the reason why the data set is skewed?

1 answer

To calculate the five-number summary for the data set \( 15, 16, 16, 17, 18, 18, 19, 29 \), we first need to identify the minimum, first quartile (Q1), median, third quartile (Q3), and maximum.

  1. Minimum: The smallest number in the data set is \( 15 \).

  2. Maximum: The largest number in the data set is \( 29 \).

  3. Median (Q2): The median is the middle value. Since there are 8 values (an even count), the median is the average of the 4th and 5th values in the ordered list:

    • The ordered data set is already sorted: \( 15, 16, 16, 17, 18, 18, 19, 29 \).
    • Median = \( \frac{17 + 18}{2} = \frac{35}{2} = 17.5 \).
  4. First Quartile (Q1): Q1 is the median of the first half of the data (the first four values):

    • First half: \( 15, 16, 16, 17 \).
    • Q1 = \( \frac{16 + 16}{2} = 16 \).
  5. Third Quartile (Q3): Q3 is the median of the second half of the data (the last four values):

    • Second half: \( 18, 18, 19, 29 \).
    • Q3 = \( \frac{18 + 19}{2} = \frac{37}{2} = 18.5 \).

Now that we have all the values, the five-number summary is:

  • Minimum: \( 15 \)
  • Q1: \( 16 \)
  • Median (Q2): \( 17.5 \)
  • Q3: \( 18.5 \)
  • Maximum: \( 29 \)

Summary:

  • Five-number summary: \( 15, 16, 17.5, 18.5, 29 \)
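The steps above can be checked with a short Python sketch. It assumes the same quartile convention used in this answer (Q1 and Q3 are the medians of the lower and upper halves of the sorted data); the function name `five_number_summary` is only illustrative, and other conventions, such as interpolated percentiles, may give slightly different quartiles.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


summary = five_number_summary([15, 16, 16, 17, 18, 18, 19, 29])
print(summary)  # (15, 16.0, 17.5, 18.5, 29)
```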

Box Plot Explanation:

When constructing a box plot, you would draw a box from Q1 (\( 16 \)) to Q3 (\( 18.5 \)) with a line at the median (\( 17.5 \)). The whiskers would extend to the minimum (\( 15 \)) and maximum (\( 29 \)).

The reason for the skewness in this dataset:

  • The data set exhibits right (or positive) skewness due to the presence of the high outlier \( 29 \). In a symmetric distribution, the two whiskers would be roughly the same length, but in this case the distance from Q3 to the maximum (\( 29 - 18.5 = 10.5 \)) is much larger than the distance from the minimum to Q1 (\( 16 - 15 = 1 \)). This difference indicates a longer tail on the right side of the distribution, which characterizes positive skewness.
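The whisker comparison can be verified numerically with the sketch below. It also checks whether \( 29 \) counts as an outlier under the common \( 1.5 \times \text{IQR} \) rule; that rule is an assumption added here for illustration and is not stated in the question. Under that convention, many plotting tools would draw \( 29 \) as a separate point rather than as the end of the whisker.

```python
# Five-number summary computed in the answer above.
minimum, q1, med, q3, maximum = 15, 16, 17.5, 18.5, 29

# Whisker lengths when the whiskers run to the minimum and maximum.
left_whisker = q1 - minimum    # 16 - 15
right_whisker = maximum - q3   # 29 - 18.5

# Assumed convention: values beyond 1.5 * IQR from the box are outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
is_outlier = maximum > upper_fence

print(left_whisker, right_whisker)           # 1 10.5
print(lower_fence, upper_fence, is_outlier)  # 12.25 22.25 True
```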

In summary, the box plot will visually display the skewness in the dataset, with a longer whisker or tail extending to the right due to the outlier, \( 29 \).