Consider the same statistical set-up as above. Suppose we observe a data set consisting of 1000 observations as described in the following (format: i, number of observations of i):
i    N_i
0    339
1    455
2    180
3    26
What is the value of the test statistic T_n for this data set? Give a numerical value with at least 4 decimals. (You are encouraged to use computational software.)
What is the p-value of this data set with respect to the test \psi_{1000}? Give a numerical value with at least 4 decimals.
Use this tool to find the tail probabilities of a \chi^2 distribution (you may also use any other software). If you are using this tool, note that you need to set "Choose Type of Control" to "Adjust X-axis quantile (Chi square) value" to find the tail probability associated with an x-axis value for a chi-squared distribution with degrees of freedom set in the "Degrees of Freedom" box.
If \psi_n is designed to have level 5%, would you reject or fail to reject on the given data set?
Reject
Fail to reject
1 answer
First, we compute the empirical frequencies \hat{\pi}_i = N_i / n from the observed counts, with n = 1000:
\hat{\pi}_0 = \frac{339}{1000} = 0.339
\hat{\pi}_1 = \frac{455}{1000} = 0.455
\hat{\pi}_2 = \frac{180}{1000} = 0.180
\hat{\pi}_3 = \frac{26}{1000} = 0.026
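Next, we calculate an expected count for each category. Here the empirical frequencies are plugged back in, so each E_i equals N_i by construction. (For the test itself, E_i should be n times the probability of category i under the fitted null model from the set-up, which is not reproduced in this answer.)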
E_0 = \hat{\pi}_0 \times n = 0.339 \times 1000 = 339
E_1 = \hat{\pi}_1 \times n = 0.455 \times 1000 = 455
E_2 = \hat{\pi}_2 \times n = 0.180 \times 1000 = 180
E_3 = \hat{\pi}_3 \times n = 0.026 \times 1000 = 26
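As a rough sketch of what model-based expected counts could look like, suppose the null hypothesis is that the data follow a Binomial(3, \theta) distribution for some unknown \theta. This is an assumption, since the set-up is not reproduced in this thread. The helper name `binomial_expected_counts` is made up for illustration, and \theta is estimated by maximum likelihood as the sample mean divided by the number of trials.

```python
from math import comb

counts = [339, 455, 180, 26]  # N_0..N_3 from the question


def binomial_expected_counts(counts, trials=3):
    """Expected counts n * f_theta_hat(i) under an ASSUMED Binomial(trials, theta) null."""
    n = sum(counts)
    sample_mean = sum(i * c for i, c in enumerate(counts)) / n
    # MLE of theta for a binomial with a known number of trials
    theta_hat = sample_mean / trials
    probs = [
        comb(trials, i) * theta_hat**i * (1 - theta_hat) ** (trials - i)
        for i in range(trials + 1)
    ]
    return theta_hat, [n * p for p in probs]


theta_hat, expected = binomial_expected_counts(counts)
print(theta_hat)
print([round(e, 2) for e in expected])
```

Under this assumption the expected counts come out near 346, 440, 187 and 26: close to the observed counts, but not equal to them.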
The test statistic T_n is given by:
T_n = \sum_{i=0}^{3} \frac{(N_i - E_i)^2}{E_i}
Calculating for each category and summing them up:
T_n = \frac{(339 - 339)^2}{339} + \frac{(455 - 455)^2}{455} + \frac{(180 - 180)^2}{180} + \frac{(26 - 26)^2}{26}
T_n = 0
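To illustrate how much the choice of expected counts matters, the sketch below evaluates the same sum twice: once with E_i = N_i as in the calculation above, and once with expected counts from an assumed Binomial(3, \theta) null fitted by maximum likelihood. The binomial null is an assumption about the set-up, and `pearson_statistic` is a hypothetical helper name.

```python
from math import comb


def pearson_statistic(observed, expected):
    """Sum of (N_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [339, 455, 180, 26]
n = sum(observed)

# Case 1: expected counts set equal to the observed counts.
t_trivial = pearson_statistic(observed, observed)

# Case 2 (ASSUMPTION): null model is Binomial(3, theta), theta fitted by MLE.
theta_hat = sum(i * c for i, c in enumerate(observed)) / (3 * n)
expected = [
    n * comb(3, i) * theta_hat**i * (1 - theta_hat) ** (3 - i) for i in range(4)
]
t_binomial = pearson_statistic(observed, expected)

print(t_trivial)
print(round(t_binomial, 4))
```

The first case is zero whatever the data are, so it carries no information about fit. The second comes out at roughly 0.88 under the stated assumption.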
The p-value is the probability, under the null hypothesis, of obtaining a test statistic at least as extreme as the observed one (\psi_n denotes the test itself, not the hypothesis). For a chi-squared test this is the upper-tail probability P(\chi^2 \geq T_n), so a statistic of T_n = 0 gives a p-value of 1, not 0: every possible value of the statistic is at least as large as 0. Note that T_n is exactly 0 here only because the expected counts were set equal to the observed counts.
Since the p-value is 1, which is greater than the significance level of 5%, we fail to reject the null hypothesis.
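The tail probability and the decision can also be checked in a few lines. This sketch assumes the limiting distribution is chi-squared with 2 degrees of freedom, which would hold for four categories with one estimated parameter (for example under a Binomial(3, \theta) null; that is an assumption about the set-up). With 2 degrees of freedom the upper tail has the closed form exp(-x/2), so only the standard library is needed; `scipy.stats.chi2.sf(x, df)` is a general alternative. The function names are hypothetical.

```python
from math import exp


def chi2_upper_tail_df2(x):
    """P(chi^2_2 >= x) for x >= 0; this closed form is valid only for 2 degrees of freedom."""
    return exp(-x / 2)


def decide(t_n, level=0.05):
    """Return the p-value and the decision of a test with the given level."""
    p_value = chi2_upper_tail_df2(t_n)
    return p_value, ("reject" if p_value < level else "fail to reject")


print(decide(0.0))  # → (1.0, 'fail to reject')
```

A statistic near 0.88 would give a p-value of about 0.64 under the same assumption, which also leads to failing to reject at the 5% level.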