Consider the same statistical set-up as above. Suppose we observe a data set consisting of 1000 observations as described in the following (format: i, number of observations of i):

i    N_i
0    339
1    455
2    180
3    26

What is the value of the test statistic T_n for this data set? Give a numerical value with at least 4 decimals. (You are encouraged to use computational software.)

What is the p-value of this data set with respect to the test \psi_{1000}? Give a numerical value with at least 4 decimals.

Use this tool to find the tail probabilities of a \chi ^2 distribution (you may also use any other software). If you are using this tool, note that you need to set "Choose Type of Control" to "Adjust X-axis quantile (Chi square) value" to find the tail probability associated with an x-axis value for a chi-squared distribution with degrees of freedom set in the "Degrees of Freedom" box.

If \psi_n is designed to have level 5\%, would you reject or fail to reject on the given data set?

Reject

Fail to reject

1 answer

To calculate the test statistic T_n, we need the expected count for each category under the null hypothesis that the test \psi_n is designed for. The expected count is E_i = n \times \hat{\pi}_i, where n = 1000 is the total number of observations and \hat{\pi}_i is the probability that the null model assigns to category i, with any unknown parameter replaced by its maximum likelihood estimate as specified in the set-up above.

The \hat{\pi}_i must come from the null model, not directly from the data. If we plug in the observed frequencies instead:
\hat{\pi}_0 = \frac{339}{1000} = 0.339
\hat{\pi}_1 = \frac{455}{1000} = 0.455
\hat{\pi}_2 = \frac{180}{1000} = 0.180
\hat{\pi}_3 = \frac{26}{1000} = 0.026

then E_i = \hat{\pi}_i \times n = N_i in every category (E_0 = 339, E_1 = 455, E_2 = 180, E_3 = 26), and the statistic is 0 for any data set whatsoever. That calculation tests nothing, so it cannot be the intended T_n.

The test statistic T_n is given by:
T_n = \sum_{i=0}^{3} \frac{(N_i - E_i)^2}{E_i}

with N_0 = 339, N_1 = 455, N_2 = 180, N_3 = 26 and the E_i computed from the fitted null model.

The p-value is the probability, under the null hypothesis, of obtaining a test statistic at least as large as the observed one. For large n this is the upper tail of a chi-squared distribution:
p = P(\chi^2_d > T_n), where d = (number of categories) - 1 - (number of estimated parameters) = 3 - (number of estimated parameters).

A small T_n therefore gives a large p-value. In particular, T_n = 0 would correspond to a p-value of 1, not 0, because every possible value of the statistic is at least as large as 0.

At level 5\%, \psi_n rejects the null hypothesis exactly when the p-value is below 0.05. If the p-value is 0.05 or larger, we fail to reject.
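
The set-up referred to as "above" is not reproduced in this thread, so the following is only a sketch under an assumption: that the null hypothesis is that the data come from a Binomial(3, \theta) distribution with \theta unknown and estimated by maximum likelihood. The names K, theta_hat and binom_pmf are illustrative and not from the question. With one estimated parameter and four categories, the limiting distribution would be \chi^2 with 4 - 1 - 1 = 2 degrees of freedom, whose tail probability has the closed form exp(-x/2).

```python
import math

# Observed counts N_i for i = 0, 1, 2, 3 (taken from the question).
counts = [339, 455, 180, 26]
n = sum(counts)  # total number of observations, 1000

# ASSUMPTION: the null model is Binomial(K, theta) with K = 3.
# The thread only says "the same statistical set-up as above".
K = 3

# MLE of theta under the assumed binomial model: sample mean divided by K.
theta_hat = sum(i * c for i, c in enumerate(counts)) / (K * n)


def binom_pmf(i, k, theta):
    """Probability that a Binomial(k, theta) variable equals i."""
    return math.comb(k, i) * theta**i * (1 - theta) ** (k - i)


# Expected counts under the fitted null model.
expected = [n * binom_pmf(i, K, theta_hat) for i in range(K + 1)]

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
t_n = sum((o - e) ** 2 / e for o, e in zip(counts, expected))

# Degrees of freedom: 4 categories - 1 - 1 estimated parameter = 2.
# For exactly 2 degrees of freedom, P(chi2 > x) = exp(-x / 2).
p_value = math.exp(-t_n / 2)

# Level 5% decision rule.
reject = p_value < 0.05

print(f"theta_hat = {theta_hat:.4f}")
print(f"T_n       = {t_n:.4f}")
print(f"p-value   = {p_value:.4f}")
print("Reject" if reject else "Fail to reject")
```

Under this assumed binomial model the sketch gives T_n of roughly 0.8829 and a p-value of roughly 0.6431, so the test would fail to reject at level 5%. If the actual set-up uses a different null family, only theta_hat, binom_pmf and the degrees of freedom need to change; for degrees of freedom other than 2 the closed-form tail no longer applies and a chi-squared tail function from statistical software would be needed.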