The expression \sqrt{n}\left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^ T \mathbf{1} can be interpreted as the dot product between the row vector \left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^ T and the column vector \sqrt{n}\mathbf{1}.
In this context, \mathbf{p}^0 represents the hypothesized probability mass function (pmf) whose goodness of fit we are testing, and \widehat{\mathbf{p}} represents the maximum likelihood estimate (MLE) of the pmf based on the observed iid samples.
We subtract \mathbf{p}^0 from \widehat{\mathbf{p}} to measure the difference between the estimated pmf and the hypothesized pmf. Taking the transpose of this difference vector, \left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^ T, allows us to treat it as a row vector.
The column vector \mathbf{1} is an all-ones vector of appropriate size. Multiplying \mathbf{1} by \sqrt{n} scales the vector by the square root of the sample size, n.
Finally, taking the dot product between \left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^ T and \sqrt{n}\mathbf{1} sums the entries of the difference vector and scales that sum by \sqrt{n}, the square root of the sample size.
Let \mathbf{p}^0 be the discrete pmf whose goodness of fit we wish to test against an observed sequence of iid samples. Let \widehat{\mathbf{p}} be the MLE computed from those samples.
What is \sqrt{n}\left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^ T \mathbf{1}?
Note: This is a vector dot product where \left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^ T is a row vector and \mathbf{1} is the all-ones column vector of appropriate size.
3 answers
0 is the correct answer.
I apologize for misunderstanding. The expression \sqrt{n}\left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^ T \mathbf{1} simplifies to 0.
To see why, note that \mathbf{1} is an all-ones column vector of appropriate size. Multiplying \mathbf{1} by \sqrt{n} scales each element by \sqrt{n}, so the resulting column vector has every entry equal to \sqrt{n}.
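Written out componentwise, the product is a scaled sum of differences. The expansion below is a sketch that assumes the pmf has K categories with entries \widehat{p}_j and p^0_j; K and the index j are notation introduced here for illustration and do not appear in the question.

```latex
\left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^T \left(\sqrt{n}\,\mathbf{1}\right)
  = \sqrt{n} \sum_{j=1}^{K} \left(\widehat{p}_j - p^0_j\right)
  = \sqrt{n} \left( \sum_{j=1}^{K} \widehat{p}_j \;-\; \sum_{j=1}^{K} p^0_j \right)
```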
Taking the dot product between \left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^ T (which is a row vector) and the scaled all-ones column vector gives \sqrt{n} times the sum of the entries of \widehat{\mathbf{p}} - \mathbf{p}^0. Both \widehat{\mathbf{p}} and \mathbf{p}^0 are pmfs, so the entries of each sum to 1, and the entries of their difference therefore sum to 1 - 1 = 0. This holds for every sample, whether or not the estimated pmf is close to \mathbf{p}^0.
Therefore, the expression \sqrt{n}\left(\widehat{\mathbf{p}} - \mathbf{p}^0\right)^ T \mathbf{1} is always exactly 0, regardless of how well the estimated pmf fits.
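A quick numerical check can make this concrete. The sketch below assumes the samples are category labels in {0, ..., K-1} and uses the fact that the MLE of a categorical pmf is the vector of empirical frequencies. The function name `scaled_deviation_sum` and the pmf values are illustrative choices, not part of the original question. The data are deliberately drawn from a pmf different from \mathbf{p}^0 to show that the result does not depend on the fit being good.

```python
import numpy as np

def scaled_deviation_sum(samples, p0):
    """Return sqrt(n) * (p_hat - p0)^T 1 for labels in {0, ..., K-1}."""
    p0 = np.asarray(p0, dtype=float)
    n = len(samples)
    # MLE of a categorical pmf: the empirical frequency of each category.
    counts = np.bincount(samples, minlength=len(p0))
    p_hat = counts / n
    ones = np.ones(len(p0))
    # Scale the difference vector by sqrt(n), then take the dot product with 1.
    return np.sqrt(n) * (p_hat - p0) @ ones

rng = np.random.default_rng(0)
p0 = np.array([0.2, 0.3, 0.5])       # hypothesized pmf (illustrative values)
p_true = np.array([0.6, 0.3, 0.1])   # data drawn from a different pmf on purpose
samples = rng.choice(len(p0), size=1000, p=p_true)

value = scaled_deviation_sum(samples, p0)
# Zero up to floating-point rounding, even though the fit is poor.
print(abs(value) < 1e-9)
```

Because the quantity is identically zero, it carries no information about goodness of fit on its own.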