Question
Consider the statistical set-up from the previous problem. In particular, recall that \mathbf{u} = \frac{1}{\sqrt{5}} (1,2)^T and
\displaystyle \mathrm{{\boldsymbol X}}_1 \, =\, \begin{pmatrix} 1\\ 2\end{pmatrix},\, \mathrm{{\boldsymbol X}}_2 \, = \, \begin{pmatrix} 3\\ 4\end{pmatrix},\, \mathrm{{\boldsymbol X}}_3 \, =\, \begin{pmatrix} -1 \\ 0\end{pmatrix}.
Observe that for i = 1,2,3, the number \mathbf{u}\cdot \mathrm{{\boldsymbol X}}_i (where \mathbf{u} is a unit vector) gives the signed distance from the origin to the endpoint of the projection \text{proj}_{\mathbf{u}} \mathrm{{\boldsymbol X}}_i. By signed distance, we mean that \left| \mathbf{u}\cdot \mathrm{{\boldsymbol X}}_i \right| is the length of \text{proj}_{\mathbf{u}} \mathrm{{\boldsymbol X}}_i and
\displaystyle \mathbf{u}\cdot \mathrm{{\boldsymbol X}}_i > 0 \Longrightarrow \mathrm{{\boldsymbol X}}_i \, \, \text{points approximately in the direction of } \, \mathbf{u}
\displaystyle \mathbf{u}\cdot \mathrm{{\boldsymbol X}}_i < 0 \Longrightarrow \mathrm{{\boldsymbol X}}_i \, \, \text{points approximately in the opposite direction of } \, \mathbf{u}
Compute the empirical variance of the data set
\mathbf{u}\cdot \mathrm{{\boldsymbol X}}_1,\, \mathbf{u}\cdot \mathrm{{\boldsymbol X}}_2,\, \mathbf{u}\cdot \mathrm{{\boldsymbol X}}_3.
Let \mathbb{X} denote the matrix whose i-th row is \mathrm{{\boldsymbol X}}_i^T.
Recall that S = \frac{1}{3} \mathbb{X}^T (I_3 - \frac{1}{3} \mathbf{1} \mathbf{1}^T) \mathbb{X} denotes the empirical covariance matrix of our data set.
What is \mathbf{u}^T S \mathbf{u}?
(You are encouraged to use computational software.)
Are your answers from part 1 and part 2 of this question the same?
Answers
GPT 3.5
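Part 1 can be computed directly from the projections, without the covariance matrix. Since u = (1/sqrt(5)) * [1; 2]:
u . X1 = 5/sqrt(5), u . X2 = 11/sqrt(5), u . X3 = -1/sqrt(5)
Their mean is (5 + 11 - 1)/(3*sqrt(5)) = 5/sqrt(5), so the deviations from the mean are 0, 6/sqrt(5) and -6/sqrt(5). The empirical variance is therefore
(1/3) * (0 + 36/5 + 36/5) = 24/5 = 4.8
For part 2, we need the empirical covariance matrix S.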
First, let's compute the matrix X:
X = [1 2; 3 4; -1 0]
Next, let 1 denote the column vector of ones:
1 = [1; 1; 1]
Now, let's compute the centering matrix A:
A = I3 - (1/3) * 1 * 1'
A = [2/3 -1/3 -1/3; -1/3 2/3 -1/3; -1/3 -1/3 2/3]
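As a quick check on what A does, here is a minimal Python sketch using exact fractions. The helper `matmul` and the variable names are illustrative and not part of the original answer. It builds A and applies it to X, which should subtract the column means (1, 2) from every row.

```python
from fractions import Fraction

# Data matrix X: row i is X_i^T.
X = [[1, 2], [3, 4], [-1, 0]]
n = len(X)

# Centering matrix A = I_n - (1/n) * 1 * 1', built with exact fractions.
A = [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
     for i in range(n)]

def matmul(P, Q):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

# A * X removes the column means from each row of X.
centered = matmul(A, X)
print(centered)
```

Each row of `centered` is the corresponding data point minus the sample mean, which is why A is called the centering matrix.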
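Finally, let's compute the empirical covariance matrix S. Multiplying by A subtracts the column means (1, 2) from each row of X, so A*X = [0 0; 2 2; -2 -2]. Since A is symmetric and A*A = A:
S = (1/3) * X'*A*X
S = (1/3) * (A*X)'*(A*X) = (1/3) * [8 8; 8 8]
S = [8/3 8/3; 8/3 8/3]
Now, let's compute u^T S u. Because u appears on both sides, the scalar factor is (1/sqrt(5))^2 = 1/5:
u = (1/sqrt(5)) * [1; 2]
u^T S u = (1/5) * [1 2] * [8/3 8/3; 8/3 8/3] * [1; 2]
u^T S u = (1/5) * [8 8] * [1; 2]
u^T S u = (1/5) * 24
Therefore, the value of u^T S u is 24/5 = 4.8.
The empirical variance of the projections u . X1 = 5/sqrt(5), u . X2 = 11/sqrt(5), u . X3 = -1/sqrt(5) (part 1) is (1/3) * (0 + 36/5 + 36/5) = 24/5 as well, so the answers to part 1 and part 2 are the same. This is no coincidence: the projections are the entries of X*u, so their empirical variance is (1/3) * (X*u)'*A*(X*u) = u^T S u. In other words, u^T S u measures how much the data set varies along the direction of u.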
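Since the question encourages computational software, the following is a small Python sketch that evaluates both quantities numerically. It assumes the 1/n normalization from the definition of S and uses the equivalent form S = (1/n) * sum_i (X_i - mean)(X_i - mean)^T. The variable names are illustrative.

```python
import math

# Data points and the unit vector u = (1, 2) / sqrt(5) from the question.
X = [(1.0, 2.0), (3.0, 4.0), (-1.0, 0.0)]
u = (1 / math.sqrt(5), 2 / math.sqrt(5))
n = len(X)

# Part 1: empirical variance (1/n normalization) of the projections u . X_i.
proj = [u[0] * x + u[1] * y for x, y in X]
proj_mean = sum(proj) / n
var_proj = sum((p - proj_mean) ** 2 for p in proj) / n

# Part 2: empirical covariance matrix S from the centered data, then u^T S u.
mean = [sum(row[j] for row in X) / n for j in range(2)]
S = [[sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in X) / n
      for b in range(2)]
     for a in range(2)]
quad_form = sum(u[a] * S[a][b] * u[b] for a in range(2) for b in range(2))

print(var_proj, quad_form)
```

Both printed values should agree up to floating-point rounding.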