Let \boldsymbol{\beta} \in \mathbb{R}^1 and let \ell_2(\boldsymbol{\beta}) = \sum_{i=1}^2 \frac{Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}}{\phi} + c, for some constant c. For the Poisson GLM, recall that \phi = 1 and b(\theta) = e^\theta.

What is \nabla \ell_2(\boldsymbol{\beta}) for any \boldsymbol{\beta}?

Use X_i for X_i, Y_i for Y_i, and \sum_{i=1}^2 for the summation.

Let

X_1^T = X_1 = 0.1, Y_1 = 1,

X_2^T = X_2 = 0.2, Y_2 = 2,

\boldsymbol{\beta} = 0,

step size t = 0.01.

What is the new \boldsymbol{\beta} = \boldsymbol{\beta} + t \cdot \nabla \ell_2(\boldsymbol{\beta})?


To find the gradient of the function \ell_2(\beta), we will differentiate with respect to the vector \boldsymbol{\beta}.

First, let's simplify the expression inside the summation:

\frac{Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}}{\phi} = \frac{Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}}{1} = Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}

Now, let's compute the gradient:

\nabla \ell_2(\beta) = \sum_{i=1}^2 \nabla \left(Y_i X_i^T \boldsymbol{\beta} - e^{X_i^T \boldsymbol{\beta}}\right)

Since the gradient is a linear operator, we can distribute it inside the summation:

\nabla \ell_2(\beta) = \sum_{i=1}^2 \left(\nabla (Y_i X_i^T \boldsymbol{\beta}) - \nabla e^{X_i^T \boldsymbol{\beta}}\right)

Now, let's compute the gradients of the individual terms:

\nabla (Y_i X_i^T \boldsymbol{\beta}) = Y_i X_i

and

\nabla e^{X_i^T \boldsymbol{\beta}} = e^{X_i^T \boldsymbol{\beta}} X_i

Therefore, the gradient of \ell_2(\beta) is:

\nabla \ell_2(\boldsymbol{\beta}) = \sum_{i=1}^2 \left(Y_i - e^{X_i^T \boldsymbol{\beta}}\right) X_i

Plugging in the given values:

X_1^T = X_1 = 0.1, Y_1 = 1,
X_2^T = X_2 = 0.2, Y_2 = 2,
\boldsymbol{\beta} = 0,

we have:

\nabla \ell_2(\boldsymbol{\beta}) = \sum_{i=1}^2 \left(Y_i - e^{X_i^T \boldsymbol{\beta}}\right) X_i
= \left(1 - e^{0.1 \cdot 0}\right) \cdot 0.1 + \left(2 - e^{0.2 \cdot 0}\right) \cdot 0.2
= 0 \cdot 0.1 + 1 \cdot 0.2
= 0.2

Therefore, \nabla \ell_2(\boldsymbol{\beta}) = 0.2 at \boldsymbol{\beta} = 0.
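As a quick numerical check, the gradient at \boldsymbol{\beta} = 0 can be evaluated directly in code (a minimal sketch; the variable names X, Y, beta are illustrative):

```python
import math

# Data from the problem: X_i and Y_i for i = 1, 2
X = [0.1, 0.2]
Y = [1, 2]
beta = 0.0

# Gradient of the Poisson log-likelihood: sum_i (Y_i - e^{X_i * beta}) * X_i
grad = sum((y - math.exp(x * beta)) * x for x, y in zip(X, Y))
print(grad)  # ~0.2
```

The first term vanishes because Y_1 = 1 equals e^{X_1 \cdot 0} = 1, so only the second observation contributes.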

Finally, to find the new \boldsymbol{\beta}, we can use the update rule:

\boldsymbol{\beta}_{\text{new}} = \boldsymbol{\beta} + t \cdot \nabla \ell_2(\beta)

Plugging in t = 0.01 and \nabla \ell_2(\beta) = 0.2:

\boldsymbol{\beta}_{\text{new}} = 0 + 0.01 \cdot 0.2 = 0.002
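The full gradient-ascent step above can be reproduced in a few lines (a minimal sketch under the same problem data; beta is a scalar here since each X_i is a scalar):

```python
import math

# Problem data
X = [0.1, 0.2]
Y = [1, 2]
beta = 0.0
t = 0.01  # step size

# One gradient-ascent update: beta <- beta + t * grad
grad = sum((y - math.exp(x * beta)) * x for x, y in zip(X, Y))
beta_new = beta + t * grad
print(beta_new)  # ~0.002
```

Repeating this update would continue to increase the log-likelihood, but the question asks only for the result of a single step.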