Return to the original model. We now introduce a Poisson intensity parameter \, \lambda _ t \, for every time point and denote by \, \theta _ t \, the parameter \, \eta \, of the canonical exponential family representation above. We choose a linear model connecting the time points \, t \, with the canonical parameter \, \theta \, of the Poisson distribution, i.e.,

\theta_t = a + bt.

In other words, we choose a generalized linear model with a Poisson response and its canonical link function. This also means that, conditioned on \, t \,, we assume the \, Y_ t \, to be independent.

Imagine we observe the following data:

\, t_1 = 1 \,: 1 outbreak
\, t_2 = 2 \,: 3 outbreaks
\, t_3 = 4 \,: 10 outbreaks
We want to produce a maximum likelihood estimator for (a,b). To this end, write down the log likelihood \ell (a,b) of the model for the provided three observations at t_1, t_2, and t_3 (plug in their values).

\ell (a,b) =

What is its gradient? Enter your answer as a pair of derivatives.

\partial _ a\ell (a,b) =

\partial _ b\ell (a,b) =

Answer:

To calculate the log likelihood, we need to sum up the log probabilities of observing the given outbreaks at each time point. The log probability of observing \, Y_t \, outbreaks at time \, t \, is given by:

\log P(Y_t | \lambda_t) = Y_t \log \lambda_t - \lambda_t - \log(Y_t!)
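As a quick numerical check (a minimal sketch, assuming Python with NumPy/SciPy is available; the function name poisson_logpmf is ours), this formula agrees with SciPy's built-in Poisson log-pmf:

```python
import numpy as np
from scipy.special import gammaln   # log(Y!) = gammaln(Y + 1)
from scipy.stats import poisson

def poisson_logpmf(y, lam):
    """Log probability of y counts under Poisson(lam), per the formula above."""
    return y * np.log(lam) - lam - gammaln(y + 1)

# Compare against scipy.stats.poisson.logpmf at a few (y, lambda) pairs.
for y, lam in [(1, 2.0), (3, 2.5), (10, 8.0)]:
    assert np.isclose(poisson_logpmf(y, lam), poisson.logpmf(y, mu=lam))
```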

Using the linear model \, \theta_t = a + bt \, and inverting the canonical (log) link of the Poisson distribution, \, \lambda_t = e^{\theta_t} \,, we can express \, \lambda_t \, in terms of \, a \, and \, b \, as:

\lambda_t = e^{a + bt}

Plugging in the observed values, we have:

\log P(Y_1 | \lambda_1) = 1 \log (e^{a + b}) - e^{a + b} - \log(1!)
\log P(Y_2 | \lambda_2) = 3 \log (e^{a + 2b}) - e^{a + 2b} - \log(3!)
\log P(Y_3 | \lambda_3) = 10 \log (e^{a + 4b}) - e^{a + 4b} - \log(10!)

The log likelihood is the sum of these log probabilities:

\ell(a, b) = \log P(Y_1 | \lambda_1) + \log P(Y_2 | \lambda_2) + \log P(Y_3 | \lambda_3)

Since \, \log(e^{a + bt}) = a + bt \, and \, \log(1!) = 0 \,, this simplifies to:

\ell(a, b) = (a + b) + 3(a + 2b) + 10(a + 4b) - e^{a + b} - e^{a + 2b} - e^{a + 4b} - \log(3!) - \log(10!)

\ell(a, b) = 14a + 47b - e^{a + b} - e^{a + 2b} - e^{a + 4b} - \log(3!) - \log(10!)
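In code, the log likelihood for these three observations can be written as follows (a minimal sketch, assuming NumPy/SciPy; log_likelihood is our own name):

```python
import numpy as np
from scipy.special import gammaln

# Observed data from the problem statement.
t = np.array([1.0, 2.0, 4.0])    # time points
y = np.array([1.0, 3.0, 10.0])   # outbreak counts

def log_likelihood(a, b):
    """ell(a, b) for the Poisson GLM with lambda_t = exp(a + b*t)."""
    theta = a + b * t                                    # canonical parameter
    return np.sum(y * theta - np.exp(theta) - gammaln(y + 1))
```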

Now, let's calculate the gradient of the log likelihood with respect to \, a \, and \, b \, separately.

Taking the partial derivative of \, \ell(a, b) \, with respect to \, a \, gives:

\frac{\partial \ell(a, b)}{\partial a} = \frac{\partial}{\partial a} \left( 14a + 47b - e^{a + b} - e^{a + 2b} - e^{a + 4b} \right)

The factorial terms are constants and vanish; the linear part contributes 14, and each exponential term \, -e^{a + bt} \, contributes \, -e^{a + bt} \, by the chain rule:

\frac{\partial \ell(a, b)}{\partial a} = 14 - e^{a + b} - e^{a + 2b} - e^{a + 4b}

Equivalently, \, \partial_a \ell(a,b) = \sum_t (Y_t - \lambda_t) \,: the total observed count minus the total expected count.

Similarly, taking the partial derivative of \, \ell(a, b) \, with respect to \, b \, gives:

\frac{\partial \ell(a, b)}{\partial b} = \frac{\partial}{\partial b} \left( 14a + 47b - e^{a + b} - e^{a + 2b} - e^{a + 4b} \right)

Here the linear part contributes 47, and by the chain rule each exponential term picks up a factor of \, t \,:

\frac{\partial \ell(a, b)}{\partial b} = 47 - e^{a + b} - 2e^{a + 2b} - 4e^{a + 4b}

Equivalently, \, \partial_b \ell(a,b) = \sum_t t \, (Y_t - \lambda_t) \,.

Therefore, the gradient of the log likelihood is:

\frac{\partial \ell(a, b)}{\partial a} = 14 - e^{a + b} - e^{a + 2b} - e^{a + 4b}
\frac{\partial \ell(a, b)}{\partial b} = 47 - e^{a + b} - 2e^{a + 2b} - 4e^{a + 4b}

Setting both components to zero gives the first-order conditions for the maximum likelihood estimator \, (\hat{a}, \hat{b}) \,; they have no closed-form solution, so the MLE is found numerically.
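To see the gradient in action, here is a sketch that maximizes \ell numerically with SciPy (assuming NumPy/SciPy; the constant factorial terms are dropped since they do not affect the maximizer, and the starting point (0, 0) is an arbitrary choice):

```python
import numpy as np
from scipy.optimize import minimize

t = np.array([1.0, 2.0, 4.0])    # time points
y = np.array([1.0, 3.0, 10.0])   # outbreak counts

def neg_log_likelihood(params):
    """Negative log likelihood (constants dropped) for minimization."""
    a, b = params
    theta = a + b * t
    return -np.sum(y * theta - np.exp(theta))

def neg_gradient(params):
    """Negative of the gradient derived above: sum_t (Y_t - lambda_t) * (1, t)."""
    a, b = params
    resid = y - np.exp(a + b * t)
    return -np.array([np.sum(resid), np.sum(t * resid)])

res = minimize(neg_log_likelihood, x0=[0.0, 0.0], jac=neg_gradient)
a_hat, b_hat = res.x
print("MLE:", a_hat, b_hat)
print("fitted intensities:", np.exp(a_hat + b_hat * t))
```

At the optimum the first-order conditions hold, so \, \sum_t \hat{\lambda}_t = \sum_t Y_t = 14 \,: the fitted total count matches the observed total count.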