In this problem, we will explore how to update a belief successively as data are observed. The model is as follows:

\theta \in \Theta, the parameter space; and \pi(\cdot) is the prior distribution of \theta.

We observe i.i.d. (conditional on the parameter) data X_1, \ldots, X_n and calculate the likelihood function L_n(X_1, \ldots, X_n | \theta) (as in the setting of maximum likelihood estimation).

Write \phi(X_1, \ldots, X_n) as a placeholder function that depends on X_1, \ldots, X_n, but not on the parameter \theta. (\phi could stand for different functions in different equations. It is simply a placeholder whenever we want to collect terms that depend only on X_1, \ldots, X_n.)

In this context, we add observations one by one, computing the likelihood L_i(X_1, \ldots, X_i | \theta) and the posterior \pi(\theta | X_1, \ldots, X_i) after each observation i. Which of the following identities are true? (Choose all that apply.)

\pi(\theta | X_1, \ldots, X_n) = \pi(\theta) \cdot L_n(X_1, \ldots, X_n | \theta) \cdot \phi(X_1, \ldots, X_n)

L_n(X_1, \ldots, X_n | \theta) = L_{n-1}(X_1, \ldots, X_{n-1} | \theta) \cdot L_1(X_n | \theta)

\pi(\theta | X_1, \ldots, X_n) = \pi(\theta | X_1, \ldots, X_{n-1}) \cdot L_1(X_n | \theta) \cdot \phi(X_1, \ldots, X_n)

L_n(X_1, \ldots, X_n | \theta) = \frac{\pi(\theta | X_1, \ldots, X_n)}{\pi(\theta)} \cdot \phi(X_1, \ldots, X_n)


All four identities are true:

1. \pi(\theta | X_1, \ldots, X_n) = \pi(\theta) \cdot L_n(X_1, \ldots, X_n | \theta) \cdot \phi(X_1, \ldots, X_n)

This is Bayes' theorem for the posterior distribution \pi(\theta | X_1, \ldots, X_n): the posterior is proportional to the prior \pi(\theta) times the likelihood L_n(X_1, \ldots, X_n | \theta). The normalizing constant depends on the data but not on \theta, so it is absorbed into the placeholder \phi(X_1, \ldots, X_n).
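Spelled out with the normalizing constant made explicit (assuming the prior has a density and the integral below is finite):

\pi(\theta | X_1, \ldots, X_n) = \frac{\pi(\theta) \, L_n(X_1, \ldots, X_n | \theta)}{\int_\Theta \pi(\vartheta) \, L_n(X_1, \ldots, X_n | \vartheta) \, d\vartheta}

so here \phi(X_1, \ldots, X_n) = \left( \int_\Theta \pi(\vartheta) \, L_n(X_1, \ldots, X_n | \vartheta) \, d\vartheta \right)^{-1}, which indeed depends only on the data.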

2. L_n(X_1, \ldots, X_n | \theta) = L_{n-1}(X_1, \ldots, X_{n-1} | \theta) \cdot L_1(X_n | \theta)

This follows from conditional independence: given \theta, the observations are i.i.d., so the joint likelihood of the n observations factorizes into the likelihood of the first n-1 observations times the likelihood of the nth observation.
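Writing f_\theta for the common density of a single observation given \theta (notation introduced here just for the derivation), the factorization is

L_n(X_1, \ldots, X_n | \theta) = \prod_{i=1}^{n} f_\theta(X_i) = \left( \prod_{i=1}^{n-1} f_\theta(X_i) \right) \cdot f_\theta(X_n) = L_{n-1}(X_1, \ldots, X_{n-1} | \theta) \cdot L_1(X_n | \theta)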

3. \pi(\theta | X_1, \ldots, X_n) = \pi(\theta | X_1, \ldots, X_{n-1}) \cdot L_1(X_n | \theta) \cdot \phi(X_1, \ldots, X_n)

This is the recursive update formula for the posterior distribution: the posterior after n-1 observations plays the role of the prior for the nth observation. It follows by combining identities 1 and 2 and collecting every factor that does not involve \theta into \phi(X_1, \ldots, X_n). A numerical check is sketched below.
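As a minimal numerical sketch of this recursion, assuming a Bernoulli(\theta) model with a uniform prior on a discrete grid of \theta values (the grid, the data, and all function names here are illustrative, not part of the problem), updating one observation at a time gives the same posterior as a single batch update:

import numpy as np

# Discrete grid standing in for the parameter space Theta (illustrative).
thetas = np.linspace(0.01, 0.99, 99)
prior = np.ones_like(thetas) / len(thetas)  # uniform prior pi(theta)

def likelihood_one(x, theta):
    # L_1(x | theta) for a single Bernoulli(theta) observation
    return theta**x * (1.0 - theta)**(1 - x)

data = [1, 0, 1, 1, 0, 1]  # made-up observations X_1, ..., X_n

# Sequential route (identity 3): the posterior after i-1 observations
# serves as the prior for observation i; renormalizing after each step
# plays the role of phi, since the normalizer depends only on the data.
posterior = prior.copy()
for x in data:
    posterior = posterior * likelihood_one(x, thetas)
    posterior /= posterior.sum()

# Batch route (identities 1 and 2): prior times the full likelihood L_n.
batch = prior * np.prod([likelihood_one(x, thetas) for x in data], axis=0)
batch /= batch.sum()

assert np.allclose(posterior, batch)  # both routes agree
print("max abs difference:", np.abs(posterior - batch).max())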

4. L_n(X_1, \ldots, X_n | \theta) = \frac{\pi(\theta | X_1, \ldots, X_n)}{\pi(\theta)} \cdot \phi(X_1, \ldots, X_n)

This is a rearrangement of Bayes' theorem (identity 1): dividing the posterior by the prior recovers the likelihood up to a factor that depends only on the data. Note that the \phi here is the reciprocal of the \phi in identity 1; since \phi is merely a placeholder for data-dependent terms, this is allowed.
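Explicitly, dividing the displayed form of identity 1 above by \pi(\theta) and solving for the likelihood gives

L_n(X_1, \ldots, X_n | \theta) = \frac{\pi(\theta | X_1, \ldots, X_n)}{\pi(\theta)} \cdot \int_\Theta \pi(\vartheta) \, L_n(X_1, \ldots, X_n | \vartheta) \, d\vartheta

so here \phi(X_1, \ldots, X_n) is exactly the marginal density of the data.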