Events

Assault 08/16/24
Car break-in 10/01/24
Car break-in 01/08/25
Mailbox theft 01/18/25
Mailbox theft 02/09/25

Factual background

My wife and I have lived at the same address here in Berkeley, owned the same vehicles, and had the same jobs since 03/15/23, about 23 months (~700 days) ago. There was no criminal activity against us until 08/16/24. Then these 5 incidents occurred in roughly the past 6 months (~183 days). We do not know of any other changes in our neighborhood or any other circumstances.

Probability questions

If these events are independent and are simply the result of living here, then the first hypothesis is that they should have been distributed more evenly over the past 23 months. However, they happened only in the past 6 months, so the second hypothesis is that these events are not random but caused by something that happened on or about 08/16/24.

Given these facts, what is the chance that these 5 events should have happened over 23 months instead of over 6 months? How much more likely is it that the first hypothesis is correct and not the second hypothesis?

1 answer

To analyze the probabilities of the two hypotheses regarding the distribution of incidents, we can employ a statistical approach based on Poisson processes, which are commonly used to model the occurrence of events over time.

Step 1: Define the Parameters

  1. Total time period analyzed:

    • 23 months = 23 × 30 = 690 days (approximately, assuming average month length for simplicity).
    • 6 months = 6 × 30 = 180 days.
  2. Number of incidents:

    • Total incidents: 5.
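The 30-day month is a rounding convenience. As a quick check, the exact day counts can be computed from the dates given in the question. This is a minimal sketch; using the date of the last incident as the end of the observation window is an assumption, since the question does not state the date it was written.

```python
from datetime import date

# Dates taken from the question (written there as MM/DD/YY).
move_in = date(2023, 3, 15)
first_incident = date(2024, 8, 16)
# Assumption: the observation window ends on the date of the last incident.
last_incident = date(2025, 2, 9)

total_days = (last_incident - move_in).days
recent_days = (last_incident - first_incident).days

print(f"Full period: {total_days} days")
print(f"Period containing the incidents: {recent_days} days")
```

Under that assumption the counts come out to 697 and 177 days, close enough to the rounded 690 and 180 that the approximation should not change the conclusion.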

Step 2: Hypothesis Testing

Null Hypothesis (H0): The events are independent and occur according to a Poisson process with a constant rate over the full 23 months.

The Poisson distribution gives us the probability of k events occurring in a fixed interval when the events occur independently with a known constant mean rate (λ).

Given 5 incidents over 23 months:

  • Average rate of events (λ) over 23 months (690 days):

    \[ \lambda_{23} = \frac{5 \text{ events}}{690 \text{ days}} \approx 0.007246 \text{ events/day}. \]

  • For the 6-month period (180 days), the expected number of incidents would be:

    \[ \lambda_{6} = \lambda_{23} \times 180 \approx 0.007246 \times 180 \approx 1.304. \]
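The two quantities above can be reproduced in a few lines. This is a small sketch using the rounded day counts from Step 1; the variable names are illustrative only.

```python
n_events = 5
total_days = 690    # 23 months x 30 days, as in Step 1
window_days = 180   # 6 months x 30 days, as in Step 1

# Average daily rate over the whole period, then scaled to the 6-month window.
rate_per_day = n_events / total_days
expected_in_window = rate_per_day * window_days

print(f"Rate: {rate_per_day:.6f} events/day")
print(f"Expected events in {window_days} days: {expected_in_window:.3f}")
```

Note that the rate is estimated from the same 5 events whose timing is being tested, so it is best read as a rough baseline and not as an independently known rate.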

Step 3: Calculate the Probabilities

The probability of observing k events (here k = 5) in the 6-month window, given the expected count λ ≈ 1.304, follows from the Poisson formula:

\[ P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!} \]

For k=5:

\[ P(X = 5 | \lambda_{6}) = \frac{e^{-1.304} (1.304)^5}{5!}. \]

Calculating this, with \( e^{-1.304} \approx 0.2714 \), \( (1.304)^5 \approx 3.770 \), and \( 5! = 120 \):

\[ P(X = 5 | \lambda_{6}) \approx \frac{0.2714 \times 3.770}{120} \approx \frac{1.023}{120} \approx 0.0085. \]

This means that if events are random with a mean rate based on the entire 23-month history, the probability of observing exactly 5 events in the last 6 months is about 0.85%.
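The Poisson formula can be evaluated numerically as a check on the hand calculation. The sketch below uses only the standard library; `poisson_pmf` is a helper name introduced here for illustration. It also prints the probability of 5 or more events, which is often the more natural quantity when asking how surprising a cluster is.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Expected count in the 180-day window at the 23-month average rate.
lam_6 = 5 / 690 * 180

p_exactly_5 = poisson_pmf(5, lam_6)
# Tail probability: 1 minus the probability of 0, 1, 2, 3, or 4 events.
p_at_least_5 = 1.0 - sum(poisson_pmf(k, lam_6) for k in range(5))

print(f"P(X = 5)  = {p_exactly_5:.4f}")
print(f"P(X >= 5) = {p_at_least_5:.4f}")
```

The tail probability is somewhat larger than the point probability, but both are small.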

Step 4: Assessment of the Hypotheses

Next, to conclude how much more likely one hypothesis is over another, we consider the probability of observing the distribution of events under both hypotheses:

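  • If events are independent (H0):

    • Evaluating the Poisson formula at the 6-month expected count gives \( P(X = 5 | \lambda_{6}) = \frac{e^{-1.304} (1.304)^5}{5!} \approx 0.0085 \).
  • For hypothesis H1 (events caused by something on or after 08/16/24), no event rate is specified, so its likelihood cannot be computed without a further assumption about the rate after the change. H1 can accommodate a rate high enough to make 5 events in 6 months unremarkable, while H0 predicts only about 1.3 events in that window.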
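To put a number on the comparison, H1 needs a concrete rate. As a purely illustrative assumption (it does not come from the question), suppose H1 means the rate after 08/16/24 is 5 events per 180 days. The ratio of the two likelihoods can then be sketched as follows; `lam_h1` and `poisson_pmf` are hypothetical names.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# H0: the 23-month average rate applied to the 180-day window.
lam_h0 = 5 / 690 * 180
# H1 (assumption): an elevated rate of 5 events per 180 days after the change.
lam_h1 = 5.0

likelihood_ratio = poisson_pmf(5, lam_h1) / poisson_pmf(5, lam_h0)
print(f"Likelihood ratio H1 : H0 = {likelihood_ratio:.1f}")
```

Because the assumed H1 rate was picked to match the observed count exactly, this ratio is the most favorable case for an H1 of this form. Any other assumed rate would give a smaller ratio.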

Step 5: Conclusion

  • Under H0, the probability of exactly 5 events in a 180-day window at the 23-month average rate (an expected count of about 1.3) is low, approximately 0.0085.
  • This suggests that the events are unlikely to be a random run at a constant rate, and that a change around 08/16/24 is the more plausible explanation.

Thus, the second hypothesis (events are not random and are caused by some factor since 08/16/24) is more plausible than the first, based on the low probability computed. Two caveats apply: the 6-month window was chosen after seeing the data, and the rate was estimated from the same 5 events, so the figure should be read as a rough indication and not as a formal test. Without more specific data or a stated rate under the second hypothesis, the exact increase in likelihood cannot be quantified.
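A related way to frame the same question is to condition on the total: given that exactly 5 events fell somewhere in the 690 days, and assuming a constant rate, each event lands independently and uniformly in time, so the chance that all 5 fall in the final 180 days is (180/690)^5. The sketch below computes that value and checks it with a small simulation. This is an alternative framing offered for comparison, not the calculation used in the steps above; the seed and trial count are arbitrary choices.

```python
import random

total_days = 690
window_days = 180
n_events = 5

# Exact: each event independently lands in the final window with probability 180/690.
p_exact = (window_days / total_days) ** n_events

# Monte Carlo check: scatter 5 event times uniformly and count the runs where
# even the earliest event falls inside the final 180 days.
rng = random.Random(0)  # fixed seed for reproducibility
trials = 200_000
hits = 0
for _ in range(trials):
    times = [rng.uniform(0, total_days) for _ in range(n_events)]
    if min(times) >= total_days - window_days:
        hits += 1
p_sim = hits / trials

print(f"Exact:     {p_exact:.5f}")
print(f"Simulated: {p_sim:.5f}")
```

The exact value is about 0.0012, which points the same way as the Poisson calculation: a cluster this tight is unlikely under a constant rate.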