Everyone reports p-values. Almost no one interprets them correctly.

Null Hypothesis Significance Testing

Null Hypothesis Significance Testing (NHST) is the standard framework for testing data against a null hypothesis (H0). The conventional cutoff for rejecting H0 is p ≤ 0.05.

What a p-value actually is

The p-value is the probability of getting results as extreme as, or more extreme than, what we observed — assuming the null hypothesis is true.

The value is always conditional on H0 being true. But what we actually want is the reverse: the probability that the null hypothesis is true, given the data we observed.

How "extreme" is measured

"Extreme" means far into the tails of the test statistic's null distribution. For z/t-based tests we standardise the effect by its standard error. For other tests (e.g., χ², F), extremeness is measured on those statistics' own scales.
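As a minimal sketch of the z-based case: standardise a hypothetical effect by its standard error, then take the area in both tails of the standard normal beyond |z|. The effect and standard error here are made-up numbers, not from any real analysis.

```python
import math

# Hypothetical effect estimate and its standard error (illustrative values).
effect, se = 1.0, 0.5
z = effect / se                        # standardised test statistic, z = 2.0
# Two-sided p-value: area in both tails beyond |z| under the standard normal.
p = math.erfc(abs(z) / math.sqrt(2))
print(round(p, 4))                     # ≈ 0.0455
```

The familiar "p ≈ 0.05 at z ≈ 2" rule of thumb drops out directly.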

Common Misinterpretations

From Goodman's "A Dirty Dozen" — four mistakes that persist across research and industry:

  1. "If p = .05, the null has a 5% chance of being true."
  2. "A non-significant p (> .05) means no difference."
  3. "p = .05 means these exact data would occur only 5% of the time under H0."
  4. "If you reject at p = .05, the Type I error probability for this decision is 5%."
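Misinterpretation 1 is easy to demonstrate by simulation. In this sketch, half of all tested hypotheses are truly null, and every setting (effect size, sample size, the 50/50 base rate) is illustrative: among the "studies" that reach significance at the .05 level, far more than 5% are still nulls.

```python
import random

random.seed(0)

def simulate(n_studies=20000, n=25, effect=0.2):
    """Fraction of 'significant' results where the null was actually true."""
    sig_null = sig_real = 0
    for _ in range(n_studies):
        null_true = random.random() < 0.5      # 50/50 base rate (assumed)
        mu = 0.0 if null_true else effect
        xs = [random.gauss(mu, 1) for _ in range(n)]
        z = (sum(xs) / n) / (1 / n ** 0.5)     # standardised sample mean
        if abs(z) > 1.96:                      # "significant" at the .05 level
            if null_true:
                sig_null += 1
            else:
                sig_real += 1
    return sig_null / (sig_null + sig_real)

print(simulate())  # roughly 0.2: ~20% of significant results are true nulls
```

With a small true effect and modest power, the share of significant results that are nulls lands around 20%, nowhere near the 5% the misinterpretation suggests.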

From P-Values to Pr(H0 | Data)

You can convert a standard regression output into an approximate probability that the null is true. You need three numbers:

  1. Get T Stat, Observations (n), and Degrees of Freedom (df) from your regression output.
  2. Compute an approximate Bayes factor favouring the null (BF01) using these values.
  3. Convert BF01 into Pr(H0 | data) under neutral 50/50 prior odds.

Excel-ready formulas

Part 1: Bayes Factor (BF01)    =SQRT(n) * (1 + (T_Stat^2 / df)) ^ (-n / 2)
Part 2: Null Probability       =BF01 / (1 + BF01)
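The same two formulas translate directly into a short Python sketch. The inputs (t statistic, n, df) are hypothetical regression outputs, and the 50/50 prior odds are baked in as in the Excel version.

```python
import math

def null_probability(t_stat, n, df):
    """Approximate Pr(H0 | data) from regression output, 50/50 prior odds."""
    # Part 1: approximate Bayes factor favouring the null (BF01).
    bf01 = math.sqrt(n) * (1 + t_stat ** 2 / df) ** (-n / 2)
    # Part 2: convert BF01 to a posterior probability under even prior odds.
    return bf01 / (1 + bf01)

print(round(null_probability(2.0, 50, 48), 2))  # ≈ 0.49
```

Note that a t statistic of 2 — "significant" at p ≈ 0.05 — still leaves the null at roughly even odds here, which is the whole point of the conversion.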

Reading the Result

Null Probability is Pr(H0 | data): the probability the null hypothesis is true given your data, assuming the neutral 50/50 prior odds above.

> 0.5    Null more likely true than false
< 0.5    Alternative more likely true than null
= 0.5    Both hypotheses equally likely

Example: a value of 0.73 means a 73% chance the null is true.