Everyone reports p-values. Almost no one interprets them correctly.

Null Hypothesis Significance Testing

Null Hypothesis Significance Testing (NHST) is the standard framework for testing data against a null hypothesis (H0). The conventional cutoff for rejecting H0 is p ≤ 0.05.

What a p-value actually is

The p-value is the probability of getting results as extreme as, or more extreme than, what we observed — assuming the null hypothesis is true.

The value is always conditional on H0 being true. But what we actually want is the reverse: the probability that the null hypothesis is true, given the data we observed.

How "extreme" is measured

"Extreme" means far into the tails of the test statistic's null distribution. For z/t-based tests we standardise the effect by its standard error. For other tests (e.g., χ², F), extremeness is measured on those statistics' own scales.
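As a minimal sketch of the z-based case: standardise a hypothetical effect by its standard error, then take the area in both tails of the standard normal beyond |z|. The effect and standard error here are made-up numbers, not from any real analysis.

```python
import math

# Hypothetical effect estimate and its standard error (illustrative values).
effect, se = 1.0, 0.5
z = effect / se                        # standardised test statistic, z = 2.0
# Two-sided p-value: area in both tails beyond |z| under the standard normal.
p = math.erfc(abs(z) / math.sqrt(2))
print(round(p, 4))                     # ≈ 0.0455
```

The familiar "p ≈ 0.05 at z ≈ 2" rule of thumb drops out directly.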

Common Misinterpretations

From Goodman's "A Dirty Dozen" — four mistakes that persist across research and industry:

  1. "If p = .05, the null has a 5% chance of being true."
  2. "A non-significant p (> .05) means no difference."
  3. "p = .05 means these exact data would occur only 5% of the time under H0."
  4. "If you reject at p = .05, the Type I error probability for this decision is 5%."
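Misinterpretation 1 is easy to demonstrate by simulation. In this sketch, half of all tested hypotheses are truly null, and every setting (effect size, sample size, the 50/50 base rate) is illustrative: among the "studies" that reach significance at the .05 level, far more than 5% are still nulls.

```python
import random

random.seed(0)

def simulate(n_studies=20000, n=25, effect=0.2):
    """Fraction of 'significant' results where the null was actually true."""
    sig_null = sig_real = 0
    for _ in range(n_studies):
        null_true = random.random() < 0.5      # 50/50 base rate (assumed)
        mu = 0.0 if null_true else effect
        xs = [random.gauss(mu, 1) for _ in range(n)]
        z = (sum(xs) / n) / (1 / n ** 0.5)     # standardised sample mean
        if abs(z) > 1.96:                      # "significant" at the .05 level
            if null_true:
                sig_null += 1
            else:
                sig_real += 1
    return sig_null / (sig_null + sig_real)

print(simulate())  # roughly 0.2: ~20% of significant results are true nulls
```

With a small true effect and modest power, the share of significant results that are nulls lands around 20%, nowhere near the 5% the misinterpretation suggests.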

From P-Values to Pr(H0 | Data)

You can convert a standard regression output into an approximate probability that the null is true. You need three numbers:

  1. Get T Stat, Observations (n), and Degrees of Freedom (df) from your regression output.
  2. Compute an approximate Bayes factor favouring the null (BF01) using these values.
  3. Convert BF01 into Pr(H0 | data) under neutral 50/50 prior odds.

Excel-ready formulas

Part 1: Bayes Factor (BF01)    =SQRT(n) * (1 + (T_Stat^2 / df)) ^ (-n / 2)
Part 2: Null Probability       =BF01 / (1 + BF01)
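The same two formulas translate directly into a short Python sketch. The inputs (t statistic, n, df) are hypothetical regression outputs, and the 50/50 prior odds are baked in as in the Excel version.

```python
import math

def null_probability(t_stat, n, df):
    """Approximate Pr(H0 | data) from regression output, 50/50 prior odds."""
    # Part 1: approximate Bayes factor favouring the null (BF01).
    bf01 = math.sqrt(n) * (1 + t_stat ** 2 / df) ** (-n / 2)
    # Part 2: convert BF01 to a posterior probability under even prior odds.
    return bf01 / (1 + bf01)

print(round(null_probability(2.0, 50, 48), 2))  # ≈ 0.49
```

Note that a t statistic of 2 — "significant" at p ≈ 0.05 — still leaves the null at roughly even odds here, which is the whole point of the conversion.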

Reading the Result

Null Probability is Pr(H0 | data): the probability the null hypothesis is true given your data, assuming the neutral 50/50 prior odds above.

> 0.5    Null more likely true than false
< 0.5    Alternative more likely true than null
= 0.5    Both hypotheses equally likely

Example: a value of 0.73 means a 73% chance the null is true.