Everyone reports p-values. Almost no one interprets them correctly.
Null hypothesis significance testing (NHST) is the standard framework: you test your data against a null hypothesis (H0), and a conventional cutoff of p ≤ 0.05 is used to reject H0.
The p-value is the probability of getting results as extreme as, or more extreme than, what we observed — assuming the null hypothesis is true.
The value is always conditional on H0 being true. But what we actually want is the reverse: the probability that the null hypothesis is true, given the data we observed.
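A short worked example makes the direction of conditioning concrete. The numbers below are purely illustrative assumptions (a prior of 0.5 and an assumed likelihood of the data under the alternative); the point is only that P(H0 | data) comes out of Bayes' theorem and need not be anywhere near P(data | H0):

```python
# All inputs here are hypothetical, chosen only to illustrate
# that P(data | H0) and P(H0 | data) are different quantities.
p_data_given_h0 = 0.05   # the direction the p-value conditions on
p_data_given_h1 = 0.30   # assumed: how likely the data is if the effect is real
prior_h0 = 0.5           # assumed: prior probability that H0 is true

# Bayes' theorem: P(H0 | data) = P(data | H0) P(H0) / P(data)
p_data = p_data_given_h0 * prior_h0 + p_data_given_h1 * (1 - prior_h0)
p_h0_given_data = p_data_given_h0 * prior_h0 / p_data

print(round(p_h0_given_data, 3))  # 0.143
```

Even with these generous assumptions, the posterior probability of the null (about 14%) is nearly three times the p-value.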
"Extreme" means far out in the tails of the test statistic's null distribution. For z- and t-based tests we standardise the observed effect by its standard error; for other tests (e.g., χ², F), extremeness is measured on those statistics' own scales.
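For a z-based test this is easy to compute directly. The sketch below uses only the standard library: the two-sided p-value for a standardised statistic z is the probability of a standard normal falling at least |z| from zero, which is `erfc(|z| / √2)`:

```python
import math

def two_sided_p(z):
    """P(|Z| >= |z|) under the standard normal null distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# z = 1.96 recovers the conventional cutoff
print(round(two_sided_p(1.96), 3))  # 0.05
```

The same idea applies to t, χ², or F statistics, just with the tail probability taken from the appropriate null distribution instead of the normal.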
From Steven Goodman's "A Dirty Dozen: Twelve P-Value Misconceptions", four mistakes that persist across research and industry:
You can convert a standard regression output into the probability the null is actually true. You need three numbers:
Null Probability is the probability the null hypothesis is true given your data.
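One common recipe for this conversion is Colquhoun-style "false positive risk": among results declared significant at level α, what fraction are true nulls? The sketch below assumes that framing; the significance threshold, the test's power, and the prior probability that the effect is real are the three inputs, and the power and prior values used here are illustrative assumptions:

```python
def null_probability(alpha, power, prior_h1):
    """False positive risk: P(H0 true | result declared significant).

    alpha    -- significance threshold (the p-value cutoff)
    power    -- P(reject H0 | effect is real)
    prior_h1 -- assumed prior probability that the effect is real
    """
    false_pos = alpha * (1 - prior_h1)   # significant results from true nulls
    true_pos = power * prior_h1          # significant results from real effects
    return false_pos / (false_pos + true_pos)

# With a 50/50 prior and 80% power, p = 0.05 implies ~6% null probability;
# with a long-shot prior of 10%, the same p-value implies ~36%.
print(round(null_probability(0.05, 0.8, 0.5), 3))  # 0.059
print(round(null_probability(0.05, 0.8, 0.1), 3))  # 0.36
```

The contrast between the two calls is the whole point: the same regression output can correspond to very different null probabilities depending on the prior and power you bring to it.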