The Birthday Paradox: Exact Formulae, Stirling, and Series Expansions
Introduction: One Room, Many Birthdays
How many people do you need in a room before two share a birthday? With $d=365$ equally likely days and independent birthdays, the chance passes $50\%$ by $n=23$.
This note collects the full toolkit—exact counting, Stirling’s approximation, series/Taylor expansions with coefficients, and inclusion–exclusion—into one clean walkthrough. Throughout, keep $d$ generic (plug $d=365$ at the end).
Setup and Assumptions
- $n$ birthdays, each day among $d$ is equally likely (ignore Feb 29).
- Birthdays are independent.
Goal: \(\mathbb{P}(\text{at least one shared birthday among } n \text{ people}) \;=\; 1 - \mathbb{P}(\text{all distinct}).\)
Exact Probability (Two Equivalent Derivations)
Sequential (falling factorial)
\(\mathbb{P}(\text{all distinct}) \;=\; \prod_{k=0}^{n-1}\frac{d-k}{d} \;=\; \frac{(d)_n}{d^n} \;=\; \frac{d!}{(d-n)!\,d^n}.\)
Choose–then–assign
Pick $n$ distinct days $\binom{d}{n}$ and permute across $n$ people $(n!)$: \(\mathbb{P}(\text{all distinct}) \;=\; \frac{\binom{d}{n}n!}{d^n} \;=\; \frac{d!}{(d-n)!\,d^n}.\) Hence \(\mathbb{P}(\text{match}) = 1 - \frac{d!}{(d-n)!\,d^n}.\)
Inclusion–Exclusion (structure behind coefficients)
Let $A_{ij}$ be the event that persons $i$ and $j$ share a birthday. Then \(\mathbb{P}(\text{match}) \;=\; \sum_{i<j}\mathbb{P}(A_{ij}) - \sum_{(i<j)\neq(k<\ell)}\mathbb{P}(A_{ij}\cap A_{k\ell}) + \cdots\)
- First term: $\binom{n}{2}\cdot \frac{1}{d}$.
- Second term aggregates cases with three or four distinct people; these scale as $1/d^{2}$ with different multiplicities that combine to the second-order coefficient seen in the series below.
Series/Taylor Expansions (and the coefficients)
Start from \(\ln \mathbb{P}(\text{all distinct}) \;=\; \sum_{k=0}^{n-1}\ln\Bigl(1-\frac{k}{d}\Bigr).\) Use $\ln(1-x) = -x - \frac{x^{2}}{2} - \frac{x^{3}}{3} - \cdots$ with $x=\frac{k}{d}$ and the power sums \(\sum_{k=0}^{n-1}k = \frac{n(n-1)}{2},\quad \sum_{k=0}^{n-1}k^{2} = \frac{n(n-1)(2n-1)}{6},\quad \sum_{k=0}^{n-1}k^{3} = \frac{n^{2}(n-1)^{2}}{4}.\) Then \(\ln \mathbb{P}(\text{all distinct}) \;\approx\; -\underbrace{\frac{n(n-1)}{2d}}_{A} -\underbrace{\frac{n(n-1)(2n-1)}{12\,d^{2}}}_{B} -\underbrace{\frac{n^{2}(n-1)^{2}}{12\,d^{3}}}_{C} \;+\; O\!\Bigl(\frac{n^{5}}{d^{4}}\Bigr).\) Keep these terms in the exponent: \(\mathbb{P}(\text{match}) \approx 1 - \exp\!\bigl(-(A+B+C)\bigr).\) If a plain power series in $1/d$ is preferred, expand $1-e^{-(A+B+C)}$ to third order: \(\mathbb{P}(\text{match}) \approx \underbrace{\frac{\binom{n}{2}}{d}}_{\mathcal{O}(1/d)} \;-\; \underbrace{\frac{n(n-1)(n-2)(3n-1)}{24\,d^{2}}}_{\mathcal{O}(1/d^{2})} \;+\; \underbrace{\frac{n^{2}(n-1)^{2}(n-2)(n-3)}{48\,d^{3}}}_{\mathcal{O}(1/d^{3})} \;+\;O\!\Bigl(\frac{1}{d^{4}}\Bigr).\)
Poisson / Exponential Heuristic (fast check)
Treat matches as rare events when $n\ll \sqrt d$. The expected number of matching pairs is $\binom{n}{2}\cdot \frac{1}{d}$. Approximating the count by a Poisson variable with that mean gives \(\mathbb{P}(\text{match}) \approx 1 - \exp\!\Bigl(-\frac{n(n-1)}{2d}\Bigr).\) This recovers the classic threshold: set $1-\exp!\bigl(-n(n-1)/(2d)\bigr)\approx 0.5 \Rightarrow n(n-1)\approx 2d\ln 2$.
Stirling’s Approximation (ratio of factorials)
Start from \(\ln \mathbb{P}(\text{all distinct}) \;=\; \ln d! - \ln(d-n)! - n\ln d.\) Use Stirling with corrections: \(\ln m! \;=\; m\ln m - m + \frac12\ln(2\pi m) \;+\; \frac{1}{12m} - \frac{1}{360m^{3}} + O\!\Bigl(\frac{1}{m^{5}}\Bigr).\) After cancellation, \(\ln \mathbb{P}(\text{all distinct}) \approx (d-n+\tfrac12)\,\ln\!\frac{d}{d-n} - n \;+\;\frac{1}{12d}-\frac{1}{12(d-n)} \;-\;\frac{1}{360d^{3}}+\frac{1}{360(d-n)^{3}}.\) Exponentiate to get $\mathbb{P}(\text{all distinct})$ and take the complement for $\mathbb{P}(\text{match})$. This route is handy when $n$ is a non-negligible fraction of $d$.
Worked Example: $d=365$, $n=23$
\(A=\frac{23\cdot 22}{2\cdot 365}\approx 0.69315,\quad B=\frac{23\cdot 22\cdot 45}{12\cdot 365^{2}}\approx 0.01424,\quad C=\frac{23^{2}\cdot 22^{2}}{12\cdot 365^{3}}\approx 0.00044.\) Approximations: \(\text{Poisson: } 1-e^{-A}\approx 0.5000,\qquad \text{Exponent with }A{+}B: 1-e^{-(A+B)}\approx 0.5071,\) \(\text{Exponent with }A{+}B{+}C: \approx 0.50729 \quad(\text{matches exact } \approx 0.5073).\)
When to Use What
- Exact product / factorial: small $n$; code or high precision.
- Poisson $1-e^{-A}$: quick mental math; very good while $n\ll \sqrt d$.
- Series in the exponent: include $B$ (and $C$) when $n/\sqrt d$ is moderate.
- Stirling: robust when $n$ is a sizable fraction of $d$.
Variants and Extensions
- Leap years: set $d=366$ or use a two-category model with a small mass on Feb 29.
- Non-uniform birthdays: let day $i$ have probability $p_i$. Expected matching-pair mean is $\binom{n}{2}\sum_i p_i^{2}$. Poisson heuristic: $\mathbb{P}(\text{match})\approx 1-\exp!\bigl(-\binom{n}{2}\sum_i p_i^{2}\bigr)$. Union bound: $\mathbb{P}(\text{match})\le \binom{n}{2}\sum_i p_i^{2}$.
- General calendar: all formulae hold for any $d$.