Hypergeometric distribution
Imagine we have a finite population containing \(m\) “success” cases and \(n\) “failure” cases, for a total population size of \(m+n\). If we sample \(k \le m+n\) cases from the population without replacement, then a hypergeometric random variable counts the number of sampled cases that are successes.
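To make the sampling story concrete, here is a minimal simulation sketch in R. The values \(m = 7\), \(n = 13\), and \(k = 5\) are hypothetical, chosen only for illustration, and the variable names are not part of the course materials.

```r
# Hypothetical values, for illustration only
m <- 7    # number of "success" cases in the population
n <- 13   # number of "failure" cases in the population
k <- 5    # number of cases sampled

# Build the population: 1 = success, 0 = failure
population <- c(rep(1, m), rep(0, n))

set.seed(1)
# sample() draws without replacement by default;
# each replication counts the successes among the k sampled cases
draws <- replicate(10000, sum(sample(population, k)))

# Empirical relative frequencies of the success count
table(draws) / length(draws)
```

Each entry of `draws` is one realization of the hypergeometric random variable, so the relative frequencies should approximate its distribution.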
You first met this distribution on Problem Set 4 when studying contested elections.
Basic properties
| Notation | \(X\sim\text{HG}(m, n, k)\) |
| Range | \(\{\max(0,\,k-n),\,\ldots,\,\min(k,\,m)\}\) |
| PMF | \(P(X = x) = \binom{m}{x} \binom{n}{k-x}/\binom{m+n}{k}\) |
| Expectation | \(km/(m+n)\) |
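As a quick numerical check of the PMF and expectation formulas, the sketch below evaluates them directly with `choose()` and compares the result against R's built-in `dhyper()`. The values of `m`, `n`, and `k` are again hypothetical, picked so that \(k \le \min(m, n)\) and every \(x\) from 0 to \(k\) is attainable.

```r
# Hypothetical values, for illustration only
m <- 7; n <- 13; k <- 5
x <- 0:k

# PMF computed directly from the formula in the table
pmf_formula <- choose(m, x) * choose(n, k - x) / choose(m + n, k)

# PMF from R's built-in hypergeometric density
pmf_builtin <- dhyper(x, m, n, k)

# The two should agree up to floating-point error
all.equal(pmf_formula, pmf_builtin)

# The probabilities should sum to 1
sum(pmf_formula)

# The mean computed from the PMF should match k * m / (m + n)
mean_from_pmf <- sum(x * pmf_formula)
c(mean_from_pmf, k * m / (m + n))
```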
R commands
Here is the documentation for the suite of commands that let you work with the hypergeometric distribution in R:
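As a brief sketch of how these commands are typically called: base R provides `dhyper`, `phyper`, `qhyper`, and `rhyper`, whose arguments `m`, `n`, and `k` line up with the notation \(\text{HG}(m, n, k)\) used here. The numeric values below are hypothetical.

```r
# Hypothetical values, for illustration only
m <- 7; n <- 13; k <- 5

# PMF: P(X = 2)
p_eq <- dhyper(2, m, n, k)

# CDF: P(X <= 2)
p_le <- phyper(2, m, n, k)

# Quantile: smallest x with P(X <= x) >= 0.5
med <- qhyper(0.5, m, n, k)

# Random generation: ten simulated values of X
set.seed(1)
sims <- rhyper(10, m, n, k)

p_eq; p_le; med; sims
```

Note that the first argument of `rhyper` is the number of random values to generate, not a parameter of the distribution.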
