
Probability for Statistical Inference
Duke University
STA 240 Fall 2025
While you wait, please complete this brief questionnaire:
| Name | Role | Office Hours |
|---|---|---|
| Hu, Yuang | TA | Mon 7:30 PM - 9:30 PM |
| Liu, Aurora | Head TA | Wed/Thu 4:30 PM - 5:30 PM |
| Ma, Liane | TA | Sun 10:00 AM - 12:00 PM |
| Zito, John | Instructor | Tue 3:00 PM - 6:00 PM |

Welcome First-Years Event!
GBM #1 & Research Panel
\(k\) people convene for a birthday party:
What is the probability that at least two of the attendees share a birthday?
How many people need to show up to the party for there to be a 50% chance of at least one match?
Most people guess that you need, like, a lot.

| no. of attendees (k) | Prob(at least one bday match) |
|---|---|
| 1 | 0% |
| 4 | 1.6% |
| 16 | 28% |
| 23 | 50.7% |
| 40 | 89% |
| 56 | 98% |
| 60 | 99.4% |
| \(\vdots\) | \(\vdots\) |
| 366 | 100.0% |
Key words: binomial coefficient, pigeonhole principle
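
The table can be reproduced with a few lines of code. This is a minimal sketch, assuming 365 equally likely birthdays and independent attendees, so that \(P(\text{at least one match}) = 1 - \prod_{i=0}^{k-1} \frac{365 - i}{365}\). The function names are made up for illustration.

```python
from math import prod


def birthday_match_probability(k: int, days: int = 365) -> float:
    """P(at least two of k people share a birthday).

    Assumes every day is equally likely and people are independent.
    """
    if k > days:
        return 1.0  # pigeonhole principle: more people than days
    # Probability that all k birthdays are distinct.
    p_no_match = prod((days - i) / days for i in range(k))
    return 1.0 - p_no_match


def smallest_party_size(target: float = 0.5, days: int = 365) -> int:
    """Smallest k whose match probability reaches the target."""
    k = 1
    while birthday_match_probability(k, days) < target:
        k += 1
    return k


for k in (1, 4, 16, 23, 40, 56, 60, 366):
    print(f"k = {k:3d}: {birthday_match_probability(k):.1%}")

print("Smallest k with at least a 50% chance:", smallest_party_size(0.5))
```

Working with the complement (nobody matches) is the whole trick: it is much easier to count than "at least one match" directly.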

We will assume that all birthdays are equally likely when we do the calculation. But actually…

| birthday | count |
|---|---|
| 05-25 | 2 |
| 01-02 | 1 |
| 01-03 | 1 |
| 01-05 | 1 |
| 01-06 | 1 |
| 01-11 | 1 |
| 01-25 | 1 |
| 02-02 | 1 |
| 02-11 | 1 |
| 03-01 | 1 |
| 03-02 | 1 |
| 03-31 | 1 |
| 04-03 | 1 |
| 04-11 | 1 |
| 04-21 | 1 |
| 05-09 | 1 |
| 05-15 | 1 |
| 05-30 | 1 |
| 06-08 | 1 |
| 06-20 | 1 |
| 06-25 | 1 |
| 07-06 | 1 |
| 07-08 | 1 |
| 08-30 | 1 |
| 09-06 | 1 |
| 09-22 | 1 |
| 09-30 | 1 |
| 11-01 | 1 |
| 11-07 | 1 |
| 11-10 | 1 |
| 11-21 | 1 |
| 11-23 | 1 |
| 12-01 | 1 |
| 12-11 | 1 |
Let’s play: https://montyhall.io/

Very counterintuitive
Most people start out thinking that the two doors are equally likely to contain the prize, so switching doesn’t matter. In fact, you have a 2/3 chance of winning if you switch.
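
If you don't believe the 2/3, simulate it. The sketch below assumes the standard rules: three doors, the host knows where the prize is, and the host always opens a non-prize door that you did not pick. The function names and the seed are illustrative choices.

```python
import random


def play_monty_hall(switch: bool, rng: random.Random) -> bool:
    """Play one game and return True if the contestant wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the prize.
    host_opens = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != host_opens)
    return pick == prize


def win_rate(switch: bool, n_games: int = 100_000, seed: int = 0) -> float:
    """Fraction of simulated games won under a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play_monty_hall(switch, rng) for _ in range(n_games))
    return wins / n_games


print("win rate if you stay:  ", win_rate(switch=False))
print("win rate if you switch:", win_rate(switch=True))
```

The two rates should land near 1/3 and 2/3. Switching wins exactly when your first pick was wrong, which happens two times out of three.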





At a 1906 country fair in Plymouth, 800 people participated in a contest to estimate the weight of an ox. Francis Galton observed that the median guess, 1207 lbs, was accurate within 1% of the true weight of 1198 lbs.
Lesson
The aggregation of many imperfect estimates/guesses is often better than a needle-in-haystack search for the “best” individual guess.
Isn’t that obvious?
It took humans a long time to realize this. The first recorded uses of an “average” were during Isaac Newton’s lifetime (see Stigler’s Seven Pillars of Statistical Wisdom).





(source: CNN)
A 50-year-old, asymptomatic woman tests positive for breast cancer. Alarming, but no diagnostic test is perfect. If the prevalence of breast cancer in the population is 1%, if the true positive rate of the test is 90%, and if the false positive rate is 9%, what is the chance that the woman actually has cancer, given that she tested positive?
You will know how to answer this in four weeks. Doctors though…
Only 34 out of 160 surveyed gynecologists (about 21%) got it right: the answer is roughly 9%. Almost half of them said 90%.
“We can only imagine how much anxiety those innumerate doctors instil in women.”
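
As a preview, here is the arithmetic behind that question, sketched with Bayes' rule: \(P(\text{cancer} \mid +) = \frac{P(+ \mid \text{cancer})\,P(\text{cancer})}{P(+)}\). The numbers are the ones stated in the problem; the function name is made up for illustration.

```python
def posterior_given_positive(
    prevalence: float,
    true_positive_rate: float,
    false_positive_rate: float,
) -> float:
    """P(disease | positive test) via Bayes' rule."""
    # Total probability of a positive test: sick and detected,
    # plus healthy and falsely flagged.
    p_positive = (
        prevalence * true_positive_rate
        + (1 - prevalence) * false_positive_rate
    )
    return prevalence * true_positive_rate / p_positive


p = posterior_given_positive(
    prevalence=0.01, true_positive_rate=0.90, false_positive_rate=0.09
)
print(f"P(cancer | positive test) = {p:.1%}")

# Same computation in "natural frequencies": out of 1000 women, 10 have
# cancer and 9 of them test positive; of the 990 without cancer, about 89
# test positive. So roughly 9 of the 98 positives are real.
```

The disease is rare, so the false positives from the large healthy group swamp the true positives from the small sick group.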




Human intuition and “common sense” about probability and statistics are often just flat out wrong, in silly and dangerous ways. Mere mortals require the scaffolding of mathematics to discipline our thinking.
Question
What method does ChatGPT use to generate the next word in one of its responses?

Hella simplified:

key words: Schrödinger’s cat, uncertainty principle, etc


No area of science or technology can be properly understood without knowing something about probability. Period.
Bertrand Russell, Mysticism and Logic and Other Essays (1917)

“Mathematics, rightly viewed, possesses not only truth, but supreme beauty cold and austere, like that of sculpture, without appeal to any part of our weaker nature, without the gorgeous trappings of painting or music, yet sublimely pure, and capable of a stern perfection such as only the greatest art can show.”
Albert Einstein, “The Late Emmy Noether” (1935)

“Pure mathematics is, in its way, the poetry of logical ideas.”


Ostensibly random behavior can nevertheless be quite patternful, in ways that we can actually penetrate with elegant mathematics.
It’s necessary.
It’s useful.
It’s beautiful.
Sound good?
Your final course grade will be calculated as follows:
| Category | Percentage |
|---|---|
| Labs | 10% |
| Problem Sets | 30% |
| Midterm Exam 1 | 20% |
| Midterm Exam 2 | 20% |
| Final Exam | 20% |
Warning
The final letter grade will be based on the usual thresholds, which will not change and will be applied exactly. So no curve and no rounding.
Led by Aurora in Perkins LINK 087 (Classroom 3):
Guided activities introducing you to special topics, extensions, applications, and case studies. We will also introduce some basic R stuff, and we will use Quarto for the lab write-ups.
Plan to attend regularly
Designed to be complete-able during the lab period, but due by 11:59 PM that same day.
Late policy
No late work will be accepted unless you request an extension in advance by e-mailing JZ. All reasonable requests will be entertained, but extensions will not be long.
Traditional, in-class, written exams:
You are allowed only two resources:
If you need testing accommodations…
Make sure I get an SDAO letter, and make your appointments in the Testing Center now.
Not required. Live your life.
If you wish to ask questions in writing…
Post on Ed: about general course policies and content;
Email JZ directly: personal matters.
You should not really be emailing the TAs directly for any reason.
You are enthusiastically encouraged to work together on labs and problem sets. You will learn a lot from each other! Two policies:
Violation of the second policy is plagiarism. Sharers and recipients alike are referred to the conduct office and receive zeros.
This is essentially a pencil-and-paper math class. We will use all of the basic skills taught in Calc I and II:
REVIEW: Problem Set 0 is due at 5 PM on Friday, September 5.
This is not sink-or-swim.
What do we do?
State the distribution (rules) of some random phenomenon, and then study how the realizations of that phenomenon “typically” behave.
Given a fair coin, how many flips will it take on average until you observe the first head?
Why is this a probability problem?
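
One way to see the forward direction in action: we know the rules (a fair coin, independent flips), so we can both compute the answer and check it by simulation. This is a rough sketch, not the derivation we will do in class. The function names, the seed, and the truncation point of the series are arbitrary choices.

```python
import random


def flips_until_first_head(rng: random.Random, p_heads: float = 0.5) -> int:
    """Flip until the first head appears and return the number of flips."""
    flips = 1
    while rng.random() >= p_heads:  # this flip was a tail, so flip again
        flips += 1
    return flips


def average_flips(n_trials: int = 100_000, seed: int = 0) -> float:
    """Average number of flips to the first head over many repetitions."""
    rng = random.Random(seed)
    return sum(flips_until_first_head(rng) for _ in range(n_trials)) / n_trials


# Exact answer from the rules: the first head arrives on flip k with
# probability (1/2)^k, so the average is the sum of k * (1/2)^k.
# Truncating the series at k = 199 is already accurate to machine precision.
series_value = sum(k * 0.5**k for k in range(1, 200))

print("simulated average:", average_flips())
print("series value:     ", series_value)
```

Both numbers should sit at (or very near) 2: on average it takes two flips to see the first head.
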
What do we do?
Start with the realizations of a random phenomenon with unknown distribution, and try to use those realizations to figure out what the distribution is.
Given 28 flips from a mysterious coin, can you tell if it is fair?
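
Here is one possible way to attack the mysterious coin, sketched under loud assumptions: the flips are independent, and we judge fairness by asking how surprising the observed head count would be if the coin were fair. The head counts tried below are hypothetical, and this is only one of several reasonable approaches.

```python
from math import comb


def binomial_pmf(heads: int, n: int, p: float = 0.5) -> float:
    """P(exactly `heads` heads in n independent flips with P(head) = p)."""
    return comb(n, heads) * p**heads * (1 - p) ** (n - heads)


def two_sided_p_value(observed_heads: int, n: int = 28) -> float:
    """Chance, under a fair coin, of a head count at least as far from n/2
    as the one observed."""
    distance = abs(observed_heads - n / 2)
    return sum(
        binomial_pmf(h, n) for h in range(n + 1) if abs(h - n / 2) >= distance
    )


# Hypothetical outcomes, purely for illustration.
for heads in (14, 17, 21, 28):
    print(f"{heads:2d} heads out of 28: p-value = {two_sided_p_value(heads):.4f}")
```

A small value says the data would be unusual for a fair coin. It does not prove the coin is unfair, and a large value does not prove it is fair. Welcome to statistics.
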

| Probability | Statistics |
|---|---|
| Forward problem | Inverse problem |
| Deductive | Inductive |
| Reasons from rules to consequences | Observes consequences and infers rules |

Forward: read the rulebook, and then play a game of chess;
Inverse: watch a chess match, and based on the players’ behavior, try to guess the rules.
Differentiation is a forward problem. You know the function \(F\), and you take its darn derivative;
Integration is an inverse problem. Given the derivative, you have to work backward to figure out what the original function was:
\[ \text{FTOC:}\quad\int_a^bF'(x)\,\text{d} x=F(b)-F(a). \]
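
A quick numerical sanity check of the FTOC, as a sketch. The choice \(F(x) = x^3\) on \([0, 2]\) and the midpoint Riemann sum are arbitrary illustrations, not anything special.

```python
def F(x: float) -> float:
    """An example antiderivative (hypothetical choice)."""
    return x**3


def F_prime(x: float) -> float:
    """Its derivative, computed by hand: the forward problem."""
    return 3 * x**2


def midpoint_integral(f, a: float, b: float, n: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


a, b = 0.0, 2.0
print("integral of F' over [a, b]:", midpoint_integral(F_prime, a, b))
print("F(b) - F(a):               ", F(b) - F(a))
```

The two printed numbers should agree to several decimal places.
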
Forward
“Here’s the question. What’s the answer?”
Inverse
“If this is the answer, then what was the question?”
Gird your loins
Like all inverse problems, you will find that statistics is subtler, less well-defined, less straightforward, and more open-ended than probability.
Two common interpretive perspectives:
You need both perspectives.
Like the wave-particle duality of light, both are true and useful, but their coexistence can be tense and uneasy. We just have to learn to live with that.
The math doesn’t care which you prefer.
Regardless of your interpretation, the mathematics of probability is the same.