Lab 10

Due Thursday November 20 at 11:59 PM

This lab provides extra practice with the kind of maximum likelihood problem that you will encounter on the last problem set and the final exam.

Task 1

If a non-negative random variable \(X\) has the Rayleigh distribution with parameter \(\theta>0\), then its CDF is

\[ F(x;\,\theta)=1-\exp\left(-\frac{x^2}{2\theta}\right)\quad x\geq 0, \]

and we say \(X\sim\text{Rayleigh}(\theta)\). What is the PDF in this parametric family?

\[ \begin{aligned} f(x;\,\theta) &= F'(x;\,\theta) \\ &= \frac{\text{d}}{\text{d}x} \left[ 1-\exp\left(-\frac{x^2}{2\theta}\right) \right] \\ &= 0-\frac{-x}{\theta} \exp\left(-\frac{x^2}{2\theta}\right) \\ &= \frac{x}{\theta} \exp\left(-\frac{x^2}{2\theta}\right),&&x\geq 0. \end{aligned} \]
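
Though not required, a quick numerical sanity check in R can confirm that this density integrates to one and agrees with a finite-difference derivative of the CDF. The test values \(\theta=2\) and \(x=1.5\) below are arbitrary choices for illustration.

# Sanity check: the derived PDF should integrate to 1 and agree with a
# finite-difference derivative of the CDF (theta = 2, x = 1.5 are arbitrary)
theta <- 2
pdf_rayleigh <- function(x) x / theta * exp(-x^2 / (2 * theta))
cdf_rayleigh <- function(x) 1 - exp(-x^2 / (2 * theta))

integrate(pdf_rayleigh, lower = 0, upper = Inf)$value  # should print 1

h <- 1e-6
(cdf_rayleigh(1.5 + h) - cdf_rayleigh(1.5 - h)) / (2 * h)  # numerical derivative of the CDF
pdf_rayleigh(1.5)                                          # should match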

Task 2

Imagine we observe data

\[ X_1,\,X_2,\,...,\,X_n\overset{\text{iid}}{\sim}\text{Rayleigh}(\theta), \]

where the true value of the parameter \(\theta>0\) is unknown. Use the method of maximum likelihood to propose an estimator for \(\theta\).

The likelihood function is

\[ \begin{aligned} L(\theta;\,X_{1:n}) &= \prod_{i=1}^n f(X_i;\,\theta) \\ &= \prod_{i=1}^n \frac{X_i}{\theta} \exp\left(-\frac{X_i^2}{2\theta}\right) \\ &= \theta^{-n}\left(\prod_{i=1}^nX_i\right)\left[\prod_{i=1}^n\exp\left(-\frac{X_i^2}{2\theta}\right)\right] \\ &= \theta^{-n}\left(\prod_{i=1}^nX_i\right)\exp\left(-\frac{1}{2\theta}\sum\limits_{i=1}^nX_i^2\right). \end{aligned} \]

The log-likelihood function is

\[ \begin{aligned} \ell(\theta;\,X_{1:n}) &= \ln L(\theta;\,X_{1:n}) \\ &= \ln\left[\theta^{-n}\left(\prod_{i=1}^nX_i\right)\exp\left(-\frac{1}{2\theta}\sum\limits_{i=1}^nX_i^2\right)\right] \\ &= \ln(\theta^{-n})+\ln\left(\prod_{i=1}^nX_i\right)+\ln\left[\exp\left(-\frac{1}{2\theta}\sum\limits_{i=1}^nX_i^2\right)\right] \\ &= -n\ln \theta+\sum\limits_{i=1}^n\ln X_i-\frac{1}{2\theta}\sum\limits_{i=1}^nX_i^2 . \end{aligned} \]

To maximize this, we take the first derivative with respect to \(\theta\), set it equal to zero, and solve:

\[ \begin{aligned} \frac{\text{d}}{\text{d}\theta}\ell(\theta;\,X_{1:n}) = - \frac{n}{\theta} + \frac{1}{2\theta^2} \sum\limits_{i=1}^nX_i^2 &= 0 \\ \frac{1}{2\theta^2} \sum\limits_{i=1}^nX_i^2 &= \frac{n}{\theta} \\ \frac{1}{2n} \sum\limits_{i=1}^nX_i^2 &= \theta. \end{aligned} \]
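
Strictly speaking, setting the first derivative to zero only identifies a critical point; to confirm it is a maximum, write \(S=\sum_{i=1}^nX_i^2\) and check the second derivative:

\[ \frac{\text{d}^2}{\text{d}\theta^2}\ell(\theta;\,X_{1:n}) = \frac{n}{\theta^2}-\frac{S}{\theta^3} = \frac{1}{\theta^2}\left(n-\frac{S}{\theta}\right). \]

At the critical point \(\theta=S/(2n)\) we have \(S/\theta=2n\), so this equals \(-n/\theta^2<0\), confirming a maximum.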

So the maximum likelihood estimator for the Rayleigh distribution is

\[ \hat{\theta}_n=\frac{1}{2n} \sum\limits_{i=1}^nX_i^2. \]
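
Though not required, we can also check the closed-form answer numerically: the maximizer of the log-likelihood found by R's optimize() should agree with the formula above. The small data vector below is made up purely for illustration.

# Numerical check of the MLE formula on a made-up data vector
x <- c(1.2, 0.7, 2.4, 1.9, 0.5, 3.1)
n <- length(x)

# Log-likelihood from above (the sum(log(x)) term does not depend on theta)
loglik <- function(theta) -n * log(theta) + sum(log(x)) - sum(x^2) / (2 * theta)

sum(x^2) / (2 * n)                                                  # closed-form MLE
optimize(loglik, interval = c(0.001, 100), maximum = TRUE)$maximum  # numerical maximizer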

Task 3

Your estimator in the previous part is a random variable that inherits its randomness from the \(X_i\). Derive the full sampling distribution of your estimator.

For these problems, I work “from the inside out.” What’s the distribution of \(X_i\)? What’s the distribution of \(X_i^2\)? What’s the distribution when you add those up? What’s the distribution when you scale the sum by \(1/2n\)? To answer those questions, you need tools like the change-of-variables formulas and our MGF updating rules for linear combinations.

We already know that \(X_1\sim\text{Rayleigh}(\theta)\), so we derive the distribution of \(Y_1=g(X_1)=X_1^2\) using change of variables. Since \(\text{Range}(X_1)=(0,\,\infty)\) and \(g(x)=x^2\) is strictly increasing on that range, we can apply the change of variables formula:

\[ \begin{aligned} f_{Y_1}(y) &= f\left(g^{-1}(y);\,\theta\right)\left|\frac{\text{d}}{\text{d}y}g^{-1}(y)\right| \\ &= f\left(y^{1/2};\,\theta\right)\left|\frac{\text{d}}{\text{d}y}y^{1/2}\right| \\ &= \frac{y^{1/2}}{\theta} \exp\left(-\frac{(y^{1/2})^2}{2\theta}\right) \left|\frac{1}{2}y^{-1/2}\right| \\ &= \frac{y^{1/2}y^{-1/2}}{2\theta} \exp\left(-\frac{y}{2\theta}\right) \\ &= \frac{1}{2\theta} \exp\left(-\frac{y}{2\theta}\right) ,&&y>0. \end{aligned} \]

This is the density of \(\text{Exponential}(1/(2\theta))\), otherwise known as \(\text{Gamma}(1,\,1/(2\theta))\), where the second argument is the rate parameter. Given this, we have the following:

\[ \begin{aligned} X_i\overset{\text{iid}}{\sim}\text{Rayleigh}(\theta) \quad &\implies \quad X_i^2\overset{\text{iid}}{\sim}\text{Gamma}(1,\,1/(2\theta)) \\ &\implies \sum\limits_{i=1}^nX_i^2\sim\text{Gamma}(n,\,1/(2\theta)) \\ &\implies \frac{1}{2n}\sum\limits_{i=1}^nX_i^2\sim\text{Gamma}(n,\,n/\theta). \end{aligned} \]

The last implication follows from Problem Set 5 #10d. So, the full sampling distribution of the estimator is

\[ \hat{\theta}_n\sim \text{Gamma}(n,\,n/\theta). \]
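
For reference, the rescaling in the last implication can also be seen directly from the gamma MGF: if \(S\sim\text{Gamma}(n,\,\beta)\) with rate \(\beta\) and \(c>0\), then for \(t<\beta/c\),

\[ M_{cS}(t)=E\left(e^{tcS}\right)=M_S(ct)=\left(1-\frac{ct}{\beta}\right)^{-n}=\left(1-\frac{t}{\beta/c}\right)^{-n}, \]

which is the MGF of \(\text{Gamma}(n,\,\beta/c)\). Taking \(c=1/(2n)\) and \(\beta=1/(2\theta)\) gives the rate \(n/\theta\) above.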

Task 4

Now that you know the sampling distribution of the estimator, compute its mean squared error (MSE) using the bias-variance decomposition. Is the estimator unbiased? Is it consistent?

Using the mean and variance formulas for the gamma distribution, we see that

\[ \begin{aligned} E(\hat{\theta}_n)&=\frac{n}{n/\theta}=\theta\\ \text{var}(\hat{\theta}_n)&=\frac{n}{(n/\theta)^2}=\frac{\theta^2}{n}. \end{aligned} \]

Since \(E(\hat{\theta}_n)=\theta\), the bias of the estimator is zero, so the mean squared error is simply \(\text{var}(\hat{\theta}_n)=\theta^2/n\). As \(n\to\infty\), this goes to zero. As such, the estimator is unbiased and consistent.
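
Spelled out, the bias-variance decomposition is

\[ \text{MSE}(\hat{\theta}_n)=\text{var}(\hat{\theta}_n)+\left[\text{bias}(\hat{\theta}_n)\right]^2=\frac{\theta^2}{n}+0^2=\frac{\theta^2}{n}\to 0\quad\text{as }n\to\infty. \]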

Task 5

Compute the quantile function of the Rayleigh distribution and use it to generate 100 draws from the distribution that has \(\theta=4\).

The quantile function is the inverse CDF, so we set \(y=F(x;\,\theta)\) and solve for \(x\):

\[ \begin{aligned} y &= 1-\exp\left(-\frac{x^2}{2\theta}\right) \\ \exp\left(-\frac{x^2}{2\theta}\right) &= 1-y \\ -\frac{x^2}{2\theta} &= \ln(1-y) \\ x^2 &= -2\theta\ln(1-y) \\ x &= \sqrt{-2\theta\ln(1-y)} . \end{aligned} \]

So \(F^{-1}(y;\,\theta)=\sqrt{-2\theta\ln(1-y)}\). As such, we can implement an inverse transform sampler to simulate from the distribution:

set.seed(2345)
n <- 100     # number of draws
theta <- 4   # Rayleigh parameter

# Inverse transform sampling: plug Uniform(0, 1) draws into the quantile function
X <- sqrt(-2 * theta * log(1 - runif(n)))

# Histogram of the draws with the Rayleigh(4) density from Task 1 overlaid
hist(X, breaks = "Scott", freq = FALSE)
curve(exp(-x^2/(2*theta)) * x / theta, 
      from = 0, to = max(X), n = 1000, col = "red", add = TRUE)
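
If you want a check beyond the visual comparison, one optional approach is a Kolmogorov–Smirnov test of the draws against the Rayleigh CDF from Task 1 (ks.test() in R accepts a CDF function as its second argument); a large p-value is consistent with the sampler being correct.

# Optional: compare the simulated draws to the Rayleigh(4) CDF with a KS test
ks.test(X, function(x) 1 - exp(-x^2 / (2 * theta)))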

Task 6

Write a for loop that simulates the sampling distribution for your estimator. Here is a schematic of what you are doing:

\[ \begin{matrix} \text{0. True parameter value} &&& \theta_0 && \\ &&& \downarrow && \\ \text{1. True distribution} &&& \text{Rayleigh}(\theta_0) && \\ &\swarrow &\swarrow& \cdots &\searrow&\searrow \\ \text{3. Simulated data}&x_{1:n}^{(1)} &x_{1:n}^{(2)}& \cdots &x_{1:n}^{(M-1)}&x_{1:n}^{(M)} \\ &\downarrow &\downarrow& \cdots &\downarrow&\downarrow \\ \text{4. Different estimates}&\hat{\theta}_n^{(1)} &\hat{\theta}_n^{(2)}& \cdots &\hat{\theta}_n^{(M-1)}&\hat{\theta}_n^{(M)} \\ &\searrow &\searrow& \cdots &\swarrow&\swarrow \\ & && \text{Histogram} && \\ \end{matrix} \]

At the end, add on top of the histogram a curve of the density that you derived in Task 3. They ought to match.

Template

Here is some template code you can fill in:

true_theta <- 3.14      # true value of parameter
n <- 25                 # sample size
M <- 2500               # how many repetitions
estimates <- numeric(M) # preallocate storage for the sample of estimates

for(m in 1:M){
  # Step 1: simulate a new fake dataset of size n
  # Step 2: apply your formula from Task 2 to compute new estimate
  # Step 3: store it
}

# Step 4: plot histogram of estimates
# Step 5: add density curve of exact sampling distribution

Filling in the template, here is one complete solution:

set.seed(123)
true_theta <- 3.14      # true value of parameter
n <- 25                 # sample size
M <- 2500               # how many repetitions
estimates <- numeric(M) # preallocate storage for the sample of estimates

for(m in 1:M){
  # Step 1: simulate a new fake dataset of size n with the inverse transform sampler
  X <- sqrt(-2 * true_theta * log(1 - runif(n)))
  # Steps 2 and 3: compute the estimate from Task 2 and store it
  estimates[m] <- sum(X^2) / (2*n)
}

# Steps 4 and 5: histogram of the estimates with the exact Gamma(n, n/theta) density on top
hist(estimates, breaks = "Scott", freq = FALSE)
curve(dgamma(x, shape = n, rate = n / true_theta), 
      from = 0, to = max(estimates), n = 1000, col = "red", 
      lwd = 2, add = TRUE)
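
As a final sanity check, the empirical mean and variance of the simulated estimates should be close to the theoretical values \(\theta\) and \(\theta^2/n\) from Task 4:

# Compare the empirical moments of the estimates to the Task 4 formulas
mean(estimates); true_theta          # both should be near theta
var(estimates); true_theta^2 / n     # both should be near theta^2 / n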