# 11.6. Exercises

1. Expected minimum and maximum of i.i.d. uniform variables

Let $$U_1, U_2, \ldots, U_n$$ be i.i.d. uniform on the interval $$(0, 1)$$, and let $$L_n = \max(U_1, U_2, \ldots, U_n)$$. That is, let $$L_n$$ be the largest of $$U_1, U_2, \ldots, U_n$$.

[It is a fact about independent continuous random variables that the chance that they are equal is $$0$$. So you don’t have to worry about “ties”. That is, you can assume that $$U_1, U_2, \ldots, U_n$$ are $$n$$ distinct values.]

a) For $$0 < x < 1$$, find $$P(U_1 \le x)$$. Hence find $$P(L_n \le x)$$.

b) Find the density of $$L_n$$.

c) Find $$E(L_n)$$.

d) To interpret the answer to Part c, let $$n=2$$ for a start. Imagine marking the two values $$U_1$$ and $$U_2$$ on the unit interval. These two random values split the unit interval $$(0, 1)$$ into 3 pieces of random lengths. It is a fact (and makes intuitive sense) that the lengths of the 3 pieces are identically distributed. Use this to interpret your answer for $$E(L_2)$$, and then generalize the interpretation to $$E(L_n)$$.

e) Now let $$M_n = \min(U_1, U_2, \ldots, U_n)$$ be the smallest of $$U_1, U_2, \ldots, U_n$$. Use the idea in Part d to find $$E(M_n)$$.
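Once you have answers for Parts c and e, a short simulation lets you check them numerically. This is only an illustrative sketch: the choice $$n = 5$$ and the number of repetitions are arbitrary.

```python
import numpy as np

# Sanity-check simulation for Parts c and e; n = 5 and the number of
# repetitions are arbitrary choices, not part of the exercise.
rng = np.random.default_rng(0)
n, trials = 5, 100_000
U = rng.uniform(0, 1, size=(trials, n))   # each row is one sample U_1, ..., U_n

mean_max = U.max(axis=1).mean()   # empirical estimate of E(L_5)
mean_min = U.min(axis=1).mean()   # empirical estimate of E(M_5)
print(mean_max, mean_min)
```

The two printed averages should agree with your formulas for $$E(L_n)$$ and $$E(M_n)$$ at $$n = 5$$ to about two decimal places.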

2. Range of a uniform sample

[This problem will go faster if you have done the previous one.]

Let $$\theta_1 < \theta_2$$ and suppose $$X_1, X_2, \ldots, X_n$$ are i.i.d. uniform on the interval $$(\theta_1, \theta_2)$$. Let $$\theta = \theta_2 - \theta_1$$ be the length of the interval.

a) Let $$M_n = \min(X_1, X_2, \ldots, X_n)$$ be the sample minimum and $$L_n = \max(X_1, X_2, \ldots, X_n)$$ the sample maximum. The statistic $$R_n = L_n - M_n$$ is called the range of the sample and is a natural estimator of $$\theta$$. Without calculation, explain why $$R_n$$ is biased, and say whether it underestimates or overestimates $$\theta$$.

b) Find the bias of $$R_n$$ and confirm that its sign is consistent with your answer to Part a. For large $$n$$, is the size of the bias large or small?

c) Use $$R_n$$ to construct $$T_n$$, an unbiased estimator of $$\theta$$.

d) Compare $$SD(R_n)$$ and $$SD(T_n)$$. Which one is bigger? For large $$n$$, is it a lot bigger or just a bit bigger?
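A simulation along the same lines can confirm your answers about the sign and size of the bias. The values of $$\theta_1$$, $$\theta_2$$, $$n$$, and the number of repetitions below are illustrative choices only.

```python
import numpy as np

# Simulation sketch for the bias of the range; theta_1, theta_2, n,
# and the number of repetitions are illustrative choices.
rng = np.random.default_rng(0)
theta_1, theta_2, n, trials = 2.0, 7.0, 10, 100_000
theta = theta_2 - theta_1

X = rng.uniform(theta_1, theta_2, size=(trials, n))
R = X.max(axis=1) - X.min(axis=1)   # sample range in each repetition

print(R.mean(), theta)   # compare the average range with theta
```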

3. Regression estimates

For a random mother-daughter pair, let $$X$$ be the height of the mother and $$Y$$ the height of the daughter. In the notation of Section 11.3 suppose $$\mu_X = 63.5$$, $$\mu_Y = 63.7$$, $$\sigma_X = \sigma_Y = 2$$, and $$r(X, Y) = 0.6$$.

a) Find the equation of the regression line for estimating $$Y$$ based on $$X$$.

b) Find the regression estimate of $$Y$$ given that $$X = 62$$ inches.

c) Find the regression estimate of $$Y$$ given that $$X$$ is $$2$$ standard deviations above $$\mu_X$$. You should be able to do this without finding the value of $$X$$ in inches.
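After you have worked out Parts a and b by hand, a few lines of Python can check the arithmetic, using the numbers given in the problem.

```python
# Arithmetic check for Parts a and b, using the numbers given above.
mu_X, mu_Y = 63.5, 63.7
sigma_X = sigma_Y = 2.0
r = 0.6

slope = r * sigma_Y / sigma_X      # slope of the regression line
intercept = mu_Y - slope * mu_X    # intercept of the regression line

Y_hat = slope * 62 + intercept     # Part b: estimate of Y when X = 62
print(slope, intercept, Y_hat)
```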

4. Estimating percentile ranks

It can be shown that for football shaped scatter plots it is OK to assume that each of the two variables is normally distributed.

Suppose that a large number of students take two tests (like the Math and Verbal SAT), and suppose that the scatter plot of the two scores is football shaped with a correlation of 0.6.

a) Let $$(X, Y)$$ be the scores of a randomly picked student, and suppose $$X$$ is on the 90th percentile. Estimate the percentile rank of $$Y$$.

b) Let $$(X, Y)$$ be the scores of a randomly picked student, and suppose $$Y$$ is on the 78th percentile. Estimate the percentile rank of $$X$$.
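To check Part a numerically, you can use the standard normal cdf and its inverse, assuming both scores are normally distributed as the preamble suggests. Python's standard library provides both through `statistics.NormalDist`.

```python
from statistics import NormalDist

# Numerical check for Part a, assuming both scores are normal.
std_normal = NormalDist()
r = 0.6

z_X = std_normal.inv_cdf(0.90)    # X's score in standard units
z_hat = r * z_X                   # regression estimate of Y, in standard units
percentile_Y = std_normal.cdf(z_hat)
print(percentile_Y)               # estimated percentile rank of Y
```

The same two-step conversion, run in the other direction, handles Part b.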

5. Least squares constant predictor

Let $$X$$ be a random variable with expectation $$\mu_X$$ and SD $$\sigma_X$$. Suppose you are going to use a constant $$c$$ as your predictor of $$X$$.

a) Let $$MSE(c)$$ be the mean squared error of the predictor $$c$$. Write a formula for $$MSE(c)$$.

b) Guess the value of $$\hat{c}$$, the least squares constant predictor. Then prove that it is the least squares constant predictor.

c) Find $$MSE(\hat{c})$$.

6. No-intercept regression

Sometimes data scientists want to fit a linear model that has no intercept term. For example, this might be the case when the data are from a scientific experiment in which the attribute $$X$$ can have values near $$0$$ and there is a physical reason why the response $$Y$$ must be $$0$$ when $$X=0$$.

So let $$(X, Y)$$ be a random point and suppose you want to predict $$Y$$ by an estimator of the form $$aX$$ for some $$a$$. Find the least squares predictor $$\hat{Y}$$ among all predictors of this form.

7. Uncorrelated versus independent

Let $$X$$ have the uniform distribution on the three points $$-1$$, $$0$$, and $$1$$. Let $$Y = X^2$$.

a) Show that $$X$$ and $$Y$$ are uncorrelated.

b) Are $$X$$ and $$Y$$ independent?
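Since the distribution here is finite, Part a can be checked by exact arithmetic rather than simulation. The sketch below uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction

# Exact computation for Part a: X is uniform on {-1, 0, 1} and Y = X**2.
values = [-1, 0, 1]
p = Fraction(1, 3)                          # each value has chance 1/3

E_X  = sum(p * x for x in values)           # E(X)
E_Y  = sum(p * x**2 for x in values)        # E(Y) = E(X^2)
E_XY = sum(p * x**3 for x in values)        # E(XY) = E(X^3)

cov = E_XY - E_X * E_Y
print(cov)   # covariance 0, so X and Y are uncorrelated
```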

8. Regression equation

The regression equation can be written in multiple forms. For any particular purpose, one of the forms might be more convenient than the others. So it is a good idea to recognize them.

For $$a^* = r\frac{\sigma_Y}{\sigma_X}$$, which of the following is the equation of the regression line for estimating $$Y$$ based on $$X$$? More than one is correct.

(i) $$Y = a^*X + (\mu_Y - a^*\mu_X)$$

(ii) $$\hat{Y} = a^*X + (\mu_Y - a^*\mu_X)$$

(iii) $$\hat{Y} = a^*(X - \mu_X) + \mu_Y$$

(iv) $$\displaystyle{\hat{Y} = r\frac{X - \mu_X}{\sigma_X}}$$

(v) $$\displaystyle{\frac{\hat{Y} - \mu_Y}{\sigma_Y} = r\frac{X - \mu_X}{\sigma_X}}$$

9. Average of the residuals

a) In Data 8 we say that the regression line passes through the point of averages. Show this by setting $$X = \mu_X$$ and finding the corresponding value of $$\hat{Y}$$.

b) Find $$E(\hat{Y})$$. In Data 8 language, this is the average of the fitted values.

c) Let $$D = Y - \hat{Y}$$ be the residual as in Section 11.5. Find the expectation of the residual and confirm that the answer justifies the following statement from Data 8:

“No matter what the shape of the scatter diagram, the average of the residuals is 0.”
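A simulation can illustrate Part c. The scatter generated below is deliberately curved, yet the residuals from the regression line still average to essentially 0; the data-generating choices are arbitrary.

```python
import numpy as np

# Simulation for Part c: a deliberately non-linear scatter, yet the
# residuals from the regression line average to 0. The data-generating
# choices below are arbitrary.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, 50_000)
Y = X**2 + rng.normal(0, 0.5, 50_000)      # clearly curved scatter

r = np.corrcoef(X, Y)[0, 1]
slope = r * Y.std() / X.std()
Y_hat = slope * (X - X.mean()) + Y.mean()  # regression estimates
D = Y - Y_hat                              # residuals

print(D.mean())   # essentially 0
```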

10. Variance decomposition

In this exercise you will find the relation between the variances of $$Y$$, its regression estimate $$\hat{Y}$$, and the residual $$D = Y - \hat{Y}$$.

a) Find $$Var(\hat{Y})$$.

b) Show that the answer to Part a justifies the following statement from Data 8:

$$\frac{\text{SD of fitted values}}{\text{SD of } y} ~ = ~ \vert r \vert$$

Note: Usually, the result above is stated in terms of variances instead of SDs, and hence $$r^2$$ is sometimes called “the proportion of variability explained by the linear model”.

c) Justify the decomposition of variance formula $$Var(Y) = Var(\hat{Y}) + Var(D)$$.
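The decomposition in Part c can be confirmed numerically. In the sketch below, the joint distribution of $$(X, Y)$$ is an arbitrary choice made just to generate correlated data.

```python
import numpy as np

# Simulation check of the decomposition of variance; the joint
# distribution of (X, Y) below is an arbitrary choice.
rng = np.random.default_rng(0)
X = rng.normal(0, 1, 100_000)
Y = 0.6 * X + rng.normal(0, 0.8, 100_000)   # correlated with X

r = np.corrcoef(X, Y)[0, 1]
slope = r * Y.std() / X.std()
Y_hat = slope * (X - X.mean()) + Y.mean()   # regression estimates
D = Y - Y_hat                               # residuals

print(Y.var(), Y_hat.var() + D.var())       # the two should match
```

Notice also that `Y_hat.var()` comes out equal to `r**2 * Y.var()`, which is the result of Part a.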

11. Regression accuracy

For a random mother-daughter pair, let $$X$$ be the height of the mother and $$Y$$ the height of the daughter. Suppose the correlation is $$r(X, Y) = 0.6$$ and let $$\sigma_Y = 2$$ inches.

Let $$\hat{Y}$$ be the regression estimate of the daughter’s height $$Y$$ based on the mother’s height $$X$$, and let $$D = Y - \hat{Y}$$ be the residual or error in the regression estimate.

a) Find $$\sigma_D$$.

b) Fill in the blank with a percentage: There is at least $$\underline{~~~~~~~~~~}$$ chance that the estimate $$\hat{Y}$$ is correct to within $$3.2$$ inches.