
Bias, Variance, and Least Squares

In this chapter we will attempt to answer some general questions about estimators.

  • How can we measure the accuracy of an estimator?
  • What factors lead to error in estimation? Can we quantify them?
  • When we have two different estimators of the same unknown quantity, what should we consider when deciding which one to use?
  • What are some criteria for an estimator to be good?
  • How do we construct good estimators?

In the process we will define and contrast the bias and the variability of an estimator.

We will then use the criterion of least squares to derive the best linear predictor of one variable based on another. This is the regression line, familiar to you from Data 8. The calculation will include a justification of Data 8's definition of the correlation coefficient. Finally, we will prove that the correlation coefficient is a number between $-1$ and $1$.
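As a preview, here is a minimal numerical sketch of the claim that the least squares line can be written in terms of the correlation coefficient. It assumes the Data 8 definition of $r$ as the average of the products of the two variables in standard units; the helper names `standard_units`, `correlation`, and `regression_line` and the small data set are made up for illustration. The result is compared with the line that NumPy's `np.polyfit` finds by minimizing squared error directly.

```python
import numpy as np

def standard_units(v):
    """Convert an array to standard units: (value - mean) / SD."""
    v = np.asarray(v, dtype=float)
    return (v - v.mean()) / v.std()

def correlation(x, y):
    """Assumed Data 8 definition: mean of products in standard units."""
    return np.mean(standard_units(x) * standard_units(y))

def regression_line(x, y):
    """Slope and intercept of the regression line, written in terms of r."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = correlation(x, y)
    slope = r * y.std() / x.std()
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

# Hypothetical data, chosen only for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

slope, intercept = regression_line(x, y)

# Degree-1 least squares fit; coefficients come back highest degree first
ls_slope, ls_intercept = np.polyfit(x, y, 1)

print("r =", correlation(x, y))
print("regression line:  ", slope, intercept)
print("least squares fit:", ls_slope, ls_intercept)
```

The two lines should agree up to rounding error, and the computed $r$ should fall between $-1$ and $1$. The chapter establishes both facts in general rather than for one data set.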