Central Limit Theorem

It is important for data scientists to understand the averages of random samples, because these statistics are frequently used for estimating parameters. We know the expectation and SD of the mean of an i.i.d. sample, and we have observed that the distribution of the sample mean becomes increasingly concentrated around the underlying population mean as the sample size gets larger.
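As a reminder, the two facts referred to above can be written out as follows. The notation here is an assumption made for illustration: $X_1, \ldots, X_n$ are i.i.d. with mean $\mu$ and SD $\sigma$, and $\bar{X}_n$ denotes their average.

```latex
\[
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
E(\bar{X}_n) = \mu, \qquad
SD(\bar{X}_n) = \frac{\sigma}{\sqrt{n}}
\]
```

Because the SD shrinks like $1/\sqrt{n}$ while the expectation stays at $\mu$, the distribution of $\bar{X}_n$ concentrates around $\mu$ as $n$ grows.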

What you might also have noticed is that in many of our examples, the distribution of the sample mean looks bell-shaped. The reason is a fundamental theorem of probability and statistical inference: the Central Limit Theorem. This chapter introduces the theorem and shows why it matters.
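The bell shape can be checked informally with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with rate 1 (a skewed distribution with mean 1 and SD 1), the sample size and number of repetitions are arbitrary choices, and the helper names `simulate_sample_means` and `fraction_within` are hypothetical. It uses only the Python standard library.

```python
import math
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Return `reps` sample means, each computed from n i.i.d. exponential(1) draws.

    The exponential(1) population is an assumption chosen for illustration:
    it is clearly not bell-shaped, and has mean 1 and SD 1.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(reps):
        total = sum(rng.expovariate(1.0) for _ in range(n))
        means.append(total / n)
    return means


def fraction_within(values, center, width):
    """Fraction of values that lie within `width` of `center`."""
    return sum(abs(v - center) <= width for v in values) / len(values)


if __name__ == "__main__":
    n, reps = 50, 5000
    means = simulate_sample_means(n, reps)
    sd_theory = 1 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1

    print("average of sample means:", statistics.fmean(means))
    print("SD of sample means:     ", statistics.pstdev(means))
    print("theoretical SD:         ", sd_theory)
    # For a normal curve, roughly 68% of the area lies within 1 SD of the center.
    print("fraction within 1 SD:   ", fraction_within(means, 1.0, sd_theory))
```

If the distribution of the sample mean is roughly normal, the last figure should be close to 0.68 even though the individual draws come from a skewed population. Plotting a histogram of `means` would show the shape directly.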