6 – Probability, Distributions

Introduction

Probability is a measure of how likely an event is to occur. An important concept to appreciate is that in many cases, like R.A. Fisher’s Lady tasting tea experiment, we can count in advance all possible outcomes of an experiment. For many more experiments, however, we cannot count all possible outcomes in the sample space, either because they are too numerous or because they are simply unknowable. In such cases, applying theoretical probability distributions allows us to circumvent the countability problem. Whereas empirical probability distributions are built from frequency counts of observations, theoretical probability distributions are based on mathematical formulas.
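To make the contrast between counting outcomes and observing frequencies concrete, here is a minimal Python sketch based on the usual telling of the Lady tasting tea experiment: eight cups, four prepared milk-first, and the lady must pick the four milk-first cups. The cup counts, the function names, the number of simulated trials, and the random seed are assumptions chosen for illustration; they are not taken from this chapter. The theoretical distribution comes from counting, and the empirical distribution comes from simulating a taster who guesses at random.

```python
import random
from collections import Counter
from fractions import Fraction
from math import comb

CUPS = 8        # total cups (assumed, per the usual telling of the experiment)
MILK_FIRST = 4  # cups prepared milk-first (assumed)

# Every way to choose 4 cups out of 8: the full, countable sample space.
n_outcomes = comb(CUPS, MILK_FIRST)

def theoretical_pmf():
    """P(k milk-first cups correctly identified) under pure guessing, by counting."""
    return {
        k: Fraction(comb(MILK_FIRST, k) * comb(CUPS - MILK_FIRST, MILK_FIRST - k),
                    n_outcomes)
        for k in range(MILK_FIRST + 1)
    }

def empirical_pmf(n_trials, seed=1):
    """Relative frequencies of k from simulated random guessing."""
    rng = random.Random(seed)
    truth = set(range(MILK_FIRST))  # label cups 0..3 as the milk-first cups
    counts = Counter()
    for _ in range(n_trials):
        guess = rng.sample(range(CUPS), MILK_FIRST)
        counts[len(truth.intersection(guess))] += 1
    return {k: counts[k] / n_trials for k in range(MILK_FIRST + 1)}

theory = theoretical_pmf()
observed = empirical_pmf(10_000)

print("possible outcomes:", n_outcomes)
for k in range(MILK_FIRST + 1):
    print(k, theory[k], round(float(theory[k]), 4), observed[k])
```

With more simulated trials, the observed relative frequencies should settle closer to the counted probabilities; by counting alone, the chance of identifying all four cups correctly by guessing is 1 in 70.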

Much of classical inferential statistics, especially the kind one finds in introductory courses like ours, is built on probability distributions. ANOVA, t-tests, linear regression, and similar procedures are parametric: they assume that errors follow a particular type of distribution, the normal or Gaussian distribution.

A probability distribution lists the probability of each possible outcome of a discrete random variable in an entire population. For continuous random variables, probability density functions are used instead. Depending on the data type, there are many classes of probability distributions. This chapter begins with the basics of probability, then gently introduces probability distributions. In the other sections of this chapter we describe several probability density functions. Emphasis is placed on the normal distribution, which underlies most parametric statistics.
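The distinction between a list of probabilities and a density function can be sketched in a few lines of Python. The example below is illustrative only: the binomial setting (n = 10 trials, p = 0.5), the helper name binomial_pmf, and the choice of standard deviations are assumptions made for the sketch, not values from this chapter. It uses only the standard library (statistics.NormalDist requires Python 3.8 or later).

```python
from math import comb
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Discrete case: one probability per outcome, and the list sums to 1.
n, p = 10, 0.5  # illustrative values
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(pmf)

# Continuous case: the density at a point is a height, not a probability.
z = NormalDist(mu=0, sigma=1)
density_at_0 = z.pdf(0)

# Probabilities come from areas under the density, here within 1 SD of the mean.
prob_within_1sd = z.cdf(1) - z.cdf(-1)

# A narrow normal distribution has a density above 1 at its mean,
# which would be impossible for a probability.
narrow = NormalDist(mu=0, sigma=0.1)
narrow_density_at_0 = narrow.pdf(0)

print("sum of binomial probabilities:", total)
print("standard normal density at 0:", round(density_at_0, 4))
print("P(-1 < Z < 1):", round(prob_within_1sd, 4))
print("density at 0 when sigma = 0.1:", round(narrow_density_at_0, 4))
```

The point of the last calculation is that a density value can exceed 1, so for continuous variables probabilities are read from areas over intervals rather than from the height of the curve at a single value.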


Chapter 6 contents