7.1 – Epidemiology definitions
Introduction
This sub-chapter may lack drama, but we begin our introduction to epidemiology with a list of key terms and their definitions. More definitions follow in later sections.
Definitions
Absolute risk: The probability that a specified event will occur in a specified population. See Ch07.4 – Epidemiology: Relative risk and absolute risk, explained
Absolute risk reduction (ARR): the decrease in risk of an event in an exposed (treatment) group compared to an unexposed (control) group. Also called the risk difference. In the notation of Table 1, $ARR = CER - EER = \frac{c}{c+d} - \frac{a}{a+b}$; see Contingency table. See Ch07.4 – Epidemiology: Relative risk and absolute risk, explained
Contingency table: also called cross tabulation or crosstab, a display of counts of variables in a matrix format. We briefly introduced the contingency table in Chapter 5.1. In epidemiology, the rows of a contingency table represent treatment or exposure groups and the columns represent outcomes.
Table 1. A 2 X 2 contingency table
| Outcome: Yes | Outcome: No |
Treatment or exposed group | a | b |
Control or nonexposed group | c | d |
The 2X2 contingency table is referred to frequently in this chapter and again in Chapter 9.
R code
# cell counts, entered row by row as in Table 1
a = 4; b = 46; c = 5; d = 45
Table1 <- matrix(c(a,b,c,d), 2, 2, byrow=TRUE, dimnames = list(c("Treatment", "Control"), c("Yes", "No")))
Table1
R output
          Yes No
Treatment   4 46
Control     5 45
In statistical classification, a field dedicated to establishing algorithms for correctly identifying and predicting classes of data (e.g., correct disease diagnosis), the 2X2 contingency table is often called a confusion matrix or error matrix (Greene 2003). Machine learning methods are often built on developments from statistical classification (Michie et al. 1995). We expand on contingency tables in Chapter 7.4 and Chapter 9.2.
Control event rate (CER): how often an event occurs in the control group. In the notation of Table 1, $CER = \frac{c}{c+d}$; see Contingency table.
Diagnosis: identification of the nature of a disease or condition.
Event: From probability theory, an event is a set of outcomes to which a probability is assigned.
Experimental event rate (EER): how often an event occurs in the treatment group. In the notation of Table 1, $EER = \frac{a}{a+b}$; see Contingency table.
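As a minimal sketch, the two event rates and the absolute risk reduction can be computed directly from the cell counts of Table 1. The counts are the ones used in the R code above; the object names EER, CER, and ARR are chosen here for illustration only.

```r
# Cell counts from Table 1 (same values as the R example above)
a <- 4; b <- 46   # treatment group: events, non-events
c <- 5; d <- 45   # control group: events, non-events

EER <- a / (a + b)   # experimental event rate: risk in the treatment group
CER <- c / (c + d)   # control event rate: risk in the control group
ARR <- CER - EER     # absolute risk reduction (risk difference)

print(round(EER, 3))
print(round(CER, 3))
print(round(ARR, 3))
```

For these counts the event rate is 8% in the treatment group and 10% in the control group, an absolute difference of 2 percentage points.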
Hazard: anything that can cause harm.
Incidence: the number of individuals in a population newly diagnosed with a condition, disease, or other characteristic during a specified time period. Compare to prevalence.
Likelihood: In everyday English, the terms likelihood and probability mean the same thing: the chance that an event will occur. In statistics, however, probability refers to possible outcomes, whereas likelihood is associated with the possible explanation (hypothesis) for the outcome. Put another way, how likely are the data we observe, given the hypothesis (cf. the excellent discussion by Gallistel 2015)? We will look at likelihood methods in practice in subsequent chapters.
Negative predictive value of a test (NPV): the probability that a negative test result identifies a person who truly does not have the disease. Calculated as the number of individuals without the disease who tested negative (true negatives) divided by the total number that tested negative.
Number needed to treat (NNT): the inverse of the absolute risk reduction, $NNT = \frac{1}{ARR}$. See Ch07.4 – Epidemiology: Relative risk and absolute risk, explained
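A short sketch of the NNT calculation, again using the Table 1 counts from the R code above. The object names are illustrative, and ARR is computed as the control group risk minus the treatment group risk.

```r
# Cell counts from Table 1
a <- 4; b <- 46   # treatment group: events, non-events
c <- 5; d <- 45   # control group: events, non-events

# absolute risk reduction: control risk minus treatment risk
ARR <- c / (c + d) - a / (a + b)

# number needed to treat is the inverse of ARR
NNT <- 1 / ARR
print(NNT)
```

With an absolute risk reduction of 0.02, the NNT is about 50: roughly 50 individuals would need to be treated to prevent one additional event.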
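Odds: the ratio of two probabilities, the probability that an event occurs to the probability that it does not. For example, the probability of getting a one on throwing a die is $\frac{1}{6}$, the probability of not getting a one is $\frac{5}{6}$, therefore the odds of getting a one are 1 to 5, $\frac{1/6}{5/6} = \frac{1}{5}$. See Ch07.5 – Odds ratio (OR)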
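The die example can be checked with a few lines of R. This is only a sketch; the object names p and odds are chosen here for illustration.

```r
# probability of rolling a one with a fair six-sided die
p <- 1 / 6

# odds: probability the event occurs divided by probability it does not
odds <- p / (1 - p)
print(odds)               # 1 to 5, printed as a decimal

# converting odds back to a probability recovers p
p_back <- odds / (1 + odds)
print(p_back)
```

Note that the odds (1 to 5) and the probability (1 in 6) describe the same event on different scales.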
Per capita rate: a rate expressed per person. Per capita is a Latin phrase meaning "for each head."
Positive predictive value of a test (PPV): the probability that a positive test result identifies a person who truly has the disease. Calculated as the number of individuals with the disease who tested positive (true positives) divided by the total number that tested positive.
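To make the two predictive values concrete, here is a minimal sketch in R. The counts are hypothetical, invented purely for illustration (they are not data from this book), and the names TP, FP, FN, and TN are labels chosen here for the four possible test outcomes.

```r
# Hypothetical screening results for 1000 people (illustration only)
TP <- 90    # have the disease, test positive (true positives)
FP <- 30    # do not have the disease, test positive (false positives)
FN <- 10    # have the disease, test negative (false negatives)
TN <- 870   # do not have the disease, test negative (true negatives)

# positive predictive value: true positives out of all positive tests
PPV <- TP / (TP + FP)

# negative predictive value: true negatives out of all negative tests
NPV <- TN / (TN + FN)

print(round(PPV, 3))
print(round(NPV, 3))
```

In both cases the denominator is the number of people with a given test result, not the number of people with or without the disease.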
Posttest probability refers to the probability that the patient has the disease after the results of the test are known.
Pretest probability is the prevalence of the disease, i.e., the chance that a randomly selected person from the population has the disease.
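One common way to move from pretest to posttest probability is Bayes' theorem. The sketch below assumes two properties of the test that are not defined in this section: sensitivity (probability of a positive test given disease) and specificity (probability of a negative test given no disease). All three input values are hypothetical.

```r
# Hypothetical inputs (assumptions for illustration only)
pretest     <- 0.10   # pretest probability, i.e., prevalence
sensitivity <- 0.90   # P(test positive | disease)
specificity <- 0.95   # P(test negative | no disease)

# P(disease | positive test) by Bayes' theorem
posttest_pos <- (sensitivity * pretest) /
  (sensitivity * pretest + (1 - specificity) * (1 - pretest))

# P(disease | negative test) by Bayes' theorem
posttest_neg <- ((1 - sensitivity) * pretest) /
  ((1 - sensitivity) * pretest + specificity * (1 - pretest))

print(round(posttest_pos, 3))
print(round(posttest_neg, 3))
```

Under these assumed values a positive result raises the probability of disease well above the pretest value, and a negative result lowers it.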
Prevalence: The proportion of individuals in a population having a condition, disease, or characteristic. Compare to incidence.
Prognosis: the likely course and outcome of a disease, i.e., how a disease plays out.
Relative risk: Ratio of the risk of an event among those exposed to the risk factor to the risk among those not exposed to the risk factor. See Ch07.4 – Epidemiology: Relative risk and absolute risk, explained
Relative risk reduction (RRR): the absolute risk reduction divided by the control event rate. See Ch07.4 – Epidemiology: Relative risk and absolute risk, explained
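Relative risk and relative risk reduction can be sketched from the same Table 1 counts used in the R code earlier in this section. Here the treatment group is taken as the exposed group, so relative risk is the treatment group risk divided by the control group risk; the object names are illustrative.

```r
# Cell counts from Table 1
a <- 4; b <- 46   # treatment (exposed) group: events, non-events
c <- 5; d <- 45   # control (nonexposed) group: events, non-events

EER <- a / (a + b)   # risk in the treatment group
CER <- c / (c + d)   # risk in the control group

RR  <- EER / CER           # relative risk
RRR <- (CER - EER) / CER   # relative risk reduction: ARR divided by CER

print(round(RR, 3))
print(round(RRR, 3))
```

For these counts the relative risk is 0.8 and the relative risk reduction is 0.2; the two are linked by RRR = 1 - RR. A 20% relative reduction here corresponds to an absolute reduction of only 2 percentage points, which is why the two measures are worth comparing.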
Risk: Probability of an event. Risk is not restricted to just bad events, but refers to the uncertainty of a particular event (e.g., the risk that a child will be born male seems a melodramatic statement, but it is accurate as far as this definition goes).
Therapy: treatment intended to treat, relieve, or cure a disorder or condition.
Questions
- Compare and contrast ARR and RRR.
- What’s the difference between event, hazard, and risk?
- What’s the difference between incidence and prevalence?
- What’s the difference between diagnosis and prognosis?
- What’s the implication of an NNT greater than 100 in terms of the utility of a proposed therapy or treatment?