5.1 – Experiments
Introduction
With the background behind us, we outline a designed experiment. Readers may wish to review concepts presented previously, in
- Chapter 2.4 – Experimental design and rise of statistics in medical research
- Chapter 3.4 – Estimating parameters
- Chapter 3.5 – Statistics of error
Experiments have the following elements:
- Define a reference population (e.g., all patients with similar symptoms; all frogs in the population to which the malformed frog belonged).
- Define sample unit from the reference population (e.g., as many patients as can be screened by the physician are recruited; all frogs visible to the biologist are captured).
- Design sampling scheme from population (e.g., random sampling vs. convenience/haphazard sampling).
- Agree on the primary outcome or endpoint to be measured and whether additional secondary outcomes are to be observed and collected.
- Separate the sample into groups so that comparisons can be made (e.g., the illness example doesn’t exactly follow this scheme; rather, all patients are given the same treatment and their responses are followed. A more typical example from biomedicine would be random assignment of individuals to one of two groups: half receive a placebo and half receive drug A. For the frog example, the analogy fits well, with two groups: the normal frogs and the single abnormal frog).
- Devise a way to exclude, or distinguish among, the possible explanations or alternate hypotheses (e.g., the physician will record how many of her patients fail to respond to treatment; for the biologist, noting the presence or absence of parasites in the normal frogs and in the abnormal frog distinguishes the groups).
A basic experimental design looks like this.
- One treatment, with two levels (e.g., a control group and an experimental group)
- A collection of individuals recruited from a defined population, ideally by random sampling.
- Random assignment of individuals to one of the two treatment levels.
- The treatments are applied to each individual in the study.
- A measure of response (the primary outcome) and additional features (secondary outcomes) are recorded for each individual.
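The sampling and assignment steps in the list above can be sketched in a few lines of Python. This is only an illustration, not a prescribed procedure: the population of 500 numbered individuals, the sample size of 40, the seed, and the function name `design_two_level_experiment` are all assumptions made for the example.

```python
import random

def design_two_level_experiment(population_ids, n_sample, seed=None):
    """Randomly sample individuals, then randomly assign half to each level."""
    if n_sample % 2 != 0:
        raise ValueError("n_sample must be even to form two equal groups")
    rng = random.Random(seed)
    # Random sampling: every individual has the same chance of recruitment.
    sample = rng.sample(population_ids, n_sample)
    # Random assignment: shuffle the recruited individuals, then split in half.
    rng.shuffle(sample)
    half = n_sample // 2
    return {
        "control": sorted(sample[:half]),
        "experimental": sorted(sample[half:]),
    }

# Hypothetical reference population of 500 individuals, labeled 1..500.
population = list(range(1, 501))
groups = design_two_level_experiment(population, n_sample=40, seed=1)
print(len(groups["control"]), len(groups["experimental"]))  # prints: 20 20
```

Fixing the seed makes the allocation reproducible, which is convenient for a record of the design; a different seed gives a different, equally valid, random allocation.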
Contrast with design of observational studies
Importantly, in observational studies, no matter how sophisticated the equipment used to measure the outcome variable(s), key steps outlined above are missing. Researchers conducting observational studies do not control the allocation of subjects to treatment groups (the step of separating the sample into groups in the first list, and random assignment in the second). Instead, they may use case-control approaches, where individuals with (case) and without (control) the condition (e.g., lung cancer) are compared with respect to some candidate risk factor (e.g., smoking). Both case and control groups likely have members who smoke, but if there is an association between smoking and lung cancer, then a greater proportion of the case group than of the control group will be smokers.
A cohort study, also a form of observational study, is similar to a case-control study except that the outcome status is not known when subjects are enrolled. A cohort study includes, in our example, smokers and nonsmokers who share other characteristics: age, medical history, etc. The other kind of design you will see is the cross-sectional study. Cross-sectional studies are descriptive studies and, therefore, also observational. Primary outcomes, along with additional characteristics and outcomes, are measured for a representative subsample, or perhaps even an entire population. Cross-sectional studies are used to estimate absolute and relative risk rates. In ecology and evolutionary biology cross-sectional studies are common, e.g., comparisons of species for metabolic rate (Darveau et al 2005) or life history traits (Jennings et al 1998), and specialized statistical approaches that incorporate phylogenetic information about the species are now the hallmark of these kinds of studies.
A word on the outcomes of an experiment. Experiments should be designed to address an important question, and the outcomes the researcher measures should be directly related to that question. Thus, in the design of clinical trials, researchers distinguish between primary and secondary outcomes or endpoints. In educational research, for example, the primary question may be whether students exposed to different teaching styles (e.g., lecture-style or active-learning approaches) score differently on knowledge-content exams. The primary outcome would be the scores on the exams; many possible secondary outcomes might be collected, including students’ attitudes towards the subject or perceptions of how much they have learned.
Example of an experiment
We’ll work through a familiar example. The researcher is given the task of designing a study to test the efficacy of a new pain reliever in reducing tension headache symptoms. There are many possible outcomes: blurry vision, duration, frequency, nausea, need for and response to pain medication, and level of severity (Mayo Clinic). The drug is packaged in a capsule, and a placebo is designed that contains all ingredients except the new drug. Forty subjects with headaches are randomly selected from a population of headache sufferers. All forty subjects sign the consent form and agree to be part of the study. The subjects are also informed that, while they are participating in a research study on a new pain reliever, each subject has a 50-50 chance of receiving a placebo and not the new drug. The researcher then randomly assigns twenty of the subjects to receive the drug treatment and twenty to receive the placebo, places either a placebo pill or the treatment pill into a numbered envelope for each subject, and gives the envelopes to a research partner. The partner then gives the envelopes to the patients. Both the patients and the research partner are kept ignorant of the assignment to treatment.
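The envelope step in this example can be sketched as follows. The code is a minimal illustration under stated assumptions: the function name `make_allocation_key`, the envelope numbering 1 to 40, and the seed are hypothetical and are not part of the study described above.

```python
import random

def make_allocation_key(n_subjects=40, seed=None):
    """Return a dict mapping envelope number -> 'drug' or 'placebo'."""
    if n_subjects % 2 != 0:
        raise ValueError("n_subjects must be even for a 50-50 split")
    rng = random.Random(seed)
    contents = ["drug"] * (n_subjects // 2) + ["placebo"] * (n_subjects // 2)
    # Random assignment: shuffle which pill goes into which numbered envelope.
    rng.shuffle(contents)
    return {envelope: pill for envelope, pill in enumerate(contents, start=1)}

# The researcher keeps the key; the partner and the subjects never see it.
allocation_key = make_allocation_key(n_subjects=40, seed=2020)

# What the research partner sees: envelope numbers only, no pill labels.
partner_view = sorted(allocation_key)

print(partner_view[:5])                                         # [1, 2, 3, 4, 5]
print(sum(pill == "drug" for pill in allocation_key.values()))  # 20
```

Keeping the key separate from the list handed to the partner is what keeps both the partner and the subjects ignorant of the assignment; the key is consulted only when the results are analyzed.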
We can summarize this most basic experimental design in a table, Table 1.
Table 1. Simple formulation of a 2×2 experiment (aka 2×2 contingency table).
Subject received the treatment | Subject got better: Yes | Subject got better: No | Row totals |
---|---|---|---|
Yes | a | b | a + b |
No | c | d | c + d |
Column totals | a + c | b + d | N |
where N is the total number of subjects, a is the number of subjects who DID receive the treatment AND got better, b is the number of subjects who DID receive the treatment and DID NOT get better, c is the number of subjects who DID NOT receive the treatment but DID get better, and d is the number of subjects who DID NOT receive the treatment and DID NOT get better.
Note 1: We owe to Karl Pearson (1904) both the concept of contingency, that membership in a (or b, c, or d, for that matter) is a deviation from any chance association or independent probability (Chapter 6), and the contingency table itself. Contingency is an important concept in classification and is central to machine learning in data science (Michie et al 1995). We return at some length to contingency and how to analyze contingency tables in Chapter 7 and Chapter 9.2.
Our basic question is whether or not treatment level is associated with subjects “getting better,” as measured on some scale.
One possible, although unlikely, result of the experiment is that all of the negative outcomes are found in the group that did not receive the treatment and all of the positive outcomes in the group that did (Table 2).
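One way to make "no association" concrete is to compute the cell counts we would expect if treatment and outcome were independent: each expected count is its row total times its column total, divided by N. The sketch below applies this to the layout of Table 1. The function name `expected_counts` and the example counts are hypothetical, chosen only for illustration; the formal analysis of contingency tables is left to the later chapters.

```python
def expected_counts(a, b, c, d):
    """Expected 2x2 cell counts if treatment and outcome are independent."""
    n = a + b + c + d
    row_totals = (a + b, c + d)  # received treatment, did not receive it
    col_totals = (a + c, b + d)  # got better, did not get better
    # Expected count for each cell = (row total * column total) / N
    return [[r * k / n for k in col_totals] for r in row_totals]

# Hypothetical counts, for illustration only: a=12, b=8, c=6, d=14.
expected = expected_counts(12, 8, 6, 14)
for row in expected:
    print(row)
# [9.0, 11.0]
# [9.0, 11.0]
```

The further the observed counts fall from these expected counts, the stronger the evidence of an association between treatment and outcome.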
Table 2. One possible outcome of our 2×2 experiment.
Subject received the treatment | Subject improved: Yes | Subject improved: No |
---|---|---|
Yes | 20 | 0 |
No | 0 | 20 |
Results of an experiment probably won’t be as clear as in Table 2. Treatments may be effective, but not everyone benefits. Thus, results like Table 3 may be more typical.
Table 3. A more likely outcome of our 2×2 experiment.
Subject received the treatment | Subject improved: Yes | Subject improved: No |
---|---|---|
Yes | 7 | 13 |
No | 2 | 18 |
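To compare the two tables, it can help to express each row as the proportion of subjects who improved. The short sketch below does this for the counts in Table 2 and Table 3; the helper name `proportion_improved` is an assumption made for the example, and the calculation is descriptive only, not a statistical test.

```python
def proportion_improved(improved, not_improved):
    """Fraction of a group that improved."""
    return improved / (improved + not_improved)

# Counts copied from Table 2 and Table 3: (improved, not improved).
tables = {
    "Table 2": {"treated": (20, 0), "placebo": (0, 20)},
    "Table 3": {"treated": (7, 13), "placebo": (2, 18)},
}

for name, groups in tables.items():
    p_treated = proportion_improved(*groups["treated"])
    p_placebo = proportion_improved(*groups["placebo"])
    print(f"{name}: treated {p_treated:.2f}, placebo {p_placebo:.2f}, "
          f"difference {p_treated - p_placebo:.2f}")
# Table 2: treated 1.00, placebo 0.00, difference 1.00
# Table 3: treated 0.35, placebo 0.10, difference 0.25
```

Whether a difference such as 0.25, observed in only forty subjects, reflects more than chance is exactly the kind of question the analysis of contingency tables is meant to answer.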
Questions
- At the time I am updating this page we are starting our fifth month of the coronavirus pandemic of 2019-2020. Daily, it seems, we are also hearing news about “promising coronavirus treatments,” but as of this date, no study has been published that meets our considerations for a proper experiment. However, on May 1, the FDA issued an emergency use authorization for the antiviral drug remdesivir (Gilead), based on early results of clinical trials reported 29 April in The Lancet (Wang et al 2020). Briefly, their study included 158 patients treated with doses of the antiviral drug and 79 provided with a placebo control. Both groups were treated otherwise the same. Improvement over 28 days was recorded: 62 improved with placebo and 133 improved receiving doses of remdesivir.
- Using the background described on this page, list the information needed to design a proper experiment. Using your list, review the work described in The Lancet article and check for evidence that the trial meets these requirements.
- Create a 2×2 table with the described results from The Lancet article.
- Consider our hazel tea and copper solution experiment described in Chapter 3. The outcome variables (tail length, tail percentage, Olive moment) are quantitative, not categorical. Create a table to display the experimental design.
- For the tension headache example, identify elements of the design that conform to the randomized controlled trial design described in Chapter 2.4.