5.1 – Experiments
How to put an experiment together
With the background behind us, we outline a designed experiment. Readers may wish to review concepts presented previously, in
- Chapter 2.4 – Experimental design and rise of statistics in medical research
- Chapter 3.4 – Estimating parameters
- Chapter 3.5 – Statistics of error
Experiments have the following elements:
- Define a reference population.
  - All patients with similar angina symptoms (shortness of breath, chest pain, dizziness) but who have not suffered a heart attack, or myocardial infarction (Wu et al 2017);
  - All adult frogs in a pond near industrial waste runoff (eg, Sowers et al 2009).
- Define the sample unit from the reference population (eg, as many patients as can be screened by the physician are recruited; all frogs visible to the biologist are captured).
- Design the sampling scheme from the population (eg, random sampling vs. convenience/haphazard sampling).
- Agree on the primary outcome or endpoint to be measured and whether additional secondary outcomes are to be observed and collected.
  - For the patients with similar symptoms, we follow
  - For the frogs, we count adults with developmental malformations.
- Separate the sample into groups so that comparisons can be made.
  - We may be tempted to give everyone the same treatment and follow the group for improvement; that is the clinically valid approach. To be an experiment, however, we would randomly assign individuals to one of two groups: half will receive a placebo, half will receive a beta blocker.
  - The frog example doesn’t follow this scheme: the event has already occurred, so it is an observational study.
- Devise a way to exclude, or distinguish among, the possible alternative explanations (alternative hypotheses).
  - eg, for the physician, she will record how many of her patients fail to respond to treatment;
  - for the biologist, noting the presence or absence of parasites in normal and abnormal frogs defines the groups.
Note 1: The studies cited are examples of work done on similar problems; they are not offered to evaluate or critique any research efforts that were done or described in these reports.
The frog study, as described by me, lacks any design elements that would allow for a test of the hypothesis that industrial runoff causes developmental malformations in frogs. Students should think about how to modify this study to test the hypothesis. See quiz question.
A basic experimental design looks like this.
- One treatment, with two levels (eg, a control group and an experimental group)
- A collection of individuals recruited from a defined population, ideally by random sampling.
- Random assignment of multiple individuals to one of the two treatment levels.
- The treatments are applied to each individual in the study.
- A measure of response (the primary outcome) and additional features (secondary outcomes) are recorded for each individual.
Of course, there are other considerations, and by now, students reading this text should be able to identify what is missing from the list. (See quiz question.)
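The sampling and assignment steps in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 500 ID numbers, the function name `sample_and_assign`, and the fixed seed are all hypothetical, and equal group sizes are assumed.

```python
import random

def sample_and_assign(population, n_subjects, seed=None):
    """Randomly sample subjects, then randomly split them into two equal groups."""
    if n_subjects % 2 != 0:
        raise ValueError("n_subjects must be even to form two equal groups")
    rng = random.Random(seed)
    # Random sampling: every member of the population is equally likely to be drawn.
    subjects = rng.sample(population, n_subjects)
    # Random assignment: shuffle the sample, then split it in half.
    rng.shuffle(subjects)
    half = n_subjects // 2
    return {"control": subjects[:half], "treatment": subjects[half:]}

# Hypothetical reference population, identified only by ID number.
population = list(range(1, 501))
groups = sample_and_assign(population, 40, seed=1)
print(len(groups["control"]), len(groups["treatment"]))
```

Sampling and assignment are kept as two separate random steps because they answer different questions: the first bears on how well the sample represents the population, the second on whether the two groups are comparable.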
Contrast with design of observational studies
Importantly, for observational studies, no matter how sophisticated the equipment used to measure the outcome variable(s), key steps outlined above are missing. Researchers conducting observational studies do not control allocation of subjects to treatment groups (the step of separating the sample into groups in the first list above, and random assignment to treatment levels in the second). Instead, they may use case-control approaches, where individuals with (case) and without (control) the condition (eg, lung cancer) are compared with respect to some candidate risk factor (eg, smoking). Both case and control groups likely have members who smoke, but if there is an association between smoking and lung cancer, then more smokers will be in the case group compared to the control group.
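To make the case-control comparison concrete, here is a minimal sketch that compares how common the risk factor is among cases and among controls. The counts are hypothetical, invented only for illustration, and the odds ratio is included as one common summary for this kind of table even though it is not named in the text above.

```python
def exposure_summary(cases_exposed, cases_unexposed,
                     controls_exposed, controls_unexposed):
    """Compare how common the risk factor is among cases and among controls."""
    p_cases = cases_exposed / (cases_exposed + cases_unexposed)
    p_controls = controls_exposed / (controls_exposed + controls_unexposed)
    # Odds of exposure among cases divided by odds of exposure among controls.
    odds_ratio = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)
    return p_cases, p_controls, odds_ratio

# Hypothetical counts: 100 cases (60 smokers) and 100 controls (30 smokers).
p_cases, p_controls, odds_ratio = exposure_summary(60, 40, 30, 70)
print(p_cases, p_controls, odds_ratio)
```

With these made-up numbers, smokers are twice as common among cases as among controls, which is the pattern the paragraph describes when an association exists.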
A cohort study, also a form of observational study, is similar to case-control except that the outcome status is not known. A cohort study includes, in our example, smokers and nonsmokers who share other characteristics: age, medical history, etc. The other kind of design you will see is the cross-sectional study. Cross-sectional studies are descriptive studies and, therefore, also observational. Primary outcomes along with additional characteristics and outcomes are measured for a representative subsample, or perhaps even an entire population. Cross-sectional studies are used to estimate absolute and relative risk rates. In ecology and evolutionary biology cross-sectional studies are common, eg, comparisons of species for metabolic rate (Darveau et al 2005) or life history traits (Jennings et al 1998), and specialized statistical approaches that incorporate phylogenetic information about the species are now the hallmark of these kinds of studies.
A word on outcomes of an experiment. Experiments should be designed to address an important question. The outcomes the researcher measures should be directly related to the important question. Thus, in the design of clinical trials, researchers distinguish between primary and secondary outcomes or endpoints. In educational research the primary question is whether or not students exposed to different teaching styles (eg, lecture-style or active-learning approaches) score higher on knowledge-content exams. The primary outcome would be the scores on the exams; many possible secondary outcomes might be collected, including students’ attitudes towards the subject or perceptions of how much they have learned.
Note 3: While it’s logical to include secondary outcomes (they may support or complement the primary results), caution is needed in interpreting the test results of multiple secondary outcomes. By definition, the study is designed to test a hypothesis about the primary or main outcome. While there are a number of concerns about testing secondary outcomes, from a statistical point of view the study may lack the sample size (the power, see Chapter 11 – Power analysis) to test many additional hypotheses without increasing the chance of false positives because of the multiple comparisons problem. We address this scenario in Chapter 12.1 – The need for ANOVA.
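A rough sense of the multiple comparisons problem comes from a short calculation. The sketch below assumes that each test uses the same Type I error rate and that the tests are independent of one another, which is a simplification, since secondary outcomes measured on the same subjects are usually correlated. The function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive among n_tests independent tests."""
    # Probability that every test avoids a false positive is (1 - alpha) ** n_tests.
    return 1.0 - (1.0 - alpha) ** n_tests

for k in (1, 3, 5, 10):
    print(k, round(familywise_error_rate(0.05, k), 3))
```

Under these assumptions, testing five outcomes at the 5% level gives roughly a 23% chance of at least one false positive, and ten outcomes about 40%.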
Example of an experiment.
We’ll work through a familiar example. The researcher is given the task to design a study to test the efficacy of a new pain reliever in reducing tension headache symptoms. There are many possible outcomes: blurry vision, duration, frequency, nausea, need for and response to pain medication, and level of severity (Mayo Clinic). The drug is packaged in a capsule and a placebo is designed that contains all ingredients except the new drug. Forty subjects with headaches are randomly selected from a population of headache sufferers. All forty subjects sign the consent form and agree to be part of the study. The subjects also are informed that while they are participating in the research study on a new pain reliever, each subject has a 50-50 chance of receiving a placebo and not the new drug. The researcher then randomly assigns twenty of the subjects to receive the drug treatment and twenty to receive the placebo, places either a placebo pill or the treatment pill into a numbered envelope, and gives the envelopes to a research partner. The partner then gives the envelopes to the patients. Both patients and the research partner are kept ignorant of the assignment to treatment.
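The numbered-envelope procedure can be sketched as follows. This is one plausible way to organize it, not a description of any actual trial software: the subject labels, the function name `prepare_envelopes`, and the seed are hypothetical. The point is that the allocation key (envelope number to contents) stays with the researcher, while the partner's handout list carries envelope numbers only.

```python
import random

def prepare_envelopes(subject_ids, seed=None):
    """Return (allocation key kept by the researcher, handout list for the partner)."""
    rng = random.Random(seed)
    n = len(subject_ids)
    # Half the envelopes hold the drug, the rest hold the placebo.
    contents = ["drug"] * (n // 2) + ["placebo"] * (n - n // 2)
    rng.shuffle(contents)  # random assignment of contents to envelope numbers
    key = {envelope: pill for envelope, pill in enumerate(contents, start=1)}
    # The partner sees only which numbered envelope goes to which subject.
    handout = {subject: envelope for envelope, subject in enumerate(subject_ids, start=1)}
    return key, handout

subjects = [f"S{i:02d}" for i in range(1, 41)]  # forty hypothetical subject labels
key, handout = prepare_envelopes(subjects, seed=2020)
print(sum(pill == "drug" for pill in key.values()), "drug envelopes")
```

Only after the outcomes are recorded would the key be joined back to the handout list to learn which subjects received the drug.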
We can summarize this most basic experimental design in a table, Table 1.
Table 1. Simple formulation of a 2×2 experiment (aka 2×2 contingency table).
| | | Did the subject get better? | | |
|---|---|---|---|---|
| | | Yes | No | Row totals |
| Subject received the treatment | Yes | a | b | a + b |
| | No | c | d | c + d |
| Column totals | | a + c | b + d | N |
where N is the total number of subjects, a is the number of subjects who DID receive the treatment AND got better, b is the number of subjects who DID receive the treatment and DID NOT get better, c is the number of subjects who DID NOT receive the treatment, but DID get better, and d is the number of subjects who DID NOT receive the treatment and DID NOT get better.
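The bookkeeping for Table 1 is simple enough to write out. The sketch below takes the four cell counts a, b, c, and d as defined above and returns the row totals, column totals, and N, along with the proportion who got better in each group. The counts passed in are hypothetical and the function name is made up for illustration.

```python
def summarize_2x2(a, b, c, d):
    """Margins and per-group proportions for a 2x2 table laid out as in Table 1."""
    row_totals = (a + b, c + d)   # treated, not treated
    col_totals = (a + c, b + d)   # got better, did not get better
    n = a + b + c + d
    if 0 in row_totals:
        raise ValueError("each treatment group needs at least one subject")
    return {
        "row_totals": row_totals,
        "col_totals": col_totals,
        "N": n,
        "p_better_treated": a / (a + b),
        "p_better_untreated": c / (c + d),
    }

# Hypothetical counts, not data from any study.
summary = summarize_2x2(a=12, b=8, c=5, d=15)
print(summary)
```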
Note 4: We owe to Karl Pearson (1904) the concept of contingency, that membership in a (or b, c, or d for that matter) is a deviation from any chance association or independent probability (Chapter 6), and the contingency table itself. Contingency is an important concept in classification and is central to machine learning in data science (Michie et al 1995). We return at some length to contingency and how to analyze contingency tables in Chapter 7 and Chapter 9.2.
Our basic aim is to test whether or not treatment levels were associated with subjects “getting better” as measured on some scale.
One possible, although unlikely, result of the experiment is that all of the negative outcomes are found in the group that did not receive the treatment and all of the positive outcomes in the group that did (Table 2).
Table 2. One possible outcome of our 2×2 experiment.
| | | Subject improved | |
|---|---|---|---|
| | | Yes | No |
| Subject received the treatment | Yes | 20 | 0 |
| | No | 0 | 20 |
Results of an experiment probably won’t be as clear as in Table 2. Treatments may be effective, but not everyone benefits. Thus, results like Table 3 may be more typical.
Table 3. A more likely outcome of our 2×2 experiment.
| | | Subject improved | |
|---|---|---|---|
| | | Yes | No |
| Subject received the treatment | Yes | 7 | 13 |
| | No | 2 | 18 |
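
How far do Tables 2 and 3 depart from what chance alone would give? One way to see it, sketched here with the analysis itself left to the later chapters on contingency tables, is to compute the counts expected if treatment and improvement were independent and then sum the squared deviations (Pearson's chi-square statistic). The function names are hypothetical, and the statistic is shown only as a descriptive measure: with expected counts this small, an exact test is often preferred.

```python
def expected_counts(table):
    """Cell counts expected under independence, from the row and column totals."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    return [[r * k / n for k in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over the four cells."""
    expected = expected_counts(table)
    return sum((obs - exp) ** 2 / exp
               for obs_row, exp_row in zip(table, expected)
               for obs, exp in zip(obs_row, exp_row))

table2 = [[20, 0], [0, 20]]   # counts from Table 2
table3 = [[7, 13], [2, 18]]   # counts from Table 3
print(expected_counts(table3))
print(pearson_chi_square(table2), round(pearson_chi_square(table3), 2))
```

The perfectly separated Table 2 gives a statistic far larger than Table 3, which matches the intuition that its result is much harder to attribute to chance.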
Must a control group always be included in an experiment?
To be an experiment, yes. However, it is not true that the control treatment must only be applied to separate, individual subjects. For example, a common experimental design would be repeated measures of an outcome, like behavior (eg, feeding response) before and after an intervention is applied. Thus, the individual is its own control or reference point. For example, Dohm et al (2008) measured feeding and locomotor behavior of adult toads (Rhinella marina, formerly Bufo marinus) prior to and after ozone exposure. We concluded that a single 4-h exposure to O3 depressed toad feeding behavior, but had little effect on voluntary locomotor behavior. Typically, the design randomly assigns individuals to the treatments — some individuals get the exposure first, then the control exposure second; others, again selected at random, receive the control exposure first, then the active exposure.
The within-subjects design is an important method in biological research; these designs offer control for individual differences as confounding variables and require fewer participants compared to separate control group designs. This design is effective provided there is little risk of carryover effects; if the effects of a previous condition linger, then they may influence a participant’s response in subsequent conditions, confounding the results. For our study (Dohm et al 2008) we used a modified within-subjects design by employing a split plot design. We return to within-subjects designs in Chapter 14.4 – Randomized block design.
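Randomizing the order of exposures in a within-subjects design can be sketched as below. This illustrates the general idea only and is not the procedure used in Dohm et al (2008); the animal labels and function name are hypothetical, and a balanced split between the two orders is assumed.

```python
import random

def assign_exposure_order(subject_ids, seed=None):
    """Randomly give half the subjects treatment first and half control first."""
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for position, subject in enumerate(ids):
        if position < half:
            orders[subject] = ("treatment", "control")
        else:
            orders[subject] = ("control", "treatment")
    return orders

toads = [f"toad_{i}" for i in range(1, 13)]  # twelve hypothetical animals
orders = assign_exposure_order(toads, seed=8)
print(orders["toad_1"])
```

Balancing the two orders means that any carryover from the first exposure is spread evenly across the treatment and control conditions rather than piling up in one of them.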
Questions
- When I updated this page in 2020, we were in our fifth month of the coronavirus pandemic of 2019-2020. Daily, it seemed, we heard news about “promising coronavirus treatments,” but no study had at the time been published that would meet our considerations for a proper experiment. However, on May 1, 2020, the FDA issued an emergency use authorization for use of the antiviral drug remdesivir (Gilead), based on early results of clinical trials reported 29 April in The Lancet (Wang et al 2020). Briefly, their study included 158 patients treated with doses of the antiviral drug and 79 provided with a placebo control. Both groups were treated otherwise the same. Improvement over 28 days was recorded: 62 improved with placebo and 133 improved receiving doses of remdesivir.
  - Using the background described on this page, list the information needed to design a proper experiment. Using your list, review the work described in The Lancet article and check for evidence that the trial meets these requirements.
  - Create a 2×2 table with the described results from The Lancet article.
- Consider our hazel tea and copper solution experiment described in Chapter 3. The outcome variables (Tail length, tail percentage, Olive moment) are quantitative, not categorical. Create a table to display the experimental design.
- For the headache example, identify elements of the design that conform to the randomized controlled trial design described in Chapter 2.4.
Quiz Chapter 5.1
Experiments