20.5 – Time series

Introduction

Time series refers to any measure recorded over time. Stationary time series do not have trends or seasonality, just random (white) noise; differencing, or nonstationary time series do have trends and or seasonality. Stationary time series will not have predictable patterns over the long term, but for us, what this distinction says is that statistics like autocorrelation, the mean and standard deviation remains the same over time. Nonstationary times series can be made stationary by transformation, with taking a differencing approach, subtracting each observation from its preceding one to remove trends and seasonality, one common approach.

All kinds of examples of time series analysis can be made, with stock market price fluctuations (Hamilton 1994) and meteorological forecasting (Mudelsee 2014) as common examples. Time series analysis are also important in clinical situations, for example, analysis of temporal patterns in ECG and blood pressure data, which helps predict risks, monitor patient health, and understand disease trends. In ecology, perhaps the most famous time series data set is the Hudson’s Bay Company fur-trade records, which show a 9- to 11-year predator-prey population cycle between populations of the Canadian lynx (the predator) and the snowshoe hare (the prey) (Krebs et al 1995).

Significant autocorrelation — the correlation between any two observations in a data series separated by a specific time interval or lag — represents the extent to which a time series is correlated with its past values. For example, autocorrelation could measure the association between a person’s heart rate today and its heart rate yesterday, or between its heart rate today and the heart rate from last week. It is not simply the relationship between today’s heart rate and, for instance, today’s environmental temperature. This kind of comparison, between two sets of time series, calls for a cross-correlation or predictive correlation. A cross-correlation measures the correlation between the first series (temperature) and a lagged version of the second series (heart rate some minutes or hours later). Examples of such correlations include our familiar product moment correlation, but generalized across all possible time lags.

Note 1: See RHRV package,  the R Heart Rate Variability (HRV) analysis package, for framework to analyze heartbeat, ECG, and other cardiac recordings. See also Chapter 20.3 – Baseline correction for an example of a myogram analysis with baseline drift.

There’s much to the analysis of time series, but one key concept is the moving average. Along with any trend or pattern due to seasonality, time series data are expected to exhibit noise or random variation associated with each point in the series. The moving average is an attempt to smooth out data to reveal underlying trends and are fundamental to seasonal decomposition and forecasting. We calculate a series of averages of different subsets of the data, making it easier to spot long-term patterns by filtering out short-term noise, like fluctuations or outliers.

For example, to calculate the 5-beat moving average of a heartbeat, you would average the last 5 individual heartbeat values. As each new heartbeat is recorded, the oldest one is dropped, and a new average is calculated, creating a smoother trend line that shows a clearer overall picture rather than focusing on individual, potentially spiky, beat-to-beat variations. Rate-responsive or adaptive pacemakers use moving averages to determine the appropriate pacing rate. This helps provide a smoother and more stable heart rate response to the patient’s activity.

Note 2: For some really cool work on adaptive pacemakers, see Kumar et al 2018.

This statistical approach should sound familiar — we introduced without much fanfare use of LOESS (Locally Estimated Scatterplot Smoothing) to smooth data to improve pattern detection in Chapter 4.5 and Chapter 17.4. While a moving average takes average of a fixed number of recent data points, LOESS is a more powerful non-parametric regression technique that fits local polynomial regressions to subsets of the data to create a smooth curve.

In time series analysis, the autocorrelation function (ACF) measures the linear relationship between a time series and a lagged version of itself. It quantifies how correlated a series is with its past values (y{t}) and y_{t-k}), where k is the lag or time interval. An ACF plot, or correlogram, visually displays these correlation coefficients at different lags to help identify patterns, assess randomness, and determine appropriate time series models like ARIMA. ACF is related to moving averages because the ACF plot helps to identify the order of a moving average model, which depends on the ACF’s behavior at different lags.

Note 3: A time-series plot shows how a single variable changes over time, while an ACF (Autocorrelation Function) plot visualizes the correlation between a time series and a lagged version of itself.

Like many pages in Mike’s Biostatistics Book, this page only begins to introduce the subject. For a more thorough introduction, see Introduction to Time Series Analysis, NIST. See also the text by Shumway and Stoffer (2025).

ARIMA and ARMA models

ARIMA and ARMA models are fundamental tools for analyzing time series data in biostatistics, especially when the goal is to understand temporal patterns or forecast future values. An ARMA model combines two ideas: autoregression (AR), where the current value depends on previous values, and moving average (MA), where the current value depends on previous error terms. An ARIMA model extends this by adding differencing—the “I” for Integrated—which helps remove trends and make the series stationary. These models are widely used in public health, epidemiology, and biological monitoring because many variables (e.g., infection rates, physiological signals) show correlation over time.

Note 4: We introduced ARMA models in our Generalized Linear models introduction, Chapter 18.4.

Both ARMA and ARIMA models rely on the concepts of lagged dependence and error smoothing, but ARIMA is more flexible because it explicitly accommodates nonstationary data. An ARMA(p, q) model is appropriate when the time series is already stationary or has been preprocessed to remove trend and seasonality. p is the autoregressive order, the number of lagged observations used to forecast the current value, and q is the moving average order, the number of lagged forecast errors included in the model.

In contrast, ARIMA(p, d, q) includes a differencing parameter 𝑑 that automatically transforms the series to stationarity before fitting the ARMA structure. Thus, ARIMA models are typically used for data with a clear trend or evolving mean, while ARMA models are a subset applied to stable series without differencing. In practice, ARIMA is more commonly applied because real-world biological and clinical time series often exhibit trends.

R code

In R, ARMA and ARIMA models can be fit using the forecast or stats packages. The code below shows simple examples of each using a generic time series object y. An ARMA(2,1) model can be fit with:

arma_model <- arima(y, order = c(2, 0, 1))
summary(arma_model)

where the middle “0” indicates no differencing. For an ARIMA(1,1,1) model, which includes first differencing to remove trend, the code is:

arima_model <- arima(y, order = c(1, 1, 1))
summary(arima_model)

ccc

R packages

To conduct time series analysis we can use built in functions like ts() and decompose(). HoltWinters() also useful, now part of stats. Lots of specialized time series packages with advanced features, including forecast, timeSeries (Financial time series), season (Seasonal analysis of health data), and many others. For a more extensive package, see asta, which was designed to accompany the excellent book, now in its 5th edition, Time Series Analysis and its Applications: With R Examples, by Shumway and Stoffer (2025).

Note 4: Caution — newer versions of R have HoltWinters() and related functions included with base package stats.

Rcmdr package for time series was RcmdrPlugin.epack , removed from CRAN as of 2018.

For up-to-date listing of time series packages, see https://cran.r-project.org/web/views/TimeSeries.html

Time series data sets included in R and Rcmdr

R Code

data(co2, package="datasets")
co2 <- as.data.frame(co2)
#convert to time series data type with ts()
tCO2 <- ts(co2,frequency=12,start=c(1959),end=c(1997))
plot.ts(tCO2)

ppm CO2 from Mauna Loa, years 1958-1997, co2 data set in R, datasets package

Figure 1. Time series plot of CO2 data set, the Keeling curve, from package datasets, comes with Rcmdr installation.

Other datasets included with R

carData::Arrests

carData::Bfox

carData::CanPop

Example

Get up-to-date CO2 data from NOAA as text file. Download to your computer, load and clean in your favorite spreadsheet app. Months came as numbers 1,2,3, etc., I changed to text, Jan, Feb, Mar, etc. I grabbed three columns: year, month, ppm for import to R.

head(maunaLoa)

R output

> head(maunaLoa)
  year month  ppm
1 1958 Mar 315.70
2 1958 Apr 317.45
3 1958 May 317.51
4 1958 Jun 317.24
5 1958 Jul 315.86
6 1958 Aug 314.93

However, it turns out the time series functions are easiest to work if only the ppm data are included.

tCO2 <- ts(maunaLoa[,"ppm"],frequency=12,start=c(1958,3),end=c(2020,10))
head(tCO2)

R output

> head(tCO2)
        Mar    Apr    May    Jun    Jul    Aug
1958 315.70 317.45 317.51 317.24 315.86 314.93

Get our plot (Figure 2).

plot(tCO2)

CO2 in ppm from 1958 to November 2020. Data from NOAAA

Figure 2. CO2 ppm monthly average data from NOAA, last data October 2020.

Seasonal time series comes with a trend component, a seasonal component, and a random component.

R code

dectCO2 <- decompose(tCO2)
head(dectCO2)
plot(dectCO2)

decompose CO2

Figure 3. Observed (panel, top), trends over time (panel, second from top), seasonal changes (panel, second from bottom), and random error (panel, bottom).

Forecasting

Excellent resource at https://otexts.com/fpp2/

Exponential smoothing, weighted averages of past observations, weighted so that more recent observations are more influential.

Holt-Winters method extracts seasonal component (additive or multiplicative).

#set start value to value of first observation
tCO2cast <- HoltWinters(tCO2, l.start=315.42)
#Predict for next ten years. Because frequency in ts() was monthly, ten years is h=120
forecastCO2 <- forecast(tCO2cast, h=120)
plot(forecastCO2, fcol="red")

forecast plot

Figure 4. Data in black, predicted values in red (additive) shaded by confidence interval.

Questions

  1. Write up three learning outcomes for this page. Hint: Point your favorite generative AI to this page and ask for help.
  2. For the co2 dataset included in Rcmdr (co2, datasets), obtain forecast for year 2020 and compare against actual 2020 data (see Figure 2).
  3. Positive clinical samples between September 2015 and November 2020 for flu virus in the USA are provided in the data set below (scroll or click here). The frequency of observations was weekly. Apply decompose() and obtain the seasonal and trend components of the data set. Which month does the peak positive sample occur?
  4. Total pounds of fish (variable = Pounds) and pounds of Akule and Opelu (variable = Akule.Opelu) caught by commercial industry in Hawaiʻi, from 2000 to 2018 are provided in the data set below (scroll or click here). Apply decompose() and obtain the seasonal and trend components of the data set for Total pounds and again for Akule (Selar crumenophthalmus) and Opelu (Decapterus macarellus). Is there evidence for  trends, and if so, describe the trend. Is there evidence of seasonality? If so, which month did peak fishing occur?

Quiz Chapter 20.5

Time series

References and suggested reading

Hamilton, J. D. (1994). Time Series Analysis. Princeton University Press.

Krebs, C. J., Boutin, S., Boonstra, R., Sinclair, A. R. E., Smith, J. N. M., Dale, M. R. T., Martin, K., & Turkington, R. (1995). Impact of food and predation on the snowshoe hare cycle. Science, 269(25 August), 1112–1115.

Kumar, A., Komaragiri, R., & Kumar, M. (2018). From Pacemaker to Wearable: Techniques for ECG Detection Systems. Journal of Medical Systems, 42(2), 34.

Chapter 6.4. Introduction to Time Series Analysis, in NIST/SEMATECH e-Handbook of Statistical Methods, https://www.itl.nist.gov/div898/handbook/index.htm, (first reviewed by Mike in 2019).

Mudelsee, M. (2016). Climate time series Classical statistics and bootstrap methods (2nd ed., Vol. 51). Springer.

Shumway, R. H., & Stoffer, D. S. (2025). Time Series Analysis and Its Applications: With R Examples (5th ed.). Springer.

Data set this page

Flu, extracted 28 Nov 2020 from https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html

YearDateWeekPositive
201509/28/15401.056
201510/05/15411.297
201510/12/15421.109
201510/19/15431.108
201510/26/15441.123
201511/02/15451.382
201511/09/15461.193
201511/16/15471.385
201511/23/15481.395
201511/30/15491.475
201512/07/15502.512
201512/14/15512.287
201512/21/15522.46
201601/04/1612.931
201601/11/1624.254
201601/18/1635.485
201601/25/1646.96
201602/01/1659.699
201602/08/16612.549
201602/15/16715.536
201602/22/16818.362
201602/29/16921.11
201603/07/161023.645
201603/14/161119.972
201603/21/161218.471
201603/28/161316.227
201604/04/161414.016
201604/11/161513.236
201604/18/161612.346
201604/25/161710.262
201605/02/16188.121
201605/09/16196.686
201605/16/16205.811
201605/23/16214.719
201605/30/16223.06
201606/06/16233.02
201606/13/16241.829
201606/20/16251.712
201606/27/16261.223
201607/04/16270.903
201607/11/16280.869
201607/18/16290.849
201607/25/16300.782
201608/01/16310.934
201608/08/16320.901
201608/15/16330.803
201608/22/16341.405
201608/29/16351.678
201609/05/16361.461
201609/12/16371.513
201609/19/16381.741
201609/26/16391.784
201610/03/16401.57
201610/10/16411.359
201610/17/16421.403
201610/24/16431.509
201610/31/16441.916
201611/07/16452.201
201611/14/16462.576
201611/21/16473.348
201611/28/16483.319
201612/05/16494.26
201612/12/16506.683
201612/19/165110.782
201612/26/165213.999
201701/02/17113.344
201701/09/17215.373
201701/16/17318.287
201701/23/17418.53
201701/30/17521.422
201702/06/17624.153
201702/13/17724.512
201702/20/17824.725
201702/27/17919.772
201703/06/171019.271
201703/13/171119.034
201703/20/171219.711
201703/27/171318.482
201704/03/171415.425
201704/10/171512.74
201704/17/17169.696
201704/24/17176.768
201705/01/17185.918
201705/08/17195.333
201705/15/17204.863
201705/22/17214.352
201705/29/17224.165
201706/05/17233.386
201706/12/17243.062
201706/19/17252.649
201706/26/17262.534
201707/03/17272.178
201707/10/17282.164
201707/17/17291.839
201707/24/17301.806
201707/31/17311.948
201708/07/17321.9
201708/14/17331.343
201708/21/17341.434
201708/28/17351.935
201709/04/17361.888
201709/11/17371.896
201709/18/17381.669
201709/25/17391.703
201710/02/17402.202
201710/09/17412.09
201710/16/17422.176
201710/23/17432.583
201710/30/17443.607
201711/06/17454.245
201711/13/17465.3
201711/20/17477.088
201711/27/17487.305
201712/04/174910.745
201712/11/175015.355
201712/18/175122.777
201712/25/175225.386
201801/01/18125.365
201801/08/18226.942
201801/15/18327.034
201801/22/18427.37
201801/29/18527.064
201802/05/18626.998
201802/12/18726.117
201802/19/18822.616
201802/26/18918.487
201803/05/181015.694
201803/12/181115.581
201803/19/181215.328
201803/26/181315.114
201804/02/181412.689
201804/09/181511.249
201804/16/18169.398
201804/23/18177.999
201804/30/18186.259
201805/07/18194.393
201805/14/18203.166
201805/21/18212.39
201805/28/18221.529
201806/04/18231.577
201806/11/18241.299
201806/18/18251.023
201806/25/18261.114
201807/02/18271.003
201807/09/18280.916
201807/16/18291.053
201807/23/18300.995
201807/30/18310.954
201808/06/18320.957
201808/13/18330.764
201808/20/18341.336
201808/27/18351.504
201809/03/18361.747
201809/10/18371.687
201809/17/18381.699
201809/24/18391.497
201810/01/18401.749
201810/08/18411.697
201810/15/18421.993
201810/22/18432.055
201810/29/18442.174
201811/05/18452.733
201811/12/18463.157
201811/19/18473.928
201811/26/18483.915
201812/03/18496.232
201812/10/185010.364
201812/17/185114.265
201812/24/185216.352
201912/31/18112.139
201901/07/19212.722
201901/14/19316.317
201901/21/19419.392
201901/28/19522.549
201902/04/19625.134
201902/11/19726.026
201902/18/19826.241
201902/25/19926.074
201903/04/191025.607
201903/11/191126.132
201903/18/191222.481
201903/25/191319.304
201904/01/191414.942
201904/08/191511.909
201904/15/19168.611
201904/22/19175.844
201904/29/19184.82
201905/06/19193.84
201905/13/19203.542
201905/20/19213.42
201905/27/19223.083
201906/03/19232.79
201906/10/19242.316
201906/17/19251.902
201906/24/19262.081
201907/01/19272.429
201907/08/19282.017
201907/15/19292.218
201907/22/19302.377
201907/29/19312.398
201908/05/19322.054
201908/12/19332.082
201908/19/19342.362
201908/26/19353.455
201909/02/19363.097
201909/09/19372.484
201909/16/19382.757
201909/23/19392.744
201909/30/19401.31
201910/07/19411.479
201910/14/19421.552
201910/21/19432.253
201910/28/19443.057
201911/04/19455.163
201911/11/19466.756
201911/18/19479.546
201911/25/194810.939
201912/02/194911.655
201912/09/195016.154
201912/16/195122.533
201912/23/195226.934
202012/30/19123.488
202001/06/20223.119
202001/13/20326.083
202001/20/20428.281
202001/27/20530.147
202002/03/20630.26
202002/10/20729.675
202002/17/20828.322
202002/24/20925.752
202003/02/201022.491
202003/09/201115.813
202003/16/20127.502
202003/23/20132.322
202003/30/20141.031
202004/06/20150.618
202004/13/20160.623
202004/20/20170.218
202004/27/20180.263
202005/04/20190.326
202005/11/20200.306
202005/18/20210.213
202005/25/20220.165
202006/01/20230.34
202006/08/20240.28
202006/15/20250.381
202006/22/20260.282
202006/29/20270.21
202007/06/20280.176
202007/13/20290.376
202007/20/20300.15
202007/27/20310.133
202008/03/20320.176
202008/10/20330.132
202008/17/20340.227
202008/24/20350.315
202008/31/20360.202
202009/07/20370.186
202009/14/20380.4
202009/21/20390.225
202009/28/20400.33
202010/05/20410.401
202010/12/20420.35
202010/19/20430.251
202010/26/20440.201
202011/02/20450.177
202011/09/20460.222

Data set in this page

Fish, Hawaiʻi state DLNR, Pounds refers to total catch, Akule.Opelu refers to pounds for the two kinds of fish.

YearMonthPoundsAkule.Opelu
1999Jan206402385331
1999Feb228678589537
1999Mar2083789112897
1999Apr2446840136301
1999May2300842103692
1999Jun2340116134432
1999Jul2646429138814
1999Aug225440896569
1999Sep192638156598
1999Oct223378976834
1999Nov1730672134706
1999Dec176237592255
2000Jan1501164147104
2000Feb1993373104165
2000Mar2220831132028
2000Apr2398180119224
2000May2557229121268
2000Jun2510298145200
2000Jul227095493883
2000Aug191265469107
2000Sep136526465007
2000Oct161511751208
2000Nov1388453117493
2000Dec1802926121486
2001Jan1481810170702
2001Feb149635644575
2001Mar1579528101764
2001Apr118459189388
2001May2091424124193
2001Jun196688661122
2001Jul211393173266
2001Aug192666129386
2001Sep135342930268
2001Oct133828929577
2001Nov174719880350
2001Dec145833622817
2002Jan1517609107406
2002Feb172908431030
2002Mar174798567691
2002Apr2109451101043
2002May206992157251
2002Jun1640151100501
2002Jul197938287584
2002Aug183167865566
2002Sep173420153162
2002Oct177920793867
2002Nov2191825106167
2002Dec257619167881
2003Jan191050049420
2003Feb207516855006
2003Mar224575371616
2003Apr1562751102993
2003May2440228106600
2003Jun1842907101715
2003Jul195727948453
2003Aug214382369130
2003Sep150321274525
2003Oct161177970949
2003Nov166816754004
2003Dec231253743054
2004Jan160559575751
2004Feb170553394864
2004Mar2079402120305
2004Apr188370490950
2004May1830168111599
2004Jun191862276392
2004Jul202978798937
2004Aug192800972577
2004Sep162022482650
2004Oct185464374587
2004Nov198156759753
2004Dec202227244353
2005Jan208882160972
2005Feb210694859469
2005Mar238632784551
2005Apr2122171101099
2005May236995379042
2005Jun2342117104814
2005Jul228187171065
2005Aug212430353383
2005Sep173498637195
2005Oct192013148632
2005Nov196950688235
2005Dec232393398768
2006Jan170276650553
2006Feb206020489037
2006Mar224457033916
2006Apr206892274430
2006May2164076108689
2006Jun193595189503
2006Jul196851393758
2006Aug1741802111080
2006Sep150889744537
2006Oct189253546747
2006Nov220817382938
2006Dec138141242260
2007Jan2211384114496
2007Feb239143760618
2007Mar272402194251
2007Apr263924590078
2007May3168913129258
2007Jun2706972116628
2007Jul2523392129345
2007Aug227250288997
2007Sep212183771560
2007Oct247299652915
2007Nov3040118107555
2007Dec293417439239
2008Jan265653944672
2008Feb310181935213
2008Mar281684674421
2008Apr306483763355
2008May356099352287
2008Jun292021933685
2008Jul251656131288
2008Aug233820562171
2008Sep231445831311
2008Oct240724042766
2008Nov206066675102
2008Dec232926874508
2009Jan219856944459
2009Feb231476433206
2009Mar184645964879
2009Apr265923036638
2009May269244077011
2009Jun238717549217
2009Jul267289555033
2009Aug217402740398
2009Sep225915351386
2009Oct238674958095
2009Nov208170651798
2009Dec270287155148
2010Jan205996440855
2010Feb2632985100598
2010Mar243056239887
2010Apr265201340528
2010May246022871483
2010Jun2743053120553
2010Jul227884796315
2010Aug261842762854
2010Sep248386166613
2010Oct250332153353
2010Nov2370032104360
2010Dec243104757919
2011Jan252724137755
2011Feb278645351863
2011Mar378907640188
2011Apr314882660494
2011May301518749037
2011Jun271858358380
2011Jul228452143096
2011Aug247551933612
2011Sep246164048697
2011Oct242055449929
2011Nov205976963045
2011Dec288277664430
2012Jan282511642894
2012Feb265389223528
2012Mar254475839839
2012Apr305010947250
2012May326466641357
2012Jun279820456808
2012Jul333117446853
2012Aug286408862682
2012Sep221953633641
2012Oct248216247478
2012Nov254514249232
2012Dec312950735924
2013Jan290274832373
2013Feb238819721922
2013Mar283127941718
2013Apr246744454619
2013May313115357183
2013Jun281998333484
2013Jul347318044240
2013Aug258686352288
2013Sep245925838145
2013Oct322831748533
2013Nov299873253187
2013Dec302391833381
2014Jan250373331233
2014Feb261518433134
2014Mar280863938876
2014Apr285751445819
2014May336374658283
2014Jun277868954266
2014Jul282884741221
2014Aug307406139744
2014Sep270344040668
2014Oct274481337263
2014Nov254114372020
2014Dec332579944128
2015Jan313082254942
2015Feb280602045098
2015Mar356086653378
2015Apr334169543642
2015May371748770583
2015Jun367828356578
2015Jul395446053615
2015Aug301610042015
2015Sep220972438904
2015Oct279540955583
2015Nov342675370399
2015Dec335745451095
2016Jan308723154089
2016Feb337448548683
2016Mar326005445472
2016Apr293010663926
2016May338333176757
2016Jun320961345557
2016Jul276514337198
2016Aug273286740213
2016Sep218034741660
2016Oct229834834699
2016Nov254557471924
2016Dec369148537448
2017Jan338329748974
2017Feb285658435716
2017Mar341303939789
2017Apr336115630625
2017May357641031092
2017Jun334846927734
2017Jul274118727041
2017Aug267562532476
2017Sep270067533394
2017Oct277915931373
2017Nov281701240681
2017Dec372621633955
2018Jan336159146166
2018Feb262526329890
2018Mar321910231454
2018Apr359328725954
2018May379828535908
2018Jun336282931899
2018Jul273532630968
2018Aug239754919849
2018Sep232373529324
2018Oct247245128927
2018Nov268746640497
2018Dec323629336603

/MD


Chapter 20 contents

/MD

20.3 – Baseline correction

draft

Introduction

Signal processing is the analysis and manipulation of signals to extract meaningful information and improve data quality. signal processing is a crucial pre-processing step before analysis, as it involves cleaning and preparing raw data to improve its quality and highlight important features for accurate analysis. Common pre-processing tasks include filtering noise, filling in missing data, and feature extraction, all of which are done before the main analytical steps like feature extraction and classification can be performed.

A baseline refers to an initial measurement before an intervention. A starting point. Provides an objective comparison to judge whether or not an intervention has led to change.

Baseline drift is a gradual, slow shift in the signal’s zero-point over time, while intensity drift is a more general term for a change in a peak’s amplitude, often due to baseline changes. Noise is rapid, random fluctuation that obscures the signal itself. The key differences lie in their speed and effect: drift is a slow, low-frequency, long-term issue that shifts the entire baseline, whereas noise is a fast, high-frequency, short-term issue that adds random variation to the signal

For measures conducted over time, a baseline correction may be applied during initial data processing to correct for signal distortion, background noise, or baseline drift — the gradual change over time in what is expected to be measurement of an unchanging signal.

Note 1: Signal distortion is an unwanted change to the original signal’s waveform, while background noise is an unrelated, external signal that is added to the original signal.

Many applications, numerous examples.

Quantitative PCR, qPCR: baseline correction is the process of identifying and subtracting the background fluorescence noise from the early cycles of a real-time PCR run to accurately measure the signal from specific DNA amplification (see Ruijter et al 2009).

Chromatography: baseline correction is a technique to remove background noise and drift from a chromatogram to make peaks more visible and accurate for analysis (see Niezen et al 2022).

Standard and basal metabolic rate by indirect calorimetry: baseline drift is an error where the instrument’s signal gradually changes over time, independent of the subject’s actual oxygen consumption (\dot{V}O_{2}), while baseline correction is a data processing technique used to computationally adjust the recorded data to counteract this drift and improve accuracy (see Hayes et al 1992; Lighton 2017).

Colorimetric spectrometry: baseline drift is an unwanted phenomenon where the signal’s baseline gradually shifts over time or wavelength due to instrumental or environmental factors, while baseline correction is a data processing technique used to computationally remove this drift and other background noise from the measured spectrum.

Statistical considerations

When processing a biological signal such as an electromyogram (EMG), it’s important to remember that even the baseline—the part of the recording where we assume “nothing is happening” — is only an estimate of the true baseline noise level. To account for this, choose two kinds of time windows: a baseline window (before the event of interest) and a signal window (during the activity you want to measure).

Note 1: In signal processing, a window function’s purpose is to isolate a portion of a signal for analysis and reduce spectral leakage by smoothing the signal’s boundaries.

The goal is to compare these windows in a way that fairly adjusts for natural fluctuations in the baseline. One common approach is a regression-weighted correction, which simply means using a statistical line that represents how the baseline trends upward or downward over time, then adjusting the signal based on that line rather than assuming the baseline is perfectly flat. Another approach is to use a spline, which is a smooth, flexible curve that adapts to gradual changes in the baseline. Splines can correct for slow drifts in the recording without over-correcting the actual signal. Together, these methods help ensure that any “activity” you detect is more likely to be real muscle activation and not just shifts in the baseline.

R code

Package(s):

baseline

Signal

Examples

To illustrate, consider a myogram signal trace (EMG) recorded over several minutes. The trace will show fluctuations in electrical activity over time, which reflects muscle rest, contraction, and relaxation. The trace will exhibit a baseline at rest, spikes or bursts of activity during contractions, and varying levels of intensity depending on the muscle’s effort. An increase in the frequency and amplitude of these spikes indicates a stronger, more forceful contraction, while a period of no activity will show as a flat line. In a real-world scenario with an active muscle or a prolonged recording, the trace is typically corrupted by both noise (high-frequency, random variations) and baseline drift (slow, low-frequency shifts from the zero point). Among several statistics, analyst may calculate from the signal (1) the Root Mean Square (RMS) Amplitude, a measure of the signal’s magnitude and represents the overall intensity of muscle activity, (2) Mean Power Frequency (MPF), or the the frequency domain analysis of myogram signals. The MPF is calculated to assess muscle fatigue (fatigue often causes a shift to lower frequencies), and others (eg, Potvin and Bent 1997, Smilios et al 2010). Principle to the analysis, moving average or LOESS approaches may be used to smooth the data, reduce noise, and identify underlying trends or patterns in muscle activity over the multiple-minute duration.

To demonstrate pre-processing steps to correct for baseline drift, we need a data set. Here, we simulate a couple of myogram-like traces.

R code to simulate myogram data with baseline drift and random walk noise. A random walk tends to wander and works well for simulating biological drift.

# Example
# Define simulation parameters
duration_sec <- 5 # Total duration in seconds
sampling_rate_hz <- 1000 # Sampling rate (1000 Hz = 1 ms interval)
peak_center_time <- 2.5 # Time of the muscle contraction peak
amplitude <- 5 # Peak amplitude of the myogram
drift_magnitude <- 0.01 # Factor to control the magnitude of baseline drift
noise_level_sd <- 0.1 # Standard deviation of the white noise

Generate the data

my_data <- simulate_myogram_with_drift(
duration = duration_sec,
hz = sampling_rate_hz,
peak_time = peak_center_time,
peak_amplitude = amplitude,
drift_factor = drift_magnitude,
noise_sd = noise_level_sd
)
head(my_data)

Plot the simulated data

plot(my_dataTime, my_dataSignal, type = 'l', col = 'blue',
main = "Simulated Myogram Data with Baseline Drift",
xlab = "Time", ylab = "Signal Value")
lines(my_dataTime, my_dataDrift, col = 'red', lty = 2) # Overlay the drift line
legend("topleft", legend = c("Simulated Myogram", "Baseline Drift"),
col = c("blue", "red"), lty = c(1, 2), cex = 0.8)
# ylim = range(myogram_data, baseline_drift)

which gives us graph like Figure 1.

Simulated Myogram data with Baseline Drift.

Figure 1. Simulated myogram data with baseline drift.

Alternatively, use ggplot (Fig 2).

library(ggplot2)
ggplot(my_data, aes(x = Time, y = Signal)) +
geom_line(color = “blue”) +
geom_line(aes(y = Drift), color = “red”, linetype = “dashed”, alpha = 0.6) +
geom_line(aes(y = TrueSignal), color = “green”, linetype = “dotted”, alpha = 0.8) +
labs(title = “Simulated Myogram Data with Baseline Drift”,
x = “Time (seconds)”,
y = “Signal Amplitude”) +
theme_minimal() +
scale_color_manual(values = c(“blue”, “red”, “green”),
labels = c(“Total Signal”, “Baseline Drift”, “True Myogram”)) +
theme(legend.position = “bottom”)

# You can access the data for further analysis using the ‘my_data’ data frame
head(my_data)

}

version 2
# R code to simulate myogram data with baseline drift

# 1. Define simulation parameters
set.seed(123) # for reproducibility
n_samples <- 500 # number of data points (time steps)
time <- 1:n_samples
baseline_start <- 5
drift_rate <- 0.01 # slope of the linear drift
signal_amplitude <- 2
noise_sd <- 0.5 # standard deviation of the noise

# 2. Simulate the baseline drift
# A simple linear drift is used here. You could also use a random walk (RW).
baseline_drift <- baseline_start + drift_rate * time

# 3. Simulate the biological signal (myogram activity)
# Using a sinusoidal function as an example of a rhythmic signal
biological_signal <- signal_amplitude * sin(time * 0.1)

# 4. Simulate random noise (white noise)
noise <- rnorm(n_samples, mean = 0, sd = noise_sd)

# 5. Combine all components to get the final simulated myogram data
myogram_data <- baseline_drift + biological_signal + noise

# 6. Create a data frame for plotting and analysis
sim_data <- data.frame(Time = time, Signal = myogram_data)

# 7. Visualize the data using base R graphics or ggplot2
plot(sim_dataTime, sim_dataSignal, type = ‘l’, col = ‘blue’,
main = “Simulated Myogram Data with Baseline Drift”,
xlab = “Time”, ylab = “Signal Value”, ylim = range(myogram_data, baseline_drift))
lines(sim_data$Time, baseline_drift, col = ‘red’, lty = 2) # Overlay the drift line
legend(“topleft”, legend = c(“Simulated Myogram”, “Baseline Drift”),
col = c(“blue”, “red”), lty = c(1, 2), cex = 0.8)

Simulated myogram data with random walk noise and baseline drift.

Figure 2. Simulated myogram data with random walk noise and baseline drift.

# You can also use the ggplot2 package for more sophisticated plotting
# install.packages(“ggplot2”)
# library(ggplot2)
# ggplot(sim_data, aes(x = Time, y = Signal)) +
# geom_line(color = “blue”) +
# geom_line(aes(y = baseline_drift), color = “red”, linetype = “dashed”) +
# labs(title = “Simulated Myogram Data with Baseline Drift”,
# y = “Signal Value”) +
# theme_minimal()

 

Questions

  1. Write up three learning outcomes for this page. Hint: Point your favorite generative AI to this page and ask for help

References

Hayes, J. P., Speakman, J. R., & Racey, P. A. (1992). Sampling bias in respirometry. Physiological Zoology, 65, 604–619.

Lighton, J. R. B. (2017). Limitations and requirements for measuring metabolic rates: A mini review. European Journal of Clinical Nutrition, 71(3), 301–305.

Liland, K. H. (2015). 4S Peak Filling–baseline estimation by iterative mean suppression. MethodsX, 2, 135-140.

Liland, K. H., & Mevik, T. A. B. H. (2011). Optimal baseline correction for multivariate calibration using open-source software. Life Science Instruments, (3), 7.

Niezen, L. E., Schoenmakers, P. J., & Pirok, B. W. J. (2022). Critical comparison of background correction algorithms used in chromatography. Analytica Chimica Acta, 1201, 339605.

Ruijter, J. M., Ramakers, C., Hoogaars, W. M. H., Karlen, Y., Bakker, O., van den Hoff, M. J. B., & Moorman, A. F. M. (2009). Amplification efficiency: Linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Research, 37(6), e45.

Potvin, J. R., & Bent, L. R. (1997). A validation of techniques using surface EMG signals from dynamic contractions to quantify muscle fatigue during repetitive tasks. Journal of Electromyography and Kinesiology, 7(2), 131–139.

Smilios, I., Hakkinen, K., & Tokmakidis, S. P. (2010). Power Output and Electromyographic Activity During and After a Moderate Load Muscular Endurance Session. The Journal of Strength and Conditioning Research, 24(8), 2122–2131.


Chapter 20 contents

/MD


				

20.2 – Peak detection

draft

Introduction

algorithm

extract characteristics

shape

signal

noise

intensity

filtering

window length

R code

packages

peakDetection

findpeaks

Example

ccc

Questions

  1. Write up three learning outcomes for this page. Hint: Point your favorite generative AI to this page and ask for help

References and suggested readings

Shin, H. S., Lee, C., & Lee, M. (2009). Adaptive threshold method for the peak detection of photoplethysmographic waveform. Computers in biology and medicine39(12), 1145-1152.


Chapter 20 contents

/MD

20.1 – Area under the curve

draft

Introduction

Area under the curve, AUC, represents the total change in y given change in x. For example, if x is time, and y is oxygen consumption, an AUC would be appropriate to quantify the total oxygen consumption following strenuous exercise (Excess post-exercise oxygen consumption, EPOC) or following a large meal (Specific Dynamic Action, SDA).

In biostatistics, area under the relative (receiver) operating carrier, AUROC, shows characteristics of a diagnostic model, a graphic used to show trade off between sensitivity and specificity. Classifier performance. Used to find the appropriate cut-off. Plot true positive rates against false positive rates as cumulative functions, shows the relationship between sensitivity and specificity for every possible cut off value. Can then calculate AUC to get a measure of the intervention’s ability to discriminate between true and false positive rates.

edit

Related, area under precision-recall curve, AUPRC,

estimate area (1) trapezoid method, (2) average precision score

 

Area under the curve

Download and install R package MESS; requires geepack, geeM, and Matrix packages

R code

x <- seq(1:10) 
y <- c(1,4,5,2,11,22,9,7,5,1) 
#length(x)==length(y)
#smooth the data 
loxy <- loess(y~x)
#Make a plot (Fig. 1)
plot(x,y, pch=19, cex=2, col="blue") 
lines(predict(loxy), type="l", col="red")

where == is an R comparison operator.

And R output

area under the curve, AOC, example

Figure 1. Area under the curve example.

library(MESS) 
auc(x,y,from=0,rule=2) 
auc(x,loxy$fitted,from=0,rule=2)

And R output

#area under curve for raw data
[1] 67
#area under curve for smoothed data
[1] 66.77616

Area under the receiver operating carrier curve

Download and install ROCR

R code

#modified from https://rviews.rstudio.com/2019/03/01/some-r-packages-for-roc-curves/

library(ROCR)
data(ROCR.simple) 
df <- data.frame(ROCR.simple) 
pred <- prediction(df$predictions, df$labels) 
perf <- performance(pred,"tpr","fpr") 
plot(perf,colorize=TRUE)

R output

ROC plot

Figure 2. Example ROC curve

The right-hand axes is color codes by AUC values: good tests AUC between 0.8 and 0.9, very good tests greater than 0.9.

Area under the precision recall curve

— under construction

Questions

  1. Write up three learning outcomes for this page. Hint: Point your favorite generative AI to this page and ask for help

 


Chapter 20 contents

/MD

14.8 – More on the linear model in Rcmdr

Introduction

During the last lectures, we could have used the Two-Way ANOVA command in R or Rcmdr

R code: How to analyze multifactorial ANOVA problems

Rcmdr: Statistics → Means → Multiway ANOVA

To analyze our two factor data sets. As long as the design meets the following conditions, by all means use this command because it is simple and precisely correct.

  • Both factors are fixed, not random.
  • Each level of first factor is crossed with each level of the second factor.
  • No missing data (the design is fully replicated and balanced).

If any of the three points do not fit your two-way design, then you’ll need a different, more general and powerful ANOVA procedure in R and Rcmdr to analyze these types of designs. You’ll need the lm() function (Fig. 1).

Rcmdr: Statistics → Fit models → Linear model…

Menus to seelct linear model function, lm()

Figure 1. Linear model menu in Rcmdr, version 2.7.0

Some R basics with the lm() function, the general linear model.

    \begin{align*} Y\sim model \end{align*}

Response Y is specified by linear predictor(s), either factors or covariates (ratio-scale predictor variables). We communicate to R what the model is by using operators. The four more commonly used operators are

+ the basic way to include the model terms, i.e., the mode predictors

: which is interpreted as the interaction of all the variables and the factors in the term

* is interpreted as factor crossing

Not shown, but \%in\% indicates the term on the left is nested within the term on the right.

A few examples: we’ll have three factors, AB, and C. For our one-way ANOVA, the model specification would be

    \begin{align*} Y\sim A \end{align*}

For our crossed, balanced two-way ANOVA, the model specification would be

 

    \begin{align*} Y\sim A + B + A:B \end{align*}

or equivalently

    \begin{align*} Y\sim A*B \end{align*}

 

And for our block ANOVA problem?

Click here to get entire List of model commands in R.

How would we analyze our snake experiment?

Table 1. The snake data set

Snake Source flick
1 dH2O 3
2 dH2O 0
3 dH2O 0
4 dH2O 5
5 dH2O 1
6 dH2O 2
1 fish 6
2 fish 22
3 fish 12
4 fish 24
5 fish 16
6 fish 16
1 worm 6
2 worm 22
3 worm 12
4 worm 24
5 worm 16
6 worm 16

We have two factors, but one factor is a Block (repeated measures on individuals). We need to tell R and Rcmdr what our model is. We’ll return to talk about models next time, a very important topic!! For now, think of a model as adding the independent variables together to predict the response variable.

In our Snake example, it’s a two-way ANOVA, but one factor is individual snake, the other is a treatment, and we have repeat measures, so there cannot be an interaction.

We tell R and Rcmdr which columns contain the Response, and under Model, we enter the columns with the two factors.

repeat measure linear model

Figure 2. Menu of linear model with repeat measures model, Rcmdr, version 2.7.0.

You must also tell R and Rcmdr which factors in the model (if any) are random to get the correct F statistics. Almost without exception, blocking factors are always treated as Random

The output looks like this (see below). More complicated, true (which means more information!), but things marked in red we’ve seen before.

LinearModel.2 <- lm(flick ~ Snake +Source, data=L16SnakeTaste)

summary(LinearModel.2)

Call:
lm(formula = flick ~ Snake + Source, data = L16SnakeTaste)

Residuals:
Min 1Q Median 3Q Max 
-5.2222 -0.7222 0.0278 1.5694 7.4444

Coefficients:
             Estimate Std. Error t value Pr(>|t|) 
(Intercept)    -4.444      2.523  -1.762 0.10862 
Snake[T.2]      9.667      3.090   3.128 0.01072 * 
Snake[T.3]      3.000      3.090   0.971 0.35451 
Snake[T.4]     12.667      3.090   4.099 0.00215 ** 
Snake[T.5]      6.000      3.090   1.942 0.08085 . 
Snake[T.6]      6.333      3.090   2.050 0.06755 . 
Source[T.fish] 14.167      2.185   6.484 0.0000704 ***
Source[T.worm] 14.167      2.185   6.484 0.0000704 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.784 on 10 degrees of freedom
Multiple R-squared: 0.8858, Adjusted R-squared: 0.8058 
F-statistic: 11.08 on 7 and 10 DF, p-value: 0.0005337

Contrast coding

When a factor (categorical variable) is used in a linear model (lm() or aov()), R must convert the factor into one or more numeric predictor variables. Contrast coding tells us how to represent the factor in the model, how to interpret the coefficients associated with the factor, how the hypothesis tests of the coefficients are handled, and whether or not Type III sums of squares tests are valid. R can handle contrast coding in a couple of ways, either as a dummy contrast (each group if the factor is contrasted to a baseline) or effects coding, where group means are compared to the grand mean. The command summary(lm()) reports contrast coding that was used in the linear model. Every kind of contrast leads to the same fitted values and same overall F-test, but coefficient estimates and t-tests change.

Note 1: Type I sums of squares tests are affected by the order of the predictors in the model, not by type of contrast coding used. R Commander defaults to Type II tests, which are generally unaffected by coding contrasts used. In contrast, Type III tests are affected by the type of coding contrasts used — Type III tests require sum-to-zero (effects) contrasts.

For the ANOVA table, we call up the command via

Rcmdr: Models → Hypothesis tests → ANOVA table… (Fig. 3).

Rcmdr menu: Model --> Hypothesis testing --> ANOVA table

Figure 3. Rcmdr: Models → Hypothesis tests → ANOVA table… Rcmdr, version 2.7.0

Confirm that the model object is active (in this case, the object was LinearModel.2), accept the defaults about types of tests and marginality, and submit OK. The output is

Anova(LinearModel.2, type="II")
Anova Table (Type II tests)

Response: flick
          Sum Sq Df F value Pr(>F) 
Snake     307.61  5  4.2956 0.02396 * 
Source    802.78  2 28.0256 0.00007954 ***
Residuals 143.22 10 
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

End R output

You do not need to know all of the output; all of that output is there for a reason, of course, but for now, here’s what R and Rcmdr has to say (from the help menu):

The sequential sums of squares is the added sums of squares given that prior terms are in the model. These values depend upon the model order. The adjusted sums of squares are the sums of squares given that all other terms are in the model. These values do not depend upon the model order.

You should try our snake example again, but this time, remove the tongue flick response to the dH2O for the first snake — a missing value. (just type into the cell NA).

How would you use GLM to analyze a simple two-way ANOVA, a fully crossed, fully replicated (“balanced”) design? We could use the Two-way ANOVA command in R and Rcmdr, or we could use lm(). Try it with the data set from random vs nonrandom lecture.

Another example

Below, you see how the model is entered. To let R know that I wish R and Rcmdr to test the interaction, I need to add a Model term for that source of variation. I accomplish this by typing “Diet*Drug” (without the quotes).

Figure 4. Crossed, balanced design. Linear model menu, Rcmdr, version 1.9.2

After clicking OK, the following output from the lm function is returned. How does this compare to output from the two-way ANOVA command in R and Rcmdr?

You should try both and compare!

Rcmdr: Models → Hypothesis tests → ANOVA table

Anova(LinearModel.17, type="II")
Anova Table (Type II tests)
Response: chol_randomized
             Sum Sq  Df F value   Pr(>F) 
diet         141.08   2  1.3061 0.28745 
drug         351.88   2  3.2577 0.05403 .
diet:drug    235.50   4  1.0901 0.38124 
Residuals   1458.19  27

End R output

Nested ANOVA

The nested ANOVA may be analyzed in multiple ways in R and Rcmdr, but I prefer the lm() function because it is the most general. For Nested ANOVA, we can also use lm(). Here’s where it gets a little tricky. Put in the Response variable (Chol), then click in the box for model: Select both factors, then type in / after the factor that’s nesting factor. For our nested model example (14.5 – Nested designs), Manufacturer Source was nested within Drug.

Table 3. Nested design example data set from Chapter 14 – Nested design

Drug Source Chol
1 1 202.6
1 1 207.8
1 1 190.2
1 1 211.7
1 1 201.5
1 2 189.3
1 2 198.5
1 2 208.4
1 2 205.3
1 2 210
2 3 212.3
2 3 204.4
2 3 221.6
2 3 209.2
2 3 222.1
2 4 203.6
2 4 209.8
2 4 204.1
2 4 201.8
2 4 202.6
3 5 189.1
3 5 219.9
3 5 196
3 5 205.3
3 5 204
3 6 194.7
3 6 192.8
3 6 226.5
3 6 200.9
3 6 219.7

Note 2: If you were working with a CROSSED model, then you would enter the two factors and indicate the interaction by typing Drug*Source (if these are the two factors involved in the interaction).

Figure 5. Nested design, linear model menu, Rcmdr, version 1.9.2

Fortunately, R and Rcmdr’s help system is quite extensive here, so when in doubt, check the help box…

Output from the linear model for the Nested Example looks like the one below.

The General Linear Model function in R and therefore Rcmdr returns information about our design plus Sums of Squares, Mean squares, and P-values. R and Rcmdr default’s to use of sequential evaluation of effects. Adjusted evaluation is useful for when you have a covariate (like body size or another confounding variable) that should be evaluated first before the factors are evaluated. We will use the sequential analysis.

Rcmdr: Models → Hypothesis tests → ANOVA table

Anova(LinearModel.15, type="II")
Anova Table (Type II tests)
Response: Chol
             Sum Sq Df F value Pr(>F)
Drug         225.14  2  1.1743 0.3262
Drug:Source  269.27  3  0.9363 0.4385
Residuals   2300.61 24

End R output

Repeatability and ANOVA

We need to tell Rcmdr how to structure the error term; you need the data frame to be arranged so

aovRes <- aov(dH2O ~ Source + Error(Source/Subject), data=SnakeTaste)
#Print the results
aovRes
Anova Table (Type II tests)
Response: dH2O
            Sum Sq Df F value   Pr(>F)    
Subject     307.61  5  4.2956 0.02396 *  
Source      802.78  2 28.0256 0.00007954 ***
Residuals   143.22 10                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#What components are available in the aovRes object?
names(aovRes)
[1] "Sum Sq"  "Df"      "F value" "Pr(>F)"

How do I extract the “F value” for Subjects?

str(aovRes)

 

Questions

1. What are three learning objectives from this page?
2. Be able to distinguish use of lm() and aov() and the summary output with respect to testing of linear models.

Quiz Chapter 14.8

More on the linear model in Rcmdr

Data sets

Table 1. The snake data set

Snake Source flick
1 dH2O 3
2 dH2O 0
3 dH2O 0
4 dH2O 5
5 dH2O 1
6 dH2O 2
1 fish 6
2 fish 22
3 fish 12
4 fish 24
5 fish 16
6 fish 16
1 worm 6
2 worm 22
3 worm 12
4 worm 24
5 worm 16
6 worm 16

 


Chapter 14 contents

14.7 – Rcmdr Multiway ANOVA

Introduction

We have been talking about the two-way randomized, balanced, replicated design.  Here, we take you step by step through use of R to conduct the multiway ANOVA.

R code: Multiway ANOVA

Rcmdr: Statistics  Means → Multiway ANOVA… we will review this as Option 1

or

Rcmdr: Statistics  Fit Models → Linear model… we will review this as Option 2

In either case, as a reminder, your data set must be stacked worksheet, like the data in this worksheet

Table 1. Data set, example.14.7†

Diet Population Response
A 1 4
A 1 6
A 1 5
A 2 5
A 2 8
A 2 9
B 1 12
B 1 15
B 1 11
B 2 5
B 2 7
B 2 8

Option 1

Your first option is to use the ANOVA menus via “Means.” This is a perfectly good way to handle a standard two-way, fully-crossed, fixed effects model.  However, other designs will not run with this command and R will return a report of errors for ANOVA models that do not conform to the replicated, balanced, crossed design.

Rcmdr: Statistics  Means  Multiway Analysis of variance …

Factors: highlight “Diet” AND “Population”

Response variable: pick one (in this window, all we see is “Response”)

Screenshot Rcmdr multi-way ANOVA

Figure 1. Screenshot Rcmdr multi-way ANOVA.

Note 1: Don’t forget to convert numeric Population to factor. Assuming Population is part of a data.frame called example.14.7, then example.14.7$Population <- factor(example.14.7$Population).

Interpret the output

AnovaModel.2 <- (lm(Response ~ Diet*Population, data=example.14.7))

Anova(AnovaModel.2)
 
Anova Table (Type II tests)
Response: Response
                  Sum Sq    Df    F value      Pr(>F) 
Diet              36.750     1    12.2500    0.008079 **
Population        10.083     1     3.3611    0.104104 
Diet:Population   52.083     1    17.3611    0.003136 **
Residuals         24.000     8
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
tapply(example.14.7 Response, list(Diet=example.14.7 Diet, 
+ Population=example.14.7 Population), mean, na.rm=TRUE) # means
Population
Diet 1 2
A 5.00000 7.333333
B 12.66667 6.666667

tapply(example.14.7 Response, list(Diet=example.14.7 Diet, 
+ Population=example.14.7 Population), sd, na.rm=TRUE) # std. deviations
Population
Diet 1 2
A 1.000000 2.081666
B 2.081666 1.527525

tapply(example.14.7 Response, list(Diet=example.14.71 Diet, 
+ Population=example.14.7 Population), function(x) sum(!is.na(x))) # counts
Population
Diet 1 2
A 3 3
B 3 3

End R output.

Note 2: What is the “Type II tests”? When you fit a model with categorical predictors using lm() or aov(), you test hypothesis for each effect (main effects and interactions). These tests differ depending on how the model partitions the variance among predictors. See Note 2 in Chapter 12.7 for description of Type I, Type II, and Type III sums of squares.

Summary of multi-way ANOVA command

The multi-way ANOVA command returns our ANOVA table plus the adjusted means, along with standard deviations and number of observations (counts).  The adjusted means would then be good to put into a chart to present group comparisons following adjustments from the effects of levels within groups.

Rcmdr: Models → Graphs → Predictor effect plots …

Here’s the chart (hint: \pm SEM = \frac{SD}{\sqrt{count}})

predictor chart

Figure 2. Predictor effect plots, Diet and Population on Response variable.

Option 2

A more general approach is to use the General linear model. This approach can handle the standard 2-way fixed effects ANOVA (above), but any other model as well. The model is Response ~ Diet*Population.

Rcmdr: Statistics  Fit Models → Linear model… 

Screenshot Rcmdr linear model

Figure 3. Screenshot Rcmdr linear model menu.

Interpret the output

LinearModel.1 <- lm(Response ~ Diet * Population, data=example.14.7)

summary(LinearModel.1)

Call:
lm(formula = Response ~ Diet * Population, data = example.14.7)

Residuals:
Min 1Q Median 3Q Max 
-2.3333 -1.1667 0.1667 1.0833 2.3333 

Coefficients:
                        Estimate    Std. Error ..t value .  Pr(>|t|) 
(Intercept) .              5.000         1.000 .   5.000    .0.00105 ** 
Diet[T.B] .                7.667       ..1.414 .   5.421    .0.00063 ***
Population[T.2]           .2.333       ..1.414 .   1.650    .0.13757 
Diet[T.B]:Population[T.2] -8.333       ..2.000    -4.167    .0.00314 ** 
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 1.732 on 8 degrees of freedom
Multiple R-squared: 0.8047, Adjusted R-squared: 0.7315 
F-statistic: 10.99 on 3 and 8 DF, p-value: 0.003285 

End R output

Lots to sort through, so let’s begin with what is in common between the two approaches, the Multi-way ANOVA command versus the linear model command.

Compare R output from ANOVA and linear model

As a direct output, the linear model option does not provide an ANOVA summary table. Instead of our ANOVA table the linear model returns estimates of coefficients along with t-test results for each coefficient of the model from the lm() command output

Recall that we can get ANOVA tables through the following R commands via Rcmdr.

Rcmdr: Models  Hypothesis tests  ANOVA Table.

Let’s do so for this linear model (accept the default for type of tests = “Type II”).

And the output from lm() stored in the object LinearModel.1 is

Anova(LinearModel.1, type="II")
Anova Table (Type II tests)

Response: Response
                Sum Sq Df F value   Pr(>F)   
Diet            36.750  1 12.2500 0.008079 **
Population      10.083  1  3.3611 0.104104   
Diet:Population 52.083  1 17.3611 0.003136 **
Residuals       24.000  8                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

End R output

Now, we’re in business, and, using the lm() function, we have the estimates for each model coefficient plus our ANOVA table.

Same answer! Of course. Which to choose, Option 1 or Option 2? Use the lm() options; more flexible, covers more designs than the multiway ANOVA which is strictly for the crossed fully replicated design.

Note 3: Rcmdr uses Anova(), not anova(). anova() is a generic function, part of the base statistics package, and it returns an ANOVA table. Anova() is not a wrapper to anova(), it is a different function, part of the car package. It uses different default methods and different types of sums of squares: Type-II or Type-III analysis-of-variance tables for model objects produced by lm(), glm(), etc. See Note 2 in Chapter 12.7 for description of Type I, Type II, and Type III sums of squares.

Questions

  1. Write out the two-way model described for the data in Table 1.
  2. Write the null hypotheses and provide a summary of the statistical significance of the model.

Quiz Chapter 14.7

Rcmdr Multiway ANOVA


Chapter 14 contents

14.6 – Some other ANOVA designs

Introduction

There are several additional ANOVA models in common use. The crossed, balanced design is but one example of the two-way ANOVA. And, from a consideration of two factors it logically follows that there can be more than two factors as part of the design of an experiment. As the number of factors increase, the number of two-way, three-way, and even higher-order interactions are possible and at least in principle may be estimated.

Our purpose here is to highlight several, but certainly not all possible experimental designs from the perspective of ANOVA. Examples are provided. Keep in mind that the general linear model approach unifies these designs.

Some of the classical experimental ANOVA designs one sees include

  • Two-way randomized complete block design
  • Two-way factorial with no replication design
  • Repeat-measures ANOVA with one factor
  • Nested ANOVA
  • Three-way ANOVA
  • Split-plot ANOVA
  • Latin squares ANOVA

Put simply, these designs differ in how the groups are arranged and how members of the groups are included.

Two-way randomized complete block design

This design refers to the “textbook” design. For each, factor A and factor B, there are multiple levels, in this example three levels of Factor A and three levels of Factor B,  and subjects (sampling units) are randomly assigned to each level. However, one of the factors is, perhaps, of less interest, yet certainly accounts for variation in the response variable.

Factor B
1 2 3
Factor A 1 n_{1,1} n_{1,2} n_{1,3}
2 n_{2,1} n_{2,2} n_{2,3}
3 n_{3,1} n_{3,2} n_{3,3}

where n1,1, n2,1, etc. represents the number of subjects in each cell. Thus, in this design there are nine groups. Typically, minimum replication would be three subjects per group.

Two-way factorial with no replication design

While it may seem obvious that a good experiment should have replication, there are situations in which replication is impossible. While this seems rather odd, this scenario very much describes a typical microarray, gene expression project.

Factor B
1 2 3
Factor A 1 n_{1,1} n_{1,2} n_{1,3}
2 n_{2,1} n_{2,2} n_{2,3}
3 n_{3,1} n_{3,2} n_{3,3}

where, again, n1,1, n2,1, etc., represents the number of subjects in each cell and there are nine groups in this study. With no replication, then no more than one subject per group.

Repeated-measures ANOVA

When subjects in the study are measured multiple times for the dependent variable, this is called a repeated-measures design. We introduced the design for the simple case of before and after measures on the same individuals in Chapter 12.3. It’s straight-forward to extend the design concept to more than two measures on the subjects. The blocking effect is the individual (see Chapter 14.4), and, therefore, a random effect (see Chapters 12.3 and 14.3) in this type of experimental design.

Although straightforward in concept, repeated measure designs have many complications in practice. For example, long-term studies can expect for subjects to drop out of the study, resulting in censored data. Another complication, the assumption is that there is no carry over effect — it doesn’t matter the order different treatments are applied to the subjects. Think of this assumption as akin to the equal variances assumption in ANOVA; just like unequal variances effects Type I error rates in ANOVA, deviations from sphericity inflate Type I error rates in repeated-measures designs.

Sphericity assumption is described in two ways:

Assumption of sphericity — the ranking of individuals remains the same across treatment levels — no interaction between individual and treatment. Sphericity assumption is always met if there are just two levels of the repeated measure, e.g., before and after.

Compound symmetry assumption — the variances and covariances are equal across the study: the changes experienced by the subjects are the same across the study regardless of the order of treatments.

Tests for sphericity include:

Mauchly test: mauchly.test(object)

If results of tests for violations of sphericity warrant, corrections are available. One recommended correction is called Greenhouse-Giesser correction, which adjusts the degrees of freedom and so results in a better p-value estimate. A second correction is called Huyhn-Feldt correction; this correction, too, adjusts the degrees of freedom to improve the p-value estimate.

Three-way ANOVA

It is relatively straightforward to imagine an experiment that involves three or more factors. The analysis and interpretation of such designs, while feasible, becomes somewhat complicated especially for the mixed models (Model III).

Consider just the case of a fixed-effects 3-way ANOVA. How many tests of null hypotheses are there?

  1. Three tests for main effects.
  2. Three tests of two-way interactions.
  3. A test for a three-way interaction.

Thus, there are seven separate null hypotheses from a three-way ANOVA with fixed effects! As you can imagine, large sample sizes are needed for such designs, and the “higher-order” interactions (e.g., three-way interaction) can be difficult to interpret and may lack biological significance.

ANOVA designs without random assignment to treatment levels

Latin square design

We have introduced you to several ANOVA experimental designs that employed randomization for assignment of subjects to treatment groups. The purpose of randomization is even out differences due to confounding variables. However, if we know in advance something about the direction of the influence of these confounding variables, strictly random assignment is not in fact the best design. For example, the Latin square design is common in agriculture research and is very useful for situations in which two gradients are present (e.g., soil moisture levels, soil nutrient levels).

Dry soil ←→ Wet soil
Soil Nutrients

low


high

T1 T4 T3 T2
T3 T2 T1 T4
T2 T3 T4 T1
T4 T1 T2 T3

Split-Plot Design

Another design from agriculture research is especially useful to ecotoxicology research. We mentioned the repeated measures design in which individuals are measured more than once and each individual receives all levels of the treatment in a random order (cross-over design). However, this design assumes that there are no carry-over effects (see Hills and Armitage 1979;  For ecology/evolution definition see O’Connor et al 2014). While this assumption may hold for many experiments, we can also imagine many more situations in which this is undoubtedly false. For example, if we wish to measure the effects of ozone and relative humidity on frog behavior, we might consider using the individual as its own control. But we also wish to compare frog behavior following ozone exposure against behavior exhibited in clean air. But we are likely to violate the carry-over assumption. If a frog receives ozone then air, the effects of ozone may inhibit activity for several days after the initial exposure, which would then influence subsequent measures. The solution to this dilemma is to use what’s called a split plot design. The design combines elements of nesting.

Consider our frog experiment. There would be three factors:

Factor 1 = Exposure (air or ozone),

Factor 2 = Saturation (dry, intermediate, wet),

Factor 3 = Individual (each frog is measured 3 times).

The design table would look like

Exposure
Air Ozone
Humidity Dry Frog1 Frog2 Frog3 Frog4 Frog5 Frog6
Intermediate Frog1 Frog2 Frog3 Frog4 Frog5 Frog6
Wet Frog1 Frog2 Frog3 Frog4 Frog5 Frog6

Thus, the design is crossed for one factor (saturation), but nested for another factor (individuals are nested within Exposure factor).

Questions

  1. Which of the study designs mentioned so far are sensitive to carry-over effects?
  2. With respect to how levels of Factors are assigned, distinguish the split-plot design from the Latin square design.

Quiz Chapter 14.6

Some other ANOVA designs


Chapter 14 contents

14.5 – Nested designs

Introduction

Crossed versus nested design

Factors are independent variables whose values we control and wish to study because we believe they have an effect on the dependent variable. While it is logical to think of factors and levels within factors as independent variables fully under our control, a moments reflection will come up with examples in which the groups (levels) depend on the factor.

Crossed – each level of a factor is in each level of the other factor. This was illustrated in Chapter 14.1 on Crossed, balanced, fully replicated two-way ANOVA.

Nested – levels of one factor are NOT the same in each of the levels of the other factor. Nested designs are an important experimental design in science, and they have some advantages over the 2-way ANOVA design (for one), but they also have limitations.

Classic examples of nesting: culturing and passage of cell lines in routine cell colony maintenance means that even repeated experiments are done on different experimental units. Cells derived from one vial are different from cells derived from a different vial. Similarly, although mice from an inbred strain are thought to be genetically identical, environments vary across time, so mice from the same strain but born or purchased at different times are necessarily different. These scenarios involving time create a natural block effect. Thus, cells are nested by block effect passage number and mice are nested by block effect colony time. We introduced randomized block design in the previous Chapter 14.4.

Statistical model

If Factor B is nested within Factor A, then a group or level within Factor B occurs only within a level of Factor A. Like the randomized block model, there will be no way to estimate the interaction in a nested two-way ANOVA. Our statistical model then is

    \begin{align*} Y_{ijl} = \mu + A_{i} + B_{j\left ( i \right )} + \epsilon_{ijl} \end{align*}

Examples

Example 1. Three different drugs, 6 different sources of the drugs. The researcher obtains three different drugs from 6 different companies and wants to know if one of the drugs is better than another drug (Factor A) in lowering the blood cholesterol in women. There is always the possibility that different companies will be better or worse at making the drug. So the researchers also use the Factor Source (Factor B) to examine this possibility. Unfortunately they can not obtain all drugs from the same sources. This leads to a Nested ANOVA — notice that each drug is obtained from a different source.

We CANNOT perform the typical two-factor ANOVA because we cannot get a mean of the different drugs by combining the same levels of the Sources: the data is NOT crossed. The Sources of the drugs (Factor B) are NESTED within the type of Drug (Factor A): each source is only found in one of the Drug categories. So, we can’t calculate a mean for the Drug levels independent of the SOURCE from which the drug came.

Table 1. Example of a nested design

Drug A
Drug B
Drug C
Source 1
Source 2
Source 3
Source 4
Source 5
Source 6
202.6
189.3
212.3
203.6
189.1
194.7
207.8
198.5
204.4
209.8
219.9
192.8
190.2
208.4
221.6
204.1
196.0
226.5
211.7
205.3
209.2
201.8
205.3
200.9
201.5
210.0
222.1
202.6
204.0
219.7

Scroll to end of this page to get the data set in stacked worksheet format, or click here.

Compare Table 1 to CROSSED data structure (Table 2) — a typical two-factor ANOVA — which would look like

Table 2. Contents of Table 1 presented as crossed design

Drug A
Drug B
Drug C
Source 1
Source 2
Source 1
Source 2
Source 1
Source 2
202.6
189.3
?
?
?
?
207.8
198.5
?
?
?
?
190.2
208.4
?
?
?
?
211.7
205.3
?
?
?
?
201.5
210.0
?
?
?
?

We can take a mean of the different drugs by combining the same levels of the Sources. Here’s the nested design (Table 3).

Table 3. Group means, nested design

Drug A
Drug B
Drug C
Source 1
Source 2
Source 3
Source 4
Source 5
Source 6
202.76
202.3
213.92
204.38
202.86
206.92

We can take a mean of the different drugs by combining the same levels of the Sources. Here’s the crossed design (Table 4).

Table 4. Groups means by crossed design.

Drug A
Drug B
Drug C
Source 1
Source 2
Source 1
Source 2
Source 1
Source 2
202.76
202.3
?
?
?
?

Why the “?” in Table 2 and 4? Manufacturing source 1 & 2 do not sell Drug B and Drug C. So, there cannot be a crossed design.

Why can’t we just use a One-Way ANOVA? Can’t we just ANALYZE the three DRUGS separately, ignoring the source issue (after all, the drugs are not all made by the same manufacturer)? But it is not a one-way ANOVA problem… Here’s why.

The researcher suspects that the response of a particular drug might be dependent upon the particular source from which the drug was purchased. So, the type of source from which the drug was purchased is another FACTOR. Thus, drugs from one source might have more (less) affect compared to drugs from another source regardless of the type of drug. However, each drug is NOT available from each source. Thus the research design can NOT be crossed and Drug is NESTED within Source.

We can ask ONLY two questions (hypotheses) from this NESTED ANOVA research design:

HO: There is no difference in the average effect of the drugs on (tumor size, cholesterol level, blood pressure, etc.)

HA: There is a difference in the average effect of the drugs on (tumor size, cholesterol level, blood pressure, etc.)

HO: There is no difference in the average effect of the drugs on (tumor size, cholesterol level, blood pressure, etc.) purchased from different manufacturers.

HA: There is a difference in the average effect of the drugs on (tumor size, cholesterol level, blood pressure, etc.) purchased from different manufacturers.

Notice that we do NOT examine the effect of the interaction between Drug type and source of the drug. Why not?
Table 5. Sources of Variation in Nested ANOVA

Source of Variation
Sum of Squares
DF
Mean Squares
Total \sum_{i=1}^{a}\sum_{i=1}^{b}\sum_{i=1}^{n}\left ( X_{ijl}-\mu \right )^2 N - 1
Among all subgroups \sum_{i=1}^{a}\sum_{i=1}^{b}\left ( X_{ij}-\mu \right )^2
ab-1
Among Groups \sum_{i=1}^{a}\left ( X_{i}-\mu \right )^2
a-1
\frac{among \ groups \ SS}{among \ groups \ DF}
Among Subgroups \sum_{i=1}^{a}\sum_{j=1}^{b} n_{ij}\left ( X_{i}-\mu \right )^2 a\left(b - 1 \right) \frac{among \ subgroups \ SS}{among \ subgroups \ DF}
Error
Subtract all of the subgroup Sums of Squares from the Total Sums of Squares
N - ab \frac{error \ SS}{error \ DF}

Testing nested ANOVA with one main factor

Perhaps surprisingly given the number of terms above, there are only two hypothesis tests, and, only one of REAL interest to us. There are exceptions (e.g., quantitative genetics provides many examples), but we are generally most interested in the among group test — this is the test of the main factor. In our example, the main factor was DRUG and whether the drugs differed in their effects on cholesterol levels. The second test is important in the sense that we prefer that it contributes little or no variation to the differences in cholesterol levels. But it might.

Table 6. F statistics for nested ANOVA

F for the main effect is given as F = \frac{Groups \ MS}{Subgroups\ MS}
F for the subgroup is given by F = \frac{Subgroups \ MS}{error\ MS}
and of course, use the appropriate DF when testing the F values!! The Critical Value F0.05 (2), df numerator, df denominator

One way to look at this: it would not make sense to conclude that an effect of the main group was significant if the variation in the subgroups was much, much larger. That’s in part why we test the main effect with the subgroups MS and not the error MS. If variation due to the nested variable is not significant, then it is an estimate of the error variance, too.

The nested model we are describing is a two factor ANOVA, but it is incomplete (compared to the balanced, fully crossed 2-way design we’ve talked about before). We don’t have scores in every cell. Instead, each level of nested factor is paired with one and only one level of the other factor. In our example, Source is paired with only one other level of the other factor Drug (e.g., Source 1 goes with Drug 1 only), but the main effect is paired with 2 levels of the nesting factor (e.g., Drug 1 is manufactured at Source 1 and Source 2).

Note that nesting is strictly one way. Drug is not nested within source, for example.

Some important points about testing the null hypotheses in a nested design. For one, the test of the effect of the nesting factor (Source) is confounded by the interaction between the main factor. We don’t actually know if the interaction is present, but we also get no way to test for it because of the incomplete design. We must therefore be cautious in our interpretation of the effect of the nested factor.

Consider our example. We want to interpret the effect of source as the contribution to the response based on variation among the different suppliers of the drugs. It might be good to know that some drug manufacturer is better (or worse) than others. However, differences among the sources for the different drugs are completely contained in the main effect factor (the test of effects of the different drugs themselves on the response). Therefore, the observed differences between sources COULD be entirely due to the effects of the different drugs and have nothing to do with variation among sources!!

Questions

  1. Identify the response variable and whether the described factor (in all caps) is suitable for crossed design or nested design
    a. In a breeding colony of lab mice, BREEDERS are used to generate up to five LITTERS; effects on offspring REPRODUCTIVE SUCCESS.
    b. Effects of individual TEACHERS at different SCHOOLS on STUDENT LEARNING in biology.
    c. Lisinopril, an ACE-inhibitor drug prescribed for treatment of high blood pressure, is now a generic drug, meaning a number of COMPANIES can manufacture and distribute the medication. Millions of DOSES of lisinopril are made each year; drug companies are required by the FDA to record when a dose is made and to record these dates by LOT NUMBER.
  2.  Work the example data set provide in this page. After loading the data set into Rcmdr (R), use linear model. The command to nest requires use of the forward slash, /. For example, if factor b is nested within factor a, then a/b. The linear model formula then,
Model <- lm(Obs ~ a/b, data=source)
  1. Describe the problem, i.e., what is a? What is b? What are the hypotheses?
  2. What is the statistical model?
  3. Test the model.
  4. Conclusions?

Quiz Chapter 14.5

Nested designs

Data set used in this page

Drug Source Obs
A s1 202.6
A s1 207.8
A s1 190.2
A s1 211.7
A s1 201.5
A s2 189.3
A s2 198.5
A s2 208.4
A s2 205.3
A s2 210
B s3 212.3
B s3 204.4
B s3 221.6
B s3 209.2
B s3 222.1
B s4 203.6
B s4 209.8
B s4 204.1
B s4 201.8
B s4 202.6
C s5 189.1
C s5 219.9
C s5 196
C s5 205.3
C s5 204
C s6 194.7
C s6 192.8
C s6 226.5
C s6 200.9
C s6 219.7

Chapter 14 contents

14.4 – Randomized block design

Introduction

Randomized Block Designs and Two-Factor ANOVA

In previous lectures, we have introduced you to the standard factorial ANOVA, which may be characterized as being crossed, balanced, and replicated. We expect additional factors (covariates) may contribute to differences among our experimental units, but rather than testing them — which would increase need for additional experimental units because of increased number of groups to test — we randomize our subjects. Randomization is intended to disrupt trends of confounding variables (aka covariates). If the resulting experiment has missing values (see Chapter 5), then we can say that the design is partially replicated; if only one observation is made per group, then the design is not replicated — and perhaps, not very useful!!

A special type of Two-factor ANOVA which includes a “blocking” factor and a treatment factor.

Randomization is one way to control for “uninteresting” confounding factors. Clearly, there will be scenarios in which randomization is impossible. For example, it is impossible to randomly assign subjects to The blocking factor is similar to the 10.3 – Paired t-test. In the paired t-test we had two individuals or groups that we paired (e.g. twins). One specific design is called the Randomized Block Design and we can have more than 2 members in the group. We arrange the experimental units into similar groups, i.e., the blocking factor. Examples of blocking factors may include day (experiments are run over different days), location (experiments may be run at different locations in the laboratory), nucleic acid kits (different vendors), operator (different assistants may work on the experiments), etcetera.

In general we may not be directly interested in the blocking factor. This blocking factor is used to control some factor(s) that we suspect might affect the response variable. Importantly, this has the effect of reducing the sums of squares by an amount equal to the sums of squares for the block. If variability removed by the blocking is significant, Mean Square Error (MSE) will be smaller, meaning that the F for treatment will be bigger — meaning we have a more powerful ANOVA than if we had ignored the blocking.

Statistical Testing in Randomized Block Designs

“Blocks” is a Random Factor because we are “sampling” a few blocks out of a larger possible number of blocks. Treatment is a Fixed Factor, usually.

The statistical model is

    \begin{align*} Y_{i,j} = \mu + \alpha_{i} + B_{j} + \epsilon_{i,j} \end{align*}

The Sources of Variation are simpler than the more typical Two-Factor ANOVA because we do not calculate all the sources of variation –  the interaction is not tested! (Table 1).

Table 1. Sources of variation for a two-way ANOVA, randomized block design

Sources of Variation & Sum of Squares
DF
Mean Squares
total \ SS = \sum_{i = 1}^{a}\sum_{j = 1}^{b} \sum_{l = 1}^{c} \left ( X_{ijl} - \bar{X}\right )^2 N - 1 \frac{total \ SS}{total \ DF}
SS_{Factor \ A} = b \cdot n\sum_{i = 1}^{a} \left ( \bar{X}_{i} - \mu\right )^2 a - 1 \frac{SS \ Factor \ A}{DF \ Factor \ A}
SS_{Factor \ B} = a \cdot n\sum_{i = 1}^{b} \left ( \bar{X}_{j} - \mu\right )^2 b - 1 \frac{SS \ Factor \ B}{DF \ Factor \ B}
SS \ Total - SS_{Factor \ A} - SS_{Factor \ B} n - a - b \frac{SS \ Remainder}{DF \ Remainder}

    \begin{align*} F = \frac{Factor \ A \ MS} {Remainder \ MS} \end{align*}

Critical Value F0.05(2), (a – 1), (Total DF – a – b)

In the exercise example above: Factor A = exercise or management plan.

Notice that we do not look at the interaction MS or the Blocking Factor (typically).

Learn by doing

Rather than me telling you, try on your own. We’ll begin with a worked example, then proceed to introduce you to three problems. Click here (Ch 14.8)for general discussion of Rcmdr and linear models for models other than the standard 2-way ANOVA (Chapter 14.8).

Worked example

Wheel running by mice is a standard measure of activity behavior. Even wild mice will use wheels (Meijer and Roberts 2014). For example, we conduct a study of family parity to see if offspring from the first, second, or third sets of births from wheel-running behavior in mice (total revs per 24 hr period). Each set of offspring from a female could be treated as a block. Data are for 3 female offspring from each pairing. This type of data set would look like this:

Table 2. Wheel running behavior by three offspring from each of three birth cohorts among four maternal sets (moms).

Revolutions of wheel per 24-hr period
Block
Dam 1
Dam 2
Dam 3
Dam 4
b1
1100
1566
945
450
b1
1245
1478
877
501
b1
1115
1502
892
394
b2
999
451
644
605
b2
899
405
650
612
b2
745
344
605
700
b3
1245
702
1712
790
b3
1300
612
1745
850
b3
1750
508
1680
910

Thus, there were nine offspring for each female mouse (Dam1 – Dam4), three offspring per each of three litters of pups. The litters are the blocks. We need to get the data stacked to run in R. I’ve provided the full dataset for readers, scroll to end of this page or click here.

Question 1. Describe the problem and identify the treatment and blocking factors.

Answer. Each female has three litters. We’re primarily interested in genetics (and maternal environment) of wheel running behavior, which is associated with the moms (Treatment factor). The questions is whether there is an effect of birth parity on wheel running behavior. Offspring of a first-time mother may experience different environment than offspring of experienced mother. In this case, parity effects is an interesting question, nevertheless, blocking is the appropriate way to handle this type of problem.

Question 2. What is the statistical model?

Answer. Response variable, Y, is wheel running. Let α the effect of Dam and β the birth cohorts (i.e., the blocking effect).

    \begin{align*} Y_{i,j} = \mu + \alpha_{i} + B_{j} + \epsilon_{i,j} \end{align*}

Question 3. Test the model.

Answer. We fit the main effects, Dam and Block Fig 1.

Wheel \sim Dam + Block

Rcmdr: Statistics → Fit models → Linear model

Figure 1. Screenshot Rcmdr Linear Model menu.

then run the ANOVA summary to get the ANOVA table. Rcmdr: Models →  Hypothesis tests → ANOVA table.

R Output

Anova Table (Type II tests)

Response: Wheel
           Sum Sq  Df  F value    Pr(>F) 
Dam       1467020   3   4.4732   0.01036 * 
Block     1672166   2   7.6482   0.00207 **
Residuals 3279544  30

Question 4.  Conclusions?

Answer 4. The null hypotheses are:

Treatment factor: Offspring of the different dams have same wheel running activity of offspring.

Blocking factor: No effect of litter parity on wheel running activity of offspring.

Both the treatment factor (p = 0.01036) and the blocking factor (p = 0.00207) were statistically significant.

Questions

Problem 1.

Or we might want to measure the Systolic Blood Pressure of individuals that are on different exercise regimens. However, we are not able to measure all the individuals on the same day at the same time. We suspect that time of day and the day of the year might effect an individuals blood pressure. Given this constraint, the best research design in this circumstance is to measure one individual on each exercise regime at the same time. These different individuals will then be in the same “block” because they share in common the time that their blood pressure was measured. This type of data set would look like this (Table 2):

Table 2. Simulated blood pressure of five subjects on three different exercise regimens.†

Subject
No
Exercise
Moderate Exercise
Intense Exercise
1
120
115
114
2
135
130
131
3
115
110
109
4
112
107
106
5
108
103
102

Let’s make a line graph to help us visualize trends (Fig 2).

line graph

Figure 2. Line graph of data presented in Table 2.

Question 1. Describe the problem and identify the treatment and blocking factors.

Question 2. What is the statistical model?

Question 3. Test the model.

Question 4.  Conclusions?

†You’ll need to arrange the data like the data set for the worked example.

Problem 2.

Another example in conservation biology or agriculture. There may be three different management strategies for promoting the recovery of a plant species. A good research design would be to choose many plots of land (blocks) and perform each treatment (management strategy) on a portion of each plot of land (block). A researcher would start with an equal number of plantings in the plots and see how many grew. The plots of land (blocks) share in common many other aspects of that particular plot of land that may effect the recovery of a species.

Table 3. Growth of plants in 5 different plots subjected to one of three management plans (simulated data set).†

Plot No.
No Management Used
Management
Plan 1
Management
Plan 2
1
0
11
14
2
2
13
15
3
3
11
19
4
4
10
16
5
5
15
12

†You’ll need to arrange the data the same arrangement as for the worked example.

These are examples of Two-Factor ANOVA but we are usually only interested in the treatment Factor. We recognize that the blocking factor may contribute to differences among groups and so wish to control for the fact that we carried out the experiments at different times (e.g., seasons) or at different locations (e.g., agriculture plots)

Question 5. Describe the problem and identify the treatment and blocking factors. Make a line graph to help visualize.

Question 6. What is the statistical model?

Question 7. Test the model.

Question 8.  Conclusions?

Repeated-Measures Experimental Design

If multiple measures are taken on the same individual, then we have a repeated-measures experiment. This is a Randomized Block Design. In other words, each animal gets all levels of a treatment (assigned randomly). Thus, samples (individuals) are not independent and the analysis needs to take this into account. Just like for paired-T tests, one can imagine a number of experiments in biomedicine that would conform to this design.

Problem 3.

The data are total blood cholesterol levels for 7 individuals given 3 different drugs (from example given in Zar 1999, Ex 12.5, pp. 257-258).

Table 5. Repeated measures of blood cholesterol levels of seven subjects on three different drug regimens.†

Subjects
Drug1
Drug2
Drug3
1
164
152
178
2
202
181
222
3
143
136
132
4
210
194
216
5
228
219
245
6
173
159
182
7
161
157
165

†You’ll need to arrange the data like the data set for the worked example.

Question 9: Is there an interaction term in this design? Make a line graph to help visualize.

Question 10: Are individuals a fixed or a random effect?

Question 11. What is the statistical model?

Question 12. Test the model. Note that we could have done the experiment with 21 randomly selected subjects and a one-factor ANOVA. However, the repeated measures design is best IF there is some association (“correlation”) between the data in each row. The computations are identical to the randomized block analysis.

Question 13.  Conclusions?

Problem 4.

Juvenile garter snake. Image from GetArchive, public domain.

Figure 3. Juvenile garter snake, image from juvenile garter snake in hand, public domain.

Here is a second example of a repeated measures design experiment. Garter snakes (Fig 3) respond to odor cues to find prey. Snakes use their tongues to “taste” the air for chemicals, and flick their tongues rapidly when in contact with suitable prey items, less frequently for items not suitable for prey. In the laboratory, researchers can test how individual snakes respond to different chemical cues by presenting each snake with a swab containing a particular chemical. The researcher then counts how many times the snake flicks its tongue in a certain time period (data presented p. 301, Glover and Mitchell 2016).

Table 6. Tongue flick counts of naïve newborn snakes to extracts†

Snake
Control (dH2O)
Fish mucus
Worm mucus
1
3
6
6
2
0
22
22
3
0
12
12
4
5
24
24
5
1
16
16
6
2
16
16

†You’ll need to arrange the data like the data set for the worked example.

Question 14. Describe the problem and identify the treatment and blocking factors. Make a line graph to help visualize.

Question 15. What is the statistical model?

Question 16. Test the model.

Question 17.  Conclusions?

 

Quiz Chapter 14.4

Randomized block design

Data sets used in this page

Problem 1 data set

BlockDamWheel
B1D11100
B1D21566
B1D3945
B1D4450
B1D11245
B1D21478
B1D3877
B1D4501
B1D11115
B1D21502
B1D3892
B1D4394
B2D1999
B2D2451
B2D3644
B2D4605
B2D1899
B2D2405
B2D3650
B2D4612
B2D1745
B2D2344
B2D3605
B2D4700
B3D11245
B3D2702
B3D31712
B3D4790
B3D11300
B3D2612
B3D31745
B3D4850
B3D11750
B3D2508
B3D31680
B3D4910

Chapter 14 contents

14.3 – Fixed effects, Random effects

Introduction

With few exceptions (eg, repeatability and intraclass correlation calculations, Chapter 12.3), we have been discussing the Model I ANOVA or fixed-effects ANOVA– fixed implies that we select the levels for the factor. It may not be obvious — in hindsight it is — but levels may also be randomly selected, eg, nature provides the levels. Thus, levels are random and the model is a random-effects ANOVA. Beginning with our discussions of ANOVA, it becomes increasingly important to incorporate concept of models in statistics. As you have been working in R and Rcmdr with the lm() function, you have been forced to address the statistical model concept — you enter the response variable then type in both factors and create a term for the interaction.

We’ve just completed an experiment in which the response (cholesterol levels) of 36 individuals, from one of three drug treatments (placebo, Drug A, Drug B), given one of 3 types of diets (low, medium, or high carbohydrate), was observed. Thus, we say that the specific treatments of drug and diet contribute to variation in cholesterol levels. More formally, we say that the observed response of the kth individual (Yijk) is equal to the overall mean (m) plus the added effect of Drug (a) plus the added effect of Diet (b) plus the interaction between Diet and Population (ab) plus unidentified sources of variation generally called error (e). In symbols, we write

    \begin{align*} Y_{ijk} = \mu + \alpha_{j} + \beta_{2} + \left (\alpha\beta \right )_{ij} +\epsilon_{ijk} \end{align*}

where i is the number of levels of the first factor (in our example, Diet had 2 levels, so i = A or i = B), j is the number of levels of the second factor (in our example, Population had 2 levels, so j = 1, or j = 2), and k is the total number of observations in the experiment (12 in our example, so k = 1, 2, …, 11, 12). Thus, we can think of each term “adding up” to give us the observed value.

Although it is a bit confusing at first, these equations help us understand how the experiment was conducted and therefore how to analyze and interpret the results.

Model I, Model II & Model III ANOVA

Statisticians recognize that how levels of the factors were selected for an experiment impact conclusions from ANOVAs. The key: whether or not the levels of the factor were selected (1) randomly from all possible levels of the factor or (2) specifically selected by the experimenters. We introduced the concepts of fixed and random effects in Chapter 12.3. For one-way ANOVA, the distinction between fixed and random effects influences the interpretation, but not the calculation of the ANOVA components. For two or more treatment factors, both the interpretations and the calculations of ANOVA components are affected.

There are two general types of Factors that we can choose to employ in an ANOVA: Fixed Model ANOVA and Random Model ANOVA. Where two or more factors apply, by far the most common model in experimental sciences is a combination of fixed and random, so we need to add a third general type, the Mixed Model ANOVA design.

Fixed Factors. Where the levels of the factor are selected by us. In this case we would only be interested in the response of the individuals that are given those specific treatments.

Medicine – for example, where we choose a treatment given to patients with a history of coronary heart disease; compare outcomes of patients given a statin (drug used to lower serum cholesterol) drug versus a placebo. (Note that this is the same study we discussed in our lecture on about risk analysis).

Ecotoxicology – for example, compare growth rates and deformities of tadpole frogs given Aldicarb, Atrazine (both estrogen-mimicking pesticides), or a control (ie, no pesticides). If you’re interested in these topics, here’s a link to the EPA’s web site, listing pesticide sales and use in the United States. Here’s a link to a NIH National Institute of Environmental Sciences, with a nice description of estrogen mimicking pesticides.

Agriculture & Genetics – for example, monitor growth of a particular hybrid corn available from three different manufacturers. See an example of Wisconsin corn hybrid studies.

In each of these examples we might be interested in those specific treatments and no other treatments.

HO: No difference in the means among the levels of the Factor

HA: Some difference in the means among these specific levels of the Factor; the specific levels of this factor affects the response variable.

This is an example of a Model I ANOVA, also called a “fixed effects” model ANOVA.

Random Factors. We still only use a relatively few number of different levels of a particular factor. However, in this case we are interested in many different levels of the factor — we want to generalize beyond our sample. The levels that we use would be a “sample” of all possible levels that we would be interested in.

Medicine – for example, where we randomly choose a treatment level given to patients; four concentrations of a drug and a placebo. Since concentration is a ratio scale data type, concentration can range from 0 (the placebo) to 100%.

Ecotoxicology – for example, release different concentrations or mixtures of air plus components of air pollution to chambers, record the response of plants or animals.

Agriculture & Genetics – for example, grow three different varieties of a plant in three different soils or different genetic strains of animals on three different diets. In these cases, factor levels are random because we are drawing from a large pool of possible levels: genetic varieties or strains — we selected three, but it’s rare that were are specifically interested in the three chosen. More often, we want to make generalizations and the three were somehow representative (we hope) of genetic variation in the species of interest.

In each of these examples, we write the null hypothesis to reflect that the particular levels are only of interest in so far as they can be used to generalize back to the population.

HO: No difference in the variation between groups.

HA: Some difference in the variation among these groups.

This is an example of a Model II ANOVA, also called a “random effects” model ANOVA. Note that we specify the hypothesis in terms of variation, not of the means.

Your two-way ANOVA could be Model I, Model II, or it could be mixed, with one factor fixed, the other random (this later model is called a Model III, or “mixed model” ANOVA).

For the most part, the distinction between whether you have fixed or random effects is clear, but whether we use fixed or random or combinations, this design decision has consequences for testing.

The decision does not affect the Sources of Variation for the different Models. Last time, we showed the tests for a Model I ANOVA (the “fixed effects model”).

For Random Effects or Mixed Effects, we only change how we determine the statistical significance of the Factors. Here’s a summary of how the experimental design changes the calculation of the F-test statistic for the Factors.

Table 1. Calculation of F

The Critical Value for each of the different F values will be obtained by simply finding the degrees of freedom for the numerator and denominator SS. This was discussed and can be found sources of variation in 2-way ANOVA lecture.

From the formulas, we can see that the major difference is that sometimes the Factor MS is divided by the error MS and sometimes it is divided by the interaction MS.

If the interaction term is NOT statistically significant, then the Interaction MS (mean square) estimates the Error MS. In other words, if the interaction term is not statistically significant it will be similar in magnitude as the Error MS. In this case there will be no large difference in the computed outcomes if the Factor A or B is fixed or random.

However, there will be times when the interaction is not significant but the interaction MS is still larger than the Error MS. Then there could be a difference in the F value for the Factor.

If the interaction term is Significant and the interaction MS is larger than the Error MS then there will be difference in F values for the Factors A and/or B. The F values will be smaller for the Factors MS that are divided by the interaction MS.

It is possible that they will become non-significant with the interaction MS as the denominator. So it will become Harder to detect a significant Factor effect if there is also a significant Interaction effect. A graphical representation will help us understand why we use the interaction MS in some instances as the denominator.

In fact, if the interaction is found to be statistically significant, we must then interpret the effects of factors with caution. In general, if the interaction is significant, then the main factors are generally not interpreted in the 2-way ANOVA. Instead, a series of one-way ANOVA’s are conducted holding one of the factors constant. For example, evaporative water loss (EWL) in frogs in the presence of air pollution (ozone) may depend on the relative humidity (RH) — if the RH is low, the frog may lose less body water at different concentrations of ozone than if the RH is moderate. Therefore, since the interaction is significant, the best thing to do is to look at the effects of ozone concentration on EWL at each level of saturation (RH).

This is a critical point in your understanding of complex ANOVA designs. Let us examine a case where there is a mildly significant interaction effect between two factors. In the first graph below (Fig 1) we see that Genotype 2 performs better on average (combining the two density treatments). If we are only interested in these two density treatments then it might be that the Genotype Factor is significant. This would be the case if Factor A is fixed and Factor B (density) is also fixed.

The Formula for Factor A would be: F = Genotype MS / error MS

Example of an interaction; genotype 1 has higher growth rate in density 2, genotype 2 does best in density 1 conditions.

Figure 1. Interaction example. At density D1, genotype 2 (red line) has higher growth rate; at density D2, the ranking switches: now, genotype 1 (black line) has higher growth rate.

However, it is likely that both Factor A (genotype) and Factor B (densities) are actually “samples” of many other possible genotypes and densities that we could examine.

Consider more than two genotypes raised in more than 2 densities (Fig 2). The outcome might look like

Interaction example expanded for multiple genotypes over multiple densities. 

Figure 2. Interaction example expanded for multiple genotypes over multiple densities. 

The graph (Fig 2) shows that genotype 2 does not do better than genotype 1 if we have more densities. If we also have other genotypes we see that there are other genotypes that have better (higher) responses than genotype 2.

    \begin{align*} F = \frac{Genotype \ MS}{Interaction \ MS} \end{align*}

In these cases it would have been more appropriate to calculate the F value for Factor A (genotype) using the interaction MS as the denominator. In Figure 1 there was some interaction this will make it harder to reject the null hypothesis that there is no effect of Factor A (genotype). So you must be careful to think about how you plan to interpret your data before you decide how to analyze the data using a Two-Factor ANOVA.

Questions

  1. Which of the following statements regarding fixed and random factors is true?
    A. With fixed factors, the subjects are selected by the researcher
    B. With fixed factors, the treatment levels are selected by the researcher at random from all possible levels
    C. With fixed factors, the subjects are selected at random by the researcher
    D. With random factors, the treatment levels are selected by the researcher at random from all possible levels
  2. Please write the equation for the one-way ANOVA with four levels of of fixed effects treatment factor A (you may wish to review Chapter 12.2)
  3. Selecting from all possible levels of a statin drug would be an impossible and meaningless experimental design. Explain why.
  4. For a multiway ANOVA design, when will the differences in the Random versus Fixed Factor make a difference?

Quiz Chapter 14.3

Fixed effects, random effects


Chapter 14 contents