biostatistics.letgen.org
Mike’s Biostatistics Book
Preface
0.1 – Disclaimers and copyright
1 – Getting started
1.1 – A quick look at R and R Commander
1.2 – Chapter 1 – References
2 – Introduction
2.1 – Why (Bio)Statistics?
2.2 – Why do we use R Software?
2.3 – A brief history of (bio)statistics
2.4 – Experimental Design and rise of statistics in medical research
2.5 – Scientific method and where statistics fits
2.6 – Statistical reasoning
2.7 – Chapter 2 – References
3 – Exploring data
3.1 – Data types
3.2 – Measures of Central Tendency
3.3 – Measures of dispersion
3.4 – Estimating parameters
3.5 – Statistics of error
3.6 – Chapter 3 – References
4 – How to report statistics
4.1 – Bar (column) charts
4.2 – Histograms
4.3 – Box plot
4.4 – Mosaic plots
4.5 – Scatter plots
4.6 – Adding a second Y axis
4.7 – Q-Q plot
4.8 – Ternary plots
4.9 – Heat maps
4.10 – Graph software
4.11 – Chapter 4 – References
5 – Experimental design
5.1 – Experiments
5.2 – Experimental units, Sampling units
5.3 – Replication, Bias, and Nuisance Variables
5.4 – Clinical trials
5.5 – Importance of randomization in experimental design
5.6 – Sampling from Populations
5.7 – Chapter 5 – References
6 – Probability, Distributions
6.1 – Some preliminaries
6.2 – Ratios and probabilities
6.3 – Combinations and permutations
6.4 – Types of probability
6.5 – Discrete probability distributions
6.6 – Continuous distributions
6.7 – Normal distribution and the normal deviate (Z)
6.8 – Moments
6.9 – Chi-square distribution
6.10 – t distribution
6.11 – F distribution
6.12 – Chapter 6 – References
7 – Probability, Risk Analysis
7.1 – Epidemiology definitions
7.2 – Epidemiology basics
7.3 – Conditional Probability and Evidence Based Medicine
7.4 – Epidemiology: Relative risk and absolute risk, explained
7.5 – Odds ratio
7.6 – Confidence intervals
7.7 – Chapter 7 – References
8 – Inferential statistics
8.1 – The null and alternative hypotheses
8.2 – The controversy over proper hypothesis testing
8.3 – Sampling distribution and hypothesis testing
8.4 – Tails of a test
8.5 – One sample t-test
8.6 – Confidence limits for the estimate of population mean
8.7 – Chapter 8 – References
9 – Categorical Data
9.1 – Chi-square test: Goodness of fit
9.2 – Chi-square contingency tables
9.3 – Yates continuity correction
9.4 – Heterogeneity chi-square tests
9.5 – Fisher exact test
9.6 – McNemar’s test
9.7 – Chapter 9 – References
10 – Quantitative: Two Sample tests
10.1 – Compare two independent sample means
10.2 – Digging deeper into t-test Plus the Welch test
10.3 – Paired t-test
10.4 – Chapter 10 – References
11 – Power analysis
11.1 – What is Statistical Power?
11.2 – Prospective and retrospective power
11.3 – Factors influencing statistical power
11.4 – Two sample effect size
11.5 – Power analysis in R
11.6 – Chapter 11 – References
12 – One-way Analysis of Variance
12.1 – The need for ANOVA
12.2 – One way ANOVA
12.3 – Fixed effects, random effects, and ICC
12.4 – ANOVA from “sufficient statistics”
12.5 – Effect size for ANOVA
12.6 – ANOVA posthoc tests
12.7 – Many tests one model
12.8 – Chapter 12 – References
13 – Assumptions of parametric tests
13.1 – ANOVA Assumptions
13.2 – Why tests of assumption are important
13.3 – Test assumption of normality
13.4 – Tests for Equal Variances
13.5 – Chapter 13 – References
14 – ANOVA designs, multiple factors
14.1 – Crossed, balanced, fully replicated designs
14.2 – Sources of variation
14.3 – Fixed effects, Random effects
14.4 – Randomized block design
14.5 – Nested designs
14.6 – Some other ANOVA designs
14.7 – Rcmdr Multiway ANOVA
14.8 – More on the linear model in Rcmdr
14.9 – Chapter 14 – References
15 – Nonparametric tests
15.1 – Kruskal-Wallis and ANOVA by ranks
15.2 – Wilcoxon Rank Sum Test
15.3 – Wilcoxon signed rank test
15.4 – Chapter 15 – References
16 – Correlation, Similarity, and Distance
16.1 – Product moment correlation
16.2 – Causation and Partial correlation
16.3 – Data aggregation and correlation
16.4 – Spearman and other correlations
16.5 – Instrument reliability and validity
16.6 – Similarity and Distance
16.7 – References and suggested readings
17 – Linear Regression
17.1 – Simple linear regression
17.2 – Relationship between the slope and the correlation
17.3 – Estimation of linear regression coefficients
17.4 – OLS, RMA, and smoothing functions
17.5 – Testing regression coefficients
17.6 – ANCOVA – analysis of covariance
17.7 – Regression model fit
17.8 – Assumptions and model diagnostics for Simple Linear Regression
18 – Multiple Linear Regression
18.1 – Multiple Linear Regression
18.2 – Nonlinear regression
18.3 – Logistic regression
18.4 – Generalized Linear Squares
18.4 – Selecting the best model
18.5 – Compare two linear models
18.6 – References and suggested readings (Ch17 & 18)
19 – Distribution free methods
19.1 – Jackknife sampling
19.2 – Bootstrap sampling
19.3 – Monte Carlo methods
19.4 – References and suggested readings
20 – Additional topics
20.1 – Area under the curve
20.2 – Peak detection
20.3 – Baseline correction
20.4 – Conducting surveys
20.5 – Time series
20.6 – Dimensional analysis
20.7 – Estimating population size
20.8 – Diversity indexes
20.9 – Survival analysis
20.10 – Growth equations and dose response calculations
20.11 – Plot a Newick tree
20.12 – Phylogenetically independent contrasts
20.13 – How to get the distances from a distance tree
20.14 – Binary classification
Appendix
Distribution tables
Table of Z of Standard normal probabilities
Table of Chi-square critical values
Table of Critical values of Student’s t distribution
Table of Critical values of F distribution
Install R
Install R Commander
Use R in the cloud
Jupyter notebook
R packages
List of R commands
Free apps for Bioinformatics