Statistical Methods (B07, Term 2)

The aim of this course is to introduce the basic principles and practical methods of statistical testing and inference that are essential in all fields of scientific research.
To facilitate intuitive understanding, examples and exercises are given using the R language.

Contents

The linked .nb.html files are created as R notebooks. You can download the source .Rmd files from the menu in the upper right corner. If the download does not work well in Safari, try Firefox.

  1. Introduction: Probability and Statistics
    • Why statistics
    • What is probability
    • Probability and statistics
    • R language
  2. Probability Distributions
    • Expectation
    • Variance
    • Discrete distributions
    • Continuous distributions
    • Law of large numbers
    • Central limit theorem
  3. Statistical Inference
    • Standard error
    • Confidence interval of the proportion
    • Confidence interval of the mean
  4. Statistical Testing
    • P values
    • t test
    • Multiple testing
    • ANOVA
  5. Multivariate Distributions
    • Joint and conditional distributions
    • Covariance and correlation
    • Statistical independence
    • Entropy and mutual information
  6. Regression and Maximum Likelihood
    • Maximum likelihood
    • Linear regression
    • Multiple regression
    • Overfitting
  7. Regularization
    • Regularization
    • Bayesian parameter inference
    • Model selection
  8. Classification
    • Logistic regression
    • Linear discriminant analysis
    • Receiver operating characteristic (ROC) curve
  9. Unsupervised Learning
    • Hierarchical clustering
    • k-means clustering
    • Gaussian mixture
    • Principal component analysis (PCA)
    • Singular value decomposition (SVD)

References

Basic statistics

Larry Wasserman (2004) All of Statistics. Springer.
Harvey Motulsky (2014) Intuitive Biostatistics. Oxford University Press.

R language

R Development Core Team: An Introduction to R. CRAN.
Roger D. Peng (2015) R Programming for Data Science. Leanpub.

Issues in statistical data analysis

JT Leek and RD Peng (2015) P values are just the tip of the iceberg. Nature, 520, 612.
JT Leek and RD Peng (2015) What is the question? Science, 347, 1314-1315.