In this course, I will explain the basic methodology of hypothesis testing. Statistical analyses are required in many experimental and simulation studies. I will deliver lectures and exercises on the basics of probability theory and statistical methods, including sample means, sample variances, p-values, the t-test, U-test, Welch test, confidence intervals, covariance, ANOVA, multivariate analyses, correlations, information theory, mutual information, experimental design, and so on. After completing the coursework, students will have acquired basic knowledge of hypothesis testing.
This course corresponds to the first half of the previous course "B07 Statistical Methods".
Every week, a lecture on each topic is followed by a programming exercise in Python.
1. Introduction
The history and basic concepts of hypothesis testing are explained. The fundamentals of probability distributions are also given.
2. Sampling and the Central Limit Theorem
The central limit theorem is the core of various hypothesis-testing methods. The law of large numbers and the theorem are explained in the context of sampling from a population. I will also explain the degrees of freedom in data sampling.
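As a rough sketch of what the exercise might look like, the following simulation (using hypothetical numbers of samples and trials, with only Python's standard library) illustrates both results: sample means of Uniform(0, 1) draws cluster around the population mean 0.5, and their spread shrinks like σ/√n.

```python
import random
import statistics

random.seed(0)

# Draw many samples of size n from a Uniform(0, 1) population and
# collect the sample mean of each draw.
def sample_means(n, trials=2000):
    return [statistics.mean(random.random() for _ in range(n))
            for _ in range(trials)]

means = sample_means(n=30)

# Law of large numbers: the average of the sample means approaches
# the population mean (0.5 for Uniform(0, 1)).
print(statistics.mean(means))

# Central limit theorem: the standard deviation of the sample means
# is close to sigma / sqrt(n); for Uniform(0, 1), sigma^2 = 1/12.
expected_sd = (1 / 12) ** 0.5 / 30 ** 0.5
print(statistics.stdev(means), expected_sd)
```

Increasing `n` tightens the distribution of sample means; increasing `trials` only makes the empirical estimate of that distribution more precise.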
3. T-test, U-test, Welch test
Comparison of means between two groups is frequently required in the statistical assessment of measured data. Depending on the properties of the data, however, different methods should be adopted. These methods are explained together with the basic notions of statistical significance and p-values.
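As one concrete case, Welch's test compares two means without assuming equal variances. A minimal sketch with hypothetical measurements (standard library only; in practice a library such as scipy.stats would also supply the p-value):

```python
import math
import statistics

# Hypothetical measurements from two groups with unequal variances.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.1, 5.0]
group_b = [4.6, 4.4, 4.9, 4.5, 4.7, 4.3, 4.8, 4.6]

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(x), len(y)
    mx, my = statistics.mean(x), statistics.mean(y)
    vx, vy = statistics.variance(x), statistics.variance(y)  # sample variances
    se2 = vx / nx + vy / ny          # squared standard error of the difference
    t = (mx - my) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df

t, df = welch_t(group_a, group_b)
# The p-value is then obtained from the t distribution with df degrees
# of freedom (e.g., via scipy.stats.t.sf).
print(t, df)
```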
4. Confidence Intervals
Nowadays, the mere use of p-values is discouraged by experts. First, I will explain why p-values alone are not sufficient for statistical assessment. Then, I will show how statistical differences can be assessed more reliably within the hypothesis-testing framework by using confidence intervals of means and proportions.
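A minimal sketch of a confidence interval for a mean, using hypothetical data and the normal approximation (for small samples the critical value should instead come from the t distribution):

```python
import math
import statistics

# Hypothetical sample of repeated measurements.
data = [9.8, 10.2, 10.1, 9.9, 10.3, 9.7, 10.0, 10.1, 9.9, 10.2,
        10.0, 9.8, 10.1, 10.0, 9.9, 10.2, 10.1, 9.8, 10.0, 10.1]

n = len(data)
mean = statistics.mean(data)
se = statistics.stdev(data) / math.sqrt(n)  # standard error of the mean

# 95% confidence interval with the normal critical value z = 1.96.
lo, hi = mean - 1.96 * se, mean + 1.96 * se
print(lo, hi)
```

Unlike a bare p-value, the interval reports both the estimated effect and its precision.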
5. ANOVA, Effect Size
Statistical comparison between multiple groups is frequently required in realistic situations. I will explain how such a comparison can be made by comparing the within-class variances and the between-class variances. Various corrections required for multiple comparisons are also explained, together with the criteria for statistical differences.
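The variance comparison at the heart of one-way ANOVA can be sketched directly from its definition, using hypothetical data for three groups:

```python
import statistics

# Hypothetical measurements from three groups.
groups = [
    [6.1, 5.9, 6.3, 6.0, 6.2],
    [5.4, 5.6, 5.3, 5.5, 5.7],
    [6.8, 6.6, 6.9, 6.7, 7.0],
]

k = len(groups)
n = sum(len(g) for g in groups)
grand_mean = statistics.mean(x for g in groups for x in g)

# Between-class sum of squares: spread of the group means around the grand mean.
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
# Within-class sum of squares: spread of observations around their own group mean.
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

# F is the ratio of the two variance estimates; a large F suggests the
# group means differ more than the within-group noise would explain.
f = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f)
```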
6. Correlation Analysis
Correlation analysis is a standard method for analyzing the statistical relationship between variables. After explaining the meaning of statistical independence, I will explain the correlation analysis of continuous and discrete variables, together with its limitations.
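As a small sketch with hypothetical paired observations, the Pearson correlation coefficient is just the covariance scaled by the two standard deviations:

```python
import statistics

# Hypothetical paired observations with a roughly linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

def pearson(x, y):
    """Pearson correlation: sample covariance divided by both standard deviations."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

r = pearson(x, y)
print(r)  # near 1: a strong positive linear relationship
```

A key limitation, taken up in the next topic, is that r measures only linear association.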
7. Information Theory
Information theory is a comparatively modern framework, developed in the twentieth century. In particular, mutual information is often used to quantify the relationship between two statistical variables. A virtue of mutual information is that, unlike correlation, it is applicable to variables with a nonlinear mutual relationship. I will explain the basics of information theory.
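For discrete variables, mutual information can be estimated directly from empirical joint frequencies. A minimal sketch with hypothetical binary observations:

```python
import math
from collections import Counter

# Hypothetical paired discrete observations.
x = [0, 0, 0, 1, 1, 1, 0, 1, 0, 1]
y = [0, 0, 1, 1, 1, 0, 0, 1, 0, 1]

def mutual_information(x, y):
    """Mutual information (in bits) estimated from empirical frequencies."""
    n = len(x)
    pxy = Counter(zip(x, y))  # joint counts
    px = Counter(x)           # marginal counts of x
    py = Counter(y)           # marginal counts of y
    mi = 0.0
    for (a, b), c in pxy.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi

mi = mutual_information(x, y)
print(mi)  # 0 bits would mean independence; larger values mean stronger dependence
```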
Students are expected to have basic knowledge of elementary mathematics, such as differentiation, integration, and basic linear algebra. However, whenever necessary, mathematical details will be explained.
Students will need to write some code in Python.