Topics
- P value
- t-test
- Multiple comparison
- ANOVA
References
- Wasserman (2004), Chapter 10
- Motulsky (2014), Chapters 4, 12, 15-17
Statistical Testing
Significance level \(\alpha\)
Confidence Interval and P Value
t-Test
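Sample from Gaussian \(X_1,\ldots,X_n \sim \mathcal{N}(\mu,\sigma^2)\)
Null hypothesis: \(\mu=0\)
\(T = \frac{\sqrt{n}\hat{\mu}}{\hat{\sigma}}\) follows a t distribution with \(\nu=n-1\) degrees of freedom, where \(\hat{\mu}\) is the sample mean and \(\hat{\sigma}\) the sample standard deviation.
Reject null hypothesis if \(|T| > t^*\), where \(t^*\) is the critical value of that t distribution for the chosen \(\alpha\)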
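The test above can be sketched in a few lines of R. This is a minimal illustration, not a prescribed solution: the sample `x`, its size, the seed, and the two-sided choice of \(t^*\) are all assumptions made for the example.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
n <- 20
x <- rnorm(n, mean = 0.5, sd = 1)    # hypothetical sample (assumption, not course data)

alpha  <- 0.05
mu_hat <- mean(x)                    # sample mean
s_hat  <- sd(x)                      # sample standard deviation (n - 1 denominator)
t_stat <- sqrt(n) * mu_hat / s_hat   # test statistic T

nu     <- n - 1                      # degrees of freedom
t_crit <- qt(1 - alpha / 2, df = nu)       # two-sided critical value t*
p_val  <- 2 * pt(-abs(t_stat), df = nu)    # two-sided P value

reject <- abs(t_stat) > t_crit
cat("T =", t_stat, " t* =", t_crit, " P =", p_val, " reject:", reject, "\n")

# Cross-check against the built-in one-sample test
res <- t.test(x, mu = 0)
```

Rejecting when \(|T| > t^*\) is the same decision as rejecting when the two-sided P value is below \(\alpha\), and `res$statistic` and `res$p.value` should agree with the hand-computed values.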
Multiple Comparison
e.g., Genome-wide association study (GWAS)
compare rates of \(m\) mutations in patients and controls
probability of a false positive in each test is \(\alpha\)
probability of no false positive in \(m\) independent tests is \((1-\alpha)^m\)
probability of at least one false positive: \(1-(1-\alpha)^m\)
- \(\alpha=0.05\): 0.4 for \(m=10\)
- \(\alpha=0.001\): 0.63 for \(m=1,000\)
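The two numbers above can be checked directly. The sketch below assumes independent tests with all null hypotheses true; the helper name `fwer` and the simulation settings are illustrative choices, not part of the original notes.

```r
# Probability of at least one false positive among m independent tests
fwer <- function(alpha, m) 1 - (1 - alpha)^m

fwer(0.05, 10)      # about 0.40
fwer(0.001, 1000)   # about 0.63

# Simulation check: when the null is true, P values are uniform on (0, 1)
set.seed(1)                                      # arbitrary seed
n_rep <- 10000                                   # number of simulated experiments
p <- matrix(runif(n_rep * 10), nrow = n_rep)     # 10 tests per experiment
sim <- mean(apply(p, 1, min) < 0.05)             # fraction with at least one P < 0.05
sim
```

The simulated fraction should land close to the analytic value for \(\alpha=0.05\), \(m=10\).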
False discovery rate (FDR)
Suppose \(m\) null hypotheses are all true
then P values should be uniformly distributed in (0,1)
set FDR = Q (e.g., 0.05)
test with the smallest P value is a ‘discovery’ if \(P<Q/m\)
second smallest if \(P<2Q/m\), 3rd smallest if \(P<3Q/m\), …
expected ratio (false positives)/(positives) stays below Q
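The ranking rule above can be sketched as follows. This is one reading of it, namely the Benjamini-Hochberg step-up rule, which is what `p.adjust(p, method="fdr")` implements in R. The P values are simulated under assumed settings (90 true nulls, 10 real effects), and a Bonferroni threshold of \(Q/m\) is included only for comparison.

```r
set.seed(1)                                  # arbitrary seed
m <- 100
# Hypothetical P values: 90 true nulls (uniform) and 10 real effects (small P)
p <- c(runif(90), rbeta(10, 1, 200))

Q <- 0.05
ord      <- order(p)                         # indices of tests, smallest P first
p_sorted <- p[ord]
below    <- p_sorted <= (1:m) * Q / m        # compare i-th smallest P with i*Q/m

# Step-up rule: every test up to the largest rank that passes is a discovery
n_disc <- if (any(below)) max(which(below)) else 0
discovery <- rep(FALSE, m)
discovery[ord[seq_len(n_disc)]] <- TRUE

# Cross-check with adjusted P values from p.adjust()
discovery_padj <- p.adjust(p, method = "fdr") <= Q

# For comparison: Bonferroni uses the single threshold Q/m for every test
bonf <- p < Q / m

cat("FDR discoveries:", sum(discovery), " Bonferroni:", sum(bonf), "\n")
```

Bonferroni can never flag more tests than the step-up rule at the same level, because its threshold \(Q/m\) is the strictest one in the sequence \(Q/m, 2Q/m, \ldots\)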
Analysis of Variance (ANOVA)
Comparing \(k>2\) groups: \((X^1_1,\ldots,X^1_n),\ldots,(X^k_1,\ldots,X^k_n)\)
- \(k(k-1)/2\) pairwise t-tests?
One-way ANOVA
Null hypothesis: all come from the same Gaussian
group means: \(M^j = \frac{1}{n} \sum_{i=1}^n X^j_i\)
total mean: \(M = \frac{1}{nk} \sum_{j=1}^k\sum_{i=1}^n X^j_i\)
between group variance: \(V_B = \frac{n}{k-1} \sum_{j=1}^k(M^j-M)^2\)
within group variance: \(V_W = \frac{1}{nk-k} \sum_{j=1}^k \sum_{i=1}^n(X^j_i-M^j)^2\)
\(F=\frac{V_B}{V_W}\) follows \(F\) distribution \(F(k-1,nk-k)\)
\[f(x;n_1,n_2) \propto x^{\frac{n_1-2}{2}}\left(1+\frac{n_1}{n_2}x\right)^{-\frac{n_1+n_2}{2}}\]
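# Density of the F distribution for several degrees of freedom
x <- seq(0, 5, 0.1)
plot(x, exp(-x), type = "l")  # exp(-x), just for comparison
n1 <- c(2, 5)
n2 <- c(10, 100)
for (i in seq_along(n1)) {
  for (j in seq_along(n2)) {
    f <- df(x, n1[i], n2[j])  # F density with (n1[i], n2[j]) degrees of freedom
    lines(x, f, col = i * length(n2) + j)
  }
}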

- Reject null hypothesis if F is large
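The one-way ANOVA quantities defined above can be computed by hand and compared with `aov()`. The following is a sketch on simulated data: the group means, group size, seed, and \(\alpha=0.05\) are assumptions for illustration, and the statistic is named `F_stat` because `F` is a built-in alias for `FALSE` in R.

```r
set.seed(1)                                  # arbitrary seed
k <- 3; n <- 10
# Hypothetical data: one column per group, with different true means
X <- sapply(c(0, 0.5, 1), function(mu) rnorm(n, mean = mu, sd = 1))

Mj  <- colMeans(X)                           # group means M^j
M   <- mean(X)                               # total mean M
V_B <- n / (k - 1) * sum((Mj - M)^2)         # between group variance
V_W <- sum(sweep(X, 2, Mj)^2) / (n * k - k)  # within group variance
F_stat <- V_B / V_W

p_val  <- pf(F_stat, k - 1, n * k - k, lower.tail = FALSE)
F_crit <- qf(0.95, k - 1, n * k - k)         # reject at alpha = 0.05 if F_stat > F_crit
cat("F =", F_stat, " critical value =", F_crit, " P =", p_val, "\n")

# Cross-check with aov() on the same data in long format
d   <- data.frame(value = as.vector(X), group = factor(rep(1:k, each = n)))
tab <- summary(aov(value ~ group, data = d))[[1]]
tab
```

The `F value` and `Pr(>F)` entries in the first row of `tab` should match `F_stat` and `p_val`.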
Exercise
1. P-value and t Test
- For the above samples, perform a t-test with \(\alpha=0.05\).
- Compare the result with that of the t.test() function.
2. Multiple Testing
- For the above m samples, check the result with Bonferroni correction.
- Try False Discovery Rate by p.adjust(p, method="fdr")
3. ANOVA
- Apply aov() to several of the samples above.
- Take iris or another dataset with three or more groups and try aov().