- Data are noisy because:
- Biological phenomena are inherently "noisy": heterogeneity is an essential feature of biology.
- Experiments involve many factors that add noise to the data.
- High-throughput systems achieve their throughput by sacrificing some precision.
- Noise is informative
- Large-scale data carry information on the "shape" of the noise, which can be used to quality-control and adjust the noisy data so that the underlying signal can be recovered.
- Example: 1000 samples genotyped on a SNP chip for 50K SNPs.
- Success rate of individual samples, simulated here as a mixture of three beta distributions.
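# Simulate per-sample success rates for N samples as a mixture of three groups
N <- 1000
n1 <- 650  # samples with high success rates
n2 <- 300  # samples with moderately high success rates
a <- rbeta(n1, 50, 2)            # mean 50/52, about 0.96
b <- rbeta(n2, 100, 20)          # mean 100/120, about 0.83
x <- rbeta(N - n1 - n2, 30, 50)  # remaining 50 samples, mean 30/80 = 0.375
# Histogram of all success rates on [0, 1] with 39 equal-width bins
hist(c(a, b, x), breaks = seq(from = 0, to = 1, length.out = 40))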
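As a rough illustration of how the "shape" of the noise could feed into quality control, the sketch below re-simulates the same three-group mixture and then separates the samples in two ways: with a fixed cutoff and with one-dimensional k-means. The seed, the cutoff value of 0.6, and the names `rate`, `cutoff`, `keep`, and `km` are assumptions made for this example and are not part of the original notes. Real pipelines may well use other criteria.

```r
set.seed(1)  # assumed seed, only to make this sketch reproducible

# Same simulated mixture as above: three groups of per-sample success rates
N <- 1000
n1 <- 650
n2 <- 300
rate <- c(rbeta(n1, 50, 2), rbeta(n2, 100, 20), rbeta(N - n1 - n2, 30, 50))

# Option 1: a fixed, hypothetical cutoff chosen by eye from the histogram
cutoff <- 0.6
keep <- rate >= cutoff
table(keep)  # counts of samples failing (FALSE) and passing (TRUE) the cutoff

# Option 2: let the data suggest the groups with 1-D k-means (3 clusters assumed)
km <- kmeans(rate, centers = 3, nstart = 20)
sort(as.vector(km$centers))  # estimated group centers, lowest to highest
table(km$cluster)            # cluster sizes
```

The fixed cutoff is simple but depends on a choice made by eye. The clustering step uses the distribution of success rates itself to propose groups, which is closer to the idea that the noise carries information. It still assumes the number of groups is known.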