Computing Confidence Intervals with R
To highlight how confidence intervals cover in 1-\alpha cases the true parameter I hacked together a little R code based on code from http://paleocave.sciencesortof.com/2013/03/writing-a-for-loop-in-r/ and http://www.cyclismo.org/tutorial/R/confidence.html
- I define a function which takes as parameter „runs“ the number of confidence intervals to compute
- result is the vector of results, it will contain TRUE values for those intervals that cover the true parameter and FALSE for those which don’t
- for each run I draw a sample of 10000 points from N(0,1)
- I then calculate the mean and standard deviation of the sample
- before calculating the limits of the confidence interval, „left“ and „right“
- Since I know the true mu is 0 I can check if 0 falls into this interval, the result of this check is stored in the i-th column of the result vector
- finally I calculate the summary of the result vector. On average 95% percent of all intervals will cover the true mu, 0.
If I find some more time I will add some functionality to run this code on multiple cores as well as a graphical visualisation of the intervals.
confInt <- function(runs){ result<-NULL for (i in 1:runs) { data<-rnorm(10000) n<-length(data) a<-mean(data) s<-sd(data) error <- qnorm(0.975)*s/sqrt(n) left <- a-error right <- a+error result[i] = left<0 & 0<right } result } summary(confInt(100)) |
EDIT: Using some more ggplot2 code I have the graphical visualization ready:
confInt <- function(runs){ x<-1:runs mu<-NULL sigma<-NULL result<-NULL vleft<-NULL vright<-NULL for (i in 1:runs) { data<-rnorm(1000) n<-length(data) a<-mean(data) mu[i]<-a s<-sd(data) sigma[i]<-s error <- qnorm(0.975)*s/sqrt(n) left <- a-error right <- a+error result[i] = left<0 & 0<right vleft[i] = left vright[i] = right } data.frame(x,mu,sigma,result,vleft,vright) } df<-confInt(100) require(ggplot2) myplot<-ggplot(df, aes(x = x, y = mu)) + geom_point(size = 2) + geom_errorbar(aes(ymax = vleft, ymin = vright,colour=result*3)) myplot + theme_bw() summary(df) |
See http://stackoverflow.com/questions/30289894/plotting-confidence-intervals-from-a-dataframe/30290123#30290123 for an alternative solution.