Computing Confidence Intervals with R
To highlight how confidence intervals cover in 1-\alpha cases the true parameter I hacked together a little R code based on code from http://paleocave.sciencesortof.com/2013/03/writing-a-for-loop-in-r/ and http://www.cyclismo.org/tutorial/R/confidence.html
- I define a function which takes as parameter „runs“ the number of confidence intervals to compute
- result is the vector of results, it will contain TRUE values for those intervals that cover the true parameter and FALSE for those which don’t
- for each run I draw a sample of 10000 points from N(0,1)
- I then calculate the mean and standard deviation of the sample
- before calculating the limits of the confidence interval, „left“ and „right“
- Since I know the true mu is 0 I can check if 0 falls into this interval, the result of this check is stored in the i-th column of the result vector
- finally I calculate the summary of the result vector. On average 95% percent of all intervals will cover the true mu, 0.
If I find some more time I will add some functionality to run this code on multiple cores as well as a graphical visualisation of the intervals.
confInt <- function(runs){
result<-NULL
for (i in 1:runs) {
data<-rnorm(10000)
n<-length(data)
a<-mean(data)
s<-sd(data)
error <- qnorm(0.975)*s/sqrt(n)
left <- a-error
right <- a+error
result[i] = left<0 & 0<right
}
result
}
summary(confInt(100)) |
EDIT: Using some more ggplot2 code I have the graphical visualization ready:
confInt <- function(runs){
x<-1:runs
mu<-NULL
sigma<-NULL
result<-NULL
vleft<-NULL
vright<-NULL
for (i in 1:runs) {
data<-rnorm(1000)
n<-length(data)
a<-mean(data)
mu[i]<-a
s<-sd(data)
sigma[i]<-s
error <- qnorm(0.975)*s/sqrt(n)
left <- a-error
right <- a+error
result[i] = left<0 & 0<right
vleft[i] = left
vright[i] = right
}
data.frame(x,mu,sigma,result,vleft,vright)
}
df<-confInt(100)
require(ggplot2)
myplot<-ggplot(df, aes(x = x, y = mu)) +
geom_point(size = 2) +
geom_errorbar(aes(ymax = vleft, ymin = vright,colour=result*3))
myplot + theme_bw()
summary(df) |
See http://stackoverflow.com/questions/30289894/plotting-confidence-intervals-from-a-dataframe/30290123#30290123 for an alternative solution.
