Posts tagged ‘confidence interval’

Computing Confidence Intervals with R

To highlight how confidence intervals cover in 1-\alpha cases the true parameter I hacked together a little R code based on code from http://paleocave.sciencesortof.com/2013/03/writing-a-for-loop-in-r/ and http://www.cyclismo.org/tutorial/R/confidence.html

  • I define a function which takes as parameter „runs“ the number of confidence intervals to compute
  • result is the vector of results, it will contain TRUE values for those intervals that cover the true parameter and FALSE for those which don’t
  • for each run I draw a sample of 10000 points from N(0,1)
  • I then calculate the mean and standard deviation of the sample
  • before calculating the limits of the confidence interval, „left“ and „right“
  • Since I know the true mu is 0 I can check if 0 falls into this interval, the result of this check is stored in the i-th column of the result vector
  • finally I calculate the summary of the result vector. On average 95% percent of all intervals will cover the true mu, 0.

If I find some more time I will add some functionality to run this code on multiple cores as well as a graphical visualisation of the intervals.

confInt <- function(runs){
	result<-NULL
	for (i in 1:runs) {
		data<-rnorm(10000)
		n<-length(data)
		a<-mean(data)
		s<-sd(data)
		error <- qnorm(0.975)*s/sqrt(n)
		left <- a-error
		right <- a+error
		result[i] = left<0 & 0<right
	}
	result
}
summary(confInt(100))

EDIT: Using some more ggplot2 code I have the graphical visualization ready:

confInt <- function(runs){
	x<-1:runs
	mu<-NULL
	sigma<-NULL
	result<-NULL
	vleft<-NULL
	vright<-NULL
 
	for (i in 1:runs) {
		data<-rnorm(1000)
		n<-length(data)
		a<-mean(data)
		mu[i]<-a
		s<-sd(data)
		sigma[i]<-s
		error <- qnorm(0.975)*s/sqrt(n)
		left <- a-error
		right <- a+error
 
		result[i] = left<0 & 0<right
		vleft[i] = left	
		vright[i] = right
}
	data.frame(x,mu,sigma,result,vleft,vright)
}
 
 
df<-confInt(100)
 
require(ggplot2)
 
myplot<-ggplot(df, aes(x = x, y = mu)) +
  geom_point(size = 2) +
  geom_errorbar(aes(ymax = vleft, ymin = vright,colour=result*3))
myplot + theme_bw() 
summary(df)

See http://stackoverflow.com/questions/30289894/plotting-confidence-intervals-from-a-dataframe/30290123#30290123 for an alternative solution.

Uwe

Uwe Ziegenhagen likes LaTeX and Python, sometimes even combined. Do you like my content and would like to thank me for it? Consider making a small donation to my local fablab, the Dingfabrik Köln. Details on how to donate can be found here Spenden für die Dingfabrik.

More Posts - Website