uweziegenhagen.de

Computing Confidence Intervals with R

2015-05-17, 16:55

To highlight how confidence intervals cover in 1-\alpha cases the true parameter I hacked together a little R code based on code from http://paleocave.sciencesortof.com/2013/03/writing-a-for-loop-in-r/ and http://www.cyclismo.org/tutorial/R/confidence.html

I define a function which takes as parameter „runs“ the number of confidence intervals to compute
result is the vector of results, it will contain TRUE values for those intervals that cover the true parameter and FALSE for those which don’t
for each run I draw a sample of 10000 points from N(0,1)
I then calculate the mean and standard deviation of the sample
before calculating the limits of the confidence interval, „left“ and „right“
Since I know the true mu is 0 I can check if 0 falls into this interval, the result of this check is stored in the i-th column of the result vector
finally I calculate the summary of the result vector. On average 95% percent of all intervals will cover the true mu, 0.

If I find some more time I will add some functionality to run this code on multiple cores as well as a graphical visualisation of the intervals.

confInt <- function(runs){
	result<-NULL
	for (i in 1:runs) {
		data<-rnorm(10000)
		n<-length(data)
		a<-mean(data)
		s<-sd(data)
		error <- qnorm(0.975)*s/sqrt(n)
		left <- a-error
		right <- a+error
		result[i] = left<0 & 0<right
	}
	result
}
summary(confInt(100))

EDIT: Using some more ggplot2 code I have the graphical visualization ready:

confInt <- function(runs){
	x<-1:runs
	mu<-NULL
	sigma<-NULL
	result<-NULL
	vleft<-NULL
	vright<-NULL
 
	for (i in 1:runs) {
		data<-rnorm(1000)
		n<-length(data)
		a<-mean(data)
		mu[i]<-a
		s<-sd(data)
		sigma[i]<-s
		error <- qnorm(0.975)*s/sqrt(n)
		left <- a-error
		right <- a+error
 
		result[i] = left<0 & 0<right
		vleft[i] = left	
		vright[i] = right
}
	data.frame(x,mu,sigma,result,vleft,vright)
}
 
 
df<-confInt(100)
 
require(ggplot2)
 
myplot<-ggplot(df, aes(x = x, y = mu)) +
  geom_point(size = 2) +
  geom_errorbar(aes(ymax = vleft, ymin = vright,colour=result*3))
myplot + theme_bw() 
summary(df)

See http://stackoverflow.com/questions/30289894/plotting-confidence-intervals-from-a-dataframe/30290123#30290123 for an alternative solution.

Uwe

Uwe Ziegenhagen likes LaTeX and Python, sometimes even combined. Do you like my content and would like to thank me for it? Consider making a small donation to my local fablab, the Dingfabrik Köln. Details on how to donate can be found here Spenden für die Dingfabrik.

Computing Confidence Intervals with R

Uwe

Links

Seiten

Kategorien

Meta

uweziegenhagen.de

Computing Confidence Intervals with R

Uwe

Links

Seiten

Kategorien

Meta

Schlagwörter