Interpreting Confidence Intervals

The general idea of any confidence interval is that we have an unknown value in the population and we want to get a good estimate of its value. Using the theory associated with sampling distributions and the empirical rule, we are able to come up with a range of possible values, and this is what we call a “confidence interval”.

When interpreting the meaning, the key word to understand is “confidence”. If I am 95% confident the true mean is between 4 and 10, what am I actually saying?

Thinking About the Meaning of “95% Confident”

Let’s use an example to understand some possible interpretations in context. Suppose that we have a good sample (one collected using sound sampling techniques) of 45 people who work in a particular city. The people in our sample took an average of 21 minutes to get to work one way, with a standard deviation of 9 minutes.

Calculating a 95% confidence interval for the population mean using a t-interval, we get (18.3, 23.7). To start understanding the interval, we will look at some common misconceptions:

  • FALSE INTERPRETATION: “95% of the 45 workers take between 18.3 and 23.7 minutes to get to work”.

    While we used a sample to get the estimate, we are no longer talking about the sample. The confidence interval is about the mean time for ALL the workers in the city, not about the individual times of the 45 people we sampled.

  • FALSE INTERPRETATION: “There is a 95% chance that the mean time it takes all workers in this city to get to work is between 18.3 and 23.7 minutes”.

    This is a very common misconception! It seems very close to true, but it isn’t because the population mean value is fixed. So, it is either in the interval or not. This is subtle but important.
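
For readers who want to check the arithmetic, the interval (18.3, 23.7) from the example can be reproduced with a short Python sketch. The function name t_interval and the critical value 2.0154 (read from a t-table for 44 degrees of freedom) are assumptions added here for illustration; the article itself does not show the calculation.

```python
from math import sqrt

def t_interval(sample_mean, sample_sd, n, t_critical):
    """Return (lower, upper) for a t-interval around the sample mean."""
    standard_error = sample_sd / sqrt(n)   # spread of the sample mean
    margin = t_critical * standard_error   # margin of error
    return sample_mean - margin, sample_mean + margin

# Assumed critical value: t* for 95% confidence with n - 1 = 44 degrees
# of freedom, taken from a t-table (approximately 2.0154).
T_CRITICAL_95_DF44 = 2.0154

lower, upper = t_interval(sample_mean=21, sample_sd=9, n=45,
                          t_critical=T_CRITICAL_95_DF44)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")  # prints 95% CI: (18.3, 23.7)
```

The margin of error here is about 2.7 minutes, which is the distance from the sample mean of 21 to either end of the interval.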

What is correct?

If we repeated the sampling many times and calculated a confidence interval in this way each time, about 95% of those intervals would contain the true mean and about 5% would not. Because the true mean (population mean) is an unknown value, we don’t know whether our particular interval is one of the 95% or one of the 5%. BUT 95% is pretty good, so we say something like:
“We are 95% confident that the mean time it takes all workers in this city to get to work is between 18.3 and 23.7 minutes.” This is a common shorthand for the idea that the calculations “work” 95% of the time.
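
The idea that the calculations “work” 95% of the time can be checked with a small simulation. The sketch below is only an illustration under made-up assumptions: it pretends the population of commute times is normal with a true mean of 22 minutes and a standard deviation of 9 minutes (values invented for the demonstration), draws many samples of 45, and counts how often the t-interval captures the true mean. The name coverage_rate and the t-table value 2.0154 are likewise assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev

def coverage_rate(true_mean, true_sd, n, t_critical, trials, seed=0):
    """Fraction of simulated t-intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from the hypothetical population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        center = mean(sample)
        margin = t_critical * stdev(sample) / sqrt(n)
        # The true mean is fixed; it is the interval that varies.
        if center - margin <= true_mean <= center + margin:
            hits += 1
    return hits / trials

# Hypothetical population values, chosen only for this demonstration.
rate = coverage_rate(true_mean=22.0, true_sd=9.0, n=45,
                     t_critical=2.0154, trials=2000)
print(f"Intervals containing the true mean: {rate:.1%}")
```

The printed rate should land close to 95%. Notice that in each trial the true mean never moves; what changes from sample to sample is the interval, which is why the 95% describes the method rather than any single interval.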

Remember that we can’t have a 100% confidence interval. By definition, the population mean is not known. If we could calculate it exactly, we would! But that would require a census of our population, which is often not possible or feasible.

Why Don’t We Always Use a 99% Confidence Level?

Seems to make sense, right? Get the confidence level as high as you can! Well, as the confidence level increases, the margin of error increases. That means the interval is wider. So, it may be that the interval is so wide it is useless! For example, what if I said that I am 99% confident that you will score between a 10 and a 100 on your next exam? How useful is that in predicting your performance? The interval is simply too wide. There are some instances where it doesn’t matter as much, but that is decided on a case-by-case basis.
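
To see the trade-off in numbers, the sketch below computes the margin of error for the commute example (standard deviation 9, sample size 45) at several confidence levels. As a simplifying assumption it uses normal (z) critical values from Python's standard library instead of t values, so the results are slightly smaller than a t-interval would give; the function name margin_of_error is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sd, n, confidence):
    """Approximate margin of error using a normal (z) critical value."""
    # Two-sided interval: put half of the leftover probability in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)

for level in (0.80, 0.90, 0.95, 0.99):
    moe = margin_of_error(sd=9, n=45, confidence=level)
    print(f"{level:.0%} confidence: 21 +/- {moe:.2f} minutes")
```

Each step up in confidence level produces a larger margin of error, so the interval keeps getting wider.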

For this reason, 95% confidence intervals are the most common. You will sometimes see 80% or others in textbooks, but in real applications it’s almost always a 95% interval with occasional 90% and 99% intervals being used.

CI for Parameters Other than the Mean

Confidence intervals can be calculated for many other population parameters, and the interpretation remains generally the same. Using the shorthand “we are 95% confident that…”, we state that we are “pretty sure” that the parameter (the mean, the population proportion, etc.) is within the given range.

As an example, suppose we have a 99% confidence interval of (0.122, 0.141) for the proportion of likely voters that approve a new measure. Then we could say:

“We are 99% confident that the proportion of all likely voters that approve the new measure is between 0.122 and 0.141”

Better yet, we could say:

“We are 99% confident that the percentage of all likely voters that approve the new measure is between 12.2% and 14.1%”

This is easier to explain to someone who hasn’t had statistics. In fact, anytime you see a poll on the news with a margin of error, they are likely talking about a confidence interval! They just tend to give the point estimate and the margin of error instead of the whole interval.

For example, when they say “a poll found 52% of people approve of the president” and you see on the screen “margin of error: 2%”, then you know they are talking about a confidence interval of (50%, 54%). They tend not to give the confidence level, as that is a bit technical for a general news broadcast.
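
Moving between the two ways of reporting (point estimate with margin of error, or the full interval) is simple arithmetic. The helper names in this sketch, interval_from_poll and poll_from_interval, are hypothetical and exist only to illustrate the conversion.

```python
def interval_from_poll(estimate, margin):
    """Turn 'estimate +/- margin' into a (lower, upper) interval."""
    return estimate - margin, estimate + margin

def poll_from_interval(lower, upper):
    """Recover the point estimate and margin of error from an interval."""
    estimate = (lower + upper) / 2   # midpoint of the interval
    margin = (upper - lower) / 2     # half the interval's width
    return estimate, margin

# The news poll: 52% approval with a 2-point margin of error.
print(interval_from_poll(52, 2))  # prints (50, 54)

# The voter example, in percent: (12.2%, 14.1%).
estimate, margin = poll_from_interval(12.2, 14.1)
print(f"{estimate:.2f}% +/- {margin:.2f} percentage points")
```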