There are two types of confidence intervals we will talk about here: z-intervals and t-intervals. Both are used to estimate an unknown population mean, but each is used in a different context. While the theory behind how confidence intervals work and how to interpret them is important, this article will focus mainly on the procedures used in their calculation.
The z-interval procedure is often used in textbooks as an introduction to the idea of confidence intervals, but it is rarely used for actual estimation in the real world. Even so, it is common enough that we will talk about it here!
What makes it strange? Well, in order to use a z-interval, we assume that \(\sigma\) (the population standard deviation) is known. As you can imagine, if we don’t know the population mean (that’s what we are trying to estimate), then how would we know the population standard deviation? Setting that aside, the general rule for when to use a z-interval calculation is:
The sample size is greater than or equal to 30 and the population standard deviation is known, OR the original population is normal and the population standard deviation is known.
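If these conditions hold, we will use this formula for calculating the confidence interval:
\[\bar{x} \pm z^{*}\frac{\sigma}{\sqrt{n}}\]
where \(\bar{x}\) is the sample mean, \(z^{*}\) is a critical value from the normal distribution (see below), \(\sigma\) is the population standard deviation, and \(n\) is the sample size.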
Common values of the critical value \(z^{*}\) are:
|Confidence Level|Critical Value|
|---|---|
|90%|1.645|
|95%|1.96|
|99%|2.576|
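The critical values for common confidence levels can also be computed rather than memorized. Here is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is our own and not part of any library.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value of the standard normal for a confidence level in (0, 1)."""
    # Split the leftover area (1 - confidence) evenly between the two tails.
    upper_tail = (1 - confidence) / 2
    # inv_cdf returns the point with the given area to its left.
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: {z_critical(level):.3f}")
```

The function returns the point that leaves half of the "non-confidence" area in each tail, which is why a 95% level looks up the 97.5th percentile.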
Let’s try it out with an example!
Suppose that in a sample of 50 college students in Illinois, the mean credit card debt was $346. Suppose that we also have reason to believe (from previous studies) that the population standard deviation of credit card debts for this group is $108. Use this information to calculate a 95% confidence interval for the mean credit card debt of all college students in Illinois.
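Since we wish to estimate the mean, we immediately know we will be using either a t-interval or a z-interval. Looking a bit closer, we see that we have a large sample size (50) and we know the population standard deviation. Therefore, we will use a z-interval with critical value \(z^{*} = 1.96\):
\[346 \pm 1.96\left(\frac{108}{\sqrt{50}}\right)\]
The \(\pm\) indicates that we need to perform two different operations: a subtraction and an addition.
Left hand endpoint:
\[346 - 1.96\left(\frac{108}{\sqrt{50}}\right) \approx 316.1\]
Right hand endpoint:
\[346 + 1.96\left(\frac{108}{\sqrt{50}}\right) \approx 375.9\]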
This gives our 95% confidence interval for \(\mu\) as (316.1, 375.9). The interpretation of this interval is “We are 95% confident that the mean amount of credit card debt for all college students in Illinois is between $316.10 and $375.90.” Of course, this is a very particular statement, so please make sure you study how to interpret confidence intervals in general so you can understand exactly what this means!
Another way to present this interval would be to calculate the margin of error, \(1.96\left(\frac{108}{\sqrt{50}}\right) \approx 29.9\), and write \(346 \pm 29.9\).
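The credit card debt example can be checked with a few lines of code. This is a sketch only, assuming Python's standard library; the function name `z_interval` and its parameters are our own choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper, margin_of_error) for a z-interval on the mean."""
    # Critical value from the standard normal distribution.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Margin of error: critical value times the standard error of the mean.
    margin = z_star * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin

# Values from the example: 50 students, mean $346, population SD $108.
lower, upper, margin = z_interval(sample_mean=346, sigma=108, n=50, confidence=0.95)
print(f"({lower:.1f}, {upper:.1f}), margin of error {margin:.1f}")
```

Returning the margin of error alongside the endpoints makes it easy to report the interval in either form.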
The much more realistic scenario is using a t-interval to estimate an unknown population mean. This interval relies on our sample standard deviation in calculating the margin of error. All this means for us is that the formula will be very similar, but the critical value will no longer come from the normal distribution. Instead, it will come from Student’s t-distribution. What assumptions are made for using a t-interval? They mirror the z-interval conditions, except that the population standard deviation is unknown: either the sample size is greater than or equal to 30, or the original population is normal.
The formula for a t-interval is:
\[\bar{x} \pm t^{*}\frac{s}{\sqrt{n}}\]
where \(t^{*}\) is a critical value from the t-distribution, \(s\) is the sample standard deviation, and \(n\) is the sample size. The value of \(t^{*}\) depends on the sample size through the use of “degrees of freedom”, where \(df = n - 1\). We will use this to look up the value of \(t^{*}\) in a t-table (free versions are available online, and one is typically in the back of your textbook if you are currently taking a class).
Suppose that a sample of 38 employees at a large company were surveyed and asked how many hours a week they thought the company wasted on unnecessary meetings. The mean number of hours these employees stated was 12.4 with a standard deviation of 5.1. Calculate a 99% confidence interval to estimate the mean amount of time all employees at this company believe is wasted on unnecessary meetings each week.
As before, since we are estimating a mean with a confidence interval, we know it will either be a t-interval or a z-interval. In this case, we have a large sample but we only have the sample standard deviation. If you aren’t sure of that, read closely: the standard deviation of 5.1 was given in the context of the sample. Thus, we will go ahead and use a t-interval since \(\sigma\) is unknown.
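The choice between the two procedures can be written down as a small decision function. This is only a sketch of the rule of thumb used in this article (a sample of at least 30 or a normal population, with the population standard deviation known or unknown); the name `choose_interval` and its parameters are hypothetical.

```python
def choose_interval(n, sigma_known, population_normal=False):
    """Return "z", "t", or None according to the rule of thumb in this article."""
    # Both procedures need a large sample (n >= 30) or a normal population.
    if n < 30 and not population_normal:
        return None
    # Known population standard deviation -> z-interval; otherwise t-interval.
    return "z" if sigma_known else "t"

# Meetings example: 38 employees, only the sample standard deviation is given.
print(choose_interval(n=38, sigma_known=False))
```

For the meetings example, the function selects the t-interval, matching the reasoning above.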
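Before we can do that, however, we need to look up the critical value. To know which row in the t-table to look at, we find the degrees of freedom, which is \(df = n - 1 = 38 - 1 = 37\). Using a t-table with 37 degrees of freedom and a 99% confidence level, the critical value is \(t^{*} \approx 2.715\).
Now that we have that, we plug the values into the formula and do the calculations to get our two endpoints.
Left hand endpoint:
\[12.4 - 2.715\left(\frac{5.1}{\sqrt{38}}\right) \approx 10.2\]
Right hand endpoint:
\[12.4 + 2.715\left(\frac{5.1}{\sqrt{38}}\right) \approx 14.6\]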
99% Confidence Interval for \(\mu\): (10.2, 14.6).
As an interpretation, we could say that “we are 99% confident that the mean amount of time that all employees at this company think is wasted on meetings each week is between 10.2 and 14.6 hours.” The same warning applies here: make sure you take the time to truly study what this means (I think that’s the third time I have linked to that article!).
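The meetings example can be verified the same way. In this sketch the critical value is passed in rather than computed, because Python's standard library has no t-distribution; the value 2.715 is the approximate table entry for 37 degrees of freedom at 99% confidence, and the function name `t_interval` is our own.

```python
from math import sqrt

def t_interval(sample_mean, s, n, t_star):
    """Return (lower, upper, df) for a t-interval, given a table critical value."""
    # Degrees of freedom tell us which row of the t-table t_star came from.
    df = n - 1
    # Margin of error uses the sample standard deviation s.
    margin = t_star * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin, df

# Values from the example: 38 employees, mean 12.4 hours, sample SD 5.1 hours.
lower, upper, df = t_interval(sample_mean=12.4, s=5.1, n=38, t_star=2.715)
print(f"df = {df}: ({lower:.1f}, {upper:.1f})")
```

Keeping the critical value as an argument mirrors the by-hand procedure: look it up first, then plug it into the formula.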
Confidence intervals are most often calculated with statistical software packages like SAS, SPSS, or R, with Excel, or even with a graphing calculator. It is helpful to calculate them by hand once or twice to get a feel for the concept, but you should also take the time to learn how to calculate them using one of these common tools. Which tool you use depends on the course you are taking or the field you are working in.