Confidence Intervals
Q OK then what is the point of taking a sample mean, since it tells us nothing?
A Slow down. It does not tell us nothing at all; it just gives no information with absolute certainty (unless, of course, our sample consists of the whole population). However, the larger the sample size, the more confident we can be that the population mean lies "fairly close" to the sample mean we obtained. This idea of "confidence" as opposed to "certainty" is what we will make precise here.
To understand what this is all about, you should know something about sampling distributions, where we learned about the
Central Limit Theorem If the population distribution has mean μ and standard deviation $σ,$ then, for sufficiently large $n,$ the sampling distribution of $\bar{x}$ is approximately normal, with mean
$μ_{\bar{x}} = μ$
and standard deviation
$σ_{\bar{x}} =\frac{σ}{\sqrt{n}}.$
Notice that, as the sample size gets larger, the standard deviation gets smaller. Thus, the sample means tend to be very close to the population mean, resulting in a single, narrow peak at $μ$ as shown in the distribution curves below.
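The shrinking spread of the sample means can be checked numerically. The following is a minimal sketch, assuming (purely for illustration) a population that is uniform on $[0, 1],$ so that $μ = 0.5$ and $σ = \sqrt{1/12}$; the function name `sample_mean_spread`, the number of trials, and the seed are hypothetical choices, not part of the text above.

```python
import random
import statistics

def sample_mean_spread(n, trials=5000, seed=0):
    """Draw `trials` samples of size n from a uniform [0, 1] population
    (an assumed population) and return the mean and standard deviation
    of the resulting sample means."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.random() for _ in range(n))
             for _ in range(trials)]
    return statistics.fmean(means), statistics.pstdev(means)

sigma = (1 / 12) ** 0.5  # standard deviation of the uniform [0, 1] population

for n in (5, 30, 100):
    mean_of_means, sd_of_means = sample_mean_spread(n)
    # Compare the observed spread with the predicted value sigma / sqrt(n).
    print(n, round(mean_of_means, 4), round(sd_of_means, 4),
          round(sigma / n ** 0.5, 4))
```

For each $n,$ the last two printed columns should be close to each other, and both should shrink as $n$ grows.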
Q OK what has this got to do with the original question about student spending on the Internet?
$95.45%$ of the sample means will lie within two standard deviations of the population mean (because $P(μ-2σ_{\bar{x}} ≤ \bar{x} ≤ μ+2σ_{\bar{x}}) = 0.9545$).
Thus,
If we take a large number of sample means, $95.45%$ of the time, the distance between $\bar{x}$ and $μ$ will be less than two standard deviations (of the sampling distribution) -- that is, $\bar{x}$ will lie within a distance $2σ/\sqrt{n}$ of $μ.$
Or,
If we take a large number of sample means, $95.45%$ of the time, the (unknown) population mean is between $\bar{x} - 2σ/\sqrt{n}$ and $\bar{x} + 2σ/\sqrt{n}.$
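The figure $0.9545$ is the area under the standard normal curve between $-2$ and $2.$ As a quick check, here is a short sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the standard normal curve between z = -2 and z = 2.
p_within_two = standard_normal.cdf(2) - standard_normal.cdf(-2)
print(round(p_within_two, 4))  # 0.9545
```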
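For a confidence level other than $95.45%,$ we replace the $2$ above by the appropriate value $z_{α/2}.$ Then the interval we want is given by the following formula:
Large Sample $100(1-α)%$ Confidence Interval
$\bar{x} - z_{α/2}\frac{σ}{\sqrt{n}} ≤ μ ≤ \bar{x} + z_{α/2}\frac{σ}{\sqrt{n}}$
$\bar{x} =$ sample mean
$n =$ sample size
$σ =$ population standard deviation
$z_{α/2} = z$-value with an area of $α/2$ to its right (obtained from a table).
Note: When (as is often the case) we don't know the population standard deviation $σ,$ we can approximate it by the sample standard deviation $s,$ and obtain the following (good) approximation of the confidence interval:
$\bar{x} - z_{α/2}\frac{s}{\sqrt{n}} ≤ μ ≤ \bar{x} + z_{α/2}\frac{s}{\sqrt{n}}$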
$\color{blue}{z_{.1}}$ | $\color{blue}{z_{.05}}$ | $\color{blue}{z_{.025}}$ | $\color{blue}{z_{.01}}$ | $\color{blue}{z_{.005}}$ | $\color{blue}{z_{.001}}$ | $\color{blue}{z_{.0005}}$ |
$1.282$ | $1.645$ | $1.960$ | $2.326$ | $2.576$ | $3.090$ | $3.291$ |
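The tabulated $z$-values can also be computed directly rather than looked up. The sketch below uses the inverse cumulative distribution function from Python's standard library; the helper name `z_upper` is a hypothetical name introduced here for illustration.

```python
from statistics import NormalDist

def z_upper(tail_area):
    """Return the z-value with the given area to its right
    under the standard normal curve."""
    return NormalDist().inv_cdf(1 - tail_area)

for tail_area in (0.1, 0.05, 0.025, 0.01, 0.005, 0.001, 0.0005):
    print(tail_area, round(z_upper(tail_area), 3))
```

For a $100(1-α)%$ confidence interval, the value needed is `z_upper(α/2)`; for example, $95%$ confidence uses `z_upper(0.025)`.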
Here is an example where you can put the above formula to use.
Your hot sauce company rates its sauce on a scale of spiciness of 1 to 20. A sample of $50$ bottles of hot sauce is taste-tested, resulting in a mean of $12$ and a sample standard deviation of $2.5.$ Find a $95%$ confidence interval for the spiciness of your hot sauce.
Solution
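Here $n = 50,$ $\bar{x} = 12,$ and $s = 2.5.$ For a $95%$ confidence interval, $α = 0.05,$ so $z_{α/2} = z_{.025} = 1.960.$ Since we do not know $σ,$ we use $s$ in its place: $12 ± 1.960(2.5)/\sqrt{50} ≈ 12 ± 0.6930,$ giving the confidence interval $[11.3070, 12.6930].$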
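The calculation for this example can be sketched in a few lines of Python. The function name `large_sample_ci` and its parameters are hypothetical, chosen here for illustration; the sketch uses the sample standard deviation in place of $σ,$ as the note on the large-sample formula allows.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_ci(sample_mean, sd, n, confidence=0.95):
    """Large-sample confidence interval for the population mean.
    `sd` is the population standard deviation if known, otherwise
    the sample standard deviation used as an approximation."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z-value with area alpha/2 to its right
    margin = z * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hot sauce example: n = 50, sample mean 12, sample standard deviation 2.5.
low, high = large_sample_ci(12, 2.5, 50)
print(round(low, 3), round(high, 3))  # approximately 11.307 and 12.693
```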
Q How do I interpret this confidence interval?
Example 2 Illustration of Confidence Intervals
Pressing "Generate Samples" will give a window showing the indicated number of samples of size $n = 30$ together with the $90%$ confidence interval for each, and whether it contains the population mean $0.5.$ Approximately $90%$ of the confidence intervals given should contain the population mean of $0.5.$ Thus, you should average $18$ "yes"s for every $20$ samples.
Before we go on... Notice that, since the distribution we are sampling from is not normal (it is uniform), we need fairly large samples to ensure that the distribution of the sample means is approximately normal, as assumed in our formulation of confidence intervals. Notice also that we use the theoretical population standard deviation in computing each interval rather than the sample standard deviation. We could equally well have used the sample standard deviations instead.
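The experiment described in this example can be imitated offline. The following is a rough sketch, assuming the uniform population is the one on $[0, 1]$ (which has mean $0.5$ and standard deviation $\sqrt{1/12}$); that choice, the function name `coverage`, the number of samples, and the seed are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(num_samples=2000, n=30, confidence=0.90, seed=1):
    """Fraction of confidence intervals that contain the population mean,
    sampling from an assumed uniform [0, 1] population and using the
    theoretical population standard deviation."""
    rng = random.Random(seed)
    mu, sigma = 0.5, sqrt(1 / 12)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(num_samples):
        xbar = fmean(rng.random() for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / num_samples

rate = coverage()
print(rate)  # should be close to 0.90
```

Any single run gives a fraction near, but rarely exactly equal to, $0.90$; that variation is the point of the illustration.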
When we are dealing with small samples, we cannot invoke the Central Limit Theorem. Hence, we cannot use our formula for confidence intervals unless we are sampling from a normally distributed random variable.
However, there is one further issue: if we know the population standard deviation $σ,$ then all is well and good, and we can go ahead and use the above formula for the confidence interval for small samples (assuming, of course, that we are sampling from a normally distributed variable). But if, as is usually the case, we do not know $σ,$ and we go ahead and use the sample standard deviation $s$ instead, we will tend to obtain confidence intervals that are too narrow. The reason is that, while the sampling distribution of $(\bar{x}-μ)/σ$ is normal (provided $x$ is normal), the sampling distribution of $(\bar{x}-μ)/s$ is not normal (unless we are dealing with large samples, in which case it is approximately normal).
Q Why care about the sampling distribution of $(\bar{x}-μ)/s$?
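The claim that substituting $s$ for $σ$ gives intervals that are too narrow for small samples can be tested by simulation. The sketch below assumes a standard normal population ($μ = 0,$ $σ = 1$) and samples of size $5$; these values, the function name `coverage_rates`, and the seed are illustrative assumptions. It builds nominal $95%$ intervals with the $z$-value in two ways, once using $σ$ and once using $s,$ and counts how often each contains $μ.$

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

def coverage_rates(n=5, trials=4000, seed=2):
    """Return (rate using sigma, rate using s): the fractions of nominal
    95% z-intervals that contain mu, for small samples drawn from an
    assumed standard normal population."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    z = NormalDist().inv_cdf(0.975)  # z-value with area 0.025 to its right
    hits_sigma = hits_s = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar, s = fmean(sample), stdev(sample)
        hits_sigma += abs(xbar - mu) <= z * sigma / sqrt(n)
        hits_s += abs(xbar - mu) <= z * s / sqrt(n)
    return hits_sigma / trials, hits_s / trials

rate_with_sigma, rate_with_s = coverage_rates()
print(rate_with_sigma, rate_with_s)
```

The first rate should come out near $0.95,$ while the second should fall noticeably short of it, which is the shortfall a different table of values is needed to correct.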
Small Sample $100(1-α)%$ Confidence Interval
When the Population Standard Deviation $σ$ is Known:
$\bar{x} - z_{α/2}\frac{σ}{\sqrt{n}} ≤ μ ≤ \bar{x} + z_{α/2}\frac{σ}{\sqrt{n}}$
$n =$ sample size
$σ =$ population standard deviation
$z_{α/2} = z$-value with an area of $α/2$ to its right (obtained from a table).
When Only the Sample Standard Deviation $s$ is Known:
$\bar{x} - t_{α/2}\frac{s}{\sqrt{n}} ≤ μ ≤ \bar{x} + t_{α/2}\frac{s}{\sqrt{n}}$
$n =$ sample size
$s =$ sample standard deviation
$t_{α/2} = t$-value with an area of $α/2$ to its right (obtained from a table of the $t$-distribution, using $n-1$ degrees of freedom).
Let us try this out on the following variant of the "Hot Sauce" Example above.
Example 3 More Hot Sauce
When the CEO of your hot sauce company was informed that the spiciness of the hot sauce averages only $12,$ he was furious and ordered instant adjustments to the recipe, threatening to fire the whole sauce division unless the average spiciness increased to above $13.$ Yesterday, you randomly sampled $8$ bottles of the new sauce and found an average spiciness of $13.5$ with a sample standard deviation of $0.75.$
Solution
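(a) Here $n = 8,$ $\bar{x} = 13.5,$ and $s = 0.75.$ Since the sample is small and only $s$ is known, we use the $t$-value with $n - 1 = 7$ degrees of freedom, $t_{.025} = 2.365,$ which gives the $95%$ confidence interval $13.5 ± 2.365(0.75)/\sqrt{8},$ or $[12.8729, 14.1271].$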
(b) The calculation is almost identical to the one above, except for the value $s = 0.58,$ which gives the new confidence interval $[13.0150, 13.9850].$ Since this interval does not contain $13,$ we can be $95%$ confident that the mean spiciness of all the sauce is above $13.$
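Both parts of this example can be reproduced with a short sketch. The $t$-value $2.365$ (area $0.025$ to its right, $7$ degrees of freedom) is taken from a standard $t$ table and hard-coded, since Python's standard library has no $t$-distribution; the constant and function names are hypothetical.

```python
from math import sqrt

# t-value with area 0.025 to its right and 8 - 1 = 7 degrees of freedom,
# copied from a standard t table (hard-coded assumption).
T_025_DF7 = 2.365

def small_sample_ci(sample_mean, s, n, t_value):
    """Small-sample confidence interval when only the sample standard
    deviation s is known; t_value must match n - 1 degrees of freedom."""
    margin = t_value * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

for s in (0.75, 0.58):
    low, high = small_sample_ci(13.5, s, 8, T_025_DF7)
    print(s, round(low, 4), round(high, 4), "contains 13:", low <= 13 <= high)
```

With $s = 0.75$ the interval still contains $13,$ whereas with $s = 0.58$ it lies entirely above $13,$ matching the conclusion in part (b).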