What happens to the margin of error as the sample size increases

Definition:

Margin of error, in statistics, is the degree of error in results obtained from random sampling surveys. A higher margin of error means the results of a survey or poll are less reliable, i.e. there is less confidence that the results represent the population. It is a vital tool in market research because it indicates how much confidence researchers should place in the data obtained from surveys.

A confidence interval is the range of values within which a population statistic is expected to fall at a given confidence level. It is usually used together with the margin of error to show how much confidence a statistician has that the results of an online survey or online poll represent the entire population.

A lower margin of error indicates higher confidence in the produced results.

When we select a representative sample to estimate a full population, the estimate carries some uncertainty. We need to infer the true population statistic from the sample statistic, so our estimate will be close to, but not exactly equal to, the actual figure. Taking the margin of error into account makes this estimate more informative.

Margin of Error Calculation:

A well-defined population is a prerequisite for calculating margin of error. In statistics, a “population” comprises all the elements of a particular group that a researcher intends to study and collect data on. The margin of error can be significantly high if the population is not defined or if the sample is not selected properly.

Every time a researcher conducts a statistical survey, a margin of error calculation is required. The formula for the margin of error of a sample proportion is

\text {Margin of error}=z\times \sqrt { \dfrac { \hat { p } (1-\hat { p } ) }{ n } }

where:

\hat { p } = sample proportion (“P-hat”).

n = sample size

z = z-score corresponding to your desired confidence level.


Example for margin of error calculation

For example, wine tasting sessions conducted in vineyards are dependent on the quality and taste of the wines presented during the session. These wines represent the entire production and depending on how well they’re received by the visitors, the feedback from them is generalized to the entire production.

The wine tasting will be effective only when visitors do not have a pattern, i.e. they’re chosen randomly. Wine goes through a process to be palatable and similarly, the visitors also must go through a process to provide effective results.

The measurement components show whether the wine bottles are fit to represent the winery’s entire production. Suppose a statistician states that the survey has a margin of error of plus or minus 5% at a 93% confidence level. This means that if the survey were conducted 100 times with vineyard visitors, the feedback received would fall within 5 percentage points above or below the reported percentage in about 93 of those 100 surveys.

In this case, suppose 60 out of 100 visitors report that the wines were extremely good. With a margin of error of plus or minus 5% at a 93% confidence level, it is safe to conclude that the number of visitors in 100 who rate the wines “extremely good” will be between 55 and 65, 93% of the time.

To explain this further, take a survey on volunteering that was sent to 1000 respondents, of whom 500 agreed with the statement that volunteering makes life better. Calculate the margin of error for a 95% confidence level.

Step 1: Calculate P-hat by dividing the number of respondents who agreed with the statement by the total number of respondents. In this case, \hat { p } = 500/1000 = 50%

Step 2: Find the z-score corresponding to the 95% confidence level. In this case, the z-score is 1.96

Step 3: Calculate the margin of error by putting these values in the formula: 1.96\times \sqrt { \dfrac { 0.5\times (1-0.5) }{ 1000 } } \approx 0.031

Step 4: Convert to a percentage: 0.031 = 3.1%
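The four steps above can be sketched in Python using only the standard library. This is a minimal sketch, not part of the original article: the function name `margin_of_error_proportion` is made up for illustration, and the z-score is taken from `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error_proportion(p_hat, n, confidence=0.95):
    """Margin of error for a sample proportion: z * sqrt(p_hat * (1 - p_hat) / n)."""
    # Two-sided critical value: z-score with cumulative probability 1 - alpha / 2.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


# Volunteering survey from the text: 500 of 1000 respondents agreed.
p_hat = 500 / 1000
moe = margin_of_error_proportion(p_hat, n=1000, confidence=0.95)
print(f"p-hat = {p_hat:.2f}, margin of error = {moe:.4f} ({moe:.1%})")
```

For the survey above this gives a margin of error of roughly 3.1%, so the share of people who agree would be reported as 50% plus or minus about 3.1 percentage points.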

Margin of error in sample sizes:

In probability sampling, each member of a population has a known probability of being selected to be a part of the sample. With this method, researchers and statisticians can select members from their area of research so that the margin of error in the data received from these samples is as small as possible.

In non-probability sampling, samples are formed on the basis of cost-effectiveness or convenience rather than applicability, and because of this selection process some sections of the population may be excluded. Surveys are effective only when members are filtered according to their interests and relevance to the survey being conducted.

The industry standard for confidence level is 95%, and these are the margin of error percentages for certain survey sample sizes:

As indicated in this table, to halve the margin of error, for instance from about 4% to about 2%, the sample size has to be increased considerably, from 500 to 2000. The margin of error is inversely proportional to the square root of the sample size, so quadrupling the sample only halves the error. Up to sample sizes of about 1500 the margin of error decreases significantly, but beyond that the gains become smaller.
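To see how the margin of error shrinks as the sample grows, the short sketch below recomputes it for a handful of sample sizes. The sample sizes and the choice of p-hat = 0.5 (the most conservative value) are assumptions made for illustration; they are not taken from the original table.

```python
from math import sqrt
from statistics import NormalDist

# Assumptions (not from the original table): p-hat = 0.5 and a 95% confidence level.
z = NormalDist().inv_cdf(0.975)


def moe_percent(n, p_hat=0.5):
    """Margin of error, in percentage points, for a sample of size n."""
    return 100 * z * sqrt(p_hat * (1 - p_hat) / n)


sample_sizes = [100, 500, 1000, 1500, 2000]  # illustrative sizes
for n in sample_sizes:
    print(f"n = {n:>5}: +/- {moe_percent(n):.1f}%")

# Quadrupling the sample size (500 -> 2000) halves the margin of error.
ratio = moe_percent(500) / moe_percent(2000)
print(f"moe(500) / moe(2000) = {ratio:.2f}")
```

Under these assumptions the margin of error at n = 500 is about 4.4% and at n = 2000 about 2.2%, which matches the "from 4 to 2" pattern described above.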

While you are learning statistics, you will often have to focus on a sample rather than the entire population. This is because it is extremely costly, difficult and time-consuming to study the entire population. The best you can do is to take a random sample from the population – a sample that is a ‘true’ representative of it. You then carry out some analysis using the sample and make inferences about the population.

Since the inferences are made about the population by studying the sample taken, the results cannot be entirely accurate. The degree of accuracy depends on the sample taken – how the sample was selected, what the sample size is, and other concerns. Common sense would say that if you increase the sample size, the chances of error will be less because you are taking a greater proportion of the population. A larger sample is likely to be a closer representative of the population than a smaller one.

Let’s consider an example. Suppose you want to study the scores obtained in an examination by students in your college. It may be time-consuming for you to study the entire population, i.e. all students in your college. Hence, you take a sample of, say, 100 students and find the average score of those 100 students. This is the sample mean. Now, when you use this sample mean to make an inference about the population mean, you won’t be able to get the exact population mean. There will be some “margin of error”.

You will now learn the answers to some important questions: what is the margin of error, what is the method of calculating margins of error, how do you find the critical value, and how do you decide between t-scores and z-scores. Thereafter, you’ll be given some margin of error practice problems to make the concepts clearer.

What is Margin of Error?

The margin of error can best be described as the range of values on both sides (above and below) of the sample statistic. For example, if the sample average score of students is 80 and you state that the average score of students is 80 ± 5, then 5 is the margin of error.

Calculating Margins of Error

For calculating margins of error, you need to know the critical value and sample standard error. This is because it’s calculated using those two pieces of information.

The formula goes like this:

margin of error = critical value * sample standard error.

How do you find the critical value, and how to calculate the sample standard error? Below, we’ll discuss how to get these two important values.

How Do You Find the Critical Value?

For finding critical value, you need to know the distribution and the confidence level. For example, suppose you are looking at the sampling distribution of the means. Here are some guidelines.

  1. If the population standard deviation is known, use z distribution.
  2. If the population standard deviation is not known, use t distribution where degrees of freedom = n-1 (n is the sample size). Note that for other sampling distributions, degrees of freedom can be different and should be calculated differently using appropriate formula.
  3. If the sample size is large, then use z distribution (following the logic of Central Limit Theorem).

It is important to know the distribution to decide what to use – t-scores vs z-scores.

Caution – when your sample size is large and the population is not stated to be normal, the Central Limit Theorem lets you treat the sampling distribution of the mean as approximately normal and use a z-score. However, when the sample size is small and the population is not stated to be normal, you cannot conclude anything about the normality of the sampling distribution, and neither a z-score nor a t-score can be used.
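The guidelines and the caution above can be summarized as a small decision function. This is only a sketch: the function name `choose_distribution` is hypothetical, and the cutoff of 30 for a "large" sample is a common rule of thumb assumed here, since the text does not state a threshold.

```python
def choose_distribution(n, sigma_known, population_normal, large_n=30):
    """Return "z", "t", or None following the guidelines for a sample mean.

    large_n=30 is an assumed rule-of-thumb cutoff; the text gives no number.
    """
    if n >= large_n:
        return "z"   # large sample: Central Limit Theorem applies
    if not population_normal:
        return None  # small sample, normality not given: neither applies
    return "z" if sigma_known else "t"


print(choose_distribution(n=400, sigma_known=False, population_normal=False))
print(choose_distribution(n=25, sigma_known=False, population_normal=True))
print(choose_distribution(n=12, sigma_known=False, population_normal=False))
```

The three calls cover a large sample, a small sample from a normal population with unknown standard deviation, and a small sample where normality is not given.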

When finding the critical value, confidence level will be given to you. If you are creating a 90% confidence interval, then confidence level is 90%, for 95% confidence interval, the confidence level is 95%, and so on.

Here are the steps for finding critical value:

Step 1: First, find alpha (the level of significance). \alpha =1-\text {Confidence level}.

For 95% confidence level, \alpha =0.05

For 99% confidence level, \alpha =0.01

Step 2: Find the critical probability p*. Critical probability will depend on whether we are creating a one-sided confidence interval or a two-sided confidence interval.

For two-sided confidence interval, p*=1-\dfrac { \alpha }{ 2 }

For one-sided confidence interval, p*=1-\alpha

Step 3: Decide between t-scores and z-scores. For a z-statistic, find the z-score having a cumulative probability of p*. For a t-statistic, find the t-score having a cumulative probability of p* and the calculated degrees of freedom. This will be the critical value. To find these critical values, you should use a calculator or the respective statistical tables.
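As a rough illustration of these steps for the z case, the sketch below computes alpha, the critical probability p*, and the matching z-score with Python's standard library. The helper name `critical_z` is made up for this example. The standard library has no t distribution, so a t critical value would still come from a table or a third-party statistics package.

```python
from statistics import NormalDist


def critical_z(confidence, two_sided=True):
    """z-score whose cumulative probability equals the critical probability p*."""
    alpha = 1 - confidence                      # level of significance
    if two_sided:
        p_star = 1 - alpha / 2                  # two-sided critical probability
    else:
        p_star = 1 - alpha                      # one-sided critical probability
    return NormalDist().inv_cdf(p_star)         # z with cumulative probability p*


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} two-sided: z = {critical_z(level):.3f}")
print(f"95% one-sided: z = {critical_z(0.95, two_sided=False):.3f}")
```

The two-sided values for 95% and 99% come out at about 1.960 and 2.576, the same critical values used in the practice problems.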

Sample Standard Error

Sample standard error can be calculated using population standard deviation or sample standard deviation (if population standard deviation is not known). For sampling distribution of means:

Let the sample standard deviation be denoted by s, the population standard deviation by \sigma, and the sample size by n.

\text {Sample standard error}=\dfrac { \sigma }{ \sqrt { n } }, if \sigma is known

\text {Sample standard error}=\dfrac { s }{ \sqrt { n } }, if \sigma is not known

Depending on the sampling distributions, the sample standard error can be different.

Having looked at everything that is required to create the margin of error, you can now directly calculate a margin of error using the formula we showed you earlier:

Margin of error = critical value * sample standard error.

Some Relationships

1. Confidence level and margin of error

As the confidence level increases, the critical value increases and hence the margin of error increases. This is intuitive; the price paid for a higher confidence level is that the margin of error increases. If this were not so, and a higher confidence level meant a lower margin of error, nobody would choose a lower confidence level. There are always trade-offs!

2. Sample standard deviation and margin of error

Sample standard deviation measures the variability in the sample. The more variability in the sample, the greater the sample standard error and therefore the greater the margin of error.

3. Sample size and margin of error

This was discussed in the Introduction section. It is intuitive that a greater sample size will be a closer representative of the population than a smaller sample size. Hence, the larger the sample size, the smaller the sample standard error and therefore the smaller the margin of error.
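The three relationships can be checked numerically. The sketch below uses a z-based margin of error for a sample mean and changes one input at a time; the baseline values (standard deviation 10, sample size 100) are illustrative and not taken from the article.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(confidence, std_dev, n):
    """z-based, two-sided margin of error for a sample mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * std_dev / sqrt(n)


# Illustrative baseline, then one change at a time.
base = margin_of_error(confidence=0.95, std_dev=10, n=100)
higher_confidence = margin_of_error(confidence=0.99, std_dev=10, n=100)
more_variability = margin_of_error(confidence=0.95, std_dev=20, n=100)
larger_sample = margin_of_error(confidence=0.95, std_dev=10, n=400)

print(f"baseline (95%, s=10, n=100): {base:.3f}")
print(f"confidence raised to 99%:    {higher_confidence:.3f}")
print(f"std deviation doubled:       {more_variability:.3f}")
print(f"sample size quadrupled:      {larger_sample:.3f}")
```

Raising the confidence level or the standard deviation widens the margin, while a larger sample narrows it; quadrupling the sample size halves it.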


Margin of Error Practice Problems

Example 1

25 students in their final year were selected at random from a high school for a survey. Among the survey participants, it was found that the average GPA (Grade Point Average) was 2.9 and the standard deviation of GPA was 0.5. What is the margin of error, assuming 95% confidence level? Give correct interpretation.

Step 1: Identify the sample statistic.

Since you need to find the confidence interval for the population mean, the sample statistic is the sample mean which is the average GPA = 2.9.

Step 2: Identify the distribution – t, z, etc. – and find the critical value based on whether you need a one-sided confidence interval or a two-sided confidence interval.

Since population standard deviation is not known and the sample size is small, use a t distribution.

\text {Degrees of freedom}=n-1=25-1=24.

\alpha=1-\text {Confidence level}=1-0.95=0.05

Let the critical probability be p*.

For two-sided confidence interval,

p*=1-\dfrac { \alpha }{ 2 } =1-\dfrac { 0.05 }{ 2 } =0.975.

The critical t value for cumulative probability of 0.975 and 24 degrees of freedom is 2.064.

Step 3: Find the sample standard error.

\text{Sample standard error}=\dfrac { s }{ \sqrt { n } } =\dfrac { 0.5 }{ \sqrt { 25 } } =0.1

Step 4: Find margin of error using the formula:

Margin of error = critical value * sample standard error

= 2.064 * 0.1 = 0.2064

Interpretation: With 95% confidence, the population average GPA lies within 0.2064 points of the sample average GPA of 2.9, i.e. between 2.6936 and 3.1064.
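Example 1 can be double-checked with a few lines of Python. In this sketch the critical t value 2.064 is simply copied from the t table as in the solution, because the standard library does not provide a t distribution; the variable names are illustrative.

```python
from math import sqrt

n = 25
sample_mean = 2.9
s = 0.5
t_critical = 2.064  # from a t table: cumulative probability 0.975, 24 degrees of freedom

standard_error = s / sqrt(n)            # 0.5 / 5
moe = t_critical * standard_error
lower, upper = sample_mean - moe, sample_mean + moe

print(f"standard error = {standard_error:.1f}, margin of error = {moe:.4f}")
print(f"95% confidence interval: ({lower:.4f}, {upper:.4f})")
```

The interval printed here is the sample mean minus and plus the margin of error, which is how the margin of error is normally reported.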

Example 2

400 students at Princeton University are randomly selected for a survey aimed at finding out the average time students spend in the library in a day. Among the survey participants, it was found that the average time spent in the university library was 45 minutes and the standard deviation was 10 minutes. Assuming a 99% confidence level, find the margin of error and give the correct interpretation of it.

Step 1: Identify the sample statistic.

Since you need to find the confidence interval for the population mean, the sample statistic is the sample mean which is the mean time spent in the university library = 45 minutes.

Step 2: Identify the distribution – t, z, etc. and find the critical value based on whether the need is a one-sided confidence interval or a two-sided confidence interval.

The population standard deviation is not known, but the sample size is large. Therefore, use a z (standard normal) distribution.

\alpha=1-\text{Confidence level}=1-0.99=0.01

Let the critical probability be p*.

For two-sided confidence interval,

p*=1-\dfrac { \alpha }{ 2 } =1-\dfrac { 0.01 }{ 2 } =0.995.

The critical z value for cumulative probability of 0.995 (as found from the z tables) is 2.576.

Step 3: Find the sample standard error.

\text{Sample standard error}=\dfrac { s }{ \sqrt { n } } =\dfrac { 10 }{ \sqrt { 400 } } =0.5

Step 4: Find margin of error using the formula:

Margin of error = critical value * sample standard error

= 2.576 * 0.5 = 1.288

Interpretation: With 99% confidence, the population mean time spent in the library lies within 1.288 minutes of the sample mean of 45 minutes, i.e. between 43.712 and 46.288 minutes.
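For Example 2 the critical value itself can also be computed, since the z distribution is available in Python's standard library. A minimal sketch, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, s = 400, 45.0, 10.0   # minutes, from Example 2
confidence = 0.99

# z-score with cumulative probability 1 - alpha / 2 = 0.995
z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
standard_error = s / sqrt(n)          # 10 / 20
moe = z_critical * standard_error

print(f"z = {z_critical:.3f}, margin of error = {moe:.3f} minutes")
print(f"99% confidence interval: ({sample_mean - moe:.3f}, {sample_mean + moe:.3f})")
```

Using the unrounded z-score gives a margin of error that rounds to the same 1.288 minutes found with the table value 2.576.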

Example 3

Consider a setup similar to Example 1, with slight changes. You randomly select X students in their final year from a high school for a survey. Among the survey participants, it was found that the average GPA (Grade Point Average) was 3.1 and the standard deviation of GPA was 0.7. What should be the value of X (in other words, how many students should you select for the survey) if you want the margin of error to be at most 0.1? Assume a 95% confidence level and a normal distribution.

Step 1: Find the critical value.

\alpha=1-\text{Confidence level}=1-0.95=0.05

Let the critical probability be p*.

For two-sided confidence interval,

p*=1-\dfrac { \alpha }{ 2 } =1-\dfrac { 0.05 }{ 2 } =0.975.

The critical z value for cumulative probability of 0.975 is 1.96.

Step 2: Find the sample standard error in terms of X.

\text{Sample standard error}=\dfrac { s }{ \sqrt { X } }=\dfrac { 0.7 }{ \sqrt { X } }

Step 3: Find X using the margin of error formula:

Margin of error = critical value * sample standard error

0.1=1.96*\dfrac { 0.7 }{ \sqrt { X } }

This gives X=188.24.

Thus, a sample of 189 students should be taken so that the margin of error is at most 0.1.
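Solving the margin of error formula for the sample size gives X = (critical value * s / margin of error)^2, rounded up to the next whole number. The sketch below applies that rearrangement to Example 3; the function name `required_sample_size` is hypothetical, and the z-score is computed rather than read from a table.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(std_dev, max_margin, confidence=0.95):
    """Smallest n for which z * std_dev / sqrt(n) is at most max_margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up: a fractional respondent is not possible, and rounding down
    # would leave the margin of error slightly above the target.
    return ceil((z * std_dev / max_margin) ** 2)


n = required_sample_size(std_dev=0.7, max_margin=0.1, confidence=0.95)
print(n)
```

With the values from Example 3 this returns 189, in agreement with the hand calculation.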

Conclusion

The margin of error is an extremely important concept in statistics. This is because it is difficult to study the entire population and sampling is not free from sampling errors. The margin of error is used to create confidence intervals, and most of the time results are reported in the form of a confidence interval for a population parameter rather than just a single value. In this article, you made a beginning by learning to answer questions like what is the margin of error, what is the method of calculating margins of error, and how to interpret these calculations. You also learned to decide between t-scores and z-scores and gained information about finding critical values. Now you know how to use the margin of error for constructing confidence intervals, which are widely used in statistics and econometrics.
