By The Numbers: How to easily calculate the correct sample size

Abstract

The author offers a table for calculating the correct sample size to reach a desired confidence level and margin of error. Includes a link to an online table generator.

Editor’s note: Paul C. Boyd is a principal of the Research Advisors, Franklin, Mass.

There are various formulas for calculating the required sample size for a study. These formulas require knowledge of the variance or proportion in the population and a decision about the maximum acceptable error, as well as the acceptable Type I error risk. For example, sample size formulas were addressed in a July 1999 Quirk’s article by Gang Xu (“Estimating sample size for a descriptive study in quantitative research”).
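
As an illustration, one common textbook form of such a formula, for estimating a proportion, can be sketched in a few lines of Python. This is not necessarily the form used in the article cited above. The function name and its default arguments are hypothetical, and the critical value is taken from the standard normal distribution for a two-sided interval.

```python
import math
from statistics import NormalDist

def infinite_population_sample_size(confidence=0.95, margin=0.05, proportion=0.5):
    """Sample size for estimating a proportion, ignoring population size."""
    # Two-sided critical value of the standard normal distribution
    # (the Type I error risk is 1 - confidence).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # proportion * (1 - proportion) is the variance of a yes/no response.
    return z ** 2 * proportion * (1 - proportion) / margin ** 2

n0 = infinite_population_sample_size()
print(round(n0, 2))   # 384.15
print(math.ceil(n0))  # 385 when rounded up to whole respondents
```

Note that the population size does not appear anywhere in this calculation.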

Such formulas are typically presented in a way that sidesteps the issue of population size - they assume that the sample is to be drawn from an infinitely large population. Even so, these formulas may appear unnecessarily complex to researchers who deal with these issues only occasionally. The question they ask is simple: Isn’t there an easier way?

The answer is yes. After all, why bother with formulas when you don’t have to? Many researchers prefer a simple table to assist them with the determination of the appropriate sample size(s) for their studies.

It is possible to use one of these formulas to construct a table that suggests the optimal sample size - given a population size, a specific margin of error and a desired confidence level. This can help researchers avoid the formulas altogether and simplify the process of determining appropriate sample sizes. The accompanying table presents the results of one set of these calculations. This table may easily be used to determine the appropriate sample size for almost any study.

For business and social science research the first column of sample sizes within the table is usually considered acceptable (confidence level = 95 percent, margin of error = ±5 percent). To use the table, simply find the size of the population from which the sample is to be drawn in the left-most column (use the next highest value if the exact population size is not listed) and then identify the value in the next column. The value in this column is the sample size that is required to generate a margin of error of ±5 percent for any population proportion with a 95 percent confidence level. Should more precision be required (i.e., a smaller margin of error) or greater confidence desired (99 percent), the other columns of the table should be employed.
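
For readers curious about how such a column can be produced, the following is a minimal Python sketch based on the Krejcie and Morgan formula cited later in this article. It is not the author's spreadsheet. The function name `sample_size`, the chosen population sizes and the round-to-nearest convention are assumptions made for illustration.

```python
from statistics import NormalDist

def sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Krejcie and Morgan (1970) sample size for a finite population."""
    # For one degree of freedom, chi-square equals the squared two-sided z value.
    chi_sq = NormalDist().inv_cdf(1 - (1 - confidence) / 2) ** 2
    spread = proportion * (1 - proportion)
    n = chi_sq * population * spread / (margin ** 2 * (population - 1) + chi_sq * spread)
    # Assumption: round to the nearest whole respondent (round up to be conservative).
    return round(n)

# A few rows of the 95 percent, ±5 percent column.
populations = [100, 1000, 10000, 1000000]
column = {N: sample_size(N) for N in populations}
for N, n in column.items():
    print(f"{N:>9,}  {n:>4}")
```

Under these assumptions the required sample grows very slowly once the population passes a few thousand, which is why the formulas that ignore population size work well for large populations.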

Thus, if you have 5,000 customers and want a sample large enough to estimate, with 95 percent confidence and within ±2.5 percent, the proportion who, say, would be repeat customers, you would need responses from a (random) sample of 1,176 of your customers.

As you can see, using the table is much simpler than employing a formula. (A dynamic version of the table is available for download as an Excel spreadsheet that allows the user to change the margin of error, confidence level and/or the population size. Visit http://research-advisors.com/documents/SampleSize-web.xls.)

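Suppose these customers were divided into two subgroups - Group A, consisting of 1,500 customers, and Group B, consisting of 3,500 - and you wanted to estimate the proportion separately for each group. In order to maintain the same levels of confidence and precision (95 percent and ±2.5 percent) you would need random samples of 759 from Group A and 1,068 from Group B. Note that the combined total (1,827) is larger than the 1,176 needed for the customer base as a whole, because each subgroup must be treated as a population in its own right.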
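
The figures in the customer examples above (1,176 for all 5,000 customers, 759 for Group A and 1,068 for Group B) can be checked with a short Python sketch of the Krejcie and Morgan formula. The function and variable names are hypothetical, and rounding to the nearest whole respondent is an assumption chosen because it reproduces the article's figures.

```python
from statistics import NormalDist

def sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Krejcie and Morgan (1970) sample size for a finite population."""
    # For one degree of freedom, chi-square equals the squared two-sided z value.
    chi_sq = NormalDist().inv_cdf(1 - (1 - confidence) / 2) ** 2
    spread = proportion * (1 - proportion)
    n = chi_sq * population * spread / (margin ** 2 * (population - 1) + chi_sq * spread)
    # Assumption: round to the nearest whole respondent.
    return round(n)

# 95 percent confidence, ±2.5 percent margin of error.
all_customers = sample_size(5000, margin=0.025)
group_a = sample_size(1500, margin=0.025)
group_b = sample_size(3500, margin=0.025)

print(all_customers)      # 1176
print(group_a, group_b)   # 759 1068
print(group_a + group_b)  # 1827
```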

Resist the temptation to choose a lower level of confidence (e.g., 95 percent) or a larger margin of error (e.g., 5 percent) solely to minimize the required sample size. As with all statistical procedures, the confidence level should be determined by the consequences of drawing the wrong conclusion due to sampling error. Likewise, the margin of error should be chosen based on the usefulness of the resulting interval (and remember that the interval width is twice the margin of error).

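The formula used for these calculations is the one given by Krejcie and Morgan in their article “Determining Sample Size for Research Activities”:

$$ n = \frac{X^2 N P (1 - P)}{d^2 (N - 1) + X^2 P (1 - P)} $$

where $n$ is the required sample size, $X^2$ is the chi-square value for one degree of freedom at the desired confidence level (about 3.84 for 95 percent), $N$ is the population size, $P$ is the population proportion (assumed to be 0.50, which gives the largest sample size) and $d$ is the margin of error expressed as a proportion (0.05 for ±5 percent).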

All of the sample sizes discussed here are the largest that could be required for the desired level of confidence and margin of error, because they assume that 50 percent of the population has the characteristic of interest. Should the proportion with the desired characteristic be substantially different from 50 percent, then the desired level of accuracy can be established with a smaller sample. However, since you typically can’t know what this percentage is until you actually ask a sample, it is wisest to assume that it will be 50 percent and use the listed larger sample size.
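
The effect of the assumed proportion can be seen in a brief Python sketch of the Krejcie and Morgan formula. The population of 5,000 and the alternative proportions of 20 and 80 percent are hypothetical values chosen for illustration, as are the function name and the round-to-nearest convention.

```python
from statistics import NormalDist

def sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Krejcie and Morgan (1970) sample size for a finite population."""
    # For one degree of freedom, chi-square equals the squared two-sided z value.
    chi_sq = NormalDist().inv_cdf(1 - (1 - confidence) / 2) ** 2
    # This product peaks at proportion = 0.5, the worst case.
    spread = proportion * (1 - proportion)
    n = chi_sq * population * spread / (margin ** 2 * (population - 1) + chi_sq * spread)
    # Assumption: round to the nearest whole respondent.
    return round(n)

# Population of 5,000 at 95 percent confidence and ±5 percent.
by_proportion = {p: sample_size(5000, proportion=p) for p in (0.5, 0.2, 0.8)}
print(by_proportion)  # {0.5: 357, 0.2: 234, 0.8: 234}
```

Because the product P(1 - P) is symmetric around 0.5, proportions of 20 and 80 percent call for the same sample size, and both are smaller than the 50 percent worst case.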

References

Krejcie, Robert V. and Morgan, Daryle W. “Determining Sample Size for Research Activities.” Educational and Psychological Measurement 30 (1970): 607-610.

Xu, Gang. “Estimating Sample Size for a Descriptive Study in Quantitative Research.” Quirk’s Marketing Research Review, June 1999.