Choosing the right sample for statistically significant results

How do you conduct an accurate national survey when there are more than 330 million people living in the United States? It would be impossible to send a survey to every single person, but you can use probability sampling to get data that’s just as good, even if it comes from a much smaller group.

Probability sampling is a sampling technique that involves randomly selecting a small group of people (a sample) from a larger population, and then predicting the likelihood that all their responses put together will match those of the overall population.

There are two important requirements when it comes to probability sampling:

- Everyone in your population must have a non-zero chance of being selected. (In the simplest designs, everyone has an equal chance of receiving a survey.)
- You must know, specifically, what that chance of being selected is for each person. (For example, you might determine that in a population of 100 people, each person’s chance of receiving a survey is 1 in 100. Being able to represent each person’s chance of selection as a probability is at the core of probability sampling.)

Following these two rules will help you choose appropriately (i.e., randomly) from your sampling frame, which is the list of everyone in your population who can be sampled. Random selection is key: probability sampling is all about making sure everyone has a known, non-zero chance of being included. Whether you pick names out of a hat, draw straws, or use a more complex random selection process, this helps ensure that the sample you end up with is representative of the population as a whole.

With the right sample, you can achieve results that are just as valuable as those you might get from a far bigger survey effort. From there, you can draw valid conclusions based on the sample’s wants, needs, or opinions and take action that makes sense for the entire population.

There are several sampling methods that fall under the umbrella of probability sampling. These methods not only vary based on the type of research you’re doing and the type of data you want to yield, but also the amount of time you have to conduct your research and the tools you have at your disposal. Here are the four main types of probability sampling approaches that researchers use:

In simple random sampling, all members of the population have an equal chance of being selected and the selection is done randomly. To achieve this, researchers may use tools like a random number generator to select participants from the overall population. However, while simple random sampling is, as the name indicates, the simplest sampling strategy, it is also prone to sampling error. For example, the smaller your sample is, the more likely it is that a purely random draw will over- or under-represent some groups by chance.
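
As a rough illustration, here is a minimal Python sketch of a simple random draw. The sampling frame (ID numbers 1 to 1,000), the sample size of 50, and the function name `simple_random_sample` are all assumptions made up for this example; the seed is there only so the draw can be repeated.

```python
import random


def simple_random_sample(frame, sample_size, seed=None):
    """Draw sample_size members from the frame, without replacement."""
    rng = random.Random(seed)  # seeded only so the example is reproducible
    return rng.sample(frame, sample_size)


# Hypothetical sampling frame: ID numbers for a population of 1,000 people.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 50, seed=42)

# Every member had the same known chance of selection: 50 / 1000 = 0.05.
selection_probability = len(sample) / len(frame)
print(len(sample), selection_probability)
```

Because the draw is made without replacement, nobody can appear in the sample twice, and the selection probability can be stated up front as the sample size divided by the frame size.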

Many populations can be divided into smaller groups based on specific characteristics that don’t overlap but represent the entire population when put together. With stratified random sampling, you would draw a sample from each of these groups (or strata) separately. This allows you to make sure that every subgroup is properly represented, which can lead to more precise results than simple random sampling.

It’s common to stratify by characteristics like sex, age, income bracket, or ethnicity. The strata must be clearly defined and mutually exclusive, meaning every individual in the population is assigned to exactly one group. Once you’ve split your population into strata, you would then use simple random sampling to select individuals from each group, in proportion to that group’s share of the total population. Those individuals would then be combined into a single sample.
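
The proportional approach described above could be sketched in Python roughly as follows. The population of 1,000 people, the three age groups, and the helper name `stratified_sample` are invented for illustration. Rounding each stratum's allocation is a simplifying assumption and can make the total differ slightly from the requested sample size when the shares don't divide evenly.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(population, stratum_of, sample_size, seed=None):
    """Allocate sample_size across strata proportionally, then draw at random within each."""
    rng = random.Random(seed)

    # Group the population into mutually exclusive strata.
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)

    sample = []
    for members in strata.values():
        # Proportional allocation: the stratum's share of the population.
        n = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical population: 500, 300, and 200 people in three age groups.
population = (
    [{"id": i, "age_group": "18-34"} for i in range(500)]
    + [{"id": i, "age_group": "35-54"} for i in range(500, 800)]
    + [{"id": i, "age_group": "55+"} for i in range(800, 1000)]
)

sample = stratified_sample(population, lambda p: p["age_group"], 100, seed=1)
counts = Counter(p["age_group"] for p in sample)
print(counts)  # 50, 30, and 20 people from the three groups
```

The sample mirrors the 50/30/20 split of the population by construction, which is exactly what a simple random draw of 100 people cannot guarantee.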

Like stratified sampling, cluster sampling also involves separating the population into subgroups, or clusters. But that’s where the two probability sampling methods diverge. With cluster sampling, each cluster should have similar characteristics to the population. Instead of selecting individuals from each and every cluster, you would begin by randomly selecting entire clusters. If possible, you might include every individual from each selected cluster in your final sample. If the clusters are too large, you would need to randomly select individuals from each cluster.
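
Here is one way the two variants might look in Python, as a sketch under made-up assumptions: 20 hypothetical stores serve as clusters, each with 200 customers, and the function name `cluster_sample` and its parameters are illustrative only. Leaving `per_cluster` unset takes everyone in the chosen clusters; setting it draws a random subset from each.

```python
import random


def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Randomly pick whole clusters, then take all or some members of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)

    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None or per_cluster >= len(members):
            sample.extend(members)  # include every individual in the cluster
        else:
            sample.extend(rng.sample(members, per_cluster))  # subsample a large cluster
    return chosen, sample


# Hypothetical clusters: 20 stores with 200 customers each.
clusters = {
    f"store_{s:02d}": [f"store_{s:02d}-customer_{c}" for c in range(200)]
    for s in range(20)
}

chosen, sample = cluster_sample(clusters, n_clusters=4, per_cluster=25, seed=7)
print(chosen, len(sample))  # 4 stores, 100 customers in total
```

Note that the 16 stores that were not chosen contribute nobody to the sample, which is why the clusters need to resemble the population as a whole.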

Researchers often use pre-established and easily available groups as clusters. This is typically based on geographic boundaries, like cities or counties, but it can also be schools or office locations. Cluster sampling is most often used to save costs when surveying populations that are very large or spread out geographically. However, there is more risk of sampling error with cluster sampling. Each cluster is supposed to represent the total population, but this can be difficult to guarantee.

Systematic sampling is similar to simple random sampling, though it’s usually a bit easier to conduct. Each member of the population is assigned a number, then selected at regular intervals to form a sample. (Systematic sampling is also known as interval sampling.) Or, to put it another way, every “nth” individual in the population is selected to be part of the sample.
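
A minimal Python sketch of interval selection is shown below. The frame of 1,000 ID numbers, the interval of 10, and the name `systematic_sample` are assumptions for illustration. Choosing the starting point at random within the first interval is a common refinement assumed here, not something this article prescribes; a fixed starting point would work the same way mechanically.

```python
import random


def systematic_sample(frame, interval, seed=None):
    """Select every interval-th member of the frame, from a random starting point."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random position within the first interval
    return frame[start::interval]


# Hypothetical sampling frame: ID numbers for a population of 1,000 people.
frame = list(range(1, 1001))
sample = systematic_sample(frame, interval=10, seed=3)
print(len(sample), sample[:3])
```

Only one random number is needed for the whole sample; after that, the interval does the rest.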

For example, in a population of 1,000, you might choose every 9th person for your sample. This can be more straightforward than other sampling methods, as there is a clear and systematic approach to picking individuals that doesn’t require a random draw for each one. On the flip side, the resulting selection may not be as random as it would be if every individual were picked with a generator. Additionally, it’s important to ensure that there’s no hidden pattern in the list that may line up with the sampling interval. If there is, the sample will be skewed and you may end up with over- or under-representation of certain groups.

For instance, say you plan to survey employees within a particular organization, and all the employees are listed in alphabetical order. You plan to use systematic sampling to select every 4th employee for your sample. However, if the alphabetical list is also organized by team and seniority, you might end up choosing too many or too few people in senior roles, which would introduce bias into your sample.
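
To see how badly a hidden pattern can distort the result, consider this deliberately extreme Python sketch. The employee list is hypothetical: 25 teams of four people, each team listed with its manager first, so the list repeats every four entries. The names `every_nth` and `manager_share` are made up for the example.

```python
# Hypothetical list: 25 teams of 4, each team listed manager first.
roles_in_team = ["manager", "senior", "mid", "junior"]
employees = [(team, role) for team in range(25) for role in roles_in_team]


def every_nth(frame, interval, start):
    """Select every interval-th entry of the frame, beginning at index start."""
    return frame[start::interval]


def manager_share(group):
    """Fraction of the group whose role is 'manager'."""
    return sum(1 for _, role in group if role == "manager") / len(group)


population_share = manager_share(employees)             # 0.25 of all employees
aligned = manager_share(every_nth(employees, 4, 0))     # 1.0: only managers selected
offset = manager_share(every_nth(employees, 4, 1))      # 0.0: no managers selected
unaligned = manager_share(every_nth(employees, 5, 0))   # 0.25: matches the population

print(population_share, aligned, offset, unaligned)
```

When the interval matches the cycle in the list, the sample contains either all managers or none. An interval that doesn't line up with the cycle, or a list shuffled beforehand, avoids the problem.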

There are several benefits to using probability sampling. Overall, it’s a cost-effective way to survey a large target audience, such as your potential buyers. It’s also advantageous for geographically dispersed populations.

Each type of probability sampling provides its own advantages. For example, simple random and systematic sampling make the implementation process more user-friendly, stratified sampling reduces the researcher’s bias, and cluster sampling limits the variability in a research study. Probability sampling requires little technical expertise when utilizing an agile experience management platform. You can also be as detailed as you want when creating your population sample using stratified sampling or systematic sampling. If you’re working against deadlines, then cluster sampling and simple random sampling are the way to go.

For every advantage, some detail within that benefit might work against your overall efforts. For instance, getting the best possible population sample means doing a little more research, which will take more time and resources. Stratified sampling can ensure that the strata are properly represented, but it may not mirror all the differences within that sample population.

Cluster sampling can separate the population into diverse clusters, but those clusters could have overlapping characteristics. While simple random and systematic sampling can provide quick results, the resulting sample might not be as targeted toward your intended audience.

Probability sampling is ideal for quantitative studies where the goal is to use statistical analysis to draw conclusions about a large population. When it would be too difficult or expensive to survey the entire population, researchers can use this sampling strategy to collect representative data.

Probability sampling is used in a lot of market research to gain insights into a large population. This includes projects like:

- Uncovering consumer usage to inform product development
- Understanding what factors have the greatest impact on purchasing decisions
- Identifying emerging industry categories and players

Even beyond industry tracking, buyer attitudes, and competitive intelligence, probability sampling allows companies to firm up new ideas and improve business by tapping into data that reflects their entire target market.

Say, for example, a chain of coffee shops has 15,000 stores in various geographic locations in the United States. The company is looking to expand its customer loyalty program with additional payment options and new ways for customers to earn rewards. Before it makes any significant updates, however, it wants to know if customers will respond well to the proposed changes.

Reaching out to all customers at its 15,000 coffee shops isn’t feasible, but the company could use a probability sampling approach to create a sample that accurately represents that larger population. The responses received will reveal how customers as a whole feel about the loyalty program update. In turn, everyone from the company’s marketing department to its customer service representatives can use the data to get a better understanding of what further changes need to be made or how to effectively promote the new loyalty program. And if the company wants to ensure that its sample reflects subgroups within the population, such as those defined by gender, age range, or income level, it can use certain types of probability sampling methods like stratified sampling or cluster sampling.

In the example above, probability sampling is a great way to handle a rather large population—in this case, thousands of coffee shops. With true probability samples, having larger samples helps reduce the chance of sampling error, which occurs when you select a sample that does not represent the whole population. And, in general, random sampling can help minimize sampling errors because it uses a systematic, rather than subjective, approach to selecting a sample.
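
To put rough numbers on the effect of sample size, the sketch below uses the standard approximation for the margin of error of a proportion. It assumes a simple random sample, a 95% confidence level (z = 1.96), a population much larger than the sample, and the most conservative case of a 50/50 split. The function name `margin_of_error` is illustrative.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


for n in (100, 400, 2500):
    print(n, round(margin_of_error(n) * 100, 1))
# roughly 9.8, 4.9, and 2.0 percentage points
```

Quadrupling the sample only halves the margin of error, so the gains from adding respondents shrink as the sample grows.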

You never want to knowingly exclude someone in your population from being selected to be part of your sample. Watch out for times when particular groups might be unintentionally prevented from participating.

For example, let’s say you want to understand public opinion on an expansive new immigration law. Will you offer a Spanish language version of your survey? You should. If you don’t, you’ll likely miss out on hearing from a lot of native Spanish speakers who aren’t comfortable answering questions in English, but have views on immigration that would be extremely valuable for your research. If their participation is overlooked, your survey results won’t match up with true public opinion.

Remember, if you can’t give everyone in your population a chance at completing your survey, your sample is no longer a true probability sample and may not represent the population.

Simple random sampling, stratified sampling, cluster sampling, and systematic sampling are all types of probability sampling. But there’s another end of the sampling technique spectrum: non-probability sampling. Even if you’re set on using random selection for your sample, it’s worth knowing the basics of non-probability sampling, including when and why it’s used by researchers.

With non-probability sampling, members of the overall population do not have an equal chance of being part of your sample—and there’s nothing random about how they are selected. In fact, some members will have zero chance of being selected. Where probability sampling is concerned with drawing conclusions about a larger population, non-probability sampling is often used for exploratory and qualitative research that is more focused on hearing from people with specific expertise, experiences, or insights.

Let’s say, for example, that you’re researching local use of mobility ramps and your population of interest is people in your city who use wheelchairs. You don’t have a full list of these people, so probability sampling isn’t an option. However, you meet a few people who agree to participate in your study, and they connect you with other wheelchair users in the area. This non-probability sampling, called snowball sampling, may not involve random selection, but it does have the potential to put you in contact with more people who are relevant for your research.

Non-probability sampling is generally easier and cheaper to conduct, but it also has a higher risk of sampling bias than probability sampling. That’s because the sample selection process is based on the subjective judgment of the researcher, rather than randomization. Plus, the sample size and the end results don’t necessarily have to represent the entire population.

So what are the steps involved in probability sampling? It’s not actually that complicated, but you will need to have clear goals and interests for your study. Pre-planning, and having a thorough understanding of what kind of results you hope to attain, will be extremely helpful when you need to narrow down how you plan to build your sample and why.

First, choose your population of interest. Think through all the people that you’re interested in hearing from, but also be aware of anyone who should be deliberately excluded.

Next, identify your sampling frame. Ideally, your frame should include all members of your population of interest (and no one who is not in your population of interest).

Finally, decide how you will select your sample. Do you want clusters or strata? Do you want all sample members to have equal probabilities of selection? Think about what makes sense for your area of study, your population members, and your resources.

Depending on the population you’re trying to survey, you might have a hard time finding an appropriate sample frame. Even if you have a good frame, deciding on the best selection strategy may force you to make trade-offs between cost, representation, quality, and timeliness.

Getting people to respond to a true probability survey can be difficult if they are uninterested in the survey topic or want to be compensated for the time and effort it takes to complete the survey. It can also be time-consuming. For example, if you’re conducting market research on your own (without the use of tools that help you find and randomly select respondents), creating a larger sample might require a lot of time and effort, and that’s before you get to the analysis portion of your research.

Many of these problems can be solved with non-probability sampling, which (despite its name) still draws from probability and sampling theory to select an appropriate survey sample.

If you have unlimited resources or a small population of interest, probability sampling may not be necessary. But, in most cases, drawing a probability sample will save you time, money, and a lot of frustration. You usually can’t survey everyone, but you can always give everyone the chance to be surveyed; this is what probability sampling accomplishes.
