# How To Create A Random Sample in Excel

→
How To Create A Random Sample in Excel

We get a lot of questions around here about random samples and how to create them (which tells us a thing or two about our customers and how survey smart they are). We have a fast, easy trick for you to generate a random sample right in Excel, but first let’s provide a bit of background about random sampling and why you should use it when you survey.

#### A Crash Course in Sampling

You don’t want to just select a “convenience sample,” the last 20 people who ordered from you, the last 20 customers when they’re listed alphabetically, etc. All these methods are inherently biased. If you sample the last 20 customers for example, they may be your newest customers who are only familiar with your most recent products or website design. By generating a random sample, you’re minimizing the bias that results from picking an convenience sample from your sampling frame.

#### How to Generate a Random Sample Using Excel

This can sound daunting, but you don’t actually need to be a statistician or mathlete to do this. All you need is an Excel spreadsheet! Just put your sampling frame—the customers you have contact info for—into your spreadsheet. Once you have your sampling frame (potential survey respondents) in Excel, you can easily select a random sample of them. For example, if you have 3,000 customers and you would like to select a random sample of 500 to receive a customer satisfaction survey, follow these steps:

1. Add a new column within the spreadsheet and name it Random_number
2. In the first cell underneath your heading row, type “= RAND()”
3. Press “Enter,” and a random number will appear in the cell
4. Copy and paste the first cell into the other cells in this column
5. Once each row contains a random number, sort the records by Random_number column
6. Choose the first 500 emails. Those will be the random 500 out of 3000 emails.

To make sure the number of respondents in your random sample are statistically significant, check out this blog post. Random samples make it easy to get feedback from a large population for any of the following:

There you have it, survey-makers. Have you tried it out and have a use case for us? Share them with us in the comment section below. We’d love to hear your stories.

#### Do a Better Job with SurveyMonkey Enterprise

Give yourself—and the rest of your team—the answers you need to make smarter decisions. Access our most powerful tools with SurveyMonkey Enterprise. Learn More →

• Excel is OK for most cases, but it’s not truly ‘random’ as it uses a calculation to create the numbers.

Using all-organic random numbers is (statistically) better for the enviroment! 🙂

• Tony

I employ a similar principle but I use the formula =mid(cell,3,3) where cell is the email address field i.e. using 3 characters of the email address starting at the third character. I then pick a block of the required number of records starting at some random point in the spreadsheet.

• Dave

Poor clarity here. Two points:
– Bullet points 3 and 4 are identical.
– Paste only works here if it’s Paste Special/Values. Otherwise it just redoes the randomisation: very frustrating.

• Bennett P

Dave – Thanks for pointing this out. We’ve taken out the redundant bullets.

• Paula Worthington

I believe Excel’s default setting is to automatically recalculate formulas on refreshing the file; so users may want to go to File/Options/Formulas to choose manual calculations; then the sorting by random number will “work” just fine.

• GregM

I’ve followed this procedure and everything seems to work, however the random number column does not actually seem to sort lowest to highest or highest to lowest. It gives the cases a good shuffling, but the resulting order, in and of itself, seems random. I’m I doing something wrong?

• Kayte K

Thanks for the question, Greg! Working on an answer for you and will get back to you soon. Thanks for reading!

• Carl-E

Every time you do something in Excel (like sort), it recalculates formulas; this includes the random value function. What you get is a sorting by the original randomization, then a new (different) randomization. So it still looks random. You can check this by trying it with a short list and tracking one of the values.

One way to avoid this was mentioned earlier; generate the random numbers, then insert a blank column. Copy the random column and paste values only instead of a regular paste. Then the new list of random numbers in the second column won’t change, and you can sort by those!

Or you can take it on faith that the first sorting was random. It works fine, it just looks a little funny!

• Kayte K

Wow, thank you so much, Carl-E! This was an Excel mystery but so glad there’s an answer. Greg- I hope this helps you with your project.

• cheap dishwasher

Hi, I just stopped by to review this website. It looks really full of great content and I had
a good time browsing it, thanks for the helpful article!

• Kayte K

You’re welcome! Have a great week.

• Justin

Thanks for the tips. If you generate the random rumbers as suggested originally, then just paste only values in a seperate column(not formulas) you can sort them.

• Kayte K

Awesome, you’re welcome! And thank you for the tip too. 🙂

• Ryan Deschamps

Really helpful thanks. For bootstrapping (popular with big data sampling etc.) you often require replacement sampling of a dataset. That is, you want to take out a random item, put it back and then re-sample (meaning there is a chance that the original item will be included). Any advice on this approach?

• Vince Cavasin

in Excel 2010 (not sure about other versions) when you sort, apparently it recalcs…because the random number changes. does this matter?

• KTsurveymonkey

Hi Vince,

Good question! If it recalcs then you would probably want a whole separate analysis for the random sampling. E.g. Apply the randomization and then copy/paste or cut that same over into a different spreadsheet to apply any sorting to it.

The concept of random sampling is that you want to take a totally random group of users and then look for your trends within that subset of data! If you are applying sorting or filtering to the full data set, then the updated random numbers aren’t really random anymore.

Hope that helps out!

• Kunle

After randomization, “Choose the first 500 emails. Those will be the random 500 out of 3000 emails.” What if you get less than the number of respondents you need, then what do you do next?

• MFsurveymonkey

Hi again Kunle! Your question makes tons of sense! What we would recommend here is that once you’re done collecting the responses, you create a filter rule so you can only view your results by Completeness. This would mean that only the respondents who answered the survey in full would be displayed. I believe this would allow you to then randomly pick the recipients for your analysis. You can see more about this here: http://help.surveymonkey.com/articles/en_US/kb/Filter-Rules
Also, if you want to read more about Statistical Significance, we have a great help article here: http://help.surveymonkey.com/articles/en_US/kb/Significant-Differences