Variant B’s conversion rate (1.14%) was 14% higher than variant A’s conversion rate (1.00%). You can be 95% confident that variant B will perform better than variant A.
In the context of AB testing experiments, statistical significance is how likely it is that the difference between your experiment’s control version and test version isn’t due to error or random chance.
For example, if you run a test with a 95% significance level, you can be 95% confident that the differences are real.
It’s commonly used in business to observe how your experiments affect your business’s conversion rates. In surveys, statistical significance is usually used as a way to ensure you can be confident in your survey results. For example, if you asked people whether they preferred ad concept A or ad concept B in a survey, you’d want to make sure the difference in their results was statistically significant before deciding which one to use.
The first step is to form a hypothesis. For any experiment, there is a null hypothesis, which states there’s no relationship between the two things you’re comparing, and an alternative hypothesis. An alternative hypothesis typically tries to prove that a relationship exists and is the statement you’re trying to back up. If you’re talking about conversion-rate AB testing, your hypothesis may involve adding a button, image, or some copy to a page to see if it affects conversion rates. When you’re using surveys for concept testing, like in the example above, your hypothesis might involve testing different ad variants to see which people find most appealing.
After formulating null and alternative hypotheses, statisticians sometimes do tests to ensure their hypotheses are sound. A z-score evaluates the validity of your null hypothesis. It can tell you if there is, in fact, no relationship between the things you’re comparing. A p-value tells you whether the evidence you have to prove your alternative hypothesis is strong.
When running statistical significance tests, it’s useful to decide whether your test will be one sided or two sided (sometimes called one tailed or two tailed). A one-sided test assumes that your alternative hypothesis will have a directional effect, while a two-sided test accounts for if your hypothesis could have a negative effect on your results, as well. Generally, a two-sided test is the more conservative choice.
Even professional statisticians use statistical modeling software to calculate significance and the tests that back it up, so we won’t delve too deeply into it here. However, if you’re running an AB test, you can use the calculator at the top of the page to calculate the statistical significance of your results. If you’re trying to calculate the significance of your survey results, SurveyMonkey can do it for your automatically.
Try sending a survey to your customers to find out what they’re looking for.