Learn how to identify statistically significant differences in group means, survey results, and A/B test outcomes with a simple t-test.
While anyone can see the difference between two numbers, finding out whether that difference is statistically significant can take more work.
Let’s say you’ve run a customer satisfaction survey at work. Your boss wants to know whether men give your company a lower Net Promoter Score℠ (NPS) than women.
In the data, you see that the average rating from male respondents was 9, compared to an average rating of 12 from female respondents. How can you determine whether 9 is significantly different from 12? This is where t-tests come in.
In this article, we’ll define t-tests and their use cases, share examples of t-tests, and explain how to interpret your results.
A t-test is a statistical test that assesses whether the difference between two means is significant using the t-distribution. It helps you determine whether an observed gap between groups reflects a real difference or is likely due to chance.
Testing for statistical significance is common in concept testing and product testing. In concept testing, A/B tests are commonly used to determine whether one ad concept performs better than another. Similarly, product testing can help determine whether a product will hold its own when launched into the market.
T-tests use specific formulas to compare means and determine whether a difference is statistically significant. The two-sample t-test is the most common in survey analysis:
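As a reference, here is the two-sample t-statistic as it usually appears in textbooks. The notation is an assumption on our part: \bar{x}_1 and \bar{x}_2 are the two group means, s_1^2 and s_2^2 the sample variances, and n_1 and n_2 the group sizes. The first line is the pooled (equal-variance) form; the second is Welch's form, which drops the equal-variance assumption.

```latex
% Pooled (equal-variance) two-sample t-statistic
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}

% Welch's two-sample t-statistic (no equal-variance assumption)
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}
```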
Here are the formulas for the one-sample t-test and paired t-test:
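In standard textbook notation (the symbols are assumed here for illustration), the one-sample test compares a sample mean \bar{x} to a benchmark \mu_0, and the paired test works on the per-respondent differences d between the two measurements, with mean \bar{d} and standard deviation s_d:

```latex
% One-sample t-statistic: s is the sample standard deviation, n the sample size
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}

% Paired t-statistic: n is the number of pairs
t = \frac{\bar{d}}{s_d / \sqrt{n}}
```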
In both the one-sample and paired t-tests, the calculated t-value is compared to a critical value from the t-distribution to assess significance.
Use a t-test when you want to know whether two averages are meaningfully different, not just numerically different, in your survey results. T-tests help you compare group means, evaluate sample differences, and decide whether a gap is statistically significant based on a p-value and confidence level.
Common survey scenarios include:
Use a t-test when you need to assess a difference in means, test a benchmark comparison, or validate a hypothesis with small sample sizes. This makes it a reliable choice for survey analysis, A/B testing, and any situation where you need evidence that a difference in your data is real.
Before you run a t-test, make sure your data meets a few basic assumptions so the results are reliable.
A quick check on these basics helps ensure that any difference you see reflects a real signal, not noise in the data.
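If you work in Python, a quick check might look like the sketch below. It assumes the assumptions in question include roughly normal data and similar variances across groups, and it uses SciPy's `shapiro` and `levene` functions. The ratings are made up for illustration.

```python
from scipy import stats

# Hypothetical ratings for two independent groups (illustrative only)
group_a = [5, 9, 13, 7, 11, 9]
group_b = [12, 8, 16, 10, 14, 12]

# Shapiro-Wilk test: a small p-value hints that a group is not normally distributed
for name, values in (("group_a", group_a), ("group_b", group_b)):
    shapiro_stat, shapiro_p = stats.shapiro(values)
    print(f"{name}: Shapiro-Wilk p = {shapiro_p:.3f}")

# Levene's test: a small p-value hints that the group variances differ
levene_stat, levene_p = stats.levene(group_a, group_b)
print(f"Levene p = {levene_p:.3f}")
```

With samples this small, these checks have little power, so treat them as a rough screen rather than proof that the assumptions hold.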
There are three types of t-tests commonly used by researchers. These t-tests serve different purposes that we’ll explain below.
The one-sample t-test checks whether the mean (average) of data from one group, such as your overall NPS, is different from a value you specify.
Example: Your company’s current average Customer Effort Score (CES) is 4.2. Is a CES of 4.2 significantly lower than the industry standard of 5.0?
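A minimal sketch of this comparison, done by hand with Python's standard library, is shown below. The individual CES responses are invented so that their mean works out to 4.2; only the 4.2 and 5.0 figures come from the example.

```python
import math
import statistics

# Hypothetical CES responses (illustrative only); their mean is 4.2
ces_scores = [4.0, 4.5, 3.5, 5.0, 4.0, 4.5, 3.5, 4.5, 4.0, 4.5]
benchmark = 5.0  # industry standard from the example

n = len(ces_scores)
sample_mean = statistics.mean(ces_scores)
sample_sd = statistics.stdev(ces_scores)  # sample standard deviation (n - 1)

# One-sample t-statistic: (sample mean - benchmark) / standard error
t_stat = (sample_mean - benchmark) / (sample_sd / math.sqrt(n))
df = n - 1  # degrees of freedom for a one-sample test

print(f"mean = {sample_mean:.2f}, t = {t_stat:.2f}, df = {df}")
```

A negative t-statistic here simply means the sample mean sits below the benchmark. SciPy's `scipy.stats.ttest_1samp` should give the same statistic along with a p-value.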
Two-sample t-tests examine whether the means of two independent groups are significantly different from one another. If group variances look unequal or sample sizes are unbalanced, switch to Welch’s t-test (offered by most tools) because it doesn’t assume equal variances.
Example: Your hypothesis is that men give your company a lower NPS than women. The average NPS from male respondents is 9, while the average score from women is 12. Is 9 significantly different from 12?
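In a statistical package, this comparison is usually a one-liner. The sketch below uses SciPy's `ttest_ind` with `equal_var=False`, which runs Welch's t-test. The individual ratings are made up so that the group means match the example's 9 and 12, so the output will not reproduce the figures from a real survey.

```python
from scipy import stats

# Hypothetical ratings (illustrative only); means are 9 and 12
men = [5, 9, 13, 7, 11, 9]
women = [12, 8, 16, 10, 14, 12]

# equal_var=False requests Welch's t-test, which does not assume equal variances
result = stats.ttest_ind(men, women, equal_var=False)

print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```

Passing `equal_var=True` instead would run the classic pooled two-sample test.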
This test is for when you give one group of people the same survey twice. A paired t-test lets you know if the mean changed between the first and second surveys.
Example: You surveyed the same group of customers twice: once in April and a second time in May, after they had seen an ad for your company. Did your company’s NPS change after customers saw the ad?
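Because each customer answers twice, the paired test works on each person's change rather than on the two sets of scores separately. Here is a small standard-library sketch with invented ratings for six customers:

```python
import math
import statistics

# Hypothetical ratings from the same six customers (illustrative only)
april = [7, 8, 6, 9, 7, 8]
may = [8, 8, 7, 9, 9, 9]

# Per-customer change between the two surveys
diffs = [after - before for before, after in zip(april, may)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)  # sample standard deviation of the changes

# Paired t-statistic: mean change / standard error of the change
t_stat = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1

print(f"mean change = {mean_diff:.2f}, t = {t_stat:.2f}, df = {df}")
```

SciPy's `scipy.stats.ttest_rel(may, april)` should report the same statistic with a p-value attached.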
There are four steps to performing a t-test. This section walks through them using the NPS ratings example from the beginning:
Your hypothesis is that men give a lower NPS to your company than women. The average NPS from men is 9, while the average score for women is 12. Is 9 significantly different from 12? Because you are comparing two independent groups, this calls for a two-sample t-test.
Let’s dive into the steps and t-test example.
Each type of t-test has a different formula for calculating the t-statistic. For this example, we’ll use the two-sample t-test formula where:
You’ll probably be conducting the t-tests in a spreadsheet or statistical program (like Excel or SPSS). However, if you’d like to do the math by hand, the formulas for the other two types of t-tests are included below.
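If you do want to see the arithmetic, the sketch below computes a pooled (equal-variance) two-sample t-statistic with Python's standard library. The function name `pooled_t_statistic` and the ratings are our own illustration; the values are invented so the group means are 9 and 12, so the result will not match the 0.86 used later in this example.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Pooled two-sample t-statistic for mean(group_a) - mean(group_b)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    # Weighted average of the two sample variances
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)

    # Standard error of the difference in means
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err


# Hypothetical ratings (illustrative only); means are 9 and 12
men = [5, 9, 13, 7, 11, 9]
women = [12, 8, 16, 10, 14, 12]

t_stat = pooled_t_statistic(men, women)
print(f"t = {t_stat:.2f}")
```

The sign only reflects which group you list first; for a two-tailed test, you compare the absolute value to the critical value.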
Degrees of freedom are the number of values in a calculation that are free to vary. In this case, they depend on how many NPS ratings you have in each group of respondents. As with the t-statistic, the formula for degrees of freedom varies depending on the type of t-test you perform.
This formula must be used to determine degrees of freedom in two-sample t-tests.
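For reference, the standard degrees-of-freedom formulas are shown below, assuming the pooled (equal-variance) form of the two-sample test; Welch's test uses a different approximation. Here n_1 and n_2 are the two group sizes and n is the number of observations or pairs.

```latex
% Two-sample (pooled) t-test
df = n_1 + n_2 - 2

% One-sample and paired t-tests
df = n - 1
```

Under the pooled formula, 41 degrees of freedom would correspond to 43 respondents across the two groups.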
The critical value is the threshold at which the difference between two numbers is considered statistically significant.
According to a t-distribution table, for a two-tailed test with an alpha level of 0.05 at 41 degrees of freedom, the critical value is 2.02. Note that most analysts use a two-tailed test instead of a one-tailed test because it’s more conservative.
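If you don't have a t-table handy, you can look up the same critical value in code. This sketch uses SciPy's `t.ppf` (the inverse of the cumulative distribution function) with the alpha level and degrees of freedom from the example:

```python
from scipy import stats

alpha = 0.05  # significance level
df = 41       # degrees of freedom from the example

# Two-tailed test: split alpha across both tails, so look up the 97.5th percentile
critical_value = stats.t.ppf(1 - alpha / 2, df)

print(round(critical_value, 2))  # 2.02
```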
For more information on the differences between one-tailed and two-tailed tests, check out this video from Khan Academy.
If the absolute value of your t-statistic is larger than your critical value, the difference is statistically significant. If it is smaller, you don’t have enough evidence to say the two means differ.
In our example, the absolute value of the t-statistic is 0.86, which is not larger than the critical value of 2.02, so you can conclude that men do not give significantly lower NPS ratings than women.
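The decision rule itself is a single comparison. In the sketch below, `is_significant` is a hypothetical helper name; the first call uses the numbers from the example, and the second uses a made-up t-statistic to show the opposite outcome.

```python
def is_significant(t_stat, critical_value):
    """Two-tailed decision rule: compare |t| to the critical value."""
    return abs(t_stat) > critical_value


print(is_significant(0.86, 2.02))   # False: the example's result
print(is_significant(-2.50, 2.02))  # True: a hypothetical larger gap
```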
Interpreting t-test results includes reviewing the t-value, p-value, and confidence interval to understand whether the difference between your groups reflects a real effect or random variation. These metrics work together to show the size of the gap, the strength of the evidence, and the level of confidence you can place in the result. The Q&A below breaks down what each one tells you and how to analyze t-test results.
The t-value shows how large the difference between group means is relative to the variability in your data. A larger absolute t-value means the signal rises above the noise; a smaller one suggests the gap may be due to chance.
The p-value indicates how likely it is to observe results at least as extreme as yours if the null hypothesis (no true difference) were actually true. Many teams use a 0.05 threshold: p ≤ 0.05 suggests a statistically significant difference, while p > 0.05 means the sample does not provide enough evidence of a difference.
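To connect the p-value to the earlier steps, you can recover it from a t-statistic and its degrees of freedom. This sketch uses SciPy's `t.sf` (the upper-tail probability) with the 0.86 and 41 from the NPS example:

```python
from scipy import stats

t_stat = 0.86  # absolute t-statistic from the example
df = 41        # degrees of freedom from the example

# Two-tailed p-value: probability of a |t| at least this large under the null
p_value = 2 * stats.t.sf(abs(t_stat), df)

print(f"p = {p_value:.3f}")
```

The result is well above 0.05, which agrees with the critical-value comparison: the gap is not statistically significant.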
A confidence interval (CI) provides a likely range for the true difference in means, adding context beyond a yes/no significance call. If the CI crosses zero, the effect isn’t conclusive; if it stays above or below zero, the result is significant at your chosen confidence level.
A meaningful difference is both statistically significant and practically important. Look at the estimated effect size and CI to understand how large the gap could be and whether it matters for your decision.
Larger samples reduce sampling error, tighten confidence intervals, and make it easier to detect real differences. Smaller samples introduce more uncertainty, which can make borderline effects harder to interpret.
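The reason is that the standard error of a mean shrinks with the square root of the sample size. A short sketch, using a made-up standard deviation of 2.5:

```python
import math


def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)


sd = 2.5  # hypothetical standard deviation of the ratings

# Quadrupling the sample size halves the standard error
for n in (25, 100, 400):
    print(n, standard_error(sd, n))
```

Since the t-statistic divides the difference in means by a standard error, the same gap produces a larger t-value as the sample grows.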
A clear t-test results summary shows why you ran the comparison, what the test revealed, and how confident you can be in the difference between groups. Your role is to translate the statistical output into plain language, connect it to the original question, and highlight what the findings suggest for the decisions that follow.
Include these core elements when summarizing t-test results:
Avoiding a few simple errors can help you get cleaner, more trustworthy t-test results from your survey data.
T-tests are used to determine whether the difference between the means of two sample groups is statistically significant. You can use t-tests during survey data analysis to help show how reliable your findings are.
SurveyMonkey allows you to streamline the process of creating and sending surveys to sample groups for your organization’s research needs. With SurveyMonkey, you can build market research surveys and questionnaires from scratch or tap into our broad selection of over 400 survey templates.
Start collecting survey data for analysis to help your organization make better decisions for growth. Create a free account today.
NPS, Net Promoter & Net Promoter Score are registered trademarks of Satmetrix Systems, Inc., Bain & Company and Fred Reichheld.
