Education · General business information, not legal, tax, or financial advice.

EXPERIMENTATION · STATISTICS

A/B Test Significance Calculator

Check if your A/B test results are statistically significant and estimate the sample size needed for reliable conclusions.


Example result

  Rate A: 5.00%
  Rate B: 5.70%
  Relative lift: +14.00%
  Z-score: 1.5554
  P-value: 0.1199
  Verdict: Not Significant
  Required sample / variant: 16,224

Not significant yet. Continue the test or increase traffic to reach a reliable conclusion. The required sample is the estimated number of visitors per variant for 80% power at the selected confidence level.

How to use it

  1. Enter visitors and conversions for control A and variant B, then choose a confidence level. Use 95% for most product and marketing decisions and 99% when the change affects revenue, compliance, or a large user population.
  2. Read both conversion rates, relative lift, z-score, p-value, conclusion, required sample size, and the power message. A Borderline result means the data is close enough that peeking early could easily push you into a false decision.
  3. Interpret significance and effect size together. A result can be statistically significant but too small to matter commercially, while a large-looking lift with a Not Significant label usually means you need more traffic before shipping anything.
  4. Use the required sample size to decide whether to continue, stop, or redesign the experiment. Predefine the minimum lift worth shipping so a tiny 0.1-0.2 point improvement does not consume engineering effort with no meaningful business return.
  5. Re-run only after full business cycles or materially more traffic arrives. Track win rate and realized post-launch lift by experiment type so your testing program learns which kinds of hypotheses actually produce durable gains.
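The quantities in step 2 come from a two-proportion z-test. A minimal sketch, assuming the pooled version of the test; the visitor counts below (5,000 per variant) are hypothetical values chosen so the rates match the example result shown above:

```python
from math import erfc, sqrt

def ab_test(visitors_a, conv_a, visitors_b, conv_b):
    """Two-sided pooled two-proportion z-test."""
    rate_a = conv_a / visitors_a
    rate_b = conv_b / visitors_b
    # Pool both arms to estimate the shared rate under the null hypothesis.
    pooled = (conv_a + conv_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail probability
    lift = rate_b / rate_a - 1
    return rate_a, rate_b, lift, z, p_value

# Hypothetical preset: 5,000 visitors per arm, 250 vs 285 conversions.
rate_a, rate_b, lift, z, p = ab_test(5000, 250, 5000, 285)
```

With these inputs the function reproduces the example above: a +14% relative lift, z ≈ 1.555, and a two-sided p-value ≈ 0.12.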
Questions people usually ask
What sample size do I need for a valid A/B test?

It depends on your baseline conversion rate and minimum detectable effect. To detect a 20% relative improvement on a 5% baseline (from 5% to 6%) at 95% confidence and 80% statistical power, you need roughly 8,200 visitors per variant. Smaller effects require dramatically larger samples: detecting a 10% relative improvement (5% to 5.5%) requires roughly 31,000 per variant.
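A sketch of the standard sample-size formula for a two-proportion test, assuming the pooled-variance form that is consistent with the required-sample figure of 16,224 shown above (the critical values are hardcoded to keep it dependency-free):

```python
from math import ceil, sqrt

Z_ALPHA = 1.95996  # two-sided critical value for 95% confidence
Z_BETA = 0.84162   # critical value for 80% power

def required_sample_per_variant(p1, p2, z_alpha=Z_ALPHA, z_beta=Z_BETA):
    """Visitors needed in EACH arm to detect a shift from rate p1 to rate p2."""
    p_bar = (p1 + p2) / 2
    delta = abs(p2 - p1)
    term = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
            + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil((term / delta) ** 2)

n = required_sample_per_variant(0.05, 0.06)  # 20% relative lift on a 5% baseline
```

For the 14% lift in the example result (5.00% vs 5.70%), the same function returns about 16,222, in line with the calculator's 16,224; small differences come from rounding of the critical values.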

What is the difference between statistical significance and practical significance?

A test can be statistically significant (very unlikely to be due to chance) but practically insignificant (effect too small to matter). A 0.1% conversion rate improvement may be p<0.01 with 500,000 visitors but generate only $200/month in additional revenue. Always evaluate effect size alongside p-value — significance without magnitude is misleading.
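A back-of-the-envelope translation from lift to revenue makes the practical-significance check concrete; every figure below is hypothetical:

```python
# Translate a conversion-rate lift into money (all figures hypothetical).
monthly_visitors = 100_000
lift_points = 0.001            # +0.1 percentage point, e.g. 5.0% -> 5.1%
revenue_per_conversion = 25.0  # assumed average value of one conversion

extra_conversions = monthly_visitors * lift_points
extra_revenue = extra_conversions * revenue_per_conversion
# ~100 extra conversions, ~$2,500/month: weigh against the cost to ship.
```

If the monthly figure is smaller than the engineering cost of maintaining the change, a statistically significant result is still not worth shipping.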

How long should I run an A/B test?

Run for at least 1-2 full business cycles (usually 2-4 weeks minimum) regardless of when significance is reached. Stopping early as soon as significance appears inflates the false positive rate: this peeking-at-significance problem can produce 30-50% of results that fail to replicate. Pre-specify the sample size before launching.
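The inflation from peeking can be demonstrated with a quick A/A simulation: both arms share the same 5% true rate, yet checking the p-value at every interim look pushes the false-positive rate well above the nominal 5%. All run parameters below (1,000 simulations, 8 looks, batches of 400 visitors per arm, the seed) are arbitrary choices for illustration:

```python
import random
from math import erfc, sqrt

def p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled two-proportion z-test p-value."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return erfc(abs(z) / sqrt(2))

random.seed(1)
SIMS, LOOKS, BATCH, BASE = 1000, 8, 400, 0.05
peek_hits = final_hits = 0
for _ in range(SIMS):
    conv_a = conv_b = n = 0
    peeked = False
    for _ in range(LOOKS):
        # Both arms draw from the SAME true rate, so any "win" is noise.
        conv_a += sum(random.random() < BASE for _ in range(BATCH))
        conv_b += sum(random.random() < BASE for _ in range(BATCH))
        n += BATCH
        if p_value(conv_a, n, conv_b, n) < 0.05:
            peeked = True  # would have stopped here and declared a winner
    peek_hits += peeked
    final_hits += p_value(conv_a, n, conv_b, n) < 0.05

print(f"false positives: peeking {peek_hits/SIMS:.1%}, fixed-n {final_hits/SIMS:.1%}")
```

The fixed-sample test stays near the nominal 5% false-positive rate, while stopping at the first significant look lands in the mid-teens or higher, which is why pre-specifying the sample size matters.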

