A/B Test Analysis — Control vs Variant Revenue Performance

📌 Project Overview

This project analyzes an A/B test dataset from Kaggle: AB Test Data

Goal: Determine whether the variant group generated significantly different conversion rates or revenue compared to the control group.


Dataset Columns

  • USER_ID: Unique user identifier
  • VARIANT_NAME: 'control' or 'variant'
  • REVENUE: Revenue generated by the user (can be 0)

Key facts:

  • 10,000 total records, ~50/50 split between groups.
  • 98.5% of users have zero revenue.
  • Highly skewed revenue distribution with extreme outliers.

Methods

1. Exploratory Data Analysis (EDA)

  • Checked data balance, missing values, and outliers.
  • Visualized conversion and revenue distributions (see the EDA sketch after this list).
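
A minimal EDA sketch of these checks in pandas. The file name AB_Test.csv is an assumption here, not the actual name of the Kaggle download:

```python
import pandas as pd

df = pd.read_csv("AB_Test.csv")  # assumed filename; columns per the list above

# Group balance and data-quality checks
print(df["VARIANT_NAME"].value_counts())  # expect a roughly 50/50 split
print(df.isna().sum())                    # missing values per column

# Share of zero-revenue users (~98.5% per the key facts above)
print(f"zero-revenue share: {(df['REVENUE'] == 0).mean():.1%}")

# Skew and outliers among paying users only
print(df.loc[df["REVENUE"] > 0, "REVENUE"].describe())
```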

2. Hypothesis Testing

  • Conversion Rate: Two-proportion Z-test (binary paid/not paid).
  • Revenue (paying users): Mann–Whitney U test (non-parametric). A sketch of both tests follows this list.
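
A sketch of both tests with SciPy and statsmodels, under the same assumed filename as above:

```python
import pandas as pd
from scipy.stats import mannwhitneyu
from statsmodels.stats.proportion import proportions_ztest

df = pd.read_csv("AB_Test.csv")  # assumed filename
control = df[df["VARIANT_NAME"] == "control"]
variant = df[df["VARIANT_NAME"] == "variant"]

# Two-proportion Z-test on the binary paid / not-paid outcome
counts = [(control["REVENUE"] > 0).sum(), (variant["REVENUE"] > 0).sum()]
nobs = [len(control), len(variant)]
z_stat, p_conv = proportions_ztest(counts, nobs)

# Mann-Whitney U test on revenue, restricted to paying users
pay_c = control.loc[control["REVENUE"] > 0, "REVENUE"]
pay_v = variant.loc[variant["REVENUE"] > 0, "REVENUE"]
u_stat, p_rev = mannwhitneyu(pay_c, pay_v, alternative="two-sided")

print(f"Conversion:    z = {z_stat:.2f}, p = {p_conv:.3f}")
print(f"Payer revenue: U = {u_stat:.0f}, p = {p_rev:.3f}")
```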

3. Additional Statistics

  • Effect size for both tests (Cohen’s h, Rank-biserial correlation).
  • Bootstrap confidence intervals for revenue differences.
  • Minimum Detectable Effect (MDE) & power analysis (see the sketch after this list).
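
A sketch of these follow-up statistics, again under the assumed filename; the +0.5 pp target lift in the power calculation is purely illustrative, not a figure from the analysis:

```python
import numpy as np
import pandas as pd
from scipy.stats import mannwhitneyu
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

df = pd.read_csv("AB_Test.csv")  # assumed filename
pay_c = df.query("VARIANT_NAME == 'control' and REVENUE > 0")["REVENUE"].to_numpy()
pay_v = df.query("VARIANT_NAME == 'variant' and REVENUE > 0")["REVENUE"].to_numpy()
n_c = (df["VARIANT_NAME"] == "control").sum()
n_v = (df["VARIANT_NAME"] == "variant").sum()
conv_c, conv_v = len(pay_c) / n_c, len(pay_v) / n_v

# Cohen's h: arcsine-transformed difference between the two conversion rates
h = proportion_effectsize(conv_c, conv_v)

# Rank-biserial correlation derived from Mann-Whitney U: r = 1 - 2U / (n1 * n2)
u_stat, _ = mannwhitneyu(pay_c, pay_v, alternative="two-sided")
r_rb = 1 - 2 * u_stat / (len(pay_c) * len(pay_v))

# Percentile bootstrap CI for the difference in mean payer revenue
rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
diffs = np.array([
    rng.choice(pay_v, len(pay_v)).mean() - rng.choice(pay_c, len(pay_c)).mean()
    for _ in range(10_000)
])
ci_low, ci_high = np.percentile(diffs, [2.5, 97.5])  # a CI spanning zero is inconclusive

# Per-group sample size needed to detect an illustrative +0.5 pp lift at 80% power
target_h = proportion_effectsize(conv_c + 0.005, conv_c)
n_needed = NormalIndPower().solve_power(effect_size=target_h, alpha=0.05, power=0.8)
print(h, r_rb, (ci_low, ci_high), round(n_needed))
```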

Results

Metric                   Control   Variant   p-value   Significant?
Conversion Rate          1.61%     1.43%     0.488     ❌ No
Revenue (paying users)   2.96      2.17      0.079     ❌ No

(Revenue figures are the median revenue per paying user.)

Group     Total Users   Paying Users   Conversion Rate   Mean Revenue (payers)   Median Revenue (payers)
control   4984          80             1.605%            8.04                    2.96
variant   5016          72             1.435%            4.88                    2.17
  • Effect sizes indicate negligible to small differences.
  • Bootstrap CI includes zero → differences may be positive or negative.
  • Power analysis shows MDE ≈ 0.78 pp change, requiring ~16× more data to detect small effects.

Key Takeaways

  • Revenue data is heavily skewed, making non-parametric tests more reliable.
  • No statistical evidence that the variant performs differently from control in conversion rate or payer revenue.
  • Small observed differences could be due to random chance, and the dataset is underpowered to detect small effect sizes.

Files in This Repo

  • AB_Test_Project.ipynb → Full analysis notebook
  • AB_Test_Project.pdf → Clean PDF version
  • AB_Test.pptx → Slide deck presentation

Skills Demonstrated

  • Hypothesis testing: Z-test, Mann–Whitney
  • Effect size interpretation
  • Power analysis & MDE
  • Bootstrap confidence intervals
  • Data visualization & storytelling
