How to Run Incrementality Tests for Mobile Apps
Learn how to run incrementality tests to measure true ad impact, including holdout group testing, geo-based testing, and lift measurement methodologies.

Attribution tells you what happened.
Incrementality tells you what happened because of your ads.
The difference matters. If your attribution shows 10,000 installs from Meta, but 6,000 of those users would have installed organically anyway, your true incremental installs are only 4,000.
Incrementality testing measures this. It answers: "How many conversions would NOT have happened without ads?"
Here's how to run incrementality tests for mobile apps in 2025.
What is Incrementality Testing?
Incrementality testing measures the causal impact of advertising by comparing two groups:
Test Group: Exposed to ads
Control Group: Not exposed to ads
The difference in conversion rates between these groups is your lift—the percentage of results driven by ads, not organic behavior.
Example:
- Test group (exposed to ads): 1,000 users → 50 installs (5% conversion)
- Control group (no ads): 1,000 users → 35 installs (3.5% conversion)
- Lift = (5% - 3.5%) / 3.5% = 43%
Interpretation: Your ads drove a 43% increase in installs. Without ads, you'd still get 3.5% organically. The incremental impact is 1.5 percentage points.
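As a quick sketch, here is the same lift calculation in code. The group sizes and install counts are the illustrative numbers from the example above, not real data:

```python
def conversion_rate(installs: int, users: int) -> float:
    """Installs divided by users in the group."""
    return installs / users

def lift(test_rate: float, control_rate: float) -> float:
    """Relative lift of the test group over the control group."""
    return (test_rate - control_rate) / control_rate

# Numbers from the example above
test_rate = conversion_rate(installs=50, users=1_000)      # 5.0%
control_rate = conversion_rate(installs=35, users=1_000)   # 3.5%

print(f"Lift: {lift(test_rate, control_rate):.1%}")         # ~42.9%
```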
Why Incrementality Testing Matters
Reason 1: Attribution Over-Credits Ads
Platforms like Meta and Google attribute installs to ads when users were already going to install.
Example:
A user searches for your app, clicks a Meta ad, installs.
Meta claims credit. But the user was already searching—they likely would have installed anyway.
Incrementality testing reveals how much of your attributed installs are truly incremental.
Reason 2: Brand Awareness Creates Organic Lift
Paid campaigns drive brand awareness, which leads to organic installs that attribution doesn't capture.
Example:
Someone sees your TikTok ad, doesn't click, but searches for your app 3 days later and installs.
Attribution shows 0 installs from TikTok. Incrementality testing captures this indirect lift.
Reason 3: Privacy Makes Attribution Less Reliable
Post-ATT (App Tracking Transparency), attribution is modeled and probabilistic. Incrementality testing provides a ground-truth measurement.
Types of Incrementality Tests
1. Holdout Group Testing
How it works:
Split your audience into test and control groups. Show ads to the test group only. Compare results.
Pros:
- Simple to set up
- Works with any channel
- Measures direct impact
Cons:
- Requires pausing ads for part of your audience (opportunity cost)
- Doesn't capture cross-channel effects
Best for:
Apps with high organic traffic and an established user base.
2. Geo-Based Testing
How it works:
Run ads in specific geographic regions (test geos) while keeping other regions ad-free (control geos). Compare performance.
Example:
- Test: Run ads in California, Texas, Florida
- Control: No ads in New York, Illinois, Ohio
- Compare install rates between test and control states
Pros:
- No need to pause ads (no opportunity cost)
- Privacy-safe (no user-level tracking)
- Channel-agnostic
Cons:
- Requires large geographic distribution
- Assumes geos are comparable (control for population, seasonality)
Best for:
Apps with national or global reach.
3. Public Service Announcement (PSA) Testing
How it works:
Show generic PSA ads to the control group instead of your app ads. Compare conversion rates.
Pros:
- No lost impressions (control group still sees ads)
- Maintains campaign structure
Cons:
- PSA ads might influence behavior
- Harder to set up
Best for:
Large-budget campaigns on platforms like Meta that support PSA testing.
How to Run a Holdout Group Test
Step 1: Define Test and Control Groups
Test group: 85% of audience (exposed to ads)
Control group: 15% of audience (no ads)
Why 15%:
Large enough for statistical significance, small enough to minimize opportunity cost.
Step 2: Randomize Allocation
Use platform tools or third-party MMPs (Adjust, AppsFlyer, Singular) to randomly assign users.
Key: Randomization ensures test and control groups are comparable.
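If you split the audience yourself rather than through a platform or MMP, one common approach is deterministic hashing of a stable user ID into buckets, so assignment is random but reproducible. A minimal sketch, assuming a 15% control share (from the split above) and a `user_id` string you already have:

```python
import hashlib

CONTROL_SHARE = 0.15  # 15% holdout, per the split described above

def assign_group(user_id: str, salt: str = "incrementality-test-1") -> str:
    """Deterministically bucket a user into 'control' or 'test'.

    Hashing a stable ID (plus a per-test salt) gives a uniform,
    reproducible split without storing assignments server-side.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # uniform float in [0, 1]
    return "control" if bucket < CONTROL_SHARE else "test"

print(assign_group("user-12345"))  # 'test' or 'control'
```

Changing the salt per test gives you a fresh, independent split for the next experiment.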
Step 3: Run the Test
Duration: 2-4 weeks
Why 2-4 weeks:
- Accounts for weekly seasonality
- Achieves statistical significance
- Smooths out daily noise
Step 4: Measure Conversions
Track installs, registrations, purchases, or any conversion event.
Example:
- Test group: 10,000 users → 500 installs
- Control group: 2,000 users → 70 installs
Step 5: Calculate Lift
Lift = (Test Group Conversion Rate - Control Group Conversion Rate) / Control Group Conversion Rate
Example:
- Test group conversion rate: 500 / 10,000 = 5%
- Control group conversion rate: 70 / 2,000 = 3.5%
- Lift = (5% - 3.5%) / 3.5% ≈ 42.9%
Interpretation: Ads drove a 43% increase in installs. Without ads, you'd still get 3.5% organically.
Step 6: Calculate Incremental Installs
Incremental Installs = Test Group Installs × (Lift / (1 + Lift))
Example:
- Test group installs: 500
- Lift: 42.9%
- Incremental installs = 500 × (0.429 / 1.429) ≈ 150
Interpretation: Of 500 attributed installs, only 150 were truly incremental. The other 350 would have happened organically.
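Putting Steps 5 and 6 together, here is a small sketch of the full holdout calculation, using the illustrative counts from the example above:

```python
def holdout_results(test_installs: int, test_users: int,
                    control_installs: int, control_users: int):
    """Return (lift, incremental installs) for a holdout test."""
    test_rate = test_installs / test_users
    control_rate = control_installs / control_users
    lift = (test_rate - control_rate) / control_rate
    incremental = test_installs * lift / (1 + lift)
    return lift, incremental

lift, incremental = holdout_results(500, 10_000, 70, 2_000)
print(f"Lift: {lift:.1%}")                         # ~42.9%
print(f"Incremental installs: {incremental:.0f}")  # ~150
```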
How to Run a Geo-Based Test
Step 1: Select Test and Control Geos
Choose geographies with similar characteristics:
- Population size
- Demographics
- Seasonality
- Historical install rates
Example:
- Test geos: California, Texas, Florida
- Control geos: New York, Pennsylvania, Illinois
Validation: Check that historical install rates are similar across geos before the test.
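A lightweight way to run that validation is to compare pre-period install rates between the candidate test and control geos. This is a minimal sketch with placeholder numbers (not real data); the 5-10% threshold is a rule of thumb, not a standard:

```python
# Pre-test sanity check: compare historical installs per 1M population
# between candidate test and control geos. Numbers below are placeholders.
historical_installs_per_1m = {
    "CA": 310, "TX": 295, "FL": 305,   # candidate test geos
    "NY": 300, "PA": 290, "IL": 298,   # candidate control geos
}

test_geos = ["CA", "TX", "FL"]
control_geos = ["NY", "PA", "IL"]

test_avg = sum(historical_installs_per_1m[g] for g in test_geos) / len(test_geos)
control_avg = sum(historical_installs_per_1m[g] for g in control_geos) / len(control_geos)

gap = abs(test_avg - control_avg) / control_avg
print(f"Pre-period gap: {gap:.1%}")  # flag geos for re-selection if gap > ~5-10%
```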
Step 2: Run Ads in Test Geos Only
Use geo-targeting in your ad platforms to limit ads to test regions.
Step 3: Collect Data for 3-4 Weeks
Why longer than holdout tests:
Geographic variation requires more data to reach statistical significance.
Step 4: Compare Install Rates
Example (normalize installs by population, since control geos serve no ad impressions):
- Test geos: 5,000 installs per 1M residents = 0.50% install rate
- Control geos: 2,800 installs per 1M residents = 0.28% install rate
Step 5: Calculate Lift
Lift = (Test Geo Install Rate - Control Geo Install Rate) / Control Geo Install Rate
Example:
- Lift = (0.5% - 0.28%) / 0.28% = 78.6%
Interpretation: Ads drove a 79% increase in installs in test geos compared to control geos.
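The same calculation in code, using the illustrative rates above, plus a projection of how many installs in the test geos were actually driven by ads:

```python
# Sketch of the geo lift calculation (installs per resident in each geo group)
test_rate = 5_000 / 1_000_000      # 0.50%
control_rate = 2_800 / 1_000_000   # 0.28%

lift = (test_rate - control_rate) / control_rate
incremental = 5_000 * lift / (1 + lift)  # installs in test geos driven by ads

print(f"Geo lift: {lift:.1%}")                                   # ~78.6%
print(f"Incremental installs in test geos: {incremental:.0f}")   # ~2,200
```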
What is a Good Lift?
Lift varies by channel, brand awareness, and campaign type.
User Acquisition Campaigns:
- 30-70% lift: Good (ads are driving meaningful incremental installs)
- 10-30% lift: Moderate (some incremental value, but high organic overlap)
- <10% lift: Weak (most installs would have happened organically)
Retargeting Campaigns:
- 10-30% lift: Good (retargeting inherently has more organic overlap)
- <10% lift: Weak
Brand Campaigns:
- 5-15% lift: Expected (high brand awareness means more organic installs)
Why Lift Matters More Than Attributed Installs
Example:
- Campaign A: 10,000 attributed installs, 60% lift → 10,000 × (0.60 / 1.60) = 3,750 incremental installs
- Campaign B: 15,000 attributed installs, 20% lift → 15,000 × (0.20 / 1.20) = 2,500 incremental installs
Platform dashboards say Campaign B is better (15K vs 10K attributed installs).
Incrementality testing shows Campaign A is better (3,750 vs 2,500 incremental installs).
Tools for Incrementality Testing
Built-In Platform Tools
Meta: Conversion Lift Studies
Google: Geo Experiments, Brand Lift Studies
TikTok: Lift Studies
Pros: Free, integrated
Cons: Limited customization
Third-Party Tools (2025)
- Lifesight
- Haus
- Measured
- Workmagic
- Recast
- LiftLab
Pros: More control, cross-channel measurement
Cons: Cost ($5K-$50K+ per test)
MMP Tools
Adjust, AppsFlyer, Singular:
Offer incrementality measurement through holdout testing.
Best for: Apps already using these MMPs.
Common Incrementality Testing Mistakes
Mistake 1: Control Group Too Small
If control is <10% of audience, you won't reach statistical significance.
Fix: Use at least 15% of your audience for the control group, and sanity-check that this gives you enough users in absolute terms.
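Whether 15% is actually enough depends on your baseline conversion rate and the smallest lift you want to detect. A rough sketch using the standard two-proportion normal approximation; the 3.5% baseline and 40% expected lift are assumptions carried over from the earlier example:

```python
from statistics import NormalDist

def users_per_group(baseline_rate: float, expected_lift: float,
                    alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group sample size for a two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + expected_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return int(numerator / (p2 - p1) ** 2) + 1

# e.g. 3.5% organic install rate, aiming to detect ~40% lift
print(users_per_group(0.035, 0.40))  # roughly 3,000+ users per group
```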
Mistake 2: Not Running Long Enough
A 3-day test has too much noise.
Fix: Run 2-4 weeks (holdout) or 3-4 weeks (geo-based).
Mistake 3: Comparing Non-Comparable Geos
Testing in California (high-income, tech-savvy) vs Montana (rural) introduces bias.
Fix: Match geos on demographics and historical performance.
Mistake 4: Only Testing One Channel
Testing Meta in isolation doesn't account for cross-channel effects (Meta drives Google brand searches).
Fix: Run incrementality tests across all major channels.
Mistake 5: Testing During Seasonal Spikes
Running a test during Black Friday skews results.
Fix: Avoid major holidays or promotional periods.
How to Use Incrementality Results
Adjust Attribution Models
If you measure 40% lift, the incremental share is 0.40 / 1.40 ≈ 29%, meaning roughly 71% of attributed installs would have happened organically.
Action: Discount platform-reported ROAS by the incremental share (Lift / (1 + Lift)).
Example:
- Platform-reported ROAS: 250%
- Incremental share: 0.40 / 1.40 ≈ 0.286
- Incrementality-adjusted ROAS: 250% × 0.286 ≈ 71%
True incremental ROAS is roughly 71%, not 250%.
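As a one-liner sketch of that adjustment (using the example numbers above):

```python
def incrementality_adjusted_roas(platform_roas: float, lift: float) -> float:
    """Scale platform-reported ROAS by the incremental share of installs."""
    incremental_share = lift / (1 + lift)
    return platform_roas * incremental_share

# Example from above: 250% platform ROAS, 40% measured lift
print(f"{incrementality_adjusted_roas(2.50, 0.40):.0%}")  # ~71%
```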
Optimize Budget Allocation
Shift budget to channels with highest incremental lift.
Example:
| Channel | Attributed Installs | Lift | Incremental Installs |
|---|---|---|---|
| Meta | 10,000 | 40% | 2,857 |
| Google | 8,000 | 70% | 3,294 |
| TikTok | 12,000 | 25% | 2,400 |
Google has the highest incremental installs despite fewer attributed installs.
Action: Shift budget from TikTok to Google.
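A small sketch of how you might rank channels this way, using the numbers from the table above (incremental installs computed as attributed × lift / (1 + lift)):

```python
# Rank channels by estimated incremental installs (numbers from the table above)
channels = {
    "Meta":   {"attributed": 10_000, "lift": 0.40},
    "Google": {"attributed": 8_000,  "lift": 0.70},
    "TikTok": {"attributed": 12_000, "lift": 0.25},
}

for name, c in channels.items():
    c["incremental"] = c["attributed"] * c["lift"] / (1 + c["lift"])

ranked = sorted(channels.items(), key=lambda kv: kv[1]["incremental"], reverse=True)
for name, c in ranked:
    print(f"{name}: {c['incremental']:,.0f} incremental installs")
# Google: 3,294 / Meta: 2,857 / TikTok: 2,400
```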
Measure Brand Lift
If incrementality testing shows lower lift over time, it could mean:
- Growing brand awareness (more organic installs)
- Declining ad effectiveness
Track lift quarterly to understand trends.
Key Takeaways
- Incrementality testing measures true ad impact by comparing test (ads) vs control (no ads) groups
- Use holdout group testing for simplicity or geo-based testing to avoid pausing ads
- Control group should be 15%+ of audience for statistical significance
- Run tests for 2-4 weeks to smooth out noise
- Lift = (Test Conversion Rate - Control Conversion Rate) / Control Conversion Rate
- Typical UA lift: 30-70%; retargeting: 10-30%; brand campaigns: 5-15%
- Use incrementality results to adjust attribution models and optimize budget allocation
FAQs
What is incrementality testing?
Incrementality testing measures how much a marketing effort drives real results by comparing a test group (exposed to ads) and a control group (not exposed). It answers: how many installs/conversions would NOT have happened without ads?
How do you run an incrementality test?
Split your audience into test (85%) and control (15%) groups. Show ads only to the test group. After 2-4 weeks, compare conversion rates. Lift = (Test Group Conversion Rate - Control Group Conversion Rate) / Control Group Conversion Rate.
What is a good lift percentage?
Lift varies by channel and brand strength. User acquisition campaigns typically show 30-70% lift. Retargeting shows 10-30% lift. Brand campaigns with high organic traction may show only 5-15% lift.
How often should I run incrementality tests?
Run tests quarterly for each major channel (Meta, Google, TikTok). More frequently if you make significant changes to creative, targeting, or campaigns.
Why is my incrementality lift low?
Low lift (< 20%) suggests high organic overlap—many users would have installed without ads. This can happen with strong brand awareness or high-intent channels like Apple Search Ads.
Incrementality testing is the closest you'll get to understanding true ad effectiveness. Run it regularly, and you'll make smarter budget decisions than 90% of growth teams.
Related Resources

Blended vs Platform ROAS: What to Optimize For
Learn the difference between blended ROAS and platform-reported ROAS, which one to optimize for, and how to use both for better budget allocation decisions.

What You Need Before Spending $1 on User Acquisition
The essential infrastructure, metrics, and systems required before launching paid UA campaigns. Avoid burning budget on unprepared campaigns.

Adjust vs AppsFlyer vs Branch: Which MMP is Right for You? (2025)
Compare the top mobile measurement partners in 2025. Honest breakdown of Adjust, AppsFlyer, and Branch pricing, features, and strengths.