How to Run Incrementality Tests for Mobile Apps

Learn how to run incrementality tests to measure true ad impact, including holdout group testing, geo-based testing, and lift measurement methodologies.

Justin Sampson

Attribution tells you what happened.

Incrementality tells you what happened because of your ads.

The difference matters. If your attribution shows 10,000 installs from Meta, but 6,000 of those users would have installed organically anyway, your true incremental installs are only 4,000.

Incrementality testing measures this. It answers: "How many conversions would NOT have happened without ads?"

Here's how to run incrementality tests for mobile apps in 2025.

What is Incrementality Testing?

Incrementality testing measures the causal impact of advertising by comparing two groups:

Test Group: Exposed to ads

Control Group: Not exposed to ads

The difference in conversion rates between these groups is your lift—the percentage of results driven by ads, not organic behavior.

Example:

  • Test group (exposed to ads): 1,000 users → 50 installs (5% conversion)
  • Control group (no ads): 1,000 users → 35 installs (3.5% conversion)
  • Lift = (5% - 3.5%) / 3.5% ≈ 43%

Interpretation: Your ads drove a 43% increase in installs. Without ads, you'd still get 3.5% organically. The incremental impact is 1.5 percentage points.
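For readers who prefer code, here is a minimal Python sketch of the same lift calculation, using the illustrative numbers from the example above:

```python
# Illustrative numbers from the example above
test_users, test_installs = 1_000, 50
control_users, control_installs = 1_000, 35

test_rate = test_installs / test_users           # 5.0%
control_rate = control_installs / control_users  # 3.5%

lift = (test_rate - control_rate) / control_rate
print(f"Lift: {lift:.1%}")  # ~42.9%, i.e. roughly a 43% lift
```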

Why Incrementality Testing Matters

Reason 1: Attribution Over-Credits Ads

Platforms like Meta and Google attribute installs to ads when users were already going to install.

Example:

A user searches for your app, clicks a Meta ad, installs.

Meta claims credit. But the user was already searching—they likely would have installed anyway.

Incrementality testing reveals how much of your attributed installs are truly incremental.

Reason 2: Brand Awareness Creates Organic Lift

Paid campaigns drive brand awareness, which leads to organic installs that attribution doesn't capture.

Example:

Someone sees your TikTok ad, doesn't click, but searches for your app 3 days later and installs.

Attribution shows 0 installs from TikTok. Incrementality testing captures this indirect lift.

Reason 3: Privacy Makes Attribution Less Reliable

Post-ATT (App Tracking Transparency), attribution is modeled and probabilistic. Incrementality testing provides a ground-truth measurement.

Types of Incrementality Tests

1. Holdout Group Testing

How it works:

Split your audience into test and control groups. Show ads to test group only. Compare results.

Pros:

  • Simple to set up
  • Works with any channel
  • Measures direct impact

Cons:

  • Requires pausing ads for part of your audience (opportunity cost)
  • Doesn't capture cross-channel effects

Best for:

Apps with high organic traffic and an established user base.

2. Geo-Based Testing

How it works:

Run ads in specific geographic regions (test geos) while keeping other regions ad-free (control geos). Compare performance.

Example:

  • Test: Run ads in California, Texas, Florida
  • Control: No ads in New York, Illinois, Ohio
  • Compare install rates between test and control states

Pros:

  • No need to pause ads (no opportunity cost)
  • Privacy-safe (no user-level tracking)
  • Channel-agnostic

Cons:

  • Requires large geographic distribution
  • Assumes geos are comparable (control for population, seasonality)

Best for:

Apps with national or global reach.

3. Public Service Announcement (PSA) Testing

How it works:

Show generic PSA ads to the control group instead of your app ads, then compare conversion rates.

Pros:

  • No lost impressions (control group still sees ads)
  • Maintains campaign structure

Cons:

  • PSA ads might influence behavior
  • Harder to set up

Best for:

Large-budget campaigns on platforms like Meta that support PSA testing.

How to Run a Holdout Group Test

Step 1: Define Test and Control Groups

Test group: 85% of audience (exposed to ads)

Control group: 15% of audience (no ads)

Why 15%:

Large enough for statistical significance, small enough to minimize opportunity cost.

Step 2: Randomize Allocation

Use platform tools or third-party MMPs (Adjust, AppsFlyer, Singular) to randomly assign users.

Key: Randomization ensures test and control groups are comparable.
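If you manage the split yourself rather than through an MMP, a deterministic hash of a stable user identifier keeps assignment random, reproducible, and independent of other experiments. A minimal sketch, assuming a string user_id and the 85/15 split from Step 1 (the experiment name and identifiers are placeholders, not any vendor's API):

```python
import hashlib

HOLDOUT_SHARE = 0.15  # control group share (15% holdout)

def assign_group(user_id: str, experiment: str = "ua_incrementality_q1") -> str:
    """Deterministically assign a user to 'test' or 'control'.

    Hashing user_id together with an experiment name keeps the split stable
    across sessions and independent of any other running experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # roughly uniform value in [0, 1]
    return "control" if bucket < HOLDOUT_SHARE else "test"

print(assign_group("user-12345"))
```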

Step 3: Run the Test

Duration: 2-4 weeks

Why 2-4 weeks:

  • Accounts for weekly seasonality
  • Achieves statistical significance
  • Smooths out daily noise

Step 4: Measure Conversions

Track installs, registrations, purchases, or any conversion event.

Example:

  • Test group: 10,000 users → 500 installs
  • Control group: 2,000 users → 70 installs

Step 5: Calculate Lift

Lift = (Test Group Conversion Rate - Control Group Conversion Rate) / Control Group Conversion Rate

Example:

  • Test group conversion rate: 500 / 10,000 = 5%
  • Control group conversion rate: 70 / 2,000 = 3.5%
  • Lift = (5% - 3.5%) / 3.5% = 42.8%

Interpretation: Ads drove a 43% increase in installs. Without ads, you'd still get 3.5% organically.
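Before acting on a lift number, check that the gap between groups is statistically significant. A sketch using the counts from Steps 4-5 and statsmodels' two-proportion z-test (an assumed dependency; any two-proportion test works):

```python
from statsmodels.stats.proportion import proportions_ztest

# Example counts from Steps 4-5 above
test_installs, test_users = 500, 10_000
control_installs, control_users = 70, 2_000

test_rate = test_installs / test_users           # 5.0%
control_rate = control_installs / control_users  # 3.5%
lift = (test_rate - control_rate) / control_rate

z_stat, p_value = proportions_ztest(
    count=[test_installs, control_installs],
    nobs=[test_users, control_users],
)

print(f"Lift: {lift:.1%}, p-value: {p_value:.3f}")
# Treat the lift as real only if p_value is below your threshold (e.g. 0.05)
```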

Step 6: Calculate Incremental Installs

Incremental Installs = Test Group Installs × (Lift / (1 + Lift))

Example:

  • Test group installs: 500
  • Lift: 42.8%
  • Incremental installs = 500 × (0.428 / 1.428) = 150

Interpretation: Of 500 attributed installs, only 150 were truly incremental. The other 350 would have happened organically.
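The same arithmetic as a short sketch, reusing the numbers above:

```python
test_installs = 500
lift = 0.428  # from Step 5

incremental_installs = test_installs * (lift / (1 + lift))
organic_installs = test_installs - incremental_installs

print(f"Incremental: {incremental_installs:.0f}, would have happened anyway: {organic_installs:.0f}")
# roughly 150 incremental and 350 organic
```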

How to Run a Geo-Based Test

Step 1: Select Test and Control Geos

Choose geographies with similar characteristics:

  • Population size
  • Demographics
  • Seasonality
  • Historical install rates

Example:

  • Test geos: California, Texas, Florida
  • Control geos: New York, Pennsylvania, Illinois

Validation: Check that historical install rates are similar across geos before the test.
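One lightweight validation is to compare pre-period install data between candidate test and control geos, checking that the two series are highly correlated and of similar scale. A sketch with placeholder daily install counts (not real data):

```python
import numpy as np

# Hypothetical daily installs for the two weeks before the test (placeholder data)
test_geo_daily = np.array([410, 395, 388, 402, 420, 415, 398, 405, 390, 412, 400, 396, 408, 418])
control_geo_daily = np.array([405, 398, 385, 399, 417, 410, 395, 401, 388, 409, 397, 392, 406, 415])

correlation = np.corrcoef(test_geo_daily, control_geo_daily)[0, 1]
mean_gap = abs(test_geo_daily.mean() - control_geo_daily.mean()) / control_geo_daily.mean()

print(f"Pre-period correlation: {correlation:.2f}, mean gap: {mean_gap:.1%}")
# Rough rule of thumb: look for high correlation and a small, stable gap before trusting the geos
```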

Step 2: Run Ads in Test Geos Only

Use geo-targeting in your ad platforms to limit ads to test regions.

Step 3: Collect Data for 3-4 Weeks

Why longer than holdout tests:

Geographic variation requires more data to reach statistical significance.

Step 4: Compare Install Rates

Example:

  • Test geos: 5,000 installs per 1M users = 0.5% install rate
  • Control geos: 2,800 installs per 1M users = 0.28% install rate

(Use the same exposure base for both groups, such as population or store page views; control geos run no ads, so they have no ad impressions to divide by.)

Step 5: Calculate Lift

Lift = (Test Geo Install Rate - Control Geo Install Rate) / Control Geo Install Rate

Example:

  • Lift = (0.5% - 0.28%) / 0.28% = 78.6%

Interpretation: Ads drove a 79% increase in installs in test geos compared to control geos.
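A sketch of the geo lift arithmetic, using the example rates above:

```python
# Example numbers from Steps 4-5 (install rates on a common base of 1M users per geo group)
test_installs, test_base = 5_000, 1_000_000
control_installs, control_base = 2_800, 1_000_000

test_rate = test_installs / test_base           # 0.50%
control_rate = control_installs / control_base  # 0.28%

geo_lift = (test_rate - control_rate) / control_rate
print(f"Geo lift: {geo_lift:.1%}")  # ~78.6%
```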

What is a Good Lift?

Lift varies by channel, brand awareness, and campaign type.

User Acquisition Campaigns:

  • 30-70% lift: Good (ads are driving meaningful incremental installs)
  • 10-30% lift: Moderate (some incremental value, but high organic overlap)
  • <10% lift: Weak (most installs would have happened organically)

Retargeting Campaigns:

  • 10-30% lift: Good (retargeting inherently has more organic overlap)
  • <10% lift: Weak

Brand Campaigns:

  • 5-15% lift: Expected (high brand awareness means more organic installs)

Why Lift Matters More Than Attributed Installs

Example:

  • Campaign A: 10,000 attributed installs, 60% lift → 3,750 incremental installs
  • Campaign B: 15,000 attributed installs, 20% lift → 2,500 incremental installs

(Incremental installs use the Step 6 formula: Attributed Installs × Lift / (1 + Lift).)

Platform dashboards say Campaign B is better (15K vs 10K attributed installs).

Incrementality testing shows Campaign A is better (3,750 vs 2,500 incremental installs).

Tools for Incrementality Testing

Built-In Platform Tools

Meta: Conversion Lift Studies

Google: Geo Experiments, Brand Lift Studies

TikTok: Lift Studies

Pros: Free, integrated

Cons: Limited customization

Third-Party Tools (2025)

  • Lifesight
  • Haus
  • Measured
  • Workmagic
  • Recast
  • LiftLab

Pros: More control, cross-channel measurement

Cons: Cost ($5K-$50K+ per test)

MMP Tools

Adjust, AppsFlyer, Singular:

Offer incrementality measurement through holdout testing.

Best for: Apps already using these MMPs.

Common Incrementality Testing Mistakes

Mistake 1: Control Group Too Small

If the control group is too small (under ~10% of a modest audience), you may never reach statistical significance.

Fix: Use at least 15% for the control group, and sanity-check the required sample size (see the sketch below).
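A rough two-proportion sample-size calculation shows how many users each group needs; the baseline rate and minimum detectable lift below are assumptions you would replace with your own numbers:

```python
from math import ceil
from scipy.stats import norm

# Assumptions: baseline (organic) conversion rate and the smallest lift worth detecting
baseline_rate = 0.035        # 3.5% organic install rate
min_detectable_lift = 0.20   # want to detect at least a 20% lift
alpha, power = 0.05, 0.80

p1 = baseline_rate
p2 = baseline_rate * (1 + min_detectable_lift)
z_alpha = norm.ppf(1 - alpha / 2)
z_beta = norm.ppf(power)

# Per-group sample size for a two-sided two-proportion z-test (equal group sizes)
n = ((z_alpha + z_beta) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))) / (p1 - p2) ** 2
print(f"Users needed per group: {ceil(n):,}")
# With an 85/15 split the test group is much larger, so this equal-group
# estimate is a conservative target for the control group.
```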

Mistake 2: Not Running Long Enough

A 3-day test has too much noise.

Fix: Run 2-4 weeks (holdout) or 3-4 weeks (geo-based).

Mistake 3: Comparing Non-Comparable Geos

Testing in California (high-income, tech-savvy) vs Montana (rural) introduces bias.

Fix: Match geos on demographics and historical performance.

Mistake 4: Only Testing One Channel

Testing Meta in isolation doesn't account for cross-channel effects (Meta drives Google brand searches).

Fix: Run incrementality tests across all major channels.

Mistake 5: Testing During Seasonal Spikes

Running a test during Black Friday skews results.

Fix: Avoid major holidays or promotional periods.

How to Use Incrementality Results

Adjust Attribution Models

If you measure 40% lift, only about 29% of attributed installs are truly incremental (Lift / (1 + Lift) = 0.40 / 1.40 ≈ 0.29); the rest would have happened organically.

Action: Discount platform-reported ROAS by this incremental fraction.

Example:

  • Platform-reported ROAS: 250%
  • Incremental fraction: 0.40 / 1.40 ≈ 0.29
  • Incrementality-adjusted ROAS: 250% × 0.29 ≈ 71%

True incremental ROAS is roughly 71%, not 250%.
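A sketch of that adjustment, using the lift-to-incremental-fraction conversion from Step 6:

```python
reported_roas = 2.50   # 250% platform-reported ROAS
lift = 0.40            # measured lift from the incrementality test

incremental_fraction = lift / (1 + lift)  # ~0.29
adjusted_roas = reported_roas * incremental_fraction

print(f"Incremental fraction: {incremental_fraction:.0%}, adjusted ROAS: {adjusted_roas:.0%}")
# ~29% incremental, ~71% incrementality-adjusted ROAS
```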

Optimize Budget Allocation

Shift budget to channels with highest incremental lift.

Example:

  • Meta: 10,000 attributed installs, 40% lift → ~2,860 incremental installs
  • Google: 8,000 attributed installs, 70% lift → ~3,290 incremental installs
  • TikTok: 12,000 attributed installs, 25% lift → 2,400 incremental installs

Google has the highest incremental installs despite fewer attributed installs.

Action: Shift budget from TikTok to Google.
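The same comparison as a sketch, with channel names and numbers taken from the list above:

```python
# Channel: (attributed installs, measured lift) -- numbers from the list above
channels = {
    "Meta":   (10_000, 0.40),
    "Google": (8_000, 0.70),
    "TikTok": (12_000, 0.25),
}

def incremental_installs(attributed: int, lift: float) -> float:
    # Same conversion as Step 6: attributed installs x (lift / (1 + lift))
    return attributed * lift / (1 + lift)

ranked = sorted(channels.items(), key=lambda kv: incremental_installs(*kv[1]), reverse=True)
for name, (attributed, lift) in ranked:
    print(f"{name:<7} attributed={attributed:>6,} lift={lift:.0%} "
          f"incremental={incremental_installs(attributed, lift):,.0f}")
# Prints Google first (~3,290), then Meta (~2,860), then TikTok (2,400)
```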

Measure Brand Lift

If incrementality testing shows lower lift over time, it could mean:

  • Growing brand awareness (more organic installs)
  • Declining ad effectiveness

Track lift quarterly to understand trends.

Key Takeaways

  • Incrementality testing measures true ad impact by comparing test (ads) vs control (no ads) groups
  • Use holdout group testing for simplicity or geo-based testing to avoid pausing ads
  • Control group should be 15%+ of audience for statistical significance
  • Run tests for 2-4 weeks to smooth out noise
  • Lift = (Test Conversion Rate - Control Conversion Rate) / Control Conversion Rate
  • Typical UA lift: 30-70%; retargeting: 10-30%; brand campaigns: 5-15%
  • Use incrementality results to adjust attribution models and optimize budget allocation

FAQs

What is incrementality testing?

Incrementality testing measures how much a marketing effort drives real results by comparing a test group (exposed to ads) and a control group (not exposed). It answers: how many installs/conversions would NOT have happened without ads?

How do you run an incrementality test?

Split your audience into test (85%) and control (15%) groups. Show ads only to the test group. After 2-4 weeks, compare conversion rates. Lift = (Test Group Conversion Rate - Control Group Conversion Rate) / Control Group Conversion Rate.

What is a good lift percentage?

Lift varies by channel and brand strength. User acquisition campaigns typically show 30-70% lift. Retargeting shows 10-30% lift. Brand campaigns with high organic traction may show only 5-15% lift.

How often should I run incrementality tests?

Run tests quarterly for each major channel (Meta, Google, TikTok). More frequently if you make significant changes to creative, targeting, or campaigns.

Why is my incrementality lift low?

Low lift (< 20%) suggests high organic overlap—many users would have installed without ads. This can happen with strong brand awareness or high-intent channels like Apple Search Ads.


Incrementality testing is the closest you'll get to understanding true ad effectiveness. Run it regularly, and you'll make smarter budget decisions than 90% of growth teams.

Tags: incrementality testing, lift studies, attribution, user acquisition, mobile app analytics
