Normalized Client Lifetime Value (LTV)

Introduction

client_ltv is designed to enable relative user value estimates based on their past and expected search and ad click behavior. This behavior is revenue-generating for Firefox.

Contents

client_ltv has one row for each (client_id, engine, submission_date) triplet.

Each row captures a year's worth of history for the client_id on the given engine, therefore the values will not change much when looking at submission_date in 1-day (or even 1-month) sequences, since there is significant overlap. For most analyses, using yesterday's submission_date will be sufficient. To get users active in the last i.e. 7 days, a join with clients_last_seen is required. We plan to propagate the necessary fields into client_ltv in the future so such a join isn't necessary.

Using yesterday's date, a client's row will contain the total number of searches, ad clicks, etc for the last 365 days, along with active search and ad click days (how many days did a user search or click an ad). Additionally each row contains the predicted number active search/ad click days for the next 365 days. See the schema at the bottom of this page for a full list of the fields.

LTV

A client's "lifetime" is maxed out at 2 years given the structure of this dataset. Of course a client can exist for longer, but one year on either side of the date in question controls for seasonal trends and lets us easily produce annual estimates for, say, user acquisition ROI.

The procedure for calculating a user's LTV is as follows:

  • Step 1: Determine the ad click value for the user's region/engine
    • (Revenue in Country C for Engine E) / (Total Ad Clicks in Country C for Engine E)
  • Step 2: Determine the user's ad clicks per day for the past 365 days
    • (Total Ad Clicks for User) / (Total Active Ad Click Days for User)
  • Step 3: Calculate a user's past LTV by multiplying the following:
    • Total Active Ad Click Days for User
    • Ad Clicks per Day for User (derived in step 2)
    • Ad Click Value in Country C for Engine E (derived from step 1)
  • Step 4: Calculate a user's future LTV by multiplying the following:
    • Total Predicted Active Ad Click Days for User
    • Ad Clicks per Day for User (derived in step 2)
    • Ad Click Value in Country C for Engine E (derived from step 1)
  • Step 5: Normalized the LTV values from (3) and (4)
    • (User past LTV) / (Sum of all past user LTVs)
    • (User future LTV) / (Sum of all future user LTVs)

The normalized LTV for a user can roughly be interpreted as a user's contribution to the collective value of our user-base. Note that the above procedure omits some outlier handling steps for simplicity.

Background and Caveats

The normalized_ltv_ad_clicks_current field, for example, does not represent a user's contribution to revenue directly. It should be treated as a rough proxy. It is not appropriate to multiply revenue by this number.

LTV is broken down by engine, so the LTV for a user who searches on multiple engines must be interpreted in context. LTV is only available for Google and Bing on Firefox Desktop at this time.

We do have the ability to calculate a dollar value per user, however the (unnormalized) table is restricted to those with proper revenue access. For more information, see Getting Help.

Example Queries

Percent of users we predict will click an ad in the next 365 days by Engine. (source)

SELECT
  engine,
  AVG(IF(pred_num_days_seeing_ads > 0, 1, 0)) as pct_predicted_ad_viewers_next_year,
  AVG(IF(pred_num_days_clicking_ads > 0, 1, 0)) as pct_predicted_ad_clickers_next_year,
FROM
  `moz-fx-data-shared-prod`.revenue_derived.client_ltv_v1
WHERE
  submission_date = DATE_SUB(CURRENT_DATE, INTERVAL 1 DAY)
GROUP BY
  1

LTV Value of Users Over Lifetime (by days_since_created_profile) of Users Active in Past 7 Days (source)

SELECT
  days_since_created_profile,
  SUM(normalized_ltv_ad_clicks_current) AS sum_normalized_ltv_ad_clicks_current,
  SUM(normalized_ltv_ad_clicks_future) AS normalized_ltv_ad_clicks_future,
  SUM(normalized_ltv_ad_clicks_future) / COUNT(*) AS avg_normalized_ltv_ad_clicks_future,
FROM
  `moz-fx-data-shared-prod`.revenue_derived.client_ltv_v1
JOIN
  `moz-fx-data-shared-prod`.search.search_clients_last_seen
  USING(submission_date, client_id)
WHERE
  submission_date = '2020-09-16'
  AND days_since_created_profile <= 365
  AND days_since_seen <= 6
GROUP BY
  days_since_created_profile
ORDER BY
  days_since_created_profile DESC


Schema

As of 2020-09-16, the current version of client_ltv is v1, and has a schema as follows. The dataset is backfilled through 2020-09-14.

root
 |-- submission_date: date (nullable = true)
 |-- engine: string (nullable = true)
 |-- country: string (nullable = true)
 |-- client_id: string (nullable = true)
 |-- total_client_searches_past_year: long (nullable = true)
 |-- total_client_tagged_searches_past_year: long (nullable = true)
 |-- total_client_ad_clicks_past_year: long (nullable = true)
 |-- total_client_searches_with_ads_past_year: long (nullable = true)
 |-- ad_click_days: long (nullable = true)
 |-- search_days: long (nullable = true)
 |-- search_with_ads_days: long (nullable = true)
 |-- tagged_search_days: long (nullable = true)
 |-- active_days: long (nullable = true)
 |-- pred_num_days_clicking_ads: double (nullable = true)
 |-- pred_num_days_seeing_ads: double (nullable = true)
 |-- pred_num_days_searching: double (nullable = true)
 |-- pred_num_days_tagged_searching: double (nullable = true)
 |-- ad_clicks_per_day: double (nullable = true)
 |-- searches_with_ads_per_day: double (nullable = true)
 |-- searches_per_day: double (nullable = true)
 |-- tagged_searches_per_day: double (nullable = true)
 |-- ad_clicks_cutoff: double (nullable = true)
 |-- searches_with_ads_cutoff: double (nullable = true)
 |-- searches_cutoff: double (nullable = true)
 |-- tagged_searches_cutoff: double (nullable = true)
 |-- ad_clicks_per_day_capped: double (nullable = true)
 |-- searches_with_ads_per_day_capped: double (nullable = true)
 |-- searches_per_day_capped: double (nullable = true)
 |-- tagged_searches_per_day_capped: double (nullable = true)
 |-- total_ad_clicks: long (nullable = true)
 |-- normalized_ltv_ad_clicks_current: double (nullable = true)
 |-- normalized_ltv_search_with_ads_current: double (nullable = true)
 |-- normalized_ltv_search_current: double (nullable = true)
 |-- normalized_ltv_tagged_search_current: double (nullable = true)
 |-- normalized_ltv_ad_clicks_future: double (nullable = true)
 |-- normalized_ltv_search_with_ads_future: double (nullable = true)
 |-- normalized_ltv_search_future: double (nullable = true)
 |-- normalized_ltv_tagged_search_future: double (nullable = true)

Code References

Model Performance

There is additionally a dataset, ltv_daily_model_perf, that tracks the LTV model's prediction performance each day it is re-trained. For a given day, one could check the performance with the following query:

SELECT
  active_days,
  actual,
  model
FROM
  `moz-fx-data-shared-prod.analysis.ltv_daily_model_perf`
WHERE
  date = '2020-09-29'
AND
  metric = 'days_clicked_ads'
ORDER BY
  active_days

This produces a histogram for the observed user frequencies and the model's predicted frequencies, allowing a chart similar to the one shown in the "assessing model fit" example in the lifetimes documentation. This table only checks performance for clients the model expects have, for example, clicked an ad in 0 to 28 days in the past year, since most of the distribution is contained in that interval.