Normalized Client Lifetime Value (LTV)
Introduction
client_ltv
is designed to enable relative user value estimates based on their past and expected search and ad click behavior. This behavior is revenue-generating for Firefox.
Contents
client_ltv
has one row for each (client_id
, engine
, submission_date
) triplet.
Each row captures a year's worth of history for the client_id
on the given engine
, therefore the values will not change much when looking at submission_date
in 1-day (or even 1-month) sequences, since there is significant overlap. For most analyses, using yesterday's submission_date
will be sufficient. To get users active in the last i.e. 7 days, a join with clients_last_seen
is required. We plan to propagate the necessary fields into client_ltv
in the future so such a join isn't necessary.
Using yesterday's date, a client's row will contain the total number of searches, ad clicks, etc for the last 365 days, along with active search and ad click days (how many days did a user search or click an ad). Additionally each row contains the predicted number active search/ad click days for the next 365 days. See the schema at the bottom of this page for a full list of the fields.
LTV
A client's "lifetime" is maxed out at 2 years given the structure of this dataset. Of course a client can exist for longer, but one year on either side of the date in question controls for seasonal trends and lets us easily produce annual estimates for, say, user acquisition ROI.
The procedure for calculating a user's LTV is as follows:
- Step 1: Determine the ad click value for the user's region/engine
- (Revenue in Country C for Engine E) / (Total Ad Clicks in Country C for Engine E)
- Step 2: Determine the user's ad clicks per day for the past 365 days
- (Total Ad Clicks for User) / (Total Active Ad Click Days for User)
- Step 3: Calculate a user's past LTV by multiplying the following:
- Total Active Ad Click Days for User
- Ad Clicks per Day for User (derived in step 2)
- Ad Click Value in Country C for Engine E (derived from step 1)
- Step 4: Calculate a user's future LTV by multiplying the following:
- Total Predicted Active Ad Click Days for User
- Ad Clicks per Day for User (derived in step 2)
- Ad Click Value in Country C for Engine E (derived from step 1)
- Step 5: Normalized the LTV values from (3) and (4)
- (User past LTV) / (Sum of all past user LTVs)
- (User future LTV) / (Sum of all future user LTVs)
The normalized LTV for a user can roughly be interpreted as a user's contribution to the collective value of our user-base. Note that the above procedure omits some outlier handling steps for simplicity.
Background and Caveats
The normalized_ltv_ad_clicks_current
field, for example, does not represent a user's contribution to revenue directly. It should be treated as a rough proxy. It is not appropriate to multiply revenue by this number.
LTV is broken down by engine, so the LTV for a user who searches on multiple engines must be interpreted in context. LTV is only available for Google and Bing on Firefox Desktop at this time.
We do have the ability to calculate a dollar value per user, however the (unnormalized) table is restricted to those with proper revenue access. For more information, see Getting Help.
Example Queries
Percent of users we predict will click an ad in the next 365 days by Engine. (STMO#74878
)
SELECT
engine,
AVG(IF(pred_num_days_seeing_ads > 0, 1, 0)) as pct_predicted_ad_viewers_next_year,
AVG(IF(pred_num_days_clicking_ads > 0, 1, 0)) as pct_predicted_ad_clickers_next_year,
FROM
`moz-fx-data-shared-prod`.revenue.client_ltv
WHERE
submission_date = DATE_SUB(CURRENT_DATE, INTERVAL 1 DAY)
GROUP BY
1
LTV Value of Users Over Lifetime (by days_since_created_profile
) of Users Active in Past 7 Days (STMO#187036
)
SELECT
days_since_created_profile,
SUM(normalized_ltv_ad_clicks_current) AS sum_normalized_ltv_ad_clicks_current,
SUM(normalized_ltv_ad_clicks_future) AS normalized_ltv_ad_clicks_future,
SUM(normalized_ltv_ad_clicks_future) / COUNT(*) AS avg_normalized_ltv_ad_clicks_future,
FROM
`moz-fx-data-shared-prod`.revenue.client_ltv
JOIN
`moz-fx-data-shared-prod`.search.search_clients_last_seen
USING(submission_date, client_id)
WHERE
submission_date = '2020-09-16'
AND days_since_created_profile <= 365
AND days_since_seen <= 6
GROUP BY
days_since_created_profile
ORDER BY
days_since_created_profile DESC
Schema
As of 2020-09-16,
the current version of client_ltv
is v1
,
and has a schema as follows.
The dataset is backfilled through 2020-09-14.
root
|-- submission_date: date (nullable = true)
|-- engine: string (nullable = true)
|-- country: string (nullable = true)
|-- client_id: string (nullable = true)
|-- total_client_searches_past_year: long (nullable = true)
|-- total_client_tagged_searches_past_year: long (nullable = true)
|-- total_client_ad_clicks_past_year: long (nullable = true)
|-- total_client_searches_with_ads_past_year: long (nullable = true)
|-- ad_click_days: long (nullable = true)
|-- search_days: long (nullable = true)
|-- search_with_ads_days: long (nullable = true)
|-- tagged_search_days: long (nullable = true)
|-- active_days: long (nullable = true)
|-- pred_num_days_clicking_ads: double (nullable = true)
|-- pred_num_days_seeing_ads: double (nullable = true)
|-- pred_num_days_searching: double (nullable = true)
|-- pred_num_days_tagged_searching: double (nullable = true)
|-- ad_clicks_per_day: double (nullable = true)
|-- searches_with_ads_per_day: double (nullable = true)
|-- searches_per_day: double (nullable = true)
|-- tagged_searches_per_day: double (nullable = true)
|-- ad_clicks_cutoff: double (nullable = true)
|-- searches_with_ads_cutoff: double (nullable = true)
|-- searches_cutoff: double (nullable = true)
|-- tagged_searches_cutoff: double (nullable = true)
|-- ad_clicks_per_day_capped: double (nullable = true)
|-- searches_with_ads_per_day_capped: double (nullable = true)
|-- searches_per_day_capped: double (nullable = true)
|-- tagged_searches_per_day_capped: double (nullable = true)
|-- total_ad_clicks: long (nullable = true)
|-- normalized_ltv_ad_clicks_current: double (nullable = true)
|-- normalized_ltv_search_with_ads_current: double (nullable = true)
|-- normalized_ltv_search_current: double (nullable = true)
|-- normalized_ltv_tagged_search_current: double (nullable = true)
|-- normalized_ltv_ad_clicks_future: double (nullable = true)
|-- normalized_ltv_search_with_ads_future: double (nullable = true)
|-- normalized_ltv_search_future: double (nullable = true)
|-- normalized_ltv_tagged_search_future: double (nullable = true)
Code References
- LTV daily model fitting
- Unnormalized
client_ltv
query (restricted query access) client_ltv
view (for broad use)
Model Performance
There is additionally a dataset, ltv_daily_model_perf
, that tracks the LTV model's prediction performance each day it is re-trained. For a given day, one could check the performance with the following query (STMO#187873
):
SELECT
active_days,
actual,
model
FROM
`moz-fx-data-shared-prod.analysis.ltv_daily_model_perf`
WHERE
date = '2020-09-29'
AND
metric = 'days_clicked_ads'
ORDER BY
active_days
This produces a histogram for the observed user frequencies and the model's predicted frequencies, allowing a chart similar to the one shown in the "assessing model fit" example in the lifetimes
documentation. This table only checks performance for clients the model expects have, for example, clicked an ad in 0 to 28 days in the past year, since most of the distribution is contained in that interval.