Clients Daily

Many questions about Firefox take the form "What did clients with characteristics X, Y, and Z do during the period S to E?" The clients_daily table aims to answer these questions. Each row in the table is a (client_id, submission_date) and contains a number of aggregates about that day's activity.

Accessing the Data

The clients_daily table is accessible through STMO using the Telemetry (BigQuery) data source.

Data Reference

Example Queries

Compute Churn for a one-day cohort:

SELECT
  submission_date,
  approx_count_distinct(client_id) AS cohort_dau
FROM
  telemetry.clients_daily
WHERE
  submission_date > '2017-08-31'
  AND submission_date < '2017-10-01'
  AND profile_creation_date LIKE '2017-09-01%'
GROUP BY 1
ORDER BY 1

Distribution of pings per client per day:

SELECT
  normalized_channel,
  CASE
    WHEN pings_aggregated_by_this_row > 50 THEN 50
    ELSE pings_aggregated_by_this_row
  END AS pings_per_day,
  approx_count_distinct(client_id) AS client_count
FROM telemetry.clients_daily
WHERE
  submission_date = '2017-09-01'
  AND normalized_channel <> 'Other'
GROUP BY
  normalized_channel,
  pings_per_day
ORDER BY
  pings_per_day,
  normalized_channel

Scheduling

This dataset is updated daily via the telemetry-airflow infrastructure. The job runs as part of the main_summary DAG.

Schema

The data is partitioned by submission_date.

As of 2019-11-28, the current version of the clients_daily dataset is v6.

Code Reference

This dataset is generated by BigQuery ETL. The query that generates the dataset is sql/telemetry_derived/clients_daily_v6/query.sql.