Firefox Android Clients reference

Introduction

The table fenix_derived.firefox_android_clients contains the first observations for Firefox Android clients, taken from whichever of the baseline, first_session and metrics pings reports first.

The goals of this table, as described in the proposal:

  • Enable client segmentation based on the attribution dimensions e.g. adjust_campaign, install source.
  • Facilitate the investigation of data incidents and the identification of the root cause when one or more metrics deviate from the expected values, by segmenting them along different dimensions.
  • Enable identifying bugs and data obtained via bots, e.g. BrowserStack.
  • Serve as the baseline to complement Glean's first_session ping for mobile browsers in order to use it as a single source for first reported attributes.
  • Serve as a baseline to create a first_session ping for Firefox Desktop.

Contents

The table granularity is one row per client_id.

It contains the attribution, isp, os_version, device, channel and first reported country for each client. The field descriptions are fully documented in BigQuery.

This table contains data only for the release channel, since it was the only channel with data available in the first_session ping at the time of implementation, and it suffices for the goals. Data is available from August 2020, when the migration from Fennec to Fenix took place.

Scheduling

Incremental updates happen daily in the Airflow DAG bqetl_analytics_tables.

The table is built and initialized using the init.sql file and is incrementally updated using query.sql. The incremental update also revises historical records when attribution details are received from pings that arrive at the server after the first_seen date.
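
The update pattern described above can be illustrated with a minimal sketch. This is not the logic in query.sql: the function name apply_daily_update, the in-memory dictionary standing in for the table, and the rule of filling adjust_campaign only when it is still missing are all assumptions made for illustration.

```python
from datetime import date


def apply_daily_update(table, day_rows):
    """Merge one day's rows into `table`, a dict keyed by client_id.

    Hypothetical helper: new clients are inserted, and known clients
    only have a missing adjust_campaign backfilled.
    """
    for row in day_rows:
        existing = table.get(row["client_id"])
        if existing is None:
            # New client: store its first observation as-is.
            table[row["client_id"]] = dict(row)
        elif existing["adjust_campaign"] is None and row["adjust_campaign"] is not None:
            # Attribution arrived late: backfill it and leave
            # first_seen_date untouched.
            existing["adjust_campaign"] = row["adjust_campaign"]
    return table


# Toy data, two consecutive daily runs.
table = {}
apply_daily_update(table, [
    {"client_id": "a", "first_seen_date": date(2023, 1, 1), "adjust_campaign": None},
])
apply_daily_update(table, [
    {"client_id": "a", "first_seen_date": date(2023, 1, 2), "adjust_campaign": "campaign_x"},
    {"client_id": "b", "first_seen_date": date(2023, 1, 2), "adjust_campaign": None},
])
```

After the second run, client "a" keeps its original first_seen_date but now carries the late-arriving attribution, and the table still holds one row per client_id.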

Code Reference

The query and metadata for this table are defined in the corresponding sub-folder in bigquery-etl under fenix_derived.

How to query

This table should be accessed through the user-facing view fenix.firefox_android_clients, which implements additional business logic for grouping attribution data. To combine it with other tables, join on client_id.

For analysis purposes, it's important to use the business date first_seen_date when filtering. This date corresponds to when the baseline ping is actually collected on the client side.
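
As a rough sketch of the join and filter described above, the following uses an in-memory SQLite database as a local stand-in for BigQuery. Only client_id, first_seen_date and adjust_campaign come from this page. The daily_activity table, its columns and all rows are hypothetical toy data, and the real view may expose different column names.

```python
import sqlite3

# Local stand-in for the BigQuery view and a hypothetical activity table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE firefox_android_clients (
        client_id TEXT PRIMARY KEY,   -- one row per client_id
        first_seen_date TEXT,
        adjust_campaign TEXT
    );
    CREATE TABLE daily_activity (     -- hypothetical table to segment
        client_id TEXT,
        submission_date TEXT,
        active_seconds INTEGER
    );
    INSERT INTO firefox_android_clients VALUES
        ('a', '2023-01-01', 'campaign_x'),
        ('b', '2023-01-02', NULL),
        ('c', '2022-12-15', 'campaign_x');
    INSERT INTO daily_activity VALUES
        ('a', '2023-01-03', 120),
        ('b', '2023-01-03', 60),
        ('c', '2023-01-03', 30);
""")

# Join on client_id and filter on the business date first_seen_date.
QUERY = """
    SELECT
        c.adjust_campaign,
        COUNT(DISTINCT a.client_id) AS clients,
        SUM(a.active_seconds) AS active_seconds
    FROM daily_activity AS a
    JOIN firefox_android_clients AS c
        ON a.client_id = c.client_id
    WHERE c.first_seen_date BETWEEN :first_day AND :last_day
    GROUP BY c.adjust_campaign
"""

rows = conn.execute(
    QUERY, {"first_day": "2023-01-01", "last_day": "2023-01-31"}
).fetchall()
by_campaign = {campaign: (clients, seconds) for campaign, clients, seconds in rows}
```

Client "c" is excluded because its first_seen_date falls outside the window, even though its activity row does not. The same SELECT shape should carry over to BigQuery with the view name in place of the toy table, though the parameter syntax differs there.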