- Adding New Fields
- Data Reference
- Code Reference
Note that since the introduction of BigQuery, we are able to represent the
main ping structure in a table, available as
New analyses should avoid
main_summary, which exists only for compatibility.
The main_summary table contains one row for each ping.
Each column represents one field from the main ping payload,
though only a subset of all main ping fields is included.
This dataset does not include most histograms.
This table is massive, which makes it difficult to work with.
Instead, we recommend using the
If you do need to query this table, filter on the
sample_id field and
limit the query to a short submission date range.
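
A minimal sketch of that advice, reusing the telemetry.main_summary table name and the submission_date_s3 and sample_id columns from the example query in this document. The three-day window and the sample_id value are arbitrary choices for illustration, and the comment about sampling assumes sample_id takes values from 0 to 99.

```sql
-- Count pings per day for one sample_id over a short date range.
-- Assumption: sample_id ranges from 0 to 99, so a single value
-- selects roughly 1% of clients.
SELECT
  submission_date_s3,
  COUNT(*) AS ping_count
FROM
  telemetry.main_summary
WHERE
  submission_date_s3 BETWEEN '2019-11-11' AND '2019-11-13'
  AND sample_id = 42
GROUP BY
  submission_date_s3
ORDER BY
  submission_date_s3
```

Because the table holds one row per ping, COUNT(*) here is a ping count, not a client count.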
The main_summary table is accessible through re:dash.
Here's an example query.
We support a few basic types that can be easily added to
Non-addon scalars are automatically added to
Once added, they will show as top-level fields, with the string
Other types of fields can be included as well, for example when a field needs a specific transformation.
In general, it is preferable to access the data directly in the
main ping table instead.
Compare the search volume for different search source values:
WITH search_data AS (
  SELECT
    s.source AS search_source,
    s.count AS search_count
  FROM
    telemetry.main_summary
  CROSS JOIN
    UNNEST(search_counts) AS s
  WHERE
    submission_date_s3 = '2019-11-11'
    AND sample_id = 42
    AND search_counts IS NOT NULL
)
SELECT
  search_source,
  SUM(search_count) AS total_searches
FROM
  search_data
GROUP BY
  search_source
ORDER BY
  SUM(search_count) DESC
The main_summary dataset contains one record for each
as long as the record contains a non-null value for
We never expect nulls for these fields.
As of 2019-11-28, the current version of the
main_summary dataset is
Most of the fields are simple scalar values, with a few notable exceptions:
- The search_counts field is an array of structs, each item in the array representing a 3-tuple of (engine, source, count).
  The engine field represents the name of the search engine against which the searches were done.
  The source field represents the part of the Firefox UI that was used to perform the search. It contains values such as
  The count field contains the number of searches performed against this engine+source combination during that subsession. Any of the fields in the struct may be null (for example, if the search key did not match the expected pattern, or if the count was non-numeric).
- The loop_activity_counter field is a simple struct containing inner fields for each expected value of the LOOP_ACTIVITY_COUNTER Enumerated Histogram. Each inner field is a count for that histogram bucket.
- The popup_notification_stats field is a map of String keys to struct values, each field in the struct being a count for the expected values of the POPUP_NOTIFICATION_STATS Keyed Enumerated Histogram.
- The places_pages_count fields contain the mean value of the corresponding Histogram, which can be interpreted as the average number of bookmarks or pages in a given subsession.
- The active_addons field contains an array of structs, one for each entry in the environment.addons.activeAddons section of the payload. More detail in Bug 1290181.
- The disabled_addons_ids field contains an array of strings, one for each entry in the payload.addonDetails which is not already reported in the environment.addons.activeAddons section of the payload. More detail in Bug 1390814. Please note that while using this field is generally OK, this was introduced to support the TAAR project and you should not count on it in the future. The field can stay in the main_summary, but we might need to slightly change the ping structure to something better than
- The theme field contains a single struct in the same shape as the items in the active_addons array. It contains information about the currently active browser theme.
- The user_prefs field contains a struct with values for preferences of interest.
- The events field contains an array of event structs.
- Dynamically-included histogram fields are present as key->value maps, or key->(key->value) nested maps for keyed histograms.
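
Since any member of a search_counts struct may be null, a query over that array usually needs to decide how to treat incomplete entries. The sketch below is one way to do that: it totals searches per engine and drops entries whose engine or count is null. It reuses the table, columns, date, and sample_id from the example query earlier in this document; the choice to discard incomplete entries, rather than bucket them separately, is an assumption made for illustration.

```sql
-- Total searches per engine for one day and one sample_id,
-- skipping array entries with a null engine or a null count.
SELECT
  s.engine AS search_engine,
  SUM(s.count) AS total_searches
FROM
  telemetry.main_summary
CROSS JOIN
  UNNEST(search_counts) AS s
WHERE
  submission_date_s3 = '2019-11-11'
  AND sample_id = 42
  AND s.engine IS NOT NULL
  AND s.count IS NOT NULL
GROUP BY
  search_engine
ORDER BY
  total_searches DESC
```

The CROSS JOIN UNNEST produces one row per array entry, so pings with a null or empty search_counts array contribute no rows.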
Fields in main_summary may use one of a handful of time formats with different precisions:
|stamped at ingestion||nanoseconds since epoch|
|derived from timestamp|
|derived from HTTP header: ||HTTP date header string sent with the ping|
|time of ping creation ISO8601 at UTC+0|
|timezone offset in minutes|
|hourly precision, ISO8601 date in local time|
|subsession length in seconds|
|days since epoch|
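
Two of the formats above, nanoseconds since epoch and days since epoch, need a conversion before they can be compared with ordinary dates. The sketch below shows one way to do this in BigQuery Standard SQL. It uses literal values in place of real column names, which are not assumed here, so it runs without reading any table.

```sql
-- Literal values stand in for real columns.
SELECT
  -- Nanoseconds since epoch -> TIMESTAMP. BigQuery timestamps have
  -- microsecond precision, so integer-divide by 1000 first.
  TIMESTAMP_MICROS(DIV(1574899200000000000, 1000)) AS from_nanoseconds,
  -- Days since epoch -> DATE.
  DATE_FROM_UNIX_DATE(18228) AS from_days
```

Both literals correspond to 2019-11-28 (midnight UTC for the timestamp), which makes it easy to check that the two conversions agree.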
This dataset is generated by bigquery-etl. Refer to this repository for information on how to run or augment the dataset.