Analyzing data from SHIELD studies
This article introduces the datasets that are useful for analyzing SHIELD studies. After reading this article, you should understand how to answer questions about study enrollment, identify telemetry from clients enrolled in an experiment, and locate telemetry from add-on studies.
Dashboards
Experimenter is the place to find lists of live experiments.
Experiment slugs
Each experiment is associated with a slug, which is the label used to identify the experiment to Normandy clients. The slug is also used to identify the experiment in most telemetry. The slug for pref-flip experiments is defined in the recipe by a field named slug; the slug for add-on experiments is defined in the recipe by a field named name.
You can determine the slug for a particular experiment by consulting this summary table or the list of active recipes at https://normandy.cdn.mozilla.net/api/v1/recipe/signed/.
Tables
These tables are accessible from BigQuery and Databricks.
experiments column
main_summary, clients_daily, and some other tables include an experiments column, which is a mapping from experiment slug to branch. You can collect rows from enrolled clients using query syntax like:
SELECT
  ... some fields ...,
  udf.get_key(experiments, 'some-experiment-slug-12345') AS branch
FROM
  telemetry.clients_daily
WHERE
  udf.get_key(experiments, 'some-experiment-slug-12345') IS NOT NULL
experiments
The experiments table is a subset of rows from main_summary, reflecting pings from clients that are currently enrolled in an experiment. The experiments table has additional string-type experiment_id and experiment_branch columns, and is partitioned by experiment_id, which makes it efficient to query.
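As a sketch, a per-branch client count might look like the query below. It assumes the table is exposed in BigQuery as telemetry.experiments and that client_id is carried over from main_summary; adjust the dataset path to wherever the table lives in your environment:

SELECT
  experiment_branch,
  COUNT(DISTINCT client_id) AS n_clients
FROM
  telemetry.experiments
WHERE
  -- experiment_id is the partition column, so filtering on it keeps the scan small
  experiment_id = 'some-experiment-slug-12345'
GROUP BY
  experiment_branch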
Experiments deployed to large fractions of the release channel may have the isHighVolume flag set in the Normandy recipe; those experiments will not be aggregated into the experiments table.
Please note that the experiments table cannot be used for calculating retention for periods extending beyond the end of the experiment. Once a client is unenrolled from an experiment, subsequent pings will not be captured by the experiments table.
events
The events table includes Normandy enrollment and unenrollment events for both pref-flip and add-on studies. Normandy events have event category normandy. The event value will contain the experiment slug. The event schema is described in the Firefox source tree.
The events table is updated daily.
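For illustration, a daily count of enrollment and unenrollment events for one experiment could be sketched as follows. This assumes the events table exposes event_category, event_method, and event_string_value columns and that Normandy uses enroll/unenroll method names; verify these against the event schema in the Firefox source tree before relying on them:

SELECT
  submission_date,
  event_method,                -- e.g. enroll or unenroll
  COUNT(*) AS n_events
FROM
  telemetry.events
WHERE
  event_category = 'normandy'
  AND event_string_value = 'some-experiment-slug-12345'  -- the experiment slug
  AND submission_date >= '2019-01-01'
GROUP BY
  submission_date,
  event_method
ORDER BY
  submission_date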
telemetry.shield_study_addon
The telemetry.shield_study_addon table contains SHIELD telemetry from add-on experiments, i.e. key-value pairs sent with the browser.study.sendTelemetry() method from the SHIELD study add-on utilities library.
The study_name attribute of the payload column will contain the identifier registered with the SHIELD add-on utilities. This is set by the add-on; sometimes it takes the value of applications.gecko.id from the add-on's manifest.json. This is often not the same as the Normandy slug.
The schema for shield-study-addon pings is described in the mozilla-pipeline-schemas repository.
The key-value pairs are present in the data attribute of the payload column.
The telemetry.shield_study_addon table contains only full days of data. If you need access to data with lower latency, you can use the "live" table telemetry_live.shield_study_addon_v4, which should have latency significantly less than 1 hour.
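As a hedged sketch, pulling the key-value pairs for one add-on study could look like the query below. It assumes the study identifier and the key-value pairs are addressable as payload.study_name and payload.data, matching the description above; check the ping schema in mozilla-pipeline-schemas for the exact nesting:

SELECT
  submission_timestamp,
  payload.study_name,
  payload.data                 -- key-value pairs sent via browser.study.sendTelemetry()
FROM
  telemetry.shield_study_addon
WHERE
  DATE(submission_timestamp) >= '2019-01-01'
  AND payload.study_name = 'some-addon-study-name'  -- identifier registered with the add-on utilities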
telemetry.shield_study
The telemetry.shield_study dataset includes enrollment and unenrollment events for add-on experiments only, sent by the SHIELD study add-on utilities.
The study_name attribute of the payload column will contain the identifier registered with the SHIELD add-on utilities. This is set by the add-on; sometimes it takes the value of applications.gecko.id from the add-on's manifest.json. This is often not the same as the Normandy slug.
Normandy also emits its own enrollment and unenrollment events for these studies, which are available in the events table.
The telemetry.shield_study table contains only full days of data. If you need access to data with lower latency, you can use the "live" table telemetry_live.shield_study_v4, which should have latency significantly less than 1 hour.
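To illustrate, a daily count of add-on study enrollments per study could be sketched as below. The payload.data.study_state field and the 'enter' value used to mark enrollment are assumptions here; the ping schema in mozilla-pipeline-schemas is authoritative:

SELECT
  DATE(submission_timestamp) AS day,
  payload.study_name,
  COUNT(*) AS enrollments
FROM
  telemetry.shield_study
WHERE
  DATE(submission_timestamp) >= '2019-01-01'
  AND payload.data.study_state = 'enter'   -- assumed field/value marking enrollment; verify against the schema
GROUP BY
  day,
  payload.study_name
ORDER BY
  day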