This glossary provides definitions for some common terms used in the Mozilla data universe.
If you're new to Mozilla, you may also find the general glossary on
- Build ID
- Client ID
- Data Analyst
- Data Engineer
- Data Practitioner
- Data Scientist
- Derived Dataset
- Ping Table
- STMO (sql.telemetry.mozilla.org)
Account Ecosystem Telemetry (never fully launched); see the PRD
See Data Analyst.
A third-party product formerly used by several teams within Mozilla for analysis of user events.
A unique identifier for a build like
Often used to identify and aggregate telemetry submitted by specific versions of our software.
Note that the format may differ across product lines.
A unique ID identifying the client that sent a ping.
A "Data Engineer" at Mozilla generally refers to someone on the Data Engineering team. They implement and maintain the data platform and tools described in this document. They may also assist data scientists or other data practitioners, as needed.
A data practitioner is someone who looks at data, identifies trends and other qualitative measurements in it, and creates charts and dashboards. This could be anyone: an engineer, product manager, data engineer, or data scientist.
A "Data Scientist" at Mozilla generally refers to someone on the Data Science team. They have a broad array of technical backgrounds and a core set of common professional skills:
- applying statistical methods to noisy data to answer questions about what, how, or why something is happening
- transforming unstructured data into usable metrics and models
- augmenting strategic product and decision-making with empirical evidence created and curated by the team
A set of data, such as ping data or derived datasets. The term is sometimes used synonymously with “table”, and sometimes used technically to refer to a BigQuery dataset, which is a container for one or more tables.
For more details, see the DAU Metric page on Confluence.
A processed dataset, such as Clients Daily. At Mozilla, this is in contrast to a raw ping table, which represents (more or less) the raw data submitted by our users.
Glean is Mozilla’s product analytics & telemetry solution that provides a consistent experience and behavior across all of our products. Most of Mozilla's mobile apps, including Fenix, have been adapted to use the Glean SDK. For more information, see the Glean Overview.
Google Cloud Platform (GCP) is a suite of cloud-computing services that runs on the same infrastructure that Google uses internally for its end-user products.
IP Geolocation involves attempting to discover the real-world location of an IP address. IP addresses are assigned to organizations, and because these associations change over time, it can be difficult to determine exactly where in the world an IP address is located. Mozilla’s ingestion infrastructure attempts to perform a GeoIP lookup during the data decoding process and subsequently discards the IP address before the message arrives in long-term storage.
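The lookup-then-discard step described above can be sketched as follows. This is a minimal illustration, not the ingestion pipeline's code: the `GEO_TABLE` mapping, the `decode` function, the message field names, and the country codes are all hypothetical, and the networks are reserved documentation ranges.

```python
import ipaddress

# Hypothetical lookup table. Real GeoIP databases are far larger and
# are updated as address assignments change.
GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "AA",
    ipaddress.ip_network("198.51.100.0/24"): "BB",
}


def decode(message):
    """Return a copy of `message` with a coarse location added and the IP removed."""
    decoded = dict(message)
    # Remove the IP address so it never reaches long-term storage.
    address = ipaddress.ip_address(decoded.pop("ip"))
    decoded["geo_country"] = next(
        (country for network, country in GEO_TABLE.items() if address in network),
        None,  # the lookup is only an attempt and may find nothing
    )
    return decoded


stored = decode({"ip": "192.0.2.7", "payload": "example"})
print(stored)
```

The stored record keeps the derived `geo_country` value but no longer contains the `ip` key.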
Mozilla's core data platform has been built to support structured ingestion of arbitrary JSON payloads, whether they come from browser products on client devices or from server-side applications that have nothing to do with Firefox. Any team at Mozilla can hook into structured ingestion by defining a schema and registering it with the pipeline. Once a schema is registered, everything else is automatically provisioned, from an HTTPS endpoint for accepting payloads to a set of tables in BigQuery for holding the processed data.
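To give a feel for what validating a payload against a registered schema means, here is a toy sketch. It assumes a deliberately simplified schema format (field name mapped to a Python type) as a stand-in for a real schema language; `EXAMPLE_SCHEMA`, `validate`, and the field names are hypothetical and not part of the pipeline.

```python
import json

# Hypothetical schema: required field name -> expected Python type.
EXAMPLE_SCHEMA = {"client_id": str, "event": str, "count": int}


def validate(payload_text, schema):
    """Return the parsed payload if it matches `schema`, else raise ValueError."""
    try:
        payload = json.loads(payload_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    for field, expected_type in schema.items():
        if field not in payload:
            raise ValueError(f"missing field: {field}")
        if not isinstance(payload[field], expected_type):
            raise ValueError(f"wrong type for field: {field}")
    return payload


accepted = validate('{"client_id": "abc", "event": "click", "count": 3}', EXAMPLE_SCHEMA)
print(accepted["event"])
```

A payload that fails these checks is rejected before it could reach any table, which is the property a schema is meant to guarantee.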
Intuitively, how many days per week do users use the product? More precisely: among profiles active at least once in the week ending on the specified date, the average number of days on which they were active during that one-week window.
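The definition above can be made concrete with a small sketch. It assumes a hypothetical in-memory mapping from profile ID to the set of dates on which that profile was active; the function name `intensity` and the sample data are illustrative only and are not the production definition.

```python
from datetime import date, timedelta


def intensity(activity, end_date):
    """Average number of active days in the 7-day window ending on `end_date`,
    taken over profiles active at least once in that window.

    `activity` is a hypothetical mapping: profile ID -> set of active dates.
    """
    window = {end_date - timedelta(days=i) for i in range(7)}
    days_per_profile = [
        len(dates & window) for dates in activity.values() if dates & window
    ]
    if not days_per_profile:
        return 0.0
    return sum(days_per_profile) / len(days_per_profile)


# Illustrative data only.
activity = {
    "profile-a": {date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 7)},
    "profile-b": {date(2024, 1, 7)},
    "profile-c": {date(2023, 12, 1)},  # inactive in the window, so excluded
}
print(intensity(activity, date(2024, 1, 7)))  # 2.0
```

Here `profile-a` contributes 3 days and `profile-b` contributes 1, so the average over the two active profiles is 2.0.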
Key Performance Indicator - a metric that is used to measure performance across an organization, product, or project.
In general: a metric is anything that you want to (and can) measure. This differs from a dimension, which is a qualitative attribute of data.
Monthly Active Users - the number of unique profiles active at least once during the 28-day window ending on the specified day.
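As a rough sketch of this windowed count, the function below counts unique profiles over a trailing window of configurable length. It assumes a hypothetical list of `(client_id, activity_date)` pairs; `active_profiles` and the sample data are illustrative, not a production query. Passing a 7-day window instead of 28 gives the weekly equivalent.

```python
from datetime import date, timedelta


def active_profiles(pings, end_date, window_days):
    """Count unique profiles with at least one active day in the window
    of `window_days` days ending on `end_date` (inclusive).

    `pings` is a hypothetical iterable of (client_id, activity_date) pairs.
    """
    start_date = end_date - timedelta(days=window_days - 1)
    return len({cid for cid, day in pings if start_date <= day <= end_date})


# Illustrative data only.
pings = [
    ("a", date(2024, 1, 28)),
    ("a", date(2024, 1, 27)),   # same profile, counted once
    ("b", date(2024, 1, 1)),    # first day of the 28-day window
    ("c", date(2023, 12, 31)),  # one day too early for the 28-day window
]
mau = active_profiles(pings, date(2024, 1, 28), window_days=28)
wau = active_profiles(pings, date(2024, 1, 28), window_days=7)
print(mau, wau)  # 2 1
```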
A ping represents a message that is sent from the Firefox browser to Mozilla’s Telemetry servers. It typically includes information about the browser’s state, user actions, etc. For more information, see Common ping format.
A set of pings that is stored in a BigQuery table. See the article on raw ping datasets.
Mozilla’s data pipeline, which is used to collect Telemetry data from Mozilla’s products and logs from various services. The bulk of the data that is handled by this pipeline is Firefox Telemetry data. The same tool-chain is used to collect, store, and analyze data that comes from many sources.
For more information, see An overview of Mozilla’s Data Pipeline.
Measurements for a specific aspect of Firefox are called probes. A single telemetry ping sends many different probes. Probes are either Histograms (recording distributions of data points) or Scalars (recording a single value).
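The difference between the two probe kinds can be illustrated with a toy sketch. The `Scalar` and `Histogram` classes, their method names, and the bucket layout below are hypothetical and do not reflect Firefox's actual Telemetry API.

```python
from bisect import bisect_right
from collections import Counter


class Scalar:
    """Records a single value; a later write replaces the earlier one."""

    def __init__(self):
        self.value = None

    def set(self, value):
        self.value = value


class Histogram:
    """Records a distribution by counting samples per bucket."""

    def __init__(self, bucket_lower_bounds):
        self.bounds = sorted(bucket_lower_bounds)
        self.counts = Counter()

    def accumulate(self, sample):
        # Pick the last bucket whose lower bound is <= sample;
        # samples below the first bound are clamped into the first bucket.
        index = max(bisect_right(self.bounds, sample) - 1, 0)
        self.counts[self.bounds[index]] += 1


tab_count = Scalar()
tab_count.set(4)
tab_count.set(7)

load_time_ms = Histogram([0, 10, 100])
for sample in (3, 12, 15, 250):
    load_time_ms.accumulate(sample)

print(tab_count.value)            # 7
print(dict(load_time_ms.counts))  # {0: 1, 10: 2, 100: 1}
```

The scalar keeps only the most recent value, while the histogram keeps a count for every bucket that received a sample.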
You can search for details about probes by using the Probe Dictionary. For each probe, the probe dictionary provides:
- A description of the probe
- When a probe started being collected
- Whether data from this probe is collected in the release channel
All of the changes a user makes in Firefox, such as the home page, toolbars in use, installed add-ons, saved passwords, and bookmarks, are stored in a special folder called a profile. Telemetry stores archived and pending pings in the profile directory, as well as metadata like the client ID. See also Profile Creation.
Typically refers to a query written in the SQL syntax, run on (for example) STMO.
As in “Data retention” - how long data is stored before it is automatically deleted/archived?
As in “User retention” - how likely is a user to continue using a product?
A schema is the organization or structure for our data. We use schemas at many levels (in data ingestion and storage) to make sure the data we submit is valid and can be processed efficiently.
The period of time between Firefox being started and being shut down. See also subsession.
A service for creating queries and dashboards. See STMO under analysis tools.
As you use Firefox, Telemetry measures and collects non-personal information, such as performance, hardware, usage and customizations. It then sends this information to Mozilla on a daily basis and we use it to improve Firefox.
Uniform Resource Locator - a text string that specifies where a resource (such as a web page, image, or video) can be found on the Internet (source). For example, https://www.mozilla.org is a URL.
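As a brief illustration of how a URL breaks down into parts, the sketch below splits the example URL with Python's standard `urllib.parse.urlsplit`. The choice of Python is an assumption made for illustration.

```python
from urllib.parse import urlsplit

parts = urlsplit("https://www.mozilla.org")
print(parts.scheme)  # https
print(parts.netloc)  # www.mozilla.org
print(repr(parts.path))  # '' (this URL has no path component)
```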
Weekly Active Users - the number of unique profiles active at least once during the 7-day window ending on the specified day.