The trailheads below lead into the projects and code that make up the Firefox Data Platform.

Telemetry APIs

Name and repo Description
python_moztelemetry Python APIs for Mozilla Telemetry
moztelemetry Scala APIs for Mozilla Telemetry
spark-hyperloglog Algebird's HyperLogLog support for Apache Spark
mozanalysis A library for Mozilla experiments analysis
glean A client-side mobile Telemetry SDK for collecting metrics and sending them to Mozilla's Telemetry service

ETL code and Datasets

Name and repo Description
telemetry-batch-view Scala ETL code for derived datasets
python_mozetl Python ETL code for derived datasets
telemetry-airflow Airflow configuration and DAGs for scheduled jobs
python_mozaggregator Aggregation job for aggregates
telemetry-streaming Spark Streaming ETL jobs for Mozilla Telemetry

See also firefox-data-docs for documentation on datasets.

Infrastructure

Name and repo Description
mozilla-pipeline-schemas JSON and Parquet Schemas for Mozilla Telemetry and other structured data
hindsight Real-time data processing
lua_sandbox Generic sandbox for safe data analysis
lua_sandbox_extensions Modules and packages that extend the Lua sandbox
nginx_moz_ingest Nginx module for Telemetry data ingestion
puppet-config Cloud services puppet config for deploying infrastructure
parquet2hive Hive import statement generator for Parquet datasets
edge-validator A service endpoint for validating incoming data
gcp-ingestion Documentation and implementation of the Mozilla telemetry ingestion system on Google Cloud Platform

EMR Bootstrap scripts

Name and repo Description
emr-bootstrap-spark AWS bootstrap scripts for Spark
emr-bootstrap-presto AWS bootstrap scripts for Presto

Data applications

Name and repo Description
Main entry point for viewing aggregate Telemetry data
Cerberus & Medusa Automatic alert system for telemetry aggregates
analysis.t.m.o Self serve data analysis platform
Mission Control Low latency dashboard for stability and health metrics
Re:dash Mozilla's fork of the data query / visualization system
TAAR Telemetry-aware addon recommender
Ensemble A minimalist platform for publishing data
Hardware Report Firefox Hardware Report, available here
python-zeppelin Convert Zeppelin notebooks to Markdown
St. Mocli A command-line interface to STMO
probe-scraper Scrape and publish Telemetry probe data from Firefox
test-tube Compare data across branches in experiments
experimenter A web application for managing experiments
St. Moab Automatically generate Re:dash dashboard for A/B experiments

Legacy projects

Projects in this section are less active but may not be officially deprecated. Check with the fx-data-dev mailing list before starting a new project that uses anything listed here.

Name and repo Description
telemetry-next-node A node.js package for accessing Telemetry Aggregates data

Reference materials


Name and repo Description
firefox-data-docs All the info you need to answer questions about Firefox users with data
Firefox source docs Mozilla Source Tree Docs - Telemetry section
reports.t.m.o Knowledge repository for public reports


Name and repo Description
Fx-Data-Planning Quarterly goals and internal documentation