Projects

Below are a number of trailheads that lead into the projects and code that make up the Firefox Data Platform.

Telemetry APIs

| Name and repo | Description |
|---|---|
| python_moztelemetry | Python APIs for Mozilla Telemetry |
| moztelemetry | Scala APIs for Mozilla Telemetry |
| spark-hyperloglog | Algebird's HyperLogLog support for Apache Spark |
| mozanalysis | A library for Mozilla experiments analysis |
| glean | A client-side mobile Telemetry SDK for collecting metrics and sending them to Mozilla's Telemetry service |

ETL code and Datasets

| Name and repo | Description |
|---|---|
| telemetry-batch-view | Scala ETL code for derived datasets |
| python_mozetl | Python ETL code for derived datasets |
| telemetry-airflow | Airflow configuration and DAGs for scheduled jobs |
| python_mozaggregator | Aggregation job for telemetry.mozilla.org aggregates |
| telemetry-streaming | Spark Streaming ETL jobs for Mozilla Telemetry |

See also firefox-data-docs for documentation on datasets.

Infrastructure

| Name and repo | Description |
|---|---|
| mozilla-pipeline-schemas | JSON and Parquet Schemas for Mozilla Telemetry and other structured data |
| hindsight | Real-time data processing |
| lua_sandbox | Generic sandbox for safe data analysis |
| lua_sandbox_extensions | Modules and packages that extend the Lua sandbox |
| nginx_moz_ingest | Nginx module for Telemetry data ingestion |
| puppet-config | Cloud services puppet config for deploying infrastructure |
| parquet2hive | Hive import statement generator for Parquet datasets |
| edge-validator | A service endpoint for validating incoming data |
| gcp-ingestion | Documentation and implementation of the Mozilla telemetry ingestion system on Google Cloud Platform |

EMR Bootstrap scripts

| Name and repo | Description |
|---|---|
| emr-bootstrap-spark | AWS bootstrap scripts for Spark |
| emr-bootstrap-presto | AWS bootstrap scripts for Presto |

Data applications

| Name and repo | Description |
|---|---|
| telemetry.mozilla.org | Main entry point for viewing aggregate Telemetry data |
| Cerberus & Medusa | Automatic alert system for telemetry aggregates |
| analysis.t.m.o | Self-serve data analysis platform |
| Mission Control | Low-latency dashboard for stability and health metrics |
| Re:dash | Mozilla's fork of the data query / visualization system |
| redash-stmo | Mozilla's extensions to Re:dash |
| TAAR | Telemetry-aware addon recommender |
| Ensemble | A minimalist platform for publishing data |
| Hardware Report | Firefox Hardware Report, available here |
| python-zeppelin | Convert Zeppelin notebooks to Markdown |
| St. Mocli | A command-line interface to STMO |
| probe-scraper | Scrape and publish Telemetry probe data from Firefox |
| test-tube | Compare data across branches in experiments |
| experimenter | A web application for managing experiments |
| St. Moab | Automatically generate Re:dash dashboards for A/B experiments |

Legacy projects

Projects in this section are less active, but may not be officially deprecated. Please check with the fx-data-dev mailing list before starting a new project using anything in this section.

| Name and repo | Description |
|---|---|
| telemetry-next-node | A node.js package for accessing Telemetry Aggregates data |

Reference materials

Public

| Name and repo | Description |
|---|---|
| firefox-data-docs | All the info you need to answer questions about Firefox users with data |
| Firefox source docs | Mozilla Source Tree Docs - Telemetry section |
| reports.t.m.o | Knowledge repository for public reports |

Non-public

| Name and repo | Description |
|---|---|
| Fx-Data-Planning | Quarterly goals and internal documentation |