Socorro Crash Reports
Public crash statistics for Firefox are available through the Data Platform in a
The crash data in Socorro is sanitized and made available to STMO.
A nightly import job converts batches of JSON documents into a columnar format using the associated JSON Schema.
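The conversion step can be pictured with a minimal sketch. It assumes a tiny, hypothetical schema and a hand-written pivot function; the actual import job and the real Socorro schema are considerably larger, and `SCHEMA`, `CASTS`, and `to_columns` are illustrative names only.

```python
import json

# Hypothetical, heavily simplified schema; the real Socorro schema has many more fields.
SCHEMA = {
    "type": "object",
    "properties": {
        "crash_date": {"type": "string"},
        "reason": {"type": "string"},
        "uptime": {"type": "integer"},
    },
}

# Casts applied per JSON Schema primitive type (assumed subset).
CASTS = {"string": str, "integer": int, "number": float}


def to_columns(documents, schema):
    """Pivot row-oriented JSON documents into one list per schema field."""
    columns = {name: [] for name in schema["properties"]}
    for raw in documents:
        record = json.loads(raw)
        for name, spec in schema["properties"].items():
            value = record.get(name)
            # Missing fields become None so every column keeps the same length.
            columns[name].append(None if value is None else CASTS[spec["type"]](value))
    return columns


# Made-up documents for illustration only.
batch = [
    '{"crash_date": "20180520", "reason": "SIGSEGV", "uptime": 120}',
    '{"crash_date": "20180520", "uptime": "45", "extra": "dropped"}',
]
columns = to_columns(batch, SCHEMA)
```

Fields that are not in the schema are dropped, and values are cast to the declared type, which is what lets a columnar file store each field with a single, fixed type.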
The dataset can be queried using SQL. For example, we can count the crashes for a given date and summarize their uptime (mean, standard deviation, and quartiles), grouped by date and reason.
SELECT
    crash_date,
    reason,
    count(*) AS n_crashes,
    avg(uptime) AS avg_uptime,
    stddev(uptime) AS stddev_uptime,
    approx_percentile(uptime, ARRAY[0.25, 0.5, 0.75]) AS qntl_uptime
FROM socorro_crash
WHERE crash_date = '20180520'
GROUP BY 1, 2
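To make the shape of the result concrete, here is a rough Python equivalent of the aggregation, run over a handful of made-up rows rather than real crash data. It computes exact quartiles with the standard library, whereas `approx_percentile` returns approximations, so the numbers would not necessarily match the query's output.

```python
from collections import defaultdict
from statistics import mean, quantiles, stdev

# Made-up rows for illustration only: (crash_date, reason, uptime).
rows = [
    ("20180520", "SIGSEGV", 10),
    ("20180520", "SIGSEGV", 20),
    ("20180520", "SIGSEGV", 30),
    ("20180520", "SIGSEGV", 40),
    ("20180520", "SIGABRT", 5),
    ("20180520", "SIGABRT", 15),
]

# GROUP BY crash_date, reason
groups = defaultdict(list)
for crash_date, reason, uptime in rows:
    groups[(crash_date, reason)].append(uptime)

# One summary record per group, mirroring the selected columns.
summary = {}
for key, uptimes in groups.items():
    summary[key] = {
        "n_crashes": len(uptimes),
        "avg_uptime": mean(uptimes),
        "stddev_uptime": stdev(uptimes),  # sample standard deviation
        "qntl_uptime": quantiles(uptimes, n=4),  # three quartile cut points
    }
```

Each key of `summary` corresponds to one output row of the query, with `qntl_uptime` holding a three-element list.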
The job is scheduled to run nightly on Airflow.
The DAG is available under
The source schema is available in the mozilla-services/socorro GitHub repository.
This schema is transformed into a Spark SQL structure and serialized to Parquet after transforming column names from
The code is a notebook in the
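The schema transformation described above can be sketched in a few lines. This is not the notebook's code: the type mapping in `PRIMITIVES`, the function name `spark_type`, and the example schema are all assumptions made for illustration, and the column renaming step is left out.

```python
# Assumed mapping from JSON Schema primitive types to Spark SQL type names.
PRIMITIVES = {
    "string": "string",
    "integer": "bigint",
    "number": "double",
    "boolean": "boolean",
}


def spark_type(schema):
    """Return a Spark SQL DDL type string for a JSON Schema fragment."""
    kind = schema["type"]
    if kind == "object":
        # Objects become structs, recursing into each property.
        fields = ",".join(
            f"{name}:{spark_type(sub)}" for name, sub in schema["properties"].items()
        )
        return f"struct<{fields}>"
    if kind == "array":
        return f"array<{spark_type(schema['items'])}>"
    return PRIMITIVES[kind]


# Hypothetical schema fragment, not the real Socorro schema.
example = {
    "type": "object",
    "properties": {
        "reason": {"type": "string"},
        "uptime": {"type": "integer"},
        "addons": {"type": "array", "items": {"type": "string"}},
    },
}
ddl = spark_type(example)
```

For the fragment above, `ddl` is `struct<reason:string,uptime:bigint,addons:array<string>>`, a type string that Spark SQL can parse into a structure for writing Parquet.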