The error_aggregates table represents counts of errors collected from main and crash
pings, aggregated every 5 minutes. It is the dataset backing the main mission
control view, but may also be queried independently.
The error_aggregates table contains counts of various error measures (for
example: crashes, "the slow script dialog showing"), aggregated across each
unique set of dimensions (for example: channel, operating system) every 5
minutes. You can get an aggregated count for any particular set of dimensions
by summing the matching rows in SQL.
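As a minimal sketch of what "summing" means here, the snippet below loads a few made-up rows into an in-memory SQLite table and collapses every dimension except channel. The table layout is a cut-down, hypothetical stand-in for the real one, and SQLite is used only so the example runs locally; the real data is queried through Athena.

```python
import sqlite3

# Hypothetical sample rows: (window_start, channel, os_name, main_crashes).
# The values are invented purely for illustration.
rows = [
    ("2018-02-21 00:00:00", "release", "Windows_NT", 3),
    ("2018-02-21 00:00:00", "release", "Darwin", 1),
    ("2018-02-21 00:00:00", "beta", "Windows_NT", 2),
    ("2018-02-21 00:05:00", "release", "Windows_NT", 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE error_aggregates "
    "(window_start TEXT, channel TEXT, os_name TEXT, main_crashes INTEGER)"
)
conn.executemany("INSERT INTO error_aggregates VALUES (?, ?, ?, ?)", rows)


def main_crashes_by_channel(connection):
    """Sum main_crashes over every dimension except channel."""
    query = (
        "SELECT channel, SUM(main_crashes) "
        "FROM error_aggregates GROUP BY channel ORDER BY channel"
    )
    return dict(connection.execute(query).fetchall())


totals = main_crashes_by_channel(conn)
print(totals)
```

Each row holds a count for one unique combination of dimensions, so dropping a dimension from the GROUP BY clause adds its rows together.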
You can access the data via re:dash. Choose
Athena and then select the
The code responsible for generating this dataset is here.
Getting a large number of different crash measures across many platforms and channels (view on Re:dash):
SELECT window_start,
       build_id,
       channel,
       os_name,
       version,
       sum(usage_hours) AS usage_hours,
       sum(main_crashes) AS main,
       sum(content_crashes) AS content,
       sum(gpu_crashes) AS gpu,
       sum(plugin_crashes) AS plugin,
       sum(gmplugin_crashes) AS gmplugin
FROM telemetry.error_aggregates_v2
WHERE application = 'Firefox'
  AND (os_name = 'Darwin' OR os_name = 'Linux' OR os_name = 'Windows_NT')
  AND (channel = 'beta' OR channel = 'release' OR channel = 'nightly' OR channel = 'esr')
  AND build_id > '201801'
  AND window_start > current_timestamp - (1 * interval '24' hour)
GROUP BY window_start, channel, build_id, version, os_name
Getting the number of main_crashes on Windows over a small interval (view on Re:dash):
SELECT window_start AS time,
       sum(main_crashes) AS main_crashes
FROM telemetry.error_aggregates_v2
WHERE application = 'Firefox'
  AND os_name = 'Windows_NT'
  AND channel = 'release'
  AND version = '58.0.2'
  AND window_start > timestamp '2018-02-21'
  AND window_end < timestamp '2018-02-22'
GROUP BY window_start
The aggregates in this data source are derived from main, crash and core pings:
- crash pings are used to count main and content crash events; all other errors from desktop clients (including all other crashes) are gathered from main pings.
- core pings are used to count usage hours, first subsession and unique client counts.
The error_aggregates job is run continuously, using the Spark Streaming infrastructure.
The error_aggregates_v2 table has the following columns, which define its dimensions:
window_start: beginning of interval when this sample was taken
window_end: end of interval when this sample was taken (will always be 5 minutes more than window_start for any given row)
submission_date_s3: the date pings were submitted for a particular aggregate
channel: the channel, like
version: the version e.g.
display_version: like version, but includes beta number if applicable e.g.
build_id: the YYYYMMDDhhmmss timestamp the program was built, like 20160123180541. This is also known as the
application: application name (e.g.
os_name: name of the OS (e.g.
os_version: version of the OS
architecture: build architecture, e.g.
country: country code for the user (determined using geoIP), like
experiment_id: identifier of the experiment being participated in, such as email@example.com, or null if no experiment
experiment_branch: the branch of the experiment being participated in, such as experiment, or null if no experiment
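To make the relationship between window_start and window_end concrete, here is a small sketch that maps a timestamp to a 5-minute window. The source only states that window_end is always 5 minutes after window_start. Aligning windows to multiples of 5 minutes past the hour is an assumption made for illustration, and window_for is a hypothetical helper, not part of the job's code.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)


def window_for(ts):
    """Return (window_start, window_end) for the 5-minute window containing ts.

    Assumption: windows start at multiples of 5 minutes past the hour.
    """
    start = ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)
    return start, start + WINDOW


start, end = window_for(datetime(2018, 2, 21, 13, 47, 12))
print(start, end)
```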
And these are the various measures we are counting:
usage_hours: number of usage hours (i.e. the total number of session hours reported by the pings in this aggregate; note that this might include time when people are not actively using the browser or their computer is asleep)
count: number of pings processed in this aggregate
main_crashes: number of main process crashes (or just program crashes, in the non-e10s case)
startup_crashes: number of startup crashes
content_crashes: number of content process crashes (version >= 58 only)
gpu_crashes: number of GPU process crashes
plugin_crashes: number of plugin process crashes
gmplugin_crashes: number of Gecko media plugin (often abbreviated GMPlugin) process crashes
content_shutdown_crashes: number of content process crashes that were caused by failure to shut down in a timely manner (version >= 58 only)
browser_shim_usage_blocked: number of times a CPOW shim was blocked from being created by browser code
permissions_sql_corrupted: number of times the permissions SQL error occurred (beta/nightly only)
defective_permissions_sql_removed: number of times there was a removal of defective
slow_script_notice_count: number of times the slow script notice was shown (beta/nightly only)
slow_script_page_count: number of pages that trigger slow script notices (beta/nightly only)
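Raw counts are hard to compare across channels with very different populations, so one plausible way to use these measures is to normalize an error count by usage_hours. The sketch below computes crashes per 1,000 usage hours over a set of rows. The normalization itself, the crashes_per_1000_hours helper, and the sample values are all assumptions for illustration, not something this dataset prescribes.

```python
def crashes_per_1000_hours(rows, measure="main_crashes"):
    """Return the given measure per 1,000 usage hours, or None if no usage."""
    total_crashes = sum(r[measure] for r in rows)
    total_hours = sum(r["usage_hours"] for r in rows)
    if total_hours == 0:
        # No usage recorded, so a rate is undefined.
        return None
    return 1000.0 * total_crashes / total_hours


# Hypothetical aggregate rows, reduced to the two columns used here.
sample = [
    {"usage_hours": 1200.0, "main_crashes": 3},
    {"usage_hours": 800.0, "main_crashes": 1},
]

rate = crashes_per_1000_hours(sample)
print(rate)
```

The counts and hours are summed before dividing, rather than averaging per-row rates, so sparsely populated rows do not skew the result.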