# Glean Data

This page describes in detail how we structure Glean data in BigQuery. For information on the software that generates these structures, see the Generated Schemas reference.

## Tables

Each ping type is recorded in its own table, named `{application_id}.{ping_type}`, with the dots in the application id replaced by underscores. For example, the application id for Fenix is `org.mozilla.fenix`, so its `metrics` pings are available in the table `org_mozilla_fenix.metrics`.
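As a minimal sketch of how such a table is referenced, the query below counts the Fenix `metrics` pings received on a single day. The `ping_count` alias is chosen here for illustration, and the query assumes a default project is configured so that the dataset name resolves without a project prefix.

```sql
-- Count Fenix metrics pings received on one day.
-- The application id org.mozilla.fenix appears as the dataset org_mozilla_fenix.
SELECT
  COUNT(*) AS ping_count
FROM
  org_mozilla_fenix.metrics
WHERE
  DATE(submission_timestamp) = '2019-11-11'
```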

## Columns

Fields are nested inside BigQuery `STRUCT`s to organize them into groups, and we can use dot notation to specify individual subfields in a query. For example, columns containing Glean's built-in client information are in the `client_info` struct, so accessing its columns involves using a `client_info.` prefix.
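To see the dot notation in isolation, here is a small self-contained sketch that does not read any real table. The `example_pings` name and the `client_id` value are made up for illustration; only the `client_info` struct name comes from the description above.

```sql
-- Build one hypothetical row with a nested struct, then read a subfield.
WITH example_pings AS (
  SELECT
    -- The client_id value is a placeholder, not real data.
    STRUCT('example-client-id' AS client_id) AS client_info
)
SELECT
  -- Dot notation selects the subfield inside the struct.
  client_info.client_id
FROM
  example_pings
```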

The top-level groups are:

### Ping and Client Info sections

Core attributes sent with every ping are mapped to the `client_info` and `ping_info` sections. For example, the client id is mapped to a column called `client_info.client_id`.
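For instance, a query along the following lines could count the distinct clients that sent a `metrics` ping on a given day. This is a sketch rather than a recommended analysis: the `client_count` alias is chosen here for illustration, and the date is arbitrary.

```sql
-- Count distinct clients that sent a Fenix metrics ping on one day.
SELECT
  COUNT(DISTINCT client_info.client_id) AS client_count
FROM
  org_mozilla_fenix.metrics AS m
WHERE
  DATE(submission_timestamp) = '2019-11-11'
```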

### The metrics group

Custom metrics in the `metrics` section have two additional levels of nesting in their column name. They are grouped first by metric type, and the final field name joins the metric's category and name with an underscore: `metrics.{metric_type}.{category}_{name}`.

For example, suppose you had the following boolean metric defined in a `metrics.yaml` file (abridged for clarity):

    browser:
      is_default:
        type: boolean
        description: >
          Is this application the default browser?
        send_in_pings:
          - metrics

It would be available in the column `metrics.boolean.browser_is_default`.

    -- Count number of pings where Fenix is the default browser
    SELECT
      COUNT(*),
      COUNTIF(metrics.boolean.browser_is_default)
    FROM
      -- We give the table an alias so that the table name metrics and field name
      -- metrics don't conflict.
      org_mozilla_fenix.metrics AS m
    WHERE
      DATE(submission_timestamp) = '2019-11-11'

### The events group

Custom events in the events section have a different structure.

Documentation TBD. See bug 1606836.