Creating Nth percentile calculations
Analytics allows you to create nth percentile calculations that can be added to a report. The nth percentile is the smallest value in the set of raw data with the property that n% of the data values are less than or equal to it. You need to define several parameters in order for the calculation to provide the targeted information you need:
- Value of percentile desired (e.g., 95th percentile)
- Number of days of data to aggregate (e.g., prior 30 days, this value is bounded by the raw_retention_days value)
- Business hours to calculate within (if defined, otherwise the default 24-hour "shift" is used)
- Job filter (also known as a group filter) to limit the calculation to certain devices (otherwise all devices are included)
- Metrics filter to limit the calculation to certain metrics (otherwise all metrics are used)
The creation of rows for the nth percentile calculation is performed manually by inserting records into the reporting database. The tables of interest are as follows:
-
nth_percentile_config
The main table for defining the percentile jobs. - percentile - Specifies the nth value to calculate; e.g., the 95th. - period - Specifies the number of days of data to use for the nth percentile calculation - name - Not used for the nth percentile; only exists for integration with an interface. - description - Not used for the nth percentile; only exists for integration with an interface.
-
business_hours_config
Specifies the time ranges that has been defined as "business day" or "shift".
-
nth_percentile_business_hours_assoc
Relates nth_percentile_config records with business_hours_config records. - nth_percentile_config_id - Specifies the nth_percentile_config foreign key. - business_hours_config_id - Specifies the business_hours foreign key. Note: If nth_percentile_business_hours_assoc is not defined, projection_business_hours_assoc is used.
-
group_filter
Optional filter used to reduce the set of devices included in the aggregation.
-
nth_percentile_group_assoc
Relates nth_percentile_config records with group_filter records.
- nth_percentile_config_id - Specifies the nth_percentile_config foreign key.
- group_filter_id - Specifies the group_filter foreign key.
Note: If projection_group_assoc is also defined, only the intersection is used.
-
meta_metric
Optional filter used to reduce the set of metrics included in the aggregation.
-
nth_percentile_metric_assoc
Relates nth_percentile_config records with meta_metric records.
- nth_percentile_config_id - Specifies the nth_percentile_config foreign key.
- metric_key - Specifies the meta_metric foreign key.
Note: If nth_percentile_metric_assoc is not defined for a metric, projection_metric_assoc is used.
For each aggregate table to be projected, the table name from which to read the data is constructed as follows:
nth_percentile_<nth_percentile_config.name><business_hours_config.name, if any><meta_metric.metric_name>
The table name to which to write data is constructed as follows:
proj_<projection_config.name><nth_percentile_config.name, if any><business_hours_config.name, if any>_<meta_metric.metric_name>
Inserting records for nth percentile calculations
This example procedure inserts records into the reporting database and is for demonstration purposes only. For this example, assume that two data centers are already defined, Austin and Bangalore. The metrics on which to filter are called analytics_test1 and analytics_test2.
-
Log in to the Analytics reporting database as the root user or a user with administrative permissions.
mysql -u root reporting
-
Enter the following command at the prompt, substituting values that apply to your company's situation:
insert into nth_percentile_config (name, description, percentile, period) values ('percentile_test_1', 'first percentile test', 90, 10), ('percentile_test_2', 'second percentile test', 80, 20); insert into nth_percentile_business_hours_assoc (nth_percentile_config_id, business_hours_config_id) values ( (select id from nth_percentile_config where percentile = 90 and period = 10), (select id from business_hours_config where name='austin') ), ( (select id from nth_percentile_config where percentile = 80 and period = 20), (select id from business_hours_config where name='bangalore') ); insert into nth_percentile_group_assoc (nth_percentile_config_id, group_filter_id) values ( (select id from nth_percentile_config where percentile = 90 and period = 10), (select id from group_filter where name='austin') ), ( (select id from nth_percentile_config where percentile = 80 and period = 20), (select id from group_filter where name='bangalore') ); insert into nth_percentile_metric_assoc (nth_percentile_config_id, metric_key) values ( (select id from nth_percentile_config where percentile = 90 and period = 10), (select metric_key from meta_metric where metric_name='analytics_test1') ), ( (select id from nth_percentile_config where percentile = 90 and period = 10), (select metric_key from meta_metric where metric_name='analytics_test2') ), ( (select id from nth_percentile_config where percentile = 80 and period = 20), (select metric_key from meta_metric where metric_name='analytics_test1') ), ( (select id from nth_percentile_config where percentile = 80 and period = 20), (select metric_key from meta_metric where metric_name='analytics_test2') );