Proactive detection with applied intelligence

With applied intelligence's proactive detection, anomalies from your APM-monitored applications are automatically surfaced in our activity streams and anomalies feed. You can click each anomaly to bring up an automatic analysis.

Notifications for anomalies can be delivered in Slack, or you can set up a webhook to deliver messages when you need them. These events are available for querying, creating custom dashboards, and alerting. After you set up a proactive detection configuration (a group of apps you're interested in), you can add this configuration as a source. Then the anomalies will be automatically correlated with other data sources via incident intelligence.

To learn where anomalies appear, how you can use them to understand potential issues before they become incidents, and create alerts from them, watch this short video (approx. 4:15 minutes).

Requirements

To use proactive detection, ensure you have:

An APM agent installed for at least one application.
To receive notifications in Slack, you'll need to ask your IT administrator to install the New Relic application in your Slack workspace.

For more on data limits, see Data limits.

Why it matters

With proactive detection, applied intelligence delivers insights about anomalies in your production system, along with an automatic analysis of the anomaly. It's enabled automatically, at no additional cost. When an anomaly is detected, you can view it in the applied intelligence anomalies feed, or we'll send notifications directly to your Slack channel or a webhook.

How it works

Proactive detection uses the following methods to detect anomalies in your app data:

Proactive detection monitors metric data reported by an APM agent, building a model of your typical application dynamics, and focuses on key golden signals: throughput, response time, and errors.
If one of these golden signals shows anomalous behavior, the system flags it and tracks recovery to normal behavior.
The system adapts to changes in your data, and continuously updates models based on new data.

Automatically on: By default, proactive detection monitors all your APM applications, with no action required by you. When an anomaly is detected, it's automatically surfaced in various activity streams, the applied intelligence anomalies feed and is available for querying via NRQL.

Receiving notifications: We send notifications when we detect anomalous changes in throughput, error rate, or response time. The notifications are sent to selected Slack channels, or sent via webhook. When the anomaly goes back to normal, a recovery message is sent. If you don't want to receive notifications, you still have access to the data via NRQL query.

Anomaly analysis: For each anomaly, we provide a link in Slack to an analyze anomaly page. This page generates automatic insights into the anomaly. The page is also available from the anomalies tab, which lists recent anomalies. This page uses your existing APM and proactive detection data to provide explanations as to the cause of the anomaly.

Activity stream: Inside various activity streams such as the New Relic homepage, APM Summary page, Lookout and Explorer, you'll see relevant anomalies from your APM-monitored applications. Clicking on any of the anomaly events in the activity stream brings up the analysis page for that anomaly.

Applications will not always generate anomalies, so it can be normal to not receive any detections.

Set up notifications for proactive detection

Proactive detection is enabled automatically, at no additional cost. To receive notifications or to have a configuration (group of apps) that you can add as a source for incident intelligence, you will need to create a proactive detection configuration. You can create a configuration in the proactive detection UI:

From one.newrelic.com, click Alerts & AI.
Under Proactive detection, click Settings.
Click Add a configuration.
Input the following information into the form:
- Choose a name for your configuration that helps you easily identify it from others in your account.
- Select an account.
- Select up to 1,000 applications. Note that certain applications with low throughput might not be good candidates for proactive detection, as they can be more sensitive to smaller amounts of data fluctuation.
Optional: select the golden signals you'd like to monitor for anomalies.
Optional: connect to incident intelligence.

To use proactive detection with Slack:

Select Slack.
Choose which Slack channel receives notifications. You can select any existing public or private channel. This prompts the workflow to add the applied intelligence Slack application to your selected channel. To create a new channel, do that directly in Slack first.
Tip
If you experience an error when assigning Slack channels, make sure that the New Relic AI Slack application has been added to your Slack workspace.
Save the configuration.
You can modify the applications for each configuration at any time by selecting the configuration in the configuration table.

Mute notifications (Slack only)

In Slack, detections coming from specific applications can be muted temporarily or permanently. The entire channel can also be muted temporarily. This is useful in the case of an incident or when the channel should otherwise not be interrupted.

To mute in Slack, select Mute this app’s warnings or Mute all warnings, then select the duration. We will resume sending notifications for any detections once the muting duration has completed.

Muting an application permanently removes it from the configuration. To add it back in, go to one.newrelic.com, in the top nav click Alerts & AI, then click Proactive detection, and select the configuration to edit.

Muting proactive detection notifications does not affect alerts.

Use proactive detection Slack messages

Each anomaly message has several key pieces of information you can use to learn more about and start troubleshooting the potential issue:

The application name and a link to more information about it in the New Relic UI .
The metric experiencing an anomaly and a link to its details in the New Relic UI.
A graph of the metric over time to provide a visual understanding of the anomaly’s behavior and degree.
An Analyze button that navigates to an analysis page in applied intelligence that identifies key attributes that are unique to the anomaly, anomalies found upstream or downstream, and any other relevant signals.

Once an anomaly has returned to normal, we send a recovery notification with the option to provide feedback. Your feedback provides our development team with input to help us improve detection quality. In the case of feedback provided on throughput anomalies, an evaluation is run each hour based on feedback to fit a more suitable model. If we helped you, you can select Yes or No.

View overview of anomalies

In addition to notifications for anomalies that give you information via Slack or webhook, you can view more information about the anomalies in your environment via the Anomalies tab on the Alerts & AI Overview page. That tab provides a list of all the recent anomalies from every configuration in the selected account, and you can select an anomaly for a detailed analysis.

Anomaly visibility settings

Anomalies are displayed in various New Relic activity streams and in the applied intelligence anomalies feed. You can customize what is displayed using the anomaly visibility settings (for example, hiding throughput anomalies on an activity stream but keeping them in the anomalies feed).

To find these settings: from Alerts & AI, under Proactive detection, click Settings.

Notes on using these settings:

These settings are applied at the user level. Changes you make won’t affect others users in your organization.
Regardless of these settings, the anomalies are still reported and available for NRQL querying.

Details on these UI sections:

AI overview and anomalies tab: Use the AI overview and anomalies tab setting to hide anomalies from the AI overview and anomalies tab setting. Please note you also can use filters specific to these views as well.
Global activity stream: Use the global activity stream section to customize what anomalies are shown in the various New Relic activity streams, including the New Relic homepage, APM Summary, and Lookout.
Anomaly types: Use the check boxes here to hide specific types of anomalies. For example, uncheck Web throughput and Non-web throughput anomalies to hide these types of anomalies from both the activity streams and the AI overview and anomalies tab. (Note they are still reported and available for querying.)

Query anomaly data

You can use NRQL to query and chart your proactive detection data using the NrAiAnomaly event. For example:

FROM NrAiAnomaly SELECT *

Important

This data has previously been attached to the ProactiveDetection event. That event will be deprecated on April 7, 2021. If you use ProactiveDetection in your custom charts, you should convert those queries to using NrAiAnomaly.

Here are important attributes attached to this event:

Attribute	Description
`closeTime` timestamp	The time when the anomaly ended. Example: `1615304100000`.
`configurationType` string	The type of configuration monitoring the event. If at least one configuration is monitoring the entity, this is set to `configuration`. Otherwise, it's set to `automatic`.
`entity.accountId` number	The New Relic account ID to which the entity belongs.
`entity.domain` number	The domain of the entity (currently only `APM` but will change with future functionality).
`entity.guid` string	The GUID of the entity. This is used to identify and retrieve data about the entity via NerdGraph. Identical to `entityGuid`.
`entityGuid` string	The GUID of the entity. This is used to identify and retrieve data about the entity via NerdGraph. Identical to `entity.guid`.
`entity.name` string	The name of the entity whose data was determined to be anomalous. Identical to `entityName`. Example: `Laura's coffee service`.
`entityName` string	The name of the entity whose data was determined to be anomalous. Identical to `entity.name`.
`entity.type` string	The type of entity (currently only `APPLICATION` but will change with future functionality).
`evaluationType` string	This is always `anomaly`.
`event` string	Indicates whether it's the beginning (`open`) or end (`close`) of the anomalous data.
`openTime` timestamp	The time when the anomaly opened. Example: `1615303740000`.
`signalType` string	The type of data that was analyzed. For example, `error_rate` or `response_time.non_web`.
`timestamp` timestamp	The time at which the event was written.
`title` string	Description of the anomaly. Example: `Error rate was much higher than normal`.

Add anomalies as source in incident intelligence

By integrating incident intelligence with your proactive detection anomalies, you can get context and correlations. To learn about doing this in incident intelligence, see Configure sources.

You can also select Connect to incident intelligence from inside of a configuration.

Webhook payload and examples

Proactive detection sends the event body in JSON format via HTTPS POST. The system expects the endpoint to return a successful HTTP code (2xx). If you use webhooks to configure Proactive detection, use these examples of the webhook body format and JSON schema.

Attribute	Description
`category` enum	The category of data that was analyzed. Categories include web throughput, non-web throughput, web transactions, non-web transactions, and error class.
`data` list	The time series data leading up to the detection.
`data[].timestamp` number	The timestamp of the data point in milliseconds since the Unix epoch. Example: 1584366819000
`data[].unit` string	The unit describing the value of the data point. Data units include `count`, `milliseconds`, and `error_rate`.
`data[].value` number	The value of the data point. Example: 1.52
`detectionType` enum	The type of data that was analyzed. Types include `latency`, `throughput`, and `error_rate`.
`entity` object	The entity that reported the unusual data.
`entity.accountId` number	The ID for the entity's account.
`entity.domain` enum	The domain for the entity. Example: APM.
`entity.domainId` string	The id used to uniquely identify the entity within the domain.
`entity.guid` string	The guid used to uniquely identify the entity across all products.
`entity.name` string	The name of the entity. Example: `Laura’s coffee service`
`entity.link` string	A link to view the entity. Example: `https://rpm.newrelic.com/accounts/YOUR_ACCOUNT_ID/applications/987654321”`
`severity` enum	A description of how unusual of a change occurred, including `NORMAL`, `WARNING`, or `CRITICAL`.
`version` string	Version used to describe the data being provided. Example: v1
`viewChartImageUrl` string	Image showing a chart of the anomalous data.
`anomalyzerUrl` string	URL that can be opened to analyze the anomaly in New Relic.

Applied intelligence will send the event body in JSON format via HTTPS POST. The system expects the endpoint to return a successful HTTP code (2xx).

Template:

{
  "version": "{{version}}", 
  "entity": {
    "type": "{{entity.type}}",
    "name": "{{entity.name}}",
    "link": "{{entity.link}}",
    "entityGuid": "{{entity.entityGuid}}",
    "domainId": "{{entity.domainId}}",
    "domain": "{{entity.domain}}",
    "accountId": {{entity.accountId}}
  },
  "detectionType": "{{detectionType}}",
  "category": "{{category}}",
  "data": [{{#each data}}
    {
      "value": {{value}},
      "unit": "{{unit}}",
      "timestamp": {{timestamp}}
    }
    {{#unless @last}},{{/unless}}
  {{/each}}],
  "viewChartImageUrl": "{{viewChartImageUrl}}",  
  "anomalyzerUrl": "{{anomalyzerUrl}}"
}

Sample payload:

{
  "version": "v1",
  "entity": {
    "type": "APPLICATION",
    "name": "My Application",
    "link": "https://rpm.newrelic.com/accounts/ACCOUNT_ID/applications/123",
    "entityGuid": "foo",
    "domainId": "123",
    "domain": "APM",
    "accountId": YOUR_ACCOUNT_ID
  },
  "detectionType": "metric",
  "category": "web throughput",
  "data": [ {
    "value": "100",
    "unit": "count",
    "timestamp": 1637260259819
  }, {
    "value": "99",
    "unit": "count",
    "timestamp": 1637260319819
  }, {
    "value": "0",
    "unit": "count",
    "timestamp": 1637260379819
  } ],
  "viewChartImageUrl": "https://www.example.com/image/8353cf2c-945c-48e8-99de-e903f033a881?height=200&width=400&show_timezone=true",
  "anomalyzerUrl": "https://www.example.com/anomalyzerUrlExample"
}

Data limits

In addition to requirements, data limits include:

Monitored APM applications: limited to 1,000 per configuration
Slack configurations: limited to 200 per account
Webhook configurations: limited to 200 per account
Configurations without notifications: limited to 200 per account