Explainable Machine Learning

Unlocking explainability improves campaigns and transparency. It also helps customers to build trust in Machine Learning.

In today’s complex marketing channels, marketers need deep, nuanced insights into micro-patterns and multi-faceted customer behavior. These insights are beyond the realm of traditional analytics tools.

Machine learning models make predictions by capturing these insights as part of their operating parameters during the training process. Unlocking these parameters gives marketers direct visibility and explanations about the micro-patterns and customer behavior, while also building trust in machine learning predictions. It also enables them to build highly optimized and performant campaigns.

The following are the available Machine Learning model metrics and their use cases:

Clustering Heatmaps

Heatmaps display the average value of each Machine Learning model input feature for each cluster. Together, these averages define the composition of the cluster. Heatmaps are designed to help marketers understand the significant behavior identifiers of customers in a particular cluster.

The following example helps you understand a heatmap and the insights it enables:

Heatmap Analysis

This heatmap displays the average values for customers in four clusters across all the model input features. These four clusters are:

  • VIPs
  • Regular Buyers
  • Recent Low Value Buyers
  • One-time Buyers

In this example, model input features are arranged by rows, and clusters are arranged by columns. The average value of each input feature by cluster is represented as a numerical value.
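The layout described above (features as rows, clusters as columns, one average per cell) can be sketched in a few lines of Python. This is only an illustration of how such a matrix could be computed, not how CDP builds its heatmap. The `customers` records, the feature names, and the `cluster_feature_means` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical customer records: (cluster label, {feature name: value}).
# Feature names and numbers are made up for illustration.
customers = [
    ("VIPs", {"order_count": 12, "avg_order_value": 150.0}),
    ("VIPs", {"order_count": 8, "avg_order_value": 130.0}),
    ("One-time Buyers", {"order_count": 1, "avg_order_value": 40.0}),
    ("One-time Buyers", {"order_count": 1, "avg_order_value": 60.0}),
]


def cluster_feature_means(records):
    """Return {feature: {cluster: mean}}: rows are features, columns are clusters.

    Assumes every record carries a value for every feature.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for cluster, features in records:
        counts[cluster] += 1
        for name, value in features.items():
            sums[name][cluster] += value
    return {
        name: {cluster: total / counts[cluster] for cluster, total in by_cluster.items()}
        for name, by_cluster in sums.items()
    }


heatmap = cluster_feature_means(customers)
for feature, row in heatmap.items():
    print(feature, row)
```

Each printed row corresponds to one row of the heatmap, and each key in the row corresponds to one cluster column.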

Learning from this heatmap chart

VIP customers have bought many more products compared to regular buyers and low value buyers, and have a higher Average Order Value (AOV). However, they have received relatively low overall discounts, despite their loyalty.

You can use this insight to create a discount campaign that focuses on VIP customers, and bundle it with a deal to increase AOV and drive volumes.

Similarly, another insight is that VIP customers buy fewer items on their first purchase, compared to regular buyers and one-time buyers. However, the percentage margin on these goods is higher, which indicates that they are buying fresh/new products.

Feature Importance

Feature Importance charts show the weight of each model input feature, that is, how strongly and in which direction the feature contributes to the model predictions.

The charts surface positive, negative, and low contributing input features.

By focusing on which features positively and negatively affect a customer’s prediction score, marketers can build optimized campaigns that tap into these features.

The following example helps you understand a feature importance chart and the insights it enables:

Feature Importance Analysis

This chart displays the feature importance value or weights for all the input features for a Likelihood to Buy model.

In this example, model input features are arranged on the left and the feature importance values are arranged on the right. Positive values (shades of Blue and Violet) have a positive correlation to the prediction, and negative values (shades of Red) have a negative correlation.
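As a rough illustration of how signed weights separate into positive, negative, and low contributors, consider the following sketch. The weights, the feature names, the `group_by_contribution` helper, and the `low_threshold` cut-off are assumptions made for this example. CDP may use a different scale or rule.

```python
# Hypothetical feature weights; names and values are illustrative only.
weights = {
    "transactions_last_30_days": 0.42,
    "email_open_ratio": 0.27,
    "email_click_ratio": -0.31,
    "total_revenue": 0.02,
}


def group_by_contribution(feature_weights, low_threshold=0.05):
    """Split features into positive, negative, and low contributors.

    low_threshold is an assumed cut-off on the absolute weight.
    """
    groups = {"positive": [], "negative": [], "low": []}
    # Sort by absolute weight so the strongest contributors come first.
    for name, weight in sorted(feature_weights.items(), key=lambda kv: -abs(kv[1])):
        if abs(weight) < low_threshold:
            groups["low"].append(name)
        elif weight > 0:
            groups["positive"].append(name)
        else:
            groups["negative"].append(name)
    return groups


groups = group_by_contribution(weights)
print(groups)
```

Features in the low group have little bearing on the prediction in either direction, so they are weak candidates for campaign targeting.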

Learning from this feature importance chart

Likelihood to Buy scores are positively correlated with the transactions count in the last 30 days and email open ratios.

At the same time, they are negatively correlated with email click ratio. For example, if people are not clicking in the email, likelihood to buy will be low.

Therefore, to maximize conversion, a marketer should design campaigns targeting customers who made a purchase in the last 30 days and exceed a certain email click threshold. Additionally, the total revenue contributed by an individual has a very low bearing on their likelihood to buy and should not be used by the marketer as a basis for any campaign.

Model Performance

The quality of a Machine Learning model is measured by multiple metrics that quantify its ability to make accurate predictions in different scenarios. These metrics represent the performance of the model in the lab as well as in the real world. Access to these metrics helps you assess the quality of the model and boosts confidence in its predictions. All metrics are available for:

  • The Train Set (in the lab, during training)
  • The Test Set (in the real world)

To access these metrics:

  1. Sign in to the CDP user interface.

  2. Click Analytics > Metrics.

  3. Open a dashboard.

  4. Navigate to the Explainability section.

    Based on the selected model, the system displays various charts below the Feature Importance section.

Likelihood models

For Likelihood models such as Likelihood to Buy, CDP displays the following details:

  • Receiver Operating Characteristic (ROC) Curve: This model performance metric for data scientists signifies the extent to which the model is better than a random classifier. It contains the following axes:

    • True Positive Rate (or Recall): Measures the ability of the model to accurately identify the True Positives (identify a cat as actually a cat) amongst all the Actual Positives (all the cats in the dataset).
    • False Positive Rate: Measures how often the model raises a false alarm, that is, the share of False Positives (a dog is identified as a cat) amongst all the Actual Negatives (all the dogs in the dataset).

    For more information on ROC, see Classification: ROC Curve and AUC.

    roc

  • Area Under Curve: This is a single numeric metric between 0 and 1, with 1.0 representing a perfect model and 0.5 corresponding to a random classifier. It is derived from the Receiver Operating Characteristic (ROC) Curve. For more information on AUC, see Classification: ROC Curve and AUC.

    auc

  • Precision Recall Curve: This model performance metric signifies the extent to which the model can accurately classify the results. It contains the following axes:

    • Precision: Measures the ability of a model to accurately identify the True Positives amongst all the Predicted Positives.
    • Recall: Measures the ability of the model to accurately identify the True Positives amongst all the Actual Positives.

    prc
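The quantities behind these three charts can be sketched as follows. This is a minimal, standard-library illustration under stated assumptions. The `labels` and `scores` data are made up, and the helper names (`confusion_counts`, `rates`, `roc_auc`) are hypothetical and not part of CDP.

```python
def confusion_counts(labels, scores, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are predicted positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def rates(labels, scores, threshold):
    """Return (recall, false positive rate, precision) at one threshold."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    recall = tp / (tp + fn) if tp + fn else 0.0     # TP / Actual Positives
    fpr = fp / (fp + tn) if fp + tn else 0.0        # FP / Actual Negatives
    precision = tp / (tp + fp) if tp + fp else 1.0  # TP / Predicted Positives
    return recall, fpr, precision


def roc_auc(labels, scores):
    """Area under the ROC curve, using the trapezoidal rule."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        recall, fpr, _ = rates(labels, scores, threshold)
        points.append((fpr, recall))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up data: 1 = customer bought, 0 = customer did not buy.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

print(rates(labels, scores, 0.6))
print(roc_auc(labels, scores))
```

Sweeping the threshold and plotting false positive rate against recall traces the ROC curve. Plotting recall against precision over the same thresholds traces the Precision Recall Curve.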

Regression models

For Regression models such as Predictive Lifetime Value, CDP displays the following details:

  • Mean Absolute Error (MAE): The average or mean error to be expected in the predictions of a regression model that makes predictions of continuous values. A model with a low MAE value is considered good.
  • Root Mean Squared Error (RMSE): The square root of the mean of the squared errors. This metric is more sensitive to large prediction errors because each error is squared before being averaged. It is the better choice when large prediction errors are especially undesirable.

Note

RMSE is always greater than or equal to MAE. The greater the difference between them, the greater the variance in the individual errors in the sample. If RMSE is equal to MAE, all errors are of the same magnitude.

pltv
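The relationship between MAE and RMSE is easy to check on a small example. The sketch below uses made-up lifetime values. The `mae` and `rmse` helpers follow the standard definitions and are not taken from CDP.

```python
import math


def mae(actual, predicted):
    """Mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the mean of the squared errors."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up values for illustration.
actual = [100.0, 200.0, 300.0, 400.0]
even_errors = [110.0, 190.0, 310.0, 390.0]    # every error has magnitude 10
uneven_errors = [100.0, 200.0, 300.0, 440.0]  # one error of 40, the rest are 0

print(mae(actual, even_errors), rmse(actual, even_errors))
print(mae(actual, uneven_errors), rmse(actual, uneven_errors))
```

Both prediction sets have the same MAE, but the set with a single large error has a higher RMSE, which is the gap described in the note above.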

Clustering models

For Clustering models, CDP displays the following details:

  • Elbow Curve: This helps identify the optimal number of clusters to break the data into, before sufficient differentiation between the clusters is lost. The curve plots the number of clusters against the sum of squared distances of the cluster members from the centroid of their respective cluster. The point at which the chart kinks or elbows is taken as the optimal number of clusters to use.

ec
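The vertical axis of an elbow curve can be computed as sketched below. The `within_cluster_sum_of_squares` helper follows the definition given above. The `elbow_point` helper uses a simple "largest bend" heuristic, which is an assumption for illustration and not necessarily how CDP locates the elbow. All data values are made up.

```python
def within_cluster_sum_of_squares(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dims = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dims)) for p in members
        )
    return total


def elbow_point(wcss_by_k):
    """Return the k where the curve bends most (largest second difference).

    Assumed heuristic; expects consecutive values of k.
    """
    ks = sorted(wcss_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        bend = (wcss_by_k[prev_k] - wcss_by_k[k]) - (wcss_by_k[k] - wcss_by_k[next_k])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Four made-up points that form two obvious groups.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(within_cluster_sum_of_squares(points, [0, 0, 0, 0]))  # one cluster
print(within_cluster_sum_of_squares(points, [0, 0, 1, 1]))  # two clusters

# Hypothetical curve: number of clusters -> within-cluster sum of squares.
curve = {1: 1000.0, 2: 600.0, 3: 200.0, 4: 170.0, 5: 150.0}
print(elbow_point(curve))
```

In the hypothetical curve, adding clusters beyond the elbow reduces the sum of squares only slightly, which is why the elbow is treated as the optimal number of clusters.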

Feature Dictionary

It is important to understand the input features that feed a Machine Learning model. To surface the input parameters and their human-readable definitions in CDP, you can use the Feature Dictionary, which is available in the individual dashboards for all Machine Learning models.

For example, in the Likelihood to Buy ML dashboard, you can scroll down to the Explainability section to view the Feature Dictionary tile. This contains the input features for the Likelihood to Buy model along with their human-readable descriptions.

feature-dictionary

For the Likelihood to Convert model, the Feature Dictionary tile appears as follows:

feature-dictionary-2

You can access the same dashboards from 360 Profiles. To view the dashboard, you must click the i button adjacent to the Input Features label. For more information, see Explainable Predictions.

feature-dictionary-3

Feature Fairness Analysis

The Feature Importance chart provides quantitative visibility into the extent to which a particular feature impacts the Machine Learning model. Feature Fairness Analysis provides insights into how a particular feature impacts the Machine Learning model.

With Feature Fairness Analysis, you can:

  • Understand the inherent biases in the Machine Learning model instead of the Machine Learning model being a black box.
  • Tweak the Machine Learning model if the biases are not coherent with the business logic.
  • Launch marketing campaigns targeting specific audience segments by analyzing the fairness dashboards.

Feature Fairness Analysis is applicable only to Likelihood models and is available in all the corresponding out-of-the-box dashboards.

Key terminologies

The following terms help you understand and leverage Feature Fairness Analysis:

  • Fairness Dimension: This input feature is used for the Machine Learning model training and evaluation. By default, in the out-of-the-box dashboards, age and gender are the fairness dimensions used in the analysis. You can select any data dimension as the fairness dimension for subsequent analysis.

  • Fairness Group: After you select a fairness dimension, the data can be clustered in different groups. CDP evaluates the impact of each group on the Machine Learning model.

    For example, if Gender is the fairness dimension, Male and Female can be the fairness groups. If Age is the fairness dimension, you can have multiple fairness groups of age 18-29, 30-49, and above 50.
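The mapping from a fairness dimension to its fairness groups can be sketched as a simple bucketing function. The group boundaries follow the age example above, but the handling of edge cases is an assumption: age 50 is placed in the oldest group, and ages below 18 fall outside every group. The `age_fairness_group` helper and the sample ages are hypothetical.

```python
from collections import Counter


def age_fairness_group(age):
    """Map an age to one of the example fairness groups.

    Boundary handling is assumed: 50 goes to the oldest group,
    and ages below 18 belong to no group.
    """
    if age < 18:
        return None
    if age <= 29:
        return "18-29"
    if age <= 49:
        return "30-49"
    return "50 and above"


# Made-up ages; count how many records fall into each fairness group.
ages = [22, 35, 51, 47, 19, 64, 50]
distribution = Counter(age_fairness_group(age) for age in ages)
print(distribution)
```

Counting records per group in this way gives the distribution of the data across the fairness groups of one fairness dimension.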

To add a custom Fairness Dimension or Fairness Group, contact your Acquia CSM or AM.

Dashboard description

Feature Fairness Analysis is applicable to all Likelihood models. It is available in all out-of-the-box dashboards for all Machine Learning Likelihood models.

Feature Fairness Analysis has four sections in the out-of-the-box dashboards:

  • Input Training Data Distribution: This chart shows the distribution of training input data across various fairness groups within each fairness dimension, such as age and gender.

    ml feature fairness image1

  • Positive Outcomes Ratio (Intra Fairness Group): This chart shows the distribution of positive outcomes, observed or predicted within each fairness group. The following are the ways to interpret this chart:

    • If the magnitude of observed percentage is high and the positive outcome ratio is more than 100%, it means that there is considerable evidence and the model is heavily influenced by the fairness group.

    • If the magnitude of observed percentage is high and the positive outcome ratio is less than 100%, it means that despite having high evidence, the model is not influenced by the fairness group.

      ml feature fairness image2

  • Share of Positive Outcome (Inter Fairness Group): This chart shows the distribution and magnitude of positive outcomes, observed or predicted across all fairness groups. The following are the ways to interpret this chart:

    • If the percentage change value is lower, it means that the model is less biased toward the fairness group.

    • If the percentage change value is higher, it means that the model is more biased toward the fairness group.

      ml feature fairness image3

  • Model Performance Metrics: This chart shows the Machine Learning model prediction performance for each fairness group by using model performance metrics such as precision, accuracy, and recall. A significant variation in these metrics between fairness groups points to some form of bias.

    ml feature fairness image4
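The per-group comparison behind the Model Performance Metrics chart can be sketched as follows. This is an illustration under stated assumptions, not the CDP implementation. The `records` data is made up, and the `metrics_by_group` helper is hypothetical. It computes accuracy, precision, and recall separately for each fairness group using their standard definitions.

```python
from collections import defaultdict


def metrics_by_group(records):
    """records: iterable of (fairness group, actual label, predicted label), labels 0 or 1."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, actual, predicted in records:
        if predicted == 1:
            key = "tp" if actual == 1 else "fp"
        else:
            key = "fn" if actual == 1 else "tn"
        counts[group][key] += 1

    result = {}
    for group, c in counts.items():
        total = sum(c.values())
        predicted_positives = c["tp"] + c["fp"]
        actual_positives = c["tp"] + c["fn"]
        result[group] = {
            "accuracy": (c["tp"] + c["tn"]) / total,
            "precision": c["tp"] / predicted_positives if predicted_positives else 0.0,
            "recall": c["tp"] / actual_positives if actual_positives else 0.0,
        }
    return result


# Made-up records for the Gender fairness dimension.
records = [
    ("Male", 1, 1), ("Male", 0, 0), ("Male", 1, 1), ("Male", 0, 1),
    ("Female", 1, 0), ("Female", 1, 1), ("Female", 0, 0), ("Female", 1, 0),
]

for group, metrics in metrics_by_group(records).items():
    print(group, metrics)
```

In this made-up data, recall differs sharply between the two groups, which is the kind of variation that would warrant a closer look at the model.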