Unlocking explainability improves campaigns and transparency, and helps customers build trust in Machine Learning.
In today’s complex marketing channels, marketers need deep, nuanced insights into micro-patterns and multi-faceted customer behavior. These insights are beyond the realm of traditional analytics tools.
Machine learning models make predictions by capturing these insights in their parameters during the training process. Unlocking these parameters gives marketers direct visibility into the micro-patterns and customer behavior behind each prediction, builds trust in machine learning predictions, and enables them to build highly optimized and performant campaigns.
The following are the available Machine Learning model metrics and their use cases:
Heatmaps display the average value of each input feature of the Machine Learning model, which defines the composition of each cluster. They are designed to help marketers understand the significant behavior identifiers of customers in a particular cluster.
The following example helps you understand a heatmap and the insights it enables:
This heatmap displays the average values for customers in four clusters across all the model input features. These four clusters are: VIP customers, regular buyers, low value buyers, and one-time buyers.
In this example, model input features are arranged by rows, and clusters are arranged by columns. The average value of each input feature by cluster is represented as a numerical value.
VIP customers have bought many more products compared to regular buyers and low value buyers, and have a higher Average Order Value (AOV). However, they have received relatively low overall discounts, despite their loyalty.
You can use this insight to create a discount campaign that focuses on VIP customers, and bundle it with a deal to increase AOV and drive volumes.
Similarly, another insight is that VIP customers buy fewer items on their first purchase, compared to regular buyers and one-time buyers. However, the percentage margin on these goods is higher, which indicates that they are buying fresh or new products.
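The per-cluster averages behind such a heatmap can be sketched in a few lines of Python. The cluster names, features, and values below are hypothetical, not taken from a real CDP model:

```python
from collections import defaultdict

# Hypothetical customer records: (cluster, products_bought, avg_order_value)
customers = [
    ("VIP", 42, 180.0), ("VIP", 38, 165.0),
    ("Regular buyer", 12, 60.0), ("Regular buyer", 10, 55.0),
    ("One-time buyer", 1, 40.0),
]

# Accumulate per-cluster totals: [count, sum of products, sum of AOV].
totals = defaultdict(lambda: [0, 0.0, 0.0])
for cluster, products, aov in customers:
    t = totals[cluster]
    t[0] += 1
    t[1] += products
    t[2] += aov

# Average each input feature per cluster — one heatmap cell per pair.
heatmap = {
    cluster: {"products_bought": s_p / n, "avg_order_value": s_a / n}
    for cluster, (n, s_p, s_a) in totals.items()
}
```

Rendering these averages as a color-coded grid, with features as rows and clusters as columns, yields the heatmap described above.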
Feature Importance charts are used to identify the weights of the model input features towards the model prediction. In other words, these charts are used to identify the impact and contribution of the input features towards the model predictions.
The charts surface positive, negative, and low contributing input features.
By focusing on which features positively and negatively affect a customer's prediction score, marketers can build optimized campaigns that tap into these features.
The following example helps you understand a feature importance chart and the insights it enables:
This chart displays the feature importance value or weights for all the input features for a Likelihood to Buy model.
In this example, model input features are arranged on the left and the feature importance values are arranged on the right.
Likelihood to Buy scores are positively correlated with the transactions count in the last 30 days and email open ratios.
At the same time, they are negatively affected by a low email click ratio. For example, if people are not clicking on the emails, their likelihood to buy is low.
Therefore, to maximize conversion, a marketer should design campaigns targeting customers who made a purchase in the last 30 days and exceed a certain email click threshold. Additionally, the total revenue contributed by an individual has a very low bearing on their likelihood to buy and should not be used by the marketer as a basis for any campaign.
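Partitioning features into positive, negative, and low contributors, as the chart does, can be illustrated with a minimal sketch. The feature names, weights, and threshold below are hypothetical:

```python
# Hypothetical feature weights from a Likelihood to Buy model's
# Feature Importance chart (names and values are illustrative only).
feature_weights = {
    "transactions_last_30_days": 0.62,
    "email_open_ratio": 0.35,
    "days_since_last_purchase": -0.48,
    "total_revenue": 0.02,
}

THRESHOLD = 0.1  # below this magnitude, a feature is low-contributing

positive = [f for f, w in feature_weights.items() if w >= THRESHOLD]
negative = [f for f, w in feature_weights.items() if w <= -THRESHOLD]
low = [f for f, w in feature_weights.items() if abs(w) < THRESHOLD]
```

A marketer reading this partition would build campaigns around the positive features, design around the negative ones, and ignore the low contributors.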
The quality of a Machine Learning model is measured by multiple metrics that quantify its ability to make accurate predictions in different scenarios. These metrics reflect the performance of the model in the lab as well as the real world. Access to these metrics helps you gauge the quality of the model and boosts confidence in its predictions. All metrics are made available for:
To access these metrics:
Navigate to the Model Explainability section.
Based on the selected model, the system displays various charts after the Feature Importance section.
For Likelihood models such as Likelihood to Buy, CDP displays the following details:
Receiver Operating Characteristic (ROC) Curve: This model performance metric for data scientists signifies the extent to which the model outperforms a random classifier. It contains the following axes:
For more information on ROC, see Classification: ROC Curve and AUC.
Area Under Curve (AUC): This is a single numeric metric between 0 and 1, with 1.0 representing a perfect model. It is derived from the Receiver Operating Characteristic (ROC) Curve metric. For more information on AUC, see Classification: ROC Curve and AUC.
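As an illustration, AUC can also be computed directly from labels and scores via its equivalent ranking formulation: the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. This is a generic sketch, not CDP's implementation:

```python
def roc_auc(labels, scores):
    """Compute the area under the ROC curve for binary labels (1 = positive)
    and model scores, using the ranking equivalence: AUC equals the
    probability that a random positive outranks a random negative
    (ties count as half)."""
    positives = [s for lbl, s in zip(labels, scores) if lbl == 1]
    negatives = [s for lbl, s in zip(labels, scores) if lbl == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# A perfect model always scores positives above negatives, giving AUC 1.0;
# a random classifier hovers around 0.5.
```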
Precision Recall Curve: This model performance metric signifies the extent to which the model can accurately classify the results. It contains the following axes:
For Regression models such as Predictive Lifetime Value, CDP displays the following details:
RMSE is always greater than or equal to MAE. The greater the difference between them, the greater the variance in the individual errors in the sample. If RMSE is equal to MAE, all errors are of the same magnitude.
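A small sketch with hypothetical actual and predicted values demonstrates this relationship:

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: the average magnitude of individual errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: like MAE, but penalizes large errors more."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

actual = [100.0, 200.0, 300.0]
uniform = [110.0, 210.0, 310.0]  # every error has magnitude 10
varied = [100.0, 200.0, 330.0]   # two zero errors and one large error

# With uniform errors, RMSE equals MAE; with varied errors, RMSE exceeds MAE.
```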
For Clustering models, CDP displays the following details:
It is important to understand the input features that feed a Machine Learning model. To surface the input features and their human-readable definitions in CDP, you can leverage the Feature Dictionary, which is available in the individual dashboards for all Machine Learning models.
For example, in the Likelihood to Buy ML dashboard, you can scroll down to the Explainability section to view the Feature Dictionary tile. It contains the input features for the Likelihood to Buy model along with their human-readable descriptions.
For the Likelihood to Convert model, the Feature Dictionary tile appears as follows:
You can access the same dashboards from 360 Profiles. To view the dashboard, you must click the i button adjacent to the Input Features label. For more information, see Explainable Predictions.
The Feature Importance chart provides quantitative visibility into the extent to which a particular feature impacts the Machine Learning model. Feature Fairness Analysis complements this by showing whether a particular feature impacts the model fairly across groups of customers.
With Feature Fairness Analysis, you can:
Feature Fairness Analysis is applicable only to Likelihood models and is available to all the corresponding out-of-the-box dashboards.
The following terminology helps you understand and leverage Feature Fairness Analysis:
Fairness Group: After you select a fairness dimension, the data can be clustered into different groups. CDP evaluates the impact of each group on the Machine Learning model.
For example, if Gender is the fairness dimension, Male and Female can be the fairness groups. If Age is the fairness dimension, you can have multiple fairness groups, such as ages 18-29, 30-49, and 50 and above.
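Bucketing a raw dimension value into fairness groups can be sketched as follows; the group names and boundaries mirror the hypothetical Age example and are not a CDP API:

```python
def fairness_group(age):
    """Map a raw age to a hypothetical fairness group.

    Boundaries follow the example groups: 18-29, 30-49, and 50 and above.
    """
    if age < 18:
        return "under 18"
    if age <= 29:
        return "18-29"
    if age <= 49:
        return "30-49"
    return "50 and above"
```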
For adding custom Fairness Dimension and Fairness Group, contact your Acquia CVM or AM.
Feature Fairness Analysis has the following sections in the out-of-the-box dashboards:
Input Training Data Distribution: This chart shows the distribution of training input data across various fairness groups within each fairness dimension, such as age and gender.
If the magnitude of the observed percentage is high and the positive outcome ratio is less than 100%, it means that despite strong evidence, the model is not influenced by the fairness group.
A higher percentage change value means that the model is more biased toward the fairness group.
Model Performance Metrics: This chart shows the Machine Learning model's prediction performance for each fairness group, using model performance metrics such as precision, accuracy, and recall. A significant variation of these metrics between subgroups points to some form of bias.
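Computing these per-group metrics from each group's confusion matrix can be sketched as follows. The group names and outcomes are hypothetical:

```python
from collections import defaultdict

# Hypothetical records: (fairness_group, actual_label, predicted_label),
# where 1 marks a positive outcome.
records = [
    ("18-29", 1, 1), ("18-29", 0, 1), ("18-29", 1, 1), ("18-29", 0, 0),
    ("30-49", 1, 1), ("30-49", 1, 0), ("30-49", 0, 0), ("30-49", 0, 0),
]

def group_metrics(records):
    """Tally a confusion matrix per fairness group, then derive
    precision, recall, and accuracy for each group."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for group, actual, predicted in records:
        c = counts[group]
        if actual == 1 and predicted == 1:
            c["tp"] += 1
        elif actual == 0 and predicted == 1:
            c["fp"] += 1
        elif actual == 1 and predicted == 0:
            c["fn"] += 1
        else:
            c["tn"] += 1
    metrics = {}
    for group, c in counts.items():
        total = sum(c.values())
        predicted_pos = c["tp"] + c["fp"]
        actual_pos = c["tp"] + c["fn"]
        metrics[group] = {
            "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
            "recall": c["tp"] / actual_pos if actual_pos else 0.0,
            "accuracy": (c["tp"] + c["tn"]) / total,
        }
    return metrics
```

Comparing the resulting metrics across groups surfaces the kind of variation that points to bias: in this toy data, the two groups have equal accuracy but different precision and recall.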
If this content did not answer your questions, try searching or contacting our support team for further assistance.