Interactive Queries

Improving query performance

In Customer Data Platform (CDP), the transactionsummary table contains the mastercustomerid and mastercustomerid2 attributes. For anonymous customer transactions, the mastercustomerid attribute is set to -1. If your dataset has higher number of anonymous customers and you try to join the transactionsummary and customersummary tables, your query performance is not optimum.

To improve the query performance, you must use mastercustomerid2 and not mastercustomerid. The mastercustomerid2 attribute is a copy of the mastercustomerid attribute. However, the value for mastercustomerid2 is NULL when mastercustomerid is -1 for anonymous transactions. You can use the mastercustomerid2 attribute on the transactionsummary table while using a FULL OUTER JOIN. In this case, your query returns results for anonymous transactions also.

To query anonymous transactions, you can use the mastercustomerid2 attribute. In this case, your WHERE clauses can include mastercustomerid2 = NULL and not mastercustomerid = "-1".

Sample SQL query

SELECT
   customersummary.gender AS "customersummary.gender",
   productsummary.brandname AS "productsummary.brand",
   COUNT(DISTINCT customersummary.mastercustomerid ) AS "all_individuals_count",
   COUNT(DISTINCT transactionsummary.mastercustomerid ) AS "buyer_count",
   COUNT(DISTINCT transactionsummary.transactionid ) AS "transaction_count",
   FROM customersummary AS customersummary
   FULL OUTER JOIN transactionsummary AS transactionsummary
      ON transactionsummary.mastercustomerid2 = customersummary.mastercustomerid
   LEFT JOIN productsummary AS productsummary
      ON transactionsummary.productid = productsummary.id
GROUP BY 1,2
ORDER BY 3 DESC
LIMIT 500