In Customer Data Platform (CDP), the transactionsummary table contains
the mastercustomerid and mastercustomerid2 attributes. For anonymous
customer transactions, the mastercustomerid attribute is set to -1
.
If your dataset has higher number of anonymous customers and you try to join
the transactionsummary and customersummary tables, your query performance
is not optimum.
To improve the query performance, you must use mastercustomerid2 and not
mastercustomerid. The mastercustomerid2 attribute is a copy of the
mastercustomerid attribute. However, the value for mastercustomerid2 is
NULL
when mastercustomerid is -1
for anonymous transactions. You
can use the mastercustomerid2 attribute on the transactionsummary table
while using a FULL OUTER JOIN. In this case, your query returns results for
anonymous transactions also.
To query anonymous transactions, you can use the mastercustomerid2
attribute. In this case, your WHERE
clauses can include
mastercustomerid2 = NULL
and not mastercustomerid = "-1"
.
Sample SQL query
SELECT
customersummary.gender AS "customersummary.gender",
productsummary.brandname AS "productsummary.brand",
COUNT(DISTINCT customersummary.mastercustomerid ) AS "all_individuals_count",
COUNT(DISTINCT transactionsummary.mastercustomerid ) AS "buyer_count",
COUNT(DISTINCT transactionsummary.transactionid ) AS "transaction_count",
FROM customersummary AS customersummary
FULL OUTER JOIN transactionsummary AS transactionsummary
ON transactionsummary.mastercustomerid2 = customersummary.mastercustomerid
LEFT JOIN productsummary AS productsummary
ON transactionsummary.productid = productsummary.id
GROUP BY 1,2
ORDER BY 3 DESC
LIMIT 500