This article and the related ones are mostly intended for advanced users who have already implemented standard event tracking (including the standard entities) as defined in List of tracked events and Examples of web event tracking. If you want to implement custom event tracking (on standard entities) on your website, or track standard events on standard entities with custom attributes, this series of articles is for you!
The client-facing standard CDP entities are the following:
There are other entities as well, but they are not relevant for the CDP Webtag and its applications.
You will notice that these entities map almost perfectly to the standard data feeds. In fact, the standard data feeds are the main way to populate these entities and their attributes: in a sense, whatever attribute/entity was configured for your data feed ingestion can be made available for the CDP Webtag to create/update. So, whatever the schema of your standard data feeds is, it is a fairly safe assumption that the objects you deal with in the CDP Webtag and related APIs will respect that schema (certain exceptions may exist, so always check with your CDP team if in doubt). Of course, this is a bit simplified (see the illustrative sketch after the two points below):
Providing those entities/attributes in the feeds you send to CDP does not necessarily mean that they were configured for the webtag to use…
…and, conversely, you do not need to provide all entities/attributes via the feeds to use them in CDP. For instance, some entities/attributes can be created for you by the CDP team solely for webtag or API usage (and thus without any flat-file feeds being provided).
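To make this schema alignment more concrete, here is a minimal TypeScript sketch. The Product interface and its attribute names are purely hypothetical placeholders for whatever your actual feed schema defines; they are not the real CDP schema.

```typescript
// Hypothetical "Product" schema: the attribute names mirror the columns of the
// corresponding standard data feed. Your actual schema is whatever your CDP
// team configured for your feed ingestion.
interface Product {
  productId: string;   // unique identifier, as in the feed
  name?: string;
  category?: string;
  price?: number;
}

// The same shape is then what you manipulate in the webtag and related APIs,
// e.g. when describing a product involved in a tracked event.
const viewedProduct: Product = {
  productId: "SKU-12345",
  name: "Running shoes",
  price: 89.9,
};
```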
All entities have unique identifiers that must be provided for the entity to be loaded into CDP (i.e. for that entity to be created or updated). Some entities offer several options as to which identifier to use (e.g. Customer), but you still need to provide at least one.
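As an illustration, here is a TypeScript sketch of the "attributes are optional, but at least one identifier is required" rule. The identifier options (customerId, email) and attribute names are hypothetical, not the actual Customer schema configured for your account.

```typescript
// Hypothetical "Customer" identifiers: at least one must be present for CDP to
// create or update the entity. The names below are illustrative only.
type CustomerIdentifier =
  | { customerId: string }  // internal ID…
  | { email: string };      // …or an alternative identifier, if configured

// Attributes alone are not enough: this type forces an identifier to be
// supplied alongside any attribute values.
type CustomerUpdate = CustomerIdentifier & {
  firstName?: string;
  lastName?: string;
  optIn?: boolean;
};

const update: CustomerUpdate = {
  email: "jane.doe@example.com", // the identifier (required)
  firstName: "Jane",             // attributes (optional)
};
```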
All data sources can send data about any entity in CDP, with no “priority” or “preference” between sources. CDP’s data processing then consolidates the data to provide a consistent and persistent view of all the entities loaded into the platform. This means that any data source can overwrite another data source’s data if both provide the same entity with the same identifier (but different attribute values). The rule is simple: the latest data source loaded into CDP “wins” as long as it provides data. In other words, all entities sharing the same identifier are merged and, for each attribute, the latest non-null value is kept, regardless of the data source that provided it.
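For clarity, here is a simplified TypeScript sketch of that merge rule (not CDP’s actual implementation): records sharing the same identifier are merged attribute by attribute, and for each attribute the latest non-null value wins.

```typescript
type Attributes = Record<string, unknown>;

interface IncomingRecord {
  id: string;        // shared entity identifier
  loadedAt: number;  // load timestamp (epoch millis), used for ordering
  attributes: Attributes;
}

// Merge all records that share the same identifier; for each attribute,
// the latest non-null value wins, whichever source provided it.
function consolidate(records: IncomingRecord[]): Map<string, Attributes> {
  const consolidated = new Map<string, Attributes>();
  // Process records in load order so later loads overwrite earlier ones.
  const ordered = [...records].sort((a, b) => a.loadedAt - b.loadedAt);
  for (const record of ordered) {
    const current = consolidated.get(record.id) ?? {};
    for (const [key, value] of Object.entries(record.attributes)) {
      // Null/undefined values never overwrite an existing value.
      if (value !== null && value !== undefined) {
        current[key] = value;
      }
    }
    consolidated.set(record.id, current);
  }
  return consolidated;
}
```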
The point above means that the platform is very flexible in what data it can accept (including consolidating sparse data from multiple sources), but it also means you have to be careful: send data only from sources that you “trust”, and make sure you do not send overlapping data from other sources, or you risk having that “source of truth” overridden by them.