Bot traffic filtering

Many customers use Personalization to analyze and track their traffic, and use the collected data for basic instrumentation such as tracking Page Views, clicks, and other activities. When bots and spiders crawl your website, their traffic is not relevant for defining segments or visitors.

Personalization attempts to filter out traffic from bots and spiders that may crawl your website, reducing false positives for tests and Page Views.

To filter out non-useful hits, Personalization matches user agents against the allowlist and denylist in the IAB/ABC International Spiders and Bots List, so that traffic from bots and spiders is not counted as user traffic.

  • What does the list filter?

    The IAB/ABC International Spiders and Bots List aims to consistently filter well-behaved automated traffic from totals used for traffic measurement or billing purposes.

    This list primarily detects ‘good actors’ who self-declare as automated traffic. Some ‘good actors’ must pretend they’re not bots, such as when scanning for malware or brand safety violations, so the list may not detect them.

  • What does this list contain?

    The IAB/ABC International Spiders and Bots List contains two lists, an allowlist and a denylist, which are used in a three-step process for filtering user-agent strings.

    The first level of filtering (the allowlist) contains text patterns that generously match all known valid user-agent strings.

    The second level of filtering (the denylist) contains text patterns that match detected spiders and bots that passed the first level of filtering.

    Each text pattern on the denylist may also include exceptions.

  • How many entries are on the list?

    The IAB/ABC International Spiders and Bots List uses pattern matching, so the number of entries does not correlate with its effectiveness. In fact, fewer entries can indicate better-chosen text patterns.
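
The allowlist and denylist passes described above can be illustrated with a minimal Python sketch. It is not Personalization's implementation and does not use real entries from the IAB/ABC list: the patterns, the `DenyRule` type, and the `is_counted_as_user` function are hypothetical, and case-insensitive substring matching is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DenyRule:
    """One hypothetical denylist entry: a pattern plus optional exceptions."""
    pattern: str                     # substring that marks a spider or bot
    exceptions: Tuple[str, ...] = () # substrings that cancel the match


# Hypothetical sample patterns, NOT taken from the IAB/ABC list.
ALLOWLIST = ("mozilla", "opera")
DENYLIST = (
    DenyRule("bot", exceptions=("cubot",)),  # "cubot" is an invented exception
    DenyRule("spider"),
    DenyRule("crawl"),
)


def is_counted_as_user(user_agent: str) -> bool:
    """Return True if the user agent would be counted as user traffic."""
    ua = user_agent.lower()  # assumption: matching is case-insensitive

    # First level: the user agent must match at least one allowlist pattern.
    if not any(pattern in ua for pattern in ALLOWLIST):
        return False

    # Second level: drop the hit if a denylist pattern matches,
    # unless one of that pattern's exceptions also matches.
    for rule in DENYLIST:
        if rule.pattern in ua and not any(e in ua for e in rule.exceptions):
            return False

    return True
```

In this sketch, a single broad pattern such as "bot" covers many self-declared crawlers, which is why a short list of well-chosen patterns can filter as much as a long list of narrow ones.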