In her latest WARC column, Professor Karen Nelson-Field, founder of Amplified Intelligence, describes the fundamental qualities of the data flow required for the future success of attention economics in advertising.
Attention data under the microscope
If the most advanced industries are those open to constant recalibration, digital media deserves a prominent place among them.
Over the last few years, measurement change led by pioneering brands and agencies has moved so fast and so far that previously nascent concepts are fast becoming established truths. The vast gap between impression-based measurement and human-based attention is now widely accepted, as are the problems it causes for advertisers wanting to understand the relative value of their ad dollars spent across media. Moreover, it is now well recognised that attention data can help identify and fix these gaps through signals fed into media planning, buying, and verification processes.
Players throughout the ecosystem can rightly be proud of having reached a position where key campaign assessment challenges and attention solutions are understood, but a deeper understanding of what a future state with attention looks like is still needed.
To set the scene, attention data broadly falls into two buckets: human and non-human. An easy way to define the difference is to think of them as ‘outward-facing’ data – which collects person-level human data via gaze tracking and/or facial recognition for the purpose of observing human behaviour – and ‘inward-facing’ data, which collects impression-level data via a pixel for the purpose of making assumptions about human behaviour.
Each data set has its own limitations. Filming humans continuously and en masse to the scale of traditional reach panels has ethical implications: no-one wants a currency that is by default surveillance. Meanwhile, pixel data might be scalable but its ability to predict human attention is extremely limited (see my last WARC column).
What we need is a combination of both; an amalgamation of data that marries real person-level human attention to scale.
Jonathon Wells, a data science veteran at Nielsen, spoke to this point in a recent paper on ACR (automatic content recognition). He said: “...as lucrative a data source as ACR is, it isn’t sufficient by itself to measure audiences, simply because it lacks the most important aspect there is in audience measurement: people. So, the best way to unlock the true potential of ACR data is to calibrate it with data that reflects true person-level viewing behavior.” This is what data scientists call ‘ground truth’.
The critical importance of ‘ground truthing’
In supervised learning algorithms, ‘ground truth’ is a term that refers to data that is a ‘provable’ or ‘true’ answer to a specific question. This type of data is collected by deep and direct observation of real features and substance in context, as opposed to modelled data, which makes assumptions about the true answer. The deeper and more accurate the ground truth data, the better predictive algorithms will perform.
In the context of the Attention Economy, ground truth data can be described as above: person-level human data collected via gaze tracking and/or facial detection while people consume media in real time. It tells us exactly how much attention-time, and attention-focus, a human pays to advertising. Modelled attention data, on the other hand, is impression-level data collected via a tracking pixel, capturing a user’s scroll speed, time-in-view, ad pixel load, and ad coverage. It tells us how the ad loaded during the session and how it was displayed on the screen, but the quality of any inferred attention values depends entirely on the quality and quantity of the ground truth data upon which the estimation is based.
One is quality-rich but low-scale (millions of data points); the other is quality-poor but high-scale (billions of data points).
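To make the ground-truthing idea concrete, here is a minimal sketch of supervised calibration. Everything in it is a simplifying assumption for illustration — the synthetic features (time-in-view, scroll speed, coverage), the coefficients, and the sample sizes are invented, not drawn from any real panel — but the shape of the task is as described: learn the mapping from pixel signals to gaze-measured attention on a small panel, then apply it at impression scale.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical ground-truth panel (small, quality-rich) ---
# Pixel-derived proxy features logged for the SAME people whose gaze
# was tracked: time-in-view (s), scroll speed (px/s), ad coverage (0-1).
n_panel = 1_000
X_panel = np.column_stack([
    rng.uniform(0, 10, n_panel),    # time_in_view
    rng.uniform(0, 500, n_panel),   # scroll_speed
    rng.uniform(0, 1, n_panel),     # coverage
])
# Gaze-measured attention seconds: the 'provable' answer.
# (A purely synthetic relationship, for illustration only.)
y_panel = 0.6 * X_panel[:, 0] - 0.004 * X_panel[:, 1] + 2.0 * X_panel[:, 2]
y_panel += rng.normal(0, 0.3, n_panel)

# --- Supervised calibration: learn attention from pixel signals ---
A = np.column_stack([X_panel, np.ones(n_panel)])   # add intercept term
coef, *_ = np.linalg.lstsq(A, y_panel, rcond=None)

# --- Apply to the impression-level database (large, quality-poor) ---
n_impressions = 1_000_000
X_big = np.column_stack([
    rng.uniform(0, 10, n_impressions),
    rng.uniform(0, 500, n_impressions),
    rng.uniform(0, 1, n_impressions),
])
predicted_attention = np.column_stack([X_big, np.ones(n_impressions)]) @ coef
# One estimated attention value per impression, at billion-capable scale.
```

Real systems would use richer models than a linear fit, but the dependency is the same: without the panel labels in `y_panel`, the large impression dataset has nothing to learn from.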
The diagram below describes the fundamental qualities of the data flow required for the future success of attention economics in advertising. At its core is a clear but profound data interface that connects ground truth and inferred data, providing a clear map of attention measurement for a functional and enduring ecosystem.
The optimal pipeline flow is as follows:
- Ground Truth Database: The only ground truth data that can found accurate attention metrics and models comes from human panels: in particular, human gaze and/or facial detection-based attention data, plus impression-level pixel tracking data from the same person during the course of the ad view. Collecting gaze and impression-level data from the same human allows us to build a solid bridge between human attention and impression-level signals.
- Impression Level Database: This data, collected via tracking pixel, inherently holds indicators of attention (i.e. scroll speed) but in the absence of human data cannot on its own be used to predict human attention. This is due to the complex and varied combination of indicators required to predict human viewing by ad-unit. Put another way, the complexity of human behaviour cannot be predicted by this data alone.
- Enriched Database: Both sets of data are joined in an enriched database where learnings from the ground truth data are applied to the impression level data to predict accurate attention metrics on a multibillion ad impression scale.
- Attention Algorithms: Separate models, covering both attention time and attention focus, support the three different applications of attention metrics in media. Time is the number of seconds of attention paid; focus is the degree of concentration within those seconds, i.e., heavy/light switching, sustained/unsustained gaze. Models using two-dimensional attention can accurately predict outcomes, which allows advertisers to apply attention data in practice.
- Attention Models for Media Planning: Models built for media planning tools apply weighting to existing media plans for the purpose of optimising for attention to increase efficiency of spend before media is traded. The optimum threshold of attention required changes depending on campaign objectives.
- Attention Models for Media Trading: Custom probabilistic models built for media trading allow an advertiser to bid on and buy media using attention signals when the right combination and weight of attention indicators are present at the point of transaction.
- Attention Models for Media Verification: Models built for media verification measure the attention and campaign performance of an impression after it is delivered, supporting in-flight campaign optimisation.
- Feedback Loop between Applications: While not part of the data flow from a machine learning perspective, it is important to call out how each product application informs the others. As advertisers or agencies move through the ecosystem, each application improves the performance of the next. For example, optimised planning improves decisions made in trading, optimised trading improves outcomes of verification, and verification improves media mix scenario planning.
- Ongoing Collection and Tracking: All data needs to be updated regularly for continuous improvement of the machine learning models. All new verification data returns to the impression-level database, which feeds the enriched database. All human data needs regular updating too, because when a media owner changes the functionality of a platform, the way humans interact with it changes as well. Without new data, the predictive accuracy of an attention model will erode.
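The pipeline steps above can be sketched structurally as follows. This is a toy, in-memory model under stated assumptions: the class names, the single time-in-view signal, and the ratio-based ‘model’ are all illustrative stand-ins for real databases, richer signal sets, and production ML services.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PanelRecord:
    """Ground truth: gaze and pixel signals captured from the same ad view."""
    gaze_seconds: float
    pixel_signals: dict

@dataclass
class Impression:
    """Pixel-only record collected at scale; attention is inferred later."""
    pixel_signals: dict
    predicted_attention: Optional[float] = None

@dataclass
class Pipeline:
    ground_truth: list = field(default_factory=list)   # human panel data
    impressions: list = field(default_factory=list)    # impression-level data

    def calibrate(self) -> float:
        # Toy stand-in for the attention algorithms: average gaze seconds
        # per second of pixel-measured time-in-view across the panel.
        ratios = [r.gaze_seconds / max(r.pixel_signals["time_in_view"], 1e-9)
                  for r in self.ground_truth]
        return sum(ratios) / len(ratios)

    def enrich(self) -> None:
        # Enriched database: panel learnings applied to every impression.
        k = self.calibrate()
        for imp in self.impressions:
            imp.predicted_attention = k * imp.pixel_signals["time_in_view"]

    def ingest_verification(self, new_impressions: list) -> None:
        # Feedback loop: delivered impressions return to the impression-level
        # database so the models can be refreshed over time.
        self.impressions.extend(new_impressions)

# Usage: two panel records calibrate the model; one impression is enriched.
pipe = Pipeline(
    ground_truth=[PanelRecord(3.0, {"time_in_view": 6.0}),
                  PanelRecord(1.0, {"time_in_view": 2.0})],
    impressions=[Impression({"time_in_view": 4.0})],
)
pipe.enrich()
print(pipe.impressions[0].predicted_attention)  # 0.5 ratio × 4.0 s = 2.0
```

The point of the structure, rather than the arithmetic, is the interface: ground truth and impression data stay separate until enrichment, and verification data flows back in so the calibration never goes stale.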
Most understand the saying ‘the whole is greater than the sum of its parts’. It is no different here. Pixel-derived impression data on its own tells us little about whether a human has paid attention, and human data alone can’t scale. One gives very clear information about humans but is small; the other gives a blurrier view of human behaviour but a bigger map. Bridged together, the two bring both clarity and scale; combined, they can change the measurement landscape.
Future proofing is vital
What we have seen from the attention industry to date has only scratched the surface. While we have case evidence that gives us overall success stories, our industry needs a sharper, more cohesive view of the engagement each ad drives and a validated blueprint for true integration of attention data into the broader measurement ecosystem.
This data flow shifts us from attention concepts to the attention science needed for the successful application of attention data into planning, buying and verification systems.
Why does this matter? Because if we don’t apply rigour and instead continue to work to a blurry, isolated roadmap, we run the risk of simplistic attention measurement applications doing more harm than good. We have come from a place where ‘good enough’ was good enough, and our industry and the advertisers that fund it are still paying the price now, via the billions of wasted media dollars spent on ads that potential customers don’t even see.