Sep 22, 2020

Understanding Data-Driven Attribution Models

Baptiste Amar
Data Analyst

Baptiste Amar, senior data analyst, designed a fractional attribution model to more accurately credit marketing channels for their impact on revenue generation. This article is part 1 of a 3-part series on how he designed and executed the model. In this first installment, Baptiste sets the foundation by introducing the fundamentals and challenges of attribution models.

In part 2, Baptiste follows up with the model design process: gathering and formatting the data, and modifying the Markov chains model. In part 3, he dives deep into the challenge of deploying the data-driven model in the systems and pressure testing it on real marketing campaigns to ensure its relevance. If you haven’t already, follow us on LinkedIn to stay up to date for parts 2 and 3.


Why do we need data-driven attribution models?

The days of faith- or expertise-based budget allocations in marketing are long over. With the increased penetration of data and analytics into business strategies, marketing managers are faced with even more challenges: They now need to constantly prove the value of their actions.

But marketers aren't the only ones being confronted by this newfound challenge. Marketing-specialized data analysts like myself are responsible for providing valuable and actionable content to marketers, whether it's quick insights or heavy-duty modeling. Ultimately, this helps operational marketing teams make better decisions such as building an optimal media mix, launching more performant campaigns, or creating more engaging content.


In mature marketing organizations like GetYourGuide, analytics is essential when it comes to allocating resources: Media managers need material to get buy-in from financial stakeholders and eventually unlock the operating budget.

One of the biggest challenges in this context is measuring the return on media investments: How much revenue did the money spent on specific channels or campaigns generate? This structural question can be answered with several approaches, all of which require reliable data and sophisticated modeling.

One of the standard ways to tackle it is to divide revenue across the marketing channels depending on the impact they had on generating it. This is what attribution modeling is all about.


Marketing and conversion

Before purchasing a product online, customers can be exposed to a wide variety of marketing assets. An example of a path to conversion could be:

1. A customer sees a banner on a website that links to booking a Tour Eiffel ticket on GetYourGuide (display ad), and clicks on it. They browse our inventory without converting.

2. A few days later, they search Google for Tour Eiffel tickets, and click on the GetYourGuide ad (paid search) to access our platform once again and refresh their memory on the activities we offer. While browsing, they opt in to our newsletter.

3. A week after the customer's last visit, they receive an action-based email reminding them of the Tour Eiffel ticket, click on the email, search on our website for the tour they had their eye on, and book the attraction.

In this journey towards conversion, three marketing channels participated: display, paid search and email.

If we want to credit each of those three channels with the right portion of revenue, depending on the impact it had on the conversion, which channel should receive the most?

a. The display ad because it drove our client to the website for the first time and got them considering our brand?

b. The paid search click because it likely pushed the client much further into the purchasing intention?

c. The email touchpoint because it made the client convert?

An example of a path towards conversion

In all likelihood, all three of those visits had a substantial impact on the customer’s action.

Based solely on this simplified example, we already understand that there is no straightforward solution. Then there is another layer of complexity: With around 20 media channels in our mix, the number of possible channel sequences leading to a transaction is far too large to reason about case by case.
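To give a rough sense of scale, here is a minimal sketch that counts possible paths. It assumes a path is an ordered sequence of touchpoints, that a channel may appear more than once, and that paths are capped at a maximum length. The function name `count_paths` and the caps used are illustrative choices, not figures from our data.

```python
def count_paths(num_channels: int, max_length: int) -> int:
    """Count ordered touchpoint sequences of length 1..max_length.

    Assumption: a channel may repeat within a path, so a path of
    length k can be formed in num_channels ** k ways.
    """
    return sum(num_channels ** k for k in range(1, max_length + 1))


# Illustrative caps on path length; 20 is the approximate size of the mix.
for max_length in (3, 5, 10):
    print(max_length, count_paths(20, max_length))
```

Even with paths capped at three touchpoints, 20 channels already allow 8,420 distinct sequences, and the count grows by a factor of roughly 20 with every additional touchpoint.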

Rule-based models: the easy way

Across industries, simple rule-based models are the most widespread method of crediting revenue to channels because of two main advantages:

1. They are easy to understand for channel managers.

2. They are easy to implement in the systems.

Most classic models credit the revenue only to the first channel (first-click), only to the last channel (last-click), or equally across all channels involved (linear).

At GetYourGuide, we mainly based our attribution logic on a “U-shape” model where the first and last channels each get 40% of revenue, and the remaining 20% are divided across the intermediary ones.

Attributing revenue with a U-shape model

This logic, which is slightly more sophisticated, allows for better steering of both the initiating and closing channels. However, the channels that are often positioned in the middle of the path, those that reactivate customers’ intent and influence other channels, are left undervalued.
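The rule-based models described in this section can be sketched in a few lines. This is an illustration rather than our production logic: the function name `attribute`, the channel labels, and the handling of paths with only one or two touchpoints (100% and 50/50 respectively) are assumptions, since the U-shape rule above only defines the split when intermediary channels exist. The sketch also assumes the middle 20% is shared equally.

```python
from collections import defaultdict


def attribute(path, revenue, model="u_shape"):
    """Split `revenue` across the ordered touchpoints in `path`."""
    if not path:
        raise ValueError("path must contain at least one touchpoint")
    n = len(path)

    if model == "first_click":
        weights = [1.0] + [0.0] * (n - 1)
    elif model == "last_click":
        weights = [0.0] * (n - 1) + [1.0]
    elif model == "linear":
        weights = [1.0 / n] * n
    elif model == "u_shape":
        if n == 1:
            weights = [1.0]        # assumption: a single touch gets everything
        elif n == 2:
            weights = [0.5, 0.5]   # assumption: no middle, so split evenly
        else:
            # 40% first, 40% last, 20% shared equally by the middle touches
            middle = 0.2 / (n - 2)
            weights = [0.4] + [middle] * (n - 2) + [0.4]
    else:
        raise ValueError(f"unknown model: {model}")

    # A channel that appears several times accumulates its credits.
    credit = defaultdict(float)
    for channel, weight in zip(path, weights):
        credit[channel] += weight * revenue
    return dict(credit)


# The example journey from earlier, with a hypothetical booking value of 100.
journey = ["display", "paid_search", "email"]
for model in ("first_click", "last_click", "linear", "u_shape"):
    print(model, attribute(journey, 100.0, model))
```

On the three-touch journey, the U-shape rule gives display and email 40% each and paid search 20%, which shows how a middle channel ends up with the smallest share regardless of what it actually contributed.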

Data-driven and fractional approaches: estimating impact

Rule-based models, even sophisticated ones, often fail at outlining the true impact of the channels. They don’t capture the complexity of the media mix and channels.

Data-driven models, on the other hand, aim to understand the links between our marketing interventions and the customer’s response. Among many other factors, they have the potential to consider:

  • the sequence of events that led a client to purchase
  • the interactions between the involved channels
  • the position of each channel in the path
  • each touchpoint's proximity to conversion

By leveraging all this information, data-driven models credit revenue to channels more accurately, according to the impact they have on our transactional relationship with the customer. The resulting weights carry strong signals of incrementality.

Introducing data-driven fractional attribution modeling

Attribution modeling, even in its rule-based form, is an essential tool for calculating channel performance. Indeed, the key metric used at GetYourGuide is Return On Ad Spend (ROAS), which is the revenue generated by a channel divided by the amount spent on it, and channel revenue necessarily involves some attribution logic.
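As a minimal sketch of how ROAS depends on attribution, the snippet below divides attributed revenue by spend for each channel. The function name `roas`, the channel labels, and all figures are hypothetical, and returning `None` for a channel without spend is a choice made for this illustration.

```python
def roas(attributed_revenue, spend):
    """Return ROAS per channel: attributed revenue divided by spend."""
    result = {}
    for channel, cost in spend.items():
        if cost <= 0:
            # ROAS is undefined without spend; flag it rather than divide by zero.
            result[channel] = None
        else:
            result[channel] = attributed_revenue.get(channel, 0.0) / cost
    return result


# Hypothetical figures: revenue as credited by some attribution model.
revenue = {"display": 40.0, "paid_search": 20.0, "email": 40.0}
spend = {"display": 20.0, "paid_search": 10.0, "email": 5.0}
print(roas(revenue, spend))
```

Because the numerator comes straight from the attribution model, changing the model changes every channel's ROAS even when spend and total revenue stay the same.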

Now, having an attribution model that credits channels for their impact on revenue will help to:

  • Better understand channels' profitability and support budget allocation decisions.
  • Set up the right targets for channels over a defined period.
  • Let channel managers adjust their media acquisitions quickly so they can keep track of the revenue they intend to generate.
  • Design channels' strategy by analyzing the outcomes of campaigns or specific interventions.

For all those reasons, a robust data-driven attribution model makes marketing more efficient, which leads to increased global revenue.


A three-step approach for a best-in-class fractional attribution model

We organized this project in three parts, each achieved during a quarter. We’ll go through them in the next blog posts to be published in the upcoming weeks.

The first quarter was dedicated to building the data and testing several models.

Outcome: model selection.

In the second quarter, we fine-tuned and deployed the chosen model in the systems. In this part of the project, we hit a wall and had to come up with original material to be able to build the pipeline.

Outcome: model in production.

Lastly, in the third quarter of the project, we rigorously pressure tested the model on actual marketing campaigns to ensure its relevance. Pressure testing consists of comparing the campaigns' measured incrementality against the weights the model assigns to those same campaigns. This validates that we capture the right incrementality signals.

Outcome: model becomes the single source of truth.

Without this fundamental understanding of the business problem, it is virtually impossible to design and deploy a relevant fractional attribution model. This introduction helps us better understand the issue we are trying to solve: estimating each channel's revenue to measure its performance and make more efficient decisions in the future. In the forthcoming posts, we will walk through the multi-step plan, in which every part generates a valuable outcome.

In part 2 of this series, we will focus on how we designed the model using Markov chains. Stay tuned.

If you are interested in business intelligence, data analysis or data science, check out our open positions in engineering or marketing.
