Every day, businesses collect data about users. Marketers follow users as they move from one page to another in a browser, from one screen to another in an app, and even from one streamed song or video to the next. One of the popular (and lucrative) ways to use this data is to help marketers reach their audiences more effectively. Multi-touch attribution can help marketers do just that by mapping the customer journey to conversion, examining marketing funnels, simulating “what-if” scenarios and forecasting ROI. But while user-level data is valuable for marketing measurement, such granular data also increases both the complexity and the volume of the data: it becomes the much-hyped “big data.” Big data must be processed thoroughly and thoughtfully to produce an attribution model and actionable insights for marketers. I will explore this intersection of big data and marketing attribution in this series of blog posts. Before I get into the weeds, let’s understand when big data is applicable in the context of multi-touch attribution.

The foundation of a multi-touch attribution model is its user path data. Each path is filled with impressions, clicks, sends, email opens and/or calls, collectively referred to as touchpoints or events. Depending on a company’s marketing budget, the volume of events may not constitute “big data.” However, companies with a marketing budget of as little as one million dollars annually may find that ad impressions alone, each of which records the exposure of a user to a single ad at a given point in time, often occur at volumes too high to be collected effectively with the methods that worked for aggregated data or for event-level data in smaller volumes. This problem of scale is therefore largely specific to digital advertising channels such as display, paid search (SEM), social media, video or addressable TV, where ad impression volume is high.
Direct mail, an offline channel with user-level data, has its own data collection challenges as well, each of which is important to consider when designing a multi-touch attribution solution.

In this blog series, I will break down the methods by which big data can be collected and identify the pros and cons of each method. Then I will identify the situations in which each method is recommended for successfully collecting big data for the purposes of multi-touch attribution. I will compare pixel tracking with ad server logs, compare storing data in individual fields with overloading fields, and get into the timeliness, completeness, accuracy, and effort required for each data collection method. It should be a wild ride.

Check back for part 2: Big Data Event Collection – Direct Trafficking or Ad Server Logs?