Data Automation

How to deploy DataOps: Step 2 – Create foundational data

Switchboard Feb 17

Creating foundational data

    In our last post, we covered Step 1 of the four steps to realizing the benefits of DataOps: Identify KPIs.

    Once your data collaboration goals and key metrics are established, the technical work of producing value from your data begins. Various teams across your organization require different combinations of metrics. But raw data accessed directly from APIs or log files rarely arrives in a format suitable for collaborative analysis.

    To address this, you need to continuously and automatically refine raw data streams into a format useful to derive your KPIs. This is what we call ‘foundational data’.

    Foundational data: the basis for true KPIs

    Just as a jet aircraft requires highly refined aviation fuel to achieve its full potential, the KPIs that drive your business decisions require data that is high quality, well understood, and, above all, reliable.

    Unfortunately, raw data that comes from vendors and third parties can be anything but. No two APIs are alike, so data must be cleaned, typed and sometimes enriched with match tables to be useful. Oftentimes, data formats change, while connectivity or vendor hiccups present the ongoing risk of data loss or data corruption.

    Building meaningful KPIs from such data is impossible without taking on an enormous amount of complexity. With such a large gap between raw data and KPIs, experienced teams are investing in an intermediate concept – foundational data.

    Foundational data involves taking each source and normalizing it into a standardized, canonical version which, ideally, can be easily combined with other, similarly refined data sources.
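    To make this concrete, here is a minimal sketch of what that normalization step can look like for a single record. The raw field names, the match table, and the canonical schema are illustrative assumptions, not the format of any particular vendor's API.

```python
# A minimal sketch of normalizing a raw vendor record into a canonical schema.
# The field names ("imps", "rev") and the match table are hypothetical examples.
from datetime import date

# Example match table mapping inconsistent advertiser names to one canonical value
ADVERTISER_MATCH_TABLE = {"acme co.": "ACME", "acme inc": "ACME"}

def normalize(raw: dict) -> dict:
    """Cast types, standardize dimension values, and enrich via a match table."""
    advertiser = raw["advertiser"].strip().lower()
    return {
        "report_date": date.fromisoformat(raw["date"]),        # string -> date
        "advertiser": ADVERTISER_MATCH_TABLE.get(advertiser, advertiser),
        "impressions": int(raw["imps"]),                        # string -> int
        "revenue": float(raw["rev"]),                           # string -> float
    }

print(normalize({"date": "2024-02-17", "advertiser": "Acme Co. ", "imps": "1200", "rev": "34.50"}))
```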

    Data source integration challenges

    If you’re using Google Ad Manager, MOAT, Krux, YouTube or a similar analytics platform, you’ll likely have experienced some of the following:

    • Programmatic deal data is not easily matched to campaign delivery data from the ad server
    • Complex APIs require developer resources to use and maintain
    • Many different metrics are not always easily matched to campaign delivery or audience segments
    • Large user and segment logs require data parsing skills
    • Raw data exported to the data warehouse still needs refinement before it can be analyzed
    • Variables can be complex to analyze
    • Data such as UTMs and URLs can be challenging to clean (a short sketch of UTM cleaning follows this list)
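    As a small example of that last point, here is a minimal sketch of pulling UTM parameters out of a landing-page URL with Python's standard library. The example URL and the decision to keep only utm_* parameters are illustrative assumptions.

```python
# A minimal sketch of cleaning UTM parameters out of a campaign URL.
# The example URL and the "keep only utm_* keys" rule are illustrative assumptions.
from urllib.parse import urlparse, parse_qs

def extract_utms(url: str) -> dict:
    """Return lowercased utm_* parameters from a landing-page URL."""
    params = parse_qs(urlparse(url).query)
    return {
        key: values[0].strip().lower()
        for key, values in params.items()
        if key.startswith("utm_")
    }

print(extract_utms(
    "https://example.com/landing?utm_source=Newsletter&utm_medium=Email&gclid=abc123"
))
# -> {'utm_source': 'newsletter', 'utm_medium': 'email'}
```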

    Common data sources need to be normalized

    Within most organizations, the list of data sources you need to master is clear. However, each incremental data source adds new complexities. By understanding the distinctive properties and challenges presented by each source, you’ll be able to make a more informed tool selection based on the unique profile of your business.

    Disparate data sources, formats, and integration challenges will sap any budget, so how can you protect against mounting costs? Rather than attempt to onboard every single data source for its own sake, try to understand the data characteristics of your business today, and where it will be tomorrow. This will help you understand how your raw data can evolve to become the foundation for the specific metrics you need to succeed.

    Creating foundational data

    Let’s say you’re most interested in creating foundational GAM data because, as your primary ad server, GAM can provide a rich view of how certain display and video inventory is delivered. Start with the GAM API. For brevity, we’ll assume you’re already familiar with its quirks and limitations.

    • The first step is to determine the appropriate queries and the granularity of data required. An important consideration is identifying the dimensions you really need, as there are quota limits.
    • Next, use a script or a tool to invoke the API, then extract and store the query result. It’s important to do this with 100% consistency so that query results maintain the same schema (a rough sketch of this extract-and-load shape follows this list).
    • Each row in the query result needs to be type-checked, i.e., numeric values must be cast into integers or floats if they are to have any value for calculations.
    • Dimensions must be normalized in order to avoid textual inconsistencies (often the result of occasional human input error) that can also throw off calculations.
    • The query result needs to be written either to a file or, preferably, to a data warehouse, so that it can be consolidated for queryability. Additional considerations include how to extract key-values so that the business attributes captured in custom dimensions are available for analysis, as well as backfilling.
    • Finally, consider whether you need Data Transfer (event-level server logs that can provide the finest possible granularity of insights).
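    As a rough illustration of the extract, type-check, and load steps above, here is a minimal sketch in Python. The fetch_report and load_to_warehouse callables, the report columns, and the table name are hypothetical stand-ins, not the actual GAM client library or a specific warehouse API.

```python
# A rough sketch of the extract -> type-check -> load shape described above.
# fetch_report(), load_to_warehouse(), and the column names are hypothetical
# stand-ins, not the real GAM API or a particular warehouse client.
from typing import Callable, Iterable

def type_check(row: dict) -> dict:
    """Cast metrics to numeric types and tidy dimensions so results are calculable."""
    return {
        "ad_unit": row["ad_unit"].strip(),        # normalized dimension
        "impressions": int(row["impressions"]),   # metric -> int
        "clicks": int(row["clicks"]),             # metric -> int
        "revenue": float(row["revenue"]),         # metric -> float
    }

def run_pipeline(
    fetch_report: Callable[[], Iterable[dict]],
    load_to_warehouse: Callable[[str, list], None],
) -> None:
    """Invoke the same report query every run, enforce the schema, and load the result."""
    raw_rows = fetch_report()                     # same query, same schema, every run
    clean_rows = [type_check(row) for row in raw_rows]
    load_to_warehouse("foundational_gam_delivery", clean_rows)
```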

    The steps above are the abbreviated set of tasks involved in creating foundational data for a selected data source. These processes also involve numerous data cleansing tasks (file encoding, to name just one), but they are a necessary part of getting your data into a neat, foundational layer.
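    For instance, one common cleansing task is re-encoding a vendor export so everything downstream can assume UTF-8. A minimal sketch, assuming the source file arrives as Latin-1 (the real encoding varies by vendor):

```python
# A minimal sketch of one cleansing task: re-encoding a raw export to UTF-8.
# The Latin-1 source encoding and the file paths are assumptions for illustration.
def reencode_to_utf8(src_path: str, dst_path: str, src_encoding: str = "latin-1") -> None:
    """Read a raw vendor file in its original encoding and rewrite it as UTF-8."""
    with open(src_path, encoding=src_encoding, errors="replace") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line)
```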

    If you need help unifying your first- or second-party data, we can help. Contact us to learn how.
