Learn best practices to Be Data Strong in the cloud
Hi, Michael Manoochehri here, Switchboard’s CTO and Co-founder. This month we’re talking about best practices to unlock insights from enterprise data in the cloud.
Here’s the problem: Getting data into an Enterprise Data Warehouse (EDW) in the cloud from dozens if not hundreds of data sources today is really pretty easy. If anything, it’s too easy. Being able to confidently extract valuable information from this data is really hard.
Recently I was invited by Google Developer Advocate, Felipe Hoffa, to chat about our experiences working with BigQuery on Google’s Cloud OnAir webinar. BigQuery enables you to ask important questions such as “Who are my best advertisers?” or “What are the best ways to sell ads on my site?” But you can’t confidently unlock these real-time insights unless the data in your EDW is in the form of foundational data – combining disparate data silos into a unified, strategic data asset – to Be Data Strong. There are four attributes of foundational data that are essential for success: accuracy, granularity, flexibility, and performance.
By combining Switchboard’s data automation technology with a powerful data warehouse such as BigQuery, companies can benefit from all of these elements concurrently, without having to compromise. How does this happen behind the scenes? Let’s dive a little deeper and explore some of the key takeaways from the session.
Step One: Choose a data automation platform
Platforms such as Switchboard help provide business teams with a single source of foundational data from their Enterprise Data Warehouse (EDW). A very popular EDW is BigQuery, a platform which I helped launch with our Co-founder, Ju-kay Kwek, in 2012.
One of the questions we hear all the time is: Why is it so difficult to find reliable insights from our data in the cloud? One of our customers, the Financial Times, has a daily influx of hundreds of thousands of rows of data from multiple ad campaigns. Log-level files sitting in the cloud need to be merged with a variety of other siloed data to produce the relevant insights. Different teams require their own custom reports, and manually aggregating this data at scale is simply unsustainable – too many pivot tables of stale information.
The only sustainable approach is to implement a fully automated strategy. Traditionally this has required time-consuming, not to mention costly, engineering projects. Ready-made automation platforms such as Switchboard help provide business teams with a reliable, quick-to-implement, single source of foundational data from their chosen EDW. Utilizing a ready-made platform is a proven, cost-effective solution to the data automation challenge.
Step Two: Implement a fully automated process
Accuracy, granularity, flexibility, performance? Pick two. In the past, the Financial Times was forced to choose among these four core attributes for any analysis effort. For example, teams could generate more granular reports, but this would take time. Alternatively, they could employ faster reporting capabilities, but only by processing a limited number of data rows, which in turn limited the insights available. Today, tools such as BigQuery allow business teams to leverage these attributes concurrently, enabling them to enjoy high-performance, high-granularity, real-time aggregation at scale – but only if they are built upon foundational data within the EDW. Here’s how:
Accuracy requires governance. With mounting concerns over security and privacy, data governance is paramount. As consumers of data, business teams need complete confidence in the accuracy of their reports. It’s one thing to connect your data sources, but it’s another thing to be confident that you’re monitoring and verifying hundreds of data sources correctly. Using a data automation platform to implement auditable rules for data governance and alerting allows operations teams to instantly identify whether or not – and crucially, why – their data is up-to-date. This, in turn, enables them to provide their executive teams with the specific reports they require, when they require them, and with confidence.
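To make the idea of an auditable freshness rule concrete, here is a minimal sketch in Python. It is an illustration only, not how Switchboard or BigQuery implements governance: the names `FreshnessRule`, `AuditResult`, and `audit_source`, the source names, and the timestamps are all hypothetical. Each rule states how stale a source may be, and each result records both whether the data is up-to-date and why.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class FreshnessRule:
    """Hypothetical rule: a source must have loaded within max_age."""
    source: str
    max_age: timedelta


@dataclass
class AuditResult:
    """Outcome of one check, including a human-readable reason."""
    source: str
    up_to_date: bool
    reason: str


def audit_source(rule: FreshnessRule,
                 last_loaded: Optional[datetime],
                 now: datetime) -> AuditResult:
    # A source that has never loaded is stale by definition.
    if last_loaded is None:
        return AuditResult(rule.source, False, "no successful load recorded")
    age = now - last_loaded
    if age > rule.max_age:
        return AuditResult(
            rule.source, False,
            f"last load was {age} ago, limit is {rule.max_age}")
    return AuditResult(rule.source, True, "loaded within the allowed window")


# Made-up example data: two ad sources and their last load times.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
rules = [
    FreshnessRule("ad_server_logs", timedelta(hours=24)),
    FreshnessRule("campaign_bookings", timedelta(hours=24)),
]
last_loads = {
    "ad_server_logs": datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc),
    "campaign_bookings": datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc),
}

results = [audit_source(r, last_loads.get(r.source), now) for r in rules]
for result in results:
    status = "OK" if result.up_to_date else "STALE"
    print(f"{result.source}: {status} ({result.reason})")
```

Because every result carries a reason, the same records can feed both an alert and an audit trail, which is the "whether and why" distinction described above.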
Granularity needs scale. Companies that handle enormous amounts of data – such as digital publishers or marketers – often face a trade-off between having granular data and the cost of automating the data at the scale required to achieve this. By adopting a fully scalable cloud data warehouse such as BigQuery, teams are now able to manage rapidly growing datasets, while benefiting from the agility and data granularity required to effectively optimize at the campaign, advertiser, and ad unit level.
Flexibility enables insights. When working with an EDW, it’s important to have the capability to run different workloads at the same time, make adjustments on the fly, add business rules, and understand the implications of each of these actions to provide the most relevant reports to the teams that need them. With a cloud-based EDW like BigQuery, users are not constrained by legacy reporting applications, freeing them to join the data in ways they may not have previously considered.
Performance creates opportunity. Fast and efficient aggregation of data allows teams to do a number of different things with their data, for instance, produce a variety of different reports from a large amount of raw data, or prepare data for specific visualization tools. Data is a living beast: aggregation is not a static process but an iterative cycle that drives continual improvements and responds to the ever-changing demands that business teams face. Hard-coded data pipelines and legacy spreadsheet analysis do not allow for a quick response to evolving needs. Using a data automation platform to create foundational data can enable the team to use an iterative, drill-down style of questioning such as: “Which campaign is delivering the ad to the right readers?” or “Which content and audiences generate the most revenue?”, along with new questions which were initially too complex to address within raw data sets.
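The drill-down style of questioning can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of BigQuery or Switchboard internals: the rows, the field names (`campaign`, `audience`, `revenue`), and the helper `aggregate_revenue` are all hypothetical. The point is that the same unified rows can be re-aggregated at a finer grain without rebuilding a pipeline.

```python
from collections import defaultdict

# Made-up, already-unified rows standing in for foundational data.
rows = [
    {"campaign": "spring", "audience": "finance", "revenue": 10.0},
    {"campaign": "spring", "audience": "tech", "revenue": 20.0},
    {"campaign": "summer", "audience": "finance", "revenue": 5.0},
]


def aggregate_revenue(rows, keys):
    """Sum revenue grouped by the given field names."""
    totals = defaultdict(float)
    for row in rows:
        group = tuple(row[key] for key in keys)
        totals[group] += row["revenue"]
    return dict(totals)


# First question: which campaign earns the most?
by_campaign = aggregate_revenue(rows, ["campaign"])

# Follow-up question: within each campaign, which audience drives it?
by_campaign_and_audience = aggregate_revenue(rows, ["campaign", "audience"])

print(by_campaign)
print(by_campaign_and_audience)
```

In a real warehouse each follow-up question would be a new query over the same tables rather than an in-memory loop, but the iterative pattern is the same: change the grouping keys, not the pipeline.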
Step Three: Reap rewards from the cloud
Incorporating the above elements concurrently is key to a foundational data approach that turns your data into a scalable strategic asset. The result is a forward-thinking business team that can leverage unique real-time insights from their cloud-based data warehouse to leave their competition behind.
Effective implementation requires a high-performance query engine, such as BigQuery, combined with a powerful, reliable automation platform which does the heavy lifting, provides flexibility to support new requirements, and saves on significant engineering costs along the way. For our customer in particular, deploying this combination resulted in an 80% reduction in reporting time, a five-fold increase in granularity, and a 3x increase in customer reporting frequency. By using a ready-made data automation platform to create foundational data in your EDW, you can Be Data Strong in the cloud.