
FAQs
Getting under the hood of Switchboard: The technical FAQ
Unifying data effectively is a challenge for some of the tech teams we speak to, so it’s important to get under the hood of what Switchboard does, and understand how we make the process run smoothly. Here are some of the common questions we’re asked.
Q: As a corporate enterprise, we’re used to pulling and aggregating terabytes of data each day via a host of platforms into Google Cloud. We’ve built our own infrastructure to do this. Where would Switchboard sit within our data stack?
At Switchboard, we imagine the data stack as a wedding cake. The top tier comprises data visualization dashboards (Tableau, Looker, Power BI, etc.), as well as custom applications and data science projects.
The second tier typically comprises the data lake (or warehouse), which stores the data. These are increasingly cloud-based, such as Snowflake, BigQuery, or Amazon Redshift, although customers also use storage options such as GCS buckets with Parquet files, HDFS (Hadoop Distributed File System), or SFTP servers.
The third tier, where the Switchboard application sits, is what many of our customers call the ‘partner ecosystem’ or ‘data ecosystem’. We call it the Data Operations (DataOps) layer: the single pane of glass that sits on top of a messy ecosystem of third-party data, APIs, JSON objects, CSVs, databases, and email. Switchboard’s cloud-hosted platform provides a consistent and repeatable set of workflows and processes to bring all this data together through standard connectors, aggregate and normalize it using a transparent technology and workflow, and load it into the enterprise data warehouse (EDW).
Q: What type of data sets does Switchboard handle?
We handle all kinds of data sets, but to name just two common examples:
First, we extract very granular, large-scale sets of raw data, for instance from Google Campaign Manager (GCM), which can be used for ad hoc queries sliced by a host of different dimensions such as country or advertising partner. We also store all the metadata necessary to understand the mechanics of data access. This includes the time we access the report, the download start and finish times, and any errors (such as a dirty row that doesn’t match the expected schema), so that business teams can account for any outages or glitches they encounter.
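As a rough illustration of the kind of access metadata described above, the following Python sketch records access and download timestamps and logs any row that fails a schema check. It is not Switchboard’s implementation: the `LoadLog` class, the `load_report` function, and the three-column schema are all hypothetical names chosen for the example.

```python
import csv
import io
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical expected schema: column name -> parser that raises on bad input.
EXPECTED_SCHEMA = {"date": str, "country": str, "impressions": int}


@dataclass
class LoadLog:
    """Metadata about one report pull (illustrative, not a real Switchboard type)."""
    accessed_at: datetime
    download_started_at: Optional[datetime] = None
    download_finished_at: Optional[datetime] = None
    errors: list = field(default_factory=list)


def load_report(raw_csv: str):
    """Parse a CSV report, keeping clean rows and logging dirty ones."""
    log = LoadLog(accessed_at=datetime.now(timezone.utc))
    log.download_started_at = datetime.now(timezone.utc)
    rows = []
    # start=2 because line 1 of the file is the header row.
    for line_no, record in enumerate(csv.DictReader(io.StringIO(raw_csv)), start=2):
        try:
            rows.append({col: parse(record[col]) for col, parse in EXPECTED_SCHEMA.items()})
        except (KeyError, TypeError, ValueError) as exc:
            # A dirty row is recorded rather than silently dropped.
            log.errors.append(f"line {line_no}: {exc!r}")
    log.download_finished_at = datetime.now(timezone.utc)
    return rows, log
```

Keeping the error list next to the timestamps means a business team can later match a gap in a dashboard to a specific pull and a specific row.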
Second, we aggregate that very large set of raw data into bespoke tables for the business teams, for instance a conversion report or any other slice of the data required.
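To make the aggregation step concrete, here is a minimal Python sketch that rolls raw rows up into a small conversion report. The field names (`country`, `partner`, `impressions`, `conversions`) and the `conversion_report` function are assumptions made for the example, not the columns of any real GCM export or a Switchboard recipe.

```python
from collections import defaultdict


def conversion_report(raw_rows):
    """Sum impressions and conversions per (country, partner) and derive a rate."""
    totals = defaultdict(lambda: {"impressions": 0, "conversions": 0})
    for row in raw_rows:
        key = (row["country"], row["partner"])
        totals[key]["impressions"] += row["impressions"]
        totals[key]["conversions"] += row["conversions"]

    report = []
    for (country, partner), sums in sorted(totals.items()):
        # Guard against division by zero for groups with no impressions.
        rate = sums["conversions"] / sums["impressions"] if sums["impressions"] else 0.0
        report.append(
            {"country": country, "partner": partner, **sums, "conversion_rate": rate}
        )
    return report
```

In practice an aggregation like this would more likely run as SQL in the warehouse; the sketch only shows the shape of the transformation.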
Q: We need to analyze data from a host of marketing platforms. How does Switchboard pull the raw data from each source, and how often?
For most marketing sources, we pull the data using proprietary connectors that access their reporting APIs (although in some cases, the platform provides raw files). We usually provide a daily report using data from the previous day. This is the cadence at which platforms such as Facebook, Snapchat, TikTok or Twitter can report accurately. Marketing data often changes over time as conversions can happen days after a campaign is viewed, so Switchboard also provides the capability to re-ingest reports over time.
Q: Our business teams use a variety of applications to share data. What formats can Switchboard export normalized data in?
The whole point of aggregating and normalizing data is so that it can be used downstream. So, whether you need gzipped CSV files, Parquet or Avro files (which engines such as Spark can read), or tables in a cloud data warehouse such as BigQuery, we can send the data across in a compatible format.
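For readers unfamiliar with the simplest of these formats, the sketch below writes normalized rows as a gzipped CSV using only the Python standard library. The `to_gzipped_csv` helper is a hypothetical name for illustration and says nothing about how Switchboard produces its exports.

```python
import csv
import gzip
import io


def to_gzipped_csv(rows, fieldnames):
    """Serialize a list of dicts to gzip-compressed CSV bytes."""
    buffer = io.BytesIO()
    # newline="" lets the csv module control line endings itself.
    with gzip.open(buffer, mode="wt", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return buffer.getvalue()
```

The resulting bytes can be written to a file, uploaded to a bucket, or sent over SFTP, and most warehouses can load gzipped CSV directly.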
Q: We’ve already built the DataOps infrastructure in-house to pull the data from the source (e.g., Google’s API), log and monitor it. So, what are the advantages of using Switchboard on top of these capabilities?
Switchboard’s enterprise-level technology provides automated scale, automated monitoring alerts, and a full set of logging and auditing capabilities. Even for those organizations with sophisticated DataOps infrastructures in place, the process of manually querying the data each time a business team needs a specific report becomes unsustainable – not to mention pulling engineering teams away from other business-critical initiatives.
With Switchboard, non-technical business teams can make quick iterations in a few clicks and generate different versions of reports in minutes – and they can quickly move data to other downstream systems whenever they need to.
Q: Google keeps refreshing its data at different timepoints. How can Switchboard help us retrieve the right data at any given point in time?
It’s all automated in Switchboard. You can configure a “30-day lookback”, in which the past 30 days of data are automatically refreshed every day, and you can set these lookbacks across the hundreds of data sources that we currently support. Any net new request is then as simple as publishing a new “recipe” (normalization template).
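The idea behind a lookback can be sketched in a few lines of Python: on each daily run, compute the set of report dates to pull again. The `lookback_dates` function is hypothetical, and it assumes the window ends on the day before the run, in line with the daily cadence described earlier.

```python
from datetime import date, timedelta


def lookback_dates(run_date: date, days: int = 30) -> list:
    """Report dates to re-pull on run_date: the `days` days ending yesterday, oldest first."""
    return [run_date - timedelta(days=offset) for offset in range(days, 0, -1)]
```

Because the window slides forward by one day on every run, each report date is refreshed on `days` consecutive runs before it ages out, which is what picks up late-arriving conversions.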
Q: In the past, we’ve lost temporary access to platforms such as Facebook because they’ve implemented changes which require an app review. Does Switchboard handle security reviews and access reviews for us?
Yes, we do all of that for you within the Switchboard application. Certain platforms are notorious for rapid API changes, so it’s our job to keep on top of these changes and ensure that our API connectors are always up to date.
Q: When there are restrictions on the metrics you can pull and combine (for instance, in GCM, you cannot always combine all the metrics you want to), does Switchboard alert us?
It depends on the data source and connector, but where we can, we surface these alerts within Switchboard’s script editor. For example, when an API changes its dimension and metric names, or when an adtech API has versions that are mutually exclusive, we maintain different versions of the data connectors for those APIs so we can still pull the data as needed.
Q: Can Switchboard combine metrics – for example line items and orders – so that we don’t have to process them as separate queries?
Yes, Switchboard supports these types of joins. If you use Data Transfer files with Google Ad Manager (GAM), we often join the ad units, line items, orders, and creatives, because those files contain only IDs, whereas business teams often just want to use the names or custom fields from the line item, for instance.
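The join described here amounts to looking up each ID in a reference table and attaching the matching name. The Python sketch below shows that pattern with made-up field names (`line_item_id`, `order_id`, `id`, `name`); these are assumptions for the example, not the actual column names in GAM Data Transfer files.

```python
def enrich_with_names(log_rows, line_items, orders):
    """Left-join log rows to line item and order names by ID."""
    line_items_by_id = {item["id"]: item for item in line_items}
    orders_by_id = {order["id"]: order for order in orders}

    enriched = []
    for row in log_rows:
        line_item = line_items_by_id.get(row["line_item_id"], {})
        order = orders_by_id.get(row["order_id"], {})
        enriched.append(
            {
                **row,
                # Unmatched IDs yield None, so no log rows are dropped.
                "line_item_name": line_item.get("name"),
                "order_name": order.get("name"),
            }
        )
    return enriched
```

A left join is used so that a log row with an unknown ID still appears in the output, with an empty name, instead of disappearing from the report.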
Q: When our business teams access the data and choose their reporting metrics – for example date, site, advertiser – are all these metrics version controlled?
Yes, each time you create a new draft (or ‘branch’) you can play around and add different test parameters, much like a sandbox. The Switchboard Compiler will automatically provide error messages alerting you to any invalid dimensions or metrics – all before you publish.
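To give a feel for this kind of pre-publish check, here is a small Python sketch that validates a draft’s dimensions and metrics against an allowed set and returns readable error messages. It is only an illustration under assumed names: `ALLOWED_DIMENSIONS`, `ALLOWED_METRICS`, and `validate_draft` are hypothetical and do not reflect how the Switchboard Compiler works internally.

```python
# Hypothetical sets of names that a given data source supports.
ALLOWED_DIMENSIONS = {"date", "site", "advertiser"}
ALLOWED_METRICS = {"impressions", "clicks"}


def validate_draft(dimensions, metrics):
    """Return a list of error messages; an empty list means the draft can be published."""
    errors = []
    for name in dimensions:
        if name not in ALLOWED_DIMENSIONS:
            errors.append(f"unknown dimension: {name}")
    for name in metrics:
        if name not in ALLOWED_METRICS:
            errors.append(f"unknown metric: {name}")
    return errors
```

Collecting every error before reporting, instead of stopping at the first, lets a user fix a draft in one pass.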
Q: And if multiple teams access the data and make changes, can we see who and when?
Yes, every new branch is tagged with a person and a timestamp. If there is a conflict, the Switchboard application will automatically tell the user before they can publish.
Q: If changing our metrics affects previous periods of data, how do we apply these changes retrospectively?
Using Switchboard’s backfill function, teams can go back and automatically pull specific periods of the data – for example, the past month – at the click of a button. This is particularly useful for business teams who quickly decide they need to see a different period of the data, as they can simply change the recipe and pull a different period without burdening their engineering team.
If you’d like to know more about how you can gain a unified view of your data, let’s connect today.