A guide to blending data sources in Google Data Studio and tips I’ve learned along the way to resolve common issues.
As a Google Data Studio (GDS) user, you’ve most likely already delved into the data blending feature. You’ve surely relished in its powerful capabilities to combine and calculate metrics across data sources with a simple Join Key—and therefore, you’ve undoubtedly encountered its caveats and downfalls, as a product that’s still ever-changing.
In this blog post, we’ll introduce you to exactly how data blending works in Google Data Studio, note the considerations that limit our blending abilities, and share some tactics we’ve successfully implemented to resolve data blending issues.
Introduction to Data Blending in Google Data Studio
Data blending is a feature that was released in Google Data Studio in 2018 that allows you to combine metrics from multiple sources. It can be used to create any type of visualization offered in Google Data Studio for a complete view of performance across different sources.
You’re able to blend data sources to combine metrics, like revenue and transactions, across multiple properties by dimensions like campaigns, keywords, and device types.
GDS also recently came out with the ability to very easily create a Data Blend in a report by simply clicking multiple scorecards or tables at once, right-clicking, and selecting ‘Blend Data.’ This method creates the Data Blend for you with the selected metrics and sets the first data source you selected as your left-most source.
Why Use Data Blending in Google Data Studio?
For me, this is undeniably the most useful feature of Google Data Studio. Most often than not, I’ve run into clients who have multiple Google Analytics properties or Google Ads accounts, are running Paid Social ads in multiple social platforms, or offer their service or product within a website and across applications. Clients and analysts should be looking at combined performance metrics for efforts that have the same goals or budgets, regardless of how the data collection is disjointed.
Blending data in Google Data Studio allows you to combine metrics that are reported on in different platforms that should be viewed as a total (ex. app conversions in Firebase and website conversions in Google Analytics). You’re able to provide performance for an overall budget that’s spread across more than one source or medium (ex. Facebook Ads and LinkedIn Ads). Ultimately, data blending allows for holistic reporting on performance.
Before data blending in Google Data Studio, we were forced to blend these metrics through other applications first before connecting it to Google Data Studio. Often, these third-party applications also cost extra to connect and blend the data for you. Worst of all—analysts had to take extra steps to view combined metrics across accounts, properties, and platforms to make optimization decisions.
When to Use Data Blending in Google Data Studio
Data blending should be used in any situation when you need to view combined data for analysis or reporting purposes. I’ve shared some example use cases for data blending in Google Data Studio below, just in case this sounds overly general or possibly doesn’t seem like it will ever apply to you—which it will.
- You’re running Google Ads across multiple accounts with the same overall budget.
- You’re running Bing Ads and Google Ads and need to report on combined Paid Search performance.
- You need to report on Bing Ads Ads cost metrics by campaign but report on goal completions out of Google Analytics.
- You want to report on Paid Social campaign metrics and goal completions out of Google Analytics.
- You’re running Paid Social ads across multiple platforms from the same Paid Social budget.
- You manage multiple subdomains or sister sites for the same account.
What Can You Blend in Google Data Studio?
Conceptually, you’re able to blend just about anything in Google Data Studio as long as the data source is in an available GDS connector, and there’s at least one dimension in common (i.e., a Join Key) among your data sources. Your data can live in a Google Sheet, SQL database, or even be a simple file upload to connect to GDS, and custom fields can be added to create matching Join Keys.
Unfortunately, it’s not as possible as we’d imagine to blend any data with data blending in Google Data Studio with these liberties. There are limitations outside of the two main requirements (data source available in Google Data Studio and Join Key) to return accurate metrics in your reports.
Caveats of Data Blending
There are two fairly significant caveats to keep in mind when planning your blended GDS reports.
The Left Outer Join
As mentioned, there must be at least one dimension in common between your data sources—it can even be something as simple as the date, but the date will then be the only dimension you can use to combine metrics accurately.
To correctly calculate metrics across something like the campaign, the campaign must be the Join Key. However, Google Data Studio’s data blending feature only allows for blending through a Left Outer Join operation. This means the resulting values will only include metrics and rows that appear in the left-most data source in the blend.
In the example below, we have two sample data sources set up as two separate Google Ads accounts with the date set as the Join Key, pulling in campaigns as dimensions and impressions as the metric.
By the Left Outer Join operation, only the campaigns and impressions highlighted in green in the table below will be included in any blended calculations by campaign in Google Data Studio.
However, this setup will work (with the campaign dimensions removed) to calculate combined impressions in a scorecard format.
Limit of Five Data Sources
I know five data sources sound like plenty, but there are many situations when you need more than that to blend together. I’ve run into it quite a few times, and the only solutions have been to either forgo some metrics or to combine data sources into one through Google Sheets or by importing data from one platform to the other.
Some examples where I’ve run into problems because of this limitation:
- Combining Bing Ads Cost data (Data Source 1) with Google Analytics conversions (Data Source 2) across more than two accounts or properties.
- Combining Google Analytics metrics across more than five different properties.
- Combining Assisted Conversions data with Google Analytics last-click conversions across more than two accounts or properties.
Common Data Blending Issues and How to Resolve Them
Often, the issues I run into in Google Data Studio are easily resolved with a simple workaround. It seems GDS is still ironing out logistical kinks for functionalities that should work seamlessly on their own. The following solutions and hacks have sufficed in resolving my blending issues so far.
Data Returning “Null” Values
GDS will return “null” (-) values when any row of a dimension has no value associated with it. In the example below, Leads returns null values and is missing conversions from the second data source in the data blend due to the Left Outer Join operation.
The ‘Leads’ column is missing a conversion because the First Interaction Assisted Conversion (which is from the second data source) occurred for a keyword that did not return any values under the first data source. This results in inaccurate reporting! You can see that the total number of Leads under the ‘Leads’ column does not add up to what GDS is reporting as the ‘Grand total.’
This issue can be resolved by creating custom metrics in the table that tells GDS to either return the max metric for that field or return a 0 (instead of null). Use the formula below to create this metric:
Replace [FIELD] with the metric and use the SUM([FIELD]) calculation if you need an aggregated metric. This is the formula used in the above example:
NARY_MAX(SUM(Goal Completions),0)+NARY_MAX(SUM(First Interaction Assisted Conversions),0))
Returning 0s instead of nulls allows GDS to combine the metrics and return a value, even if there are no values for one of your combined metrics. It’s a silly workaround, but it’s what makes sense with GDS’s current configuration.
Another solution I’ve seen for this is creating a custom metric using a CASE WHEN formula, telling GDS to either return 0 when the value returns null or to return the field:
CASE WHEN [FIELD] IS NULL THEN 0 ELSE [FIELD] END
Data Returning Incorrectly Calculated Metrics
I highly recommend always viewing your combined metrics by the Join Key in a table format before putting it into a scorecard. Viewing it in a table allows you to check that the correct metrics are being pulled in from each source before combining them.
When you’ve noticed that your report is calculating metrics incorrectly, comb through the following checkpoints:
- Make sure the naming is the same for your Join Keys. I’ve tried using Device Category as a Join Key between Bing Ads and Google Analytics before realizing that the naming conventions are different. Bing Ads uses Computer, Tablet, and Smartphone instead of Desktop, Tablet, and Mobile, which are used in Google Analytics. Use a CASE WHEN formula to return the same dimension names in one data source to match the other. For example, you can create this custom field in your Google Ads data source:
- Make sure the formatting is the same for your Join Keys. This is specific to dimensions like Date. Although the Date dimension may be called the same and look the same in their platforms, they may be formatted differently in GDS. Pull your data into a table with the date set as the dimension and check the formatting of your dates. You may find that one is YYYYMMDD, while the other is MMDDYYYY. If reporting on a monthly basis, you should attempt changing the ‘Show As’ under each date in every data source to ‘Year Month.’
Data Won’t Blend with a Left Outer Join
There will be situations when you simply won’t be able to blend the data because the necessary dimension values are not available in one data source. When this is the case, I’ve found there are really only one of two options:
- Try moving around the order of your data sources. Remember, it’s a Left Outer Join, so whichever data source houses all values of the dimension that’s used as the Join Key should be in the left-most position. Sometimes, you may not know beforehand which it is so try moving the order around to see which configuration results in the correctly calculated metrics.
- Import the data into one of the data sources. This method has been most useful for me when attempting to combine Bing Ads cost data from the Bing Ads platform with Google Analytics data. Supermetrics offers an Uploader that allows you to automatically import cost data from Bing Ads, Facebook Ads, Yahoo Gemini, or a file into Google Analytics. You set it up once, and it continues to upload cost data from the chosen account and even backfills it. You can also do this manually within Google Analytics by navigating to Admin > Property > Data Import
- Combine the sources into a Google Sheet first. If possible, use an add-on that connects to your data sources in Google Sheets and combine the data there before connecting the Google Sheet to your GDS report. This is not an ideal solution, but it’s a good last resort, especially if you’re using the Supermetrics or Google Analytics add-ons in Google Sheets. These add-ons allow you to set the data to refresh automatically so your GDS reports can still be up-to-date (although not as updated as it would be when directly connected to a source).
That’s a Wrap!
Hopefully, some of these tips and tricks will come in handy when you need to blend your own data sources in your GDS reports. And remember, GDS is always updating—meaning, new issues will likely arise, but some existing issues will be resolved, potentially negating the need for some of these workarounds.