Back to articles list Articles Cookbook
7 minutes read

Use SQL for Data Analysis With the New Google Analytics 4

As Google is sunsetting its Universal Analytics in favor of Google Analytics 4, data analysts need to adapt and learn how to get the most from the new tool. In this article, I'll discuss how to enhance your data analysis by using SQL with Google Analytics 4.

Universal Analytics by Google has been helping data analysts and marketers in all industries everywhere understand their audience better. However, Google recently announced they would stop their support of Universal Analytics on July 1, 2023. As an alternative, they have introduced Google Analytics 4, a next-generation analytical solution for cross-platform measurements and analyses.

Those who have so far relied on Universal Analytics need to expand their skills and adapt to the new ways. Among other things, the quality of your data analysis in Google Analytics 4 depends heavily on your ability to work with Data Studio and BigQuery, and hence, with SQL. While there is still almost a year until the full transition to Google Analytics 4, it's better to acquire new skills sooner rather than later.

To get started with SQL, I recommend the SQL Basics interactive course. With beginner-friendly explanations and 129 coding challenges, you learn the basics of SQL, such as retrieving the data you need and doing basic data analysis.

But first, let's briefly review what SQL and Google Analytics are and how they work together.

What Is SQL?

SQL, or Structured Query Language, is a programming language used to talk to databases. With SQL, you can store, manipulate, and retrieve data from relational databases. In these databases, data items are organized as a set of tables with columns and rows.

Things you can do with SQL include:

  • Retrieving specific columns from specific tables.
  • Combining data items from multiple tables.
  • Filtering data to get records that satisfy certain conditions.
  • Grouping data according to specified properties.
  • Calculating basic statistics like average, min, max, total, and count.

Your first thought may be that you can already do all these things with Excel or Google spreadsheets. So why bother with SQL?

SQL has many advantages, but two important points I want to mention here are its speed and scalability. You need lots of patience if your Excel spreadsheet contains hundreds of thousands of records; with SQL, you process a few million rows within a minute.

Moreover, you can reuse SQL queries whenever you need to perform the same kind of analyses (e.g., to prepare weekly or monthly reports). SQL queries are repeatable and scalable. Every data analyst working with Big Data will benefit from knowing SQL.

What Is Google Analytics 4?

Google Analytics 4 is a new, next-generation analytics solution to measure different kinds of customer data, predict their behaviors, and gain new insights. It was originally introduced in 2019 and has been fine-tuned since then. It is now recommended by the Google team as the replacement for Universal Analytics.

In contrast to Universal Analytics, Google Analytics 4:

  • Operates across platforms rather than being anchored in the desktop web.
  • Does not rely exclusively on cookies.
  • Uses a data model that is event-based rather than session-based to deliver user-centric measurement.
  • Ensures better privacy with more comprehensive and granular controls for data collection and usage.

Google Analytics 4 introduces innovative features, such as advanced tracking capabilities and user-centric measurement, resulting in improved reporting in GA4. With Google Analytics 4, you get a complete view of the customer lifecycle with an event-based measurement model across websites and apps. That's a lot of data. By using this data well, you can improve your ROI with data-driven attribution and reduce user drop-off, all the while reducing the time spent on analyzing customer journeys.

The new opportunities provided by Google Analytics 4, combined with the speed and scalability of SQL, produce astonishing results. Let's see how this works.

SQL in Google Analytics 4

Google Analytics itself provides advanced data analysis capabilities. However, you may want to go beyond the insights it provides.

With Universal Analytics, you can export your data in several formats like the Excel spreadsheet and the CSV file. You can then use the exported data to create reports in SQL. See this article for a detailed guide on how to do this.

With Google Analytics 4, creating SQL reports is even simpler. You can link your Google Analytics account with BigQuery, a cloud warehouse that lets you run super-fast SQL queries on large datasets.

Once you link Google Analytics 4 and BigQuery, a complete log of all events is automatically exported to BigQuery either daily or in streaming mode (seconds after each event takes place). With all this data in BigQuery, you can build your own custom models or join this data with other datasets for a more comprehensive overview.

Let's start by linking BigQuery in Google Analytics 4.

BigQuery Linking

To create a BigQuery link in Google Analytics 4, go to the Admin panel and choose "BigQuery links." Then, click on "Link" to create a new link.

SQL for Data Analysis With the New Google Analytics 4

Then, choose one of the BigQuery projects and press Confirm.

Note: If you do not yet have projects in BigQuery, you can create one on the Google Cloud Platform. Even if you have never used BigQuery, there is already one default project created as an example.

You may want to create a separate new project for tracking users of your website and apps. If so, add another project: click on the arrow next to an existing project name, then click on "New Project."

SQL for Data Analysis With the New Google Analytics 4

Now, let's go back to linking Google Analytics and BigQuery. After you select the BigQuery project into which you want to feed data from Google Analytics 4, choose the data location zone and frequency (i.e., daily and/or streaming), and click "Submit."

SQL for Data Analysis With the New Google Analytics 4

You have created the link! Now, a complete log of all events is fed into BigQuery.

SQL for Data Analysis With the New Google Analytics 4

What can you do with this data in BigQuery?

SQL in BigQuery

In BigQuery, you can play with your data using basic or more advanced SQL queries.

For example, let's say you are interested in learning more about your website visitors from China. Specifically, you want to know how they get to your website and the brand names of their devices. A simple SQL query does the trick:

SQL for Data Analysis With the New Google Analytics 4

As you see, I have retrieved the event data with the event timestamp, event name, user pseudo ID, country, source of traffic, and brand name of the device for all sessions initiated from China. To make sure every session appears only once in the output, I've added a filtering condition related to the event name. I've done this by electing to list only the events related to session starts, leaving other events related to the session like page view and user engagement out of the resulting table.

This was a very simple example. Once you know SQL better, you can create more complex queries to understand user behavior further. You can also combine data from different periods, add external datasets, save your queries for reusing them regularly, and much more. BigQuery is a very convenient and efficient tool for working with Big Data from Google Analytics 4 and other sources.

Time to Learn SQL!

With the use of data becoming increasingly important, SQL is no longer a programming language reserved for database administrators and data engineers. Today, it is a tool that should be in the skill set of every modern employee, just like Excel.

You can use SQL to enhance sales analytics, marketing analytics, financial analysis, supply chain analytics, HR analytics, and other kinds of business analytics. This is a very powerful tool and an essential skill. So, if you are not familiar with SQL yet, it's time to start learning and practicing. The good news – SQL is really easy to learn!

If you have no prior exposure to SQL, I recommend starting with the SQL Basics course. This is an interactive course with detailed explanations and multiple coding challenges. After taking this course, you'll know how to retrieve information from databases, build basic reports, combine data from multiple tables, set complex filtering conditions, aggregate data, etc.

For those interested in creating complex, multi-level reports in SQL, there is a learning track for SQL Reporting. It includes three interactive courses and over 200 coding challenges covering basic reports, revenue trends analysis, and customer behavior analysis, all in SQL.

If you work with a PostgreSQL database, I also recommend taking the course "Creating Basic SQL Reports in PostgreSQL" to cover the specifics of this SQL dialect.

Thanks for reading, and happy learning!