Search
Close this search box.

Google BigQuery Tutorial for Beginners

Last Modified on January 4, 2024

Google BigQuery is part of the Google Cloud Platform and provides a data warehouse on demand. You can upload structured data into tables and use Google’s cloud infrastructure to quickly analyze millions of data rows in seconds.

In this Google BigQuery tutorial, we’ll give you a quick overview of how you can use BigQuery to maximize your workflow efficiency.

Query GA4 data in BigQuery eBook

Subscribe & Get our FREE Query GA4 data in BigQuery eBook

An overview of what we’ll cover:

Google BigQuery is part of the Google Cloud Platform and gives you an on-demand data warehouse. 

You can upload structured data into tables and use Google’s cloud infrastructure to quickly analyze millions of data rows in seconds. 

In this tutorial, I’m going to give you a quick overview on Google BigQuery. You might have heard of BigQuery before. 

It has become more popular in recent years among marketers thanks to its ability to house and analyze data from sources like Google Analytics or Facebook Ads. BigQuery is especially effective with enormous datasets that can sometimes lag in smaller analysis programs.

I will focus here on taking you through Google BigQuery’s basic functionalities. We’ll discover what it’s primary uses are and how to improve your workflow with it. However, this is by no means a full tutorial on BigQuery. There’s much more to discover!

If you’re interested in finding out more about BigQuery, we’re offering a course as part of our MeasureMasters program. For now, let’s familiarize  ourselves with BigQuery. Now we’ve got lots to cover, so let’s dive in.

What is BigQuery, and Why Use It?

BigQuery is a serverless data warehouse which you can use to analyze large volumes of data using SQL.

To give you an idea, BiqQuery can deal with petabytes, which is the equivalent of 1000 terabytes (or 500 billion printed pages).

Our journey starts at cloud.google.com/bigquery. Google BigQuery is actually part of the Google Cloud Platform, which has many different tools. BigQuery is just one of several, but it’s worth taking the time to get to know.

Some benefits of using BigQuery include:

  • Real time analysis of massive datasets at lightning speed
  • Predictions for business outcomes using machine learning
  • Easy to share with team members and integrate with visualization tools
  • Low (or no) cost, low maintenance
  • High security, high reliability, high scalability

Needless to say, you need a Google Cloud account, but you can create one for free. You don’t have to give Google your credit card information, and if you don’t want to upgrade, you can actually leave it at that. In fact, BigQuery is part of Google’s Always Free coverage on Cloud, within certain data limitations.

So what is BigQuery? Well, Google Cloud’s page calls it a “serverless, highly scalable, and cost-effective cloud data warehouse designed for business agility.”

Screenshot of Google BigQuery landing page on Google Cloud

That’s a mouthful. What you just need to remember is the “data warehouse” part. Google BigQuery can store vast amounts of data and help you analyze them, transform them,  or process them further according to your needs.

You might be familiar with this concept if you have been working with spreadsheets before, particularly applications like Google Sheets or Microsoft Excel.

Screenshot of a Google Sheets document titled “100 Sales Records” featuring several columns for different types of data

Spreadsheet software will  let you input data just like this in a table format. You have rows and columns for different categories and items, and then you can analyze the data that populates your cells.

For example, you can use formulas to calculate things in your dataset. Use the formula =SUM in your formula bar and you can get the sum of a whole column of data. The given sum will also update automatically if the data changes.

Screenshot of sum equation in Google Sheets document, with an arrow from the formula bar to the corresponding cell

If you’re using spreadsheets, you might also use filter options to answer your questions. If you click the filter icon in your topbar, you will be given the option to filter by any of your columns. Clicking on a column’s filter button will give you a popup with which you can specify your filter.

Screenshot of Google Sheets document with arrows pointing from the topbar filter button to a column filter button to the filter menu

In this case, we’ve filtered a sample dataset of one hundred sales records to include only sales from Asia. If you wanted to analyze only sales to Asia, you could work with just this filtered data.

Screenshot of Google Sheets document displaying results filtered by column data

Now, this all works beautifully with this particular dataset. But what if you have more sales data? In this example, we had a hundred records. Let’s play with a sheet that has 50,000 sales records. Right away, you’ll notice that it’s not as responsive anymore.

If we try to use our sum formula again, it takes a little while to calculate. If you start asking your spreadsheet program to do too much, you’ll get stuck with this loading bar in the corner while you wait for your analysis.

Screenshot of Google Sheets document stalling while calculating formulas

Sometimes it may seem like the application is stuck, and you don’t have any indication that it’s actually still working. You can try reloading, which may or may not work and you may lose whatever analysis you just asked your program to do.

In general, a large dataset makes everything slower and sometimes even unworkable. This is where you might want to use BigQuery.

Using Google BigQuery

So let’s enter Google BigQuery. As mentioned earlier, you need a Google Cloud Platform account. Once you’re logged in, you create a project up in the toolset in the top bar.

Screenshot of blank Google Cloud Platform with Demo in topbar highlighted

If you open the menu, you’ll see all the different offerings and services of the Google Cloud Platform. Scroll down a bit to find BigQuery and open it.

Screenshot of Google Cloud Platform with topbar menu button and BigQuery menu item highlighted

Once you’re in, you can see here we are in a sandbox. This means we are using the free version of BigQuery. This sandbox account is a bit limited, but it suffices for this tutorial. If you find that the limitations get in the way of your analysis, then you can always go back and upgrade your account. For now, you can go ahead and click Dismiss.

Screenshot of Google BigQuery in sandbox mode, with Dismiss and Upgrade buttons highlighted

This is the BigQuery user interface. The middle of your interface will show all your data, queries, and analysis results.  On the left side, we can see our Cloud project under Resources. This is where we will begin.

Screenshot of BigQuery with dashboard highlighted

Creating a Dataset

The first thing we need to do in BigQuery is creating a dataset. Handily, there’s an aptly-named Create Dataset button right in the middle of our interface for you to click.

Screenshot of BigQuery with Create Dataset button highlighted

You’ll be prompted with a configuration popup. Let’s call this “Demo.” I’m going to skip all the other configurations for now and just click Create Dataset.

Screenshot of BigQuery dataset configuration with Dataset and create dataset button highlighted

On your BigQuery interface, you’ll get a notification saying that your dataset was created. Follow the notification to see your new dataset.

Screenshot of BigQuery dashboard with “demo created” notification highlighted

And here is our Demo dataset. From here, we can create tables that we can populate with data, just like our earlier spreadsheets. Click on Create Table.

Screenshot of BigQuery dashboard with demo data resource and Create Table button highlighted

If we don’t want a blank table, we’ll need to upload our data from an existing source. You can upload from services like Google Storage or Google Bigtable, build a cloud function that pushes data automatically, or even utilize third-party platforms such as Supermetrics to pull your data for you.

Where you get your data depends on your individual workflow. For our demo, I will upload a CSV file from eforexcel.com/wp/downloads-18-sample-csv-files-data-sets-for-testing-sales/. Make sure this new table will be associated with your current project, and give it a recognizable name.

Screenshot of table configuration settings, uploading sales recording CSV to demo project  with source and destination sections highlighted

The table configuration popup will also ask if you’d like it to auto-detect schema. This is very important because we need to model our table after the table’s columns and what type of data they represent. Your data may include strings, Boolean, integers, and floating values. 

Screenshot of Google Sheets document with highlighted columns labeled as string, Boolean, integer, and floating number data

BigQuery can auto-detect those schema if they’re correctly filled into the CSV, which simplifies your analysis considerably. Check the box to auto-detect schema. For now, we’ll skip these other advanced options and click Create table.

Screenshot of table configuration settings with schema auto-detect box checked and create table button highlighted

You’ll get an interface notification that your dataset has been created. Follow the notification to see the load job for your new dataset.

Screenshot of BigQuery dashboard with “load job created” notification highlighted

Now we have our sales records in our BigQuery table. Under the Schema tab, BigQuery will show you all the different columns that were detected. 

Screenshot of demo sales records data resource schema highlighted

If you click the Preview tab, you’ll see all the data was uploaded in basically the same format that we had in our sheet.

Screenshot of demo sales record data resource preview highlighted

Querying Your Dataset

Now, what can we do with this table?

We can query it. This is where SQL comes in, if you are familiar with it. Don’t worry if you’re not, though, because I’ll walk you through the basic process.

In this top window, you will enter your query based on SQL. You can run all kinds of different operations to make an inquiry of your data.

Screenshot of BigQuery dashboard with blank query field highlighted

What might this look like? Let’s click on this query table and add a simple SQL statement to select specific pieces of data. Click Query Table to get a statement template.

Screenshot of BigQuery dashboard with Query Table button and subsequent query template highlighted

We can insert an asterisk into the statement to select everything from this table and then limit the output to 1000. Click the Run button to run your query.

Screenshot of BigQuery dashboard with asterisk entered into selection query and Run button highlighted

After just one second, we get our data back with results in the bottom half of the interface. It’s basically the same table since we used the