I recently learned that the Danish government will ban the use of Google Workspace in public institutions. However, unless you work for a Danish municipality, as a professional in marketing, as well as in any company, it’s very normal to use Google Sheets. So, what if I told you that you can use SQL to ease the search for specific data in large spreadsheets?
Luckily for all of us, Google Sheets includes a SQL function to perform database-like queries on those sheets. But how to use it? Let me guide you step by step.
Introducing SQL and Google Sheets
SQL (Structured Query Language) is a programming language to manage data in relational databases. And Google Sheets is an online spreadsheet editor used by many to store and manage spreadsheets in a collaborative way. What both have in common is data management.
But what happens when those spreadsheets get too large? That you need ways to organise the data in simpler views, like when you use filters. However, sometimes filters are not enough if you want to run different conditions. That is when SQL, or the Query function in Google Sheets, comes in handy.
In Google’s words, you can “run a Google Visualization API Query Language query across data”. This means that in order to find the data you need you can perform searches easily and quickly.
Imagine you have a database with dozen of thousands of users and you want to get a list of users who performed a signup during Easter 2022, from London, with marketing consent and come from a specific campaign but exclude those who purchased a subscription to your software. You can simply run a query to do this.
It only requires remembering this simple syntax:
=QUERY(data, query, [headers])
Data is the range you want to analyse. Query is the request sent to Google Sheets to retrieve the data, it should come encapsulated in quotation marks or refer to a cell. The headers are optional and relate to the number of rows at the top of the data, normally we would use 1.
No worries, it gets easier with examples.
Prepare your table for SQL
To start using SQL it is important to start with a set of data. I recommend you use a set of random data you can build yourself starting with customer_id, city, country, and revenue.
When you work with queries, it’s preferable to name the ranges, so instead of referring to cells A2:D2, you can use a specific name, such as “customers_data”. Start by selecting the range and clicking on Data > Named ranges.
When the new tab opens, change the range name to customers_data.
The reason why this is important is that it will save a lot of time when performing queries. Now your table is set up and you will only need to type the range name as customers_data when you start making queries:
=QUERY(customers_data, query, [headers])
Select all the data and select specific columns
This is where you can start practicing some of the most common SQL commands: SELECT. In Google Sheets, you can use SELECT* to retrieve all the data contained in that range. Pay attention to the following example:
=QUERY(customers_data, "SELECT*", 1)
It will select and retrieve all the data in customers_data. By using 1 in headers we are telling Google Sheets that row number 1 is the header. This will come in handy in other queries.
However, you can also select specific columns. Instead of using an asterisk (*), you can use the name of the column.
=QUERY(customers_data, "SELECT B", 1)
Where and Order by clauses in Google Sheets
Now that you know how to select specific columns, you need to learn how to filter and sort your data. It’s time to learn the clauses WHERE and ORDER BY.
If you thought that WHERE could be a great match for comparison operators, you guessed right. You can try to get the customer IDs of those who have a revenue higher than 900.
=QUERY(customers_data, "SELECT A WHERE D > 900", 1)
You can use other operators such as = (Equals), < (Less Than), >= (Greater Than or Equal To), <= (Less Than or Equal To) or <> (Not Equal To). Additionally, boolean operators as AND, OR and NOT could be combined to narrow down your queries.
However, now it’s time for you to order the data. You can sort the revenue in descending order using the following formula:
=QUERY(customers_data, "SELECT A WHERE D > 900 ORDER BY D DESC", 1)
Now you know how to use comparison operators and order your data. So, it is a matter of time before you start using boolean operators.
AND and OR operators in Google Sheets
The term boolean operators comes from George Boole, an English mathematician from the XIX century. History lessons aside, you can use AND to add an extra condition to your query, for instance, the customers need to be in Germany.
=QUERY(customers_data, "SELECT A WHERE D > 900 AND C= ‘DE' ORDER BY D DESC", 1)
This way you’ll get all the customer IDs that have a revenue greater than 900 and are based in Germany but sorted by descending order of revenue.
When you use the clause OR, you can add an extra layer of data with less refinement. For example, instead of focusing on revenue, we can focus on the country, to get all customers from Germany and from Denmark. In that case, you can type:
=QUERY(customers_data, "SELECT A WHERE C= ‘DE' OR C= ‘DK’ ORDER BY D DESC", 1)
The Limit clause in Google Sheets
While we have been using clauses that help us to fine-tune our data in a logical way, it’s also possible to limit the amount of data. This clause reduces the number of rows that are pulled.
We can ask Google Sheets to show us only 5 rows when we request the customer IDs of those users with a revenue greater than 900.
=QUERY(customers_data, "SELECT A WHERE D > 900 LIMIT 5",1)
How to use the Count command in Google Sheets
When it comes to growth marketing, counting is key. You’ll need to know if there have been more or fewer customers, users, leads, or visitors to your website. You can use an SQL query to do it for you on your spreadsheet.
Let’s count customer IDs to see how many customers we have in France:
=QUERY(customers_data, "SELECT COUNT(A) WHERE C= 'FR'", 1)
But, why only count customer IDs? Because the customer ID is the primary key in this table. This means that it’s a record that uniquely identifies a row, or in other words, there can’t be any other row with the same customer ID. Although this concept doesn’t exist in Google Sheets, it’s important to internalise this idea for better use of tables.
Combine simple arithmetic operations with SQL queries
It’s no mystery that arithmetic operations such as addition, subtraction, multiplication, or division are the bread and butter of any Google Sheets user. So, it’s also possible to combine queries in your spreadsheets.
Using the same tables, now it’s your turn to calculate the average revenue per customer. For this, you will need to sum the revenue and divide it by the number of customers. Extra tip: use the SUM clause to get the total revenue.
=QUERY(customers_data, "SELECT SUM(D) / COUNT(A)",1)
This is just a way to show you that you can make divisions or multiplication or any other arithmetic operation while using clauses such as SUM or COUNT. However, continue reading to get smarter ways of calculating the average revenue per user.
Aggregate functions in Google Sheets: average, maximum, and minimum
Before your learned how to use arithmetics to calculate, among many other things, average. But if I told you you just need three letters in your query to do the same? Let’s review AVG, MAX, and MIN clauses.
Now try to retrieve the average, the maximum, and the minimum revenue.
=QUERY(customers_data, "SELECT SUM(D) / COUNT(A)",1)
How to use Group By clause in your spreadsheet
Sometimes, when you work with digital marketing you just don’t need the total number of leads, but to group them according to common characteristics, e.g. country.
It’s your time to count the number of customers while you group them by country. You can start by using COUNT and a new clause: GROUP BY.
=QUERY(customers_data, "SELECT C, COUNT(A) GROUP BY C",1)
It’s important to remember that you need to retrieve column C as well. Just counting column A group them by C, will give you a list of numbers but with no reference to what country they belong to.
Pivot tables to organise your data in Google Sheets
This is a bit more advanced, but it definitely saves you a lot of time sorting your data when you have multiple dimensions. Pivot tables group values to summarise large sets of data.
In a more practical example, you might want to know the total amount of revenue in each city in Denmark. This is the query:
=QUERY(customers_data, "SELECT C, SUM(D) WHERE C = 'DK' GROUP BY C PIVOT B",1)
Here you’ll select column C, and you will sum the revenue in column D but only when Denmark is the country present in column C. You’ll also group it by country and organise it according to cities in column B.
It’s a bit difficult but I recommend you to take your time and practice it with first small amounts of data and then move to larger sets.
Conclusion on using SQL in Google Sheets
As you’ve seen, it’s not that difficult to use a little bit of programming language in Google Sheets. When you’re defining the digital marketing strategy of your company, you will need to analyse a lot of numbers. And for people like us, who are not necessarily good at numbers, this is a lifesaver.
Of course, there are other functions, such as offset, label, format, or options. However, having made it to pivot tables is already a success and something to be proud of.
Remember Google Sheets isn’t a console and it has its limitations. So you’ll need to find more workarounds than if you were to use traditional SQL to communicate with a database.