Google BigQuery is a serverless data warehouse solution that aims to be cost-effective and highly scalable, allowing businesses to analyze large amounts of data with speed and efficiency.
It also comes with some very useful SQL functions that make handling common business cases much more painless.
In this post, I'll walk you through a non-exhaustive list of nifty and helpful BigQuery SQL functions you should try.
I'll divide the functions into categories and provide the BigQuery syntax for each one, along with usage examples.
Let's dig in!
Most helpful BigQuery SQL functions you should check out
Without further ado, let's walk you through our list of functions.
To make things easier on you, I've categorized the functions into groups, such as "string" and "date and time."
Keep in mind that I won't cover very basic functions, e.g., those related to table creation. Instead, I'll try to focus on those that apply to common business problems you might have.
For instance, finding out the date of the last day of a month—or another time—is a fairly common need in business.
Another common need is dealing with large datasets that might contain invalid data—for instance, columns that aren't supposed to be 0 but are. In such cases, you'd probably want your operation to run without aborting, returning a "placeholder" value that indicates the areas in which there were issues.
String manipulation—including searching and replacing—is also a given when dealing with large data sets; that includes regex operations.
Also, keep in mind that what we're showing here is just the tip of the iceberg. BigQuery offers many more functions than the ones covered here, and you'll learn a lot by exploring them further.
With that out of the way, let's get started.
Date and time functions in BigQuery
Handling time is an essential task for most business. So, let's start with some useful BigQuery functions related to time.
Finding the last day of a given period
It's common that you'll need to find the date of the last day of a given period.
This is certainly more common when it relates to months; after all, "regular" months can have 30 or 31 days, and February has 28, gaining an extra day on leap years.
However, you may also need to get the last date of a different period, such as a week or a quarter.
BigQuery SQL has the LAST_DAY
function for that.
Its syntax is as follows:
LAST_DAY(date_expression[, date_part])
The date_part
parameter is optional. If you don't pass it, the default is MONTH
.
Here are some examples of the function in use; for each one, I'll show the function call, followed by its result.
Obtaining the last day of the month, explicitly:
select last_day('2021-07-24', MONTH) 2021-07-31
Obtaining the last day of the month, without using the argument:
select last_day('2021-07-24') 2021-07-31
Obtaining the last day of the week starting on the Sunday (which is the default):
select last_day('2021-07-20', WEEK) 2021-07-24
Obtaining the last day of the week starting on the Monday:
select LAST_DAY('2021-07-22', WEEK(MONDAY)) 2021-07-25
Date and time arithmetic
It's easy to perform date and time arithmetic using the following functions:
Action | Function | Syntax |
Add dates | DATE_ADD |
DATE_ADD(date_expression, INTERVAL int64_expression date_part) |
Subtract dates | DATE_SUB |
DATE_SUB(date_expression, INTERVAL int64_expression date_part) |
Add a time | TIME_ADD |
TIME_ADD(time_expression, INTERVAL int64_expression part) |
Subtract a time | TIME_SUB |
TIME_SUB(time_expression, INTERVAL int64_expression part) |
Let's see the above functions in action using some examples.
Adding 5 days to the current date:
select DATE_ADD(CURRENT_DATE(), INTERVAL 5 DAY) 2021-07-29
Subtracting 3 weeks from the "Back to the Future Day":
SELECT DATE_SUB("2015-10-21", INTERVAL 3 DAY) 2015-10-18
Subtracting 10 minutes from 8:30:
--WARNING! ERRORS ENCOUNTERED DURING SQL PARSING!
SELECT TIME_SUB("08:30:00", INTERVAL 10 MINUTE)
08:20:00
Easier way to perform date arithmetic
You can simply use the +
or -
operator and specify a number of days.
For example:
Adding 5 days to the start of the Tokyo 2020 Olympics:
SELECT "2021-07-23" + 5 2021-07-28
Subtracting 8 days from July 24th, 2021:
select "2021-07-24" - 8 2021-07-16
String and text functions in BigQuery
Text manipulation is another crucial task businesses often require, so let's cover some cool BigQuery functions related to string and text.
String concatenation
You can easily concatenate strings in BigQuery SQL using the CONCAT
function:
SELECT CONCAT("SCHULTS", ", ", "Carlos"); SCHULTS, Carlos
However, using the CONCAT
function for many values can become cumbersome. Luckily, you can use the concatenation operator (||)
instead of using the function, like so:
SELECT "SCHULTS" || ", " || "Carlos" SCHULTS, Carlos
Obtaining ASCII code
You can easily obtain the ASCII value of the first character of a word using the ASCII
function:
SELECT ASCII('ABCDE') 65
Note: An empty string results in 0
.
Finding the index of a search value
You often need to find the index of a value inside a string. The function INSTR
comes in handy for that. Its syntax is as follows:
--WARNING! ERRORS ENCOUNTERED DURING SQL PARSING!
INSTR(source_value, search_value[, position[, occurrence]])
])
It returns the index—based on 1—of the first occurrence of search_value
inside of source_value
.
The position parameter is optional; if specified, the function starts the search by that position.
Occurrence is also optional; if specified, the function searches for the nth
occurrence of the search value.
Let's look at some examples of it in action:
Finding the first occurrence of search value:
SELECT INSTR("Stay", "St") 1
Finding the index of the second occurrence of the search value:
SELECT INSTR("StackOverflow start Steel", "St", 1, 2) 21
Note: Bear in mind that this function is case sensitive.
Finding the index of a search value using regex
There's an enhanced version of the previous function called REGEXP_INSTR
. It allows you to specify a regex (regular expression) as the search value:
select REGEXP_INSTR("test 123test", "[0-9]+[a-zA-Z]+") 6
Replacing a search value using regex
BigQuery SQL offers many functions that support regex.
I'll cover another one: REGEX_REPLACE
. Here's its syntax:
REGEXP_REPLACE(value, regexp, replacement)
As its name and syntax suggest, you use it to replace a value that matches to the specified regex.
Let's look at an example:
select REGEXP_REPLACE("test 123test", "[0-9]+[a-zA-Z]+", "it works!") test it works!
Math functions in BigQuery
Finally, let's look at some useful BigQuery math functions.
Safe division
Sometimes, you want to perform a division, but you know or suspect your data set contains values that will result in errors.
For instance, you might end up attempting divisions by zero.
In such a case, it'd be helpful to carry on with the division, returning a special value—such as NULL
—instead of having an error that halts the whole execution.
That's what the SAFE_DIV
function is for:
select SAFE_DIVIDE(10, 0) null
Safe multiplication
Similar to the safe division operation, you might need to perform a safe multiplication.
For instance, your data set might contain very large numbers that would result in an overflow when multiplying.
By using the SAFE_MULTIPLY
function, you ensure you get NULL
instead of an error, allowing your operation to continue without aborting.
Wrapping up our BigQuery SQL functions guide
Google BigQuery is certainly an interesting solution for companies that need a multi-cloud data warehouse solution.
Besides its many valuable features, such as high availability, real-time analytics, and serverless architecture, Google BigQuery offers several functions that make your life easier, allowing you to handle common business problems in a less cumbersome way.
If you're interested, your organization can start using BigQuery right away, in the free tier.
And when you need a cloud data platform to connect all of your data and deliver it seamlessly to everyone inside your business, Panoply's got you covered.
As of June 2021, Panoply works on Google BigQuery, offering great features and support at an affordable price. You can even try Panoply free for 14 days and get a personalized demo for your needs.
Thanks for reading, until next time!