1 min read

SQL for Google Sheets with DuckDB

Analyze data in Google Sheets using SQL with DuckDB
SQL for Google Sheets with DuckDB

Lately, I've been using DuckDB to quickly analyze data in a Google Sheet.

For example, I stumbled upon this sheet collecting salary data for Data Engineers from the /r/dataengineering subreddit and was curious about some averages.

Traditionally, I would have had to export the data as a CSV to my machine and load it into some database or data warehouse where I could run some queries over it, but with DuckDB, this is so much easier!

All I have to do is run:

SELECT *
FROM read_csv_auto('https://docs.google.com/spreadsheets/export?format=csv&id=1GuEPkwjdICgJ31Ji3iUoarirZNDbPxQj_kf7fd4h4Ro', normalize_names=True);
💡
Note the "export?format=csv" part of the URL - that's how you tell Google Sheets to give you a CSV file!

Et voilà!

The data is immediately available for number crunching! Let's say I want to know the average salary for Data Engineers in Texas.

SELECT   current_job_title,
         round(avg(try_cast(base_salary as int))) AS avg_salary
FROM     de_salaries -- Saved results of the previous query as a table
WHERE    currency = 'USD'
AND      state = 'Texas'
GROUP BY current_job_title
ORDER BY avg_salary DESC;

And I didn't even have to leave the terminal.