Summer of SQL: A Q&A Series with Joe Hellerstein

SQL is back in a big way this summer, after what seemed like a long spell in the back seat.

To find out why and what we missed, we asked a series of questions of Joe Hellerstein, a computer science professor at the University of California, Berkeley and co-founder of Trifacta.

In this series, learn why SQL is back, what modern cloud data engineering looks like with the acceleration of cloud data warehouses, and why ETL is becoming ELT.

 
 

Summer of SQL: Why It’s Back

For the first decades of the millennium, it seemed like the Java-centric approach was the "hot new thing," but SQL has been roaring back. Today, SQL seems to be the focus of every data engineering conversation and is popping back up on billboards in Silicon Valley.

The comparison of the two "shops" inevitably leads to the question: which is better? There are pros and cons to emphasizing one or the other. 

Learn More
Summer of SQL - Episode 1


SQL Pipelines and ELT

ELT (extract, load, transform) is increasingly attractive these days. Modern data warehouses are flexible and increasingly cost-effective, allowing us to store large volumes of data, even messy data that includes volumes of text and images. In this environment, transformations occur in the data warehouse, where the native language is SQL.

Learn More
Summer of SQL - Episode 2


Transformation: Next Level SQL

When we use SQL for transformation (the “T” in ELT), the focus changes. In this case, we’re taking many messy and disparate tables and manipulating them into a more usable or common form. To take our example from before, we may be extracting and loading sales data from 17 electronics chains that sold the phones, and our job in SQL is to write transformation queries that integrate that data.
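As a rough illustration of what such a transformation query might look like, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a cloud data warehouse. The table names (chain_a_sales, chain_b_sales), their columns, and the sample rows are hypothetical, invented only for this example, and only two chains are shown instead of 17.

```python
import sqlite3

# In-memory SQLite database standing in for a data warehouse (assumption).
conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Two hypothetical raw tables, loaded as-is from different chains.
CREATE TABLE chain_a_sales (sale_date TEXT, model TEXT, units INTEGER, price_usd REAL);
CREATE TABLE chain_b_sales (sold_on TEXT, phone_model TEXT, qty INTEGER, price_cents INTEGER);

INSERT INTO chain_a_sales VALUES ('2021-06-01', 'Phone X', 3, 499.0);
INSERT INTO chain_b_sales VALUES ('06/02/2021', 'phone x', 2, 47900);

-- The transformation: map both layouts onto one common form.
CREATE VIEW all_sales AS
SELECT 'chain_a' AS chain,
       sale_date,
       upper(model) AS model,
       units,
       price_usd
FROM chain_a_sales
UNION ALL
SELECT 'chain_b',
       -- Rewrite MM/DD/YYYY as YYYY-MM-DD.
       substr(sold_on, 7, 4) || '-' || substr(sold_on, 1, 2) || '-' || substr(sold_on, 4, 2),
       upper(phone_model),
       qty,
       price_cents / 100.0
FROM chain_b_sales;
""")

rows = conn.execute(
    "SELECT chain, sale_date, model, units, price_usd FROM all_sales ORDER BY chain"
).fetchall()

for row in rows:
    print(row)
```

The view leaves the raw tables untouched and does the integration inside the database, which is the pattern ELT relies on. A real warehouse would use its own SQL dialect and date functions rather than the substr calls shown here.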

Learn More
Summer of SQL - Episode 3


Back to SQL: Data Engineering

As part of growing our massive new Data Science program at Berkeley, it became clear that we needed a class aimed specifically at Data Engineering. The goals of Data Engineering are different from those of Software Engineering, so it was interesting to think through this curriculum and how we would teach it differently from our established database classes.

In this new approach, we ended up emphasizing four steps to SQL for Data Engineering that are atypical of a traditional databases class: data quality, data reshaping, “spreadsheet tasks,” and data pipeline testing.
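To make two of these steps a little more concrete, the following is a minimal sketch of data quality checks written as SQL and run as a simple pipeline test. It is not taken from the Berkeley course. It uses Python's built-in sqlite3 module as a stand-in for a warehouse, and the sales table, its columns, and the check names are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a data warehouse (assumption).
conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Hypothetical table with three deliberate problems:
-- a missing model, a negative unit count, and a repeated order_id.
CREATE TABLE sales (order_id INTEGER, model TEXT, units INTEGER);
INSERT INTO sales VALUES (1, 'PHONE X', 3), (2, NULL, 1), (2, 'PHONE Y', -4);
""")

# Each check is a query that counts violations; zero means the check passes.
checks = {
    "missing_model": "SELECT COUNT(*) FROM sales WHERE model IS NULL",
    "negative_units": "SELECT COUNT(*) FROM sales WHERE units < 0",
    "duplicate_order_id": (
        "SELECT COUNT(*) FROM "
        "(SELECT order_id FROM sales GROUP BY order_id HAVING COUNT(*) > 1)"
    ),
}

violations = {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}
failed = [name for name, count in violations.items() if count > 0]

for name, count in violations.items():
    status = "FAIL" if count else "ok"
    print(f"{name}: {count} violation(s) [{status}]")
```

In a pipeline, a non-empty failed list would typically stop the run or raise an alert before downstream tables are rebuilt.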

Learn More
Summer of SQL - Episode 4
