Artwork

Contenido proporcionado por KeyCuts. Todo el contenido del podcast, incluidos episodios, gráficos y descripciones de podcast, lo carga y proporciona directamente KeyCuts o su socio de plataforma de podcast. Si cree que alguien está utilizando su trabajo protegido por derechos de autor sin su permiso, puede seguir el proceso descrito aquí https://es.player.fm/legal.
Player FM : aplicación de podcast
¡Desconecta con la aplicación Player FM !

Dear Analyst #131: Key insights and best practices from writing SQL for 15+ years with Ergest Xheblati

47:42
 
Compartir
 

Manage episode 432599299 series 3140515
Contenido proporcionado por KeyCuts. Todo el contenido del podcast, incluidos episodios, gráficos y descripciones de podcast, lo carga y proporciona directamente KeyCuts o su socio de plataforma de podcast. Si cree que alguien está utilizando su trabajo protegido por derechos de autor sin su permiso, puede seguir el proceso descrito aquí https://es.player.fm/legal.

If you could only learn one programming language for the rest of your career, what would be it be? You could Google the most popular programming languages and just pick the one of the top 3 and off you go (FYI they are Python, C++, and C). Or, you could pick measly #10 and build a thriving career out of it. Ergest Xheblati didn’t just pick SQL and decide to bet his career on it. He started as a software engineer, held various roles in data, and even became a product manager. After trying these different roles, Ergest would still find his way back to data engineering and has dedicated 15+ years to writing SQL for a living. In this episode he talks about why he loves SQL, reducing the number of dashboards in your company, and best practices for writing SQL.

Why Ergest loves writing SQL

The reason why Ergest loves SQL is also the reason how most things get invented: laziness. As that Stack Exchange thread points out, a lazy developer tends to find shortcuts and automations to make repetitive and tedious tasks less onerous. You could also argue that Excel shortcuts are a result of analysts being lazy and not wanting to use their mouse to do mundane formatting tasks.

Source: Invisible Bread

As it pertains to programming, Ergest saw that a standard framework might require 20-30 lines of code to pull some data from a database. Ergest could handle do that same operation by writing a few lines of SQL with a simple SELECT statement.

Solving business problems with technology

Ergest was a data analyst, data engineer, and also what we call an analytics engineer. When Ergest was a data analyst, he didn’t have tools like dbt which prevented him from succeeding as an analyst. As with many data roles, Ergest still straddles multiple roles today. He still considers himself a blend between a data analyst and data engineer with SQL being his main tool of choice. At a high level, Ergest talks about “solving business problems with technology.”

Source: dbt Labs

I think it’s important to emphasize this point which many other guests on Dear Analyst have pointed out as well. Learning tools like Excel and SQL are great, but if you cannot communicate your findings and solve real business problems with these tools, then what’s the point? I think data professionals get caught up with how to utilize a data tool’s features when time should really be spent on what can be done to solve your customer’s problems.

I recently had a conversation with a technical program manager who had an opportunity to sit on a few customer meetings with her sales team. She was amazed to learn about the actual problems her company’s customers face every day. It gave her a new perspective on the backend infrastructure her team supports.

Mining open source data with SQL

Most of the projects Ergest works on are focused on business intelligence. For instance, he had to work on a project where the company wanted to build robust customer profiles. You typically want to see all these different aspects of a customer so you know how to best market to and retain the customer. From a data perspective, Ergest was writing SQL to transform and merge different data from different sources.

Some data source might have names of the customers while another source might have numbers. You then have to look at the session logs of what these customers are doing on your website and create tables based on this customer activity. Ergest is a proponent of the One Big Table (OBT) approach for this customer activity data to make querying and management easier. This graphic below shows the main structural difference between the standard star schema and OBTs:

Source: Databricks SQL SME

How to stop building dashboards and answering ad-hoc questions

Ergest wrote a great blog post a few months ago called Transforming a Data Culture. The blog post talks about how data teams can prevent the deluge of one-off data questions being asked by the business and to shift to being a more strategic partner. Does this sound like a goal or OKR your data team is striving for?

Source: iFunny

Ergest did an audit at a company that had 17,000+ dashboards! Talk about not knowing which metrics matter. Ergest believes in going back to first principles when it comes to dashboarding. There are 4 questions Ergest believes you need to answer when creating a dashboard:

  1. What’s happening?
  2. Why did it happen?
  3. What are you going to do?
  4. What’s your prediction?

The blog post goes in-depth on how getting executive buy-in is the most important step in reducing the number of questions coming at the data team.

Best patterns for writing SQL

Ergest reviews a lot of SQL queries and saw mistakes and anti-patterns in how his fellow analysts and data engineers were writing queries. Surely, there must be a book about the best patterns for writing SQL, Ergest thought. There are many books on best patterns for coding and how to debug code. The only books Ergest could find on SQL were anti-patterns. He ended up writing a book called Minimum Viable SQL Patterns based on his experience reviewing other people’s queries. He breaks the patterns down into 4 buckets:

  1. Query composition patterns – How to make your complex queries shorter, more legible, and more performant
  2. Query maintainability patterns – Constructing CTEs that can be reused. In software engineering, it’s called the DRY principle (don’t repeat yourself)
  3. Query robustness patterns – Constructing queries that don’t break when the underlying data changes in unpredictable ways
  4. Query performance patterns – Make your queries faster (and cheaper) regardless of specific database you’re using

These 4 patterns are pulled directly from this workshop Ergest gave about SQL patterns:

According to Ergest, what separates his book apart from other books about SQL is that the patterns he discusses are based on writing professional/production-ready SQL for cloud environments. He assumes you are writing SQL to query data warehouses in AWS, Azure, or some other public cloud platform.

Other Podcasts & Blog Posts

No other podcasts or blog posts mentioned in this episode!

The post Dear Analyst #131: Key insights and best practices from writing SQL for 15+ years with Ergest Xheblati appeared first on .

  continue reading

14 episodios

Artwork
iconCompartir
 
Manage episode 432599299 series 3140515
Contenido proporcionado por KeyCuts. Todo el contenido del podcast, incluidos episodios, gráficos y descripciones de podcast, lo carga y proporciona directamente KeyCuts o su socio de plataforma de podcast. Si cree que alguien está utilizando su trabajo protegido por derechos de autor sin su permiso, puede seguir el proceso descrito aquí https://es.player.fm/legal.

If you could only learn one programming language for the rest of your career, what would be it be? You could Google the most popular programming languages and just pick the one of the top 3 and off you go (FYI they are Python, C++, and C). Or, you could pick measly #10 and build a thriving career out of it. Ergest Xheblati didn’t just pick SQL and decide to bet his career on it. He started as a software engineer, held various roles in data, and even became a product manager. After trying these different roles, Ergest would still find his way back to data engineering and has dedicated 15+ years to writing SQL for a living. In this episode he talks about why he loves SQL, reducing the number of dashboards in your company, and best practices for writing SQL.

Why Ergest loves writing SQL

The reason why Ergest loves SQL is also the reason how most things get invented: laziness. As that Stack Exchange thread points out, a lazy developer tends to find shortcuts and automations to make repetitive and tedious tasks less onerous. You could also argue that Excel shortcuts are a result of analysts being lazy and not wanting to use their mouse to do mundane formatting tasks.

Source: Invisible Bread

As it pertains to programming, Ergest saw that a standard framework might require 20-30 lines of code to pull some data from a database. Ergest could handle do that same operation by writing a few lines of SQL with a simple SELECT statement.

Solving business problems with technology

Ergest was a data analyst, data engineer, and also what we call an analytics engineer. When Ergest was a data analyst, he didn’t have tools like dbt which prevented him from succeeding as an analyst. As with many data roles, Ergest still straddles multiple roles today. He still considers himself a blend between a data analyst and data engineer with SQL being his main tool of choice. At a high level, Ergest talks about “solving business problems with technology.”

Source: dbt Labs

I think it’s important to emphasize this point which many other guests on Dear Analyst have pointed out as well. Learning tools like Excel and SQL are great, but if you cannot communicate your findings and solve real business problems with these tools, then what’s the point? I think data professionals get caught up with how to utilize a data tool’s features when time should really be spent on what can be done to solve your customer’s problems.

I recently had a conversation with a technical program manager who had an opportunity to sit on a few customer meetings with her sales team. She was amazed to learn about the actual problems her company’s customers face every day. It gave her a new perspective on the backend infrastructure her team supports.

Mining open source data with SQL

Most of the projects Ergest works on are focused on business intelligence. For instance, he had to work on a project where the company wanted to build robust customer profiles. You typically want to see all these different aspects of a customer so you know how to best market to and retain the customer. From a data perspective, Ergest was writing SQL to transform and merge different data from different sources.

Some data source might have names of the customers while another source might have numbers. You then have to look at the session logs of what these customers are doing on your website and create tables based on this customer activity. Ergest is a proponent of the One Big Table (OBT) approach for this customer activity data to make querying and management easier. This graphic below shows the main structural difference between the standard star schema and OBTs:

Source: Databricks SQL SME

How to stop building dashboards and answering ad-hoc questions

Ergest wrote a great blog post a few months ago called Transforming a Data Culture. The blog post talks about how data teams can prevent the deluge of one-off data questions being asked by the business and to shift to being a more strategic partner. Does this sound like a goal or OKR your data team is striving for?

Source: iFunny

Ergest did an audit at a company that had 17,000+ dashboards! Talk about not knowing which metrics matter. Ergest believes in going back to first principles when it comes to dashboarding. There are 4 questions Ergest believes you need to answer when creating a dashboard:

  1. What’s happening?
  2. Why did it happen?
  3. What are you going to do?
  4. What’s your prediction?

The blog post goes in-depth on how getting executive buy-in is the most important step in reducing the number of questions coming at the data team.

Best patterns for writing SQL

Ergest reviews a lot of SQL queries and saw mistakes and anti-patterns in how his fellow analysts and data engineers were writing queries. Surely, there must be a book about the best patterns for writing SQL, Ergest thought. There are many books on best patterns for coding and how to debug code. The only books Ergest could find on SQL were anti-patterns. He ended up writing a book called Minimum Viable SQL Patterns based on his experience reviewing other people’s queries. He breaks the patterns down into 4 buckets:

  1. Query composition patterns – How to make your complex queries shorter, more legible, and more performant
  2. Query maintainability patterns – Constructing CTEs that can be reused. In software engineering, it’s called the DRY principle (don’t repeat yourself)
  3. Query robustness patterns – Constructing queries that don’t break when the underlying data changes in unpredictable ways
  4. Query performance patterns – Make your queries faster (and cheaper) regardless of specific database you’re using

These 4 patterns are pulled directly from this workshop Ergest gave about SQL patterns:

According to Ergest, what separates his book apart from other books about SQL is that the patterns he discusses are based on writing professional/production-ready SQL for cloud environments. He assumes you are writing SQL to query data warehouses in AWS, Azure, or some other public cloud platform.

Other Podcasts & Blog Posts

No other podcasts or blog posts mentioned in this episode!

The post Dear Analyst #131: Key insights and best practices from writing SQL for 15+ years with Ergest Xheblati appeared first on .

  continue reading

14 episodios

Todos los episodios

×
 
Loading …

Bienvenido a Player FM!

Player FM está escaneando la web en busca de podcasts de alta calidad para que los disfrutes en este momento. Es la mejor aplicación de podcast y funciona en Android, iPhone y la web. Regístrate para sincronizar suscripciones a través de dispositivos.

 

Guia de referencia rapida