How you write your queries will be tested in a technical interview, and so be prepared. I do test sequel queries during technical interviews.

In short, you would have to be good enough that it becomes muscle memory within you.

It is not covered in our course, but query optimization is also important. There are several ways to get to the same result, but there is usually the most optimal way to write queries. Your technical interviewer might ask you to optimize your query during an interview.

Even at work, we do query optimization and log slow queries so that we can run our operations better.

Import Data

Let’s open up the file(s) in the 01-Evr_Import_Data folder to get started.

There is a blend of SQL within the school’s resources that is MySQL based. It advises us to use the CSV import method rather than SQL based if we meet problems.

This is the reason why even though we generically refer SQL for relational databases, yet in different engines such as Microsoft SQL vs Postgres, they can be totally different ways of the same thing.

Example:

  • To limit the top 5 entries during a SELECT statement, Microsoft SQL uses TOP.
  • MySQL and Postgres will use LIMIT.

Thus, we always recommend not to blindly copy code but to understand what it does, because semantically they refer to the same thing. However, due to different engines that you might work with, you may have to write them differently. Being flexible is key.

In general, import SQL files is generally preferred if your company uses a homogenous system for database (if every system uses Postgres).

  • You will incur less errors in importing because each statement is a SQL query. CSVs can have mismatched or type errors when you ingest.

However, there are cases where the landscape is very heterogenous, we may prefer CSVs because it is more universally accepted in interpreting the file.

Aggregate Functions, Aliases, and Grouping

Let’s open up the file(s) in the 02-Ins_Aggregates folder to get started.

You can apply statistical and mathematical formulas when you query.

In general, they are used for quick checks to highlight issues. However, it is not good practice to over complicate the queries; we typically use Python or Jupyter Notebooks for complex work.

  • The more complex it is, the more we prefer to version control the work to ensure quality.

Students Do: Gregarious Aggregates

Order By Aggregates

Let’s open up the file in the 04-Ins_Order_By folder to get started.

Students Do: Movies Ordered By

Introduction to Subqueries

Let’s open up the file(s) in the 06-Ins_Subqueries folder to get started.

Complex subqueries is only good for technical interviews, where interviewers want to understand your core understanding of queries.

In reality, I discourage the team for over-doing sub queries for the sake of maintainability.

Students Do: Subqueries

Create Views

Let’s open up the file(s) in the 08-Ins_Create_Views folder to get started.

Views is a very important feature within relational database. We use views to protect the actual tables from modification, and we typically only allow SELECT statements on views.

Student Do: A View with a Roomful of Queries

Revisit Subqueries

Let’s open up the file(s) in the 10-Ins_Revist_Subquery folder to get started.

Entity-Relationship Diagram (ERD)

When we build a system to collect data from scratch, we need to model it so that we collect data.

Ref: https://www.postgresqltutorial.com/postgresql-getting-started/postgresql-sample-database/

We will learn about data modeling in the next lesson. For now, do get used to the idea that it is essential to design your data storage well so that it will not impact your analysis.

Students Do: Mine the Subquery