Today's session is a light-hearted discussion of some services that help companies and teams adopt big data technologies and machine learning (ML) processes in a simplified way.

Why Databricks? (Or any data analytics platform, for that matter)

Companies involved in data analytics, data science, and AI want to reduce the overhead of maintaining infrastructure and engineering. As you saw when installing Spark, it is not easy to maintain a big data infrastructure.

However, there are many competing companies, each offering its own flavor of services. The school offers only one of the many choices; here are some others:

  • AWS SageMaker Studio
  • Snowflake
  • Ab Initio

It is important not to be distracted by the number of services or nuances that these platforms offer. At the bare minimum, remember that it is your skills and creativity that drive the work, not the platform.

This lesson is all about exposure to some of the services that you might encounter in the market. You have been using Google Colab, and now we are trying out Databricks.

Databricks Demo

Let’s open up the file(s) in the 02-Ins_Databricks_Demo folder to get started.

Joins

Let’s open up the file(s) in the 04-Ins_Joins folder to get started.