Today's session is a light-hearted discussion of some services that help companies and teams adopt big data technologies and machine learning (ML) processes in a simplified way.
Why Databricks? (Or any data analytics platform, for that matter)
Companies involved in data analytics, data science, and AI want to reduce the overhead of maintaining infrastructure and engineering. As you may have seen when installing Spark, it is not easy to maintain a big data infrastructure.
However, there are many competing companies, each offering its own flavor of services. The school offers only one of the many choices; here are some others:
- AWS Sagemaker Studio
- Snowflake
- Ab Initio
It is important not to be distracted by the number of services or nuances that these platforms offer. At a bare minimum, remember that it is your skills and creativity that drive the work, not the platform.
This lesson is all about exposure to some of the services that you might encounter in the market. You have been using Google Colab, and now we are trying out Databricks.
Students Do: Sign Up for Databricks
Let’s open up the file(s) in the 01-Stu_Sign_up folder to get started.
The community signup link is in fine print, so you’ll have to look closely. Don’t select the trial version!
Databricks Demo
Let’s open up the file(s) in the 02-Ins_Databricks_Demo folder to get started.
Students Do: Databricks Basics
Let’s open up the file(s) in the 03-Stu_Basics folder to get started.
Joins
Let’s open up the file(s) in the 04-Ins_Joins folder to get started.
Students Do: Joining Animal Species
Let’s open up the file(s) in the 05-Stu_Joins folder to get started.
Groups Do: Database Analysis
Let’s open up the file(s) in the 06-Stu_Group_Project folder to get started.