We are wrapping up the techniques for preparing data for analysis.

This list is by no means exhaustive. Your repertoire will grow as you practice and build your skills, knowledge, and experience.

Merging DataFrames

Let’s open up the file in the 01-Ins_Merging folder to get started.

Merging datasets is a staple activity, especially when we want to get insights across multiple datasets.

Data modeling will be crucial because we want to ensure data integrity when we merge data, but that will be part of your future coursework.
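As a rough illustration of merging, here is a minimal sketch. The customers and orders DataFrames and their column names are made up for this example and are not taken from the class files. It uses pd.merge with the how parameter to contrast an inner join with an outer join.

```python
import pandas as pd

# Hypothetical sample data; the names and values are illustrative only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Cara"],
})
orders = pd.DataFrame({
    "customer_id": [1, 1, 3, 4],
    "amount": [20.0, 35.5, 12.0, 50.0],
})

# Inner join: keep only customer_id values that appear in both DataFrames.
inner = pd.merge(customers, orders, on="customer_id", how="inner")

# Outer join: keep every row from both sides; cells with no match become NaN.
outer = pd.merge(customers, orders, on="customer_id", how="outer")

print(inner)
print(outer)
```

Comparing the row counts of the two results is a quick way to spot keys that exist in only one dataset, which is where data integrity problems tend to show up.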

Students Do: Census Merging

Binning Data

Let’s open up the file in the 03-Ins_Binning folder to get started.

Categorizing data by specific conditions is a common need in all types of data analysis. We used to do it manually, but as Pandas has evolved, it has lowered the barrier to entry by letting us do it programmatically.
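One way to bin values in Pandas is pd.cut, sketched below. The scores DataFrame, the bin edges, and the labels are assumptions chosen for illustration and are not the data from the class activity.

```python
import pandas as pd

# Hypothetical sample data; the names and values are illustrative only.
scores = pd.DataFrame({
    "student": ["A", "B", "C", "D"],
    "score": [55, 68, 79, 93],
})

# Bin edges define the ranges (0, 59], (59, 69], (69, 79], (79, 89], (89, 100].
# By default each bin includes its right edge and excludes its left edge.
bins = [0, 59, 69, 79, 89, 100]

# There must be exactly one label per bin, so one fewer label than edges.
labels = ["F", "D", "C", "B", "A"]

# pd.cut assigns each score to the bin it falls into.
scores["grade"] = pd.cut(scores["score"], bins=bins, labels=labels)

print(scores)
```

Once a column is binned, it can be passed to groupby to summarize each category.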

Students Do: Binning Movies

Mapping

Let’s open up the file in the 05-Ins_Mapping folder to get started.

According to the official documentation, map is a method that applies transformation logic to each value in a column (a Series): https://pandas.pydata.org/pandas-docs/version/0.19/generated/pandas.Series.map.html

Notes

  • Notice that when we apply formatting to NaN values, they become objects (strings) within Pandas, so they are no longer recognized as missing.
    • You will usually want to remove NaN values before applying formatting.
  • ${:.2f} formats each value as a string with a leading dollar sign, rounded to two digits after the decimal point.
  • This is not the only way to round data. I typically use the numpy library, which we will cover in future coursework.
    • If your work requires high-precision values, as in architecture and construction, you will likely use numpy for this.
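The notes above can be sketched as follows. The price column and its values are hypothetical, and Series.round is used here as one numeric alternative to formatting; it is not necessarily the approach shown in class.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column name and values are illustrative only.
df = pd.DataFrame({"price": [12.5, 3.14159, np.nan]})

# map calls the formatter on every value, including NaN.
# The result holds strings, and the NaN becomes the text "$nan".
formatted = df["price"].map("${:.2f}".format)

# Dropping the NaN rows first avoids the "$nan" string.
clean = df.dropna(subset=["price"])
formatted_clean = clean["price"].map("${:.2f}".format)

# Series.round keeps the column numeric, so it can still be used in math.
rounded = df["price"].round(2)

print(formatted.tolist())
print(formatted_clean.tolist())
print(rounded.tolist())
```

Because formatting turns numbers into text, it is usually safest to do it as the last step, after all calculations are finished.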

What you have learned is not exhaustive

Column-level transformation goes much deeper than what we have covered, and that depth is beyond the scope of the class. However, it is useful to know it exists so that you can research it later.

Crowdfunding Cleaning

Introduction to Bug Fixing

Look at the file(s) in the 07-Ins_Intro_to_Bugfixing folder.

Bug fixing is something that is caught, not only taught. It is like riding a bicycle. You can’t learn to ride a bicycle just by reading about it; you actually have to do it to get better at it.

As you do more bug fixing, your troubleshooting skills will grow as well.

Being able to debug code is key to excellence, especially when you’re working in a team. You will need to check the quality of your teammates’ work as well as your own in order to produce good products.

Bug Fixing Bonanza