As web pages become more sophisticated with CSS and JavaScript functionality, we need CSS selectors to dynamically isolate the areas we want to scrape data from.

Many of these CSS selectors rely on metadata attached to the HTML tags, such as id and class attributes, which tells you more about each tag. That metadata gives you a dependable hook for automating your web scraping processes.

Hello HTML

Let’s open up the file(s) in the 01-Ins_CSS_Identifiers folder to get started.

The goal of using CSS identifiers is to easily identify the data you need and map it into a variable for manipulation and/or storage.
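
As a rough illustration of that goal, the sketch below pulls text out of a small page by id and by class and stores it in variables. It uses only Python's standard library so it runs anywhere; the sample HTML, the `IdentifierCollector` class, and the `collect` helper are all hypothetical names made up for this example, and the matching is deliberately simplistic (it does not handle nested tags). In class you would more likely reach for a dedicated scraping library.

```python
from html.parser import HTMLParser

# Hypothetical sample page; the ids and classes are invented for illustration.
SAMPLE_HTML = """
<html>
  <body>
    <h1 id="page-title">Daily Headlines</h1>
    <p class="summary">First story summary.</p>
    <p class="summary">Second story summary.</p>
    <p class="footer">Contact us</p>
  </body>
</html>
"""


class IdentifierCollector(HTMLParser):
    """Collect the text of elements whose id or class matches a target."""

    def __init__(self, target_id=None, target_class=None):
        super().__init__()
        self.target_id = target_id
        self.target_class = target_class
        self.results = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        id_match = self.target_id is not None and attrs.get("id") == self.target_id
        class_match = self.target_class is not None and self.target_class in classes
        if id_match or class_match:
            # Start a new result entry for this matching element.
            self._capturing = True
            self.results.append("")

    def handle_data(self, data):
        if self._capturing:
            self.results[-1] += data

    def handle_endtag(self, tag):
        # Simplification: stop at the first closing tag (no nesting support).
        self._capturing = False


def collect(html, target_id=None, target_class=None):
    """Return the stripped text of every element matching the id or class."""
    parser = IdentifierCollector(target_id, target_class)
    parser.feed(html)
    return [text.strip() for text in parser.results]


# Map the identified data into variables for later manipulation or storage.
title = collect(SAMPLE_HTML, target_id="page-title")[0]
summaries = collect(SAMPLE_HTML, target_class="summary")

print(title)
print(summaries)
```

The id picks out exactly one element, while the class picks out every element that shares it, which is why `title` is a single string and `summaries` is a list.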

Students Do: CSS Case Study

Students Do: Pandas Scrape

DevTools

Let’s open up the file in the 04-Ins_DevTools folder to get started.

DevTools is a debugging console built into Chrome (or whichever other major browser you might be using).

Keyboard shortcuts
  • Mac: Option + Command ⌘ + I
  • Windows: Ctrl + Shift + I

I’ll walk you through what some of the DevTools panels can do, and how you can use them to extract data from a page.

We’ll be scraping from this website: https://stackoverflow.com/questions/tagged/python?sort=MostVotes&edited=true
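
Before scraping, it can help to see what the URL itself is asking for. The sketch below splits the address above into its parts with Python's standard library; the variable names are just illustrative, and reading the last path segment as the tag is an assumption based on how this particular URL is laid out.

```python
from urllib.parse import urlsplit, parse_qs

URL = "https://stackoverflow.com/questions/tagged/python?sort=MostVotes&edited=true"

parts = urlsplit(URL)

# Query string parameters come back as a dict of lists.
query = parse_qs(parts.query)

# Assumption: the tag being browsed is the last segment of the path.
tag = parts.path.rstrip("/").split("/")[-1]
sort_order = query["sort"][0]

print(parts.netloc)
print(tag)
print(sort_order)
print(query["edited"])
```

Changing the tag segment or the `sort` value in the URL changes which questions the page lists, so it is worth deciding on these before you write the scraper.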

Students Do: Stack Scrape

Styling HTML Elements with CSS