You’ve learned how HTML and CSS work and how to do some basic scraping. Today, we’re going to discuss how to automate web scraping.
Web pages are built from templates, which give a website a consistent layout and flow that guide users through it. Nearly every company that publishes information on the web relies on templates to serve it.
Because a template is a repeatable pattern, you can use that pattern to automate your web scraping application. However, high-traffic websites are prone to layout changes, since they usually have a design team making frequent edits to meet business needs, so an automated web scraping process can break easily.
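To make the idea concrete, here is a minimal sketch of scraping a templated page. The markup, the class names (item, title, price), and the helper scrape_items are all hypothetical, invented for illustration. The sketch uses only Python's standard library so it runs without extra installs. It relies on every item sharing the same structure, and it fails loudly when that structure is missing, which is what a layout change would look like to a scraper.

```python
from html.parser import HTMLParser

# Hypothetical markup: every item on the page is rendered from the same template.
SAMPLE_PAGE = """
<div class="item"><h2 class="title">Red Mug</h2><span class="price">$8</span></div>
<div class="item"><h2 class="title">Blue Mug</h2><span class="price">$9</span></div>
"""

FIELDS = {"title", "price"}  # assumed class names used by the template


class ItemParser(HTMLParser):
    """Collect the title and price text of each templated item."""

    def __init__(self):
        super().__init__()
        self.items = []      # one dict per item found
        self._field = None   # field whose text we expect next, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class == "item":
            self.items.append({})        # a new item starts
        elif css_class in FIELDS and self.items:
            self._field = css_class      # the next text belongs to this field

    def handle_data(self, data):
        if self._field:
            self.items[-1][self._field] = data.strip()
            self._field = None


def scrape_items(html):
    """Return all items, or raise if the expected template is not found."""
    parser = ItemParser()
    parser.feed(html)
    if not parser.items or not all(FIELDS.issubset(item) for item in parser.items):
        # A redesign that renames or removes these classes ends up here.
        raise ValueError("page layout does not match the expected template")
    return parser.items


for item in scrape_items(SAMPLE_PAGE):
    print(item["title"], item["price"])
```

The same function works on any page built from the same template, which is what makes automation possible. Raising an error on a mismatch, rather than silently returning nothing, makes a layout change easier to notice.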
Students Do: Automated Web Scrape
Let’s open up the file(s) in the 01-Stu_Automated_Web_Scrape folder to get started.
Scrape Multiple Pages
Let’s open up the file(s) in the 02-Ins_Scrape_Multiple_Pages folder to get started.
Students Do: Scrape Book Links
Let’s open up the file(s) in the 03-Stu_Scrape_Book_Links folder to get started.
Students Do: News Headers
Let’s open up the file in the 04-Stu_News folder to get started.
Students Do: Mars Facts Scrape
Let’s open up the file in the 05-Stu_Mars_Facts_Scrape folder to get started.
Scrape a Table with Pandas
Let’s open up the file(s) in the 06-Ins_Scrape_Pandas_Table folder to get started.