Web scraping

Materials for the MIT Political Methodology Lab workshop on learning to scrape with Python. They are adapted from (and currently almost identical to) Andy Halterman's workshop materials on web scraping: https://github.com/ahalterman/learn_to_scrape. The workshop is split into two parts. First, we get familiar with Python. Then we use it to scrape websites with BeautifulSoup.

Before coming to the workshop

Please follow the Setup instructions on the Wiki before coming to the workshop so we can maximize the amount of time spent learning Python and web scraping.

The presentation source and PDF files are in the PML Presentation folder. The practice exercises we go through during the workshop are in the Python Notebooks folder. The incomplete notebooks have a _Skeleton suffix and will be filled out during the workshop. For reference, complete versions of these notebooks are in the Completed subfolder. For best results, I suggest not looking at the solutions ahead of time.

Intro_to_Python_for_R_Users_Skeleton.ipynb contains a skeleton of basic Python programming exercises worked through during the workshop.
Intro_to_Python_for_R_Users_Completed.ipynb contains the "solutions" for Intro_to_Python_for_R_Users_Skeleton.
Scraper_Skeleton.ipynb contains a skeleton of a web scraper and is what we'll be working through during the workshop.
Scraper_Completed.ipynb contains the "solutions" for Scraper_Skeleton.
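
For a rough idea of what the scraping part involves, here is a minimal sketch of parsing a page with BeautifulSoup. It is not taken from the workshop notebooks: the HTML snippet, the link paths, and the variable names are made up for illustration, and it assumes the beautifulsoup4 package is installed as part of the Setup instructions.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page (no network access needed).
html = """
<html><body>
  <h1>Workshop links</h1>
  <ul>
    <li><a href="/setup">Setup</a></li>
    <li><a href="/notebooks">Notebooks</a></li>
  </ul>
</body></html>
"""

# Parse with Python's built-in parser so no extra parser package is required.
soup = BeautifulSoup(html, "html.parser")

# Pull out the page heading and the text and target of every link.
title = soup.h1.get_text(strip=True)
links = [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a")]

print(title)
print(links)
```

A real scraper would first download the page (for example with the requests library) and pass the response text to BeautifulSoup in place of the hard-coded string.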
