Learn to Code via Tutorials on Repl.it!

How to use BeautifulSoup to web scrape.
Hillo232 (15)

Hey guys, today I will teach you how to web scrape with BeautifulSoup. Remember to read the website's ToS before scraping their site!
First, you want to import all the modules:

import requests
from bs4 import BeautifulSoup

Then, you want to build the URL from the word you want to look up:

variable = input("What word would you like to search: ")
url = "https://www.vocabulary.com/dictionary/"+variable
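
If the word has spaces or stray whitespace, the URL above can come out malformed. Here is a minimal sketch of a safer way to build it, using only the standard library. The names BASE_URL and build_url are made up for this example, and it only assumes the same dictionary URL pattern used in this post.

```python
from urllib.parse import quote

# Same base URL as in the tutorial; the constant name is hypothetical.
BASE_URL = "https://www.vocabulary.com/dictionary/"


def build_url(word: str) -> str:
    """Return the dictionary URL for a word, trimmed and percent-encoded."""
    cleaned = word.strip()
    if not cleaned:
        raise ValueError("word must not be empty")
    # quote() escapes spaces and other characters that are unsafe in a URL path.
    return BASE_URL + quote(cleaned)
```

For example, build_url("ice cream") ends in "ice%20cream" instead of containing a raw space.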

Third, you want to request the page at that URL:

page = requests.get(url)
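
A request can fail (wrong word, site down, no internet), so it can help to check for errors before parsing. This is just a sketch: fetch_html is a hypothetical helper name, and the 10 second timeout is an arbitrary choice, not something the site requires.

```python
import requests


def fetch_html(url: str, timeout: float = 10.0) -> bytes:
    """Download a page and return its raw bytes, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses.
    response.raise_for_status()
    return response.content
```

You could then pass fetch_html(url) to BeautifulSoup instead of page.content.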

Then you want to parse the HTML code:

soup = BeautifulSoup(page.content, "html.parser")
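
To see what parsing gives you without hitting a real website, you can try it on a small HTML string first. The HTML below is made up for illustration.

```python
from bs4 import BeautifulSoup

# A tiny, made-up HTML document standing in for a downloaded page.
html = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"

soup = BeautifulSoup(html, "html.parser")

# Once parsed, tags can be reached as attributes or searched with find().
title_text = soup.title.string
first_paragraph = soup.find("p").get_text()

print(title_text)
print(first_paragraph)
```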

After that, you need to find the id of the element that wraps the content you want: right-click the page in your browser and click Inspect. You want to look for a <div> tag with an id; in this case we will use the id "page".

results = soup.find(id="page")

Then you want to search inside that element for the <div> whose data-word attribute matches the word you searched for:

vocabulary_elems = results.find("div", attrs={"data-word": variable})
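
Here is a small sketch of the two lookups (by id, then by data-word) on made-up HTML, so you can see what each find call returns. The sample markup and the find_word_block name are assumptions for illustration; the real page's structure may differ.

```python
from bs4 import BeautifulSoup

# Made-up markup that imitates the structure described in this post.
SAMPLE_HTML = """
<div id="page">
  <div data-word="happy"><p>feeling pleasure</p></div>
  <div data-word="sad"><p>feeling sorrow</p></div>
</div>
"""


def find_word_block(html, word):
    """Return the <div> whose data-word attribute equals word, or None."""
    soup = BeautifulSoup(html, "html.parser")
    page = soup.find(id="page")  # first element with id="page"
    if page is None:
        return None
    # attrs={...} is needed because data-word is not a valid Python keyword name.
    return page.find("div", attrs={"data-word": word})
```

Calling find_word_block(SAMPLE_HTML, "sad") returns the second inner <div>, while a word that is not in the HTML returns None.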

Now you want to print the result as text instead of messy HTML code:

print(vocabulary_elems.text)
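
One thing to watch out for: find returns None when nothing matches, and then .text raises an AttributeError. A minimal sketch of a guard is below; element_text is a hypothetical helper and the HTML is made up.

```python
from bs4 import BeautifulSoup


def element_text(element, default="No result found."):
    """Return the element's text with tidy spacing, or a default if it is None."""
    if element is None:
        return default
    # strip=True trims each piece of text; separator joins the pieces with a space.
    return element.get_text(separator=" ", strip=True)


demo = BeautifulSoup(
    '<div data-word="happy"> <h1>happy</h1> <p>feeling joy</p> </div>',
    "html.parser",
)

print(element_text(demo.find("div")))
print(element_text(demo.find("span")))
```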

Thanks for reading! Hope you made your own web scraper!

Comments
cuber1515 (91)

How would you do this if the element you wanted to get had a class instead of an id? (I tried class="class" and it didn't work.) In case it matters, I was trying it on Amazon: results = soup.find(class="s-desktop-content")

Also, where did the "data-word" come from in vocabulary_elems=results.find("div",attrs={"data-word":variable}), and what does it do?

Hillo232 (15)

@cuber1515 results.find returns the first <div> inside results whose "data-word" attribute equals the word you searched for. And for Amazon, you need to select the listed products first before you can get any data from them.
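
On the class question: class is a reserved word in Python, so find(class="...") is a syntax error. BeautifulSoup accepts class_ (with a trailing underscore) or an attrs dictionary instead. A quick sketch on made-up HTML follows; the class name is taken from the comment above, and the markup is not Amazon's actual page.

```python
from bs4 import BeautifulSoup

# Made-up HTML; only the class name comes from the question above.
html = '<div class="s-desktop-content other"><span class="title">Item</span></div>'
soup = BeautifulSoup(html, "html.parser")

# Option 1: the class_ keyword argument (note the trailing underscore).
by_keyword = soup.find(class_="s-desktop-content")

# Option 2: an attrs dictionary, the same style used for data-word.
by_attrs = soup.find("div", attrs={"class": "s-desktop-content"})

print(by_keyword.find("span", class_="title").get_text())
```

Both options match an element even when it has several classes, as long as one of them is the class you asked for.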

DynamicSquid (5027)

Could you link a demo repl to show how it works?