Beginner web scraping with Python and Repl.it
GarethDwyer1 (295)

Hey all,

I wrote a beginner's tutorial on web scraping using Python on Repl. You can find it here: https://www.codementor.io/garethdwyer/beginner-web-scraping-with-python-and-repl-it-nzr27jvnq

I'll be working on a more advanced one soon if this one is too easy!

g7kse (2)

Nice tutorial. Any chance you can do one on getting data from APIs? I've got myself really stuck with one Repl that's nearly there but hasn't quite made it.

amasad (3455)

@g7kse can you be a bit more specific? If you want, ask your question on the ask board and leave a link to your repl.

g7kse (2)

@amasad I'll stick it on the 'ask' board

cuber1515 (62)

This was great! I've been looking for a good web scraping tutorial, so thanks.
I also had a question. I'm making a web scraper where the user can put in the URL of the website they want to scrape from and then the element they want to scrape, but I have a problem. If you're doing links where you want to find the href, you put:

for links in link:

but I don't know what to put for a <p> element. I took a guess and tried with the class attribute (that was another thing I was doing; you can search for certain attributes). This is what it was:

for paragraph in ELEMENTS:
  print(Fore.WHITE + paragraph.get("class"))

That obviously didn't work. If you want to see the code, here's the spotlight page.

P.S. If you have any tutorials on attaching info from a web scraper to a website, could you send me the link?


GarethDwyer1 (295)

@cuber1515 hey! Glad you enjoyed it. We did this one recently, which shows how to hook up BeautifulSoup and Flask: https://docs.replit.com/tutorials/22-personal-stock-market-dashboard

I'm not sure exactly what you're trying to print, but you can use .getText() if you just want to print the contents of the paragraph. See https://stackoverflow.com/questions/12451997/beautifulsoup-gettext-from-between-p-not-picking-up-subsequent-paragraphs

Or, if you're trying to find only specific paragraphs, you can do something like this: https://stackoverflow.com/questions/5041008/how-to-find-elements-by-class
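
To make both suggestions concrete, here is a minimal sketch. The HTML string and the class names in it are made up for illustration and stand in for whatever page the user enters; it is not the code from the spotlight page. It uses get_text() (the same method as .getText()) and find_all() with class_. One likely reason the original print failed: BeautifulSoup treats class as a multi-valued attribute, so paragraph.get("class") returns a list (or None when the attribute is missing), and adding that to a string such as Fore.WHITE raises a TypeError.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, standing in for the URL a user would enter.
html = """
<html><body>
  <p class="intro lead">Welcome to the page.</p>
  <p>No class on this one.</p>
  <p class="note">See the <a href="/docs">docs</a>.</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
paragraphs = soup.find_all("p")

# Text content of every <p>, including text inside nested tags.
paragraph_texts = [p.get_text() for p in paragraphs]

# "class" is multi-valued: .get("class") gives a list, or None if absent.
paragraph_classes = [p.get("class") for p in paragraphs]

# Only the paragraphs that carry a particular class.
notes = [p.get_text() for p in soup.find_all("p", class_="note")]

# The link case from the question: read the href of each <a>.
hrefs = [a.get("href") for a in soup.find_all("a")]

for text, classes in zip(paragraph_texts, paragraph_classes):
    # Join the class list into a string before printing or concatenating.
    print(text, "|", " ".join(classes or []))
```

Joining the list with " ".join(...) (and falling back to an empty list when the attribute is missing) gives a plain string that can safely be concatenated with a colour code.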