Beginner web scraping with Python and Repl.it
GarethDwyer1

Hey all,

I wrote a beginner's tutorial on web scraping using Python on Repl.it. You can find it here: https://www.codementor.io/garethdwyer/beginner-web-scraping-with-python-and-repl-it-nzr27jvnq

I'll be working on a more advanced one soon if this one is too easy!

Comments
g7kse

Nice tutorial. Any chance you could do one on getting data from APIs? I've got myself really stuck with one Repl that's nearly there but hasn't quite made it.

amasad

@g7kse can you be a bit more specific? If you want, ask your question on the ask board and leave a link to your repl.

g7kse

@amasad I'll stick it on the 'ask' board

cuber1515

This was great! I've been looking for a good web scraping tutorial, so thanks.
I also had a question. I'm making a web scraper where the user enters the URL of the website to scrape and then the element to scrape, but I have a problem. If you're scraping links and want to find the href, you put:

for link in links: print(link.get("href"))

but I don't know what to put for a <p> element. I took a guess and tried the class attribute (that was another thing I was doing; you can search for certain attributes). This is what it was:

for paragraph in ELEMENTS: print(Fore.WHITE + paragraph.get("class"))

That obviously didn't work. If you want to see the code, here's the spotlight page.

P.S. If you have any tutorials on attaching info from a web scraper to a website, could you send me the link?

Thanks

GarethDwyer1

@cuber1515 Hey! Glad you enjoyed it. We did this one recently, which shows how to hook up BeautifulSoup and Flask: https://docs.replit.com/tutorials/22-personal-stock-market-dashboard

I'm not quite sure what you're trying to print, but you can also use .getText() if you just want to print the contents of the paragraph. See https://stackoverflow.com/questions/12451997/beautifulsoup-gettext-from-between-p-not-picking-up-subsequent-paragraphs

Or, if you're trying to find only specific paragraphs, you can do something like this: https://stackoverflow.com/questions/5041008/how-to-find-elements-by-class
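
As a rough sketch of both suggestions, assuming BeautifulSoup 4 with the built-in "html.parser" backend: the HTML string and the class name "intro" below are made up for illustration, and a real scraper would fetch the page first. It also shows one likely reason the class attempt failed: BeautifulSoup returns the class attribute as a list (or None if it is missing), so it can't be added directly to a string like Fore.WHITE.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; a real scraper would download the HTML first.
html = """
<html><body>
  <p class="intro lead">Welcome to the <b>demo</b> page.</p>
  <p>Second paragraph.</p>
  <a href="https://example.com/one">One</a>
  <a href="/two">Two</a>
  <a>No href here</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Links: .get("href") returns the attribute value, or None if it is missing.
hrefs = [a.get("href") for a in soup.find_all("a")]

# Paragraphs: get_text() returns the text inside the tag, nested tags included.
texts = [p.get_text() for p in soup.find_all("p")]

# "class" is a multi-valued attribute, so .get("class") gives a list or None.
# Join it into one string before concatenating it with anything else.
classes = [" ".join(p.get("class") or []) for p in soup.find_all("p")]

# Only the paragraphs that carry a specific class.
intro_texts = [p.get_text() for p in soup.find_all("p", class_="intro")]

print(hrefs)
print(texts)
print(classes)
print(intro_texts)
```

For a tool where the user picks the element, the same pattern should carry over: pass the tag name to find_all() and print get_text() for text, or .get() with the attribute name for attributes.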

cuber1515