Webscraper Template
A simple webscraper using the newspaper3k Python module.
What it does
    import sys
    import newspaper

    url = input("Enter URL of news website to scrape: ")
    print("Loading...")
    paper = newspaper.build(url)
First, it asks the user for the URL of a news site and builds a newspaper source from it.
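One thing that can trip this step up is typing a bare domain with no scheme in front. Here is a minimal sketch of a cleanup step, assuming newspaper.build wants a full URL and that https is a reasonable default. normalize_url is a made-up helper name, not part of newspaper3k.

```python
def normalize_url(raw):
    """Return raw with surrounding whitespace removed and a scheme in front.

    Hypothetical helper, not part of newspaper3k. Assumes https is a
    sensible default when the user leaves the scheme out.
    """
    url = raw.strip()
    if "://" not in url:
        url = "https://" + url
    return url
```

You could then pass normalize_url(url) to newspaper.build instead of the raw input.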
    for i, article in enumerate(paper.articles):
        print(f"{i} {article.title}")
Then it prints every article title with a number in front of it.
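If some entries show up with a blank title, the list gets hard to pick from. This is a small sketch of one way around that, assuming a blank title is possible at this point and that showing the article's URL is an acceptable fallback. format_listing is a hypothetical helper that only reads the .title and .url attributes.

```python
def format_listing(articles):
    """Build one numbered line per article, falling back to the URL.

    Hypothetical helper. It only reads the .title and .url attributes,
    so it works with anything shaped like a newspaper Article.
    """
    lines = []
    for i, article in enumerate(articles):
        # Use the URL when the title is missing or empty
        label = getattr(article, "title", "") or article.url
        lines.append(f"{i} {label}")
    return lines
```

Printing the result is then just a loop over format_listing(paper.articles).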
    def getArticleNum():
        try:
            articleNum = int(input("Please enter the number corresponding to the article you want to view: "))
            return articleNum
        except ValueError:
            # int() raises ValueError for non-numeric input, so ask again
            print("Please enter a number.")
            return getArticleNum()
Then we define a function that asks the user for an article number and asks again if the input isn't a number.
    try:
        article = paper.articles[getArticleNum()]
    except IndexError:
        # The number was outside the list of articles
        print("Sorry, an error occurred while processing your request.")
        sys.exit()
Here we look up the article the user picked. If the number isn't in the list, the program prints an error and exits.
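One catch with indexing the list directly: Python lists accept negative indexes, so entering -1 quietly picks the last article instead of causing an error. Below is a rough sketch of a stricter prompt that loops until the number is in range. ask_for_index and its read parameter are hypothetical names; read is only there so the function can be tried without a keyboard.

```python
def ask_for_index(count, read=input):
    """Keep asking until the user enters an integer in range(count).

    Hypothetical helper. `read` defaults to input() and exists only so
    the function can be exercised without a keyboard.
    """
    if count <= 0:
        raise ValueError("there are no articles to choose from")
    while True:
        raw = read(f"Enter a number from 0 to {count - 1}: ")
        try:
            number = int(raw)
        except ValueError:
            print("Please enter a number.")
            continue
        # Reject negatives too, since list[-1] would silently succeed
        if 0 <= number < count:
            return number
        print("That number is not in the list.")
```

With this in place, the lookup could become paper.articles[ask_for_index(len(paper.articles))], and the loop avoids the recursion that piles up on repeated bad input.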
    article.download()
    article.parse()
    print(article.text)
Then we download the article, parse it, and print out the contents.
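Since downloads can fail, it may help to wrap this last step. This is only a sketch: fetch_text is a hypothetical helper, and it catches Exception broadly because it avoids importing any of the library's own exception types.

```python
def fetch_text(article):
    """Download and parse one article, returning its text or None.

    Hypothetical helper. Catches Exception broadly because this sketch
    does not import the library's own exception types.
    """
    try:
        article.download()
        article.parse()
    except Exception as error:
        print(f"Could not load the article: {error}")
        return None
    return article.text
```

The caller can then check for None and let the user pick a different article instead of crashing.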
Please enjoy!
Sorry about the long install times.
NOTE: This is very glitchy.
Leroy01010
nice!
N3rdL0rd
@Leroy01010 thx i dont have many upvoters here.
ccccssx
Thx
What are some good sites to use? The ones I've tried come back with an error.
@fullstack11235 Sorry about the late response. Some good sites are:
etc...