Webscraper Template
HENRYMARTIN4 (398)

A simple webscraper using the newspaper3k Python module.

What it does

import newspaper, sys
url = input("Enter URL of news website to scrape: ")
print("Loading...")
paper = newspaper.build(url)

First, it asks the user for the URL of a news website and passes it to newspaper.build, which collects the site's articles.
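If the build fails on a URL typed without "http://" or "https://", cleaning up the input first may help. This is only a minimal sketch: normalize_url is a made-up helper, not part of newspaper, and it assumes a missing scheme is what causes the error.

```python
def normalize_url(raw):
    """Strip surrounding whitespace and add "http://" if no scheme was typed.

    Hypothetical helper for illustration; not part of the newspaper module.
    """
    url = raw.strip()
    if "://" not in url:
        url = "http://" + url
    return url
```

You could then call newspaper.build(normalize_url(url)) instead of passing the raw input straight through.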

# Number each article so the user can pick one later
for i, article in enumerate(paper.articles):
	print(str(i) + " " + article.title)

Next, it prints every article title with its number in front.

def getArticleNum():
	try:
		articleNum = int(input("Please enter the number corresponding to the article you want to view: "))
		return articleNum
	except ValueError:
		# int() could not read the input as a number, so ask again
		print("Please enter a number.")
		return getArticleNum()

Then we define a function that asks the user for an article number and asks again if the input is not a number.
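The function above accepts any whole number, even one with no matching article. A stricter version could keep asking until the number is in range. This is just a sketch: ask_article_num, count, and read are names made up for this example, and read only exists so the function can be tried without typing.

```python
def ask_article_num(count, read=input):
    """Keep asking until the user enters a whole number from 0 to count - 1.

    Hypothetical variant of getArticleNum; `read` defaults to input().
    """
    if count <= 0:
        raise ValueError("no articles to choose from")
    while True:
        raw = read("Please enter the number corresponding to the article you want to view: ")
        try:
            num = int(raw)
        except ValueError:
            print("Please enter a number.")
            continue
        if 0 <= num < count:
            return num
        print("Please enter a number between 0 and " + str(count - 1) + ".")
```

With this version you would call ask_article_num(len(paper.articles)) and use the result as the index.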

try:
	article = paper.articles[getArticleNum()]
except IndexError:
	# The number does not match any article in the list
	print("Sorry, an error occurred while processing your request.")
	sys.exit()

Here, we look up the article the user picked and exit if the number does not match any article.

article.download()
article.parse()
print(article.text)

Then we download the article, parse it, and print out the contents.
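Downloading or parsing can fail on some pages, so it may be worth wrapping those two calls. A rough sketch, assuming only that the object has the download(), parse(), and text members used above; fetch_text is a made-up helper name.

```python
def fetch_text(article):
    """Download and parse an article, returning its text or None on failure.

    Hypothetical helper; works with any object that has download(),
    parse(), and a text attribute.
    """
    try:
        article.download()
        article.parse()
    except Exception as err:
        # Report the problem instead of crashing the whole script
        print("Could not load article: " + str(err))
        return None
    return article.text
```

The script could then print the result of fetch_text(article) only when it is not None.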
Please enjoy!

Sorry about the long install times.

NOTE: This is very glitchy.

Comments
fullstack11235 (0)

What are some good sites to use? The ones I've tried come back with an error.