Webscraper Template
N3rdL0rd

A simple webscraper using the newspaper3k Python module.

What it does

    import newspaper, sys

    url = input("Enter URL of news website to scrape: ")
    print("Loading...")
    paper = newspaper.build(url)

This first asks the user for the URL of a news website, then builds a newspaper source from it.
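If you type a bare address like bbc.com, the build step may fail. Here is a minimal sketch of tidying the input first, assuming the library wants a full URL including the scheme. The helper name normalize_url is made up for this example and is not part of newspaper3k.

```python
def normalize_url(raw):
    """Strip surrounding whitespace and add a scheme if none was typed."""
    url = raw.strip()
    # Assumption: a full URL such as https://example.com is expected,
    # so prepend https:// when the input has no scheme at all.
    if "://" not in url:
        url = "https://" + url
    return url


cleaned = normalize_url("  bbc.com ")
untouched = normalize_url("http://cnn.com")
```

You could then pass normalize_url(url) to newspaper.build instead of the raw input.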

    # Print each article title next to its index
    for i, article in enumerate(paper.articles):
        print(f"{i} {article.title}")

Then it prints every article title next to its index.
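Some entries may come back without a title, which leaves a blank line in the list. One possible way around that is sketched below: build the lines in a small function and fall back to the article's URL. The function name format_titles is hypothetical, and the sample objects are simple stand-ins rather than real articles.

```python
from types import SimpleNamespace


def format_titles(articles):
    """Return one "index label" line per article, using the URL if the title is empty."""
    lines = []
    for i, article in enumerate(articles):
        # An empty or missing title is falsy, so the URL is used instead.
        label = article.title or article.url
        lines.append(f"{i} {label}")
    return lines


# Stand-in objects with the two attributes the function reads.
samples = [
    SimpleNamespace(title="Hello", url="https://example.com/a"),
    SimpleNamespace(title="", url="https://example.com/b"),
]
listing = format_titles(samples)
for line in listing:
    print(line)
```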

    def getArticleNum():
        try:
            articleNum = int(input("Please enter the number corresponding to the article you want to view: "))
            return articleNum
        except ValueError:
            # int() raises ValueError on non-numeric input, so ask again
            print("Please enter a number.")
            return getArticleNum()

Next, we define a function that asks the user for an article number and asks again if the input isn't a number.
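The same idea can be written with a loop instead of recursion, and it can also reject numbers that aren't in the list. This is only a sketch of an alternative, not the code the template runs. The name ask_for_index and the read parameter are made up here so the function can be tried without a keyboard.

```python
def ask_for_index(count, read=input):
    """Keep asking until the reply is a whole number in range(count).

    Assumes count is at least 1; with an empty list no reply is ever valid.
    """
    while True:
        reply = read(f"Enter an article number (0 to {count - 1}): ")
        try:
            number = int(reply)
        except ValueError:
            print("Please enter a number.")
            continue
        if 0 <= number < count:
            return number
        print("That number is not in the list.")


# Scripted replies stand in for the user: two bad answers, then a good one.
replies = iter(["abc", "7", "2"])
choice = ask_for_index(3, read=lambda prompt: next(replies))
```

In the real script you would call ask_for_index(len(paper.articles)) and let it read from the keyboard.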

    try:
        article = paper.articles[getArticleNum()]
    except IndexError:
        # The number was past the end of the article list
        print("Sorry, an error occurred while processing your request.")
        sys.exit()

Here, we look up the article the user picked, and exit with a message if that number isn't in the list.

    article.download()
    article.parse()
    print(article.text)

Then we download the article, parse it, and print out the contents.
Please enjoy!

Sorry about the long install times.

NOTE: This is very glitchy.
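One way to soften the glitchiness is to wrap the download and parse step so a failure prints a message instead of crashing. The sketch below assumes the article object has download(), parse(), and text, as in the code above. The names fetch_text and FlakyArticle are invented for this example, and FlakyArticle is a fake used only to show the retry.

```python
def fetch_text(article, attempts=2):
    """Try download() and parse() up to `attempts` times; return the text or None."""
    for _ in range(attempts):
        try:
            article.download()
            article.parse()
            return article.text
        except Exception as error:  # broad on purpose: failure types vary
            print(f"Could not load article: {error}")
    return None


class FlakyArticle:
    """Fake article whose first download fails and whose second succeeds."""

    def __init__(self):
        self.calls = 0
        self.text = ""

    def download(self):
        self.calls += 1
        if self.calls == 1:
            raise IOError("simulated network error")

    def parse(self):
        self.text = "Article body"


result = fetch_text(FlakyArticle())
```

With a real article you would print the result only if it is not None.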

fullstack11235

What are some good sites to use? The ones I've tried come back with an error.

N3rdL0rd

@fullstack11235
Sorry about the late response. Some good sites are:

  • BBC
  • CNN
  • Minecraft.net/blog
    etc...
fullstack11235

@N3rdL0rd
thanks! better late than never ;)