
    Earn 4,500 ($45.00)

    Time Remaining: due 2 years ago
    Completed

    Simple NestJS RPC Microservice to wrap Two Wikipedia API Calls

    DanBarrett
    Posted 2 years ago
    This Bounty has been completed!

    Bounty Description

    Problem Description

    Create a small NestJS microservice that calls the Wikipedia API to fetch the identifiers (URLs or titles) of popular pages, and the structured content of a page given its URL or id. Map the result into an array of the main body text (broken roughly, if imperfectly, into Wikipedia paragraphs/sections) and an array of any image URLs present on the page.

    Acceptance Criteria

    • Code is written in TypeScript using the NestJS framework and uses axios to make API calls
    • The NestJS app can be minimal (no more than a simple skeleton init, with a single RPC controller and a single service)
    • The RPC controller should expose two methods:
        • fetch a list of popular Wikipedia page identifiers (i.e. id, URL, title)
        • fetch a full page's structured content given its id or URL
    • Map the result into an array of text (paragraphs) and an array of image URLs
    • Add one set of integration tests (a spec file using the NestJS Test module) that actually calls out to the real API (we will run this infrequently, but on an ongoing basis, as a healthcheck)
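
    A minimal sketch of such a spec file, assuming the WikipediaService shape from the pseudocode further down (the file name, describe blocks, and assertions are illustrative, not prescriptive):

    ```typescript
    // wikipedia.service.integration.spec.ts -- hits the real Wikipedia API,
    // so run it sparingly (e.g. as a scheduled healthcheck), not in every CI run.
    import { Test, TestingModule } from '@nestjs/testing';
    import { WikipediaService } from './wikipedia.service';

    describe('WikipediaService (integration)', () => {
      let service: WikipediaService;

      beforeAll(async () => {
        const module: TestingModule = await Test.createTestingModule({
          providers: [WikipediaService],
        }).compile();
        service = module.get(WikipediaService);
      });

      it('fetches a non-empty list of popular pages', async () => {
        const pages = await service.getPopularPages();
        expect(pages.length).toBeGreaterThan(0);
        expect(pages[0].title).toBeDefined();
      });

      it('fetches the content of the first popular page', async () => {
        const pages = await service.getPopularPages();
        const content = await service.getPageContent(pages[0].title);
        expect(content.length).toBeGreaterThan(0);
      });
    });
    ```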

    Technical Details

    • Language: TypeScript
    • Framework: NestJS
    • HTTP client: axios

    Pseudocode example of the service in question; wrap this in an RPC controller and make it available to the network (may not work as written):

    import { Injectable } from '@nestjs/common';
    import axios from 'axios';

    @Injectable()
    export class WikipediaService {
      // Fetch the list of most-viewed pages from the Wikipedia API.
      async getPopularPages(): Promise<WikipediaPage[]> {
        const response = await axios.get(
          'https://en.wikipedia.org/w/api.php?action=query&list=mostviewed&format=json',
        );
        return response.data.query.mostviewed;
      }

      // Fetch the raw wikitext of a page's latest revision by title.
      async getPageContent(pageTitle: string): Promise<string> {
        const response = await axios.get(
          `https://en.wikipedia.org/w/api.php?action=query&titles=${encodeURIComponent(pageTitle)}&prop=revisions&rvprop=content&format=json`,
        );
        // The API keys the result by page id, so take the first (only) entry.
        const pages = response.data.query.pages;
        const pageId = Object.keys(pages)[0];
        return pages[pageId].revisions[0]['*'];
      }
    }

    export interface WikipediaPage {
      pageid: number;
      ns: number;
      title: string;
      count: number;
      created: string;
    }
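
    One way to wrap the service in an RPC controller is NestJS's microservice transport with @MessagePattern; a minimal sketch, where the pattern names ('popular_pages', 'page_content') are illustrative assumptions, and extractParagraphs/extractImages are the helpers pseudocoded further down:

    ```typescript
    // wikipedia.controller.ts -- exposes the service over NestJS RPC.
    import { Controller } from '@nestjs/common';
    import { MessagePattern, Payload } from '@nestjs/microservices';
    import { WikipediaService, WikipediaPage } from './wikipedia.service';
    // Helpers sketched later in this bounty description.
    import { extractParagraphs, extractImages } from './wikitext';

    @Controller()
    export class WikipediaController {
      constructor(private readonly wikipediaService: WikipediaService) {}

      @MessagePattern('popular_pages')
      getPopularPages(): Promise<WikipediaPage[]> {
        return this.wikipediaService.getPopularPages();
      }

      @MessagePattern('page_content')
      async getPageContent(@Payload() pageTitle: string) {
        const content = await this.wikipediaService.getPageContent(pageTitle);
        // Map the raw wikitext into the shape the bounty asks for:
        // an array of paragraphs and an array of image references.
        return {
          paragraphs: extractParagraphs(content),
          images: extractImages(content),
        };
      }
    }
    ```

    The app itself would then be bootstrapped with NestFactory.createMicroservice (e.g. with Transport.TCP) instead of the usual HTTP NestFactory.create.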

    Pseudocode example of extracting paragraphs from the page content (may not work as written):

    function extractParagraphs(content: string): string[] {
      const lines = content.split('\n');
      const paragraphs: string[] = [];
      let currentParagraph = '';
      for (const line of lines) {
        // Stop at the first '==' section heading (keeps only the lead section).
        if (line.startsWith('==')) {
          break;
        }
        // Skip headings, list items, and template/infobox markup.
        if (line.startsWith('=') || line.startsWith('*') || line.startsWith('{')) {
          continue;
        }
        if (line === '') {
          // A blank line ends the current paragraph; ignore repeated blanks.
          if (currentParagraph !== '') {
            paragraphs.push(currentParagraph);
            currentParagraph = '';
          }
        } else {
          currentParagraph += `${line}\n`;
        }
      }
      if (currentParagraph !== '') {
        paragraphs.push(currentParagraph);
      }
      return paragraphs;
    }

    Pseudocode example of extracting image file names from the response (may not work as written):

    function extractImages(content: string): string[] {
      const images: string[] = [];
      // Match [[File:Name.ext]] or [[File:Name.ext|options...]] links and
      // capture just the file name. (The original string-based replace of
      // '|.*]]' was a literal, not a regex, so it never stripped anything.)
      const imageRegex = /\[\[File:(.*?)(?:\|.*?)?\]\]/g;
      let match: RegExpExecArray | null;
      while ((match = imageRegex.exec(content)) !== null) {
        images.push(match[1]);
      }
      return images;
    }
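
    The acceptance criteria ask for image URLs rather than bare file names; one way to bridge that gap is Wikipedia's Special:FilePath redirect, which resolves a file name to the actual file. A sketch, where the helper name is an illustrative assumption:

    ```typescript
    // Turn a wikitext file name (e.g. "Albert Einstein 1921.jpg") into a
    // fetchable URL via the Special:FilePath redirect.
    function imageNameToUrl(imageName: string): string {
      // Wikipedia titles use underscores in place of spaces.
      const normalized = imageName.trim().replace(/ /g, '_');
      return `https://en.wikipedia.org/wiki/Special:FilePath/${encodeURIComponent(normalized)}`;
    }
    ```

    For example, imageNameToUrl('Albert Einstein 1921.jpg') yields https://en.wikipedia.org/wiki/Special:FilePath/Albert_Einstein_1921.jpg, which redirects to the hosted image.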

    Link to Project

    // no public-facing link yet, email me with any questions you have