    Earn 54,000 ($540.00)

    Time Remaining: due 2 years ago
    Completed

    HTML + LLM Question and Answer Extractor

    AlexReibman
    Posted 2 years ago
    This Bounty has been completed!

    Bounty Description

    This is a summarized view. Please see our Notion page for a complete overview.

    We are developing a Chrome extension that uses LLMs to answer questions in HTML-based forms based on a database of questions and answers. The goal is to create a form-agnostic solution that can work on any human-readable HTML form.

    Overview

    The product should capture the HTML document of a live web page, filter out unnecessary elements, and send the filtered DOM elements to a backend, where an LLM extracts the questions and answers. Using an LLM prompt, the backend should then find every field and its corresponding question and return a list of DOM element IDs paired with those questions.
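
    The filtering step described above could be sketched roughly as follows, using only Python's standard html.parser module. The allow-lists (KEEP_TAGS, KEEP_ATTRS, SKIP_CONTENT) and the names FormPruner and prune_html are illustrative assumptions, not part of the bounty specification; a real form may need different tags and attributes kept.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allow-lists for illustration; tune them for the forms being targeted.
KEEP_TAGS = {"form", "label", "input", "select", "option",
             "textarea", "fieldset", "legend", "button"}
KEEP_ATTRS = {"id", "name", "type", "for", "aria-label",
              "aria-labelledby", "placeholder"}
SKIP_CONTENT = {"script", "style", "noscript", "svg"}  # drop these entirely
VOID_TAGS = {"input"}  # kept tags that never have a closing tag


class FormPruner(HTMLParser):
    """Keeps form-related tags, a few attributes, and visible text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a SKIP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in KEEP_TAGS:
            return
        kept = "".join(
            f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name in KEEP_ATTRS and value
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in KEEP_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Keep visible text (question wording often lives in plain divs/spans),
        # with runs of whitespace collapsed to save tokens.
        text = " ".join(data.split())
        if text and not self.skip_depth:
            self.out.append(escape(text, quote=False))


def prune_html(html_doc: str) -> str:
    """Return a minified copy of html_doc containing only form-relevant markup."""
    parser = FormPruner()
    parser.feed(html_doc)
    parser.close()
    return "".join(parser.out)
```

    Wrapper elements such as div and span are dropped but their text is kept, since many form builders render question text outside label elements. Whether that loses too much structure for the LLM is something to check against the test sites.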

    Deliverables

    1. Create a JavaScript function for a Chrome extension content script that:

      1. Captures a snapshot of the current webpage's HTML and sends it to a backend service.
    2. Create a backend function or service (JavaScript or Python) that:

      1. Filters out unnecessary elements/tokens from the HTML snapshot.
      2. Sends the pruned HTML to an LLM prompted to create a list/JSON/YAML of all question+answer pairs.
      3. Matches the extracted question+answer pairs against a bank of pre-answered questions and answers.
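
    The matching step in the last deliverable might look something like the sketch below. It uses difflib.SequenceMatcher from the standard library as a simple stand-in for similarity scoring; the bounty does not say how matching should be done, and an embedding-based or LLM-based comparison may well work better. The function names, the record shape (element_id, question), and the 0.6 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace before comparing."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def match_questions(extracted, bank, threshold=0.6):
    """Pair each extracted question with the closest pre-answered question.

    extracted: list of {"element_id": str, "question": str} (assumed shape)
    bank:      dict mapping a known question to its stored answer
    Returns one result per extracted item; "answer" is None when no bank
    entry scores at or above the threshold.
    """
    results = []
    for item in extracted:
        best_question, best_score = None, 0.0
        for known in bank:
            score = SequenceMatcher(
                None, normalize(item["question"]), normalize(known)
            ).ratio()
            if score > best_score:
                best_question, best_score = known, score
        matched = best_question is not None and best_score >= threshold
        results.append({
            "element_id": item["element_id"],
            "question": item["question"],
            "answer": bank[best_question] if matched else None,
            "score": round(best_score, 3),
        })
    return results
```

    Returning None below the threshold, rather than the best available guess, keeps the extension from filling a field with an unrelated answer.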

    Requirements & Constraints

    • Return a list of unique DOM element identifiers for the question and answer elements.
    • Ensure functionality runs error-free on all web browsers and websites.
    • Must stay within the 4k-token context limit.
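
    One way to respect the token limit is to split the pruned HTML into chunks that each fit a budget, as in the sketch below. The figure of roughly 4 characters per token is only a rough rule of thumb and an assumption here; an exact count needs the tokenizer of the model actually being called. The budget passed in should also leave room for the prompt and the model's response. The names estimate_tokens and chunk_fragments are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude token estimate; assumes about 4 characters per token."""
    return math.ceil(len(text) / chars_per_token)


def chunk_fragments(fragments, max_tokens):
    """Greedily pack HTML fragments into chunks of at most max_tokens each.

    fragments: list of strings (for example, one per form field or tag)
    Raises ValueError if a single fragment cannot fit in the budget.
    """
    chunks, current, used = [], [], 0
    for fragment in fragments:
        cost = estimate_tokens(fragment)
        if cost > max_tokens:
            raise ValueError("a single fragment exceeds the token budget")
        if current and used + cost > max_tokens:
            # Current chunk is full: close it and start a new one.
            chunks.append("".join(current))
            current, used = [], 0
        current.append(fragment)
        used += cost
    if current:
        chunks.append("".join(current))
    return chunks
```

    Each chunk can then be sent to the LLM separately and the extracted pairs merged afterwards.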

    Assumptions & Dependencies

    • The solution must be agnostic to different website styles for hosting fillable fields.
    • Prefer YAML formatting for LLM inputs/outputs to save tokens.
    • Access to GPT-4 and Claude v1.3 API keys will be provided upon request.
    • Test websites: Google Forms and Shipping Site (click on "Enter Document Details").
    • Suggested approaches: DOM minification, ARIA screen reader labels, and Cheerio for filtering out unnecessary tags.
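
    As an illustration of the ARIA-label suggestion, the sketch below collects a candidate question for each field from its aria-label attribute or from a label element whose for attribute points at it. This is a heuristic pre-pass under assumed names (LabelCollector, field_questions), not the bounty's required method. It does not resolve aria-labelledby references or fields nested inside a label without a for attribute, both of which real forms may rely on.

```python
from html.parser import HTMLParser

FIELD_TAGS = {"input", "select", "textarea"}


class LabelCollector(HTMLParser):
    """Records field ids, their aria-label values, and <label for=...> text."""

    def __init__(self):
        super().__init__()
        self.fields = {}          # field id -> aria-label value or None
        self.label_text = {}      # target field id -> accumulated label text
        self._current_for = None  # id targeted by the label being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in FIELD_TAGS and attributes.get("id"):
            self.fields[attributes["id"]] = attributes.get("aria-label")
        elif tag == "label" and attributes.get("for"):
            self._current_for = attributes["for"]
            self.label_text[self._current_for] = ""

    def handle_endtag(self, tag):
        if tag == "label":
            self._current_for = None

    def handle_data(self, data):
        if self._current_for is not None:
            self.label_text[self._current_for] += data


def field_questions(html_doc: str) -> dict:
    """Map each field id to its best-guess question text, or None if unknown."""
    collector = LabelCollector()
    collector.feed(html_doc)
    collector.close()
    questions = {}
    for field_id, aria_label in collector.fields.items():
        text = aria_label or collector.label_text.get(field_id, "")
        questions[field_id] = " ".join(text.split()) or None
    return questions
```

    Fields that come back as None are the ones that still need the LLM to work out which question they belong to.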