Earn 18,000 ($180.00)
Prototype of AI synthesis tool
Bounty Description
Problem Description
We need a bespoke prototype capable of automatically synthesizing various forms of research data (user transcripts, product feedback, usability testing notes, survey results, etc.) into a unified set of user needs, pain points, and feedback. Our goal is to streamline the process of turning raw qualitative data into actionable insights.
At present:
- We have scattered research findings in multiple formats (PDF, text documents, spreadsheets).
- Our team manually reviews and categorizes feedback into “Needs,” “Pains,” and “Feedback,” which is time-consuming and prone to inconsistencies.
- We want a scalable approach that can handle future data growth and provide clear output for product teams to act on.
Acceptance Criteria
1. Data Ingestion & Parsing
• The prototype should accept multiple data formats (e.g., .pdf, .txt, .csv), parse the content, and identify relevant segments (quotes, sentences, or paragraphs).
• Must handle at least three different source formats (e.g., transcripts, spreadsheets, freeform text documents).
2. Classification into Needs, Pains, Feedback (and potentially others)
• Automatically categorize or label each piece of data into the correct category.
• Provide a clear UI or output showing which snippet belongs to which category.
3. Insights Dashboard or Output
• Present the categorized insights in a simple dashboard or structured format (e.g., JSON, table, or minimal web UI).
• Must include metadata such as confidence score, source of data, and timestamp (if available).
4. Customizable Taxonomies/Tags
• Allow adding or adjusting categories as needed (e.g., “Feature Requests,” “Documentation Gaps,” etc.).
5. Quality & Accuracy
• Categorization must be sufficiently accurate; 80% classification accuracy is an acceptable baseline for the prototype.
• It must be clear how to retrain or refine the classifier as new data arrives or when performance needs tuning.
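To make criteria 2–3 concrete, one categorized record in the JSON output might look like the sketch below. The field names (`snippet`, `category`, `confidence`, `source`, `timestamp`) are illustrative only; the bounty requires that a category, confidence score, data source, and timestamp be present, not this exact schema.

```python
import json

# Illustrative output record for one classified snippet.
# Field names are a suggestion, not a required schema.
record = {
    "snippet": "I couldn't find where to export my data.",
    "category": "Pains",
    "confidence": 0.87,                    # classifier confidence score
    "source": "usability-notes.pdf",       # originating file
    "timestamp": "2024-05-02T14:31:00Z",   # null if unavailable
}

print(json.dumps(record, indent=2))
```

A full run would emit a list of such records (or rows in a database table), one per classified snippet.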
Technical Details
• Languages & Frameworks:
• Preferred stack is Python or Node.js for backend processing, but open to other suggestions.
• Simple web-based UI (React, Vue, or a minimal HTML/JS interface) is sufficient.
• NLP Approach:
• Can leverage existing NLP libraries (e.g., spaCy, NLTK, Hugging Face Transformers) to perform the classification tasks.
• Each piece of feedback should be assigned to a category using either rule-based tagging, a trained model, or a combination of both.
• Input & Output:
• Must accept multiple file uploads (PDF, CSV, etc.).
• Output should be stored either in a database (MongoDB, Postgres, SQLite, etc.) or made available as downloadable JSON.
• Deployment:
• Should run locally (via Replit or a Docker container) with minimal setup.
• Provide instructions on how to install dependencies, run the service, and test the prototype.
• Version Control & Documentation:
• Use Git for version control.
• Include clear README with instructions on usage, dependency management, and any environment variables required.
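For the rule-based option mentioned under NLP Approach, a minimal starting point is a keyword lexicon per category; the taxonomy dict below also doubles as the customizable category list from criterion 4, since adding a category is just a new entry. This is a sketch, not a required design, and all category names and keywords here are placeholders:

```python
# Minimal rule-based tagger: each category maps to a keyword lexicon.
# Adding a category (e.g., "Feature Requests") is just a new dict entry.
TAXONOMY = {
    "Needs":    ["need", "want", "would like", "wish"],
    "Pains":    ["frustrating", "confusing", "slow", "broken", "couldn't"],
    "Feedback": ["love", "great", "suggest", "prefer"],
}

def classify(snippet: str, taxonomy=TAXONOMY) -> tuple[str, float]:
    """Return (category, confidence) for a snippet.

    Confidence is the share of keyword hits belonging to the winning
    category -- a stand-in for a trained model's probability.
    """
    text = snippet.lower()
    hits = {cat: sum(kw in text for kw in kws) for cat, kws in taxonomy.items()}
    total = sum(hits.values())
    if total == 0:
        return ("Uncategorized", 0.0)
    best = max(hits, key=hits.get)
    return (best, hits[best] / total)

print(classify("The export button is confusing and slow."))  # matches "Pains"
```

In practice the rule-based pass could serve as a fallback or bootstrap for a trained model (e.g., a zero-shot Transformers pipeline), which is the combination approach the spec allows.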
Mockups available.