Earn 10,800 ($108.00)
due 12 months ago
Canceled
PDF Scraper Python Script
brwr
Applications: 19
Bounty Description
Problem Description
- I need a Python script that can preprocess PDFs for me
- There is a very specific format that the PDFs are in (https://drive.google.com/file/d/1YOQomtvldfZpTghXkoy6CiGrAMPB7jfU/view?usp=share_link)
- I need to generate a CSV from this data. Here is an example for one school district as well as one school: (https://docs.google.com/spreadsheets/d/e/2PACX-1vQUzRDajXotn9Qb7wMkfpt5tVivhqQLz6bBLye6QS9p4dGegBu9aU734byjcU1ijwv0jBNkjnO8ahhU/pubhtml)
- I am trying to make embeddings for a collection of 100 of these PDFs, and need them all preprocessed for it.
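For applicants, here is a minimal sketch of how such a script might be structured. It is not a working solution for the linked PDFs. It assumes the third-party pypdf package is installed and that the PDFs contain extractable text rather than scanned images. The function names (`extract_text`, `text_to_rows`, `write_csv`, `convert_folder`) are hypothetical, and the rule of splitting each line on runs of two or more spaces is only a placeholder: the real parsing rules would have to match the PDF layout and the column structure of the example spreadsheet.

```python
import csv
import io
import re
from pathlib import Path


def extract_text(pdf_path):
    """Return the text of every page in a PDF, joined by newlines.

    Assumption: the third-party pypdf package is installed (pip install pypdf)
    and the PDF has a text layer.
    """
    from pypdf import PdfReader  # imported lazily so the helpers below work without it

    reader = PdfReader(str(pdf_path))
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def text_to_rows(text):
    """Split each non-empty line into cells on runs of two or more spaces.

    Placeholder heuristic only; the real PDFs will need rules matched
    to their actual layout.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(r"\s{2,}", line))
    return rows


def write_csv(rows, out_file):
    """Write rows to an already-open text file object as CSV."""
    csv.writer(out_file, lineterminator="\n").writerows(rows)


def convert_folder(pdf_dir, csv_dir):
    """Convert every *.pdf in pdf_dir to a same-named CSV in csv_dir."""
    csv_dir = Path(csv_dir)
    csv_dir.mkdir(parents=True, exist_ok=True)
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        rows = text_to_rows(extract_text(pdf_path))
        out_path = csv_dir / (pdf_path.stem + ".csv")
        with open(out_path, "w", newline="", encoding="utf-8") as f:
            write_csv(rows, f)
```

Calling `convert_folder("pdfs", "csvs")` (both folder names are placeholders) would process the whole collection in one pass. Keeping extraction, parsing, and writing in separate functions means only `text_to_rows` needs to change once the actual PDF structure is known.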
Acceptance Criteria
- When I test the script on another PDF with the same structure, it generates satisfactory results in a CSV.
- 2x bounty tip if I accept the solution within 24 hours of you starting