Earn 22,500 ($225.00)
Create an AI workflow or agent to scrape car data from specific URLs
Bounty Description
Problem Description
I want a bot, agent, or system that scrapes specific data from a provided list of car manufacturers' websites and returns the data I need for every car they offer, including each version of each car. These are the websites I need data from (others may be added later):
https://www.honda.mx/acura
https://www.audi.com.mx/es/
https://www.motornation.com.mx/acerca-de-baic
https://www.bmw.com.mx/es/index.html
https://www.buick.com.mx/
https://www.byd.com/mx
https://www.cadillac.com.mx/
https://www.chevrolet.com.mx/
https://www.chrysler.com/mx/
https://www.dodge.com/mx/
https://www.fiat.com.mx/
https://www.ford.mx/
https://www.gmc.com.mx/
https://www.gwm-mx.com/es
https://www.honda.mx/
https://www.hyundai.com.mx/
https://www.jac.mx/
https://www.jeep.com.mx/
https://www.kia.com/mx/main.html
https://www.lincoln.mx/
https://www.mgmotor.com.mx/
https://www.mazda.mx/
https://www.mercedes-benz.com.mx/es/passengercars.html
https://www.mini.com.mx/es_MX/home.html
https://www.mitsubishi-motors.mx/
https://www.nissan.com.mx/
https://www.peugeot.com.mx/
https://www.ram.com/mx/
https://www.renault.com.mx/
https://www.seat.mx/
https://www.subaru.com.mx/
https://www.suzuki.com.mx/
https://www.toyota.mx/
https://www.vw.com.mx/es.html
https://www.volvocars.com/mx/
This is the information I need for each car and each of its versions:
car_name
version_name
website
price
engine_summary ('Engine' + Size in L + ',' + displacement + ',' + valves)
max_speed (km/h)
acceleration (0-100km/h)
engine_power
engine_torque
turbo (TRUE/FALSE)
transmission
city_efficiency
highway_efficiency
combined_efficiency
front_brakes
rear_brakes
tires
front_suspension
rear_suspension
traction
abs_system (TRUE/FALSE)
traction_control_system (TRUE/FALSE)
height (mm)
length (mm)
width (mm)
wheelbase (mm)
vehicle_weight
fuel_tank (L)
trunk_capacity (L)
headlights
automatic_headlights (TRUE/FALSE)
fog_lights (TRUE/FALSE)
air_conditioning
alarm
automatic_trunk_opening (TRUE/FALSE)
airbags
parking_sensor
rear_camera
automatic_door_lock (TRUE/FALSE)
cruise_control (TRUE/FALSE)
automatic_engine_start (TRUE/FALSE)
mirrors
sunroof (TRUE/FALSE)
wheels
interior_upholstery
power_windows
entertainment_system (Radio/USB, touch/speakers, etc...)
image_1_url
image_2_url
image_3_url
image_4_url
image_5_url
specs_pdf_url
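As a rough sketch of the output shape, the fields above could be written as one CSV row per car version. The helper names (`engine_summary`, `to_row`, `write_csv`), the exact spacing inside the engine summary string, and the choice of CSV as the output format are assumptions for illustration, not requirements of this bounty. Missing values are left blank rather than guessed.

```python
import csv
import io

# Column names in the order listed above (units and notes stripped).
FIELDS = (
    "car_name", "version_name", "website", "price", "engine_summary",
    "max_speed", "acceleration", "engine_power", "engine_torque", "turbo",
    "transmission", "city_efficiency", "highway_efficiency",
    "combined_efficiency", "front_brakes", "rear_brakes", "tires",
    "front_suspension", "rear_suspension", "traction", "abs_system",
    "traction_control_system", "height", "length", "width", "wheelbase",
    "vehicle_weight", "fuel_tank", "trunk_capacity", "headlights",
    "automatic_headlights", "fog_lights", "air_conditioning", "alarm",
    "automatic_trunk_opening", "airbags", "parking_sensor", "rear_camera",
    "automatic_door_lock", "cruise_control", "automatic_engine_start",
    "mirrors", "sunroof", "wheels", "interior_upholstery", "power_windows",
    "entertainment_system", "image_1_url", "image_2_url", "image_3_url",
    "image_4_url", "image_5_url", "specs_pdf_url",
)

# Fields marked (TRUE/FALSE) in the list above.
BOOLEAN_FIELDS = {
    "turbo", "abs_system", "traction_control_system", "automatic_headlights",
    "fog_lights", "automatic_trunk_opening", "automatic_door_lock",
    "cruise_control", "automatic_engine_start", "sunroof",
}


def engine_summary(size_l, displacement, valves):
    """Build 'Engine' + size in L + ',' + displacement + ',' + valves.

    The space after 'Engine' and the 'L' suffix are assumptions.
    """
    return f"Engine {size_l}L,{displacement},{valves}"


def to_row(record):
    """Return a dict with every field present; unknown values stay empty."""
    row = {}
    for name in FIELDS:
        value = record.get(name, "")
        if name in BOOLEAN_FIELDS and isinstance(value, bool):
            value = "TRUE" if value else "FALSE"
        row[name] = value
    return row


def write_csv(records, stream):
    """Write a header plus one line per car version to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for record in records:
        writer.writerow(to_row(record))
```

For example, `write_csv([{"car_name": "Civic", "version_name": "Touring", "turbo": True}], io.StringIO())` produces a header line followed by one row in which `turbo` is `TRUE` and every field that was not scraped is empty.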
Just as an idea, I think a two-agent setup would be ideal: one agent to collect the main URL of every car available on each website, and another to visit each car's page and scrape the required details.
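A minimal sketch of that two-stage idea, under several assumptions: pages are fetched by a caller-supplied `fetch` function, model pages are recognized by a caller-supplied `looks_like_model` rule, and the detail extraction is a placeholder callable `extract_versions` (in practice this could be an LLM call or per-site parsing). All of these names are hypothetical. Many of the listed sites render content with JavaScript, so a real `fetch` would probably need a headless browser rather than a plain HTTP request.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def discover_model_urls(brand_url, html, looks_like_model):
    """Stage 1: return unique same-site links that look like car model pages."""
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(brand_url).netloc
    found = []
    for href in parser.links:
        absolute = urljoin(brand_url, href)
        same_site = urlparse(absolute).netloc == host
        if same_site and looks_like_model(absolute) and absolute not in found:
            found.append(absolute)
    return found


def run_pipeline(brand_urls, fetch, looks_like_model, extract_versions):
    """Run stage 1 on each brand site, then stage 2 on each model page.

    extract_versions(url, html) is expected to return a list of dicts,
    one per version of the car found on that page.
    """
    rows = []
    for brand_url in brand_urls:
        brand_html = fetch(brand_url)
        for model_url in discover_model_urls(brand_url, brand_html, looks_like_model):
            rows.extend(extract_versions(model_url, fetch(model_url)))
    return rows
```

Keeping the two stages as separate functions means the URL discovery rule can be tuned per site without touching the detail extraction, and new sites only need to be added to the input list.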
Acceptance Criteria
The accepted deliverable is a program, bot, or agent that I can run on a website or directly on my computer through a CLI. I send it the URLs to visit, and it returns the car models it finds, each with all of its versions and the correct data for each version.
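For the CLI option, the entry point could look roughly like the sketch below. The flag names (`--url-file`, `--output`), the default output file name, and the script name used in the usage note are assumptions for illustration only.

```python
import argparse
from pathlib import Path


def build_parser():
    """Command-line interface: URLs as arguments and/or listed in a text file."""
    parser = argparse.ArgumentParser(
        description="Scrape car models and versions from manufacturer websites."
    )
    parser.add_argument("urls", nargs="*", help="website URLs to visit")
    parser.add_argument("--url-file", help="text file with one URL per line")
    parser.add_argument("--output", default="cars.csv", help="where to write results")
    return parser


def collect_urls(args):
    """Merge URLs from the command line and the optional file, without duplicates."""
    urls = list(args.urls)
    if args.url_file:
        for line in Path(args.url_file).read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blank lines and lines commented out with '#'.
            if line and not line.startswith("#"):
                urls.append(line)
    unique = []
    for url in urls:
        if url not in unique:
            unique.append(url)
    return unique
```

A run might then look like `python scrape_cars.py https://www.toyota.mx/ https://www.mazda.mx/ --output cars.csv`, where `scrape_cars.py` is a hypothetical script that calls `build_parser().parse_args()` and passes the result to `collect_urls`.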
Technical Details
Preferably done in JavaScript or Python so that I can tweak it myself, but any other language is fine as long as it runs efficiently. It is also OK if I need to supply API keys so that information can be analyzed by hosted models such as OpenAI, Claude, Llama, etc.
If you have any questions at all, please let me know.