by xur17
1180 days ago
I did OCR as a separate step (essentially: 1. load the webpage, 2. take a screenshot, 3. run OCR, 4. feed the OCR output + question into ChatGPT). What does it mean to do it all as one step, and how would I go about doing that with ChatGPT? For more context: I have this set up as an API that I feed a URL + TypeScript definitions to, and have ChatGPT output information from the website in the specified TypeScript definition. For example, I can use {product_price: float, product_name: str} + a URL as the input and fairly accurately get product price info across ALL product websites. It's kind of amazing that it's able to do this much based only on the TypeScript variable names + raw OCR output.
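For anyone curious what step 4 might look like, here is a rough sketch, not the actual code behind the setup described above. It only covers combining the type definition with the OCR text and checking the reply. The names `build_prompt`, `extract`, and `fake_model` are made up for illustration, the prompt wording is a guess, and the real model call is replaced by a stand-in function so the snippet runs offline.

```python
import json
from typing import Callable


def build_prompt(type_definition: str, ocr_text: str) -> str:
    """Combine the requested output shape and the raw OCR text into one prompt."""
    return (
        "Extract the fields described by this type definition from the page text.\n"
        "Reply with a single JSON object and nothing else.\n\n"
        f"Type definition:\n{type_definition}\n\n"
        f"Page text (raw OCR output):\n{ocr_text}"
    )


def extract(
    type_definition: str,
    ocr_text: str,
    ask_model: Callable[[str], str],
    required_keys: list[str],
) -> dict:
    """Send the prompt to the model and check that the reply has the expected keys."""
    reply = ask_model(build_prompt(type_definition, ocr_text))
    data = json.loads(reply)  # raises ValueError if the reply is not valid JSON
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"model reply is missing keys: {missing}")
    return data


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for the real chat model call; returns a canned reply."""
    return '{"product_price": 19.99, "product_name": "Example Widget"}'


result = extract(
    "{product_price: float, product_name: str}",
    "Example Widget $19.99 Add to cart",  # pretend OCR output from a screenshot
    fake_model,
    ["product_price", "product_name"],
)
print(result)
```

Swapping `fake_model` for a function that calls a real chat endpoint would be the only change needed; the key check is there because the model can return malformed or incomplete JSON.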
|
Wait till they make the image input available via the API, I guess