by ilaksh
1018 days ago
Respect for persistence. For the screenshot thing, are you using GPT-4's visual understanding? I assume not, since it's only available to a few groups. Looks like an interesting project. Are you using BLIP-2? I understand if you don't want to give away secrets. Just thought it couldn't hurt to ask.
We only used the standard GPT-4 API (without visual capabilities).
I think it's okay to have some discussion here. We employed traditional OCR, GPT-4, and some of our own algorithms to help GPT-4 understand the context and relationships within the image scene (allow me to retain a bit of mystery here).
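
For readers wondering what "OCR plus a text-only model" can look like in general, here is a minimal sketch. It is not the method described above, which is deliberately left unspecified. It assumes the OCR step has already produced text boxes with pixel coordinates; the `OcrBox` type, the `group_into_rows` and `layout_to_prompt` helpers, and the 10-pixel row tolerance are all hypothetical names and values chosen for illustration. The idea is just to turn box positions into a textual layout description that a model without vision can read.

```python
from dataclasses import dataclass


@dataclass
class OcrBox:
    """One piece of text found by an OCR engine (hypothetical structure)."""
    text: str
    x: int  # left edge, in pixels
    y: int  # top edge, in pixels
    w: int  # width, in pixels
    h: int  # height, in pixels


def group_into_rows(boxes, tolerance=10):
    """Group boxes into rows, top to bottom.

    A box joins the current row when its top edge is within `tolerance`
    pixels of the top edge of that row's first box. The tolerance value
    is an assumption and would need tuning for real screenshots.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: (b.y, b.x)):
        if rows and abs(box.y - rows[-1][0].y) <= tolerance:
            rows[-1].append(box)
        else:
            rows.append([box])
    # Within each row, order the boxes left to right.
    return [sorted(row, key=lambda b: b.x) for row in rows]


def layout_to_prompt(boxes, tolerance=10):
    """Serialize the row layout as plain text for a text-only model."""
    lines = []
    for i, row in enumerate(group_into_rows(boxes, tolerance), start=1):
        cells = " | ".join(f"{b.text} (x={b.x}, y={b.y})" for b in row)
        lines.append(f"row {i}: {cells}")
    return "Screen text, top to bottom, left to right:\n" + "\n".join(lines)


if __name__ == "__main__":
    boxes = [
        OcrBox("Cancel", 200, 102, 60, 20),
        OcrBox("OK", 100, 100, 40, 20),
        OcrBox("Save changes?", 100, 40, 200, 24),
    ]
    print(layout_to_prompt(boxes))
```

The resulting string could be placed in an ordinary text prompt. Anything smarter (detecting which label belongs to which button, nesting, icons) is beyond this sketch.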