by infecto
822 days ago
This should be higher up. This whole blog post is mostly worthless because the way they extract data is far from optimal. Lower-end models do not have the attention to complete tasks like this; GPT-4 Turbo generally does. But for an optimal pipeline you should really split the work into individual units: extract each attribute you want independently, then combine the results however you like. Asking for JSON up front is equally suboptimal. I have high confidence that I could accomplish this task with a lower-end model at a high degree of accuracy.

Edit: I am not suggesting that an LLM is better than whatever traditional parsing methods they may use, simply that the way they are doing it is wrong from an LLM-flow perspective.
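For anyone wondering what "one attribute per call, combine afterwards" might look like, here is a rough sketch. It is only an illustration of the idea, not the commenter's actual pipeline: `ask_model` stands in for whatever LLM client you use, and the attribute names and prompts are made up. Each call asks for a single plain-text answer, and JSON is only built in code at the very end.

```python
import json
from typing import Callable, Dict

# Hypothetical attributes and prompts; the comment above does not name any.
ATTRIBUTE_PROMPTS: Dict[str, str] = {
    "title": "What is the title of the item described below? Reply with the title only.",
    "price": "What is the price of the item described below? Reply with the number only.",
}


def extract_attribute(ask_model: Callable[[str], str], prompt: str, text: str) -> str:
    """Run one narrow extraction: one prompt, one attribute, plain-text answer."""
    return ask_model(f"{prompt}\n\n{text}").strip()


def extract_record(
    ask_model: Callable[[str], str],
    text: str,
    prompts: Dict[str, str] = ATTRIBUTE_PROMPTS,
) -> Dict[str, str]:
    """Extract every attribute with its own independent call, then combine."""
    return {
        name: extract_attribute(ask_model, prompt, text)
        for name, prompt in prompts.items()
    }


def to_json(record: Dict[str, str]) -> str:
    """Serialize in code, so the model is never asked to emit JSON itself."""
    return json.dumps(record, sort_keys=True)


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption), so the sketch runs offline."""
    return " Blue Widget \n" if "title" in prompt else "19.99"


if __name__ == "__main__":
    print(to_json(extract_record(fake_model, "Blue Widget, $19.99")))
```

Swapping `fake_model` for a real client call is the only change needed to try this against an actual model. Whether it beats a single JSON prompt on a given dataset is something you would have to measure.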
Cool, cool. I'm super interested. Please share the process and the results.