by prettyblocks
177 days ago
I don't think tricky niche knowledge is the sweet spot for GenAI, and it likely won't be for some time. Instead, it's a great replacement for rote tasks where a less-than-perfect performance is good enough: transcription, OCR, boilerplate code generation, etc.
So I want to have a general idea of how good it is at this.
I found something that was niche, but not super niche; I could easily find a good, human-written answer in the top couple of results of a Google search.
But until now, all LLM answers I've gotten for it have been complete hallucinated gibberish.
Anyhow, this is a single data point, and I need to expand my set of benchmark questions a bit now, but this is the first time that I've actually seen progress on this particular personal benchmark.