by alandu
814 days ago
We have not come across any benchmark dataset that's actually worth evaluating on, because the questions are not representative of real-world enterprise problems. They don't reflect the degree of context needed to answer domain- or business-specific questions accurately.
https://github.com/defog-ai/sql-eval