by bottlepalm 871 days ago
Is this the data that was used for fine tuning?

https://github.com/defog-ai/sql-eval/blob/main/data/question...


No, those are the benchmark evaluation questions. The fine-tuning dataset was a custom, synthetically generated set of ~20k PostgreSQL text-to-SQL pairs covering different SQL categories and question types.

I mention a little more about it here: https://x.com/calebfahlgren/status/1754247740291207198?s=20

So this is essentially Postgres only? How will it handle, e.g., MS SQL schemas and output?
Currently Postgres only, yes. I'm already working on a dataset with more DDL dialects, such as MySQL, DuckDB, and MSSQL, for a second iteration.
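
To illustrate why the dialect matters, here is a rough sketch of one question answered in two dialects. The question, the `orders` table, the `created_at` column, and the `sql_for` helper are all made up for illustration and are not taken from the fine-tuning dataset. The only point is that row limiting is written as `LIMIT` in Postgres and as `TOP` in MSSQL, so Postgres-style output will not necessarily run unchanged elsewhere.

```python
# Hypothetical illustration: one question, two dialect-specific answers.
# The question, table, and column names are assumptions, not dataset contents.
QUESTION = "What are the five most recent orders?"

SQL_BY_DIALECT = {
    # Postgres limits rows with a trailing LIMIT clause.
    "postgres": "SELECT * FROM orders ORDER BY created_at DESC LIMIT 5;",
    # MSSQL (T-SQL) limits rows with TOP right after SELECT.
    "mssql": "SELECT TOP 5 * FROM orders ORDER BY created_at DESC;",
}


def sql_for(dialect: str) -> str:
    """Return the illustrative query for a dialect, or raise for unknown ones."""
    try:
        return SQL_BY_DIALECT[dialect]
    except KeyError:
        raise ValueError(f"no example for dialect: {dialect}") from None


if __name__ == "__main__":
    for name in SQL_BY_DIALECT:
        print(f"{name}: {sql_for(name)}")
```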