by mierz00
106 days ago
I rate Braintrust highly. It wouldn’t be too difficult to build something like it for your own use, but I found it pretty easy to get datasets set up. It’s essentially a game changer for understanding whether your prompts are working, especially if you’re doing something that requires a high level of consistency. In our case we would use an LLM for classification, which fits in perfectly with evals.
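To make the "build it yourself" option concrete, here is a rough sketch of a home-grown classification eval. It does not use Braintrust's API. The names `run_eval` and `fake_classify`, the labels, and the tiny dataset are all made up for illustration, and `fake_classify` is a stand-in for wherever your real LLM call would go.

```python
from collections import Counter
from typing import Callable


def run_eval(
    classify: Callable[[str], str],
    dataset: list[tuple[str, str]],
    repeats: int = 3,
) -> dict[str, float]:
    """Score a classifier on labeled examples.

    accuracy: share of examples whose majority label matches the expected one.
    consistency: share of examples where every repeated run gave the same label.
    """
    correct = 0
    consistent = 0
    for text, expected in dataset:
        # Call the classifier several times to surface run-to-run variation.
        outputs = [classify(text) for _ in range(repeats)]
        majority_label, _ = Counter(outputs).most_common(1)[0]
        correct += majority_label == expected
        consistent += len(set(outputs)) == 1
    total = len(dataset)
    return {"accuracy": correct / total, "consistency": consistent / total}


def fake_classify(text: str) -> str:
    # Hypothetical placeholder: swap in your prompt plus LLM call here.
    return "refund" if "refund" in text.lower() else "other"


# Hypothetical labeled dataset of (input text, expected label) pairs.
dataset = [
    ("I want a refund", "refund"),
    ("Where is my order?", "shipping"),
    ("Refund please", "refund"),
]

result = run_eval(fake_classify, dataset)
print(result)
```

Rerunning something like this after each prompt change gives you a number to compare rather than a gut feeling, which is most of the value for classification work.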