by fzliu
1168 days ago
Keep up the awesome work. I've run across this problem myself: I somehow spent $20 just testing a small demo I made with GPT-3.5. As most ML is inherently probabilistic, it seems reasonable to make an LLM cache both semantic and _stochastic_, i.e. you wouldn't want the same answer every time you use "pick me a color" as the prompt. Injecting the original LLM (GPT, Bard, etc.) response as the prompt for Alpaca or some other model could make this cache virtually invisible.
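To make the "stochastic" part concrete, here is a rough sketch of one way it could work, not how any particular cache library does it. It keeps a few responses per prompt and samples among them once enough have been collected. Everything here is hypothetical: `StochasticCache`, `max_variants`, and the `llm` callable are names made up for illustration, and the semantic lookup is stubbed out with a normalized exact match rather than an embedding search. The rewrite-with-a-second-model step is not covered.

```python
import random
from collections import defaultdict
from typing import Callable, Optional


class StochasticCache:
    """Toy cache that stores several responses per prompt and samples among them."""

    def __init__(self, llm: Callable[[str], str], max_variants: int = 3,
                 rng: Optional[random.Random] = None):
        self.llm = llm                    # hypothetical callable: prompt -> response
        self.max_variants = max_variants  # responses to collect before serving from cache
        self.rng = rng or random.Random()
        self.store = defaultdict(list)    # normalized prompt -> list of responses
        self.llm_calls = 0                # number of real (paid) calls made so far

    @staticmethod
    def _key(prompt: str) -> str:
        # Stand-in for a semantic lookup (assumption): a case- and
        # whitespace-insensitive exact match instead of an embedding search.
        return " ".join(prompt.lower().split())

    def get(self, prompt: str) -> str:
        variants = self.store[self._key(prompt)]
        if len(variants) < self.max_variants:
            # Still collecting variants: make a real call and remember the answer.
            self.llm_calls += 1
            response = self.llm(prompt)
            variants.append(response)
            return response
        # Enough variants cached: sample one so repeated prompts can differ.
        return self.rng.choice(variants)
```

With `max_variants=3`, the first three "pick me a color" requests hit the model and every later one is served from the cache, but not always with the same answer. A larger `max_variants` trades cost for variety.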