by louiskw
1109 days ago
Generate as few tokens as possible. GPT-4 runs a few times to generate a single answer, so latency quickly becomes the biggest UX issue. We abandoned most of the common thinking around chain-of-thought reasoning, finding it didn't help accuracy much while increasing response times significantly. Full write-up to follow in the next week or so.
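
To make the latency point concrete, here is a rough back-of-envelope sketch. It assumes the calls run one after another and that each call's time is a fixed overhead plus a per-token decode cost. The function name and every number below are hypothetical placeholders, not measurements from our system.

```python
def estimated_latency_s(calls, output_tokens_per_call, seconds_per_token, overhead_s_per_call):
    """Rough end-to-end latency for sequential model calls.

    Simplifying assumption: each call costs a fixed overhead plus a
    decode time that grows linearly with the number of output tokens.
    """
    return calls * (overhead_s_per_call + output_tokens_per_call * seconds_per_token)


# Hypothetical placeholder numbers, chosen only to show the shape of the trade-off.
verbose = estimated_latency_s(
    calls=3, output_tokens_per_call=400, seconds_per_token=0.05, overhead_s_per_call=0.5
)
terse = estimated_latency_s(
    calls=3, output_tokens_per_call=40, seconds_per_token=0.05, overhead_s_per_call=0.5
)

print(f"verbose: {verbose:.1f}s, terse: {terse:.1f}s")
```

Under these assumptions the output-token term dominates, and it is multiplied by the number of calls, which is why long reasoning traces hurt so much when several calls are chained.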