Hacker News
by arthurcolle 25 days ago
I found that keeping current context utilization at 18% of the total context length was best for minimizing spend, across all models with a context length of 400k tokens or more.
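
To put a number on that: 18% of a 400k window is 72,000 tokens. Here is a rough sketch of how that threshold could be checked before each request. The function names and the idea of compacting once you cross the budget are my own assumptions for illustration, not something tied to any particular model API.

```python
# Hypothetical helpers: names and the compaction policy are illustrative only.

def context_budget(context_length: int, target_percent: int = 18) -> int:
    """Return the token budget for a window, using integer math to avoid float rounding."""
    return context_length * target_percent // 100


def should_compact(current_tokens: int, context_length: int, target_percent: int = 18) -> bool:
    """True when the live context exceeds the target share of the window."""
    return current_tokens > context_budget(context_length, target_percent)


if __name__ == "__main__":
    for window in (400_000, 1_000_000):
        print(window, context_budget(window))
    # A 400k window with 80k tokens in use is over the 18% budget.
    print(should_compact(80_000, 400_000))
```

What you do once the check trips (summarize, drop old turns, start a fresh session) is a separate choice and will affect spend on its own.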