Hacker News
by czl 145 days ago
FYI: Newer LLM hosting APIs offer control over the amount of "thinking" (as well as the length of the reply): some by token count, others by an enum (low, medium, high, etc.).
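
To make the two styles concrete, here's a rough sketch of how a request body might carry each kind of setting. The field names (`max_reply_tokens`, `thinking_budget_tokens`, `thinking_effort`) and the `build_request` helper are made up for illustration and don't match any particular provider; check your provider's docs for the real parameter names.

```python
from enum import Enum
from typing import Optional, Union


class Effort(str, Enum):
    """Enum-style control: the provider decides what each level means."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def build_request(
    prompt: str,
    max_reply_tokens: int,
    thinking: Optional[Union[int, Effort]] = None,
) -> dict:
    """Build a JSON-ready body for a hypothetical LLM endpoint.

    `thinking` is either a token budget (int) or an Effort level.
    All field names here are assumptions, not a real API.
    """
    body = {"prompt": prompt, "max_reply_tokens": max_reply_tokens}
    if isinstance(thinking, Effort):
        # Enum-style: coarse level chosen by the caller.
        body["thinking_effort"] = thinking.value
    elif isinstance(thinking, int):
        # Token-count style: explicit cap on reasoning tokens.
        if thinking < 0:
            raise ValueError("thinking budget must be non-negative")
        body["thinking_budget_tokens"] = thinking
    return body


# Token-count style vs. enum style for the same prompt.
by_tokens = build_request("Summarize this thread.", 256, thinking=1024)
by_enum = build_request("Summarize this thread.", 256, thinking=Effort.LOW)
```

Either way the reply length is capped separately from the thinking, so you can ask for lots of reasoning and a short answer, or the reverse.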