
An "LLM crawler app" is needed, in the sense that you should be able to shift tokenized workloads between executors, roughly the way BGP routes traffic between networks...

Think least-cost routing of prompt responses, especially when time-to-respond matters less than precision...
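To make the least-cost idea concrete, here is a minimal sketch in Python. It assumes each executor advertises a price, a precision score, and a typical latency. Everything in it is hypothetical (the `Backend` fields, the `pick_backend` helper, and all the numbers), and it is not based on any real provider's API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Backend:
    """A hypothetical executor; all fields are illustrative assumptions."""
    name: str
    cost_per_1k_tokens: float  # assumed price per 1,000 tokens
    quality: float             # assumed precision score in [0, 1]
    latency_s: float           # assumed typical time-to-respond, in seconds


def pick_backend(
    backends: Sequence[Backend],
    tokens: int,
    min_quality: float,
    max_latency_s: Optional[float] = None,
) -> Optional[Backend]:
    """Return the cheapest backend that meets the precision floor.

    Latency is only a constraint when max_latency_s is given, which
    models the case where time-to-respond matters less than precision.
    """
    eligible = [
        b for b in backends
        if b.quality >= min_quality
        and (max_latency_s is None or b.latency_s <= max_latency_s)
    ]
    if not eligible:
        return None  # nobody can serve this workload under the constraints
    # Least-cost selection: estimated price of this workload on each backend.
    return min(eligible, key=lambda b: b.cost_per_1k_tokens * tokens / 1000)


# Made-up routing table for illustration only.
BACKENDS = [
    Backend("fast-cheap", cost_per_1k_tokens=0.5, quality=0.60, latency_s=1.0),
    Backend("slow-precise", cost_per_1k_tokens=2.0, quality=0.95, latency_s=30.0),
    Backend("fast-precise", cost_per_1k_tokens=4.0, quality=0.90, latency_s=5.0),
]

if __name__ == "__main__":
    # No latency limit: the slow but cheaper precise backend wins.
    print(pick_backend(BACKENDS, tokens=2000, min_quality=0.9))
    # With a 10-second limit, the pricier fast backend is the only option.
    print(pick_backend(BACKENDS, tokens=2000, min_quality=0.9, max_latency_s=10.0))
```

The point of the sketch is that dropping the latency constraint widens the set of eligible executors, so a patient caller can end up on a cheaper route for the same precision floor.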

Also, is there a time-series ability in any LLM (meaning "show me this [thing] based on this [input], but continually updated as I firehose the crap out of it")?

What if you could get execution estimates for a prompt?
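As a rough illustration of what such an estimate could look like, here is a back-of-the-envelope sketch in Python. It assumes about 4 characters per token (a crude rule of thumb for English text, not a real tokenizer), and it assumes the executor publishes a prefill rate, a decode rate, and a price. The `estimate_execution` helper and all of its parameters are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    prompt_tokens: int
    output_tokens: int
    seconds: float
    cost: float


def estimate_execution(
    prompt: str,
    expected_output_tokens: int,
    prefill_tokens_per_s: float,   # assumed prompt-processing throughput
    decode_tokens_per_s: float,    # assumed generation throughput
    price_per_1k_tokens: float,    # assumed flat price for input and output
    chars_per_token: float = 4.0,  # crude heuristic, not a real tokenizer
) -> Estimate:
    """Estimate time and cost for a prompt before running it."""
    # Approximate the token count from the character count.
    prompt_tokens = max(1, math.ceil(len(prompt) / chars_per_token))
    # Time = reading the prompt + generating the response.
    seconds = (
        prompt_tokens / prefill_tokens_per_s
        + expected_output_tokens / decode_tokens_per_s
    )
    total_tokens = prompt_tokens + expected_output_tokens
    cost = total_tokens / 1000 * price_per_1k_tokens
    return Estimate(prompt_tokens, expected_output_tokens, seconds, cost)


if __name__ == "__main__":
    est = estimate_execution(
        prompt="x" * 400,
        expected_output_tokens=200,
        prefill_tokens_per_s=1000.0,
        decode_tokens_per_s=50.0,
        price_per_1k_tokens=2.0,
    )
    print(est)
```

The expected output length is the weak spot: the caller has to guess it up front, so a real estimator would probably return a range and not a single number.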