by D-Machine 102 days ago
I'd tend to agree. The only good points I've seen in this thread were made by @hedgehog [1]:

    I'm not sure about the rest but a significant problem with high frequency tool calling (especially in training) is that it breaks batching.
and then later by @ACCount37 [2]:

    I'm less interested in turning programs into transformers and more interested in turning programs into subnetworks within large language models.
In theory, if you can create a very efficient sub-net that replicates certain tool calls (even one whose weights are manually compiled and frozen during training), it might make inference much more efficient at scale. I have no idea why, in general, you would want to do this through the clunky transformer architecture, though. Just implement a non-trainable, GPU-accelerated layer that does the compute and avoids the tool call.
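To make that last suggestion concrete, here is a rough sketch in plain Python, not anyone's actual implementation. The names `FROZEN_WEIGHTS`, `frozen_layer`, `tool_call` and `via_tool_calls` are all made up for illustration, and the "tool" is a toy that returns the sum and difference of two numbers. The point is only that a fixed-weight layer handles the whole batch in one pass, while the tool-call path has to pull each row out of the batch. In a real framework the layer would presumably be a single batched matmul on the accelerator, with its weights excluded from the optimizer.

```python
from typing import List

Vector = List[float]
Batch = List[Vector]

# Hypothetical hand-compiled weights (never trained):
# row 0 computes a + b, row 1 computes a - b.
FROZEN_WEIGHTS: List[Vector] = [
    [1.0, 1.0],
    [1.0, -1.0],
]


def frozen_layer(batch: Batch) -> Batch:
    """Apply the fixed weights to every row of the batch in one pass.

    Nothing here is updated by training. On real hardware this would be
    one batched matrix multiply rather than nested Python loops.
    """
    return [
        [sum(w * x for w, x in zip(weight_row, row)) for weight_row in FROZEN_WEIGHTS]
        for row in batch
    ]


def tool_call(row: Vector) -> Vector:
    """Stand-in for an external tool that serves one request at a time."""
    a, b = row
    return [a + b, a - b]


def via_tool_calls(batch: Batch) -> Batch:
    # Each row leaves the batch and is handled on its own, one after another.
    return [tool_call(row) for row in batch]


batch = [[2.0, 3.0], [10.0, 4.0], [-1.0, 1.0]]

# Both paths give the same answers; only the batched one stays in-model.
assert frozen_layer(batch) == via_tool_calls(batch)
```

The toy operation is linear, so a single weight matrix reproduces it exactly. Anything more complicated would need a larger fixed computation, but the batching argument is the same.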

[1] https://news.ycombinator.com/item?id=47367986

[2] https://news.ycombinator.com/item?id=47363909