Hacker News
by lambdatronics 896 days ago
>a way to generalize the compute graph as a learnable parameter.

Agreed. Seems analogous to how humans use mental processes to solve the kinds of problems we'd like LLMs to solve (going beyond "language processing", which transformers do well, to actual reasoning, which they can only mimic). Although once you give it flow control you risk it becoming a Turing machine, and then training is a problem, as you say. Perhaps not intractable, though.