by nyrikki
845 days ago
The fact that removing the "quadratic bottleneck" involves either reduced expressibility compared to self-attention or disproving SETH (the Strong Exponential Time Hypothesis) is another reason. The quadratic bottleneck comes from the lower bounds on exhaustive search.

The papers on this only ever seem to reference perplexity. The fact that it can append a word to "I'm going to the beach" that sounds good doesn't mean it is useful. There is no free lunch, and this project hasn't shown that the costs are acceptable.

"I'm going to the beach" + house

doesn't help if what you needed was

"I'm going to the beach" + tomorrow

I do hope that there is more information on the costs soon, or that they have disproven SETH.
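To make the quadratic part concrete, here is a minimal sketch in plain Python. The function names and the toy 2-d token vectors are my own assumptions for illustration, not anything from the project or its papers. It only shows that exact self-attention scores every query against every key, so the number of scores grows as n squared.

```python
def attention_scores(queries, keys):
    """Return the full matrix of query-key dot products (one row per query)."""
    return [
        [sum(q_i * k_i for q_i, k_i in zip(q, k)) for k in keys]
        for q in queries
    ]


def count_scores(n):
    """Count how many pairwise scores exact self-attention computes for n tokens."""
    # Hypothetical toy embeddings: token i is the vector [i, 1.0].
    tokens = [[float(i), 1.0] for i in range(n)]
    scores = attention_scores(tokens, tokens)
    return sum(len(row) for row in scores)


if __name__ == "__main__":
    for n in (4, 8, 16):
        # Doubling the sequence length quadruples the number of scores.
        print(n, count_scores(n))
```

Any scheme that computes fewer than all n squared scores has to decide which pairs to skip or approximate, which is where the expressibility cost would show up.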