

by two_in_one 871 days ago

From the post:

> I implemented imperative code that does what I’m proposing the transformer is doing. It produces outputs very similar to the transformer.

This means there is probably a way to bypass transformers and get the same results. It would be interesting to see whether that's more efficient. For example, given a foundation model, train something else and run it on a much smaller device.

1 comment

yorwba 871 days ago

I explained that it's not bypassing transformers and not more efficient in another comment: https://news.ycombinator.com/item?id=39254966
