Hacker News
by
ludwigschubert
262 days ago
The user you originally replied to specifically mentioned:
> without going to text first
1 comment
adastra22
262 days ago
Yeah, and that's my understanding. Nothing goes video -> text, or audio -> text, or even text -> text without first going through state space. That's where the core of the transformer architecture is.
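To make that concrete, here's a toy sketch of the data flow I mean. It is not a real transformer (there is no attention, and the weights are random); every name in it (`encoders`, `core`, `text_head`, `D_MODEL`) is made up for illustration. The only point is that each modality is projected into the same hidden vector space, and the output is read from that space, never from an intermediate text string.

```python
import random

D_MODEL = 8      # hypothetical hidden width shared by all modalities
TEXT_VOCAB = 5   # hypothetical tiny output vocabulary

def make_matrix(rows, cols, rng):
    """Random rows x cols weight matrix (stand-in for learned weights)."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

rng = random.Random(0)

# Hypothetical per-modality input projections: raw feature size -> D_MODEL.
encoders = {
    "text": make_matrix(D_MODEL, 4, rng),
    "audio": make_matrix(D_MODEL, 6, rng),
    "video": make_matrix(D_MODEL, 10, rng),
}

# One shared "core" that only ever sees hidden vectors.
core = make_matrix(D_MODEL, D_MODEL, rng)

# Output head: hidden vector -> scores over the text vocabulary.
text_head = make_matrix(TEXT_VOCAB, D_MODEL, rng)

def to_text_logits(modality, features):
    hidden = matvec(encoders[modality], features)  # input -> hidden state
    hidden = matvec(core, hidden)                  # shared processing
    return matvec(text_head, hidden)               # hidden state -> text scores

audio_logits = to_text_logits("audio", [0.1] * 6)
video_logits = to_text_logits("video", [0.1] * 10)
```

Audio and video inputs have different raw sizes, but both end up as `D_MODEL`-wide vectors before anything text-shaped is produced.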