Hacker News
by jerpint 834 days ago
Interesting that it’s not vision-based. I suspect you will get much better performance once vision is incorporated, e.g., using LLaVA-style models.