Hacker News
user: EarlyOom
created: 2022-03-22
karma: 205
submissions:
Replace OCR with Vision Language Models
292 points
|
125 comments
Show HN: Visually parse an entire YouTube video frame by frame
5 points
|
0 comments
Ask HN: What are folks using to train/fine-tune Vision Language Models
1 point
|
0 comments
A Node.js SDK for calling Vision Language Models
6 points
|
0 comments
Run structured extraction on documents/images locally with Ollama and Pydantic
170 points
|
29 comments
Show HN: Vlm Run, Extract JSON from images, videos and documents in a simple API
2 points
|
0 comments
Fine-grained Visual Transcription for YouTube videos
9 points
|
3 comments
"Ok Computer, why are you slow?"
2 points
|
0 comments
Show HN: NOS – A fast, and ergonomic PyTorch inference server
3 points
|
0 comments