by httpteapot
156 days ago
What do you think of the DeepSeek OCR approach, where they say that vision tokens might compress a document better than its pure text representation? https://news.ycombinator.com/item?id=45640594 I've spent some time feeding LLMs scraped web pages, and I've found that retaining some style information (text size, visibility, decoration, image content) is nontrivial.