Hacker News
by andyferris 331 days ago
You can make image networks (U-Net-like things) by chunking rectangles in 2D (with some convolution steps)... I wonder if there is an image-specific architecture a bit like this that could work well?
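To make "chunking rectangles in 2D" concrete, here is a minimal sketch in plain Python. It assumes fixed-size, non-overlapping chunks and uses a simple average as a stand-in for the learned convolution step. The names `chunk_2d` and `pool` are made up for illustration and don't come from any particular library or paper.

```python
def chunk_2d(image, ph, pw):
    """Split a 2D grid (list of rows) into non-overlapping ph x pw rectangles.

    Returns a grid of chunks; each chunk is a flat list of the ph*pw values
    it covers, in row-major order.
    """
    h, w = len(image), len(image[0])
    if h % ph or w % pw:
        raise ValueError("image size must be divisible by chunk size")
    return [
        [
            [image[r][c]
             for r in range(top, top + ph)
             for c in range(left, left + pw)]
            for left in range(0, w, pw)
        ]
        for top in range(0, h, ph)
    ]


def pool(chunks):
    """Hypothetical stand-in for a learned convolution: average each chunk."""
    return [[sum(chunk) / len(chunk) for chunk in row] for row in chunks]


# 4x4 toy "image" with values 0..15
image = [[r * 4 + c for c in range(4)] for r in range(4)]

level1 = pool(chunk_2d(image, 2, 2))   # 2x2 grid of coarser values
level2 = pool(chunk_2d(level1, 2, 2))  # 1x1 grid, one more level up
```

Applying the same step repeatedly gives a coarse-to-fine hierarchy, which is roughly the encoder half of a U-Net-like network. The chunk boundaries here are fixed in advance rather than learned.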
1 comment

Perhaps something like this: https://neurips.cc/virtual/2024/poster/94115
I haven't looked up what their actual tokenization strategy is, though, or whether switching to hierarchical (H-Net) chunks would be possible.