by HanClinto
724 days ago
> The tricky part is coming up with a reasonable set of "add noise" transformations.

Yes, as well as dealing with a variable-length window.

When generating images with diffusion, one specifies the image size ahead of time. When generating text with diffusion, it's a bit more open-ended. How long do we want this paragraph to go? Well, that depends on what goes into it -- so how do we adjust for that? Do we use a hierarchical tree-structure approach? Chunk it into a chain of overlapping fixed-length segments (which could possibly be combined with a transformer model)?

Hard to say what would finally work in the end, but I think this is the sort of thing that YLC is talking about when he encourages students to look beyond LLMs. [1]

[1] https://x.com/ylecun/status/1793326904692428907
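
To make the "chain of overlapping fixed-length segments" idea a bit more concrete, here is a rough sketch of just the chunking step. This is only an illustration, not a known working recipe for text diffusion: the function name `overlapping_windows`, the `window` and `overlap` parameters, and the `<pad>` token are all made up for this example, and the denoising model that would run on each window is left out entirely.

```python
def overlapping_windows(tokens, window=8, overlap=2, pad="<pad>"):
    """Split a token sequence into fixed-length windows.

    Consecutive windows share `overlap` tokens, so a (hypothetical)
    fixed-size denoiser could condition each window on the tail of
    the previous one. The last window is padded to full length.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    stride = window - overlap  # how far each new window advances
    windows = []
    start = 0
    while True:
        chunk = list(tokens[start:start + window])
        # Pad the final chunk so every window has the same length.
        chunk += [pad] * (window - len(chunk))
        windows.append(chunk)
        if start + window >= len(tokens):
            break
        start += stride
    return windows


if __name__ == "__main__":
    text = "how long do we want this paragraph to go".split()
    for w in overlapping_windows(text, window=4, overlap=1):
        print(w)
```

This sidesteps the variable-length problem rather than solving it: something still has to decide how many windows to generate, which is the open question above.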