by vessenes
481 days ago
Volodymyr, congrats. This is crazy fast, if not super great at long-context coding tasks. I tagged a few problem responses. I'm curious about something that has analogues in image diffusion models: depending on how they are working through their latent space, you can sometimes see them try out a feature in an image and then move on from it as it fits less with what's around it. Is there an analogue in Mercury? Does it try out a token or set of tokens and then move on from them as other parts of the response fill in? Similarly, this architecture seems like it would have real problems inserting a needed token in the middle of a run of relatively high-confidence generated tokens. Can you give some insight or thoughts from the front lines on these?