by vessenes 481 days ago
Volodymyr, congrats. This is crazy fast, if not super great at long-context coding tasks. I tagged a few problem responses.

I'm curious about something that has analogues in image diffusion models -- depending on how they are working through their latent space, you can sometimes see them try out a feature in an image and then move on from it as it fits less well with what's around it.

Are there analogues for Mercury? Does it try out a token or set of tokens and then, as other parts of the response fill in, move on from them? Similarly, this architecture seems like it would have real problems inserting a needed token in the middle of a run of relatively high-confidence generated tokens.

Can you give some insight / thoughts from the frontlines on these?