Hacker News
Timeline of Diffusion Language Models (github.com)
1 point by tilt 150 days ago | 1 comment
storystarling 149 days ago
I'm curious what the actual inference unit economics look like compared to standard autoregressive models. Parallel decoding helps with latency, but does the total compute cost per token make diffusion models viable for production workloads yet?