Hacker News
by andai, 1 hour ago
How is this model half the size of DeepSeek V4 Pro? Is it because DeepSeek did more aggressive cost cutting on the attention mechanism?