by andy_xor_andrew
921 days ago
Interesting how MCTS decoding is called out. That seems entirely like a software aspect, which doesn't depend on a particular chip design? And on the topic of MCTS decoding, I've heard lots of smart people suggest it, but I've yet to see any serious implementation of it. It seems like such an obviously good way to select tokens, you'd think it would be standard in vLLM, TGI, llama.cpp, etc. But none of them seem to use it. Perhaps people have tried it and it just doesn't work as well as you would think?
I worked at DeepMind on projects that used MCTS. Even with access to the AlphaZero source code, it was very difficult to write another implementation that got the same results as the original.