by Imnimo
889 days ago
I'm very curious how often the LM produces a helpful construction. Surely it must be doing better than random chance, but is it throwing out thousands of constructions before it finds a good one, or is it able to generate useful proposals at a rate similar to human experts? They say in the paper, "Because the language model decoding process returns k different sequences describing k alternative auxiliary constructions, we perform a beam search over these k options, using the score of each beam as its value function. This set-up is highly parallelizable across beams, allowing substantial speed-up when there are parallel computational resources. In our experiments, we use a beam size of k = 512, the maximum number of iterations is 16 and the branching factor for each node, that is, the decoding batch size, is 32." But I don't totally understand how 512 and 16 translate into total number of constructions proposed. They also note that ablating beam size and max iterations seems to only somewhat degrade performance. Does this imply that the model is actually pretty good at putting helpful constructions near the top, and only for the hardest problems does it need to produce thousands?
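For a rough sense of scale, here is a back-of-the-envelope sketch of the worst case under one plausible reading of those numbers. It assumes a textbook beam search: start from the single original problem, have every kept node decode 32 candidates per iteration, and keep at most 512 of them for the next round. The quoted passage does not spell out that accounting, so the function name `max_proposals` and the way the three parameters combine are my assumptions, not the paper's.

```python
def max_proposals(beam_size: int, max_iters: int, branching: int) -> int:
    """Upper bound on decoded constructions, assuming a textbook beam search.

    Assumption (not confirmed by the paper): each kept node decodes
    `branching` candidates per iteration, and at most `beam_size`
    candidates survive into the next iteration.
    """
    total = 0
    frontier = 1  # the search starts from the single original problem state
    for _ in range(max_iters):
        proposed = frontier * branching      # candidates decoded this iteration
        total += proposed
        frontier = min(beam_size, proposed)  # prune back down to the beam
    return total


# Settings quoted from the paper: k = 512, 16 iterations, branching factor 32.
print(max_proposals(beam_size=512, max_iters=16, branching=32))  # 230432
```

Under that reading the ceiling is in the low hundreds of thousands, but it is only a ceiling: if the search stops as soon as a proof is found, easy problems would use far fewer proposals, and the sketch ignores any deduplication of repeated constructions.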