Hacker News
Tencent's new AI technique teaches language models 'parallel thinking' (venturebeat.com)
4 points by alhazraed 269 days ago
1 comment

Reminds me of another Tencent paper, https://dl.acm.org/doi/10.1145/3711896.3736949, which is about how to combine distillation and ensembling for faster parallel inference.

That paper was Tencent doing parallelism at the model level, and now this one is their evolution on MoE. The two are very complementary.