"This study presents the phylogenetic characterization of the beak and beak of beak whales; it is suggested that the beak and beak-toed beaks share common cranial bones, providing support for the idea that beaks are a new species of eutriconodont mammal."
Is it repeating itself because the corpus is too small? 10,000 papers seems like rather a small corpus. How large a training corpus would normally be used in GPT-2 work?
[I know nothing - I'm pretty ignorant about practical ML]