Hacker News
by bluenose69 1348 days ago
Thanks for the pointer. Indeed, a more recent paper (cited below) estimates an even higher error rate (30.9%), but the fact that we are not talking about something like 0.001% tells me that Excel is simply a non-starter for this kind of work. (Actually, this is just one of many reasons why I discourage my students from using Excel for any dataset.)

Abeysooriya, Mandhri, Megan Soria, Mary Sravya Kasu, and Mark Ziemann. “Gene Name Errors: Lessons Not Learned.” PLoS Computational Biology 17, no. 7 (July 30, 2021): e1008984. https://doi.org/10.1371/journal.pcbi.1008984.