by binarymax 1531 days ago
Large language models have a (recent) history of silly names: BERT, BART, ELMo, RoBERTa, BigBird, PaLM, Megatron, etc. Might as well go full nonsense.

My theory is that, since no one reads literature anymore, timeless, interesting, and unique names from history and other cultures are lost to a deluge of soon-to-be-forgotten gag, pop-culture, and meme names. Perhaps this is why we have Chinchilla and not Oberon.
Like the Oberon OS and programming language?
Image models too - the Inception paper from 2014 directly refers to knowyourmeme.com and the "we need to go deeper" meme from the movie Inception - https://knowyourmeme.com/memes/we-need-to-go-deeper - it's the first reference in the paper [1] and it's also why the model has that name.

[1] https://arxiv.org/pdf/1409.4842.pdf

A touch of irony that cutting-edge research on language can’t produce better names.
True. I will add that it is customary to justify such a name by demonstrating that it is some sort of acronym or contraction.
It's a recursive, selective acronym:

               C
              CH
             CHI
            CHIN
           CHINC
          CHINCH
         CHINCHI
        CHINCHIL
       CHINCHILL
  ==> CHINCHILLA
      HINCHILLA
      INCHILLA
      NCHILLA
      CHILLA
      HILLA
      ILLA
      LLA
      LA
      A
I know what recursive means, I know what selective means, I know what an acronym is, and I think I see the pattern in that picture, but when I put it all together I am lost.

Alternatively, is this a joke and the "recursive, selective acronym" can be used to justify any word?

               A
              AR
             ARB
            ARBI
           ARBIT
          ARBITR
         ARBITRA
        ARBITRAR
  ==>  ARBITRARY
       RBITRARY
       BITRARY
       ITRARY
       TRARY
       RARY
       ARY
       RY
       Y


Yup, seems it works for any word.
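
For what it's worth, the picture can be generated mechanically, which rather supports the "works for any word" reading. Here is a minimal Python sketch, assuming the pattern is: growing prefixes right-aligned to the word's right edge, then the full word behind a `==> ` marker, then shrinking suffixes left-aligned to the word's left edge. The name `word_triangle` and the exact spacing are my own guesses at the layout, not anything official.

```python
def word_triangle(word: str, marker: str = "==> ") -> list[str]:
    """Return the lines of the 'recursive, selective acronym' picture.

    Layout is an assumption inferred from the hand-drawn examples:
    prefixes first, then the marked full word, then suffixes.
    """
    n = len(word)
    pad = " " * len(marker)  # keeps unmarked lines aligned with the marked one
    lines = []

    # Growing prefixes (C, CH, CHI, ...), right-aligned to the word's width.
    for i in range(1, n):
        lines.append(pad + word[:i].rjust(n))

    # The full word, flagged with the marker.
    lines.append(marker + word)

    # Shrinking suffixes (HINCHILLA, INCHILLA, ...), left-aligned.
    for i in range(1, n):
        lines.append(pad + word[i:])

    return lines


if __name__ == "__main__":
    print("\n".join(word_triangle("ARBITRARY")))
```

Nothing in the function depends on the word itself, so any non-empty string produces a triangle of 2 * len(word) - 1 lines.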