by docfort 908 days ago
I honestly thought this was going to be another LLM article, just with the venerable ReLU activation function in place of the usual one. A ReLU is exactly an if statement when rendered in a decision tree (if the input is less than 0, emit 0; otherwise, emit the input). Given the relative popularity of 4B-parameter models (a transformer's parameter count is dominated by its good old fully-connected feedforward layers), you could perhaps describe such models as 4B if statements. I was disappointed that the author didn’t go there as a means of parody.
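To make the joke concrete, here is a rough Python sketch of ReLU written as a literal if statement, plus a toy feedforward hidden layer built on it. The names `relu` and `ffn_hidden` and the tiny weight matrix are made up for illustration; nothing here comes from the article.

```python
def relu(x: float) -> float:
    # The branch described above: negative inputs emit 0, everything else passes through.
    if x < 0:
        return 0.0
    return x


def ffn_hidden(xs, weights, biases):
    # Toy fully-connected hidden layer (hypothetical helper): each unit computes
    # a dot product plus a bias, then goes through exactly one ReLU branch.
    return [
        relu(sum(w * x for w, x in zip(row, xs)) + b)
        for row, b in zip(weights, biases)
    ]


# Two inputs, two hidden units, identity weights, zero biases.
hidden = ffn_hidden([1.0, -1.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

Strictly speaking the layer runs one branch per hidden unit rather than one per weight, so "4B if statements" overcounts, but that is probably fine for parody purposes.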