by fermenflo 2422 days ago
I agree, a lot of the code could be improved. But some of what you mentioned is fairly standard, like GELU for "Gaussian Error Linear Units", w/b for weights/biases, etc.

Not sure how standard that is ...
Those are very standard ML abbreviations.