by 00ajcr
1510 days ago
My interpretation of the point in the blog post was that explicitly spelling out variable names makes APIs and the underlying code much more accessible to a wider audience. Sure, there'll be a subset of users of these libraries who have read ML textbooks and are familiar with what η means in this context. Today, though, many (most?) users of ML libraries will probably not know what η means without looking it up. Adhering to mathematical notation puts up an unnecessary barrier to using the API/code and ultimately limits wider engagement and collaboration. To attract a bigger slice of the ML community, choosing names that the ML hobbyist can read, understand and use without pause is the better path forward.
FYI, the documentation for the function (https://fluxml.ai/Flux.jl/stable/training/optimisers/) explicitly says that it is the learning rate:
> Learning rate (η): Amount by which gradients are discounted before updating the weights.
so this is already explicit to anyone who reads the documentation. The quibble in the post is about the named parameter.