Hacker News
by matt123456789 1526 days ago
A parameter is a scalar value. Most of them sit in the attention and feedforward matrices, and you'll also hear them called “weights”. Any intro to DL course will cover these in detail. I recommend starting with Andrew Ng’s Coursera class on Intro to Machine Learning, although there may be better ones out there now.
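To make the "where do the scalars live" part concrete, here's a rough sketch that counts the weights in a single transformer layer. The function name and the sizes `d_model` and `d_ff` are made up for illustration, and it deliberately ignores biases, layer norms, and embeddings, so treat the numbers as ballpark only.

```python
def transformer_layer_params(d_model: int, d_ff: int) -> dict:
    """Count weight scalars in one transformer layer (hypothetical helper).

    Assumes no biases, no layer norms, and no embeddings.
    """
    # Attention: four d_model x d_model matrices (query, key, value, output).
    attention = 4 * d_model * d_model
    # Feedforward: one d_model x d_ff matrix up, one d_ff x d_model matrix back down.
    feedforward = 2 * d_model * d_ff
    return {
        "attention": attention,
        "feedforward": feedforward,
        "total": attention + feedforward,
    }


# Illustrative sizes only, not taken from any particular model.
counts = transformer_layer_params(d_model=512, d_ff=2048)
print(counts)
```

Multiply the per-layer total by the number of layers and you get most of a model's parameter count under these assumptions.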

Input parameter vs. weights then?

I see, thanks.

These networks (text models) usually have a few thousand inputs.
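If it helps with the inputs vs. weights distinction, here's a toy sketch. The `linear` function and all the numbers are hypothetical and far smaller than any real text model; the point is only that the weights stay fixed while the inputs change on every call.

```python
def linear(weights, inputs):
    """Toy linear layer: one output per row of weights (hypothetical example)."""
    # weights: learned parameters, fixed once training is done.
    # inputs: the data fed in at run time; these are not parameters.
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# 2 outputs x 3 inputs = 6 parameters, no matter what data we pass in.
weights = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
n_params = sum(len(row) for row in weights)

# Same weights, two different inputs.
out_a = linear(weights, [1.0, 2.0, 3.0])
out_b = linear(weights, [0.0, 0.0, 2.0])
print(n_params, out_a, out_b)
```

So the parameter count describes the model itself, while the number of inputs describes how much data it takes in at once.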