by pankajdoharey
155 days ago
If they are real slimeballs, they can justify it by saying, "You see, we use speculative decoding, so we first use a smaller, faster model, and then the answer is enhanced by a larger model, blah blah..." "For the best user experience."