by Der_Einzige
2276 days ago
For one thing, it looks like your decoding is not being done the way SOTA systems do it (usually nucleus sampling). Yours appears to be doing top-1 or top-N sampling, but please correct me if I'm wrong on this. Even if you are using nucleus sampling, you need to expose parameters to control the generation better. That way you have a lot of cover when it doesn't work as expected: you can respond to dissenting users with "just tune the parameters better". The reaction you're getting to this system is so negative because it appears to be worse than Write With Transformer, which implements exactly what I'm describing. For what it's worth, this mostly isn't your fault. NLP is a crapshoot of hype and implementations which disappoint.
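To make the suggestion concrete, here is a minimal sketch of a decoding step with the knobs exposed. It is not the code of the system under discussion or of Write With Transformer; the function name `sample_next_token` and the parameter names `temperature`, `top_k`, and `top_p` are assumptions chosen for illustration, and it works on a plain list of logits using only the standard library.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Pick one token index from raw logits (hypothetical helper).

    temperature: values below 1 sharpen the distribution, above 1 flatten it.
    top_k: keep only the k most likely tokens (0 disables the filter).
    top_p: keep the smallest set of top tokens whose probability mass
           reaches top_p (1.0 disables the filter). This is nucleus sampling.
    """
    if temperature <= 0:
        # Treat a non-positive temperature as greedy (top-1) decoding.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Softmax with temperature, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    if top_k > 0:
        ranked = ranked[:top_k]

    if top_p < 1.0:
        # Simplification: mass is measured on the unfiltered probabilities,
        # without renormalizing after the top_k cut.
        kept, mass = [], 0.0
        for i in ranked:
            kept.append(i)
            mass += probs[i]
            if mass >= top_p:
                break
        ranked = kept

    # random.choices normalizes the surviving weights when drawing.
    return rng.choices(ranked, weights=[probs[i] for i in ranked], k=1)[0]
```

With settings like `sample_next_token(logits, temperature=0.8, top_p=0.9)`, users who dislike the output have something to tune, while `top_k=1` or `temperature=0` reproduces plain top-1 decoding for comparison.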