Hacker News
by arathore 2528 days ago
Looks cool. IIRC, the transformer architecture doesn't allow any constraints on the learned language model. For code completion, a model that is explicitly aware of the (programming) language's constructs and then augmented with code samples would be much more efficient (you could greatly reduce the search space for the next token, etc.).
1 comment

>a model aware of the (programming) language constructs explicitly

You could never include that in the model's training. The best you could do is parse the model output into an AST and discard suggestions with invalid syntax, and also provide enough negative examples (invalid syntax) during training to reduce false positives.
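To make the "parse and discard" idea concrete, here is a minimal sketch. It assumes the target language is Python and that each suggestion is a complete snippet; `filter_valid_python` and the sample suggestions are made up for illustration, not taken from any real completion tool.

```python
import ast


def filter_valid_python(candidates):
    """Keep only the candidates that parse as Python source (hypothetical helper)."""
    valid = []
    for src in candidates:
        try:
            ast.parse(src)  # raises SyntaxError if the snippet is not valid Python
        except SyntaxError:
            continue  # discard suggestions with invalid syntax
        valid.append(src)
    return valid


# Made-up model outputs, two of them syntactically broken.
suggestions = [
    "x = foo(1, 2)",
    "x = foo(1, 2",  # unbalanced parenthesis
    "for i in range(3): print(i)",
    "def f(:",  # malformed signature
]
print(filter_valid_python(suggestions))
```

Note that this only works on whole snippets. A completion for a half-written line would probably need to be spliced into the surrounding code before parsing.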

What you proposed would never work with a language model, and makes no sense given how backprop works. The model will learn the grammar (syntax), but it will always output some percentage of false positives (invalid syntax).

You can't hardcode the syntax into the model. Another approach is to encode token types after tokenization, which gives the model more information about the syntax/meaning of each token.
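A rough sketch of what "token types after tokenization" could look like, assuming Python source and using the standard library's `tokenize` module as the tokenizer. A real model would more likely use its own subword tokenizer and feed the type in as an extra input feature; `tokens_with_types` is a hypothetical helper name.

```python
import io
import token
import tokenize


def tokens_with_types(source):
    """Return (type_name, text) pairs for a piece of Python source."""
    readline = io.StringIO(source).readline
    pairs = []
    for tok in tokenize.generate_tokens(readline):
        # Skip purely structural tokens that carry no text of their own.
        if tok.type in (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER):
            continue
        pairs.append((token.tok_name[tok.type], tok.string))
    return pairs


print(tokens_with_types("x = foo(1)"))
```

Each pair tags the token text with a coarse type (`NAME`, `OP`, `NUMBER`, ...), which is the kind of side information the model could get alongside the raw token.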