Hacker News
by mlthoughts2018 2630 days ago
Wouldn’t it make more sense to spend the effort annotating these things? Or building models to provide the annotation? I mean, I work professionally on embedding models for computer vision and NLP, and my reaction to the article is that this seems like totally the wrong approach. You’re putting all this effort into building the embedding model out of the part that is both the most superficial and the least human-interpretable (the AST).
1 comment

Building models for natural language _and_ code for either NL/intent-based code search or automatically annotating code is indeed another hot research area!

I'd argue Aroma solves a different problem, in that it surfaces more idiomatic patterns based on the code you already have. This can also be important, especially in a production environment, where you need to do things "the right way".