by merb
33 days ago
Wouldn’t it be worth investigating a micro-model architecture? For example, a first model checks the context and routes to a Java-optimized model, and so on. That would also make it simpler to load and unload models in memory. So: extremely small models that are each only good at one task, such as a single programming language, a small model at the front that is extremely good at classifying tasks, and then a more complex model that can bring the outputs of these micro models back together.

Obviously, I have no idea, but I guess it’s not as simple as “just train only on Java code and reduce the size to 1/10th”.
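
To make the routing idea a bit more concrete, here is a rough sketch of what I mean. Everything in it is made up for illustration: `classify` is a keyword match standing in for the small front model, `make_stub_model` returns fake "specialists" that only tag their output, and `Router` keeps at most `max_loaded` of them in memory, unloading the least recently used one. The bigger model that would merge the specialists' answers is left out entirely.

```python
from collections import OrderedDict
from typing import Callable, Dict

# Hypothetical stand-in for a specialist model: prompt in, reply out.
Model = Callable[[str], str]


def make_stub_model(name: str) -> Model:
    """Return a fake 'model' that only tags the prompt with its own name."""
    return lambda prompt: f"[{name}] {prompt}"


def classify(prompt: str) -> str:
    """Toy front classifier: keyword matching instead of a trained model."""
    text = prompt.lower()
    if "java" in text:
        return "java"
    if "python" in text:
        return "python"
    return "general"


class Router:
    """Route each prompt to a specialist, keeping only a few loaded at once."""

    def __init__(self, loaders: Dict[str, Callable[[], Model]], max_loaded: int = 2):
        self.loaders = loaders          # task name -> function that loads the model
        self.max_loaded = max_loaded    # how many specialists may stay in memory
        self.loaded: "OrderedDict[str, Model]" = OrderedDict()

    def _get(self, task: str) -> Model:
        if task in self.loaded:
            # Already in memory: mark it as most recently used.
            self.loaded.move_to_end(task)
        else:
            if len(self.loaded) >= self.max_loaded:
                # Unload the least recently used specialist to make room.
                self.loaded.popitem(last=False)
            self.loaded[task] = self.loaders[task]()
        return self.loaded[task]

    def ask(self, prompt: str) -> str:
        return self._get(classify(prompt))(prompt)


# Assumed set of specialists; a real system would load model weights here.
LOADERS: Dict[str, Callable[[], Model]] = {
    name: (lambda n=name: make_stub_model(n))
    for name in ("java", "python", "general")
}

if __name__ == "__main__":
    router = Router(LOADERS, max_loaded=2)
    print(router.ask("Fix this Java loop"))  # prints "[java] Fix this Java loop"
```

The hard part is of course hidden inside `classify` and the specialists themselves; the sketch only shows that the routing and load/unload bookkeeping is the easy bit.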