|
|
|
|
|
by flyingcircus3
891 days ago
|
|
Why not three models? One model does basic feature detections, like lines, shapes, etc. A second model that can take the first model's output as its input, and identify birds. A third model can take the first model's output as its input, and identify houses. |
|
An end to end model will always outperform a sequence of models designed to target specific features. You truncate information when you render the data into output space (the model output vector) from feature space (much richer data inside the model), thats the primary reason why to do transfer learning all layers are frozen, the final layer is chopped off, and then the output of the internal layer is sent into the next model. Not the output itself.
Yes you can create a large tree of smaller models, but the performance cieling is still lower.
Please don't tell people to do this. Ive seen millions wasted on this.
When you train a vision model it will already develop a heirarchy of fundamental point, and line detectors in the first few layers. And they will be particularly well chosen for the domain. It happens automatically. No need to manually put them there.