by JustFinishedBSG
3322 days ago
It doesn't matter: the operations are the same in forward and backward mode. "Made for inference" just means "too slow for training" if you are pessimistic, or "optimized for power efficiency" if you are optimistic. Otherwise, training and inference are basically the same.

Training and inference are similar only at a high level, not in actual application.
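
To make the disagreement concrete, here is a minimal sketch, assuming a toy one-weight linear model with squared-error loss. Nothing here comes from the chip being discussed, and the names `predict` and `train_step` are hypothetical. Both paths share the same multiply-add, but the training step also has to keep the input around for the gradient and write back to the weights, which is roughly the practical difference the reply is pointing at.

```python
# Hypothetical toy model for illustration only; not tied to any real hardware or library.

def predict(w, b, x):
    """Inference: a single forward pass; nothing needs to be kept afterwards."""
    return w * x + b

def train_step(w, b, x, target, lr=0.1):
    """Training: the same forward pass, plus a backward pass and a weight update."""
    y = predict(w, b, x)      # forward pass, identical to inference
    err = y - target          # d(loss)/dy for loss = 0.5 * (y - target) ** 2
    grad_w = err * x          # backward pass reuses the input x from the forward pass
    grad_b = err
    # Weights are read-only at inference time but must be writable here.
    return w - lr * grad_w, b - lr * grad_b

w, b = 0.0, 0.0
for _ in range(200):
    w, b = train_step(w, b, x=2.0, target=4.0)

print(predict(w, b, 2.0))  # close to the target of 4.0 after training
```

The arithmetic in both functions is the same kind of multiply-add, which supports the first comment; the extra state, gradient traffic, and weight writes in `train_step` are the sort of thing the second comment means by "actual application".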