Yes, if your model is small enough or if you are fine-tuning only a small number of layers. TensorFlow 1.15 and 2.0 are available on Xavier. I understand that PyTorch can be built for it as well.
Note that the number of CUDA cores and the amount of memory available are smaller than on discrete Volta GPUs.
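To give a rough feel for why fine-tuning only a few layers helps on a memory-constrained device, here is a back-of-the-envelope sketch in plain Python. The layer names and parameter counts are made up for illustration, and the estimate assumes float32 values and an Adam-style optimizer that keeps two extra buffers per trainable parameter. It ignores activations and framework overhead, so treat it as a lower bound rather than a real measurement.

```python
# Hypothetical model: (layer name, number of parameters). Not a real network.
LAYERS = [
    ("conv_block", 1_000_000),
    ("dense", 500_000),
    ("classifier", 10_000),
]


def training_memory_bytes(layers, trainable_names,
                          bytes_per_value=4, optimizer_states=2):
    """Rough lower bound on training memory for parameters only.

    Every layer stores its weights. Only trainable layers also need
    a gradient buffer and the optimizer's per-parameter state
    (assumed here: two buffers, as in Adam).
    """
    total_params = sum(count for _, count in layers)
    trainable_params = sum(count for name, count in layers
                           if name in trainable_names)

    weights = total_params * bytes_per_value
    gradients = trainable_params * bytes_per_value
    optimizer = trainable_params * optimizer_states * bytes_per_value
    return weights + gradients + optimizer


all_names = {name for name, _ in LAYERS}
full_training = training_memory_bytes(LAYERS, all_names)
last_layer_only = training_memory_bytes(LAYERS, {"classifier"})

print("train everything:     ", full_training, "bytes")
print("fine-tune last layer: ", last_layer_only, "bytes")
```

In a real framework the equivalent step would be marking the frozen layers as non-trainable before building the optimizer. The exact savings depend on the model and on activation memory, which this sketch leaves out.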
Are you saying it can train small models because of the small (512-core) GPU (plus maybe some leftover control calculations done by the CPU)?
You still need tensor cores for inference. But they don't do weight updates. Learning/training is all about updating the weights (through backpropagation or whatever).
So another way to put it: its tensor cores do feed-forward calculations, but no backpropagation, and no weight updates.
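To make the distinction between a feed-forward calculation and a weight update concrete, here is a minimal plain-Python sketch with a single linear unit. The values of `w`, `b`, `x`, `target`, and the learning rate are arbitrary, chosen only for illustration. It shows the arithmetic of each step and says nothing about which hardware unit executes it.

```python
def forward(w, b, x):
    """Feed-forward (inference): reads the weights, never changes them."""
    return w * x + b


def train_step(w, b, x, target, lr=0.05):
    """One gradient-descent step on the squared error (pred - target)**2."""
    pred = forward(w, b, x)          # forward pass
    error = pred - target
    grad_w = 2 * error * x           # backpropagation: d(loss)/dw
    grad_b = 2 * error               # backpropagation: d(loss)/db
    return w - lr * grad_w, b - lr * grad_b   # weight update


# Arbitrary starting weights and a single training example.
w, b = 0.5, 0.0
x, target = 2.0, 3.0

loss_before = (forward(w, b, x) - target) ** 2

# Inference alone leaves the weights exactly as they were.
w_after_inference, b_after_inference = w, b
forward(w, b, x)

# Training repeatedly runs forward, backward, and update.
for _ in range(20):
    w, b = train_step(w, b, x, target)

loss_after = (forward(w, b, x) - target) ** 2
print("loss before:", loss_before, "loss after:", loss_after)
```

Inference is just the `forward` call, while training wraps that same call with a gradient computation and an update of `w` and `b`.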