16x smaller is still 41.5 GB, though.
More research needs to be done on model compression, imo.
I'm curious why the authors preferred T5.