second opinion - BERT family are transformer-based, and that is a big threshold right there.. secondly I am not sure that two one-minute comments could capture what exactly went on with fine tuning or graph-based methods of constraint or whatnot.. with respect to the fitness of the production tools for intended purposes.