We have a similar problem to the one you mention. Our backend/app is in Java, but the DS/ML team generally works in Python. The ML team basically doesn't ship production code. Here are the artifacts we produce:

1. For new models we often build a demo endpoint plus glue code in Python/Flask whose output can be compared against the prod output in dev/psup.
2. Deep learning models (much of what I do personally): saved in the TF SavedModel format. If it is an update to an existing model, it is often just a drop-in replacement. If it is a brand new model, I will often include a Flask demo (the Python code does the proper data transformation before calling TF). On the production side, after testing/regression, these models are deployed via tensorflow-serving containers and used as gRPC endpoints. For production, whatever data preprocessing needs to be done is written by the backend team, who compare their preprocessing output with our demo.
3. Logistic regression/tree models: again, for new models we provide the demo, but what goes into production is either a CSV (logistic regression) or JSON (tree) file of the weights/decision boundaries, which the backend team's Java code uses as a resource.

The overall flow is: ETL (via Apache Airflow/custom code) => model training/feature engineering => (saved model file + Flask demo endpoint/documentation on feature transformations) => dev incorporates the model into the Java backend and tests it => comparison of demo vs Java backend => regression of Java backend (if they had previous versions of the model) => psup (a small amount of prod data duplicated and run in parallel with prod) => prod (model deployed + monitored)

One caveat: we also do some batch processing/not-really-live analysis that is done purely in Python, with the results pushed wherever they need to go. In this case we don't involve the backend/Java team.
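To make the logistic regression hand-off a bit more concrete, here is a minimal sketch of what exporting the weights to CSV and scoring from that file could look like. The column names (`feature`, `weight`), the `(intercept)` row, and all function names are assumptions made up for illustration, not our actual file layout. The Python scoring function plays the role of the demo that the Java implementation gets compared against.

```python
import csv
import io
import math

INTERCEPT_KEY = "(intercept)"  # hypothetical marker row for the bias term


def export_weights_csv(feature_names, weights, intercept, out):
    """Write one row per feature (plus the intercept) to a CSV file object."""
    writer = csv.writer(out)
    writer.writerow(["feature", "weight"])
    # repr() keeps full float precision so the backend reads the same numbers.
    writer.writerow([INTERCEPT_KEY, repr(float(intercept))])
    for name, weight in zip(feature_names, weights):
        writer.writerow([name, repr(float(weight))])


def load_weights_csv(src):
    """Read the CSV back into ({feature: weight}, intercept)."""
    weights = {row["feature"]: float(row["weight"]) for row in csv.DictReader(src)}
    intercept = weights.pop(INTERCEPT_KEY)
    return weights, intercept


def predict_proba(features, weights, intercept):
    """Reference scoring: sigmoid of the weighted sum of already-transformed features."""
    z = intercept + sum(w * features[name] for name, w in weights.items())
    # Numerically stable sigmoid (avoids overflow for large negative z).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def outputs_match(demo_scores, backend_scores, tol=1e-6):
    """Compare demo output against backend output within a tolerance."""
    return len(demo_scores) == len(backend_scores) and all(
        abs(a - b) <= tol for a, b in zip(demo_scores, backend_scores)
    )


if __name__ == "__main__":
    buf = io.StringIO()
    export_weights_csv(["age", "income"], [0.5, -0.25], 0.0, buf)
    buf.seek(0)
    weights, intercept = load_weights_csv(buf)
    print(predict_proba({"age": 2.0, "income": 4.0}, weights, intercept))
```

The tolerance in `outputs_match` is also just a placeholder; in practice you would pick it based on how much floating-point drift you are willing to accept between the Python and Java implementations.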
It also provides an OpenAPI spec for your API endpoint, which lets you generate an API client in Java for your backend/app teams.