Show HN: RETVec: Resilient and Efficient Text Vectorizer
3 points by ebursztein 924 days ago
Happy Friday,

I'm really happy to share that the code and model for RETVec, our new SOTA robust text tokenizer for classification, are available on GitHub: https://github.com/google-research/retvec/ and that the NeurIPS paper is on arXiv: https://arxiv.org/abs/2302.09207

Besides its compactness and robustness, one of RETVec's strong points is that it greatly simplifies the creation of on-device models: RETVec works natively on TFLite and can be used in web deployments via TFJS.

We hope you will find it useful for your research. If you would like to give it a try, we have a getting-started notebook here: https://github.com/google-research/retvec/blob/main/notebooks/train_retvec_model_tf.ipynb

Let us know if you have any questions.