by axiom92 1297 days ago
If you have the dataset, you can try training a model like T5 [1]; see the notebook [2] for an example.

You just need to create [(input, output)] examples in the format you want.

For example:

[(a Yelp review of a restaurant, [("tacos", "good"), ("margaritas", "good"), ("salsa", "bad")])].

With enough data, the model should be able to learn to generate the output in the right format.
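To make the format idea concrete, here is a rough sketch of how such examples could be turned into plain-text pairs for a text-to-text model, and how a generated string could be parsed back into a list of tuples. The task prefix, the sample review text, and the helper names are my own placeholders, not anything T5 requires.

```python
import ast

# Hypothetical task prefix; the exact wording is an arbitrary choice.
PREFIX = "extract aspects: "


def to_text_pair(review, aspects):
    """Turn one (input, output) example into two plain strings."""
    source = PREFIX + review
    target = repr(aspects)  # the list of tuples written out as text
    return source, target


def parse_output(generated):
    """Parse generated text back into a list of 2-tuples, or None if malformed."""
    try:
        value = ast.literal_eval(generated)
    except (ValueError, SyntaxError, TypeError):
        return None
    if isinstance(value, list) and all(
        isinstance(item, tuple) and len(item) == 2 for item in value
    ):
        return value
    return None


# Made-up review text, used only to illustrate the shape of the data.
examples = [
    (
        "The tacos and margaritas were great, but the salsa was bland.",
        [("tacos", "good"), ("margaritas", "good"), ("salsa", "bad")],
    ),
]

pairs = [to_text_pair(review, aspects) for review, aspects in examples]
```

Parsing with `ast.literal_eval` rather than `eval` means a malformed or unexpected generation is rejected instead of executed.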

> Python list of tuples

Things get interesting if you want to generate actual Python code. You can use a large language model with just a few examples of the task to generate such code. For example, see https://reasonwithpal.com/.
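As a rough illustration of the few-shot idea (not the actual PAL implementation), the sketch below builds a prompt from question/code pairs and runs a completion to read off its result. The example questions, the `answer` variable convention, and the hard-coded completion standing in for a model call are all assumptions of mine.

```python
# Hypothetical few-shot examples: each pairs a question with Python code
# that computes the result and stores it in a variable named `answer`.
FEW_SHOT = [
    (
        "I had 5 tacos and ate 2. How many are left?",
        "tacos = 5\neaten = 2\nanswer = tacos - eaten",
    ),
]


def build_prompt(question):
    """Concatenate the solved examples, then the new question left open."""
    parts = [f"# Q: {q}\n{code}\n" for q, code in FEW_SHOT]
    parts.append(f"# Q: {question}\n")
    return "\n".join(parts)


def run_generated(code):
    """Execute generated code and return its `answer` variable.

    exec runs arbitrary code, so real model output belongs in a sandbox.
    """
    namespace = {}
    exec(code, namespace)
    return namespace.get("answer")


prompt = build_prompt("A margarita costs 8 dollars. How much do 3 cost?")

# Stand-in for a model completion; a real call to a language model would go here.
completion = "price = 8\ncount = 3\nanswer = price * count"
result = run_generated(completion)
```

The point is that the model only has to write the code; the Python interpreter does the arithmetic.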

Happy to answer more questions!

[1] https://huggingface.co/docs/transformers/model_doc/t5

[2] https://colab.research.google.com/github/huggingface/noteboo...