Hacker News
by Yajirobe 1700 days ago
How long did it take to train this GPT-2 based model?
1 comment

It's a fine-tuned version of a lightweight GPT-2 from Hugging Face (https://huggingface.co/transformers/model_doc/gpt2.html). I don't recall exact numbers, but once I had the structure of the problem right (i.e. sequencing words, parts of speech, and definitions), training took around 12 hours on my old GTX 1080 Ti.
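
For readers wondering what "sequencing words, part of speech and definitions" might look like in practice, here is a minimal sketch. It assumes each entry is flattened into one training string with separator tokens. The token names (`<|bos|>`, `<|pos|>`, `<|def|>`, `<|eos|>`) and the helpers `format_entry` and `parse_entry` are hypothetical and not taken from the original project, whose actual format isn't stated here.

```python
from typing import Tuple

# Hypothetical separator tokens; the post does not say which ones were used.
BOS = "<|bos|>"
POS_SEP = "<|pos|>"
DEF_SEP = "<|def|>"
EOS = "<|eos|>"


def format_entry(word: str, pos: str, definition: str) -> str:
    """Serialize one (word, part of speech, definition) entry as a single sequence."""
    return f"{BOS}{word.strip()}{POS_SEP}{pos.strip()}{DEF_SEP}{definition.strip()}{EOS}"


def parse_entry(text: str) -> Tuple[str, str, str]:
    """Invert format_entry, e.g. to split a generated sequence back into fields."""
    if not (text.startswith(BOS) and text.endswith(EOS)):
        raise ValueError("sequence is missing its start or end token")
    body = text[len(BOS):-len(EOS)]
    word, rest = body.split(POS_SEP, 1)
    pos, definition = rest.split(DEF_SEP, 1)
    return word, pos, definition
```

With a fixed field order like this, a fine-tuned language model can be prompted with the start token (optionally followed by a word) and left to complete the remaining fields, and `parse_entry` recovers the pieces from the generated text.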