Hacker News
by macleginn 243 days ago
So this looks essentially like continuous prompting (see prefix tuning), with RL-driven selection of which inputs to present as tokens and which as continuous inputs (embeddings).
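
To make the token-vs-embedding mix concrete, here is a toy sketch of how I read it. This is an assumption on my part, not the authors' code: `build_inputs`, `soft_vectors`, and `use_soft` are hypothetical names, the embedding table is random, and the mask is hardcoded where an RL policy would presumably make the choice.

```python
import random

def build_inputs(token_ids, soft_vectors, use_soft, embedding_table):
    """Return one input vector per position.

    use_soft[i] is True  -> feed soft_vectors[i] directly (continuous input)
    use_soft[i] is False -> look up token_ids[i] in the embedding table
    """
    inputs = []
    for tok, soft, flag in zip(token_ids, soft_vectors, use_soft):
        inputs.append(list(soft) if flag else list(embedding_table[tok]))
    return inputs

random.seed(0)
dim, vocab = 4, 10

# Hypothetical frozen embedding table: one random vector per vocabulary id.
embedding_table = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(vocab)]

token_ids = [3, 7, 1]
# Stand-in for learned continuous prompt vectors (as in prefix tuning).
soft_vectors = [[0.5] * dim for _ in token_ids]
# Hardcoded stand-in for the per-position choice an RL policy would make.
use_soft = [False, True, False]

inputs = build_inputs(token_ids, soft_vectors, use_soft, embedding_table)
```

Positions 0 and 2 go in as ordinary token embeddings, and position 1 goes in as a raw continuous vector that bypasses the vocabulary entirely.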