Looking for feedback on a paper about a revision-capable language model [pdf]


	2 points by param-updater 60 days ago

2 comments

param-updater 60 days ago

Hi everyone! I am an independent researcher working on Reviser, a language model that generates through cursor-relative edit actions on a mutable canvas. It is autoregressive over edit-history actions rather than final text order, which lets it revise its response while keeping decoding efficiency close to standard autoregressive transformers.
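To make the idea concrete for readers who have not opened the PDF, here is a minimal sketch of replaying cursor-relative edit actions on a mutable canvas. The action names (`insert`, `move`, `delete`), the `Canvas` class, and the `replay` helper are all assumptions made up for illustration; the paper's actual action set and semantics may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Canvas:
    """Hypothetical mutable canvas: a token list plus a cursor position."""
    tokens: list = field(default_factory=list)
    cursor: int = 0  # index at which the next insert lands

    def apply(self, action):
        kind, arg = action
        if kind == "insert":
            # Insert a token at the cursor and step past it.
            self.tokens.insert(self.cursor, arg)
            self.cursor += 1
        elif kind == "move":
            # Cursor-relative move, clamped to the canvas bounds.
            self.cursor = max(0, min(len(self.tokens), self.cursor + arg))
        elif kind == "delete":
            # Delete the token under the cursor, if there is one.
            if self.cursor < len(self.tokens):
                del self.tokens[self.cursor]
        else:
            raise ValueError(f"unknown action: {kind}")


def replay(actions):
    """Replay an edit history in order and return the final token list."""
    canvas = Canvas()
    for action in actions:
        canvas.apply(action)
    return canvas.tokens


# The history is longer than the final text because it contains a revision:
# "cat" is written first, then replaced by "dog".
history = [
    ("insert", "The"), ("insert", "cat"), ("insert", "sat"),
    ("move", -2), ("delete", None), ("insert", "dog"),
    ("move", 1),
]
print(replay(history))
```

In this toy version, the model would be trained to predict the next action in `history`, not the next token of the final text, which is what "autoregressive over edit-history actions" means in the sketch.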

My goal is to submit the paper to a conference such as ACL, EMNLP, ICML, or a similar venue, and I would really appreciate technical feedback on things like:

- Boldness/strength of the claims
- Weaknesses
- Quality of the results, or if I should include other results

Paper: https://github.com/Sean-Diab/Reviser/blob/main/main.pdf

I would really value any feedback on what I should improve before submitting.

I am also looking for an arXiv endorsement for cs.CL. If anyone here is eligible and feels comfortable helping, my endorsement link is: https://arxiv.org/auth/endorse?x=ISRSI8

Thank you very much.


supermdguy 60 days ago

Overall, I'm really impressed by what you accomplished! I'm not a researcher, so not sure if this is that helpful, but here are some thoughts:

- I wonder if the "move" action is difficult for the model to learn to use well. The model sees token location as positional encodings in the embedding, not sparse character offsets. Would be interesting to see something more like "jump to next/previous [token or set of tokens]". Or maybe a find/replace like most coding harness edit tools use?

- I'd move the exact training data generation details to an appendix. Could be summarized to improve the flow of the paper.
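For the first point, a rough sketch of what the two alternatives could look like next to a relative move. Both functions are hypothetical illustrations, not part of Reviser or of any particular coding tool: `jump_to_next` addresses the cursor by content instead of by offset, and `find_replace` swaps the first matching run of tokens.

```python
def jump_to_next(tokens, cursor, target):
    """Return the index of the next `target` at or after `cursor`.

    Leaves the cursor where it is when the target does not occur.
    """
    try:
        return tokens.index(target, cursor)
    except ValueError:
        return cursor


def find_replace(tokens, old, new):
    """Return a copy of `tokens` with the first run equal to `old` replaced by `new`.

    Returns an unchanged copy when `old` is empty or not found.
    """
    n = len(old)
    if n == 0:
        return list(tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == old:
            return tokens[:i] + new + tokens[i + n:]
    return list(tokens)


tokens = ["The", "cat", "sat", "on", "the", "mat"]
print(jump_to_next(tokens, 0, "sat"))
print(find_replace(tokens, ["cat", "sat"], ["dog", "slept"]))
```

The appeal of both is that the model names what it wants to edit rather than counting how far away it is.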


param-updater 60 days ago

Hi, thank you for your advice, I really appreciate it!

My model has been able to move pretty naturally throughout the canvas when editing, and it remembers the actual canvas well, including the order of the tokens. But I understand where you're coming from.

Jump to next/previous token is a good idea, and in the future I can definitely look into implementing it, especially for scaling the model up. Same thing with find/replace. Thanks again.
