by eric_gu 674 days ago
Thanks for the reply!

Re question #3: I'm not sure I understand why you need to vary the base model, or how doing so would let LSR take advantage of it. Isn't your LSR technique applied to the activations of the evaluator model?

As a note of feedback, I found the original article a bit hard to understand even with multiple reads. I would have really benefited from a traditional "methodology" section like in an ML paper! The graphs upfront don't make sense to someone who isn't familiar with the problem setting, and even now I'm not sure if the x-axis in the HaluEval Benchmark bar chart refers to the base model or the evaluator model. Maybe it's just me.

1 comments

Re #3: my bad, I mixed terminology in my answer above. I meant the "base model" of the evaluator model (as opposed to a fine-tuned evaluator model). We just use the labeled Halubench dataset as the outputs to be evaluated, so there is no base model for the HaluEval task.

Thanks for the feedback, really helpful. We may edit for clarity.

Ah understood. Makes sense now!