by ruby314
675 days ago
Re #3: my bad, I mixed terminology in my answer above. It’s the “base model” for the evaluator model (as opposed to a fine-tuned evaluator model). We’re just using the labeled HaluBench dataset as the outputs to be evaluated, so there is no base model for the HaluEval task. Thanks for the feedback, really helpful. We may edit for clarity.