Article

Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate.

, , , and .
CoRR (2024)
