Article

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models.

Z. Li, T. Xu, Y. Zhang, Y. Yu, R. Sun, and Z. Luo.
CoRR, (2023)

Metadata

BibTeX key: journals/corr/abs-2310-10505
entry type: article
year: 2023
journal: CoRR
volume: abs/2310.10505
ee: https://doi.org/10.48550/arXiv.2310.10505
url: http://dblp.uni-trier.de/db/journals/corr/corr2310.html#abs-2310-10505

Tags

dblp
