From post

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed.

 

Other publications of persons with the same name

Unfamiliar Finetuning Examples Control How Language Models Hallucinate., , , , and . CoRR, (2024)
Chaining Behaviors from Data with Model-Free Reinforcement Learning., , , , , and . CoRL, volume 155 of Proceedings of Machine Learning Research, pp. 2162-2177. PMLR, (2020)
RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold., , , , , and . CoRR, (2024)
Zero-Shot Robotic Manipulation with Pre-Trained Image-Editing Diffusion Models., , , , , , and . ICLR, OpenReview.net, (2024)
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions., , , , , , , , , and 15 other author(s). CoRL, volume 229 of Proceedings of Machine Learning Research, pp. 3909-3928. PMLR, (2023)
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters., , , and . CoRR, (2024)
Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction., , , , , and . CoRR, (2023)
Conservative Q-Learning for Offline Reinforcement Learning., , , and . NeurIPS, (2020)
Conservative Data Sharing for Multi-Task Offline Reinforcement Learning., , , , , and . NeurIPS, pp. 11501-11516. (2021)
COMBO: Conservative Offline Model-Based Policy Optimization., , , , , and . NeurIPS, pp. 28954-28967. (2021)