
Cancellation-Free Regret Bounds for Lagrangian Approaches in Constrained Markov Decision Processes.

A. Müller, P. Alatur, G. Ramponi, and N. He. CoRR (2023)


Other publications by persons with the same name

Sample Complexity and Overparameterization Bounds for Temporal-Difference Learning With Neural Network Approximation. S. Cayci, S. Satpathi, N. He, and R. Srikant. IEEE Trans. Autom. Control, 68 (5): 2891-2905 (May 2023)
Optimal Guarantees for Algorithmic Reproducibility and Gradient Complexity in Convex Optimization. L. Zhang, J. Yang, A. Karbasi, and N. He. CoRR (2023)
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space. A. Barakat, I. Fatkhullin, and N. He. ICML, vol. 202 of Proceedings of Machine Learning Research, pp. 1753-1800. PMLR (2023)
Kernel Conditional Moment Constraints for Confounding Robust Inference. K. Ishikawa and N. He. AISTATS, vol. 206 of Proceedings of Machine Learning Research, pp. 650-674. PMLR (2023)
Optimization for Reinforcement Learning: From Single Agent to Cooperative Agents. D. Lee, N. He, P. Kamalaruban, and V. Cevher. CoRR (2019)
Parameter-Agnostic Optimization under Relaxed Smoothness. F. Hübler, J. Yang, X. Li, and N. He. AISTATS, vol. 238 of Proceedings of Machine Learning Research, pp. 4861-4869. PMLR (2024)
Taming Nonconvex Stochastic Mirror Descent with General Bregman Divergence. I. Fatkhullin and N. He. AISTATS, vol. 238 of Proceedings of Machine Learning Research, pp. 3493-3501. PMLR (2024)
Provably Convergent Policy Optimization via Metric-aware Trust Region Methods. J. Song, N. He, L. Ding, and C. Zhao. Trans. Mach. Learn. Res. (2023)
Periodic Q-Learning. D. Lee and N. He. L4DC, vol. 120 of Proceedings of Machine Learning Research, pp. 582-598. PMLR (2020)
Efficiently Escaping Saddle Points for Non-Convex Policy Optimization. S. Khorasani, S. Salehkaleybar, N. Kiyavash, N. He, and M. Grossglauser. CoRR (2023)
