Deep learning-based channel estimation algorithm over time selective fading channels Q Bai, J Wang, Y Zhang, J Song IEEE Transactions on Cognitive Communications and Networking 6 (1), 125-134, 2019 | 102 | 2019 |
Achieving zero constraint violation for constrained reinforcement learning via primal-dual approach Q Bai, AS Bedi, M Agarwal, A Koppel, V Aggarwal Proceedings of the AAAI Conference on Artificial Intelligence 36 (4), 3682-3689, 2022 | 40 | 2022 |
Reinforcement learning for constrained Markov decision processes A Gattami, Q Bai, V Aggarwal International Conference on Artificial Intelligence and Statistics, 2656-2664, 2021 | 16 | 2021 |
Reinforcement learning for multi-objective and constrained Markov decision processes A Gattami, Q Bai, V Aggarwal arXiv preprint arXiv:1901.08978, 2019 | 15 | 2019 |
Model-free algorithm and regret analysis for MDPs with peak constraints Q Bai, A Gattami, V Aggarwal arXiv preprint arXiv:2003.05555, 2020 | 11 | 2020 |
Regret guarantees for model-based reinforcement learning with long-term average constraints M Agarwal, Q Bai, V Aggarwal Uncertainty in Artificial Intelligence, 22-31, 2022 | 8 | 2022 |
Concave utility reinforcement learning with zero-constraint violations M Agarwal, Q Bai, V Aggarwal arXiv preprint arXiv:2109.05439, 2021 | 8 | 2021 |
Joint optimization of multi-objective reinforcement learning with policy gradient based algorithm Q Bai, M Agarwal, V Aggarwal arXiv preprint arXiv:2105.14125, 2021 | 8 | 2021 |
Markov decision processes with long-term average constraints M Agarwal, Q Bai, V Aggarwal arXiv preprint arXiv:2106.06680, 2021 | 7 | 2021 |
Escaping saddle points for zeroth-order non-convex optimization using estimated gradient descent Q Bai, M Agarwal, V Aggarwal 2020 54th Annual Conference on Information Sciences and Systems (CISS), 1-6, 2020 | 7 | 2020 |
Achieving zero constraint violation for constrained reinforcement learning via conservative natural policy gradient primal-dual algorithm Q Bai, AS Bedi, V Aggarwal Proceedings of the AAAI Conference on Artificial Intelligence 37 (6), 6737-6744, 2023 | 5 | 2023 |
A Reinforcement Learning Framework for Vehicular Network Routing Under Peak and Average Constraints N Geng, Q Bai, C Liu, T Lan, V Aggarwal, Y Yang, M Xu IEEE Transactions on Vehicular Technology, 2023 | 4 | 2023 |
Joint Optimization of Concave Scalarized Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm Q Bai, M Agarwal, V Aggarwal Journal of Artificial Intelligence Research 74, 1565-1597, 2022 | 2 | 2022 |
Achieving Zero Constraint Violation for Concave Utility Constrained Reinforcement Learning via Primal-Dual Approach Q Bai, AS Bedi, M Agarwal, A Koppel, V Aggarwal | 2 | 2022 |
Achieving zero constraint violation for constrained reinforcement learning via conservative natural policy gradient primal-dual algorithm Q Bai, AS Bedi, V Aggarwal arXiv preprint arXiv:2206.05850, 2022 | 2 | 2022 |
Regret analysis of policy gradient algorithm for infinite horizon average reward Markov decision processes Q Bai, WU Mondal, V Aggarwal arXiv preprint arXiv:2309.01922, 2023 | 1 | 2023 |
Model-free algorithm and regret analysis for MDPs with long-term constraints Q Bai, V Aggarwal, A Gattami arXiv preprint arXiv:2006.05961, 2020 | 1 | 2020 |
Provably Sample-Efficient Model-Free Algorithm for MDPs with Peak Constraints Q Bai, V Aggarwal, A Gattami Journal of Machine Learning Research 24 (60), 1-25, 2023 | | 2023 |