QinBo Bai
Title
Cited by
Year
Deep learning-based channel estimation algorithm over time selective fading channels
Q Bai, J Wang, Y Zhang, J Song
IEEE Transactions on Cognitive Communications and Networking 6 (1), 125-134, 2019
Cited by: 102; Year: 2019
Achieving zero constraint violation for constrained reinforcement learning via primal-dual approach
Q Bai, AS Bedi, M Agarwal, A Koppel, V Aggarwal
Proceedings of the AAAI Conference on Artificial Intelligence 36 (4), 3682-3689, 2022
Cited by: 40; Year: 2022
Reinforcement learning for constrained Markov decision processes
A Gattami, Q Bai, V Aggarwal
International Conference on Artificial Intelligence and Statistics, 2656-2664, 2021
Cited by: 16; Year: 2021
Reinforcement learning for multi-objective and constrained Markov decision processes
A Gattami, Q Bai, V Aggarwal
arXiv preprint arXiv:1901.08978, 2019
Cited by: 15; Year: 2019
Model-free algorithm and regret analysis for MDPs with peak constraints
Q Bai, A Gattami, V Aggarwal
arXiv preprint arXiv:2003.05555, 2020
Cited by: 11*; Year: 2020
Regret guarantees for model-based reinforcement learning with long-term average constraints
M Agarwal, Q Bai, V Aggarwal
Uncertainty in Artificial Intelligence, 22-31, 2022
Cited by: 8; Year: 2022
Concave utility reinforcement learning with zero-constraint violations
M Agarwal, Q Bai, V Aggarwal
arXiv preprint arXiv:2109.05439, 2021
Cited by: 8; Year: 2021
Joint optimization of multi-objective reinforcement learning with policy gradient based algorithm
Q Bai, M Agarwal, V Aggarwal
arXiv preprint arXiv:2105.14125, 2021
Cited by: 8; Year: 2021
Markov decision processes with long-term average constraints
M Agarwal, Q Bai, V Aggarwal
arXiv preprint arXiv:2106.06680, 2021
Cited by: 7; Year: 2021
Escaping saddle points for zeroth-order non-convex optimization using estimated gradient descent
Q Bai, M Agarwal, V Aggarwal
2020 54th Annual Conference on Information Sciences and Systems (CISS), 1-6, 2020
Cited by: 7; Year: 2020
Achieving zero constraint violation for constrained reinforcement learning via conservative natural policy gradient primal-dual algorithm
Q Bai, AS Bedi, V Aggarwal
Proceedings of the AAAI Conference on Artificial Intelligence 37 (6), 6737-6744, 2023
Cited by: 5; Year: 2023
A Reinforcement Learning Framework for Vehicular Network Routing Under Peak and Average Constraints
N Geng, Q Bai, C Liu, T Lan, V Aggarwal, Y Yang, M Xu
IEEE Transactions on Vehicular Technology, 2023
Cited by: 4; Year: 2023
Joint Optimization of Concave Scalarized Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Q Bai, M Agarwal, V Aggarwal
Journal of Artificial Intelligence Research 74, 1565-1597, 2022
Cited by: 2; Year: 2022
Achieving Zero Constraint Violation for Concave Utility Constrained Reinforcement Learning via Primal-Dual Approach
Q Bai, AS Bedi, M Agarwal, A Koppel, V Aggarwal
Cited by: 2; Year: 2022
Achieving zero constraint violation for constrained reinforcement learning via conservative natural policy gradient primal-dual algorithm
Q Bai, AS Bedi, V Aggarwal
arXiv preprint arXiv:2206.05850, 2022
Cited by: 2; Year: 2022
Regret analysis of policy gradient algorithm for infinite horizon average reward Markov decision processes
Q Bai, WU Mondal, V Aggarwal
arXiv preprint arXiv:2309.01922, 2023
Cited by: 1; Year: 2023
Model-free algorithm and regret analysis for MDPs with long-term constraints
Q Bai, V Aggarwal, A Gattami
arXiv preprint arXiv:2006.05961, 2020
Cited by: 1; Year: 2020
Provably Sample-Efficient Model-Free Algorithm for MDPs with Peak Constraints
Q Bai, V Aggarwal, A Gattami
Journal of Machine Learning Research 24 (60), 1-25, 2023
Year: 2023