BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset | J Ji, M Liu, J Dai, X Pan, C Zhang, C Bian, R Sun, Y Wang, Y Yang | arXiv preprint arXiv:2307.04657, 2023 | 66 | 2023 |
OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research | J Ji, J Zhou, B Zhang, J Dai, X Pan, R Sun, W Huang, Y Geng, M Liu, ... | arXiv preprint arXiv:2305.09304, 2023 | 13 | 2023 |