How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model. M Hanna, O Liu, A Variengien. NeurIPS 2023.
Interpretable Diffusion via Information Decomposition. X Kong, O Liu, H Li, D Yogatama, GV Steeg. ICLR 2024.
Approximating CKY with Transformers. G Khalighinejad, O Liu, S Wiseman. Findings of EMNLP 2023.
IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations. D Fu, G Khalighinejad, O Liu, B Dhingra, D Yogatama, R Jia, et al. arXiv preprint arXiv:2404.01266, 2024.
DeLLMa: A Framework for Decision Making Under Uncertainty with Large Language Models. O Liu, D Fu, D Yogatama, W Neiswanger. arXiv preprint arXiv:2402.02392, 2024.
On Retrieval Augmentation and the Limitations of Language Model Training. TR Chiang, XV Yu, J Robinson, O Liu, I Lee, D Yogatama. NAACL 2024 (Short Paper).