Uniter: Universal image-text representation learning YC Chen, L Li, L Yu, A El Kholy, F Ahmed, Z Gan, Y Cheng, J Liu European conference on computer vision, 104-120, 2020 | 2161 | 2020 |
Uniter: Learning universal image-text representations YC Chen, L Li, L Yu, A El Kholy, F Ahmed, Z Gan, Y Cheng, J Liu | 412 | 2019 |
Towards end-to-end reinforcement learning of dialogue agents for information access B Dhingra, L Li, X Li, J Gao, YN Chen, F Ahmed, L Deng arXiv preprint arXiv:1609.00777, 2016 | 384 | 2016 |
Mm-react: Prompting chatgpt for multimodal reasoning and action Z Yang, L Li, J Wang, K Lin, E Azarnasab, F Ahmed, Z Liu, C Liu, M Zeng, ... arXiv preprint arXiv:2303.11381, 2023 | 270 | 2023 |
Swinbert: End-to-end transformers with sparse attention for video captioning K Lin, L Li, CC Lin, F Ahmed, Z Gan, Z Liu, Y Lu, L Wang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 242 | 2022 |
Bbq-networks: Efficient exploration in deep reinforcement learning for task-oriented dialogue systems Z Lipton, X Li, J Gao, L Li, F Ahmed, L Deng Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018 | 192 | 2018 |
End-to-end learning of dialogue agents for information access L Li, B Dhingra, J Gao, X Li, YN Chen, L Deng, F Ahmed US Patent 10,546,066, 2020 | 159 | 2020 |
The five Ws for information visualization with application to healthcare informatics Z Zhang, B Wang, F Ahmed, IV Ramakrishnan, R Zhao, A Viccellio, ... IEEE transactions on visualization and computer graphics 19 (11), 1895-1910, 2013 | 110 | 2013 |
Unitab: Unifying text and box outputs for grounded vision-language modeling Z Yang, Z Gan, J Wang, X Hu, F Ahmed, Z Liu, Y Lu, L Wang European Conference on Computer Vision, 521-539, 2022 | 108 | 2022 |
Efficient exploration for dialogue policy learning with bbq networks & replay buffer spiking ZC Lipton, J Gao, L Li, X Li, F Ahmed, L Deng arXiv preprint arXiv:1608.05081 3, 2016 | 66 | 2016 |
Accessible skimming: faster screen reading of web pages F Ahmed, Y Borodin, A Soviak, M Islam, IV Ramakrishnan, T Hedgpeth Proceedings of the 25th annual ACM symposium on User interface software and …, 2012 | 61 | 2012 |
Why read if you can skim: towards enabling faster screen reading F Ahmed, Y Borodin, Y Puzis, IV Ramakrishnan Proceedings of the International Cross-Disciplinary Conference on Web …, 2012 | 47 | 2012 |
Crossing the format boundary of text and boxes: Towards unified vision-language modeling Z Yang, Z Gan, J Wang, X Hu, F Ahmed, Z Liu, Y Lu, L Wang arXiv preprint arXiv:2111.12085 3, 2021 | 39 | 2021 |
Mm-vid: Advancing video understanding with gpt-4v(ision) K Lin, F Ahmed, L Li, CC Lin, E Azarnasab, Z Yang, J Wang, L Liang, ... arXiv preprint arXiv:2310.19773, 2023 | 35 | 2023 |
Hearsay: a new generation context-driven multi-modal assistive web browser Y Borodin, F Ahmed, MA Islam, Y Puzis, V Melnyk, S Feng, ... Proceedings of the 19th international conference on World wide web, 1233-1236, 2010 | 28 | 2010 |
Efficient exploration for dialog policy learning with deep BBQ networks & replay buffer spiking ZC Lipton, J Gao, L Li, X Li, F Ahmed, L Deng CoRR abs/1608.05081, 2016 | 27 | 2016 |
Assistive web browsing with touch interfaces F Ahmed, MA Islam, Y Borodin, IV Ramakrishnan Proceedings of the 12th international ACM SIGACCESS conference on Computers …, 2010 | 21 | 2010 |
Non-visual skimming on touch-screen devices F Ahmed, A Soviak, Y Borodin, IV Ramakrishnan Proceedings of the 2013 international conference on Intelligent user …, 2013 | 12 | 2013 |
An intuitive accessible web automation user interface Y Puzis, Y Borodin, F Ahmed, IV Ramakrishnan Proceedings of the International Cross-Disciplinary Conference on Web …, 2012 | 12 | 2012 |
Bridging the web accessibility divide IV Ramakrishnan, J Mahmud, Y Borodin, MA Islam, F Ahmed Electronic Notes in Theoretical Computer Science 235, 107-124, 2009 | 11 | 2009 |