A read-write memory network for movie story understanding S Na, S Lee, J Kim, G Kim Proceedings of the IEEE International Conference on Computer Vision, 677-685, 2017 | 129 | 2017 |
Parameter efficient multimodal transformers for video representation learning S Lee, Y Yu, G Kim, T Breuel, J Kautz, Y Song International Conference on Learning Representations, 2021 | 91 | 2021 |
A memory network approach for story-based temporal summarization of 360° videos S Lee, J Sung, Y Yu, G Kim Proceedings of the IEEE conference on computer vision and pattern …, 2018 | 73 | 2018 |
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action J Lu, C Clark, S Lee, Z Zhang, S Khosla, R Marten, D Hoiem, A Kembhavi Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 59 | 2024 |
A deep ranking model for spatio-temporal highlight detection from a 360° video Y Yu, S Lee, J Na, J Kang, G Kim Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018 | 49 | 2018 |
ACAV100M: Automatic curation of large-scale datasets for audio-visual video representation learning S Lee, J Chung, Y Yu, G Kim, T Breuel, G Chechik, Y Song Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 39 | 2021 |
Self-Supervised Learning of Compressed Video Representations Y Yu, S Lee, G Kim, Y Song International Conference on Learning Representations, 2021 | 18 | 2021 |
Encoding video and label priors for multi-label video classification on YouTube-8M dataset S Na, Y Yu, S Lee, J Kim, G Kim arXiv preprint arXiv:1706.07960, 2017 | 14 | 2017 |
Unsupervised representation learning via neural activation coding Y Park, S Lee, G Kim, D Blei International Conference on Machine Learning, 8391-8400, 2021 | 6 | 2021 |
Can Language Models Laugh at YouTube Short-form Videos? D Ko, S Lee, G Kim Proceedings of the 2023 Conference on Empirical Methods in Natural Language …, 2023 | 1 | 2023 |