cosFormer: Rethinking softmax in attention Z Qin, W Sun, H Deng, D Li, Y Wei, B Lv, J Yan, L Kong, Y Zhong ICLR 2022, 2022 | 161 | 2022 |
Audio–visual segmentation J Zhou, J Wang, J Zhang, W Sun, J Zhang, S Birchfield, D Guo, L Kong, ... European Conference on Computer Vision, 386-403, 2022 | 80* | 2022 |
The Devil in Linear Transformer Z Qin, XD Han, W Sun, D Li, L Kong, N Barnes, Y Zhong EMNLP 2022, 2022 | 26* | 2022 |
Audio-Visual Segmentation with Semantics arXiv preprint arXiv:2301.13190, 2023 | 22* | 2023 |
Vicinity vision transformer W Sun, Z Qin, H Deng, J Wang, Y Zhang, K Zhang, N Barnes, S Birchfield, ... IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023 | 20 | 2023 |
GETAM: Gradient-weighted element-wise transformer attention map for weakly-supervised semantic segmentation W Sun, J Zhang, Z Liu, Y Zhong, N Barnes arXiv preprint arXiv:2112.02841, 2021 | 17 | 2021 |
Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning W Sun, J Zhang, J Wang, Z Liu, Y Zhong, T Feng, Y Guo, Y Zhang, ... CVPR 2023, 2023 | 16 | 2023 |
Neural architecture search on efficient transformers and beyond Z Liu, D Li, K Lu, Z Qin, W Sun, J Xu, Y Zhong arXiv preprint arXiv:2207.13955, 2022 | 15 | 2022 |
Inferring the class conditional response map for weakly supervised semantic segmentation W Sun, J Zhang, N Barnes WACV 2022, 2878-2887, 2022 | 14 | 2022 |
3D guided weakly supervised semantic segmentation W Sun, J Zhang, N Barnes Proceedings of the Asian Conference on Computer Vision, 2020 | 14 | 2020 |
An alternative to WSSS? An empirical study of the Segment Anything Model (SAM) on weakly-supervised semantic segmentation problems W Sun, Z Liu, Y Zhang, Y Zhong, N Barnes arXiv preprint arXiv:2305.01586, 2023 | 13 | 2023 |
Toeplitz Neural Network for Sequence Modeling Z Qin, X Han, W Sun, B He, D Li, D Li, Y Dai, L Kong, Y Zhong ICLR 2023, 2023 | 11 | 2023 |
Structural edge detection: A dataset and benchmark W Sun, S You, J Walker, K Li, N Barnes 2018 Digital Image Computing: Techniques and Applications (DICTA), 1-8, 2018 | 8 | 2018 |
Scaling TransNormer to 175 billion parameters Z Qin, D Li, W Sun, W Sun, X Shen, X Han, Y Wei, B Lv, F Yuan, X Luo, ... arXiv preprint arXiv:2307.14995, 2023 | 7 | 2023 |
Bi-directional Training for Composed Image Retrieval via Text Prompt Learning Z Liu, W Sun, Y Hong, D Teney, S Gould WACV 2024, 2023 | 6 | 2023 |
Linear video transformer with feature fixation K Lu, Z Liu, J Wang, W Sun, Z Qin, D Li, X Shen, H Deng, X Han, Y Dai, ... arXiv preprint arXiv:2210.08164, 2022 | 4 | 2022 |
BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation Z Wu, Y Li, H Yan, T Shang, W Sun, S Wang, R Cui, W Liu, H Sato, H Li, ... arXiv preprint arXiv:2401.17053, 2024 | 3 | 2024 |
All-pairs Consistency Learning for Weakly Supervised Semantic Segmentation W Sun, Y Zhang, Z Qin, Z Liu, L Cheng, F Wang, Y Zhong, N Barnes ICCV Workshop on New Ideas in Vision Transformers (Best Paper), 2023 | 3 | 2023 |
Linearized Relative Positional Encoding Z Qin, W Sun, K Lu, H Deng, D Li, X Han, Y Dai, L Kong, Y Zhong TMLR 2023, 2023 | 3 | 2023 |
Candidate Set Re-ranking for Composed Image Retrieval with Dual Multi-modal Encoder Z Liu, W Sun, D Teney, S Gould TMLR, 2023 | 3 | 2023 |