Journal of Guangdong University of Technology ›› 2024, Vol. 41 ›› Issue (04): 89-97. doi: 10.12052/gdutxb.230122
• Computer Science and Technology •
Chen Yong-feng, Liu Jing, Yang Zhi-jing, Chen Rui-han, Tan Jun-peng
[1] Li Xue-sen, Tan Bei-hai, Yu Rong, Xue Xian-bin. Small Target Detection Algorithm for Lightweight UAV Aerial Photography Based on YOLOv5 [J]. Journal of Guangdong University of Technology, 2024, 41(03): 71-80.
[2] Zeng An, Chen Xu-zhou, Ji Yu-Zhu, Pan Dan, Xu Xiao-Wei. Cardiac Multiclass Segmentation Method Based on Self-attention and 3D Convolution [J]. Journal of Guangdong University of Technology, 2023, 40(06): 168-175.
[3] Wu Xiao-ling, Chen Xiang-wang, Zhan Wen-tao, Ling Jie. Chinese Medical Named Entity Recognition Based on Gated Attention Unit [J]. Journal of Guangdong University of Technology, 2023, 40(06): 176-184.
[4] Wu Zhen-hua, Tang Wen-yan, Lyu Wen-ge, Chen Ru-jie, Hou Meng-hua, Li De-yuan. Fast Image Segmentation with Multilevel Threshold of Two-dimensional Entropy Based on ISSA and Integral Graph [J]. Journal of Guangdong University of Technology, 2023, 40(05): 47-55.
[5] Zhong Geng-jun, Li Dong. A Channel-splited Based Dual-branch Block for 3D Point Cloud Processing [J]. Journal of Guangdong University of Technology, 2023, 40(04): 18-23.
[6] Lin Zhe-huang, Li Dong. Semantics-guided Adaptive Topology Inference Graph Convolutional Networks for Skeleton-based Action Recognition [J]. Journal of Guangdong University of Technology, 2023, 40(04): 45-52.
[7] Huang Xiao-yong, Li Wei-tong. Fall Detection Algorithm Based on TSSI and STB-CNN [J]. Journal of Guangdong University of Technology, 2023, 40(04): 53-59.
[8] Chen Xiao-rong, Yang Xue-rong, Cheng Si-yuan, Liu Guo-dong. Surface Defect Detection of Lithium Battery Electrodes Based on Improved Unet Network [J]. Journal of Guangdong University of Technology, 2023, 40(04): 60-66,93.
[9] Cao Zhi-xiong, Wu Xiao-ling, Luo Xiao-wei, Ling Jie. Helmet Wearing Detection Algorithm Intergrating Transfer Learning and YOLOv5 [J]. Journal of Guangdong University of Technology, 2023, 40(04): 67-76.
[10] Xie Guo-bo, Lin Li, Lin Zhi-yi, He Di-xuan, Wen Gang. An Insulator Burst Defect Detection Method Based on YOLOv4-MP [J]. Journal of Guangdong University of Technology, 2023, 40(02): 15-21.
[11] Zou Heng, Gao Jun-li, Zhang Shu-wen, Song Hai-tao. Design and Implementation of a Dropping Guidance Device for Go Robot [J]. Journal of Guangdong University of Technology, 2023, 40(01): 77-82,91.
[12] Yi Min-qi, Liu Hong-wei, Gao Hong-ming. Research on the Factors Influencing the Co-purchase Network of Products on E-commerce Platforms [J]. Journal of Guangdong University of Technology, 2022, 39(03): 16-24.
[13] Qiu Zhan-chun, Fei Lun-ke, Teng Shao-hua, Zhang Wei. Palmprint Recognition Based on Cosine Similarity [J]. Journal of Guangdong University of Technology, 2022, 39(03): 55-62.
[14] Gary Yen, Li Bo, Xie Sheng-li. An Evolutionary Optimization of LSTM for Model Recovery of Geophysical Fluid Dynamics [J]. Journal of Guangdong University of Technology, 2021, 38(06): 1-8.
[15] Deng Jie-hang, Yuan Zhong-ming, Lin Hao-run, Gu Guo-sheng. Superpixel and Visual Saliency Synergetic Image Quality Assessment [J]. Journal of Guangdong University of Technology, 2021, 38(05): 33-39.