Journal of Guangdong University of Technology ›› 2023, Vol. 40 ›› Issue (06): 155-167.doi: 10.12052/gdutxb.230130
• Artificial Intelligence •
Lai Zhi-mao1,2, Zhang Yun1, Li Dong1
[1] ROETTGERS J. Porn producers offer to help Hollywood take down deepfake videos[EB/OL]. (2018-02-21) [2023-09-30]. https://variety.com/2018/digital/news/deepfakes-porn-adult-industry-12027057-49/.
[2] ROSSLER A, COZZOLINO D, VERDOLIVA L, et al. FaceForensics++: learning to detect manipulated facial images[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019: 1-11.
[3] QIAN Y, YIN G, SHENG L, et al. Thinking in frequency: face forgery detection by mining frequency-aware clues[C]//European Conference on Computer Vision. Online: Springer, 2020: 86-103.
[4] MASI I, KILLEKAR A, MASCARENHAS R M, et al. Two-branch recurrent network for isolating deepfakes in videos[C]//European Conference on Computer Vision. Online: Springer, 2020: 667-684.
[5] LI L, BAO J, ZHANG T, et al. Face X-ray for more general face forgery detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020: 5001-5010.
[6] ZHAO H, ZHOU W, CHEN D, et al. Multi-attentional deepfake detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Online: IEEE, 2021: 2185-2194.
[7] LIU H, LI X, ZHOU W, et al. Spatial-phase shallow learning: rethinking face forgery detection in frequency domain[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Online: IEEE, 2021: 772-781.
[8] ZHAO T, XU X, XU M, et al. Learning self-consistency for deepfake detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 15023-15033.
[9] HALIASSOS A, VOUGIOUKAS K, PETRIDIS S, et al. Lips don't lie: a generalisable and robust approach to face forgery detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Online: IEEE, 2021: 5039-5049.
[10] CHUGH K, GUPTA P, DHALL A, et al. Not made for each other: audio-visual dissonance-based deepfake detection and localization[C]//Proceedings of the 28th ACM International Conference on Multimedia. Seattle, USA: ACM, 2020: 439-447.
[11] MITTAL T, BHATTACHARYA U, CHANDRA R, et al. Emotions don't lie: an audio-visual deepfake detection method using affective cues[C]//Proceedings of the 28th ACM International Conference on Multimedia. Seattle, USA: ACM, 2020: 2823-2832.
[12] ZHOU Y, LIM S N. Joint audio-visual deepfake detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 14800-14809.
[13] ZHOU D, KANG B, JIN X, et al. DeepViT: towards deeper vision transformer[EB/OL]. arXiv: 2103.11886 (2021-03-21) [2023-09-20]. https://arxiv.org/abs/2103.11886.
[14] PENG Z, GUO Z, HUANG W, et al. Conformer: local features coupling global representations for recognition and detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(8): 9454-9468.
[15] GONG C, WANG D, LI M, et al. Vision transformers with patch diversification[EB/OL]. arXiv: 2104.12753 (2021-06-11) [2023-09-20]. https://arxiv.org/abs/2104.12753.
[16] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//31st Conference on Neural Information Processing Systems. Long Beach, USA: MIT Press, 2017: 5998-6008.
[17] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[C]//Proceedings of the 9th International Conference on Learning Representations. Online: ACM, 2021: 1-6.
[18] LIU Z, LIN Y, CAO Y, et al. Swin transformer: hierarchical vision transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 10012-10022.
[19] CHEN C F R, FAN Q, et al. CrossViT: cross-attention multi-scale vision transformer for image classification[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 357-366.
[20] Deepfakes[EB/OL]. (2019-10-01) [2023-09-20]. https://github.com/deepfakes/faceswap.
[21] KORSHUNOVA I, SHI W, DAMBRE J, et al. Fast face-swap using convolutional neural networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 3677-3685.
[22] LI L, BAO J, YANG H, et al. Advancing high fidelity identity swapping for forgery detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020: 5074-5083.
[23] CHEN R, CHEN X, NI B, et al. SimSwap: an efficient framework for high fidelity face swapping[C]//Proceedings of the 28th ACM International Conference on Multimedia. Seattle, USA: ACM, 2020: 2003-2011.
[24] THIES J, ZOLLHOFER M, STAMMINGER M, et al. Face2Face: real-time face capture and reenactment of RGB videos[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016: 2387-2395.
[25] NIRKIN Y, KELLER Y, HASSNER T. FSGAN: subject agnostic face swapping and reenactment[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019: 7184-7193.
[26] ZHU K, LI L, ZHANG T, et al. A survey of vision transformer in low-level computer vision[J/OL]. Computer Engineering and Applications. https://link.cnki.net/urlid/11.2127.TP.20230817.1249.004.
[27] TOUVRON H, CORD M, DOUZE M, et al. Training data-efficient image transformers & distillation through attention[C]//International Conference on Machine Learning. Online: PMLR, 2021: 10347-10357.
[28] WU H, XIAO B, CODELLA N, et al. CvT: introducing convolutions to vision transformers[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 22-31.
[29] D'ASCOLI S, TOUVRON H, LEAVITT M L, et al. ConViT: improving vision transformers with soft convolutional inductive biases[C]//Proceedings of the 38th International Conference on Machine Learning. Online: PMLR, 2021: 2286-2296.
[30] CHU X, TIAN Z, WANG Y, et al. Twins: revisiting the design of spatial attention in vision transformers[J]. Advances in Neural Information Processing Systems, 2021, 34: 9355-9366.
[31] CHEN R, PANDA R, FAN Q. RegionViT: regional-to-local attention for vision transformers[C]//International Conference on Learning Representations. Online: ACM, 2022.
[32] WANG W, XIE E, LI X, et al. Pyramid vision transformer: a versatile backbone for dense prediction without convolutions[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 568-578.
[33] YUAN L, CHEN Y, WANG T, et al. Tokens-to-Token ViT: training vision transformers from scratch on ImageNet[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 558-567.
[34] ZHOU D, KANG B, JIN X, et al. DeepViT: towards deeper vision transformer[EB/OL]. arXiv: 2103.11886 (2021-04-19) [2023-09-20]. https://doi.org/10.48550/arXiv.2103.11886.
[35] TOUVRON H, CORD M, SABLAYROLLES A, et al. Going deeper with image transformers[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 32-42.
[36] DONG X, BAO J, CHEN D, et al. Protecting celebrities from deepfake with identity consistency transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022: 9468-9478.
[37] CHEN H, LIN Y, LI B, et al. Learning features of intra-consistency and inter-diversity: keys toward generalizable deepfake detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(3): 1468-1480.
[38] WANG J, WU Z, OUYANG W, et al. M2TR: multi-modal multi-scale transformers for deepfake detection[C]//Proceedings of the 2022 International Conference on Multimedia Retrieval. Newark, USA: ACM, 2022: 615-623.
[39] TAN Z, YANG Z, MIAO C, et al. Transformer-based feature compensation and aggregation for deepfake detection[J]. IEEE Signal Processing Letters, 2022, 29: 2183-2187.
[40] MIAO C, TAN Z, CHU Q, et al. Hierarchical frequency-assisted interactive networks for face manipulation detection[J]. IEEE Transactions on Information Forensics and Security, 2022, 17: 3008-3021.
[41] MIAO C, TAN Z, CHU Q, et al. F2Trans: high-frequency fine-grained transformer for face forgery detection[J]. IEEE Transactions on Information Forensics and Security, 2023, 18: 1039-1051.
[42] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.
[43] ZHENG Y, BAO J, CHEN D, et al. Exploring temporal coherence for more general video face forgery detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 15044-15054.
[44] GUAN J, ZHOU H, HONG Z, et al. Delving into sequential patches for deepfake detection[J]. Advances in Neural Information Processing Systems, 2022, 35: 4517-4530.
[45] ZHAO C, WANG C, HU G, et al. ISTVT: interpretable spatial-temporal video transformer for deepfake detection[J]. IEEE Transactions on Information Forensics and Security, 2023, 18: 1335-1348.
[46] YU Y, NI R, ZHAO Y, et al. MSVT: multiple spatiotemporal views transformer for deepfake video detection[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(9): 4462-4471.
[47] CHENG H, GUO Y, WANG T, et al. Voice-face homogeneity tells deepfake[EB/OL]. arXiv: 2203.02195 (2022-06-13) [2023-09-20]. https://doi.org/10.48550/arXiv.2203.02195.
[48] ILYAS H, JAVED A, MALIK K M. AVFakeNet: a unified end-to-end dense swin transformer deep learning model for audio-visual deepfakes detection[J]. Applied Soft Computing, 2023, 136: 110124.
[49] YANG W, ZHOU X, CHEN Z, et al. AVoiD-DF: audio-visual joint learning for detecting deepfake[J]. IEEE Transactions on Information Forensics and Security, 2023, 18: 2015-2029.
[50] FENG C, CHEN Z, OWENS A. Self-supervised video forensics by audio-visual anomaly detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE, 2023: 10491-10503.
[51] YU Y, LIU X, NI R, et al. PVASS-MDD: predictive visual-audio alignment self-supervision for multimodal deepfake detection[J/OL]. IEEE Transactions on Circuits and Systems for Video Technology. https://ieeexplore.ieee.org/document/10233898.
[52] DeepfakeDetection[EB/OL]. (2019-10-01) [2023-09-30]. https://github.com/ondyari/FaceForensics.
[53] DOLHANSKY B, HOWES R, PFLAUM B, et al. The deepfake detection challenge (DFDC) preview dataset[EB/OL]. arXiv: 1910.08854 (2019-10-23) [2023-09-20]. https://doi.org/10.48550/arXiv.1910.08854.
[54] LI Y, YANG X, SUN P, et al. Celeb-DF: a large-scale challenging dataset for deepfake forensics[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020: 3207-3216.
[55] LI Y, YANG X, SUN P, et al. Celeb-DF (v2): a new dataset for deepfake forensics[EB/OL]. arXiv: 1909.12962v3 (2019-11-22) [2023-09-20]. https://doi.org/10.48550/arXiv.1909.12962.
[56] ZI B, CHANG M, CHEN J, et al. WildDeepfake: a challenging real-world dataset for deepfake detection[C]//Proceedings of the 28th ACM International Conference on Multimedia. Seattle, USA: ACM, 2020: 2382-2390.
[57] DOLHANSKY B, BITTON J, PFLAUM B, et al. The deepfake detection challenge (DFDC) dataset[EB/OL]. arXiv: 2006.07397 (2020-10-28) [2023-09-20]. https://doi.org/10.48550/arXiv.2006.07397.
[58] JIANG L, LI R, WU W, et al. DeeperForensics-1.0: a large-scale dataset for real-world face forgery detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020: 2889-2898.
[59] DONG X, BAO J, CHEN D, et al. Identity-driven deepfake detection[EB/OL]. arXiv: 2012.03930 (2022-09-07) [2023-09-20]. https://doi.org/10.48550/arXiv.2012.03930.
[60] ZHOU T, WANG W, LIANG Z, et al. Face forensics in the wild[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Online: IEEE, 2021: 5778-5788.
[61] HE Y, GAN B, CHEN S, et al. ForgeryNet: a versatile benchmark for comprehensive forgery analysis[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Online: IEEE, 2021: 4360-4369.
[62] KWON P, YOU J, NAM G, et al. KoDF: a large-scale Korean deepfake detection dataset[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 10744-10753.
[63] CAI Z, STEFANOV K, DHALL A, et al. Do you really mean that? Content driven audio-visual deepfake dataset and multimodal method for temporal forgery localization[C]//2022 International Conference on Digital Image Computing: Techniques and Applications. Online: IEEE, 2022: 1-10.
[64] NAGRANI A, CHUNG J, ZISSERMAN A. VoxCeleb: a large-scale speaker identification dataset[EB/OL]. arXiv: 1706.08612 (2017-06-26) [2023-09-20]. https://doi.org/10.48550/arXiv.1706.08612.
[65] CHUNG J, NAGRANI A, ZISSERMAN A. VoxCeleb2: deep speaker recognition[EB/OL]. arXiv: 1806.05622 [2023-09-20]. https://doi.org/10.48550/arXiv.1806.05622.