Review Article

Forged Video Detection Using Deep Learning: A SLR

Table 6

Comparison of state-of-the-art approaches.

| Sr. no | Reference | Approach | Contribution | Dataset | Model accuracy | Limitation |
|---|---|---|---|---|---|---|
| 1 | Wang et al. [18] | Convolutional neural network | Spatiotemporal model (3D ConvNet). Novel training strategy (AltFreezing) | FaceForensics++ [19], Celeb-DF (V2) [20] | 99% | Data augmentations/Model evaluation |
| 2 | Liu et al. [21] | Neural network | Generalized residual federated learning method | FaceForensics++ [19], some YouTube data | 99.7% | Privacy protection/Data privacy |
| 3 | Tyagi and Yadav [1] | Survey | Deep learning, visual imagery forgery detection | Self-collected data | NA | Generalized methods |
| 4 | Ganguly et al. [2] | Deep learning model | Soft attention mechanism, visual attention | FaceForensics++ [19], Celeb-DF (V2) [20] | 70.1% | Low accuracy |
| 5 | Kumar et al. [3] | Convolutional neural network | Extract deep features, distance of correlation coefficient | VIFFD [22], Surrey University Library for Forensic Analysis (SULFA) [23] | 86.5% and 92% (video level); 99.9% (frame level) | Limited mention of false positives/information about methodology |
| 6 | Tan et al. [4] | 2D convolutional neural network | Bidirectional long short-term memory | SYSU-OBJFORG [24] | 99% | No comparison with emerging deep learning architectures |
| 7 | Zhou et al. [5] | Watermarking network | Robust watermarking network for video forgery detection (RWVFD), tampering localization, 3D-UNet-based watermarking embedding network | Davis [25], YouTube-VOS | NA | Limited scope of video types |
| 8 | Kim et al. [6] | Convolutional neural network | Symmetrically overlapped motion residual | SULFA 14, REWIND 18, SYSU-OBJFORG 15 | 98% | Diverse tampering should be considered |
| 9 | Wang et al. [7] | Discrete cosine transform-based forgery clue augmentation network (FCAN-DCT) | Compact features extraction (CFE), frequency temporal attention (FTA) | Wild-Deepfake [26], Celeb-DF [20] | 86%, 99% | Lack of current real-world exploration |
| 10 | Munawar and Noreen [8] | Siamese-based RNN, I3D (inflated 3D) | Siamese-based RNN integrated with I3D to find the duplicate frame rate | Media forensic challenge (MFC) [27], video and image retrieval and analysis tool (VIRAT) [28] | 93.3%, 86.6% | Transfer learning not explored |
| 11 | Alsakar et al. [9] | SVD (singular value decomposition), inter-frame forgery | First phase is 3D-tensor decomposition, second phase is forgery detection, third phase is forgery locating | Randomly selected eight videos | 99% | Enhance detection and location for variety |
| 12 | Jin et al. [10] | ResNet50 model, LSTM-EnDec, DMAC, noiseprint | Object-based video forgery detection, multi-feature fusion, dual stream | GRIP [29], VTD (video tampering dataset) [30], SULFA [23], REWIND [31] | NA | Limited evaluation of real-world scenarios |
| 13 | Fadl et al. [32] | 2D-CNN, SSIM, Gaussian RBF multiclass support vector machine (RBF-MSVM) | Passive forensics, CNN (convolutional neural network), SSIM, spatiotemporal features, inter-frame forgeries | SULFA [23] | 99.9% | Detecting multiple forgeries in videos |
| 14 | Zheng et al. [33] | Spatiotemporal convolutional | 2D R50 network structure, 3D R50 network structure, 3D R50-FTCN (fully temporal convolutional network) | Deepfake [34], FaceSwap, Face2Face | 99% | Limited real-world application evaluation |
| 15 | Huang et al. [35] | Cross-model authentication | Localization on live surveillance videos | Run time evaluation | 95.1% | Hardware and environment scalability |
| 16 | Verde et al. [36] | Convolutional neural network (CNN) | Focal: Forgery localization framework based on video coding self-consistency | 60 encoded videos | 88.9% | Assess scalability, improve model fusion |
| 17 | Kaur and Jindal [37] | Deep convolutional neural network (DCNN) | ANN (artificial neural network), convolutional layer, ReLU activation layer, max pool layer, correlation classification | REWIND [31], GRIP [29] | 98% | Consideration of hardware constraints |
| 18 | Zhong et al. [38] | Interframe best match algorithm | A unified moment framework, 9-digit dense, moment feature index, best match algorithm | REWIND [31], SULFA [23] | 75% | Real-world scenario evaluation |
| 19 | Sasikumar et al. [39] | SIFT, MSCL, clustering | SIFT (scale-invariant feature transform), MSCL (mean shift clustering algorithm), camera motion, feature extraction, classification, segmentation, in-painting | Randomly collected data | NA | Enhance video duplicate detection security |
| 20 | Aloraini et al. [40] | Sequential and patch analysis | Patch analysis, sequential analysis, object removal video forgery, spatiotemporal analysis | SULFA [23], SYSU-OBJFORG [24] | 72% | Nonadditive models are not explored |
| 21 | Hau Nguyen et al. [41] | Convolutional neural network (CNN) | Video interframe forgery detection, video authenticity, passive forensics | VFDD [42] | 99% | CNN needs to be simplified for diverse forgery |
| 22 | Parveen et al. [43] | Clustering algorithm | K-means clustering, radix sort | Randomly collected data | NA | Limited focus on clustering algorithms |
| 23 | Hosler et al. [44] | Convolutional neural network (CNN) | Benchmark testing, video signal processing | ACID [45] | 95% | Algorithm benchmark evaluations required |
| 24 | Fayyaz et al. [46] | Sensor pattern noise | Video forensics, digital forgery, sensor pattern noise, photo response nonuniformity noise (PRNU) | Dresden [47] | Not mentioned | Vulnerability to induced SPN attacks |
| 25 | Joshi and Jain [48] | Video tampering detection | Temporal fingerprints, optical flow | 200 video clips | 87.5% | Implement machine learning for classification |
| 26 | Chen et al. [49] | Scale-invariant feature transform | Invariant moment, region growing | Copy-move forgery detection (CoMoFoD) [50] | 84.6% | Reduce keypoints, optimize region growing |
| 27 | Pavlović et al. [51] | Multifractal spectrum and statistical parameters | New metaheuristic and supervised learning method | CoMoFoD [50] | 96% | Explore metaheuristics and multifractals further |
| 28 | Liu et al. [52] | Scale-invariant feature transform | K-means clustering | Randomly collected data | 89% | Optimize parameters and explore new technologies |
| 29 | Yadav and Salmani [17] | Survey | Machine learning, deep learning, generative adversarial network, neural network | Self-collected data | NA | Limited theoretical explanation |
| 30 | Jia et al. [53] | Optical flow consistency | Coarse-to-fine detection, passive video forensics | Randomly collected data | Not mentioned | Enhance handling of static scenes |
| 31 | Singh and Singh [54] | Discrete cosine transform (DCT) matrix | Region duplication, correlation coefficient, and coefficient of variation | Randomly collected data | 96.6% | Struggles with subtle intensity changes |
| 32 | Afchar et al. [55] | Deep learning approach | DeepFake, Face2Face | DeepFake [34] | 98% (DeepFake), 95% (Face2Face) | Limited theoretical explanation of results |
| 33 | Chen et al. [56] | Region-based convolutional neural network | Region proposal network in Faster R-CNN network | Cityscapes [57], KITTI [58], SIM10K | NA | Dependence on adversarial training techniques |
| 34 | Aneja et al. [59] | Convolutional neural network (CNN) | Recurrent neural network (RNN) powered by long short-term memory (LSTM) | MS COCO [60] | NA | Sequential limitations in LSTM models |
| 35 | Shou et al. [61] | Online detection of action start (ODAS) | Generative adversarial network, evaluation protocol | THUMOS’14 [62], ActivityNet | NA | Limited practical application and evaluation |
| 36 | Nguyen et al. [63] | Convolutional neural network | Capsule network, face swap detection, facial reenactment detection | REPLAY-ATTACK [64], FaceForensics [19] | 99% | Enhance resistance to adversarial attacks |
| 37 | Ulutas et al. [65] | Bag-of-words (BoW) | Scale-invariant feature transform (SIFT) | Surrey University Library for Forensic Analysis (SULFA) [23] | 97.5% | Limited focus on real-world scenarios |
| 38 | Zhao et al. [66] | Passive blind scheme | Hue-saturation-value (HSV), speeded up robust features (SURF), fast library for approximate nearest neighbors (FLANN) | 10 test shots | 99.01% | Limited to interframe forgeries |
| 39 | Voronin et al. [67] | Convolutional neural network (CNN) | Spatial-temporal procedure based on statistical analysis and CNN | 3000 videos | 96% | Future real-time application and comparisons |
| 40 | Carreira and Zisserman [68] | Inflated 3D | Two-stream inflated 3D ConvNet (I3D) based on 2D ConvNet | HMDB-51, UCF-101 | 80.2% (HMDB-51), 97.9% (UCF-101) | Use Kinetics for comprehensive experiments |
| 41 | D’Amiano et al. [69] | Dense field algorithm | 3D PatchMatch-based dense field algorithm | REWIND [31] | NA | Enhance video analysis |
| 42 | D’Avino et al. [70] | Recurrent neural network | Recursive network, long short-term memory | Randomly collected data | NA | Limited theoretical explanation |
| 43 | Cozzolino et al. [71] | Convolutional neural network (CNN) | Local descriptors, bag-of-words | Synthetic [72] | 94% | Explore architectural improvements for deep learning |
| 44 | Bozkurt et al. [73] | Discrete cosine transform (DCT) | Correlation image generation, coarser forgery line detection, finer forgery line localization | Randomly collected data | 98% | Not mentioned |
| 45 | Do et al. [74] | Deep convolutional neural network (DCNN) | Generative adversarial network (GAN) | Celeb-DF [20] | 80% | Limited discussion of real-world scenarios |
| 46 | Long et al. [75] | Convolutional neural network | Convolutional 3D neural network (C3D), long short-term memory (LSTM) | 2394 videos, YFCC100M [76] | 98% | Improve frame dropping and LSTM |
| 47 | Su et al. [77] | Region duplication | Adaptive parameter-based fast compression tracking (AFCT) | Randomly collected data | 93.1% | Detect diverse video forgery types |
| 48 | Mizher et al. [78] | Spatiotemporal attacks | Falsifying techniques, fingerprint framework, secure system | Self-collected data | Not mentioned | Neglects complex video inpainting methods |
| 49 | Zhu et al. [79] | Spatiotemporal features | Scale-invariant feature transform (SIFT) | TRECVID [80], CC_WEB_VIDEO [81] | 99% | Limited evaluation of real-world scenarios |
| 50 | Barhoom et al. [82] | Physical random objects | Digital tampering, digital forensics | Randomly selected data | NA | Limited theoretical explanation |
| 51 | Abbasi Aghamaleki and Behrad [83] | Passive forensics | Extract appropriate quantization error rich | MPEGx codec [84] | 92.73% | Limited theoretical explanation |
| 52 | Mathai et al. [85] | Statistical moment features | Normalized cross-correlation | SULFA [23] | 88% | Limited accuracy in forgery detection |
| 53 | Rao and Ni [86] | Convolutional neural network | Spatial rich model, support vector classification | CASIA v1.0 [87], CASIA v2.0, DVMM [88] | 98%, 97.8%, 96% | Limited theoretical explanation |
| 54 | Rigoni et al. [89] | Video tampering detection | Quantization index modulation, watermarking | Randomly collected data | 96.5% | Limited theoretical explanation |