




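In a paper presented at the IEEE Computer Vision and Pattern Recognition Conference in June, researchers from the Multimedia and Information Security Lab (MISL) in Drexel's College of Engineering explained that while existing synthetic image detection technology has so far failed to identify AI-generated video, they have had success with a machine learning algorithm that can be trained to extract and recognize the digital "fingerprints" of many different video generators, such as Stable Video Diffusion, VideoCrafter and CogVideo.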


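"It's a bit unnerving that this video technology has been released before there is a good system for detecting fakes created by bad actors," said Matthew Stamm, PhD, an associate professor in Drexel's College of Engineering and director of the MISL.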


Deepfake detectives





A new challenge





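In the study, the team tested 11 publicly available synthetic image detectors. Each of these programs was highly effective (at least 90% accuracy) at identifying manipulated images. But their performance dropped by 20–30% when they were asked to discern videos created by the publicly available AI generators Luma, VideoCrafter-v1, CogVideo and Stable Diffusion Video.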



"These results clearly show that synthetic image detectors experience substantial difficulty detecting synthetic videos," they wrote. "This finding holds consistent across multiple different detector architectures, as well as when detectors are pretrained by others or retrained using our dataset."

A trusted approach

The team speculated that convolutional neural network-based detectors, like its MISLnet algorithm, could be successful against synthetic video because the program is designed to constantly shift its learning as it encounters new examples. By doing this, it's possible to recognize new forensic traces as they evolve. Over the last several years, the team has demonstrated MISLnet's acuity at spotting images that had been manipulated using new editing programs, including AI tools—so testing it against synthetic video was a natural step.

"We've used CNN algorithms to detect manipulated images and video and audio deepfakes with reliable success," said Tai D. Nguyen, a doctoral student in MISL, who was a co-author of the paper. "Due to their ability to adapt with small amounts of new information we thought they could be an effective solution for identifying AI-generated synthetic videos as well."

For the test, the group trained eight CNN detectors, including MISLnet, with the same test dataset used to train the image detectors, which included real videos and AI-generated videos produced by the four publicly available programs. Then they tested the programs against a set of videos that included a number created by generative AI programs that are not yet publicly available: Sora, Pika and VideoCrafter-v2.

By analyzing a small portion (a patch) of a single frame from each video, the CNN detectors were able to learn what a synthetic video looks like at a granular level and apply that knowledge to the new set of videos. Each program was more than 93% effective at identifying the synthetic videos, with MISLnet performing the best, at 98.3%.

The programs were slightly more effective when analyzing the entire video, pulling out a random sample of a few dozen patches from various frames and using those as a mini training set to learn the characteristics of the new video. Using a set of 80 patches, the programs were between 95% and 98% accurate.
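The step from single-patch to whole-video analysis can be illustrated with a minimal sketch. It assumes the simplest possible fusion rule, averaging per-patch scores, which may differ from what the researchers actually did. The function names, the 8-pixel patch size, and `toy_patch_score` (a placeholder for a trained CNN) are all hypothetical; only the figure of 80 patches comes from the article.

```python
import random
from statistics import mean, pvariance


def sample_patches(frames, patch_size, n_patches, rng):
    """Crop n_patches random squares from randomly chosen frames.

    frames is a list of 2D lists (rows of grayscale pixel values).
    """
    patches = []
    for _ in range(n_patches):
        frame = rng.choice(frames)
        top = rng.randrange(len(frame) - patch_size + 1)
        left = rng.randrange(len(frame[0]) - patch_size + 1)
        patches.append([row[left:left + patch_size]
                        for row in frame[top:top + patch_size]])
    return patches


def toy_patch_score(patch):
    """Hypothetical stand-in for a trained CNN: 1.0 means 'synthetic'.

    It flags perfectly flat patches, which is NOT a real forensic cue.
    """
    pixels = [value for row in patch for value in row]
    return 1.0 if pvariance(pixels) < 1.0 else 0.0


def video_score(frames, patch_scorer, patch_size=8, n_patches=80, seed=0):
    """Average the per-patch scores into one video-level score (assumed rule)."""
    rng = random.Random(seed)
    patches = sample_patches(frames, patch_size, n_patches, rng)
    return mean(patch_scorer(patch) for patch in patches)


# Usage with made-up data: two flat 16x16 frames versus two noisy ones.
flat_video = [[[128] * 16 for _ in range(16)] for _ in range(2)]
noise = random.Random(1)
noisy_video = [[[noise.randrange(256) for _ in range(16)] for _ in range(16)]
               for _ in range(2)]
print(video_score(flat_video, toy_patch_score))
print(video_score(noisy_video, toy_patch_score))
```

Pooling many patches means one misleading patch has less influence on the final decision, which is one plausible reason the video-level results were slightly better than the single-patch ones.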

With a bit of additional training, the programs were also more than 90% accurate at identifying the program that was used to create the videos, which the team suggests is because of the unique, proprietary approach each program uses to produce a video.
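Attributing a video to its generator can be pictured as a multi-class problem. The sketch below is not the team's method: it swaps in a much simpler nearest-centroid classifier, and the generator names and two-dimensional "fingerprint" vectors are made up for illustration. In practice the features would come from a trained network.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def fit_centroids(labelled_vectors):
    """Map each generator name to the mean of its example fingerprints."""
    return {name: centroid(vectors) for name, vectors in labelled_vectors.items()}


def attribute(centroids, fingerprint):
    """Return the generator whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda name: math.dist(centroids[name], fingerprint))


# Hypothetical fingerprints for two hypothetical generators.
training = {
    "generator_a": [[1.0, 0.0], [0.9, 0.1]],
    "generator_b": [[0.0, 1.0], [0.1, 0.9]],
}
centroids = fit_centroids(training)
print(attribute(centroids, [0.8, 0.2]))
```

The further apart the generators' traces sit in feature space, the easier this kind of assignment becomes.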

"Videos are generated using a wide variety of strategies and generator architectures," the researchers wrote. "Since each technique imparts significant traces, this makes it much easier for networks to accurately discriminate between each generator."

A quick study

While the programs struggled to detect a completely new generator without first being exposed to at least a small amount of video from it, with a small amount of fine-tuning MISLnet could quickly learn to make the identification at 98% accuracy. This strategy, called "few-shot learning," is an important capability because new AI technology is being created every day, so detection programs must be agile enough to adapt with minimal training.
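The idea of adapting a detector from a handful of new examples can be shown with a toy sketch. It uses a tiny logistic classifier in place of MISLnet, and the "pretrained" weights, the feature vectors, the learning rate and the epoch count are all assumptions chosen for illustration rather than values from the paper.

```python
import math


def predict(weights, bias, features):
    """Probability that a feature vector comes from a synthetic video."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune(weights, bias, examples, lr=0.5, epochs=200):
    """Run stochastic gradient descent on a handful of labelled examples."""
    weights = list(weights)  # copy so the pretrained weights stay untouched
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Hypothetical "pretrained" detector: it only reacts to feature 0.
pretrained_weights, pretrained_bias = [4.0, 0.0], -2.0

# Made-up few-shot set: a new generator leaves its trace in feature 1.
few_shot = [
    ([0.0, 1.0], 1), ([0.1, 0.9], 1),   # new generator (synthetic)
    ([0.0, 0.0], 0), ([0.1, 0.0], 0),   # real video
]
tuned_weights, tuned_bias = fine_tune(pretrained_weights, pretrained_bias, few_shot)

unseen = [0.05, 0.95]  # held-out sample from the new generator
print(predict(pretrained_weights, pretrained_bias, unseen))
print(predict(tuned_weights, tuned_bias, unseen))
```

Before fine-tuning, the toy detector scores the unseen sample as probably real. After a few passes over four labelled examples, it scores the same sample as probably synthetic.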

"We've already seen AI-generated video being used to create misinformation," Stamm said. "As these programs become more ubiquitous and easier to use, we can reasonably expect to be inundated with synthetic videos. While detection programs shouldn't be the only line of defense against misinformation—information literacy efforts are key—having the technological ability to verify the authenticity of digital media is certainly an important step."




