Two recent studies on judging authentic vs. deepfake images (nautil.us)
56 points by dnetesn on March 10, 2022 | hide | past | favorite | 89 comments


I was watching a livestream recently, and the person turned to grab something out of frame, and for a split second when their face was near the border, the filter glitched out and gave flashes of the person's real face. I was truly shocked, as I had been watching for about five minutes with no inkling that the person was essentially wearing a digital mask.


My partner was showing me a Doja Cat TikTok last night where she was singing an old song with an “old age” face filter on, and the effect was stunning. Super realistic wrinkles, contours, and skin texture; it never glitched out and had no visible edges. We are truly through the looking glass.


Related: Lucasfilm hired a digital artist, Shamook, who re-did Luke's scene in The Mandalorian with some improved deepfake techniques[1].

Here's his video comparison: https://www.youtube.com/watch?v=wrHXA2cSpNU

[1] https://www.engadget.com/lucasfilm-hires-shamook-054904077.h...


The biggest problem for me is that the mouth movements seem completely off.


I think that's because they used the original fake to make the deepfake. If they had used an actual actor (with a completely different face but accurate mouth movement) I think it would have been spot on.


It's off on both sides, but to me it looks more off on the deepfake side


That's been an issue for all CGI. I guess our ability to spot even the slightest imperfection in how mouths look has us well stuck in the uncanny valley.


The first comment on the video says it would be better if they had access to the original video and not just the original fake.


His top lip doesn't move when he speaks


> overall average accuracy was near chance at 48 percent, although individuals’ accuracy varied considerably, from a low of around 25 percent to a high of around 80 percent.

People with a 25% score are basically as good at recognizing fakes as people with an 80% score. The former group just prefer the fakes to the real.
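The symmetry is easy to see with a toy simulation (hypothetical numbers, just to illustrate that a reliably wrong judge carries as much information as a reliably right one, since flipping their answers recovers the accuracy):

```python
import random

random.seed(0)
truth = [random.choice([True, False]) for _ in range(10_000)]

def judge(is_real: bool) -> bool:
    # A "25% accurate" judge: gives the correct answer only 25% of the time.
    return is_real if random.random() < 0.25 else not is_real

guesses = [judge(t) for t in truth]

acc = sum(g == t for g, t in zip(guesses, truth)) / len(truth)
flipped_acc = sum((not g) == t for g, t in zip(guesses, truth)) / len(truth)

print(round(acc, 2))          # ~0.25
print(round(flipped_acc, 2))  # ~0.75, i.e. roughly the 80% group
```

An anti-correlated judge is only useless if you insist on taking their answers at face value.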


Reminds me of how great I am at picking and choosing stocks.


You are assuming facts not in evidence, regarding false positives vs. false negatives.


The community of people making deepfakes even approaching this quality is astonishingly small (think 3-4 people), requires absurd hardware (either A100s or A6000s), and the Luke one is even more of an outlier as this deepfaker has the support of ILM, one of the best VFX studios in the industry. All this is to say I'm not especially scared — especially when these "best of the best" deepfakes still have obvious tells.


Not weighing in on whether one should be frightened or not, but just pointing out that what you said would have applied to the computer graphics industry about 30 years ago. The interesting thing about computers is they tend to get faster over time.


If Disney can't make Luke Skywalker, one of their most iconic characters, look good in a show with a budget the size of the Mandalorian, I can't say I'm too worried for this technology yet.

Everyday misinformation techniques are far more pernicious - out-of-context sound bites, selective coverage, framing in an uncharitable way, interviewing non-experts, etc etc.


>If Disney can't make Luke Skywalker, one of their most iconic characters, look good in a show with a budget the size of the Mandalorian

And yet they're willing to put it out there, which means they see this technology as important: it's the future for them. And what do you do when you think a technology is the future? That's right, you invest time and money in using it so that you can improve it. It's not that they have the budget of the Mandalorian to do this scene; it's that they have the budget of Disney to improve the technology.

I would say in 10 years Disney in a show with a budget the size of the Mandalorian (adjusted for inflation natch) will probably have something really worrisome.


I see you haven’t watched The Book of Boba Fett. I can assure you the deepfake technology is now… fully operational.

Everyday misinformation techniques were already really bad, but things could stand to get a lot worse.


Yeah the improvement in young Luke in The Book of Boba Fett was pretty impressive.


Agreed! That was the first one I've seen where I felt it was basically "passing". Like if I didn't know, I wouldn't know.


Yeah, it doesn’t seem like people believe misinformation because the evidence for it is just so convincing. I see people link to sources (if they even bother doing that), with completely false descriptions of what it shows. It doesn’t matter; as long as the description is the kind of thing that outrages people they’ll amplify it. Corrections will get one tenth of the attention.


ILM support came after.

An A6000 is only about $50/day to rent.


I love living in an age where old scifi movies/plots that most people thought would never actually happen, actually start to happen.

I'm thinking specifically of 2002's S1M0NE[0].

I can't wait for this future.

[0] https://en.wikipedia.org/wiki/Simone_(2002_film)


You may also enjoy The Congress.

https://m.imdb.com/title/tt1821641/


What a trippy trailer. Must watch.


Ahhh these fears are really overblown.

This is just an iteration of the same overreaction that has been occurring since the days of splicing audio tape together.

If anything, I would argue malicious editing of REAL footage, which already occurs in media outlets today, is far more frightening than the possibility of having a deepfake video, though it might be another tool in that slimy toolbox.


This is the thing.

A convincing enough deepfake, the 24/7 news industry, and the Twitter rage machine combined will at some point overreact way too fast and too hard, and someone will get actually hurt, maybe even killed.

Actors have had clauses restricting the use of their likeness for a long time now; if their face is 3D scanned, the use cases are carefully limited. They can't just put Brad Pitt in a random commercial even though they have the scans of his face and body and can find a voice actor.


This sometimes is very advantageous for celebrities.

Bruce Willis got paid an insane amount of money for his deepfaked face to appear in a Russian commercial. He literally got paid a couple million dollars to do NOTHING except agree to his likeness being used.

https://www.ign.com/articles/bruce-willis-russian-commercial...


Not sure why this would be downvoted.


HN has by far the most bizarre downvote patterns I've ever seen. Not really worth wondering about.


I think there are definite categories. They seem to range from legitimate uses of downvoting such as against obvious trolls, facetious argumentation, etc. to enforcement of ideological tyranny. It's always interesting to analyze.


Well sure, describing downvote patterns as weird doesn't mean that the majority of downvotes are bizarre. The bulk are completely comprehensible. But I can't think of anywhere else where I see as many completely reasonable comments heavily downvoted for incomprehensible reasons.


Yeah I agree on your last point. I sometimes wonder if some of it is automated / training data being built but you never know.


> enforcement of ideological tyranny

Boy, that escalated quickly.


IMO the voting pattern changes pretty clearly as the Earth turns and different groups of people become the dominant readers at any given moment.


This dynamic is true of places like Reddit too, or any forum with a global audience. But I can't think of any subreddit I've ever observed such weird downvote patterns on.


There's a subplot in The Newsroom where a producer edits an interviewee out of context, leading to the network broadcasting false accusations of serious war crimes. The producer wasn't careful enough, and a clock visible in the background is seen to jump around. Without that clock it would have been undetectable.


Life imitates art - when Russia started attacking Ukraine, RT reported a 5pm emergency meeting of their military leaders. Included was a photo showing two of their watches at 11:45. (The report was at about 7pm the same day, so it couldn’t have just been a long meeting)


You don’t even have to change the ordering of the video during editing to craft a false narrative. Take reality TV, for example… if you film people for 24 hours a day for a week, and you need to edit the footage down for a 30 minute show, you can create a huge variety of impressions.

For example, let’s say you want to make someone seem like a jerk… in 168 hours, everyone is going to have at least 30 minutes of actions that, in a vacuum, make you look like a jerk. By choosing only to show actions that contribute to a certain narrative, you can completely change how someone appears to behave.


That's how "reality TV" operates. They record a ton of video, decide what the story they want to tell is, and chop things down to tell that story.


And parodied by the Simpsons here:

https://youtu.be/qEGFaOeUm2A


That episode is from November 27, 1994. That Newsroom episode aired July 28, 2013. Less Simpsons parodying Newsroom and more them predicting the future [1].

[1] https://en.wikipedia.org/wiki/Bart_to_the_Future



Michael Moore did this in Bowling for Columbine.


Even malicious editing has been around a long time, but the ease and accessibility, coupled with social media proliferation, is on another level. I'm not sure to what extent this matters yet when mere text on a screen can fool certain demographics. Is doctored footage/images a moot point if the target viewers are going to believe it anyway?

What we need is better inoculation against psycho-misinformation politics.


I take a similar approach, but from a different angle: the widespread awareness of deepfakes is useful because people will stop relying on video "evidence" as evidence.

We already do this with photos. We don't necessarily worry so much about what's in the photo as who claims the photo is real and what the chain of custody was.

The same will be true for video evidence, even if or especially if we start stamping videos with tamper-proof signatures.


So I guess these people don't remember Stalin and Lenin photoshopping in the 1930s. https://rarehistoricalphotos.com/stalin-photo-manipulation-1...

And then there's doubles: when you just find somebody who looks exactly like the real person, and get them to do some acting.

A deepfaked video or photoshop can be debunked because we're all aware that the technology exists to fake it; all we have to do is show evidence that it's not true. And at the same time, there will always be people who simply want to believe something and so will refuse any evidence to the contrary. You don't need a Deepfake for someone to believe that Hillary Clinton is running a gang of pedophiles out the nonexistent basement of a pizza parlor in the DC suburbs.


If anyone is interested in the paper (PDF): https://oa.mg/work/10.1073/pnas.2110013119


I've never been worried about this because I just assume that official outlets and websites will use some kind of verifiable encryption keys. That makes official content official and everything else suspect.

Is this too simplistic an outlook? Are there different options? I'm not sure, but I know that third-party unverified content is always suspect. First-party content is less suspect. Verified content is ... verified. It could still be a deepfake, but then the risk is a loss of trust.


It's worth noting that society existed for thousands of years without photography, audio recording, or videography. As you allude to, society is held together via trust networks, and these are independent of the prevailing media.


Speed of information is a problem today that we did not have in the past. Misinformation or disinformation can spread faster than the truth can be exposed in a lot of cases.

Especially when you have companies silencing people who criticize approved sources and determining what "truth" is allowed in public discourse.


Yes, but this works both ways: the truth also spreads faster over faster networks.

It's always been true that rumor (or disinformation) spreads faster than truth. It always will be true, because rumor is easier to generate than truth.


There are huge opportunities here for hashing and watermarking that guarantees videos are unaltered from the original recording taken by a physical device. An iphone already has all the hardware it needs to ensure it cannot be fooled by a flat image and to calculate and insert a cryptographic hash while recording.

We will lose the battle against deepfakes, but we can create a subcategory of video proven as non-fake and win the war.
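A minimal sketch of the tamper-evidence idea (hypothetical scheme and key names, not any real device's implementation; a real camera would use an asymmetric signature from a hardware-protected key so that verifiers never hold the secret, whereas this HMAC sketch shares one key for simplicity):

```python
import hashlib
import hmac

# Hypothetical per-device secret, provisioned at manufacture.
DEVICE_KEY = b"per-device-secret-provisioned-at-manufacture"

def tag_recording(video_bytes: bytes) -> str:
    # Camera computes an authentication tag over the raw recording.
    return hmac.new(DEVICE_KEY, video_bytes, hashlib.sha256).hexdigest()

def is_unaltered(video_bytes: bytes, tag: str) -> bool:
    # Verifier recomputes the tag; any edit to the bytes invalidates it.
    return hmac.compare_digest(tag_recording(video_bytes), tag)

original = b"\x00\x01 raw sensor frames \x02\x03"
tag = tag_recording(original)

print(is_unaltered(original, tag))                # True
print(is_unaltered(original + b"deepfake", tag))  # False
```

Note this only proves the file wasn't altered after the tag was made; the harder problem (which the comment's point about depth hardware addresses) is ensuring the sensor wasn't pointed at a screen in the first place.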


They’ve been trying to scare us with deepfakes for so long now. It’s such a yawn. They create such a narrow image type (face on blurry background) and say “look how real they look!”

Same with deepfake videos. Video could always be faked with good FX, which is one of the reasons why footage with no chain of custody doesn't become evidence in court already!


I don't know how I'd feel if I was the guy who 96% of people thought wasn't real.


Am I alone in having thought that the term ‘deepfake’ only applied to NN/deep learning-generated imagery, rather than just plain CG, as implied in this article?


That's the context where it originated, but that distinction is probably lost to most people outside the ai/tech scene.


Maybe we will finally see the importance of cryptographically signing important messages such as government announcements.


On one hand, I would say that everything that can be abused will be abused. On the other hand, I would say that people (after a period of adaptation) will learn not to trust any moving image anymore.

Or any still image, for that matter. But on the other hand, we are already theoretically and practically able to generate very convincing fake images if we wanted to. We could generate fake stills from a surveillance video (or a smartphone video) of a fake politician doing cocaine. Or interacting in a compromising way with a prostitute. Why do we not see much more of things like that?

Probably because it just might be more difficult to create a believable story around such a fake, one that does not break down when you look at it for more than 10 seconds.

So maybe the problem will not be as big? I don't know. I have no idea how easy it will become to create truly believable fakes.

Not sure how people will adapt and whether trust in anything might break down in the process. But what I do not believe is that the tech industry creating rules for itself will be a deterrent against abuse of any technology.


How many people are convicted on 480p or less janky security footage? Making believable bad fakes is trivial when high resolution fakes are only moderately difficult.

If it hasn't happened yet, it will soon - people will be submitting modified doorbell camera and home security footage to police. Or someone's compromised home network will record a deep fake overlay of otherwise legitimate footage, and the owner won't be complicit. In a situation with, say, a $5,000 or more item, or a schedule 1 or 2 substance stolen, a person could be subjected to damning evidence of them committing a serious felony, for potentially less difficulty and risk than swatting.


I see many people here worried about the disinformation potential of deepfakes. I think that that worry fundamentally ignores the fact that we'll just stop considering videos evidence.

A century ago, you could trust photos. Then, Photoshop came out, and the world didn't end, we just learned that we can no longer trust photos. Now we just won't trust anything, and it'll be one party's word against the other's.

Sure, this isn't as good as living in a world with objective proof, but I think we'll grow out of the short-term issues and find different ways of verifying truth.


If Hollywood's use of deepfake Luke Skywalkers gives "more truth-oriented" folks a good-enough awareness of the technology, and fact-free stuff like the Pizzagate conspiracy is good enough for the "less truth-oriented" folks...then deepfakes aren't much of a threat.


Once this is more accessible for smaller studios, with smaller budgets, then actors will be obsolete.

10y from now, we will be able to simulate any human action, sound, or gesture on screen with commodity hardware. I can have a 75m USD cast for my film for the price of a couple of GPUs spun up for a few days. Crazy.


> I can have a 75m USD cast for my film

Only if you want the people you've copied to sue you for breaching their personality rights (image rights in other countries). https://en.wikipedia.org/wiki/Personality_rights#United_Stat...


When it comes, the pirates will win that one too.


You still need people who can act. Someone has to make the characters act, and you don't want it to suck, so somebody needs to be skilled at acting, even if they don't end up doing the physical acting.

But more to the point, there is no need for constant deepfakes. You can just get regular people to act in your movie. Some movies might need deepfakes for the same reason that animes are really popular. But there's plenty of movies that would not benefit from deepfakes at all.


I suspect that acting itself will be reduced to algorithms. Expression of different emotions could be digitized. This could open further avenues of automation - authors could configure various different permutations (e.g. sincerity/sarcasm, surprise), the combination of expressions could be derived from context / sentiment analysis for the character, or perhaps each character could have a preset pack of opinions/values that informs their response. Given the constantly expanding capabilities of synthesis, these seem like natural areas that people will explore.

I agree that there is no need for any of this, but that's never stopped technological / artistic progress.


That seems to largely be taking the character out of characters. I think this may be another case of "computer geeks think that {industry name} is easy and can be solved with computers".


Yes it feels a bit reductionist... "Love is merely a series of chemical reactions, therefore I should be able to compute text that elicits these reactions in a person and makes them fall in love with me."


Similarly we still have accountants even though Excel and TurboTax can do all the calculations.

I can't help but think this affects the demand for models more than actors. Why spend money on hiring a model who has spent enormous amounts of time looking good shirtless or in a bikini when an average-looking person can have a filter applied?


We have a lot fewer accountants (per company) than we did before, however.


Fully animated actors can be just as effective and convincing as flesh-and-blood actors, as shown by Disney or Pixar. But it also requires immense amounts of work and talent, neither of which is cheap.

Star Wars certainly didn't use deepfakes because they thought it was cheaper or easier.


I agree that it won't obsolete actors as GP says, but it will absolutely be cheaper and easier. The expense of acting doesn't come from the craft being expensive or rare. It comes from the market power that results when the strongest connection we feel to a performance is inseparable from the performer (i.e. their face). There's a reason that editors and cinematographers don't have fans outside of film nerds, and it's not because their craft is trivial or that they don't contribute to the film. Technology costs can be reduced, but market power is what allows actors to collect tens (or hundreds!) of millions of dollars. Commoditization via deepfakes turns them into just another employee.

You can look at voice acting as an example. Before Robin Williams' performance in Aladdin started the trend of recognizable celebrities as voice actors, there was a thriving community of voice-only actors in big-budget movies, hired for their craft instead of their brand. And yet, you probably can't name a single one, despite being able to name plenty of actors from non-animated films of the era. They were also inherently replaceable, as memorably parodied by the simpsons: https://youtu.be/g5ffaTv4ajg

It's notable that the studios switched to celebrity actors precisely for the brand that they brought from their non-voice roles.

Deepfakes are pretty much identical to this animated case, allowing it to be extended to physical movies. The recognizable portion of a performance becomes a few model files instead of an actor's face, and the job of a (replaceable) human actor is to play the role of the performer, not just the character.


You mean you will hire a cheap unknown actor and then add Tom Cruise's face via deepfake, and then market the movie as a Tom Cruise movie? Even if that was legal (which of course it isn't), surely the marketing value of the actors name would disappear if every low-budget movie could use deepfakes of the actor.


No, I'm suggesting that the tentpole performer will have an entirely synthetic identity.


They aren't cheap now, but that's only because the automation and AI pipelines aren't ready yet.


You still need someone to - you know - act.

Saying deepfakes make actors superfluous is like saying synthesizers make musicians superfluous.


There's the deepfake visual representation that we're discussing, but who's to say there isn't work being done on the acting side of things? I think there are broad horizons to what can be automated, especially in the world of pure media.


You mean some kind of AI which can read and understand a script and then develop a persona with physical appearance, voice, mannerisms, interpretation of emotional state and interactions with other characters etc.?

I guess such an AI would also be able to write and direct the movie.


Realistically there would be a pipeline with various companies and AIs involved, each specializing in a specific aspect of virtualization (the way effects houses work today), not a single AI doing everything.

But yes, everything you mention up to and including writing and directing can and will be automated to the fullest degree possible in time.


Deepfake tech isn't going to get you AI actors. It's just not the same thing at all.


It's a piece of the puzzle that will.


This doesn't address the talent behind the images, both cast and crew. Why would optimizing the production workflow make either of those cheaper? In software development more generally, payroll costs are higher than all other costs combined; it seems to be the same scenario in animation as well.


And even now Pixar and Disney are really just melding real life and animation - they are working off real performances and choreography to generate those images. Still need to have people in the pipeline somewhere to have things look right.


Very doubtful. That 75m USD cast has more acting talent than your GPUs and gets people in theaters.


It really won't. That would just be a cartoon with a realistic art style. You still need almost every other part of a film. You still need an original performance to deep fake. This isn't an AI actor or a simulation, it's a post effect.


Thank you for a perspective which isn’t rooted in fear. What impact can we expect this to have on the quality of movies?


Regurgitated leftovers from popular franchises/actors in an uncanny valley forever.


I'm not sure what failure to reliably discriminate between real pictures of people I don't know (and therefore wouldn't be able to recognize) and computer-generated people who don't exist (basically composites) is supposed to prove. Artists can paint a picture of someone who doesn't exist that I can't tell from pictures of people who do, and I could probably piece together somebody who doesn't exist in Photoshop from pieces of people who do.

Also, a weird-ass set of conclusions here:

> People are also excellent at incorporating context and background knowledge into their judgments in ways that computers mostly aren’t, and in ways that the sorts of tasks in these studies mostly don’t assess. One exception is a pair of deepfake videos of Vladimir Putin and Kim Jong-un included in the video study. People were much better than the model at classifying these videos. Whereas a majority of participants correctly spotted these fakes, the model was way off, highly confident that they were real.

> Here, people may have taken into account factors like their memories of previous statements made by these leaders, or knowledge of what their voices sound like, or the likelihood that they would say the sorts of things shown in the video: all things not included in the model at all.

Couldn't it just be because they were bad imitations of people they recognized? Isn't the entire ominous point of "deepfakes" to deceive people by presenting them with people they recognize doing things they didn't do? Looks like people still don't have a problem recognizing those as false.

The only good "deepfakes" I've seen are ones that have been extensively touched up manually, using the automatic part as raw material. New techniques have certainly made that a lot cheaper.


This may be an unpopular opinion, but I think deepfakes could be a very good thing for society. Imagine if we got to the point where it was completely impossible to distinguish fake images and videos from real ones. It would be a huge blow against surveillance capitalism and the surveillance state.

If every video out there is suddenly a potential forgery, people will grow to become more skeptical of things they see unless they see those things in person in the meatspace.

It's almost as if we're going to go back to the time before cameras and audio recordings. What makes things like deepfakes so scary right now is that people overwhelmingly trust images and videos by default. Once we shift from automatically trusting images and videos to automatically distrusting them, things like revenge porn and viral videos of people saying or doing embarrassing things will lose most of their power.



