Face2Face: Real-time Face Capture and Reenactment of Videos


by Olaf von Voss, April 9th, 2016

This is a cool one. And a spooky one at the same time! A group of researchers just announced a new and refined approach for real-time face capture and reenactment. All they need is a simple RGB input, such as a YouTube video, and a commodity webcam. With many possible applications, this might just bring about the future of dubbing movies.

This is how it works: any facial expression out of a single frame

Theoretical approach

I am not a scientist, so I apologise in advance for letting the following video explain what this is all about. The team of researchers behind this new technique do a far better job of explaining their findings than I ever could.

These newly developed algorithms make it possible to manipulate the facial expressions of any subject within a regular YouTube video. Not so long ago, major Hollywood studios needed dedicated, very expensive capture devices and custom rendering software to achieve similar results for their blockbuster movies. But it now seems that this kind of video manipulation technique is about to become a lot more accessible.

Practical approach

So what potential uses could this offer? As I said, I am not a scientist. I am a filmmaker based in Germany, and therefore I need to dub videos from time to time. With this technology, it could become possible to dub movies in a very convenient way and with stunning results. No more weird out-of-sync lip movement of foreign actors. That would be neat!

On the other hand, it’s also kind of scary. From now on, you can’t be sure whether that celebrity you are watching on TV is really talking about their new haircut or whether somebody has manipulated the stream in some fancy way.

simplified figure of the face capture pipeline

Proof of concept vs. existing solutions

Right now, this technology is still at a proof-of-concept stage. But it is a very promising and sophisticated proof of concept, that’s for sure. The team of researchers compare their approach to a variety of existing technologies, and they point out that their pipeline has fewer requirements than other solutions out there. For example, all they need is a plain YouTube video snippet, without any extra tracking information. The more expressions the target face shows within that video, the better the results.

For us as filmmakers, the possibility of dubbing videos in a very convenient way and without all the hassle does indeed sound very promising. But for now, we’ll need to stick with existing ‘old school’ solutions such as voiceQ. That piece of software provides you with everything you need to get the job done when it comes to dubbing your movie. It doesn’t manipulate the footage in any way, though. It just comes with all the tools to take the hassle out of dubbing, and makes ADR as easy as possible.

We’ll see what the future of all this might be. It is kind of strange that everything is being manipulated these days, but if this constant progress of technology can be used to simplify dull tasks like dubbing a movie, I for one look forward to it!
