Better Video Chat Through AI Frame Prediction

This is totally theoretical. But say you trained a neural network to take the previous 120 frames of a video and predict the next 120 frames, and you could run it 60 times a second. You could then have a video playback that displays the network's predicted frame whenever a real one hasn't arrived, and re-predicts as each real frame comes in. My theory is that the prediction would be good enough to produce smooth (fake) video from a potentially choppy feed, if it could run fast enough.
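
To make the loop concrete, here's a minimal sketch in Python. Everything in it is assumed for illustration: `predict_next` stands in for the trained network (it just linearly extrapolates the last two frames), and the choppy feed is simulated by only delivering every third frame.

```python
import collections
import numpy as np

HISTORY = 120          # frames of context the post proposes feeding the network
H, W = 72, 128         # tiny frame size to keep the sketch fast

def predict_next(history):
    """Stand-in for the trained predictor: linearly extrapolate the last
    two frames. The real proposal would run a network over all 120."""
    if len(history) < 2:
        return history[-1]
    prev, last = history[-2], history[-1]
    return np.clip(2.0 * last - prev, 0.0, 1.0)

def playback_tick(history, incoming_frame):
    """One 60 Hz tick: show the real frame if one arrived, otherwise show
    the prediction, and fold whichever we showed back into the history so
    the next prediction stays anchored to reality."""
    frame = incoming_frame if incoming_frame is not None else predict_next(history)
    history.append(frame)
    return frame

history = collections.deque(maxlen=HISTORY)
history.append(np.zeros((H, W)))  # seed frame

# Simulate a choppy feed: a real frame only arrives every third tick.
for tick in range(9):
    real = np.random.rand(H, W) if tick % 3 == 0 else None
    playback_tick(history, real)
    print(f"tick {tick}: {'real' if real is not None else 'predicted'} frame")
```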

The cool thing about this is that you wouldn't actually be watching a live person; you'd be watching the AI's prediction of a live person, projected X frames into the future. This would effectively mean there is no lag, because the AI is good enough to predict what you're going to say and do, and it's always adjusting based on what you actually said and did. We're talking sub-second here, so it's not predicting what you're going to say; it's predicting the changes in tone and pitch and where your face can get to in the next second. That part seems totally possible. The part I think would be impossible is running it fast enough to display the result. Maybe quantum computers.
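
To put a number on "sub-second": at 60 fps each frame covers about 16.7 ms, so hiding a given network delay means projecting that many frame-times into the future. A quick back-of-envelope (the `frames_ahead` helper is hypothetical):

```python
import math

FPS = 60  # playback rate from the post

def frames_ahead(one_way_latency_ms):
    """Hypothetical helper: how many frame-times the predictor must project
    forward so the remote viewer effectively sees 'now'."""
    return math.ceil(one_way_latency_ms / (1000 / FPS))

for ms in (50, 150, 300):
    print(f"{ms} ms one-way latency -> predict {frames_ahead(ms)} frames ahead")
```

Even a 300 ms delay only needs an 18-frame projection, comfortably inside the 120-frame horizon the post imagines.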

2 Responses to Better Video Chat Through AI Frame Prediction

  1. Chris says:

    The version of this I liked was to reduce your face to vectors using OpenCV, then deepfake your own face back onto the vector skeleton at the other end. You'd need a chunk of data at the start to get things moving, but then absolutely minimal data to send the vector skeleton at a buttery smooth 120 fps.
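
A rough back-of-envelope on Chris's bandwidth point, assuming a 68-point dlib-style landmark set sent as float16 x/y pairs (all figures here are ballpark assumptions, not measurements):

```python
# Bandwidth for the "vector skeleton" vs. a normal video stream.
LANDMARKS = 68             # dlib-style facial landmark count (assumed)
BYTES_PER_LANDMARK = 2 * 2 # x, y as float16 (2 bytes each)
FPS = 120

skeleton_bps = LANDMARKS * BYTES_PER_LANDMARK * FPS * 8  # bits per second
video_bps = 2_000_000      # ballpark bitrate for a 720p H.264 call

print(f"vector skeleton: {skeleton_bps / 1e6:.2f} Mbit/s")  # ~0.26 Mbit/s
print(f"typical video:   {video_bps / 1e6:.2f} Mbit/s")
```

Under those assumptions the skeleton stream runs roughly an order of magnitude lighter than the video it replaces, which is what makes the 120 fps claim plausible.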
