Nvidia Maxine: AI for Video Calls

Video calls come with plenty of problems, and Nvidia says it has an answer: Nvidia Maxine. The company has announced a new video conferencing platform for developers that it says can fix some of the most common problems in video calls.

Maxine processes calls in the cloud on Nvidia’s GPUs and uses artificial intelligence to boost call quality in several ways. It can realign callers’ faces and gazes so that they are always looking straight at their camera. It can reduce the bandwidth requirement for video to one-tenth of that of the H.264 streaming video compression standard by transmitting only “key facial points,” and it can upscale the resolution of the video in the cloud. Other features include face re-lighting, real-time translation and transcription, and animated avatars with backgrounds.

Nvidia AI for video calls
Pic Credit: Nvidia

Some Features Already Available

A few of these features are already available on the market. Real-time transcription and video compression are common, and gaze alignment, which helps maintain eye contact during video calls, has already been introduced by Microsoft on the Surface Pro X and by Apple in FaceTime. Nvidia’s face alignment, however, is a more extreme version of existing gaze alignment.

Cloud Computing: The Key

Nvidia has high hopes of revolutionizing cloud computing, and its impressive AI R&D work suggests it is pulling ahead of its competitors. A major challenge, however, is which of the established videoconferencing companies will adopt its technology. Maxine is not a consumer platform but a toolkit for third-party firms to improve their own software. Nvidia has not yet announced any partners that will use Maxine, but it indicated that discussions with many companies are ongoing.

Bandwidth Requirement Reduced

Richard Kerris, Nvidia’s general manager for media and entertainment, described Maxine in a conference call with reporters as a “really exciting and very timely announcement,” and highlighted AI-powered video compression as a particularly useful feature.

Kerris said that all of us run into bandwidth limitations daily these days. Applying AI to the problem makes it much easier to recreate the scene at both ends while transmitting only what is needed, which significantly reduces bandwidth.

Generative Adversarial Networks: Compression Feature

The compression feature uses an AI method known as generative adversarial networks, or GANs, to partially reconstruct callers’ faces in the cloud. The same technique is used in many deepfakes. Instead of streaming the entire screen of pixels, the AI software analyzes the key facial points of each person on the call and then intelligently re-animates the face on the receiver’s side. This makes it possible to stream video with far less data flowing back and forth across the internet.
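To get a rough feel for why sending key facial points instead of pixels saves data, here is a small back-of-the-envelope sketch in Python. Every number in it is an illustrative assumption, not a figure from Nvidia: the count of 68 key points, the two 32-bit coordinates per point, the 30 frames per second, and the 1500 kbps H.264 baseline are all hypothetical. It also ignores any reference images or other data a real system would need to send.

```python
def keypoint_payload_bytes(num_keypoints, coords_per_point=2, bytes_per_coord=4):
    """Bytes needed to send one frame's worth of key points (uncompressed)."""
    return num_keypoints * coords_per_point * bytes_per_coord


def bitrate_kbps(bytes_per_frame, fps):
    """Convert a per-frame payload into a bitrate in kilobits per second."""
    return bytes_per_frame * 8 * fps / 1000


# Hypothetical values chosen only for illustration.
NUM_KEYPOINTS = 68          # assumed number of facial key points
FPS = 30                    # assumed frame rate
H264_BASELINE_KBPS = 1500   # assumed bitrate of an H.264 video call

per_frame = keypoint_payload_bytes(NUM_KEYPOINTS)
keypoint_kbps = bitrate_kbps(per_frame, FPS)
one_tenth_of_h264 = H264_BASELINE_KBPS / 10

print(f"Key points per frame: {per_frame} bytes")
print(f"Key-point stream: {keypoint_kbps:.2f} kbps")
print(f"One-tenth of assumed H.264 baseline: {one_tenth_of_h264:.2f} kbps")
```

Under these assumptions, the key-point stream alone would fit within a one-tenth budget. Actual savings depend on details the announcement does not spell out.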

Artificial Video Calling

We still need to wait and see how this technology performs in the real world, with all its challenges, and which partnership deals Nvidia makes, before we know how much of an effect it will have on everyday video calls. Even so, the announcement is a breakthrough in that it shows how video conferencing may work in the future: more artificial than ever before, with AI straightening your gaze and reconstructing your face, all for the sole purpose of saving bandwidth.

