Nvidia has given us our first look at Omniverse Avatar, a technology platform for generating interactive AI avatars. We’ve become accustomed to thinking of avatars as digital representations of ourselves for playing dress-up, but in Nvidia’s world – make that Omniverse – they can see, speak, converse on a wide range of subjects, and understand naturally spoken intent.
Avatars created in the platform are interactive characters with ray-traced 3D graphics that could not only make for truly engaging non-player characters in a role-playing game, but also provide customer service in retail, tech support in engineering and more.
Building blocks
Omniverse Avatar connects the company’s technologies in speech AI, computer vision, natural language understanding, recommendation engines and simulation.
Technologies used:
- Speech recognition: Nvidia Riva
- Natural language understanding: Megatron 530B
- Recommendation engine: Nvidia Merlin
- Perception capabilities: Nvidia Metropolis
- Avatar animation: Nvidia Video2Face and Audio2Face
These technologies are composed into an application and processed in real time using the Nvidia Unified Compute Framework. Packaged as scalable, customisable microservices, the skills can be securely deployed, managed and orchestrated across multiple locations by Nvidia Fleet Command.
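To make that composition concrete, here is a minimal sketch in Python of how skills like these might be chained into a single real-time loop. Every class and method name below is invented for illustration – none of it corresponds to a real Nvidia API – and each stub simply stands in for the role one microservice plays (speech recognition, language understanding, recommendation); the perception and animation stages are omitted for brevity.

```python
# Hypothetical sketch: chaining avatar "skills" into one pipeline.
# All names are invented for illustration and do not map to Nvidia APIs.

from dataclasses import dataclass


@dataclass
class AvatarTurn:
    transcript: str   # what the user said (the speech-recognition stage's output)
    reply: str        # what the avatar answers (the language-model stage's output)
    suggestion: str   # a recommendation surfaced alongside the reply


class StubSpeechRecogniser:          # stands in for Riva's role
    def transcribe(self, audio: bytes) -> str:
        return "what soup do you recommend"


class StubLanguageModel:             # stands in for Megatron 530B's role
    def reply_to(self, text: str) -> str:
        return "The tomato basil soup is popular today."


class StubRecommender:               # stands in for Merlin's role
    def suggest(self, text: str) -> str:
        return "tomato basil soup"


class AvatarPipeline:
    """Runs one user utterance through each skill in turn."""

    def __init__(self, asr, llm, recommender):
        self.asr = asr
        self.llm = llm
        self.recommender = recommender

    def handle(self, audio: bytes) -> AvatarTurn:
        transcript = self.asr.transcribe(audio)            # speech -> text
        reply = self.llm.reply_to(transcript)              # text -> response
        suggestion = self.recommender.suggest(transcript)  # text -> recommendation
        return AvatarTurn(transcript, reply, suggestion)


if __name__ == "__main__":
    pipeline = AvatarPipeline(StubSpeechRecogniser(),
                              StubLanguageModel(),
                              StubRecommender())
    print(pipeline.handle(b"fake-audio-frame"))
```

In a real deployment the stages wouldn’t be in-process objects like these: Nvidia packages each skill as a separate microservice, which is what allows Fleet Command to deploy, scale and orchestrate them independently across locations.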
“The dawn of intelligent virtual assistants has arrived,” said Jensen Huang, founder and CEO of Nvidia. “Omniverse Avatar combines Nvidia’s foundational graphics, simulation and AI technologies to make some of the most complex real-time applications ever created. The use cases of collaborative robots and virtual assistants are incredible and far-reaching.”
Tokkio and Maxine
“In the near future, there will be billions of robots to help us do things. Some will be physical robots, most will be digital virtual robots. Some virtual robots will be fully autonomous. Others semi-autonomous, or even teleoperated,” said Huang in his keynote address at Nvidia GTC.
Arguably the most interesting of those ‘robots’ was Project Tokkio – pronounced ‘Tokyo’ – which showed Nvidia colleagues engaging in a real-time conversation with an avatar crafted as a toy replica of the CEO himself, conversing on topics such as biology and climate science.
Huang also updated the audience on Project Maxine, which combines tech previously shown by Nvidia to add state-of-the-art video and audio features to virtual collaboration and content creation applications.
“The fundamental technologies of Maxine are just becoming possible – computer vision, neurographics, animation, speech AI, dialogue manager, natural language understanding, recommenders. These are foundational technologies we’ve been talking about for some time. This was pretty much impossible five years ago and barely so today,” said Huang.
You can read more of Huang’s thoughts on Omniverse in general and Nvidia’s plans to create a digital twin of planet Earth in our earlier GTC article.