Volumetric capture will be key for mixed and virtual reality content. Here's how it works.

By Philippe Lewicki / 20 Dec 2016

Topics: microsoft | volumetric capture | hcap | photogrammetry | virtual reality | mixed reality | blog

It's not easy to explain the concept of volumetric capture -- and why this field of technology will soon be instrumental for virtual and mixed reality -- when you haven’t seen it in action yet. But I will try.

Volumetric capture is a technique for filming footage in three dimensions for viewing in a virtual or mixed reality headset. It's very different from the stereoscopic technology used in 3D movies today. When you're recorded with volumetric capture, your body is fully scanned and reproduced, with every detail and every side captured. The result is a digital copy of the full, visible you.

Last month at Microsoft, we got a chance to learn about HCap, which stands for Holographic Capture -- Microsoft's name for their volumetric capture technology. We were the first group to have access to it and be allowed to talk about it publicly.

Here's HCap in action:

Microsoft's HCap studio has over 150 cameras directed at an 8-foot square platform at center stage, surrounded by a green screen. We learned the technical side of integrating holographic capture into an application for the Hololens, and how to prepare for recording a capture.

My main takeaway was how to think about it. It's very different from film -- it's actually much closer to a theater performance. There's no framing or single camera to play to, and it's important that actors look great from all angles. Scripting and preparation are key, as post-processing is limited to some color correction, and modifying the recorded result is not recommended.

Why it's a big deal for VR and MR content

When you're wearing a mixed reality device like the Hololens or Meta 2, having holographic characters walk around in your real world feels normal. Even natural. And for virtual reality, volumetric capture will soon supply a big share of the content.

Most of today's entertainment content for VR is either 360-degree video capture or Pixar-style computer-generated animation. Some of the 360 video content on the market right now is amazing, entertaining, beautiful, and very immersive, and the medium has matured noticeably over the past few months.

If you've already experienced some of today's 360 video content, you may have noticed that if you move, nothing happens. This past year, VR has been moving toward what we call "room scale," which means you can physically move around inside the VR experience. The HTC Vive, Oculus Touch, Occipital Bridge, and Tango now allow room scale VR.

If you can move, the 360 video will not work for you anymore -- you’ll want to move around in the scene you’re viewing. And MR and room scale VR will demand more than just 360 videos -- actors will need to pop up from the screen and play in front of you.

VR is already immersive when you're passive; this will add a new level of immersion and interactivity.

A few companies are ready for the demand with their own volumetric capture work. A couple of weeks before our trip to Microsoft, I had the chance to visit 8i's volumetric recording stage in Culver City, in the heart of the entertainment capital. 8i uses a technique very similar to Microsoft's. Unfortunately, they do not support the Hololens yet.

Human Engine LLC is another Culver City company doing great human volumetric capture, and their captures work natively on the Hololens without any additional plugin. Here's an example of volumetric capture from Human Engine.

Here's how volumetric capture works.

All three of these companies use the same basic technique, called photogrammetry, to convert images into three-dimensional objects. From there, each has its own proprietary algorithms to generate the mesh -- the structure of the object -- and extract the most realistic models and the smoothest animations.

The 150+ cameras take pictures in both visible and infrared light. A point cloud is generated from the results, and a mesh connects all parts of the volume. That's where the proprietary algorithms do their magic, smoothing the volume without removing key details.
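To make the point-cloud-to-mesh step concrete, here is a toy sketch in Python. It is not Microsoft's, 8i's, or Human Engine's proprietary algorithm -- just a minimal illustration of the idea: start from a cloud of 3D points (here, random points sampled on a sphere standing in for camera-derived data) and connect them into a triangle mesh, using `scipy.spatial.ConvexHull` as a simple stand-in for real surface reconstruction.

```python
# Toy illustration of turning a point cloud into a triangle mesh.
# The point cloud is synthetic (random points on a unit sphere); real
# systems derive it from the camera images and use far more sophisticated
# surface reconstruction than a convex hull.
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)

# 1. Pretend the cameras produced a point cloud: 500 points on a sphere.
points = rng.normal(size=(500, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)

# 2. Connect the points into a closed triangle mesh.
hull = ConvexHull(points)

# Each row of hull.simplices is one triangle (three point indices).
print(f"{len(points)} points -> {len(hull.simplices)} triangles")
```

A convex hull only works for convex shapes; capturing an actual person requires reconstruction methods that handle concavities, which is exactly where each company's proprietary smoothing and detail-preservation comes in.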

Here are some details on how it works from Microsoft. The paper includes an mp4 video, which you can see here:

Philippe Lewicki

Culver City

Immersed in the metaverse for the past 7 years, Philippe is an international technology speaker. He works at AfterNow, creating the future of human-machine interaction.