https://tech.fb.com/ar-vr/2022/06/p...or-visual-realism-in-vr/#section-butterscotch
Passing the visual Turing test: The inside story of our quest for visual realism in VR
In November 2020, Meta CEO and founder Mark Zuckerberg sent an email to CTO Andrew “Boz” Bosworth and Reality Labs Chief Scientist Michael Abrash asking a very straightforward question: “What is keeping us from having a VR display that is almost indistinguishable from reality, and what will we have to solve in order to achieve that?”
It was the latest in a series of detailed conversations about building advanced virtual reality (VR) display systems that Zuckerberg and Abrash had had over the years, ranging from a 2015 trip to a promising augmented reality (AR) company, to frequent email threads, 1:1 discussions, and technology reviews, to numerous demos in Redmond and Menlo Park.
The response could have been mere blue-sky speculation — but it was anything but, because the Display Systems Research (DSR) team at Reality Labs, led by Douglas Lanman, had been doing deep research on all the technologies needed to answer Zuckerberg’s specific question for the previous five years. In fact, it was exactly the right question at the right time to crystallize and map out DSR’s vision for VR displays for the next decade: passing the visual Turing test.
The Holy Grail of Display Research
The Turing test was designed by Alan Turing in 1950 to evaluate whether a computer could pass as a human. The visual Turing test — a phrase DSR adopted and helped to popularize alongside leading academic teams — similarly evaluates whether what’s displayed in a VR headset can be distinguished from the real world. It’s a subjective test, and it’s a test that no VR technology can pass today. While VR already creates a strong sense of presence, of being in virtual places in a genuinely convincing way, it’s not yet at the level where anyone would wonder whether what they’re looking at is real or virtual.
Zuckerberg’s question spurred Lanman in December 2020 to write what became a widely circulated internal memo, “Passing the visual Turing test.” In it, he laid out a detailed roadmap to achieving that goal — a goal that, if reached, will open up a whole new world of VR capabilities, ranging from virtual workspaces that make remote work as productive as — or even more productive than — work in a real office space, to virtual social interaction that feels authentically like being with other people, to virtual tourism, to pretty much everything we do today in the real world. Remote work, powered by VR, would allow many more people to live wherever they want, rather than having to move to where jobs are. That would create new opportunities both for individuals, whose access to a wide range of jobs would no longer be limited by geographic location, and for businesses, which would be able to tap a vast global talent pool. But the game-changing effects would extend beyond productivity. VR, together with AR, has the potential to change the world as much as or even more than personal computing, and indistinguishably realistic visual experiences will play a huge part in that.
In today’s “Inside the Lab” post, we’ll take a deep dive into DSR’s quest to build the display technology stack that — along with Codec Avatars, a believable sense of touch, spatial audio, and more — will help make the future metaverse feel truly real, by meeting the challenge of the visual Turing test across the full range of visual experiences. We’ll look at the core technologies that DSR is developing, we’ll talk about the prototyping approach that fuels DSR’s progress, and we’ll share the results of a first-of-its-kind perceptual study that catalyzed much of the team’s research. Finally, we’ll share details about several of DSR’s prototypes and pull back the curtain on Mirror Lake, a prototype design that integrates DSR’s work across several research areas into a next-generation headset with a lightweight, comfortable form factor.
This is a story of scientific exploration — of the seed of a research idea growing into a full-spectrum program that has a good shot at one day changing the way we work, play, and communicate. And the place to start with that story is with the challenge.
The Challenge
The challenge DSR faces as they pursue their quest for visual realism is easily summed up: The technology needed to pass the visual Turing test, especially in a consumer headset, doesn’t yet exist. While Quest and Quest 2 create compelling 3D visual experiences, they can’t yet compete with our experiences in the real world. The obvious current limitation is resolution, but the challenges run far deeper. VR introduces a slew of new issues that simply don’t exist with today’s 2D displays, including vergence-accommodation conflict, chromatic aberration, ocular parallax, and pupil swim. As a result, there are many obstacles to be overcome, a great deal of research to be done, and a lot of user studies to be conducted before we can get close to a fully realistic VR visual experience. The innovations needed to close the gap fall into several major categories.
For starters, resolution is an issue. The problem is that VR headsets have much wider fields of view than even the widest monitor, so whatever pixels are available have to be applied across a much larger area than for a 2D display, resulting in lower resolution for a given number of pixels. For example, 20/20 vision across the full human field of view would require about 13,000 pixels horizontally — far more than any existing consumer display. (The reality isn’t quite that bad, since the eye doesn’t have the ability to perceive high resolution across the full field of view, but the magnitude of the challenge still applies.) And not only are a lot more pixels required, but the quality of those pixels needs to increase. Today’s VR headsets have substantially lower brightness and contrast than laptops, TVs, and mobile phones. As such, VR can’t yet reach the level of fine detail and accurate representation that we’ve become accustomed to with our 2D displays.
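For the numerically inclined, here’s a quick back-of-the-envelope check of that figure, assuming the common conventions that 20/20 acuity corresponds to about 60 pixels per degree and that the full human horizontal field of view spans roughly 220 degrees. These are round figures for illustration, not DSR’s internal specs.

```python
# Back-of-the-envelope check of the "~13,000 pixels" claim.
# Assumptions (round figures, not DSR's numbers): 20/20 acuity
# corresponds to ~60 pixels per degree, and the full human
# horizontal field of view spans roughly 220 degrees.

ACUITY_PPD = 60          # pixels per degree for 20/20 vision
HUMAN_HFOV_DEG = 220     # approximate binocular horizontal FOV

pixels_needed = ACUITY_PPD * HUMAN_HFOV_DEG
print(f"Horizontal pixels for 20/20 across the full FOV: {pixels_needed:,}")
# -> 13,200, in line with the ~13,000 figure above
```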
Additionally, the lenses used in current VR displays often distort the virtual image, reducing realism unless the distortion is fully corrected in software — which is challenging because the distortion varies as the eye moves to look in different directions. Moreover, while it’s not part of realism, headsets can be hard to use for extended periods of time because that distortion, as well as the headset’s weight, can cause temporary discomfort and fatigue. And there’s one more key element, which could be considered part of resolution but is so crucial that it belongs in its own category: the ability to focus properly at any distance. We’ll explain that last point and dive into it shortly, because it’s at the heart of our story today.
In order to fully address the above gaps, Zuckerberg and Lanman believe that passing the visual Turing test will require building a new tech stack that includes:
- “Varifocal” technology that provides correct depth of focus (versus a single fixed focus), thereby enabling clearer and more comfortable vision within arm’s length for extended periods of time
- Resolution that approaches and ultimately exceeds 20/20 human vision
- Distortion correction to help address optical aberrations, like color fringes around objects and image warping, that can be introduced by viewing optics
- And high dynamic range (HDR) technology that expands the range of color, brightness, and contrast you can experience in VR

Developing all these capabilities is necessary (and hard!) but not sufficient. All of that ultimately needs to fit into a more comfortable headset suitable for consumer use, and that means DSR has to not only advance the state of the art on multiple display axes but also build complete display systems well beyond what exists today — and that takes the challenge to another level. But it’s a challenge that DSR is taking on — and a challenge that Zuckerberg believes is essential to solve to get to the next generation of VR.
Lanman notes the complexity of the task: “Designing and building headsets that incorporate that collection of technologies is difficult and time-consuming work because with headset displays, all technical systems are interconnected. Everything competes for that same size, weight, power, and cost budget, while also needing to fit into a compact, wearable form factor.” And it’s not just a matter of squeezing all the technology into a tight budget — each element of the stack must also be compatible with all the others. For example, certain eye tracking technologies must be paired with specific types of display lenses in order to function properly.
DSR has tackled this head-on with an extensive series of prototyping efforts, ranging from individual technologies to full systems, that map out and push the boundaries of the vast VR display design space, followed by user studies run on those prototypes to assess progress toward passing the visual Turing test. The tangible result of this is on display at RL Research in Redmond: an entire wall of prototypes that collectively explore a wide spectrum of technology for next-generation VR displays — a living history of DSR’s quest for visual realism.
Over the last seven years, Lanman’s team has built over two dozen fully functional AR/VR research headsets, each geared towards unlocking novel demos and user studies.
In the remainder of this post, we’ll explore that history from the very beginning right up to the present day. We’ll look at each of the four primary technology axes in turn, including an update on the long-running varifocal program that we’ve talked about many times over the years. And we’ll discuss two recent DSR display system architectures: Holocake 2 — which to the best of our knowledge has the most compact optics of any Quest 2-class VR headset, and is the first such headset with holographic optics — and Mirror Lake, a proposed architecture for future generations of the VR visual experience.
Let’s go back to 2015, when it all began.
Varifocal and the unexpected role of hands
In 2015, Lanman’s newly-formed team was in its first year of investigating the display technologies that were potentially relevant to passing the visual Turing test. At the same time, Meta (then known as Facebook) was in the process of launching the Oculus Rift, soon to be followed by a novel interaction method: Touch controllers, which brought a sense of hand presence to VR.
Lanman was confident that RL would one day go beyond Touch to ship the hand-tracking technology that was then in development within the research team. (He was right: In 2020, we added Hands to Quest.) And that thought led Lanman to a key insight.
Varifocal is a technology that involves adjusting the focus of the display based on what you’re looking at. In this through-the-lens footage, you can see the difference it makes — particularly when focusing on nearby objects.
That insight was that to use your hands most effectively, you have to be able to focus on them. That may seem obvious and unexceptional, since that’s exactly what we do in the real world, but this is one of those cases where the rules change in VR. In the real world, we constantly change the shape of the lenses in our eyes to focus at the distance of whatever we’re looking at, thereby properly imaging the light that’s coming from that distance. In contrast, current VR headsets feature optics that have a fixed focus, typically at 5 to 6.5 feet (1.5 to 2 meters). That means that although we’re not aware of it, light always effectively comes from the same distance in VR, no matter where in the scene we’re looking, and that’s a new phenomenon for our visual systems. The mismatched cues you receive in VR between the simulated distance of a virtual 3D object and the focusing distance — which, again, is fixed at roughly 5 to 6.5 feet in today’s headsets — can cause vergence-accommodation conflict (VAC). VAC is a well-known phenomenon in the VR field that may lead to temporary fatigue and blurry vision, and can be one source of the discomfort that can be experienced when spending extended periods of time in VR. “Your eyes try to focus and you can’t,” Zuckerberg said last year when explaining the benefits of varifocal, “because [the display is] projecting [at] one distance.”
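Opticians quantify focus in diopters, the reciprocal of distance in meters, which makes the size of the mismatch easy to estimate. Here’s a minimal sketch, using a fixed focal distance of 1.75 meters (within the typical range quoted above) and illustrative object distances of our choosing:

```python
# Vergence-accommodation conflict, quantified in diopters.
# Diopters = 1 / distance_in_meters; the conflict is the gap
# between where the eyes converge and where the optics force
# them to focus. Distances below are illustrative examples.

FIXED_FOCUS_M = 1.75  # typical fixed focal distance (~5-6.5 ft)

def vac_diopters(object_distance_m: float) -> float:
    """Mismatch between vergence and accommodation demand."""
    return abs(1 / object_distance_m - 1 / FIXED_FOCUS_M)

for d in (0.4, 0.75, 2.0, 10.0):  # meters
    print(f"object at {d:>4} m -> conflict of {vac_diopters(d):.2f} D")
# An object at 0.4 m (arm's length) yields ~1.93 D of conflict,
# while objects beyond ~1 m stay well under 1 D -- which is why
# fixed-focus VR feels worst for near work.
```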
One path to addressing VAC is to dynamically adjust the focal depth in VR to match the distance of the object of interest, enabling our eyes to focus at the right distance, and one potential way to do that, known as “varifocal,” is to move the lenses accordingly as the viewer changes what they’re looking at. To test that theory, DSR created a bulky proof-of-experience prototype in 2016, shown below. We refer to prototypes of this sort — far from consumer-ready, built for the purpose of probing what might be possible with years of research and development — as “time machines.” Time machines form an integral part of DSR’s approach to exploring the design space of future VR visual technologies.
DSR’s first complete varifocal prototype, created in 2016, integrated all the necessary components for a compelling experience — variable focus, robust eye tracking, real-time distortion correction that updated with changes in display focus, and rendered blur that increased away from the focal plane, as it does in the real world. The 2016 demo used a prototype Touch controller to allow lab members to directly assess the visual acuity benefits for objects within arm’s length.
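That rendered blur can be approximated with a standard first-order optics rule of thumb (we’re not claiming this is the exact model the prototype used): the angular diameter of the retinal blur circle is roughly the pupil diameter times the defocus in diopters. A sketch, with an assumed 4 mm pupil:

```python
# First-order model of rendered retinal blur (a standard optics
# approximation, not necessarily the exact model DSR used):
# angular blur-circle diameter ~= pupil_diameter * defocus,
# with the pupil in meters and defocus in diopters (radians out).
import math

PUPIL_M = 0.004  # assume a 4 mm pupil

def blur_arcmin(object_d_m: float, focus_d_m: float) -> float:
    defocus = abs(1 / object_d_m - 1 / focus_d_m)  # diopters
    return math.degrees(PUPIL_M * defocus) * 60    # arcminutes

# With the display focused at 0.5 m, a background object at 3 m
# should be rendered with roughly this much blur:
print(f"{blur_arcmin(3.0, 0.5):.1f} arcmin")  # ~22.9 arcmin
```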
A first-of-its-kind user study and the evolution of Half Dome
When Zuckerberg made a visit to RL Research in 2017, he went to see a wide range of prototypes and make some decisions about the technological directions the company should take going forward. The first VR demo he tried that day was one of our first attempts at varifocal — a behemoth that he quickly agreed improved the sharpness of nearby objects. That and other early prototypes showed that the principle underlying varifocal could work, and subjectively provided sharper visual experiences. However, the emerging evidence, though promising, was anecdotal, and the team lacked definitive proof that the DSR version of varifocal could overcome VAC and improve acuity and comfort.
Marina Zannoli, then a Vision Scientist on the DSR team, stepped in to find the answer by leading a user study of varifocal. She began by issuing a daunting engineering challenge: The team had to create a new headset that was much closer to the weight and form factor of an Oculus Rift in order to keep the study from being clouded by the general discomfort that would come from wearing a bulky prototype. This required reducing the mass by a factor of four, as compared to the team’s existing 2,450-gram (about 5.5 pounds) headset, while simultaneously refining the device to be free of the noise and vibration generated by the varifocal system.
Nine months later, the team delivered Half Dome Zero — a 680-gram research prototype headset that was fully compatible with every single VR game shipping for Rift at the time, but with the added ability to provide proper depth of focus in those games via varifocal. While the new headset was somewhat heavier than the 470-gram Rift, Zannoli believed it was light enough to provide meaningful insights into user preferences and the true benefits of varifocal.
Next, Zannoli had to decide how to test the intended benefits of varifocal, including whether it improves the sharpness of nearby objects, whether it helps people perceive 3D scenes faster, whether it increases visual comfort, and, most importantly, whether people actually preferred it.
Here, Zannoli decided on an approach to the problem that was quite different from the standard vision science approach of using limited stimuli such as eye charts. She chose to ground the study in rich VR experiences, working with a team of technical artists to develop a custom demo application, built on video game technology, that encouraged participants to spend most of their time observing nearby objects — something VR developers are currently advised to avoid due to the known limitations of fixed-focus VR.
The Half Dome Zero user study, conducted in 2017, involved spending 30 minutes in VR, spread across three experiences: a modified version of First Contact that involved interacting with nearby objects, a modified scene from Dreamdeck in which participants had to search for a small symbol, and a task that involved looking at random dot stereograms and assessing how quickly participants could understand 3D patterns in the scene (note: the patterns are only visible in VR).
Now armed with a suitable headset and a carefully designed protocol, Zannoli brought in 63 participants who completed a two-day trial that assessed the team’s varifocal system relative to fixed-focus VR. On one day varifocal was fully enabled on Half Dome Zero, and on the other the headset was operated in the fixed-focus mode that’s standard for current VR headsets. Participants were asked to subjectively assess a variety of preferences by completing a set of questionnaires.
The study findings were more positive than the team initially suspected they might be. Zannoli summarizes, “What we found when we looked at the results was that when using varifocal, people were more comfortable in every respect. They experienced less fatigue, nausea, and blurry vision, and they were able to identify small objects better, had an easier time reading text, and reacted to their visual environment more quickly.” Most promising of all was that the majority of participants preferred varifocal over fixed-focus VR — a particularly surprising result, as Half Dome Zero was an early prototype with imperfect eye tracking and distortion correction software.
So by the summer of 2017, DSR finally had definitive proof that varifocal could bring a host of performance and comfort benefits to VR, and contemporaneous studies at Inria, UC Berkeley, and Stanford supported that conclusion. The team was now certain that resolving the multitude of remaining engineering challenges — spanning eye tracking, computer graphics, optical design, control systems, and weight — was a top priority, and so over the next five years DSR built a series of prototypes that pushed the limits of varifocal technology.
Half Dome Zero was used in the 2017 study. With Half Dome 1, the team expanded the field of view to 140 degrees. With Half Dome 2, they focused on ergonomics and comfort, cutting 200 grams. And, with Half Dome 3, they introduced electronic varifocal, further reducing the headset’s size and weight.
Beyond varifocal: Retinal resolution, distortion-free displays, and HDR
“The Half Dome series was a turning point for our team,” says Lanman. “It helped us advance the state of the art in varifocal technology, and it also gave us a template for our other display research programs.” After Half Dome, DSR began to guide all their research efforts along the same path — a process that begins with establishing technical requirements and hypotheses, followed by building bulky proof-of-experience time machines, creating refined proof-of-concept prototypes, and finally conducting user studies that generate key data to inform the next prototype.
“We’ve rigorously applied this blueprint to the other dimensions of the visual Turing test,” adds Lanman, “especially resolution, optical distortions, and dynamic range.”
Let’s dive a little deeper into those three areas and see what stage each is at on the DSR research path.
Butterscotch: Understanding “retinal resolution”
“Retinal resolution” has long been the gold standard for products with a screen. Although there’s no universally accepted definition, it’s generally considered to be around 60 pixels per degree (ppd), which is sufficient to depict the 20/20 line on an eye chart. While most laptops, TVs, and mobile phones have long since passed this mark, VR lags behind because its immersive field of view spreads the available pixels over a much greater visual extent. For example, Quest 2’s displays deliver about 20 ppd.
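The 60 ppd convention follows from the definition of 20/20 acuity: a 20/20 Snellen letter subtends 5 arcminutes, with strokes 1 arcminute wide, so rendering each stroke with a single pixel takes 60 pixels per degree. Here’s a rough sketch of what a given pixel density implies for the smallest legible eye-chart line (an idealized conversion, ignoring optics and rendering losses):

```python
# From pixels-per-degree to an approximate Snellen acuity line.
# Convention: 20/20 resolves 1-arcminute detail, i.e. 60 px/deg
# puts one pixel on each 1-arcmin letter stroke.

def snellen_denominator(ppd: float) -> float:
    arcmin_per_pixel = 60 / ppd          # detail size one pixel covers
    return 20 * arcmin_per_pixel         # 20/(20 * detail in arcmin)

for name, ppd in [("Quest 2", 20), ("Butterscotch", 55), ("retinal", 60)]:
    print(f"{name:>12}: ~20/{snellen_denominator(ppd):.0f}")
# Quest 2's ~20 ppd lands near 20/60 -- three times coarser than
# the 20/20 line -- while Butterscotch's 55 ppd is close to 20/22.
```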
If an eye chart were presented in VR, then neither Rift nor Quest 2 could resolve the lowest line, representing 20/20 visual acuity. In contrast, DSR’s Butterscotch prototype is engineered to meet traditional retinal resolution requirements and can depict the finest features in an eye chart, as can be seen from these photos taken through the lens of each type of headset.
This resolution gap obviously limits the ability to present fine text and other detail, and it can also limit perceived realism. For example, researchers in Japan showed that the sense of realism steadily increases as image resolution increases, all the way to around 120 ppd, well beyond what is considered “retinal” resolution. Since visual realism is at the heart of the visual Turing test, over the years DSR has built a series of high-resolution VR prototypes designed to probe the significance of retinal resolution in the context of VR, and to find ways for practical headsets to reach that level.
The value of that prototyping was unexpectedly reinforced when Zuckerberg and Bosworth visited RL Research last year. On the drive in from the airport, Zuckerberg asked Abrash about the team’s progress on retinal resolution. Abrash responded that Zuckerberg could see for himself: within a couple of hours, he would be donning Butterscotch, the latest and most advanced of DSR’s retinal resolution prototypes.
The DSR team regularly demos to Meta leadership, providing early glimpses of future AR/VR visual technologies. Left: Mark Zuckerberg first experienced varifocal during a 2017 visit to the research team in Redmond, Washington, using an early AR varifocal prototype (that was clearly not optimized for ergonomics!). Right: On a visit to RL Research last year, Zuckerberg experienced the latest retinal resolution VR prototype from DSR.
Butterscotch is a great example of prototyping to get answers as quickly and directly as possible. There are currently no panels that support anything close to retinal resolution for the standard VR field of view, so the team used 3k LCD panels and limited the field of view to about half that of Quest 2 in order to boost the resolution to 55 ppd — two and a half times that of Quest 2. Then they had to develop a new type of hybrid lens to fully resolve such high resolution.
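The tradeoff at work here is easy to see numerically. A rough sketch, treating Quest 2 as roughly 100 degrees horizontal at about 20 ppd and the panel as 3,000 pixels wide (approximate figures, not exact specs):

```python
# How shrinking the field of view buys resolution: the same
# panel width spread over fewer degrees yields more px/deg.
# Numbers are approximations from the text, not exact specs.

PANEL_WIDTH_PX = 3000        # "3k" LCD panel (approximate)
QUEST2_HFOV_DEG = 100        # rough horizontal FOV
QUEST2_PPD = 20              # from the text

butterscotch_fov = QUEST2_HFOV_DEG / 2          # "about half" of Quest 2
butterscotch_ppd = PANEL_WIDTH_PX / butterscotch_fov
print(f"Butterscotch: ~{butterscotch_fov:.0f} deg at ~{butterscotch_ppd:.0f} ppd")
# -> ~50 deg at ~60 ppd in this idealized math; the 55 ppd the
#    team reports reflects the prototype's actual optics.
```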
The result is not even close to shippable technology — it’s far too heavy and bulky, with an undersized field of view — but it let Zuckerberg experience near-retinal resolution and see for himself how much difference it made, which is exactly what DSR’s time machines are designed to do. In fact, after demoing Butterscotch and recognizing that its retinal resolution technology was vital to the future of VR, Zuckerberg directed a company-level review of our resolution roadmap.
There’s still a long way to go on the path to VR resolution that approaches reality, but Butterscotch is a significant step along the way. It also serves as a base for integrating other DSR technologies into high-resolution display systems. For example, DSR is building a varifocal variant of Butterscotch that will offer more than three times the resolution of the Half Dome Zero prototype. With fixed focus, blurring occurs away from the focal plane, an effect that grows more noticeable as resolution increases, and varifocal Butterscotch will make it possible to evaluate the full visual acuity benefits of varifocal near the limit of human vision.
Eliminating optical distortions in VR headsets
The resolution of the VR visual experience is important, but it’s just one piece of the puzzle. The quality of the image is equally important, and for various technical reasons no VR lens can be fully free of optical aberrations. Some aberrations can be corrected by warping the image in software — that’s a crucial element of virtually every VR headset today, and getting it right is key to great visual experiences. However, the distortion correction software in current VR headsets doesn't work perfectly; the correction is static, but the distortion of the virtual image is dynamic, changing depending on where one is looking. As shown below, this phenomenon, known as pupil swim, can make VR seem less real because everything moves a bit when the eye moves. This becomes even more significant with varifocal, because the image magnifies and shrinks slightly as the focal distance of the display changes.
For varifocal to work seamlessly, this optical distortion needs to be addressed beyond what is done in headsets today.
The team had the importance of accurate varifocal distortion correction hammered home early on, thanks to an error in the 2017 Half Dome Zero user study that accidentally turned off distortion correction for varifocal. They corrected that mistake, but in the process learned that varifocal delivered significant benefits only when lens distortion correction was applied correctly. As the team delved further into the topic, though, it quickly became clear that the tools needed to get distortion correction right were lacking.
The problem was that distortion studies take a very long time to set up; fabricating the lenses in a custom headset alone can take weeks or months, and that’s just the beginning of the long process to build a functional headset display that can be used for testing. DSR realized that they needed to conduct distortion studies at the speed of optical design software rather than lens fabrication hardware, and they set out to solve that problem.
DSR’s VR lens distortion simulator emulates VR headsets using a 3D TV. This allows the team to rapidly study novel optical designs and distortion-correction algorithms in a repeatable, reliable manner while also eliminating the time-consuming process of iterating on designs using full headset prototypes.
And solve it they did. The team has repurposed 3D TV technology to create a VR lens distortion simulator that can induce precisely controlled distortions, allowing them to instantly study distortion correction algorithms for any lens design. DSR will present their rapid prototyping solution at the annual SIGGRAPH conference in August.
With this unique rapid prototyping capability, the team has been able for the first time to conduct a user study to investigate eye-tracked distortion correction. Unlike the correction software in today’s headsets, dynamic distortion correction uses eye tracking to update the rendered correction to account for movement of the eyes, which has the potential to produce the always-stable images that today’s static correction can’t.
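To make the idea concrete, here’s an illustrative sketch of gaze-dependent correction. It is not DSR’s algorithm: we model the lens as a simple radial polynomial whose coefficients vary with gaze angle (the calibration values below are made up), and pre-warp render coordinates with the inverse of that distortion each frame:

```python
# Illustrative sketch of eye-tracked ("dynamic") distortion
# correction -- NOT DSR's algorithm. The lens is modeled as a
# radial polynomial whose coefficients depend on gaze angle,
# and render coordinates are pre-warped with its inverse.
import numpy as np

# Hypothetical calibration: distortion coefficients (k1, k2)
# measured at a few gaze angles (degrees off the optical axis).
CALIBRATION = {0.0: (0.22, 0.05), 10.0: (0.25, 0.07), 20.0: (0.30, 0.10)}

def coeffs_for_gaze(gaze_deg):
    """Linearly interpolate calibrated coefficients for this gaze."""
    angles = sorted(CALIBRATION)
    k1 = np.interp(gaze_deg, angles, [CALIBRATION[a][0] for a in angles])
    k2 = np.interp(gaze_deg, angles, [CALIBRATION[a][1] for a in angles])
    return k1, k2

def prewarp(xy, gaze_deg):
    """Pre-distort normalized image coords so the lens maps them
    back to their intended positions.

    Lens model: r_out = r * (1 + k1*r^2 + k2*r^4). We solve for
    the input radius r per point with fixed-point iterations.
    """
    k1, k2 = coeffs_for_gaze(gaze_deg)
    r_target = np.linalg.norm(xy, axis=-1, keepdims=True)
    r = r_target.copy()
    for _ in range(10):
        r = r_target / (1 + k1 * r**2 + k2 * r**4)
    scale = np.divide(r, r_target, out=np.ones_like(r), where=r_target > 0)
    return xy * scale

# Each frame: read the eye tracker, re-warp with matching coefficients.
points = np.array([[0.5, 0.5], [0.9, 0.0]])
print(prewarp(points, gaze_deg=15.0))
```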
Rapid prototyping promises to greatly accelerate research into VR lens distortion and correction of all sorts, paving the way to reduced distortion in future VR headsets.
Starburst: Previewing high dynamic range headsets
Resolution, distortion correction, and varifocal are all key pillars of advanced visual realism, but high dynamic range (HDR) is the single technology that has been most consistently linked to an increased sense of realism and depth. HDR refers to support for wide ranges of brightness, contrast, and color, and it’s recently come into its own in the television space.
“Nits” are units that describe how much light an object emits, and brightness values measured in an ordinary indoor environment can range well past 10,000 nits, as shown below. Until recently, a typical television had a brightness of only a few hundred nits. However, in 2013, researchers at Dolby Labs conducted a user study with a custom-built display that reached a peak of 20,000 nits, and found that the sweet spot for peak brightness was around 10,000 nits. This pioneering study inspired the television industry to develop and introduce HDR displays over the last half decade, with great success.
Brightness levels measured in an office. The maximum brightness of 49,040 nits far exceeds the 100-nit maximum for today’s VR displays. Note that HDR displays don’t need to be brighter everywhere; rather, they increase realism by reproducing bright highlights, for example for reflections and overhead lighting in this scene.
VR has yet to make that leap. Quest 2 has a peak brightness of about 100 nits, and pushing far beyond that won’t be easy within the power, thermal, and form-factor constraints of VR headsets. As Zuckerberg explained in an interview last year, “probably the hardest challenge in terms of the display and getting it to be super vivid, [is] the [HDR] problem. TVs have gotten a bit better on HDR recently. But the vividness ... of screens that we have compared to what your eye sees in the real world [is] just an order of magnitude or more off.” The LCD panels and lenses used in modern VR headsets result in lower contrast than TV screens, further reducing realism, and increasing brightness tends to amplify the problem, washing out darker colors, especially black. Finally, today’s displays can only show a subset of the full color gamut the human eye is capable of perceiving.
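To put those numbers on a common scale, brightness ratios can be expressed in orders of magnitude or in photographic stops. A quick sketch using the figures quoted in this section:

```python
# Brightness gaps expressed in orders of magnitude and stops,
# using the figures quoted in this section.
import math

QUEST2_PEAK = 100          # nits
OFFICE_PEAK = 49_040       # nits, brightest highlight measured above
DOLBY_SWEET_SPOT = 10_000  # nits, preferred peak from the Dolby study

for label, nits in [("office highlight", OFFICE_PEAK),
                    ("Dolby sweet spot", DOLBY_SWEET_SPOT)]:
    ratio = nits / QUEST2_PEAK
    print(f"{label}: {ratio:.0f}x Quest 2 "
          f"({math.log10(ratio):.1f} orders of magnitude, "
          f"{math.log2(ratio):.1f} stops)")
# office highlight: 490x (2.7 orders of magnitude, 8.9 stops)
# Dolby sweet spot: 100x (2.0 orders of magnitude, 6.6 stops)
```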
DSR researchers are building prototype HDR VR headsets. “Our latest, Starburst, is bulky, heavy, and tethered,” says DSR Research Scientist Nathan Matsuda, “and people have to hold it up to their face like a pair of oversized binoculars. But when they do, they experience something no one ever has before: a demo that can reproduce the full range of brightness typically encountered in indoor or nighttime environments.”
DSR’s Starburst prototype reconfigures the guts of a Quest 2 headset, placing a very bright lamp behind the LCD panels. This “time machine” is one of the brightest HDR displays ever built, reaching a peak brightness of 20,000 nits, and it’s the first 3D HDR headset DSR is aware of, allowing the team to investigate the interplay of HDR and 3D depth perception.
There’s no substitute for directly experiencing HDR with your own eyes, so DSR will be demoing Starburst at SIGGRAPH in August. In the meantime, DSR is following its usual template by building improved HDR headsets that can serve as vehicles for user studies. The path to true HDR VR displays is a long one, but DSR has started on the journey and will be providing updates along the way.
Realizing the step change
After years of demos and user studies, DSR is confident that retinal resolution, varifocal, accurate distortion correction, and HDR are crucial to passing the visual Turing test in VR, and they’ve built and validated prototypes that individually advance each of those aspects of visual realism. But the ultimate payoff is practically combining them all in a single, compact headset, and that takes the challenge to 11.
The problem is that VR headsets need to be compact, light, and stylish, and the additional hardware needed to implement DSR’s technologies tends to work against that. Lanman observes: “After almost seven years of developing high-performance varifocal headsets, our mechanical engineers have consistently found that any compelling varifocal system — at least one based on physically translating lenses or screens — adds around 40 to 50 grams.” That may not seem like a lot — it’s about the weight of two AA batteries — but adding it would ask people to accept a headset that is at least 10% heavier than Quest 2.
That’s where DSR Research Scientist Andrew Maimone comes in. Maimone’s research focuses on reducing the size, weight, and power of existing VR as much as possible. “While we learn a lot with our early prototypes, passing the visual Turing test with big, clunky, experiential test beds is just a first step on the way to eventually delivering these technologies in a sleek, lightweight form factor that you’ll want to use every day,” says Maimone. “That’s why we also build architectural prototypes that explore how we can condense all of these elements into something that’s shippable.”
Holocake: How low can you go?
Maimone led development of one of the architectural prototypes that Zuckerberg and Bosworth experienced in Redmond last fall, a super-compact headset called Holocake 2.
Holocake 2 is designed to test the optical performance of holographic pancake lenses in a fully functional, PC-tethered headset.
Combining holographic and pancake optics — an approach we first discussed in our post about the Holocake headset in 2020 — Holocake 2 is the thinnest and lightest VR headset we’ve ever built. Unlike the original Holocake, which looked like a pair of sunglasses but lacked key mechanical and electrical components and had significantly lower optical performance than today’s consumer VR headsets, Holocake 2 is a fully functional, PC-tethered headset capable of running any existing PC VR title.
Understanding how Holocake 2 achieves its ultra-compact form factor requires a quick dive into how VR displays are constructed. Today's VR displays rely on a light source, a display panel that forms images by dimming or brightening the light, and a lens that focuses the light from the display into the eye. Typically, the lens needs to be a few inches from the display in order to have enough focusing power to direct the light into the eye.
Holocake lenses reduce thickness and weight in two ways. First, polarization-based optical folding causes light to reflect inside of the lens, similar to emerging pancake lenses. Second, holographic films replace the bulkier refractive lenses used in both pancake lenses and conventional refractive designs, like Quest 2. In each case, light coming from a flat-panel display is focused towards the eye; only the form factor varies.
But, as illustrated above, there are ways to make it possible to place the lens much closer to the display, substantially reducing the size of the headset. Holocake 2 applies two technologies in tandem to accomplish this. First, it replaces the lens with a holographic optic that bends light like a lens, but is shaped like a thin, transparent glass slab. Second, it implements polarization-based optical folding (emulating a pancake lens, but with the much smaller form factor of a holographic optic) to dramatically shorten the path of light from the display to the eye.
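Here’s a simplified way to see why folding helps, under the idealized assumption that polarization folding sends light across the display-to-lens gap three times before it exits (real designs are more subtle, and these numbers are purely illustrative):

```python
# Idealized model of polarization-based optical folding: light
# crosses the display-to-lens gap three times (out, back, out)
# before exiting, so the physical gap can be roughly a third of
# the optical path length the lens needs. Numbers are illustrative.

def physical_gap_mm(required_optical_path_mm: float, passes: int) -> float:
    return required_optical_path_mm / passes

OPTICAL_PATH = 60.0  # mm, hypothetical path a conventional design needs
print(f"conventional: {physical_gap_mm(OPTICAL_PATH, passes=1):.0f} mm")
print(f"folded:       {physical_gap_mm(OPTICAL_PATH, passes=3):.0f} mm")
# 60 mm -> 20 mm: the same focusing geometry in a third the depth,
# which is the intuition behind pancake and Holocake optics.
```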
This sounds like an almost magical way to reduce size and weight, so what’s the catch? The big one has to do with the light source — Holocake headsets require specialized lasers, rather than the LEDs used in existing VR products. “Lasers aren’t super exotic today,” says Maimone, “but they’re not found in a lot of consumer products at the performance, size, and price we need. So we’ll need to do a lot of engineering to achieve a consumer-viable laser that meets our specs, that’s safe, low-cost, and efficient, and that can fit in a slim VR headset.”
As of today, the jury is still out on suitable laser sources, but if that proves tractable, there will be a clear path to sunglasses-like VR displays.
Mirror Lake: Bringing it all together
DSR’s multiple research directions all stem from a core philosophy. As Lanman puts it: “We named ourselves Display Systems Research because we knew that all the demos and user studies in the world would amount to nothing unless we developed compelling, practical architectures along the way. This is the core work of DSR: the constant search for a solution to the puzzle of how everything can come together to create a next-generation visual experience that’s on the path to passing the visual Turing test. Not in an ‘everything but the kitchen sink’ kind of way, but in an elegant manner that brings true user value.”
Holocake 2 is a product of that philosophy, and there’s much more to come. Today we’re unveiling a display system that takes the next step — Mirror Lake. This is a ski-goggles-like concept that begins with the base Holocake 2 architecture, then adds in nearly everything that the team has incubated over the last seven years.
Mirror Lake is a concept design with a ski-goggles-like form factor that integrates nearly all of the advanced visual technologies DSR has been incubating over the past seven years, including varifocal and eye tracking, into a compact, lightweight, power-efficient package. It shows what a complete, next-gen display system could look like.
Mirror Lake illustrates the possibilities that the Holocake architecture — with its flat external surfaces — opens up. For example, the slim electronic varifocal modules from Half Dome 3 can be added to resolve vergence-accommodation conflict without significantly adding to the thickness of the headset. And instead of requiring bulky prescription lens attachments, individualized vision correction is just a matter of attaching another thin lens to the front of the headset, or even baking the wearer’s prescription directly into the hologram used in the main Holocake lens. There’s also a pair of front-facing cameras nestled in the temples that enable machine learning-driven passthrough — work that DSR will present at SIGGRAPH.
Eye tracking has emerged as a critical element of passing the visual Turing test, because it’s needed for both varifocal and dynamic distortion correction. The Mirror Lake architecture pioneers a new approach, using holographic films to redirect light from the eyes towards a pair of cameras mounted in the strap of the headset, and this novel approach also enables multiview eye tracking, which significantly boosts accuracy.
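One reason eye tracking is so central to varifocal: the focus update has to be driven by an estimate of where the eyes are fixating, and that can be derived from binocular vergence. A minimal sketch, assuming a simple symmetric geometry and an average interpupillary distance (this is illustrative, not DSR’s pipeline):

```python
# Estimating fixation distance from binocular vergence -- a
# minimal, symmetric-geometry sketch (not DSR's pipeline).
# With interpupillary distance IPD and total vergence angle
# theta, the fixation point sits at d = (IPD / 2) / tan(theta / 2).
import math

IPD_M = 0.063  # assume an average ~63 mm interpupillary distance

def fixation_distance_m(vergence_deg: float) -> float:
    return (IPD_M / 2) / math.tan(math.radians(vergence_deg) / 2)

for theta in (0.5, 1.0, 3.6, 7.2):  # degrees of vergence
    print(f"vergence {theta:>4} deg -> fixating ~{fixation_distance_m(theta):.2f} m")
# A varifocal display would then drive its focal distance (and
# its distortion correction) to match the estimated fixation depth.
```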
The key here is that, thanks to holography, everything is thin and flat. The varifocal modules are flat, and so are all the holographic films used for Holocake, prescription correction, and eye tracking. And it’s easy to continue adding thin, flat technologies. This was highlighted with the recent invention of reverse passthrough displays, which the team realized could be integrated into the Mirror Lake design simply by placing another flat 3D display in the optical stack.
The Mirror Lake concept is promising, but right now it’s only a concept, with no fully functional headset yet built to conclusively prove out the architecture. If it does pan out, though, it will be a game-changer for the VR visual experience.
The long path to passing the visual Turing test
As potentially transformative as Mirror Lake is, though, it’s just another step on the long journey to passing the visual Turing test. Developing the technology needed to pass that test — and figuring out how it can be turned into headsets that will meet the needs of millions of people — will be a journey of many years, with numerous pitfalls lurking along the way and a great deal still to be learned. DSR is well aware of the challenge and is committed to the mission of achieving true visual realism — and their efforts to date have convinced them, and Zuckerberg, that the goal is ultimately within reach.
As Zuckerberg has previously said, “When you look out over a 10-year period, obviously you want the [headset] form factor to get smaller. The ideal is to get to the point where you have almost like the Retina Display equivalent for VR… [It’s also necessary to] either [create] some kind of liquid lens or mechanically moving lens or something that can basically project things at different distances…. you’re also not gonna want to give up the vividness of what your eyes can really see in terms of the contrast and brightness of the colors if everything is just slightly duller in VR.” That framework for the importance of retinal resolution, varifocal, and HDR came from years of working with DSR to invest in these technologies, see the benefits of them firsthand, and then create a practical path forward for each of them.
We’ll let Lanman have the last word: “Lasers could ultimately prove impractical for VR, at least in the form needed for Holocake. And, in that event, the whole house of cards that is Mirror Lake comes falling down. Such is the challenge of inventing new display systems that rely on emerging technologies. But the best way to ensure that you will arrive at your desired destination is to have multiple routes to get there, and Mirror Lake is just one of DSR’s research directions. In any case, whatever path we take, our team is certain that passing the visual Turing test is our destination, and that nothing in physics prevents us from getting there. Over the last seven years we’ve glimpsed that future, and we remain fully committed to finding a practical path to a truly visually realistic metaverse.”