
Spatial video streaming worldwide and the future of television

Earlier this month, video streaming company SpatialGen announced it was teaming up with filmmakers Sandwich – and its app development arm Sandwich Vision – to stream John Gruber's "The Talk Show Live" from Apple's Worldwide Developers Conference in California in stereoscopic video. This marked the first time an event has been live streamed in 3D to Vision Pro users in Sandwich Vision's new Theater app (the app's name uses the US English spelling).

Sandwich has also released a separate app, Television, which promises to deliver shared viewing experiences – via Apple's Spatial Personas – for streaming video from YouTube and Netflix in future versions.

Live events, live streamed... in 3D!

While SpatialGen provided the streaming tech, Sandwich developed the visionOS app Theater, along with a custom hardware camera rig, to capture and broadcast the Talk Show Live event in stereoscopic video for Vision Pro users to enjoy in an immersive virtual theatre environment.

Editorial note: At the time of writing, the Vision Pro hasn't been released outside the United States, so I've been unable to try out the Theater app or watch the spatial video of the event, but I'm excited to watch it next month when the Vision Pro finally releases in the UK.

In previous years, Sandwich has streamed The Talk Show Live in good old-fashioned 2D on YouTube (this year's event is also available there in 2D).

After the event, friend of Sandwich Joe Rosensteel provided more insight – via Six Colors – into how the two companies pulled off this impressive technological feat. The full article is linked below, but in summary, Sandwich's Theater app puts the Vision Pro user into an immersive virtual theatre environment with a giant screen. The spatial video from the live event is displayed in front of that screen, on the virtual stage.

It sounds wild and obviously simple at the same time (like all the best ideas).

Dan Sturm, Sandwich's visual effects supervisor, explains how the app displays the immersive video within the virtual theatre environment:

"The user sees a human-scaled 'portal' to the stereoscopic capture of humans on a stage, separated forward from the big screen in z-depth about the same distance as the actual humans would be in a real theater environment. So the human scale is immersive, and the stereo capture is immersive.

"We wanted to recreate the feeling of sitting in a theater, watching the show live. That’s why we created a custom screen position and scale for the 3D video, and we even placed it on a small stage at the front of the virtual theater."

The technical setup comprised a pair of cameras, placed near the floor of the front of the stage, with lenses placed as close together as physically possible. These cameras fed their video into OBS along with spatialised (from the looks of it, stereo) audio from a pair of mics placed alongside the cameras. OBS was used to create a side-by-side stereo video feed, which SpatialGen converted to MV-HEVC for streaming.
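To make the side-by-side step concrete, here is a minimal Python sketch of packing a left-eye and a right-eye frame into one side-by-side frame, then splitting it back apart. It is illustrative only: frames are modelled as plain lists of pixel rows, the function names are hypothetical, and it is not a description of how OBS or SpatialGen actually implement this.

```python
def pack_side_by_side(left, right):
    """Join two equally sized frames row by row: left eye first, right eye second."""
    if len(left) != len(right):
        raise ValueError("frames must have the same height")
    packed = []
    for left_row, right_row in zip(left, right):
        if len(left_row) != len(right_row):
            raise ValueError("frames must have the same width")
        packed.append(left_row + right_row)
    return packed


def split_side_by_side(frame):
    """Recover the two eye views by cutting each row down the middle."""
    left, right = [], []
    for row in frame:
        half = len(row) // 2
        left.append(row[:half])
        right.append(row[half:])
    return left, right


# Tiny 2x2 stand-ins for the real camera frames (hypothetical sample data).
left_eye = [["L", "L"], ["L", "L"]]
right_eye = [["R", "R"], ["R", "R"]]

sbs = pack_side_by_side(left_eye, right_eye)
for row in sbs:
    print(" ".join(row))
```

The packed frame is twice as wide as each input and the same height, which is why a single ordinary video feed can carry both eyes to a later conversion step.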

The custom rig used to capture stereo video and audio (credit: Sandwich, via Six Colors)

"The camera rig was a one-off side-by-side rig we had built specifically for the show. The cameras were Panasonic Lumix BGH1 Micro 4/3 cameras with Olympus 17mm lenses.

"We used Blackmagic HDMI to Thunderbolt converters to get the camera feeds into a MacBook Pro running OBS, along with the audio feed. From there, we created the side-by-side stereo image that was sent to SpatialGen for conversion and streamed to the app.

"The cameras were shooting 1080p 60fps to reduce motion blur. So, our final side-by-side image was 3840×1080, but was streamed at 30fps for bandwidth considerations."
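The numbers in that quote can be checked with a little arithmetic. The sketch below assumes "1080p" means the standard 1920×1080 frame per camera, and it counts raw pixels only. It says nothing about the actual encoded bitrate, which the source doesn't give.

```python
EYE_WIDTH, EYE_HEIGHT = 1920, 1080  # assumed per-camera 1080p resolution
CAPTURE_FPS, STREAM_FPS = 60, 30    # frame rates quoted in the article


def side_by_side_size(eye_width, eye_height):
    """Two eye views placed next to each other: double the width, same height."""
    return 2 * eye_width, eye_height


def pixels_per_second(width, height, fps):
    """Raw (uncompressed) pixel throughput for a given frame size and rate."""
    return width * height * fps


sbs_width, sbs_height = side_by_side_size(EYE_WIDTH, EYE_HEIGHT)
capture_rate = pixels_per_second(sbs_width, sbs_height, CAPTURE_FPS)
stream_rate = pixels_per_second(sbs_width, sbs_height, STREAM_FPS)

print(f"Side-by-side frame: {sbs_width}x{sbs_height}")
print(f"Pixels per second at {CAPTURE_FPS}fps: {capture_rate:,}")
print(f"Pixels per second at {STREAM_FPS}fps: {stream_rate:,}")
```

Halving the frame rate halves the raw pixel throughput, which is consistent with the bandwidth reasoning in the quote.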

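On the cameras' Micro Four Thirds sensors, which have a 2× crop factor, the 17mm lenses give a field of view roughly equivalent to a 35mm lens on a full-frame camera. This was clearly a considered choice for the comfort of viewers, 35mm being a natural-looking field of view.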
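The "roughly 35mm" figure follows from the crop factor of the Micro Four Thirds sensor format. Here is a quick sketch, assuming the nominal 2× crop factor relative to full frame. The exact effective crop in the cameras' video mode isn't stated in the source, so treat the result as approximate.

```python
MICRO_FOUR_THIRDS_CROP = 2.0  # nominal crop factor relative to full frame (assumption)


def full_frame_equivalent(focal_length_mm, crop_factor):
    """Focal length that gives a similar field of view on a full-frame sensor."""
    return focal_length_mm * crop_factor


equivalent_mm = full_frame_equivalent(17, MICRO_FOUR_THIRDS_CROP)
print(f"17mm on Micro Four Thirds is about {equivalent_mm:.0f}mm full-frame equivalent")
```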

The rear of the rig – cameras covered with a cloth – showing stereo mics (credit: John Siracusa, via Six Colors)

As soon as I have the chance to try the immersive video on the Vision Pro myself, I'll share more thoughts on it.

Shared virtual TV experiences

This is just the beginning for video streaming on Apple's headset: Sandwich has also released a separate app, thoughtfully dubbed "Television," which gives Vision Pro users the ability to watch their own videos on a variety of virtual television sets in their real environment. Virtually.

In version 1.0 of the Television app, video files need to be in the Vision Pro's local storage or the user's iCloud Drive, but future versions of the app will enable other video sources (YouTube, Netflix and other streaming platforms) to be enjoyed on virtual TVs of all ages, shapes and sizes.

The app also promises shared watching experiences in future, through Apple's SharePlay framework and Spatial Personas, as highlighted briefly in the Television app's promotional video.

Needless to say, as soon as I get my head in a Vision Pro, I'll be trying these experiences to learn much more about them and share details here.