To enable electronic pan-tilt in HoviTron, the scene is surrounded by multiple fixed cameras, from whose images virtual views can be synthesised and visualised.
The view synthesis process requires the input to be RGBD, i.e. colour images (RGB = Red, Green, Blue) together with their corresponding depth maps (D).
The latter is probably the biggest challenge in the HoviTron project:
- The depth resolution and quality must be high.
- The depth must be acquired and/or estimated in real time, so that the tele-operator can respond directly to changes in the scene.
As a first step, we have used VR synthetic content with perfect depth maps. For instance, we raytraced the Blender classroom test sequence to obtain four RGBD images, from which we synthesized the holographic stereogram shown at the left. It is the perfect counterpart - for static content - of Creal's Light Field HMD.
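The core of depth-based view synthesis can be illustrated with a minimal sketch: each pixel of an RGBD image is back-projected to a 3D point using the camera intrinsics, transformed into the virtual camera's frame, and re-projected, keeping the nearest point per pixel. This is only a toy pinhole-camera example (the matrices `K`, `R`, `t` and the shared-intrinsics assumption are illustrative), not the actual HoviTron synthesis software:

```python
import numpy as np

def synthesize_view(rgb, depth, K, R, t):
    """Warp an RGBD image into a virtual camera by point reprojection.

    rgb   : (H, W, 3) colour image
    depth : (H, W) depth in metres (0 = undefined)
    K     : (3, 3) pinhole intrinsics, assumed shared by both cameras
    R, t  : rotation (3, 3) and translation (3,) of the virtual camera
            relative to the source camera
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]
    valid = depth > 0

    # Back-project each valid pixel to a 3D point in the source frame.
    pts = np.linalg.inv(K) @ np.stack(
        [u[valid] * depth[valid], v[valid] * depth[valid], depth[valid]])

    # Transform into the virtual camera and project back to pixel space.
    proj = K @ (R @ pts + t[:, None])
    z = proj[2]
    px = np.round(proj[0] / z).astype(int)
    py = np.round(proj[1] / z).astype(int)

    out = np.zeros_like(rgb)
    zbuf = np.full((H, W), np.inf)
    inside = (px >= 0) & (px < W) & (py >= 0) & (py < H) & (z > 0)
    src_rgb = rgb[valid]
    for x, y, zz, c in zip(px[inside], py[inside], z[inside],
                           src_rgb[inside]):
        if zz < zbuf[y, x]:  # z-buffering: keep the nearest point
            zbuf[y, x] = zz
            out[y, x] = c
    return out
```

A real synthesiser additionally blends several input views, splats points over more than one pixel, and inpaints disocclusions, which is why imperfect depth maps degrade the result so visibly.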
For real content (where depth images are not necessarily perfect), two multi-camera acquisition campaigns are foreseen in HoviTron: one for static content and another for dynamic content. The rationale is that the acquisition process can be simplified for static content (e.g. no camera synchronisation is needed, since a single camera is moved along a rail), while dynamic content allows us to check the robustness of the technology against temporal artefacts (e.g. flickering from one frame to the next).
Some results can be seen in the video. The single camera operating on the acquisition system, as shown in the video, corresponds to the first campaign (static content).
So far, we have used the MPEG depth estimation software, as well as Intel's RealSense L515 lidar and Microsoft's Azure Kinect depth sensors, to acquire depth images.
To comply with the requirements of view synthesis, depth images must be filtered to better follow object silhouettes, remove undefined depth values (holes), etc. More details can be found in the Processing section.
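A minimal sketch of one of these steps, hole filling, is shown below: undefined (zero) depth values are filled by iteratively averaging their defined 4-neighbours, a simple diffusion-style inpainting. The actual HoviTron filtering is more elaborate (edge-aware, so that filled values respect object silhouettes); this toy version only conveys the idea:

```python
import numpy as np

def fill_depth_holes(depth, max_iters=100):
    """Fill undefined depth values (zeros) by iteratively averaging
    defined 4-neighbours (diffusion-style inpainting sketch)."""
    d = depth.astype(float).copy()
    for _ in range(max_iters):
        holes = d == 0
        if not holes.any():
            break
        # Sum and count of defined axis-aligned neighbours,
        # zero-padded at the image borders.
        padded = np.pad(d, 1)
        nbr_sum = (padded[:-2, 1:-1] + padded[2:, 1:-1]
                   + padded[1:-1, :-2] + padded[1:-1, 2:])
        nbr_cnt = ((padded[:-2, 1:-1] > 0).astype(int)
                   + (padded[2:, 1:-1] > 0)
                   + (padded[1:-1, :-2] > 0)
                   + (padded[1:-1, 2:] > 0))
        fillable = holes & (nbr_cnt > 0)
        d[fillable] = nbr_sum[fillable] / nbr_cnt[fillable]
    return d
```

In practice such averaging blurs depth across object boundaries, which is precisely why silhouette-aware filtering is needed for view synthesis.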
Besides these camera options, we are also studying alternatives such as the Raytrix and PhotonicSens cameras, which both deliver depth images perfectly aligned with the RGB colour images and offer an Extended Depth of Field (EDoF) within the working region defined in HoviTron (30 cm to 1 m, i.e. within robot-arm reach).