Zoltán Megyesi, Attila Barsi, Tibor Balogh

Baross str. 3. Budapest
H-1192, Hungary
+36 1 282 4921

ABSTRACT

The visualization of 3D content is an increasingly important issue as computer graphics applications, computer vision methods and display devices evolve rapidly. As a part of this 3D evolution, fields related to 3D video have also advanced, but displaying high quality, large Field of View (FOV) 3D videos in real time still poses a considerable challenge. The HoloVizio™ system is a high quality continuous view 3D display that is capable of displaying large FOV 3D data. In this paper we present the 3D video playback capabilities of the HoloVizio™ system, in connection with recent applications. We focus on practical 3D representation formats, and their usability in the 3D acquisition-transfer-display chain.

Index Terms— 3D display, 3D video, autostereoscopic, multiview, free viewpoint video

1. INTRODUCTION

Realistic visualization is a goal that has been pursued for decades, inspiring development in several research fields, including computer graphics, computer vision, and acquisition and display technologies. Displaying 3D is a major step towards that goal, as it fully exploits the capabilities of the human vision system, giving a more realistic experience than conventional 2D displaying. This is especially true for displaying 3D videos, which is often considered the pinnacle of non-interactive visualization.

To address 3D video visualization, the principles of 3D displaying must be understood, and the 3D information must be identified. Alternative applications of 3D formats exist. Considerable research effort is spent on Free Viewpoint Video [2], which is a related application also making use of 3D video formats. Since 3D displaying is only one end of the fields related to 3D video, acquisition, storage and transfer problems must also be taken into consideration when selecting a solution.

In the following sections the principles of 3D displaying are discussed, followed by a discussion of the 3D image and the widespread 3D representations. Finally, solutions are summarized for fast playback of selected representations on the HoloVizio™ system.

2. 3D DISPLAY HISTORY

The evolution of 3D related technologies has a long history, in which 3D displaying advanced significantly only in recent years. 3D display systems exploit the binocular parallax to provide 3D information to the human vision system. In the past this has been achieved by providing separate (stereo) information for the two eyes, either by the use of glasses or head mounted gear.

Autostereoscopic displays in contrast provide 3D perception without the need for obtrusive tools. These systems implement the separation for the left and right eye using various optical or lens raster techniques directly above the screen surface. All of these displays provide 3D views through direction selective light emission, i.e. the visible image on the screen depends on the viewer position.

The most widespread autostereoscopic solutions are stereoscopic displays with tracking [5]. Stereoscopic displays can emit only two distinguishable light beams from each pixel, thus the viewer must be positioned at the sweet spot. Tracking techniques are used to locate the viewer position and adjust the displayed content accordingly.

Multiview systems [4] produce 8-16 views, which is still a strong compromise: the image jumps between views, the Field of View (FOV) is limited, and invalid zones appear.

Holographic systems generate holographic patterns to reconstruct the wavefront. These systems suffer from fundamental limitations on realistically achievable image sizes and viewing angles.
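The view counts quoted above can be turned into a rough angular granularity. The following is a minimal sketch, assuming a hypothetical 30 degree viewing zone shared evenly among the views; both the zone width and the helper name `angular_step_deg` are illustrative assumptions and do not describe any specific display.

```python
def angular_step_deg(fov_deg: float, n_views: int) -> float:
    """Angle covered by one view when n_views evenly share fov_deg."""
    if n_views < 1:
        raise ValueError("n_views must be at least 1")
    return fov_deg / n_views


# Assumption: a 30 degree viewing zone (not a figure from the paper).
ASSUMED_FOV_DEG = 30.0

# View counts follow the text: 2 (stereoscopic), 8 and 16 (multiview).
steps = {n: angular_step_deg(ASSUMED_FOV_DEG, n) for n in (2, 8, 16)}

for n, step in steps.items():
    print(f"{n:2d} views -> {step:.3f} degrees per view")
```

With so few views, each one spans several degrees under this assumption, which is consistent with the visible jumps described for multiview systems. A continuous view display needs a much finer angular step.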

The patented HoloVizio™ technology [1][3] uses a specially arranged array of optical modules and a holographic screen. Each point of the holographic screen emits light beams of different color and intensity in various directions. The light beams generated in the optical modules hit the screen points at various angles, and the holographic screen makes the necessary optical transformation to compose these beams into a perfectly continuous 3D view (see Figure 1). With proper software control, light beams leaving the pixels propagate in multiple directions, as if they were emitted from the points of 3D objects at fixed spatial locations. Direction selective light emission is a general requirement for every 3D system. It provides quantitative data on the FOV and on the angular resolution, which determines the field of depth of the display and affects the total number of light beams necessary for high-end 3D displaying.

Figure 1: HoloVizio™ principle

The HoloVizio™ technology can be used to build both small-scale and large-scale display systems. Different prototypes have been developed in European projects; most recently a medium sized desktop system and a wall projected cinema system are being developed in the FP6 OSIRIS project.

4. 3D IMAGE COMPOSITION - LIGHT FIELD RECONSTRUCTION

There are multiple ways to define a 3D image. From the point of view of the human vision system, a static 3D image can be considered as the collection of light rays entering the two eyes (which the brain interprets exploiting the binocular parallax). Similarly, from the displaying point of view a 3D image can be considered as the light rays leaving the 3D scene (see Figure 2). We will refer to this representation of 3D as the Light Field representation of the scene (after Levoy et al. [6]).

Figure 2: Light Field reconstruction

A major question when addressing 3D visualization is how to represent 3D information. When answering this question it must also be considered how 3D content can be created or acquired; how this information is to be transferred and stored; and finally, how to use this information for fast playback.

There are several ways to represent 3D. It is convenient to classify representations on the basis of how much image information and how much geometry information is stored within the data (as advised by Smolic [2]). Image information based representations define the color of specific light rays connected to the capturing cameras, while geometry information based representations describe the objects of the scene using some modeling language. If a 3D image is considered as the collection of rays emitted from the scene, both extremes of these representations must be converted to light rays. The properties of some popular representations are discussed below.

Polygon models and Surface models
These formats are the native representation of many 3D computer applications, ranging from 3D games to CAD software and 3D graphical and animation studios. Although many formats fall into this category, they have in common that they represent 3D by defining the geometry of objects or object surfaces. Image information appears implicitly through light source models, materials and textures. Although these models hold the most consistent 3D information, detailed models are required to create realistic looking scenes, hence these formats can be large. 3D videos are created by animating scene objects. In practice this is done through many hours of graphical work or by using a motion capture system.

Volumetric data
This representation of 3D is classically used by medical applications, but a large array of 3D measurement devices also uses volumetric representation to store raw measurements. This representation classifies units of 3D
space based on the properties of the object occupying that space. Since this representation assigns color information to the geometry defined by voxels, these formats can be used similarly to the model based formats. Depending on the space partitioning, this format tends to be large, and too inflexible to be considered for 3D video applications.

Multiple images
This representation is purely image based, and can be considered as a collection of light rays entering the cameras. However, if we want the set of rays leaving the scene, matching the two sets of rays can be troublesome, and may require computer vision algorithms. (See the interpretation of these formats in the next section.) On the other hand, collecting image information is much simpler than creating realistic 3D models, due to the availability of high quality digital cameras. The size of data represented in this format depends on the number of images, but since image information is highly redundant, the application of compression techniques is beneficial.

Multiple images combined with depth data
Augmenting image information with geometry information can be the optimal solution for 3D video systems. If we store depth information together with the set of rays (see Figure 3), the missing light rays can be calculated with simple algorithms, while quality degradation is kept minimal. This can mean that the number of image streams can be reduced. A further benefit of this representation is the compressibility inherited from the multiple image based formats.
The price that must be paid is on the acquisition side. The depth information must be calculated either from the images themselves using computer vision algorithms or using special hardware. The consistency of the 3D image relies on the depth map quality.

6. 3D VIDEO PLAYBACK

Given the concepts of 3D image composition on large FOV Light Field reconstruction systems, it is plain that the number of calculated light rays is a multiple of that of 2D systems. Also, the viewpoint directions of the calculated light rays must lie on a very fine grid, to address all necessary light rays. Extraction of this information from 3D video formats requires significant graphical and in some cases general processing power. This processing power is provided in the HoloVizio™ systems by a computation cluster, but the major properties (GPU, CPU, network, storage) of the cluster must depend on the representation. The considered representation cases are described below.

Polygon animations
Displaying polygon animations requires fast rendering of the frames of the 3D scene from multiple viewpoints. In general, ray-tracing would offer a solution to create the Light Field. However, since the computation requirements of the algorithms involved are high, this could only yield an off-line solution. In graphics applications however, to achieve good image quality with adequate speed, polygon based content viewers exploit the capabilities of the graphics hardware through the tools provided by graphics libraries (OpenGL, DirectX). Consequently, real-time displaying of 3D model based content may be possible through the replication of the rendering process utilized by these viewers.
The HoloVizio™ software pack provides an OpenGL Wrapper [3] for viewers using this graphics library, which intercepts function calls of the viewer, and attempts to replicate the rendering process from multiple viewpoints. To achieve real-time displaying of such content, a strong graphics render cluster is required. The HoloVizio™ software pack also provides a Modelviewer application that is capable of displaying widespread model based formats.
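Rendering the same scene from multiple viewpoints can be illustrated with a small sketch. The code below is not the HoloVizio™ OpenGL Wrapper; it only shows, under assumed parameters, how a set of view matrices spread over a horizontal FOV might be generated so that a captured draw sequence could be replayed once per viewpoint. The arc layout, the function names and the numeric values are all hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(v):
    length = math.sqrt(_dot(v, v))
    return tuple(x / length for x in v)


def viewpoints_on_arc(n_views, fov_deg, radius):
    """Eye positions evenly spread over fov_deg on a horizontal arc.

    Assumption: the screen centre sits at the origin and every
    hypothetical camera looks at it from the same distance.
    """
    if n_views < 2:
        raise ValueError("need at least two viewpoints")
    half = math.radians(fov_deg) / 2.0
    eyes = []
    for i in range(n_views):
        angle = -half + i * (2.0 * half) / (n_views - 1)
        eyes.append((radius * math.sin(angle), 0.0, radius * math.cos(angle)))
    return eyes


def look_at(eye, target=(0.0, 0.0, 0.0), up=(0.0, 1.0, 0.0)):
    """4x4 view matrix as a list of rows, using the gluLookAt construction."""
    f = _normalize(_sub(target, eye))   # forward direction
    s = _normalize(_cross(f, up))       # camera right
    u = _cross(s, f)                    # corrected up
    return [
        [s[0], s[1], s[2], -_dot(s, eye)],
        [u[0], u[1], u[2], -_dot(u, eye)],
        [-f[0], -f[1], -f[2], _dot(f, eye)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def view_matrices(n_views, fov_deg, radius):
    """One view matrix per viewpoint; a renderer would draw the scene with each."""
    return [look_at(eye) for eye in viewpoints_on_arc(n_views, fov_deg, radius)]
```

A caller would loop over `view_matrices(...)` and issue the same draw calls with each matrix. In a cluster setting the list could be partitioned among render nodes, although how the actual system distributes the work is not specified here.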

Figure 3: N-View + N-Depth-map representation of the Breakdancers [10] dataset

Multiple image based representations
Depending on the number of image streams, multiple image based representations can provide enough light rays for highly realistic 3D image composition. If the number of streams is high (70-120) and well distributed over the displayed FOV, the conversion to Light Field is a simple interpolation task. However, this data format is large and the conversion may be slow, simply because a huge data set must be handled. Compression techniques are the key to this solution. A preprocessed image based data set containing 50 million light rays, compressed by standard 2D video compression (XviD), has been shown to play at 24 frames per second (see Figure 4).
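When the streams are dense, a missing ray can be approximated from its two angular neighbours. The sketch below assumes cameras placed at known, strictly increasing horizontal angles and blends the colors of the two nearest views linearly. The function names are hypothetical, and positional resampling within each image is deliberately left out.

```python
from bisect import bisect_right


def interpolation_weights(camera_angles, ray_angle):
    """Find the two cameras bracketing ray_angle and their blend weights.

    camera_angles must be strictly increasing (an assumption of this
    sketch). Returns ((i, w_i), (j, w_j)) with w_i + w_j == 1.
    """
    if len(camera_angles) < 2:
        raise ValueError("need at least two cameras")
    if not camera_angles[0] <= ray_angle <= camera_angles[-1]:
        raise ValueError("ray_angle lies outside the captured FOV")
    # Index of the first camera strictly to the right of the ray,
    # clamped so that the last camera can still act as the right neighbour.
    j = min(bisect_right(camera_angles, ray_angle), len(camera_angles) - 1)
    i = j - 1
    t = (ray_angle - camera_angles[i]) / (camera_angles[j] - camera_angles[i])
    return (i, 1.0 - t), (j, t)


def blend(color_a, color_b, w_a, w_b):
    """Linear blend of two color tuples, channel by channel."""
    return tuple(w_a * a + w_b * b for a, b in zip(color_a, color_b))


# Hypothetical example: three cameras at -10, 0 and 10 degrees.
(left, w_left), (right, w_right) = interpolation_weights([-10.0, 0.0, 10.0], 2.5)
mixed = blend((100.0, 0.0, 0.0), (0.0, 100.0, 0.0), w_left, w_right)
```

The closer the requested direction is to a captured view, the more that view dominates the result. With 70-120 well distributed streams the neighbours are close together, which is why simple interpolation of this kind can be sufficient.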
Multiple image based representations using a lower number (7-11) of
images do not contain enough light rays; instead
they carry implicit 3D information. CPU intensive
reconstruction [9] and image based rendering algorithms [8]
are required to calculate the missing viewpoints. Free
Viewpoint Video techniques [7] can yield fast solutions, but since in this case multiple viewpoints must be calculated simultaneously, real-time playback on a reasonable cluster size is not yet feasible. Fast graphics based solutions are currently under research by Holografika.

Combined multiple image plus depth based representations
This 3D representation carries both color and geometry information of the scene, and can be converted into Light Field using a graphics processor intensive method. This method is based on standard rendering techniques, but integrates all information provided by the streams. A fast GPU implementation has been developed by Holografika, taking advantage of the graphics processing power of a render cluster. The current implementation has reached 4 frames per second for the 8 stream Breakdancers [10] dataset on a 16 node cluster with NV6600 GPUs. Using NV8800 graphics cards we improved the rendering to 30 FPS. The small size of this representation makes it ideal for 3D data transmission.

7. APPLICATIONS

The applications for 3D video playback are emerging together with the availability of 3D systems. 3D video playback in connection with the HoloVizio™ technology is being applied within the European Project OSIRIS, to create a 3D desktop and a 3D Cinema system capable of visualizing 3D videos. Within this project a 3D acquisition system is also being built to provide natural 3D content for playback. Both display systems feature high quality continuous view 3D visualization. Further applications of the developed technologies are foreseen.

Figure 4: Natural 3D content displayed on HV640RC

8. CONCLUSIONS

In this paper the HoloVizio™ system is shown to be a device capable of 3D video visualization. According to our experiments, several 3D representations are appropriate for visualization on high FOV 3D displays. It has been shown that the HoloVizio™ system is capable of displaying 3D video formats in different representations. Based on the results, the most appropriate representation is the combined image plus depth based representation, but other representations must not be neglected for either quality or accessibility reasons. The discussed methods are useful for many applications including EU projects.

9. ACKNOWLEDGMENTS

This work has been supported by EU IST-FP6 Integrated Project OSIRIS (IST-33799 IP).

10. REFERENCES

[1] T. Balogh, T. Forgacs, T. Agocs, E. Bouvier, F. Bettio, E. Gobbetti and G. Zanetti: A Large Scale Interactive Holographic Display. IEEE VR2006 Conference, March 25-29, Alexandria, Virginia, USA.
[2] A. Smolic, K. Mueller, P. Merkle, T. Rein, M. Kautzner, P. Eisert, T. Wiegand: "Free viewpoint video extraction, representation, coding, and rendering", International Conference on Image Processing, Volume 5, 24-27 Oct. 2004.
[3] T. Balogh, T. Forgacs, T. Agocs, O. Balet, E. Bouvier, F. Bettio, E. Gobbetti and G. Zanetti: A Scalable Hardware and Software System for the Holographic Display of Interactive Graphic Applications. EuroGraphics 2005, Dublin.
[4] Cees van Berkel, David W. Parker and Anthony R. Franklin: Multiview 3D-LCD. In Stereoscopic Displays and Virtual Reality Systems III (1996), Vol. 2653 of SPIE proceedings.
[5] G. J. Woodgate, D. Ezra, J. Harrold, Nicolas S. Holliman, G. R. Jones, R. R. Moseley: Autostereoscopic 3D display systems with observer tracking. In Image Communication, Special Issue on 3D Video Technology (EURASIP, 1998), pp. 131
[6] M. Levoy and P. Hanrahan: "Light Field Rendering", Proc. ACM SIGGRAPH, pp. 31-42, August 1996.
[7] C.L. Zitnick, S.B. Kang, M. Uyttendaele, S. Winder, and R. Szeliski: "High-quality video view interpolation using a layered representation", ACM SIGGRAPH and ACM Trans. on Graphics, Los Angeles, CA, Aug. 2004, pp. 600-608.
[8] A.W. Fitzgibbon, Y. Wexler, A. Zisserman: "Image-based rendering using image-based priors", Proc. International Conference on Computer Vision, October 2003.
[9] Z. Megyesi, G. Kós and D. Chetverikov: "Dense 3D Reconstruction from Images by Normal Aided Matching", Machine Graphics & Vision, vol. 15, pp. 3-28, 2006.