
Augmented reality interaction with

collision detection and rigid body dynamics


William di Domenico
Aldo von Wangenheim
National Institute for Digital Convergence
Federal University of Santa Catarina
Florianópolis, Brazil
E-mail: {will,awangenh}@inf.ufsc.br
Tiago Holanda de Cunha Nobrega
University of Itajaí Valley
Florianópolis, Brazil
E-mail: tigarmo@univali.br
Abstract—Augmented reality (AR) brings interaction into physical environments; however, this impression is compromised when virtual objects do not behave believably. This paper addresses the issue by giving virtual objects collision detection and rigid body dynamics via a physics engine, and by improving the objects' visual conformity with the video frames using shadow casting, shading and materials, all provided by a rendering engine. The engine also streamlines development with its scene graph and object orientation. Furthermore, inter-library I/O required coordinate transformations, and a wrapper framework was designed to handle them.
Keywords—augmented reality; computer vision; human-computer interaction; rigid body dynamics; rendering engine; physics engine
I. INTRODUCTION
Augmented reality is a computer-vision-based human-computer interface that works much like a hologram: a virtual object is displayed in the real world and can be seen from any point of view merely by moving around it.
The difference is that while holograms are projected directly into the real world, AR uses widespread electronic screens to display live video mixed with computer graphics.
Still, passive observation is only the tip of the iceberg. Interaction is where AR really shines, being all the more intuitive and natural because it happens in real environments [1].
However, when virtual objects are not credible, their illusion of reality is compromised [2]. This work demonstrates ways of addressing the issue by improving both the appearance and the behaviour [3] of virtual objects, as listed in Section III. To this end, physics and rendering engine features were used, combined with an augmented reality library in a minimal game engine.
Game engines are a collection of modules that do not directly specify the game's logic or environment. They include modules handling input, 3D rendering and generic physics for game worlds [4], and let developers focus on game logic and level design.
Although there are published papers describing the use of AR libraries with physics and rendering engines [2], [3], [5], [6], no surveyed paper combined the ones used here. These libraries were chosen for their open-source and up-to-date code, extensive and well-written documentation, multi-platform support and engaged communities; they are presented in Section II.
Moreover, no surveyed paper explained their integration in depth, which is thoroughly demonstrated in Section IV.
II. FRAMEWORK OVERVIEW
It is common practice to center custom game engines around the rendering engine, using other libraries in a plugin fashion [7]. Rendering engines have useful methods to transform coordinates and data structures to store them, making them suitable to mediate communication between libraries.
This framework design followed that concept: it gives direct access to the rendering engine while providing wrapping interfaces to the AR library and physics engine (Fig. 1) that abstract away technical details whenever flexibility is not affected.
The framework pipeline defines a fixed loop that must be followed (Fig. 3). First the marker is detected, then the virtual objects' motion is simulated and finally the result is drawn on screen.
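A minimal sketch of this loop follows; ARInterface and PhysicsInterface are illustrative stand-ins for the framework's wrapping interfaces, not its actual API, and only the OGRE3D calls are real.

    #include <OGRE/Ogre.h>

    // Illustrative stand-ins for the framework's wrapping interfaces;
    // the real interfaces expose far more than one method each.
    struct ARInterface      { void detectMarkers()    { /* run marker detection */ } };
    struct PhysicsInterface { void step(float /*dt*/) { /* advance the simulation */ } };

    // The fixed pipeline loop: detection, motion simulation, rendering.
    void runPipeline(Ogre::Root* root, ARInterface& ar, PhysicsInterface& physics) {
        Ogre::Timer timer;
        bool running = true;
        while (running) {
            const float dt = timer.getMilliseconds() / 1000.0f;
            timer.reset();
            ar.detectMarkers();               // 1. detect the marker in the frame
            physics.step(dt);                 // 2. simulate virtual objects' motion
            running = root->renderOneFrame(); // 3. draw the composed result
        }
    }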
Fig. 1. Framework overview: the game code accesses the rendering engine directly and reaches the CV library, AR library and physics engine through the framework's AR and physics interfaces, all atop the operating system
A. AR & CV Libraries
AR libraries provide the position and orientation of particular objects detected in videos. This information is called the pose and is used to establish a coordinate system in which virtual objects can be positioned.
One such library is ArUco, built upon OpenCV and surfacing its methods and data structures whenever possible. OpenCV is the de facto standard computer vision library [8] and, as such, is well documented and has a large community. These qualities make understanding and modifying ArUco's code easier, which is useful for research. On these grounds, ArUco was picked as the engine's AR library.
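A minimal detection sketch using the standalone ArUco API is shown below; the calibration file name and the 5 cm marker size are illustrative, and exact signatures may vary between ArUco versions.

    #include <aruco/aruco.h>
    #include <opencv2/opencv.hpp>
    #include <iostream>

    int main() {
        cv::VideoCapture capture(0);              // live video source
        aruco::CameraParameters camParams;
        camParams.readFromXMLFile("camera.yml");  // calibration file (illustrative name)
        aruco::MarkerDetector detector;
        std::vector<aruco::Marker> markers;
        cv::Mat frame;
        while (capture.read(frame)) {
            // Detect markers and estimate their pose (0.05 m marker side assumed)
            detector.detect(frame, markers, camParams, 0.05f);
            for (const auto& m : markers)
                std::cout << "marker " << m.id << " translation: " << m.Tvec << "\n";
        }
        return 0;
    }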
Detection is implemented in AR libraries using the follow-
ing computer vision techniques:
1) Pose Estimation: The algorithm that estimates the object's pose, hence its name. ArUco uses the POSIT pose estimation algorithm [9] by default. Based on the pinhole camera model, it requires four co-planar points as input.
2) Tracking: To track these points in the image, easily thresholded objects called fiducial markers are used. By making these markers flat and rectangular, four co-planar points can be gathered from their corners using adaptive thresholding. This kind of marker is used in many high-profile AR implementations such as Nintendo's 3DS and Sony's PS Vita.
Marker boards (Fig. 2) were used to enhance tracking. They are marker grids that make pose estimation possible even if some of the markers are undetected. As long as one is recognised, the camera pose can be established.
Fig. 2. Fiducial marker board
Fig. 3. Pipeline loop: detection, motion simulation, rendering
B. Rendering Engine
The rendering engine incorporates all the complicated code needed to efficiently identify and render the player's view from a complex 3D model of the environment [2].
The OGRE3D rendering engine was used: a proven, stable engine employed in several commercial products [7]. It is independent of the underlying 3D implementation, supporting Direct3D, OpenGL and OpenGL ES.
OpenGL ES makes OGRE3D usable on the two major mobile operating systems, iOS and Android. Besides giving users more freedom to move around, the mobile form factor is better suited to the magic-lens approach to augmented reality than desktops and laptops [10], [11].
C. Physics Engine
A physics engine is a library that simulates physical phenomena found in nature. It can predict various kinds of motion under different conditions, approximating what would happen in real life [3].
Two engines were integrated: Plank [12], an in-house physics engine with rigid body and cloth simulation, and Bullet [13], a professional-grade library with collision detection, rigid body and soft body dynamics.
III. RESULTS
A. Collision Detection & Rigid Body Dynamics
Collision detection ensures objects do not interpenetrate when they touch, and rigid body dynamics provides realistic reactions to such touches, giving virtual objects the feel of being solid things with mass, inertia, bounce and buoyancy [14], under the restriction that objects never change shape, hence rigid bodies.
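As a concrete illustration, the sketch below sets up a Bullet world and drops a box with mass and shape-derived inertia; sizes and masses are arbitrary, and the framework's own setup differs in detail.

    #include <btBulletDynamicsCommon.h>

    int main() {
        // Standard Bullet world setup: broadphase, collision dispatch, solver.
        btDbvtBroadphase broadphase;
        btDefaultCollisionConfiguration collisionConfig;
        btCollisionDispatcher dispatcher(&collisionConfig);
        btSequentialImpulseConstraintSolver solver;
        btDiscreteDynamicsWorld world(&dispatcher, &broadphase, &solver, &collisionConfig);
        world.setGravity(btVector3(0, -9.81f, 0));

        // A unit box with mass; inertia derived from the shape makes it tumble
        // realistically instead of merely translating.
        btBoxShape box(btVector3(0.5f, 0.5f, 0.5f));
        btScalar mass = 1.0f;
        btVector3 inertia(0, 0, 0);
        box.calculateLocalInertia(mass, inertia);
        btDefaultMotionState motion(
            btTransform(btQuaternion::getIdentity(), btVector3(0, 5, 0)));
        btRigidBody body(
            btRigidBody::btRigidBodyConstructionInfo(mass, &motion, &box, inertia));
        world.addRigidBody(&body);

        // Step the simulation; collision detection keeps bodies from interpenetrating.
        for (int i = 0; i < 60; ++i)
            world.stepSimulation(1.0f / 60.0f, 10);

        world.removeRigidBody(&body);
        return 0;
    }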
B. Shadow Casting
Shadows are an important part of rendering a believable
scene. They provide the objects a tangible impression and aid
viewers understanding of their spatial relationship [7].
Fig. 4. Comparison of a real toy robot with the shadowless and shadowed virtual ogre model, respectively
This visual cue is even more significant in AR. As interaction takes place in physical environments, accurately determining the positions of shadowed virtual objects becomes easier [2].
Computer-generated shadows are roughly classified along two axes: how their shape is generated (stencils or textures) and how they are rendered (modulatively or with additive light masking) [8]. Stencil modulative shadows are the only kind that works correctly in AR with the OGRE3D engine.
To cast the shadows, a virtual representation of the marker, a plane with a specific material object, is necessary. The material object controls how objects in the scene are rendered. It specifies an object's surface properties, such as reflectance of colours, shininess, the number of texture layers presented, the images on them and how they are blended together [7]. The plane's material must be transparent while preserving the shadow casting properties.
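The corresponding OGRE3D setup is sketched below; "MarkerPlaneMaterial" is an illustrative name for a material script defined elsewhere to be transparent while still receiving shadows.

    #include <OGRE/Ogre.h>

    void setupShadows(Ogre::SceneManager* sceneMgr) {
        // Stencil modulative shadows: the only combination found to work in AR.
        sceneMgr->setShadowTechnique(Ogre::SHADOWTYPE_STENCIL_MODULATIVE);

        // A plane standing in for the physical marker, so shadows land on it.
        Ogre::Plane plane(Ogre::Vector3::UNIT_Z, 0);  // marker lies in the XY plane
        Ogre::MeshManager::getSingleton().createPlane(
            "MarkerPlane", Ogre::ResourceGroupManager::DEFAULT_RESOURCE_GROUP_NAME,
            plane, 1.0, 1.0);
        Ogre::Entity* ground = sceneMgr->createEntity("MarkerPlaneEntity", "MarkerPlane");
        ground->setMaterialName("MarkerPlaneMaterial"); // transparent, shadow-receiving
        ground->setCastShadows(false);                  // the plane only receives shadows
        sceneMgr->getRootSceneNode()->createChildSceneNode()->attachObject(ground);
    }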
C. Improved Tracking
Frame undistortion is unanimous across literature [15] [16].
Surprisingly, there was no observable tracking enhancement
using it.
There is a trade-off between undistortion and frame resolu-
tion, as frame-rate drops impractically low when undistorting
high-resolution frames.
As increased resolution brought noticeable enhancements
in detection, aiming the highest possible resolution, undistor-
tion was dropped altogether,.
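For reference, undistortion amounts to one OpenCV call per frame, as sketched below; its per-pixel remapping cost is what made high-resolution frames impractical.

    #include <opencv2/opencv.hpp>

    // Per-frame undistortion as tried and ultimately dropped: cameraMatrix and
    // distCoeffs come from calibration; the remapping cost grows with resolution.
    cv::Mat undistortFrame(const cv::Mat& frame,
                           const cv::Mat& cameraMatrix,
                           const cv::Mat& distCoeffs) {
        cv::Mat undistorted;
        cv::undistort(frame, undistorted, cameraMatrix, distCoeffs);
        return undistorted;
    }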
IV. INTEGRATION
A. Rendering Engine
1) Camera Pose: The AR library's marker pose output is loaded untouched as input to the rendering engine. The scene graph then takes care of converting it to a convenient coordinate system.
As the AR library output is not converted, the rendering engine must use the AR library's coordinate system, the camera coordinate system, to correctly display objects. For that, the rendering engine's camera object is placed at the origin with its orientation aligned to the axes as in Fig. 5.
Fig. 5. Marker and camera coordinate systems relationship
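A sketch of this hand-off follows: the OGRE3D camera stays at the origin while ArUco's rotation vector and translation are loaded into the marker's scene node. The axis alignment of Fig. 5 is assumed to be done when the camera is configured.

    #include <OGRE/Ogre.h>
    #include <opencv2/opencv.hpp>
    #include <aruco/aruco.h>

    void applyMarkerPose(Ogre::Camera* camera, Ogre::SceneNode* markerNode,
                         const aruco::Marker& marker) {
        camera->setPosition(Ogre::Vector3::ZERO);  // camera fixed at the origin

        cv::Mat rot(3, 3, CV_32FC1);
        cv::Rodrigues(marker.Rvec, rot);           // rotation vector -> 3x3 matrix
        Ogre::Matrix3 r(rot.at<float>(0,0), rot.at<float>(0,1), rot.at<float>(0,2),
                        rot.at<float>(1,0), rot.at<float>(1,1), rot.at<float>(1,2),
                        rot.at<float>(2,0), rot.at<float>(2,1), rot.at<float>(2,2));
        markerNode->setOrientation(Ogre::Quaternion(r));
        markerNode->setPosition(marker.Tvec.at<float>(0),
                                marker.Tvec.at<float>(1),
                                marker.Tvec.at<float>(2));
    }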
2) Projection Matrix: The rendering engine's camera object must have the same projection matrix as the real camera, obtained through camera calibration [17].
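One common way to build such a matrix from the calibrated intrinsics (fx, fy, cx, cy, in pixels) is sketched below; sign conventions vary between references, so this is a plausible construction rather than the framework's exact code.

    #include <OGRE/Ogre.h>

    // Build an OpenGL-style projection matrix from camera intrinsics and hand
    // it to OGRE. w and h are the frame dimensions in pixels.
    void setCalibratedProjection(Ogre::Camera* camera,
                                 float fx, float fy, float cx, float cy,
                                 float w, float h, float zNear, float zFar) {
        Ogre::Matrix4 proj = Ogre::Matrix4::ZERO;
        proj[0][0] = 2.0f * fx / w;
        proj[1][1] = 2.0f * fy / h;
        proj[0][2] = 1.0f - 2.0f * cx / w;
        proj[1][2] = 2.0f * cy / h - 1.0f;
        proj[2][2] = -(zFar + zNear) / (zFar - zNear);
        proj[2][3] = -2.0f * zFar * zNear / (zFar - zNear);
        proj[3][2] = -1.0f;
        camera->setCustomProjectionMatrix(true, proj);
    }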
B. Physics Engine
1) Change of Basis: All objects simulated by the physics engine must undergo a change of basis from the camera coordinate system to the marker coordinate system.
This is achieved by using a node in the rendering engine's scene graph to represent the marker and calling the node method that transforms coordinates from the world to the local coordinate system, in this case the marker coordinate system.
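With OGRE3D this is essentially a one-liner, sketched below with an assumed markerNode representing the marker in the scene graph.

    #include <OGRE/Ogre.h>

    // Change of basis via the scene graph: the marker's node converts a point
    // from the camera (world) coordinate system into the marker's local one
    // before it is handed to the physics engine.
    Ogre::Vector3 toMarkerSpace(Ogre::SceneNode* markerNode,
                                const Ogre::Vector3& cameraSpacePoint) {
        return markerNode->convertWorldToLocalPosition(cameraSpacePoint);
    }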
2) Scaling: By default, Bullet assumes units to be in meters. Moving objects are assumed to be in the range of 0.05 units, about the size of a pebble, to 10 units, the size of a truck [18].
Even though the AR library uses meters as its measurement unit, if its output is fed directly to the physics engine the simulation does not behave correctly. While not a completely understood phenomenon, one hypothesis is that the simulated objects usually represent larger entities. For instance, an augmented truck would be around a dozen centimeters long, but is supposed to behave like a real truck many times larger.
By scaling the world, dimensions and velocities are brought back within the range Bullet was designed for (0.05 to 10) and the simulation becomes more realistic. Scaling the values from meters to centimeters, that is, multiplying them by 100, gives the objects' motion simulation good behaviour.
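The scaling step is sketched below; the factor of 100 comes from this section, while the helper names are illustrative. Gravity must be scaled by the same factor so accelerations stay consistent.

    #include <btBulletDynamicsCommon.h>

    const btScalar kMetersToCm = 100.0;  // meters -> centimeters

    // Scale AR library output before it enters Bullet, keeping moving objects
    // within the 0.05-10 unit range Bullet was designed for.
    btVector3 toPhysicsUnits(const btVector3& meters) {
        return meters * kMetersToCm;
    }

    void configureWorld(btDiscreteDynamicsWorld* world) {
        world->setGravity(btVector3(0, -9.81f * kMetersToCm, 0));  // 981 cm/s^2
    }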
V. CONCLUSION
Behaviour incompatibility between the physical environment and virtual objects was alleviated by supplementing the rendering engine's animation system with collision detection and rigid body dynamics. When a ball is thrown, it bounces, spins and rolls much as a real one would. Since the objects behave more as expected, interaction becomes more intuitive.
The rendering engine's resources also boosted the virtual objects' credibility. Shadows made them look more realistic and in conformity with the environment, and gave better cues to their spatial location.
The framework simplifies AR game development and improves AR interaction, its original purpose. Developers may use it as groundwork to tailor personal game engines, freely incorporating their preferred libraries to address needs such as audio or networking.
REFERENCES
[1] R. T. Azuma et al., "A survey of augmented reality," Presence, vol. 6, no. 4, pp. 355–385, 1997.
[2] D. Beaney and B. MacNamee, "Forked! A demonstration of physics realism in augmented reality," in Mixed and Augmented Reality, 2009. ISMAR 2009. 8th IEEE International Symposium on. IEEE, 2009, pp. 171–172.
[3] C. Chae and K. Ko, "Introduction of physics simulation in augmented reality," in Ubiquitous Virtual Reality, 2008. ISUVR 2008. International Symposium on. IEEE, 2008, pp. 37–40.
[4] M. Lewis and J. Jacobson, "Game engines," Communications of the ACM, vol. 45, no. 1, p. 27, 2002.
[5] P. Buchanan, H. Seichter, M. Billinghurst, and R. Grasset, "Augmented reality and rigid body simulation for edutainment: The interesting mechanism, an AR puzzle to teach Newton physics," in Proceedings of the 2008 International Conference on Advances in Computer Entertainment Technology. ACM, 2008, pp. 17–20.
[6] D.-M. Liu, C.-H. Yung, and C.-H. Chung, "A physics-based augmented reality Jenga stacking game," in Digital Media and Digital Content Management (DMDCM), 2011 Workshop on. IEEE, 2011, pp. 1–8.
[7] G. Junker, Pro OGRE 3D Programming. Apress, 2006.
[8] Y.-T. Tsai, O. Gallo, D. Pajak, and K. Pulli, "Mobile visual computing in C++ on Android," in ACM SIGGRAPH 2013 Studio Talks. ACM, 2013, p. 23.
[9] D. Oberkampf, D. F. DeMenthon, and L. S. Davis, "Iterative pose estimation using coplanar points," in Computer Vision and Pattern Recognition, 1993. Proceedings CVPR '93., 1993 IEEE Computer Society Conference on. IEEE, 1993, pp. 626–627.
[10] E. A. Bier, M. C. Stone, K. Pier, W. Buxton, and T. D. DeRose, "Toolglass and magic lenses: The see-through interface," in Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques. ACM, 1993, pp. 73–80.
[11] J. Viega, M. J. Conway, G. Williams, and R. Pausch, "3D magic lenses," in Proceedings of the 9th Annual ACM Symposium on User Interface Software and Technology. ACM, 1996, pp. 51–58.
[12] M. S. Souza, T. de H. C. Nobrega, A. F. B. Silva, D. D. Carvalho, and A. von Wangenheim, "A rigid body physics engine for interactive applications," in Proceedings of the XII Symposium of the Special Commission of Games and Digital Entertainment of the Brazilian Computing Society. SBC, 2011.
[13] E. Coumans et al., "Bullet physics library," Open source: bulletphysics.org, 2006.
[14] I. Millington, Game Physics Engine Development. Taylor & Francis US, 2007.
[15] D. Wagner and D. Schmalstieg, "ARToolKitPlus for pose tracking on mobile devices," in Proceedings of the 12th Computer Vision Winter Workshop (CVWW'07), 2007, pp. 139–146.
[16] R. Munoz-Salinas, "ArUco: A minimal library for augmented reality applications based on OpenCV," 2012.
[17] G. Bradski and A. Kaehler, Learning OpenCV: Computer Vision with the OpenCV Library. O'Reilly, 2008.
[18] (2008) Scaling the world. [Online]. Available: http://www.bulletphysics.org/mediawiki-1.5.8/index.php?title=Scaling_The_World&oldid=3500
