18 May 2016
Beginning with the eight-bit beeps and bloops of Atari's Pong and reaching all the
way to the modern simulations found in virtual reality, sound has always played an
integral role in bringing virtual worlds to life. Sound and music mold a game's aesthetic,
relay and reinforce information about a game's mechanics, and create an immersive
experience that makes players want to stay in a game's world. A game's audio can be as
simple as the sound of picking up a coin, storming a castle, or getting eaten by a ghost in a maze. It
can also be the premise around which a game is based, as evidenced by the obsession
with rock band simulators and guitar-shaped controllers in the late 2000s. As the
technology for sound in games has expanded, so has the variety of roles that it plays in virtual
worlds.
Historically, the technology that handles sound for videogames has evolved
alongside the games themselves, though not always at the same rate. Sound design in
early games received special attention because it was a relatively cost-effective method of
enriching the player's experience. More recently, however, sound
technology has taken a backseat to enhanced graphical effects in games. This is in part
due to a pressure to deliver a more realistic experience in big-budget titles, which
tend to drive the majority of technological advances in the industry. Game developers
Wade, 2
have felt limited in the amount of realism they can achieve with sound, because they are
constrained by the players' hardware: even if a game could render a
realistic soundscape with seven surround-sound speakers, most players are only using a
set of headphones or their TV's built-in audio system. On the other hand, high-definition
televisions and computer monitors are quickly becoming the norm, so improved graphics
reach nearly every player and reward investment in graphical
simulation. The rise of commercial virtual reality systems like PlayStation VR, Vive, and
Oculus Rift presents an entirely new direction for game development, which brings with it
its own set of challenges. Chief among these is the challenge of audial immersion.
Maintaining the illusion of a virtual world requires not only a 3-D, motion-tracking visual
display, but also the ability to produce sound which gives the player an accurate sense of
where in space these sounds originate. Doing so is a task which lies at a disciplinary
crossroads, where physics, computer science, psychology, and music all have a role to
play. To better understand the motivation behind such a challenging endeavor, let's first
look at how sound has historically been used in games.
As early as when the first home game consoles were being produced, simple
musical techniques were being utilized to convey information to players. Take, for
instance, the use of pentatonic scales. A pentatonic scale is a scale constructed using only
five notes per octave, chosen so that any combination of them sounds consonant. While only having five notes might limit the
variety of music that can be composed, the fact that these notes are almost guaranteed to
sound pleasing together makes it easy to compose sound clips which sound positive.
Early games in which sound was too primitive to play dialogue telling the player "good
job" or "you win" used these scales to convey a sense of success to players. The classic
eight-bit tunes played when reaching the end of a level in Super Mario Bros., picking up a
ring in Sonic the Hedgehog, or opening a treasure chest in The Legend of Zelda all use
pentatonic scales. (Amplifion) Without any dialogue or prior exposure to these sounds,
these games instantly communicate to players that they have accomplished something.
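To make the construction concrete, here is a small Python sketch (not taken from any game's actual code) that derives the frequencies of a C major pentatonic scale in standard twelve-tone equal temperament, using the conventional A4 = 440 Hz reference pitch:

```python
# Sketch: frequencies of a C major pentatonic scale in twelve-tone equal
# temperament. The note names and the A4 = 440 Hz reference are standard
# musical conventions; the code itself is purely illustrative.

A4 = 440.0  # reference pitch in Hz

def note_frequency(semitones_from_a4):
    """Frequency of a pitch a given number of semitones away from A4."""
    return A4 * 2 ** (semitones_from_a4 / 12)

# C major pentatonic: C, D, E, G, A (offsets in semitones from A4)
PENTATONIC_OFFSETS = {"C4": -9, "D4": -7, "E4": -5, "G4": -2, "A4": 0}

scale = {name: round(note_frequency(off), 2)
         for name, off in PENTATONIC_OFFSETS.items()}
# e.g. C4 comes out near 261.63 Hz, G4 near 392.00 Hz
```

Because the scale omits the半 intervals that produce dissonance, nearly any short sequence drawn from these five pitches sounds agreeable, which is exactly the property early jingle composers exploited.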
But the use of simple melodies to create associations in a player's mind extends
far beyond the Pavlovian trick of rewarding players with an uplifting tune for success.
Melodies integrated into a game's non-musical soundscape can also interact with the
game world in ways which reinforce emotional connections to locations and characters.
In Mass Effect 3, there is a haunting synthesized melody that plays during key gameplay
moments. More subtly, the melody is also found hidden in the drone of machinery in the game's central
location, where the player returns after missions during which friendly characters often
die. (Hamilton) The repeated use of this melody, both musically and integrated into
environmental sounds, in situations which prompt thought and reflection, creates an
association far more complex than the happy ring of a coin or opened treasure chest. It shows how modern games can use
the variety of sounds available to them to create associations which aren't as naturally
interpreted as the simple jingles used in earlier games, and hide them in places players might not consciously notice.
In other situations, the use of music to set the tone of a game might be more
obvious, and intentionally so. Unlike in films, the mood in a game is not simply a function
of time, but of both time and the player's actions, and the score accompanying the game
must account for this. One of the most common situations in which this occurs is in
games like Metal Gear Solid, where enemies can sneak up on a player from behind or
around corners. (Amplifion) In a film, the director knows the exact time when the
surprise occurs, and the composer can construct the score in a way that builds up to the
surprise. But in games, when the surprise occurs is dependent on a variety of factors, like
when the player reaches the location of the surprise, and how long it takes them to build
up the nerve to turn the corner. This presents a unique challenge for those composing game
music, who must compose tracks designed to transition at a moment's notice, while still
sounding natural. (Clark) Doing so allows for the audio to not only function as a
component of the games mood, but also as a reliable cue for the player to run away or
fight in response to the surprise. The alert mode music becomes instantly recognizable
to the player, and serves as both aesthetic icing and an important game mechanic.
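One common way to implement such transitions, sketched below with entirely hypothetical names (no real engine's API is being quoted), is to queue a requested music change and only apply it at the next bar boundary, so that the switch lands on the beat rather than cutting in mid-phrase:

```python
# Toy model of adaptive music switching. Assumes 4/4 time at 120 BPM,
# so one bar lasts 2.0 seconds. All class and function names here are
# illustrative inventions, not part of any real audio middleware.
import math

BAR_LENGTH = 2.0  # seconds per bar

def next_bar_time(current_time):
    """Time of the next bar boundary at or after current_time."""
    return math.ceil(current_time / BAR_LENGTH) * BAR_LENGTH

class AdaptiveMusic:
    def __init__(self):
        self.state = "explore"
        self.pending = None  # (new_state, time at which to switch)

    def request(self, new_state, current_time):
        # Queue the change for the next bar boundary instead of cutting in.
        if new_state != self.state:
            self.pending = (new_state, next_bar_time(current_time))

    def update(self, current_time):
        # Called every frame; applies a pending switch once its time arrives.
        if self.pending and current_time >= self.pending[1]:
            self.state = self.pending[0]
            self.pending = None
        return self.state

music = AdaptiveMusic()
music.request("alert", 3.1)  # enemy spots the player mid-bar
```

Here a request made at 3.1 seconds takes effect at the 4.0-second bar line, which is the simplest version of the "transition at a moment's notice, but musically" constraint described above.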
Note that thus far, we have not discussed any uses of sound which actually
require accurate sound simulation to achieve their purpose. Pentatonic scales can be
played, sound effects can create subtle connections in the player's mind, and background
music can be changed dynamically all without much thought being given to how the
sound is actually being produced for the player. This explains, in part, why up until
recently game audio technology has progressed so little relative to other areas of the
industry. In games played in two dimensions, there is hardly any need for sound
simulations at all, since the player's view of the game world is already not incredibly
realistic. The more stylized the game, the less realistic its sounds need to be. This
explains why earlier games, and modern games which don't strive for realism, are able to
focus almost entirely on how sound is used to communicate with and move the player,
without putting much thought into how realistically that sound is rendered.
In three-dimensional games, however, a little bit of simulation can go a long way.
While the computational cost of complicated sound simulations could start to eat away at
the resources required for high-quality three dimensional graphics and complicated game
simulations, players are for the most part concerned with two things. First, they expect to
be able to sense the direction a sound is coming from; when something is creeping up
from behind them, or exploding on either side of them, they want to know even if they
can't see it. Second, they want to be able to know the spatial layout of the location in
which they hear the sound. Is there a wall in between the player and the sound, or are they
hearing it in a wide open field? Whether these audial indicators are physically accurate or
not is mostly a matter of adding a stronger sense of realism to the game, and allows for
some corners to be cut in the rendering of sound. But to think about what exactly this means
in terms of how sound in a game is rendered, it is first necessary to understand how our
ears locate and interpret the sounds that reach them.
Before even considering the direction of a sound, we can interpret to some degree
how a sound is reaching us by listening for different characteristics of that sound. This is
where the phenomena of diffraction, reflection, and absorption come into play. Diffraction
refers to the way in which sound bends around objects and spreads while passing
through openings. Reflection is the phenomenon which creates echoes; it describes how sounds reflect off of surfaces which
they reach. Absorption refers to the way in which different frequencies are quieted by
variable amounts dependent on the material they are passing through or reflecting off of.
(Selby, Lecture 22) The combination of these phenomena changes the characteristics of sounds in
ways that we can recognize and interpret. The sound of a voice diffracting around a wall
midsentence might indicate to the player that the speaker ran behind the wall as they
spoke. A sound with many delayed reflections might give the player the sense of being in
a large open building, while a sound with no reflections at all might seem to be in a wide
open field. A muffled sound with few high frequencies might convey the sense that the
sound came to the listener through a thick wall which absorbed the higher frequencies.
Various combinations of these effects can provide any number of scenarios, each of
which tells the player something about the state of the game.
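As a rough illustration of absorption, the "muffled behind a wall" effect can be approximated in code by a one-pole low-pass filter, which lets low frequencies through while attenuating high ones. The cutoff frequency below is an arbitrary illustrative value, not a measured material property:

```python
# Sketch: simulating absorption with a one-pole low-pass filter.
# A sound "behind a wall" keeps its low frequencies but loses its highs.
import math

def low_pass(samples, cutoff_hz, sample_rate=44100):
    """One-pole low-pass filter: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    a = dt / (rc + dt)
    out, y = [], 0.0
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out

def tone(freq, n=4410, rate=44100):
    """A tenth of a second of a pure sine tone at the given frequency."""
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

# With a 500 Hz cutoff, a 200 Hz rumble passes nearly untouched,
# while an 8000 Hz hiss is strongly attenuated:
low = low_pass(tone(200), cutoff_hz=500)
high = low_pass(tone(8000), cutoff_hz=500)
```

Games layer effects like this (along with delays for echoes and gain changes for distance) to suggest walls, rooms, and open fields without any true physical simulation.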
To pinpoint the location of a sound, humans also use what is called binaural
hearing, which at its simplest refers to the fact that we use two ears to hear sound. Having
two locations at which sound is received allows the brain to interpret differences in
timing, volume, and pitch to determine where a sound comes from. (Columbia College)
These variations occur for two primary reasons. The first is the distance between our two
ears. Sound waves coming from the left or right will reach the ear on that side first,
causing a slight delay between the time that each ear receives the sound. Sound also loses
amplitude as it travels, so the ear that receives the sound later will also hear a slightly
quieter sound. The second is the thing in between our ears; that is, our heads. Just like
everything else, the materials that make up the skull absorb sound, and they absorb some
frequencies of sound better than others. In the case of our heads, low frequency sounds
are absorbed less than high frequency sounds, so the ear on the side opposite a sound's
source will hear a slightly filtered version of the sound with less of the high frequencies
present in the original. Different frequency sounds also experience different degrees of
diffraction, or bending, around the head, which can alter the frequency spectrum of
sounds entering the ear. The brain picks up on these minute differences in timing,
volume, and pitch, and interprets them in a way that determines the location of the sounds
we hear.
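The timing component of binaural hearing can be sketched with a simplified model that treats the head as two points a fixed distance apart, ignoring the head-shadow and diffraction effects described above. The head width used here is a rough, illustrative figure:

```python
# Sketch: interaural time difference (ITD) in a simplified two-point model
# of the head. Real ITDs also depend on diffraction around the skull; this
# only captures the path-length difference between the two ears.
import math

SPEED_OF_SOUND = 343.0  # m/s in air at room temperature
EAR_DISTANCE = 0.21     # meters; a rough, illustrative head width

def interaural_time_difference(azimuth_deg):
    """Extra travel time (in seconds) to the far ear for a distant source
    at the given angle (0 = straight ahead, 90 = directly to one side)."""
    path_difference = EAR_DISTANCE * math.sin(math.radians(azimuth_deg))
    return path_difference / SPEED_OF_SOUND
```

In this model a source directly to one side arrives at the near ear roughly 0.6 milliseconds before the far ear, while a source straight ahead arrives at both ears simultaneously; it is differences on this tiny scale that the brain converts into a sense of direction.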
To create a sense of space, games must take diffraction,
reflection, absorption, and position into account in order to give players a rough idea of
where and how sounds originate in the game world. However, most games can only
afford to spend a small fraction of their processing power on audio.
This means that corners need to be cut, since game developers are more concerned with
sound being immersive than they are with it being physically accurate. (Guay) Broadly
speaking, there are two different simulation tactics most commonly used: the ray tracing method and the virtual source method.
Ray tracing is a technique where the path of a sound is followed while taking into
account its environment. Starting at the sound source, this technique looks at a number of
rays coming out from the source at different angles. It follows these rays until they
collide with an object, and then looks at a new ray to represent the reflected version of the
colliding ray. (Miksicek) In figure two, a ray trace in two dimensions is pictured using
four rays. Using more rays can increase the accuracy of the simulation, but also increases
the computational cost of performing the ray trace. In figure three, the result of a ray trace
in two dimensions using an infinite number of rays is shown, representing the ideal
outcome of a ray trace operation. The darker shades of green represent areas where the
sound rays travel through more than once as a result of reflecting off of a wall, and white
areas represent places where the sound doesn't reach at all. The ray trace method
accurately simulates the reflection of sound off of surfaces, and provides extremely
realistic echoes and reverberation. It also has the potential to account for absorption by
allowing some sound to pass through surfaces with a lower intensity, and diffraction by
bending rays around the edges of obstacles. However, in
three dimensions this method can be prohibitively expensive, since games run at a rate of
many frames per second. (N. Tsingos) It is more useful for simulations which require
accuracy than for applications such as games, which are run in real time.
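The core of the technique can be sketched in a few lines. The following toy model, not drawn from any real engine, advances rays from a sound source in an empty two-dimensional rectangular room and mirrors their direction whenever they strike a wall:

```python
# Sketch: 2-D acoustic ray tracing in an empty rectangular room with
# perfectly reflective, axis-aligned walls. Real implementations trace many
# more rays against arbitrary geometry and track energy loss; this only
# shows the basic idea of advancing rays and reflecting them at walls.
ROOM_W, ROOM_H = 10.0, 6.0  # room dimensions in meters (illustrative)

def trace_ray(x, y, dx, dy, steps=1000, step_size=0.05):
    """Advance one ray from a sound source, reflecting off the walls."""
    path = [(x, y)]
    for _ in range(steps):
        x += dx * step_size
        y += dy * step_size
        if x < 0 or x > ROOM_W:   # hit a vertical wall: mirror x-direction
            dx = -dx
            x = max(0.0, min(x, ROOM_W))
        if y < 0 or y > ROOM_H:   # hit a horizontal wall: mirror y-direction
            dy = -dy
            y = max(0.0, min(y, ROOM_H))
        path.append((x, y))
    return path

# Four rays leaving a source at the room's center, one per diagonal,
# matching the four-ray example pictured in figure two:
paths = [trace_ray(5.0, 3.0, dx, dy) for dx in (-1, 1) for dy in (-1, 1)]
```

Every point a ray passes through is somewhere the sound is heard; summing the delayed arrivals of many such rays at the listener's position is what produces realistic echoes and reverberation, and it is the sheer number of rays needed that makes the method so expensive.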
The alternative to ray tracing is the virtual source method, which trades physical
accuracy for computational speed. In this method, pictured in figure four, the simulation only concerns itself with points where
diffraction might occur. Based on these points, virtual sounds are generated at nearby
locations to simulate the audial effect of diffraction. Reflections are not calculated, and
the reverberation applied is not physically accurate, but is chosen to match the game's acoustic environment. Various shortcuts are
used to cut down on computational cost even further, by grouping two-dimensional areas
into zones, and treating the world as multiple two-dimensional layers rather than a
three-dimensional space for computational purposes. (Guay) As a result, the virtual source
method is much less physically accurate than the ray tracing method. However, it
produces a convincingly immersive soundscape while remaining within the constraints of modern technology, making it the current choice of most developers.
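The basic idea can be illustrated with a toy calculation: when a wall corner sits between the source and the listener, the sound is treated as if it were re-emitted from a virtual source at the corner, attenuated as though it had traveled the full length of the bent path. The geometry and names below are illustrative simplifications, not the actual algorithm of any particular engine:

```python
# Sketch: the virtual source idea in two dimensions. A sound blocked by a
# wall is replaced by a "virtual source" at the diffracting corner, with
# its loudness governed by the whole source-to-corner-to-listener path.
import math

def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def virtual_source(source, corner, listener):
    """Return the virtual source position and the path length used for
    attenuation when the direct line of sight is blocked."""
    path_length = distance(source, corner) + distance(corner, listener)
    # The sound appears to come from the corner, but only as loud as if it
    # had traveled the entire bent path around the obstruction:
    return corner, path_length

pos, dist = virtual_source(source=(0, 0), corner=(3, 4), listener=(6, 4))
```

A handful of distance computations like this per audible sound is vastly cheaper than following hundreds of reflecting rays every frame, which is precisely the trade-off the method makes.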
The meaning of immersion is dependent upon the medium, however, and new
technology is shifting that definition. Commercial virtual reality hardware, such as the
PlayStation VR, Vive, and Oculus Rift systems, are setting higher standards for what
immersion means. When a headset provides an illusion of depth and of actually existing in the game world, precise audio
becomes more important. If a player sees an event that should cause a sound to occur some
distance away, and the sound seems to come from somewhere just a few meters away,
even such a small discrepancy can break the total immersion that virtual
reality works so hard to achieve. To account for this sensitivity, sound technology in
games is being pushed to evolve, causing demand for more accurate and precise
simulation.
Figure 5: Using Oculus Rift (Shanklin)
The simulation techniques described above work fairly well for games in virtual reality, with one major caveat. They work well in the
sense that they are feasible from a computational perspective, and are refined enough to
give players a general idea of what direction a sound is coming from and how. What they
fail to account for is each listener's Head-Related Transfer Function, or HRTF. HRTFs are like audial fingerprints: unique sets of
filters describing how a particular person's body shapes the sounds that reach their ears.
Recall that the interpretation of a sound's location is in part dependent upon the
distance between one's ears, and the absorption of sound by the head in between them.
Add in the diffraction of sound that occurs off of the shoulders and neck, and it becomes
very clear that there are many factors which vary from person to person, meaning that each of our
interpretations of sound is unique. In the real world, a sound coming from some distance
away literally sounds different to two different people. And if the exact same sound is
played through headphones to those two people, they will interpret it slightly differently
in terms of where it is located. With virtual reality, the problem isn't in accurately
simulating sound, but in matching that simulation to each individual listener's ears.
Finding solutions to the HRTF problem is challenging, because the exact nature of
the problem is different for each individual user. This requires a solution which is
adaptable. The most promising solution currently available is a set of stereo headphones
called Ossic, which play three-dimensional audio after performing a calibration routine with
the wearer. In doing so, they can account for the wearer's individual HRTF and adjust
their output accordingly. They also maintain the location of sounds using head-tracking technology, similar to that of
commercially available virtual reality systems.1 If similar systems were integrated into
existing virtual reality hardware, it may be possible to eliminate the small discrepancies
in sound that currently exist, and provide an even more immersive audio experience.
1 http://www.ossic.com/technology/
Works Cited
Clark, Andrew. "Defining Adaptive Music." Gamasutra. N.p., 17 Apr. 2007. Web. 16 Apr.
2016.
Guay, Jean-François. Game Developers Conference Vault. Ubisoft Montreal, 5 Mar. 2012. Web. 1 Apr. 2016.
Hamilton, Kirk. "Mass Effect 3's Musical Secret." Kotaku. N.p., 22 Mar. 2012. Web. 16
Apr. 2016.
Miksicek, Lukas. "3D Sound - an Overview." CTU Prague, n.d. Web. 10 May 2016.
"Playing with Your Mind: The Psychology of Sound in Video Games." Amplifion, n.d.
Selby, Kathy. "Lecture 22." Physics of Musical Sound. Cornell University, Ithaca. 21
Shanklin, Will. "Oculus Rift Review: Polished, Often Magical ... but Ultimately Second
Zhong, Xiao-li, and Bo-sun Xie. "Head-Related Transfer Functions and Virtual Auditory