
Reid Wade PHYS 1204

Dr. Katherine Selby TA Soumyajit Bose

18 May 2016

Playing With Music: Sound Simulation In Games

Beginning with the eight-bit beeps and bloops of Atari's Pong and reaching all the

way to the modern simulations found in virtual reality, sound has always played an

integral role in bringing virtual worlds to life. Sound and music mold a game's aesthetic,

relay and reinforce information about a game's mechanics, and create an immersive

experience that makes players want to stay in a game's world. A game's audio can be as

iconic as its characters, conditioning entire generations to recognize whimsical tunes as

the sound of picking up a coin, storming a castle, or getting eaten by a ghost in a maze. It

can also be the premise around which a game is based, as evidenced by the obsession

with rock band simulators and guitar-shaped controllers in the late 2000s. As the

technology for sound in games has expanded, so has the variety of roles that it plays,

maintaining its important status in the increasingly complicated construction of these

worlds.

Historically, the technology that handles sound for videogames has evolved

alongside the games themselves, though not always at the same rate. Sound design in

early games received special attention because it was a relatively cost-effective method of

communicating information to players. In more recent history, however, sound

technology has taken a backseat to enhanced graphical effects in games. This is in part

due to pressure to deliver a more realistic experience in big-budget titles, which

tend to drive the majority of technological advances in the industry. Game developers

have felt limited in the amount of realism they can achieve with sound, because they are

constrained by the hardware available to players. While it might be possible to create a

realistic soundscape with seven surround-sound speakers, most players are only using a

set of headphones or their TV's built-in audio system. On the other hand, high-definition

televisions and computer monitors are quickly becoming the norm, so improved graphical

effects provide a boost to a game's realism that is felt by a wider portion of consumers.

But recent advances in the technology and commercial availability of virtual

reality hardware have reignited interest in more technologically advanced sound

simulation. The rise of commercial virtual reality systems like PlayStation VR, Vive, and

Oculus Rift presents an entirely new direction for game development, which brings with it

its own set of challenges. Chief among these is the challenge of audial immersion.

Maintaining the illusion of a virtual world requires not only a 3-D, motion-tracking visual

display, but also the ability to produce sound which gives the player an accurate sense of

where in space that sound originates. Doing so is a task which lies at a disciplinary

crossroads, where physics, computer science, psychology, and music all have a role to

play. To better understand the motivation behind such a challenging endeavor, let's first

take a look at why music and sound are so essential to games.

How Sound Is Used in Games

As early as the first home game consoles, simple

musical techniques were being utilized to convey information to players. Take, for

instance, the use of pentatonic scales. A pentatonic scale is built from only five notes per octave, chosen so that no two of them clash harshly when played together. While only having five notes might limit the

variety of music that can be composed, the fact that these notes are almost guaranteed to

sound pleasing together makes it easy to compose sound clips which sound positive.

Early games in which sound was too primitive to play dialogue telling the player "good job" or "you win" used these scales to convey a sense of success to players. The classic

eight-bit tunes played when reaching the end of a level in Mario Brothers, picking up a

ring in Sonic the Hedgehog, or opening a treasure chest in The Legend of Zelda all use

pentatonic scales. (Amplifion) Without any dialogue, and without having previously exposed players to these sounds, these games use simple melodies to communicate instantly that the player has accomplished something.
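
To make the idea concrete, here is a minimal sketch in Python of how such a scale can be generated: the semitone offsets below describe a major pentatonic scale in equal temperament, and the A4 = 440 Hz root is simply an illustrative choice rather than a value taken from any of the games mentioned above.

```python
# Minimal sketch: frequencies of a major pentatonic scale in equal temperament.
# The root pitch (A4 = 440 Hz) and the use of the major pentatonic are
# illustrative assumptions, not taken from any specific game mentioned above.

A4 = 440.0                           # reference pitch in Hz
PENTATONIC_STEPS = [0, 2, 4, 7, 9]   # semitone offsets of a major pentatonic scale

def pentatonic_frequencies(root_hz=A4, octaves=1):
    """Return the scale's frequencies starting at root_hz, spanning `octaves`."""
    freqs = []
    for octave in range(octaves):
        for step in PENTATONIC_STEPS:
            semitones = step + 12 * octave
            freqs.append(root_hz * 2 ** (semitones / 12))  # equal-temperament spacing
    return freqs

if __name__ == "__main__":
    # A quick rising run over these notes is the kind of short, consonant
    # jingle early games used to signal success.
    for f in pentatonic_frequencies(octaves=2):
        print(f"{f:.1f} Hz")
```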

But the use of simple melodies to create associations in a player's mind extends

far beyond the Pavlovian trick of rewarding players with an uplifting tune for success.

Melodies integrated into a game's non-musical soundscape can also interact with the

game world in ways which reinforce emotional connections to locations and characters.

In Mass Effect 3, there is a haunting synthesized melody that plays during key gameplay

moments, as well as during emotional pieces of dialogue between characters. More

subtly, the melody is also found hidden in the drone of machinery in the game's central

location, where the player returns after missions during which friendly characters often

die. (Hamilton) The repeated use of this melody, both musically and integrated into

environmental sounds, in situations which prompt thought and reflection about the

game's narrative develops an emotional context which is not as immediately obvious as

the happy ring of a coin or opened treasure chest. It shows how modern games can use

the variety of sounds available to them to create associations which aren't as naturally

interpreted as the simple jingles used in earlier games, and hide them in places players

might not even think to look.

In other situations, the use of music to set the tone of a game might be more

obvious, and intentionally so. Unlike in films, the mood in a game is not simply a function

of time but of both time and the player's actions, and the score accompanying the game

must account for this. One of the most common situations in which this occurs is in

games like Metal Gear Solid, where enemies can sneak up on a player from behind or

around corners. (Amplifion) In a film, the director knows the exact time when the

surprise occurs, and the composer can construct the score in a way that builds up to the

surprise. But in games, when the surprise occurs is dependent on a variety of factors, like

when the player reaches the location of the surprise, and how long it takes them to build

up the nerve to turn the corner. This presents a unique challenge to those composing game

music, who must compose tracks designed to transition at a moment's notice, while still

sounding natural. (Clark) Doing so allows for the audio to not only function as a

component of the game's mood, but also as a reliable cue for the player to run away or

fight in response to the surprise. The "alert mode" music becomes instantly recognizable

to the player, and serves as both aesthetic icing and an important game mechanic.
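
The underlying idea can be illustrated with a small, hypothetical sketch: two looping tracks play at once, and a short crossfade moves between them the moment the game's state changes, so the switch happens immediately without cutting off abruptly. The track names, fade length, and update loop below are assumptions for illustration, not a description of how Metal Gear Solid or any particular engine actually implements this.

```python
from dataclasses import dataclass

@dataclass
class Track:
    name: str
    volume: float = 0.0           # 0.0 = silent, 1.0 = full volume

class AdaptiveMusic:
    """Hypothetical two-track adaptive score: 'explore' and 'alert' loops."""

    def __init__(self, fade_seconds: float = 1.5):
        self.tracks = {"explore": Track("explore", 1.0), "alert": Track("alert", 0.0)}
        self.active = "explore"
        self.fade_seconds = fade_seconds

    def set_state(self, state: str) -> None:
        """Called by gameplay code, e.g. the moment an enemy spots the player."""
        self.active = state

    def update(self, dt: float) -> None:
        """Advance the crossfade by dt seconds; call once per frame."""
        step = dt / self.fade_seconds
        for name, track in self.tracks.items():
            target = 1.0 if name == self.active else 0.0
            if track.volume < target:
                track.volume = min(target, track.volume + step)
            elif track.volume > target:
                track.volume = max(target, track.volume - step)

music = AdaptiveMusic()
music.set_state("alert")          # the player has been spotted
for _ in range(90):               # ~1.5 seconds of 60 fps frames
    music.update(1 / 60)
print({name: round(t.volume, 2) for name, t in music.tracks.items()})
```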

How Sound Is Rendered in Games

Note that thus far, we have not discussed any uses of sound which actually

require accurate sound simulation to achieve their purpose. Pentatonic scales can be

played, sound effects can create subtle connections in the player's mind, and background

music can be changed dynamically all without much thought being given to how the

sound is actually being produced for the player. This explains, in part, why up until

recently game audio technology has progressed so little relative to other areas of the

industry. In games played in two dimensions, there is hardly any need for sound

simulations at all, since the player's view of the game world is already not incredibly

realistic. The more stylized the game, the less realistic its sounds need to be. This

explains why earlier games, and modern games which don't strive for realism, are able to

focus almost entirely on how sound is used to communicate with and move the player

without putting much thought into how realistically that sound is rendered. They achieve the desired effects of using sound without worrying about how physically accurate it is.

In three-dimensional games, however, a little bit of simulation can go a long way.

While the computational cost of complicated sound simulations could start to eat away at

the resources required for high-quality three-dimensional graphics and complicated game

simulations, players are for the most part concerned with two things. First, they expect to

be able to sense the direction a sound is coming from; when something is creeping up

from behind them, or exploding on either side of them, they want to know even if they

can't see it. Second, they want to be able to know the spatial layout of the location in

which they hear the sound. Is there a wall between the player and the sound, or are they

hearing it in a wide open field? Whether these audial indicators are physically accurate or

not is mostly a matter of adding a stronger sense of realism to the game, and allows for

some corners to be cut in rendering the sound. But to think about what exactly this means

in terms of how sound in a game is rendered, it is first necessary to understand how our

sense of hearing works in real life to interpret these sounds.



Before even considering the direction of a sound, we can interpret to some degree

how a sound is reaching us by listening for different characteristics of that sound. This is

where the phenomena of diffraction, reflection and absorption come into play. Diffraction

refers to the way in which sound bends around objects and through narrow openings, like a doorway or window. (Guay) Reflection is the

phenomenon which creates echoes; it describes how sounds reflect off of surfaces which

they reach. Absorption refers to the way in which different frequencies are quieted by

variable amounts dependent on the material they are passing through or reflecting off of.

(Selby L22) The combination of these phenomena changes the characteristics of sounds in ways that we can recognize and interpret. The sound of a voice diffracting around a wall

mid-sentence might indicate to the player that the speaker ran behind the wall as they

spoke. A sound with many delayed reflections might give the player the sense of being in

a large open building, while a sound with no reflections at all might seem to be in a wide

open field. A muffled sound with few high frequencies might convey the sense that the

sound came to the listener through a thick wall which absorbed the higher frequencies.

Various combinations of these effects can provide any number of scenarios, each of

which tells the player something about the state of the game.
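
As a rough illustration of the absorption effect described above, the sketch below models a sound passing through a wall as a simple one-pole low-pass filter, which attenuates high frequencies more than low ones; the cutoff frequency and the test tones are assumed values, not measured material properties from any cited source.

```python
import numpy as np

def occlude(samples: np.ndarray, sample_rate: int, cutoff_hz: float = 800.0) -> np.ndarray:
    """Crude 'through-a-wall' absorption: a one-pole low-pass filter that
    attenuates high frequencies far more than low ones. The cutoff is an
    illustrative value, not a measured material property."""
    rc = 1.0 / (2 * np.pi * cutoff_hz)
    alpha = (1.0 / sample_rate) / (rc + 1.0 / sample_rate)
    out = np.empty_like(samples)
    acc = 0.0
    for i, x in enumerate(samples):
        acc += alpha * (x - acc)          # exponential smoothing = one-pole low-pass
        out[i] = acc
    return out

# A 200 Hz tone survives the "wall" far better than a 4 kHz tone.
sr = 44100
t = np.arange(sr) / sr
low_tone = np.sin(2 * np.pi * 200 * t)
high_tone = np.sin(2 * np.pi * 4000 * t)
print(occlude(low_tone, sr).std(), occlude(high_tone, sr).std())
```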

To pinpoint the location of a sound, humans also use what is called binaural

hearing, which at its simplest refers to the fact that we use two ears to hear sound. Having

two locations at which sound is received allows the brain to interpret differences in

timing, volume, and pitch to determine where a sound comes from. (Columbia College)

These variations occur for two primary reasons. The first is the distance between our two

ears. Sound waves coming from the left or right will reach the ear on that side first,

causing a slight delay between the time that each ear receives the sound. Sound also loses

amplitude as it travels, so the ear that receives the sound later will also hear a slightly

quieter sound. The second is what sits in between our ears, that is, our heads. Just like

everything else, the materials that make up the skull absorb sound, and they absorb some

frequencies of sound better than others. In the case of our heads, low frequency sounds

are absorbed less than high frequency sounds, so the ear on the side opposite a sound's

source will hear a slightly filtered version of the sound with less of the high frequencies

present in the original. Different frequency sounds also experience different degrees of

diffraction, or bending, around the head, which can alter the frequency spectrum of

sounds entering the ear. The brain picks up on these minute differences in timing,

volume, and pitch, and interprets them in a way that determines the location of the sounds

we hear.

Figure 1: Binaural hearing illustration (Columbia College)
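
The timing cue in particular lends itself to a small worked example. The sketch below uses the simple approximation that the extra distance to the far ear is d·sin(θ), where d is the spacing between the ears and θ is the angle of the source off the forward direction; the 0.18 m ear spacing and 343 m/s speed of sound are assumed typical values, and a real simulation would also model the level and spectral differences described above.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air at room temperature
EAR_SPACING = 0.18       # metres between the ears; an assumed, typical value

def interaural_time_difference(azimuth_deg: float) -> float:
    """Approximate extra time (in seconds) the far ear waits for a wavefront.
    Uses the simple d*sin(theta)/c model; 0 degrees is straight ahead,
    90 degrees is directly to one side."""
    return EAR_SPACING * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND

for angle in (0, 15, 45, 90):
    print(f"{angle:>3} deg -> {interaural_time_difference(angle) * 1e6:.0f} microseconds")
# Even at 90 degrees the delay is only about 520 microseconds, yet the brain
# resolves differences this small when localizing sounds.
```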

Modern three-dimensional games use sound simulations which take diffraction,

reflection, absorption and position into account in order to give players a rough idea of

where and how sounds originate in the game world. However, most games can only

afford to allot approximately ten percent of computational resources to sound simulation.

This means that corners need to be cut since game developers are more concerned with

sound being immersive than they are with it being physically accurate. (Guay) Broadly

speaking, there are two different simulation tactics most commonly used, which are the

ray tracing and virtual source methods.

Figure 2 (left): Ray tracing procedure illustration (Miksicek)

Figure 3 (right): Sound intensity results using ray tracing (Guay)

Ray tracing is a technique where the path of a sound is followed while taking into

account its environment. Starting at the sound source, this technique looks at a number of

rays coming out from the source at different angles. It follows these rays until they

collide with an object, and then looks at a new ray to represent the reflected version of the

colliding ray. (Miksicek) In figure two, a ray trace in two dimensions is pictured using

four rays. Using more rays can increase the accuracy of the simulation, but also increases

the computational cost of performing the ray trace. In figure three, the result of a ray trace

in two dimensions using an infinite number of rays is shown, representing the ideal

outcome of a ray trace operation. The darker shades of green represent areas where the

sound rays pass through more than once as a result of reflecting off of a wall, and white areas represent places where the sound doesn't reach at all. The ray trace method

accurately simulates the reflection of sound off of surfaces, and provides extremely

realistic echoes and reverberation. It also has the potential to account for absorption by

allowing some sound to pass through surfaces with a lower intensity, and diffraction by

bending rays under certain collision conditions. Unfortunately, when implemented in

three dimensions this method can be prohibitively expensive since games run at a rate of

many frames per second. (N. Tsingos) It is more useful for simulations which require accuracy than for applications such as games, which run in real time.
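
To show what the procedure in figures two and three amounts to, here is a deliberately tiny two-dimensional sketch: rays leave the source at evenly spaced angles, reflect specularly off the walls of a rectangular room, lose a fraction of their energy at each bounce, and contribute to the result if they pass near the listener. The room geometry, ray count, absorption factor, and bounce limit are all illustrative assumptions, and the sketch omits diffraction and the many optimizations a real engine would need.

```python
import math

# Toy 2D acoustic ray trace in a rectangular room (all values illustrative).
ROOM_W, ROOM_H = 10.0, 6.0
SOURCE = (2.0, 3.0)
LISTENER = (8.0, 3.0)
ABSORPTION = 0.3          # fraction of energy lost at each wall bounce
MAX_BOUNCES = 4
NUM_RAYS = 720
STEP = 0.05               # ray-marching step in metres

def trace() -> float:
    total_energy = 0.0
    for i in range(NUM_RAYS):
        angle = 2 * math.pi * i / NUM_RAYS
        x, y = SOURCE
        dx, dy = math.cos(angle), math.sin(angle)
        energy, bounces = 1.0, 0
        while bounces <= MAX_BOUNCES:
            x, y = x + dx * STEP, y + dy * STEP
            # Specular reflection off the room walls, with some energy absorbed.
            if x <= 0 or x >= ROOM_W:
                dx, energy, bounces = -dx, energy * (1 - ABSORPTION), bounces + 1
            if y <= 0 or y >= ROOM_H:
                dy, energy, bounces = -dy, energy * (1 - ABSORPTION), bounces + 1
            # Count each ray once if it passes close to the listener.
            if math.hypot(x - LISTENER[0], y - LISTENER[1]) < 0.25:
                total_energy += energy
                break
    return total_energy

print(f"relative energy reaching the listener: {trace():.1f}")
```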

Figure 4: Virtual source illustrations (Guay)

The alternative to ray tracing is the virtual source method, which trades physical

accuracy for computational simplicity and a reasonable amount of player immersion. In

this method, pictured in figure four, the simulation only concerns itself with points where

diffraction might occur. Based on these points, virtual sounds are generated at nearby

locations to simulate the audial effect of diffraction. Reflections are not calculated, and

are instead simulated by a separate reverberation algorithm which is not physically

accurate, but is chosen to match the game's acoustic environment. Various shortcuts are

used to cut down on computational cost even further, by grouping two-dimensional areas

into zones, and treating the world as multiple two-dimensional layers rather than a

three-dimensional space for computational purposes. (Guay) As a result, the virtual source

method is much less physically accurate than the ray tracing method. However, it

accomplishes the task of conveying the necessary information to a player while



remaining within the constraints of modern technology, making it the current choice of

many games in which a high level of immersion is desirable.
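
The core trick of the virtual source method can be sketched in a few lines: when geometry blocks the direct path, the sound is re-emitted from the opening it would diffract through, such as a doorway, and attenuated according to the longer, bent path. The coordinates, the distance falloff, and the single-doorway setup below are illustrative assumptions rather than the exact scheme described by Guay.

```python
import math
from typing import Tuple

Point = Tuple[float, float]

def distance(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])

def virtual_source(source: Point, listener: Point, doorway: Point,
                   direct_path_blocked: bool) -> Tuple[Point, float]:
    """Return (position to pan the sound from, gain between 0 and 1)."""
    if not direct_path_blocked:
        path = distance(source, listener)
        return source, 1.0 / (1.0 + path)      # simple distance falloff
    # Wall in the way: re-emit from the doorway, attenuated by the bent path.
    path = distance(source, doorway) + distance(doorway, listener)
    return doorway, 1.0 / (1.0 + path)

pos, gain = virtual_source(source=(1.0, 4.0), listener=(6.0, 1.0),
                           doorway=(3.0, 2.0), direct_path_blocked=True)
print(pos, round(gain, 2))   # the player hears the sound "from the doorway"
```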

The Future of Sound in Games

The meaning of immersion is dependent upon the medium, however, and new

technology is shifting that definition. Commercial virtual reality hardware, such as the

PlayStation VR, Vive, and Oculus Rift systems, is setting higher standards for what

constitutes immersive sound. In virtual reality, where a moving three-dimensional headset

provides an illusion of depth and of actually existing in the game world, precise audio

becomes more important. If a player sees an event that should cause a sound to occur some

distance away, but the sound seems to come from a point a few meters off from that event,

even such a small discrepancy can break the complete and total immersion that virtual

reality works so hard to achieve. To account for this sensitivity, sound technology in

games is being pushed to evolve, causing demand for more accurate and precise

simulation.

Figure 5: Using Oculus Rift (Shanklin)

The techniques used to simulate sound in modern three-dimensional games work

fairly well for games in virtual reality, with one major caveat. They work well in the

sense that they are feasible from a computational perspective, and are refined enough to

give players a general idea of what direction a sound is coming from and how. What they

cannot provide, however, is an accurate simulation of how sounds are interpreted by

individuals. (Lalwani) This problem centers around what is known as a Head-Related

Transfer Function, or HRTF. HRTFs are like audial fingerprints, unique sets of

modifications to sounds as they are processed by an individual.

Recall that the interpretation of a sound's location is in part dependent upon the distance between one's ears, and the absorption of sound by the head in between them.

Add in the diffraction of sound around the shoulders and neck, and it becomes clear that many factors vary from person to person, meaning that each person's interpretation of sound is unique. In the real world, a sound coming from some distance

away literally sounds different to two different people. And if the exact same sound is

played through headphones to those two people, they will interpret it slightly differently

in terms of where it is located. With virtual reality, the problem isn't in accurately

simulating where a sound is coming from, but in adjusting that simulation to be

interpreted correctly by the listener.
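
In signal-processing terms, accounting for a listener's HRTF typically means filtering the same source sound with a different impulse response for each ear before it reaches the headphones. The sketch below shows that step with two tiny, made-up filters standing in for real head-related impulse responses; an actual system would use responses measured for, or calibrated to, the individual listener.

```python
import numpy as np

def render_binaural(mono: np.ndarray, hrir_left: np.ndarray,
                    hrir_right: np.ndarray) -> np.ndarray:
    """Return a (samples, 2) stereo array: the mono signal filtered
    separately for each ear with its head-related impulse response."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    n = max(len(left), len(right))
    left = np.pad(left, (0, n - len(left)))
    right = np.pad(right, (0, n - len(right)))
    return np.stack([left, right], axis=1)

# Made-up stand-in HRIRs for a source off to the listener's left: the right
# ear's response arrives later (leading zeros) and duller (smaller taps).
hrir_left = np.array([0.9, 0.1])
hrir_right = np.array([0.0] * 20 + [0.4, 0.3, 0.2])

mono_click = np.zeros(200)
mono_click[0] = 1.0
stereo = render_binaural(mono_click, hrir_left, hrir_right)
print(stereo.shape)   # (222, 2): the right channel is delayed and quieter
```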



Finding solutions to the HRTF problem is challenging, because the exact nature of

the problem is different for each individual user. This requires a solution which is

adaptable. The most promising solution currently available is a set of stereo headphones

called Ossic, which play three-dimensional audio after performing a calibration routine with

the wearer. In doing so, they can account for the wearer's individual HRTF, and adjust

and maintain the location of sound using head-tracking technology, similar to that of

commercially available virtual reality systems.1 If similar systems were integrated into

existing virtual reality hardware, it might be possible to eliminate the small discrepancies in sound that currently exist and provide an even more immersive audio experience.

1 http://www.ossic.com/technology/

Works Cited

"Auditory Localization - Introduction." Columbia College, Chicago - Audio Arts &

Acoustics, n.d. Web. 15 Apr. 2016.

Clark, Andrew. "Defining Adaptive Music." Gamasutra. N.p., 17 Apr. 2007. Web. 16 Apr.

2016.

Guay, Jean-Francois. "Real-Time Sound Propagation in Video Game." Game

Developers Conference Vault. Ubisoft Montreal, 5 Mar. 2012. Web. 1 Apr. 2016.

Hamilton, Kirk. "Mass Effect 3's Musical Secret." Kotaku. N.p., 22 Mar. 2012. Web. 16

Apr. 2016.

Lalwani, Mona. "For VR to Be Truly Immersive, It Needs Convincing Sound to Match."

Engadget. N.p., 22 Jan. 2016. Web. 1 Apr. 2016.

Miksicek, Lukas. "3D Sound - an Overview." CTU Prague, n.d. Web. 10 May 2016.

"Playing with Your Mind: The Psychology of Sound in Video Games." Amplifion, n.d.

Web. 1 Apr. 2016.

Selby, Kathy. "Lecture 22." Physics of Musical Sound. Cornell University, Ithaca. 21

Mar. 2016. Lecture.

Shanklin, Will. "Oculus Rift Review: Polished, Often Magical ... but Ultimately Second

Best." Gizmag, 16 Apr. 2016. Web. 18 Apr. 2016.

Zhong, Xiao-li, and Bo-sun Xie. "Head-Related Transfer Functions and Virtual Auditory

Display." (n.d.): n. pag. InTech, 5 Mar. 2014. Web. 15 Apr. 2016.
