Unit 5
Surround & 3D Sound Systems
Digital Audio Processing (20023)
Sound and Image in Telecommunication Engineering
Course 2015/2016
Sergio Bleda Pérez
Department of Physics, Engineering Systems and Signal Theory
Introduction
In this unit we review the different sound systems available to produce surround sound
But before looking at these systems we must review the concepts of spatial hearing
Spatial Perception of Sound
Spatial Perception
If you haven't noticed yet: we have 2 ears
Why two?
Because our sound localization mechanism needs two different signals
Using only one, localization is very difficult
But not always impossible
To explain the sound localization mechanism, Lord Rayleigh proposed the Duplex theory in his Theory of Sound in 1877.
Duplex Theory
The sounds perceived by both ears are similar but not identical
Comparing both sounds, the brain is able to locate the sound source
If not the exact position, at least the direction of arrival (DOA)
The hearing sense localizes sound sources using fundamentally two different parameters:
Inter-Aural Time Difference (ITD)
Inter-Aural Intensity Difference (IID)
Also known as Inter-Aural Level Difference (ILD)
These parameters allow left/right localization
They allow azimuth localization (lateralization)
ITD
It measures the time difference between the arrival of the sound at the left and right ears
[Figure: in this case, the right signal arrives before the left signal]
ITD
It's useful for low-frequency signals
The phase shift (delay) between both ears is easy to calculate with low-frequency signals
What is the low-frequency limit?
1.5 kHz
This frequency has a wavelength similar to the head size
For higher frequencies, hearing uses the ITD of the envelopes (not the waveform)
But it is much less relevant than the ITD at lower frequencies
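To get a feeling for the size of the ITD, the sketch below uses Woodworth's spherical-head approximation, ITD = (a/c)(theta + sin theta). The formula itself, the head radius (8.75 cm) and the speed of sound (343 m/s) are assumptions added for illustration; they do not appear in the slides.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature
HEAD_RADIUS = 0.0875    # m, assumed average head radius


def itd_woodworth(azimuth_deg, radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Approximate ITD in seconds for a far source.

    azimuth_deg: 0 = straight ahead, 90 = fully to one side.
    The approximation is intended for azimuths between 0 and 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return (radius / c) * (theta + math.sin(theta))


if __name__ == "__main__":
    for az in (0, 30, 60, 90):
        print(f"azimuth {az:2d} deg -> ITD = {itd_woodworth(az) * 1e6:.0f} us")
```

With these assumed values the ITD stays below one millisecond even for a source fully to one side.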
IID
It measures the difference in signal level/intensity between both ears
The head produces a shadow zone
Consequently, both ears do not receive the same amount of signal
Unless the sound source is in front of (or behind) the head
[Figure: the left ear is in the shadow zone]

IID
It is useful for localizing high-frequency sources
The head produces sound shadows above 1.5 kHz
Below this limit, diffraction considerably reduces the shadowing, preventing effective localization
[Figure: no shadow due to diffraction]
Duplex Theory
Hearing uses both ITD and IID
They are not used alternately; they are used at the same time
For low-frequency content: ITD
For high-frequency content: IID
But this theory has limits. It can't distinguish between:
Sources in front / behind
Sources above / below

Duplex Theory
Example of ambiguity:
Both cases produce:
ITD = 0
Cone of Confusion
If we trace an imaginary straight line between both ears, we obtain the interaural axis
Using only the duplex theory, we obtain rotational symmetry around this axis
Every sound source located on the cone produces the same ITD & IID
Ambiguity
Improving localization
To remove the ambiguity we need to add other aspects to the localization mechanism
We need to include the effects of:
Head
Shoulders
Pinna
All of them produce reflections & diffraction of the sound
Head, shoulders & pinna effects:
[Figure: two sound source positions]
Head, shoulders & pinna produce reverb-like reflections

If the listener or the sound source moves, everything changes
Reflections (and diffraction) are completely different
But how does all of this affect localization?
When we are young, the brain learns that a sound arriving from a given direction has a distinctive reverb-like pattern
These reverb-like patterns are known as cues
So, a sound including a given cue is detected as coming from a given direction

The use of cues considerably improves the spatial localization mechanism
But there are still some ambiguities that are not resolved
Example: sources in front / behind
To eliminate the remaining ambiguities, humans tend to move the head involuntarily
When you are looking for something that makes noise, you instinctively start moving your head
Other Additional Factors
There are some other factors that affect localization
Additional factors:
Kind of sound
Length of the sound
Long sounds are easier to locate
Onset
The instant where the sound begins
Spectral content of the source
Wider spectral content is easier to locate
On the other hand, pure tones are extremely difficult to locate
Location of Several Sound Sources
When there is more than one sound source, the sounds of all the sources overlap
That is to say, they are mixed together
If the sound sources are incoherent or partially incoherent
Hearing treats each one separately
So, hearing detects a different location for each source
If the sound sources are coherent
They are fused together
So, hearing detects only one location
This single location is known as the Phantom Image
Phantom Image
When dealing with coherent sources, the phantom image appears
There is only one location
But which location?
The location is a mean of the locations of the different sources
But the mean is weighted by the level of each source
This is known as: Sum of locations
Moreover, the phantom image depends on the spectral content of the signal
At high frequencies the pinna interferes with localization
So, the location of high frequencies tends to be diffuse
HF sound sources seem wider
Location with Reverb
All of the localization mechanisms described so far assume free field conditions
There is no room involved
But in the real world there is always some kind of room
Or at least the floor
So, there will always be some reverb
What signal arrives at the ear?
Direct sound + several reflections
Location with Reverb
What signal arrives at the ear?
Direct sound + several reflections
In this case hearing can locate the sound without problem
But the perceived sound is colored due to comb filtering effects
But direct sound & reflections are coherent signals
Theory says that a phantom image should appear, providing an incorrect location
But hearing has another useful tool to avoid this: the Haas effect
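The comb filtering mentioned above can be illustrated with the simplest possible case: the direct sound plus a single reflection. The sketch below is a minimal model, assuming one reflection with a made-up delay of 1 ms and a made-up gain of 0.5 (neither value comes from the slides). The combined response is H(f) = 1 + a·exp(-j·2·pi·f·tau), whose notches fall at odd multiples of 1/(2·tau).

```python
import cmath
import math


def comb_magnitude_db(freq_hz, delay_s, reflection_gain):
    """Level (dB) of direct sound plus one delayed reflection at freq_hz.

    reflection_gain should be below 1, otherwise the notches are infinitely deep.
    """
    h = 1 + reflection_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    return 20 * math.log10(abs(h))


def notch_frequencies(delay_s, count):
    """First `count` notch frequencies (Hz) for a reflection delayed by delay_s."""
    return [(2 * k + 1) / (2 * delay_s) for k in range(count)]


if __name__ == "__main__":
    delay = 0.001  # hypothetical 1 ms reflection delay
    gain = 0.5     # hypothetical reflection gain
    for f in notch_frequencies(delay, 3):
        print(f"notch near {f:.0f} Hz: {comb_magnitude_db(f, delay, gain):.1f} dB")
    print(f"peak near 1000 Hz: {comb_magnitude_db(1000.0, delay, gain):.1f} dB")
```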
Haas Effect
The Haas effect is known under two other names:
Law of the first wavefront
Precedence effect
The Haas effect does the following:
Once the first wavefront (usually the direct sound) arrives
It inhibits the localization mechanism during the following 2 to 50 ms
This way the localization of the reflections is avoided, effectively blocking the formation of the (wrong) phantom image
The Haas effect is influenced by the head, shoulders and pinna reflections
It works better in the horizontal plane
Distance Perception
What does hearing notice when a sound is near or far?
It depends on reverberation
In free field conditions (without reverberation):
We perceive a drop of 6 dB
Each time we double the distance
Low frequencies are less attenuated than high frequencies
Another aspect is the wavefront curvature:
Near sources produce a spherical wavefront
Far sources produce a planar wavefront
But this is not noticed by hearing, at least not if we do not move
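The 6 dB figure follows from the inverse distance law for a point source in free field: pressure falls as 1/r, so the level change between two distances is 20·log10(r2/r1). A minimal sketch (the function name is only illustrative):

```python
import math


def level_drop_db(r1_m, r2_m):
    """Free-field level drop (dB) when moving from distance r1_m to r2_m."""
    return 20 * math.log10(r2_m / r1_m)


if __name__ == "__main__":
    for r in (2, 4, 8):
        print(f"1 m -> {r} m: {level_drop_db(1, r):.2f} dB")
```

Each doubling of the distance adds roughly another 6 dB of attenuation.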
Distance Perception
In diffuse field conditions (with reverb)
We notice a change in the direct sound/reverb proportion
Near sources have a high content of direct sound
And a relatively low content of reverb
Far sources have more reverberation
The reverb proportion increases with distance
Movement Perception
When the sound source and/or the listener are moving we obtain the Doppler effect
When they are getting closer / moving away:
The apparent frequency is raised / lowered respectively
$f_a = \frac{c}{c + v_r} f_s$
$f_a$: apparent frequency
$f_s$: real frequency
$v_r$: relative speed
$c$: speed of sound
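A small numerical sketch of the Doppler formula above. The sign convention is an assumption: here the relative speed is taken as positive when source and listener move apart (frequency lowered) and negative when they approach (frequency raised). The 343 m/s default is also an assumed value.

```python
def apparent_frequency(f_source_hz, v_rel_ms, c=343.0):
    """Apparent frequency f_a = c / (c + v_r) * f_s.

    v_rel_ms > 0: source and listener moving apart (assumed convention).
    v_rel_ms < 0: source and listener approaching.
    """
    return c / (c + v_rel_ms) * f_source_hz


if __name__ == "__main__":
    for v in (-30.0, 0.0, 30.0):
        print(f"v_r = {v:+.0f} m/s -> f_a = {apparent_frequency(1000.0, v):.1f} Hz")
```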
Distance + Movement
There is one additional effect due to distance and movement both at the same time: Motion Parallax
Motion Parallax:
Near moving sources produce large sound level differences
Far moving sources have a constant level
E.g.: a flying mosquito at night (when you are trying to sleep)
When it is far you don't notice it (it has a constant low level)
When it is near it sounds like a plane (it produces large sound level differences)
And, of course: Doppler

Distance + Movement
This effect is not limited to sound
Example: a car moving fast on a motorway
High speed: traffic barrier (near)
Low speed: mountains (far)
Spatial Sound Systems
There are 3 different families:
Stereophonic Systems
Binaural Systems
Sound Field Reconstruction Systems
There may be systems that are a mix between families
Monaural Sound
The first sound system was the monaural one, in which there is only one channel
With this system it is not possible to move the sound away from the position of the speaker
All the sound sources are located at the position of the loudspeaker
This is our starting point
Stereophonic Systems
Stereo
The stereo system uses 2 channels: L & R
It allows the (apparent) movement of the sound source
Producing phantom sources
It produces a substantial enhancement in sound quality
The movement is based only on IID (it does not use ITD)
The use of IID for movement has a severe consequence:
We are using a parameter that uses the head as reference
In any stereophonic system, the reference point is ALWAYS THE HEAD
Stereo
Phantom sources are located on the arc that joins both speakers
Maximum aperture must be 60°
Otherwise the phantom image becomes unstable
SWEET SPOT
The only place in which the phantom source is perceived correctly
Stereo Panning
Since we are using the IID, the movement of the source is controlled with panning
Sending more or less signal to each channel
There are several options to calculate the proportion of the signals:
Linear law
Sine law
Tangent law
Stereo Panning
Linear law:
This is the simplest law, but it is not advisable
Since it does not maintain the apparent level of the source during the movement
Hearing is not linear, it is logarithmic
[Figure: gain applied to each channel ($G_L$, $G_R$, from 0% to 100%) as the phantom source moves towards R]
$g_L = 1 - g_R$
Stereo Panning
Sine law:
This more elaborate law maintains a constant level when the source moves
$\frac{\sin\theta_p}{\sin\theta_o} = \frac{g_L - g_R}{g_L + g_R}$
$\theta_p$: phantom position
$\theta_o$: maximum aperture
Stereo Panning
Tangent law:
This law maintains a constant energy level when the source moves
$\frac{\tan\theta_p}{\tan\theta_o} = \frac{g_L - g_R}{g_L + g_R}$
$\theta_p$: phantom position
$\theta_o$: maximum aperture
This is the most accurate panning law
Stereo Panning
Both previous laws only give the ratio between the gains
But we need another equation to be able to calculate the gains:
(Two equations for two unknowns)
$G_L + G_R = 1$ (for free field)
$G_L^2 + G_R^2 = 1$ (for diffuse field)
Stereo Panning
Keep in mind: mixing consoles and other equipment have a panning control
The panning control uses the tangent law
But the maximum aperture used is 90°
In this case, the tangent law with diffuse field can be simplified to the following equations:
$G_R = \sin(\theta_P)$
$G_L = \cos(\theta_P)$
This is why the tangent law is also known as the Sine-Cosine law
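The panning laws can be turned into gains by combining the ratio equation with one of the two normalization equations. The sketch below is one possible way to do it, under these assumptions: the phantom angle is measured from the center and is positive towards the left speaker, and theta_o is the angle of each speaker from the center (30° for a 60° aperture, 45° for the 90° console case). Function and parameter names are illustrative only.

```python
import math


def pan_gains(theta_p_deg, theta_o_deg=30.0, law="tangent", diffuse_field=True):
    """Return (g_left, g_right) for a phantom source at theta_p_deg.

    law: "tangent" or "sine".
    diffuse_field=True  -> g_L^2 + g_R^2 = 1
    diffuse_field=False -> g_L + g_R = 1 (free field)
    """
    fn = math.tan if law == "tangent" else math.sin
    # ratio = (g_L - g_R) / (g_L + g_R), as stated by both laws
    ratio = fn(math.radians(theta_p_deg)) / fn(math.radians(theta_o_deg))
    # Any pair proportional to (1 + ratio, 1 - ratio) satisfies the ratio equation
    gl, gr = 1.0 + ratio, 1.0 - ratio
    norm = math.hypot(gl, gr) if diffuse_field else (gl + gr)
    return gl / norm, gr / norm


if __name__ == "__main__":
    for angle in (-30, -15, 0, 15, 30):
        gl, gr = pan_gains(angle)
        print(f"theta_p = {angle:+3d} deg -> gL = {gl:.3f}, gR = {gr:.3f}")
```

With theta_o = 45° and the diffuse-field normalization, the result reduces to a cosine/sine pair of the pan-pot angle (45° minus theta_p in this convention), which matches the simplification used by mixing consoles.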
Quadraphonic
Since the appearance of stereo was a revolution, manufacturers of Hi-Fi equipment decided to go further
They designed the quadraphonic systems
Instead of 2 channels they use 4 channels
It was a complete disaster
In 1970 this was very difficult to implement
Each manufacturer produced its own standard
With vinyl discs, 2 discs had to be played at the same time
Its use was simple with eight-track cartridges

Quadraphonic
As you can see, the speakers do not follow the rule:
The maximum aperture must be 60°
Phantom with several speakers
How do we locate phantom sources with more than 2 speakers?
The answer is the same for every stereo-based system
Phantom sources are always created using only 2 speakers
Speakers are always used in pairs for creating phantom sources; only two are used at a time
But remember to always keep the maximum aperture of 60 degrees
Dolby Stereo
The Dolby Stereo system was designed for cinema
There were two different domestic versions:
Dolby Surround
Dolby Surround ProLogic
It uses 4 different channels: L, R, C, S
All of them are mixed into only two channels
C (center) is used to keep the dialogue stable
S (surround) is used for ambient sound
Dolby Stereo
Maximum aperture is 60°
[Figure: speakers at -30°, 0° and 30°]
Phantom images are only possible with the L & R speakers

Dolby Stereo
Example: Dolby Stereo in Cinema
As you can see, the surround channel uses several speakers
But all of them emit the same sound

Dolby Stereo
Example: Dolby Surround ProLogic at Home
The Dolby Surround version does not have the center channel
3/2 System
Also known as the 5.0 system
Or 5.1 if it includes the subwoofer channel (LFE)
It uses 5 channels:
L, R, C, LS & RS
All the speakers must lie on a circle
Since it is a stereo-based system, its localization power is the same as stereo
It only allows a better ambience definition (surround)
3/2 System
Diagram:
Surround channels are not fixed (100°-120°)
They are not used for localization of sources

3/2 System
Using more speakers does not enhance localization
[Figure: 6.1 and 7.1 layouts]
10.2 System
This system has a slogan:
Twice as good as 5.1
Its designer is the engineer who created THX
As the name implies, it uses 10 channels plus 2 LFE
The interesting part is that it uses:
2 channels for elevation at 45°
They allow the reproduction of sounds above the stage
2 channels at 55°
They enhance localization
And allow lateral reflections, increasing the spatial sensation

10.2 System
LH & RH: Left/Right height
LW & RW: Left/Right wide
22.2 System
This is a system proposed by NHK
The national television of Japan
It consists of 22 channels + 2 LFE
Channels are divided into layers, allowing elevation effects

22.2 System
The placement of the speakers is as follows:
VBAP
Vector Based Amplitude Panning
As the name implies, it uses amplitude panning to move the source around
But now the movement can be in 3D too
Well, 3D with severe restrictions
To allow elevation it needs an additional speaker
So now we are going to pan with 3 speakers
Applying a generalized version of the tangent law
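VBAP
We are going to keep the listener as the reference point
$g = \begin{bmatrix} p_n & p_m & p_k \end{bmatrix} \begin{bmatrix} l_{n1} & l_{n2} & l_{n3} \\ l_{m1} & l_{m2} & l_{m3} \\ l_{k1} & l_{k2} & l_{k3} \end{bmatrix}^{-1}$
$g$: gain (1x3)
$[p_n \; p_m \; p_k]$: phantom coordinates
$l_{ij}$: speaker coordinates (one row per speaker)
We can use more than 3 speakers
But always forming triangles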
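As a minimal sketch of the gain computation for one speaker triangle, the code below solves p = g_n·l_n + g_m·l_m + g_k·l_k for the three gains using Cramer's rule (equivalent to multiplying p by the inverse of the speaker matrix) and then normalizes them to constant power. The power normalization and all function names are assumptions made for illustration.

```python
import math


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b and c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def vbap_gains(p, ln, lm, lk):
    """Gains for the speakers at unit vectors ln, lm, lk and phantom direction p.

    A negative gain means that p lies outside the triangle formed by the speakers.
    """
    d = det3(ln, lm, lk)
    if abs(d) < 1e-12:
        raise ValueError("speaker vectors are coplanar with the listener")
    # Cramer's rule: replace one speaker row at a time by the phantom direction
    g = [det3(p, lm, lk) / d, det3(ln, p, lk) / d, det3(ln, lm, p) / d]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]  # constant-power normalization (assumed)


if __name__ == "__main__":
    # Hypothetical triangle: front, left and top speakers on the coordinate axes
    front, left, top = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    s = 1.0 / math.sqrt(3.0)
    print(vbap_gains((s, s, s), front, left, top))
```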
Binaural Systems
Stereo systems have severe limitations when dealing with precise spatial localization
They only use IID
We could choose to use ITD instead (or both at the same time)
By delaying the sound of each channel we can achieve the same results as with IID
But it is more complex and it improves nothing
So, instead of using only IID (or ITD) we can take another approach
Use everything: ITD + IID + reflections & diffraction on head, shoulders and pinna
These are binaural systems
Binaural Systems
So, binaural systems try to:
Reconstruct at the ears exactly the same signals that real sound sources would produce
But, at first sight, this is very, very complex
How do we include head, shoulders & pinna reflections & diffraction?
With the HRTF
HRTF
HRTF: Head Related Transfer Function
An HRTF is a filter
This filter introduces all the needed effects
Reflections & diffraction due to head, shoulders & pinna
In fact, we need a large set of filters
Two filters (HRTFs) for each direction of arrival
One for the left ear & one for the right ear
How do we compute/measure the HRTF filters?
With a bit of work
HRTF Measurement
We need to use 2 small microphones
Located inside the ears
And then, with the head still, we produce sound from every possible direction
With a given resolution (e.g. 5 or 10 degrees)

HRTF Measurement
To speed up the process we can use a semicircular array of speakers
An anechoic chamber gets rid of room reflections

HRTF Measurement
From one speaker at a time we send an impulse
Or any other signal useful for obtaining an impulse response
[Figure: impulse emitted by the speaker, HRIR recorded at the ear]
HRIR: Head Related Impulse Response

HRTF Measurement
If we plot all the HRIRs for the horizontal plane we obtain the following map:
This example is for the right ear

HRTF Measurement
We have measured an HRIR; then, what is an HRTF?
An HRTF is the Fourier transform of the HRIR
The previous chart in frequency becomes:
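The HRIR-to-HRTF step is just a discrete Fourier transform. The sketch below uses a naive DFT so that it needs no external library (a real implementation would use an FFT). The 8-sample HRIR is made up for illustration; it is not measured data.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of the sequence x."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def hrtf_magnitude_db(hrir):
    """Magnitude (dB) of the HRTF obtained from a measured HRIR."""
    return [20 * math.log10(max(abs(h), 1e-12)) for h in dft(hrir)]


if __name__ == "__main__":
    toy_hrir = [0.0, 1.0, 0.5, -0.25, 0.1, 0.0, 0.0, 0.0]  # hypothetical values
    for k, level in enumerate(hrtf_magnitude_db(toy_hrir)):
        print(f"bin {k}: {level:.1f} dB")
```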
HRTF
One big problem that comes with the HRTF is uniqueness
People tend to have different heads, shoulders and ears
So, each person will have a different set of HRTFs
If we are going to use them
We will have to measure the HRTFs for each person
There are sets of generic HRTFs that work more or less with everyone
But you will achieve much more accuracy with your own HRTFs
Binaural Systems
Once we have measured the HRTFs, binaural systems are very easy to implement
You only have to filter the sound with the pair of HRTFs corresponding to the desired direction
And apply a delay and a gain that depend on the distance of the phantom source
Since HRTFs are always measured at a given (short) distance from the head
[Block diagram: Source -> HRTF_L -> L; Source -> HRTF_R -> R]
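The processing chain described above can be sketched as a convolution with the HRIR pair followed by a distance-dependent delay and gain. This is only a sketch under stated assumptions: the HRIRs are made-up short sequences, the gain follows a 1/r law relative to the measurement distance, and the sample rate and speed of sound are assumed values.

```python
def convolve(x, h):
    """Direct-form convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def binaural_render(source, hrir_left, hrir_right, distance_m,
                    ref_distance_m=1.0, fs=44100, c=343.0):
    """Return (left, right) ear signals for a source at distance_m.

    ref_distance_m is the (assumed) distance at which the HRIRs were measured.
    """
    extra = max(distance_m - ref_distance_m, 0.0)
    delay = round(extra / c * fs)                            # extra travel time, in samples
    gain = ref_distance_m / max(distance_m, ref_distance_m)  # assumed 1/r attenuation

    def ear(hrir):
        return [0.0] * delay + [gain * v for v in convolve(source, hrir)]

    return ear(hrir_left), ear(hrir_right)


if __name__ == "__main__":
    left, right = binaural_render([1.0, 0.0, -1.0], [1.0, 0.5], [0.5, 0.25], 2.0)
    print(len(left), left[-4:], right[-4:])
```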
Binaural Systems
One severe drawback of binaural systems is that we need to use headphones for reproduction
We are computing the sound at the ears once it has passed the pinna
If we use speakers we are going to include the pinna twice
If you do this, everything is ruined

Binaural Systems
More problems due to headphones:
The Inside-the-Head effect appears: for several source positions, the sound seems to come from inside the head, not from outside
People prefer speakers over headphones for a reason
We are bound to the head: if the listener moves the head, the scene moves with him/her
To break the bond there are two options:
Obvious: do not move the head!
Complex: include a tracking mechanism, following the movement of the head and compensating for it (changing the HRTFs)
Transaural Systems
The use of headphones is a severe drawback in the vast majority of scenarios
Worst scenario ever: cinema
Best scenario: videogames
Transaural systems appear as an alternative to binaural systems that tries to overcome the headphone problem
Using a bit of signal processing we can replace the headphones with speakers
Let's see what we will hear using speakers

Transaural Systems
If we replace headphones with speakers, we need to deal with crosstalk
We need left speaker to left ear, and right speaker to right ear
But we also have crosstalk
[Figure: crosstalk paths shown as dashed lines]
We need to cancel the crosstalk using signal processing
Transaural Systems
How do we do this?
Emitting two signals that cancel each other's crosstalk when received by the ears
$x$: signals emitted by the speakers
$y$: signals received by the ears
$H_{ab}$: HRTF between a speaker-ear pair
$a$: speaker
$b$: ear
[Figure: speakers $x_1$, $x_2$; paths $H_{11}$, $H_{12}$, $H_{21}$, $H_{22}$; ears $y_1$, $y_2$]
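Transaural Systems
We need to solve the following system of equations:
$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$
$y_1$, $y_2$: this is the desired sound (already known)
$x_1$, $x_2$: this is the sound that we must send
[Figure: speakers $x_1$, $x_2$; paths $H_{11}$, $H_{12}$, $H_{21}$, $H_{22}$; ears $y_1$, $y_2$]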
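Solving the system for the speaker signals amounts to inverting the 2x2 matrix. The sketch below does this for a single frequency bin, with the matrix entries taken in the same positions as in the equation above; in practice the same operation would be repeated for every bin of the spectrum. The numeric values are made up for illustration.

```python
def crosstalk_cancel(h11, h12, h21, h22, y1, y2):
    """Return (x1, x2) such that [[h11, h12], [h21, h22]] @ [x1, x2] = [y1, y2].

    All arguments may be complex (one frequency bin of each transfer function).
    """
    det = h11 * h22 - h12 * h21
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular at this frequency")
    x1 = (h22 * y1 - h12 * y2) / det
    x2 = (-h21 * y1 + h11 * y2) / det
    return x1, x2


if __name__ == "__main__":
    # Hypothetical transfer function values at one frequency
    h11, h12, h21, h22 = 1.0 + 0.0j, 0.4j, 0.3 + 0.1j, 0.9 + 0.0j
    x1, x2 = crosstalk_cancel(h11, h12, h21, h22, 1.0, 0.0)
    print("speaker signals:", x1, x2)
    print("ear signals:", h11 * x1 + h12 * x2, h21 * x1 + h22 * x2)
```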
Transaural Systems
Block diagram:
So, in the end we apply the following block diagram
Transaural Systems
Now, the question is: does this really work?
The answer is: Yes & No
Transaural systems work only in controlled scenarios
Room reverberation ruins our effort
It is very difficult to include & cancel reverberation
The head of the listener must be completely still
A small movement changes the reflections drastically, preventing effective crosstalk cancellation
In this case, we should track the head and compensate for its movement
Although during the reaction time crosstalk is clearly noticeable
In an anechoic chamber, it works great
Sound Field Reconstruction Systems
Sound Field Reconstruction
This last family of sound systems uses a different approach to solve the spatial sound problem
Stereophonic & binaural systems are focused on the hearing sense
They try to deceive the localization mechanism with more or less skill
But these approaches have a common problem: they are head dependent
The new family tries a completely different approach
They try to synthesize the sound field in a whole area, not only at the ears

Let's explain it:
Instead of deceiving the localization mechanism
We are going to build a sound field identical to the one that the real sound source would produce
The sound field reconstruction is not limited to the sound at the ears; we reconstruct it in the whole room
This way, in a cinema with a lot of people, each one will perceive the scene in a different way (as happens in the real world)
Instead of making only one scene for all people at the same time

Example:
[Figure: a virtual sound source and three listeners, Listener 3 moving]
All listeners perceive the same source position
But each one with a different DOA
DOA depends on the current position of the listener
Now we are: Head Independent!

To achieve our goal there are two different systems
They are very different at first sight but, in the end, they are based on the same physics
Both perform the same task but from different points of view
Sound Field Reconstruction Systems:
Ambisonics
Wave-Field Synthesis
Ambisonics
Ambisonics is a sound system that started as a microphone technique used to store spatial sound
It stores the original sound field of a given (and unique) point in a finite number of channels
It measures the sound field at a given point
Then, in reproduction, it extrapolates the signal that must be sent to each speaker
When the signal emitted by each speaker arrives at the original (measured) point, it recreates the same sound field that was stored for this point
Ambisonics B-Format
The simplest version is the Ambisonics B-Format
It stores 4 different channels:
W: pressure
X: velocity in x axis
Y: velocity in y axis
Z: velocity in z axis
This is achieved with 4 different microphones:
One omnidirectional (W) and three bidirectional (X, Y, Z), one for each axis
*Image source: Wikipedia
Ambisonics B-Format
Having stored only the pressure (W), we can decide in which direction we want the source to be:
Given two angles: azimuth & elevation
The signals to be sent to the speakers will be:
We only need to compute the D matrix
D: decoding matrix
It depends on the speaker locations
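As an illustration of how a mono signal can be placed in a given direction and then decoded, the sketch below uses the traditional first-order B-format encoding equations (with W scaled by 1/sqrt(2)) and one simple horizontal-only decoder for a regular ring of speakers. Both the weighting convention and this particular decoder are assumptions; the slides do not show the actual decoding matrix D.

```python
import math


def encode_b_format(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into (W, X, Y, Z) for the given direction."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)            # traditional -3 dB weighting on W (assumed)
    x = sample * math.cos(az) * math.cos(el)
    y = sample * math.sin(az) * math.cos(el)
    z = sample * math.sin(el)
    return w, x, y, z


def decode_horizontal(w, x, y, speaker_azimuths_deg):
    """Basic decoder for a regular horizontal ring of speakers (one simple choice).

    Each row of the implied decoding matrix is
    [sqrt(2), 2*cos(a), 2*sin(a)] / N for a speaker at azimuth a.
    """
    n = len(speaker_azimuths_deg)
    out = []
    for a_deg in speaker_azimuths_deg:
        a = math.radians(a_deg)
        out.append((math.sqrt(2.0) * w + 2.0 * (x * math.cos(a) + y * math.sin(a))) / n)
    return out


if __name__ == "__main__":
    w, x, y, z = encode_b_format(1.0, 45.0)
    print(decode_horizontal(w, x, y, [45.0, 135.0, 225.0, 315.0]))
```

The speaker closest to the encoded direction receives the largest signal, and the speaker feeds depend only on the speaker positions passed to the decoder, not on the stored channels.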
Ambisonics
Example of use:
Ambisonics does not store the sound to be emitted by each speaker in a different channel
It computes the signal to be sent to each speaker at reproduction time
Using the decoding matrix
Ambisonics
Advantages:
The format does not define the position of the speakers
We can place the speakers as we want
Disadvantages:
Speakers can't be placed just anywhere
The decoding matrix depends on the speaker positions
To be able to compute the decoding matrix we need speakers at regular positions (shapes: circle, sphere, hemisphere, ...)
It only works in the sweet spot
The center of the regular array of speakers
As in stereo systems
Ambisonics
But we said that this system was a Sound Field Reconstruction System
Not something similar to a stereo system
Yes, but this is only possible with HOA
Higher Order Ambisonics

We have seen the most basic Ambisonics system
B-Format
It uses only 4 channels to encode the sound
With HOA we store more than 4 channels
By increasing the number of stored channels, we achieve a larger sweet spot
The sweet spot becomes a Sweet Area
More channels: wider area
The channels now represent Spherical Harmonics
But the complexity of the calculations needed to compute the decoding matrix increases notably

Representation of the Spherical Harmonics up to 3rd order:
*Image source: Wikipedia

The equations of the spherical harmonics up to 4th order are:
4th order HOA needs 25 channels
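The channel count follows from the number of spherical harmonics up to a given order: (N + 1)^2 for full 3D. The horizontal-only count 2N + 1 is included for comparison; it is not mentioned in the slides. Function names are illustrative.

```python
def hoa_channels_3d(order):
    """Number of channels of a full 3D Ambisonics system of the given order."""
    return (order + 1) ** 2


def hoa_channels_2d(order):
    """Number of channels of a horizontal-only system of the given order."""
    return 2 * order + 1


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"order {n}: 3D = {hoa_channels_3d(n)} channels, 2D = {hoa_channels_2d(n)} channels")
```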

Example of a 3D 3rd order Ambisonics reproduction system, CHESS (Guillaume Pottard):

The worst drawback of Ambisonics is the need for a regular array of speakers
In 2D the common regular shape is the circle
Think about a cinema with the speakers forming a circle around the seats
It's not practical
WFS is the second system that falls within the Sound Field Reconstruction Systems
At first sight it is very different from Ambisonics
But in fact, they are almost the same
Its starting point is the Huygens Principle
So let's review it
Huygens Principle
Every point of the wavefront can be seen as a new source of spherical waves
WFS uses the Huygens principle to reconstruct the sound field
[Figure: original source and secondary sources]
If the wavefront is created by combining the spherical waves of the secondary sources
Let's replace the secondary sources with speakers
This is known as the acoustic curtain principle
Acoustic Curtain Concept
Step 1: Put a speaker over each secondary source
Step 2: Form a linear array, instead of a curved one
Step 3: Get rid of the original source
Stereo vs WFS comparison:

Available types of sound sources:
Stereo: only point sources
[Figure: focused source]
Now, the tough part
Which sound must be emitted by each speaker?
We must calculate it
Kirchhoff-Helmholtz integral:
It is used to describe (with the Huygens principle) the sound field that is inside a given surface (S)
It says: the acoustic field inside a given volume can be recreated using secondary sources distributed over the surface that encloses the volume
Kirchhoff-Helmholtz integral:
[Figure: sound source, secondary sources and listener]

Simplifying:
Let's suppose that the volume is an infinite cube; the surface is now only a plane that divides inside from outside
Now let's reduce even more: from 3D we change to 2D, and the surface is now a straight line
And last, we discretize the surface into N points
So, in the end, each speaker must emit:
A delayed and attenuated version of the original sound
Plus a high-pass filter:
This filter appears when reducing from 3D to 2D
All speakers form a linear array
[Block diagram: pressure -> HP filter -> delay due to distance -> distance attenuation]
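The delay and attenuation stages can be sketched for a virtual point source located behind a linear array. This is a simplified illustration under stated assumptions: the delay is the travel time from the virtual source to each speaker, the gain uses a 1/sqrt(r) law (one common simplification; the slides only say "distance attenuation"), and the high-pass filter is left out. All names and values are hypothetical.

```python
import math


def wfs_delays_and_gains(source_xy, speaker_xs, array_y=0.0, c=343.0):
    """Return a list of (delay_seconds, gain), one pair per speaker.

    source_xy: position of the virtual source, which must not coincide with a speaker.
    speaker_xs: x coordinates of the speakers, all placed on the line y = array_y.
    """
    sx, sy = source_xy
    result = []
    for x in speaker_xs:
        r = math.hypot(x - sx, array_y - sy)  # source-to-speaker distance
        result.append((r / c, 1.0 / math.sqrt(r)))
    return result


if __name__ == "__main__":
    speakers = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical 5-speaker array
    for x, (delay, gain) in zip(speakers, wfs_delays_and_gains((0.0, -1.0), speakers)):
        print(f"speaker at x = {x:+.1f} m: delay = {delay * 1e3:.2f} ms, gain = {gain:.3f}")
```

The speaker closest to the virtual source fires first and loudest, and the outer speakers follow with increasing delay, which is what shapes the curved wavefront.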
WFS Limitations
Linear array:
Since we are using linear arrays of speakers, the emitted sound has cylindrical divergence instead of spherical
The sound decays only 3 dB each time we double the distance

WFS Limitations
Diffraction:
The array of speakers is finite, it has ends
Diffraction will appear at both ends
WFS Limitations
Spatial Aliasing:
Speakers must be placed at some distance from each other (at least their size)
In the Huygens principle, each secondary source is infinitely small and there are infinitely many sources
High frequencies are not well reconstructed
[Figure: pure tone with and without aliasing]
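A common rule of thumb for where the problem starts is the worst-case spatial aliasing frequency f = c / (2·d), where d is the spacing between speakers. This formula and the 343 m/s value are assumptions added here; the actual limit also depends on the directions of the source and the listener.

```python
def spatial_aliasing_frequency(spacing_m, c=343.0):
    """Worst-case spatial aliasing frequency (Hz) for speakers spaced spacing_m apart."""
    return c / (2.0 * spacing_m)


if __name__ == "__main__":
    for d in (0.10, 0.18, 0.30):  # hypothetical speaker spacings in metres
        print(f"spacing {d:.2f} m -> aliasing above about {spatial_aliasing_frequency(d):.0f} Hz")
```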
WFS Limitations
Loudspeaker directivity:
In theory, secondary sources are omnidirectional
In practice, speakers are not omnidirectional at all frequencies
But this is not so important; at low frequencies they work fine
At high frequencies, spatial aliasing is worse than this

WFS Examples
WFS Cinema in Ilmenau (Germany)

WFS Examples
Prototype at IRCAM (France)
It uses planar DML speakers

WFS Examples
Prototype at UPV (Valencia)
Uses 96 channels

WFS Examples
Prototype at the UA
Ambisonics VS WFS
Ambisonics:
It allows elevation (only for small orders of HOA)
Higher orders are too expensive in computing power
It needs a regular array of speakers
To obtain a big Sweet Area it needs a very high order
In the end, with the maximum order = WFS
WFS:
At the moment, without elevation
But the equations allow it; the problem is the money (& computing power)
It allows any array configuration
Using combinations of linear arrays
There is no sweet spot
But the center is a preferential location
Sound in Movie Theatres
Sound in Cinema
The sound in cinema has had two great milestones:
The appearance of sound
The stereo system
The use of more than two channels enhances the spatial perception
But the difference in perception between mono & stereo is far greater
In the following slides we briefly review the different attempts made in cinema sound
And we are going to learn how sound is stored on a film
Monaural Sound
At the beginning of cinema, the sound and the image were stored separately
During reproduction a gramophone was used
Varying the speed to maintain synchronization

Monaural Sound
Later, when sound was already handled as an electric current (around 1926)
The sound started to be stored together with the image
The sound was stored as an opaque band over a transparent section of the film (optical format)
[Figure: sound band]
Multichannel Sound
The first multichannel soundtrack was One Hundred Men and a Girl in 1937
It was recorded in 9 channels
But in the end all channels were mixed together into the monaural optical band
From this moment on, several attempts were made to enhance the sound in cinema
But the vast majority were failures
The technology was very expensive for the time
Fantasound (1940)
In 1940, the first screening of a true multichannel film took place: Fantasia
9 different sound channels
Mixed into 4 different optical bands
Using 3 speakers behind the screen and 65 small speakers around the theatre
It was amazing but far too expensive; it was only used in Fantasia
Cinerama (1952)
It used 3 linked projectors to produce a panoramic format
And 7 audio channels stored in a magnetic multitrack player
As with Fantasound, it was so expensive that there were only a few compatible cinemas and a short list of films produced
Cinemascope (1953)
Born in 1953
The system used an anamorphic lens to deform the image and convert a 4:3 film into a 16:9 format
The lens must be used both at recording & reproduction
It used 4 audio channels stored in 4 magnetic tracks on the film
[Figure: lens and magnetic audio tracks]
Dolby (1970)
In 1970 Ray Dolby introduced the Dolby-A noise reduction system
Stanley Kubrick's A Clockwork Orange was the first film to use it (1971)
In 1974 the Dolby Stereo system was introduced
The 4 channels are mixed into 2 optical bands
Later it became an ISO standard
Sensurround (1974)
This (defunct) system was an extension of the 4-channel system
And the precursor of the subwoofer channel
It added 4 very big speakers behind the screen
2 at each side
Each one driven by a 1 kW amplifier
Emitting only infrasound (below 20 Hz)
All of them controlled with 1 additional track
It was presented with the film: Earthquake
Digital Sound (1986 - )
Digital sound in cinemas was introduced in 1986
There are several standards:
Dolby Stereo Digital
Dolby Digital or Dolby SR-D
DTS
Digital Theater System
SDDS
Sony Dynamic Digital Sound
From the previous unit we already know how to process the sound
But the question is: how is the digital sound stored on the film?
Digital Sound
Digital sound is stored in optical format
[Figure: film strip showing the Dolby Digital, Dolby Stereo (analog), SDDS and DTS tracks]

Digital Sound
As you have seen in the previous slide, the sound is stored near the holes of the film
It looks like the QR codes used for smartphones
Dolby Digital is stored in a very bad place
The projector's film transport mechanism deteriorates the stored signal each time it is played
After two weeks of showing the film (or even less), the sound system will switch to the analog soundtrack due to the errors encountered when decoding Dolby Digital
And DTS
It's so simple, only a small number of dashes
DTS
DTS is a system very similar to Dolby Digital
It is supposed to produce better quality (with a higher bitrate)
But you must be well trained to notice it
The major difference is the storage mechanism
It does not store the sound on the film
It stores the sound on CDs (or DVD / Blu-ray)
On the film it only stores a time code to synchronize the player
