Durai Slides PDF

AN INTRODUCTION TO HUMAN SPATIAL HEARING
Richard O. Duda
CIPIC Interface Laboratory
UC Davis
http://phosphor.cipic.ucdavis.edu
October 12, 2000

OVERVIEW
Physics of sound
Acoustic cues for sound localization
  Azimuth
  Elevation
  Range
Head-related transfer functions (HRTFs)
Approaches to synthesizing spatial sound
Opportunities and challenges

SOUND PROPAGATION
[Figure: sound source and listener; speed $c$, frequency $f$ (Hz), wavelength $\lambda$]
$c = f \lambda$
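As a worked instance of $c = f \lambda$, the short sketch below computes the wavelength at the two ends of the audible band and at 1 kHz. The speed of sound of 343 m/s (air at roughly 20 °C) is an assumption that does not appear on the slide, and the function name `wavelength` is only illustrative.

```python
def wavelength(frequency_hz: float, speed_m_s: float = 343.0) -> float:
    """Return the wavelength in metres from c = f * lambda.

    speed_m_s defaults to 343 m/s, an assumed value for air at about 20 C.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_s / frequency_hz


# Wavelengths span roughly three orders of magnitude over the audible band.
for f in (20.0, 1000.0, 20000.0):
    print(f"{f:8.0f} Hz -> {wavelength(f):.4f} m")
```

At 1 kHz the wavelength is about a third of a metre, so it is comparable to the size of the head. This is why the head, torso, and pinna affect different frequency ranges differently.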
MULTIPATH PROPAGATION
Reflection
Refraction
Scattering

AXIOM I
The sound pressure at the two ear drums is a sufficient stimulus.
Producing the same sound pressure will produce the same auditory perception.
Caveats:
  Bone conduction
  Adaptation
  Conflicting visual cues
  Conflicting expectations

AXIOM II
Exact reproduction of the sound pressure is not necessary for producing the same auditory perception.
The limitations of neural responses allow different (and simpler) stimuli to produce the same response.
Examples:
  Bandwidth (20 Hz to 20 kHz)
  Amplitude (1-dB resolution)
  Monaural phase (2-ms resolution)
  Latency (10-ms resolution)
  Spectral fine structure (critical bands, Q = 8)
AXIOM III
Although it is not necessary to reproduce all of the cues exactly, conflicting cues degrade perception.
Key engineering challenge -- find the most cost-effective approximation.

VERTICAL-POLAR COORDINATES
[Figure: sound source, median plane, horizontal plane, azimuth $\theta$, elevation $\phi$; plane of constant azimuth, cone of constant elevation]

INTERAURAL-POLAR COORDINATES
[Figure: sound source, interaural axis, median plane, horizontal plane, azimuth $\theta$, elevation $\phi$; cone of constant azimuth, plane of constant elevation]
AZIMUTH CUES
ITD (Interaural Time Difference)
ILD (Interaural Level Difference)

WOODWORTH'S FORMULA
[Figure: sound source at angle $\theta$, head of radius $a$, ipsilateral ear, contralateral ear]
$\Delta T_{con} = \frac{a\theta}{c}$
$\Delta T_{ips} = -\frac{a \sin\theta}{c}$
$ITD = \frac{a}{c}(\theta + \sin\theta)$

ARRIVAL TIME
[Figure: arrival time (ms) versus angle of incidence (deg), 0 to 400 deg; curves: Rayleigh's solution (20% rise time) and Woodworth's formula]
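Woodworth's formula can be evaluated directly as the difference of the two arrival-time offsets. The sketch below does so under stated assumptions: the head radius of 0.0875 m and the speed of sound of 343 m/s are illustrative values that are not on the slide, and the expression is applied only for angles between 0 and 90 degrees, where the ray geometry of the figure holds.

```python
import math


def woodworth_itd(theta_rad: float, a: float = 0.0875, c: float = 343.0) -> float:
    """Interaural time difference (seconds) from Woodworth's formula.

    theta_rad: angle of incidence in radians, expected in [0, pi/2].
    a: head radius in metres (assumed value, not from the slides).
    c: speed of sound in m/s (assumed value, not from the slides).
    """
    if not 0.0 <= theta_rad <= math.pi / 2:
        raise ValueError("this sketch covers 0 <= theta <= pi/2 only")
    dt_con = a * theta_rad / c             # contralateral ear: path wraps around the head
    dt_ips = -a * math.sin(theta_rad) / c  # ipsilateral ear: sound arrives early
    return dt_con - dt_ips                 # equals (a / c) * (theta + sin(theta))


for deg in (0, 30, 60, 90):
    itd_ms = 1000.0 * woodworth_itd(math.radians(deg))
    print(f"{deg:3d} deg -> ITD = {itd_ms:.3f} ms")
```

With these assumed values the ITD grows from zero straight ahead to roughly 0.66 ms for a source directly to the side.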
ELEVATION CUES
Pinna reflections and resonances
Torso and shoulder reflections
[Figure: sound source at elevation $\phi$; frequency label $f_{min}$]

TORSO REFLECTION
[Figure: sound source at 90°, ear at height $h$; torso reflection delay $\Delta T_T$]
$\Delta T_T = \frac{2h}{c}$
[Figure: $|H(f)|$ versus $f$, with frequencies marked at $\frac{1}{2\Delta T_T}$, $\frac{3}{2\Delta T_T}$, $\frac{5}{2\Delta T_T}$, $\frac{7}{2\Delta T_T}$]

THE PINNA
[Figure: pinna nomenclature: helix, antihelix, scaphoid fossa, triangular fossa, cymba concha, crus helias, cavum concha, external auditory meatus, antitragus, tragus, intertragal incisure, lobule]

PINNAE
[Figure: examples of pinnae]

PINNA PHENOMENA
Pinna reflections (Batteau)
Pinna resonances (Shaw)
[Figure: sound source and pinna response modes]

RANGE CUES
Loudness (for familiar sources)
Excess ILD (for close sources)
Direct/reverberant ratio (for distant sources)
HEAD-MOTION CUES AND FRONT/BACK CONFUSION
[Figure: head of radius $a$ with ITD annotations, including ITD = 0]

HEAD-MOTION CUES AND ELEVATION MAGNITUDE
[Figure: head with annotations $2a$, $a \cos\phi$, $c$]

OTHER CUES
Visual cues
  Synchronized motion
  Absence
Knowledge of source
Knowledge of environment

FREE-FIELD RADIATION FROM A SPHERICAL SOURCE
[Figure: spherical source of radius $r_0$ with pressure $X(f)$; free-field pressure $X_{ff}(f)$ at distance $r$]
$X_{ff} = H_{ff} X$
$H_{ff}(f) = \frac{r_0}{r} e^{-jkr}, \quad k = \frac{2\pi f}{c}$
The factor $r_0/r$ is the inverse range term; the factor $e^{-jkr}$ is the propagation delay.

THE HEAD-RELATED TRANSFER FUNCTION
[Figure: sound source $X(f)$; transfer functions $H_L(f)$ and $H_R(f)$ to the ear pressures $X_L(f)$ and $X_R(f)$]
$X(f)$ = Fourier transform of source pressure
$X_L(f)$ = Fourier transform of left ear pressure
$X_R(f)$ = Fourier transform of right ear pressure
$X_{ff}(f)$ = Free-field pressure at head center (the origin)
$X_L(f) = H_L(f) X_{ff}(f)$
$X_R(f) = H_R(f) X_{ff}(f)$

THE HEAD-RELATED IMPULSE RESPONSE
[Figure: sound source $\delta(t)$; impulse responses $h_L(t)$ and $h_R(t)$ to the ear pressures $x_L(t)$ and $x_R(t)$]
$x_L(t)$ = Left ear pressure
$x_R(t)$ = Right ear pressure
$x_{ff}(t)$ = Free-field pressure at the origin
$x_L(t) = \int_{-\infty}^{\infty} h_L(\tau) x_{ff}(t - \tau) d\tau$
$x_R(t) = \int_{-\infty}^{\infty} h_R(\tau) x_{ff}(t - \tau) d\tau$

HRIR SOUND SYNTHESIS
[Block diagram: sound signal $x(t)$ feeds two convolvers loaded with the head-related impulse responses $h_L(t)$ and $h_R(t)$, selected by azimuth $\theta$, elevation $\phi$, and range $r$; outputs $x_L(t)$ and $x_R(t)$ place a virtual source]

A STRUCTURAL MODEL
[Block diagram: sound signal $x(t)$ passes through room, torso, head, and pinna blocks for each ear; outputs $x_L(t)$ and $x_R(t)$ place a virtual source]

COMPUTING HRTFs BY BOUNDARY ELEMENT METHODS
Digitize with a 3-D scanner
Solve wave equation numerically
* See Kahana et al.
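The convolver stage of the HRIR SOUND SYNTHESIS slide is the discrete counterpart of the convolution integrals defining $x_L(t)$ and $x_R(t)$. The sketch below uses plain Python lists and a direct-form convolution to keep it dependency-free. The function names and the toy impulse responses are hypothetical stand-ins, not measured HRIRs. A practical system would more likely use FFT-based convolution and switch HRIRs as the source or head moves.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution: y[n] = sum_k h[k] * x[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += h * x
    return out


def render_binaural(x, hrir_left, hrir_right):
    """Filter one mono signal with a left/right HRIR pair."""
    return convolve(x, hrir_left), convolve(x, hrir_right)


# Toy HRIRs (made-up numbers, not measured data): the right ear receives
# the sound two samples later and at half the amplitude, a crude
# ITD-plus-ILD cue for a source on the listener's left.
hrir_l = [1.0, 0.0, 0.0]
hrir_r = [0.0, 0.0, 0.5]
x = [1.0, 2.0, 3.0]

x_l, x_r = render_binaural(x, hrir_l, hrir_r)
print(x_l)  # [1.0, 2.0, 3.0, 0.0, 0.0]
print(x_r)  # [0.0, 0.0, 0.5, 1.0, 1.5]
```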
ACOUSTIC HRTF MEASUREMENT
[Figure: measurement hoop; azimuth $\theta$, elevation $\phi$, interaural axis]

THE KEMAR ACOUSTIC MANIKIN
[Figure: KEMAR manikin]

KEMAR HRIR
Azimuth = -45°, Elevation = 0°
[Figure: left-ear and right-ear impulse responses versus time (ms)]

KEMAR HRTF
Azimuth = -45°, Elevation = 0°
[Figure: left-ear and right-ear response (dB) versus frequency (kHz)]

RIGHT-EAR HRTF FOR KEMAR (Horizontal Plane)
[Figure: response (dB) versus frequency (Hz), 100 to 10000 Hz; panels: FRONT (azimuth = 0°), azimuth = 90°, BACK (azimuth = 180°), azimuth = -90° (270°)]

HRTF FOR KEMAR, NO PINNA (Horizontal Plane)
[Figure: response (dB) versus frequency (Hz), 100 to 10000 Hz; panels: FRONT (azimuth = 0°), azimuth = 90°, BACK (azimuth = 180°), azimuth = 270°]

HRTF ELEVATION DEPENDENCE
[Figure: response (dB) versus frequency (2 to 16 kHz) and elevation (deg)]

HRTF WITHOUT PINNA
[Figure: response (dB) versus frequency (2 to 16 kHz) and elevation (deg)]

A PINNA ON A PLANE
[Figure: isolated pinna mounted on a plane]
HRTF FOR ISOLATED PINNA
[Figure: response (dB) versus frequency (2 to 16 kHz) and elevation (deg)]

CONTRIBUTIONS TO THE HRTF
[Figure: three panels, Full HRTF, Head and torso, Pinna; each shows response (dB) versus frequency (2 to 16 kHz) and elevation (deg)]

A STRUCTURAL MODEL
[Block diagram: sound signal $x(t)$ passes through room, torso, head, and pinna blocks for each ear; outputs $x_L(t)$ and $x_R(t)$ place a virtual source]

THE SPHERICAL-HEAD MODEL
[Block diagram: $x(t)$ feeds a head model with delays $\Delta T_L(\theta)$, $\Delta T_R(\theta)$ and head-shadow filters $H_{HsL}(\theta)$, $H_{HsR}(\theta)$; outputs $x_L(t)$ and $x_R(t)$]

ASSESSING THE SPHERICAL HEAD MODEL
Only one parameter -- easily customized
Well focused
Good left/right position
No up/down control -- image elevated
With a head tracker:
  Moderately externalized
  Little front/back confusion
Without a head tracker:
  Internalized
  Usually seems to be in back

ELLIPSOIDAL-TORSO MODEL
[Figure: sound source, head model, and torso reflection path with $\rho_T$ and $\Delta T_T$]
$\rho_T$ = torso reflection coefficient
$\Delta T_T$ = torso reflection delay

ASSESSING THE ELLIPSOIDAL TORSO MODEL
Only one component of a full model
Five parameters; still easily customized
Provides an elevation cue
Significant below 3 kHz
Ineffective in median plane

STRUCTURAL HRTF MODEL
[Block diagram: head model with $\Delta T_H(\theta)$ and $H_{HS}(\theta)$; torso model with $\Delta T_T(\theta, \phi)$ and $\rho_T$; pinna model]

SIMPLIFIED PINNA MODEL
[Block diagram: for each ear, a head model followed by a pinna model with delay $\Delta T_P(\phi)$, gain $k_P(\phi)$, and a fixed-pole resonator]
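The torso model's two quantities, the reflection coefficient $\rho_T$ and the reflection delay $\Delta T_T$, can be read as a single echo added to the direct sound. That interpretation is an assumption of this sketch, not a statement of the authors' implementation. Under it, the magnitude response is a comb whose minima fall at odd multiples of $1/(2\Delta T_T)$ for a positive coefficient, which matches the frequencies marked on the $|H(f)|$ plot of the torso reflection slide. The function names, the integer-sample delay, and the numeric values below are illustrative only.

```python
def add_torso_reflection(x, rho_t, delay_samples):
    """Return y[n] = x[n] + rho_t * x[n - delay_samples] (one echo)."""
    if delay_samples < 0:
        raise ValueError("delay must be non-negative")
    y = list(x) + [0.0] * delay_samples
    for n, sample in enumerate(x):
        y[n + delay_samples] += rho_t * sample
    return y


def notch_frequencies(delay_s, f_max_hz):
    """Frequencies (2n + 1) / (2 * delay_s) up to f_max_hz.

    For rho_t > 0 the echo arrives in antiphase at these frequencies,
    so the summed response has its minima there.
    """
    notches = []
    n = 0
    while (2 * n + 1) / (2.0 * delay_s) <= f_max_hz:
        notches.append((2 * n + 1) / (2.0 * delay_s))
        n += 1
    return notches


# Illustrative numbers only: an impulse with a half-amplitude echo two
# samples later, and the notches of a 1 ms reflection delay below 2 kHz.
print(add_torso_reflection([1.0, 0.0, 0.0], 0.5, 2))
print(notch_frequencies(0.001, 2000.0))
```

A delay that depends on direction, such as $\Delta T_T(\theta, \phi)$ in the structural model, would shift these notches as the source moves, which is how the torso can provide an elevation cue.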
SPATIAL SOUND SYSTEMS
Multichannel
Two-channel: headphones
Two-channel: crosstalk-canceled loudspeakers

MULTICHANNEL SYSTEMS
Pros
  Works with a large audience
  No customization needed
  Conceptually simple
Cons
  Speakers must be distant
  Many channels needed for full 3-D
  Space consuming, expensive

TWO-CHANNEL: HEADPHONES
[Figure: listener with ear signals $x_L(t)$ and $x_R(t)$]
Pros
  Can reproduce full 3-D with only 2 channels
  Private and non-interfering
  Conceptually simple
Cons
  Uncomfortable for extended use
  Clumsy for a large audience
  Requires customization for full 3-D
  Difficult to achieve frontal externalization

TWO-CHANNEL: CROSSTALK-CANCELED LOUDSPEAKERS
[Figure: listener with ear signals $x_L(t)$ and $x_R(t)$]
Pros
  Can reproduce full 3-D with only 2 channels
  Unencumbered listening
Cons
  Small "sweet spot"
  Cannot be used with a large audience
  Requires customization for full 3-D
  Difficult to get near or rear locations

APPROACHES TO CUSTOMIZATION
Nearest-neighbor
Trial and error
Anthropometry
Measure exact HRTF for each person
  Acoustic
  Computational
Scale a standard HRTF
  Global
  Pinna/head/torso components
Use an adaptive model
  Match to anthropometry
  Match to exact HRTF

CHALLENGES AND OPPORTUNITIES
Inverse HRTFs
Frequency range (combining partial HRTFs)
Elevation perception
Front/back confusion
Low elevations
Range perception
Headphones: externalization
Median plane
Frontal
Speakers: back locations
Transducers
Headphone compensation
Loudspeaker "sweet spot"
Latency in dynamic systems
Room acoustics
