
AN INTRODUCTION TO
HUMAN SPATIAL HEARING

Richard O. Duda
CIPIC Interface Laboratory
UC Davis

http://phosphor.cipic.ucdavis.edu
October 12, 2000

OVERVIEW

Physics of sound
Acoustic cues for sound localization
  Azimuth
  Elevation
  Range
Head-related transfer functions (HRTFs)
Approaches to synthesizing spatial sound
Opportunities and challenges

SOUND PROPAGATION

[Figure: a sound source radiating to a listener. Labels: speed c, frequency f (Hz), wavelength λ.]

$c = f\lambda$
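As a quick numerical illustration of the relation c = fλ from the SOUND PROPAGATION slide, the sketch below computes wavelengths at a few audible frequencies. The speed of sound (343 m/s, roughly air at 20 °C) and the function name `wavelength` are assumptions made for this example; they do not come from the slides.

```python
SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def wavelength(frequency_hz, c=SPEED_OF_SOUND):
    """Return the wavelength in metres, from c = f * lambda."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return c / frequency_hz


if __name__ == "__main__":
    # Low, middle, and high ends of the audible band.
    for f in (20.0, 1000.0, 20000.0):
        print(f"{f:8.0f} Hz -> {wavelength(f):.4f} m")
```

Under these assumptions the audible band spans wavelengths from about 17 m at 20 Hz down to about 1.7 cm at 20 kHz, so the head is small compared with the wavelength at low frequencies and large compared with it at high frequencies.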

MULTIPATH PROPAGATION

Reflection
Refraction
Scattering

AXIOM I

The sound pressure at the two eardrums is a sufficient stimulus.
Producing the same sound pressure will produce the same auditory perception.

Caveats:
  Bone conduction
  Adaptation
  Conflicting visual cues
  Conflicting expectations

AXIOM II

Exact reproduction of the sound pressure is not necessary for producing the same auditory perception.
The limitations of neural responses allow different (and simpler) stimuli to produce the same response.

Examples:
  Bandwidth (20 Hz to 20 kHz)
  Amplitude (1-dB resolution)
  Monaural phase (2-ms resolution)
  Latency (10-ms resolution)
  Spectral fine structure (critical bands, Q = 8)

AXIOM III

Although it is not necessary to reproduce all of the cues exactly, conflicting cues degrade perception.
Key engineering challenge -- find the most cost-effective approximation.

VERTICAL-POLAR COORDINATES

[Figure: sound source located by azimuth θ and elevation φ. Labels: median plane, horizontal plane, plane of constant azimuth, cone of constant elevation.]

INTERAURAL-POLAR COORDINATES

[Figure: sound source located by azimuth θ and elevation φ. Labels: interaural axis, median plane, horizontal plane, cone of constant azimuth, plane of constant elevation.]

AZIMUTH CUES

ITD (Interaural Time Difference)
ILD (Interaural Level Difference)

[Figure: sound source and listener.]

WOODWORTH'S FORMULA

[Figure: spherical head of radius a with a sound source at angle θ. Labels: ipsilateral ear, contralateral ear, path lengths a sin θ and aθ.]

Ipsilateral ear: $\Delta T_{ips} = -\dfrac{a \sin\theta}{c}$
Contralateral ear: $\Delta T_{con} = \dfrac{a\,\theta}{c}$

$\mathrm{ITD} = \Delta T_{con} - \Delta T_{ips} = \dfrac{a}{c}\,(\theta + \sin\theta)$

ARRIVAL TIME

[Figure: arrival time (ms), from -0.4 to 0.5, versus angle of incidence (deg), from 0 to 400. Curves: Rayleigh's solution (20% rise time) and Woodworth's formula.]
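Woodworth's formula, ITD = (a/c)(θ + sin θ), is easy to evaluate numerically. The sketch below is only an illustration: the head radius of 0.0875 m and the speed of sound of 343 m/s are assumed typical values, the function names are hypothetical, and θ is restricted to the range 0 to π/2 on the assumption that it is measured from the median plane.

```python
import math

HEAD_RADIUS = 0.0875    # m; assumed typical adult value, not from the slides
SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def arrival_times(theta_rad, a=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Return (ipsilateral, contralateral) arrival times in seconds,
    relative to the arrival time at the head centre."""
    dt_ips = -a * math.sin(theta_rad) / c  # direct path is shorter by a*sin(theta)
    dt_con = a * theta_rad / c             # path is longer by the arc a*theta
    return dt_ips, dt_con


def woodworth_itd(theta_rad, a=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Return ITD = (a / c) * (theta + sin(theta)) in seconds."""
    if not 0.0 <= theta_rad <= math.pi / 2:
        raise ValueError("theta must lie in [0, pi/2] for this sketch")
    return (a / c) * (theta_rad + math.sin(theta_rad))


if __name__ == "__main__":
    for deg in (0, 30, 60, 90):
        itd_ms = 1000.0 * woodworth_itd(math.radians(deg))
        print(f"azimuth {deg:2d} deg -> ITD {itd_ms:.3f} ms")
```

With these assumed values the ITD grows from zero for a source straight ahead to roughly 0.66 ms for a source directly to one side.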

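ELEVATION CUES

Pinna reflections and resonances
Torso and shoulder reflections

[Figure: sound source at elevation φ.]

THE PINNA

[Figure: pinna nomenclature. Labels: helix, antihelix, scaphoid fossa, triangular fossa, cymba concha, crus helicis, cavum concha, external auditory meatus, antitragus, tragus, intertragal incisure, lobule.]

TORSO REFLECTION

[Figure: sound source and torso reflection. Labels: h, f_min, 90°.]

$\Delta T_T = \dfrac{2h}{c}$

[Figure: $|H(f)|$ versus f, with the frequency axis marked at $\dfrac{1}{2\Delta T_T}$, $\dfrac{3}{2\Delta T_T}$, $\dfrac{5}{2\Delta T_T}$, $\dfrac{7}{2\Delta T_T}$.]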
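The odd multiples of 1/(2ΔT_T) marked on the |H(f)| axis of the TORSO REFLECTION slide are what a single-echo model predicts. The short derivation below is a sketch under that assumption: the direct sound plus one torso reflection of relative amplitude ρ (with 0 < ρ < 1) arriving ΔT_T later. The symbol ρ and the index k are introduced here for illustration only.

```latex
% Direct sound plus one reflection delayed by \Delta T_T (assumed model)
H(f) = 1 + \rho\, e^{-j 2\pi f \Delta T_T}

% Squared magnitude
|H(f)|^2 = 1 + \rho^2 + 2\rho \cos\!\left(2\pi f \Delta T_T\right)

% Minima occur where the cosine equals -1
2\pi f_k \Delta T_T = (2k+1)\pi
\quad\Longrightarrow\quad
f_k = \frac{2k+1}{2\,\Delta T_T}, \qquad k = 0, 1, 2, \dots

% With \Delta T_T = 2h/c the lowest notch is
f_0 = \frac{1}{2\,\Delta T_T} = \frac{c}{4h}
```

Under this model the notch depth is set by ρ, since |H| falls to 1 − ρ at each f_k, while the notch spacing is set by the delay alone.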

PINNAE

[Figure: pinnae.]

PINNA PHENOMENA

Pinna reflections (Batteau)
Pinna resonances (Shaw)

[Figures: sound source and pinna; pinna resonance modes.]

RANGE CUES

Loudness (for familiar sources)
Excess ILD (for close sources)
Direct/reverberant (for distant sources)

HEAD-MOTION CUES AND FRONT/BACK CONFUSION

HEAD-MOTION CUES AND ELEVATION MAGNITUDE

[Figures: head of radius a and sound source. ITD values shown: $\mathrm{ITD} = \dfrac{2a}{c}$, $\mathrm{ITD} = \dfrac{2a\cos\phi}{c}$, $\mathrm{ITD} = 0$.]

OTHER CUES

Visual cues
  Synchronized motion
  Absence
Knowledge of source
Knowledge of environment

THE HEAD-RELATED TRANSFER FUNCTION

[Figure: sound source X(f), transfer functions H_L(f) and H_R(f), ear signals X_L(f) and X_R(f), free-field signal X_ff(f).]

$X(f)$ = Fourier transform of source pressure
$X_L(f)$ = Fourier transform of left ear pressure
$X_R(f)$ = Fourier transform of right ear pressure
$X_{ff}(f)$ = Free-field pressure at the origin

$X_L(f) = H_L(f)\,X_{ff}(f)$
$X_R(f) = H_R(f)\,X_{ff}(f)$

THE HEAD-RELATED IMPULSE RESPONSE

[Figure: sound source δ(t), impulse responses h_L(t) and h_R(t), ear signals x_L(t) and x_R(t).]

$x_L(t)$ = Left ear pressure
$x_R(t)$ = Right ear pressure
$x_{ff}(t)$ = Free-field pressure at the origin

$x_L(t) = \int_{-\infty}^{\infty} h_L(\tau)\,x_{ff}(t-\tau)\,d\tau$
$x_R(t) = \int_{-\infty}^{\infty} h_R(\tau)\,x_{ff}(t-\tau)\,d\tau$

FREE-FIELD RADIATION FROM A SPHERICAL SOURCE

[Figure: spherical sound source X(f) and free-field pressure X_ff(f). Labels: r0, r.]

$X(f)$ = Fourier transform of source pressure
$X_{ff}(f)$ = Free-field pressure at head center

$X_{ff} = H_{ff}\,X$
$H_{ff}(f) = \dfrac{r_0}{r}\,e^{-jkr}, \qquad k = \dfrac{2\pi f}{c}$

Inverse range: the factor $r_0/r$
Propagation delay: the factor $e^{-jkr}$
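The free-field transfer function H_ff(f) = (r0/r) e^{-jkr}, with k = 2πf/c, combines an inverse-range gain with a pure propagation delay of r/c. The sketch below evaluates it numerically. It treats r0 as a reference distance and uses c = 343 m/s; both are assumptions for illustration, and the function name is hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def free_field_transfer(f_hz, r_m, r0_m, c=SPEED_OF_SOUND):
    """Return H_ff(f) = (r0 / r) * exp(-j k r), with k = 2 pi f / c."""
    if r_m <= 0:
        raise ValueError("range r must be positive")
    k = 2.0 * math.pi * f_hz / c  # wavenumber in rad/m
    return (r0_m / r_m) * cmath.exp(-1j * k * r_m)


if __name__ == "__main__":
    f = 10.0  # Hz; low enough that the phase does not wrap at these ranges
    for r in (1.0, 2.0, 4.0):
        h = free_field_transfer(f, r, r0_m=0.1)
        delay_ms = -cmath.phase(h) / (2.0 * math.pi * f) * 1000.0
        print(f"r = {r:.1f} m: |H_ff| = {abs(h):.4f}, delay = {delay_ms:.3f} ms")
```

Doubling the range halves the magnitude (a 6 dB drop) and doubles the delay, which is the behavior the "inverse range" and "propagation delay" labels describe.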

HRIR SOUND SYNTHESIS

[Block diagram: the sound signal x(t) feeds two convolvers that use the head-related impulse responses h_L(t) and h_R(t), selected by azimuth θ, elevation φ, and range r, to produce x_L(t) and x_R(t) for a virtual source.]

A STRUCTURAL MODEL

[Block diagram: the sound signal x(t) passes through Room, Torso, Head, and Pinna blocks for each ear to produce x_L(t) and x_R(t) for a virtual source.]

COMPUTING HRTFs BY BOUNDARY ELEMENT METHODS

Digitize with a 3-D scanner
Solve wave equation numerically

* See Kahana et al.
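The convolver structure on the HRIR SOUND SYNTHESIS slide, in which one sound signal x(t) is convolved with h_L(t) and h_R(t) to give x_L(t) and x_R(t), can be sketched in discrete time as below. This is a minimal illustration, not the author's implementation: the function name is hypothetical, and the two short "HRIRs" in the demo are toy placeholders (a unit impulse and a delayed, attenuated impulse), not measured data.

```python
import numpy as np


def synthesize_binaural(x, h_left, h_right):
    """Convolve a mono signal with left and right HRIRs.

    Returns an array of shape (N, 2) whose columns are x_L and x_R.
    """
    x_l = np.convolve(x, h_left)   # full linear convolution
    x_r = np.convolve(x, h_right)
    n = max(len(x_l), len(x_r))
    out = np.zeros((n, 2))
    out[:len(x_l), 0] = x_l
    out[:len(x_r), 1] = x_r
    return out


if __name__ == "__main__":
    fs = 44100  # Hz; assumed sample rate
    rng = np.random.default_rng(0)
    x = rng.standard_normal(fs // 10)  # 100 ms of noise as a test signal

    # Toy placeholder responses: the right ear hears a later, weaker copy.
    h_left = np.zeros(64)
    h_left[0] = 1.0
    h_right = np.zeros(64)
    h_right[20] = 0.5

    y = synthesize_binaural(x, h_left, h_right)
    print(y.shape)
```

In a real system the pair of responses would be chosen (or interpolated) for the desired azimuth, elevation, and range, and for long signals an FFT-based convolution is usually preferred over direct convolution.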

ACOUSTIC HRTF MEASUREMENT

[Figure: measurement setup. Labels: azimuth, elevation φ, interaural axis.]

THE KEMAR ACOUSTIC MANIKIN

[Figure: the KEMAR manikin.]

KEMAR HRIR
Azimuth = -45°, Elevation = 0°

[Figure: left-ear and right-ear impulse responses versus time (ms).]

KEMAR HRTF
Azimuth = -45°, Elevation = 0°

[Figure: left-ear and right-ear response (dB) versus frequency (kHz).]

RIGHT-EAR HRTF FOR KEMAR
(Horizontal Plane)

[Figure: response (dB) versus frequency (Hz), 100 to 10000. Panels labeled by azimuth (0°, 90°, 180°, 270° or -90°), with FRONT and BACK marked.]

HRTF FOR KEMAR, NO PINNA
(Horizontal Plane)

[Figure: response (dB) versus frequency (Hz), 100 to 10000. Panels labeled by azimuth (0°, 90°, 180°, 270° or -90°), with FRONT and BACK marked.]

HRTF ELEVATION DEPENDENCE

[Figure: response (dB) as a function of frequency (kHz), 2 to 16, and elevation (deg).]

A PINNA ON A PLANE

[Figure: a pinna mounted on a plane.]

HRTF WITHOUT PINNA

[Figure: response (dB) as a function of frequency (kHz), 2 to 16, and elevation (deg).]

CONTRIBUTIONS TO THE HRTF

[Figure: three panels (Full HRTF, Head and torso, Pinna), each showing response (dB) as a function of frequency (kHz), 2 to 16, and elevation (deg).]

HRTF FOR ISOLATED PINNA

[Figure: response (dB) as a function of frequency (kHz), 2 to 16, and elevation (deg).]

A STRUCTURAL MODEL

[Block diagram: the sound signal x(t) passes through Room, Torso, Head, and Pinna blocks for each ear to produce x_L(t) and x_R(t) for a virtual source.]

THE SPHERICAL-HEAD MODEL

[Block diagram: head model with input x(t) and outputs x_L(t) and x_R(t). Blocks: delays $\Delta T_L(\theta)$ and $\Delta T_R(\theta)$, head-shadow filters $H_{HS,L}(\theta)$ and $H_{HS,R}(\theta)$.]

ASSESSING THE SPHERICAL HEAD MODEL

Only one parameter -- easily customized
Well focused
Good left/right position
No up/down control -- image elevated
Without a head tracker:
  Internalized
  Usually seems to be in back
With a head tracker:
  Moderately externalized
  Little front/back confusion

ELLIPSOIDAL-TORSO MODEL

[Figure: sound source and torso reflection, with a block diagram for input x(t) containing head models, the delay $\Delta T_T$, and the coefficient $\rho_T$.]

$\rho_T$ = torso reflection coefficient
$\Delta T_T$ = torso reflection delay

ASSESSING THE ELLIPSOIDAL TORSO MODEL

Only one component of a full model
Five parameters; still easily customized
Provides an elevation cue
Significant below 3 kHz
Ineffective in median plane

SIMPLIFIED PINNA MODEL

[Block diagram: pinna model. Labels: fixed-pole resonator, $\Delta T_P(\phi)$, $k_P(\phi)$.]

STRUCTURAL HRTF MODEL

[Block diagram: head model ($\Delta T_H(\theta)$, $H_{HS}(\theta)$), torso model ($\Delta T_T(\theta,\phi)$, $\rho_T$), and a pinna model for each ear (fixed-pole resonator, $\Delta T_P(\phi)$, $k_P(\phi)$).]
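The torso element of the structural model is described by two quantities, the torso reflection coefficient ρ_T and the torso reflection delay ΔT_T. One way to picture it is as a single echo added to the direct sound. The sketch below implements that picture in discrete time. It is an assumption-laden illustration rather than the model actually used in the slides: the function name and parameter values are hypothetical, and the delay is simply rounded to a whole number of samples.

```python
import numpy as np


def add_torso_reflection(x, delay_s, rho, fs):
    """Return y[n] = x[n] + rho * x[n - d], with d = round(delay_s * fs).

    x       : mono input signal (1-D array)
    delay_s : torso reflection delay in seconds (Delta T_T)
    rho     : torso reflection coefficient (rho_T)
    fs      : sample rate in Hz
    """
    x = np.asarray(x, dtype=float)
    d = int(round(delay_s * fs))
    if d < 0:
        raise ValueError("delay must be non-negative")
    y = np.zeros(len(x) + d)
    y[:len(x)] += x              # direct sound
    y[d:d + len(x)] += rho * x   # delayed, attenuated torso reflection
    return y


if __name__ == "__main__":
    fs = 44100       # Hz; assumed sample rate
    delay_s = 0.001  # 1 ms; illustrative value only
    rho = 0.3        # illustrative value only
    impulse = np.zeros(256)
    impulse[0] = 1.0
    y = add_torso_reflection(impulse, delay_s, rho, fs)
    print(np.flatnonzero(y))  # sample indices of the direct sound and the echo
```

In a fuller sketch the delay would vary with azimuth and elevation, as the label ΔT_T(θ, φ) indicates, and a fractional-delay filter would replace the rounding.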

SPATIAL SOUND SYSTEMS

Multichannel
Two-channel: headphones
Two-channel: crosstalk-canceled loudspeakers

[Figures: listener receiving x_L(t) and x_R(t) in each configuration.]

MULTICHANNEL SYSTEMS

Pros
  Works with a large audience
  No customization needed
  Conceptually simple
Cons
  Speakers must be distant
  Many channels needed for full 3-D
  Space consuming, expensive

TWO-CHANNEL: HEADPHONES

Pros
  Can reproduce full 3-D with only 2 channels
  Private and non-interfering
  Conceptually simple
Cons
  Uncomfortable for extended use
  Clumsy for a large audience
  Requires customization for full 3-D
  Difficult to achieve frontal externalization

TWO-CHANNEL: CROSSTALK-CANCELED LOUDSPEAKERS

Pros
  Can reproduce full 3-D with only 2 channels
  Unencumbered listening
Cons
  Small "sweet spot"
  Cannot be used with a large audience
  Requires customization for full 3-D
  Difficult to get near or rear locations

APPROACHES TO CUSTOMIZATION

Nearest-neighbor
Trial and error
Anthropometry

Measure exact HRTF for each person
  Acoustic
  Computational
Scale a standard HRTF
  Global
  Pinna/head/torso components
Use an adaptive model
  Match to anthropometry
  Match to exact HRTF

CHALLENGES AND OPPORTUNITIES

Inverse HRTFs
Frequency range (combining partial HRTFs)
Elevation perception
Front/back confusion
Low elevations
Range perception
Headphones: externalization
  Median plane
  Frontal
Speakers: back locations
Transducers
  Headphone compensation
  Loudspeaker "sweet spot"
Latency in dynamic systems
Room acoustics
