AN INTRODUCTION TO
HUMAN SPATIAL HEARING
Richard O. Duda
CIPIC Interface Laboratory
UC Davis
http://phosphor.cipic.ucdavis.edu
October 12, 2000

OVERVIEW
[Figure: sound source and listener]

Physics of sound
  speed $c$
  frequency $f$ (Hz)
  wavelength $\lambda$
  $c = f\lambda$
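As a worked example of $c = f\lambda$, the sketch below solves for the wavelength at a few frequencies. The speed of sound (343 m/s, roughly air at room temperature) and the function name `wavelength_m` are assumptions for illustration; the slides do not give a value for $c$.

```python
# Assumed speed of sound in air at room temperature (not stated on the slide).
SPEED_OF_SOUND_M_PER_S = 343.0


def wavelength_m(frequency_hz, speed=SPEED_OF_SOUND_M_PER_S):
    """Wavelength from c = f * lambda, i.e. lambda = c / f."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed / frequency_hz


# Print the wavelength at the edges and middle of the audible band.
for f in (20.0, 1000.0, 20000.0):
    print(f"{f:8.0f} Hz -> {wavelength_m(f):.4f} m")
```

Under this assumption the audible band spans wavelengths from several meters down to under two centimeters, which is the range over which the head, torso, and pinna become acoustically significant.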
MULTIPATH PROPAGATION
  Reflection
  Refraction
  Scattering

AXIOM I
Caveats:
  Bone conduction
  Adaptation
  Conflicting visual cues
  Conflicting expectations

AXIOM II
Exact reproduction of the sound pressure is not necessary for producing the same auditory perception.
Examples:
  Bandwidth (20 Hz to 20 kHz)
  Amplitude (1-dB resolution)
  Monaural phase (2-ms resolution)
  Latency (10-ms resolution)
  Spectral fine structure (critical bands, Q = 8)
AXIOM III
Although it is not necessary to reproduce all of the cues exactly, conflicting cues degrade perception.

VERTICAL-POLAR COORDINATES
[Figure: sound source at azimuth $\theta$ and elevation $\phi$; median plane, horizontal plane, plane of constant azimuth, cone of constant elevation]

INTERAURAL-POLAR COORDINATES
[Figure: sound source at azimuth $\theta$ and elevation $\phi$; interaural axis, median plane, horizontal plane, cone of constant azimuth, plane of constant elevation]
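AZIMUTH CUES
[Figure: sound source and listener]
  ITD (interaural time difference)
  ILD (interaural level difference)

WOODWORTH'S FORMULA
[Figure: sound source at azimuth $\theta$ from a head of radius $a$; path-length differences $a\theta$ to the contralateral ear and $a\sin\theta$ to the ipsilateral ear]
$\mathrm{ITD} = \frac{a}{c}\,(\theta + \sin\theta)$

ARRIVAL TIME
  Ipsilateral ear: $\Delta T_{ips} = -\frac{a\sin\theta}{c}$
  Contralateral ear: $\Delta T_{con} = \frac{a\theta}{c}$
[Plot: arrival-time curves; horizontal ticks 0 to 400, vertical ticks -0.4 to 0.5; axis names not recovered]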
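The two arrival times combine into Woodworth's formula: ITD is the contralateral delay minus the ipsilateral delay, $\frac{a\theta}{c} - \left(-\frac{a\sin\theta}{c}\right) = \frac{a}{c}(\theta + \sin\theta)$. The sketch below checks this numerically. The head radius (8.75 cm), the speed of sound (343 m/s), the restriction to $0 \le \theta \le \pi/2$, and the function names are assumptions for illustration, not values taken from the slides.

```python
import math

HEAD_RADIUS_M = 0.0875           # assumed head radius, not given on the slide
SPEED_OF_SOUND_M_PER_S = 343.0   # assumed speed of sound in air


def arrival_times(theta_rad, a=HEAD_RADIUS_M, c=SPEED_OF_SOUND_M_PER_S):
    """Return (dt_ips, dt_con) in seconds for an azimuth in [0, pi/2].

    Times are relative to arrival at the head center, following the
    ipsilateral/contralateral expressions on the ARRIVAL TIME slide.
    """
    if not 0.0 <= theta_rad <= math.pi / 2 + 1e-12:
        raise ValueError("this sketch assumes 0 <= theta <= pi/2")
    dt_ips = -a * math.sin(theta_rad) / c   # near ear hears the sound early
    dt_con = a * theta_rad / c              # far ear: wave travels around the head
    return dt_ips, dt_con


def woodworth_itd(theta_rad, a=HEAD_RADIUS_M, c=SPEED_OF_SOUND_M_PER_S):
    """ITD = dt_con - dt_ips = (a / c) * (theta + sin(theta))."""
    dt_ips, dt_con = arrival_times(theta_rad, a, c)
    return dt_con - dt_ips


for deg in (0, 30, 60, 90):
    itd_ms = 1e3 * woodworth_itd(math.radians(deg))
    print(f"azimuth {deg:2d} deg -> ITD {itd_ms:.3f} ms")
```

With these assumed values the ITD grows from zero straight ahead to a bit under 0.7 ms at 90 degrees.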
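ELEVATION CUES
[Figure: sound source at elevation $\phi$; other labels: $f_{min}$, 90°]

THE PINNA
[Figure: pinna nomenclature]
  Helix
  Antihelix
  Scaphoid fossa
  Triangular fossa
  Cymba concha
  Crus helias
  Cavum concha
  External auditory meatus
  Antitragus
  Tragus
  Intertragal incisure
  Lobule

TORSO REFLECTION
[Figure: sound source above the torso; labels: $h$, $\Delta T_T$, $\frac{2h}{c}$]
[Plot: $|H(f)|$ versus $f$, with marked frequencies $\frac{1}{2\Delta T_T}$, $\frac{3}{2\Delta T_T}$, $\frac{5}{2\Delta T_T}$, $\frac{7}{2\Delta T_T}$]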
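The frequencies marked on the torso-reflection plot, odd multiples of $\frac{1}{2\Delta T_T}$, are where a single delayed echo cancels the direct sound. The sketch below assumes the simplest such model, $H(f) = 1 + \rho\, e^{-j 2\pi f \Delta T_T}$ with a positive reflection coefficient $\rho$. This model, the symbol $\rho$, the example delay of 1 ms, and the function names are assumptions for illustration rather than the slide's stated formulation.

```python
import cmath
import math


def echo_response(freq_hz, delay_s, rho):
    """Frequency response of direct sound plus one echo: 1 + rho * exp(-j 2 pi f dT)."""
    return 1.0 + rho * cmath.exp(-2j * math.pi * freq_hz * delay_s)


def notch_frequencies(delay_s, count):
    """First `count` notch frequencies: odd multiples of 1 / (2 * delay)."""
    if delay_s <= 0:
        raise ValueError("delay must be positive")
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]


delay = 1.0e-3   # hypothetical torso-reflection delay in seconds
rho = 0.5        # hypothetical reflection coefficient

# At each notch the echo arrives in antiphase, so |H| drops to 1 - rho.
for f in notch_frequencies(delay, 4):
    print(f"{f:7.1f} Hz  |H| = {abs(echo_response(f, delay, rho)):.3f}")
```

In this model a longer delay moves every notch down in frequency, so the notch pattern changes whenever the reflection delay changes.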
PINNAE
[Figure]

PINNA PHENOMENA
[Figure]

RANGE CUES
[Figure: sound source]
  Loudness
  Excess ILD
  Direct/reverberant

OTHER CUES
  Visual cues
  Synchronized motion
  Absence
  Knowledge of source
  Knowledge of environment

[Figure: dynamic cues; labels: $a$, $2a$, $c$, ITD = 0; the other ITD expressions were not recovered]
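[Figure: dynamic cues, continued; labels: $2a$, $a\cos\phi$, $c$]

THE HEAD-RELATED TRANSFER FUNCTION
Free-field reference: $X_{ff} = H_{ff}\,X$, with
$H_{ff}(f) = \frac{r_0}{r}\, e^{-jkr}$, $k = \frac{2\pi f}{c}$
  $\frac{r_0}{r}$: inverse range
  $e^{-jkr}$: propagation delay
[Figure: sound source with spectrum $X(f)$; free-field spectrum $X_{ff}(f)$; ear spectra $X_L(f)$ and $X_R(f)$]
$H_L(f) = \frac{X_L(f)}{X_{ff}(f)}$, $H_R(f) = \frac{X_R(f)}{X_{ff}(f)}$

THE HEAD-RELATED IMPULSE RESPONSE
[Figure: sound source emitting an impulse $\delta(t)$; ear responses $h_L(t)$ and $h_R(t)$]
$x_L(t) = \int_{-\infty}^{\infty} h_L(\tau)\, x_{ff}(t-\tau)\, d\tau$
$x_R(t) = \int_{-\infty}^{\infty} h_R(\tau)\, x_{ff}(t-\tau)\, d\tau$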
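The free-field reference $H_{ff}(f) = \frac{r_0}{r} e^{-jkr}$ used in the HRTF definition can be evaluated directly. The sketch below shows that the magnitude carries the inverse-range factor and the phase carries the propagation delay $r/c$. The reference distance $r_0 = 1$ m, the speed of sound (343 m/s), and the function name `free_field_tf` are assumptions for illustration.

```python
import cmath
import math


def free_field_tf(freq_hz, r_m, r0_m=1.0, c=343.0):
    """H_ff(f) = (r0 / r) * exp(-j k r), with wavenumber k = 2 pi f / c."""
    if r_m <= 0:
        raise ValueError("range must be positive")
    k = 2.0 * math.pi * freq_hz / c
    return (r0_m / r_m) * cmath.exp(-1j * k * r_m)


h_near = free_field_tf(1000.0, 1.0)
h_far = free_field_tf(1000.0, 2.0)

# Inverse range: doubling r halves the magnitude (about -6 dB).
level_change_db = 20.0 * math.log10(abs(h_far) / abs(h_near))
print(f"doubling the range changes the level by {level_change_db:.2f} dB")

# Propagation delay: at a low frequency the phase is -2 pi f (r / c), unwrapped.
h_low = free_field_tf(100.0, 1.0)
delay_s = -cmath.phase(h_low) / (2.0 * math.pi * 100.0)
print(f"delay recovered from the phase: {1e3 * delay_s:.3f} ms")
```

Dividing the ear spectra by this reference, as in $H_L = X_L / X_{ff}$, removes both the range attenuation and the bulk propagation delay from the HRTF.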
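A STRUCTURAL MODEL
[Figure: block diagram from the sound signal $x(t)$ to $x_L(t)$ and $x_R(t)$ for a virtual source; blocks for each ear: Head, Torso, Room, Pinna]

COMPUTING HRTFs BY BOUNDARY ELEMENT METHODS
[Figure]

[Figure: synthesis of a virtual source. The sound signal $x(t)$ feeds two convolvers whose head-related impulse responses $h_L(t)$ and $h_R(t)$ are selected by azimuth $\theta$, elevation $\phi$, and range $r$; the outputs are $x_L(t)$ and $x_R(t)$.]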
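The convolver stage of the synthesis diagram is the discrete-time form of the HRIR integral, $x_L[n] = \sum_m h_L[m]\, x[n-m]$, and likewise for the right ear. The sketch below is a minimal pure-Python version. The function names and the toy impulse responses are hypothetical placeholders, not measured HRIRs.

```python
def convolve(x, h):
    """Full linear convolution y[n] = sum over m of h[m] * x[n - m]."""
    if not x or not h:
        raise ValueError("both sequences must be non-empty")
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for m, hm in enumerate(h):
            y[n + m] += xn * hm
    return y


def binaural_synthesis(x, h_left, h_right):
    """Filter one mono signal with a left/right HRIR pair."""
    return convolve(x, h_left), convolve(x, h_right)


# Toy impulse responses (hypothetical): the left ear receives a delayed,
# attenuated copy, loosely mimicking a source on the listener's right.
h_left = [0.0, 0.0, 0.5]
h_right = [1.0]

x = [1.0, 2.0, 3.0]
x_left, x_right = binaural_synthesis(x, h_left, h_right)
print("left :", x_left)
print("right:", x_right)
```

In a real system the HRIR pair would be looked up or interpolated for the desired azimuth, elevation, and range, and a long signal would normally be filtered with an FFT-based method rather than this direct double loop.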
ACOUSTIC HRTF MEASUREMENT
[Figure: measurement geometry; labels: azimuth, elevation $\phi$, interaural axis]

THE KEMAR ACOUSTIC MANIKIN
[Figure]

KEMAR HRIR
Azimuth = -45°, Elevation = 0°
[Plot: left-ear and right-ear impulse responses versus Time (ms)]

KEMAR HRTF
[Plot: left-ear and right-ear Response (dB) versus Frequency (kHz)]
[Plots (ubc_ke_freq.ai, ubc_ke_np_freq.ai): Response (dB) versus Frequency (Hz), 100 to 10000 Hz; panel labels: FRONT, BACK, AZIMUTH = 0°, 90°, 180°, 270°, -90°]
[Plot (umd00_full_HRTF.ai): dB versus Frequency (kHz), 2 to 16, and Elevation (deg)]

A PINNA ON A PLANE
[Figure]
[Plot (umd00_HRTF_nopinna.ai): dB versus Frequency (kHz), 2 to 16, and Elevation (deg)]
A STRUCTURAL MODEL
[Plots: Full HRTF and Pinna contributions; dB versus Frequency (kHz), 2 to 16, and Elevation (deg)]
[Figure: block diagram for a virtual source; blocks for each ear: Head, Torso, Room, Pinna; outputs $x_L(t)$ and $x_R(t)$]
ASSESSING THE SPHERICAL HEAD MODEL
[Figure: head model from $x(t)$ to the two ears; blocks: $\Delta T_L(\theta)$, $\Delta T_R(\theta)$, $H_{HS,L}(\theta)$, $H_{HS,R}(\theta)$; output $x_R(t)$ shown]
  Well focused
  Good left/right position

ELLIPSOIDAL-TORSO MODEL
[Figure: sound source with torso reflection; labels: $\Delta T_T$, $r_T$]
[Figures: structural model block diagrams; blocks: Head Model, Torso Model, Pinna Model, fixed-pole resonator; parameters: $\Delta T_H(\theta)$, $H_{HS}(\theta)$, $\Delta T_T(\theta,\phi)$, $r_T$, $\Delta T_P(\phi)$, $k_P(\phi)$]
MULTICHANNEL SYSTEMS
[Figure: the three system types: multichannel; two-channel, headphones; two-channel, crosstalk-canceled loudspeakers]
Pros and Cons: [text not recovered]

TWO-CHANNEL: HEADPHONES
[Figure: ear signals $x_L(t)$ and $x_R(t)$]
Pros
  Can reproduce full 3-D with only 2 channels
  Private and non-interfering
  Conceptually simple
Cons
  [text not recovered]

TWO-CHANNEL: CROSSTALK-CANCELED LOUDSPEAKERS
[Figure: ear signals $x_L(t)$ and $x_R(t)$]
Pros and Cons: [text not recovered]
APPROACHES TO CUSTOMIZATION
  Nearest-neighbor
  Trial and error
  Anthropometry

CHALLENGES AND OPPORTUNITIES
  Inverse HRTFs
  Frequency range (combining partial HRTFs)
  Elevation perception
  Front/back confusion
  Low elevations
  Range perception
  Headphones: externalization
  Median plane
  Frontal
  Speakers: back locations
  Transducers
  Headphone compensation
  Loudspeaker "sweet spot"
  Room acoustics