
CS4347 Tutorial

Sound Analysis and Feature Extraction
TA: Zhou Yinsheng
yzhou86@comp.nus.edu.sg

Outline
Music transcription
Pitch detection
MFCC
Project Consultation
MUSIC TRANSCRIPTION
A Live Demo
Question 1
Can you transcribe it in real-time?
Music Transcription
Reverse engineering of music recording
Result is flexible
Automatic Music Transcription
Onset Detection (next week)
Pitch Estimation
http://www.youtube.com/results?search_query=automatic+music+transcription&aq=f
High Level Music Features
Rhythm
Onset detection
Beat/tempo estimation
Pitch
Timbre
Harmony
Melody
Lyrics
Etc
PITCH DETECTION
Pitch Detection
Zero Crossing Rate
Autocorrelation
Average Magnitude Difference Function
(similar to autocorrelation)
FFT
*Cepstrum Analysis
*Auditory Model Based
Zero Crossing
Zero Crossing Rate
$$\mathrm{zcr} = \frac{1}{T-1} \sum_{t=1}^{T-1} \mathbb{1}\{ s_t \, s_{t-1} < 0 \}$$
where $s$ is a signal of length $T$ and the indicator function $\mathbb{1}\{A\}$ is 1 if its argument $A$ is true and 0 otherwise
Useful for single source scenario
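A minimal Python sketch of the zero crossing rate, following the formula directly. The helper `zcr_pitch_estimate` is a hypothetical addition, not from the slides: it assumes a single clean tone that crosses zero twice per period, so it will be unreliable on noisy or polyphonic input.

```python
import math

def zero_crossing_rate(s):
    """Fraction of adjacent sample pairs whose product is negative."""
    T = len(s)
    if T < 2:
        raise ValueError("need at least two samples")
    crossings = sum(1 for t in range(1, T) if s[t] * s[t - 1] < 0)
    return crossings / (T - 1)

def zcr_pitch_estimate(s, sample_rate):
    """Rough pitch in Hz (assumption: one clean tone, two crossings per period)."""
    return zero_crossing_rate(s) * sample_rate / 2.0

# Usage: a 100 Hz sine sampled at 1000 Hz; the phase offset avoids exact zeros.
tone = [math.sin(2 * math.pi * 100 * t / 1000 + 0.3) for t in range(1000)]
estimate = zcr_pitch_estimate(tone, 1000)
```

The estimate lands close to 100 Hz here only because the signal is a single sinusoid; harmonics or noise add extra crossings and bias the result upward.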
Question 2
Assuming sampling rate is 1000 Hz. What is the zero crossing rate?
Autocorrelation
$$r[\tau] = \sum_{n=0}^{N} s[n] \, s[n+\tau], \qquad 0 < \tau \le N$$
where $n$ is the input sample index, $r$ is the autocorrelation, $s$ is the signal, and $\tau$ is the lag
ACF of a periodic signal shows peaks at multiples of the period.
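The peak-picking idea can be sketched in Python as follows. This is an illustration under stated assumptions: the sum is truncated so that n + lag stays inside the signal, and the search range (`min_lag`, `max_lag`) is a hypothetical parameter pair chosen by the caller to skip the trivial peak at lag 0.

```python
import math

def autocorrelation(s, max_lag):
    """r[lag] = sum over n of s[n] * s[n + lag], truncated at the signal end."""
    N = len(s)
    return [sum(s[n] * s[n + lag] for n in range(N - lag))
            for lag in range(max_lag + 1)]

def acf_pitch(s, sample_rate, min_lag, max_lag):
    """Pitch in Hz from the lag with the largest autocorrelation value."""
    r = autocorrelation(s, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    return sample_rate / best_lag

# Usage: a 100 Hz sine at 8000 Hz has a period of 80 samples.
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
pitch = acf_pitch(tone, fs, min_lag=20, max_lag=200)
```

Because the truncated sum has fewer terms at larger lags, the peak at the first period is slightly higher than the peaks at its multiples, which is what lets a plain maximum work in this example.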
Autocorrelation
Question 3
What is the pitch of the signal?
Average Magnitude Difference Function
Faster than autocorrelation
Is related to autocorrelation
Multiplication replaced either by $|s(i) - s(i-t)|$ or by $(s(i) - s(i-t))^2$
$$\mathrm{AMDF}(t) = \frac{1}{L} \sum_{i=1}^{L} \left| s(i) - s(i-t) \right|$$
where $s(i)$ are the samples of the input, $s(i) = [s(1), s(2), \ldots, s(L)]$, and $s(i-t)$ are the samples time-shifted by $t$
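A short Python sketch of AMDF-based pitch estimation. Two choices here are assumptions rather than part of the slides: the average is taken over the samples that actually overlap at each lag (instead of a fixed 1/L), and the caller restricts the lag range, since every multiple of the period is also a minimum.

```python
import math

def amdf(s, max_lag):
    """Average magnitude difference for lags 0..max_lag."""
    L = len(s)
    values = []
    for t in range(max_lag + 1):
        total = sum(abs(s[i] - s[i - t]) for i in range(t, L))
        values.append(total / (L - t))  # assumption: normalize by overlap length
    return values

def amdf_pitch(s, sample_rate, min_lag, max_lag):
    """Pitch in Hz from the lag where the AMDF is smallest."""
    d = amdf(s, max_lag)
    best_lag = min(range(min_lag, max_lag + 1), key=lambda t: d[t])
    return sample_rate / best_lag

# Usage: 100 Hz sine at 8000 Hz, searching lags that contain only one period.
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
pitch = amdf_pitch(tone, fs, min_lag=20, max_lag=120)
```

Where autocorrelation looks for a peak, AMDF looks for a dip, and it needs only subtractions and absolute values.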
YIN Algorithm
Builds on top of autocorrelation
5 more steps to minimize the error rate
Difference function
Cumulative mean normalized difference
Absolute threshold
Parabolic interpolation
Best local estimate
Error rate drops from 10% (autocorrelation only) to 0.5%
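A simplified Python sketch of the YIN steps. It covers the difference function, the cumulative mean normalized difference, and the absolute threshold, and adds parabolic interpolation around the chosen lag (a refinement from the published YIN method). The best local estimate step is omitted. The threshold of 0.1 and the function name `yin_pitch` are assumptions made for illustration; this is not the aubio implementation.

```python
import math

def yin_pitch(x, sample_rate, max_lag, threshold=0.1):
    W = len(x) - max_lag  # integration window so that j + tau stays in range
    if W <= 0:
        raise ValueError("signal too short for max_lag")

    # Difference function: d[tau] = sum of squared differences at lag tau.
    d = [0.0] * (max_lag + 1)
    for tau in range(1, max_lag + 1):
        d[tau] = sum((x[j] - x[j + tau]) ** 2 for j in range(W))

    # Cumulative mean normalized difference: d[tau] divided by its running mean.
    cmnd = [1.0] * (max_lag + 1)
    running = 0.0
    for tau in range(1, max_lag + 1):
        running += d[tau]
        cmnd[tau] = d[tau] * tau / running if running > 0 else 1.0

    # Absolute threshold: first dip below the threshold, followed to its minimum.
    tau = None
    for t in range(2, max_lag + 1):
        if cmnd[t] < threshold:
            while t + 1 <= max_lag and cmnd[t + 1] < cmnd[t]:
                t += 1
            tau = t
            break
    if tau is None:  # no dip found: fall back to the global minimum
        tau = min(range(1, max_lag + 1), key=lambda t: cmnd[t])

    # Parabolic interpolation for a sub-sample period estimate.
    if 1 <= tau < max_lag:
        a, b, c = cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]
        denom = a - 2 * b + c
        if denom != 0:
            tau = tau + 0.5 * (a - c) / denom
    return sample_rate / tau

# Usage: 100 Hz sine sampled at 8000 Hz.
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
pitch = yin_pitch(tone, fs, max_lag=200)
```

The normalization is what removes the need for a hand-picked minimum lag: the normalized curve starts at 1 and stays high for small lags, so the first dip below the threshold is the period rather than lag 0.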
FFT
Windowing + FFT (STFT)
Easy to compute and direct
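As a sketch of the windowing + FFT approach, the Python below windows one frame, takes the magnitude spectrum, and reports the frequency of the strongest bin. It uses a naive DFT to stay dependency-free (an FFT library would normally be used), and it assumes the strongest peak is the fundamental, which does not hold for every instrument.

```python
import cmath
import math

def dft_magnitude(frame):
    """Magnitude spectrum for bins 0..N/2 (naive O(N^2) DFT)."""
    N = len(frame)
    return [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N)))
            for k in range(N // 2 + 1)]

def spectral_peak_pitch(frame, sample_rate):
    """Frequency in Hz of the largest non-DC bin of the Hann-windowed frame."""
    N = len(frame)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]
    mags = dft_magnitude([w * x for w, x in zip(window, frame)])
    k = max(range(1, len(mags)), key=lambda i: mags[i])
    return k * sample_rate / N

# Usage: a 500 Hz sine, 256 samples at 8000 Hz (bin spacing 31.25 Hz).
fs = 8000
frame = [math.sin(2 * math.pi * 500 * n / fs) for n in range(256)]
pitch = spectral_peak_pitch(frame, fs)
```

The resolution is limited to sample_rate / N per bin, so short frames give coarse pitch estimates unless the peak is interpolated.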
Question 4: What is the pitch?
Polyphonic Pitch Detection
Multiple instruments + singing voice + noise
Far more difficult than single sound source transcription
Frequency-domain analysis technique + search and decision mechanism
Machine Learning Algorithm
Applications
Music information retrieval
Musical performance systems
Phonetics
Speech coding
Source code
Matlab code
http://note.sonots.com/SciSoftware/Pitch.html
Aubio (C library)
Including YIN algorithm
http://aubio.org/
MFCC
Mel-Frequency Cepstrum
Human Auditory System
Sound Perception
Mel spaced filterbank
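To make the Mel spacing concrete, here is a small Python sketch. The conversion formula (2595 * log10(1 + f / 700)) is one commonly used definition of the Mel scale and is an assumption here, since the slides do not state which variant they use. `mel_spaced_centers` is a hypothetical helper name.

```python
import math

def hz_to_mel(f):
    """Hz to Mel (assumed formula: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spaced_centers(n_filters, f_low, f_high):
    """Filter center frequencies in Hz, equally spaced on the Mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

# Usage: 10 filters between 0 and 4000 Hz.
centers = mel_spaced_centers(10, 0.0, 4000.0)
```

The centers are dense at low frequencies and spread out at high frequencies, mirroring the finer frequency resolution of human hearing at the low end.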
Mel-Frequency Cepstral Coefficients (MFCC)
[Block diagram: Frame of sound -> Pre-emphasis + windowing -> Windowed frame -> |FFT| -> Magnitude spectrum -> Mel-frequency filtering (Mel-spaced filter bank) -> Mel-filtered spectrum -> log(.) -> DCT -> Truncation -> MFCC vector]
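The processing chain for a single frame can be sketched in Python as below. Every numeric choice is an assumption for illustration (pre-emphasis coefficient 0.97, Hamming window, 20 triangular filters, 13 coefficients, the 2595 * log10(1 + f / 700) Mel formula), and the naive DFT stands in for an FFT. Toolboxes such as Marsyas or rastamat differ in these details.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13, alpha=0.97):
    N = len(frame)
    # Pre-emphasis: boost high frequencies with a first-order difference.
    emphasized = [frame[0]] + [frame[n] - alpha * frame[n - 1] for n in range(1, N)]
    # Windowing (Hamming).
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)))
                for n, x in enumerate(emphasized)]
    # |FFT|: magnitude spectrum for bins 0..N/2 (naive DFT).
    n_bins = N // 2 + 1
    mag = [abs(sum(windowed[n] * cmath.exp(-2j * math.pi * k * n / N)
                   for n in range(N)))
           for k in range(n_bins)]
    # Mel-spaced triangular filter bank; edges are in fractional bin units.
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) * N / sample_rate
             for i in range(n_filters + 2)]
    energies = []
    for m in range(1, n_filters + 1):
        left, center, right = edges[m - 1], edges[m], edges[m + 1]
        e = 0.0
        for k in range(n_bins):
            if left < k <= center:
                e += mag[k] * (k - left) / (center - left)
            elif center < k < right:
                e += mag[k] * (right - k) / (right - center)
        energies.append(e)
    # log(.) with a small floor to avoid log(0).
    log_e = [math.log(e + 1e-10) for e in energies]
    # DCT-II, truncated to the first n_coeffs coefficients.
    return [sum(log_e[m] * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m in range(n_filters))
            for c in range(n_coeffs)]

# Usage: one 256-sample frame at 8000 Hz containing two sinusoids.
fs = 8000
frame = [math.sin(2 * math.pi * 440 * n / fs) + 0.5 * math.sin(2 * math.pi * 1230 * n / fs)
         for n in range(256)]
coeffs = mfcc_frame(frame, fs)
```

One consequence of the log followed by the DCT: scaling the input changes only the first coefficient, so the remaining coefficients describe spectral shape largely independent of loudness.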
Applications
Speech recognition
Speaker recognition
Genre classification
Audio similarity measures
Etc.
How to use MFCC
Marsyas: http://marsyas.info/
Music analysis, retrieval, and synthesis
Open source framework
Matlab
Signal Processing Toolbox: http://www.mathworks.com/matlabcentral/fileexchange/23119-mfcc
Columbia University: http://labrosa.ee.columbia.edu/matlab/rastamat/
Project Consultation
Thank you & Q&A
