
Questions to ask in the forum:

1. Missing help files

This is the official Kaldi readme. You are now in the Kaldi/trunk mirror.
Read Kaldi.md and INSTALL.md first!
2. Which scripts expect L.fst to exist?
Does utils/prepare_lang.sh expect it?
3. What new scripts are written?
4. What is the purpose of dict_common?
5. Can we get a list of the files that we create and the files/folders created by scripts?
6. What is the difference between the folder structure of hindi and tamilDemo?

Self notes
1. The egs directory contains one directory per task/database on which
ASR is to be done, e.g. the Switchboard database.

----------------------------------------------------------------------------
atal:~/kaldi/kaldi-trunk/tools/extras% ./check_dependencies.sh
./check_dependencies.sh: zlib is not installed.
./check_dependencies.sh: automake is not installed.
./check_dependencies.sh: libtool is not installed.
./check_dependencies.sh: autoconf is not installed.
./check_dependencies.sh: we recommend that you run (our best guess):
sudo apt-get install zlib1g-dev automake libtool autoconf
You should probably do:
sudo apt-get install libatlas3-base
/bin/sh is linked to dash, and currently some of the scripts will not run
properly. We recommend to run:
sudo ln -s -f bash /bin/sh
atal:~/kaldi/kaldi-trunk/tools/extras% file /bin/sh
/bin/sh: symbolic link to `dash'
atal:~/kaldi/kaldi-trunk/tools/extras% sudo apt-get install zlib1g-dev automake libtool autoconf
[sudo] password for tauseef:
tauseef is not in the sudoers file. This incident will be reported.
atal:~/kaldi/kaldi-trunk/tools/extras% ./check_dependencies.sh
You should probably do:
sudo apt-get install libatlas3-base
/bin/sh is linked to dash, and currently some of the scripts will not run
properly. We recommend to run:
sudo ln -s -f bash /bin/sh
atal:~/kaldi/kaldi-trunk/tools/extras% ./check_dependencies.sh
You should probably do:
sudo apt-get install libatlas3-base
atal:~/kaldi/kaldi-trunk/tools/extras% ./check_dependencies.sh
./check_dependencies.sh: all OK.
atal:~/kaldi/kaldi-trunk/tools/extras%
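
The dash warning in the log above can be checked directly before deciding whether to relink /bin/sh. The lines below are a minimal sketch, assuming GNU readlink is available; the variable name sh_target is only illustrative.

```shell
# Resolve /bin/sh to the interpreter it finally points at.
sh_target=$(readlink -f /bin/sh)
echo "/bin/sh resolves to: $sh_target"

# Report whether that interpreter is bash, which is what the Kaldi check asks for.
case "$(basename "$sh_target")" in
  bash) echo "OK: /bin/sh is bash" ;;
  *)    echo "Note: /bin/sh is not bash; check_dependencies.sh will keep warning" ;;
esac
```

Relinking /bin/sh needs root, so on a machine without sudo rights (as in the log) this check only tells you what to ask the administrator for.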
----------------------------------------------------------------------------
Error while running configure

atal:~/kaldi/kaldi-trunk/src% ./configure
Configuring ...
Checking OpenFST library in /home/tauseef/kaldi/kaldi-trunk/tools/openfst ...
***configure failed: Could not find file /home/tauseef/kaldi/kaldi-trunk/tools/openfst/include/fst/fst.h:
you may not have installed OpenFst. See ../tools/INSTALL ***
atal:~/kaldi/kaldi-trunk/src%
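
The configure failure above comes down to one missing header file, so the same test can be run by hand before calling ./configure. This is a rough sketch, not part of Kaldi: the function name openfst_installed and the KALDI_TOOLS override variable are hypothetical, and the path layout is taken from the error message.

```shell
# Succeed if the OpenFst header that configure looks for exists
# under the given tools directory.
openfst_installed() {
  [ -f "$1/openfst/include/fst/fst.h" ]
}

# KALDI_TOOLS is a hypothetical override; the default matches running from src/.
tools_dir=${KALDI_TOOLS:-../tools}

if openfst_installed "$tools_dir"; then
  echo "OpenFst headers found under $tools_dir"
else
  echo "OpenFst headers missing under $tools_dir; build the tools first (see tools/INSTALL)"
fi
```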
----------------------------------------------------------------------------
The above error occurred with Kaldi freshly downloaded from SourceForge.
Hence I now resort to using the trunk.tar.gz given by the IITM team at WISP.
The following is the log output of running the configure command:
atal:~/wispWorkshop/kaldi/trunk/src% ./configure
Configuring ...
Checking OpenFST library in /home/tauseef/wispWorkshop/kaldi/trunk/tools/openfst ...
Checking OpenFst library was patched.
Doing OS specific configurations ...
On Linux: Checking for linear algebra header files ...
Using ATLAS as the linear algebra library.
... no libatlas.so in /usr/lib
... no libatlas.so in /usr/lib/atlas
... no libatlas.so in /usr/lib/atlas-sse2
... no libatlas.so in /usr/lib/atlas-sse3
... no libatlas.so in /usr/lib64
... no libatlas.so in /usr/lib64/atlas
... no libatlas.so in /usr/lib64/atlas-sse2
... no libatlas.so in /usr/lib64/atlas-sse3
... no libatlas.so in /usr/local/lib
... no libatlas.so in /usr/local/lib/atlas
... no libatlas.so in /usr/local/lib/atlas-sse2
... no libatlas.so in /usr/local/lib/atlas-sse3
... no libatlas.so in /usr/local/lib64
... no libatlas.so in /usr/local/lib64/atlas
... no libatlas.so in /usr/local/lib64/atlas-sse2
... no libatlas.so in /usr/local/lib64/atlas-sse3
... no libatlas.so in /home/tauseef/wispWorkshop/kaldi/trunk/src/../tools/ATLAS/build/install/lib/
... no libatlas.so in /home/tauseef/wispWorkshop/kaldi/trunk/tools/ATLAS/lib
Could not find libatlas.so in any of the obvious places, will most likely try static:
Could not find libatlas.a in any of the generic-Linux places, but we'll try other stuff...
Successfully configured for Debian 7 [dynamic libraries] with ATLASLIBS =/usr/lib/atlas-base/libatlas.so.3.0 /usr/lib/atlas-base/libf77blas.so.3.0 /usr/lib/atlas-base/libcblas.so.3 /usr/lib/atlas-base/liblapack_atlas.so.3
CUDA will not be used! If you have already installed cuda drivers and cuda toolkit, try using --cudatk-dir=... option. Note: this is only relevant for neural net experiments
----------------------------------------------------------------
The following steps were done to train and test on the Hindi database
1. Copy the hindi folder
atal:~/wispWorkshop/kaldi/trunk% cp -rf ~/wispWorkshop/kaldi/trunkUsedAtWISP/egs/hindi egs/.
2. Move the old exp and mfcc directories out of the way
~/wispWorkshop/kaldi/trunk/egs/hindi% mv exp/ expAtWISP
~/wispWorkshop/kaldi/trunk/egs/hindi% mv mfcc/ mfccAtWISP

3. Change path.sh
~/wispWorkshop/kaldi/trunk/egs/hindi% gvim path.sh &
4. Replace "speech" with "tauseef" in the paths in wav.scp
/home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/train/wav.scp
atal:~/wispWorkshop/kaldi/trunk/egs/hindi/data/train% head wav.scp
aaloo_FYCGQM002CNBUP_0030 /home/tauseef/wispWorkshop/iiit_workshop_guru/wav/train_wav/aaloo_FYCGQM002CNBUP_0030.wav
aaloo_FYCPQM002CNBUP_0321 /home/tauseef/wispWorkshop/iiit_workshop_guru/wav/train_wav/aaloo_FYCPQM002CNBUP_0321.wav
~/wispWorkshop/kaldi/trunk/egs/hindi/data/test/wav.scp
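
The user-name replacement in wav.scp can be done with sed instead of by hand. The sketch below runs on a scratch copy; the old prefix /home/speech/ is an assumption (the notes only say "speech" was replaced by "tauseef"), so check the real file before applying it to data/train/wav.scp and data/test/wav.scp.

```shell
# Assumed old and new user names; only "tauseef" is confirmed by the listing above.
old_user=speech
new_user=tauseef

# Scratch wav.scp with one line in the "<utterance-id> <wav-path>" layout.
scp_file=$(mktemp)
printf '%s\n' "aaloo_FYCGQM002CNBUP_0030 /home/${old_user}/wispWorkshop/iiit_workshop_guru/wav/train_wav/aaloo_FYCGQM002CNBUP_0030.wav" > "$scp_file"

# Rewrite the home-directory prefix in place, keeping a .bak copy of the original.
sed -i.bak "s#/home/${old_user}/#/home/${new_user}/#" "$scp_file"

cat "$scp_file"
```

Anchoring the pattern on /home/.../ rather than on the bare word avoids touching an utterance id that happens to contain "speech".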
5. If you delete data/lang/ and run the script, you get the error below
rm -rf data/lang/*
~/wispWorkshop/kaldi/trunk/egs/hindi% utils/prepare_lang.sh data/local/dict '!SIL' data/local/lang data/lang
SIL: Event not found.
rm -rf data/lang/L.fs
~/wispWorkshop/kaldi/trunk/egs/hindi% utils/prepare_lang.sh data/local/dict '!SIL' data/local/lang data/lang
SIL: Event not found.
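
"Event not found" is a history-expansion error from the shell, not from prepare_lang.sh. The prompt ending in % suggests the commands were typed in csh/tcsh, where a ! is expanded even inside single quotes, so '!SIL' is read as a reference to an earlier command starting with SIL. Deleting files under data/lang is therefore probably unrelated to this message. A minimal sketch of the two ways around it, with the Kaldi commands left as comments because they need a Kaldi tree:

```shell
# In bash (and sh), single quotes keep '!' literal, so the word survives intact.
oov_word='!SIL'
printf 'argument as the script would see it: %s\n' "$oov_word"

# Option 1: run the command from bash instead of tcsh.
# utils/prepare_lang.sh data/local/dict '!SIL' data/local/lang data/lang

# Option 2: stay in csh/tcsh and put a backslash before the '!'.
# utils/prepare_lang.sh data/local/dict '\!SIL' data/local/lang data/lang
```

The later sessions in these notes use a bash prompt (ending in $), which is consistent with the error no longer appearing there.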

6.
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ ./fst.sh
./fst.sh: line 22: fstcompile: command not found
Checking /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/phones.txt ...
--> /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/phones.txt is OK
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>.
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ . path.sh
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ ./fst.sh
Checking /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/phones.txt ...
--> /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/phones.txt is OK
Checking /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.{txt, int} ...
--> 1 entry/entries in /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.txt
--> /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.int corresponds to /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.txt
--> /home/tauseef/wispWorkshop/kaldi/trunk/egs/hindi/data/lang/oov.{txt, int} are OK
--> SUCCESS
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>.
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ find . -name 'L.fst'
./data/lang/L.fst
./1hr_data/lang/L.fst
./1hr_data/local/lang_test_selvi/L.fst
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ find . -name 'G.fst'
./data/lang/G.fst

./1hr_data/lang/G.fst
./1hr_data/local/lang_test_selvi/G.fst
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
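
In step 6 the first run of fst.sh failed with "fstcompile: command not found" and the second run, after ". path.sh", succeeded. A small guard like the one below makes that dependency explicit. It is only a sketch; the helper name have_tool is made up for illustration.

```shell
# Succeed if the named program can be found on the current PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

if have_tool fstcompile; then
  echo "fstcompile found at: $(command -v fstcompile)"
else
  echo "fstcompile not on PATH; run '. path.sh' in this shell first"
fi
```

Sourcing path.sh only affects the current shell, so it has to be repeated in every new terminal before running fst.sh.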

Steps taken to run agmark on the atal machine


1. Copy 4 folders from hindi to agmark
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/hindi$ cp -rf local/ steps/ utils/ conf/ ../agmark/.
2. Create data
3. Create folders within data
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark$ mkdir data/train data/test data/local
4. Copy transcription, wav.scp, utt2spk, spk2utt
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark$ cp -rf /home/tauseef/wispWorkshop/agmark/doc/kaldi/train/* data/train/.
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark$ cp -rf /home/tauseef/wispWorkshop/agmark/doc/kaldi/test/* data/test/.
5. Create two more directories under data: lang and lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data$ ls
local/ test/ train/
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data$ mkdir lang lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data$ chmod a+rx lang/ lang_test/
6. Create 4 directories under data/local: dict, dict_comm, lang, lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data/local$ mkdir dict dict_comm lang lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data/local$ chmod a+rx dict dict_comm lang lang_test
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/agmark/data/local$ ls
dict/ dict_comm/ lang/ lang_test/
7. Copy the dictionary, phone and filler files
tauseef@atal:~/wispWorkshop/agmark/doc$ cp marathiAgmark.dic marathiAgmark.filler marathiAgmark.phone /home/tauseef/wispWorkshop/kaldi/trunk/egs/agmark/data/local/dict/.
Compare with the files used in tamilDemo:
lexicon.txt
nonsilence_phones.txt = marathiAgmark.phone
optional_silence.txt
silence_phones.txt
Note:
1. marathiAgmark.dic does not have sil
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/tamilDemo/data/local/dict$ tail lexicon.txt
fb_laugh SIL
fb_ln SIL
fb_pron SIL
fb_pau SIL
</s> SIL
fb_uu SIL
fb_whisper SIL
fb_br SIL
sil SIL
!SIL SIL
2. optional_silence.txt and silence_phones.txt look like the following, so just copy them from tamilDemo
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/tamilDemo/data/local/dict$ head optional_silence.txt
SIL
tauseef@atal:~/wispWorkshop/kaldi/trunk/egs/tamilDemo/data/local/dict$ head silence_phones.txt
SIL
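
The comparison above suggests how the four dict files for agmark could be assembled from the Marathi files. The following is a sketch under assumptions, run in a scratch directory: the sample dictionary words and phones are made up, the .dic layout is assumed to be "word phone phone ...", and only the !SIL entry is added (the filler entries in tamilDemo are not reproduced).

```shell
# Scratch directory so no real Kaldi tree is touched.
dict=$(mktemp -d)

# Stand-ins for marathiAgmark.dic and marathiAgmark.phone (made-up contents).
printf '%s\n' 'aamba aa m b aa' 'kanda k aa n d aa' > "$dict/marathiAgmark.dic"
printf '%s\n' aa b d k m n SIL > "$dict/marathiAgmark.phone"

# lexicon.txt: the dictionary plus the silence entry that the .dic lacks.
{ cat "$dict/marathiAgmark.dic"; printf '%s\n' '!SIL SIL'; } > "$dict/lexicon.txt"

# nonsilence_phones.txt: the phone list with any line that is exactly SIL removed.
grep -v -x 'SIL' "$dict/marathiAgmark.phone" > "$dict/nonsilence_phones.txt"

# Both silence files hold the single phone SIL, as in tamilDemo.
echo SIL > "$dict/silence_phones.txt"
echo SIL > "$dict/optional_silence.txt"

ls "$dict"
```

Filtering SIL out of the phone list is a precaution, since a phone should not appear in both the silence and the non-silence list.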

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Use of perl and emacs for text processing
1. Change the following line in the transcription
to
2. Use a macro in emacs. A macro is a recorded sequence of keystrokes:
F3 (start macro) -> beginning of line 1 -> End -> Backspace -> Ctrl R -> Shift F -> Shift End -> Cut -> go to the beginning with Home -> Paste -> go to the beginning of the next line -> F4 (end macro)

To do: write down the following steps for agmark test/train

how to modify transcription
how to get utt2spk and spk2utt
how to modify wav.scp
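
For the utt2spk/spk2utt item, Kaldi ships utils/utt2spk_to_spk2utt.pl, which derives spk2utt from an existing utt2spk. The sketch below shows the same grouping with awk on made-up data, so the file format is visible without a Kaldi tree; the utterance and speaker ids are invented for illustration.

```shell
work=$(mktemp -d)

# Made-up utt2spk: one "<utterance-id> <speaker-id>" pair per line.
printf '%s\n' 'spk1_utt1 spk1' 'spk1_utt2 spk1' 'spk2_utt1 spk2' > "$work/utt2spk"

# Collect the utterance ids of each speaker onto one line:
# "<speaker-id> <utt1> <utt2> ...", then sort the lines by speaker id.
awk '{ utts[$2] = utts[$2] " " $1 }
     END { for (s in utts) print s utts[s] }' "$work/utt2spk" \
  | LC_ALL=C sort > "$work/spk2utt"

cat "$work/spk2utt"
```

Kaldi's data-directory checks expect these files to be sorted in the C locale, which is why the sort is run with LC_ALL=C.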
