The Laryngoscope Lippincott Williams & Wilkins, Inc. © 2006 The American Laryngological, Rhinological and Otological Society, Inc.

A New Generation Videokymography for Routine Clinical Vocal Fold Examination

Qingjun Qiu, PhD; Harm K. Schutte, MD, PhD

Objective: This study aims to introduce a new-generation videokymographic system, which provides simultaneous laryngoscopic and kymographic images, for routine clinical vocal fold examination. Study Design: The authors explored a new imaging method for diagnosis and evaluation of voice disorders. Methods: The new-generation videokymographic system includes two charge-coupled device image sensors: a color area image sensor and a monochromic high-speed line-scan image sensor. The high-speed line-scan image sensor is used to capture the kymogram, and the color area image sensor is used to obtain the laryngoscopic image. The two images can be displayed simultaneously on a video monitor or stored in a standard video recorder. Three subjects with nonpathologic voice were investigated in detail with the new videokymographic system. Results: The high-quality laryngoscopic image and kymogram can be used directly for clinical purposes with no further postprocessing. The scan position of the kymogram is always indicated in the laryngoscopic image, which provides feedback for the operator to easily locate the expected scanning position. All varieties of vocal fold vibration, including irregular vibrations, phonation onset and offset, can be observed with the presented method. The continuous kymogram of the vocal fold vibration can be retrieved from a kymographic image sequence for quantitative analysis. Conclusions: The new-generation videokymography provides a simple, quick means to investigate vocal fold vibration, especially for voice disorders. It can emerge as an important tool for routine clinical vocal fold examination.

Key Words: Videokymography, vocal fold vibration, laryngoscopy, voice disorders, routine clinical vocal fold examination.

Laryngoscope, 116:1824–1828, 2006

From the Groningen Voice Research Lab, Department of Biomedical Engineering, University Medical Center of Groningen, University of Groningen, Groningen, The Netherlands.

Editor’s Note: This manuscript was accepted for publication June 7, 2006.

The work was done in Groningen Voice Research Lab, the Department of Biomedical Engineering, University Medical Center of Groningen, University of Groningen, Groningen, The Netherlands.

This research was supported by the Technology Foundation STW, the applied science division of NWO, and the technology programme of the Ministry of Economic Affairs, The Netherlands, project No. G5973.

Send correspondence to Dr. Qingjun Qiu, BioMedical Department, University Medical Center of Groningen, University of Groningen, A. Deusinglaan 1, NL 9713AV Groningen, The Netherlands. E-mail: q.qiu@med.umcg.nl

DOI: 10.1097/01.mlg.0000233552.58895.d0

INTRODUCTION

The use of kymographic imaging as a method for visualizing vocal fold vibration, especially disordered vibration, has increased greatly since Gall and Hanson first introduced it to register the motion of the vocal folds in 1971.[1] In their research, a special photographic camera with a slit shutter was used to expose the vocal fold movement onto film, a method known as photograph kymography. This method is extremely time-consuming because the film must be developed, so photograph kymography is not practical for routine clinical diagnosis.[2] Fortunately, videokymography (VKG),[3] in which the kymographic image is encoded as a standard video signal, can reveal the vocal fold kymogram directly on a standard video monitor. Thus, the use of this real-time imaging technique spread rapidly into voice research and clinical practice. Schutte et al. reported the first clinical application of VKG in 1998. In their study, more than 800 patients with various functional and organic voice disorders were examined.[4] Jiang et al. used VKG to quantify vocal fold mucosal wave movements in canine larynges.[5] Verdonck-de Leeuw et al. combined videokymographic image sequences and speech signals to evaluate the effect of irregular vocal fold vibration on voice quality.[6] A common conclusion from these studies is that, with high spatial and temporal resolution, VKG would emerge as a valuable method for voice disorder diagnosis and a powerful tool for better understanding the mechanism of vocal fold vibration.

However, several drawbacks of the first-generation VKG, which are illustrated in the “Discussion” section, slowed the progress of its application during the last 5 years. To overcome these obstacles, a new-generation videokymographic system has been developed that retains the advantages of the first-generation VKG while resolving its problems. A preliminary clinical result is also presented, illustrating what has been improved in the new system.

MATERIALS AND METHODS

The New-Generation Videokymography

The new-generation videokymographic system includes two charge-coupled device (CCD) image sensors: a color area CCD and a monochromic high-speed line-scan CCD. For the kymographic imaging, the high-speed line-scan CCD is used to capture a selected line, which is usually aligned perpendicular to the glottal axis of the vocal folds. The color area CCD sensor is used to record the laryngoscopic image. A beam splitter optically divides the image from the laryngoscope into two paths, one path for the area CCD and the other for the line-scan CCD. The two CCDs work simultaneously so that the laryngoscopic image and kymographic image can be obtained at the same time. By means of the beam splitter, the position of the linear CCD is fixed on the reflective center of the area CCD, i.e., the kymogram taken from the line-scan CCD shows the vocal fold vibration at the center line of the laryngoscopic image.

Laryngoscope 116: October 2006

Qiu and Schutte: A New Generation Clinical Videokymography

The system is divided into two parts, a camera head and a controller unit. The two parts connect with one another by a high-performance video cable. The camera head houses only the image sensors and the beam splitter, reducing weight and size for easy handling. The main controller unit contains an embedded microprocessor and a frame buffer memory, providing capability for processing the video images in real time. The system has two output interfaces, an analog and a digital. The analog video output allows the two images to be displayed on a standard analog video monitor, whose screen is vertically split into two equal parts with the laryngoscopic image on the left and the kymogram on the right. The digital output port allows the video data to be acquired by a digital video frame-grabber. Custom-made software, which includes a database system, was developed to capture, display, and store the images.
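The display arrangement described above can be illustrated with a small sketch. It is not the authors' firmware or software: the function names, the grayscale list-of-rows image representation, and the assumption that the scan line runs horizontally through the image center are all hypothetical choices made for illustration. Only the 7200 lines per second and 25 frames per second figures, and the left/right screen split, come from the article.

```python
LINE_RATE_HZ = 7200   # line-scan rate reported in the article
FRAME_RATE_HZ = 25    # area-sensor frame rate reported in the article
LINES_PER_FRAME = LINE_RATE_HZ // FRAME_RATE_HZ  # line scans per 40-ms video frame


def build_kymogram(line_scans):
    """Stack successive line scans into a kymogram (hypothetical helper).

    Each scan is one row; rows run along time, columns along the scan line.
    """
    rows = [list(scan) for scan in line_scans]
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all line scans must have the same length")
    return rows


def compose_split_screen(laryngoscopic, kymogram, marker_value=255):
    """Put the laryngoscopic image on the left and the kymogram on the right.

    The scan position is drawn as a line through the vertical center of the
    laryngoscopic half (assumed horizontal; grayscale is a simplification,
    since the real area sensor is a color sensor).
    """
    if len(laryngoscopic) != len(kymogram):
        raise ValueError("both halves must have the same number of rows")
    center = len(laryngoscopic) // 2
    screen = []
    for i, (left, right) in enumerate(zip(laryngoscopic, kymogram)):
        left = [marker_value] * len(left) if i == center else list(left)
        screen.append(left + list(right))
    return screen


# Synthetic demonstration data, not real recordings.
width = 8
scans = [[(t + x) % 200 for x in range(width)] for t in range(LINES_PER_FRAME)]
kymogram = build_kymogram(scans)
laryngoscopic = [[50] * width for _ in range(LINES_PER_FRAME)]
screen = compose_split_screen(laryngoscopic, kymogram)
print(len(screen), len(screen[0]))  # 288 rows, 16 columns
```

With the stated rates, one 40-ms video frame holds 288 line scans, which is why the synthetic kymogram here has 288 rows.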

The Instruments and Subjects

The new-generation videokymographic system was used to gather the kymographic data of the present study. The images were obtained with a 90° rigid laryngoscope (Richard Wolf 4450.57, Germany) together with a C-mount optical adapter (Richard Wolf 5261.27, f 32 mm, Germany). The vocal folds were illuminated by a 300-W xenon light source (Kay Elemetrics 7150), whose light is transmitted to the tip of the endoscope through a bundle of optical fibers. For digital recording, a 16-bit parallel digital frame-grabber (National Instruments PCI-1422) was used to obtain the images directly from the digital port of the new system. For analog video recording, an s-VHS video recorder (Panasonic AG-7355; Panasonic-Matsushita Electric Industrial Co., Ltd., Japan) was used. The analog video recording was then digitized by a video frame-grabber (National Instruments PCI-1411). For comparison purposes, the first-generation videokymographic system (Lambert Instruments, BV, Leutingewolde, The Netherlands) was also used in the experiment. Three subjects with nonpathologic voices were investigated in detail in the Groningen Voice Research Lab in The Netherlands.

RESULTS

The upper part of Figure 1 was captured by the new system while subject no. 1 was sustaining the vowel /i/. The laryngoscopic and kymographic images are shown on the left and the right, respectively. The white line in the laryngoscopic image indicates the scan position for the kymogram. The vocal folds in the laryngoscopic image appear blurred because the area CCD (25 frames per second) is not fast enough to follow the vibration of the vocal folds; the high-speed kymographic image (7200 lines per second), however, clearly displays the vibrations. Each frame contains 40 milliseconds of vibratory history of the vocal folds. Thus, the fundamental frequency of a regular vocal fold vibration can easily be estimated by counting the number of vibratory periods in the kymographic image. In this particular frame, the kymographic image contains approximately seven vibratory cycles, implying that the fundamental frequency is approximately 175 Hz for this phonation. Other quantitative parameters such as closed quotient can be calculated by postprocessing software.

Fig. 1. The new kymographic images from subject no. 1. The upper one was captured during a sustained vowel /i/. The lower images show irregular vocal fold vibration when the subject intentionally produced a hoarse voice. Each image is vertically split into two parts. The left part shows the laryngoscopic image and the right part shows the kymographic one. The white line in the laryngoscopic image indicates the scanning position of the kymogram. “A” points out a blood vessel on the vocal folds. “B” and “C” indicate vibratory cycles, respectively, with and without closed phase. “L” and “R” indicate left and right sides, respectively.

The lower part of Figure 1 shows an irregular vocal fold vibration when the subject intentionally produced a hoarse voice. As in the upper half of the figure, the laryngoscopic image is blurred. The kymographic image, however, shows a complex disordered vibration pattern. The closed phase is much shorter than in a regular vibration (as indicated in B), and several of these cycles have no closed phase (as indicated in C).

Onset and offset are particularly revealing phases of phonation. In clinical practice, the assessment of a phonation onset can be used as a tool to diagnose vocal dysfunctions. To show the capability of revealing phonation onset and offset with the new system, Figure 2 illustrates a complete short phonation, including the phonation onset and offset period. The image was obtained when subject no. 2 produced a very short vowel /i/. In this case, during the onset period, the two vocal folds are synchronized. However, the right vocal fold is ahead of the left vocal fold during the offset period.

Fig. 2. A kymographic image sequence from subject no. 2. The images were taken during a very brief phonation of vowel /i/, documenting voice onset and offset. The phonation continues without interruption from the left segment through the right. “L” and “R” indicate left and right sides, respectively.

For comparison purposes, subject no. 3 was examined with both the new- and the old-generation videokymographic systems. The subject was instructed to keep the pitch and loudness constant for both examinations. Figure 3 shows the result. The left image was taken with the old videokymographic system and the right with the new one.
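The frame-based estimate given in this section for Figure 1 (about seven cycles in a 40-millisecond frame, hence about 175 Hz) can be written out as a short sketch. The article does not describe its postprocessing software, so everything below is an assumption made for illustration: the function names, the idea of reducing each line scan to a single glottal width value, and the simplified closed quotient taken as the fraction of line scans with zero width. Counting closed-to-open transitions only works for regular vibration with a closed phase; cycles without a closed phase, such as those marked “C” in Figure 1, would be missed.

```python
FRAME_DURATION_MS = 40  # each kymographic frame spans 40 ms (from the article)


def f0_from_cycle_count(cycles, duration_ms=FRAME_DURATION_MS):
    """Fundamental frequency in Hz from cycles counted over a known duration."""
    return cycles * 1000 / duration_ms


def count_openings(glottal_widths):
    """Count closed-to-open transitions in a per-line glottal width series."""
    return sum(
        1
        for prev, cur in zip(glottal_widths, glottal_widths[1:])
        if prev == 0 and cur > 0
    )


def closed_quotient(glottal_widths):
    """Fraction of line scans with a closed glottis (simplified definition)."""
    if not glottal_widths:
        raise ValueError("need at least one line scan")
    closed = sum(1 for width in glottal_widths if width == 0)
    return closed / len(glottal_widths)


# Worked example from the text: about seven cycles in one frame.
print(f0_from_cycle_count(7))  # 175.0

# Synthetic width series (not measured data): 288 line scans, each cycle
# made of 12 closed scans followed by 24 open scans.
widths = ([0] * 12 + [1] * 24) * 8
cycles = count_openings(widths)
print(cycles, f0_from_cycle_count(cycles), closed_quotient(widths))
```

In the synthetic series each cycle lasts 36 line scans, which at 7200 lines per second corresponds to 200 Hz, and the closed phase occupies one third of each cycle.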

DISCUSSION

The simultaneous laryngoscopic and kymographic imaging in the new system dispenses with the complicated operational procedure of kymogram acquisition used in the old-generation VKG system, illustrated in Figure 4. The old system provides two working modes, normal mode and kymographic mode. In the normal mode, laryngoscopic images are obtained. The kymographic mode produces the high-speed kymogram. A footswitch controls the working mode. However, the two working modes are mutually exclusive, preventing the operator from seeing the scan position while using the kymographic mode. Therefore, to obtain the desired scanning position, a general operational rule must be observed. First, the normal mode is used to locate the vocal folds in the laryngoscopic image. The desired scanning position must be kept at the top of the laryngoscopic image, because only the top line will be scanned in the kymographic mode (Fig. 4A). Then the kymographic mode is activated to obtain the kymogram. However, if there is subsequently a slight relative movement between the endoscope and the vocal folds, the kymogram will represent the vibrations at a position other than the intended one. The error can be more serious if the operator does not notice that discrepancy, possibly resulting in an incorrect clinical diagnosis. Fortunately, the new VKG system presents the laryngoscopic and kymographic images simultaneously with the scan position of the kymogram always indicated on the laryngoscopic image. Even a slight relative movement between the vocal folds and the endoscope would be revealed in the laryngoscopic image. In this way, the simultaneous imaging greatly facilitates finding the desired position for the kymogram along the glottis.

Fig. 3. Two kymographic images from subject no. 3. The left image was taken with the old videokymographic system and the right one with the new system. The subject was instructed to maintain the same pitch and loudness. “L” and “R” indicate left and right sides, respectively.

Fig. 4. A general procedure for acquiring a kymographic image with the old-generation videokymograph. The videokymograph has two working modes, a (A) normal mode and a (B) kymographic mode. The normal mode is first used to locate the desired scanning position on the top line of the laryngoscopic image. Then one shifts to the kymographic mode, and the kymogram of the top line in the laryngoscopic image is revealed on the screen (B). The interlaced kymogram makes the images not directly observable from the screen. As a result, postprocessing is necessary to retrieve the kymogram, which will always lack the information during the vertical blanking period (C).

The new imaging system also provides an uninterrupted kymogram, which was not possible in the old system. The old VKG is a type of television-standard video camera (the PAL standard is used in this article) in which each half frame of 20 milliseconds, called a field, contains an active time and a vertical blanking time. The video image can only be shown during the active time of 18.4 milliseconds. As a result, during the vertical blanking period, the kymographic information is missing (Fig. 4C). However, in the new system, this problem is solved by buffering the kymographic image. During vertical blanking, the line-scan CCD still captures kymographic images, which cannot be displayed but can be stored in the buffer memory of the controlling system. When the video is activated, these images will be displayed. This makes possible the construction of a continuous kymogram, as in Figure 2.

For the clinical application, it is very important to keep examination time brief. The real-time laryngoscopic and kymographic imaging of the new system greatly speeds the process.
Examination time using the first-generation VKG is relatively short, but some postprocessing is always necessary to obtain satisfactory images. In that system, there are two problems with the raw kymographic image, which is displayed on the video screen in real time (see Fig. 4). First, every second line is black, interrupting the continuity of information. Second, in normal video, each frame is made up of two interlaced fields, an odd and an even. Without postprocessing, the display of these interlaced kymograms is difficult to read. Figures 4B and 4C show kymograms before and after postprocessing. Both problems are caused by the TV standard used in the old system. However, in the new system, the digital signal processor reformats output video data so that these problems are eliminated, and the images can be used directly for clinical purposes. If desired, postprocessing can be implemented for extracting quantities such as fundamental frequency and closed quotient.

The new videokymographic system has both analog and digital video outputs. Using the PAL standard, the images from the analog output can be shown on a standard video monitor, stored in a standard video recorder, or printed by a video image printer. However, the maximal spatial resolution is also limited by the PAL standard of 720 × 576 pixels. The laryngoscopic and kymographic images each occupy half of the frame, having a maximal spatial resolution of 360 × 576 pixels. This limitation does not apply to the digital video output, which is based on a nonstandard parallel digital interface. The resolution of the laryngoscopic image is 720 × 576 pixels in one frame with 25 frames per second. The resolution of the kymographic image is 625 pixels per line with 7,200 lines per second. A further advantage of using the digital port is that the speed of the digital port is high enough to transfer the raw image in real time without any loss through compression. Thus, the quality of the images from the digital port is superior to that from the analog port. Nevertheless, both the analog port and the digital port provide uninterrupted vibratory information of the vocal folds, which can be digitized or acquired by a computer, quantified by analysis software, and stored in a database system.

The CCD used to record the kymographic image has a particularly high sensitivity, resulting in two major advantages of the new system. First, the high-sensitivity imaging reduces the requirement for the light source. A standard 250-W or 300-W xenon light source is sufficient to obtain good-quality images. Even a 180-W xenon light source is adequate if a rigid laryngoscope with direct optical fiber connection is used, that is, the optical fiber is an integral part of the laryngoscope. Second, the high-sensitivity feature gives it a marked advantage in image quality over other commercially available kymographic systems. Figure 3 shows the obvious difference in image quality between the old videokymographic system and the new one. In the noise-free image of the new system, even the blood vessels on the vocal folds are recognizable (Fig. 3).

The advances in image quality also provide a promising impression of three-dimensional vocal fold movement. In comparison to the upper part of Figure 1, the three-dimensional vibration of the whole vocal folds is more saliently recognizable in the kymographic image of the lower part of Figure 1. The chief explanation of the high-amplitude vocal fold movement is that the vocal folds are less coupled to the airflow in this specific irregular phonation with a high mean airflow. A similar phenomenon can be seen in vocal fold vibration in a unilateral laryngeal paralysis.[4] Evidence of such a three-dimensional movement pattern might help to interpret the cause of an existing vocal fold disorder.

The three-dimensional vocal fold movement can also be revealed with stroboscopy. However, stroboscopy has a serious limitation in that it works only with periodic vibration.[7] To observe aperiodic vibration, VKG or a full high-speed camera is the proper choice. Several earlier papers have compared the old system of VKG and the full high-speed camera with respect to their various advantages,[8,9] and we do not repeat that comparison here. The new system, however, adds major improvements, discussed previously, to the advantages of the old VKG. The most important of these advantages is its high image quality, which includes high spatial resolution, high temporal resolution, and a high signal-to-noise ratio. Another important advantage is lower data volume, allowing the kymographic image to be captured in real time without the time constraint that pertains to the full high-speed camera.

The audio and the electroglottograph signals also play an important role in voice research. The system provides a synchronization mechanism to align the audio signal and electroglottogram. This is beyond the scope of the present study and will be addressed in a forthcoming paper.
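The timing and resolution figures quoted in this Discussion can be checked with a few lines of arithmetic. The sketch below is only a worked calculation under stated assumptions: it applies the new system's 7200 lines per second to the PAL field timing given in the text (20-millisecond field, 18.4 milliseconds active) to estimate how many line scans fall into each vertical blanking interval and would therefore need buffering. The variable and function names are hypothetical, and pixel rates are given in pixels per second because the article does not state the bit depth of the transferred pixels.

```python
from fractions import Fraction

LINE_RATE_HZ = 7200                 # line scans per second (from the article)
FIELD_MS = Fraction(20)             # one PAL field (half frame)
ACTIVE_MS = Fraction("18.4")        # displayable part of a field
BLANKING_MS = FIELD_MS - ACTIVE_MS  # vertical blanking: 1.6 ms


def lines_in(duration_ms, line_rate_hz=LINE_RATE_HZ):
    """Exact number of line scans captured during duration_ms."""
    return Fraction(line_rate_hz) * duration_ms / 1000


lines_per_field = lines_in(FIELD_MS)           # 144 scans per field
lines_in_blanking = lines_in(BLANKING_MS)      # 11.52 scans per blanking interval
blanking_fraction = BLANKING_MS / FIELD_MS     # share of each field not displayed

# Pixel throughput of the two digital image streams (pixels per second).
kymographic_pixel_rate = 625 * LINE_RATE_HZ    # 625 pixels per line
laryngoscopic_pixel_rate = 720 * 576 * 25      # 720 x 576 pixels at 25 frames/s

print(lines_per_field, float(lines_in_blanking), float(blanking_fraction))
print(kymographic_pixel_rate, laryngoscopic_pixel_rate)
```

Under these assumptions, roughly 8% of each field, about 11 to 12 line scans, coincides with vertical blanking, which is the information a system without buffering would lose.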

CONCLUSION

Our results show that the new-generation videokymographic system provides high-quality laryngoscopic and kymographic images simultaneously. For the first time, the kymographic image can be visualized directly on a video monitor or stored in a standard video recorder without any waiting time for postprocessing, remarkably reducing examination time. A continuous kymogram of the vocal fold vibration, uninterrupted by vertical blanking, can be retrieved from a kymographic image sequence. This gives the new VKG system important advantages not only for clinical purposes, but also for quantitative voice research. In summary, the new-generation VKG provides a simple and fast way to study vocal fold vibration, especially when it is irregular. It will be an important tool for routine clinical vocal fold examination as well as for detailed research.

Acknowledgments

The authors gratefully acknowledge D. G. Miller for help with the English version of the article. The authors also appreciate the comments of J. G. Švec and N. A. George in reviewing the manuscript.

BIBLIOGRAPHY

1. Gall V, Gall D, Hanson J. Laryngeal photokymography. Arch Klin Exp Ohren Nasen Kehlkopfheilkd 1971;200:34–41.

2. Gross M. Larynxfotokymographie. Sprache-Stimme-Gehör 1985;9:112–113.

3. Švec JG, Schutte HK. Videokymography: high-speed line scanning of vocal fold vibration. J Voice 1996;10:201–205.

4. Schutte HK, Švec JG, Šram F. First results of clinical application of videokymography. Laryngoscope 1998;108:1206–1210.

5. Jiang JJ, Chang CI, Raviv JR, Gupta S, Banzali FM Jr, Hanson DG. Quantitative study of mucosal wave via videokymography in canine larynges. Laryngoscope 2000;110:1567–1573.

6. Verdonck-de Leeuw IM, Festen JM, Mahieu HF. Deviant vocal fold vibration as observed during videokymography: the effect on voice quality. J Voice 2001;15:313–322.

7. Kitzing P. Stroboscopy - a pertinent laryngological examination. J Otolaryngol 1985;14:151–157.

8. Wittenberg T, Tigges M, Mergell P, Eysholdt U. Functional imaging of vocal fold vibration: digital multislice high-speed kymography. J Voice 2000;14:422–442.

9. Hertegard S, Larsson H, Wittenberg T. High-speed imaging: applications and development. Logoped Phoniatr Vocol 2003;28:133–139.
