12, 2004
Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J84-D-II, No. 6, June 2001, pp. 1073–1083
Shintaro Kumano,1 Kazumasa Miyamoto,1 Mitsuaki Tamagawa,1 Hiroaki Ikeda,2 and Koji Kan3
1 Takasago Research and Development Center, Mitsubishi Heavy Industries, Ltd., Takasago, 676-8686 Japan
2 Hiroshima Research and Development Center, Mitsubishi Heavy Industries, Ltd., Mihara, 729-0393 Japan
3 Kobe Shipyard and Machinery Works, Mitsubishi Heavy Industries, Ltd., Kobe, 652-8585 Japan
have been evaluated by recognizing containers with horizontal writing, which are more frequently encountered; in some cases, parts of the vertically written containers were outside the field of vision of the camera. The latter cases may be handled by using a high-resolution camera.

2.2. Camera system

The functions needed to obtain good photographic images using the camera layout shown in Fig. 2 are presented below (Fig. 3 shows a configuration having these functions).

(1) A camera trigger sensor that detects whether a moving container has passed a specific location, and a camera that can be triggered by an external signal
(2) A light intensity sensor for measuring the brightness of the outdoor environment
(3) A camera that takes the entire container number string into its field of vision and can resolve the container character width (10 mm) within that field of vision
(4) An illuminating device for use when the outdoor environment becomes dark (below 1000 lux)
(5) A shutter speed control device that adjusts the brightness of an image in response to changes in outdoor light intensity
(6) A filter that is optimal for the combinations of container colors and character colors handled by a container terminal
(7) A photographic control device that controls the entire camera system sequentially

Fig. 2. Camera system layout.

2.3. Evaluation of the device

The character area on the back surface of a container measures 1.6 m × 1.2 m. The character width of a container number is 10 mm at the minimum and the character interval is 5 mm, so a resolution of 2.5 mm per pixel is required in order to quantize 5 mm with a minimum of 2 pixels. Based on this, a monochrome VGA camera of 640 × 480 pixels is used. We selected the Toshiba IK-542, whose shutter time can be controlled from 10 µs to 16 ms by external signals. A monochrome camera is combined with an optimal filter in order to avoid the cost of a color camera.

The optimal filter was selected so as to maximize the contrast of camera images over the combinations of container colors and character colors that actually occur. Figure 4 shows the evaluation results for the G and R filters among the luminance evaluation results obtained.

Fig. 3. Container ID mark recognition system.

Fig. 4. Filtering results for various color combinations.
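The filter-selection criterion described above (maximize the character/background contrast over the color combinations seen at the terminal) can be sketched as follows. The color pairs and the simple channel-response model are illustrative assumptions, not the authors' measured data.

```python
# Sketch of the filter-selection criterion: pick the color filter whose
# channel response maximizes the worst-case character/background contrast
# over the container/character color combinations handled at the terminal.
# The color list and channel model below are illustrative assumptions.

def channel(rgb, f):
    # crude model: a monochrome camera behind an R or G filter responds
    # mainly to the corresponding RGB component
    return rgb[{"R": 0, "G": 1}[f]]

def best_filter(color_pairs, filters=("R", "G")):
    def worst_contrast(f):
        return min(abs(channel(bg, f) - channel(ch, f)) for bg, ch in color_pairs)
    return max(filters, key=worst_contrast)

# (container color, character color) pairs as RGB triples -- hypothetical
pairs = [
    ((170, 30, 30), (255, 255, 255)),   # red container, white characters
    ((30, 80, 160), (255, 255, 255)),   # blue container, white characters
    ((200, 200, 200), (20, 20, 20)),    # gray container, black characters
]
print(best_filter(pairs))  # prints "G" for these illustrative colors
```

With these example colors the R filter nearly loses the white-on-red case, so the G filter wins the worst-case comparison, which is consistent with the G filter listed in Table 1.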
Table 1. Image capture conditions

Vision field range: 1.6 m (horizontal) × 1.2 m (vertical)
Resolution: 2.5 mm (horizontal) × 2.5 mm (vertical)
Number of pixels: 640 × 480
Camera lens focal length: 35 mm
Photographing distance: 8.167 m
Aperture: F4, fixed
Lens resolution: over 50% at 25 lines/mm
Depth of field: 7.12–9.58 m (blur of up to 1/2 pixel (5 µm) allowed at aperture F4)
Shutter time: less than 1/3260 second (image blur of 1 pixel allowed for an object moving at 30 km/h)
Filter: G
Illuminating device: average illuminance of 1400 lux (lit when environmental illumination falls below 1000 lux)
Camera trigger sensor: set less than 2 m before the target
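The resolution and shutter figures in Table 1 follow from simple imaging arithmetic. The check below is a sketch reproducing the pixel resolution and the shutter bound under the table's stated one-pixel-blur allowance at 30 km/h.

```python
# Arithmetic behind Table 1 (sketch): pixel resolution over the field of
# vision, and the shutter time that keeps motion blur of a container
# moving at 30 km/h within one pixel.
field_mm = 1600.0            # horizontal field of vision, mm
pixels = 640                 # horizontal pixel count
res_mm = field_mm / pixels   # mm per pixel
# res_mm == 2.5, enough to quantize the 5 mm character interval with 2 pixels

speed_mm_s = 30e6 / 3600.0        # 30 km/h expressed in mm/s
t_blur = res_mm / speed_mm_s      # time for the container to cross one pixel
# t_blur == 3.0e-4 s, i.e. about 1/3300 s
print(res_mm, t_blur)
```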
included in a character string. In addition, even if “U” is not
found due to deterioration or cutting of a container, the
location of a character string is estimated by a conventional
method based on correlations [12] and the characters are
interpreted as the container identification number by the
subsequent character recognition process.
Since the conventional assumption of "one label, one character" does not necessarily hold even in the recognition process after the extraction of a character string, a method of performing character segmentation and character recognition simultaneously by dynamic programming [15, 16, 18], as used in address recognition and English word recognition, is adopted. Furthermore, since the recognition result for each character is influenced by the quality of the binarized image, a scheme (the multiplexed binary image recognition method) of choosing the most likely characters (based on the values obtained by subtracting a constant from the distances to the dictionary characters) is introduced.
The characteristics of the individual elements of a container identification number are also exploited. In recognizing the four letters representing the owner, a voting scheme based on a dictionary of owner codes prepared in advance is used to identify the dictionary entry to which a character string corresponds [21]; thus, owner codes not in the dictionary are never taken as solutions. In many cases, the final number (the check digit) is written within a frame line, and the frame line influences character recognition. Character recognition of the check digit therefore always includes a process of eliminating the frame.
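The dictionary voting on the owner code can be sketched as follows; the dictionary entries and the per-position candidate lists are hypothetical, and the scoring rule is one plausible reading of the voting scheme.

```python
# Sketch of dictionary voting for the 4-letter owner code: each of the
# top-n recognition candidates for character position i votes its
# likelihood value for every dictionary code whose i-th letter matches.
# Codes absent from the dictionary can never win. Data are hypothetical.

def vote_owner_code(candidates, dictionary):
    # candidates[i] = list of (letter, likelihood) for character position i
    def score(code):
        return sum(v for i, ch in enumerate(code)
                   for letter, v in candidates[i] if letter == ch)
    return max(dictionary, key=score)

dictionary = ["MHIU", "MOLU", "NYKU"]   # hypothetical owner codes
candidates = [                           # top candidates per position
    [("M", 0.9), ("N", 0.4)],
    [("0", 0.6), ("O", 0.5)],            # zero/letter-O confusion survives voting
    [("L", 0.8), ("I", 0.3)],
    [("U", 0.9)],
]
print(vote_owner_code(candidates, dictionary))  # prints "MOLU"
```

Note how the zero/letter-O confusion in the second position is resolved by the dictionary: only codes that actually exist can accumulate votes.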
Figure 6 shows the entire configuration of the process.

(1) Find and recognize "U" in a multiplexed binary image obtained by using multiple binarization thresholds.
(2) Tighten the admissible conditions on the directions and numbers of character strings of a container number, using the found "U" as a basis for estimating the brightness relationship between the characters and the background and the character sizes.
(3) If "U" is not found due to deterioration of a container, avoid imposing excessive requirements on the quality of the "U" part by performing character string candidate extraction based on correlations [12], which has conventionally been used for cutting out vehicle number areas.

The contents of the process are explained in detail below.

Fig. 6. Image processing and ID mark recognition flow.

3.2.1. Switching the presumed conditions of the process

The presumed conditions of the process can be switched in response to the frequencies of occurrence of the kinds of containers at a container gate. This paper presumes the case of a dark character part against the background. If applying this presumption does not yield a solution with a high match score, the process is repeated after switching the presumed condition to the case of a bright character part.
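The presumption-switching retry described above amounts to a simple control loop; a minimal sketch follows, with the recognizer replaced by a stub (the function name, threshold, and return convention are assumptions, not the authors' interface).

```python
# Sketch of switching the presumed character polarity: try the more
# frequent case (dark characters) first, and repeat the whole process
# under the opposite presumption if no solution with a high match score
# is found. The recognizer below is a stub standing in for the full
# extraction-and-recognition pipeline.

def recognize_with_polarity_switch(image, recognize, threshold=0.8):
    # recognize(image, dark_characters) -> (result, match_score)
    for dark_characters in (True, False):  # order reflects gate statistics
        result, score = recognize(image, dark_characters)
        if score >= threshold:
            return result, dark_characters
    return None, None

# stub: this image only resolves under the bright-character presumption
def stub(image, dark_characters):
    return (None, 0.1) if dark_characters else ("ABCU1234567", 0.95)

print(recognize_with_polarity_switch("img", stub))
```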
3.2.2. Image processing for extracting the character string

A morphology filter [23] having a kernel of horizontal length (80 × 1) is used to eliminate long components in the horizontal direction (such as metallic devices on the upper part of a container), and the result is binarized. The result of applying a morphology filter having a kernel of vertical length (64 × 1) to eliminate long vertical components (e.g., the metallic locking bar) is similarly binarized. Finally, the logical product (AND) of these binary images is taken. Three different binarization thresholds, 15, 30, and 45 (out of 256 gray levels), are generated simultaneously; the image binarized with a threshold of 30 is processed first, and the images binarized with thresholds of 15 and 45 are used if processing of that image fails.

3.2.3. Character "U" search

After labeling the image of Section 3.2.2, eliminating large labels (in surface area, width, and height), and merging labels, the features (48 dimensions) of weighted directional index histograms [19] are generated and recognition is performed by the pseudo-Mahalanobis distance method. The condition for finding "U" is that the distance of a label from the "U" template is less than some reference value. When multiple candidates meet this condition, those located in the upper part have priority over those in the lower part, and those located to the right have priority over those located to the left; the next candidate is selected if the subsequent processing fails.

3.2.4. Character area estimation

Starting from the "U" label selected in Section 3.2.3, the sizes of the labels taken as subsequent character candidates are limited. Upper and lower limits on the label height relative to the "U" label are set, and only an upper limit on the width is set (no lower limit, since the width of "1" is narrow). For all labels whose horizontal central coordinate is within a specific range of the horizontal central coordinate of the "U" label, vertical writing is inferred if the sum of the label heights exceeds some level, and horizontal writing is inferred otherwise. When horizontal writing is inferred, whether a label of a size within the standard range exists to the left, right, or lower side of the current label is checked sequentially, and the character area is determined; the other characters found in this way are then passed to character string recognition. When vertical writing is inferred, the character string area is determined by checking labels in the upward–downward direction, assuming a string in the vertical direction.

3.3. Character segmentation and character recognition

The characteristics of the scheme for simultaneous character segmentation and recognition are as follows.

(1) A dynamic programming method that can simultaneously evaluate optimal character label selection and character recognition is used.
(2) The check digit is used to perform re-recognition after mandatory frame-elimination operations.
(3) The owner code receiving the most votes in the dictionary voting scheme is selected.
(4) Recognition is performed by using multiplexed binarized images in order to improve recognition accuracy on low-quality characters. For the final character, the result of recognition after frame processing is integrated.

3.3.1. Character recognition processing

Character recognition uses weighted directional index histograms (4 directions, 48 dimensions) as features, and the recognition scheme is a pseudo-Mahalanobis distance method [19]. As the letter and numeral data for constructing the recognition templates, about 10,000 Singaporean vehicle number images photographed previously by the authors (with variations of up to about a factor of 10 in the number of samples per character class) and about 1000 images photographed in a container yard were used.

3.3.2. Character string recognition by dynamic programming

In the character label coordinate system, the horizontal coordinate is assumed to increase from left to right and the vertical coordinate from top to bottom. In the case of horizontal writing, labels are numbered from left to right according to the bottom-left coordinate of each label. In the case of multiple rows, labels are numbered sequentially along the character string by adding a certain offset value across rows. In the case of vertical writing, the numbers are assigned from top to bottom from the upper position of each label. The case of horizontal labeling is discussed below. Let the initial label number constituting the i-th character be j(i) and the final label number be k(i), where k(i) ≥ j(i) and j(i) > k(i – 1). The total character string recognition evaluation value is
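From the definitions of the character likelihood l, the gap likelihood g, and the skip penalty h given in the surrounding text, the evaluation value of Eq. (1) plausibly takes the following form; the exact typeset expression is a hedged reconstruction, not the authors' original formula.

```latex
% Hedged reconstruction of Eq. (1): a sum over the 11 character
% positions of the character, gap, and skip terms defined in the text,
% maximized over the label assignments j(i), k(i).
E = \max_{\{j(i),\,k(i)\}} \sum_{i=1}^{11}
    \Bigl[\, l\bigl(i, j(i), k(i)\bigr)
           + g\bigl(i, k(i-1), j(i)\bigr)
           + h\bigl(i, k(i-1)+1,\, j(i)-1\bigr) \Bigr]
```

Here h is defined (with a negative sign built in) so that skipping labels that themselves look like characters is penalized, which matches its description as a "character unlikelihood."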
(1)

Here, l(i, j(i), k(i)) is the character likelihood of the image obtained by merging the labels from j(i) to k(i) as the i-th character (the value obtained under the assumption that the character likelihood increases as the values obtained by subtracting a constant from the calculated distances from the English letter templates for 1 ≤ i ≤ 3, "U" for i = 4, and the numeral templates for 5 ≤ i ≤ 11 increase). Label merging treats a set of multiple unconnected labels as one label over the character recognition process, and the image thus obtained is called a merged image. The character likelihood is obtained from a multiplexed binary image by selecting the minimum of the distance values of the recognition results for the individual binary images and subtracting a constant from it.

g(i, k(i – 1), j(i)) is the space or gap likelihood between the (i – 1)-th character and the i-th character. This value is assigned a constant positive value when the gap is within a certain range and a negative penalty outside that range. Although this term is meaningless when i = 1, we let g = 0 there in order to unify the summation range with the other terms. Furthermore, h(i, k(i – 1) + 1, j(i) – 1) is the character unlikelihood of the image obtained by merging the multiple labels existing in the space between the (i – 1)-th character and the i-th character into one; it is an evaluation value for avoiding skipping a character. Specifically, as an example, let

(2)

and subtract the label character likelihood portion of the gap or space. When k(i – 1) + 1 ≥ j(i), let h = 0. In addition, for i = 1, as with g, let h = 0.

The selection of the m-th optimal character label is expressed recursively; letting L(0, j(0), m(0)) = 0, we solve this recurrence relation by dynamic programming.

The method of estimating the owner code by voting [representing the i-th character of a code by C(i)] takes the owner code yielding the maximum value of the voted character recognition results as the estimation result. Specifically, if the top n character recognition results for the i-th character and the corresponding character likelihood evaluation values are represented by R1(i), R2(i), . . . , Rn(i) and V1(i), V2(i), . . . , Vn(i), then

E(C) = Σ (i = 1, . . . , 4) Σ (m = 1, . . . , n) vm(i),

where

vm(i) = Vm(i) if Rm(i) = C(i), and vm(i) = 0 otherwise.

The owner code C for which E(C) becomes a maximum is the recognition result. This amounts to a simplified version of the "Ambiguous Term Search" [21] used in recognizing such items as geographical names.

3.3.3. Frame processing

The frame operation is performed only when a merged image is recognized as the check digit. The contents of frame processing are explained for the example of eliminating the left frame line of a merged image. Let the value after binarization be 1 for the character and the frame line and 0 for the background, and let the vertical (Y) and horizontal (X) coordinate ranges of the merged image be Y0 ≤ Y ≤ YM and X0 ≤ X ≤ XN. The merged image is searched from the left edge (X0) toward the right, and the position X at which the value changes from the character part to the background part (from 1 to 0) is obtained. Let this position for height Y be X(Y); take the histogram of X(Y) over all Y = Y0, Y1, . . . , YM, and let its most frequent value be Xs. The left edge of the image is then set to 0 from X0 to Xs. A similar procedure is performed on the right, upper, and lower sides of the merged image. Figure 7 shows a conceptual diagram of the frame process.
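The left-frame elimination just described is fully specified and can be sketched directly; the array representation below is an assumption about how the binarized merged image is stored.

```python
import numpy as np

# Sketch of the frame-elimination step of Section 3.3.3 (left edge case):
# scan each row of the binarized merged image from the left, record the
# position X(Y) where the value first changes from character (1) to
# background (0), take the most frequent such position Xs over all rows,
# and set the columns from the left edge up to Xs to background.
def remove_left_frame(img):
    out = img.copy()
    transitions = []
    for row in img:
        idx = np.flatnonzero((row[:-1] == 1) & (row[1:] == 0))
        if idx.size:                   # first 1 -> 0 change, left to right
            transitions.append(idx[0] + 1)
    if transitions:
        xs = np.bincount(transitions).argmax()  # most frequent position Xs
        out[:, :xs + 1] = 0                     # clear left edge through Xs
    return out
```

The right, upper, and lower frame lines are handled by the same procedure applied to the corresponding edges; because the histogram picks the most frequent transition position, a straight frame line dominates over the ragged transitions produced by the digit itself.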
Table 2. Processing devices

Image processing device (A/D board): MHI-manufactured board [22]
Number recognition device (PC): OS: FreeBSD; CPU: Pentium II, 333 MHz; memory: 128 Mbyte
Camera: Toshiba CCD camera IK-542
Camera trigger sensor: Hokuyo Denki PD1
Table 4. Recognition results for whole images

Recognition result: Number of items (%)
Correct solution: 558 (92.8%)
One character unclear: 9 (1.5%)
All characters unclear: 21 (3.5%)
Erroneous recognition: 13 (2.2%)
Total: 601 (100.0%)

Table 5. Recognition results for appropriate images

Recognition result: Number of items (%)
Correct solution: 550 (97.9%)
One character unclear: 7 (1.2%)
All characters unclear: 1 (0.2%)
Erroneous recognition: 4 (0.7%)
Total: 562 (100.0%)
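The percentages in Tables 4 and 5 follow from the item counts divided by the respective totals; the short check below reproduces them.

```python
# Consistency check of Tables 4 and 5: percentages are item counts over
# the totals (601 whole images, 562 appropriate images).
whole = {"correct": 558, "one_unclear": 9, "all_unclear": 21, "wrong": 13}
appropriate = {"correct": 550, "one_unclear": 7, "all_unclear": 1, "wrong": 4}

def rates(counts):
    total = sum(counts.values())
    return total, {k: round(100.0 * v / total, 1) for k, v in counts.items()}

print(rates(whole))        # total 601; correct 92.8%, erroneous 2.2%
print(rates(appropriate))  # total 562; correct 97.9%, erroneous 0.7%
```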
Fig. 10. Tank container removed as inappropriate image.
Table 7. Problems and solutions

Problem: Erroneous recognition of a scratched character
Cause: Erroneous recognition as "1" when the strokes of metal parts (in many cases, vertical rods) are binarized in multiplexed binary image recognition
Solution: Check the character layout (positional relations with neighboring characters) and the background luminance

Problem: Displacements due to a character not recognized as a character
Cause: A deteriorated character is judged a noncharacter, and the subsequent character string is recognized with displacements
Solution: Check the layout of each place of the recognized character string

A major cause of erroneous recognition is that labels in the binarized image containing images other than characters (e.g., the shadow of a metal rod, or part of the frame around the check digit), when converted into character recognition features, are close to items in the dictionary [1]. In recognizing a low-quality character string in outdoor environments, character recognition is performed to determine which label is a character before identifying the kind of character, and the label with the shortest distance to a dictionary item is used as the character label. In doing so, however, there is a major problem: noncharacter noise influences whether a character is read as "1" or "I." Thus, even if characters with significant deterioration are rejected as noncharacters, interpretations that merely match the required number of places may be accepted. In order to resolve this problem, it will be necessary to interpret the layout of the position of each character, or to determine whether the character size or width and the character and background luminance relationships match those of the other characters.

In the near future, the authors plan to enhance the applicability of their container mark recognition device by addressing the above points and conducting tests covering the illumination conditions and filters that were omitted in this study.
5. Conclusions

An automatic container mark recognition device has been developed for implementation in container terminals. The device recognizes the mark on a container by photographing the back surface of the container in outdoor environments with a camera device installed at the container gate. The camera system has been designed to absorb the effects of changes in the intensity of illumination due to sunlight and of variations in the colors of containers and characters. In addition, a recognition scheme for multiplexed binarized images centered on a dynamic programming method has been developed in order to handle variations in the layouts and locations of container mark character strings and degradation of characters due to scratches or contamination.

Evaluation of the performance of the device by field tests in a container terminal yielded a recognition rate of 92.8% for all data photographed over a period of 5 days. Data excluding exceptional cases and scratched or contaminated characters that cannot be read by humans are taken as appropriate data. For the appropriate data, the recognition rate was 97.9% and the erroneous recognition rate was 0.7%.

Although this recognition rate is high for field tests, further study is needed in order to reduce erroneous container recognition at an unmanned container gate.

REFERENCES

1. Ministry of Transportation. 1999 Transportation White Paper—New Developments in Urban Transportation Policies for the 21st Century, p 374–384.
2. http://www.samsys.com/art-8-1996.html
3. Lee C, Kankanhalli A. Automatic extraction of characters in complex scene images. Int J Pattern Recognition Artif Intell 1995;9:67–82.
4. Fos first out of the gate. Cargo Systems, p 446–447, July 1994.
5. Hamada H, Ikeya N, Ito K. Development of automatic container mark recognition system. Japanese Mechanical Engineering Society 73rd Regular Meeting Lectures and Papers (IV), p 362–363, 1996.
6. Watanabe I. Containers and international regulations. Kowan, August Issue, p 9–14, 1995.
7. Tsutsumida T, Kido T, Ota K, Kimura F, Iwata A. New developments in the study of character recognition. Current status of handwritten number recognition in postal number data. Tech Rep IEICE 1997;PRU96-190.
8. Matsui T, Yamashita I, Wakahara A, Yoshimuro M. The First Character Recognition Technology Contest Results. Symposium Lectures and Papers, p 15–22, 1993.
9. Fujimoto K, Horino M, Fujikawa Y. Number plate reading device based on real time image processing. Omron Technics 1990;30:9–18.
10. Kato H, Nanbu M, Fukui H, Yorita F, Aoki M. Number plate recognition technology. Mitsubishi Denki Giho 1988;62:9–12.
11. Iida Y, Nakayama H, Miyamoto K, Fujita K, Urata H. Development of high-speed image processing device and its applications. Mitsubishi Juko Giho 1990;27:76–80.
12. Miyamoto K, Tamagawa M, Fujita K, Hayama Y, Ayaho S. Character string region detection scheme based on correlations. Trans IEICE 1998;J81-D-II:2052–2060.
13. Miyamoto K, Kumano S, Sugimoto K, Tamagawa M, Ayaho S. Recognition scheme for low quality characters using multiple features. Trans IEICE 1999;J82-D-II:771–779.
14. Miyamoto K, Kumano S, Tamagawa M, Sugimoto K, Urata H. Development of character recognition technology in transportation and shipping industries. Mitsubishi Juko Giho 1996;33:404–407.
15. Kimura F, Tsuruoka S, Shridhar M, Chen Z. Context directed handwritten word recognition for postal service applications. Proc Fifth Advanced Technology Conference, p 199–213, Washington, DC, 1992.
16. Burges CJC, Ben JI, Denker JS, Lecun Y, Nohl CR. Off line recognition of handwritten postal words using neural networks. Int J Pattern Recognition Artif Intell 1993;7:689–704.
17. Sakaguchi M. Dynamic programming methods. Shibundo; 1968.
18. Miyamoto K. A study on number plate recognition. Thesis, University of Kyoto, 2000.
19. Harata T, Tsuruoka N, Kimura F, Miyake K. Handwritten Chinese and Japanese character recognition using weighted directional index histograms and pseudobase identification method. Tech Rep IEICE 1984;PRL83-68.
20. Hagita N, Naito S, Masuda I. Identification of handwritten Chinese characters by contour direction contribution feature. Trans IEICE 1983;J66-D:1185–1192.
21. Nakabayashi K, Kitamura M, Kawaoka T. High-speed frame-less handwritten character reading scheme using ambiguous term search. Trans IEICE 1991;J74-D-II:1528–1537.
22. Tsutsuminaka T, Mimochi K. Real-time dynamic image processing device. 4th Image Sensing Symposium Lectures and Papers, 1998.
23. Obata H. Morphology. Corona Press; 1996.
Shintaro Kumano (member) received his B.S. degree from the Department of Computer Science, University of Tokyo, in 1985 and joined Mitsubishi Heavy Industries, Ltd. His research interests include pattern recognition and signal processing. He received an M.S. degree from the Georgia Institute of Technology in 1992. He is currently a technical leader at the Takasago Research and Development Center, Mitsubishi Heavy Industries, Ltd. He is a member of the Japan Society for Nuclear Energy.
Kazumasa Miyamoto (member) received his M.S. degree from the Department of Computer Engineering, University
of Kyoto, in 1974 and joined Mitsubishi Heavy Industries, Ltd. He was affiliated with the Hiroshima Research Laboratory. He
has investigated image processing, pattern recognition, and signal processing at the System Technology Development Center
(Kobe), Technology Headquarters. He received a D.Eng. degree from the University of Kyoto in 1999. He is currently a leader
at the Takasago Research and Development Center, Mitsubishi Heavy Industries, Ltd. He received a 1991 Kobe City Technology
Contributor Award and 1995 Defense Technology Inventor Award.
AUTHORS (continued)
Mitsuaki Tamagawa received his M.S. degree from the Department of Electrical Engineering, Waseda University and
joined Mitsubishi Heavy Industries, Ltd. He has been engaged in research and development on image processing and pattern
recognition. He is currently a technical leader at the Technical Laboratory, Takasago Research and Development Center. He is
a member of the Image Information Media Society.
Hiroaki Ikeda received his B.S. degree from the Department of Precision Engineering, University of Hiroshima, in 1975
and joined Konica (Ltd.). He moved to Mitsubishi Heavy Industries, Ltd. in 1990. His research interests include laser sensors
and other optical applications. He is currently a research leader in the Printing Machines Research Laboratory, Hiroshima
Research Center. He is a member of the Applied Physics Society.
Koji Kan received his M.S. degree from the Department of Electronic Engineering, Osaka Prefectural University, in 1988
and joined Mitsubishi Heavy Industries, Ltd. He has worked on the design of electrical devices for ships and other marine
machinery. He is currently a leader at the Electrical Equipment Control Designing and Planning Department, Ships and Seas
Division, Kobe Shipyard and Machinery Works.