OpenCV Text Detection (EAST Text Detector)

8/27/2019 OpenCV Text Detection (EAST text detector) - PyImageSearch

by Adrian Rosebrock on August 20, 2018 in Deep Learning, Optical Character Recognition (OCR), Tutorials
In this tutorial you will learn how to use OpenCV to detect text in natural scene images using the EAST text detector.
OpenCV’s EAST text detector is a deep learning model, based on a novel architecture and training pattern. It is capable
of (1) running at near real-time at 13 FPS on 720p images and (2) obtaining state-of-the-art text detection accuracy.
In the remainder of this tutorial you will learn how to use OpenCV’s EAST detector to automatically detect text in both
images and video streams.
To discover how to apply text detection with OpenCV, just keep reading!
https://www.pyimagesearch.com/2018/08/20/opencv-text-detection-east-text-detector/
OpenCV Text Detection (EAST text detector) Demo
In this tutorial, you will learn how to use OpenCV to detect text in images using the EAST text detector.
The EAST text detector requires that we are running OpenCV 3.4.2 or OpenCV 4 on our systems — if you do not already
have OpenCV 3.4.2 or better installed, please refer to my OpenCV install guides and follow the one for your respective
operating system.
In the first part of today’s tutorial, I’ll discuss why detecting text in natural scene images can be so challenging.
From there I’ll briefly discuss the EAST text detector, why we use it, and what makes the algorithm so novel — I’ll also
include links to the original paper so you can read up on the details if you are so inclined.
Finally, I’ll provide my Python + OpenCV text detection implementation so you can start applying text detection in your
own applications.
Why is natural scene text detection so challenging?

Figure 1: Examples of natural scene images where text detection is challenging due to lighting conditions, image
quality, and non-planar objects (Figure 1 of Mancas-Thillou and Gosselin).
Detecting text in constrained, controlled environments can typically be accomplished by using heuristic-based
approaches, such as exploiting gradient information or the fact that text is typically grouped into paragraphs and
characters appear on a straight line. An example of such a heuristic-based text detector can be seen in my previous blog
post on Detecting machine-readable zones in passport images.
Natural scene text detection is different though, and much more challenging.
Due to the proliferation of cheap digital cameras, and not to mention the fact that nearly every smartphone now has a
camera, we need to be highly concerned with the conditions the image was captured under — and furthermore, what
assumptions we can and cannot make. I’ve included a summarized version of the natural scene text detection challenges
described by Celine Mancas-Thillou and Bernard Gosselin in their excellent 2007 paper, Natural Scene Text
Understanding below:
Image/sensor noise: Sensor noise from a handheld camera is typically higher than that of a traditional scanner.
Additionally, low-priced cameras will typically interpolate the pixels of raw sensors to produce real colors.
Viewing angles: Natural scene text can naturally have viewing angles that are not parallel to the text, making the
text harder to recognize.
Blurring: Uncontrolled environments tend to have blur, especially if the end user is utilizing a smartphone that does
not have some form of stabilization.
Lighting conditions: We cannot make any assumptions regarding our lighting conditions in natural scene images. It
may be near dark, the flash on the camera may be on, or the sun may be shining brightly, saturating the entire image.
Resolution: Not all cameras are created equal — we may be dealing with cameras with sub-par resolution.
Non-paper objects: Most, but not all, paper is not reflective (at least in context of paper you are trying to scan). Text
in natural scenes may be reflective, including logos, signs, etc.
Non-planar objects: Consider what happens when you wrap text around a bottle — the text on the surface becomes
distorted and deformed. While humans may still be able to easily “detect” and read the text, our algorithms will
struggle. We need to be able to handle such use cases.
Unknown layout: We cannot use any a priori information to give our algorithms “clues” as to where the text resides.
As we’ll learn, OpenCV’s text detector implementation of EAST is quite robust, capable of localizing text even when it’s
blurred, reflective, or partially obscured:

Figure 2: OpenCV’s EAST scene text detector will detect text even in blurry and obscured images.
I would suggest reading Mancas-Thillou and Gosselin’s work if you are further interested in the challenges associated
with text detection in natural scene images.
The EAST deep learning text detector
Figure 3: The structure of the EAST text detection Fully-Convolutional Network (Figure 3 of Zhou et al.).
With the release of OpenCV 3.4.2 and OpenCV 4, we can now use a deep learning-based text detector called EAST,
which is based on Zhou et al.’s 2017 paper, EAST: An Efficient and Accurate Scene Text Detector.
We call the algorithm “EAST” because it’s an: Efficient and Accurate Scene Text detection pipeline.
The EAST pipeline is capable of predicting words and lines of text at arbitrary orientations on 720p images, and
furthermore, can run at 13 FPS, according to the authors.
Perhaps most importantly, since the deep learning model is end-to-end, it is possible to sidestep computationally
expensive sub-algorithms that other text detectors typically apply, including candidate aggregation and word partitioning.
To build and train such a deep learning model, the EAST method utilizes novel, carefully designed loss functions.

For more details on EAST, including architecture design and training methods, be sure to refer to the publication by the
authors.
Project structure
To start, be sure to grab the source code + images to today’s post by visiting the “Downloads” section. From there,
simply use the tree terminal command to view the project structure:
$ tree --dirsfirst
.
├── images
│   ├── car_wash.png
│   ├── lebron_james.jpg
│   └── sign.jpg
├── frozen_east_text_detection.pb
├── text_detection.py
└── text_detection_video.py

1 directory, 6 files
Notice that I’ve provided three sample pictures in the images/ directory. You may wish to add your own images collected
with your smartphone or ones you find online.
We’ll be reviewing two .py files today:
text_detection.py : Detects text in static images.

text_detection_video.py : Detects text via your webcam or input video files.
Both scripts make use of the serialized EAST model ( frozen_east_text_detection.pb ) provided for your convenience in
the “Downloads”.
Implementation notes
The text detection implementation I am including today is based on OpenCV’s official C++ example; however, I must
admit that I had a bit of trouble when converting it to Python.
To start, there are no Point2f and RotatedRect functions in Python, and because of this, I could not 100% mimic the
C++ implementation. The C++ implementation can produce rotated bounding boxes, but unfortunately the one I am
sharing with you today cannot.
Secondly, the NMSBoxes function does not return any values for the Python bindings (at least for my OpenCV 4 pre-
release install), ultimately resulting in OpenCV throwing an error. The NMSBoxes function may work in OpenCV 3.4.2 but I
wasn’t able to exhaustively test it.
I got around this issue by using my own non-maxima suppression implementation in imutils, but again, I don’t believe
these two are 100% interchangeable as it appears NMSBoxes accepts additional parameters.
Given all that, I’ve tried my best to provide you with the best OpenCV text detection implementation I could, using the
working functions and resources I had. If you have any improvements to the method please do feel free to share them in
the comments below.
Implementing our text detector with OpenCV

Before we get started, I want to point out that you will need at least OpenCV 3.4.2 (or OpenCV 4) installed on your
system to utilize OpenCV’s EAST text detector, so if you haven’t already installed OpenCV 3.4.2 or better on your
system, please refer to my OpenCV install guides.
Next, make sure you have imutils installed/upgraded on your system as well:
$ pip install --upgrade imutils
At this point your system is now configured, so open up text_detection.py and insert the following code:
1 # import the necessary packages
2 from imutils.object_detection import non_max_suppression
3 import numpy as np
4 import argparse
5 import time
6 import cv2
7
8 # construct the argument parser and parse the arguments
9 ap = argparse.ArgumentParser()
10 ap.add_argument("-i", "--image", type=str,
11     help="path to input image")
12 ap.add_argument("-east", "--east", type=str,
13     help="path to input EAST text detector")
14 ap.add_argument("-c", "--min-confidence", type=float, default=0.5,
15     help="minimum probability required to inspect a region")
16 ap.add_argument("-w", "--width", type=int, default=320,
17     help="resized image width (should be multiple of 32)")
18 ap.add_argument("-e", "--height", type=int, default=320,
19     help="resized image height (should be multiple of 32)")
20 args = vars(ap.parse_args())
To begin, we import our required packages and modules on Lines 2-6. Notably we import NumPy, OpenCV, and my
implementation of non_max_suppression from imutils.object_detection .
We then proceed to parse five command line arguments on Lines 9-20:
--image : The path to our input image.

--east : The EAST scene text detector model file path.
--min-confidence : Probability threshold to determine text. Optional with default=0.5 .
--width : Resized image width — must be multiple of 32. Optional with default=320 .
--height : Resized image height — must be multiple of 32. Optional with default=320 .
Important: The EAST text detector requires that your input image dimensions be multiples of 32, so if you choose to
adjust your --width and --height values, make sure they are multiples of 32!
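If you would rather not work out a valid size by hand, a small helper can snap any requested dimension to the nearest multiple of 32. This is only a sketch: snap_to_multiple is a hypothetical name of my own and is not part of the downloadable script.

```python
def snap_to_multiple(value, base=32):
    """Round value to the nearest positive multiple of base."""
    snapped = int(round(value / float(base))) * base
    # never return 0, since the network needs at least one 32x32 tile
    return max(base, snapped)


# a few requested sizes and the sizes we would actually pass to the script
for requested in (300, 320, 1000):
    print(requested, "->", snap_to_multiple(requested))
```

You could then pass the snapped values as --width and --height.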
From there, let’s load our image and resize it:

22 # load the input image and grab the image dimensions
23 image = cv2.imread(args["image"])
24 orig = image.copy()
25 (H, W) = image.shape[:2]
26
27 # set the new width and height and then determine the ratio in change
28 # for both the width and height
29 (newW, newH) = (args["width"], args["height"])
30 rW = W / float(newW)
31 rH = H / float(newH)
32
33 # resize the image and grab the new image dimensions
34 image = cv2.resize(image, (newW, newH))
35 (H, W) = image.shape[:2]
On Lines 23 and 24, we load and copy our input image.
From there, Lines 30 and 31 determine the ratio of the original image dimensions to new image dimensions (based on
the command line argument provided for --width and --height ).
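To make the role of these ratios concrete, here is a minimal sketch with made-up numbers. The 1280x720 source size and the example box are assumptions chosen for illustration, and scale_box is a hypothetical helper rather than something from the script.

```python
def scale_box(box, r_w, r_h):
    """Map a box from resized-image coordinates back to the original image."""
    (start_x, start_y, end_x, end_y) = box
    return (int(start_x * r_w), int(start_y * r_h),
            int(end_x * r_w), int(end_y * r_h))


# hypothetical original image size and the default network input size
(orig_w, orig_h) = (1280, 720)
(new_w, new_h) = (320, 320)

# same ratio computation as in the script
r_w = orig_w / float(new_w)
r_h = orig_h / float(new_h)

# a box found in the 320x320 image, mapped back onto the 1280x720 original
print(scale_box((80, 160, 160, 200), r_w, r_h))
```

Because the resize ignores aspect ratio, the two ratios generally differ, which is why the x and y coordinates are scaled separately.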
Then we resize the image, ignoring aspect ratio (Line 34).
In order to perform text detection using OpenCV and the EAST deep learning model, we need to extract the output
feature maps of two layers:
37 # define the two output layer names for the EAST detector model that
38 # we are interested -- the first is the output probabilities and the
39 # second can be used to derive the bounding box coordinates of text
40 layerNames = [
41     "feature_fusion/Conv_7/Sigmoid",
42     "feature_fusion/concat_3"]
We construct a list of layerNames on Lines 40-42:
1. The first layer is our output sigmoid activation which gives us the probability of a region containing text or not.
2. The second layer is the output feature map that represents the “geometry” of the image. We’ll be able to use this
geometry to derive the bounding box coordinates of the text in the input image.
Let’s load OpenCV’s EAST text detector:

44 # load the pre-trained EAST text detector
45 print("[INFO] loading EAST text detector...")
46 net = cv2.dnn.readNet(args["east"])
47
48 # construct a blob from the image and then perform a forward pass of
49 # the model to obtain the two output layer sets
50 blob = cv2.dnn.blobFromImage(image, 1.0, (W, H),
51 (123.68, 116.78, 103.94), swapRB=True, crop=False)
52 start = time.time()
53 net.setInput(blob)
54 (scores, geometry) = net.forward(layerNames)
55 end = time.time()
56
57 # show timing information on text prediction
58 print("[INFO] text detection took {:.6f} seconds".format(end - start))
We load the neural network into memory using cv2.dnn.readNet by passing the path to the EAST detector (contained in
our command line args dictionary) as a parameter on Line 46.
Then we prepare our image by converting it to a blob on Lines 50 and 51. To read more about this step, refer to Deep
learning: How OpenCV’s blobFromImage works.
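For intuition, the preprocessing requested on Lines 50 and 51 can be approximated in plain NumPy. This is a sketch of my understanding of blobFromImage with these particular arguments (scale factor 1.0, no resizing needed, swapRB=True), not OpenCV's actual implementation; manual_blob is a hypothetical name.

```python
import numpy as np


def manual_blob(image_bgr, mean_rgb=(123.68, 116.78, 103.94)):
    # swapRB=True: reverse the channel axis so BGR becomes RGB
    rgb = image_bgr[:, :, ::-1].astype(np.float32)
    # subtract the per-channel means (given in R, G, B order)
    rgb -= np.asarray(mean_rgb, dtype=np.float32)
    # reorder height x width x channels to channels x height x width,
    # then add the leading batch dimension
    return np.transpose(rgb, (2, 0, 1))[np.newaxis, ...]


# a synthetic 320x320 image filled with a single BGR color
image = np.zeros((320, 320, 3), dtype=np.uint8)
image[:, :] = (10, 20, 30)
blob = manual_blob(image)
print(blob.shape)
```

The result is a 4D array in batch, channel, height, width order, which is the layout net.setInput expects.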
To predict text we can simply set the blob as input and call net.forward (Lines 53 and 54). These lines are
surrounded by grabbing timestamps so that we can print the elapsed time on Line 58.
By supplying layerNames as a parameter to net.forward , we are instructing OpenCV to return the two feature maps
that we are interested in:
The output geometry map used to derive the bounding box coordinates of text in our input images
And similarly, the scores map, containing the probability of a given region containing text
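It helps to know the shapes of these two volumes before looping over them. The sketch below assumes the 4x downsampling and the five geometry channels used later in this script; east_output_shapes is a hypothetical helper, and you can confirm the real values by printing scores.shape and geometry.shape.

```python
def east_output_shapes(width, height, stride=4):
    """Expected (scores, geometry) shapes for a given network input size."""
    if width % 32 != 0 or height % 32 != 0:
        raise ValueError("width and height must be multiples of 32")
    (rows, cols) = (height // stride, width // stride)
    # one probability channel; four box distances plus one angle channel
    return ((1, 1, rows, cols), (1, 5, rows, cols))


print(east_output_shapes(320, 320))
```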
We’ll need to loop over each of these values, one-by-one:

60 # grab the number of rows and columns from the scores volume, then
61 # initialize our set of bounding box rectangles and corresponding
62 # confidence scores
63 (numRows, numCols) = scores.shape[2:4]
64 rects = []
65 confidences = []
66
67 # loop over the number of rows
68 for y in range(0, numRows):
69     # extract the scores (probabilities), followed by the geometrical
70     # data used to derive potential bounding box coordinates that
71     # surround text
72     scoresData = scores[0, 0, y]
73     xData0 = geometry[0, 0, y]
74     xData1 = geometry[0, 1, y]
75     xData2 = geometry[0, 2, y]
76     xData3 = geometry[0, 3, y]
77     anglesData = geometry[0, 4, y]
We start off by grabbing the dimensions of the scores volume (Line 63) and then initializing two lists:
rects : Stores the bounding box (x, y)-coordinates for text regions
confidences : Stores the probability associated with each of the bounding boxes in rects
We’ll later be applying non-maxima suppression to these regions.
Looping over the rows begins on Line 68.
Lines 72-77 extract our scores and geometry data for the current row, y .
Next, we loop over each of the column indexes for our currently selected row:

79     # loop over the number of columns
80     for x in range(0, numCols):
81         # if our score does not have sufficient probability, ignore it
82         if scoresData[x] < args["min_confidence"]:
83             continue
84
85         # compute the offset factor as our resulting feature maps will
86         # be 4x smaller than the input image
87         (offsetX, offsetY) = (x * 4.0, y * 4.0)
88
89         # extract the rotation angle for the prediction and then
90         # compute the sin and cosine
91         angle = anglesData[x]
92         cos = np.cos(angle)
93         sin = np.sin(angle)
94
95         # use the geometry volume to derive the width and height of
96         # the bounding box
97         h = xData0[x] + xData2[x]
98         w = xData1[x] + xData3[x]
99
100         # compute both the starting and ending (x, y)-coordinates for
101         # the text prediction bounding box
102         endX = int(offsetX + (cos * xData1[x]) + (sin * xData2[x]))
103         endY = int(offsetY - (sin * xData1[x]) + (cos * xData2[x]))
104         startX = int(endX - w)
105         startY = int(endY - h)
106
107         # add the bounding box coordinates and probability score to
108         # our respective lists
109         rects.append((startX, startY, endX, endY))
110         confidences.append(scoresData[x])
For every row, we begin looping over the columns on Line 80.
We need to filter out weak text detections by ignoring areas that do not have sufficiently high probability (Lines 82 and
83).
The EAST text detector naturally reduces volume size as the image passes through the network. The output feature maps
are 4x smaller than the resized input image, so we multiply the column and row indexes by four to map each cell back
into the coordinate space of that input image (Line 87).
I’ve included how you can extract the angle data on Lines 91-93; however, as I mentioned in the previous section, I
wasn’t able to construct a rotated bounding box from it as is performed in the C++ implementation. If you feel like
tackling the task, starting with the angle on Line 91 would be your first step.
From there, Lines 97-105 derive the bounding box coordinates for the text area.
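To see the arithmetic on a single feature-map cell, here is a standalone sketch using only the math module. decode_cell repeats the computation from Lines 87-105. rotated_rect_params is my own hedged attempt at the rotated variant, written from my recollection of the arithmetic in OpenCV's C++ sample, so treat it as a starting point and verify it against that sample before relying on it. Both function names and the sample numbers are hypothetical.

```python
import math


def decode_cell(x, y, d0, d1, d2, d3, angle, stride=4.0):
    """Axis-aligned box for one cell, following Lines 87-105."""
    (offset_x, offset_y) = (x * stride, y * stride)
    (cos_a, sin_a) = (math.cos(angle), math.sin(angle))
    h = d0 + d2
    w = d1 + d3
    end_x = int(offset_x + (cos_a * d1) + (sin_a * d2))
    end_y = int(offset_y - (sin_a * d1) + (cos_a * d2))
    return (int(end_x - w), int(end_y - h), end_x, end_y)


def rotated_rect_params(x, y, d0, d1, d2, d3, angle, stride=4.0):
    """Center, size and angle (degrees) of a rotated box (unverified sketch)."""
    (offset_x, offset_y) = (x * stride, y * stride)
    (cos_a, sin_a) = (math.cos(angle), math.sin(angle))
    h = d0 + d2
    w = d1 + d3
    # corner of the box that the axis-aligned code calls (endX, endY)
    anchor_x = offset_x + (cos_a * d1) + (sin_a * d2)
    anchor_y = offset_y - (sin_a * d1) + (cos_a * d2)
    # two neighbouring corners, reached by walking along the rotated edges
    p1 = (anchor_x - sin_a * h, anchor_y - cos_a * h)
    p3 = (anchor_x - cos_a * w, anchor_y + sin_a * w)
    center = (0.5 * (p1[0] + p3[0]), 0.5 * (p1[1] + p3[1]))
    return (center, (w, h), -math.degrees(angle))


# cell (x=10, y=5) with made-up distances and no rotation
print(decode_cell(10, 5, 6.0, 20.0, 4.0, 12.0, 0.0))
print(rotated_rect_params(10, 5, 6.0, 20.0, 4.0, 12.0, 0.0))
```

With a zero angle the rotated box's center coincides with the center of the axis-aligned box, which is a useful sanity check. The returned tuple has the same layout that cv2.boxPoints accepts, should you want to draw the four corners.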
We then update our rects and confidences lists, respectively (Lines 109 and 110).
We’re almost finished!
The final step is to apply non-maxima suppression to our bounding boxes to suppress weak overlapping bounding boxes
and then display the resulting text predictions:
112 # apply non-maxima suppression to suppress weak, overlapping bounding
113 # boxes
114 boxes = non_max_suppression(np.array(rects), probs=confidences)
115
116 # loop over the bounding boxes
117 for (startX, startY, endX, endY) in boxes:
118     # scale the bounding box coordinates based on the respective
119     # ratios
120     startX = int(startX * rW)
121     startY = int(startY * rH)
122     endX = int(endX * rW)
123     endY = int(endY * rH)
124
125     # draw the bounding box on the image
126     cv2.rectangle(orig, (startX, startY), (endX, endY), (0, 255, 0), 2)
127
128 # show the output image
129 cv2.imshow("Text Detection", orig)
130 cv2.waitKey(0)
As I mentioned in the previous section, I could not use the non-maxima suppression in my OpenCV 4 install (
cv2.dnn.NMSBoxes ) as the Python bindings did not return a value, ultimately causing OpenCV to error out. I wasn’t fully
able to test in OpenCV 3.4.2 so it may work in v3.4.2.
Instead, I have used my non-maxima suppression implementation available in the imutils package (Line 114). The
results still look good; however, I wasn’t able to compare my output to the NMSBoxes function to see if they were
identical.
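If you are curious what non-maxima suppression does under the hood, here is a minimal greedy sketch in pure Python. It is a generic intersection-over-union version written for illustration. It is not the imutils code and not NMSBoxes, both of which may measure overlap differently, and the 0.3 threshold is an assumption.

```python
def iou(a, b):
    """Intersection over union of two (startX, startY, endX, endY) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / float(union) if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.3):
    """Keep the highest-scoring boxes, dropping any that overlap a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return [boxes[i] for i in keep]


# two heavily overlapping detections of the same word plus one separate box
boxes = [(10, 10, 50, 30), (12, 11, 52, 31), (100, 100, 140, 120)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))
```

The second box overlaps the first almost entirely and has a lower score, so only the first and third boxes survive.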
Lines 117-126 loop over our bounding boxes , scale the coordinates back to the original image dimensions, and draw
the output to our orig image. The orig image is displayed until a key is pressed (Lines 129 and 130).
As a final implementation note I would like to mention that our two nested for loops used to loop over the scores and
geometry volumes on Lines 68-110 would be an excellent example of where you could leverage Cython to dramatically
speed up your pipeline. I’ve demonstrated the power of Cython in Fast, optimized ‘for’ pixel loops with OpenCV and
Python.
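Cython is one route. Another option, which I am swapping in here purely as an illustration, is to vectorize the decoding with NumPy so that the nested loops disappear. The sketch below assumes the same volume layout as the script; decode_predictions_vectorized is a hypothetical name, and I have not benchmarked it against the loop version.

```python
import numpy as np


def decode_predictions_vectorized(scores, geometry, min_confidence=0.5):
    scores_map = scores[0, 0]
    # four distance channels followed by the angle channel
    (d0, d1, d2, d3, angles) = geometry[0]

    # row (y) and column (x) indexes of every cell that passes the threshold
    (ys, xs) = np.nonzero(scores_map >= min_confidence)
    (d0, d1, d2, d3) = (d0[ys, xs], d1[ys, xs], d2[ys, xs], d3[ys, xs])
    cos = np.cos(angles[ys, xs])
    sin = np.sin(angles[ys, xs])

    # feature maps are 4x smaller than the network input
    (offset_x, offset_y) = (xs * 4.0, ys * 4.0)
    h = d0 + d2
    w = d1 + d3

    # astype(int) truncates toward zero, like int() in the loop version
    end_x = (offset_x + (cos * d1) + (sin * d2)).astype(int)
    end_y = (offset_y - (sin * d1) + (cos * d2)).astype(int)
    start_x = (end_x - w).astype(int)
    start_y = (end_y - h).astype(int)

    rects = np.stack([start_x, start_y, end_x, end_y], axis=1)
    return (rects, scores_map[ys, xs])


# tiny synthetic volumes: one confident cell at row 1, column 2
scores = np.full((1, 1, 2, 3), 0.1)
scores[0, 0, 1, 2] = 0.9
geometry = np.zeros((1, 5, 2, 3))
geometry[0, :4, 1, 2] = (6.0, 20.0, 4.0, 12.0)
(rects, confidences) = decode_predictions_vectorized(scores, geometry)
print(rects, confidences)
```

The returned rects array can be passed straight to non_max_suppression, since that call already wraps the list in np.array.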
OpenCV text detection results

Are you ready to apply text detection to images?
Start by grabbing the “Downloads” for this blog post and unzip the files.
From there, you may execute the following command in your terminal (taking note of the two command line arguments):
$ python text_detection.py --image images/lebron_james.jpg \
    --east frozen_east_text_detection.pb
[INFO] loading EAST text detector...
[INFO] text detection took 0.142082 seconds
Your results should look similar to the following image:
Figure 4: Famous basketball player, Lebron James’ jersey text is successfully recognized with OpenCV and EAST
text detection.
Three text regions are identified on Lebron James.
Now let’s try to detect text of a business sign:

$ python text_detection.py --image images/car_wash.png \
    --east frozen_east_text_detection.pb
[INFO] loading EAST text detector...
[INFO] text detection took 0.142295 seconds
Figure 5: Text is easily recognized with Python and OpenCV using EAST in this natural scene of a car wash
station.
And finally, we’ll try a road sign:

$ python text_detection.py --image images/sign.jpg \
    --east frozen_east_text_detection.pb
[INFO] loading EAST text detector...
[INFO] text detection took 0.141675 seconds
Figure 6: Scene text detection with Python + OpenCV and the EAST text detector successfully detects the text on
this Spanish stop sign.
This scene contains a Spanish stop sign. The word, “ALTO” is correctly detected by OpenCV and EAST.
As you can tell, EAST is quite accurate and relatively fast taking approximately 0.14 seconds on average per image.
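As a quick back-of-the-envelope check, the three timings printed above can be turned into an approximate throughput figure. Keep in mind that the timer in the script only wraps the forward pass, so this ignores decoding, non-maxima suppression, and drawing.

```python
# forward-pass timings (in seconds) reported for the three sample images
timings = [0.142082, 0.142295, 0.141675]

mean_seconds = sum(timings) / len(timings)
approx_fps = 1.0 / mean_seconds

print("mean: {:.6f} s, approx. {:.2f} FPS".format(mean_seconds, approx_fps))
```

On my machine this works out to roughly 7 frames per second for the network alone; your numbers will depend on your hardware.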
Text detection in video with OpenCV

Now that we’ve seen how to detect text in images, let’s move on to detecting text in video with OpenCV.
This explanation will be very brief; please refer to the previous section for details as needed.
Open up text_detection_video.py and insert the following code:

1 # import the necessary packages
2 from imutils.video import VideoStream
3 from imutils.video import FPS
4 from imutils.object_detection import non_max_suppression
5 import numpy as np
6 import argparse
7 import imutils
8 import time
9 import cv2
We begin by importing our packages. We’ll be using VideoStream to access a webcam and FPS to benchmark our
frames per second for this script. Everything else is the same as in the previous section.
For convenience, let’s define a new function to decode our predictions. It will be reused for each frame and
make our loop cleaner:

11 def decode_predictions(scores, geometry):
12     # grab the number of rows and columns from the scores volume, then
13     # initialize our set of bounding box rectangles and corresponding
14     # confidence scores
15     (numRows, numCols) = scores.shape[2:4]
16     rects = []
17     confidences = []
18
19     # loop over the number of rows
20     for y in range(0, numRows):
21         # extract the scores (probabilities), followed by the
22         # geometrical data used to derive potential bounding box
23         # coordinates that surround text
24         scoresData = scores[0, 0, y]
25         xData0 = geometry[0, 0, y]
26         xData1 = geometry[0, 1, y]
27         xData2 = geometry[0, 2, y]
28         xData3 = geometry[0, 3, y]
29         anglesData = geometry[0, 4, y]
30
31         # loop over the number of columns
32         for x in range(0, numCols):
33             # if our score does not have sufficient probability,
34             # ignore it
35             if scoresData[x] < args["min_confidence"]:
36                 continue
37
38             # compute the offset factor as our resulting feature
39             # maps will be 4x smaller than the input image
40             (offsetX, offsetY) = (x * 4.0, y * 4.0)
41
42             # extract the rotation angle for the prediction and
43             # then compute the sin and cosine
44             angle = anglesData[x]
45             cos = np.cos(angle)
46             sin = np.sin(angle)
47
48             # use the geometry volume to derive the width and height
49             # of the bounding box
50             h = xData0[x] + xData2[x]
51             w = xData1[x] + xData3[x]
52
53             # compute both the starting and ending (x, y)-coordinates
54             # for the text prediction bounding box
55             endX = int(offsetX + (cos * xData1[x]) + (sin * xData2[x]))
56             endY = int(offsetY - (sin * xData1[x]) + (cos * xData2[x]))
57             startX = int(endX - w)
58             startY = int(endY - h)
59
60             # add the bounding box coordinates and probability score
61             # to our respective lists
62             rects.append((startX, startY, endX, endY))
63             confidences.append(scoresData[x])
64
65     # return a tuple of the bounding boxes and associated confidences
66     return (rects, confidences)
On Line 11 we define the decode_predictions function. This function is used to extract:
1. The bounding box coordinates of a text region
2. And the probability of a text region detection
This dedicated function will make the code easier to read and manage later on in this script.
Let’s parse our command line arguments:
68 # construct the argument parser and parse the arguments
69 ap = argparse.ArgumentParser()
70 ap.add_argument("-east", "--east", type=str, required=True,
71     help="path to input EAST text detector")
72 ap.add_argument("-v", "--video", type=str,
73     help="path to optional input video file")
74 ap.add_argument("-c", "--min-confidence", type=float, default=0.5,
75     help="minimum probability required to inspect a region")
76 ap.add_argument("-w", "--width", type=int, default=320,
77     help="resized image width (should be multiple of 32)")
78 ap.add_argument("-e", "--height", type=int, default=320,
79     help="resized image height (should be multiple of 32)")
80 args = vars(ap.parse_args())
Our command line arguments are parsed on Lines 69-80:
--east : The EAST scene text detector model file path.

--video : The path to our input video. Optional — if a video path is provided then the webcam will not be used.
--min-confidence : Probability threshold to determine text. Optional with default=0.5 .
--width : Resized image width (must be multiple of 32). Optional with default=320 .
--height : Resized image height (must be multiple of 32). Optional with default=320 .
The primary change from the image-only script in the previous section (in terms of command line arguments) is that I’ve
substituted the --image argument with --video .
Important: The EAST text detector requires that your input image dimensions be multiples of 32, so if you choose to
adjust your --width and --height values, ensure they are multiples of 32!
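One way to enforce this rule at the command line, rather than relying on the help text, is a custom argparse type. This is a sketch and not part of the downloadable script; multiple_of_32 and build_parser are hypothetical names, and only the two size arguments are shown.

```python
import argparse


def multiple_of_32(text):
    """argparse type that accepts only positive multiples of 32."""
    value = int(text)
    if value <= 0 or value % 32 != 0:
        raise argparse.ArgumentTypeError(
            "{!r} is not a positive multiple of 32".format(text))
    return value


def build_parser():
    ap = argparse.ArgumentParser()
    ap.add_argument("-w", "--width", type=multiple_of_32, default=320,
        help="resized image width (must be multiple of 32)")
    ap.add_argument("-e", "--height", type=multiple_of_32, default=320,
        help="resized image height (must be multiple of 32)")
    return ap


# parse an explicit argument list so the example runs without a terminal
args = vars(build_parser().parse_args(["--width", "640"]))
print(args)
```

With this in place, passing something like --width 300 makes argparse exit with a usage error before any frame is processed.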
Next, we’ll perform important initializations which mimic the previous script:

82 # initialize the original frame dimensions, new frame dimensions,
83 # and ratio between the dimensions
84 (W, H) = (None, None)
85 (newW, newH) = (args["width"], args["height"])
86 (rW, rH) = (None, None)
87
88 # define the two output layer names for the EAST detector model that
89 # we are interested -- the first is the output probabilities and the
90 # second can be used to derive the bounding box coordinates of text
91 layerNames = [
92     "feature_fusion/Conv_7/Sigmoid",
93     "feature_fusion/concat_3"]
94
95 # load the pre-trained EAST text detector
96 print("[INFO] loading EAST text detector...")
97 net = cv2.dnn.readNet(args["east"])
The height/width and ratio initializations on Lines 84-86 will allow us to properly scale our bounding boxes later on.
Our output layer names are defined and we load our pre-trained EAST text detector on Lines 91-97.
The following block sets up our video stream and frames per second counter:
99 # if a video path was not supplied, grab the reference to the web cam
100 if not args.get("video", False):
101     print("[INFO] starting video stream...")
102     vs = VideoStream(src=0).start()
103     time.sleep(1.0)
104
105 # otherwise, grab a reference to the video file
106 else:
107     vs = cv2.VideoCapture(args["video"])
108
109 # start the FPS throughput estimator
110 fps = FPS().start()
Our video stream is set up for either:
A webcam (Lines 100-103)

Or a video file (Lines 106-107)
From there we initialize our frames per second counter on Line 110 and begin looping over incoming frames:

112 # loop over frames from the video stream
113 while True:
114     # grab the current frame, then handle if we are using a
115     # VideoStream or VideoCapture object
116     frame = vs.read()
117     frame = frame[1] if args.get("video", False) else frame
118
119     # check to see if we have reached the end of the stream
120     if frame is None:
121         break
122
123     # resize the frame, maintaining the aspect ratio
124     frame = imutils.resize(frame, width=1000)
125     orig = frame.copy()
126
127     # if our frame dimensions are None, we still need to compute the
128     # ratio of old frame dimensions to new frame dimensions
129     if W is None or H is None:
130         (H, W) = frame.shape[:2]
131         rW = W / float(newW)
132         rH = H / float(newH)
133
134     # resize the frame, this time ignoring aspect ratio
135     frame = cv2.resize(frame, (newW, newH))
We begin looping over video/webcam frames on Line 113.
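Lines 116 and 117 exist because the two frame sources return different things from read(): the imutils VideoStream hands back the frame directly, while cv2.VideoCapture returns a (grabbed, frame) tuple. The sketch below shows the same normalization with two stand-in classes. FakeWebcam, FakeVideoFile, and next_frame are hypothetical names used only so the example runs without a camera.

```python
def next_frame(stream, is_video_file):
    """Return the next frame, or None once a video file is exhausted."""
    result = stream.read()
    return result[1] if is_video_file else result


class FakeWebcam:
    # mimics VideoStream.read(), which returns just the frame
    def read(self):
        return "webcam-frame"


class FakeVideoFile:
    # mimics cv2.VideoCapture.read(), which returns (grabbed, frame)
    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return (True, self._frames.pop(0))
        return (False, None)


video = FakeVideoFile(["frame-0"])
print(next_frame(FakeWebcam(), False))
print(next_frame(video, True))
print(next_frame(video, True))
```

The final call returns None, which is exactly the condition Line 120 checks to detect the end of a video file.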
Our frame is resized, maintaining aspect ratio (Line 124). From there, we grab dimensions and compute the scaling ratios
(Lines 129-132). We then resize the frame again (must be a multiple of 32), this time ignoring aspect ratio since we have
stored the ratios for safe keeping (Line 135).
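Here is the two-stage resize worked through with numbers. The 1280x720 camera resolution is an assumption, and aspect_preserving_size is a hypothetical helper that mirrors how I understand imutils.resize to compute the new height (scale by the width ratio, then truncate).

```python
def aspect_preserving_size(w, h, target_width):
    """New (width, height) when resizing to target_width, keeping aspect ratio."""
    r = target_width / float(w)
    return (target_width, int(h * r))


# stage 1: hypothetical 1280x720 webcam frame resized to a width of 1000
(W, H) = aspect_preserving_size(1280, 720, 1000)

# stage 2: ratios between the displayed frame and the 320x320 network input
(newW, newH) = (320, 320)
rW = W / float(newW)
rH = H / float(newH)

print((W, H), rW, rH)
```

Any box the network finds in the 320x320 frame is multiplied by these two ratios to land on the 1000-pixel-wide frame that is actually displayed.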
Inference and drawing text region bounding boxes take place on the following lines:
137     # construct a blob from the frame and then perform a forward pass
138     # of the model to obtain the two output layer sets
139     blob = cv2.dnn.blobFromImage(frame, 1.0, (newW, newH),
140         (123.68, 116.78, 103.94), swapRB=True, crop=False)
141     net.setInput(blob)
142     (scores, geometry) = net.forward(layerNames)
143
144     # decode the predictions, then apply non-maxima suppression to
145     # suppress weak, overlapping bounding boxes
146     (rects, confidences) = decode_predictions(scores, geometry)
147     boxes = non_max_suppression(np.array(rects), probs=confidences)
148
149     # loop over the bounding boxes
150     for (startX, startY, endX, endY) in boxes:
151         # scale the bounding box coordinates based on the respective
152         # ratios
153         startX = int(startX * rW)
154         startY = int(startY * rH)
155         endX = int(endX * rW)
156         endY = int(endY * rH)
157
158         # draw the bounding box on the frame
159         cv2.rectangle(orig, (startX, startY), (endX, endY), (0, 255, 0), 2)
In this block we:
Detect text regions using EAST by creating a blob and passing it through the network (Lines 139-142)
Decode the predictions and apply NMS (Lines 146 and 147). We use the decode_predictions function defined previously in this script and my imutils non_max_suppression convenience function.
Loop over bounding boxes and draw them on the frame (Lines 150-159). This involves scaling the boxes by the ratios gathered earlier.
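The scaling step can be sketched on its own, without OpenCV. The scale_box helper below is hypothetical, and the clamping to the frame bounds is an extra safeguard that the script above does not perform; it is included only because scaled boxes can land slightly outside the frame.

```python
def scale_box(box, rW, rH, frame_w, frame_h):
    """Map a box from network-input coordinates to frame coordinates.

    The clamping to [0, frame_w] x [0, frame_h] is an assumption added for
    this sketch; the tutorial script only multiplies by the ratios.
    """
    startX, startY, endX, endY = box
    startX = min(max(int(startX * rW), 0), frame_w)
    startY = min(max(int(startY * rH), 0), frame_h)
    endX = min(max(int(endX * rW), 0), frame_w)
    endY = min(max(int(endY * rH), 0), frame_h)
    return (startX, startY, endX, endY)


# hypothetical values: 320x320 network input, 1000x563 display frame
rW, rH = 1000 / 320.0, 563 / 320.0
scaled = scale_box((10, 20, 110, 60), rW, rH, 1000, 563)
print(scaled)
```

The returned tuple can be passed to cv2.rectangle exactly as on Line 159, or used for array slicing when cropping a region of interest.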
From there we’ll close out the frame processing loop as well as the script itself:

161     # update the FPS counter
162     fps.update()
163
164     # show the output frame
165     cv2.imshow("Text Detection", orig)
166     key = cv2.waitKey(1) & 0xFF
167
168     # if the `q` key was pressed, break from the loop
169     if key == ord("q"):
170         break
171
172 # stop the timer and display FPS information
173 fps.stop()
174 print("[INFO] elasped time: {:.2f}".format(fps.elapsed()))
175 print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))
176
177 # if we are using a webcam, release the pointer
178 if not args.get("video", False):
179     vs.stop()
180
181 # otherwise, release the file pointer
182 else:
183     vs.release()
184
185 # close all windows
186 cv2.destroyAllWindows()
We update our fps counter each iteration of the loop (Line 162) so that timings can be calculated and displayed (Lines
173-175) when we break out of the loop.
We show the output of EAST text detection on Line 165 and handle keypresses (Lines 166-170). If “q” is pressed for
“quit”, we break out of the loop and proceed to clean up and release pointers.
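For readers curious about what the throughput estimator is doing, the following is a rough stand-in written from scratch. SimpleFPS is a hypothetical class that mirrors the start/update/stop/elapsed/fps calls used above; it is not the imutils implementation, and the sleep call merely simulates per-frame work.

```python
import time


class SimpleFPS:
    """Hypothetical stand-in for a frames-per-second estimator."""

    def __init__(self):
        self._start = None
        self._end = None
        self._frames = 0

    def start(self):
        # record the start time and return self so calls can be chained
        self._start = time.perf_counter()
        return self

    def update(self):
        # count one processed frame
        self._frames += 1

    def stop(self):
        # record the end time
        self._end = time.perf_counter()

    def elapsed(self):
        # seconds between start() and stop()
        return self._end - self._start

    def fps(self):
        # average frames per second over the whole run
        seconds = self.elapsed()
        return self._frames / seconds if seconds > 0 else 0.0


fps = SimpleFPS().start()
for _ in range(5):
    time.sleep(0.01)  # simulated work for one frame
    fps.update()
fps.stop()
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))
```

The estimate is an average over the whole run, so slow start-up frames pull the reported number down.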
Video text detection results

To apply text detection to video with OpenCV, be sure to use the “Downloads” section of this blog post.
From there, open up a terminal and execute the following command (which will fire up your webcam since we aren’t
supplying a --video via command line argument):
$ python text_detection_video.py --east frozen_east_text_detection.pb
[INFO] loading EAST text detector...
[INFO] starting video stream...
[INFO] elasped time: 59.76
[INFO] approx. FPS: 8.85
OpenCV Text Detection (EAST text detector) Demo
Our OpenCV text detection video script achieves 7-9 FPS.
This result is not quite as fast as the authors reported (13 FPS); however, we are using Python instead of C++. By
optimizing our for loops with Cython, we should be able to increase the speed of our text detection pipeline.
Summary
In today’s blog post, we learned how to use OpenCV’s new EAST text detector to automatically detect the presence of
text in natural scene images.
The text detector is not only accurate, but it’s capable of running in near real-time at approximately 13 FPS on 720p
images.
In order to provide an implementation of OpenCV’s EAST text detector, I needed to convert OpenCV’s C++ example;
however, there were a number of challenges I encountered, such as:

1. Not being able to use OpenCV’s NMSBoxes for non-maxima suppression and instead having to use my
implementation from imutils.
2. Not being able to compute a true rotated bounding box due to the lack of Python bindings for RotatedRect.

I tried to keep my implementation as close to OpenCV’s as possible, but keep in mind that my version is not 100%
identical to the C++ version and that there may be one or two small problems that will need to be resolved over time.
In any case, I hope you enjoyed today’s tutorial on text detection with OpenCV!
To download the source code to this tutorial, and start applying text detection to your own images, just enter
your email address in the form below.
Downloads:
If you would like to download the code and images used in this post, please enter your email address in the form below.
Not only will you get a .zip of the code, I’ll also send you a FREE 17-page Resource Guide on Computer Vision,
OpenCV, and Deep Learning. Inside you'll find my hand-picked tutorials, books, courses, and libraries to help you
master CV and DL! Sound good? If so, enter your email address and I’ll send you the code immediately!
242 Responses to OpenCV Text Detection (EAST text detector)

REPLY 
Adam August 20, 2018 at 11:32 am #
Oh man, great article Adrian. Thanks for sharing with the rest of the world.
I just have a toy project for text detection. The only caveat that my text might be in English or Arabic, so I will see if this can
somehow help me out!
Thanks!
REPLY 
Adrian Rosebrock August 20, 2018 at 2:38 pm #
I haven’t personally tried with non-English words but a PyImageSearch reader on LinkedIn posted an example
of correctly detecting Tamil text. It may work for your project as well, be sure to give it a try!
REPLY 
Pavlin B August 25, 2018 at 2:30 pm #
I tried on mixed Bulgarian/English – it works perfect.
One question – how to extract boundered text?
Regards.
REPLY 
Adrian Rosebrock August 30, 2018 at 9:34 am #
You can use array slicing:
roi = image[startY:endY, startX:endX]
REPLY 
Bartosz December 11, 2018 at 9:34 am #
How you approach this?
REPLY 
renka June 7, 2019 at 6:27 am #
hi iam also intersted doing research in text detection can yo please send that code to me that works for
both lnguages
REPLY 
Jacky December 8, 2018 at 5:08 am #
Thanks for the great article. It is pretty accurate and fast.
I wonder how may:

1. I tune the implementation for Chinese characters which is square shape
2. I train it with Chinese chars
3. Find the post for Tamil text

REPLY 
Bartosz December 11, 2018 at 8:38 am #
What exactly do you mean can you share the url of the article with that other language detection?
REPLY 
Miguel August 20, 2018 at 11:33 am #
Very nice tutorial. I just didn’t get how, after getting the bounding boxes, how to actually get the detected text
REPLY 
Be sure to see my reply to FUXIN YU 🙂
REPLY 
Patrick August 20, 2018 at 11:48 am #
Correction: the EAST paper is from 2017, not 2007. I was really surprised to see a 2007 paper to have a RPN-like
CNN structure. 😉
REPLY 
That was indeed a typo on my part, thank you for pointing it out! I’ve corrected the post.
REPLY 
FUXIN YU August 20, 2018 at 12:35 pm #
Hi Author,
Thanks for your posting, this is really good material to learn ML and CV.
one question, how to get the text content which has been recognized in the box?
Thanks,
Fred
REPLY 
Once you have the ROI of the text area you could pass it into an algorithm that is dedicated to performing
Optical Character Recognition (OCR). I’ll be posting a separate guide that demonstrates how to combine the text
detection with the text recognition phase, but for the time being you should refer to this guide on Tesseract OCR.
REPLY 
Markus Dieterle August 21, 2018 at 8:30 am #
Hello Adrian,
Great post! As far as the text extraction goes I think we should take into consideration what was already written in
the “Natural Scene Text Understanding” paper.
Basically, even if the text areas are properly located, you should do some image processing taking into account
variations in lighting, color, hue, saturation, light reflection etc. Once the extracted pieces of the image have been
cleaned up, OCR should work more reliably.
Though I’m not sure if an additional, well trained neural network would not even be better – that would offer more
options for retraining for different character sets and languages…
REPLY 
Gaurav A August 22, 2018 at 4:19 am #
Hi Adrian,
First of all thanks a lot for posting this brilliant article. It helps a lot.
Also when can we expect to get article on how to combine text detection with text recognition.
Need it a bit urgently 🙁
REPLY 
I’m honestly not sure, Gaurav. I have some other posts I’m working on and then I’ll be swinging back to
text recognition. Likely not for another few weeks/months.
Gaurav A August 24, 2018 at 1:29 am #
Thanks for the update Adrian. But can you guide me some path may be some links/post to refer on
how to do text recognition after text detection. It would be really helpful.
REPLY 
Joan August 23, 2018 at 6:12 am #
Hi Adrian,
If you will be making a guide for OCR this dataset may interest you:
http://artelab.dista.uninsubria.it/downloads/datasets/automatic_meter_reading/gas_meter_reading/gas_meter_reading.html
It contains images of gas counters with all the annotations (coordinates of boxes and digits). I trained a model with
that dataset and it performed really well even with different fonts. If you happen to know a similar dataset please tell
me, thanks and great post!
REPLY 
Wow, this is a really, really cool dataset — thank you for sharing, Joan! What type of model did you
train on the data? I see they have annotations for both segmentation of the meter followed by the detection of
the digits.
Joan September 2, 2018 at 11:14 am #

Used a HOG to extract the features and passed it to a SVM
Adrian Rosebrock September 5, 2018 at 9:04 am #
Awesome! I’ll look into this further.
REPLY 
enes polat August 20, 2018 at 3:57 pm #
hi thanks for your tutorial
I am using Anaconda3
how can I import imutils to my Anaconda3
REPLY 
Adrian Rosebrock August 21, 2018 at 6:46 am #
I’m not an Anaconda user but you should be able to pip install it once you’ve created an environment:
$ pip install imutils
Additionally, this thread on GitHub documents users who had trouble installing imutils for one reason or another. Be
sure to give it a read.
REPLY 
wh August 20, 2018 at 4:30 pm #
13fps on what hardware RPi? Tegra?
REPLY 
The authors reported 13 FPS on a standard laptop/desktop. The benchmark was not on the Pi.
REPLY 
farshad August 20, 2018 at 11:48 pm #
Great work again Adrian. thanks a lot. I recently noticed that Opencv in version 3.4.2 support one of the best and
most accurate tensorflow models: Faster rcnn inception v2 in object detection. In some recent posts of your blog you used
caffe model in opencv. Could you please make a post on implementation of faster rcnn inception v2 on Opencv?
REPLY 
Thank you for the suggestion Farshad, I will try to do a post on Faster R-CNNs.
REPLY 
kaisar khatak September 24, 2018 at 1:46 pm #
Cool post. Does this method also work on vertical text???
REPLY 
Joppu August 21, 2018 at 12:51 am #
Nice! Couldn’t have read this at a better time. Thanks alot! Also nice guitar man \m/
I’ve been recently searching for a good scene text detection/recognition implementation for a little project of mine. Thinking
of somehow using TextBoxes++ (https://arxiv.org/abs/1801.02765) but now can try out EAST.
REPLY 
Adrian Rosebrock August 21, 2018 at 6:46 am #
Awesome! Definitely try EAST and let me know how it goes, Joppu!
REPLY 
Ronrick August 21, 2018 at 3:00 am #
Hi Adrain. As I tried to run the codes, I got the error:

AttributeError: module ‘cv2.dnn’ has no attribute ‘readNet’
Checking for the solution online, the function is not available in python using this reference.
https://github.com/opencv/opencv/issues/11226
P.S. I have installed the latest version of opencv 3.4.2.17.

Thoughts on this one?
REPLY 
Hey Ronrick, I’m not sure why that may be happening. Try building OpenCV 4 and see if that resolves the issue.
Here is an OpenCV 4 + Ubuntu install tutorial and here is an OpenCV 4 + macOS install tutorial. I hope that resolves the
issue for you!
REPLY 
Gaxo June 27, 2019 at 8:36 am #
Remove that slash while running the file from cmd i.e.
$ python text_detection.py --image images/car_wash.png --east frozen_east_text_detection.pb
REPLY 
Deepayan August 21, 2018 at 4:05 am #
Great post-Adrian. I myself was trying to tweak f-RCNN for text detection on Sanskrit document images, but the
results were far from satisfactory. I’ll try this out. Thanks a lot 🙂
REPLY 
I hope it helps with your text detection project, Deepayan! Let me know how it goes.
REPLY 
Deni August 21, 2018 at 5:22 am #
another great & update article :), but the resulting bounding box doesn’t rotate when the text is rotated? or I miss
something?
REPLY 
Yes. To quote the post:
“To start, there are no Point2f and RotatedRect functions in Python, and because of this, I could not 100% mimic the
C++ implementation. The C++ implementation can produce rotated bounding boxes, but unfortunately the one I am
sharing with you today cannot.”
And secondly:
“I’ve included how you can extract the angle data on Lines 91-93; however, as I mentioned in the previous section, I
wasn’t able to construct a rotated bounding box from it as is performed in the C++ implementation — if you feel like
tackling the task, starting with the angle on Line 91 would be your first step.”
The conclusion also mentions this behavior. Please feel free to work with the code, I’d love to have a rotated
bounding box version as well!
REPLY 
Big Adam August 21, 2018 at 5:44 am #
Hi, Adrian
Thanks for sharing, the script works well. Could you please explain more about the lines in function decode_predictions
especially the computation of bounding box?
REPLY 
Danny August 21, 2018 at 10:45 am #
Hi Adrian,
Thank you for this sharing. In addition, could you please let me know whether we can use this EAST text detector to
recognize other languages like Spanish, Korea, Mandarin and so on?
REPLY 
Hey Danny, you should see my reply to Adam, the very first commenter on the post. I haven’t tried with non-
English words but a PyImageSearch reader was able to detect Tamil text so I imagine it will work for other texts as well.
You should download some images with Spanish, Korean, Mandarin, etc. and give it a try!
REPLY 
Tom August 21, 2018 at 3:01 pm #
Hi Adrian
I noticed that the bounding box on rotated text wasn’t quite enclosing all of the text. I’ve calculated a more accurate
bounding box by replacing lines 102-109 in text_detection.py with the following
# A more accurate bounding box for rotated text
offsetX = offsetX + cos * xData1[x] + sin * xData2[x]
offsetY = offsetY - sin * xData1[x] + cos * xData2[x]

# calculate the UL and LR corners of the bounding rectangle
p1x = -cos * w + offsetX
p1y = -cos * h + offsetY
p3x = -sin * h + offsetX
p3y = sin * w + offsetY

# add the bounding box coordinates
rects.append((p1x, p1y, p3x, p3y))
tom
Adrian Rosebrock August 22, 2018 at 9:26 am #
Thank you for sharing, Tom! I’m going to test this out as well and if it works, likely update the blog post 🙂
REPLY 
Tobi October 18, 2018 at 3:14 am #
Hi Tom,
thanks for sharing your code. I compared it to Adrians version and need to state that your coordinates in fact are a bit
more precise (at least for my use case –> text detection from scanned pdf).
Therefore, thanks a ton.
Best regards,
Tobi
REPLY 
Hakan Gultekin August 22, 2018 at 3:37 am #
Hi Adrian,
Great work. I got this to work.
But I have one issue. Your prediction (inference) time is 0.141675 seconds. When I run it, I get 0.413854 seconds.
I am using a Pascal GPU (p2.xlarge) on AWS cloud. Do need to configure something else for faster predictions. What are
you using for running your code ?
Thanks again.
Hakan
REPLY 
I was using my iMac to run the code. You should not need any other additional optimizations provided you
followed one of my OpenCV install tutorials to install OpenCV.

REPLY 
Hakan Gultekin August 22, 2018 at 7:27 pm #
Ok great Adrian thanks !
REPLY 
Ben August 22, 2018 at 7:27 am #
where do I find the ‘frozen_east_text_detection.pb’ model ?
REPLY 
You can find the pre-trained model in the “Downloads” section of the blog post. Use the “Downloads” section to
download the code along with the text detection model.
REPLY 
Antonio August 22, 2018 at 9:02 am #
Great article Adrian, incredible! Thanks a lot for your valuable tutorials! I am really looking forward to read the
article about the text extraction from ROIs.
REPLY 
Thanks Antonio! I’m so happy you enjoyed the guide. I’m looking forward to writing the text recognition tutorial
but it will likely be a few more weeks.
REPLY 
mohamed August 23, 2018 at 5:59 am #
Hi Adrian
Wonderful progress as usual
But I have a question please
I want to build the model frozen_east_text_detection.pb myself. Are there some guidelines?
thank you for your effort
REPLY 
For training instructions, you’ll want to refer to the official EAST model repo that was published by the authors
of the paper.
REPLY 
mohamed August 23, 2018 at 6:29 am #
I do not know what to say

Thank you very much

Best of luck training your own model, Mohamed!
mohamed August 24, 2018 at 7:26 am #
Thanks Adrian
Good luck to you always
REPLY 
Darshil K August 23, 2018 at 8:09 am #
Hi,
Thank you for the post and the codes!
I am using windows 7, python 3.6

I have openCV 3.2.0 installed in my machine. But I am not able to install openCV 3.4.2 or above. Is there any way to install
it on my machine or do I have to install in virtual machine?
Thanks!
REPLY 
To be honest, I’m not a Windows user and I do not support Windows here on the PyImageSearch blog. I have
OpenCV install tutorials for macOS, Ubuntu, and Raspbian, so if you can use one of those, please do. Otherwise, if
you’re a Windows user, you’ll want to refer to the OpenCV documentation.
REPLY 
Deiner Zapata September 20, 2018 at 1:33 pm #
Hi, I am using windows7 too, and execute this code without trouble. Adrian Rosebrock, thanks by your code,
this tutorial is awesome.
More details:
– Python 3.6.5
– Opencv 3.4.2
– Windows 10
REPLY 
Adrian Rosebrock October 8, 2018 at 1:14 pm #
Thanks Deiner 🙂
REPLY 
Trami August 24, 2018 at 4:41 am #
Hi, Adrian, thank you for your effort. when i run the project, i meet the problem ‘Unknown layer type Shape in op
feature_fusion/Shape in function populateNet ‘. and in my computer ‘net = cv2.dnn.readNet(args[‘east’])’ should be replaced
by the ‘net = cv2.dnn.readNetFromTensorflow(args[‘east’])’, i have installed the Opencv3.4.2, could tell me how to solve the
problems? Thank you so much!!!
REPLY 
Adrian Rosebrock August 24, 2018 at 8:30 am #
Hey Trami, have you tried using the cv2.dnn.readNetFromTensorflow function? Did that resolve the issue?
lochana September 24, 2018 at 5:40 am #
yes using cv2.dnn.readNetFromTensorflow still working. if your using same camera for two python files
which calls as sub process , the opencv versions above 3.2 the camera release function doesn’t work after i mailed
to opencv they told me to install opencv 4.0-alpha but couldn’t find a way to install in my anaconda environment
after searching opencv 3.4.0.14 contains readNEtFromTensorflow and camera release function working
pip install opencv-python==3.4.0.14
thank you
REPLY 
You should follow one of my OpenCV install guides to install OpenCV 4.
REPLY 
lxc August 26, 2018 at 9:21 pm #
Hi Adrian,
Why, scores and geometry’s shape are [1 180 80] [1 5 80 80]
REPLY 
lxc August 26, 2018 at 9:25 pm #
oo,321/4=80
REPLY 
Tom August 27, 2018 at 10:52 pm #
Hi Adrian
I made a few mods to the code and created a few different NMS implementations that will accept rectangles, rotated
rectangles or polygons as input.
The net of the changes:
1. Decode the EAST results

2. Rotate the rectangles
3. Run the rotated rectangles through NMS (Felzenszwalb, Malisiewicz or FAST)
4. Draw the NMS-selected rectangles on the original image
The code repo is here: https://bitbucket.org/tomhoag/opencv-text-detection/
I pushed the README to medium here: https://medium.com/@tomhoag/opencv-text-detection-548950e3494c
tom
REPLY 
Adrian Rosebrock August 28, 2018 at 3:15 pm #
This is awesome, thank you so much for sharing Tom!
Tom August 29, 2018 at 10:53 pm #
My pleasure, thank you for the great post.
I split out the nms specific stuff into a PyPI package: nms
https://pypi.org/project/nms/
nms.readthedocs.io
REPLY 
Sébastien August 28, 2018 at 9:15 am #
Great article as always Adrian!

I was wondering something : in your Youtube video, the words “Jaya”, “the” and “Cat” are detected separately by the
algorithm. Would it be possible to modify it so that the whole textline “Jaya the Cat” is detected in a single textbox?
REPLY 
Technically yes. For this algorithm you would compute the bounding box for all detected bounding box
coordinates. From there you could extract the region as a single text box.
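As a rough illustration of that idea, the enclosing box of several word boxes is just the minimum of the start coordinates and the maximum of the end coordinates. The enclosing_box helper and the sample coordinates below are hypothetical, and the sketch assumes the boxes have already been grouped into one text line.

```python
def enclosing_box(boxes):
    """Return the smallest axis-aligned box containing every input box.

    Each box is a (startX, startY, endX, endY) tuple. Deciding which word
    boxes belong to the same line is a separate step not covered here.
    """
    if not boxes:
        raise ValueError("at least one box is required")
    startX = min(b[0] for b in boxes)
    startY = min(b[1] for b in boxes)
    endX = max(b[2] for b in boxes)
    endY = max(b[3] for b in boxes)
    return (startX, startY, endX, endY)


# hypothetical word boxes for three words on one line
words = [(10, 20, 50, 40), (60, 18, 120, 42), (130, 22, 170, 39)]
line_box = enclosing_box(words)
print(line_box)
```

The resulting coordinates can then be used to slice the whole line out of the image as a single region.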
REPLY 
Sébastien August 29, 2018 at 2:54 am #
I’m not sure I understand correctly.

In the Figure 1 of this article, the left image shows two lines: “First Eastern National” and “Bus Times”. How could your
method detect that there are indeed _two_ lines with 3 words in the upper one and 2 in the other?
REPLY 
Figure 1 shows examples of images that would be very challenging for text detectors to detect. You could
determine two lines based on the bounding boxes supplied by the text detector — one for the first line and a second
bounding box for the second line.
REPLY 
Sébastien September 4, 2018 at 2:13 am #
Thanks!
REPLY 
pankaj sharma August 30, 2018 at 9:07 am #
Hi adrain,
i have in run a code
please help me to slove this problem.
…
net = cv2.dnn.readNet(args[“east”])
AttributeError: module ‘cv2.dnn’ has no attribute ‘readNet’
REPLY 
Double-check your OpenCV version. You will need at least OpenCV 3.4.1 to run this script (it sounds like you have an
older version).
REPLY 
pankaj sharma August 30, 2018 at 9:17 am #
i have 3.4.1 opencv version.

please give me some another suggestion
REPLY 
Did you install OpenCV with the contrib module enabled? Make sure you are following one of my
OpenCV install tutorials.
REPLY 
Joan September 2, 2018 at 11:51 am #
From what I have tried you need at least opencv 3.4.2
REPLY 
Hassan January 30, 2019 at 6:28 am #
I have the same issue
REPLY 
Adrian Rosebrock February 1, 2019 at 7:11 am #
What version of OpenCV are you using?

REPLY 
Raj September 8, 2018 at 3:20 pm #
I was getting the same error with opencv-python 3.4.0.12 on windows. The issue was resolved after upgrading
opencv-python to 3.4.2.17.
REPLY 
Sanda September 2, 2018 at 7:18 pm #
Hi,
I also want to recognize detected text from the video. To do that I hope to crop the image with maximum ROI which we
identified as words. Then I pass this to tesseract OCR to recognize words. Can I know this method is ok to do words
recognition?
Thank You
REPLY 
Adrian Rosebrock September 5, 2018 at 9:02 am #
I’ll be covering exactly how to do this process in a future blog post but in the meantime I always recommend
experimenting. Your approach is a good one, I recommend you try it and see what types of results you get.
REPLY 
sidis September 3, 2018 at 2:03 am #
Hi Adrian
is this possible to recognise the test
REPLY 
Adrian Rosebrock September 5, 2018 at 8:59 am #
Once you’ve detected text in an image you can apply OCR. I’ll be covering the exact process in a future
tutorial, stay tuned!
REPLY 
Matheus Cunha September 4, 2018 at 1:57 pm #
Is there any way to use the video text detecion using the Raspberry Camera V2?
REPLY 
Yes. Replace Line 102 with vs = VideoStream(usePiCamera=True).start()
REPLY 
Dany September 5, 2018 at 5:27 am #
Very intresting! It’s possible to convert in real text with OpenCV or I need to use OCR?
Thanks.

REPLY 
Adrian Rosebrock September 5, 2018 at 8:28 am #
Once you’ve detected the text you will need to OCR it. I’ll be demonstrating how to perform such OCR in a
future tutorial 🙂
REPLY 
Dany September 5, 2018 at 11:31 am #
It’s possibile OCR with OpenCV or you work with others like tesseract?
REPLY 
OpenCV itself does not include any OCR functionality, it’s normally handed off to a dedicated OCR
library like Tesseract or the Google Vision API.
vinayak October 4, 2018 at 2:46 am #
I would suggest adding a CRNN model on top of the east detector.
Adrian Rosebrock October 8, 2018 at 10:16 am #
For anyone who is following along with this post, here is the link to the text detection + OCR post I
was referring to.
REPLY 
Prince Bhatia September 10, 2018 at 7:46 am #
How to print probability that this image has this much 99 percent probability it has text? or image has 0 percent
probability that it does not has text?
REPLY 
Are you referring to a specific region of the image having text? Or the image as a whole?
REPLY 
mohamed September 10, 2018 at 9:19 am #
Hi Adrian
I apologize for my inaccurate questions
But I would like to know why the attached form in the downloads is less accurate than the model in the warehouse
recommended by the team. in this place:
(https://github.com/argman/EAST)
Have you modified something to comply with opencv?
REPLY 
Adrian Rosebrock September 11, 2018 at 8:11 am #
The method I’ve used here is a port of the EAST model. As I’ve mentioned in the blog post the code itself
cannot compute the rotated bounding boxes.
REPLY 
Gaurav A September 11, 2018 at 5:37 am #
Hi Adrian,
Can you please help me out in understanding how i can break the bounding box to alphabets instead of full words ?
For eg if i have a number 56 0 08
I am able to do it using findcontours… but its not giving accuracy when the digits are very close. Two digits are being
considered as one.
So the results i get is 56 0 and 08.. But it should be 5 6 0 0 8.
Can you please suggest some way to tackle this
REPLY 
If your image is “clean” enough you can perform simple image processing via thresholding/edge detection and
contours to extract the digit. For more complex scenes you may need some sort of semantic segmentation. Stay tuned
for next week’s blog post where I’ll be discussing how you can actually OCR the text detected by EAST.
REPLY 
Hochan September 11, 2018 at 9:35 pm #
If you are interested in making your own model and import it to opencv, check this link.
https://github.com/opencv/opencv/issues/12491
REPLY 
Sushil September 12, 2018 at 7:58 am #
Hello adrian, Your work is really amazing!! I’m getting some issues with final bounding boxes after
nonMaxSupression. I’m getting almost all characters before supression, but in final result some characters are not
considered in the bounding boxes because of supression algorith. So, I thought about taking only outer boxes(implementing
own algorithm) But ‘rects’ have so many x-y co-ordinates i’m unable to get which co-ordinates are of one box and which are
of the other boxes. Do you have any suggestion or solution for this?
REPLY 
Adrian Rosebrock September 12, 2018 at 1:51 pm #
The “rects” list is just your set of bounding box coordinates so I’m not sure what you mean by being unable to
get coordinates belong to which box. Each entry in “rects” is a unique bounding box.
REPLY 
Tejas Mahajan September 18, 2018 at 2:35 am #
Hi,
The weights file you have used in this blog to show the inference was obtained by training on which dataset?
REPLY 
Adrian Rosebrock September 18, 2018 at 7:14 am #
Be sure to refer to the re-implementation of the EAST model for more information on the dataset and training
procedure.
REPLY 
Rohan September 18, 2018 at 7:09 pm #
Hey Adrian,
Do you know if this same EAST algorithm will be able to locate the bounding boxes of handwritten text?
Thanks
REPLY 
Arindam September 25, 2018 at 11:59 am #
Hey Adrian,
The article was really helpful. I was wondering if you could guide me with segregating handwritten text and machine printed
text in a picture of a document.
REPLY 
Jonathan Salama September 28, 2018 at 2:39 pm #
Hello, I was wondering if there is a version that would output the actual text observed. Thanks!
REPLY 
Yes. See this tutorial on OpenCV OCR.
REPLY 
Suresh Doraiswamy September 30, 2018 at 8:57 am #
When you run Adrian’s text_detection.py using Python 3.6 and OpenCV 3.4.3 on Windows 10, if the line <> shows an error saying cv2.dnn does not have readNet as a valid function, then you can do the following to eliminate the error:
Open the Windows command line and enter pip install opencv-contrib-python
I tried this and it works.
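Before reinstalling, it can help to confirm which OpenCV build Python is actually importing. The sketch below is a hedged helper, not part of the tutorial code: it compares a version string against the 3.4.2 minimum stated at the top of the post and, when OpenCV is importable, reports whether cv2.dnn.readNet is present. The function names parse_version and supports_east are invented for this example.

```python
def parse_version(version):
    """Turn a version string such as '3.4.3' or '4.1.0-dev' into (3, 4, 3) / (4, 1, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def supports_east(version, minimum=(3, 4, 2)):
    """True if `version` is at least the minimum the post asks for."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("OpenCV is not importable in this environment")
    else:
        print("OpenCV version:", cv2.__version__)
        print("meets the 3.4.2 requirement:", supports_east(cv2.__version__))
        has_read_net = hasattr(cv2, "dnn") and hasattr(cv2.dnn, "readNet")
        print("cv2.dnn.readNet available:", has_read_net)
```

If the reported version is new enough but readNet is still missing, a second, older OpenCV install shadowing the new one is a likely suspect.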
REPLY
Adrian Rosebrock October 8, 2018 at 10:55 am #
It sounds like you are using an older version of OpenCV. Can you confirm which version of OpenCV you are using?
REPLY
Tu October 1, 2018 at 11:07 pm #
Hi Adrian,
It’s a great blog post.
Currently, I’m working on a project related to detecting objects in technical drawing images (e.g. CAD scan images). So I need to detect lines, numbers, and text in the image.
I tested with your code in this blog, but the accuracy does not seem good.
If you have any idea how to improve it, please share it with me!
image example here: https://imgur.com/a/PN5J6CJ
Thanks
REPLY 
Chris October 2, 2018 at 11:53 pm #
Adrian, how did you freeze the model, ( convert .ckpt to .pb )?
REPLY 
Are you asking how to convert a TensorFlow model to OpenCV format? If you can clarify I can point you in the
right direction.
REPLY 
Chris October 27, 2018 at 12:15 pm #
When training EAST, the created model is in .ckpt, how to convert that .ckpt model to .pb so that I am able
to use in your opencv version of EAST?
REPLY 
Refer to the official OpenCV documentation; they include scripts to convert the model to make it compatible with OpenCV directly.
REPLY 
Dekker October 2, 2018 at 11:59 pm #
Great article, Adrian.
How to implement the EAST model?
REPLY
Adrian Rosebrock October 8, 2018 at 10:29 am #
The model has already been implemented and trained in this post. Do you mean how to train the EAST model from scratch?
REPLY 
vinayak October 4, 2018 at 2:50 am #
I found the CRNN model a great addition on top of EAST detectors to make full OCR. I had trained it on custom data and it works well.
original paper: http://arxiv.org/abs/1507.05717.
https://github.com/vinayakkailas/Deeplearning-OCR
REPLY 
Thanks for sharing, Vinayak!
REPLY 
Ritika October 10, 2018 at 7:53 am #
Hi!
Thanks for sharing the frozen model for east text detector.
I am currently working on a project where I need to use the tensorflow Lite model for mobile application. To convert the
frozen model to tf lite I need to know the names of the input and output tensors. Could you please provide me with the
same?
Thanks
REPLY 
Hey Ritika — I would suggest reaching out to the authors of the EAST paper model (linked to in this blog post).
They will be able to provide more suggestions into the model and layer naming conventions.
REPLY 
Bragg Xu October 17, 2018 at 1:28 am #
Thanks for sharing. I‘m using opencv3.4.1 with python on Mac, is it ok for the version requirement?
REPLY 
Yes, OpenCV 3.4.1 should be sufficient.

Tobi October 18, 2018 at 3:32 am #
Hi Adrian,
thanks so much for this post and in general this whole website. I’m really falling in love with computer vision and will try to learn more. I have two particular questions regarding your code or, to be more precise, about the math behind it.
My questions refer to the first part of your post (text detection in a single image)
1. You wrote in one of your comments (code line 87):
“compute the offset factor as our resulting feature maps will be 4x smaller than the input image”
Where did you get this information and why is it so?
2. Can you explain in a bit more detail how the formula in line 102/103 works (endX, endY)?
I know that we can use the sine and cosine functions to find the coordinates but I don’t know exactly how this works. I couldn’t find any good explanations for this on the web. Perhaps you have a good resource?
Thanks in advance.
Best regards,
Tobi
REPLY 
Take a look at the EAST publication that I linked to in the post. You also might want to look at the architecture
visualization and see how the volume size changes as data passes through the network. As for your second question, I
think you’re asking where to learn trigonometry? Let me know if I understood your question correctly.
REPLY 
Tobi November 2, 2018 at 5:26 am #
Hi Adrian,
thanks for your answer, I will check the paper.

Regarding my second question, yes it’s about learning the trigonometry. I already checked some resources where I
learned (refreshed) a bit about cosine and sine but I couldn’t transfer this knowledge to the formula you used.
Maybe you have some better resources?
Best regards,
Tobi
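For readers stuck on the same formula, here is one way to read it, written as a standalone sketch rather than the post's exact loop. It assumes the four distance channels mean top, right, bottom and left (xData0 to xData3 in the script) and that the fifth channel is the rotation angle in radians. The function name decode_box and its parameter names are illustrative. With an angle of zero the bottom-right corner is simply the cell position plus the right and bottom distances; with a non-zero angle that same (right, bottom) offset is rotated by the standard 2D rotation before being added.

```python
import math


def decode_box(x, y, d_top, d_right, d_bottom, d_left, angle):
    """Turn one geometry-map cell into an axis-aligned (startX, startY, endX, endY).

    x, y      : column and row of the cell in the score/geometry maps
    d_*       : predicted distances from the cell to the box edges
    angle     : predicted rotation of the box, in radians
    """
    # the maps are 4x smaller than the network input, so scale the cell back up
    offset_x, offset_y = x * 4.0, y * 4.0

    cos, sin = math.cos(angle), math.sin(angle)

    # box size is the sum of the opposite edge distances
    h = d_top + d_bottom
    w = d_right + d_left

    # rotate the (d_right, d_bottom) offset by `angle` and add it to the cell position:
    #   [ cos  sin ] [ d_right  ]
    #   [-sin  cos ] [ d_bottom ]
    end_x = int(offset_x + (cos * d_right) + (sin * d_bottom))
    end_y = int(offset_y - (sin * d_right) + (cos * d_bottom))

    # walk back by the box size to get the opposite corner
    start_x = int(end_x - w)
    start_y = int(end_y - h)
    return start_x, start_y, end_x, end_y


if __name__ == "__main__":
    # made-up values: cell (10, 5), no rotation
    print(decode_box(10, 5, d_top=3.0, d_right=8.0, d_bottom=4.0, d_left=6.0, angle=0.0))
```

Setting the angle to zero and working one cell by hand is a quick way to check the reading above against the script.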
REPLY 
Saurav October 20, 2018 at 12:24 pm #
Hi,
Thanks so much for posting this and sharing your knowledge. I love reading your posts. This code works very well.
I was wondering if there is any way to detect blocks for a single line at a time.
REPLY 
Xiaodan October 20, 2018 at 5:42 pm #
Thanks for posting! Great article. One question, could I use the EAST text detector to only detect digits?
REPLY
EAST doesn’t provide you with any context of what the text actually contains, only that text exists somewhere in an image. Therefore, no, you cannot instruct EAST to detect digits. Instead, you would want to perform text recognition and then use Tesseract to return only digits.
REPLY 
Xiaodan October 22, 2018 at 8:50 pm #
Could I replace the training data (presumably English text training data) with digit (math formula training data) and train the same architecture? My purpose is to build an app that can detect then recognize and grade math worksheet problems from photos.
REPLY 
Presumably yes but you’ll also want to refer to the official EAST GitHub repo that I linked to inside the
post.
REPLY 
ali October 24, 2018 at 12:15 am #
Hi, Adrian
Thanks for sharing. I have a problem: when I run the code on my Pi with a webcam, my Pi suddenly restarts.
Please help me to solve this problem 🙁
REPLY
It sounds like your Pi may be overheating and restarting, or there is some sort of physical issue with your Raspberry Pi. Can you try with a different Pi?
REPLY 
Atul Mahajan October 26, 2018 at 8:23 am #
Thanks Adrian for sharing such great info.
I want to read the detected text from live video and for this I thought of first separating the frame in which text is detected
and then apply OCR on frame to read the text. But I observed to identify the frame it is very slow and time consuming
process.
Could you please suggest fast solution to read text from live video.
REPLY 
You would want to push the computation and forward pass of the network to your GPU but unfortunately that’s non-trivial with OpenCV and CUDA right now. I imagine that will be possible in the near future.
REPLY
Amul Mittal October 29, 2018 at 8:25 am #
Great Work Brother.
You are doing an awesome job.
Can you please provide C++ code as well? I am unable to understand the code in Python, and I am not finding any tutorial on Scene Text Detection in C++.
Please help…
REPLY 
I linked to the C++ implementation from my original blog post. Make sure you’re reading the full post.
REPLY 
Wim van de Brug October 30, 2018 at 6:09 am #
Hi Adrian,
Thanks for this great post. I have set up an environment using Python 3.7.1 and OpenCV 3.4.3.18 (from your pip install
opencv post). The script runs like a charm but rather slow:
[INFO] loading EAST text detector…
[INFO] text detection took 0.569462 seconds
I run this on a Microsoft Surface Pro 4 Windows 10 in the most minimal virtual env required for this script. Why is it on
Windows10 that slow compared to your benchmark?
Thanks for your earliest reply.
Wim
REPLY 
Adrian Rosebrock November 2, 2018 at 8:25 am #
Hey Wim — I’m not sure why the code would be so much slower on a Surface Pro. I’m not personally familiar
with the hardware.
REPLY 
Pallawi November 19, 2018 at 8:54 am #
Hi Adrian,
Thank you for such a great blog.
I am currently working on text detection on ATM slips.
The texts are very small and when I pass the whole slip into EAST, It does not give a correct detection.
I wanted to ask:
1. How many images and annotations will be needed to train EAST.

2. Can you please suggest me few datasets similar to ATM slip font.
3. Can you please suggest me few free text annotation tools.
REPLY
Adrian Rosebrock November 19, 2018 at 12:23 pm #
I’m not sure of any existing ATM slip dataset. You may need to curate one yourself. Good annotation tools
include imglab and LabelMe/LabelImg.
REPLY 
Dorra November 20, 2018 at 7:33 am #
Hi Adrian
Thanks for sharing, please what is the CNN architecture used ?
REPLY
Adrian Rosebrock November 20, 2018 at 9:01 am #
For more details on the EAST CNN architecture be sure to refer to their official GitHub.
REPLY 
Dorra November 20, 2018 at 10:26 am #
Adrian I mean that you did not use any of “LeNet”, “AlexNet”, “ZFNet”, “GoogLeNet”, “VGGNet” or ResNet?
REPLY 
Again, kindly refer to the GitHub link and associated paper I have provided you with. Read them and
your question will be answered.
REPLY 
Andrew November 28, 2018 at 7:48 pm #
Hi Adrian! Thank you for this post. You’re awesome!

I’m trying to compare my last model with the .pb model that you’re using here. But my last model has the following files: foo.data, bar.index, checkpoint and .data-0000-of-0001. How do I get the .pb file from these files to then pass it to the method cv2.dnn.readNet(“my_old_model.pb”)?
Thank you!
REPLY 
You need to convert your TensorFlow model to OpenCV format using OpenCV’s TensorFlow conversion tools.
To be honest I’ve never tried that process so I cannot give you instructions on how to proceed.
REPLY
Ashsh November 29, 2018 at 3:24 am #
Hi Adrian,
Thanks for sharing this work.
I was wondering if there is any way to print the text and digits that are detected from the image after extracting the bounding box using array slicing.
REPLY 
Adrian Rosebrock November 30, 2018 at 9:01 am #
Yes, see this tutorial where I combine the EAST text detector with OCR.
REPLY 
Akin November 29, 2018 at 5:54 pm #
Hi Adrian,
Thanks for this and many other great posts! Learning a lot.
I wonder if there is a way to get more precise bounding boxes around words (or even letters).
I can see on your demo that EAST is pretty powerful for detecting the ‘general’ region where the text lies (and then we have
powerful tools to infer the ‘content’ of the text from there), but if I wanted to have ‘coordinates’ or ‘height’ of the letters or
words,
the current code would not be enough.
Is there a way to play with this?
REPLY 
San December 16, 2018 at 1:12 am #
You can just extract the startX, startY, endX, endY from the code. Do some simple coding, like center points (i.e. ((startX + endX) / 2, (startY + endY) / 2)).
Width would be just (endX - startX) and height (endY - startY), etc.
Hope this helps
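As a small worked version of that suggestion, the sketch below derives width, height and center from one (startX, startY, endX, endY) box. The function name box_metrics and the sample numbers are illustrative only. Note that width is measured along x and height along y.

```python
def box_metrics(start_x, start_y, end_x, end_y):
    """Return the width, height and center point of one bounding box."""
    width = end_x - start_x    # extent along the x axis
    height = end_y - start_y   # extent along the y axis
    center = ((start_x + end_x) / 2, (start_y + end_y) / 2)
    return {"width": width, "height": height, "center": center}


if __name__ == "__main__":
    # hypothetical box
    print(box_metrics(34, 17, 48, 24))
```

These values describe the detected text region, which for EAST is usually a word or line rather than a single letter.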
REPLY 
San December 16, 2018 at 1:09 am #
Hi, thanks for writing this one. Are there any way that I can retrain this network?
The current model doesn’t work super well on my test images.
Thanks
San
REPLY 
Adrian Rosebrock December 18, 2018 at 9:08 am #
You would need to refer to the documentation provided by the EAST text detection GitHub repo.
REPLY
peter zhang December 24, 2018 at 2:53 am #
Hi Adrian,
Very good article, and very good detailed explanation.
I implemented all this on my Raspberry Pi 2 Model B. Text detection takes around 14 seconds for one image.
Is that normal?

REPLY 
Adrian Rosebrock December 27, 2018 at 10:41 am #
Yes, that is entirely normal. The Raspberry Pi is too underpowered to run these deep learning-based text detection models.
REPLY
Ishan December 28, 2018 at 7:46 am #
Thanks for such a great article.
I need some help getting text within each text boundary.

Any ideas how can we do that ?
REPLY 
Adrian Rosebrock January 2, 2019 at 9:40 am #
See this tutorial.
REPLY 
Nishtha January 3, 2019 at 2:31 am #
Thanks for the amazing article!

i am getting this error
i am running this in ubuntu18.04, py3.6
: cannot connect to X server

i am unable to view the output image
REPLY 
Are you SSH’ing into your system? If so, make sure you enable X11 forwarding:
$ ssh -X user@your_ip_address
REPLY 
Ahmed January 4, 2019 at 1:32 pm #
Hi Adrian,
My laptop’s Cpu is getting used 100% after running text detection video script .. is it normal?

Absolutely normal. The deep learning-based EAST text detector takes up quite a bit of CPU cycles.
REPLY
Aswin Balaji January 8, 2019 at 3:55 am #
Hi,
we are facing the following errors while executing the code. Please help out as soon as possible.
orig = image.copy()
AttributeError: ‘NoneType’ object has no attribute ‘copy’
REPLY
Adrian Rosebrock January 8, 2019 at 6:35 am #
Double-check your path to the input image. The path is invalid and “cv2.imread” is returning “None”. You can
read more about NoneType errors, including how to resolve them, here.
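A defensive loader makes this failure easier to diagnose. The sketch below is a hedged example, not the tutorial's code: it checks that the path exists before calling cv2.imread and raises a descriptive error if OpenCV still returns None. The name load_image_or_fail is made up here, and OpenCV is imported inside the function so the path check runs even where OpenCV is not installed.

```python
import os


def load_image_or_fail(path):
    """Load an image with OpenCV, raising a clear error instead of returning None."""
    if not os.path.isfile(path):
        raise FileNotFoundError("no file at: {!r} (check the --image path)".format(path))

    import cv2  # imported here so the path check above works without OpenCV

    image = cv2.imread(path)
    if image is None:
        # the file exists but OpenCV could not decode it
        raise ValueError("OpenCV could not read {!r} as an image".format(path))
    return image


if __name__ == "__main__":
    try:
        load_image_or_fail("images/does_not_exist.jpg")  # hypothetical path
    except FileNotFoundError as err:
        print("error:", err)
```

Calling image.copy() on the returned value is then safe, because the function never returns None.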
REPLY 
CGMoon January 11, 2019 at 3:35 am #
Thank you very much for your good writing and code.
If the size of the input image is 672 x 512, how do you think about resizing the width and height to the nearest size while
maintaining a multiple of 32?
I am wondering which case in the resize case below shows the best result.
– case 1: 640 x 480 (both width and height are multiples of 32 and resize to nearest size)
– case 2: 640 x 640 (the width is a multiple of 32 and the height resizes to the same size as the width)
– case 3: 480 x 480 (the height is a multiple of 32, resize to the same size as the width)
– case 4: 320 x 320 (the resize size used in your code)
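One thing worth checking first is whether the original size already satisfies the constraint. The helper below is a sketch with an invented name, nearest_multiple_of_32, that rounds a dimension to the closest multiple of 32. It does not say which of the four cases detects best; that still has to be compared on real images.

```python
def nearest_multiple_of_32(value):
    """Round a pixel dimension to the closest multiple of 32 (never below 32)."""
    return max(32, ((value + 16) // 32) * 32)


if __name__ == "__main__":
    for width, height in [(672, 512), (500, 375), (1920, 864)]:
        print((width, height), "->",
              (nearest_multiple_of_32(width), nearest_multiple_of_32(height)))
```

Both 672 (21 x 32) and 512 (16 x 32) are already multiples of 32, so an image of that size could in principle be passed without changing its aspect ratio, at the cost of a slower forward pass than 320 x 320.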
REPLY 
Zoylamb January 18, 2019 at 2:59 am #
Hello,
Thank you for this tutorial. I would like to use another frozen model and would like to know how to choose the output names.
Where do they come from? (not found in the EAST publication):
“feature_fusion/Conv_7/Sigmoid”,
“feature_fusion/concat_3”
Thank you.
REPLY 
Those are from the model architecture itself. They were defined when actually implementing the architecture.
REPLY
steve January 19, 2019 at 5:41 am #
Hi
What hardware are you running please?
My Raspberry Pi 3B takes 17 seconds to do a single image detection from a jpeg file, yours takes 0.14 seconds.
Cheers
Steve
REPLY
Adrian Rosebrock January 22, 2019 at 9:38 am #
I’m using an iMac Pro with a 3GHz Intel Xeon W processor. The Raspberry Pi will be FAR too slow to run this code (it just doesn’t have enough computational horsepower).
REPLY 
Abi January 21, 2019 at 9:36 am #
Hi,
Can I use opencv-3.3.0 to run this code? Or is there any way to upgrade version 3.3 to 4, since I have already installed 3.3?
REPLY 
You’ll need either OpenCV 3.4 or OpenCV 4 for this tutorial. Make sure you upgrade from OpenCV 3.3.
REPLY 
Jerome Diongon January 22, 2019 at 11:39 pm #
I’m having an error when I try to run text_detection_video.py
--------------
usage: text_detection_video.py [-h] -east EAST [-v VIDEO] [-c MIN_CONFIDENCE]
[-w WIDTH] [-e HEIGHT]
text_detection_video.py: error: the following arguments are required: -east/--east
--------------
That’s the error I got when I ran the code. Please answer, thank you 🙂
REPLY 
It’s okay if you are new to Python and command line arguments but you need to read this tutorial first. From
there you’ll understand command line arguments and be able to execute the script.
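The usage message above lists the script's arguments, and only --east is required. As a rough reconstruction (flag names are taken from that usage line, the 0.5 and 320 defaults from other comments in this thread, and the help strings are my own wording), a parser like the one below reproduces the behaviour: it exits with the same "arguments are required" error when --east is missing and succeeds once a model path is supplied.

```python
import argparse


def build_parser():
    """Approximate the argument parser implied by the usage message."""
    ap = argparse.ArgumentParser(prog="text_detection_video.py")
    ap.add_argument("-east", "--east", type=str, required=True,
                    help="path to the EAST text detector model (.pb)")
    ap.add_argument("-v", "--video", type=str,
                    help="optional path to an input video file")
    ap.add_argument("-c", "--min-confidence", type=float, default=0.5,
                    help="minimum probability for a region to count as text")
    ap.add_argument("-w", "--width", type=int, default=320,
                    help="resized frame width (multiple of 32)")
    ap.add_argument("-e", "--height", type=int, default=320,
                    help="resized frame height (multiple of 32)")
    return ap


if __name__ == "__main__":
    # equivalent to: python text_detection_video.py --east frozen_east_text_detection.pb
    args = vars(build_parser().parse_args(["--east", "frozen_east_text_detection.pb"]))
    print(args)
```

So the fix for the error is to pass the model file on the command line, for example --east followed by the path to the frozen EAST model from the post's downloads.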

lkk January 24, 2019 at 2:34 am #
hi how can i use my own model, the pb file, for detection. i can the east argument, but it doesn’t work
REPLY 
Manoj January 29, 2019 at 4:54 am #
Hi Adrian,
A very useful knowledge base. Could you please guide me on how to find the contours of the detected text, so that I can mask the same?
REPLY
Adrian Rosebrock January 29, 2019 at 6:28 am #
See this tutorial where I extract the bounding box of the text and pass it through the OCR engine. Once you have the
bounding box you can mask the text.
REPLY 
Manoj January 29, 2019 at 10:15 am #
Thanks for the quick reply Adrian.

The problem i am trying to solve is to extract graphics and text separately from the image and process them. While
the suggested approach works perfectly for text recognition. I want the image with just the graphics to process them
into vectors.
Using the bounding box to erase the text causes at times parts of graphics to be erased as well. So i was hoping to
find a way of getting the edge contours of the identified text in some way to then erase them
REPLY 
You’re trying to compute a mask for the actual text then? That sounds more like an instance
segmentation problem. I don’t know of any instance segmentation models for pure text though, you may need to
do some research there.
REPLY 
Ray Li January 29, 2019 at 10:02 am #
# compute the offset factor as our resulting feature

# maps will be 4x smaller than the input image
(offsetX, offsetY) = (x * 4.0, y * 4.0)
Hi there, why do you use 4 times here?
REPLY 
Because the input spatial dimensions were reduced by a factor of four. We need to obtain the offset coordinates in terms of the original input image.
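A short numeric sketch may make the two coordinate changes easier to follow. It assumes a 320 x 320 network input (so 80 x 80 output maps) and a hypothetical 1280 x 720 original image. The function name to_original_coords is invented, and r_w and r_h stand for the original-to-resized ratios. In the post the ratio scaling is applied to whole boxes after suppression, whereas here both steps are applied to a single point for clarity.

```python
def to_original_coords(x, y, r_w, r_h):
    """Map a feature-map cell (x, y) back to pixel coordinates in the original image."""
    # step 1: the output maps are 4x smaller than the network input
    input_x, input_y = x * 4.0, y * 4.0
    # step 2: the network input was a resized copy of the original image
    return int(input_x * r_w), int(input_y * r_h)


if __name__ == "__main__":
    orig_w, orig_h = 1280, 720   # hypothetical original image size
    new_w, new_h = 320, 320      # size fed to the network
    print("map size:", (new_w // 4, new_h // 4))
    r_w, r_h = orig_w / float(new_w), orig_h / float(new_h)
    print(to_original_coords(10, 5, r_w, r_h))
```

Cell (10, 5) therefore sits at pixel (40, 20) of the resized input before the second scaling is applied.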
REPLY
Ruizhe Li January 29, 2019 at 12:02 pm #
Thanks very much for these interesting blogs.
But I am a little bit disappointed in OpenCV 🙁 as this text detector doesn’t perform well on angled or small text. And sometimes Tesseract can’t recognise the text or the result makes no sense, as the text detector couldn’t make an accurate region proposal in the first step.
I made some improvements based on your code by letting Tesseract search a little bit around the proposed text region. It is more accurate but a bit less efficient.
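The comment does not include the code, so the following is only a guess at what searching "a little bit around the proposed text region" could look like: grow each box by a fraction of its own size and keep it inside the image before handing the crop to the OCR engine. The function name pad_box, the padding parameter and the example numbers are all assumptions made for illustration.

```python
def pad_box(start_x, start_y, end_x, end_y, image_w, image_h, padding=0.05):
    """Grow a box by `padding` (a fraction of its size) on every side, clamped to the image."""
    d_x = int((end_x - start_x) * padding)
    d_y = int((end_y - start_y) * padding)
    return (max(0, start_x - d_x),
            max(0, start_y - d_y),
            min(image_w, end_x + d_x),
            min(image_h, end_y + d_y))


if __name__ == "__main__":
    # hypothetical detection inside a 640 x 480 image, padded by 25% per side
    print(pad_box(100, 50, 300, 100, 640, 480, padding=0.25))
```

Larger padding gives the recognizer more context around clipped characters but also pulls in more background, which matches the accuracy versus efficiency trade-off described above.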
REPLY
osman January 30, 2019 at 8:04 pm #
Your code has a small bug. The bounding box will overflow in some cases. To fix that you should do
startX, endX = np.clip([startX, endX], 0, W)
startY, endY = np.clip([startY, endY], 0, H)
after
endX = int(offsetX + (cos * xData1[x]) + (sin * xData2[x]))
endY = int(offsetY - (sin * xData1[x]) + (cos * xData2[x]))
startX = int(endX - w)
startY = int(endY - h)
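The same clamping can be written without NumPy, which also makes it easy to spot boxes that end up empty once clipped. This is a sketch under my own naming (clamp, clip_box); it mirrors the np.clip idea from the comment and adds one extra step, returning None when nothing of the box remains inside the image.

```python
def clamp(value, low, high):
    """Limit `value` to the closed range [low, high]."""
    return max(low, min(value, high))


def clip_box(box, image_w, image_h):
    """Clip (startX, startY, endX, endY) to the image; return None if nothing is left."""
    start_x, start_y, end_x, end_y = box
    start_x, end_x = clamp(start_x, 0, image_w), clamp(end_x, 0, image_w)
    start_y, end_y = clamp(start_y, 0, image_h), clamp(end_y, 0, image_h)
    if end_x <= start_x or end_y <= start_y:
        return None  # the box lay entirely outside the image
    return (start_x, start_y, end_x, end_y)


if __name__ == "__main__":
    # hypothetical boxes for a 320 x 320 input
    print(clip_box((-12, 30, 350, 90), 320, 320))
    print(clip_box((330, 10, 400, 50), 320, 320))
```

Skipping the None results before suppression avoids passing zero-area boxes further down the pipeline.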
REPLY 
vinoth kumar February 2, 2019 at 1:32 am #
How can I train my own network using the EAST algorithm?
REPLY 
Refer to the EAST creators’ official GitHub page.
REPLY 
Muhammad Khisal Khalid February 4, 2019 at 10:46 am #
Hey,
Thank you for the great Article. It helped me a lot in learning. This works perfectly for the normal orientation. Please tell me
what changes do I need if the letters are upside down or sideways. I
REPLY 
Nike February 5, 2019 at 12:58 am #
How to unfreeze frozen_east_detection.pb into actual model. I actually wanted to see the coding behind it. I am a
beginner in this field. Wanted to know what is happening behind the scene.

The EAST detection model was pre-trained. If you’re new to computer vision and deep learning I would recommend reading through Deep Learning for Computer Vision with Python so you can learn how to train your own networks.
REPLY
Dheeraj February 6, 2019 at 6:19 am #
Hi Adrian… can I print the detected text on the Python shell?
REPLY
Yes, take a look at this tutorial on OpenCV OCR.
REPLY 
rahul February 12, 2019 at 3:27 am #
how can i merge my own trained data into this work ?
REPLY 
Adrian Rosebrock February 14, 2019 at 1:21 pm #
Refer to the GitHub repo referenced in the body of the blog post. Follow the instructions from the authors
(again, in the GitHub repo I linked to).
REPLY 
Hershel February 12, 2019 at 11:39 am #
Hi Adrian,
This is the first neural net I’ve seen where the size of the image just had to be a multiple of a number rather than a specific
dimension. I’ve looked through the East Github page and am not seeing the mechanism that allows that to happen.
I’ve tested this code out on images of size 8384 x 1600 (email ad) and it works beautifully, so clearly it isn’t just resizing to 32 x 32.
Is this so obvious that I’m overlooking it? Do you know of any papers or documentation that I could look into?
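A plausible explanation, offered here as an assumption and not something confirmed in this thread: EAST is fully convolutional, so no layer fixes the input size, but its feature extractor produces maps at 1/4, 1/8, 1/16 and 1/32 of the input and merges them by repeatedly doubling the deeper map and concatenating it with the next shallower one. That merge only lines up if every stage halves cleanly. The sketch below models each stage as rounding up, which is itself an assumption about the padding, and checks whether doubling reproduces the shallower size. The function names are invented.

```python
import math


def stage_sizes(size, strides=(4, 8, 16, 32)):
    """Assumed feature-map size along one axis at each downsampling stride."""
    return [math.ceil(size / s) for s in strides]


def merge_is_consistent(size):
    """True if doubling each deeper map gives exactly the next shallower map."""
    sizes = stage_sizes(size)
    return all(deep * 2 == shallow for shallow, deep in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    for size in (320, 330, 1600, 8384):
        print(size, stage_sizes(size), merge_is_consistent(size))
```

Under this model any multiple of 32, including 8384 and 1600, merges cleanly, while a size such as 330 does not; the EAST paper and repository remain the authoritative source for the real architecture.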
REPLY 
Jerome Diongon February 26, 2019 at 10:18 am #
Good day! How can i get the result of the captured character in the video? Because i want to put it in a text file.
Hope you can answer me, thank you :).
REPLY 
No problem, just refer to this tutorial.
REPLY
Jerome Diongon March 3, 2019 at 1:45 am #
How about if I want to send the captured characters to a database? How is that done?
REPLY 
Adrian Rosebrock March 5, 2019 at 8:57 am #
That’s not really a computer vision problem. That’s a general programming/engineering problem. I would recommend you take the time to read up on Python programming and basic databases. From there you can continue with your project.
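As a starting point for that reading, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the recognized strings already exist (EAST alone only locates text, so they would come from a separate OCR step), and the table name detections and the helper names are made up for this example.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open (or create) a SQLite database with a table for recognized text."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS detections ("
        "id INTEGER PRIMARY KEY, captured_at TEXT NOT NULL, text TEXT NOT NULL)"
    )
    return conn


def save_texts(conn, texts):
    """Insert each recognized string together with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO detections (captured_at, text) VALUES (?, ?)",
            [(now, text) for text in texts],
        )


def load_texts(conn):
    """Return all stored strings in insertion order."""
    return [row[0] for row in conn.execute("SELECT text FROM detections ORDER BY id")]


if __name__ == "__main__":
    conn = open_store()                   # in-memory database for the demo
    save_texts(conn, ["OPEN", "24 HOURS"])  # hypothetical OCR results
    print(load_texts(conn))
```

Passing a file name instead of ":memory:" keeps the rows between runs; writing to a plain text file would follow the same shape with open() and write().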
REPLY
Anshul jain March 5, 2019 at 3:00 am #
Sir where can i find the dataset for this?
REPLY 
You mean the dataset the EAST model was trained on? Refer to the author’s GitHub page which I’ve linked to
from the body of the post.
REPLY 
Aadesh March 15, 2019 at 9:54 pm #
Greetings Adrian,
Thank you for writing a great article. This is the first time i have worked with neural networks and while i was going through
your tutorial i found out that the dimension of the input image should be a multiple of 32. I referred the IEEE Paper of the
EAST Algorithm but i couldn’t figure out why the input has to be a multiple of 32. It would be great if you have any
documentations regarding this.
Thank You.
REPLY 
Henry March 18, 2019 at 6:46 pm #
If I am only interested in number detection(from 0-9), is there any way for me to retrain the model? Or how do I
eliminate other texts except numbers with the current model?
REPLY 
Take a look at the Tesseract documentation. There is a set of parameters you can supply to only extract digits
(but I can’t remember it off the top of my head, sorry).
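The option in question is most likely Tesseract's character whitelist, set through the tessedit_char_whitelist config variable. The sketch below builds such a config string and adds a plain-Python fallback that strips non-digits from whatever the OCR engine returns. How strictly the whitelist is honoured has varied between Tesseract versions and engines, so treat the config as a starting point. The function names are invented, and page segmentation mode 7 (a single text line) is an assumption suited to cropped word or line regions.

```python
def digits_only_config(psm=7):
    """Build a Tesseract config string that asks for digits only."""
    return "--psm {} -c tessedit_char_whitelist=0123456789".format(psm)


def keep_digits(text):
    """Fallback filter: drop every character that is not 0-9."""
    return "".join(ch for ch in text if ch in "0123456789")


if __name__ == "__main__":
    print(digits_only_config())
    print(keep_digits("Room 42B, floor 3"))
    # With pytesseract installed, the config would be passed like this
    # (roi is a cropped text region from the detector):
    #   text = pytesseract.image_to_string(roi, config=digits_only_config())
    #   digits = keep_digits(text)
```

Filtering after recognition works with the existing model, so no retraining of EAST is needed for this use case.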

REPLY
Marco March 19, 2019 at 7:51 pm #
Any idea why the performance of text detection is very bad? The text is clear, since I’m using a screenshot from my phone. The whole image has loads of text (a screenshot of a calendar) but I only get 1 confidence/match. I’m using the full res image: 1920 by 864.
REPLY
Danny March 21, 2019 at 10:01 pm #
Great article. I’ve learned so much from you in a matter of days. What about detecting blocks of text as one object? For example, address labels on an envelope. From this article I feel confident that I could detect individual words (and maybe lines), but could you treat the entire address label as a single rectangular object and train the model to detect that?
Adrian Rosebrock March 22, 2019 at 8:29 am #
You would want to define a heuristic to group them, such as “all bounding boxes within N pixels of each other
should be grouped together”. Loop over the bounding boxes, check to see if any are close, and if so, group
them.
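That heuristic can be sketched in a few lines of plain Python. The version below is one possible interpretation, not code from the post: two boxes count as "close" when the gap between them is at most n pixels on both axes, boxes are merged transitively into groups, and each group is summarized by its enclosing rectangle. All function names and the sample boxes are illustrative.

```python
def boxes_are_close(a, b, n):
    """True if the gap between boxes a and b is at most n pixels on both axes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    gap_x = max(bx1 - ax2, ax1 - bx2, 0)  # 0 when the boxes overlap horizontally
    gap_y = max(by1 - ay2, ay1 - by2, 0)  # 0 when the boxes overlap vertically
    return gap_x <= n and gap_y <= n


def group_boxes(boxes, n):
    """Group boxes so that any two boxes linked by a chain of close boxes share a group."""
    groups = []
    for box in boxes:
        # every existing group with at least one member close to this box
        touching = [g for g in groups if any(boxes_are_close(box, m, n) for m in g)]
        merged = [box]
        for g in touching:
            merged.extend(g)
        # drop the groups that were just merged, then add the combined one
        groups = [g for g in groups if not any(g is t for t in touching)]
        groups.append(merged)
    return groups


def enclosing_box(group):
    """Smallest rectangle that contains every box in the group."""
    xs1, ys1, xs2, ys2 = zip(*group)
    return (min(xs1), min(ys1), max(xs2), max(ys2))


if __name__ == "__main__":
    # hypothetical word boxes: two neighbours on one line and one far away
    boxes = [(10, 10, 50, 30), (55, 12, 100, 32), (300, 200, 360, 220)]
    for group in group_boxes(boxes, n=10):
        print(enclosing_box(group), "from", len(group), "box(es)")
```

The value of n controls how aggressively words fuse into lines and blocks, so it usually needs tuning per image size, for instance as a fraction of the typical box height.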
REPLY 
Nate March 24, 2019 at 10:16 pm #
Brilliant read with so much detailed information!
Following from the previous question. How would one structure the code to group bounding boxes for individual text
detection and output?
After all text been detected in a natural image.
Thanks in advance
REPLY 
What do you mean by “group bounding boxes”? How should the bounding boxes be grouped?
REPLY 
Anshul jain March 27, 2019 at 2:29 am #
Great article but can you provide the dataset for static text detection or any source where i can get it?
REPLY 
Vasil Dimitrov April 8, 2019 at 1:49 am #
Great article Adrian, and so useful !!

In your opinion is it possible to use EAST model as a base and put a classification layer on top of it. Then train it to classify
detected text into one of few trained classes – say whether the detected word is “dog” or “cat” ?
A lot like it is done with image classification…

Adrian Rosebrock April 12, 2019 at 12:40 pm #
That would be overkill and wouldn’t work well. Instead, follow this tutorial on OpenCV + OCR. Just OCR the word itself.
REPLY 
Himanshu April 20, 2019 at 12:31 am #
Hi, I’m running this code and it’s executing without any error.
However, even in the simplest images, the scores being predicted are in the negative power of the exponent. Everything is coming out less than 0.5 (the default min confidence).
Can anyone please help?
REPLY 
Sophie April 20, 2019 at 1:16 pm #
Could you please help me, I always get the text:

orig = image.copy()
NameError: name ‘image’ is not defined
Can you tell me where I can find my mistake?

Another question: I was searching for a program which helps me recognize the contours of some letters. If the program recognizes the contour I will teach it a path to rewrite the letter. Is this possible with EAST?
REPLY 
Adrian Rosebrock April 25, 2019 at 9:15 am #
It sounds like you’re copying and pasting the code. Don’t do that. You likely inserted an error accidentally when
copying and pasting. Use the “Downloads” section of the post to download the code.
REPLY 
Pradipta Karmakar April 26, 2019 at 9:28 am #
It was really a great and amazing project. I just want a little help as i am not that much expert in python coding for
image processing. Can anybody help me to show the text that is being detected???
REPLY 
Adrian Rosebrock May 1, 2019 at 12:02 pm #
You can follow this tutorial.
REPLY 
Art April 27, 2019 at 2:26 pm #
Hi Adrian!
Has anybody done optimization of your implementation using Cython? Do you know?
REPLY
Vamsi May 30, 2019 at 12:27 pm #
How do we store the detected text by time frame?
Like
Time: 0:39:43
Text: Prey
REPLY
Adrian Rosebrock June 6, 2019 at 8:47 am #
The timestamp of a video? Or a real-time video display?
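For the video-file case, a timestamp can be derived from the frame index and the frame rate. The sketch below assumes both are known (with OpenCV the rate of a file is usually available through the CAP_PROP_FPS capture property) and that the text string comes from a separate OCR step. For a live stream, wall-clock time would be the natural substitute. The function names are invented for this example.

```python
def frame_to_timestamp(frame_index, fps):
    """Convert a frame index to an H:MM:SS string for a video with the given frame rate."""
    total_seconds = int(frame_index / fps)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return "{}:{:02d}:{:02d}".format(hours, minutes, seconds)


def format_record(frame_index, fps, text):
    """Produce a 'Time: ... / Text: ...' record like the one asked for above."""
    return "Time: {}\nText: {}".format(frame_to_timestamp(frame_index, fps), text)


if __name__ == "__main__":
    # hypothetical: frame 71490 of a 30 fps video, OCR result "Prey"
    print(format_record(71490, 30, "Prey"))
```

Appending each record to a list or file as frames are processed gives a simple time-indexed log of what was read.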
REPLY
vivek June 13, 2019 at 2:39 am #
Hi I need to build a model to extract handwriting from images, please suggest me how much will i be benefited if i
consult the described model or if not please suggest that as well.
Thanks in advance
Vivek
REPLY 
Adrian Rosebrock June 13, 2019 at 9:34 am #
Trying to create a handwriting recognition system from scratch can be super challenging (and not something I
really recommend). Have you tried using an off-the-shelf solution such as Google Vision API yet? It includes a text
recognition component which may work for you.
REPLY 
mick June 13, 2019 at 4:08 am #
Hi,
I’m new to OpenCV (and using OpenCvSharp :s) and managed to implement your code in C#, but… I have an issue and wondered if you knew in general why the scores and geometry rows and cols are both -1
I have tried loads of different images.
Any thought would be great.
Thanks
REPLY 
Adrian Rosebrock June 13, 2019 at 9:32 am #
Congrats on implementing the text detector in C#, Mick! However, I’m not sure why that would happen. It may
be an issue with the C# OpenCV bindings but I’m not familiar with the C# + OpenCV bindings so unfortunately I don’t
have any suggestions on the issue.

REPLY
mick June 13, 2019 at 4:32 pm #
OK – Thanks anyway. I just followed your tutorial and did it in Python to see it working and to get my head into the theory and it’s very cool 🙂
Can you “compile” Python into a DLL or such, or is it interpreted only (this is the first time I’ve used it, in case that wasn’t very obvious :))
I’m writing an app for our factory that has to read some numbers on a box about 4 or 5 times a second. Will probably use a “simpler” OCR for that, just playing for the mo.
Mick.
REPLY 
Julian July 10, 2019 at 9:11 am #
Hi Mick,
Can you share the C# code with me? I can’t get it running with C#. That would be great.
Thank you
REPLY 
Manju August 2, 2019 at 4:14 am #
In response to the -1 in the C# code: there is actually something wrong with the Width/Height property in
the OpenCvSharp wrapper. If you use the property Size() and then X and Y, it works fine.
REPLY 
somesh bachani June 13, 2019 at 4:24 pm #
hello Adrian
After the text detection, how do I convert the data into text? For example, if the image says “hello”, I need the output
to be hello so that I can execute functions after comparing the two strings. Please help.
REPLY 
Adrian Rosebrock June 19, 2019 at 2:24 pm #
See this OpenCV OCR tutorial.
REPLY 
Denys June 17, 2019 at 4:26 pm #
Hi Adrian,
Thank you very much for such an amazing course! You’re a genius!
Could you please help me with how to use EAST text detection with a GRAYSCALE image?
Many thanks.
REPLY 
Just stack the gray image 3 times to create a 3-channel image:
image = np.dstack([gray, gray, gray])
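A minimal, self-contained sketch of the stacking step above. The `gray` array here is a hypothetical stand-in (a blank 320x320 image) for whatever grayscale image you have loaded; only `np.dstack` comes from the reply itself.

```python
import numpy as np

# Hypothetical stand-in for a grayscale image loaded elsewhere:
# a single-channel array of shape (height, width).
gray = np.zeros((320, 320), dtype="uint8")

# Repeat the single channel three times along the depth axis so the
# array has the 3-channel layout a color image would have.
image = np.dstack([gray, gray, gray])

print(image.shape)  # (320, 320, 3)
```

If the image is already being handled with OpenCV, `cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)` should give the same 3-channel result.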
REPLY 
Denys June 20, 2019 at 2:44 pm #
Thank you very much!
Adrian Rosebrock June 26, 2019 at 1:51 pm #
REPLY 

You are welcome!
Franco June 23, 2019 at 11:31 am #
Hi Adrian, this is great content! One question though: for my application it would be interesting to understand how to
generate a boolean that indicates whether there is any text at all in the image. What would you suggest for this
application?
REPLY 
Loop over all text detections and check to see if there is any detection that has a > X% confidence (you define
X% yourself). If so, set the boolean to True.
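The check described above could be sketched roughly as follows. The function name `contains_text`, the `confidences` argument, and the 0.5 default threshold are all illustrative assumptions, not part of the tutorial's code; `confidences` can be any list or array of per-detection scores.

```python
import numpy as np

def contains_text(confidences, min_confidence=0.5):
    """Return True if any detection score is above min_confidence.

    `confidences` is a hypothetical list/array of per-detection
    probabilities; 0.5 is an arbitrary example threshold.
    """
    scores = np.asarray(confidences, dtype="float32").ravel()
    # No detections at all means no text was found.
    return bool(scores.size > 0 and (scores > min_confidence).any())

print(contains_text([0.12, 0.93, 0.40]))  # True
print(contains_text([]))                  # False
```

Because the scores are flattened first, the same function should also work when given a multi-dimensional score map directly, which avoids an explicit Python loop.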
REPLY 
SATYAM SAREEN June 23, 2019 at 5:26 pm #
Good Afternoon Adrian;

Loads of Love from India.
I really enjoy reading your articles. I am new to computer vision and have some doubts.
Can you tell me what the rows and columns in the scores volume are, given that scores contains the probability of whether
text is present or not?
What are xdata0, xdata1, xdata2, xdata3, xdata4?
And my last doubt: I did not understand your step of calculating h, w, endx, endy, startx, starty.
Regards
Satyam Sareen
REPLY 
aisha July 3, 2019 at 4:08 am #
Hi Adrian, I want to try this code, OpenCV Text Detection (EAST text detector), and I have downloaded it from here,
but where is the dataset? Can you please provide me the dataset?
Adrian Rosebrock July 4, 2019 at 10:17 am #
REPLY 
Kindly refer to the blog post where I link you to the authors of the EAST paper publication and the dataset they
used.
REPLY 
Vishnu July 9, 2019 at 3:17 am #
Hi Adrian,
I would like to know how to get bounding boxes for single characters instead of the whole length of characters,
e.g. Adrian (get bounding boxes for A, D, R, I, A, N).
Your help is much appreciated.
Thanks
REPLY 
Sparsh_03 July 17, 2019 at 1:44 am #
Hi Adrian, first of all, thanks for the brilliant post. I want to detect the text only on the number plates, ignoring the
rest. How could I achieve this?
Thanks in advance
REPLY 
Adrian Rosebrock July 25, 2019 at 9:51 am #
I cover ANPR/ALPR inside the PyImageSearch Gurus course. I suggest you start there.
REPLY 
Alison C August 3, 2019 at 5:01 pm #
How can I have text detection and text recognition in one project?
Please, I need your help!!
REPLY 
You can do that using this tutorial.
REPLY 
Nazim Shaikh August 4, 2019 at 12:18 am #
Great post Adrian. Love your work as always.
I am trying to use the C++ version of this (provided by OpenCV) to detect lines, although I am having trouble understanding
it for line detection. Could you please help me with how I can detect lines of words using your script? I can probably then
try to convert it into C++.

REPLY 
Antologyimon August 11, 2019 at 5:41 pm #
Hey Adrian,
For some reason single letters or single characters are not detected.
Is there a way to get the EAST text detector to detect single letters?
Thanks so much for the tutorials. They are ace!
© 2019 PyImageSearch. All Rights Reserved.
