
Nishad KV, SHM Engg College, Kollam

WELCOME

TRANSLATAR: A MOBILE AUGMENTED REALITY TRANSLATOR


NISHAD KV, Roll No: 10, S7 IT

ABSTRACT
TranslatAR is a mobile augmented reality (AR) translation system that uses a smartphone's camera and touch screen. The user simply taps once on the word of interest, and the translation is produced and presented as an AR overlay.


CONTENTS
1) Introduction
2) What is Augmented Reality?
3) Overview of the System
4) Implementation Details
5) Advantages
6) Disadvantages
7) Future Enhancements
8) Conclusion
9) References


INTRODUCTION
Written text is one of the most common means of conveying information in daily life. However, when text is written in a language unfamiliar to the reader, that information is lost. Using a smartphone with a touch screen and camera as the physical device, we present a system for automatic translation of visual text that combines an efficient, easy-to-use input method with a natural and compelling form of presentation. The use of our system, TranslatAR, is shown in Fig. 1.

FIG.1

WHAT IS AUGMENTED REALITY?

 The goal of augmented reality is to add information and meaning to a real object or place.
 Augmented reality does not create a simulation of reality. Instead, it takes a real object or space as its foundation and incorporates technologies that add contextual data to deepen a person's understanding of the subject.

OVERVIEW OF THE SYSTEM


 It was designed such that all expensive operations run in a background thread, while the system maintains interactive frame rates for tracking and augmentation.
 The system's architecture, shown in Fig. 2, consists of four stages:
1) Text Detection
2) Text Extraction, Recognition and Translation
3) Visual Tracking
4) AR Overlay

FIG 2

1. TEXT DETECTION
Given the point c on which the user tapped, the system first finds the bounding box around the word, then the exact location and orientation of the text within:
1) Bounding box
2) Location & orientation refinement

1.1 BOUNDING BOX
 It is the area around the text to be translated.
 First, the image gradients Ix and Iy are computed.
 A short horizontal line segment Sh around the input point c is moved vertically upwards and downwards (Fig. 3a).
 A short vertical line segment Sv around the input point c is moved horizontally (Fig. 3b).
 The method is susceptible to failure on non-uniform backgrounds.
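The segment-moving procedure above can be sketched as follows. This is a minimal illustration on a synthetic grayscale image, not the authors' code; the threshold, segment half-length, and crude forward-difference gradients are simplifying assumptions of this sketch:

```python
def find_vertical_extent(img, cx, cy, half_len=3, thresh=10):
    """Grow a short horizontal segment around (cx, cy) upwards and
    downwards until the gradient energy along it drops below `thresh`,
    approximating the top and bottom of the tapped word.

    `img` is a 2D list of grayscale intensities.
    """
    h, w = len(img), len(img[0])

    def seg_energy(y):
        # Forward-difference gradients |Ix| + |Iy| summed along the
        # segment: high inside text, near zero on a uniform background.
        e = 0
        for x in range(max(cx - half_len, 0), min(cx + half_len, w - 2) + 1):
            e += abs(img[y][x + 1] - img[y][x])
            e += abs(img[min(y + 1, h - 1)][x] - img[y][x])
        return e

    top = cy
    while top > 0 and seg_energy(top) >= thresh:
        top -= 1
    bottom = cy
    while bottom < h - 1 and seg_energy(bottom) >= thresh:
        bottom += 1
    return top, bottom
```

The horizontal extent is found analogously with the vertical segment Sv moved left and right.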

FIG. 3 : TEXT DETECTION IN OPERATION


1.2 LOCATION & ORIENTATION REFINEMENT

 Only pixels within the bounding box are considered.
 Lines that cross the vertical line through c at an angle are considered as candidate baselines.
 A voting scheme is used for the task of finding the text baselines.
 The resulting quadrilateral region of interest is warped into a rectangle, correcting any perspective distortion and showing the text as if seen frontally (Fig. 3c, Fig. 3d).
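The baseline voting can be illustrated with a tiny Hough-style accumulator: each edge point votes for the candidate lines through it, parameterised by slope and by intercept at the vertical line through c. The slope grid and integer rounding here are assumptions of this sketch, not the paper's exact scheme:

```python
from collections import Counter

def vote_baseline(points, cx, slopes=(-0.4, -0.2, 0.0, 0.2, 0.4)):
    """Each edge point (x, y) votes for every candidate baseline
    (slope m, intercept b at x = cx) that passes through it; the
    accumulator bin with the most votes wins."""
    acc = Counter()
    for x, y in points:
        for m in slopes:
            b = round(y - m * (x - cx))  # intercept where line crosses x = cx
            acc[(m, b)] += 1
    (m, b), _ = acc.most_common(1)[0]
    return m, b
```

Edge points lying on a common slanted baseline all vote into the same bin, so the winning (slope, intercept) pair recovers the text orientation used for the perspective warp.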

2. TEXT EXTRACTION, RECOGNITION AND TRANSLATION

 The warped image is used to extract the background and foreground colors, as well as to read the word via OCR:
1) Foreground and background color estimation
2) OCR
3) Dictionary lookup
4) Translation

2.1 FOREGROUND AND BACKGROUND COLOR ESTIMATION

 We may assume that the foreground color has strong contrast with the background.
 The background color is extracted by sampling the borders of the bounding box, assuming that the background is roughly uniform.
 The foreground color is estimated by scanning pixels starting from a point in the center.
 A sample is accepted as an estimate of the foreground color if its intensity is sufficiently different from the background.
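A minimal version of this two-step estimation on a grayscale patch; the contrast threshold `delta` is an assumed parameter, and real images would use full RGB rather than a single intensity:

```python
def estimate_colors(img, delta=50):
    """Estimate the background color as the mean of the border pixels,
    then scan outward from the center for the first pixel whose
    intensity differs from it by more than `delta`; that pixel is
    taken as the foreground color."""
    h, w = len(img), len(img[0])
    border = ([img[0][x] for x in range(w)] +
              [img[h - 1][x] for x in range(w)] +
              [img[y][0] for y in range(1, h - 1)] +
              [img[y][w - 1] for y in range(1, h - 1)])
    bg = sum(border) / len(border)
    cy, cx = h // 2, w // 2
    # Scan a growing neighbourhood around the centre pixel.
    for r in range(max(h, w)):
        for y in range(max(cy - r, 0), min(cy + r, h - 1) + 1):
            for x in range(max(cx - r, 0), min(cx + r, w - 1) + 1):
                if abs(img[y][x] - bg) > delta:
                    return bg, img[y][x]
    return bg, None  # no sufficiently contrasting pixel found
```

These two colors are later reused to render the translated word in roughly the original appearance.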

2.2 OCR (OPTICAL CHARACTER RECOGNITION)

 With the rectified image of the word, we rely on a standard OCR system for extraction and recognition of the letters.
 We used Tesseract, as it is freely available and was easy to integrate.
 Tesseract is an OCR engine developed by HP from 1985 to 1995, until the project was continued and released as open source by Google in 2005.

OCR (Contd.)

 Tesseract is a command-line tool that accepts a TIFF image as input.
 Tesseract is a raw OCR engine that focuses primarily on character-recognition accuracy.
 It is open source, compiles easily on any platform, and performs all that is necessary for TranslatAR's purpose.
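Because Tesseract is driven from the command line, integrating it amounts to spawning a process. A sketch of the invocation (the actual call is commented out so the example does not require the binary; the output base name and language code are illustrative choices):

```python
import subprocess

def ocr_word(image_path, out_base="ocr_out", lang="eng"):
    """Build the argument list for the Tesseract CLI. The classic
    interface is `tesseract <image> <outbase>`, which writes the
    recognized text to <outbase>.txt; `-l` selects the language pack."""
    cmd = ["tesseract", image_path, out_base, "-l", lang]
    # subprocess.run(cmd, check=True)                # would invoke the engine
    # text = open(out_base + ".txt").read().strip()  # and read the result
    return cmd
```

On the N900 the same invocation runs in the background thread so the tracker keeps its interactive frame rate.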

2.3 DICTIONARY LOOKUP

 The system searches a dictionary of valid words to identify the nearest neighbor with respect to the Levenshtein distance.
 The Levenshtein distance to the found string is computed for each dictionary word whose length is within 2 of the length of the found string.
 The word with the smallest distance is taken as the replacement for the original string returned by the OCR.
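The lookup described above can be sketched with the standard dynamic-programming edit distance; the dictionary contents and the OCR string below are illustrative:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def correct(word, dictionary):
    """Replace the OCR output with the nearest dictionary word,
    considering only candidates whose length is within 2 of it."""
    candidates = [w for w in dictionary if abs(len(w) - len(word)) <= 2]
    return min(candidates, key=lambda w: levenshtein(word, w))
```

The length filter keeps the search cheap enough for a phone: most dictionary entries are rejected before any distance is computed.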

2.4 TRANSLATION

 The system uses Google Translate, an existing free online translation service, for the actual text-to-text translation.
 The input language is detected automatically by Google Translate.
 The desired output language can be selected by the user in the GUI.
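The request to the online service reduces to a simple HTTP GET. A hedged sketch of its shape; the endpoint below is a placeholder, not Google Translate's real URL or parameter set, and the source language is omitted to mirror the server-side auto-detection described above:

```python
from urllib.parse import urlencode

def build_translate_url(text, target, base="https://translate.example/api"):
    """Assemble the query string for a hypothetical translation
    endpoint: the recognized word and the user-selected target
    language, with source-language detection left to the server."""
    query = urlencode({"q": text, "target": target})
    return base + "?" + query
```

In the actual system this request is issued with the curl library from the background thread, so the UI never blocks on the network.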


3. VISUAL TRACKING

 Visual tracking enables the system to keep track of the word of interest in the live video stream and to present the translation as a live, AR-style overlay.
 Several circumstances make tracking in our application easier than in the general case:
1) We may assume that the text is displayed on a near-planar surface.
2) As the region of interest consists of text, it is automatically well-textured and contains high-contrast features, which is important for tracking.
3) We are only interested in tracking over short periods of time (as long as it takes the system to obtain the translation and the user to read it).
4) We can assume a cooperative user who will not move the phone jerkily.
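Those favourable conditions mean even a very simple tracker works over short spans. A translation-only sketch (the real system estimates a full perspective transformation; the box size and search radius here are assumptions):

```python
def track_translation(prev, cur, box, search=2):
    """Minimal translation-only tracker: slide the region
    `box` = (x, y, w, h) from the previous frame over the current
    frame within +/- `search` pixels and return the shift with the
    smallest sum of squared differences. High-contrast text makes
    this minimum sharp and easy to find."""
    x, y, w, h = box

    def ssd(dx, dy):
        return sum((prev[y + j][x + i] - cur[y + dy + j][x + dx + i]) ** 2
                   for j in range(h) for i in range(w))

    shifts = [(dx, dy) for dx in range(-search, search + 1)
                       for dy in range(-search, search + 1)]
    return min(shifts, key=lambda s: ssd(*s))
```

Because tracking only has to survive a few seconds of gentle motion, such a small search window is usually sufficient.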


4. AR OVERLAY

 Based on the transformation computed by the tracker, a graphical augmentation is rendered onto the live video screen.


 First, a placeholder (please wait...) is displayed while the text is being translated.


 Then, as soon as it becomes available, the translation itself is displayed.
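Placing the overlay amounts to projecting the corners of the text quadrilateral through the tracker's 3x3 transformation before handing them to the renderer. A minimal sketch of that projection (homogeneous coordinates with a perspective divide):

```python
def project(H, pt):
    """Apply a 3x3 homography (row-major nested lists) to a 2D
    point, as done for each corner of the text quad before the
    augmentation is drawn over the live video."""
    x, y = pt
    xs = H[0][0] * x + H[0][1] * y + H[0][2]
    ys = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xs / w, ys / w  # perspective divide
```

With the four projected corners, the renderer (OpenGL ES 2 in TranslatAR) draws the translated word so it appears attached to the original surface.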


FIG 2


IMPLEMENTATION DETAILS

 The system was implemented on the Nokia N900, which runs the Linux-based OS Maemo.
 The code was developed in C++.
 The graphical augmentation was implemented in OpenGL ES 2.
 HTTP requests to and responses from Google's online translation service are handled with the curl library.

ADVANTAGES

 Since it is implemented on a Linux platform, TranslatAR is easy for other developers to extend and improve compared with systems on other platforms.
 The application presents a comparatively simple tracking case.
 Processing is done on the mobile device itself, providing immediate feedback.
 No peripheral devices are needed to translate the text.

DISADVANTAGES

 Lack of control over the viewfinder's focus currently limits TranslatAR to relatively large fonts.
 The frame rate needs to be significantly improved; the most obvious route is hardware acceleration for many of the image processing and rendering tasks.
 Real-world imaging problems such as illumination variance, glare, and dirt cause significant problems for both text detection and the Tesseract OCR module.

FUTURE ENHANCEMENTS

 Translation of text in all font formats.
 Translation of text at faster frame rates.
 Integration of a spell checker before translation, which would absorb some of the OCR errors.
 Text-to-voice translation and vice versa.
 Translation of text without the use of online services.
 Implementation of similar systems on mobile devices other than smartphones.

CONCLUSION
In this work, we presented a prototype for a real-time, mobile, visual translation system which requires only a single tap on the word of interest and presents its result as a live AR rendering. The application overlays an automatically translated text on a region of interest that is extracted and tracked in the camera's video stream.

REFERENCES

 T. N. Dinh, J. Park, and G. Lee. Low-complexity text extraction in Korean signboards for mobile applications. In Proc. 8th IEEE Intl. Conf. on Computer and Information Technology (CIT 2008), pages 333-337, 2008.
 B. Epshtein, E. Ofek, and Y. Wexler. Detecting text in natural scenes with stroke width transform. In Proc. IEEE CVPR 2010, July 2010.
 http://www.wikipedia.org/
 http://ieeexplore.ieee.org
 http://ilab.cs.ucsb.edu/index.php/component/content/article/10/14

THANK YOU
