
Optical Character Recognition using ANN

CHAPTER 1 INTRODUCTION
1.1 Problem Definition
Almost everybody carries a mobile phone, and it can be used for far more than just communicating with each other. Today, when an image file containing text is received in the form of a fax or a scanned book, one needs to retype the whole document and then save it in whatever format he or she wants. For example, take a real-time scenario: a boss is travelling and the secretary finds some data that needs the boss's approval. She can scan the document and send it to the boss; our software can receive it and convert it from an image to editable text, which the boss can edit and send back to the secretary as an image that she can print or use for a presentation.

1.2 Objective
This Project aims at developing software that can convert an image file to an editable text file using the technology Artificial Neural Networks.

1.3 Project Introduction


The use of an artificial neural network in Optical Character Recognition (OCR) applications can dramatically simplify the code and improve the quality of recognition while achieving good performance. Artificial Neural Networks (ANNs) are a new approach that solves problems in a different way from traditional computing methods. When an image file containing text has to be converted to a document, the entire document would otherwise have to be retyped to bring it to the format of choice.


The existing system can convert an image to editable text. Many people today are trying to write their own OCR (Optical Character Recognition) system or to improve the quality of an existing one. This project shows how the use of an artificial neural network simplifies the development of an optical character recognition application while achieving high recognition quality and good performance. Artificial Neural Networks, usually abbreviated to ANNs, are a recent development tool modeled on biological neural networks. The strength of this tool is its ability to solve problems that are very hard for traditional computing methods (e.g. conventional algorithms). This work briefly explains artificial neural networks and their applications, describing how to implement a simple ANN for character recognition.

Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files, to computerize a record-keeping system in an office, or to publish text on a website. OCR makes it possible to edit the text, search for a word or phrase, store it more compactly, display or print a copy free of scanning artifacts, and apply techniques such as machine translation, text-to-speech and text mining to it. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

Developing a proprietary OCR system is a complicated task and requires a lot of effort. Such systems are usually very complicated and can hide a lot of logic behind the code. The use of an artificial neural network in OCR applications can dramatically simplify the code and improve the quality of recognition while achieving good performance. Another benefit of using a neural network in OCR is extensibility: the system's ability to recognize more character sets than initially defined. Most traditional OCR systems are not extensible enough, because a task such as working with tens of thousands of Chinese characters is not as easy as working with a 68-character English typed character set, and it can easily bring a traditional system to its knees.

1.4 Existing System


A system has the capacity to take an image as input, but it cannot make any alteration to the image. We can scan a document and it will be saved as an image, and in the image we cannot make any alteration. So if we need to make changes to a document that has arrived as an image, we need to retype the whole document with the changes manually. This is a very tedious and time-consuming task. In a real-time scenario, for example, if a manager mails a document as an image to his assistant from a faraway place for editing, and the assistant sits and retypes the whole document with the changes, it takes a lot of time. This application of image processing is developed to avoid such situations.

The machine replication of human reading has been the subject of intensive research for more than three decades. A large number of research papers and reports have already been published on this topic. Many commercial establishments have manufactured recognizers of varying capabilities. Handheld, desk-top, medium-size and large systems costing as much as half a million dollars are available, and are in use for various applications. However, the ultimate goal of developing a reading machine with the same reading capabilities as humans still remains unachieved. There is still a great gap between human and machine reading capabilities, and a great amount of further effort is required to narrow this gap, if not bridge it.

Today's systems use traditional algorithms to accomplish optical character recognition tasks. In these algorithms the steps of execution, as well as the complete input set, must be known by the programmer. This is clearly very difficult, if not impossible. Further, these traditional algorithms are not flexible enough to handle unanticipated inputs; if the algorithm does encounter such a state, the system comes crashing down.

1.5 Proposed System


Artificial neural networks provide a simple and effective solution. An artificial neural network (ANN) is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. Our application uses the Neuro.NET library to show how to use a back propagation neural network in a simple OCR application. The optical character recognition using ANN is designed to convert an image document into text, so that we can edit the information present in the image document and, after editing, send it back to the user as an image document, which is achieved by our software. The use of an artificial neural network simplifies the development of an optical character recognition application while achieving high recognition quality and good performance, and it can dramatically simplify the code. Another benefit of using a neural network in OCR is extensibility: the system's ability to recognize more character sets than initially defined. It is highly portable and overcomes the need to carry around a fax machine. We can make the changes we want in the image by converting it into a text file; we need not retype the whole document manually to make the changes. It saves time and reduces manual work.

1.5.1 Creating the neural network.


The network has to be constructed first, using a back propagation neural network. The back propagation network is a multilayer perceptron model with an input layer, one or more hidden layers, and an output layer. The nodes in the back propagation neural network are interconnected via weighted links, with each node usually connecting to the next layer up, until the output layer, which provides the output of the network. The input pattern values are presented and assigned to the input nodes of the input layer. The input values are initialized to values between -1 and 1. The nodes in the next layer receive the input values through links and compute output values of their own, which are then passed to the next layer. These values propagate forward through the layers until the output layer is reached, that is, until each output layer node has produced an output value for the network. The desired output for the input pattern is used to compute an error value for each node in the output layer, which is then propagated backwards (and here is where the network's name comes in) through the network as the delta rule is used to adjust the link weights toward the desired output.


Once the error produced by the patterns in the training set is below a given tolerance, the training is complete and the network is presented new input patterns and produces an output based on the experience it gained from the learning process.

1.5.2 Network Formation


The input layer consists of 150 neurons, which receive pixel binary data from a 10x15 symbol pixel matrix. The size of this matrix was decided taking into consideration the average height and width of a character image that can be mapped without introducing any significant pixel noise. The hidden layer consists of 250 neurons, a number decided on the basis of optimal results on a trial and error basis.
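The forward pass, backward error propagation and delta-rule weight updates described above can be sketched in Python. This is only an illustration, not the project's actual code: the project uses a 150-input, 250-hidden-neuron network built with the Neuro.NET library, while this sketch uses tiny layer sizes and the XOR problem as a stand-in training set.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class BackpropNet:
    """Minimal multilayer perceptron trained by back propagation.
    The project's network would use n_in=150 and n_hidden=250."""

    def __init__(self, n_in, n_hidden, n_out):
        rnd = lambda: random.uniform(-1.0, 1.0)
        self.w1 = [[rnd() for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rnd() for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        # Values propagate forward, layer by layer, to the output layer.
        self.h = [sigmoid(sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        self.o = [sigmoid(sum(w * v for w, v in zip(row, self.h)) + b)
                  for row, b in zip(self.w2, self.b2)]
        return self.o

    def train_step(self, x, target, lr=0.5):
        o = self.forward(x)
        # Output-layer error, then deltas propagated backwards (delta rule).
        d_out = [(t - v) * v * (1.0 - v) for t, v in zip(target, o)]
        d_hid = [v * (1.0 - v) * sum(d * self.w2[k][j] for k, d in enumerate(d_out))
                 for j, v in enumerate(self.h)]
        for k, d in enumerate(d_out):
            for j, hv in enumerate(self.h):
                self.w2[k][j] += lr * d * hv
            self.b2[k] += lr * d
        for j, d in enumerate(d_hid):
            for i, xv in enumerate(x):
                self.w1[j][i] += lr * d * xv
            self.b1[j] += lr * d
        return sum((t - v) ** 2 for t, v in zip(target, o))

# Train until the summed squared error drops below a tolerance.
net = BackpropNet(2, 4, 1)
patterns = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
errors = []
for epoch in range(3000):
    errors.append(sum(net.train_step(x, t) for x, t in patterns))
    if errors[-1] < 0.05:
        break
```

Once the error is below the tolerance, training stops and `forward()` can be called on new patterns, exactly as described for the full-size character network.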

Figure 1.1 Network Formations


1.5.2.1 Creating Training Patterns


Now training patterns have to be constructed so that these patterns can be used to teach the neural network to recognize the images. Basically, each training pattern consists of two single-dimensional arrays of float numbers: the Inputs and Outputs arrays. The Inputs array contains your input data, in this case a digitized representation of the character's image. Digitizing the image means creating a brightness map of it: the image is split into squares and the average value of each square is calculated. These values are then stored in the array (see Figure 1.2).

Figure 1.2 Pattern Training Creation
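The digitizing step described above can be sketched as follows. The grid size and the representation of the image as a 2D list of pixel brightness values are assumptions for illustration only.

```python
def digitize(image, grid_w, grid_h):
    """Create a brightness map: split the image into grid_h x grid_w
    rectangles and record each rectangle's average pixel brightness.
    The flattened map becomes the Inputs array of a training pattern."""
    h, w = len(image), len(image[0])
    inputs = []
    for gy in range(grid_h):
        for gx in range(grid_w):
            y0 = gy * h // grid_h
            y1 = max((gy + 1) * h // grid_h, y0 + 1)
            x0 = gx * w // grid_w
            x1 = max((gx + 1) * w // grid_w, x0 + 1)
            block = [image[y][x] for y in range(y0, min(y1, h))
                                 for x in range(x0, min(x1, w))]
            inputs.append(sum(block) / len(block))
    return inputs

# A 2x4 image: the left half black (0), the right half white (255).
image = [[0, 0, 255, 255],
         [0, 0, 255, 255]]
brightness_map = digitize(image, grid_w=2, grid_h=1)
```

The Outputs array paired with this Inputs array would then hold the code identifying the character, e.g. a one-hot vector.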


1.5.2.2 Training of the network.


Train the network by presenting the training patterns to it until the error falls to an acceptable value.

1.5.2.3 Symbol Image Detection


The next step is to map the symbol image into a corresponding two-dimensional binary matrix. An important issue to consider here is deciding the size of the matrix. If all the pixels of the symbol were mapped into the matrix, one would definitely be able to acquire all the distinguishing pixel features of the symbol and minimize overlap with other symbols. However, this strategy would imply maintaining and processing a very large matrix (up to 15,000 elements for a 100x150 pixel image). Hence a reasonable trade-off is needed in order to minimize processing time without significantly affecting the separability of the patterns. The project employed a sampling strategy which maps the symbol image into a 10x15 binary matrix with only 150 elements. Since the height and width of individual images vary, an adaptive sampling algorithm was implemented.
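One way such an adaptive sampling step could look is sketched below, assuming the symbol is a 2D list of 0/1 ink values; the exact sampling rule the project uses is not specified in this report, so the "any ink in the region" rule here is an assumption.

```python
def map_to_matrix(symbol, cols=10, rows=15):
    """Adaptively sample a symbol image of any size into a rows x cols
    binary matrix: a cell is 1 if any ink falls in its sampled region."""
    h, w = len(symbol), len(symbol[0])
    matrix = []
    for r in range(rows):
        row = []
        for c in range(cols):
            y0 = r * h // rows
            y1 = max((r + 1) * h // rows, y0 + 1)
            x0 = c * w // cols
            x1 = max((c + 1) * w // cols, x0 + 1)
            ink = any(symbol[y][x] for y in range(y0, min(y1, h))
                                   for x in range(x0, min(x1, w)))
            row.append(1 if ink else 0)
        matrix.append(row)
    return matrix

# A 30x20 symbol image that is entirely ink maps to an all-ones matrix.
matrix = map_to_matrix([[1] * 20 for _ in range(30)])
```

Because the cell boundaries are computed from the symbol's own height and width, the same 10x15 matrix shape is produced whether the source image is larger or smaller than the grid.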

1.6 Literature Survey


The Artificial Neural Network (ANN) is a wonderful tool that can help to resolve such problems. The ANN is an information-processing paradigm inspired by the way the human brain processes information. Artificial neural networks are collections of mathematical models that represent some of the observed properties of biological nervous systems and draw on the analogies of adaptive biological learning. The key element of an ANN is its topology.


The ANN consists of a large number of highly interconnected processing elements (nodes) that are tied together with weighted connections (links) (figure 1.3). Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true for ANNs as well. Learning typically occurs by example, through training or exposure to a set of input/output data (patterns), where the training algorithm adjusts the link weights. The link weights store the knowledge necessary to solve specific problems.

Figure 1.3 Nodes and links connections

Originating in the late 1950s, neural networks didn't gain much popularity until the 1980s, a computer boom era. Today ANNs are mostly used for complex real-world problems. They are often good at solving problems that are too complex for conventional technologies (e.g., problems that do not have an algorithmic solution, or for which an algorithmic solution is too complex to be found) and are often well suited to problems that people are good at solving but for which traditional methods are not. They are good pattern recognition engines and robust classifiers, with the ability to generalize in making decisions based on imprecise input data. They offer ideal solutions to a variety of classification problems such as speech, character and signal recognition, as well as functional prediction and system modelling where the physical processes are not understood or are highly complex. The advantage of ANNs lies in their resilience against distortions in the input data and their capability to learn.

1.6.1 Artificial Neural Network (ANN) Introduction


Artificial Neural Networks (ANNs) are a new approach that solves problems in a different way from traditional computing methods. ANNs are, in some way, much more powerful because they can solve problems that we do not exactly know how to solve. That is why, of late, their usage is spreading over a wide range of areas including virus detection, robot control, intrusion detection systems, pattern (image, fingerprint, noise) recognition and so on. One such artificial neural network was a three-layer back-propagation network: its input layer had 17 nodes (each of which corresponded to one feature of the extracted feature vector), and its hidden layer had 12 nodes connected to a single output node. Artificial neural networks are a good means of machine learning. They help the system learn from historical data; once the network has learned, it is used for testing purposes. The neural network used in this project is especially useful for classification problems, where we create different classes of possible outputs and the net result is the cumulative result of these classes.

1.6.2 Back Propagation used in ANN


A back propagation ANN contains one or more layers, each of which is linked to the next. The first layer is called the "input layer", which receives the initial input (e.g. pixels from a letter), and the last is the "output layer", which usually holds the input's identifier (e.g. the name of the input letter). The layers between the input and output layers are called "hidden layer(s)"; each hidden layer propagates the previous layer's outputs forward to the next layer and propagates the following layer's error back to the previous layer. These are the main operations of training a back propagation ANN, which follows a few steps.


Figure 1.4 Back Propagation ANN

A typical back propagation ANN is depicted above (figure 1.4). The black nodes (on the extreme left) are the initial inputs. Training such a network involves two phases. In the first phase, the inputs are propagated forward to compute the output of each output node. Then each of these outputs is subtracted from its desired output, giving an error for each output node. In the second phase, each of these output errors is passed backward and the weights are adjusted. These two phases are repeated until the sum of squared output errors reaches an acceptable value. Instead of training a single network to recognize multiple fonts, the network could have been implemented as a bank of single-font networks. However, this approach was not chosen because the individual networks would not be able to benefit from associating the "correct" character of a different font with the "correct" character of their own font (and similarly for wrong characters). Creating a single network that can successfully recognize any of the fonts increases the redundancy, durability, and complexity of the network. The input layer contains an astronomical 2500 neurons: since the input images consist of an n x m matrix of pixels (50x50), the feature vector consists of 2500 elements. This vector is fed directly into the neural network.


The input layer then connects to a hidden layer consisting of 100 neurons. The hidden layer then connects to an output layer consisting of 94 neurons, each of which corresponds to a given character class. An artificial neuron is a device with many inputs and one output.

Figure 1.5 Artificial Neuron

The firing rule is an important concept in neural networks and accounts for their high flexibility. A firing rule determines how one calculates whether a neuron should fire for any input pattern. It relates to all the input patterns, not only the ones on which the node was trained. Take a collection of training patterns for a node, some of which cause it to fire (the 1-taught set of patterns) and others which prevent it from doing so (the 0-taught set). A pattern not in the collection causes the node to fire if, on comparison, it has more input elements in common with the 'nearest' pattern in the 1-taught set than with the 'nearest' pattern in the 0-taught set. If there is a tie, the pattern remains in the undefined state. For example, a 3-input neuron is taught to output 1 when the input (X1, X2 and X3) is 111 or 101, and to output 0 when the input is 000 or 001.


Then, before applying the firing rule, the truth table is:

X1:  0   0   0   0   1   1   1   1
X2:  0   0   1   1   0   0   1   1
X3:  0   1   0   1   0   1   0   1
OUT: 0   0   0/1 0/1 0/1 1   0/1 1

Table 1.1: Truth table before applying the firing rule

As an example of the way the firing rule is applied, take the pattern 010. It differs from 000 in 1 element, from 001 in 2 elements, from 101 in 3 elements and from 111 in 2 elements. Therefore, the 'nearest' pattern is 000, which belongs to the 0-taught set. Thus the firing rule requires that the neuron should not fire when the input is 010. On the other hand, 011 is equally distant from two taught patterns that have different outputs, and so its output stays undefined (0/1). By applying the firing rule to every column, the following truth table is obtained:

X1:  0   0   0   0   1   1   1   1
X2:  0   0   1   1   0   0   1   1
X3:  0   1   0   1   0   1   0   1
OUT: 0   0   0   0/1 0/1 1   1   1

Table 1.2: Truth table after applying the firing rule
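The nearest-pattern comparison behind these tables can be written directly as code. This sketch measures Hamming distance over bit strings, firing when the pattern is closer to a 1-taught pattern than to any 0-taught one:

```python
def fires(pattern, one_taught, zero_taught):
    """Firing rule: fire (1) if the pattern is closer, in Hamming
    distance, to the nearest 1-taught pattern than to the nearest
    0-taught one; return None when the distances tie (undefined)."""
    dist = lambda a, b: sum(x != y for x, y in zip(a, b))
    d1 = min(dist(pattern, p) for p in one_taught)
    d0 = min(dist(pattern, p) for p in zero_taught)
    if d1 < d0:
        return 1
    if d0 < d1:
        return 0
    return None  # undefined state, shown as 0/1 in the tables

one_taught = ["111", "101"]   # inputs taught to fire
zero_taught = ["000", "001"]  # inputs taught not to fire
```

Applied to 010 this returns 0, to 110 it returns 1, and to 011 it returns the undefined state, reproducing the entries of Table 1.2.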



Figure 1.6 Network

For example, the network of figure 1.6 is trained to recognize the patterns T and H. The associated output patterns are all black and all white respectively, as shown below:

Figure 1.7 Pattern T and H recognition

If we represent black squares with 0 and white squares with 1, then the truth tables for the 3 neurons after generalization are as follows:

14

Optical Character Recognition using ANN

X11: 0 0 0 0 1 1 1 1
X12: 0 0 1 1 0 0 1 1
X13: 0 1 0 1 0 1 0 1
OUT:

Table 1.3: Top neuron


X21: 0 0 0 0 1 1 1 1
X22: 0 0 1 1 0 0 1 1
X23: 0 1 0 1 0 1 0 1
OUT: 0/1 0/1 0/1 0/1

Table 1.4: Middle neuron


X31: 0 0 0 0 1 1 1 1
X32: 0 0 1 1 0 0 1 1
X33: 0 1 0 1 0 1 0 1
OUT:

Table 1.5: Bottom neuron


From the tables, the following associations can be extracted:

Figure 1.8 Associations from table

In this case, it is obvious that the output should be all black, since the input pattern is almost the same as the 'T' pattern.

Figure 1.9 T Pattern conclusion

Here also, it is obvious that the output should be all white, since the input pattern is almost the same as the 'H' pattern.

Figure 1.10 H Pattern conclusion

Here, the top row is 2 errors away from the T and 3 from an H, so the top output is black. The middle row is 1 error away from both T and H, so the output is random. The bottom row is 1 error away from T and 2 away from H.


Therefore the output is black. The total output of the network is still in favour of the T shape.

1.7 Purpose
Time efficient.
Highly portable: the software works on almost all systems.
Developed to meet the day-by-day increasing demand for emerging technologies.
Minimal hardware cost is involved.

1.8 Scope
Fulfils the requirements of BPOs.
The optical character recognition provided can be used for various applications such as image recognition.
In future it can be applied to voice recognition applications to produce further development and enhancement in voice recognition.

1.9 Limitations
It is limited to black and white font color only.
It is constructed for only a limited set of fonts.


1.10 Deliverables
The optical character recognition using ANN is designed to convert an image document into text, so that we can edit the information present in the image document and, after editing, send it back to the user as an image document. A massively parallel distributed system that has a natural propensity for storing experiential knowledge and making it available for use, together with highly time-efficient software, is to be delivered.


CHAPTER 2 SYSTEM ANALYSIS


System analysis is the analysis of the role of the proposed system and the identification of the requirements that it should meet. It is the starting point for system design. The term is most commonly used in the context of commercial programming, where software developers are often classed as either systems analysts or programmers. The systems analysts are responsible for identifying requirements and producing a design; the programmers are then responsible for implementing it.

MODULES
TRAINING A CHARACTER
CHARACTER RECOGNITION
CONVERSION TO IMAGE

2.1 Module 1: TRAINING A CHARACTER


In this module all possible character patterns of a particular language are trained, so that the software can later recognize characters easily by referring to them. First the general characters of a particular font style are trained. It works as follows:

2.1.1 Load Character Training Set
Input: .cts (character trainer set) file
Output: trains all the characters of a particular set

19

Optical Character Recognition using ANN

Description:
Takes the .cts file of the required font, which contains alphabets, numerals and special characters written in the particular font style.
Loads the corresponding image of the font style.
Trains all the characters of the image by identifying the lines; it basically trains the network.
Class used: Form1
Methods used: load_character_trainer_set()
Attributes: file_stream, TextField, Button, picturebox, bitmap image

After training all the characters of a language in general, the network is saved for that particular font style. It works as follows:

2.1.2 Save Network
Input: output of the .cts file
Output: particular network file
Description:
Creates a network file with a .ann extension.
It contains the weights of all characters present in the .cts file of the specific style.
This will be useful in identifying characters of the particular font style.
Class used: Form1
Methods used: save_network()
Attributes: file_stream, TextField, Button

2.2 Module 2: CHARACTER RECOGNITION


The user must first load the network, i.e. the character set the image file is composed of. It works as follows:

2.2.1 Load Network
Input: .ann (artificial neural network) file
Output: an array containing the weights of all characters of a particular font style
Description:
First forms the network and initializes the input layers and the input and output nodes.
Takes the .ann file of the font style the user is interested in.
An array containing the weights of all characters of that font style is generated.
Class used: Form1
Methods used: load_network(), form_network()
Attributes: file_stream, TextField, Button

After loading the network, the input, i.e. the image which is to be converted to text form, is loaded. It works as follows:

2.2.2 Load Image
Input: input image
Output: displays the input image in the picture box


Description:
Creates a new bitmap image and copies the input image to it.
Saves the filename, path, height and width of the input image in different variables.
Identifies the number of lines and stores it for further reference.
Measures the line top and line bottom of each line and stores them in the respective line-top and line-bottom arrays.
Displays the input image.
Class used: Form1
Methods used: load_image(), identify_lines()
Attributes: file_stream, TextField, Button, picturebox

Once the input image is loaded, the next step is to recognise each and every character in the input image file. This works as follows:

2.2.3 Next Character
Input: input image with the number of lines and their respective line-top and line-bottom values
Output: displays the character and its matrix mapping, and the corresponding character in the output string


Description:
Finds the character bounds, i.e. the character's top, bottom, right and left values, and also the character's height and width.
First extracts a single line from the input image and gets the character bounds of all the characters present in the line; the same procedure is repeated for all the lines present in the input image.
Once it has the character bounds of a single character, it records the character image's pixels in a matrix.
Creates a new bitmap image and copies the detected character onto this image using the character's pixel matrix, then displays the detected character.
Next maps this character onto the matrix using the pick-sampling-pixels method, and also stores these matrix values of the character in another array.
Compares this matrix-value array with the arrays of the standard characters stored in the network file, and displays the identified character in the output tab.
Repeats the same procedure for all characters in the input image.
Class used: Form1
Methods used: detect_next_character(), get_next_character(), analyze_image(), get_character_bounds(), map_character_image_pixel_matrix(), create_character_image(), map_ann_input_matrix(), calculate_outputs()
Attributes: file_stream, TextField, Button, picturebox
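The bounds-finding step can be sketched as follows, assuming a line image stored as a 2D list of 0/1 ink values; the project's get_character_bounds() works on bitmaps, so this is only an illustration of the idea that characters are runs of consecutive columns containing ink.

```python
def get_character_bounds(line):
    """Scan a binary line image column by column; each run of
    ink-bearing columns is one character.  Returns a list of
    (left, right, top, bottom) tuples."""
    h, w = len(line), len(line[0])
    has_ink = [any(line[y][x] for y in range(h)) for x in range(w)]
    bounds, x = [], 0
    while x < w:
        if has_ink[x]:
            left = x
            while x < w and has_ink[x]:
                x += 1
            right = x - 1
            # Character height: rows that contain ink within this run.
            ys = [y for y in range(h)
                  for xx in range(left, right + 1) if line[y][xx]]
            bounds.append((left, right, min(ys), max(ys)))
        else:
            x += 1
    return bounds

# A tiny line image with two ink blobs separated by blank columns.
line = [[0, 1, 0, 0, 1, 1],
        [0, 1, 0, 0, 0, 1]]
bounds = get_character_bounds(line)
```

Each tuple then delimits one character image, which can be copied out and passed to the adaptive sampling step.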


2.3 Module 3: CONVERSION TO IMAGE


After getting the text form of the input image, the image is edited as per the requirements, and then the edited file is converted back into an image file. This usually happens using a bit-streaming technique.
Class used: Form1
Methods used: t2bmp()
Attributes: button, bitmap image stream
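The serialization half of this step can be sketched as follows, assuming the edited text has already been rasterized into a binary pixel matrix. The internals of t2bmp() are not shown in this report, so the format details here (an uncompressed 24-bit BMP byte stream) are illustrative only.

```python
import struct

def matrix_to_bmp(matrix, path=None):
    """Serialize a binary pixel matrix into a 24-bit BMP byte stream
    (1 = black text, 0 = white background); a stand-in for t2bmp()."""
    h, w = len(matrix), len(matrix[0])
    row_size = (w * 3 + 3) // 4 * 4          # BMP rows pad to 4 bytes
    pixel_bytes = bytearray()
    for row in reversed(matrix):             # BMP stores rows bottom-up
        line = bytearray()
        for px in row:
            line += b"\x00\x00\x00" if px else b"\xff\xff\xff"
        line += b"\x00" * (row_size - len(line))
        pixel_bytes += line
    # BITMAPFILEHEADER (14 bytes) + BITMAPINFOHEADER (40 bytes)
    header = struct.pack("<2sIHHI", b"BM", 54 + len(pixel_bytes), 0, 0, 54)
    info = struct.pack("<IiiHHIIiiII", 40, w, h, 1, 24, 0,
                       len(pixel_bytes), 2835, 2835, 0, 0)
    data = header + info + pixel_bytes
    if path:
        with open(path, "wb") as f:
            f.write(data)
    return data

bmp_bytes = matrix_to_bmp([[1, 0], [0, 1]])
```

The resulting byte stream can be written to disk or sent back to the user directly, which is the sense in which the conversion is a bit-streaming operation.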

2.4 LIMITATION OF EXISTING SYSTEM


In the existing system there was no guarantee of handling noise appearing in the image file.

2.5 ADVANTAGES OF PROPOSED SYSTEM


It simplifies the code and improves the quality of recognition while achieving good performance. Another benefit is extensibility: the system's ability to recognize more character sets than initially defined. It is highly portable and overcomes the need to carry around a fax machine. We can make the changes we want in the image by converting it into a text file; we need not retype the whole document manually to make the changes. It saves time and reduces manual work.


CHAPTER 3 SYSTEM REQUIREMENTS SPECIFICATION


A System Requirements Specification (SRS) is a complete description of the behavior of the system to be developed. It includes a set of use cases that describe all the interactions the users will have with the software. Use cases are also known as functional requirements. Functional requirements are supported by non-functional requirements (also known as quality requirements), which impose constraints on the design or implementation (such as performance, security, or reliability requirements). How a system implements functional requirements is detailed in the system design. This requirements specification provides a complete description of all the functions and specifications of converting an image into text via OCR, using an ANN (Artificial Neural Network). The use of an artificial neural network in OCR applications can dramatically simplify the code and improve the quality of recognition while achieving good performance.

3.1 Functional requirements


In software engineering, a functional requirement defines a function of a software system or its component. A function is described as a set of inputs, the behavior, and outputs. Functional requirements may be calculations, technical details, data manipulation and processing, and other specific functionality that defines what a system is supposed to accomplish. Behavioral requirements, describing all the cases where the system uses the functional requirements, are captured in use cases. Functional requirements are those that refer to the functionality of the system, i.e., what services it will provide to the user. OCR converts an image into a text file, which involves:

Training
Input: image document
Action: a line is extracted; from the line a character is extracted; the extracted character is converted into a matrix, grayscaled, and converted into pixel values, and the input vector is prepared.
Output: binary values for the converted character.

Recognition
Input: converted input file
Action: the image file is loaded; the output is compared with values from the training set and converted into text output; the weights and errors are adjusted; it is checked whether the error is within the acceptable range and the result is matched with the image file.
Output: text file.

Editing


Input: text document
Output: edited text
In editing, files can be saved, fonts can be set, and alignment and color pick-up can be done.

Conversion
The edited document is converted back into an image file.
Input: edited document
Output: image document

3.2 Non-functional requirements


A system has properties that emerge from the combination of its parts. These emergent properties will surely be a matter of accident, not design, if the non-functional requirements, or system qualities, are not specified in advance. In general, functional requirements define what a system is supposed to do, whereas non-functional requirements define how a system is supposed to be. Non-functional requirements are often called the qualities of a system. These qualities can be divided into two main categories: execution qualities, such as security and usability, which are observable at run time; and evolution qualities, such as testability, maintainability, extensibility and scalability, which are embodied in the static structure of the software system.


3.2.1 Scalability

Scalability is a desirable property of a system, a network, or a process, which indicates its ability either to handle growing amounts of work gracefully or to be readily enlarged. For example, it can refer to the capability of a system to increase total throughput under an increased load when resources (typically hardware) are added. An analogous meaning is implied when the word is used in a commercial context, where scalability of a company implies that the underlying business model offers the potential for economic growth within the company.

3.2.2 Reliability

Reliability is the ability of a person or system to perform and maintain its functions in routine circumstances, as well as in hostile or unexpected circumstances.

3.2.3 Integrity

Integrity as a concept has to do with perceived consistency of actions, values, methods, measures, principles, expectations and outcomes. People use integrity as a holistic concept, judging the integrity of systems in terms of those systems' ability to achieve their own goals (if any). A value system's abstraction depth and range of applicable interaction may also function as significant factors in identifying integrity, due to their congruence or lack of congruence with empirical observation. A value system may evolve over time while retaining integrity if those who espouse the values account for and resolve inconsistencies. There are also requirements that are not functional in nature; specifically, these are the constraints the system must work within.


Extensibility
The system should be extensible, with the ability to recognize more character sets than initially defined.

Robustness
It should have good pattern recognition engines and robust classifiers, with the ability to generalize in making decisions based on imprecise input data.

Scalability
The application should be scalable in the sense that a new service can be added without affecting the available services.

3.3 Hardware Requirements


Processor      : Any processor above 500 MHz
RAM            : 1 MB
Hard Disk      : 10 GB
Compact Disk   : 650 MB
Input devices  : Standard keyboard and mouse
Output device  : VGA and high-resolution monitor

3.4 Software Requirements

Operating System : Windows family
Technique        : .NET Framework 3.5
Front End        : Visual C#


CHAPTER 4 SYSTEM DESIGN


It is a process of problem-solving and planning for a software solution. After the purpose and specifications of the software are determined, software developers will design or employ designers to develop a plan for a solution. It includes low-level component and algorithm implementation issues as well as the architectural view. The software requirements analysis (SRA) step of a software development process yields specifications that are used in software engineering. If the software is "semi-automated" or user-centered, software design may involve user experience design, yielding a storyboard to help determine those specifications. If the software is completely automated (meaning no user or user interface), a software design may be as simple as a flow chart or text describing a planned sequence of events.

4.1 Class Diagram


In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of static structure diagram that describes the structure of a system by showing the system's classes, their attributes, and the relationships between the classes. Class diagrams are widely used to describe the types of objects in a system and their relationships. Class diagrams model class structure and contents using design elements such as classes, packages and objects. Class diagrams describe three different perspectives when designing a system: conceptual, specification, and implementation.


These perspectives become evident as the diagram is created and help solidify the design.

4.2 Use case Diagram


A use case in software engineering and systems engineering is a description of a system's behavior as it responds to a request that originates from outside of that system. In other words, a use case describes "who" can do "what" with the system in question. The use case technique is used to capture a system's behavioral requirements by detailing scenario-driven threads through the functional requirements. Use cases describe the interaction between one or more actors (an actor that is the initiator of the interaction may be referred to as the "primary actor") and the system itself, represented as a sequence of simple steps.

Figure 4.1 Use case Diagram


4.3 Sequence Diagram


A sequence diagram in the Unified Modeling Language (UML) is a kind of interaction diagram that shows how processes operate with one another and in what order. It is a construct of a Message Sequence Chart. Sequence diagrams are sometimes called event-trace diagrams, event scenarios, or timing diagrams. A sequence diagram shows, as parallel vertical lines ("lifelines"), different processes or objects that live simultaneously, and, as horizontal arrows, the messages exchanged between them, in the order in which they occur. This allows the specification of simple runtime scenarios in a graphical manner.

Figure 4.2 Sequence Diagram


4.4 Data Flow Diagram


A data-flow diagram (DFD) is a graphical representation of the "flow" of data through an information system. DFDs can also be used for the visualization of data processing (structured design). On a DFD, data items flow from an external data source or an internal data store to an internal data store or an external data sink, via an internal process. A DFD provides no information about the timing or ordering of processes, or about whether processes will operate in sequence or in parallel. It is therefore quite different from a flowchart, which shows the flow of control through an algorithm, allowing a reader to determine what operations will be performed, in what order, and under what circumstances, but not what kinds of data will be input to and output from the system, where the data will come from and go to, or where the data will be stored.

Figure 4.3 Data Flow Diagram for Training a Network


Figure 4.4 Data Flow Diagram for Character Recognition


CHAPTER 5

SYSTEM IMPLEMENTATION
5.1 High Level Algorithm

5.1.1 Detecting character lines algorithm

1. Start at the first x and first y pixel of the image, pixel(0,0), and set the number of lines to 0.
2. Scan up to the width of the image on the same y-component:
   a. If a black pixel is detected, register y as the top of the first line.
   b. If not, continue to the next pixel.
   c. If no black pixel is found up to the width, increment y and reset x to scan the next horizontal line.
3. Start at the top of the line found and the first x-component, pixel(0, line_top).
4. Scan up to the width of the image on the same y-component:
   a. If no black pixel is detected, register y-1 as the bottom of the first line and increment the number of lines.
   b. If a black pixel is detected, increment y and reset x to scan the next horizontal line.
5. Start below the bottom of the last line found and repeat steps 1-4 to detect subsequent lines.
6. Stop when the bottom of the image (the image height) is reached.
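The steps above amount to sweeping pixel rows for runs of black ink. A compact sketch, assuming the page is held as a 2-D array of 0/1 pixel values (1 = black); this is Python with illustrative names rather than the project's C# implementation:

```python
# Minimal sketch of the line-detection algorithm above, assuming the page
# image is a 2-D list of 0/1 values (1 = black ink, 0 = white background).
# Function and variable names are illustrative, not taken from the project.

def detect_lines(image):
    """Return (top, bottom) row pairs for each text line found."""
    height = len(image)
    width = len(image[0])
    lines = []
    y = 0
    while y < height:
        # Steps 1-2: scan rows until one containing a black pixel is found.
        while y < height and not any(image[y][x] for x in range(width)):
            y += 1
        if y == height:
            break                      # step 6: bottom of image reached
        top = y
        # Steps 3-4: scan rows until an all-white row marks the line bottom.
        while y < height and any(image[y][x] for x in range(width)):
            y += 1
        lines.append((top, y - 1))     # y-1 is the last row with ink
    return lines

# Two "lines" of ink separated by a blank row:
page = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0],
        [0, 1, 1]]
print(detect_lines(page))   # [(1, 1), (3, 3)]
```

The project's `identify_lines` method implements the same sweep by string-comparing `GetPixel` colors against pure white and pure black.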

5.1.2 Detecting individual symbol algorithm


1. Start at the top of the current character line and the first x-component.
2. Scan up to the width of the image on the same y-component:
   a. If a black pixel is detected, register y as the top of the character.
   b. If not, continue to the next pixel.

3. Start at the top of the character found and the first x-component, pixel(0, character_top).
4. Scan up to the line bottom on the same x-component:
   a. If a black pixel is detected, register x as the left of the symbol.
   b. If not, continue to the next pixel.
   c. If no black pixel is detected, increment x and reset y to scan the next vertical line.
5. Start at the left of the symbol found and the top of the current line, pixel(character_left, line_top).
6. Scan up to the width of the image on the same x-component:
   a. If no black pixel is found, register x-1 as the right of the symbol.
   b. If a black pixel is found, increment x and reset y to scan the next vertical line.
7. Start at the bottom of the current line and the left of the symbol, pixel(character_left, line_bottom).
8. Scan up to the right of the character on the same y-component:
   a. If a black pixel is found, register y as the bottom of the character.
   b. If no black pixel is found, decrement y and reset x to scan the next horizontal line.
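The symbol-bounding sweep above can be sketched for one character as follows, again assuming a 2-D list of 0/1 pixels (1 = black) for the current line; `start_x` marks where the previous character ended, and all names are illustrative rather than taken from the project code:

```python
# Sketch of the symbol-bounding steps above for one character, assuming the
# line's pixels are a 2-D list of 0/1 values (1 = black). Names illustrative.

def find_character_bounds(line_pixels, start_x):
    """Return (left, right, top, bottom) of the next symbol, or None."""
    height = len(line_pixels)
    width = len(line_pixels[0])
    # Steps 3-4: sweep columns rightward until one contains ink -> left edge.
    left = next((x for x in range(start_x, width)
                 if any(line_pixels[y][x] for y in range(height))), None)
    if left is None:
        return None                     # no further character on this line
    # Steps 5-6: keep sweeping until an all-white column -> right edge.
    right = left
    while right + 1 < width and any(line_pixels[y][right + 1]
                                    for y in range(height)):
        right += 1
    # Steps 1-2 and 7-8: first and last rows with ink inside those columns.
    rows = [y for y in range(height)
            if any(line_pixels[y][x] for x in range(left, right + 1))]
    return left, right, rows[0], rows[-1]

line = [[0, 1, 1, 0, 0],
        [0, 1, 0, 0, 1]]
print(find_character_bounds(line, 0))   # (1, 2, 0, 1)
```

Calling the function again with `start_x` just past the returned right edge yields the next symbol, which is how the project's `get_next_character` advances along a line.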

Figure 5.1 Detecting Lines and characters


5.1.3 Symbol Image Matrix Mapping algorithm


1. For the width (initially 20 elements wide):
   a. Map the first (0, y) and last (width, y) pixel components directly to the first (0, y) and last (20, y) elements of the matrix.
   b. Map the middle pixel component (width/2, y) to the 10th matrix element.
   c. Subdivide further divisions and map them accordingly to the matrix.
2. For the height (initially 30 elements high):
   a. Map the first (x, 0) and last (x, height) pixel components directly to the first (x, 0) and last (x, 30) elements of the matrix.
   b. Map the middle pixel component (x, height/2) to the 15th matrix element.
   c. Subdivide further divisions and map them accordingly to the matrix.
3. Further reduce the matrix to 10x15 by sampling by a factor of 2 on both the width and the height.
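The midpoint-subdivision mapping above amounts to proportional resampling of a variable-size character image onto a fixed grid. A minimal sketch of that idea (Python with illustrative names; the project's C# `pick_sampling_pixels` chooses its sampling pixels by repeated midpoint subdivision rather than the direct proportional formula used here):

```python
# Sketch of reducing a variable-size character image to the fixed 10x15 input
# matrix described above, assuming a 2-D list of 0/1 pixels. This uses direct
# proportional sampling, an approximation of the midpoint-subdivision scheme.

def to_input_vector(char_pixels, out_w=10, out_h=15):
    """Sample char_pixels down (or up) to out_w x out_h, row-major flattened."""
    in_h = len(char_pixels)
    in_w = len(char_pixels[0])
    vector = []
    for j in range(out_h):
        y = j * in_h // out_h          # source row for target row j
        for i in range(out_w):
            x = i * in_w // out_w      # source column for target column i
            vector.append(char_pixels[y][x])
    return vector

# A 20x30 all-black character maps to 10*15 = 150 ones:
char = [[1] * 20 for _ in range(30)]
vec = to_input_vector(char)
print(len(vec), sum(vec))   # 150 150
```

The flattened 150-element vector is what feeds the network's input layer, one element per input node.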

Figure 5.2 Matrix mapping a character

5.2 Flow Charts


Figure 5.3 General Flow Diagram

Figure 5.4 Software working Flow diagram


5.3 Module-wise Code

5.3.1 Main Program Entry


using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;

namespace myocr
{
    static class Program
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main()
        {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);
            Application.Run(new Form1());
        }
    }
}

5.3.2 Module 1: Training Network


public void form_network() { layers[0] = number_of_input_nodes; layers[number_of_layers - 1] = number_of_output_nodes;


for (int i = 1; i < number_of_layers - 1; i++) layers[i] = maximum_layers; }

public void initialize_weights() { for (int i = 1; i < number_of_layers; i++) for (int j = 0; j < layers[i]; j++) for (int k = 0; k < layers[i - 1]; k++) weight[i, j, k] = (float)(rnd.Next(-weight_bias, weight_bias)); }

public void form_input_set() { for (int k = 0; k < number_of_input_sets; k++) { get_next_character(); label16.Text = (k + 1).ToString(); label16.Update(); for (int i = 0; i < 10; i++) for (int j = 0; j < 15; j++) { input_set[i * 15 + j, k] = ann_input_value[i * 2 + 1, j * 2 + 1]; } } }

public void form_desired_output_set() { for (int i = 0; i < number_of_input_sets; i++)


{ character_to_unicode(trainer_string[i].ToString()); for (int j = 0; j < number_of_output_nodes; j++) desired_output_set[j, i] = desired_output_bit[j]; } }

public void train_network() { int set_number; float average_error = 0.0F; for (int epoch = 0; epoch <= epochs; epoch++) { average_error = 0.0F; for (int i = 0; i < number_of_input_sets; i++) { set_number = rnd.Next(0, number_of_input_sets); get_inputs(set_number); get_desired_outputs(set_number); calculate_outputs(); calculate_errors(); calculate_weights(); average_error = average_error + get_average_error(); } average_error = average_error / number_of_input_sets; if (average_error < error_threshold) { epoch = epochs + 1; progressBar1.Value = progressBar1.Maximum; label22.Text = "<" + error_threshold.ToString();


label22.Update(); } } label27.Text = "Ready"; label27.Update(); }

public void get_inputs(int set_number) { for (int i = 0; i < number_of_input_nodes; i++) current_input[i] = input_set[i, set_number]; }

public void get_desired_outputs(int set_number) { for (int i = 0; i < number_of_output_nodes; i++) desired_output[i] = desired_output_set[i, set_number]; }

public void calculate_outputs() { float f_net; int number_of_weights; for (int i = 0; i < number_of_layers; i++) for (int j = 0; j < layers[i]; j++) { f_net = 0.0F; if (i == 0) number_of_weights = 1; else number_of_weights = layers[i - 1];


for (int k = 0; k < number_of_weights; k++) if (i == 0) f_net = current_input[j]; else f_net = f_net + node_output[i - 1, k] * weight[i, j, k]; node_output[i, j] = sigmoid(f_net); } }

public float sigmoid(float f_net)
{
    //float result = (float)(1 / (1 + Math.Exp(-1 * slope * f_net)));     //Unipolar
    float result = (float)((2 / (1 + Math.Exp(-1 * slope * f_net))) - 1); //Bipolar
    return result;
}

public float sigmoid_derivative(float result)
{
    //float derivative = (float)(result * (1 - result));            //Unipolar
    float derivative = (float)(0.5F * (1 - Math.Pow(result, 2)));   //Bipolar
    return derivative;
}

public int threshold(float val)
{
    if (val < 0.5)
        return 0;
    else
        return 1;
}


public void calculate_errors()
{
    float sum = 0.0F;
    for (int i = 0; i < number_of_output_nodes; i++)
        error[number_of_layers - 1, i] = (float)((desired_output[i] - node_output[number_of_layers - 1, i]) * sigmoid_derivative(node_output[number_of_layers - 1, i]));
    for (int i = number_of_layers - 2; i >= 0; i--)
        for (int j = 0; j < layers[i]; j++)
        {
            sum = 0.0F;
            for (int k = 0; k < layers[i + 1]; k++)
                sum = sum + error[i + 1, k] * weight[i + 1, k, j];
            error[i, j] = (float)(sigmoid_derivative(node_output[i, j]) * sum);
        }
}

public float get_average_error() { float average_error = 0.0F; for (int i = 0; i < number_of_output_nodes; i++) average_error = average_error + error[number_of_layers - 1, i]; average_error = average_error / number_of_output_nodes; return Math.Abs(average_error); }

public void calculate_weights() {


for (int i = 1; i < number_of_layers; i++) for (int j = 0; j < layers[i]; j++) for (int k = 0; k < layers[i - 1]; k++) { weight[i, j, k] = (float)(weight[i, j, k] + learning_rate * error[i, j] * node_output[i - 1, k]); } }

public void load_character_trainer_set() { string line; openFileDialog1.InitialDirectory = "C:\\ocr\\myocr\\Data\\Trainer Sets"; openFileDialog1.Filter = "Character Trainer Set (*.cts)|*.cts"; if (openFileDialog1.ShowDialog() == DialogResult.OK) { character_trainer_set_file_stream = new System.IO.StreamReader(openFileDialog1.FileName); trainer_string = ""; while ((line = character_trainer_set_file_stream.ReadLine()) != null) trainer_string = trainer_string + line; number_of_input_sets = trainer_string.Length; character_trainer_set_file_name = Path.GetFileNameWithoutExtension(openFileDialog1.FileName); character_trainer_set_file_path = Path.GetDirectoryName(openFileDialog1.FileName); label20.Text = character_trainer_set_file_name; character_trainer_set_file_stream.Close();


image_file_name = character_trainer_set_file_path + "\\" + character_trainer_set_file_name + ".bmp"; image_file_stream = new System.IO.StreamReader(image_file_name); input_image = new Bitmap(image_file_name); pictureBox1.Image = input_image; input_image_height = input_image.Height; input_image_width = input_image.Width; if (input_image_width > pictureBox1.Width) pictureBox1.SizeMode = PictureBoxSizeMode.StretchImage; else pictureBox1.SizeMode = PictureBoxSizeMode.Normal; right = 1; image_start_pixel_x = 0; image_start_pixel_y = 0; identify_lines(); current_line = 0; character_present = true; character_valid = true; output_string = ""; label36.Text = "Input Image : [" + character_trainer_set_file_name + ".bmp]"; } } public void save_network() { saveFileDialog1.Filter = "Artificial Neural Network Files (*.ann)|*.ann"; saveFileDialog1.FileName = character_trainer_set_file_name; if ((saveFileDialog1.ShowDialog() == DialogResult.OK))


{
    if (saveFileDialog1.FileName != "")
    {
        network_save_file_stream = new StreamWriter(saveFileDialog1.FileName);
        network_save_file_stream.WriteLine("Unicode OCR ANN Weight values. ");
        network_save_file_stream.WriteLine("Network Name = " + character_trainer_set_file_name);
        network_save_file_stream.WriteLine("Hidden Layer Size = " + maximum_layers.ToString());
        network_save_file_stream.WriteLine("Number of Patterns= " + number_of_input_sets.ToString());
        network_save_file_stream.WriteLine("Number of Epochs = " + epochs.ToString());
        network_save_file_stream.WriteLine("Learning Rate = " + learning_rate.ToString());
        network_save_file_stream.WriteLine("Sigmoid Slope = " + slope.ToString());
        network_save_file_stream.WriteLine("Weight Bias = " + weight_bias.ToString());
        network_save_file_stream.WriteLine("");
        for (int i = 1; i < number_of_layers; i++)
            for (int j = 0; j < layers[i]; j++)
                for (int k = 0; k < layers[i - 1]; k++)
                {
                    network_save_file_stream.Write("Weight[" + i.ToString() + " , " + j.ToString() + " , " + k.ToString() + "] = ");
                    network_save_file_stream.WriteLine(weight[i, j, k]);
                }


network_save_file_stream.Close(); } } }

5.3.3 Module 2: Character Recognition


public void form_network() { layers[0] = number_of_input_nodes; layers[number_of_layers - 1] = number_of_output_nodes; for (int i = 1; i < number_of_layers - 1; i++) layers[i] = maximum_layers; }

public void load_network() { form_network(); openFileDialog1.InitialDirectory = "C:\\ocr\\myocr\\Data\\Networks"; openFileDialog1.Filter = "Artificial Neural Network Files (*.ann)|*.ann"; string line; char[] weight_char = new char[20]; string weight_text = ""; int title_length, weight_length; if ((openFileDialog1.ShowDialog() == DialogResult.OK)) { if (openFileDialog1.FileName != "") {


network_load_file_stream = new StreamReader(openFileDialog1.FileName); network_file_name = Path.GetFileNameWithoutExtension(openFileDialog1.FileName); label18.Text = network_file_name; for (int i = 0; i < 9; i++) network_load_file_stream.ReadLine(); for (int i = 1; i < number_of_layers; i++) for (int j = 0; j < layers[i]; j++) for (int k = 0; k < layers[i - 1]; k++) { weight_text = ""; line = network_load_file_stream.ReadLine(); title_length = ("Weight[" + i.ToString() + " , " + j.ToString() + " , " + k.ToString() + "] = ").Length; weight_length = line.Length - title_length; line.CopyTo(title_length, weight_char, 0, weight_length); for (int counter = 0; counter < weight_length; counter++) weight_text = weight_text + weight_char[counter].ToString(); weight[i, j, k] = (float)Convert.ChangeType(weight_text, typeof(float)); } network_load_file_stream.Close(); } } }

public void identify_lines() {


int y = image_start_pixel_y; int x = image_start_pixel_x; bool no_black_pixel; int line_number = 0; line_present = true; while (line_present) { x = image_start_pixel_x; while (Convert.ToString(input_image.GetPixel(x, y)) == "Color [A=255, R=255, G=255, B=255]") { x++; if (x == input_image_width) { x = image_start_pixel_x; y++; } if (y >= input_image_height) { line_present = false; break; } } if (line_present) { line_top[line_number] = y; no_black_pixel = false; while (no_black_pixel == false) { y++;


no_black_pixel = true; for (x = image_start_pixel_x; x < input_image_width; x++) if ((Convert.ToString(input_image.GetPixel(x, y)) == "Color [A=255, R=0, G=0, B=0]")) no_black_pixel = false; } line_bottom[line_number] = y - 1; line_number++; } } number_of_lines = line_number; }

public void load_image() { openFileDialog1.InitialDirectory = ""; openFileDialog1.InitialDirectory = "C:\\ocr\\myocr\\Data\\Sample Images"; openFileDialog1.Filter = "Bitmap Image (*.bmp)|*.bmp"; if (openFileDialog1.ShowDialog() == DialogResult.OK) { System.IO.StreamReader image_file_stream = new System.IO.StreamReader(openFileDialog1.FileName); input_image = new Bitmap(openFileDialog1.FileName); pictureBox1.Image = input_image; image_file_name = Path.GetFileNameWithoutExtension(openFileDialog1.FileName); image_file_path = Path.GetDirectoryName(openFileDialog1.FileName); image_file_stream.Close();


input_image_height = input_image.Height; input_image_width = input_image.Width; if (input_image_width > pictureBox1.Width) pictureBox1.SizeMode = PictureBoxSizeMode.StretchImage; else pictureBox1.SizeMode = PictureBoxSizeMode.Normal; right = 1; image_start_pixel_x = 0; image_start_pixel_y = 0; identify_lines(); current_line = 0; character_present = true; character_valid = true; output_string = ""; label36.Text = "Input Image : [" + image_file_name + ".bmp]"; } } public int binary_to_decimal() { int dec = 0; for (int i = 0; i < number_of_output_nodes; i++) dec = dec + output_bit[i] * (int)(Math.Pow(2, i)); return dec; }

public void character_to_unicode(string character) { int byteCount = unicode.GetByteCount(character.ToCharArray()); byte[] bytes = new Byte[byteCount];


bytes = unicode.GetBytes(character); BitArray bits = new BitArray(bytes); System.Collections.IEnumerator bit_enumerator = bits.GetEnumerator(); int bit_array_length = bits.Length; bit_enumerator.Reset(); for (int i = 0; i < bit_array_length; i++) { bit_enumerator.MoveNext(); if (bit_enumerator.Current.ToString() == "True") desired_output_bit[i] = 1; else desired_output_bit[i] = 0; } }

public char unicode_to_character() { int dec = binary_to_decimal(); Byte[] bytes = new Byte[2]; bytes[0] = (byte)(dec); bytes[1] = 0; int charCount = unicode.GetCharCount(bytes); char[] chars = new Char[charCount]; chars = unicode.GetChars(bytes); return chars[0]; }

public string binary_to_hex() {


int dec; string hex = ""; for (int i = 3; i >= 0; i--) { dec = 0; for (int j = 3; j >= 0; j--) dec = dec + (int)(output_bit[i * 4 + j] * Math.Pow(2, j)); if (dec > 9) switch (dec) { case 10: hex = hex + "A"; break; case 11: hex = hex + "B"; break; case 12: hex = hex + "C"; break; case 13: hex = hex + "D"; break; case 14: hex = hex + "E"; break; case 15: hex = hex + "F"; break; } else hex = hex + dec.ToString(); } return hex; }

public void get_next_character() { image_start_pixel_x = right + 2; image_start_pixel_y = line_top[current_line]; analyze_image(); }



public void analyze_image()
{
    int analyzed_line = current_line;
    comboBox1.Items.Clear();
    get_character_bounds();
    if (character_present)
    {
        map_character_image_pixel_matrix();
        create_character_image();
        map_ann_input_matrix();
    }
    else
        MessageBox.Show("Character Recognition Complete!", "Unicode OCR", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
}

public void get_character_bounds() { int x = image_start_pixel_x; int y = image_start_pixel_y; bool no_black_pixel = false; if (y <= input_image_height && x <= input_image_width) { while (Convert.ToString(input_image.GetPixel(x, y)) == "Color [A=255, R=255, G=255, B=255]") { x++;


if (x == input_image_width) { x = image_start_pixel_x; y++; } if (y >= line_bottom[current_line]) { character_present = false; break; } } if (character_present) { top = y; x = image_start_pixel_x; y = image_start_pixel_y; while (Convert.ToString(input_image.GetPixel(x, y)) == "Color [A=255, R=255, G=255, B=255]") { y++; if (y == line_bottom[current_line]) { y = image_start_pixel_y; x++; } if (x > input_image_width) break; } if (x < input_image_width) left = x; no_black_pixel = true;


y = line_bottom[current_line] + 2; while (no_black_pixel == true) { y--; for (x = image_start_pixel_x; x < input_image_width; x++) if ((Convert.ToString(input_image.GetPixel(x, y)) == "Color [A=255, R=0, G=0, B=0]")) no_black_pixel = false; } bottom = y; no_black_pixel = false; x = left + 10; while (no_black_pixel == false) { x++; no_black_pixel = true; for (y = image_start_pixel_y; y < line_bottom[current_line]; y++) if ((Convert.ToString(input_image.GetPixel(x, y)) == "Color [A=255, R=0, G=0, B=0]")) no_black_pixel = false; } right = x - 1; top = confirm_top(); bottom = confirm_bottom();

character_height = bottom - top + 1; character_width = right - left + 1; confirm_dimensions(); if (left - prev_right >= 20) output_string = output_string + " ";


prev_right = right;

textBox1.Text = Convert.ToString(top, 10); textBox1.Update(); textBox2.Text = Convert.ToString(left, 10); textBox2.Update(); textBox3.Text = Convert.ToString(bottom, 10); textBox3.Update(); textBox4.Text = Convert.ToString(right, 10); textBox4.Update(); textBox6.Text = Convert.ToString(character_width, 10); textBox6.Update(); textBox7.Text = Convert.ToString(character_height, 10); textBox7.Update(); } else if (current_line < number_of_lines - 1) { current_line++; image_start_pixel_y = line_top[current_line]; image_start_pixel_x = 0; prev_right = 20; output_string = output_string + "\n"; character_present = true; get_character_bounds(); } } else character_present = false; }

public void pick_sampling_pixels() { int step = (int)(character_height / matrix_height);


if (step < 1) step = 1;

sample_pixel_y[0] = 0; sample_pixel_y[29] = character_height - 1; sample_pixel_y[19] = (int)(2 * sample_pixel_y[29] / 3); sample_pixel_y[9] = (int)(sample_pixel_y[29] / 3);

sample_pixel_y[4] = (int)(sample_pixel_y[9] / 2); sample_pixel_y[5] = sample_pixel_y[4] + step; sample_pixel_y[2] = (int)(sample_pixel_y[4] / 2); sample_pixel_y[3] = sample_pixel_y[2] + step; sample_pixel_y[1] = sample_pixel_y[0] + step; sample_pixel_y[6] = sample_pixel_y[1] + sample_pixel_y[5]; sample_pixel_y[7] = sample_pixel_y[2] + sample_pixel_y[5]; sample_pixel_y[8] = sample_pixel_y[3] + sample_pixel_y[5]; for (int i = 10; i < 19; i++) sample_pixel_y[i] = sample_pixel_y[i - 10] + sample_pixel_y[9]; for (int i = 20; i < 29; i++) sample_pixel_y[i] = sample_pixel_y[i - 20] + sample_pixel_y[19];

step = (int)(character_width / matrix_width); if (step < 1) step = 1;

sample_pixel_x[0] = 0; sample_pixel_x[19] = character_width - 1; sample_pixel_x[9] = (int)(sample_pixel_x[19] / 2);

sample_pixel_x[4] = (int)(sample_pixel_x[9] / 2); sample_pixel_x[5] = sample_pixel_x[4] + step; sample_pixel_x[2] = (int)(sample_pixel_x[4] / 2);


sample_pixel_x[3] = sample_pixel_x[2] + step; sample_pixel_x[1] = sample_pixel_x[0] + step; sample_pixel_x[6] = sample_pixel_x[1] + sample_pixel_x[5]; sample_pixel_x[7] = sample_pixel_x[2] + sample_pixel_x[5]; sample_pixel_x[8] = sample_pixel_x[3] + sample_pixel_x[5]; for (int i = 10; i < 19; i++) sample_pixel_x[i] = sample_pixel_x[i - 10] + sample_pixel_x[9];

comboBox1.BeginUpdate(); for (int i = 0; i < 20; i++) comboBox1.Items.Add("[" + (i + 1).ToString() + "] " + sample_pixel_x[i].ToString()); comboBox1.EndUpdate(); comboBox1.BeginUpdate(); for (int i = 0; i < 30; i++) comboBox1.Items.Add("[" + (i + 1).ToString() + "] " + sample_pixel_y[i].ToString()); comboBox1.EndUpdate(); }

public void map_character_image_pixel_matrix() { for (int j = 0; j < character_height; j++) for (int i = 0; i < character_width; i++) character_image_pixel[i, j] = input_image.GetPixel(i + left, j + top); }

public void create_character_image() {


character_image = new System.Drawing.Bitmap(character_width, character_height); for (int j = 0; j < character_height; j++) for (int i = 0; i < character_width; i++) character_image.SetPixel(i, j, character_image_pixel[i, j]); pictureBox2.Image = character_image; pictureBox2.Update(); }

public void map_ann_input_matrix() { pick_sampling_pixels(); for (int j = 0; j < matrix_height; j++) for (int i = 0; i < matrix_width; i++) { ann_input_pixel[i, j] = character_image.GetPixel(sample_pixel_x[i], sample_pixel_y[j]); if (ann_input_pixel[i, j].ToString() == "Color [A=255, R=0, G=0, B=0]") ann_input_value[i, j] = 1; else ann_input_value[i, j] = 0; } groupBox6.Invalidate(); groupBox6.Update(); }

public void detect_next_character() { number_of_input_sets = 1;


get_next_character(); if (character_present) { for (int i = 0; i < 10; i++) for (int j = 0; j < 15; j++) input_set[i * 15 + j, 0] = ann_input_value[i * 2 + 1, j * 2 + 1]; get_inputs(0); calculate_outputs(); comboBox3.Items.Clear(); comboBox3.BeginUpdate(); for (int i = 0; i < number_of_output_nodes; i++) { output_bit[i] = threshold(node_output[number_of_layers - 1, i]); comboBox3.Items.Add("bit[" + (i).ToString() + "] " + output_bit[i].ToString()); } comboBox3.EndUpdate(); char character = unicode_to_character(); output_string = output_string + character.ToString(); textBox8.Text = " " + character.ToString(); string hexadecimal = binary_to_hex(); label11.Text = hexadecimal + " h"; label11.Update(); richTextBox1.Text = output_string; textBox8.Update(); richTextBox1.Update(); } }


5.3.4 Module 3: Conversion to image


private void btnbmp_Click(object sender, EventArgs e) { Bitmap objBmpImage = new Bitmap(1, 1); objBmpImage = CreateBitmapImage(richTextBoxExtended1.RichTextBox.Text); objBmpImage.Save("t2b.bmp"); }

private Bitmap CreateBitmapImage(string sImageText)
{
    Bitmap objBmpImage = new Bitmap(1, 1);
    int intWidth = 0;
    int intHeight = 0;
    // Create the Font for drawing the image text
    Font objFont = new Font("Arial", 12, System.Drawing.GraphicsUnit.Pixel);
    // Create a Graphics object to measure the text width and height
    Graphics objGraphics = Graphics.FromImage(objBmpImage);
    // The bitmap size is determined by the measured text
    intWidth = (int)objGraphics.MeasureString(sImageText, objFont).Width;
    intHeight = (int)objGraphics.MeasureString(sImageText, objFont).Height;
    // Create the bmp image with the correct size for the text and font
    objBmpImage = new Bitmap(objBmpImage, new Size(intWidth, intHeight));
    // Add the colors to the new bitmap
    objGraphics = Graphics.FromImage(objBmpImage);
    // Set the background color

objGraphics.Clear(Color.White);

// Set the rendering quality of the graphics objGraphics.SmoothingMode = System.Drawing.Drawing2D.SmoothingMode.AntiAlias;

// Set the rendering for the associated text objGraphics.TextRenderingHint = System.Drawing.Text.TextRenderingHint.AntiAlias;

// Initialize the color brush and draw the string
objGraphics.DrawString(sImageText, objFont, new SolidBrush(Color.FromArgb(102, 102, 102)), 0, 0);

// Forces execution of graphics objGraphics.Flush(); return (objBmpImage); } } }


CHAPTER 6 TESTING
Information processing has undergone major improvements in the past two decades in both hardware and software. Hardware has decreased in size and price, while providing more and faster processing power. Software has become easier to use, while providing increased capabilities. There is an abundance of products available to assist both end-users and software developers in their work.

Software testing, however, has not progressed significantly. It is still largely a manual process conducted as an art rather than a methodology. It is almost an accepted practice to release software that contains defects. Software that is not thoroughly tested is released for production. This is true for both off-the-shelf software products and custom applications.

Software vendors and in-house systems developers release an initial system and then deliver fixes to the code. They continue delivering fixes until they create a new system and stop supporting the old one. The user is then forced to convert to the new system, which again will require fixes. In-house systems developers generally do not provide any better level of support. They require the users to submit incident reports specifying the system defects. The incident reports are then assigned a priority, and the defects are fixed as time and budgets permit.

6.1 Importance of Testing


Testing is difficult. It requires knowledge of the application and the system architecture. The majority of the preparation work is tedious. The test conditions, test data, and expected results are generally created manually.


System testing is also one of the final activities before the system is released for production, so there is always pressure to complete it promptly to meet the deadline. Nevertheless, system testing is important. When a system is distributed to multiple sites, any errors or omissions in it will affect several groups of users, and any savings realized in downsizing the application will be negated by the costs to correct software errors and reprocess information. Software developers must deliver reliable and secure systems that satisfy the users' requirements.

A key item in successful system testing is developing a testing methodology, rather than relying on the individual style of the test practitioner. The system testing effort must follow a defined strategy: it must have an objective, a scope, and an approach. Testing is not an art; it is a skill that can be taught. Testing is generally associated with the execution of programs, and the emphasis tends to fall on the outcome of the testing rather than on what is tested and how it is tested. But testing is not a one-step, execute-the-test activity. It requires planning and design. The tests should be reviewed prior to execution to verify their accuracy and completeness, and they must be documented and saved for reuse.

System testing is the most extensive testing of the system. It requires more manpower and machine processing time than any other testing level and is therefore the most expensive testing level. It is a critical process in system development: it verifies that the system performs the business requirements accurately, completely, and within the required performance limits. It must be thorough, controlled, and managed.


6.2 Testing Definitions


Software testing can also be stated as the process of validating and verifying that a software program, application, or product:
- meets the business and technical requirements that guided its design and development;
- works as expected; and
- can be implemented with the same characteristics.

Software testing, depending on the testing method employed, can be implemented at any time in the development process. However, most of the test effort occurs after the requirements have been defined and the coding process has been completed. As such, the methodology of the test is governed by the software development methodology adopted. Different software development models will focus the test effort at different points in the development process. Newer development models, such as Agile, often employ test-driven development and place an increased portion of the testing in the hands of the developer, before it reaches a formal team of testers. In a more traditional model, most of the test execution occurs after the requirements have been defined and the coding process has been completed.

Software testing is a critical element of software quality assurance and represents the ultimate review of specification, design, and coding. Testing is a process of executing a program with the intent of finding an error, and a good test case is one that has a high probability of finding an as yet undiscovered error.


Test cases must be designed in such a way that they have the highest likelihood of finding the maximum number of errors with a minimum amount of time and effort. There are two approaches for designing test cases: one is white box testing and the other is black box testing.

6.2.1 White box testing


White box testing is a test case design method that uses the control structure of the procedural design to derive test cases. Using white box testing methods, the test cases should:
- guarantee that all independent paths within a module have been exercised at least once;
- exercise all logical decisions on their true and false sides;
- exercise all loops at their boundaries and within their operational bounds; and
- exercise internal data structures to assure their validity.
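These criteria can be illustrated with a small sketch. The example below is in Python (the project's own code is C#), and `clamp_pixels` is an invented helper, chosen only because binarization is a step an OCR pipeline might use; the point is that the white-box cases come from the code's structure, covering both sides of the decision and the zero-iteration loop boundary.

```python
def clamp_pixels(pixels, threshold):
    """Binarize a list of grey values: 1 if >= threshold, else 0."""
    out = []
    for p in pixels:            # loop to exercise at its boundaries
        if p >= threshold:      # decision to exercise on both sides
            out.append(1)
        else:
            out.append(0)
    return out

# White-box cases derived from the control structure:
assert clamp_pixels([200], 128) == [1]   # true side of the decision
assert clamp_pixels([50], 128) == [0]    # false side of the decision
assert clamp_pixels([], 128) == []       # loop executed zero times
```

Together the three cases cover every independent path through the function, which is exactly what the first criterion asks for.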

6.2.2 Black box testing


Black box testing methods focus on the functional requirements of the software. This testing method enables the tester to derive sets of input conditions that fully exercise all functional requirements of a program. Black box testing attempts to find errors in the following categories:
- incorrect or missing functions;
- interface errors;
- errors in data structures or external database access;
- performance errors; and
- initialization and termination errors.

White box testing is performed early in the testing process, while black box testing is applied during the later stages of testing. Black box testing purposefully disregards control structures and focuses on the information domain. Software development has several levels of testing:
- Unit Testing
- System Testing
- Acceptance Testing

6.2.3 Unit Testing


The primary goal of unit testing is to take the smallest piece of testable software in the application, isolate it from the remainder of the code, and determine whether it behaves exactly as you expect. Each unit is tested separately before integrating the units into modules to test the interfaces between modules. Unit testing has proven its value in that a large percentage of defects are identified during its use.

The most common approach to unit testing requires drivers and stubs to be written. The driver simulates a calling unit and the stub simulates a called unit. The investment of developer time in this activity sometimes results in demoting unit testing to a lower level of priority, and that is almost always a mistake. Even though the drivers and stubs cost time and money, unit testing provides some undeniable advantages. Unit testing is a method by which individual units of source code are tested to determine if they are fit for use; a unit is the smallest testable part of an application. It allows for automation of the testing process, reduces the difficulty of discovering errors contained in more complex pieces of the application, and test coverage is often enhanced because attention is given to each unit.

The first level of testing is called unit testing, which is done during the development of the system. Unit testing is essential for verification of the code produced during the coding phase; errors were noted down and corrected immediately. It is performed by the programmer. It uses the program specifications and the program itself as its source. Thus, our modules are individually tested here. There is no formal documentation required for unit testing.
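The driver-and-stub idea can be sketched as follows. This is an illustrative Python sketch, not the project's C# code: `recognize_line`, the stub, and the glyph tokens are all invented for the example. The stub stands in for the called unit (a real classifier network), while the driver calls the unit under test in isolation.

```python
def recognize_line(glyphs, classify):
    """Unit under test: map each segmented glyph to a character
    using whatever classifier it is given."""
    return "".join(classify(g) for g in glyphs)

def stub_classify(glyph):
    """Stub: simulates the called unit by returning canned answers
    instead of running a real neural network."""
    return {"g1": "O", "g2": "C", "g3": "R"}[glyph]

# Driver: simulates the calling unit and checks the behaviour
# of recognize_line in isolation from the rest of the system.
assert recognize_line(["g1", "g2", "g3"], stub_classify) == "OCR"
```

Because the stub's answers are fixed, any failure here points at `recognize_line` itself rather than at the classifier, which is precisely the isolation that unit testing is after.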

6.2.4 Integration Testing


Integration testing is the phase in software testing in which individual software modules are combined and tested as a group. It occurs after unit testing and before system testing. Integration testing takes as its input modules that have been unit tested, groups them into larger aggregates, applies tests defined in an integration test plan to those aggregates, and delivers as its output the integrated system, ready for system testing.

Integration testing is a logical extension of unit testing. In its simplest form, two units that have already been tested are combined into a component and the interface between them is tested. A component, in this sense, refers to an integrated aggregate of more than one unit. In a realistic scenario, many units are combined into components, which are in turn aggregated into even larger parts of the program. The idea is to test combinations of pieces and eventually expand the process to test your modules with those of other groups. Eventually all the modules making up a process are tested together. Beyond that, if the program is composed of more than one process, the processes should be tested in pairs rather than all at once.

Integration testing identifies problems that occur when units are combined. By using a test plan that requires you to test each unit and ensure the viability of each before combining units, you know that any errors discovered when combining units are likely related to the interface between units. This method reduces the number of possibilities to a far simpler level of analysis. The second level of testing is integration testing: here, different dependent modules are assembled and tested for any bugs that may surface due to the integration of modules. Thus, the dependent modules of the system are tested together here.
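A minimal sketch of combining two already-tested units and testing the interface between them, again in illustrative Python rather than the project's C#. Both `segment` and `classify` are invented stand-ins (a toy segmenter and a toy classifier); the integration case checks that the output format of one matches the input format expected by the other.

```python
def segment(text_image):
    """Toy segmenter: splits a 'scanned line' into glyph tokens."""
    return text_image.split("|")

def classify(glyph):
    """Toy classifier: recognizes a glyph token as one character."""
    return glyph.strip().upper()[0]

def recognize(text_image):
    """Integrated component: segmenter output feeds classifier input."""
    return "".join(classify(g) for g in segment(text_image))

# Integration case: each unit already passed its own tests, so a
# failure here would point at the interface between them (e.g. the
# segmenter emitting padding the classifier cannot handle).
assert recognize("a | n | n") == "ANN"
```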

6.2.5 External Function Testing


The external function test is a black box test to verify that the system correctly implements specified functions. This phase is sometimes known as an alpha test. Testers will run tests that they believe reflect the end use of the system.

6.2.6 System Testing


System testing is probably the most important phase of the complete testing cycle. This phase is started after the completion of the other phases, such as unit, component, and integration testing. During the system testing phase, non-functional testing also comes into the picture: performance, load, stress, and scalability testing are all performed in this phase. System testing is conducted on the complete integrated system, in a replicated production environment. System testing also evaluates the system's compliance with both functional and non-functional requirements.

It is important to understand that not many test cases are written for system testing. Test cases for system testing are derived from the architecture and design of the system, from input of the end user, and from user stories. It does not make sense to exercise extensive testing in the system testing phase, as most of the functional defects should have been caught and corrected during the earlier testing phases.

The third level of testing is system testing. System testing verifies that the system performs the business functions while meeting the specified performance requirements. It is performed by a team consisting of software technicians and users. It uses the Systems Requirements document, the System Architectural Design and Detailed Design documents, and the Information Systems Department standards as its sources. Documentation is recorded and saved for system testing.

6.2.7 Acceptance Testing


Acceptance testing is a test conducted to determine if the requirements of a specification or contract are met. It may involve chemical tests, physical tests, or performance tests. Acceptance testing is a final stage of testing that is performed on a system prior to the system being delivered to a live environment. Systems subjected to acceptance testing might include such deliverables as a software system or a mechanical hardware system.

Acceptance tests are generally performed as "black box" tests: the tester uses specified inputs into the system and verifies that the resulting outputs are correct, without knowledge of the system's internal workings. User acceptance testing (UAT) is the term used when the acceptance tests are performed by the person or persons who will be using the live system once it is delivered. If the system is being built or developed by an external supplier, this is sometimes called customer acceptance testing (CAT). The UAT or CAT acts as a final confirmation that the system is ready for go-live. A successful acceptance test at this stage may be a contractual requirement prior to the system being signed off by the client.

The final level of testing is acceptance testing. Acceptance testing provides the users with assurance that the system is ready for production use; it is performed by the users. It uses the System Requirements document as its source. There is no formal documentation required for acceptance testing.

System testing is the major testing effort of the project. It is the functional testing of the application and is concerned with the following:
- Quality/standards compliance
- Business requirements
- Performance capabilities
- Operational capabilities


CHAPTER 7 SNAPSHOTS

Figure 7.1 myOCR Software

Figure 7.2 Loading the Network



Figure 7.3 Loading the image

Figure 7.4 After loading the image


Figure 7.5 Character Recognition

Figure 7.6 Saving the converted text


Figure 7.7 Editing the text in the editor

Figure 7.8 Snapshot showing the text after editing


CONCLUSION
Artificial neural networks are commonly used to perform character recognition due to their high noise tolerance. Such systems have the ability to yield excellent results. The feature extraction step of optical character recognition is the most important: a poorly chosen set of features will yield poor classification rates by any neural network. Features must be chosen such that they are loss-less, or at least still accurately represent the character. However, loss-less feature extraction does not guarantee good results. Choices for feature extraction include:
- Use of the entire input image as the feature vector. However, this requires a huge network that must be trained for millions of iterations.
- Computing the vertical and horizontal projections of the character. This is an easy reduction.
- Computing the run lengths of the character, providing a loss-less reduction in information.
- Use of hand-selected features chosen by human experts to classify characters. This is an easy reduction.

These feature vectors are fed into feed-forward, back-propagation artificial neural networks. The output of the network determines the correct character class. Trivial systems work with only a single font, while more complex systems can recognize many fonts using the same network. Most systems can accept a character printed in any size of font by performing scaling and normalization before computing the feature vectors.
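The projection features mentioned above are simple to compute: sum the ink pixels along each row and each column. A small Python sketch (illustrative only; the glyph, the function name, and the binary-grid representation are assumptions, not this project's C# code):

```python
def projections(image):
    """Return (horizontal, vertical) projections of a binary image,
    where 1 = ink and 0 = background.

    horizontal[i] = number of ink pixels in row i
    vertical[j]   = number of ink pixels in column j
    """
    horizontal = [sum(row) for row in image]
    vertical = [sum(col) for col in zip(*image)]
    return horizontal, vertical

# A 5x3 glyph roughly shaped like the letter "T"
glyph = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
h, v = projections(glyph)
print(h)  # [3, 1, 1, 1, 1]
print(v)  # [1, 5, 1]
```

Here the 15 input pixels reduce to an 8-element feature vector (the two projections concatenated), which is why this counts as an easy, though lossy, reduction.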


REFERENCES
[1] H.I. Avi-Itzhak, T.A. Diep, and H. Garland. High accuracy optical character recognition using neural networks with centroid dithering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(2):218-224, February 1995.
[2] Richard G. Casey and Eric Lecolinet. A survey of methods and strategies in character segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(7):690-706, July 1996.
[3] Tsu-Chang Lee. Structure level adaptation for artificial neural networks: theory, applications, and implementations. PhD thesis, Stanford University, Stanford, CA, USA, 1990. Adviser: Allen M. Peterson.
[4] N. Mani and B. Srinivasan. Application of artificial neural network model for optical character recognition. In Systems, Man, and Cybernetics, 1997 (Computational Cybernetics and Simulation), 1997 IEEE International Conference on, 3:2517-2520, October 1997.
[5] George L. Nagy, Stephen V. Rice, and Thomas A. Nartker. Optical Character Recognition: An Illustrated Guide to the Frontier. Kluwer Academic Publishers, Norwell, Massachusetts, USA, 1999.
[6] E.M. de A. Neves, A. Gonzaga, and A.F.F. Slaets. A multi-font character recognition based on its fundamental features by artificial neural networks. In Cybernetic Vision, 1996, Proceedings, Second Workshop on, pages 196-201, December 1996.
[7] J.M. Ramirez, P. Gomez-Gil, and D. Baez-Lopez. On structural adaptability of neural networks in character recognition. In Signal Processing, 1996, 3rd International Conference on, 2:1481-1483, October 1996.


Curriculum Vitae:

AJAY KUMAR SRIVASTAVA

Ajay Kr. Srivastava, B.Tech (8th Semester), Computer Science and Engineering. Email id: ajayaeccs@gmail.com

EDUCATION

Examination | Institution | Board/University | Percentage | Session
B.Tech (current, 8th semester) | Anand Engineering College, Agra | Gautam Buddh Technical University | 80.5% | 2008-2012
Class XII | Jyoti Niketan School, Azamgarh | ISC Board | 82.5% | 2007
Class X | Jyoti Niketan School, Azamgarh | ICSE Board | 87.17% | 2005

PERSONAL INFORMATION

Date of Birth: 09.12.1989
Gender: Male
Father's Name: Mr. Sunil Kumar Srivastava
Permanent Address: Village & Post - Tengerpur, Rani ki Sarai, Azamgarh (U.P.) - 276207
Email id: ajayaeccs@gmail.com
Contact no: +91 9411464705
Hobbies: Playing cricket, listening to songs.


DIVYA CHAWLA

Divya Chawla, B.Tech (8th Semester), Computer Science and Engineering. Email id: vickychawla.chawla@gmail.com

EDUCATION

Examination | Institution | Board/University | Percentage | Session
B.Tech (current, 8th semester) | Anand Engineering College, Agra | Gautam Buddh Technical University | 75.6% | 2008-2012
Class XII | Cathedral College | ISC Board | 83.4% | 2008
Class X | Cathedral College | ICSE Board | 74% | 2006

PERSONAL INFORMATION

Date of Birth: 09.03.1990
Gender: Male
Father's Name: Shri Bhim Sen Chawla
Permanent Address: 371, Adarsh Nagar, Sipri Bazar, Jhansi (U.P.)
Email id: vickychawla.chawla@gmail.com
Contact no: +91 9454959392
Hobbies: Reading novels (Chetan Bhagat and Arpit Dugar), watching Telugu movies.


KAHKASHAN AHMAD

Kahkashan Ahmad, B.Tech (8th Semester), Computer Science and Engineering. Email id: kasha.virgo@gmail.com

EDUCATION

Examination | Institution | Board/University | Percentage | Session
B.Tech (current, 8th semester) | Anand Engineering College, Agra | Gautam Buddh Technical University | 77.2% | 2008-2012
Class XII | St. Patrick's Junior College, Agra | ISC Board | 85.5% | 2008
Class X | St. Patrick's Junior College, Agra | ICSE Board | 92.8% | 2006

PERSONAL INFORMATION

Date of Birth: 20.02.1990
Gender: Female
Father's Name: Mr. Syed Rashid Ahmad
Permanent Address: 15/3, Soron Katra, Shahganj, Agra
Email id: kasha.virgo@gmail.com
Contact no: +91 9557860306
Hobbies: Embroidery; reading articles, collecting and pasting them in my collection.


KANIKA AGARWAL

Kanika Agarwal, B.Tech (8th Semester), Computer Science and Engineering. Email: kanika.303@gmail.com

EDUCATION

Examination | Board/University | Marks | Year
B.Tech (pursuing) | U.P.T.U. | 74.67% | 2011
Class XII | U.P. Board | 69.2% | 2007
Class X | U.P. Board | 62.5% | 2005

PERSONAL INFORMATION

Date of Birth: 03.03.1990
Gender: Female
Father's Name: Mr. Satish Chand Agarwal
Permanent Address: 207, Puneet Apartment, Teacher Colony, Jaipur House, Agra (U.P.)
Email id: kanika.303@gmail.com
Contact no: +91 9368049696
Hobbies: Sketching (portraits etc.), indulging in artistic and creative works.


NADA KHALEEQUE

Nada Khaleeque, B.Tech (8th Semester), Computer Science and Engineering. Email: nada_shamsi@yahoo.co.in

EDUCATION

Examination | Institution | Board/University | Percentage | Year
Class X | St. Anthony's Junior College, Agra | ICSE | 91.3% | 2006
Class XII | Mount Carmel College, Bangalore | Karnataka Board | 85.6% | 2008
B.Tech | Anand Engineering College | GBTU | 73% (avg.) |

PERSONAL INFORMATION

Date of Birth: 16.01.1990
Gender: Female
Father's Name: Khaleeque Ahmad
Permanent Address: 5-A, North Idgah Colony, Agra - 282001
Email id: nada_shamsi@yahoo.co.in
Contact no: +91 9368049696
Hobbies: Sketching (portraits etc.), indulging in artistic and creative works.

