
HAND WRITTEN IDENTIFICATION USING

IMAGE PROCESSING TECHNIQUES

[TALAYA FRASAT(1106); FERWA NADEEM(1117); AYESHA ASHRAF(1148)]
[PROJECT-1; CSIT-21306]
[SESSION 2014-2016;MCS MORNING]
[DEPARTMENT OF COMPUTER SCIENCE & IT,
THE ISLAMIA UNIVERSITY BAHAWALPUR,
RAHIM YAR KHAN CAMPUS]
SDD Document 1.0 i
Handwriting Classification

Version <1.0>

16-5-2016

Handwriting Classification 1.0 (Version)
SDD 16-5-2016

HISTORY
DATE        VERSION   EXPLANATION                   AUTHORS
16-5-2016   <1.0>     Initial Version of Document   FARWA NADEEM, TALAYA FARASAT, AYESHA ASHRAF



Table of Contents
1 Introduction
  1.1 Purpose
  1.2 Scope
  1.3 Definitions & Abbreviations
  1.4 References
  1.5 Overview

2 Use Cases
  2.1 Actors
  2.2 Use Cases (List)
  2.3 Diagrams
  2.4 Use Cases

3 Design Overview
  3.1 System Architecture
  3.2 System Interfaces
  3.3 Constraints and Assumptions

4 System Object Model
  4.1 Introduction
  4.2 Subsystems
  4.3 Subsystem Interfaces

5 Object Descriptions
  5.1 Objects

6 Object Collaboration
  6.1 Object Collaboration Diagram

7 Dynamic Model
  7.1 Sequence Diagrams
  7.2 State Diagrams

8 Non-functional Requirements
  8.1 Performance Requirements
  8.2 Design Constraints

9 Supplementary Documentation


1 Introduction

This application classifies handwritten documents. It uses machine learning techniques to identify who wrote a given sample.

1.1 Purpose

The project classifies handwriting samples in order to identify the author (writer) of a document.
Doing this work manually takes considerable time and effort, so the main goal is an efficient and reliable system that can support criminal investigations and court proceedings. The system can also be used in banks and businesses to authenticate documents, and in crime control agencies for investigation purposes. Reliability is the main design focus. The system can also help to control plagiarism and to protect copyright.

1.2 Scope

The scope of this system is broad. It can be used for investigation in many areas, for example in hospitals where the authenticity of a prescription must be established. It is a type of biometric system developed for authentication. Systems of this type are used for crime control and for verification in many fields. The system can operate online as well as offline.

1.3 Definitions & Abbreviations

OCR - Optical Character Recognition, used to recognize characters.

SVM - Support Vector Machine, used for classification.

FE - Feature Extraction, used to extract features from handwriting samples.

PDF - Portable Document Format.

1.4 References
Software Requirements Specification (SRS) of Hand Written Identification Using Image Processing Techniques.


1.5 Overview

This document is divided into the following sections:

● Introduction
● Use cases
● Design overview
● System object model
● Object descriptions
● Object collaboration
● Dynamic model
● Non-functional requirements
● Supplementary documentation

2 Use Cases
2.1 Actors
2.1.1 Writer
Writers are the users who provide handwritten content. Each writer repeats this process until the application has enough samples to learn the writer's handwriting; the learned samples are later used to classify unknown writing. A writer can be any person. Some major types are listed below:
● Teacher
● Student
● Criminal
● Author
● Poet
Additional information: The writer is the actor on whom this application is based and for whom it is designed, because writers provide the samples that are input to the system.


2.1.2 Users (Public)

The public user is not the writer. This is the end user of the application, who gathers samples from writers, trains the system, obtains the results from the application, and acts on the basis of the classification results.

2.2 Use Cases (List)

● Use case for inserting a sample
● Use case for training the data sets
● Use case for removing a sample
● Use case for preprocessing
● Use case for character recognition


2.3 Diagrams of Use Cases

[Figure: Use case diagram for inserting a sample]
[Figure: Use case diagram for training the data sets]
[Figure: Use case diagram for sample removal]
[Figure: Use case diagram for sample preprocessing]
[Figure: Use case diagram for character recognition]


2.4 Use Cases
2.4.1 Insert Sample Use Case (Overview)

Use case name: Insert Sample
Full name: Insert sample
Priority: High
Primary actor: Writer
Source: Author or Students
Type: Security
Level: Overview
Interested stakeholders: Investigators and all crime control agencies.

Brief description
This use case covers the insertion of a sample. Inserted samples are stored in memory for future use.
Goal:
● A writer successfully inserts a new sample and the system identifies the sample.
Success Measurement:
● The sample is generated and inserted into the system and will be used for future classification.
Precondition:
● The writer must be registered in the system before submitting a sample, so that the sample can be recognized later.
● The data sets must be complete so that the data is appropriate and sufficient.
Triggers
● The samples have been successfully inserted.

Use case event description

Flow of events
1. A writer is first registered in the system.
2. Samples from that writer are gathered and entered.
3. The writer can view or remove the inserted samples.
4. Once this process has been repeated for all writers, classification is initiated by providing unknown content.

Assumptions
1. It is assumed that the writer writes the sample content on clean paper that can easily be processed by OCR.
2. It is assumed that the system learns all locales easily based on features and is locale independent.
3. It is assumed that a sample is provided, rather than created and inserted without following any standard.

Implementation Constraints and Specifications: The system requires at minimum a 3.1 dual-core processor to work efficiently and produce the desired outcomes; a higher configuration is suggested.


2.4.2 Insert Sample Use Case (Detailed)

Use case name: Insert Sample
Full name: Insert sample
Priority: High
Primary actor: Writer
Source: Author or Students
Type: Security
Level: Detailed
Interested stakeholders: Investigators and all crime control agencies.

Brief description
This use case covers the insertion of a sample. Inserted samples are stored in memory for future use.
Main goal:
● A writer successfully inserts a new sample and the system identifies the author.
Success Measurement:
● The sample is generated and inserted into the system and will be used for future classification.
Precondition:
● The writer must be registered in the system before submitting a sample, so that the sample can be recognized later.
● The data sets must be complete so that the data is appropriate and sufficient.

Triggers
● The samples have been successfully inserted.

Use case event description

Flow of events
1. A writer is registered in the system by entering the writer's name, and a sample space is created for that writer.
2. The writer writes the content according to a standard template.
3. The system under design uses that sample to train on and learn the basic features of the fonts in the written content.
4. The system under design learns a predefined set of features for each font, in each piece of content, from every writer.
5. Exception: if the data is insufficient or the sample does not satisfy the constraints, an exception is raised and the writer is alerted to re-insert or regenerate the samples. This is repeated until the system under design receives an acceptable sample.
6. Invalid data is rejected, because training the software on garbage data would degrade the results.
7. Population and classification are based on the data set created from the previously provided samples.
8. Users can preview the results and the generated report, and can then re-classify after changing the criteria or filter.
9. To modify data, return to the Sample Management System and alter or remove samples.
10. The preview is based on the classification table, which contains the match confidence, the accuracy, and the recognized writer.

Assumptions
1. It is assumed that the writer writes the sample content on clean paper that can easily be processed by OCR.
2. It is assumed that the system learns all locales easily based on features and is locale independent.
3. It is assumed that a sample is provided, rather than created and inserted without following any standard.

Implementation Constraints and Specifications: The system requires at minimum a 3.1 dual-core processor to work efficiently and produce the desired outcomes; a higher configuration is suggested.
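
The registration and validation path of this use case (steps 1, 2, 5 and 6) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the project's implementation: the names SampleStore, InvalidSampleError and min_characters are hypothetical, a sample is represented by its recognized text rather than by a scanned image, and the length threshold stands in for the unspecified "enough data" constraint.

```python
from dataclasses import dataclass, field


class InvalidSampleError(Exception):
    """Raised when a sample does not satisfy the insertion constraints."""


@dataclass
class SampleStore:
    # Maps a writer name to the list of sample texts stored for that writer.
    samples: dict = field(default_factory=dict)
    # Hypothetical threshold: the document only says the data must be "enough".
    min_characters: int = 20

    def register_writer(self, name: str) -> None:
        """Step 1: registering a writer creates an empty sample space."""
        self.samples.setdefault(name, [])

    def insert_sample(self, name: str, text: str) -> int:
        """Steps 2, 5 and 6: validate the sample, then store it for training.

        Returns the number of samples now stored for the writer.
        """
        if name not in self.samples:
            raise InvalidSampleError(f"writer {name!r} is not registered")
        if len(text.strip()) < self.min_characters:
            raise InvalidSampleError("sample is too short to train on")
        self.samples[name].append(text)
        return len(self.samples[name])
```

A caller would catch InvalidSampleError and ask the writer to submit the sample again, which corresponds to the retry loop described in step 5.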

2.4.3 Remove Sample (Overview)

Use case name: Remove the Sample
Full name: Delete sample
Priority: High
Primary actor: Writer
Source: Author or Students
Type: Security
Level: Overview
Interested stakeholders: Investigators and all crime control agencies.

Brief description
This use case covers the deletion of samples. Deleted samples are removed from memory.

Main goal:
● The successful deletion of a sample from the system.
Success Measurement:
● The sample is selected by the writer and then removed without disturbing other samples.
Preconditions:
● Before a sample can be deleted, the writer must exist in the system and must have submitted at least one sample.
● The data sets must be complete so that the data is appropriate and sufficient.

Triggers
● The samples have been successfully deleted.

Use case event description

Flow of events
1. The writer is first identified and the system confirms that the writer exists.
2. The samples from that writer are gathered and located.
3. The writer can view or remove the inserted samples.
4. Once this process has been repeated for all writers, classification is initiated by providing unknown content.

Assumptions
1. It is assumed that the writer is already registered and only selects samples to remove from the list of samples that the writer submitted.
2. It is assumed that the removal does not affect the other samples of the same writer or the samples of other writers.
3. It is assumed that a sample is provided, rather than created and inserted without following any standard.

Implementation Constraints and Specifications: The system requires at minimum a 3.1 dual-core processor to work efficiently and produce the desired outcomes; a higher configuration is suggested.

2.4.4 Remove Sample (Detailed)

Use case name: Remove the Sample
Full name: Delete sample
Priority: High
Primary actor: Writer
Source: Author or Students
Type: Security
Level: Detailed
Interested stakeholders: Investigators and all crime control agencies.

Brief description
This use case describes the removal of a sample that was previously inserted into the training set.

Main goal:
● The successful deletion of a sample from the system.
Success Measurement:
● The sample is selected by the writer and then removed without disturbing other samples.
Preconditions:
● Before a sample can be deleted, the writer must exist in the system and must have submitted at least one sample.
● The data sets must be complete so that the data is appropriate and sufficient.

Triggers
● The samples have been successfully deleted.

Use case event description

Flow of events
1. The writer's existence in the system is checked by entering the writer's name, and the writer's sample space is validated.
2. The writer's content and the given sample are then validated and located.
3. The system under design removes the sample from the local database file so that it is not used again.
4. The system under design uses the predefined features that it obtains for each font, in each piece of content, from every writer.
5. Exception: if the writer cannot be validated or does not exist, or the sample to be deleted does not exist, the process is terminated, because both are essential.
6. The system is not expected to identify invalid data and remove it on its own.
7. Population and classification are based on the data set created from the previously provided samples, as altered by the removals.
8. A report is generated and users can view the results, then re-classify after changing the criteria or filter.
9. To view the current data, return to the Sample Management System and view the remaining samples.
10. The preview depends on the classification table, which contains the match confidence, the accuracy, and the recognized writer.

Assumptions
1. It is assumed that the writer is already registered and only selects samples to remove from the list of samples that the writer submitted.
2. It is assumed that the removal does not affect the other samples of the same writer or the samples of other writers.
3. It is assumed that a sample is provided, rather than created and inserted without following any standard.

Implementation Constraints and Specifications: The system requires at minimum a 3.1 dual-core processor to work efficiently and produce the desired outcomes; a higher configuration is suggested.

2.4.5 Classification of Writing (Training, Preprocessing and Character Recognition) (Overview)

Use case name: Classification of Writing
Full name: Classification of writing
Priority: High
Primary actor: Public User
Source: Author or Students
Type: Security
Level: Overview
Interested stakeholders: Investigators and all crime control agencies.

Brief description
This use case covers the classification of the samples provided by the different writers. The system checks whether the training set exists. If it does, samples are present, so the system proceeds to train and then classify; otherwise it requests the data set.

Main goal:
● The successful completion of the Training module and the Classification module.

Success Measurement:
● The system is trained quickly, and unknown samples are classified and recognized on the fly.

Preconditions:
● At least two writers must exist in the system for classification to take place.
● Enough samples must be present to initiate the system.

Trigger:
● The classifier has generated the results, which are ready to be viewed.

Use case event description

Flow of events
1. The writers' samples are ready to be passed to the learner module.
2. The learner module passes them to the feature extractor, which returns the feature object.
3. The extracted features are then stored; this is referred to as learning.
4. The system under design proceeds to start classification.
5. The user selects the destination address(es) for the result.
6. Users can change the filter or calculation criteria.

Assumptions
1. It is assumed that there are enough writers and enough samples to initiate the classifier properly.

Implementation Constraints and Specifications: The system requires at minimum a 3.1 dual-core processor to work efficiently and produce the desired outcomes; a higher configuration is suggested.
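
The learning steps (1 to 3) can be illustrated with a small Python sketch. The document does not say which features the feature extractor computes, so the three used here (ink density, bounding-box aspect ratio and vertical centre of the ink) are assumptions chosen only to make the example concrete, and the function names extract_features and learn are hypothetical. A sample is modelled as a binary image: a list of rows of 0/1 pixels, where 1 means ink.

```python
def extract_features(image):
    """Return (ink_density, aspect_ratio, vertical_centre) for a binary image.

    `image` is a list of equal-length rows of 0/1 pixels, where 1 means ink.
    """
    height, width = len(image), len(image[0])
    ink = [(r, c) for r, row in enumerate(image) for c, px in enumerate(row) if px]
    if not ink:
        raise ValueError("blank sample: no ink pixels found")
    rows = [r for r, _ in ink]
    cols = [c for _, c in ink]
    # Bounding box of the ink, in pixels.
    box_h = max(rows) - min(rows) + 1
    box_w = max(cols) - min(cols) + 1
    ink_density = len(ink) / (height * width)
    aspect_ratio = box_w / box_h
    # Mean ink row, normalised by the image height.
    vertical_centre = sum(rows) / len(rows) / height
    return (ink_density, aspect_ratio, vertical_centre)


def learn(samples_by_writer):
    """Extract features for every sample and store them per writer."""
    return {
        writer: [extract_features(img) for img in images]
        for writer, images in samples_by_writer.items()
    }
```

The dictionary returned by learn plays the role of the stored training set that the classification step reads from.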

2.4.6 Classification of Writing (Training, Preprocessing and Character Recognition) (Detailed)

Use case name: Classification of Writing
Full name: Classification of writing
Priority: High
Primary actor: Public User
Source: Author or Students
Type: Security
Level: Detailed
Interested stakeholders: Investigators and all crime control agencies.

Brief description
This use case covers the classification of the samples provided by the different writers. The system checks whether the training set exists. If it does, samples are present, so the system proceeds to train and then classify; otherwise it requests the data set.

Main goal:
● The successful completion of the Training module and the Classification module.

Success Measurement:
● The system is trained quickly, and unknown samples are classified and recognized on the fly.

Preconditions:
● At least two writers must exist in the system for classification to take place.
● Enough samples must be present to initiate the system.

Triggers
● The classifier has generated the results, which are ready to be viewed.

Use case event description

Flow of events
1. The writers' samples are ready to be passed to the learner module.
2. The learner module passes them to the feature extractor, which returns the feature object.
3. The extracted features are then stored; this is referred to as learning.
4. The system under design proceeds to start classification.
5. The user selects the destination address(es) for the result.
6. Users can change the filter or calculation criteria.
7. Multiple samples from multiple writers are required; multiple samples from a single writer are not sufficient.
8. Exception: in rare cases the system may still be unable to classify, even though more than one writer and more than one sample per writer exist.
9. Once initiated, the procedure can be cancelled at any time, before the system completes processing and generates the results.
10. A log file records every action and processing step, so that it can be reviewed after a system crash or invalid classification results.

Assumptions
1. It is assumed that there are enough writers and enough samples to initiate the classifier properly.

Implementation Constraints and Specifications: The system requires at minimum a 3.1 dual-core processor to work efficiently and produce the desired outcomes; a higher configuration is suggested.
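
The precondition check and the classification step can be sketched in Python as follows. The document names an SVM as the classifier; to keep the example dependency free, this sketch substitutes a much simpler nearest-centroid rule, so it illustrates the control flow rather than the SVM itself. The confidence formula and the function names centroid and classify are likewise assumptions made for illustration.

```python
import math


def centroid(vectors):
    """Average a list of equal-length feature vectors component by component."""
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))


def classify(training_set, unknown):
    """Return (writer, confidence) for an unknown feature vector.

    `training_set` maps a writer name to a list of feature vectors.
    """
    # Precondition from the use case: at least two writers with samples.
    writers = {w: v for w, v in training_set.items() if v}
    if len(writers) < 2:
        raise ValueError("classification needs samples from at least two writers")

    # Nearest-centroid rule (a stand-in for the SVM named in the document).
    distances = {w: math.dist(centroid(v), unknown) for w, v in writers.items()}
    best = min(distances, key=distances.get)

    # Hypothetical confidence: the winner's share of the inverse distances.
    weights = {w: 1.0 / (d + 1e-9) for w, d in distances.items()}
    confidence = weights[best] / sum(weights.values())
    return best, confidence
```

The returned pair corresponds to one row of the classification table mentioned in the use cases: the recognized writer together with a match confidence.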

3 Design Overview

[Figure: Process flow of handwritten identification]
[Figure: Block diagram]
[Figure: Flow chart of preprocessing]
[Figure: Flow chart for character recognition]
[Figure: Flow chart for handwritten classification]

Activity Diagrams

[Figure: Activity diagram for inserting a sample]
[Figure: Activity diagram for removing a sample]
[Figure: Activity diagram for training the data sets]
[Figure: Activity diagram for character recognition]
[Figure: Activity diagram for handwritten identification]

[Figure: Basic OCR architecture]
[Figure: Basic architecture of the system (handwritten identification)]
[Figure: Component-level diagram]


[Figure: Deployment diagram (handwritten identification). Labels shown: device (Core i series); OS (Windows, Linux); application main module; core; preprocessor; sample storage; trainer; classifier; SVM; capture device; validator]

3.2 System Interfaces

User Interface

● The user interface is how the user interacts with the system. It offers several options in sequence: upload image, analysis, classes, and classification. The interface must be multi-tab to function properly, and the data visualization must be clear and complete. The expected interface is drawn in the figures and the process is explained step by step.

[Figure: Samples provided]
[Figure: Working]
3.3 General Constraints & Assumptions

Assumptions

It is assumed that the system under construction will be used by different writers, such as students, teachers, authors, and bankers. Because the sources differ, the sites of usage also vary from actor to actor; for example, a banker will use the system in a bank to identify an illegally passed document by analyzing it.

Dependencies

The handwritten identification system depends on the Sample Management System. At least one sample from the user must be provided for classification, and its data must be appropriate and sufficient to complete all relevant and necessary data fields. A high-resolution camera is required to capture samples properly.


4 System Object Model
4.1 Introduction

This section describes the subsystems of the system. The handwritten classification system depends on one subsystem, the Sample Management System.
4.2 Subsystems

Sample Management System

4.3 Subsystem Interfaces

There are no subsystem interfaces.


5 Object Descriptions
5.1 Objects

Sample_Manager is the class responsible for handling the input and removal of all user samples.

Name of the class: Sample_Manager

Brief description: The Sample_Manager handles the input of all samples from the different types of writers, and also updates to them, for example when a sample is deleted.

FIELDS           EXPLANATION
sample           A Sample object that is used throughout the class for handling samples.
                 PDL (program description language)
                 private Sample sample;
AddSample        The AddSample function, which adds new samples.
                 PDL (program description language)
                 private void AddSample(Sample);
DeleteSample     The DeleteSample function.
                 PDL (program description language)
                 private void DeleteSample(Sample, Writer);
CountSamples     Counts all samples of a given writer.
                 PDL (program description language)
                 private int CountSamples(Writer);
ValidateSamples  The ValidateSamples function.
                 PDL (program description language)
                 private bool ValidateSamples(Sample, Writer);
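
The interface above can be sketched in Python as follows. This is an illustrative reading of the PDL signatures, not the project's code. Several details are assumptions: the methods are public here so that the class can be exercised directly (the PDL marks them private), a Sample carries only a writer name and a text stand-in for the scanned content, and the validation rule (non-empty content owned by the given writer) is hypothetical because the document does not define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    writer: str   # name of the writer who produced the sample
    content: str  # stand-in for the scanned handwriting data


class SampleManager:
    def __init__(self):
        self._samples = []

    def validate_sample(self, sample, writer):
        """Assumed rule: the sample is non-empty and belongs to `writer`."""
        return bool(sample.content.strip()) and sample.writer == writer

    def add_sample(self, sample):
        """Store a sample after validating it against its own writer."""
        if not self.validate_sample(sample, sample.writer):
            raise ValueError("invalid sample")
        self._samples.append(sample)

    def delete_sample(self, sample, writer):
        """Remove one sample of `writer`; all other samples stay untouched.

        Returns True if a sample was removed, False otherwise.
        """
        if not self.validate_sample(sample, writer) or sample not in self._samples:
            return False
        self._samples.remove(sample)
        return True

    def count_samples(self, writer):
        """Count the stored samples that belong to `writer`."""
        return sum(1 for s in self._samples if s.writer == writer)
```

Returning False from delete_sample instead of raising is a design choice of this sketch; the remove-sample use case only requires that the process terminates when the writer or the sample does not exist.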



DataTrainer

This class focuses on the internal methodology of the system.

Name of the class: DataTrainer

Brief description: The DataTrainer collaborates with Sample_Manager to train the system and create the data set.

FIELDS               EXPLANATION
SampleComponent      Used to create or manipulate the sample of a writer.
ComponentData        Holds the fields that depend on the set of elements.

METHODS              EXPLANATION
GetSAMComponent()    Getter required by the user interfaces.
SetSAMComponent()    Setter required by the user interfaces.
GetComponentDATA()   Getter for the data used by the interfaces.
SetComponentDATA()   Setter for the data used by the interfaces.
Valid [data, type]   Authenticates modifications; an overloaded method is required.
DataTrainer()        Constructor of the class.
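
The getter, setter and validation pattern listed above can be sketched in Python with properties. This is a hedged illustration only: the concrete types (a string for the sample component, a list for the component data) are assumptions, since the document does not specify them, and because Python has no method overloading the overloaded Valid method is represented by a single valid(data, expected_type) function.

```python
class DataTrainer:
    def __init__(self, sample_component=None, component_data=None):
        self._sample_component = None
        self._component_data = None
        # Route initial values through the setters so they are validated too.
        if sample_component is not None:
            self.sample_component = sample_component
        if component_data is not None:
            self.component_data = component_data

    @staticmethod
    def valid(data, expected_type):
        """Accept a modification only if it has the right type and is non-empty."""
        return isinstance(data, expected_type) and len(data) > 0

    @property
    def sample_component(self):
        return self._sample_component

    @sample_component.setter
    def sample_component(self, value):
        if not self.valid(value, str):
            raise ValueError("sample component must be a non-empty string")
        self._sample_component = value

    @property
    def component_data(self):
        return self._component_data

    @component_data.setter
    def component_data(self, value):
        if not self.valid(value, list):
            raise ValueError("component data must be a non-empty list")
        self._component_data = value
```

With this shape, every modification passes through valid before it is stored, which matches the stated purpose of authenticating modifications.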

Classifier

This class is the engine of the system under development: it performs the actual classification and produces the results.

Name of the class: Classifier

Brief description: This is the core working engine of the application and generates the results.

FIELDS                EXPLANATION
Sample[] samples      An object array that holds all samples currently in the data set.
                      PDL (program description language)
                      private Sample[] samples;
Writer[] wrt          Holds all writers stored in the system, so that they can be iterated
                      and the samples obtained for each one on each iteration.
                      PDL (program description language)
                      private Writer[] wrt;

METHODS               EXPLANATION
GetSamplesByWriter()  Takes a writer as its argument and returns the array of that
                      writer's samples from the sample collection.

PDL (program description language)
// Requires: using System.Collections.Generic;
public Sample[] GetSamplesByWriter(Writer writer)
{
    // Collect matches in a list, because the number of matches is not known in advance.
    List<Sample> req = new List<Sample>();
    foreach (Sample smp in samples)
    {
        // Compare the writer of the current sample, not of the whole collection.
        if (smp.writer == writer)
        {
            req.Add(smp);
        }
    }
    return req.ToArray();
}


6 Object Collaboration
6.1 Object Collaboration Diagram

[Figure: Diagram representing the collaboration of objects]

7 Dynamic Model
7.1 Sequence Diagrams

[Figure: Sequence diagram for inserting a sample]
[Figure: Sequence diagram for deleting a sample]
[Figure: Sequence diagram for training of the data set]
[Figure: Sequence diagram for character recognition]
[Figure: Sequence diagram for preprocessing]
[Figure: Sequence diagram for overall handwritten classification]

7.2 State Diagram

[Figure: State diagram]


8 Non-Functional Requirements

8.1 Performance Requirements

● The software should be resource friendly and must not overload the RAM and processor.
● The software must be multi-tab without added complexity. It must serve many users at a time, and users must not have to wait for one another.
● The system should be fast, secure, and reliable. The data structures used in it must be simple and free of unnecessary complexity.

9 Supplementary Documentation
9.1 Required Tools

The tools used are UML (Unified Modeling Language) tools.