Thesis Proposal (SE 801)

An Approach to Avail Personalized Word Prediction in Computer Based Communication

Submitted To

Dr. Md. Shariful Islam
Associate Professor
IIT, University of Dhaka

Submitted By

Anirudhya Robi (BSSE 0333)
IIT, University of Dhaka

Supervised By

Sheikh Mohammad Sarwar
Lecturer
IIT, University of Dhaka

To Whom It May Concern


This is to certify that Anirudhya Robi (BSSE 0333), Institute of Information Technology,
University of Dhaka, will work on the research topic "An Approach to Avail Personalized
Word Prediction in Computer Based Communication" under my supervision until November
2014.

I wish his research every success.

__________________________
Sheikh Mohammad Sarwar
Lecturer
Institute of Information Technology, University of Dhaka

Description of Proposed Research


In a world shaped by social media, people converse with each other almost every day
through informal dialogue or chat, which has become one of the major means of
communication. People in many countries commonly use phonetic typing in casual
messaging and chat. A personalized word prediction system would benefit everyday use by
reducing the time spent typing while keeping the experience personal and comfortable.
Such a system would be a contribution toward faster typing in informal computer based
communication.
The prediction system would estimate the next word from the preceding word or words
and from the relationship between the two people in the conversation. It could further
refine the probability of the next word by conditioning on the letters already typed for
the current word. The user would either have the most probable word inserted
automatically or be shown a list of probable words to choose from through an interface.
This would increase typing speed considerably and make it easier for the user to carry on
a smooth dialogue.
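The prediction step described above can be illustrated with a minimal sketch. It assumes a plain bigram count table and whitespace tokenization; the class name `BigramPredictor`, its methods and the sample messages are hypothetical and are not part of the proposed system.

```python
from collections import Counter, defaultdict


class BigramPredictor:
    """Toy next-word predictor: bigram counts plus prefix filtering."""

    def __init__(self):
        # following[prev][word] = times `word` appeared right after `prev`
        self.following = defaultdict(Counter)

    def learn(self, message):
        """Update the bigram counts from one message."""
        words = message.lower().split()
        for prev, word in zip(words, words[1:]):
            self.following[prev][word] += 1

    def suggest(self, prev_word, prefix="", k=3):
        """Return up to k words seen after prev_word that start with prefix."""
        candidates = self.following.get(prev_word.lower(), Counter())
        prefix = prefix.lower()
        matches = [(w, c) for w, c in candidates.items() if w.startswith(prefix)]
        # Most frequent first; ties broken alphabetically for stable output.
        matches.sort(key=lambda wc: (-wc[1], wc[0]))
        return [w for w, _ in matches[:k]]


# Made-up chat history containing a personal spelling ("tmrw").
predictor = BigramPredictor()
for msg in ["see you tomorrow", "see you tmrw", "see you tomorrow", "thank you"]:
    predictor.learn(msg)

print(predictor.suggest("you"))               # candidate list for the interface
print(predictor.suggest("you", prefix="tm"))  # narrowed by the typed letters
```

The first element of the returned list corresponds to the word that could be inserted automatically; the full list corresponds to the choices offered through the interface.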
The aim is to deliver strong personalized prediction performance using as little data as
possible, with a dictionary that adapts to phonetic spellings and is learned automatically.
The prediction system would employ probabilistic language modeling, such as the unigram
model and higher-order n-gram models (bigram, trigram, and so on). Because language
modeling is a task in NLP (Natural Language Processing), a database may be needed as a
reference that serves as the training set for the system. To improve the suggestions, the
users' messaging history should also be included in the database. The prediction system
should also use a learning process so that its results improve over time.
The steps of LM (Language Modeling) would comprise evaluation, interpolation and
smoothing.
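For reference, the three language modeling steps named above are often written in the following standard textbook forms. These are general formulas, not necessarily the exact variants this research will adopt; the weights \lambda_j, the count function c(.) and the vocabulary size |V| are notation introduced here for illustration.

```latex
% Linear interpolation of trigram, bigram and unigram estimates
\[
\hat{P}(w_i \mid w_{i-2}, w_{i-1}) =
  \lambda_3 \, P(w_i \mid w_{i-2}, w_{i-1})
  + \lambda_2 \, P(w_i \mid w_{i-1})
  + \lambda_1 \, P(w_i),
\qquad \lambda_1 + \lambda_2 + \lambda_3 = 1,\ \lambda_j \ge 0
\]

% Add-one (Laplace) smoothing of a bigram estimate, with vocabulary size |V|
\[
P_{\text{add-1}}(w_i \mid w_{i-1}) = \frac{c(w_{i-1}, w_i) + 1}{c(w_{i-1}) + |V|}
\]

% Evaluation by perplexity on a held-out sequence of N words
\[
\mathrm{PP}(w_1, \dots, w_N) = P(w_1, \dots, w_N)^{-1/N}
\]
```

Interpolation lets the model fall back on shorter contexts when a long context has rarely been seen, smoothing keeps unseen word pairs from receiving zero probability, and a lower perplexity on held-out messages indicates a better model.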
Fitrianie and Rothkrantz [1] proposed an adaptive keyboard that autonomously adjusts its
predictive features and key displays to the current user input, and that uses personalized
word prediction based on a common English dictionary to improve performance. The
built-in English dictionary is a hindrance: with an existing database in place, the system
would need to overwrite it to support personalized phonetic words in any other language.
Kucukyilmaz et al. [2] investigated the possibility of predicting several user and message
attributes in text-based, real-time, online messaging services, aiming to identify the
author of a chat message correctly using a style-based approach. Wood [3] suggested a
system to improve the rate at which users can participate in a conversation using an AAC
(Augmentative and Alternative Communication) device, which is intended for disabled
people who are unable to communicate verbally. Freedman et al. [4] proposed a learning
approach that employs hierarchical modeling of phrases; it is expected to offer sufficient
out-of-the-box performance relative to other learning approaches while reducing the
amount of initial training data required to facilitate on-line personalization of the text
prediction system. This work is also intended for the development of assistive
technologies for people with disabilities, especially AAC devices. The key insight of that
approach is the separation of stopwords, which primarily play syntactic roles in phrases,
from keywords, which provide the context and meaning in the phrase. Sjoberg [5]
developed a predictor for internet chat using an interpolated trigram formula and a
bigram cache; however, this method relies on a huge database of training corpora.

Research Questions
The reviewed methodologies for personalized word prediction rely mostly on estimation
from a built-in database consisting of a dictionary for a particular language. This
enlarges the training set and reduces the chance of phonetic prediction during the user's
initial experience. The same word may be spelled differently from person to person in
conversation, which calls for phonetic personalization that can be language independent.
Hence, the question that starts the research is:
How can the complexity of the additional dependencies be reduced, given that the method
has to consider the relationship between the two people in the conversation?
One way to resolve this is to leave the dependency undefined at first and let it develop by
syncing with the chat history of the two people. Requiring the user to define an
association with the other person before using the predictor would not be user-friendly,
and the association may change over time.
Another question is:
How can a word predictor be implemented without initializing it with a central dictionary?
It may have to use a personalized dictionary of the kind found on many smartphones
today, which suggests that there must still be some central dictionary to interpret the
word the user is typing for the predictor. The prediction system could use the existing
system's dictionary rather than embedding its own word list, and it could also learn
immediately from previous messaging history, personal documents and chat logs so that it
is useful from the start.
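As a rough illustration of starting without an embedded word list, the sketch below builds a word list purely from text the user has already written. The class `LearnedDictionary`, its methods and the sample chat lines are hypothetical; reading real chat logs or the host system's dictionary is left out because the proposal does not specify their formats.

```python
import re
from collections import Counter


class LearnedDictionary:
    """Word list that starts empty and grows from the user's own text."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, text):
        """Add every alphabetic token in text, in any script, to the word list."""
        # [^\W\d_]+ matches runs of Unicode letters only.
        self.counts.update(re.findall(r"[^\W\d_]+", text.lower()))

    def complete(self, prefix, k=3):
        """Return up to k learned words starting with prefix, most used first."""
        prefix = prefix.lower()
        matches = [(w, c) for w, c in self.counts.items() if w.startswith(prefix)]
        matches.sort(key=lambda wc: (-wc[1], wc[0]))
        return [w for w, _ in matches[:k]]


# Made-up chat log lines; informal spellings are learned as they are typed.
dictionary = LearnedDictionary()
for line in ["gm bro, hows ur day", "ur late again bro", "brb"]:
    dictionary.learn(line)

print(dictionary.complete("br"))
print(dictionary.complete("x"))
```

Because nothing is assumed about the language, the same code would accept phonetic spellings of any language as soon as they appear in the history.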
Personalization needs to be taken to a new level, which raises another question:
What personalization feature would make the system different?
The concept for making the system more user-friendly and personal is to provide
different suggestions for different receivers for the same user or sender.
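One possible reading of receiver-specific suggestions is sketched below, under the assumption that bigram counts are kept per receiver and that the sender's overall counts serve as a fallback while no history with a receiver exists yet. The class `PerReceiverPredictor`, the receiver labels and the sample messages are hypothetical.

```python
from collections import Counter, defaultdict


class PerReceiverPredictor:
    """Bigram suggestions that depend on who the message is addressed to."""

    def __init__(self):
        # by_receiver[receiver][prev][word] = count within that conversation
        self.by_receiver = defaultdict(lambda: defaultdict(Counter))
        # overall[prev][word] = count across all of the sender's conversations
        self.overall = defaultdict(Counter)

    def learn(self, receiver, message):
        """Update both the per-receiver and the overall counts."""
        words = message.lower().split()
        for prev, word in zip(words, words[1:]):
            self.by_receiver[receiver][prev][word] += 1
            self.overall[prev][word] += 1

    def suggest(self, receiver, prev_word, k=3):
        """Prefer history with this receiver; otherwise use the overall counts."""
        prev_word = prev_word.lower()
        personal = self.by_receiver.get(receiver, {}).get(prev_word)
        counts = personal if personal else self.overall.get(prev_word, Counter())
        ranked = sorted(counts.items(), key=lambda wc: (-wc[1], wc[0]))
        return [w for w, _ in ranked[:k]]


# Made-up history of one sender talking to two different receivers.
model = PerReceiverPredictor()
model.learn("friend", "see you later dude")
model.learn("friend", "see you later")
model.learn("boss", "see you tomorrow sir")

print(model.suggest("friend", "you"))    # shaped by chats with the friend
print(model.suggest("boss", "you"))      # shaped by chats with the boss
print(model.suggest("stranger", "you"))  # no shared history: overall counts
```

The fallback means the relationship between two people never has to be declared in advance: it is undefined at first and emerges as chat history with that receiver accumulates.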

Research Methodologies
The research intends to propose a system that offers automatic personalized prediction
and word completion in computer based communication. First, a literature study was done
to identify the area of interest and the scope for contribution in that area. After this
identification, the appropriate area for enhancement was found and the goal of the
research was set. The background study then began; it is a continuous process and will
proceed for the next five months.
The next activity will be to convert the abstract idea into a structure or architecture
and to assess whether the idea can be implemented within the assumed time constraint.
The architecture can be compared with existing ones using performance metrics of test
cases, for example test coverage, test executability, test effectiveness and defect
discovery rate.
At the end of each major step, a technical report summarizing the findings will be produced.
Significant achievements will be published and presented at international conferences. A
thesis for the fulfillment of the Honors program will be compiled at the end of this research.
Steps | Title of Activity | Activity Description

01 | Literature Survey | Developing the background for the research; this will be an ongoing task. Presenting initial ideas. Writing research proposal.

02 | Study Existing Methods for Personalized Prediction | Intensifying the focus on a more specific area; this task will cover the similar or related works to our area of interest exhaustively. Reporting on survey. Extracting essential information.

03 | Propose an Approach Concerning the Research Objectives | Identifying the paths to generate test suites. Implementing the architecture. Evaluating the structure against existing techniques. Producing a technical report and publishing a conference paper.

04 | Comparison with Existing Work | Studying existing methods to identify the required structure for test generation. Proposing new method. Evaluating the system against existing ones. Producing a technical report and publishing a conference paper.

05 | Technical Report Writing | At the end of each major step one technical report will be published.

06 | Publications and Presentation(s) | Significant achievements will be published and presented in international conferences.

07 | Thesis Completion | In order to accomplish the bachelor's by Research program a thesis will be compiled at the end.

Table 1: Research Methodologies

Research Timeline
Timeline: BSSE 8th Semester (5 months)

Title of Activity
1. Literature Survey
2. Study Existing Methods for Personalized Prediction
3. Propose an Approach Concerning the Research Objectives
4. Comparison with Existing Work
5. Technical Report Writing
6. Publications and Presentation(s)
7. Thesis Completion

Table 2: Research Timeline

Rationale for the Research


The research could lead to a user-oriented word predictor for instant messaging, with
each user's own flavor, that significantly assists people who use computer based
communication. The ever-growing field of social media and instant messaging systems has
created the need for a system that supports fast, comfortable and smooth typing.
Many people use phonetic typing for informal dialogue, and, most significantly, phonetic
typing varies from person to person, so an adapted prediction system would be very
helpful.

References
[1] Siska Fitrianie and Leon J.M. Rothkrantz. An Adaptive Keyboard with Personalized
Language-based Features. Man-Machine Interaction, Delft University of Technology.
[2] Tayfun Kucukyilmaz, B. Barla Cambazoglu, Cevdet Aykanat, Fazli Can. Chat Mining:
Predicting User and Message Attributes in Computer-Mediated Communication.
[3] Matthew Edward John Wood. Syntactic Pre-Processing in Single-Word Prediction for
Disabled People. PhD Thesis, Faculty of Engineering, Department of Computer Science,
University of Bristol.
[4] Richard G. Freedman, Jingyi Guo, William H. Turkett, Jr. and V. Paúl Pauca.
Hierarchical Modeling to Facilitate Personalized Word Prediction for Dialogue.
[5] Patrick Sjoberg. Word Prediction in an Internet Chat. MS Thesis, Language Engineering
Program, Department of Linguistics, Uppsala University.
