CHIEF PATRON
Prof. M. N. Navale
Founder President, Sinhgad Institutes
PATRON
Dr. (Mrs.) S. M. Navale
Founder Secretary, Sinhgad Institutes
PATRON
Mr. R. M. Navale
Vice-President (HR), Sinhgad Institutes
PATRON
Mrs. Rachana Navale-Ashtekar
Vice-President (Admin), Sinhgad Institutes
CONVENOR
Dr. P. N. Mahalle
Professor & Head,
Member, BoS Computer Engineering, SPPU,
Ex-Chairman, BoS Information Technology, SPPU, Pune
ORGANIZING SECRETARY
Dr. G. R. Shinde
Prof. J. N. Nandimath
CORE TECHNICAL COMMITTEE
Prof. S. K. Pathan
Prof. S. P. Pingat
Prof. R. A. Satao
Prof. V. S. Deshmukh
Prof. V. V. Kimbahune
Prof. A. A. Deshmukh
Prof. V. R. Ghule
Prof. P. S. Desai
Prof. P. N. Railkar
Prof. P. S. Raskar
Prof. S. R. Pavshere
Prof. P. A. Sonewar
Prof. P. R. Chandre
Prof. A. B. Kalamkar
Prof. S. A. Kahate
Prof. B. D. Thorat
Prof. P. S. Teli
Prof. P. P. Patil
Prof. D. T. Bodake
Prof. G. S. Pise
Prof. S. P. Patil
Prof. M. Tamboli
Message from Principal
Dr. A. V. Deshpande
Principal,
Smt Kashibai Navale College of Engineering,
Pune.
With the advent of high-speed communication, tremendous impetus has been felt in various core-sector technologies related to computer networking. These include next generation networks, advanced database technologies such as data mining and information retrieval, image and signal processing, etc. There has also been tremendous advancement in soft-computing solution systems such as cloud computing, grid computing, neural networks, and network and cyber security. The Internet, web and other service sectors have gone through a sea change in the last decade.
A need was therefore felt to organize this International Conference on "Internet of Things, Next Generation Networks and Cloud Computing 2019" (ICINC 2019) to acquaint researchers, faculty and students of this college with the latest trends and developments in this direction. This conference indeed provides a very useful platform for a close congregation of industry and academia. The conference addresses the trends, challenges and future roadmaps within a conglomerate of existing and novel wireless technologies and recent advances in information theory and its applications. To make the event more meaningful we interacted with premier institutes, organizations and leading industries across the country in the field of computer networking and requested them to demonstrate and share the latest technology with participants. I am sure this close interaction with them will enrich us all with knowledge of the latest developments.
Message from Vice Principal
Dr. K. R. Borole
Vice Principal,
Smt Kashibai Navale College of Engineering,
Pune.
Warm and happy greetings to all. I am immensely happy that the Department of Computer Engineering of Smt. Kashibai Navale College of Engineering, Vadgaon (Bk), Pune is organizing the International Conference on "Internet of Things, Next Generation Networks and Cloud Computing 2019" (ICINC 2019) on February 15-16, 2019. The conference addresses the trends, challenges and future roadmaps within a conglomerate of existing and novel wireless technologies and recent advances in information theory and its applications. The conference features a comprehensive technical program including special sessions and short courses.
The dedicated Head of the Department of Computer Engineering, Dr. P. N. Mahalle (Convener), Dr. G. R. Shinde and Prof. J. N. Nandimath (Organizing Secretaries), the staff members, and the disciplined undergraduate students, postgraduate students and research scholars of Smt. Kashibai Navale College of Engineering, Vadgaon (Bk), Pune are the added strengths of our college. On this occasion I would like to express my best wishes for this event.
I congratulate the Head of Department, the staff members and students of the Computer Engineering Department, and the participants from colleges all over India and abroad, for organizing and participating in this conference.
I express my sincere thanks to all the authors, invited speakers, session chairpersons, participants and the proceedings publication team, who took painstaking efforts in reviewing the research papers and technical manuscripts included in this proceeding.
Message from Convener & Head of Department
Dr. Parikshit N. Mahalle
Professor & Head,
Dept. of Computer Engineering,
Smt Kashibai Navale College of Engineering,
Pune.
Dr. G. R. Shinde
Organizing Secretary,
Smt Kashibai Navale College of Engineering,
Pune.
Dear friends,
Adding a new chapter to the tradition of international conference proceedings at our college, I am very happy to place before you the proceedings of the 4th International Conference, ICINC 2019. As an Organizing Secretary, allow me to introduce this proceeding. It consists of 96 papers spread across six domains. I laud my editorial team, which has brought out this copy with beautiful and research-rich presentations; it was indeed a herculean task. It has been my pleasure to guide and coordinate them in bringing out this proceeding.
My sincere thanks to Prof. M. N. Navale, Founder President, STE Society, Pune; Dr. (Mrs.) S. M. Navale, Secretary, STE Society, Pune; Ms. Rachana Navale-Ashtekar, Vice-President (Admin), STE Society, Pune; and Mr. Rohit M. Navale, Vice-President (HR), STE Society, Pune, for their encouragement and support. I would also like to thank my Principal, Dr. A. V. Deshpande, for his unstinted help and guidance. Dr. K. R. Borole, Vice Principal, and Dr. P. N. Mahalle, Head of the Computer Department, have been kind enough to advise me in carrying out this onerous responsibility of managing the functions of Organizing Secretary. I would also like to thank Savitribai Phule Pune University for its association with us.
I hope the research community will enjoy reading this proceeding during their research.
Message from Organizing Secretary
Prof. J. N. Nandimath
Organizing Secretary,
Smt Kashibai Navale College of Engineering,
Pune.
Dear Friends,
Research is an important activity of human civilization. It is crucial for improving the economy of our country and achieving sustainable development. The outcomes of research should not be confined to research laboratories; effort must be made so that humanity can benefit from new developments in research. At the same time, research education should also be given due importance, in order to attract young talented persons to research and equip them with the knowledge, information and wisdom suitable for industry.
The 4th International Conference on "Internet of Things, Next Generation Networks and Cloud Computing 2019" (ICINC 2019) aims to provide a common platform for the research community, industry and academia. It is also expected to be a wonderful gathering of senior and young professionals of the Department of Computer Engineering carrying out research.
We wish to thank all the authors, reviewers, sponsors and invited speakers, the members of the advisory board and organizing team, the student volunteers, and all others who have contributed to the successful organization of this conference. I am very grateful to Prof. M. N. Navale, Founder President, STE Society, Pune; Dr. (Mrs.) S. M. Navale, Secretary, STE Society, Pune; Ms. Rachana Navale-Ashtekar, Vice-President (Admin), STE Society, Pune; and Mr. Rohit M. Navale, Vice-President (HR), STE Society, Pune, for their encouragement and support. I would also like to thank Principal Dr. A. V. Deshpande for his generous help and guidance. Dr. K. R. Borole, Vice Principal, and Dr. P. N. Mahalle, Head of the Computer Department, have been kind enough to advise me in carrying out this arduous responsibility of managing the functions of Organizing Secretary.
I would also like to thank Savitribai Phule Pune University for its association and for providing necessary funding.
Index
Sr. No    Title    Page No
Internet of Things
1 Automated Toll Collection System And Theft Detection Using RFID 1
Samruddhi S. Patil, Priti Y. Holkar, Kiran A. Pote, Shubhashri K. Chavan,
Asmita Kalamkar
2 WI-FI Based Home Surveillance Bot Using PI Camera & Accessing Live 7
Streaming Using Youtube To Improve Home Security
Ritik Jain, Varshun Tiku, Rinisha Bhaykar, Rishi Ahuja, Prof. S.P.Pingat
3 Smart Dustbin With Metal Detector 12
Dhiraj Jain, Vaidehi Kale, Raksha Sisodiya, Sujata Mahajan, Dr. Mrs.
Gitanjali R. Shinde
4 Improvement In Personal Assistant 17
Ashik Raj, Sreeja Singh, Deepak Kumar, Deshpande Shivani Shripad
5 IoT Based Home Automation System For Senior Citizens 20
Ashwathi Sreekumar, Divyanshi Shah, Himanshi Varshney
6 Smart Traffic Control System Using Time Management 25
Gaikwad Kavita Pitambar, More Sunita Vitthal, Nalge Bhagyashree Muktaji
7 The Pothole Detection: Using A Mobile Sensor Network For Road 29
Surface Monitoring
Sanket Deotarse, Nate Pratiksha, Shaikh Kash, Sonnis Poonam
8 IoT Based Agricultural Soil Prediction For Crops With Precautions 33
Prof.Yashanjali Sisodia, Pooja Gahile, Chaitali Meher
9 IoMT Healthcare: Security Measures 36
Ms. Swati Subhash Nikam, Ms. Ranjita Balu Pandhare
10 Smart Wearable Gadget For Industrial Safety 42
Ketki Apte, Rani Khandagle, Rijwana Shaikh, Rani Ohal
11 Smart Solar Remote Monitoring and Forecasting System 45
Niranjan Kale, Akshay Bondarde, Nitin Kale, Shailesh Kore,
Prof.D.H.Kulkarni
12 Smart Agriculture Using Internet of Things 50
Akshay Kudale, Yogesh Bhavsar, Ashutosh Auti, Mahesh Raykar,
Prof. V. R. Ghule
13 Area-Wise Bike Pooling - "BikeUp" 54
Mayur Chavhan, Sagar Tambe, Amol Kharat, Prof. S. P. Kosbatwar
14 Smart Water Quality Management System 58
Prof. Rachana Satao, Rutuja Padavkar, Rachana Gade, Snehal Aher, Vaibhavi
Dangat
15 Intelligent Water Regulation Using IoT 62
Shahapurkar Shreya Somnath, Kardile Prajakta Sudam, Shipalkar Gayatri
Satish, Satav Varsha Subhash
16 Smart Notice Board 65
Shaikh Tahura Anjum Vazir, Shaikh Fiza Shaukat, Kale Akshay Ashok
17 Vehicle Identification Using IOT 68
Miss Yashanjali Sisodia, Mr. Sudarshan R. Diwate
18 Wireless Communication System Within Campus 72
Mrs. Shilpa S. Jahagirdar, Mrs. Kanchan A. Pujari
19 License Plate Recognition Using RFID 77
Vaibhavi Bhosale, Monali Deoghare, Dynanda Kulkarni, Prof. S. A. Kahate
Cloud Computing
74 Cloud Stress Distribution And De-Duplication Check Of Cloud Data 384
With Secure Data Sharing Via Cloud Computing
Amruta Deshmukh, Rajeshri Besekar, Raveena Gone, Roshan Wakode, Prof. D. S. Lavhkare
75 Efficient Client-Side Deduplication Of Encrypted Data With Improved 389
Data Availability And Public Auditing In Cloud Storage
Akash Reddy, Karishma Sarode, Pruthviraj Kanade, Sneha M. Patil
76 A Novel Methodology Used To Store Big Data Securely In Cloud 397
Kale Piyusha Balasaheb, Ukande Monika Prakash
77 Survey Paper on Secure Heterogeneous Data Storage Management with 402
Deduplication in Cloud Computing
Miss. Arati Gaikwad, Prof. S. P. Patil
78 Survey on A Ranked Multi-Keyword Search in Cloud Computing 411
Mr. Swaranjeet Singh, Prof. D. H. Kulkarni
79 Private Secure Scalable Cloud Computing 417
Himanshu Jaiswal, Sankalp Kumar, Janhvi Charthankar, Sushma Ahuja
environment was built, and then, for each location of the label attached on the container, the distance between the container and the antenna along a fixed direction was changed. Finally, they concluded on how to determine the preferred location of an RFID tag.

3. GAP ANALYSIS
In India, almost all toll collection at toll plazas is done manually. Due to the large population and heavy road transportation, this is time consuming and causes traffic congestion at toll plazas. Some toll plazas in India have started to implement electronic toll collection, but it has not yet been implemented on a large scale. Though many systems have been proposed for automated toll collection, the issue of theft detection has not been addressed so far. So, to enhance the current systems, we propose automated toll collection with theft detection, to overcome time consumption, long queues and fuel wastage, and to identify stolen vehicles.

4. PROPOSED SYSTEM
In this proposed system we use RFID (Radio Frequency Identification) technology, which uses radio frequency to identify objects. RFID thus enables automatic toll collection, which conserves time and energy and provides an efficient system for automated transactions.
In the proposed system, RFID tags are used. They can be attached to the front portion of the vehicle, i.e. the windshield, or to its side. Passive tags are used because of their feasibility: they do not have their own battery. When a vehicle enters the toll gate, the active device, i.e. the reader, emits radio waves; as soon as these waves make contact with a tag, a magnetic field is produced, from which the tag draws power and sends its data to the controller.
The reader is connected to a microcontroller; an Arduino with the ATmega328 is used as the microcontroller here. The reader scans the tag and sends the ID to the main system, the Arduino, which checks it against the database for that unique ID. There is a user interface on the desktop at the toll plaza; after the information is checked against the database, the details are displayed on this interface. If the details match, the amount is deducted and a command is issued to the servo motor to lift the barricade. A central database is maintained, so as soon as a vehicle enters the toll plaza its RFID tag is scanned and information regarding the vehicle is displayed. The toll is deducted automatically, and a message is sent to the registered mobile number using GSM technology. If the RFID number is not matched, the barricade is not lifted and the vehicle is blocked there; this is the theft detection. A servo motor is used for the movement of the barricade.
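The scan-check-actuate flow described above can be sketched as follows. This is a minimal illustration only: the tag ID, database contents, toll amount and function names are our own assumptions, and the actual servo and GSM actions are reduced to return values.

```python
# Hypothetical sketch of the plaza-side logic: look up the scanned tag ID in the
# central database; a match deducts the toll and raises the barricade, a
# mismatch keeps the barricade down (theft detection).
VEHICLE_DB = {
    "3F00215A7B": {"owner": "registered user", "balance": 250.0},
}

def on_tag_scanned(tag_id: str, toll: float = 50.0) -> str:
    record = VEHICLE_DB.get(tag_id)
    if record is None:
        return "blocked"       # barricade stays down; vehicle flagged as possibly stolen
    record["balance"] -= toll  # deduct toll; an SMS would then be sent via GSM
    return "barrier_up"        # command the servo motor to lift the barricade
```

A real deployment would replace the dictionary with the plaza's central database and drive the servo and GSM modem from the Arduino side.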
    deduct_money(tollamt * 0.75);
else:
    deduct_amount(tollamt);
else:
    if (node.amount > 100):
        send_warning_msg();
    else:
        send_redalert_msg();
        send_msg_to_user_to_add_money();
end;
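The pseudocode fragment above begins mid-branch (its opening condition falls outside this extract). A runnable Python sketch of the same deduction logic follows, under stated assumptions: we assume the missing outer condition checks that the balance covers the toll, and that `discount_eligible` models the branch charging `tollamt * 0.75`; all names here are illustrative.

```python
def process_toll(balance: float, toll_amt: float, discount_eligible: bool):
    """Return (new_balance, action) following the paper's pseudocode branches."""
    if balance >= toll_amt:                      # assumed outer condition: balance covers toll
        charged = toll_amt * 0.75 if discount_eligible else toll_amt
        return balance - charged, "deducted"
    if balance > 100:
        return balance, "warning"                # send_warning_msg()
    return balance, "redalert_add_money"         # send_redalert_msg(); ask user to add money
```

For example, a discount-eligible vehicle with a balance of 500 paying a toll of 100 is charged 75.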
Using Embedded Linux", 2015 International Conference on Circuit, Power and Computing Technologies [ICCPCT].
[2] Hanit Karwal, Akshay Girdhar, "Vehicle Number Plate Detection System for Indian Vehicles", 2015 IEEE International Conference on Computational Intelligence & Communication Technology.
[3] Sana Said Al-Ghawi, Muna Abdullah Al Rahbi, Dr. S. Asif Hussain, S. Zahid Hussain, "Automatic Toll E-Ticketing System for Transportation Systems", 2016 3rd MEC International Conference on Big Data and Smart City.
[4] Renata Rampim de Freitas Dias, Hugo E. Hernandez-Figueroa, Luiz Renata Costa, "Analysis of impacts on the change of frequency band for RFID system in Brazil", Proceedings of the 2013 IEEE International Conference on RFID Technologies and Applications, 4-5 September 2013, Johor Bahru, Malaysia.
[5] Pinaki Ghosh, Dr. Mahesh T R, "A Privacy Preserving Mutual Authentication Protocol for RFID based Automated Toll Collection System", November 2016.
[6] A. A. Pandit, Jyot Talreja, Ankit Kumar Mundra, "RFID Tracking System for Vehicles (RTSV)", 2009 First International Conference on Computational Intelligence, Communication Systems and Networks.
[7] K. Gowrisubadra, Jeevitha S., Selvarasi N., "A Survey on RFID Based Automatic Toll Gate Management", 2017 4th International Conference on Signal Processing, Communications and Networking (ICSCN 2017), March 16-18, 2017, Chennai, India.
[8] Alfonso Gutierrez, F. Daniel Nicolalde, Atul Ingle, Clive Hohberger, Rodeina Davis, William Hochschild and Raj Veeramani, "High-Frequency RFID Tag Survivability in Harsh Environments: Use of RFID in Transfusion Medicine", 2013 IEEE International Conference on RFID.
[9] Rudy Hermawan Karsaman, Yudo Adi Nugraha, Sri Hendarto, Febri Zukhruf, "A Comparative Study on Three Electronics Toll Collection Systems in Surabaya", 2015 International Conference on Information Technology Systems and Innovation (ICITSI), Bandung-Bali, November 16-19, 2015, ISBN: 978-1-4673-6664-9.
[10] Janani Krishnamurthy, Nitin Mohan, Rajeshwari Hegde, "Automation of Toll Gate and Vehicle Tracking", International Conference on Computer Science and Information Technology, 2008.
[11] Shoaib Rehman Soomro, Mohammad Arslan Javed, Fahad Ahmed Memon, "Vehicle Number Recognition System for Automatic Toll Tax Collection", 7 December 2012.
[12] Jin Yeong Tan, Pin Jern Ker, Dineis Mani and Puvanesan Arumugam, "Development of a GPS-based Highway Toll Collection System", 2016 6th IEEE International Conference on Control System, Computing and Engineering, 25-27 November 2016, Penang, Malaysia.
[13] G. Srivatsa Vardhan, Naveen Sivadasan, Ashudeb Dutta, "QR-Code based Chipless RFID System for Unique Identification", 2016 IEEE International Conference on RFID Technology and Applications (RFID-TA).
[14] P. Vijayalakshmi, M. Sumathi, "Design of Algorithm for Vehicle Identification by Number Plate Recognition", IEEE Fourth International Conference on Advanced Computing, ICoAC 2012, MIT, Anna University, Chennai, December 13-15, 2012.
[15] Zhu Zhi-yuan, Ren He, Tan Jie, "A Method for Optimizing the Position of Passive UHF RFID Tags", IEEE International Conference on RFID-Technology and Applications, 17-19 June 2010, Guangzhou, China.
Ritik Jain, Varshun Tiku, Rinisha Bhaykar, Rishi Ahuja, Prof. S. P. Pingat
Department of Computer Engineering, Smt Kashibai Navale College of Engineering, Vadgaon (Bk), Pune, India.
ABSTRACT
There are various surveillance systems available, such as cameras, CCTV, etc. In these types of surveillance systems, only a person who is stationary and located in that particular area can view what is happening in that place. We propose a system for real-time live streaming and monitoring using a Raspberry Pi with Wi-Fi connectivity, with which movements can be monitored in 360 degrees, accomplished with the help of motors. We also detect gas leakage. By using video cameras and analyzing in real time the information returned by the robot, the computation effort, cost and resource requirements are significantly decreased.
1. INTRODUCTION
Traditionally, [1] surveillance systems are installed in security-critical areas. These systems generally consist of high-quality cameras, multiple computers for monitoring, servers for storing the videos, and many security personnel for watching these videos. Taken as a whole, such systems are highly complex to install as well as to maintain. CCTV camera feeds are only visible at certain locations, and they have a limited range within which they can be viewed. Above all, the cost of implementing these systems is so high that they cannot be installed in every household.
The Raspberry Pi is a credit-card-sized computer; it functions almost like a full computer. In existing surveillance systems such as camera and CCTV setups, only a person who is stationary and located in that particular area can view what is happening in that place, whereas here the feed can be viewed even if a person is moving from place to place. The main advantages of this system are that it can be used for security purposes and that it offers privacy on both sides, since the feed is viewed only by an authorized person. The Raspberry Pi is a simple circuit, and the operating system used is Raspbian OS. Gas leakage is one of the most frequently monitored parameters, and is extremely harmful, so the proposed system is capable of monitoring this value continuously without any delay. Our proposed system is implemented on a Raspberry Pi interfaced with a gas sensor for controlling the device, and live video streaming is implemented for quick action. Mobile video surveillance has been envisioned in the literature as classical video streaming extended over wired and wireless networks under the control of a human operator. Remote monitoring is becoming an important network-based maintenance method. There are two units, a Raspberry Pi unit and a process unit, with a wireless link between them. The sensor unit sends its readings to the Raspberry Pi unit, which uploads them to the server. The Pi camera is connected to the Raspberry Pi CSI camera port.

2. MOTIVATION
A robot is generally an electro-mechanical machine that can perform tasks automatically. Security is one of the
4. GAP ANALYSIS
Table 1: Gap Analysis
5.1 ARCHITECTURE
Fig 1: Architecture
Fig 2: Flow Chart
5.2 MATHEMATICAL MODEL
The mathematical model for this system is as follows:
Input = {in1, in2, in3, in4}
Forward = {in1=1, in2=0, in3=1, in4=0}
Backward = {in1=0, in2=1, in3=0, in4=1}
Right = {in1=1, in2=0, in3=0, in4=0}
Left = {in1=0, in2=0, in3=1, in4=0}
Stop = {in1=0, in2=0, in3=0, in4=0}
where in1 & in2 denote the inputs of the left motor, and in3 & in4 denote the inputs of the right motor.

5.3 ALGORITHM
1. Result = get data from the Firebase database
2. If Result is equal to 'F', move robot FORWARD
3. If Result is equal to 'B', move robot BACKWARD
4. If Result is equal to 'R', move robot RIGHT

6. CONCLUSION
The smart supervisor system we have built is a surveillance and real-time video streaming system in which authentication is required for access. The smart supervisor system displays the gas sensor value; this message is based on the response received from the smart supervisor system server and the smartphone. Whenever a gas leakage is detected, a mail is sent to the registered mobile number. If the correct IP address is provided, the app proceeds to display the various device operations and video streaming operations. According to the instructions provided by the app on our Android mobile, we can operate the movement of the robot, which can move in the forward, backward, left and right directions. The command used for live streaming is as follows:
raspivid -o - -t 0 -vf -hf -fps 10 -b 500000 | ffmpeg -re -ar 44100 -ac 2 -acodec pcm_s16le -f s16le -ac 2 -i /dev/zero -f h264 -i - -vcodec copy -acodec aac -ab 128k -g 50 -strict experimental -f flv rtmp://a.rtmp.youtube.com/live2/j1s8-d349-9536-8d6r
Surveillance systems are available with various features, and selection is based on factors such as cost and video quality. The proposed system is cost effective as well as user friendly. It has applications in different fields like the military, defence, homes, offices and environment monitoring. The system can be enhanced with face detection and recognition to follow a particular person, such as children below 4 years, so that they are continuously in front of our eyes.

7. FUTURE SCOPE
1. Major improvements in the system's processor speed are needed in order to process large files (e.g. video) for effective motion detection and tracking.
2. The designed security system can be used in homes to monitor the facility at any given time.
3. The system needs to be remotely controlled; future explorations should focus on this.

REFERENCES
[1] R, H., & Safwat Hussain, M. H. (2018). Surveillance Robot Using Raspberry Pi and IoT. 2018 International Conference on Design Innovations for 3Cs Compute Communicate Control (ICDI3C). doi:10.1109/icdi3c.2018.00018
[2] Oza, N., & Gohil, N. B. (2016). Implementation of cloud based live streaming for surveillance. 2016 International Conference on Communication and Signal Processing (ICCSP). doi:10.1109/iccsp.2016.7754297
[3] Nadvornik, J., & Smutny, P. (2014). Remote control robot using Android mobile device. Proceedings of the 2014 15th International Carpathian Control Conference (ICCC). doi:10.1109/carpathiancc.2014.6843630
[4] Bokade, A. U., & Ratnaparkhe, V. R. (2016). Video surveillance robot control using smartphone and Raspberry Pi. 2016 International Conference on Communication and Signal Processing (ICCSP). doi:10.1109/iccsp.2016.7754547
[5] Aneiba, A., & Hormos, K. (2014). A Model for Remote Controlled Mobile Robotic over Wi-Fi Network Using Arduino Technology. International Conference on Frontiers of Communications, Networks and Applications (ICFCNA 2014 - Malaysia). doi:10.1049/cp.2014.1429
[6] Abdalla, G. O. E., & Veeramanikandasamy, T. (2017). Implementation of spy robot for a surveillance system using Internet protocol of Raspberry Pi. 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT). doi:10.1109/rteict.2017.8256563
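Stepping back to Sections 5.2 and 5.3, the direction encoding and command dispatch can be sketched in Python. This is a sketch only: the tuple layout and the name `command_to_pins` are our own, the Firebase fetch is stubbed out, and the actual GPIO writes to the motor driver are omitted.

```python
# Direction encodings from Section 5.2: (in1, in2) drive the left motor,
# (in3, in4) drive the right motor.
DIRECTIONS = {
    "F": (1, 0, 1, 0),  # forward: both motors forward
    "B": (0, 1, 0, 1),  # backward: both motors reversed
    "R": (1, 0, 0, 0),  # right: only the left motor runs
    "L": (0, 0, 1, 0),  # left: only the right motor runs
    "S": (0, 0, 0, 0),  # stop
}

def command_to_pins(result: str):
    """Map a command fetched from the Firebase database (Section 5.3) to pin states."""
    return DIRECTIONS.get(result, DIRECTIONS["S"])  # unknown commands stop the robot
```

Defaulting unknown commands to the stop state is a safety choice: a dropped or garbled Firebase value halts the robot rather than continuing the last motion.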
waste management system, i.e. the dustbin. The main focus of our project is to create an automatic waste management system across the whole city, monitored efficiently by a single system, and to separate the metal in the garbage at its origin, reducing the separation of metals and garbage at the dumping place. It will also help to reduce the cost of separating metals from garbage. This can prove to be a new revolution in smart city implementation.

2. MOTIVATION
Malodorous rotten wastes that remain untreated for a long time, due to the negligence of authorities and the carelessness of the public, may lead to long-term problems. Breeding of insects and mosquitoes can cause dreadful diseases. The garbage also contains various metals that can be recycled; these are currently separated from the garbage at the dumping place, where the cost of separation is high. Garbage contains many types of metal items, like tin cans and metal containers, and this increases the cost of separating metal from garbage at the dumping place.

3. LITERATURE SURVEY
[1] Dharna Kaushik and Sumit Yadav in "Multipurpose Street-Smart Garbage bin based on IoT" proposed a system in which multiple smart garbage trash bins on a microcontroller board platform (Arduino board) are located throughout a city, campus or hospital. The Arduino board is interfaced with a GSM modem and an ultrasonic sensor. Once the threshold level is crossed, the ultrasonic sensor triggers the GSM module, which in turn continuously alerts the authorized person by sending SMS reminders until the dustbin is cleaned. Besides this, a central system keeps showing the current status of the garbage on a mobile web browser as an HTML page via Wi-Fi, and the shortest path for garbage collection vehicles is computed using Dijkstra's algorithm. This is real-time waste management using smart trash bins that can be accessed anytime, anywhere by the concerned person.
[2] Bikramjit Singh and Manpreet Kaur in "Smart Dustbins for Smart Cities" argued that the garbage collection system has to be smarter; in addition, people need easy accessibility to the garbage disposal points, and the garbage collection process has to be efficient in terms of time and fuel cost. The paper covers a GPS- and internet-enabled smart dustbin, garbage collection and disposal, garbage collection scheduling, and finding the nearest dustbin.
[3] Ahmed Omara, Damla Gulen, Burak Kantarci and Sema F. Oktug in "Trajectory-Assisted Municipal Agent Mobility: A Sensor-Driven Smart Waste Management System" proposed a WSN-driven system for smart waste management in urban areas. In the proposed framework, the waste bins are equipped with sensors that continuously monitor the waste level and trigger alarms that are wirelessly communicated to a cloud platform to actuate the municipal agents, i.e., waste collection trucks. They formulate an Integer Linear Programming (ILP) model to find the best set of truck trajectories with the objectives of minimum cost or minimum delay. In order for the trajectory assistance to work in real time, they propose three heuristics, one of which is greedy. Through simulations, they have shown that the ILP formulation can provide a baseline reference for the heuristics, whereas the non-greedy heuristics can significantly outperform the greedy approach in cost and delay under moderate waste accumulation scenarios.
[4] Minthu Ram Chiary, Sripathi SaiCharan, Abdul Rashath R. and Dhikhi T. in "DUSTBIN MANAGEMENT SYSTEM USING IOT" proposed a system in which the smart dustbins are connected to the internet to obtain their real-time information. In recent years there has been rapid population growth, which leads to more waste disposal, so a proper waste management system is necessary to avoid the spread of diseases, by managing the smart bins, monitoring their status, and taking decisions accordingly. There are multiple dustbins located in the city or campus (educational institutions, companies, hospitals, etc.). These dustbins are interfaced with a microcontroller-based system with ultrasonic sensors and Wi-Fi modules. The ultrasonic sensor detects the level of waste in the dustbin and sends a signal to the microcontroller; the signal is encoded, sent through Wi-Fi (ESP8266), and received by the end user. The data is sent to the user through e-mail, i.e., a mail is sent as notification that the dustbin is full, so that the municipality van can come and empty it.
[5] N. Sathish Kumar, B. Vuayalakshmi et al., in "IOT based smart garbage alert system using Arduino UNO" proposed a smart alert system for garbage clearance that gives an alert signal to the municipal web server for instant cleaning of the dustbin, with proper verification based on the level of garbage filling. This process is aided by an ultrasonic sensor interfaced with an Arduino UNO, which checks the level of garbage in the dustbin and sends the alert to the municipal web server once the dustbin is filled. After cleaning the dustbin, the driver confirms the task of emptying the garbage with the aid of an RFID tag. RFID is a computing technology used for the verification process; in addition, it enhances the smart garbage alert system by providing automatic identification of the garbage filled in the dustbin and sending the clean-up status to the server, affirming that the work is done. The whole process is upheld by an embedded module integrating RFID and IoT facilitation. An Android application is developed and linked to a web server to communicate the alerts from the microcontroller to the urban office and to perform remote monitoring of the cleaning process done by the workers, thereby reducing the manual process of monitoring and verification. The notifications are sent to the Android application using the Wi-Fi module.

4. GAP ANALYSIS
Table: Gap Analysis

System: Multipurpose Street-Smart Garbage bin based on IOT
  Benefits: Continuously alerts the authorized person by sending SMS reminders.
  Limitations: Access to the status is via a web browser as an HTML page; there is no application.

System: Smart Dustbins for Smart Cities
  Benefits: Provides the location of the nearest dustbin for disposing garbage.
  Limitations: Garbage collection scheduling is done only when many of the dustbins are full.

System: Dustbin Management System Using IOT
  Benefits: Microcontroller-based system with ultrasonic sensors and Wi-Fi modules.
  Limitations: The status of the dustbin is sent to the user only through e-mail.

System: Trajectory-Assisted Municipal Agent Mobility: A Sensor-Driven Smart Waste Management System
  Benefits: Waste collection trucks follow an Integer Linear Programming (ILP) model that finds the best set of trajectories with the objectives of minimum cost or minimum delay.
  Limitations: It has no metal detector to detect metal.

5. PROPOSED WORK
A. System Architecture
where its voltage can be read by Arduino analog pin A5.

D. Mathematical Model
The server collects the fill-up status and location of the dustbins. It processes the client's query and responds with the nearest dustbin location and directions to access the dustbin.
C - current fill-up status
T - time duration between generation of the ultrasonic wave and its reception by the receiver
S - the speed of sound
(L denotes the total depth of the dustbin, with the sensor mounted at the top.)
We calculate the value of C using the formula given below:
C = L - (S*T)/2
Similarly, the percentage of fill-up is calculated using the formula given below:
P = (C/L) * 100
where P is the % fill-up. Here we assume the wave path is almost vertical.

6. CONCLUSION AND FUTURE WORK
This project was developed with the intention of making cities smarter; however, there is a lot of scope to improve the performance of the proposed system in the areas of user interface, new features, and query processing time. The future enhancements possible in the project are as follows: if the system is sponsored, additional sensors can be added for wet and dry waste segregation.

REFERENCES
[1] Dharna Kaushik, Computer Science and Engineering, Indira Gandhi Delhi Technical University for Women, Delhi, India, and Sumit Yadav, Computer Science and Engineering, Indira Gandhi Technical University for Women, Delhi, India, "Multipurpose Street-Smart Garbage Bin based on IoT", Volume 8, No. 3, March-April 2017.
[2] Bikramjit Singh, Manpreet Kaur, "Smart Dustbins for Smart Cities", (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 7 (2), 2016, pp. 610-611.
[3] Ahmed Omara, Damla Gulen, Burak Kantarci and Sema F. Oktug, "Trajectory-Assisted Municipal Agent Mobility: A Sensor-Driven Smart Waste Management System", published 21 July 2018.
[4] Minthu Ram Chiary, Sripathi SaiCharan, Abdul Rashath. R, Dhikhi. T, Computer Science and Engineering, Saveetha School of Engineering, Saveetha University, "Dustbin Management System Using IoT", Volume 115, No. 8, 2017, pp. 463-468, ISSN: 1311-8080.
[5] N. Sathish Kumar, B. Vuayalakshmi, R. Jenifer Prarthana, A. Shankar, Sri Ramakrishna Engineering College, Coimbatore, TamilNadu, India, "IOT based smart garbage alert system using Arduino UNO", IEEE 978-1-5090-2597-8.
[6] Narayan Sharma, Nirman Singha, Tanmoy Dutta, "Smart Bin Implementation for Smart Cities", International Journal of Scientific & Engineering Research, Volume 6, Issue 9, September 2015, ISSN 2229-5518.
[7] "Smart Cities", available at www.smartcities.gov.in/
[8] "GSM Module Interface", https://circuits4you.com/2016/06/15/gsm-modem-interfacing-arduino/
[9] "GSM", https://www.arduino.cc/en/Guide/ArduinoGSMShield
[10] "GSM Module", http://www.circuitstoday.com/interface-gsm-module-with-arduino
[11] "Arduino", https://www.arduino.cc/
[12] "Android", https://developer.android.com/studio; "GSM Module", www.electronicwings.com/arduino/sim900a-gsm-module-interfacingwith-arduino-uno.
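The fill-level formulas from the Mathematical Model section above, C = L - (S*T)/2 and P = (C/L)*100, can be checked numerically with a short sketch. Note that for an ultrasonic sensor S is the speed of sound; the bin depth and timing values below are illustrative assumptions, not values from the paper.

```python
S = 343.0  # assumed speed of sound in air (m/s) at room temperature

def fill_level(L, T, speed=S):
    """Return (C, P): filled depth in metres and fill percentage.

    L: total bin depth in metres (ultrasonic sensor mounted at the top)
    T: round-trip time of the ultrasonic pulse in seconds
    """
    distance_to_surface = (speed * T) / 2.0  # one-way sensor-to-garbage distance
    C = L - distance_to_surface              # filled depth: C = L - (S*T)/2
    P = (C / L) * 100.0                      # fill percentage: P = (C/L) * 100
    return C, P
```

For example, in a 1 m deep bin an echo returning after about 3.5 ms corresponds to a garbage surface 0.6 m below the sensor, i.e. roughly 40% full.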
Also, there are many different architectures for dialog systems. Which sets of components are included in a dialog system, and how those components divide up responsibilities, differs from system to system. A dialogue system has mainly seven components: Input Decoder, Natural Language Understanding, Dialogue Manager, Domain Specific Component, Response Generator, and Output Renderer. However, there are six main components in the general dialogue system, which include Speech Recognition (ASR), Spoken Language Understanding (SLU), the Dialog Manager (DM), Natural Language Generation (NLG), Text to Speech Synthesis (TTS), and the knowledge base. The following is the structure of the general dialogue system.

and body data sets for the gesture model, speech recognition knowledge bases, a dictionary and spoken dialog knowledge base for the ASR model, video and image body data sets for the Graph Model, and some of the user's information and the setting system.

B. Graph Model
The Graph Model analyzes video and images in real time: it extracts frames from the video collected by the camera and the input model, then sends those frames and images to the Graph Model and applications in cloud servers for analysis and returns the result.

1.2 Comparison on features of popular VPA in the market
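As a rough illustration of how the six components listed above hand off to one another, here is a minimal pipeline sketch. All component functions are hypothetical stand-ins, not a real dialog framework: a production system would plug in an actual ASR engine, NLU model, and so on.

```python
# Toy wiring of the six classic dialog-system components (ASR -> SLU -> DM
# -> NLG -> TTS, with the DM consulting a knowledge base). Every stage here
# is a placeholder to show the data flow only.

def asr(audio: bytes) -> str:              # Speech Recognition
    return "what time is it"               # pretend transcription

def slu(text: str) -> dict:                # Spoken Language Understanding
    return {"intent": "ask_time", "slots": {}}

def dialog_manager(frame: dict, kb: dict) -> dict:  # Dialog Manager + KB lookup
    return {"act": "inform", "value": kb.get(frame["intent"], "unknown")}

def nlg(act: dict) -> str:                 # Natural Language Generation
    return f"It is {act['value']}."

def tts(text: str) -> bytes:               # Text-to-Speech Synthesis
    return text.encode("utf-8")            # stand-in for synthesized audio

def run_turn(audio: bytes, kb: dict) -> bytes:
    """One user turn: audio in, synthesized reply out."""
    text = asr(audio)
    frame = slu(text)
    act = dialog_manager(frame, kb)
    return tts(nlg(act))
```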
Launched in 2012, Google Now is an intelligent personal assistant made by Google. It was first included in Android 4.1, which launched on July 9, 2012, and was first supported on the Google Nexus smart-phone. Found within the Google search option, Google Now can be used in numerous helpful ways. Yes, it can set reminders or answer basic questions like the weather of the day or the names of the movies that won Oscars last year. But more than that, Google Now is a virtual assistant that shows relevant and timely information once it learns more about you and how you use the phone. Google Now also displays different sections called Now cards that pull information from your Gmail account and put it on the screen. For example, if you have just bought a red bag from Amazon, the card shows you your recent purchase. Similarly, it also has a weather card where you can check the weather, and a sports card where you can follow any match that is on.

Amazon Alexa:
Amazon Alexa, known simply as Alexa, is a virtual assistant developed by Amazon, first used in the Amazon Echo and the Amazon Echo Dot smart speakers developed by Amazon Lab126. It is capable of voice interaction, music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, and providing weather, traffic, sports, and other real-time information, such as news. Alexa can also control several smart devices, acting as a home automation system. Users are able to extend Alexa's capabilities by installing "skills" (additional functionality developed by third-party vendors, in other settings more commonly called apps, such as weather programs and audio features).

Cortana:
Cortana is the name of the interactive personal assistant built into Windows 10. You can give her instructions and talk with her by using your voice or by typing. Cortana, named after her fictional counterpart in the video game series Halo, takes notes, dictates messages and offers up calendar alerts and reminders. But her real standout characteristic, and the one Microsoft is betting heavily on, is the ability to strike up casual conversations with users; what Microsoft calls "chitchat".

4. CONCLUSION
In this paper we have seen the working of a personal virtual assistant using Natural Language Processing and the Internet of Things, and also seen the implementation of an intrusion detection system with the help of a passive infrared (PIR) sensor for detecting motion.

REFERENCES
[1] S. Arora, K. Batra, and S. Singh, "Dialogue System: A Brief Review", Punjab Technical University.
[2] Ding, W. and Marchionini, G. 1997. "A Study on Video Browsing Strategies". Technical Report. University of Maryland at College Park.
[3] R. Mead. 2017. "Semio: Developing a Cloud-based Platform for Multimodal Conversational AI in Social Robotics". 2017 IEEE International Conference on Consumer Electronics (ICCE).
[4] R. Pieraccini, K. Dayanidhi, J. Bloom, J. Dahan, M. Phillips. 2003. "A Multimodal Conversational Interface for a Concept Vehicle". Eurospeech 2003.
[5] G. Bohouta and V. Z. Këpuska. 2017. "Comparing Speech Recognition Systems (Microsoft API, Google API and CMU Sphinx)". Int. Journal of Engineering Research.
[6] M. McTear. 2016. The Dawn of the Conversational Interface. Springer International Publishing Switzerland, 2016.
[7] Amazon. "Amazon Lex is a service for building conversational interfaces". https://aws.amazon.com.
[8] B. Marr. "The Amazing Ways Google Uses Deep Learning AI". https://www.forbes.com
[9] K. Wagner. "Facebook's Virtual Assistant 'M' Is Super Smart. It's Also Probably a Human". https://www.recode.com.
5. SYSTEM ARCHITECTURE

android and Wi-Fi", International Journal of Engineering and Computer Science, 2014.
[5] B. R. Pavithra, D., "IoT based monitoring and control system for home automation," April 2015.
[6] B. S. S. Tharaniya Soundhari, M., "Intelligent interface-based speech recognition for home automation using android application," March 2015.
[7] R. A. Ramlee, M. A. Othman, M. H. Leong, M. M. Ismail and S. S. S. Ranjit, "Smart home system using android application", International Conference of Information and Communication Technology, 2013.
costly sensors, the economic situation calls for using available video cameras in an efficient way for effective traffic congestion estimation. Researchers may focus on one or more of these tasks, and they may also choose different measures for traffic structure or add measures.

For a more comprehensive review on vision-based traffic light control: due to the massive growth in urbanization and traffic congestion, an intelligent vision-based traffic light controller is needed to reduce traffic delay and travel time, especially in developing countries, as the current automatic time-based control is not realistic, while sensor-based traffic light controllers are not reliable in developing countries. Traffic congestion is now considered one of the biggest problems in urban environments. Traffic problems will also increase much more widely as an expected result of the growing number of means of transportation and the current low-quality road infrastructure. In addition, many studies and statistics generated in developing countries have shown that most road accidents happen because of very narrow roads and because of the destructive increase in the means of transportation.

A Raspberry Pi microcomputer and multiple ultrasonic sensors are used in each lane to calculate the density of traffic and operate the lane based on that calculation. This idea of controlling the traffic light efficiently in real time has also attracted many researchers to work in this field, with the goal of creating an automatic tool that can estimate the traffic congestion; based on this variable, the traffic signal time interval is forecast.

2. WORKING
In this proposed system, the supply is given to the step-down transformer. The output of the transformer is connected to the input of the full-wave bridge rectifier. The output of the bridge rectifier is given to the regulator. The output of the regulator gives a +5 V positive supply, which powers all the electronic components of the system. The Raspberry Pi uses this information to set the signal timer according to the level of traffic.

3. BLOCK DIAGRAM

Fig. 1 Block Diagram

A 16x2 alphanumeric LCD display is used, which shows real-time information about the traffic signal. Four sensors are used here; when any sensor is triggered, its signal goes to the Raspberry Pi, the Raspberry Pi output goes to the relay driver, the relay switches on, the LED turns on, and the LCD displays the time.

4. SYSTEM DESIGN
The figure shows the overall design of the system. In this intersection, each outgoing lane has four photoelectric sensors that calculate and report the traffic conditions of each lane to the Raspberry Pi. The Raspberry Pi uses this information to set the signal timer according to the level of traffic.
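The density-to-timer mapping described in WORKING and SYSTEM DESIGN can be sketched as pure logic: occupied sensor counts in, green-light duration out. The thresholds, durations, and the densest-lane-first policy below are invented for illustration and are not taken from the paper.

```python
# Map the number of occupied sensors in a lane (0-4, matching the
# four-sensor-per-lane design above) to a green-signal duration in seconds.

def green_time(occupied_sensors: int) -> int:
    """More occupied sensors => denser traffic => longer green phase."""
    if not 0 <= occupied_sensors <= 4:
        raise ValueError("expected a count from 4 sensors")
    durations = {0: 10, 1: 15, 2: 20, 3: 30, 4: 45}  # illustrative values
    return durations[occupied_sensors]

def schedule(lanes):
    """Order lanes for service, densest first (a simple priority policy).

    `lanes` maps a lane name to its occupied-sensor count.
    """
    return sorted(lanes.items(), key=lambda kv: kv[1], reverse=True)
```

On real hardware the sensor counts would come from GPIO reads on the Raspberry Pi; here they are plain integers so the timing logic can be tested in isolation.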
6. ASSEMBLY
The methods used to assemble all the components are discussed in this section. Table I shows the number of I/O pins used in the design and how they are distributed among the components. It also shows how the number of I/O pins was reduced to increase the efficiency of the system.

Fig: Flowchart of the system.

7. FUTURE WORK
More sensors can be used in each lane to make the system more accurate and sensitive to small changes in traffic density. Driverless cars can access the website to view the intensity of traffic at an intersection and choose the fastest route accordingly. Data mining techniques such as classification can be applied to traffic data collected over the long term to study the patterns of traffic in each lane at different times of the day. Using this information, different timing algorithms can be applied at different points of the day according to the traffic pattern.

8. CONCLUSION
Nowadays, traffic congestion is a major problem in big cities, since the traffic signal lights are programmed for fixed time intervals. However, sometimes the demand for a longer green light arises on one side of the junction due to high traffic density. Thus, the traffic signal light system is enhanced to generate traffic-light signals based on the traffic on the roads at that particular instant. Advanced technologies and sensors have given us the capability to build smart and intelligent embedded systems that solve human problems and improve lifestyles. Our system is capable of estimating traffic density using IR sensors placed on either side of the roads. Based on this, the time delay for the green light can be increased and unnecessary waiting time can be reduced. The whole system is controlled by the Raspberry Pi. The designed system has been implemented and tested to ensure its performance and other design factors.

REFERENCES
[1] R. Dhakad and M. Jain, "GPS based road traffic congestion reporting system," 2014 IEEE International Conference on Computational Intelligence and Computing Research, Coimbatore, 2014, pp. 1-6. doi: 10.1109/ICCIC.2014.7238547.
[2] Q. Xinyun and X. Xiao, "The design and simulation of traffic monitoring system based on RFID," The 26th Chinese Control and Decision Conference (2014 CCDC), Changsha, 2014, pp. 4319-4322. doi: 10.1109/CCDC.2014.6852939.
[3] M. F. Rachmadi et al., "Adaptive traffic signal control system using camera sensor and embedded system," TENCON 2011 - 2011 IEEE Region 10 Conference, Bali, 2011, pp. 1261-1265. doi: 10.1109/TENCON.2011.6129009.
[4] X. Jiang and D. H. C. Du, "BUS-VANET: A BUS Vehicular Network Integrated with Traffic Infrastructure," IEEE Intelligent Transportation Systems Magazine, vol. 7, no. 2, pp. 47-57, Summer 2015. doi: 10.1109/MITS.2015.2408137.
[5] I. Septiana, Y. Setiowati and A. Fariza, "Road condition monitoring application based on social media with text mining system: Case Study:
2. MOTIVATION
This research work is helpful for improving smart city applications. The authorities can be alerted to take preventive actions, and preventive actions can save money.
5. PROPOSED SYSTEM
The proposed system consists of entities such as an ultrasonic sensor and a microcontroller for pothole detection. We are going to develop an effective road surface monitoring system for automated pothole detection. This is a low-cost solution for road safety.

Fig 2. System Architecture

6. IMPLEMENTATION MODULE
6.1 Mobile Application Module:
The user can receive pothole notifications from the system for a safe journey.
6.2 Server Module:
7. METHODOLOGY
We implement this system to avoid obstacles on our route for a safe journey and to keep the vehicle in proper condition. In this paper we use the following algorithm for implementing the detection system.
Algorithm details:
Input: Sensor Value
Output: The system output is positive (one) when the proposed pothole detection system encounters a pothole during the car journey. The following code shows how operations are performed within the system and the sequence in which they are performed.

9. ACKNOWLEDGMENT
We express our sincere thanks to our project guide Prof. Lagad J. U., whose constant presence and constructive criticism helped us write this paper. We would also like to thank all the staff of the Computer Department for their valuable guidance, suggestions, support, and personal attention throughout the project work. Above all, we express our deepest gratitude to all of them for their kind-hearted support, which helped us a lot during the project work. Finally, we thank our friends and colleagues for the inspirational help they provided throughout the project work.

REFERENCES
[1] S. S. Rode, S. Vijay, P. Goyal, P. Kulkarni, and K. Arya, "detection and warning system: Infrastructure support and system design," in Proc. Int. Conf. Electron. Comput. Technol., Feb. 2009, pp. 286-290.
[2] R. Sundar, S. Hebbar, and V. Golla, "Intelligent traffic control system for congestion control, ambulance clearance, and stolen vehicle detection," IEEE Sensors J., vol. 15, no. 2, pp. 1109-1113, Feb. 2015.
[3] Samyak Kathane, Vaibhav Kambli, Tanil Patel and Rohan Kapadia, "Time Potholes Detection and Vehicle Accident Detection and Reporting System and Anti-theft (Wireless)", IJETT, Vol. 21, No. 4, March 2015.
[4] J. Lin and Y. Liu, "Potholes detection based on SVM in the pavement distress image," in Proc. 9th Int. Symp. Distrib. Comput. Appl. Bus. Eng. Sci., Aug. 2010, pp. 544-547.
[5] I. Moazzam, K. Kamal, S. Mathavan, S. Usman, and M. Rahman, "Metrology and visualization of potholes using the Microsoft Kinect sensor," in Proc. 16th Int. IEEE Conf. Intell. Transp. Syst., Oct. 2013, pp. 1284-1291.
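The code that Section 7's "Algorithm details" refers to does not survive in the extracted text. The following is a minimal, hypothetical sketch of the threshold test it describes: an ultrasonic sensor pointed at the road reads a distance noticeably larger than the nominal mounting height when it passes over a pothole. The nominal height, threshold, and readings below are assumptions, not values from the paper.

```python
# A pothole shows up as a distance reading noticeably larger than the
# sensor's nominal height above flat asphalt. All numbers are illustrative.

NOMINAL_HEIGHT_CM = 30.0   # assumed sensor-to-road distance on flat road
THRESHOLD_CM = 5.0         # assumed extra depth that counts as a pothole

def detect(sensor_value_cm: float) -> int:
    """Return 1 (positive) when a pothole is detected, else 0."""
    return 1 if sensor_value_cm - NOMINAL_HEIGHT_CM > THRESHOLD_CM else 0

def scan(readings_cm):
    """Run the detector over a sequence of readings, as during a car journey."""
    return [detect(r) for r in readings_cm]
```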
shoot apex correlated with phenological events, and the response to soil water availability for winter and spring wheat (Triticum aestivum L.), winter and spring barley (Hordeum vulgare L.), corn (Zea mays L.), sorghum (Sorghum bicolor L.), proso millet (Panicum milaceum L.), hay/foxtail millet [Setaria italica (L.) P. Beauv.], and sunflower (Helianthus annuus L.) was created based on experimental data and the literature. Model evaluation consisted of testing algorithms using "generic" default phenology parameters for wheat (i.e., no calibration for specific cultivars was used) for a variety of field experiments to predict developmental events. Results demonstrated that the program has general applicability for predicting crop phenology and can aid in crop management.

3. SYSTEM ARCHITECTURE
The preparation of soil is the first step before growing a crop. One of the most vital tasks in agriculture is to penetrate deep into the soil and loosen it. The loosened soil allows the roots to breathe easily even when they go deep into the soil.

1.1. Title and Author
IOT Based Agricultural Soil Prediction for Crops With Precautions.
the cloud servers. The privacy-preserving approaches ensure confidentiality, integrity, authenticity, accountability, and audit trail. Confidentiality ensures that the health information is entirely concealed from unsanctioned parties, whereas integrity deals with maintaining the originality of the data, whether in transit or in cloud storage. Authenticity guarantees that the health data is accessed by authorized entities only, whereas accountability refers to the fact that the data access policies must comply with the agreed-upon procedures.

2. RELATED WORK
A Review on the State-of-the-Art Privacy Preserving Approaches in the e-Health Clouds [16]
This paper aimed to encompass the state-of-the-art privacy-preserving approaches employed in the e-Health clouds. Automated PHRs are exposed to possible abuse and require security measures based on identity management, access control, policy integration, and compliance management. The privacy-preserving approaches are classified into cryptographic and non-cryptographic approaches, and a taxonomy of the approaches is also presented. Furthermore, the strengths and weaknesses of the presented approaches are reported and some open issues are highlighted. The cryptographic approaches reduce the privacy risks by utilizing certain encryption schemes and cryptographic primitives. This includes public-key encryption, symmetric-key encryption, and alternative primitives such as attribute-based encryption, identity-based encryption, and proxy re-encryption.

A General Framework for Secure Sharing of Personal Health Records in Cloud System [17]
In this paper, the authors provided an affirmative answer to the problem of sharing by presenting a general framework for secure sharing of PHRs. This system enables patients to securely store and share their PHR in the cloud server (for example, with their care-givers), and furthermore the treating doctors can refer the patients' medical records to specialists for research purposes, whenever required, while ensuring that the patients' information remains private. This system also supports cross-domain operations (e.g., with different countries' regulations).

Electronic Personal Health Record Systems: A Brief Review of Privacy, Security, and Architectural Issues [18]
This paper addressed design and architectural issues of PHR systems, and focused on privacy and security issues which must be addressed carefully if PHRs are to become generally acceptable to consumers. In conclusion, the general indications are that there are significant benefits to PHR use, although there are architecturally specific risks to their adoption that must be considered. Some of these relate directly to consumer concerns about security and privacy, and the authors have attempted to discuss these in the context of several different PHR system architectures that have been proposed or are in trial. In Germany, the choice of the standalone smartcard PHR is close to national implementation. In the United States, implementations and/or tests of all the suggested architectures except the standalone smartcard are underway. In the United Kingdom, the National Health Service (NHS) appears to have settled on an integrated architecture for PHRs.

Achieving Secure, Scalable and Fine-grained Data Access Control in Cloud Computing [19]
This paper addressed this challenging open issue by, on one hand, defining and enforcing access policies based on data attributes, and, on the other hand, allowing the data owner to delegate most of the computation tasks involved in fine-grained data access control to untrusted cloud servers without disclosing the underlying data contents. It achieved this goal by exploiting and uniquely combining techniques of attribute-based encryption (ABE), proxy re-encryption, and lazy re-encryption. This scheme also has salient properties of user access privilege confidentiality and user secret key accountability. Extensive analysis shows that this scheme is highly efficient and provably secure under existing security models.

3. PHASES IN IOMT
Phase I: Data Collection, Data Acquisition
Physical devices such as sensors play an important role in enhancing safety and improving the quality of life in the healthcare arena. They have inherent accuracy, intelligence, capability, reliability, small size and low power consumption.

Figure 1: Phases in IOMT [4]

Phase II: Storage
The data collected in Phase I should be stored. Generally, IoT components are installed with low memory and low processing capabilities. The cloud is the best solution, taking over the responsibility for storing the data in the case of stateless devices.
Phase III: Intelligent Processing
The IoT analyzes the data stored in the cloud DCs and provides intelligent services for work and life in hard real time. Analyzing and responding to queries, the IoT also controls things. Intelligent processing involves making data useful through machine learning algorithms.
Phase IV: Data Transmission
Data transmission occurs through all parts, from cloud to user. The user may be a doctor, nurse, pharmacist, or the patient himself.
Phase V: Data Delivery
Delivery of information takes place through a user interface, which may be mobile, desktop, or tablet. Delivered data corresponds to the role of the person requesting it: doctor-related data and pharma-related data will be different.

4. ATTACKS ON PHASES
Phase I: Data Loss
Data loss refers to losing work accidentally due to hardware or software failure and natural disasters. Data can be duplicated by intruders. It must be ensured that perceived data are received from intended sensors only. Data authentication could provide integrity and originality.
Phase II: Denial of Service, Access Control
The main objective of a DoS attack is to overload the target machine with many service requests to prevent it from responding to legitimate requests. Unable to handle all the service requests on its own, it delegates the work load to other similar service instances which ultimately
Technology for Competitive Strategies. Lecture Notes in Networks and Systems, vol 40. Springer, Singapore.
[2] Jin-cui Yang, Bin-xing Fang, "Security model and key technologies for the Internet of Things," The Journal of China Universities of Posts and Telecommunications, Volume 18, Supplement 2, 2011, Pages 109-112, ISSN 1005-8885, https://doi.org/10.1016/S1005-8885(10)60159-8.
[3] Lo-Yao Yeh, Woei-Jiunn Tsaur, and Hsin-Han Huang. 2017. "Secure IoT-Based, Incentive-Aware Emergency Personnel Dispatching Scheme with Weighted Fine-Grained Access Control." ACM Trans. Intell. Syst. Technol. 9, 1, Article 10 (September 2017), 23 pages. DOI: https://doi.org/10.1145/3063716.
[4] Fei Hu, Security and Privacy in Internet of Things (IoT): Models, Algorithms, and Implementations, CRC Press, 2016.
[5] Arjona, R.; Prada-Delgado, M.Á.; Arcenegui, J.; Baturone, I. "A PUF- and Biometric-Based Lightweight Hardware Solution to Increase Security at Sensor Nodes." Sensors 2018, 18, 2429.
[6] S. Venugopalan, "Attribute Based Cryptology," PhD Dissertation, Indian Institute of Technology Madras, April 2011.
[7] Sumitra B, Pethuru CR & Misbahuddin M, "A survey of cloud authentication attacks and solution approaches", International Journal of Innovative Research in Computer and Communication Engineering, Vol. 2, No. 10, (2014), pp. 6245-6253.
[8] Sankar Mukherjee, G.P. Biswas, "Networking for IoT and applications using existing communication technology," Egyptian Informatics Journal, Volume 19, Issue 2, 2018, Pages 107-127, ISSN 1110-8665, https://doi.org/10.1016/j.eij.2017.11.002.
[9] https://www.controlcase.com/services/log-monitoring/
[10] Babar, Sachin & Stango, Antonietta & Prasad, Neeli & Sen, Jaydip & Prasad, Ramjee. (2011). "Proposed Embedded Security Framework for Internet of Things (IoT)." 10.1109/WIRELESSVITAE.2011.5940923.
[11] Weber, Rolf. (2010). "Internet of Things – New security and privacy challenges." Computer Law & Security Review, 26, 23-30. 10.1016/j.clsr.2009.11.008.
[12] K. Zhao and L. Ge, "A Survey on the Internet of Things Security," 2013 Ninth International Conference on Computational Intelligence and Security (CIS), Emeishan, China, 2013, pp. 663-667. doi: 10.1109/CIS.2013.145.
[13] "Security Issues and Challenges for the IoT-based Smart Grid," Procedia Computer Science, ISSN: 1877-0509, Vol. 34, pp. 532-537.
[14] V. Alagar, A. Alsaig, O. Ormandjiva and K. Wan, "Context-Based Security and Privacy for Healthcare IoT," 2018 IEEE International Conference on Smart Internet of Things (SmartIoT), Xi'an, 2018, pp. 122-128. doi: 10.1109/SmartIoT.2018.00-14.
[15] Arbia Riahi Sfar, Enrico Natalizio, Yacine Challal, Zied Chtourou, "A roadmap for security challenges in the Internet of Things," Digital Communications and Networks, Volume 4, Issue 2, 2018, Pages 118-137, ISSN 2352-8648, https://doi.org/10.1016/j.dcan.2017.04.003.
[16] A. Abbas and S. U. Khan, "A Review on the State-of-the-Art Privacy-Preserving Approaches in the e-Health Clouds," IEEE Journal of Biomedical and Health Informatics, vol. 18, no. 4, pp. 1431-1441, July 2014. doi: 10.1109/JBHI.2014.2300846.
[17] M. H. Au, T. H. Yuen, J. K. Liu, W. Susilo, X. Huang, Y. Xiang, and Z. L. Jiang, "A general framework for secure sharing of personal health records in cloud system", Journal of Computer and System Sciences, 2017.
[18] David Daglish and Norm Archer, "Electronic Personal Health Record Systems: A Brief Review of Privacy, Security, and Architectural Issues", IEEE 2009.
[19] S. Yu, C. Wang, K. Ren and W. Lou, "Achieving Secure, Scalable, and Fine-grained Data Access Control in Cloud Computing," 2010 Proceedings IEEE INFOCOM, San Diego, CA, 2010, pp. 1-9. doi: 10.1109/INFCOM.2010.5462174.
5. CONCLUSION
This system is IoT based. The wearable glove is fitted with different sensors — temperature, LDR and a 3-axis accelerometer — together with the Arduino IDE, Arduino Nano and Arduino Uno. A 5 V battery powers the system. In a small-scale industry, the smart glove can connect to any type of machinery and provides access to it based on whether proper safety of that machine is ensured.

6. ACKNOWLEDGEMENT
We take this opportunity to express our hearty thanks to all those who helped us in the completion of the project on Smart Wearable Gadget for Industrial Safety. We would especially like to express our sincere gratitude to our guide Prof. Pauras Bangar and HOD Prof. J. U. Lagad, Department of Computer Engineering, who extended their moral support, inspiring guidance and encouragement of independence throughout this task. We would also like to thank our Principal, Dr. R. S. Deshpande, for his great insight and motivation. Last but not least, we would like to thank our fellow colleagues for their valuable suggestions.

REFERENCES
[1] Chirag Mahaveer Parmar, Projjal Gupta, K Shashank Bhardwaj (Members, IEEE), "Smart Work-Assisting Gear", Next Tech Lab (IoT Division), SRM University, Kattankulathur, 2018.
[2] Aditya C, Siddharth T, Karan K, Priya G, "Meri Awaz - Smart Glove Learning Assistant for Mute Students and Teachers", IJIRCCE, 2017.
[3] Umesh V. Nikam, Harshal D. Misalkar, Anup W. Burange, "Securing MQTT Protocol in IoT by Payload Encryption Technique and Digital Signature", IAESTD, 2018.
[4] Dhawal L. Patel, Harshal S. Tapse, Praful A. Landge, Parmeshwar P. More and Prof. A. P. Bagade, "Smart Hand Gloves for Disable Peoples", IRJET, 2018.
[5] Suman Thakur, Mr. Manish Varma, Mr. Lumesh Sahu, "Security System Using Arduino Microcontroller", IIJET, 2018.
[6] Radhika Munoli, Prof. Sankar Dasiga, "Secured Data Transmission for IoT Application", IJARCCE, 2016.
[7] Ashton K., "That 'Internet of Things' Thing", RFID Journal, 2009.
[8] Vincent C. Conzola, Michael S. Wogalter, "Consumer Product Warnings: Effects of Injury Statistics on Recall and Subjective Evaluation", HFES, 1998.
solar radiation forecasting. Systems focusing on short-term prediction of solar radiation are reviewed, and an alternative method and model is suggested. The method assumes that solar radiation data repeats itself in history. Following this preliminary assumption, a novel Mycielski-based model is proposed. This model treats the recorded hourly solar radiation data as an array and, starting from the last recorded value, searches for the most similar sub-array pattern in the history. This sub-array pattern corresponds to the longest matching solar radiation data array in the history. The data observed after this longest array in history is taken as the forecast. In case numerous sub-arrays are obtained, the model makes its choice according to the probabilistic relations between the sub-patterns' last values and the following value. To model the probabilistic relations of the data, a Markov chain model is adopted and used. In this way the historical search model is strengthened.

Yu Jiang [2] proposed Day-ahead Forecast of Bi-hourly Solar Radiance with a Markov Switch Approach. The system uses a regime-switching process to describe the evolution of the solar radiance time series. The optimal number of regimes and the regime-specific parameters are determined by Bayesian inference. The Markov regime-switching model offers both point and interval forecasts of solar radiance based on the posterior distribution derived from historical data by Bayesian inference. Four solar radiance forecasting models — the persistence model, the autoregressive (AR) model, the Gaussian process regression (GPR) model, and the neural network model — are used as baseline models for validating the Markov switching model. The comparative analysis based on numerical experiment results shows that overall the Markov regime-switching model performs better than the associated models in the day-ahead point and interval prediction of solar radiance.

Ali Chikh and Ambrish Chandra [3] proposed An Optimal Maximum Power Point Tracking Algorithm for PV Systems With Climatic Parameters Estimation. The system suggests a Maximum Power Point Tracking (MPPT) method for photovoltaic (PV) systems with a reduced hardware setup. It works by computing the instantaneous conductance and the incremental conductance of the array. The first is computed from the array voltage and current, whereas the second, which is a function of the array junction current, is estimated by means of an adaptive neuro-fuzzy (ANFIS) solar cell model. Given the difficulty of measuring solar radiation and cell temperature — since those need two extra sensors that would increase the hardware circuitry and measurement noise — an analogical model is proposed to estimate them with a de-noising wavelet algorithm. This method helps to reduce the hardware setup, using only one voltage sensor, while increasing the array power efficiency and the MPPT response time.

4. PROPOSED WORK
4.1 PROJECT SCOPE
The product is an Android application used to manage daily mess attendance along with streamlining rebate and menu selection processes. The objective of the system is to provide a user-friendly daily attendance system that is easy to manage, maintain and query. Our primary focus is to develop a paperless system that provides the management a way to facilitate smoother functioning of the mess system.
7. ACKNOWLEDGMENTS
With due respect and gratitude, we take the opportunity to thank all those who have helped us directly and indirectly. We convey our sincere thanks to Prof. P. N. Mahalle, HoD, Computer Dept., and Prof. D. H. Kulkarni for their help in selecting the project topic and for their support. Our guide, Prof. D. H. Kulkarni, has always encouraged us and given us the motivation to move ahead. He has put a lot of time and effort into this seminar along with us and given us a lot of confidence, and we wish to extend a big thank you to him. We also wish to thank all the other people who have helped, in even the smallest way, in the successful completion of this project.

REFERENCES
[1] Day-ahead Prediction of Bi-hourly Solar Radiance with a Markov Switch Approach, Yu Jiang, Huan Long, Zijun Zhang, and Zhe Song, IEEE Transactions on Sustainable Energy, 2017, DOI 10.1109
[2] An Optimal Maximum Power Point Tracking Algorithm for PV Systems With Climatic Parameters Estimation, Ali Chikh and Ambrish Chandra, IEEE Transactions on Sustainable Energy, 2015, DOI 10.1109
[3] Critical weather situations for renewable energies - Part B: Low stratus risk for solar power, Carmen Köhler, Andrea Steiner, Yves-Marie Saint-Drenan, Dominique Ernst, Anja Bergmann-Dick, Mathias Zirkelbach, Zied Ben Bouallegue, Isabel Metzinger, Bodo Ritter, Elsevier, Renewable Energy (2017), http://dx.doi.org/10.1016/j.renene.2016.09.002
[4] Sentinella: Smart Monitoring of Photovoltaic Systems at Panel Level, Bruno Andò, Salvatore Baglio, Antonio Pistorio, Giuseppe Marco Tina, and Cristina Ventura, 0018-9456 © 2015 IEEE, DOI 10.110
[5] Monitoring system for photovoltaic plants: A review, Siva Ramakrishna Madeti, S. N. Singh, Alternate Hydro Energy Centre, Indian Institute of Technology Roorkee, Uttarakhand 247667, India, Renewable and Sustainable Energy Reviews 67 (2017), pp. 1180-1207, http://dx.doi.org/10.1016/j.rser.2016.09.088
[6] Design and implementation of a solar plant and irrigation system with remote monitoring and remote control infrastructures, Yasin Kabalci, Ersan Kabalci, Ridvan Canbaz, Ayberk Calpbinici, Elsevier, Solar Energy 139 (2016).
[7] Forecasting of solar energy with application for a growing economy like India: Survey and

8. CONCLUSION
The monitoring of the solar PV PCU using the Internet of Things has been experimentally shown to work satisfactorily, with the parameters monitored effectively through the internet. The proposed system not only monitors the parameters of the solar PV PCU, but also processes the data and generates reports according to the requirement, for example plotting the estimated units and computing the total units produced per month. It also stores all the parameters in the cloud in a timely manner.
sensors should be correct. Data collected by the sensors is assumed to be the same in all areas of the field. The arrangement of the whole system is fixed and secured. The warehouse security system will differentiate between rodents and humans based on their size.
The proposed system assumes that users have a good internet connection and that the local system has a power supply. The user should also have the mobile application installed, through which alerts will be provided.

System Design
Fig. 2. DFD level 0
Requirements
The functional requirements of the system include the data gathered by the sensors and the decisions taken on the basis of this data. The data provided by the sensors can contain some noise, so that data must be refined. The processing model installed on the cloud platform takes this refined data as input and makes decisions based on the dataset values. Accordingly, alerts are provided to farmers through the mobile application.
The user of this system is a farmer, so we have to design the application accordingly. The system must provide reliable alerts to the user, which will help him in making decisions and taking actions in the field.

Fig. 3. Data Flow Diagram level 1
Fig. 2 and Fig. 3 show the data flow diagrams of the system: the graphical representation of the flow of data through an information system, and a preliminary step in creating an overview of the system. DFD level 0 shows three components, farmers, the local system and the administrator, which interact with the model. DFD level 1 describes the functions through which the farmers, the local system and the administrator interact with the system. The local system collects data using sensors, farmers can request and view their data on the system, and the administrator manages the stored data.

Steps Involved
Fig. 1. Steps Involved
As shown in Fig. 1, the model proceeds in three steps: collecting the data from the field using sensors, processing the collected data on the cloud platform, and providing suggestions to farmers through the mobile application.

Other Specification
The proposed system provides advantages in terms of increasing the quality and quantity of yield and reducing the risk of damage caused by natural calamities. This system will also help in improving soil fertility and soil nutrients, increasing the net profit of farmers and reducing the farmers' effort, and it will promote smart farming techniques.
This system has some limitations: it requires a constant power supply and a stable internet connection, the farmer should be able to use a smartphone, and the farmer must be able to afford the cost of the proposed system.
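The refine-then-decide flow set out in the Requirements section can be sketched as follows; the moving-average window and the soil-moisture threshold are illustrative assumptions, not values from the paper:

```python
def refine(readings, window=3):
    """Smooth raw sensor readings with a simple moving average
    to reduce measurement noise before any decision is taken."""
    smoothed = []
    for i in range(len(readings)):
        lo = max(0, i - window + 1)
        smoothed.append(sum(readings[lo:i + 1]) / (i + 1 - lo))
    return smoothed

def soil_moisture_alerts(readings, low=20.0):
    """Return alert messages for the farmer wherever the smoothed
    soil-moisture value drops below the `low` threshold (in %)."""
    return [f"Alert: low soil moisture ({v:.1f}%) at sample {i}"
            for i, v in enumerate(refine(readings)) if v < low]
```

In a deployment, the alert strings would be pushed to the farmer's mobile application rather than returned to the caller.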
6. CONCLUSION AND FUTURE WORK
The Internet of Things is widely used for connecting devices and collecting information. All the sensors are successfully interfaced with the Raspberry Pi, and wireless communication is achieved between the various nodes. All observations and experimental tests prove that the project is a complete solution for field activities, environmental problems and storage problems, using a smart irrigation system and a smart warehouse management system. Implementation of such a system in the field can definitely help to improve the yield of the crops and the overall production.
The device can incorporate pattern recognition techniques for machine learning, to identify objects and categorize them into humans, rodents and other mammals; sensor fusion can also be done to increase the functionality of the device. By improving these aspects of the device, it can be used in different areas, and the project can undergo further research to improve its functionality and its applicable areas. We have opted to implement this system as a security solution in the agricultural sector, i.e. farms, cold stores and grain stores.

REFERENCES
[1] Nikesh Gondchawar, Prof. Dr. R. S. Kawitkar, "IoT based Smart Agriculture", International Journal of Advanced Research in Computer and Communication Engineering, Vol. 5, Issue 6, ISSN (Online) 2278-1021, ISSN (Print) 2319-5940, June 2016.
[2] Tanmay Baranwal, Nitika, Pushpendra Kumar Pateriya, "Development of IoT based Smart Security and Monitoring Devices for Agriculture", 6th International Conference on Cloud System and Big Data Engineering, 978-1-4673-8203-8/16, 2016 IEEE.
[3] Nelson Sales, Artur Arsenio, "Wireless Sensor and Actuator System for Smart Irrigation on the Cloud", 978-1-5090-0366-2/15, 2nd World Forum on Internet of Things (WF-IoT), Dec 2015, published in IEEE Xplore, Jan 2016.
[4] Prathibha S R, Anupama Hongal, Jyothi M P, "IoT based Monitoring System In Smart Agriculture", 2017 International Conference on Recent Advances in Electronics and Communication Technology.
more relevant principles to mobile application development. This paper shows that React Native exhibits the best results in all the analyzed principles, while still retaining the benefits of hybrid development relative to native development. With the emergence of frameworks for mobile development, some of them with little more than a year of existence, it is difficult to perceive which are the most advantageous for a given business objective; this article shows the best options among the frameworks used, always comparing them with native development.

Paper Title: Among the various impacts caused by high-penetration distributed generation (DG) in medium- and low-voltage distribution networks, the issues of interaction between the DG and feeder equipment, such as step voltage regulators (SVRs), have been increasingly brought into the focus of computational analyses and real-life case studies. In particular, the SVR's runaway condition has been a major concern in recent years due to the overvoltage problem and the SVR maintenance costs it entails. This paper aims to assess the accuracy of the quasi-static time series (QSTS) method in detailing this phenomenon when compared to the classical load-flow formulation. To this end, simulations were performed using the OpenDSS software for two different test feeders and helped to demonstrate the effectiveness of the QSTS approach in investigating the SVR's runaway condition.

Paper Title: Autonomous Bidding Agents in the Trading Agent Competition
Abstract: Designing agents that can bid in online simultaneous auctions is a complex task. The authors describe task-specific details and strategies of agents in a trading agent competition. More specifically, the article describes the task-specific details of, and the general motivations behind, the four top-scoring agents. First, we discuss general strategies used by most of the participating agents. We then report on the strategies of the four top-placing agents. We conclude with suggestions for improving the design of future trading agent competitions.

Paper Title: The opportunistic large array (OLA) with transmission threshold (OLA-T) is a simple form of cooperative transmission that limits node participation in broadcasts. The performance of OLA-T has been studied for disc-shaped networks. This paper analyzes OLA-T for strip-shaped networks. The results also apply to arbitrarily shaped networks that have previously limited node participation to a strip. The analytical results include a condition for sustained propagation, which implies a bound on the transmission threshold. OLA transmission on a strip network with and without a transmission threshold is compared in terms of total energy consumption.

2. GAP ANALYSIS
Standard Platform:
It is a standard Android application or iOS application.
All the APIs are purely platform dependent for Ola and Uber.
There is no algorithm that supports cross-platform use; for each platform there are different algorithms.
No current system is available for two-wheeler transportation.
Some companies provide such services, but they do not have a proper implementation of these systems.
In rural areas, transportation services are negligible.
BikeUp:
It is a cross-platform algorithm which is used on many platforms.
The APIs are platform independent for various devices, such as web applications, Android applications and iOS applications.
This system will increase employability in rural areas by
ISSN:0975-887 Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune. Page 55
Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019
the future work may lay its emphasis on exploring the various methods and applications of blockchain in auctions by overcoming its limitations. More layers of hybrid functions can be included to further increase data integrity and security.

REFERENCES
[1] J.-N. Meier, A. Kailas, O. Abuchaar et al., "On augmenting adaptive cruise control systems with vehicular communication for smoother automated following", Proc. TRB Annual Meeting, Jan. 2018.
[2] Dan Ariely (2003), Buying, Bidding, Playing, or Competing? Value Assessment.
[3] Amy Greenwald (2001), Autonomous Bidding Agents in the Trading Agent Competition.
[4] Chia-Hui Yen (2008), Effects of e-service quality on loyalty intention: an empirical study in online auction.
[5] A. Kailas, L. Thanayankizil, M. A. Ingram, "A simple cooperative transmission protocol for energy-efficient broadcasting over multi-hop wireless networks", KICS/IEEE Journal of Communications and Networks (Special Issue on Wireless Cooperative Transmission and Its Applications), vol. 10, no. 2, pp. 213-220, June 2008.
[6] Y. J. Chang, M. A. Ingram, "Packet arrival time estimation techniques in software defined radio", in preparation.
[7] B. Sirkeci-Mergen, A. Scaglione, "On the power efficiency of cooperative broadcast in dense wireless networks", IEEE J. Sel. Areas Commun., vol. 25, no. 2, pp. 497-507, Feb. 2007.
8. FUTURE WORK
Water is a key element for human survival, yet current patterns of water consumption are unsustainable, and such usage is still evident in our practical life. There is a strong need to change this pattern toward sustainability; the world would indeed cease to exist without the availability of water.
Fig: System Implementation Plan

REFERENCES
[1] Vijay, Mahak, S. A. Akbar, and S. C. Jain. "Chlorine decay modelling with contamination simulation for water quality in smart water grid." In 2017 International Conference on Energy, Communication, Data Analytics and Soft Computing (ICECDS), pp. 3336-3341. IEEE, 2017.
[2] Pawara, Sona, Siddhi Nalam, Saurabh Mirajkar, Shruti Gujar, and Vaishali Nagmoti. "Remote monitoring of waters quality from reservoirs." In Convergence in Technology (I2CT), 2017 2nd International Conference for, pp. 503-506. IEEE, 2017.
[3] Putra, Dito Adhi, and Tri Harsono. "Smart sensor device for detection of water quality as anticipation of disaster environment pollution." In Electronics Symposium (IES), 2016 International, pp. 87-92. IEEE, 2016.
[4] Saab, Christine, Isam Shahrour, and Fadi Hage Chehade. "Smart technology for water quality control: Feedback about use of water quality sensors." In Sensors Networks Smart and Emerging Technologies (SENSET), 2017, pp. 1-4. 2017.
[5] Borawake-Satao, Rachana, and Rajesh Prasad. "Mobility Aware Path Discovery for Efficient Routing in Wireless Multimedia Sensor Network." In Proceedings of the International Conference on Data Engineering and Communication Technology, pp. 673-681. Springer, Singapore, 2017.
[6] Borawake-Satao, Rachana, and Rajesh Prasad. "Comprehensive survey on effect of mobility over routing issues in wireless multimedia sensor networks." International Journal of Pervasive Computing and Communications 12, no. 4 (2016): 447-465.
[7] Jalal, Dziri, and Tahar Ezzedine. "Towards a water quality monitoring system based on wireless sensor networks." In Internet of Things, Embedded Systems and Communications (IINTEC), 2017 International Conference on, pp. 38-41. IEEE, 2017.
[8] Shirode, Mourvika, Monika Adaling, Jyoti Biradar, and Trupti Mate. "IOT Based Water Quality Monitoring System." (2018).
[9] Getu, Beza Negash, and Hussain A. Attia. "Electricity audit and reduction of consumption: campus case study." International Journal of Applied Engineering Research 11, no. 6 (2016): 4423-4427.
[10] Attia, Hussain A., and Beza N. Getu. "Authorized Timer for Reduction of Electricity Consumption and Energy saving in Classrooms." IJAER 11, no. 15 (2016): 8436-8441.
[11] Getu, Beza Negash, and Hussain A. Attia. "Automatic control of agricultural pumps based on soil moisture sensing." In AFRICON, 2015, pp. 1-5. IEEE, 2015.
[12] Bhardwaj, R. M. "Overview of Ganga River Pollution." Report: Central Pollution Control Board, Delhi (2011).
[13] Nivit Yadav, "CPCB Real time Water Quality Monitoring", Report: Center for Science and Environment, 2012.
[14] Faruq, Md Omar, Injamamul Hoque Emu, Md Nazmul Haque, Maitry Dey, N. K. Das, and Mrinmoy Dey. "Design and implementation of cost effective water quality evaluation system." In Humanitarian Technology Conference (R10-HTC), 2017 IEEE Region 10, pp. 860-863. IEEE, 2017.
[15] Le Dinh, Tuan, Wen Hu, Pavan Sikka, Peter Corke, Leslie Overs, and Stephen Brosnan. "Design and deployment of a remote robust sensor network: Experiences from an outdoor water quality monitoring network." In 32nd IEEE Conference on Local Computer Networks (LCN 2007). IEEE, 2007.
Population Growth", Science, Vol. 289, no. 5477, 14 July 2000, pp. 284-288.
[5] I. Podnar, M. Hauswirth, and M. Jazayeri, "Mobile push: delivering content to mobile users," Proceedings of the 22nd International Conference on Distributed Computing Systems Workshops, pp. 563-568, 2002.
2. EXISTING SYSTEM
One of the existing systems is implemented using the Global System for Mobile Communication (GSM), where the Short Message Service (SMS) is used to send notices to the controller, which limits the data size. Another existing system uses Bluetooth as the mode of data transfer between the microcontroller and the

Fig 1: Block diagram of the system
The notice to be displayed is sent from the android application using socket programming in Java. As wireless transmission is used, a large amount of data can be transferred over the network.
4. IMPLEMENTATION
This section explains the execution flow, from establishing communication between the Android application and the Raspberry Pi to displaying the notices on the screen.
As shown in Fig. 3, first the message is sent from the application and stored at the Raspberry Pi. The message is retrieved, and the contents are updated and stored on the SD card. The text message is then read from the SD card. The fetched text is wrapped in a template and displayed on the screen using a browser which is open in kiosk mode.
For the communication to take place, both the Raspberry Pi and the android application must be connected to the same WiFi network. This can be achieved using server side coding in

Fig 2: Mathematical Model
M1 sends the notice to M2.
M2 is the access point which provides the network for M1 to connect.
After receiving the notice from M1, M3 processes it and includes it in M4, which is a PHP template.
This processed data is sent from M3 to M5.
M5 displays the message on the LCD screen.
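The notice path described in this section (application to socket to storage to display) can be sketched with Python's standard socket module; the port number, the storage file name and the single-notice loop are assumptions for illustration, not details from the paper:

```python
import socket

HOST, PORT = "0.0.0.0", 50007   # hypothetical port for the notice service

def run_notice_server(storage_path="notice.txt", max_notices=1):
    """Listen for notices sent over a TCP socket (e.g. from the Android
    app) and store each one on disk, where the display script reads it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen(1)                      # queue at most one pending client
        for _ in range(max_notices):
            conn, addr = srv.accept()      # blocks until the app connects
            with conn:
                notice = conn.recv(4096).decode("utf-8")
                with open(storage_path, "w", encoding="utf-8") as f:
                    f.write(notice)        # the kiosk browser page reads this file
```

On the Raspberry Pi, the kiosk-mode browser would then render whatever the display template wraps around the stored text.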
some key points can be concluded for further RFID application system implementations. As information systems play a crucial role in RFID implementation, information system development is essential for RFID project success, and an RFID information system should be developed as an open system that can be easily integrated with other systems in supply chains. Security is a critical issue for RFID systems, since they manage cargo information that must be protected from theft, modification or destruction. As a new wireless technology that often links to the Internet, RFID presents additional security challenges that must be factored into any installation of RFID systems.

anonymity requirements, the transformed one inherits these properties. The result improves the prior best bound on worst-case key-lookup cost of O(log n), by Molnar, Soppera and Wagner (2006). They also show that any RFID authentication protocol that simultaneously provides guarantees of privacy protection and of worst-case constant-cost key-lookup must also imply "public-key obfuscation", at least when the number of tags is asymptotically large. They also consider relaxations of the privacy requirements and show that, if limited linkability is to be tolerated, then simpler approaches can be pursued to achieve constant key-lookup cost.
Kashif Ali, Hossam Hassanein [4] presented a system that successfully merges the RFID readers and their tags with a central database, such that all the parking lots in the university can work in a fast and efficient manner. The RFID tag provides a secure and robust method for holding the vehicle identity, and the web-based database allows for the centralization of all vehicle and owner records.
Ivan Muller, Renato Machado de Brito [5]: Vehicle tracking systems are popular, as they provide travel security and theft prevention. The main benefit of vehicle tracking systems is security, obtained by monitoring the vehicle's location, which can be used as a protection approach for stolen vehicles by sending the position coordinates to the police center as an alert. When a police center receives an alert for a stolen vehicle, it can take action to prevent the theft.
Muhammad Tahir Qadri, Muhammad [6] introduced a new approach that leads to a reconciliation of privacy and availability requirements in anonymous RFID authentication: a generic compiler that maps each challenge-response RFID authentication protocol into another that supports key-lookup operations at constant cost. If the original protocol were to satisfy

3. DESIGNING OF SYSTEM
Objective
Vehicle tracking has increased in use over the past few years and, based on current trends, this rise should continue. Tracking offers benefits to both private and public sector individuals, allowing for real-time visibility of vehicles and the ability to receive advanced information regarding legal existence and security status. The monitoring system of a vehicle is an integration of RFID technology and a device tracking system using IoT.
Theme
In this paper, an Arduino is used for controlling all peripherals and activities. The Arduino does not require an external power supply circuit, because it has an inbuilt power supply circuit, and it provides additional functionality compared to microcontrollers such as the PIC or the 8051; the Arduino is more sophisticated than other microcontrollers. The RFID reader can identify the data from any recognized RFID tag, and the collected data is shown in a terminal on the PC. An RFID tag is provided to every vehicle, and its data moves to the RFID reader via radio frequency at 13.56 MHz. This data helps to determine which vehicle is authorized or unauthorized. This
whole data goes via the ESP8266 Wi-Fi module, over the internet, to the mobile.
Design
The figure shows the RFID reader and the relay connected to the Arduino Uno. All data is gathered and then stored on the Arduino Uno, so we can easily access it any time, anywhere, and the system can respond according to the data. It is efficient and reliable for data storage, and many things can be analyzed in the system. The RFID reader reads information from the RFID tag, and the relay controls the motor. The motor is connected to a circular rod which acts as a gate, so according to the data the system can respond very quickly. All data goes to the PC, and for the mobile, the ESP8266 is connected to the Arduino. On the mobile, the authorized and unauthorized vehicle ID numbers are sent via the ESP8266 Wi-Fi module, which is connected to the internet.

The Arduino Uno provides UART TTL (5V) serial communication, which can be done using digital pin 0 (RX) and digital pin 1 (TX). An ATmega16U2 on the board channels this serial communication over USB and appears as a virtual COM port to software on the computer. The ATmega16U2 firmware uses the standard USB COM drivers, and no external driver is needed; however, on Windows, an .inf file is required. The Arduino software includes a serial monitor which allows simple textual data to be sent to and from the Arduino board. There are two RX and TX LEDs on the Arduino board which flash when data is being transmitted via the USB-to-serial chip and the USB connection to the computer (not for serial communication on pins 0 and 1). A SoftwareSerial library allows serial communication on any of the Uno's digital pins. The ATmega328P also supports I2C (TWI) and SPI communication, and the Arduino software includes a Wire library to simplify use of the I2C bus.
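The authorized/unauthorized decision described in the Design subsection can be sketched as a small check on the PC side; the tag IDs, the registry set and the returned fields are illustrative assumptions, with the gate field standing in for the relay/motor action:

```python
# Hypothetical tag registry: in the described system this data is
# gathered from the RFID reader and stored with the Arduino / PC.
AUTHORIZED_TAGS = {"04A3B2C1", "04D4E5F6"}

def check_vehicle(tag_id):
    """Classify a scanned RFID tag as authorized or unauthorized and
    decide the gate action, mirroring the relay/motor logic described."""
    if tag_id in AUTHORIZED_TAGS:
        return {"tag": tag_id, "status": "authorized", "gate": "open"}
    return {"tag": tag_id, "status": "unauthorized", "gate": "closed"}
```

The same status string is what would be forwarded to the mobile via the ESP8266 module.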
and unauthorized vehicle ID numbers are sent via the ESP8266 Wi-Fi module, which is connected to the internet.

REFERENCES
[1] Prof. Kumthekar A. V., Ms. Sayali Owhal, Ms. Snehal Supekar, Ms. Bhagyashri Tupe, International Research Journal of Engineering and Technology (IRJET), (Volume: 05), April 2018.
[2] Liu Bin, Lu Xiaobo and Gao Chaohui, "Comparing and testing of ETC modes in Chinese freeway", Journal of Transportation Engineering and Information, 5(2), 2007, pp. 31-35.
[3] "A Novel Chipless RFID System Based on Planar Multiresonators for Barcode Replacement", Stevan Preradovic, Isaac Balbin, Nemai C. Karmakar and Gerry Swiegers, 2008.
[4] Kashif Ali, Hossam Hassanein, "Passive RFID for Intelligent Transportation Systems", 2009 6th IEEE Consumer Communications and Networking Conference.
[5] Ivan Muller, Renato Machado de Brito, Carlos Eduardo Pereira, and Valner Brusamarello, "Load cells in force sensing analysis - theory and a novel application", IEEE Instrumentation & Measurement Magazine, Volume 13, Issue 1.
[6] Muhammad Tahir Qadri, Muhammad Asif, "Automatic Number Plate Recognition System for Vehicle Identification Using Optical Character Recognition", 2009 International Conference on Education Technology and Computer.
2. MOTIVATION
The present-day smart campus systems propose applications for knowing or measuring the area of a building or of classrooms in a college. Another smart campus system proposes an application in which the location of a user of the android app within the college or campus area can be known through the app. The major issue in a college or campus is the difficulty of data sharing among the students and the staff. Many of the users are not connected to the internet during college hours, and the important notices have to be displayed on the notice board or shared from class to class, increasing the manual effort. This process can be time consuming and cause manual errors. There is also the problem of controlling the electric appliances in the classes, where one has to go and manually switch the appliances on or off. In this paper, efforts are made to solve these issues by using an android app and a raspberry pi module, where a student can access the data sent by the teacher over the WI-FI module; power control is also added, so that the electric appliances can be controlled within the range of the raspberry pi.

3. METHODOLOGY
The system works as a storage medium and is WI-FI enabled, uploading the information to a web server designed for this application. The uploaded file is stored and can be viewed or downloaded using an android application. Faculties are able to turn the electric appliances, such as fans or lights, ON/OFF remotely from the server with the help of the raspberry Pi and a relay assembly.

Figure 1: Block Diagram of the System
The system uses a raspberry pi 3 as the heart of the system, which looks after all the communication in the system. The WI-FI of the raspberry pi is used as the medium to connect the android apps. Socket programming is used for the communication, and the app is designed in such a way that it can be accessed by an authorized person only. If a student has to access the app, they are given a separate password and ID (USER ID), and if a faculty member has to access the app, they have a different password and ID (ADMIN ID). Thus the system also preserves the privacy of the users and prevents miscommunication.
The memory of the raspberry pi is used as a storage unit for the data being uploaded; thus it works as a cloud memory for the android application (App). The application has options such as upload, download and view. The GUI design is different for teachers and students, based on their respective login as a faculty (ADMIN) or as a student (USER). This GUI is created using the Eclipse software. The faculty can also control the electrical appliances of the department using their android application, whereas the student login is not provided with this extra feature; this option is provided only in the faculty login GUI.

4. ALGORITHM
Start
Change the directory path to the predefined location.
Set the direction of the GPIO pin to output.
Open the socket with a fixed port number.
To accept connections, the following steps are performed:
1. A socket is created with socket().
2. The socket is bound to a local address using bind(), so that other sockets may be connected to it.
3. A willingness to accept incoming connections and a queue limit for incoming connections are specified with listen().
4. Connect the socket with the connect() method.
5. Connections are accepted with accept().
6. Read 1 byte of data on the socket.
7. Convert that byte from ASCII to int with the atoi() function.
8. Check the byte and pass it into the switch case.
9. if switch case 1:
i. Read file data from the client and save it on the server
ii. First read the file size
iii. Allocate memory to read the filename using malloc()
iv. Now read the actual file data
v. Now read the name of the file; for that, first read the size of the filename
vi. Allocate memory to read the filename
vii. Now read the actual file data
viii. Write the file data into the file
ix. Free the memory allocated by malloc()
10. if switch case 2:
i. Now read the pathname; for that, first read the size of the pathname
ii. Allocate memory to read the pathname using malloc()
iii. Now read the actual path data
iv. Now pass the directory path to the list_dir() function
v. This function returns the file and directory listing and its length
vi. Write the file and directory listing and its length on the socket
vii. Free the memory allocated by malloc()
11. if switch case 3:
i. Now read the file; for that, first read the size of the filename
ii. Allocate memory to read the filename using malloc()
iii. Now read the actual filename
iv. Now read the file with the fread() function, which returns the file content and the length of the file
v. Write the file length and file content on the socket
vi. Free the memory allocated by malloc()
12. if switch case 4:
i. Device 1 will be turned ON
13. if switch case 5:
i. Device 1 will be turned OFF
14. if switch case 6:
i. Device 2 will be turned ON
15. if switch case 7:
i. Device 2 will be turned OFF
16. if switch case 0:
i. All devices are turned OFF

5. RESULTS
Screenshots of various pages of the application are as follows:
Figure 2: Screenshot 1 of the Android Application (LOGIN PAGE)
Figure 3: Screenshot 2 of the Android Application (CONFIGURATION)
RFID tags with cryptographic capabilities and a slight modification of the digital signature calculation procedure make it possible to prevent obtaining digital signatures for fraudulent documents. A further evolution of the proposed scheme is permanent monitoring, by periodically checking the user's RFID tag to confirm whether the authenticated user is still present at the computer with restricted access.

used that carries the family member details, and the customer needs to show this tag to the RFID reader. The microcontroller connected to the reader checks for user authentication. If the user is found authentic, then the quantity of ration to be given to the customer, according to the total number of family members, is displayed on the display device.
II. Proposed Work
Here we conclude: the automatic vehicle identification system using the vehicle license plate and RFID technology has been presented. The system identifies the vehicle from the database stored on the PC. The objective of this project is to design an efficient automatic authorized-vehicle identification system by using the vehicle number plate and RFID. The Automatic Number Plate Recognition (ANPR) system is an important technique used in Intelligent Transportation Systems. ANPR is an advanced machine vision technology used to identify vehicles by their number plates without direct human intervention. The decisive portion of an ANPR system is the software model. We also implemented a further process: if any vehicle breaks the signal, our system can detect that vehicle's number tag and check the details of that vehicle in order to apply a fine to it.
REFERENCES
[1] Hsiao-Ying Huang, Privacy by Region:
Evaluation Online Users‘ Privacy Perceptions
by Geographical Region, FTC 2016 - Future
Technologies Conference 2016,6-7 December
2016.
[2] Hyoung shick Kim, Design of a secure digital
recording protection system with network
connected devices, 2017 31st International
Conference on Advanced Information
Networking and Applications Workshops.
[3] Chao-Hsien Lee and Yu-Lin Zheng, SQL-to-
NoSQL Schema Denormalization and
Migration: A Study on Content Management
DATA ANALYTICS AND MACHINE LEARNING
consumer decision making during online shopping experiences. The recommender system recommends products to users, and to what extent these recommendations affect consumer decisions about buying products is analyzed in this paper. A comparison with the state of the art for opinion mining is done by Horacio Saggion et al., 2009. Ana-Maria Popescu and Oren Etzioni introduce an unsupervised information extraction system which mines reviews in order to build a model of important product features, their evaluation by reviewers, and their relative quality across products (Oren et al., 2005).

Early Adopter Detection
An early adopter could refer to a trendsetter, e.g., an early customer of a given company, product and technology. The importance of early adopters has been widely studied in sociology and economics. It has been shown that early adopters are important in trend prediction, viral marketing, product promotion, and so on. The analysis and detection of early adopters in the diffusion of innovations have attracted much attention from the research community. Generally speaking, three elements of a diffusion process have been studied: attributes of an innovation, communication channels, and social network structures.

Modeling Comparison-Based Preference
By modeling comparison-based preference, we can essentially perform any ranking task. For example, in information retrieval (IR), learning to rank aims to learn the ranking for a list of candidate items with manually selected features.

Distributed Representation Learning
Since its seminal work, distributed representation learning has been successfully used in various application areas including Natural Language Processing (NLP), speech recognition and computer vision. In NLP several semantic embedding models have been proposed, including word embedding and phrase embedding, such as word2vec. In this paper we use natural language processing for sentiment analysis of users' reviews. Whether the user is giving a negative, positive or neutral review is characterized by this sentiment analysis.

The Use Case Diagram:
Fig 1: Use case
The Sequence Diagram:
Fig 2: Sequence Diagram
Fig: Activity Diagram
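The sentiment analysis step described above (labelling a review as positive, negative or neutral) can be sketched with a toy lexicon-based scorer. The word lists below are illustrative assumptions; a real system would use a trained model rather than hand-picked keywords.

```python
# Toy lexicons -- an assumption for illustration, not the paper's model.
POSITIVE = {"good", "great", "excellent", "love", "happy"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "disappointed"}

def review_polarity(text):
    """Label a review 'positive', 'negative' or 'neutral' by keyword counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `review_polarity("a great phone, love it")` yields "positive", while a review with only negative lexicon hits yields "negative".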
4. GAP ANALYSIS
Sr. Year Author Paper Name Paper Description
no Name
3. 2012 Manuela Models for Paired There are other situations that
Cattelan Comparison Data: A may be regarded as comparisons
Review with Emphasis from which a winner and a loser
on Dependent Data can be identified without the
presence of a judge
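The paired-comparison data surveyed in the gap analysis (Cattelan, 2012) can be illustrated with the classic Bradley-Terry model, where each item has a latent skill and the probability that one item beats another depends only on the skill difference. This is a generic sketch of paired-comparison modelling, not the reviewed paper's method.

```python
import math

def bt_win_prob(skill_i, skill_j):
    # Bradley-Terry: P(i beats j) = exp(s_i) / (exp(s_i) + exp(s_j))
    return math.exp(skill_i) / (math.exp(skill_i) + math.exp(skill_j))

def rank_by_wins(comparisons):
    """Rank items by win counts from (winner, loser) pairs -- the simplest
    consistent estimate when no judge or covariates are modelled."""
    wins = {}
    for winner, loser in comparisons:
        wins[winner] = wins.get(winner, 0) + 1
        wins.setdefault(loser, 0)
    return sorted(wins, key=wins.get, reverse=True)
```

With equal skills the win probability is 0.5, and any positive skill gap tilts the probability toward the stronger item.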
[6] B. W. O, "Reference group influence on product and brand purchase decisions," Journal of Consumer Research, vol. 9, pp. 183–194, 1982.
[7] J. J. McAuley, C. Targett, Q. Shi, and A. van den Hengel, "Image-based recommendations on styles and substitutes," in SIGIR, 2015, pp. 43–52.
[8] E. M. Rogers, Diffusion of Innovations. New York: The Rise of High-Technology Culture, 1983.
[9] K. Sarkar and H. Sundaram, "How do we find early adopters who will guide a resource constrained network towards a desired distribution of behaviors?" in CoRR, 2013, p. 1303.
[10] D. Imamori and K. Tajima, "Predicting popularity of twitter accounts through the discovery of link-propagating early adopters," in CoRR, 2015, p. 1512.
[11] X. Rong and Q. Mei, "Diffusion of innovations revisited: from social network to innovation network," in CIKM, 2013, pp. 499–508.
[12] I. Mele, F. Bonchi, and A. Gionis, "The early-adopter graph and its application to web-page recommendation," in CIKM, 2012, pp. 1682–1686.
[13] Y.-F. Chen, "Herd behavior in purchasing books online," Computers in Human Behavior, vol. 24(5), pp. 1977–1992, 2008; Banerjee, "A simple model of herd behaviour," Quarterly Journal of Economics, vol. 107, pp. 797–817, 1992.
[14] A. S. E, "Studies of independence and conformity: I. a minority of one against a unanimous majority," Psychological Monographs: General and Applied, vol. 70(9), p. 1, 1956.
nearly impossible for a layman to be well versed in SQL querying, as they may be unaware of the structure of the database, namely tables, their corresponding fields and types, primary keys and so on. There is a need to overcome this gap in knowledge and allow users who have no prior knowledge of SQL to query a database using a query posed in a natural language such as English. Providing a solution to this problem, this system has been proposed that takes natural language speech through voice recognition, converts it to an SQL query and displays the results from the database.

2. MOTIVATION
One of the most important aims of Artificial Intelligence is to make things easily and quickly accessible to humans. Access to information is invaluable and it should be available to everyone. Logically formulating the information a person needs is quite easy, and we do it frequently. However, one needs knowledge of formal languages to access information from current systems, and this hinders non-technical people from obtaining the information they want. It is crucial for systems to be user-friendly in order to obtain the highest benefits. These systems try to make information accessible to everyone who knows a natural language. The main motivation of the proposed system is to break the barriers for non-technical users and make information easily accessible to them. Making a user-friendly and more conversationally intelligent system will help users, even naive ones, to perform queries without actual knowledge of SQL or the database schema. We aim to introduce a modular system to query a database at any time without the hassle of logically forming the SQL constructs. For instance, consider the scenario of a hospital. Information of the patient is stored in the database. A doctor may not be well acquainted with databases. Information retrieval hence becomes difficult for the doctor. The system also acts as a learning tool for students, which helps in the assessment of SQL queries and learning through experience. The proposed system takes such problems into consideration and provides a solution to them. It makes access to data easier. With natural language as input and conversion of natural language to SQL queries, even naive users can access the data in the database. Advances in machine learning have progressively increased the reliability, usage, and efficiency of voice-to-text models. NLP has also seen major breakthroughs due to the growth of the Internet and Business Intelligence needs. Many toolkits and libraries exist for the sole purpose of performing NLP; this makes developing such a system easier and achievable.

3. STATE OF ART
For the proposed system, Intelligent Querying System using Natural Language Processing, various papers have been reviewed, whose survey report is given below. In [1] the author has proposed an interactive natural language query interface for relational databases. Given a natural language query, the system first translates it to an SQL statement and then evaluates it against an RDBMS. To achieve high reliability, the system explains to the user how the query is actually processed. When ambiguities exist, for each ambiguity the system generates multiple likely interpretations for the user to choose from, which resolves ambiguities interactively with the user. "The Rule based domain specific semantic analysis Natural Language Interface for Database" [2] converts a wide range of text queries (English questions) into formal (SQL query) ones that can then be run against a database by employing
generic and simpler processing techniques and methods. This paper defines the relations involving the ambiguous terms and domain-specific rules, and with this approach makes the NLIDB system portable and generic for a small as well as a large number of applications. The paper only focuses on context-based interaction along with the SELECT, FROM, WHERE and JOIN clauses of the SQL query, and also handles the complex queries that result from ambiguous natural language queries. In "Natural Language to SQL Generation for Semantic Knowledge Extraction in Social Web Sources" [3], a system is developed that can execute both DDL and DML queries, input by the user in natural language. A limited data dictionary is used where all possible words related to a particular system are included. Ambiguity among the words is taken care of while processing the natural language. The system is developed in the Java programming language and various Java tools are used to build it. An Oracle database is used to store the information. The author has proposed a system in [4] which provides a convenient as well as reliable means of querying access, and hence a realistic potential for bridging the gap between the computer and casual end users. The system employs a CFG-based approach which makes it easy to search the terminals, as the target terminals become separated into many non-terminals. To get the maximum performance, the data dictionary of the system has to be regularly updated with words that are specific to the particular system. The paper "An Algorithm for Solving Natural Language Query Execution Problems on Relational Databases" [5] showed how a modelled algorithm can be used to create a user-friendly, non-expert search process. The modularity of SQL conversion is also shown. The proposed model has been able to intelligently process user requests in a reasonable, human-usable format. The limitations of the developed NLIDB are as follows: 1. Domain dependent. 2. Limited in query domain. In "System and Methods for Converting Speech to SQL" [6], the author proposes a system which uses speech recognition models in association with a classical rule-based technique and semantic knowledge of the underlying database to translate the user's speech query into SQL. To find the join of tables, the system uses the underlying database schema by converting it into a graph structure. The system is checked for single tables and multiple tables, and it gives correct results if the input query is syntactically consistent with the syntactic rules. The system is also database independent, i.e. it can be configured automatically for different databases.

4. PROPOSED WORK
There are many NLIDBs proposed in different papers, but the interaction between the user and the system is missing. The proposed system tries to construct a natural language interface to databases in which the user can interact with the system, confirm whether the interpretation done by the system is correct or not, and make any manual changes required. The proposed system tries to build a bridge between linguistics and artificial intelligence, aiming at developing computer programs capable of human-like activity such as understanding and producing text or speech in a natural language such as English, or conversion of natural language in text or speech form to a language like SQL. The proposed system mainly works in three important steps: 1. Speech-to-text conversion, 2. SQL query generation, 3. Result generation, as displayed in Fig. 1 (flowchart). In the proposed system, that is, the interactive query system using natural language processing, the very first challenge is to convert the speech to text
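As a toy illustration of step 2 (SQL query generation), a single rule-based pattern can map a simple English question to a SELECT statement, in the spirit of the rule-based systems surveyed above. The grammar and the table and column names (`patients`, `city`) are hypothetical examples, not the paper's schema or implementation.

```python
import re

# One toy rule: "show <columns> of <table> [where <col> is <value>]".
PATTERN = re.compile(
    r"show (?P<cols>[\w, ]+) of (?P<table>\w+)(?: where (?P<col>\w+) is (?P<val>\w+))?",
    re.IGNORECASE,
)

def to_sql(question):
    """Translate a simple English question into a SELECT statement."""
    m = PATTERN.match(question.strip())
    if not m:
        raise ValueError("question not understood")
    cols = ", ".join(c.strip() for c in m.group("cols").split(","))
    sql = f"SELECT {cols} FROM {m.group('table')}"
    if m.group("col"):
        sql += f" WHERE {m.group('col')} = '{m.group('val')}'"
    return sql + ";"
```

A full NLIDB replaces this single pattern with a grammar, a data dictionary and schema-aware join resolution, but the translate-then-evaluate pipeline is the same.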
3. LITERATURE SURVEY
"Emotion Based Mood Enhancing Music Recommendation, 2017" proposed

Sr. No | Paper | Advantages | Disadvantages
[4] Paul Viola and Michael J. Jones, "Robust real-time object detection," International Journal of Computer Vision, Vol. 57, No. 2, pp. 137–154, 2004.
[5] Sayali Chavan, Ekta Malkan, Dipali Bhatt, Prakash H. Paranjape, "XBeats-An Emotion Based Music Player," International Journal for Advance Research in Engineering and Technology, Vol. 2, pp. 79-84, 2014.
[6] Xuan Zhu, Yuan-Yuan Shi, Hyoung-Gook Kim and Ki-Wan Eom, "An Integrated Music Recommendation System," IEEE Transactions on Consumer Electronics, Vol. 52, No. 3, pp. 917-925, 2006.
[7] Dolly Reney and Dr. Neeta Tripathi, "An Efficient Method to Face and Emotion Detection," Fifth International Conference on Communication Systems and Network Technologies, 2015.
[8] Lucey, P., Cohn, J. F., Kanade, T., Saragih, J., Ambadar, Z., & Matthews, I. (2010). The Extended Cohn-Kanade Dataset (CK+): A complete expression dataset for action unit and emotion-specified expression. Proceedings of the Third International Workshop on CVPR for Human Communicative Behavior Analysis (CVPR4HB 2010), San Francisco, USA, 94-101.
[9] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, "Emotion Recognition in Human Computer Interaction," IEEE Signal Processing Magazine 18(1), 32-80, 2001.
[10] O. Martin, I. Kotsia, B. Macq, I. Pitas, "The eNTERFACE'05 Audio-Visual Emotion Database," in: 22nd International Conference on Data Engineering Workshops, Atlanta, GA, USA, 2006.
imaging conditions. Images are labelled with a subject ID as well as their orientation [11]. The images have been labelled on a scale of 0 to 4, where 0 is no DR and 4 is proliferative DR. The dataset consists of 35,126 training images divided into 5 category labels and 10,715 test images, which are 20 percent of the total test dataset. The dataset is a collection of images with different illumination, size and resolution, so every image needs to be standardized. Initially all images are resized to standard dimensions. Dataset images are RGB colour images consisting of red, green and blue channels, out of which the green channel is used as it gives the best contrast of blood vessels. This is depicted in Fig. 1.

channels of the output equal to the number of features. The correspondence of the feature detectors with the required output is very large, which may lead to overfitting. To avoid this, the parameters to be trained in the network can be fixed by computing the dimensions of the filters and the bias, such that they do not depend on the size of the input image. Each layer outputs certain values by convolving the input with the filter. Non-linear activation functions are applied to the output to achieve the final computations.
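The preprocessing described above (green-channel extraction, resizing to standard dimensions, standardization) can be sketched with NumPy. The 128x128 target size and the nearest-neighbour resampling are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np

def preprocess(image, out_h=128, out_w=128):
    """Extract the green channel of an RGB image, resize it by
    nearest-neighbour sampling, and standardize it."""
    green = image[:, :, 1].astype(np.float32)   # channel 1 = green in RGB order
    h, w = green.shape
    rows = np.arange(out_h) * h // out_h        # nearest source row per output row
    cols = np.arange(out_w) * w // out_w        # nearest source column per output column
    resized = green[rows[:, None], cols[None, :]]
    # Zero mean, unit variance so images with different illumination are comparable.
    return (resized - resized.mean()) / (resized.std() + 1e-8)
```

Every image then has the same shape and value range before it is fed to the convolutional network.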
Method
the scope of the diagnosis. It will also allow the model to learn better during the training phase. Diabetic Retinopathy is one of the diseases that affect people all over the world. The success of such a system for the classification of Diabetic Retinopathy provides the scope for building similar systems for various other diseases that need accurate results within a short period of time. A Convolutional Neural Network is a very powerful network which can be further used for extended analysis of various other diseases.

REFERENCES
[1] C. P. Wilkinson, F. L. Ferris, R. E. Klein, P. P. Lee, C. D. Agardh, M. Davis, D. Dills, A. Kampik, R. Pararajasegaram, J. T. Verdaguer, and the Global Diabetic Retinopathy Project Group, "Proposed international clinical diabetic retinopathy and diabetic macular edema disease severity scales," Ophthalmology, vol. 110, issue 9, Sep. 2003, pp. 1677-1682.
[1] T. Y. Wong, C. M. G. Cheung, M. Larsen, S. Sharma, and R. Sim, "Diabetic retinopathy," Nature Reviews Disease Primers, vol. 2, Mar. 2016, pp. 1-16.
[2] Gadkari SS, "Diabetic retinopathy screening: Telemedicine, the way to go!," Indian J Ophthalmol 2018;66:187-8.
[3] Gulshan V, Peng L, Coram M, et al., "Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs," JAMA. 2016;316(22):2402-2410. doi:10.1001/jama.2016.17216
[4] Y. Kanungo, B. Srinivasan and S. Choudhary, "Detecting diabetic retinopathy using deep learning," 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology (RTEICT), Bangalore, 2017, pp. 801-804. doi: 10.1109/RTEICT.2017.8256708
[5] D. Fitriati and A. Murtako, "Implementation of Diabetic Retinopathy screening using realtime data," 2016 International Conference on Informatics and Computing (ICIC), Mataram, 2016, pp. 198-203. doi: 10.1109/IAC.2016.7905715
[6] S. Yu, D. Xiao and Y. Kanagasingam, "Exudate detection for diabetic retinopathy with convolutional neural networks," 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Seogwipo, 2017, pp. 1744-1747. doi: 10.1109/EMBC.2017.8037180
[7] T. Bui, N. Maneerat and U. Watchareeruetai, "Detection of cotton wool for diabetic retinopathy analysis using neural network," 2017 IEEE 10th International Workshop on Computational Intelligence and Applications (IWCIA), Hiroshima, 2017, pp. 203-206. doi: 10.1109/IWCIA.2017.8203585
[8] A. G. A. Padmanabha, M. A. Appaji, M. Prasad, H. Lu and S. Joshi, "Classification of diabetic retinopathy using textural features in retinal color fundus image," 2017 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), Nanjing, 2017, pp. 1-5. doi: 10.1109/ISKE.2017.8258754
[9] X. Wang, Y. Lu, Y. Wang and W. Chen, "Diabetic Retinopathy Stage Classification Using Convolutional Neural Networks," 2018 IEEE International Conference on Information Reuse and Integration (IRI), Salt Lake City, UT, 2018, pp. 465-471. doi: 10.1109/IRI.2018.00074
[10] https://www.kaggle.com/c/diabetic-retinopathy-detection/dat.
improve their quality by being better prepared for similar anomalies in the future.

2. MOTIVATION
● Every day almost 2.2 million people willingly board commercial airlines despite the fact that around 850,000 of them will not get to their desired destination on time [9].
● Roughly 40 percent of all air travelers have arrived late consistently for most of the last 35 years [10]. And unless things change dramatically, about 40 percent of all air travelers will continue to arrive late every year, perhaps forever.
● A 40 percent failure rate would be unacceptable for the global commercial passenger flight network and acts as a bottleneck for various business and travel related activities along with air cargo delivery operations.
● Using historic flight data and meteorological data of the source and destination airports as the major attributes, this paper addresses this problem using various machine learning algorithms in order to gauge the feasibility of different algorithms and choose the most accurate one for prediction.

3. LITERATURE SURVEY
This section provides information about the previous work done on the problem of flight delay prediction.
● Airline Delay Predictions using Supervised Machine Learning
Pranalli Chandraa, Prabakaran.N and Kannadasan.R, VIT University, Vellore.
This paper uses preliminary data analysis techniques and data cleaning to remove noise and inconsistencies. The machine learning techniques used are multiple linear regression and polynomial regression, which allow for various metrics of bias and variance in order to pinpoint the best fitting parameters for the respective models. The K-fold method is used for cross validation of the intermediate models, and RMSE and Ecart metrics gauge their performance. The implementation is carried out in Python 3.
● Review on Flight Delay Prediction
Alice Sternberg, Jorge Soares, Diego Carvalho, Eduardo Ogasawara.
This paper proposes a taxonomy and consolidates the methodologies used to address the flight delay prediction problem, with respect to scope, data, and computing methods, specifically focusing on the increased usage of machine learning methods. It also presents a timeline of significant works that represent the interrelationships between research trends and flight delay prediction problems.
● A Deep Learning Approach to Flight Delay Prediction
Young Jin Kim, Sun Choi, Simon Briceno and Dimitri Mavris.
This paper uses deep learning models like Recurrent Neural Networks (RNNs) with long short-term memory units. Deep learning is suitable for learning from labelled as well as unlabelled data. It uses multiple hidden layers to improve the learning process and can be accelerated using modern GPUs. Deep learning tries to mimic the learning methodologies of the biological brain (mainly the human brain). This paper comments on the effectiveness of various deep learning models for predicting airline delays.
● A statistical approach to predict flight delay using gradient boosted decision tree
Suvojit Manna, Sanket Biswas, Riyanka Kundu, Somnath Rakshit, Priti Gupta.
This paper investigates the effectiveness of the Gradient Boosted Decision Tree algorithm, one of the famous machine learning tools, to analyse air traffic data. The authors built an accurate and robust prediction model which enables an elaborated analysis of the patterns in air traffic delays.
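The K-fold cross-validation and RMSE evaluation mentioned in the survey above can be sketched as follows. The mean predictor used as the "model" here is a stand-in assumption so the sketch stays self-contained; the surveyed papers fit regression models instead.

```python
import math

def kfold_indices(n, k):
    """Split range(n) into k contiguous folds, as even as possible."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def kfold_rmse(y, k=5):
    """Cross-validated RMSE of a mean predictor (a stand-in model)."""
    scores = []
    for fold in kfold_indices(len(y), k):
        held_out = set(fold)
        train = [y[i] for i in range(len(y)) if i not in held_out]
        mean = sum(train) / len(train)          # "train" the stand-in model
        scores.append(rmse([y[i] for i in fold], [mean] * len(fold)))
    return sum(scores) / len(scores)
```

Each fold is held out once, the model is fit on the rest, and the per-fold RMSE scores are averaged into one performance estimate.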
Fig. 2: Boosting

performance using additive learning, which learns basically through improving the previously built models. The main methodology used here is to build a model from the training data and create a second model that corrects the errors of the previously built model.

5.4.2 AdaBoost Method
AdaBoost puts more weight on the instances that are difficult to classify, rather than instances that are easily classified. AdaBoost is less susceptible to over-fitting the training data. A strong classifier can be built by combining the individual weak learners.
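The reweighting behaviour described in this section (heavier weights on hard-to-classify instances, weak learners combined into a strong classifier) can be sketched as one round of the standard AdaBoost update. This is an illustrative sketch of that update, not the paper's implementation.

```python
import math

def adaboost_round(weights, correct):
    """One AdaBoost reweighting step.

    weights: current instance weights (must sum to 1)
    correct: whether the weak learner classified each instance correctly
    Returns (new_weights, alpha), where alpha is the learner's vote weight.
    Misclassified instances become heavier, correct ones lighter.
    """
    err = sum(w for w, c in zip(weights, correct) if not c)
    alpha = 0.5 * math.log((1 - err) / max(err, 1e-12))
    new = [w * math.exp(-alpha if c else alpha) for w, c in zip(weights, correct)]
    total = sum(new)
    return [w / total for w in new], alpha
```

After normalization the misclassified instances always carry exactly half of the total weight, which forces the next weak learner to focus on them; the final strong classifier is the alpha-weighted vote of all rounds.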
better performance than the previously implemented AdaBoost and Gradient Boosting.

7. CONCLUSION
In a generalized manner, this paper has shown that prediction of delays in commercial flights is tractable and that local weather data at the origin airport is indeed essential for the prediction of delays.
In the case of flight delays or cancellation, the most significant real-world factors are a combination of technical and logistical issues. The datasets considered in the paper do not provide this aspect of the data; thus the accuracy of the model is restrained by this limitation.

REFERENCES
[1] Belcastro, Loris, et al. "Using Scalable Data Mining for Predicting Flight Delays." ACM Transactions on Intelligent Systems and Technology (TIST) 8.1 (2016).
[2] Khanmohammadi, Sina, Salih Tutun, and Yunus Kucuk. "A New Multilevel Input Layer Artificial Neural Network for Predicting Flight Delays at JFK Airport." Procedia Computer Science 95 (2016): 237-244.
[3] Hensman, James, Nicolo Fusi, and Neil D. Lawrence. "Gaussian processes for big data." CoRR, arXiv:1309.6835 (2013).
[4] Bandyopadhyay, Raj, and Guerrero, Rafael. "Predicting airline delays." CS229 Final Projects (2012).
[5] Gilbo, Eugene P. "Airport capacity: Representation, estimation, optimization." IEEE Transactions on Control Systems Technology 1.3 (1993): 144-154.
[6] Tierney, Sean, and Michael Kuby. "Airline and airport choice by passengers in multi airport regions: The effect of Southwest Airlines." The Professional Geographer 60.1 (2008): 15-32.
[7] Schaefer, Lisa, and David Millner. "Flight delay propagation analysis with the detailed policy assessment tool." Systems, Man, and Cybernetics, 2001 IEEE International Conference on. Vol. 2. IEEE, 2001.
[8] Guy, Ann Brody. "Flight delays cost $32.9 billion." http://news.berkeley.edu/2010/10/18/flight_delays.
[9] "Airlines' 40% Failure Rate: 850,000 Passengers Will Arrive Late Today -- And Every Day," https://www.forbes.com/sites/danielreed/2015/07/06/airlines-40-failure-rate-850000-passengers-will-arrive-late-today-and-every-day/#2d077c1074bd
[10] Hansen, Mark, and Chieh Hsiao. "Going south?: Econometric analysis of US airline flight delays from 2000 to 2004." Transportation Research Record: Journal of the Transportation Research Board 1915 (2005): 85-94.
[11] Robert E. Schapire. "Explaining AdaBoost." Princeton University, Dept. of Computer Science, 35 Olden Street, Princeton, NJ 08540 USA, e-mail: schapire@cs.princeton.edu.
[12] Jerome H. Friedman. "Stochastic gradient boosting." Department of Statistics and Stanford Linear Accelerator Center, Stanford University, Stanford, CA 94305, USA.
[13] Suvojit Manna, Sanket Biswas. "A Statistical approach to predict Flight Delay using Gradient Boosted Decision Tree."
heart, also known as Heart Attack. A blockage can develop due to a buildup of plaque, a substance mostly made of fat, cholesterol and cellular waste products. Due to an insufficient blood supply, some of the heart muscles begin to die. Without early medical treatment this damage can be permanent.

Motivation
The medical sector is rich with information, but the major issues with medical data mining are the data's volume and complexity, poor mathematical categorization and canonical form. We have used advanced data mining techniques to discover knowledge from the collected medical datasets. Reducing the delay between the onset of a heart attack and seeking treatment is a major issue. Individuals who are busy in their homes or offices with their regular work, and rural people with no knowledge of the symptoms of heart attack, may neglect chest discomfort. They may not intend to neglect it, but they may pass the time and decide to go to a doctor or hospital after a while. But for a heart attack, time matters most. There are many Mobile Health (mHealth) tools available to the consumer for the prevention of CVD, such as self-monitoring mobile apps. Current science shows the evidence on the use of the vast array of mobile devices, such as the use of mobile phones for communication and feedback, and smartphone apps. As medical diagnosis of heart attack is an important but complicated and costly task, we propose a system for medical diagnosis that would enhance medical care and reduce cost. Our aim is to provide a ubiquitous service that is feasible and sustainable, and which also enables people to assess their risk of heart attack at that point of time or later.

Problem Statement
Reliable identification and classification of cardiovascular diseases requires pathological tests, namely blood tests and ECG, and analysis by experienced pathologists. As it involves human judgment of several factors and a combination of experiences, a decision support system is desirable in this case. The proposed problem statement is "Risk Assessment in Heart Attack using machine learning".

Acute myocardial infarction, commonly referred to as Heart Attack, is the most common cause of sudden deaths in city and village areas. It is one of the most dangerous diseases among men and women, and early identification and treatment is the best available option for the people.

2. RELATED WORK
Nearest neighbor (KNN) is a very simple, popular, highly efficient and effective technique for pattern recognition. KNN is a straightforward classifier, where samples are classified based on the class of their nearest neighbor. Medical databases are big in volume. If the data set contains excessive and irrelevant attributes, classification may produce less accurate results. Heart disease is the leading cause of death in India. In Andhra Pradesh heart disease was the leading cause of mortality, accounting for 32% of all deaths, a rate as high as Canada (35%) and the USA. Hence there is a need to define a decision support system that helps clinicians take precautionary steps. This work proposed a new technique which combines KNN with a genetic technique for effective classification. The genetic technique performs a global search in complex, large and multimodal landscapes and provides an optimal solution [1].

This work focuses on a new approach for applying association rules in the medical domain to discover Heart Disease Prediction. The health care industry collects a huge amount of health care data which, unfortunately, is not mined to discover hidden information for effective decision making. Discovery of hidden
ISSN:0975-887 Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune. Page 110
Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019
patterns and relationships often goes unexploited. Data mining techniques can help remedy this situation. Data mining has found numerous applications in business and scientific domains. Association rules, classification and clustering are major areas of interest in data mining [2].

This work has analyzed prediction systems for heart disease using a larger number of input attributes. The work uses medical attributes such as sex, blood pressure and cholesterol, 13 attributes in all, to predict the likelihood of a patient getting a heart disease. Until now, 13 attributes were used for prediction. This research work added two more attributes, i.e. obesity and smoking. The data mining classification algorithms, namely Decision Trees, Naive Bayes, and Neural Networks, are analyzed on the heart disease database [3].

Medical Diagnosis Systems play an important role in medical practice and are used by medical practitioners for diagnosis and treatment. In this work, a medical diagnosis system is defined for predicting the risk of cardiovascular disease. This system is built by combining the relative advantages of the genetic technique and a neural network. Multilayered feed-forward neural networks are particularly adapted to complex classification problems. The weights of the neural network are determined using the genetic technique because it finds an acceptably good set of weights in a small number of iterations [4].

A wide range of heart conditions is defined by thorough examination of the features of the ECG report. Automatic extraction of time-plane features is valuable for identification of vital cardiac diseases. This work presents a multi-resolution wavelet transform based system for detection of the 'P', 'Q', 'R', 'S', 'T' peak complex from the original ECG signal. The 'R-R' time lapse is an important minutia of the ECG signal that corresponds to the heartbeat of the related person. An abrupt increase in the height of the 'R' wave or changes in the measurement of the 'R-R' interval denote various anomalies of the human heart. Similarly the 'P-P', 'Q-Q', 'S-S' and 'T-T' intervals also correspond to various anomalies of the heart, and their peak amplitudes also indicate other cardiac diseases. In this proposed method the 'PQRST' peaks are marked and stored over the entire signal, and the time interval between two consecutive 'R' peaks, along with the other peak intervals, is measured to find anomalies in the behavior of the heart, if any [5].

The ECG signal is well known for its nonlinear changing behavior, and a key characteristic utilized in this research is that the nonlinear component of its dynamics changes more between normal and abnormal conditions than does the linear one. As higher-order statistics (HOS) maintain phase information, this work makes use of one-dimensional slices from the higher-order spectral region of normal and ischemic subjects. A feed-forward multilayer neural network (NN) with error back propagation (BP) learning was used as an automated ECG classifier to find the possibility of recognizing ischemic heart disease from normal ECG signals [6].

Automatic ECG classification is a promising tool for cardiologists in medical diagnosis for effective treatments. This work proposes efficient techniques to automatically classify ECG signals into normal and arrhythmia-affected (abnormal) parts. For these categories, morphological features are extracted to illustrate the ECG signal. A probabilistic neural network (PNN) is the modeling technique used to capture the distribution of the feature vectors for classification, and the performance is calculated. The ECG time series signals in this work are taken from the MIT-BIH arrhythmia database [7].

Heart diseases are the most extensive cause of human death. Every
attributes to predict the actual heart education programs will decline in the
disease. heart disease mortality.
5. ALGORITHM
The Naive Bayes algorithm learns the probability that an object with certain features belongs to a particular group or class. In short, it is a probabilistic classifier.

The Naive Bayes algorithm is called "naive" because it makes the assumption that the occurrence of a certain feature is independent of the occurrence of other features. Here we classify heart disease based on heart check-up attributes.

Naive Bayes, or Bayes' rule, is the basis for many machine learning and data mining methods. The rule is used to create models with predictive capabilities, and it provides new ways of exploring and understanding data. A Naive Bayes implementation is preferable:
1) when the volume of data is high;
2) when the attributes are independent of each other;
3) when more efficient output is expected compared to other methods.
Based on this information and these steps, we classify and predict heart disease from the heart check-up attributes.

6. CONCLUSION
In this work we have presented a novel approach for classifying heart disease. To validate the proposed method, we will add the patient's heart test result details and predict the type of heart disease using machine learning. The training data sets are taken from the UCI repository. Our approach uses the Naive Bayes technique, which is a competitive method for classification. This prediction model helps doctors carry out an efficient heart disease diagnosis process with fewer attributes. Heart disease is the most common contributor to mortality in India and in Andhra Pradesh. Identification of major risk factors, development of decision support systems, and effective control measures and health education programs will decline the heart disease mortality.

REFERENCES
[1] M. Akhil Jabbar, B. L. Deekshatulu, Priti Chandra, "Classification of Heart Disease Using K-Nearest Neighbor and Genetic Algorithm", International Conference on Computational Intelligence: Modeling Techniques and Applications (CIMTA), 2013.
[2] M. A. Jabbar, B. L. Deekshatulu, Priti Chandra, "An evolutionary algorithm for heart disease prediction", CCIS, pp. 378-389, Springer, 2012.
[3] Chaitrali S. Dangare, "Improved Study of Heart Disease Prediction System Using Data Mining Classification Techniques", International Journal of Computer Applications, Vol. 47, No. 10, June 2012.
[4] N. G. B. Amma, "Cardio Vascular Disease Prediction System using Genetic Algorithm", IEEE International Conference on Computing, Communication and Applications, 2012.
[5] Sayantan Mukhopadhyay, Shouvik Biswas, Anamitra Bardhan Roy, Nilanjan Dey, "Wavelet Based QRS Complex Detection of ECG Signal", International Journal of Engineering Research and Applications (IJERA), Vol. 2, Issue 3, May-Jun 2012, pp. 2361-2365.
[6] Sahar H. El-Khafif and Mohamed A. El-Brawany, "Artificial Neural Network-Based Automated ECG Signal Classifier", 29 May 2013.
[7] M. Vijayavanan, V. Rathikarani, P. Dhanalakshmi, "Automatic Classification of ECG Signal for Heart Disease Diagnosis using morphological features", ISSN 2229-3345, Vol. 5, No. 04, Apr 2014.
[8] I. S. Siva Rao, T. Srinivasa Rao, "Performance Identification of Different Heart Diseases Based On Neural Network Classification", ISSN 0973-4562, Vol. 11, No. 6, 2016, pp. 3859-3864.
[9] J. R. Quinlan, "Induction of decision trees", Machine Learning, vol. 1, no. 1, pp. 81-106, 1986.
[10] J. Han, J. Pei, and M. Kamber, Data Mining: Concepts and Techniques, Elsevier, 2011.
[11] I. H. Witten, E. Frank, M. A. Hall, and C. J. Pal, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2016.
[12] L. Breiman, "Random forests", Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[13] A. S. Mullasari, P. Balaji, T. Khando, "Managing complications in acute myocardial infarction", J Assoc Physicians India, Dec 2011, 59 Suppl(1), pp. 43-48.
[14] C. Alexander and L. Wang, "Big data analytics in heart attack prediction", J Nurs Care, vol. 6, no. 393, 2017.
[15] J. W. Wallis, "Use of artificial intelligence in cardiac imaging", J Nucl Med, Aug 2001, 42(8), pp. 1192-1194.
inhumane, which led to severe post-traumatic stress disorder. Thus, manual moderation of abusive content is harmful for the person moderating it. Therefore there is a need for an efficient technique to monitor hate speech and offensive words on social networking sites.

2. LITERATURE SURVEY
In [7], the paper covers moderation of multimodal subtleties such as images and text. The authors develop a deep learning classifier that jointly models textual and visual characteristics of pro-eating disorder content that violates community guidelines. For analysis, they used a million photo posts from Tumblr. The classifier discovers deviant content efficiently while also maintaining high recall (85%). They also discuss how automation might impact community moderation and the ethical and social obligations of this area.

In [8], the proposed system is designed for the open source operating systems Windows and Linux. The implementation of the system is based on a PHP framework. A MySQL database is used for storing the datasets, by configuring the LAMP server in Ubuntu and the WAMP server in Windows, together with phpMyAdmin. Ubuntu helps to perform various tasks such as creating, modifying or deleting databases through a web browser. Dreamweaver is used for the system development, and the latest version of Apache is used for recommendation generation. To make the web environment scalable it is integrated with PHP and WAMP; initially, for testing purposes, a phase-one deployment is established on localhost.

In [9], various data processing techniques are applied, such as term weighting and dimensionality reduction. All these techniques were studied in order to build algorithms able to mimic human decisions regarding the comments. The results indicate the ability to mimic expert decisions on 96.78% of the data set used. The classifiers used for comparison of the results were K-Nearest Neighbors and the Covalent Bond Classification. For dimensionality reduction, term extraction techniques were also used to best characterize the categories within the data set.

As SNSs have become of paramount relevance nowadays, many people refuse to participate in or join them because of how easy it is to publish and spread content that might be considered offensive. In [4], the approach accurately identifies inappropriate content based on accusers' reputations, analyzing the reporting systems used to assess content as harmless or offensive in SNSs.

3. GAP ANALYSIS
Not all the data generated on SNSs can be considered normal; a considerable amount of it can be considered offensive and hateful. Manual content moderation is effective but requires a considerable amount of manpower, and it can be traumatic for humans to examine such inappropriate content. Hence, in recent days some organizations have come up with effective techniques which can automate this process.
Tweets contain unnecessary data such as stop words, emojis and usernames. This kind of data does not contribute much to classification, and hence we need to filter it out and normalize the rest into a suitable format so that it can be used for training the classifier to classify unknown text data. An individual tweet is taken and is then tokenized into words. These tokens are then used to detect unnecessary data such as emojis and usernames. Furthermore, unnecessary symbols and stopwords are removed in order to reduce the data volume.

The main task is to normalize the data. Hence the aim is to infer a grammar-independent representation of a given tweet. Lemmatization is used to find the lemma of each token. After this, all the filtered tokens for one tweet are collected together for further processing.

The vectorization algorithm used in the proposed model is TF-IDF vectorization. The reason to choose this particular technique is that the dataset used for the experimentation contains a large number of tweets with offensive words, which dominate the small number of regular tweets. As TF-IDF assigns a score depending upon the occurrence of a term in a document, it seems to be the best choice.

The classifier model is then trained on a collection of pairs containing vectorized tweets and whether or not they are offensive. Supervised classification is used: the proposed system learns from these tweets and can then classify a new tweet. After training, when a new tweet is given to the model, it repeats all the above steps.
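The preprocessing pipeline described above (tokenization, removal of usernames, links, emojis and stopwords, then lemmatization) can be sketched as follows. The stopword list and the crude suffix-stripping "lemmatizer" are simplified stand-ins for a real lexical resource such as NLTK's WordNet lemmatizer:

```python
import re

STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "you"}

def lemma(token):
    # crude normalization stand-in: lowercase and strip a plural "s"
    t = token.lower()
    return t[:-1] if t.endswith("s") and len(t) > 3 else t

def preprocess(tweet):
    tweet = re.sub(r"@\w+", " ", tweet)        # drop usernames
    tweet = re.sub(r"http\S+", " ", tweet)     # drop links
    tokens = re.findall(r"[A-Za-z]+", tweet)   # keep words; emoji/symbols fall away
    return [lemma(t) for t in tokens if t.lower() not in STOPWORDS]

print(preprocess("@user You are the worst trolls!!! \U0001F621 http://t.co/x"))
# prints ['worst', 'troll']
```

The output token list is what gets handed to the vectorizer in the next step.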
5. MATHEMATICAL MODEL
The proposed model can be represented mathematically as follows. Here, we used two classifier models (Bernoulli Naive Bayes and Bagged SVM) for performance comparison.

The term frequency-inverse document frequency (TF-IDF) of the words in a given corpus is calculated by

tfidf(t, a, D) = tf(t, a) x log( |D| / |{a in D : t in a}| )    ...(1)

where
t - a term;
a - an individual document;
D - the collection of documents;
tf - the term frequency, i.e. the number of times a word appears in each document.

Using (1), all tweets are vectorized.

1.) Naive Bayes - the predicted class is the one that maximizes the posterior probability:

c* = argmax over classes c of P(c) x product over terms t of P(t | c)

2.) Bagged Support Vector Machines - as given in [12], Support Vector Machines can be bagged by training a sequence of base classifiers on bootstrap samples of the training set and combining their predictions by majority vote, where

H_m - the sequence of classifiers;
m - 1, ..., M;
M - the number of classifiers in the bagging ensemble;
alpha - the learning parameter of the base classifiers.
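Equation (1) can be checked on a toy corpus. A minimal plain-Python sketch (the example documents are made up; a raw term count is used for tf):

```python
import math

def tfidf(term, doc, corpus):
    """tf(t, a) * log(|D| / df(t)), with df = documents containing the term."""
    tf = doc.count(term)
    df = sum(1 for d in corpus if term in d)
    return tf * math.log(len(corpus) / df) if df else 0.0

corpus = [
    ["you", "are", "offensive"],
    ["have", "a", "nice", "day"],
    ["offensive", "offensive", "words"],
]
print(tfidf("offensive", corpus[2], corpus))  # 2 * log(3/2)
```

Terms concentrated in few documents score high, which is why offensive words that dominate a subset of tweets stand out after vectorization.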
Figure 2: Bar chart comparing the two models on different metrics

From Figure 2, it can be inferred that both models yield almost the same accuracy, but on the other metrics Bagged SVM performs better than Bernoulli Naive Bayes.

7. FUTURE WORK
Traditionally, content moderation is done manually. This manual work can be reduced using the proposed system. Currently, the proposed system is for textual data, but in the future it can be extended to images, videos, and audio. Further, a model with higher efficiency can be used to classify text data more effectively. Additionally, an algorithm to find out what exactly is wrong with the content can also be designed. Manual moderators will be less exposed to hate speech and offensive content if such models are implemented at large scale.

8. CONCLUSION
This system mainly focuses on categorizing text data into two categories, namely offensive and normal. This will help content moderators by leaving them less offensive data to review. The content moderation process will be automated by the use of a machine learning technique.

REFERENCES
[1] Facebook, https://www.facebook.com/ [Access Date: 19 Dec 2018].
[2] Twitter, https://twitter.com/ [Access Date: 19 Dec 2018].
[3] LinkedIn, https://in.linkedin.com/ [Access Date: 19 Dec 2018].
[4] Marcos Rodrigues Saúde, Marcelo de Medeiros Soares, Henrique Gomes Basoni, Patrick Marques Ciarelli, Elias Oliveira, "A Strategy for Automatic Moderation of a Large Data Set of Users Comments", 2014 XL Latin American Computing Conference (CLEI), September 2014.
[5] "Facebook's 7,500 Moderators Protect You From the Internet's Most Horrifying Content. But Who's Protecting Them?", https://www.inc.com/christine-lagorio/facebook-content-moderator-lawsuit.html [Access Date: 19 Dec 2018].
[6] "Moderators who had to view child abuse content sue Microsoft, claiming PTSD", https://www.theguardian.com/technology/2017/jan/11/microsoft-employees-child-abuse-lawsuit-ptsd [Access Date: 19 Dec 2018].
[7] Stevie Chancellor, Yannis Kalantidis, Jessica A. Pater, Munmun De Choudhury, David A. Shamma, "Multimodal Classification of Moderated Online Pro-Eating Disorder Content", Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, ACM, May 2017, pp. 3213-3226.
[8] Sanafarin Mulla, Avinash Palave, "Moderation Technique For Sexually Explicit Content", 2016 International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT), International Institute of Information Technology (I2IT), Pune, September 2016.
[9] Félix Gómez Mármol, Manuel Gil Pérez, Gregorio Martínez Pérez, "Reporting Offensive Content in Social Networks: Toward a Reputation-Based Assessment Approach", IEEE Internet Computing, Vol. 18, Issue 2, Mar.-Apr. 2014.
[10] Thomas Davidson, Dana Warmsley, Michael Macy, Ingmar Weber, "Automated Hate Speech Detection and the Problem of Offensive Language", Proceedings of the 11th International AAAI Conference on Web and Social Media, 2017, pp. 512-515.
[11] Scikit-learn: A module for machine learning, https://scikit-learn.org [Access Date: 19 Dec 2018].
[12] Kristína Machová, František Barčák, Peter Bednár, "A Bagging Method Using Decision Trees in the Role of Base Classifiers", Acta Polytechnica Hungarica, Vol. 3, No. 2, 2006, pp. 121-132, ISSN 1785-8860.
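The bagging scheme behind the Bagged SVM model of Section 5 can be sketched generically: train M base classifiers H_1..H_M on bootstrap samples and combine them by majority vote. In the minimal plain-Python illustration below, a trivial "predict the value of one feature" rule stands in for an SVM, and the tiny data set is made up:

```python
import random
from collections import Counter

def train_stump(sample):
    """Pick the feature whose value agrees with the label most often."""
    n_feat = len(sample[0][0])
    return max(range(n_feat),
               key=lambda i: sum(x[i] == y for x, y in sample))

def bagged_fit(data, M=5, seed=0):
    """Train M base classifiers, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(data) for _ in range(len(data))])
            for _ in range(M)]

def bagged_predict(stumps, x):
    """Majority vote over the M base classifiers."""
    return Counter(x[i] for i in stumps).most_common(1)[0][0]

# made-up binary data: feature 0 carries the label, feature 1 is noise
data = [((1, 0), 1), ((1, 1), 1), ((0, 1), 0), ((0, 0), 0)]
stumps = bagged_fit(data)
print(bagged_predict(stumps, (1, 0)))   # prints 1
```

Swapping the stump for a real SVM trained on each bootstrap sample gives the bagged SVM of the paper; the vote-combining step is unchanged.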
ABSTRACT
Location recommendation plays an essential role in helping people find interesting places. Although recent research has studied how to recommend locations with social and geographical information, few works have dealt with the cold-start problem of new users. Because mobility records are often shared on social networks, semantic information can be exploited to address this challenge. The typical method is to feed it into explicit-feedback content-aware collaborative filtering, but this requires drawing negative samples for better learning performance, since negative user preference is not observable in human mobility; moreover, previous studies have demonstrated empirically that sampling-based methods do not work well. To this end, we propose a scalable Implicit-feedback based Content-aware Collaborative Filtering framework (ICCF) to incorporate semantic content and avoid negative sampling. We then develop an efficient optimization algorithm, scaling linearly with the data size and the feature size, and quadratically with the dimension of the latent space. We also establish its relationship with weighted matrix factorization. Finally, we evaluate ICCF on a large-scale LBSN data set in which users have text and content profiles. The results show that ICCF outperforms several competing baselines, and that user information is not only effective for improving recommendations but also valuable for managing cold-start scenarios.
Keywords- Content-aware, implicit feedback, location recommendation, social network, weighted matrix factorization.
1. INTRODUCTION
The title of this paper relates to recommender systems, which are part of data mining. Recommendation systems use different technologies, but they can be classified into two categories: collaborative and content-based filtering systems. Content-based systems examine the properties of items and recommend items similar to those the user has preferred in the past. They model the taste of a user by building a user profile based on the properties of the items the user likes, and use the profile to calculate the similarity with new items; we recommend locations that are more similar to the user's profile. Collaborative filtering systems, on the other hand, ignore the properties of the items and base their recommendations on community preferences: they recommend items that users with similar tastes and preferences have liked in the past. Two users are considered similar if they have many items in common. One of the main problems of recommendation systems is the problem of
cold start, i.e., when a new item or user is introduced into the system. In this study we focus on the problem of producing effective recommendations for new items: the item cold-start problem. Collaborative filtering systems suffer from this problem because they depend on previous user ratings. Content-based approaches, on the other hand, can still produce recommendations using item descriptions and are the default solution for item cold start. However, they tend to achieve lower accuracy and, in practice, are rarely the only option.

The item cold-start problem is of great practical importance for two main reasons. First, modern online platforms have hundreds of new items every day, and actively recommending them is essential to keep users continuously engaged. Second, collaborative filtering methods are at the core of most recommendation engines, since they tend to achieve state-of-the-art accuracy. However, to produce recommendations with the expected accuracy they require that items be rated by a sufficient number of users. Therefore, it is essential for any collaborative recommender to reach this state as soon as possible. Methods that produce precise recommendations for new items allow enough feedback to be collected in a short period of time, making effective collaborative recommendations possible.

In this paper, we focus on providing location recommendations through the novel scalable Implicit-feedback based Content-aware Collaborative Filtering (ICCF) framework. It avoids sampling negative locations by considering all unvisited locations as negative and proposing a low-weight configuration, with a classification, for the preference confidence model. This sparse weighting configuration not only assigns a larger amount of confidence to visited than to unvisited locations, but also includes three different weighting schemes previously developed for locations.

A. Motivation
From the introductory study of recommendation systems, their applications, the algorithms used and the different types of models, I decided to work on recommendation applications as used in e-commerce, online shopping, location recommendation and product recommendation; a lot of work has been done on such applications, and the technique used is a recommendation system based on traditional data mining algorithms. State-of-the-art approaches to generating recommendations from only positive evaluations are often based on the content-aware collaborative filtering algorithm. However, they suffer from low accuracy.

2. RELATED WORK
Shuhui Jiang, Xueming Qian, Member, IEEE, Tao Mei, Senior Member, IEEE, and Yun Fu, Senior Member, IEEE, describe Personalized Travel Sequence Recommendation on Multi-Source Big Social Media. In this paper, we proposed a personalized travel sequence recommendation system by learning a topical package model from big multi-source social media: travelogues and community-contributed photos. The advantages of our work are: 1) the system automatically mined users' and routes' travel topical preferences, including the
topical interest, cost, time and season; 2) we recommended not only POIs but also travel sequences, considering both popularity and the user's travel preferences at the same time. We mined and ranked famous routes based on the similarity between the user package and the route package [1].

Shuyao Qi, Dingming Wu, and Nikos Mamoulis describe "Location Aware Keyword Query Suggestion Based on Document Proximity". In this paper, we proposed an LKS framework providing keyword suggestions that are relevant to the user's information needs and at the same time can retrieve relevant documents near the user's location [2].

X. Liu, Y. Liu, and X. Li describe "Exploring the context of locations for personalized location recommendations". In this paper, we decouple the process of jointly learning latent representations of users and locations into two separate components: learning location latent representations using the Skip-gram model, and learning user latent representations using the C-WARP loss [3].

H. Li, R. Hong, D. Lian, Z. Wu, M. Wang, and Y. Ge describe "A relaxed ranking-based factor model for recommender system from implicit feedback". In this paper, we propose a relaxed ranking-based algorithm for item recommendation with implicit feedback, and design a smooth and scalable optimization method for estimating the model's parameters [4].

D. Lian, Y. Ge, N. J. Yuan, X. Xie, and H. Xiong describe "Sparse Bayesian collaborative filtering for implicit feedback". In this paper, we proposed a sparse Bayesian collaborative filtering algorithm best tailored to implicit feedback, and developed a scalable optimization algorithm for jointly learning latent factors and hyperparameters [5].

X. He, H. Zhang, M.-Y. Kan, and T.-S. Chua describe "Fast matrix factorization for online recommendation with implicit feedback". We study the problem of learning MF models from implicit feedback. In contrast to previous work that applied a uniform weight on missing data, we propose to weight missing data based on the popularity of items. To address the key efficiency challenge in optimization, we develop a new learning algorithm which effectively learns parameters by performing coordinate descent with memoization [6].

F. Yuan, G. Guo, J. M. Jose, L. Chen, H. Yu, and W. Zhang describe "LambdaFM: learning optimal ranking with factorization machines using lambda surrogates". In this paper, we have presented a novel ranking predictor, Lambda Factorization Machines. Inheriting advantages from both LtR and FM, LambdaFM (i) is capable of optimizing various top-N item ranking metrics in implicit feedback settings; (ii) is very flexible in incorporating context information for context-aware recommendations [7].

Yiding Liu, TuanAnh Nguyen Pham, Gao Cong, and Quan Yuan describe "An Experimental Evaluation of Point-of-Interest Recommendation in Location-based Social Networks" (2017). In this paper, we provide an all-around evaluation of 12 state-of-the-art POI recommendation models. From the evaluation, we obtain several important findings, based on which we can better understand and utilize POI recommendation models [8].
negatively. Only the preferred samples are implicitly provided, in a positive way, in the feedback data, while it is not practical to treat all unvisited locations as negative when feeding the mobility data, together with user and location information, into these explicit-feedback frameworks, which require drawing pseudo-negative samples from unvisited places. The sampling, and the lack of different levels of confidence, do not allow them to achieve comparable top-k recommendation.

5. System Architecture:
Fig. System Architecture

6. CONCLUSION
In this paper, we propose the ICCF framework for content-aware collaborative filtering on implicit feedback data sets, and develop a coordinate descent algorithm for efficient parameter learning. We establish the close relationship of ICCF with matrix factorization and show that user features really improve mobility similarity between users. We then apply ICCF to location recommendation on a large-scale LBSN data set. The results of our experiments indicate that ICCF outperforms five competing baselines, including two leading ranking-based location recommendation and factorization algorithms. When comparing different weighting schemes for the negative preference of unvisited places, we observe that the user-oriented scheme is superior to the item-oriented scheme, and that the sparse, rank-one configuration significantly improves the performance of the recommendation.

REFERENCES
[1] Shuhui Jiang, Xueming Qian, Tao Mei, Yun Fu, "Personalized Travel Sequence Recommendation on Multi-Source Big Social Media", IEEE Transactions on Big Data, Vol. X, No. X.
[2] Shuyao Qi, Dingming Wu, Nikos Mamoulis, "Location Aware Keyword Query Suggestion Based on Document Proximity", Vol. 28, No. 1, January 2016.
[3] X. Liu, Y. Liu, and X. Li, "Exploring the context of locations for personalized location recommendations", in Proceedings of IJCAI'16, AAAI, 2016.
[4] H. Li, R. Hong, D. Lian, Z. Wu, M. Wang, and Y. Ge, "A relaxed ranking-based factor model for recommender system from implicit feedback", in Proceedings of IJCAI'16, 2016, pp. 1683-1689.
[5] D. Lian, Y. Ge, N. J. Yuan, X. Xie, and H. Xiong, "Sparse Bayesian collaborative filtering for implicit feedback", in Proceedings of IJCAI'16, AAAI, 2016.
[6] X. He, H. Zhang, M.-Y. Kan, and T.-S. Chua, "Fast matrix factorization for online recommendation with implicit feedback", in Proceedings of SIGIR'16, 2016.
[7] F. Yuan, G. Guo, J. M. Jose, L. Chen, H. Yu, and W. Zhang, "LambdaFM: learning optimal ranking with factorization machines using lambda surrogates", in Proceedings of the 25th ACM International Conference on Information and Knowledge Management, ACM, 2016, pp. 227-236.
[8] Yiding Liu, TuanAnh Nguyen Pham, Gao Cong, Quan Yuan, "An Experimental Evaluation of Point-of-Interest Recommendation in Location-based Social Networks", 2017.
[9] Salman Salamatian, Amy Zhang, Flavio du Pin Calmon, Sandilya Bhamidipati, Nadia Fawaz, Branislav Kveton, Pedro Oliveira, Nina Taft, "Managing your Private and Public Data: Bringing down Inference Attacks against your Privacy", 2015.
[10] Zhiwen Yu, Huang Xu, Zhe Yang, Bin Guo, "Personalized Travel Package With Multi-Point-of-Interest Recommendation Based on Crowdsourced User Footprints", 2016.
ABSTRACT
In this proposed system, a genetic algorithm is applied to an automatic schedule generation system to generate a course timetable that best suits student and teacher needs. Preparing a schedule that satisfies different constraints is a very difficult task for colleges and institutes, and the conventional process of scheduling is a very basic way of generating a schedule for any educational organization. This study develops a practical system for schedule generation by taking complicated constraints into consideration to avoid conflicts in the schedule; conflicts are the problems that arise after the allocation of time slots.
Keywords
Genetic Algorithm (GA), Constraints, Chromosomes, Genetic Operators.
1. INTRODUCTION
Preparing a timetable is a most complicated and conflict-prone process. The traditional way of generating a timetable still produces error-prone output, even when it is prepared repeatedly to get a suitable result. The aim of our application is to make the process simple, easily understandable and efficient, with lower time requirements; therefore there is a great need for this kind of application in educational institutes.

Timetable generation has long been a human requirement, and it is most widely needed in educational institutes such as schools and colleges, where we need planning of courses, subjects and hours. In earlier days timetable scheduling was a manual process in which one person or a group of people created the timetable by hand, which takes more effort and still gives inappropriate output.

The course scheduling problem can be specified as a constraint satisfaction problem (CSP). Constraints in the scheduling process can be categorized into two kinds: hardware constraints and software constraints. Common hardware constraints include: [1] each time slot should be scheduled at a specified time; [2] each teacher or student can be allocated only one classroom at a time; [3] all students must fit into the particular allocated classroom. Some of the software constraints include: [1] neither faculty nor students should have unconnected timeslots in the timetable; [2] classrooms have limited capacity.

2. ALGORITHM
Step 1: Partition the training set Tr into m subsets through random sampling;

Step 2: Apply the decision tree algorithm to each subset S1, ..., Sm;

Step 3: Apply each induced tree from Step 2 (Tree1, Tree2, ..., Treem) to the test set Te;

Step 4: Use the fitness function to evaluate the performance of all trees, and rank the trees with their related subsets according to the trees' performance;

Step 5: Perform GA operations:
Selection: select the top (1 - c)m subsets and keep them intact for the next operation;
Crossover: for the remaining cm/2 pairs, perform two-point crossover;

Mutation: randomly select mu subsets to perform the mutation operation: randomly replace one instance in the selected subset by one instance randomly selected from the original training data set.

Step 6: The subsets created in Step 5 form the next generation; repeat Step 2 to Step 6 until a subset and a related tree with ideal performance are identified.

1. Input data:
The first step in the functioning of a GA is the generation of initial input data; each individual is evaluated and assigned a fitness value according to the fitness function.

2. Selection:
This operator selects chromosomes from the population for reproduction. The fitter the chromosome, the more times it is likely to be selected to reproduce.

3. Crossover:
This genetic operator is used to vary the coding of a chromosome from one generation to the next. The crossover process takes one or more parent solutions and derives child solutions from them.

4. Mutation:
In mutation, a solution may change from the previous solution: mutation is the process in which the data can be interchanged to reach the best solution. When the given solution is not reliable or conflicts are present, the mutation and crossover techniques are very important; they decide which result is best for the given input data.

5. Fitness Function:
The fitness function is used to find the quality of a represented solution. This function is problem dependent. In the field of genetic algorithms a design solution is represented as a string referred to as a chromosome. In each phase of testing, the 'n' worst results or conditions are deleted and 'n' new ones are created from the best design solutions, and the final result is obtained from those solutions.

3. PROPOSED SYSTEM
The proposed system is based on a customer-centric strategy for designing the scheduling system. First, a data mining algorithm is designed for mining student preferences in selecting different courses from historical data. Then, based on the selection patterns obtained from the mining algorithm, the scheduling is designed, which leads to an integrative, automatic course scheduling system. This system helps to increase student satisfaction with the course scheduling result.

The proposed system adopts the user's perspective and applies different types of techniques to automatic scheduling, also considering teacher preferences and student needs in their schedule, so that the final output fulfills the expectations of each and every user. The algorithm is used for exchanging the courses that are given to the system as input, so as to find an optimal solution to the timetabling problem.

4. SYSTEM ARCHITECTURE
Input data:
1. Courses
2. Labs
3. Lectures
4. Sems
5. Students
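The GA operators described above (selection, two-point crossover, mutation, fitness) can be sketched on a toy timetabling instance. The lectures, teachers and slot count below are made-up examples; fitness is taken to be the number of hard-constraint violations (a teacher allocated twice in one slot):

```python
import random

LECTURES = [("Math", "T1"), ("Physics", "T1"), ("DBMS", "T2"),
            ("Networks", "T2"), ("AI", "T3"), ("ML", "T3")]
SLOTS = 3
rng = random.Random(1)

def conflicts(chrom):
    """Count hard-constraint violations: a teacher in two lectures at one time."""
    seen, bad = set(), 0
    for (course, teacher), slot in zip(LECTURES, chrom):
        if (teacher, slot) in seen:
            bad += 1
        seen.add((teacher, slot))
    return bad

def crossover(a, b):
    i, j = sorted(rng.sample(range(len(a)), 2))   # two-point crossover
    return a[:i] + b[i:j] + a[j:]

def mutate(chrom):
    c = list(chrom)
    c[rng.randrange(len(c))] = rng.randrange(SLOTS)
    return c

# chromosome = one timeslot per lecture; evolve until no conflicts remain
pop = [[rng.randrange(SLOTS) for _ in LECTURES] for _ in range(30)]
for _ in range(100):
    pop.sort(key=conflicts)
    if conflicts(pop[0]) == 0:
        break
    elite = pop[:10]                              # selection: keep the fittest
    children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                for _ in range(20)]
    pop = elite + children
print(conflicts(pop[0]))   # expected 0: a clash-free assignment was found
```

Real constraints (room capacity, unconnected timeslots) would simply add further penalty terms to the fitness function.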
ABSTRACT
Redundant and irrelevant features in data have caused a long-standing problem in network traffic classification. Lately, one of the fundamental focuses within Network Intrusion Detection System (NIDS) research has been the use of machine learning and shallow learning strategies. This paper proposes a novel deep learning model to enable NIDS operation within modern networks. The model is a combination of deep and shallow learning, capable of accurately analyzing a wide range of network traffic. The approach proposes a Non-symmetric Deep Auto-Encoder (NDAE) for unsupervised feature learning, and furthermore proposes a novel deep learning classification model constructed using stacked NDAEs. Our proposed classifier has been implemented in GPU-enabled TensorFlow and evaluated using the benchmark KDD Cup '99 and NSL-KDD network intrusion detection datasets. However, to cover the limitations of the KDD dataset, a WSN trace dataset has also been used in the proposed system. The contribution of this work is to implement an intrusion prevention system (IPS), which contains IDS functionality but is a more complex system capable of taking quick action in order to prevent or diminish malicious behavior.
General Terms
Non-symmetric Deep Auto-Encoder, Restricted Boltzmann Machine, Deep Belief Network.
Keywords
Deep learning, Anomaly detection, Autoencoders, KDD, Network security
analysis of network data and anomalies. In this paper, we propose a new deep learning model for NIDPS, for faster identification of anomalies in modern networks.

2. MOTIVATION
A new NDAE technique for unsupervised feature learning which, unlike typical autoencoder approaches, provides non-symmetric data dimensionality reduction. Hence, our technique is able to yield improved classification results when compared with leading methods such as Deep Belief Networks (DBNs).
A novel classifier model that utilizes stacked NDAEs and the RF classification algorithm, combining deep and shallow learning techniques to exploit their strengths and decrease analytical overheads. We should be able to obtain better results than similar research, while significantly reducing the training time.

3. REVIEW OF LITERATURE
The paper [1] focuses on deep learning methods, which are inspired by the structural depth of the human brain and learn from lower-level features up to higher-level concepts. It is because of this abstraction over multiple levels that the Deep Belief Network (DBN) can learn functions mapping input to output. The learning process does not depend on human-crafted features. The DBN uses an unsupervised learning algorithm, a Restricted Boltzmann Machine (RBM), for each layer. Advantages: deep coding can adapt to changing contexts in the data, which ensures that the technique conducts exhaustive data analysis; it detects abnormalities in the system, covering anomaly detection and traffic identification. Disadvantages: the demand for faster and more efficient data assessment.

The main purpose of paper [2] is to review and summarize the work on deep learning for machine health monitoring. The applications of deep learning in machine health monitoring systems are reviewed mainly from the following aspects: the Auto-encoder (AE) and its variants; Restricted Boltzmann Machines and their variants, including the Deep Belief Network (DBN) and Deep Boltzmann Machines (DBM); Convolutional Neural Networks (CNN); and Recurrent Neural Networks (RNN). Advantages: DL-based MHMS do not require extensive human labor and expert knowledge, and the applications of deep learning models are not restricted to specific kinds of machines. Disadvantages: the performance of DL-based MHMS depends heavily on the scale and quality of the datasets.

Paper [3] proposes the use of a stacked denoising autoencoder (SdA), a deep learning algorithm, to establish an FDC model for simultaneous feature extraction and classification. The SdA model [3] can identify global and invariant features in the sensor signals for fault monitoring and is robust against measurement noise. An SdA consists of denoising autoencoders stacked layer by layer. This multilayered architecture is capable of learning global features from complex input data, such as multivariate time-series datasets and high-resolution images. Advantages: the SdA model is useful in real applications, and it can effectively learn normal and fault-related features from sensor signals without preprocessing. Disadvantages: a trained SdA still needs to be investigated to identify the process parameters that most significantly impact the classification results.

Paper [4] proposes a novel deep learning-based recurrent neural network (RNN) model for the automatic security audit of short messages from prisons, which can classify short messages (secure and insecure). In this paper, the feature
The 12 attributes of the WSN dataset are: Event, Time, from_node, to_node, hopcount, packet_size, protocol_used, port_number, transmission_rate_kbps, received_rate_kbps, drop_rate_kbps, and Class.

Fig. 1 Proposed System Architecture

Fig. 1 shows the proposed system architecture of the Network Intrusion Detection and Prevention System (NIDPS). The input traffic data uses the WSN dataset with 12 features. The training dataset undergoes data preprocessing, which includes two steps: data transformation and data normalization. Next, two NDAEs arranged in a stack are used to select the number of features. After that, the Random Forest classifier is applied for attack detection. Intrusion Prevention Systems (IPS) contain IDS functionality but are more sophisticated systems, capable of taking immediate action in order to prevent or reduce the malicious behavior.

Advantages:
Due to the deep learning technique, it improves the accuracy of the intrusion detection system.
The network or computer is constantly monitored for any invasion or attack.
The system can be modified and changed according to the needs of a specific client, and can help against outside as well as inner threats to the system and network.
It effectively prevents any damage to the network.
It provides a user-friendly interface which allows easy security management.
Any alterations to files and directories on the system can be easily detected and reported.

6. ALGORITHM
A Deep Belief Network (DBN) [11] is a complex type of generative neural network
that uses an unsupervised machine learning model to produce results. This kind of network reflects some of the recent work on using largely unlabeled data to build unsupervised models. Some researchers describe the Deep Belief Network as a set of Restricted Boltzmann Machines (RBMs) stacked on top of one another. In general, deep belief networks are composed of multiple smaller unsupervised neural networks. One of the common properties of a DBN is that, although the layers have connections between them, the network does not include connections between units within a single layer. It uses a stacked Restricted Boltzmann Machine, which has two layers, called the hidden layer and the visible layer.

The rule status monitoring algorithm is used to recognize and detect the attack. We define a rule set as a file consisting of a set (or category) of rules that share a common set of characteristics. Our goal is to develop an algorithm that monitors the collection of rule sets so as to identify the state of each rule in each rule set, in terms of whether it is enabled or disabled, and to build useful statistics based on these findings. The algorithm should also provide periodic updates of this information. This may be accomplished by running it as a daemon with an appropriately selected polling period.

7. Mathematical Model
7.1. Preprocessing:
In this step, the training data source (T) is normalized to be ready for processing, using the following steps:

x'_ij = (x_ij − μ_j) / σ_j    (1)

The test dataset is normalized using the same values, as follows:

ts'_ij = (ts_ij − μ_j) / σ_j    (2)

2. Feature Selection:
The NDAE is an auto-encoder featuring non-symmetrical multiple hidden layers. The proposed NDAE takes an input vector x and maps it step-by-step to the latent representations h_1, ..., h_n (here d represents the dimension of the input vector x ∈ R^d), using the deterministic function shown in (3) below:

h_i = s(W_i · h_(i−1) + b_i),  i = 1, ..., n,  with h_0 = x    (3)

Here, s is an activation function (in this work we use the sigmoid function s(t) = 1/(1 + e^(−t))) and n is the number of hidden layers. Unlike a conventional auto-encoder and deep auto-encoder, the proposed NDAE does not contain a decoder, and its output vector y is calculated from the latent representation by a formula similar to (3), as in (4):

y = s(W_(n+1) · h_n + b_(n+1))    (4)

The estimator θ = (W_i, b_i) of the model can be obtained by minimizing the squared reconstruction error over the m training samples, as shown in (5):

θ̂ = arg min_θ Σ_(i=1..m) ‖x_i − y_i‖²    (5)
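Equations (1)-(5) can be illustrated with a short, self-contained sketch (pure Python; the two-attribute toy data, single identity-weight layer, and the zero-variance guard are illustrative assumptions, not the paper's configuration):

```python
import math

def normalize(train, test):
    """Z-score each attribute with the training mean/std; the same
    mu and sigma are reused for the test set (Eqs. 1-2)."""
    n = len(train[0])
    mu = [sum(row[j] for row in train) / len(train) for j in range(n)]
    sigma = [math.sqrt(sum((row[j] - mu[j]) ** 2 for row in train) / len(train)) or 1.0
             for j in range(n)]  # guard: constant attribute -> divide by 1
    def scale(data):
        return [[(row[j] - mu[j]) / sigma[j] for j in range(n)] for row in data]
    return scale(train), scale(test)

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def ndae_forward(x, layers):
    """Eq. (3): h_i = s(W_i h_{i-1} + b_i); there is no decoder, so the
    final layer's output (Eq. 4) doubles as the reconstruction."""
    h = x
    for W, b in layers:
        h = [sigmoid(sum(w * v for w, v in zip(row, h)) + bi)
             for row, bi in zip(W, b)]
    return h

def reconstruction_error(xs, ys):
    """Eq. (5): squared reconstruction error summed over m samples."""
    return sum(sum((a - b) ** 2 for a, b in zip(x, y)) for x, y in zip(xs, ys))
```

In the full system, the features learned by two stacked NDAEs would then be handed to the Random Forest classifier for attack detection.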
Here, T consists of m samples with n column attributes; x_ij is the jth column attribute in the ith sample; and μ and σ are 1×n vectors holding the training data mean and standard deviation, respectively, for each of the n attributes. The test dataset (TS), which is used to measure detection accuracy, is normalized using the same μ and σ.

8. CONCLUSION AND FUTURE WORK
In this paper, we have discussed the problems faced by existing NIDS techniques. In response, we have proposed our novel NDAE method for unsupervised feature learning. We have then built upon this by proposing a novel classification model constructed from stacked NDAEs and the RF classification algorithm, and we have also implemented the intrusion prevention system. The results show that our approach offers high levels of accuracy, precision and recall, together with reduced training time. The proposed NIDS improves accuracy by only 5%, so further improvement in accuracy is needed, along with further work on real-time network traffic and on handling zero-day attacks.

REFERENCES
[1] B. Dong and X. Wang, "Comparison deep learning method to traditional methods using for network intrusion detection," in Proc. 8th IEEE Int. Conf. Commun. Softw. Netw., Beijing, China, Jun. 2016, pp. 581–585.
[2] R. Zhao, R. Yan, Z. Chen, K. Mao, P. Wang, and R. X. Gao, "Deep learning and its applications to machine health monitoring: A survey," submitted to IEEE Trans. Neural Netw. Learn. Syst., 2016. [Online]. Available: http://arxiv.org/abs/1612.07640
[3] H. Lee, Y. Kim, and C. O. Kim, "A deep learning model for robust wafer fault monitoring with sensor measurement noise," IEEE Trans. Semicond. Manuf., vol. 30, no. 1, pp. 23–31, Feb. 2017.
[4] L. You, Y. Li, Y. Wang, J. Zhang, and Y. Yang, "A deep learning based RNNs model for automatic security audit of short messages," in Proc. 16th Int. Symp. Commun. Inf. Technol., Qingdao, China, Sep. 2016, pp. 225–229.
[5] R. Polishetty, M. Roopaei, and P. Rad, "A next-generation secure cloud based deep learning license plate recognition for smart cities," in Proc. 15th IEEE Int. Conf. Mach. Learn. Appl., Anaheim, CA, USA, Dec. 2016, pp. 286–293.
[7] K. Alrawashdeh and C. Purdy, "Toward an online anomaly intrusion detection system based on deep learning," in Proc. 15th IEEE Int. Conf. Mach. Learn. Appl., Anaheim, CA, USA, Dec. 2016, pp. 195–200.
[8] A. Javaid, Q. Niyaz, W. Sun, and M. Alam, "A deep learning approach for network intrusion detection system," in Proc. 9th EAI Int. Conf. Bio-Inspired Inf. Commun. Technol., 2016, pp. 21–26. [Online]. Available: http://dx.doi.org/10.4108/eai.3-12-2015.2262516
[9] S. Potluri and C. Diedrich, "Accelerated deep neural networks for enhanced intrusion detection system," in Proc. IEEE 21st Int. Conf. Emerg. Technol. Factory Autom., Berlin, Germany, Sep. 2016, pp. 1–8.
[10] C. Garcia Cordero, S. Hauke, M. Muhlhauser, and M. Fischer, "Analyzing flow-based anomaly intrusion detection using replicator neural networks," in Proc. 14th Annu. Conf. Privacy, Security, Trust, Auckland, New Zealand, Dec. 2016, pp. 317–324.
[11] T. A. Tang, L. Mhamdi, D. McLernon, S. A. R. Zaidi, and M. Ghogho, "Deep learning approach for network intrusion detection in software defined networking," in Proc. Int. Conf. Wireless Netw. Mobile Commun., Oct. 2016, pp. 258–26.
[12] N. Shone, T. N. Ngoc, V. D. Phai, and Q. Shi, "A deep learning approach to network intrusion detection," IEEE Trans. Emerging Topics in Computational Intelligence, vol. 2, no. 1, Feb. 2018.
3. TURING MACHINE
A Turing machine is a mathematical model of a machine or computer that describes an abstract machine for any problem. The machine manipulates finite symbols on a tape according to rules [13]. A Turing machine can compute any constructed algorithm, despite the model's simplicity. The machine contains a tape of infinite length in both directions, divided into small squares known as cells. Each cell contains only one symbol from a finite alphabet, and empty cells are filled with the blank symbol. A head is used to read and write symbols on the tape, with its movement set initially to the leftmost symbol. The machine can move one cell at a time to the left or right, or make no movement. The finite states are stored in the Turing machine's state register, which is initialized with a special start state. A finite table of rules is used to read the current input symbol from the tape and modify it, moving the tape head left, right, or not at all [14-15]. The Turing machine mechanically operates on a tape, as shown in Fig. 2.

1) Input Tape: The tape is infinite in both directions and divided into cells. Each cell contains only one symbol from the finite alphabet. The alphabet contains a special symbol known as the blank, written as 'B'. The tape is implicitly assumed to be arbitrarily extendable to both the left and the right for computation.

2) Read/Write Head: The head can read and write only one symbol at a time on the tape, and can move to the left, to the right, or make no movement.

3) Finite State Control: The state control stores the state of the Turing machine, from the initial state to a halting state. If, after reading the last symbol, the Turing machine reaches a final state, the input string is accepted; otherwise the input string is rejected.

Fig. 2: Turing Machine Model (tape cells ... a a b b B ...; read/write head; finite state control; moves L, N, R)

3.1 Mathematical Representation
A real machine can handle, with intelligence, all the operations handled by a Turing machine, but a real machine has only a limited, finite number of configurations; an actual real machine is a linear bounded automaton [16]. Due to the tape being infinite in both directions, Turing machines have an unconstrained amount of storage space for their computations.

A Turing machine is represented by a 7-tuple [23], i.e.

M = (Q, ∑, Γ, δ, q0, B, F)

where:
Q is the finite set of states
∑ is the finite set of input alphabets
Γ is the finite set of tape alphabets
δ is a transition function, δ: Q × Γ → Q × Γ × {L, R, N}
where
L: move to the left
R: move to the right
N: no movement
q0 is the initial state
B is the blank symbol
F is the set of final states, or set of halting states.
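The 7-tuple definition above can be exercised with a small simulator (an illustrative Python sketch; the example machine below, which rewrites every 'a' to 'b' and accepts on the blank, is an assumption chosen for demonstration):

```python
def run_turing_machine(delta, q0, B, F, tape, max_steps=10_000):
    """Simulate M = (Q, Sigma, Gamma, delta, q0, B, F) on `tape`.
    delta maps (state, symbol) -> (state, symbol, move), move in {L, R, N}.
    Returns (accepted, cells): accepted is True iff a final state is reached."""
    state, head, cells = q0, 0, dict(enumerate(tape))
    for _ in range(max_steps):
        if state in F:                      # halting state reached: accept
            break
        key = (state, cells.get(head, B))   # unwritten cells read as blank
        if key not in delta:                # no applicable rule: reject
            return False, cells
        state, symbol, move = delta[key]
        cells[head] = symbol                # write, then move the head
        head += {"L": -1, "R": 1, "N": 0}[move]
    return state in F, cells
```

For example, the machine delta = {("q0","a"): ("q0","b","R"), ("q0","b"): ("q0","b","R"), ("q0","B"): ("qf","B","N")} with F = {"qf"} scans right, rewriting the input, and halts at the first blank.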
ABSTRACT
The analysis of social networks is a very challenging research area, and a fundamental element of it concerns the detection of user communities. Existing work on emotion recognition on Twitter relies mainly on lexicons and simple classifiers over bag-of-words models. The central question of our study is whether we can increase their performance using machine learning algorithms. The novel Profile of Mood States (POMS) algorithm represents a twelve-dimensional mood state using 65 adjectives, combining Ekman's and Plutchik's emotion categories: joy, anger, depression, fatigue, vigour, tension, confusion, disgust, fear, trust, surprise and anticipation. These emotions are recognized with the help of text-based bag-of-words and LSI algorithms. The contribution of this work is to apply a machine learning algorithm to emotion classification, which takes less time and does not require human labeling. The Gaussian Naïve Bayes classifier works on a testing dataset with the help of a large training dataset. We measure the performance of the POMS and Gaussian Naïve Bayes algorithms on the Twitter API. The experimental outcome shows emotion recognition from tweet contents with the help of emojis.
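The Gaussian Naïve Bayes step mentioned above can be sketched as follows (a minimal pure-Python sketch over numeric feature vectors, e.g. per-category adjective counts; the feature choice and the variance floor are illustrative assumptions, not the authors' implementation):

```python
import math
from collections import defaultdict

def fit_gnb(samples, labels, var_floor=1e-9):
    """Estimate per-class feature means/variances and class priors."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        n, d = len(rows), len(rows[0])
        mean = [sum(r[j] for r in rows) / n for j in range(d)]
        var = [max(sum((r[j] - mean[j]) ** 2 for r in rows) / n, var_floor)
               for j in range(d)]  # floor keeps the Gaussian well-defined
        model[y] = (mean, var, n / len(samples))
    return model

def predict_gnb(model, x):
    """Pick the class maximizing the log posterior under independent
    Gaussian feature likelihoods (the naive Bayes assumption)."""
    best, best_score = None, -math.inf
    for y, (mean, var, prior) in model.items():
        score = math.log(prior)
        for xj, m, v in zip(x, mean, var):
            score += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        if score > best_score:
            best, best_score = y, score
    return best
```

Trained on labeled tweets, such a classifier assigns each new tweet the mood-state category whose Gaussian model best explains its features.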
(NLP). Paul Ekman defined six basic emotions by studying facial expressions. Robert Plutchik extended Ekman's categorization with two additional emotions and presented his categorization in a wheel of emotions. Finally, the Profile of Mood States (POMS) is a psychological instrument that defines a six-dimensional mood state representation using text mining. The novel POMS algorithm generates a twelve-dimensional mood state representation using 65 adjectives, combining Ekman's and Plutchik's emotion categories: anger, depression, fatigue, vigour, tension, confusion, joy, disgust, fear, trust, surprise and anticipation. Previous work generally studied only one emotion classification. Working with multiple classifications simultaneously not only enables performance comparisons between different emotion categorizations on the same type of data, but also allows us to develop a single model for predicting multiple classifications at the same time.

Motivation
The system developed based on our proposed approach would be able to automatically detect what people feel about their lives from Twitter messages. For example, the system can recognize:
the percentage of people expressing higher levels of life satisfaction in one group versus another group,
the percentage of people who feel happy and cheerful,
the percentage of people who feel calm and peaceful, and
the percentage of people expressing higher levels of anxiety or depression.

2. RELATED WORK
Paper [1] investigates whether public mood, as measured from a large-scale collection of tweets posted on twitter.com, is correlated with or even predictive of DJIA values. The results show that changes in the public mood state can indeed be tracked from the content of large-scale Twitter feeds by means of rather simple text processing techniques, and that such changes respond to a variety of socio-cultural drivers in a highly differentiated manner. Advantages: it increases performance; public mood analysis from Twitter feeds offers an automatic, fast, free and large-scale addition to this toolkit that may be optimized to measure a variety of dimensions of the public mood state. Disadvantages: it avoids geographical and cultural sampling errors.

Paper [2] explored an application of deep recurrent neural networks to the task of sentence-level opinion expression extraction. DSEs (direct subjective expressions) consist of explicit mentions of private states or speech events expressing private states, and ESEs (expressive subjective expressions) consist of expressions that indicate sentiment, emotion, etc., without explicitly conveying them. Advantages: deep RNNs outperformed previous (semi-)CRF baselines, achieving new state-of-the-art results for fine-grained opinion expression extraction. Disadvantages: RNNs do not have access to any features other than word vectors.

Paper [3] analyzes electoral tweets for more subtly expressed information such as the sentiment (positive or negative), the emotion (joy, sadness, anger, etc.), the purpose or intent behind the tweet (to point out a mistake, to support, to ridicule, etc.), and the style of the tweet (simple statement, sarcasm, hyperbole, etc.). There are two parts: annotating text for sentiment, emotion, style, and categories such as purpose; and automatic classifiers for detecting these categories. Advantages: using a multitude of custom-engineered features, such as those concerning emoticons, punctuation, elongated words and negation, along with unigrams, bigrams and emotion lexicon features, the SVM classifier achieved a higher accuracy, and it automatically classifies tweets into eleven categories of emotions. Disadvantages: it does not
summarize tweets. It also does not automatically identify other semantic roles of emotions, such as degree, reason, and empathy target.

Article [4] shows that emotion-word hashtags are good manual labels of emotions in tweets, and proposes a method to generate a large lexicon of word–emotion associations from this emotion-labeled tweet corpus. This is the first lexicon with real-valued word–emotion association scores. Advantages: using hashtagged tweets, large amounts of labeled data can be collected for any emotion that is used as a hashtag by tweeters; the hashtag emotion lexicon performed significantly better than those that used the manually created WordNet-Affect lexicon; and it automatically detects personality from text. Disadvantages: this paper works only on the given text, not on synonyms of that text.

Paper [5] develops a multi-task DNN for learning representations across multiple tasks, not only leveraging large amounts of cross-task data, but also benefiting from a regularization effect that leads to more general representations that help tasks in new domains. It is a multi-task deep neural network for representation learning, focusing in particular on semantic classification (query classification) and semantic information retrieval (ranking for web search), and it demonstrates strong results on query classification and web search. Advantages: the MT-DNN robustly outperforms strong baselines across all web search and query classification tasks, and the multi-task DNN model successfully combines tasks as disparate as classification and ranking. Disadvantages: query classification is incorporated either as a classification or as a ranking task, not as comprehensive exploration work.

In paper [6], we i) demonstrate how large amounts of social media data can be used for large-scale open-vocabulary personality detection; ii) analyze which features are predictive of which personality dimension; and iii) present a novel corpus of 1.2M English tweets (1,500 authors) annotated for gender and MBTI. Advantages: the personality distinctions, namely INTROVERT–EXTROVERT (I–E) and THINKING–FEELING (T–F), can be predicted from social media data with high reliability, and the large-scale, open-vocabulary analysis of user attributes can help improve classification accuracy.

The paper [7] focuses on studying two fundamental NLP tasks, discourse parsing and sentiment analysis, developing three independent recursive neural nets: two for the key sub-tasks of discourse parsing, namely structure prediction and relation prediction, and a third net for sentiment prediction. Advantages: the latent discourse features can help boost the performance of a neural sentiment analyzer, and pre-training and the individual models are an order of magnitude faster than the multi-tasking model. Disadvantages: predictions are difficult for multi-sentential text.

3. EXISTING SYSTEM
The ability of the human face to communicate emotional states via facial expressions is well known, and past research has established the importance and universality of emotional facial expressions. However, recent evidence has revealed that facial expressions of emotion are most accurately recognized when the perceiver and expresser are from the same cultural in-group. Paul Ekman uses facial expressions to define a set of six universally recognizable basic emotions: anger, disgust, fear, joy, sadness and surprise. Robert Plutchik defined a wheel-like diagram with a set of eight basic, pairwise contrasting emotions: joy – sadness, trust – disgust, fear – anger and surprise – anticipation. We consider each of these emotions as a separate category, and disregard the different levels of intensity that Plutchik defines in his wheel of emotions.
Disadvantages:
Twitter —or— How to Get 1,500 Personality Tests in a Week," in Proc. of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, 2015, pp. 92–98.
[7] B. Nejat, G. Carenini, and R. Ng, "Exploring Joint Neural Model for Sentence Level Discourse Parsing and Sentiment Analysis," in Proc. of the SIGDIAL 2017 Conf., Aug. 2017, pp. 289–298.
1. INTRODUCTION
The steps taken by students in their earlier learning years shape their future. There is a lot of pressure on them from their parents and peers to perform well. This can lead to extreme levels of depression, which can take a toll on their health. So, we decided to design a web app to help students cope with the stress, and to make a better app than those previously available. This chatbot helps students within the range of 14 to 22 years cope with the pressure of studies. The bot can determine the stress or depression level using a simple questionnaire at the start, and advances to assess the situation better in later stages.

General Terms
Depression, Depression level, Stanford CoreNLP

Keywords
Chatbot

2. MOTIVATION
The steps taken by students in their earlier learning years shape their future. There is a lot of pressure on them from their parents and peers to perform well, which can lead to extreme levels of depression and take a toll on their health. So, we decided to design a web app to help students cope with the stress, and to make a better app than those previously available.

3. PROBLEM STATEMENT
Create a chatbot to help students within the range of 14 to 22 years cope with the pressure of studies. The bot can determine the stress or depression level using a simple questionnaire at the start and advances to assess the situation better in later stages. The app should also help sports people to balance their play and studies.
4. STATE OF ART
Table 1: State of art

2. Speech Analysis and Depression: Formant and jitter frequencies in speech are calculated, based upon which a depression level is determined.

3. Affective and Content Analysis of Online Depression Communities: Linguistic Inquiry and Word Count (LIWC) is used for depression recognition. A survey of various clinical and control communities is conducted for a better understanding of depression patterns.

4. Detection of Depression in Speech: Survey-based paper; volunteers are required to speak on certain questions, stories and visual images, and feature selection is used to facilitate depression recognition.

5. A Model for Prediction of Human Depression Using Apriori Algorithm: About 500 records were taken as test data for the model. The model was tested on 500 individuals and successfully predicted the percentage of individuals suffering from depression. The following factors of depression are considered: lifestyle, life events, non-psychiatric illness, acquired infection, medical treatments, professional activities, stress, and relationship status. The questions were based on Family problem (FA), Financial problem (FP), Unemployed (UE), Remuneration (REM), Addiction (ADD), Workplace (ORG), Relationship (RL), Congenital diseases (CD), Apprehension (AP), Hallucination (HL), and Sleeping problem (SLP).

7. Internet Improves Health Outcomes in Depression: The paper suggests some websites where solutions to the users' problems can be found; it is a kind of self-help. The model uses the theory of behavior change.

8. Detecting Depression Using Multimodal Approach of Emotion Recognition: There are various ways to take input, viz. speech input, textual input, etc. Eight emotions are considered and, accordingly, an alert is sent to the doctor.

9. Classification of depression state based on articulatory precision: Given that neurophysiological changes due to major depressive disorder influence the articulatory precision of speech production, vocal tract formant frequencies and their velocity and acceleration were investigated toward automatic classification of depression state.

10. Predicting anxiety and depression in elderly patients using machine learning technology: The model uses ten machine learning algorithms, such as Naïve Bayes, Random Forest, Bayesian Network, K-star, etc., to classify whether patients have depression or not. Of these ten algorithms, the best one is chosen using the confusion matrix.
5. GAP ANALYSIS

7. Internet Improves Health Outcomes in Depression: The paper suggests some websites where solutions to the users' problems can be found; it is a kind of self-help, and the model uses the theory of behavior change. Gap: the websites provide only a generalized solution, not a specific solution to the problem. We give a specific solution to the problem.

8. Detecting Depression Using Multimodal Approach of Emotion Recognition: There are various ways to take input, viz. speech input, textual input, etc. Eight emotions are considered and, accordingly, an alert is sent to the doctor. Gap: the model is not useful once someone goes into depression; it only suggests preventive measures, while our app suggests preventive measures as well as solutions once the person has gone into depression.

9. Classification of depression state based on articulatory precision: Given that neurophysiological changes due to major depressive disorder influence the articulatory precision of speech production, vocal tract formant frequencies and their velocity and acceleration were investigated toward automatic classification of depression state. Gap: if the person has depression, an immediate alert is sent to the doctor, but if the user is not comfortable talking with the doctor, his/her depression will not get treated. In our app we provide the solution, and if the person is in severe depression, we encourage the user to seek help from the doctor.

10. Predicting anxiety and depression in elderly patients using machine learning technology: The model uses ten machine learning algorithms, such as Naïve Bayes, Random Forest, Bayesian Network, K-star, etc., to classify whether patients have depression or not; of these ten, the best one is chosen using the confusion matrix. Gap: a lot of time is spent determining the best algorithm, and no solution is provided. Our application is fast and also provides the solution.
6. PROPOSED WORK
1. First, if the user is not already registered in the system, he/she has to sign up. The signup stage is foolproof and is secured with an OTP verification step.
2. After the signup step, the user is taken to the login page. After login, on the first attempt he/she is given a text area to describe his/her mental state, upon which a specialized questionnaire with respect to his/her depression level is provided.
3. There are basically 3 levels of depression, going from 1 to 3 with increasing severity.
4. The first two levels are considered curable with our app itself. Here an option for the chatbot is provided, which is available 24/7. There are two types of students who can use the app (sports and regular). The chatbot is provided for a regular student. A messenger is created for the sports student, where he/she will be provided with a token and can contact the admin, who has experience in dealing with sports and study stress.
5. In case of a very severe condition, the contact details of a renowned psychiatrist will be provided. The app generates reminders after specific intervals to check the progress of the student after some remedies have been incorporated by them.
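The steps above can be sketched as a small routing function (an illustrative Python sketch; the questionnaire-score thresholds and the action labels are hypothetical, not the app's actual values):

```python
def route_user(score, sport_student=False):
    """Map a questionnaire score to a depression level (1-3) and the
    corresponding action from the proposed workflow."""
    level = 1 if score < 10 else 2 if score < 20 else 3
    if level <= 2:                      # levels 1-2: considered curable in-app
        action = "messenger+token" if sport_student else "chatbot"
    else:                               # level 3: refer to a psychiatrist
        action = "psychiatrist_contact"
    return level, action
```

A regular student with a mild score is routed to the 24/7 chatbot, a sports student to the messenger/admin channel, and a severe score to the psychiatrist referral with follow-up reminders.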
The sentiment classification follows the recursive neural tensor model of [11]. For a pair of child vectors a, b ∈ R^d:

h = [a; b]ᵀ V[1:d] [a; b]    ...I

where h ∈ R^d is the output of the tensor product, V[1:d] ∈ R^(2d×2d×d) is a tensor that defines multiple bilinear forms, and V[i] ∈ R^(2d×2d) is each slice of V[1:d]:

h_i = [a; b]ᵀ V[i] [a; b]    ...II

p = f([a; b]ᵀ V[1:d] [a; b] + W[a; b])    ...III

y = softmax(Ws · p)    ...IV

where Ws ∈ R^(5×d) is the sentiment classification matrix.

2. The error function of a sentence is:

E(θ) = Σ_i Σ_j t_j^(i) log y_j^(i) + λ‖θ‖²    ...V

where θ = (V, W, Ws, L).

5.1.3 Working
First, we provide a text area in which the user has to express his/her condition. Then a function is executed on this text area which splits all the sentences present in it; this function returns the number of sentences and an array of sentences. Stanford CoreNLP is then applied to this array of sentences to compute the sentiment level of each sentence. If any one sentence's sentiment level returns 1 (negative), then the sentiment level of the complete text area is 1.
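The working described above can be sketched as follows (a minimal Python sketch; the regex-based sentence splitter and the stub classifier stand in for Stanford CoreNLP, and the assumption that a per-sentence score of 1 means negative follows the text):

```python
import re

def split_sentences(text):
    """Split the text area's contents into sentences; returns the
    count and the array of sentences, as in the Working section."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return len(sentences), sentences

def area_sentiment(text, sentence_sentiment):
    """If any sentence's sentiment is 1 (negative), the whole
    text area is scored 1; otherwise 0."""
    _, sentences = split_sentences(text)
    return 1 if any(sentence_sentiment(s) == 1 for s in sentences) else 0
```

Here `sentence_sentiment` would be the per-sentence classifier (Stanford CoreNLP in the proposed system); any negative sentence marks the whole entry as negative, triggering the depression-level questionnaire.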
these anti-depression chatbots would become more efficient.

The anti-depression chatbot can also be used by professional sports players. There is a lot of pressure on sports players, particularly when they fail; they need to find some way, some path to the top again, and these chatbots can help a lot. As we keep improving the underlying technology through trial and error, NLP will grow more efficient, capable of handling more complex commands and delivering more poignant outputs. Chatbots will also be able to hold multilingual conversations, not only understanding hybrid languages like 'Hinglish' (Hindi crossed with English) with NLU, but, with advanced NLG, also being able to reply in kind. In a conversational space, users enjoy the freedom to input their thoughts seamlessly. Whether it is an enquiry related to a service being provided or a query for help, users receive an instant reply which gives them a sense of direction inside the app. This app is the best line of defense against a varying range of depression, and for a wide range of ages. The app can detect, measure and treat depression, and it will help a huge population cope with the increasing stress that is gripping society. Thus, the contribution of this app towards society is immense.

REFERENCES
[1] Culjak, M. Spranca. Internet Improves Health Outcomes in Depression. Proceedings of the 39th Annual Hawaii International Conference on System Sciences, 2006, pp. 1–9.
[2] Imen Tayari Meftah, Nhan Le Thanh, Chokri Ben Amar. Detecting Depression Using Multimodal Approach of Emotion Recognition. GLOBAL HEALTH 2012: The First International Conference on Global Health Challenges.
[4] Brian S. Helfer, Thomas F. Quatieri, James R. Williamson, Daryush D. Mehta, Rachelle Horwitz, Bea Yu. Classification of Depression State Based on Articulatory Precision. Interspeech 2013.
[5] Lambodar Jena, Narendra K. Kamila. A Model for Prediction of Human Depression Using Apriori Algorithm. 2014 International Conference on Information Technology.
[6] Thin Nguyen, Dinh Phung, Bo Dao, Svetha Venkatesh, Michael Berk. Affective and Content Analysis of Online Depression Communities. IEEE Transactions on Affective Computing, vol. 5, no. 3, July–Sept. 2014.
[7] Zhenyu Liu, Bin Hu, Lihua Yan, Tianyang Wang, Fei Liu, Xiaoyu Li, Huanyu Kang. Detection of Depression in Speech. 2015 International Conference on Affective Computing and Intelligent Interaction (ACII).
[8] Tan Tze Ern Shannon, Dai Jingwen Annie, See Swee Lan. Speech Analysis and Depression. 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA).
[9] Dongkeon Lee, Kyo-Joong Oh, Ho-Jin Choi. The Chatbot Feels You – A Counseling Service Using Emotional Response Generation. 2017 IEEE International Conference on Big Data and Smart Computing (BigComp).
[10] Arkaprabha Sau, Ishita Bhakta. Predicting Anxiety and Depression in Elderly Patients Using Machine Learning Technology. vol. 4, no. 6, 2017.
[11] Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, Christopher Potts. Recursive Deep Models for Semantic Compositionality over a Sentiment Treebank. 2013, Stanford University.
[3] Shamla Mantri, Dr. Pankaj Agrawal, Dr.
S.S.Dorle, Dipti Patil, Dr. V.M.Wadhai.
Clinical Depression analysis Using Speech
Features. 2013 Sixth International
Conference on Emerging Trends in
Engineering and Technology
of steps to finally classify the tweets into two labels, positive and negative, thus expressing the mood of the public regarding the given topic. The most interesting part of our work is the use of the ensemble approach in the process of classifying the tweets. But before applying the machine learning algorithms it is important that proper pre-processing of the data is done. Pre-processing is followed by feature extraction, which is used for generating the feature vectors that are then used for classification. Using graphs and statistics we will also provide a comparison between the results obtained using the techniques individually and the results obtained using the ensemble approach.

Pre-processing
Tweets are usually composed of incomplete expressions, or expressions containing emoticons, acronyms or special symbols. Such irregular Twitter data will affect the performance of sentiment classification, so prior to feature selection a series of preprocessing steps is performed on the tweets to reduce the noise and the irregularities:
- Removal of all non-ASCII and non-English characters in the tweets.
- Removal of URL links. The URLs do not contain any useful information for our analysis, so they are deleted from the tweets.
- Removal of numbers. Numbers generally do not convey any sentiment, are thus useless during sentiment analysis, and are deleted from the tweets.
- Expansion of acronyms and slang to their full word forms. Acronyms and slang are common in tweets, but are ill-formed words; it is essential to expand them to their original complete word forms for sentiment analysis.
- Replacement of emoticons and emojis. An emoticon expresses the mood of the writer. We replace emoticons and emojis with their original text form by looking them up in an emoticon dictionary.

NLP and Feature Selection
Natural language processing here basically includes removal of stop words and stemming of the words after pre-processing.
- Stop word removal: Stop words usually refer to the most common words in a language, such as "the", "an", and "than". The classic method is based on removing the stop words obtained from precompiled lists; multiple stop word lists exist in the literature.
- Stemming: This refers to replacing multiple word forms with the same root word. Example: "played", "playing" and "play" are all replaced with "play".
The algorithms used for these purposes are described in further sections of the paper.
Finally, feature selection is done. Vectors of words are created after pre-processing and NLP have been applied on the tweets. These vectors are given to the classifiers for the purpose of classification.

Ensemble Approach for Classification
In our work we use the ensemble approach for classification, that is, labelling the tweets with different polarities. This is the most important part of our work: most previous works have used only single machine learning algorithms for classification, but here we use an ensemble of three different algorithms to obtain better prediction results than could be obtained from any of the learning algorithms alone. The advantage of the ensemble approach is that it significantly increases the efficiency of classification. One more important aspect of using the ensemble approach is choosing the right combination of algorithms. In our work we consider Naïve Bayes, Random Forest and Support Vector Machine for the ensemble classifier. These algorithms have been selected as they have proven to give the best results when used individually, and thus using them in the ensemble will also yield efficient results. The algorithms are discussed briefly in a further section.

4. SYSTEM ARCHITECTURE
The following figure shows the proposed architecture of the system, which includes three main parts: pre-processing, feature selection, and applying the ensemble classifier to perform sentiment analysis on social media big data, with visualization of the results obtained using graphs.
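The pipeline just outlined (pre-process, build word features, then classify by majority vote of the three base classifiers) can be sketched like this. The stop-word set and emoticon dictionary entries are illustrative assumptions, as is the crude suffix-stripping stemmer; a real system would use precompiled lists and trained Naive Bayes, Random Forest and SVM models in place of the stub classifiers.

```python
import re
from collections import Counter

# Illustrative resources (assumptions, not the paper's actual lists).
STOP_WORDS = {"the", "an", "a", "than", "is", "are"}
EMOTICONS = {":)": "happy", ":(": "sad"}

def preprocess(tweet):
    """Pre-processing steps described above: replace emoticons, then
    remove URLs, numbers and non-English characters."""
    for emo, word in EMOTICONS.items():
        tweet = tweet.replace(emo, " " + word + " ")
    tweet = re.sub(r"https?://\S+", " ", tweet)   # remove URL links
    tweet = re.sub(r"\d+", " ", tweet)            # remove numbers
    tweet = re.sub(r"[^A-Za-z\s]", " ", tweet)    # remove non-ASCII / non-English chars
    return tweet.lower()

def stem(word):
    """Crude suffix stripping: 'played', 'playing' -> 'play'."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def features(tweet):
    """NLP stage: stop-word removal and stemming give the word vector."""
    return [stem(w) for w in preprocess(tweet).split() if w not in STOP_WORDS]

def ensemble_predict(classifiers, feature_vector):
    """Hard majority vote over the base classifiers
    (Naive Bayes, Random Forest and SVM in the proposed system)."""
    votes = [clf(feature_vector) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]
```

With trained models, the same hard-voting combination is available off the shelf, e.g. as scikit-learn's `VotingClassifier(voting="hard")`.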
Step 4: If the word matches, it is removed from the array, and the comparison is continued till all the words from D have been compared successfully.
Step 5: After successful removal of the first stop-word, another stop-word is read from the stop-word list and we again continue from Step 2. The algorithm runs till all the stop-words have been compared successfully.

Stemming Algorithm
Input: comments after stop-word removal.
Output: stemmed comment data.
Step 1: A single comment is read from the output of the stop-word removal algorithm.
Step 2: It is then written into another file at the given location, from which it is read during the stemming process.
Step 3: Tokenization is applied on the selected comment.
Step 4: Each word is processed in a loop during stemming and checked for whether the word or character is null or not.
Step 5: The word is then converted into lower case and compared with the other words in the comments.
Step 6: If words with similar meaning are found, they are stemmed, that is, reduced to their base word.

After the pre-processing is done, the next step is building the classifier based on the ensemble approach. The following algorithms are considered for that purpose.

Machine Learning Algorithms
This section briefly describes the machine learning algorithms that will be used in our work.

Naïve Bayes Algorithm
Naive Bayes is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are drawn from some finite set. It is a probabilistic classification technique which finds the probability of a label belonging to a certain class (in our case the classes are positive and negative). The algorithm uses Bayes' theorem, which describes the probability of a feature based on prior knowledge of conditions that might be related to that feature, for the purpose of finding the probabilities. The theorem assumes that the value of any particular feature is independent of the value of any other feature. It is given as:

P(A|B) = (P(A) * P(B|A)) / P(B)

Support Vector Machine
A support vector machine is a supervised machine learning technique. Every data item is represented as a point in an n-dimensional space, and a hyperplane is constructed that separates the data points into different classes; this hyperplane is then used for classification. In our work the hyperplane divides the dataset into two classes, positive and negative. The hyperplane having the maximum distance to the nearest training data items of both classes is considered the most appropriate hyperplane. This distance is called the margin; in general, the larger the margin, the lesser the classification error.

Random Forest
Random Forest is built as an ensemble of many decision trees. In the classification procedure, each decision tree in the Random Forest classifies an instance, and the Random Forest classifier assigns it to the class with the most votes from the individual decision trees. So basically, each decision tree in the random forest performs classification
on random parts of the dataset, and the predictions of all these different trees are aggregated to generate the final result.

6. PERFORMANCE MEASUREMENTS
The classification performance will be evaluated in terms of accuracy, recall and precision, as defined below. A confusion matrix is used for this.

Accuracy = (true positive reviews + true negative reviews) / (total number of documents)

Recall = (true positive reviews) / (true positive reviews + false negative reviews)

Precision = (true positive reviews) / (true positive reviews + false positive reviews)

7. CONCLUSION AND FUTURE WORK
A framework is being built that will enhance the existing techniques of sentiment analysis: previous techniques mostly focused on classification of single sentences, but the framework we are building works on huge amounts of data using machine learning techniques. The use of machine learning instead of a lexicon-based approach is a big plus-point of this work, and the framework has the potential to outdo the existing systems because of the use of the ensemble approach. It will do the classification on the basis of polarities, i.e. positive and negative. Future work can include developing better techniques for visualizing the results, classifying the tweets on a range of emotions, and using larger datasets to train the classifiers so as to improve the efficiency of the analysis process.

REFERENCES
[1] Tapasy Rabeya, Sanjida Ferdous. "A Survey on Emotion Detection". 2017 20th International Conference of Computer and Information Technology (ICCIT).
[2] Sonia Xylina Mashal, Kavita Asnani. "Emotion Intensity Detection for Social Media Data". 2017 International Conference on Computing Methodologies and Communication (ICCMC).
[3] Kudakwashe Zvarevashe, Oludayo O. Olugbara. "A Framework for Sentiment Analysis with Opinion Mining of Hotel Reviews". 2018 Conference on Information Communications Technology and Society (ICTAS).
[4] M. Trupthi et al. "Improved Feature Extraction and Classification - Sentiment Analysis". International Conference on Advances in Human Machine Interaction (HMI-2016), March 03-05, 2016, R. L. Jalappa Institute of Technology, Doddaballapur, Bangalore, India.
[5] Orestes Appel et al. "A Hybrid Approach to Sentiment Analysis". IEEE, 2016.
[6] S. Brindha et al. "A Survey on Classification Techniques for Text Mining". 3rd International Conference on Advanced Computing and Communication Systems (ICACCS-2016), Jan. 22-23, 2016, Coimbatore, India.
[7] Yuling Chen, Zhi Zhang. "Research on Text Sentiment Analysis Based on CNNs and SVM". 2018 Conference on Information Communications Technology and Society (ICTAS).
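The three measures defined in Section 6 reduce to simple functions of the confusion-matrix counts; written out (tp, tn, fp, fn being the four cells of the matrix):

```python
def accuracy(tp, tn, total_documents):
    """(true positives + true negatives) / total number of documents."""
    return (tp + tn) / total_documents

def recall(tp, fn):
    """true positives / (true positives + false negatives)."""
    return tp / (tp + fn)

def precision(tp, fp):
    """true positives / (true positives + false positives)."""
    return tp / (tp + fp)
```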
These are some examples of popular stock exchanges:

The project goal is to build a system where the machine learning algorithms try to predict the prices of stocks based on their previous closing prices and other attributes that influence the price, like interest rates, foreign exchange and commodity prices[4].

2. MOTIVATION
Stock market movements make headlines every day. In India, 3.23 crore individual investors trade stocks; Maharashtra alone accounts for one-fifth of these investors. However, a report from Trade Brains shows that 90% of these investors lose money due to various reasons like insufficient research, speculation, and trading with emotions.

A higher inflation rate and lower interest rates make it ineffective to put one's money into a savings account or fixed deposits[5][6]. Thus, many people look to the stock market to keep up with inflation. In this process of multiplying their money, many investors have made a fortune, while some have lost a lot of money due to unawareness or lack of time to research a stock.

There are lots of contradicting opinions in the news, and an individual may not have the time or may not know how to research a stock. Most importantly, it is very difficult to manually predict stock prices based on the previous performance of a stock. Due to these factors, many investors lose a lot of money every year[6].

A system that could predict stock prices accurately is highly in demand. Individuals could know the predicted stock prices upfront, which may prevent them from investing in a bad stock. This would also mean a lot of saved time for the many investors who are figuring out whether a particular stock is good or not.

3. LITERATURE SURVEY
1. Comparative analysis of data mining techniques for financial data using parallel processing[1]
[2014] [IEEE] Performs a comparative analysis of several data mining classification techniques on the basis of the parameters accuracy, execution time, types of datasets and applications. Simple regression and multivariate analysis are used; regression analysis is applied on attributes. No use of machine learning; does not provide the algorithm used.

2. Stock market prices do not follow random walks: Evidence from a simple specification test[2]
[2015] [IEEE] Tests the random walk hypothesis for weekly stock market returns by comparing variance estimators derived from data sampled at different frequencies. Simple trading rules extraction and extraction of trading rules from charts. No alternative is provided for human investing; it shows only the flaws of manual investments.

3. A Machine Learning Model for Stock Market Prediction[3]
[2017] [IJAERD] Support Vector Machine with Regression Technology (SVR) and Recurrent Neural Networks (RNN). Regression analysis on attributes using simple regression and multivariate analysis. It is not tested in the real market. Shows how social media affects share prices, but does not account for other factors.

4. Twitter mood predicts the stock market[4]
[2010] [IEEE] Analyzes the text content of daily Twitter feeds with two mood tracking tools, namely Opinion Finder, which measures positive vs. negative mood, and Google-Profile of Mood States. The results are strongly indicative of a predictive correlation between measurements of public mood states from Twitter feeds. Difficult to scan each
and every text extraction from a large set of data; difficult text mining.

5. Stock Market Prediction on High-Frequency Data Using Generative Adversarial Nets[5]
[2017] [Research] Proposes a generic framework employing Long Short-Term Memory (LSTM) and a convolutional neural network (CNN) for adversarial training to forecast the high-frequency stock market. This model achieves prediction ability superior to other benchmark methods by means of adversarial training, minimizing direction prediction loss and forecast error loss. It cannot predict multi-scale conditions and live data.

6. Stock Market Prediction Using Machine Learning[6]
[2016] [IEEE] Uses different modules, gives different models, and gives the best accuracy using live streaming data. Predicts real market data and calculates live data using single and multilevel perceptrons, SVM, and radial basis functions. It could not work with textual data from different browsing data (web crawling).

7. Stock Market Prediction by Using Artificial Neural Networks[7]
[2014] [IEEE] This model takes the help of Artificial Intelligence and uses only neural networks to predict the data, predicting with single and multi-level perceptrons. It uses 10 hidden layers with a learning rate of 0.4, a momentum constant of 0.75 and a maximum of 1000 epochs. This model doesn't use machine learning algorithms like SVM and radial basis functions to determine their accuracy.

8. Price trend prediction Using Data Mining Algorithm[8]
[2015] [IEEE] This paper presented a data mining approach to predict the long-term trend of the stock market. The proposed model detects anomalies in the data according to the volume of a stock to accurately predict the trend of the stock. This paper only provides long-term predictions and does not give predictions for immediate trends.

5. PROPOSED WORK
Stock market prediction using machine learning can be a challenging task. The process of determining which indicators and input data will be used, and gathering enough training data to train the system appropriately, is not obvious. The input data may be raw data on volume, price, or daily change, but it may also include derived data such as technical indicators (moving average, trend-line indicators, etc.)[5] or fundamental indicators (intrinsic share value, economic environment, etc.). It is crucial to understand what data can be useful to capture the underlying patterns and integrate it into the machine learning system. The methodology used in this work consists of applying machine learning systems, with special emphasis on Genetic Programming (GP). GP has been considered one of the most successful existing computational intelligence methods, capable of obtaining competitive results on a very large set of real-life applications against other methods; the different algorithms used are described in [1].

Tools and Technologies Used
- Python
- Libraries such as OpenCV, scikit, pandas, numpy
- Machine learning techniques: classifiers
- Linear regression techniques
- Jupyter IDE
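As a minimal instance of the "linear regression techniques" listed among the tools, here is a closed-form least-squares fit of the next closing price on the previous day's close. This is an illustrative sketch of the basic idea, not the proposed system, and the one-lag model is an assumption made for brevity:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b, in closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx  # (a, b)

def predict_next_close(closes):
    """Regress next-day close on previous close over the history,
    then extrapolate one step from the latest close."""
    prev, nxt = closes[:-1], closes[1:]
    a, b = fit_line(prev, nxt)
    return a * closes[-1] + b
```

In practice the same fit would be done with `sklearn.linear_model.LinearRegression` over many lagged features (interest rates, volume, etc.) rather than the single lag used here.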
4. GAP ANALYSIS

Sr. | Paper [Year] [Venue] | Objective / Method | Result | Gap
2 | Stock market prices do not follow random walks: Evidence from a simple specification test [2015] [IEEE] | Test the random walk hypothesis for weekly stock market returns by comparing variance estimators derived from data sampled at different frequencies | Simple trading rules extraction and extraction of trading rules from charts | No alternative provided for human investing; shows only the flaws of manual investments
3 | A Machine Learning Model for Stock Market Prediction [2017] [IJAERD] | Support Vector Machine with Regression Technology (SVR), Recurrent Neural Networks (RNN) | Regression analysis on attributes using simple regression and multivariate analysis | Not tested in the real market; shows how social media affects share prices but does not account for other factors
4 | Twitter mood predicts the stock market [2010] [IEEE] | Analysed the text content of daily Twitter feeds by two mood tracking tools, namely Opinion Finder (positive vs. negative mood) and Google-Profile of Mood States | Results strongly indicative of a predictive correlation between measurements of public mood states from Twitter feeds | Difficult to scan each and every text extraction from a large data set; difficult text mining
5 | Stock Market Prediction on High-Frequency Data Using Generative Adversarial Nets [2017] [Research] | Generic framework employing Long Short-Term Memory (LSTM) and a convolutional neural network (CNN) for adversarial training to forecast the high-frequency stock market | Prediction ability superior to other benchmark methods by means of adversarial training, minimizing direction prediction loss and forecast error loss | Cannot predict multi-scale conditions and live data
6 | Stock Market Prediction Using Machine Learning [2016] [IEEE] | Uses different modules, gives different models and gives best accuracy using live streaming data | Predicts real market data and calculates live data using single and multilevel perceptrons, SVM, radial basis | Could not work with textual data from different browsing data (web crawling)
Although a substantial volume of research exists on the topic, very little is aimed at long-term forecasting while making use of machine learning methods and textual data sources. We prepared over ten years' worth of stock data and proposed a solution which combines features from textual yearly and quarterly filings with fundamental factors for long-term stock performance forecasting. Additionally, we developed a new method of extracting features from text for the purpose of performance forecasting and applied feature selection aided by a novel evaluation function.

Problems overcome[5]: To produce effective models, there were two main sets of problems we were faced with and had to overcome. The first was that of market efficiency, which places theoretical limits on how patterns can be found in the stock markets for the purpose of forecasting. This property can become a concrete problem through patterns being exhibited in the data which are useless or even detrimental for predicting future values. The first way we tried to deal with this was by carefully splitting our data into training, validation, and testing data with expanding windows, so as to make maximum use of it while trying to avoid accidental overfitting. The second way we dealt with this was by using tailored model performance metrics, which aimed to ensure good test performance not only by maximizing model validation, but also by minimizing the variation of this value across validation years[7]. The third way we dealt with market efficiency was by performing feature selection using the Algorithm, so as to remove those features which performed poorly or unreliably. The second set of problems came from putting together a dataset to use for experimentation and testing. Due to the large volume of the data, care had to be taken when cleaning and preparing it, and the inevitable mistakes along the way required reprocessing of the data[4]. Using expert knowledge, we determined how to deal with the various problems in the data and ended up using mean substitution and feature deletion.

9. FUTURE WORK

1. Model Updating Frequency:
The models are trained once and then used for predicting stock performances over the span of a year. Since we use a return duration of 120 trading days, there is a necessary wait of half a year before data can be used to train models, which means that models end up making predictions using data which is over a year old. One way to make use of data as soon as it becomes available is to completely retrain the model every week (or less). A faster way to improve model performance may be through updating using incremental machine learning algorithms, which can update model parameters without re-training on all data[6].

2. Explore More Algorithms:
Although many different models were considered in this thesis, including various linear regression methods, gradient boosting, random forests, and neural networks, there is always more room to explore.

3. Improve Feature Extraction:
In this thesis, a few methods for extracting features from filings with textual data were explored. The problems of extracting features from text, and of determining text sentiment in particular, are well studied, and other natural language processing methods may perform better. Our approach of using autoencoders to extract features may also benefit from further exploration. In particular, when using the auxiliary loss, a more accurate method for estimating the financial effect corresponding to a given filing would be useful.

4. Utilize Time Series Information:
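The expanding-window splitting strategy described above (a training window that grows while validation and test periods roll forward) can be sketched as follows; the yearly granularity and window sizes are illustrative assumptions:

```python
def expanding_window_splits(years, n_test=1, n_val=1, min_train=2):
    """Yield (train, validation, test) period lists in which the
    training window expands while validation and test roll forward."""
    splits = []
    for end in range(min_train, len(years) - n_val - n_test + 1):
        train = years[:end]                       # expanding window
        val = years[end:end + n_val]              # rolling validation period
        test = years[end + n_val:end + n_val + n_test]  # rolling test period
        splits.append((train, val, test))
    return splits
```

Each successive split trains on all data seen so far, which is the property the text relies on to make maximum use of the data while avoiding accidental overfitting.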
REFERENCES
[1] Raut Sushrut Deepak, Shinde Isha Uday, Dr. D. Malathi, "Machine Learning Approach in Stock Market Prediction", International Journal of Pure and Applied Mathematics, vol. 115, no. 8, 2017, pp. 71-77.
[2] Tao Xing, Yuan Sun, Qian Wang, Guo Yu, "The Analysis and Prediction of Stock Price", 2013 IEEE International Conference on Granular Computing.
[3] A. W. Lo, A. C. MacKinlay, "Stock market prices do not follow random walks: Evidence from a simple specification test", Review of Financial Studies, vol. 1, no. 1, pp. 41-66, 1988.
[4] Yash Omer, Nitesh Kumar Singh, "Stock Prediction using Machine Learning", 2018 International Journal on Future Revolution in Computer Science & Communication Engineering.
[5] Ritu Sharma, Mr. Shiv Kumar, Mr. Rohit Maheshwari, "Comparative Analysis of Classification Techniques in Data Mining Using Different Datasets", 2015 International Journal of Computer Science and Mobile Computing.
[6] Osma Hegazy, Omar S. Soliman, "A Machine Learning Model for Stock Market Prediction", International Journal of Computer Science and Telecommunications, vol. 4, no. 12, December 2013.
[7] S. P. Pimpalkar, Jenish Karia, Muskaan Khan, Satyam Anand, Tushar Mukherjee, "Stock Market Prediction using Machine Learning", International Journal of Advance Engineering and Research Development, vol. 4, 2017.
[8] Xingyu Zhou, Zhisong Pan, Guyu Hu, Siqi Tang, Cheng Zhao, "Stock Market Prediction on High-Frequency Data Using Generative Adversarial Nets", Mathematical Problems in Engineering, Volume 2018.
[9] J. Bollen, H. Mao, X. Zeng, "Twitter mood predicts the stock market", Journal of Computational Science, vol. 2, no. 1, pp. 1-8, 2011.
people in this area of work to take smart, calculated and informed decisions which will add to the advancement of the field and the economy of the country. The design model will work on data from the past several years and will be able to improve itself according to the real-time data that comes along the way. The model will aim to achieve a higher level of accuracy in its prediction range and will be adaptable to any kind of data that is given to it. A number of alternate measures of forecast performance, having to do with statistical as well as directional accuracy, are employed. The stock recommendation system will be based on data already known to us; we focus on raw-material and dependency variation, and artificial intelligence and machine learning form the path for our project.

2. MOTIVATION
Stock recommendation and prediction is a very tricky business, and forecasting commodity prices relying exclusively on historic price data is a challenge of its own. Spot prices and future prices are nonstationary; they form a co-integrating relation. Spot prices tend to move towards future prices over the long run, hence predicting this path has become more useful than ever. Fluctuations in commodity prices affect global economic activity. For many countries, especially developing countries, primary commodities remain an important source of export earnings, and commodity price movements have a major impact on overall performance; therefore commodity-price forecasts are a key input to policy planning and formulation. Sales is a very crucial aspect for any developing nation, but managing those sales within the country and estimating their future prospects is also very important. A recommendation and prediction system will lead us to a standing where estimating the area of maximum outcome will ultimately benefit all business providers, and will bring us to a position where we can invest smartly, knowingly, and have the maximum outcome. For efficient manufacturing the actual real-time consumption is necessary, but it is not always possible to analyze real-time data; hence stock recommendation will give manufacturers an overview of stock consumption, leading towards lower production cost, and as a result the end consumer will be benefited.

3. STATE OF ART
Stock prices are considered to be chaotic and unpredictable. Predicting the future stock prices of financial commodities or forecasting upcoming stock market trends can enable investors to garner profit from their trading by taking calculated risks based on reliable trading strategies. The stock market is characterized by high risk and high yield; hence investors are concerned about the analysis of the stock market and are trying to forecast its trend. To accurately predict the stock market, various prediction algorithms and models have been proposed in the literature. In the paper proposed by A. Rao and S. Hule, Stock Market Prediction Using Statistical Computational Methodologies and Artificial Neural Networks, the focus is on the technical approaches that have been proposed and/or implemented with varying levels of accuracy and success rates. It surveys mainly two approaches: the Statistical Computational Approach and the Artificial Neural Networks Approach. It also describes the attempts that have gone into combining the two approaches in order to achieve higher-accuracy predictions.
In another work, by K. K. Sreshkumar and Dr. N. M. Elango, An Efficient Approach to Forecast Indian Stock Market Price and their Performance Analysis, the paper reveals the use of prediction algorithms and functions to predict future share prices and
specific, and cannot be readily used for comparison across commodities.

Forecasting Agricultural Commodity Prices Using Multivariate Bayesian Machine Learning Regression (Andres M. Ticlavilca, Dillon M. Feuz and Mac McKee): The dependency between input and output target is learned using MVRVM to make accurate predictions. The potential benefit of these predictions lies in assisting producers in making better-informed decisions and managing price risk, but the sparse property (low complexity) of the MVRVM cannot be analyzed for the small dataset.

Forecasting Model for Crude Oil Price Using Artificial Neural Networks and Commodity Futures Prices (Siddhivinayak Kulkarni, Imad Haidar): In this paper, ANN is selected as a mapping model and is viewed as a nonparametric, nonlinear, assumption-free model, which means it does not make a priori assumptions about the problem; but if the assumptions in an econometrical model are not correct, it could generate misleading results.

The system architecture starts with the client using any web browser to access the server and add his data. This data is further observed and is used to generate alpha 1 and alpha 2 with respect to current and historic data, which is necessary for the further prediction process. It is tested whether the newly acquired data possesses any abnormalities; if it does, the data is sent for noise removal and then goes to the section which combines new data and historic data. If there is no noise in the new data, it goes directly for combination with the historic data. The history is updated after combining the newly acquired data, and it is sent for training of the system. This system keeps on training as new data keeps on being added, and after a certain point of time the system becomes capable of training itself.

5. PROPOSED WORK
This paper proposes an artificial intelligent system for prediction and recommendation, as this is the heart and brain of the entire process; here the dataset noise elimination, learning and prediction stages occur. The data provided to the system should be relevant and labelled, in order to identify the parameters and predict the patterns it has learned. The system must understand the pattern between the data parameters at a fast rate, because that is important to speed up the calculation process for predicting future values. The artificial intelligence is based on a machine learning technique known as a decision learning tree, so it must select the ideal parameters in order to understand the pattern and predict the values.

Fig 1: System Architecture

5.1 Training Stage
This is the starting phase of the software cycle. This is where the system starts to learn and understand the patterns of the commodity, and then it starts to predict the prices of the commodity. This stage is divided into two stages: one where the system learns the dependency of the factors for the commodity, and a second concerning the external factors affecting the prices of the commodity. In the first stage, the
ISSN:0975-887 Department of Computer Engineering, SKNCOE,Vadgaon(Bk),Pune. Page 175
Proceeding of International Conference on Internet of Things,Next Generation Network & Cloud Computing 2019
system creates the cluster of algorithms. Then it identifies the dependency of the commodity on its raw materials by studying the factors of the raw material, and after learning the dependency it progresses to choosing the initial algorithm based on the factor it chose for predicting. After the selection of the algorithm, the AI tries to construct a sequence to train the machine, and finally trains the machine to understand the raw-material dependency of the commodity. In the second stage, the system creates a cluster of the raw-material data collected from the first stage. Then it learns the external factors affecting raw-material price fluctuation, for example inflation and import/export factors, and after learning the external factors it progresses to choosing the initial state of probability based on the commodity pattern. After the selection of the state of probability, the AI tries to construct a sequence to train the machine, and finally trains the machine.

Fig 2: Training Stage

5.2 Prediction Stage
In the prediction stage, our system will generate a pattern based on the historical data. The discovered pattern will then be added to the existing sequence of patterns. Using the combination of the discovered pattern and the existing sequence of patterns, the system will predict a value which we call alpha. A test then takes place to check the behavior of alpha: if, according to the test, alpha has a normal value, it is added to the existing sequence of values; otherwise it is considered an anomaly and the model is retrained.

Fig 3: Prediction Stage

5.3 Recommendation Stage
This is the final stage, where clients can access and get recommendations based on the commodity they want to buy. In this stage, the system first identifies the inventory management of the business owner and then chooses the probability state based on their sales, purchases and prices. The AI constructs a sequence from the data it was fed with and tries to implement it on the machine. After learning the inventory management, it tries to make recommendations to the business owner on the basis of the pattern. It tracks the inventory; a proprietary GIBS algorithm helps it understand the flow of the inventory, and finally the recommendation is tested. The test can have two outputs. Normal: if the test condition is satisfied, the inventory pattern is added to the system in order to recommend it in future. Anomaly: this is the demerit of the system in correctly identifying the inventory pattern, so it is sent back to the beginning, that is, to identification of inventory management.
Sr. | Year | Author(s)          | Approach                                                                                                              | Limitation
1.  | 2009 | Dawei Yin et al.   | Supervised learning was used for detecting harassment.                                                                | The experiments were done using supervised methods; the temporal ... approaches.
5.  | 2015 | Kansara & Shekokar | A framework detects abusive text messages or images from social network sites by applying SVM and Naive Bayes classifiers. | Not able to detect audio and video which are offensive.
Web Application Programming Interface (API). This makes it easy for the clients, as they are provided with a uniform interface irrespective of the client. The clients are expected to simply pass the comments made by users on their platform, in JSON format, to the proposed system's web API for evaluation. The comment then progresses through the three components of the Toxic Comment Classifier, and the response is sent back to the respective client via the web API.

System Parameters
1. Response Time - Since, typically, any online discussion platform will have several active users posting and updating comments, the evaluation of comments and the corresponding response generation must be quick, to ensure that users are not forced to wait for an unsatisfactorily long period of time. Thus, the proposed system is expected to provide a response to its clients in less than 4 seconds (assuming good network connectivity).
2. Cost - The cost associated with the proposed system is only for 'training' the machine learning model, which varies from platform to platform depending on various factors like GPU specifications, memory size, training time etc.
3. Scalability - During peak online traffic, it is important to make sure that the proposed system's response does not slow down. Thus, as the system is designed in the form of an API, it can easily be scaled up by replicating and deploying it on multiple servers so as to satisfy a larger number of incoming requests efficiently.
4. Accessibility - The proposed system is easily accessible in the form of an API to all its clients through a uniform interface.

6. CONCLUSION AND FUTURE WORK
To tackle the severe issue of abuse and harassment on social media platforms, and to improve the quality of online discussions, mitigating harmful online experiences is the need of the hour. The proposed system thus provides online social media utilities and other such discussion platforms the ability to assess the quality of users' comments by classifying them into various kinds of toxicity using techniques like Natural Language Processing and machine learning algorithms. Based on the results provided by the system, the communication platforms can decide the suitable course of action to be taken on such comments and hence ensure that their users have a better, safer and harmless online experience.
The goals of future work on toxic comment classification are to make initial admission decisions reliable, decrease the number of false calls, and make the QoS guarantees more robust in the face of network dynamics. There are users from various backgrounds and cultures who read and write in their native languages apart from English, so it may be difficult to identify toxic comments in their local languages. This problem can be countered using CNNs or deep learning in future. The system can also be improved with advancements in the fields of NLP, ML, AI, speech synthesis etc.

REFERENCES
[1] Hitesh Kumar Sharma, K Kshitiz, Shailendra. "NLP and Machine Learning Techniques for Detecting Insulting Comments on Social Networking Platforms," 2018.
[2] Pooja Parekh, Hetal Patel. "Toxic Comment Tools: A Case Study," 2017.
[3] Theodora Chu, Kylie Jue. "Comment Abuse Classification with Deep Learning."
[4] Manikandan R, Sneha Mani. "Toxic Comment Classification - An Empirical Study," 2018.
[5] Aksel Wester, Lilja Ovrelid, Erik Velldal, Hugo Lewi Hammer. "Threat Detection in Online Discussions."
[6] S. Bird, E. Klein, and E. Loper. "Natural Language Processing with Python," 2014. http://www.nltk.org/book/ch02.html
[7] J. Pennington, R. Socher, and C. D. Manning. "GloVe: Global Vectors for Word Representation," 2018. https://nlp.stanford.edu/projects/glove/
1. INTRODUCTION
With the AI assistants currently available, we can face a problem: if the microphone of the device fails, we are unable to interact with the assistant, which may interrupt the interaction. Also, while using the current assistants we are not able to visualize them; they are only virtually present, so we cannot see them. Moreover, when kids are using them, there are a few concepts that need to be visualized for better understanding.
The proposed system combines a multi-modal system with a holographic view. This draws on advancements in computer graphics and multimedia technologies for the way humans view and interact with the virtual world, such as augmented reality (AR) and hologram displays. The use of AR display devices, such as smartphones and smart glasses, allows the user to receive additional information in the form of informative graphics based on his or her field of view through the device, for example a street's name, a navigation arrow leading the user to the destination, etc. On the other hand, the use of a holographic pyramid prism can produce holographic results that display 3D objects in the real-world environment, letting the user look at different perspectives of these holograms when viewing from different angles.
This system can also be used in the education system to improve the learning experience, creating a better understanding in the minds of students. It can also be used in malls for demonstration of material: if the material is not available but will soon arrive, the customer can still view it using this Holographic AI Assistant.

2. EXISTING SYSTEM
The current existing systems are as shown below:

Fig 1: Existing Virtual AI Assistance System

As shown in fig. 1, these are the currently existing virtual AI assistance systems. They are systems which do not show the assistant in front of you. They also accept only simple input modes, that is, speech or text; they are not able to take input in the form of video frames, images, gestures, etc., and they are not very interactive.

3. PROPOSED MODEL
This proposed model gives an advanced version of the present existing system. It combines two concepts: holographic projection and an artificial intelligence assistant.

Fig 2: Architecture of Proposed System

Fig. 2 shows the architecture of the proposed system. The system consists of a transparent box, with a monitor placed in the top part of the box. Inside the box, a glass prism is set at an angle; this helps in displaying the projection. The inside projection will consist of the simple
b. Output Module:
The output module will be in the given form:

Fig.5 Output module with object

c. Interaction Module:
As mentioned by Veton Kepuska, this module consists of the way the interaction is made; it describes how the interaction is going to take place. Fig. 6 shows it [1].

Fig.6 Interaction Module

d. Natural Language Processing (NLP):
This module gives a proper understanding of NLP, which is the basic concept behind speech recognition in a multimodal system. Fig. 7 shows the NLP structure.

Fig.7 NLP

e. Knowledge Base:
The proposed system consists of two knowledge bases, one online and one offline, where all the data and facts will be stored, such as the facial and body datasets for the gesture module, the speech recognition knowledge base, the image and video datasets, and some user information related to the modules.

4. EXPERIMENTAL RESULTS
While researching the results generated when using single-modal AI assistants, we considered efficiency and correctness as important measures. With increasing functionality, the user experience concerns regarding voice recognition, visualization, and fast tracking of hand gestures, which we have introduced in the Holographic Assistant, have been a challenge to overcome.
Efficiency: In comparison with the old AI assistants, the Holographic Assistant will prove to be more accurate while using advanced technologies such as Natural Language Processing.
Accuracy: The accuracy of the holographic assistant would also be better, handling challenges like noise and accents, whereas the existing models were more error-prone.
Cost: One of the profitable things about this AI assistant is that it is almost free of cost. The overall prerequisites, apart from the available software, are a transparent glass and a monitor screen. Hence, this system would be affordable for all kinds of vendors in the market who are ready to take innovation to new levels in their businesses.

REFERENCES
[1] Veton Kepuska, Gamal Bohouta. "Next-Generation of Virtual Personal Assistants (Microsoft Cortana, Apple Siri, Amazon Alexa and Google Home)," 2018 IEEE.
[2] Mrs. Paul Jasmin Rani, Jason Baktha Kumar, Praveen Kumaar B., Praveen Kumaar U. and Santhosh Kumar. "Voice Controlled Home Automation System Using Natural Language Processing (NLP) and Internet of Things (IoT)," 2017 Third International Conference on Science Technology Engineering & Management (ICONSTEM).
[3] Chan Vei Siang, Muhammad Ismail Mat Isham, Farhan Mohamed, Yusman Azimi Yusoff, Mohd Khalid Mokhtar, Bazli Tomi, Ali Selamat. "Interactive Holographic Application using Augmented Reality Edu Card and 3D Holographic Pyramid for Interactive and Immersive Learning," 2017 IEEE Conference on e-Learning, e-Management and e-Services (IC3e).
[4] R. Mead. "Semio: Developing a Cloud-based Platform for Multimodal Conversational AI in Social Robotics," 2017 IEEE International Conference on Consumer Electronics (ICCE).
[5] Chuk Yau and Abdul Sattar. "Developing Expert System with Soft Systems Concept," 1994 IEEE.
[6] Inchul Hwang, Jinhe Jung, Jaedeok Kim, Youngbin Shin and Jeong-Su Seol. "Architecture for Automatic Generation of User Interaction Guides with Intelligent Assistant," 2017 31st International Conference on Advanced Information.
rg881403@gmail.com4, akashkadam985@gmail.com5
ABSTRACT
The development of information technology and communication has made the implementation of artificial intelligence systems complex. These systems approach human activities such as decision support systems, robotics, natural language processing, expert systems, etc. In the modern era of technology, chatbots are the next big thing in conversational services. A chatbot is a virtual person that can effectively talk to any human being using interactive textual skills.
GENERAL TERMS
NLP - Natural Language Processing
NLU - Natural Language Understanding
NLG - Natural Language Generation
NLTK- Natural Language Toolkit
1. INTRODUCTION
Chatbots are "online human-computer dialog systems with natural language." The first conceptualization of the chatbot is attributed to Alan Turing, who asked "Can machines think?" in 1950. Since Turing, chatbot technology has improved with advances in natural language processing and machine learning. Likewise, chatbot adoption has also increased, especially with the launch of chatbot platforms by Facebook, Slack, Skype, WeChat, Line, and Telegram.
Not only that, but nowadays there are also hybrids of natural language and intelligent systems that can understand human natural language. These systems can learn by themselves and renew their knowledge by reading all the electronic articles that exist on the Internet. A human user can ask the systems questions as they usually would of another human.

2. SYSTEM ARCHITECTURE
The system architecture is the conceptual model that defines the structure, behavior, and other views of a system. An architecture description is a formal description and representation of a system, organized in a way that supports reasoning about the structures and behaviors of the system. The system architecture consists of the following blocks:

3. OVERALL DESCRIPTION
Product Perspective
Most of the search engines today, like Google, use a system (the PageRank algorithm) to rank different web pages. When a user enters a query, the query is interpreted as keywords and the system returns a list of the highest-ranked web pages, which may have the answer to the query. Then the user must go through the list of webpages to find the answer they are looking for.
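The ranking idea mentioned above can be sketched with a small power-iteration PageRank example over a hypothetical four-page link graph; the graph and damping value follow the standard textbook formulation, not any specific search engine's implementation:

```python
# Minimal power-iteration PageRank sketch over a made-up 4-page link graph.
links = {            # page -> pages it links to (illustrative graph)
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def pagerank(links, damping=0.85, iters=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}       # start from a uniform rank
    for _ in range(iters):
        # Each page keeps the teleport share, then receives link shares.
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            share = rank[p] / len(outs)
            for q in outs:
                new[q] += damping * share
        rank = new
    return rank

rank = pagerank(links)
best = max(rank, key=rank.get)
print(best)  # "C" collects the most link weight in this graph
```

Because every page here has outgoing links, the total rank mass stays at 1.0 across iterations; the page with the most incoming link weight ends up ranked highest.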
chatbot is not a challenging task as compared to complex chatbots, and developers should understand and consider the stability, scalability and flexibility issues, along with a high level of attention to human language. In short, the chatbot field is moving quite fast, and with the passage of time new features are added to the existing platforms. Recent advancements in machine learning techniques may be able to handle complex conversation issues, such as payments, correctly.

6. FUTURE SCOPE
The future scope of our application is extending the knowledge database with more advanced datasets and including support for more languages as well, and providing users with more detailed reports of their previous performance, so as to improve the pace of users' skill development. We also plan to extend the web application into native mobile apps.

7. ACKNOWLEDGMENTS
We would like to take this opportunity to thank our internal guide Prof. G. Y. Gunjal for giving us all the help and guidance we needed. We are really grateful to him for his kind support; his valuable suggestions were very helpful.
We are also grateful to Dr. P. N. Mahalle, Head of the Computer Engineering Department, STES' Smt. Kashibai Navale College of Engineering, for his indispensable guidance, support and suggestions.

REFERENCES
[1] A. M. Rahman, Abdullah Al Mamun, Alma Islam. "Programming Challenges of Chatbot: Current and Future Prospective," Region 10 Humanitarian Technology Conference (2017).
[2] Bayu Setiaji, Ferry Wahyu Wibowo. "Chatbot Using a Knowledge in Database," 7th International Conference on Intelligent Systems, Modelling and Simulation (2016).
[3] Anirudh Khanna, Bishwajeet Pandey, Kushagra Vashishta, Kartik Kalia, Bhale Pradeepkumar,
[4] Teerath Das. "A Study of Today's A.I. through Chatbots and Rediscovery of Machine Intelligence," International Journal of u- and e-Service, Science and Technology, Vol. 8, No. 7 (2015).
[5] Sameera A. Abdul-Kader, Dr. John Woods. "Survey on Chatbot Design Techniques in Speech Conversation Systems," International Journal of Advanced Computer Science and Applications, Vol. 6, No. 7 (2015).
[6] https://www.altoros.com/blog/how-tensorflow-can-help-to-perform-natural-language-processing-taksk
https://media.readthedocs.org/pdf/nltk/latest/nltk.pdf
The second stage is padding the tokens of variable length; for this, the pad_sequences() function in the Keras deep learning library can be used to pad variable-length sequences. The default pad value is 0.0, which is suitable for almost every application, although this can be changed by specifying the preferred value via the "value" argument. Whether padding is applied at the start or at the end of the sequence, called pre- or post-sequence padding, can be specified via the "padding" argument.
Text data requires special preparation before you can start using it for predictive modelling. The text must first be parsed into words, called tokenisation. Then the words need to be encoded as integers or floating-point values for use as input to a machine learning algorithm, called text encoding. Once this encoding process is completed, the text tokens are ready for the embedding process.
One way is to create a co-occurrence matrix. A co-occurrence matrix is a matrix that consists of the counts of each word appearing next to all the other words in the corpus (or training set). Consider the following matrix.

Table 2 - Word Embedding Table

We are able to gain useful insights. For example, take the words 'love' and 'like': both contain 1 for their counts with nouns like NLP and dogs. They also have 1's for each occurrence of "I", which indicates that the words must be some sort of verb. These features are learnt by the NN, as this is an unsupervised method of learning. Each of the vectors has several sets of characteristics. For example, let us take V(King) -
automatically uses the GPU wherever and whenever possible with the help of CuDNNLSTM, a high-level Keras/TensorFlow neural network layer which runs the model on the GPU (an Nvidia GPU) using CUDA technology. CUDA is NVIDIA's parallel computing architecture, which enables dramatic increases in computing performance by harnessing the power of the GPU (graphics processing unit). CuDNNLSTM is a fast LSTM implementation backed by cuDNN; the execution of model training gets faster by 12 to 15% depending on the data.

5.1 FIGURES/CAPTIONS
This diagram depicts the actual working of the proposed system and all the functionalities it will perform. Model formation for fake news detection makes use of the training and test datasets and some other parameters, such as the dimensions of the vector space in which it holds the relation between two or more news entities. All these data are passed into the main function, which is meant to generate the confusion matrix and present the result in terms of percentage.

Fig 5 - Working of proposed model

Initially the system stores the gathered news in a database, which is then retrieved by the model; the model processes the training data and produces the classifier. The user is supposed to enter a news item manually, which is assumed to be unverified. Once the input is given via the web portal, it reaches the model in the backend, which processes it and gives the output. The news given by the user is taken as a test set or test case and is sent to the classifier, which classifies it.

6. CONCLUSION
The circulation of fake news online not only jeopardises the news industry but has been negatively impacting users' minds, and they tend to believe all the information they read online. It has the power to dictate the fate of a country or even the whole world, and the daily decisions of the public are also affected. Applying the proposed model would definitely help in differentiating between fake and real news.

REFERENCES
[1] Sadia Afroz, Michael Brennan, and Rachel Greenstadt. Detecting hoaxes, frauds, and deception in writing style online. In ISSP'12.
[2] Hunt Allcott and Matthew Gentzkow. Social media and fake news in the 2016 election. Technical report, National Bureau of Economic Research, 2017.
[3] Meital Balmas. When fake news becomes real: Combined exposure to multiple news sources and political attitudes of inefficacy, alienation, and cynicism. Communication Research, 41(3):430-454, 2014.
[4] Alessandro Bessi and Emilio Ferrara. Social bots distort the 2016 US presidential election online discussion. First Monday, 21(11), 2016.
[5] Prakhar Biyani, Kostas Tsioutsiouliklis, and John Blackmer. "8 amazing secrets for getting more clicks": Detecting clickbaits in news streams using article informality. In AAAI'16.
[6] Thomas G. Dietterich et al. Ensemble methods in machine learning. Multiple Classifier Systems, 1857:1-15, 2000.
[7] Kaggle, Fake News NLP Stuff. https://www.kaggle.com/rksriram312/fake-news-nlp-stuff/notebook
[8] Kaggle, All the News. https://www.kaggle.com/snapcrack/all-the-news
[9] Mykhailo Granik, Volodymyr Mesyura. "Fake News Detection Using Naive Bayes," 2017.
[10] Sohan Mone, Devyani Choudhary, Ayush Singhania. "Fake News Identification," 2017.
ABSTRACT
Big data can play an important role in data science and the healthcare industry, to manage data and easily utilize all of it in a proper way with the help of the "V6s" (Velocity, Volume, Variety, Value, Variability, and Veracity). The main goal of this paper is to provide an in-depth analysis of the field of medical science and healthcare data analysis; it also focuses on previous strategies in healthcare as well as medical science. The digitization process has reached medical science (MS) and the Healthcare Industry (HI), and hence produces massive amounts of patient-related data for analysis, giving a 360-degree view of the patient for analysis and prediction. This helps to improve healthcare activities such as clinical practice, new drug development, and the financial processes of healthcare, and it brings many benefits to healthcare activities such as early disease detection, fraud detection, and better healthcare quality and efficiency. This paper introduces big data analytics techniques, their challenges in healthcare, and their benefits, applications and opportunities in medical science and healthcare.
General Terms
Hadoop, Map-Reduce, Healthcare Big-Data, Medicals, Pathologist.
Keywords
Healthcare Industry (HI), R, Data Analytics (DA), Smart-Health (SH).
1. INTRODUCTION
The main goal of this paper is to provide the best predictive analysis solutions to researchers, academicians, healthcare industries and medical science industries who have an interest in big data analytics for the specific healthcare and medical science industries.
We know that all healthcare industries and medical science researchers depend on data for analysis and processing, and that this data is generated from government hospitals' and private clinics' collaborative records of every old and new patient, in differently structured forms known as big data. Big data can thus be processed and identified with the help of the big data characteristics: the V6's (Volume, Velocity, Variety, Value, Variability, Veracity), used to achieve the dedicated outcomes.
1. Volume: The data size is big/huge, e.g. terabytes (TB), petabytes (PB), zettabytes (ZB), etc.
2. Velocity: Data can be generated at high speed, e.g. data generated per day, per hour, per minute, per second, etc.
3. Variety: Data can be represented in different types, i.e. structured, unstructured and semi-structured data; for example, data from email messages, articles, streamed videos and audio, etc.
4. Value: Data has some valuable information or insight within it; there will be useful information somewhere within the data for outcomes.
5. Variability: Data can change during processing; it may produce some unexpected, hidden and valuable information.
6. Veracity: This focuses on two terms, data trustworthiness and data consistency; we can also say the data is in doubt, meaning ambiguity, incompleteness and uncertainty due to data inconsistency.
scheduling; biometric data can also be considered, like fingerprints, handwriting and iris scans, etc. [1].

3. HEALTHCARE PATIENT RECORD CHALLENGES
In any hospital or private clinic, a big challenge is managing and analyzing the big data of any new or existing patient. The electronic record of a patient can be composed of structured and semi-structured data and instrumental recordings of health tests, while unstructured data consists of handwritten notes, patients' admission and relieving records, prescription records, etc. The data may also be web-based, machine-based, biometric-based, or generated by humans (e.g. Twitter, Facebook, sensors, remote devices, fingerprints, X-rays, scans, EMRs, mails, etc.). These conventional records and digital data are combined into Healthcare Big Data (HBD).
The execution of big data is the most challenging task; hence, most researchers have suggested installing big data tools on a standalone system. Big data is generally considered voluminous data, and its processing and execution should be on distributed nodes. Hence we need some knowledge of data analysis techniques to make healthcare decisions in a better way, which will help with active enhancement. For processing and analysis we have some open-source tools for distributed data processing [6].
Big data in healthcare science and industry is changing the patient and doctor healthcare system: because voluminous data is involved, it leads to more efficient and scalable healthcare, so it can be useful for every patient and hospital to handle each patient's record easily. Big data generally consists of huge, voluminous data, and its processing and execution are carried out on distributed nodes. We know that for the processing and execution of any voluminous data from a distributed system, Big-Data Analytic (BDA) tools are mostly recommended; without any doubt, these analysis tools are beneficial and useful for healthcare.

4. BIG-DATA ANALYTIC TOOL
In the healthcare industry, the big problem is the processing and execution of data; every hospital and clinic suffers in managing the big data of patients, and its processing and execution is a difficult task. Big data analytics tools therefore play an important role in processing it easily, in two different ways: centralized and distributed [1].
The BDA tools are naturally complex, with widespread programming and multi-skill applications combined under one roof, so they are not user-friendly, and the complexity of the process grows with the data itself. For this system, different types of data need to be combined, and then the raw data is transformed for multiple availability points.
Regarding how big data is supporting the entire healthcare industry, the industry actually benefits from these initiatives. In this paper we have focused on three areas of big data analytics, intended to provide a perspective on broad and popular research areas where the concept of big data analytics is currently being applied. These areas are:
1. Healthcare industry aspect with BDA,
2. Impact of Big-Data in Healthcare,
3. Opportunities and Applications of Big data in Healthcare.

4.1 Healthcare Industry Aspect with BDA
The healthcare industry is not only one of the largest industries, it is also one of the most complex in nature, with many patients constantly demanding better care and management. Big data in the healthcare industry, along with industry analytics, has made a mark on healthcare, but one important point to be noted here is the security concern, and it requires better programming skills, as end-user skills are not assumed. The healthcare industry has
some limitations with big data: security, privacy, ownership, and its standards are not yet proposed.

4.2 Impact of Big-Data in the Healthcare Industry
In the healthcare industry, big data has changed everything with respect to data processing and execution, including in hospitals and clinics. Here we have focused on some relevant connections to the information [1].

4.2.1 High Risk Patient Care
We know that healthcare costs and complications keep increasing for many patients in emergency care. Due to the higher cost it is not beneficial for poor patients, and many patients do not take advantage of it, so implementing change in this department will be an advantage and the hospital will work properly [1]. If all records are digitized, patient patterns can be identified more effectively and quickly; this will directly help reduce the time of check-ups and of applying the proper treatment, and will also help in monitoring patients at high risk of problems and ensuring more effective, customized treatment. Lack of data makes the creation of patient-centric care programs more difficult, so one can clearly understand why big data utilization is important in the healthcare industry. It clearly identifies and processes, with zero error in the execution flow, the patient check-up and the maintenance of the patient's record with all treatment details; hence big data analytics tools are needed in the healthcare industry [3].

4.2.2 Cost Reduction
Generally, we know that various hospitals, clinics and medical institutions face high levels of financial waste due to improper financial management; it happens because of overbooking of staff. They will also gain a financial advantage by backing health trackers as well as wearables, to make sure patients don't actually exceed their hospital stay. Patients could also benefit from this change, lowering their waiting time by having immediate access to staff and beds. The analysis will reduce staffing needs and bed shortages [4].

4.2.3 Patient Health Tracking
Identifying potential health problems before they develop and turn into aggravating issues is an important goal for all organizations functioning in the industry. Due to lack of data, the system has not always been able to avoid situations that could easily have been prevented otherwise. Patient health tracking is another strong benefit that comes with big data, as well as with Internet of Things tech resources [2].

4.2.4 Patient Engagement Could Be Enhanced
Through big data and analytics, an increase in patient engagement could also be obtained. Drawing consumers' interest towards wearables and various health-tracking devices would certainly bring a positive change in the healthcare industry, with a noticeable decrease in emergency cases potentially being reached. With more patients understanding the importance of these devices, physicians' jobs will be simplified, and an engagement boost could be obtained through big data initiatives, once again [2, 3].

4.3 Opportunities and Applications of Big-Data in Healthcare and Medical Industry
We have mentioned big data's role in the first and second sections of this paper; big data
Through predictive analysis, this specific can provide major support with all
problem can be solved, being far easier to different aspect in healthcare. We know
access help for effective allocation of staff that big data analytics (BDA) has gained
together with admission rate prediction [7, traction in genomics, clinical outcomes,
8]. Hospital investments will thus be fraud detection, personalized patient care
optimized, reducing the investment rate and pharmaceutical development; likewise
when necessary. The insurance industry there are so many potential applications in
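The admission-rate prediction mentioned in Section 4.2.2 can be illustrated with a minimal sketch: a least-squares trend line fitted to a week of (hypothetical) daily admission counts, with the predicted load converted into a staffing estimate. The numbers and the `patients_per_nurse` ratio are illustrative assumptions, not from the paper.

```python
import math

# Hypothetical daily admission counts for the past week.
ADMISSIONS = [40, 42, 41, 45, 47, 46, 50]

def fit_trend(values):
    """Ordinary least-squares line through (day index, count) points."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x  # (slope, intercept)

def predict_next(values):
    """Extrapolate the fitted line one day ahead."""
    slope, intercept = fit_trend(values)
    return intercept + slope * len(values)

def staff_needed(expected_admissions, patients_per_nurse=5):
    """Convert a predicted admission count into a nurse headcount."""
    return math.ceil(expected_admissions / patients_per_nurse)

expected = predict_next(ADMISSIONS)  # about 50.7 admissions tomorrow
nurses = staff_needed(expected)
```

A production system would of course use richer predictors (seasonality, department-level data), but the idea of turning a prediction into an allocation decision is the same.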
healthcare and medical science areas; some of these applications are given in Section 4.2 on the impact of big data in the healthcare industry. The following table shows some of the important application areas of big data in the healthcare industry and medical science.

Application Areas | Business Problems | Big Data Types
Healthcare | Fraud detection | Machine-generated, transaction data, human-generated
Healthcare | Genomics | Electronic health record, personal health record
Healthcare | Behavioral and patient sentiment data | Facebook, Twitter, LinkedIn, blogs, smartphones
Science and Technology | Utilities: predict power consumption | Machine-generated data

Table 1: Big data Applications in Healthcare

For healthcare-system big data, Hadoop with the MapReduce framework is most suitable for storing a wide range of healthcare data types, including electronic medical records, genomic data, financial data, claims data, etc. It has higher scalability, reliability and availability than a traditional database management system. The Hadoop MapReduce system will increase system throughput and can process huge amounts of data with proper execution, so it is helpful for the healthcare industry and medical science [5].
Big data analytics tools are widely considered for complex applications and are widely used in the healthcare industry to manage all types of data under one roof with a distributed architecture. In the following architecture we give a basic idea of the different incoming sources of big data, which can be considered as raw data: external, internal, multiple locations, multiple formats, and applications [5, 6]. Raw data from different sources can be transformed on middleware with Extract, Transform, Load (ETL) into a traditional format. With the transformed data, we use big data platforms and tools to process and analyze it. Then we use the actual big data analytics applications [2].
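The MapReduce processing model described above can be sketched in plain Python. This is a hedged illustration of the map/shuffle/reduce idea, not actual Hadoop: hypothetical claim records spread over two "nodes" are mapped to (diagnosis, 1) pairs and reduced to per-diagnosis totals.

```python
from collections import defaultdict

# Hypothetical claim records on two "nodes"; the (claim_id, diagnosis)
# layout is illustrative, not from the paper.
NODE_DATA = [
    [("C001", "diabetes"), ("C002", "cardiac")],   # node 1
    [("C003", "diabetes"), ("C004", "diabetes")],  # node 2
]

def map_phase(records):
    """Map: emit a (diagnosis, 1) pair for every claim record."""
    return [(diagnosis, 1) for _claim_id, diagnosis in records]

def reduce_phase(pairs):
    """Reduce: sum the counts emitted for each diagnosis key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

# The shuffle step simply gathers the mapper outputs from all nodes.
pairs = [p for node in NODE_DATA for p in map_phase(node)]
claim_counts = reduce_phase(pairs)
```

In real Hadoop, the map and reduce functions run on distributed nodes and the framework performs the shuffle, which is where the throughput gains described above come from.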
5. TECHNOLOGY AND METHODOLOGY PROGRESS IN BIG DATA
Big data plays an important role in every field through big data analytics tools, but here we have focused on the healthcare/medical science field. In the medical and healthcare field, a large amount of data is generated about patients' medical histories, symptoms, diagnoses and responses to treatments and therapies. Data mining is sometimes used here for finding interesting patterns in healthcare data with analytics tools, with the help of the Electronic Patient Record (EPR) of each patient [1].
In this architecture we have shown all the different areas of application covered by big data analytics tools, but here we have focused more on healthcare industry and medical science applications.

6. BIG DATA CHALLENGES IN HEALTHCARE
Because of the big data characteristics, i.e. the six Vs, it is difficult to store big amounts of data, and also difficult to search, visualize, retrieve and curate them. There are many challenges in healthcare applications; some of the major challenges in healthcare are listed below [4].
1. It is difficult to analyze and aggregate unstructured data from different hospitals and clinics, e.g. from EMRs, notes, scans, etc.
2. The data provided by many hospitals and clinics are not accurate in terms of quality factors, so they are sometimes difficult to analyze with BDA.
3. Analyzing genomic data is a computationally difficult task.
4. Data hackers can damage big data.
5. Information security is a big challenge in big data.
treatment and hospital activity with all doctor management, with prior appointments of every patient, department-wise. These challenges are mostly considered for future research on the role of Big Data Analytics tools in the healthcare industry and medical science, such as privacy-preserving data mining over sensor data and electronic patient records. In healthcare this type of change is necessary for sentiment analysis of big data in healthcare science with patient personalized data and behavioral data. From the researcher's point of view, big data is the best solution for the healthcare industry and medical science. We know that in future, data will be generated rapidly, so future-generation healthcare big data will apply to a vast range of applications in the healthcare industry and society. In this paper we have given many BDA tools for the healthcare industry as a solution, and it will establish efficient and cost-effective quality management using a data cluster manager.
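Challenges 4 and 5 above, and the privacy-preserving data mining mentioned as future work, can be illustrated with a minimal de-identification sketch: patient IDs are replaced with salted hashes and direct identifiers are dropped before analysis. The record fields and the salt are hypothetical, and this is only one small piece of privacy-preserving processing, not the paper's method.

```python
import hashlib

SALT = "example-salt"  # hypothetical; a real deployment keeps this secret

DIRECT_IDENTIFIERS = {"name", "phone", "address"}

def deidentify(record, salt=SALT):
    """Replace the patient ID with a salted hash and drop direct
    identifiers, keeping only the analysis-relevant fields."""
    cleaned = {k: v for k, v in record.items()
               if k not in DIRECT_IDENTIFIERS and k != "patient_id"}
    digest = hashlib.sha256((salt + record["patient_id"]).encode()).hexdigest()
    cleaned["pseudonym"] = digest[:16]  # shortened for readability
    return cleaned

record = {"patient_id": "P42", "name": "A. Patient",
          "phone": "555-0100", "diagnosis": "diabetes"}
safe = deidentify(record)
```

The same salt maps the same patient to the same pseudonym, so records can still be linked for analysis without exposing the original identifier.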
Access Control for Big Data Applications," in IEEE Cloud Computing, vol. 1, no. 3, pp. 65–71, Sept. 2014.
[8] A. McAfee, E. Brynjolfsson, T. H. Davenport, D. J. Patil, and D. Barton, "Big data: the management revolution," Harvard Business Review, vol. 90, no. 10, pp. 60–68, 2012.
conversation, which forces the user to type the same thing again and again. This can be cumbersome for customers and annoy them because of the effort required.
3) Due to fixed programs, chatbots can get stuck if an unknown query is presented to them. This can lead to customer dissatisfaction and result in losses. The multiple messaging can also be taxing for users and deteriorate the overall experience on the website.
Chatbots are installed with the motive of speeding up responses and improving customer interaction. However, due to limited data availability and the time required for self-updating, this process appears more time-consuming and expensive. Therefore, instead of attending to several customers at a time, chatbots appear confused about how to communicate with people.
Starbucks
Starbucks has developed an Android and iOS application to place an order for a favorite drink or snack. The order can be placed with the help of voice commands or text messaging.
Spotify
Spotify chatbots allow users to search for and listen to their favorite music. They also allow users to share music.
Whole Foods
Whole Foods is related to groceries and food material. It allows users to search for grocery items to shop for. It also provides interesting recipes for users to try.
Sephora
Sephora is associated with makeup material such as foundation, face primer, concealer, blush, highlighter, etc. Sephora chatbots also suggest makeup tutorials in which the user is interested.
Pizza Hut
The Pizza Hut chatbot can help a customer to order pizza with favorite toppings and carryout delivery. A customer can reorder a favorite pizza based on previous orders and can ask questions about current deals.
SnapTravel
SnapTravel helps users to book hotels according to their convenient location and timings. A customer can also get to know about current deals available at various hotels and resorts.
1-800 Flowers
1-800 Flowers helps customers to gift flowers and gifts to someone for events like a birthday, an anniversary or any special occasion. It also offers gift suggestions to customers.
These available chatbots are related to only a few categories. Our aim is to combine all the categories together in a single place and integrate them with chatbots for customer service.

3. PROPOSED WORK
The proposed system, E-commerce with Chatbots, will permit consolidation of customer login, browsing and purchasing of the available products, managing orders and payments, engaging customers with personalized marketing, and qualifying recommendations based on history. The main users of the project are customers who want to shop for various products and services online.
From the end-user perspective, the proposed system consists of these functional elements: a login module to access online products and services, browsing and searching products, purchasing and paying for products, and communicating with the chatbot for better product and offer recommendations.
According to the back-end logic, Natural Language Processing (NLP) will be used to understand messages sent by the user through the messaging platform. The chatbot will launch an action as an answer, with real-time information, based on machine learning algorithms such as supervised and unsupervised learning. The bot will improve with the increasing number of messages received.
The important features of the system are handling thousands of customers simultaneously, which will provide better satisfaction to customers. Also, it will be a virtual but personal
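The supervised back-end logic described above (classifying a user's message so the bot can launch an action) can be sketched with a tiny pure-Python Naive Bayes intent classifier. The training phrases and intent names are hypothetical, and a real system would use a proper NLP pipeline rather than whitespace tokenization.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled training messages: (text, intent).
TRAINING = [
    ("where is my order", "track_order"),
    ("track my package", "track_order"),
    ("has my order shipped", "track_order"),
    ("show me current deals", "deals"),
    ("any discounts today", "deals"),
    ("what offers are available", "deals"),
    ("recommend a gift", "recommend"),
    ("suggest something to buy", "recommend"),
]

def train(examples):
    """Count word frequencies per intent and intent frequencies."""
    word_counts = defaultdict(Counter)
    intent_counts = Counter()
    vocab = set()
    for text, intent in examples:
        intent_counts[intent] += 1
        for w in text.lower().split():
            word_counts[intent][w] += 1
            vocab.add(w)
    return word_counts, intent_counts, vocab

def classify(text, word_counts, intent_counts, vocab):
    """Pick the intent with the highest Laplace-smoothed log probability."""
    words = [w for w in text.lower().split() if w in vocab]
    total = sum(intent_counts.values())
    best, best_lp = None, float("-inf")
    for intent, n in intent_counts.items():
        lp = math.log(n / total)
        denom = sum(word_counts[intent].values()) + len(vocab)
        for w in words:
            lp += math.log((word_counts[intent][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = intent, lp
    return best

word_counts, intent_counts, vocab = train(TRAINING)
```

The predicted intent would then be mapped to a concrete action (a tracking lookup, a deals page, a recommendation call), and retraining on new messages is what lets the bot "improve with the increasing number of messages received".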
assistant for each customer. Similar to chatbots, e-commerce has become one of the preferred ways of shopping, as customers enjoy shopping online because of its easiness and convenience. The combination of an e-commerce site with AI-assisted chatbots will provide better customer service and profitable sales through personalized marketing.
The associated risks, such as privacy issues, can be handled with the help of authentication and authorization to provide strong access-control measures. Intellectual-property-related risks can be avoided by proper instructions for uploading data, with restrictions. Online security is the most important risk to be considered while developing the system, with regard to customers' credentials and the online products and services available. Data storage could be a risk associated with chatbots, as they store information to interact with the users. The best solution in this situation is to store the data in a secure place for a certain amount of time and to delete it after that.

3.1 User Classes and Characteristics
There are essentially three classes of users of the proposed system: the general users, the customers and the administrators. General users will be able to see and browse through the products available for purchase, but they cannot buy the products and services. Customers are the users of the e-commerce system who will be able to browse, purchase, pay and add products and services to the cart with the available functionality. Chatbots will help them to make a purchase decision based on various criteria and suggestions by the chatbot algorithms. Customers can also write reviews or feedback on the products and services they purchased.
The administrators will have advanced functionality to add, edit, update and delete products available in the inventory. The administrator will also be able to authorize and authenticate the users logged into the system. The administrator will be able to see daily sales and details about deliveries. He will be able to see the feedback or reviews given by the customers.

3.2 Assumptions and Dependencies
A few assumptions can be made while developing the proposed system:
A user has an active Internet connection or has access to view the website.
A user runs an operating system which supports Internet browsing.
The website will not violate any Internet ethics or cultural rules and won't be blocked by the telecom companies.
A user must have basic knowledge of English and computer functionalities.

3.3 Communication Interface
The system should use the HTTPS protocol for communication over the Internet, and the intranet communication will be through the TCP/IP protocol suite, as the users are connected to the system through an Internet interface. The user must have a web browser with a registered SSL certificate.

3.4 System Architecture
Systems design is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. Systems design could be seen as the application of systems theory to product development. There is some overlap with the disciplines of systems analysis, systems architecture and systems engineering.
The system architecture includes the modules used in the project and the relationships between them based on data flow and processing. AI-Assisted Chatbots for E-Commerce System consists of the following components:
General User
Customer
Administrator
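The three user classes of Section 3.1 and their differing capabilities can be sketched as a simple role-permission check. The role names come from the paper; the permission names are hypothetical labels for the actions it describes.

```python
# Role -> allowed actions, following Section 3.1: general users only browse,
# customers also purchase and review, administrators manage the catalog.
PERMISSIONS = {
    "general_user": {"browse"},
    "customer": {"browse", "purchase", "review", "chat"},
    "administrator": {"browse", "manage_inventory", "view_sales", "view_reviews"},
}

def can(role, action):
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS.get(role, set())
```

Authorization checks like this would run after authentication (the username/password login described above), so that, for example, only an authenticated administrator can reach inventory management.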
inventories which are to be purchased. The chatbot will use the search and purchase history, if the user is authenticated by the system, for product suggestions and recommendations.
A customer can modify his/her profile or account created in the proposed system. Before updating the profile, the user needs to authenticate that he/she is the original user of the account or profile by providing login credentials such as username and password. After this, the customer needs to provide the attributes to be updated, such as address, phone number, mail ID, credit card details, etc.
A customer can interact with the chatbot to make a purchase decision. The chatbot will interact with a customer based on the customer's browse or search history and purchase history. The chatbot will make use of the customer's record to suggest products from the inventory for the customer to purchase. Initially, chatbots will interact with a basic set of rules designed with machine learning algorithms. With more interaction with customers, chatbots will also improve at recommending products to customers based on their search and purchase history.
The e-commerce website's home page is designed with important features such as deals on specific products, best-seller products, discounts on products, various categories of products to choose from, a feature to log in to the system, and a way to interact with the chatbot. A user can browse through these categories to view various products such as clothes, accessories, beauty products, shoes, bags, etc.
The inventory is a collection of all categories of products. The administrator is allowed to add products to the inventory, separated by category. For example, Clothes is a category of products such as shirts, tops, t-shirts, jeans, skirts, party-wear dresses, etc. Similarly, various products of different categories can be added to the inventory by the administrator. The administrator needs to log in to the system with the help of login credentials such as username and password before managing the inventory.
The administrator can add new products with new or existing categories, along with descriptions and images, and add quantities of products already listed in the inventory. A customer who has logged into the system can search for and add these products to the shopping cart. The customer can also purchase the products in the inventory or the products added to the shopping cart.
A shopping cart is temporary storage to save the products which a customer may want to purchase in the future. The shopping cart is separate storage for each individual customer who has logged into the proposed system. The products added to the shopping cart can be purchased by the customer. To purchase a product, the customer needs to provide related information such as name, address, phone number, date of delivery, shipping type, payment method, and credit card details in the case of card payment. A customer can modify the shopping cart items: the customer can either purchase the products in the shopping cart or remove products from it.
The purchase history is recorded in the form of invoice reports, order reports and transaction reports. An invoice is generated after the customer purchases products from the inventory; it includes all the details of the purchase and the transactions made by the customer. It includes the details of the products purchased, such as price, quantity, product ID and product category, along with customer details such as name, delivery address, shipping type, date of delivery, phone number and payment method. All the details regarding purchase history are used by the chatbot to interact with the customer, based on his/her history, to suggest or recommend products. A customer can track the shipment of an order based on the invoices recorded or transactions saved to
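The shopping-cart behaviour described above (per-customer temporary storage, adding and removing items, and converting the cart into a purchase) can be sketched as a minimal class. The class and field names are illustrative, not from the paper.

```python
class ShoppingCart:
    """Temporary per-customer storage for products the customer may buy later."""

    def __init__(self, customer_id):
        self.customer_id = customer_id
        self.items = {}  # product_id -> quantity

    def add(self, product_id, quantity=1):
        """Add a product (or more of an existing one) to the cart."""
        self.items[product_id] = self.items.get(product_id, 0) + quantity

    def remove(self, product_id):
        """Drop a product from the cart if present."""
        self.items.pop(product_id, None)

    def purchase(self, price_list):
        """Compute the order total from a price table and empty the cart."""
        total = sum(price_list[p] * q for p, q in self.items.items())
        self.items.clear()
        return total

cart = ShoppingCart("c123")
cart.add("shirt", 2)
cart.add("jeans")
total = cart.purchase({"shirt": 10.0, "jeans": 25.0})
```

In the full system the `purchase` step would additionally collect the delivery and payment details listed above and emit an invoice record for the purchase history.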
his/her profile. The online tracking of an order can help the customer to locate his/her product.
Sales and marketing involve the techniques used to suggest, through advertisements, that a customer purchase particular products. This is done based on the keywords searched by customers for the products they want to purchase. Advertising products is a way of marketing to increase product sales. All these things are managed by the administrator to maximize sales of products with the help of marketing.

4. PERFORMANCE ANALYSIS
The performance of the proposed system can be analyzed based on a few parameters. These parameters can be used to measure the performance of the system in comparison with the existing system. The parameters that can be used for this analysis are:
Human-Machine Interaction
To provide better interaction between human and machine, AI concepts such as Artificial Neural Networks (ANN), Natural Language Processing (NLP) and machine learning algorithms are used in the proposed system. Human users will interact with the system through the chatbot, which is a software program designed to communicate with the user. These machine learning algorithms will help the chatbot to generate responses using supervised and unsupervised algorithms. This will enhance the performance of the proposed system as compared to the existing system, since fixed programs are used for the chatbot in the existing system.
Better Recommendations to the User
Recommendations are the suggestions provided to the user based on the search or browse history and purchase history of that particular user. The recommendations provided by the chatbot can be in the form of product recommendations with links and updates on the latest products. To facilitate customer service and support, recommendations will play a more important role by providing personalized suggestions. This will help customers to make purchase decisions, which will increase profitable sales through personalized marketing. It will improve the performance of the proposed system due to personal recommendations.
Use of AI Concepts
AI concepts such as Artificial Neural Networks (ANN), Natural Language Processing (NLP), and Machine Learning (ML) algorithms are used in the proposed system. Machine learning algorithms such as supervised and unsupervised algorithms will improve the performance of the proposed system. Linear regression is capable of predictive modelling and of minimizing the risk of failure; it is used for predicting responses with better accuracy and makes use of the relationship between input values and output values. The Naïve Bayes algorithm is used on large data sets for ranking or indexing purposes; it will help to rank the products based on customer reviews. Semi-supervised algorithms will help to handle the combination of both labelled and unlabelled data. NLP is useful for letting a machine understand human language and generate responses in human language. It will make use of elements of Named Entity Recognition, Speech Recognition, Sentiment Analysis and OCR. All these concepts will help to enhance the performance of the proposed system.

5. CONCLUSION
The Internet has become a major resource in modern business; thus electronic shopping has gained significance not only from the entrepreneur's but also from the customer's point of view. For the entrepreneur, electronic shopping generates new business opportunities and, for the customer, it makes comparative shopping possible. As per a survey, most consumers of online stores are impulsive and usually make a decision to stay on a site within the first few seconds. Hence we have designed the project
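The personalized recommendations based on purchase history can be sketched with a minimal "customers who bought X also bought Y" co-occurrence scorer. This is only one illustrative way to use purchase history, not the paper's algorithm, and the purchase data is hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories: customer -> set of product IDs.
HISTORY = {
    "alice": {"shirt", "jeans", "shoes"},
    "bob": {"jeans", "shoes", "bag"},
    "carol": {"shirt", "bag"},
    "dave": {"jeans", "shoes"},
}

def recommend(customer, history, top_n=2):
    """Rank products the customer has not bought by how often they co-occur
    with the customer's own purchases in other customers' baskets."""
    owned = history[customer]
    scores = Counter()
    for other, basket in history.items():
        if other == customer:
            continue
        overlap = len(owned & basket)
        if overlap:
            for product in basket - owned:
                scores[product] += overlap
    return [p for p, _ in scores.most_common(top_n)]

suggestions = recommend("carol", HISTORY)
```

The chatbot would surface such suggestions during a conversation; richer signals (browse history, review sentiment) would be added as further scoring terms.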
to provide the user with easy navigation, retrieval of data and the necessary feedback as much as possible.
As we have seen in this project, the process of creating a user-friendly and straightforward platform that facilitates the administrator's job is one filled with complexity. From understanding user requirements to system design and finally system prototyping and finalization, every step requires in-depth understanding and commitment towards achieving the objective of the project.
So this is an efficient and effective way for customers to purchase products online, with the help of a chatbot, within a few steps. With the help of the e-commerce website, sellers can reach a larger audience, and with the help of chatbots, sales can be increased by personal interaction with the users. In this way, this application provides an optimized solution with better availability, maintainability and usability.

REFERENCES
[1] Adhitya Bhawiyuga, M. Ali Fauzi, Eko Sakti Pramukantoro, Widhi Yahya, "Design of E-Commerce Chat Robot for Automatically Answering Customer Question", University of Brawijaya, Malang, Republic of Indonesia, 2017.
[2] Anwesh Marwade, Nakul Kumar, Shubham Mundada, and Jagannath Aghav, "Augmenting E-Commerce Product Recommendations by Analyzing Customer Personality", 2017.
[3] Bayu Setiaji, Ferry Wahyu Wibowo, "Chatbot Using A Knowledge in Database", 2017.
[4] Abdul-Kader, S. A., & Woods, J., "Survey on chatbot design techniques in speech conversation systems", International J. Adv. Computer Science Application, 2015.
[5] Godson Michael D'silva, Sanket Thakare, Sharddha More, and Jeril Kuriakose, "Real World Smart Chatbot for Customer Care using a Software as a Service (SaaS) Architecture", 2017.
[6] S. J. du Preez, M. Lall, S. Sinha, "An Intelligent Web-Based Voice Chat Bot", 2009.
[7] Cyril Joe Baby, Faizan Ayyub Khan, Swathi J. N., "Home Automation using IoT and a Chatbot using Natural Language Processing", 2017.
[8] Ellis Pratt, "Artificial Intelligence and Chatbots in Technical Communication", 2017.
[9] Bayan Abu Shawar, Arab Open University, Information Technology Department, Jordan, "Integrating Computer Assisted Learning Language Systems with Chatbots as Conversational Partners", 2017.
[10] Aditya Deshpande, Alisha Shahane, Darshana Gadre, Mrunmayi Deshpande, Prof. Dr. Prachi M. Joshi, "A Survey Of Various Chatbot Implementation Techniques", International Journal of Computer Engineering and Applications, Volume XI, May 2017.
[11] Sameera A. Abdul-Kader, Dr. John Woods, "Survey on Chatbot Design Techniques in Speech Conversation Systems", International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 6, 2015.
[12] M. J. Pereira and L. Coheur, "Just.Chat - a platform for processing information to be used in chatbots", 2013.
[13] A. S. Lokman and J. M. Zain, "One-Match and All-Match Categories for Keywords Matching in Chatbot", American Journal of Applied Sciences, vol. 7, 2010.
[14] S. Ghose and J. J. Barua, "Toward The Implementation of A Topic Specific Dialogue Based Natural Language Chatbot As An Undergraduate Advisor", Proc. IEEE 2013 International Conference on Informatics, Electronics & Vision (ICIEV), 2013.
[15] R. Kar and R. Haldar, "Applying Chatbots to the Internet of Things: Opportunities and Architectural Elements".
[16] McTear, Michael, Zoraida Callejas, and David Griol, "Creating a Conversational Interface Using Chatbot Technology", Springer International Publishing, 2016.
DATA MINING AND INFORMATION RETRIEVAL
2. MOTIVATION
Generally, the cost of obtaining such social
media data is trivial. But processing such
massive databases to extract travel
3. LITERATURE SURVEY
through the spreading of risk and, possibly, pooling of funds.

4. PROPOSED WORK
In the proposed system, we address the limitations of previous methods of vague marketing in tourism and make targeting efficient, with a focused approach for data acquisition and marketing using Twitter data. With the help of Twitter data, we can find tourists more efficiently. The use of dynamic Twitter data makes marketing optimal in terms of time and targeting. The system uses recently updated data for client search, which makes it better.

DATA FLOW DIAGRAM - LEVEL 0
Fig. Data Flow Diagram - Level 0

5. SUMMARY AND CONCLUSION
This paper focuses on how Twitter data can be used in analysing the individual-level travel behaviour of users. This framework enables more applications of Twitter and other social media data for client search for travel-industry, management, sales and operations purposes. With the help of Twitter posts, it becomes easy to track tourists around the needed location. It was found that tweets are mainly associated with the ease of the tourists and the facilities provided to them. This shows the usefulness of Twitter data for analysing the behaviour of tourists in cities. The data we obtain from Twitter comes in huge amounts at a time, which yields broad insight. The approach is more time-efficient, since we can target a highly scalable area as required in a single go. Twitter data provides various pieces of information about its users which are difficult to obtain otherwise. This social media data enables efficient target marketing with variable parameters as the need arises.

6. FUTURE WORKS
1. In-home activity data: If an activity is scheduled to happen at home, one out-of-home activity is cancelled, which results in fewer trips on the transport network; this is of great importance to travel-demand modellers and planners.
2. Tour formation: Tour formation requires collecting information about trips. Twitter users often provide information about their daily activities, which helps to extract information about the location, time and purpose of different activities. Using Twitter data for modelling tour-formation behaviour can significantly complement the models that are developed using household travel surveys.
3. Future activities: When Twitter data is extracted using different techniques, it becomes possible to recognize potential future activities. In other words, based on his/her tweet about the place he/she wants to visit, a user is likely to be at that location at a time to be determined. This helps to manage future tours and their activities.

7. ACKNOWLEDGEMENTS
With due respect and gratitude we would like to take this opportunity to thank our internal guide, PROF. G. S. PISE, for giving us all the help and guidance we needed. We are really grateful for his kind support. He has always encouraged us and
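The targeting step of the proposed work (selecting Twitter users whose posts suggest travel to a given place) can be sketched as a simple filter. The tweet records, keyword list and matching rule are hypothetical simplifications; a real system would pull tweets from the Twitter API and use proper NLP rather than keyword matching.

```python
# Hypothetical, simplified tweet records.
TWEETS = [
    {"user": "u1", "text": "Loving the beaches in Goa this week!", "place": "Goa"},
    {"user": "u2", "text": "Traffic is terrible today", "place": "Pune"},
    {"user": "u3", "text": "Planning a trip to Goa next month", "place": "Mumbai"},
]

TRAVEL_KEYWORDS = {"trip", "travel", "visit", "vacation", "beaches", "hotel"}

def target_tourists(tweets, place):
    """Select users whose tweets mention travel keywords and the target place,
    either as the tagged location or in the tweet text."""
    matches = []
    for t in tweets:
        words = set(t["text"].lower().replace("!", "").split())
        mentions_place = t["place"] == place or place.lower() in words
        if mentions_place and words & TRAVEL_KEYWORDS:
            matches.append(t["user"])
    return matches

users = target_tourists(TWEETS, "Goa")
```

The resulting user list is what the marketing side of the system would act on, and the same filter re-run on fresh tweets is what makes the targeting "dynamic".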
remove the surface element. The spectral and texture features are used as two kinds of low-level features, based on which the high-level visual words are built by the proposed technique. We use the entropy-rate superpixel segmentation method to segment the hyperspectral image into patches that preserve the homogeneity of regions. The patches are viewed as documents in the BOV (bag-of-visual-words) model. Then k-means clustering is executed to group pixels and build the codebook. Finally, the BOV representation is constructed from the statistics of the occurrences of visual words in each patch. Experiments on real data demonstrate that the proposed method is comparable to several state-of-the-art methods.
2. Automated Patent Classification Using Word Embedding
Patent classification is the task of assigning a unique code to a patent, where the assigned code is used to group patents with a similar latent topic into the same class. This paper presents a patent classification technique based on word embeddings and a long short-term memory network to classify patents down to the subgroup IPC level. The experimental results show that the classification technique achieves 63% accuracy at the subgroup level.
3. Deep Visual-Semantic Alignments for Generating Image Descriptions
A model that generates natural-language descriptions of images and their regions. The approach leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between language and visual data. The alignment model is based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding. The authors then describe a Multimodal Recurrent Neural Network architecture that uses the inferred alignments to learn to generate novel descriptions of image regions. They show that the alignment model produces state-of-the-art results in retrieval experiments on the Flickr8K, Flickr30K and MSCOCO datasets, and that the generated descriptions significantly outperform retrieval baselines on both full images and on a new dataset of region-level annotations.
4. Latent Semantic Sparse Hashing for Cross-Modal Similarity Search
A novel hashing technique, referred to as Latent Semantic Sparse Hashing (LSSH), for large-scale cross-modal similarity search between images and texts. Specifically, it uses Sparse Coding to capture the high-level salient structures of images, and Matrix Factorization to extract latent concepts from texts. These high-level semantic features are then mapped to a joint abstraction space. The search performance can be improved by merging multiple latent semantic representations from heterogeneous data. The authors propose an iterative procedure which is highly efficient at exploring the relationship between multi-modal representations and bridging the semantic gap between heterogeneous data in the latent semantic space. They conduct extensive experiments on three multi-modal datasets consisting of images and texts. The superior and stable performance of LSSH verifies its effectiveness compared against several state-of-the-art cross-modal hashing techniques.
5. Click-Through-Based Cross-View Learning for Image Search
This work explores the problem of directly learning the multi-view distance between a textual query and an image by using both click data and subspace learning techniques. The click data represents the click relations between queries and images, while the subspace learning aims to learn a latent common subspace between various
perspectives. We have proposed a novel learning results and long haul collected
navigate based cross-see figuring out how information into the goal work.
to take care of the issue in a guideline way. Examinations on picture sound dataset
In particular, we utilize two diverse direct have shown the prevalence of our strategy
mappings to extend printed inquiries and more than a few existing calculations.
visual pictures into an idle subspace. The
mappings are found out by together 3. EXISTING SYSTEM APPROACH
limiting the separation of the watched Alongside the expanding necessities,
question picture matches on the navigate lately, cross-media look errands have
bipartite chart and safeguarding the inborn gotten extensive consideration. Since,
structure in unique single view. In every methodology having diverse
addition, we make symmetrical portrayal strategies and correlational
presumptions on the mapping frameworks. structures, an assortment of techniques
At that point the mappings can be gotten examined the issue from the part of
productively through curvilinear inquiry. learning relationships between various
We take l2 standard between the modalities. The effectiveness of hashing-
projections of inquiry and picture in the based strategies, there likewise exists a
inactive subspace as the separation rich profession cantering the issue of
capacity to quantify the importance of a mapping multi-modular high-dimensional
combine of (inquiry, picture). information to low-dimensional hash
codes, for example, Latent semantic
6. Boosting cross-media retrieval via inadequate hashing (LSSH), discriminative
visual-auditory feature analysis and coupled word reference hashing (DCDH),
relevance feedback Cross-see Hashing (CVH, etc. In the
existing system, user can search the data
0Diverse kinds of media information on flicker, user can get result the relevant
express abnormal state semantics from as well as irrelevant images.Irrelelent data
various angles. Step by step instructions to is main drawback of existing system as
learn far reaching abnormal state well as in existing data search the images
semantics from various sorts of using normal text only so time require for
information and empower effective cross- searching is more.
media recovery turns into a rising hot
issue. There are rich relationships among 4. PROPOSED SYSTEM APPROACH
heterogeneous low-level media content,
which makes it trying to inquiry cross-
media information adequately. In this
paper, we propose another cross-media
recovery strategy dependent on present
moment and long-haul significance input.
Our technique for the most part centres
around two run of the mill kinds of media
information, i.e. picture and sound.
Initially, we assemble multimodal
portrayal by means of factual authoritative Fig.1 Block Diagram of Proposed System
connection amongst picture and sound We propose a novel hashing strategy,
element frameworks, and characterize called semantic cross-media hashing
cross-media separate measurement for (SCMH), to play out the close copy
likeness measure; at that point we propose recognition and cross media recovery
advancement technique dependent on assignment. We propose to utilize a lot of
importance input, which melds momentary word embeddings to speak to printed data.
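As an illustration of how a text query can be matched against tagged images via word embeddings, here is a minimal sketch. The three-dimensional vectors, tag lists and file names below are invented for illustration; real Skip-gram embeddings would be learned from a corpus.

```python
import math

# Toy word-embedding table (in practice, Skip-gram vectors would be
# learned from a corpus; these 3-d vectors are made up for illustration).
EMBEDDINGS = {
    "dog":     [0.9, 0.1, 0.0],
    "puppy":   [0.8, 0.2, 0.1],
    "car":     [0.0, 0.9, 0.3],
    "vehicle": [0.1, 0.8, 0.4],
}

def text_vector(text):
    """Average the embeddings of the known words in a text."""
    vecs = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rank_images(query, tagged_images):
    """Rank images by similarity between the query vector and tag vectors."""
    qv = text_vector(query)
    scored = [(cosine(qv, text_vector(tags)), name) for name, tags in tagged_images]
    return [name for _, name in sorted(scored, reverse=True)]

images = [("img1.jpg", "dog puppy"), ("img2.jpg", "car vehicle")]
print(rank_images("puppy", images))  # the dog image ranks first
```

Averaging word vectors is the simplest composition choice; the Fisher-kernel representation described in the text is a richer fixed-length encoding of the same idea.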
The proposed framework consists of two modules, client and administrator. The administrator can add images and perform other functions, and the client can search for an image using text as well as another image. For text-based search a word embedding algorithm is used, and for image-based search a feature descriptor algorithm is used. The main drawback of the existing system is that a search over Flickr data returns irrelevant as well as relevant results; our system removes this drawback and returns only relevant images. The Fisher kernel framework is incorporated to represent both textual and visual data with fixed-length vectors, and a deep belief network is proposed to map the Fisher vectors of the different modalities. We evaluate the proposed SCMH technique on three commonly used data sets. The proposed system can also search data by hash value; the MD5 algorithm is used to compute the hashes. Different approaches are proposed to capture the similarities between text and images: the Skip-gram algorithm is used for word embedding, the SIFT descriptor to extract key points from images, and the MD5 algorithm for hash-code generation. SCMH achieves better results than state-of-the-art methods for various lengths of hash codes. For a text query, the word embedding algorithm computes a feature vector; the images are then mapped and ranked, and the user sees accurate results. For an image query, the feature descriptor algorithm computes the feature vector, followed by the same mapping and ranking. Images can also be searched by MD5 hash value. Thus several methods are used for image search, and the ranking of the retrieved images is displayed to the user.

5. CONCLUSION
In this work, we propose a new hashing method, SCMH, for near-duplicate detection and cross-media retrieval. We propose to use a series of word embeddings to represent textual information. The proposed system removes the drawback of Flickr search: the user can search data using text as well as images, and can also search images using hash values. The Fisher kernel framework is built to represent both textual and visual information with fixed-length vectors, and a deep belief network maps the Fisher vectors of the different modalities. We evaluate SCMH on three commonly used data sets, where it outperforms state-of-the-art methods with different lengths of hash codes. On the MIR Flickr data set, SCMH's improvements over LSSH, which achieves the best results among the compared methods, are 10.0 and 18.5 percent for the Text-to-Image and Image-to-Text tasks, respectively. Experimental results demonstrate the effectiveness of the proposed cross-media retrieval method. The user can also see ranked images according to the search. In future work, image-based search could be extended to other social media such as Facebook and Twitter.

6. ACKNOWLEDGMENT
Authors are thankful to Faculty of Engineering and Technology (FET), Savitribai Phule Pune University, Pune for providing the facility to carry out the research work.

REFERENCES
[1] Liangrong Zhang, Kai Jiang, Yaoguo Zheng, Jinliang An, Yanning Hu, Licheng Jiao,
"Spatially Constrained Bag-of-Visual-Words for Hyperspectral Image Classification," International Research Center for Intelligent Perception and Computation, Xidian University, Xi'an 710071, China, 2016.
[2] Mattyws F. Grawe, Claudia A. Martins, Andreia G. Bonfante, "Automated Patent Classification Using Word Embedding," 16th IEEE International Conference on Machine Learning and Applications, Federal University of Mato Grosso, Cuiaba, Brazil, 2017.
[3] A. Karpathy and L. Fei-Fei, "Deep visual-semantic alignments for generating image descriptions," in Proc. IEEE Conf. Comput. Vis. Pattern Recog., Boston, MA, USA, Jun. 2015, pp. 3128-3137.
[4] J. Zhou, G. Ding, and Y. Guo, "Latent semantic sparse hashing for cross-modal similarity search," in Proc. 37th Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2014, pp. 415-424.
[5] Y. Pan, T. Yao, T. Mei, H. Li, C.-W. Ngo, and Y. Rui, "Click-through-based cross-view learning for image search," in Proc. 37th Int. ACM SIGIR Conf. Res. Develop. Inf. Retrieval, 2014, pp. 717-726.
[6] H. Zhang, J. Yuan, X. Gao, and Z. Chen, "Boosting cross-media retrieval via visual-auditory feature analysis and relevance feedback," in Proc. ACM Int. Conf. Multimedia, 2014, pp. 953-956.
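The MD5-based hash lookup used for duplicate search in the proposed system can be sketched as follows. Note that an MD5 index only finds byte-identical duplicates, which is why the feature-descriptor path is still needed for near-duplicates. The index layout and image names here are assumptions.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Compute the MD5 digest used as the lookup key."""
    return hashlib.md5(data).hexdigest()

# Hypothetical in-memory index: hash -> image identifier.
index = {}

def add_image(name: str, content: bytes):
    index[md5_hex(content)] = name

def find_duplicate(content: bytes):
    """Return the stored image with identical byte content, if any."""
    return index.get(md5_hex(content))

add_image("sunset.jpg", b"\x89PNG...sunset-bytes")
print(find_duplicate(b"\x89PNG...sunset-bytes"))  # sunset.jpg
print(find_duplicate(b"other-bytes"))             # None
```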
memory; consequently, the more HUIs the algorithms generate, the more resources they consume. Conversely, if the threshold is too high, no HUI will be found.
1.1 Background
Such algorithms frequently generate a huge set of HUIs, and their mining performance degrades as a consequence. Furthermore, if the dataset contains long transactions or low thresholds are set, this condition may become worse. The huge number of HUIs poses a challenging problem for mining performance: the more HUIs the algorithm generates, the higher the processing time it consumes. Efficient algorithms are therefore needed to overcome these challenges. Top-k mining also does not work with parallel mining.
1.2 Motivation
1. Setting the value of k is more intuitive than setting the threshold, because k represents the number of itemsets that users want to find, whereas choosing the threshold depends primarily on database characteristics, which are often unknown to users.
2. The min-utility value is not given in advance in top-k HUI mining. In traditional HUI mining the search space can be pruned efficiently by using a given min-utility threshold value. In the scenario of the TKO and TKU algorithms, the min-utility threshold value is provided in advance.
1.3 Aim & Objective
1. The execution time of the TKO algorithm is low and it is an efficient algorithm, but its result can be incorrect and contain garbage values. The execution time of the TKU algorithm is higher, but its result is correct. It is a very challenging issue to make the hybrid algorithm (TKO with TKU) more efficient than the TKU algorithm alone; the time factor is very important here.
2. Need to achieve significantly better performance.
3. The hybrid algorithm obtains HUIs with fixed parameters of rating, views and number of buys.

2. LITERATURE SURVEY
1. "Efficient tree structures for high-utility pattern mining in incremental databases"
Recently, high utility pattern (HUP) mining has become one of the most important research issues in data mining due to its ability to consider the non-binary frequency values of items in transactions and different profit values for every item. On the other hand, incremental and interactive data mining provide the ability to use previous data structures and mining results in order to reduce unnecessary calculations when a database is updated, or when the minimum threshold is changed. In this paper, we propose three novel tree structures to efficiently perform incremental and interactive HUP mining. The first tree structure, the Incremental HUP Lexicographic Tree (IHUPL-Tree), is arranged according to an item's lexicographic order. It can capture the incremental data without any restructuring operation. The second tree structure is the IHUP Transaction Frequency Tree (IHUPTF-Tree), which obtains a compact size by arranging items according to their transaction frequency (descending order). To reduce the mining time, the third tree, the IHUP Transaction-Weighted Utilization Tree (IHUPTWU-Tree), is designed based on the TWU value of items in descending order. Extensive performance analyses show that our tree structures are very efficient and scalable for incremental and interactive HUP mining.
2. "Mining high-utility item sets"
Traditional association rule mining algorithms only generate a large number of highly frequent rules, but these rules do not provide useful answers for what the high utility rules are. We develop a novel idea of top-K objective-directed data mining, which focuses on mining the top-K high utility closed patterns that directly support a given business objective. To association mining, we add the concept of utility to capture highly desirable statistical patterns and present a level-wise item-set mining algorithm. With both positive and negative utilities, the anti-monotone
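The notion of itemset utility that HUI mining builds on can be made concrete with a small sketch; the profit table, transactions and threshold below are toy values.

```python
# Each transaction maps item -> purchased quantity; PROFIT gives unit profit.
PROFIT = {"a": 5, "b": 2, "c": 1}
TRANSACTIONS = [
    {"a": 1, "b": 2},   # utility of {a} here is 5, of {b} is 4
    {"a": 2, "c": 3},   # utility of {a} here is 10, of {c} is 3
    {"b": 4, "c": 1},
]

def utility(itemset, txn):
    """Utility of an itemset in one transaction (0 if not fully contained)."""
    if not all(i in txn for i in itemset):
        return 0
    return sum(PROFIT[i] * txn[i] for i in itemset)

def total_utility(itemset):
    """Sum the itemset's utility over all transactions containing it."""
    return sum(utility(itemset, t) for t in TRANSACTIONS)

def is_hui(itemset, min_util):
    """An itemset is a high-utility itemset if its total utility meets min_util."""
    return total_utility(itemset) >= min_util

print(total_utility({"a"}))     # 15
print(is_hui({"a"}, 10))        # True
print(is_hui({"b", "c"}, 12))   # total utility is 9, so False
```

The trade-off described above falls out directly: lowering `min_util` lets many more itemsets pass the `is_hui` test, inflating memory and runtime, while raising it too far filters everything out.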
add products or items, update a product, and view the stock details.
Module 2 - User (Customer)
The customer can purchase a number of items. The history of all purchased items is stored in the transaction database.
Module 3 - Construction of UP-Tree
In the UP-Tree, a dynamic table is generated by the algorithms. Mainly, UP-Growth is used to obtain the PHUI (potential high-utility itemset) set.
Module 4 - TKO and TKU Algorithms
This module combines the TKO and TKU algorithms: first the TKO (top-k in one phase) algorithm is called, then the output of TKO is given as the input to TKU (top-k in multiple utility phases), and the actual result is the TKU result.
[Figure: block diagram with components Data Base, Parallel Pattern Algorithms, TKO Algorithm with k value, and Result with k value]
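The two-phase pipeline of Module 4, with TKO output feeding TKU, can be sketched with simplified stand-ins for the two phases (these are not the full TKO/TKU algorithms; the candidate itemsets and their estimated/exact utilities are toy values):

```python
# Simplified stand-ins: phase one keeps likely top-k candidates by an
# estimated utility, phase two re-ranks survivors by exact utility.

def tko_phase(candidates, k):
    """Keep the k itemsets with the highest *estimated* utility."""
    ranked = sorted(candidates, key=lambda c: c["estimated"], reverse=True)
    return ranked[:k]

def tku_phase(candidates, k):
    """Re-rank the surviving candidates by their *exact* utility."""
    ranked = sorted(candidates, key=lambda c: c["exact"], reverse=True)
    return [c["itemset"] for c in ranked[:k]]

def hybrid_top_k(candidates, k):
    # TKO output (with a wider margin, here 2*k) feeds directly into TKU.
    return tku_phase(tko_phase(candidates, 2 * k), k)

candidates = [
    {"itemset": ("a",),     "estimated": 20, "exact": 15},
    {"itemset": ("a", "b"), "estimated": 18, "exact": 17},
    {"itemset": ("b", "c"), "estimated": 9,  "exact": 9},
]
print(hybrid_top_k(candidates, 1))  # [('a', 'b')]
```

The point of the composition is that the cheap first phase prunes the candidate set so the exact (and slower) second phase only verifies a short list.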
classifier, and in the proposed model they used a seeding algorithm and a pragmatic classifier to detect emoticon-based sarcasm. Edwin Lunando and Ayu Purwarianti [6] analyzed the following: to solve the high computational overhead and low classification efficiency of the KNN algorithm, a text feature vector representation method based on information gain and non-negative matrix factorization is proposed.

4. GAP ANALYSIS
Previously, sarcasm detection was done on the basis of a fixed dataset. The dataset was saved and then further processing started. Because it was stored, the dataset could be manipulated easily. In our process of detecting sarcasm, by contrast, the detection is done on real-time data. The data is not saved permanently: the moment you refresh, the data is refreshed from memory, i.e. new data is shown. The data is saved only temporarily; temporary storage is done through MongoDB. As the data is not being saved, manipulation of the data is impossible, and the results are more accurate and unbiased.

paradigms are at their utmost. To support our motivations, we have described some areas where Big Data can play an important role. In healthcare scenarios, medical practitioners gather massive volumes of data about patients: medical history, medications, and other details. The above-mentioned data are also accumulated in drug-manufacturing companies. The nature of these data is very complex, and sometimes the practitioners are unable to establish a relationship with other information, which results in the loss of important information. Employing advanced analytic techniques for organizing and extracting useful information from Big Data results in personalized medication, and advanced Big Data analytic techniques give insight into the hereditary causes of disease. In the same way, data is also generated for product reviews across various services, but sometimes we have to differentiate between fake reviews and genuine reviews as input to our decision-making process in business.

6. CONCLUSION AND FUTURE WORK
Text mining is not used for unstructured data.

B. "Grand challenges in clinical decision support" Author-Sittig D, Wright A, Osheroff J, et al.
There is a pressing need for high-quality, effective means of designing, developing, presenting, implementing, evaluating, and maintaining all types of clinical decision support capabilities for clinicians, patients and consumers. Using an iterative, consensus-building process we identified a rank-ordered list of the top 10 grand challenges in clinical decision support. This list was created to educate and inspire researchers, developers, funders, and policy-makers. The list of challenges, in order of the importance that they be solved if patients and organizations are to begin realizing the fullest benefits possible of these systems, consists of: improve the human-computer interface; disseminate best practices in CDS design, development, and implementation; summarize patient-level information; prioritize and filter recommendations to the user; create an architecture for sharing executable CDS modules and services; combine recommendations for patients with co-morbidities; prioritize CDS content development and implementation; create internet-accessible clinical decision support repositories; use free text information to drive clinical decision support; mine large clinical databases to create new CDS.
Disadvantage: Identification of solutions to these challenges is critical if clinical decision support is to achieve its potential and improve the quality, safety, and efficiency of healthcare services.

C. "Using Electronic Health Records for Surgical Quality Improvement in the Era of Big Data" Author-Anderson J E, Chang D C.

D. Many healthcare facilities enforce security on their electronic health records (EHRs) through a corrective mechanism: some staff nominally have almost unrestricted access to the records, but there is a strict ex post facto audit process for inappropriate accesses, i.e., accesses that violate the facility's security and privacy policies. This process is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred. This motivates automated approaches based on machine learning using historical data. Previous attempts at such a system have successfully applied supervised learning models to this end, such as SVMs and logistic regression. While providing benefits over manual auditing, these approaches ignore the identity of the users and patients involved in record access. Therefore, they cannot exploit the fact that a patient whose record was previously involved in a violation has an increased risk of being involved in a future violation. Motivated by this, in this paper, we propose a collaborative filtering inspired approach to predicting inappropriate accesses. Our solution integrates both explicit and latent features for staff and patients, the latter acting as a personalized "fingerprint" based on historical access patterns. The proposed method, when applied to real EHR access data from two tertiary hospitals and a file-access dataset from Amazon, shows not only significantly improved performance compared to existing methods, but also provides insights as to what indicates inappropriate access.

E. "Data Mining Techniques into Telemedicine Systems" Author-Gheorghe M, Petre R
Providing care services through telemedicine has become an important part of the medical development process, due to the latest innovation in the information and
computer technologies. Meanwhile, data mining, a dynamic and fast-expanding domain, has improved many fields of human life by offering the possibility of predicting future trends and helping with decision making, based on the patterns and trends discovered. The diversity of data and the multitude of data mining techniques provide various applications for data mining, including in the healthcare organization. Integrating data mining techniques into telemedicine systems would help improve the efficiency and effectiveness of healthcare organizations' activity, contributing to the development and refinement of the healthcare services offered as part of the medical development process.

F. "Query recommendation using query logs in search engines" Author-R. Baeza-Yates, C. Hurtado, and M. Mendoza
In this paper we propose a method that, given a query submitted to a search engine, suggests a list of related queries. The related queries are based on previously issued queries and can be issued by the user to the search engine to tune or redirect the search process. The proposed method is based on a query clustering process in which groups of semantically similar queries are identified. The clustering process uses the content of the historical preferences of users registered in the query log of the search engine. The method not only discovers the related queries but also ranks them according to a relevance criterion. Finally, we show with experiments over the query log of a search engine the effectiveness of the method.

G. "Data Mining Applications In Healthcare Sector: A Study" Author-M. Durairaj, V.
In this paper, our system focuses on comparing a variety of techniques, approaches and different tools and their impact on the healthcare sector. The goal of a data mining application is to turn data, that is, facts, numbers, or text which can be processed by a computer, into knowledge or information. The main purpose of data mining applications in healthcare systems is to develop an automated tool for identifying and disseminating relevant healthcare information. This paper aims to make a detailed study report of different types of data mining applications in the healthcare sector and to reduce the complexity of the study of healthcare data transactions. It also presents a comparative study of different data mining applications, techniques and different methodologies applied for extracting knowledge from databases generated in the healthcare industry. Finally, the existing data mining techniques with data mining algorithms and their application tools which are most valuable for healthcare services are discussed in detail.

H. "Detecting Inappropriate Access to Electronic Health Records Using Collaborative Filtering" Author-Aditya Krishna Menon
Examinees can know the symptoms occurring in their body (and the potential health risks accordingly), while doctors can get a set of examinees with potential risk.

4. PROPOSED SYSTEM
The main concept is to determine medical diseases according to the given symptoms and daily routine; when the user searches for a hospital, the hospital nearest to their current location is returned. The system provides a user-friendly interface for examinees and doctors. Examinees can know the symptoms occurring in their body, while doctors can get a set of examinees with potential risk. A feedback mechanism could save manpower and improve the performance of the system automatically. The doctor could fix a prediction result through an interface, which will collect doctors' input as new training data. An extra training process will be triggered every day using these data. Thus, our system could improve the performance of the prediction model automatically.
[Figure: block diagram of the proposed system; components: registration, view symptoms, disease prediction; actors: doctor and admin]
By referring to these papers, we have tried to develop this proposed system.
Comparison between existing technologies and the proposed system:
In the existing women-safety module, a woman needs to click a button in the app, and then a help message is sent to the emergency contact number; this message is sent continuously until she presses a stop button. In the proposed system, the woman can press the power button 3 to 4 times, and then a single help message is sent to her emergency contact number. In the existing user module of the Android application, the user can see the crime rate in the form of maps or graphs. In the proposed system, the user can view the crime status in the form of a pie chart based on crime type. The crime capture module is included solely in the proposed system.

4. GAP ANALYSIS
Table: Gap Analysis

                       | Manual Verification | Govt. services | Proposed DVS
Validity               | Medium              | High           | Unlimited
Confidentiality        | Moderate            | Medium         | High
Cost of verification   | Medium              | Medium         | Low
Security               | Moderate            | Medium         | High
Energy Consumption     | High                | High           | Moderate

5. PROPOSED SYSTEM
The developed model will help to reduce crimes and will help the crime detection field in many ways, that is, in reducing crimes by carrying out various necessary measures. In this system there are three modules, namely the user module, the woman safety module and the crime capturing module. In the first module, the user module, a person will come to know whether the place to which he is travelling is safe or not. This module is basically an Android application where the user can register himself. After registration, whenever the user logs in, he will see three options: view crime rate, crime capture, and logout. With the first option he will be able to view the crime status of any area he wishes from the available list; this will be displayed in graphical format. With the second option, crime capture (which is also the second module), if a user finds a crime happening in the surroundings, he can capture it and send it to the nearest police station from the available list, so that the police will be notified and can take immediate necessary action. The last one is the logout option.
The third module is the woman safety module. This is also an Android application where the woman must be registered first. If a woman feels insecure, she can press the power button of her Android mobile 4-5 times, so that a notification is sent to the emergency contact number which she provided during the registration process.
Along with the Android application there will be a webpage available for both user and admin. Police officers will act as admin. The admin can add and update data in the database area-wise.

6. ALGORITHMS
1. K-means clustering
We use the clustering technique of data mining. Here clustering is used for grouping similar patterns based on crime type [7]. K-means clustering, an unsupervised learning algorithm, is used. Clustering will help us to display the crime rate graphically using a pie chart.
The K-means algorithm can be executed in the following steps:
1) Specify the value of k, that is, the number of clusters.
Juan Guevara, Joana Costa, Jorge Arroba, Catarina Silva [5]: One of the most popular social networks for microblogging that has shown great growth is Twitter, which allows people to express their opinions using short, simple sentences. These texts are generated daily, and for this reason it is common for people to want to know the trending topics and their drifts. In this paper we propose to deploy a mobile app that provides information focusing on areas such as Politics, Social, Tourism, and Marketing using a statistical lexicon approach. The application shows the polarity of each theme as positive, negative, or neutral.

S. Rajalakshmi, S. Asha, N. Pazhaniraja [1]: In this case, sentiment analysis or opinion mining is useful for mining facts from those data. The text data obtained from the social network primarily undergoes emotion mining to examine the sentiment of the user message. Most sentiment or emotional mining uses machine learning approaches for better results. The principal idea behind this article is to bring out the process involved in sentiment analysis. Further, the investigation is about the various methods or techniques existing for performing sentiment analysis. It also presents the

Anusha K S, Radhika A D [4]: In this paper we discuss the levels and approaches of sentiment analysis, sentiment analysis of Twitter data, the existing tools available for sentiment analysis and the steps involved in the same. Two approaches are discussed with an example, which work on machine learning and lexicon-based methods respectively.

Ms. Farha Nausheen, Ms. Sayyada Hajera Begum [6]: The opinion of the public for a candidate will impact the potential leader of the country. Twitter is used to acquire a large, diverse data set representing the current public opinions of the candidates. The collected tweets are analyzed using a lexicon-based approach to determine the sentiments of the public. In this paper, we determine the polarity and subjectivity measures for the collected tweets, which help in understanding the user opinion for a particular candidate. Further, a comparison is made among the candidates over the type of sentiment.

Sentiment analysis can be classified into the lexicon-based approach, the machine learning approach and the hybrid approach. Sentiment analysis approaches are listed in Table 1.
Table 1: Sentiment analysis approaches

TYPES          | APPROACHES                       | MERITS AND DEMERITS
Lexicon based  | Novel Machine Learning Approach; | MERITS: broader term analysis.
Approaches     | Dictionary based approach;       | DEMERITS: limited number of words in
               | Ensemble;                        | lexicons and assigning a fixed score
               | Corpus based approach            | to opinion words
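A minimal lexicon-based polarity scorer along the lines discussed above might look like this; the lexicon words and their scores are invented for illustration (real systems use resources such as SentiWordNet).

```python
# Tiny hand-made sentiment lexicon (illustrative scores only).
LEXICON = {"good": 1, "great": 2, "happy": 1,
           "bad": -1, "terrible": -2, "sad": -1}

def polarity(tweet):
    """Sum the lexicon scores of the words; the sign gives the class."""
    score = sum(LEXICON.get(w, 0) for w in tweet.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("What a great and happy day"))  # positive
print(polarity("terrible service, very sad"))  # negative
print(polarity("the candidate spoke today"))   # neutral
```

The demerit noted in Table 1 is visible here: words outside the fixed lexicon contribute nothing, and each word always carries the same score regardless of context.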
7. CONCLUSION
In this paper, we have presented a way in which automated summarization can be performed and used for various fundamental purposes. The procedure that we proposed consists of a few steps. First, we take an input from the user in the form of a digital document. Our system scans that document and preprocesses it for the later steps. The next step is applying different algorithms for the purpose of summarization and information retrieval. This may include scanning, parsing, POS tagging and different technical measures. The original document is then saved in the database along with the output of the preprocessing, i.e., the summary. After this, the user gives a second input to the system: the query that needs to be looked up. The system takes that query and looks for the appropriate solution. At last, an output in the form of text is generated and provided to the user.

REFERENCES
[1] Review On Natural Language Processing - Prajakta Pawar, Prof. Alpa Reshamwala and Prof. Dhirendra Mishra. Cite As: https://www.researchgate.net/publication/235788362 | Published in An International Journal (ESTIJ), ISSN: 2250-3498, Vol. 3, No. 1, February 2013.
[2] Comparative Study of Text Summarization Methods - Nikita Munot and Sharvari S. Govilkar. Cite As: https://pdfs.semanticscholar.org/0c95/0bc8f234ecb6cf57f13bca7edd118809d0ca.pdf | Published in International Journal of Computer Applications (0975-8887), Volume 102, No. 12, September 2014.
[3] Automatic Text Summarization and its Methods - Neelima Bhatiya and Arunima Jaiswal. Cite As: https://ieeexplore.ieee.org/abstract/document/7508049/ | Published in 2016 6th International Conference.
[4] Graph Based Approach For Automatic Text Summarization - Akash, Somaiah, Annapurna. Cite As: https://ijarcce.com/wp-content/uploads/2016/11/IJARCCE-ICRITCSA-2.pdf | Published in International Journal of Advanced Research in Computer and Communication Engineering, Vol. 5, Special Issue 2, October 2016.
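The pipeline described in the conclusion (preprocess a document, summarize it, then answer a user query against the stored summary) can be sketched as follows. Everything here, including the stopword list, the frequency-based sentence scoring and the overlap-based query lookup, is a simplified stand-in for the unspecified algorithms, not the authors' implementation:

```python
import re
from collections import Counter

# a tiny illustrative stopword list; a real system would use a fuller one
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "for",
             "on", "that", "this", "be", "it"}

def sentences(text):
    # naive sentence splitter (the preprocessing step)
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def tokens(sentence):
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]

def summarize(text, k=2):
    """Extractive summary: keep the k sentences whose content words
    are most frequent in the document, preserving original order."""
    sents = sentences(text)
    freq = Counter(w for s in sents for w in tokens(s))
    scored = sorted(sents, key=lambda s: sum(freq[w] for w in tokens(s)),
                    reverse=True)
    chosen = set(scored[:k])
    return [s for s in sents if s in chosen]

def answer(query, summary_sents):
    # return the stored sentence with the largest word overlap with the query
    q = set(tokens(query))
    return max(summary_sents, key=lambda s: len(q & set(tokens(s))))
```

The stored summary plays the role of the database in the description above; a second user input (the query) is then matched against it.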
and enhance accessibility irrespective of geographical location, provided there is internet access.

3. PROPOSED WORK
The proposed system will automate the current manual tailoring system and maintain a searchable customer and product database, while maintaining data security and user rights. The system will enable customers to send their measurements to their tailors for their clothes to be made. It will also record the cost, the fabric type, the urgency with which a customer wants the dress finished, the type of material to be used, and the quantity in terms of pairs needed. The system computes the total cost depending on the selected fabric, type of material, quantity and duration, and makes that information available to the customer. This enables report generation: the system can give a report of the finished garments ready for the clients to collect and of the bookings made, and the administrator is able to view all the customers and their details, the finished garments and all the bookings made. A data bank is created for easy access and retrieval of customer details, orders placed, and the users who register to the system. The registration process for the customers is provided online by the system, which helps them successfully submit their measurements. The system has an inbuilt validation mechanism to validate the entered data. The customer can log in to the system to check on the status of the clothes for collection. The system will show the already completed garments for the client to collect. The system also provides information about the cost of each garment the customer intends to get knitted. The data will be stored in the database for further reference or audit.
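The total-cost computation described above (fabric, material, quantity and urgency) might look like the following sketch. All rates and surcharges are hypothetical, since the paper gives no price table:

```python
# hypothetical rates; the paper does not specify actual prices
FABRIC_RATE = {"cotton": 8.0, "silk": 20.0, "wool": 15.0}   # per metre
MATERIAL_RATE = {"standard": 1.0, "premium": 1.5}           # multiplier
URGENCY_SURCHARGE = {7: 0.25, 14: 0.10}                     # due within N days -> extra fraction

def total_cost(fabric, material, quantity_pairs, metres_per_pair, days_until_due):
    """Cost from selected fabric, material type, quantity and urgency."""
    base = FABRIC_RATE[fabric] * metres_per_pair * quantity_pairs
    base *= MATERIAL_RATE[material]
    # apply the tightest urgency surcharge whose deadline covers the order
    for deadline in sorted(URGENCY_SURCHARGE):
        if days_until_due <= deadline:
            base *= 1 + URGENCY_SURCHARGE[deadline]
            break
    return round(base, 2)
```

Orders with a comfortable deadline pay no surcharge; rush orders pay proportionally more.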
support; mine large clinical databases to create new CDS [2]. It takes only a few researchers to analyze data from hospital information. Knowledge discovery and data mining have found numerous applications in business and scientific domains. [3] The main concept is to determine medical diseases according to the given symptoms and daily routine; when the user searches for a hospital, the nearest hospital to their current location is given. Data mining techniques used in the prediction of heart attacks are rule-based methods, decision trees and artificial neural networks. [4] The related queries are based on previously issued queries, and can be issued by the user to the search engine to tune or redirect the search process. The method proposed is based on a query clustering process in which groups of semantically similar queries are identified [5]. The clustering process uses the content of historical preferences of users registered in the query log of the search engine. The system provides a user-friendly interface for examinees and doctors. Examinees can know the symptoms which occurred in their body, while doctors can get a set of examinees with potential risk. A feedback mechanism could save manpower and improve the performance of the system automatically.

2. MOTIVATION
Previous medical examiners used only basic symptoms of particular diseases, but in this application the examiner examines word counts, laboratory results and diagnostic data. A feedback mechanism could save manpower and improve the performance of the system automatically. The doctor could fix a prediction result through an interface, which will collect doctors' input as new training data. An extra training process will be triggered every day using these data. Thus, this system could improve the performance of the prediction model automatically. When the user visits the hospital physically, the user's personal record is saved and then that record is added to the examiner data set. This consumes a lot of time.

3. REVIEW OF LITERATURE
Sittig D, Wright A, Osheroff J, et al. [1]: There is a pressing need for high-quality, effective means of designing, developing, presenting, implementing, evaluating, and maintaining all types of clinical decision support capabilities for clinicians, patients and consumers. Using an iterative, consensus-building process we identified a rank-ordered list of the top 10 grand challenges in clinical decision support. This list was created to educate and inspire researchers, developers, funders, and policy-makers. The list of challenges, in order of the importance that they be solved if patients and organizations are to begin realizing the fullest benefits possible of these systems, consists of: improve the human-computer interface; disseminate best practices in CDS design, development, and implementation; summarize patient-level information; prioritize and filter recommendations to the user; create an architecture for sharing executable CDS modules and services; combine recommendations for patients with co-morbidities; prioritize CDS content development and implementation; create internet-accessible clinical decision support repositories; use free-text information to drive clinical decision support; and mine large clinical databases to create new CDS. Identification of solutions to these challenges is critical if clinical decision support is to achieve its potential and improve the quality, safety and efficiency of healthcare.

Anderson J E, Chang D C, et al. [2]: Many healthcare facilities enforce security on their electronic health records (EHRs) through a corrective mechanism: some staff nominally have almost unrestricted access to the records, but there is a strict ex post facto audit process for inappropriate accesses, i.e., accesses that violate the facility's security and privacy
policies. This process is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred. This motivates automated approaches based on machine learning using historical data. Previous attempts at such a system have successfully applied supervised learning models to this end, such as SVMs and logistic regression. While providing benefits over manual auditing, these approaches ignore the identity of the users and patients involved in a record access. Therefore, they cannot exploit the fact that a patient whose record was previously involved in a violation has an increased risk of being involved in a future violation. Motivated by this, a collaborative filtering inspired approach to predict inappropriate accesses is proposed. The solution integrates both explicit and latent features for staff and patients, the latter acting as a personalized "fingerprint" based on historical access patterns. The proposed method, when applied to real EHR access data from two tertiary hospitals and a file-access dataset from Amazon, shows not only significantly improved performance compared to existing methods, but also provides insights as to what indicates an inappropriate access.

ZhaoqianLan, Guopeng Zhou, YichunDuan, Wei Yan, et al. [3]: The healthcare environment is generally perceived as being "information rich" yet "knowledge poor". There is a wealth of data available within the healthcare systems. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. Knowledge discovery and data mining have found numerous applications in business and scientific domains. Valuable knowledge can be discovered from the application of data mining techniques in the healthcare system. In this study, the potential use of classification-based data mining techniques such as rule-based, decision tree, naïve Bayes and artificial neural network classifiers on massive volumes of healthcare data is briefly examined. The healthcare industry collects huge amounts of healthcare data which, unfortunately, are not "mined" to discover hidden information. For data preprocessing and effective decision making, a One Dependency Augmented Naïve Bayes classifier (ODANB) and naive creedal classifier 2 (NCC2) are used. The latter is an extension of naïve Bayes to imprecise probabilities that aims at delivering robust classifications even when dealing with small or incomplete data sets. Discovery of hidden patterns and relationships often goes unexploited. Using medical profiles such as age, sex, blood pressure and blood sugar, it can predict the likelihood of patients getting a heart disease. It enables significant knowledge, e.g. patterns and relationships between medical factors related to heart disease, to be established.

Srinivas K, Rani B K, Govrdhan A, et al. [4]: In this paper, care services through telemedicine are provided, and telemedicine has become an important part of the medical development process due to the latest innovations in information and computer technologies. Meanwhile, data mining, a dynamic and fast-expanding domain, has improved many fields of human life by offering the possibility of predicting future trends and helping with healthcare decision making, based on the patterns and trends discovered. The diversity of data and the multitude of data mining techniques provide various applications for data mining, including in the healthcare organization. Integrating data mining techniques into telemedicine systems would help improve the efficiency and effectiveness of the healthcare organizations' activity, contributing to the development and refinement of the healthcare services offered as part of the medical development process.
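As an illustration of the naive Bayes family of classifiers discussed above (ODANB and NCC2 are refinements of it), here is a minimal categorical naive Bayes with Laplace smoothing applied to toy medical profiles. It is not the ODANB/NCC2 algorithm itself, and the profile data are invented:

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Categorical naive Bayes training: count label frequencies and
    per-feature value frequencies conditioned on each label."""
    label_counts = Counter(labels)
    feat_counts = defaultdict(Counter)  # (feature_index, label) -> value counts
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            feat_counts[(i, y)][v] += 1
    return label_counts, feat_counts

def predict_nb(model, row):
    """Pick the label maximizing P(label) * prod P(value | label)."""
    label_counts, feat_counts = model
    total = sum(label_counts.values())
    best, best_p = None, -1.0
    for y, n in label_counts.items():
        p = n / total
        for i, v in enumerate(row):
            c = feat_counts[(i, y)]
            # Laplace smoothing so unseen values do not zero out the product
            p *= (c[v] + 1) / (sum(c.values()) + len(c) + 1)
        if p > best_p:
            best, best_p = y, p
    return best
```

With profiles such as (age group, blood pressure, blood sugar), the classifier estimates the more likely risk label for a new patient.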
Gheorghe M, Petre R, et al. [5]: In this paper a method is proposed that, given a query submitted to a search engine, suggests a list of related queries. The related queries are based on previously issued queries, and can be issued by the user to the search engine to tune or redirect the search process. The method proposed is based on a query clustering process in which groups of semantically similar queries are identified. The clustering process uses the content of historical preferences of users registered in the query log of the search engine. The method not only discovers the related queries, but also ranks them according to a relevance criterion. Finally, the effectiveness of the method is shown with experiments over the query log of a search engine.

R. Baeza-Yates, C. Hurtado, and M. Mendoza, et al. [6]: The system has focused on comparing a variety of techniques, approaches and tools and their impact on the healthcare sector. The goal of a data mining application is to turn data (facts, numbers, or text which can be processed by a computer) into knowledge or information. The main purpose of data mining applications in healthcare systems is to develop an automated tool for identifying and disseminating relevant healthcare information. This paper aims to make a detailed study report of the different types of data mining applications in the healthcare sector and to reduce the complexity of the study of healthcare data transactions. It also presents a comparative study of different data mining applications, techniques and methodologies applied for extracting knowledge from databases generated in the healthcare industry. Finally, the existing data mining techniques, data mining algorithms and their application tools which are most valuable for healthcare services are discussed in detail.

Koh H C, Tan G, et al. [7]: Many healthcare facilities enforce security on their electronic health records (EHRs) through a corrective mechanism: some staff nominally have almost unrestricted access to the records, but there is a strict ex post facto audit process for inappropriate accesses, i.e., accesses that violate the facility's security and privacy policies. This process is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred. This motivates automated approaches based on machine learning using historical data. Previous attempts at such a system have successfully applied supervised learning models to this end, such as SVMs and logistic regression. While providing benefits over manual auditing, these approaches ignore the identity of the users and patients involved in a record access. Therefore, they cannot exploit the fact that a patient whose record was previously involved in a violation has an increased risk of being involved in a future violation. Motivated by this, a collaborative filtering inspired approach to predict inappropriate accesses is proposed. The solution integrates both explicit and latent features for staff and patients, the latter acting as a personalized "fingerprint" based on historical access patterns. The proposed method, when applied to real EHR access data from two tertiary hospitals and a file-access dataset from Amazon, shows not only significantly improved performance compared to existing methods, but also provides insights as to what indicates an inappropriate access.

Tao Jiang & Siyu Qian, et al. [8]: The study aimed to identify risk factors in medication management in Australian residential aged care (RAC) homes. Only 18 out of 3,607 RAC homes failed the aged care accreditation standard in medication management between 7th March 2011 and 25th March 2015. Text data mining methods were used
to analyse the reasons for failure. This led to the identification of 21 risk indicators for an RAC home to fail in medication management. These indicators were further grouped into ten themes: overall medication management, medication assessment, ordering, dispensing, storage, stock and disposal, administration, incident report, monitoring, and staff and resident satisfaction. The top three risk factors are: "ineffective monitoring process" (18 homes), "noncompliance with professional standards and guidelines" (15 homes), and "resident dissatisfaction with overall medication management" (10 homes).

Song J H, Venkatesh S S, Conant E A, et al. [9]: k-means clustering and self-organizing maps (SOM) are applied to analyze the signal structure in terms of visualization. k-nearest neighbor (k-NN) classifiers, support vector machines (SVM) and decision trees (DT) are employed to classify features using a computer-aided diagnosis (CAD) approach.

Song J H, Venkatesh S S, Conant E A, et al. [10]: Breast cancer is one of the most common cancers in women. Sonography is now commonly used in combination with other modalities for imaging breasts. Although ultrasound can diagnose simple cysts in the breast with an accuracy of 96%-100%, its use for unequivocal differentiation between solid benign and malignant masses has proven to be more difficult. Despite considerable efforts toward improving imaging techniques, including sonography, the final confirmation of whether a solid breast lesion is malignant or benign is still made by biopsy.

V. Akgün, E. Erkut, and R. Batta, et al. [11] consider the problem of finding a number of spatially dissimilar paths between an origin and a destination. A number of dissimilar paths can be useful in solving capacitated flow problems or in selecting routes for hazardous materials. A critical discussion of three existing methods for the generation of spatially dissimilar paths is offered, and computational experience using these methods is reported. As an alternative method, the generation of a large set of candidate paths and the selection of a subset using a dispersion model which maximizes the minimum dissimilarity in the selected subset is proposed.

T. Akiba, T. Hayashi, N. Nori, Y. Iwata, and Y. Yoshida, et al. [12]: An indexing scheme for top-k shortest path distance queries on graphs is proposed, which is useful in a wide range of important applications such as network-aware searches and link prediction. While many efficient methods for answering standard (top-1) distance queries have been developed, none of these methods are directly extensible to top-k distance queries. A new framework for top-k distance queries based on 2-hop cover is developed, and an efficient indexing algorithm based on the recently proposed pruned landmark labeling scheme is then presented. The scalability, efficiency and robustness of the method are demonstrated in extensive experimental results.

A. Angel and N. Koudas, et al. [13]: Diversity-aware search is studied in a setting that captures and extends established approaches, focusing on content-based result diversification. DIVGEN, an efficient threshold algorithm for diversity-aware search, is presented. DIVGEN utilizes novel data access primitives, offering the potential for significant performance benefits. The choice of data accesses to be performed is crucial to performance, and a hard problem in its own right. Thus a low-overhead, intelligent data access prioritization scheme is proposed, with theoretical quality guarantees and good performance in practice.
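The top-k shortest path distances discussed in [12] can be answered, without any index, by a simple Dijkstra variant that allows each node to be settled up to k times. This is a baseline sketch for intuition, not the pruned landmark labeling scheme from the paper:

```python
import heapq

def topk_distances(graph, s, t, k):
    """Return up to k shortest path distances from s to t.

    graph: dict mapping node -> list of (neighbor, edge_weight).
    Standard Dijkstra settles each node once; allowing up to k
    settlements per node yields the k shortest (possibly non-simple)
    path lengths in increasing order.
    """
    counts = {v: 0 for v in graph}  # how many times each node was settled
    dists, heap = [], [(0, s)]
    while heap and len(dists) < k:
        d, u = heapq.heappop(heap)
        if counts[u] >= k:
            continue
        counts[u] += 1
        if u == t:
            dists.append(d)
        for v, w in graph[u]:
            heapq.heappush(heap, (d + w, v))
    return dists
```

This runs in roughly k times the cost of one Dijkstra pass; the indexed 2-hop-cover approach in [12] exists precisely because this baseline is too slow for repeated queries on large networks.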
(Figure: proposed prediction system architecture. The user's given symptoms and given results are preprocessed, passed to the logical part, and machine learning prediction is performed using Random Forest and a baseline algorithm to produce the learning result.)
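The Random Forest block in the architecture can be sketched as a majority vote over decision stumps trained on bootstrap samples. This toy version (single-feature stumps, hypothetical symptom data) only illustrates the idea, not the actual model used by the system:

```python
import random

def train_stump(rows, labels, feat):
    """One-feature decision stump: majority label per observed value."""
    table = {}
    for row, y in zip(rows, labels):
        table.setdefault(row[feat], []).append(y)
    return feat, {v: max(set(ys), key=ys.count) for v, ys in table.items()}

def train_forest(rows, labels, n_trees=5, seed=0):
    """Each 'tree' is a stump trained on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feat = rng.randrange(len(rows[0]))              # random feature choice
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feat))
    return forest

def predict_forest(forest, row, default="unknown"):
    # majority vote across all stumps; unseen values vote the default label
    votes = [table.get(row[feat], default) for feat, table in forest]
    return max(set(votes), key=votes.count)
```

A real Random Forest grows full decision trees with random feature subsets at every split; the bootstrap-plus-vote structure is the same.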
training process to improve performance every day.

8. ACKNOWLEDGEMENT
The authors would like to thank the researchers as well as the publishers for making their resources available, and our teachers for their guidance. We are thankful to the authorities of Savitribai Phule University of Pune and the concerned members of the ICINC 2019 conference for their constant guidance and support. We are also thankful to the reviewers for their valuable suggestions. We also thank the college authorities for providing the required infrastructure and support. Finally, we would like to extend heartfelt gratitude to our friends and family members.

REFERENCES
[1] Sittig D, Wright A, Osheroff J, et al. "Grand challenges in clinical decision support". Journal of Biomedical Informatics, 2008.
[2] Anderson J E, Chang D C. "Using Electronic Health Records for Surgical Quality Improvement in the Era of Big Data" [J]. JAMA Surgery, 2015.
[3] ZhaoqianLan, Guopeng Zhou, YichunDuan, Wei Yan, "AI-assisted Prediction on Potential Health Risks with Regular Physical Examination Records", IEEE Transactions on Knowledge and Data Science, 2018.
[4] Srinivas K, Rani B K, Govrdhan A. "Applications of Data Mining Techniques in Healthcare and Prediction of Heart Attacks". International Journal on Computer Science & Engineering, 2010.
[5] Gheorghe M, Petre R. "Integrating Data Mining Techniques into Telemedicine Systems". Informatica Economica Journal, 2014.
[6] R. Baeza-Yates, C. Hurtado, and M. Mendoza, "Query recommendation using query logs in search engines," in Proc. Int. Conf. Current Trends Database Technol., 2004, pp. 588–596.
[7] Koh H C, Tan G. Data mining applications in healthcare [J]. Journal of Healthcare Information Management (JHIM), 2005, 19(2):64–72.
[8] Menon A K, Jiang X, Kim J, et al. Detecting Inappropriate Access to Electronic Health Records Using Collaborative Filtering [J]. Machine Learning, 2014, 95(1):87–101.
[9] Tao Jiang & Siyu Qian, et al. Accreditation Reports to Identify Risk Factors in Medication Management in Australian Residential Aged Care Homes [J]. Studies in Health Technology & Informatics, 2017.
[10] Song J H, Venkatesh S S, Conant E A, et al. Comparative analysis of logistic regression and artificial neural network for computer-aided diagnosis of breast masses [J]. Academic Radiology, 2005, 12(4):487–495.
[11] V. Akgün, E. Erkut, and R. Batta. On finding dissimilar paths. European Journal of Operational Research, 121(2):232–246, 2000.
[12] T. Akiba, T. Hayashi, N. Nori, Y. Iwata, and Y. Yoshida. Efficient top-k shortest-path distance queries on large networks by pruned landmark labeling. In Proc. AAAI, pages 2–8, 2015.
[13] A. Angel and N. Koudas. Efficient diversity-aware search. In Proc. SIGMOD, pages 781–792, 2011.
[14] H. Bast, D. Delling, A. V. Goldberg, M. Müller-Hannemann, T. Pajor, P. Sanders, D. Wagner, and R. F. Werneck. Route planning in transportation networks. In Algorithm Engineering, pages 19–80, 2016.
[15] Borodin, Allan, Lee, H. Chul, Ye, and Yuli. Max-sum diversification, monotone submodular functions and dynamic updates. Computer Science, pages 155–166, 2012.
have been important advances in semantic search and question answering over RDF data. In particular, natural language interfaces to online semantic data have the advantage that they can exploit the expressive power of Semantic Web data models and query languages, while at the same time hiding their complexity from the user.

Remarks: There are no evaluations so far that systematically evaluate this kind of system, in contrast to traditional question answering and search interfaces to document spaces.

[5] Question answering on Freebase via relation extraction and textual evidence: Existing knowledge-based question answering systems often rely on small annotated training data. While shallow methods like relation extraction are robust to data scarcity, they are less expressive than deep meaning representation methods like semantic parsing, thereby failing at answering questions involving multiple constraints.

Remarks: While shallow methods like relation extraction are robust to data scarcity, they are less expressive than deep meaning representation methods like semantic parsing, thereby failing at answering questions involving multiple constraints. This paper is very useful for our system; deep learning is used in it.

4. CURRENT SYSTEM
In the existing system, the hardness of RDF Q/A lies in the ambiguity of unstructured natural language question sentences. Generally, there are two main challenges.

Phrase Linking: a natural language phrase wsi may have several meanings, i.e., wsi corresponds to several semantic items in the RDF graph G. As shown in Figure 1(b), the entity phrase "Paul Anderson" can map to three persons: ⟨Paul Anderson (actor)⟩, ⟨Paul S. Anderson⟩ and ⟨Paul W. S. Anderson⟩. For a relation phrase, "directed by" also refers to two possible predicates, ⟨director⟩ and ⟨writer⟩. Sometimes a phrase needs to be mapped to a non-atomic structure in the knowledge graph; for example, "uncle of" refers to a predicate path (see Table 4). In RDF Q/A systems, we should eliminate "the ambiguity of phrase linking".

Composition: the task of composition is to construct the corresponding query or query graph by assembling the identified phrases. In the running example, we know the predicate ⟨director⟩ is to connect subject ⟨film⟩ and object ⟨Paul W. S. Anderson⟩; consequently, we generate a triple ⟨film, director, Paul W. S. Anderson⟩. However, in some cases it is difficult to determine the correct subject and object for a given predicate, or there may exist several possible query graph structures for a given question sentence. We call this "the ambiguity of query graph structure".

5. PROPOSED SYSTEM APPROACH
This system uses a framework to answer natural language questions over an RDF repository with a graph data-driven technique. A semantic query graph is used to model the query knowledge in the natural language question in a structural way, and Resource Description Framework question answering is mainly reduced to a subgraph matching problem. More importantly, the system resolves the ambiguity of natural language questions at the time when matches of the query are found. The cost of disambiguation is saved if no match is found. The system uses this semantic query information together with the knowledge graph to find the best answer; because of this, it overcomes problems that occur in real-time applications such as Quora and Stack Overflow.

6. SYSTEM ARCHITECTURE
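The match-time disambiguation just described can be illustrated with the running "Paul Anderson" example: candidate mappings for the entity and relation phrases are kept only when they actually match a triple in the graph, so no separate disambiguation pass is needed. The tiny knowledge graph and candidate dictionaries below are illustrative data, not the system's actual store:

```python
# toy knowledge graph of (subject, predicate, object) triples -- illustrative data
KG = {
    ("Resident_Evil", "director", "Paul_W_S_Anderson"),
    ("Resident_Evil", "writer", "Paul_W_S_Anderson"),
    ("Magnolia", "director", "Paul_T_Anderson"),
}

# phrase-linking dictionaries: each phrase maps to several candidate items
ENTITY_CANDIDATES = {"Paul Anderson": ["Paul_W_S_Anderson", "Paul_T_Anderson"]}
PREDICATE_CANDIDATES = {"directed by": ["director", "writer"]}

def match_films(entity_phrase, relation_phrase):
    """Resolve ambiguity at query-evaluation time: a candidate mapping
    survives only if it matches a triple in the knowledge graph."""
    answers = set()
    for obj in ENTITY_CANDIDATES[entity_phrase]:
        for pred in PREDICATE_CANDIDATES[relation_phrase]:
            for s, p, o in KG:
                if p == pred and o == obj:
                    answers.add(s)
    return answers
```

Candidate pairs that match nothing in the graph simply produce no answers, which is the sense in which the cost of disambiguation is saved when no match is found.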
Fig. 6.1. System Architecture

7. FLOW DIAGRAM

Fig. 7.1. Flow Diagram

The above diagram shows the actual flow of the system.

8. CONCLUSION
In our system, a graph data-driven framework to answer natural language questions over Resource Description Framework graphs is presented. Different from existing work, the ambiguity both of phrases and of structure is removed in the question understanding stage. The system pushes the disambiguation down into the query evaluation stage. Based on the query results over Resource Description Framework graphs, we can address the ambiguity issue efficiently.

REFERENCES
[1] Junwei Bao, Nan Duan, Ming Zhou, Tiejun Zhao, "Knowledge-based question answering as machine translation", Baltimore, Maryland, USA, June 23-25, 2014.
[2] Mohamed Yahya, Klaus Berberich, Shady Elbassuoni, Gerhard Weikum, "Robust question answering over the web of linked data".
[3] Dong Deng, Guoliang Li, Jianhua Feng, Yi Duan, "A unified framework for approximate dictionary-based entity extraction". Received: 12 November 2013 / Revised: 28 April 2014 / Accepted: 11 July 2014. © Springer-Verlag Berlin Heidelberg 2014.
[4] Vanessa Lopez, Christina Unger, Philipp Cimiano, Enrico Motta, "Evaluating question answering over linked data."
[5] Kun Xu, Siva Reddy, Yansong Feng, Songfang Huang and Dongyan Zhao, "Question answering on freebase via relation extraction and textual evidence", Berlin, Germany, August 7-12, 2016.
[6] W. M. Soon, H. T. Ng, and D. C. Y. Lim, "A machine learning approach to coreference resolution of noun phrases," Comput. Linguist., vol. 27, no. 4, pp. 521–544, 2001.
[7] L. Androutsopoulos, Natural Language Interfaces to Databases – An Introduction, Journal of Natural Language Engineering 1 (1995), 29–81.
[8] V. I. Spitkovsky and A. X. Chang, "A cross-lingual dictionary for English Wikipedia concepts," in Proceedings of the Eighth International Conference on Language Resources and Evaluation, LREC 2012, Istanbul, Turkey, May 23-25, 2012, pp. 3168–3175.
[9] C. D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval. New York: Cambridge University Press, 2008.
[10] N. Nakashole, G. Weikum, and F. M. Suchanek, "Discovering and exploring relations on the web," PVLDB, vol. 5, no. 12, pp. 1982–1985, 2012.
number of features was developed for the analysis of audio content.

A large amount of work has been dedicated to the modeling of relationships between music and emotions, spanning psychology, musicology and music information retrieval. Proposed emotion models follow either the categorical approach or the dimensional approach. Categorical approaches represent emotions as a set of categories that are clearly distinct from each other; for example, six basic emotion categories based on the human facial expressions of anger, fear, happiness, sadness, disgust and surprise. Another famous categorical approach is Hevner's affective checklist, where eight clusters of affective adjectives were discovered and laid out in a circle, as shown in Fig. 1. Each cluster includes similar adjectives, and the meaning of neighboring clusters varies in a cumulative way until reaching a contrast at the opposite position.

In this paper we study the related work in Section II; the proposed approach, module descriptions, mathematical modeling, algorithm and experimental setup in Section III; and finally we provide a conclusion in Section IV.

2. LITERATURE REVIEW
This section discusses in detail the literature on the recommendation system for online social networks. Many short-term and long-term modulation and timbre features have been developed for content-based music classification. Two operations in modulation analysis are useful for extracting modulation information, but this degrades classification performance. To deal with this problem, Ren et al. [1] proposed a two-dimensional representation of acoustic frequency and modulation frequency which extracts joint acoustic frequency and modulation frequency features. Long-term joint frequency features like acoustic-modulation spectral contrast/valley (AMSC/AMSV), acoustic-modulation spectral flatness measure (AMSFM), and acoustic-modulation spectral crest measure (AMSCM) are afterwards computed from the spectra of each joint frequency sub-band. The prominent status of music in human culture and everyday life is due in large part to its striking ability to elicit emotions, and mood may vary slightly with changes in our physical condition and actions.

M. Barthet et al. [2] describe a study of music and emotions from different disciplines including psychology, musicology and music information retrieval, and propose new insights to enhance automated music emotion recognition models.

C.-H. Lee et al. [3] proposed an automatic music genre classification approach based on long-term modulation spectral analysis of spectral (OSC and MPEG-7 NASE) and cepstral (MFCC) features. Modulation spectral analysis of each feature generates a modulation spectrum, and all the modulation spectra are collected to form a modulation spectrogram, which exhibits the time-varying or rhythmic information of music signals. Each modulation spectrum is then decomposed into several logarithmically-spaced modulation sub-bands. The MSC and MSV are then computed from each modulation sub-band.

Y. Song et al. [4] collected a ground truth data set of 2904 songs, each tagged with one of the four words "happy", "sad", "angry" and "relaxed". Audio is then retrieved from 7Digital.com, and sets of audio features are extracted using standard algorithms. Two classifiers are trained using support vector machines with polynomial and radial basis function kernels, and these are tested with 10-fold cross validation. Results show that spectral
Y. Panagakis et al. [5] addressed the automatic mood classification problem by resorting to the low-rank representation (LRR) of slow auditory spectro-temporal modulations. If each data class is linearly spanned by a subspace of unknown dimensions and the data are noiseless, the lowest-rank representation of a set of test vector samples with respect to a set of training vector samples has the nature of being both dense for within-class affinities and almost zero for between-class affinities. LRR thus enables the classification of the data; the result is Low-Rank Representation-based Classification (LRRC). The LRRC is compared against three well-known classifiers, namely the Sparse Representations-based Classifier, SVM and Nearest Neighbor classifiers, for music mood classification by conducting experiments on the MTV and Soundtracks180 datasets.

In paper [6] the authors proposed a method using cell mixture models to automate the task of music emotion classification. The designed system has potential application to both unsupervised and supervised classification learning, and is acceptable for music mood classification. The ICMM is suitable for music emotion classification.

In paper [7] the authors give a technical solution for automated slideshow generation by extracting a set of high-level features from music, like beat grid, mood and genre, and intelligently combining this set with high-level image features. For example, the user requests the system to automatically create a slideshow which plays soft music and shows pictures with sunsets from the last 10 years of his own photo collection. The high-level feature extraction uses the audio and visual information which is based on the same

In paper [8] the authors proposed a way in which music can be displayed for the user based on the similarity of the acoustic features. All songs in the music library are mapped onto a 2D feature space. The user can better understand the relationships between the songs, with the distance between each pair of songs reflecting their acoustic similarity. Low-level acoustic features are extracted from the raw audio signals, and dimension reduction is performed using PCA on the feature space. The proposed approach avoids dependence on contextual data (metadata) and collaborative filtering methods. Using the song space visualizer, the user can choose songs or allow the system to automate the song selection process given a seed song.

In paper [9] the authors proposed a method which considers various kinds of audio features. A bin histogram is computed from each feature's frames to save all the needed data related to it. The histogram bins are used for calculating the similarity matrix, and the number of similarity matrices depends on the number of audio features; there are 59 similarity matrices. To compute the intra-inter similarity ratio, the intra- and inter-similarity matrices are utilized. These similarity ratios are sorted in descending order for each feature, and some of the selected similarity ratios are ultimately used as prototypes for each feature, which are further used for classification by designing the nearest multi-prototype classifier.

In paper [10] the authors proposed self-colored music mood segmentation and a hierarchical framework based on a new mood taxonomy model to automate the task of multi-label music mood classification. The taxonomy model combines Thayer's 2D model with Schubert's Updated Hevner adjective Model (UHM).
3. PROPOSED APPROACH
Proposed System Overview
We implement a feature set for music mood classification which combines modulation spectral analysis of MFCC, OSC, and SFM/SCM with statistical descriptors of short-term timbre features. By employing these features with SVMs, our submission to the audio mood classification task was ranked #1. In fact, the submission outperformed all the other submissions of the task from 2008 to 2014, indicating the superiority of the proposed feature sets. Moreover, based on a part of the aforementioned feature sets, we have also proposed another new feature set that combines the newly proposed joint frequency features (including AMSP and AMSC/AMSV and AMSFM/AMSCM) together with the modulation spectral analysis of MFCC and statistical descriptors of short-term timbre features. Experiments are conducted on the Raga Music Dataset. We also explore the possibility of using dimensionality reduction techniques to extract a compact feature set that can achieve equal or better performance.

Figure 1. Proposed System Architecture

Mathematical Model
For a joint acoustic-modulation spectrogram, we can compute four joint frequency features, namely AMSC, AMSV, AMSFM, and AMSCM, and each of them is a matrix of size A x B.

AMSP and AMSV
For each joint acoustic-modulation frequency sub-band, we compute the acoustic-modulation spectral peak (AMSP) and the acoustic-modulation spectral valley (AMSV), i.e., the maximum and the minimum of the modulation spectra within the sub-band.
The difference between AMSP and AMSV, denoted as AMSC (acoustic-modulation spectral contrast), can be used to reflect the spectral contrast over a joint frequency sub-band:

AMSC(a, b) = AMSP(a, b) - AMSV(a, b)

AMSFM
To measure the noisiness and sinusoidality of the modulation spectra, we further define the acoustic-modulation spectral flatness measure (AMSFM) as the ratio of the geometric mean to the arithmetic mean of the modulation spectra within a joint frequency sub-band.

AMSCM
The acoustic-modulation spectral crest measure (AMSCM) can be defined as the ratio of the maximum to the arithmetic mean of the modulation spectra within a joint frequency sub-band.

Figure 2 shows the time comparison graph of the proposed system against the existing system; the graph is plotted using the above table.

Fig. 2: Time Graph

Table 2 shows the memory required by the proposed system using C4.5 and by the existing system using KNN classification: the memory consumed by the existing system is more than the memory consumed by the proposed system.

Table 2: Memory Comparison for clustering
System                      Memory Required
Existing system with KNN    2500 kb
Proposed system with C4.5   1800 kb
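The four joint-frequency features defined above can be sketched numerically as follows. This is a minimal illustration, assuming the joint acoustic-modulation spectrogram is available as a non-negative matrix, that each joint frequency sub-band is a rectangular block of it, and that the spectral peak and valley are the maximum and minimum within each sub-band; the tiling scheme and all names are illustrative, not the exact formulation of the proposed system.

```python
import numpy as np

def joint_frequency_features(M, n_acoustic=4, n_modulation=2):
    """Sketch of the AMSP/AMSV/AMSC/AMSFM/AMSCM features.

    M: non-negative joint acoustic-modulation spectrogram.
    The matrix is tiled into n_acoustic x n_modulation rectangular
    sub-bands; one value per feature is computed for each sub-band,
    giving A x B feature matrices.
    """
    eps = 1e-12  # guard against log(0) / division by zero
    rows = np.array_split(M, n_acoustic, axis=0)
    amsp = np.empty((n_acoustic, n_modulation))
    amsv = np.empty_like(amsp)
    amsfm = np.empty_like(amsp)
    amscm = np.empty_like(amsp)
    for a, r in enumerate(rows):
        for b, block in enumerate(np.array_split(r, n_modulation, axis=1)):
            x = block.ravel() + eps
            amsp[a, b] = x.max()                                 # spectral peak
            amsv[a, b] = x.min()                                 # spectral valley
            amsfm[a, b] = np.exp(np.mean(np.log(x))) / x.mean()  # geometric / arithmetic mean
            amscm[a, b] = x.max() / x.mean()                     # maximum / arithmetic mean
    amsc = amsp - amsv                                           # spectral contrast
    return amsp, amsv, amsc, amsfm, amscm
```

A quick sanity check: for a flat (constant) spectrogram, AMSFM and AMSCM are both 1 and AMSC is 0, matching the intuition that a contrast-free sub-band has maximal flatness and minimal crest.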
spectrogram and the mean and standard deviation of the MSC/MSV matrices) in the modulation spectral analysis of short-term timbre features are likely to smooth out useful modulation information, so we propose the use of a joint frequency representation of an entire music clip to extract joint frequency features. These joint frequency features, including acoustic-modulation spectral contrast/valley, acoustic-modulation spectral flatness measure and acoustic-modulation spectral crest measure, outperform the modulation spectral analysis of OSC and SFM/SCM on the Raga Music datasets by small margins. The advantage of the proposed features is that they can have a better discriminative power due to their operation on the entire music clip, with no averaging over the local modulation features. The extracted features are used for classification of test music files according to the mood of the test files. For classification the C4.5 classifier is used. The system can be enhanced with mood classification in music videos. We will also apply these features to multi-label tasks such as auto-tagging and tag-based retrieval.

REFERENCES
[1] J.-M. Ren, M.-J. Wu, and J.-S. R. Jang, "Automatic music mood classification based on timbre and modulation features," IEEE Transactions on Affective Computing, vol. 6, no. 3, pp. 236-246, 2015.
[2] M. Barthet, G. Fazekas, and M. Sandler, "Multidisciplinary perspectives on music emotion recognition: recommendations for content- and context-based models," Proc. CMMR, pp. 492-507, 2012.
[3] C.-H. Lee, J.-L. Shih, K.-M. Yu, and H.-S. Lin, "Automatic music genre classification based on modulation spectral analysis of spectral and cepstral features," IEEE Transactions on Multimedia, vol. 11, no. 4, pp. 670-682, 2009.
[4] Y. Song, S. Dixon, and M. Pearce, "Evaluation of musical features for emotion classification," in Proceedings of the 13th International Society for Music Information Retrieval Conference, Porto, Portugal, October 8-12, 2012, pp. 523-528.
[5] Y. Panagakis and C. Kotropoulos, "Automatic music mood classification via low-rank representation," in Proc., 2011, pp. 689-693.
[6] X. Sun and Y. Tang, "Automatic music emotion classification using a new classification algorithm," in Second International Symposium on Computational Intelligence and Design (ISCID '09), Changsha, 2009, pp. 540-542.
[7] P. Dunker, C. Dittmar, A. Begau, S. Nowak, and M. Gruhne, "Semantic high-level features for automated cross-modal slideshow generation," in 2009 Seventh International Workshop on Content-Based Multimedia Indexing, Chania, 2009, pp. 144-149.
[8] M. S. Y. Aw, C. S. Lim, and A. W. H. Khong, "SmartDJ: An interactive music player for music discovery by similarity comparison," in Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific, Kaohsiung, 2013, pp. 1-5.
[9] B. K. Baniya, C. S. Hong, and J. Lee, "Nearest multi-prototype based music mood classification," in Computer and Information Science (ICIS), 2015 IEEE/ACIS 14th International Conference on, Las Vegas, NV, 2015, pp. 303-306.
[10] E. E. P. Myint and M. Pwint, "An approach for multi-label music mood classification," in Signal Processing Systems (ICSPS), 2010 2nd International Conference on, Dalian, 2010, pp. V1-290-V1-294.
4. GAP ANALYSIS
Apriori Algorithm:
It is an array-based algorithm.
It uses the Join and Prune technique.
Apriori uses breadth-first search.
Apriori utilizes a level-wise approach, where it generates patterns containing 1 item, then 2 items, then 3 items, and so on.
Candidate generation is extremely slow; runtime increases exponentially with the number of different items.
Candidate generation is very parallelizable.
It requires large memory space due to the large number of candidates generated.
It scans the database multiple times for generating candidate sets.

FP Growth Algorithm:
It is a tree-based algorithm.
When the FP-tree contains a single prefix-path, the complete set of frequent patterns can be generated in three parts: the single prefix-path P, the multipath Q, and their combinations (lines 01 to 03 and 14). The resulting patterns for a single prefix path are the enumerations of its sub-paths that have the minimum support (lines 04 to 06). Thereafter, the multipath Q is defined (line 03 or 07) and the resulting patterns from it are processed (lines 08 to 13). Finally, in line 14 the combined results are returned as the frequent patterns found.

Light Gradient Boosting Machine:
Light GBM is a gradient boosting framework that uses a tree-based learning algorithm. The tree in Light GBM grows vertically (leaf-wise), as compared to other algorithms in which it grows horizontally (level-wise).

Figure 1: System Architecture

OCR:
OCR (optical character recognition) is the recognition of printed or written text characters by a computer. This involves photo-scanning of the text character by character, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII, commonly used in data processing.

We are hoping to build a payment gateway into the application so that it will be convenient for the user to buy the commodities.
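The level-wise join-and-prune loop described for Apriori above can be sketched as follows. This is a minimal in-memory illustration, not the array-based implementation the text refers to; the function and variable names are ours.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Level-wise frequent itemset mining: 1-itemsets, then 2-itemsets, ..."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: candidate single items.
    items = {i for t in transactions for i in t}
    current = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while current:
        # One scan over the database per level to count support.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        # Join step: merge frequent k-itemsets into (k+1)-candidates.
        # Prune step: keep only candidates whose k-subsets are all frequent.
        candidates = set()
        for a, b in combinations(list(survivors), 2):
            u = a | b
            if len(u) == k + 1 and all(frozenset(s) in survivors
                                       for s in combinations(u, k)):
                candidates.add(u)
        current = candidates
        k += 1
    return frequent
```

For example, `apriori([{'a','b'}, {'a','c'}, {'a','b','c'}], 2)` keeps {a}, {b}, {c}, {a,b} and {a,c}, while {b,c} (support 1) is rejected, so {a,b,c} is never even generated as a candidate.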
systems; these depend on the type of data gathered by the sensor devices. Event Detection and Spatial Process Estimation are the two categories into which applications are classified. Initially, the sensor devices are deployed in the environment to detect the parameters (e.g., temperature, humidity, pressure, LDR, noise, CO and radiation levels, etc.) during data acquisition, computation and controlling action (e.g., the variations in the noise and CO levels with respect to the specified levels). Sensor devices are placed at different locations to collect the data to predict the behavior of a particular area of interest. The main aim of this paper is to design and implement an effective monitoring system through which the required parameters are monitored remotely using the internet, the data gathered from the sensors is stored in the cloud, and the estimated trend is projected on the web browser [1].

With the progression of advancements in technology, several innovations have been made in the field of communications that are transiting to the Internet of Things. In this domain, Wireless Sensor Networks (WSN) are among those independent sensing devices that monitor physical and environmental conditions, with thousands of applications in other fields. Air pollution is a major environmental change that causes many hazardous effects on human beings and needs to be controlled. Hence, we deployed WSN nodes for constant monitoring of air pollution around the city and on the moving public transport buses and cars. This methodology gave us monitoring data from the stationary nodes deployed in the city through to the mobile nodes on public transport buses and cars. The data on air pollution particles such as gases, smoke, and other pollutants is collected via sensors on the public transport buses, and the data is analyzed when the buses and cars reach back to the source destination after passing through the stationary nodes around the city. Our proposed architecture, having an innovative mesh network, will be a more efficient way of gathering data from the nodes of a WSN. It will have lots of benefits with respect to the future concept of Smart Cities, which will have the new technologies related to the Internet of Things. [2]

Temperature and relative humidity play an important role in the lifecycle of plants. When plants have the right humidity they thrive, because they open their pores completely and so breathe deeply without the threat of excessive water loss. The wireless sensor network (WSN) has revolutionized the field of monitoring and remote sensing. Wireless sensor networks, or wireless sensor & actuator networks (WSAN), are spatially distributed sensors that monitor physical or environmental conditions such as temperature, humidity, fire, etc., and cooperatively pass their data through the network to the main location. The aim of this paper is to design and develop a system which fulfills all the above requirements. In this paper, a digital humidity-temperature composite (DHT11) sensor is used to sense the environmental temperature and relative humidity. An Arduino microcontroller is used to perform the complex computation of the parameters and then to transmit the data wirelessly by using a ZigBee S2 module to the receiver. At the receiver section, a ZigBee S2 module is used to capture the serial data transmitted by the transmitter, and using Digi's XCTU software the data is logged onto a PC. [3]

This paper uses the ZigBee CC2530 development platform, applied to various types of sensors developed for environmental monitoring systems, to enhance multi-sensor wireless signal aggregation via multi-bit decision fusion. ZigBee is a short-range wireless transmission standard, IEEE 802.15.4-based, formulated by the ZigBee Alliance as the ZigBee protocol. It has low cost, low power consumption, and short-distance transmission at a transmission rate of 250 kbps for wireless sensor networks. Its main applications include temperature, humidity and other types of data monitoring, factory automation, home automation, remote monitoring, and home device control. [4]

The concern of consumers for better quality agricultural products made the farmers adapt to the latest agricultural techniques by implementing modern technologies for producing better agricultural products. Among the important things taken into consideration by the farmers are the quality of the agricultural land, weather conditions, etc. Traditional farming involves human labor. With proper data, the farmer will be able to deliver a quality product to the consumer. In this paper, we have discussed monitoring of agriculture parameters using a soil moisture level sensor and wireless technology. The parameter results from the sensor node are transferred via the wireless transceiver to a server PC at the other end. On the PC, the values are then analyzed and a predicate is applied to them. If they give a positive response, there will be continuous monitoring, but if negative, the system will provide a total farming solution and cultivation plan. It also sends all these solutions to farmers or users via SMS in their regional languages [5].

The environment monitoring system, in general, is used to monitor various environmental parameters with the help of sensors. Some communication medium, like wireless communication, is needed to transfer sensor data. An environment parameter can be temperature, pressure, humidity, GPS location, or an image. We can design a system to monitor all or any of these parameters as and when required. For monitoring purposes, we need to install some sensors on each node. A node will interact with the sensor and will transfer that data to the controlling unit. A controller will receive data from each node and can take action depending on the programming done. The user can use a Graphical User Interface (GUI) to manage all activities or to check data at any time. The GUI can be designed using Python, HTML, CSS or any other language. Depending on sensor types, various monitoring services can be designed. To monitor and control services or actions we can use the Internet. Data acquired by sensors can be transferred over the network by using a web server or by using some SMS service. To provide energy, battery cells can be used [6].

Wireless sensor networks have been a big promise during the last few years, but a lack of real applications makes the establishment of this technology difficult. This paper reviews wireless sensor network applications, focusing mainly on environmental monitoring systems. These systems have low power consumption and low cost and are a convenient way to perform real-time monitoring. Moreover, they can also be applied to indoor living monitoring, greenhouse monitoring, climate monitoring, and forest monitoring. These approaches have been proved to be an alternative way to replace the conventional method that uses manpower to monitor the environment, and they improve the performance, robustness, and efficiency of the monitoring system.

Monitoring the museum's environment for preventive conservation of art is one major concern of all museums. In order to properly conserve the artwork, it is critical to continuously measure some parameters, such as temperature, relative humidity, light and also pollutants, either in storage or exhibition rooms. The deployment of a Wireless Sensor Network in a museum can help to implement these measurements in real time, continuously, and in a much easier and cheaper way. In this paper, we present the first testbed deployed in a Contemporary Art Museum, located in Madeira Island, Portugal, and the preliminary results of these experiments. On the other hand, we propose a new wireless sensor node that offers some advantages when compared with several commercially available
data mining technologies are integrated with IoT technologies for decision-making support and system optimization. Data mining involves discovering novel, interesting, and potentially useful patterns from data and applying algorithms to the extraction of hidden information. In this paper, we survey data mining from 3 different views: the knowledge view, the technique view, and the application view. In the knowledge view, we review classification, clustering, association analysis, time series analysis, and outlier analysis. In the application view, we review typical data mining applications, including e-commerce, industry, health care, and public service. The technique view is discussed together with the knowledge view and application view. Nowadays, big data is a hot topic for data mining and IoT; we also discuss the new characteristics of big data and analyze the challenges in data extraction, data mining algorithms, and the data mining system area. Based on the survey of the current research, a suggested big data mining system is proposed.

REFERENCES
[1] Edward N. Lorenz, "Dynamical and Empirical Methods of Weather Forecasting," Massachusetts Institute of Technology, pp. 423-429, 2014.
[2] Mathur, S., and A. Paras, "Simple weather forecasting model using mathematical regression," Indian Res J Exten Educ: Special 1 (2012).
[3] Monika Sharma, Lini Mathew, and S. Chatterji, "Weather Forecasting using Soft Computing and Statistical Techniques," IJAREEIE, Vol. 3, Issue 7, pp. 122-131.
[4] Sohn T., Lee J.H., Lee S.H., and Ryu, "Statistical prediction of heavy rain in South Korea," Advances in Atmospheric Sciences, Vol. 22, 2015, pp. 365-372.
[5] Kannan, M., Prabhakaran, S., and Ramachandran, P., "Rainfall forecasting using data mining technique," International Journal of Engineering and Technology, Vol. 2, No. 6, pp. 397-401, 2014.
Figure: System Architecture

6. PROPOSED APPROACH
To motivate our approach based on retweets, we consider a small example based on some data extracted from our dataset on the presidential election. Consider a pro-Republican media source A and a pro-Democrat media source B. We observe the number of retweets they received during two consecutive events. During the "Romney 47 percent comment" event (event 6 in Table 1), source A received 791 retweets, while source B received a significantly higher number of 2,311 retweets. It is not difficult to imagine what happened: source B published tweets bashing the Republican candidate, and Democrat supporters enthusiastically retweeted them. Then consider the first presidential debate. It is generally viewed as an event where Romney outperformed Obama. This time source A received 3,393 retweets, while

7. CONCLUSION
Motivated by the election prediction problem, we study in this paper the problem of quantifying the political leaning of prominent members of the Twitter sphere. By taking a new point of view on the consistency relationship between tweeting and retweeting behavior, we formulate political leaning quantification as an ill-posed linear inverse problem solved with regularization techniques. The result is an automated method that is simple, efficient and has an intuitive interpretation of the computed scores. Compared to existing manual and Twitter network-based approaches, our approach is able to operate at much faster timescales, and does not require explicit knowledge of the Twitter network, which is difficult to obtain in practice.

ACKNOWLEDGMENTS
The volume of the work would not have been possible without contributions, in one form or the other, by a few names to mention. We welcome this opportunity to express our heartfelt gratitude and regards to our project guide Prof. V. V. Kimbahune, Department of Computer Engineering, STES Smt. Kashibai Navale College of Engineering, for her unconditional guidance. She always bestowed parental care upon us and evinced keen interest in solving our problems.
repeated drive cycle, which can be utilized in an optimization algorithm to minimize the amount of fossil-fuel energy used during the trip.

Youness Riouali et al. [2] state that traffic flow modeling is a fundamental step for planning and controlling transportation systems. It is not only vital for enhancing safety and transportation efficiency, but it can additionally yield financial and ecological benefits. Considering the discrete and continuous aspects of traffic flow dynamics, hybrid Petri nets have turned out to be a powerful tool for approaching these dynamics, and they portray vehicle behavior precisely since they incorporate both perspectives. An extension of hybrid Petri nets is presented in this paper for generalizing traffic flow modeling by considering state conditions on external rules, which can be scheduled as well as nondeterministic in time, for example stop signs or priority roads. Also, a segmentation of roads is proposed to manage the exact localization of events.

Leyre Azpilicueta et al. [3] note that intelligent transportation systems (ITSs) are currently under intense research and development for making transportation more secure and more efficient. The development of such vehicular communication systems requires exact models for the propagation channel. A key characteristic of these channels is their transient fluctuation and inherent time-evolving statistics, which majorly affect electromagnetic propagation prediction. This article investigates the channel properties of a wireless communication system in a vehicular communication domain with deterministic modeling. An investigation of the physical radio channel propagation of an ultra-high-frequency (UHF) radio-frequency identification (RFID) system for a vehicle-to-infrastructure (V2I) scattering environment is presented. Another module was implemented in the proposed site-specific tool that considers the movement of the vehicles, leading to space-time scattering models which can capture the variability of traffic flow in a cross-sectional traffic detection environment. The dynamic models are applied to anticipate the evolution of traffic flow, and further used to create signal timing designs that account not just for the present condition of the system but in addition for the expected transient changes in rush-hour traffic flows. Factors influencing model accuracy are explored, including time-zone length, position of upstream traffic detection equipment, road section length, traffic volume, turning rates, and computation time. The effect of these factors on the model's performance is delineated through a simulation study, and the computational performance of the models is discussed. The outcomes demonstrate that both the dynamic speed-truncated normal distribution model and the dynamic Robertson model beat their respective static versions, and that they can additionally be applied for dynamic control.

Graf R. et al. [7] propose that future driving assistance systems will require an extended capacity to deal with complex driving situations and to respond properly as indicated by situation criticality and requirements for risk minimization. People driving on motorways can judge, for instance, cut-in situations of vehicles due to their experience. The thought displayed in this paper is to adapt these human capacities to technical systems and learn distinctive situations over time. Case-Based Reasoning is applied to foresee the conduct of road participants since it incorporates a learning aspect, in light of knowledge obtained from the driving history. This idea facilitates recognition by matching genuine driving situations against stored ones. In the first instance, the idea is assessed on action prediction of vehicles in neighbouring lanes on motorways and focuses on the part of the

frequency models. The strong reliance on the environment due to multipath propagation is demonstrated. These outcomes can help in the identification of the ideal location of the transceivers to limit power utilization and increase service performance, enhancing vehicular communications in ITS.

DAI Lei-lei et al. [4], introducing a view of characterizing and grouping extensive special events, investigate the passenger flow distribution qualities of substantial special events and consider the spatial and temporal dispersion of road traffic flow surrounding the event regions. By summarizing the traffic organization and management experiences of models at home and abroad, joined with the planning results, the paper structures a basic procedure of traffic organization and management for various vast special events, proposes static and dynamic traffic organization techniques and management methodologies, and plans the activity steps, which give a reference and direction to the traffic organization practice for expansive special events.

Thomas Liebig et al. [5] state that situation-dependent route planning gathers expanding interest as urban areas end up crowded and congested. They present a framework for individual trip planning that consolidates future traffic hazards in routing. Future traffic conditions are computed by a Spatio-Temporal Random Field dependent on a stream of sensor readings. Furthermore, their methodology estimates traffic flow in territories with low sensor coverage utilizing a Gaussian Process Regression. The conditioning of spatial regression on intermediate forecasts of a discrete probabilistic graphical model permits joining historical information, streamed online information and a rich dependency structure at the same time. They exhibit the framework with a genuine use-case from Dublin city, Ireland.

Shen, L. et al. [6] present work that centers around examining dynamic vehicles cutting into the path of the host vehicle.

Shen, L. et al. [8] state that lacking adequate temporal variation characteristic analysis and spatial correlation estimations leads to constrained completion accuracy, and represents a noteworthy test for an ITS. Utilizing the low-rank nature and the spatial-temporal correlation of traffic network information, this paper proposes a novel way to reconstruct the missing traffic information dependent on low-rank matrix factorization, which expounds the potential ramifications of the traffic network by decomposed factor matrices. To additionally exploit the temporal evolvement attributes and the spatial similarity of road links, the authors plan a time-series constraint and a versatile Laplacian regularization spatial requirement to investigate the neighborhood association of road links. The exploratory outcomes on six real-world traffic data sets demonstrate that their methodology beats alternate techniques and can effectively reconstruct the road traffic information exactly for different basic loss modes.

3. EXISTING SYSTEM APPROACH
In the existing framework, traffic jams in metropolitan cities are greater than in other urban city zones as well as rural areas, so traffic jams are the most common issue in current days. A traffic jam occurs when the movement of vehicles is hampered at a particular place for some reason over a certain period of time. If the number of vehicles plying on a street or road increases beyond the maximum capacity it is built to sustain, it results in traffic jams. Traffic jam, or traffic congestion, is an everyday affair in big cities. It is the result of the growing population and the increase in use of personal, public as well as commercial transport vehicles. The loss of profitable time brought about by traffic jams is not at all useful for
[1]Classification of Heart Diseases using two more attributes i.e. obesity and
K Nearest Neighbor and Genetic smoking. The data mining classification
Algorithm(2013) algorithms, namely Decision Trees, Naive
Nearest neighbor (KNN) is very Bayes, and Neural Networks are analyzed
simple, most popular, highly efficient and on Heart disease database.
effective technique for pattern recognition. [4]Cardio Vascular Disease Prediction
KNN is a straight forward classifier, where System using Genetic Algorithm(2012)
parts are classified based on the class of Medical Diagnosis Systems play
their nearest neighbor. Medical databases important role in medical practice and are
have big volume in nature. If the data set used by medical practitioners for diagnosis
contains excessive and irrelevant and treatment. In this work, a medical
attributes, classification may create less diagnosis system is defined for predicting
accurate result. Heart disease is the main the risk of cardiovascular disease. This
cause of death in INDIA. In Andhra system is built by combining the relative
Pradesh heart disease was the prime cause advantages of genetic technique and neural
of mortality accounting for 32% of all network. Multilayered feed forward neural
deaths, a rate as high as in Canada (35%) and the USA. Hence there is a need to define a decision support system that helps clinicians to take precautionary steps. This work proposed a new technique which combines KNN with a genetic technique for effective classification. The genetic technique performs a global search in complex, large and multimodal landscapes and provides an optimal solution.
[2] A Survey of Non-Local Means based Filters for Image De-noising (2013)
Image de-noising includes the manipulation of the image data to produce a visually high-quality image. The Non-Local Means filter was originally designed for Gaussian noise removal, and the filter is changed to adapt to speckle noise reduction. Speckle noise is the primary source of noise in medical ultrasound imaging and should be filtered out. This work reviews the existing Non-Local Means based filters for image de-noising.
[3] Improved Study of Heart Disease Prediction System Using Data Mining Classification Techniques (2012)
This work has analyzed prediction systems for heart disease using a larger number of input attributes. The work uses medical attributes such as sex, blood pressure and cholesterol (13 attributes in all) to predict the likelihood of a patient getting heart disease. Until now, 13 attributes have been used for prediction. This research work added
networks are particularly adapted to complex classification problems. The weights of the neural network are determined using a genetic technique because it finds an acceptably good set of weights in fewer iterations.
[5] Wavelet Based QRS Complex Detection of ECG Signal (2012)
A wide range of heart conditions is identified by thorough examination of the features of the ECG report. Automatic extraction of time-plane features is valuable for identification of vital cardiac diseases. This work presents a multi-resolution wavelet transform based system for detecting the 'P', 'Q', 'R', 'S', 'T' peak complex from the original ECG signal. The 'R-R' time lapse is an important minutia of the ECG signal that corresponds to the heartbeat of the person concerned. An abrupt increase in the height of the 'R' wave or changes in the measurement of the 'R-R' interval denote various anomalies of the human heart. Similarly, the 'P-P', 'Q-Q', 'S-S' and 'T-T' intervals also correspond to various anomalies of the heart, and their peak amplitudes also indicate other cardiac diseases. In this proposed method the 'PQRST' peaks are marked and stored over the entire signal, and the time interval between two consecutive 'R' peaks and the other peak intervals are measured to find anomalies in the behavior of the heart, if any.
[6] Heart Disease Diagnosis using Data Mining Technique - Decision Tree
It has
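The k-nearest-neighbour vote at the heart of the approach surveyed in [1] can be sketched as follows. This is an illustrative, stdlib-only sketch with a toy two-feature dataset and k = 3 (both assumptions); the genetic search over attribute weights described in the survey is omitted:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points (Euclidean distance)."""
    by_distance = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy records: (blood pressure, cholesterol) -> 0 = healthy, 1 = at risk.
train = [((120, 180), 0), ((125, 190), 0), ((160, 260), 1), ((155, 240), 1)]
print(knn_predict(train, (158, 250)))  # → 1 (both nearest neighbours are at risk)
```

A genetic step would sit on top of this: candidate attribute-weight vectors are evolved and scored by the classification accuracy the weighted distance achieves.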
4. GAP ANALYSIS

Sr no | Author | Title | Publisher | Conclusion | Limitations
1 | M. Akhil Jabbar, B. L. Deekshatulu, Priti Chandra | Classification of Heart Diseases using K Nearest Neighbor and Genetic Algorithm | | Use KNN and genetic algorithm for heart disease detection | 1. Low accuracy 2. Limited dataset used
Metric | Existing System | Proposed System
Precision | 0.825 | 0.9
Recall | 0.825 | 0.9
F-Measure | 0.825 | 0.9

This work can be enhanced by increasing the number of attributes used for disease prediction, making the system more accurate.
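The precision, recall and F-measure values above follow from standard confusion-matrix counts; a minimal sketch (the counts 90/10/10 are illustrative, not the paper's data):

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Illustrative counts: 90 true positives, 10 false positives, 10 false negatives.
p, r, f = prf(90, 10, 10)
print(round(p, 3), round(r, 3), round(f, 3))  # → 0.9 0.9 0.9
```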
Keywords
Semantic cross media hashing Method, SIFT Descriptor, Word Embedding, Ranking,
Mapping
1. INTRODUCTION
With the fast development of the internet and multimedia, information in various forms has become easy to access, modify and duplicate. Information in various forms may have semantic correlation: for example, a microblog on Facebook often carries tags, and a video on YouTube is always associated with a related description or tag as semantic information. Data of different modalities thus inherently co-occur, which creates a great emerging demand for applications like cross-media retrieval, image annotation and recommendation systems. Therefore, hashing-based similarity methods, which perform exact or approximate search, have been suggested and have received remarkable attention in the last few years. The core problem of hash learning is how to formulate the underlying correlation between multiple modalities and retain/protect the similarity relation within each respective modality. Generally, hashing methods are divided into two categories: matrix decomposition methods and vector based methods. Matrix decomposition based hashing methods search low-dimensional spaces to reconstruct data and quantify the reconstruction coefficients to obtain binary codes. Such methods avoid graph construction and eigendecomposition. Their drawback is that they cause large quantization errors, which deteriorate performance for large code lengths. We have designed a multi-modal hashing model, SCMH, which focuses on image and text data with binary-representation hashing. This method processes text data using a Skip-gram model and image data using the SIFT descriptor.
After that, it generates hash codes using a deep neural network, avoiding duplicates.

Motivation
Existing approaches use Canonical Correlation Analysis (CCA), manifold learning, dual-wing harmoniums, deep autoencoders, and deep Boltzmann machines to approach the task.
Due to the efficiency of hashing-based methods, there also exists a rich line of work focusing on the problem of mapping multi-modal high-dimensional data to low-dimensional hash codes, such as Latent Semantic Sparse Hashing (LSSH), Discriminative Coupled Dictionary Hashing (DCDH), Cross-view Hashing (CVH), and so on.

2. RELATED WORK
A literature survey is the most important step in any kind of research. Before starting development we need to study the previous papers of the domain we are working in, and on the basis of that study we can identify the drawbacks and start working with reference to the previous papers.
In this section, we briefly review the related work on tag search and image search and their different techniques.
Y. Gong, S. Lazebnik, A. Gordo, and F. Perronnin: This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections. We formulate this problem in terms of finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube, and propose a simple and efficient alternating minimization algorithm to accomplish this task [1].
Y. Pan, T. Yao, T. Mei, H. Li, C.-W. Ngo, and Y. Rui: We demonstrate in this paper that the above two fundamental challenges can be mitigated by jointly exploring cross-view learning and the use of click-through data. The former aims to create a latent subspace with the ability to compare information from the originally incomparable views (i.e., textual and visual views), while the latter explores the largely available and freely accessible click-through data (i.e., "crowdsourced" human intelligence) for understanding the query [2].
D. Zhai, H. Chang, Y. Zhen, X. Liu, X. Chen, and W. Gao: In this paper, we study HFL in the context of multimodal data for cross-view similarity search. We present a novel multimodal HFL method, called Parametric Local Multimodal Hashing (PLMH), which learns a set of hash functions to locally adapt to the data structure of each modality [3].
G. Ding, Y. Guo, and J. Zhou: In this paper, we study the problems of learning hash functions in the context of multimodal data for cross-view similarity search. We put forward a novel hashing method, which is referred to as Collective Matrix Factorization Hashing (CMFH) [4].
H. Jégou, F. Perronnin, M. Douze, J. Sánchez, P. Pérez, and C. Schmid: This paper addresses the problem of large-scale image search. Three constraints have to be taken into account: search accuracy, efficiency, and memory usage. We first present and evaluate different ways of aggregating local image descriptors into a vector and show that the Fisher kernel achieves better performance than the reference bag-of-visual-words approach for any given vector dimension [5].
J. Zhou, G. Ding, and Y. Guo: In this paper, we propose a novel Latent Semantic Sparse Hashing (LSSH) to perform cross-modal similarity search by employing Sparse Coding and Matrix Factorization. In particular, LSSH uses Sparse Coding to capture the salient structures of images, and Matrix Factorization to learn the latent concepts from text [6].
Z. Yu, F. Wu, Y. Yang, Q. Tian, J. Luo, and Y. Zhuang: In DCDH, the
coupled dictionary for each modality is learned with side information (e.g., categories). As a result, the coupled dictionaries not only preserve the intra-similarity and inter-correlation among multi-modal data, but also contain dictionary atoms that are semantically discriminative (i.e., data from the same category is reconstructed by similar dictionary atoms) [7].
H. Zhang, J. Yuan, X. Gao, and Z. Chen: In this paper, we propose a new cross-media retrieval method based on short-term and long-term relevance feedback. Our method mainly focuses on two typical types of media data, i.e. image and audio. First, we build a multimodal representation via statistical canonical correlation between image and audio feature matrices, and define a cross-media distance metric for similarity measure; then we propose an optimization strategy based on relevance feedback, which fuses short-term learning results and long-term accumulated knowledge into the objective function [8].
A. Karpathy and L. Fei-Fei: We present a model that generates natural language descriptions of images and their regions. Our approach leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between language and visual data. Our alignment model is based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding [9].
J. Song, Y. Yang, Y. Yang, Z. Huang, and H. T. Shen: In this paper, we present a new multimedia retrieval paradigm to innovate large-scale search of heterogeneous multimedia data. It is able to return results of different media types from heterogeneous data sources, e.g., using a query image to retrieve relevant text documents or images from different data sources [10].

3. EXISTING SYSTEM
A lot of work has been done in this field because of its extensive usage and applications. In this section, some of the approaches which have been implemented to achieve the same purpose are mentioned. These works are differentiated mainly by the algorithm used for multimedia retrieval.
In another line of research, the training-set images were divided into blobs, each with an associated keyword. For any input test image, the image is first divided into blobs, and then the probability of a label describing a blob is found using the information that was used to annotate the blobs in the training set.
From our study of these papers, the open issues relate to tag-based search and image search. The challenge is to rank the top-viewed images while ensuring the diversity of those images; existing search suffers from this diversity problem, so the open issue is diversity.

4. PROPOSED SYSTEM
We propose a novel hashing method, called semantic cross-media hashing (SCMH), to perform the near-duplicate detection and cross-media retrieval task. We propose to use a set of word embeddings to represent textual information. The Fisher kernel framework is incorporated to represent both textual and visual information with fixed-length vectors. For mapping the Fisher vectors of different modalities, a deep belief network is proposed to perform the task. We evaluate the proposed method SCMH on two commonly used data sets. SCMH achieves better results than state-of-the-art methods for different lengths of hash codes, and also displays query results in ranked order.

Advantages:
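SCMH itself builds on Fisher vectors and a deep belief network. As a stand-in illustration of the general idea of mapping feature vectors of different modalities to binary codes and ranking by Hamming distance, here is a random-hyperplane hashing sketch; all vectors, dimensions and the bit length are made up for the example:

```python
import random

random.seed(0)  # fixed seed so the sketch is deterministic

def make_hasher(dim, bits):
    """Random-hyperplane hashing: each bit is the sign of a projection
    onto one random direction."""
    planes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]
    def code(vec):
        return tuple(int(sum(p * v for p, v in zip(plane, vec)) >= 0)
                     for plane in planes)
    return code

def hamming(a, b):
    """Number of differing bits between two codes."""
    return sum(x != y for x, y in zip(a, b))

code = make_hasher(dim=4, bits=16)
img = [0.9, 0.1, 0.4, 0.7]        # stand-in for an image feature vector
txt_near = [0.8, 0.2, 0.5, 0.6]   # stand-in for a semantically close text vector
txt_far = [-v for v in img]       # stand-in for an unrelated (opposite) vector

# Vectors that are close in feature space receive nearby hash codes.
assert hamming(code(img), code(txt_near)) < hamming(code(img), code(txt_far))
```

Ranking query results then reduces to sorting candidates by Hamming distance to the query's code, which is far cheaper than comparing real-valued vectors directly.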
REFERENCES
that are used to meet customer demands through the planning, control and implementation of the effective movement and storage of related information and services from origin to destination, and also maintains user information in the form of a QR code. The proposed system focuses on delivery of goods and raw materials, and on shifting home appliances and furniture during relocation.

3. LITERATURE SURVEY
1. An Automated Taxi Booking and Scheduling System
This work presents an Automated Taxi Booking and Scheduling System with safe booking. The system gives a convenient, assured and safe booking for both taxi drivers and registered customers through PDAs. When many customers arrive at the same time, problems occur, since there is no taxi parking, central working environment or booking structure for the large number of taxis.
2. Autonomous vehicle logistic system: Joint routing and charging strategy
The principal aim of this framework is to make the inevitable changes more tangible. It begins from the general agreement that the business is changing and goes further to indicate and measure the extent of change. Within a more complex and expanded mobility industry landscape, incumbent players will be compelled to simultaneously compete on different fronts and cooperate with organizations. City type will replace country or region as the most significant segmentation dimension that determines mobility behavior.
3. Integration of vehicle routing and resource allocation in a dynamic logistics network
This work presents a multi-period, integrated vehicle routing and resource allocation problem. Ignoring the interdependencies between vehicle routing and resource allocation appears to be mediocre; a combination of the two problems overcomes this inadequacy. The two sub-problems can be solved sequentially (SP), by means of hierarchical decision making (FI), or by model update (DI). The last two approaches are derived from Geoffrion's idea of model integration. A stochastic programming approach to the transportation problem is not addressed.
4. Product allocation to different types of distribution center in retail logistics networks
In this work, a novel solution approach is developed and applied to a real-life case of a leading European grocery retail chain. City type will replace country or region as the most significant segmentation dimension that determines mobility behavior. A further aspect arises from assuming identical store delivery frequencies in outbound transportation from all DC types.
5. The dynamic vehicle allocation problem with application in trucking companies in Brazil
This paper deals with the dynamic vehicle allocation problem (DVAP) in road transportation of full truckloads between terminals. The DVAP involves multi-period asset allocation and consists of defining the movements of a fleet of vehicles that transport products between terminals with a wide geographical distribution. The results of a practical validation of the proposed model and solution strategies are not clearly specified.
6. Road-based goods transportation: A survey of real-world logistics applications from 2000 to 2015
This paper gives a review of the main real-world applications of road-based goods transportation over the previous 15 years. It reviews papers in the areas of oil, gas and fuel transportation, retail, waste collection and management, mail and parcel delivery, and food distribution. It addresses the integration of routing problems with other parts of the supply chain.
customer, and also the customer is unable to trace the current location of the transported material. The primary concern in our framework is that we need to give end-to-end security for client and supplier information by utilizing the QR code concept. In the QR code binary image we need to hide the client and supplier information; only an authorized client can see the information. For customer interest mining we used the collaborative filtering method. The fundamental principle of this strategy is recommendation of a vehicle according to supplier service. The recommendation is used to discover client interest and suggest related options. Client advice is a term used in the sense of interest mining: one can give guidance for the problem or can simply give an answer. Guidance is, in effect, an opinion accompanied by a request or instruction. A recommendation follows: a customer's interest rating of an organization is used by a new customer when choosing that organization's vehicle service. We need to give end-to-end security for client and supplier information by utilizing the QR code idea.
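The collaborative filtering step mentioned above can be sketched as user-based filtering with cosine similarity. The rating data and item names below are hypothetical, not from the paper:

```python
import math

# Hypothetical customer -> {vehicle type: rating} data.
ratings = {
    "alice": {"mini_truck": 5, "tempo": 3, "container": 1},
    "bob":   {"mini_truck": 4, "tempo": 3, "container": 1},
    "carol": {"mini_truck": 1, "tempo": 2, "container": 5},
}

def cosine(u, v):
    """Cosine similarity over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

# Recommend to a customer with tastes like alice's: find the most similar profile,
# whose highly rated vehicles would then be suggested.
target = ratings["alice"]
best = max((name for name in ratings if name != "alice"),
           key=lambda name: cosine(target, ratings[name]))
print(best)  # → bob
```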
7. ACKNOWLEDGMENTS
It gives us great pleasure to present the preliminary project report on the modern logistics vehicle system using tracking and security. We would like to take this opportunity to thank our internal guide Prof. G. Gunjal for giving us all the help and guidance we needed. We are really grateful for their kind support; their valuable suggestions were very helpful. We are also grateful to Dr. P. N. Mahalle, Head of the Computer Engineering Department.

REFERENCES
[1] Albara Awajan, "An Automated Taxi Booking and Scheduling System," Conference on Automation Engineering, 12 January 2015.
[2] A. Holzapfel, H. Kuhn, and M. G. Sternbeck, "Product allocation to different types of distribution center in retail logistics networks," European Journal of Operational Research, February 2016.
[3] J. Q. Yu and A. Y. S. Lam, "Autonomous vehicle logistic system: Joint routing and charging strategy," IEEE Transactions on Intelligent Transportation Systems, 2016.
[4] R. A. Vasco and R. Morabito, "The dynamic vehicle allocation problem with application in trucking companies in Brazil," Computers and Operations Research, 24 April 2016.
[5] L. C. Coelho, J. Renaud, and G. Laporte, "Road-based goods transportation: A survey of real-world logistics applications from 2000 to
NETWORK AND
CYBER SECURITY
regulator. People of all age groups must willingly exercise their right to vote without feeling any sort of dissatisfaction. Currently 42% of internet users in India have an average internet connection speed above 4 Mbit/s, 19% have a speed over 10 Mbit/s, and 10% enjoy speeds over 15 Mbit/s. The average internet connection speed on mobile networks in India was 4.9 Mbit/s. With so many people connected to the internet, the idea of using an OVS is very much feasible, and it also overcomes various other problems faced during the election process, such as creating awareness in rural areas and among youth, cost reduction, security, etc.

4. SOME IMPORTANT POINTS FROM REVIEW OF OVS [6] -

1) Trust in Internet Voting –
Trust in the electoral process is essential for a successful democracy. However, trust is a complex concept, which requires that individuals make rational decisions based on the facts to accept the integrity of Internet voting. Technical institutions and experts can play an important role in this process, with voters trusting the procedural role played by independent institutions and experts in ensuring the overall integrity of the system.
One of the fundamental ways to enable trust is to ensure that information about the Internet voting system is made publicly available.
A vital aspect of integrity is ensured through testing, certification and audit mechanisms. These mechanisms will need to demonstrate that the security concerns presented by Internet voting have been adequately dealt with.

2) The Secrecy and Freedom of the Vote –
Ensuring the secrecy of the ballot is a significant concern in every voting situation. In the case of Internet voting from unsupervised environments, this principle may easily become the main challenge. Given that an Internet voting system cannot ensure that voters are casting their ballots alone, the validity of Internet voting must be demonstrated on other grounds.

3) Accessibility of Internet Voting –
Improving accessibility to the voting process is often cited as a reason for introducing Internet voting. The accessibility of online voting systems, closely linked to usability, is relevant not only for voters with disabilities and linguistic minorities, but also for the average voter.
The way in which voters are identified and authenticated can have a significant impact on the usability of the system, but a balance needs to be found between accessibility and integrity. Different groups in society have different levels of access to the Internet. Therefore, the provision of online voting in societies where there is very unequal access to the Internet will have a different impact on accessibility for various communities.

4) Electoral Stakeholders and Their Roles –
The introduction of Internet voting significantly changes the role that stakeholders play in the electoral process. Not only do new stakeholders, such as voting technology suppliers, assume prominence in the Internet voting process, but existing stakeholders must adapt their roles in order to fulfill their existing functions. Central to this new network of stakeholder relationships is public administration, especially the role of the EC. Public administration and the EC will establish the legal
for only one login session or transaction, on a computer system or other digital device. OTPs avoid a number of shortcomings that are associated with traditional password-based authentication; a number of implementations also incorporate two-factor authentication by ensuring that the one-time password requires access to something a person has (such as a smartcard or a specific cellphone) as well as something a person knows (such as a PIN).
This ensures that an individual can vote only for himself/herself, thus reducing fraudulent votes. Only when the user enters the correct Aadhar card number and mobile number and sets a password will the website give the option to generate an OTP. On clicking it, an OTP will be sent to the user's mobile number within 2 minutes. On entering the correct OTP, the user will be able to log in and cast a vote. Once the user selects the candidate he/she wants to vote for, the system will pop up a confirmation message. Once the user confirms the vote, he/she will be automatically logged out from the website, thus preventing the user from voting again.
Additionally, the website will have another option for admin login. The admins are officers selected by the Election Commission who will monitor the voting as it progresses and will have their profiles created by the Election Commission. Their main tasks will be to start/stop the election on time, make sure it progresses without any issues, and generate local ward results once elections are finished and send them to the Election Commission.
On the information front, the website will have details of all candidates selected by the respective parties for different wards. On selecting any candidate name, complete information about that candidate will be displayed. Various awareness programs that the Election Commission is conducting will also be displayed. This will help voters gain more knowledge of the voting process.
The results, which the admins will send to the ECI, will be further analyzed. The results will be broken down into the result of each state, the overall winner of the election, and by how much a certain party has beaten the other competitors.

ALGORITHM FOR PROPOSED OVS –
Algorithm: Successful online voting
Input: Biodata of voter & candidate, various wards' details.
Output: Successful voting for voters and declaration of results.
Steps:
1. Person must be 18 years of age or above.
2. Fill Form 6 for first-time registration in the respective ward office.
3. For changes in details, contact the respective ward office.
4. Necessary documents must be submitted while doing steps 2 & 3. Failing to do so will result in rejection of the form.
5. Once forms and documents are verified, a data entry operator will enter the person's details in the database and a default password will be sent to the user.
6. On receiving the password, the user must log in with it and must select a new password to access the website for further use.
7. Once the new password is set, the user can view their profile and election-related information.
8. If any discrepancies are found in the profile, step 3 must be followed.
9. To cast a vote, the user must enter an OTP, which will be sent to the registered mobile number and is active for 1 minute.
10. If the OTP is not received, repeat step 9.
11. Once the user enters the correct OTP, the vote can be cast.
12. On successful voting, a confirmation message will be displayed and the user will be logged out.
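The OTP handling described in the steps above (single use, valid for 1 minute) can be sketched as follows. This is an illustrative sketch, not the system's actual implementation; the store and function names are invented for the example:

```python
import secrets
import time

OTP_TTL_SECONDS = 60  # the proposal keeps an OTP valid for 1 minute

_pending = {}  # mobile number -> (otp, issued_at)

def generate_otp(mobile):
    """Issue a fresh 6-digit OTP and record when it was issued."""
    otp = f"{secrets.randbelow(10**6):06d}"
    _pending[mobile] = (otp, time.time())
    return otp

def verify_otp(mobile, attempt):
    """Accept only the latest OTP, only once, and only within its window."""
    if mobile not in _pending:
        return False
    otp, issued = _pending.pop(mobile)  # single use: removed after one check
    return attempt == otp and time.time() - issued <= OTP_TTL_SECONDS

otp = generate_otp("9999999999")
assert verify_otp("9999999999", otp)      # correct and fresh: vote may be cast
assert not verify_otp("9999999999", otp)  # already consumed: voting again fails
```

A production system would additionally deliver the OTP over SMS, rate-limit attempts, and hold the pending codes in a server-side store rather than process memory.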
6. PROPOSED WORK
Today, almost everyone in the world has a smartphone in hand. In this project we present an Android application that is lightweight, flexible and power-efficient
It is easy to read by any device and any user,
It has a high encoding capacity enhanced by error correction facilities,
It is small in size and robust to geometrical distortion.
Visual cryptography is a new secret-sharing technology. It uses secret share images to conceal the secret, relying on human visual decryption to restore it. Compared with traditional cryptography, it has the advantages of concealment, security, and simplicity of secret recovery. The visual cryptography method provides the high security required by users and protects them against various security attacks. It is easy to generate value in business applications. In this paper, we propose a standard multi-color QR code using textured patterns, with data hiding by text steganography and security provided by a visual secret sharing scheme.

2. MOTIVATION
The motivation of the work is to show that the storage capacity can be significantly improved by increasing the code alphabet q or by increasing the textured pattern size. This increases the storage capacity of the classical QR code. It provides security for a private message using a visual secret sharing scheme.

3. STATE OF ART
The paper [1] proves that the contrast of XVCS is several times greater than that of OVCS. The monotone property of the OR operation degrades the visual quality of the reconstructed image for OR-based VCS (OVCS). Accordingly, XOR-based VCS (XVCS), which uses the XOR operation for decoding, was proposed to enhance the contrast. Advantages are: the secret image is easily decoded by a stacking operation; XVCS gives a better reconstructed image than OVCS. Disadvantages are: the proposed algorithm is more complicated.
Paper [2] presents a blind, key-based watermarking technique, which embeds a transformed binary form of the watermark data into the DWT domain of the cover image and uses a unique image code for the detection of image distortion. The QR code is embedded into the attack-resistant HH component of the 1st-level DWT domain of the cover image to detect malicious interference by an attacker. Advantages are: more information representation per bit change, combined with error-correction capabilities; it increases the usability of the watermark data and maintains robustness against visually invariant data removal attacks. Disadvantages are: it is limited to an LSB bit in the spatial domain of the image intensity values; since the spatial domain is more susceptible to attacks, this cannot be used.
Paper [3] designs a secret QR sharing approach to protect private QR data with a secure and reliable distributed system. The proposed approach differs from related QR code schemes in that it uses the QR characteristics to achieve secret sharing and can resist the print-and-scan operation. Advantages are: it reduces the security risk of the secret; the approach is feasible; it provides content readability, cheater detectability, and an adjustable secret payload of the QR barcode. Disadvantages are: the security of the QR barcode needs improvement; the QR technique requires reducing the modifications.
The two-level QR code (2LQR) has two storage levels, public and private, and can be used for document authentication [4]. The public level is the same as the standard QR code storage level; therefore it is readable by any classical QR code application. The private level is constructed by replacing the black modules with specific textured patterns. It consists of information encoded using a q-ary code with an error-correction capacity. Advantages are: it increases the storage capacity of the classical QR code; the textured patterns used in 2LQR are sensitive to the P&S process. Disadvantages are: the pattern recognition method needs improvement. Need to
increase the storage capacity of 2LQR by replacing the white modules with textured patterns.
To protect sensitive data, paper [5] explores the characteristics of QR barcodes to design a secret hiding mechanism for the QR barcode with a higher payload compared to past ones. With a normal scanner, a browser can only reveal the formal information from the marked QR code. Advantages are: the designed scheme is feasible for hiding secrets in a tiny QR tag for the purpose of steganography; only the authorized user with the private key can reveal the concealed secret. Disadvantages are: the security needs to be increased.

4. GAP ANALYSIS

TABLE: GAP ANALYSIS
Sr. No. | Author, Title and Journal Name | Technique Used | Advantages
1 | C. N. Yang, D. S. Wang, "Property Analysis of XOR-Based Visual Cryptography," IEEE Transactions on Circuits & Systems for Video Technology, vol. 24, no. 12, pp. 189-197, 2014. | XOR-based VCS (XVCS) | 1. Easily decode the secret image by stacking operation. 2. XVCS has better reconstructed image than OVCS.
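The XOR-based decoding that [1] credits with lossless reconstruction can be illustrated with a minimal (2, 2) sharing sketch. A bit list stands in for one row of a binary image; this illustrates only the XVCS principle, not the paper's (k, n) construction:

```python
import secrets

def make_shares(secret_bits):
    """(2, 2) XOR-based sharing: share1 is uniformly random and
    share2 = secret XOR share1, so either share alone reveals nothing."""
    share1 = [secrets.randbits(1) for _ in secret_bits]
    share2 = [s ^ r for s, r in zip(secret_bits, share1)]
    return share1, share2

def reconstruct(share1, share2):
    """XOR decoding recovers the secret exactly, with no contrast loss
    (unlike OR-based stacking, whose monotone OR degrades quality)."""
    return [a ^ b for a, b in zip(share1, share2)]

secret = [1, 0, 1, 1, 0, 0, 1, 0]  # e.g. one row of a binary QR module matrix
s1, s2 = make_shares(secret)
assert reconstruct(s1, s2) == secret
```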
Define two blocks as belonging to an identical group G if condition (3) is satisfied. With the above definition, we can divide the blocks into several groups. For example, to determine whether two blocks are of the same group, we evaluate the grouping condition; if it holds, we conclude that the two blocks are of an identical group, and vice versa. A block different from all other blocks will not be contained in any group. One block is said to be responsible for another if it is reversed to share it; one symbol denotes the case that a block is responsible for another, and another symbol represents the opposite. A matrix X is constructed by solving (1). (4) If n satisfies the stated condition, there must be a solution to (1). In addition, we can adjust the parameter value to balance errors between the covers and the reconstructed secret. Based on X, we design a new sharing algorithm.

7. CONCLUSION
In this paper, we proposed a visual secret sharing scheme for QR code applications, which makes improvements mainly in two aspects: higher security and more flexible access structures. In addition, we extended the access structure from (n, n) to (k, n) by further investigating the error correction mechanism of QR codes. Two division approaches are provided, effectively improving the sharing efficiency of the (k, n) method. Therefore, the computational cost of our work is much smaller than that of the previous studies which can also achieve the (k, n) sharing method. Future work will make the QR code reader scan a QR code within a fraction of a second.

REFERENCES
[1] C. N. Yang, D. S. Wang, "Property Analysis of XOR-Based Visual Cryptography," IEEE Transactions on Circuits & Systems for Video Technology, vol. 24, no. 12, pp. 189-197, 2014.
[2] P. P. Thulasidharan, M. S. Nair, "QR code based blind digital image watermarking with attack detection code," AEU - International Journal of Electronics and Communications, vol. 69, no. 7, pp. 1074-1084, 2015.
[3] P. Y. Lin, "Distributed Secret Sharing Approach with Cheater Prevention Based on QR Code," IEEE Transactions on Industrial Informatics, vol. 12, no. 1, pp. 384-392, 2016.
[4] I. Tkachenko, W. Puech, C. Destruel, et al., "Two-Level QR Code for Private Message Sharing and Document Authentication," IEEE Transactions on Information Forensics & Security, vol. 11, no. 13, pp. 571-583, 2016.
[5] P. Y. Lin, Y. H. Chen, "High payload secret hiding technology for QR codes," EURASIP Journal on Image & Video Processing, vol. 2017, no. 1, p. 14, 2017.
[6] https://en.wikipedia.org/wiki/QR_code
[7] F. Liu, T. Guo, "Privacy protection display implementation method based on visual passwords," CN Patent App. CN 201410542752, 2015.
[8] S. J. Shyu, M. C. Chen, "Minimizing Pixel Expansion in Visual Cryptographic Scheme for General Access Structures," IEEE Transactions on Circuits & Systems for Video Technology, vol. 25, no. 9, 2015.
[9] H. D. Yuan, "Secret sharing with multi-cover adaptive steganography," Information Sciences, vol. 254, pp. 197-212, 2014.
[10] J. Weir, W. Q. Yan, "Authenticating Visual Cryptography Shares Using 2D Barcodes," in Digital Forensics and Watermarking. Berlin, Germany: Springer Berlin Heidelberg, 2011, pp. 196-210.
IPFS provides a high-throughput, content-addressed block storage model, with content-addressed hyperlinks. This forms a generalized Merkle directed acyclic graph (DAG). IPFS combines a distributed hash table, an incentivized block exchange, and a self-certifying namespace. IPFS has no single point of failure, and nodes do not need to trust each other not to tamper with data in transit. Distributed content delivery saves bandwidth and prevents DDoS attacks, which HTTP struggles with.
Section 2 contains the State of Art, Section 3 contains the Gap Analysis, Section 4 contains User Classes and Characteristics, Section 5 contains the Proposed Work, Section 6 contains the Conclusion and Future Work, and Section 7 contains the References.
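The content-addressing principle that lets IPFS nodes distrust one another can be sketched in a few lines. Plain SHA-256 hex stands in for IPFS's multihash format, and the in-memory dictionary stands in for a real peer-to-peer block store; both are assumptions for the example:

```python
import hashlib

def content_address(data: bytes) -> str:
    """A block's address is the hash of its contents (IPFS uses
    multihash-encoded digests; plain SHA-256 hex stands in here)."""
    return hashlib.sha256(data).hexdigest()

store = {}  # stand-in for a distributed block store

def put(data: bytes) -> str:
    addr = content_address(data)
    store[addr] = data
    return addr

def get(addr: str) -> bytes:
    data = store[addr]
    # Any peer can serve the block: the address itself verifies integrity,
    # so the serving node need not be trusted.
    if content_address(data) != addr:
        raise ValueError("block was tampered with in transit")
    return data

addr = put(b"hello merkle dag")
assert get(addr) == b"hello merkle dag"
```

Because identical content always hashes to the same address, duplicates are stored once, and a Merkle DAG falls out naturally by storing child addresses inside parent blocks.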
rates (80%) but the guessing rate was also high (39.5%). Word associations produced low guessing rates (7%), but response words were poorly recalled (39%). Nevertheless, both cognitive items and word associations showed sufficient promise as password techniques to warrant further investigation.
Priyanka Sonawane and Archana Augustine, "Enhancing the security of secondary authentication system based on event logger" [5]: the web application provides secondary authentication when a user forgets their password. For that, the user has to select a question from pre-defined lists of questions, which include long-term history questions such as "What was your first school?" or "What is your birthplace?". The answers to such questions will not change over a decade, but they can easily be broken using social networking sites such as Facebook, and they can also be guessed by brute-force attack. To overcome this problem, the authors present a secondary authentication system based on the user's mobile data. Today's smartphones come with built-in features such as GPS. The authors used the data for calls, SMS history, the calendar, and application installs, created questions based on this data, and categorized them as multiple-choice, blank-filling, and true/false. The SVM algorithm is used to fetch the user's mobile activity, and the RSA algorithm is used to keep the answers to the questions secure.
Ariel Rabkin, "Personal knowledge questions for fallback authentication: security questions in the era of Facebook" [4]: security questions (or challenge questions) are commonly used to authenticate users who have lost their passwords. The author examined the password retrieval mechanisms for a number of personal banking websites and found that many of them rely in part on security questions with serious usability and security weaknesses. The author discusses patterns in the observed security questions and argues that today's personal security questions owe their strength to the hardness of an information-retrieval problem. However, as personal information becomes ubiquitously available online, the hardness of this problem, and the security provided by such questions, will likely diminish over time. The author supplements the survey of bank security questions with a small user study that supplies some context for how such questions are used in practice.

3. STUDY RECRUITMENT
To study the reliability and security of personal questions, we ran a laboratory study over four separate days between March 22 and June 23, 2008, with a follow-up study in September and October. The cohorts assigned to each day are shown in Table 2a. The study encompassed both the personal questions used by Windows Live's password-reset workflow and the

3.1 Participant recruitment
Our recruiting team selected participants from a larger pool of potential participants they maintain for all studies at Microsoft. The pool contains members of the general public who had been recruited via public events, lotteries, and our website. We required that participants speak English as their primary language and not be employed by Microsoft. Our recruiters selected a balance of men and women; 64 participants were male and 66 female. The recruiters also selected participants with a diversity of ages and professions. Participants in the first three cohorts were required to be Hotmail users for at least three months and to access their account at least three times a week. The great majority of participants (83%) had been using their Hotmail account for at least four years, as detailed in Table 2d. After recruiting one qualified participant, our recruiters would ask if the participant had a coworker, friend, or family member who might also be qualified for the study. Recruiters then
interviewed potential partners to ensure they met our requirements.

3.2 Initial laboratory visit
We scheduled participants for a two-hour visit to perform the tasks summarized in Table 1. Participants in each session were split into groups and placed into different rooms such that no two partners were in the same room. Each partner was placed at a computer. We seated participants sufficiently far from each other to ensure that their screens, on which their answers might appear while being typed, could not be seen by others. All questions were asked using web survey software, though participants were required to be on-site to prevent collusion.

Table 1. Order of laboratory visit tasks
1) Move to room separate from partner
2) Answer demographic questions
3) Authenticate to Hotmail using personal question (cohorts 1-3)
4) Answer personal questions for top four webmail services
5) Describe relationship with partner
6) Guess partner's answers to personal questions
7) Attempt to recall answers to own personal questions
8) Second chance to guess partner's questions using online research (cohorts 2-4)

Authentication to Hotmail. We explained to participants how personal questions could be used to reset the passwords participants used to log in to Hotmail. We asked the 116 participants in the first three cohorts (those selected to be Hotmail users) to attempt to answer their personal question. We asked them only to authenticate (provide the answer to their question) and not to actually reset their password if successful.
Initial answers to personal questions. We then asked all 130 participants to answer all of the personal questions in use by the top four webmail services. We told participants that we would ask the same questions later to determine how well they remembered the answers. We offered two prizes (an XBOX 360 and a Zune digital music player) and gave participants a virtual lottery ticket for each question they both answered and later recalled. We randomized the question order for each participant. We asked participants to mark questions they were either unable or unwilling to answer. We instructed participants that capitalization, punctuation, and spaces would be ignored when comparing answers. We anticipated that participants might try to increase their chance of recalling their answers by providing the same answer for all questions, so we added a rule that eliminated rewards for recalling the same answer numerous times. We also feared that if participants

3.3 Guessing by acquaintances
We asked participants to describe their relationship with their partner and asked them whether they would trust their partner with their Hotmail password. Then we asked them to guess their partners' answers. As before, we presented the questions in random order and rewarded success with an increased opportunity to win one of our prizes, though we could not tell participants which answers were correct. We allowed participants to guess up to five times by placing guesses on separate lines. We restricted participants from communicating answers to each other by asking them to turn off their mobile devices ("as a courtesy to others"), isolating them in separate rooms, and monitoring their behavior. After running the first cohort of the study (40 participants), we discovered that many participants weren't guessing as hard as we had hoped. Most were providing at most one guess per answer, and none appeared to be performing any online research. We thus gave the 90 participants in the three remaining cohorts (cohorts 2-4) a second opportunity to guess their partners' answers. In this second guessing
round, we encouraged them to use search engines and social networking sites to research the answers to their partners' questions. We also told them that this was the last task of the study, in hopes that they might feel less rushed.

3.4 Limitations
We design a user authentication system with a set of secret questions created based on data about the user's daily activity and short-term smartphone usage. We evaluate the reliability and security by using true/false type secret questions. These questions are easy to answer and need not be memorized, because they are based on the user's personal life and events. Because of this, application security will be enhanced, since only the user knows the events and things he/she did recently.

4. ANSWER COMPARISON ALGORITHMS
In total, 130 participants initially provided 2,874 answers, and 49 participated in the follow-up study and tried to recall 1,074 of those answers. We needed an algorithm for determining whether a recollection, or a partner's guess, sufficiently matched the original. We tested three different algorithms. For all algorithms, we removed all nonalphanumeric characters and forced letters into lower case. When counting the number of attempts to recall an answer, we did not count repetitions of the same guess. Attackers learn nothing by being able to repeat a guess, whereas account holders, who may repeat the same answer thinking they previously mistyped it, will not be penalized for this mistake.
The first algorithm, simple equality, compares the resulting simplified strings character for character. This is the algorithm that was used, during the memorability follow-up study, to provide participants with feedback as to whether they had recalled their answers correctly. Unfortunately, we could not use the equality algorithm for examining partners' guesses due to an artifact of our study. The Illume survey software we used to collect the guesses participants provided for their partners' answers fails to store carriage returns, which we had asked participants to use to separate their guesses. To address this problem, our second algorithm, the substring algorithm, treated a guess as valid if it contained a substring that matched the original answer, as suggested by Toomim et al. [16]. The final algorithm we tested was the Levenshtein edit distance algorithm with two modifications. First, we reduced the cost of transpositions of two characters ('swapped' to 'sawpped') from two to one. This reduces the cost of this very common typo to be equal to that of a single mistyped character. Second, we removed the cost of extra characters at the beginning or end of the guess, to adjust for the artifact that all guess strings were concatenated together.

5. RESULTS
In a world of social media, it is very easy for hackers to guess the answer to such a question; users need an effective system to address this problem, and to resolve it we can take the help of a smartphone device. It is a very difficult task to remember an alphanumeric and symbolic password, and a single character change will result in a wrong password. To reset a password, the user must answer a question that was set at the time of registration; this question is known as a secret question. Users must keep these questions in mind for a very long time. Studies show that the answers to these questions are not changed or used for months or years, which can cause users to forget the answer to the question.

5.1 Real-world memorability results
While we asked all 116 participants in the first three cohorts to try to reset their password using their personal question, not all accounts had a question configured.
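The three answer-comparison algorithms of Section 4 (simple equality, substring matching, and the Levenshtein distance modified to charge one for transpositions and nothing for extra leading/trailing guess characters) can be sketched as follows. This is an illustrative reimplementation, not the study's original code.

```python
import re

def normalize(s: str) -> str:
    # Drop nonalphanumeric characters and force lower case.
    return re.sub(r'[^a-z0-9]', '', s.lower())

def simple_equality(guess: str, answer: str) -> bool:
    return normalize(guess) == normalize(answer)

def substring_match(guess: str, answer: str) -> bool:
    # A guess is valid if it contains the answer as a substring.
    return normalize(answer) in normalize(guess)

def modified_levenshtein(guess: str, answer: str) -> int:
    g, a = normalize(guess), normalize(answer)
    m, n = len(g), len(a)
    # d[i][j] = cost of matching answer[:j] against a portion of guess[:i];
    # the first column stays 0 so extra leading guess characters are free.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(1, n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if g[i - 1] == a[j - 1] else 1
            best = min(d[i - 1][j] + 1,          # extra char in guess
                       d[i][j - 1] + 1,          # missing char from answer
                       d[i - 1][j - 1] + cost)   # match / substitution
            if i > 1 and j > 1 and g[i - 1] == a[j - 2] and g[i - 2] == a[j - 1]:
                best = min(best, d[i - 2][j - 2] + 1)  # transposition costs 1
            d[i][j] = best
    # Extra trailing guess characters are free: take the best row end.
    return min(d[i][n] for i in range(m + 1))
```

For example, modified_levenshtein("sawpped", "swapped") is 1 (one cheap transposition), and a guess that merely embeds the answer, such as "xx hello xx" against "hello", scores 0 because leading and trailing extra characters carry no cost.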
Furthermore, an answer alone was not sufficient to authenticate: a zip code previously associated with the account was also required. A total of 99 participants reported being asked to provide the answer to their personal question. Only 43 (43%) reported being able to successfully provide the correct answer and their zip code. The majority, 56 (57%), could not reset their password and reported being unable to remember either the answer or the zip code they had provided when they set up the account. When asked why they had trouble authenticating, 75% of participants suspected they may have been unable to answer their personal question, and 31% reported that they may have been unable to recall the zip code they had previously provided. A surprising 13% of participants suspected that the reason they could not answer their personal question was because they had intentionally provided a bogus answer when setting up their account.

6. SYSTEM ARCHITECTURE
Understanding Smartphone Sensor and App Data for Enhancing the Security of Secret Questions is an Android-based project which collects user activity data such as user location and call log history. This data is used to generate questions for resetting the password. The user installs our third-party application, which helps to generate and ask questions based on daily activity. These questions are based on short time durations such as a week or a month. At the beginning, the user installs the application on his/her mobile phone. The application continuously captures events; this event data is extracted and sent back to the application. The application generates questions and answers from this data, and these questions and answers are stored in the database. The question generation process is executed continuously in the background, and old questions and answers are replaced with new ones. When the user accesses a social media application and requests a password reset, a question is fetched and asked to the user, and the response from the user is caught and matched with the answer. If the answer given by the user is correct, then the password can be reset; otherwise a photo is captured automatically and sent to the registered email id.

6.1 Response Protocol
We create three types of secret questions: a "True/False" question is also called a "Yes/No" question because it usually expects a binary answer of "Yes" or "No"; a "multiple-choice" question or a "blank-filling" question typically starts with a letter "W", e.g., Who/Which/When/What (and thus we call these two types of questions "W" questions). We have two ways of creating
IEEE. IEEE, 2009, pp. 375-390.
[5] S. Schechter, C. Herley, and M. Mitzenmacher, "Popularity is everything: A new approach to protecting passwords from statistical-guessing attacks," in USENIX Hot Topics in Security, 2010, pp. 1-8.
[6] M. Just and D. Aspinall, "Personal choice and challenge questions: A security and usability assessment," in SOUPS, 2009.
[7] A. Rabkin, "Personal knowledge questions for fallback authentication: Security questions in the era of Facebook," in SOUPS. ACM, 2008, pp. 13-23.
[8] J. C. Read and B. Cassidy, "Designing textual password systems for children," in IDC, ser. IDC '12. New York, NY, USA: ACM, 2012, pp. 200-203.
[9] H. Ebbinghaus, Memory: A contribution to experimental psychology. Teachers College, Columbia University, 1913, no. 3.
[10] F. I. Craik and R. S. Lockhart, "Levels of processing: A framework for memory research," Journal of Verbal Learning and Verbal Behavior, vol. 11, no. 6, pp. 671-684, 1972.
[11] T. M. Wolf and J. C. Jahnke, "Effects of intraserial repetition on short-term recognition and recall," Journal of Experimental Psychology, vol. 77, no. 4, p. 572, 1968.
[12] H. Kim, J. Tang, and R. Anderson, "Social authentication: harder than it looks," in Financial Cryptography and Data Security. Springer, 2012, pp. 1-15.
[13] S. Hemminki, P. Nurmi, and S. Tarkoma, "Accelerometer-based transportation mode detection on smartphones," in Proceedings of the 11th ACM Conference on Embedded Networked Sensor Systems, ser. SenSys. New York, NY, USA: ACM, 2013, pp. 13:1-13:14. [Online]. Available: http://doi.acm.org/10.1145/2517351.2517367
[15] J. Clark and P. van Oorschot, "SoK: SSL and HTTPS: Revisiting past challenges and evaluating certificate trust model enhancements," in Security and Privacy (SP), 2013 IEEE Symposium on, May 2013, pp. 511-525.
[16] J. Whipple, W. Arensman, and M. S. Boler, "A public safety application of GPS-enabled smartphones and the Android operating system," in SMC. IEEE, 2009, pp. 2059-2061.
[17] S. Kumar, M. A. Qadeer, and A. Gupta, "Location based services using Android (LBSOID)," in IMSAA. IEEE, 2009, pp. 1-5.
[18] M. Oner, J. A. Pulcifer-Stump, P. Seeling, and T. Kaya, "Towards the run and walk activity classification through step detection: an Android application," in EMBC. IEEE, 2012, pp. 1980-1983.
[19] W. Luo, Q. Xie, and U. Hengartner, "FaceCloak: An architecture for user privacy on social networking sites," in CSE, vol. 3. IEEE, 2009, pp. 26-33.
[20] H. Falaki, R. Mahajan, S. Kandula, D. Lymberopoulos, R. Govindan, and D. Estrin, "Diversity in smartphone usage," in MobiSys. New York, NY, USA: ACM, 2010, pp. 179-194.
[21] "Understanding Smartphone Sensor and App Data for Enhancing the Security of Secret Questions."
[22] L. Nyberg, L. Backman, K. Erngrund, U. Olofsson, and L.-G. Nilsson, "Age differences in episodic memory, semantic memory, and priming: Relationships to demographic, intellectual, and biological factors," The Journals of Gerontology Series B: Psychological Sciences and Social Sciences, vol. 51, no. 4, pp. P234-P240, 1996.
[23] C. Wang, Q. Wang, K. Ren, and W. Lou, "Privacy-preserving public auditing for data storage security in cloud computing," in INFOCOM, 2010 Proceedings IEEE, March 2010, pp. 1-9.
[24] S. Yu, C. Wang, K. Ren, and W. Lou, "Achieving secure, scalable, and fine-grained data access control in cloud computing," in INFOCOM, 2010 Proceedings IEEE, March 2010, pp. 1-9.
[25] R. Faragher and P. Duffett-Smith, "Measurements of the effects of multipath interference on timing accuracy in a cellular radio positioning system," Radar, Sonar & Navigation, IET, vol. 4, no. 6, pp. 818-824, December 2010.
[26] M. Dong, T. Lan, and L. Zhong, "Rethink energy accounting with cooperative game theory," in Proceedings of the 20th Annual International Conference on Mobile Computing and Networking, ser. MobiCom '14. New York, NY, USA: ACM, 2014, pp. 531-542. [Online]. Available: http://doi.acm.org/10.1145/2639108.2639128
contain any significant data. Each of the cloud nodes (here the technique uses the term node to represent computing, storage, physical, and virtual machines) contains a distinct fragment to increase the data security. A successful attack on a single node should not reveal the locations of the other fragments within the cloud. To keep an attacker uncertain about the locations of the file fragments and to further improve the security, the nodes are selected in a manner such that they are not adjacent and are at a certain distance from each other. The node separation is ensured by means of T-coloring.

2. MOTIVATION
The level of security required for a device varies dramatically depending upon the function of the device. Rather than asking whether the device is secure, one should ask whether the device is secure enough. Cloud-assisted cyber-physical systems (Cloud-CPSs; also known as cyber-physical cloud systems) have broad applications, ranging from healthcare to the smart electricity grid to smart cities to battlefields to the military, and so on. In such systems, client devices (e.g., Android and iOS devices, or resource-constrained devices such as sensors) can be used to access the relevant services (e.g., in the context of a smart electricity grid, this may include utility usage data analyzed and stored in the cloud) from/via the cloud. However, client devices generally have less computing capability and hence are unlikely to have adequate security (technical) measures in comparison to conventional personal computers (PCs). So file cryptographic storage is an effective method to prevent private data from being stolen or tampered with. Data integrity is also maintained: if an attack tampers with the data, it should be detected and prevented. By this we are able to perform secure communication between two or more devices.
From the survey of existing work, we can deduce that both security and performance are critical for next-generation large-scale systems, such as clouds. Therefore, in this paper, a collective approach to the issues of security and performance is defined as a secure data replication problem.

3. REVIEW OF LITERATURE
Paper [1] presents a multi-client searchable encryption scheme, which has various advantages over the known approaches. The related model and security requirements are also formulated. It further discusses extending the given scheme in several ways in order to achieve different search capabilities.
Paper [2] proposes a secure data access scheme based on identity-based encryption and biometric authentication for cloud computing. The system describes the security concerns of cloud computing and then proposes an integrated data access scheme for distributed cloud computing; the strategy of the proposed scheme includes parameter setup, key distribution, feature template creation, cloud data processing, and secure data access control.
The third paper proposes an identity-based data storage scheme in which both intra-domain and inter-domain queries are considered and collusion attacks can be resisted. Moreover, access permission can be controlled by the owner independently [3].
The fourth paper focuses on the critical issue of identity revocation. The system brings outsourced computation into IBE and proposes a revocable scheme in which the revocation operations are delegated to the CSP. With the aid of the KU-CSP, the proposed scheme is full-featured:
A. It achieves constant efficiency for both the computation at the PKG and the private key size at the client;
B. The user does not need to contact the PKG during key update; in other words, the PKG is permitted to be offline after sending the revocation list to the KU-CSP;
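The T-coloring-based fragment placement described in Section 1 can be sketched with a simple greedy routine. This is an illustrative version under assumed inputs, not the surveyed paper's exact algorithm: a candidate node is accepted only if its distance to every node already holding a fragment falls outside a forbidden set T (e.g., T = {0, 1} forbids reusing a node or placing fragments on adjacent nodes).

```python
# Greedy fragment placement in the spirit of T-coloring: a candidate node
# is acceptable only if its distance to every node that already holds a
# fragment is outside the forbidden distance set T.

def place_fragments(num_fragments, nodes, dist, forbidden):
    chosen = []
    for node in nodes:
        if all(dist[node][c] not in forbidden for c in chosen):
            chosen.append(node)
            if len(chosen) == num_fragments:
                return chosen
    raise ValueError("not enough sufficiently separated nodes")

# Tiny example: 5 nodes on a line, distance = index difference.
nodes = [0, 1, 2, 3, 4]
dist = {u: {v: abs(u - v) for v in nodes} for u in nodes}
print(place_fragments(3, nodes, dist, forbidden={0, 1}))  # -> [0, 2, 4]
```

Keeping selected nodes non-adjacent means that compromising one node, or one neighborhood of nodes, reveals neither the other fragments nor their locations.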
5.1 Proposed system architecture
Figure 1 above shows the architectural flow of the proposed system. Here the user sends a request to the browser and the browser accepts the request; the file is then uploaded through the browser. While uploading, the file is encrypted according to the defined policy attributes. The file's integrity and the user's authentication are checked via the server, and then the file is uploaded to the cloud/server.
When the user wants the uploaded file again, the cloud/server checks the integrity of the user. After the user is verified, the file is accessible to the end user, but it is given in encrypted format. To decrypt this file, the user needs to get the intended key from the authenticated user; after getting the key, the user decrypts the file by using the sender key plus the self key. Then the original content of the file is downloaded by the user.

5.2 System overview
In the proposed system, the owner will get the data, and the file will be allocated to users according to the users' position, location, and experience. The owner/distributor will assign the file to a user by generating access policies that consider user attributes such as date and time stamp. After the encryption key is entered, the file will be divided into fragments, and the fragments and their replicas are stored on the server/cloud. When an authenticated user logs in, he will get the files whose policy attributes match his. He can then request the file key and download the file after entering the secret key. A third-party auditor will check the data integrity of the stored fragments, i.e., whether the content of a placed fragment has been changed; if it has changed, the integrity checker will inform the owner about that file. The integrity checker will then replace the tampered fragment with the original fragment and provide security to the file.

6. MATHEMATICAL MODEL

6.1 File fragmentation
Fragment size = File size / Number of fragments. (1)
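Equation (1) can be sketched in code: the file bytes are split into equally sized fragments (the last one may be shorter), each of which would then be placed on a separate node together with its replica. The function name is illustrative, not from the paper.

```python
def fragment(data: bytes, num_fragments: int) -> list:
    """Split data into num_fragments pieces per Eq. (1):
    fragment size = file size / number of fragments (rounded up)."""
    size = -(-len(data) // num_fragments)          # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

pieces = fragment(b"confidential-file-contents", 4)
assert b"".join(pieces) == b"confidential-file-contents"  # lossless split
```

Rounding the fragment size up guarantees exactly the requested number of fragments while keeping concatenation lossless, so the auditor can recompute and compare fragment hashes independently.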
a backup path inside the MPLS network. Advantages: it is reliable and also faster; it allocates a free path for the packets and reduces congestion. Disadvantages: the congestion in the MPLS network still needs to be overcome.
Paper [4] proposed constructing a table for fast packet forwarding. In binary search on prefix (BSP), the construction of the forwarding table consists of two steps: the first is a sorting operation and the second is a stack operation. It also solves the problem of ambiguous lookup caused by duplicate entries. Advantages: much faster; this improves the router performance significantly. Disadvantages: it stores the updated data only in the corresponding subtree.
Paper [5] describes an MPLS network in which the IGP selects the best path and each LSP is created over the best path selected towards the destination network. To decide the best path to specific destination networks, an IGP is used to spread routing data to all routers in an MPLS domain. MPLS has the capability to classify and manage the traffic in the network to offer higher utilization of resources. Advantages: compared to other networks, it provides much better traffic engineering capability; MPLS VPN (Virtual Private Network) provides the advantages that service providers urgently want of their networks, including manageability, reliability, and scalability. Disadvantages: security is the major issue in this technology.
The MPLS-TP (Multiprotocol Label Switching - Transport Profile) ring protection system defined in the standards includes the capability to restore traffic delivery following failure of network resources, to meet applicable carrier-class transport network requirements. Paper [6] proposes a new protection mechanism which combines the advantages of both the steering and wrapping approaches, and which minimizes packet loss significantly in the case of in-order delivery. Advantages: it achieves fast protection switching and consequently less packet loss; it highly reduces delay and improves the efficiency of the network. Disadvantages: the MPLS-TP architecture still needs to mature.
Paper [7] proposes high-speed routing, providing an IP switch architecture as an alternative to a gigabit router. The IP switch architecture provides higher-speed routing than a gigabit router. It uses low-level switching flows and contains a protocol to allow explicit use and management of the cached information through an IP switching network.
Paper [8] includes a connectionless approach for integrated IP and also provides fast ATM (Asynchronous Transfer Mode) switching hardware. In the ATM switch, the IP routing decision is cached as a soft state such that future packets are handled by hardware rather than software. It provides IP with a simple and robust way to use the speed and capacity of ATM.
Paper [9] proposes a mobility framework based on MPLS (Multiprotocol Label Switching). The Optimized Integrated Multi-Protocol Label Switching (Optimized I-MPLS) framework combines MPLS with MIFA (the Mobile IP Fast Authentication protocol). It solves the problem of duplicate resources and reduces the number of dropped packets.

Fig. 1: Delay for TCP/IP and ROACM.
Fig. 2: Throughput for TCP/IP and ROACM.
Figures 1 and 2 show the delay graph in seconds and the throughput graph in kbps for the TCP/IP and ROACM protocols for all scenarios.

4. GAP ANALYSIS
TCP/IP contains four layers: Application, Transport, Internet, and Data Link. TCP/IP is the suite of communication protocols used to interconnect network devices on the internet. OSPF (Open Shortest Path First) is an interior routing protocol used to create the routing table for routers. TCP/IP uses that routing table for forwarding packets.
The ROACM (Route Once and Cross Connect Many) protocol cross-connects IP packets and provides faster data transmission. The performance of the ROACM protocol is much better than that of TCP/IP, but in terms of security the ROACM protocol is less secure than TCP/IP.
To overcome this problem of the ROACM protocol, in this paper we provide security for the ROACM protocol. For the security approach, we apply the AES (Advanced Encryption Standard) algorithm with a 128-bit key size to the ROACM protocol.
The network is created using a Java simulation tool, and all features of the ROACM protocol are introduced into that network. Then the AES algorithm with a 128-bit key size is applied in that network. Finally, we analyze the performance of TCP/IP and the ROACM protocol, for which we take the variable average delay and throughput.

5. PROPOSED SYSTEM
The ROACM protocol provides many features to intelligent routers/switches. In this paper, the new protocol, ROACM, has been included, where the IP packet contains an extra header that allows a dynamic virtual circuit to be created. This header contains all the relevant information to cross-connect the IP packets at the second layer, i.e., the data link layer. In the call set-up stage, the information is attached to the network layer header, and in the data transmission stage the information is stored in the ROACM header (in the frame). Propagation of ROACM information can occur below the IP level in networks that contain routers that all agree on and support ROACM. Since this information is appended to the end of an IP packet, routers that do not employ ROACM are still able to forward packets using the regular routing protocol. ROACM itself maps virtual circuit links (indexes), which are provided from the local interface tables at each router, where each index corresponds to a next-hop interface address. This allows interoperability on a wide range of networks.
The ROACM protocol consists of four major tasks.
1. Call Set Up: At the call set-up stage, a control field value of 01 means it is a call set-up stage in the ROACM protocol. In this stage the initial IP packet is sent to establish the connection.
2. Data Transmission: At the data transmission stage, the control field value changes to 11, meaning it is a data transmission stage.
3. Path Update: It provides the facility of path updates. For example, at a particular time a path is optimal, but not the same path at a
different time, so periodic messages are sent in the network.
4. Recovery Plan: It provides the facility of a recovery plan if any port is malicious. In this phase the source station stops sending packets and re-establishes the connection.

Fig. 3: ROACM Forward Header
As shown in Figure 3, the ROACM protocol provides an extra header for the IP packet; that header includes dynamic and static fields. The control field is a dynamic field, and the index port numbers towards the first hop and second hop are static fields.

A. Architecture:

Fig. 4: Proposed System Architecture
As shown in Figure 4, the contribution of this work is to implement the network in which the ROACM protocol features are introduced by using a Java simulation tool, to provide security to the data using the AES algorithm with a 128-bit key size, and to analyze ROACM and TCP/IP on the basis of the variable average delay and throughput.

B. Algorithms:
The following are the steps to secure the ROACM protocol:
1. Create a network using the Java simulation tool.
2. Construct a packet for the ROACM protocol in the network.
3. Deploy the ROACM protocol packet into the network.
4. Generate a key using the 128-bit key size AES (Advanced Encryption Standard) algorithm.
5. Assign that 128-bit AES key in the network.
6. Verify packet transmission in the network.

Advantages are:
- The performance of the ROACM protocol is generally faster than the TCP/IP protocol.
- No need for high-power processing in the routers.
- In ROACM, there is no need to search the routing table except for the call set-up packet.
- The ROACM protocol transfers a large number of packets in less time.
- The delay ratio is minimal compared to the TCP/IP protocol.
- Throughput is maximized.

6. CONCLUSION
The ROACM protocol cross-connects IP packets by using an index of port numbers and provides an extra header for the IP packet. In this paper we provide security to the ROACM protocol by using the AES 128-bit key size algorithm, introducing all features of the ROACM protocol in a network created in a Java simulator. Hence we can provide security against various malicious attacks on the ROACM protocol.
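The ROACM forward header of Fig. 3 (a small control field for call set-up vs. data transmission, plus static index port numbers towards the first and second hop) can be sketched as a packed structure. The field widths below are assumptions for illustration only, since the paper does not specify exact sizes.

```python
import struct

# Assumed layout: 1 byte for the control field (01 = call set-up,
# 11 = data transmission) followed by two 2-byte next-hop interface
# indexes, in network byte order.
HEADER_FMT = "!BHH"

def pack_roacm_header(control: int, hop1_index: int, hop2_index: int) -> bytes:
    return struct.pack(HEADER_FMT, control, hop1_index, hop2_index)

def unpack_roacm_header(raw: bytes):
    return struct.unpack(HEADER_FMT, raw)

CALL_SETUP, DATA_TX = 0b01, 0b11
hdr = pack_roacm_header(CALL_SETUP, 7, 12)
# The header is appended to the end of the IP packet, so routers that
# do not employ ROACM still forward the packet with their regular
# routing protocol, as described in Section 5.
packet = b"<ip-packet-bytes>" + hdr
assert unpack_roacm_header(packet[-5:]) == (CALL_SETUP, 7, 12)
```

Appending rather than prepending the header is the design choice that gives the claimed interoperability: a legacy router parses only the ordinary IP header, while a ROACM router reads the trailing indexes to cross-connect at the data link layer.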
ABSTRACT
Access control is one of the earliest security issues and remains a constant challenge. Its component determines whether a request to access a resource is granted. Its domain covers the various mechanisms by which a system grants or revokes the right to access data and services. This paper presents a trust-based service management technique using a fuzzy approach. The innovation lies in the use of distributed collaborative filtering to select trust feedback from owners of IoT nodes sharing similar social interests. The system is scalable to large IoT systems in terms of storage and computational costs. This adaptive IoT trust system detects malicious IPs and keywords from the system and files, respectively. This paper also presents how to manage trust protocol parameters dynamically to minimize trust estimation bias and maximize application performance.
Keywords:
Access Control, Fuzzy approach, Authentication, Capability, Adaptive, Internet of Things, Trust.
1. INTRODUCTION

Access control is one of the most important concepts for protecting resources and has been used in a variety of network environments. In this paper, we consider the connected smart objects as the node resource users. Users connect to and disconnect from the IoT system randomly according to requirements, and there may be malicious node users who provide fake information via files or spread offensive data or services. For example, a hotel management system provides various services via a mobile application (identity, check-in/out, table/room availability, air-conditioner handling, and parking). These services are applicable only within a certain networking area (the hotel's private Wi-Fi). If a user disconnects from this network, or the admin (the owner of the system) rejects the user's request, that node's IP will be blocked by the admin. These dynamic and distributed characteristics of IoT systems impose harsh requirements on access control technology. In this paper, we propose an access control model based on attributes and trust to meet the requirements of fine-grained, dynamic, secure access control in the IoT environment.

Motivation:

Nowadays IoT is a popular technology used everywhere for automation services, but the heavy usage of IoT increases security issues. There are various aspects to securing IoT data and activities, but until now trust-based access control has not been applied to secure IoT. This motivated us to implement a module by which IoT devices can be secured using various trust factors and methods.

IoT provides interconnection between uniquely identifiable devices. By integrating several technologies, such as actuators and sensor networks, identification and tracking technology, enhanced communication protocols, and the distributed intelligence of smart objects, IoT enables communication between the
real-time objects present around us. The effectiveness of IoT can be seen in both domestic (e.g. assisted living, e-health, enhanced learning) and business (automation, intelligent transportation) fields. While various issues are related to the implementation of IoT, the security of IoT has a significant impact on the performance of IoT applications. Trust is an important aspect of secure systems: a system can behave in an untrustworthy manner even with security and privacy measures in place. Behavior-based analysis of devices is required to predict device performance over time. Trust management provides behavior-based analysis of entities using their past behavior, reputation in the network, or recommendations. A trustworthy system is needed to prevent unwanted activities conducted by malicious devices. Our research work is to design a dynamic trust management system for IoT devices.

Machine-to-machine communication enables direct communication between devices over a wireless channel. More recently, machine-to-machine communication has evolved into a system of networks that transmits data to individual files or services. The expansion of IP networks has made machine-to-machine communication faster and easier while consuming less power. File sharing and access control is a serious issue in networks if there is no trust factor between sender and receiver. Any kind of file can be restricted by access controls, but such files still cannot be trusted, as they may contain suspicious or malicious data. Applications in the network cannot be trusted to execute, as they can harm machines. In this scenario there is a strong need for a trust-maintaining mechanism, as well as trust-defining factors combined with rules.

This paper is organized as follows: Section II presents the scope of the concept. Section III presents background and related work. Section IV presents the existing system. Section V presents the proposed system approach. Finally, Section VI summarizes the research and discusses future work.

2. SCOPE

Our eventual goal is to develop an authoritative family of foundational models for attribute-based access control. We believe this goal is best pursued by means of incremental steps that advance our understanding. ABAC (attribute-based access control) is a rich platform; addressing it in its full scope from the beginning is infeasible, as there are simply too many moving parts. A reasonable first step is to develop a formal ABAC model, which we call ABACα, that is just sufficiently expressive to capture DAC (discretionary access control), MAC (mandatory access control) and RBAC (role-based access control). This gives us a well-defined scope while ensuring that the resulting model has practical relevance. There have been informal demonstrations of the classical models using attributes; our goal is to develop more complete and formal constructions.

Standard Permission Types:
1. Read:
View the file names and subfolder names.
Navigate to subfolders.
Open files.
Copy and view data in the folder's files.
2. Write:
Create folders.
Add new files.
Delete files.
3. Operate Devices in IoT:
Request a service from an IoT node/device.
Perform actions.

Mathematical Model

Let S be the closed system defined as
S = {Ip, Op, A, Ss, Su, Fi}
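The permission types and trust factors described above can be combined into a toy access-decision sketch. The factor weights, the 0.5 threshold, and the function names below are illustrative assumptions, not the paper's actual model:

```python
# Toy sketch of a trust-based access decision. Weights, threshold, and names
# are assumptions for illustration only.

PERMISSIONS = {"read", "write", "operate"}  # the three standard permission types

def trust_score(past_behavior: float, reputation: float, recommendation: float) -> float:
    """Combine the three trust factors named in the text into a score in [0, 1]."""
    weights = (0.5, 0.3, 0.2)  # assumed relative importance of the factors
    factors = (past_behavior, reputation, recommendation)
    return sum(w * f for w, f in zip(weights, factors))

def grant(requested: str, node_trust: float, threshold: float = 0.5) -> bool:
    """Grant only known permission types, and only to sufficiently trusted nodes."""
    return requested in PERMISSIONS and node_trust >= threshold
```

A node with good past behavior, reputation, and recommendations is granted a read request, while an unknown permission type or a low-trust node is refused.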
6. CONCLUSION

Trust-based access control applies to file transfers in networks, along with application access communication, especially where there are no rules for mutual trust and no knowledge base or recorded past experience. This project presents a novel approach to maintaining trust between machines in a network that share files under various access controls: trust is calculated from predefined or gained knowledge, combined with experience and access rules, to decide, and warn the user, whether files from a sender machine should be trusted. It also decides whether a service request from a device should be performed, based on a third-party centralized trust calculation. This approach can avoid much of the harm caused by blindly trusting any file or request, and the authenticity of files is preserved by access rights. In future work we would like to extend the system with multiple layers and more rules for trust calculation, so that if the centralized server fails there is a backup option, and so that trust becomes more precise as it passes through multiple layers.

7. ACKNOWLEDGMENT

I would like to express my appreciation to all those who made it possible to complete this report. Special gratitude goes to my seminar guide, Prof. V. V. Kimbahune, whose stimulating suggestions and encouragement helped me present this seminar. I also appreciate the guidance of the other supervisors and of the panels, especially during my seminar presentation; their comments and advice improved my presentation skills.
for a particular time, after which the system will close his access.

5. Additional Features

Some researchers are worried about the situation in which the patient is not able to handle his own account and hence cannot provide the hash key to the doctor, e.g. during an operation or after a critical injury. Our system solves this problem by accepting emergency contact information for certain close people whom the patient can trust with his information. They will then provide the doctor with the hash key whenever the patient is in a critical condition.

7. CONCLUSION

The examples described show that Blockchain offers numerous opportunities for use in the healthcare sector, e.g. in public health management, user-oriented medical research based on personal patient data, and combating drug counterfeiting [11]. The immense potential of this technology shows up wherever, until now, a trusted third party was necessary for the settlement of market services. With Blockchain, direct transactions become possible, and a central actor who controlled the data, earned commission, or even intervened in a censoring fashion can be eliminated.

8. FUTURE SCOPE

The scope of our application can in future be extended to sectors like insurance, where hospitals can check whether a patient is covered by a particular policy and thereby greatly improve risk management. It can be used to record a patient's gestures and on-site data and secure their integrity. Also, the application may be extended to add a second factor of authentication by tracking a patient's internal movements, predicting the type of disease affecting him, and sending the data to the doctor on the blockchain network.

REFERENCES

[1] Steward, "Electronic Medical Records," Journal of Legal Medicine, vol. 26, 2005, pp. 491–506.
[2] R. Haux, "Health Information Systems—Past, Present, Future," Int'l Journal of Medical Informatics, vol. 75, no. 3–4, 2006, pp. 268–281.
[3] K. Häyrinen et al., "Definition, Structure, Content, Use and Impacts of Electronic Health Records: A Review of the Research Literature," Int'l Journal of Medical Informatics, vol. 77, no. 5, 2008, pp. 291–304.
[4] M. Ciampi et al., "A Federated Interoperability Architecture for Health Information Systems," Int'l Journal of Internet Protocol Technology, vol. 7, no. 4, 2013, pp. 189–202.
[5] M. Moharra et al., "Implementation of a Cross-Border Health Service: Physician and Pharmacists' Opinions from the epSOS Project," Family Practice, vol. 32, no. 5, 2015.
[6] P. B. Nichol (2016, March), Blockchain applications for healthcare. [Online]. Available: http://www.cio.com/article/3042603/innovation/blockchain-applications-for-healthcare.html
[7] G. Prisco (2016, April), The Blockchain for Healthcare: Gem Launches Gem Health Network With Philips Blockchain Lab. [Online]. Available: https://bitcoinmagazine.com/articles/the-blockchain-for-heathcare-gemlaunches-gem-health-network-with-philips-blockchain-lab-1461674938
[8] P. Taylor (2016, April), Applying blockchain technology to medicine traceability. [Online]. Available: https://www.securingindustry.com/pharmaceuticals/applying-blockchain-technology-to-medicinetraceability/s40/a2766/#.V5mxL_mLTIV
Description: The main reasons lie in spreading viruses by sharing other players' accounts via illegal plugins, Trojans, exploiting security vulnerabilities in the game, or other virtual property illegally acquired from players. In this study, we propose a new way to scan for and resist bots. The app we developed can scan, detect, and filter out the bots that need to be shut down, so as to remove online game bots effectively. With the growing popularity of online games, the potential risks are also increasing. Attacks take various forms to defraud players in order to obtain the players' virtual property or personal data. Currently, online game bot detection is performed in many ways. However, related studies are still unable to provide a truly effective and comprehensive prevention mechanism, especially given that criminal behavior in online games is difficult to curb effectively.

Minecraft servers to control CPU load; we primarily explored manually setting the CPU affinity of the Minecraft server thread to run on specific virtual cores.

3. PROPOSED SYSTEM:

We propose a multimodal framework for detecting game bots in order to reduce the damage to online game service providers and legitimate users. We observed the behavioral characteristics of game bots and found several unique and discriminative characteristics: game bots execute repetitive tasks associated with earning unfair profits, while legitimate players do not.

Advantages:
A bot will kill monsters, loot money, mine, or gain levels automatically without the player having to be in front of the computer.
A bot is a player who runs a third-party program to control their character.
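The observation that bots repeat the same profit-earning actions while legitimate players vary theirs suggests a simple behavioral check: low Shannon entropy of the action sequence. The sketch below is illustrative only; the 1.0-bit threshold is an assumed value, not part of the framework described in the text:

```python
import math
from collections import Counter

# Illustrative behavioral check: repetitive (bot-like) action sequences have
# low Shannon entropy. The threshold is an assumption for illustration.

def action_entropy(actions) -> float:
    """Shannon entropy (in bits) of the distribution of action types."""
    counts = Counter(actions)
    n = len(actions)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_like_bot(actions, threshold: float = 1.0) -> bool:
    return action_entropy(actions) < threshold
```

A character that only grinds the same action is flagged, while a varied human-like session is not; a real detector would combine several such behavioral features.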
identification number (PIN), i.e., what you know.

2. MOTIVATION

Security is the main concern for all users and organizations, and securing sensitive data becomes more critical on the internet. Every day, new types of attacks are introduced in cyberspace to break authentication. We need a proper authentication method to secure critical business infrastructure, avoid the loss of a user's intellectual property, and secure sensitive information on the internet.

3. STATE OF THE ART

Authentication is the use of one or more mechanisms to prove that you are who you claim to be. Once the identity of the human or machine is validated, access is granted. Authentication is generally required to access secure data or enter a protected area. The requester for access or entry must authenticate by proving his identity using:
- What the requestor individually knows as a secret, such as a password or a Personal Identification Number (PIN), or
- What the requesting owner uniquely has, such as a passport, physical token, or an ID card, or
- What the requesting bearer individually is, such as biometric data, like a fingerprint or face geometry.

4. GAP ANALYSIS

1. Factors can get lost
There is no certainty that your authentication factors will be available when you need them; typically, you are locked out of your account after one mistake. If you lose power or your phone is damaged by water, you won't be able to get your SMS codes as the second authentication factor. Relying on a USB key as a second factor is also risky: it can easily be misplaced or accidentally run through the laundry. If you rely on factors like PINs, there is always the chance that you forget them, and biometric factors like eyes and fingers can be lost in accidents.

2. False security
Two-factor authentication provides a level of security, but it is typically exaggerated. For example, if you were locked out of a service because you lost a factor, you are basically in the same predicament as a hacker attempting to gain access to your account. If you can reset your account without an access factor, then a hacker can, too. Recovery options typically contradict the point of two-factor authentication, which is why companies like Apple have done away with them. However, without recovery options, your account may be lost forever.

3. It can be turned against users
While two-factor authentication is intended to keep hackers out of your account, the opposite can happen: hackers can set up or reconfigure two-factor authentication to keep you out of your own accounts. Two-factor authentication may not be effective enough to secure your accounts, but it can also be too effective if you are not careful. As services improve their two-factor practices and make account recovery more difficult, it is prudent to set up the authentication on your important accounts before a hacker does.

5. PROPOSED WORK

In the run-up to the 2016 U.S. presidential elections, Democratic candidate Hillary Clinton received a serious blow from a series of leaks coming from the email account of her campaign chairman, John Podesta. Hackers were able to access the contents of Podesta's account by staging a successful phishing attack and stealing his credentials. Podesta is one of the millions of people whose passwords get stolen as a result of social engineering attacks or data breaches.
CLOUD COMPUTING
compared to the stored copy, and whenever a match occurs, the redundant chunk is replaced with a small reference that points to the stored chunk. Given that the same byte pattern may occur dozens, hundreds, or even thousands of times (the match frequency depends on the chunk size), the amount of data that must be stored or transferred can be greatly reduced.

Load Balancing Concept:

Cloud load balancing is the process of distributing workloads across multiple computing resources. Cloud load balancing reduces costs associated with document management systems and maximizes the availability of resources. It is a type of load balancing and is not to be confused with Domain Name System (DNS) load balancing. While DNS load balancing uses software or hardware to perform the function, cloud load balancing uses services offered by various computer network companies.
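The chunk-replacement idea above can be sketched in a few lines of Python. Fixed-size 8-byte chunks and SHA-256 fingerprints are illustrative assumptions; real systems use much larger or content-defined chunks:

```python
import hashlib

# Sketch of chunk-level deduplication: each unique chunk is stored once, and
# every occurrence is replaced by a small reference (its fingerprint).

CHUNK_SIZE = 8  # bytes; assumed small value for illustration

def deduplicate(data: bytes):
    """Return (chunk store, reference list) for the byte stream."""
    store, refs = {}, []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        fp = hashlib.sha256(chunk).hexdigest()  # fingerprint used for matching
        store.setdefault(fp, chunk)             # keep only the first copy
        refs.append(fp)                         # reference replaces the chunk
    return store, refs

def restore(store: dict, refs: list) -> bytes:
    """Rebuild the original stream from the references."""
    return b"".join(store[fp] for fp in refs)
```

For a stream containing the same 8-byte pattern three times, the store holds that chunk once while the reference list records all three occurrences, which is exactly the space saving described above.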
2. LITERATURE SURVEY

Table: Literature Survey

Sr. No. | Paper Name | Concept | Author and Year
1 | A Study on De-duplication Techniques over Encrypted Data | Introduces de-duplication techniques, securing data by encryption, and some challenges related to it. | Akhila K., Amal Ganesh, Sunitha C., 2016, Elsevier.
3. GAP ANALYSIS

Table: Gap Analysis

Criterion | Existing System | Proposed System
Sharing | Less secured | More secured
De-duplication | File name is checked | Content is checked
Security | Moderate | High
Efficiency | Medium | High
Time Consumption | Low | Moderate

4. SCHEME

Our scheme contains the following main aspects:

Encrypted Data Upload. In this process, the hash code of the data is generated first. The hash code is then matched against the database; if it exists, the user is simply linked to the respective data file. If not, the data is encrypted using AES, divided into chunks, and uploaded onto the respective data server.

Data De-duplication. This is a sub-part of the upload process: the hash code of the given file is generated and checked against the database. In this way, data that is already present is not uploaded again; instead, the user is only linked to the respective file.

Data Deletion. When the user wants to delete data from his respective cloud portion (account), the system, rather than deleting the file from the cloud, simply delinks the user from the given data. Since the file may have been uploaded by many users but is stored in the cloud only once, deleting the file itself would lead to an ambiguous situation.

Data Owner Management. The real data owner (the user) can use the cloud to store and retrieve data, and can use functionality such as upload, delete, and share.

Encrypted Data Update. If the user updates existing data, i.e. the data is changed, the system treats it as a new file, and the upload process for this new or updated file repeats.

5. PROPOSED WORK

We propose a scheme to de-duplicate data by applying techniques such as hashing and encryption. We also try to reduce the load on the data server by using a de-dup server, which reduces the response time and increases the processing speed. This is applicable in scenarios where data holders are not available for de-duplication control.

5.1 Procedures

5.1.1 Data upload
Step 1: For the file to be uploaded, its hash code (H1) is generated using SHA-1.
Step 2: The hash code is then matched against the metadata in the database, M(H).
Step 3: If M(H) == H1, only link the data, and then stop.
Step 4: If M(H) != H1, i.e. the data does not exist on the file server, encrypt the given data using AES.
Step 5: Split the encrypted data and then upload the file.

5.1.2 Data download
When the data has to be downloaded, after the request has been sent by the user to the server, the following happens:
Step 1: The de-dup server traces the chunks of the file to wherever they were uploaded, using the metadata stored in the database.
Step 2: The data chunks are randomly downloaded and combined to reconstruct the original data.
Step 3: The given file is decrypted using AES.
Step 4: The file is downloaded.

6. SYSTEM ARCHITECTURE

Our system proposes the given architecture, in which there are two main categories of system use: existing user (login) and new user (registration). Every user has a username, a password, and a unique private key, and can use functionalities such as data upload, download, sharing, and delete functionalities along with the main
for years. From the perspective of cloud storage security, many data integrity checking schemes have been proposed. From the perspective of cloud storage efficiency, the client-side deduplication technique has been adopted to save disk space and network bandwidth. More specifically, the cloud server may keep only one or a few copies of duplicated files, regardless of how many data owners want to store that file. If the cloud server already stores a copy of the file, then owners do not need to upload it again to the cloud; thus bandwidth as well as storage can be saved. However, client-side deduplication may cause new security problems: malicious owners who do not have the file may obtain the exact same file by cheating the cloud server. For secure client-side deduplication, the notion of Proof of Ownership (PoW) has been introduced [4], which lets an owner efficiently prove to the cloud server that the owner indeed holds the whole file.

To achieve both data integrity auditing and storage deduplication within the same framework, researchers have tried to combine an existing integrity checking scheme with a PoW scheme. Zheng et al. proposed a scheme named POSD [5] and Yuan et al. proposed a scheme named PCAD [6]. However, these schemes are no longer applicable to cloud storage systems, for the following reasons:
1) Zheng's POSD scheme has been proved insecure, and the storage overhead of its tags is linear in the number of owners.
2) Yuan's PCAD suffers a high communication cost on the owner side during deduplication, linear in the number of challenged blocks.
3) Both Zheng's POSD and Yuan's PCAD schemes cannot support encrypted data, while data confidentiality is the basic security requirement for storing data in an untrusted cloud.

To achieve deduplication of encrypted data, convergent encryption has been proposed [7]. It uses the hash of the file as the encryption key, so the same file will result in the same ciphertext. This technique is useful, but the encryption key has nothing to do with the client's will. Moreover, using the hash of a file as the encryption key is not secure [7]. Therefore, an efficient solution is still needed that supports data integrity auditing together with storage deduplication for encrypted data in cloud storage. To solve this open problem, the following major challenges exist:
1) Client-side deduplication of encrypted data. In real-world scenarios, owners may encrypt their data with their own keys; thus, identical data copies of different owners will lead to different ciphertexts. When a new owner wants to become an owner of the encrypted file, he needs to prove to the cloud server that he indeed holds the whole file. Since the data stored in the cloud may be encrypted by another owner, the new owner does not possess the encryption key, which makes client-side deduplication of encrypted data more challenging.
2) Deduplication of data tags. Lacking mutual trust, the owners need to separately store their own data tags in the cloud. Due to the large number of owners, the storage overhead of tags may be very large, which contradicts the objective of deduplication, namely saving storage.
3) Public auditing for de-duplicated and encrypted data. Any owner can delegate the data integrity auditing task to the auditor. In our scheme, the cloud server stores only one copy of the encrypted data and the product of the de-duplicated data tags of all owners. In such a case, the challenge is how to guarantee that the integrity of the de-duplicated data can still be correctly checked.

In this paper, we address the above challenges and propose an efficient public auditing scheme for encrypted data with client-side deduplication. Our contributions can be summarized as follows:
line archives. Existing cryptographic techniques let users ensure the privacy and integrity of the files they retrieve. It is also natural, however, for users to want to verify that archives do not delete or modify files before retrieval. The goal of a POR (proof of retrievability) is to accomplish these checks without users having to transfer the files themselves. A POR may also give quality-of-service guarantees, i.e., show that a file is retrievable within an explicit time bound.

The proposed scheme improves compression effectiveness by 11 to 105 percent compared to traditional compressors. The deduplication process is used to eliminate duplicates in data, thus improving the effective capacity of storage systems. Single-node raw capacity is still mostly limited to tens or a few hundreds of terabytes, forcing users to resort to complex. In [11], the authors proposed new mechanisms, called progressive sampled indexing and grouped mark-and-sweep, to address dedupe challenges and to improve single-node scalability; progressive sampled indexing removes scalability limitations through its indexing technique. The advantages of the proposed scheme are improved scalability, good deduplication efficiency, and improved throughput. H. L. Goh, K. K. Tan, S. Huang, and C. W. de Silva [19] proposed a three-fold approach: first, they discuss sanitization requirements in the context of de-duplicated storage; second, they implement a memory-efficient technique for managing data based on perfect hashing; third, they design sanitizing de-duplicated storage for the EMC Data Domain. The approach minimizes memory and I/O requirements, although perfect hashing requires a static fingerprint space, which conflicts with the scheme's desire to support host writes during sanitization.

Data de-duplication has recently gained importance in most secondary storage, and even in some primary storage, and the read performance of the deduplication storage has been gaining great significance.

3. IMPLEMENTATION DETAILS

System Overview

Figure 1 shows the proposed system architecture. For integrity auditing and secure deduplication, our scheme uses the BLS signature-based Homomorphic Linear Authenticator (HLA), proposed in. We also introduce a TPA to support public integrity auditing. The proposed scheme consists of the following.

Client (or user). The client outsources data to cloud storage. CE-encrypted data is first generated and then uploaded to the cloud storage to preserve confidentiality. The client also needs to verify the integrity of the outsourced data; for this, the client delegates integrity auditing to the TPA.

Cloud Storage Server (CSS). The CSS provides data storage services to users. The deduplication technique is applied to save storage space and cost. We assume that the CSS may act maliciously because of attacks, software/hardware malfunctions, intentional saving of computing resources, etc. During the deduplication process, the CSS applies the PoW protocol to verify that the client owns the file. Moreover, in the integrity audit process, it must generate and respond with a proof corresponding to the TPA's request.

TPA (Third-Party Auditor). The TPA performs auditing on behalf of the client to decrease the client's processing cost. Instead of the client, the auditor sends a challenge to the storage server to periodically perform the integrity audit protocol. The TPA is assumed to follow a semi-trusted, that is, honest, model.
algorithm that is run by the client in order to tag a file. It takes as input a secret key sk and a file ∈ [B]^n, and outputs a vector of tags t and state information st.

3. τ := Auth_pk( , , ) is a deterministic algorithm that is run by the server to generate a tag. It takes as input a public key pk, a file ∈ [B]^n, a tag vector, and a challenge vector ∈ Z_p^n; it outputs a tag τ.

4. b := Vrfy_pk(st, µ, , τ) is a deterministic algorithm that is used to verify a tag. It takes as input a public key pk, state information st, an element µ ∈ N, a challenge vector ∈ Z_p^n, and a tag τ. It outputs a bit, where '1' indicates acceptance and '0' indicates rejection. For correctness, we require that for all k ∈ N, all (pk, sk)

Paper | TPA | Data Encryption | De-duplication Check | Data Dynamics | Regenerating Codes
[1] | ✔ | ✖ | ✖ | ✖ | ✖
[12] | ✔ | ✔ | ✔ | ✖ | ✖
[18] | ✔ | ✔ | ✔ | ✖ | ✖
[2] | ✔ | ✔ | ✖ | ✖ | ✖
[15] | ✔ | ✔ | ✖ | ✔ | ✖
[6] | ✔ | ✔ | ✖ | ✔ | ✖
[19] | ✔ | ✔ | ✔ | ✖ | ✖
Proposed System | ✔ | ✔ | ✔ | ✔ | ✔
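The tag/verify interface above can be illustrated with a deliberately simplified, non-homomorphic MAC-based stand-in. A real HLA scheme is public-key based and homomorphic; this sketch only mirrors the Tag/Vrfy interface and its 1/0 accept/reject output:

```python
import hmac
import hashlib

# Simplified, NON-homomorphic stand-in for the tag/verify pattern: the client
# tags each file block under a secret key, and Vrfy outputs 1 (accept) or
# 0 (reject). Illustrative only; not the HLA construction itself.

def tag_blocks(sk: bytes, blocks):
    """Tag every block of the file under the secret key sk."""
    return [hmac.new(sk, block, hashlib.sha256).digest() for block in blocks]

def vrfy(sk: bytes, block: bytes, tag: bytes) -> int:
    """Return 1 if the tag matches the block under sk, else 0."""
    expected = hmac.new(sk, block, hashlib.sha256).digest()
    return 1 if hmac.compare_digest(expected, tag) else 0
```

Any modification of a tagged block causes verification to output 0, which is the property the auditing protocol relies on.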
5. CONCLUSION
When storing data on remote cloud storage, users want to be assured that their outsourced data are maintained accurately in the remote storage without being corrupted. In addition, cloud servers want to use their storage more efficiently. To satisfy both requirements, this system proposes a scheme to achieve both secure deduplication and integrity auditing in a cloud environment. To prevent leakage of important information about user data, the proposed scheme supports client-side deduplication of encrypted data, while simultaneously supporting public auditing of encrypted data. The proposed system also supports high data availability through the use of erasure codes.

6. ACKNOWLEDGMENT
The authors would like to thank the researchers and publishers for making their resources available, and our teachers for their guidance. We are thankful to the authorities of Savitribai Phule Pune University and the concerned members of the ICINC 2019 conference, organized by Smt. Kashibai Navale College of Engineering, Pune, for their constant guidance and support. We are also thankful to the reviewers for their valuable suggestions. We also thank the college authorities for providing the required infrastructure and support. Finally, we would like to extend heartfelt gratitude to our friends and family members.
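Client-side deduplication of encrypted data, as mentioned in the conclusion, is commonly built on convergent encryption, where the key is derived from the file content so that identical plaintexts yield identical ciphertexts. The sketch below is a generic illustration (toy XOR stream cipher, hypothetical names), not necessarily the exact scheme proposed here:

```python
import hashlib

def keystream(key, n):
    # Expand the key into n pseudo-random bytes (toy hash-counter stream).
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def convergent_encrypt(plaintext):
    key = hashlib.sha256(plaintext).digest()          # key derived from content
    ct = bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))
    fingerprint = hashlib.sha256(ct).hexdigest()      # dedup handle on ciphertext
    return key, ct, fingerprint

# Two users encrypting the same file independently produce identical
# ciphertexts, so the server can deduplicate without seeing the plaintext.
k1, c1, f1 = convergent_encrypt(b"same document")
k2, c2, f2 = convergent_encrypt(b"same document")
assert c1 == c2 and f1 == f2
# Decryption: XOR with the same keystream.
assert bytes(c ^ k for c, k in zip(c1, keystream(k1, len(c1)))) == b"same document"
```

A production system would use AES rather than this toy stream, and would add protections against the known dictionary attacks on plain convergent encryption.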
REFERENCES
[1] Ateniese, R. Burns, R. Curtmola, J. Herring, L.
Kissner, Z. Peterson, and D. Song, ―Provable
the information proprietor, and how to check the authenticity of a client who intends to access the information, are still of great concern.
This paper is divided as follows: Section I, Introduction; Section II, Literature Survey, related to the NTRU algorithm; Section III, Motivation and related work; Section IV, Proposed System; Section V, Implementation; Section VI, the novel methodology to store big data in the cloud.

2. LITERATURE SURVEY
The main focus of the literature survey is to study and contrast the existing models for secure access to big data in the cloud. This section highlights the most succinct research contributions on the various access control and encryption techniques.

Surajkumar Singh, Niraj Chaudhary, Sreenu M, Manjunath B M, "Secure Accessibility for Big Data in Cloud", International Journal of Innovative Research in Science, Engineering and Technology, Volume 7, Special Issue 6, May 2018. This work improves NTRU and then presents a secure and verifiable access control scheme based on the improved NTRU to protect outsourced big data stored in a cloud. The scheme allows the data owner to dynamically update the data access policy, and the cloud server to update the corresponding outsourced ciphertext, to enable efficient access control over the big data in the cloud. The security of the proposed scheme is guaranteed by those of the NTRU cryptosystem and (t,n)-threshold secret sharing. The authors rigorously analysed the correctness, security strength, and computational complexity of the proposed scheme.

Kai Fan, Junxiong Wang, Xin Wang, Hui Li and Yintang Yang, "A Secure and Verifiable Outsourced Access Control Scheme in Fog-Cloud Computing", Sensors 2017, 17, 1695; doi:10.3390/s17071695. With the rapid development of big data and the Internet of Things (IoT), the number of networked devices and the data volume are increasing dramatically. Fog computing, which extends cloud computing to the edge of the network, can effectively solve the bottleneck problems of data transmission and data storage. However, security and privacy challenges also arise in the fog-cloud computing environment. Ciphertext-policy attribute-based encryption (CP-ABE) can be adopted to realize data access control in fog-cloud computing systems. The authors propose a verifiable outsourced multi-authority access control scheme, named VO-MAACS. In their construction, most encryption and decryption computations are outsourced to fog devices, and the computation results can be verified using their verification method. Meanwhile, to address the revocation issue, they design an efficient user and attribute revocation method. Finally, analysis and simulation results show that the scheme is both secure and highly efficient.

Roslin Dayana K., Vigilson Prem M., "Review of the Various Optimized Access Control Techniques for Big Data in Cloud Environment", International Journal of Computer Applications (0975-8887), Volume 179, No. 11, January 2018. Cloud computing is an information technology (IT) domain that enables efficient access to shared and private collections of configurable system resources. It provides higher-level services that can be quickly provisioned with a minimum amount of management effort, mostly over the Internet. Due to the high complexity and huge volume, outsourcing ciphertexts to a cloud is deemed one of the most effective approaches for big data storage and access. Verifying the access legitimacy of a user, and securely updating a ciphertext in the cloud based on a new access policy designated by the data owner, are two critical challenges. The access policy
AA has incorrectly or maliciously verified a user and has granted illegitimate attribute sets [9].
B. The attribute authorities (AAs): AAs are responsible for performing user legitimacy verification and generating intermediate keys for legitimacy-verified users. Unlike most existing multi-authority schemes, where each AA manages a disjoint attribute set, the proposed scheme involves multiple authorities that share the responsibility of user legitimacy verification, and each AA can perform this process for any user independently. When an AA is selected, it verifies the user's legitimate attributes by manual labour or authentication protocols, and generates an intermediate key associated with the attributes it has legitimacy-verified. The intermediate key is a new concept that assists the CA in generating keys [10].
C. The data owner (Owner): The data owner defines the access policy about who can access each file, and encrypts the file under the defined policy. First, each owner encrypts his/her data with a symmetric encryption algorithm. Then, the owner formulates an access policy over an attribute set and encrypts the symmetric key under the policy, according to public keys obtained from the CA. After that, the owner sends the whole encrypted data and the encrypted symmetric key (denoted as ciphertext CT) to the cloud server to be stored in the cloud.
D. The data consumer (User): Each user is assigned a global user identity Uid by the CA. The user possesses a set of attributes and is equipped with a secret key associated with his/her attribute set. The user can freely fetch any encrypted data of interest from the cloud server. However, the user can decrypt the encrypted data if and only if his/her attribute set satisfies the access policy embedded in the encrypted data [11].
E. The cloud server: The cloud server provides a public platform for owners to store and share their encrypted data. The cloud server does not conduct data access control for owners; the encrypted data stored in the cloud server can be downloaded freely by any user.

6. METHODOLOGY
The NTRU cryptosystem is based on the shortest vector problem (SVP) in a lattice, which makes it very fast and resistant to quantum computing attacks. It has been proved to be faster than RSA. NTRU implements the following three basic functions [12]:
1. Key Generation: to create the user's public and private keys.
2. Encryption: to send a message, we first encrypt it.
3. Decryption: the encrypted message is decrypted using the private key.

7. CONCLUSION AND FUTURE SCOPE
In this system, we first propose an improved NTRU cryptosystem to overcome the decryption failures of the original NTRU, and then present a secure and verifiable access control scheme based on the improved NTRU to protect outsourced big data stored in a cloud. Our scheme allows the data owner to dynamically update the data access policy, and the cloud server to update the corresponding outsourced ciphertext, to enable efficient access control over the big data in the cloud. It also provides a verification process for a user to validate both its legitimacy of accessing the data, to the data owner and the t-1 other legitimate users, and the correctness of the information provided by the t-1 other users for plaintext recovery.

8. ACKNOWLEDGMENT
We express our sincere thanks to our project guide Prof. Lagad J. U., who was always present with constant, constructive criticism in the making of this paper. We would also like to thank all the staff of the Computer Department for their valuable guidance, suggestions and support
throughout the project work, and who co-operated on the project with personal attention. Above all, we express our deepest gratitude to all of them for their kind-hearted support, which helped us a lot during the project work. Finally, we are thankful to our friends and colleagues for the inspirational help provided to us throughout the project work.

REFERENCES
[1] Surajkumar Singh, Niraj Chaudhary, Sreenu M, Manjunath B M, "Secure Accessibility for Big Data in Cloud", International Journal of Innovative Research in Science, Engineering and Technology, Volume 7, Special Issue 6, May 2018.
[2] Kai Fan, Junxiong Wang, Xin Wang, Hui Li and Yintang Yang, "A Secure and Verifiable Outsourced Access Control Scheme in Fog-Cloud Computing", Sensors 2017, 17, 1695; doi:10.3390/s17071695.
[3] Roslin Dayana K., Vigilson Prem M., "Review of the Various Optimized Access Control Techniques for Big Data in Cloud Environment", International Journal of Computer Applications (0975-8887), Volume 179, No. 11, January 2018.
[4] Dr. S. Prayla Shyry, Dhrupad Kumar Das, "A Secure and Verifiable Access Control Scheme for Big Data Storage in Clouds", International Journal of Pure and Applied Mathematics, Volume 119, No. 12, 2018, pp. 14147-14153.
[5] Chunqiang Hu, Wei Li, Xiuzhen Cheng, Jiguo Yu, Shenling Wang, Rongfang Bie, "A Secure and Verifiable Access Control Scheme for Big Data Storage in Cloud", IEEE Transactions on Big Data, Vol. PP, Issue 99, Feb 2017.
[6] Zheng Yan, Xueyun Li, Mingjun Wang, Athanasios V. Vasilakos, "Flexible Data Access Control Based on Trust and Reputation in Cloud Computing", IEEE Transactions on Cloud Computing, Vol. 5, Issue 3, July-Sept. 2017.
[7] E. Goh, H. Shacham, N. Modadugu, D. Boneh, "Sirius: Securing untrusted storage", Proc. of NDSS, 2003, pp. 131-145.
[8] L. Zhou, V. Varadharajan, M. Hitchens, "Achieving secure role-based access control on encrypted data in cloud storage", IEEE Trans. on Information Forensics and Security, vol. 8, no. 12, pp. 1947-1960, 2013.
[9] S. Yu, C. Wang, K. Ren, W. Lou, "Achieving secure, scalable, and fine-grained data access control in cloud computing", Proc. of IEEE INFOCOM, 2010, pp. 534-542.
[10] G. Wang, Q. Liu, J. Wu, M. Guo, "Hierarchical attribute-based encryption and scalable user revocation for sharing data in cloud servers", Computers & Security, vol. 30, no. 5, pp. 320-331, 2011.
[11] A. Lewko and B. Waters, "Decentralizing attribute-based encryption", Advances in Cryptology - EUROCRYPT 2011, pp. 568-588, 2011.
[12] C. Hu, X. Cheng, Z. Tian, J. Yu, K. Akkaya, and L. Sun, "An attribute based signcryption scheme to secure attribute-defined multicast communications", SecureComm 2015, Springer, 2015, pp. 418-435.
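The (t,n)-threshold secret sharing that the NTRU-based scheme above relies on can be illustrated with Shamir's classic construction. This is a generic sketch over a small prime field, not the paper's exact instantiation:

```python
import random

P = 2**61 - 1  # a Mersenne prime field for the toy example

def make_shares(secret, t, n):
    # Random polynomial of degree t-1 with constant term = secret.
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 recovers the constant term.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

secret = 123456789
shares = make_shares(secret, t=3, n=5)
assert reconstruct(shares[:3]) == secret   # any t shares suffice
assert reconstruct(shares[2:5]) == secret
```

Any t of the n shares recover the secret, while fewer than t reveal essentially nothing; this is what lets t-1 other legitimate users assist in plaintext recovery in the surveyed scheme.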
but fast network connections are expensive or impractical in many remote locations, so better compression is needed to make WAN replication practical. The authors present a new technique for replicating backup data sets across a WAN that not only removes duplicate file regions (deduplication) but also compresses similar file regions with delta compression, which is available as a feature of EMC Data Domain systems [10].

OPEN ISSUES:
Existing solutions for deduplication suffer from many attacks. They cannot conveniently support data access control and revocation at the same time. Most existing solutions cannot ensure reliability, security and privacy with sound performance. First, data holders may not always be online or available for management, which can cause storage delays. Second, deduplication can become too complicated, in terms of communication and computation, to involve the data holder in the deduplication process. Third, it may intrude on the privacy of the data holder in the process of discovering duplicated data. Fourth, a data holder may have no idea how to issue data access rights or deduplication keys to users in situations where it does not know the other data holders due to data super-distribution. Therefore, the CSP cannot cooperate with data holders on data storage deduplication in many situations.

3. PROPOSED SYSTEM:
In this paper, the authors propose a scheme that addresses the challenges of data ownership and cryptography to manage the storage of encrypted data with deduplication. The goal is to solve the problem of deduplication in situations where the data owner is not available or is difficult to involve. Meanwhile, the data size does not affect the performance of data deduplication in the scheme. The authors are motivated to save space in the cloud and to preserve the privacy of data owners by proposing a scheme to manage the storage of encrypted data with deduplication. They test the security and evaluate the performance of the proposed scheme through analysis and simulation. The results show its efficiency, effectiveness and applicability.

Objectives:
To improve integrity.
To increase storage utilization.
To remove duplicate copies of data and improve reliability.
To improve security.

4. System Architecture:

Fig. System Architecture

CSP: The CSP provides data storage services to data owners. It cannot be completely trusted: it may be curious about the content of the stored data, but it is assumed to preserve the data honestly for profit.
Data Holder: A data holder can upload and save his/her data and files in the CSP. In this system, a number of data holders can store their files in encrypted form in the CSP. The data holder that produces or creates a file is considered the owner of the data, and normally the owner has the highest priority.
1) Secure indexes are a natural extension of the problem of constructing data structures with privacy guarantees, such as those provided by oblivious and history-independent data structures. This work develops an efficient IND-CKA secure index construction called Z-IDX using pseudo-random functions and Bloom filters, and shows how to use Z-IDX to implement searches on encrypted data. This search scheme is among the most efficient encrypted-data search schemes currently known [2].

3) Survey the ways in which Bloom filters have been used and modified for a variety of network problems, with the aim of providing a unified mathematical and practical framework for them and stimulating their use in future applications [3].

2) Define SSE in the multi-user setting, and present an efficient construction that achieves better performance than simply using access control mechanisms [4].

6) Public-key encryption algorithm for encrypting the data, invoking ranked keyword search over the encrypted data to retrieve files from the cloud. This aims to achieve an efficient system for data encryption without sacrificing data privacy. Further, ranked keyword search greatly improves system usability by enabling ranking based on relevance scores for search results, sending the top most relevant files instead of sending all files back, and ensuring file retrieval accuracy [12].

7) Present a privacy-preserving multi-keyword text search (MTS) scheme with similarity-based ranking to address this problem. To support multi-keyword search and search result ranking, this work proposes to build the search index based on term frequency and the vector space model with a cosine similarity measure to achieve higher search result accuracy [13].

8) An algorithm to provide efficient multi-keyword ranked search, with different keys for different data owners. This scheme provides a resolution for secure data sharing on the cloud or any public resource. Data contents are also secured, as no authority can access user data. A user can differentiate between the other users with whom he/she shares his/her own data [15].

2. The proposed scheme allows new data owners to enter the system without affecting other data owners or data users.
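The relevance-score-based ranking referred to in the surveyed schemes is typically a TF-IDF style measure. A small illustrative sketch follows; the corpus and names are hypothetical, and the exact scoring formula varies by scheme:

```python
import math
from collections import Counter

def relevance_score(file_words, keyword, N, df):
    # TF-IDF style score: (1/|F|) * (1 + ln tf) * ln(1 + N/df),
    # a classic form used for ranked search over (encrypted) indexes.
    tf = Counter(file_words)[keyword]
    if tf == 0 or df.get(keyword, 0) == 0:
        return 0.0
    return (1 / len(file_words)) * (1 + math.log(tf)) * math.log(1 + N / df[keyword])

files = {
    "f1": "cloud storage security cloud".split(),
    "f2": "image processing pipeline".split(),
    "f3": "secure cloud search".split(),
}
N = len(files)
# Document frequency: in how many files each word appears.
df = Counter(w for words in files.values() for w in set(words))

# Rank files for the query keyword "cloud"; only matching files score > 0.
ranked = sorted(files, key=lambda f: relevance_score(files[f], "cloud", N, df),
                reverse=True)
assert ranked == ["f1", "f3", "f2"]
assert relevance_score(files["f2"], "cloud", N, df) == 0.0
```

In a searchable-encryption setting these scores are computed over an encrypted index, so the server can rank results without learning the keywords themselves.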
3. SYSTEM ARCHITECTURE/
SYSTEM OVERVIEW
calculate the relevance score between a file Fb (b ∈ [1, d]) and a keyword wj:

    Score(Fb, wj) = (1/|Fb|) · (1 + ln F_{Fb,wj}) · ln(1 + N/f_{wj})

where |Fb| denotes the length of file Fb, F_{Fb,wj} denotes the frequency of the keyword wj in the file Fb, f_{wj} denotes the number of files containing keyword wj, and N denotes the number of files.

Each node in the index tree stores a vector D whose elements are the relevance scores. We define a node in the index tree as

    unode = <ID, FID, D, Pl, Pr>

where ID, FID, and OID denote the id of the node, the file, and the data owner, respectively; Pr denotes the pointer to the right child of the unode, and Pl denotes the pointer to the left child.

5. GAP ANALYSIS

Table I: Number of searched files for a user-entered string

Index Number | No. of words in string | Existing system | Proposed system
1            | 3                      | 18              | 7
2            | 2                      | 12              | 6
3            | 1                      | 18              | 10
4            | 2                      | 22              | 16
5            | 3                      | 11              | 4

Table I shows the number of searched files for a user-entered string in the existing system and the proposed system. In the existing system, for a multi-keyword string, each word is considered separately and the documents are searched for each word separately. In the proposed system, only the ranked files are displayed to the user.

6. CONCLUSION
This paper presents secure searching techniques over cloud-stored data. It also surveys different techniques to search over encrypted data and solves the problem of ranked search over encrypted cloud data. The data is retrieved in less time by secure index searching. The cloud server performs searching over the encrypted data, but the server does not learn the sensitive information behind the data collection. The TPA checks the integrity of the data stored on the cloud.

REFERENCES
[1] D. Song, D. Wagner, A. Perrig, "Practical techniques for searches on encrypted data", in: SP'00, Berkeley, CA, 2000.
[2] E. Goh, "Secure indexes", Cryptology ePrint Archive, Report 2003/216, 2003.
[3] A. Broder, M. Mitzenmacher, "Network applications of Bloom filters: A survey", Internet Math., vol. 1, no. 4, pp. 485-509, 2002.
[4] R. Curtmola, J. Garay, S. Kamara, R. Ostrovsky, "Searchable symmetric encryption: improved definitions and efficient constructions", Journal of Computer Security, vol. 19, no. 5, pp. 895-934, 2011.
[5] Q. Liu, G. Wang, J. Wu, "Secure and privacy preserving keyword searching for cloud storage services", J NETW COMPUT APPL., vol. 35, no. 3, pp. 927-933, 2012.
[6] C. Wang, N. Cao, J. Li, K. Ren, W. Lou, "Secure ranked keyword search over encrypted cloud data", in: ICDCS'10, Genoa, Italy, 2010.
[7] Liu, L. Zhu, J. Chen, "Efficient searchable symmetric encryption for storing multiple source dynamic social data on cloud", J NETW COMPUT APPL., vol. 86, pp. 3-14, 2017.
[8] N. Cao, C. Wang, M. Li, K. Ren, W. Lou, "Privacy-preserving multi-keyword ranked search over encrypted cloud data", in: INFOCOM'11, Shanghai, China, 2011.
[9] Ibrahim, H. Jin, A. Yassin, D. Zou, "Secure rank-ordered search of multi-keyword trapdoor over encrypted cloud data", in: APSCC'12, Guilin, China, 2012.
[10] Z. Shen, J. Shu, W. Xue, "Preferred keyword search over encrypted data in cloud computing", in: IWQoS'13, Montreal, Canada, 2013.
[11] Wang, S. Yu, W. Lou, Y. Hou, "Privacy-preserving multi-keyword fuzzy search over encrypted data in the cloud", in: INFOCOM'14, Toronto, Canada, 2014.
[12] S. Pasupuleti, S. Ramalingam, R. Buyya, An efficient and secure privacy-preserving approach for outsourced data of resource
that the user is not some malicious hacker, and the user can be assured of data consistency, of data storage, and that the instance he/she is running is not malicious. The storage capacity on the cloud can be altered according to the user's needs, so scalability is an important factor, and the system being developed is more reliable because of the security provided at different layers. It is also quite affordable.

3. STATE OF ART
3.1 Fine-Grained Two-Factor Protection Mechanism for Data Sharing in Cloud Storage
In this paper, the proposed system focuses on data protection for cloud storage. The proposed system focuses on the following points:
1) A cryptographic key is used.
2) The cryptographic key can be revoked efficiently by integrating the proxy re-encryption and key separation techniques.
3) The data is protected in a fine-grained way by adopting the attribute-based encryption technique.
3.2 Privacy Preserving Model
The data privacy-preserving issues are analysed by identifying unique privacy requirements and presenting a supportable solution that eliminates the possible threats towards data privacy. The proposed system also gives a privacy-preserving model (PPM) to audit all the stakeholders, in order to provide a relatively secure cloud computing environment.
3.3 Applying Encryption Algorithm for Data Security in Cloud Storage
This paper proposes a simple, secure, and privacy-preserving architecture for inter-cloud data sharing based on an encryption/decryption algorithm, which aims to protect the data stored in the cloud from unauthorised access [4].

4. PROPOSED WORK
The proposed predictive model initially undergoes the following techniques:
Proof of Ownership
Proof of Authentication
File Upload with Digital Signature
Upload File & Grant Permissions
Data preprocessing is followed by designing the prediction engine and building the learning model using different boosting techniques to produce learned parameters, which are ultimately used for prediction calculation.

Proof of Ownership
The data owner uploads the document and its metadata to the cloud after encryption, using keys from the data owner and the cloud service provider. Each and every document carries a digital e-signature, and all text documents can be modified only by an authorized user.
Proof of Authentication
Each user has one unique username and password, which is used for authentication. Each user also has a unique digital e-sign, because these are used to upload documents.
File Upload with Digital Signature
Prior to uploading a document, every individual document is digitally signed. Digital signatures can provide added assurance of the origin, identity and status of an electronic document, transaction or message, as well as acknowledging informed consent by the signer. So we use digital signatures for file upload.
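The sign-before-upload flow described above can be sketched as follows. Python's standard library has no public-key signature primitive, so this toy uses an HMAC as a stand-in for the user's digital signature; a real deployment would sign with an RSA or ECDSA private key, and the key and function names here are hypothetical:

```python
import hashlib, hmac, time

SIGNING_KEY = b"users-private-signing-key"  # stand-in; really an RSA/ECDSA key

def sign_document(doc, user):
    # Hash the document, then "sign" the digest before upload.
    digest = hashlib.sha256(doc).hexdigest()
    sig = hmac.new(SIGNING_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return {"user": user, "digest": digest, "signature": sig,
            "uploaded_at": time.time()}

def verify_upload(doc, record):
    # Cloud side: recompute the digest and check the signature,
    # establishing origin and that the file was not altered in transit.
    digest = hashlib.sha256(doc).hexdigest()
    expected = hmac.new(SIGNING_KEY, digest.encode(), hashlib.sha256).hexdigest()
    return digest == record["digest"] and \
        hmac.compare_digest(expected, record["signature"])

doc = b"quarterly report"
record = sign_document(doc, "alice")
assert verify_upload(doc, record)              # authentic upload verifies
assert not verify_upload(b"tampered", record)  # a modified file is rejected
```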
model for describing the data or information aspects of a software system. The main components of ER models are entities and the relationships that exist among them. The various entities of the synchronization system are the data owner and the data user.
REFERENCES
[1] B. Prabavathy, P. Ramya, Chitra Babu, "Optimized private cloud storage for heterogeneous files in an university scenario", International Conference on Recent Trends in Information Technology (ICRTIT), 2013.
[2] Cong Zuo, Jun Shao, Joseph K. Liu, Guiyi Wei and Yun Ling, "Fine-Grained Two-Factor Protection Mechanism for Data Sharing in Cloud Storage", IEEE Transactions on Information Forensics and Security.
[3] Kaiping Xue, Weikeng Chen, Wei Li, Jianan Hong, Peilin Hong, "Combining Data Owner-side and Cloud-side Access Control for Encrypted Cloud Storage", IEEE Transactions on Information Forensics and Security.
[4] Zaid Kartit, Mohamed EL Marraki, "Applying Encryption Algorithm to Enhance Data Security in Cloud Storage", Advances in Ubiquitous Networking, Lecture Notes in Electrical Engineering, vol. 366, Springer, Singapore, 2015.
[5] Boeui Hong, Han-Yee Kim, Minsu Kim, Lei Xu, Weidong Shi, and Taeweon Suh, "FASTEN: An FPGA-based Secure System for Big Data Processing", IEEE Design & Test, Hardware Accelerators for Data Centers.
[6] Hui Cui, Yingjiu Li, "Attribute-based cloud storage with secure provenance over encrypted data", Future Generation Computer Systems, February 2018, Volume 26, Issue 4, pp. 461-472.
[7] Nesrine Kaaniche, Aymen Boudguiga, Maryline Laurent, "ID-Based Cryptography for Secure Cloud Data Storage", ACM, 2013.
IMAGE AND SIGNAL PROCESSING
path is calculated. The system architecture is shown in Figure 3.1.

Fig. 3.1 System architecture

The layered architecture gives a brief idea of how the different components of the system work together to produce the desired output. The first layer is the App Layer, which shows the UI on the screen. Using the UI, a user can interact with the app. The next layer is the SDK Layer. The SDK includes all the APIs which help the user interact with the hardware. The final layer is the Sensor/Hardware Layer. The gyroscope and the accelerometer keep the app running by calculating GPS coordinates, thereby allowing places to be located.

show indoor applications. It does not support internal routes.

4.2 Proposed System
To overcome the existing problems, we propose a system. In the proposed system we create an indoor map which is linked to augmented reality. In this system we provide a navigation guide which guides the user to their respective destination. It uses augmented reality, where we require a camera-generated live-action video. Using the on-screen navigation guide, a user can easily reach his/her destination without any confusion and without asking someone for help.

5. DATA FLOW DIAGRAM
5.1 DFD Level 0

Fig. 5.1 DFD (Level 0)

5.2 DFD Level 1
achieved. Further development effort was expected. There were differences in image quality; uniform image quality should be used to achieve higher accuracy [6].
Guan, Haiyan, Yongtao Yu, and Jonathan Li developed a tensor voting approach to dark spot detection in RADARSAT-1 ScanSAR narrow beam mode images. The proposed method was developed using C++ running on an HP Z820 workstation. Quantitative evaluations demonstrated the average commission error the proposed method achieves [7].
Hsieh, Chen-Chiung, and Meng-Kai Jiang developed a facial expression classification system based on the Active Shape Model and a Support Vector Machine. This system utilized facial components to locate dynamic facial textures, such as frown lines, nose wrinkle patterns, and nasolabial folds, to classify facial expressions. A Support Vector Machine is deployed to classify the six facial expression types: neutral, happiness, surprise, anger, disgust, and fear. The results showed that the proposed method classified these six human expressions effectively [8].
Ohchi, Shuji, Shinichiro Sumi, and Kaoru Arakawa developed a nonlinear image processing system for beautifying human facial images using contrast enhancement, which effects highlighting and shading. This system can realize highlighting and shading in the face, which makes the face look deeply chiseled, as well as removing undesirable skin roughness such as wrinkles and spots. The parameters in this system are optimized with IEC. One-point crossover is applied, where the crossover point is randomly determined, and a single bit is reversed in the mutation, where the locus is also determined randomly [9].
Wang, YuanHui, and LiQian Xia developed feature-based face detection in complicated backgrounds. The first stage adopts skin colour-based segmentation to search potential face regions. The second stage detects the lip in potential face regions using a lip colour model and searches for the eyes using geometry textures. The last stage clips the face region using an optimization ellipse. In the future, they plan to use this face detection system as pre-processing for face tracking and face recognition [10].

4. GAP ANALYSIS
The comparison drawn between the previously published papers addressing lesion detection can be understood from Table 1.

Table 1 - Comparison between content in papers.

Author            | Proposed System                                        | Pitfall
Nasim Alamdari [2]| HSV model, K-means and support vector machine          | Limited number of images were used.
Kittigul [1]      | Haar Cascade classifier                                |
Chin et al. [4]   | Law's mask filter and Gabor wavelets                   | Eigenvalue calculation method was complex.
Jain [12]         | AAM and LOG                                            | Face marks were not explicitly defined.
Douglas Chai [3]  | Bayesian classifier and multilayer perceptron classifier | Segmentation performance degraded.

5. PROPOSED SYSTEM
The proposed model consists of three subsystems, which are as follows:
Acne Detection System
Wrinkle Detection System
Remedial System
An image is taken from the user, and the preprocessing steps are performed as mentioned in Section 5.1. After the preprocessing steps are done, the system checks whether acne or wrinkles are present on the person's face or not. The respective results are then given for that image. After detection of a lesion is done, the corresponding remedies are also given as output by the system. This automated lesion detection system detects lesions and suggests measures to cure or prevent them in the future.
After the transformation, a connected component labeling algorithm can detect connected regions in the wrinkles' binary digital images.
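The binarize-then-label wrinkle detection mentioned above can be sketched with a plain 4-connected flood-fill labeling on a toy image. The helper names are hypothetical, and real systems operate on preprocessed face images rather than a hand-written grid:

```python
from collections import deque

def binarize(img, threshold):
    # Binarization: pixels above the threshold become 1.
    return [[1 if p > threshold else 0 for p in row] for row in img]

def connected_components(binary):
    # 4-connected component labeling via BFS flood fill.
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    regions, next_label = [], 1
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not labels[y][x]:
                queue, pixels = deque([(y, x)]), []
                labels[y][x] = next_label
                while queue:
                    cy, cx = queue.popleft()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] \
                                and not labels[ny][nx]:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                regions.append(pixels)
                next_label += 1
    return regions

# Position and size features (area, bounding-box height/width) per region.
img = [[0, 200, 200, 0],
       [0, 200,   0, 0],
       [0,   0,   0, 180]]
regions = connected_components(binarize(img, 128))
features = [(len(r),
             max(y for y, _ in r) - min(y for y, _ in r) + 1,
             max(x for _, x in r) - min(x for _, x in r) + 1) for r in regions]
assert len(regions) == 2          # two candidate wrinkle regions
assert features[0] == (3, 2, 2)   # area 3, height 2, width 2
```

Libraries such as OpenCV provide the same operation directly (`connectedComponentsWithStats`); the sketch only shows what the labeling step computes.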
Step 4: Image Binarization – to get clear receive some useful suggestions and
wrinkle contour. remedies to cure their skin problems in an
effective manner.
Step 5: Connected Component Labeling –
to mark wrinkles‘ number 7. FUTURE SCOPE
In the future, we want to collect more and
Step 6: Detect wrinkle position using more users‘ facial images and their
area, length, width feature on the CCL detection results for analyzing in order to
map. improve the accuracy of our system and
also put forward the superior proposal for
6. CONCLUSION our users. One more idea can be applying
Acne detection and classification is one of various methods to acne images and
the most important processes in acne comparing the results and combining the
treatment. In this work it has been successful algorithms to get higher
proposed that for acne detection and accuracy to segment and distinguish acne
counting it,use and process the images types and also identifying other skin
taken by a webcam. Haar Cascade detectors have been used to classify the facial portion of the images. Mouth and ear detectors are used to obscure critical parts of the face that could be wrongly classified as acne lesions. Segmentation of skin pixels has been performed by combining several color, texture, shape, spatial and unsupervised descriptors. The proposed unsupervised features improved the performance of the skin segmentation model, which is an ensemble of 10 Random Forest models, and achieved high accuracy at a reasonable computation time on the FSD dataset. The a* channel of the CIELab model has been proven suitable for enhancing discrimination between acne lesions and healthy skin, and Adaptive Threshold performed on this channel is able to separate acne lesions from healthy skin with acceptable results. The Laplacian of Gaussian filter is the algorithm selected to detect acne spots and mark them in the image. Finally, reports are produced containing the number, location and ray dimension of the detected acne spots. The system also worked on detecting skin conditions such as wrinkles and skin pores, using the Laws Mask filters followed by a connected component labeling algorithm for the detection of wrinkles. Users can comprehend what kinds of difficulties their skin has encountered and they will … conditions. The scope of the project can be extended to detect dark circles below the eyes and dark spots on the skin.

REFERENCES
[1] Kittigul, Natchapol, and Bunyarit Uyyanonvara. "Automatic acne detection system for medical treatment progress report." In Information and Communication Technology for Embedded Systems (IC-ICTES), 2016 7th International Conference of, pp. 41-44. IEEE, 2016.
[2] Chin, Chiun-Li, Ho-Feng Chen, Bing-Jhang Lin, Ming Chieh Chi, Wei-En Chen, and Zih-Yi Yang. "Facial wrinkle detection with texture feature." In Awareness Science and Technology (iCAST), 2017 IEEE 8th International Conference on, pp. 343-347. IEEE, 2017.
[3] Phung, Son Lam, Abdesselam Bouzerdoum, and Douglas Chai. "Skin segmentation using color pixel classification: analysis and comparison." IEEE Transactions on Pattern Analysis and Machine Intelligence 27, no. 1 (2005): 148-154.
[4] Alamdari, Nasim, Kouhyar Tavakolian, Minhal Alhashim, and Reza Fazel-Rezai. "Detection and classification of acne lesions in acne patients: A mobile application." In Electro Information Technology (EIT), 2016 IEEE International Conference on, pp. 0739-0743. IEEE, 2016.
[5] Guan, Haiyan, Yongtao Yu, and Jonathan Li. "A tensor voting approach to dark spot detection in RADARSAT-1 intensity imagery." In Geoscience and Remote Sensing Symposium (IGARSS), 2015 IEEE International, pp. 3160-3163. IEEE, 2015.
[6] Hsieh, Chen-Chiung, and Meng-Kai Jiang. "A facial expression classification system based on active shape model and support vector machine." In Computer Science and Society (ISCCS), 2011 International Symposium on, pp. 311-314. IEEE, 2011.
[7] Ohchi, Shuji, Shinichiro Sumi, and Kaoru Arakawa. "A nonlinear filter system for beautifying facial images with contrast enhancement." In Communications and Information Technologies (ISCIT), 2010 International Symposium on, pp. 13-17. IEEE, 2010.
[8] Wang, YuanHui, and LiQian Xia. "Skin color and feature-based face detection in complicated backgrounds." In Image Analysis and Signal Processing (IASP), 2011 International Conference on, pp. 78-83. IEEE, 2011.
[9] Maroni, Gabriele, Michele Ermidoro, Fabio Previdi, and Glauco Bigini. "Automated detection, extraction and counting of acne lesions for automatic evaluation and tracking of acne severity." In Computational Intelligence (SSCI), 2017 IEEE Symposium Series on, pp. 1-6. IEEE, 2017.
[10] Chin, Chiun-Li, Guei-Ru Wu, Tzu-Chieh Weng, Yun-Yun Kang, Bing-Jhang Lin, and Ho-Feng Chen. "Skin condition detection of smartphone face image using multi-feature decision method." In Awareness Science and Technology (iCAST), 2017 IEEE 8th International Conference on, pp. 379-382. IEEE, 2017.
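The Laplacian-of-Gaussian spot detection summarized in the conclusion above can be sketched in plain NumPy. This is a minimal illustration, not the paper's implementation: the kernel size, sigma and the synthetic test image are assumptions chosen only to show the mechanism (a zero-mean LoG kernel responds strongly to blob-like dark spots at roughly its own scale).

```python
import numpy as np

def log_kernel(size=9, sigma=1.5):
    """Zero-mean Laplacian-of-Gaussian kernel; responds to blob-like
    structures (such as acne spots) at roughly sigma scale."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    log = (r2 - 2 * sigma ** 2) / sigma ** 4 * np.exp(-r2 / (2 * sigma ** 2))
    return log - log.mean()        # zero mean: flat skin gives no response

def convolve2d(img, kernel):
    """Naive 'valid' correlation; the kernel is symmetric, so this
    equals convolution."""
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Synthetic test image: bright 'skin' with one dark circular 'spot' at (20, 20).
img = np.ones((40, 40))
yy, xx = np.mgrid[0:40, 0:40]
img[(yy - 20) ** 2 + (xx - 20) ** 2 <= 9] = 0.0

resp = convolve2d(img, log_kernel())
i, j = np.unravel_index(np.argmax(np.abs(resp)), resp.shape)
spot = (i + 4, j + 4)              # +4 offset from the 9x9 'valid' window
```

The strongest response lands on the spot center, which is the property the paper's detection step relies on before counting and locating the lesions.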
based systems such as skin color, lighting variations and hand orientation relative to the device.
The remainder of the paper is organized as follows:
Section II gives a detailed literature survey.
Section III presents the proposed system of the work.
Section IV presents the implementation of the system.
Section V concludes the paper.

2. LITERATURE SURVEY
The Leap Motion controller is a small USB peripheral device which consists of three IR LEDs, a sensor, a black glass and two IR monochromatic cameras. It uses infrared (IR) rays to determine the position of predefined objects in a limited space in real time. The device range is a rough hemispherical area extending to a distance of about 1 meter. The Leap Motion sensor is small, easy to use and low cost, as shown in Figure 1. The physical dimensions of the device are: length 80 mm, width 30 mm, height 13 mm.[1] … between the Leap and the devices, to make the devices work properly.[1]

Arabic Sign Language Recognition Using Leap Motion Sensor: The Leap Motion's 3D digital data is used in an ANN-based recognition system for Arabic sign language. The Leap tackles issues in vision-based systems such as skin colors, lighting, etc., and captures hands and fingers in 3D format. An MLP is used in which spatial features are stored. A few disadvantages of this system: it used depth cameras, Kinect and a digital camera; though these achieved high accuracy, they suffered from instability in realistic environments, and the Leap Motion does not track non-manual features.[2]

Bulb Control in Virtual Reality by Using Leap Motion Somatosensory Controlled Switches: A four-channel Leap Motion somatosensory controlled switching module was implemented for bulb switching and control, to aid persons whose hands are damaged and improve the quality of their living. Cost will be reduced by mass production, and the non-touch VR interface avoids the possibility of infections; the work also provides practical training in software design. The relay module served as electrically controlled switches which received their signal from an Arduino Mega 2560, which in turn was controlled by the Leap Motion somatosensory input.[7]
4. PROPOSED SYSTEM
ALGORITHM
Distance: Euclidean

Figure 4. Palm points considered for leap gesture (palm centre P1 at (x1, y1, z1) and fingertip points P2-P6)
B. Feature Extraction
The extracted key points are the coordinates of finger positions from the input gesture. The points are the center of the palm (say P) and the tips of the thumb (say T), index finger (say I), middle finger (say M), ring finger (say R) and pinky finger (say K) for each hand. The coordinates are P(x1, y1, z1), T(x2, y2, z2), I(x3, y3, z3), M(x4, y4, z4), R(x5, y5, z5), K(x6, y6, z6) for each hand.[2]

Figure 5. Distance vectors calculated using Euclidean distance

C. Gesture Classification
Figure 5 represents the distance vectors. Whenever a gesture is performed, the points are extracted and stored using serialization. At run time, distances are calculated from the extracted feature points using the Euclidean distance formula:

    di = sqrt((xi - xi+1)^2 + (yi - yi+1)^2 + (zi - zi+1)^2), where i = 1 to 15 for single-handed gestures

The cosine distance measure algorithm gives the fastest and most efficient measure.[2]

D. Gesture Database
The extracted points are stored in a serializable database, and the distance between each fingertip and the palm center is calculated for each finger.[3]

6. CONCLUSION
To overcome the issues of earlier vision-based systems, such as skin color, lighting variations and hand orientation relative to traditional gesture detection devices, this paper presents a preliminary solution: a hybrid interface consisting of software and hardware components.
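The fingertip-to-palm distance features described above can be sketched in a few lines of Python. The coordinate values here are purely illustrative placeholders (a real system would read them from the sensor frame); only the distance computation itself follows the formula in the text.

```python
import math

# Hypothetical palm-centre and fingertip coordinates for one hand, in the
# order P (palm), T (thumb), I (index), M (middle), R (ring), K (pinky).
points = {
    "P": (0.0, 0.0, 0.0),
    "T": (-40.0, 55.0, 10.0),
    "I": (-15.0, 80.0, 5.0),
    "M": (0.0, 85.0, 0.0),
    "R": (15.0, 80.0, 5.0),
    "K": (35.0, 60.0, 10.0),
}

def euclidean(a, b):
    """d = sqrt((x1-x2)^2 + (y1-y2)^2 + (z1-z2)^2)"""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Distance from each fingertip to the palm centre -> one feature per finger
features = {name: euclidean(xyz, points["P"])
            for name, xyz in points.items() if name != "P"}
```

Serializing the `features` dictionary per gesture would give the kind of distance records the gesture database section stores.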
The paper [1] presented by Zahid Ahmed, Sabina Yasmin, Md Nahidul Islam and Raihan Uddin focuses on building a software tool for automated counterfeit currency detection for Bangladeshi notes. The software detects fake currency by extracting existing features of banknotes, such as micro-printing, optically variable ink (OVI), watermark, iridescent ink, security thread and ultraviolet lines, using OCR (Optical Character Recognition), contour analysis, face recognition, Speeded Up Robust Features (SURF) and the Canny edge and Hough transformation algorithms of OpenCV.

The paper [7] presented by Mohammad Shorif Uddin, Pronaya Prosun Das and Md. Shamim Ahmed Roney describes an automated image-based technique for the detection of fake banknotes of Bangladesh. An SVM (Support Vector Machine) classifier has been used after extracting three security features, namely the watermark, latent image and micro-text, from the acquired images of the banknotes. All the algorithms used in this proposed system have been implemented in MATLAB.

The paper [6] presented by Sahana Murthy, Jayanta Kurumathur and B Roja Reddy focuses on a software system that uses image processing techniques to identify fake Indian currency notes. The problem with current existing systems is the trade-off between complexity and speed. The selected security features for each denomination are analysed, and the expected values for real notes are compared: by comparing the values of the input to the reference values, the denomination with the highest amount of significant similarity is selected.

The paper [2] presented by Sonali R. Darade and Prof. G.R. Gidveer makes use of image processing techniques to identify fake currency notes. The automatic system is designed for identification of Indian currency notes and checks whether a note is fake or original; it is very useful in the banking system and other fields as well. In India there is an increase in counterfeit notes of 100, 500 and 1000 rupees; with advances in technology such as scanning, colour printing and duplicating, the counterfeit problem keeps growing.

The paper [3] presented by P. Ponishjino, Kennet Antony, Sathish Kumar and Syam JebaKumar focuses on a system in which the strip lines or continuous lines are detected in real and fake notes by using edge detection techniques. HSV techniques are used to saturate the value of an input image; the image processing converts the input image from RGB to HSV. The various characteristics of the paper currency are cropped and segmented using an ROI algorithm.

The paper [4] presented by Pradeepa Samarasinghe, L.K.P Lakmal, Weilkala A.V., W.A.N.P.C Wickramarachchi and E.R.S Niroshana focuses on image processing to detect forgery in driving licences of Sri Lanka.
Table 1: Literature Survey

Parameter     | Image Processing Based Feature Extraction of Bangladeshi Bank Notes | Automatic Recognition of Fake Indian Currency Notes | Bogus Currency Authorization Using HSV Techniques | Design and Implementation of Paper Currency Recognition with Counterfeit | Our Solution
Accuracy      | High      | Low   | Average   | High      | High
Ease of Use   | Difficult | Easy  | Difficult | Difficult | Easy
Required Cost | High      | High  | High      | Medium    | Low
3. SYSTEM IMPLEMENTATION PLAN

Figure 1 - Features of 2000₹ Note

The system starts its performance based on a training data set consisting of high-quality images of genuine notes. A total of 5 security features out of the available 7 will be tested. The two most important security features are the Security Thread and the Watermark. The result of Security Thread and Watermark identification has to be positive in order for the note to proceed to the next steps; if the result is negative, the note is declared a fake note immediately. As the system is capable of detecting five security features of bank notes, the final state of this system will declare the note as genuine only when it gains at least 3 success points, that is, an accuracy of greater than or equal to 66.67%. This is because each of the five features is strong enough to fight against counterfeiting, but sometimes printing quality and rough usage can make the security features of genuine banknotes fade, so some selected features may not be detected accurately. The implemented system proves the control logic of the whole project, and the robustness of this project is clearly plausible later on.

Security features of genuine banknotes:

1. Security Thread
The security thread is present in the 2000₹ and 500₹ notes and appears to the right of Mahatma Gandhi's portrait. The security thread carries the visible text "RBI" and "BHARAT". When the note is held against the light, the security thread can be seen as one continuous line.

2. Watermark
The Mahatma Gandhi watermark is present on the bank notes, with a shade effect and multidirectional lines in the watermark.

3. Optically Variable Ink
Optically variable ink is used as a security feature in the 2000₹ and 500₹ bank notes; it was introduced as a banknote security feature in November 2000. The denomination value is printed with optically variable ink: the numeral 2000 or 500 appears green when the note is flat but changes to blue when the note is held at an angle.

4. Latent Image
The latent image shows the respective denomination value in numerals. On the obverse side of the notes, the latent image is present on the vertical band to the right of Mahatma Gandhi's portrait. The latent image becomes visible when the note is held horizontally at eye level.

5. Bleed Lines
There are angular bleed lines on the left and right sides of the note in raised print. Bleed lines help visually impaired people to identify the denomination of the notes.

6. See-Through Register
A small floral design is printed in the middle of the vertical band, next to the watermark. The floral design on the front is hollow and on the back is filled up; the design has back-to-back registration, so it appears as one floral design when seen against the light.

3) Gray Scale Conversion
The image obtained is in RGB format. It is transformed into gray scale because gray scale carries only the intensity information, which is easier to process than the three components of RGB.

4) Edge Detection
Edge detection is a basic tool in image analysis, image processing, image pattern recognition and computer vision, particularly in the area of feature detection and feature extraction.
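The gray scale conversion step can be sketched with the common luminance weighting. The paper does not specify which formula it uses; the 0.299/0.587/0.114 coefficients below are the standard ITU-R BT.601 weights, assumed here for illustration.

```python
import numpy as np

def rgb_to_gray(rgb):
    """Convert an H x W x 3 RGB image to a single intensity channel
    using the standard BT.601 luminance weights."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

# A pure red, pure green and pure blue pixel map to their respective weights:
img = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=float)
gray = rgb_to_gray(img)        # shape (1, 3), one intensity per pixel
```

Working on this single channel is what makes the subsequent edge detection cheaper than processing three RGB components.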
Figure 2 - System Flow Diagram (fc = feature counter; result ≥ 66.67% → genuine note, result < 66.67% → fake note)

5. CONTROL FLOW

Figure 3 - Control Flow Diagram

The above figure represents the control flow of the proposed system. The system starts by capturing a high-quality image of the note to be tested. The first and most important feature tested is the Security Thread; if the system fails to verify this feature, the note is directly declared a fake one. On the other hand, if the security thread test is passed, the system starts a counter for the features of the note. The counter is increased by 1 if a test is passed and kept as it is if a test is failed. At the end, the counter value is divided by 5 (the total number of features tested) and multiplied by 100 to calculate the percentage. A minimum of 66.67% is required for the note to be declared genuine.

6. PROBLEM IDENTIFICATION
Identification of fake notes is very useful, as it can be used by banks to distinguish between original and fake notes, but there are certain issues regarding image processing:
1) Motion blur
2) Noise imposed by the image capture instrument
3) Variety of notes
4) Less efficient feature extraction

7. CONCLUSION
The proposed software system will be very useful for identifying fake Indian currency notes. The system will use advanced image processing algorithms and be made available free of cost to everyone. Users will require minimal hardware in order to access and use the system. The results are in the form of a Boolean value which indicates whether the note is fake or original.

8. FUTURE SCOPE
The system uses 5 distinct features to check the validity of notes. In the future, the number of features can be increased to make the system more robust. The speed of the system can also be increased by using advanced image processing technologies, so that users can scan more notes in less time.

REFERENCES
[1] Zahid Ahmed, Sabina Yasmin, Md Nahidul Islam, Raihan Uddin Ahmed, "Image Processing Based Feature Extraction of Bangladeshi Banknotes", 2014.
[2] Sonali R. Darade, Prof. G.R. Gidveer, "Automatic Recognition of Fake Indian Currency Note", 2016 International Conference
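The feature-counter logic of the control flow above can be sketched as follows. The feature names and boolean results are hypothetical placeholders for the outcome of each image-processing check; note that with five features the paper's 66.67% threshold effectively requires four passes, since 3/5 is only 60%.

```python
# Hypothetical feature-test results for one note; in the real system each
# value would come from an image-processing check of that security feature.
tests = {
    "security_thread": True,   # prerequisite: failing it rejects the note outright
    "watermark": True,
    "optically_variable_ink": True,
    "latent_image": False,     # faded features may fail even on genuine notes
    "bleed_lines": True,
}

def classify_note(tests, threshold=66.67):
    """Mirror the control flow: reject immediately if the security thread
    fails, otherwise count passed features out of 5, convert to a
    percentage and compare against the 66.67% threshold."""
    if not tests["security_thread"]:
        return "fake"
    score = sum(tests.values()) / 5 * 100   # e.g. 4 passes -> 80.0
    return "genuine" if score >= threshold else "fake"
```

Here `classify_note(tests)` accepts the note (4 of 5 features pass, 80% ≥ 66.67%), while a failed security thread short-circuits to "fake".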
authority members. The user interface consists of a website where the user needs to log in using his credentials to gain access to the real-time CCTV footage. Once logged in, a notification will be sent to the user so that he can review all the recordings and take necessary actions as required.

2. LITERATURE REVIEW
In 2009, Thombre D.V. et al. proposed 'Human Detection and Tracking using Image Segmentation and Kalman Filter'.[1] For human detection, an image segmentation technique was used, and for human tracking, a Kalman filter with a two-dimensional constant velocity model. The Kalman filter is a set of mathematical equations that provides an efficient computational means to estimate the state of a process in a way that minimizes the mean square error. This method tracks individual pedestrians as they pass through the field of vision of the camera, and uses vision algorithms to classify the motion and activities of each pedestrian. The tracking is achieved through the development of a position and velocity path characteristic for each human using a Kalman filter. By making use of this information, the system can bring an incident to the attention of human security personnel.

The paper titled 'Human Detection using HOG-SVM, Mixture of Gaussian and Background Contours Subtraction' was given by Houssein Ahmed et al.[2] In Mixture of Gaussian (MOG) modelling, a statistical process independent of all other pixels is applied to each pixel by comparing it with the set of models existing at that location to find a match. The parameters of the matching model are updated based on a learning factor; if no match is found, the least probable model is eliminated and replaced by a new Gaussian with the current pixel values. Background contours subtraction is less sensitive to light changes, so it can also be helpful in human detection. The paper also noted the use of Histogram of Oriented Gradients. HOG is based on the principle that the local appearance and shape of an object can be described by the intensity distribution of the gradients, or the direction of the contours. The gradient of an image is a vector quantity that indicates how the intensity of a pixel varies in space, and it is computed by convolving the image with a first-derivative mask. The function of the Support Vector Machine (SVM) is to give a decision about the candidate's membership of the class sought. Learning is done from a database of positive examples (a class containing the characteristics of examples of humans) and negative examples (a class containing the characteristics of examples of non-humans). Taking the characteristics of the candidate image as input, the classifier must determine the class closest to the candidate image. In most cases this is the last step of the process, since once the candidate is recognized by the classifier it is enough to display the detection windows.

In 2014, Resmi R et al. proposed 'Video Image Processing for Moving Object Detection and Segmentation using Background Subtraction'.[3] The key concept was using the background subtraction method for moving object detection in videos, while using segmentation for extracting various features of the moving objects for further video/image processing. Background subtraction generates a foreground mask for every frame; this step is simply performed by subtracting the background image from the current frame. When the background view excluding the foreground objects is available, the foreground objects can evidently be obtained by comparing the background image with the current video frame. Segmentation is a significant part of image processing: image segmentation is the division of an image into regions or categories which correspond to different objects or parts of objects.
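The gradient computation that HOG builds on, as described above, can be sketched in NumPy. The [-1, 0, 1] first-derivative mask and the tiny step-edge test image are illustrative assumptions; a full HOG descriptor would additionally bin the orientations into cell histograms and normalize over blocks.

```python
import numpy as np

def gradients(img):
    """Apply first-derivative masks [-1, 0, 1] in x and y, then derive
    the gradient magnitude and orientation used by HOG."""
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]   # horizontal derivative
    gy[1:-1, :] = img[2:, :] - img[:-2, :]   # vertical derivative
    magnitude = np.hypot(gx, gy)
    orientation = np.degrees(np.arctan2(gy, gx))
    return magnitude, orientation

# A vertical step edge: intensity jumps from 0 to 1 at column 3.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
magnitude, orientation = gradients(img)
```

On this edge the magnitude peaks along the transition and the orientation is 0° (a purely horizontal gradient), which is exactly the directional information HOG accumulates into its histograms.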
We also aim to detect human behaviour in the restricted area using additional algorithms. The algorithm will initially detect the human being, then try to track him and keep a record of the time period for which he is present in the restricted area. The system can be made more robust by using sensors for motion detection alongside the time-tracking system.

4. PROPOSED WORK
To overcome the gaps present in existing systems, the proposed security system will identify suspicious activity using the movement of a human and the presence of a human after some predefined time. As Figure No. 1 illustrates, real-world video is fed to the system. The video is made up of a number of frames, and these frames are given for pre-processing. The pre-processing stage provides a foreground image by use of background subtraction. The pre-processed image is then given to the HOG feature descriptor, which identifies the target area in the foreground image. The identified target area is given to the SVM classifier for human or non-human classification. If human presence is detected, the system identifies suspicious activity using motion tracking and time tracking: motion tracking uses optical flow for detecting movement, and time tracking is carried out by applying HOG repeatedly to the present human until the predefined time is exceeded. If the activity is classified as suspicious, the responsible authority is notified by sending a notification; if the activity is not suspicious, there is no output from the system.

[1] Pre-processing:
The real-world video is initially split into a number of frames. These frames are given to the system for initial human detection. There are various techniques available for human detection, of which background subtraction is the more computationally efficient and robust. In background subtraction, the background is removed from the frame and a foreground image is generated; the background is removed by use of Gaussian filtering. The foreground image may consist of different objects, and it is given to the HOG feature descriptor for target area classification.

[2] HOG Feature Descriptor:
A feature descriptor is a representation of an image which simplifies the image by extracting the information useful for classification. HOG initially calculates gradients, i.e. pixel intensity changes, over the entire image for determining background and foreground. But calculating the gradients of the entire image increases the computational time and makes it difficult to use HOG in real-life problems. To overcome this drawback, the feature descriptor is provided with the foreground image as input. HOG identifies the features of all the objects present in the foreground image, and the information collected by this technique is given for further classification to the SVM classifier.

[3] SVM Classifier:
Support Vector Machines are supervised learning models. These models have algorithms associated with them for the purpose of analysing data for classification or regression. An SVM classifier is trained on a set of examples to classify the objects in those examples into different categories. In the proposed system, the SVM classifier will be trained to categorize the detected object as human or non-human.

[4] Motion Tracking:
If the object detected by the SVM classifier is categorized as human, then motion tracking using optical flow is applied to determine suspicious activity. Optical flow is the flow of pixels that generates a pattern from the movement of objects in the visuals. The sequence of frames obtained from video surveillance is used to estimate the motion of the detected human figure. Optical flow can estimate motion between two image frames by
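The background-subtraction step of the pre-processing stage described above can be sketched as simple frame differencing with a threshold. The threshold value and the synthetic frames are illustrative assumptions; the proposed system additionally applies Gaussian filtering, which is omitted here for brevity.

```python
import numpy as np

def foreground_mask(background, frame, thresh=25):
    """Subtract the known background from the current frame and keep
    pixels whose absolute intensity difference exceeds the threshold."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    return (diff > thresh).astype(np.uint8)   # 1 = foreground, 0 = background

# Synthetic example: a flat background and a frame with a bright 'object'.
background = np.full((6, 6), 50, dtype=np.uint8)
frame = background.copy()
frame[2:4, 2:4] = 200          # an object enters the scene
mask = foreground_mask(background, frame)
```

Only the four pixels covered by the object survive in the mask, which is the foreground image that would then be handed to the HOG descriptor.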
bapatil@aissmscoe.com5
ABSTRACT
When you work at a computer, your eyes have to focus and refocus all the time. They move back and forth as you read, and they react to changing images on the screen so your brain can process what you are seeing. This requires a lot of effort from your eye muscles. Computer work gets harder as you age and the lenses in your eyes become less flexible. There is no proof that computer use causes permanent eye damage, but regular use can cause eye strain and discomfort. This condition is called Computer Vision Syndrome. Computer screens are causing increasing damage to human eyes day by day; exhaustive use of computers is causing various conditions, including Computer Vision Syndrome, which can cause eye strain, headaches, blurry vision, dry eyes, shoulder and neck pain, etc.[2] In this paper we propose a system that tries to reduce the impact of Computer Vision Syndrome on human eyes by dealing with the light that is emitted through electronic displays.

General Terms
Computer Vision Syndrome, Eye blinks.

Keywords
Face Detection, Eye detection, Blink detection.
6. FUTURE SCOPE
In future iterations, we can try to implement this system for a wide range of devices such as mobile phones, smart TVs, tablets, etc. The system performs poorly in lighting-constrained environments, so we can try to improve its low-light performance. We can also improve the performance of the system for users who wear spectacles, and head orientation can be taken into account for eye detection in future iterations of the product. The current version only adjusts the screen brightness, but future versions could adjust other light settings, such as reducing or blocking blue light from the display, or other environment-oriented light settings.

REFERENCES
[1] Sofia Jennifer, Sree Sharmila, "Edge Based Eye-Blink Detection for Computer Vision Syndrome", 2017 IEEE.
[2] Richa Mehta, Manish Shrivastava, "An Automatic Approach for Eye Tracking and Blink Detection in Real Time", 2012 IJEIT.
[3] Mai K. Galab, H.M. Abdalkader and Hala H. Zayed, "Adaptive Real Time Eye-Blink Detection System", 2014 International Journal of Computer Applications.
[4] Tereza Soukupova, Jan Cech, "Real Time Eye Blink Detection using Facial Landmarks".
[5] http://www.sankaranethralaya.org/patient-care-cvc.html
[6] https://www.aoa.org/documents/infographics/SYVM2016Infographics.pdf
[7] Cai-Xia Deng, Gui-Bin Wang, Xin-Rui Yang, "Image Edge Detection Algorithm Based on Improved Canny Operator", 2013 IEEE.
[8] Seongwon Han, Sungwon Yang, Jihyoung Kim and Mario Gerla, "EyeGuardian: A Framework of Eye Tracking and Blink Detection for Mobile Device Users".
[9] Xun Wang, Jianqiu Jin, "An Edge Detection Algorithm Based on Improved Canny Operator", Seventh International Conference on Intelligent Systems Design and Applications, pp. 623-628, 2007.
photo while maintaining its accuracy, resulting in lower fault rates. This application can be further extended to areas such as educational organizations and conference attendance, making the records of attendance available irrespective of place and time. The basic idea is to train a manually developed/modified regressor and classifier to learn the labeling of data automatically and to reduce error via an RNN. The product will work in a client-server environment, the client being teachers or supervisors in the class or other educational organizations. The collected data will be stored on the server and cross-verified with the datasets already there, after which the attendance of the students will be updated accordingly.

A. Motivation
Attendance is one of the important factors in the assessment of a student. The current attendance system is managed manually in most places, which wastes the time of teachers, lecturers, etc. Managing attendance manually on paper or non-electronically is tedious, and accessing it on demand without time and place barriers is near to impossible. With the help of face recognition, by processing images taken via mobile devices, automation of the whole system is possible, which will greatly reduce time wastage and human errors in attendance. The attendance data can be stored on a web server and accessed remotely through a website, so accessing the data becomes easy.

2. LITERATURE SURVEY
K.P.M. Basheer et al.: The fingerprint attendance system was one of the attempts to introduce automation into the attendance system. This system consists of two sections: a portable device and a host computer. The fingerprint module, containing a fingerprint sensor, is the heart of the portable device. Attendance is registered by rotating the portable finger recognition device in the classroom: students place their fingers on the sensor to mark attendance, and a GUI application on the host computer helps the teacher manage it. The portable device needs to be handled with care and requires technical knowledge to operate; it is also a time-consuming process, and rotating the device in the classroom creates a disturbance in the regular lecture.[1]

S. Konatham et al.: Here, the proposed model makes use of RFID tags and GSM for automated attendance management in an institute. Every student has a unique RFID card, just like an identity card. RFID card readers are installed at the entrance of the classroom. These readers have a built-in microcontroller which matches the student's RFID card with the RFID registered in the database; if a match is found, the door is opened. This is done using a GSM module. The main disadvantage is that every student has to carry an RFID card to enter the classroom; if a student forgets his card, there must be some system to mark attendance. There is also a chance of students adopting fraudulent methods to mark some other person's attendance, and if a card is swiped more than once, there is a chance of marking attendance twice. Students need to stand in a queue, which is as time consuming as the traditional roll-call system.[2]

S. Noguchi et al.: This is similar to the RFID access card system, only the cards are accessed on students' Android phones themselves. It makes use of a Bluetooth Low Energy (BLE) beacon device to transmit a magic number necessary for proper registration. Only the Android devices of the students present inside the classroom receive the signals of the beacon device carrying the magic number. The students then run the application and register their cards using a Near Field Communication (NFC) reader to mark attendance. The main disadvantage is that it requires an NFC device for registration of access cards onto the system, which
demands technical support. Any student who is not inside the class but falls within the Bluetooth range can also mark his/her attendance.[3]

S. Kadry et al.: In this paper, a wireless iris attendance management system is designed and implemented using Daugman's algorithm. This system consists of iris verification and identification, management of users' irises, system settings and wireless communication management. The shortcomings of this system are that there must be a managing PC nearby, and it is difficult to lay the transmission lines where the topography is bad.[4]

S. Chintalapati et al.: This system, based on face detection and recognition algorithms, automatically detects a student when he enters the classroom and marks the attendance by recognizing him. The real-time face recognition used here is reliable and fast. The face detector is installed at the entrance of the classroom; it detects the face and recognizes it by matching it against existing faces in the database. Again, fraudulent methods can be adopted by students, for example by appearing in front of the camera but not entering the class; in that case the system is not useful, and it can become costly too.[5]

E. Varadharajan et al.: In this method the camera is fixed in the classroom; it captures the image, the faces are detected and recognized against the database, and finally the attendance is marked. If a student is marked absent, a message about the absence is sent to the parents. It uses the Eigenfaces method for face detection. Eigenfaces is one of the simplest but least accurate methods, and the efficiency it provides is poor.[6]

Anissa Lintang Ramadhani et al.: Here, Principal Component Analysis (PCA) and Eigenfaces algorithms are used for face detection, and the Ry-UJI robot is used to implement face recognition. This research uses primary data in the form of Red Green Blue (RGB) images obtained by capturing images with a built-in webcam, and secondary data in the form of an XML classifier file for the face detection and recognition process. The robot has costly hardware specifications and requirements; the RGB pre-processing adds a time-consuming overhead, and using RGB data as input for face detection and recognition reduces efficiency. Eigenfaces is one of the simplest but least accurate methods, and the efficiency it provides is poor.[7]

Monica Chillaron et al.: This work describes the development of a face recognition and detection application that connects with a Raspberry Pi over Bluetooth. It uses the Eigenfaces method for recognition, whereas detection is based on a boosted cascade. From the results, the average hit rate of face detection is 84.4%. The use of the Raspberry Pi hardware base increases the cost of the system, and the system is Bluetooth dependent, which can be easily tampered with, resulting in system failure. The efficiency provided by Eigenfaces is poor.[8]

Balika Hinge et al.: This system is an automated system for human face recognition against a real-time background, built for a company to mark the attendance of its employees. To detect human faces in real time, a Haar cascade is used, and a simple, fast Principal Component Analysis is used to recognize the detected faces with a high accuracy rate. The matched face is then used to mark the attendance of the employee. The system is real time, hence it needs full-time vigilance and a power source. The Haar cascade has limitations, such as being unable to detect faces in dark environments, resulting in poor face detection.[9]

3. PROPOSED WORK
Hence a system with expected results is being developed, but there is still some room for improvement.

where I(x, y) signifies the value at the location (x, y).

Pose Correction: T is a homogeneous matrix defined by the rotation angle θ, the scaling factor s and the translation vector [tx; ty].

5. CONCLUSION AND FUTURE WORK
There may be various types of lighting conditions, seating arrangements and environments in different classrooms. Most of these conditions have been tested on the system, and the system is expected to show 90% accuracy in most cases. There may also be students with varying facial expressions, hair styles, beards, spectacles, etc. All of these cases will be considered and tested to obtain a high level of accuracy and efficiency. Thus, it can be concluded from the above discussion that a reliable, secure, fast and efficient system is being developed to replace a manual and unreliable one. This system can be implemented for better management of attendance and leaves. The system will save time, reduce the amount of work the administration has to do, replace stationery material with electronic apparatus, and reduce the amount of human resource required for the purpose.

REFERENCES
[1] K.P.M. Basheer and C.V. Raghu, "Fingerprint attendance system for classroom needs," Annual IEEE India Conference (INDICON), pp. 433-438, 2012.
[2] S. Konatham, B.S. Chalasani, N. Kulkarni, and T.E. Taeib, "Attendance generating system using RFID and GSM," IEEE Long Island Systems, Applications and Technology Conference (LISAT), 2016.
[3] S. Noguchi, M. Niibori, E. Zhou, and M. Kamada, "Student Attendance Management System with Bluetooth Low Energy Beacon and Android Devices," 18th International Conference on Network-Based Information Systems, pp. 710-713, 2015.
[4] S. Kadry and K. Smaili, "A design and implementation of a wireless iris recognition attendance management system," Information Technology and Control, vol. 36, no. 3, pp. 323-329, 2007.
[5] S. Chintalapati and M.V. Raghunadh, "Automated attendance management system based on face recognition algorithms," IEEE Int. Conference on Computational Intelligence and Computing Research, 2013.
[6] E. Varadharajan, R. Dharani, S. Jeevitha, B. Kavinmathi, S. Hemalatha, "Automatic attendance management system using face detection," Online International Conference on Green Engineering and Technologies (IC-GET), 2016.
[7] Anissa LintangRamadhani, Purnawarman Musa, Eri PrasetyoWibowo, "Human Face Recognition Application Using PCA and Eigenface Approach".
[8] Monica Chillaron, Larisa Dunai, Guillermo Peris Fajarnes, Ismael LenguaLengua, "Face detection and recognition application for Android," IECON 2015, Yokohama, November 9-12, 2015.
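The pose-correction step above describes a homogeneous 2D similarity transform. A minimal sketch of the standard matrix built from a rotation angle θ, a scale factor s and a translation [tx; ty] follows; the concrete values are illustrative, not taken from the paper:

```python
import math

def similarity_transform(theta, s, tx, ty):
    """Homogeneous 2D similarity matrix T: rotation theta, uniform scale s,
    translation (tx, ty)."""
    a, b = s * math.cos(theta), s * math.sin(theta)
    return [[a, -b, tx],
            [b,  a, ty],
            [0.0, 0.0, 1.0]]

def transform_point(T, x, y):
    """Apply T to the point (x, y) given in homogeneous coordinates."""
    return (T[0][0] * x + T[0][1] * y + T[0][2],
            T[1][0] * x + T[1][1] * y + T[1][2])

# scale by 2 and translate by (1, 0): the point (1, 0) maps to (3, 0)
pt = transform_point(similarity_transform(0.0, 2.0, 1.0, 0.0), 1.0, 0.0)
```

Applying the inverse of such a T to a detected face patch is one common way to normalize pose before recognition.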
related to any farming crisis. Farmers face different types of issues, such as:
Insufficient knowledge about the soil.
Lack of weather prediction.
Illiteracy about crop diseases.
Unawareness about how to increase the fertility of the soil.
Extra effort needed to obtain government scheme support.

In such a scenario, a system can help the farmer decide which crop to grow by analyzing different essential factors, guiding farmers towards efficient use of water, fertilizers and the needed nutrition. A capable system can predict weather and take decisions from previous data logs regarding the crops. It will also help farmers register complaints regarding crops under various government schemes.

2. MOTIVATION
Due to a lack of knowledge about which crop should be grown considering different factors such as weather, soil and water, the farmer faces crises related to crop quality and productivity, and has to accept losses every year. Soil detection, and suggestions based on it, will help the farmer improve crops and increase productivity.

3. LITERATURE SURVEY
Inceptisol soil has low fertility and relatively low to moderate levels of organic matter content. Application of organic fertilizer on inceptisol soil of lowland swamp is expected to be capable of increasing N, P and K nutrients as well as the yield of sweet corn. This study's objective was to determine the dose of organic and inorganic fertilizers which can increase N, P and K nutrient uptake as well as the growth and yield of sweet corn on inceptisol soil of lowland swamp [1].
In the fertilizer purchase system, the user will be able to purchase the recommended fertilizers for the soil from the shopping portal. The fertilizers will be suggested to the users based on their past purchases; the user will get suggestions of fertilizers that are usually purchased together. For these suggestions, we are using the Apriori algorithm, which is used for obtaining frequently purchased item sets [2].
This work designs fertilization decision support algorithms from the perspective of a decision support system with a model of agricultural fertilization principles. These integrated and optimized algorithms can provide an accurate fertilization scheme for users. The fertilization decision support system was designed and implemented in accordance with the B/S structure using the ASP.NET platform and a SQL 2000 database [3].
Several different philosophies are used in Kentucky depending on who is making the recommendation. Different farm supply dealers, agricultural consultants, and soil test laboratories use different approaches. Because of this, farmers often wonder why they receive such contrasting fertilizer recommendations and what these differences mean in a farming operation [4].
A commercial fertilizer may contain one or all of the essential elements, but the percentage of each will be listed on the fertilizer label. Micronutrients may or may not be included in the formulation [5].
This work explains support vector machine based classification of soil types. Soil classification includes steps like image acquisition, image preprocessing, feature extraction and classification. The texture features of soil images are extracted using a low-pass filter, a Gabor filter and a color quantization technique [6].
This work presents an image segmentation approach for detecting soil pore structures that have been studied by way of soil tomography sections. A research study was conducted using a density-based clustering method and, in turn, nonparametric kernel estimation. This overcomes the rigidity
of arbitrary assumptions concerning the number or shape of clusters among data, and lets the researcher detect inherent data structures [7].
The objective of this study was to develop a flexible and free image processing and analysis solution, based on the Public Domain Image platform, for the segmentation and analysis of complex biological plant root systems in soil from X-ray tomography 3D images. Contrasting root architectures from wheat, barley and chickpea root systems were grown in soil and scanned using a high-resolution micro-tomography system [8].
This paper investigates the development of a digital image analysis approach for estimating physical properties of soil in lieu of the conventional laboratory approach. The present research deals with collecting soil samples from trial pits at a designated site as per the IS code procedure. A digital image database is prepared for the collected soil samples in the laboratory and the physical properties (Y) are determined [9].
This work presents a satellite image classification system which can classify between vegetation, soil and water bodies. The objective of this work is met by subdividing the work into three important phases: satellite image preprocessing, feature extraction and classification. The image preprocessing phase denoises the image with a median filter, and the contrast is improved by the Contrast Limited Adaptive Histogram Equalization (CLAHE) technique [10].
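The Apriori-style frequent-itemset mining used for the fertilizer suggestions in [2] can be sketched in a few lines; the transactions and the support threshold below are illustrative assumptions, not data from the cited work:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return frequent itemsets (as frozensets) with their support counts."""
    transactions = [set(t) for t in transactions]
    items = {i for t in transactions for i in t}
    frequent, current, k = {}, [frozenset([i]) for i in items], 1
    while current:
        # count the support of each candidate itemset
        counts = {c: sum(c <= t for t in transactions) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        # generate size-(k+1) candidates from unions of surviving k-itemsets
        keys = list(survivors)
        current = {a | b for a, b in combinations(keys, 2) if len(a | b) == k + 1}
        k += 1
    return frequent

# hypothetical purchase baskets from the fertilizer portal
baskets = [{"urea", "DAP"}, {"urea", "DAP", "potash"}, {"urea", "potash"}]
freq = apriori(baskets, min_support=2)
```

Itemsets that survive the support threshold (e.g. fertilizers frequently bought together) are what the portal would surface as purchase suggestions.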
Fig 1: Proposed System Architecture

REFERENCES
[1] Lida Xu, Ning Liang, and Qiong Gao, "An Integrated Approach for Agricultural Ecosystem Management," IEEE Transactions on Systems, Man, and Cybernetics—Part C: Applications and Reviews, vol. 38, no. 4, July 2008.
[2] Jharna Majumdar, Sneha Naraseeyappa and Shilpa Ankalaki, "Analysis of agriculture data using data mining techniques: application of big data," SpringerOpen Journal, 2017.
[3] Ramesh Babu Palepu and Rajesh Reddy Muley, "An Analysis of Agricultural Soils by using Data Mining Techniques," International Journal of Engineering Science and Computing, October 2017.
[4] Dasika P. Rao, "A Remote Sensing-Based Integrated Approach for Sustainable Development of Land Water Resources," IEEE Transactions on Systems, Man and Cybernetics—Part C: Applications and Reviews, vol. 31, no. 2, May 2001.
[5] Francisco Yandun, Giulio Reina, Miguel Torres-Torriti, George Kantor, and Fernando Auat Cheein, "A Survey of Ranging and Imaging Techniques for Precision Agriculture Phenotyping," IEEE Transactions, 2017.
[6] Mengzhen Kang and Fei-Yue Wang, "From Parallel Plants to Smart Plants: Intelligent Control and Management for Plant Growth," IEEE/CAA Journal of Automatica Sinica, vol. 4, no. 2, April 2017.
[7] Małgorzata Charytanowicz and Piotr Kulczycki, "An Image Analysis Algorithm for Soil Structure Identification," Springer International Publishing Switzerland, 2015.
[8] Richard J. Flavel, Chris N. Guppy, Sheikh M. R. Rabbi, Iain M. Young, Dragan Perovic, "An image processing and analysis tool for identifying and analyzing complex plant root systems in 3D soil using nondestructive analysis: Root1," Institute for Resistance Research and Stress Tolerance, Germany, May 3, 2017.
[9] Karisiddappa, Ramegowda and S. Shridhara, "Soil Characterization Based on Digital Image Analysis," Indian Geotechnical Conference, 2010.
[10] Anita Dixit, Nagaratna Hedge and B. Eswar Reddy, "Texture Feature Based Satellite Image Classification Scheme Using SVM," International Journal of Applied Engineering Research, ISSN 0973-4562, vol. 12, no. 13 (2017), pp. 3996-4003, Research India Publications, http://www.ripublication.com.
Fig: Flowchart

8. ADVANTAGES
This software is freely available.
Low cost and easy to use.

9. FUTURE SCOPE
Reduce the complexity of the system.
Reduce excessive delay.
Reduce the high cost.

10. CONCLUSION
We have designed a system that can monitor parameters like temperature, humidity, gas levels and light. All these parameters are monitored locally, and the system connects to the Internet via a Wi-Fi module. All the data collected by the system is then uploaded to a server, where it is displayed using graphs and made available for analysis. Hence, we have designed an IoT-based system for monitoring the parameters of a polyhouse.

REFERENCES
[1] Vamil B. Sangoi, "Smart security solutions," International Journal of Current Engineering and Technology, vol. 4, no. 5, Oct. 2014.
[2] Simon L. Cotton and William G. Scanlon, "Millimeter-wave soldier-to-soldier communications for covert battlefield operation," IEEE Communications Magazine, October 2009.
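The monitoring flow described in the conclusion above (sensor readings collected locally, then pushed over Wi-Fi to a server for graphing) can be sketched as follows. The field names and the endpoint URL are illustrative assumptions, not from the paper:

```python
import json
import urllib.request

def build_payload(temperature, humidity, gas_level, light):
    """Package one round of polyhouse sensor readings for upload."""
    return {"temperature": temperature, "humidity": humidity,
            "gas_level": gas_level, "light": light}

def upload(payload, url="http://example.com/api/readings"):  # hypothetical endpoint
    """POST the readings as JSON to the monitoring server (requires network)."""
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers={"Content-Type": "application/json"})
    return urllib.request.urlopen(req)

payload = build_payload(temperature=28.5, humidity=61.0, gas_level=0.2, light=540)
body = json.dumps(payload)  # what would be sent over the wire
```

On the server side, each such JSON record can be timestamped and stored, and the stored series plotted as the graphs mentioned in the conclusion.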
and ITU (2010) asserts that adoption of e-Learning requires not only development of plans, connecting schools with infrastructure and capacity building, but also measuring the degree of availability and accessibility of those resources. This calls for an assessment of preparedness to provide key quantifiable indicators of a country's situation (McConnell International, 2001; ITU, 2010). Some of the instruments developed to measure e-Learning readiness, as presented by UNESCO-UIS (2009), are indicated in Table 1.
Table 1: Readiness indicators for adoption of e-learning

Concept: Description
Infrastructure: Availability of ICT hardware (such as desktop computers, laptops, interactive white boards) and availability of ICT software.
Vision: The vision of an institution regarding e-learning in relation to pedagogy transformation and lifelong learning.
Staff development plan: Motivating instructors/teachers to acquire ICT skills for pedagogical practices; training of instructors for the acquisition of skills for ICT utilization in teaching and learning.
ICT support: ICT support; vision, time and financial allocation in the institutional strategic plans; pedagogical support for instructors; technical support for both educators and students.
Jones (2004) argues that for successful adoption of new technologies, the process of adoption should focus on training of teachers, instituting educational reform activities, training of technology support staff, training of students, implementing technological resources and preparing digital content. Furthermore, the shift to an e-learning strategy requires creation of a clear vision and mission for the institution, to align digital content with the mandated curriculum while considering the diversity of learners' needs. Wagner et al. (2005) recommend training for pre-service and in-service teachers as a crucial input component, pointing out that the level of e-learning adoption is determined by the percentage of trained teachers, the quality of ICT training and the technical support. Tinio (2002) asserts that for learners to participate fully in e-learning activities, they should be equipped with three foundational skills. Since technology becomes obsolete fast, there is a need to plan for technological sustainability in schools (Anderson, 2010; Ministry of Education [MoE], 2009).
A survey carried out by Tinio (2002) on ICT utilization in public high schools in the Philippines recommended a comprehensive assessment of the ICT environment to establish institutional infrastructure and a competency skill inventory as prerequisites for the adoption of e-learning. In some countries, such as the United States, Canada, Singapore, Sweden, Japan, Finland, Britain, Norway and Australia, heavy investment has been directed to technology in education. In Singapore, for instance, teachers are required to complete over 10 core modules within 30 to 50 hours of training to enable use of e-learning in the teaching process (Farrell et al., 2007).
In Chile, internet-connected computers serve over 90% of the school population, and 80% of the teachers have been trained and have acquired pedagogical skills for the adoption of e-learning (Garrison, 2011). Teachers at all levels in Chile received two years of face-to-face training amounting to 100 hours. Consequently, teachers regularly make use of computers for professional, managerial and out-of-classroom tasks, searching for educational content on the
of ICT as well as Open and Distance Education (ODE) at all levels of education and training (RoK, 2005), and the plan is to make education the platform to equip Indian citizens with ICT skills to create dynamic and sustainable economic growth through enhanced learning. The mission of ICT in education is "to integrate ICT in education and training to prepare learners and staff of today for the Indian economy of tomorrow and enhance the nation's ICT skills" (RoK, 2006, p. 25), with a vision to adopt ICT as a universal tool for education and training (MoE, 2006). To achieve the vision, "…every educational institution, teachers, learners and the respective community will be equipped with appropriate ICT infrastructure, competencies and policies for usage and progress" (MoE, 2006, p. 14; RoK, 2005). This is further reflected in India's Master Plan of 2014, which lays out strategies for mainstreaming e-Learning, targeting 100% use of e-Learning as an alternative curriculum delivery strategy in teacher training institutions by 2017 (RoK, 2014).
From earlier research, however (Kiilu, Nyerere & Ogeta, 2016; Kiilu & Muema, 2012; Republic of India, 2012), the use of ICT and e-Learning in teaching in public institutions in India is still patchy. A desktop review carried out by Kiilu and Muema (2012) on the implications of e-readiness for the adoption of an e-learning approach in secondary schools in India established that although the country advocates the use of education as a platform for 21st-century skills development, less than 10% of secondary schools in India offered computer studies as a specialty subject at the time. It has been established that most higher education institutions in Africa have not yet assessed their level of preparedness, as the leadership is yet to be convinced of the role of ICT in education (Kashorda and Waema, 2009). The dearth of assessment of the level of preparedness results in duplication of efforts and inefficient use of scarce resources (RoK, 2014).

2. MATERIALS AND METHODS
The study adopted a descriptive survey design using both quantitative and qualitative techniques. Survey design was preferred as it enables researchers to describe, explain and explore phenomena to establish the status quo (Saunders et al., 2007). The study sampled five (5) PTTCs out of the 22 colleges. Simple random sampling was used to obtain 287 respondents from the five colleges. The data were analyzed using descriptive statistics such as frequencies, mean and standard deviation, aided by the Statistical Package for the Social Sciences (SPSS version 20) software programme.

3. RESULTS AND DISCUSSIONS
The study sought to establish the institutional and teacher-trainee level of preparedness for the adoption of e-learning in teacher training colleges using the UNESCO Institute for Statistics [UIS] 2009 institutional e-readiness indicators, which include availability of and accessibility to infrastructure, internet connectivity and competency (UNESCO-UIS, 2009). From the study, the pre-service teacher trainees' responses regarding infrastructural facilities are presented in Table 3.
Table 3: Availability of Resources for e-learning.
ICT infrastructure Mean Standard Dev. N
Internet connectivity 3.5842 1.29280 287
Desktop computers 3.8750 1.10036 287
Interactive white boards 3.801 1.48719 287
LCD projectors 3.6915 1.21110 287
Database repositories 2.768 0.34625 287
College website and password 2.6795 1.10130 287
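The figures in Table 3 are sample means and standard deviations of Likert-scale responses, as produced by SPSS. The same statistics can be reproduced in a few lines; the response list below is illustrative, not the study's raw data:

```python
from statistics import mean, stdev

# hypothetical 1-5 Likert responses for one ICT-infrastructure item (n = 10)
responses = [4, 3, 5, 4, 2, 4, 3, 5, 4, 3]

m = mean(responses)   # sample mean, as reported in the "Mean" column
s = stdev(responses)  # sample standard deviation (n - 1 in the denominator)
```

Note that `stdev` uses the n − 1 (sample) denominator, which is what SPSS reports by default for survey data like this.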
linkup on the Internet. Another example is a student in a remote location doing an entire course of study offered by a university via the Internet (i.e. distance education).
In the context of rural areas, e-Learning presents both opportunities and challenges. For example, rural areas are often geographically isolated from developed towns and cities where there are better opportunities for education and employment. E-Learning, if implemented in the right way in rural areas, has the potential to overcome these geographical barriers. From this point of view, e-Learning is possibly more beneficial for rural areas than any other area (e.g. towns and cities) because it helps people to overcome resource limitations (e.g. lack of libraries and books) which other areas do not necessarily encounter. However, the challenges of implementing e-Learning in rural areas are usually far more extreme than those faced in developed areas. For example, rural areas usually have poorer infrastructure (e.g. poor electricity supply and roads), fewer finances, lower levels of general literacy, lower accessibility/higher cost of Internet access, and limited understanding or appreciation of the potential of e-Learning.

2. TECHNOLOGY
In this paper, technology refers to both the hardware and software that provide the basic infrastructure for e-Learning. This includes components for networking (e.g. access points and links to the Internet) as well as client computers and software for basic services (e.g. e-mail, file sharing, web pages etc.). Technology also refers to servers that could be used for centralized data/program storage. It does not include specific e-Learning software intended purely for the purposes of pedagogy, which is covered under 'applications'. However, the underlying technology is intended to have the capabilities to support e-Learning applications.
In the context of rural areas, the following factors are important:
• Technology (both hardware and software) must be cheap but robust enough for rural conditions. In essence, it must have an excellent cost/benefit ratio.
• Open-source software is most suitable, as it is free for use under the GNU public license.
• Given the harsh conditions (e.g. dusty environment) in rural areas, it is necessary to develop a programme/policy for the type of equipment used, how best to protect equipment, and how to monitor breakdowns and associated costs, with a desire to continuously improve the utilization/lifespan of equipment.
• Given limitations in cost, it is impossible to ensure a 1:1 student-to-computer ratio. Indeed, this is not even done in well-funded public schools in developed countries. Instead, given the requirement to minimize costs, it is best to maximize technology utilization to ensure a good cost/benefit ratio, e.g. by having a computer lab.
• Bandwidth in rural areas is often very expensive. OSS is chosen not primarily to reduce costs, but to increase the flexibility to modify, test and develop appropriate materials. The flexibility also makes it possible to adjust to small bandwidth.

Network-side – Server
The server is a freely available Linux-based server intended to meet the ICT requirements of an e-learning application. It can be used to drive networks that have in excess of 100 client computers.
• Documents: Staff can work with their own documents and share them among each other in workgroups. Staff can simply copy relevant files to student folders. There are hourly backups of all documents on the server, and it is easy to restore lost documents.
• Web: Internet access is provided on every computer with 'safe' access. The school has its own website, which is easy
entire depth of the input volume. Such an architecture ensures that the learnt filters produce the strongest response to a spatially local input pattern. Three hyperparameters control the size of the output volume of the convolutional layer: the depth, the stride and the zero-padding.
1. The depth of the output volume controls the number of neurons in the layer that connect to the same region of the input volume. All of these neurons will learn to activate for different features in the input. For example, if the first convolutional layer takes the raw image as input, then different neurons along the depth dimension may activate in the presence of various oriented edges, or blobs of color.
2. The stride controls how depth columns around the spatial dimensions (width and height) are allocated. When the stride is 1, a new depth column of neurons is allocated to spatial positions only 1 spatial unit apart. This leads to heavily overlapping receptive fields between the columns, and also to large output volumes. Conversely, if higher strides are used then the receptive fields will overlap less and the resulting output volume will have smaller spatial dimensions.
3. Zero-padding pads the input volume with zeros around its border. It provides control over the spatial size of the output volume; in particular, it is sometimes used to exactly preserve the spatial size of the input volume.
A parameter sharing scheme is used in convolutional layers to control the number of free parameters. It relies on one reasonable assumption: that if one patch feature is useful to compute at some spatial position, then it should also be useful to compute at a different position. In other words, denoting a single 2-dimensional slice of depth as a depth slice, we constrain the neurons in each depth slice to use the same weights and bias. Since all neurons in a single depth slice share the same parameterization, the forward pass in each depth slice of the CONV layer can be computed as a convolution of the neurons' weights with the input volume (hence the name: convolutional layer). Therefore, it is common to refer to the sets of weights as a filter (or a kernel), which is convolved with the input. The result of this convolution is an activation map, and the set of activation maps for each different filter are stacked together along the depth dimension to produce the output volume. Parameter sharing contributes to the translation invariance of the CNN architecture.
It is important to notice that sometimes the parameter sharing assumption may not make sense. This is especially the case when the input images to a CNN have some specific centred structure, in which we expect completely different features to be learned at different spatial locations. One practical example is when the input is faces that have been centred in the image: we might expect different eye-specific or hair-specific features to be learned in different parts of the image. In that case it is common to relax the parameter sharing scheme, and instead simply call the layer a locally connected layer.
Another important layer of CNNs is the pooling layer, which is a form of nonlinear down-sampling. In the context of artificial neural networks, the rectifier is an activation function defined as f(x) = max(0, x), where x is the input to a neuron. This is also known as a ramp function and is analogous to half-wave rectification in electrical engineering. This activation function was first introduced to a dynamical network by Hahnloser et al. in a 2000 paper in Nature, with strong biological motivations and mathematical justifications. It has been used in convolutional networks more effectively than the widely used logistic sigmoid (which is inspired by probability theory; see logistic regression) and its more
is selected. This approach guarantees that the end user will receive advice for the problem faced. A basic suggestion model is implemented.

Suggestion Model based on k-NN Classification
The idea is to assign a rating or score to each candidate POI based on the ratings of its k semantically nearest POIs (neighbors) in the user profile. Then all candidate POIs are ranked in decreasing order of their assigned scores. The model is implemented in three main steps:
1. Indexing the rated POIs. In order to be able to find the k semantically nearest (rated) neighbor POIs of a candidate (unrated) POI, we create an index of the POIs that are part of the user profiles and have been evaluated and rated by the users. For each rated POI we index its title, description, place types and the text of its website. The place types of the rated POIs are not provided by the track, but we retrieve them from the three place search engines that we use in context processing (as described in Section 2). For indexing, we use Indri v5.5 (http://www.lemurproject.org/) with the default settings of this version, except that we enable the Krovetz stemmer [5].
2. Generating queries from candidate POIs. We generate a query per candidate POI in a context. The query consists of the POI title, place types and the description of the POI that we retrieved in the context processing. From the query, we remove all punctuation and special characters.
3. Scoring candidate POIs based on their k-NNs. We submit the queries (per context) that are generated in Step 2 to the index that is created in Step 1 in order to rank the rated POIs in increasing semantical distance. In a standard k-NN [1, 6], a candidate POI (represented by its corresponding generated query) would be assigned the majority rating of the top-k retrieved POIs. In initial experiments, however, we found that taking into account the ranks or retrieval scores of the top-k results is beneficial. We experimented with several formulas using cross-validation, such as linear (e.g. Borda Count) or exponential weights decreasing with the rank, and we settled for the following best-performing formula for scoring each candidate POI P:

    P = ( Σ_{i=1}^{k} s_i · R_i ) / ( Σ_{i=1}^{k} s_i ),  where R_i = (R_i^D + R_i^W) / 2    (1)

where s_i is the Indri tf-idf score of the i-th ranked POI. This formula assigns to a candidate POI a score equal to the weighted average of the ratings of the k-nearest-neighbor POIs in a user profile, where the weights are given by tf-idf similarity. As the POI's rating R_i we use the average rating of the description (R_i^D) and the website (R_i^W), because in Step 1 we index both the description and the text of the website. The value of k that we use in our suggestions was optimized to k = 23 by using 5-fold cross-validation [8] on the example places. The scored candidate places are then ranked in decreasing order of their scores.

4. EXPECTED RESULTS
The proposed system is expected to serve as a replacement for the uneducated use of pesticides and chemical compounds. A controlled, better understanding of the crop in the absence of expert help is expected to come from the proposed system. The system will take a high-quality, high-resolution image of the affected area of a crop of the user's selection. The captured image will be processed for feature extraction. A neural network and machine learning model will further help draw conclusions based on the input image. The conclusions thus drawn will help in the suggestion of a solution for the posed problem.

5. CONCLUSIONS & FUTURE WORK
The result of this analysis will help easy access to expertise. The system will improve with the influx of new data. A neural network at the tips of a farmer's fingers will enhance the quality of crop production. We propose that this new system, with the help of expert domain
knowledge, can be helpful in reducing the usage of pesticides and insecticides. Organic farming can be promoted. The model used can be further scaled to other crops and plants, as it is highly scalable. By increasing the number of features and the number of inputs to the neural network, the algorithms can be enhanced. If this technique is developed into a sophisticated interface in the form of a website or Android application, it may prove to be a great asset to the agricultural sector. In the future this methodology can be integrated with other, yet to be developed, methods for disease identification and classification. The use of other algorithms can be explored to enhance the efficiency of the system. This application will serve as an aid to farmers (regardless of their level of experience), enabling fast and efficient recognition of plant diseases and facilitating the decision-making process when it comes to the use of chemical pesticides. Furthermore, future work will involve spreading the usage of the model by training it for plant disease recognition on wider land areas, combining aerial photos of orchards and vineyards captured by drones with convolutional neural networks for object detection. By extending this research, the authors hope to achieve a valuable impact on sustainable development, affecting crop quality for future generations. The main goal of the future work will be developing a complete system consisting of server-side components containing a trained model and an application for smart mobile devices, with features such as displaying recognized diseases in fruits, vegetables and other plants, based on leaf images captured by the mobile phone camera.

6. ACKNOWLEDGMENTS
The authors would like to thank our professor and guide, Prof. J. N. Nandimath, for her constant support and motivation. Her encouragement and belief in our work has got us this far. We would also like to thank our college for providing us with the education we deserve. Needless to say, without them, this wouldn't have been possible. Our constant well-wishers, our family and friends, always had our back and contributed through healthy discussions.

REFERENCES
[1] Sanjay Mirchandani, Mihir Pendse, Prathamesh Rane, Ashwini Vedula, "Plant Disease Detection and Classification using Image Processing and Artificial Neural Networks," International Research Journal of Engineering and Technology (IRJET), e-ISSN: 2395-0056.
[2] Anandhakrishnan MG, Joel Hanson, Annette Joy, Jerin Francis, "Plant Leaf Disease Detection using Deep Learning and Convolutional Neural Network," International Journal of Engineering Science and Computing, March 2017.
[3] Garima Tripathi, Jagruti Save, "An Image Processing and Neural Network Based Approach for Detection and Classification of Plant Leaf Diseases," International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367.
[4] George Drosatos, Giorgos Stamatelatos, Avi Arampatzis and Pavlos S. Efraimidis, "DUTH at TREC 2013 Contextual Suggestion Track," The Twenty-Second Text Retrieval Conference (TREC 2013), NIST, Gaithersburg, Maryland, Special Publication 500-302.
chain of high suicide rate of farmers in India.

2. LITERATURE SURVEY
The research by Shanning Bao et al. [1] observes that rural administration and crop cultivars have clearly improved, but that the resulting trend in crop yield variation remains unclear. To assess the yield trends of the staple crops (maize, soybean and rice) from 2007 to 2016, the MODIS product (MCD12Q2) was used to extract the maturity date of the various crops. A two-band variant of the enhanced vegetation index at the maturity date was applied to build an accurate yield estimation model, coupled with statistical crop yield data. The average maize and soybean yields in the study area showed an increasing trend, while rice yield declined. However, maize yield in 22 cities and soybean yield in 19 cities actually showed a decreasing trend. Through statistical analysis, the crop yield distribution pattern proved to be nearly fixed, with most cities occupying approximately the same position in the ranking of major crop yields. It was shown that a few cities, for instance Chifeng city, were suitable for developing a specific agricultural economy. This paper can be used to give suggestions for agricultural planning and management.

The research by Shruti Kulkarni et al. [2] states that farming is the foundation of the Indian economy. The yield obtained depends principally upon climate conditions, as precipitation patterns largely influence growth. In this situation, farmers and agriculturalists require timely advice predicting future harvests to maximize crop yield. Because of insufficient adoption of technology, the throughput of farming is yet to reach its full potential. Each farmer is keen on knowing the yield he/she can expect in the harvest time frame, and consequently yield prediction is an essential concern for them. Over the years, farmers have formed an idea of the pattern in yield through intrinsic human instinct. However, precipitation, as a major driver of crop growth, can broadly unsettle intuitive yield prediction by controlling some of the soil and environmental parameters tied to crop development. Additionally, the correct type of soil to be used for a crop is known to the farmer only through on-paper advice, making it difficult for him/her to trial and test crop choices.

The research by Michael D. Johnson et al. [3] concerns crop yield forecast models for grain, canola and spring wheat grown on the Canadian Prairies, created using vegetation indices derived from satellite data and machine learning techniques. Hierarchical clustering was used to group the crop yield data from 40 Census Agricultural Regions (CARs) into a few larger regions for building the forecast models. The Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI) derived from the Moderate-resolution Imaging Spectroradiometer (MODIS), and NDVI derived from the Advanced Very High Resolution Radiometer (AVHRR), were considered as predictors for crop yields. Multiple linear regression (MLR) and two nonlinear machine learning models – Bayesian neural networks (BNN) and model-based recursive partitioning (MOB) – were used to estimate crop yields, with different combinations of MODIS-NDVI, MODIS-EVI and NOAA-NDVI as predictors.

The research by X.E. Pantazi et al. [4] notes that understanding yield-limiting factors requires high-resolution, multi-layer data about the factors influencing crop growth and yield. Consequently, on-line proximal soil sensing for estimation of soil properties is required, because of the capacity of these sensors to gather high
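The NDVI-based yield regression idea surveyed above can be illustrated with a minimal sketch: compute NDVI = (NIR − Red)/(NIR + Red) per region and fit a one-variable least-squares line of yield on NDVI. All numbers below are toy values, not data from the cited studies.

```python
# Illustrative sketch: NDVI from red/NIR reflectance and a one-variable
# least-squares fit of yield on NDVI. All numbers are invented.

def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one observation."""
    return (nir - red) / (nir + red)

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (single predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

# (NIR, Red) reflectance per region and observed yields in t/ha (toy values).
regions = [((0.60, 0.10), 3.1), ((0.55, 0.15), 2.6), ((0.70, 0.08), 3.8)]
xs = [ndvi(nir, red) for (nir, red), _ in regions]
ys = [y for _, y in regions]
a, b = fit_line(xs, ys)
pred = a * ndvi(0.65, 0.12) + b   # predicted yield for a new region
```

In the surveyed work this single predictor is replaced by combinations of MODIS-NDVI, MODIS-EVI and NOAA-NDVI, and the linear fit by MLR, BNN or MOB models.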
Valley of the Bheema River. The rivers Velu and Ghod are on the left side of the Bheema, and the Indrayani, Bhama, Mula-Mutha etc. are on the right side. Each tahsil of the district has at least one river; therefore, the agro-climatic condition of the district is favorable.

economic and agricultural welfare of the countries across the world. In future work we are going to focus on a more detailed study of Indian crops to uncover emerging trends in the digital agricultural field.
Figure 1. Existing System (Security-enhanced and trustworthy cloud service Broker (STCSB)
architecture)
The above figure explains the security-enhanced and trustworthy cloud service broker architecture. There are three main modules in the existing architecture:
1. Communication and agent management module
2. Trust computing module
3. Cloud resource management module
In the figure there are three types of agents: the monitoring agent, the trust agent, and the service agent. Every trust agent has direct access to the agent publish and data perceiving block. This block includes the security agent, the QoS agent, and the agent publish sub-block. The service agent performs cloud service connection and adaptation tasks. Monitoring agents are connected to both the trust agent and the service agent.

Users obtain services from the selected cloud broker, i.e. STCSB, which provides fast, trustworthy and secure services. This architecture provides a security-enhanced cloud service broker. The monitoring agent is used to enhance the user's experience. Two different technologies are introduced in this architecture: the first is cloud monitoring and the second is trust-based cloud service. These two technologies are integrated to enhance the security of cloud computing and the QoS of the service provider. Compared with other traditional collaborative cloud computing frameworks, the STCSB architecture includes several security-enhanced functional modules. These are: (1) identification of abnormal service behavior, (2) an untrusted resource list, (3) trust computing based on big data analysis, (4) trust-based security measures (access control, authorization, and resource matchmaking), and (5) a security agent.

The agent publish and data perceiving module handles the real-time service data. A verification mechanism is introduced between agents; it prevents a trust agent from being hacked or hijacked by a malicious user and eliminates data tampering issues.

Cloud resource management module: the federated service catalog stores all the available and trustworthy services and automatically chooses highly trustworthy services to meet the user's requirements. This module creates a service catalog that links to highly trusted resources and then provides this catalog as a trusted resource to the user through the unified cloud service portal [1].

Trust computing module: an administrator manages the virtual servers on the unified cloud management portal [1]. This portal creates the templates for virtual servers. Cloud users open the unified cloud service portal and select a trusted service catalog when they would like to use a provider's services [1].
static analysis to fail over to the cryptographic ones when the non-cryptographic ones would be more expensive, and (c) incorporates this core into a built system that includes a compiler for a high-level language, a distributed server, and GPU acceleration. Experimental results indicate that their system performs better and applies more widely than the best in the literature.

S. Pearson and A. Benameur, 2010 [6]: the authors point out that cloud computing is an emerging paradigm for large-scale infrastructures. It has the advantage of reducing cost by sharing computing and storage resources, combined with an on-demand provisioning mechanism relying on a pay-per-use business model. These new features have a direct impact on IT budgeting but also affect traditional security, trust and privacy mechanisms. Many of these mechanisms are no longer adequate and need to be rethought to fit this new paradigm. In this paper the authors assess how security, trust and privacy issues arise in the context of cloud computing and discuss ways in which they may be addressed.

Tian Li-qin and Lin Chuang, 2016, in paper [7], mainly discuss the importance of evaluating user behavior trust and the evaluation strategy in cloud computing, including trust object analysis, principles for evaluating user behavior trust, the basic idea of evaluating user behavior trust, and evaluation strategies of behavior trust for each access and for long access, which lay the theoretical foundation of trust for practical cloud computing applications.

Xiaoyong Li, Huadong Ma, Feng Zhou, and Wenbin Yao, 2015: in this paper [8], the authors present T-broker, a trust-aware service brokering system for efficiently matching multiple cloud services to satisfy various user requests. Experimental results show that T-broker yields very good results in many typical cases, and the proposed mechanism is robust in dealing with varying numbers of service resources.

Haiying Shen and Guoxin Liu, 2014: in this paper [9], the authors present an integrated resource/reputation management platform, called Harmony, for collaborative cloud computing. Recognizing the interdependencies between resource management and reputation management, Harmony incorporates three innovative components to enhance their mutual interactions for efficient and trustworthy resource sharing among clouds.

Ismail Butun, Melike Erol-Kantarci, Burak Kantarci, and Houbing Song, 2016: in this paper [10], the ultimate goal is to design a cloud-centric public safety network that is not only resilient but also reliable. Such a network is a cyber-physical system that requires seamless integration of the cyber and physical elements (i.e., computing, control, sensing, and networking). Security and privacy have to be built in by design when developing a reliable public safety network.

P. Muralikrishna, S. Srinivasan, and N. Chandramowliswaran, 2015: key distribution is a very critical problem; in cryptography, secret sharing of a key was invented by Adi Shamir and George Blakley in 1979. Secret sharing is a very important concept for storing secret or very sensitive information. When users of a group wish to communicate using symmetric encryption, they must share a common key [12]. A secure secret sharing scheme distributes shares so that anyone with fewer than t shares has no more information about the secret than someone with 0 shares. Recently, in [13], the authors discussed a secure secret key sharing algorithm using a non-homogeneous equation.
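The threshold scheme referenced above can be sketched concretely. A minimal implementation of Shamir's (t, n) secret sharing over a prime field: any t shares reconstruct the secret by Lagrange interpolation, while fewer reveal nothing.

```python
# Minimal Shamir (t, n) secret sharing over GF(P). Illustrative sketch only,
# not the construction from [12] or [13].
import random

P = 2 ** 127 - 1  # a Mersenne prime, large enough for small integer secrets

def split(secret, t, n):
    """Create n shares; any t of them recover the secret."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def poly(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, poly(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0; modular inverse via Fermat's little theorem."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = split(123456789, t=3, n=5)
assert reconstruct(shares[:3]) == 123456789   # any 3 of the 5 shares suffice
```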
4. GAP ANALYSIS
Cloud computing is a buzzword not only in the IT industry but in other fields as well. The word "cloud" came from network diagrams, in which network engineers represented the location and interconnection of network devices with a cloud-like shape. The cloud is normally used for storage; nowadays it is possible to upload large amounts of data to the cloud, which is one of its advantages. But it is also important to provide security for the cloud as well as the data, and any number of algorithms are used to secure the cloud and its data.

The existing system provides the cloud service broker architecture. That system does not provide proper security for data, and there is also a problem of trust calculation; key distribution becomes an important problem. The proposed system should provide security to data: verification is done through the MD5 algorithm, while encryption and decryption are done through the AES algorithm. The key generation implemented with the AES algorithm provides the security to the data.

6. PROPOSED SYSTEM
A. Architecture
This is just like a client-server architecture. It includes:
a. Client layer
b. Manager layer
c. Auditor layer
d. Server

Figure 2. System architecture
A client makes a request for service to the server, and the server responds to that request. The manager layer and auditor layer are the mediators between client and server. Here the AES algorithm is used for encryption and decryption of files. AES is a symmetric-key encryption algorithm, so it shares a common key for both encryption and decryption.

A client is nothing but a PC or workstation. In the client layer, any number of users can make requests to the server to upload, download or access files.

The manager layer includes two main components: job scheduling and key distribution. This layer manages the incoming requests from multiple users. Key distribution is done using the AES algorithm, and job scheduling is done on a first-come-first-serve basis.

The next layer is the auditor layer, which contains the TPA and the trust factor. The TPA, i.e. third-party auditor, is one who can audit the information of the knowledge owner or consumer. Third-party auditing is an accepted method for establishing trust between a business and its data.

The trust factor is used for verification of the user's profile or account information. Whenever a user wants to download or upload a file, verification is done through the trust factor.
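The manager layer's first-come-first-serve scheduling described above can be sketched with a simple FIFO queue; the method names are illustrative, not from the paper.

```python
# Sketch of the manager layer's FCFS job scheduling: incoming upload/download
# requests are queued and dispatched strictly in arrival order.
from collections import deque

class ManagerLayer:
    def __init__(self):
        self.jobs = deque()          # FIFO queue: first come, first served

    def submit(self, user_id, action, filename):
        self.jobs.append((user_id, action, filename))

    def serve_next(self):
        """Dispatch the oldest pending request, or None if the queue is empty."""
        if not self.jobs:
            return None
        user_id, action, filename = self.jobs.popleft()
        return f"{action} {filename} for {user_id}"

mgr = ManagerLayer()
mgr.submit("u1", "upload", "a.txt")
mgr.submit("u2", "download", "b.txt")
first = mgr.serve_next()   # "upload a.txt for u1"
```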
The server hosts, delivers and manages most of the resources and services consumed by the clients. It is backed by a database, which stores the files, incoming requests and encrypted data on the server. The server can manage several clients simultaneously.

In this paper, multiple users can access, upload or download data, but before that every user must register. After registering, the user can log in and then upload, download or access data through their id and password. When a user registers, their personal information is stored in the database, after which a user id and password are generated; through these the user can use the cloud services. Only a registered user is able to log in to the cloud server and upload, download or access files. A log is generated every time; the log is managed by the log generator and stored at the server.

When a user wants to download a file, a secret key is required. The key is generated at the time of file upload; this key is required to upload and download the file, and every user has the key. The user id is used every time for security purposes. The user id is generated by the administrator, and the admin gives access to the user.

Here tamper analysis is also done. Data tampering is the act of deliberately modifying (destroying, manipulating or editing) data through an unauthorized channel. The system analyses the data by keeping a log of all activities that happen in the system. Verification is done at the time of download: when a user uploads a file, the file is converted into an unreadable format, and to make it readable the user must follow the decryption flow, using the key to download the file.

There are three modules:
1. User module
2. Admin module
3. Tamper analysis module

Methodology and Algorithms Used:
1. Encryption with signature algorithm
The signature concept is used to hide the identity of the signer; each file is encrypted so that the private and sensitive information of the user remains secure. Here the AES algorithm is used for encryption and decryption.
2. Data integrity verification algorithm
To overcome this issue, we provide a public auditing process for cloud storage so that users can check the integrity of their data. The work that has been done in this line lacks data dynamics and true public auditability. The MD5 algorithm is used for verification of the user: only a registered user can download or upload a file.
3. Hash key generation
It also uses a random masking operation and index hash values in order to support dynamic operations like insert, delete and update over the shared data for a dynamic group. Hash key generation works like key generation.

7. MATHEMATICAL MODEL
S = {I, P, R, O}
Where S = system, I = input, R = rules/constraints, O = output.
I = {I1}, where I1 = a file which contains text.
P = {P1, P2, P3, P4, P5, P6, P7, P8, P9, P10}
P1 = User Registration, P2 = Secret Key Generation, P3 = File Upload & Download, P4 = Encryption & Decryption of File, P5 = Tamper Analysis, P6 = Trust Factor, P7 = Audit Checking, P8 = Log Generation, P9 = Verification of User, P10 = OTP Generation.
R = {R1, R2}
R1 = The user must register first.
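The MD5-based integrity verification step described above (algorithm 2, process P9) can be sketched as: record a digest when a file is uploaded and recompute it on download, where a mismatch signals tampering. Note that MD5 is cryptographically broken; it appears here only because the paper specifies it.

```python
# Sketch of MD5-based integrity verification: digest recorded at upload,
# recomputed at download. Function and variable names are illustrative.
import hashlib

def md5_digest(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

stored = {}  # filename -> (data, digest recorded at upload time)

def upload(name, data: bytes):
    stored[name] = (data, md5_digest(data))

def verify_on_download(name) -> bool:
    data, digest = stored[name]
    return md5_digest(data) == digest

upload("report.txt", b"quarterly numbers")
ok = verify_on_download("report.txt")        # True: data unchanged
stored["report.txt"] = (b"tampered!", stored["report.txt"][1])
bad = verify_on_download("report.txt")       # False: tampering detected
```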
A. Architecture
platform justify the concern with security. For example, a botnet attack on Amazon's cloud infrastructure was reported in 2009.

3. LITERATURE SURVEY
In this section, we discuss the different papers referred to, covering cloud computing as well as how cloud logs can be secured and preserved. X. Liu et al. [1] proposed a new, efficient and privacy-preserving outsourced calculation framework with the help of multiple keys. The proposed design allows different service providers to outsource their data. To reduce the risk of private-key exposure, they use a cryptographic primitive, the Distributed Two Trapdoors Public-Key Cryptosystem (DT-PKC), which helps them split a strong private key into several different shares.

The authors of [2] drafted Secure Logging-as-a-Service (SecLaaS), which stores virtual machines' logs and permits legal access for forensic examiners while guaranteeing the privacy of the cloud customers. In addition, SecLaaS maintains proof of past logs and accordingly protects the confidentiality of the cloud logs from invalid investigators or CSPs. The authors demonstrated the feasibility of the work by implementing SecLaaS for network logs in an OpenStack cloud.

Zhihua Xia et al. proposed a scheme for image retrieval that helps the data owner outsource the image database. Locality-sensitive hashing is utilized to improve the search efficiency, and two stages were designed to improve it: in the first stage, unique images are filtered out by pre-filter tables, and in the second stage, the remaining images are compared one by one using the EMD metric for refined search results.

The authors of [4] highlight the state of the art in digital forensics of cloud computing. They pinpointed when the term was used as a keyword in the literature with the aid of the search engine SUMMON. The keyword "cloud forensics" was used, and the literature was categorized in three main dimensions: (1) survey, (2) technology and (3) forensics-procedural. The aim of the paper is not just to survey the related work along the discussed dimensions but to analyze those dimensions and identify research gaps with the help of a generated map.

In [5] Indrajit Ray et al. drafted a comprehensive scheme which addresses security and integrity issues not just during the log generation phase, but also during other stages in the log management process, including log collection, transmission, storage, and retrieval. Outsourcing log management to the cloud raises the challenge of log privacy: storage and retrieval of logs should not be traceable, so that anonymous protocols can be applied to logs in the cloud. The developed protocol has the potential for usage in various areas.

Ben Martini et al. [6] proposed an integrated conceptual digital forensic framework which gives particular importance to the preservation of forensic data and the collection of cloud computing data for forensics. It is an overarching framework for conducting digital forensic investigations in the cloud computing environment; they also stated that further research is needed to develop a library of digital forensic methodologies that would best suit the various cloud platforms and deployment models.

In another work, Alecsandru Patrascu et al. [7] drafted a novel solution which provides digital forensic investigators a reliable and secure method for monitoring the activities of users in a cloud infrastructure. They mainly focused on increasing the security, safety and reliability of the cloud. The authors also proposed a model which allows investigators to seamlessly analyze workloads and virtual machines while preserving the scalability of large-scale distributed systems.

A lightweight hypervisor is introduced in [8] to acquire and preserve data for reliable live forensics. Reliability is improved in three ways: the lightweight architecture, the data acquisition mechanism, and the evidence protection mechanism. Unused device drivers are eliminated to reduce the TCB size, thereby decreasing the vulnerability of the hypervisor.

In [9] the authors highlight various issues and challenges involved in the investigation of data in cloud logs, and survey the state of the art of Cloud Log Forensics (CLF). A case study related to CLF was highlighted for the analysis of malicious behavior in cloud log investigation. To tolerate the susceptibilities of cloud logs, the CLF security requirements, vulnerability points and challenges were identified. The paper identifies and introduces challenges and future directions to highlight open research areas of CLF, motivating investigators, academicians, and researchers to investigate them.

The authors of [10] discuss a proposed scheme to protect the privacy of the data and of the query user from the cloud, and even to resist attackers learning the data that the data owner shares with the query user in addition to the encrypted data. To achieve secure outsourced storage and k-NN queries, they improved a dot-product protocol and merged it with the k-NN query system.
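The k-NN query core that [10] wraps in a privacy-preserving dot-product protocol can be sketched in plaintext; the secure dot-product step itself is omitted here, so this is only the underlying similarity search, not the cited scheme.

```python
# Plaintext core of a k-NN query using dot-product similarity. The secure
# (encrypted) dot-product protocol of [10] is not reproduced here.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def knn_query(database, query, k):
    """Return the k database vectors most similar to the query vector."""
    ranked = sorted(database, key=lambda item: dot(item, query), reverse=True)
    return ranked[:k]

db = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]
top2 = knn_query(db, (1.0, 0.0), k=2)   # the two vectors closest to the query
```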
4. PROPOSED METHODOLOGY
A dishonest cloud user can attack a system outside the cloud. They can also attack any application deployed in the same cloud, or an attack can be launched against a node controller which controls all the cloud activities. For a virtual machine (VM), the CLASS scheme (Fig. 3.1) takes the log from the node controller (NC), hides its content, and stores it in a database. This storage allows logs to remain available for further investigation despite VM shutdown. Moreover, CLASS publishes its proof so that log integrity is protected and admissibility is ensured. The essential terms of our proposed system are defined first; then the attacker's capabilities, possible attacks on logs, and the security properties of a secure cloud log service are provided.

Fig. 3.1 Proposed Scheme (architecture showing cloud users, the Cloud Service Provider (CSP) holding encrypted logs and publishing proof of past logs over the Internet, and the investigator)
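The published-proof idea above (an auditor detecting later tampering with stored logs) can be sketched with a simple hash chain; this is an illustrative stand-in, not the exact CLASS or Proof of Past Log construction.

```python
# Sketch of a hash-chained log proof: each entry extends the chain, and
# publishing the chain head lets an auditor detect modification of any
# earlier entry. Log entries below are invented.
import hashlib

def extend(proof: str, entry: str) -> str:
    return hashlib.sha256((proof + entry).encode()).hexdigest()

def publish_proof(entries):
    proof = ""
    for e in entries:
        proof = extend(proof, e)
    return proof

logs = ["vm1: login u1", "vm1: read /etc/passwd", "vm1: logout u1"]
ppl = publish_proof(logs)

# Auditor recomputes the chain from the allegedly unmodified logs:
assert publish_proof(logs) == ppl
# Tampering with any earlier entry changes the published proof:
assert publish_proof(["vm1: login u2"] + logs[1:]) != ppl
```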
Fig 4.1: Use Case Diagram
The above figure shows that the user can get access to files on the cloud by providing his public key, and can access log files and other files from the cloud. The investigator sends a request to the cloud service provider to check user activities; after approval from the cloud service provider, the investigator receives the file with encrypted logs, and can obtain the decrypted file once the key he provides matches.

Sequence diagram
The figure below represents the sequence diagram, which describes an interaction arranged in sequence. It depicts the objects involved in the scenario and the sequence of messages exchanged between them to carry out the functionality.
Fig 4.2: Sequence Diagram

5. CONCLUSION
To execute a successful forensic investigation in clouds, the proposed system uses CSPs to collect logs from different sources. The system uses secure logs for the cloud, a solution to store and provide logs securely for forensic purposes. It also provides privacy for cloud users by encrypting cloud logs with the public key of the respective user, while facilitating log retrieval in the event of an investigation. This scheme allows CSPs to store logs while preserving the confidentiality of cloud users. Additionally, an auditor can check the integrity of the logs using the Proof of Past Log (PPL). These cloud logs can then be securely used for cyber forensics.

REFERENCES
[1] X. Liu, R. H. Deng, K.-K. R. Choo, and J. Weng, "An efficient privacy-preserving outsourced calculation toolkit with multiple keys," IEEE Transactions on Information Forensics and Security, vol. 11, pp. 2401-2414, 2016.
[2] Shams Zawoad, Amit Kumar Dutta, and Ragib Hasan, "Towards Building Forensics Enabled Cloud Through Secure Logging-as-a-Service," IEEE Transactions on Dependable and Secure Computing, 2015.
[3] Zhihua Xia, Xingming Sun, Zhan Qin, and Kui Ren, "Towards Privacy-preserving Content-based Image Retrieval in Cloud Computing," IEEE Transactions on Cloud Computing, September 2015.
[4] Sameera Almulla, Youssef Iraqi, and Andrew Jones, "A State-of-The-Art Review of Cloud Forensics," ResearchGate, December 2014.
[5] Indrajit Ray, Kirill Belyaev, Mikhail Strizhov, Dieudonne Mulamba, and Mariappan Rajaram, "Secure Logging As a Service—Delegating Log Management to the Cloud," IEEE Systems Journal, 2013.
[6] Ben Martini and Kim-Kwang Raymond Choo, "An integrated conceptual digital forensic framework for cloud computing," Digital Investigation, vol. 9, pp. 71-80, 2012.
[7] Alecsandru Patrascu and Victor-Valeriu Patriciu, "Logging System for Cloud Computing Forensic Environments," Journal of Control Engineering and Applied Informatics, vol. 16, pp. 80-88, 2014.
[8] Zhengwei Qi, Chengcheng Xiang, Ruhui Ma, Jian Li, Haibing Guan, and David S. L. Wei,
Amazon had revenues close to $178 billion in 2017. Flipkart has 10 million active Internet users in India, while Amazon has 310 million active customers worldwide. The number of online shoppers in India crossed the 100 million mark by the end of 2016. IRCTC alone issues 13 lakh tickets a day through its online train seat reservation portal. This means that today, more than ever before, hundreds of millions of Indians are directly affected by poor privacy policies and data misuse. In fact, all Android devices (more than 2 billion) have Google Search installed by default, with no option to uninstall it; the best you can do is reject its privacy policy and not use it.

These large companies (Google, Microsoft, Amazon, Facebook, Apple etc.) have the largest collection of human-generated data ever in history, while also leaving users with only the option to take all or reject all when it comes to sharing this information. As the Cambridge Analytica scandal that engulfed Facebook in 2018 showed, even these large companies are sometimes unaware of the way in which data related to their users is being used by third parties.

In fact, the German Supreme Court directed Facebook in February 2019 to curb data collection, in response to how Facebook integrates user data from WhatsApp, Instagram and Facebook for mining and analysis. Facebook was also found in violation of the General Data Protection Regulation (GDPR) by tracking non-users through like/share buttons.

But Facebook is no outlier. Most Internet giants' entire business was created on user data: search data for Google ads, Windows usage and crash analytics for Microsoft, or buyer shopping and search data that powers Amazon's recommendation engine. Google recently found itself in hot water when the US Senate called into question data collection by its research apps and issued a show-cause notice to Google Inc.

Recently, Big Data too changed the entire privacy policy framework and data use. Big data refers to data sets that are too large or complex for traditional data-processing application software to adequately deal with. Data with many cases (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy and data sources. Big data was originally associated with three key concepts: volume, variety, and velocity. Other concepts later attributed to big data are veracity (i.e., how much noise is in the data) and value. Big Data processing frameworks like Hadoop empowered corporations to use (or misuse) data at unprecedented scales. This further eased the road to data misuse and blatant sharing. Even the sharing of anonymous data and results without user permission leads to a perception that the user is no more than a pawn in the business of information retrieval and analysis.

Big Data is characterized by the 4 V's – Volume, Velocity, Veracity, and Variety. The most troubling of these aspects is the speed (velocity) of data collection and processing. Real-time or near real-time processing of data means that the user doesn't even get the opportunity to refuse the use of his data for such purposes.
This leads to privacy issues and risks like:
Right to be let alone
Limited access
Control over information
States of privacy
Secrecy
Personhood and autonomy
Self-identity and personal growth
Intimacy

2. RELATED WORKS:
and the application demands of cross-cloud and cross-grade data strategies of all types are satisfied.

3. ISSUES AND CHALLENGES
1) User Awareness
Users in developing countries like India are still unaware of the risks that their data can pose, and are mostly oblivious to the need for data privacy and security.
2) Privacy Policies are Difficult to Understand
Privacy policies are usually filled with technical and legal jargon, making it difficult for the average user to completely understand and comprehend them. This leads to users blindly accepting a policy without being aware of the data control they just surrendered to a company.
3) Privacy Protection
The user may not understand the data he/she is surrendering and the implications of their actions as clearly as a computer professional or data scientist. So the onus of privacy protection, in spirit, should lie with the data collectors/aggregators.
4) Notification
A more granular and streamlined method of data control can be the use of notifications (desktop or mobile) to alert the user about data-sharing agreements with third parties or data breaches. These notifications can also be in the form of an e-mail.
5) Security
Even if the user consents to providing data to a specific company, the risk of unauthorized access remains if someone hacks the company's data. These hacks and security breaches are generally outside the control of the company responsible for data collection and control.

4. PROPOSED METHODOLOGY
We propose a Machine Learning based solution to categorize privacy policies. Our Naïve Bayes based algorithm uses the ratings of 50 different policies in 8 different categories (collect, choice, cookies, access, purpose, security, share, and retention [2]) to learn a classification for all future unseen policies.

The main aim of this tool is to help the user understand the privacy policy in a better way. For this, the tool focuses on two result factors: one is the score, and the other is the presence of details about the different privacy classes. The definitions of all the categories are also displayed to help the user choose the classes which are important for him/her.

The components of our proposed architecture for the Trust Score generator tool are a browser extension, a word pre-processor, a classifier, a corpus, a database, and a score generator.

The user first opens the policy webpage of the service provider whose privacy policy he/she wishes to understand. On clicking the extension, it fetches the source code of the privacy policy webpage. This code is then separated from the HTML tags to generate the privacy policy text. The policy text is then cleaned with the help of different pre-processing techniques, which reduces the overhead on the algorithm. This policy is given to the classifier. The Naïve Bayes classifier then classifies the policy using words as features. The algorithm labels the user's given policy as one of the policies in the corpus. [1]
notifications can also be in the form of an
e-mail. 5. CONCLUSION
5) Security The Trust Score serves as a medium of
Even if the user consents to providing data creating an understanding between the user
to a specific company, risks of and the service provider. It tries to put the
unauthorized access remains if someone user in control when the decisions
hacks the former‘s data. These hacks and regarding his/her privacy are concerned.
security breaches are generally outside the Our tool works dynamically on most
control of the company responsible for websites, but the structure of each website
data collection and control. is different. This makes it difficult to
scrape the policy text from this source
4. PROPOSED METHODOLOGY code. All in all, Trust Score generator can
We propose a Machine Learning based serve as a great foundation for judging the
solution to categorize privacy policies. Our privacy policy in a short time and take safe
Naïve-Bayes based algorithm will use the and unforced decisions about their online
ratings of 50 different policies in 8 privacy.
different categories (collect, choice,
cookies, access, purpose, security, share, 6. ACKNOWLEDGEMENTS
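The pipeline described in Section 4 (fetch page source, strip HTML tags, pre-process the text, classify with Naïve Bayes over word features) can be sketched in a few lines of Python. This is a minimal illustration only: the helper names, the toy two-policy training corpus, and the Laplace-smoothed bag-of-words classifier below are our assumptions, not the authors' implementation.

```python
import math
import re
from collections import Counter, defaultdict

def strip_html(source):
    """Crudely remove HTML tags from fetched page source."""
    return re.sub(r"<[^>]+>", " ", source)

def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop very short words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 2]

class NaiveBayesPolicyClassifier:
    """Multinomial Naive Bayes over word counts, one class per privacy category."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # class -> word -> count
        self.class_totals = Counter()            # class -> total word count
        self.class_docs = Counter()              # class -> number of policies
        self.vocab = set()

    def fit(self, labelled_policies):
        for text, label in labelled_policies:
            words = preprocess(text)
            self.word_counts[label].update(words)
            self.class_totals[label] += len(words)
            self.class_docs[label] += 1
            self.vocab.update(words)

    def predict(self, text):
        words = preprocess(text)
        n_docs = sum(self.class_docs.values())
        best, best_lp = None, float("-inf")
        for label in self.class_docs:
            lp = math.log(self.class_docs[label] / n_docs)  # class prior
            denom = self.class_totals[label] + len(self.vocab)
            for w in words:  # Laplace-smoothed word likelihoods
                lp += math.log((self.word_counts[label][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best
```

In use, the extension would pass the fetched page source through `strip_html` and `preprocess`, then `predict` would return the closest-matching category or corpus policy.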
REFERENCES
[1] Assessment of Privacy Policies using Machine
Learning by Ritav Doshi, Aditya Ahale,
Gaurav Gharti, Prakhar Pathrikar, Dr. P.S.
Dhotre
[2] Organization for Economic Co-operation and
Development (http://oecdprivacy.org/)
[3] Privee: An architecture for automatically
analyzing web privacy policies, USENIX
Security, 2014 by S. Zimmeck and S.M.
Bellovin
[4] A Machine Learning Solution to Assess
Privacy Policy Completeness by Elisa,
Yuanhao, Milan et al.
[5] The Creation and Analysis of a Website
Privacy Policy Corpus by Shomir, Florian,
Aswarth et al.
[6] https://www.freeprivacypolicy.com/ : an online English privacy policy generator by FreePrivacyPolicy.
[7] OneTrust
(https://www.onetrust.com/products/assessmen
t-automation/)
[8] US patent - US20160164915A1 by Michael
Cook
[9] US patent - US20120072991A1 by Rohyt
Belani and Aaron Higbee
[10] Chinese patent - CN107465681A by Liu Ying
by utilizing the tags as query. However, weakly relevant tags, noisy tags and duplicated information make the search results unsatisfactory. Most of the literature focuses on tag processing, image relevance ranking and diversity enhancement of the retrieval results. The following subsections present the existing works related to these three aspects respectively.
A. Tag Processing Strategy
It has long been acknowledged that tag ranking and refinement play a significant role in the re-ranking of tag-based image retrieval (TBIR), for they lay a firm foundation for the development of re-ranking in TBIR. For example, Liu et al. [1] proposed a tag ranking method to rank the tags of a given image, in which probability density estimation is used to obtain the initial relevance scores and a random walk over a tag similarity graph is proposed to refine these scores. Similar to [1], [26] sorts the tag list by tag relevance scores that are learned by counting votes from visually similar neighbors. Applications in tag-based image retrieval have also been conducted. Based on these initial efforts, Lee and Neve [66] proposed to learn the relevance of tag and image by visually weighted neighbor voting, a variant of the popular baseline neighbor voting algorithm. Agrawal and Chaudhary [17] proposed a relevance tag ranking algorithm, which can automatically rank tags according to their relevance to the image content. A modified probabilistic relevance estimation method is proposed by taking the size of the object into account, and random walk based refinement is further utilized to improve the final retrieval results. Li [24] presented a tag fusion method for tag relevance estimation to overcome the limitations of a single measurement of tag relevance; in addition, early and late fusion schemes for a neighbor voting based tag relevance estimator are conducted. Zhu et al. [34] proposed an adaptive teleportation random walk model on a voting graph, constructed from the relationships between images, to estimate tag relevance. Moreover, many research efforts on tag refinement have emerged. Wu et al. [19] raised a tag completion algorithm to complete the missing tags and correct the erroneous tags of a given image. Qian et al. proposed a retagging approach to cover a wide range of semantics, in which both the relevance of a tag to the image and its semantic compensation to the already determined tags are fused to determine the final tag list of the given image. Gu et al. [45] proposed an image tagging approach based on latent community classification and multi-kernel learning. Yang et al. proposed a tag refinement module which leverages the abundant user-generated images and their associated tags as "social assistance" to learn classifiers that directly refine the noisy tags of web images. Qi et al. proposed a collective intelligence mining method to correct erroneous tags [50].
B. Relevance Ranking Approach
To directly rank the raw photos without undergoing any intermediate tag processing, Liu et al. [3] utilized an optimization framework to automatically rank images based on their relevance scores to a given tag; both the visual consistency among pictures and the semantic information of the tags are considered. Gao et al. [7] proposed a hypergraph learning approach which aims to estimate the relevance of images. They investigate the bag-of-words and bag-of-visual-words representations of images, extracted from the visual and textual information respectively. Chen et al. [21] proposed a support vector machine classifier per query to learn the relevance scores of its associated photos. Wu et al. [15] proposed a two-step similarity ranking scheme that aims to preserve both visual and semantic resemblance in the similarity ranking; to achieve this, a self-tuned manifold ranking solution focused on visual-based similarity ranking and a semantic-oriented similarity re-ranking method are included. Hu et al. [27] proposed an image ranking method which represents an image by sets of regions and applies these representations to multiple-instance learning based on the max-margin framework. Yu et al. [35] proposed a learning based ranking model in which both click and visual features are adopted simultaneously in the learning process. Notably, Haruechaiyasak and Damrongrat [33] proposed a content-based image retrieval method to improve the search results returned by tag-based image retrieval. In order to give users better visual enjoyment, Chen et al. [18] proposed a relevance-quality re-ranking approach to boost the quality of the retrieved images.
C. Diversity Enhancement
Relevance based image retrieval approaches can boost relevance performance, but the diversity of the search results is also very important. Many researchers have dedicated extensive efforts to diversifying the top ranked results. Leuken et al. studied three visually diverse ranking methods to re-rank the search results [10]. Different from clustering, Song et al. [9] proposed a re-ranking method to meet users' ambiguous needs by analyzing topic richness. A diverse relevance ranking algorithm that maximizes average diverse precision in an optimization framework, by mining the semantic similarities of social images based on their visual features and tags, is proposed in [5]. Sun et al. [28] proposed a social image ranking scheme to retrieve images that meet the relevance, typicality and diversity criteria; building on [5], they explored both the semantic and the visual information of images. Ksibi et al. [31] proposed to assign a dynamic trade-off between relevance and diversity performance according to the ambiguity level of the given query. Based on [31], they further proposed a query expansion approach [6] to select the most representative concept weights by aggregating the weights of concepts from different views. Wang et al. [29] proposed a duplicate detection algorithm that represents images with hash codes, so that large image databases can be grouped quickly by similar hash codes. Qian et al. [48] proposed an approach for diversifying landmark summarization from diverse viewpoints based on the relative viewpoint of each image, represented as a 4-dimensional viewpoint vector; they select the relevant images with large viewpoint variations as the top ranked images. Tong et al. achieved diversity by introducing a diversity term into their model whose function is to penalize the visual similarity between images [61-62]. However, most of the above works view the diversity problem as promoting visual diversity rather than topic coverage. As reported in [14], most people said they preferred retrieval results with broad and interesting topics, so many works on topic coverage have emerged [23, 30, 49, 54]. For instance, Agrawal et al. [23] classify a taxonomy over queries to represent the different aspects of a query. This approach promotes documents that share a high number of classes with the query, while demoting those with classes already well represented in the ranking.

Figure 1. Ranking approaches

3. SYSTEM OVERVIEW
Our system includes five main parts: 1) Tag graph construction based on the tag information of the image dataset; the tag graph is constructed to mine the topic communities. 2) Community detection; the affinity propagation clustering method is employed to detect topic communities. 3) Image community mapping; we assign each image to a single community
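Steps 1) and 2) above, building a tag graph and detecting topic communities with affinity propagation, can be sketched as follows. This is an illustrative sketch under our own assumptions, not the paper's implementation: tag co-occurrence counts serve as the similarity (edge weight) between tags, the median similarity is used as the affinity propagation preference, and all function names are ours.

```python
from collections import Counter
from itertools import combinations

def tag_similarity_matrix(image_tags):
    """Step 1: build a tag graph as a similarity matrix, with edge weights
    given by how often two tags co-occur on the same image."""
    tags = sorted({t for ts in image_tags for t in ts})
    idx = {t: j for j, t in enumerate(tags)}
    n = len(tags)
    S = [[0.0] * n for _ in range(n)]
    for ts in image_tags:
        for a, b in combinations(sorted(set(ts)), 2):
            S[idx[a]][idx[b]] += 1.0
            S[idx[b]][idx[a]] += 1.0
    # Preference (diagonal): median off-diagonal similarity, a common default.
    off = sorted(S[i][j] for i in range(n) for j in range(n) if i != j)
    pref = off[len(off) // 2] if off else 0.0
    for i in range(n):
        S[i][i] = pref
    return tags, S

def affinity_propagation(S, damping=0.5, iters=100):
    """Step 2: plain affinity propagation (Frey and Dueck [52]) on a dense
    similarity matrix; returns the exemplar index chosen for each point."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        for i in range(n):  # responsibility updates
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(vals)
            k1 = vals.index(best)
            second = max(v for k, v in enumerate(vals) if k != k1)
            for k in range(n):
                rival = second if k == k1 else best
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        for k in range(n):  # availability updates
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos)
            for i in range(n):
                if i == k:
                    A[i][k] = damping * A[i][k] + (1 - damping) * (total - pos[k])
                else:
                    A[i][k] = damping * A[i][k] + (1 - damping) * min(
                        0.0, R[k][k] + total - pos[i] - pos[k])
    return [max(range(n), key=lambda k: R[i][k] + A[i][k]) for i in range(n)]
```

Feeding `S` to `affinity_propagation` yields one exemplar tag per detected community; each image can then be mapped to the community its tags most belong to, which corresponds to step 3) above.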
[12]. … "Descriptors and Their Applications in Scene Categorization and Semantic Concept Retrieval". Multimedia Tools and Applications, May 2012.
[13]. X. Lu, Y. Yuan, X. Zheng, Jointly Dictionary Learning for Change Detection in Multispectral Imagery, IEEE Trans. Cybernetics, vol. 47, no. 4, pp. 884-897, 2017.
[14]. J. Carbonell, and J. Goldstein, "The use of MMR, diversity based re-ranking for reordering documents and producing summaries". SIGIR 1998.
[15]. Wu, J. Wu, and M. Lu, "A Two-Step Similarity Ranking Scheme for Image Retrieval". In Parallel Architectures, Algorithms and Programming, pp. 191-196, IEEE, 2014.
[16]. G. Ding, Y. Guo, J. Zhou, et al., Large-Scale Cross-Modality Search via Collective Matrix Factorization Hashing. IEEE Transactions on Image Processing, 2016, 25(11): 5427-5440.
[17]. G. Agrawal, and R. Chaudhary, "Relevancy tag ranking". In Computer and Communication Technology, pp. 169-173, IEEE, 2011.
[18]. L. Chen, S. Zhu, and Z. Li, "Image retrieval via improved relevance ranking". In Control Conference, pp. 4620-4625, IEEE, 2014.
[19]. L. Wu, and R. Jin, "Tag completion for image retrieval". Pattern Analysis and Machine Intelligence, IEEE Transactions on, 35(3), 716-727, 2013.
[20]. Y. Yang, Y. Gao, H. Zhang, and J. Shao, "Image Tagging with Social Assistance". ICMR, 2014.
[21]. L. Chen, D. Xua, and I. Tsang, "Tag-based image retrieval improved by augmented features and group-based refinement". Multimedia, IEEE Transactions on, 14(4), 1057-1067, 2012.
[22]. Z. Lin, G. Ding, J. Han, et al., Cross-View Retrieval via Probability-Based Semantics-Preserving Hashing, IEEE Transactions on Cybernetics, vol. PP, no. 99, pp. 1-14, doi: 10.1109/TCYB.2016.2608906.
[23]. R. Agrawal, S. Gollapudi, A. Halverson, and S. Ieong, "Diversifying search results". In WSDM, pages 5-14, 2009.
[24]. X. Li, "Tag relevance fusion for social image retrieval". CoRR abs/1410.3462, 2014.
[25]. X. Qian, X. Liu, and C. Zheng, "Tagging photos using users' vocabularies". Neurocomputing, 111(111), 144-153, 2013.
[26]. D. Mishra, "Tag Relevance for Social Image Retrieval in Accordance with Neighbor Voting Algorithm". IJCSNS, 14(7), 50, 2014.
[27]. Y. Hu, M. Li, and N. Yu, "Multiple-instance ranking: Learning to rank images for image retrieval". In Computer Vision and Pattern Recognition, CVPR 2008. IEEE Conference on (pp. 1-8).
[28]. F. Sun, M. Wang, and D. Wang, "Optimizing social image search with multiple criteria: Relevance, diversity, and typicality". Neurocomputing, 95, 40-47, 2012.
[29]. B. Wang, Z. Li, and M. Li, "Large-scale duplicate detection for web image search". ICME 2006, pp. 353-356.
[30]. R. Santos, C. Macdonald, and I. Ounis, "Exploiting query reformulations for Web search result diversification". In WWW, pages 881-890, 2010.
[31]. A. Ksibi, G. Feki, and A. Ammar, "Effective Diversification for Ambiguous Queries in Social Image Retrieval". In Computer Analysis of Images and Patterns (pp. 571-578), 2013.
[32]. Y. Guo, G. Ding, L. Liu, J. Han, and L. Shao, "Learning to hash with optimized anchor embedding for scalable retrieval," IEEE Trans. Image Processing, vol. 26, no. 3, pp. 1344-1354, 2017.
[33]. C. Haruechaiyasak, and C. Damrongrat, "Improving social tag-based image retrieval with CBIR technique". Springer Berlin Heidelberg, 2010, pp. 212-215.
[34]. X. Zhu, W. Nejdl, "An adaptive teleportation random walk model for learning social tag relevance". ACM SIGIR, pp. 223-232, 2014.
[35]. J. Yu, D. Tao, and M. Wang, "Learning to Rank Using User Clicks and Visual Features for Image Retrieval". IEEE Trans. Cybern., 2014.
[36]. S. Ji, K. Zhou, C. Liao, Z. Zheng, and G. Xue, "Global ranking by exploiting user clicks". ACM SIGIR, 2009, pp. 35-42.
[37]. G. Dupret, "A model to estimate intrinsic document relevance from the clickthrough logs of a web search engine". ACM international conference on Web search and data mining (pp. 181-190), 2010.
[38]. X. Lu, X. Li, and L. Mou, Semi-Supervised Multi-task Learning for Scene Recognition, IEEE Trans. Cybernetics, vol. 45, no. 9, pp. 1967-1976, 2015.
[39]. X. Hua, and M. Ye, "Mining knowledge from clicks: MSR-Bing image retrieval challenge". In Multimedia and Expo Workshops, 2014.
[40]. X. Lu, X. Li, Multiresolution Imaging, IEEE Transactions on Cybernetics, vol. 44, no. 1, pp. 149-160, 2014.
[41]. X. Qian, X. Hua, Y. Tang, and T. Mei, "Social image tagging with diverse semantics". IEEE Trans. Cybernetics, vol. 44, no. 12, 2014, pp. 2493-2508.
[42]. X. Qian, D. Lu, X. Liu, "Tag based image retrieval by user-oriented ranking". ICMR, 2015.
[43]. Y. Zhang, X. Qian, X. Tan, J. Han, Y. Tang, Sketch-Based Image Retrieval by Salient Contour Reinforcement. IEEE Trans. Multimedia 18(8): 1604-1615 (2016).
[44]. Y. Gu, X. Qian, Q. Li, et al., "Image Annotation by Latent Community Detection and Multikernel Learning". IEEE Transactions on Image Processing 24(11): 3450-3463 (2015).
[45]. X. Yang, X. Qian, and Y. Xue, "Scalable Mobile Image Retrieval by Exploring Contextual Saliency". IEEE Trans. Image Processing 24(6): 1709-1721 (2015).
[46]. D. Lu, X. Liu, and X. Qian, "Tag based image search by social re-ranking". IEEE Transactions on Multimedia, vol. 18, no. 8, 2016, pp. 1628-1639.
[47]. X. Qian, Y. Xue, Y. Tang, and X. Hou, "Landmark Summarization with Diverse Viewpoints". IEEE Trans. Circuits and Systems for Video Technology, vol. 25, no. 11, 2015, pp. 1857-1869.
[48]. R. Santos, C. Macdonald, and I. Ounis, "Selectively diversifying web search results". ACM CIKM, 2010: 1179-1188.
[49]. G. Qi, C. Aggarwal, and J. Han, "Mining Collective Intelligence in Diverse Groups", in Proc. WWW, 2013.
[50]. X. Qian, X. Tan, Y. Zhang, R. Hong, and M. Wang, "Enhancing Sketch-Based Image Retrieval by Re-ranking and Relevance Feedback". IEEE Trans. Image Processing, vol. 25, no. 1, 2016, pp. 195-208.
[51]. https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2.
[52]. B. Frey, and D. Dueck, "Clustering by passing messages between data points". Science, 2007, 315(5814): 972-976.
[53]. K. Song, Y. Tian, W. Gao, and T. Huang, "Diversifying the image retrieval results". ACM MM, 2006: 707-710.
[54]. Y. Yan, G. Liu, S. Wang, et al., "Graph-based clustering and ranking for diversified image search". Multimedia Systems, 2014.
[55]. X. Tian, et al., "Image search reranking with hierarchical topic awareness". IEEE Transactions on Cybernetics, 2015.
[56]. D. Dang-Nguyen, et al., "Retrieval of Diversity Images by Pre-filtering and Hierarchical Clustering". MediaEval, 2014.
[57]. X. Qian, Y. Xue, Y. Tang, X. Hou, and T. Mei, "Landmark Summarization with Diverse Viewpoints". IEEE Trans. Circuits and Systems for Video Technology, vol. 25, no. 11, 2015, pp. 1857-1869.
[58]. H. Hou, X. Xu, G. Wang, and X. Wang, "Joint-Rerank: a novel method for image search reranking". Multimedia Tools and Applications, 2015, 74(4): 1423-1442.
[59]. S. Liu, et al., "Social visual image reranking for web image search". MMM, 2013.
[60]. J. He, H. Tong, Q. Mei, and B. Szymanski, "GenDeR: A generic diversified ranking algorithm," Advances in Neural Information Processing Systems, 2012, 2: 1142-1150.
[61]. H. Tong, J. He, Z. Wen, R. Konuru, and C. Lin, "Diversified ranking on large graphs: an optimization viewpoint", SIGKDD, 2011, 1028-1036.
[62]. X. Li, S. Liao, W. Lan, X. Du, and G. Yang, "Zero-shot Image Tagging by Hierarchical semantic embedding," ACM SIGIR, 2015: 879-882.
[63]. D. Zhang, J. Han, C. Li, J. Wang, and X. Li, Detection of Co-salient Objects by Looking Deep and Wide, International Journal of Computer Vision, 120(2): 215-232, 2016.
[64]. D. Zhang, J. Han, J. Han, L. Shao, Cosaliency Detection Based on Intrasaliency Prior Transfer and Deep Intersaliency Mining, IEEE Trans. on Neural Networks and Learning Systems, 27(6): 1163-1176, 2016.
[65]. S. Lee, and W. Neve, "Visually weighted neighbor voting for image tag relevance learning". Multimedia Tools and Applications, 1-24, 2013.
Live your life each day as you would climb a mountain. An occasional glance towards
the summit keeps the goal in mind, but many beautiful scenes are to be observed from
each new vantage point. - Harold B. Melchart.
VISION:
We are committed to produce not only good engineers but good
human beings also.
MISSION:
OUR MISSION is to do WHAT it takes to foster, sustain and upgrade
the quality of Education by way of harnessing Talent, Potential and
optimizing meaningful Learning Facilities.
Our ENDEAVOUR is to provide the best and most conducive learning
environment & equip the students with effective Learning Strategies.
The Vadgaon(Bk) campus of Sinhgad Institutes has an ideal
environment with lush green surroundings & panoramic views.
Vadgaon(Bk) campus is situated on a delightful hillock of the
beautiful Sahyadri ranges. It provides quietude to stimulate the brain
to enhance the learning capabilities. The institutes on this campus
boast of independent infrastructure. Also, facilities to cover the
necessities of life are available on the campus.
Postal Address
Smt. Kashibai Navale College of Engineering
Sr. No. 44/1, Vadgaon (Bk), off Sinhgad Road, Pune-411041 Maharashtra, INDIA
Tele. (020)24354938, Telefax: (020) 24354938
Email: principal.skncoe@sinhgad.edu