
1. Introduction
In the recent past, owing to the numerous forums, discussion groups, and
blogs now available, individual users have been participating more actively and
generating vast amounts of new data, termed user-generated content. This new
Web content includes customer reviews and blogs that express opinions on
products and services, collectively referred to as customer feedback data on
the Web. Because customer feedback on the Web influences other customers'
decisions, it has become an important source of information for businesses to
take into account when developing marketing and product development plans.
Recent works have shown that the distribution of an overwhelming majority of
reviews posted in online markets is bimodal: reviews are allotted either an
extremely high rating or an extremely low rating. In such situations, the
average numerical star rating assigned to a product may not convey much
information to a prospective buyer. Instead, the reader has to read the actual
reviews to examine which of the positive and which of the negative aspects of
the product are of interest. Several sentiment analysis approaches have been
proposed to tackle this challenge to some extent. However, most classical
sentiment analysis maps customer reviews into binary classes, positive or
negative, and thus fails to identify the product features liked or disliked by
the customers.

2. Motivation
This project results from the need to extract useful information from the
large amount of unstructured and unorganized data available on the Web. Because
of the explosion of data on the Internet, there is a growing need to analyze
this unprocessed data and obtain meaningful information that can be used in
other applications.
There is a need for a system that lets consumers directly obtain the positive
or negative opinion about a product without spending time reading the reviews
written by other users of that product. In this project, a framework is
presented which first extracts the feature, modifier, and opinion from the
dataset and then uses a clustering mechanism to divide them into discrete
clusters on the basis of users' opinions, such that the intra-cluster
similarity between the features is high whereas the inter-cluster similarity is
very low.
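
To make the clustering criterion concrete, the following minimal Python sketch
computes the mean intra-cluster and inter-cluster similarity for a given
grouping. It assumes that each feature is represented as a numeric vector and
that similarity is measured with the cosine; neither choice is specified in
this synopsis, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean(values):
    """Arithmetic mean of a list, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def cluster_similarity(clusters):
    """Return (mean intra-cluster, mean inter-cluster) cosine similarity.

    `clusters` is a list of clusters, each a list of feature vectors.
    The vector representation is an assumption made for illustration.
    """
    # Pairs of vectors that share a cluster.
    intra = [cosine(u, v) for c in clusters for u, v in combinations(c, 2)]
    # Pairs of vectors drawn from two different clusters.
    inter = [
        cosine(u, v)
        for a, b in combinations(clusters, 2)
        for u in a
        for v in b
    ]
    return mean(intra), mean(inter)
```

A well-separated clustering yields a high first value and a low second value,
which is the property the framework aims for.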

3. Objective
1) To design and implement feature-based clustering techniques in sentiment
analysis to improve customer review summarization.
2) To process and analyze Twitter or Facebook feeds to determine the responses
and feedback of the customers. Using sentiment analysis, we can determine the
content of the posts and how many customers have given positive or negative
reviews.
3) To use sentiment analysis and opinion mining to analyze customer reviews
about a specific product or service. We can determine how many users liked or
disliked the product or service and what the strong and weak points of the
reviewed product are.
As an example, we can analyze customer feedback about a smartphone. Using
sentiment analysis, we can determine how many customers described the product
as good and how many disliked it. The positive features that users have rated
highly, such as battery, LCD display, and RAM, can be displayed in order of
their rankings. Similarly, the drawbacks of the product as described by the
customers can be listed with their rankings.
4) To use opinion mining to improve the efficiency of web mining. Company
officials can directly analyze the general response and feedback of the
customers about their product or service without spending hours reading the
reviews manually.
5) To implement a system which lets consumers directly obtain the positive or
negative opinion about a product without spending time reading the reviews
written by other users of that product.

4. Scope of the project
Fig. 1 presents the architectural details of the proposed opinion mining
system, which consists of five major modules: Document Processor,
Subjectivity/Objectivity Analyzer, Document Parser, Feature and Opinion
Learner, and Review Summarizer and Visualizer. The working principles of these
components are explained in the following steps:

1) The first step involves collecting review documents from various sources,
such as e-commerce websites (Flipkart, Amazon, etc.) and social networking
sites (Twitter, Facebook, etc.).
2) In the next step, the Document Processor and Subjectivity/Objectivity
Analyzer modules are employed. These include a Markup Language (ML) tag filter
that divides an unstructured web document into individual record-size chunks,
cleans them by removing ML tags, and presents them as individual unstructured
record documents for further processing.
3) Then the Document Parser and the Feature and Opinion Learner modules are
applied. The Document Parser module uses the Stanford parser, which assigns
Part-Of-Speech (POS) tags to every word based on the context in which it
appears. The documents are analyzed using a classifier, and features are
extracted from the user reviews.
4) After the features are extracted, they are given ratings according to the
opinions expressed about them. Ratings are assigned to all the features of a
particular product and stored in a database.
5) Then a feature-based summary of the review documents is generated (Review
Summarizer and Visualizer module). Finally, the total number of positive and
negative opinion sentences for each feature is calculated to generate a
feature-based review summary, which is presented to the user graphically or in
tabular form.
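
Steps 2 and 5 above can be sketched in Python as follows. This is an
illustration under stated assumptions, not the proposed implementation: the tag
filter uses the standard-library html.parser, and feature and opinion detection
is reduced to lookups in small hypothetical word lists (FEATURES, POSITIVE,
NEGATIVE) in place of the Stanford parser and the classifier described in
step 3.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical word lists for illustration only; the synopsis does not define any.
FEATURES = {"battery", "display", "camera"}
POSITIVE = {"good", "great", "excellent", "amazing"}
NEGATIVE = {"bad", "poor", "terrible", "weak"}


class TagStripper(HTMLParser):
    """Collects the text content of a markup document and drops the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(markup):
    """Step 2 (simplified): remove markup tags and normalize whitespace."""
    parser = TagStripper()
    parser.feed(markup)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def summarize(reviews):
    """Step 5 (simplified): count positive/negative opinion sentences per feature."""
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for review in reviews:
        text = strip_tags(review)
        # Naive sentence split on terminal punctuation.
        for sentence in re.split(r"[.!?]+", text):
            words = set(re.findall(r"[a-z]+", sentence.lower()))
            # A sentence counts for every known feature it mentions.
            for feature in FEATURES & words:
                if words & POSITIVE:
                    summary[feature]["positive"] += 1
                if words & NEGATIVE:
                    summary[feature]["negative"] += 1
    return dict(summary)
```

Calling summarize on a list of raw review strings returns a dictionary keyed by
feature, which could then be rendered as the table or chart mentioned in step 5.
Keyword matching ignores negation and context, so a real system would rely on
the POS-based extraction described in step 3.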
