Our framework consists of three steps:
(i) Mine data from Twitter and save it in a local database.
(iii) Perform sentiment analysis on the data returned from the local database.
For instance, Jackie Chan appears in the Wikipedia list, so we queried the local database for tweets that contain the name Jackie Chan. The returned tweets are then forwarded for sentiment analysis.
We performed lexicon-based sentiment analysis using the R language, since it provides built-in functions that minimize the lines of code.
The scheme relies on two word lists: 2006 positive words and 4783 negative words. These are the words most commonly used on social media to express positive and negative sentiment.
To analyze the tweets for an entity, for example Jackie Chan, we performed the following steps on its tweets:
Remove all punctuation marks,
Remove all usernames,
Remove links,
Remove stop words (the, there, he, she, now, when, etc.), as they carry no sentiment,
Split all the tweets on spaces and count the number of positive and the number of negative words in the set of tweets.
Results:
Out of our list of 417 entities, 164 were not tweeted about at all and 85 had fewer than five tweets in total; we ignored all of these. The remaining 168 entities, each with more than four tweets in total, were considered in drawing the results.
The following graph shows the sentiment analysis of the top 10 entities that appeared in the Panama Papers, ranked by their total number of Panama-related tweets.
[Figure: No. of Positive and Negative words. Chart with an axis from 0 to 300 and two series (no. of -ve words, no. of +ve words) for Nawaz Sharif, Maryam Nawaz, Lee Shing Put, Shahid Nazir, Dan Gertler, Ken Whitney, Mark Thatcher, Michael Mates, Michael Ashcroft and Gul Muhammad Tabba.]
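The tweet-cleaning and word-counting steps listed for each set of tweets can be sketched roughly as follows. This is only an illustration in Python, not the R code used in the study. The word lists below are tiny stand-ins for the 2006-word and 4783-word lexicons, and the function names, the regular expressions and the lowercasing step are all assumptions. The sketch also strips usernames and links before punctuation, because removing the "@" and "://" characters first would make them impossible to detect.

```python
import re
import string

# Tiny stand-in lexicons (assumption); the study used lists of
# 2006 positive and 4783 negative words.
POSITIVE_WORDS = {"good", "great", "love", "best"}
NEGATIVE_WORDS = {"bad", "corrupt", "hate", "worst"}
# Stop words taken from the examples given in the text.
STOP_WORDS = {"the", "there", "he", "she", "now", "when"}

# Hypothetical patterns for Twitter usernames and links.
USERNAME_RE = re.compile(r"@\w+")
LINK_RE = re.compile(r"(?:https?://|www\.)\S+")
# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Return the list of words left after the cleaning steps."""
    text = text.lower()                  # assumption: match case-insensitively
    text = USERNAME_RE.sub(" ", text)    # remove usernames
    text = LINK_RE.sub(" ", text)        # remove links
    text = text.translate(PUNCT_TABLE)   # remove punctuation marks
    # Split on whitespace and drop stop words.
    return [word for word in text.split() if word not in STOP_WORDS]


def count_sentiment_words(tweets):
    """Count positive and negative lexicon words over a set of tweets."""
    positive = negative = 0
    for tweet in tweets:
        for word in clean_tweet(tweet):
            if word in POSITIVE_WORDS:
                positive += 1
            elif word in NEGATIVE_WORDS:
                negative += 1
    return positive, negative
```

Calling `count_sentiment_words` on the tweets retrieved for one entity returns a pair of counts, which corresponds to the two values plotted per entity (no. of +ve words and no. of -ve words).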