
HACKING INDONESIA THROUGH BIG DATA
Ismail Fahmi, Ph.D.
Director, PT. Media Kernels Indonesia
Ismail.fahmi@gmail.com

ITCAMP 2019
ONNO CENTER, SITU GINTUNG, TANGSEL
15 September 2019
Ismail Fahmi, PhD.
Ismail.fahmi@gmail.com
Born: Bojonegoro, 1974
Founder, Media Kernels Indonesia

1992 – 1997 Bachelor's degree (S1), Electrical Engineering, ITB
2003 – 2004 Master's degree (S2), Information Science, University of Groningen, the Netherlands
2004 – 2009 Doctorate (S3), Information Science, University of Groningen, the Netherlands

2000 – 2003 Initiator of IndonesiaDLN (the first Digital Library Network in Indonesia)
Developed the Ganesha Digital Library (GDL)
Founded the Knowledge Management Research Group (KMRG) ITB
Built the ITB Digital Library

2009 – present Engineer at Weborama, a big data company (Paris/Amsterdam)
2014 – present Founder of PT. Media Kernels Indonesia, a Drone Emprit Company
2015 – present Consultant to the National Library (Perpustakaan Nasional), initiator of Indonesia OneSearch
2017 – present Permanent lecturer, Master's Program in Informatics, Universitas Islam Indonesia

MATA NAJWA LIVE ‘HOAX VIRUSES’

#ILC SOCIAL MEDIA WAR (21 August 2018)

PRESIDENTIAL ELECTION DEBATE 2019

AGENDA

• Big Data

BIG DATA ANALYTICS
DATA GROWTH: UNSTRUCTURED DATA

BIG DATA – BIG GROWTH

INDONESIA DIGITAL 2019
GEN Z, MILLENNIALS

CROSS PLATFORM RESONANCE

DRONE EMPRIT
BIG DATA ARCHITECTURE

MK Big Data Architecture (diagram)

• Data Ingest: News Crawler, Twitter Crawler, Twitter Streaming, FB Page Crawler, IG, YouTube, other sources
• Data Pipeline: Management & Queue, Data & Workflow Management
• Map Reduce: Scheduled Job Processing, Realtime Job Processing, Sentiment Analysis, Other Processings
• Analytics UI: Visualization, Data Insight Access
• Database Framework, Hadoop Framework
• SOLR Indexer 1, SOLR Indexer 2, SOLR Indexer 3, SOLR Indexer 4
• Physical Hardware

System Architecture (diagram)

• Social Media sources: Twitter (Search + JSON; Twitter Stream Subscriber, PUSH JSON), Facebook (Search + Account Crawler)
• Online News sources: Detik (ID), Reuters (EN), etc. (RSS + HTML Crawler); Gatra (ID), Bloomberg (EN), etc. (HTML Crawler)
• Forums: Kaskus, Detik Forum, etc. (HTML Crawler)
• Print: Kompas, Warta Ekonomi, etc. (TEXT/HTML Converter)
• Redis Queue, Storage, Cache Manager
• Index Servers: SOLR Nodes, Shard 1 to Shard N
• Sentiment Analysis, Sentiment Analysis Models
• Analyses: Backtrack, Filters, deletes, Projects, Keywords + Mentions, Accounts Filters
• Client(s), served as JSON: Smart phones, tablets, Desktops, Control Room Screens
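
The diagram places a Redis Queue between the crawlers and a set of SOLR shards (Shard 1 to Shard N). The slides do not say how documents are assigned to shards, so the following is only a minimal sketch under stated assumptions: an in-memory `deque` stands in for the Redis queue, the shard count of 4 is borrowed from the four SOLR indexers shown earlier, and `shard_for` and `drain` are hypothetical helper names that route each document by a stable hash of its id.

```python
import hashlib
from collections import deque

NUM_SHARDS = 4  # assumption: mirrors the four SOLR indexers in the slides


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Map a document id to a shard number using a stable hash (assumed scheme)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def drain(queue, num_shards=NUM_SHARDS):
    """Pop every queued document and place it in the bucket of its shard."""
    shards = [[] for _ in range(num_shards)]
    while queue:
        doc = queue.popleft()  # oldest document first, like a FIFO queue
        shards[shard_for(doc["id"], num_shards)].append(doc)
    return shards
```

A stable hash keeps the same document on the same shard across runs, which is why `hashlib` is used here instead of Python's built-in `hash`.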

Media Kernels Features

DASHBOARD ANALYTICS TOPICS INFLUENCERS SNA

Trends Media Retweets Impact Influencer Network

Comparison News Sites Replies Engagement Topic Network

Page Ranks Most Shared URLs Reach Insight Explorer


NEWS PORTAL
Sentiment Analysis Most Shared Videos Most Engaged

Topic Map PR-Values Hashtags Posts


DEMOGRAPHY

Latest News PF-Chart Topic Map Followers Twitter User Map

Engagement Word Cloud Bubble Map User Locations


MENTIONS
Exposure

Edit Sentiments Reach


ADMIN REPORTING COMPARE

Training & Learning User Management Upload Report Compare SNA

Backtracking
OPINION ANALYSIS
Project Management Download Report Compare Projects

Background Jobs Label and Training Client Management Popularity vs


Favorability
Opinion Chart Source Management

DATA SOURCES
(1) Crawling Online News

Crawler, Index Server

(2) Twitter API: Realtime (Filter)

POST statuses/filter
Filter: max 400 keywords

All Statuses, passed through the filter, yield Filtered Statuses (~ 100%)
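
To make the 400-keyword limit concrete, here is a minimal sketch of preparing a keyword list for the filter and of checking locally which statuses it would let through. It makes no network calls. The helper names `build_track_param` and `matches` are hypothetical, and the matching rule (a phrase matches when all of its words occur in the text, case-insensitively) only approximates the behaviour of the real endpoint.

```python
import re

MAX_TRACK_KEYWORDS = 400  # limit stated on the slide


def build_track_param(keywords):
    """Normalise, de-duplicate and join keywords into one comma-separated value."""
    cleaned = []
    seen = set()
    for keyword in keywords:
        normalised = " ".join(keyword.lower().split())
        if normalised and normalised not in seen:
            seen.add(normalised)
            cleaned.append(normalised)
    if len(cleaned) > MAX_TRACK_KEYWORDS:
        raise ValueError(
            f"{len(cleaned)} keywords given, limit is {MAX_TRACK_KEYWORDS}"
        )
    return ",".join(cleaned)


def matches(text, track):
    """Approximate the filter: any phrase whose words all appear in the text."""
    tokens = set(re.findall(r"\w+", text.lower()))
    for phrase in track.split(","):
        words = re.findall(r"\w+", phrase)
        if words and all(word in tokens for word in words):
            return True
    return False
```

In this sketch a comma acts as OR between phrases and a space acts as AND within a phrase, so `"hoax,pemilu 2019"` lets through any status containing "hoax", or containing both "pemilu" and "2019".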

(3) FB API (v2): Object (Facebook Page)

https://graph.facebook.com/$object_id/$type?
fields=id,
parent_id,
from,
to,
type,
status_type,
story,
message,
link,
likes.summary(true),
shares,
comments.order(reverse_chronological).summary(true),
created_time,
updated_time
&order=reverse_chronological
&access_token=$access_token&limit=$limit&until=$last_timestamp

$type = [feed, comment, ...]
$object_id = FB Page ID, etc
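
The query template on this slide can be assembled programmatically. The sketch below only builds the URL string with the standard library and sends no request. The function name `build_page_url` and the example argument values are assumptions for illustration, while the field list and parameter names are copied from the slide.

```python
from urllib.parse import urlencode

# Field list copied from the slide.
FIELDS = [
    "id", "parent_id", "from", "to", "type", "status_type", "story",
    "message", "link", "likes.summary(true)", "shares",
    "comments.order(reverse_chronological).summary(true)",
    "created_time", "updated_time",
]


def build_page_url(object_id, edge, access_token, limit, until):
    """Fill the slide's template: object_id is e.g. a FB Page ID, edge is e.g. 'feed'."""
    params = {
        "fields": ",".join(FIELDS),
        "order": "reverse_chronological",
        "access_token": access_token,
        "limit": limit,
        "until": until,  # last timestamp seen, used to page backwards in time
    }
    # Keep commas, dots and parentheses readable instead of percent-encoding them.
    query = urlencode(params, safe=",().")
    return f"https://graph.facebook.com/{object_id}/{edge}?{query}"
```

Passing the `until` value of the oldest item already fetched lets a crawler walk a page's feed backwards one batch at a time.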

(4) Instagram

(5) YouTube

(6) WhatsApp

SOCIAL MEDIA ANALYTICS:
QUALITATIVE MEETS QUANTITATIVE,
UNSTRUCTURED MEETS STRUCTURED DATA
SOME EXAMPLES OF ANALYSIS

• Sentiment Analysis
• Trend Analysis
• Social Network Analysis
• Cross-Platform Analysis
• News Topic Clustering Analysis
• Stakeholder Mapping

Sentiment Analysis

MENTIONS: Positive? Negative? Neutral?

Sentiment Analysis

MENTIONS: Positive for Setya Novanto

Sentiment Analysis

MENTIONS: Negative for KPK

Sentiment Analysis

MENTIONS: Neutral for Judge Cepi Iskandar

Sentiment Analysis Techniques

http://www.sciencedirect.com/science/article/pii/S2090447914000550
Evaluation

A "one model for all" approach cannot give the right label for every subject.

A lexicon-based approach depends on whether words appear in the sentiment dictionary, and it cannot give the right label for different subjects.
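
The limitation described above can be seen in a few lines of code. The following is a minimal sketch of a lexicon-based labeller, not the method used by Drone Emprit: the tiny `LEXICON`, the example mention, and the function names `lexicon_label` and `label_for_subject` are all made up for illustration. Because the score depends only on which dictionary words occur, the same mention receives the same label for every subject.

```python
import re

# Hypothetical toy sentiment dictionary: word -> polarity.
LEXICON = {"wins": 1, "good": 1, "loses": -1, "bad": -1}


def lexicon_label(text):
    """Sum the polarity of known words and map the total to a label."""
    tokens = re.findall(r"\w+", text.lower())
    score = sum(LEXICON.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def label_for_subject(text, subject):
    """A lexicon has no notion of subject, so the argument cannot change the label."""
    return lexicon_label(text)
```

For an illustrative mention such as "Setya Novanto wins the pretrial motion against KPK", this labeller returns "positive" whether the subject is Setya Novanto, KPK, or the judge, whereas the preceding slides expect positive, negative, and neutral respectively.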

DEMO AND PRACTICE
DRONE EMPRIT TRIAL ACCESS
(DURING THE TRAINING ONLY)

• URL: analytics.droneemprit.id
• Login: itcamp@droneemprit.id
• Pass: 2019

DRONE EMPRIT ACADEMIC:
BIG DATA-BASED RESEARCH OPPORTUNITIES FOR
DA'WAH AND COMMUNICATION
DRONE EMPRIT ACADEMIC
FREE SOCIAL MEDIA (TWITTER) DATA ANALYTICS

JOIN DRONE EMPRIT ACADEMIC
https://dea.uii.ac.id

TOPICS BASED ON SDGs
(Sustainable Development Goals)

EXAMPLES /1

HOW IT WORKS

USERS
• Students
• Researchers
• Lecturers
• Journalists
• Bloggers
• Hoax busters

Diagram labels: Dashboard Access, Admin

REQUIREMENTS:
• Publish their analysis for the public using any medium.

STEPS:
• Registration
• Propose keywords
• Analysis and publication

BOOKS “READING INDONESIA”

Ismail Fahmi, PhD.

THANK YOU
