Case Studies
Ettikan Kandasamy Karuppiah (Ph.D),
Principal Researcher & Director of Accelerative Technologies Lab
MIMOS Berhad
Broadening data
Growing data
VOLUME
90% of world's
data generated
over last 2 years
VARIETY
Turning
big data into
Value
is unstructured (text,
geospatial, audio, video)
ECONOMIC
BENEFITS
Establishing the
Increasing data
VELOCITY
GOVERNMENT
BENEFITS
VERACITY
175,000
SOCIETAL
BENEFITS
tweets per
second
of big data
sources
Business Value
Requires
Transformative
Platform
Sentiment
Analysis Model
&
Data Modeling
& Data
Warehouse for
PIK MOH
& GPGPU Video
Data Analytics
Library
AMD Malaysia
/US/Europe MoU
Data Cleansing
Engine for
PERKESO
&
Data Warehouse
for PERKESO
ESRI Inc/US MoU
Established
Establish work on
General Purpose
Graphics
Processing Unit for
text manipulation,
Hadoop Trainings
Acquire
Train
R&D
GE13 Electoral
Roll Analysis with
Hadoop & GPU
GPU
Accelerated
Libraries for
Data
Cleansing &
Financial Risk
Modeling
R&D
Collaboration
MiAccLib
Cleansing
Data Modeling
& Visualization
for PDRM
Workforce
Planning
& GPGPU Data
Security Library
Data
Encryption/Decryption for
National Data
Protection
MiAccLib
MiAccLib Crypto
Image
MiAccLib
BigData
MiAccLib
MiAccLib
Video MiAccLib
MiAccLib Finance
Acquire
Algo/Map
Cleansing
Train
RM10 ->
Foundation & Early Adaptation for
Heterogeneous Computing
RM11 ->
Maturation & Progressive
Deployment of Scalable
Heterogeneous Computing
DECISIONS REQUESTED
FCC is requested to:
1.
2.
3.
4.
5.
Source : MDeC
Policy
Rahsia Besar (Top Secret)
Technology
Rahsia (Secret)
Sulit (Confidential)
Terhad (Restricted)
DATA
Project
Sponsor
Data.gov.my
OUTCOMES
Expertise
- Community
Data
- Government
Data
POCs,
pilots &
apps
DATA
Community
Government
Data
Extraction
Secured
Cloud
Security
Services
Key Values
Accelerated
Data
Staging
Computing
Cleansing
Data DB Store
Harmonisation
Infrastructure
Management
Anonymisation
i.
ii.
iii.
Machine Learning
Model &
-Data
Malaysian
Analytics
Context
- (BM, English,
Chinese, Tamil)
Visualization
Data
Visualization
- Malaysian
Perspective
Traceability
Applications
Customization
Data
Staging
Mi-Morphe
Mi-Helio
Mi-UAP
Mi-Harvester
Cleansing
Mi-Harmony
Data
Mi-BIS Visualization
Security
Mi-ARMC
Mi-Doc
Harmonisation
Mi-Scrambler
Mi-Portal
Mi-Trust
Data
Mi-DW
Management
Anonymisation
Mi-AccLib
Mi-DSS
Mi-Market
Mi-Trace
Traceability
Mi-AccLytics
Mi-STP
Data
Model & Analytics
Galactica
Data
DB Store
Mi-ROSS
Mi-HPDW
Mi-CLIP
Data Extraction
Mi-Cloud
Infrastructure
Management
Mi-Mobile
Mi-Target
Structured
+
Open Linked
Data
Unstructured
Mi-MOCHA
Data
Harvesting
Data
Cleansing
Data
Harmonisation
Data
Anonymisation
Data Sharing
Scrambled
database &
Datamarts
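The stages listed above (cleansing, harmonisation, anonymisation, then sharing) can be sketched as a chain of small functions. This is a minimal illustration only, not how the MIMOS components are implemented: the field names, the `FIELD_MAP` vocabulary, and the use of a keyed hash for scrambling are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical mapping from source-specific field names to a shared vocabulary.
FIELD_MAP = {"nama": "name", "negeri": "state"}


def cleanse(record):
    """Trim whitespace and drop fields that are empty after trimming."""
    return {k: v.strip() for k, v in record.items() if v and v.strip()}


def harmonise(record):
    """Rename fields to the shared vocabulary; unknown fields pass through."""
    return {FIELD_MAP.get(k, k): v for k, v in record.items()}


def anonymise(record, key, sensitive=("name", "ic_no")):
    """Replace sensitive values with a keyed hash (a pseudonym).

    Assumption: keyed hashing stands in for whatever scrambling the real
    platform applies. It hides the value but keeps records joinable.
    """
    out = dict(record)
    for field in sensitive:
        if field in out:
            digest = hmac.new(key.encode("utf-8"),
                              out[field].encode("utf-8"),
                              hashlib.sha256).hexdigest()
            out[field] = digest[:16]
    return out


def run_pipeline(records, key):
    """Apply cleansing, harmonisation and anonymisation in that order."""
    return [anonymise(harmonise(cleanse(r)), key) for r in records]


if __name__ == "__main__":
    raw = [{"nama": "  Ali ", "negeri": "Selangor", "ic_no": ""}]
    print(run_pipeline(raw, key="demo-key"))
```

Keeping each stage as a separate function means a stage can be swapped (for example, a stricter anonymisation step for more sensitive data) without touching the others.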
Staging
Data
UnStructured
Data Sources
Structured
Data Sources
Knowledge
Harvester (LOD)
Mi-Harvester
Cleansing
Data
Correction
Detect
Correction
Exception
Harmonisation
Harmonisation
Terminologies
Data
Anonymisation
Mi-Scramble
+ Mi-Crypto +
MiAccLib
Data
Visualization
Mi-HELIO;
Mi-BIS
Data Harmonisation
Mi-Harmony +
Mi-Semantics
Mi-Morphe +
Mi-AccLib
Published
Data Marts
Data Warehouse Platform
(Mi-Galactica, Mi-AccConnect, Mi-HPDW)
Social Network
Analytics
Mi-Visualitic
Data
Analytics
Mi-Portal
Data Analytics
Granular
Primary
Database
Authentication &
Authorization
Mi-UAP
Mi-ARMC
Data Visualization
Data Modeling
Data
Analytics
Mi-HPDW
Data
Statistics
Mi-AccStat
Sentiment
Analytics
Mi-Intelligence;
Mi-NLP
Data
Analytics
Mi-Target
NEWER Sources
of Data
NEWER Channels
of Consumption
NEWER Methods
of Visualization
Mi-CLIP
Mi-Morphe
Mi-Helio
Mi-UAP
Mi-Harvester
Mi-Harmony
Mi-BIS
Mi-ARMC
Mi-Doc
Mi-Scrambler
Mi-Portal
Mi-Trust
Mi-DW
Mi-AccLib
Mi-DSS
Mi-Market
Mi-Trace
Mi-AccLytics
Mi-STP
Galactica
Mi-ROSS
Mi-HPDW
Mi-Target
Mi-Cloud
Mi-Mobile
Mi-MOCHA
Technology Push
IoT
Internet of Things
Internet of Anything
Software Defined
Network
II
Industrial Internet
IoE
Internet of Everything
Big Data
Processing
Mobile Systems
Wearables
Cloud
Computing
IoT
Internet of Things
Cyber-physical
systems
Cyber-biological
systems
Internet of
Humans
Data
Extraction
Data Visualisation
Data Cleansing
Mi-Clip
Mi-Morphe
Mi-Harvester
Mi-AccLib
Sqoop
Mi-HPDW
Open Linked
Data
Mi-Intelligence
Mi-HPDW
Mi-Harmony
Cloudera
Search & Solr
Mi-AccConnect
Mi-HPDW
Galactica Connector
Mi-HPDW
Galactica
Mi-Target
Mi-AccStat
Mahout
Data Harmonization
Data Anonymisation
Data Model
Mi-Visualitics
Mi-Helio
Kafka
RDBMS
Mi-Portal
Mi-BIS
GIS
Hive
Impala
Shark
Mi-Scramble
Mi-Crypto
Data
Management
YARN
Mi-AccLib
Mi-HPDW
STORAGE
Mi-Trust
Mi-NLP
ML-Lib (Spark)
Cloudera
Manager/
Falcon
Data Storage
Files
Data
Security
Mi-UAP
Mi-Morphe
Flume
Web & Social
Media
Data Staging
Galactica FS
Galactica
HDFS, NoSQL
Hadoop
RDBMS
Data warehouse / Data mart
Infrastructure
Mi-Cloud
Mi-Mocha
MIMOS
Solution
RDF
Graph DB
ZooKeeper
Oozie
Sentry
MapReduce v2 |
Machine Learning
Pig | Hive
Analytics
Processing
(For Mi-Morphe v3/Pentaho)
Management
YARN (resource management) | Big Data Orchestration Engine/Layer | ZooKeeper (configuration and synchronization)
Oozie (workflow scheduler) | Cloudera Manager | Management for Lustre
Data Management
Stream
Search
Sqoop | Flume
Storage
Legend:
RDBMS
MIMOS Technologies
3rd Party Technologies
Proof of Concepts
Selected Use Cases
Proof of Concepts
- Mixed Scenario (Technology Capabilities)
Challenges to be Addressed
During Initial Roll-Outs
Who is the data owner? How can the security level of data be ensured for
sharing? There is confusion over PDP compliance.
More to be shared by visiting MIMOS Lab
Tools are available, but the right approach is still critical for evaluation
Which are the best/right algorithms to be used?
Can you identify the right domain expert within the organization?
Who are the local domain experts to be consulted for the
methods/algorithms selection?
A specific government organization may not have a data scientist, so how can an
analytics team be formed (external + internal)?
Thank You