Thoughts-Linking SEIS 630, 736, 631, 632 & 732

736:
Hadoop = framework for distributed storage and processing; its distributed file system (HDFS) can save work computed using Apache Spark
Apache Spark = in-memory processing framework written in Scala, with APIs for Scala, Java, Python and R
630:
SQL = allows you to query structured data (relational databases like Oracle, MySQL,
SQL Server)
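A minimal sketch of querying structured data with SQL, using Python's built-in sqlite3 module as a stand-in for Oracle, MySQL or SQL Server. The students table and its rows are made up for illustration.

```python
import sqlite3

# In-memory relational database; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, course TEXT)"
)
conn.executemany(
    "INSERT INTO students (name, course) VALUES (?, ?)",
    [("Ana", "SEIS 630"), ("Ben", "SEIS 736"), ("Cy", "SEIS 630")],
)

# SQL query: pick the rows whose course column matches a value.
rows = conn.execute(
    "SELECT name FROM students WHERE course = ? ORDER BY name",
    ("SEIS 630",),
).fetchall()
print(rows)  # [('Ana',), ('Cy',)]
conn.close()
```

Every row has the same columns, which is what lets the WHERE clause filter on a named column.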
736:
Hive = allows you to query large data sets stored in Hadoop with a SQL-like language
(HiveQL), applying a table structure when the data is read. Non-relational databases
like MongoDB and Cassandra do not use the relational model, so they are also known
as NoSQL databases.
630:
Structured data on its own is usually not what is meant by big data.
736:
Unstructured data makes up much of what is called big data.
630:
Structured data is data that fits into rows & columns and is stored in a relational
database (like Oracle, MySQL, SQL Server).
736:
Unstructured data is data that doesn't fit into rows & columns and is typically stored in
a non-relational database (like MongoDB or Cassandra) or in a distributed file system
(like Hadoop's HDFS).
Examples of unstructured data include emails, audio, images, text, video & social media.
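A small sketch of the difference, assuming made-up records. The CSV text stands in for a relational table; the JSON documents stand in for records in a document store (strictly speaking these are semi-structured, but they show the point that records need not share columns).

```python
import csv
import io
import json

# Structured: every record has the same columns, so it fits a table.
table_text = "id,name,course\n1,Ana,SEIS 630\n2,Ben,SEIS 736\n"
structured_rows = list(csv.DictReader(io.StringIO(table_text)))

# Loosely structured: free text and fields that differ per record (hypothetical data).
documents = [
    json.loads('{"from": "ana@example.com", "body": "See you in class"}'),
    json.loads('{"user": "ben", "post": "Spark is fast!", "likes": 3}'),
]

# All table rows share one set of columns.
columns = set(structured_rows[0].keys())
same_columns = all(set(row.keys()) == columns for row in structured_rows)

# The two documents have no field in common.
shared_fields = set(documents[0]) & set(documents[1])

print(same_columns)   # True
print(shared_fields)  # set()
```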
NewSQL databases = aim for the scalability of non-relational databases (which scale
better than relational databases because they trade off the ACID properties, focused on
data reliability, for the BASE properties, focused on data availability) while still
providing the structure and ACID transactions of relational databases (better than
non-relational databases because a certain level of structure is needed for making
accurate decisions that fulfil business requirements).
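630 & 736:
By analogy with the pH scale, a database transaction model can be placed between acidic
(ACID: Atomicity, Consistency, Isolation, Durability; relational databases) and basic
(BASE: Basically Available, Soft state, Eventual consistency; non-relational databases).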
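A hedged sketch of the "A" in ACID (atomicity), again using Python's built-in sqlite3 module as a stand-in relational database. The accounts table, the transfer helper and the balances are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The debit would overdraw "a", so the CHECK fails and the credit is undone too.
ok = transfer(conn, "a", "b", 150)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)  # False {'a': 100, 'b': 0}
```

A BASE-style store would instead accept the write quickly and let replicas converge later, which this single-node sketch does not model.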
630 & 732:
Business intelligence and data warehousing: data models are key to database design.
632:
Data modeling/data structuring involves visualizing data through the use of graphical
tools, so you will want to obtain a data modeling software package or use the graphical
capabilities in existing software.
A data model explicitly determines the structure of data in the context of data science;
it is also known as a data structure in the context of computer science.
630 & 732:
Business intelligence is where you organize data (add structure to unstructured data)
in order to perform analytic operations on the data, such as: query by multiple criteria,
slice and dice, drill down & roll up.
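The four operations named above can be sketched on a tiny in-memory fact table. The sales rows, the column order and the total_by helper are assumptions made for illustration; a real BI tool would run these against a data warehouse.

```python
from collections import defaultdict

# Hypothetical sales facts: (year, quarter, region, product, amount).
sales = [
    (2023, "Q1", "East", "laptop", 100),
    (2023, "Q1", "West", "phone", 50),
    (2023, "Q2", "East", "phone", 70),
    (2024, "Q1", "East", "laptop", 120),
    (2024, "Q1", "West", "laptop", 80),
]


def total_by(rows, *positions):
    """Sum the amount column, grouped by the chosen column positions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[p] for p in positions)
        totals[key] += row[4]
    return dict(totals)


# Query by multiple criteria: region and product together.
east_laptops = [r for r in sales if r[2] == "East" and r[3] == "laptop"]

# Slice: fix one dimension (year = 2023).
slice_2023 = [r for r in sales if r[0] == 2023]

# Dice: restrict two dimensions at once (year and region).
dice_2023_east = [r for r in sales if r[0] == 2023 and r[2] == "East"]

# Roll up: aggregate to the coarser level (year only).
by_year = total_by(sales, 0)

# Drill down: break the yearly totals out by quarter.
by_year_quarter = total_by(sales, 0, 1)

print(by_year)          # {(2023,): 220, (2024,): 200}
print(by_year_quarter)  # {(2023, 'Q1'): 150, (2023, 'Q2'): 70, (2024, 'Q1'): 200}
```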
732:
A data warehouse is a set of data models/data structures that aggregates structured
data and stores all the information so that it can be used in data analysis & reporting,
which support developing business intelligence.
A data analyst applies basic statistical algorithms to analyze structured data (stored in
a relational database as table rows) in order to improve decision making/business
intelligence.
A data scientist applies advanced statistical algorithms to analyze unstructured
data (stored in a non-relational database) in order to improve decision making/business
intelligence.
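As a rough illustration of the "basic statistical algorithms" a data analyst might apply to structured rows, here is a sketch using Python's built-in statistics module. The orders data is invented for the example.

```python
import statistics

# Hypothetical structured data: one row per order, as it might come from a table.
orders = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": 35.0},
    {"order_id": 3, "amount": 35.0},
    {"order_id": 4, "amount": 50.0},
]

amounts = [row["amount"] for row in orders]

mean_amount = statistics.mean(amounts)      # average order value
median_amount = statistics.median(amounts)  # middle order value
spread = statistics.pstdev(amounts)         # population standard deviation

print(mean_amount, median_amount)  # 35.0 35.0
print(round(spread, 2))            # 10.61
```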
Software skills for data scientists

Data science languages: Python/R
Relational databases: MySQL, PostgreSQL
Non-relational databases: MongoDB
Distributed computing: Hadoop, Spark
Cloud: GCP/AWS/Azure
Machine learning models: e.g. Regression, Boosted Trees, SVM, NNs
Graph: Neo4j, GraphX
API interaction: OAuth, REST
Data visualisation and web apps: D3, R Shiny
A programming language is the syntax and style in which programs are written.
A platform is the execution environment for running programs written in the programming
language.
A framework is a set of libraries containing built-in functions/methods, data
structures and classes used for developing desktop or web applications.
An API is the interface of the framework.
A desktop application (also called a standalone application or executable) is a computer
program that is installed on a desktop computer and runs locally, without needing an
internet connection to start. Examples include Google Chrome and Notepad++.
A web application is a client-server computer program in which the client side/GUI runs
in a web browser and communicates with the server computer via an internet
connection. In computing, a server is a computer designed to process requests and
deliver data to another computer over the internet or a local network. A web
server serves pages and runs apps that are accessed through web browsers.
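A minimal sketch of the client-server exchange described above, using only Python's standard library. HelloHandler and fetch_once are hypothetical names; the "client" here is a plain HTTP request rather than a real browser, and both ends run on the local machine.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: process a GET request and deliver data back."""

    def do_GET(self):
        body = b"Hello from the server"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Start the server, make one client request, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        # Bypass any configured proxy so the request stays on this machine.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url, timeout=5) as response:
            return response.status, response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()


status, text = fetch_once()
print(status, text)  # 200 Hello from the server
```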
A cluster refers to an implementation of shared computation. A server cluster shares
the computation that a single computer would do amongst multiple servers, so it is a
1:M client-to-servers relationship.
In distributed computing, each processor has its own private memory (distributed
memory). Information is exchanged by passing messages between the
processors. Server clustering is used for high-performance distributed computing.
In parallel computing, all processors have access to a shared memory to exchange
information between processors.
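The two styles can be contrasted in a small sketch. This is an analogy only: threads inside one Python process stand in for processors, whereas real distributed computing runs on separate machines that exchange messages over a network. The worker function names are hypothetical.

```python
import queue
import threading

numbers = list(range(1, 101))
chunks = [numbers[i::4] for i in range(4)]  # split the work across four workers

# Shared-memory style: all workers update one shared total, guarded by a lock.
shared = {"total": 0}
lock = threading.Lock()


def shared_worker(chunk):
    partial = sum(chunk)
    with lock:
        shared["total"] += partial


# Message-passing style: each worker keeps its result private and sends a message.
inbox = queue.Queue()


def message_worker(chunk):
    inbox.put(sum(chunk))


def run(worker):
    """Run one worker thread per chunk and wait for all of them."""
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run(shared_worker)
run(message_worker)
message_total = sum(inbox.get() for _ in chunks)

print(shared["total"], message_total)  # 5050 5050
```

Both approaches reach the same sum; the difference is whether workers touch common state directly or only communicate results.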
https://medium.com/@martinomburajr/java-create-your-own-hello-world-server-2ca33b6957e
https://medium.freecodecamp.org/lessons-learned-from-deploying-my-first-full-stack-web-application-34f94ec0a286
https://www.infoq.com/articles/raw-data-to-data-science?utm_source=infoq&utm_campaign=user_page&utm_medium=link
https://www.lifewire.com/servers-in-computer-networking-817380
https://coderanch.com/t/636684/java/Difference-Web-application-desktop-application
