
A Bit of History!

The history of visualizing data dates back to the 18th and 19th centuries, an era
filled with patriotic wars.

One of the most renowned of these wars was Napoleon's invasion of Russia in 1812.
It is considered one of the most significant campaigns of its time, involving
hundreds of thousands of soldiers and casualties on a massive scale.

Charles Joseph Minard charted Napoleon's march and retreat in what is often called
one of the most magnificent visualizations ever made.

Let us retreat from history and march towards the concept of Data Visualization!

Data and Visualization


In the era of digitalization, data evolved from scarce and expensive to abundant,
cheap, and very difficult to process.

And if you are not familiar with Data Science concepts, the data can remain obscure.

That is where Data Visualization comes into the picture to rescue us.

Fact
Edward Tufte describes Charles Joseph Minard's map of Napoleon's march as possibly
the best statistical graphic ever drawn.

Data Visualization
The glorious value of a picture is when it stimulates us to notice what we never
expected to see.

Definition

Data Visualization takes in raw data and transforms it into charts, graphs, and
images that explain the numbers clearly and help us gain insights from them.

Prelude
Every phenomenon in the world abides by the standards of its own. It is time to
parade towards the Principles of Design.

Elements
More than the inspiration, it is necessary to understand the basics of the subject
to create a beautiful design. Before advancing to the principles, we shall halt to
get ourselves braced up by knowing the seven Elements of Design.

Elements of Design

Line
Color
Shape
Texture
Value
Space
Size

Principles of Design
Balance

It is essential to balance the visual elements of the design.

Types of Balance

Symmetrical
Asymmetrical
Radial

Emphasis

Important data can be emphasized with color, size, or contrast to draw the
attention of users.

Movement

The movement principle helps draw the user's focus in a certain direction.
It is implemented in animations and interactive services.

Smart Use of Patterns

Patterns are formed when objects are repeated.
They help in displaying objects that are similar to each other.

Proportion

Deals with the size of objects.
It indicates the weights of different datasets and the relationship between values.

Example:

If you are asked to draw a bird sitting on a tree, you will draw the tree
prominently and the bird smaller.
In a pie chart, the slice representing 50% will be more prominent than the slice
representing 30%.

Proper Rhythm
Closely associated with movement.
The movement of visual elements must be pleasing to the eye for it to be called a
proper rhythm.

Variety

A critical factor in keeping users engaged.
More variety in a visualization can increase the amount of information that the
user remembers.

Theme

A unified theme assures the harmony of the design.

Prelude
Edward Tufte, a theorist of analytical design who is well known for his book The
Visual Display of Quantitative Information, has stated six principles. Let us
unveil them in this topic.

Edward Tufte's Principles


The six principles stated by Edward Tufte for an effective Data Visualization are
as follows:

Graphical Integrity
Maximize Data-Ink
Avoid Chart Junk
Aim for High Data Density
Use Classic Design Solutions
Apply Aesthetics and Techniques

1. Graphical Integrity
Definition

Visual representations of data must convey the truth.

Measure

$$\text{Lie Factor} = \dfrac{\text{Size of effect shown in graphic}}{\text{Size of effect in data}}$$
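The Lie Factor can be computed directly once both effect sizes are known. The sketch below assumes that "size of effect" means the relative change between two values, which the text above does not spell out; the function names and the sample numbers are hypothetical and only illustrate the arithmetic.

```python
def effect_size(first: float, second: float) -> float:
    """Relative change from the first value to the second (an assumed definition)."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(data_first, data_second, graphic_first, graphic_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    shown = effect_size(graphic_first, graphic_second)
    actual = effect_size(data_first, data_second)
    if actual == 0:
        raise ValueError("the data shows no effect, so the ratio is undefined")
    return shown / actual


# Hypothetical numbers: the data grows from 10 to 15 (a 50% increase),
# but the bars drawn for it grow from 2 cm to 5 cm (a 150% increase).
print(lie_factor(10, 15, 2, 5))  # 3.0
```

A value close to 1 suggests that the graphic represents the data faithfully; values well above or below 1 indicate that the graphic exaggerates or understates the effect.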

Principles of Graphical Integrity

Representation of numbers, as physically measured on the surface of the graph
itself, should be directly proportional to the numerical quantities represented.
Clear, detailed, and thorough labeling should be used to defeat graphical
distortion and ambiguity.
Write the explanations of the data on the graph itself.
Label important events in the data.
Show data variation, not design variation.

In time-series displays of money, deflated and standardized units of monetary
measurement are nearly always better than the nominal units.
The number of information-carrying dimensions depicted should not exceed the number
of dimensions in the data.
Graphics must not quote data out of context.

2. Maximize Data-Ink
Definition

Data-ink is the ink on a graph that represents data.
Good graphical representations maximize data-ink and erase as much non-data-ink as
possible.

Measure

$$\text{Data-Ink Ratio} = \dfrac{\text{Data-ink}}{\text{Total ink used in graph}}$$

It is equivalent to 1 minus the proportion of the graph that can be erased without
loss of data-information.
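As a rough illustration of the ratio and its "1 minus" form, the sketch below treats ink as a measurable quantity such as a pixel count. That measurement approach, the function names, and the sample numbers are assumptions made for the example.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of the total ink in a graph that represents data."""
    if total_ink <= 0:
        raise ValueError("total ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data-ink must lie between 0 and the total ink")
    return data_ink / total_ink


def erasable_proportion(data_ink: float, total_ink: float) -> float:
    """Share of the graph that can be erased without losing data-information."""
    return 1 - data_ink_ratio(data_ink, total_ink)


# Hypothetical chart: 300 of its 1200 inked pixels encode data.
print(data_ink_ratio(300, 1200))       # 0.25
print(erasable_proportion(300, 1200))  # 0.75
```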

Fact
An electroencephalogram has a very high data-ink ratio, close to 1.

Principles of Data-Ink

Above all else show the data
Maximize the data-ink ratio
Erase non-data-ink
Erase redundant data-ink
Revise and edit

3. Avoid Chart Junk


Definition
Chart junk is the excessive use of graphical effects that are not needed to
comprehend the information and that distract the viewer's attention.
The term chartjunk was coined by Edward Tufte.

4. Aim for High Data Density


Definition

The proportion of the total size of the graph that is dedicated to displaying
data.

Shrink Principle

Maximize data density and the size of the data matrix, within reason. This is
attained through the Shrink Principle:
most graphs can be shrunk way down without losing information.

Did You Know?


The human eye cannot judge circular distances as accurately as linear
distances.

5. Use Classic Design Solutions


Classic Design Solutions

Small Multiples - A series of the same small graph repeated in one visual.
Sparklines - Data-intense, simply designed, word-sized graphics.
Time Series - One-dimensional graphs, usually horizontal, in which the graphic
shows variation as time proceeds.
Micro/Macro Composition - An approach in which the visualization contains enormous
detail, yet an overall pattern emerges.
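To make the idea of a word-sized graphic concrete, here is a minimal sketch of a text sparkline. It assumes that Unicode block characters are an acceptable stand-in for a drawn line; the function name and the sample values are hypothetical.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight bar heights, lowest to highest


def sparkline(values) -> str:
    """Map each value to a block character scaled between the min and max."""
    values = list(values)
    if not values:
        return ""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        # All values are equal, so draw a flat line at the lowest height.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - low) / span * top)] for v in values)


# Hypothetical weekly sales figures, small enough to sit inside a sentence.
print("Sales this week:", sparkline([3, 5, 4, 8, 12, 9, 6]))
```

Because the result is a plain string, it can be placed inline next to the number it summarizes, which is the point of a sparkline.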

6. Apply Aesthetics and Techniques


Principles

Have a properly chosen format and design
Reflect balance, proportion, and a sense of relevant scale
Display an accessible complexity of detail
Have a narrative quality, a story to tell about the data
Draw elements in a professional manner
Avoid content-free decoration, including chartjunk

Prologue
It is time to troop towards the next topic of Data Visualization tools.

Neil Gershenfeld said:

Give ordinary people the right tools, and they will design and build the most
extraordinary things.
Tableau
Tableau is well suited for handling massive and emerging datasets.
It is used in:
Big Data Operations
Artificial intelligence applications
Machine Learning applications
It is integrated with advanced database solutions, namely:
Hadoop
SAP
Teradata
MySQL
Creates effective visualizations.

QlikView
QlikView offers powerful visualization capabilities.
It provides a clean and interactive UI.
It also provides:
Powerful Business Intelligence operations
Analytics
Enterprise Reporting Capabilities
Qlik Sense - A companion product from the same vendor that handles data exploration and discovery.
QlikView has an active community to guide new users in tool integration.

FusionCharts
FusionCharts is a JavaScript-based visualization package.
It produces 90 different chart types.
FusionCharts framework provides a great deal of flexibility.

Plotly
Plotly supports more complex and sophisticated data visualizations.
Integrated with analytics-oriented programming languages, namely:
R
Python
MATLAB
Built on top of the open-source d3.js
Integrated with Salesforce

Sisense
Sisense provides a platform with full-stack analytical capability.
Visualizations can be created with a simple drag-and-drop interface.
Aids in the integration of data from multiple sources, which can be queried when
required.

Others
Other powerful Data Visualization tools include:

D3.js
R Charts (ggplot2 package)
Pentaho
SAP Lumira
TIBCO Spotfire
JasperSoft
MicroStrategy

Explore these Courses!


Tableau: The Sequel
QlikView
TIBCO Spotfire - Deuce
Explore with D3js
Data Visualization with R

Charts and Plots


Never can we leave out charts and plots when the topic of Visualization comes into
the picture.

Let us explore a few to know more.

Line Charts
Line charts are well suited to analyzing a trend over a period of time.

They help compare relative changes in quantities against the time variable.

Bar Plots
Bar plots are chosen to compare cumulative totals across several groups.

Box Plots
Box plots portray five summary statistics, namely:

Minimum
25th percentile
Median
75th percentile
Maximum
They aid in visualizing the range of the data and in deriving inferences accordingly.
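The five numbers behind a box plot can be computed with the Python standard library. This is a minimal sketch: the function name and sample data are hypothetical, and the "inclusive" quartile method is one of several conventions, so other tools may report slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, 25th percentile, median, 75th percentile, maximum)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data points")
    # quantiles(n=4) returns the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical sample of nine observations.
print(five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6]))  # (1, 3.0, 5.0, 7.0, 9)
```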

Scatter Plots
Scatter plots reveal the relationship or association between two variables: the
extent to which one variable changes with another.

Color-coding the points helps in inspecting additional variables simultaneously.
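One common way to put a number on the association that a scatter plot shows is the Pearson correlation coefficient. The text does not name a specific measure, so this choice is an assumption, as are the function name and the sample data. Note that correlation describes how two variables move together, not whether one causes the other.

```python
import math


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariance = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariance / denominator


# Hypothetical data: hours studied versus exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 74]
print(round(pearson_r(hours, scores), 3))
```

Values near 1 or -1 correspond to points that fall close to a straight line in the scatter plot; values near 0 correspond to a shapeless cloud.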

Decision Trees
Decision Trees are excellent tools that help in choosing the right action among
several courses of action.

They provide a highly effective structure to lay out options and investigate the
possible outcomes of choosing those options.

Histograms
Histograms are used to plot quantitative data, and the ranges of the data are
grouped into bins or intervals.

Histograms show distributions of variables while bar charts compare variables.
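The grouping of quantitative data into bins can be sketched in a few lines. The example assumes equal-width bins in which the last bin includes the upper bound; the function name, parameters, and sample values are hypothetical.

```python
def histogram_counts(values, bin_count, low=None, high=None):
    """Count how many values fall into each of bin_count equal-width bins."""
    values = list(values)
    if bin_count < 1 or not values:
        raise ValueError("need at least one bin and one value")
    low = min(values) if low is None else low
    high = max(values) if high is None else high
    if high <= low:
        raise ValueError("high must be greater than low")
    width = (high - low) / bin_count
    counts = [0] * bin_count
    for v in values:
        if not low <= v <= high:
            continue  # values outside the range are ignored
        # The top edge belongs to the last bin rather than a new one.
        index = min(int((v - low) / width), bin_count - 1)
        counts[index] += 1
    return counts


# Hypothetical measurements grouped into three bins: [1, 4), [4, 7), [7, 10].
print(histogram_counts([1, 2, 2, 3, 5, 7, 8, 9, 10], bin_count=3))  # [4, 1, 4]
```

Each count becomes the height of one bar, which is why a histogram shows the distribution of a single variable rather than a comparison between categories.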

Prelude
Explore the fruits of visualization here!

A Success Story
In 1854, there emerged a question. The question was:

What is causing the cholera epidemic in London?

Data Visualization came to the rescue in this situation.


London Epidemic
The figure above is the one credited with saving thousands of lives.

Through this visualization, John Snow showed that the cholera epidemic was
caused by a contaminated water pump.
The red dots in the figure indicate the locations of deaths.

End of the March


John Tukey said:

The greatest value of a picture is when it forces us to notice what we never
expected to see.

Conquer more of Data Visualization to brace yourselves!
