Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Data Visualization
for Mumbai
Akshay Kore
Interaction Design
M.Des (2014-16)
2 of 82
Data Visualization for Mumbai MumbaiData
Declaration
I declare that this written document represents my ideas in my own words
and where others’ ideas or words have been included, I have adequately cited
and referenced the original sources. I also declare that I have adhered to all
principles of academic honesty and integrity and have not misrepresented or
fabricated or falsified any idea/data/fact/ source in my submission. I
understand that any violation of the above will be cause for disciplinary
action by the Institute and can also evoke penal action from the sources
which have thus not been properly cited or from whom proper permission
has not been taken when needed.
Akshay Kore
146330007
November 2016
3 of 82
Data Visualization for Mumbai MumbaiData
4 of 82
Data Visualization for Mumbai MumbaiData
5 of 82
Data Visualization for Mumbai MumbaiData
6 of 82
Data Visualization for Mumbai MumbaiData
Acknowledgements
I would like to thank Prof. Venkatesh Rajamanickam for his guidance and
support through the tenure of this project. I am grateful to Prof. Anirudha
Joshi, Prof. Girish Dalvi, Prof. Ravi Poovaiah, Prof. Uday Athavankar at IDC
for their valuable feedback and suggestions. I would also like to thank Prof.
Arnab Jana from C-USE IIT Bombay for his feedback during the initial
phases of the project. I am grateful to the staff and authorities at the
Municipal Corporation of Greater Mumbai, The Urban Design Research
Institute in Mumbai, MMRDA, MIDC Mumbai and the Development
Commissioner’s Office, SEEPZ SEZ for their help and cooperation.
Feedback and guidance from Ar. Namrata Kapoor, Ar. Anuj Gudekar,
Ar. Mugdha Deshpande, Ar. Mridula Pillai, Ar. Shreyas Khemalapure,
Ar. Maitreyi Gulawani, Mr. Ramesh Palan Dr. Ms. Aparajita Bakshi and
Prof. Himanshu Burte was of paramount importance. Most importantly I
would like to thank Shweta Deo, Ruchita Chandsarkar, Indrajeet Roy, Prasad
Ghone, Jayati Bandyopadhyay, Dileep Mohanan, Sitara Shah and Sagar Yende
for their help, motivation and support.
Thank you to The Library. The Internet. Quora.
7 of 82
Data Visualization for Mumbai MumbaiData
8 of 82
Data Visualization for Mumbai MumbaiData
Abstract
Our governments create and release vast quantities of data about us and our
surroundings by means of the census, budgets, human development reports,
lan use studies , etc. This data is released in the public domain and is regularly
accessed and used by policy and decision makers in both the public and
private domain.
Visualization of enables the analysis of data and especially when the data is
available in vas quantities. These decision and policy makers make use of data
visualizations regularly. Visualization of data done by these individuals or
groups is most often done with a purpose in mind. However, insights can
come from anywhere, in seemingly disparate correlations of two or more
datasets. Access to these datasets is often difficult due to the scattered nature
of our government bodies and even after the access is gained, the datasets
come in different tangible and intangible file formats making direct use of
these datasets for visualization a difficult and often a laborious task.
With the city of Mumbai as an example, this project aims to enable an easy
way to visualize the city’s spatial data. This is done by means of a web based
application called the MumbaiData Visualization Tool. The higher goal is to
enable faster and better decision making by means of data visualization and
set an example for all city governments in the country.
The project is open to use freely and is made available at http://
www.mumbaidata.in/
9 of 82
Data Visualization for Mumbai MumbaiData
10 of 82
Data Visualization for Mumbai MumbaiData
Table of Contents
6.8.Final Concept 56
7. Scenarios 70
1. Introduction 13
7.1.Scenario 1 - Malaria Breeding Grounds 70
2. Secondary Research 16
7.2.Scenario 2 - High Pollution Areas 71
2.1.Open Data 16
7.3.Scenario 3 - Traffic Management in Floods 72
2.2.Big Data 19
7.4.Scenario 4 - Building a School for Underprivileged 73
2.3.Mumbai’s Data 22
7.5.Scenario 5 - Anomalies in Electricity Consumption 74
3. Preliminary Design Ideas 28
8. Limitations 75
4. User Studies 32
9. Evaluation Plan 78
4.1.Users 32
9.1.Objectives 78
4.2.Contextual Enquiry 32
9.2.User Testing Protocol 78
4.3.Affinity Mapping 33
Bibliography 80
4.4.Insights 33
4.5.Need Gaps 38
5. Initial Design Ideas 42
6. Final Design 48
6.1.Key Features 48
6.2.Data Layer Types 49
6.3.Key user Tasks 50
6.4.Information Architecture 50
6.5.Metaphor 51
6.6.Initial Concept 51
6.7.Heuristic Evaluation 55
11 of 82
Data Visualization for Mumbai MumbaiData
12 of 82
Data Visualization for Mumbai MumbaiData
SOURCE : MCGM
SOURCE : WIKI IMAGES NOT TO SCALE
“The thing about Mumbai is you go five yards
and all of human existence is revealed.”
Julian Sands
13 of 82
Data Visualization for Mumbai MumbaiData
14 of 82
Data Visualization for Mumbai MumbaiData
Secondary Research
15 of 82
Data Visualization for Mumbai MumbaiData
various governments (such as the USA, UK, vast dataset is largely untapped. This is
2.Secondary Canada and New Zealand) announcing new mainly due to lack of access and a dearth of
initiatives towards opening up their public innovative ways to merge, overlap these
Research information2. The Indian Government has
also released some of its data as a part of
seemingly disparate datasets to uncover what
the data says, to find meaningful insights.
To attempt to solve to problem to the Open Government Data initiative on the
visualizing vast quantities of data, it is website : https://data.gov.in/ . The data is “The nature of innovation is that developments
important to look at what this data is and available in a table format as well as APIs for often comes from unlikely places.”
might be. The secondary research examines some datasets. Open Data Handbook
the concept of Open Data which is one one
of the main types of data that the project “Open data is data that can be freely used, re- The value of these large datasets lies in how
attempts to use. The concept of Big Data used and redistributed by anyone - subject only, they can be reused, recombined in multiple
has been explored since data about the city at most, to the requirement to attribute and ways. This can influence many areas of
sharealike.” 3 public and private life.
and its people is a large quantity of data and
a large number of data points have been
This large volume of data unlocks a new
measured. Finally the data about the city has potential of using bureaucratic and other
been looked into to imbibe a familiarity with datasets and information to enable new
the kinds of datasets currently available. policies, services; thereby improving the lives
of citizen, making the government and
2.1.Open Data society work better.
Open data is the idea that certain data
should be freely available for everyone to use Increase in efficiency of processors and
and republish as they wish, without declining costs of storing data leads to a
restrictions from copyright, patents or other tremendous amount of resources to capture
mechanisms of control1. The notion of and store data. This has led to vast quantities
Open Data and especially Open of data about individuals, companies,
Government Data has been around for climate, economy, health , etc. Every type of
some years. In 2009 open data started to data that can be quantified, is being
become visible in the mainstream, with quantified. However, the potential of this
1 http://en.wikipedia.org/wiki/Open_data
2 http://opendatahandbook.org/guide/en/introduction/
3 http://opendefinition.org/
16 of 82
Data Visualization for Mumbai MumbaiData
Some of these areas include1: easily find out where you can walk your dog,
Examples as well as find other people who use the
• Transparency and democratic control
• Participation
Open government data can also help you to same parks. Services like ‘mapumental’ in the
make better decisions in your own life, or UK and ‘mapnificent’ in Germany allow you
• Self-empowerment
enable you to be more active in society.
to find places to live, taking into account the
• Improved or new private products and A woman in Denmark built findtoilet.dk, duration of your commute to work, housing
services which showed all the Danish public toilets, prices, and how beautiful an area is. All these
• Innovation so that people she knew with bladder examples use open government data.2
• Improved efficiency of government problems can now trust themselves to go
services out more again. In the Netherlands a service,
• Improved effectiveness of government vervuilingsalarm.nl, is available which warns Openness of Data
services you with a message if the air-quality in your Openness of data means that individuals
• Impact measurement of policies vicinity is going to reach a self-defined and institutions are free to use the data in
threshold tomorrow. In New York you can whichever means they please without
• New knowledge from combined data
sources and patterns in large data volumes restrictions from copyright, patents or other
mechanisms. The data provided should be
Governments in that sense, collect large accessible cheaply or for free through
quantities of data by means of the census, downloads over the internet.
budgets, livelihood, etc. They are significant
due to the quantity and the centrality of the This easy access is beneficial for others to
data and it is binding on the government to reuse in novel ways.
provide this data to its citizen. Most of this
data is public data and could be made open
for individuals or institutions to freely
analyze and generate new knowledge. There
are numerous examples of individuals using
Open Government Data in clever ways,
thereby improving the lives of fellow
citizens.
SOURCE : FINDTOILET.DK
1 http://opendatahandbook.org/
2 http://opendatahandbook.org/guide/en/why-open-data/
17 of 82
Data Visualization for Mumbai MumbaiData
Here are a few policies in the openness of there will always be exceptions to providing
data that will be of great benefit 1: data related to national security and the likes.
• Keep it simple : The data provided There are endless possibilities of using and
should be easily readable by a machine. So reusing Open Data and overlapping different
no PDFs and MS Word files, as it is datasets. Like how Dr.John Snow used the
difficult to parse these to access the data. map of England and location of cholera
• Move fast : It is important to have a victims to find the cause of the outbreak at
constant inflow of data and to digitize the dawn of the 19th century. This led to
what is already available. discovering that cholera was a waterborne
disease. There are numerous opportunities
• Be pragmatic : It is impossible to have now, more than ever to use data since data is
accurate data at all times and good enough so abundantly available.
is good enough. More data trumps
accurate data. In particular it is better to Unexpected insights can be gained from
give out raw data now than perfect data in combinations of different datasets and
six months’ time. Note that the data maybe the next big discovery or invention
available should not be incorrect and will probably come out of these methods.
should only give room for inaccuracy.
1 http://opendatahandbook.org/
18 of 82
Data Visualization for Mumbai MumbaiData
2.2.Big Data consuming and expensive to analyze such messiness of the world and the
large datasets. With the increase in accompanying data.
We live in a world with an abundance of processing power and cheaper storage, it has
data. Every aspect of our lives including our This is contrary to traditional wisdom for
become far easier to use in many cases, all of
identity, our location and our relationships want of accurate data to do accurate
the data that is available.
with other people, governments, companies, analysis. Getting completely accurate data is
etc are being datafied. Datafication means Random sampling reduces Big Data easy and viable for small data problems.
conversion of information in a useful problems to small data problems. However, While dealing with Big Data problems, it is
format either as numbers, visuals, etc. Our this is the second best alternative. In the time consuming to procure completely
Governments collect and store vast amounts world of Big Data, the more data you have accurate data and in many cases very
of data about us and our fellow citizens. the better. Simple models of processing data expensive. This is besides the fact that often
They are vast storehouses of data. Data can accompanied by a lot of data are better than acquiring perfectly accurate data is
be defined as something that allows it to be elaborate models with only samples of data. downright impossible under the current
recorded, analyzed and reorganized1. This data includes all the data along with the system.
outliers. It is the job of the person who
Big Data can be defined as extremely large Take the example of population data. It is
analyses the data to make note of these
data sets that may be analyzed common knowledge that it is very difficult
outliers and the frequency of their
computationally to reveal patterns, trends, to know the exact population today, for
occurrences. The people handling Big Data
and associations, especially relating to human children are born and people die every
problems must remember that all the data is
behavior and interactions 2. In that sense, minute and it will require systems in place to
better than samples of data.
problems of population, education, health, count the exact population at any given time.
transportation, governance are Big Data
problems since they deal with a vast Messiness
quantities of data. And as such they must be Population of India:
dealt with that mindset. It is very rare that a particular data set,
moreover a Big Data set would be
completely accurate. As users of Big Data,
1,210,569,573
N = all we need to forgive the inaccuracies and be Population of India:
comfortable about the fact that large
In many traditional institutions, for analyzing
situations and data, random samples of data
datasets are bound to be inaccurate till a 1.2 Billion
certain extent. We need to embrace the
are used since it was difficult, time
19 of 82
Data Visualization for Mumbai MumbaiData
This would be difficult and expensive. Even to be messy and we need to solve these It is much better to use a system of tagging
so, we are comfortable with this knowledge problems with messiness in mind. data points rather than categorizing them in
and inaccuracy for there is not much a hierarchy. Tagging enables the data points
difference between saying that India’s “Treating data as something imperfect and to have an affinity with other data points and
population is 1,210,569,573 or 1.2 Billion1. imprecise let’s us make superior forecasts3.” leaves room for more correlations incase
We are fine with this estimate. As scale there are new insights or a new set of data is
increases, showing exact numbers becomes available.
less important. Hierarchy
Hierarchical systems work well when the Tagging data points can get messy, however
“The world’s we seek to understand are complex number of data points are small. In case of the imprecision and messiness inherent in
and intricate2”
small data, the categorization of data is tagging is about accepting the natural chaos
Edward Tufte of the world. It makes the vastness of
simpler. It is possible to SLIP (sort, label,
integrate, prioritize) 4 the data. content more navigable5.
To solve Big Data problems, we need to
accept this inaccuracy. Sacrifice accuracy What happens when data points overlap and
since it is expensive and time consuming. We when they cannot fit in a particular box ? Do
need to realize that Big Data is always going we copy them again in another hierarchy
element ? Imagine this one data point having
affinity with a hundred different data points.
It would be unwise to make a hundred
copies.
Hierarchical systems fall apart in the face of
Big Data problems. Although it is still
possible to categorize, there is a much
Copy simpler method based on correlations rather
than categorization.
Hierarchical Systems often tend to create multiple copies of the same data Tagging of Data enables correlations without creating copies
points when handling Big Data
1 Census 2011
2 Envisioning Information: Edward Tufte
3 Big Data: Victor Mayer-Schonberger and Kenneth Cukier
4 Laws of Simplicity: John Maeda
5 Big Data: Victor Mayer-Schonberger and Kenneth Cukier
20 of 82
Data Visualization for Mumbai MumbaiData
flights and much more. All this from a single time; so we need to watch out for B to
Correlations image. predict that A will happen.
When dealing with large datasets, as seen
earlier it is better to tag and correlate than to We have since a long time depended on the A study says that women are happier than
categorize and bucket the data points in a method of starting out with a hypothesis men. Most women have longer hair. So the
hierarchy. Some type of data might fall in and analyzing the data to prove or disprove data suggest that there is a connection
multiple categories. Take for example a book the latter. But with so much data available, between long hair and happiness. This is a
on the Gestalt Theory; a person from social the data starts to speak for itself. We need weak correlation. Correlation is a powerful
sciences may place it in the Psychology not start with a hypothesis. tool, though we need to watch out for such
bucket whereas it is also welcome in the weak correlations.
Graphic Design bucket; it does not mean
Mistaking correlation for causation almost
that you buy two books. If you were to tag
led the state of Illinois, USA to send books
these books, you could easily tag it as
to every child in the state because studies
Psychology and Graphic Design, thereby
showed that books in the home correlated to
solving the problem of categorization. This
higher test scores. Later studies showed that
feeble example gives an idea of the powerful
children from homes with many books did
concept of tagging in the Big Data world.
better even if they never read, leading
Tagging in this case may also be called
researches to correct their assumptions with
Metadata. Metadata simply means data about
the realization that homes where parents buy
data.
books have an environment where learning
Correlations are extremely useful when is encouraged and rewarded1 . Our
trying to solve big data problems. Large AARON KOBLIN’S FLIGHT PATTERNS Governments too don’t have money to
SOURCE : GOOGLE IMAGES
problems contain large amounts of data and waste going in the wrong direction.
Humans have an innate need to find causal
with enough data, the numbers speak for connections. It is easy to find numerous Big Data problems bring with them Big
themselves. A correlation is enough. examples in mythology where one event is opportunities. We can solve large problems,
Correlations shine in the Big Data world, caused by another. When dealing with huge make discoveries and even predict the state
like in the above visualization by Aaron amounts of data, one needs to be aware of of world. It turns us into seers and wizards
Koblin on the flight patterns in the United the subtle difference between correlation with a crystal ball, to look into the future.
States. One can see the busiest airports, and causality. In simple terms, causality is
density of arrivals and departures, airports when A causes B. Correlation on the other
with the maximum number of international hand means that A and B happen at the
21 of 82
Data Visualization for Mumbai MumbaiData
1 Talk on 22nd April in IIT Bombay by Mr. Vidyadhar Phatak, Urban Planner
2 Census 2011
3 Preparatory Studies DP 2014-34 : MCGM
22 of 82
Data Visualization for Mumbai MumbaiData
6
SPAs
9%
23 of 82
Data Visualization for Mumbai MumbaiData
Dharavi Road Accident prone spots Ward wise FIRs 2011 - 2014
Cessed Buildings Waste management facilities Ward wise population Data 2001-2011
Heritage Sites and Buildings Settlements not served by sewer lines Declining Growth rate of Population
Ready Reckoner Rates (Housing prices) Livelihood Slum and Non-Slum population
24 of 82
Data Visualization for Mumbai MumbaiData
expensive and laborious task and is an account for a sizable part of the ward’s
Problems with the City Datasets unnecessary step in the analysis of data. overall area. There are a number of
Despite the availability of so much data, it is examples like these in different wards.
very difficult for a person looking at the data
Incomplete Datasets
to make any sense of it.
Ward wise ELU data has been provided by Temporal Data types
the MCGM. However, this data is grossly
Tabular Structure The temporal data available can be either
incomplete for some wards, since parts of pointed out on a map or is a value that
Most of the data is in a tabular format. One these wards fall under the SPAs, which is cannot be measured by its physical
can infer little by looking at tables and maps falls out of the MCGMs jurisdiction. dimensions. There would also be data that
with a long legend next to it.
An example of this can be seen with ward falls into both these categories. It needs to
H/E. A major part of the wards’ area falls be noted that all the data being used is
under the Airport and Bandra-Kurla quantifiable in nature i.e. we can define its
Complex SPA. However, the ELU data that position by means of a co-ordinate system
has been captured excludes these ares, which or it is a number like the population of a
ward. For this reason, it makes sense to
divide the data type into spatial and non-
spatial.
Spatial Data : It is data that can be pointed
out on a map. It is situated in a space. The
SOURCE : ELU 2012, SCE INDIA PVT LTD value of the data can be extracted from map,
co-ordinates, etc in 2D space and 3D co-
Multiple File Formats ordinates in a 3D space.
Another problem with the data available is Non-Spatial Data : It is data that cannot be
that it is spread across multiple PDF or pointed out on a map, it exists as a number
Tabular format files. There is little inference in time.
that can be gained from these files and it
becomes difficult to use the data in a 1 + 1 = 3 in the case of data. Much value
constructive manner. Multiple file formats can be gained from data sets if they are
often prevents easy machine readability of WARD H/E : A LARGE PART OF THE WARD’S AREA FALLS UNDER SPA looked at in context with each other. An
the data. Converting these files into a AND THEREFORE HAS BEEN EXCLUDED FROM THE WARD DATA.
affinity can be identified between different
machine readable format is often an SOURCE : ELU MAPS, MCGM
MAP NOT TO SCALE
and almost disparate data sets to gain new
insights and generate new knowledge.
25 of 82
Data Visualization for Mumbai MumbaiData
Spatial Data Non-Spatial Data *The Existing Land Use data includes data
about :
Spatial Data Non-Spatial Data
Intertidal Zones Air quality • Medical Amenities
Ready Reckoner • Educational Amenities
Existing Land Use*
rates • Open Space
• Natural Spaces and Water Bodies
CRZ boundaries Ward wise FIRs
Cessed Buildings Top 5 diseases • Residential areas
• Urban Villages
Heritage sites and Ward wise • Commercial Areas
Top 5 sensitive Ward wise tree
buildings + ASI population - Slum
diseases cover • Offices
sites and Non-Slum
• Industrial Areas
Gender ratio in • Social Amenities
Slums
Primary education Budget 2015-16 • Public Utilities and Facilities
Existing and Dropout numbers • Transport and Communication Facilities
proposed transport in Primary • Primary Activity
networks education • Vacant Land
• Unclassified Areas
Road accident prone Number of teachers
spots The ELU data for all the wards has been
taken from MCGM as well as from the SPA
Auto-rickshaw and Peak time traffic authorities which include MMRDA, MIDC
Taxi stands records and MHADA
Heavy Pedestrian Employment rate
Congestion areas
Settlements not
served by sewer Number of vehicles
lines
26 of 82
Data Visualization for Mumbai MumbaiData
140+
Sub-Categories in
the ELU
27 of 82
Data Visualization for Mumbai MumbaiData
from various other government sources would be used to generate the BETWEEN DIFFERENT CITIES
organizations such as the MMRDA, MIDC, visualizations. In some cases, only relevant SOURCE: “MUMBAI MUSEUM:
A FACELIFT” THESIS BY
SEEPZ and others mainly for land use data data for a particular visualization may be AKSHAY KORE
for the SPAs as well as NGOs and other used. A systemic design problem would be
organizations like Praja, The Urban Design to create a system for visualization of the
and Research Institute (UDRI), Mashal, etc data to for reasons of scalability incase more
for other useful data like crime rate, diverse and relevant data becomes available.
education among other data sets.
“Re-describing the world is a necessary first step
towards changing it1.”
Salman Rushdie
28 of 82
Data Visualization for Mumbai MumbaiData
UDPFI
COMPARISON OF
OPEN SPACE PER
CAPITA BETWEEN
DIFFERENT
URBAN DESIGN
STANDARDS AND
MUMBAI
VISUALIZATION OF THE TREES, CRIMES AND TAXI CAB
LOCATION IN NEW YORK.
AN EXPLORATORY VISUALIZATION OF THE AGE OF BUILDINGS IN
SOURCE : HTTP://WWW.ARCHITECTMAGAZINE.COM
NEW YORK.
SOURCE : HTTP://PUREINFORMATION.NET/BUILDING-AGE-NYC/
Mumbai
Isolated Data Visualization
Exploratory Data Visualization Another strategy is to create a series of static
and time based visualizations that highlight
It might not always be the case where the certain aspects of the quantitative data about
creator of the data visualization may know the city. These could be in the form of a
the intended use, or all the insights the series of posters or interactive visualizations
visualizations shows. Too much data also is that are static or time based. This approach
confusing. Giving the user a way to explore would require generation of multiple
the datasets and find new insights is also a isolated visualizations.
possible way to present the data. A system
of layering datasets for multiple “Simplicity of reading derives from the context
interpretations can be adopted. The purpose of detailed and complex information, properly
of an exploratory visualization is to reveal arranged. A most unconventional design strategy
patterns and relationships not known so is revealed: to clarify, to detail 1.”
Edward Tufte
easily deduced without the aide of a visual
representation. This type of visualization
helps in producing new knowledge.
29 of 82
Data Visualization for Mumbai MumbaiData
30 of 82
Data Visualization for Mumbai MumbaiData
User Studies
31 of 82
Data Visualization for Mumbai MumbaiData
4.2.Contextual Enquiry
4.User Studies A method of contextual enquiry was
Questionnaire
The users overall were asked the following
For the purpose of deciding on the employed to conduct the user study. The aim questions during the semi-structured
approach to be taken while creating the data of the enquiry was to generate insights interview, although not necessarily in the
visualization for Mumbai, a user study was about the type of visualizations used by the order shown. Some questions might have
conducted with stakeholder groups. user group and the needs and problems been skipped for certain types of users and
faced while making strategic decisions. A depending on the context. Each question is a
4.1.Users semi-structured interview was conducted for type of a conversation starter to discover
different types of users. For example, an
Users or consumers of data visualization are deeper problems and ideas in the users’
academician would be asked a slightly
varied. The users may be students, common experience. The aim was to empathize with
different set of questions than an architect
folk, teachers, architects, economists , etc. the user and probe into the users’ thought
or a policy maker. Interviews were
For the purpose of the project, it was process while working on a problem.
conducted with architects and urban
decided that the visualization would be used
planners working in both the public and • How do you use these datasets?
by policy and decision makers in
private sector. Professors from the Tata This was asked to check the type of usage
Government bodies, private organizations or
Institute of Social Sciences, Mumbai were between different user groups, whether it
NGOs. These users mainly include city and
interviewed as policy makers since they are was to generate an insight or prove a
urban planners, academicians and architects.
often involved in decisions concerning hypothesis. This would enable the
policy making and government. Two discovery of a possible usage pattern of
Academicians 1
journalists were also interviewed since they data and the type of visualizations used.
Architects 3 work extensively with open and government
data to gain insights about access of data • When was the last time you had to use
Policy 2 and generation of visualization. One one of the datasets?
Urban Planners 6 academician was interviewed to enquire This was asked to probe into the users’
about the usage of data visualizations in experience with using open datasets and
Journalist 2 classroom scenarios for hypothetical gain insights about the user journey and
Total Number of Users Interviewed 14 decision making. Many of the Architects and touch-points.
Urban Designers interviewed play or have
For the purpose of the study, the names of played a dual role of a teacher as well as a
professional. The interviews were recorded • What datasets did you use?
the interviewers has been kept anonymous. This was asked to find common data
and/or a transcript was created during the
sources used and major overlaps between
contextual enquiry.
different user preferences.
32 of 82
Data Visualization for Mumbai MumbaiData
33 of 82
Data Visualization for Mumbai MumbaiData
Evidence Gathering and Users never want a single type Accessibility of Datasets
Broadcasting of data From the affinity map, it was seen that there
was a deep resentment with the process of
Many of the visualizations were used as a The affinity showed that different users used access of data. Many of the users felt
proof of evidence. In one such example, a different types of data to generate intimidated by Government office
user who was an urban designer said that the visualizations that led to evidence and environments where they would to go to
mangroves in the users’ area were poisoned. insights. collect data or in case of availability of data
If there was some evidence in the form of a on the internet, the datasets were in many
map or a time series, the user would cases not easily searchable and were often
broadcast it on social media to gain traction hidden deep within the websites of these
and make the authorities notice the problem. authorities. One such user statement which
Since, this was not possible for the user showed the hostility was “we are not doing
given the lack of technical expertise, the user any terrorist activity”.
decided to write a blog post instead,
however it was not very successful.
34 of 82
Data Visualization for Mumbai MumbaiData
35 of 82
Data Visualization for Mumbai MumbaiData
Dynamic Datasets
Many users referred to events that happened
periodically and a decision was made by
intuition or pre-knowledge. The temporal
social infrastructure of the city keeps
changing and the data should indicate the
same to make decisions or generate insights
about the city. This has been referred to as
dynamic datasets.
36 of 82
Data Visualization for Mumbai MumbaiData
37 of 82
Data Visualization for Mumbai MumbaiData
38 of 82
Data Visualization for Mumbai MumbaiData
39 of 82
Data Visualization for Mumbai MumbaiData
40 of 82
Data Visualization for Mumbai MumbaiData
41 of 82
Data Visualization for Mumbai MumbaiData
42 of 82
Data Visualization for Mumbai MumbaiData
SKETCH FOR THE COMPARISON OF WARD DATA TOOL SKETCH FOR THE COMPARISON OF WARD DATA TOOL
43 of 82
Data Visualization for Mumbai MumbaiData
Limitations
Layering of datasets using physical and
tangible layers to generate multiple data
visualizations depending on the need and
44 of 82
Data Visualization for Mumbai MumbaiData
45 of 82
Data Visualization for Mumbai MumbaiData
46 of 82
Data Visualization for Mumbai MumbaiData
MumbaiData
47 of 82
Data Visualization for Mumbai MumbaiData
48 of 82
Data Visualization for Mumbai MumbaiData
49 of 82
Data Visualization for Mumbai MumbaiData
ADDITIONAL PAGES
6.4.Information
Architecture
The information architecture shows the
journey the user takes to create visualization. MUMBAI DATA TOOL
Legend / List of
Add Dataset Panel
Open Datasets
50 of 82
Data Visualization for Mumbai MumbaiData
6.5.Metaphor
The user group uses certain types of
government documents on a regular basis. It
is an important consideration to use some
of the users’ pre-knowledge with using data
documents as a User Interface metaphor for
the visualization tool. It is assumed that
most of the user population is familiar with
using government map data. Since, the
MumbaiData tool is a map based
application, a number of maps released by
the Government of India were analyzed and
a user interface pattern emerged. It was of a
EXAMPLES OF
legend on the right-side and the map DOCUMENTS USED BY
visualization on the left occupying a THE USER GROUP
dominant part of the document image.
SOURCE: GOOGLE IMAGES
Along with this, the scale and style of the
legend was an important UI consideration.
6.6.Initial Concept
The initial wireframe concept was an
adaptation of the metaphor of the
Government map based data documents for
use in an interactive environment. It enabled
the user to perform the key tasks of adding,
deleting, hiding and showing datasets as well
as zooming and panning of the user created VISUALIZATION FRAME
LEGEND /
VISUALIZATION
visualization.
CONTROL PANEL
METAPHOR
51 of 82
Data Visualization for Mumbai MumbaiData
ADD SELECTED
DATASETS
DESKTOP WIREFRAME
1 Most humans (say 70 percent to 95 percent) are right-handed, a minority (say 5 percent to 30 percent) are left-handed, and an indeterminate number of
people are probably best described as ambidextrous. - www.scientificamerican.com/article/why-are-more-people-right/
52 of 82
Data Visualization for Mumbai MumbaiData
DATALIST
ELEMENT
DATALIST
ELEMENT
VISUALIZATION
CONTROL PANEL
LIST OF AVAILABLE
DATASETS
SATELLITE
VIEW
TABLET WIREFRAME
53 of 82
Data Visualization for Mumbai MumbaiData
OPEN
PANEL
SATELLITE
VIEW
54 of 82
Data Visualization for Mumbai MumbaiData
1 About 8 percent of males, but only 0.5 percent of females, are color blind in some way or another, whether it is one color, a color combination, or another
mutation. - Wikipedia
55 of 82
Data Visualization for Mumbai MumbaiData
Information Architecture
56 of 82
Data Visualization for Mumbai MumbaiData
Logo
LEGEND ELEMENT
VISUALIZATION CONTROL
PANEL
SHARE VISUALIZATION
SATELLITE
VIEW
DATALIST
ELEMENT
NAME OF COLOR OF
LAYER LEGEND
CHECKBOX : SELECT
DATA LAYER TO ADD
57 of 82
Data Visualization for Mumbai MumbaiData
MAP DISPLAY
CONTROL BUTTON
MAP DISPLAY
Logo Logo Logo CLOSE PANEL
CONTROL BUTTON
DROPDOWN
ELEMENT
Logo
DATALIST
ELEMENT
DATALIST
ELEMENT
VISUALIZATION
CONTROL PANEL
LIST OF AVAILABLE
DATASETS
BACK TO VCP
LIGHTNESS CONTROL
SATURATION CONTROL
MAKE CHANGES
TABLET WIREFRAME
58 of 82
Data Visualization for Mumbai MumbaiData
OPEN
PANEL
BACK TO VCP
CLOSE PANEL
DROPDOWN
Logo
ELEMENT DATALIST
ELEMENT
SHARE VISUALIZATION
SATELLITE
VIEW
LIGHTNESS CONTROL
SATURATION CONTROL
DESKTOP INTERFACE
60 of 82
Data Visualization for Mumbai MumbaiData
DESKTOP INTERFACE
61 of 82
Data Visualization for Mumbai MumbaiData
DESKTOP INTERFACE
62 of 82
Data Visualization for Mumbai MumbaiData
TABLET INTERFACE
63 of 82
Data Visualization for Mumbai MumbaiData
TABLET INTERFACE
64 of 82
Data Visualization for Mumbai MumbaiData
TABLET INTERFACE
65 of 82
Data Visualization for Mumbai MumbaiData
MOBILE PHONE INTERFACE
66 of 82
Data Visualization for Mumbai MumbaiData
MOBILE PHONE INTERFACE
67 of 82
Data Visualization for Mumbai MumbaiData
68 of 82
Data Visualization for Mumbai MumbaiData
Scenarios
69 of 82
Data Visualization for Mumbai MumbaiData
sector banks and private companies as well citizens living in these localities about the NATURAL AREAS AND OPEN SPACES
as NGOs and individuals. The scenarios outbreak and preventive measures may be
assume certain level of knowledge about taken. WATER BODIES
working with data. Many of scenarios
assume availability of datasets even though RESIDENTIAL AREAS
70 of 82
Data Visualization for Mumbai MumbaiData
7.2.Scenario 2 - High
Pollution Areas Data Layers
Pollution levels are on the rise in the city.
The Pollution Control Board needs to find
out the possible areas where there is an OPEN SPACES AND NATURAL AREAS
71 of 82
Data Visualization for Mumbai MumbaiData
TRAFFIC CONGESTION
scenario. A planner working with the
Municipal corporation decides to use the
Management in
MumbaiData tool to do a preventive Floods
planning for the traffic in the city during
floods.
The planner visualizes the flood prone areas
in the city along with major centers of
employment like commercial and industrial
areas along with all the transport
infrastructure like roads, railways and
airport. The planner also overlays the
residential areas, slums. The knowledge of
the location of areas where people live and
areas where people work may help in
predicting the density of flow of traffic
during different times of the day. Traffic
densities are also overlaid. This enables the
planner to find the hotspots for high traffic
congestion in flood prone areas and helps
the Municipal corporation to divert the
72 of 82
Data Visualization for Mumbai MumbaiData
73 of 82
Data Visualization for Mumbai MumbaiData
7.5.Scenario 5 - Anomalies
in Electricity
Consumption Data Layers
With the advent of Internet of Things
devices, static data is of little use when real
time data can be made available. ELECTRICITY CONSUMPTION
74 of 82
Data Visualization for Mumbai MumbaiData
8.Limitations
No design is perfect, and as such the
MumbaiData Tool also has its limitations.
• There is a need for the users to be familiar
with certain data terminologies before
using the tool.
• The tool can only be used if the user has
access to the internet and cannot be used
offline at present.
• The user cannot upload his own datasets
easily and as such there is a need of
technical knowledge in programming for
the user to do so.
• There could be a slight error with the
positioning of the layers on the map,
however the tool does not aim at accuracy
rather it embraces the messiness inherent
in the data sets.
Also limitations are not permanent and
many of the limitations mentioned can be
solved in the future updates made to the
tool.
75 of 82
Data Visualization for Mumbai MumbaiData
76 of 82
Data Visualization for Mumbai MumbaiData
Evaluation
77 of 82
Data Visualization for Mumbai MumbaiData
78 of 82
Data Visualization for Mumbai MumbaiData
79 of 82
Data Visualization for Mumbai MumbaiData
80 of 82
Data Visualization for Mumbai MumbaiData
81 of 82
Data Visualization for Mumbai MumbaiData
82 of 82