Bertin - Matrix Theory of Graphics - IDJ - Ocr

Matrix theory of graphics
The matrix theory of graphics is based upon Sémiologie
graphique. It developed progressively after the publication of
La graphique et le traitement graphique de l'information in
1977. Since then, the theory has evolved. Some basic concepts
changed. Pertinent examples were reinterpreted in more
depth. Finally, it was necessary to restructure the whole to
underline its unity and its essential tenets. It had to be
simpler, probably more logical and more didactic. This article
summarizes this new structure.
Definitions
Graphics uses the properties of the visual image in order to
make relationships of difference/similarity, order or
proportion appear among data. This language covers the universe of
diagrams, networks and topographies.
Graphics is applied to a set of data after it has been defined
(the table of data), and thus it constructs the rational
underpinning of the world of images within the logical
classification of fundamental sign systems (see Diagram 1).
Graphics pursues two goals:
- to process data for understanding and extracting
information.
- to communicate, if necessary, this information or an
inventory of basic data.
The matrix theory, based on Sémiologie graphique,
constructs a homogeneous and coherent system for the analysis
of the graphic language, its use and pedagogy.
It is essential to avoid any confusion between GRAPHICS
(la graphique), which processes only rigorously pre-defined
data sets (the data table), and GRAPHIC DESIGN - whether
figurative or abstract - which acts according to its own rules
within its own definition of the graphic world. Graphics is a
tool that obeys universal laws that are unavoidable and
indisputable but can be learned and taught. Graphic design
as an art is free, but also subjective.
Diagram 1. The written transcriptions of music, words and
mathematics are techniques of memorizing fundamentally sonic
systems, thus keeping the linear and temporal character of these
systems. Through the telephone, the ear can hear an equation,
but not a map.
Jacques Bertin
Natural properties of the graphical image
The three dimensions of the instantaneous image
In the plane, a mark can be on the top or the bottom, to the
right or to the left (1). Human perception constructs in the
plane two independent dimensions, X and Y, distinguished
orthogonally. A variation of light energy (2) creates the third
dimension Z, which is independent of X and Y.
The image is a meaningful form that is perceived
instantaneously and is created within the three dimensions X,
Y, Z (3). It can thus transmit the relationships between
three independent data sets.
The visual variables of the image are thus the X and Y
dimensions of the plane and, for the Z dimension, the size or the
value of marks.
The properties of the plane: Points or lines.
Network or matrix
A datum is a relationship between two entities.
Correspondingly, the plane offers us points and lines. Consequently,
- one can represent entities by points and relationships by
lines (4): one then constructs a NETWORK. The X and Y
dimensions of the image are not significant.
- one can represent entities by lines and relationships by
points (5): one thus constructs a MATRIX. In that case,
the dimensions X and Y each have their own significance.
While any set of data can be constructed in these two ways,
each of the two construction types has its own properties.
The network describes the relationships between the
elements. It is the best way to transcribe the topographic order,
but it is useless when transcribing reorderable data sets.
How, for example, could one discover in (6) the deviant
relationship which appears instantaneously in (7)? The
matrix construction is the basic construction of graphics. Its
three independent dimensions furnish the underpinning for
trying to understand data, strengthened by the universality of
the double-entry data table and the reordering applied to
data classification.
Fixed or transformable image?
Let us consider the data table (8), which shows the presence
of the products A, B, C ... in the countries 1, 2, 3 ... As given
in (9) or its one-to-one graphic translation, it presents a
daunting effort of analysis. But simply by displacing country
2 and product D we already discover groups of similar
elements (10) and we are able to reduce the 25 elements to
three groups that are characteristic of this data set.
This internal transformation of the image, obtained by
the permutation of rows and columns, based on the universal
principle of proximity-resemblance, defines the reorderable
matrix as a basis of the graphic data processing. The
permutations are symbolically represented by (11).
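The permutation described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
5 x 5 presence table below is invented (figures (8) to (10) are not
reproduced here), and the helper name permute is hypothetical. The
sketch moves country 2 and product D and prints the table before
and after, so that the three groups become visible.

```python
# Hypothetical presence table: rows are countries 1..5,
# columns are products A..E (1 = product present, 0 = absent).
countries = ["1", "2", "3", "4", "5"]
products = ["A", "B", "C", "D", "E"]
table = [
    [1, 0, 0, 1, 0],  # country 1
    [0, 0, 0, 0, 1],  # country 2
    [1, 0, 0, 1, 0],  # country 3
    [0, 1, 1, 0, 0],  # country 4
    [0, 1, 1, 0, 0],  # country 5
]


def permute(table, row_order, col_order):
    """Return a copy of the table with rows and columns rearranged."""
    return [[table[r][c] for c in col_order] for r in row_order]


def show(table, row_labels, col_labels):
    """Print the table as a crude image: '#' for presence, '.' for absence."""
    print("  " + " ".join(col_labels))
    for label, row in zip(row_labels, table):
        print(label + " " + " ".join("#" if v else "." for v in row))


# Displace country 2 to the end and product D next to product A.
row_order = [0, 2, 3, 4, 1]
col_order = [0, 3, 1, 2, 4]
reordered = permute(table, row_order, col_order)

show(table, countries, products)
print()
show(reordered,
     [countries[r] for r in row_order],
     [products[c] for c in col_order])
```

The data are unchanged by the permutation; only the order of rows
and columns differs, which is what makes the groups readable.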
The properties of Z: Order, associativity, selection
Limitations of the image, and layering problems
Layering graphical images is like superimposing photographs:
the films mix and the images destroy each other. The image
has only three dimensions. How can we represent various
properties on a map, that is on a fixed XY plane, and still
separate their images? This is the problem of the selectivity of
visual variables.
(12) The variables of the image are ordered (O) (A precedes
B). Size, like the plane, has the ability to show ratios (Q)
(A is n times B). In any combination of variables, size and
value are the variables which define order (by variation of
light energy) prior to the other variables. Size and value are
said to be dissociative.
(13) The other variables have a constant visibility and do not
disturb the action of the rest. They are said to be associative
(a) (A can be seen as similar to B). They are used to separate
elementary images.
(14) All variables are selective (*) (this is different from that),
but they are so to various degrees (see page 17). The plane is
uniquely endowed with all perceptual properties.
The matrix theory of graphics
Why Graphics? Demonstration by example
The aim of Graphics is to understand the essence of data by
transforming them. Maps, diagrams: these are documents we
fire questions at. The data matrix (15) for example, which
shows the meat production of 5 countries, can be
interrogated along three axes: X - I am interested in one type of meat,
how does it fare in one or the other country? Y - I am
interested in a certain country, what is its meat production? Z -
where do I find high percentages? But in every axis, the
questions range from the elementary to the global.
The elementary questions: The question 'In Italy, how much
pork?' is answered by the content of the cell. At this level, it is
the only kind of answer we can memorize; the total content
of 25 cells is too much to absorb.
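The three axes of questioning can be made concrete with a small
Python sketch. Everything in it is an assumption made for
illustration: the percentages are invented (matrix (15) is not
reproduced here), only 'Italy' and 'pork' come from the text, and
the other labels and the function names are placeholders.

```python
# Hypothetical data matrix: one row per country, one column per meat.
# The values are invented percentages, not those of matrix (15).
meats = ["pork", "meat 2", "meat 3", "meat 4", "meat 5"]
production = {
    "Italy":     [6, 15, 2, 24, 10],
    "Country B": [20, 30, 10, 5, 18],
    "Country C": [17, 9, 1, 3, 3],
    "Country D": [12, 8, 1, 2, 29],
    "Country E": [40, 28, 85, 60, 35],
}


def cell(country, meat):
    """Elementary question: in this country, how much of this meat?"""
    return production[country][meats.index(meat)]


def by_meat(meat):
    """X question: how does one type of meat fare in each country?"""
    col = meats.index(meat)
    return {country: row[col] for country, row in production.items()}


def by_country(country):
    """Y question: what is the meat production of one country?"""
    return dict(zip(meats, production[country]))


def high_values(threshold):
    """Z question: where do we find percentages at or above a threshold?"""
    return [(country, meat, value)
            for country, row in production.items()
            for meat, value in zip(meats, row)
            if value >= threshold]


print(cell("Italy", "pork"))
print(by_meat("pork"))
print(by_country("Italy"))
print(high_values(30))
```

Each function answers one question at a time; none of them gives
the overall pattern, which is the subject of the global question.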
But to understand is to synthesize all the data. If we want
to come to a synthesis, we must condense the data into
groups of similar elements, and try to reduce the number of
groups as much as possible. This is the objective of the
processing of data, whether it be mathematical or graphical.
The global question: What patterns can we extract from the
data in X and Y? This is the essential question. The answer is
revealed by the construction (16), the reorderable matrix,
within which columns and rows have been permuted to
create a meaningful pattern where the data (15), that is the 25
figures contained in the cells, are made visible as two groups
of countries A and B with contrasting structures. This is the
first information.
Country C is an exception. It does not fit in any group. But
this exception is important, because in this particular data
set, and in a situation where all partners are equal, it is the
answer per country that counts. The information extracted
from the patterns is not perceivable in the data table (15), nor
in any other construction (see (17)). And yet, this is the
second information. Graphical and algorithmic information
processing precedes interpretation and makes it valuable.
THE REORDERABLE MATRIX (18) answers all types and
levels of questions. It is the basic construction of Graphics.
This construction embodies for the graphics operator the
optimal properties of the image. It allows a chain of logical
operations: data - matrix - reduction - exceptions -
discussion - communication.
It structures reflection, gives a sense to computer
manipulation and, by distinguishing differences, characterises
specific cases.
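The permutation of rows and columns can also be attempted
algorithmically. The Python sketch below uses a barycentre
heuristic, which is a technique chosen here for illustration and
not a method described in this article: each row is sorted by the
mean position of its filled cells, then each column, until nothing
moves. The input table is invented and the function names are
hypothetical. Such a heuristic may stop at an arrangement that a
human operator would still improve by eye.

```python
def barycentre(values, positions):
    """Weighted mean position of the non-zero cells of one row or column."""
    total = sum(values)
    if total == 0:
        return 0.0
    return sum(positions[i] * v for i, v in enumerate(values)) / total


def reorder(table, sweeps=10):
    """Return (row_order, col_order) found by alternating barycentre sorts."""
    n_rows, n_cols = len(table), len(table[0])
    row_order = list(range(n_rows))
    col_order = list(range(n_cols))
    for _ in range(sweeps):
        # Sort rows by the mean position of their filled columns.
        col_pos = {c: p for p, c in enumerate(col_order)}
        new_rows = sorted(row_order,
                          key=lambda r: barycentre(table[r], col_pos))
        # Sort columns by the mean position of their filled rows.
        row_pos = {r: p for p, r in enumerate(new_rows)}
        new_cols = sorted(
            col_order,
            key=lambda c: barycentre([table[r][c] for r in range(n_rows)],
                                     row_pos))
        if new_rows == row_order and new_cols == col_order:
            break  # nothing moved: stop sweeping
        row_order, col_order = new_rows, new_cols
    return row_order, col_order


# Invented 5 x 5 table (1 = presence), in its initial arbitrary order.
table = [
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
]
row_order, col_order = reorder(table)
for r in row_order:
    print(" ".join("#" if table[r][c] else "." for c in col_order))
```

The sort is stable, so rows or columns with equal barycentres keep
their relative order; the result therefore depends on the initial
arrangement.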
The choice of a graphic construction
The synopsis of useful graphic constructions
THE SYNOPSIS (19) classifies the useful constructions
according to the properties of the data table.
It indicates the construction best suited to each case and,
inversely, helps to define the data table corresponding to a
construction.
Given a data table with, in the X dimension, the items A, B, C
... and, in the Y dimension, the variables 1, 2, 3 ..., we can
observe:
1) the number of variables, 2) the nature of the items, i.e.
ordered (O) or reorderable (*). These
are the two principles of the diagram classification. The
relationships between items define the networks.
More than 3 variables
Items * (reorderable)
(1) Reorderable matrix. This is the basic construction.
Items O (ordered)
(2) Image-file (data noted graphically on sets of records).
Permutation along Y only. Allows a maximum quantity of
data.
(3) Array of curves, when the slopes of the curves are
meaningful.
(4) Classifiable sets of ordered tables or 'maps' (for example
maps of sounds, of colours).
(5) Sets of geographical maps presenting one attribute.
3 variables and fewer
Each variable takes hold of one dimension of the image.
Patterns appear directly on the ordered table.
(9) (10) (11) scatter-plots with 3 or 2 properties
(12) distribution of one variable.
Reorderable networks
The transformations of these networks are meant to simplify
the image, but they are limited by the number of items.
Ordered networks
(21) topography, basic maps
(18) maps for one variable
(17) (16) exhaustive sets of chromatic superimpositions
(5) map sets
- exhaustive sets of superimpositions (inventories)
- simplified superimpositions (syntheses)
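The two principles of the diagram classification (number of
variables, ordered or reorderable items) can be read as a small
decision rule. The Python sketch below is one possible reading of
the synopsis, not a complete transcription of it: the function
name and its return strings are assumptions, and the network and
map constructions are left out.

```python
def suggest_construction(n_variables, items_ordered):
    """Suggest a diagram construction from two properties of the data table.

    n_variables   -- number of variables in the Y dimension
    items_ordered -- True if the items in X are ordered (O),
                     False if they are reorderable (*)
    """
    if n_variables < 1:
        raise ValueError("a data table needs at least one variable")
    if n_variables > 3:
        if items_ordered:
            # Ordered items: permutation is possible along Y only.
            return "image-file, array of curves or set of ordered tables/maps"
        return "reorderable matrix"
    if n_variables == 1:
        return "distribution of one variable"
    # 2 or 3 variables: each variable takes one dimension of the image.
    return "scatter-plot"


print(suggest_construction(26, items_ordered=False))
print(suggest_construction(12, items_ordered=True))
print(suggest_construction(3, items_ordered=False))
```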
Transmitting information to others
Graphic communication
Graphic communication is the best known function of
graphic representations. But should one communicate only
the elementary data as do classical constructions, or should
one rather communicate the means of understanding?
Useful graphic representation, of course, enhances
understanding. Its images are the simplest possible and there is no
reason to emphasize groups unless the optimised image
remains complex.
But any optimisation is subject to discussion. Should one
favour discussion and retain the raw data or should one let
them disappear into better patterns one can easily capture,
which are more evident, but are then beyond discussion?
This is the real dilemma of scientific communication, of
dendrograms, multivariate clouds, cartographic models,
which make the raw data disappear and thus preclude critical
analysis.
Graphic representation also fulfils the function of
repository. This function is the characteristic of many topographic
representations and as such it maintains - and rightly so -
the questions at the elementary level. It also justifies fixed
orders - alphabetical or chronological - which make searches
easier. It excludes unordered lists which compel the reader to
browse everything until the search item is discovered.
Schematic representation of the graphic image
When graphics are used as a tool for information processing,
the sender and recipient of information are either one and
the same person, or two actors who formulate the same
basic questions. Because of this, they do not fit into the
diagram of polysemic communication, where we have
sender <-> code <-> recipient (diagram A)
Instead we have the monosemic diagram:
actor <-> three relationships *, O, Q,
where *, O, Q, are relationships of similarity and of order
which allow the reduction of data. These relationships are
not submitted to conventional coding, as they are expressed
by visual variables which have corresponding properties.
Diagram A applies only when using language to answer the
first question.
The diagrams
Analogy and complementarity of algorithmic and graphic
information processing
We have a data set of 59 Merovingian artifacts, described
according to 26 characteristics (20). The data are at first
rearranged using three algorithmic techniques: automatic
classification (AC), multivariate analysis (MV) and
hierarchical analysis (HA). The images obtained differ. We have
to interpret the results. (Note: The Merovingian dynasty
reigned in France from the 5th to the 8th century.)
In order to interpret, we apply hierarchical analysis HA in
a first step. As part of the visual classification VC 1 we insert
separation lines and isolate a sub-set (a). VC 2 simplifies (a)
by inverting the first three columns and reordering sub-set
(b). VC 3 shifts (b) into (a). VC 4 simplifies VC 3 by creating
an evolutive pattern and by isolating exceptional artifacts
and characteristics, all of which can be clearly analysed.
A special graphical device: The image- file
An experimental tool. By placing in X a component with a
fixed order (in the present case, time), this device eliminates
one axis of permutation and thus simplifies the graphic
information processing.
The homogeneity of a collection of insects
An experiment takes place in three connected rooms: a light
one, a dimly lit one, and a dark one.
- For each insect, the times (T1) and (T2) spent in the first
two rooms before reaching the dark room are measured in
5-minute chunks over one hour.
- The experiment is repeated 12 times.
The problem is to discover whether:
- the 12 experiments are comparable
- the 8 insects produce distinguishable types of behaviour
- there are diverging patterns.
(a) constructs the image-file. It puts in X the time quantities
and in Y the 8 insects (*) × 12 experiments (at)
(b) constructs one image for each experiment, based on the
classification of insects ABCDEFGH. The experiments appear
to form 2 groups. Experiments 5 and 11 differ from the
majority. They are extracted and studied separately.
(c) constructs one image per insect based on the order of
experiments. Three types appear: slow, quick, chaotic.
(d) orders all the time measurements from the longest to the
shortest. Three thresholds appear: 10, 20 and 45 minutes.
Many other conclusions are possible: see also Graphics and
Graphic Information Processing, p. 78.
The reorderable networks
Network graphs and flowcharts transcribe in the plane the
relationships (connectors) between objects (points).
Processing them implies simplifying the image (A) by suppressing
meaningless line crossings.
The following step (B) creates meaningful groups. Step
(C) attributes a meaning to the X and Y coordinates of the
plane. When the number of elements increases, these
operations quickly become too complex. It is then necessary to
apply matrix processing (D) or matrix algebra
computationally.
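Step (D) amounts to rewriting the network as a matrix in which
the objects become rows and columns and each connector becomes a
filled cell. A minimal Python sketch, with invented node names and
a hypothetical helper adjacency_matrix, assuming undirected
connectors:

```python
def adjacency_matrix(nodes, edges):
    """Build a symmetric 0/1 matrix from a list of undirected connectors."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected: mirror the cell
    return matrix


# Invented network: four objects linked in a chain.
nodes = ["P", "Q", "R", "S"]
edges = [("P", "Q"), ("Q", "R"), ("R", "S")]
matrix = adjacency_matrix(nodes, edges)

print("  " + " ".join(nodes))
for name, row in zip(nodes, matrix):
    print(name + " " + " ".join("#" if v else "." for v in row))
```

Once the network is in this form, its rows and columns can be
permuted like those of any other reorderable matrix.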
Map constructions (Ordered networks)
In the context of graphic information processing, the anchoring
in the plane defines the topographic images and their specific
problem: the visual separation of superimposed variables. The
solution varies according to the level of the relevant questions
and implements the laws of selectivity applied to the visual
variables, their implantation and the perception of elementary
gestalt.
Here, the adequate selectivity of the orientation of punctual
signs allows the appearance of regional groupings, i.e. an answer
at the medium level.
Basic questions
Let us consider a cartographic problem with n characteristics,
and the corresponding data table containing in X the
geographic items and in Y the attributes. The first basic question
(what are X, Y, and Z) determines the contents of the map
legend. To answer the other questions (what are the
groups, what are the exceptions...) we have to study the
attribute values assigned to the geographic items, discover
similarities and look at regionalisations, in order to answer
the question 'where is such and such an attribute to be
found?' (A). Moreover, as an inventory, the map should
answer the question 'what is there at a given place?' (B). The
answers vary according to the constructions.
A one-attribute map (1) answers the two types of questions.
The problem it creates is the representation of quantities in
Z. When the wrong choice is made as in (2) - a
representation lacking any order - a type (B) question is the only one to
be answered.
A collection of one-attribute maps (3) will answer only a type
(A) question, but can be ordered in many ways.
A map where all attributes are superimposed as in (5) answers
only a type (B) question, as does a map like (4).
Superimposition raises the problem of selectivity. If we want to answer all
questions in full, we have to construct both the collection (3)
and the superimposition (4).
A simplified map - or synthetic map - (7) (9) (10) is meant
to answer all questions, but it does so by abandoning the
completeness of the collected data. It illustrates the dilemma
of choosing between different kinds of data processing: by
reordering a matrix (6) or by a purely cartographic
representation as in (8). It also evades the critical questioning of
regional groupings once the original data have been obliterated
by the synthesis (10).
The representation of quantities in the Z dimension
(Semiology of graphics: Diagrams, Networks, Maps. p. 366)
Equalizing classes
Around the earth, the population of a country is related to its
area. Similarly, in statistics, the population of an age class
depends upon its bounds. In cartography as in statistics, it is
necessary to neutralize or to equalize classes in order to avoid
erroneous representations. This operation can be
implemented mathematically (ratios, percentages, indices) or
graphically (grids).
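The mathematical form of this equalization is a simple ratio. The
Python sketch below shows two cases under stated assumptions: all
figures are invented, and the function names are hypothetical. A
population is divided by the area of its region, and the count of
an age class is divided by the width of the class.

```python
def density(population, area):
    """Inhabitants per unit of area: neutralizes the size of the region."""
    return population / area


def per_year(count, lower, upper):
    """Persons per year of age: neutralizes the width of the age class."""
    return count / (upper - lower)


# Invented regions: (population, area in square kilometres).
regions = {
    "region 1": (1_000_000, 50_000),
    "region 2": (200_000, 2_000),
}
for name, (population, area) in regions.items():
    print(name, density(population, area))

# Invented age classes: (lower bound, upper bound, number of persons).
age_classes = [(0, 5, 500), (5, 20, 1200)]
for lower, upper, count in age_classes:
    print(f"{lower}-{upper}", per_year(count, lower, upper))
```

In this invented case the raw counts and the equalized values rank
the classes in opposite orders: region 1 has the larger population
but the lower density, and the 5-20 class holds more persons but
fewer per year of age.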
Using size variation
In map (1) the higher land prices are immediately perceived.
It is a map for seeing. In map (2) the reader does not see
them. It is a map for 'reading', as is map (4).
Variations of levelling
When representing quantities in Z, one answers two
questions: what are the characteristic thresholds of the
distribution? At what level does the significant image appear (similar
to what other pattern, unifying 'islands', covering a given
area)?
A large body of literature underscores the difficulty, if not
the impossibility, of answering both kinds of questions
within one map. The ability to replace incremental
variation by computed continuous variation of levelling
introduces the possibility of an efficient solution.
Selectivity
(Semiology of graphics: Diagrams, Networks, Maps. p. 67;
Graphics and Graphic Information Processing, p. 213).
Selectivity is of importance in superimpositions and is in fact
defined by its opposite: it is what remains when ignoring the
rest.
Given an equal amount of reflected or transmitted light
for all shapes, the selection of squares in map (2) boils down
to ignoring all other shapes. This is an impossible task for the
human eye. Shapes do not induce selectivity.
When the perception of marks depends upon the
variation of luminosity, the selection of dark marks amounts to
seeing light ones as a common background, which is immediate
in map (1).
Better selectivity is guaranteed by
- difference of intensity: size and value, when they do not have
an ordering significance.
- difference of implantation, which allows the superimposition
of marks shaped as points, lines and zones.
- colour, but only in as far as it is not neutralized by the size of
marks. Minute red and green marks cannot be perceived as
having different colours when they are seen at a distance,
whereas on the large surface of a wall, the eye can distinguish
up to a million colours. The use of colour as a
distinguishing variable is thus to be avoided for small sizes of
implantation.
- grain, combined with zonal implantation (3 levels)
- orientation (see map 5) with marks spread as points (up to 4
levels) and with linear marks (2 levels).
Shape does not have any selectivity to speak of within any
kind of implantation, whether points, lines or zones.
The invention of the data table
What data table should one construct?
The matrix analysis of a problem helps answer this question,
and leads to three successive steps of reasoning.
1. The problem is translated into simple questions. The list of
all relevant attributes and elements should be constructed
freely and without technical constraints. One then notes the
span of the set of elements and of their relationships. This is
the 'apportionment table'.
2. Imagine the ideal homogeneous table containing the
maximum number of elements composing the list. In other
words, what is to be put in X (whatever its length) to get the
largest possible number of attributes in Y? Estimate the
extent of the work, the availability, the time and the means
involved. Trace the possible condensation of data by
aggregation or by sampling and interpolation. This is the
homogeneity table. The result is a graspable and usable table.
3. Verify the relevance of this table by noting in the margins
the correspondences and the relationships defined in the initial
questions. This is the pertinency table, which specifies the
final data table (Graphics and Graphic Information Processing,
p. 233). This study of course precedes the data processing as
such, but cannot be conducted appropriately without
knowledge of mathematical and graphic data analyses and of their
methods.
Power and limitations of graphics
The three dimensions of the image impart a great power to
human visual perception and could give graphics special
impact as an efficient pedagogical tool. This tool can, from
primary education on, translate information problems into
concrete instruments of reasoning and decision. Thanks to its
permutations, modern graphics materialises notions that
would otherwise stay abstract:
- Graphics gives a visible shape to the steps and the operations
of a research process, and in doing so organizes the work flow.
- It gives data materiality and underscores the problems
raised by the design of the initial table, which is purely a matter
of creation, outside any computational setting. These
problems are expressed by the question 'what is to be put in X?'
- It materialises the concept of 'data analysis', and renders it
more graspable in its graphical form than in its mathematical
form.
- It underscores that work is only scientific if its assumptions
are justified by the rigorous treatment of an explicit data
table. Outside such a rigorous process, it is only a matter of
personal opinions.
- Graphics renders visible the notions of discussion, reasoning
and understanding, notions which are determined by the level
of relevant questions.
But the image has only three dimensions. The consequences
of such a limitation are probably beyond our imagination, as
we are immersed in this natural situation.
- While mathematical analysis introduces n dimensions, the
input listings are still expressed in one table with X, Y and Z
dimensions. And when we want to see the computational
results, we still do it through an image... which still has only
three dimensions, the fourth being time, which we sought to
downplay in the first place.
- Thus interdisciplinary research will remain difficult, as the
geographer puts space into the X dimension where the
historian puts time, the psychologist puts individuals, and
the sociologist puts social categories. Each of them is certain
of embodying 'scientific synthesis', without being aware that
each discipline, each research centre is itself defined by its
own X and Y components which characterize its field of
information. It is the absence of a 4th dimension in the image
which in fact prohibits the birth of a scientific synthesis free
of disciplinary constraints.
- Thus one can demonstrate the limitations of rationality. A
well justified information processing can only exist within the
frame of a finite set of data: the data table. But there is an
infinity of finite sets.
However powerful our rational efforts will be, they will
always be swept away in the infinity of irrationality.
(Translated by Myriam Daru)
Note
SG refers to Sémiologie graphique (1967). The English
translation is Semiology of graphics: Diagrams, Networks, Maps.
Madison: University of Wisconsin Press (1983).
GR refers to La graphique et le traitement graphique de
l'information (1977). The English translation is Graphics and Graphic
Information Processing (GGIP). Berlin: Walter de Gruyter
(1981).
