UPL60518 Cours2006 Harvard3b Turing

Putting neurons in culture:
The cerebral foundations of reading and mathematics
III. The human Turing machine
Stanislas Dehaene
Collège de France,
and
INSERM-CEA Cognitive Neuroimaging Unit
NeuroSpin Center, Saclay, France
Raoul Hausmann, L'esprit de notre temps (Tête mécanique)
Paris, Musée National d'Art Moderne
Summary of preceding talks:
The brain mechanisms of reading and elementary arithmetic
Human cultural inventions are based
on the recycling (or reconversion) of
elementary neuronal mechanisms
inherited from our evolution, and
whose function is sufficiently close to
the new one.
Why are we the only primates capable
of cultural invention?
[Figure: brain systems for quantity representation, high-level control, verbal code, and visual form]
A classical solution: new modules unique to the human brain
Michael Tomasello (The cultural origins of human cognition, 2000):
"Human beings are biologically adapted for culture in ways that other primates
are not. The difference can be clearly seen when the social learning skills of
humans and their nearest primate relatives are systematically compared. The
human adaptation for culture begins to make itself manifest in human
ontogeny at around 1 year of age as human infants come to understand other
persons as intentional agents like the self and so engage in joint attentional
interactions with them. This understanding then enables young children to
employ some uniquely powerful forms of cultural learning to acquire the
accumulated wisdom of their cultures."
Theory of mind and language abilities certainly play an important role in
our species' pedagogical abilities, and therefore in the transmission of culture.
However, they do not begin to explain our remarkably flexible ability for
cultural invention, which cuts across almost all cognitive domains. Another
design feature is needed.
The theory of a global workspace
In addition to the processors that we inherited from our primate evolution,
the human brain may possess a well-developed non-modular global
workspace system, primarily relying on neurons with long-distance axons
that are particularly dense in prefrontal and parietal cortices.
Thanks to this system,
- processors that do not typically communicate with one another can
exchange information
- information can be accumulated across time and across different processors
- we can discretize incoming information arising from analog statistical inputs
- we can perform chains of operations and branching
The resulting operation may (superficially) resemble the operation of a
rudimentary Turing machine.
The Turing machine: a theoretical model of mathematical operations
Turing, A. M. (1936). On computable numbers, with an application to the Entscheidungsproblem. Proc. London Math. Soc., 42, 230-265.
[Figure: a Turing machine in m-configuration qn reading the symbol S(r) on the scanned square of a tape holding data]
"We may compare a man in the process of computing a real number to
a machine which is only capable of a finite number of conditions q1,
q2, ..., qR which will be called 'm-configurations'.
The machine is supplied with a 'tape' (the analogue of paper) running
through it, and divided into sections (called 'squares') each capable of
bearing a 'symbol'.
At any moment, there is just one square, say the r-th, bearing the
symbol S(r), which is 'in the machine'. We may call this square the
'scanned square'. The symbol on the scanned square may be called
the 'scanned symbol'. The 'scanned symbol' is the only one of
which the machine is, so to speak, 'directly aware'. (...)
The possible behaviour of the machine at any moment is determined by
the m-configuration qn and the scanned symbol S(r). [This behaviour is
limited to writing or deleting a symbol, changing the m-configuration,
or moving the tape.]
It is my contention that these operations include all those which are
used in the computation of a number."
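The machine described in this passage can be sketched in a few lines of Python. This is a minimal illustration, not Turing's own notation: the function name `run_turing_machine`, the dictionary-based transition table, and the example machine (which appends one stroke to a unary number) are all assumptions made for the example.

```python
from collections import defaultdict


def run_turing_machine(table, tape, state, halt_state="halt",
                       blank=" ", max_steps=10_000):
    """Run a one-tape machine and return the final tape contents.

    `table` maps (m_configuration, scanned_symbol) to
    (symbol_to_write, move, next_m_configuration), with move in {-1, 0, +1}.
    """
    # The tape is unbounded: any square never written to reads as blank.
    cells = defaultdict(lambda: blank, enumerate(tape))
    position = 0  # index r of the scanned square
    steps = 0
    while state != halt_state:
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        scanned = cells[position]  # the only symbol the machine "sees"
        write, move, state = table[(state, scanned)]
        cells[position] = write
        position += move
        steps += 1
    if not cells:
        return ""
    squares = range(min(cells), max(cells) + 1)
    return "".join(cells[i] for i in squares).strip(blank)


# Hypothetical example machine: unary successor.
# Skip over the strokes, then write one more stroke on the first blank.
successor = {
    ("scan", "1"): ("1", +1, "scan"),
    ("scan", " "): ("1", 0, "halt"),
}

print(run_turing_machine(successor, "111", "scan"))  # -> 1111
```

The behaviour at each step depends only on the current m-configuration and the scanned symbol, which is the restriction the quotation insists on.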
[Figure label: infinite tape]
The essential features of the Turing machine
Turing makes a number of postulates concerning the human brain:
- Mental objects are discrete and symbolic
- At a given moment, only a single mental object is in awareness
- There is a limited set of elementary operations (which operate without awareness)
- Other mental operations are achieved through the conscious execution of a series of elementary operations (a serial algorithm)
[Figure: Turing machine diagram, with m-configurations and the scanned square bearing symbol S(r) on a tape of data]
The Church-Turing thesis:
Any function that can be computed by a human being can be computed by a Turing machine.
During his career, Turing himself kept some distance from this thesis:
- On the one hand, he attempted to design the first artificial intelligence programs (e.g. the first chess program) and suggested that the behavior of a computer might be indistinguishable from that of a human being (the Turing test).
- On the other, he did not exclude that the human brain may possess "intuitions" (as opposed to mere computing "ingenuity") and envisaged an "oracle-machine" that would be more powerful than a Turing machine.
The fate of the computer metaphor in cognitive science and neuroscience
The concepts of Turing machine and of information processing played a
key role at the inception of cognitive science.
Since the sixties, cognitive psychology has tried to define the algorithms used by
the human brain to read, calculate, search in memory, etc.
Some researchers and philosophers even envisaged that the brain-computer
metaphor was the final metaphor that need never be supplanted, given that
"the physical nature [of the brain] places no constraints on the pattern of thought"
(Johnson-Laird, Mental models, 1983).
However, the computer metaphor turned out to be unsatisfactory:
- The most elementary operations of the human brain, such as face recognition
or speaker-invariant speech recognition, were the most difficult to capture with a
computer algorithm.
- Conversely, the most difficult operations for a human brain, such as computing
357 × 456, were the easiest for the computer.
The human brain: a massively parallel machine
~10^11 neurons
~10^15 synapses
For basic perceptual and motor operations, computing with networks and attractors
provides a strong alternative to the computer metaphor
- Mental objects are coded as graded activation levels, not discrete symbols
- Computation is massively parallel
Model of written word recognition (McClelland and Rumelhart, 1981)
Model of face recognition (Shimon Ullman)
Even mathematical operations, the very domain that inspired Turing,
do not seem to operate according to classical computer algorithms
The Distance Effect in number comparison
(first discovered by Moyer and Landauer, 1967)
[Figure: response time (400 to 800 ms) and error rate (0 to 0.2) as a function of target number (30 to 100) in a "larger or smaller than 65?" task; example targets shown: 99, 31, 84, 52]
Dehaene, S., Dupoux, E., & Mehler, J. (1990). Journal of Experimental Psychology: Human Perception and Performance, 16, 626-641.
Do Turing-like operations bear no relation to the operations of the human brain?
This conclusion seems paradoxical, given the wide acceptance of the Church-Turing thesis in mathematics.
However...
- When we perform complex calculations, our response time is well predicted
by the sum of the durations of each elementary operation, with appropriate
branching points.
- In some tasks that require a conscious effort, the human brain operates as a
very slow serial machine.
- In spite of its parallel architecture, it presents a central stage during which
mental operations can only proceed sequentially.
On the impossibility of executing two tasks at once
The psychological refractory period (Welford, 1952; Pashler, 1984)
[Figure sequence: Task 1 (Target T1, Response 1) and Task 2 (Target T2, Response 2) laid out on a time axis, with response time plotted against the time interval between stimuli. Successive panels shorten the interval; in the final panel, stimuli 1 and 2 are presented together.]
Pashler (1984):
- only central operations are serial
- perceptual and motor stages run in parallel
[Figure: perceptual (P1, P2), central (C1, C2) and motor (M1, M2) stages of Task 1 and Task 2 on a time axis, with slack time between P2 and C2; response times 1 and 2 as a function of the interval between Target T1 and Target T2]
Slowing the P stage by presenting numerals in Arabic or in verbal notation
Sigman and Dehaene, PLOS Biology, 2005
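Pashler's bottleneck account can be written as a small arithmetic model. The sketch below assumes that each task is a strict sequence of perceptual (P), central (C) and motor (M) stages, and that C2 cannot start before C1 has finished. The function name and the stage durations (in ms) are hypothetical values chosen for illustration, not measurements from the lecture.

```python
def prp_response_times(soa, p1=100, c1=300, m1=150, p2=100, c2=250, m2=150):
    """Return (RT1, RT2, slack) for a given delay `soa` between T1 and T2.

    All times are in ms. RT2 is measured from the onset of T2.
    """
    rt1 = p1 + c1 + m1
    c1_end = p1 + c1                 # the central stage is busy until here
    p2_end = soa + p2                # P2 runs in parallel with Task 1
    c2_start = max(p2_end, c1_end)   # C2 waits for the bottleneck to clear
    slack = c2_start - p2_end        # waiting time between P2 and C2
    rt2 = c2_start + c2 + m2 - soa
    return rt1, rt2, slack


for soa in (0, 100, 200, 300, 1000):
    print(soa, prp_response_times(soa))
```

Under these assumptions RT2 falls with a slope of -1 while the delay is short, then levels off once the slack disappears. Slowing P2 (for example `p2=150`) leaves RT2 unchanged at short delays, because the extra time is absorbed by the slack, but adds the full 50 ms at long delays.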
Event-related potentials dissociate parallel and serial stages during dual-task processing
Subjects were engaged in a dual task:
- number comparison of a visual Arabic numeral with 45, respond with right hand
- followed by pitch judgment on an auditory tone, respond with left hand
[Figure: response times RT1 and RT2 (ms) as a function of T1-T2 delay (0 to 1200 ms)]
Separation of ERPs at long T1-T2 delays
[Figure: ERP time courses showing the N1 and P3 components for visual events and for auditory events]
Sigman and Dehaene, in preparation
Locating the sites of processing bottlenecks: parieto-prefrontal networks
Dux, Ivanoff, Asplund & Marois, Neuron, 2007
[Figure: brain regions associated with three bottleneck paradigms: PRP, VSTM/MOT, and attentional blink]
Marois & Ivanoff, TICS, 2005: review of imaging studies of bottleneck tasks
The central stage is associated with conscious processing
The attentional blink phenomenon
When both T1 and T2 are briefly presented and followed by a mask, participants
who perform a task on T1 may fail to report or even perceive the presence of T2.
[Figure: stages P1, C1, M1 of Task 1 (Target T1) and P2, C2, M2 of Task 2 (Target T2, masked); percentage of perceived stimuli]
J. Raymond, K. Shapiro, J. Duncan, C. Sergent
Conscious access and non-conscious processing during the attentional blink
[Figure: trial sequence over time with Target 1 and Target 2 separated by a variable T1-T2 lag (example stimuli: HRQF, CINQ, CVGR, XOOX)]
[Figure: percent of trials as a function of subjective visibility rating, showing a bimodal distribution of T2 visibility ("seen" versus "not seen")]
Sergent, Baillet & Dehaene, Nature Neuroscience, 2005
Time course of scalp-recorded potentials during the attentional blink
[Figure panels: UNSEEN T2 (minus T2-absent trials); SEEN T2 (minus T2-absent trials); DIFFERENCE]
Sergent et al., Nature Neuroscience, 2005
Timing the divergence between seen and not-seen trials
in the attentional blink (Sergent et al., Nature Neuroscience 2005)
- Unchanged initial processing: P1 (96 ms), N1 (180 ms)
- Abrupt divergence around 270 ms: N2 (276 ms), N3 (300 ms)
- Late non-conscious processing: N4 (348 ms)
- All-or-none ignition: P3a (436 ms), P3b (576 ms)
[Figure: seen versus not-seen responses for each component, on voltage scales of ±2 to ±4 µV]
The cerebral mechanisms of this central limitation: a collision of the N2 and P3 waves
[Figure, top: PROCESSING OF TASK 1 (difference task/no task), showing the N2, P3a and P3b components on a ±3 µV scale against time from T1 onset (ms)]
[Figure, bottom: PROCESSING OF TASK 2 (difference seen/not seen), showing the N2, P3a and P3b components on a ±2 µV scale against time from T2 onset (ms)]
Dehaene, Baillet and Sergent, Nature Neuroscience 2005
Sources of the difference between seen and unseen trials
Activation in event-related potentials:
[Figure: source activity for seen, not-seen and absent trials in middle temporal (t = 300 ms), inferior frontal, and dorsolateral prefrontal (t = 436 ms) regions]
fMRI activation to a seen or unseen stimulus during the attentional blink:
Marois et al., Neuron 2004
Phase synchrony in MEG:
Gross et al., PNAS 2004
An architecture mixing parallel and serial processing:
Baars' (1989) theory of a conscious global workspace
The global neuronal workspace model (Dehaene & Changeux)
[Figure: a hierarchy of modular processors; automatically activated processors; high-level processors with strong long-distance interconnectivity; processors mobilized into the conscious workspace]
Dehaene, Kerszberg & Changeux, PNAS, 1998
Dehaene & Changeux, PNAS, 2003; PLOS, 2005
inspired by Mesulam, Brain, 1998
Prefrontal cortex and temporo-parietal association areas form long-distance networks
- Von Economo (1929): greater layer II/III thickness
- Guy Elston (2000): greater arborizations and spine density [Figure: neurons from V1, TE and PFC]
- Pat Goldman-Rakic (1980s): long-distance connectivity of dorsolateral prefrontal cortex
Detailed simulations of the global neuronal workspace
using a semi-realistic network of spiking neurons
(Dehaene et al., PNAS 2003; PLOS Biology, 2005)
[Figure: network architecture. Areas A1, A2, B1, B2, C and D are linked by feedforward and feedback connections and receive targets T1 and T2.]
[Figure: thalamocortical column. Thalamus and cortex (supragranular, layer IV and infragranular layers), with connections to/from lower areas and to/from higher areas.]
[Figure: spiking neurons. AMPA, NMDA, GABA, NaP, KS and leak currents, a neuromodulator current, spike-rate adaptation, and optional cellular oscillator currents.]
[Figure: simulated activity over 0 to 500 ms. Ignition of the global workspace by target T1; failure of ignition by target T2.]
Dehaene, Sergent, & Changeux, PNAS, 2003
Is the brain an analogical or a discrete machine?
A problem raised by John Von Neumann
Turing assumed that his machine processed discrete symbols.
According to Von Neumann, there is a good reason for computing with discrete symbols, and
it also applies to the brain:
"All experience with computing machines shows that if a computing machine has to handle as
complicated arithmetical tasks as the nervous system obviously must, facilities for rather high
levels of precision must be provided. The reason is that calculations are likely to be long, and in
the course of long calculations, not only do errors add up but also those committed early in the
calculation are amplified by the latter parts of it (...)
Whatever the system is, it cannot fail to differ considerably from what we consciously and
explicitly consider as mathematics" (The computer and the brain, 1958)
Why and how does the brain discretize incoming analog inputs?
The answer given by... Alan Turing
The decision algorithm by stochastic accumulation designed by Turing at Bletchley Park:
$$\text{Weight of input } I \text{ in favor of } A = \log \frac{P(I \mid A \text{ is true})}{P(I \mid A \text{ is false})}$$
$$\text{Total weight in favor of } A = \text{initial bias} + \text{weight}(I_1) + \text{weight}(I_2) + \text{weight}(I_3) + \dots$$
[Figure: evolution of the weight in favor of A over time, between a decision boundary for A and a decision boundary for non-A; response A is emitted when the boundary for A is reached]
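The accumulation rule above can be sketched as a sequential test. This is a minimal illustration under stated assumptions: inputs are binary observations, `p_true` and `p_false` are the (hypothetical) probabilities of observing a 1 when A is true or false, and the two decision boundaries are placed symmetrically at `+bound` and `-bound`. None of these names or values come from the lecture.

```python
import math


def weight_of_evidence(p_if_true, p_if_false):
    """Log of the likelihood ratio of one input under A versus non-A."""
    return math.log(p_if_true / p_if_false)


def accumulate(observations, p_true, p_false, bound, initial_bias=0.0):
    """Add up weights until one decision boundary is crossed.

    Returns (decision, number_of_inputs_used, total_weight).
    """
    total = initial_bias
    steps = 0
    for obs in observations:
        steps += 1
        if obs:
            total += weight_of_evidence(p_true, p_false)
        else:
            total += weight_of_evidence(1 - p_true, 1 - p_false)
        if total >= bound:
            return "A", steps, total
        if total <= -bound:
            return "non-A", steps, total
    return "undecided", steps, total


# Each 1 contributes log(3); each 0 contributes log(1/3).
print(accumulate([1, 1, 0, 1, 1], p_true=0.75, p_false=0.25, bound=3.0))
```

With these values the boundary for A is crossed on the fifth input. Raising `bound` trades speed for accuracy, which is the property that lets a noisy analog stream be turned into a discrete decision.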
From numerosity detectors to numerical decisions:
elements of a mathematical theory
(S. Dehaene, Attention & Performance, 2006, in press)
1. Coding by Log-Gaussian numerosity detectors
[Figure: detector tuning curves on an internal logarithmic scale, log(n), from 1 to 16]
2. Application of a criterion and formation of two pools of units
[Figure: a stimulus of numerosity n and a criterion c define a pool favoring R1 and a pool favoring R2]
3. Computation of log-likelihood ratio by differencing
[Figure: LLR for R2 over R1, obtained as the difference between the pool favoring R2 and the pool favoring R1]
Response in simple arithmetic tasks:
- Larger or smaller than x?
- Equal to x?
4. Accumulation of LLR, forming a random-walk process
[Figure: three example trials shown as random walks over time, from the starting point of accumulation to the decision threshold for R1 or R2; mean response time]
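The four steps above can be illustrated with a short simulation. It is a simplified sketch, not the published model: each time step draws one noisy sample of log(n), and the evidence increment is taken to be the sample minus log(criterion), a quantity proportional to the log-likelihood ratio if the two responses correspond to two Gaussian hypotheses placed symmetrically around the criterion. The function name, the noise level, the threshold and the trial count are all assumed values.

```python
import math
import random


def simulate_comparison(n, criterion=65, noise=0.3, threshold=1.5,
                        trials=500, seed=0):
    """Simulate 'larger or smaller than criterion?' as a random walk.

    Returns (mean number of accumulation steps, error rate).
    """
    rng = random.Random(seed)
    correct = "larger" if n > criterion else "smaller"
    total_steps = 0
    errors = 0
    for _ in range(trials):
        evidence = 0.0
        steps = 0
        while abs(evidence) < threshold:
            # Noisy internal representation of n on a logarithmic scale.
            sample = rng.gauss(math.log(n), noise)
            evidence += sample - math.log(criterion)
            steps += 1
        response = "larger" if evidence > 0 else "smaller"
        total_steps += steps
        errors += response != correct
    return total_steps / trials, errors / trials


for target in (31, 52, 60, 84, 99):
    print(target, simulate_comparison(target))
```

Targets close to the criterion produce a small drift, so the walk takes longer to reach a threshold and ends at the wrong one more often. That is the qualitative signature of the distance effect.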
A fronto-parietal network might implement stochastic accumulation
Neurons in prefrontal and parietal cortex exhibit a slow stochastic
increase in firing rate during decision making (Kim & Shadlen, 1999)
Stochastic accumulation can be modeled by networks of self-connected
and competing neurons (simulated neuronal activity: Wong & Wang, 2006)
[Figure: accumulation at a variable speed up to a fixed threshold, at which the decision is made]
Hypothesis: there is an identity between the stochastic
accumulation system postulated in [...]
and the central system postulated in PRP models
The accumulation of evidence required by Turing's algorithm
would be implemented by the recurrent connectivity of a
distributed parieto-frontal system.
[Figure: Task 1 (Stimulus 1 to Response 1) and Task 2 (Stimulus 2 to Response 2) decomposed into a perceptual stage (P), central integration (C), a wait period (W) and a motor stage (M), with the resulting RT1 and RT2 distributions on a time axis in ms]
Sigman and Dehaene (PLoS 2005, PLoS 2006)
Is a stochastic random walk constitutive of the central stage?
This model makes very specific predictions about
the source of variability in response time:
- Factors that affect the P or M stages should add a fixed delay:
number notation (digits or words), motor complexity (one or two taps)
- Factors that affect C should increase variance:
numerical distance
[Figure: Target T2 to Response, through a perceptual stage (P), central integration (C) and a motor stage (M); RT distributions under manipulation of the P, C and M factors]
Sigman & Dehaene, PLOS Biology, 2004
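These predictions can be checked on a toy simulation. The sketch below assumes that P and M are fixed durations and that C is the time a noisy accumulator with a single threshold needs to finish; all names and parameter values are hypothetical, and time is in arbitrary ms-like steps.

```python
import random
import statistics


def simulate_rts(p=100.0, m=100.0, drift=0.1, noise=0.5,
                 threshold=10.0, trials=1000, seed=1):
    """Return simulated response times RT = P + C + M.

    C is the number of steps the accumulator needs to reach `threshold`.
    """
    rng = random.Random(seed)
    rts = []
    for _ in range(trials):
        evidence = 0.0
        steps = 0
        while evidence < threshold:
            evidence += rng.gauss(drift, noise)
            steps += 1
        rts.append(p + steps + m)
    return rts


def summarize(rts):
    """Mean and standard deviation of a list of response times."""
    return statistics.mean(rts), statistics.stdev(rts)


print("baseline ", summarize(simulate_rts()))
print("slower P ", summarize(simulate_rts(p=150.0)))    # P factor
print("harder C ", summarize(simulate_rts(drift=0.05)))  # C factor
```

Because the same seed is reused, slowing P shifts every simulated trial by the same 50 units and leaves the spread untouched, whereas lowering the drift of the central stage both delays and widens the distribution.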
Prefrontal and parietal cortices may contain a general
mechanism for creating discrete categorical representations
Categorical representation of visual stimuli in the primate prefrontal cortex
(Freedman, Riesenhuber, Poggio & Miller, Science, 2001).
Parietal representations can also be categorical (Freedman & Assad, Nature 2006).
"Whereas the rest of cortex can be characterized
as a fundamentally analog system operating on
graded, distributed information, the prefrontal
cortex has a more discrete, digital character."
(O'Reilly, Science 2006)
Exploring the cerebral mechanisms
of the non-linear threshold in conscious access
(Del Cul and Dehaene, submitted)
[Figure: masking paradigm over time. A prime is shown for 16 ms and followed, after a delay of 0-150 ms, by a mask; subjective visibility and objective performance are plotted against the delay.]
Logic: use this sigmoidal profile as a signature of
conscious access. Which ERP components show this
profile?
[Figure: amplitude and latency of the P1a (100 ms), P1b (140 ms), N1 (167 ms), N2 (227 ms) and P3 (370 ms) components as a function of delay (16, 33, 50, 66, 83, 100 ms), for the left target]
- Only the P3 shows a non-linearity similar to behavioral report
- Activation profiles become increasingly non-linear with time
A late non-linearity underlying conscious access during masking
(Del Cul and Dehaene, submitted)
[Figure: stimulus sequence with a variable delay (durations shown: 16 ms and 250 ms), and source activity in fusiform, parietal and frontal regions for delays of 16, 33, 50, 66, 83 and 100 ms]
First phase: local and linear
Second phase: global and non-linear (amplification)
A hypothetical scheme for the human Turing machine
The workspace can perform complex, consciously controlled operations by
chaining several elementary steps.
Each step consists of the top-down recruitment, by a fronto-parietal network, of a
set of specialized processors, and the slow accumulation of their inputs into
categorical bins, which makes it possible to reach a conscious decision with a fixed,
predefined degree of accuracy.
[Figure with a time axis]
Sigman & Dehaene, PLOS Biology, 2005
Consciousness is needed for chaining of two operations
(Sackur and Dehaene, submitted)
Presentation of a masked digit (2, 4, 6, or 8) just below threshold
Four tasks:
- Naming
- Arithmetic (example: n+2)
- Comparison
- Composite task (n+2, then compare)
[Figure: subliminal performance on non-conscious trials, relative to chance level, for the comparison, naming, arithmetic and composite tasks (incongruent trials for the composite task); comparison responses are "smaller" or "larger"]
Conclusions
Turing proposed a minimal model of how mathematical operations unfold in the
mathematician's brain.
We now know that the Turing machine is not a good description of the overall
operation of our most basic processors.
However, it might be a good description of the (highly restricted) level of serial
and conscious operations, which occur within a global neuronal workspace.
The global neuronal workspace may have evolved to
- achieve discrete decisions by implementing Turing's stochastic accumulation
algorithm on a global brain scale
- broadcast the resulting decision to other processors, thus allowing for serial
processing chains and a human Turing machine
- thus give us access to new computational abilities (the ecological niche of
Turing-like recursive functions)
By allowing the top-down recruitment of specific processors, the global
workspace may play an important role in our cultural ability to play with our
modules and to invent novel uses for evolutionarily ancient mechanisms.
Very little is known about the human Turing machine:
- How does the brain represent and manipulate discrete symbols?
- What is the repertoire of elementary non-conscious operations?
- How do we pipe the result of one operation into another?
