The vision of David Marr
doi:10.1068/p7297
Kent A Stevens
Department of Computer and Information Science, University of Oregon, Eugene, OR 97403,
USA; e-mail: kent@cs.uoregon.edu
Received 24 May 2012, in revised form 22 August 2012
Abstract. Marr proposed a computational paradigm for studying the visual system, wherein aspects of
vision would be amenable to study with what might be regarded as a computational–reductionist approach.
First, vision would be cleaved into separable ‘computational theories’, in which the visual system is
characterized in terms of its computational goals and the strategies by which they are carried out. Each
such computational theory could then be investigated in increasingly concrete terms, from symbols
and measurements, to representations and algorithms, to processes and neural implementations. This
paradigm rests on some general expectations of a symbolic information processing system, including
his stated principles of explicit naming, modular design, least commitment, and graceful degradation.
In retrospect, the computational framework also tacitly rests on additional assumptions about the nature
of biological information processing: (1) separability of computational strategies, (2) separability of
representations, (3) a pipeline nature of information processing, and that (4) the representations employ
primitives of low dimensionality. These assumptions are discussed in this retrospective.
Keywords: visual perception, representation, 3D perception, computational theory
1 Introduction
Computation—the concept of information processing as abstracted from its embodiment
and the specifics of its implementation—was key to the paradigm pioneered by David Marr.
He emphasized the distinction between what visual information processing occurs within the
visual system and how that processing is accomplished. A series of symbolic representations
was proposed, each storing information extracted by computations on measurements derived
earlier in the visual system, and each serving as the source of visual information for the
subsequent stage. Marr expected that the underlying computational strategies for computing
each stage would be understandable, without having to know the details of their neural
implementations. Proposed strategies for aspects of human vision could then be explored
by computational experiments that capture the gist, but not the specifics, of their neural
implementations. When a computer implementation successfully replicates aspects of
human perceptual behavior, a case could be made that the two share aspects of the underlying
strategies. Hypothesizing that information processing strategies are independent of their
implementation, Marr suggested that we can understand what before how and, as a practical
paradigm for studying biological vision, should proceed in that order.
The expectation that information is the commodity, the stuff, of neural processing is today
uncontroversial. But in the early 1970s, as a new paradigm for studying vision, this computational
notion contrasted starkly with a precursor alternative approach, in which information was
regarded as latently available—delivered to the doorstep of vision, but not followed on inside.
Information had been described in terms of ‘affordances’, ie relationships between properties
of 2D images and the 3D world (Gibson 1966, 1979). Information seemed to be carried by
some abstraction or titration of the retinal image, as suggested by the immediacy with which
the contours in an outline drawing convey meaning. While images clearly contain information
that ultimately affords perceptions, Marr considered representational schemata that could make
concrete these otherwise abstract notions of information. “In its extreme form, this presumption
implies that what we call the ‘perception’ of an object … corresponds rather directly to the
making, in some central place, of one or more abstract assertions about that object; and to the
consequent availability of other knowledge related to that percept.” (Marr 1974, page 2).
While relatively new to the study of visual perception, symbolic computations were very
much part of the Zeitgeist in the early 1970s. Programming languages based on the lambda
calculus (Church 1932) facilitated the mechanization of symbolic reasoning. Largely due to the
facility with which the programming language Lisp permitted the expression and manipulation
of logical assertions (such as relationships between objects and attributes of objects), research
in ‘artificial intelligence’ enjoyed a period of rapid proliferation, with attempts to mechanize
various aspects of intelligent problem-solving such as qualitative reasoning about physics,
mathematical theorem discovery, linguistics, and block stacking by robot (McCarthy 1960;
Russell and Norvig 2010). Programs were written to manipulate symbolic descriptions, eg of
the arrangement of blocks in a scene (Winograd 1972), or of the luminance features in a “blocks
world” image, from which their 3D arrangement might hopefully be inferred (Waltz 1975).
Hence, to shift from measurement to assertion, from continuous to discrete, was not regarded
as difficult and this served as the ‘enabling technology’ that permitted Marr to then consider the
implications of symbolic computation towards understanding biological vision.
To the extent these computer programs exhibited human-like capabilities, they could be
considered models for the corresponding biological strategies underlying visual perception,
planning, or other cognitive activities. And, even if not biological models, these computer
programs often showed promise for a variety of application areas in the then-new field of
artificial intelligence. Thus, while Marr focussed exclusively on understanding biological
information processing, the broader artificial intelligence community provided the symbolic
programming techniques to explore computational aspects of human vision.
This essay reflects on Marr’s paradigm at the time of its conception, from the perspective
of having been one of his students from 1975. The following is not a broad survey of the
current state of vision research, but rather a reflection on my research of some decades ago,
performed as a follower of his school. While the psychophysical results may stand, their
interpretation from the computational perspective requires revisiting.
In the mid-1970s David Marr had just joined the MIT Artificial Intelligence Laboratory
from University of Cambridge, and was writing a series of foundational essays (Marr 1974,
1976a, 1976b, 1976c) which, nearly 40 years later, likely provide the best record of the
origins of his paradigm. At the time, his essays were revolutionary in the clarity with which
a path appeared to extend before us. Marr proposed that we could describe vision at three
increasingly specific levels (Marr 1982, page 25, figures 1–4), namely:
(i) ‘computational theory’ (“what is the goal of the computation, and what is the logic of the
strategy by which it can be carried out?”),
(ii) ‘representation and algorithm’ (“How can this computational theory be implemented?
In particular, what is the representation for the input and output, and what is the algorithm for
the transformation?”),
(iii) ‘hardware implementation’ (“How can the representation and algorithm be realized
physically?”).
In retrospect, that initial clarity derived from David Marr’s expectation that the goals of
the computation are sufficiently well defined and functionally separable as to be enumerated
and studied as individual ‘computational theories’. If Marr had believed otherwise, for
instance that the strategies were intrinsically intertwined with the fundamentals of the neural
networks that implemented them, then the task of unraveling the ‘what’ from the ‘how’ would
be most daunting, for the neural mechanisms would somehow have to be understood in their
complexity before finally understanding what they were computing. Such was not the case;
the key to understanding vision appeared to be through studying its computations.
The vision of David Marr 1063
Digital systems are composed of building blocks, functional modules that have minimal
dependencies and highly controlled interactions as a matter of practical necessity; computer
system architects only know how to make complex computational structures that adhere to
strict design practices that maximize regularity and symmetry of communication protocols,
localize function into modules, minimize module interdependence, and achieve scalability by
creating architectural layers that permit isolation by abstraction barriers (eg isolating what
from how by separating implementation details from interaction protocols across aspects of
a complex system). While computer architectures exhibit much regularity and modularity,
that highly structured design actually permits generality in the tasks (processes) they can
perform. Superimposed upon this essentially static and orderly physical organization
are myriad patterns of activity, fleeting structures and patterns of information flow and
organization that would be extraordinarily difficult to detect and track, since they manifest
themselves primarily temporally. Hence processes associated with very different programs
(ie computational strategies working towards very different goals) would result in nearly
indistinguishable patterns of observable activity.
The observed physical modularity and regularity of neuroanatomical structures may
likewise reflect an architecture with broad computational flexibility, not narrow specialization
of function. A given neural architecture might thus support many different processes from
moment to moment, as demanded by the perceptual or cognitive tasks at hand, and give little
clue by their structure as to the specific computations in which they are engaged.
There are suggestions of modularity at the level of computational strategy, given the visual
ability to derive perceptions individually and independently from different sources of 3D
information (eg binocular disparities, motion, shading, contour), each placed in functional
isolation from the other sources of 3D information through the design of the psychophysical experiments.
Furthermore, an extensive literature on acquired and genetic perceptual deficits also suggests
modularity at the level of computational strategy. Soon after Marr (1976b) described the
implications of subjective contours for symbolic vision, a visual-agnosia patient was found to
present with an inability to see ‘subjective contours’ in a variety of classical demonstrations
(wherein a phantom object is interpreted as partly occluding others), and yet normal
perception of occlusion boundaries in random-dot stereograms (Stevens 1983b). The patient’s
interpretations of pictorial depictions never involved seeing one object as occluding another.
In the Kanizsa triangle and other figures, where line terminations and edges align along a path
as occluded by some interposed yet hidden form, the patient could appreciate their alignment,
replicate the illustration with geometric accuracy, yet not imagine an interposed phantom
shape, hence did not construct illusory contrast along its occluding edges. As a computational
strategy, subjective contours could be seen as constructions to represent a particular spatial
interpretation when the observer hypothesizes the presence of occluding surfaces. Their
construction is contingent on the interpretation, as the patient saw illusory contrast along the
depth edges in random-dot stereograms (and drew them in as lines, when asked to replicate
what he saw in the stereogram), but not in the monocular depictions. With no need to create
a contour, no illusory contrast is asserted, nor contrast seen. This fitted well with Marr’s
notions of edges as symbolic assertions. Reflecting on that study, it remains fascinating to
consider that the perception of a subjective edge co‑occurs with an interpretation involving
occlusion. Is the illusory edge tantamount to the corresponding visual interpretation? Does
it assert the boundaries of an hypothesized occluding object, giving it a silhouette as needed
for subsequent description of shape? Given the consistency with which the patient failed
to interpret monocular evidence of occlusion, yet presented normal ability to describe
object shapes when their occluding contours were objectively presented (by either outline
or contrast edge), might that suggest a selective deficit of a computational strategy?
Perhaps in keeping with the relatively few cortical areas that had been identified in the
early 1970s, and the categorization of neurons into a few receptive field types in the retina
and primary visual cortex, the expectation was that the number of computational strategies,
the number of visual representations, and the number of symbols comprising each, would all
be countable and small.
Marr proposed a coarse categorization of visual representations (corresponding to 2D,
2½D, and 3D stages of information extraction), each with a vocabulary that was sufficiently
rich as to permit (in some sense) complete symbolic descriptions at each discrete stage.
Alternatively, a single representational scheme with enormous computational complexity
would not be conducive to analysis as a symbol system style of computation, nor would it likely
adhere to the principle of modular design. At the other extreme, neither would a very large
number (thousands or an effectively arbitrary number) of weakly coupled representations,
each with primitives for registering or abstracting specialized descriptors of corresponding
2D and 3D configurations (see section 2.4). Instead, it was hoped that conceiving of vision
as a computation achieved by discrete symbol systems, organized into a few discrete stages of
visual information processing, would facilitate understanding of their corresponding neural embodiments.
If the actual correspondence between symbol and neural implementation is in fact less direct,
the explanatory utility of the purported visual representation would be questionable.
2.3 The presumed pipeline nature of information processing
There was a tacit expectation for a sufficient preponderance of directional flow of information
from earlier to later stages of information processing to justify distinguishing ‘earlier’ from ‘later’.
The presence of substantial bidirectional, concurrent streams of information and control would
greatly diminish the likelihood that vision could be understood one-stage-at-a-time. Computers
are often organized around pipeline architectures to achieve concurrency of processing. Data are
queued in buffers as they cascade between successive stages. System architects increase the
flexibility of the pipeline by allowing earlier stages to influence not only the data provided
to later stages, but how these data are processed. The flow of information and control is quite
asymmetrical in computers, however, both in terms of the bandwidth of data flowing ‘upstream’
and the ability for later stages to influence the processing of earlier. Specialized computer
architectures (eg graphical processing units) achieve efficiency for certain computations with
a cascading of early (more sequential) to later (more parallel) processing stages, where each
computational step at the entry to the pipeline decomposes into many independent subtasks that
are distributed across many processors further down the pipeline. Computers have yet to be
devised, however, with a high degree of concurrency at all stages in a computational pipeline.
It is quite beyond current theoretical understanding to make effective use of a highly parallel
bidirectional-pipeline architecture, let alone an architecture comprising multiple interacting
pathways, each with bidirectional flow, connected topologically into a cyclic graph of
pathways, rather than merely a radiation or forking of a processing stream into
multiple pathways. Since the computational properties of such complex system architectures
were very poorly understood, biological systems were modeled using computational metaphors
that could be understood, namely pipelines with essentially unidirectional information flow.
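The pipeline metaphor invoked here, with data queued in buffers between successive stages, can be made concrete in a short sketch. This is a hypothetical illustration only; the stage functions and the ‘edge’/‘surface’ labels are invented for exposition and model nothing specific in Marr's proposals:

```python
from queue import Queue
from threading import Thread

def stage(fn, inbox, outbox):
    # Each stage reads measurements from its input buffer, transforms them,
    # and forwards the result downstream; no information flows back upstream.
    while True:
        item = inbox.get()
        if item is None:          # sentinel: propagate shutdown downstream
            outbox.put(None)
            return
        outbox.put(fn(item))

# Invented stages: raw intensities -> edge assertions -> surface descriptions
q0, q1, q2 = Queue(), Queue(), Queue()
threads = [
    Thread(target=stage, args=(lambda x: ('edge', x), q0, q1)),
    Thread(target=stage, args=(lambda e: ('surface', e), q1, q2)),
]
for t in threads:
    t.start()

for measurement in [0.1, 0.7, 0.4]:
    q0.put(measurement)
q0.put(None)

results = []
while (r := q2.get()) is not None:
    results.append(r)
print(results)
```

The essential point of the metaphor is visible in the structure: each stage knows only its input and output buffers, so the stages are understandable one at a time, which is exactly the property a richly recurrent network would lack.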
While neurophysiologists in the early 1970s were beginning to study the interconnections
between visual areas and recurrent pathways (Allman and Kaas 1974; Zeki 1975), an
appreciation for the topological complexity, as it has later come to be understood (Van Essen
and Gallant 1994; Van Essen and Anderson 1995; Fox et al 2005), would have raised questions
regarding ‘early’ versus ‘later’ representations, ie the temporal order of construction of visual
representations, which has profound impact on the nature of the corresponding computational
theories. The potential that high-level knowledge might be brought to bear on the creation
of very ‘early’ representations was not fully recognized. This is not merely a matter of
The distance, surface orientation, and curvature associated with points across a smooth
surface are related mathematically: orientation to distance by spatial differentiation (slant
corresponding to the magnitude, and tilt to the direction of the gradient of distance across
a small patch), and curvature to surface orientation also through spatial differentiation.
Binocular parallax, motion parallax, texture foreshortening, shading, and so forth provide
more-or-less specific 3D information. Despite the fact that some cues are more directly
related to orientation or curvature than they are to depth, they are collectively called ‘depth
cues’. Natural scenes present multiple such cues simultaneously. Since they are mutually
consistent in general, they provide redundant 3D information about the viewpoint and the
underlying surface.
Analogous to the computational benefits of processing luminance information in terms
of higher spatial derivatives (Marr 1976b), surface information is carried most reliably where
the depth cues vary nonlinearly. Surface boundaries are reflected by sharp discontinuities in
any of these measures; surface curvature by nonlinear variations, and slanted surface patches
by linear gradients.(2) Surface curvature is particularly salient, while constant (or zero)
gradients are less so (Stevens and Brookes 1988a; Stevens 1991; Stevens et al 1991).(3) The
familiar ‘depth’ cues presented by natural images generally correlate more with topography
(creases, curvature, and boundaries) than with slant, and least of all with distance per se.
While we see ‘in depth’, depth is unlikely the primitive perceptual quantity. The impression
of depth derived from binocular disparities (potentially one of the most direct sources of
distance information) exhibits simultaneous-contrast and induction effects, analogous to those
reported for brightness in the luminance domain (Brookes and Stevens 1988, 1989). Stereo
depth and brightness appear to be reconstructed by the spatial integration of higher-order
spatial derivatives of disparity and luminance, respectively.
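As an illustration of where surface information is carried, a one-dimensional depth profile can be classified by its first and second differences: sharp discontinuities mark boundaries, nonlinear variation marks curvature, and constant gradients mark the (less salient) planar patches. The thresholds and the profile below are invented for exposition:

```python
def classify(depth, jump_thresh=0.5, curve_thresh=0.05):
    # First differences approximate the local gradient (slant);
    # second differences approximate curvature.
    d1 = [b - a for a, b in zip(depth, depth[1:])]
    d2 = [b - a for a, b in zip(d1, d1[1:])]
    labels = []
    for i, c in enumerate(d2):
        if abs(d1[i]) > jump_thresh or abs(d1[i + 1]) > jump_thresh:
            labels.append('boundary')   # sharp depth discontinuity
        elif abs(c) > curve_thresh:
            labels.append('curved')     # nonlinear variation: salient
        else:
            labels.append('planar')     # constant gradient: less salient
    return labels

# A planar ramp, then a curved portion, then a depth step (invented profile)
profile = [0.0, 0.1, 0.2, 0.3, 0.5, 0.8, 0.9, 2.0, 2.0]
print(classify(profile))
# ['planar', 'planar', 'curved', 'curved', 'curved', 'boundary', 'boundary']
```

The sketch also makes the converse point: recovering the depth values themselves from such differential measures requires spatial integration, consistent with the simultaneous-contrast effects cited above.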
An alternative to low-dimensional representational primitives, therefore, would be
viewer-centric topographical descriptors (measures of extrinsic surface geometry—see
below and Stevens 1995). Spatiotemporally correlated measurements of this sort could serve
as keys into an associative memory for 3D shape recognition, which in turn could support
their particular interpretation in terms of depth and orientation, say. Depth or slant might then
be indirectly derived from a learned association with the so-called ‘depth cues’, but only after
they are integrated on the basis of the common extrinsic geometry the depth cues suggest.
compensating for the effects of foreshortening, perspective, and viewpoint—a very difficult
computational task indeed. Their proposal was to proceed from a viewer-centric representation
(eg the 2½D Sketch) to a viewer-independent model which would be matched against other
viewer-independent descriptors. But, alternatively, and computationally much more tractably,
would be for viewer-centric 3D primitives to index or key directly into a viewer-centric
associative memory (Stevens 1995). This approach stops short of reconstructing a viewer-
independent 3D model prior to recognition (which might well be computationally intractable).
Instead of attempting to describe intrinsic geometry, the extrinsic geometry of surfaces would
be described and matched against an associative memory of extrinsic shape (ie learned
associations of how 3D surface patches appear from different viewpoints). Moreover, the path
from viewer-centric ‘depth cues’ to extrinsic shape descriptors is also more direct, at least
conceptually (Stevens 1995). In essence, a visual system would learn to associate, for each
3D shape to be memorized, how that shape would appear from enough different perspectives
(or vantage points). The reconciliation of multiple depth cues would arise from their projecting
to the same associated shape, ie integration by association (Stevens 1995). Consistent cues
would increase the confidence in the match. Depth, then, would be neither the medium with
which cues would be reconciled nor that by which shape recognition would be achieved.
It is ironic that slant and depth, quantities that we feel intuitively to be the core of
our perceptions (measurements accurately provided by range finders, photometers, etc), are
actually the hardest to perceive accurately, the least reliably provided in natural images, and
(in my opinion) the least likely to be the primitives of the representation of visual surfaces.
Depth is nonetheless a useful experimental variable, even if derived secondarily from some
other, more primitive, 3D representation. The challenge for theorists, however, is to propose
the representational primitives that capture the information (and preserve the same ambiguity)
as that used to encode perceptions in actual visual systems. Introspection may not serve us
best in this pursuit: the form in which information appears to be accessed and delivered to
consciousness is not necessarily the form in which it is represented internally. The latter
form may be more complicated than we can easily visualize. This is a familiar problem to
other scientific disciplines that routinely manipulate models that involve entities with many
degrees of freedom or dimensions. The fact that biological vision is likely a mapping of
spatial measurements across a region of image space into topographical surface descriptors
containing many orthogonal components should encourage the search for computational
models in which 3D perception is achieved without attempting to reduce all those dimensions
to but a few terms (such as slant and depth).
ideas about computational vision, some aspects of early vision, and of depth from binocular
stereopsis seemed to correspond to unambiguous ‘information-processing tasks’. Without
identifying the ‘client’ processes served by stereopsis, it was presumed sufficient to model
the output as an array of depth values. But such presumptions are not appropriate for more
abstract acts of visual perception, such as appreciating and understanding our immediate
surrounds, detecting and reacting to the events occurring around us, and informing us as we
plan our next action within the world. The “tasks to be solved” that those visual abilities
entail share so many computational steps, so many common streams of information flow,
and so many practical limitations due to the legacy of our evolution, that a computational
theory would be difficult to bootstrap.
And specifically, when should a given visual competence be regarded as directly
reflecting a specific computational goal? If given two similar competences (such as shape
from shading and shape from texture), should their perceptual solutions be regarded as
achieved by separate computational theories, or might there be one computational theory
that unifies and explains both competences? While those questions regard the level of
computational theory, similar concerns arise with attempts to determine the specific nature
of the visual representations. If human subjects can competently report a given perceptual
quantity (such as the presence of a depth discontinuity, the orientation of a surface patch,
or some other measure of low dimensionality), it does not necessarily follow that such a
reported quantity is explicitly represented; rather, it could be derived indirectly from another
representational scheme for which the conversion could be performed ‘on demand’ for the
experimental task. This would appear to be the case where monocular depictions of surfaces
in purely orthographic projection can evoke a measurable impression of depth as well as
slant (Stevens 1981a; Stevens and Brookes 1987). Since distance covaries with surface
orientation, their perceptual counterparts would as well, and their correlation could be used to
infer depth from information about surface orientation and vice versa. One cannot conclude,
therefore, that experimental judgments of either depth or surface orientation derive directly
from corresponding explicit representations; both could be derived indirectly from some
other representation. The immediacy with which a seemingly open-ended set of perceptual
judgments might be performed poses a significant challenge to determining the nature of the
underlying specific visual representations.
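The possibility of on-demand conversion can be illustrated with a toy computation: if surface orientation were the represented quantity, relative depth judgments could be produced only when the experimental task requests them, by spatial integration of the represented gradient. This is a hypothetical sketch; the slope profile is invented:

```python
def depth_on_demand(slopes, dx=1.0, z0=0.0):
    # Relative depth reconstructed 'on demand' by spatially integrating a
    # represented gradient (slant) profile; depth itself is never stored.
    z, depths = z0, [z0]
    for p in slopes:
        z += p * dx
        depths.append(z)
    return depths

# Invented represented slope profile: rise, level off, fall
print(depth_on_demand([0.5, 0.5, 0.0, -0.5]))  # [0.0, 0.5, 1.0, 1.0, 0.5]
```

Because only relative depth emerges, and only up to the constant of integration, such a scheme would also preserve the ambiguity characteristic of monocular orthographic depictions.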
5 Conclusions
The computational paradigm expects that (1) the visual system comprises separable
strategies, (2) organized into early-to-later stage computations that create descriptions, (3) using
distinguishable, symbolic representational schemes, (4) where the symbol systems underlying
these representations have discrete vocabularies of primitives of low dimensionality that
encode relatively direct assertions about the visual world (eg image-related assertions of
bars, edges, blobs and surface-related assertions of orientation, distance, and discontinuities
thereof). In the thirty years since publication of Marr (1982), one can reflect back on
these assumptions and the computational paradigm they frame. Each of the expectations,
(1)–(4) above, may well reflect not so much the nature of biological information processing,
as the hope that it is understandable in our terms (ie simple enough for us to understand by
analogy to the sorts of computational architectures that we can create).
Those specific assumptions and caveats of the computational paradigm may all be invalid
for biological vision. For instance, regarding the expectation for separable computational
strategies [(1) above], such separability would be difficult to preserve as a system evolves
greater complexity through adaptation of existing architectures and sharing of pathways.
It is as if one expects both modularity and superposition of function simultaneously; they are
usually regarded as mutually exclusive. Assuming separability of strategies is beneficial to the
theoretician, but it also suggests that simpler-is-better for biological information processing,
which in reality may not draw benefit from highly decoupled system designs. Likewise, pipeline
architectures [(2) above] are easy to understand and to create, but rarely used in biological
information processing. Simplifying the modeling of a pathway to a unidirectional stream
may trivialize it to being little more than a conduit, a cable, while a bidirectional pathway (or
more-richly connected network) would eventually be understood to constitute a computational
unit, not separate units connected by wires. Conceiving of vision as symbolic [(3) above]
is to embrace one aspect that surely has its place in biological information processing, but
unfortunately, we don’t fully understand the place of symbol systems in biological vision.
And, finally, [(4) above], our tendency to equate what we experience (brightness, color, depth,
etc) with the representational primitives of our perceptions is perhaps the most self-limiting
assumption of the computational paradigm. To reduce perception to the creation of maps
(of depth, or color, or any other quantity) is to trivialize the act of perceiving. Vision cannot be
a matter of converting a bundle of rays to a bundle of values.
More generally, biological information processing seems amenable to a description in
terms of tasks. But we lack a methodology for discovering the tasks that underlie vision. The
phrase ‘visual system’ usually implies some system boundary, so that the inputs and outputs
may be identified. For engineered devices of our making, the system boundaries are clear,
for we specified them. We need to better understand the clients of the visual system, for their
computational requirements seemingly would dictate the tasks of the visual computations
that serve those clients.
Of Marr’s accomplishments, causing people to think differently about biological
computation was likely foremost. Vision as information processing is fundamentally different
from vision as signal processing. Encouraging the use of theoretical abstractions and levels
of explanation was a close second, taught by his example. As for the specifics, the caveats of
the computational paradigm were only beginning to be laid out when his contributions to the
effort were curtailed. Our expectations about which computational principles underlie vision have
changed, but the basic concept that there are such principles remains valid and valuable.
Acknowledgments. I wish to thank Philip Quinlan and an anonymous referee for the valuable
comments they provided. I thank David Marr for giving us so much to think about, and for the gifts of
his enthusiasm, and brilliance, and warmth.
References
Allman J M, Kaas J H, 1974 “A crescent-shaped cortical visual area surrounding the middle temporal
area (MT) in the owl monkey (Aotus trivirgatus)” Brain Research 81 199–213
Brookes A, Stevens K A, 1988 “Binocular depth from surfaces vs. volumes” Journal of Experimental
Psychology: Human Perception and Performance 15 479–484
Brookes A, Stevens K A, 1989 “The analogy between stereo depth and brightness” Perception 18
601–614
Brookes A, Stevens K A, 1991 “Symbolic grouping versus simple cell models” Biological Cybernetics
65 375–380
Church A, 1932 “A set of postulates for the foundation of logic” Annals of Mathematics, Series 2 33
346–366
Fox M D, Snyder A Z, Vincent J L, Corbetta M, Van Essen D C, Raichle M E, 2005 “The human brain
is intrinsically organized into dynamic, anticorrelated functional networks” Proceedings of the
National Academy of Sciences of the USA 102 9673–9678
Gibson J J, 1966 The Senses Considered as Perceptual Systems (Boston, MA: Houghton Mifflin)
Gibson J J, 1979 The Ecological Approach to Visual Perception (Boston, MA: Houghton Mifflin)
Lulich D P, Stevens K A, 1989 “Differential contributions of circular and elongated spatial filters to the
Café wall illusion” Biological Cybernetics 61 427–435
McCarthy J, 1960 “Recursive functions of symbolic expressions and their computation by machine,
part I” Communications of the ACM 3 184–195
Marr D, 1974 “On the purpose of low-level vision”, MIT AI Laboratory Memo 324
Marr D, 1976a “Analyzing natural images: A computational theory of texture vision” Cold Spring Harbor
Symposia on Quantitative Biology 40 647–662
Marr D, 1976b “Early processing of visual information” Philosophical Transactions of the Royal Society
of London 275 483–524
Marr D, 1976c “Analysis of occluding contour”, MIT AI Laboratory Memo 372
Marr D, 1982 Vision. A Computational Investigation into the Human Representation and Processing
of Visual Information (New York: W H Freeman)
Marr D, Nishihara H, 1978 “Representation and recognition of the spatial organization of three-
dimensional shapes” Proceedings of the Royal Society of London 200 269–294
Marr D, Poggio T, 1976 “Cooperative computation of stereo disparity” Science 194 283–287
Marr D, Poggio T, 1979 “A computational theory of human stereo vision” Proceedings of the Royal
Society of London B 204 301–328
O’Neill W E, Suga N, 1982 “Encoding of target range and its representation in the auditory cortex of
the mustached bat” Journal of Neuroscience 2 17–31
Reale R A, Imig T, 1980 “Tonotopic organization in auditory cortex of the cat” Journal of Comparative
Neurology 192 265–291
Russell S, Norvig P, 2010 Artificial Intelligence: A Modern Approach 3rd edition (Upper Saddle River,
NJ: Prentice Hall)
Stevens K A, 1978 “Computation of locally parallel structure” Biological Cybernetics 29 19–28
Stevens K A, 1981a “The visual interpretation of surface contours” Artificial Intelligence 17 47–74
Stevens K A, 1981b “The information content of texture gradients” Biological Cybernetics 42 95–105
Stevens K A, 1983a “Slant-tilt: The visual encoding of surface orientation” Biological Cybernetics
46 183–195
Stevens K A, 1983b “Evidence relating subjective contours and interpretations involving occlusion”
Perception 12 491–500
Stevens K A, 1991 “Constructing the perception of surfaces from multiple cues” Mind and Language
5 253–266
Stevens K A, 1995 “Integration by association: Combining three-dimensional cues to extrinsic surface
shape” Perception 24 199–214
Stevens K A, Brookes B, 1987 “Probing depth in monocular images” Biological Cybernetics
56 355–366
Stevens K A, Brookes B, 1988a “Integrating stereopsis with monocular interpretations of planar surfaces”
Vision Research 28 371–386
Stevens K A, Brookes B, 1988b “The convex cusp as a determiner of figure–ground” Perception
17 35–42
Stevens K A, Lees M, Brookes B, 1991 “Combining binocular and monocular curvature features”
Perception 20 425–440
Van Essen D, Anderson C H, 1995 “Information processing strategies and pathways in the primate
visual system”, in An Introduction to Neural and Electronic Networks 2nd edition, Eds S Zornetzer,
J L Davis, C Lau, T McKenna (New York: Academic Press) pp 45–76
Van Essen D, Gallant J L, 1994 “Neural mechanisms of form and motion processing in the primate
visual system” Neuron 13 1–10
Waltz D, 1975 “Understanding line drawings of scenes with shadows”, in The Psychology of Computer
Vision Ed. P H Winston (New York: McGraw-Hill) pp 19–91
Winograd T, 1972 Understanding Natural Language (New York: Academic Press)
Zeki S M, 1975 “The functional organization of projections from striate to prestriate visual cortex in
the rhesus monkey” Cold Spring Harbor Symposia on Quantitative Biology 40 591–600